second class
SOE,RUE
2020-11-16
readr
: reads .csv data.
haven
: reads SPSS, Stata, and SAS files.
readxl
: reads excel files (both .xls and .xlsx).
jsonlite
: reads .json data.
rio
: import, export, convert
sjPlot::view_df()
dplyr::glimpse()
skimr::skim()
janitor::clean_names()
& remove_empty()
ggplot2
, for data visualization.
dplyr
, for data manipulation.
tidyr
, for data tidying.
readr
, for data import.
purrr
, for functional programming.
tibble
, for tibbles, a modern re-imagining of data frames.
stringr
, for strings.
forcats
, for factors.
select
rename
relocate
mutate
case_match
Tidyverse selections implement a dialect of R where operators make it easy to select variables:
: for selecting a range of consecutive variables.
! for taking the complement of a set of variables.
& and | for selecting the intersection or the union of two sets of variables.
c() for combining selections.
starts_with()
: Starts with a prefix.
ends_with()
: Ends with a suffix.
contains()
: Contains a literal string.
matches()
: Matches a regular expression.
num_range()
: Matches a numerical range like x01, x02, x03.
where()
: Applies a function to all variables and selects those for which the function returns TRUE.
filter
arrange
slice,slice_head, slice_tail, slice_sample, slice_min, slice_max
conditional: >, >=, <, <=, != (not equal), and == (equal).
conditional: & is ??and??, | is ??or??, and ! is ??not??.,
group_by & summarize
count
distinct
n_distinct
across
rowwise & c_across
extract
separate
unite
Mutating Joins — inner_join(), left_join(), right_join(), full_join()
Filtering Joins — semi_join(), anti_join()
pivot_longer()
pivot_wider()
fct_recode
fct_reorder
fct_lump
str_c
str_sub
str_detect
str_replace
学生id, 学校id, 城市id,是否独生子女、是否近视、是否恋爱、期中考试语文原始成绩、是否实行百分制
只选择期中语文成绩实行100制的学生
将语文成绩的变量名称更改为chinese, 将是否独生子女变量名称更改为child_only, 是否恋爱变更为in_love.
visdat::vis_miss()
tidyr::drop_na()
tidyr::replace_na() & na_if()
base::is.na()
seeking truths through facts, seeking facts throught data