EDU, RUC, yea027@ruc.edu.cn
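Data is everywhere (Data, Information, Knowledge, and Wisdom)
Research is becoming empirical (describe, explain, predict, intervene)
Everyday work demands it: analyzing and interpreting data has become a basic survival skill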
In rating ease of description, after almost any reasonable change of point of view, as very important, we are essentially asserting a belief in quantitative knowledge--a belief that most of the key questions in our world sooner or later demand answers to “by how much” rather than merely to “in which direction?”.
—John W. Tukey, 1977
Data collection
Data storage and management
Data analysis
Data presentation
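Data science is grounded in statistics
Data science is more application-oriented (machine learning, causal inference)
Data science places more emphasis on data mining
Quantitative research focuses more on confirmation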
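To get familiar with education statistics, please find the following data:
1. Education expenditure breakdown by province (primary schools)
2. Education expenditure breakdown by province (junior secondary schools)
3. Number of full-time teachers by province (primary schools)
4. Number of full-time teachers by province (junior secondary schools)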
Attendance and class performance (10%)
Assignments (50%)
Individual project proposal (40%)
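Analysis report on Gaokao (college entrance exam) scores
Educational attainment across generations
The legal status of teachers
The "central sag" in teacher salaries
Changes in the structure of skill demand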
Teacher self-efficacy
R is a language and environment for statistical computing and graphics.
R is available as Free Software.
R can be extended (easily) via packages.
  speed dist
1     4    2
2     4   10
3     7    4
4     7   22
5     8   16
6     9   10
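The six rows printed above match the start of `cars`, a small data frame that ships with base R. Assuming that is where they come from, a minimal sketch of how to reproduce the printout:

```r
# cars is a built-in data frame (datasets package), so nothing needs to be installed
first_rows <- head(cars)   # head() returns the first six rows by default

print(first_rows)          # shows the speed and dist columns with row numbers 1 to 6

# str() gives a compact overview: number of rows, column names, and column types
str(cars)
```

Running `head()` and `str()` right after loading any dataset is a quick way to check that it arrived as expected.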
Create a script, save it, and reopen it
Create a project and put a data file in it
Install a package, such as "tidyverse"
Options > Appearance: set up a theme
A file management tool
Reproducible and portable
A script records both the process and the results
A product is only the result
Everything in R is an object
object name
print()
environment
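The points above (objects, object names, `print()`, and the environment) can be tried in a few lines. This is only an illustrative sketch; the name `score` and its values are made up for the example.

```r
# Assign a value to a name; the name now refers to an object
score <- c(82, 91, 77)   # hypothetical values, for illustration only

# print() displays the object; typing the name alone at the console does the same
print(score)

# Data and functions are both objects, each with a class
class(score)   # "numeric"
class(mean)    # "function"

# ls() lists the object names defined in the current environment
"score" %in% ls()

# The workspace you type into interactively is the global environment
environmentName(globalenv())   # "R_GlobalEnv"
```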
download from: CRAN, Bioconductor and GitHub.
Load: library() / require()
update
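A hedged sketch of the download, load, and update steps listed above. The install lines are commented out so the script does not reinstall anything on every run, and the Bioconductor package name and GitHub repository are placeholders, not real recommendations.

```r
# Install from CRAN (run once)
# install.packages("tidyverse")

# Bioconductor and GitHub installs go through helper packages;
# "SomeBioconductorPackage" and "user/repo" are placeholder names
# install.packages("BiocManager"); BiocManager::install("SomeBioconductorPackage")
# install.packages("remotes");     remotes::install_github("user/repo")

# library() attaches a package and stops with an error if it is missing
library(stats)

# require() attaches a package too, but returns TRUE or FALSE instead of stopping
has_stats <- require(stats)
print(has_stats)

# Update installed CRAN packages (commented out because it can take a while)
# update.packages(ask = FALSE)
```

Because `library()` fails loudly when a package is missing, it is usually the safer choice at the top of a script; `require()` is handy when the code needs to branch on availability.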
Here’s a summary table of some of the logical test and coercion functions available to you.
Type | Logical test | Coercing |
---|---|---|
Character | is.character | as.character |
Numeric | is.numeric | as.numeric |
Logical | is.logical | as.logical |
Factor | is.factor | as.factor |
Complex | is.complex | as.complex |
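A short sketch of how the test and coercion functions in the table behave. The values are made up for illustration; the factor case is included because it is a common source of silent errors.

```r
x <- "3.14"
is.character(x)      # TRUE
is.numeric(x)        # FALSE

y <- as.numeric(x)   # coerce the string to a number
is.numeric(y)        # TRUE

# Strings that R does not recognise as logical values become NA
as.logical(c("TRUE", "false", "yes"))   # TRUE FALSE NA

# Factors store integer codes plus labels, so coercing directly
# returns the codes, not the numbers shown in the labels
f <- as.factor(c("10", "20", "10"))
is.factor(f)                          # TRUE
codes  <- as.numeric(f)               # 1 2 1
values <- as.numeric(as.character(f)) # 10 20 10
```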
scalars & vectors
matrices & arrays
list
data frame
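Each of the structures listed above can be created with one base R call. The sketch below uses made-up values purely to show the shape of each structure.

```r
s <- 5                                # a scalar is just a vector of length 1
v <- c(4, 7, 8)                       # vector: elements all of one type

m <- matrix(1:6, nrow = 2)            # 2 x 3 matrix, filled column by column
a <- array(1:24, dim = c(2, 3, 4))    # array: like a matrix with more dimensions

# list: elements may differ in type and length
l <- list(name = "cars", n = 50L, speeds = v)

# data frame: a list of equal-length columns, one row per observation
df <- data.frame(speed = c(4, 7, 8), dist = c(2, 4, 16))

length(s)   # 1
dim(m)      # 2 3
m[1, 2]     # 3
l$name      # "cars"
nrow(df)    # 3
```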
Import, Export, and Convert Data Files: how to use the rio package to open and save data.
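A minimal sketch of the three rio functions named in the reading's title, `export()`, `import()`, and `convert()`. It assumes rio is installed and writes only to temporary files; the built-in `cars` data stands in for a real dataset.

```r
# Assumes rio is installed: install.packages("rio")
if (requireNamespace("rio", quietly = TRUE)) {
  csv_file <- tempfile(fileext = ".csv")
  tsv_file <- tempfile(fileext = ".tsv")

  # export() picks the output format from the file extension
  rio::export(head(cars), csv_file)

  # import() picks the reader from the file extension as well
  cars_back <- rio::import(csv_file)
  print(cars_back)

  # convert() reads one file and writes it out in another format
  rio::convert(csv_file, tsv_file)
  converted_ok <- file.exists(tsv_file)
}
```

The appeal of rio is that the same two calls cover most file types, so a script does not need a different reader for each format.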
Chapters 5, 10, 12, 13, 14, 15, and 16 of R for Data Science: basic data-cleaning techniques.
Download the CEPS data and read the documentation.
data import: rio
data manipulation: tidyverse, sjPlot, sjmisc, janitor, skimr, naniar, visdat,
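As one illustration of the manipulation tools listed, here is a small dplyr pipeline (dplyr is part of the tidyverse). It is a sketch on the built-in `cars` data, not on CEPS, and the cut-off values are arbitrary choices for the example.

```r
# Assumes dplyr is installed: install.packages("tidyverse")
if (requireNamespace("dplyr", quietly = TRUE)) {
  library(dplyr)

  cars_summary <- cars %>%
    filter(speed >= 10) %>%              # keep rows that meet a condition
    mutate(fast = speed > 15) %>%        # add a new column
    group_by(fast) %>%                   # split into groups
    summarise(n = n(),                   # one summary row per group
              mean_dist = mean(dist))

  print(cars_summary)
}
```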