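Using the CEPS data, examine how teacher gender affects students' Chinese, math, and English scores. (Hint: first merge the student data with the teacher data, then create a new variable that splits students into four groups: female student with female teacher, female student with male teacher, male student with male teacher, and male student with female teacher. Compare scores across the four groups, and run the analysis separately for each subject.)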
## setups
library (tidyverse)
library (rio)
library (janitor)
library (sjPlot)
library (modelsummary)
library (rstatix)
cls <- import ("/Users/yangyongye/microcloud/projects/survey_data/ceps/2013_2014rawdata/CEPS_class.dta" )
std <- import ("/Users/yangyongye/microcloud/projects/survey_data/ceps/2013_2014rawdata/CEPS_student.dta" )
std_cls <- std %>% left_join (cls,by= join_by ("clsids" ))
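# Keep rows with math and English scores below 151, then label each
# student-teacher gender pairing by subject (1 = male, 2 = female)
std_cls2 <- std_cls %>%
  filter(tr_mat < 151) %>%
  filter(tr_eng < 151) %>%
  mutate(
    chn_rlation = case_when(
      a01 == 1 & chnb01 == 1 ~ "Male student, male teacher",
      a01 == 1 & chnb01 == 2 ~ "Male student, female teacher",
      a01 == 2 & chnb01 == 1 ~ "Female student, male teacher",
      a01 == 2 & chnb01 == 2 ~ "Female student, female teacher"
    ),
    mat_rlation = case_when(
      a01 == 1 & matb01 == 1 ~ "Male student, male teacher",
      a01 == 1 & matb01 == 2 ~ "Male student, female teacher",
      a01 == 2 & matb01 == 1 ~ "Female student, male teacher",
      a01 == 2 & matb01 == 2 ~ "Female student, female teacher"
    ),
    eng_rlation = case_when(
      a01 == 1 & engb01 == 1 ~ "Male student, male teacher",
      a01 == 1 & engb01 == 2 ~ "Male student, female teacher",
      a01 == 2 & engb01 == 1 ~ "Female student, male teacher",
      a01 == 2 & engb01 == 2 ~ "Female student, female teacher"
    )
  )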
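The three `case_when()` blocks apply the same rule to a different teacher-gender column each time. A minimal sketch of how that rule could be factored into one function follows. `pair_label()` and the toy data are hypothetical and not part of the original analysis; the coding 1 = male, 2 = female is taken from the code above, and the labels here are written in English.

```r
library(dplyr)

# Hypothetical helper: builds the student-teacher gender pairing label.
# Assumes 1 = male and 2 = female for both arguments.
pair_label <- function(student_sex, teacher_sex) {
  case_when(
    student_sex == 1 & teacher_sex == 1 ~ "Male student, male teacher",
    student_sex == 1 & teacher_sex == 2 ~ "Male student, female teacher",
    student_sex == 2 & teacher_sex == 1 ~ "Female student, male teacher",
    student_sex == 2 & teacher_sex == 2 ~ "Female student, female teacher"
  )
}

# Toy data standing in for the merged student-class table (made-up values)
toy <- tibble(
  a01    = c(1, 1, 2, 2, NA),
  chnb01 = c(1, 2, 1, 2, 1),
  matb01 = c(2, 2, 1, 1, 2),
  engb01 = c(2, 1, 2, 1, 2)
)

# One call per subject; rows with a missing gender get NA as their label
toy_labelled <- toy %>%
  mutate(
    chn_rlation = pair_label(a01, chnb01),
    mat_rlation = pair_label(a01, matb01),
    eng_rlation = pair_label(a01, engb01)
  )

toy_labelled
```

Rows that match none of the four conditions, such as a missing student gender, are left as `NA` rather than being assigned to a group.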
Chinese scores
For Chinese scores,
# Chinese scores
library (ggstatsplot)
std_cls2 %>% ggbetweenstats (y= tr_chn,x= chn_rlation)
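Math scores
For math scores, female students score higher than male students, and students taught by female teachers score higher than those taught by male teachers.
# Math scores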
std_cls2 %>% ggbetweenstats (y= tr_mat,x= mat_rlation)
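English scores
For English scores, female students score higher than male students, but there is no significant difference between students taught by male and by female teachers.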
std_cls2 %>% ggbetweenstats (y= tr_eng,x= eng_rlation)
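The group comparisons above are read off the plots. As a possible numeric companion, the sketch below tabulates the size, mean, and standard deviation of each group with `dplyr`. The toy data frame is made up for illustration and only borrows the column names `tr_eng` and `eng_rlation` from the analysis; nothing here reproduces CEPS results.

```r
library(dplyr)

# Toy data (made-up scores) using the same column names as the analysis
toy <- tibble(
  tr_eng = c(70, 80, 60, 80, 75, NA),
  eng_rlation = c(
    "Female student, female teacher",
    "Female student, female teacher",
    "Male student, male teacher",
    "Male student, male teacher",
    NA,
    "Female student, female teacher"
  )
)

# Drop rows with a missing group or score, then summarise each group
group_means <- toy %>%
  filter(!is.na(eng_rlation), !is.na(tr_eng)) %>%
  group_by(eng_rlation) %>%
  summarise(
    n = n(),
    mean_score = mean(tr_eng),
    sd_score = sd(tr_eng)
  )

group_means
```

The same pattern could be applied to the Chinese and math columns by swapping in the corresponding score and grouping variables.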
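Using the CEPS data and controlling for other factors that affect student achievement, check whether homeroom teachers give their class an advantage in the subject they teach. That is, when the homeroom teacher is an English teacher, do students in that class score higher in English than students whose homeroom teacher teaches another subject? The same question applies to Chinese and math.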
std_cls3 <- std_cls2 %>%
mutate (hr_chn= if_else (hra01== 2 ,1 ,0 ),
hr_mat= if_else (hra01== 1 ,1 ,0 ),
hr_eng= if_else (hra01== 3 ,1 ,0 ))
lm_chn <- lm (tr_chn ~ hr_chn + a01 + grade9.x+ stonly + steco_3c + stcog,data= std_cls3)
lm_mat <- lm (tr_mat ~ hr_mat + a01 + grade9.x+ stonly + steco_3c + stcog,data= std_cls3)
lm_eng <- lm (tr_eng ~ hr_eng + a01 + grade9.x+ stonly + steco_3c + stcog,data= std_cls3)
models <- list (lm_chn,lm_mat,lm_eng)
modelsummary (models,stars = TRUE ,gof_omit = "AIC|BIC|RMSE|R2 Adj.|Log.Lik.|F" )
| | (1) lm_chn | (2) lm_mat | (3) lm_eng |
|---|---|---|---|
| (Intercept) | 48.202*** | 33.825*** | 33.435*** |
| | (0.928) | (1.413) | (1.333) |
| hr_chn | 0.063 | | |
| | (0.294) | | |
| a01 | 8.075*** | 4.104*** | 13.727*** |
| | (0.266) | (0.407) | (0.384) |
| grade9.x | 11.100*** | 8.927*** | −1.753*** |
| | (0.274) | (0.418) | (0.395) |
| stonly | −3.065*** | −4.840*** | −6.607*** |
| | (0.279) | (0.426) | (0.402) |
| steco_3c | 1.601*** | 1.513*** | 2.213*** |
| | (0.274) | (0.419) | (0.395) |
| stcog | 1.956*** | 3.839*** | 3.162*** |
| | (0.038) | (0.058) | (0.054) |
| hr_mat | | 2.004*** | |
| | | (0.441) | |
| hr_eng | | | 4.807*** |
| | | | (0.450) |
| Num.Obs. | 18629 | 18647 | 18647 |
| R2 | 0.219 | 0.230 | 0.252 |

+ p < 0.1, * p < 0.05, ** p < 0.01, *** p < 0.001
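Because the three homeroom-teacher dummies have different names, their coefficients land on separate rows of the regression output. The sketch below shows one way to line them up side by side using base R only. The data are simulated, not CEPS; `term_row()` is a hypothetical helper; and the models include only one control to keep the example short.

```r
# Simulated toy data (not CEPS): one homeroom dummy and one score per subject
set.seed(1)
n <- 200
toy <- data.frame(
  hr_chn = rbinom(n, 1, 0.3),
  hr_mat = rbinom(n, 1, 0.3),
  hr_eng = rbinom(n, 1, 0.3),
  stcog  = rnorm(n)
)
toy$tr_chn <- 70 + 2 * toy$stcog + rnorm(n)
toy$tr_mat <- 70 + 2 * toy$stcog + rnorm(n)
toy$tr_eng <- 70 + 2 * toy$stcog + rnorm(n)

# Naming the list elements keeps track of which model belongs to which subject
models <- list(
  Chinese = lm(tr_chn ~ hr_chn + stcog, data = toy),
  Math    = lm(tr_mat ~ hr_mat + stcog, data = toy),
  English = lm(tr_eng ~ hr_eng + stcog, data = toy)
)

# Hypothetical helper: extract one term's estimate, SE, and p-value from an lm
term_row <- function(model, term) {
  est <- coef(summary(model))[term, ]
  data.frame(
    term      = term,
    estimate  = est[["Estimate"]],
    std_error = est[["Std. Error"]],
    p_value   = est[["Pr(>|t|)"]]
  )
}

# One row per subject, each holding that subject's homeroom-teacher coefficient
hr_effects <- do.call(
  rbind,
  Map(term_row, models, c("hr_chn", "hr_mat", "hr_eng"))
)

hr_effects
```

Applied to the fitted `lm_chn`, `lm_mat`, and `lm_eng` objects, the same helper would return the `hr_chn`, `hr_mat`, and `hr_eng` rows in a single three-row data frame.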