final2023answer

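  1. Using the CEPS data, examine the effect of teacher gender on Chinese, math, and English scores. (Hint: first merge the student data with the teacher data, then create a new variable that splits students into four groups: female student with female teacher, female student with male teacher, male student with male teacher, and male student with female teacher. Compare scores across the four groups, doing this separately for each subject.)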
## setups
library(tidyverse)
library(rio)
library(janitor)

library(sjPlot)
library(modelsummary)
library(rstatix)

cls <- import("/Users/yangyongye/microcloud/projects/survey_data/ceps/2013_2014rawdata/CEPS_class.dta")
std <- import("/Users/yangyongye/microcloud/projects/survey_data/ceps/2013_2014rawdata/CEPS_student.dta")

std_cls <- std %>% left_join(cls,by=join_by("clsids"))
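# a01 is the student's sex; chnb01, matb01 and engb01 are the sex of the
# Chinese, math and English teacher (1 = male, 2 = female).
std_cls2 <- std_cls %>%
  # keep only math and English scores below 151
  filter(tr_mat < 151) %>%
  filter(tr_eng < 151) %>%
  mutate(
    chn_rlation = case_when(
      a01 == 1 & chnb01 == 1 ~ "male student, male teacher",
      a01 == 1 & chnb01 == 2 ~ "male student, female teacher",
      a01 == 2 & chnb01 == 1 ~ "female student, male teacher",
      a01 == 2 & chnb01 == 2 ~ "female student, female teacher"
    ),
    mat_rlation = case_when(
      a01 == 1 & matb01 == 1 ~ "male student, male teacher",
      a01 == 1 & matb01 == 2 ~ "male student, female teacher",
      a01 == 2 & matb01 == 1 ~ "female student, male teacher",
      a01 == 2 & matb01 == 2 ~ "female student, female teacher"
    ),
    eng_rlation = case_when(
      a01 == 1 & engb01 == 1 ~ "male student, male teacher",
      a01 == 1 & engb01 == 2 ~ "male student, female teacher",
      a01 == 2 & engb01 == 1 ~ "female student, male teacher",
      a01 == 2 & engb01 == 2 ~ "female student, female teacher"
    )
  )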
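Before plotting, it can help to check how many students fall into each group and what the raw group means are. The sketch below runs on a small made-up data frame (`toy`, not the CEPS data) that reuses the coding above (1 = male, 2 = female); the scores and the English group labels are illustrative assumptions.

```r
library(dplyr)

# Made-up rows standing in for the merged student-class table.
toy <- data.frame(
  a01    = c(1, 1, 2, 2, 1, 2),     # student sex: 1 = male, 2 = female
  chnb01 = c(1, 2, 1, 2, NA, 2),    # Chinese teacher sex, same coding
  tr_chn = c(60, 70, 75, 85, 65, 95)
)

# Same grouping logic as above, applied to one subject.
toy_grouped <- toy %>%
  mutate(chn_rlation = case_when(
    a01 == 1 & chnb01 == 1 ~ "male student, male teacher",
    a01 == 1 & chnb01 == 2 ~ "male student, female teacher",
    a01 == 2 & chnb01 == 1 ~ "female student, male teacher",
    a01 == 2 & chnb01 == 2 ~ "female student, female teacher"
  ))

# Group sizes and mean scores; unmatched rows form their own NA group.
group_summary <- toy_grouped %>%
  group_by(chn_rlation) %>%
  summarise(n = n(), mean_chn = mean(tr_chn), .groups = "drop")

print(group_summary)
```

Rows with a missing teacher sex match no `case_when` branch and get `NA`, so they appear as a separate row in the summary instead of silently disappearing.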

Chinese scores

For Chinese scores, the four groups are compared in the plot below.

# Chinese scores
library(ggstatsplot)
std_cls2 %>% ggbetweenstats(y=tr_chn,x=chn_rlation)
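The plot reports an overall test and pairwise comparisons across the four groups. As a rough numeric cross-check, the same kind of comparison can be sketched in base R with a Welch one-way test and Holm-adjusted pairwise t-tests. This is an assumption about what is being tested, not a reproduction of the `ggbetweenstats` output, and the data frame `toy` below is made up.

```r
# Made-up scores for four groups (not CEPS data).
toy <- data.frame(
  group = rep(c("male student, male teacher",
                "male student, female teacher",
                "female student, male teacher",
                "female student, female teacher"), each = 4),
  score = c(60, 62, 64, 66,
            70, 72, 74, 76,
            80, 82, 84, 86,
            61, 63, 65, 67)
)

# Welch one-way test: does not assume equal variances across groups.
welch <- oneway.test(score ~ group, data = toy, var.equal = FALSE)
print(welch)

# Pairwise Welch t-tests with Holm correction for multiple comparisons.
pairs <- pairwise.t.test(toy$score, toy$group,
                         p.adjust.method = "holm", pool.sd = FALSE)
print(pairs)
```

With the real data, `toy` would be replaced by `std_cls2`, `score` by `tr_chn`, and `group` by `chn_rlation`.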

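Math scores

For math, female students score higher than male students, and students taught by female teachers score higher than students taught by male teachers.

# Math scores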
std_cls2 %>% ggbetweenstats(y=tr_mat,x=mat_rlation)

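English scores

For English, female students score higher than male students, but there is no significant difference between students taught by male teachers and those taught by female teachers.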

std_cls2 %>% ggbetweenstats(y=tr_eng,x=eng_rlation)
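The four-group comparison mixes two things: the student's own sex and the teacher's sex. One way to separate them, sketched here under assumptions, is a regression with two indicators and their interaction. The data frame `toy` and the indicator names `student_female` and `teacher_female` are hypothetical; with the CEPS data they would be derived from `a01` and from `chnb01`, `matb01` or `engb01` (coded 2 = female above).

```r
# Made-up 2 x 2 design with two students per cell (not CEPS data).
toy <- data.frame(
  student_female = c(0, 0, 0, 0, 1, 1, 1, 1),
  teacher_female = c(0, 0, 1, 1, 0, 0, 1, 1),
  score          = c(60, 62, 64, 66, 70, 72, 78, 80)
)

# Main effects plus interaction: the model reproduces the four cell means.
fit <- lm(score ~ student_female * teacher_female, data = toy)
print(coef(fit))

# Reading the coefficients:
#   (Intercept)                    mean for male students with male teachers
#   student_female                 female-male gap under male teachers
#   teacher_female                 female-teacher gap for male students
#   student_female:teacher_female  extra gain for female students with female teachers
```

A nonzero interaction term would suggest a same-gender match effect beyond the two separate gaps.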

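  2. Using the CEPS data and controlling for other factors that affect student achievement, test whether homeroom teachers confer a subject advantage. That is, when the homeroom teacher teaches English, are the class's English scores higher than in classes whose homeroom teacher teaches another subject? Ask the same question for Chinese and math.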
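# hra01 is the subject taught by the homeroom teacher; the coding used here
# is 1 = math, 2 = Chinese, 3 = English.
# Each hr_* indicator is 1 when the homeroom teacher teaches that subject.
std_cls3 <- std_cls2 %>%
  mutate(hr_chn = if_else(hra01 == 2, 1, 0),
         hr_mat = if_else(hra01 == 1, 1, 0),
         hr_eng = if_else(hra01 == 3, 1, 0))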

lm_chn <- lm(tr_chn ~ hr_chn + a01 + grade9.x+ stonly + steco_3c + stcog,data=std_cls3)
lm_mat <- lm(tr_mat ~ hr_mat + a01 + grade9.x+ stonly + steco_3c + stcog,data=std_cls3)
lm_eng <- lm(tr_eng ~ hr_eng + a01 + grade9.x+ stonly + steco_3c + stcog,data=std_cls3)


models <- list(lm_chn,lm_mat,lm_eng)
modelsummary(models,stars = TRUE,gof_omit = "AIC|BIC|RMSE|R2 Adj.|Log.Lik.|F")
Term          (1) tr_chn   (2) tr_mat   (3) tr_eng
(Intercept)   48.202***    33.825***    33.435***
              (0.928)      (1.413)      (1.333)
hr_chn        0.063
              (0.294)
a01           8.075***     4.104***     13.727***
              (0.266)      (0.407)      (0.384)
grade9.x      11.100***    8.927***     −1.753***
              (0.274)      (0.418)      (0.395)
stonly        −3.065***    −4.840***    −6.607***
              (0.279)      (0.426)      (0.402)
steco_3c      1.601***     1.513***     2.213***
              (0.274)      (0.419)      (0.395)
stcog         1.956***     3.839***     3.162***
              (0.038)      (0.058)      (0.054)
hr_mat                     2.004***
                           (0.441)
hr_eng                                  4.807***
                                        (0.450)
Num.Obs.      18629        18647        18647
R2            0.219        0.230        0.252
+ p < 0.1, * p < 0.05, ** p < 0.01, *** p < 0.001
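To read the `hr_*` rows: the coefficient on a 0/1 indicator is a difference in mean scores between the two kinds of classes, holding the other regressors fixed. Without controls it reduces to the raw difference in group means, which the sketch below checks on made-up data (`toy` is not CEPS data; code 3 = English follows the coding used in the code above).

```r
# Made-up class data: homeroom teacher subject and English scores.
toy <- data.frame(
  hra01  = c(3, 3, 3, 1, 2, 1),
  tr_eng = c(80, 85, 90, 70, 75, 80)
)

# 1 when the homeroom teacher teaches English, 0 otherwise.
toy$hr_eng <- as.integer(toy$hra01 == 3)

# Regression on the indicator alone.
fit <- lm(tr_eng ~ hr_eng, data = toy)

# Raw difference in mean English scores between the two groups.
mean_diff <- mean(toy$tr_eng[toy$hr_eng == 1]) -
  mean(toy$tr_eng[toy$hr_eng == 0])

print(coef(fit)["hr_eng"])
print(mean_diff)
```

On that reading, the table suggests an advantage of about 2.0 points in math and 4.8 points in English when the homeroom teacher teaches the subject, and no detectable advantage in Chinese.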