library(tidyverse)
library(tidymodels)
library(GGally)
library(knitr)
library(patchwork)
library(viridis)
library(ggfortify)
library(kableExtra)
library(lme4)
library(broom.mixed)
Modeling two-level longitudinal data
BMLR Chapter 9
Setup
Learning goals
Describe general process for fitting and comparing multilevel models
Fit and interpret multilevel models for longitudinal data
Compare multilevel models
Conduct inference for random effects
Conduct inference for fixed effects
Data: Charter schools in MN
The data set `charter-long.csv` contains standardized test scores and demographic information for schools in Minneapolis, MN, from 2008 to 2010. The data were collected by the Minnesota Department of Education. Understanding the effectiveness of charter schools is of particular interest, since they often incorporate unique methods of instruction and learning that differ from those of public schools.
- `MathAvgScore`: average MCA-II score for all 6th-grade students in a school (response variable)
- `urban`: urban (1) or rural (0) school location
- `charter`: charter school (1) or non-charter public school (0)
- `schPctfree`: proportion of students in a school who receive free or reduced-price lunches (based on 2010 figures)
- `year08`: years since 2008
Data
charter <- read_csv("data/charter-long.csv")
Rows: 1854 Columns: 9
── Column specification ──────────────────────────────────────────────────────────────────
Delimiter: ","
chr (2): schoolid, schoolName
dbl (7): urban, charter, schPctnonw, schPctsped, schPctfree, year08, MathAvgScore
ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
charter |>
  select(schoolName, year08, urban, charter, schPctfree, MathAvgScore) |>
slice(1:3, 1852:1854) |>
kable(digits = 3)
schoolName | year08 | urban | charter | schPctfree | MathAvgScore |
---|---|---|---|---|---|
RIPPLESIDE ELEMENTARY | 0 | 0 | 0 | 0.363 | 652.8 |
RIPPLESIDE ELEMENTARY | 1 | 0 | 0 | 0.363 | 656.6 |
RIPPLESIDE ELEMENTARY | 2 | 0 | 0 | 0.363 | 652.6 |
RICHARD ALLEN MATH&SCIENCE ACADEMY | 0 | 1 | 1 | 0.545 | NA |
RICHARD ALLEN MATH&SCIENCE ACADEMY | 1 | 1 | 1 | 0.545 | NA |
RICHARD ALLEN MATH&SCIENCE ACADEMY | 2 | 1 | 1 | 0.545 | 631.2 |
Assess missingness (1/2)
Missing data are common in longitudinal studies. Before starting the analysis, it is important to understand the missing data patterns. Use the `skim` function from the skimr R package to get a quick view of the missingness.
library(skimr)
charter |> skim() |> select(skim_variable, n_missing, complete_rate)
# A tibble: 9 × 3
skim_variable n_missing complete_rate
<chr> <int> <dbl>
1 schoolid 0 1
2 schoolName 0 1
3 urban 0 1
4 charter 0 1
5 schPctnonw 0 1
6 schPctsped 0 1
7 schPctfree 0 1
8 year08 0 1
9 MathAvgScore 121 0.935
Assess missingness (2/2)
Another option is the `plot_missing()` function from the DataExplorer package, which creates a plot of the percent of missing values for each variable.
library(DataExplorer)
plot_missing(charter)
Closer look at missingness
- Question: If we are going to fit a multilevel model with `schoolid` as the grouping variable, what types of missing data will be particularly problematic?
. . .
MathAvgScore0_miss | MathAvgScore1_miss | MathAvgScore2_miss | n |
---|---|---|---|
0 | 0 | 0 | 540 |
0 | 0 | 1 | 6 |
0 | 1 | 0 | 4 |
0 | 1 | 1 | 7 |
1 | 0 | 0 | 25 |
1 | 0 | 1 | 1 |
1 | 1 | 0 | 35 |
charter |>
  select(schoolid, schoolName, year08, MathAvgScore) |>
pivot_wider(id_cols = c(schoolid, schoolName), names_from = year08,
names_prefix = "MathAvgScore.", values_from = MathAvgScore) |>
mutate(MathAvgScore0_miss = if_else(is.na(MathAvgScore.0), 1, 0),
MathAvgScore1_miss = if_else(is.na(MathAvgScore.1), 1, 0),
MathAvgScore2_miss = if_else(is.na(MathAvgScore.2), 1, 0)) |>
count(MathAvgScore0_miss, MathAvgScore1_miss, MathAvgScore2_miss) |>
kable()
Dealing with missing data
Complete case analysis: Only include schools with complete data for all three years. This would remove 12.6% of the observations in these data.
Last observation carried forward: Keep the last observation from each school and conduct an analysis treating the observations as independent.
Impute missing observations: "Fill in" values of missing observations using the typical observed trends from schools with similar covariates.
Apply multilevel methods: Estimate patterns using all available data, recognizing that trends for schools with complete data are estimated more precisely than those for schools with fewer measurements. This is valid provided the probability of missingness does not depend on unobserved predictors or the response.
. . .
Question: What is an advantage of each method? What is a disadvantage?
Strategy for building multilevel models
Conduct exploratory data analysis for Level One and Level Two variables.
Fit a model with no covariates to assess variability at each level.
Create Level One models. Start with a single term, then add terms as needed.
Create Level Two models. Start with a single term, then add terms as needed, beginning with the equation for the intercept term.
Begin with the full set of variance components, then remove variance terms as needed.
. . .
Alternate model building strategies in BMLR Section 8.6
Exploratory data analysis
Given the longitudinal structure of the data, we can answer questions at two levels:
Level One (within school): How did average math scores for a given school change over time?
Level Two (between schools): What is the effect of school-specific covariates on the average math scores in 2008 and the rate of change from 2008 to 2010?
. . .
We can conduct exploratory data analysis at both levels, e.g.,
Univariate and bivariate EDA
Lattice plots
Spaghetti plots
Let’s do some EDA together.
See BMLR Section 9.3 for full exploratory data analysis.
Unconditional means model
Start with the unconditional means model, a model with no covariates at any level. This is also called the random intercepts model.
Level One : \(Y_{ij} = a_{i} + \epsilon_{ij}, \hspace{5mm} \epsilon_{ij} \sim N(0, \sigma^2)\)
Level Two: \(a_i = \alpha_0 + u_i, \hspace{5mm} u_{i} \sim N(0, \sigma^2_u)\)
Question: Write the composite model.
Intraclass correlation (1/2)
The intraclass correlation is the proportion of the total variability that is between groups:
\[\hat{\rho} = \frac{\text{Between variability}}{\text{Total variability}} = \frac{\hat{\sigma}_u^2}{\hat{\sigma}^2_u + \hat{\sigma}^2}\]
What is the meaning of \(\hat{\rho}\) close to 0?
What is the meaning of \(\hat{\rho}\) close to 1?
. . .
Question: Fit the unconditional means model and calculate the intraclass correlation.
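One possible approach, sketched with lme4 (the model name `model0` is ours, not from the slides):

# unconditional means (random intercepts) model
model0 <- lmer(MathAvgScore ~ 1 + (1 | schoolid), data = charter)

# intraclass correlation: between-school variance / total variance
vc <- as.data.frame(VarCorr(model0))
vc$vcov[1] / sum(vc$vcov)  # ≈ 0.798, per the next slide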
Intraclass correlation (2/2)
The intraclass correlation is the proportion of the total variability that is between groups.
\(\hat{\rho} = 0.798\). This means…
About 79.8% of the variability in math scores can be attributed to differences between schools (school-to-school variability). About 20.2% of the variability can be attributed to changes over time.
The average correlation between any two responses from the same school is about 0.798.
The effective sample size (number of independent pieces of information available for modeling) is closer to the number of schools \((\rho \text{ close to 1})\) than the number of observations \((\rho\text{ close to 0})\)
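For reference, this last point can be made concrete with the standard design-effect approximation (not from the slides): with \(m\) observations per group,

\[n_{\text{eff}} \approx \frac{n}{1 + (m - 1)\hat{\rho}}\]

Here \(m = 3\) years and \(\hat{\rho} = 0.798\) give \(n_{\text{eff}} \approx 1854 / (1 + 2 \times 0.798) \approx 714\), much closer to the 618 schools than to the 1854 observations.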
Unconditional growth model
The next step in model building is the unconditional growth model, a model with Level One predictors but no Level Two predictors.
Level One: \(Y_{ij} = a_i + b_iYear08_{ij} + \epsilon_{ij}\)
Level Two:

- \(a_i = \alpha_0 + u_i\)
- \(b_i = \beta_0 + v_i\)
. . .
Questions:
- Write the composite model.
- What can we learn from this model?
- Fit the unconditional growth model.
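One possible fit, sketched with lme4 (the name `model_growth` is ours):

# unconditional growth model: random intercepts and random slopes for year08
model_growth <- lmer(MathAvgScore ~ year08 + (year08 | schoolid),
                     data = charter)
summary(model_growth)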
Pseudo \(R^2\)
We can use pseudo \(R^2\) to quantify the change in a variance component between two models
- Note: This should only be used when the definition of the variance component is the same in both models
. . .
\[\text{Pseudo }R^2 = \frac{\hat{\sigma}^2(\text{Model 1}) - \hat{\sigma}^2(\text{Model 2})}{\hat{\sigma}^2(\text{Model 1})}\]
. . .
Question: Calculate the \(\text{Pseudo }R^2\) to estimate the change of within school variance between the unconditional means and unconditional growth models.
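A minimal sketch, reusing the hypothetical names `model0` and `model_growth` from the earlier sketches:

# within-school (Level One) residual variance from each model
sigma2_means  <- sigma(model0)^2        # unconditional means model
sigma2_growth <- sigma(model_growth)^2  # unconditional growth model

# pseudo R^2: proportion of within-school variance explained by year08
(sigma2_means - sigma2_growth) / sigma2_means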
Model with school-level covariates
Fit a model with school-level covariates that takes the following form:
Level One
\[\begin{equation*} Y_{ij}= a_{i} + b_{i}Year08_{ij} + \epsilon_{ij} \end{equation*}\]
Level Two
\[\begin{align*} a_{i} & = \alpha_{0} + \alpha_{1}Charter_i + \alpha_{2}urban_i + \alpha_{3}schpctfree_i + u_{i} \\ b_{i} & = \beta_{0} + \beta_{1}Charter_i + \beta_{2}urban_i + v_{i} \end{align*}\]
Questions
- Write out the composite model.
- Fit the model in R.
- Use the model to describe how the average math scores differed between charter and non-charter schools.
Consider a simpler model
Would a model without the effects \(v_i\) and \(\rho_{uv}\) be preferable?
. . .
Level One
\[\begin{equation*} Y_{ij}= a_{i} + b_{i}Year08_{ij} + \epsilon_{ij} \end{equation*}\]
Level Two
\[\begin{align*} a_{i} & = \alpha_{0} + \alpha_{1}Charter_i + \alpha_{2}urban_i + \alpha_{3}schpctfree_i + u_{i} \\ b_{i} & = \beta_{0} + \beta_{1}Charter_i + \beta_{2}urban_i \end{align*}\]
. . .
In this model, the effect of year is the same for all schools with a given combination of `charter` and `urban`.
Compare two models
Full model
\[\begin{aligned}Y_{ij} = [&\alpha_{0} + \alpha_{1}Charter_i + \alpha_{2}Urban_i+ \alpha_{3}schpctfree_i \\ &+ \beta_0Year08_{ij} + \beta_1Charter_i:Year08_{ij} + \beta_2Urban_i:Year08_{ij}] \\&+ [u_i + v_iYear08_{ij} + \epsilon_{ij}]\end{aligned}\]
where
\[\left[ \begin{array}{c} u_{i} \\ v_{i} \end{array} \right] \sim N \left( \left[ \begin{array}{c} 0 \\ 0 \end{array} \right], \left[ \begin{array}{cc} \sigma_{u}^{2} & \sigma_{uv} \\ \sigma_{uv} & \sigma_{v}^{2} \end{array} \right] \right) \hspace{2mm} \text{ and }\hspace{2mm} \epsilon_{ij} \sim N(0, \sigma^2)\]
. . .
Estimated fixed effects: \(\alpha_0, \alpha_1, \alpha_2, \alpha_3, \beta_0, \beta_1, \beta_2\)
Variance components to estimate: \(\sigma, \sigma_u, \sigma_v, \rho_{uv}\) (Note: \(\sigma_{uv} = \rho_{uv}\sigma_u\sigma_v\))
Compare two models
Null Model (simplified variance structure)
\[\begin{aligned}Y_{ij} = [&\alpha_{0} + \alpha_{1}Charter_i + \alpha_{2}Urban_i+ \alpha_{3}schpctfree_i \\ &+ \beta_0Year08_{ij} + \beta_1Charter_i:Year08_{ij} + \beta_2Urban_i:Year08_{ij}] \\&+ [u_i + \epsilon_{ij}]\end{aligned}\]
where
\[u_i \sim N(0, \sigma^2_u) \hspace{2mm} \text{ and }\hspace{2mm} \epsilon_{ij} \sim N(0, \sigma^2)\]
. . .
Estimated fixed effects: \(\alpha_0, \alpha_1, \alpha_2, \alpha_3, \beta_0, \beta_1, \beta_2\)
Variance components to estimate: \(\sigma, \sigma_u\)
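As a check, these counts match the `npar` column in the model comparison output below: the full model has 7 fixed effects plus 4 variance components (11 parameters), while the null model has 7 fixed effects plus 2 variance components (9 parameters).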
Full and reduced models
full_model <- lmer(MathAvgScore ~ charter + urban + schPctfree +
                     charter:year08 + urban:year08 + year08 +
                     (year08 | schoolid), REML = TRUE, data = charter)

reduced_model <- lmer(MathAvgScore ~ charter + urban + schPctfree +
                        charter:year08 + urban:year08 + year08 +
                        (1 | schoolid), REML = TRUE, data = charter)
. . .
Hypotheses
\[\begin{aligned}&H_0: \sigma_v = \rho_{uv} = 0\\ &H_a: \text{at least one of the parameters is not equal to 0}\end{aligned}\]
Note: \(\rho_{uv} \neq 0 \Rightarrow \sigma_v \neq 0\)
Issues with the drop-in-deviance test
anova(full_model, reduced_model, test = "Chisq") |>
kable(digits = 3)
refitting model(s) with ML (instead of REML)
 | npar | AIC | BIC | logLik | deviance | Chisq | Df | Pr(>Chisq) |
---|---|---|---|---|---|---|---|---|
reduced_model | 9 | 9952.992 | 10002.11 | -4967.496 | 9934.992 | NA | NA | NA |
full_model | 11 | 9953.793 | 10013.83 | -4965.897 | 9931.793 | 3.199 | 2 | 0.202 |
The \(\chi^2\) test is conservative, i.e., the p-values are larger than they should be, when testing random effects at the boundary (e.g., \(\sigma_v^2 = 0\)) or parameters with bounded ranges (e.g., \(\rho_{uv}\))
If you observe small p-values, you can feel relatively certain the tested effects are statistically significant
Use bootstrap methods to obtain more accurate p-values
Parametric bootstrapping
Bootstrapping is from the phrase “pulling oneself up by one’s bootstraps”
Accomplishing a difficult task without any outside help
Task: conduct inference for model parameters (fixed and random effects) using only the sample data
Parametric bootstrapping for likelihood ratio test
1️⃣ Fit the null (reduced) model to obtain the fixed effects and variance components (parametric part).
2️⃣ Use the estimated fixed effects and variance components to generate a new set of response values with the same sample size and associated covariates for each observation as the original data (bootstrap part).
3️⃣ Fit the full and null models to the newly generated data.
4️⃣ Compute the likelihood test statistic comparing the models from the previous step.
5️⃣ Repeat steps 2 - 4 many times (~ 1000).
Parametric bootstrapping for likelihood ratio test
6️⃣ Create a histogram of the likelihood ratio statistics to get the distribution of likelihood ratio statistic under the null hypothesis.
7️⃣ Get the p-value by calculating the proportion of bootstrapped test statistics greater than the observed statistic.
Let’s calculate the bootstrapped p-value for the likelihood ratio test statistic testing whether \(v_i\) and \(\rho_{uv}\) can be removed from the model.
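A minimal sketch of steps 1-7, assuming `full_model` and `reduced_model` from earlier (the object names in the loop are ours):

set.seed(1234)  # for reproducibility
nsim <- 1000
lrt_boot <- numeric(nsim)

for (i in 1:nsim) {
  # Step 2: generate new responses from the fitted null model
  y_boot <- simulate(reduced_model)[[1]]
  # Step 3: refit both models to the simulated responses
  null_boot <- refit(reduced_model, y_boot)
  full_boot <- refit(full_model, y_boot)
  # Step 4: likelihood ratio statistic for this bootstrap sample
  # (models differ only in random effects, so REML log-likelihoods are comparable)
  lrt_boot[i] <- as.numeric(2 * (logLik(full_boot) - logLik(null_boot)))
}

# Steps 6-7: compare the observed statistic to the bootstrap distribution
lrt_obs <- as.numeric(2 * (logLik(full_model) - logLik(reduced_model)))
mean(lrt_boot >= lrt_obs)  # bootstrap p-value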
LRT using \(\chi^2\) and parametric bootstrap
Likelihood ratio test using \(\chi^2\) distribution
refitting model(s) with ML (instead of REML)
 | npar | AIC | BIC | logLik | deviance | Chisq | Df | Pr(>Chisq) |
---|---|---|---|---|---|---|---|---|
reduced_model | 9 | 9952.992 | 10002.11 | -4967.496 | 9934.992 | NA | NA | NA |
full_model | 11 | 9953.793 | 10013.83 | -4965.897 | 9931.793 | 3.199 | 2 | 0.202 |
Likelihood ratio test using parametric bootstrap
term | npar | AIC | BIC | logLik | deviance | statistic | df | Pr_boot..Chisq. |
---|---|---|---|---|---|---|---|---|
m0 | 9 | 9952.992 | 10002.11 | -4967.496 | 9934.992 | NA | NA | NA |
mA | 11 | 9953.793 | 10013.83 | -4965.897 | 9931.793 | 3.199347 | 2 | 0.144 |
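The bootstrap p-value (0.144) is smaller than the \(\chi^2\)-based p-value (0.202), consistent with the \(\chi^2\) test being conservative when testing variance components; with either p-value, we fail to reject \(H_0\), so the simpler variance structure is reasonable.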
Inference for fixed effects (1/2)
The output for multilevel models does not contain p-values for the coefficients of the fixed effects
The exact distribution of the test statistic under the null hypothesis (no fixed effect) is unknown, because the exact degrees of freedom are unknown
- Finding suitable approximations is an area of ongoing research
In the `tidy()` function, with its variant from broom.mixed, you can ask for `conf.int = TRUE` to get confidence intervals. We can look up how that works in the package's vignette. Let's look up the package!
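A minimal sketch for the reduced model:

# Wald-style confidence intervals for the fixed effects
tidy(reduced_model, effects = "fixed", conf.int = TRUE) |>
  kable(digits = 3)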
Inference for fixed effects (2/2)
- We can use a likelihood ratio test with an approximate \(\chi^2\) distribution to test these effects, since we're not testing on the boundary and fixed effects do not have bounded ranges
- Some research suggests the resulting p-values are too low, but the approximations are generally pretty good
- Can also calculate the p-values using parametric bootstrap approach
- Let's test whether `schPctfree` should be included in the current model, using a likelihood ratio test with the \(\chi^2\) distribution and the parametric bootstrap.
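One possible approach, sketched below (the name `model_nofree` is ours); the parametric bootstrap version would repeat the earlier steps with these two models:

# same model without schPctfree; anova() refits both models with ML for the test
model_nofree <- lmer(MathAvgScore ~ charter + urban +
                       charter:year08 + urban:year08 + year08 +
                       (1 | schoolid), REML = TRUE, data = charter)
anova(model_nofree, reduced_model, test = "Chisq")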
Methods to conduct inference for individual coefficients
- Use the t-value \(\big(\frac{\text{estimate}}{\text{std.error}}\big)\) in the model output
- General rule: coefficients with |t-value| > 2 are considered statistically significant, i.e., different from 0
- The confidence intervals given by `tidy()` are not guaranteed to be accurate, so it is best to use multiple approaches and check whether they agree
- Calculate confidence intervals using nonparametric bootstrapping
Nonparametric bootstrapping
1️⃣ Take a sample of size \(n\) (the size of the original data), with replacement, from the observed cases (called case resampling).
2️⃣ Fit the model to obtain estimates of the coefficients.
3️⃣ Repeat steps 1 - 2 many times (~ 1000) to obtain the bootstrap distribution.
4️⃣ Get the 95% confidence interval for each coefficient by taking the middle 95% of its bootstrap distribution.
Bootstrapped CI for coefficients
confint(reduced_model, method = "boot", level = 0.95, oldNames = FALSE) |>
kable(digits = 3)
Computing bootstrap confidence intervals ...
 | 2.5 % | 97.5 % |
---|---|---|
sd_(Intercept)|schoolid | 4.264 | 4.829 |
sigma | 2.870 | 3.110 |
(Intercept) | 659.315 | 661.415 |
charter | -4.374 | -1.395 |
urban | -1.958 | -0.218 |
schPctfree | -19.523 | -16.289 |
year08 | 1.259 | 1.784 |
charter:year08 | 0.397 | 1.639 |
urban:year08 | -0.900 | -0.193 |
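For example, the interval for `charter`, (-4.374, -1.395), excludes 0, suggesting that, holding the other covariates constant, charter schools had lower average math scores in 2008 than non-charter public schools; the interval for `charter:year08`, (0.397, 1.639), suggests charter schools' scores also grew faster from 2008 to 2010.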
Summary: Model comparisons
Methods to compare models with different fixed effects
- Likelihood ratio tests based on \(\chi^2\) distribution
- Likelihood ratio test based on parametric bootstrapped p-values
- AIC or BIC
. . .
Methods to compare models with different variance components
- Likelihood ratio test based on parametric bootstrapped p-values
- AIC or BIC
Summary: Understanding the model
Methods to understand individual coefficients
- Likelihood ratio tests
- Bootstrap confidence intervals
- Pseudo \(R^2\) for variance components (if meaning is unchanged between models)
. . .
Methods to understand data structure
- Calculate intraclass correlation coefficient using unconditional means model
Acknowledgements
The content in the slides is from
BMLR: Chapter 9 - Two-level Longitudinal Data
- Sections 9.1 - 9.6
Hierarchical Linear Modeling with Maximum Likelihood, Restricted Maximum Likelihood, and Fully Bayesian Estimation by Peter Boedeker
Applied Longitudinal Data Analysis: Modeling Change and Event Occurrence by J.D. Singer and J.B. Willett
- Online copy available through Duke library
Extending the Linear Model with R by Julian Faraway
- Online copy available through Duke library
Initial versions of the slides are by Dr. Maria Tackett, Duke University