BMLR Chapter 8
Cornell College
STA 363 Spring 2025 Block 8
We can think of correlated data as a multilevel structure
For now we will focus on data with two levels:
Radon is a carcinogen – a naturally occurring radioactive gas whose decay products are also radioactive – known to cause lung cancer in high concentrations. The EPA sampled more than 80,000 homes across the U.S. Each house came from a randomly selected county and measurements were made on each level of each home. Uranium measurements at the county level were included to improve the radon estimates.
The data musicdata.csv
come from the Sadler and Miller (2010) study of the emotional state of musicians before performances. The dataset contains information collected from 37 undergraduate music majors who completed the Positive Affect Negative Affect Schedule (PANAS), an instrument that produces a measure of anxiety (negative affect) and a measure of happiness (positive affect). This analysis will focus on negative affect as a measure of performance anxiety.
The primary variables we’ll use are
na: negative affect score on PANAS (the response variable)
perform_type: type of performance (Solo, Large Ensemble, Small Ensemble)
instrument: type of instrument (Voice, Orchestral, Piano)

id | diary | perform_type | na | gender | instrument |
---|---|---|---|---|---|
1 | 1 | Solo | 11 | Female | voice |
1 | 2 | Large Ensemble | 19 | Female | voice |
1 | 3 | Large Ensemble | 14 | Female | voice |
43 | 1 | Solo | 19 | Female | voice |
43 | 2 | Solo | 13 | Female | voice |
43 | 3 | Small Ensemble | 19 | Female | voice |
What are the Level One observations? Level Two observations?
What are the Level One variables? Level Two variables?
Level One variables
Two ways to approach univariate EDA (visualizations and summary statistics) for Level One variables:
Use individual observations (i.e., treat observations as independent)
Use aggregated values for each Level Two observation
Level Two variables
Let’s do some Univariate EDA together
library(tidyverse)  # ggplot2 and dplyr
library(patchwork)  # combine plots with + and /

p1 <- ggplot(data = music, aes(x = na)) +
geom_histogram(fill = "steelblue", color = "black", binwidth = 2) +
labs(x = "Individual negative affect",
title = "Negative affect scores")
p2 <- music %>%
group_by(id) %>%
summarise(mean_na = mean(na)) %>%
ggplot(aes(x = mean_na)) +
geom_histogram(fill = "steelblue", color = "black", binwidth = 2) +
labs(x = "Mean negative affect",
title = "Mean negative affect scores")
p1 + p2
Goals
There are two ways to visualize these associations:
One plot of individual observations (i.e., treat observations as independent)
Separate plots of responses vs. predictor for each Level Two observation (lattice plots)
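A lattice plot can be sketched with facet_wrap(): one panel of response vs. predictor per Level Two observation. The data frame below is a simulated stand-in for music (four musicians instead of 37), used only so the sketch is self-contained:

```r
library(ggplot2)

# Simulated stand-in for the music data: 4 musicians, 15 diary entries each
set.seed(363)
music_sim <- data.frame(
  id = rep(1:4, each = 15),
  perform_type = sample(c("Solo", "Small Ensemble", "Large Ensemble"),
                        size = 60, replace = TRUE),
  na = round(rnorm(60, mean = 16, sd = 4))
)

# One panel per musician: response vs. predictor within each Level Two unit
p_lattice <- ggplot(music_sim, aes(x = perform_type, y = na)) +
  geom_point() +
  facet_wrap(~ id) +
  labs(x = "Performance type", y = "Negative affect")
p_lattice
```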
Now let's do some Bivariate EDA.
The goal is to understand variability in performance anxiety (na
) based on performance-level and musician-level characteristics. Specifically:
What is the problem with using the following model to draw conclusions?
term | estimate | std.error | statistic | p.value |
---|---|---|---|---|
(Intercept) | 15.721 | 0.359 | 43.778 | 0.000 |
orchestra | 1.789 | 0.552 | 3.243 | 0.001 |
large_ensemble | -0.277 | 0.791 | -0.350 | 0.727 |
orchestra:large_ensemble | -1.709 | 1.062 | -1.609 | 0.108 |
1️⃣ Condense each musician’s set of responses into a single outcome (e.g., mean, max, last observation) and fit a linear model on these condensed observations
2️⃣ Fit a separate model for each musician to understand the association between performance type and performance anxiety (Level One models). Then fit a system of Level Two models to predict the fitted coefficients in the Level One model for each subject based on instrument type (Level Two models).
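Approach 1️⃣ can be sketched in base R by aggregating to one row per musician and fitting an ordinary linear model. The data below are simulated stand-ins (the real analysis would condense music by id):

```r
# Simulated stand-in: 10 musicians with 5 diary entries each
set.seed(1)
music_sim <- data.frame(
  id        = rep(1:10, each = 5),
  orchestra = rep(rbinom(10, 1, 0.5), each = 5),
  na        = round(rnorm(50, mean = 16, sd = 4))
)

# Condense each musician's responses to a single mean outcome
condensed <- aggregate(na ~ id + orchestra, data = music_sim, FUN = mean)

# One ordinary linear model on the 10 condensed observations
fit_condensed <- lm(na ~ orchestra, data = condensed)
coef(fit_condensed)
```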
Let’s look at approach #2
We’ll start with the Level One model to understand the association between performance type and performance anxiety for the \(i^{th}\) musician.
\[na_{ij} = a_i + b_i ~ LargeEnsemble_{ij} + \epsilon_{ij}, \hspace{5mm} \epsilon_{ij} \sim N(0,\sigma^2)\]
Why is it more meaningful to use performance type for the Level One model than instrument?
For now, estimate \(a_i\) and \(b_i\) using least-squares regression.
Below is partial data for observation #22
music %>%
filter(id == 22) %>%
select(id, diary, perform_type, instrument, na) %>%
slice(1:3, 13:15) %>%
kable()
id | diary | perform_type | instrument | na |
---|---|---|---|---|
22 | 1 | Solo | orchestral instrument | 24 |
22 | 2 | Large Ensemble | orchestral instrument | 21 |
22 | 3 | Large Ensemble | orchestral instrument | 14 |
22 | 13 | Large Ensemble | orchestral instrument | 12 |
22 | 14 | Large Ensemble | orchestral instrument | 19 |
22 | 15 | Solo | orchestral instrument | 25 |
term | estimate | std.error | statistic | p.value |
---|---|---|---|---|
(Intercept) | 24.500 | 1.96 | 12.503 | 0.000 |
large_ensemble | -7.833 | 2.53 | -3.097 | 0.009 |
Repeat for all 37 musicians
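Repeating the Level One fit for every musician can be sketched in base R by splitting on id. The data below are simulated stand-ins (five musicians instead of 37):

```r
# Simulated stand-in for the diary data: 5 musicians, 12 entries each
set.seed(8)
music_sim <- data.frame(
  id = rep(1:5, each = 12),
  large_ensemble = rep(c(0, 1), times = 30),
  na = round(rnorm(60, mean = 16, sd = 4))
)

# Fit na ~ large_ensemble separately for each musician (Level One models)
level_one_fits <- lapply(split(music_sim, music_sim$id),
                         function(d) lm(na ~ large_ensemble, data = d))

# Collect each musician's fitted intercept (a_i) and slope (b_i)
ab <- t(sapply(level_one_fits, coef))
colnames(ab) <- c("a_i", "b_i")
ab
```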
Now let’s consider if there is an association between the estimated slopes, estimated intercepts, and the type of instrument
The slope and intercept for the \(i^{th}\) musician can be modeled as
\[\begin{aligned}&a_i = \alpha_0 + \alpha_1 ~ Orchestra_i + u_i \\ &b_i = \beta_0 + \beta_1 ~ Orchestra_i + v_i\end{aligned}\]
Note the response variables in the Level Two models are not observed outcomes but the fitted slope and intercept from each musician's Level One model
Let’s fit the Level Two models
musicians <- music %>%
distinct(id, orchestra) %>%
bind_cols(model_stats)
p1 <- ggplot(data = musicians, aes(x = intercepts, y = factor(orchestra))) +
geom_boxplot(fill = "steelblue", color = "black") +
labs(x = "Fitted intercepts",
y = "Orchestra")
p2 <- ggplot(data = musicians, aes(x = slopes, y = factor(orchestra))) +
geom_boxplot(fill = "steelblue", color = "black") +
labs(x = "Fitted slopes",
y = "Orchestra")
p1 / p2
Model for intercepts
term | estimate | std.error | statistic | p.value |
---|---|---|---|---|
(Intercept) | 16.283 | 0.671 | 24.249 | 0.000 |
orchestra | 1.411 | 0.991 | 1.424 | 0.163 |
Model for slopes
term | estimate | std.error | statistic | p.value |
---|---|---|---|---|
(Intercept) | -0.771 | 0.851 | -0.906 | 0.373 |
orchestra | -1.406 | 1.203 | -1.168 | 0.253 |
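The two tables above come from ordinary lm() fits on the musician-level data. A sketch on simulated stand-in values (the real analysis uses the musicians data frame with its fitted intercepts and slopes):

```r
# Simulated stand-in for the 37 fitted intercepts and slopes
set.seed(2)
musicians_sim <- data.frame(
  orchestra  = rbinom(37, 1, 0.5),
  intercepts = rnorm(37, mean = 16, sd = 3),
  slopes     = rnorm(37, mean = -1, sd = 2)
)

# Level Two model for the intercepts: a_i ~ Orchestra
fit_intercepts <- lm(intercepts ~ orchestra, data = musicians_sim)
# Level Two model for the slopes: b_i ~ Orchestra
fit_slopes <- lm(slopes ~ orchestra, data = musicians_sim)

coef(fit_intercepts)
coef(fit_slopes)
```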
Level One
\[\hat{na}_{ij} = \hat{a}_i + \hat{b}_i ~ LargeEnsemble_{ij}\]
for each musician.
Level Two
\[\begin{aligned}&\hat{a}_i = 16.283 + 1.411 ~ Orchestra_i \\ &\hat{b}_i = -0.771 - 1.406 ~ Orchestra_i\end{aligned}\]
\[\begin{aligned}\hat{na}_{ij} &= 16.283 + 1.411 ~ Orchestra_i - 0.771 ~ LargeEnsemble_{ij} \\ &- 1.406 ~ Orchestra_i:LargeEnsemble_{ij}\end{aligned}\]
(Note that we also have the error terms \(\epsilon_{ij}, u_i, v_i\) that we will discuss soon)
What is the predicted average performance anxiety before solos and small ensemble performances for vocalists and keyboardists? For those who play orchestral instruments?
What is the predicted average performance anxiety before large ensemble performances for those who play orchestral instruments?
⚠️ Weighs each musician the same regardless of number of diary entries
⚠️ Drops subjects who have missing values for slope (7 individuals who had no large ensemble performances)
⚠️ Does not share strength effectively across individuals
We will use a unified approach that utilizes likelihood-based methods to address some of these drawbacks.
Let \(Y_{ij}\) be the performance anxiety for the \(i^{th}\) musician before performance \(j\).
Level One
\[Y_{ij} = a_i + b_i ~ LargeEnsemble_{ij} + \epsilon_{ij}\]
Level Two
\[\begin{aligned}&a_i = \alpha_0 + \alpha_1 ~ Orchestra_i+ u_i\\ &b_i = \beta_0 + \beta_1~Orchestra_i + v_i\end{aligned}\]
This approach uses likelihood-based methods (instead of least squares) to address the previously mentioned disadvantages
Plug in the equations for \(a_i\) and \(b_i\) to get the composite model
\[Y_{ij} = a_i + b_i ~ LargeEnsemble_{ij} + \epsilon_{ij}\]
\[\begin{aligned}&a_i = \alpha_0 + \alpha_1 ~ Orchestra_i+ u_i\\ &b_i = \beta_0 + \beta_1~Orchestra_i + v_i\end{aligned}\]
\[\begin{aligned}Y_{ij} &= a_i + b_i ~ LargeEnsemble_{ij} + \epsilon_{ij}\\ &= (\alpha_0 + \alpha_1 ~ Orchestra_i + u_i) + (\beta_0 + \beta_1 ~ Orchestra_i + v_i) ~ LargeEnsemble_{ij} + \epsilon_{ij}\\ &= \alpha_0 + \alpha_1 ~ Orchestra_i + u_i + \beta_0 ~ LargeEnsemble_{ij} + \beta_1 ~ Orchestra_i ~ LargeEnsemble_{ij} + v_i ~ LargeEnsemble_{ij} + \epsilon_{ij} \end{aligned}\]
lmer(response ~ fixed effect variables + (random effect variables | group variable), data = your_data)
\[\begin{aligned} Y_{ij} &= \alpha_0 + \alpha_1 ~ Orchestra_i + u_i + \beta_0 ~ LargeEnsemble_{ij} + \beta_1 ~ Orchestra_i ~ LargeEnsemble_{ij} + v_i ~ LargeEnsemble_{ij} + \epsilon_{ij}\\ &= \alpha_0 + \alpha_1 ~ Orchestra_i + \beta_0 ~ LargeEnsemble_{ij} + \beta_1 ~ Orchestra_i ~ LargeEnsemble_{ij} + u_i + v_i ~ LargeEnsemble_{ij} + \epsilon_{ij} \end{aligned}\]
lmer(na ~ 1 + orchestra + large_ensemble + orchestra:large_ensemble + (1 + large_ensemble | id), data = music)
Plug in the equations for \(a_i\) and \(b_i\) to get the composite model
\[\begin{aligned}Y_{ij} &= (\alpha_0 + \alpha_1 ~ Orchestra_i + \beta_0 ~ LargeEnsemble_{ij} \\ &+ \beta_1 ~ Orchestra_i:LargeEnsemble_{ij})\\ &+ (u_i + v_i ~ LargeEnsemble_{ij} + \epsilon_{ij})\end{aligned}\]
Note that we no longer need to estimate \(a_i\) and \(b_i\) directly as we did earlier. They conceptually connect the Level One and Level Two models.
Greek letters denote the fixed effect model parameters to be estimated
Roman letters denote the preliminary fixed effects at lower levels that will not be estimated directly.
\(\sigma\) and \(\rho\) denote variance components that will be estimated
\(\epsilon_{ij}, u_i, v_i\) denote error terms
We generally assume that the error terms are normally distributed, e.g. error associated with each performance of a given musician is \(\epsilon_{ij} \sim N(0, \sigma^2)\)
For the Level Two models, the errors are \(u_i\) and \(v_i\)
We need to account for the fact that \(u_i\) and \(v_i\) are correlated for the \(i^{th}\) musician
Question: Describe what we learn about the association between the slopes and intercepts based on this plot.
musicians_complete <- musicians %>%
filter(!is.na(slopes))

r <- round(cor(musicians_complete$intercepts, musicians_complete$slopes), 3)

ggplot(musicians_complete, aes(x = intercepts, y = slopes)) +
geom_point() +
geom_smooth(method = "lm") +
labs(x = "Fitted intercepts",
y = "Fitted slopes",
title = "Fitted slopes and intercepts",
subtitle = paste0("r = ", r))
Use a multivariate normal distribution for the Level Two error terms
\[\left[ \begin{array}{c} u_{i} \\ v_{i} \end{array} \right] \sim N \left( \left[ \begin{array}{c} 0 \\ 0 \end{array} \right], \left[ \begin{array}{cc} \sigma_{u}^{2} & \rho_{uv}\sigma_{u}\sigma_v \\ \rho_{uv}\sigma_{u}\sigma_v & \sigma_{v}^{2} \end{array} \right] \right)\]
where \(\sigma^2_u\) and \(\sigma^2_v\) are the variances of the \(u_i\)’s and \(v_i\)’s, respectively, and \(\sigma_{uv} = \rho_{uv}\sigma_u\sigma_v\) is the covariance between \(u_i\) and \(v_i\)
What does it mean for \(\rho_{uv} > 0\)?
What does it mean for \(\rho_{uv} < 0\)?
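To see what \(\rho_{uv}\) controls, we can simulate Level Two error pairs from the bivariate normal distribution above using MASS::mvrnorm() (MASS is bundled with standard R installations). The variance component values here are made up for illustration:

```r
library(MASS)  # for mvrnorm()

# Assumed variance components, chosen for illustration only
sigma_u <- 2
sigma_v <- 1
rho_uv  <- -0.6

# Variance-covariance matrix of (u_i, v_i)
Sigma <- matrix(c(sigma_u^2,                  rho_uv * sigma_u * sigma_v,
                  rho_uv * sigma_u * sigma_v, sigma_v^2),
                nrow = 2)

# Draw (u_i, v_i) pairs for 1000 hypothetical musicians
set.seed(363)
uv <- mvrnorm(n = 1000, mu = c(0, 0), Sigma = Sigma)

# With rho_uv < 0, higher intercepts tend to pair with lower slopes
cor(uv[, 1], uv[, 2])
```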
Being able to write out these mammoth variance-covariance matrices is less important than recognizing the number of variance components that must be estimated by our intended model.
Recreated from Figure 8.12
Fit the multilevel model using the lmer() function from the lme4 package. Display results using the tidy() function from the broom.mixed package.
effect | group | term | estimate | std.error | statistic |
---|---|---|---|---|---|
fixed | NA | (Intercept) | 15.930 | 0.641 | 24.833 |
fixed | NA | orchestra | 1.693 | 0.945 | 1.791 |
fixed | NA | large_ensemble | -0.911 | 0.845 | -1.077 |
fixed | NA | orchestra:large_ensemble | -1.424 | 1.099 | -1.295 |
ran_pars | id | sd__(Intercept) | 2.378 | NA | NA |
ran_pars | id | cor__(Intercept).large_ensemble | -0.635 | NA | NA |
ran_pars | id | sd__large_ensemble | 0.672 | NA | NA |
ran_pars | Residual | sd__Observation | 4.670 | NA | NA |
na ~ orchestra + large_ensemble + orchestra:large_ensemble: Represents the fixed effects + intercept
(large_ensemble|id): Represents the error terms and associated variance components. It specifies two error terms: \(u_i\), corresponding to the intercepts, and \(v_i\), corresponding to the effect of large ensemble.
Display results using the tidy
function from the broom.mixed package.
tidy(music_model) %>% filter(effect == "fixed")
tidy(music_model) %>% filter(effect == "ran_pars")
effect | group | term | estimate | std.error | statistic |
---|---|---|---|---|---|
fixed | NA | (Intercept) | 15.930 | 0.641 | 24.833 |
fixed | NA | orchestra | 1.693 | 0.945 | 1.791 |
fixed | NA | large_ensemble | -0.911 | 0.845 | -1.077 |
fixed | NA | orchestra:large_ensemble | -1.424 | 1.099 | -1.295 |
Label the fixed effects
effect | group | term | estimate | std.error | statistic |
---|---|---|---|---|---|
ran_pars | id | sd__(Intercept) | 2.378 | NA | NA |
ran_pars | id | cor__(Intercept).large_ensemble | -0.635 | NA | NA |
ran_pars | id | sd__large_ensemble | 0.672 | NA | NA |
ran_pars | Residual | sd__Observation | 4.670 | NA | NA |
Label the error terms and variance components
Split into 3 groups.
Each group will write the interpretation for one main effect and one variance component. The terms on each slide do not necessarily have a direct correspondence to each other.
One person write the group’s interpretations on the board.
Below are the estimated coefficients and standard errors for four modeling approaches.
Variable | Independence | Two.Stage | LVCF | Multilevel |
---|---|---|---|---|
Intercept | 15.72(0.36) | 16.28(0.67) | 15.20(1.25) | 15.93(0.64) |
Orch | 1.79(0.55) | 1.41(0.99) | 1.45(1.84) | 1.69(0.95) |
Large | -0.28(0.79) | -0.77(0.85) | - | -0.91(0.85) |
Orch*Large | -1.71(1.06) | -1.41(1.20) | - | -1.42(1.10) |
\[\textrm{Pseudo }R^2_{L1} = \frac{\hat{\sigma}^{2}(\textrm{Model A})-\hat{\sigma}^{2}(\textrm{Model B})}{\hat{\sigma}^{2}(\textrm{Model A})}\]
- Pseudo R-squared values are not universally reliable as measures of model performance.
- Because of the complexity of estimating fixed effects and variance components at various levels of a multilevel model, it is not unusual to encounter situations in which covariates in a Level Two equation (for, say, the intercept) remain constant while other aspects of the model change, yet the associated pseudo R-squared values differ or are negative. For this reason, pseudo R-squared values in multilevel models should be interpreted cautiously.
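As a small sketch (with made-up variance estimates), the Level One pseudo R-squared is just the proportional change in the estimated residual variance between two models:

```r
# Pseudo R^2 at Level One: proportional reduction in estimated residual
# variance when moving from Model A to Model B
pseudo_r2_L1 <- function(sigma2_A, sigma2_B) {
  (sigma2_A - sigma2_B) / sigma2_A
}

# Hypothetical residual variance estimates
pseudo_r2_L1(sigma2_A = 21.8, sigma2_B = 19.5)  # positive: B reduces variance
pseudo_r2_L1(sigma2_A = 19.5, sigma2_B = 21.8)  # negative values can occur
```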
Maximum Likelihood (ML) and Restricted (Residual) Maximum Likelihood (REML) are the two most common methods for estimating the fixed effects and variance components
Maximum Likelihood (ML)
Restricted Maximum Likelihood (REML)
Example using OLS
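The OLS analogue: in ordinary regression, the ML estimate of \(\sigma^2\) divides the residual sum of squares by \(n\) and is biased downward, while the REML-style estimate divides by \(n - p\), which is the usual residual variance reported by lm(). A base-R sketch on simulated data:

```r
# Simulated regression with known error standard deviation sd = 2
set.seed(42)
n <- 40
x <- rnorm(n)
y <- 2 + 3 * x + rnorm(n, sd = 2)

fit <- lm(y ~ x)
rss <- sum(residuals(fit)^2)
p   <- length(coef(fit))  # number of estimated fixed effects (2)

sigma2_ml   <- rss / n        # ML estimate: biased downward
sigma2_reml <- rss / (n - p)  # REML-style estimate: unbiased
c(ML = sigma2_ml, REML = sigma2_reml)
```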
Research has not determined one method absolutely superior to the other
REML (REML = TRUE; the default in lmer) is preferable when the primary interest is in estimating the variance components, since it accounts for the degrees of freedom used to estimate the fixed effects.
ML (REML = FALSE) must be used if you want to compare nested fixed effects models using a likelihood ratio test (e.g., a drop-in-deviance test).
Term | ML Estimate | ML Std. Error | REML Estimate | REML Std. Error |
---|---|---|---|---|
(Intercept) | 15.924 | 0.623 | 15.930 | 0.641 |
orchestra1 | 1.696 | 0.919 | 1.693 | 0.945 |
large_ensemble1 | -0.895 | 0.827 | -0.911 | 0.845 |
orchestra1:large_ensemble1 | -1.438 | 1.074 | -1.424 | 1.099 |
sd__(Intercept) | 2.286 | NA | 2.378 | NA |
cor__(Intercept).large_ensemble1 | -1.00 | NA | -0.635 | NA |
sd__large_ensemble1 | 0.385 | NA | 0.672 | NA |
sd__Observation | 4.665 | NA | 4.670 | NA |
Conduct exploratory data analysis for Level One and Level Two variables
Fit model with no covariates to assess variability at each level
Create Level One models. Start with a single term, then add terms as needed.
Create Level Two models. Start with a single term, then add terms as needed. Start with equation for intercept term.
Begin with the full set of variance components, then remove variance terms as needed.
Alternate model building strategy in BMLR Section 8.6
See BMLR Sections 8.6 - 8.11 for full step-by-step analysis.
The content in the slides is from BMLR Chapter 8.