STA 310 - Spring 2024 - Multilevel models

id	diary	large_ensemble	mpqnem
1	1	0	16
1	2	1	16
1	3	1	16
43	1	0	17
43	2	0	17
43	3	0	17

term	estimate	std.error	statistic	p.value
(Intercept)	24.500	1.96	12.503	0.000
large_ensemble1	-7.833	2.53	-3.097	0.009

term	estimate	std.error	statistic	p.value
(Intercept)	16.283	0.671	24.249	0.000
orchestra1	1.411	0.991	1.424	0.163

term	estimate	std.error	statistic	p.value
(Intercept)	-0.771	0.851	-0.906	0.373
orchestra1	-1.406	1.203	-1.168	0.253

Framework

Let $Y_{ij}$ be the performance anxiety for the $i^{th}$ musician before performance $j$.

Level One

$Y_{ij} = a_{i} + b_{i} \text{LargeEnsemble}_{ij} + ϵ_{ij}$

Level Two

$\begin{aligned} a_{i} &= α_{0} + α_{1} \text{Orchestra}_{i} + u_{i} \\ b_{i} &= β_{0} + β_{1} \text{Orchestra}_{i} + v_{i} \end{aligned}$

Note

We will discuss the distribution of the error terms $ϵ_{i j}, u_{i}, v_{i}$ shortly.

Composite model

Plug in the equations for $a_{i}$ and $b_{i}$ to get the composite model

$\begin{aligned} Y_{ij} &= (α_{0} + α_{1} \text{Orchestra}_{i} + β_{0} \text{LargeEnsemble}_{ij} + β_{1} \text{Orchestra}_{i} : \text{LargeEnsemble}_{ij}) \\ &\quad + (u_{i} + v_{i} \text{LargeEnsemble}_{ij} + ϵ_{ij}) \end{aligned}$

The fixed effects to estimate are $α_{0}, α_{1}, β_{0}, β_{1}$
The error terms are $u_{i}, v_{i}, ϵ_{i j}$
- $u_{i}$ and $v_{i}$ are the random effects associated with each musician
- $ϵ_{i j}$ is what’s left unexplained

Note that we no longer need to estimate $a_{i}$ and $b_{i}$ directly as we did earlier. They conceptually connect the Level One and Level Two models.
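
The substitution can be written out one step at a time. This is a sketch that uses only the Level One and Level Two equations above. The last step regroups the terms into fixed effects and error terms, and the product of $\text{Orchestra}_{i}$ and $\text{LargeEnsemble}_{ij}$ is what the composite model writes as the interaction $\text{Orchestra}_{i} : \text{LargeEnsemble}_{ij}$.

```latex
\begin{aligned}
Y_{ij} &= a_i + b_i \,\text{LargeEnsemble}_{ij} + \epsilon_{ij} \\
&= (\alpha_0 + \alpha_1 \text{Orchestra}_i + u_i)
   + (\beta_0 + \beta_1 \text{Orchestra}_i + v_i)\,\text{LargeEnsemble}_{ij} + \epsilon_{ij} \\
&= \alpha_0 + \alpha_1 \text{Orchestra}_i + \beta_0 \text{LargeEnsemble}_{ij}
   + \beta_1 \text{Orchestra}_i \,\text{LargeEnsemble}_{ij} \\
&\quad + u_i + v_i \,\text{LargeEnsemble}_{ij} + \epsilon_{ij}
\end{aligned}
```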

Notation

Greek letters denote the fixed effect model parameters to be estimated
- e.g., $α_{0}, α_{1}, β_{0}, β_{1}$
Roman letters denote the preliminary fixed effects at lower levels (not directly estimated)
- e.g. $a_{i}, b_{i}$
$σ$ and $ρ$ denote variance components that will be estimated
$ϵ_{i j}, u_{i}, v_{i}$ denote error terms (not directly estimated)

Error terms

We generally assume that the error terms are normally distributed. For example, the error associated with each performance of a given musician is $ϵ_{ij} \sim N(0, σ^{2})$
For the Level Two models, the errors are
- $u_{i}$ : deviation of musician $i$ from the mean performance anxiety before solos and small ensembles after accounting for the instrument
  - musician-to-musician differences in the intercepts
- $v_{i}$ : deviation of musician $i$ from the mean difference in performance anxiety between large ensembles and other performance types after accounting for the instrument
  - musician-to-musician differences in the slopes
We need to account for the fact that $u_{i}$ and $v_{i}$ are correlated for the $i^{th}$ musician

Recreated from Figure 8.11

Describe what we learn about the association between the slopes and intercepts based on this plot.

Distribution of Level Two errors

Use a multivariate normal distribution for the Level Two error terms

$\begin{bmatrix} u_{i} \\ v_{i} \end{bmatrix} \sim N\left(\begin{bmatrix} 0 \\ 0 \end{bmatrix}, \begin{bmatrix} σ_{u}^{2} & ρ_{uv} σ_{u} σ_{v} \\ ρ_{uv} σ_{u} σ_{v} & σ_{v}^{2} \end{bmatrix}\right)$

where $σ_{u}^{2}$ and $σ_{v}^{2}$ are the variances of the $u_{i}$ and the $v_{i}$, respectively, and $σ_{uv} = ρ_{uv} σ_{u} σ_{v}$ is the covariance between $u_{i}$ and $v_{i}$

What does it mean for $ρ_{u v} > 0$ ?
What does it mean for $ρ_{u v} < 0$ ?
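
One way to build intuition for these questions is to simulate Level Two errors and look at how the sign of $ρ_{uv}$ shapes the cloud of points. The sketch below uses base R only. The values of `sigma_u`, `sigma_v`, and `rho_uv` are illustrative assumptions, not estimates from the music data, and all object names are hypothetical.

```r
# Illustrative values only (assumptions, not estimates from the music data)
sigma_u <- 2     # SD of the intercept deviations u_i
sigma_v <- 1     # SD of the slope deviations v_i
rho_uv  <- -0.6  # correlation between u_i and v_i

# Covariance matrix of (u_i, v_i), laid out as in the formula above
cov_uv <- rho_uv * sigma_u * sigma_v
Sigma <- matrix(c(sigma_u^2, cov_uv,
                  cov_uv,    sigma_v^2),
                nrow = 2, byrow = TRUE)

# Draw from N(0, Sigma): if the columns of z are independent N(0, 1) and
# R = chol(Sigma), so that t(R) %*% R equals Sigma, then z %*% R has
# covariance matrix Sigma
set.seed(310)
n <- 5000
z <- matrix(rnorm(2 * n), ncol = 2)
errors <- z %*% chol(Sigma)
colnames(errors) <- c("u", "v")

# The sample correlation should be close to rho_uv
cor(errors[, "u"], errors[, "v"])
```

Calling `plot(errors)` gives a scatterplot of the simulated pairs. Rerunning with a positive `rho_uv` flips the direction of the association between intercept and slope deviations.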

Visualizing multivariate normal distribution

Recreated from Figure 8.12

Fit the model in R

Fit multilevel model using the lmer (“linear mixed effects in R”) function from the lme4 package.

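library(lme4)

# Fixed effects: orchestra, large_ensemble, and their interaction
# Random effects: intercept and large_ensemble slope vary by musician (id)
music_model <- lmer(na ~ orchestra + large_ensemble +
       orchestra:large_ensemble + (large_ensemble | id),
       REML = TRUE, data = music)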

na ~ orchestra + large_ensemble + orchestra:large_ensemble: Represents the fixed effects
(large_ensemble|id): Represents the error terms and associated variance components
- Specifies two error terms: $u_{i}$ corresponding to the intercepts, $v_{i}$ corresponding to effect of large ensemble
- Use (1|id) for models with random intercepts and all other effects fixed.
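
To see the difference between the two random-effects specifications, the sketch below fits both to simulated data. Everything here is an assumption for illustration: `sim_music` is a hypothetical stand-in for the music data, the numbers used to generate it are made up, and the intercept and slope deviations are drawn independently for simplicity. The model-fitting step runs only if lme4 is installed. In the formulas, `orchestra * large_ensemble` is R shorthand for both main effects plus their interaction.

```r
# Simulated stand-in for the music data (all values are made up)
set.seed(310)
n_id <- 40      # number of musicians
n_diary <- 10   # performances per musician
sim_music <- data.frame(
  id = rep(seq_len(n_id), each = n_diary),
  orchestra = rep(rbinom(n_id, 1, 0.5), each = n_diary),
  large_ensemble = rbinom(n_id * n_diary, 1, 0.5)
)
u <- rnorm(n_id, sd = 2)  # musician-level intercept deviations
v <- rnorm(n_id, sd = 1)  # musician-level slope deviations
sim_music$na <- with(sim_music,
  16 + 1.5 * orchestra - 1 * large_ensemble -
    1.5 * orchestra * large_ensemble +
    u[id] + v[id] * large_ensemble +
    rnorm(n_id * n_diary, sd = 4))

if (requireNamespace("lme4", quietly = TRUE)) {
  # Random intercepts only: one shared large-ensemble effect for everyone
  fit_intercepts <- lme4::lmer(
    na ~ orchestra * large_ensemble + (1 | id), data = sim_music)
  # Random intercepts and random slopes for large_ensemble
  fit_slopes <- lme4::lmer(
    na ~ orchestra * large_ensemble + (large_ensemble | id), data = sim_music)
  print(lme4::VarCorr(fit_intercepts))
  print(lme4::VarCorr(fit_slopes))
}
```

The first fit reports one musician-level standard deviation. The second reports two, plus their correlation. With small simulated data like this, lme4 may report a singular fit for the random-slopes model.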

Tidy output

Display results using the tidy function from the broom.mixed package.

library(broom.mixed)
library(dplyr)  # provides filter(), used below
library(knitr)  # provides kable(), used below
tidy(music_model)

Get fixed effects only

tidy(music_model) |> filter(effect == "fixed")

Get errors and variance components only

tidy(music_model) |> filter(effect == "ran_pars")

Estimated fixed effects

tidy(music_model) |> 
  filter(effect == "fixed") |> 
  kable(digits = 3)

effect	group	term	estimate	std.error	statistic
fixed	NA	(Intercept)	15.930	0.641	24.833
fixed	NA	orchestra1	1.693	0.945	1.791
fixed	NA	large_ensemble1	-0.911	0.845	-1.077
fixed	NA	orchestra1:large_ensemble1	-1.424	1.099	-1.295

Estimated random effects

tidy(music_model) |> 
  filter(effect == "ran_pars") |> 
  kable(digits = 3)

effect	group	term	estimate	std.error	statistic
ran_pars	id	sd__(Intercept)	2.378	NA	NA
ran_pars	id	cor__(Intercept).large_ensemble1	-0.635	NA	NA
ran_pars	id	sd__large_ensemble1	0.672	NA	NA
ran_pars	Residual	sd__Observation	4.670	NA	NA
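
The table reports standard deviations and a correlation, while the Level Two distribution is written in terms of variances and a covariance. The sketch below does that conversion in base R using the estimates from the table. The object names are hypothetical.

```r
# Estimates copied from the random effects table above
sd_u   <- 2.378   # sd__(Intercept)
sd_v   <- 0.672   # sd__large_ensemble1
rho_uv <- -0.635  # cor__(Intercept).large_ensemble1
sd_eps <- 4.670   # sd__Observation (residual)

# Covariance between intercept and slope deviations: rho * sd_u * sd_v
cov_uv <- rho_uv * sd_u * sd_v

# Estimated covariance matrix of (u_i, v_i)
Sigma_hat <- matrix(c(sd_u^2, cov_uv,
                      cov_uv, sd_v^2),
                    nrow = 2, byrow = TRUE,
                    dimnames = list(c("u", "v"), c("u", "v")))

round(Sigma_hat, 3)   # variances 5.655 and 0.452, covariance -1.015
round(sd_eps^2, 3)    # residual variance 21.809
```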

Fitted model

$\begin{aligned} \widehat{na}_{ij} &= 15.930 + 1.693 \text{Orchestra}_{i} - 0.911 \text{LargeEnsemble}_{ij} \\ &\quad - 1.424 \text{Orchestra}_{i} : \text{LargeEnsemble}_{ij} \\ \begin{bmatrix} u_{i} \\ v_{i} \end{bmatrix} &\sim N\left(\begin{bmatrix} 0 \\ 0 \end{bmatrix}, \begin{bmatrix} 2.378^{2} & -0.635 \times 2.378 \times 0.672 \\ -0.635 \times 2.378 \times 0.672 & 0.672^{2} \end{bmatrix}\right) \\ ϵ_{ij} &\sim N(0, 4.670^{2}) \end{aligned}$
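
To read the fixed-effects part of the fitted model, it can help to compute the estimated mean performance anxiety for each combination of the two indicators. The sketch below does this in base R. It assumes `orchestra` and `large_ensemble` are 0/1 indicators, consistent with the coefficient names `orchestra1` and `large_ensemble1`. The object names are hypothetical.

```r
# Fixed-effect estimates copied from the fitted model above
fixed <- c(intercept = 15.930, orchestra = 1.693,
           large_ensemble = -0.911, interaction = -1.424)

# All four combinations of the two 0/1 indicators
grid <- expand.grid(orchestra = c(0, 1), large_ensemble = c(0, 1))

# Estimated mean for a typical musician (u_i = v_i = 0)
grid$predicted_na <- with(grid,
  fixed[["intercept"]] +
    fixed[["orchestra"]] * orchestra +
    fixed[["large_ensemble"]] * large_ensemble +
    fixed[["interaction"]] * orchestra * large_ensemble)

grid
# predicted_na, in row order: 15.930, 17.623, 15.019, 15.288
```

The interaction estimate applies only in the row where both indicators equal 1, which is why that row is not simply the sum of the two main-effect shifts.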

Sadler and Miller (2010)

Read the Data Analysis section in Sadler and Miller (2010); the paper is available in Canvas. Use the text to answer the following:

What is the goal of the analysis?
What type of model is used? What is the response variable? What is the multilevel structure?
Describe the details of the model estimation.
Describe the data wrangling / creation of new variables. What were the goals of the data wrangling steps?
How was model performance assessed?


Sadler and Miller 2010

Split into 3 - 5 groups and discuss your responses.
One person will write your group’s response to your assigned question(s).

Sadler and Miller (2010) Model 1

References

Roback, Paul, and Julie Legler. 2021. Beyond Multiple Linear Regression: Applied Generalized Linear Models and Multilevel Models in R. CRC Press.

Sadler, Michael E, and Christopher J Miller. 2010. “Performance Anxiety: A Longitudinal Study of the Roles of Personality and Experience in Musicians.” Social Psychological and Personality Science 1 (3): 280–87.
