STA 310 - Spring 2024 - Multilevel models

id	diary	perform_type	na	gender	instrument
1	1	Solo	11	Female	voice
1	2	Large Ensemble	19	Female	voice
1	3	Large Ensemble	14	Female	voice
12	1	Solo	23	Female	orchestral instrument
12	2	Solo	17	Female	orchestral instrument
12	3	Small Ensemble	25	Female	orchestral instrument

Questions we want to answer

What is the association between performance type (large ensemble or not) and performance anxiety? Does the association differ based on instrument type (orchestral or not)?

What is the problem with using an ordinary least squares model to draw conclusions?

term	estimate	std.error	statistic	p.value
(Intercept)	15.721	0.359	43.778	0.000
orchestra1	1.789	0.552	3.243	0.001
large_ensemble1	-0.277	0.791	-0.350	0.727
orchestra1:large_ensemble1	-1.709	1.062	-1.609	0.108

Other modeling approaches

1️⃣ Condense each musician’s set of responses into a single outcome (e.g., mean, max, last observation, etc.), and fit a linear model on these condensed observations

Leaves only a few observations (37) to fit the model
Ignores much of the information in the multiple observations for each musician
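To make the condensing step concrete, here is a minimal sketch in base R that uses only the six rows shown in the data preview at the top of these notes. The object names `music_head` and `na_by_musician` are made up for this illustration. With the full data, the same step would produce one row for each of the 37 musicians, and those rows would then go into a linear model.

```r
# First six rows of the music data shown at the top of these notes
music_head <- data.frame(
  id = c(1, 1, 1, 12, 12, 12),
  na = c(11, 19, 14, 23, 17, 25),
  instrument = c(rep("voice", 3), rep("orchestral instrument", 3))
)

# Condense to one row per musician: the mean anxiety score across diary entries
na_by_musician <- aggregate(na ~ id + instrument, data = music_head, FUN = mean)
print(na_by_musician)

# Musician 1 has mean 44/3 and musician 12 has mean 65/3;
# the three individual scores per musician are no longer visible
```

Each musician now contributes a single number, which is exactly the information loss noted above.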

2️⃣ Fit a separate model for each musician to understand the association between performance type and anxiety (Level One models). Then fit a system of Level Two models to predict the fitted coefficients from each subject's Level One model based on instrument type.

Let’s look at approach #2

Level One model

We’ll start with the Level One model to understand the association between performance type and performance anxiety for the $i^{th}$ musician and the $j^{th}$ performance:

$na_{ij} = a_{i} + b_{i} LargeEnsemble_{ij} + \epsilon_{ij}, \quad \epsilon_{ij} \sim N(0, \sigma^{2})$

Why is it more meaningful to use performance type for the Level One model than instrument?

For now, estimate $a_{i}$ and $b_{i}$ using least-squares regression.

Example Level One model

Below is data for id #22

# A tibble: 15 × 5
      id diary perform_type   instrument               na
   <dbl> <dbl> <chr>          <chr>                 <dbl>
 1    22     1 Solo           orchestral instrument    24
 2    22     2 Large Ensemble orchestral instrument    21
 3    22     3 Large Ensemble orchestral instrument    14
 4    22     4 Large Ensemble orchestral instrument    15
 5    22     5 Large Ensemble orchestral instrument    10
 6    22     6 Solo           orchestral instrument    24
 7    22     7 Solo           orchestral instrument    24
 8    22     8 Solo           orchestral instrument    16
 9    22     9 Small Ensemble orchestral instrument    34
10    22    10 Large Ensemble orchestral instrument    22
11    22    11 Large Ensemble orchestral instrument    19
12    22    12 Large Ensemble orchestral instrument    18
13    22    13 Large Ensemble orchestral instrument    12
14    22    14 Large Ensemble orchestral instrument    19
15    22    15 Solo           orchestral instrument    25

Level One model

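# Level One model for musician 22: regress anxiety on the large-ensemble indicator
# (filter() is from dplyr, tidy() from broom, kable() from knitr)
music |>
  filter(id == 22) |>
  lm(na ~ large_ensemble, data = _) |>
  tidy() |>
  kable(digits = 3)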

term	estimate	std.error	statistic	p.value
(Intercept)	24.500	1.96	12.503	0.000
large_ensemble1	-7.833	2.53	-3.097	0.009
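The estimates in this table can be checked directly from the 15 diary entries listed for musician 22. The sketch below is self-contained base R. It rebuilds the data by hand and, as an assumption of this example, codes `large_ensemble` as a numeric 0/1 indicator, so the slope term is labeled `large_ensemble` rather than `large_ensemble1`. The names `id22` and `fit22` are made up for this illustration.

```r
# Diary entries for musician 22, copied from the tibble above
id22 <- data.frame(
  diary = 1:15,
  perform_type = c(
    "Solo", "Large Ensemble", "Large Ensemble", "Large Ensemble",
    "Large Ensemble", "Solo", "Solo", "Solo", "Small Ensemble",
    "Large Ensemble", "Large Ensemble", "Large Ensemble",
    "Large Ensemble", "Large Ensemble", "Solo"
  ),
  na = c(24, 21, 14, 15, 10, 24, 24, 16, 34, 22, 19, 18, 12, 19, 25)
)

# Indicator: 1 for large ensemble performances, 0 for solos and small ensembles
id22$large_ensemble <- as.integer(id22$perform_type == "Large Ensemble")

# Least-squares fit of the Level One model for this musician
fit22 <- lm(na ~ large_ensemble, data = id22)
print(round(coef(fit22), 3))

# With a single 0/1 predictor, the intercept is the mean of the 0 group
# and the slope is the difference between the two group means
group_means <- tapply(id22$na, id22$large_ensemble, mean)
print(group_means)
```

The intercept (24.5) is the mean anxiety before this musician's solo and small ensemble performances, and the slope (about -7.833) is how much lower the mean is before large ensemble performances.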

Repeat for all 37 musicians. See Part 3: Level One Models in AE.

Level One models

Recreated from BMLR Figure 8.9

Now let’s consider whether the estimated intercepts and estimated slopes are associated with the type of instrument.

Level Two Model

The slope and intercept for the $i^{th}$ musician can be modeled as

$\begin{aligned} a_{i} &= \alpha_{0} + \alpha_{1} Orchestra_{i} + u_{i} \\ b_{i} &= \beta_{0} + \beta_{1} Orchestra_{i} + v_{i} \end{aligned}$

Note that the response variables in the Level Two models are not observed outcomes but the fitted intercept and slope for each musician.

See Part 4: Level Two Models in AE.

Estimated coefficients by instrument

Level Two model

Model for intercepts

term	estimate	std.error	statistic	p.value
(Intercept)	16.283	0.671	24.249	0.000
orchestra1	1.411	0.991	1.424	0.163

Model for slopes

term	estimate	std.error	statistic	p.value
(Intercept)	-0.771	0.851	-0.906	0.373
orchestra1	-1.406	1.203	-1.168	0.253

Writing out the models

Level One

$\hat{na}_{ij} = \hat{a}_{i} + \hat{b}_{i} LargeEnsemble_{ij}$

Level Two

$\begin{aligned} \hat{a}_{i} &= 16.283 + 1.411 Orchestra_{i} \\ \hat{b}_{i} &= -0.771 - 1.406 Orchestra_{i} \end{aligned}$

Estimated composite model

$\begin{aligned} \hat{na}_{ij} &= 16.283 + 1.411 Orchestra_{i} - 0.771 LargeEnsemble_{ij} \\ &\quad - 1.406 Orchestra_{i} : LargeEnsemble_{ij} \end{aligned}$

(Note that we also have the error terms $\epsilon_{ij}$, $u_{i}$, and $v_{i}$, which we will discuss later.)
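
The composite model comes from substituting the two estimated Level Two equations into the Level One equation. A sketch of that substitution, using only the estimates reported above:

$\begin{aligned} \hat{na}_{ij} &= \hat{a}_{i} + \hat{b}_{i} LargeEnsemble_{ij} \\ &= (16.283 + 1.411 Orchestra_{i}) + (-0.771 - 1.406 Orchestra_{i}) LargeEnsemble_{ij} \\ &= 16.283 + 1.411 Orchestra_{i} - 0.771 LargeEnsemble_{ij} - 1.406 Orchestra_{i} \times LargeEnsemble_{ij} \end{aligned}$

The product in the last term is what the composite model writes as the interaction $Orchestra_{i} : LargeEnsemble_{ij}$.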

What is the predicted average performance anxiety before solos and small ensemble performances for vocalists and keyboardists? For those who play orchestral instruments?
What is the predicted average performance anxiety before large ensemble performances for those who play orchestral instruments?
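
One way to work through these questions is to plug the indicator values into the estimated Level Two equations. The sketch below does that in base R with the coefficients from the two Level Two tables. The function name `predict_na` and the variable names are made up for this illustration, and the indicators are assumed to be coded 0/1.

```r
# Estimated Level Two coefficients from the tables above
alpha0 <- 16.283  # intercept model: (Intercept)
alpha1 <- 1.411   # intercept model: orchestra1
beta0  <- -0.771  # slope model: (Intercept)
beta1  <- -1.406  # slope model: orchestra1

# Predicted anxiety from the estimated composite model
# orchestra: 1 = orchestral instrument, 0 = voice or keyboard
# large_ensemble: 1 = large ensemble, 0 = solo or small ensemble
predict_na <- function(orchestra, large_ensemble) {
  a_hat <- alpha0 + alpha1 * orchestra  # musician-level intercept
  b_hat <- beta0 + beta1 * orchestra    # musician-level slope
  a_hat + b_hat * large_ensemble
}

print(predict_na(orchestra = 0, large_ensemble = 0))  # 16.283
print(predict_na(orchestra = 1, large_ensemble = 0))  # 17.694
print(predict_na(orchestra = 1, large_ensemble = 1))  # 15.517
```

So the two-stage fit predicts an average score of 16.283 for vocalists and keyboardists before solos and small ensembles, 17.694 for orchestral instrumentalists in the same settings, and 15.517 for orchestral instrumentalists before large ensemble performances.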

Disadvantages to this approach

⚠️ Weights each musician the same regardless of number of diary entries

⚠️ Drops subjects who have missing values for slope (7 individuals who didn’t play a large ensemble performance)

⚠️ Does not share strength effectively across individuals (look at $R^{2}$ values in Part 3: Level One Models of AE)

We will use a unified approach that utilizes likelihood-based methods to address some of these drawbacks.

effect	group	term	estimate	std.error	statistic
fixed	NA	(Intercept)	15.930	0.641	24.833
fixed	NA	orchestra1	1.693	0.945	1.791
fixed	NA	large_ensemble1	-0.911	0.845	-1.077
fixed	NA	orchestra1:large_ensemble1	-1.424	1.099	-1.295
ran_pars	id	sd__(Intercept)	2.378	NA	NA
ran_pars	id	cor__(Intercept).large_ensemble1	-0.635	NA	NA
ran_pars	id	sd__large_ensemble1	0.672	NA	NA
ran_pars	Residual	sd__Observation	4.670	NA	NA
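
The fixed-effect rows of this table can be read the same way as the two-stage estimates. This is a sketch under two assumptions the notes do not state outright: that the table is the output of the unified multilevel model with the same predictors, and that the `ran_pars` rows describe the musician-level errors $u_{i}$ (intercept) and $v_{i}$ (slope) along with the residual $\epsilon_{ij}$.

$\hat{na}_{ij} = 15.930 + 1.693 Orchestra_{i} - 0.911 LargeEnsemble_{ij} - 1.424 Orchestra_{i} : LargeEnsemble_{ij}$

Under that reading, the estimated standard deviations are 2.378 for $u_{i}$, 0.672 for $v_{i}$, and 4.670 for $\epsilon_{ij}$, with an estimated correlation of $-0.635$ between $u_{i}$ and $v_{i}$.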
