Multilevel Models

PSYC 573

Guiding Questions

  • What are hierarchical/multilevel models (MLMs)?
  • How do we fit Bayesian multilevel models?
  • What are the advantages of MLM?
  • How do we allow separate regression lines for many groups/clusters?
    • Varying intercepts and slopes (and variances)
    • Interpretations

Multilevel Models

  • Developed independently in several disciplines, hence the different names
    • Mixed/mixed-effect models
    • Hierarchical linear models
    • Variance component models
  • A flexible class of models to handle clustered (dependent) data
    • Extremely common in the behavioral and social sciences

MLM

MLM subsumes

  • Dependent-sample \(t\)-test
  • Random-effect ANOVA
  • Repeated-measure ANOVA
  • Variance components models
  • Growth curve models
  • Generalizability theory
  • Random-effect meta-analysis

MLM allows you to

  • Build cluster-specific regression models (or other types of models)
  • Borrow information across clusters
  • Include higher-level predictors

Multilevel Data Structures

  • Hierarchical/Nested
    • Students in schools
    • Clients nested within therapists within clinics
    • Employees nested within organizations
    • Citizens nested within countries
    • Repeated measures nested within persons

Multilevel Data Structures (Cont’d)

  • Crossed
    • Students cross-classified by high schools and middle schools
    • Responses cross-classified by items and persons

Quantifying Dependence

  • Intraclass correlation (ICC): \(\rho = \dfrac{\tau^2}{\tau^2 + \sigma^2}\)
    • Analogous to \(\eta^2\)/\(R^2\) effect size

ICC

The proportion of variance in the outcome that is due to between-cluster (e.g., between-group, between-person) differences
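To make the definition concrete, here is a minimal base-R sketch that simulates clustered data with a known ICC and recovers it with a simple method-of-moments (one-way ANOVA) estimator. The data are simulated stand-ins, not the sleepstudy data, and the values of `tau` and `sigma` are assumptions chosen only for illustration. This is also not the Bayesian estimate used later in these slides.

```r
# Simulated stand-in data: 200 clusters with 10 observations each
set.seed(573)
J <- 200
n <- 10
tau <- 3     # between-cluster SD (assumed for illustration)
sigma <- 4   # within-cluster SD (assumed for illustration)
true_icc <- tau^2 / (tau^2 + sigma^2)   # 9 / (9 + 16) = 0.36

cluster <- rep(seq_len(J), each = n)
mu_j <- rnorm(J, mean = 30, sd = tau)                 # cluster means
y <- rnorm(J * n, mean = mu_j[cluster], sd = sigma)   # observations

# Method-of-moments estimates of the variance components (balanced design)
cluster_means <- tapply(y, cluster, mean)
msb <- n * var(cluster_means)          # between-cluster mean square
msw <- mean(tapply(y, cluster, var))   # pooled within-cluster mean square
tau2_hat <- max((msb - msw) / n, 0)    # truncate at zero
icc_hat <- tau2_hat / (tau2_hat + msw)

round(c(true = true_icc, estimated = icc_hat), 3)
```

With many clusters the estimate should land near the true value of 0.36. The multilevel model below obtains the same quantity from the posterior distributions of \(\tau\) and \(\sigma\).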

Data

data(sleepstudy, package = "lme4")
head(sleepstudy)
  Reaction Days Subject
1      250    0     308
2      259    1     308
3      251    2     308
4      321    3     308
5      357    4     308
6      415    5     308
?lme4::sleepstudy
hist(sleepstudy$Reaction)

Trajectories

ICC of Reaction

  • Varying/Random intercept model

\[ \begin{aligned} \text{Reaction10}_{ij} & \sim N(\mu_j, \sigma) \\ \mu_j & \sim N(\gamma, \tau) \end{aligned} \]

  • \(\mu_j\): mean reaction time for the \(j\)th person
  • \(i\) indexes measurement occasions
  • \(\gamma\): grand mean; \(\tau\): between-person SD; \(\sigma\): within-person SD
library(brms)
# Reaction10 is Reaction in units of 10 ms (so 29.80 on this scale is 298 ms)
sleepstudy$Reaction10 <- sleepstudy$Reaction / 10
m2 <- brm(Reaction10 ~ (1 | Subject), data = sleepstudy,
          prior = c(# for intercept
            prior(normal(0, 50), class = "Intercept"),
            # for tau
            prior(gamma(2, 0.2), class = "sd"),
            # for sigma
            prior(student_t(4, 0, 5), class = "sigma")),
          # Higher adapt_delta is usually needed for MLM
          control = list(adapt_delta = .95),
          seed = 2107,
          file = "11_m2")
variable               median    mad     q5    q95
b_Intercept             29.80  0.953  28.14  31.44
sd_Subject__Intercept    3.81  0.761   2.77   5.58
sigma                    4.44  0.250   4.05   4.87

Interpretations

The model suggests that the average reaction time across individuals and measurement occasions was 298 ms, 90% CI [281 ms, 314 ms]. An estimated 43.22%, 90% CI [27.12%, 61.31%], of the variation in reaction time was attributable to between-person differences.
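
As a rough check on the reported ICC, plugging the posterior medians of \(\tau\) (3.81) and \(\sigma\) (4.44) from the model output into the ICC formula gives a similar value. This is only a plug-in approximation: the reported 43.22% presumably summarizes the ICC computed separately for each posterior draw, which need not equal the ratio built from the medians.

\[ \rho \approx \dfrac{3.81^2}{3.81^2 + 4.44^2} = \dfrac{14.52}{14.52 + 19.71} \approx 0.424 \]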

Regression for One Person (308)

\[ \begin{aligned} \text{Reaction10}_i & \sim N(\mu_i, \sigma) \\ \mu_i & = \beta_0 + \beta_1 \texttt{Days}_i \end{aligned} \]

Varying Coefficients

In MLM, parameters (\(\beta_0\), \(\beta_1\), \(\sigma\)) can

  • differ across clusters (persons)
  • be estimated by partial pooling
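
A minimal sketch of what partial pooling does for a cluster mean, under the simplifying assumption that \(\tau\), \(\sigma\), and the grand mean are known. In that case the partially pooled estimate is a weighted average of the cluster's own mean and the grand mean, with more weight on the cluster's own data when it has more observations. The function names and the numbers are hypothetical; in the Bayesian models in these slides all of these quantities are estimated jointly.

```r
# Weight placed on a cluster's own mean (the rest goes to the grand mean)
pooling_weight <- function(n_j, tau, sigma) {
  tau^2 / (tau^2 + sigma^2 / n_j)
}

# Partially pooled estimate of a cluster mean
partial_pool <- function(ybar_j, n_j, grand_mean, tau, sigma) {
  w <- pooling_weight(n_j, tau, sigma)
  w * ybar_j + (1 - w) * grand_mean
}

# Made-up numbers on the Reaction10 scale
tau <- 4
sigma <- 4
grand_mean <- 30

w_10 <- pooling_weight(10, tau, sigma)   # 16 / (16 + 1.6)
w_2  <- pooling_weight(2, tau, sigma)    # 16 / (16 + 8)
est_10 <- partial_pool(ybar_j = 36, n_j = 10, grand_mean, tau, sigma)
est_2  <- partial_pool(ybar_j = 36, n_j = 2,  grand_mean, tau, sigma)

round(c(w_10 = w_10, w_2 = w_2, est_10 = est_10, est_2 = est_2), 3)
```

A person observed only twice is pulled further toward the grand mean than a person observed ten times, even though both have the same observed mean of 36.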

Varying Intercepts

Repeated-measure level:

\[ \begin{aligned} \text{Reaction10}_{ij} & \sim N(\mu_{ij}, \sigma) \\ \mu_{ij} & = \beta_{0j} + \beta_{1} \text{Days}_{ij} \\ \end{aligned} \]

Person level:

\[ \begin{aligned} \beta_{0j} & \sim N(\mu^{[\beta_0]}, \tau^{[\beta_0]}) \\ \end{aligned} \]

Priors:

\[ \begin{aligned} \mu^{[\beta_0]} & \sim N(0, 50) \\ \tau^{[\beta_0]} & \sim \mathrm{Gamma}(2, 0.2) \\ \beta_1 & \sim N(0, 10) \\ \sigma & \sim t^+(4, 0, 5) \end{aligned} \]
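
To get a feel for what the \(\mathrm{Gamma}(2, 0.2)\) prior on \(\tau^{[\beta_0]}\) implies, the sketch below computes its mean, SD, and a few quantiles in base R. It assumes the shape-rate parameterization, which is what Stan (and therefore brms) uses for `gamma()`, and it assumes Reaction10 is reaction time in units of 10 ms.

```r
shape <- 2
rate <- 0.2

prior_mean <- shape / rate        # 10 on the Reaction10 scale, i.e., 100 ms
prior_sd <- sqrt(shape) / rate    # SD of the prior
prior_q <- qgamma(c(0.05, 0.5, 0.95), shape = shape, rate = rate)

# Same summaries expressed in milliseconds
round(10 * c(mean = prior_mean, sd = prior_sd,
             q5 = prior_q[1], median = prior_q[2], q95 = prior_q[3]), 1)
```

The prior keeps \(\tau^{[\beta_0]}\) positive and puts very little mass near zero, while still allowing between-person SDs of well over 100 ms.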

m3 <- brm(Reaction10 ~ Days + (1 | Subject),
    data = sleepstudy,
    prior = c( # for intercept
        prior(normal(0, 50), class = "Intercept"),
        # for slope
        prior(normal(0, 10), class = "b"),
        # for tau
        prior(gamma(2, 0.2), class = "sd"),
        # for sigma
        prior(student_t(4, 0, 5), class = "sigma")
    ),
    control = list(adapt_delta = .95),
    seed = 2107,
    file = "11_m3"
)
variable               median    mad      q5    q95
b_Intercept             25.23  1.054  23.552  26.93
b_Days                   1.05  0.081   0.915   1.18
sd_Subject__Intercept    3.98  0.745   2.972   5.64
sigma                    3.11  0.166   2.855   3.42

Overall Fit

Fit to Individuals

Remember: The model assumes equal slopes for each person

Varying Slopes

Repeated-measure level:

\[ \begin{aligned} \text{Reaction10}_{ij} & \sim N(\mu_{ij}, \sigma) \\ \mu_{ij} & = \beta_{0j} + \beta_{1j} \text{Days}_{ij} \\ \end{aligned} \]

Person level:

\[ \begin{aligned} \begin{bmatrix} \beta_{0j} \\ \beta_{1j} \\ \end{bmatrix} & \sim N_2\left( \begin{bmatrix} \mu^{[\beta_0]} \\ \mu^{[\beta_1]} \\ \end{bmatrix}, \mathbf T \right) \\ \mathbf T & = \begin{bmatrix} {\tau^{[\beta_0]}}^2 & \tau^{[\beta_{10}]} \\ \tau^{[\beta_{10}]} & {\tau^{[\beta_1]}}^2 \\ \end{bmatrix} \end{aligned} \]

LKJ Prior

Decomposing Covariance Matrix

  • Covariance = SD \(\times\) Correlation \(\times\) SD

\[ \mathbf T = \mathrm{diag}(\boldsymbol{\tau}) \boldsymbol{\Omega} \mathrm{diag}(\boldsymbol{\tau}) \]
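
The decomposition can be checked numerically. The sketch below rebuilds \(\mathbf T\) from the posterior medians of the two SDs and their correlation reported for the varying-slope model later in this section (Reaction10 scale). Using medians as plug-in values is a simplification for illustration only.

```r
# Posterior medians from the varying-slope model output
tau <- c(2.755, 0.666)   # SDs of beta_0j (intercepts) and beta_1j (slopes)
rho <- 0.058             # correlation between intercepts and slopes

Omega <- matrix(c(1, rho,
                  rho, 1), nrow = 2, byrow = TRUE)

# Covariance = SD x Correlation x SD
T_mat <- diag(tau) %*% Omega %*% diag(tau)
T_mat

# Going back: cov2cor() recovers the correlation matrix
cov2cor(T_mat)
```

The diagonal of `T_mat` holds the two variances and the off-diagonal element is the intercept-slope covariance, `tau[1] * tau[2] * rho`.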

Shape parameter \(\eta\)

\[ P(\boldsymbol{\Omega} | \eta) \propto \det(\boldsymbol{\Omega})^{\eta - 1} \]

  • \(\eta = 1\): uniform over correlation matrices
  • \(\eta > 1\): increasingly concentrated around zero correlations
  • \(\eta < 1\): more mass on correlations close to \(\pm 1\)
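
For a \(2 \times 2\) correlation matrix with correlation \(r\), \(\det(\boldsymbol{\Omega}) = 1 - r^2\), so the effect of \(\eta\) is easy to see numerically. The sketch below evaluates the unnormalized density from the formula above; `lkj_kernel` is a hypothetical helper written for this illustration, not a brms or Stan function.

```r
# Unnormalized LKJ density for a 2 x 2 correlation matrix
lkj_kernel <- function(r, eta) {
  Omega <- matrix(c(1, r,
                    r, 1), nrow = 2, byrow = TRUE)
  det(Omega)^(eta - 1)
}

r_grid <- c(0, 0.5, 0.9)
etas <- c(0.5, 1, 4)

# Rows: values of r; columns: values of eta
kernel <- sapply(etas, function(eta) sapply(r_grid, lkj_kernel, eta = eta))
dimnames(kernel) <- list(paste0("r=", r_grid), paste0("eta=", etas))
round(kernel, 3)
```

With \(\eta = 1\) every value of \(r\) gets the same density. With \(\eta = 4\) the density drops sharply as \(r\) moves away from zero, and with \(\eta = 0.5\) it rises toward the extremes.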

Priors

\[ \begin{aligned} \mu^{[\beta_0]} & \sim N(0, 50) \\ \mu^{[\beta_1]} & \sim N(0, 10) \\ \tau^{[\beta_m]} & \sim \mathrm{Gamma}(2, 0.2), \; m = 0, 1 \\ \boldsymbol{\Omega} & \sim \mathrm{LKJ}(1) \\ \sigma & \sim t^+(4, 0, 5) \end{aligned} \]

m4 <- brm(Reaction10 ~ Days + (Days | Subject),
    data = sleepstudy,
    prior = c( # for intercept
        prior(normal(0, 50), class = "Intercept"),
        # for slope
        prior(normal(0, 10), class = "b"),
        # for tau_beta0 and tau_beta1
        prior(gamma(2, 0.2), class = "sd", group = "Subject"),
        # for correlation
        prior(lkj(1), class = "cor"),
        # for sigma
        prior(student_t(4, 0, 5), class = "sigma")
    ),
    control = list(adapt_delta = .95),
    seed = 2107,
    file = "11_m4"
)
variable                      median    mad      q5    q95
b_Intercept                   25.122  0.731  23.835  26.39
b_Days                         1.036  0.167   0.765   1.34
sd_Subject__Intercept          2.755  0.671   1.806   4.09
sd_Subject__Days               0.666  0.146   0.468   1.00
cor_Subject__Intercept__Days   0.058  0.327  -0.450   0.58
sigma                          2.582  0.157   2.339   2.87

Fit to Individuals

Varying Regression Lines

Interpretations

\(\beta_0\)

Based on the model, at Day 0, the average reaction time across individuals was 251 ms, 90% CI [238 ms, 264 ms], and the between-person SD of reaction time at Day 0 was 28.318 ms, 90% CI [18.061 ms, 40.863 ms].

\(\beta_1\)

The average rate of change in reaction time across individuals was 10 ms per day, 90% CI [7.6 ms, 13 ms], and the between-person SD of the rates of change was 6.901 ms per day, 90% CI [4.681 ms, 10.049 ms].
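
The millisecond values in these interpretations come from rescaling the model output. A minimal sketch, assuming Reaction10 is reaction time divided by 10 and copying the medians and 5%/95% quantiles from the varying-slope output (the `q5` and `q95` columns bound a 90% interval):

```r
# Posterior summaries on the Reaction10 scale (units of 10 ms)
est <- rbind(
  b_Intercept           = c(median = 25.122, q5 = 23.835, q95 = 26.39),
  b_Days                = c(median = 1.036,  q5 = 0.765,  q95 = 1.34),
  sd_Subject__Intercept = c(median = 2.755,  q5 = 1.806,  q95 = 4.09),
  sd_Subject__Days      = c(median = 0.666,  q5 = 0.468,  q95 = 1.00)
)

# Back to milliseconds
est_ms <- 10 * est
round(est_ms, 1)
```

The interval bounds match the ones quoted in the text. The SD point estimates quoted in the text (28.318 ms and 6.901 ms) differ slightly from ten times the medians, presumably because they use a different posterior summary such as the mean.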

Random \(\sigma\)
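
A minimal sketch of what the varying-\(\sigma\) model might look like, inferred from the parameter names \(\mu^{[\log \sigma]}\) and \(\tau^{[\log \sigma]}\) in the comparison table: the within-person SD gets a person subscript and is modeled on the log scale so that it stays positive. Whether \(\log \sigma_j\) is allowed to correlate with \(\beta_{0j}\) and \(\beta_{1j}\) is not stated here, so the sketch leaves that out.

\[ \begin{aligned} \text{Reaction10}_{ij} & \sim N(\mu_{ij}, \sigma_j) \\ \mu_{ij} & = \beta_{0j} + \beta_{1j} \text{Days}_{ij} \\ \log \sigma_j & \sim N(\mu^{[\log \sigma]}, \tau^{[\log \sigma]}) \end{aligned} \]

Under this reading, the estimate \(\mu^{[\log \sigma]} = 0.73\) corresponds to a typical within-person SD of about \(\exp(0.73) \approx 2.08\) on the Reaction10 scale.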

Comparing Models

| Effect | Parameter | Var Int | Var Slp | Var \(\sigma\) |
|---|---|---|---|---|
| fixed | \(\mu^{[\beta_0]}\) | 25.23 [23.16, 27.27] | 25.12 [23.55, 26.66] | 25.11 [23.52, 26.77] |
| | \(\mu^{[\beta_1]}\) | 1.05 [0.89, 1.21] | 1.04 [0.70, 1.41] | 1.04 [0.70, 1.40] |
| | \(\sigma\) | 3.11 [2.81, 3.49] | 2.58 [2.29, 2.92] | |
| | \(\mu^{[\log \sigma]}\) | | | 0.73 [0.44, 1.02] |
| random | \(\tau^{[\beta_0]}\) | 3.98 [2.85, 6.04] | 2.75 [1.68, 4.41] | 3.06 [2.06, 4.87] |
| | \(\tau^{[\beta_1]}\) | | 0.67 [0.44, 1.09] | 0.68 [0.46, 1.07] |
| | \(\tau^{[\log \sigma]}\) | | | 0.49 [0.31, 0.79] |
| | Num.Obs. | 180 | 180 | 180 |
| | ELPD | −470.0 | −447.0 | −418.5 |
| | ELPD s.e. | 14.3 | 22.7 | 13.4 |
| | LOOIC | 940.0 | 894.0 | 837.0 |
| | LOOIC s.e. | 28.6 | 45.5 | 26.8 |
| | WAIC | 939.5 | 891.0 | 827.6 |
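
The LOOIC rows are the ELPD rows on the deviance scale (LOOIC = −2 × ELPD), so a higher ELPD and a lower LOOIC carry the same information. A short sketch using the values copied from the table; the vector names are hypothetical labels for the three models:

```r
# ELPD estimates and standard errors copied from the comparison table
elpd    <- c(var_int = -470.0, var_slp = -447.0, var_sigma = -418.5)
elpd_se <- c(var_int = 14.3,   var_slp = 22.7,   var_sigma = 13.4)

looic    <- -2 * elpd      # deviance scale
looic_se <- 2 * elpd_se    # standard errors scale by the same factor

rbind(looic = looic, looic_se = looic_se)

# Model with the highest expected log predictive density
best <- names(which.max(elpd))
best
```

The recomputed values agree with the table up to rounding (the table's 45.5 was presumably computed from the unrounded standard error). The model with varying \(\sigma\) has the highest ELPD. A formal comparison would also need the standard error of the ELPD difference between models, which the table does not report.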

Many More Topics in MLM

  • Adding higher-level predictors
  • Cross-level interactions
  • Decomposing effects and the ecological fallacy
  • Categorical outcomes (i.e., generalized linear mixed model, GLMM)
  • Complex data structures (e.g., 3-level, crossed)
  • And more . . . Check out MLM classes on campus