Let’s look at experimental data from a common cognitive task, the flanker task. These are real data recently published in Hall, Schreiber, Allen, & Hallquist, Journal of Personality. This is a sample of 107 young adults who completed a version of the Eriksen flanker task.). In each trial, participants saw five horizontal arrows and were told to press a key corresponding to the direction of the center arrow (left or right). Participants were instructed to respond as quickly and accurately as possible. Half of the trials presented arrows on either side of the center arrow that pointed in the same direction (“congruent” trials), whereas the flanking arrows on the other trials pointed in the opposite direction (“incongruent” trials).
Participants completed 160 trials in two conditions: mostly congruent (70% of trials congruent) and mostly incongruent (70% of trials incongruent). Each condition was split into blocks of 40 trials. The direction of the central arrow was counterbalanced across trials. Four blocks of forty trials were presented in ABBA order, where A is a mostly congruent block and B is a mostly incongruent block. Stimuli were displayed for 1000ms each with a 500ms inter-trial interval (ITI).
Congruent: \(\leftarrow \leftarrow \leftarrow \leftarrow \leftarrow\)
Incongruent: \(\leftarrow \leftarrow \rightarrow \leftarrow \leftarrow\)
flanker <- readRDS(file="flanker_jop.rds") %>% arrange(id, trial)
str(flanker)
## tibble [17,120 × 11] (S3: tbl_df/tbl/data.frame)
## $ id : num [1:17120] 1 1 1 1 1 1 1 1 1 1 ...
## $ run : num [1:17120] 1 1 1 1 1 1 1 1 1 1 ...
## $ trial : num [1:17120] 1 2 3 4 5 6 7 8 9 10 ...
## $ run_trial : int [1:17120] 1 2 3 4 5 6 7 8 9 10 ...
## $ block : Factor w/ 2 levels "most_incon","most_con": 2 2 2 2 2 2 2 2 2 2 ...
## $ cond : Factor w/ 2 levels "congruent","incongruent": 1 2 1 1 2 1 1 1 2 2 ...
## $ rt : num [1:17120] 537 599 397 450 376 413 403 377 439 445 ...
## $ correct : num [1:17120] 1 1 1 1 1 1 1 1 1 1 ...
## $ exclude : num [1:17120] 0 0 0 0 0 0 0 0 0 0 ...
## $ MPS_aggression: num [1:17120] 0 0 0 0 0 0 0 0 0 0 ...
## $ MPS_alienation: num [1:17120] 1 1 1 1 1 1 1 1 1 1 ...
The key dependent variables are reaction time (rt
) and accuracy (correct
). We will defer models of accuracy for a later workshop. Let’s just focus on rt for today.
lattice::histogram(~ rt | cond*block, flanker)
Notice how non-normal the RTs are? This is a familiar problem (e.g., see the classic Ratcliff 1993 Psychological Bulletin paper) and we will spend more time on it in a few weeks. For now, we can assume that the strong positive skew will be a problem when we fit a model (GLM or MLM) that assumes approximately normal residuals.
Therefore, for now, we will transform the RTs to approximate normality for analysis. Two transformations are often effective: logarithm and inverse. Note that for the inverse (\(1/x\)) transformation, it is useful to take the negative inverse so that higher values still represent longer RTs. Furthermore, the inverse often leads to very tiny values, so scaling up by a large constant (100 or 1000) can make the variable more practically useful (avoid tiny regression coefficients, for example).
flanker <- flanker %>% mutate(
rt_log10 = log10(rt),
rt_inv = -1000/rt
)
lattice::histogram(~ rt_log10 | cond*block, flanker)
lattice::histogram(~ rt_inv | cond*block, flanker)
Here, the inverse RT looks awfully good – we’ll use this moving forward.
Thinking back to our multilevel structure, we have 160 trials nested within 107 people. Thus, our dataset should be 17,120 rows in long form. (Note that if there were missing or dropped trials, this number could be lower). For virtually all MLMs (in R), we want our data in person-period form (more generically, ‘long form’) where one variable encodes person (level 2 unit) and another encodes the period or repetition for each observation (level 1 unit). Other variables would encode predictors or outcome variables.
head(flanker)
id | run | trial | run_trial | block | cond | rt | correct | exclude | MPS_aggression | MPS_alienation | rt_log10 | rt_inv |
---|---|---|---|---|---|---|---|---|---|---|---|---|
1 | 1 | 1 | 1 | most_con | congruent | 537 | 1 | 0 | 0 | 1 | 2.73 | -1.86 |
1 | 1 | 2 | 2 | most_con | incongruent | 599 | 1 | 0 | 0 | 1 | 2.78 | -1.67 |
1 | 1 | 3 | 3 | most_con | congruent | 397 | 1 | 0 | 0 | 1 | 2.60 | -2.52 |
1 | 1 | 4 | 4 | most_con | congruent | 450 | 1 | 0 | 0 | 1 | 2.65 | -2.22 |
1 | 1 | 5 | 5 | most_con | incongruent | 376 | 1 | 0 | 0 | 1 | 2.58 | -2.66 |
1 | 1 | 6 | 6 | most_con | congruent | 413 | 1 | 0 | 0 | 1 | 2.62 | -2.42 |
Let’s think through what variables exist at the within-person (L1) and between-person (L2) levels.
Level 1 – within-person
Level 2 – between-person
Unit variables
Recall from Alison’s lecture that when we violate the assumption of independent residuals, we probably need to move away from the standard general linear model (GLM). The level of non-independence of observations within a cluster is proportionate to the intraclass correlation:
\[ ICC = \frac{s^2_b}{s^2_b + s^2_w} \]
What happens if we plow through these considerations and try to address our questions above?
Our basic model of the task would be:
\[ \textrm{RT}_{ij} = \beta_{0} + \beta_{1} \textrm{block}_{ij} + \beta_{2} \textrm{cond}_{ij} + \beta_{3} \textrm{block}_{ij} \cdot \textrm{cond}_{ij} + \varepsilon_{ij} \\ \boldsymbol{\varepsilon} \sim \mathcal{N}(0, \sigma^2) \] where we have trial \(i\) nested within subject \(j\).
Note: none of the coefficients varies by subject or trial, only the residuals do!
lm_fit <- lm(rt_inv ~ block*cond, flanker)
summary(lm_fit)
##
## Call:
## lm(formula = rt_inv ~ block * cond, data = flanker)
##
## Residuals:
## Min 1Q Median 3Q Max
## -1.4782 -0.3145 -0.0018 0.3042 1.6324
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -2.70410 0.00968 -279.38 <2e-16 ***
## blockmost_con 0.02460 0.01159 2.12 0.0338 *
## condincongruent 0.10846 0.01157 9.37 <2e-16 ***
## blockmost_con:condincongruent 0.04920 0.01640 3.00 0.0027 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.487 on 16797 degrees of freedom
## (319 observations deleted due to missingness)
## Multiple R-squared: 0.0159, Adjusted R-squared: 0.0158
## F-statistic: 90.7 on 3 and 16797 DF, p-value: <2e-16
car::Anova(lm_fit, type=3)
Sum Sq | Df | F value | Pr(>F) | |
---|---|---|---|---|
(Intercept) | 18521.70 | 1 | 78050.42 | 0.000 |
block | 1.07 | 1 | 4.51 | 0.034 |
cond | 20.86 | 1 | 87.89 | 0.000 |
block:cond | 2.14 | 1 | 9.00 | 0.003 |
Residuals | 3986.00 | 16797 | NA | NA |
summary(emblock <- emmeans(lm_fit, ~block))
## NOTE: Results may be misleading due to involvement in interactions
block | emmean | SE | df | lower.CL | upper.CL |
---|---|---|---|---|---|
most_incon | -2.65 | 0.006 | 16797 | -2.66 | -2.64 |
most_con | -2.60 | 0.006 | 16797 | -2.61 | -2.59 |
pairs(emblock)
## contrast estimate SE df t.ratio p.value
## most_incon - most_con -0.0492 0.0082 16797 -6.000 <.0001
##
## Results are averaged over the levels of: cond
summary(emcond <- emmeans(lm_fit, ~cond))
## NOTE: Results may be misleading due to involvement in interactions
cond | emmean | SE | df | lower.CL | upper.CL |
---|---|---|---|---|---|
congruent | -2.69 | 0.006 | 16797 | -2.70 | -2.68 |
incongruent | -2.56 | 0.006 | 16797 | -2.57 | -2.55 |
pairs(emcond)
## contrast estimate SE df t.ratio p.value
## congruent - incongruent -0.133 0.0082 16797 -16.230 <.0001
##
## Results are averaged over the levels of: block
Force back onto original RT using custom function
rg <- update(ref_grid(lm_fit), tran=list(linkinv=function(x) { -1000/x }, mu.eta=function(x) { -1000/x }), predict.type="response")
summary(res_trans <- emmeans(rg, ~cond|block))
cond | block | response | SE | df | lower.CL | upper.CL |
---|---|---|---|---|---|---|
congruent | most_incon | 370 | 3.58 | 16797 | 367 | 372 |
incongruent | most_incon | 385 | 2.44 | 16797 | 383 | 387 |
congruent | most_con | 373 | 2.38 | 16797 | 371 | 375 |
incongruent | most_con | 397 | 3.86 | 16797 | 394 | 400 |
summary(emmeans(rg, ~cond))
## NOTE: Results may be misleading due to involvement in interactions
cond | response | SE | df | lower.CL | upper.CL |
---|---|---|---|---|---|
congruent | 371 | 2.15 | 16797 | 370 | 373 |
incongruent | 391 | 2.27 | 16797 | 389 | 393 |
summary(emmeans(rg, ~block))
## NOTE: Results may be misleading due to involvement in interactions
block | response | SE | df | lower.CL | upper.CL |
---|---|---|---|---|---|
most_incon | 377 | 2.18 | 16797 | 376 | 379 |
most_con | 385 | 2.23 | 16797 | 383 | 386 |
emmip(rg, ~cond|block)
pairs(emmeans(lm_fit, ~block|cond))
## cond = congruent:
## contrast estimate SE df t.ratio p.value
## most_incon - most_con -0.0246 0.0116 16797 -2.120 0.0338
##
## cond = incongruent:
## contrast estimate SE df t.ratio p.value
## most_incon - most_con -0.0738 0.0116 16797 -6.360 <.0001
ggplot(lm_fit, aes(sample = rstandard(lm_fit))) +
geom_qq() + stat_qq_line() +
ggtitle("Normal Q-Q plot on standardized residuals")
At a high level, we might conclude that there is some evidence that:
df <- flanker %>% add_residuals(model = lm_fit)
by_subj <- ggplot(df, aes(x=factor(id), y=resid)) +
stat_summary(fun.data = mean_cl_boot, geom = "pointrange") +
geom_hline(yintercept=0, color="blue", size=2) +
ggtitle("95% CIs on residuals by subject") +
labs(x=NULL) + ylim(-1.5, 1.6) +
theme_cowplot() +
theme(axis.text.x = element_blank())
overall <- ggplot(df, aes(x=1, y=resid)) + geom_boxplot() +
# stat_summary(fun.data = mean_cl_boot, geom = "pointrange") +
geom_hline(yintercept=0, color="blue", size=2) +
ggtitle("Overall") +
labs(x=NULL) + ylim(-1.5, 1.6) +
theme_cowplot() +
theme(axis.text.x = element_blank())
cowplot::plot_grid(by_subj, overall, nrow=1,rel_widths = c(0.8, 0.2))
## Warning: Removed 324 rows containing non-finite values (stat_summary).
## Warning: Removed 324 rows containing non-finite values (stat_boxplot).
By eye, it looks a bit suspicious, right?! Let’s re-use the GLM to look at problems with the GLM. Here’s a model that treats id as a factor and looks at whether there is significant variation across ids in the residuals.
mm <- lm(resid ~ factor(id), df)
summary(mm)
##
## Call:
## lm(formula = resid ~ factor(id), data = df)
##
## Residuals:
## Min 1Q Median 3Q Max
## -1.799 -0.228 0.008 0.229 1.589
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 0.24529 0.03029 8.10 6.0e-16 ***
## factor(id)2 -0.41601 0.04318 -9.63 < 2e-16 ***
## factor(id)3 -0.37202 0.04284 -8.68 < 2e-16 ***
## factor(id)4 -0.83873 0.04355 -19.26 < 2e-16 ***
## factor(id)5 -0.38191 0.04284 -8.92 < 2e-16 ***
## factor(id)6 -0.28932 0.04284 -6.75 1.5e-11 ***
## factor(id)8 -0.12694 0.04284 -2.96 0.00305 **
## factor(id)9 -0.73489 0.04340 -16.93 < 2e-16 ***
## factor(id)10 -0.52365 0.04311 -12.15 < 2e-16 ***
## factor(id)11 0.00792 0.04284 0.18 0.85339
## factor(id)12 -0.83344 0.04434 -18.80 < 2e-16 ***
## factor(id)13 -0.47480 0.04284 -11.08 < 2e-16 ***
## factor(id)14 0.44608 0.04290 10.40 < 2e-16 ***
## factor(id)15 0.49738 0.04297 11.57 < 2e-16 ***
## factor(id)16 -0.40843 0.04332 -9.43 < 2e-16 ***
## factor(id)17 -0.48984 0.04284 -11.43 < 2e-16 ***
## factor(id)18 -0.34916 0.04311 -8.10 5.9e-16 ***
## factor(id)19 -0.24169 0.04290 -5.63 1.8e-08 ***
## factor(id)20 -0.30117 0.04284 -7.03 2.1e-12 ***
## factor(id)21 -0.10775 0.04311 -2.50 0.01245 *
## factor(id)22 -0.67788 0.04290 -15.80 < 2e-16 ***
## factor(id)23 -0.21032 0.04284 -4.91 9.2e-07 ***
## factor(id)24 -0.67003 0.04370 -15.33 < 2e-16 ***
## factor(id)25 -0.45270 0.04284 -10.57 < 2e-16 ***
## factor(id)26 -0.44352 0.04304 -10.30 < 2e-16 ***
## factor(id)27 -0.09973 0.04290 -2.32 0.02012 *
## factor(id)28 -0.48700 0.04297 -11.33 < 2e-16 ***
## factor(id)29 -0.45874 0.04297 -10.68 < 2e-16 ***
## factor(id)30 0.14625 0.04284 3.41 0.00064 ***
## factor(id)31 -0.80450 0.04332 -18.57 < 2e-16 ***
## factor(id)32 -0.32334 0.04297 -7.52 5.6e-14 ***
## factor(id)33 -0.32109 0.04304 -7.46 9.1e-14 ***
## factor(id)34 -0.38455 0.04290 -8.96 < 2e-16 ***
## factor(id)35 -0.77034 0.04325 -17.81 < 2e-16 ***
## factor(id)36 -0.45882 0.04564 -10.05 < 2e-16 ***
## factor(id)37 -0.24248 0.04290 -5.65 1.6e-08 ***
## factor(id)38 0.39919 0.04284 9.32 < 2e-16 ***
## factor(id)40 -0.42446 0.04284 -9.91 < 2e-16 ***
## factor(id)41 -0.14849 0.04284 -3.47 0.00053 ***
## factor(id)42 -0.31970 0.04311 -7.42 1.3e-13 ***
## factor(id)43 -0.28215 0.04304 -6.56 5.7e-11 ***
## factor(id)45 0.10422 0.04332 2.41 0.01615 *
## factor(id)46 0.13858 0.04297 3.22 0.00126 **
## factor(id)48 -0.64253 0.04297 -14.95 < 2e-16 ***
## factor(id)49 -0.06866 0.04284 -1.60 0.10898
## factor(id)50 -0.50522 0.04311 -11.72 < 2e-16 ***
## factor(id)51 0.07398 0.04304 1.72 0.08566 .
## factor(id)52 0.14858 0.04284 3.47 0.00052 ***
## factor(id)53 -0.21420 0.04284 -5.00 5.8e-07 ***
## factor(id)54 -0.33281 0.04284 -7.77 8.4e-15 ***
## factor(id)56 -0.47160 0.04311 -10.94 < 2e-16 ***
## factor(id)57 -0.33916 0.04311 -7.87 3.8e-15 ***
## factor(id)58 -0.06277 0.04290 -1.46 0.14348
## factor(id)59 -0.60609 0.04318 -14.04 < 2e-16 ***
## factor(id)60 -0.52008 0.04355 -11.94 < 2e-16 ***
## factor(id)61 -0.41963 0.04290 -9.78 < 2e-16 ***
## factor(id)62 0.26537 0.04478 5.93 3.2e-09 ***
## factor(id)63 -0.64276 0.04284 -15.00 < 2e-16 ***
## factor(id)64 -0.36744 0.04318 -8.51 < 2e-16 ***
## factor(id)65 0.33972 0.04304 7.89 3.1e-15 ***
## factor(id)66 0.17324 0.04284 4.04 5.3e-05 ***
## factor(id)67 -0.60154 0.04304 -13.98 < 2e-16 ***
## factor(id)68 -0.31396 0.04290 -7.32 2.6e-13 ***
## factor(id)69 -0.03910 0.04284 -0.91 0.36136
## factor(id)70 -0.58466 0.04325 -13.52 < 2e-16 ***
## factor(id)71 -0.63960 0.04284 -14.93 < 2e-16 ***
## factor(id)72 -0.24642 0.04297 -5.73 1.0e-08 ***
## factor(id)73 -0.21059 0.04325 -4.87 1.1e-06 ***
## factor(id)74 -0.18266 0.04284 -4.26 2.0e-05 ***
## factor(id)75 0.16779 0.04284 3.92 9.0e-05 ***
## factor(id)76 -0.36953 0.04284 -8.63 < 2e-16 ***
## factor(id)77 -0.45617 0.04284 -10.65 < 2e-16 ***
## factor(id)78 0.19719 0.04284 4.60 4.2e-06 ***
## factor(id)79 -0.59938 0.04304 -13.93 < 2e-16 ***
## factor(id)81 -0.00174 0.04284 -0.04 0.96751
## factor(id)82 0.06080 0.04297 1.41 0.15711
## factor(id)84 -0.33296 0.04284 -7.77 8.1e-15 ***
## factor(id)85 -0.48373 0.04284 -11.29 < 2e-16 ***
## factor(id)86 -0.21452 0.04284 -5.01 5.6e-07 ***
## factor(id)87 -0.30948 0.04347 -7.12 1.1e-12 ***
## factor(id)88 -0.35559 0.04325 -8.22 < 2e-16 ***
## factor(id)89 0.18750 0.04297 4.36 1.3e-05 ***
## factor(id)90 0.41492 0.04304 9.64 < 2e-16 ***
## factor(id)91 -0.26983 0.04284 -6.30 3.1e-10 ***
## factor(id)92 -0.12035 0.04417 -2.72 0.00645 **
## factor(id)93 -0.12757 0.04284 -2.98 0.00290 **
## factor(id)94 -0.13616 0.04284 -3.18 0.00148 **
## factor(id)95 -0.36663 0.04284 -8.56 < 2e-16 ***
## factor(id)96 -0.31796 0.04290 -7.41 1.3e-13 ***
## factor(id)97 -0.57405 0.04290 -13.38 < 2e-16 ***
## factor(id)98 -0.73954 0.04297 -17.21 < 2e-16 ***
## factor(id)99 -0.39441 0.04355 -9.06 < 2e-16 ***
## factor(id)100 0.06134 0.04284 1.43 0.15221
## factor(id)101 0.03500 0.04284 0.82 0.41385
## factor(id)102 -0.24965 0.04290 -5.82 6.0e-09 ***
## factor(id)103 -0.15830 0.04284 -3.70 0.00022 ***
## factor(id)104 -0.56794 0.04347 -13.06 < 2e-16 ***
## factor(id)105 -0.26834 0.04290 -6.25 4.1e-10 ***
## factor(id)106 -0.08392 0.04284 -1.96 0.05013 .
## factor(id)107 -0.05651 0.04284 -1.32 0.18709
## factor(id)108 -0.23763 0.04290 -5.54 3.1e-08 ***
## factor(id)109 0.04264 0.04297 0.99 0.32106
## factor(id)110 0.04741 0.04311 1.10 0.27147
## factor(id)111 0.52864 0.04284 12.34 < 2e-16 ***
## factor(id)112 -0.19344 0.04290 -4.51 6.6e-06 ***
## factor(id)113 -0.10159 0.04290 -2.37 0.01790 *
## factor(id)114 0.00377 0.04284 0.09 0.92986
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.383 on 16694 degrees of freedom
## (319 observations deleted due to missingness)
## Multiple R-squared: 0.385, Adjusted R-squared: 0.381
## F-statistic: 98.7 on 106 and 16694 DF, p-value: <2e-16
car::Anova(mm, type=3)
Sum Sq | Df | F value | Pr(>F) | |
---|---|---|---|---|
(Intercept) | 9.63 | 1 | 65.6 | 0 |
factor(id) | 1535.31 | 106 | 98.7 | 0 |
Residuals | 2450.69 | 16694 | NA | NA |
So, yeah, an F-test of almost 100 is probably worth attention.
More formally, we know that the ICC can give us a useful guide on whether clustering within units is substantial and, therefore, problematic for the GLM. Here, we calculate ICCs from the residuals of the GLM (i.e., removing variance due to condition and block).
rfx_model <- lmer(resid ~ 1 + (1|id), df)
performance::icc(rfx_model)
## # Intraclass Correlation Coefficient
##
## Adjusted ICC: 0.385
## Conditional ICC: 0.385
Bonus for reflection: is it a coincidence that this ICC is identical to the \(R^2\) from the GLM of residuals with subject as a factor? No, it’s not – take some time to think on why these are equivalent.
Conceptually, when we have repeated measures data, we are often interested in the possibility that certain relationships vary from one person to the next. Indeed, these between-person differences might be related to other individual difference variables such as personality traits.
What if we wanted to take this approach here and get a sense of the how experimental effects differ by subject?
Here are boxplots for the first 25 subjects
ggplot(flanker %>% filter(id <= 25), aes(x=cond, y=rt, color=block)) +
geom_boxplot() +
facet_wrap(~id)
## Warning: Removed 79 rows containing non-finite values (stat_boxplot).
Definitely some heterogeneity there!
What if we truly treated data for each subject separately? We could split up the data by subject and run a model with each person as a separate dataset.
#a bit beyond the scope, but at a high level, we use dplyr and broom to run the same RT
#Documentation: https://cran.r-project.org/web/packages/broom/vignettes/broom_and_dplyr.html
res <- flanker %>%
group_by(id) %>%
nest() %>%
mutate(test=map(data, ~ lm(rt_inv ~ block*cond, .x, na.action = na.exclude)),
stats=map(test, tidy)) %>%
unnest(stats) %>% dplyr::select(-data, -test) %>%
ungroup() %>% group_by(term) %>%
mutate(estimate_termz=as.vector(scale(estimate))) %>%
ungroup()
ggplot(res, aes(x=term, y=id, fill=estimate_termz)) +
geom_tile() + scale_fill_viridis_c() + ggtitle("Coefficients") +
theme(axis.text.x=element_text(angle=90))
overall <- data.frame(term=names(coef(lm_fit)), estimate=coef(lm_fit))
ggplot(res, aes(x=1, y=estimate)) + geom_boxplot() + facet_wrap(~term, scales="free") +
geom_point(data=overall, size=6, color="blue", shape=18) +
labs(title="Heterogeneity in within-person model coefficients",
subtitle="Blue diamonds are coefs from overall lm()", x=NULL) +
theme(axis.text.x=element_blank(), axis.ticks.x=element_blank())
Let’s move into the formal MLM framework to examine these data. Here is the basic model for a random intercept model, where we allow the intercept to vary by cluster, but keep the condition effects (cond
and block
) homogeneous (i.e., fixed only).
\[ \textrm{RT}_{ij} = \beta_{0j} + \beta_{1} \textrm{block}_{ij} + \beta_{2} \textrm{cond}_{ij} + \beta_{3} \textrm{block}_{ij} \cdot \textrm{cond}_{ij} + \varepsilon_{ij} \\ \beta_{0j} = \gamma_{00} + \mu_{0j} \\ \boldsymbol{\mu_{0.}} \sim \mathcal{N}(0, \tau_{00}) \\ \boldsymbol{\varepsilon_{..}} \sim \mathcal{N}(0, \sigma^2) \]
(Also called unconditional means model)
MLM usually begins by running an unconditional model that allows for cluster variation in the mean/expectation/intercept. The term ‘unconditional’ refers to the fact that the outcome (rt_inv
) is not conditioned on (regressed on) any predictor. In this scenario, the intercept of the model is nearly the same as the overall average in the sample (and the reason for small divergence would be a distraction at the moment).
Why is this model useful? It gives us a starting point in terms of the amount of variation in the outcome that is attributable to between-cluster (here, between persons) versus within-cluster variation.
m1 <- lmer(rt_inv ~ 1 + (1|id), flanker)
summary(m1)
## Linear mixed model fit by REML. t-tests use Satterthwaite's method ['lmerModLmerTest']
## Formula: rt_inv ~ 1 + (1 | id)
## Data: flanker
##
## REML criterion at convergence: 16381
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -4.741 -0.598 0.012 0.601 4.363
##
## Random effects:
## Groups Name Variance Std.Dev.
## id (Intercept) 0.0918 0.303
## Residual 0.1507 0.388
## Number of obs: 16801, groups: id, 107
##
## Fixed effects:
## Estimate Std. Error df t value Pr(>|t|)
## (Intercept) -2.6323 0.0295 105.9839 -89.4 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
mean(flanker$rt_inv, na.rm=TRUE)
## [1] -2.63
#store variance components
v1 <- as.data.frame(get_variance(m1)) %>% mutate(model="unconditional random intercept") %>%
dplyr::select(model, var.fixed, var.intercept, var.residual)
As a quick overview of lmer
syntax, all terms on the right-hand side that are not in parentheses are fixed effects in the model – that is, the estimates reflect the sample average effect. Here, ~ 1
indicates that an overall intercept (\(\gamma_{00}\)) should be included in the model. Note that this the default, but it’s useful to include it clarity.
All terms included inside parentheses are entered as random effects, introducing a variance component for each term. Here, 1
within the parentheses requests that the intercept varies across clusters. The |
operator denotes the separation between variance components and clustering variables. Here, id
is the clustering variable, telling lmer
that all terms on the left of the ‘pipe’ vary across units of the clustering variable on the right.
To map this output onto our model:
Intercept
under fixed effects)Intercept
variation by id
under random effects)And remember our formula for ICC – proportion of between-cluster variation relative to overall variation. With this in mind, we have the values to calculate it by hand:
vc <- VarCorr(m1)
print(vc,comp=c("Variance"))
## Groups Name Variance
## id (Intercept) 0.0918
## Residual 0.1507
dd <- as.data.frame(vc)
ICC_by_hand <- dd$vcov[dd$grp=="id"] / (dd$vcov[dd$grp=="id"] + dd$vcov[dd$grp=="Residual"])
print(ICC_by_hand)
## [1] 0.379
#helper function
performance::icc(m1)
## # Intraclass Correlation Coefficient
##
## Adjusted ICC: 0.379
## Conditional ICC: 0.379
This principle is intuitive, but often overlooked. When we fit a standard GLM, we hope that as we add good predictors into the model, \(SS_\textrm{Error}\) goes down, while the \(SS_\textrm{Model}\) goes up.
\[ \begin{align} SS_\textrm{Total} &= \sum_{i=1}^{N}({y_i - \bar{y}})^2 \\ SS_\textrm{Model} &= \sum_{i=1}^{N}({\hat{y}_i - \bar{y}})^2 \\ SS_\textrm{Error} &= \sum_{i=1}^{N}({y_i - \hat{y}})^2 = SS_\textrm{Total} - SS_\textrm{Model} \end{align} \]
By extension, when we move into the multilevel framework, each variance component (not just \(\sigma^2\)) represents variation that could potentially be explained by one or more predictors.
In the MLM, however, we can think of each variance component as a ‘compartment’ of overall variance, where the compartments can vary by level (e.g., within- versus between-persons). For example, if we introduce a level 1 predictor, it can potentially explain variation – and thus reduce the variance components – for both the L1 residual variability (\(\sigma^2\)) and level 2 variability in intercepts (\(\tau_{00}\)). This relates to the discussion of parcelling within versus between variation in L1 predictors and the concept of a ‘total effect’ (that combines both L1 and L2 variation).
Let’s take a look at what happens to our model when we add in the design factors of cond
and block
. This is a basic random intercepts model where we have fixed effects of cond
and block
and intercepts that vary across subjects (random). A useful way to rewrite the MLM equation above is:
\[ \textrm{RT}_{ij} = \textrm{Intercept}_j + \beta_{1} \textrm{block}_{ij} + \beta_{2} \textrm{cond}_{ij} + \beta_{3} \textrm{block}_{ij} \cdot \textrm{cond}_{ij} + \varepsilon_{ij} \\ \textrm{Intercept}_{j} = \gamma_{00} + \mu_{0j} \\ {\textbf{Intercepts}} \sim \mathcal{N}(\gamma_{00}, \tau_{00}) \]
This notation emphasizes that the intercepts are a set of values, one per subject, and are normally distributed with mean \(\gamma_{00}\) and variance \(\tau_{00}\).
m2 <- lmer(rt_inv ~ block*cond + (1|id), flanker)
summary(m2)
## Linear mixed model fit by REML. t-tests use Satterthwaite's method ['lmerModLmerTest']
## Formula: rt_inv ~ block * cond + (1 | id)
## Data: flanker
##
## REML criterion at convergence: 15963
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -4.678 -0.593 0.019 0.599 4.138
##
## Random effects:
## Groups Name Variance Std.Dev.
## id (Intercept) 0.0919 0.303
## Residual 0.1468 0.383
## Number of obs: 16801, groups: id, 107
##
## Fixed effects:
## Estimate Std. Error df t value Pr(>|t|)
## (Intercept) -2.71e+00 3.03e-02 1.18e+02 -89.35 < 2e-16 ***
## blockmost_con 2.34e-02 9.11e-03 1.67e+04 2.57 0.01 *
## condincongruent 1.09e-01 9.10e-03 1.67e+04 11.96 < 2e-16 ***
## blockmost_con:condincongruent 5.05e-02 1.29e-02 1.67e+04 3.91 9.2e-05 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Correlation of Fixed Effects:
## (Intr) blckm_ cndncn
## blockmst_cn -0.210
## condncngrnt -0.210 0.699
## blckmst_cn: 0.148 -0.706 -0.705
v2 <- as.data.frame(get_variance(m2)) %>% mutate(model="random intercept cond+block") %>%
dplyr::select(model, var.fixed, var.intercept, var.residual)
Let’s look at the reduction in variance components with the addition of fixed effects for cond and block
v1_long <- v1 %>% pivot_longer(-model) %>% rename(value.unconditional = value) %>% dplyr::select(-model)
v2_long <- v2 %>% pivot_longer(-model) %>% rename(value.addfixed = value) %>% dplyr::select(-model)
v1_long %>% left_join(v2_long) %>% mutate(re.change.pct=(1 - value.addfixed/value.unconditional)*100) %>%
kable(digits=2)
## Joining, by = "name"
name | value.unconditional | value.addfixed | re.change.pct |
---|---|---|---|
var.fixed | 0.00 | 0.00 | -Inf |
var.intercept | 0.09 | 0.09 | -0.11 |
var.residual | 0.15 | 0.15 | 2.59 |
Here, we see that introducing cond*block
reduced L1 residual variability by about 2.6% but had no effect on variability in the intercepts.
For discussion: I mentioned above that adding L1 predictors – which is what cond and block are – has the potential to explain variation at both L1 and L2. Why didn’t we see any reduction in the intercept variability here?
We can obtain the estimated random effects for each cluster in an lmer
model using ranef
.
res <- ranef(m2)$id
lattice::histogram(res$`(Intercept)`, xlab="Estimated subject intercepts (mu_0j)")
We can use REsim
from merTools
to get estimated random effects for each cluster (here, id
), including uncertainties in these estimates. We can also plot these using plotREsim
:
#predictInterval(m2) # for various model predictions, possibly with new data
simranef <- REsim(m2) # mean, median and sd of the random effect estimates
simranef %>% head()
groupFctr | groupID | term | mean | median | sd |
---|---|---|---|---|---|
id | 1 | (Intercept) | 0.253 | 0.257 | 0.043 |
id | 2 | (Intercept) | -0.164 | -0.163 | 0.042 |
id | 3 | (Intercept) | -0.119 | -0.120 | 0.039 |
id | 4 | (Intercept) | -0.582 | -0.586 | 0.040 |
id | 5 | (Intercept) | -0.130 | -0.131 | 0.042 |
id | 6 | (Intercept) | -0.038 | -0.040 | 0.040 |
plotREsim(simranef) # plot the interval estimates
This is a handy plotting function because it shows by eye the tendency for cluster-level deviations to differ significantly from zero. Recall that if all cluster deviations (here, \(\mu_{0j}\)) are approximately zero, we don’t really need to model that variance component (random effect).
Furthermore, as discussed previously, if we add predictors (as fixed effects) into the model and they explain considerable between-subjects (or more generally, between-cluster) variability, the subject-specific intercepts will increasingly approach zero and the variance component \(\tau_{00}\) will also diminish toward zero.
What if we allowed for the possibility that the effect of incongruent vs. congruent trials varied by subject? That is, we think that there is some overall effect of congruency, but also that this effect could be stronger or weaker in some people.
How does this modify our model above? To keep things simple for now (we’ll come back to this), I’m falling back to an additive model that does not include the cond*block
interaction.
\[ \begin{align} \textrm{RT}_{ij} &= \beta_{0j} + \beta_{1} \textrm{block}_{ij} + \beta_{2j} \textrm{cond}_{ij} + \varepsilon_{ij} \\ \beta_{0j} &= \gamma_{00} + \mu_{0j} \\ \beta_{1j} &= \gamma_{10} \\ \beta_{2j} &= \gamma_{20} + \mu_{2j} \\ \boldsymbol{\varepsilon_{..}} &\sim \mathcal{N}(0, \sigma^2) \end{align} \]
\[ \begin{bmatrix} \mu_{0j} \\ \mu_{2j} \end{bmatrix} \sim \mathcal{N}(\begin{bmatrix} 0 \\ 0\end{bmatrix}, \begin{bmatrix} \tau_{00} & \tau_{\textrm{int,cond}} \\ \tau_{\textrm{int,cond}} & \tau_{11}\end{bmatrix}) \] Here, \(\tau_{\textrm{int,cond}}\) represents the covariance between the a subject’s estimated intercept and their estimated slope for the cond-RT association.
m3 <- lmer(rt_inv ~ block + cond + (1 + cond | id), flanker)
summary(m3)
## Linear mixed model fit by REML. t-tests use Satterthwaite's method ['lmerModLmerTest']
## Formula: rt_inv ~ block + cond + (1 + cond | id)
## Data: flanker
##
## REML criterion at convergence: 15940
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -4.688 -0.596 0.021 0.595 4.188
##
## Random effects:
## Groups Name Variance Std.Dev. Corr
## id (Intercept) 0.09552 0.3091
## condincongruent 0.00365 0.0604 -0.24
## Residual 0.14604 0.3822
## Number of obs: 16801, groups: id, 107
##
## Fixed effects:
## Estimate Std. Error df t value Pr(>|t|)
## (Intercept) -2.72e+00 3.05e-02 1.11e+02 -89.30 < 2e-16 ***
## blockmost_con 4.86e-02 6.43e-03 1.66e+04 7.55 4.6e-14 ***
## condincongruent 1.34e-01 8.69e-03 1.26e+02 15.44 < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Correlation of Fixed Effects:
## (Intr) blckm_
## blockmst_cn -0.147
## condncngrnt -0.267 0.296
print(unlist(get_variance(m3)))
## var.fixed var.random var.residual
## 0.00379 0.09286 0.14604
## var.distribution var.dispersion var.intercept.id
## 0.14604 0.00000 0.09552
## var.slope.id.condincongruent cor.slope_intercept.id
## 0.00365 -0.23980
simranef3 <- REsim(m3) # mean, median and sd of the random effect estimates
plotREsim(simranef3) + facet_wrap(~term, scales="free", ncol=1) # plot the interval estimates
We can see by eye that a handful of subjects have congruency slopes that differ reliably from zero, though this is not particularly striking relative to the random intercept.
plot_model(m2, type = "pred", terms = c("cond", "block"), pred.type = "re") +
labs(x = "Condition", y = "RT inverse", title = "Predicted effects in model that excludes condition random slope")
#add back interaction to make it comparable to random intercept model (m2)
m4 <- lmer(rt_inv ~ block + cond + (1 + cond | id), flanker)
plot_model(m4, type = "pred", terms = c("cond", "block"), pred.type = "re") +
labs(x = "Condition", y = "RT inverse", title = "Predicted effects in model that includes condition random slope")
plot_model(m4, type = "re", terms = c("cond", "block")) +
#facet_wrap(~facet, scales="free_y") +
labs(y = "Model effect", x = "ID", title = "Model that includes random slopes")
rr <- ranef(m3)$id %>% pivot_longer(cols=everything())
ggplot(rr, aes(x=value)) + geom_histogram(bins=15) +
facet_wrap(~name, scales="free")
Not too bad. One of our assumptions in MLM is that the random effects are distributed as (approximately) multivariate normal. Here, I’ve plotted the univariate distributions. We would also perhaps want to check that the bivariate distribution is approximately normal.
If you really want to get fancy, you could formally test for multivariate normality using the MVN
package, though this is overkill for most applications.
ggplot(ranef(m3)$id, aes(x=`(Intercept)`, y=condincongruent)) + geom_density_2d()