1 Intuition from application: Flanker Task

Let’s look at experimental data from a common cognitive task, the flanker task. These are real data recently published in Hall, Schreiber, Allen, & Hallquist, Journal of Personality, from a sample of 107 young adults who completed a version of the Eriksen flanker task. On each trial, participants saw five horizontal arrows and were told to press a key corresponding to the direction of the center arrow (left or right), responding as quickly and accurately as possible. On half of the trials, the arrows on either side of the center arrow pointed in the same direction (“congruent” trials), whereas on the other trials the flanking arrows pointed in the opposite direction (“incongruent” trials).

Participants completed 160 trials across two block types: mostly congruent (70% of trials congruent) and mostly incongruent (70% of trials incongruent). Four blocks of 40 trials were presented in ABBA order, where A is a mostly congruent block and B is a mostly incongruent block. The direction of the central arrow was counterbalanced across trials. Stimuli were displayed for 1000 ms each with a 500 ms inter-trial interval (ITI).

Congruent: \(\leftarrow \leftarrow \leftarrow \leftarrow \leftarrow\)

Incongruent: \(\leftarrow \leftarrow \rightarrow \leftarrow \leftarrow\)

1.1 Data structure
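
This section leans on several packages. The original setup chunk isn’t shown, so the list below is an educated guess inferred from the function calls that follow:

#packages inferred from the code in this section (a best guess, not the original setup chunk)
library(tidyverse)   # dplyr, tidyr, purrr, ggplot2, tibble
library(lmerTest)    # lmer() with Satterthwaite tests (loads lme4)
library(emmeans)     # emmeans(), ref_grid(), emmip()
library(modelr)      # add_residuals()
library(broom)       # tidy()
library(cowplot)     # theme_cowplot(), plot_grid()
library(merTools)    # REsim(), plotREsim()
library(sjPlot)      # plot_model()
library(insight)     # get_variance()
library(knitr)       # kable()
#car, performance, and lattice are called with :: below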

flanker <- readRDS(file="flanker_jop.rds") %>% arrange(id, trial)
str(flanker)
## tibble [17,120 × 11] (S3: tbl_df/tbl/data.frame)
##  $ id            : num [1:17120] 1 1 1 1 1 1 1 1 1 1 ...
##  $ run           : num [1:17120] 1 1 1 1 1 1 1 1 1 1 ...
##  $ trial         : num [1:17120] 1 2 3 4 5 6 7 8 9 10 ...
##  $ run_trial     : int [1:17120] 1 2 3 4 5 6 7 8 9 10 ...
##  $ block         : Factor w/ 2 levels "most_incon","most_con": 2 2 2 2 2 2 2 2 2 2 ...
##  $ cond          : Factor w/ 2 levels "congruent","incongruent": 1 2 1 1 2 1 1 1 2 2 ...
##  $ rt            : num [1:17120] 537 599 397 450 376 413 403 377 439 445 ...
##  $ correct       : num [1:17120] 1 1 1 1 1 1 1 1 1 1 ...
##  $ exclude       : num [1:17120] 0 0 0 0 0 0 0 0 0 0 ...
##  $ MPS_aggression: num [1:17120] 0 0 0 0 0 0 0 0 0 0 ...
##  $ MPS_alienation: num [1:17120] 1 1 1 1 1 1 1 1 1 1 ...

The key dependent variables are reaction time (rt) and accuracy (correct). We will defer models of accuracy to a later workshop and focus on rt today.

lattice::histogram(~ rt | cond*block, flanker)

Notice how non-normal the RTs are? This is a familiar problem (e.g., see the classic Ratcliff 1993 Psychological Bulletin paper), and we will spend more time on it in a few weeks. For now, the key point is that the strong positive skew will be a problem when we fit a model (GLM or MLM) that assumes approximately normal residuals.

Therefore, for now, we will transform the RTs to approximate normality for analysis. Two transformations are often effective: the logarithm and the inverse. Note that for the inverse (\(1/x\)) transformation, it is useful to take the negative inverse so that higher values still represent longer RTs. Furthermore, the inverse often yields very tiny values, so scaling up by a large constant (100 or 1000) keeps the variable practically useful (avoiding tiny regression coefficients, for example).

flanker <- flanker %>% mutate(
  rt_log10 = log10(rt),
  rt_inv = -1000/rt
)
  
lattice::histogram(~ rt_log10 | cond*block, flanker)

lattice::histogram(~ rt_inv | cond*block, flanker)

Here, the inverse RT looks awfully good – we’ll use this moving forward.
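
If you prefer a number to eyeballing histograms, skewness statistics tell the same story. A quick sketch (psych::skew is one of several options; it assumes the psych package is installed):

#compare skew of the raw and transformed RTs (values nearer 0 are better)
sapply(flanker[, c("rt", "rt_log10", "rt_inv")], psych::skew, na.rm=TRUE)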

1.1.1 Person-period form (long form)

Thinking back to our multilevel structure, we have 160 trials nested within 107 people. Thus, our dataset should be 17,120 rows in long form. (If there were missing or dropped trials, this number could be lower.) For virtually all MLMs in R, we want our data in person-period form (more generically, ‘long form’), where one variable encodes the person (level 2 unit) and another encodes the period or repetition of each observation (level 1 unit). Other variables encode predictors or outcomes.
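
Our data already arrive in long form. If yours come in wide form instead (one column per trial), tidyr::pivot_longer will reshape them to person-period form. Here is a minimal toy sketch (the column names are made up, not from these data):

#toy wide data: one row per person, one column per trial
wide <- tibble(id = 1:2, rt_t1 = c(537, 410), rt_t2 = c(599, 388))
wide %>%
  pivot_longer(cols = starts_with("rt_t"),
               names_to = "trial", names_prefix = "rt_t",
               names_transform = list(trial = as.integer),
               values_to = "rt")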

head(flanker)
id  run  trial  run_trial  block     cond          rt  correct  exclude  MPS_aggression  MPS_alienation  rt_log10  rt_inv
 1    1      1          1  most_con  congruent    537        1        0               0               1      2.73   -1.86
 1    1      2          2  most_con  incongruent  599        1        0               0               1      2.78   -1.67
 1    1      3          3  most_con  congruent    397        1        0               0               1      2.60   -2.52
 1    1      4          4  most_con  congruent    450        1        0               0               1      2.65   -2.22
 1    1      5          5  most_con  incongruent  376        1        0               0               1      2.58   -2.66
 1    1      6          6  most_con  congruent    413        1        0               0               1      2.62   -2.42

1.1.2 L1 and L2 variables

Let’s think through what variables exist at the within-person (L1) and between-person (L2) levels.

Level 1: within-person

  1. rt: reaction time (in milliseconds)
  2. cond: congruent, incongruent (type of flanker display)
  3. block: most_incon, most_con (proportion of incongruent or congruent trials in a block)
  4. trial: linear trial number
  5. run_trial: linear trial number within each run (consisting of 40 trials)

Level 2: between-person

  1. exclude: An exclusion variable for sensitivity analysis, where higher values denote more dubious data
  2. MPS_aggression: The MPQ aggression subscale (personality)
  3. MPS_alienation: The MPQ alienation subscale (personality)

Unit variables

  1. id: The subject ID number
  2. run: The run number (4 per person)

1.2 Basic research questions

  1. Are RTs slower for incongruent relative to congruent trials?
  2. Are RTs slower for mostly incongruent blocks relative to mostly congruent blocks?
  3. Are congruent RTs slower in mostly incongruent blocks relative to mostly congruent blocks?
  4. Do RTs change substantially with experience (i.e., as trial increases)?

2 Fitting things the wrong way: GLM

Recall from Alison’s lecture that when we violate the assumption of independent residuals, we probably need to move away from the standard general linear model (GLM). The degree of non-independence of observations within a cluster is indexed by the intraclass correlation:

\[ ICC = \frac{s^2_b}{s^2_b + s^2_w} \]
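
As a quick worked example with made-up variance components (chosen to resemble what we will estimate from these data later):

#if between-person variance is 0.09 and within-person variance is 0.15:
s2_b <- 0.09
s2_w <- 0.15
s2_b / (s2_b + s2_w)  #ICC = 0.375, i.e., ~38% of variation lies between people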

What happens if we plow ahead despite these considerations and try to address our questions above?

Our basic model of the task would be:

\[ \textrm{RT}_{ij} = \beta_{0} + \beta_{1} \textrm{block}_{ij} + \beta_{2} \textrm{cond}_{ij} + \beta_{3} \textrm{block}_{ij} \cdot \textrm{cond}_{ij} + \varepsilon_{ij} \\ \boldsymbol{\varepsilon} \sim \mathcal{N}(0, \sigma^2) \] where we have trial \(i\) nested within subject \(j\).

Note: none of the coefficients varies by subject or trial, only the residuals do!

lm_fit <- lm(rt_inv ~ block*cond, flanker)
summary(lm_fit)
## 
## Call:
## lm(formula = rt_inv ~ block * cond, data = flanker)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -1.4782 -0.3145 -0.0018  0.3042  1.6324 
## 
## Coefficients:
##                               Estimate Std. Error t value Pr(>|t|)    
## (Intercept)                   -2.70410    0.00968 -279.38   <2e-16 ***
## blockmost_con                  0.02460    0.01159    2.12   0.0338 *  
## condincongruent                0.10846    0.01157    9.37   <2e-16 ***
## blockmost_con:condincongruent  0.04920    0.01640    3.00   0.0027 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.487 on 16797 degrees of freedom
##   (319 observations deleted due to missingness)
## Multiple R-squared:  0.0159, Adjusted R-squared:  0.0158 
## F-statistic: 90.7 on 3 and 16797 DF,  p-value: <2e-16
car::Anova(lm_fit, type=3)
              Sum Sq     Df  F value   Pr(>F)
(Intercept) 18521.70      1  78050.42   0.000
block           1.07      1      4.51   0.034
cond           20.86      1     87.89   0.000
block:cond      2.14      1      9.00   0.003
Residuals    3986.00  16797        NA      NA
summary(emblock <- emmeans(lm_fit, ~block))
## NOTE: Results may be misleading due to involvement in interactions
 block       emmean     SE     df  lower.CL  upper.CL
 most_incon   -2.65  0.006  16797     -2.66     -2.64
 most_con     -2.60  0.006  16797     -2.61     -2.59
pairs(emblock)
##  contrast              estimate     SE    df t.ratio p.value
##  most_incon - most_con  -0.0492 0.0082 16797 -6.000  <.0001 
## 
## Results are averaged over the levels of: cond
summary(emcond <- emmeans(lm_fit, ~cond))
## NOTE: Results may be misleading due to involvement in interactions
 cond         emmean     SE     df  lower.CL  upper.CL
 congruent     -2.69  0.006  16797     -2.70     -2.68
 incongruent   -2.56  0.006  16797     -2.57     -2.55
pairs(emcond)
##  contrast                estimate     SE    df t.ratio p.value
##  congruent - incongruent   -0.133 0.0082 16797 -16.230 <.0001 
## 
## Results are averaged over the levels of: block

2.1 What if we want these estimates on the original (back-transformed) scale?

We can force the estimates back onto the original RT scale by supplying a custom inverse transformation to the reference grid:

#back-transform to the RT scale; note that emmeans expects mu.eta to be the derivative
#of linkinv (here, 1000/x^2), so reusing linkinv, as below, affects the delta-method SEs
rg <- update(ref_grid(lm_fit),
             tran=list(linkinv=function(x) { -1000/x }, mu.eta=function(x) { -1000/x }),
             predict.type="response")
summary(res_trans <- emmeans(rg, ~cond|block))
 cond         block       response    SE     df  lower.CL  upper.CL
 congruent    most_incon       370  3.58  16797       367       372
 incongruent  most_incon       385  2.44  16797       383       387
 congruent    most_con         373  2.38  16797       371       375
 incongruent  most_con         397  3.86  16797       394       400
summary(emmeans(rg, ~cond))
## NOTE: Results may be misleading due to involvement in interactions
 cond         response    SE     df  lower.CL  upper.CL
 congruent         371  2.15  16797       370       373
 incongruent       391  2.27  16797       389       393
summary(emmeans(rg, ~block))
## NOTE: Results may be misleading due to involvement in interactions
 block       response    SE     df  lower.CL  upper.CL
 most_incon       377  2.18  16797       376       379
 most_con         385  2.23  16797       383       386
emmip(rg, ~cond|block)

pairs(emmeans(lm_fit, ~block|cond))
## cond = congruent:
##  contrast              estimate     SE    df t.ratio p.value
##  most_incon - most_con  -0.0246 0.0116 16797 -2.120  0.0338 
## 
## cond = incongruent:
##  contrast              estimate     SE    df t.ratio p.value
##  most_incon - most_con  -0.0738 0.0116 16797 -6.360  <.0001
ggplot(lm_fit, aes(sample = rstandard(lm_fit))) + 
  geom_qq() + stat_qq_line() + 
  ggtitle("Normal Q-Q plot on standardized residuals")

At a high level, we might conclude that there is some evidence that:

  1. RTs are about 20ms slower for incongruent relative to congruent trials
  2. RTs are somewhat faster in mostly incongruent blocks
  3. Congruent RTs are slightly faster in mostly incongruent blocks
  4. Incongruent trials are much slower in mostly congruent compared to mostly incongruent blocks!

2.2 We know it’s wrong (conceptually), but can we see the violations?

df <- flanker %>% add_residuals(model = lm_fit)  #modelr::add_residuals appends a resid column

by_subj <- ggplot(df, aes(x=factor(id), y=resid)) + 
  stat_summary(fun.data = mean_cl_boot, geom = "pointrange") + 
  geom_hline(yintercept=0, color="blue", size=2) +
  ggtitle("95% CIs on residuals by subject") +
  labs(x=NULL) + ylim(-1.5, 1.6) +
  theme_cowplot() +
  theme(axis.text.x = element_blank())

overall <- ggplot(df, aes(x=1, y=resid)) + geom_boxplot() + 
  # stat_summary(fun.data = mean_cl_boot, geom = "pointrange") +
  geom_hline(yintercept=0, color="blue", size=2) +
  ggtitle("Overall") +
  labs(x=NULL) + ylim(-1.5, 1.6) +
  theme_cowplot() +
  theme(axis.text.x = element_blank())

cowplot::plot_grid(by_subj, overall, nrow=1,rel_widths = c(0.8, 0.2))
## Warning: Removed 324 rows containing non-finite values (stat_summary).
## Warning: Removed 324 rows containing non-finite values (stat_boxplot).

By eye, it looks a bit suspicious, right?! Let’s reuse the GLM to examine problems with the GLM. Here’s a model that treats id as a factor and asks whether the residuals vary significantly across ids.

mm <- lm(resid ~ factor(id), df)
summary(mm)
## 
## Call:
## lm(formula = resid ~ factor(id), data = df)
## 
## Residuals:
##    Min     1Q Median     3Q    Max 
## -1.799 -0.228  0.008  0.229  1.589 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept)    0.24529    0.03029    8.10  6.0e-16 ***
## factor(id)2   -0.41601    0.04318   -9.63  < 2e-16 ***
## factor(id)3   -0.37202    0.04284   -8.68  < 2e-16 ***
## factor(id)4   -0.83873    0.04355  -19.26  < 2e-16 ***
## factor(id)5   -0.38191    0.04284   -8.92  < 2e-16 ***
## factor(id)6   -0.28932    0.04284   -6.75  1.5e-11 ***
## factor(id)8   -0.12694    0.04284   -2.96  0.00305 ** 
## factor(id)9   -0.73489    0.04340  -16.93  < 2e-16 ***
## factor(id)10  -0.52365    0.04311  -12.15  < 2e-16 ***
## factor(id)11   0.00792    0.04284    0.18  0.85339    
## factor(id)12  -0.83344    0.04434  -18.80  < 2e-16 ***
## factor(id)13  -0.47480    0.04284  -11.08  < 2e-16 ***
## factor(id)14   0.44608    0.04290   10.40  < 2e-16 ***
## factor(id)15   0.49738    0.04297   11.57  < 2e-16 ***
## factor(id)16  -0.40843    0.04332   -9.43  < 2e-16 ***
## factor(id)17  -0.48984    0.04284  -11.43  < 2e-16 ***
## factor(id)18  -0.34916    0.04311   -8.10  5.9e-16 ***
## factor(id)19  -0.24169    0.04290   -5.63  1.8e-08 ***
## factor(id)20  -0.30117    0.04284   -7.03  2.1e-12 ***
## factor(id)21  -0.10775    0.04311   -2.50  0.01245 *  
## factor(id)22  -0.67788    0.04290  -15.80  < 2e-16 ***
## factor(id)23  -0.21032    0.04284   -4.91  9.2e-07 ***
## factor(id)24  -0.67003    0.04370  -15.33  < 2e-16 ***
## factor(id)25  -0.45270    0.04284  -10.57  < 2e-16 ***
## factor(id)26  -0.44352    0.04304  -10.30  < 2e-16 ***
## factor(id)27  -0.09973    0.04290   -2.32  0.02012 *  
## factor(id)28  -0.48700    0.04297  -11.33  < 2e-16 ***
## factor(id)29  -0.45874    0.04297  -10.68  < 2e-16 ***
## factor(id)30   0.14625    0.04284    3.41  0.00064 ***
## factor(id)31  -0.80450    0.04332  -18.57  < 2e-16 ***
## factor(id)32  -0.32334    0.04297   -7.52  5.6e-14 ***
## factor(id)33  -0.32109    0.04304   -7.46  9.1e-14 ***
## factor(id)34  -0.38455    0.04290   -8.96  < 2e-16 ***
## factor(id)35  -0.77034    0.04325  -17.81  < 2e-16 ***
## factor(id)36  -0.45882    0.04564  -10.05  < 2e-16 ***
## factor(id)37  -0.24248    0.04290   -5.65  1.6e-08 ***
## factor(id)38   0.39919    0.04284    9.32  < 2e-16 ***
## factor(id)40  -0.42446    0.04284   -9.91  < 2e-16 ***
## factor(id)41  -0.14849    0.04284   -3.47  0.00053 ***
## factor(id)42  -0.31970    0.04311   -7.42  1.3e-13 ***
## factor(id)43  -0.28215    0.04304   -6.56  5.7e-11 ***
## factor(id)45   0.10422    0.04332    2.41  0.01615 *  
## factor(id)46   0.13858    0.04297    3.22  0.00126 ** 
## factor(id)48  -0.64253    0.04297  -14.95  < 2e-16 ***
## factor(id)49  -0.06866    0.04284   -1.60  0.10898    
## factor(id)50  -0.50522    0.04311  -11.72  < 2e-16 ***
## factor(id)51   0.07398    0.04304    1.72  0.08566 .  
## factor(id)52   0.14858    0.04284    3.47  0.00052 ***
## factor(id)53  -0.21420    0.04284   -5.00  5.8e-07 ***
## factor(id)54  -0.33281    0.04284   -7.77  8.4e-15 ***
## factor(id)56  -0.47160    0.04311  -10.94  < 2e-16 ***
## factor(id)57  -0.33916    0.04311   -7.87  3.8e-15 ***
## factor(id)58  -0.06277    0.04290   -1.46  0.14348    
## factor(id)59  -0.60609    0.04318  -14.04  < 2e-16 ***
## factor(id)60  -0.52008    0.04355  -11.94  < 2e-16 ***
## factor(id)61  -0.41963    0.04290   -9.78  < 2e-16 ***
## factor(id)62   0.26537    0.04478    5.93  3.2e-09 ***
## factor(id)63  -0.64276    0.04284  -15.00  < 2e-16 ***
## factor(id)64  -0.36744    0.04318   -8.51  < 2e-16 ***
## factor(id)65   0.33972    0.04304    7.89  3.1e-15 ***
## factor(id)66   0.17324    0.04284    4.04  5.3e-05 ***
## factor(id)67  -0.60154    0.04304  -13.98  < 2e-16 ***
## factor(id)68  -0.31396    0.04290   -7.32  2.6e-13 ***
## factor(id)69  -0.03910    0.04284   -0.91  0.36136    
## factor(id)70  -0.58466    0.04325  -13.52  < 2e-16 ***
## factor(id)71  -0.63960    0.04284  -14.93  < 2e-16 ***
## factor(id)72  -0.24642    0.04297   -5.73  1.0e-08 ***
## factor(id)73  -0.21059    0.04325   -4.87  1.1e-06 ***
## factor(id)74  -0.18266    0.04284   -4.26  2.0e-05 ***
## factor(id)75   0.16779    0.04284    3.92  9.0e-05 ***
## factor(id)76  -0.36953    0.04284   -8.63  < 2e-16 ***
## factor(id)77  -0.45617    0.04284  -10.65  < 2e-16 ***
## factor(id)78   0.19719    0.04284    4.60  4.2e-06 ***
## factor(id)79  -0.59938    0.04304  -13.93  < 2e-16 ***
## factor(id)81  -0.00174    0.04284   -0.04  0.96751    
## factor(id)82   0.06080    0.04297    1.41  0.15711    
## factor(id)84  -0.33296    0.04284   -7.77  8.1e-15 ***
## factor(id)85  -0.48373    0.04284  -11.29  < 2e-16 ***
## factor(id)86  -0.21452    0.04284   -5.01  5.6e-07 ***
## factor(id)87  -0.30948    0.04347   -7.12  1.1e-12 ***
## factor(id)88  -0.35559    0.04325   -8.22  < 2e-16 ***
## factor(id)89   0.18750    0.04297    4.36  1.3e-05 ***
## factor(id)90   0.41492    0.04304    9.64  < 2e-16 ***
## factor(id)91  -0.26983    0.04284   -6.30  3.1e-10 ***
## factor(id)92  -0.12035    0.04417   -2.72  0.00645 ** 
## factor(id)93  -0.12757    0.04284   -2.98  0.00290 ** 
## factor(id)94  -0.13616    0.04284   -3.18  0.00148 ** 
## factor(id)95  -0.36663    0.04284   -8.56  < 2e-16 ***
## factor(id)96  -0.31796    0.04290   -7.41  1.3e-13 ***
## factor(id)97  -0.57405    0.04290  -13.38  < 2e-16 ***
## factor(id)98  -0.73954    0.04297  -17.21  < 2e-16 ***
## factor(id)99  -0.39441    0.04355   -9.06  < 2e-16 ***
## factor(id)100  0.06134    0.04284    1.43  0.15221    
## factor(id)101  0.03500    0.04284    0.82  0.41385    
## factor(id)102 -0.24965    0.04290   -5.82  6.0e-09 ***
## factor(id)103 -0.15830    0.04284   -3.70  0.00022 ***
## factor(id)104 -0.56794    0.04347  -13.06  < 2e-16 ***
## factor(id)105 -0.26834    0.04290   -6.25  4.1e-10 ***
## factor(id)106 -0.08392    0.04284   -1.96  0.05013 .  
## factor(id)107 -0.05651    0.04284   -1.32  0.18709    
## factor(id)108 -0.23763    0.04290   -5.54  3.1e-08 ***
## factor(id)109  0.04264    0.04297    0.99  0.32106    
## factor(id)110  0.04741    0.04311    1.10  0.27147    
## factor(id)111  0.52864    0.04284   12.34  < 2e-16 ***
## factor(id)112 -0.19344    0.04290   -4.51  6.6e-06 ***
## factor(id)113 -0.10159    0.04290   -2.37  0.01790 *  
## factor(id)114  0.00377    0.04284    0.09  0.92986    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.383 on 16694 degrees of freedom
##   (319 observations deleted due to missingness)
## Multiple R-squared:  0.385,  Adjusted R-squared:  0.381 
## F-statistic: 98.7 on 106 and 16694 DF,  p-value: <2e-16
car::Anova(mm, type=3)
             Sum Sq      Df  F value  Pr(>F)
(Intercept)    9.63       1     65.6       0
factor(id)  1535.31     106     98.7       0
Residuals   2450.69   16694       NA      NA

So, yeah, an F-test of almost 100 is probably worth attention.

More formally, we know that the ICC can give us a useful guide on whether clustering within units is substantial and, therefore, problematic for the GLM. Here, we calculate the ICC from the residuals of the GLM (i.e., after removing variance due to condition and block).

rfx_model <- lmer(resid ~ 1 + (1|id), df)
performance::icc(rfx_model)
## # Intraclass Correlation Coefficient
## 
##      Adjusted ICC: 0.385
##   Conditional ICC: 0.385

Bonus for reflection: is it a coincidence that this ICC is identical to the \(R^2\) from the GLM of residuals with subject as a factor? No, it’s not – take some time to think on why these are equivalent.

3 Alternative: model each subject

Conceptually, when we have repeated measures data, we are often interested in the possibility that certain relationships vary from one person to the next. Indeed, these between-person differences might be related to other individual difference variables such as personality traits.

What if we wanted to take this approach here and get a sense of how the experimental effects differ by subject?

3.1 Visualize RT effects by subject

Here are boxplots for the first 25 subject IDs:

ggplot(flanker %>% filter(id <= 25), aes(x=cond, y=rt, color=block)) + 
  geom_boxplot() +
  facet_wrap(~id)
## Warning: Removed 79 rows containing non-finite values (stat_boxplot).

Definitely some heterogeneity there!

3.2 Looking at coefficients by subject

What if we truly treated data for each subject separately? We could split up the data by subject and run a model with each person as a separate dataset.

#a bit beyond the scope, but at a high level, we use dplyr and broom to run the same RT model separately for each subject
#Documentation: https://cran.r-project.org/web/packages/broom/vignettes/broom_and_dplyr.html
res <- flanker %>%
  group_by(id) %>%
  nest() %>%
  mutate(test=map(data, ~ lm(rt_inv ~ block*cond, .x, na.action = na.exclude)),
         stats=map(test, tidy)) %>%
  unnest(stats) %>% dplyr::select(-data, -test) %>%
  ungroup() %>% group_by(term) %>% 
  mutate(estimate_termz=as.vector(scale(estimate))) %>% 
  ungroup()

ggplot(res, aes(x=term, y=id, fill=estimate_termz)) + 
  geom_tile() + scale_fill_viridis_c() + ggtitle("Coefficients") +
  theme(axis.text.x=element_text(angle=90))

overall <- data.frame(term=names(coef(lm_fit)), estimate=coef(lm_fit))
ggplot(res, aes(x=1, y=estimate)) + geom_boxplot() + facet_wrap(~term, scales="free") + 
  geom_point(data=overall, size=6, color="blue", shape=18) + 
  labs(title="Heterogeneity in within-person model coefficients",
       subtitle="Blue diamonds are coefs from overall lm()", x=NULL) +
  theme(axis.text.x=element_blank(), axis.ticks.x=element_blank())

4 Overview of mixed effects in lmer

Let’s move into the formal MLM framework to examine these data. Here is the basic model for a random intercept model, where we allow the intercept to vary by cluster, but keep the condition effects (cond and block) homogeneous (i.e., fixed only).

\[ \textrm{RT}_{ij} = \beta_{0j} + \beta_{1} \textrm{block}_{ij} + \beta_{2} \textrm{cond}_{ij} + \beta_{3} \textrm{block}_{ij} \cdot \textrm{cond}_{ij} + \varepsilon_{ij} \\ \beta_{0j} = \gamma_{00} + \mu_{0j} \\ \boldsymbol{\mu_{0.}} \sim \mathcal{N}(0, \tau_{00}) \\ \boldsymbol{\varepsilon_{..}} \sim \mathcal{N}(0, \sigma^2) \]

4.1 Basic unconditional random intercept model

(Also called unconditional means model)

MLM usually begins by running an unconditional model that allows for cluster variation in the mean/expectation/intercept. The term ‘unconditional’ refers to the fact that the outcome (rt_inv) is not conditioned on (regressed on) any predictors. In this scenario, the intercept of the model is nearly the same as the overall sample average (the reasons for the small divergence would be a distraction at the moment).

Why is this model useful? It gives us a starting point in terms of the amount of variation in the outcome that is attributable to between-cluster (here, between persons) versus within-cluster variation.

m1 <- lmer(rt_inv ~ 1 + (1|id), flanker)
summary(m1)
## Linear mixed model fit by REML. t-tests use Satterthwaite's method ['lmerModLmerTest']
## Formula: rt_inv ~ 1 + (1 | id)
##    Data: flanker
## 
## REML criterion at convergence: 16381
## 
## Scaled residuals: 
##    Min     1Q Median     3Q    Max 
## -4.741 -0.598  0.012  0.601  4.363 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev.
##  id       (Intercept) 0.0918   0.303   
##  Residual             0.1507   0.388   
## Number of obs: 16801, groups:  id, 107
## 
## Fixed effects:
##             Estimate Std. Error       df t value Pr(>|t|)    
## (Intercept)  -2.6323     0.0295 105.9839   -89.4   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
mean(flanker$rt_inv, na.rm=TRUE)
## [1] -2.63
#store variance components
v1 <- as.data.frame(get_variance(m1)) %>% mutate(model="unconditional random intercept") %>%
  dplyr::select(model, var.fixed, var.intercept, var.residual)

As a quick overview of lmer syntax, all terms on the right-hand side that are not in parentheses are fixed effects in the model – that is, the estimates reflect the sample-average effect. Here, ~ 1 indicates that an overall intercept (\(\gamma_{00}\)) should be included in the model. Note that this is the default, but it’s useful to include it for clarity.

All terms inside parentheses are entered as random effects, introducing a variance component for each term. Here, the 1 within the parentheses requests that the intercept vary across clusters. The | operator separates the varying terms (left side) from the clustering variable (right side): id is the clustering variable, so all terms to the left of the ‘pipe’ vary across ids.
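
For reference, here are a few common random-effects specifications, shown as a non-evaluated sketch (x stands in for a generic predictor):

#(1 | id)                   random intercept per id
#(1 + x | id)               correlated random intercept and x slope
#(1 | id) + (0 + x | id)    intercept and slope, forced uncorrelated
#(1 + x || id)              double-bar shorthand for the uncorrelated version
#                           (in lme4, || only separates numeric predictors)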

To map this output onto our model:

  1. \(\gamma_{00} = -2.6323\) (estimate of Intercept under fixed effects)
  2. \(\tau_{00} = .0918\) (estimate of Intercept variation by id under random effects)
  3. \(\textrm{var}(\boldsymbol{\varepsilon_{..}}) = .1507\) (estimate of residual variability under random effects)

And remember our formula for ICC – proportion of between-cluster variation relative to overall variation. With this in mind, we have the values to calculate it by hand:

vc <- VarCorr(m1)
print(vc,comp=c("Variance"))
##  Groups   Name        Variance
##  id       (Intercept) 0.0918  
##  Residual             0.1507
dd <- as.data.frame(vc)

ICC_by_hand <- dd$vcov[dd$grp=="id"] / (dd$vcov[dd$grp=="id"] + dd$vcov[dd$grp=="Residual"])
print(ICC_by_hand)
## [1] 0.379
#helper function
performance::icc(m1)
## # Intraclass Correlation Coefficient
## 
##      Adjusted ICC: 0.379
##   Conditional ICC: 0.379

4.1.1 Important principle: good predictors reduce residual variation

This principle is intuitive, but often overlooked. When we fit a standard GLM, we hope that as we add good predictors into the model, \(SS_\textrm{Error}\) goes down, while the \(SS_\textrm{Model}\) goes up.

\[ \begin{align} SS_\textrm{Total} &= \sum_{i=1}^{N}({y_i - \bar{y}})^2 \\ SS_\textrm{Model} &= \sum_{i=1}^{N}({\hat{y}_i - \bar{y}})^2 \\ SS_\textrm{Error} &= \sum_{i=1}^{N}({y_i - \hat{y}_i})^2 = SS_\textrm{Total} - SS_\textrm{Model} \end{align} \]
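
As a quick numerical sanity check, we can verify this decomposition on our earlier GLM. A sketch (lm drops missing rows, so we pull the response from the model frame to keep lengths aligned):

y <- model.frame(lm_fit)$rt_inv   #response on the complete cases only
y_hat <- fitted(lm_fit)
ss_total <- sum((y - mean(y))^2)
ss_model <- sum((y_hat - mean(y))^2)
ss_error <- sum((y - y_hat)^2)
all.equal(ss_total, ss_model + ss_error)  #TRUE for OLS with an intercept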

By extension, when we move into the multilevel framework, each variance component (not just \(\sigma^2\)) represents variation that could potentially be explained by one or more predictors.

In the MLM, however, we can think of each variance component as a ‘compartment’ of overall variance, where the compartments can vary by level (e.g., within- versus between-persons). For example, if we introduce a level 1 predictor, it can potentially explain variation – and thus reduce the variance components – for both the L1 residual variability (\(\sigma^2\)) and level 2 variability in intercepts (\(\tau_{00}\)). This relates to the discussion of parcelling within versus between variation in L1 predictors and the concept of a ‘total effect’ (that combines both L1 and L2 variation).
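
A common way to parcel an L1 predictor into these compartments is person-mean centering. Here is a minimal sketch with a hypothetical continuous L1 predictor x (toy data; the variable names are ours):

#split x into a between-person part (person mean) and a within-person part (deviation)
toy <- tibble(id = rep(1:3, each = 4), x = rnorm(12), y = rnorm(12)) %>%
  group_by(id) %>%
  mutate(x_between = mean(x),
         x_within = x - x_between) %>%
  ungroup()
#lmer(y ~ x_within + x_between + (1 | id), toy) would then estimate separate
#within-person and between-person effects of x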

4.2 Basic random intercepts model

Let’s take a look at what happens to our model when we add in the design factors of cond and block. This is a basic random intercepts model where we have fixed effects of cond and block and intercepts that vary across subjects (random). A useful way to rewrite the MLM equation above is:

\[ \textrm{RT}_{ij} = \textrm{Intercept}_j + \beta_{1} \textrm{block}_{ij} + \beta_{2} \textrm{cond}_{ij} + \beta_{3} \textrm{block}_{ij} \cdot \textrm{cond}_{ij} + \varepsilon_{ij} \\ \textrm{Intercept}_{j} = \gamma_{00} + \mu_{0j} \\ {\textbf{Intercepts}} \sim \mathcal{N}(\gamma_{00}, \tau_{00}) \]

This notation emphasizes that the intercepts are a set of values, one per subject, and are normally distributed with mean \(\gamma_{00}\) and variance \(\tau_{00}\).

m2 <- lmer(rt_inv ~ block*cond + (1|id), flanker)
summary(m2)
## Linear mixed model fit by REML. t-tests use Satterthwaite's method ['lmerModLmerTest']
## Formula: rt_inv ~ block * cond + (1 | id)
##    Data: flanker
## 
## REML criterion at convergence: 15963
## 
## Scaled residuals: 
##    Min     1Q Median     3Q    Max 
## -4.678 -0.593  0.019  0.599  4.138 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev.
##  id       (Intercept) 0.0919   0.303   
##  Residual             0.1468   0.383   
## Number of obs: 16801, groups:  id, 107
## 
## Fixed effects:
##                                Estimate Std. Error        df t value Pr(>|t|)    
## (Intercept)                   -2.71e+00   3.03e-02  1.18e+02  -89.35  < 2e-16 ***
## blockmost_con                  2.34e-02   9.11e-03  1.67e+04    2.57     0.01 *  
## condincongruent                1.09e-01   9.10e-03  1.67e+04   11.96  < 2e-16 ***
## blockmost_con:condincongruent  5.05e-02   1.29e-02  1.67e+04    3.91  9.2e-05 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Correlation of Fixed Effects:
##             (Intr) blckm_ cndncn
## blockmst_cn -0.210              
## condncngrnt -0.210  0.699       
## blckmst_cn:  0.148 -0.706 -0.705
v2 <- as.data.frame(get_variance(m2)) %>% mutate(model="random intercept cond+block") %>%
  dplyr::select(model, var.fixed, var.intercept, var.residual)

Let’s look at the reduction in the variance components with the addition of fixed effects for cond and block:

v1_long <-  v1 %>% pivot_longer(-model) %>% rename(value.unconditional = value) %>% dplyr::select(-model)
v2_long <-  v2 %>% pivot_longer(-model) %>% rename(value.addfixed = value) %>% dplyr::select(-model)
v1_long %>% left_join(v2_long) %>% mutate(re.change.pct=(1 - value.addfixed/value.unconditional)*100) %>%
  kable(digits=2)
## Joining, by = "name"
name           value.unconditional  value.addfixed  re.change.pct
var.fixed                     0.00            0.00           -Inf
var.intercept                 0.09            0.09          -0.11
var.residual                  0.15            0.15           2.59

Here, we see that introducing cond*block reduced L1 residual variability by about 2.6% but had no effect on variability in the intercepts.

For discussion: I mentioned above that adding L1 predictors – which is what cond and block are – has the potential to explain variation at both L1 and L2. Why didn’t we see any reduction in the intercept variability here?

4.2.1 Visualizing our random effects

We can obtain the estimated random effects for each cluster in an lmer model using ranef.

res <- ranef(m2)$id
lattice::histogram(res$`(Intercept)`, xlab="Estimated subject intercepts (mu_0j)")

We can use REsim from merTools to get estimated random effects for each cluster (here, id), including uncertainties in these estimates. We can also plot these using plotREsim:

#predictInterval(m2)   # for various model predictions, possibly with new data

simranef <- REsim(m2)  # mean, median and sd of the random effect estimates
simranef %>% head()
groupFctr  groupID         term    mean  median     sd
id               1  (Intercept)   0.253   0.257  0.043
id               2  (Intercept)  -0.164  -0.163  0.042
id               3  (Intercept)  -0.119  -0.120  0.039
id               4  (Intercept)  -0.582  -0.586  0.040
id               5  (Intercept)  -0.130  -0.131  0.042
id               6  (Intercept)  -0.038  -0.040  0.040
plotREsim(simranef)  # plot the interval estimates

This is a handy plotting function because it shows by eye the tendency for cluster-level deviations to differ significantly from zero. Recall that if all cluster deviations (here, \(\mu_{0j}\)) are approximately zero, we don’t really need to model that variance component (random effect).
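
lmerTest also offers a formal check along these lines: ranova() compares the fitted model against versions that drop each random term via likelihood-ratio tests. A sketch (interpret the p-value cautiously, since the null places the variance on the boundary of its parameter space):

lmerTest::ranova(m2)  #LRT for removing the random intercept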

Furthermore, as discussed previously, if we add predictors (as fixed effects) to the model and they explain considerable between-subjects (or, more generally, between-cluster) variability, the subject-specific intercept deviations will shrink toward zero and the variance component \(\tau_{00}\) will diminish accordingly.
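
As a hedged illustration using one of our L2 variables: if trait aggression explained between-person differences in RT, the id intercept variance should drop relative to the 0.0919 estimated in m2 (this is a sketch of the check, not a claim about the result):

m2_l2 <- lmer(rt_inv ~ block*cond + MPS_aggression + (1|id), flanker)
VarCorr(m2_l2)  #compare the id (Intercept) variance against m2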

5 Basic random slopes model

What if we allowed for the possibility that the effect of incongruent vs. congruent trials varied by subject? That is, we think that there is some overall effect of congruency, but also that this effect could be stronger or weaker in some people.

How does this modify our model above? To keep things simple for now (we’ll come back to this), I’m falling back to an additive model that does not include the cond*block interaction.

\[ \begin{align} \textrm{RT}_{ij} &= \beta_{0j} + \beta_{1j} \textrm{block}_{ij} + \beta_{2j} \textrm{cond}_{ij} + \varepsilon_{ij} \\ \beta_{0j} &= \gamma_{00} + \mu_{0j} \\ \beta_{1j} &= \gamma_{10} \\ \beta_{2j} &= \gamma_{20} + \mu_{2j} \\ \boldsymbol{\varepsilon_{..}} &\sim \mathcal{N}(0, \sigma^2) \end{align} \]

\[ \begin{bmatrix} \mu_{0j} \\ \mu_{2j} \end{bmatrix} \sim \mathcal{N}\left(\begin{bmatrix} 0 \\ 0\end{bmatrix}, \begin{bmatrix} \tau_{00} & \tau_{\textrm{int,cond}} \\ \tau_{\textrm{int,cond}} & \tau_{11}\end{bmatrix}\right) \] Here, \(\tau_{\textrm{int,cond}}\) represents the covariance between a subject’s estimated intercept and their estimated slope for the cond–RT association.

m3 <- lmer(rt_inv ~ block + cond + (1 + cond | id), flanker)
summary(m3)
## Linear mixed model fit by REML. t-tests use Satterthwaite's method ['lmerModLmerTest']
## Formula: rt_inv ~ block + cond + (1 + cond | id)
##    Data: flanker
## 
## REML criterion at convergence: 15940
## 
## Scaled residuals: 
##    Min     1Q Median     3Q    Max 
## -4.688 -0.596  0.021  0.595  4.188 
## 
## Random effects:
##  Groups   Name            Variance Std.Dev. Corr 
##  id       (Intercept)     0.09552  0.3091        
##           condincongruent 0.00365  0.0604   -0.24
##  Residual                 0.14604  0.3822        
## Number of obs: 16801, groups:  id, 107
## 
## Fixed effects:
##                  Estimate Std. Error        df t value Pr(>|t|)    
## (Intercept)     -2.72e+00   3.05e-02  1.11e+02  -89.30  < 2e-16 ***
## blockmost_con    4.86e-02   6.43e-03  1.66e+04    7.55  4.6e-14 ***
## condincongruent  1.34e-01   8.69e-03  1.26e+02   15.44  < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Correlation of Fixed Effects:
##             (Intr) blckm_
## blockmst_cn -0.147       
## condncngrnt -0.267  0.296
print(unlist(get_variance(m3)))
##                    var.fixed                   var.random                 var.residual 
##                      0.00379                      0.09286                      0.14604 
##             var.distribution               var.dispersion             var.intercept.id 
##                      0.14604                      0.00000                      0.09552 
## var.slope.id.condincongruent       cor.slope_intercept.id 
##                      0.00365                     -0.23980
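
To see the estimated \(\boldsymbol{\tau}\) matrix directly, including the intercept–slope covariance \(\tau_{\textrm{int,cond}}\), we can pull it from VarCorr:

tau <- VarCorr(m3)$id    #2 x 2 random-effects covariance matrix for id
tau
attr(tau, "correlation") #the -0.24 intercept-slope correlation from the output above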
simranef3 <- REsim(m3)  # mean, median and sd of the random effect estimates
plotREsim(simranef3) + facet_wrap(~term, scales="free", ncol=1) # plot the interval estimates

We can see by eye that a handful of subjects have congruency slopes that differ reliably from zero, though this is not particularly striking relative to the random intercept.

5.1 Visualizing these effects

plot_model(m2, type = "pred", terms = c("cond", "block"), pred.type = "re") + 
  labs(x = "Condition", y = "RT inverse", title = "Predicted effects in model that excludes condition random slope")

#add back the interaction to make it comparable to the random intercept model (m2)
m4 <- lmer(rt_inv ~ block*cond + (1 + cond | id), flanker)

plot_model(m4, type = "pred", terms = c("cond", "block"), pred.type = "re") + 
  labs(x = "Condition", y = "RT inverse", title = "Predicted effects in model that includes condition random slope")

plot_model(m4, type = "re", terms = c("cond", "block")) + 
  #facet_wrap(~facet, scales="free_y") +
  labs(y = "Model effect", x = "ID", title = "Model that includes random slopes")

5.2 What about approximate normality of random effects?

rr <- ranef(m3)$id %>% pivot_longer(cols=everything())

ggplot(rr, aes(x=value)) + geom_histogram(bins=15) + 
  facet_wrap(~name, scales="free")

Not too bad. One of our assumptions in MLM is that the random effects are distributed as (approximately) multivariate normal. Here, I’ve plotted the univariate distributions. We would also perhaps want to check that the bivariate distribution is approximately normal.

If you really want to get fancy, you could formally test for multivariate normality using the MVN package, though this is overkill for most applications.
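
A sketch of what that could look like (this assumes the MVN package is installed; mvn() is its main entry point, though argument names have shifted across MVN versions, so check yours):

#Mardia's test of multivariate normality on the estimated random effects
MVN::mvn(ranef(m3)$id, mvnTest = "mardia")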

ggplot(ranef(m3)$id, aes(x=`(Intercept)`, y=condincongruent)) + geom_density_2d()