Note that this Rmd file depends on having Mplus installed. You can obtain a fully functioning student version for $350 here:

Or you can use the demo version for free here:

The demo version has the following constraints:

  1. Maximum number of dependent variables: 6
  2. Maximum number of independent variables: 2
  3. Maximum number of between variables in two-level analysis: 2
  4. Maximum number of continuous latent variables in time series analysis: 2

MSEM can also be conducted in lavaan in R, but its capabilities are still limited. Currently, you can fit a two-level SEM with random intercepts (only when all data are continuous; cases with missing values are removed by listwise deletion). You can see more here:
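As a point of reference, a two-level random-intercept model can be sketched with lavaan's `level:` syntax. The sketch below uses the `Demo.twolevel` example dataset that ships with lavaan (the variable names `y1`, `x1`, `x2`, `w1`, and `cluster` come from that dataset, not from this tutorial):

```r
library(lavaan)

# Two-level regression with random intercepts: y1 regressed on
# within-cluster predictors at level 1 and a cluster-level predictor at level 2
model <- '
  level: 1
    y1 ~ x1 + x2
  level: 2
    y1 ~ w1
'
fit <- sem(model, data = Demo.twolevel, cluster = "cluster")
summary(fit)
```

The `cluster` argument tells lavaan which variable identifies the L2 units; everything under `level: 2` is estimated on the latent between-cluster components.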

Mplus is probably the most user-friendly program for multilevel SEM, though there is similar functionality in EQS and LISREL. In addition, the OpenMx package in R is free and supports multilevel analyses, but requires a substantially different approach to syntax and specification.

This RMarkdown also leverages a package called MplusAutomation, which allows for effective cross-talk between Mplus and R. Mplus writes all of its output to a text-based file, which makes it difficult to compare outputs across models or to extract specific subsections; MplusAutomation parses these files into R objects so that results can be queried and compared programmatically. If you’re interested in how this package works, please see Hallquist and Wiley (in press), Structural Equation Modeling.
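As a sketch of that workflow (assuming Mplus is installed, and assuming a hypothetical long-format data frame `df` with the variables named below), a model can be specified in R, run in Mplus, and read back via `mplusObject()` and `mplusModeler()`:

```r
library(MplusAutomation)

# Build an Mplus input file from R; df is a hypothetical person-year data frame
m <- mplusObject(
  TITLE = "Two-level model of suicidal ideation;",
  VARIABLE = "CLUSTER = participant; WITHIN = int_dys; BETWEEN = negative_affect;",
  ANALYSIS = "TYPE = TWOLEVEL;",
  MODEL = "
    %WITHIN%
    ideation ON int_dys;
    %BETWEEN%
    ideation ON negative_affect;",
  rdata = df
)

# Writes the .inp and .dat files, runs Mplus, and parses the .out file back into R
fit <- mplusModeler(m, modelout = "ideation_twolevel.inp", run = 1L)
fit$results$parameters
```

The parsed `$results` list is what makes programmatic model comparison possible.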

1 What is multilevel analysis?

In the standard general linear model (GLM), we assume that observations are independent of each other. In particular, we assume that the residuals are independently and identically distributed (iid) normal:

\[ \begin{align*} Y &= B_0 + B_1 X_1 + B_2 X_2 + \ldots + B_p X_p + \varepsilon \\ \varepsilon &\sim \mathcal{N}(0, \sigma^2) \end{align*} \]

We know from previous sessions that the assumption of independence is violated when there are multiple observations per person (e.g., longitudinal data) or when there are known clusters in the data that introduce similarity among observations within a cluster.

1.1 L1 versus L2 effects

Multilevel analysis also allows us to ask specific questions at each level of the model. Today, we will look at how we can do this using a longitudinal dataset of individuals diagnosed with BPD who were followed at annual assessments for up to 30 years. Participants completed a variety of assessments measuring personality and psychopathology at baseline. At each annual visit, they reported on recent interpersonal functioning, suicidal ideation, and any suicide attempts since their last visit. In this case, annual visits (L1) are nested within participants (L2), and predictors at either L1 or L2 could affect the outcome of interest. At the participant level (L2), we might hypothesize that certain traits, like negative affectivity, influence an outcome of interest (e.g., suicidal ideation). By definition, trait scores are only meaningful as an L2 variable: they were only measured at baseline, and are presumed (in this study at least) to be stable over time. So if we take ideation as our dependent variable, the data might look like this:

tt <- dplyr::tribble(
  ~year, ~participant, ~negative_affect, ~ideation_score,
  1, "JohnID", 4.5, 8,
  2, "JohnID", 4.5, 7,
  3, "JohnID", 4.5, 9,
  1, "SarahID", 2.1, 4,
  2, "SarahID", 2.1, 4,
  3, "SarahID", 2.1, 2
)

year  participant  negative_affect  ideation_score
   1  JohnID                   4.5               8
   2  JohnID                   4.5               7
   3  JohnID                   4.5               9
   1  SarahID                  2.1               4
   2  SarahID                  2.1               4
   3  SarahID                  2.1               2

Notice that there is no yearly variation in negative affect. This is the simplest case for a predictor because there is no concern that it could represent a mixture of L1 and L2 effects. That is, negative affect cannot predict any L1 variance (in this case, year-to-year fluctuation) in the DV.

We wish to predict the severity of suicidal ideation in year i for person j as a function of their negative affect. (Note that you will sometimes see different subscripts when person is the L2 unit; often you will see time t nested within person i. The models, however, are the same.) Some notation:

\[ \begin{align*} \textrm{ideation}_{ij} &= \beta_{0j} + r_{ij} \\ \beta_{0j} &= \gamma_{00} + \gamma_{01} \textrm{negative_affect}_j + \mu_{0j} \\ \end{align*} \] In this notation, \(\gamma_{01}\) represents the person-level effect of negative affect on suicidal ideation. We also allow for normally distributed person-level variation in ideation, \(\mu_{0j}\), around the sample average \(\gamma_{00}\). To keep the \(\beta_{0j}\) equation easy to interpret, L2 predictors are typically centered around the grand mean. This ensures that \(\gamma_{00}\) retains the interpretation of the average ideation in the sample (fixed effect), and \(\mu_{0j}\) is person-level variation around this mean (random effect).
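Using the `tt` data frame from above, grand-mean centering the L2 predictor amounts to subtracting a mean computed over participants (one value per person), not over person-year rows:

```r
# One row per participant, so the grand mean is taken over people, not person-years
person_level <- tt[!duplicated(tt$participant), ]
tt$negative_affect_c <- tt$negative_affect - mean(person_level$negative_affect)
# e.g., John: 4.5 - 3.3 = 1.2; Sarah: 2.1 - 3.3 = -1.2
```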

Variables measured on a more frequent basis – in this case, at each annual assessment wave – might also have explanatory value in predicting ideation. For example, we might ask: does year-to-year fluctuation in interpersonal functioning influence suicidal ideation? In this case, interpersonal dysfunction would be an L1 predictor because it was measured at each wave, and varies on a yearly basis.

1.2 Disaggregating within- versus between-cluster effects

Importantly, the association between an L1 predictor and the outcome could reflect either L1 or L2 variability. Perhaps participants report higher ideation because they recently went through a break up and have experienced a lot of interpersonal problems in the last year (L1). But some participants may simply experience more interpersonal dysfunction on average, which may also drive higher average ideation (L2).

Because interpersonal dysfunction is an L1 variable, it may have systematic variation at both the year (L1) and participant (L2) level. In conventional multilevel modeling (MLM), one can disentangle the variance at each level using centering (Curran & Bauer, 2011). The usual approach is to center the predictor (interpersonal dysfunction) around the participant mean such that we now have a variable that represents yearly deviation in interpersonal dysfunction relative to the participant’s average across all waves (this is often called group-mean centering). Then, we also compute the average interpersonal dysfunction for each participant across all waves and include that as a predictor at L2 (this is grand-mean centered). In this way, we can examine whether a participant’s average interpersonal dysfunction predicts their average suicidal ideation (an L2 effect). Likewise, the effect of yearly variation in interpersonal dysfunction that is not explainable by the person’s average levels can also be examined.

Here, we add interpersonal dysfunction at L1 only:

\[ \begin{align*} \textrm{ideation}_{ij} &= \beta_{0j} + \beta_{1j} \textrm{int_dys}_{ij} + r_{ij} \\ \beta_{0j} &= \gamma_{00} + \gamma_{01} \textrm{negative_affect}_j + \mu_{0j} \\ \beta_{1j} &= \gamma_{10} + \mu_{1j} \end{align*} \]

The major difficulty of this representation is that \(\textrm{int_dys}_{ij}\) may have variation both within person (from year to year) and between persons (on average). Thus, the main coefficient of interest for this predictor, \(\gamma_{10}\) (a fixed effect), is an unknown combination of L1 and L2 variability. Here, we add a random slope to allow the association of interpersonal dysfunction with ideation to vary randomly by person, \(\mu_{1j}\).

As mentioned above, to handle the problem of \(\textrm{int_dys}\) potentially having variation at L1 and L2, we apply the L1 centering + L2 means approach.

The notation for the person-centered interpersonal dysfunction is:

\[ \dot{\textrm{int_dys}_{ij}} = \textrm{int_dys}_{ij} - \bar{\textrm{int_dys}_{j}} \]

That is, we subtract the average interpersonal dysfunction for a participant across all waves, \(\bar{\textrm{int_dys}_{j}}\), from the interpersonal dysfunction that they report in each year.
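Assuming a hypothetical long-format data frame `df` with columns `participant` and `int_dys` (one row per person-year), this within/between decomposition can be computed with dplyr:

```r
library(dplyr)

df <- df %>%
  group_by(participant) %>%
  mutate(
    int_dys_mean = mean(int_dys, na.rm = TRUE),  # L2: participant's average across waves
    int_dys_c    = int_dys - int_dys_mean        # L1: yearly deviation from own average
  ) %>%
  ungroup()
```

The person means in `int_dys_mean` can additionally be grand-mean centered before entering the L2 equation, as described for negative affect above.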

We then introduce both of these new variables into the model:

\[ \begin{align*} \textrm{ideation}_{ij} &= \beta_{0j} + \beta_{1j} \dot{\textrm{int_dys}_{ij}} + r_{ij} \\ \beta_{0j} &= \gamma_{00} + \gamma_{01} \textrm{negative_affect}_j + \gamma_{02} \bar{\textrm{int_dys}_{j}} + \mu_{0j} \\ \beta_{1j} &= \gamma_{10} + \mu_{1j} \end{align*} \]

Thus, average interpersonal dysfunction for each person predicts average variation in ideation (a between-person effect). The difference between one’s interpersonal dysfunction in a given year and their average interpersonal dysfunction represents the yearly effect of int_dys on ideation (a within-person effect).

This is perhaps more evident in the combined equation:

\[ \textrm{ideation}_{ij} = ( \gamma_{00} + \gamma_{01} \textrm{negative_affect}_j + \gamma_{02} \bar{\textrm{int_dys}_{j}} + \gamma_{10} \dot{\textrm{int_dys}_{ij}} ) + \\ ( r_{ij} + \mu_{0j} + \mu_{1j} \dot{\textrm{int_dys}_{ij}} ) \] Note how \(\textrm{int_dys}\) enters the model in two ways (year and person level). As in previous examples, the fixed effects are denoted in the first set of parentheses, and the random effects in the second.
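In conventional MLM software, the combined equation maps directly onto a mixed model. A sketch with lme4, assuming a hypothetical data frame `df` containing the outcome and the centered variables `int_dys_c` and `int_dys_mean`:

```r
library(lme4)

# gamma_01: negative_affect; gamma_02: int_dys_mean (between-person effect);
# gamma_10 plus mu_1j: int_dys_c, entered with a random slope by participant
fit <- lmer(
  ideation ~ negative_affect + int_dys_mean + int_dys_c +
    (1 + int_dys_c | participant),
  data = df
)
summary(fit)
```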

2 Multilevel SEM (MSEM) overview

Okay, so we’ve actually seen most of this from previous lectures. Where does SEM enter the picture? In multilevel SEM, we use a latent variable approach to parcellate variation between and within clusters, rather than applying a cluster-based centering approach. This is the intuition:

Latent variables, which are depicted in circles above, are variables for which we have no direct measurement within our dataset. For example, we may measure interpersonal dysfunction at each wave, but the scores on that measure are neither a pure indicator of within-person interpersonal dysfunction, nor a pure indicator of between-person interpersonal dysfunction. Instead, scores are likely a function of both within- and between-person variance (as well as measurement error). By invoking SEM, we are postulating that variance in our observed variable is due to two latent influences: within-person interpersonal dysfunction, and between-person interpersonal dysfunction.

Importantly, this framework also allows for the specification of random slopes such that within-person/cluster variation in the strength of an association between two variables can itself be a latent variable at the between-person level. For example, does the within-person association of interpersonal dysfunction with ideation depend on negative affect? Said differently, does negative affect moderate the relationship between interpersonal dysfunction in a given year and ideation in a given year? This is the multilevel SEM equivalent of cross-level interaction in MLM. Our graphical notation is as follows:

So far, however, the expression of multilevel data in a SEM framework has not revealed any new capabilities beyond standard MLM. Why would one consider multilevel SEM? A few reasons:

  1. As in single-level regression, classic MLM assumes that all predictors are measured without error, which can result in estimation biases (Cole & Preacher, 2014; Lüdtke et al., 2008; Westfall & Yarkoni, 2016). Moving to MSEM allows us to specify latent variables that reflect the shared variance among several indicators of our constructs – at either level of the model.
  2. Classic MLMs have only a single outcome, and variables must be designated exclusively as outcomes or predictors. This makes it tricky to look at things like mediation, where the effect of one variable on another is transmitted through a third variable that serves as both predictor and outcome. For an excellent introduction to the value of SEM for testing multilevel mediation, see Preacher, Zyphur, and Zhang (2010), Psychological Methods.
  3. The standard MLM approach of cluster-mean centering introduces two potentially important biases. Nickell’s bias: in time-series data, observed-mean centering can negatively bias lagged effects when few time points are present (e.g., imagine cluster-mean centering with just 2 time points; the result would be one positive and one negative value, and these would be negatively correlated; latent centering avoids this bias). Lüdtke’s bias: MLMs typically assume observed L2 variables are based on aggregated L1 variables. If the number of L1 observations is low, though, the L2 aggregate may be unreliable, biasing its estimated effect. In short, the latent within/between decomposition in MSEM tends to provide better estimates than the centering method in MLM.
  4. As we’ll see below, MSEM also allows us to test hypotheses involving latent variables, which are hard to implement in standard MLM.
  5. Finally, on a more practical level, some people (especially those with a background in SEM) may find it easier to conceptualize multilevel data structures and models within the SEM framework, where we are used to path diagrams.
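The two-time-point case of Nickell’s bias in point 3 is easy to verify by simulation: with two observations per cluster, the cluster-mean-centered values are exact mirror images of one another, so their correlation is -1 by construction:

```r
set.seed(1)
x <- matrix(rnorm(200), nrow = 100, ncol = 2)  # 100 clusters, 2 time points each
dev <- x - rowMeans(x)                          # cluster-mean centering
cor(dev[, 1], dev[, 2])                         # -1: deviations are perfect mirror images
```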

One important thing to note is that while we have the option of using latent decomposition to partition variance in MSEM, we do not have to use it – we can still employ a conventional centering strategy if we choose (in which case, the presence of latent variables at either level is the only thing that makes the model an MSEM).

2.1 A bit about SEM generally

At this point, we have made the case that there are some limitations to classic MLM that could be overcome if we use a latent variable approach to parcellating variance at the between and within levels. But what are latent variables exactly? Bollen (2002) defines a latent variable as a variable “for which there is no sample realization for at least some observations in a given sample.” In other words, a latent variable is any variable for which we have no direct measurement. You can see why this is the case for variables in multilevel data. Returning to our example, we have no direct way of knowing how much of the variance in the interpersonal dysfunction or suicidal ideation scores is due to “within-person” (year-to-year) variability and how much is due to “between-person” variability (individual differences). Since we have no direct measurement of these quantities, we can treat them as latent variables and estimate them using maximum likelihood or some other estimator (e.g., Bayesian estimation).

Moving to an SEM framework carries other advantages as well, like allowing us to model latent variables within each level to capture patterns of covariation in our observed indicators. In the example above, if we had several measures of interpersonal dysfunction, we could estimate latent between- and within-person variance components, and further use those latent components to model a latent interpersonal dysfunction variable at each level (or perhaps at one level but not the other; it’s up to you!). Here’s what this would look like:
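Path diagram aside, the corresponding lavaan syntax can be sketched as follows, again using the variable names from lavaan's built-in `Demo.twolevel` data, where `y1`-`y3` stand in for several indicators of a single construct:

```r
library(lavaan)

model <- '
  level: 1
    # latent within-cluster factor measured by the indicators
    fw =~ y1 + y2 + y3
    fw ~ x1
  level: 2
    # latent between-cluster factor for the same indicators
    fb =~ y1 + y2 + y3
    fb ~ w1
'
fit <- sem(model, data = Demo.twolevel, cluster = "cluster")
summary(fit, fit.measures = TRUE)
```

Note that the same observed indicators appear at both levels; their variance is decomposed latently rather than by observed centering.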