# 1 Matrix Algebra

In multiple regression, we are used to seeing a single regression equation such as:

$Y_i = \beta_0 + \beta_1 V_i + \beta_2 W_i + \beta_3 X_i + \varepsilon_i$

where $$i$$ denotes the $$i$$th individual/observation. The goal of this tutorial is to review how such a regression equation can be represented by a set of matrices and vectors:

$\mathbf{Y} = \begin{bmatrix} Y_1 \\ Y_2 \\ \vdots \\ Y_n \end{bmatrix} = \begin{bmatrix} \beta_0 + \beta_1 V_1 + \beta_2 W_1 + \beta_3 X_1 \\ \beta_0 + \beta_1 V_2 + \beta_2 W_2 + \beta_3 X_2 \\ \vdots \\ \beta_0 + \beta_1 V_n + \beta_2 W_n + \beta_3 X_n \\ \end{bmatrix} \enspace + \begin{bmatrix} \varepsilon_1 \\ \varepsilon_2 \\ \vdots \\ \varepsilon_n \end{bmatrix}$

If we factor out the parameters ($$\boldsymbol{\beta}$$) from the data, we can see the difference between the model matrix and the parameter estimates:

$\mathbf{Y} = \begin{bmatrix} Y_1 \\ Y_2 \\ \vdots \\ Y_n \end{bmatrix} = \begin{bmatrix} 1 & V_1 & W_1 & X_1 \\ 1 & V_2 & W_2 & X_2 \\ \vdots & \vdots & \vdots & \vdots \\ 1 & V_n & W_n & X_n \\ \end{bmatrix} \enspace \begin{bmatrix} \beta_0 \\ \beta_1 \\ \beta_2 \\ \beta_3 \end{bmatrix} + \begin{bmatrix} \varepsilon_1 \\ \varepsilon_2 \\ \vdots \\ \varepsilon_n \end{bmatrix}$
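We can verify this decomposition in R. The sketch below simulates data with arbitrary coefficients (the variable names and values are illustrative, not from the tutorial) and checks that the fitted values from `lm()` equal the model matrix post-multiplied by the parameter estimates:

```r
set.seed(1)
n <- 10
V <- rnorm(n); W <- rnorm(n); X <- rnorm(n)
Y <- 2 + 0.5*V - 1*W + 0.3*X + rnorm(n)   # arbitrary simulated coefficients

fit  <- lm(Y ~ V + W + X)
Xmat <- model.matrix(fit)   # n x 4 model matrix: intercept, V, W, X
beta <- coef(fit)           # vector of 4 parameter estimates

# Fitted values are just the matrix product of model matrix and estimates
all.equal(as.vector(Xmat %*% beta), as.vector(fitted(fit)))
## [1] TRUE
```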

# 2 Types of matrices

Remember that matrices are defined by rows (the first dimension) and columns (the second dimension):

$\underset{m \times n}{\mathbf{A}} = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \\ a_{41} & a_{42} & a_{43} \end{bmatrix}$

And a position in the matrix is specified by subscripting according to the row and column: $$a_{11}$$.

## 2.1 Square

A square matrix has the same number of rows and columns. Covariance matrices are always square.

$\underset{n \times n}{\mathbf{A}} = \begin{bmatrix} a_{11} & a_{12} & a_{13} & a_{14} \\ a_{21} & a_{22} & a_{23} & a_{24} \\ a_{31} & a_{32} & a_{33} & a_{34} \\ a_{41} & a_{42} & a_{43} & a_{44} \end{bmatrix}$

## 2.2 Symmetric

A symmetric matrix is a square matrix that is identical when transposed. That is, flipping the rows and columns has no effect. Another way to think of it is that the off-diagonal structure (upper triangle and lower triangle) is identical.

\begin{align} \underset{n \times n}{\mathbf{A}} &= \begin{bmatrix} a & ab & ac & ad \\ ab & b & bc & bd \\ ac & bc & c & cd \\ ad & bd & cd & d \end{bmatrix} \\ \cr \mathbf{A} &= \mathbf{A}' \end{align}

This is pretty close to the structure we’ll see in much of the class – with $$ab$$ representing some function of both $$a$$ and $$b$$ (e.g., covariance).
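In R, we can check symmetry with `isSymmetric()` or by comparing a matrix to its transpose (the entries below are arbitrary, chosen only to make the matrix symmetric):

```r
A <- matrix(c(1, 2, 3,
              2, 5, 6,
              3, 6, 9), nrow = 3, byrow = TRUE)

isSymmetric(A)   # TRUE: A equals its transpose
all(A == t(A))   # equivalent elementwise check
```

Covariance matrices produced by `cov()` pass this check by construction.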

## 2.3 Diagonal

A diagonal matrix is a special case of a square symmetric matrix in which there are values along the diagonal, but zeros elsewhere:

\begin{align} \underset{n \times n}{\mathbf{A}} &= \begin{bmatrix} a & 0 & 0 & 0 \\ 0 & b & 0 & 0 \\ 0 & 0 & c & 0 \\ 0 & 0 & 0 & d \end{bmatrix} \\ \cr \mathbf{A} &= \mathbf{A}' \end{align}
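In R, `diag()` builds such a matrix directly from a vector of diagonal values (the values here are arbitrary):

```r
A <- diag(c(2, 5, 7, 9))   # 4 x 4 diagonal matrix
A
##      [,1] [,2] [,3] [,4]
## [1,]    2    0    0    0
## [2,]    0    5    0    0
## [3,]    0    0    7    0
## [4,]    0    0    0    9
```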

### 2.3.1 Matrix trace

The trace of a square matrix is the sum of elements along the diagonal:

$tr(\mathbf{A}) = a + b + c + d$

Or more generally, if the matrix is $$n \times n$$:

$tr(\mathbf{A}) = \sum_{i=1}^{n}{a_{ii}} = a_{11} + a_{22} + ... + a_{nn}$
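In R, the trace is the sum of the extracted diagonal (note that `diag()` extracts the diagonal when given a matrix, rather than constructing one):

```r
A <- matrix(1:16, nrow = 4)   # 4 x 4 matrix filled column-wise with 1..16
sum(diag(A))                  # trace: 1 + 6 + 11 + 16
## [1] 34
```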

## 2.4 Identity

An identity matrix is a special case of a diagonal matrix in which the elements of the diagonal are all 1:

$\underset{n \times n}{\mathbf{I}} = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}$

Why would this be useful? Mostly it helps make matrix multiplication work. For now, just remember that any matrix multiplied by an identity matrix is unchanged, just like multiplying a number by 1.

Here’s a square matrix:

```r
A <- matrix(rnorm(25), nrow=5, ncol=5)
print(A)
##         [,1]    [,2]   [,3]   [,4]    [,5]
## [1,] -0.6208 -0.6728  0.655  1.458  0.3162
## [2,] -0.0968 -0.0921  1.111 -1.590  0.0229
## [3,]  0.8798  0.8030  0.898 -0.529 -1.0620
## [4,] -1.1098 -1.2248 -0.805 -0.277  0.9781
## [5,] -0.9037  1.0291 -1.073  0.337 -1.0197
```

And now multiplied by $$\mathbf{I}$$:

```r
A %*% diag(5)
##         [,1]    [,2]   [,3]   [,4]    [,5]
## [1,] -0.6208 -0.6728  0.655  1.458  0.3162
## [2,] -0.0968 -0.0921  1.111 -1.590  0.0229
## [3,]  0.8798  0.8030  0.898 -0.529 -1.0620
## [4,] -1.1098 -1.2248 -0.805 -0.277  0.9781
## [5,] -0.9037  1.0291 -1.073  0.337 -1.0197
```

# 3 Matrix addition and subtraction

Matrix addition and subtraction are straightforward. These operations are applied elementwise:

$\mathbf{A} = \begin{bmatrix} 10 & 5 \\ 9 & 1 \end{bmatrix} , \enspace \mathbf{B} = \begin{bmatrix} 2 & 1 \\ 20 & 0 \end{bmatrix}, \enspace \textrm{then } \mathbf{A}-\mathbf{B}= \begin{bmatrix} 8 & 4 \\ -11 & 1 \end{bmatrix}$

Note that matrices must be of the same dimension (i.e., number of rows and columns) to be subtracted or added.
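The subtraction above can be reproduced in R (recall that `matrix()` fills column-wise by default):

```r
A <- matrix(c(10, 9, 5, 1), nrow = 2)   # columns: (10, 9) and (5, 1)
B <- matrix(c(2, 20, 1, 0), nrow = 2)

A - B   # elementwise subtraction
##      [,1] [,2]
## [1,]    8    4
## [2,]  -11    1
```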

# 4 Matrix multiplication

Multiplication is more complex.

## 4.1 Multiplication of a matrix by a scalar value

To multiply a matrix $$\mathbf{A}$$ by a (scalar) constant $$k$$, one simply multiplies all elements of $$\mathbf{A}$$ by $$k$$:

$\mathbf{A} = \begin{bmatrix} 10 & 5 \\ 9 & 1 \end{bmatrix}, \enspace k=2, \enspace k\mathbf{A} = \begin{bmatrix} 20 & 10 \\ 18 & 2 \end{bmatrix}$
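In R, the ordinary `*` operator applies the scalar to every element:

```r
A <- matrix(c(10, 9, 5, 1), nrow = 2)
k <- 2

k * A   # elementwise scalar multiplication
##      [,1] [,2]
## [1,]   20   10
## [2,]   18    2
```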

## 4.2 Multiplication of a matrix by another matrix

Multiplication is a more complex operation when both objects are matrices. First, the order matters, such that $$\mathbf{AB}$$ is not (usually) the same as $$\mathbf{BA}$$. This gives rise to the terms ‘pre-multiplication’ and ‘post-multiplication’, though we don’t need those much in SEM. Second, if we are computing $$\mathbf{C} = \mathbf{AB}$$, then the number of columns in $$\mathbf{A}$$ must match the number of rows in $$\mathbf{B}$$:

$\underset{n \times k}{\mathbf{C}} = \underset{n \times p}{\mathbf{A}} \cdot \underset{p \times k}{\mathbf{B}}$

Thus, the resulting matrix $$\mathbf{C}$$ has the number of rows of $$\mathbf{A}$$ and the number of columns of $$\mathbf{B}$$. Matrices that can be multiplied are called ‘compatible’ or ‘conformable.’ Matrices in which the inner dimensions (i.e., columns of $$\mathbf{A}$$, rows of $$\mathbf{B}$$) do not match are called ‘incompatible’ or ‘non-conformable.’ These cannot be multiplied.

How does matrix multiplication work? One multiplies the elements of the $$i$$th row of $$\mathbf{A}$$ by the elements of the $$j$$th column of $$\mathbf{B}$$, then sums up these values into the $$i$$th row and $$j$$th column of $$\mathbf{C}$$. Like so:

$c_{ij} = \sum_{k=1}^{p} a_{ik} b_{kj}$
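In R, matrix multiplication uses the `%*%` operator (the matrices below are small arbitrary examples; note that `*` alone would instead attempt elementwise multiplication):

```r
A <- matrix(1:6, nrow = 2)   # 2 x 3
B <- matrix(1:6, nrow = 3)   # 3 x 2

C <- A %*% B                 # inner dimensions (3) match, so C is 2 x 2
C
##      [,1] [,2]
## [1,]   22   49
## [2,]   28   64

# The formula above, spelled out for c_11: row 1 of A times column 1 of B
sum(A[1, ] * B[, 1])
## [1] 22
```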