The use of a ‘hat’ over a variable denotes an estimator of a quantity. So, \(\hat{Y}\) indicates the predicted values of \(Y\) from a statistical model.
Sample means are denoted \(\bar{x}\).
Sample variances are \(s^2_x\) and standard deviations are \(s_x\).
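To connect this notation to R, here is a minimal sketch using base R only. The vectors x and y are made-up toy values (an assumption for illustration, not part of the Boston data), and the simple regression is just one way to produce predicted values.

```r
# Hypothetical toy data, for illustration only
x <- c(2, 4, 6, 8, 10)
y <- c(1, 3, 2, 5, 4)

x_bar <- mean(x)  # sample mean (x-bar)
s2_x  <- var(x)   # sample variance (uses the n - 1 denominator)
s_x   <- sd(x)    # sample standard deviation, the square root of var(x)

# Y-hat: predicted values of y from a simple linear regression on x
fit   <- lm(y ~ x)
y_hat <- fitted(fit)
```

Note that var() and sd() divide by n - 1 rather than n, which matches the usual definition of the sample variance.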
Matrices are denoted by boldface capital letters, \(\mathbf{X}\).
Vectors are denoted by boldface lowercase letters, \(\mathbf{x}\).
The transpose of a matrix flips it along its main diagonal, so that rows become columns and columns become rows. It is denoted with the prime operator, \(\mathbf{X}'\), and is computed with the t() function in R.
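As a quick check of what transposing does, the sketch below builds a small made-up matrix (the values are arbitrary and only for illustration) and compares its dimensions before and after t().

```r
# Hypothetical 2 x 3 matrix; R fills matrices column by column
X <- matrix(1:6, nrow = 2, ncol = 3)

Xt <- t(X)  # transpose: rows become columns

dim(X)   # 2 rows, 3 columns
dim(Xt)  # 3 rows, 2 columns

# The element in row 1, column 2 of X sits in row 2, column 1 of the transpose
X[1, 2] == Xt[2, 1]
```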
Matrices are composed of rows and columns (we won’t deal with arrays of 3+ dimensions). Rows are the first dimension, columns are the second. So, a matrix composed of \(n\) observations and \(k\) variables is denoted:
\[\underset{n \times k}{\mathbf{X}}\]
In R, the element in row i and column j of matrix X is selected with X[i,j].
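Here is a small sketch of the rows-first, columns-second convention. The matrix is made up for illustration; n and k here are arbitrary values, not the dimensions of any real dataset.

```r
# Hypothetical matrix with n = 4 observations (rows) and k = 3 variables (columns)
n <- 4
k <- 3
X <- matrix(seq_len(n * k), nrow = n, ncol = k)

dim(X)        # first element is the number of rows, second the number of columns

x_23 <- X[2, 3]  # single element: row 2, column 3
row2 <- X[2, ]   # leaving the column index blank returns the entire row
col3 <- X[, 3]   # leaving the row index blank returns the entire column
```

Leaving an index blank selects everything along that dimension, which is how whole observations (rows) or whole variables (columns) are pulled out.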
A row vector with \(m\) elements is written:
\[ \mathbf{x} = \begin{bmatrix} x_1 & x_2 & \cdots & x_m \end{bmatrix} \]
A column vector with \(m\) elements is written:
\[ \mathbf{x} = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_m \end{bmatrix} \]
Transposing a column vector gives a row vector:
\[ \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_m \end{bmatrix}' = \begin{bmatrix} x_1 & x_2 & \cdots & x_m \end{bmatrix} \]
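One wrinkle in R is that a plain vector created with c() has no dimensions at all, so it is neither a row nor a column until it is turned into a matrix. The sketch below uses made-up values to show one way to get explicit row and column vectors.

```r
# Hypothetical values, for illustration only
x <- c(3, 1, 4)               # plain vector: dim(x) is NULL

x_col <- matrix(x, ncol = 1)  # explicit column vector, 3 x 1
x_row <- t(x_col)             # its transpose is a row vector, 1 x 3

dim(x_col)
dim(x_row)

# t() treats a plain vector as a column vector, so t(x) is also 1 x 3
dim(t(x))
```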
I will probably screw these up along the way, but I wanted to mention them up front to be on the same page!
We’ll use housing price data from Boston’s 1970 census to review important concepts in correlation and regression. This is a nice dataset for regression because there are many interdependent variables: crime, pollutants, age of properties, etc.
#example dataset from mlbench package with home prices in Boston by census tract
library(mlbench) #provides BostonHousing2
library(dplyr) #provides %>% and select()
library(lattice) #provides splom()
data(BostonHousing2)
BostonSmall <- BostonHousing2 %>% dplyr::select(
  cmedv, #corrected median value of owner-occupied homes in USD 1000s
  crim, #per capita crime rate by town
  nox, #nitric oxides concentration (parts per 10 million)
  lstat #percentage of lower status of the population
)
n <- nrow(BostonSmall) #number of observations
k <- ncol(BostonSmall) #number of variables
#scatterplot matrix of variables
splom(BostonSmall)