In this model we again consider k independent variables x1, …, xk and observed data for each of these variables. Our objective is to identify m factors y1, …, ym, with m ≤ k and preferably much smaller, which explain the observed data more succinctly.
Definition 1: Let X = [xi] be a random k × 1 column vector where each xi represents an observable trait, and let μ = [μi] be the k × 1 column vector of the population means, so that E[xi] = μi. Let Y = [yi] be an m × 1 vector of unobserved common factors where m ≤ k. These factors play a role similar to the principal components in Principal Component Analysis.
We next suppose that each xi can be represented as a linear combination of the factors as follows:

$$x_i = \beta_{i0} + \beta_{i1} y_1 + \beta_{i2} y_2 + \cdots + \beta_{im} y_m + \varepsilon_i \qquad i = 1, \ldots, k$$

where the εi are the components that are not explained by the linear relationship. We further assume that the mean of each εi is 0 and that the factors are independent with mean 0 and variance 1. We can consider the above equations to be a series of regression equations.
The coefficient βij is called the loading of the ith variable on the jth factor, and the term εi is called the specific factor for the ith variable. Let β = [βij] be the k × m matrix of factor loadings and let ε = [εi] be the k × 1 column vector of specific factors.
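To make the model concrete, here is a minimal simulation sketch in Python/NumPy; the means, loadings, and specific-factor variances below are illustrative values invented for the example, not quantities taken from the text.

```python
import numpy as np

rng = np.random.default_rng(0)

k, m, n = 4, 2, 100_000                 # variables, factors, sample size (illustrative)
mu = np.array([10.0, 5.0, 3.0, 8.0])    # hypothetical means mu_i
beta = np.array([[0.9, 0.1],            # hypothetical k x m loading matrix [beta_ij]
                 [0.8, 0.3],
                 [0.2, 0.7],
                 [0.1, 0.8]])
psi = np.array([0.30, 0.20, 0.40, 0.25])  # hypothetical variances of the specific factors

y = rng.standard_normal((n, m))                    # factors: independent, mean 0, variance 1
eps = rng.standard_normal((n, k)) * np.sqrt(psi)   # specific factors: mean 0, variance psi_i

# x_i = beta_i0 + sum_j beta_ij y_j + eps_i, with the intercept beta_i0 = mu_i
x = mu + y @ beta.T + eps
print(x.mean(axis=0))                              # approximately mu
```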
Define the communality of variable xi to be $\varphi_i = \sum_{j=1}^m \beta_{ij}^2$, and let ψi = var(εi) (the specific variance) and σi² = var(xi).
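Continuing the sketch above, the communality of each variable is just the row sum of squared loadings:

```python
phi = (beta ** 2).sum(axis=1)   # communality phi_i = sum_j beta_ij^2
print(phi)                      # e.g. phi_1 = 0.9**2 + 0.1**2 = 0.82
print(x.var(axis=0, ddof=1))    # approximately phi_i + psi_i, as derived below
```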
Observation: Since

$$\mu_i = E[x_i] = E\Big[\beta_{i0} + \sum_{j=1}^m \beta_{ij} y_j + \varepsilon_i\Big] = \beta_{i0} + \sum_{j=1}^m \beta_{ij} E[y_j] + E[\varepsilon_i] = \beta_{i0} + 0 + 0 = \beta_{i0}$$

it follows that the intercept term βi0 = μi, and so the regression equations can be expressed as

$$x_i - \mu_i = \sum_{j=1}^m \beta_{ij} y_j + \varepsilon_i \qquad i = 1, \ldots, k$$
From the assumptions stated above it also follows that:
E[xi] = μi for all i
E[yj] = 0 and var(yj) = 1 for all j (the common factors are standardized)
E[εi] = 0 for all i (the specific factors are presumed to be random with mean 0)
cov(yi, yj) = 0 if i ≠ j
cov(εi, εj) = 0 if i ≠ j
cov(yi, εj) = 0 for all i, j
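These moment conditions can be checked empirically on the simulated factors and specific factors from the sketch above (they hold up to sampling noise):

```python
print(np.allclose(np.cov(y, rowvar=False), np.eye(m), atol=0.02))  # cov(y_i, y_j) ~ 0, var(y_j) ~ 1
print(np.abs(eps.mean(axis=0)).max())                              # E[eps_i] ~ 0
print(np.abs(y.T @ eps / (n - 1)).max())                           # cov(y_i, eps_j) ~ 0
```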
From these equivalences it follows that the population covariance matrix Σ for X has the form

$$\Sigma = \beta \beta^T + \Psi$$

where Ψ is the k × k diagonal matrix with ψi in the ith position on the diagonal.
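Continuing the sketch, this covariance structure can be verified numerically; Sigma here is built from the illustrative beta and psi above and compared against the empirical covariance of the simulated data:

```python
Sigma = beta @ beta.T + np.diag(psi)           # population covariance implied by the model
S_emp = np.cov(x, rowvar=False)                # empirical covariance of the simulated sample
print(np.abs(Sigma - S_emp).max())             # small, and shrinking as n grows
print(np.allclose(np.diag(Sigma), phi + psi))  # var(x_i) = communality + specific variance
```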
Observation: Let λ1 ≥ … ≥ λk be the eigenvalues of Σ with corresponding unit eigenvectors γ1, …, γk, where each eigenvector γj = [γij] is a k × 1 column vector whose ith entry is γij. Now define the k × k matrix β = [βij] such that $\beta_{ij} = \sqrt{\lambda_j}\,\gamma_{ij}$ for all 1 ≤ i, j ≤ k. As observed in Linear Algebra Background, all the eigenvalues of Σ are non-negative, and so the βij are well defined (see Property 8 of Positive Definite Matrices). By Theorem 1 of Linear Algebra Background (Spectral Decomposition Theorem), it follows that

$$\Sigma = \beta \beta^T$$
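A sketch of this construction with NumPy's symmetric eigensolver; note that np.linalg.eigh returns eigenvalues in ascending order, so they are reversed to match the convention λ1 ≥ … ≥ λk:

```python
lam, gamma = np.linalg.eigh(Sigma)         # unit eigenvectors in the columns of gamma
lam, gamma = lam[::-1], gamma[:, ::-1]     # reorder so lam[0] >= ... >= lam[k-1]
B = gamma * np.sqrt(lam)                   # B[i, j] = sqrt(lam_j) * gamma_ij
print(np.allclose(B @ B.T, Sigma))         # spectral decomposition: Sigma = B B^T
```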
As usual, we will approximate the population covariance matrix Σ by the sample covariance matrix S (for a given random sample). Using the above logic, it follows that

$$S = LL^T$$

where λ1 ≥ … ≥ λk are the eigenvalues of S (a slight abuse of notation since these are not the same as the eigenvalues of Σ) with corresponding unit eigenvectors C1, …, Ck, where Cj = [cij], and L = [bij] is the k × k matrix such that $b_{ij} = \sqrt{\lambda_j}\,c_{ij}$.
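The same construction applied to the sample covariance matrix of the simulated data:

```python
S = np.cov(x, rowvar=False)                # sample covariance matrix
lam_s, C = np.linalg.eigh(S)
lam_s, C = lam_s[::-1], C[:, ::-1]         # descending eigenvalues of S
L = C * np.sqrt(lam_s)                     # b_ij = sqrt(lam_j) * c_ij
print(np.allclose(L @ L.T, S))             # S = L L^T
```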
As we saw previously (reading off the diagonal and off-diagonal entries of Σ = ββᵀ + Ψ),

$$\mathrm{var}(x_i) = \sum_{j=1}^m \beta_{ij}^2 + \psi_i = \varphi_i + \psi_i \qquad \mathrm{cov}(x_i, x_h) = \sum_{j=1}^m \beta_{ij}\beta_{hj} \ \ (i \neq h)$$

The sample versions of these are

$$s_i^2 = \sum_{j=1}^m b_{ij}^2 + \hat{\psi}_i \qquad s_{ih} \approx \sum_{j=1}^m b_{ij} b_{hj} \ \ (i \neq h)$$

where $\hat{\varphi}_i = \sum_{j=1}^m b_{ij}^2$ estimates the communality of xi and $\hat{\psi}_i = s_i^2 - \hat{\varphi}_i$ estimates its specific variance.
We have also seen previously that

$$\mathrm{cov}(x_i, y_j) = \beta_{ij}$$

since $\mathrm{cov}(x_i, y_j) = \sum_{p=1}^m \beta_{ip}\,\mathrm{cov}(y_p, y_j) + \mathrm{cov}(\varepsilon_i, y_j) = \beta_{ij}\,\mathrm{var}(y_j) = \beta_{ij}$. The sample version is therefore

$$\mathrm{cov}(x_i, y_j) \approx b_{ij}$$
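Keeping only the first m columns of L as estimated loadings, the sample quantities above can be computed directly; the covariance of the simulated data with the true factors also recovers the loadings, illustrating cov(xi, yj) = βij:

```python
B_hat = L[:, :m]                       # estimated loadings: first m columns of L
phi_hat = (B_hat ** 2).sum(axis=1)     # estimated communalities sum_j b_ij^2
psi_hat = np.diag(S) - phi_hat         # estimated specific variances s_i^2 - phi_hat_i
print(phi_hat, psi_hat)                # roughly comparable to phi and psi above

# Empirical covariance of the data with the (normally unobserved) true factors:
cov_xy = (x - x.mean(axis=0)).T @ (y - y.mean(axis=0)) / (n - 1)
print(np.abs(cov_xy - beta).max())     # small: cov(x_i, y_j) ~ beta_ij
```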