# Factor Extraction

A number of methods are available to determine the factor loadings used for factor analysis. We will start by explaining the principal component method. Another commonly used method, the principal axis method, is presented in Principal Axis Method of Factor Extraction.

Using the concepts described in Basic Concepts of Factor Analysis, we show how to carry out factor analysis via the following example.

Example 1: Carry out the factor analysis for evaluating great teachers based on the data in Example 1 of Principal Component Analysis.

As we saw in Example 1 of Principal Component Analysis, nine criteria are measured. Our objective is to find a set of fewer than nine factors that reasonably captures what makes a great teacher. In fact, we hope to find substantially fewer than nine factors that do the job.

Figure 1 shows the correlation matrix for this data (repeated from Figure 4 of Principal Component Analysis).

Figure 1 – Correlation Matrix

Figure 2 shows the table of eigenvalues and eigenvectors for the correlation matrix (repeated from Figure 5 of Principal Component Analysis), calculated using the supplemental function eVECTORS(B6:J14).

Figure 2 – Eigenvalues and eigenvectors

Using the formula $b_{ij} = \sqrt{\lambda_j}\, c_{ij}$, where $C_1, \ldots, C_k$ are the eigenvectors (range B19:J27 in Figure 2) corresponding to the eigenvalues (range B18:J18 in Figure 2) $\lambda_1 \ge \cdots \ge \lambda_k$ and $c_{ij}$ is the $i$th element of $C_j$, we calculate the factor loadings for the nine common factors (see Figure 3).
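As a rough illustration outside Excel, the scaling of eigenvectors into loadings can be sketched in Python with NumPy. The 3 × 3 correlation matrix below is a made-up stand-in (the teacher data is not reproduced here), and `np.linalg.eigh` is used in place of eVECTORS, so the signs of individual columns may differ from those in Figure 2.

```python
import numpy as np

# Hypothetical 3x3 correlation matrix (illustrative only, not the teacher data).
R = np.array([
    [1.0, 0.6, 0.3],
    [0.6, 1.0, 0.4],
    [0.3, 0.4, 1.0],
])

# eigh is intended for symmetric matrices; it returns eigenvalues in ascending order.
eigenvalues, eigenvectors = np.linalg.eigh(R)

# Reorder so that lambda_1 >= ... >= lambda_k, matching the text.
order = np.argsort(eigenvalues)[::-1]
eigenvalues = eigenvalues[order]
eigenvectors = eigenvectors[:, order]

# b_ij = sqrt(lambda_j) * c_ij: broadcasting scales column j by sqrt(lambda_j).
loadings = eigenvectors * np.sqrt(eigenvalues)

print(np.round(loadings, 4))
```

With all factors retained, multiplying the loading matrix by its transpose should reproduce the correlation matrix, which is a convenient sanity check on the calculation.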

For example, the factor loading of the Passion variable on Factor 1 (cell B38) is given by the formula =B26*SQRT(B\$18). Figure 3 also contains the communalities (range K31:K39). The communality of each variable represents the portion of that variable’s variance captured by the model. For variable $x_i$ this is $\sum_{j=1}^k b_{ij}^2$. E.g., the communality of the Passion variable (cell K38) is calculated via the formula =SUMSQ(B38:J38). Since we are using the full model (where all nine common factors are present) and the variance of each variable is 1 (remember that we standardized the data), it is not surprising that column K contains all ones.

In fact, if we had used the eigenvalues and eigenvectors exactly as calculated in Figure 2, we would have seen communalities that are close to 1, but not exactly 1. To get the communalities to come out to 1, we reran the eigenvector function as eVECTORS(B6:J14, 200), using 200 iterations (instead of the default of 100) to obtain more accurate eigenvalues and eigenvectors.
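The communality calculation can be checked with a short Python sketch using NumPy. The correlation matrix is hypothetical (not the teacher data), and `m` is an illustrative choice of how many factors to retain. With all factors kept, the row sums of squared loadings should come out to 1 up to floating-point error; with fewer factors they should fall below 1.

```python
import numpy as np

# Hypothetical 3x3 correlation matrix (illustrative only, not the teacher data).
R = np.array([
    [1.0, 0.6, 0.3],
    [0.6, 1.0, 0.4],
    [0.3, 0.4, 1.0],
])

# Eigen-decomposition, sorted so the largest eigenvalue comes first.
eigenvalues, eigenvectors = np.linalg.eigh(R)
order = np.argsort(eigenvalues)[::-1]
loadings = eigenvectors[:, order] * np.sqrt(eigenvalues[order])

# Communality of variable i = sum over factors j of b_ij^2 (row sum of squares).
full_communalities = (loadings ** 2).sum(axis=1)

# Keeping only the first m factors (m is an assumed, illustrative value).
m = 1
reduced_communalities = (loadings[:, :m] ** 2).sum(axis=1)

print(np.round(full_communalities, 6))
print(np.round(reduced_communalities, 6))
```

This mirrors the SUMSQ calculation in column K: the precision of the eigenvalues and eigenvectors determines how close the full-model communalities come to exactly 1.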

### 11 Responses to Factor Extraction

Hi Charles,
Regarding the first question, how to convert the original data value of X into a value of the factor Z. I understand from the tutorial, that x can be represented as a linear combination of Z but given X, how to know Z in order to proceed with the regression?
Thanks.

2. Lata Sujata says:

• Charles says:

Lata,
You use the factor loadings to convert your original data into data about the factors (i.e. the hidden variables). Then you perform regression on the data about the factor. This assumes that the y value is not part of your factor analysis.
So if you had 100 samples about the vector X = (x1, …, x20) and then used factor analysis to find factors Z = (z1, z2, z3) you would perform regression using the data (z11, z21, z31, y1), …, (z1H, z2H, z3H, yH). Here H simply means 100.
Charles

• Joey Summer says:

Hi Charles,
Sorry but can you explain this answer again.
For my example I have X1, X2 and Y series.
I have the Factor Load matrix via PCA (2×2 matrix)
How do I convert X1 and X2 to Z1 and Z2 to then perform my multiple regression with Y?
Is it simply a matrix multiplication of (100 x 2) x (2 × 2) to get my new Z1 and Z2 series?

3. rohit khamkar says:

• Charles says:

Rohit,
Calculation of the factor loadings is part of a process that identifies hidden factors and how to interpret the original variables in terms of the hidden factors.
Charles

4. ighofose akpomejevwe says:

What is the usefulness of communalities in factor analysis?

• Charles says:

The communality of each variable represents the portion of that variable’s variance captured by the model.
Charles

5. Ngerem Thomas Chinedu says:

thanks for the explanation on how to use the tool.
from
Thomas