Definition 1: Given a square matrix A, an eigenvalue is a scalar λ such that det (A – λI) = 0, where A is a k × k matrix and I is the k × k identity matrix. The eigenvalue with the largest absolute value is called the dominant eigenvalue.
Observation: det (A – λI) = 0 expands into an kth degree polynomial equation in the unknown λ called the characteristic equation. The polynomial itself is called the characteristic polynomial. The solutions to the characteristic equation are the eigenvalues. Since, based on the fundamental theorem of algebra, any kth degree polynomial p(x) has n roots (i.e. solutions to the equation p(x) = 0), we conclude that any k × k matrix has k eigenvalues.
Example 1: Find the eigenvalues for matrix A
This is the characteristic equation. Solving for λ, we have the eigenvalues λ = 3 and λ = 14.
Observation: Let A = . Then
Now let λ1 and λ2 be the eigenvalues. Then (λ – λ1)(λ – λ2)=0, and so λ2 – (λ1 + λ2)λ+ λ1 λ2, and so λ1 + λ2 = trace A and λ1 λ2 = det A.
This is consistent with the eigenvalues from Example 1.
In fact, these properties are true in general, not just for 2 × 2 matrices.
- The product of all the eigenvalues of A = det A
- The sum of all the eigenvalues of A = trace A
- A square matrix is invertible if and only if it none of its eigenvalues is zero.
- The eigenvalues of an upper triangular matrix (including a diagonal matrix) are the entries on the main diagonal
a) By definition, each eigenvalue is a root of the characteristic equation det(A – λI) = 0. By Definition 1 of Determinants and Simultaneous Linear Equations, det(A – λI) can be expressed as a kth degree polynomial in the form where ak = (-1)k. Thus by the fundamental theorem of algebra, det(A – λI) = where λ1, …, λk are the eigenvalues of A (here we treat the λk as constants and λ as a variable). Setting λ = 0 yields
c) Results from (a) and Property 4 of Determinants and Simultaneous Linear Equations.
d) If A is an upper triangular matrix, then so is A – λI. The result now follows from Definition 1 of Determinants and Simultaneous Linear Equations..
Definition 2: If λ is an eigenvalue of the k × k matrix A, then a non-zero k × 1 matrix X is an eigenvector which corresponds to λ provided (A – λI)X = 0, where 0 is the k × k null matrix (i.e. 0’s in all positions).
Property 2: Every eigenvalue of a square matrix has an infinite number of corresponding eigenvectors.
Proof: Let λ be an eigenvalue of a k × k matrix A. Thus by Definition 1, det (A – λI) = 0, and so by Property 3 and 4 of Rank of a Matrix, (A – λI)X = 0 has an infinite number of solutions. Each such solution is an eigenvector.
Property 3: X is an eigenvector corresponding to eigenvalue λ if and only if AX = λX. If X is an eigenvector corresponding to λ, then every non-zero scalar multiple of X is also an eigenvector corresponding to λ.
Proof: Let λ be an eigenvalue of a k × k matrix A and let X be an eigenvector corresponding to λ. Then as we saw in the proof of Property 2, (A – λI)X = 0, an assertion which is equivalent to AX = λX. The converse is obvious.
Now let c be a non-zero scalar. Then A(cX) = c(AX) = c(λX) = λ(cX), and so cX is also an eigenvector.
Property 4: If λ is an eigenvalue of an invertible matrix A then λ-1 is an eigenvalue of A-1. Thus, the smallest eigenvalue of A = the reciprocal of the dominant eigenvalue of A-1
Proof: If λ is an eigenvalue, then there is a vector X ≠ 0, such that AX = λX. Thus A-1AX = A-1λX, and so X = A-1 λX = λA-1X. Dividing both sides of the equation by λ yields the result λ-1X = A-1X.
Property 5: If λ is an eigenvalue of the k × k matrix A and X is a corresponding eigenvector, then 1 + λ is an eigenvalue of I + A with corresponding eigenvector X.
Proof: Since (A – λI) X = 0, we have ((I + A) – (1 + λ)I) X = X + AX – X – λX = AX – λX = (A – λI) X = 0.
Example 2: Find the eigenvectors for A from Example 1.
This equivalent to
x(13–λ) + 5y = 0
2x + (4–λ)y = 0
For λ = 3
10x + 5y = 0
2x + y = 0
Thus, y = -2x, which means = or any scalar multiple.
For λ = 14
-x + 5y = 0
2x – 10y = 0
Thus, x = 5y, which means = or any scalar multiple.
We now find the eigenvectors with unit length.
For eigenvalue λ = 3, an eigenvector is . Its distance from the origin is = , and so is an eigenvector of λ = 3 of unit length.
For λ = 14, an eigenvector is . Its distance from the origin is = , and so is an eigenvector of λ = 14 of unit length.
Property 6: If X is an eigenvector then the Rayleigh Quotient = (XTAX) / (XTX) is a corresponding eigenvalue
Proof: (XTAX) / (XTX) = (XTλX) / (XTX) = λ (XTX) / (XTX) = λ where λ is the eigenvalue that corresponds to X
Example 3: Find the eigenvalues for the two unit eigenvectors from Example 2.
If AX = λX, then (A – λI) X = 0, and so λ is an eigenvalue corresponding to the eigenvector X. Since
it follows that λ = 3 and λ = 14 are the two corresponding eigenvalues.
Another approach is to use Rayleigh Quotient = (XTAX) / (XTX) per Property 6. For example
Thus, λ = 15/5 = 3.
Property 7: If all the eigenvalues of a square matrix A are distinct then any set of eigenvectors corresponding to these eigenvalues are independent.
Proof: We prove the result by induction on k. Suppose λ1, …, λk are the eigenvalues of A and suppose they are all distinct. Suppose that X1, …, Xk are the corresponding eigenvectors and b1X1 + … + bkXk = 0. Thus,
0 = A0 = b1AX1 + … + bkAXk = b1λ1X1 + … + bkλkXk
0 = b1λ1X1 + … + bkλ1Xk
Subtracting these linear combinations from each other, we get:
0 = b2(λ2 – λ1) X2 + … + bk(λk – λ1) Xk
Since this is a linear combination of k – 1 of the eigenvectors, by the induction hypothesis, b2(λ2 – λ1) = 0, …, bk(λk – λ1) = 0. But since all the λi are distinct, the expressions in parentheses are non-zero, and so b2 = … = bk = 0. Thus 0 = b1X1 + … + bkXk = b1X1 + 0 + … + 0 = b1X1. since X1 is an eigenvector X1 ≠ 0, and so b1 = 0. This proves that all the bi = 0, and so X1, …, Xk are independent.
Property 8: If the eigenvalues of a square k × k matrix A are distinct, then any set of eigenvectors corresponding to these eigenvalues are a basis for the set of all k × 1 column vectors (and so any set of k × 1 vector can be expressed uniquely as a linear combination of these eigenvectors).
Proof: The result follows from Corollary 4 of Linear Independent Vectors and Property 7.
Definition 3: A square matrix A is diagonalizable if there exist an invertible matrix P and a diagonal matrix D such that A = PDP-1.
Property 9: For any square matric A and invertible matrix P, A and PDP-1 have the same eigenvalues.
Proof: This follows since A and B have the same characteristic equation:
Property 10: An n × n matrix is diagonalizable if and only if it has n linearly independent eigenvectors.
Proof: First we show that if A is diagonalizable then A has n linearly independent eigenvectors.
Suppose A = PDP-1 where D = [dij] is a diagonal matrix and P is invertible. Thus AP = PD. Let Pj be the jth column of P. Thus the jth column of AP is APj and the jth column of PD is djj Pj. Since AP = PD, it follows that APj = djj Pj, which means that djj is an eigenvalue of A with corresponding eigenvector Pj.
Since P is invertible, by Property 9, rank(P) = n, and so the columns of P are linearly independent. But the columns of P are the eigenvectors, thus showing that the eigenvectors are independent.
Next we show that if A has n linearly independent eigenvectors then A is diagonalizable.
Let P1, …, Pn be the n linearly independent eigenvectors of A and let P be the n × n matrix whose columns are the Pj. Let λ1, …, λn be the eigenvalues of A and let D = [dij] be the diagonal matrix whose main diagonal contains these eigenvalues. Since APj = λj Pj, it follows that the jth column of AP = the jth column of PD and so AP = PD, from which it follows that A = PDP-1.
Observation: The proof of Property 10, shows that for any square matric A, if for some diagonal matrix D and invertible matrix P, then the main diagonal of D consists of the eigenvalues of A and the columns of P are corresponding eigenvectors.
Property 11: If A has n distinct real eigenvalues then A is diagonalizable.
Proof: The result follows from Property 7 and 10.
Observation: The converse of Property 11 is not true. E.g. the n × n identity matrix is trivially diagonalizable but it does not have n distinct eigenvalues.
Real Statistics Functions: The Real Statistics Resource Pack provides the following supplemental functions, where R1 is a k × k range in Excel.
eVALUES(R1, iter, order): Produces an 1 × k array containing the eigenvalues of matrix in range R1. These eigenvalues are listed in decreasing absolute value order.
eVECTORS(R1, iter, order) : Produces a row with the eigenvalues as for eVALUES(R1). Below each eigenvalue is a unit eigenvector corresponding to this eigenvalue. Thus the output of eVECTORS(R1) must be an (k+1) × k array.
Since the calculation of these functions uses iterative techniques, you can optionally specify the number of iteration used via the iter parameter. If the iter parameter is not used then it defaults to 100 iterations.
If order is True or omitted then the eigenvalues are listed in order from highest in absolute value to smallest. If order is False then they are listed in order from highest to lowest.
The eigenvectors produced by eVECTORS(R1) are all orthogonal, as described in Definition 8 of Matrix Operations. See Figure 5 of Principal Component Analysis for an example of the output from the eVECTORS function.
Real Statistics Data Analysis Tool: The Matrix data analysis tool contains an Eigenvalues/vectors option that computes the eigenvalues and eigenvectors of the matrix in the Input Range. See Figure 3 of Matrix Operations for an example of the use of this tool.
Observation: Every square k × k matrix has at most k (real) eigenvalues. If A is symmetric then it has k eigenvalues, although these don’t need to be distinct (see Symmetric Matrices). It turns out that the eigenvalues for covariance and correlation matrices are always non-negative (see Positive Definite Matrices).
If A is not symmetric, then some of the eigenvalues may be complex numbers. The values outputted by eVALUES corresponding to complex eigenvalues will not be correct. These complex eigenvalues occur in pairs, and so for example a 3 × 3 matrix will have either 1 or 3 real eigenvalues, never 2. You can always check that the values produced by eVALUES are real eigenvalues by checking whether det (A – λI) = 0. Since we will only use eVALUES where A is a covariance or correlation matrix, and so is symmetric, you don’t really need to worry about these details.