Eigenvalues and Eigenvectors

Definition 1: Given a square matrix A, an eigenvalue is a scalar λ such that det (A – λI) = 0, where A is a k × k matrix and I is the k × k identity matrix. The eigenvalue with the largest absolute value is called the dominant eigenvalue.

Observation: det(A – λI) = 0 expands into a kth degree polynomial equation in the unknown λ called the characteristic equation. The polynomial itself is called the characteristic polynomial. The solutions to the characteristic equation are the eigenvalues. Since, based on the fundamental theorem of algebra, any kth degree polynomial p(x) has k roots (i.e. solutions to the equation p(x) = 0), counting multiplicity, we conclude that any k × k matrix has k eigenvalues.

Example 1: Find the eigenvalues for matrix A

A = \left[\begin{array}{cc} 13 & 5 \\ 2 & 4 \end{array}\right] \qquad A - \lambda I = \left[\begin{array}{cc} 13-\lambda & 5 \\ 2 & 4-\lambda \end{array}\right]

Thus

\det(A - \lambda I) = (13-\lambda)(4-\lambda) - 5 \cdot 2 = \lambda^2 - 17\lambda + 42 = 0

This is the characteristic equation. Solving for λ, we have the eigenvalues λ = 3 and λ = 14.
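
As a quick numerical check (a sketch using NumPy, not part of the Real Statistics tools), we can confirm these eigenvalues:

import numpy as np

# The matrix A from Example 1
A = np.array([[13.0, 5.0],
              [2.0, 4.0]])

# Roots of the characteristic polynomial lambda^2 - 17*lambda + 42
print(sorted(np.roots([1.0, -17.0, 42.0])))  # approximately [3, 14]

# Cross-check with NumPy's eigenvalue routine
print(sorted(np.linalg.eigvals(A)))          # approximately [3, 14]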

Observation: Let A = \left[\begin{array}{cc} a & b \\ d & e \end{array}\right]. Then

\det(A - \lambda I) = \det \left[\begin{array}{cc} a-\lambda & b \\ d & e-\lambda \end{array}\right] = (a-\lambda)(e-\lambda) - bd

= \lambda^2 - (a+e)\lambda + (ae - bd)

Thus

\det(A - \lambda I) = \lambda^2 - (\mathrm{trace}\, A)\lambda + \det A = 0

Now let λ1 and λ2 be the eigenvalues. Then (λ – λ1)(λ – λ2) = 0, and so λ^2 – (λ1 + λ2)λ + λ1λ2 = 0, and so λ1 + λ2 = trace A and λ1λ2 = det A.

This is consistent with the eigenvalues from Example 1: λ1 + λ2 = 3 + 14 = 17 = trace A and λ1λ2 = 3 · 14 = 42 = det A.

In fact, these properties are true in general, not just for 2 × 2 matrices.

Property 1:

  1. The product of all the eigenvalues of A = det A
  2. The sum of all the eigenvalues of A = trace A
  3. A square matrix is invertible if and only if none of its eigenvalues is zero.
  4. The eigenvalues of an upper triangular matrix (including a diagonal matrix) are the entries on the main diagonal

Proof:

1) By definition, each eigenvalue is a root of the characteristic equation det(A – λI) = 0. By Definition 1 of Determinants and Simultaneous Linear Equations, det(A – λI) can be expressed as a kth degree polynomial in the form \sum_{i=0}^k a_i \lambda^i where ak = (–1)^k. Thus by the fundamental theorem of algebra, det(A – λI) = \prod_{i=1}^k (\lambda_i - \lambda) where λ1, …, λk are the eigenvalues of A (here we treat the λi as constants and λ as a variable). Setting λ = 0 yields

\det A = \lambda_1 \lambda_2 \cdots \lambda_k

2) If you expand the terms of det(A – λI) = \prod_{i=1}^k (\lambda_i - \lambda), the coefficient of λk-1 is

(-1)^{k-1}(\lambda_1 + \lambda_2 + \cdots + \lambda_k)

If instead you expand the terms of det(A – λI) = \sum_{i=0}^k a_i \lambda^i, you will find that the coefficient of λk-1 is

(-1)^{k-1}(a_{11} + a_{22} + \cdots + a_{kk})

Thus, trace A = a11 + a22 + ⋯ + akk = λ1 + λ2 + ⋯ + λk.

3) This results from 1) and Property 4 of Determinants and Simultaneous Linear Equations: A is invertible if and only if det A ≠ 0, i.e. if and only if the product of the eigenvalues is non-zero.

4) If A is an upper triangular matrix, then so is A – λI, and so by Definition 1 of Determinants and Simultaneous Linear Equations, det(A – λI) = (a11 – λ) ⋯ (akk – λ). The roots of this polynomial are the entries on the main diagonal of A.
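
These properties are easy to check numerically. Here is a minimal NumPy sketch (for illustration only, not part of the Real Statistics software) verifying Property 1 for the matrix from Example 1 and for an upper triangular matrix:

import numpy as np

A = np.array([[13.0, 5.0],
              [2.0, 4.0]])
evals = np.linalg.eigvals(A)

# 1) product of the eigenvalues = det A (3 * 14 = 42)
assert np.isclose(np.prod(evals), np.linalg.det(A))

# 2) sum of the eigenvalues = trace A (3 + 14 = 17)
assert np.isclose(np.sum(evals), np.trace(A))

# 4) eigenvalues of an upper triangular matrix = its diagonal entries
U = np.array([[2.0, 7.0],
              [0.0, 5.0]])
assert np.allclose(sorted(np.linalg.eigvals(U)), sorted(np.diag(U)))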

Definition 2: If λ is an eigenvalue of the k × k matrix A, then a non-zero k × 1 matrix X is an eigenvector which corresponds to λ provided (A – λI)X = 0, where 0 is the k × 1 null vector (i.e. 0’s in all positions).

Property 2: Every eigenvalue of a square matrix has an infinite number of corresponding eigenvectors.

Proof: Let λ be an eigenvalue of a k × k matrix A. Thus by Definition 1, det(A – λI) = 0, and so by Properties 3 and 4 of Rank of a Matrix, (A – λI)X = 0 has an infinite number of solutions. Each such non-zero solution is an eigenvector.

Property 3: X is an eigenvector corresponding to eigenvalue λ if and only if AX = λX. If X is an eigenvector corresponding to λ, then every non-zero scalar multiple of X is also an eigenvector corresponding to λ.

Proof: Let λ be an eigenvalue of a k × k matrix A and let X be an eigenvector corresponding to λ. Then as we saw in the proof of Property 2, (A – λI)X = 0, an assertion which is equivalent to AX = λX. The converse is obvious.

Now let c be a non-zero scalar. Then A(cX) = c(AX) = c(λX) = λ(cX), and so cX is also an eigenvector.

Property 4: If λ is an eigenvalue of an invertible matrix A then λ-1 is an eigenvalue of A-1. Thus, the eigenvalue of A with the smallest absolute value is the reciprocal of the dominant eigenvalue of A-1.

Proof: If λ is an eigenvalue, then there is a vector X ≠ 0 such that AX = λX. Thus A-1AX = A-1λX, and so X = A-1λX = λA-1X. Since A is invertible, λ ≠ 0 by Property 1, and so dividing both sides of the equation by λ yields the result λ-1X = A-1X.

Property 5: If λ is an eigenvalue of the k × k matrix A and X is a corresponding eigenvector, then 1 + λ is an eigenvalue of I + A with corresponding eigenvector X.

Proof: Since (A – λI) X = 0, we have ((I + A) – (1 + λ)I) X = X + AX – X – λX = AX – λX = (A – λI) X = 0.
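
Both properties can be confirmed on the matrix from Example 1 with a short NumPy sketch (illustrative only):

import numpy as np

A = np.array([[13.0, 5.0],
              [2.0, 4.0]])
evals = np.sort(np.linalg.eigvals(A))               # [3, 14]

# Property 4: eigenvalues of A^-1 are the reciprocals 1/14 and 1/3
inv_evals = np.sort(np.linalg.eigvals(np.linalg.inv(A)))
assert np.allclose(inv_evals, np.sort(1.0 / evals))

# Property 5: eigenvalues of I + A are 1 + 3 = 4 and 1 + 14 = 15
shifted = np.sort(np.linalg.eigvals(np.eye(2) + A))
assert np.allclose(shifted, evals + 1.0)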

Example 2: Find the eigenvectors for A from Example 1.

(A - \lambda I)X = \left[\begin{array}{cc} 13-\lambda & 5 \\ 2 & 4-\lambda \end{array}\right] \left[\begin{array}{c} x \\ y \end{array}\right] = \left[\begin{array}{c} 0 \\ 0 \end{array}\right]

This is equivalent to

(13 – λ)x + 5y = 0
2x + (4 – λ)y = 0

For λ = 3

10x + 5y = 0
2x + y = 0

Thus, y = -2x, which means \left[\begin{array}{c} x \\ y \end{array}\right] = \left[\begin{array}{c} -1 \\ 2 \end{array}\right] or any scalar multiple.

For λ = 14

-x + 5y = 0
2x – 10y = 0

Thus, x = 5y, which means \left[\begin{array}{c} x \\ y \end{array}\right] = \left[\begin{array}{c} 5 \\ 1 \end{array}\right] or any scalar multiple.

We now find the eigenvectors with unit length.

For eigenvalue λ = 3, an eigenvector is \left[\begin{array}{c} -1 \\ 2 \end{array}\right]. Its distance from the origin is \sqrt{(-1)^2+2^2} = \sqrt{5}, and so \left[\begin{array}{c} -1/\sqrt{5} \\ 2/\sqrt{5} \end{array}\right] is an eigenvector of λ = 3 of unit length.

For λ = 14, an eigenvector is \left[\begin{array}{c} 5 \\ 1 \end{array}\right]. Its distance from the origin is \sqrt{5^2+1^2} = \sqrt{26}, and so \left[\begin{array}{c} 5/\sqrt{26} \\ 1/\sqrt{26} \end{array}\right] is an eigenvector of λ = 14 of unit length.
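
A minimal NumPy sketch (illustrative only) reproduces these unit eigenvectors; note that np.linalg.eig returns unit-length eigenvectors as columns, determined only up to sign:

import numpy as np

A = np.array([[13.0, 5.0],
              [2.0, 4.0]])
evals, evecs = np.linalg.eig(A)     # eigenvectors are the columns

for j, lam in enumerate(evals):
    v = evecs[:, j]
    assert np.isclose(np.linalg.norm(v), 1.0)  # unit length
    assert np.allclose(A @ v, lam * v)         # A v = lambda v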

Property 6: If X is an eigenvector, then the Rayleigh quotient (X^T A X) / (X^T X) is the corresponding eigenvalue.

Proof: Since AX = λX by Property 3, (X^T A X) / (X^T X) = (X^T λX) / (X^T X) = λ(X^T X) / (X^T X) = λ, where λ is the eigenvalue that corresponds to X.

Example 3: Find the eigenvalues for the two unit eigenvectors from Example 2.

If AX = λX, then (A – λI)X = 0, and so λ is an eigenvalue corresponding to the eigenvector X. Since

\left[\begin{array}{cc} 13 & 5 \\ 2 & 4 \end{array}\right] \left[\begin{array}{c} -1/\sqrt{5} \\ 2/\sqrt{5} \end{array}\right] = \left[\begin{array}{c} -3/\sqrt{5} \\ 6/\sqrt{5} \end{array}\right] = 3 \left[\begin{array}{c} -1/\sqrt{5} \\ 2/\sqrt{5} \end{array}\right]

\left[\begin{array}{cc} 13 & 5 \\ 2 & 4 \end{array}\right] \left[\begin{array}{c} 5/\sqrt{26} \\ 1/\sqrt{26} \end{array}\right] = \left[\begin{array}{c} 70/\sqrt{26} \\ 14/\sqrt{26} \end{array}\right] = 14 \left[\begin{array}{c} 5/\sqrt{26} \\ 1/\sqrt{26} \end{array}\right]

it follows that λ = 3 and λ = 14 are the two corresponding eigenvalues.

Another approach is to use the Rayleigh quotient (X^T A X) / (X^T X) per Property 6. For example, taking X = \left[\begin{array}{c} -1 \\ 2 \end{array}\right], the eigenvector found earlier for λ = 3:

X^T A X = \left[\begin{array}{cc} -1 & 2 \end{array}\right] \left[\begin{array}{cc} 13 & 5 \\ 2 & 4 \end{array}\right] \left[\begin{array}{c} -1 \\ 2 \end{array}\right] = 15

X^T X = \left[\begin{array}{cc} -1 & 2 \end{array}\right] \left[\begin{array}{c} -1 \\ 2 \end{array}\right] = 5

Thus, λ = 15/5 = 3.
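
The Rayleigh quotient is also easy to compute in code. Here is a short NumPy sketch (illustrative only; the helper name rayleigh_quotient is hypothetical):

import numpy as np

def rayleigh_quotient(A, x):
    """Return (x^T A x) / (x^T x) for a column vector x."""
    return (x.T @ A @ x).item() / (x.T @ x).item()

A = np.array([[13.0, 5.0],
              [2.0, 4.0]])
x = np.array([[-1.0], [2.0]])    # eigenvector for lambda = 3
print(rayleigh_quotient(A, x))   # 3.0 (= 15/5)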

Property 7: If all the eigenvalues of a square matrix A are distinct, then any set consisting of one eigenvector for each of these eigenvalues is linearly independent.

Proof: We prove the result by induction on k. The case k = 1 is trivial since an eigenvector is non-zero by definition. Suppose λ1, …, λk are the eigenvalues of A and suppose they are all distinct. Suppose that X1, …, Xk are the corresponding eigenvectors and b1X1 + … + bkXk = 0. Thus,

 0 = A0 = b1AX1 + … + bkAXk = b1λ1X1 + … + bkλkXk

But also, multiplying the original linear combination by λ1,

0 = b1λ1X1 + … + bkλ1Xk

Subtracting these linear combinations from each other, we get:

0 = b2(λ2 – λ1)X2 + … + bk(λk – λ1)Xk

Since this is a linear combination of k – 1 of the eigenvectors, by the induction hypothesis, b2(λ2 – λ1) = 0, …, bk(λk – λ1) = 0. But since all the λi are distinct, the expressions in parentheses are non-zero, and so b2 = … = bk = 0. Thus 0 = b1X1 + … + bkXk = b1X1 + 0 + … + 0 = b1X1. Since X1 is an eigenvector, X1 ≠ 0, and so b1 = 0. This proves that all the bi = 0, and so X1, …, Xk are independent.

Property 8: If the eigenvalues of a square k × k matrix A are distinct, then any set of eigenvectors corresponding to these eigenvalues forms a basis for the set of all k × 1 column vectors (and so any k × 1 vector can be expressed uniquely as a linear combination of these eigenvectors).

Proof: The result follows from Corollary 4 of Linear Independent Vectors and Property 7.
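
A brief NumPy sketch (illustrative only, with an arbitrarily chosen vector v) of expressing a vector in the eigenvector basis from Example 2:

import numpy as np

# Columns are the eigenvectors for lambda = 3 and lambda = 14
P = np.array([[-1.0, 5.0],
              [2.0, 1.0]])

v = np.array([7.0, -4.0])      # an arbitrary example vector
b = np.linalg.solve(P, v)      # unique coordinates in this basis
assert np.allclose(P @ b, v)   # v = b[0]*P[:,0] + b[1]*P[:,1]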

Definition 3: A square matrix A is diagonalizable if there exist an invertible matrix P and a diagonal matrix D such that A = PDP-1.

Property 9: For any square matrix A and invertible matrix P, A and B = PAP-1 have the same eigenvalues.

Proof: This follows since A and B have the same characteristic equation:

\det(B - \lambda I) = \det(PAP^{-1} - \lambda PP^{-1}) = \det(P(A - \lambda I)P^{-1}) = \det(P)\det(A - \lambda I)\det(P^{-1}) = \det(A - \lambda I)

Property 10: An n × n matrix is diagonalizable if and only if it has n linearly independent eigenvectors.

Proof: First we show that if A is diagonalizable then A has n linearly independent eigenvectors.

Suppose A = PDP-1 where D = [dij] is a diagonal matrix and P is invertible. Thus AP = PD. Let Pj be the jth column of P. Thus the jth column of AP is APj and the jth column of PD is djj Pj. Since AP = PD, it follows that APj = djj Pj, which means that djj is an eigenvalue of A with corresponding eigenvector Pj.

Since P is invertible, rank(P) = n (see Rank of a Matrix), and so the columns of P are linearly independent. But the columns of P are the eigenvectors, thus showing that the eigenvectors are independent.

Next we show that if A has n linearly independent eigenvectors then A is diagonalizable.

Let P1, …, Pn be the n linearly independent eigenvectors of A and let P be the n × n matrix whose columns are the Pj. Let λ1, …, λn be the eigenvalues of A and let D = [dij] be the diagonal matrix whose main diagonal contains these eigenvalues. Since APj = λj Pj, it follows that the jth column of AP = the jth column of PD and so AP = PD, from which it follows that A = PDP-1.

Observation: The proof of Property 10 shows that for any square matrix A, if A = PDP-1 for some diagonal matrix D and invertible matrix P, then the main diagonal of D consists of the eigenvalues of A and the columns of P are corresponding eigenvectors.
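
A minimal NumPy sketch (illustrative only) confirming this for the matrix from Example 1, using the eigenvectors found in Example 2:

import numpy as np

A = np.array([[13.0, 5.0],
              [2.0, 4.0]])

P = np.array([[-1.0, 5.0],    # columns: eigenvector for lambda = 3,
              [2.0, 1.0]])    # eigenvector for lambda = 14
D = np.diag([3.0, 14.0])

# A = P D P^-1
assert np.allclose(A, P @ D @ np.linalg.inv(P))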

Property 11: If A has n distinct real eigenvalues then A is diagonalizable.

Proof: The result follows from Properties 7 and 10.

Observation: The converse of Property 11 is not true. E.g. the n × n identity matrix is trivially diagonalizable but it does not have n distinct eigenvalues.

Real Statistics Functions: The Real Statistics Resource Pack provides the following supplemental functions, where R1 is a k × k range in Excel.

eVALUES(R1, iter, order): Produces a 2 × k array whose first row contains the eigenvalues of the matrix in range R1. Below each eigenvalue λ is the value det(A – λI).

eVECTORS(R1, iter, order, prec): Produces a row with the eigenvalues as for eVALUES(R1). Below each eigenvalue is a unit eigenvector corresponding to this eigenvalue. Thus the output of eVECTORS(R1) is a (k+1) × k array.

Since the calculation of these functions uses iterative techniques, you can optionally specify the number of iterations used via the iter parameter. If the iter parameter is not used, then it defaults to 100 iterations.
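
The text does not state which iterative algorithm these functions use (a comment below mentions a QR algorithm). Purely as an illustration of this kind of iteration, here is a minimal power-iteration sketch in Python that converges to the dominant eigenvalue for matrices like the one in Example 1:

import numpy as np

def power_iteration(A, num_iter=100):
    """Approximate the dominant eigenvalue of A and a corresponding
    unit eigenvector by repeatedly applying A and normalizing."""
    x = np.ones(A.shape[0])
    for _ in range(num_iter):
        x = A @ x
        x = x / np.linalg.norm(x)
    # The Rayleigh quotient (Property 6) estimates the eigenvalue
    return x @ A @ x, x

A = np.array([[13.0, 5.0],
              [2.0, 4.0]])
lam, x = power_iteration(A)
print(lam)   # approximately 14, the dominant eigenvalue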

If order is True or omitted, then the eigenvalues are listed from largest to smallest in absolute value. If order is False, then they are listed from largest to smallest.

See below for a description of prec.

The eigenvectors produced by eVECTORS(R1) are all orthogonal, as described in Definition 8 of Matrix Operations. See Figure 5 of Principal Component Analysis for an example of the output from the eVECTORS function.

Real Statistics Data Analysis Tool: The Matrix data analysis tool contains an Eigenvalues/vectors option that computes the eigenvalues and eigenvectors of the matrix in the Input Range. See Figure 3 of Matrix Operations for an example of the use of this tool.

Observation: Every square k × k matrix has at most k (real) eigenvalues. If A is symmetric then it has k (real) eigenvalues, although these don’t need to be distinct (see Symmetric Matrices). It turns out that the eigenvalues for covariance and correlation matrices are always non-negative (see Positive Definite Matrices).

If A is not symmetric, then some of the eigenvalues may be complex numbers. The values output by eVALUES corresponding to complex eigenvalues will not be correct. These complex eigenvalues occur in conjugate pairs, and so, for example, a 3 × 3 matrix will have either 1 or 3 real eigenvalues, never 2.

You can always check that the values produced by eVALUES are real eigenvalues by checking whether det (A – λI) = 0 (using the second row of the output from eVALUES). You can also specify a small value prec, the fourth argument in eVECTORS(R1, iter, order, prec). If prec > 0 then any eigenvalue for which abs(det (A – λI)) > prec is replaced by #N/A. If prec = 0 or is omitted then this replacement is not made.
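
The same residual check can be expressed as a short NumPy sketch (illustrative only, mimicking the second row of the eVALUES output):

import numpy as np

def eigenvalue_residuals(A, evals):
    """det(A - lambda*I) for each candidate eigenvalue; values far
    from zero flag candidates that are not real eigenvalues."""
    k = A.shape[0]
    return [np.linalg.det(A - lam * np.eye(k)) for lam in evals]

A = np.array([[13.0, 5.0],
              [2.0, 4.0]])
print(eigenvalue_residuals(A, [3.0, 14.0]))   # both approximately 0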

Since elsewhere in this website we will only use eVALUES and eVECTORS where A is a covariance or correlation matrix, and so is symmetric, you don’t really need to worry about these details.

3 Responses to Eigenvalues and Eigenvectors

  1. Yuri says:

    If the characteristic equation of a matrix has complex eigenvalues, the QR algorithm does not converge. In this case the eVECTORS function produces numbers which may be confusing, since real eigenvalues and eigenvectors do not exist. The free add-in “xnumbers” produces a “?” mark if it is not possible to evaluate an operation correctly. It would be great if eVECTORS (and other functions as well) produced some alert that the result of the operation is not correct.

    Anyway, excellent software, thanks a lot.

    Yuri

    • Charles says:

      Yuri,
      Excellent point. I had thought of doing that but since the only time I need to find eigenvalues are for covariance/correlation matrices, which are symmetric and so always have real eigenvalues/vectors, I didn’t bother. In any case I will look into this further shortly.
      Charles

    • Charles says:

      Yuri,
      I have addressed your point in Release 2.12 of the software which was issued yesterday. The eVALUES function will produce the values det(A-cI) and if it is far from zero then c is not a real eigenvalue. The eVECTORS function will change imaginary eigenvalues to #N/A.
      Charles
