Eigenvalues and Eigenvectors

Definition 1: Given a square matrix A, an eigenvalue is a scalar λ such that det (A – λI) = 0, where A is a k × k matrix and I is the k × k identity matrix. The eigenvalue with the largest absolute value is called the dominant eigenvalue.

Observation: det (A – λI) = 0 expands into a kth degree polynomial equation in the unknown λ called the characteristic equation. The polynomial itself is called the characteristic polynomial. The solutions to the characteristic equation are the eigenvalues. Since, based on the fundamental theorem of algebra, any kth degree polynomial p(x) has k roots (i.e. solutions to the equation p(x) = 0), we conclude that any k × k matrix has k eigenvalues.

Example 1: Find the eigenvalues for matrix A

A = \left[\begin{array}{cc} 13 & 5 \\ 2 & 4 \end{array}\right] \qquad\qquad A - \lambda I = \left[\begin{array}{cc} 13-\lambda & 5 \\ 2 & 4-\lambda \end{array}\right]

Thus
\det(A - \lambda I) = (13-\lambda)(4-\lambda) - 5 \cdot 2 = \lambda^2 - 17\lambda + 42 = 0

This is the characteristic equation. Solving for λ, we have the eigenvalues λ = 3 and λ = 14.
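
As a numerical check (not part of the original example), here is a minimal Python/NumPy sketch that solves this characteristic equation and compares the roots with NumPy's built-in eigenvalue routine:

```python
import numpy as np

A = np.array([[13.0, 5.0],
              [2.0, 4.0]])

# Characteristic equation: det(A - λI) = (13-λ)(4-λ) - 5*2 = λ² - 17λ + 42 = 0
char_poly = [1.0, -17.0, 42.0]
print(np.roots(char_poly))      # [14.  3.]

# Cross-check with NumPy's eigenvalue routine (order may differ)
print(np.linalg.eigvals(A))
```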

Observation: Let A = \left[\begin{array}{cc} a & b \\ d & e \end{array}\right]. Then

A - \lambda I = \left[\begin{array}{cc} a-\lambda & b \\ d & e-\lambda \end{array}\right] \qquad\qquad \det(A - \lambda I) = (a-\lambda)(e-\lambda) - bd

= \lambda^2 - (a+e)\lambda + (ae - bd)

Thus

\det(A - \lambda I) = \lambda^2 - (\text{trace } A)\,\lambda + \det A

Now let λ1 and λ2 be the eigenvalues. Then (λ – λ1)(λ – λ2) = 0, i.e. λ^2 – (λ1 + λ2)λ + λ1λ2 = 0. Comparing coefficients with the characteristic equation above, λ1 + λ2 = trace A and λ1λ2 = det A.

This is consistent with the eigenvalues from Example 1.

In fact, these properties are true in general, not just for 2 × 2 matrices.
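
These identities are easy to verify numerically; a short NumPy sketch checking them for the matrix from Example 1:

```python
import numpy as np

A = np.array([[13.0, 5.0],
              [2.0, 4.0]])
eigenvalues = np.linalg.eigvals(A)

print(np.isclose(eigenvalues.prod(), np.linalg.det(A)))  # True: 3 * 14 = 42 = det A
print(np.isclose(eigenvalues.sum(), np.trace(A)))        # True: 3 + 14 = 17 = trace A
```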

Property 1:

  1. The product of all the eigenvalues of A = det A
  2. The sum of all the eigenvalues of A = trace A
  3. A square matrix is invertible if and only if none of its eigenvalues is zero.
  4. The eigenvalues of an upper triangular matrix (including a diagonal matrix) are the entries on the main diagonal

Proof:

1) By definition, each eigenvalue is a root of the characteristic equation det(A – λI) = 0. By Definition 1 of Determinants and Simultaneous Linear Equations, det(A – λI) can be expressed as a kth degree polynomial in the form \sum_{i=0}^k a_i \lambda^i where a_k = (-1)^k. Thus by the fundamental theorem of algebra, \det(A - \lambda I) = \prod_{i=1}^k (\lambda_i - \lambda) where λ1, …, λk are the eigenvalues of A (here we treat the λi as constants and λ as a variable). Setting λ = 0 yields

\det A = \lambda_1 \lambda_2 \cdots \lambda_k

2) If you expand the terms of \det(A - \lambda I) = \prod_{i=1}^k (\lambda_i - \lambda), you find that the coefficient of \lambda^{k-1} is

(-1)^{k-1}(\lambda_1 + \lambda_2 + \cdots + \lambda_k)

If you expand the terms of \det(A - \lambda I) = \sum_{i=0}^k a_i \lambda^i, you will find that this same coefficient is

a_{k-1} = (-1)^{k-1}(a_{11} + a_{22} + \cdots + a_{kk})

Thus, trace A = a11 + a22 + ⋯ + akk = λ1 + λ2 + ⋯ + λk.

3) This follows from (1) and Property 4 of Determinants and Simultaneous Linear Equations.

4) If A is an upper triangular matrix, then so is A – λI. The result now follows from Definition 1 of Determinants and Simultaneous Linear Equations.

Definition 2: If λ is an eigenvalue of the k × k matrix A, then a non-zero k × 1 matrix X is an eigenvector which corresponds to λ provided (A – λI)X = 0, where 0 is the k × 1 null matrix (i.e. 0’s in all positions).

Property 2: Every eigenvalue of a square matrix has an infinite number of corresponding eigenvectors.

Proof: Let λ be an eigenvalue of a k × k matrix A. Thus by Definition 1, det (A – λI) = 0, and so by Properties 3 and 4 of Rank of a Matrix, (A – λI)X = 0 has an infinite number of solutions. Each such non-zero solution is an eigenvector.

Property 3: X is an eigenvector corresponding to eigenvalue λ if and only if AX = λX. If X is an eigenvector corresponding to λ, then every non-zero scalar multiple of X is also an eigenvector corresponding to λ.

Proof: Let λ be an eigenvalue of a k × k matrix A and let X be an eigenvector corresponding to λ. Then as we saw in the proof of Property 2, (A – λI)X = 0, an assertion which is equivalent to AX = λX. The converse is obvious.

Now let c be a non-zero scalar. Then A(cX) = c(AX) = c(λX) = λ(cX), and so cX is also an eigenvector.

Property 4: If λ is an eigenvalue of an invertible matrix A then λ-1 is an eigenvalue of A-1. Thus, the eigenvalue of A with the smallest absolute value is the reciprocal of the dominant eigenvalue of A-1.

Proof: If λ is an eigenvalue, then there is a vector X ≠ 0, such that AX = λX. Thus A-1AX = A-1λX, and so X = A-1λX = λA-1X. Dividing both sides of the equation by λ (which is non-zero by Property 1 since A is invertible) yields the result λ-1X = A-1X.

Property 5: If λ is an eigenvalue of the k × k matrix A and X is a corresponding eigenvector, then 1 + λ is an eigenvalue of I + A with corresponding eigenvector X.

Proof: Since (A – λI) X = 0, we have ((I + A) – (1 + λ)I) X = X + AX – X – λX = AX – λX = (A – λI) X = 0.
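
A quick numerical sanity check of Properties 4 and 5 (a NumPy sketch, again using the matrix from Example 1):

```python
import numpy as np

A = np.array([[13.0, 5.0],
              [2.0, 4.0]])

print(np.sort(np.linalg.eigvals(A)))                 # [ 3. 14.]

# Property 4: the eigenvalues of the inverse of A are the reciprocals 1/14 and 1/3
print(np.sort(np.linalg.eigvals(np.linalg.inv(A))))  # [0.0714... 0.3333...]

# Property 5: the eigenvalues of I + A are 1 + λ, i.e. 4 and 15
print(np.sort(np.linalg.eigvals(np.eye(2) + A)))     # [ 4. 15.]
```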

Example 2: Find the eigenvectors for A from Example 1.

(A - \lambda I) \left[\begin{array}{c} x \\ y \end{array}\right] = \left[\begin{array}{cc} 13-\lambda & 5 \\ 2 & 4-\lambda \end{array}\right] \left[\begin{array}{c} x \\ y \end{array}\right] = \left[\begin{array}{c} 0 \\ 0 \end{array}\right]

This is equivalent to

(13–λ)x + 5y = 0
2x + (4–λ)y = 0

For λ = 3

10x + 5y = 0
2x + y = 0

Thus, y = -2x, which means \left[\begin{array}{c} x \\ y \end{array}\right] = \left[\begin{array}{c} -1 \\ 2 \end{array}\right] or any scalar multiple.

For λ = 14

–x + 5y = 0
2x – 10y = 0

Thus, x = 5y, which means \left[\begin{array}{c} x \\ y \end{array}\right] = \left[\begin{array}{c} 5 \\ 1 \end{array}\right] or any scalar multiple.

We now find the eigenvectors with unit length.

For eigenvalue λ = 3, an eigenvector is \left[\begin{array}{c} -1 \\ 2 \end{array}\right]. Its distance from the origin is \sqrt{(-1)^2+2^2} = \sqrt{5}, and so \left[\begin{array}{c} -1/\sqrt{5} \\ 2/\sqrt{5} \end{array}\right] is an eigenvector of λ = 3 of unit length.

For λ = 14, an eigenvector is \left[\begin{array}{c} 5 \\ 1 \end{array}\right]. Its distance from the origin is \sqrt{5^2+1^2} = \sqrt{26}, and so \left[\begin{array}{c} 5/\sqrt{26} \\ 1/\sqrt{26} \end{array}\right] is an eigenvector of λ = 14 of unit length.
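
For comparison, a NumPy sketch that produces the same unit eigenvectors (NumPy may list the eigenvalues in a different order and may flip the sign of an eigenvector, which by Property 3 is still an eigenvector):

```python
import numpy as np

A = np.array([[13.0, 5.0],
              [2.0, 4.0]])

eigenvalues, eigenvectors = np.linalg.eig(A)   # the columns of eigenvectors have unit length
for lam, v in zip(eigenvalues, eigenvectors.T):
    print(lam, v)

# Expected (up to order and sign):
#   λ = 14 with [5/√26, 1/√26] ≈ [ 0.981, 0.196]
#   λ =  3 with [-1/√5, 2/√5]  ≈ [-0.447, 0.894]
```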

Property 6: If X is an eigenvector then the Rayleigh Quotient = (X^TAX) / (X^TX) is a corresponding eigenvalue.

Proof: Since AX = λX by Property 3, (X^TAX) / (X^TX) = (X^TλX) / (X^TX) = λ(X^TX) / (X^TX) = λ, where λ is the eigenvalue that corresponds to X.

Example 3: Find the eigenvalues for the two unit eigenvectors from Example 2.

If AX = λX, then (A – λI)X = 0, and so λ is an eigenvalue corresponding to the eigenvector X. Since

\left[\begin{array}{cc} 13 & 5 \\ 2 & 4 \end{array}\right] \left[\begin{array}{c} -1/\sqrt{5} \\ 2/\sqrt{5} \end{array}\right] = \left[\begin{array}{c} -3/\sqrt{5} \\ 6/\sqrt{5} \end{array}\right] = 3 \left[\begin{array}{c} -1/\sqrt{5} \\ 2/\sqrt{5} \end{array}\right]

\left[\begin{array}{cc} 13 & 5 \\ 2 & 4 \end{array}\right] \left[\begin{array}{c} 5/\sqrt{26} \\ 1/\sqrt{26} \end{array}\right] = \left[\begin{array}{c} 70/\sqrt{26} \\ 14/\sqrt{26} \end{array}\right] = 14 \left[\begin{array}{c} 5/\sqrt{26} \\ 1/\sqrt{26} \end{array}\right]

it follows that λ = 3 and λ = 14 are the two corresponding eigenvalues.

Another approach is to use the Rayleigh Quotient = (X^TAX) / (X^TX) per Property 6. For example, taking X = \left[\begin{array}{c} -1 \\ 2 \end{array}\right] for λ = 3:

X^TAX = \left[\begin{array}{cc} -1 & 2 \end{array}\right] \left[\begin{array}{cc} 13 & 5 \\ 2 & 4 \end{array}\right] \left[\begin{array}{c} -1 \\ 2 \end{array}\right] = \left[\begin{array}{cc} -1 & 2 \end{array}\right] \left[\begin{array}{c} -3 \\ 6 \end{array}\right] = 15

X^TX = \left[\begin{array}{cc} -1 & 2 \end{array}\right] \left[\begin{array}{c} -1 \\ 2 \end{array}\right] = 5

Thus, λ = 15/5 = 3.
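
The same Rayleigh Quotient calculation as a minimal NumPy sketch, using the non-normalized eigenvector for λ = 3:

```python
import numpy as np

A = np.array([[13.0, 5.0],
              [2.0, 4.0]])
X = np.array([[-1.0],
              [2.0]])                   # eigenvector for λ = 3 (any non-zero multiple works)

rayleigh = (X.T @ A @ X) / (X.T @ X)    # (X^T A X) / (X^T X) = 15/5
print(rayleigh)                         # [[3.]]
```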

Property 7: If all the eigenvalues of a square matrix A are distinct, then any set of eigenvectors corresponding to these eigenvalues is linearly independent.

Proof: We prove the result by induction on k. For k = 1 the result is immediate, since an eigenvector is non-zero. Suppose λ1, …, λk are the eigenvalues of A and suppose they are all distinct. Suppose that X1, …, Xk are the corresponding eigenvectors and b1X1 + … + bkXk = 0. Thus,

 0 = A0 = b1AX1 + … + bkAXk = b1λ1X1 + … + bkλkXk

But also, multiplying b1X1 + … + bkXk = 0 by λ1,

0 = b1λ1X1 + … + bkλ1Xk

Subtracting these linear combinations from each other, we get:

0 = b2(λ2 – λ1)X2 + … + bk(λk – λ1)Xk

Since this is a linear combination of k – 1 of the eigenvectors, by the induction hypothesis, b2(λ2 – λ1) = 0, …, bk(λk – λ1) = 0. But since all the λi are distinct, the expressions in parentheses are non-zero, and so b2 = … = bk = 0. Thus 0 = b1X1 + … + bkXk = b1X1 + 0 + … + 0 = b1X1. Since X1 is an eigenvector, X1 ≠ 0, and so b1 = 0. This proves that all the bi = 0, and so X1, …, Xk are independent.

Property 8: If the eigenvalues of a square k × k matrix A are distinct, then any set of eigenvectors corresponding to these eigenvalues is a basis for the set of all k × 1 column vectors (and so any k × 1 vector can be expressed uniquely as a linear combination of these eigenvectors).

Proof: The result follows from Corollary 4 of Linear Independent Vectors and Property 7.

Definition 3: A square matrix A is diagonalizable if there exist an invertible matrix P and a diagonal matrix D such that A = PDP-1.

Property 9: For any square matrix A and invertible matrix P, A and B = PAP-1 have the same eigenvalues.

Proof: This follows since A and B have the same characteristic equation:

\det(B - \lambda I) = \det(PAP^{-1} - \lambda I) = \det(P(A - \lambda I)P^{-1}) = \det(P)\,\det(A - \lambda I)\,\det(P^{-1}) = \det(A - \lambda I)

Property 10: An n × n matrix is diagonalizable if and only if it has n linearly independent eigenvectors.

Proof: First we show that if A is diagonalizable then A has n linearly independent eigenvectors.

Suppose A = PDP-1 where D = [dij] is a diagonal matrix and P is invertible. Thus AP = PD. Let Pj be the jth column of P. Thus the jth column of AP is APj and the jth column of PD is djj Pj. Since AP = PD, it follows that APj = djj Pj, which means that djj is an eigenvalue of A with corresponding eigenvector Pj.

Since P is invertible, rank(P) = n, and so the columns of P are linearly independent. But the columns of P are the eigenvectors, thus showing that the eigenvectors are independent.

Next we show that if A has n linearly independent eigenvectors then A is diagonalizable.

Let P1, …, Pn be the n linearly independent eigenvectors of A and let P be the n × n matrix whose columns are the Pj. Let λ1, …, λn be the eigenvalues of A and let D = [dij] be the diagonal matrix whose main diagonal contains these eigenvalues. Since APj = λj Pj, it follows that the jth column of AP = the jth column of PD and so AP = PD, from which it follows that A = PDP-1.

Observation: The proof of Property 10 shows that for any square matrix A, if A = PDP-1 for some diagonal matrix D and invertible matrix P, then the main diagonal of D consists of the eigenvalues of A and the columns of P are corresponding eigenvectors.
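
A short NumPy sketch illustrating Property 10 and this observation for the matrix of Example 1, using the eigenvectors of Example 2 as the columns of P:

```python
import numpy as np

A = np.array([[13.0, 5.0],
              [2.0, 4.0]])

P = np.array([[-1.0, 5.0],    # columns: eigenvectors for λ = 3 and λ = 14 (Example 2)
              [ 2.0, 1.0]])
D = np.diag([3.0, 14.0])      # the eigenvalues on the diagonal, in matching order

print(P @ D @ np.linalg.inv(P))   # reproduces A (up to rounding)
```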

Property 11: If A has n distinct real eigenvalues then A is diagonalizable.

Proof: The result follows from Properties 7 and 10.

Observation: The converse of Property 11 is not true. E.g. the n × n identity matrix is trivially diagonalizable but it does not have n distinct eigenvalues.

Real Statistics Functions: The Real Statistics Resource Pack provides the following supplemental functions, where R1 is a k × k range in Excel.

eVALUES(R1, iter, order): produces a 2 × k array whose first row contains the eigenvalues of the matrix in range R1. Below each eigenvalue λ is the value det(A – λI).

eVECTORS(R1, iter, order): returns an (n+3) × n range, where n = the number of rows/columns in the square range R1. The first row of the output consists of the real eigenvalues of the square matrix A corresponding to the data in R1. Below each eigenvalue λ in the first row is a unit n × 1 eigenvector corresponding to λ. In the second-to-last row of the output are the values det(A – λI). In the last row of the output, below each eigenvalue λ and eigenvector X is the value max {bi: i = 1 to n} where B = AX – λX.

The eVALUES function will produce all real eigenvalues (but not any imaginary eigenvalues). The eVECTORS function only works reliably for symmetric matrices, which are the only ones for which we will need to calculate eigenvalues and eigenvectors on this website. When the matrix in range R1 is not symmetric you can use the eVECT function described in Eigenvectors of Non-symmetric Matrices.

Since the calculation of these functions uses iterative techniques, you can optionally specify the number of iterations used via the iter parameter. If the iter parameter is not used then it defaults to 100 iterations. For some matrices the value of iter must be increased to obtain sufficiently accurate eigenvalues or eigenvectors.
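
For intuition about the role of iter, the sketch below shows one simple iterative technique (power iteration) for approximating the dominant eigenvalue and a corresponding unit eigenvector; it is only an illustration of why more iterations give more accuracy, not necessarily the algorithm the Real Statistics functions actually use:

```python
import numpy as np

def power_iteration(A, iters=100):
    """Approximate the dominant eigenvalue and a unit eigenvector of A."""
    x = np.ones(A.shape[0])
    for _ in range(iters):
        x = A @ x
        x = x / np.linalg.norm(x)   # re-normalize to unit length at each step
    lam = x @ A @ x                 # Rayleigh Quotient (Property 6), valid since x has unit length
    return lam, x

A = np.array([[13.0, 5.0],
              [2.0, 4.0]])
lam, x = power_iteration(A, iters=100)
print(lam)   # ≈ 14, the dominant eigenvalue
print(x)     # ≈ [5/√26, 1/√26]
```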

If order is TRUE or omitted then the eigenvalues are listed in order from the highest absolute value to the lowest. If order is FALSE then they are listed in order from highest to lowest.

The eigenvectors produced by eVECTORS(R1) are all orthogonal, as described in Definition 8 of Matrix Operations. See Figure 5 of Principal Component Analysis for an example of the output from the eVECTORS function.

Real Statistics Data Analysis Tool: The Matrix data analysis tool contains an Eigenvalues/vectors option that computes the eigenvalues and eigenvectors of the matrix in the Input Range. See Figure 3 of Matrix Operations for an example of the use of this tool.

Observation: Every square k × k matrix has at most k (real) eigenvalues. If A is symmetric then it has k (real) eigenvalues, although these don’t need to be distinct (see Symmetric Matrices). It turns out that the eigenvalues for covariance and correlation matrices are always non-negative (see Positive Definite Matrices).

If A is not symmetric, then some of the eigenvalues may be complex numbers.  The values outputted by eVALUES corresponding to complex eigenvalues will not be correct. These complex eigenvalues occur in pairs, and so for example a 3 × 3 matrix will have either 1 or 3 real eigenvalues, never 2.
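
For example, the following sketch uses a purely illustrative non-symmetric 3 × 3 matrix (a rotation of the xy-plane combined with a scaling of the z-axis) whose eigenvalues are one real value and one complex conjugate pair:

```python
import numpy as np

A = np.array([[0.0, -1.0, 0.0],
              [1.0,  0.0, 0.0],
              [0.0,  0.0, 2.0]])   # not symmetric

print(np.linalg.eigvals(A))   # one real eigenvalue (2) and a complex conjugate pair (±i)
```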

You can always check that the values produced by eVALUES are real eigenvalues by checking whether det (A – λI) = 0 (using the second row of the output from eVALUES).

Since elsewhere on this website we will only use eVALUES and eVECTORS where A is a covariance or correlation matrix, which is symmetric, you don’t really need to worry about these details.

8 Responses to Eigenvalues and Eigenvectors

  1. Sarah says:

    You’re welcome. Good luck with both of those releases. I hope they go smoothly for you.

  2. Sarah says:

    Thanks for the quick turnaround on my question, Charles. I’ve been toying with the summation of a signed square of the inner product (an adaptation of what appears on page 12 in “Resolving the sign ambiguity in the singular value decomposition” of 2007 by R. Bro, E. Acar, and T. Kolda at Sandia National Laboratories, an article I sourced recently via an online search).

    So far I’ve simply been trying to replicate in Matlab the eVECTOR sign results for one covariance matrix, because they seem to make sense relative to the ‘geometry’ of the data (i.e., results contain very few negative eigenvector components). The eigenvector signs coming out of Matlab using its standard functions seem inside out (lots of negative components, including those of the first eigenvector), so to speak, for the matrix I’m currently considering.

    I haven’t gotten to the time-series aspect yet. My over-arching goal is to feed my data into Matlab and then just let-her-rip with the calculations. I’m interested in doing this to avoid fiddling around with the many Excel cells, sheets, and formula-entry steps that would be needed (too many opportunities for human error).

    If the Bro et al. approach leads to an eventual dead-end, I’m going to look into the ‘Eigenshuffle’ approach that John D’Errico takes, as documented at the Matlab Central File Exchange. It looks like it involves finding distances between the sequence of matrices.

    These are just possible leads. I hope what I’ve written makes sense! If it does, could you please explain it to me, too? Ha ha.

    • Charles says:

      Sarah,

      Thanks for your explanation. I expect to come back to the issue of eigenvalues/eigenvectors later this year, at which time I will follow up on the approaches that you referenced.

      My immediate priority is to release a new version of the website this week with a focus on resampling and some bug fixes. I also plan to release an online book on statistics to accompany this website, which I hope to make available this month.

      Charles

  3. Sarah says:

    Great site and software, Charles. How does eVECTORS handle the sign indeterminacy issue that is intrinsic to eigenvectors? Some call it the eigenvector sign-flip problem. In a purely mathematical sense, the phenomenon doesn’t matter at all. But if you’re doing data analysis on a ‘time series’ of covariance matrices, consistent eigenvector sign results are desirable.

    • Charles says:

      Sarah,
      Thank you for your comment. Thus far the eVECTORS function doesn’t do anything special about the sign. The signs are whatever are produced by the algorithm. What is your suggestion regarding how the sign ambiguity should be resolved?
      Charles

  4. Yuri says:

    If the characteristic equation of a matrix has complex eigenvalues, the QR algorithm does not converge. In this case the eVECTORS function produces some numbers which may be mistaken for real eigenvalues and eigenvectors, which do not exist. The free add-in “xnumbers” produces a “?” mark if it is not possible to evaluate an operation correctly. It would be great if eVECTORS (and other functions as well) produced some alert that the result of the operation is not correct.

    Anyway, excellent software, thanks a lot.

    Yuri

    • Charles says:

      Yuri,
      Excellent point. I had thought of doing that but since the only time I need to find eigenvalues is for covariance/correlation matrices, which are symmetric and so always have real eigenvalues/vectors, I didn’t bother. In any case I will look into this further shortly.
      Charles

    • Charles says:

      Yuri,
      I have addressed your point in Release 2.12 of the software which was issued yesterday. The eVALUES function will produce the values det(A – cI), and if this value is far from zero then c is not a real eigenvalue. The eVECTORS function will change imaginary eigenvalues to #N/A.
      Charles
