**Definition 1**: Given a square matrix *A*, an **eigenvalue** is a scalar *λ* such that det (*A – λI*) = 0, where *A* is a *k × k* matrix and *I* is the *k × k* identity matrix. The eigenvalue with the largest absolute value is called the **dominant eigenvalue**.

**Observation**: det (*A – λI*) = 0 expands into a *k*th degree polynomial equation in the unknown *λ* called the **characteristic equation**. The polynomial itself is called the **characteristic polynomial**. The solutions to the characteristic equation are the eigenvalues. Since, based on the fundamental theorem of algebra, any *k*th degree polynomial *p*(*x*) has *k* roots (i.e. solutions to the equation *p*(*x*) = 0), counting repeated roots and allowing complex ones, we conclude that any *k × k* matrix has *k* eigenvalues in this sense.

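**Example 1**: Find the eigenvalues for matrix *A*

$$A = \begin{bmatrix} 13 & 5 \\ 2 & 4 \end{bmatrix}$$

$$\det(A - \lambda I) = \begin{vmatrix} 13-\lambda & 5 \\ 2 & 4-\lambda \end{vmatrix} = (13-\lambda)(4-\lambda) - 10 = \lambda^2 - 17\lambda + 42 = 0$$

This is the characteristic equation. Solving for *λ*, we have the eigenvalues *λ* = 3 and *λ* = 14.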

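**Observation**: Let $A = \begin{bmatrix} a & b \\ c & d \end{bmatrix}$. Then

$$\det(A - \lambda I) = (a-\lambda)(d-\lambda) - bc = \lambda^2 - (a+d)\lambda + (ad - bc) = \lambda^2 - (\operatorname{trace} A)\,\lambda + \det A$$

Now let *λ*_{1} and *λ*_{2} be the eigenvalues. Then (*λ* – *λ*_{1})(*λ* – *λ*_{2}) = 0, and so *λ*^{2} – (*λ*_{1} + *λ*_{2})*λ* + *λ*_{1}*λ*_{2} = 0. Comparing coefficients with the expansion above, *λ*_{1} + *λ*_{2} = trace *A* and *λ*_{1}*λ*_{2} = det *A*.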

This is consistent with the eigenvalues from Example 1.
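As a quick numerical check of this observation, the sketch below solves *λ*^{2} – (trace *A*)*λ* + det *A* = 0 for a 2 × 2 matrix. It assumes *A* has rows (13, 5) and (2, 4), the matrix implied by the equations in Example 2. The function name `eigenvalues_2x2` is illustrative only, and the sketch handles only the case of real roots.

```python
import math

def eigenvalues_2x2(a, b, c, d):
    """Real eigenvalues of [[a, b], [c, d]] from its characteristic equation.

    Solves lambda^2 - (trace)*lambda + det = 0 with the quadratic formula.
    Raises ValueError when the roots are complex.
    """
    trace = a + d
    det = a * d - b * c
    disc = trace * trace - 4 * det
    if disc < 0:
        raise ValueError("eigenvalues are complex")
    root = math.sqrt(disc)
    return (trace - root) / 2, (trace + root) / 2

lam1, lam2 = eigenvalues_2x2(13, 5, 2, 4)
print(lam1, lam2)          # 3.0 14.0
print(lam1 + lam2)         # 17.0, the trace of A
print(lam1 * lam2)         # 42.0, the determinant of A
```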

In fact, these properties are true in general, not just for 2 × 2 matrices.

**Property 1**:

- The product of all the eigenvalues of *A* = det *A*
- The sum of all the eigenvalues of *A* = trace *A*
- A square matrix is invertible if and only if none of its eigenvalues is zero.
- The eigenvalues of an upper triangular matrix (including a diagonal matrix) are the entries on the main diagonal

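Proof:

a) By definition, each eigenvalue is a root of the characteristic equation det(*A – λI*) = 0. By Definition 1 of Determinants and Simultaneous Linear Equations, det(*A – λI*) can be expressed as a *k*th degree polynomial in the form $a_k\lambda^k + a_{k-1}\lambda^{k-1} + \cdots + a_1\lambda + a_0$ where $a_k = (-1)^k$. Thus by the fundamental theorem of algebra,

$$\det(A - \lambda I) = \prod_{i=1}^{k} (\lambda_i - \lambda)$$

where $\lambda_1, \ldots, \lambda_k$ are the eigenvalues of *A* (here we treat the $\lambda_i$ as constants and $\lambda$ as a variable). Setting $\lambda = 0$ yields

$$\det A = \lambda_1 \lambda_2 \cdots \lambda_k$$

b) Note that if you expand the terms of $\det(A - \lambda I) = \prod_{i=1}^{k} (\lambda_i - \lambda)$ you get a polynomial in which the coefficient of $\lambda^{k-1}$ is

$$(-1)^{k-1}(\lambda_1 + \lambda_2 + \cdots + \lambda_k)$$

If you expand the terms of det(*A – λI*) as a determinant, the only product that contributes to $\lambda^{k-1}$ is $(a_{11} - \lambda)(a_{22} - \lambda) \cdots (a_{kk} - \lambda)$, since every other product omits at least two diagonal entries. You will find that the coefficient of $\lambda^{k-1}$ is

$$(-1)^{k-1}(a_{11} + a_{22} + \cdots + a_{kk})$$

Thus, trace $A = a_{11} + a_{22} + \cdots + a_{kk} = \lambda_1 + \lambda_2 + \cdots + \lambda_k$.

c) Results from (a) and Property 4 of Determinants and Simultaneous Linear Equations.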

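d) If *A* is an upper triangular matrix, then so is *A – λI*. Since the determinant of a triangular matrix is the product of its diagonal entries, det(*A – λI*) is the product of the terms *a_{ii}* – *λ*, which is zero exactly when *λ* equals one of the diagonal entries. The result now follows from Definition 1 of Determinants and Simultaneous Linear Equations.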
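The parts of Property 1 can be checked numerically on small matrices. The following is a minimal sketch, assuming the 2 × 2 matrix with rows (13, 5) and (2, 4) from Example 1 and a made-up upper triangular matrix `U`. The helper names `det` and `shifted` are illustrative, not part of any library.

```python
def det(m):
    """Determinant by cofactor expansion along the first row (small matrices only)."""
    n = len(m)
    if n == 1:
        return m[0][0]
    total = 0
    for j in range(n):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * m[0][j] * det(minor)
    return total

def shifted(m, lam):
    """Return the matrix m - lam * I."""
    n = len(m)
    return [[m[i][j] - (lam if i == j else 0) for j in range(n)] for i in range(n)]

# (a), (b): assumed matrix from Example 1, with eigenvalues 3 and 14
A = [[13, 5], [2, 4]]
eigs = [3, 14]
print(det(A) == eigs[0] * eigs[1])        # True: product of eigenvalues = det A
print(A[0][0] + A[1][1] == sum(eigs))     # True: sum of eigenvalues = trace A

# (c): no eigenvalue is zero, and det A is non-zero, so A is invertible
print(det(A) != 0)                        # True

# (d): each diagonal entry of an upper triangular matrix is an eigenvalue
U = [[2, 7, 1], [0, 5, 3], [0, 0, -4]]
print([det(shifted(U, U[i][i])) for i in range(3)])   # [0, 0, 0]
```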

**Definition 2**: If *λ* is an eigenvalue of the *k × k* matrix *A*, then a non-zero *k* × 1 matrix *X* is an **eigenvector** which corresponds to *λ* provided (*A – λI*)*X* = 0, where 0 is the *k* × 1 null matrix (i.e. 0’s in all positions).

**Property 2**: Every eigenvalue of a square matrix has an infinite number of corresponding eigenvectors.

Proof: Let *λ* be an eigenvalue of a *k × k* matrix *A*. Thus by Definition 1, det (*A – λI*) = 0, and so by Properties 3 and 4 of Rank of a Matrix, (*A – λI*)*X* = 0 has an infinite number of solutions. Each such non-zero solution is an eigenvector.

**Property 3**: *X* is an eigenvector corresponding to eigenvalue λ if and only if *AX = λX*. If *X* is an eigenvector corresponding to *λ*, then every non-zero scalar multiple of *X* is also an eigenvector corresponding to *λ*.

Proof: Let *λ* be an eigenvalue of a *k × k* matrix *A* and let *X* be an eigenvector corresponding to *λ*. Then as we saw in the proof of Property 2, (*A – λI*)*X* = 0, an assertion which is equivalent to *AX = λX*. The converse is obvious.

Now let *c* be a non-zero scalar. Then *A*(*cX*) = *c*(*AX*) = *c*(*λX*) = *λ*(*cX*), and so *cX* is also an eigenvector.

**Property 4**: If *λ* is an eigenvalue of an invertible matrix *A* then *λ*^{-1} is an eigenvalue of *A*^{-1}. Thus, the eigenvalue of *A* with the smallest absolute value is the reciprocal of the dominant eigenvalue of *A*^{-1}.

Proof: If *λ* is an eigenvalue, then there is a vector *X* ≠ 0, such that *AX = λX*. Thus *A*^{-1}*AX* = *A*^{-1}*λX*, and so *X* = *A*^{-1} *λX* = *λA*^{-1}*X*. Dividing both sides of the equation by *λ* yields the result *λ*^{-1}*X* = *A*^{-1}*X*.

**Property 5**: If *λ* is an eigenvalue of the *k × k* matrix *A* and *X* is a corresponding eigenvector, then 1 + *λ* is an eigenvalue of *I + A* with corresponding eigenvector *X*.

Proof: Since (*A – λI*) *X* = 0, we have ((*I + A*) – (1 + *λ*)*I*) *X* = *X + AX – X – λX* = *AX – λX* = (*A – λI*) *X* = 0.
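Properties 3, 4 and 5 can be verified with exact arithmetic on a small case. The sketch below assumes the matrix with rows (13, 5) and (2, 4) from Example 1 and the eigenpair *λ* = 3, *X* = (1, –2). The helpers `matvec` and `inverse_2x2` are illustrative names.

```python
from fractions import Fraction

def matvec(m, x):
    """Multiply a square matrix (list of rows) by a column vector (list)."""
    return [sum(m[i][j] * x[j] for j in range(len(x))) for i in range(len(m))]

def inverse_2x2(m):
    """Inverse of an invertible 2 x 2 matrix, using exact fractions."""
    (a, b), (c, d) = m
    det = Fraction(a * d - b * c)
    return [[d / det, -b / det], [-c / det, a / det]]

A = [[13, 5], [2, 4]]       # assumed matrix from Example 1
lam, X = 3, [1, -2]         # an eigenpair of A

# Property 3: AX = lam*X, and the same holds for a non-zero multiple cX
cX = [-7 * x for x in X]
prop3 = matvec(A, X) == [lam * x for x in X] and matvec(A, cX) == [lam * x for x in cX]

# Property 4: 1/lam is an eigenvalue of the inverse, with the same eigenvector
A_inv = inverse_2x2(A)
prop4 = matvec(A_inv, X) == [Fraction(1, lam) * x for x in X]

# Property 5: 1 + lam is an eigenvalue of I + A, with the same eigenvector
I_plus_A = [[A[i][j] + (1 if i == j else 0) for j in range(2)] for i in range(2)]
prop5 = matvec(I_plus_A, X) == [(1 + lam) * x for x in X]

print(prop3, prop4, prop5)  # True True True
```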

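**Example 2**: Find the eigenvectors for *A* from Example 1.

By Definition 2, an eigenvector $X = \begin{bmatrix} x \\ y \end{bmatrix}$ satisfies (*A – λI*)*X* = 0, i.e.

$$\begin{bmatrix} 13-\lambda & 5 \\ 2 & 4-\lambda \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}$$

This is equivalent to

(13–*λ*)*x* + 5*y* = 0

2*x* + (4–*λ*)*y* = 0

For *λ* = 3

10*x* + 5*y* = 0

2*x* + *y* = 0

Thus, *y* = –2*x*, which means $\begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} 1 \\ -2 \end{bmatrix}$ or any scalar multiple.

For *λ* = 14

–*x* + 5*y* = 0

2*x* – 10*y* = 0

Thus, *x* = 5*y*, which means $\begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} 5 \\ 1 \end{bmatrix}$ or any scalar multiple.

We now find the eigenvectors with unit length.

For eigenvalue *λ* = 3, an eigenvector is $\begin{bmatrix} 1 \\ -2 \end{bmatrix}$. Its distance from the origin is $\sqrt{1^2 + (-2)^2} = \sqrt{5}$, and so $\begin{bmatrix} 1/\sqrt{5} \\ -2/\sqrt{5} \end{bmatrix}$ is an eigenvector of *λ* = 3 of unit length.

For *λ* = 14, an eigenvector is $\begin{bmatrix} 5 \\ 1 \end{bmatrix}$. Its distance from the origin is $\sqrt{5^2 + 1^2} = \sqrt{26}$, and so $\begin{bmatrix} 5/\sqrt{26} \\ 1/\sqrt{26} \end{bmatrix}$ is an eigenvector of *λ* = 14 of unit length.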
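The steps of Example 2 can be sketched in code for the 2 × 2 case. This is an illustration under stated assumptions, not a general eigenvector routine: it assumes *A* has rows (13, 5) and (2, 4), that `lam` really is an eigenvalue, and the function names `eigenvector_2x2` and `unit` are made up for the example.

```python
import math

def eigenvector_2x2(m, lam):
    """A non-zero solution X of (m - lam*I) X = 0 for a 2 x 2 matrix m.

    The first row (a - lam) x + b y = 0 is solved by (b, lam - a); when that
    row is all zeros the second row is used instead.
    """
    (a, b), (c, d) = m
    if b != 0 or a != lam:
        return [b, lam - a]
    if c != 0 or d != lam:
        return [lam - d, c]
    return [1, 0]   # m - lam*I is the zero matrix, so any non-zero vector works

def unit(x):
    """Scale x to Euclidean length 1."""
    norm = math.sqrt(sum(v * v for v in x))
    return [v / norm for v in x]

A = [[13, 5], [2, 4]]
for lam in (3, 14):
    X = eigenvector_2x2(A, lam)
    print(lam, X, unit(X))
# lam = 3 gives [5, -10], a multiple of (1, -2); lam = 14 gives [5, 1]
```

Any non-zero multiple of the returned vector is an equally valid eigenvector, so the sign and scale of the output before normalization carry no meaning.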

**Property 6**: If *X* is an eigenvector of *A* then the **Rayleigh Quotient** = (*X*^{T}*AX*) / (*X*^{T}*X*) is the corresponding eigenvalue.

Proof: (*X*^{T}*AX*) / (*X*^{T}*X*) = (*X*^{T}*λX*) / (*X*^{T}*X*) = *λ* (*X*^{T}*X*) / (*X*^{T}*X*) = *λ* where *λ* is the eigenvalue that corresponds to *X*

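**Example 3**: Find the eigenvalues for the two unit eigenvectors from Example 2.

If *AX = λX*, then (*A – λI*)*X* = 0, and so *λ* is an eigenvalue corresponding to the eigenvector *X*. Since

$$\begin{bmatrix} 13 & 5 \\ 2 & 4 \end{bmatrix} \begin{bmatrix} 1/\sqrt{5} \\ -2/\sqrt{5} \end{bmatrix} = \begin{bmatrix} 3/\sqrt{5} \\ -6/\sqrt{5} \end{bmatrix} = 3 \begin{bmatrix} 1/\sqrt{5} \\ -2/\sqrt{5} \end{bmatrix}
\qquad
\begin{bmatrix} 13 & 5 \\ 2 & 4 \end{bmatrix} \begin{bmatrix} 5/\sqrt{26} \\ 1/\sqrt{26} \end{bmatrix} = \begin{bmatrix} 70/\sqrt{26} \\ 14/\sqrt{26} \end{bmatrix} = 14 \begin{bmatrix} 5/\sqrt{26} \\ 1/\sqrt{26} \end{bmatrix}$$

it follows that *λ* = 3 and *λ* = 14 are the two corresponding eigenvalues.

Another approach is to use Rayleigh Quotient = (*X*^{T}*AX*) / (*X*^{T}*X*) per Property 6. For example, for the eigenvector $X = \begin{bmatrix} 1 \\ -2 \end{bmatrix}$

$$X^T A X = \begin{bmatrix} 1 & -2 \end{bmatrix} \begin{bmatrix} 13 & 5 \\ 2 & 4 \end{bmatrix} \begin{bmatrix} 1 \\ -2 \end{bmatrix} = \begin{bmatrix} 1 & -2 \end{bmatrix} \begin{bmatrix} 3 \\ -6 \end{bmatrix} = 15
\qquad
X^T X = 1^2 + (-2)^2 = 5$$

Thus, *λ* = 15/5 = 3.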
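The Rayleigh Quotient calculation is short enough to write out directly. The sketch below assumes *A* has rows (13, 5) and (2, 4) and uses the eigenvectors (1, –2) and (5, 1) from Example 2. The function name `rayleigh_quotient` is illustrative.

```python
def rayleigh_quotient(m, x):
    """Return (x^T m x) / (x^T x) for a square matrix m and non-zero vector x."""
    mx = [sum(m[i][j] * x[j] for j in range(len(x))) for i in range(len(m))]
    return sum(a * b for a, b in zip(x, mx)) / sum(v * v for v in x)

A = [[13, 5], [2, 4]]
print(rayleigh_quotient(A, [1, -2]))   # 15 / 5 = 3.0
print(rayleigh_quotient(A, [5, 1]))    # 364 / 26 = 14.0
```

Because the scale of *X* cancels between numerator and denominator, the quotient is the same for any non-zero multiple of an eigenvector, so there is no need to normalize first.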

**Property 7**: If all the eigenvalues of a square matrix *A* are distinct then any set of eigenvectors corresponding to these eigenvalues is independent.

Proof: We prove the result by induction on *k*. Suppose $\lambda_1, \ldots, \lambda_k$ are the eigenvalues of *A* and suppose they are all distinct. Suppose that $X_1, \ldots, X_k$ are the corresponding eigenvectors and $b_1 X_1 + \cdots + b_k X_k = 0$. Thus,

$$0 = A0 = b_1 A X_1 + \cdots + b_k A X_k = b_1 \lambda_1 X_1 + \cdots + b_k \lambda_k X_k$$

But also, multiplying $b_1 X_1 + \cdots + b_k X_k = 0$ by $\lambda_1$,

$$0 = b_1 \lambda_1 X_1 + \cdots + b_k \lambda_1 X_k$$

Subtracting these linear combinations from each other, we get:

$$0 = b_2 (\lambda_2 - \lambda_1) X_2 + \cdots + b_k (\lambda_k - \lambda_1) X_k$$

Since this is a linear combination of *k* – 1 of the eigenvectors, by the induction hypothesis, $b_2(\lambda_2 - \lambda_1) = 0, \ldots, b_k(\lambda_k - \lambda_1) = 0$. But since all the $\lambda_i$ are distinct, the expressions in parentheses are non-zero, and so $b_2 = \cdots = b_k = 0$. Thus $0 = b_1 X_1 + \cdots + b_k X_k = b_1 X_1 + 0 + \cdots + 0 = b_1 X_1$. Since $X_1$ is an eigenvector, $X_1 \neq 0$, and so $b_1 = 0$. This proves that all the $b_i = 0$, and so $X_1, \ldots, X_k$ are independent.

**Property 8**: If the eigenvalues of a square *k × k* matrix *A* are distinct, then any set of *k* eigenvectors, one for each of these eigenvalues, is a basis for the set of all *k* × 1 column vectors (and so any *k* × 1 vector can be expressed uniquely as a linear combination of these eigenvectors).

Proof: The result follows from Corollary 4 of Linear Independent Vectors and Property 7.
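Properties 7 and 8 can be illustrated on a 2 × 2 case. The sketch below assumes the eigenvectors (1, –2) and (5, 1) of the matrix with rows (13, 5) and (2, 4), which belong to the distinct eigenvalues 3 and 14. It checks independence through a non-zero determinant and then finds the unique coefficients of an arbitrary vector by Cramer's rule. The function name `coordinates` and the test vector (7, 8) are made up for the example.

```python
from fractions import Fraction

X1, X2 = [1, -2], [5, 1]    # eigenvectors for the distinct eigenvalues 3 and 14

# Independence: the matrix with columns X1 and X2 has a non-zero determinant
det_P = X1[0] * X2[1] - X2[0] * X1[1]
print(det_P)                # 11

def coordinates(v):
    """Solve b1*X1 + b2*X2 = v for (b1, b2) by Cramer's rule."""
    b1 = Fraction(v[0] * X2[1] - X2[0] * v[1], det_P)
    b2 = Fraction(X1[0] * v[1] - v[0] * X1[1], det_P)
    return b1, b2

b1, b2 = coordinates([7, 8])
print(b1, b2)               # -3 2
print([b1 * X1[i] + b2 * X2[i] for i in range(2)] == [7, 8])   # True
```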

**Definition 3**: A square matrix *A* is **diagonalizable** if there exist an invertible matrix *P* and a diagonal matrix *D* such that *A = PDP*^{-1}.

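**Property 9**: For any square matrix *A* and invertible matrix *P*, *A* and *B = PAP*^{-1} have the same eigenvalues.

Proof: This follows since *A* and *B* have the same characteristic equation:

$$\det(B - \lambda I) = \det(PAP^{-1} - \lambda PP^{-1}) = \det\big(P(A - \lambda I)P^{-1}\big) = \det P \cdot \det(A - \lambda I) \cdot \det P^{-1} = \det(A - \lambda I)$$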

**Property 10**: An *n × n* matrix is diagonalizable if and only if it has *n* linearly independent eigenvectors.

Proof: First we show that if *A* is diagonalizable then *A* has *n* linearly independent eigenvectors.

Suppose $A = PDP^{-1}$ where $D = [d_{ij}]$ is a diagonal matrix and *P* is invertible. Thus $AP = PD$. Let $P_j$ be the *j*th column of *P*. Thus the *j*th column of *AP* is $AP_j$ and the *j*th column of *PD* is $d_{jj}P_j$. Since $AP = PD$, it follows that $AP_j = d_{jj}P_j$, which means that $d_{jj}$ is an eigenvalue of *A* with corresponding eigenvector $P_j$.

Since *P* is invertible, by Property 9, rank(*P*) = *n*, and so the columns of *P* are linearly independent. But the columns of *P* are the eigenvectors, thus showing that the eigenvectors are independent.

Next we show that if *A* has *n* linearly independent eigenvectors then *A* is diagonalizable.

Let $P_1, \ldots, P_n$ be the *n* linearly independent eigenvectors of *A* and let *P* be the *n × n* matrix whose columns are the $P_j$. Since its columns are independent, *P* is invertible. Let $\lambda_1, \ldots, \lambda_n$ be the corresponding eigenvalues of *A* and let $D = [d_{ij}]$ be the diagonal matrix whose main diagonal contains these eigenvalues. Since $AP_j = \lambda_j P_j$, it follows that the *j*th column of *AP* = the *j*th column of *PD* and so $AP = PD$, from which it follows that $A = PDP^{-1}$.

**Observation**: The proof of Property 10 shows that for any square matrix *A*, if *A = PDP*^{-1} for some diagonal matrix *D* and invertible matrix *P*, then the main diagonal of *D* consists of the eigenvalues of *A* and the columns of *P* are corresponding eigenvectors.
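A concrete diagonalization makes this observation easy to verify. The sketch below assumes the matrix with rows (13, 5) and (2, 4) from Example 1, builds *P* from the eigenvectors (1, –2) and (5, 1) and *D* from the eigenvalues 3 and 14, and checks that *PDP*^{-1} reproduces *A* exactly. The helper `matmul` is an illustrative name.

```python
from fractions import Fraction

def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

P = [[1, 5], [-2, 1]]       # columns are eigenvectors for 3 and 14
D = [[3, 0], [0, 14]]       # matching eigenvalues on the main diagonal

# 2 x 2 inverse via the adjugate; det P = 11 is non-zero, so P is invertible
det_P = Fraction(P[0][0] * P[1][1] - P[0][1] * P[1][0])
P_inv = [[P[1][1] / det_P, -P[0][1] / det_P],
         [-P[1][0] / det_P, P[0][0] / det_P]]

A_rebuilt = matmul(matmul(P, D), P_inv)
print(A_rebuilt == [[13, 5], [2, 4]])    # True
```

The order of the columns of *P* must match the order of the eigenvalues in *D*; swapping one without the other gives a different matrix.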

**Property 11**: If *A* has *n* distinct real eigenvalues then *A* is diagonalizable.

Proof: The result follows from Properties 7 and 10.

**Observation**: The converse of Property 11 is not true. E.g. the *n × n* identity matrix is trivially diagonalizable but it does not have *n* distinct eigenvalues.

**Real Statistics Functions**: The Real Statistics Resource Pack provides the following supplemental functions, where R1 is a *k × k* range in Excel.

**eVALUES**(R1, *iter, order*): produces a 2 × *k* array whose first row contains the eigenvalues of matrix *A* in range R1. Below each eigenvalue *λ* is the value det(*A – λI*).

**eVECTORS**(R1, *iter, order*): returns an *n*+3 × *n* range, where *n* = the number of rows/columns in the square range R1. The first row of the output consists of the real eigenvalues of the square matrix *A* corresponding to the data in R1. Below each eigenvalue *λ* in the first row is a unit *n* × 1 eigenvector corresponding to *λ*. In the second-to-last row of the output are the values det(*A – λI*). In the last row of the output, below each eigenvalue *λ* and eigenvector *X* is the value max {*b_{i}*: *i* = 1 to *n*} where *B = AX – λX*.

The eVALUES function will produce all real eigenvalues (but not any imaginary eigenvalues). The eVECTORS function only works reliably for symmetric matrices, which are the only ones for which we will need to calculate eigenvalues and eigenvectors in this website. When the matrix in range R1 is not symmetric you can use the eVECT function described in Eigenvectors of Non-symmetric Matrices.

Since the calculation of these functions uses iterative techniques, you can optionally specify the number of iterations used via the *iter* parameter. If the *iter* parameter is not used then it defaults to 100 iterations. For some matrices the value of *iter* must be increased to obtain sufficiently accurate eigenvalues or eigenvectors.

If *order* is TRUE or omitted then the eigenvalues are listed in order from highest in absolute value to smallest. If *order* is FALSE then they are listed in order from highest to lowest.

The eigenvectors produced by eVECTORS(R1) are all orthogonal, as described in Definition 8 of Matrix Operations. See Figure 5 of Principal Component Analysis for an example of the output from the eVECTORS function.
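To give a feel for what an iterative calculation of this kind involves, here is a minimal sketch of the textbook unshifted QR iteration for a symmetric matrix. It is not the Real Statistics implementation, whose internals are not shown on this page; the names `qr`, `qr_eigen` and `iters` are hypothetical, the default of 100 iterations simply mirrors the *iter* default described above, and the Gram-Schmidt step assumes the matrix is non-singular.

```python
import math

def transpose(m):
    return [list(col) for col in zip(*m)]

def matmul(a, b):
    bt = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]

def qr(m):
    """QR factorisation by modified Gram-Schmidt; assumes independent columns."""
    n = len(m)
    q_cols = []
    r = [[0.0] * n for _ in range(n)]
    for j, col in enumerate(transpose(m)):
        v = list(col)
        for i, q in enumerate(q_cols):
            r[i][j] = sum(a * b for a, b in zip(q, v))
            v = [a - r[i][j] * b for a, b in zip(v, q)]
        r[j][j] = math.sqrt(sum(a * a for a in v))
        q_cols.append([a / r[j][j] for a in v])
    return transpose(q_cols), r

def qr_eigen(a, iters=100):
    """Unshifted QR iteration for a symmetric matrix.

    Returns (eigenvalues, matrix whose columns are unit eigenvectors, residuals),
    where each residual is the largest |component| of AX - lambda*X.
    """
    n = len(a)
    ak = [row[:] for row in a]
    vecs = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(iters):
        q, r = qr(ak)
        ak = matmul(r, q)         # similar to the previous ak: same eigenvalues
        vecs = matmul(vecs, q)    # accumulate the orthogonal factors
    eigs = [ak[i][i] for i in range(n)]
    residuals = []
    for j in range(n):
        x = [vecs[i][j] for i in range(n)]
        ax = [sum(a[i][k] * x[k] for k in range(n)) for i in range(n)]
        residuals.append(max(abs(ax[i] - eigs[j] * x[i]) for i in range(n)))
    return eigs, vecs, residuals

S = [[2.0, 1.0], [1.0, 2.0]]      # symmetric, with eigenvalues 3 and 1
eigs, vecs, residuals = qr_eigen(S)
print(eigs)                        # approximately [3.0, 1.0]
print(residuals)                   # both close to zero
```

The residual plays the same role as the last row of the eVECTORS output: a value far from zero signals that the iteration has not converged to a genuine eigenpair, in which case more iterations (or a different method) are needed.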

**Real Statistics Data Analysis Tool**: The **Matrix** data analysis tool contains an **Eigenvalues/vectors** option that computes the eigenvalues and eigenvectors of the matrix in the Input Range. See Figure 3 of **Matrix Operations** for an example of the use of this tool.

**Observation**: Every square *k × k* matrix has at most *k* (real) eigenvalues. If *A* is symmetric then it has *k* (real) eigenvalues, although these don’t need to be distinct (see Symmetric Matrices). It turns out that the eigenvalues for covariance and correlation matrices are always non-negative (see Positive Definite Matrices).

If *A* is not symmetric, then some of the eigenvalues may be complex numbers. The values output by eVALUES corresponding to complex eigenvalues will not be correct. These complex eigenvalues occur in conjugate pairs, and so for example a 3 × 3 matrix will have either 1 or 3 real eigenvalues, never 2.

You can always check that the values produced by eVALUES are real eigenvalues by checking whether det (*A – λI*) = 0 (using the second row of the output from eVALUES).
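This check is easy to reproduce outside Excel. The sketch below uses a made-up non-symmetric 3 × 3 matrix `B` whose characteristic polynomial is (2 – *λ*)(*λ*^{2} + 1), so it has one real eigenvalue (2) and a complex pair. The helper names `det`, `char_value` and `is_real_eigenvalue`, and the tolerance, are assumptions for the illustration.

```python
def det(m):
    """Determinant by cofactor expansion along the first row (small matrices only)."""
    n = len(m)
    if n == 1:
        return m[0][0]
    total = 0
    for j in range(n):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * m[0][j] * det(minor)
    return total

def char_value(m, lam):
    """Return det(m - lam*I), the characteristic polynomial evaluated at lam."""
    n = len(m)
    return det([[m[i][j] - (lam if i == j else 0) for j in range(n)]
                for i in range(n)])

def is_real_eigenvalue(m, lam, tol=1e-8):
    """True when det(m - lam*I) is zero to within the tolerance."""
    return abs(char_value(m, lam)) < tol

B = [[0, -1, 0], [1, 0, 0], [0, 0, 2]]
print(char_value(B, 2))             # 0, so 2 is a real eigenvalue
print(char_value(B, 1))             # 2, so 1 is not an eigenvalue
print(is_real_eigenvalue(B, 2), is_real_eigenvalue(B, 1))   # True False
```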

Since elsewhere in this website we will only use eVALUES and eVECTORS where *A* is a covariance or correlation matrix, and so is symmetric, you don’t really need to worry about these details.

If the characteristic equation of a matrix has complex roots, the QR algorithm does not converge. In this case the eVECTORS function produces some numbers, which may be confusing since real eigenvalues and eigenvectors do not exist. The free add-in “xnumbers” produces a “?” mark if it is not possible to evaluate an operation correctly. It would be great if eVECTORS (and other functions as well) produced some alert that the result of the operation is not correct.

Anyway, excellent software, thanks a lot.

Yuri

Yuri,

Excellent point. I had thought of doing that but since the only time I need to find eigenvalues are for covariance/correlation matrices, which are symmetric and so always have real eigenvalues/vectors, I didn’t bother. In any case I will look into this further shortly.

Charles

Yuri,

I have addressed your point in Release 2.12 of the software which was issued yesterday. The eVALUES function will produce the values det(A-cI) and if it is far from zero then c is not a real eigenvalue. The eVECTORS function will change imaginary eigenvalues to #N/A.

Charles

Great site and software, Charles. How does eVECTORS handle the sign indeterminacy issue that is intrinsic to eigenvectors? Some call it the eigenvector sign-flip problem. In a purely mathematical sense, the phenomenon doesn’t matter at all. But if you’re doing data analysis on a ‘time series’ of covariance matrices, consistent eigenvector sign results are desirable.

Sarah,

Thank you for your comment. Thus far the eVECTORS function doesn’t do anything special about the sign. The signs are whatever are produced by the algorithm. What is your suggestion regarding how the sign ambiguity should be resolved?

Charles

Thanks for the quick turnaround on my question, Charles. I’ve been toying with the summation of a signed square of the inner product (an adaption of what appears on page 12 in “Resolving the sign ambiguity in the singular value decomposition” of 2007 by R. Bro, E. Acar, and T. Kolda at Sandia National Laboratories, an article I sourced recently via an online search).

So far I’ve simply been trying to replicate in Matlab the eVECTOR sign results for one covariance matrix, because they seem to make sense relative to the ‘geometry’ of the data (i.e., results contain very few negative eigenvector components). The eigenvector signs coming out of Matlab using its standard functions seem inside out (lots of negative components, including those of the first eigenvector), so to speak, for the matrix I’m currently considering.

I haven’t gotten to the time-series aspect yet. My over-arching goal is to feed my data into Matlab and then just let-her-rip with the calculations. I’m interested in doing this to avoid fiddling around with the many Excel cells, sheets, and formula-entry steps that would be needed (too many opportunities for human error).

If the Bro et al. approach leads to an eventual dead-end, I’m going to look into the ‘Eigenshuffle’ approach that John D’Errico takes, as documented at the Matlab Central File Exchange. It looks like it involves finding distances between the sequence of matrices.

These are just possible leads. I hope what I’ve written makes sense! If it does, could you please explain it to me, too? Ha ha.

Sarah,

Thanks for your explanation. I expect to come back to the issue of eigenvalues/eigenvectors later this year, at which time I will follow up on the approaches that you referenced.

My immediate priority is to release a new version of the website this week with a focus on resampling and some bug fixes. I also plan to release an online book on statistics to accompany this website, which I hope to make available this month.

Charles

You’re welcome. Good luck with both of those releases. I hope they go smoothly for you.

Hi Charles,

Big users of your extremely helpful add-in. It is the best-kept secret for industrial statisticians, and I can’t believe I haven’t been using this sooner.

I just wanted to confirm that the output of the n+3 by n area from the eVECTORS function is in fact using a QR decomposition process, comparable to the Francis/Kublanovskaya algorithm?

Thank you again for all of your very important work.

Hi Victor,

The eVECTORS function does use the QR decomposition.

The eVECT function uses the QR decomposition as well as the Schur decomposition.

Charles

Thank you, Charles. Much obliged for your quick response.