# Orthogonal Vectors and Matrices

Observation: As we observed in Matrix Operations, two non-null vectors X = [xi] and Y = [yi] of the same shape are orthogonal if their dot product is 0, i.e. 0 = X ∙ Y = $\sum_{i=1}^n x_i y_i$. Note that if X and Y are n × 1 column vectors, then X ∙ Y = XTY = YTX, while if X and Y are 1 × n row vectors, then X ∙ Y = XYT = YXT. It is easy to see that (cX) ∙ Y = c(X ∙ Y), (X + Y) ∙ Z = X ∙ Z + Y ∙ Z, X ∙ X = $\sum_{i=1}^n x_i^2$ > 0 and other similar properties of the dot product.
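These properties are easy to check numerically; a minimal sketch with NumPy (the vectors here are illustrative):

```python
import numpy as np

# Illustrative non-null vectors of the same shape
X = np.array([1.0, 2.0, -1.0])
Y = np.array([3.0, -1.0, 1.0])

print(np.dot(X, Y))   # 1*3 + 2*(-1) + (-1)*1 = 0, so X and Y are orthogonal
print(np.dot(X, X))   # sum of squares = 6 > 0, as for any non-null vector
```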

Property 1: If A is an m × n matrix, X is an n × 1 vector and Y is an m × 1 vector, then

(AX) ∙ Y = X ∙ (ATY)

Proof: (AX) ∙ Y = (AX)TY = (XTAT)Y = XT(ATY) = X ∙ (ATY)
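Property 1 is easy to confirm numerically; a quick sketch using arbitrary random matrices and vectors:

```python
import numpy as np

rng = np.random.default_rng(42)
A = rng.standard_normal((4, 3))   # m x n matrix
X = rng.standard_normal(3)        # n x 1 vector
Y = rng.standard_normal(4)        # m x 1 vector

lhs = np.dot(A @ X, Y)            # (AX) . Y
rhs = np.dot(X, A.T @ Y)          # X . (A^T Y)
print(np.isclose(lhs, rhs))       # True
```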

Property 2: If X1, …, Xm are mutually orthogonal vectors, then they are independent.

Proof: Suppose X1, …, Xm are mutually orthogonal and let $\sum_{i=1}^m c_i X_i = 0$. Then for any j, 0 = Xj ∙ $\sum_{i=1}^m c_i X_i$ = $\sum_{i=1}^m c_i (X_i \cdot X_j)$ = cj (Xj ∙ Xj) since Xj ∙ Xi = 0 when i ≠ j. But since Xj ∙ Xj > 0, it follows that cj = 0. Since this is true for any j, X1, …, Xm are independent.

Property 3: Any set of n mutually orthogonal n × 1 column vectors is a basis for the set of n × 1 column vectors. Similarly, any set of n mutually orthogonal 1 × n row vectors is a basis for the set of 1 × n row vectors.

Proof: This follows by Corollary 4 of Linear Independent Vectors and Property 2.

Observation: Let Cj be the jth column of the identity matrix In. As we mentioned in the proof of Corollary 4 of Linear Independent Vectors, it is easy to see that for any n, C1, …, Cn forms a basis for the set of all n × 1 column vectors. It is also easy to see that the C1, …, Cn are mutually orthogonal.

We next show that the span of any set of vectors has a basis consisting of mutually orthogonal vectors.

Theorem 1 (Gram-Schmidt Process): Suppose X1, …, Xm are independent n × 1 column vectors. Then we can find n × 1 column vectors V1, …, Vm which are mutually orthogonal and have the same span as X1, …, Xm.

Proof: We show how to construct the V1, …, Vm from the X1, …, Xm. Define V1, …, Vm as follows:

$$V_1 = X_1 \qquad\qquad V_{k+1} = X_{k+1} - \sum_{j=1}^{k} \frac{X_{k+1} \cdot V_j}{V_j \cdot V_j}\, V_j$$

We first show that the Vk are mutually orthogonal by induction on k. The case where k = 1 is trivial. Assume that V1, …, Vk are mutually orthogonal. To show that V1, …, Vk+1 are mutually orthogonal, it is sufficient to show that Vk+1 ∙ Vi = 0 for all i where 1 ≤ i ≤ k. Using the induction hypothesis that Vj ∙ Vi = 0 for 1 ≤ j ≤ k and j ≠ i and Vi ∙ Vi ≠ 0 (since Vi ≠ 0), we see that

$$V_{k+1} \cdot V_i = X_{k+1} \cdot V_i - \sum_{j=1}^{k} \frac{X_{k+1} \cdot V_j}{V_j \cdot V_j}\,(V_j \cdot V_i) = X_{k+1} \cdot V_i - \frac{X_{k+1} \cdot V_i}{V_i \cdot V_i}\,(V_i \cdot V_i) = 0$$

This completes the proof that V1, …, Vm are mutually orthogonal. By Property 2, it follows that V1, …, Vm are also independent.

We next show that the span of V1, …, Vk is a subset of the span of X1, …, Xk for all k ≤ m. The result for k = 1 is trivial. We assume the result is true for k and show that it is true for k + 1. Based on the induction hypothesis, it is sufficient to show that Vk+1 can be expressed as a linear combination of X1, …, Xk+1. This is true since by definition

$$V_{k+1} = X_{k+1} - \sum_{j=1}^{k} \frac{X_{k+1} \cdot V_j}{V_j \cdot V_j}\, V_j$$

and by the induction hypothesis all the Vj can be expressed as a linear combination of the X1, …, Xk.

By induction, we can now conclude that the span of V1, …, Vm is a subset of the span of X1, …, Xm, and so trivially V1, …, Vm are elements in the span of X1, …, Xm. But since the V1, …, Vm are independent, by Property 3 of Linear Independent Vectors, we can conclude that the span of V1, …, Vm is equal to the span of X1, …, Xm.
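The construction in Theorem 1 can be sketched in Python using NumPy (the function name `gram_schmidt` and the sample matrix are chosen here for illustration):

```python
import numpy as np

def gram_schmidt(X):
    """Return V whose columns are mutually orthogonal and span the
    same space as the (independent) columns of X (Theorem 1)."""
    n, m = X.shape
    V = np.zeros((n, m))
    for k in range(m):
        v = X[:, k].copy()
        for j in range(k):
            # subtract the projection of X_k onto the earlier V_j
            v -= (X[:, k] @ V[:, j]) / (V[:, j] @ V[:, j]) * V[:, j]
        V[:, k] = v
    return V

X = np.array([[1.0, 1.0, 0.0],
              [1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0],
              [1.0, 1.0, 1.0]])
V = gram_schmidt(X)
# V^T V is diagonal: the off-diagonal entries (dot products of
# distinct columns) are numerically zero
print(np.round(V.T @ V, 10))
```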

Corollary 1: For any closed set of vectors we can construct an orthogonal basis.

Proof: By Corollary 1 of Linear Independent Vectors, every closed set of vectors V has a basis. In fact, we can construct this basis. By Theorem 1, we can construct an orthogonal set of vectors which spans the same set. Since this orthogonal set of vectors is independent, it is a basis for V.

Definition 1: A set of vectors is orthonormal if the vectors are mutually orthogonal and each vector is a unit vector.

Corollary 2: For any closed set of vectors we can construct an orthonormal basis.

Proof: If V1, …, Vm is the orthogonal basis, then Q1, …, Qm is an orthonormal basis where

$$Q_j = \frac{V_j}{\sqrt{V_j \cdot V_j}}$$

Observation: The following is an alternative way of constructing Q1, …, Qm (which yields the same result).

Define V1, …, Vm and Q1, …, Qm from X1, …, Xm as follows:

$$V_1 = X_1 \qquad\qquad Q_1 = \frac{V_1}{\sqrt{V_1 \cdot V_1}}$$

$$V_{k+1} = X_{k+1} - \sum_{j=1}^{k} (X_{k+1} \cdot Q_j)\, Q_j \qquad\qquad Q_{k+1} = \frac{V_{k+1}}{\sqrt{V_{k+1} \cdot V_{k+1}}}$$
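A Python sketch of this alternative, assuming the standard normalize-as-you-go form in which each new vector subtracts its projections onto the earlier unit vectors Qj and is then normalized:

```python
import numpy as np

def gram_schmidt_orthonormal(X):
    """Build Q column by column: subtract projections onto the earlier
    unit vectors Q_j, then normalize the result."""
    n, m = X.shape
    Q = np.zeros((n, m))
    for k in range(m):
        v = X[:, k].copy()
        for j in range(k):
            v -= (X[:, k] @ Q[:, j]) * Q[:, j]   # Q_j is a unit vector
        Q[:, k] = v / np.sqrt(v @ v)
    return Q

X = np.array([[1.0, 1.0],
              [1.0, 0.0],
              [0.0, 1.0]])
Q = gram_schmidt_orthonormal(X)
print(np.round(Q.T @ Q, 10))   # the 2 x 2 identity matrix
```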

Definition 2: A matrix A is orthogonal if ATA = I.

Observation: The following property is an obvious consequence of this definition.

Property 4: A matrix is orthogonal if and only if all of its columns are orthonormal.

Property 5: If A is an m × n orthogonal matrix and B is an n × p orthogonal matrix, then AB is orthogonal.

Proof: If A and B are orthogonal, then

(AB)T(AB) = (BTAT)(AB) = BT(ATA)B = BTIB = BTB = I
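A numerical check of Property 5, using two square orthogonal matrices (rotations) as an illustration:

```python
import numpy as np

def rotation(theta):
    """2 x 2 rotation matrix, which is orthogonal."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s], [s, c]])

A, B = rotation(0.3), rotation(1.1)
AB = A @ B
print(np.allclose(AB.T @ AB, np.eye(2)))   # True: AB is orthogonal
```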

Example 1: Find an orthonormal basis for the three column vectors which are shown in range A4:C7 of Figure 1.

Figure 1 – Gram Schmidt Process

The columns in matrix Q (range I4:K7) are simply the normalization of the columns in matrix V. E.g., the third column of matrix Q (range K4:K7) is calculated using the array formula =G4:G7/SQRT(SUMSQ(G4:G7)). The columns of V are calculated as described in Figure 2.

Figure 2 – Formulas for V in the Gram Schmidt Process

The orthonormal basis is given by the columns of matrix Q. That these columns are orthonormal is confirmed by checking that QTQ = I by using the array formula =MMULT(TRANSPOSE(I4:K7),I4:K7) and noticing that the result is the 3 × 3 identity matrix.

We explain how to calculate the matrix R in Example 1 of QR Factorization.

Property 6: If A is an orthogonal square matrix, then

1. AT = A-1
2. AAT = I
3. AT is orthogonal
4. det A = ±1 (the converse is not necessarily true)

Proof:

1) Since AT is a left inverse of A, by Property 5 of Rank of a Matrix, AT is the inverse of A.
2) This follows from (1).
3) This follows from (2) since (AT)TAT = AAT = I.
4) By Property 1 of Determinants and Linear Equations, |A|² = |A| ∙ |A| = |AT| ∙ |A| = |ATA| = |I| = 1. Thus |A| = ±1.
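These facts can be verified numerically; a sketch using a simple reflection matrix as the orthogonal matrix:

```python
import numpy as np

A = np.array([[0.0, 1.0],
              [1.0, 0.0]])                  # orthogonal (swaps coordinates)

print(np.allclose(A.T @ A, np.eye(2)))      # A^T A = I
print(np.allclose(A.T, np.linalg.inv(A)))   # A^T = A^{-1}
print(np.linalg.det(A))                     # det A = -1: a reflection
```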

Property 7: A square matrix is orthogonal if and only if all of its rows are orthonormal.

Proof: By Property 4 and 6b.

Observation: Multiplying a vector X by an orthogonal matrix A has the effect of rotating or reflecting the vector. Thus we can think of X as a point in n-space which is transformed into the point AX in n-space. Note that the distance between the point X and the origin (i.e. the length of vector X) is the same as the distance between AX and the origin (i.e. the length of vector AX), which can be seen from

$$\lVert AX \rVert^2 = (AX) \cdot (AX) = (AX)^T (AX) = X^T A^T A X = X^T X = X \cdot X = \lVert X \rVert^2$$

Multiplication of two vectors by A also preserves the angle between the two vectors, which is characterized by the dot product of the vectors (since the dot product of two unit vectors is the cosine of this angle), as can be seen from

$$(AX) \cdot (AY) = (AX)^T (AY) = X^T A^T A Y = X^T Y = X \cdot Y$$

Note too that A represents a rotation if det A = +1 and a reflection if det A = -1.
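A sketch illustrating these observations with a 45° rotation:

```python
import numpy as np

theta = np.pi / 4
A = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])   # det A = +1: a rotation

X = np.array([3.0, 4.0])
Y = np.array([1.0, 2.0])

print(np.isclose(np.linalg.norm(A @ X), np.linalg.norm(X)))  # lengths preserved
print(np.isclose((A @ X) @ (A @ Y), X @ Y))                  # dot products preserved
print(np.isclose(np.linalg.det(A), 1.0))                     # det = +1
```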