Contents

1. Linear Equations and Matrices
   Linear Equations
   Matrices
2. Vector Spaces
3. Linear Transformations and Matrix Representations
   3.1. The Existence of Linear Transformations
   3.2. The Dimension Theorem
   3.3. Matrix Representation
   Dual Spaces
4. Matrices
   Systems of Linear Equations
   Determinants
5. Polynomial Rings
6. Diagonalizations
   Matrix Limits
7. Jordan Canonical Forms
   Jordan Canonical Form Theorem
   Minimal polynomials
8. Inner Product Spaces
   Orthogonal Projection and Spectral Theorem
LECTURE NOTE: LINEAR ALGEBRA

DONG SEUNG KANG
1. Linear Equations and Matrices

Linear Equations. One of the central motivations for linear algebra is solving systems of linear equations. We call the following a system of linear equations:

(∗)
a11 x1 + a12 x2 + · · · + a1n xn = b1
a21 x1 + a22 x2 + · · · + a2n xn = b2
⋮
am1 x1 + am2 x2 + · · · + amn xn = bm ,
where x1 , · · · , xn are unknowns (or variables), the aij are coefficients, and the bi are constants. In particular, when all bi = 0 we call the system homogeneous.
We note that if the system is homogeneous, then it always has at least one solution, x1 = x2 = · · · = xn = 0 , called the trivial solution. Our main goal is to determine whether a given system of linear equations has a solution or not. To solve a system of linear equations we will use Gaussian elimination, which rests on the following three operations (called elementary operations):
(1) multiply an equation through by a nonzero constant;
(2) interchange two equations;
(3) change an equation by adding a constant multiple of another equation.

Theorem 1.1. Any system of linear equations has either no solution, exactly one solution, or infinitely many solutions.

Problem Set

Problem 1.2. Describe all the possible types of solution set:
(1) a1 x + b1 y = c1
    a2 x + b2 y = c2 .
(2) a11 x1 + a12 x2 + a13 x3 = b1
    a21 x1 + a22 x2 + a23 x3 = b2
    a31 x1 + a32 x2 + a33 x3 = b3 .

The system of linear equations (∗) can be written in matrix form as

AX = b ,

where
A = [ a11 · · · a1n ]
    [  ⋮   ⋱   ⋮  ]
    [ am1 · · · amn ] ,

X = (x1 , · · · , xn )^t , and b = (b1 , · · · , bm )^t .
We call this equation a matrix equation. To apply Gaussian elimination we pass from the matrix A to the matrix

(∗∗)  [ a11 · · · a1n | b1 ]
      [  ⋮   ⋱   ⋮  | ⋮  ]
      [ am1 · · · amn | bm ] .

We call this matrix an augmented matrix. To solve the system via (∗∗) we use row operations analogous to the elementary operations above.
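For example, the system x1 + 2x2 = 3 , 3x1 + 4x2 = 5 has augmented matrix

[ 1 2 | 3 ]
[ 3 4 | 5 ] .

Adding (−3) times row 1 to row 2 gives a second row (0, −2 | −4) ; scaling that row by −1/2 and then adding (−2) times it to row 1 gives

[ 1 0 | −1 ]
[ 0 1 |  2 ] ,

so x1 = −1 and x2 = 2 .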
Definition 1.3. Let A ∈ Mn (F ) . Any one of the following three operations on the rows of A is called an elementary row operation:
ROP 1 interchanging any two rows of A (ROP 1 is Ri ↔ Rj , for some 1 ≤ i ≠ j ≤ n );
ROP 2 multiplying any row of A by a nonzero constant (ROP 2 is Ri = c × Ri , for some 1 ≤ i ≤ n );
ROP 3 adding any constant multiple of a row of A to another row (ROP 3 is Ri = Ri + c × Rj , for some 1 ≤ i ≠ j ≤ n ),
where Ri denotes the i-th row of A .

In fact, the elementary operations on a system of linear equations (∗) are rephrased as the elementary row operations on the augmented matrix (∗∗).

Now we introduce a special form of matrix.

Definition 1.4. A matrix is said to be in row-echelon form if it has the following properties:
(1) the first nonzero entry of each row is 1, called a leading 1;
(2) a row containing only 0's comes after all rows with some nonzero entries;
(3) the leading 1's appear from left to right in successive rows.
Moreover, a matrix in reduced row-echelon form satisfies, in addition to the above three properties,
(4) each column that contains a leading 1 has zeros everywhere else.

By a finite sequence of elementary row operations, any given matrix can be transformed to row-echelon form (or reduced row-echelon form). In particular, the reduced row-echelon form is unique. The whole process of obtaining the reduced row-echelon form is called Gauss-Jordan elimination.

Definition 1.5. Two augmented matrices (or systems of linear equations) are said to be row-equivalent if one can be transformed into the other by a finite sequence of elementary row operations.

Theorem 1.6. If two systems of linear equations are row-equivalent, then they have the same set of solutions.

Matrices. Historically, the word "matrix" was coined by Sylvester in 1850, and Cayley developed the theory of matrices in his 1858 memoir. The meaning of the word is "that within which something originates". In this section we will investigate some properties of such matrices.

Definition 1.7. Let I = {1, 2, · · · , m} and J = {1, 2, · · · , n} be sets and let F be a field. A function A : I × J → F is called an m × n matrix over the field F .
In general, a matrix A is written in the following form:

A = [ a11 a12 · · · a1n ]
    [ a21 a22 · · · a2n ]
    [  ⋮   ⋮        ⋮  ]
    [ am1 am2 · · · amn ]  = ( aij ) ,

where the number aij is called the (i, j)-entry of the matrix A .

Definition 1.8. Let A be an m × n matrix over a field F .
(1) The transpose of A is the n × m matrix, denoted A^t , whose j-th column is taken from the j-th row of A .
(2) The matrix A is called a symmetric matrix if A^t = A .
(3) The matrix A is called a skew-symmetric matrix if A^t = −A .
(4) If n = m , we call A an n-square matrix.

Theorem 1.9. Every square matrix can be written as a sum of a symmetric and a skew-symmetric matrix; indeed, A = (1/2)(A + A^t) + (1/2)(A − A^t) .

In particular, when A is a square matrix and has an inverse, the matrix equation AX = b can be solved immediately: X = A^{−1} b .

Definition 1.10. Let A be an n-square matrix. We call a matrix B an inverse of A if AB = In = BA . If a given matrix A has an inverse, it is said to be invertible. If not, it is said to be singular or not invertible.

It is a natural question how to find the inverse of a given matrix.

2. Vector Spaces

Definition 2.1. A vector space V over a field F consists of a non-empty set on which two operations, called addition and scalar multiplication, are defined so that for each pair x, y ∈ V , a unique element x + y ∈ V is defined and for each a ∈ F and v ∈ V , a unique element av ∈ V is defined, such that the following hold:
(1) for all x, y ∈ V , x + y = y + x
(2) for all x, y, z ∈ V , x + (y + z) = (x + y) + z
(3) there exists an element in V denoted by 0 such that x + 0 = x for all x ∈ V
(4) for each x ∈ V , there exists a unique element y ∈ V such that x + y = 0 (y is denoted by −x)
(5) for each x ∈ V , 1x = x
(6) for each a, b ∈ F and v ∈ V , (ab)v = a(bv)
(7) for each a ∈ F and v, w ∈ V , a(v + w) = av + aw
(8) for each a, b ∈ F and v ∈ V , (a + b)v = av + bv .

Definition 2.2. A subset W of a vector space V is called a subspace of V if W is a vector space under the operations of addition and scalar multiplication defined on V .
Note that V and {0} are both subspaces of V .

Theorem 2.3. Let V be a vector space and W a subset of V . Then W is a subspace of V if and only if the following hold:
(1) W is non-empty (0 ∈ W );
(2) x + y ∈ W whenever x, y ∈ W ;
(3) aw ∈ W whenever a ∈ F and w ∈ W .

Definition 2.4. Let V be a vector space and let v1 , · · · , vn ∈ V , a1 , · · · , an ∈ F . Then a1 v1 + · · · + an vn is called a linear combination of v1 , · · · , vn with coefficients a1 , · · · , an . If S is a non-empty subset of V , we define the span of S , span(S) , to be the set of all linear combinations of elements of S . If S = ∅ , the null set, we define span(∅) to be {0} . We say that S spans (generates) V if V = span(S) .

Theorem 2.5. Let S be a subset of a vector space V . Then span(S) is a subspace of V .

Note: suppose that v1 , · · · , vr are column vectors in F^n . By (v1 , · · · , vr ) ∈ Mn×r (F ) we mean the n × r matrix having vi as its i-th column. For a1 , · · · , ar ∈ F , we have

(v1 , · · · , vr ) (a1 , · · · , ar )^t = a1 v1 + · · · + ar vr .

Let S = {v1 , · · · , vr } ⊂ F^n and let w ∈ F^n . A basic question is whether or not w ∈ span(S) . This is the same as asking whether there exist x1 , · · · , xr ∈ F such that w = x1 v1 + · · · + xr vr . This is the case if and only if the matrix equation Ax = w has a solution, where A ∈ Mn×r (F ) is the matrix whose i-th column is vi and w is written as a column vector.

Theorem 2.6. Let A ∈ Mm×n (F ) , where m < n . Then the equation Ax = 0 has a non-trivial solution.

Definition 2.7. A subset S of a vector space V is called linearly dependent if there exist a finite subset {v1 , · · · , vr } of S and a1 , · · · , ar ∈ F , not all 0, such that a1 v1 + · · · + ar vr = 0 ; if S is not linearly dependent, S is said to be linearly independent.

Note that S is linearly independent if and only if the only representation of 0 as a linear combination of elements of S is the trivial representation, i.e., the one in which all coefficients are 0. The following are easy consequences of the definition.
(1) Any set which contains a linearly dependent set is linearly dependent.
(2) Any subset of a linearly independent set is linearly independent.
(3) Any set which contains the 0 vector is linearly dependent, for 1 · 0 = 0 .
(4) A set S of vectors is linearly independent if and only if each finite subset of S is linearly independent, i.e., if and only if for any distinct vectors v1 , · · · , vr of S , a1 v1 + · · · + ar vr = 0 implies each ai = 0 .

Definition 2.8. A basis of a vector space V is a subset S of V such that
(1) S is linearly independent, and
(2) span(S) = V .
Note that ∅ is a basis for the zero vector space.

Theorem 2.9. A subset S of a vector space V is a basis for V if and only if for each element v ∈ V there exist unique elements v1 , · · · , vr ∈ S and unique nonzero a1 , · · · , ar ∈ F such that v = a1 v1 + · · · + ar vr .

Theorem 2.10. Let S be a linearly independent subset of a vector space V , and let v ∈ V . Then S ∪ {v} is linearly independent if and only if v ∉ span(S) .

Theorem 2.11. Suppose that the vector space V has a finite spanning set S and I is a linearly independent set of vectors with I ⊆ S , possibly I = ∅ . Then there exists a basis B for V with I ⊆ B ⊆ S . In particular, V has a finite basis.

A vector space V having a finite spanning set is called a finite dimensional vector space; if V does not have a finite spanning set, V is said to be infinite dimensional.

Corollary 2.12. Let V be a finite dimensional vector space. Then
(1) every spanning set for V contains a basis for V ,
(2) every linearly independent subset of V is contained in a basis for V .

Theorem 2.13. Let V be a finite dimensional vector space, let S be a finite spanning set for V , and let T be a subset of V having more elements than S . Then T is linearly dependent.

Corollary 2.14. Let V be a finite dimensional vector space and let B and C be bases for V . Then B and C are finite sets having the same number of elements.

Definition 2.15. By Corollary 2.14, if V is a finite dimensional vector space, then there is an integer n such that every basis for V has exactly n elements. We call n the dimension of V and say that V is n-dimensional; we write dim(V ) = n .

Theorem 2.16. Suppose that V is an n-dimensional vector space. Then
(1) any linearly independent subset of V containing n elements is a basis for V ,
(2) any spanning set for V which contains exactly n elements is a basis for V .
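To illustrate Theorem 2.9: the set S = {(1, 0), (1, 1)} is a basis for R² , and each v = (x, y) ∈ R² is written uniquely as v = (x − y)(1, 0) + y(1, 1) ; for instance, (3, 5) = −2(1, 0) + 5(1, 1) .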
Theorem 2.17. Suppose that V is an n-dimensional vector space and W is a subspace of V . Then
(1) W is finite dimensional;
(2) dim(W ) ≤ dim(V ) , and dim(W ) = dim(V ) if and only if W = V ;
(3) any basis for W can be extended to a basis for V .

Problem Set

Problem 2.18. Let V be a vector space over a field F and let S ⊂ V be a nonempty set. Show that span(S) is the intersection of all subspaces of V that contain S .

Problem 2.19. Let W1 and W2 be subspaces of a vector space V . Prove that W1 ∪ W2 is a subspace of V if and only if W1 ⊆ W2 or W2 ⊆ W1 .

Problem 2.20. Let C be the complex numbers and R the real numbers. Show that
(1) {1, i} is linearly dependent when C is regarded as a C-vector space;
(2) {1, i} is linearly independent when C is regarded as an R-vector space.

Problem 2.21. Let
A = [ −1 3  0 ]
    [  0 2  0 ]
    [  2 1 −1 ]  ∈ M3 (R) .

Show that there are a positive integer n and real numbers c0 , c1 , · · · , cn , not all zero, such that c0 I + c1 A + · · · + cn A^n = 0 .

Problem 2.22. Let W = {(a, b, c, d, e) ∈ R^5 | 3a = d + e, e = a − b + 2c} be a subspace of R^5 . Show that (1, 1, 2, −1, 4) ∈ W and find a basis for W that contains (1, 1, 2, −1, 4) . Be sure to explain why your proposed basis is a basis.

Definition 2.23. A vector space V is called the direct sum of W1 and W2 if W1 and W2 are subspaces of V such that W1 ∩ W2 = {0} and W1 + W2 = V . We denote that V is the direct sum of W1 and W2 by writing V = W1 ⊕ W2 .

Problem 2.24. Let W1 and W2 be subspaces of a vector space V . Show that V is the direct sum of W1 and W2 if and only if each element in V can be uniquely written as x1 + x2 , where x1 ∈ W1 and x2 ∈ W2 .

Problem 2.25. Show that F^n is the direct sum of the subspaces W1 = {(a1 , · · · , an ) ∈ F^n | an = 0} and W2 = {(a1 , · · · , an ) ∈ F^n | a1 = · · · = an−1 = 0} .

Problem 2.26. A matrix M is called skew-symmetric if M^t = −M . Clearly, a skew-symmetric matrix is square.
(1) Prove that the set W1 of all skew-symmetric n × n real matrices is a subspace of Mn (R) . (2) Let W2 be the subspace of Mn (R) consisting of the symmetric n × n matrices. Prove that Mn (R) = W1 ⊕ W2 . (3) Find dim W1 and dim W2 . (4) Find the bases for W1 and W2 , respectively. Problem 2.27. Show that a subset W of a vector space V is a subspace of V if and only if span (W ) = W . Problem 2.28. Show that the set W of all n×n matrices having trace equal to zero is a subspace of Mn (F ) , and find a basis for W . Problem 2.29. For a fixed a ∈ R , determine the dimension of the subspace of Pn (R) defined by {f ∈ Pn (R)|f (a) = 0} . Problem 2.30. Let D0 [0, 1] = {f ∈ D[0, 1]|f (0) = 0} . Show that D0 = D0 [0, 1] is a subspace of D = D[0, 1] and show that D = D0 ⊕ W , where W is a simple finite dimensional subspace of D . Do this by finding W and proving the direct sum statement.
3. Linear Transformations and Matrix Representations

Definition 3.1. Let V and W be vector spaces over a field F . A function T : V → W is called a linear transformation from V into W if for all x, y ∈ V and c ∈ F we have T (x + y) = T (x) + T (y) and T (c x) = c T (x) . We denote the set of all linear transformations from V into W by L(V, W ) .

Example 3.2. Let A ∈ Mm×n (F ) , and view the elements of F^m and F^n as column vectors. The function LA : F^n → F^m defined by LA (v) = Av is a linear transformation, called the left-multiplication transformation.

3.1. The Existence of Linear Transformations.

Theorem 3.3. Let V and W be vector spaces over a field F , and suppose V is finite-dimensional with a basis {b1 , · · · , bn } . For any vectors w1 , · · · , wn in W there exists exactly one linear transformation T : V → W such that T (bi ) = wi for all i = 1, · · · , n .

The following Theorem 3.4 may be viewed as the generalized existence of linear transformations.

Theorem 3.4. Let V and W be vector spaces over a field F , and suppose V is finite-dimensional with a basis B . Let f : B → W be a function. Then there exists a unique T ∈ L(V, W ) such that T (x) = f (x) for all x ∈ B .

Corollary 3.5. Let V and W be vector spaces over a field F , and let B be a basis for V . Let T, S ∈ L(V, W ) . If T (x) = S(x) for all x ∈ B , then T = S .

Hence, to classify linear transformations from V , it suffices to look at a basis for V ; that is, any linear transformation T on a vector space V is determined by its values on a basis B for V .

3.2. The Dimension Theorem.

Definition 3.6. Let V and W be vector spaces over a field F , and let T ∈ L(V, W ) .
(1) The null space (or kernel) of T is defined to be N (T ) = {x ∈ V | T (x) = 0} .
(2) The range (or image) of T is defined to be R(T ) = {T (x) | x ∈ V } .

Theorem 3.7. Let V and W be vector spaces over a field F , and let T ∈ L(V, W ) . Then
(1) N (T ) is a subspace of V .
(2) R(T ) is a subspace of W .
(3) T is one-to-one if and only if N (T ) = {0} .
(4) If B is a basis for V , then R(T ) = span({T (x) | x ∈ B}) .

Theorem 3.8. Let V and W be vector spaces over a field F with V finite-dimensional, and let T ∈ L(V, W ) . Then dim(V ) = dim N (T ) + dim R(T ) .

Definition 3.9. If N (T ) and R(T ) are finite-dimensional, then we define the nullity of T , denoted nullity(T ) , and the rank of T , denoted rank(T ) , to be the dimensions of N (T ) and R(T ) , respectively.

Theorem 3.10. Let V and W be vector spaces over a field F , and let T ∈ L(V, W ) . Then the following statements are equivalent:
(1) T is both one-to-one and onto;
(2) there exists S ∈ L(W, V ) such that S T = IV and T S = IW .
If the linear transformation T satisfies one of the conditions in Theorem 3.10, we say that T is invertible. Such an S is the inverse of T ; it is denoted by S = T^{−1} .

The following Corollary is the special case of Theorems 3.10 and 3.8.

Corollary 3.11. Let V and W be n-dimensional vector spaces over a field F , and let T ∈ L(V, W ) . Then the following statements are equivalent (note that dim V = n = dim W ):
(1) T is invertible;
(2) N (T ) = {0} ;
(3) R(T ) = W ;
(4) there exists S ∈ L(W, V ) such that ST = IV ;
(5) there exists S ∈ L(W, V ) such that T S = IW .

Theorem 3.12. Every n-dimensional vector space over a field F is isomorphic to F^n .

Example 3.13. Let F be a field and V the vector space of all polynomial functions from F into F . Let D be the differentiation linear transformation in L(V ) , and let T be the linear transformation in L(V ) defined by T (f )(x) = xf (x) . Then DT ≠ T D ; in fact DT − T D = idV .

3.3. Matrix Representation. We will define the matrix representation of a given linear transformation. To express the matrix, we have to use an ordered basis for V , which is a basis endowed with a specific order. Now let B = {b1 , · · · , bn } be an ordered basis for V .

Definition 3.14. For v ∈ V , there exist unique scalars a1 , · · · , an such that

v = Σ_{i=1}^{n} ai bi = a1 b1 + · · · + an bn .
We define the coordinate vector of v relative to B , denoted [v]B , by

[v]B = (a1 , · · · , an )^t .

Let V be an n-dimensional vector space over a field F with an ordered basis B , and let T ∈ L(V ) . We define the matrix representation of T for the ordered basis B , denoted [T ]B , to be ([T (b1 )]B , · · · , [T (bn )]B ) .

Theorem 3.15. Let V be an n-dimensional vector space over a field F with an ordered basis B . Let φ : V → F^n be defined by φ(x) = [x]B for all x ∈ V . Then φ is an isomorphism.

Example 3.16. By Example 3.2, if A ∈ Mn (F ) and B is the standard ordered basis, then [LA ]B = A .

Let V and W be vector spaces over a field F . Then L(V, W ) can be considered as an F -vector space with addition and scalar multiplication as follows: (S + T )(x) = S(x) + T (x) and (cT )(x) = c T (x) , for all S, T ∈ L(V, W ) , c ∈ F , and x ∈ V .

Theorem 3.17. Let V be an n-dimensional vector space over a field F with an ordered basis B . Let Φ : L(V ) → Mn (F ) be defined by Φ(T ) = [T ]B for all T ∈ L(V ) . Then Φ is an isomorphism.

The following Theorem is similar to Corollary 3.11. You should know the relation between them.

Theorem 3.18. Let A ∈ Mn (F ) be a matrix. Then the following are equivalent:
(1) A is invertible;
(2) if Ax = 0 for x ∈ F^n , then x = 0 (that is, the equation Ax = 0 has only the trivial solution);
(3) there exists B ∈ Mn (F ) such that AB = In ;
(4) there exists B ∈ Mn (F ) such that BA = In ;
(5) det(A) ≠ 0 ;
(6) the columns of A are linearly independent;
(7) A is row equivalent to In ;
(8) A is a product of elementary matrices;
(9) Ax = b has a solution for every b ∈ F^n ;
(10) rank(A) = n ;
(11) the linear transformation LA : F^n → F^n , LA (x) = Ax , is injective;
(12) the linear transformation LA : F^n → F^n , LA (x) = Ax , is surjective;
(13) zero is not an eigenvalue of A .
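Several of these equivalences are easy to test numerically. The following minimal sketch assumes NumPy is available; the particular matrix is an arbitrary illustration, and the snippet merely checks conditions (5), (9), and (10) on one example (a numerical check, of course, is not a proof).

import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 1.0]])

print(np.linalg.det(A))            # 1.0 (nonzero), so A is invertible: condition (5)
print(np.linalg.matrix_rank(A))    # 2 = n: condition (10)
b = np.array([1.0, 0.0])
print(np.linalg.solve(A, b))       # [ 1. -1.]: Ax = b has a solution: condition (9)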
The following Theorem answers the question about the relation between Corollary 3.11 and the previous Theorem 3.18.

Theorem 3.19. Let V be an n-dimensional vector space over a field F with an ordered basis B , and let T ∈ L(V ) . Then T is invertible if and only if Φ(T ) = [T ]B is invertible in Mn (F ) .

Definition 3.20. Let A, B ∈ Mn (F ) be matrices. We say that B is similar to A if there is an invertible matrix Q ∈ Mn (F ) such that B = Q^{−1} A Q .

Theorem 3.21. Let B = {b1 , · · · , bn } and C = {c1 , · · · , cn } be ordered bases for V . Let P ∈ Mn (F ) be the matrix whose j-th column is [cj ]B . Then
(1) P is invertible;
(2) [x]C = P^{−1} [x]B for all x ∈ V ;
(3) [T ]C = P^{−1} [T ]B P for every T ∈ L(V ) .

Corollary 3.22. Similar matrices have the same determinant.

Let V and W be finite dimensional vector spaces with ordered bases B = {b1 , · · · , bn } and C = {c1 , · · · , cm } , respectively, and let T ∈ L(V, W ) . Then the matrix representation of T for the ordered bases B and C is [T ] = ([T (b1 )], · · · , [T (bn )]) , where

T (bi ) = Σ_{j=1}^{m} xji cj  for all 1 ≤ i ≤ n ;

that is, the i-th column of [T ] is [T (bi )] = (x1i , x2i , · · · , xmi )^t .

Then R(T ) = span{T (b1 ), · · · , T (bn )} . Hence R(T ) corresponds to the column space of [T ] , and dim R(T ) = rank of [T ] . Note that, by the Dimension Theorem, we may conclude that dim N (T ) = dim V − rank of [T ] .

Example 3.23. Let V = R³ and W = P2 (R) be vector spaces with B and A as standard bases for V and W , let Bv = {(0, 1, 1), (1, 0, 1), (1, 1, 0)} and Aw = {x + x², 1 + x², 1 + x} be bases for V and W , respectively, and let T : R³ → P2 (R) be the linear transformation T (a, b, c) = c + (b + c)x + (a + b + c)x² .
Then the following diagram commutes:

            T , [T ]^A_B
(V, B)  ─────────────→  (W, A)

idv ↑ [idv]^B_{Bv}        idw ↓ [idw]^{Aw}_A

            T , [T ]^{Aw}_{Bv}
(V, Bv) ─────────────→  (W, Aw)

Compute all matrices in the diagram above, where [idv]^B_{Bv} is a transition matrix (coordinate change matrix). Please check that

[T ]^{Aw}_{Bv} = [idw]^{Aw}_A [T ]^A_B [idv]^B_{Bv} .
To compute [T ]^A_B : T (1, 0, 0) = x² , T (0, 1, 0) = x + x² , T (0, 0, 1) = 1 + x + x² . Hence

[T ]^A_B = [ 0 0 1 ]
           [ 0 1 1 ]
           [ 1 1 1 ] .
To compute the transition matrix [idv]^B_{Bv} :

idv (0, 1, 1) = 0(1, 0, 0) + 1(0, 1, 0) + 1(0, 0, 1)
idv (1, 0, 1) = 1(1, 0, 0) + 0(0, 1, 0) + 1(0, 0, 1)
idv (1, 1, 0) = 1(1, 0, 0) + 1(0, 1, 0) + 0(0, 0, 1) .

Hence we have

[idv]^B_{Bv} = [ 0 1 1 ]
               [ 1 0 1 ]
               [ 1 1 0 ] .
Similarly, we have

[idw]^{Aw}_A = [ −1/2  1/2  1/2 ]
               [  1/2 −1/2  1/2 ]
               [  1/2  1/2 −1/2 ] .
Then [T ]^{Aw}_{Bv} can be computed by two methods, as follows. Directly,

T (0, 1, 1) = 1 + 2x + 2x² = (3/2)(x + x²) + (1/2)(1 + x²) + (1/2)(1 + x)
T (1, 0, 1) = 1 + x + 2x² = 1(x + x²) + 1(1 + x²) + 0(1 + x)
T (1, 1, 0) = x + 2x² = (3/2)(x + x²) + (1/2)(1 + x²) − (1/2)(1 + x) .

Also, we may use [T ]^{Aw}_{Bv} = [idw]^{Aw}_A [T ]^A_B [idv]^B_{Bv} . Thus we have

[T ]^{Aw}_{Bv} = [ 3/2  1  3/2 ]
                 [ 1/2  1  1/2 ]
                 [ 1/2  0 −1/2 ] .
Note that [T ]^A_B and [T ]^{Aw}_{Bv} are similar, because [idw]^{Aw}_A = ([idv]^B_{Bv})^{−1} . Also, both have determinant −1. Use Theorem 3.12, and compare with Theorem 3.21.

Dual Spaces. Let V be a vector space over a field F , and let L(V, F ) = {f : V → F | f is a linear transformation} , with the following operations: for f, g ∈ L(V, F ) and c ∈ F ,
(f + g)(x) = f (x) + g(x) and (cf )(x) = c f (x) , for all x ∈ V .
Then L(V, F ) forms a vector space over F .

Definition 3.24. For a vector space V over F , we define the dual space of V to be the vector space L(V, F ) , denoted by V ∗ .

Note that if V is finite-dimensional, then we have dim(V ∗) = dim(L(V, F )) = dim(V ) · dim(F ) = dim(V ) .

Theorem 3.25. Suppose that V is a finite-dimensional vector space with the ordered basis B = {x1 , · · · , xn } . For 1 ≤ i ≤ n let fi be the i-th coordinate function with respect to B (that is, fi (v) = ai when v = a1 x1 + · · · + an xn ), and let B ∗ = {f1 , · · · , fn } . Then B ∗ is an ordered basis for V ∗ , and for any f ∈ V ∗ we have

f = Σ_{i=1}^{n} f (xi ) fi .
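For instance, with V = R² and the standard ordered basis B = {e1 , e2 } , the dual basis consists of the coordinate projections f1 (x, y) = x and f2 (x, y) = y ; the functional f (x, y) = 2x + 3y satisfies f = f (e1 ) f1 + f (e2 ) f2 = 2 f1 + 3 f2 , as Theorem 3.25 predicts.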
Definition 3.26. Using the notation of Theorem 3.25, we call the ordered basis B ∗ = {f1 , · · · , fn } of V ∗ , which satisfies fi (xj ) = δij (1 ≤ i, j ≤ n) , the dual basis of B .

Theorem 3.27. Let V and W be finite-dimensional vector spaces over a field F with ordered bases B and A , respectively. For any linear transformation T : V → W , the mapping T ∗ : W ∗ → V ∗ defined by T ∗ (g) = g ◦ T for all g ∈ W ∗ is a linear transformation with the property that

[T ∗]^{B∗}_{A∗} = ([T ]^A_B)^t .
Note the diagram:

              T , [T ]^A_B
(V, B)    ─────────────→   (W, A)

              T ∗ , [T ∗]^{B∗}_{A∗}
(V ∗, B ∗) ←─────────────  (W ∗, A∗)

For any g ∈ W ∗ , the composition V → W → F of T followed by g is linear; hence we may define T ∗ (g) = g ◦ T .
Problem Set

Problem 3.28. Let T : R³ → R be a linear transformation. Show that there exist scalars a , b , and c such that T (x, y, z) = ax + by + cz for all (x, y, z) ∈ R³ . Can you generalize this result to T : R^n → R ?

Problem 3.29. Let V = M2 (R) , let

A = [ 1 2 ]
    [ 3 4 ] ,

and let T ∈ L(V ) be defined by T (B) = AB − BA for all B ∈ V .
(1) Show that T is a linear transformation.
(2) Pick any basis B you wish for V and compute [T ]B for that basis.

Problem 3.30. Let v = (5, 2, 3, 1) ∈ F⁴ and let B = {b1 , b2 , b3 , b4 } be the standard basis for F⁴ .
(1) Does there exist T ∈ L(F⁴) such that T (b1 ) = v , T (v) = b2 , T (b2 ) = b3 , T (b3 ) = b1 ? Be sure to justify your answer.
(2) Let T be as in (1). Determine [T ]B .

Problem 3.31. Find A ∈ M6 (R) such that A³ = 5I and neither A nor A² is a diagonal matrix.

Problem 3.32. Let V be a finite-dimensional vector space and let T ∈ L(V ) . Show that V = Im T ⊕ Ker T if and only if R(T ) = R(T ²) .

Problem 3.33. Assume that there exists T ∈ L(V ) such that N (T ) = R(T ) . Prove that dim V is even.

Problem 3.34. Assume that dim V is even. Prove that there exists a T ∈ L(V ) such that N (T ) = R(T ) .

Problem 3.35. Let V be a vector space of dimension n over a field F . If T ∈ L(V ) , prove that the following statements are equivalent:
(1) N (T ) = R(T ) ;
(2) T ² = 0 , T ≠ 0 , and rank(T ) = n/2 .
Problem 3.36. Let r ≤ n and let U and W be subspaces of F^n with dim U = r and dim W = n − r . Prove that there exists T ∈ L(F^n ) such that N (T ) = W and R(T ) = U .

Problem 3.37. Let A be a 3 × 2 matrix and let B be a 2 × 3 matrix. Prove that C = AB is not invertible. Generalize this or give a counterexample. (Hint: use Theorem 3.18.)

Problem 3.38. Let T : R³ → R² and U : R² → R³ be linear transformations. Show that U T is not invertible.

Problem 3.39. Let V be a finite-dimensional vector space and let T ∈ L(V ) . Establish the chains

V ⊇ Im T ⊇ Im T ² ⊇ · · · ⊇ Im T ^n ⊇ Im T ^{n+1} ⊇ · · ·
{0} ⊆ Ker T ⊆ Ker T ² ⊆ · · · ⊆ Ker T ^n ⊆ Ker T ^{n+1} ⊆ · · ·

Show that there is a positive integer p such that Im T ^p = Im T ^{p+1} , and deduce that Im T ^p = Im T ^{p+k} and Ker T ^p = Ker T ^{p+k} for all k ≥ 1 . Show also that V = Im T ^p ⊕ Ker T ^p .

Problem 3.40. Let V be a finite-dimensional vector space and let T ∈ L(V ) . Suppose there is U ∈ L(V ) such that T U = idV . Show that T is invertible and U = T ^{−1} . Give an example which shows that this is false when V is not finite dimensional.

Problem 3.41. Consider R as a vector space over the field Q of rational numbers. Let B be a basis for R over Q . Determine whether B is finite, countable, or uncountable, and prove your assertion.

Problem 3.42.
(1) Let f : R → R be a continuous function such that f (x + y) = f (x) + f (y) for all x, y ∈ R . Prove that there exists c ∈ R such that f (x) = cx for all x ∈ R .
(2) Show that there exists a discontinuous function f : R → R such that f (x + y) = f (x) + f (y) for all x, y ∈ R . (Hint: view R as a Q-vector space. Then {1} is linearly independent, so there exists a basis B for R as a Q-vector space with 1 ∈ B . Any function f0 : B → R can be extended to a function f : R → R as follows: if r ∈ R , then r = a1 b1 + · · · + at bt , where ai ∈ Q and bi ∈ B . Define f (r) = a1 f0 (b1 ) + · · · + at f0 (bt ) . Choose f0 appropriately so that the resulting f cannot be continuous.)

Problem 3.43. For each positive integer n , there exists a complex number ζn = e^{2πi/n} with the property that ζn^n = 1 but ζn^m ≠ 1 for all 1 ≤ m < n . View
C as an R-vector space and let n ≥ 3 . Define T : C → C by T (z) = ζn z for all z ∈ C .
(1) Prove that T ∈ L(C) .
(2) Prove that B = {1, ζn } is a basis for C .
(3) Let A = [T ]B . Prove that A^n = I but A^m ≠ I for all 1 ≤ m < n .

Problem 3.44. A diagram of finite-dimensional vector spaces and linear transformations of the form

V1 --T1--> V2 --T2--> · · · --Tn--> Vn+1

is called an exact sequence if (a) T1 is injective, (b) Tn is surjective, and (c) R(Ti ) = N (Ti+1 ) for all i = 1, · · · , n − 1 . Prove that, for each exact sequence,

Σ_{i=1}^{n+1} (−1)^i dim Vi = 0 .
4. Matrices

This section is devoted to two related objectives:
(1) the study of certain "rank-preserving" operations on matrices;
(2) the application of these operations and the theory of linear transformations to the solution of systems of linear equations.

Definition 4.1. Let A ∈ Mn (F ) . Any one of the following three operations on the rows of A is called an elementary row operation:
ROP 1 interchanging any two rows of A (ROP 1 is Ri ↔ Rj , for some 1 ≤ i ≠ j ≤ n );
ROP 2 multiplying any row of A by a nonzero constant (ROP 2 is Ri = c × Ri , for some 1 ≤ i ≤ n );
ROP 3 adding any constant multiple of a row of A to another row (ROP 3 is Ri = Ri + c × Rj , for some 1 ≤ i ≠ j ≤ n ),
where Ri denotes the i-th row of A .

Definition 4.2. If A ∈ Mm×n (F ) , we define the rank of A , denoted rank(A) , to be the rank of the linear transformation LA : F^n → F^m defined by LA (x) = Ax for all x ∈ F^n .

Theorem 4.3. Let A ∈ Mm×n (F ) be a matrix over a field F . Then
(1) dim Row(A) = dim Col(A) .
(2) Let R = (rij ) be the row-reduced echelon form of A over F . Suppose that R ≠ 0 , let the first s (1 ≤ s ≤ m) rows of R be its nonzero rows, and for i = 1, · · · , s let the leading 1 of the i-th row occur in the ki -th column (the pivot columns; hence k1 < k2 < · · · < ks ). Let a1 , a2 , · · · , am be the row vectors of R , let b1 , b2 , · · · , bn be the column vectors of A , and let S = {a1 , a2 , · · · , as } and P = {bk1 , bk2 , · · · , bks } . Then S is a basis for Row(A) and P is a basis for Col(A).
We recall that the image of LA is Col(A), and hence P is a basis for the range of LA .

Example 4.4. Let
A = [ 1 2 2 3 ]
    [ 1 2 3 5 ]
    [ 2 4 5 8 ] .

Then

rref (A) = [ 1 2 0 −1 ]
           [ 0 0 1  2 ]
           [ 0 0 0  0 ] .

Then a1 = (1, 2, 0, −1) and a2 = (0, 0, 1, 2) , and the pivot columns are the 1st and 3rd columns, so b1 = (1, 1, 2) and b3 = (2, 3, 5) . Hence {a1 , a2 } is a basis for Row(A) and {b1 , b3 } is a basis for Col(A).
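The computation in Example 4.4 is easy to verify with a computer algebra system; a minimal sketch, assuming SymPy is available:

from sympy import Matrix

A = Matrix([[1, 2, 2, 3],
            [1, 2, 3, 5],
            [2, 4, 5, 8]])

R, pivots = A.rref()      # reduced row-echelon form and pivot column indices
print(R)                  # Matrix([[1, 2, 0, -1], [0, 0, 1, 2], [0, 0, 0, 0]])
print(pivots)             # (0, 2): the 1st and 3rd columns (zero-based indexing)
print(A.columnspace())    # basis for Col(A): the 1st and 3rd columns of A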
Theorem 4.5. Let A ∈ Mm×n (F ) be a matrix over a field F . Then
(1) if P ∈ Mm (F ) , then Row(P A) ⊂ Row(A);
(2) in particular, if P is invertible, then Row(P A) equals Row(A);
(3) if Q ∈ Mn (F ) , then Col(AQ) ⊂ Col(A);
(4) in particular, if Q is invertible, then Col(AQ) equals Col(A).

Systems of Linear Equations.

Definition 4.6. A system Ax = b of m linear equations in n unknowns is said to be homogeneous if b = 0 . Otherwise the system is said to be nonhomogeneous.

Note that any homogeneous system has at least one solution, namely the zero vector. This solution is called the trivial solution. Also, the solution set of a homogeneous system is the same as the null space of the linear transformation T : F^n → F^m defined by T (x) = Ax , where x ∈ F^n .

Remark 4.7. Any system of linear equations has exactly one of the following three types of solution sets:
type 1 there is only one solution;
type 2 there are infinitely many solutions;
type 3 there is no solution.
Since a homogeneous system always has a solution, the third type cannot occur; that is, the solution set of a homogeneous system is of type 1 or type 2.

Remark 4.8. The systems Ax = b and Rx = b′ have the same solutions, where [R|b′] is obtained from [A|b] by using the row operations of Definition 4.1. Note that R and A are row equivalent.

Remark 4.9. The only case not to
have a solution is of the form

[R|b′] = [ ∗ · · · ∗ | ∗ ]
         [ ∗ · · · ∗ | ∗ ]
         [ 0 · · · 0 | r ]   with r ≠ 0 ,
that is, R contains a zero row with a nonzero constant r in b′ .

Determinants. The following remark gives the properties and applications of determinants.

Remark 4.10.
(1) For any A ∈ Mn (F ) , det(ROP 1(A)) = − det(A) , where ROP 1 is Ri ↔ Rj , for some 1 ≤ i ≠ j ≤ n .
(2) For any A ∈ Mn (F ) , det(ROP 2(A)) = c det(A) , where ROP 2 is Ri = c × Ri , for some 1 ≤ i ≤ n .
(3) For any A ∈ Mn (F ) , det(ROP 3(A)) = det(A) , where ROP 3 is Ri = Ri + c × Rj , for some 1 ≤ i ≠ j ≤ n .
(4) For any A, B ∈ Mn (F ) , det(AB) = det(A) det(B) .
(5) For any A ∈ Mn (F ) , det(A) = det(A^t) .
(6) For any A ∈ Mn (F ) and k ∈ F , det(kA) = k^n det(A) .
(7) Let A, B ∈ Mn (F ) such that AB = −BA . Prove that if n is odd and F is not a field of characteristic two, then A or B is not invertible.
(8) Prove that if n is odd, then there does not exist an A ∈ Mn (R) such that A² = −In .
(9) Let M ∈ Mn (C) . If M is skew-symmetric and n is odd, then M is not invertible. What if n is even?
(10) Prove that an upper triangular n × n matrix is invertible if and only if all its diagonal entries are nonzero.

Problem Set

Problem 4.11. Prove that E is an elementary matrix if and only if E^t is.

Problem 4.12. If the real matrix

[ a 1 0 b ]
[ 0 0 0 0 ]
[ a 1 c 0 ]
[ 0 b 1 d ]
[ 0 0 c 1 ]
[ 0 0 0 d ]
has rank r , show that
(1) r > 2 ;
(2) r = 3 if and only if a = d = 0 and bc = 1 ;
(3) r = 4 in all other cases.

Problem 4.13. Prove that for any m × n matrix A , rank(A) = 0 if and only if A is the zero matrix.

Problem 4.14. Let A be an m × n matrix with rank m . Prove that there exists an n × m matrix B such that AB = Im .

Problem 4.15. Let B be an n × m matrix with rank m . Prove that there exists an m × n matrix A such that AB = Im .

Problem 4.16. Suppose A ∈ M5×6 (F ) is such that (1, 2, 3, 4, 5, 6) and (1, 1, 1, 1, 1, 1) are solutions to Ax = 0 . Must the reduced echelon form of A contain a row of zeros? Be sure to fully justify your answer.

Problem 4.17. Let F be a field, and let A, B ∈ Mn (F ) .
(1) Show that tr(AB) = tr(BA) .
(2) Show that AB − BA = In is impossible.
(3) Let A ∈ Mm×n (R) . Show that A = 0 if and only if tr(A^t A) = 0 .

Problem 4.18. Let A ∈ Mn (C) be a skew-symmetric matrix and suppose that n is odd. Prove that det A = 0 .

Problem 4.19. A matrix A ∈ Mn (F ) is called orthogonal if AA^t = I . If A is orthogonal, show that det A = ±1 . Give an example of an orthogonal matrix for which det A = −1 .
Problem 4.20.
(1) Let A ∈ Mn (F ) . Show that there are at most n distinct scalars c ∈ F such that det(cI − A) = 0 .
(2) Let A, B ∈ Mn (F ) . Show that if A is invertible, then there are at most n scalars c ∈ F for which the matrix cA + B is not invertible.

Problem 4.21. Prove or give a counterexample to the following statement: if the coefficient matrix of a system of m linear equations in n unknowns has rank m , then the system has a solution.

Problem 4.22. Let A be an n × n matrix. Prove that A is row-equivalent to the n × n identity matrix if and only if the system of equations AX = 0 has only the trivial solution.

Problem 4.23. Let A and B be 2 × 2 matrices such that AB = I2 . Prove that BA = I2 .

Problem 4.24. Determine all solutions to the following infinite system of linear equations in the infinitely many unknowns x1 , x2 , · · · :

x1 + x3 + x5 = 0
x2 + x4 + x6 = 0
x3 + x5 + x7 = 0
⋮

How many free parameters are required?
5. Polynomial Rings

Let F [x] = P(F ) be the set of polynomials in x over a field F and let f (x), g(x) ∈ F [x] . If g(x) ≠ 0 , deg g(x) is the highest power of x occurring in g(x) . We call g(x) monic if 1 is the coefficient of the highest power of x in g(x) .

[Division Algorithm] Let f (x), g(x) ∈ F [x] with g(x) ≠ 0 . Then there exist unique q(x), r(x) ∈ F [x] such that f (x) = g(x)q(x) + r(x) , where either r(x) = 0 or deg r(x) < deg g(x) .

Proposition 5.1. If deg f (x) = n , then f (x) has at most n roots in F , counting multiplicity.

Definition 5.2. Let I ⊂ F [x] . I is called an ideal of F [x] if for every f (x), g(x) ∈ I and h(x) ∈ F [x] , f (x) + g(x) ∈ I and h(x)f (x) ∈ I .

Example 5.3. I = {f (x) ∈ F [x] | f (3) = 0} is an ideal of F [x] .

Proof. For f (x), g(x) ∈ I and h(x) ∈ F [x] , we have to check whether f (x) + g(x) ∈ I and h(x)f (x) ∈ I . Indeed, (f + g)(3) = f (3) + g(3) = 0 + 0 = 0 and (hf )(3) = h(3)f (3) = h(3) × 0 = 0 , as desired.

From this example, we may conclude that the set of all polynomials f (x) with f (T ) = 0 is an ideal, where T ∈ L(V ) .

Theorem 5.4. Let I ≠ {0} be an ideal of F [x] . Then there exists a unique monic polynomial f (x) ∈ I such that I = {g(x)f (x) | g(x) ∈ F [x]} . We then call the polynomial f (x) a generator of the ideal I .

Let T ∈ L(V ) . Again, the set of all polynomials f (x) with f (T ) = 0 is an ideal I , and we can find a generator for I , say m(x) . We call the monic generator m(x) of I = {f (x) ∈ F [x] | f (T ) = 0} the minimal polynomial of the linear transformation T .
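For example, if T = idV on a nonzero finite-dimensional space, then f (T ) = 0 exactly when (x − 1) divides f (x) , so mT (x) = x − 1 . Similarly, a projection T with T ² = T , T ≠ 0 , and T ≠ idV satisfies x² − x ∈ I but x ∉ I and x − 1 ∉ I , so mT (x) = x² − x = x(x − 1) .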
6. Diagonalizations

For a given linear transformation T on a finite-dimensional vector space V , we seek answers to the following questions:
1. Does there exist an ordered basis B for V such that [T ]B is a diagonal matrix?
2. If such a basis exists, how can it be found?

Let V be a finite dimensional vector space over a field F , and let T ∈ L(V ) . Then we may have

[T ]B = [ c1      0 ]
        [    ⋱     ]
        [ 0      cn ]

in some basis B = {b1 , · · · , bn } for V . For each j , we then have T (bj ) = cj bj .

Definition 6.1. Let v ∈ V , v ≠ 0 . We call v an eigenvector of T if T (v) = λv for some λ ∈ F ; λ is called the eigenvalue of T corresponding to the eigenvector v .

Similarly, we define an eigenvector and eigenvalue of a matrix M by M v = λv , where v ≠ 0 .

Example 6.2. Let T ∈ L(C²) be defined by T (x, y) = (−y, x) . Then i and −i are eigenvalues of T , and (i, 1) and (−i, 1) are eigenvectors of T associated with i and −i , respectively. Let B = {(i, 1), (−i, 1)} . What is [T ]B ?

Theorem 6.3. T ∈ L(V ) can be written in the diagonal form

[ c1      0 ]
[    ⋱     ]
[ 0      cn ]

in the basis B = {b1 , · · · , bn } if and only if each bj is an eigenvector of T with associated eigenvalue cj .

Example 6.4. Let T ∈ L(R³) be defined by T (x, y, z) = (0, x, y) . Since T ³ = 0 , every eigenvalue of T is 0; so if there were a basis {b1 , b2 , b3 } for R³ consisting of eigenvectors of T , then T (b1 ) = T (b2 ) = T (b3 ) = 0 , i.e., T = 0 , a contradiction. Hence we may conclude that R³ has no basis consisting of eigenvectors of T .

Now we will investigate which conditions guarantee that a linear transformation is similar to a diagonal matrix.

Proposition 6.5. Let dim V < ∞ , let T ∈ L(V ) , and let B and C be ordered bases for V . Then det([T ]B ) = det([T ]C ) .
Now we may define det(T ) to be det([T ]B ) , where B is any ordered basis for V .

Theorem 6.6. Let dim V < ∞ , let T ∈ L(V ) , and let λ ∈ F . The following statements are equivalent:
(1) λ is an eigenvalue of T ;
(2) N (T − λIV ) ≠ {0} ;
(3) det([T − λIV ]B ) = 0 for some basis B for V .
Moreover, if λ is an eigenvalue of T and v ∈ V , v ≠ 0 , then v is an eigenvector corresponding to λ if and only if v ∈ N (T − λIV ) . Hence we may consider N (T ) as the set of eigenvectors of T corresponding to the eigenvalue 0 of T (together with the zero vector).

It is a natural next step to ask how to find the eigenvectors and eigenvalues of T .

Definition 6.7. Let dim V = n and T ∈ L(V ) . Then cT (x) = det(T − xIV ) ∈ F [x] is called the characteristic polynomial of T .

Proposition 6.8. cT (x) is a polynomial of degree n with leading coefficient (−1)^n , and λ ∈ F is an eigenvalue of T if and only if λ is a root of cT (x) .

Proof. Suppose λ ∈ F is an eigenvalue of T . Then there exists a non-zero vector α ∈ V such that T (α) = λα . Hence (T − λI)(α) = 0 has a non-trivial solution, namely α . Thus T − λI is singular, i.e., det(T − λI) = 0 . Conversely, suppose λ is a root of cT (x) . Then the system of linear equations (T − λI)(x1 , · · · , xn )^t = 0 has a non-trivial solution, i.e., λ ∈ F is an eigenvalue of T .

The characteristic polynomial cT (x) lies in the ideal I = {f ∈ F [x] | f (T ) = 0} . As asserted above, we can find the generator (the monic minimal polynomial) of I , say mT (x) ∈ I ; that is, cT (x) = g(x)mT (x) for some g(x) ∈ F [x] . Note that mT (T ) = 0 , since mT (x) ∈ I . The characteristic and minimal polynomials of T have the same roots, except for multiplicities.

Definition 6.9. Let dim V = n , let T ∈ L(V ) , and let λ be an eigenvalue of T . The eigenspace of T corresponding to the eigenvalue λ is Eλ (T ) = {v ∈ V | T (v) = λv} .

Definition 6.10. Let V be a vector space and T ∈ L(V ) . A subspace W of V is called T -invariant if T (w) ∈ W for all w ∈ W .

Lemma 6.11. N (T ) , R(T ) , and Eλ (T ) are T -invariant subspaces of V .

Now we want to know which properties determine whether a linear transformation (matrix) is diagonalizable.

Definition 6.12. Let V be an F -vector space, and let W1 , · · · , Wr be subspaces of V . V is the direct sum of the Wi if there exist bases Bi for Wi , i = 1, · · · , r , such that Bi ∩ Bj = ∅ for i ≠ j and ∪_{i=1}^{r} Bi is a basis for V .
Then we write V = W1 ⊕ · · · ⊕ Wr .

Proposition 6.13. Suppose that V = W1 ⊕ · · · ⊕ Wr . Then each element of V has a unique representation of the form w1 + · · · + wr , where wi ∈ Wi for i = 1, · · · , r .

Lemma 6.14. Let λ1 , · · · , λk be distinct eigenvalues of T . For each i = 1, · · · , k , let vi ∈ Eλi (T ) , and suppose that v1 + · · · + vk = 0 . Then vi = 0 for all i = 1, · · · , k .

Theorem 6.15. Let λ1 , · · · , λk be distinct eigenvalues of T ∈ L(V ) . For each i = 1, · · · , k , let Si be a linearly independent subset of Eλi (T ) . Then S = ∪_{i=1}^{k} Si is linearly independent.

Theorem 6.16. Let λ1 , · · · , λk be the distinct eigenvalues of T ∈ L(V ) , and let W be the subspace spanned by all eigenvectors of T . Then W = Eλ1 (T ) ⊕ · · · ⊕ Eλk (T ) .

Theorem 6.17. Let dim V < ∞ and let T ∈ L(V ) . Assume that cT (x) splits over F and let λ1 , · · · , λk be the distinct eigenvalues of T . Then the following statements are equivalent:
(1) T is diagonalizable;
(2) V = Eλ1 (T ) ⊕ · · · ⊕ Eλk (T ) ;
(3) for each i = 1, · · · , k , the geometric multiplicity of λi equals the algebraic multiplicity of λi ;
(4) the minimal polynomial mT (x) has distinct roots.

Corollary 6.18. Let dim V = n and let T ∈ L(V ) . If T has n distinct eigenvalues, then T is diagonalizable. (Note that the converse is not true.)

Remark 6.19. Let V be a vector space with a basis B and write A = [T ]B ∈ Mn (F ) . Then the following are equivalent:

AX = 0 has a non-trivial solution ⇐⇒ the null space of T ≠ {0} ⇐⇒ zero is an eigenvalue of A ⇐⇒ A is not invertible (that is, det A = 0) ⇐⇒ T is not one-to-one.

Equivalently: all eigenvalues of A are nonzero ⇐⇒ AX = 0 has only the trivial solution ⇐⇒ A is invertible (that is, det A ≠ 0).
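To illustrate Corollary 6.18 concretely: the matrix A with rows (2, 1) and (1, 2) has characteristic polynomial (2 − x)² − 1 = (x − 1)(x − 3) , hence two distinct eigenvalues 1 and 3 with eigenvectors (1, −1) and (1, 1) , and so is diagonalizable. A minimal computational sketch, assuming SymPy is available:

from sympy import Matrix

A = Matrix([[2, 1],
            [1, 2]])
P, D = A.diagonalize()      # A = P D P**-1; the diagonal entries of D are 1 and 3
assert A == P * D * P.inv()
print(D)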
Problem Set
Problem 6.20. Find the eigenvalues and eigenvectors of

[ 1 1 1 1 ]
[ 1 1 1 1 ]
[ 1 1 1 1 ]
[ 1 1 1 1 ] .

Problem 6.21. Let D : P3 (R) → P3 (R) be the differentiation operator defined by D(f (x)) = f ′(x) for f (x) ∈ P3 (R) . Find all eigenvalues and eigenvectors of D and D² .

Problem 6.22. For any square matrix A , show that A and A^t have the same characteristic polynomial.

Problem 6.23. Let T ∈ L(Mn (R)) be defined by T (A) = A^t , the transpose of A .
(1) Show that T is a linear transformation.
(2) Show that ±1 are the only eigenvalues of T .
(3) Describe the eigenvectors corresponding to each eigenvalue of T .
(4) Find an ordered basis B for M2 (R) such that [T ]B is a diagonal matrix.
(5) Find an ordered basis B for Mn (R) , n > 2 , such that [T ]B is a diagonal matrix.

Problem 6.24. Let A, B ∈ Mn (C) .
(1) Show that if B is invertible, then there exists a scalar c ∈ C such that A + cB is not invertible. (Hint: check the determinant.)
(2) Find 2 × 2 matrices A and B such that A is invertible, B ≠ 0 , but A + cB is invertible for all c ∈ C .

Problem 6.25. Let λ1 , · · · , λn be the eigenvalues of an n × n matrix A . Then
(1) A is invertible if and only if λi ≠ 0 for all i .
(2) If A is invertible, then A^{−1} has eigenvalues 1/λ1 , · · · , 1/λn . (How about eigenvectors of A and A^{−1} ?)
(3) Let A, B ∈ Mn (F ) , where F is a field. Show that if (I − AB) is invertible, then (I − BA) is invertible and (I − BA)^{−1} = I + B(I − AB)^{−1} A .
(4) For all A, B ∈ Mn (R) , show that AB and BA have the same eigenvalues.
(5) Let V = Mn (F ) , A ∈ Mn (F ) , and let T ∈ L(V ) be defined by T (B) = AB . Show that the minimal polynomial for T is the minimal polynomial for A .
(6) Let A, B ∈ Mn (F ) be similar matrices, where F is a field. Show that they have the same eigenvalues. Do they have the same characteristic polynomial? Do they have the same minimal polynomial?
(7) Show that A and A^t have the same eigenvalues. How about eigenvectors of A and A^t ?
(8) If A^m = 0 for some m > 0 , then all eigenvalues of A are zero.
(9) If A² = I , then the sum of the eigenvalues of A is an integer.
(10) Let A be a real skew-symmetric matrix with eigenvalue λ . Show that the real part of λ is zero, and that λ̄ is also an eigenvalue.

Problem 6.26. Let A ∈ Mn (F ) . Suppose A has two distinct eigenvalues λ1 and λ2 with dim(Eλ1 ) = n − 1 . Show that A is diagonalizable.

Problem 6.27. If A ∈ M3 (F ) has eigenvalues 1, 2, and 3, what are the eigenvectors of B = (A − I3 )(A − 2I3 )(A − 3I3 ) ?

Problem 6.28. Let f (x) = det(A − xI) be the characteristic polynomial of A . Evaluate f (A) for

A = [  1 2  2 ]
    [  1 2 −1 ]
    [ −1 1  4 ] .

(In fact, this is the Cayley-Hamilton Theorem.)

Problem 6.29. Let
A = [ 0 a 0 0 ]
    [ 0 0 b 0 ]
    [ 0 0 0 c ]
    [ 0 0 0 0 ] .
Find the conditions on a, b, c ∈ R under which A is diagonalizable.

Problem 6.30. Let P : R² → R² be defined by P (x, y) = (x, 0) . Show that P is a linear transformation. What is the minimal polynomial for P ?

Problem 6.31. Show that every matrix A such that A² = A is similar to a diagonal matrix.

Problem 6.32. Compute A^{2004} b for

A = [ 1 2 −1 ]          [ 2 ]
    [ 0 5 −2 ] ,   b =  [ 4 ] .
    [ 0 6 −2 ]          [ 7 ]
Problem 6.33. Let V = M2 (F ) and let T ∈ L(V ) be defined by T (A) = At for A ∈ V . (1) Determine a maximal linearly independent set of eigenvectors of T . (2) Determine whether or not T is diagonalizable. If so, determine a diagonal matrix D such that D = [T ]B for a suitable basis B ; if not, prove that T is not diagonalizable. Matrix Limits. In this part we will investigate the limit of a sequence of powers A, A2 , · · · , where A is a square matrix with complex entries. Such sequences and their limits have practical applications in the life and natural sciences.
Definition 6.34. Let L, A1 , A2 , · · · ∈ Mn×p (C) . The sequence A1 , A2 , · · · is said to converge to the matrix L if

lim_{m→∞} (Am )ij = Lij

for all 1 ≤ i ≤ n and 1 ≤ j ≤ p .

Note that e^x = 1 + x + x²/2! + · · · + x^n/n! + · · · .

Definition 6.35. For A ∈ Mn (C) , define e^A = lim_{m→∞} Bm , where

Bm = I + A + A²/2! + · · · + A^m/m! .

Thus we have

e^A = I + A + A²/2! + · · · + A^n/n! + · · · .

Problem 6.36.
(1) Let L, A1 , A2 , · · · ∈ Mn×p (C) . Show that if lim_{m→∞} Am = L , then lim_{m→∞} (Am )^t = L^t .
(2) Let P^{−1} A P = D be a diagonal matrix. Prove that e^A = P e^D P^{−1} .
(3) Find A, B ∈ M2 (R) such that e^A e^B ≠ e^{A+B} .
7. Jordan Canonical Forms

In the previous section we studied conditions that determine which linear transformations (matrices) are diagonalizable. Unfortunately, only some of them are diagonalizable. Hence in this section we will see that, even when they are not diagonalizable, they can still be reduced to considerably simple forms (we call these forms Jordan canonical forms). Let's start with an example.

Example 7.1. Let T : C⁸ → C⁸ be a linear transformation with an ordered basis (we call this basis a Jordan canonical basis) B = {b1 , · · · , b8 } for C⁸ such that

[T ]B = [ 2 1 0 0 0 0 0 0 ]
        [ 0 2 1 0 0 0 0 0 ]
        [ 0 0 2 0 0 0 0 0 ]
        [ 0 0 0 2 0 0 0 0 ]
        [ 0 0 0 0 3 1 0 0 ]
        [ 0 0 0 0 0 3 0 0 ]
        [ 0 0 0 0 0 0 0 1 ]
        [ 0 0 0 0 0 0 0 0 ]

is a Jordan canonical form of T . Then the characteristic polynomial of T is cT (x) = (x − 2)⁴(x − 3)²x² . Also, we know that T (b1 ) = 2b1 , T (b4 ) = 2b4 , T (b5 ) = 3b5 , and T (b7 ) = 0b7 ; hence b1 , b4 are eigenvectors for λ = 2 , b5 is an eigenvector for λ = 3 , and b7 is an eigenvector for λ = 0 . In particular, the minimal polynomial of T is mT (x) = (x − 2)³(x − 3)²x² .

For comparison, let S : C⁸ → C⁸ be a linear transformation with an ordered basis (a set of eigenvectors) C = {c1 , · · · , c8 } for C⁸ such that

[S]C = diag(2, 2, 2, 2, 3, 3, 0, 0)

is a Jordan canonical form (a diagonal matrix) different from [T ]B . Then the characteristic polynomial of S is cS (x) = (x − 2)⁴(x − 3)²x² , while the minimal polynomial of S is mS (x) = (x − 2)(x − 3)x . Note that T and S have the same characteristic polynomial but different Jordan canonical forms.

Also, T (b2 ) = b1 + 2b2 ⇒ (T − 2I)(b2 ) = b1 .
Similarly, (T − 2I)(b3 ) = b2 and (T − 2I)²(b3 ) = b1 . Note that (T − 2I)³(bj ) = 0 for all j = 1, 2, 3 , and T (b4 ) = 2b4 . Hence b1 = (T − 2I)²(b3 ) and b2 = (T − 2I)(b3 ) . Similarly, we have b5 = (T − 3I)(b6 ) and b7 = (T − 0I)(b8 ) .

Now we generalize this as follows.

Definition 7.2. The Jordan block of size r corresponding to λ , J(r, λ) , is the r × r matrix such that J(r, λ)_{i,i} = λ for i = 1, · · · , r , J(r, λ)_{i,i+1} = 1 for i = 1, · · · , r − 1 , and J(r, λ)_{i,j} = 0 for all other i, j . We say that a matrix is in Jordan form if it is block diagonal with each block being a Jordan block.

Definition 7.3. Let V be an F -vector space, and let T ∈ L(V ) . Let λ ∈ F and let v ∈ V , v ≠ 0 .
(1) v is called a generalized eigenvector of T corresponding to λ if (T − λIV )^p (v) = 0 for some positive integer p .
(2) Kλ (T ) = {v ∈ V | (T − λIV )^p (v) = 0 for some positive integer p} is called the generalized eigenspace of T corresponding to λ .

Remark 7.4. Let v ∈ Kλ (T ) , and let p > 0 be minimal with (T − λIV )^p (v) = 0 . Set vi = (T − λIV )^{p−i} (v) for i = 1, · · · , p . Then v = vp and vi = (T − λIV )(v_{i+1} ) for i = 1, · · · , p − 1 . We call the set {v1 , · · · , vp } = {(T − λIV )^{p−1} (vp ), · · · , (T − λIV )(vp ), vp } a cycle of generalized eigenvectors of T corresponding to λ ; v1 is called the initial vector, vp the end vector, and p the length of the cycle. Note that v1 is an eigenvector of T corresponding to the eigenvalue λ .

Proposition 7.5. Kλ (T ) is a T -invariant subspace of V , and Kλ (T ) ⊇ Eλ (T ) .

Theorem 7.6. Let T be a linear transformation on a finite-dimensional vector space V such that the characteristic polynomial of T splits, and let λ1 , · · · , λk be the distinct eigenvalues of T with corresponding multiplicities m1 , · · · , mk . For each j = 1, · · · , k , let Bj be an ordered basis for Kλj (T ) . Then
(1) Bi ∩ Bj = ∅ for i ≠ j ;
(2) B = B1 ∪ · · · ∪ Bk is an ordered basis for V ;
(3) dim Kλj (T ) = mj for j = 1, · · · , k .
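For example, let A = J(2, λ) , the 2 × 2 matrix with rows (λ, 1) and (0, λ) , acting on F² . Then (A − λI)e2 = e1 and (A − λI)e1 = 0 , so {e1 , e2 } is a cycle of generalized eigenvectors of length 2, with initial vector the eigenvector e1 and end vector e2 .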
Theorem 7.7. Let T be a linear transformation on a finite-dimensional vector space V such that the characteristic polynomial of T splits, and suppose that B is a basis for V such that B is a disjoint union of cycles of generalized eigenvectors of T . Then
(1) for each cycle γ of generalized eigenvectors contained in B , W = span(γ) is T -invariant and [T |W ]γ is a Jordan block;
(2) B is a Jordan canonical basis for V .

Theorem 7.8. A cycle of generalized eigenvectors is linearly independent.

Remark 7.9. Let BW = {v1 , · · · , vp } be a cycle of generalized eigenvectors of T corresponding to λ , and let W = span{v1 , · · · , vp } . Then W is T -invariant. Indeed, (T − λIV )(vj ) = T (vj ) − λvj , so

T (vj ) = λ vj + v_{j−1}   if j ≠ 1 ,
T (vj ) = λ vj             if j = 1 .
Let T |W be the restriction of T to W . Then we have

[T |W ]_{BW} = [ [T |W (v1 )]_{BW} , · · · , [T |W (vp )]_{BW} ]

             = [ λ 1 0 · · · 0 ]
               [ 0 λ 1 · · · 0 ]
               [ 0 0 λ  ⋱   ⋮ ]
               [ ⋮        ⋱ 1 ]
               [ 0 0 0 · · · λ ] ,

and hence this matrix is the Jordan block of size p corresponding to λ , denoted J(p, λ) .

Jordan Canonical Form Theorem. Let V be an F -vector space with dim V < ∞ , and let T ∈ L(V ) . Suppose cT (x) splits over F . Then there exists a basis B for V which is a disjoint union of cycles of generalized eigenvectors for T . We call such a basis a Jordan canonical basis for T . Hence we have

[T ]B = [ J(r1 , λ1 )              0 ]
        [             ⋱             ]
        [ 0              J(rs , λs ) ] ,

where each J(ri , λi ) is an ri × ri Jordan block corresponding to λi (the λi are not necessarily distinct).

We will now show that a Jordan canonical form exists.

Lemma 7.10. Let V be an n-dimensional vector space over the field F = C of complex numbers, and let T ∈ L(V ) . Then there exists an (n − 1)-dimensional subspace W of V such that T (W ) ⊂ W .
Proof. Consider the linear transformation T ^t : V ∗ → V ∗ , where V ∗ is the dual space of V . Since F = C , T ^t has an eigenvector l ∈ V ∗ (l ≠ 0) with T ^t (l) = λl for some λ ∈ C . Since l ∈ V ∗ , l : V → C is a non-zero linear transformation; let W = Ker l . Then dim W = n − 1 (because dim C = 1 , by the Dimension Theorem). Now we claim that T (W ) ⊂ W . For any w ∈ W = Ker l ,

l(T (w)) = (T ^t l)(w) = (λ l)(w) = λ (l(w)) = 0 .

Thus T (w) ∈ Ker l = W , as claimed.
Theorem 7.11 (Existence of Jordan Canonical Forms). Let V be an n-dimensional vector space over the field F = C of complex numbers, and let T ∈ L(V ) . Then there exists a basis B for V such that

[T ]B = [ J(p, λ1 )                         ]
        [           J(q, λ2 )              ]
        [                     ⋱            ]
        [                       J(s, λr ) ] ,

where p + q + · · · + s = n and λ1 , λ2 , · · · , λr are eigenvalues of T .

Proof. We prove it by induction on dim V = n . Let n = 1 . Then for any nonzero α ∈ V , T (α) = λα for some λ ∈ C , so [T ]B = (λ)_{1×1} with B = {α} .

Suppose the theorem holds in dimension n − 1 ; we prove the case dim V = n . By the previous Lemma, there exists an (n − 1)-dimensional subspace W of V such that T (W ) ⊂ W . By the induction hypothesis, suppose B0 = {e1 , · · · , ep , f1 , · · · , fq , · · · , h1 , · · · , hs } is a Jordan canonical basis for W , so that

T (ej ) = ej−1 + λ1 ej (j ≠ 1) ,  T (e1 ) = λ1 e1 ,
T (fj ) = fj−1 + λ2 fj (j ≠ 1) ,  T (f1 ) = λ2 f1 ,
· · ·
T (hj ) = hj−1 + λr hj (j ≠ 1) ,  T (h1 ) = λr h1 .

Complete B0 to a basis for V by adding a vector e . Then

T (e) = Σ_{j=1}^{p} αj ej + Σ_{j=1}^{q} βj fj + · · · + Σ_{j=1}^{s} γj hj + λe .

We may assume λ = 0 ; otherwise replace T by T − λI . Let

e′ = e − Σ_{j=1}^{p} xj ej − Σ_{j=1}^{q} yj fj − · · · − Σ_{j=1}^{s} zj hj .

We want to find the xj , yj , · · · , zj making T (e′) as simple as possible. Now

T (e′) = T (e) − Σ_{j=1}^{p} xj T (ej ) − Σ_{j=1}^{q} yj T (fj ) − · · · − Σ_{j=1}^{s} zj T (hj )
      = (α1 − λ1 x1 − x2 )e1 + (α2 − λ1 x2 − x3 )e2 + · · · + (αp−1 − λ1 xp−1 − xp )ep−1 + (αp − λ1 xp )ep
      + (β1 − λ2 y1 − y2 )f1 + · · · + (βq − λ2 yq )fq
      + · · · .

Now we consider two cases.

(1) λi ≠ 0 for some i , say i = 1 . Then we may take the xj to kill the e-part of T (e′) : xp = αp /λ1 , xp−1 = (αp−1 − xp )/λ1 , and so on. Then T (e′) ∈ span{f1 , · · · , fq , · · · , h1 , · · · , hs , e′} . Now let V0 = span{f1 , · · · , fq , · · · , h1 , · · · , hs , e′} , a subspace of V . Then T (V0 ) ⊂ V0 and dim V0 ≤ n − 1 , so by the induction step there exists a Jordan canonical basis B∗ for V0 . Hence {e1 , · · · , ep } ∪ B∗ is a Jordan canonical basis for V .

(2) Suppose all the eigenvalues λi are zero. Then

T (e′) = (α1 − x2 )e1 + (α2 − x3 )e2 + · · · + (αp−1 − xp )ep−1 + αp ep
      + (β1 − y2 )f1 + · · · + βq fq
      + · · · ,

and we may choose the coefficients so that T (e′) = αp ep + βq fq + · · · + γs hs . If αp = 0 , then T (e′) ∈ span{f1 , · · · , fq , · · · , h1 , · · · , hs } , and we conclude as in case (1). So we may assume αp ≠ 0 , βq ≠ 0 , · · · , γs ≠ 0 , and p ≥ q ≥ · · · ≥ s . Since all λi = 0 , T shifts each chain: T (ej ) = ej−1 , T (fj ) = fj−1 , and so on. Hence

e′
T (e′) = αp ep + βq fq + · · · + γs hs
T ²(e′) = αp ep−1 + βq fq−1 + · · · + γs hs−1
⋮
T ^{p−1}(e′) = αp e2 + · · ·
T ^{p}(e′) = αp e1 + · · · .

It is easy to check that

B = {T ^p (e′), T ^{p−1}(e′), · · · , T (e′), e′, f1 , · · · , fq , · · · , h1 , · · · , hs }

is a basis for V . Thus we have a Jordan canonical basis B for V .
Remark 7.12.
(1) The number of Jordan blocks for the eigenvalue λ equals dim Eλ (T ) .
(2) V is the direct sum of the Kλi (T ) .
(3) If B ′ is any other Jordan canonical basis, then for each eigenvalue λ , B and B ′ have exactly the same number of cycles of generalized eigenvectors corresponding to λ , and those cycles have the same lengths.
(4) In particular, [T ]B = [T ]B′ up to permutation of the Jordan blocks.

Minimal polynomials.

Theorem 7.13. Let V be a finite dimensional vector space over F and let T ∈ L(V ) . Let I = {p(x) ∈ F [x] | p(T ) = 0} . Then I contains a unique monic polynomial mT (x) of minimal degree among all non-zero polynomials in I . Moreover, I = {g(x)mT (x) | g(x) ∈ F [x]} . Recall that mT (x) is called the minimal polynomial of T .

Theorem 7.14. Let V be a finite dimensional vector space over F and let T ∈ L(V ) . Assume that cT (x) splits over F and let λ1 , · · · , λk be the distinct eigenvalues of T . For each i , let mi denote the size of the largest Jordan block corresponding to λi in the Jordan canonical form of T . Then mT (x) = (x − λ1 )^{m1} · · · (x − λk )^{mk} .

Remark 7.15. Let V be a finite dimensional vector space over F and let T ∈ L(V ) . Assume that cT (x) splits over F and that the Jordan canonical form of T has t Jordan blocks corresponding to λ , say of sizes r1 , · · · , rt . Then cT (x) = Π_λ (x − λ)^{r1 +···+rt} and mT (x) = Π_λ (x − λ)^{Max{r1 ,··· ,rt}} .

Corollary 7.16. Let V be a finite dimensional vector space over F and let T ∈ L(V ) . Then λ ∈ F is an eigenvalue of T if and only if λ is a root of mT (x) ; that is, cT (x) and mT (x) have exactly the same roots.

Proof. Let mT (x) = (x − λ1 )^{e1} (x − λ2 )^{e2} · · · (x − λr )^{er} . We claim that every λj is an eigenvalue of T . It is enough to show that λr is an eigenvalue of T ; the others are similar. Let S = (T − λ1 I)^{e1} (T − λ2 I)^{e2} · · · (T − λr I)^{er −1} , a linear transformation on V . Note that S ≠ 0 ; otherwise the polynomial mT (x)/(x − λr ) , of smaller degree, would annihilate T , a contradiction to the minimality of mT (x) . Let W = S(V ) = Im S , which is nonzero since V ≠ Ker S . We will show that T (w) = λr w for w ∈ W . Let w = S(α) for some α ∈ V . Since (T − λr I)S = mT (T ) = 0 , applying this to α ∈ V gives 0 = (T − λr I)S(α) = (T − λr I)(w) .
Hence T(w) = λr w, as desired; that is, λr is an eigenvalue of T. Conversely, let λ be an eigenvalue of T; that is, T(α) = λα for some nonzero vector α. Note that for each j,

(T − λjI)^{ej}(α) = (λ − λj)^{ej}(α) .

Hence we have 0 = mT(T)(α) = (λ − λ1)^{e1}(λ − λ2)^{e2} · · · (λ − λr)^{er}(α). Since α ≠ 0, (λ − λ1)^{e1}(λ − λ2)^{e2} · · · (λ − λr)^{er} = 0. Hence there exists 1 ≤ j ≤ r such that λ = λj. Thus λ is a root of mT(x).

Corollary 7.17. Let V be a finite dimensional vector space over F and let T ∈ L(V). Then the following statements are equivalent:
(1) T is diagonalizable;
(2) mT(x) has no multiple roots;
(3) there exists f(x) ∈ F[x] which splits over F and has no multiple roots such that f(T) = 0.

Corollary 7.18 (Cayley-Hamilton Theorem). cT(x) is a multiple of mT(x).

Example 7.19. Let T ∈ L(Mn(C)) be defined by T(A) = A − A^t, where A^t is the transpose of A. We will determine the Jordan canonical form of T.
(1) We first look for some polynomial p(x) ∈ {g(x) ∈ C[x] | g(T) = 0}.
(2) Since T(A) = A − A^t, we have T^2(A) = T(A − A^t) = (A − A^t) − (A − A^t)^t = 2T(A); that is, T^2 − 2T = 0. Hence p(x) = x^2 − 2x works.
(3) Now the minimal polynomial mT(x) must be a monic divisor of p(x) that annihilates T.
(4) The candidates are x, x − 2, and x(x − 2) = p(x).
(5) Each candidate has no multiple roots, so by Corollary 7.17 T is diagonalizable; if mT(x) = x then T = 0, if mT(x) = x − 2 then T = 2I, and otherwise mT(x) = p(x).
(6) The first two cases are impossible: if A is skew-symmetric (A^t = −A) then T(A) = 2A, so T ≠ 0; and if A is symmetric (A^t = A) then T(A) = 0, so T ≠ 2I.
(7) Hence mT(x) = x(x − 2). Recall that dim(Sym) = n(n+1)/2 and dim(Skew-Sym) = n(n−1)/2, where dim(Mn(C)) = n^2.
(8) Thus, since T is diagonalizable, the Jordan canonical form of T is the diagonal matrix (consisting of 1 × 1 Jordan blocks)

[ 0 · I_{n(n+1)/2}         0          ]
[         0          2 · I_{n(n−1)/2} ] .
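For n = 2 the conclusion of Example 7.19 is easy to verify numerically; here is a small numpy sketch (the ordering {E11, E12, E21, E22} of the standard basis of M2(C) is our own choice, not fixed by the text):

    import numpy as np

    # Matrix of T(A) = A - A^t on M_2 in the basis {E11, E12, E21, E22}:
    # T(E11) = T(E22) = 0, T(E12) = E12 - E21, T(E21) = E21 - E12.
    T = np.array([
        [0.,  0.,  0., 0.],
        [0.,  1., -1., 0.],
        [0., -1.,  1., 0.],
        [0.,  0.,  0., 0.],
    ])

    # m_T(x) = x(x - 2): T(T - 2I) = 0, while T != 0 and T != 2I.
    assert np.allclose(T @ (T - 2 * np.eye(4)), np.zeros((4, 4)))

    # Eigenvalue 0 with multiplicity n(n+1)/2 = 3 and eigenvalue 2 with n(n-1)/2 = 1.
    print(np.sort(np.linalg.eigvals(T).real))   # [0. 0. 0. 2.]

Example 7.20. Let dim V = 4 and B = {b1, b2, b3, b4} be a basis for V. Let T ∈ L(V) be defined by T(b1) = 0, T(b2) = −5b1, T(b3) = 5b1, and T(b4) = 2b2 + 5b3. We will determine the Jordan canonical form for T.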
(1) Since T(b1) = 0, T(b2) = −5b1, T(b3) = 5b1, and T(b4) = 2b2 + 5b3, we have T^2(b1) = 0, T^2(b2) = 0, T^2(b3) = 0, and T^2(b4) = 15b1. Again, T^3(bi) = 0 for all i = 1, · · · , 4. Note that T^2 ≠ 0 and T^3 = 0.
(2) Since T^3 = 0 but T^2 ≠ 0, we have mT(x) = x^3; and since cT(x) and mT(x) have exactly the same roots and dim V = 4, cT(x) = x^4.
(3) By Remark 7.15 or Theorem 7.14, the Jordan canonical form for T is

[ J(3, 0)    0    ]
[    0    J(1, 0) ] .

Example 7.21. Let T be the linear transformation on P2(R) defined by T(f) = −f − f′. Let B = {1, x, x^2} be the standard basis for P2(R). Then

[T]B =
[ −1  −1   0 ]
[  0  −1  −2 ]
[  0   0  −1 ] ,

which has characteristic polynomial cT(x) = −(x + 1)^3. Hence λ = −1 is the only eigenvalue of T, and then Kλ = P2(R). Also, dim Eλ = 1. Now we want to compute a Jordan canonical basis, i.e., we have to find a vector f ∈ P2(R) such that {(T + I)^2(f), (T + I)(f), f} is a basis for P2(R); note that (T + I)^3(f) = 0. Taking f(x) = x^2, the set A = {(T + I)^2(x^2), (T + I)(x^2), x^2} is a Jordan canonical basis for T such that

[T]A =
[ −1   1   0 ]
[  0  −1   1 ]
[  0   0  −1 ] .

Note that the minimal polynomial is mT(x) = (x + 1)^3. Hence the maximal size of a Jordan block for λ = −1 is 3.

Problem Set

Problem 7.22. Let dim V = 4 and let B = {b1, b2, b3, b4} be a basis for V. Let T ∈ L(V) be defined by T(b1) = 0, T(b2) = −5b1, T(b3) = 5b1, T(b4) = 2b2 + 5b3.
(1) Determine cT(x) and mT(x) without determining a Jordan canonical basis for V.
(2) Determine the Jordan canonical form for T without determining a Jordan canonical basis for V.
(3) Determine a Jordan canonical basis for V.
Problem 7.23. Let A be the complex matrix

[  2   0   0   0   0   0 ]
[  1   2   0   0   0   0 ]
[ −1   0   2   0   0   0 ]
[  0   1   0   2   0   0 ]
[  0   0   0   0   2   0 ]
[  0   0   0   0   1  −1 ] .
Compute the Jordan canonical form for A.

Problem 7.24. If A is a complex 5 × 5 matrix with characteristic polynomial cA(x) = (x − 2)^3(x + 7)^2 and minimal polynomial mA(x) = (x − 2)^2(x + 7), what is the Jordan canonical form for A?

Problem 7.25. How many possible Jordan canonical forms are there for a 6 × 6 complex matrix with characteristic polynomial (x + 2)^4(x − 1)^2?

Problem 7.26. Suppose that T ∈ L(V). Let cT(x) = x^{10} and N(T) = R(T). Determine the Jordan canonical form of T.

Problem 7.27. Determine the Jordan canonical form of the linear transformation T ∈ L(C^n) defined by

T(ei) = Σ_{k=1}^{n} ek    for all i = 1, · · · , n,
where {e1, · · · , en} is the standard basis for C^n.

Problem 7.28. Let F = C, let T ∈ L(V), and let W be a T-invariant subspace of V. Let T|W ∈ L(W) denote the restriction of T to W.
(1) Prove that m_{T|W}(x) divides mT(x) in C[x].
(2) Suppose that T is diagonalizable. Prove that T|W is also diagonalizable.

Problem 7.29. Let A ∈ Mn(F) be such that A^k = 0 for some positive integer k. Prove that A^n = 0.

Problem 7.30. Classify up to similarity all 3 × 3 complex matrices A such that A^3 = I.

Problem 7.31. Classify up to similarity all n × n complex matrices A such that A^n = I.

Problem 7.32. Suppose A, B ∈ Mn(C) have the same characteristic and minimal polynomials. Can we conclude that A and B are similar (a) if n = 3? (b) if n = 4?

Problem 7.33. Describe, up to similarity, all 3 × 3 complex matrices A such that A^2 + 2A − 3I = 0.
Problem 7.34. Let V be a finite dimensional vector space, and let T ∈ L(V). If T^2 = T, prove that there exists a basis B for V such that for every v ∈ B we have either T(v) = 0 or T(v) = v.

Problem 7.35. Let V be an n-dimensional R-vector space and let T ∈ L(V) have minimal polynomial x^2 + 1. Show that n must be even.
8. Inner Product Spaces

Definition 8.1. Let V be an n-dimensional inner product space over F. Then B = {b1, · · · , bn} is called an orthogonal basis for V if ⟨bi, bj⟩ = 0 whenever i ≠ j. Moreover, B = {b1, · · · , bn} is called an orthonormal basis for V if in addition ⟨bi, bi⟩ = 1 for each i.

Theorem 8.2 (Gram-Schmidt Orthogonalization Process). Let V be a finite dimensional inner product space over F. Then V has an orthogonal (indeed, an orthonormal) basis.

Remark 8.3 (Gram-Schmidt orthogonalization process). Let V be a finite dimensional inner product space over F, and let B = {b1, · · · , bn} be a basis for V. Define

ak = bk − Σ_{j=1}^{k−1} (⟨bk, aj⟩ / ‖aj‖^2) aj    for 1 ≤ k ≤ n

(so a1 = b1). Then {a1, · · · , an} is an orthogonal basis for V. Hence {a1/‖a1‖, · · · , an/‖an‖} is an orthonormal basis for V, where ‖α‖^2 = ⟨α, α⟩.

Example 8.4. Let {(1, 1, 0), (2, 0, 1), (2, 2, 1)} be a basis for R^3. Then {(1, 1, 0), (1, −1, 1), (−1/3, 1/3, 2/3)} is an orthogonal basis for R^3.

Theorem 8.5. Let V be a finite dimensional inner product space over F, and let B = {b1, · · · , bn} be an orthonormal basis for V. Then for any α ∈ V, we have

α = Σ_{i=1}^{n} ⟨α, bi⟩ bi .
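The process of Remark 8.3 translates directly into code. A short numpy sketch of Gram-Schmidt, run on the basis of Example 8.4:

    import numpy as np

    def gram_schmidt(basis):
        # Orthogonalize: a_k = b_k - sum_{j<k} (<b_k, a_j> / ||a_j||^2) a_j.
        ortho = []
        for b in basis:
            a = b - sum((b @ a_j) / (a_j @ a_j) * a_j for a_j in ortho)
            ortho.append(a)
        return ortho

    basis = [np.array(v, dtype=float) for v in [(1, 1, 0), (2, 0, 1), (2, 2, 1)]]
    a1, a2, a3 = gram_schmidt(basis)
    print(a1, a2, a3)   # (1, 1, 0), (1, -1, 1), (-1/3, 1/3, 2/3), as in Example 8.4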
Corollary 8.6. Let V be a finite dimensional inner product space over F, let B = {b1, · · · , bn} be an orthonormal basis for V, and let T ∈ L(V). Then for each j = 1, · · · , n,

T(bj) = Σ_{i=1}^{n} ⟨T(bj), bi⟩ bi .

In particular, the (i, j) entry of [T]B is ⟨T(bj), bi⟩.
Theorem 8.7. Let V be a finite dimensional inner product space over F and let g ∈ L(V, F). Then there exists a unique y ∈ V such that g(x) = ⟨x, y⟩ for all x ∈ V.

Theorem 8.8. Let V be a finite dimensional inner product space over F and let T ∈ L(V). Then there exists a unique T∗ ∈ L(V) such that ⟨T(x), y⟩ = ⟨x, T∗(y)⟩ for all x, y ∈ V. We call T∗ the adjoint of T.
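Theorem 8.8 guarantees that the adjoint exists; in the standard orthonormal basis it is represented by the conjugate transpose (see the note and Theorem 8.9 below). A quick numpy check, assuming the standard inner product ⟨u, v⟩ = Σ ui v̄i on C^3 (linear in the first slot):

    import numpy as np

    rng = np.random.default_rng(0)
    A = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))
    x = rng.standard_normal(3) + 1j * rng.standard_normal(3)
    y = rng.standard_normal(3) + 1j * rng.standard_normal(3)

    inner = lambda u, v: np.vdot(v, u)   # <u, v> = sum u_i * conj(v_i); vdot conjugates its first argument
    A_star = A.conj().T                  # candidate for [T*] in the standard orthonormal basis

    # <T(x), y> = <x, T*(y)> for all x, y.
    assert np.isclose(inner(A @ x, y), inner(x, A_star @ y))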
Note that for A ∈ Mn(F), A∗ = (Ā)^t, the conjugate transpose of A.

Theorem 8.9. Let V be a finite dimensional inner product space over F, let T ∈ L(V), and let B be an orthonormal basis for V. Then [T∗]B = ([T]B)∗. Note that this is false if B is not an orthonormal basis for V.

Remark 8.10. Let W be a subspace of a vector space V, and let T be a linear transformation from V to itself. Also, let A, B ∈ Mn(F) and k ∈ F.
(1) If W is T-invariant, then W⊥ is T∗-invariant.
(2) T∗∗ = T.
(3) N(T∗T) = N(T); hence rank(T∗T) = rank(T).
(4) rank(T) = rank(T∗); hence rank(TT∗) = rank(T).
(5) (kA)∗ = k̄A∗.
(6) (AB)∗ = B∗A∗.
(7) (A + B)∗ = A∗ + B∗.
(8) det(A∗) is the complex conjugate of det(A).
(9) For any A ∈ Mn(F), rank(A∗A) = rank(AA∗) = rank(A).

Theorem 8.11 (Schur's Theorem). Let V be a finite-dimensional inner product space over a field F and let T ∈ L(V). Assume cT(x) splits over F. Then there exists an orthonormal basis B for V such that [T]B is upper triangular.

Proof. We proceed by induction on dim V. If dim V = 1, we are done. Assume the result holds for inner product spaces of dimension less than n. Recall that there is T∗ ∈ L(V) such that (1) [T∗]γ = ([T]γ)∗ for any orthonormal basis γ; (2) if W is T-invariant, then W⊥ is T∗-invariant; (3) T∗∗ = T. We claim that T∗ has an eigenvector. Let γ be an orthonormal basis for V. Then

cT∗(x) = det([T∗]γ − xIn) = det(([T]γ − x̄In)∗) = \overline{det([T]γ − x̄In)} .

Since cT(x) splits over F, so does cT∗(x): if cT(x) = (x − λ1) · · · (x − λn), then cT∗(x) = (x − λ̄1) · · · (x − λ̄n). Let λ be an eigenvalue of T∗. Then there exists z ≠ 0 such that T∗(z) = λz. Set U = span(z), so that dim U = 1 and dim U⊥ = n − 1; that is, V = U ⊕ U⊥. Since T∗(z) = λz, U is T∗-invariant. This implies that U⊥ is T∗∗-invariant, i.e., T-invariant. Since c_{T_{U⊥}}(x) divides cT(x), c_{T_{U⊥}}(x) splits over F. By induction, there is an orthonormal basis γ′ for U⊥ such that [T_{U⊥}]γ′ is upper triangular. Let B = γ′ ∪ {z/‖z‖}. Then B is orthonormal and

[T]B =
[ (n − 1) × (n − 1) upper triangular   ∗ ]
[            0 · · · 0                 ∗ ] .
Thus [T ]B is upper triangular.
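Schur's theorem underlies the numerically stable Schur decomposition, which scipy exposes directly. A short sketch (output='complex' forces the triangular form even when cT(x) does not split over R):

    import numpy as np
    from scipy.linalg import schur

    A = np.array([[0., -1.],
                  [1.,  0.]])                 # rotation; eigenvalues +-i, so c_A does not split over R
    T, Z = schur(A, output='complex')         # A = Z T Z*, with Z unitary and T upper triangular
    assert np.allclose(A, Z @ T @ Z.conj().T)
    assert np.allclose(Z.conj().T @ Z, np.eye(2))   # the columns of Z form an orthonormal basis
    print(np.diag(T))                               # the eigenvalues appear on the diagonal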
Remark 8.12. If in addition T = T∗, then [T]B = [T∗]B = ([T]B)∗ = \overline{([T]B)^t}. Hence [T]B is diagonal and its diagonal entries are real, i.e., T is diagonalizable with real eigenvalues.

Definition 8.13. Let V be an inner product space and let T ∈ L(V).
(1) T is called self-adjoint if T = T∗.
(2) T is called normal if TT∗ = T∗T.
(3) T is called unitary if ⟨T(x), T(x)⟩ = ⟨x, x⟩ for all x ∈ V and F = C. In particular, T is called orthogonal if ⟨T(x), T(x)⟩ = ⟨x, x⟩ for all x ∈ V and F = R.

Example 8.14. Let A ∈ Mn(F) be a skew-symmetric matrix. Then AA^t and A^tA are normal.

Note that if T is self-adjoint then it is normal, but the converse is not true.

Theorem 8.15. Let V be an inner product space over a field F, and let T be a normal linear transformation on V. Then the following are true:
(1) ⟨T(x), T(x)⟩ = ⟨T∗(x), T∗(x)⟩ for all x ∈ V.
(2) T − cI is a normal linear transformation for all c ∈ F.
(3) If T(α) = λα, then T∗(α) = λ̄α.
(4) If T(α1) = λ1α1 and T(α2) = λ2α2 with λ1 ≠ λ2, then ⟨α1, α2⟩ = 0.

Theorem 8.16. Let V be a finite dimensional inner product space over F, let T ∈ L(V), and suppose cT(x) splits over F. Then V has an orthonormal basis of eigenvectors of T if and only if either (1) F = C and T is normal, or (2) F = R and T is self-adjoint.

Proof. By Schur's Theorem 8.11, there is an orthonormal basis B for V such that [T]B is upper triangular. First let F = C. Suppose that B consists of eigenvectors of T. Then [T]B is the diagonal matrix diag(λ1, · · · , λn), where λ1, · · · , λn are the eigenvalues of T. Hence

[T]B ([T]B)∗ = diag(λ1λ̄1, · · · , λnλ̄n) = ([T]B)∗ [T]B ,

i.e., T is normal. Conversely, we argue inductively. Since T is normal, we have [T]B ([T]B)∗ = ([T]B)∗ [T]B,
that is, if the first row of the upper triangular matrix [T]B is (z1, z2, · · · , zn) with all zj ∈ C, then comparing the (1, 1) entries of the two products gives

z1z̄1 + z2z̄2 + · · · + znz̄n = z1z̄1 .

Since for z = x + yi with x, y ∈ R we have zz̄ = x^2 + y^2, the equation zz̄ = 0 forces x = y = 0. Hence z2 = · · · = zn = 0, and [T]B simplifies to a matrix whose first row is (z1, 0, · · · , 0) and whose second row is (0, w2, · · · , wn), where z1 and all the wj ∈ C. Continuing in this way, we obtain a diagonal matrix

[T]B = diag(z1, w2, · · · ) ,

as desired. Now let F = R. Then T is self-adjoint if and only if [T]B = [T∗]B = ([T]B)∗ = ([T]B)^t, if and only if [T]B is a diagonal matrix, if and only if B is an orthonormal basis of eigenvectors of T.

Corollary 8.17. If T is self-adjoint, then every eigenvalue of T is real.

Interestingly, when F = C and T is normal, Theorem 8.16 does not extend to infinite-dimensional complex inner product spaces.

Example 8.18. Let H be the inner product space of continuous complex-valued functions on the interval [0, 2π] with the inner product

⟨f, g⟩ = (1/2π) ∫_0^{2π} f(t) \overline{g(t)} dt .

Consider the set S = {fj}, where fj(t) = e^{ijt} for each integer j and i = √−1 (recall that e^{ijt} = cos jt + i sin jt). It is easy to check that S is an orthonormal, and hence linearly independent, subset of H. Now let V = span(S), and let T, U ∈ L(V) be defined by T(f) = f1 · f and U(f) = f−1 · f. For every integer k we have T(fk) = fk+1 and U(fk) = fk−1. Then

⟨T(fi), fj⟩ = ⟨fi+1, fj⟩ = δ_{(i+1),j} = δ_{i,(j−1)} = ⟨fi, fj−1⟩ = ⟨fi, U(fj)⟩ .

Hence U = T∗. It follows that TT∗ = I = T∗T, that is, T is normal. Now we show that T has no eigenvectors. Suppose that f is an eigenvector of T, so T(f) = λf for some λ. Since V = span(S), we may write
f = Σ_{j=n}^{m} aj fj, where am ≠ 0. Applying T to both sides of the preceding equation, we obtain

Σ_{j=n}^{m} aj fj+1 = Σ_{j=n}^{m} λ aj fj .
Since am ≠ 0, this writes fm+1 as a linear combination of fn, fn+1, · · · , fm. But this is a contradiction because S is linearly independent.

The following statements are equivalent to the definition of a unitary or orthogonal linear transformation.

Theorem 8.19. Let T be a linear transformation on a finite-dimensional inner product space V. Then the following are equivalent.
(1) TT∗ = I = T∗T.
(2) ⟨T(x), T(y)⟩ = ⟨x, y⟩ for all x, y ∈ V.
(3) If B is an orthonormal basis for V, then T(B) is an orthonormal basis for V.
(4) There exists an orthonormal basis B for V such that T(B) is an orthonormal basis for V.
(5) ⟨T(x), T(x)⟩ = ⟨x, x⟩ for all x ∈ V.

Orthogonal Projection and Spectral Theorem. Recall that if V = W1 ⊕ W2, then a linear transformation T on V is the projection on W1 along W2 if, whenever x = x1 + x2 with x1 ∈ W1 and x2 ∈ W2, we have T(x) = x1. Then R(T) = W1 = {x ∈ V | T(x) = x} and N(T) = W2, so V = R(T) ⊕ N(T). Note that T is a projection if and only if T^2 = T.

Definition 8.20. Let V be an inner product space, and let T : V → V be a projection. We say that T is an orthogonal projection if R(T)⊥ = N(T) and N(T)⊥ = R(T).

Theorem 8.21. Let V be an inner product space, and let T : V → V be a linear transformation. Then T is an orthogonal projection if and only if T has an adjoint T∗ and T^2 = T = T∗.

Proof. Suppose T is an orthogonal projection. Since T is a projection, we have T^2 = T, so it is enough to show that T∗ exists and T = T∗. For all x = x1 + x2 and y = y1 + y2 in V = R(T) ⊕ N(T), we have

⟨x, T(y)⟩ = ⟨x1 + x2, T(y1)⟩ = ⟨x1, T(y1)⟩ + ⟨x2, T(y1)⟩ = ⟨x1, y1⟩ ,
⟨T(x), y⟩ = ⟨T(x1), y1 + y2⟩ = ⟨T(x1), y1⟩ + ⟨T(x1), y2⟩ = ⟨x1, y1⟩ ,

since T(y1) = y1, T(x1) = x1, and the cross terms vanish by orthogonality. So ⟨x, T(y)⟩ = ⟨T(x), y⟩ for all x, y ∈ V. Hence by Theorem 8.8, T∗ exists and T = T∗. Conversely, suppose that T^2 = T = T∗. Then T is a
projection, and hence we need only show that R(T) = N(T)⊥ and R(T)⊥ = N(T); we leave this as an exercise.

Let V be a finite-dimensional inner product space, W a subspace of V, and T the orthogonal projection on W. We may choose an orthonormal basis B = {b1, · · · , bn} for V such that {b1, · · · , bk} is a basis for W. Then

[T]B =
[ Ik  O ]
[ O   O ] ,

where each O denotes a zero block.

Theorem 8.22 (The Spectral Theorem). Suppose that T is a linear transformation on a finite-dimensional inner product space V over F with distinct eigenvalues λ1, · · · , λk. Assume that T is normal if F = C and that T is self-adjoint if F = R. For each i = 1, · · · , k, let Wi be the eigenspace of T corresponding to the eigenvalue λi, and let Ti be the orthogonal projection on Wi. Then the following are true.
(1) V = W1 ⊕ · · · ⊕ Wk.
(2) If Wi′ denotes the direct sum of the subspaces Wj (j ≠ i), then Wi⊥ = Wi′.
(3) TiTj = δij Ti for 1 ≤ i, j ≤ k.
(4) I = T1 + · · · + Tk.
(5) T = λ1T1 + · · · + λkTk.

Note that the set {λ1, · · · , λk} of eigenvalues of T is called the spectrum of T, the sum I = T1 + · · · + Tk is called the resolution of the identity operator induced by T, and the sum T = λ1T1 + · · · + λkTk is called the spectral decomposition of T.

Remark 8.23. Let T ∈ L(V) be self-adjoint over R, and let B be an orthonormal basis for V. Then [T]B is a real symmetric matrix; moreover, every eigenvalue of [T]B is real and cT(x) splits over R.

We now list several interesting consequences of the spectral theorem.

Corollary 8.24. Suppose that T is a linear transformation on a finite-dimensional inner product space V over F.
(1) Let F = C. Then T is normal if and only if T∗ = g(T) for some polynomial g.
(2) Let F = C. Then T is unitary if and only if T is normal and |λ| = 1 for every eigenvalue λ of T.
(3) Let F = C and let T be normal. Then T is self-adjoint if and only if every eigenvalue of T is real.
(4) Let T be as in the spectral theorem with spectral decomposition T = λ1T1 + · · · + λkTk. Then each Tj is a polynomial in T.
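Numerically, the spectral decomposition of a self-adjoint matrix can be recovered from an orthonormal eigenbasis. A minimal numpy sketch for an arbitrarily chosen real symmetric 2 × 2 matrix (its eigenvalues 1 and 3 are distinct, so each Ti is a rank-one projection):

    import numpy as np

    A = np.array([[2., 1.],
                  [1., 2.]])                  # self-adjoint, eigenvalues 1 and 3
    eigvals, Q = np.linalg.eigh(A)            # columns of Q: an orthonormal basis of eigenvectors

    # Orthogonal projections T_i onto the eigenspaces W_i.
    T_list = [np.outer(Q[:, i], Q[:, i]) for i in range(2)]

    assert np.allclose(sum(T_list), np.eye(2))                           # I = T_1 + T_2
    assert np.allclose(sum(l * P for l, P in zip(eigvals, T_list)), A)   # A = lambda_1 T_1 + lambda_2 T_2
    assert np.allclose(T_list[0] @ T_list[1], np.zeros((2, 2)))          # T_i T_j = delta_ij T_i

Problem Set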
Problem 8.25. Suppose that ⟨·, ·⟩1 and ⟨·, ·⟩2 are two inner products on a vector space V. Prove that ⟨·, ·⟩ = ⟨·, ·⟩1 + ⟨·, ·⟩2 is another inner product on V.

Problem 8.26. Let B be a basis for a finite-dimensional inner product space. Prove that if ⟨x, y⟩ = 0 for all x ∈ B, then y = 0.

Problem 8.27. Let T be a self-adjoint linear transformation in L(V), where V is a finite-dimensional inner product space. Show that if ⟨x, T(x)⟩ = 0 for all x ∈ V, then T = 0, the zero linear transformation.

Problem 8.28. Let A ∈ Mn(R) be a symmetric matrix. Prove that A is similar to a diagonal matrix.

Problem 8.29. Let A ∈ Mn(R) be symmetric (or A ∈ Mn(C) be normal). Prove the following:

tr(A) = Σ_{i=1}^{n} λi ,   tr(A∗A) = Σ_{i=1}^{n} |λi|^2 ,   and   det(A) = Π_{i=1}^{n} λi ,

where the λi are the (not necessarily distinct) eigenvalues of A.

Problem 8.30. Let V be a finite-dimensional complex inner product space, and let T ∈ L(V). Use the spectral decomposition T = λ1T1 + · · · + λkTk of T to prove the following statements:
(1) If g is a polynomial, then

g(T) = Σ_{i=1}^{k} g(λi) Ti .
(2) If T^n = 0 for some n, then T = 0, the zero linear transformation.
(3) Let U ∈ L(V). Then U commutes with T if and only if U commutes with each Ti.
(4) There exists a normal U ∈ L(V) such that U^2 = T.
(5) T is invertible if and only if λi ≠ 0 for all 1 ≤ i ≤ k.
(6) T is a projection if and only if every eigenvalue of T is 0 or 1.
(7) T = −T∗ if and only if every λi is purely imaginary.