University of New South Wales
MATH 1131 Mathematics 1A — Algebra, Semester 1, 2016
Dr Bill Ellis ([email protected])
February 23, 2016
Original notes by Mr Peter Brown, modified by Dr Bill Ellis.
Chapter 1 - Introduction to Vectors

1.1 Vector Quantities

You may have already met the notion of a vector in physics. There you will have thought of it as an arrow used to represent a force, and adding forces corresponded to 'adding arrows'. In this topic we are going to look at vectors from a geometric viewpoint, although we will include some examples based on simple ideas from physics. One of the most powerful developments in mathematics came from the simple idea of the co-ordinate plane; indeed, 2-dimensional co-ordinate geometry was crucial in the development of the calculus. The obvious question is how we can generalise this to higher dimensions. Vectors give us a way of generalising co-ordinate geometry to higher dimensions in a very straightforward manner.
Definition: A vector is a directed line segment which represents a displacement from one point P to another point Q. The word vector comes from the Latin veho (cf. vehicle), meaning to carry. We represent a vector either by the notation PQ (with an arrow over it) or by a single letter such as v. In the Algebra Course Notes (yellow book), and in these notes, vectors are represented using bold letters, v. When handwriting, you should represent vectors by placing a tilde under the letter, viz ~v. This is important, because you will need to carefully distinguish between vectors, scalars and, later, matrices.
[Diagram: points P, Q, R, S with v = PQ and w = SR drawn as parallel directed segments of equal length.]
A vector has both direction and length (or magnitude). Two vectors are equal if they have the same direction and the same magnitude; hence in the diagram v = w. We will denote the length (magnitude) of the vector v by |v|. (In textbooks you will also see ‖v‖.) Two vectors are parallel if they have the same direction (not necessarily the same length). We choose a fixed point O, in whatever dimensional space we happen to be in, and call this the origin. The position vector of a point, in any number of dimensions, is the vector from the origin to that point.

[Diagram: origin O and point P, with the vector OP drawn from O to P.]

Hence the vector OP in the diagram is called the position vector of the point P. A position vector gives the position of a point in space, whereas a direction vector is simply a vector having direction and magnitude (length).

Addition of vectors: There are two different (and equally important) ways to add two vectors geometrically. If we think of force vectors, the obvious way to add two vectors is to put them tip to tail and join the tail of the first to the tip of the second, as in the diagram.
[Diagram: v and w placed tip to tail, with v + w completing the triangle.]

To add the vectors v and w, we move w and then complete the triangle. This method of addition is known as the triangle law of addition.
You can see from this that one could obtain the same vector by forming a parallelogram from the two vectors and taking the diagonal (often called the resultant) as the sum of the two vectors.

[Diagram: parallelogram with sides v and w and diagonal v + w.]

This method is known as the parallelogram law. Subtraction of vectors is performed in a similar way:

[Diagram: vectors v and w drawn from a common point, with w − v joining the tip of v to the tip of w.]
To check this makes sense, add the vectors that are tip to tail: v + (w − v) = w, as expected. Observe that the vector labelled w − v is not a position vector. Thus if P and Q have position vectors v and w respectively, then PQ = w − v. In general,

    PQ = OQ − OP.

[Diagram: origin O, points P and Q with position vectors v and w, and the vector w − v from P to Q.]
The Triangle Inequality: Let us restrict ourselves, for the moment, to the plane. Since the length of any side of a triangle cannot exceed the sum of the lengths of the other two sides, we can write |u + v| ≤ |u| + |v| for any vectors u and v.
[Diagram: triangle with sides u, v and u + v.]

Example: When do we have equality in the Triangle Inequality?
Scalar Multiplication: We can multiply a vector by a scalar λ (generally just a real number). Geometrically, this stretches the vector if λ > 1, shrinks it if 0 < λ < 1, and reverses its direction (as well as rescaling) if λ < 0.

[Diagram: u and λu drawn as parallel vectors.]
Commutative and Associative Laws: The commutative law of vector addition states that a + b = b + a. Geometrically this is obvious:

[Diagram: parallelogram showing a + b = b + a.]

The associative law of vector addition states that a + (b + c) = (a + b) + c. Again the following geometric proof will suffice.

[Diagram: chained triangles showing a + (b + c) = (a + b) + c.]
Example: Construct a vector equation to determine the midpoint of the line joining the points P and Q.
1.2 Vector Quantities and Rn

Thus far, much of what we have done works in any number of dimensions. We are now going to define n-dimensional space and introduce a co-ordinate system in which to place our vectors. We take an n-tuple (a1, a2, ..., an) of real numbers and think of each ai as lying on an axis xi. In 2 and 3 dimensions, we identify these axes as the XY and XYZ axes respectively, which are mutually orthogonal. The set of all such n-tuples will be called Rn.
We say that the vector PQ in Rn has co-ordinates (a1, a2, ..., an) if we must move a1 units along the x1 axis, a2 units along the x2 axis, and so on, when moving from P to Q. Hence an n-tuple (a1, a2, ..., an) in Rn will be interpreted as the position vector of a point P in Rn. For example, in R3, the point P in the diagram has position vector (2, 3, 1).

[Diagram: origin O = (0, 0, 0) and the point P = (2, 3, 1) in R3.]
We can then define the addition of two vectors (algebraically) in Rn by

    (a1, a2, ..., an) + (b1, b2, ..., bn) = (a1 + b1, a2 + b2, ..., an + bn)

and multiplication by a scalar λ to be

    λ(a1, a2, ..., an) = (λa1, λa2, ..., λan).
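The componentwise definitions can be sketched in a few lines of Python. (This is an illustration only; Python and the function names here are my own and are not part of the notes.)

```python
# Componentwise addition and scalar multiplication in R^n,
# mirroring the algebraic definitions above.

def vec_add(a, b):
    """(a1,...,an) + (b1,...,bn) = (a1+b1,...,an+bn)."""
    assert len(a) == len(b), "vectors must live in the same R^n"
    return [ai + bi for ai, bi in zip(a, b)]

def vec_scale(lam, a):
    """lam * (a1,...,an) = (lam*a1,...,lam*an)."""
    return [lam * ai for ai in a]

print(vec_add([1, 2, 3], [4, 5, 6]))   # [5, 7, 9]
print(vec_scale(2, [1, -1, 3]))        # [2, -2, 6]
```

Note that the commutative law proved later follows immediately from the commutativity of addition of real numbers in each component.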
Multiplying a vector by a scalar λ merely stretches the vector (if λ > 1) or shrinks it (if 0 ≤ λ < 1). If λ is negative then the vector reverses direction. (Note: the algebraic definition of addition agrees with the geometric definition.)
[Diagram: v, 2v and −v drawn from 0 as parallel vectors.]
Note that we can now prove such rules as the commutative law algebraically, viz:

    a + b = (a1, ..., an) + (b1, ..., bn)
          = (a1 + b1, ..., an + bn)
          = (b1 + a1, ..., bn + an)
          = (b1, ..., bn) + (a1, ..., an)
          = b + a.
Two vectors are defined to be parallel if one is a non-zero multiple of the other. That is, v is parallel to w if v = λw for some scalar λ ≠ 0.
Example: (1, 2, 3) is parallel to (−2, −4, −6).

Example: Find the vectors PQ and QP if P = (7, −1, 3) and Q = (2, 1, −3).
Example: Suppose that A = (0, 0), B = (1, 4), C = (3, 5) and D = (2, 1) are the position vectors for four points A, B, C, D. Prove that the quadrilateral ABCD is a parallelogram.
1.3 Rn and Analytic Geometry

Basis Vectors: The standard basis vectors in R2 are the vectors (1, 0) and (0, 1), which are often denoted by i and j. Observe that every vector in R2 can be written in terms of these basis vectors. For example, (1, −2) can be written as i − 2j. In 3 dimensions, the basis vectors i, j, k are (1, 0, 0), (0, 1, 0), (0, 0, 1). Note that every vector in R3 can be expressed uniquely in terms of these basis vectors, viz: (a1, a2, a3) can be expressed as a1 i + a2 j + a3 k.
In higher dimensions, we label the basis vectors as e1, e2, e3, ... and so on. Thus, in R4, we have

    e1 = (1, 0, 0, 0), e2 = (0, 1, 0, 0), e3 = (0, 0, 1, 0), e4 = (0, 0, 0, 1).

Once again, we can represent any vector in Rn uniquely in terms of the standard basis vectors in Rn.
Distances and Lengths: Given a vector x = (a, b) in R2, we can use Pythagoras' Theorem to compute the length of this vector as √(a² + b²). We use the notation

    |x| = √(a² + b²).

In R3, given a vector x = (a, b, c), we can see from the diagram that the foot of the perpendicular from the point to the XY-plane lies at distance √(a² + b²) from O, and a second application of Pythagoras then gives

    |x| = √(a² + b² + c²).

[Diagram: the point (a, b, c) in R3, with the two right triangles used in the two applications of Pythagoras.]
In higher dimensions, we can define the length of a vector by generalising this formula, i.e.

Definition: A vector x = (a1, a2, ..., an) in Rn has length |x| given by

    |x| = √(a1² + a2² + ··· + an²).

Example: Find the lengths of the vectors a = (1, −1, 2, 3) and b = (3, 2, 4).
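The length formula translates directly into code. (A sketch, not part of the notes; the entries of the example's vectors are hard to recover from the source, so the values below are my reading of them and should be treated as illustrative.)

```python
import math

# |x| = sqrt(a1^2 + a2^2 + ... + an^2), valid in any R^n.
def length(x):
    return math.sqrt(sum(xi * xi for xi in x))

a = [1, -1, 2, 3]   # |a| = sqrt(15)
b = [3, 2, 4]       # |b| = sqrt(29)
print(length(a), length(b))
```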
The distance between two points A and B in Rn will be defined as the length of the vector AB; in other words, if A has position vector a = (a1, a2, ..., an) and B has position vector b = (b1, b2, ..., bn), then the length of AB is

    |AB| = |b − a| = √((b1 − a1)² + ··· + (bn − an)²).

Note: |AB| = |BA|.
Example: Find the distance between (1, −2) and (−1, 3), and between (3, 6, 5) and (2, −4, −1).
The 'length function' | · | (sometimes called a norm) has the following properties:

(i) |a| ≥ 0.
(ii) |a| = 0 if and only if a = 0.
(iii) |λa| = |λ||a|, for λ ∈ R.

A vector which has unit length is called a unit vector. Any vector can be made into a unit vector by dividing by its length. The vector v̂ = v/|v| is the unit vector of v.

Example: Find a unit vector parallel to the vector (2, −3, 1).
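Normalising a vector is a one-liner once we have the length function. (An illustrative sketch, not part of the notes.)

```python
import math

def unit(v):
    """Return v / |v|, the unit vector in the direction of v (v non-zero)."""
    n = math.sqrt(sum(x * x for x in v))
    if n == 0:
        raise ValueError("the zero vector has no direction")
    return [x / n for x in v]

# The example's vector: a unit vector parallel to (2, -3, 1).
u = unit([2, -3, 1])
```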
Example: Suppose A and B are points with position vectors a and b respectively. Find a vector (in terms of a and b) which bisects the angle AOB, where O is the origin.
Example: Are the points A(1, 2, 3), B(3, 1, 1) and C(7, −1, −3) collinear?
Example: If A(−1, 3, 4), B(4, 6, 3), C(−1, 2, 1) and D are the vertices of a parallelogram, determine all the possible co-ordinates for the point D.
1.4 Lines

We seek to find the equation of a line in vector form. The vector equation of a line is a formula which gives the position vector x of every point on that line. This equation is sometimes referred to as the parametric vector form of the line; I will generally just say 'vector equation' of the line. Suppose we have a line passing through the origin which contains a vector u in R2. Every point on that line will have a position vector which is a multiple of u. Conversely, every multiple of u will correspond to the position vector of a point on that line. Hence the equation of the line can be written as x = λu, where λ is any real number.
[Diagram: the line x = λu through the origin in the direction of u.]

For example, if u were the vector (2, 3), then the equation of the line through the origin containing u would be x = λ(2, 3).
Another way of denoting the set of all real multiples of a given vector is to call it the span of the vector. Thus we could write the set of all λu as span(u). This idea of span is extremely important; it will be developed more formally in MATH 1231.

Example: In R2, what is the span of (1, 0)? What is span((1, 1))?
The advantage of this definition of the equation of a line is that it easily generalises to any number of dimensions. For example, the equation of the line in R3 which passes through the origin and is parallel to the vector (−1, 3, −6) is simply x = λ(−1, 3, −6). If the line does not pass through the origin, then we proceed as follows. To find the vector equation of a line we need to know two things:

(i) the position vector a of a point on the line;
(ii) the direction of the line, i.e., a vector b parallel to the line.
[Diagram: the line x = a + λb, with the point A on the line having position vector a, direction vector b, and a general point X with position vector x.]

The span of b gives a line through the origin parallel to b, and adding a shifts the line to its proper position. Thus we can find the position vector x of any point X on the line by going from the origin along the vector a and then moving along the line, adding some multiple of b, until we reach X. The vector OX is given by OX = OA + AX, and AX is some multiple of b. Hence the equation of the line is x = a + λb.
Example: Find the vector equation of the line passing through the point P with position vector (2, −3, 5) and parallel to the vector (1, 6, −4).
Example: Find the vector equation of the line passing through the two points P, Q with position vectors p = (4, 2, 3) and q = (−1, −2, 6).
Example: Find the vector equation of the line in 2-dimensions with Cartesian equation y = 2x + 1.
Example: Does the point (−1, 2, 3) lie on the line x = (2, 4, 4) + λ(3, 1, 11)?
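Checking whether a point lies on a line amounts to solving for λ from one component and verifying the rest, and that procedure is easy to automate. (A sketch, not from the notes; the example's numbers are difficult to read in the source, so the test values below are my reading of them.)

```python
def on_line(p, a, d, tol=1e-9):
    """Does the point p lie on the line x = a + lam*d?
    Solve for lam from the first component with d_i != 0,
    then check every component."""
    for pi, ai, di in zip(p, a, d):
        if abs(di) > tol:
            lam = (pi - ai) / di
            break
    else:
        raise ValueError("d must be a non-zero direction vector")
    return all(abs(ai + lam * di - pi) <= tol
               for pi, ai, di in zip(p, a, d))

print(on_line([3, 4, 5], [1, 2, 3], [1, 1, 1]))    # True (lam = 2)
```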
1.4.2 Lines in R3

In R3, two lines can

• meet at a point,
• be parallel, or
• neither meet nor be parallel.
Cartesian Equations of the Line: Given the vector form of a line in 3 dimensions, we can write down the Cartesian equations (note the plural) as follows. For example, suppose the vector equation is x = (−1, 2, 3) + λ(3, −2, 1). Recall that x is simply an abbreviation for (x1, x2, x3). Hence, equating co-ordinates, we can write

    x1 = −1 + 3λ,  x2 = 2 − 2λ,  x3 = 3 + λ.

Eliminating λ from these equations, we have

    (x1 + 1)/3 = (x2 − 2)/(−2) = (x3 − 3)/1  (= λ).

These are called the Cartesian equations of the line. Clearly this can be done for any such line, and so the general form is

    (x1 − a)/v1 = (x2 − b)/v2 = (x3 − c)/v3  (= λ),

where (a, b, c) is a point on the line and (v1, v2, v3) is a direction vector of the line, provided that none of the numbers v1, v2, v3 is zero. Hence in vector form this would be x = (a, b, c) + λ(v1, v2, v3). The following example tells us what to do when one of the components of the direction vector is zero.

Example: Convert x = (2, 1, −5) + μ(2, 0, −1) into Cartesian form.
Observe that two lines will be parallel if their direction vectors are parallel, and two direction vectors are parallel if and only if one is a non-zero multiple of the other. For example, the lines x = (1, 5, 6) + λ(3, 2, −1) and x = (−2, 4, 7) + λ(6, 4, −2) are parallel, since their direction vectors are (3, 2, −1) and (6, 4, −2) respectively, and the second vector is simply a multiple of the first. Observe also that the equations x = (1, 5, 6) + λ(3, 2, −1) and x = (1, 5, 6) + λ(6, 4, −2) represent the same line, since they are parallel and pass through the same point.
Example: Find the vector equation of the line passing through (1, −2, 5) and parallel to

    (x + 1)/3 = (y − 1)/(−2) = (z + 6)/(−4).
Example: Find the intersection (if possible) of the lines x = (1, 0, −1) + λ(−1, 2, 3) and x = (2, 5, −3) + μ(3, 1, −2).

Two non-parallel lines in space that never meet are called skew lines.
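The intersection procedure — equate the two parametric forms, solve two component equations for λ and μ, then check consistency in the third — can be sketched as follows. (An illustration, not part of the notes; with the example's numbers as I read them from the source, the two lines turn out to be skew, which is presumably why the remark on skew lines follows.)

```python
def intersect_lines(a1, d1, a2, d2, tol=1e-9):
    """Intersection of x = a1 + lam*d1 and x = a2 + mu*d2 in R^3.
    Returns the point of intersection, or None (parallel or skew)."""
    # a1 + lam*d1 = a2 + mu*d2  <=>  lam*d1 - mu*d2 = a2 - a1
    rhs = [a2[i] - a1[i] for i in range(3)]
    for i in range(3):
        for j in range(i + 1, 3):
            # 2x2 system in (lam, mu) from components i and j, by Cramer's rule
            det = d1[i] * (-d2[j]) - (-d2[i]) * d1[j]
            if abs(det) > tol:
                lam = (rhs[i] * (-d2[j]) - (-d2[i]) * rhs[j]) / det
                mu = (d1[i] * rhs[j] - rhs[i] * d1[j]) / det
                p = [a1[k] + lam * d1[k] for k in range(3)]
                q = [a2[k] + mu * d2[k] for k in range(3)]
                if all(abs(p[k] - q[k]) <= tol for k in range(3)):
                    return p
                return None          # consistent in 2 equations, not the 3rd: skew
    return None                      # parallel direction vectors
```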
1.5 Planes

In 3 dimensions and higher, we can construct planes. Suppose we seek the vector equation of the plane passing through the origin parallel to two given non-parallel vectors a and b.

[Diagram: the plane through O spanned by a and b, with P = λa, Q = μb and a general point X = λa + μb.]

To reach any point X with position vector x on the plane, we need to 'stretch' the vector a to P and 'stretch' the vector b to Q in such a way that OX = OP + OQ. Thus, x = λa + μb. Conversely, if we take the vector which results from adding a multiple of a and a multiple of b, then this will be the position vector of a point on the plane.
Example: Find the vector equation of the plane passing through the origin parallel to the vectors (1, 5, −3) and (−2, 4, 7).

The plane generated by two such vectors is called the span of the two vectors. So, for example, the span of (1, 5, −3) and (−2, 4, 7) is simply the set

    {λ(1, 5, −3) + μ(−2, 4, 7) : λ, μ ∈ R}.

This set is also referred to as the set of all linear combinations of the two vectors. Thus, given any two vectors a, b,

    span{a, b} = {λa + μb : λ, μ ∈ R},

and we say that x is a linear combination of a and b if x = λa + μb for some particular λ and μ.

Example: Describe the span of (−1, 2, 3) and (1, 3, 5). Repeat for the span of (1, 2, 3) and (2, 4, 6).
As with the equation of a line, to get the vector equation of a plane not through the origin, we simply shift the plane by adding any position vector of a point which lies on the plane. Thus to obtain the vector equation of a plane we need:
(i) The position vector of a point on the plane.
(ii) Two non-parallel vectors which are parallel to (or lie on) the plane.
Example: Find the vector equation of the plane passing through the point P with position vector (2, −3, 5) and parallel to the vectors (2, 6, 1) and (1, −5, −4).
Example: Find the vector equation of the plane passing through the three points P, Q, R with position vectors p = (4, 2, −1, 7), q = (−1, 3, 2, −2) and r = (−2, 6, 1, 5).
Configurations

In R3, two (distinct) planes can

• be parallel, or
• meet in a line.

In R3, three (distinct) planes can be arranged so that

• the three planes are parallel,
• two planes are parallel and the third plane is parallel to neither of the first two,
• they meet at a single point,
• they meet in a line, or
• none are parallel, but no point lies on all three planes.

In the next chapter we will learn how to analyse these scenarios algebraically.
Regions

Example: The set

    S = {x = λ(1, 2, 0) + μ(−1, 2, 4) : 0 ≤ λ ≤ 1, 0 ≤ μ ≤ 1}

represents the parallelogram in R3 with vertices O, (1, 2, 0), (−1, 2, 4), (0, 4, 4).

Example: The set

    S = {x = λ(−1, 2, 4) + μ(1, 2, 0) : 0 ≤ λ ≤ 1, 0 ≤ μ ≤ λ}

represents the triangle in R3 with vertices O, (−1, 2, 4), (0, 4, 4).
1.5.3 Cartesian Form of a Plane in R3

As with lines, we can find the Cartesian equation of a plane by eliminating the two parameters λ and μ. This is generally quite fiddly to do algebraically; later in this course you will see a much better method, but for the moment we will do it by algebra.

Example: Find the Cartesian equation of the plane x = (1, 2, 3) + λ(2, 4, 0) + μ(−1, 0, 3).

The procedure can be reversed to find the vector equation of a plane from the Cartesian equation.

Example: Find the vector equation form of the plane 3x − 6y + 2z = 12.
Further Examples: Example: Find the intersection of the planes 2x + y − z = 10 and 3x + 4y + 2z = 29.
Example: Show that the line x = λ(11, −5, −3) is parallel to the plane x = (−1, 5, 5) + μ(4, 2, 6) + ν(2, 3, 6).
Example: Find the intersection of x = (−1, 2, 3) + λ(1, 4, −2) and 2x + 3y − z = 29.
Chapter 2 - VECTOR GEOMETRY

In Chapter 1, we looked at the rudiments of vector geometry. We now have enough machinery to study this subject in greater depth.

2.1 Lengths

In higher dimensions, we can define the length of a vector as follows:

Definition: A vector x = (a1, a2, ..., an) in Rn has length |x| given by

    |x| = √(a1² + a2² + ··· + an²).
The distance between two points A and B in Rn will be defined as the length of the vector AB; in other words, if A has position vector a = (a1, a2, ..., an) and B has position vector b = (b1, b2, ..., bn), then the length of AB is

    |AB| = |b − a| = √((b1 − a1)² + ··· + (bn − an)²).
The 'length function' | · | (sometimes called a norm) has the following properties:

1. |a| ≥ 0.
2. |a| = 0 if and only if a = 0.
3. |λa| = |λ||a|, for λ ∈ R.

A vector which has unit length is called a unit vector. Any vector can be made into a unit vector by dividing by its length. The vector v̂ = v/|v| is the unit vector of v.
2.2 Scalar (Dot) Product

Definition: The scalar or dot product of two vectors a and b in Rn is given by

    a · b = a1b1 + a2b2 + ··· + anbn = Σ (k = 1 to n) ak bk.

The scalar product has a geometric interpretation in R2 and R3. Indeed, for non-parallel vectors a and b in R3, we consider the following triangle.
[Diagram: triangle with sides a, b and c = a − b.]

We can write c = a − b, and so taking lengths we have

    |c|² = (a1 − b1)² + (a2 − b2)² + (a3 − b3)²
         = |a|² + |b|² − 2(a1b1 + a2b2 + a3b3)
         = |a|² + |b|² − 2 a · b.

Also, if we write down the cosine rule for this triangle, we have |c|² = |a|² + |b|² − 2|a||b| cos θ, where θ is the angle between the vectors a and b. Comparing the two expressions, we see that

    a1b1 + a2b2 + a3b3 = a · b = |a||b| cos θ.

This formula is fundamental and should be committed to memory.
Notice that if the angle between a and b is π/2 (= 90°), we have

    a · b = a1b1 + a2b2 + a3b3 = 0.

Example: Find the scalar product, and hence the acute angle, between the vectors a and b if a = (2, −1, 1) and b = (1, 1, 2).
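The identity a · b = |a||b| cos θ gives a direct way to compute the angle between two vectors. (A sketch, not part of the notes; the example's vectors below are my reading of the garbled source.)

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def angle(a, b):
    """Angle between a and b, from a.b = |a||b| cos(theta)."""
    na = math.sqrt(dot(a, a))
    nb = math.sqrt(dot(b, b))
    return math.acos(dot(a, b) / (na * nb))

# For a = (2, -1, 1), b = (1, 1, 2): a.b = 3, |a| = |b| = sqrt(6),
# so cos(theta) = 3/6 = 1/2 and theta = pi/3.
theta = angle([2, -1, 1], [1, 1, 2])
```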
2.2.1 Arithmetic Properties of the Scalar Product

The scalar product has the following properties:

1. a · a = |a|², and so |a| = √(a · a).
2. a · b is a scalar (which is why it is called the scalar product).
3. a · b = b · a. (Commutative law.)
4. a · (λb) = λ(a · b).
5. a · (b + c) = a · b + a · c. (Distributive law.)

The proofs of these are straightforward.
Example: Prove that the diagonals of a rhombus are perpendicular.
2.2.2 Geometric Interpretation of the Scalar Product in Rn

In R2 and R3, the equality a · b = |a||b| cos θ immediately yields |a · b| ≤ |a||b|, since |cos θ| ≤ 1 and |a| ≥ 0, |b| ≥ 0. This is known as the Cauchy–Schwarz inequality. It generalises to vectors in Rn, but the proof required there is different.

Theorem 3: (Cauchy–Schwarz inequality.) If a, b ∈ Rn then |a · b| ≤ |a||b|.
Proof: The result is clearly true if either vector is the zero vector, so we may suppose that neither vector is zero. Define the function f(t) = |a − tb|². Observe that f(t) ≥ 0 for all t and that f is simply a quadratic. Using rule (1) above, we may write f(t) as

    f(t) = (a − tb) · (a − tb) = t²|b|² − 2t a · b + |a|²,

and so f has a minimum when t = (a · b)/|b|². Substituting into f, we see that the minimum value of f is |a|² − (a · b)²/|b|². Thus, as f(t) ≥ 0 for all t, we have |a|² − (a · b)²/|b|² ≥ 0, which implies that (a · b)² ≤ |a|²|b|². Taking positive square roots gives the result.
In higher dimensions we can define the angle θ between two vectors a and b by

    cos θ = (a · b)/(|a||b|).

By the Cauchy–Schwarz inequality, the right-hand side lies in [−1, 1], so this definition makes sense. Now we can give a general proof of the triangle inequality for vectors in Rn.

Theorem 4: (The Triangle Inequality.) If a, b ∈ Rn then |a + b| ≤ |a| + |b|.

Proof: Again using rule (1),

    |a + b|² = (a + b) · (a + b) = |a|² + 2 a · b + |b|² ≤ |a|² + 2|a||b| + |b|² = (|a| + |b|)².

Taking positive square roots gives the result.
2.3 Applications: Orthogonality and Projection

2.3.1 Orthogonality of Vectors

Two vectors a and b are said to be orthogonal if a · b = 0. In two and three dimensions, this is the same as saying that the two vectors are perpendicular.

Example: Show that (2, 5, −1) is orthogonal to (3, 1, 11).
Example: Write down a unit vector which is perpendicular to both (2, −6, −3) and (4, 3, −1).
Example: Prove that the altitudes of a triangle are concurrent. Suppose our triangle is ABC, with vertices at position vectors a, b, c. Let P be the intersection of the altitudes AD and BE, and have position vector p. Hence
A set of vectors which are mutually orthogonal is called an orthogonal set. For example, the set of vectors {(1, 1, 1), (1, −1, 0), (1, 1, −2)} is an orthogonal set.

A set of vectors which each have length 1 and are also mutually orthogonal is called an orthonormal set. For example, the basis vectors (1, 0, 0), (0, 1, 0), (0, 0, 1) in R3 form an orthonormal set. The set {(1/√2)(1, 1), (1/√2)(1, −1)} is an orthonormal set in R2.
Example: Suppose that {a1, a2, a3} is an orthonormal set in Rn (with n ≥ 3) and

    b = c1 a1 + c2 a2 + c3 a3.

Find the scalars c1, c2, c3.
2.3.2 Projections

In the diagram, we are given two (non-zero) vectors a and b.

[Diagram: O, A, B with a = OA, b = OB, angle θ at O, and P the foot of the perpendicular from A to OB; OP = proj_b a.]

Drop a perpendicular AP to OB from A. The vector OP is called the projection of a onto b; it is parallel to b. Now for θ acute, |OP| = |a| cos θ = (a · b)/|b|, using the properties of the scalar product. Also OP = |OP| b/|b|, and so, in two- and three-dimensional space,

    proj_b a = ((a · b)/|b|²) b = ((a · b)/(b · b)) b.

In higher dimensions we will take this as our definition of the projection of a onto b.
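The projection formula proj_b a = ((a · b)/(b · b)) b works in any R^n and translates directly into code. (A sketch, not part of the notes.)

```python
def proj(a, b):
    """Projection of a onto b: ((a.b) / (b.b)) * b, for b non-zero."""
    ab = sum(x * y for x, y in zip(a, b))
    bb = sum(y * y for y in b)
    return [ab / bb * y for y in b]

# Projecting (2, -3, 1) onto (-2, 3, 6): a.b = -7, b.b = 49,
# so the projection is (-1/7)(-2, 3, 6) = (2/7, -3/7, -6/7).
p = proj([2, -3, 1], [-2, 3, 6])
```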
Note that (a1, 0, 0) is the projection of (a1, a2, a3) onto the unit vector e1, and so on.

Example: Find the projection of (2, −3, 1) onto (−2, 3, 6).

Example: Find the length of the projection of (2, −3) onto (5, 6).
2.3.3 Distance between a Point and a Line in R3

Example: Find the shortest distance from the point B = (0, 3, 8) to the line x = (1, 2, 3) + λ(1, −1, 4).

[Diagram: O, the point B, the line through A with direction d, and P the foot of the perpendicular from B to the line.]

Let A = (1, 2, 3), and thus a = OA. Similarly b = OB. Also let P be the foot of the perpendicular from B to the line. Then AP is the projection of AB onto the direction vector d = (1, −1, 4) of the line. We could then find the length of PB using the length of AB and Pythagoras. An alternative method is as follows. Notice AB = b − a = AP + PB. Thus

    PB = AB − AP = AB − proj_d AB = b − a − proj_d(b − a) = b − a − (((b − a) · d)/|d|²) d.

Hence

    |PB| = √(|AB|² − |AP|²).

For our example we have |AB| = …, |AP| = …, and hence |PB| = √(|AB|² − |AP|²) = …
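The Pythagoras route above is easy to code. (A sketch, not part of the notes; with the example's data as I read it from the source — B = (0, 3, 8), A = (1, 2, 3), d = (1, −1, 4) — the distance comes out to exactly 3.)

```python
import math

def dist_point_line(b, a, d):
    """Shortest distance from point b to the line x = a + lam*d,
    via |PB| = sqrt(|AB|^2 - |proj_d AB|^2)."""
    ab = [bi - ai for bi, ai in zip(b, a)]
    ab2 = sum(x * x for x in ab)                      # |AB|^2
    ad = sum(x * y for x, y in zip(ab, d))            # AB . d
    d2 = sum(x * x for x in d)                        # |d|^2
    proj2 = ad * ad / d2                              # |proj_d AB|^2
    return math.sqrt(ab2 - proj2)

print(dist_point_line([0, 3, 8], [1, 2, 3], [1, -1, 4]))   # 3.0
```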
2.4 Vector (Cross) Product

There is another way in which we can define vector multiplication. This product only makes sense in R3 and is known as the vector or cross product. Given two (non-zero) vectors a = (a1, a2, a3) and b = (b1, b2, b3), we seek a third vector x = (x1, x2, x3) which is perpendicular to both of these. From the scalar (dot) product we know that such a vector must satisfy each of the equations

    a1x1 + a2x2 + a3x3 = 0,
    b1x1 + b2x2 + b3x3 = 0.

We can solve these to obtain

    x = λ(a2b3 − a3b2, a3b1 − a1b3, a1b2 − a2b1),

where λ ∈ R. We put λ = 1 and define the vector (cross) product of a and b by

    a × b = (a2b3 − a3b2, a3b1 − a1b3, a1b2 − a2b1).
It can be quite difficult to remember, but can be written mnemonically as a determinant (to be studied in detail at the end of the course):

    a × b = det[ i, j, k ; a1, a2, a3 ; b1, b2, b3 ],

where, recall, i = (1, 0, 0), j = (0, 1, 0), k = (0, 0, 1). That is, we expand the determinant and write

    a × b = i det[a2, a3 ; b2, b3] − j det[a1, a3 ; b1, b3] + k det[a1, a2 ; b1, b2].

Each subdeterminant is then evaluated to give the three co-ordinates of the vector product.
Example: Use the determinant formula to find the cross product of (2, −1, 2) and (6, −2, −3).
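The cofactor formula can be coded directly, and the defining property — that the result is perpendicular to both factors — is then easy to verify. (A sketch, not part of the notes; the example's vectors are my reading of the garbled source.)

```python
def cross(a, b):
    """Cross product in R^3: (a2b3 - a3b2, a3b1 - a1b3, a1b2 - a2b1)."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

c = cross([2, -1, 2], [6, -2, -3])
print(c)   # [7, 18, 2]
```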
2.4.1 Arithmetic Properties of Vector Products

Suppose a, b and c are vectors in R3. Then

(i) a × a = 0.
(ii) b × a = −a × b. (Note then that the cross product is NOT commutative: changing the order changes the sign.)
(iii) a × (λb) = λ(a × b) and (λa) × b = λ(a × b) for λ ∈ R.
(iv) From (i) and (iii) we have a × (λa) = 0. In other words, the cross product of parallel vectors is the zero vector.
(v) a × (b + c) = a × b + a × c and (a + b) × c = a × c + b × c. (Distributive law.)
(vi) For the three standard basis vectors in R3 we have
e1 × e2 = e3,
e2 × e3 = e1,
e3 × e1 = e2.
(vii) |a × b| = |a||b| sin θ, where θ is the angle between the two vectors.

All of these may be proven by writing out the co-ordinates. Some of the proofs are given in the Algebra Course Notes (yellow book).

Example: If a, b ≠ 0 and a × b = 0, explain why the vectors must be parallel.
2.4.2 A Geometric Interpretation of the Vector Product

Note in particular result (vii) in the previous section. We can use this to find:
2.4.3 Area of a Parallelogram

Given a parallelogram S in R3, defined by

    S = {λa + μb : 0 ≤ λ ≤ 1, 0 ≤ μ ≤ 1},

the area of S is given by |a||b| sin θ.

[Diagram: parallelogram with sides a and b and angle θ; Area = |a × b|.]

But by (vii) above, this is simply |a × b|.

Example: Find the area of the parallelogram spanned by (−1, 3, 7) and (2, 3, −1).
2.5 Scalar Triple Product and Volume

The scalar triple product of three vectors is defined by

    a · (b × c).

We will see shortly that it also has a geometric interpretation. Note that in the evaluation of the triple product, you must perform the cross product first.

Example: Find the scalar triple product of (5, −1, 1), (−3, 1, 4), (2, −2, 3).
Theorem: For a, b, c in R3 we have

(i) a · (b × c) = (a × b) · c
(ii) a · (b × c) = −a · (c × b)
(iii) a · (a × b) = (a × a) · b = 0
(iv) The scalar triple product can be written as a determinant, viz:

    a · (b × c) = det[ a1, a2, a3 ; b1, b2, b3 ; c1, c2, c3 ].

There is also a vector triple product which arises in physics. The vector triple product of three vectors a, b, c is given by a × (b × c). Note that the cross product is NOT associative, so in general a × (b × c) ≠ (a × b) × c.
2.5.1 Volume of Parallelepipeds

[Diagram: parallelepiped spanned by a, b, c, with normal n to the base; Volume = |a · (b × c)|.]

A parallelepiped is the 3-dimensional analogue of a parallelogram. (The word comes literally from the Greek παραλληλεπίπεδον, meaning that (the top) is parallel to (ἐπί) the base (literally, plane).) Suppose we have three non-coplanar vectors a, b and c, which form the edges of the parallelepiped. The parallelepiped is said to be spanned by these three vectors.
To find the volume, we must multiply the area of the base by the perpendicular height. Let n be a normal to the base given by n = b × c. The perpendicular height is then the length of the projection of a onto n, which is simply

    |proj_n a| = |a · n|/|n| = |a · (b × c)|/|b × c|.

Now the denominator of this expression is simply the area of the base parallelogram, and so the volume is given by the magnitude of the scalar triple product, i.e., |a · (b × c)|.
Example: Find the volume of the parallelepiped spanned by (2, −4, 1), (3, −4, 6), (−3, 2, −5).
Note that if the three vectors are coplanar then the volume of the parallelepiped that they span will be zero, and conversely, so this gives us: Theorem: Three vectors in R3 are co-planar if and only if their scalar triple product is zero.
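The scalar triple product gives both the parallelepiped volume and the coplanarity test in a few lines. (A sketch, not part of the notes.)

```python
def triple(a, b, c):
    """Scalar triple product a . (b x c)."""
    bxc = [b[1] * c[2] - b[2] * c[1],
           b[2] * c[0] - b[0] * c[2],
           b[0] * c[1] - b[1] * c[0]]
    return sum(x * y for x, y in zip(a, bxc))

def volume(a, b, c):
    """Volume of the parallelepiped spanned by a, b, c."""
    return abs(triple(a, b, c))

def coplanar(a, b, c, tol=1e-9):
    """Three vectors in R^3 are coplanar iff their triple product is zero."""
    return abs(triple(a, b, c)) <= tol
```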
2.6 Planes in R3

2.6.1 Equations of Planes in R3

We can use the scalar (dot) product to give an alternative formula for the equation of a plane. Suppose we know a point A on the plane and a vector n which is perpendicular to the plane.

[Diagram: plane containing A and a general point X, with normal n; a and x are the position vectors of A and X.]

Take any general point X on the plane. Then AX and n are perpendicular, and so their scalar product is zero. Hence n · AX = 0. If we let x be the position vector of X and a be the position vector of A, then

    n · (x − a) = 0.

This is called the point-normal form of the plane.

Example: Find the point-normal form, and hence the Cartesian equation, of the plane passing through (2, −1, 3) with normal (1, −2, 1).
Observe that if n = (n1, n2, n3), then when we use the point-normal form of the equation, the co-efficients in the Cartesian equation are precisely the components of this vector. We can exploit this idea to convert a plane from vector form to Cartesian form.

Example: Find the Cartesian form of the plane whose equation is x = (3, −1, 2) + λ(−1, −2, 4) + μ(2, −4, 2).
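The conversion just described — take the normal n = u × v and read off the coefficients — can be sketched as follows. (An illustration, not part of the notes; the example's vectors below are my reading of the garbled source, under which the plane comes out as 12x + 10y + 8z = 42, i.e. 6x + 5y + 4z = 21.)

```python
def plane_cartesian(a, u, v):
    """Cartesian form n1*x + n2*y + n3*z = d of the plane
    x = a + lam*u + mu*v, using the normal n = u x v."""
    n = [u[1] * v[2] - u[2] * v[1],
         u[2] * v[0] - u[0] * v[2],
         u[0] * v[1] - u[1] * v[0]]
    d = sum(ni * ai for ni, ai in zip(n, a))   # n . a
    return n, d

n, d = plane_cartesian([3, -1, 2], [-1, -2, 4], [2, -4, 2])
```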
Example: Convert to point-normal form, the plane with equation 2x − 3y + 4z = 12.
Example: Convert to point-normal form the plane with vector equation x = (2, 5, −3) + λ(3, 0, −2) + μ(0, 1, −2).
2.6.2 Distance between a Point and a Plane in R3

Find the shortest distance from the point B = (1, −2, 3) to the plane

    x = (2, 0, 0) + λ(1, 1, 0) + μ(−1, 0, 2).

Let A be the point (2, 0, 0) and P be the foot of the perpendicular from B to the plane. Then BP is simply the length of the projection of AB onto the normal n to the plane.

[Diagram: plane through A, point B above it, normal n, and P the foot of the perpendicular from B.]
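The projection-onto-the-normal method can be coded directly. (A sketch, not part of the notes; with the example's data as I read it — B = (1, −2, 3), A = (2, 0, 0), spanning vectors (1, 1, 0) and (−1, 0, 2) — the distance comes out to 5/3.)

```python
import math

def dist_point_plane(b, a, u, v):
    """Shortest distance from b to the plane x = a + lam*u + mu*v:
    the length of the projection of AB onto the normal n = u x v."""
    n = [u[1] * v[2] - u[2] * v[1],
         u[2] * v[0] - u[0] * v[2],
         u[0] * v[1] - u[1] * v[0]]
    ab = [bi - ai for bi, ai in zip(b, a)]
    num = abs(sum(x * y for x, y in zip(ab, n)))   # |AB . n|
    return num / math.sqrt(sum(x * x for x in n))  # divide by |n|
```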
Chapter 3 - Complex Numbers 3.1 & 3.2 Review of Number Systems, Introduction to Complex Numbers Let us begin by trying to solve various algebraic equations. Suppose we only know about the set of natural numbers (written as N). Then we can solve the equation x − 3 = 2 and obtain the solution x = 5. On the other hand, if we try to solve the equation x + 3 = 2 then there is no solution! To solve this equation we need a larger set of numbers which includes the negative whole numbers as well as the positive ones. This set is called the set of integers and is denoted by Z. Continuing this idea:
In Z:  x + 3 = 2 has solution x = −1, but 3x = 2 has no solution.
In Q:  3x = 2 has solution x = 2/3, but x² = 2 has no solution.
In R:  x² = 2 has solutions x = ±√2, but x² + 1 = 0 has no solution.

[Diagram: nested sets N ⊂ Z ⊂ Q ⊂ R ⊂ C, with sample elements such as 0, 1, −1 in Z; −2, 2/3 in Q; √2, π in R; i, 2 − i in C.]
At each stage in the above, we are able to solve each new type of equation by extending the set of numbers in which we are working. Hence, to solve the equation x² = −1 we introduce a new symbol i (much as we introduced the symbol √2 to solve x² = 2). We define i to be the (complex) number whose square is −1, i.e. i² = −1. Using this new symbol, we can now solve x² = −1 to obtain the solutions x = ±i. Furthermore, we can define the complex numbers by:

Definition: The set of all numbers of the form a + bi, where a, b are real numbers (i.e., a, b ∈ R) and i² = −1, is called the set of all complex numbers, C.
As with the set of real numbers and the set of rational numbers, if we add, subtract, multiply, or divide (with the exception of division by 0) any two complex numbers, we again obtain a complex number. This property is called closure. The complex numbers are closed under addition, subtraction, multiplication and division (not by 0). A set of objects which has these (and a number of other properties) is called a field in mathematics. Definition: Let F be a non-empty set of elements for which a rule of addition (+) and a rule of multiplication (×) are defined. Then the system is a field if the following twelve axioms (or fundamental number laws) are satisfied.
1. Closure under Addition. If x, y ∈ F then x + y ∈ F.
2. Associative Law of Addition. (x + y) + z = x + (y + z) for all x, y, z ∈ F.
3. Commutative Law of Addition. x + y = y + x for all x, y ∈ F.
4. Existence of a Zero. There exists an element of F (usually written as 0) such that 0 + x = x + 0 = x for all x ∈ F.
5. Existence of a Negative. For each x ∈ F, there exists an element w ∈ F (usually written as −x) such that x + w = w + x = 0.
6. Closure under Multiplication. If x, y ∈ F then xy ∈ F.
7. Associative Law of Multiplication. x(yz) = (xy)z for all x, y, z ∈ F.
8. Commutative Law of Multiplication. xy = yx for all x, y ∈ F.
9. Existence of a One. There exists a non-zero element of F (usually written as 1) such that x1 = 1x = x for all x ∈ F.
10. Existence of an Inverse for Multiplication. For each non-zero x ∈ F, there exists an element w of F (usually written as 1/x or x⁻¹) such that xw = wx = 1.
11. Distributive Law. x(y + z) = xy + xz for all x, y, z ∈ F.
12. Distributive Law. (x + y)z = xz + yz for all x, y, z ∈ F.
Example: Is the set {−1, 1} a field?
Example: Is the set {0, 1} a field?
Example: Is the set of integers Z a field?
Example: How do we show that the set of complex numbers C is a field?
3.3 The Rules of Arithmetic for Complex Numbers

Rules: Let z = a + ib, w = c + id be complex numbers such that a, b, c, d ∈ R. Then

(i) z ± w = (a ± c) + i(b ± d)
(ii) zw = (ac − bd) + i(ad + bc)
(iii) z/w = (a + ib)/(c + id) × (c − id)/(c − id) = ((ac + bd) + i(bc − ad))/(c² + d²), provided w ≠ 0.
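As a quick numerical check (not part of the original notes), rule (iii) can be compared against Python's built-in complex arithmetic:

```python
# A sketch: implement division rule (iii) directly and compare with
# Python's own complex division. `divide` is an illustrative helper name.
def divide(a, b, c, d):
    """Return z/w for z = a+bi, w = c+di using rule (iii)."""
    denom = c * c + d * d
    return complex((a * c + b * d) / denom, (b * c - a * d) / denom)

z, w = 3 + 2j, 1 - 4j
assert abs(divide(3, 2, 1, -4) - z / w) < 1e-12
```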
Example: If z = 1 + 2i then calculate z².
Example: If z = 2 + 5i then calculate 1/z in Cartesian form, i.e., 1/z = a + ib.
3.4 Real Parts, Imaginary Parts and Complex Conjugates Definition: The real part of z = a + bi (written Re(z)), where a, b ∈ R, is given by Re(z) = a. Definition: The imaginary part of z = a + bi (written Im(z)), where a, b ∈ R, is given by Im(z) = b. Note: The imaginary part of a complex number is a real number!
Example: If z = 2 + 5i what is Im(z)? Re(z)?
Example: If z = 3 + 4i what is Im(z²)? Re(z²)?
Example: What is Im((1 + 2i)/(3 − 4i))?
Example: If z = a+bi what is Im(iz)? Re(iz)?
Example: For any two complex numbers z1 and z2 is the following true Im(z1z2) = Im(z1)Im(z2) ?
We noted in the previous lecture that to calculate 1/z, where z = a + bi, it was useful to multiply top and bottom of 1/z by a − bi. We give this complex number a special name.

Definition: If z = a + bi, where a, b ∈ R, then the complex conjugate of z is z̄ = a − bi.

Properties of the Complex Conjugate.
1. The conjugate of z̄ is z.
2. Re(z) = (z + z̄)/2 and Im(z) = (z − z̄)/(2i).
3. zz̄ ∈ R and zz̄ ≥ 0.
4. The conjugate of z + w is z̄ + w̄, and the conjugate of z − w is z̄ − w̄.
5. The conjugate of zw is z̄w̄, and the conjugate of z/w is z̄/w̄.
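The conjugate properties above are easy to verify numerically; the following sketch (not from the notes) uses Python's built-in `complex` type with a pair of illustrative values:

```python
# Check the conjugate properties for sample values z and w.
z, w = 2 + 5j, -1 + 3j
conj = lambda u: u.conjugate()

assert conj(conj(z)) == z                                   # property 1
assert (z + conj(z)) / 2 == z.real                          # Re(z) = (z + z̄)/2
assert (z - conj(z)) / 2j == z.imag                         # Im(z) = (z − z̄)/(2i)
assert (z * conj(z)).imag == 0 and (z * conj(z)).real >= 0  # zz̄ ∈ R, zz̄ ≥ 0
assert conj(z + w) == conj(z) + conj(w)                     # property 4
assert conj(z * w) == conj(z) * conj(w)                     # property 5
```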
Example: Prove that zz̄ ≥ 0.
Example: Prove that the conjugate of zw is z̄w̄.
Example: Let P(z) = a_n z^n + a_{n−1} z^{n−1} + · · · + a_1 z + a_0 be a polynomial with real coefficients a_n, a_{n−1}, . . . , a_1, a_0. If z_1 is a complex root of P(z), i.e., P(z_1) = 0, show that the conjugate z̄_1 is also a root.
(cont.)
3.5 The Argand Diagram

Complex numbers can be represented using the Argand plane, which consists of Cartesian axes similar to those which you used to represent points in the plane. The horizontal axis is used to represent the real part and the vertical axis (sometimes called the imaginary axis) is used to represent the imaginary part. For example, the following points have been plotted: −3, 2i, −3 + 2i, 4 + i.

[Diagram: the points −3, 2i, −3 + 2i and 4 + i marked on the Argand plane.]

Complex numbers then are 2-dimensional, in that we require two axes to represent them.
Observe that a complex number z and its conjugate z̄ are simply reflections of each other in the real axis. We lose the notion of comparison in the complex plane. That is, we cannot say whether one complex number is greater or less than another.

Plotting real and imaginary parts leads to a simple geometric picture of addition and subtraction of complex numbers. Addition and subtraction are done by adding or subtracting x-coordinates and y-coordinates separately, as shown in the figure below.

[Diagram: z_1, z_2, z_1 + z_2 and z_1 − z_2 plotted on the Argand plane, the sum completing a parallelogram with vertices 0, z_1, z_2 and z_1 + z_2.]
3.6 Polar Form, Modulus and Argument

You have already seen that complex numbers can be expressed in Cartesian form, a + ib, a, b ∈ R. We can also specify a complex number z by specifying the distance of z from the origin and the angle it makes with the positive real axis.

[Diagram: z = x + yi on the Argand plane, at distance r from the origin and angle θ from the positive real axis.]

This distance is called the modulus (also called magnitude and absolute value) of z and written as |z|, while the angle is called the argument and written as Arg(z). We insist, to remove ambiguity, that −π < Arg(z) ≤ π. Pythagoras' theorem gives: if z = x + yi then

r = |z| = √(x² + y²).
Care must be taken to find the correct argument. It is easiest to find the related angle θ such that tan θ = |y/x| and then use this to find the argument in the correct quadrant, recalling that we use negative angles in the third and fourth quadrants.

Example: Find the modulus and argument of z = −1 + i√3 and w = 1 − 2i.
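For checking hand calculations, Python's `cmath` module computes the modulus and the principal argument directly. A sketch (not part of the notes), using the z from the example above:

```python
import cmath
import math

# z = −1 + i√3 lies in the second quadrant: modulus 2, argument 2π/3.
z = -1 + 1j * math.sqrt(3)
r, theta = abs(z), cmath.phase(z)   # cmath.phase returns Arg(z) in (−π, π]
assert abs(r - 2) < 1e-12
assert abs(theta - 2 * math.pi / 3) < 1e-12
```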
Properties of Modulus:
(i) |zw| = |z||w|
(ii) |z/w| = |z|/|w|, provided w ≠ 0.
(iii) |z^n| = |z|^n
(iv) |z| = 0 ⇒ z = 0.
Example: Which of the following points lie inside the circle |z − 1| = 2?

(1 + i)/2,  2 + 3i,  √2 + i(√2 + 1),  (−1 + i√3)/2
Example: If z1 and z2 are arbitrary complex numbers prove the triangle inequality |z1+z2| ≤ |z1| + |z2|.
Polar Form:

[Diagram: z = r(cos θ + i sin θ) on the Argand plane, with horizontal component r cos θ and vertical component r sin θ.]

From the diagram, we can see that the complex number z can be written in the form z = r(cos θ + i sin θ), where r is the modulus of z and θ is the argument of z. This is called the polar form of z.

Example: Write 1 − i in polar form.
You will need to be able to convert a complex number from Cartesian form, (a + bi), into polar form r(cos θ + i sin θ) and vice-versa.

Example: Write √3(cos π/3 + i sin π/3) in Cartesian form.

From the diagram we can also write down, for z = x + yi,

cos θ = x/r = x/√(x² + y²)  and  sin θ = y/r = y/√(x² + y²).

Also Re(z) = x = r cos θ and Im(z) = y = r sin θ.
Example: Write −2/(1 + √3 i) in polar form.
Example: Write 1/(1 − i)² in polar form.
Example: Determine a value for Arg(zz̄) if z ≠ 0.

You may have seen the abbreviation cis θ to represent cos θ + i sin θ. You should not use that here, since your tutor may not know what it is. This form is NOT generally used in books beyond high school. Moreover, as we shall see, this polar form is really a stepping stone to a much better form which involves e.
Example: Students were asked to write 1/(1 − i)² in polar form. Different answers were given:

1/(1 − i)² = (1 + i)²/((1 − i)²(1 + i)²) = 2i/4 = i/2

and

1/(1 − i)² = (1/2)(cos π/2 + i sin π/2).

Which one is correct?
3.7 Properties and Applications of the Polar Form One important fact about the polar form is a remarkable result called: De Moivre’s Theorem: For any real number θ, and any integer n, we have (cos θ + i sin θ)n = cos nθ + i sin nθ. The proof of this, looking at the various cases of n is given in the Algebra Course Notes (yellow book). The method of proof by induction is used.
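De Moivre's Theorem is easy to test numerically; this short sketch (not from the notes) checks one instance with arbitrary illustrative values of θ and n:

```python
import math

# (cos θ + i sin θ)^n should equal cos nθ + i sin nθ.
theta, n = 0.7, 5
lhs = complex(math.cos(theta), math.sin(theta)) ** n
rhs = complex(math.cos(n * theta), math.sin(n * theta))
assert abs(lhs - rhs) < 1e-12
```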
Example: Let z = 1 − i√3. Find z^12.
Let us write de Moivre's theorem as follows: Let f(θ) = cos θ + i sin θ; then (f(θ))^n = f(nθ). Also f(0) = 1. Euler did the following: he supposed we can differentiate the function, treating i just like a real number. If we momentarily ignore the logical difficulties involved, then f′(θ) = i(cos θ + i sin θ) = i f(θ). Comparing this with d/dθ (e^{iθ}) = i e^{iθ}, we can see that this function seems to have properties that are very similar to the exponential function e^{iθ}. We will therefore define:

Definition: e^{iθ} = cos θ + i sin θ (and hence e^{−iθ} = cos θ − i sin θ). This formula is sometimes called Euler's formula. We shall take it to be a definition of the complex exponential.
Thus, any complex number z can be expressed in the polar form z = re^{iθ}, where r = |z| is the modulus and θ the argument of z. For example,

z = 1 − i =

and

z = −1 =

This last formula is quite remarkable since it links together the four fundamental constants of mathematics. In a very important sense, this is the best way to write complex numbers. We have used the term polar form in two different senses. From now on, when I say polar form, I will (generally) mean this new exponential form. You ought to be able to convert a complex number from Cartesian form to polar form and vice-versa.
3.7.1 The Arithmetic of Polar Forms

Note the following important facts:
(i) The conjugate of the complex number z = e^{iθ} is given by z̄ = e^{−iθ}.
(ii) e^{iθ} = e^{i(θ+2kπ)} where k is an integer.

We can write cos θ and sin θ in terms of the complex exponential as follows:

cos θ = (e^{iθ} + e^{−iθ})/2  and  sin θ = (e^{iθ} − e^{−iθ})/(2i).

From the polar form, we can deduce the properties of modulus and argument which we listed earlier. Let z = r_1 e^{iθ_1} and w = r_2 e^{iθ_2}; then zw = r_1 r_2 e^{i(θ_1+θ_2)}, from which it follows that

|zw| = |z||w| and Arg(zw) = Arg(z) + Arg(w) + 2kπ,

|z/w| = |z|/|w| and Arg(z/w) = Arg(z) − Arg(w) + 2kπ,

where k is chosen so that −π < Arg(zw) ≤ π or −π < Arg(z/w) ≤ π.
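These modulus and argument rules can be illustrated with `cmath` (a sketch, not part of the notes); the values below are chosen so that the argument of the product must be brought back into (−π, π] with k = −1:

```python
import cmath
import math

z = 2 * cmath.exp(1j * 5 * math.pi / 6)   # |z| = 2, Arg(z) = 5π/6
w = 3 * cmath.exp(1j * math.pi / 3)       # |w| = 3, Arg(w) = π/3
zw = z * w

assert abs(abs(zw) - 6) < 1e-12           # |zw| = |z||w|
# Arg(z) + Arg(w) = 7π/6 > π, so k = −1 wraps it into (−π, π]:
assert abs(cmath.phase(zw) - (7 * math.pi / 6 - 2 * math.pi)) < 1e-12
```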
Example: Evaluate zw in polar form if z = 2e^{iπ/6}, w = 3e^{i5π/3}.

Example: Evaluate the product (1 + i)(1 − i√3) to show that cos(π/12) = (1 + √3)/(2√2).
Example: Find the square roots of 21 − 20i.
(cont.)
Example: Find the roots (in Cartesian form) of z² − 3z + (3 − i) = 0.
(cont.)
3.7.2 Powers of Complex Numbers

Example: Find (√3 − i)^6.
3.7.3 Roots of Complex Numbers

Definition: A complex number z is an nth root of a number z0 if z0 is the nth power of z; that is, z is an nth root of z0 if z^n = z0.

To find roots of complex numbers, we will use the polar form. Note that to find the nth roots of a complex number z0, we are really solving z^n = z0, and so we will convert z0 into polar form. Such an equation will have n solutions! To get all of these solutions we express z0 in polar form, using the general argument, not the principal one. An example will make this clear.
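The procedure just described (use the general argument θ + 2kπ for k = 0, 1, . . . , n − 1) can be sketched in Python; `nth_roots` is an illustrative helper name, not from the notes:

```python
import cmath
import math

def nth_roots(z0, n):
    """All n solutions of z**n = z0, via the general argument of z0."""
    r, theta = abs(z0), cmath.phase(z0)
    return [r ** (1 / n) * cmath.exp(1j * (theta + 2 * math.pi * k) / n)
            for k in range(n)]

roots = nth_roots(-1, 7)                             # the 7th roots of −1
assert len(roots) == 7
assert all(abs(w ** 7 - (-1)) < 1e-9 for w in roots)
assert all(abs(abs(w) - 1) < 1e-12 for w in roots)   # all on the unit circle
```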
Example: Find the 7th roots of −1.
(cont.)

If we plot these complex numbers we see that they lie on a circle of radius 1 and are equally spaced around that circle.

[Diagram: the seven 7th roots of −1, z_1, . . . , z_7, equally spaced around the unit circle.]
Example: Find the 5th roots of 4(1 − i).
3.8 Trigonometric Applications of Complex Numbers

Using Euler's formula, we note that

e^{iθ} = cos θ + i sin θ,
e^{−iθ} = cos(−θ) + i sin(−θ) = cos θ − i sin θ.

On first adding and then subtracting these formulae, we obtain the important formulae

cos θ = (e^{iθ} + e^{−iθ})/2,  sin θ = (e^{iθ} − e^{−iθ})/(2i).

Thus, sines and cosines can be replaced by exponentials with imaginary exponents.
A large variety of trigonometric formulae can be obtained by using the above results and the Binomial Theorem. This theorem is stated below, and a proof is given in the course notes.
Theorem: If a, b ∈ C and n ∈ N, then

(a + b)^n = a^n + n a^{n−1} b + [n(n−1)/2!] a^{n−2} b² + [n(n−1)(n−2)/3!] a^{n−3} b³ + · · · + n a b^{n−1} + b^n
          = Σ_{k=0}^{n} C(n, k) a^{n−k} b^k,

where the numbers C(n, k) = n!/(k!(n − k)!) are the binomial coefficients.

For small values of n the binomial coefficients may be easily calculated using Pascal's triangle.

n    BINOMIAL COEFFICIENTS
0    1
1    1  1
2    1  2  1
3    1  3  3  1
4    1  4  6  4  1
5    1  5  10  10  5  1
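The rule that generates each row of Pascal's triangle can be sketched in a few lines of Python (an illustration, not part of the notes; `pascal_row` is a hypothetical helper name):

```python
import math

def pascal_row(n):
    """Row n of Pascal's triangle: the coefficients of (a + b)**n."""
    row = [1]
    for k in range(n):
        # C(n, k+1) = C(n, k) * (n − k) / (k + 1), using exact integers
        row.append(row[-1] * (n - k) // (k + 1))
    return row

assert pascal_row(5) == [1, 5, 10, 10, 5, 1]
assert pascal_row(3) == [math.comb(3, k) for k in range(4)]
```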
Example: For n = 3, the coefficients are 1, 3, 3, 1, and hence (a + b)³ = a³ + 3a²b + 3ab² + b³.

Example: Show that cos²θ + sin²θ = 1 using the polar form of the trigonometric functions.
Example: Find a formula for cos³θ in terms of cosines of multiples of θ.
Example: Find a formula for cos 3θ in terms of powers of sin θ and cos θ.
3.9 Geometric Applications of Complex Numbers

In this section we see how to represent regions in the Argand plane algebraically.

Example: Give a geometric interpretation of |z − w| and Arg(z − w). Let z = x + iy, w = a + ib. Note that z − w = (x − a) + i(y − b). Now

|z − w| = √((x − a)² + (y − b)²),

which is the distance between the points representing z and w.
To understand the geometric interpretation of Arg(z − w), we represent the complex number z − w by the directed line segment (the arrow) from w to z. We plot z and w on the Argand diagram for each of the four cases as shown in the figure below. The angle α satisfies

cos α = (x − a)/|z − w|  and  sin α = (y − b)/|z − w|,

and hence α = Arg(z − w).

[Diagram: four Argand diagrams, one for each quadrant case, each showing the arrow from w to z and the angle α it makes with the horizontal.]
Example: The set A = {z ∈ C : |z| ≤ 2} represents the set of points whose distance from the origin is less than or equal to 2. (N.B. |z − α| measures the distance between z and α.) Hence this set represents a disc of radius 2, centre the origin.

[Diagram: the closed disc of radius 2 centred at the origin.]
Similarly, the set B = {z ∈ C : |z + 1| < 2} represents the open disc centre (−1, 0), radius 2.

[Diagram: the open disc of radius 2 centred at −1, meeting the real axis at −3 and 1.]
The set C = {z ∈ C : 0 ≤ Arg(z) ≤ π/3} represents a wedge with vertex at the origin, and arms separated by an angle of 60°.

Note that the origin is NOT included, since the argument of 0 is not defined.
Similarly, the set D = {z ∈ C : 0 ≤ Arg(z − i) ≤ π/6} represents a wedge with vertex at the point i, as shown.

[Diagram: the wedge with vertex i.]
Example: Sketch {z ∈ C : |z − i + 1| < 2} ∩ {z ∈ C : Re(z) ≥ 0}
Example: Sketch {z ∈ C : |z − 3| < 2} ∪ {z ∈ C : Im(z − 3i) > 0}
The rules for multiplication of complex numbers in polar form tell us that when we multiply two complex numbers together, rotation and stretching are involved. In particular, since i = e^{iπ/2}, multiplying a complex number z by i has the effect of rotating z anticlockwise about the origin through an angle of 90°.

[Diagram: z and iz on the Argand plane.]

Example: Find the complex number obtained by rotating 4 + 2i anticlockwise about the origin through π/2.
More generally, to rotate a complex number anticlockwise around 0 through an angle θ, we multiply it by e^{iθ}.

Example: Rotate 3 − i anticlockwise about 0 through an angle of π/4.
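A minimal sketch of this rotation rule in Python (not from the notes; the answer to the 4 + 2i example is included only as an illustrative check, since (4 + 2i)·i = −2 + 4i):

```python
import cmath
import math

def rotate(z, theta):
    """Rotate z anticlockwise about the origin through angle theta."""
    return z * cmath.exp(1j * theta)

# Rotating 4 + 2i through π/2 is the same as multiplying by i:
assert abs(rotate(4 + 2j, math.pi / 2) - (-2 + 4j)) < 1e-12
```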
3.10 Complex Polynomials

Theorem: (Fundamental Theorem of Algebra (FTA)) Suppose p(z) = a_n z^n + a_{n−1} z^{n−1} + · · · + a_1 z + a_0 is a polynomial whose co-efficients a_n, . . . , a_1, a_0 are all real (or complex) numbers. Then the equation p(z) = 0 has at least one root in the complex numbers.

Corollary: The equation p(z) = 0 has exactly n (complex) solutions in the complex numbers (counting multiplicity). (The last proviso 'counting multiplicity' refers to polynomials which may, for example, have factors such as (z − 2)⁴, in which case the root z = 2 is counted four times.)
The above result tells us (among other things) that we do not need to find any larger set of numbers if we want to solve polynomial equations. The complex numbers contain all the roots of every polynomial. The fundamental theorem of algebra, mentioned above, tells us that in the complex plane, all polynomials have all their roots. This is a very powerful theoretical tool, but it does not explicitly tell us how to find these roots for a given polynomial. Moreover, if we know the roots then we also know how to factor the polynomial. You will need to recall a number of basic facts about polynomials from High School, which are:
Remainder Theorem: If p(z) is a polynomial then the remainder r when p(z) is divided by z − α is given by r = p(α).

Factor Theorem: If p(α) = 0 then (z − α) is a factor of p(z).

From the fundamental theorem of algebra, it is clear that over the complex numbers, all polynomials completely factor (at least in theory) into linear factors.

Theorem: Every polynomial (with real or complex co-efficients) of degree n ≥ 1 has a factorisation into linear factors of the form:

p(z) = a(z − α_1)(z − α_2) · · · (z − α_n)

where α_1, α_2, . . . , α_n are the (complex) roots of p(z). This result still does not tell us how to factor. The key to factoring over the real numbers is to first factor over the complex numbers, since in the complex plane the polynomial 'falls to pieces' into linear factors.
Example: Factor z⁴ + 1 over the complex numbers and hence over the real numbers.
(cont.)
Example: Factor z⁶ + 8 over the complex and real numbers.
(cont.)
Note that if the co-efficients of the polynomial are real, then the roots occur in conjugate pairs.

Theorem: Let p(z) = a_n z^n + a_{n−1} z^{n−1} + · · · + a_1 z + a_0 be a polynomial with real coefficients a_n, a_{n−1}, . . . , a_1, a_0. If α is a complex root of p(z), i.e., p(α) = 0, then ᾱ is also a root.

Proof:
From this it follows that:

Theorem: A polynomial with real co-efficients can be factored into a product of real linear and/or real quadratic factors.

Proof: Factor p(z) over the complex numbers in the form

p(z) = a(z − b_1) · · · (z − b_r)(z − α_1)(z − ᾱ_1) · · · (z − α_s)(z − ᾱ_s)

where the b_i's are real and the α_i's are complex (non-real). By the above theorem, these must occur in conjugate pairs. Now each such pair of factors containing the conjugate pairs can be expanded, viz:

(z − α)(z − ᾱ) = z² − (α + ᾱ)z + αᾱ.

Now α + ᾱ = 2Re(α) and so is REAL, and αᾱ = |α|² is also REAL. Hence the quadratic we obtain has REAL co-efficients.
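The expansion used in the proof can be checked numerically. The sketch below (not part of the notes) picks an illustrative α and confirms that the resulting quadratic has real co-efficients and still has α and ᾱ as roots:

```python
# Expanding (z − α)(z − ᾱ) = z² − 2 Re(α) z + |α|²: both coefficients are real.
alpha = 2 + 3j                         # illustrative non-real root
b = -(alpha + alpha.conjugate())       # −2 Re(α)
c = alpha * alpha.conjugate()          # |α|²
assert b.imag == 0 and c.imag == 0     # the quadratic has REAL co-efficients
assert b == -4 and c == 13

# Sanity check: both α and ᾱ are roots of z² + bz + c.
p = lambda z: z * z + b * z + c
assert p(alpha) == 0 and p(alpha.conjugate()) == 0
```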
Example: Show that z = i is a root of p(z) = z⁴ − 2z³ + 6z² − 2z + 5 and hence factor p over R and C.
(cont.)
A very useful result, especially for some of the tutorial problems, is given by

z^n − 1 = (z − 1)(1 + z + z² + · · · + z^{n−1}), n ∈ N.
Chapter 4 - Linear Equations and Matrices

4.1 Introduction to Linear Equations

At school you learnt how to solve two-by-two systems of linear equations such as

2x + 3y = 1
4x − 2y = 2

You did this by multiplying the equations by various numbers and adding or subtracting so as to eliminate one of the unknowns. What does one do when faced with a system of 30 equations in 30 unknowns (such things turn up routinely in the applications of mathematics to the real world)? Ad hoc manipulation of the equations is clearly out of the question.
Example: Suppose A, B, C, D are arbitrary (but fixed) real numbers. Find the (unique) polynomial p(x) of degree 3, such that p(0) = A, p′(0) = B, p(1) = C and p′(1) = D.
We need a SYSTEMATIC method for solving systems of linear equations in many variables.
This is provided by a method known as Gaussian Elimination. It is one of the absolute fundamentals of the algebra course and will be used constantly in MATH 1231 as well as in this course. You will need to learn it carefully and get lots of practice at it. Before we look at this method, however, we wish to look carefully at the geometric interpretation of solving such systems. In 2-dimensions, solving a system of 2 simultaneous equations corresponds to looking at the intersection of two lines. The two lines may be parallel, in which case there is no intersection. They may meet at one point, in which case there will be one unique solution or they may in fact be the same line in which case there will be infinitely many solutions.
In 3-dimensions, consider the following system of linear equations:

x + y + z = 6
x − y − z = 0

Geometrically, we are looking at the intersection of two planes. Now two planes could be parallel, which would then mean that there are no solutions to this system. The two planes could be the same, in which case there are infinitely many solutions, all lying on that plane. The third possibility is that the two planes could meet in a line. In our example,
(cont.)
Observe that this is simply the equation of a line! Hence the two planes in our example meet in a line.
Example: Consider the following set of simultaneous equations.

x + y + z = 7
  −2y − z = 0
       3z = 12

These represent three planes which intersect in a single point.
This technique is called back-substitution. This last example was quite easy because at each stage we could solve the last equation and successively back-substitute to solve the others. We are going to learn a procedure which will convert any set of equations into this or a similar form.

Example: Solve

x + 2y + 3z = 6
2x + 4y + 6z = 11
Example: Solve

x + 2y + 3z = 6
2x + 4y + 6z = 12

These represent two planes which are in fact the same.
The examples above were simply designed to show you what kinds of things can happen when we try to solve systems of equations and how we can represent the solutions. What we now need is a systematic method of reducing the equations to a point where we can use the above ideas to get all the solutions. This will be achieved by the method known as Gaussian Elimination. To explain this method we need a new way of representing simultaneous equations which we will discuss next lecture.
Example: Solve

x1 + 2x2 + 3x3 = 5
2x1 + 5x2 + 8x3 = 12
4.2 Systems of Linear Equations and Matrix Notation

When we manipulate the equations, we are really only concerned with the co-efficients. The symbols x, y and z or x1, x2, x3 are simply place markers to separate off the co-efficients. We will therefore replace the equations with the co-efficients properly aligned. For example the system

x + y + z = 6
x − y − z = 0

will be written as the block of numbers

( 1  1  1 | 6 )
( 1 −1 −1 | 0 )

This is called an augmented matrix. The list consisting only of the co-efficients

( 1  1  1 )
( 1 −1 −1 )

is called a matrix.
The word matrix (plural matrices) is Latin for womb. I suppose the idea was that a developing fetus is stored in the womb, just as a matrix is used to store a block of numbers.

Example: The augmented matrix of

x + y + z = 7
  −2y − z = 0
       3z = 12

is
An m × n matrix, then, is a block of numbers of the form

( a11  a12  a13  ...  a1n )
( a21  a22  a23  ...  a2n )
( a31  a32  a33  ...  a3n )
( ...  ...  ...  ...  ... )
( am1  am2  am3  ...  amn )

Note the m × n matrix has m rows and n columns. At this stage, we will think of these numbers as the co-efficients of the unknowns in some set of equations. (In the next topic we will look at matrices from a more general viewpoint.) An alternate way to write a system of equations is the so-called vector form.
Example: Write the following system of equations in augmented matrix and in vector form.

 x1 + 3x2        + 7x4 = −2
−2x1       − 6x3 − 4x4 = 3
 7x1       + 5x3 − 5x4 = −10

The augmented matrix form is

The vector form is
If we write x for the vector

( x1 )
( x2 )
( x3 )
( x4 )

and, in the above example, A for the matrix

(  1  3 −6  7 )
( −2  0  5 −4 )
(  7  0  0 −5 )

and b for the vector

( −2  )
(  3  )
( −10 )

then the system of equations will be written briefly as Ax = b and the augmented matrix as (A|b). (Note that we have as yet not defined matrix multiplication, so Ax is purely a formal symbol.)
Thus, the system

a11 x1 + a12 x2 + · · · + a1n xn = b1
a21 x1 + a22 x2 + · · · + a2n xn = b2
...
am1 x1 + am2 x2 + · · · + amn xn = bm

can be written as Ax = b with augmented matrix (A|b), written as

( a11  a12  a13  ...  a1n | b1 )
( a21  a22  a23  ...  a2n | b2 )
( a31  a32  a33  ...  a3n | b3 )
( ...  ...  ...  ...  ... | .. )
( am1  am2  am3  ...  amn | bm )
4.3 Elementary Row Operations Suppose we have our system of equations written in augmented matrix form. There are a number of operations we can perform on the rows of the matrix without essentially altering the underlying set of equations. These operations will be called elementary row operations. We will describe these informally first, using examples, and then write them down formally later. The systematic application of elementary row operations is called Gaussian elimination. The first such operation is to swap two rows. Note that this is simply the same as writing down the equations in a different order.
Example:

( 2 −1  1 | −1 )           ( 1  1  1 |  4 )
( 1  1  1 |  4 )  R1↔R2→  ( 2 −1  1 | −1 )
( 3  2 −1 | −2 )           ( 3  2 −1 | −2 )

Observe that this places a 1 in the top left-hand corner. This is very desirable. Given any augmented matrix, we swap the rows so as to have a non-zero number, preferably a 1 or −1, in the top left-hand corner. A non-zero number in the top left-hand corner is called a pivot and the column is referred to as a pivot column.
The next operation involves subtracting a multiple of one row from another. This is precisely the sort of thing you did back at school when using the elimination method. In our example I am going to subtract twice row 1 from row 2 to obtain

( 1  1  1 |  4 )            ( 1  1  1 |  4 )
( 2 −1  1 | −1 )  R2−2R1→  ( 0 −3 −1 | −9 )
( 3  2 −1 | −2 )            ( 3  2 −1 | −2 )
Notice that this eliminates the x term from the second equation. Now we subtract three times row 1 from row 3 to obtain

( 1  1  1 |  4 )            ( 1  1  1 |   4 )
( 0 −3 −1 | −9 )  R3−3R1→  ( 0 −3 −1 |  −9 )
( 3  2 −1 | −2 )            ( 0 −1 −4 | −14 )
Normally we would perform both of these operations in one go and write

( 2 −1  1 | −1 )                    ( 1  1  1 |   4 )
( 1  1  1 |  4 )  wait — in our running example the first row is already ( 1  1  1 |  4 ), so:

( 1  1  1 |  4 )                     ( 1  1  1 |   4 )
( 2 −1  1 | −1 )  R2−2R1, R3−3R1→   ( 0 −3 −1 |  −9 )
( 3  2 −1 | −2 )                     ( 0 −1 −4 | −14 )

Observe now that the variable x (or x1) is absent from the second and third equations. This process is called row reduction and is central to the Gaussian elimination method. We seek to reduce the co-efficients below the pivot to 0.
We now move to the second line and observe that the first non-zero number in that row is −3. We could also call this a pivot element, but there is a −1 below it. So we swap R2 and R3 to obtain

( 1  1  1 |   4 )            ( 1  1  1 |   4 )
( 0 −3 −1 |  −9 )  R2↔R3→   ( 0 −1 −4 | −14 )
( 0 −1 −4 | −14 )            ( 0 −3 −1 |  −9 )

This makes −1 our new pivot and we use it to eliminate the −3 below by subtracting three times row 2 from row 3 to obtain

( 1  1  1 |   4 )             ( 1  1  1 |   4 )
( 0 −1 −4 | −14 )  R3−3R2→   ( 0 −1 −4 | −14 )
( 0 −3 −1 |  −9 )             ( 0  0 11 |  33 )

Finally observe that row 2 can be made a little simpler by multiplying by −1. Again this will not essentially change the underlying equations. Thus

( 1  1  1 |   4 )              ( 1  1  1 |  4 )
( 0 −1 −4 | −14 )  R2→−R2→    ( 0  1  4 | 14 )
( 0  0 11 |  33 )              ( 0  0 11 | 33 )
Note then, that at this stage, the underlying system of equations is

x + y +  z = 4
    y + 4z = 14
       11z = 33

which is very easy to solve by back-substitution, giving z = 3, y = 2, x = −1. The final matrix we obtained is called the row echelon form of the matrix we started with. The first non-zero element in each row is called a pivot. We use row reduction to ensure that each pivot has zeros below it. In any matrix, a leading row is any row which is not all zero. The leading entry in a leading row is the first non-zero element and the leading column is the column which contains that leading entry. We use the words leading and pivot interchangeably.
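The back-substitution just performed can be mirrored in a few lines of Python (a sketch, not part of the notes):

```python
# Solve the echelon system x + y + z = 4, y + 4z = 14, 11z = 33
# from the bottom row upwards.
z = 33 / 11
y = 14 - 4 * z
x = 4 - y - z
assert (x, y, z) == (-1.0, 2.0, 3.0)
```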
For example, in the matrix

( 0  5  7 )
( 0  0  0 )

the first row is a leading row and 5 is the leading entry. The second row is not a leading row.
Example: Write the following system of equations in augmented matrix form and then perform elementary row operations to reduce the augmented matrix to row echelon form.

  x1 + 2x2 +  4x3 = 10
−3x1 + 3x2 + 15x3 = 15
−2x1 −  x2 +   x3 = −5
(cont.)
Example: Write the following system of equations in augmented matrix form and then perform elementary row operations to reduce the augmented matrix to row echelon form.

 x1 − 2x2 + 3x3 = 11
2x1 −  x2 + 3x3 = 10
4x1 +  x2 −  x3 = 4
(cont.)
4.4 Solving Systems of Equations

A matrix is in row echelon form if

• all rows of zeros are at the bottom.
• in any leading row, the leading entry is further to the right than the leading entry in any row higher up in the matrix.

Note that this implies that the numbers below any leading entry must be zero. Here are some examples:

( 2  5  7 )      ( 2  0  5  7 )
( 0  0  4 )      ( 0  0  5  8 )
                 ( 0  0  0 −2 )
                 ( 0  0  0  0 )
The Gaussian Elimination method which we used to transform a matrix into row echelon form involved the following operations called elementary row operations.
• Swapping rows • Multiplying (or dividing) any row by a nonzero constant.
• Adding or subtracting a multiple of one row from another.
Given a matrix which we seek to row reduce, we go through the following steps.

1. Move any zero rows to the bottom.

2. Look down the first column and find a row with a non-zero element, preferably 1 or −1, and make it the first row. The leading entry in that row is called the pivot. If the entire first column consists only of zeros then look to the second column and so on until you find a pivot. If the leading entry is not 1 or −1 then we can divide the row by this leading entry to make it 1. This will often introduce fractions. Alternately, you might be able to turn it into 1 by subtracting (or adding) (some multiple of) another row.

3. Eliminate (i.e. reduce to zero) all the non-zero entries below the pivot using the third elementary row operation.
4. Repeat the process until the matrix is in row echelon form. Note that you may swap rows and divide any row by a common factor as you like. Try to avoid fractions as much as possible by swapping rows.
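The steps above can be sketched as a short Python routine; `row_echelon` is a hypothetical helper (not from the notes), using exact fractions so that no rounding occurs:

```python
from fractions import Fraction

def row_echelon(M):
    """Reduce a matrix (a list of rows) to row echelon form by the steps
    above. A sketch: it always divides, so fractions may appear."""
    A = [[Fraction(x) for x in row] for row in M]
    rows, cols = len(A), len(A[0])
    r = 0
    for c in range(cols):
        pivot = next((i for i in range(r, rows) if A[i][c] != 0), None)
        if pivot is None:
            continue                      # no pivot in this column
        A[r], A[pivot] = A[pivot], A[r]   # step 2: swap a pivot row up
        for i in range(r + 1, rows):      # step 3: eliminate below the pivot
            m = A[i][c] / A[r][c]
            A[i] = [a - m * b for a, b in zip(A[i], A[r])]
        r += 1
    return A

# The augmented matrix of the worked example in this section:
U = row_echelon([[2, -1, 1, -1], [1, 1, 1, 4], [3, 2, -1, -2]])
assert all(U[i][j] == 0 for i in range(3) for j in range(i))  # echelon shape
```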
Example: Row reduce the matrix

( 0  2  3  3  5 )
( 2  1 −4 −2 −1 )
( 1 −1 −2 −2  0 )
(cont.)
Example: Row reduce the matrix

( 2  4 −1  3 )
( 3 −7 10  1 )
( 4  0  1  0 )
Example: Use Gaussian Elimination to solve the system of equations:

  x +  y −  z = 0
 2x +  y + 3z = −2
−3x + 2y + 4z = −16
(cont.)
Example: Use Gaussian Elimination to solve the system of equations:

 x1 +  x2        −  x4 = −3
2x1 + 3x2 −  x3 −  x4 = −15
4x1 + 2x2 + 2x3 +  x4 = −1
(cont.)
4.4.3 Transformation to Reduced Row Echelon Form

The reduced row echelon form of a matrix is the row echelon form with the added property that there are zeros below and above the leading entries, and every leading entry is 1.

Example:

( 1  0  0  4  1 )
( 0  1  0  0  2 )
( 0  0  1 −2  3 )

To get a matrix into this form, we follow the same algorithm as before but we reduce (to zero) the entries above as well as below the leading entries and (at the end is easiest) divide each leading row by the leading entry to make the leading entry equal to 1.
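A sketch of this algorithm in Python (a hypothetical `rref` helper, not part of the notes), tested on the echelon matrix obtained earlier in the chapter:

```python
from fractions import Fraction

def rref(M):
    """Reduced row echelon form: every leading entry is 1, with zeros
    both below and above it. A sketch, not an optimised routine."""
    A = [[Fraction(x) for x in row] for row in M]
    rows, cols = len(A), len(A[0])
    r = 0
    for c in range(cols):
        pivot = next((i for i in range(r, rows) if A[i][c] != 0), None)
        if pivot is None:
            continue                       # non-leading column
        A[r], A[pivot] = A[pivot], A[r]
        p = A[r][c]
        A[r] = [a / p for a in A[r]]       # scale the pivot to 1
        for i in range(rows):              # clear the rest of the column
            if i != r and A[i][c] != 0:
                f = A[i][c]
                A[i] = [a - f * b for a, b in zip(A[i], A[r])]
        r += 1
    return A

# The echelon matrix from Section 4.3 reduces to the solution directly:
R = rref([[1, 1, 1, 4], [0, 1, 4, 14], [0, 0, 11, 33]])
assert R == [[1, 0, 0, -1], [0, 1, 0, 2], [0, 0, 1, 3]]
```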
Example: Find the reduced row echelon form for

( 1  1  2  3 | 10 )
( 1  2  1  3 |  4 )
( 3  1  4 −2 | −5 )
Example: Suppose A is a 3 × 3 matrix. Describe all the possible types of reduced row echelon forms for A.
4.5 Deducing Solubility from Row Echelon Form The number of solutions, whether it be no solutions, infinitely many solutions or one (unique) solution, can be gleaned from the row echelon form without necessarily back-substituting. In many applications, (especially in MATH1231), we will be interested in whether a given system has solutions and if so whether the solution is unique or not, rather than in knowing precisely what the solutions are. Compare the following row echelon forms of systems of equations.
Example:

( 2  4 −1 | 3 )
( 0  5  5 | 2 )
( 0  0  2 | 7 )

We can see that we could solve for x3 uniquely, back-substitute to find x2 uniquely and then back-substitute again to find x1 uniquely. Hence this system has a unique solution. Observe that every column is a leading column.

Example:

( 2  4 −1 | 3 )
( 0  5  5 | 2 )
( 0  0  0 | 7 )

The bottom row says '0x3 = 7' which is clearly impossible, so this system has NO solution. Observe here that the right-hand column is a leading column.
Example:

( 2  4 −1 | 3 )
( 0  5  5 | 2 )
( 0  0  0 | 0 )

In this case the bottom row consists only of zeros, so this equation gives no information. From the middle row, we see that we must introduce one parameter λ for x3, and so x2 can then be determined in terms of λ and thus x1 can be determined in terms of λ. This system has infinitely many solutions. Observe that column 3 is a non-leading column, and that there will be 1 parameter in the solution.
Example:

( 2  4 −1  3 −2 | 1 )
( 0  5  5  2  1 | 6 )
( 0  0  0  0  5 | 5 )
( 0  0  0  0  0 | 0 )

In this example, the last line can be ignored, and the second last line gives x5 uniquely. Back-substituting into the second line gives an equation with three unknowns. Hence we will need to introduce two parameters x4 = λ and x3 = µ and solve for x2. Then x1 can be found in terms of these parameters. Observe in this case that there are two non-leading columns in the row-reduced form of the matrix.
Theorem: If the augmented matrix for a system of linear equations (A|b) is reduced to row-echelon form (U|y), then

1. The system has no solution if the right-hand column y is a leading column.

2. If the right-hand column y is not a leading column then the system has
• a unique solution if and only if every column of U is a leading column.
• infinitely many solutions if at least one column of U is a non-leading column. The number of parameters required to describe the solutions equals the number of non-leading columns.
Example: Discuss the number of solutions of:

( 2  4 −1  3 −2 | 1 )
( 0  5  5  2  1 | 6 )
( 0  0  0  0  5 | 5 )
( 0  0  0  0  0 | 6 )

( 2  4 −1  3 | −2 )
( 0  5  5  2 |  1 )
( 0  0  2  4 |  5 )
( 0  0  0  2 |  0 )

( 2  4 −1  3 | −2 )
( 0  5  5  2 |  1 )
( 0  0  0  2 |  5 )
( 0  0  0  0 |  0 )
4.6 Solving Ax = b for indeterminate b

Example: Let

A = ( −1  3  2 )
    (  3  1  4 )
    (  2  4  6 )

Find conditions on the numbers b1, b2, b3 such that if

b = ( b1 )
    ( b2 )
    ( b3 )

then Ax = b has at least one solution.
(cont.)
Example: Find conditions on λ such that the system

 x1 +   x2 + λx3 = 1
 x1 + 2λx2 +  x3 = 0
2x1 + 4λx2 + λx3 = −1

has a unique solution, no solution, infinitely many solutions.
(cont.)
4.7 General Properties of the Solution of Ax = b
A system of equations in which each right-hand side equals 0, i.e., Ax = 0, is called a homogeneous system. Clearly,
Theorem: A homogeneous system of equations always has at least one solution (viz x = 0), and has a unique solution if and only if every column in the row-echelon form is a leading column.
Example: The system
( −1 3 2 | 0 )
(  0 1 4 | 0 )
(  0 0 0 | 0 )
must have at least one solution, and in fact has infinitely many since there is one non-leading column.
Theorem: A homogeneous system of linear equations with more unknowns than equations always has infinitely many solutions.
Recall that we used the notation Ax to represent the system of linear expressions
a11x1 + a12x2 + · · · + a1nxn
a21x1 + a22x2 + · · · + a2nxn
 :
am1x1 + am2x2 + · · · + amnxn.
Theorem: If A is an m × n matrix, x and y are vectors with n components, and λ is a scalar, then
(a) A(x + y) = Ax + Ay.
(b) A(λx) = λAx.
Proof: (a) Write the components of x as x1, x2, . . . , xn and the components of y as y1, y2, . . . , yn. Then A(x + y) is the column vector whose ith entry is
ai1(x1 + y1) + ai2(x2 + y2) + · · · + ain(xn + yn)
= (ai1x1 + ai2x2 + · · · + ainxn) + (ai1y1 + ai2y2 + · · · + ainyn),
which is the ith entry of Ax plus the ith entry of Ay. Hence A(x + y) = Ax + Ay.
(b) is a similar calculation.
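A quick numeric check of part (a) (a Python illustration of our own; the matrix and vectors are arbitrary choices, not from the notes):

```python
def mat_vec(A, x):
    # b_i = a_i1 x_1 + ... + a_in x_n
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

A = [[1, 2, 3],
     [9, -2, 5]]
x = [1, 0, -1]
y = [2, 3, 1]

lhs = mat_vec(A, [xi + yi for xi, yi in zip(x, y)])          # A(x + y)
rhs = [s + t for s, t in zip(mat_vec(A, x), mat_vec(A, y))]  # Ax + Ay
print(lhs, rhs, lhs == rhs)   # [9, 21] [9, 21] True
```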
Corollary: A(x1 + x2 + · · ·+ xk ) = Ax1 + Ax2 + · · · + Axk . Theorem: If u and v are solutions of the homogeneous equation Ax = 0 then so are u + v and λu, for any λ ∈ R. Proof:
Theorem: If x1 and x2 are solutions to Ax = b then x2 = x1 + v, where v is a solution to the homogeneous equation Ax = 0. Proof:
Corollary: Ax = b has a unique solution if and only if Ax = 0 has a unique solution.
4.8 Applications 4.8.1 Geometry We can use matrix methods to solve problems involving vector geometry using the ideas in the previous chapter.
Example: Does ( 3, 0, 5, 6 )T belong to the span of ( 0, 1, −2, 4 )T and ( 1, 3, 2, 2 )T ?
(cont.)
Example: Is v = ( 3, 1, 4 )T parallel to the plane
x = ( 1, 2, 3 )T + λ( −1, 3, 2 )T + µ( 3, 1, 4 )T ?
(cont.)
Example: Find the intersection (if any) of the line
x = ( 3, 2, 1 )T + λ( 1, 2, −1 )T
and the plane
x = ( 1, −11, −5 )T + α( 3, 5, 1 )T + β( −1, −2, −7 )T.
(cont.)
Chapter 5 - Matrices
5.1 Matrix Arithmetic and Algebra
In our last topic we used matrices to solve systems of linear equations. A matrix was simply a block of numbers which, in that context, represented the coefficients of the unknowns in some system of equations. In this topic, we are going to look at the basic properties of matrices, regarded simply as blocks of numbers.
Definition: An m × n matrix A is simply a block of numbers written with m rows and n columns. If m = n we say that A is a square matrix. The set of all m × n matrices is denoted by Mmn. In most of our work, the numbers in question will be real, but later we will also deal with matrices whose entries are complex numbers.
A =
( a11  a12  · · ·  a1n )
( a21  a22  · · ·  a2n )
( a31  a32  · · ·  a3n )
(  :    :           :  )
( am1  am2  · · ·  amn )
We will use the notation aij to denote the general element in the matrix A (similarly bij for the general element in the matrix B, and so on). We will also use the notation (A)ij to denote the general element aij in the matrix A.
Example:
A =
( 1  2  3  4 )
( 9 −2  1  0 )
( 2  5  4  3 )
is a 3 × 4 matrix, and a23 = 1.
In Mmn there is a special matrix called the zero matrix, which has all its entries equal to 0. It is generally denoted by 0 (written by hand as 0 with a tilde underneath).
Example:
0 =
( 0 0 0 0 )
( 0 0 0 0 )
( 0 0 0 0 )
is the zero matrix in M34.
In Mnn, the set of all square matrices of size n, there is another special matrix called the identity matrix. It consists of 1's down the main diagonal, i.e., aii = 1, and zeros everywhere else. It is generally written as In and sometimes simply as I, where the context tells us the size.
Example:
I2 =
( 1 0 )
( 0 1 )
is the identity matrix in M22 and
I3 =
( 1 0 0 )
( 0 1 0 )
( 0 0 1 )
is the identity matrix in M33.
5.1.1 Equality, Addition and Multiplication by a Scalar Definition: We say that two matrices A and B are equal if they have the same size (i.e., the same number of rows and columns) and aij = bij for each i, j. In other words, their entries in identical positions are the same. We can only add or subtract matrices which have the same size, and we do this by simply adding or subtracting the entries in the same position. More formally, if A = (aij ) and B = (bij ) are matrices of the same size, then we define C = A + B, by C = (cij ) where cij = aij + bij . Similarly with subtraction.
Example: Find A + B and A − B if
A =
(  1  2  3 )
(  9 −2  5 )
( −1  3 −5 )
and B =
(  5  3  9 )
( −2  4 −6 )
(  8 −4  2 ).
Example: If
A =
( 1  2 )
( 4 −7 )
( 2  3 )
and B =
(  9 −2 )
(  4 −1 )
( −8  1 )
then A + B =
Note that matrix addition is commutative and associative, so A + B = B + A and (A + B) + C = A + (B + C) whenever the addition is defined. As with vectors (a vector is really just an n × 1 matrix anyway), we can multiply a matrix by a scalar in the obvious way. If A is a matrix and λ a scalar, then B = λA is the matrix with entries bij = λaij.
Example: If
A =
(  1  2  3 )
(  9 −2  5 )
( −1  3 −5 )
then 3A =
Note that for all matrices A, we have A + 0 = 0 + A = A and 0A = 0.
5.1.2 Matrix Multiplication
Before giving you the 'formal definition' of matrix multiplication, let us see if we can guess what a sensible definition might be, by looking at the following simple example. Suppose we have 3 houses H1, H2 and H3, each of which orders a certain number of bottles of milk, bottles of cream and newspapers. The milk costs $2 per bottle, the cream $3 per bottle and the newspaper $1. This information can be represented using matrices as follows:
      M  C  N             $
H1    1  2  3         M   2
H2    2  2  1         C   3
H3    5  4  3         N   1
To work out the total cost to each house, we would do the following. The cost to the first house will be 1 × 2 + 2 × 3 + 3 × 1 = 11; for the second house it is 2 × 2 + 2 × 3 + 1 × 1 = 11, and for the third 5 × 2 + 4 × 3 + 3 × 1 = 25.
This example suggests how we might define multiplication of an m × n matrix by an n × 1 matrix (i.e. a vector with n entries).
Example: Multiply
A =
(  1  2  3 )
(  9 −2  5 )
( −1  3 −5 )
by v = ( 1, 2, −3 )T.
The formal definition is:
Definition: If A = (aij) is an m × n matrix and x is an n × 1 matrix with entries xi, then we define the product of A and x to be the m × 1 matrix b whose entries are given by
bi = ai1x1 + ai2x2 + · · · + ainxn = Σ (k = 1 to n) aik xk,  for 1 ≤ i ≤ m.
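This definition transcribes directly into code. The sketch below (our own Python illustration, not part of the notes) applies it to the matrix A and vector v of the example above:

```python
# b_i = sum over k of a_ik x_k, written as explicit loops.
A = [[1, 2, 3],
     [9, -2, 5],
     [-1, 3, -5]]
x = [1, 2, -3]

m, n = len(A), len(A[0])
b = [0] * m
for i in range(m):
    for k in range(n):
        b[i] += A[i][k] * x[k]

print(b)   # [-4, -10, 20]
```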
Now to multiply two general matrices together, we can think of each column of the second matrix as a vector and simply apply the above ideas. Notice that for this to make sense we need the number of columns of the first matrix to equal the number of rows of the second. In other words, we can multiply an n × m matrix by an m × p matrix and the answer will be an n × p matrix.
Example: Multiply the matrices A and B where
A =
(  1  2  3 )
(  9 −2  5 )
( −1  3 −5 )
and B =
(  2 −3 )
(  5  9 )
( −2  4 ).
Example: Let
A =
(  2  3 )
( −2  5 )
(  3 −5 )
and B =
( 1  2 )
( 9 −2 ).
AB is defined but BA is not.
The formal definition of this procedure is:
Definition: Suppose A = (aij) is an m × n matrix and B = (bij) is an n × p matrix. Then we define the product C = AB to be the m × p matrix whose entries cij are given by
cij = ai1b1j + ai2b2j + · · · + ainbnj = Σ (k = 1 to n) aik bkj,
for 1 ≤ i ≤ m and 1 ≤ j ≤ p.
Example: Find AB and BA if
A =
(  9 −2 )
( −1  3 )
and B =
(  2  3 )
( −2  5 ).
As you see from the above example, in general AB does not equal BA. This is an extremely important fact to keep in mind. In general, matrix multiplication is NOT commutative.
Note that we write An for A × A × · · · × A (n factors).
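Carrying out the multiplications in the example above (a small Python sketch of our own) makes the non-commutativity explicit:

```python
def mul2(A, B):
    # 2 x 2 matrix product, written out entry by entry.
    return [[A[0][0]*B[0][0] + A[0][1]*B[1][0], A[0][0]*B[0][1] + A[0][1]*B[1][1]],
            [A[1][0]*B[0][0] + A[1][1]*B[1][0], A[1][0]*B[0][1] + A[1][1]*B[1][1]]]

A = [[9, -2], [-1, 3]]
B = [[2, 3], [-2, 5]]
print(mul2(A, B))   # [[22, 17], [-8, 12]]
print(mul2(B, A))   # [[15, 5], [-23, 19]]  -- not equal, so AB != BA
```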
Example: Find AI and IA if
A =
(  2 3 )
( −2 5 )
and I =
( 1 0 )
( 0 1 ).
From the above example, you might surmise that:
Theorem: If A is any n × n matrix, and I is the identity matrix of size n, then AI = IA = A. Also In = I for any positive integer n.
Hence the identity matrix I behaves in the set of matrices just as the number 1 does in ordinary arithmetic.
Here is a list of some of the basic properties of matrix multiplication. Take careful note of exactly what you are allowed to do with matrices. Theorem: If A, B and C are matrices for which the necessary sums and products are defined and λ is a scalar, then
(i) A(λB) = λAB
(ii) (AB)C = A(BC). (This is known as the associative law).
(iii) (A+B)C = AC +BC and C(A+B) = CA+ CB. Note that these are NOT in general equal to each other. (This is known as the distributive law).
Idempotent Matrices: A matrix A is said to be idempotent if A2 = A. Example: Suppose A is idempotent. Prove that B = I − A is idempotent and that AB = BA.
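A concrete check of the result (our own Python illustration; the projection matrix chosen here is an arbitrary idempotent example, not from the notes):

```python
def mul2(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

A = [[1, 0], [0, 0]]          # a projection: A^2 = A, so A is idempotent
I = [[1, 0], [0, 1]]
B = [[I[i][j] - A[i][j] for j in range(2)] for i in range(2)]   # B = I - A

print(mul2(A, A) == A)              # True
print(mul2(B, B) == B)              # True: B is idempotent too
print(mul2(A, B) == mul2(B, A))     # True: in fact both products are zero
```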
5.2 The Transpose of a Matrix One other operation on matrices is the transpose. To take the transpose of a matrix, we interchange the rows and columns.
Example: If
B =
(  1  2 )
(  9 −2 )
( −1  3 )
then
BT =
( 1  9 −1 )
( 2 −2  3 ).
More formally, (AT)ij = aji. Thus (AT)T = A. (An operation which when composed with itself gives the identity function is called an involution. Complex conjugation is another example of an involution.)
Note that the transpose of a column vector is a row vector. If we take two (column) vectors a, b of the same size, then the quantity aT b is a 1 × 1 matrix. We often write this matrix as a single number (i.e. a scalar) and identify it as the scalar or dot product of a and b. Note however that if the vectors have length n, then a bT is an n × n matrix.
Example: Find aT b and a bT for
a = ( 1, 0, −1 )T and b = ( 2, 3, −1 )T.
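The contrast between the two products is easy to see in code (our own sketch, using the vectors of the example above):

```python
a = [1, 0, -1]
b = [2, 3, -1]

aTb = sum(ai * bi for ai, bi in zip(a, b))     # a^T b: a scalar (dot product)
abT = [[ai * bj for bj in b] for ai in a]      # a b^T: an n x n matrix

print(aTb)   # 3
print(abT)   # [[2, 3, -1], [0, 0, 0], [-2, -3, 1]]
```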
The transpose has the following properties, which should be carefully noted.
Theorem: If A and B are matrices such that A + B and AB are defined, then
(i) (AT)T = A
(ii) (A + B)T = AT + BT
(iii) (λA)T = λAT
(iv) (AB)T = BT AT.
Take particular note of result (iv).
Example: Simplify AT (CBA)T C.
Definition: A real matrix A is said to be symmetric if AT = A. Note that a symmetric matrix must necessarily be square. Symmetric matrices have remarkable properties which are studied in second year (MATH 2501).
Definition: A matrix A is said to be skew-symmetric if AT = −A. You can immediately see that the entries on the main diagonal of a skew-symmetric matrix are zero.
5.3 The Inverse of a Matrix
Given a square matrix A, can we find a matrix B such that AB = I, the identity matrix? In general the answer is NO, and later we shall find a way of telling exactly when such a matrix exists. For the moment, suppose we can find such a B. We call B the inverse of A and write B = A−1. We read A−1 as 'A inverse'. A matrix A which has an inverse is called invertible.
For a 2 × 2 matrix there is a very useful formula (which is worth learning) for the inverse:
( a b )−1
( c d )    =  1/(ad − bc) · (  d  −b )
                            ( −c   a ).
This formula only works if ad − bc ≠ 0. This quantity is called the determinant of the matrix. You can check the formula by simply multiplying.
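As an illustration (our own Python sketch, with a matrix of our own choosing), the formula can be wrapped in a function that also handles the ad − bc = 0 case:

```python
from fractions import Fraction

def inverse_2x2(M):
    (a, b), (c, d) = M
    det = a * d - b * c
    if det == 0:
        return None                       # ad - bc = 0: no inverse exists
    f = Fraction(1, det)
    return [[ d * f, -b * f],
            [-c * f,  a * f]]

M = [[1, 2], [3, 4]]                      # an arbitrary example; det = -2
Minv = inverse_2x2(M)
# Check that M * Minv = I:
prod = [[sum(M[i][k] * Minv[k][j] for k in range(2)) for j in range(2)]
        for i in range(2)]
print(prod == [[1, 0], [0, 1]])           # True
```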
Example: Find the inverse of
(  3 1 )
( −2 5 ).
For matrices of higher order, there is no simple formula for the inverse. We will develop an algorithm for finding the inverse if it exists.
5.3.1 Calculating the Inverse of a Matrix
Suppose we have a 3 × 3 invertible matrix A whose inverse X we seek. We know that
AX = I =
( 1 0 0 )
( 0 1 0 )
( 0 0 1 ).
Suppose we write X as ( x1 x2 x3 ), where x1 etc. are the columns of X. Thus we have
A ( x1 x2 x3 ) =
( 1 0 0 )
( 0 1 0 )
( 0 0 1 ).
We can think of this equation as three sets of equations, namely
Ax1 = ( 1, 0, 0 )T,  Ax2 = ( 0, 1, 0 )T  and  Ax3 = ( 0, 0, 1 )T.
To solve each of these equations, we would take A and augment it with each of the three right hand sides in turn and row reduce. If we completely row reduce to reduced echelon form, then the solutions would appear on the right hand side of each reduction. To save time we could do all three reductions in one go, by augmenting the matrix A with the identity matrix and completely row reducing. This is the algorithm for finding the inverse of A.
Example: Find the inverse of
A =
( 1 2 3 )
( 2 5 3 )
( 1 0 8 ).
Check your answer.
(cont.)
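The augment-and-row-reduce algorithm described above can be sketched in Python with exact arithmetic (our own code, not from the notes); run on the worked example above, it lets you check your answer:

```python
from fractions import Fraction

def inverse(A):
    """Row-reduce (A | I) to (I | A^-1); returns None if A is not invertible."""
    n = len(A)
    # Augment A with the identity matrix.
    M = [[Fraction(A[i][j]) for j in range(n)] +
         [Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    for col in range(n):
        # Find a pivot row for this column and swap it into place.
        pivot = next((r for r in range(col, n) if M[r][col] != 0), None)
        if pivot is None:
            return None                    # no pivot: A is not invertible
        M[col], M[pivot] = M[pivot], M[col]
        # Scale the pivot row, then clear the rest of the column.
        M[col] = [x / M[col][col] for x in M[col]]
        for r in range(n):
            if r != col and M[r][col] != 0:
                M[r] = [x - M[r][col] * y for x, y in zip(M[r], M[col])]
    return [row[n:] for row in M]          # the right-hand block is A^-1

inv = inverse([[1, 2, 3], [2, 5, 3], [1, 0, 8]])
print([[int(x) for x in row] for row in inv])
# [[-40, 16, 9], [13, -5, -3], [5, -2, -1]]
```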
Example: Does the matrix
A =
(  1 6  4 )
(  2 4 −1 )
( −1 2  5 )
have an inverse?
(cont.)
Challenge Problem: Consider a general n × n matrix which has 1's in every entry on and above the main diagonal and zeros elsewhere. Discover a formula for the inverse of each such matrix. For example
( 1 1 )−1   ( 1 −1 )
( 0 1 )   = ( 0  1 )
and
( 1 1 1 )−1   ( 1 −1  0 )
( 0 1 1 )   = ( 0  1 −1 )
( 0 0 1 )     ( 0  0  1 ).
(Get MAPLE to assist you in finding the pattern.)
5.3.2 Some Useful Properties of Inverses
Suppose that A and B are invertible matrices of the same size. Then
(i) (A−1)−1 = A.
(ii) (AB)−1 = B−1A−1.
Note the second result. It is very similar to the rule we had for transposes. Observe also that it generalises: (ABC)−1 = C−1B−1A−1, assuming that all the matrices have the same size and are invertible.
To see why (ii) is true, we simply note that (AB)(B−1A−1) = A(BB−1)A−1 = AIA−1 = AA−1 = I. Hence the inverse of AB is B−1A−1.
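A numeric check of (ii), and of the fact that the order of the factors matters (our own Python sketch; the matrices are arbitrary invertible choices, not from the notes):

```python
from fractions import Fraction

def mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def inv(M):
    # The 2 x 2 inverse formula from earlier in this chapter.
    (a, b), (c, d) = M
    f = Fraction(1, a * d - b * c)
    return [[d * f, -b * f], [-c * f, a * f]]

A = [[1, 2], [3, 5]]
B = [[2, 1], [1, 1]]
print(inv(mul(A, B)) == mul(inv(B), inv(A)))   # True
print(inv(mul(A, B)) == mul(inv(A), inv(B)))   # False: order matters
```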
Example: Simplify HG(F HG)−1F G.
Challenge Problem: Suppose A and B are n × n square matrices and that A is invertible. Prove that (A + B)A−1(A − B) = (A − B)A−1(A + B).
5.3.3 Inverses and solution of Ax = b
We can use inverses to solve equations. For example, solve
(  2 4 ) x = (  2 )
( −1 7 )     ( −3 ).
Hence, if we are solving Ax = b and A is invertible, we can write x = A−1b. We can then state the following result.
Theorem: If A is a square matrix then Ax = b has a unique solution if and only if A is invertible.
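The method x = A−1b can be carried out exactly in a few lines (our own Python sketch, applied to the 2 × 2 system above):

```python
from fractions import Fraction

A = [[2, 4], [-1, 7]]
b = [2, -3]

det = Fraction(A[0][0] * A[1][1] - A[0][1] * A[1][0])   # 18, so A is invertible
Ainv = [[ A[1][1] / det, -A[0][1] / det],
        [-A[1][0] / det,  A[0][0] / det]]
x = [sum(Ainv[i][k] * b[k] for k in range(2)) for i in range(2)]
print(x)   # [Fraction(13, 9), Fraction(-2, 9)]
```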
5.4 Determinants
Note: Determinants exist only for square matrices!
We have seen that the inverse of the general 2 × 2 matrix ( a b ; c d ) exists if and only if ad − bc ≠ 0. This quantity is called the determinant of the matrix. For the matrix A = ( a b ; c d ), we write its determinant as
det(A) = ad − bc, or
| a b |
| c d |  =  ad − bc.
There is a similar, but more complicated, formula for the determinant of 3 × 3 and larger matrices. There are various ways to define the determinant, but we take a purely computational approach and develop a method of calculating the determinant for a general matrix.
5.4.1 The Definition of a Determinant Definition: Suppose A is a square matrix. We define |Aij |, called the ijth minor of A, to be the determinant of the matrix obtained by deleting the ith row and jth column of A.
For example, if
A =
(  1  4 5 )
( −1  2 7 )
(  2 −3 4 )
then
|A23| = | 1  4 |
        | 2 −3 |  =  −11.
We can now define the determinant recursively as follows:
Definition: The determinant of an n × n matrix is
|A| = a11|A11| − a12|A12| + a13|A13| − · · · + (−1)^(n−1) a1n|A1n|.
This formula is sometimes referred to as expanding along the top row. Essentially, it says that we multiply each entry in the top row by the determinant of the matrix obtained by deleting the row and column that the entry belongs to and then we take an alternating sum and difference of all these quantities.
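Because the definition is recursive, it transcribes naturally into a recursive function. The following is our own Python sketch (the example matrix is an arbitrary choice, not from the notes):

```python
def det(A):
    """Determinant by expansion along the top row, as in the definition."""
    n = len(A)
    if n == 1:
        return A[0][0]
    total = 0
    for j in range(n):
        minor = [row[:j] + row[j + 1:] for row in A[1:]]   # delete row 1, col j
        total += (-1) ** j * A[0][j] * det(minor)          # alternating signs
    return total

print(det([[2, 1, 3],
           [0, 4, 1],
           [5, 2, 0]]))    # -59
```

Note that this direct expansion takes on the order of n! steps, which is why the more efficient method developed below matters for larger matrices.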
Example: Find the determinant of
A =
(  1  4 5 )
( −1  2 7 )
(  2 −3 4 ).
(cont.)
For 4×4 matrices (and higher orders), we have to apply this definition iteratively. There is, however, a more efficient method of finding the determinant which we shall see later.
5.4.2 Properties of Determinants
Suppose A and B are n × n matrices. Then
(i) |AB| = |A||B|.
(ii) |A| = |AT|.
(iii) |I| = 1.
(iv) |A−1| = 1/|A|, assuming that |A| ≠ 0.
Proof of (iv): AA−1 = I so |AA−1| = |I|. Now using (i) and (iii) we see that this is equal to |A||A−1| = 1 and the result follows.
In the light of property (ii), we can see that we can change the definition of the determinant slightly and expand down the first column instead of across the first row. This can be very useful computationally; for example, the determinant
| 1  4 5 |
| 0  2 7 |  =  1 × |  2 7 |
| 0 −3 4 |         | −3 4 |  =  29,
since if we expand down the first column, the other minors are multiplied by zero.
Example: Suppose |A| = 3, |B| = −2, and A, B are square matrices of the same size. Simplify |A2B T (B −1)2A−1B|.
5.4.3 The Efficient Numerical Evaluation of Determinants
(i) |A| = 0 if and only if A is NOT invertible.
(ii) If A has a row or column of zeros then |A| = 0.
(iii) If A has two rows or columns which are multiples of each other then |A| = 0.
(iv) If a row or column of A is multiplied by a scalar λ then the determinant of A is multiplied by λ. Note that if A is an n × n matrix, then |λA| = λ^n |A|.
(v) If we swap two rows or two columns of A then the determinant of the resulting matrix is −1 times |A|.
(vi) If U is in row echelon form, then |U | is simply the product of the diagonal entries.
(vii) If U is the row-echelon form of A, which was achieved using only row swaps and row subtractions, then |A| = ε|U|, where ε = 1 if there was an even number of row swaps and ε = −1 if there was an odd number of row swaps.
For some proofs of these results, see the Algebra Course Notes (yellow book). From these results, we can find an efficient method to calculate the determinant of any square matrix.
We take the matrix A and reduce it to row-echelon form U, making sure that we do NOT make any row swaps and do not divide any row by a common factor of entries; then |A| = |U| and the latter is simply the product of the diagonal entries. If it is necessary (or convenient) to make a row swap or to divide a row by a common factor, then we can take that into account using the above rules.
Example: Find
| 1 2 3 |
| 2 5 3 |
| 1 0 8 |.
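This row-reduction procedure can be automated; the sketch below is our own Python illustration (exact arithmetic, row swaps tracked via a sign), and it reproduces the determinant of the example above:

```python
from fractions import Fraction

def det(A):
    """Determinant via reduction to row-echelon form, tracking row swaps."""
    M = [[Fraction(x) for x in row] for row in A]
    n, sign = len(M), 1
    for col in range(n):
        pivot = next((r for r in range(col, n) if M[r][col] != 0), None)
        if pivot is None:
            return 0                       # a zero column: determinant is 0
        if pivot != col:
            M[col], M[pivot] = M[pivot], M[col]
            sign = -sign                   # each swap multiplies |A| by -1
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            M[r] = [x - factor * y for x, y in zip(M[r], M[col])]
    prod = sign
    for i in range(n):
        prod *= M[i][i]                    # product of the diagonal of U
    return prod

print(det([[1, 2, 3], [2, 5, 3], [1, 0, 8]]))   # -1
```

Unlike the cofactor expansion, this takes on the order of n^3 arithmetic operations.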
Example: Find
| 1 −1  0  3 |
| 2 −2  6 −1 |
| 4 −2  1  7 |
| 3  5 −7  2 |.
Example: Find
∆ = | 1    1    1    1   |
    | a    b    c    d   |
    | a^2  b^2  c^2  d^2 |
    | a^3  b^3  c^3  d^3 |.
Regard ∆ as a function of a; then ∆ = c1a^3 + c2a^2 + c3a + c4.
(cont.)
This generalises and is known as the Vandermonde determinant, ∆(a, b, c, d).
Challenge Problem: Show that
| 1    1    1    1   |
| a    b    c    d   |
| a^2  b^2  c^2  d^2 |
| a^4  b^4  c^4  d^4 |  =  ∆(a, b, c, d)(a + b + c + d).
Challenge Problem: Prove that
| a+b   a    a   · · ·   a  |
|  a   a+b   a   · · ·   a  |
|  :                     :  |
|  a    a    a   · · ·  a+b |  =  b^(n−1)(na + b),
where n is the size of the matrix.
5.4.4 Determinants and Solutions of Ax = b
We finish this section by putting together a number of ideas. If we are looking at the solution of Ax = b, where A is a square matrix, then we can see that if A is invertible, we could multiply both sides by A−1 and obtain the unique solution x = A−1b. Thus, we can conclude:
Theorem: Suppose A is a square matrix. Then Ax = b has a unique solution if and only if A is invertible, which is true if and only if det(A) ≠ 0. If A is not invertible, i.e. if det(A) = 0, then the system may have no solution or infinitely many solutions.
Example: Without solving, determine whether or not Ax = b has a unique solution if
A =
( 1 −1 3 )
( 2  3 1 )
( 3  1 4 ).