Pietro-Luciano Buono Advanced Calculus De Gruyter Graduate
Also of interest
Functional Analysis. A Terse Introduction. Gerard Chacón, Humberto Rafeiro, Juan Camilo Vallejo, 2016. ISBN 978-3-11-044191-8, e-ISBN (PDF) 978-3-11-044192-5, e-ISBN (EPUB) 978-3-11-043364-7
Introduction to Topology. Min Yan, 2016. ISBN 978-3-11-037815-3, e-ISBN (PDF) 978-3-11-041302-1, e-ISBN (EPUB) 978-3-11-037816-0
Tensors and Riemannian Geometry. With Applications to Differential Equations. Nail H. Ibragimov, 2015. ISBN 978-3-11-037949-5, e-ISBN (PDF) 978-3-11-037950-1, e-ISBN (EPUB) 978-3-11-037964-8
Multivariable Calculus and Differential Geometry. Gerard Walschap, 2015. ISBN 978-3-11-036949-6, e-ISBN (PDF) 978-3-11-036954-0
Elements of Partial Differential Equations. Pavel Drábek, Gabriela Holubová, 2014. ISBN 978-3-11-031665-0, e-ISBN (PDF) 978-3-11-031667-4, e-ISBN (EPUB) 978-3-11-037404-9
Pietro-Luciano Buono
Advanced Calculus
Differential Calculus and Stokesʼ Theorem
Mathematics Subject Classification 2010 35-02, 65-02, 65C30, 65C05, 65N35, 65N75, 65N80 Author Prof. Pietro-Luciano Buono University of Ontario Institute of Technology 2000 Simcoe St North Oshawa ON L1H 7K4 Canada
[email protected]
ISBN 978-3-11-043821-5 e-ISBN (PDF) 978-3-11-043822-2 e-ISBN (EPUB) 978-3-11-042911-4 Library of Congress Cataloging-in-Publication Data A CIP catalog record for this book has been applied for at the Library of Congress. Bibliografische Information der Deutschen Nationalbibliothek The Deutsche Nationalbibliothek lists this publication in the Deutsche Nationalbibliografie; detailed bibliographic data are available on the Internet at http://dnb.dnb.de.
© 2016 Walter de Gruyter GmbH, Berlin/Boston Cover image: Pietro-Luciano Buono via Asymptote: The Vector Graphics Language Printing and binding: CPI books GmbH, Leck ♾ Printed on acid-free paper Printed in Germany www.degruyter.com
To Isabelle, for her love, her support and her patience.
Contents
Preface
1 Introduction
1.1 Review of Set Theory
1.2 Review of Linear Algebra
1.3 Coordinate systems
1.4 Functions and Mappings: including partial derivatives
1.5 Parametric representation of curves
1.6 Quadrics
2 Calculus of Vector Functions
2.1 Derivatives and Integrals
2.2 Best Linear Approximation and Tangent Lines
2.3 Reparametrizations and arc-length parameter
3 Tangent Spaces and 1-forms
3.1 Tangent spaces
3.2 Differentials
3.3 1-forms
4 Line Integrals
4.1 Integration of 1-forms
4.2 Arc-length, Metrics and Applications
4.3 Line integrals of vector fields
5 Differential Calculus of Mappings
5.1 Graphs and Level Sets
5.2 Limits and Continuity
5.3 Best Linear Approximation and Derivatives
5.4 Tangent spaces
5.5 The Chain Rule
5.6 Higher Derivatives
5.7 Taylor expansions
6 Applications of Differential Calculus
6.1 Optimization
6.2 Parametrizations
6.3 Differential Operators
6.4 Application of Clairault’s theorem to 1-forms
7 Double and Triple Integrals
7.1 Area and Volume Forms
7.2 Double integrals
7.3 Green’s Theorem
7.4 Three-dimensional domains
8 Wedge Products and Exterior Derivatives
8.1 More on Wedge Products
8.2 Differential Forms
8.3 Exterior Derivative
9 Integration of Forms
9.1 Pullbacks of k-forms: k = 1, 2, 3
9.2 Integrals of Forms: change of variables formula
9.3 Integrals on a surface
9.4 Orientation of Surfaces
9.5 General Pullback Formula
10 Stokes’ Theorem and Applications
10.1 More on orientation of curves and surfaces
10.2 Stokes’ Theorem
10.3 Stokes’s Theorem for Vector Fields
Bibliography
Index
Preface This book is an outgrowth of the notes I have been using to teach a one semester Calculus III course at the University of Ontario Institute of Technology since 2012. It is intended for students who have already completed at least one semester of Elementary Linear Algebra and two semester long courses in Calculus. The approach taken in this book is to take full advantage of Linear Algebra in order to present the Calculus concepts in as much generality as possible. Because of this bias towards using Linear Algebra, I decided also to go one step further (from many other books) and introduce the concept of tangent space early in the text from which it is possible to define properly the differential of a function and from there, differential forms and pullbacks in the context of line integrals. In the following chapters, those are generalized just enough to provide a unified treatment of integration and the generalized Stokes’ theorem in R3 (Green, Classical Stokes and Divergence). Therefore, this book can also serve as a gentle introduction to the theory of differential forms and prepare the reader to delve into more advanced topics from differential geometry and mathematical physics. The book begins with an introductory chapter on basic topics which are recurrent in the remainder of the book. Several of those topics may already be familiar to some readers. In order to provide an incremental progression in the blending of Linear Algebra and Calculus, sprinkled with differential forms theory, the next three chapters discuss almost exclusively vector functions of one variable and culminate with the Fundamental Theorem of Line Integrals. Chapter 3 is the cornerstone for the whole book and it uses tangent vectors to vector functions to introduce tangent spaces to curves, Rn and to surfaces. It is then possible to define differentials and 1-forms as acting on tangent vectors. 
The second part of the book is made up of Chapters 5 and 6 and focuses on differentiable mappings from $\mathbb{R}^n$ to $\mathbb{R}^m$. Chapters 7 through 9 introduce, in a blended way, additional concepts of differential form theory along with the theory of multiple integrals. Finally, Chapter 10 puts the results from the previous chapters together in the statement and proof of Stokes’ theorem (Green, Classical and Divergence) using differential forms and exterior derivatives. The statement is also rewritten in terms of the classical differential operators. One of the advantages I see in the unconventional ordering of topics adopted here is that it is now possible to start introducing terminology in the context of curves (i.e. one-dimensional geometrical objects), which are typically easier to understand and for which the calculations do not require the notational machinery needed with more variables. After a discussion of mappings and especially the introduction of the Jacobian, it is then possible to extend the differential form concepts to higher dimensions. At the same time, this allows for a repetition of the new concepts and terminology, which is typically beneficial for learning. Another learning goal that this text attempts to achieve is for the reader to start distinguishing between the
definition of mathematical concepts, the geometric content of the definition and the computational formulae which are useful in most problem solving. It also provides an algorithmic presentation of some computations which I hope will make their usage more straightforward; one such example is the determination of the arc-length parametrization of a curve. The content of this book has seen several versions since 2012 and I would like to thank all my students who have ploughed through those various versions. I am grateful to all of those who provided feedback on the presentation and noticed mistakes, typos, etc. I would like to thank Eryn Frawley (Calculus III, 2013) who assisted me by producing a great number of figures using Tikz, writing solutions to many problems and proofreading chapters. I am also indebted to my teaching assistants, and especially Jamil Jabbour, who have read major portions of the material and challenged me on the presentation in several places. All figures were done using Tikz (https://sourceforge.net/projects/pgf/) and Asymptote (http://asymptote.sourceforge.net). I would like to thank all the users of those software packages for posting examples and code snippets. In particular, I am indebted to the gallery maintained at http://asy.marris.fr/asymptote/index.html. I hope to be able to give back soon to the community by making some of the codes for the figures of this book available online. For any inquiries about this textbook, errors or typos found, etc., please contact me at:
[email protected]. Luciano Buono Oshawa, June 2016.
1 Introduction

This chapter gives an overview of several of the topics necessary for the remainder of the book. The first sections on Set Theory and Linear Algebra are review sections.

1.1 Review of Set Theory

We begin with a quick review of basic concepts from set theory with a focus on the real numbers $\mathbb{R}$. A set of real numbers is an unordered collection of real numbers. One denotes sets inside braces in enumerative style as follows
\[ A = \{-1, 3, \pi, 1/\sqrt{2}\}. \]
The symbol $\in$ means “element of” and denotes the belonging of an element to a set. We can also describe a set using a defining condition
\[ B = \{x \in \mathbb{R} \mid x > 2\} \]
which is read as: $x$ is the placeholder for elements of $\mathbb{R}$ such that $x$ is greater than $2$.

In the defining condition notation, to verify whether a number belongs to a given set one has to check if the condition is satisfied. Is $-3 \in B$? Let $x = -3$; then $-3 > 2$ is false. Therefore, $-3$ is not an element of $B$, which is denoted $-3 \notin B$.

Fig. 1.1. In bold, the set $B$.

Sets can have a finite number of elements, as $A$ does, or an infinite number of elements, as $B$ does. An important type of set on the real line is the interval, defined as follows. Let $a, b \in \mathbb{R}$ with $a < b$; then
\begin{align*}
[a, b] &= \{x \in \mathbb{R} \mid a \le x \le b\}, & (a, b) &= \{x \in \mathbb{R} \mid a < x < b\}, \\
(a, b] &= \{x \in \mathbb{R} \mid a < x \le b\}, & [a, b) &= \{x \in \mathbb{R} \mid a \le x < b\}.
\end{align*}
If $a = -\infty$ or $b = \infty$ then we always use “$(a$” or “$b)$”.

Fig. 1.2. From left to right, the intervals $[a, b]$, $(a, b)$ and $(a, b]$.
A set $E$ is a subset of a set $F$ if every element of $E$ is also an element of $F$. We then write $E \subset F$. Another notation is $E \subseteq F$, which allows for $E$ and $F$ to have exactly the same elements, that is, $E = F$.

Example 1.1.1. Let
\[ E = \{x \in \mathbb{R} \mid x = 4n \text{ with } n \in \mathbb{N}\} \]
and $F$ be the set of even integers. Is $E \subset F$? To see this, one has to make sure that every element of $E$ is an even integer. But $E$ contains an infinite number of elements, and so we use the defining condition instead. At this point, if one believes that $E \subset F$, then we can proceed to verify it properly as follows: let $x = 4n$ for an arbitrary natural number $n \in \mathbb{N}$; then $x = 2(2n)$ is divisible by $2$ and so $x$ is an even integer. Because $x$ can be chosen to be any element of $E$, this means $E \subset F$.
Example 1.1.2. Let
\[ E = \{x \in \mathbb{R} \mid x = p/q,\ p, q \in \mathbb{Z},\ p, q \le 10\} \]
and $F$ be the interval $(0, 1)$. Is $E \subset F$? In this case, one can check that for $p = 2 \in \mathbb{Z}$ and $q = 1 \in \mathbb{Z}$, we have $p, q \le 10$ and so $x = 2/1 \in E$. But $x = 2 > 1$ and so $x \notin F$. Thus, $E \not\subset F$.

The main operations on sets are the union, the intersection and the complement. Let $A \subseteq \mathbb{R}$ and $B \subseteq \mathbb{R}$ be two sets. Then the union of $A$ and $B$ is
\[ A \cup B := \{x \in \mathbb{R} \mid x \in A \text{ or } x \in B\}. \]
The intersection of $A$ and $B$ is
\[ A \cap B := \{x \in \mathbb{R} \mid x \in A \text{ and } x \in B\}. \]
The complement of $A$ is
\[ A^c := \{x \in \mathbb{R} \mid x \notin A\}. \]
In particular, one can check that for any two sets $A$ and $B$:
\[ A \cap B \subset A \cup B. \]
One is often familiar with these operations in the context of Venn diagrams. Because sets are often given using defining conditions, the union and intersection operations are given by adding “or” for unions and “and” for intersections to the defining conditions.

Example 1.1.3. Let $A = \{x \in \mathbb{R} \mid x \text{ is an even integer}\}$ and $B = (-3, 5)$. Then,
\begin{align*}
A \cup B &= \{x \in \mathbb{R} \mid x \text{ is an even integer or } -3 < x < 5\}, \\
A \cap B &= \{x \in \mathbb{R} \mid x \text{ is an even integer and } -3 < x < 5\}.
\end{align*}
For the complement, one writes for instance
\[ A^c = \{x \in \mathbb{R} \mid x \notin A\} = \{x \in \mathbb{R} \mid x \text{ is not an even integer}\}. \]
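The defining-condition notation translates almost directly into set comprehensions. The following is a minimal sketch, assuming a finite window of integers (`SAMPLE`, a hypothetical name) stands in for the real line and that the natural numbers start at 1; it illustrates the subset test of Example 1.1.1 and the counterexample of Example 1.1.2, and it is not a proof.

```python
from fractions import Fraction

# Assumption: a finite window of integers stands in for the real line,
# since a program can only enumerate finitely many candidates.
SAMPLE = range(-50, 51)

def in_E(x):
    """Defining condition of E in Example 1.1.1: x = 4n with n a natural number."""
    return x > 0 and x % 4 == 0

def in_F(x):
    """Defining condition of F in Example 1.1.1: x is an even integer."""
    return x % 2 == 0

E = {x for x in SAMPLE if in_E(x)}
F = {x for x in SAMPLE if in_F(x)}

# Subset test on the sample: every sampled element of E is also in F.
print(E <= F)

# The intersection is always contained in the union.
print((E & F) <= (E | F))

# Example 1.1.2: x = 2/1 satisfies the condition for E but lies outside (0, 1).
x = Fraction(2, 1)
print(0 < x < 1)
```

Checking a finite sample can only refute a subset claim, never establish it; the argument with an arbitrary $n$ in Example 1.1.1 is what settles the general case.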
Fig. 1.3. Venn diagram of three sets $A$, $B$ and $C$.
Exercises

(1) Write the following descriptions of sets using defining conditions.
  (a) Let $E$ be the set of elements of $\mathbb{R}$ that are squares of integers.
  (b) Let $F \subset \mathbb{R}$ be the set of points with distance less than $1$ from $\pi$.
  (c) Let $A$ be the set of integers with absolute value greater than $2\pi$.
  (d) Let $B$ be the set of rational numbers with denominator less than $30$.
(2) Determine the union and intersection of the sets $A$, $B$, $E$, $F$ of Exercise 1.
(3) For each problem, determine whether $E \subset F$ or not. Explain.
  (a) Let $E = \{x \in \mathbb{R} \mid |x - 3| < 2\}$ and $F = \{x \in \mathbb{R} \mid 2 < x < 4\}$.
  (b) Let $E = \{x \in \mathbb{R} \mid x = \cos(n\pi), \text{ for any } n \in \mathbb{Z}\}$ and $F = \{x \in \mathbb{R} \mid -1 \le x \le 1\}$.
  (c) Let $E = \{x \in \mathbb{R} \mid |1 - x| > 0\}$ and $F = \{x \in \mathbb{R} \mid |x - 1| < 0\}$.
1.2 Review of Linear Algebra

This review of linear algebra focuses on the definitions and results revolving around the concept of vector space and on geometrical aspects of linear algebra. For a review of the solution of linear systems using row reduction, matrices, inverses of matrices, and other topics, the reader can consult their favourite textbook on linear algebra. One, two and three dimensional Euclidean spaces are the line, the plane and the ambient space familiar to us. They are represented mathematically as $\mathbb{R}$, $\mathbb{R}^2$ and $\mathbb{R}^3$. The space $\mathbb{R}^n$ is defined as the set of all collections of $n$ real numbers written as $(x_1, x_2, \dots, x_n)$.

For $n = 2$ and $n = 3$ we have
\[ (x_1, x_2), \qquad (x_1, x_2, x_3) \]
where $x_1, x_2, x_3$ are any real numbers. Euclidean spaces are vector spaces; that is, all elements of $\mathbb{R}^n$ satisfy the following two conditions.

(1) For any two elements $(x_1, x_2, \dots, x_n)$ and $(y_1, y_2, \dots, y_n)$ in $\mathbb{R}^n$, the sum
\[ (x_1, x_2, \dots, x_n) + (y_1, y_2, \dots, y_n) = (x_1 + y_1, x_2 + y_2, \dots, x_n + y_n) \]
is also in $\mathbb{R}^n$.
(2) For $a \in \mathbb{R}$ and $(x_1, x_2, \dots, x_n) \in \mathbb{R}^n$,
\[ a(x_1, x_2, \dots, x_n) = (a x_1, \dots, a x_n) \in \mathbb{R}^n. \]
In general, let $V \subset \mathbb{R}^n$; then $V$ is a vector space (or a vector subspace of $\mathbb{R}^n$) if for all elements $v_1, v_2 \in V$ and $a \in \mathbb{R}$ the following two properties are satisfied:
(1) $v_1 + v_2 \in V$,
(2) $a v_1 \in V$.
If a vector space $W$ is a subset of a vector space $V$, we say that $W$ is a vector subspace of $V$.

Example 1.2.1. The subset
\[ V = \{(x_1, x_2, x_3) \in \mathbb{R}^3 \mid x_3 = 2x_1 + x_2\} \subset \mathbb{R}^3 \]
is a vector subspace. We need to check the two conditions. Two general elements of $V$ must satisfy the defining condition. We write
\[ v_1 = (a_1, a_2, 2a_1 + a_2) \quad\text{and}\quad v_2 = (b_1, b_2, 2b_1 + b_2). \]
Then,
\begin{align*}
v_1 + v_2 &= (a_1, a_2, 2a_1 + a_2) + (b_1, b_2, 2b_1 + b_2) \\
&= (a_1 + b_1, a_2 + b_2, 2(a_1 + b_1) + (a_2 + b_2)).
\end{align*}
Therefore, the third component of the sum satisfies the defining condition for $V$. Now check for $c \in \mathbb{R}$:
\[ c v_1 = c(a_1, a_2, 2a_1 + a_2) = (c a_1, c a_2, c(2a_1 + a_2)) \]
and here we also see that the third component satisfies the defining condition for $V$, since $c(2a_1 + a_2) = 2(c a_1) + c a_2$. Therefore, $V$ is a vector space.

Example 1.2.2. The subset
\[ W = \{(x_1, x_2) \in \mathbb{R}^2 \mid x_2 = -x_1 + 1\} \subset \mathbb{R}^2 \]
is not a vector subspace of $\mathbb{R}^2$. Two general elements of $W$ must satisfy the defining condition. We write
\[ w_1 = (a_1, -a_1 + 1) \quad\text{and}\quad w_2 = (b_1, -b_1 + 1). \]
Then,
\[ w_1 + w_2 = (a_1, -a_1 + 1) + (b_1, -b_1 + 1) = (a_1 + b_1, -(a_1 + b_1) + 2). \]
Therefore, the second component of the sum does not satisfy the defining condition and so $W$ is not a vector space.

The elements of a vector space $V$ are called vectors. A linear combination of a collection of $k$ vectors $v_1, v_2, \dots, v_k$ of a vector space $V$ is given by
\[ a_1 v_1 + a_2 v_2 + \cdots + a_k v_k \]
for some real numbers $a_1, a_2, \dots, a_k$ called the coefficients of the linear combination.

Example 1.2.3. One can check that $v_1 = (1, 0, 2)$, $v_2 = (3, -1, 5)$ and $v_3 = (-0.2, 1, 0.6)$ are elements of $V$ from Example 1.2.1. Then,
\[ (1.1) v_1 + v_2 + (-1) v_3 \]
is a linear combination of $v_1, v_2, v_3$.
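The two closure conditions of Examples 1.2.1 and 1.2.2 can be spot-checked numerically. This is a sketch under stated assumptions: the helper names (`in_V`, `in_W`, `random_V`, `random_W`) and the tolerance are illustrative choices, and random sampling can expose a failure of closure but cannot prove closure.

```python
import random

def in_V(v, tol=1e-9):
    """Defining condition of V in Example 1.2.1: x3 = 2*x1 + x2."""
    x1, x2, x3 = v
    return abs(x3 - (2 * x1 + x2)) < tol

def in_W(w, tol=1e-9):
    """Defining condition of W in Example 1.2.2: x2 = -x1 + 1."""
    x1, x2 = w
    return abs(x2 - (-x1 + 1)) < tol

def add(u, v):
    return tuple(a + b for a, b in zip(u, v))

def scale(c, v):
    return tuple(c * a for a in v)

random.seed(0)  # fixed seed so the run is reproducible

def random_V():
    a1, a2 = random.uniform(-5, 5), random.uniform(-5, 5)
    return (a1, a2, 2 * a1 + a2)

def random_W():
    a1 = random.uniform(-5, 5)
    return (a1, -a1 + 1)

# V: sums and scalar multiples of sampled elements stay in V.
v_closed = all(
    in_V(add(random_V(), random_V())) and in_V(scale(random.uniform(-5, 5), random_V()))
    for _ in range(100)
)

# W: the sum of two elements has second component -(a1 + b1) + 2, so it leaves W.
w_closed = all(in_W(add(random_W(), random_W())) for _ in range(100))

print(v_closed, w_closed)
```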
We now recall the definitions of linear dependence and linear independence of vectors. Consider the following example first.

Example 1.2.4. Let $v_1 = (1, -2)$ and $v_2 = (-1.3, 2.6)$; then we can see that $v_2 = (-1.3) v_1$. We can rewrite this as a linear combination
\[ (-1.3) v_1 + (-1) v_2 = 0. \]
That is, the linear combination is equal to zero, but the coefficients of the linear combination are not zero.

A collection of vectors $v_1, \dots, v_k$ in a vector space $V$ is linearly dependent if there exists a linear combination
\[ a_1 v_1 + a_2 v_2 + \cdots + a_k v_k = 0 \]
for which not all the coefficients are zero. A collection of vectors $v_1, \dots, v_k$ is linearly independent if the only way to have a linear combination
\[ a_1 v_1 + a_2 v_2 + \cdots + a_k v_k = 0 \]
is for $a_1 = a_2 = \cdots = a_k = 0$.

Example 1.2.5. The vectors $v_1 = (1, 0, 2)$, $v_2 = (3, -1, 5)$ and $v_3 = (-0.2, 1, 0.6)$ of Example 1.2.3 are linearly dependent. One can see this by writing
\[ a_1 v_1 + a_2 v_2 + a_3 v_3 = (a_1 + 3a_2 - 0.2a_3,\ -a_2 + a_3,\ 2a_1 + 5a_2 + 0.6a_3) = 0, \]
from which we obtain $a_3 = a_2$ by looking at the second component. Substituting in the first component we obtain $a_1 + 3a_2 - 0.2a_2 = 0$, which means $a_1 = -2.8a_2$. Substituting in the third component we have $2(-2.8)a_2 + 5a_2 + 0.6a_2 = 0$, which holds for every value of $a_2$. Choosing $a_2 = 1$ gives
\[ (-2.8) v_1 + v_2 + v_3 = 0, \]
a linear combination equal to zero whose coefficients are not all zero. By contrast, the two vectors $v_1$ and $v_2$ alone are linearly independent: $a_1 v_1 + a_2 v_2 = (a_1 + 3a_2, -a_2, 2a_1 + 5a_2) = 0$ forces $a_2 = 0$ from the second component and then $a_1 = 0$ from the first. The only way for a linear combination of $v_1, v_2$ to equal zero is for all the coefficients to be zero.
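A computation of this kind can be cross-checked with a determinant: three vectors in $\mathbb{R}^3$ are linearly independent exactly when the $3 \times 3$ matrix having them as rows has nonzero determinant (a standard fact, used here as an assumption since the text has not stated it). The sketch below uses exact rational arithmetic to avoid rounding; `det3` is a hypothetical helper name.

```python
from fractions import Fraction as Fr

def det3(r1, r2, r3):
    """Determinant of the 3x3 matrix with rows r1, r2, r3 (cofactor expansion)."""
    (a, b, c), (d, e, f), (g, h, i) = r1, r2, r3
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

# The vectors of Examples 1.2.3 and 1.2.5, written as exact fractions.
v1 = (Fr(1), Fr(0), Fr(2))
v2 = (Fr(3), Fr(-1), Fr(5))
v3 = (Fr(-1, 5), Fr(1), Fr(3, 5))

d = det3(v1, v2, v3)
print(d)  # a zero determinant signals linear dependence

# Verify the explicit relation (-2.8) v1 + v2 + v3 = 0 component by component.
relation = tuple(Fr(-14, 5) * a + b + c for a, b, c in zip(v1, v2, v3))
print(relation)

# For contrast, the canonical basis vectors give a nonzero determinant.
e = det3((1, 0, 0), (0, 1, 0), (0, 0, 1))
print(e)
```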
For a collection of vectors $v_1, v_2, \dots, v_k \in V$, the span of $v_1, v_2, \dots, v_k$ is the set of all linear combinations of $v_1, \dots, v_k$. We write
\[ \operatorname{span}(v_1, \dots, v_k) := \{a_1 v_1 + a_2 v_2 + \cdots + a_k v_k \mid a_1, a_2, \dots, a_k \in \mathbb{R}\}. \]
Consider the following example.

Example 1.2.6. The span of $v_1 = (1, 0, 2)$ and $v_2 = (-3, 1, 5)$ is
\begin{align*}
\operatorname{span}(v_1, v_2) &= \{a_1 (1, 0, 2) + a_2 (-3, 1, 5) \mid a_1, a_2 \in \mathbb{R}\} \\
&= \{(a_1 - 3a_2,\ a_2,\ 2a_1 + 5a_2) \mid a_1, a_2 \in \mathbb{R}\}.
\end{align*}
This last line gives an explicit expression for elements of the span. One can now verify whether a vector belongs to the span of $v_1, v_2$ by checking explicitly. For instance, is $v_4 = (4, -1, -3)$ an element of $\operatorname{span}(v_1, v_2)$? We need
\[ 4 = a_1 - 3a_2, \qquad -1 = a_2, \qquad 2a_1 + 5a_2 = -3. \]
We see that $a_2 = -1$ and $a_1 = 1$ solve the three equations. That is, $v_4 \in \operatorname{span}(v_1, v_2)$.

Recall the following result concerning the span of vectors.

Proposition 1.2.7. Let $v_1, \dots, v_k$ be a collection of vectors in the vector space $V$. Then, $\operatorname{span}(v_1, \dots, v_k)$ forms a vector subspace of $V$.
We now have all the ingredients to discuss the concepts of basis and dimension. A set of vectors $\mathcal{B} = \{v_1, \dots, v_k\}$ in a vector space $V$ is a basis for $V$ if
(1) the vectors in $\mathcal{B}$ are linearly independent, and
(2) $V = \operatorname{span}(\mathcal{B})$.

Example 1.2.8. If $V = \mathbb{R}^n$, the set $\mathcal{B} = \{e_1, \dots, e_n\}$ where
\[ e_j = (0, \dots, 1, \dots, 0) \quad \text{(with the $1$ in the $j$th position)} \]
is a basis for $\mathbb{R}^n$. It is called the canonical basis of $\mathbb{R}^n$. The linear independence is straightforward to check. For $n = 2$, we have
\[ e_1 = (1, 0) \quad\text{and}\quad e_2 = (0, 1). \]
Let $v = (x_1, x_2)$ be an arbitrary element of $\mathbb{R}^2$; then $v$ is in the span of $\mathcal{B}$:
\[ v = (x_1, x_2) = x_1 e_1 + x_2 e_2. \]

Fig. 1.4. Canonical basis of $\mathbb{R}^2$: $\{e_1, e_2\}$.

The elements $x_1, x_2$ are called the coordinates of $v$. For general $n$, any element $w = (x_1, \dots, x_n) \in \mathbb{R}^n$ can be written uniquely as
\[ w = x_1 e_1 + x_2 e_2 + \cdots + x_n e_n \]
where $x_1, \dots, x_n$ are said to be the coordinates of $w$.

The dimension of a vector space $V$ is given by the number of elements of a basis of $V$. The choice of basis is not unique. For the Euclidean spaces, the canonical basis is the preferred one and other bases are used typically in special circumstances when one wants to single out a particular geometrical feature.

Example 1.2.9. We are interested in the subspace of $\mathbb{R}^2$ spanned by the vector $d_1 = (1, 1)$. This subspace is the bisector line of the first and third quadrants. A basis of $\mathbb{R}^2$ which includes $d_1$ is $\{d_1, e_2\}$. For this we check the two properties of a basis:
(1) for $a, b \in \mathbb{R}$, $a d_1 + b e_2 = (a, a + b) = 0$ forces $a = b = 0$, so the vectors are linearly independent, and
(2) for any $(x_1, x_2) \in \mathbb{R}^2$,
\[ (x_1, x_2) = x_1 d_1 + (x_2 - x_1) e_2. \]
Thus, $\{d_1, e_2\}$ spans $\mathbb{R}^2$.
We now consider the concept of length of vectors in $\mathbb{R}^n$ when written in the canonical basis. This is given by the norm of a vector $v = (x_1, x_2, \dots, x_n)$ defined by
\[ \|v\| := \sqrt{x_1^2 + x_2^2 + \cdots + x_n^2}. \]
In $\mathbb{R}^2$, this corresponds to the hypotenuse of the right angle triangle with sides $x_1$ and $x_2$. In Example 1.2.9, a choice which may seem more natural is to take the vector $d_2 = (1, -1)$, which spans the bisector of the second and fourth quadrants. The vectors $d_1$ and $d_2$ are at right angles, just as the canonical basis elements. The dot product or scalar product (also called inner product) of two vectors $v$ and $w$ written in the canonical basis is denoted by $v \cdot w$ and defined as follows: let $v = (x_1, x_2, \dots, x_n)$ and $w = (y_1, y_2, \dots, y_n)$; then
\[ v \cdot w = x_1 y_1 + x_2 y_2 + \cdots + x_n y_n. \]
Fig. 1.5. The vector $v$ has norm $\|v\| = \sqrt{x_1^2 + x_2^2}$.

Fig. 1.6. The vector $v - w$.

This formula comes from the following argument. If $v$ and $w$ are at right angles or perpendicular, then by Pythagoras’ theorem (valid in $\mathbb{R}^n$) we have
\begin{align*}
\|v\|^2 + \|w\|^2 &= \|v - w\|^2 \\
&= (x_1 - y_1)^2 + \cdots + (x_n - y_n)^2 \\
&= \|v\|^2 + \|w\|^2 - 2(x_1 y_1 + \cdots + x_n y_n).
\end{align*}
Therefore, the left and right hand sides are equal if and only if
\[ x_1 y_1 + \cdots + x_n y_n = 0 = v \cdot w. \]
Two vectors $v, w$ are orthogonal (i.e. perpendicular) if and only if $v \cdot w = 0$. A basis $\mathcal{B}$ for a vector space $V$ is orthogonal if all basis elements are mutually orthogonal. If, in addition, the vectors in $\mathcal{B}$ have norm $1$, the basis is called orthonormal.

Example 1.2.10. Consider $\mathbb{R}^2$ with the basis $\{d_1, e_2\}$ of Example 1.2.9. Then $d_1 \cdot e_2 = 1$ and so this basis is not an orthogonal basis. If instead we choose the basis $\{d_1, d_2\}$, then $d_1 \cdot d_2 = 0$ and so this forms an orthogonal basis, but it is not orthonormal because $\|d_1\| = \|d_2\| = \sqrt{2}$. To make it orthonormal, we need to form a basis using $\hat{d}_1 = d_1 / \|d_1\|$ and $\hat{d}_2 = d_2 / \|d_2\|$.

It is a straightforward exercise to check that the canonical basis of $\mathbb{R}^n$ is an orthonormal basis. An important property of the scalar product of $v$ and $w$ is that it gives the product of the length of one vector with the length of the projection of the other vector along it. If $v$ and $w$ are vectors written in some orthonormal basis, then
\[ v \cdot w = \|v\| \|w\| \cos\theta, \]
where $\theta$ is the angle between $v$ and $w$.
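The norm, the dot product and the angle formula can be tried out on the vectors of Examples 1.2.9 and 1.2.10. This is a minimal sketch in plain Python; the function names `dot`, `norm`, `normalize` and `angle` are illustrative choices, and vectors are assumed to be given in the canonical basis.

```python
import math

def dot(v, w):
    """Dot product x1*y1 + ... + xn*yn."""
    return sum(a * b for a, b in zip(v, w))

def norm(v):
    """Euclidean norm sqrt(x1^2 + ... + xn^2)."""
    return math.sqrt(dot(v, v))

def normalize(v):
    """Rescale a nonzero vector to norm 1."""
    n = norm(v)
    return tuple(a / n for a in v)

def angle(v, w):
    """Angle theta recovered from v . w = |v| |w| cos(theta)."""
    return math.acos(dot(v, w) / (norm(v) * norm(w)))

d1, d2, e2 = (1, 1), (1, -1), (0, 1)

print(dot(d1, e2))   # nonzero: {d1, e2} is not orthogonal
print(dot(d1, d2))   # zero: {d1, d2} is orthogonal
print(norm(d1))      # sqrt(2), so {d1, d2} is not orthonormal

# Normalizing both vectors yields an orthonormal basis.
h1, h2 = normalize(d1), normalize(d2)
print(norm(h1), norm(h2), dot(h1, h2))

# The angle between d1 and e2 is pi/4 (d1 bisects the first quadrant).
print(angle(d1, e2))
```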
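We continue with a discussion of matrices and linear mappings. Consider an $m \times n$ matrix $M$ of real numbers with entries labelled $a_{ij}$, where $i$ is the row label and $j$ is the column label. We obtain
\[ M = \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{pmatrix}. \]
Let $v = (v_1, \dots, v_n)$ be a vector in $\mathbb{R}^n$. Recall that matrix-vector multiplication $Mv$ is defined by
\[ \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{pmatrix} \begin{pmatrix} v_1 \\ v_2 \\ \vdots \\ v_n \end{pmatrix} = \begin{pmatrix} a_{11} v_1 + a_{12} v_2 + \cdots + a_{1n} v_n \\ a_{21} v_1 + a_{22} v_2 + \cdots + a_{2n} v_n \\ \vdots \\ a_{m1} v_1 + a_{m2} v_2 + \cdots + a_{mn} v_n \end{pmatrix}. \]
In particular, matrix-vector multiplication has a “linearity” property. That is, for vectors $v_1, v_2 \in \mathbb{R}^n$ and $\alpha \in \mathbb{R}$,
\[ M(\alpha v_1 + v_2) = \alpha M v_1 + M v_2. \]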
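The row-by-row definition of $Mv$ and its linearity property can be illustrated on a small case. The matrix `M`, the vectors and the scalar below are hypothetical values chosen only for the illustration; one numerical instance does not prove linearity, it just shows the two sides agreeing.

```python
def matvec(M, v):
    """Matrix-vector product: entry i is a_i1*v_1 + ... + a_in*v_n."""
    return [sum(a * x for a, x in zip(row, v)) for row in M]

# Hypothetical 2 x 3 matrix, so M maps R^3 to R^2.
M = [[1, 2, 0],
     [0, -1, 3]]
v1 = [1, 0, 2]
v2 = [3, -1, 5]
alpha = 4

# Left-hand side: M(alpha*v1 + v2).
lhs = matvec(M, [alpha * a + b for a, b in zip(v1, v2)])

# Right-hand side: alpha*(M v1) + M v2.
rhs = [alpha * a + b for a, b in zip(matvec(M, v1), matvec(M, v2))]

print(lhs, rhs)
```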
The $m \times n$ matrices are examples of linear transformations from $\mathbb{R}^n$ to $\mathbb{R}^m$. Linear transformations are often more easily described without the use of matrices, as the next example shows.

Example 1.2.11. Consider the spaces of polynomials of degree $\le 2$ and $\le 1$. We write
\[ P_2 = \{a_0 + a_1 x + a_2 x^2 \mid a_0, a_1, a_2 \in \mathbb{R}\} \quad\text{and}\quad P_1 = \{b_0 + b_1 x \mid b_0, b_1 \in \mathbb{R}\}. \]
Consider the differentiation operation $D$ applied to elements of $P_2$:
\[ D(a_0 + a_1 x + a_2 x^2) = a_1 + 2a_2 x \in P_1. \]
Therefore $D : P_2 \to P_1$, and we know from elementary calculus that differentiation is a linear operation; that is, if $p, q \in P_2$ and $\alpha \in \mathbb{R}$ then
\[ D(\alpha p + q) = \alpha D(p) + D(q). \]
This example shows that using exclusively matrices to discuss linear transformations is not optimal in all situations. Moreover, writing out a matrix requires one to fix a basis for the vector spaces. Therefore, it is more appropriate to define the set of linear transformations independently of bases as follows.
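To make the remark about bases concrete, here is a small sketch of Example 1.2.11 in which a polynomial $a_0 + a_1 x + a_2 x^2$ is stored as its coefficient tuple. The representation, the helper `combine` and the sample polynomials are assumptions made for the illustration. Once the bases $\{1, x, x^2\}$ and $\{1, x\}$ are fixed, a matrix for $D$ falls out by applying $D$ to each basis element.

```python
def D(p):
    """Differentiate a0 + a1*x + a2*x**2, stored as the tuple (a0, a1, a2)."""
    a0, a1, a2 = p
    return (a1, 2 * a2)

def combine(alpha, p, q):
    """Coefficient tuple of alpha*p + q."""
    return tuple(alpha * a + b for a, b in zip(p, q))

p = (1, -2, 3)   # 1 - 2x + 3x^2 (hypothetical sample)
q = (0, 5, 1)    # 5x + x^2      (hypothetical sample)
alpha = 4

# Linearity: D(alpha*p + q) agrees with alpha*D(p) + D(q).
lhs = D(combine(alpha, p, q))
rhs = combine(alpha, D(p), D(q))
print(lhs, rhs)

# Matrix of D in the chosen bases: column j is D applied to the j-th basis element.
basis = [(1, 0, 0), (0, 1, 0), (0, 0, 1)]
columns = [D(b) for b in basis]
matrix = [[col[i] for col in columns] for i in range(2)]
print(matrix)
```

A different choice of basis for $P_2$ or $P_1$ would give a different matrix for the same operation $D$, which is the point of defining linear transformations independently of bases.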
Definition 1.2.12. We say that $T : \mathbb{R}^n \to \mathbb{R}^m$ is a linear transformation if it satisfies $T(\alpha v_1 + v_2) = \alpha T(v_1) + T(v_2)$ where $\alpha \in \mathbb{R}$ and $v_1, v_2 \in \mathbb{R}^n$. The set of all linear transformations from $\mathbb{R}^n$ to $\mathbb{R}^m$ is denoted by $L(\mathbb{R}^n, \mathbb{R}^m)$.
In this book, we shall often discuss linear transformations in an abstract way, without specifying a basis, and so it is more convenient to adopt the linear transformation formalism, rather than writing all our transformations in matrix form. Finally, recall that two vector spaces $V$ and $W$ are isomorphic if there exists a linear transformation $T : V \to W$ such that $T^{-1}$ exists. In particular, any two vector spaces of the same dimension are automatically isomorphic. Returning to matrices, recall the definition of the determinant for $2 \times 2$ and $3 \times 3$ matrices, which we use repeatedly:
\[ \det A = \det \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix} = a_{11} a_{22} - a_{12} a_{21} \]
and
\begin{align*}
\det A &= \det \begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{pmatrix} \\
&= a_{11}(a_{22} a_{33} - a_{23} a_{32}) - a_{12}(a_{21} a_{33} - a_{23} a_{31}) + a_{13}(a_{21} a_{32} - a_{22} a_{31}).
\end{align*}
If no confusion with the absolute value is possible, we sometimes write $\det A = |A|$.
We now conclude with some geometric properties of determinants.

Fig. 1.7. Vectors $v$ and $w$ and parallelogram $P(v, w)$.

If $v = (a, b)$ and $w = (c, d)$ are two vectors in the plane based at some point $q$, then the area of the parallelogram $P_{vw}$ generated by $v$ and $w$ is given by the absolute value of the determinant of the matrix
\[ \begin{pmatrix} a & b \\ c & d \end{pmatrix}. \]
That is, $\operatorname{Area}(P_{vw}) = |ad - bc|$. There are various ways to check this result, but we do not pursue this here. However, this result is important in understanding the area of a parallelogram in three-dimensional space. Another geometric way of obtaining the area of a parallelogram is as follows. Consider the parallelogram $P_{vw}$ formed by two vectors $v$ and $w$ in the plane. The area of this parallelogram is also given by the formula
\[ \text{base of parallelogram} \times \text{height of parallelogram}. \]
The height is obtained by drawing a perpendicular from a vertex to the opposite side as in Figure 1.8. The length of this height is, using basic trigonometry, $h = \|v\| \sin\theta$ where $\theta$ is the angle opposite the dashed line. Geometrically, this means $\operatorname{Area}(P_{vw}) = \|w\| \|v\| \sin\theta$.

Fig. 1.8. Parallelogram $P$ formed by the vectors $v$ and $w$.

Another way to compute the area of a parallelogram is with the “cross product”, also called “vector product”. The cross product is defined for pairs of vectors in $\mathbb{R}^3$ (or $\mathbb{R}^2$ if the same component of the vectors is zero). Let $v = (x_1, x_2, x_3)$ and $w = (y_1, y_2, y_3)$, written in an orthonormal basis; then the cross product $v \times w$ is a vector given by
\[ v \times w := \left( \begin{vmatrix} x_2 & x_3 \\ y_2 & y_3 \end{vmatrix},\ -\begin{vmatrix} x_1 & x_3 \\ y_1 & y_3 \end{vmatrix},\ \begin{vmatrix} x_1 & x_2 \\ y_1 & y_2 \end{vmatrix} \right) \tag{1.1} \]
where $|\ |$ is the $2 \times 2$ determinant. This means that the components of the cross product correspond to the “oriented” areas of the parallelograms obtained by the projection of the parallelogram $P(v, w)$ onto the coordinate planes.
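Formula (1.1) and the two area formulas can be compared on a concrete pair of vectors. The sketch below is illustrative only: the vectors are hypothetical, chosen in the $xy$-plane so that the planar formula $|ad - bc|$ applies, and `det2`, `cross`, `dot` and `norm` are helper names introduced for the example.

```python
import math

def det2(a, b, c, d):
    """Determinant of the 2x2 matrix with rows (a, b) and (c, d)."""
    return a * d - b * c

def cross(v, w):
    """Cross product assembled from 2x2 determinants as in formula (1.1)."""
    x1, x2, x3 = v
    y1, y2, y3 = w
    return (det2(x2, x3, y2, y3), -det2(x1, x3, y1, y3), det2(x1, x2, y1, y2))

def dot(v, w):
    return sum(a * b for a, b in zip(v, w))

def norm(v):
    return math.sqrt(dot(v, v))

# Hypothetical vectors lying in the xy-plane.
v = (1, 2, 0)
w = (3, 1, 0)

c = cross(v, w)
print(c)

# Area three ways: planar determinant, base times height, norm of the cross product.
area_det = abs(det2(v[0], v[1], w[0], w[1]))
theta = math.acos(dot(v, w) / (norm(v) * norm(w)))
area_sin = norm(w) * norm(v) * math.sin(theta)
area_cross = norm(c)
print(area_det, area_sin, area_cross)
```

For vectors in the $xy$-plane only the third component of the cross product is nonzero, and its absolute value is the planar determinant.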
The cross product $v \times w$ is perpendicular to both $v$ and $w$ and so is perpendicular to the plane in which $v$ and $w$ lie. This is verified by checking that
\[ v \cdot (v \times w) = 0 \quad\text{and}\quad w \cdot (v \times w) = 0. \]
An important feature of the cross product is called its alternating property, namely
\[ v \times w = -(w \times v). \]
This reflects the fact that there are two vectors perpendicular to the plane containing $v$ and $w$ and they are opposite to each other. Finally, let $P_{vw}$ be the parallelogram generated by $v$ and $w$; then $\operatorname{Area}(P_{vw}) = \|v \times w\|$.

Exercises
(1) Show that the set
\[ V = \{(x, y, z) \in \mathbb{R}^3 \mid z = 3x + 2y\} \]
is a vector subspace (i.e., check that the two conditions of a vector space are satisfied).
(2) Show that a line in $\mathbb{R}^2$ is a vector subspace if and only if it passes through the origin.
(3) Use the scalar multiplication property of vector spaces to show that a closed ball $B$ of any radius $r > 0$ cannot be a vector space. Explicitly,
\[ B = \{(x, y, z) \in \mathbb{R}^3 \mid x^2 + y^2 + z^2 \le r^2\}. \]
(4) Consider the vectors given. Find out if they are linearly dependent or independent. Compute the span of these vectors and find the dimension of the subspace spanned by the vectors.
  (a) $v_1 = (1, 1, 0)$, $v_2 = (0, -2, 0)$.
  (b) $v_1 = (2, -1, 1)$, $v_2 = (0, 1, 3)$ and $v_3 = (2, 0, 4)$.
  (c) $v_1 = (3, 4, 1, 0)$, $v_2 = (0, 1, 0, -1)$ and $v_3 = (2, -2, 0, 1)$.
(5) Determine if the following vectors are orthogonal.
  (a) $v_1 = (1, 3, 1)$, $v_2 = (2, -1, 1)$.
  (b) $v_1 = (2, 1, 0, 0)$, $v_2 = (0, -1, 1, 0)$.
  (c) $v_1 = (1, -1, 2, 2)$, $v_2 = (-1, 1, 1, 1)$.
(6) Find a vector $(a, b, c, d)$ orthogonal to $(3, -1, 0, 1)$.
(7) Find a vector $v$ orthogonal to the subspace $V$ of Exercise 1.
(8) Normalize the vectors of Exercise 5.
(9) Verify that $v_1 = (2/\sqrt{5}, 1/\sqrt{5}, 0)$, $v_2 = (0, 0, 1)$ and $v_3 = (-1/\sqrt{5}, 2/\sqrt{5}, 0)$ form an orthonormal basis of $\mathbb{R}^3$.
(10) Recall the determinant formula to compute the cross product of two vectors. Let $v = (x_1, y_1, z_1)$ and $w = (x_2, y_2, z_2)$; then
\[ v \times w = \det \begin{pmatrix} i & j & k \\ x_1 & y_1 & z_1 \\ x_2 & y_2 & z_2 \end{pmatrix} \]
where $i, j, k$ correspond to the basis vectors $e_1, e_2, e_3$ (notation used in physics). Show that the result corresponds to formula (1.1).
(11) Show that for vectors in the $xy$-plane $v = (x_1, y_1, 0)$, $w = (x_2, y_2, 0)$, the only possible nonzero component of the cross product $v \times w$ is in the $e_3$ direction.
(12) Compute the area of the parallelogram given by the vectors $v = (1, 0, -2)$ and $w = (-3, 1, 1)$.
(13) Show that for $\alpha \neq 0$, $(\alpha v) \times w = \alpha (v \times w)$.
(14) If $w = \alpha v$ for some $\alpha \neq 0$, show that $v \times w = 0$.
1.3 Coordinate systems

Coordinate systems are commonly used to identify locations, whether on Earth with longitudes and latitudes or for celestial objects in the night sky using the Elevation/Azimuthal coordinate system. The mathematical definition of a coordinate system used throughout this text is the following.

Definition 1.3.1. A coordinate system on $\mathbb{R}^n$ is a collection of $n$ families of curves such that any point in $\mathbb{R}^n$ corresponds uniquely to the intersection of one curve from each family.

Coordinate systems, as opposed to coordinates of a basis of a vector space, do not need to be linear. In fact, among the commonly used coordinate systems only the Cartesian coordinate system is made up uniquely of straight lines. We define the following families of coordinate curves
\[ F_i := \left\{ x^i_a(t_i) = (a_1, \dots, a_{i-1}, t_i, a_{i+1}, \dots, a_n) \;\middle|\; t_i \in \mathbb{R} \text{ and } a = (a_1, \dots, \hat{a}_i, \dots, a_n) \in \mathbb{R}^{n-1} \right\} \tag{1.2} \]
for $i = 1, \dots, n$, where the hat $\hat{\ }$ denotes the missing coordinate in the vector. The Cartesian coordinate system on $\mathbb{R}^n$ consists of the $n$ families of straight lines given by (1.2). A point $p \in \mathbb{R}^n$ at the intersection of $n$ lines, one from each family $F_i$, with parameter values $t_{p_1}, t_{p_2}, \dots, t_{p_n}$ has coordinates $(t_{p_1}, t_{p_2}, \dots, t_{p_n})$.

Example 1.3.2. Let $u = 1.4 e_1 + 1.7 e_2 \in \mathbb{R}^2$; then $u$ is at the intersection of the lines $x^1_{1.7}(t_1)$ and $x^2_{1.4}(t_2)$ with $t_1 = 1.4$ and $t_2 = 1.7$ and so has coordinates $(1.4, 1.7)$. See Figure 1.9.

One notices from this example that the coordinates of a point $u$ given in the canonical basis correspond to the coordinates of $u$ in the Cartesian coordinate system.
Fig. 1.9. $u = (1.4, 1.7)$ at the intersection of two coordinate lines.

Because of this, the dividing line between the canonical basis and the Cartesian coordinate system is often overlooked. However, this is not correct as they correspond to different mathematical objects: a set of vectors for the canonical basis and a family of straight lines for the Cartesian coordinates.

Fig. 1.10. Polar coordinate system showing four (equally spaced) radii and eight radial vectors, at the angles $\theta = 0, \pi/4, \pi/2, 3\pi/4, \pi, 5\pi/4, 3\pi/2, 7\pi/4$.

The polar coordinate system in $\mathbb{R}^2$ is obtained from the Cartesian coordinate system by defining a family of rays from the origin and a family of circles centered at the origin. Let $p \in \mathbb{R}^2$ be at the intersection of the ray with angle $\theta_0$ from the $x$-axis and the circle of radius $r_0$; in coordinates we write $p = (r_0, \theta_0)$. In terms of Cartesian coordinates, the formulae for a ray at angle $\theta_0 \in [0, 2\pi)$ and a circle of radius $r_0$ are respectively
\[ r_{\theta_0}(t_1) = (t_1 \cos\theta_0,\ t_1 \sin\theta_0) \quad\text{and}\quad \theta_{r_0}(t_2) = (r_0 \cos(t_2),\ r_0 \sin(t_2)). \]
The polar coordinate system is an example of a curvilinear coordinate system, because at least one family of curves is not made up of straight lines.
Example 1.3.3. Consider again the point $u = (1.4, 1.7) \in \mathbb{R}^2$. In polar coordinates, the coordinate curves passing through $u$ are determined by the ray $r_{\theta_0}(t)$ joining the origin to $u$ and the circle $\theta_{r_0}(t)$ passing through the point $u$. The angle $\theta_0$ is obtained using basic trigonometry since $\tan(\theta_0) = 1.7/1.4$, and so $\theta_0 = \arctan(1.7/1.4) \approx 5\pi/18$. The radius of the circle is obtained from Pythagoras' theorem: $r_0 = \sqrt{1.4^2 + 1.7^2} \approx 2.20$.
Fig. 1.11. u = (1.4, 1.7) at the intersection of two coordinate curves in Cartesian and polar coordinates.
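The computation in Example 1.3.3 can be reproduced numerically. The following is a minimal sketch, not part of the original text: the function names are illustrative, and `math.atan2` is used in place of arctan(y/x) so that the angle is also correct when x ≤ 0.

```python
import math

def cartesian_to_polar(x, y):
    """Return (r, theta) with theta in [0, 2*pi)."""
    r = math.hypot(x, y)                       # sqrt(x^2 + y^2)
    theta = math.atan2(y, x) % (2 * math.pi)   # quadrant-aware angle
    return r, theta

def polar_to_cartesian(r, theta):
    """Return (x, y) for the point at radius r and angle theta."""
    return r * math.cos(theta), r * math.sin(theta)

# The point u = (1.4, 1.7) from Example 1.3.3.
r0, theta0 = cartesian_to_polar(1.4, 1.7)
x_back, y_back = polar_to_cartesian(r0, theta0)
print(r0, theta0, 5 * math.pi / 18)
```

The printed angle can be compared with the rounded value 5π/18 quoted in the example; the two agree to about one hundredth of a radian.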
We now look at the main curvilinear coordinate systems in $\mathbb{R}^3$ obtained via the Cartesian coordinate system. The cylindrical coordinate system is described by families of curves extending the polar coordinate system in the third dimension using straight lines:
$r_{\theta_0,z_0}(t) = (t\cos\theta_0, t\sin\theta_0, z_0)$, $\theta_{r_0,z_0}(t) = (r_0\cos t, r_0\sin t, z_0)$ and $x^3_{(r_0,\theta_0)}(t) = (r_0\cos\theta_0, r_0\sin\theta_0, t)$.
Figure 1.12 shows the radial and angular coordinates in the plane and the vertical axis.
Fig. 1.12. Cylindrical coordinates: polar coordinates in the plane and vertical axis.
Note that there are equivalent coordinate systems with polar coordinates in the other coordinate planes:
$r_{\theta_0,x_0}(t) = (x_0, t\cos\theta_0, t\sin\theta_0)$, $\theta_{r_0,x_0}(t) = (x_0, r_0\cos t, r_0\sin t)$ and $x^1_{(r_0,\theta_0)}(t) = (t, r_0\cos\theta_0, r_0\sin\theta_0)$.
Similarly,
$r_{\theta_0,y_0}(t) = (t\cos\theta_0, y_0, t\sin\theta_0)$, $\theta_{r_0,y_0}(t) = (r_0\cos t, y_0, r_0\sin t)$ and $x^2_{(r_0,\theta_0)}(t) = (r_0\cos\theta_0, t, r_0\sin\theta_0)$.
The spherical coordinate system is obtained by fixing a radius $\rho_0 > 0$ and defining two families of curves on a sphere of radius $\rho_0$, together with the family of rays from the origin, as follows:
$\rho_{\theta_0,\varphi_0}(t) = (t\cos\theta_0\sin\varphi_0, t\sin\theta_0\sin\varphi_0, t\cos\varphi_0)$,
$\theta_{\rho_0,\varphi_0}(t) = (\rho_0\cos t\sin\varphi_0, \rho_0\sin t\sin\varphi_0, \rho_0\cos\varphi_0)$ and
$\varphi_{\rho_0,\theta_0}(t) = (\rho_0\cos\theta_0\sin t, \rho_0\sin\theta_0\sin t, \rho_0\cos t)$. (1.3)
The definitions of the coordinate systems above give us formulae for the correspondence of points from a curvilinear coordinate system to the Cartesian coordinate system. If a point $p \in \mathbb{R}^2$ has Cartesian coordinates $(x, y)$ and polar coordinates $(r, \theta)$ then
$x = r\cos\theta, \quad y = r\sin\theta \quad \Leftrightarrow \quad r = \sqrt{x^2 + y^2}, \quad \theta = \arctan(y/x).$ (1.4)
The relationship between the Cartesian coordinate system in $\mathbb{R}^3$ and the cylindrical coordinate system is a direct extension of the polar coordinate system equations. For spherical coordinates, equations (1.3) give us
$x = \rho\cos\theta\sin\varphi, \quad y = \rho\sin\theta\sin\varphi \quad \text{and} \quad z = \rho\cos\varphi.$
We obtain $\rho, \varphi, \theta$ as functions of $x, y, z$ as follows. The squared radius of a sphere is given by
$x^2 + y^2 + z^2 = (\rho\cos\theta\sin\varphi)^2 + (\rho\sin\theta\sin\varphi)^2 + (\rho\cos\varphi)^2$
$= \rho^2\sin^2\varphi(\cos^2\theta + \sin^2\theta) + \rho^2\cos^2\varphi$
$= \rho^2(\sin^2\varphi + \cos^2\varphi)$
$= \rho^2$
so $\rho = \sqrt{x^2 + y^2 + z^2}$. Notice that $y/x = \tan\theta$ which means $\theta = \arctan(y/x)$. Finally, $\cos\varphi = z/\rho$ and writing $\rho = \sqrt{x^2 + y^2 + z^2}$ we obtain
$\varphi = \arccos\left(\frac{z}{\sqrt{x^2 + y^2 + z^2}}\right).$
Fig. 1.13. Spherical coordinates
We summarize as follows; see Figure 1.13 for an illustration where the right-angled triangle has sides of length $\rho\sin\varphi$, $\rho\cos\varphi$ and hypotenuse $\rho$.
$x = \rho\cos\theta\sin\varphi, \quad y = \rho\sin\theta\sin\varphi, \quad z = \rho\cos\varphi,$
$\rho = \sqrt{x^2 + y^2 + z^2}, \quad \theta = \arctan\left(\frac{y}{x}\right), \quad \varphi = \arccos\left(\frac{z}{\sqrt{x^2 + y^2 + z^2}}\right).$ (1.5)
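The conversion formulas between Cartesian and spherical coordinates can be exercised with a round trip. This is a hedged sketch rather than anything prescribed by the text: the function names and the sample point are assumptions, and `math.atan2(y, x)` stands in for arctan(y/x), with which it agrees when x > 0.

```python
import math

def spherical_to_cartesian(rho, theta, phi):
    """(rho, theta, phi) -> (x, y, z), with phi measured from the positive z-axis."""
    x = rho * math.cos(theta) * math.sin(phi)
    y = rho * math.sin(theta) * math.sin(phi)
    z = rho * math.cos(phi)
    return x, y, z

def cartesian_to_spherical(x, y, z):
    """(x, y, z) -> (rho, theta, phi); requires (x, y, z) != (0, 0, 0)."""
    rho = math.sqrt(x * x + y * y + z * z)
    theta = math.atan2(y, x)    # equals arctan(y/x) for x > 0
    phi = math.acos(z / rho)    # angle from the positive z-axis
    return rho, theta, phi

point = (1.0, 2.0, 2.0)                        # sample point, rho should be 3
rho, theta, phi = cartesian_to_spherical(*point)
back = spherical_to_cartesian(rho, theta, phi)
print(rho, theta, phi, back)
```

Converting to spherical coordinates and back should return the original point up to rounding error, which is a quick way to catch a swapped angle or a missing factor.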
Here are some examples of how to perform the algebra from one coordinate system to another.
Example 1.3.4. Consider the locus of points $C$ in $\mathbb{R}^2$ which satisfies the equation $x^2 + y^2 = 2y$. We rewrite this locus of points using polar coordinates. We substitute for $x$ and $y$ to obtain
$r^2\cos^2\theta + r^2\sin^2\theta = 2r\sin\theta$
$r^2 = 2r\sin\theta$
$r = 2\sin\theta.$
$C$ is shown in Figure 1.14; it is the circle of radius 1 centered at (0, 1). Consider this time an example in $\mathbb{R}^3$.
Example 1.3.5. We write the locus of points $S$ in $\mathbb{R}^3$ given by $3x^2 + 3y^2 - z^2 = 1$ using the cylindrical and spherical coordinates. In cylindrical coordinates, one obtains
$3(r^2\cos^2\theta) + 3(r^2\sin^2\theta) - z^2 = 1$
$3r^2 - z^2 = 1.$
One can write either $z^2 = 3r^2 - 1$ or $r^2 = \frac{1}{3}(1 + z^2)$.
In spherical coordinates,
$3(\rho^2\cos^2\theta\sin^2\varphi) + 3(\rho^2\sin^2\theta\sin^2\varphi) - (\rho^2\cos^2\varphi) = 1$
$3\rho^2\sin^2\varphi - \rho^2\cos^2\varphi = 1$
$\rho^2(4\sin^2\varphi - 1) = 1$
where the last line is obtained by adding and subtracting $\rho^2\sin^2\varphi$ and simplifying with the $\rho^2\cos^2\varphi$ term. Therefore, one can write
$\rho^2 = \frac{1}{4\sin^2\varphi - 1}.$
Fig. 1.14. The circle C of radius 1 centered at (0, 1)
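Both conversions above (Examples 1.3.4 and 1.3.5) can be spot-checked numerically by sampling points from the curvilinear description and substituting them into the Cartesian equation. The sketch below is illustrative only; the function names and sample angles are assumptions.

```python
import math

def residual_circle(theta):
    """x^2 + y^2 - 2y for the point on r = 2 sin(theta) (Example 1.3.4)."""
    r = 2 * math.sin(theta)
    x, y = r * math.cos(theta), r * math.sin(theta)
    return x * x + y * y - 2 * y

def residual_hyperboloid(theta, phi):
    """3x^2 + 3y^2 - z^2 - 1 for rho^2 = 1 / (4 sin^2(phi) - 1) (Example 1.3.5)."""
    denom = 4 * math.sin(phi) ** 2 - 1
    if denom <= 0:
        raise ValueError("no point of the surface in this direction")
    rho = math.sqrt(1 / denom)
    x = rho * math.cos(theta) * math.sin(phi)
    y = rho * math.sin(theta) * math.sin(phi)
    z = rho * math.cos(phi)
    return 3 * x * x + 3 * y * y - z * z - 1

max_circle = max(abs(residual_circle(k * math.pi / 12)) for k in range(13))
max_hyper = max(abs(residual_hyperboloid(0.3 * k, math.pi / 2 - 0.1 * j))
                for k in range(5) for j in range(5))
print(max_circle, max_hyper)
```

Both maxima should be at the level of rounding error. The `ValueError` branch reflects the fact that the spherical formula only makes sense where 4 sin²φ − 1 > 0.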
We conclude this section with coordinate systems in R4 . Obvious choices are the Cartesian coordinate system given by (x1 , x2 , x3 , x4 ), but one can also take two sets of polar coordinates (r1 , θ1 , r2 , θ2 ) or a three dimensional coordinate system, say spherical, plus an additional Cartesian coordinate (ρ,φ,θ,x4 ). Several other options are possible.
Exercises
(1) Consider the disk $D$ of radius $a > 0$ in the plane
$D = \{(x, y) \in \mathbb{R}^2 \mid x^2 + y^2 \le a^2\}.$
Write the definition of $D$ in polar coordinates. What is the shape of $D$ in the $(r, \theta)$ plane?
(2) Consider the region
$R = \{(r, \theta) \in \mathbb{R}^2 \mid 2 < r < 3, \ \pi/6 \le \theta \le 2\pi/3\}.$
Draw $R$ in the $(r, \theta)$ plane. Describe $R$ in Cartesian coordinates and draw the region in the $(x, y)$ plane.
(3) Write $x^2 + y^2 = 2xy$ in polar coordinates. Simplify if possible.
(4) Write $r = \frac{1}{1 - \cos\theta}$ in Cartesian coordinates. Simplify if possible.
(5) Write $z = x^2 + y^2$ in cylindrical coordinates. Simplify if possible.
(6) Write $z^2 = x^2 + y^2$ in spherical coordinates. Simplify if possible.
(7) Consider the region $R \subset \mathbb{R}^3$ given by
$R = \{(x, y, z) \mid x^2 + y^2 \le 1 - z^2, \ 1/2 < z < 1\}.$
Describe this region in spherical coordinates and draw it.
(8) Look up hyperbolic coordinates online. Describe this coordinate system.
(9) Look up paraboloidal coordinates online. Describe this coordinate system.
(10) Search the internet for other coordinate systems. How do they relate to the Cartesian coordinate system?
1.4 Functions and Mappings: including partial derivatives
One is typically familiar with functions of the form $y = f(x)$ where $x \in I \subset \mathbb{R}$. The set $I$ is called the domain of $f$ and the rule $f$ assigns a unique value $y$ to each $x \in I$; the set of these values forms the image or range of $f$. The general definition of a function follows the same pattern.
Definition 1.4.1. Let $A, B$ be two sets and let $f$ be a rule which assigns to every $a \in A$ a unique value $b = f(a) \in B$. Then the triplet $(A, B, f)$ is called a function, where $A$ is called the domain of the function and $B$ is the image or range of the function.
In this book, we consider functions of the following form. Let $U \subset \mathbb{R}^n$ and let $f : U \to \mathbb{R}^m$ be the rule. As a shortcut, it is typical to refer to $f$ as the function without mentioning the domain. However, one must always keep track of the domain of the function even if it is not explicitly stated. Several cases of $n$ and $m$ values are given special names:
(1) $n = m = 1$: functions of one variable,
(2) $n > 1$ and $m = 1$: functions of several variables,
(3) $n = 1$ and $m > 1$: vector functions.
We look at a few examples.
Example 1.4.2. Here are some examples of functions of several variables.
– $U = \mathbb{R}^2$ and $f(x, y) = x^2 + y^2$.
– $U \subset \mathbb{R}^3$ and $g(\rho, \varphi, \theta) = \rho^2 - \frac{1}{4\sin^2\varphi - 1}$.
– $U = \mathbb{R}^4$ and $h(x, y, z, w) = xyzw$.
Example 1.4.3. Consider now examples of vector functions.
– $r(t) = (t, t^2)$
– $r(t) = (\cos t, \sin t, e^t)$
– $r(t) = (1 - t, |t|, 2t, \cos t)$.
For general $n$ and $m$ values, it is also customary to refer to $f : U \to \mathbb{R}^m$ as a mapping. Changes of coordinates between coordinate systems are very useful mappings. Here are a few examples.
Example 1.4.4. Consider the mapping $f : \mathbb{R}^2 \setminus \{(0, 0)\} \to \mathbb{R}^2$ given by
$f(r, \theta) = (r\cos\theta, r\sin\theta).$
Example 1.4.5. Let $A : \mathbb{R}^n \to \mathbb{R}^n$ be a linear mapping; i.e. a matrix. Suppose that $|A| \neq 0$; then $A$ is a linear change of coordinates in $\mathbb{R}^n$. For instance, let $(x, y, z)$ be coordinates in $\mathbb{R}^3$ and
$A = \begin{pmatrix} 0 & -1 & 0 \\ -1 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix}.$
We can use $A$ to make a change of coordinates
$\begin{pmatrix} u \\ v \\ w \end{pmatrix} = A \begin{pmatrix} x \\ y \\ z \end{pmatrix}.$
Example 1.4.6. Another change of coordinates is from Cartesian to spherical coordinates. Those are seen in equation (1.5):
$f(x, y, z) = \left(\sqrt{x^2 + y^2 + z^2}, \ \arctan\left(\frac{y}{x}\right), \ \arccos\left(\frac{z}{\sqrt{x^2 + y^2 + z^2}}\right)\right).$
The inverse transformation mapping is
$g(\rho, \varphi, \theta) = (\rho\cos\theta\sin\varphi, \rho\sin\theta\sin\varphi, \rho\cos\varphi).$
1.4.1 Partial Derivatives
For functions of several variables of the form f : Rn → R, there is a partial extension to the concept of derivative of a function; the so-called “partial derivative” which we define in the case of a function of two variables. The general case is straightforward once this one is understood.
Consider $f(x, y)$ near a point $(x_0, y_0)$ in its domain. If the following limits exist
$\lim_{x \to x_0} \frac{f(x, y_0) - f(x_0, y_0)}{x - x_0}$ and $\lim_{y \to y_0} \frac{f(x_0, y) - f(x_0, y_0)}{y - y_0}$
we call those the partial derivatives of $f$ with respect to $x$ and with respect to $y$ respectively. Those are denoted by
$\frac{\partial f}{\partial x}(x_0, y_0) := \lim_{x \to x_0} \frac{f(x, y_0) - f(x_0, y_0)}{x - x_0}$ and $\frac{\partial f}{\partial y}(x_0, y_0) := \lim_{y \to y_0} \frac{f(x_0, y) - f(x_0, y_0)}{y - y_0}.$
Those can also be rewritten for an arbitrary point $(x, y)$ as:
$\frac{\partial f}{\partial x}(x, y) := \lim_{h \to 0} \frac{f(x + h, y) - f(x, y)}{h}, \quad \frac{\partial f}{\partial y}(x, y) := \lim_{h \to 0} \frac{f(x, y + h) - f(x, y)}{h}.$
The partial derivatives of familiar functions are computed in the same way as derivatives of functions of one variable because the other variable is held fixed as a constant. Moreover, all the differentiation rules apply in the same way: sum rule, product rule, quotient rule, chain rule.
Example 1.4.7. Let $f(x, y) = x^2 y\cos(xy) + x/(1 + y)$, then considering $y$ fixed
$\frac{\partial f}{\partial x} = 2xy\cos(xy) - x^2 y\sin(xy)\,y + \frac{1}{1 + y}$
and keeping $x$ fixed
$\frac{\partial f}{\partial y} = x^2\cos(xy) - x^2 y\sin(xy)\,x - \frac{x}{(1 + y)^2}.$
For functions of $n$ variables, the recipe is the same. Consider a function $f(x_1, \dots, x_n)$ and a point $(x_{10}, \dots, x_{n0}) \in \mathbb{R}^n$, then for $j = 1, \dots, n$ the partial derivative is
$\frac{\partial f}{\partial x_j}(x_{10}, \dots, x_{n0}) = \lim_{x_j \to x_{j0}} \frac{f(x_{10}, \dots, x_j, \dots, x_{n0}) - f(x_{10}, \dots, x_{n0})}{x_j - x_{j0}}$
if the limit on the right-hand side exists. Similarly, at an arbitrary point $(x_1, \dots, x_n)$ we have
$\frac{\partial f}{\partial x_j}(x_1, \dots, x_n) = \lim_{h \to 0} \frac{f(x_1, \dots, x_j + h, \dots, x_n) - f(x_1, \dots, x_n)}{h}.$
As explained above, the computation of partial derivatives in the case of $n$ variables follows the same rules as for $n = 2$.
Example 1.4.8. We compute the partial derivatives of $f(x, y, z) = (xy + z^2)e^{yz}$:
$\frac{\partial f}{\partial x} = ye^{yz}, \quad \frac{\partial f}{\partial y} = xe^{yz} + (xy + z^2)ze^{yz}, \quad \frac{\partial f}{\partial z} = 2ze^{yz} + (xy + z^2)ye^{yz}.$
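The partial derivatives of Example 1.4.8 can be compared against a numerical approximation of the defining limit. The sketch below uses a central difference quotient, which is a symmetric variant of the one-sided quotient in the definition and is chosen here only because it is more accurate for a given step; the helper names, the step size and the sample point are assumptions.

```python
import math

def f(x, y, z):
    """The function of Example 1.4.8."""
    return (x * y + z * z) * math.exp(y * z)

def grad_f(x, y, z):
    """Partial derivatives as computed by hand in Example 1.4.8."""
    e = math.exp(y * z)
    return (y * e,
            x * e + (x * y + z * z) * z * e,
            2 * z * e + (x * y + z * z) * y * e)

def central_difference(func, point, j, h=1e-6):
    """Approximate the j-th partial derivative of func at point."""
    plus, minus = list(point), list(point)
    plus[j] += h
    minus[j] -= h
    return (func(*plus) - func(*minus)) / (2 * h)

p = (0.5, -0.3, 0.8)   # arbitrary sample point
numeric = tuple(central_difference(f, p, j) for j in range(3))
exact = grad_f(*p)
print(numeric)
print(exact)
```

Only the j-th coordinate is perturbed in each quotient, which mirrors the idea that the other variables are held fixed as constants.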
We now define an object known as the gradient, which is our first example of a differential operator.
Definition 1.4.9. Let $f(x_1, \dots, x_n)$ be a function of several variables in Cartesian coordinates for which partial derivatives with respect to all variables exist. The gradient of $f$ is defined as
$\nabla f(x_1, \dots, x_n) = \left(\frac{\partial f}{\partial x_1}, \dots, \frac{\partial f}{\partial x_n}\right).$
If a function is not specified, we write
$\nabla := \left(\frac{\partial}{\partial x_1}, \dots, \frac{\partial}{\partial x_n}\right).$
As the examples above show, the chain rule applies in the same way as for functions of one variable. However, there is an important formula concerning the chain rule for functions of several variables which generalizes the regular chain rule from elementary calculus. This formula is valuable for computing derivatives in concrete examples, but it is mostly used theoretically and we refer to this formula often in the following chapters.
Proposition 1.4.10. Consider a function of several variables $g(x_1, \dots, x_n)$ and suppose that $x_j = \phi_j(u_1, \dots, u_k)$ for $j = 1, \dots, n$. Define
$f(u_1, \dots, u_k) := g(\phi_1(u_1, \dots, u_k), \dots, \phi_n(u_1, \dots, u_k)).$
Then, for $i = 1, \dots, k$ we have
$\frac{\partial f}{\partial u_i} = \frac{\partial g}{\partial x_1}\frac{\partial \phi_1}{\partial u_i} + \cdots + \frac{\partial g}{\partial x_n}\frac{\partial \phi_n}{\partial u_i}$
where the partial derivatives of $g$ are evaluated at $x_j = \phi_j(u_1, \dots, u_k)$ for $j = 1, \dots, n$.
The proof can be obtained as a special case of a more general chain rule which we present in Chapter 5, and we choose not to burden the presentation with lengthy calculations at this stage. Let us look at this formula for the cases of a few variables.
Example 1.4.11. Let $g(x, y)$ with $x = \phi_1(u_1, u_2)$, $y = \phi_2(u_1, u_2)$ and $f(u_1, u_2) = g(\phi_1(u_1, u_2), \phi_2(u_1, u_2))$. Then
$\frac{\partial f}{\partial u_1} = \frac{\partial g}{\partial x}\frac{\partial \phi_1}{\partial u_1} + \frac{\partial g}{\partial y}\frac{\partial \phi_2}{\partial u_1}.$
It is customary to label the functions of $u$ with the variables $x, y$ instead of $\phi_1, \phi_2$. That is, we write $f(u_1, u_2) = g(x(u_1, u_2), y(u_1, u_2))$ and
$\frac{\partial f}{\partial u_2} = \frac{\partial g}{\partial x}\frac{\partial x}{\partial u_2} + \frac{\partial g}{\partial y}\frac{\partial y}{\partial u_2}.$
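We use the formula to compute the following partial derivatives.
Example 1.4.12. Let $f(x, y, z) = x^2 + y^2 + xz^2$ and consider the change of variables in spherical coordinates $x = x(\rho, \varphi, \theta) = \rho\cos\theta\sin\varphi$, $y = y(\rho, \varphi, \theta) = \rho\sin\theta\sin\varphi$ and $z = z(\rho, \varphi, \theta) = \rho\cos\varphi$. We consider $f(x(\rho, \varphi, \theta), y(\rho, \varphi, \theta), z(\rho, \varphi, \theta))$ and compute
$\frac{\partial f}{\partial \rho} = \frac{\partial f}{\partial x}\frac{\partial x}{\partial \rho} + \frac{\partial f}{\partial y}\frac{\partial y}{\partial \rho} + \frac{\partial f}{\partial z}\frac{\partial z}{\partial \rho}$
where the partial derivatives of $f$ are evaluated at $x = \rho\cos\theta\sin\varphi$, $y = \rho\sin\theta\sin\varphi$ and $z = \rho\cos\varphi$. We obtain
$\frac{\partial f}{\partial x} = (2x + z^2)\big|_{x = x(\rho,\varphi,\theta),\, z = z(\rho,\varphi,\theta)} = 2\rho\cos\theta\sin\varphi + \rho^2\cos^2\varphi,$
$\frac{\partial f}{\partial y} = 2y\big|_{y = y(\rho,\varphi,\theta)} = 2\rho\sin\theta\sin\varphi,$
$\frac{\partial f}{\partial z} = 2xz\big|_{x = x(\rho,\varphi,\theta),\, z = z(\rho,\varphi,\theta)} = 2\rho^2\cos\theta\sin\varphi\cos\varphi$
and
$\frac{\partial x}{\partial \rho} = \cos\theta\sin\varphi, \quad \frac{\partial y}{\partial \rho} = \sin\theta\sin\varphi, \quad \frac{\partial z}{\partial \rho} = \cos\varphi.$
Putting all those calculations together we obtain
$\frac{\partial f}{\partial \rho} = 2\rho\cos^2\theta\sin^2\varphi + 2\rho\sin^2\theta\sin^2\varphi + 3\rho^2\cos\theta\sin\varphi\cos^2\varphi.$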
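The result of Example 1.4.12 can be checked by differentiating the composed function numerically with respect to ρ. This is a sketch under stated assumptions: the function names, the sample values of (ρ, φ, θ) and the central difference step are illustrative choices, not part of the text.

```python
import math

def f_in_spherical(rho, phi, theta):
    """f(x, y, z) = x^2 + y^2 + x z^2 composed with the spherical change of variables."""
    x = rho * math.cos(theta) * math.sin(phi)
    y = rho * math.sin(theta) * math.sin(phi)
    z = rho * math.cos(phi)
    return x * x + y * y + x * z * z

def df_drho_chain_rule(rho, phi, theta):
    """Closed form obtained at the end of Example 1.4.12."""
    c, s = math.cos(theta), math.sin(theta)
    sp, cp = math.sin(phi), math.cos(phi)
    return (2 * rho * c * c * sp * sp
            + 2 * rho * s * s * sp * sp
            + 3 * rho ** 2 * c * sp * cp * cp)

def df_drho_numeric(rho, phi, theta, h=1e-6):
    """Central difference in rho with phi and theta held fixed."""
    return (f_in_spherical(rho + h, phi, theta)
            - f_in_spherical(rho - h, phi, theta)) / (2 * h)

rho, phi, theta = 1.3, 0.7, 2.1   # arbitrary sample values
chain_value = df_drho_chain_rule(rho, phi, theta)
numeric_value = df_drho_numeric(rho, phi, theta)
print(chain_value, numeric_value)
```

The same pattern, with the difference taken in φ or θ instead of ρ, gives a way to check the other two partial derivatives once they have been computed by hand.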
1.4.2 Open and closed sets
Before we begin our discussion on functions, it is important to notice that sometimes only some subset of Euclidean space may be of interest. Here is an important type of subset.
Definition 1.4.13. A subset $U \subset \mathbb{R}^n$ is open if for all points $p \in U$, there exists a ball
$B_\delta(p) := \{x \in \mathbb{R}^n \mid \|x - p\| < \delta\}$
of radius $\delta > 0$ containing $p$ such that $B_\delta(p) \subset U$.
For $p \in \mathbb{R}^n$ and $\delta > 0$, a ball $B_\delta(p)$ is open. In $\mathbb{R}$, a ball is just an interval of length $2\delta$ centered at $p$. In $\mathbb{R}^2$ with the Euclidean norm, $B_\delta(p)$ is a disk of radius $\delta$ centered at $p$. In $\mathbb{R}^3$, $B_\delta(p)$ is a ball in the common sense of the word. In general, a ball of radius $\delta$ centered at $p$ is the set of points whose distance to $p$ is less than $\delta$.
Example 1.4.14. On the real line, let $a < b$ and consider an interval $(a, b)$. We verify using the definition that this interval is open. For each point $p$ in the interval $(a, b)$, we must find a ball $B_\delta(p)$ of a certain radius $\delta > 0$ such that $B_\delta(p) \subset (a, b)$.
Fig. 1.15. $B_\delta(p)$ is a disk of radius $\delta$.
The point $p$ is either closer to $a$, closer to $b$ or in the middle of the interval. Let $\epsilon$ be the smaller of the distances between $p$ and $a$ and between $p$ and $b$. This is written
$\epsilon = \min(|p - a|, |p - b|).$
See Figure 1.16 for an illustration. Let $\delta = \epsilon/2$ and choose an arbitrary point $x \in B_\delta(p)$. By definition of $B_\delta(p)$, we must have $|x - p| < \delta$. By choosing $\delta$ to be half the distance to the nearest boundary point, this makes sure that $x \in (a, b)$. Therefore, any point $x \in B_\delta(p)$ is also in $(a, b)$. This means $B_\delta(p) \subset (a, b)$. Because $p$ is chosen arbitrarily in $(a, b)$, this completes the verification that $(a, b)$ is open.
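The choice of radius in Example 1.4.14 is simple enough to state as a small function. The sketch below is only an illustration of the argument, with a hypothetical function name: it returns δ = ε/2 for a point of (a, b) and rejects points outside the open interval, including the endpoints.

```python
def ball_radius_inside_interval(p, a, b):
    """Return delta = eps / 2 with eps = min(|p - a|, |p - b|) for p in (a, b)."""
    if not (a < p < b):
        raise ValueError("p must lie in the open interval (a, b)")
    eps = min(abs(p - a), abs(p - b))
    return eps / 2

a, b = 0.0, 1.0
samples = [0.001, 0.25, 0.5, 0.9, 0.999]
# For each p, the ball (p - delta, p + delta) must stay strictly inside (a, b).
contained = [a < p - ball_radius_inside_interval(p, a, b)
             and p + ball_radius_inside_interval(p, a, b) < b
             for p in samples]
print(contained)
```

Calling the function with p = a or p = b raises an error, which matches the observation that no ball around a boundary point fits inside the interval.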
Fig. 1.16. Three possible cases of location of $p$ in $(a, b)$: midpoint, right and left of midpoint
Consider the following subsets of $\mathbb{R}$: $(a, b]$, $[a, b)$ and $[a, b]$. Those three sets are not open and this can also be seen using the definition. The problem lies in the inclusion of at least one boundary point in the set. Consider the first case $(a, b]$. Suppose $p = b$, then any ball $B_\delta(p)$ with $\delta > 0$ contains points $x$ such that $b < x < b + \delta$. But this means $x \notin (a, b]$ and so $B_\delta(p) \not\subset (a, b]$ no matter how small $\delta > 0$ is chosen. Thus, $(a, b]$ is not open. The same argument applies to the other cases. In particular, if $a = b$ then $[a, b]$ consists of one point and so singletons are not open sets.
Fig. 1.17. $(a, b]$ with $p = b$.
We now look at examples of sets in $\mathbb{R}^2$ and $\mathbb{R}^3$.
Example 1.4.15. Consider the set
$O := \{(x, y) \in \mathbb{R}^2 \mid x > y\}.$
This corresponds to the half-plane shown in Figure 1.18. We now show it is an open set. The key here is the strict inequality used in the definition of the set. Let $p = (x_0, y_0) \in O$. Then, $x_0 > y_0$. Let $\epsilon$ be the distance between $p$ and the bisector line $x = y$. Because $x_0 > y_0$, then $\epsilon > 0$. Choose $\delta = \epsilon/4$; the ball
$B_\delta(p) = \{(x, y) \in \mathbb{R}^2 \mid \|(x, y) - (x_0, y_0)\| < \delta\}$
is strictly contained in $B_\epsilon(p)$ and so strictly contained in $O$.
Fig. 1.18. Balls $B_\epsilon(p)$ and $B_\delta(p)$ from Example 1.4.15
Here is an example of an open set in higher dimension defined in an abstract way.
Example 1.4.16. Consider the motion of three (celestial) bodies in $\mathbb{R}^3$ of mass $m_1, m_2, m_3$. Let $q_1, q_2, q_3 \in \mathbb{R}^3$ be the position vectors of the three bodies. The set of non-collision positions defined by
$C = \{(q_1, q_2, q_3) \in (\mathbb{R}^3)^3 \mid q_1 \neq q_2, \ q_2 \neq q_3 \text{ and } q_1 \neq q_3\}$
is an open subset of $(\mathbb{R}^3)^3$.
An interval $[a, b]$ containing its boundary points is known as a closed interval. The concept of a closed set is trickier to define than that of open sets and closed intervals. Even on the real line, closed sets can have bizarre properties which are beyond the scope of this document. We do not define closed sets formally, but focus our attention on a special type of set which is a generalization of the closed interval.
Example 1.4.17. Consider an open set $U$ in $\mathbb{R}^2$ and add the boundary to it to form a set $F$. The set $F$ is a closed set. For instance, let
$U = \{(x, y) \in \mathbb{R}^2 \mid |x| + |y| < 1\}.$
This is the diamond-shaped region shown in Figure 1.19. If we add the boundary, that is, the lines given by $|x| + |y| = 1$, the set
$F = \{(x, y) \in \mathbb{R}^2 \mid |x| + |y| \le 1\}$
is a closed set.
One way to find out if a set is closed is by using the following result.
Proposition 1.4.18. A set $U \subset \mathbb{R}^n$ is open if and only if its complement $U^c$ is closed.
Fig. 1.19. The open set U is inside the diamond bounded by the lines x + y = 1, −x + y = 1, −x − y = 1 and x − y = 1. The closed set F is composed of U and includes the boundary lines.
We omit the proof of this proposition as it is also beyond the scope of this text.
Example 1.4.19. Proposition 1.4.18 shows that the complement of the set $O$ in Example 1.4.15 defined by
$O^c = \{(x, y) \in \mathbb{R}^2 \mid x \le y\}$
is closed. In this case, the portion with no boundary is not a problem; infinity acts as a boundary to this set.
Exercises
(1) Find the domain for each of the following mappings.
(a) $f(x, y) = \frac{1}{x^2 - y^2}$.
(b) $g(r, \theta, z) = \sqrt{1 - r^2 - z^2}$. Describe the region geometrically.
(c) $h(\rho, \varphi, \theta) = \frac{\rho^2 - 1}{1 - \cos\theta\sin\theta}$.
(d) $r(t) = (\sqrt{t^2 - 1}, \tan(t))$
(2) Compute all the partial derivatives and express the gradient of each function.
(a) $f(x, y) = \ln(\cos(xy))$
(b) $f(x, y) = \frac{xy}{x^2 + y^2}$
(c) $f(x, y, z) = (x^2 + y^2 + z^2)\cos(1/(x^2 + y^2 + z^2))$
(d) $f(x_1, x_2, x_3, x_4) = x_1 x_4 e^{x_1 x_2 x_3 x_4}$
(3) Compute the partial derivatives $\frac{\partial f}{\partial \theta}$ and $\frac{\partial f}{\partial \varphi}$ in Example 1.4.12 using the formula of Proposition 1.4.10. Write explicitly the composition of $f$ with the functions of $\rho$, $\varphi$ and $\theta$ and compute the partial derivatives with respect to $\rho$, $\varphi$ and $\theta$ using the regular chain rule for functions of one variable. If you do not obtain the same thing with the two methods, verify your calculations.
(4) Determine whether the sets below are open, closed or neither.
(a) $A = \{(x, y) \in \mathbb{R}^2 \mid x^2 + y^2 \le 1, \ y > 0\}$
(b) $B = \{(x, y) \in \mathbb{R}^2 \mid -1 \le x \le 2, \ 3 \le y \le 4\}$
(c) $C = \{(x, y) \in \mathbb{R}^2 \mid x - y \neq 0\}$
(d) $D = \{(r, \theta) \in \mathbb{R}^2 \mid r > 1, \ 0 \le \theta \le \pi\}$
(e) $E = \{(r, \theta) \in \mathbb{R}^2 \mid 1 \le r < 2\}$
(5) The union of two open sets is an open set. Give an explanation (or a proof if you can) of why this is true.
(6) The intersection of two closed sets is a closed set. Give an explanation (or a proof if you can) of why this is true.
1.5 Parametric representation of curves
In this section, we begin the study of curves defined using so-called parametric representations. A curve $C$ in $\mathbb{R}^n$ is a geometric object whose points can be described by a single number: its parameter. Let $t \in \mathbb{R}$; a curve $C$ in $\mathbb{R}^n$ has parametric representation given by
$(x_1(t), \dots, x_n(t))$
where $t \in [a, b]$ with $a < b$, where $a = -\infty$ and $b = \infty$ are allowed.
Remark 1.5.1. Note that for a given curve $C$, the choice of parametric representation is not unique. Therefore, one has to distinguish between the geometric object $C$ and the possible representations that can describe $C$.
The graphical representation of the real-line comes with an “orientation”; the arrow points to the right and this determines the direction of increasing real numbers. The choice of pointing to the right is arbitrary and is the convention adopted, possibly unanimously, and is called the positive orientation of the real-line. If the arrow points to the left, so that positive numbers increase in that direction, we talk about negative orientation . An orientation of a curve C is the orientation given by a consistent direction given by an arrow along the length of C .
For a curve $C$ with parametric representation $r(t)$ with $t \in [a, b]$, the orientation of $C$ is given by the direction of travel along the curve $C$ as $t$ increases from $a$ to $b$.
Fig. 1.20. Parabola $(t, t^2)$ with $t \in [0, 1]$.
Example 1.5.2. Let $y = f(x)$ where $f : [a, b] \to \mathbb{R}$ is a function. The graph of $f(x)$ is a curve $C$ in $\mathbb{R}^2$. The parametric representation is given by
$x(t) := x_1(t) = t, \quad y(t) := x_2(t) = f(t).$
Consider the parabola $C$ given by $y = x^2$ defined for all $x \in \mathbb{R}$; it has parametric representation $x(t) = t$, $y(t) = t^2$ with $t \in \mathbb{R}$.
Now, the portion of the parabola given by $y = x^2$ with $x \in [0, 1]$ is a different geometric object which we denote $C_1$ and has parametric representation $x(t) = t$, $y(t) = t^2$ with $t \in [0, 1]$.
We can use parametric representations to describe more complicated curves which cannot be obtained by a single function $y = f(x)$.
Example 1.5.3. A circle of radius $r_0$ has equation $x^2 + y^2 = r_0^2$. A parametric representation is given by
$x(t) = r_0\cos t, \quad y(t) = r_0\sin t, \quad t \in [0, 2\pi)$
because substituting $x(t)$ and $y(t)$ in the equation yields an identity for all $t \in [0, 2\pi)$. If the equation of the circle is given in polar coordinates $r = r_0$, then parametric equations are $r(t) = r_0$, $\theta(t) = t$, $t \in [0, 2\pi)$.
In some simple cases, it is possible to convert from a parametric representation of a curve to Cartesian coordinates in R2 . The cases where either x(t) = t or y(t) = t are straightforward as noticed above. For more complicated cases, the approach is to find functions of x(t) and y(t) from which we obtain an equality.
Example 1.5.4. Consider the curve $C$ given by $x(t) = 2t^2$, $y(t) = t^6$. Then,
$\frac{1}{8}x(t)^3 = t^6 = y(t)$
and so $C$ lies on the curve $x^3 = 8y$; since $x(t) = 2t^2 \ge 0$, only the part with $x \ge 0$ is traced.
Although this approach is sometimes useful for graphing, our main emphasis in this textbook is on parametric representations and their properties. Let us look at a commonly presented example in $\mathbb{R}^3$.
Example 1.5.5. Consider the curve $C$ given by $x(t) = \cos t$, $y(t) = \sin t$ and $z(t) = t$ for $t \in [0, 4\pi]$. We see that in the $xy$-plane, this is just a circle and in the $z$-direction, we have linear growth. The curve $C$ is called a helix and is illustrated in Figure 1.21.
Fig. 1.21. Helix curve of radius 1
We now look at an example with quite an intricate structure in $\mathbb{R}^3$.
Example 1.5.6. Consider the curve $C_1$ given by
$x(t) = 4\cos t + \cos t\cos(3t), \quad y(t) = 4\sin t + \sin t\cos(3t), \quad z(t) = 2\sin(3t).$
We see from Figure 1.22 (left) that we obtain a closed curve. Compare with the curve $C_2$ obtained by changing $3t$ to $\sqrt{2}t$ (Figure 1.22 right),
$x(t) = 4\cos t + \cos t\cos(\sqrt{2}t), \quad y(t) = 4\sin t + \sin t\cos(\sqrt{2}t), \quad z(t) = 2\sin(\sqrt{2}t).$
This curve lies on a surface which has the shape of a donut or bagel (depending on your taste!), known as a “torus”. In fact, the curve $C_1$ also lies on the same torus.
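The claim that both curves lie on the same torus-shaped surface can be tested numerically. Writing the curves as x = (4 + cos ωt) cos t, y = (4 + cos ωt) sin t, z = 2 sin ωt suggests the implicit equation (√(x² + y²) − 4)² + (z/2)² = 1; this equation is derived here for the check and is not stated in the text. The sketch below assumes the second curve uses the frequency ω = √2, and its function names are illustrative.

```python
import math

def curve(t, omega):
    """Point of the curve of Example 1.5.6 with inner frequency omega."""
    radial = 4 + math.cos(omega * t)
    return radial * math.cos(t), radial * math.sin(t), 2 * math.sin(omega * t)

def surface_residual(x, y, z):
    """Zero exactly when (x, y, z) satisfies (sqrt(x^2 + y^2) - 4)^2 + (z/2)^2 = 1."""
    return (math.hypot(x, y) - 4) ** 2 + (z / 2) ** 2 - 1

def closes_up(omega, period=2 * math.pi, tol=1e-9):
    """True if the curve returns to its starting point after one period in t."""
    start, end = curve(0.0, omega), curve(period, omega)
    return all(abs(s - e) < tol for s, e in zip(start, end))

times = [0.1 * k for k in range(1000)]
max_res_c1 = max(abs(surface_residual(*curve(t, 3))) for t in times)
max_res_c2 = max(abs(surface_residual(*curve(t, math.sqrt(2)))) for t in times)
print(max_res_c1, max_res_c2, closes_up(3), closes_up(math.sqrt(2)))
```

With ω = 3 the curve returns to its starting point at t = 2π, while with ω = √2 it does not, which is consistent with the closed and non-closed curves described in the example.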
Parametric representations in higher dimensions commonly arise in many problems. However, their graphical representations can only be glimpsed upon by looking at projections to R3 . Consider the following one which is reminiscent of the previous example.
Fig. 1.22. Left: closed curve lying on a torus, Right: non-closed curve lying on a torus.
Example 1.5.7. Consider the curve $C$ in $\mathbb{R}^4$ given by the parametrization
$x_1(t) = \cos t, \quad x_2(t) = \sin t, \quad x_3(t) = \cos\sqrt{2}t, \quad x_4(t) = \sin\sqrt{2}t.$
We see that the projections to the $x_1, x_2$ and $x_3, x_4$ planes are just circles with periods respectively of $2\pi$ and $2\pi/\sqrt{2}$. This kind of parametric curve describes the motion of a double pendulum subject to a small displacement. See Figure 1.23 obtained for $t \in [0, 100]$.
Fig. 1.23. Parametric curve of Example 1.5.7
One can think of parametric curves as describing the motion of a point particle in space. This leads to a subtle aspect of parametric curves with respect to their intersections as the following example shows.
Example 1.5.8. Two particles $A$ and $B$ travel along the paths given by $x_A(t) = t$, $y_A(t) = t^3$ and $x_B(t) = \cos t$, $y_B(t) = \sin t$. Do the particles collide? The paths traced by these parametric curves intersect in two points as seen from Figure 1.24. However, in order to have the particles collide, one would need an intersection to occur for the same value $t_0$ on both curves. It is not the case in this particular example. To check this, one would need to find a value $t_0$ such that $t_0 = \cos t_0$. By drawing the graphs of $t$ and $\cos t$, we see that there is exactly one such solution, $t_0 \approx 0.739$. For this (approximate) solution, compute $t_0^3 - \sin t_0$ and verify that this is not zero.
Fig. 1.24. The curves have two intersection points.
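The check suggested in Example 1.5.8 can be carried out numerically. The sketch below solves t = cos t by bisection on [0, 1] and then evaluates t³ − sin t at the root; the bisection helper and its tolerance are assumptions made for illustration. Since t − cos t has derivative 1 + sin t ≥ 0, it is nondecreasing, so the root found is the only one.

```python
import math

def bisect(g, lo, hi, tol=1e-12):
    """Find a root of g in [lo, hi], assuming g(lo) and g(hi) have opposite signs."""
    g_lo = g(lo)
    if g_lo * g(hi) > 0:
        raise ValueError("g must change sign on [lo, hi]")
    while hi - lo > tol:
        mid = (lo + hi) / 2
        g_mid = g(mid)
        if g_lo * g_mid <= 0:
            hi = mid                 # root lies in [lo, mid]
        else:
            lo, g_lo = mid, g_mid    # root lies in [mid, hi]
    return (lo + hi) / 2

# Equal x-coordinates at the same time: t = cos t.
t0 = bisect(lambda t: t - math.cos(t), 0.0, 1.0)
# Difference of the y-coordinates at that time.
gap = t0 ** 3 - math.sin(t0)
print(t0, gap)
```

A gap that is clearly different from zero means the two particles have the same x-coordinate at time t0 but different y-coordinates, so they do not collide.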
1.5.1 Conics
We now look more closely at the planar curves known as conics. Those are obtained geometrically by taking various cross-sections of a cone and have the following definitions in their simplest forms. Those are for $a, b \in \mathbb{R}$:
(1) Ellipse:
$\frac{x^2}{a^2} + \frac{y^2}{b^2} = 1$
Fig. 1.25. Ellipse with a = 2 and b = 1.
(2) Hyperbola: Left-Right and Up-Down cases.
$\frac{x^2}{a^2} - \frac{y^2}{b^2} = 1$ or $\frac{y^2}{a^2} - \frac{x^2}{b^2} = 1.$
Fig. 1.26. Left-Right hyperbola with a = b = 1. The dashed lines y = x and y = −x are the asymptotes of the hyperbola.
(3) Parabola:
$4ay = x^2$ or $4ax = y^2.$
For the ellipse, the larger of $a$ and $b$ represents the major axis and the other one, the minor axis. The ellipse in Figure 1.25 has major axis $a = 2$ and minor axis $b = 1$. In the case of the hyperbola, $a$ is the distance between the origin and the vertices, which are the nearest points of the hyperbola to the centre. For $|x|$ large, the Left-Right hyperbola approaches the asymptotes of slope $\pm b/a$. The parabola $4ay = x^2$ opens up or down depending on whether $a > 0$ or $a < 0$. The case $4ax = y^2$ opens either left or right.
Fig. 1.27. Parabola $4ay = x^2$ with a = 1/4.
We now describe the conics using parametric representations. The first way one can do this is by solving for $y$ as a function of $x$. In the case of the ellipse, we have two functions
$y = f_\pm(x) := \pm b\sqrt{1 - \frac{x^2}{a^2}}$
and $f_\pm(x)$ is well-defined as a function for $x \in [-a, a]$. This leads to
$x(t) = t, \quad y(t) = \pm b\sqrt{1 - \frac{t^2}{a^2}}, \quad t \in [-a, a].$
Another parametrization similar to the parametrization of the circle is obtained as
$x(t) = a\cos t, \quad y(t) = b\sin t, \quad t \in [0, 2\pi).$
Indeed,
$\frac{x(t)^2}{a^2} + \frac{y(t)^2}{b^2} = \frac{a^2\cos^2 t}{a^2} + \frac{b^2\sin^2 t}{b^2} = 1.$
For the hyperbola, the $y = f(x)$ parametrization is done as above. Recall the hyperbolic functions
$\cosh t = \frac{e^t + e^{-t}}{2}$ and $\sinh t = \frac{e^t - e^{-t}}{2}.$
Another parametrization of the hyperbola is given by
$x(t) = a\cosh t, \quad y(t) = b\sinh t, \quad t \in \mathbb{R}.$
Since $\cosh t \ge 1$, this traces only one branch of the Left-Right hyperbola (the branch $x \ge a$ when $a > 0$). This justifies the name “hyperbolic function”.
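The trigonometric and hyperbolic parametrizations can be verified by substituting sample points into the implicit equations of the ellipse and the Left-Right hyperbola. This is a minimal sketch with illustrative function names and arbitrary sample values a = 2, b = 1.

```python
import math

def ellipse_point(a, b, t):
    """Point (a cos t, b sin t) of the ellipse x^2/a^2 + y^2/b^2 = 1."""
    return a * math.cos(t), b * math.sin(t)

def hyperbola_point(a, b, t):
    """Point (a cosh t, b sinh t) of the hyperbola x^2/a^2 - y^2/b^2 = 1."""
    return a * math.cosh(t), b * math.sinh(t)

a, b = 2.0, 1.0
ellipse_residuals = []
hyperbola_residuals = []
for k in range(-20, 21):
    t = 0.1 * k                      # t ranges over [-2, 2]
    x, y = ellipse_point(a, b, t)
    ellipse_residuals.append(abs(x * x / a**2 + y * y / b**2 - 1))
    x, y = hyperbola_point(a, b, t)
    hyperbola_residuals.append(abs(x * x / a**2 - y * y / b**2 - 1))

print(max(ellipse_residuals), max(hyperbola_residuals))
```

The hyperbola check relies on the identity cosh² t − sinh² t = 1, the hyperbolic counterpart of cos² t + sin² t = 1 used for the ellipse.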
Exercises
(1) Use a computer software to draw the curves given by the following parametric representations. (a) x(t) = t 2 , y(t) = cos t; t ∈ [0, 2π] (b) x(t) = t cos t, y(t) = t sin t; t ∈ [0, π] (c) x(t) = t 2 , y(t) = t 3 ; t ∈ [−2, 2] (2) Find the intersections of the curves in Exercise 1. (A difficult exercise!) (3) Verify the statement of Example 1.5.8 by using the approach suggested. (4) Transform the following equations of conics into their standard form and draw a rough sketch of the conic. (a) 2x2 + 5y 2 = 3 (b) 4y 2 − x2 = −3 (5) Find the intersection points of the two conics given. (a) 4x2 + y 2 = 1 and x2 + 4y 2 = 1. (b) x2 − 3y 2 = 1 and 3x2 + 5y 2 = 1. (c) 3y 2 − x2 = 1 and y = 4x2 . (d) 9x2 − 3y 2 = 3 and y 2 − x3 = 1. 2
2
Fig. 1.28. Ellipsoid and Hyperbolic Paraboloid
Fig. 1.29. Cone and Elliptic Paraboloid
1.6 Quadrics
The quadrics are a family of surfaces in $\mathbb{R}^3$ which contains some well-known examples such as the cone and the ellipsoid. The general equation of a quadric is given by:
$\alpha x^2 + \beta y^2 + \gamma z^2 + 2\delta xy + 2\epsilon xz + 2\lambda yz + \kappa x + \mu y + \nu z + p = 0$
which covers the case of quadrics anywhere in space and in any orientation. We focus on the special cases where the quadrics are centered at the origin and have an orientation given by the $z$-axis. We list the quadrics in their simplest form below. They are the ones that are used in the remainder of the book. Let $a, b, c \in \mathbb{R}$:
(i) Ellipsoid:
$\frac{x^2}{a^2} + \frac{y^2}{b^2} + \frac{z^2}{c^2} = 1.$
Fig. 1.30. Hyperboloid of one sheet and Hyperboloid of two sheets
(ii) Hyperboloid of one sheet:
$\frac{x^2}{a^2} + \frac{y^2}{b^2} - \frac{z^2}{c^2} = 1.$
(iii) Hyperboloid of two sheets:
$-\frac{x^2}{a^2} - \frac{y^2}{b^2} + \frac{z^2}{c^2} = 1.$
(iv) Elliptic paraboloid:
$\frac{x^2}{a^2} + \frac{y^2}{b^2} - z = 0.$
(v) Hyperbolic paraboloid:
$\frac{x^2}{a^2} - \frac{y^2}{b^2} - z = 0.$
(vi) Cone:
$\frac{z^2}{c^2} = \frac{x^2}{a^2} + \frac{y^2}{b^2}.$
Note that all the quadrics are aligned along the $z$-axis in the notation above. However, those quadrics can also be aligned along the $x$ and $y$ axes and the equations are the same up to a permutation of $x$, $y$ and $z$. In the case of the elliptic paraboloid, one obtains
$\frac{y^2}{a^2} + \frac{z^2}{b^2} - x = 0$ and $\frac{x^2}{a^2} + \frac{z^2}{b^2} - y = 0.$
The geometry of the quadrics can be understood by taking intersections with planes. We define horizontal planes by setting $z = K_3$, vertical planes parallel to the $xz$-plane by setting $y = K_2$ and vertical planes parallel to the $yz$-plane by setting $x = K_1$ for some $K_1, K_2, K_3 \in \mathbb{R}$ acting as parameters which we can vary. The intersections of quadrics with these planes are called traces. The following examples illustrate the traces of some quadrics.
Example 1.6.1. Consider an ellipsoid
$\frac{x^2}{9} + \frac{y^2}{4} + z^2 = 1.$
Consider the traces given by $z = K_3$; we obtain
$\frac{x^2}{9} + \frac{y^2}{4} = 1 - K_3^2$
which is the equation of an ellipse as long as $1 - K_3^2 \ge 0$. Thus, the ellipses are getting smaller as $|z|$ increases from zero.
Fig. 1.31. A hyperbolic paraboloid with its traces: a hyperbola for the horizontal trace and two parabolae, one for each vertical trace.
Example 1.6.2. Consider a hyperbolic paraboloid
$\frac{x^2}{2} - \frac{y^2}{10} - z = 0$
and take all three types of traces. Setting $z = K_3$ (with $K_3 > 0$) we can simplify the equation to
$\frac{x^2}{(\sqrt{2K_3})^2} - \frac{y^2}{(\sqrt{10K_3})^2} = 1$
which is the equation of a hyperbola. Setting $x = K_1$ we have
$\frac{K_1^2}{2} - \frac{y^2}{10} - z = 0$
which becomes
$z = -\frac{y^2}{10} + \frac{K_1^2}{2}.$
This is the equation of a parabola opening downward with vertex at $z = K_1^2/2$. Finally, setting $y = K_2$ yields
$z = \frac{x^2}{2} - \frac{K_2^2}{10}$
which is also a parabola, now opening upward and with vertex at $z = -K_2^2/10$. Those are shown in Figure 1.31.
Often, in applications, it is necessary to determine the curves of intersections of quadric surfaces. This is done by equating the equations for the quadrics and performing some manipulations. The following examples illustrate this procedure.
Example 1.6.3. We find the intersection of the cone $z^2 = x^2 + \frac{y^2}{3}$ and the ellipsoid $x^2 + 2y^2 + 3z^2 = 1$. Isolating $z^2$ from both equations, we then have
$x^2 + \frac{y^2}{3} = \frac{1}{3}(1 - x^2 - 2y^2)$
which simplifies to
$4x^2 + 3y^2 = 1 \quad \Rightarrow \quad \frac{x^2}{(1/2)^2} + \frac{y^2}{(1/\sqrt{3})^2} = 1;$
an ellipse with minor axis $a = 1/2$ and major axis $b = 1/\sqrt{3}$. This ellipse is the projection of the intersection curve onto the $xy$-plane.
Quadrics can be expressed in the form of one or two functions of several variables $z = f(x, y)$. The elliptic and hyperbolic paraboloid are written, respectively,
$z = \frac{x^2}{a^2} + \frac{y^2}{b^2}$ and $z = \frac{x^2}{a^2} - \frac{y^2}{b^2}.$
The remaining quadrics are obtained by taking square roots to solve for $z$ and so are expressed using two functions. For instance, the hyperboloid of two sheets is given by
$z = \pm c\sqrt{1 + \frac{x^2}{a^2} + \frac{y^2}{b^2}}.$
Note that it is sometimes more convenient to isolate either $x$ or $y$, rather than $z$.
Example 1.6.4. The surface obtained by rotating the line $x = 2y$ about the $x$-axis is a cone. The equation of this cone is obtained as follows. For a fixed value $x \neq 0$, the trace is a circle of radius $|x|/2$ which projects to the $yz$-plane. The equation of this circle is
$$\left(\frac{x}{2}\right)^2 = y^2 + z^2$$
which we rewrite as
$$x^2 = \frac{y^2}{(1/2)^2} + \frac{z^2}{(1/2)^2} \quad\text{or}\quad x = \pm\sqrt{\frac{y^2}{(1/2)^2} + \frac{z^2}{(1/2)^2}}.$$
Figure 1.32 shows the cone along with the line x = 2y .
1.6.1 Cylinders
The concept of cylinder is a familiar one. One can describe it as a surface with constant horizontal trace given by a circle of fixed radius. The equation in this case is $x^2 + y^2 = r^2$
where r > 0 is the radius of the cylinder. This definition can be generalized to any curve in the plane. Let C be a curve in the xy -plane, then a cylinder over C is the surface with constant horizontal trace given by C .
Fig. 1.32. Cone obtained from revolving the line x = 2y around the x axis.
(1) The curve $y = x^2$ defines a parabolic cylinder.
(2) Note that cylinders do not need to be defined in the $xy$-plane. Consider the parametric curve $x(t) = 0$, $y(t) = t$ and $z(t) = e^{-t^3}$ with $t \in [-1, 1]$. See Figure 1.33.
The intersection of cylinders and quadrics also occurs frequently in applications and defines curves which are often better understood using parametric representations.
Example 1.6.5. We find the intersection curve of the parabolic cylinder $y = x^2$ and the top half of the ellipsoid $x^2 + 3y^2 + 3z^2 = 9$ and describe it using its parametric equations. The top half of the ellipsoid is obtained for $z \geq 0$. To obtain the intersection curve, we substitute $y = x^2$ into the equation of the ellipsoid:
$$y + 3y^2 + 3z^2 = 9.$$
By completing the square for $y$ and dividing by 3, we obtain that this equation has the form
$$\left(y + \frac{1}{6}\right)^2 + z^2 = 3 + \frac{1}{36}.$$
This equation describes a circle of radius $\sqrt{3 + \frac{1}{36}}$ with centre at $(y, z) = (-\frac{1}{6}, 0)$. We can solve for $z$ and keep only the $+$ solution because we are interested in the top half
Fig. 1.33. Cylinder defined by the parametric curve $y(t) = t$, $z(t) = e^{-t^3}$ in the $yz$-plane.
of the ellipsoid:
$$z = \sqrt{\frac{109}{36} - \left(y + \frac{1}{6}\right)^2}. \tag{1.6}$$
But the expression under the square root needs to be nonnegative, so $\frac{109}{36} - \left(y + \frac{1}{6}\right)^2 \geq 0$. We can now describe the intersection curve in parametric form. Let $x(t) = t$, and since $y = x^2$ then $y(t) = t^2$. Equation (1.6) completes the description. We have
$$x(t) = t, \qquad y(t) = t^2, \qquad z(t) = \sqrt{\frac{109}{36} - \left(t^2 + \frac{1}{6}\right)^2}$$
with the domain of $t$ obtained by isolating $t$ in
$$\frac{109}{36} - \left(t^2 + \frac{1}{6}\right)^2 \geq 0.$$
This yields
$$-\sqrt{\sqrt{\frac{109}{36}} - \frac{1}{6}} \;\leq\; t \;\leq\; \sqrt{\sqrt{\frac{109}{36}} - \frac{1}{6}}.$$
See Figure 1.34.
Fig. 1.34. Ellipsoid and cylinder of Example 1.6.5 with its intersection curve.
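A quick numerical check of the parametric description in Example 1.6.5 is sketched below. It is only an illustration: the names `T_MAX`, `curve` and `residuals` are hypothetical, and the code simply substitutes points of the curve into both surface equations.

```python
import math

# Largest admissible |t|, from 109/36 - (t^2 + 1/6)^2 >= 0.
T_MAX = math.sqrt((math.sqrt(109.0) - 1.0) / 6.0)

def curve(t):
    """Point (x, y, z) of the intersection curve for a parameter value t."""
    x = t
    y = t * t
    radicand = 109.0 / 36.0 - (y + 1.0 / 6.0) ** 2
    z = math.sqrt(max(radicand, 0.0))  # clamp tiny negative round-off at the endpoints
    return x, y, z

def residuals(t):
    """How far curve(t) is from the cylinder y = x^2 and the ellipsoid."""
    x, y, z = curve(t)
    on_cylinder = y - x * x
    on_ellipsoid = x * x + 3 * y * y + 3 * z * z - 9.0
    return on_cylinder, on_ellipsoid

for t in (-1.0, -0.3, 0.0, 0.7, 1.2):
    print(t, curve(t), residuals(t))
```

Both residuals should vanish up to round-off for every $t$ in the admissible domain, and $z$ reaches zero at the endpoints $t = \pm$ `T_MAX`.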
Exercises
(1) Write the following quadrics in their standard form and identify the quadric (note that the x , y , z might be permuted with respect to the equations given at the beginning of the section). (a) 2x2 − y2 + z 2 = −3 (b) x − y 2 − 2z 2 = 0 4y 2 = 0 (c) −2z 2 − x2 + 4y x2 3z 2 = 2 (d) 2 y2 + 3z 2 (e) x2 4y 2 5z 2 =
− − −
−
−2
(2) Find the equations of the traces of the quadrics of Exercise 1.
(3) Consider the quadric $Q$ with traces given by two families of hyperbolae and one family of circles defined for all values of the constant $K$. Which quadric is $Q$?
(4) Consider the quadric $Q$ with traces given by two families of parabolae and one family of circles. Which quadric is $Q$?
(5) Find the curve of intersection of the cone $z^2 = 2x^2 + y^2$ with the hyperboloid of two sheets
$$-4x^2 - \frac{y^2}{3} + z^2 = 1.$$
What is this curve?
(6) Find the curve of intersection of the ellipsoid $3x^2 + y^2 + z^2 = 1$ and the hyperboloid of one sheet $x^2 + 3y^2 - z^2 = 1$. What is this curve?
(7) Find the curve of intersection of the elliptic paraboloid $z = 3x^2 + 2y^2$ and the hyperbolic paraboloid $z = x^2 - y^2$. What is this curve?
(8) Consider the cylinders given by $x = y^2$ and $y^2 + z^2 = 4$. Find the intersection curve of those surfaces and write the result in parametric form.
(9) Consider the cylinder given by the curve $x(t) = t^3$, $y(t) = t^2$ and the cone $z^2 = x^2 + y^2$. Write the intersection curve in parametric form.
2 Calculus of Vector Functions
The previous chapter showed how a curve $C$ can be expressed in terms of a parametric representation
$$x_1(t), \ldots, x_n(t)$$
with $t \in [a, b]$. A convenient way to write parametric representations is using vector functions $r : \mathbb{R} \to \mathbb{R}^n$ of the form
$$r(t) = (x_1(t), \ldots, x_n(t)).$$
In this chapter, we show how to apply calculus techniques to curves, and this is done more conveniently by using vector functions rather than parametric representations.
2.1 Derivatives and Integrals
Consider a curve $C$ with parametrization $r(t)$. Examples from the previous section show that many of the curves defined are quite “smooth” in the sense that there are no sharp corners. In fact, they are also continuous in the sense that there is no jump in the tracing of the curve as one follows it. The concepts of continuity and smoothness (i.e. derivatives) for functions $f : [a, b] \to \mathbb{R}$ are among the main topics in an introductory course on Calculus. We show in this section how these concepts extend to curves, but with some warnings!
2.1.1 Limits and Continuity
To discuss limits of vector functions, consider first the case of $r(t) = (t, f(t))$ with $t \in [a, b]$. Those correspond to functions $y = f(x)$ with $x \in [a, b]$. Recall that $f$ has a limit $L \in \mathbb{R}$ at some $x_0 \in [a, b]$ if
$$\lim_{x \to x_0} f(x) = L. \tag{2.1}$$
For convenience of the reader, recall that the exact definition of (2.1) is that
$$\forall \epsilon > 0,\ \exists \delta > 0 \text{ such that if } 0 < |x - x_0| < \delta \text{ then } |f(x) - L| < \epsilon.$$
The vector function $r(t) = (t, f(t))$ describes the same curve as $y = f(x)$. Consider $x(t) = t$ and $y(t) = f(t)$ separately. Then, the limits as $t \to t_0$ exist in both cases:
$$\lim_{t \to t_0} x(t) = \lim_{t \to t_0} t = t_0 \quad\text{and}\quad \lim_{t \to t_0} y(t) = \lim_{t \to t_0} f(t) = L.$$
Therefore, it makes sense to define
$$\lim_{t \to t_0} r(t) = \left(\lim_{t \to t_0} x(t),\ \lim_{t \to t_0} y(t)\right) = (t_0, L).$$
The definition of limit for general vector functions $r(t)$ follows the same approach.
Definition 2.1.1. Consider the vector function
$$r(t) = (x_1(t), \ldots, x_n(t))$$
for $t \in [a, b]$. Then, the limit of $r(t)$ at $t_0 \in [a, b]$ exists if
$$\lim_{t \to t_0} x_j(t) = L_j \in \mathbb{R}$$
for all $j = 1, \ldots, n$. Thus,
$$\lim_{t \to t_0} r(t) = (L_1, \ldots, L_n).$$
Note that for $t_0 = a$ or $t_0 = b$ we take the one-sided limit: $t \to t_0^+$ or $t \to t_0^-$ respectively.
We now look at several examples.
Example 2.1.2. This example is similar to a function $y = f(x)$ with a jump discontinuity. Let
$$y(t) = \begin{cases} -1, & t \in [-1, 0) \\ \phantom{-}1, & t \in [0, 1] \end{cases}$$
then the vector function
$$r(t) = (t, y(t), t^2), \qquad t \in [-1, 1]$$
has two separate pieces depending on whether $t \in [-1, 0)$ or $t \in [0, 1]$, see Figure 2.1.
Fig. 2.1. Discontinuous vector function
With the above definition of limit, we can now discuss the concept of continuity.
Definition 2.1.3 (Continuity). A vector function $r(t)$ with $r : [a, b] \to \mathbb{R}^n$ is continuous at $t_0 \in [a, b]$ if $r(t_0) = r_0 \in \mathbb{R}^n$ and
$$\lim_{t \to t_0} r(t) = r_0.$$
In Example 2.1.2, $r(0) = (0, 1, 0)$ and
$$\lim_{t \to 0^-} (t, y(t), t^2) = (0, -1, 0) \quad\text{and}\quad \lim_{t \to 0^+} (t, y(t), t^2) = (0, 1, 0)$$
so the limit does not exist and $r(t)$ is not continuous at $t = 0$. Consider instead a vector function $r(t) = (t^2, t^3)$. Then, $r(t_0) = (t_0^2, t_0^3)$ and
$$\lim_{t \to t_0} (t^2, t^3) = (t_0^2, t_0^3).$$
In most of our examples, the vector functions are continuous.
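The failure of continuity in Example 2.1.2 can be observed by sampling the vector function on each side of $t = 0$. The sketch below is illustrative only; the helper `one_sided_values` is a hypothetical name, and sampling suggests but does not prove the value of a limit.

```python
def y(t):
    """Second coordinate of Example 2.1.2: -1 on [-1, 0) and 1 on [0, 1]."""
    return -1.0 if t < 0 else 1.0

def r(t):
    return (t, y(t), t * t)

def one_sided_values(t0, side, steps=6):
    """Values r(t0 + side*h) for h = 10^-1, ..., 10^-steps, with side = -1 or +1."""
    return [r(t0 + side * 10.0 ** (-k)) for k in range(1, steps + 1)]

print(one_sided_values(0.0, -1)[-1])  # approaches (0, -1, 0) from the left
print(one_sided_values(0.0, +1)[-1])  # approaches (0, 1, 0) from the right
```

The second coordinates of the two samples stay at $-1$ and $1$, matching the two different one-sided limits.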
2.1.2 Derivatives, Smoothness and Integrals
We begin with the definition of derivative for a vector function.
Definition 2.1.4. A vector function $r(t)$ is differentiable at $t = t_0$ if
$$\lim_{t \to t_0} \frac{r(t) - r(t_0)}{t - t_0}$$
exists. The limit is denoted by $r'(t_0)$ and called the derivative of $r(t)$.
An equivalent definition is obtained by setting $t = t_0 + h$, then
$$r'(t_0) = \lim_{h \to 0} \frac{r(t_0 + h) - r(t_0)}{h}.$$
This form of the definition is useful when one needs to obtain a general formula for the derivative of $r(t)$ independent of $t_0$. A vector function $r(t)$ defined for $t \in [a, b]$ is differentiable on $(a, b)$ if $r(t)$ is differentiable at each $t \in (a, b)$. As one may expect, if each coordinate function of $r(t)$ is differentiable on its interval of definition then $r(t)$ is differentiable. This is expressed in the following result.
Proposition 2.1.5. A vector function $r(t) = (x_1(t), \ldots, x_n(t))$ is differentiable on $(a, b)$ if and only if $x_j(t)$ is differentiable on $(a, b)$ for $j = 1, \ldots, n$.
Proof. The proof can be done as a single calculation as follows. Let $t_0 \in (a, b)$ and we write
$$\begin{aligned}
\lim_{t \to t_0} \frac{r(t) - r(t_0)}{t - t_0}
&= \lim_{t \to t_0} \frac{(x_1(t) - x_1(t_0), \ldots, x_n(t) - x_n(t_0))}{t - t_0} \\
&= \lim_{t \to t_0} \left( \frac{x_1(t) - x_1(t_0)}{t - t_0}, \ldots, \frac{x_n(t) - x_n(t_0)}{t - t_0} \right).
\end{aligned}$$
So, if the left-hand side limit exists, then the right-hand side limit in the last row must exist for each component. Similarly, if the right-hand side limit exists for each component, then the left-hand side limit exists.
From Proposition 2.1.5, we see that the derivative of vector functions depends completely on each coordinate function and it is known from an introductory calculus course how to compute derivatives of functions $x_j : [a, b] \to \mathbb{R}$. We know that $r(t) = (t, |t|)$ is not differentiable at $t = 0$ and this is a consequence of the corner of the function $|t|$ at $t = 0$. However, vector functions are different from functions $y = f(x)$ because the curve $C$ defined by a vector function can have all its coordinates differentiable, but still have a corner as the next example shows.
Example 2.1.6. Consider
$$r(t) = (t^2, t^3), \qquad t \in [-1, 1].$$
Then, $x(t) = t^2$ and $y(t) = t^3$ are differentiable, but Figure 2.2 shows clearly a corner at $t = 0$. We see that $x(t)^3 = t^6 = y(t)^2$ so $C$ corresponds to the curve given by $y^2 = x^3$ and called a cusp curve.
Fig. 2.2. The cusp curve has a sharp corner at $(0, 0)$.
Here is an example without a corner.
Example 2.1.7. Consider $r(t) = (t, t^2, e^t)$ with $t \in [-1, 1]$. Figure 2.3 shows the curve $C$ corresponding to this vector function. The fact that there are no corners at any point shows a greater degree of “smoothness” than the previous example.
Therefore, the concept of derivative and the smoothness of the curve $C$ corresponding to the vector function are not equivalent. We now discuss the question of smoothness of the curve associated with a differentiable vector function $r(t)$. We define smoothness here and show in the following section that smoothness at a point $p$ of a curve $C$ corresponds to the existence of a line tangent to $C$ at $p$.
Fig. 2.3. Curve with no corner.
Definition 2.1.8. A curve $C$ has a smooth parametrization given by a vector function $r(t)$ for $t \in [a, b]$ if $r'(t) \neq 0$ for all $t \in (a, b)$. If $C$ has a smooth parametrization, then we say that $C$ is a smooth curve.
Clearly, the cusp curve is not smooth at $t = 0$. This next example has a non-obvious corner point.
Example 2.1.9. Consider the curve $r(t) = (t^2, t^2)$ with $t \in [-1, 1]$. This curve starts at $(1, 1)$ for $t = -1$ and evolves down the diagonal to $(0, 0)$ and returns on itself until it reaches $(1, 1)$ at $t = 1$. In order to return on its path, the curve must stop at $t = 0$ and indeed, $r'(0) = (0, 0)$. Therefore, this is not a smooth curve.
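The condition $r'(t) \neq 0$ of Definition 2.1.8 can be probed numerically. The following is a minimal sketch using a central-difference approximation of the derivative; the helper names `derivative` and `speed` are hypothetical, and a finite-difference estimate only indicates, rather than proves, where the derivative vanishes.

```python
import math

def derivative(r, t, h=1e-6):
    """Central-difference approximation of r'(t), component by component."""
    forward, backward = r(t + h), r(t - h)
    return tuple((f - b) / (2 * h) for f, b in zip(forward, backward))

def speed(r, t):
    """Approximate norm of r'(t)."""
    return math.sqrt(sum(c * c for c in derivative(r, t)))

cusp = lambda t: (t ** 2, t ** 3)              # Example 2.1.6
diagonal = lambda t: (t ** 2, t ** 2)          # Example 2.1.9
twisted = lambda t: (t, t ** 2, math.exp(t))   # Example 2.1.7

for name, r in (("cusp", cusp), ("diagonal", diagonal), ("twisted", twisted)):
    print(name, speed(r, 0.0))
```

The first two speeds are essentially zero at $t = 0$, while the third stays bounded away from zero, in line with the discussion of the three examples.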
Fig. 2.4. A piecewise smooth curve with five pieces and non-smooth points at $t_1, t_2, t_3, t_4$.
A curve $C$ is piecewise smooth if there exists a parametrization given by $r(t)$ such that $r'(t) = 0$ only at a finite number of points $t_1, \ldots, t_k \in (a, b)$. Most of the
examples presented in this book are at least piecewise smooth. Figure 2.4 sketches a piecewise smooth curve. A piecewise smooth curve $C$ must often be given by distinct vector functions as the next example shows. But it is not always the case as Example 2.1.9 shows.
Fig. 2.5. Curve $C$ of Example 2.1.10
Example 2.1.10. Consider the curve $C$ which is the triangle with vertices at $(0, 0)$, $(2, 0)$ and $(0, 1)$. We obtain a vector function for $C$ (with a consistent orientation) by writing a parametrization of each side of the triangle:
$$r(t) = \begin{cases} (2t, 0), & t \in [0, 1] \\ (2 - 2t, t), & t \in [0, 1] \\ (0, 1 - t), & t \in [0, 1]. \end{cases}$$
See Figure 2.5.
Computing the integral of a vector function is a straightforward process. Let $r(t)$ with $t \in [a, b]$ be a continuous vector function with $r(t) = (x_1(t), \ldots, x_n(t))$, then
$$\int_a^b r(t)\,dt = \left( \int_a^b x_1(t)\,dt, \ldots, \int_a^b x_n(t)\,dt \right).$$
Example 2.1.11. Let $r(t) = (t^2, \cos(2\pi t))$ with $t \in [0, 1]$ then
$$\int_0^1 r(t)\,dt = \left( \int_0^1 t^2\,dt,\ \int_0^1 \cos(2\pi t)\,dt \right) = \left( \tfrac{1}{3}, 0 \right).$$
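Component-wise integration translates directly into code. The sketch below approximates each coordinate integral with the composite Simpson rule, a standard numerical technique chosen here only for illustration; the helper names `simpson` and `integrate_vector` are hypothetical.

```python
import math

def simpson(f, a, b, n=200):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def integrate_vector(r, a, b, dim):
    """Integrate r : [a, b] -> R^dim one coordinate at a time."""
    return tuple(simpson(lambda t, j=j: r(t)[j], a, b) for j in range(dim))

r = lambda t: (t ** 2, math.cos(2 * math.pi * t))  # Example 2.1.11
print(integrate_vector(r, 0.0, 1.0, 2))
```

For Example 2.1.11 the two printed values should agree with $(1/3, 0)$ up to round-off.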
2.1.3 Position and velocity of particles
The use of vector functions is crucial in physics (especially Newtonian mechanics) and in this section we present the main interpretations of the concepts of the previous section in this context. As an object moves in space, its velocity is the instantaneous rate of change of its position. If the position is given by a parametric representation, we have the following definition.
Definition 2.1.12. Let $r(t)$ be the vector function describing the motion of a particle $p$. The velocity vector of $p$ is $r'(t)$ and the speed is given by $\|r'(t)\|$.
The acceleration is the rate of change of velocity and so we have the following definition.
Definition 2.1.13. Let $r(t)$ be the parametrization describing the motion of a particle $p$. The acceleration vector of $p$ is $r''(t)$.
Let us consider some examples.
Example 2.1.14. Suppose that an object’s position is given by $r(t) = (t, t, t(2 - t))$ with $t \in [0, 2]$. The velocity vector is $r'(t) = (1, 1, 2 - 2t)$ and the speed is
$$\|r'(t)\| = \sqrt{1^2 + 1^2 + (2 - 2t)^2} = \sqrt{6 - 8t + 4t^2}.$$
The acceleration vector is $r''(t) = (0, 0, -2)$.
We can also use integration of vector functions to solve the inverse problem.
Example 2.1.15. Suppose that an object has acceleration vector $a(t) = (1, t, t^2)$ and has initial position vector $(0, 0, 0)$ and initial velocity vector $(1, 0, 0)$. Integrating the acceleration vector gives the velocity vector:
$$\begin{aligned}
r'(t) = \int a(t)\,dt &= \left( \int 1\,dt,\ \int t\,dt,\ \int t^2\,dt \right) \\
&= \left( t + c_1,\ \tfrac{1}{2}t^2 + c_2,\ \tfrac{1}{3}t^3 + c_3 \right) \\
&= \left( t,\ \tfrac{1}{2}t^2,\ \tfrac{1}{3}t^3 \right) + (c_1, c_2, c_3).
\end{aligned}$$
We know $r'(0) = (c_1, c_2, c_3) = (1, 0, 0)$, which implies $c_1 = 1$, $c_2 = c_3 = 0$. So,
$$r'(t) = \left( t + 1,\ \frac{t^2}{2},\ \frac{t^3}{3} \right).$$
Integrating the velocity vector:
$$r(t) = \left( \frac{t^2}{2} + t + k_1,\ \frac{t^3}{6} + k_2,\ \frac{t^4}{12} + k_3 \right).$$
Then $r(0) = (k_1, k_2, k_3) = (0, 0, 0)$ implies
$$r(t) = \left( \frac{t^2}{2} + t,\ \frac{t^3}{6},\ \frac{t^4}{12} \right).$$
Newton’s second law relates the force vector $F$ acting on an object with the acceleration vector $a$ in this famous formula:
$$F = m\,a$$
where $m$ is the mass of the object.
Example 2.1.16. Consider a ball with mass $m$ thrown from the ground at an angle $\psi$ and with initial speed $v = v_0$. If the only external force acting on the ball is the gravitational force $F_g$ with acceleration $g = 9.8\,\mathrm{m/s^2}$ in the $(0, 0, -1)$ direction, we find the position $r(t)$ of the ball and the angle that maximizes the horizontal distance traveled. We assume that at time $t = 0$, the ball is at the origin $r(0) = (0, 0, 0)$ and we suppose that the motion of the ball happens completely inside the $yz$-plane. Then,
Fig. 2.6. Projections on the $y$ and $z$ axes of the initial velocity $r'(0)$.
the initial velocity vector is given by $r'(0) = (0, v_0 \cos\psi, v_0 \sin\psi)$. Because the gravitational force $F_g = m g\,(0, 0, -1)$ is the only one acting on the ball, Newton’s second law $F_g = m\,a$ gives the acceleration
$$a(t) = 9.8\,(0, 0, -1),$$
which does not depend on the mass. Integrating the acceleration, we obtain the velocity vector
$$r'(t) = (c_1,\ c_2,\ -9.8\,t + c_3)$$
and the values of the constants are computed using the initial velocity: $r'(0) = (0, v_0 \cos\psi, v_0 \sin\psi) = (c_1, c_2, c_3)$. Therefore,
$$r'(t) = (0,\ v_0 \cos\psi,\ -9.8\,t + v_0 \sin\psi).$$
Integrating the velocity vector gives the position vector
$$r(t) = \left( d_1,\ t v_0 \cos\psi + d_2,\ -4.9\,t^2 + t v_0 \sin\psi + d_3 \right).$$
Because $r(0) = (0, 0, 0)$ then $d_1 = d_2 = d_3 = 0$, which implies
$$r(t) = \left( 0,\ t v_0 \cos\psi,\ -4.9\,t^2 + t v_0 \sin\psi \right).$$
The ball lands at the time $t^* > 0$ for which $z(t^*) = 0$ and solving for $t^*$ we obtain $t^* = v_0 \sin\psi / 4.9$. Thus, the distance traveled by the ball is given by $d(\psi) := y(t^*) = v_0^2 \cos\psi \sin\psi / 4.9$. Using elementary calculus, $d(\psi)$ has a maximum value for $\psi = \pi/4$.
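The claim that the range is largest at $\psi = \pi/4$ can be checked by brute force. The sketch below is an illustration under stated assumptions: `accel` is a hypothetical parameter for the magnitude of the constant downward acceleration used in the example, and the optimal angle does not depend on its value.

```python
import math

def position(t, v0, psi, accel=9.8):
    """Position (x, y, z) at time t for a launch from the origin in the yz-plane."""
    return (0.0,
            t * v0 * math.cos(psi),
            -0.5 * accel * t * t + t * v0 * math.sin(psi))

def landing_time(v0, psi, accel=9.8):
    """Positive time at which the height z returns to zero."""
    return v0 * math.sin(psi) / (0.5 * accel)

def horizontal_range(v0, psi, accel=9.8):
    return position(landing_time(v0, psi, accel), v0, psi, accel)[1]

def best_angle(v0, samples=9000, accel=9.8):
    """Search a uniform grid of angles in (0, pi/2) for the largest range."""
    angles = [(k / samples) * (math.pi / 2) for k in range(1, samples)]
    return max(angles, key=lambda psi: horizontal_range(v0, psi, accel))

print(best_angle(10.0), math.pi / 4)
```

Changing `accel` rescales the range but leaves the maximizing angle at $\pi/4$.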
Exercises
(1) For the vector functions below, determine if they are differentiable and/or smooth on their interval of definition. (a) r(t) = (t2 − 2t, sin(2πt)) t ∈ [0, 1] (b) r(t) = (3t2 , cos t, e−1/t ), t ∈ [−1, 1] (c) r(t) = ((t − π)2 , cos t, 2sin(t/2)), t ∈ [−2π, 2π] 2
(d) r(t) = (4t3 , t2 ), t ∈ [1, 3] (e) r(t) = (3t cos t, t2/3 , et )), t ∈ [1, 1] (2) For each vector function of exercise (1), compute the integral of the vector function. (3) Let each vector function of exercise (1) describe the trajectory of an object moving in space. Compute the velocity vector, speed and acceleration vector. (4) Show that if a vector function r (t) is differentiable at t = t 0 , then it is continuous at t = t 0 . (5) Show that any curve given by a vector function (t, f (t)) is smooth. (6) Suppose that a force F (t) = (t2 , cos t, sin t) is applied on an object of mass 1. Find the general formula for the velocity and position vector. (7) An object falls from a cliff of height h0 > 0 and only subject to gravitational acceleration g = 9.8(0, 0, −1)m/s2 . Find the velocity and position vectors if the initial velocity is v 0 . Determine a formula for the final velocity as the object hits the ground. 2
2.2 Best Linear Approximation and Tangent Lines
To discuss the existence of a “tangent line” at a point $p$ of a curve, we introduce the concept of “best linear approximation”. This concept is defined as follows.
Definition 2.2.1. Consider a function $f(x)$ and a linear function $L(x) = a + bx$. We say that $L(x)$ is the best linear approximation of $f(x)$ at $x = x_0$ if
$$\lim_{x \to x_0} \frac{f(x) - L(x)}{x - x_0} = 0.$$
The best linear approximation limit means that the difference $f(x) - L(x)$ approaches zero near $x = x_0$ at a much faster rate than $1/(x - x_0)$ approaches infinity near $x = x_0$, so that the product tends to zero. We look at the case of functions of one variable as a reminder. Let $f(x)$ be a function which is at least twice differentiable. We do a Taylor expansion of $f(x)$ at $x = x_0$ to first order,
$$f(x) = f(x_0) + f'(x_0)(x - x_0) + o(|x - x_0|) \tag{2.2}$$
where the symbol $o(|x - x_0|)$ is a short-hand expression for terms of higher degree; it is called “little o” and its exact definition is:
$$\lim_{x \to x_0} \frac{o(|x - x_0|)}{|x - x_0|} = 0.$$
The right-hand side expression of (2.2) provides an approximation of $f(x)$ for $x$ close to $x_0$. Consider the linear function $L(x) = f(x_0) + f'(x_0)(x - x_0)$, then we have
$$\lim_{x \to x_0} \frac{f(x) - L(x)}{x - x_0} = \lim_{x \to x_0} \frac{o(|x - x_0|)}{x - x_0} = 0.$$
In fact, $L(x)$ is the only linear function for which
$$\lim_{x \to x_0} \frac{f(x) - L(x)}{x - x_0} = 0.$$
This is shown in the next result.
Theorem 2.2.2. Suppose $f(x)$ is differentiable at $x = x_0$ and $L(x) = a + bx$. Then,
$$\lim_{x \to x_0} \frac{f(x) - L(x)}{x - x_0} = 0$$
if and only if $f(x_0) = L(x_0)$ and $f'(x_0) = L'(x_0)$.
Proof. This is an if and only if statement that can be proved in one calculation. We write $L(x) = a + bx = (a + bx_0) + b(x - x_0)$. Begin with the limit:
$$\begin{aligned}
\lim_{x \to x_0} \frac{f(x) - L(x)}{x - x_0}
&= \lim_{x \to x_0} \frac{f(x_0) + f'(x_0)(x - x_0) + o(|x - x_0|) - (a + bx_0 + b(x - x_0))}{x - x_0} \\
&= \lim_{x \to x_0} \frac{[f(x_0) - (a + bx_0)] + (f'(x_0) - b)(x - x_0) + o(|x - x_0|)}{x - x_0} \\
&= \lim_{x \to x_0} \left( \frac{f(x_0) - (a + bx_0)}{x - x_0} + (f'(x_0) - b) + \frac{o(|x - x_0|)}{x - x_0} \right).
\end{aligned}$$
Now, the limit on the last line is zero if and only if the limit of each term is zero. We check each case. First,
$$\lim_{x \to x_0} \frac{f(x_0) - (a + bx_0)}{x - x_0}$$
exists if and only if the numerator $f(x_0) - (a + bx_0)$ is zero. In which case, the limit is zero. The third limit is automatically zero by definition of $o(|x - x_0|)$. Therefore, the second limit is zero if and only if $f'(x_0) = b$. Therefore $a = f(x_0) - f'(x_0)x_0$ and so $L(x) = f(x_0) + f'(x_0)(x - x_0)$, which means $L(x_0) = f(x_0)$ and $L'(x_0) = f'(x_0)$. This completes the proof.
Definition 2.2.3. The tangent line to the curve given by $y = f(x)$ at $(x_0, f(x_0))$ is given by the best linear approximation $L(x)$ of $f(x)$ at $x_0$.
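The uniqueness statement of Theorem 2.2.2 can be illustrated numerically: the quotient $(f(x) - L(x))/(x - x_0)$ tends to zero for the tangent line, but not for a line through the same point with a different slope. The sketch below assumes $f = \sin$ and $x_0 = 1$ purely for illustration; the names `quotient`, `tangent` and `wrong_slope` are hypothetical.

```python
import math

def quotient(f, line, x0, h):
    """(f(x) - L(x)) / (x - x0) evaluated at x = x0 + h."""
    x = x0 + h
    return (f(x) - line(x)) / (x - x0)

f = math.sin
x0 = 1.0
tangent = lambda x: math.sin(x0) + math.cos(x0) * (x - x0)  # slope f'(x0)
wrong_slope = lambda x: math.sin(x0) + 0.3 * (x - x0)       # same point, other slope

for h in (1e-1, 1e-2, 1e-3, 1e-4):
    print(h, quotient(f, tangent, x0, h), quotient(f, wrong_slope, x0, h))
```

The first column of quotients shrinks with $h$, while the second settles near $f'(x_0) - 0.3$, the mismatch between the slopes.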
Note that in this case, the tangent line is always given by a function $y = mx + b$. This is no longer the case for general vector functions, as we show below. A vector function $q : \mathbb{R} \to \mathbb{R}^n$ defined by
$$q(t) = \vec{a} + \vec{b}\,t$$
where $\vec{a} = (a_1, \ldots, a_n)$ and $\vec{b} = (b_1, \ldots, b_n)$ is called an affine vector function. If $\vec{a} = 0$, it is a linear function. Let $r(t)$ be a parametrization of the smooth curve $C$. The concept of best linear approximation for $f(x)$ extends naturally to vector functions and we have the following definition.
Definition 2.2.4. Let $r, q : \mathbb{R} \to \mathbb{R}^n$ where $r(t)$ is a vector function and $q(t)$ is an affine vector function. We say that $q(t)$ is the best linear approximation of $r(t)$ at $t = t_0$ if
$$\lim_{t \to t_0} \frac{r(t) - q(t)}{t - t_0} = 0.$$
The idea of a best linear approximation is used in the sections and chapters that follow in order to provide optimal approximations to lengths of curves, areas of surfaces, etc. As with the function f (x) above, we can characterize the best linear approximation using the derivative.
Theorem 2.2.5. Let $r(t)$ be a vector function and $q(t) = \vec{a} + \vec{b}\,t$ where $\vec{a}, \vec{b} \in \mathbb{R}^n$. Then, $q(t)$ is a best linear approximation at $t = t_0$ if and only if $q(t_0) = r(t_0)$ and $q'(t_0) = r'(t_0)$.
Proof. The argument is similar to the proof of Theorem 2.2.2 done on each component of r(t) and q(t) separately.
Therefore, if the curve $C$ is smooth at $p = r(t_0)$, the best linear approximation is nonconstant and so we have the following definition.
Definition 2.2.6. The tangent line at a smooth point $p \in C$ is given by the best linear approximation $q(t)$.
Example 2.2.7. Consider the circle of radius 1 given by
$$r(t) = (\cos t, \sin t)$$
and let $t = 0$. Then, the best linear approximation is $q(t) = (1, t)$ for $t \in \mathbb{R}$. It is a vertical line in the plane and so we can’t write it as $y = mx + b$.
2.2.1 Construction of the Tangent Line
Let $C$ be the curve defined by the vector function $r(t) = (x(t), y(t))$ for $t \in [a, b]$ and differentiable for $t \in (a, b)$. Let $t_0 \in (a, b)$, let $\ell$ be a tangent line at the point $r(t_0)$ and let $(p, q) \in \ell$. See Figure 2.7. Then,
$$v_{p,q} := (p, q) - (x(t_0), y(t_0))$$
is the vector joining $r(t_0)$ to $(p, q)$ and $v_{p,q}$ is a nonzero vector. Because $r(t)$ is differentiable at $t = t_0$, then $v_{p,q} = s\,r'(t_0)$ for some $s \neq 0$ and this means $r'(t_0) \neq 0$. This corresponds to what is seen in Example 2.1.6: corners can appear only if $r'(t_0) = 0$. The construction seen in Figure 2.7 gives us a method to obtain the formula for the tangent line at points where a curve $C$ is smooth. This is illustrated in the following example.
Fig. 2.7. A curve $C$ given by $r(t)$ with its tangent vector $r'(t_0)$ at $r(t_0)$, tangent line $\ell$ and a vector $v_{p,q}$ joining the point $(p, q) \in \ell$ to $r(t_0)$.
Example 2.2.8. Consider the curve $C$ given by the smooth parametrization $r(t) = (t^2, 1 - t)$ for $t \in [0, 1]$. The tangent line at $t = \frac{1}{2}$ is obtained by first obtaining the tangent vector $r'(t) = (2t, -1)$ and evaluating at $t = \frac{1}{2}$: $r'(\frac{1}{2}) = (1, -1)$. Using the base point $r(\frac{1}{2}) = (\frac{1}{4}, \frac{1}{2})$ and the tangent vector, the equation of the tangent line at $r(\frac{1}{2})$ is
$$\ell(s) = \left( \frac{1}{4}, \frac{1}{2} \right) + s\,(1, -1) = \left( \frac{1}{4} + s,\ \frac{1}{2} - s \right)$$
where $s$ is the variable parameterizing the tangent line.
The general method to compute a tangent line is similar to the one presented in Example 2.2.8. Let $r(t)$ be the smooth parametrization of a curve $C$ and suppose we want to compute the tangent line at $t = t_0$.
(1) Evaluate the vector function at $t = t_0$: $r(t_0)$.
(2) Compute the tangent vector at $t = t_0$: $r'(t_0)$.
(3) Write the formula: $\ell(s) = r(t_0) + s\,r'(t_0)$.
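The three steps above can be sketched in a few lines of code. This is a minimal illustration, assuming the derivative is supplied by hand as a second function; the names `tangent_line`, `r_prime` and `line` are hypothetical.

```python
def tangent_line(r, r_prime, t0):
    """Return the map s -> r(t0) + s * r'(t0), following steps (1)-(3)."""
    base = r(t0)             # step (1): base point
    direction = r_prime(t0)  # step (2): tangent vector

    def line(s):             # step (3): the parametrized tangent line
        return tuple(p + s * v for p, v in zip(base, direction))

    return line

# Curve of Example 2.2.8 with its derivative computed by hand.
r = lambda t: (t ** 2, 1 - t)
r_prime = lambda t: (2 * t, -1.0)

line = tangent_line(r, r_prime, 0.5)
print(line(0.0), line(1.0))
```

Evaluating `line` at $s = 0$ returns the base point, and other values of $s$ move along the direction of the tangent vector.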
Example 2.2.9. Consider the curve $C$ with smooth parametrization
$$r(t) = (\cos(2\pi t),\ e^t \sin t,\ 2t^2)$$
for $t \in [0, 1]$. We find the general formula for the tangent line at any point on the curve using the approach given above.
(1) Evaluate the vector function: $r(t) = (\cos(2\pi t),\ e^t \sin t,\ 2t^2)$.
(2) Compute the tangent vector: $r'(t) = (-2\pi \sin(2\pi t),\ e^t \cos t + e^t \sin t,\ 4t)$.
(3) Write the formula: $\ell(s) = r(t) + s\,r'(t)$.
We obtain
$$\ell(s) = (\cos(2\pi t),\ e^t \sin t,\ 2t^2) + s\,(-2\pi \sin(2\pi t),\ e^t \cos t + e^t \sin t,\ 4t).$$
This formula can now be evaluated at any point t ∈ [0, 1]. Exercises
(1) Determine the general equation of the tangent line for each vector function. Then, evaluate at the given $t_0$ value.
(a) $r(t) = (t^2 - 2t, \sin(2\pi t), t)$, $t_0 = \pi$
(b) $r(t) = (4t^3, t^2, e^t)$, $t_0 = 0$
(c) $r(t) = (3te^t, t^{2/3}, t^2)$, $t_0 = 1$
(d) $r(t) = (e^t, te^t, t^2 e^t)$, $t_0 = 0$
2.3 Reparametrizations and arc-length parameter
Consider a straight line segment on the $x$-axis from $x = a$ to $x = b$. A natural parametrization is
$$x(t) = t, \qquad y(t) = 0$$
with $t \in [a, b]$. Note that $\|(x'(t), y'(t))\| = 1$. Let $\alpha > 0$ and consider instead the parametrization
$$x(t) = \alpha t, \qquad y(t) = 0$$
with $t \in [a/\alpha, b/\alpha]$. Now $\|(x'(t), y'(t))\| = \alpha$ and the length of the domain is $\alpha^{-1}(b - a)$. Therefore, the speed of the parametrization being $\alpha$ leads to a contraction of the domain by a factor of $\alpha$. Consider the parametrization of a circle of radius $r_0$ given by
$$x(t) = r_0 \cos(t/r_0), \qquad y(t) = r_0 \sin(t/r_0)$$
with $t \in [0, 2\pi r_0)$ and here again we have $\|(x'(t), y'(t))\| = 1$. This parametrization is similar to the one from Example 1.5.3, but the domain has been dilated by a factor $r_0$ with the division of $t$ by $r_0$ in $\cos$ and $\sin$. These are examples of “reparametrization” of a curve.
Remark 2.3.1. Note that for both examples, the curve has a parametric representation which travels along the curve at speed 1 and the length of the curve (known from elementary geometry) is the same as the length of the domain interval.
In this section, we present an argument showing that for any smooth curve $C$ there exists a parametric representation given by a vector function $r(t)$ such that $\|r'(t)\| = 1$. We single out such parametric representations in the following definition.
Definition 2.3.2. Let $C$ be a curve with parametrization $r(t)$ with $t \in [a, b]$. A reparametrization of $C$ is an invertible differentiable function $\varphi : [c, d] \to [a, b]$, $t = \varphi(s)$, from which we define a parametrization
$$\tilde{r}(s) = r(\varphi(s))$$
with $s \in [c, d]$ where $c = \varphi^{-1}(a)$ and $d = \varphi^{-1}(b)$.
In the circle example, the reparametrization from Example 1.5.3 to the one above is given by $\varphi(s) = s/r_0$.
Example 2.3.3. Let $C$ be a curve with parametrization $r(t) = (e^t, e^{2t}, e^{3t})$ with $t \in [0, 3]$ and consider the reparametrization $t = \varphi(s) = \ln(s)$. Then,
$$\tilde{r}(s) = r(\varphi(s)) = (e^{\ln(s)}, e^{2\ln(s)}, e^{3\ln(s)}) = (s, s^2, s^3)$$
with $s \in [1, e^3]$.
Definition 2.3.4. A curve $C$ has an arc-length parametrization $r(s)$ if $\|r'(s)\| = 1$ for all $s$.
Example 2.3.5. Let $C$ be a curve with parametrization $r(t) = (t^2/2, t^3/3)$ with $t \in [0, 1]$. Then $\|r'(t)\| = \sqrt{t^2 + t^4} = t\sqrt{1 + t^2}$. We now find a reparametrization $t = \varphi(s)$ such that $\tilde{r}(s) = r(\varphi(s))$ and $\|\tilde{r}'(s)\| = 1$. Begin by computing
$$\|\tilde{r}'(s)\| = \left\| r'(\varphi(s)) \frac{d\varphi}{ds} \right\| = \|r'(\varphi(s))\| \left| \frac{d\varphi}{ds} \right|.$$
Because we want $\|\tilde{r}'(s)\| = 1$, we set up the equation
$$1 = \|r'(\varphi(s))\| \left| \frac{d\varphi}{ds} \right|.$$
It is reasonable to assume that the parametrization does not change the orientation, so we have $d\varphi/ds > 0$. Since $t = \varphi(s)$, we can write
$$\frac{dt}{ds} = \frac{1}{\|r'(t)\|} = \frac{1}{t\sqrt{1 + t^2}}$$
and using the method of separation of variables this becomes
$$\int_{s(0)}^{s(t)} d\sigma = \int_0^t \tau \sqrt{1 + \tau^2}\,d\tau. \tag{2.3}$$
The integral on the right can be computed with the substitution rule ($u = 1 + \tau^2$) and we obtain
$$s(t) - s(0) = \frac{1}{3}\left( (1 + t^2)^{3/2} - 1 \right).$$
The value of $s(0)$ is not fixed and so we choose $s(0) = \frac{1}{3}$, therefore
$$s(t) = \frac{1}{3}(1 + t^2)^{3/2}. \tag{2.4}$$
We can invert this function and obtain
$$t = \varphi(s) = \sqrt{(3s)^{2/3} - 1}. \tag{2.5}$$
Therefore, $C$ has an arc-length parametrization given by
$$\tilde{r}(s) = \left( \varphi(s)^2/2,\ \varphi(s)^3/3 \right) = \left( \frac{1}{2}\left( (3s)^{2/3} - 1 \right),\ \frac{1}{3}\left( (3s)^{2/3} - 1 \right)^{3/2} \right)$$
where the range of the parameter $s$ is obtained from (2.4) with $t \in [0, 1]$. That is,
$$s \in \left[ \frac{1}{3},\ \frac{1}{3}\,2^{3/2} \right].$$
From this example, we can extract an algorithm to find the arc-length parametrization of a vector function $r(t)$ with domain $[a, b]$.
(i) Compute $\|r'(t)\|$. If $\|r'(t)\| = 1$ then $r(t)$ is already an arc-length parametrization. If not, go to step (ii).
(ii) Set up the equation
$$s(t) - s(a) = \int_a^t \|r'(\tau)\|\,d\tau. \tag{2.6}$$
Compute the integral on the right if possible; $s(a)$ is arbitrary so you can set it to a convenient value.
(iii) If possible, invert the formula given by (2.6) to obtain $t = \varphi(s)$.
(iv) Determine the domain of $s$: the interval $[s(a), s(b)]$.
(v) Write $\tilde{r}(s) = r(\varphi(s))$.
It is well known that antiderivatives of continuous functions are, in many cases, very difficult to find, or may not even be expressible in terms of elementary functions (polynomial, rational, trigonometric, exponential, logarithmic functions), and the same holds for the inversion in step (iii), as the next case shows.
Example 2.3.6. Consider the curve $C$ with parametrization $x(t) = t$ and $y(t) = e^{-t}$, then a calculation as above leads to a differential equation
$$\frac{dt}{ds} = \frac{1}{\sqrt{1 + e^{-2t}}}$$
and so one must compute the integral
$$\int (1 + e^{-2t})^{1/2}\,dt.$$
Unfortunately, the resulting relation between $s$ and $t$ cannot be solved explicitly for $t = \varphi(s)$, and so the arc-length parametrization cannot be obtained analytically.
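When the analytic route fails, the steps of the arc-length algorithm can still be carried out numerically. The sketch below is one possible approach, not the book's method: it approximates the integral of the speed with the composite Simpson rule and inverts $s(t)$ by bisection, which works because $s(t)$ is increasing when the speed is positive. The helper names and the domain $t \in [0, 1]$ for the curve $(t, e^{-t})$ are assumptions made for illustration.

```python
import math

def simpson(f, a, b, n=200):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def arc_length(speed, a, t):
    """s(t) - s(a): the integral of the speed from a to t."""
    return simpson(speed, a, t) if t > a else 0.0

def invert_arc_length(speed, a, b, s, tol=1e-10):
    """Solve arc_length(speed, a, t) = s for t in [a, b] by bisection."""
    lo, hi = a, b
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if arc_length(speed, a, mid) < s:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Speed of the curve (t, e^{-t}); the domain [0, 1] is an assumed choice.
speed = lambda t: math.sqrt(1.0 + math.exp(-2.0 * t))
total = arc_length(speed, 0.0, 1.0)
t_half = invert_arc_length(speed, 0.0, 1.0, 0.5 * total)
print(total, t_half, (t_half, math.exp(-t_half)))
```

The last printed pair is the point of the curve reached after half of the total length, that is, the value of the arc-length parametrization at the midpoint of its domain when $s(0) = 0$.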
Although we cannot compute the arc-length parametrization analytically in all cases by solving a differential equation such as (2.3), we can show abstractly that the arc-length parametrization must always exist. This is the content of the next result.
Theorem 2.3.7. A smooth curve $C$ always has an arc-length parametrization.
Proof. Let $r(t)$ be a parametrization of $C$ with $t \in [a, b]$. Let $t = u(s)$ be a reparametrization with $\tilde{r}(s) = r(u(s))$. We must find a function $u$ such that
$$\|\tilde{r}'(s)\| = \left\| \frac{dr}{du} \right\| \frac{du}{ds} = 1$$
for all $s$. This means
$$\frac{du}{ds} = \frac{1}{\|dr/du\|} = \frac{1}{\sqrt{x_1'(u)^2 + x_2'(u)^2 + \cdots + x_n'(u)^2}} := g(u). \tag{2.7}$$
We can solve this differential equation by separation of variables
$$\int \frac{1}{g(u)}\,du = \int ds = s. \tag{2.8}$$
Because $r(t)$ is smooth, $\frac{1}{g(u)} \neq 0$ and $g(u)$ is continuous. By the Fundamental Theorem of Calculus, there exists $F(u)$ such that $F'(u) = 1/g(u)$. So, the solution is
$$F(u) = s$$
but $F'(u) = 1/g(u) > 0$ so the inverse of $F$ exists. This means
$$u(s) = F^{-1}(s)$$
satisfies the required property. Therefore, there always exists an arc-length parametrization.
Exercises
(1) For the following vector functions, use the transformation $t = \varphi(s)$ given to reparametrize and change the domain accordingly. In each case compute the reparametrization $\tilde{r}(s)$ with its domain, compute its speed and identify the ones with arc-length parametrization.
(a) $r(t) = (t^2, t^2 e^{-t^4})$, $t \in [0, 2]$; $t = \varphi(s) = \sqrt{s}$.
(b) $r(t) = (\cos(8t), \sin(8t))$, $t \in [0, 4\pi]$; $t = \varphi(s) = s/8$.
(c) $r(t) = (e^t, e^{2t}/\sqrt{2}, e^{3t}/3)$, $t \in [0, \ln(10)]$; $t = \varphi(s) = \ln(s)$.
(d) $r(t) = (a + bt, c - dt, g + ht)$, $t \in [0, 1]$; $t = \varphi(s) = \dfrac{s}{\sqrt{b^2 + d^2 + h^2}}$.
(2) Compute $s(t)$ given by equation (2.6) for the following vector functions. Invert the relationship to $t = \varphi(s)$ if possible.
(a) $r(t) = (t \cos t, t \sin t)$
(b) $r(t) = (e^t, e^{2t}, e^{4t})$
(c) $r(t) = (t^2, \cos(t^2), \sin(t^2))$
(d) $r(t) = (2e^{-t/2} \cos t, 2e^{-t/2} \sin t)$
3 Tangent Spaces and 1-forms
We define the concept of tangent space using specific examples: curves, the spaces $\mathbb{R}^n$ and finally two-dimensional surfaces given by $z = f(x, y)$. The tangent space to a geometric object is an important topic in advanced calculus and differential geometry. The second part introduces a formal definition of differential and the construction of the so-called 1-forms.
3.1 Tangent spaces We begin by discussing how tangent lines lead to tangent space and build more general tangent spaces for Rn and surfaces.
3.1.1 From Tangent Lines to Tangent Spaces
The previous section shows how to determine the tangent line to a curve $C$ at some point $p \in C$ using the derivative of a vector function $r(t)$ defining $C$. The goal of this section is to extract the vectors which lie on the tangent lines so that we can add them together without leaving the tangent line.

Example 3.1.1. Consider the plane curve $C$ given by $x(t) = t^2$, $y(t) = 1 - t$ for $t \in [0, 1]$. We obtain the tangent line equation at $t = 1/2$ by the formula

$$\ell(s) = (x(1/2), y(1/2)) + s\,(x'(1/2), y'(1/2)) = (1/4, 1/2) + s(1, -1)$$

where $s \in \mathbb{R}$ is a parameter. The tangent line is not a subspace: if one adds two elements of $\ell(s)$, then the result is not in $\ell(s)$ anymore:

$$\left(\tfrac14 + s_1,\ \tfrac12 - s_1\right) + \left(\tfrac14 + s_2,\ \tfrac12 - s_2\right) = \left(\tfrac12 + (s_1 + s_2),\ 1 - (s_1 + s_2)\right).$$

However, if we decide to fix the base point $(x(1/2), y(1/2))$, add only the parametrized parts and add the base point afterwards, this leaves us on the tangent line:

$$s_1(1, -1) + s_2(1, -1) = (s_1 + s_2,\ -s_1 - s_2) = (s_1 + s_2)(1, -1)$$

and so $(x(1/2), y(1/2)) + (s_1 + s_2)(1, -1)$ is a point of the tangent line. See Figure 3.1 for an illustration.
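The point of Example 3.1.1 can be checked with a few lines of code. This is a minimal sketch; the function names and the sample values of s1 and s2 are chosen here for illustration only.

```python
def tangent_line(s):
    # l(s) = (1/4, 1/2) + s(1, -1), the tangent line of Example 3.1.1
    return (0.25 + s, 0.5 - s)

def on_tangent_line(point, tol=1e-12):
    # A point (x, y) lies on this line exactly when x + y = 3/4
    x, y = point
    return abs(x + y - 0.75) < tol

def add(u, v):
    return (u[0] + v[0], u[1] + v[1])

s1, s2 = 0.3, -1.1
# Adding two points of the line leaves the line ...
sum_of_points = add(tangent_line(s1), tangent_line(s2))
# ... but adding the direction parts first and the base point afterwards does not.
sum_of_directions = add((s1, -s1), (s2, -s2))
shifted = add((0.25, 0.5), sum_of_directions)
```

Here `on_tangent_line(sum_of_points)` is false while `on_tangent_line(shifted)` is true, which is the distinction between points of the line and vectors along it.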
With this in mind, we can now introduce the concept of tangent space, which is a crucial aspect of advanced calculus and of differential geometry in general.

Definition 3.1.2. Let $C$ be a smooth space curve (in $\mathbb{R}^n$). If $p \in C$ then the tangent space of $C$ at $p$, denoted by $T_pC$, is the set of all vectors tangent to $C$ at the point $p$.
Fig. 3.1. Curve $C$ given by $r(t) = (t^2, 1 - t)$ and the tangent line $\ell(s)$ at $p = (1/4, 1/2)$. The orientation is from (0, 1) to (1, 0).
If the vector function $r(t)$ defines $C$ and $p = r(t_0)$, $T_pC$ is a 1-dimensional vector space with basis $\{r'(t_0)\}$. In Example 3.1.1, $r'(1/2) = (1, -1)$ and for $p = (1/4, 1/2)$,

$$T_pC = \operatorname{span}(r'(1/2)) = \{\alpha(1, -1) \mid \alpha \in \mathbb{R}\}.$$

$T_pC$ lies on top of the tangent line $\ell(s)$, but it is a different geometric object: it is made up of vectors. Apart from the fact that tangent spaces are vector spaces, another advantage is that they do not depend on the vector function used to parametrize $C$.

Example 3.1.3. Let $C$ be the circle of radius $a$ and $p = (0, a)$. The vector functions

$$r_1(t) = (a\cos(t),\ a\sin(t)),\quad t \in [0, \pi], \qquad r_2(t) = \left(-t,\ \sqrt{a^2 - t^2}\right),\quad t \in [-a, a]$$

both parametrize the upper portion of $C$ with the counterclockwise orientation and $p = r_1(\pi/2) = r_2(0)$. Now

$$r_1'(\pi/2) = (-a\sin(\pi/2),\ a\cos(\pi/2)) = (-a, 0)$$

and

$$r_2'(0) = (-1, 0).$$

So, $r_1'(\pi/2) = a\,r_2'(0)$ and they span the same tangent space as shown in Figure 3.2.
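A quick numerical check of Example 3.1.3 is sketched below. The radius a = 3 and the central-difference helper `derivative` are assumptions made for this illustration; they do not come from the text.

```python
import math

def derivative(r, t, h=1e-6):
    # Central finite-difference approximation of r'(t), componentwise
    fwd, bwd = r(t + h), r(t - h)
    return tuple((f - b) / (2.0 * h) for f, b in zip(fwd, bwd))

a = 3.0  # radius, chosen arbitrarily for the illustration

def r1(t):
    return (a * math.cos(t), a * math.sin(t))

def r2(t):
    return (-t, math.sqrt(a * a - t * t))

v1 = derivative(r1, math.pi / 2)  # approximately (-a, 0)
v2 = derivative(r2, 0.0)          # approximately (-1, 0)
# v1 and v2 are parallel exactly when this 2x2 determinant vanishes
det = v1[0] * v2[1] - v1[1] * v2[0]
```

A determinant close to zero indicates that the two tangent vectors span the same line, in agreement with the example.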
3.1.2 Tangent Spaces in any Dimension
We extend the concept of tangent space to any dimension. The following example illustrates the situation.

Example 3.1.4. Consider Cartesian coordinate lines in $\mathbb{R}^2$: $C_1$ horizontal, $x_1(t) = t$, $y_1(t) = y_0$ and $C_2$ vertical, $x_2(t) = x_0$, $y_2(t) = t$. Then, for $C_1$ we have

$$(x_1'(t), y_1'(t)) = (1, 0) = e_1$$

and for $C_2$

$$(x_2'(t), y_2'(t)) = (0, 1) = e_2$$

for all $t \in \mathbb{R}$. Let $p_1 = (t_1, y_0)$ and $p_2 = (x_0, t_2)$ for $t_1, t_2 \in \mathbb{R}$. Then, $T_{p_1}C_1 = \{\alpha e_1 \mid \alpha \in \mathbb{R}\}$ and $T_{p_2}C_2 = \{\beta e_2 \mid \beta \in \mathbb{R}\}$. Consider $p_1, q_1 \in C_1$ with $p_1 \neq q_1$; then $T_{p_1}C_1$ and $T_{q_1}C_1$ are spanned by the same basis vector $e_1$, but $T_{p_1}C_1 \neq T_{q_1}C_1$ because those are located at different points. To emphasize this distinction, our convention is to distinguish the vectors in a tangent space by labelling them with their base points, for instance $e_1(p_1) \in T_{p_1}C_1$ and $e_1(q_1) \in T_{q_1}C_1$.

Fig. 3.2. Tangent vectors of two different parametric representations of $C$ at $(0, a)$.
Fig. 3.3. Tangent spaces of $C_1$ at $p$ and $q$ with representative vectors.
Example 3.1.5. We return to the plane curve $C$ of Example 3.1.1 given by $x(t) = t^2$, $y(t) = 1 - t$ for $t \in [0, 1]$ at $t = 1/2$ and consider the elements of $T_pC$ with $p = (1/4, 1/2)$. Those can be written as a linear combination of the tangent vectors of the coordinate lines. Indeed, $v \in T_pC$ is written $v = \alpha\,r'(1/2)$ and

$$r'(1/2) = 1\,e_1(p) + (-1)\,e_2(p),$$

where $p$ is also an element of $C_1 \cap C_2$ and $e_1(p) \in T_pC_1$ and $e_2(p) \in T_pC_2$.

Fig. 3.4. Decomposition of vector $r'(1/2) = (1, -1) \in T_pC$ along the $e_1(p)$ and $e_2(p)$ directions.

In particular, this example shows that $\{e_1(p), e_2(p)\}$ can span the tangent vectors of any curve passing through $p$. We define basis vectors at a point $p$ in $\mathbb{R}^n$ from which we can decompose any vector located at $p$. This leads us to this other important definition.

Definition 3.1.6. The tangent space at the point $p \in \mathbb{R}^n$, denoted by $T_p\mathbb{R}^n$, is the set of all vectors based at the point $p$.
Example 3.1.7. We look at the cases n = 1, 2.
(1) Let $t_0 \in \mathbb{R}$, then $T_{t_0}\mathbb{R} = \{v = he_1 \mid h \in \mathbb{R},\ e_1 = 1\}$.
(2) As we saw above, using $e_1(p) = (1, 0)$ and $e_2(p) = (0, 1)$, then

$$T_p\mathbb{R}^2 = \{v = \alpha_1 e_1(p) + \alpha_2 e_2(p) \mid \alpha_1, \alpha_2 \in \mathbb{R}\}.$$
The tangent space T p Rn can be constructed using families of curves passing through p. As an example, let n = 2 again. We show that any vector v ∈ T p R2 can be obtained as the tangent vector of a curve passing through p. Let p = (x0 , y0 ) and v = (v1 , v2 ) ∈ T p R2 . Consider the curve r(t) = (x0 + tv1 , y0 + tv2 ).
Then $r(0) = p$ and $r'(0) = (v_1, v_2)$.
Proposition 3.1.8. $T_p\mathbb{R}^n$ is an $n$-dimensional vector space and the set $\{e_1(p), \dots, e_n(p)\}$ of unit tangent vectors to the coordinate lines forms a basis. The notation $e_j(p)$ is used to emphasize that the location of the basis vectors is at $p$.

Proof. See exercises.
The role of the derivative is intimately linked to the tangent spaces as the following result shows.

Proposition 3.1.9. Let $r(t)$ define the smooth curve $C$ and $p = r(t_0) \in C$. Then,

$$r'(t_0) : T_{t_0}\mathbb{R} \to T_pC.$$

That is, $r'(t_0)$ is a mapping taking vectors from $T_{t_0}\mathbb{R}$ to $T_pC$.

Proof. We know $T_pC = \{\alpha r'(t_0) \mid \alpha \in \mathbb{R}\}$. But, $\alpha$ can be expressed as $\alpha e_1 \in T_{t_0}\mathbb{R}$. Thus, any vector $w \in T_pC$ can be written as $w = r'(t_0)v$ where $v \in T_{t_0}\mathbb{R}$.

Fig. 3.5. The derivative $r'(t)$ is a mapping from $T_t\mathbb{R}$ into $T_pC$. For any $v \in T_t\mathbb{R}$, one obtains $w = r'(t)v \in T_pC$.
This result shows that for a vector function r : [a, b] → Rn defining a curve C , the derivative is a function which takes vectors in T t R and gives a vector in T p C . This interpretation of the derivative as a function on tangent spaces is fundamental. This is shown in Figure 3.5.
3.1.3 Vector Fields
Using the concept of tangent space, we can introduce a useful interpretation of mappings $f : \mathbb{R}^n \to \mathbb{R}^n$ called the "vector field". Let $p \in \mathbb{R}^n$ and consider $T_p\mathbb{R}^n$. As a vector space, $T_p\mathbb{R}^n$ is isomorphic to $\mathbb{R}^n$ because they have the same dimension. Therefore, in the discussion that follows, we drop the dependence on $p$ in the tangent space notation and label all tangent spaces as just $\mathbb{R}^n$.

Definition 3.1.10. A mapping $f : \mathbb{R}^n \to \mathbb{R}^n$ is a vector field if for $p \in \mathbb{R}^n$, $f(p) \in \mathbb{R}^n$ is interpreted as a vector in $T_p\mathbb{R}^n$.
The concept of vector field is very important in physics and is at the heart of the qualitative theory of differential equations pioneered by Henri Poincaré in the late 19th century. We now show some examples to illustrate this concept.

Example 3.1.11. Let $F, G : \mathbb{R}^2 \to \mathbb{R}^2$ be vector fields defined by $F(x, y) = (x, -y)$ and $G(x, y) = (-y, x)$, shown in Figure 3.6. The arrows are obtained by evaluating $F$ and $G$ at a number of sample points. For instance, $F(0, 0) = (0, 0)$, $F(1, 1) = (1, -1)$ and $F(1, 0) = (1, 0)$.
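The sampling described in Example 3.1.11 can be sketched in code. The 3-by-3 grid of base points is an arbitrary choice made here; a plotting library could then draw each arrow at its base point.

```python
def F(x, y):
    return (x, -y)

def G(x, y):
    return (-y, x)

# Sample both fields on a small grid of base points
grid = [(x, y) for x in (-1, 0, 1) for y in (-1, 0, 1)]
arrows_F = {p: F(*p) for p in grid}
arrows_G = {p: G(*p) for p in grid}

for p in grid:
    print(p, "F:", arrows_F[p], "G:", arrows_G[p])
```

Each dictionary entry pairs a base point p with the vector attached to it, which is exactly the data a picture of the field displays.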
Fig. 3.6. Vector fields F(x, y) (top) and G(x, y) (bottom). Note that the size of the arrows is normalized to a single size for convenience.
The next example is an important example from physics.

Example 3.1.12. Newton's law of gravitation says that two bodies are attracted with a force proportional to the masses of the bodies $m$ and $M$ and inversely proportional to the square of the distance. The magnitude of this force is written

$$F(r) = \frac{mMG}{r^2}$$

where $G$ is Newton's gravitational constant. In the case of an isolated body of large mass $M$ (e.g. the Earth) and much lighter objects in its neighborhood, it is customary to place the centre of mass of $M$ at the origin. Therefore, all bodies are attracted radially towards the origin and for a body located at $x = (x, y, z)$, we write

$$\vec F(x) = F(r)\,\frac{-x}{\|x\|}$$

where $r = \|x\|$ and the second term is the unit direction vector pointing radially towards the origin. The function $\vec F : \mathbb{R}^3 \to \mathbb{R}^3$ defines a vector field.
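As an illustration of Example 3.1.12, the gravitational field can be evaluated at a given position. This is a sketch only: the function name `gravity`, the SI value used for the constant and the sample inputs are assumptions made here, and the formula is undefined at the origin.

```python
import math

G_CONST = 6.674e-11  # gravitational constant in SI units (approximate value)

def gravity(x, m, M):
    # Vector force on a body of mass m at position x, with mass M at the origin
    r = math.sqrt(sum(c * c for c in x))
    magnitude = G_CONST * m * M / r**2
    # Multiply the magnitude by the unit vector -x/||x|| pointing to the origin
    return tuple(-magnitude * c / r for c in x)

force = gravity((2.0, 0.0, 0.0), 1.0, 1.0)
```

For a body on the positive x-axis, the returned vector points in the negative x direction, that is, radially towards the origin.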
Example 3.1.13. Here is an example of a vector field on $\mathbb{R}$: $F(x) = x(1 - x)$. In this case, all arrows lie directly on the line. For zero vectors, one draws a point.

Fig. 3.7. Vector field on $\mathbb{R}$ given by $F(x) = x(1 - x)$. The size of the arrows is relative.
A vector field can be defined not only over whole spaces, but also over subsets of spaces. For instance, one can define a vector field on a curve $C$.

Example 3.1.14. To define a vector field on a curve, one must use a parametric representation of the curve. Consider a circle of radius 1 with parametrization given by $r(\theta) = (\cos(\theta), \sin(\theta))$ with $\theta \in [-\pi, \pi)$. An example of a vector field on $C$ is given by $v : [-\pi, \pi) \to T_pC \cong \mathbb{R}$ with $v(\theta) = \left(\frac{\pi}{2}\right)^2 - \theta^2$. This vector field is shown in Figure 3.8.

Fig. 3.8. Vector field on a circle. The arrows are in the tangent spaces at all points of the circle. Note the 0 arrows at $\theta = \pm\frac{\pi}{2}$.
Vector fields are discussed in more detail in several upcoming sections.
3.1.4 Tangent Plane to a Surface
For curves in space, the question of the existence of a tangent line to the curve is an important aspect to consider as it gives a best linear approximation to the curve locally. In particular, tangent lines exist at points in $C$ where the curve is smooth. Consider now a two-dimensional surface $S$ given by $z = f(x, y)$ and consider the question of the existence and computation of a tangent plane to a surface. We do not consider the question of whether a tangent plane exists or not just yet, but focus on the computation.

Fig. 3.9. Two-dimensional surface $S$ with curves $C_1$ and $C_2$ projecting to coordinate lines in the xy-plane and intersecting at $p = (x_0, y_0, f(x_0, y_0))$.

Definition 3.1.15. Let $S$ be a surface in $\mathbb{R}^3$ given by $z = f(x, y)$ and $p = (x_0, y_0, z_0)$ where $z_0 = f(x_0, y_0)$. If $p \in S$ is a point at which a unique tangent plane exists, then the tangent space of $S$ at $p$, denoted by $T_pS$, is the set of all vectors tangent to $S$ at $p$.
With the following calculation, we begin our characterization of $T_pS$. Consider a point $p = (x_0, y_0, f(x_0, y_0)) \in S$ and the curves $C_1$ and $C_2$ given by

$$r_1(t) = (x_0 + t,\ y_0,\ f(x_0 + t, y_0)) \quad\text{and}\quad r_2(t) = (x_0,\ y_0 + t,\ f(x_0, y_0 + t))$$

passing through $p$; that is, $r_1(0) = p = r_2(0)$. The projections of $C_1$ and $C_2$ in the xy-plane are Cartesian coordinate lines. Figure 3.9 shows a surface $S$ with the curves $C_1$ and $C_2$. Computing the derivative at $t = 0$ gives the tangent vectors of both curves at $p$. Using the chain rule for partial derivatives we obtain

$$r_1'(0) = \left(1,\ 0,\ \frac{\partial f}{\partial x}(x_0, y_0)\right) \quad\text{and}\quad r_2'(0) = \left(0,\ 1,\ \frac{\partial f}{\partial y}(x_0, y_0)\right).$$
These tangent vectors to $C_1$ and $C_2$ are shown in Figure 3.10. This shows that $r_1'(0)$ and $r_2'(0)$ are vectors tangent to $S$ at $p$. We can now state our result.

Theorem 3.1.16. Let $S$ be the surface given by $z = f(x, y)$ and $p = (x_0, y_0, f(x_0, y_0)) \in S$. Then, $T_pS$ is a vector space of dimension two with basis given by $\{\tau_1(p), \tau_2(p)\}$ where

$$\tau_1(p) = \left(1,\ 0,\ \frac{\partial f}{\partial x}(x_0, y_0)\right) \quad\text{and}\quad \tau_2(p) = \left(0,\ 1,\ \frac{\partial f}{\partial y}(x_0, y_0)\right).$$
Fig. 3.10. Two-dimensional surface $S$ with tangent vectors $\tau_1(p)$ and $\tau_2(p)$ at $p = (x_0, y_0, f(x_0, y_0))$.
The proof is interesting, but technical and is left to the end of the section. We begin with the simplest example.

Example 3.1.17. Consider the plane $P$ given by the equation $3x - y + 2z = 3$. A plane tangent to $P$ at any point $p$ should correspond to $P$ itself. We verify this. We compute the tangent plane to $P$ at the point $p = (1, 2, 1)$. We write $z = f(x, y) = \frac{1}{2}(3 - 3x + y)$ and from Theorem 3.1.16 we have

$$\tau_1(p) = \left(1,\ 0,\ \frac{\partial f}{\partial x}(1, 2)\right) = (1, 0, -3/2)$$

and

$$\tau_2(p) = \left(0,\ 1,\ \frac{\partial f}{\partial y}(1, 2)\right) = (0, 1, 1/2).$$

Those are indeed linearly independent vectors tangent to $P$ at $p$. The tangent plane of $P$ at $p$ is obtained by taking

$$T_pP = \operatorname{span}(\tau_1(p), \tau_2(p)) = \{\alpha(1, 0, -3/2) + \beta(0, 1, 1/2) \mid \alpha, \beta \in \mathbb{R}\} = \{(\alpha,\ \beta,\ -\tfrac32\alpha + \tfrac12\beta) \mid \alpha, \beta \in \mathbb{R}\}.$$

As expected, $T_pP$ is independent of $p$.
We look at an example where the partial derivatives depend on the base point, so the tangent plane depends on its location.

Example 3.1.18. Consider the elliptic paraboloid surface $S$ given by

$$z = f(x, y) = x^2 + y^2$$

for $(x, y) \in \mathbb{R}^2$. We determine the equation for the tangent plane at $p = (1, 2, 5)$. We obtain the tangent vectors to $S$

$$\tau_1(p) = \left(1,\ 0,\ \frac{\partial f}{\partial x}(1, 2)\right) = (1, 0, 2) \quad\text{and}\quad \tau_2(p) = \left(0,\ 1,\ \frac{\partial f}{\partial y}(1, 2)\right) = (0, 1, 4)$$

based at $p$ and so

$$T_pS = \{(\alpha,\ \beta,\ 2\alpha + 4\beta) \mid \alpha, \beta \in \mathbb{R}\}.$$

The paraboloid and its tangent plane at $p$ are shown in Figure 3.11.

Fig. 3.11. Two-dimensional surface $S$ with tangent vectors $\tau_1(p)$ and $\tau_2(p)$ at $p$.
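The basis vectors of Theorem 3.1.16 can also be approximated numerically. The sketch below is an illustration under assumptions made here: the partial derivatives are replaced by central differences, and the names `tangent_basis`, `paraboloid` and `plane` are hypothetical helpers, with `plane` encoding the surface of Example 3.1.17.

```python
def tangent_basis(f, x0, y0, h=1e-6):
    # Approximate tau_1 = (1, 0, f_x) and tau_2 = (0, 1, f_y) by central differences
    fx = (f(x0 + h, y0) - f(x0 - h, y0)) / (2.0 * h)
    fy = (f(x0, y0 + h) - f(x0, y0 - h)) / (2.0 * h)
    return (1.0, 0.0, fx), (0.0, 1.0, fy)

def paraboloid(x, y):
    return x * x + y * y

def plane(x, y):
    return 0.5 * (3.0 - 3.0 * x + y)

tau1, tau2 = tangent_basis(paraboloid, 1.0, 2.0)  # close to (1, 0, 2), (0, 1, 4)
sigma1, sigma2 = tangent_basis(plane, 1.0, 2.0)   # close to (1, 0, -3/2), (0, 1, 1/2)
```

Evaluating `tangent_basis(plane, ...)` at other base points should return the same two vectors, while for the paraboloid the third components change with the base point.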
Proof of Theorem 3.1.16. We begin by showing that all vectors in the vector subspace $\operatorname{span}\{\tau_1(p), \tau_2(p)\}$ are tangent to $S$ at $p$. Choose an arbitrary element $w \in \operatorname{span}\{\tau_1(p), \tau_2(p)\}$. Let $\alpha, \beta \in \mathbb{R}$ and

$$w := \alpha\tau_1(p) + \beta\tau_2(p) = \left(\alpha,\ \beta,\ \alpha\frac{\partial f}{\partial x}(x_0, y_0) + \beta\frac{\partial f}{\partial y}(x_0, y_0)\right)$$

and consider the curve on $S$ given by the vector function

$$r(t) = (x_0 + \alpha t,\ y_0 + \beta t,\ f(x_0 + \alpha t, y_0 + \beta t)), \quad t \in \mathbb{R}.$$

Then, $r(0) = p$ and

$$r'(0) = \left(\alpha,\ \beta,\ \alpha\frac{\partial f}{\partial x}(x_0, y_0) + \beta\frac{\partial f}{\partial y}(x_0, y_0)\right)$$

is tangent to $S$ at $p$ with $w = r'(0)$. So $w \in T_pS$ and therefore $\operatorname{span}\{\tau_1(p), \tau_2(p)\} \subseteq T_pS$. If $T_pS$ contains a vector $v$ not in $\operatorname{span}\{\tau_1(p), \tau_2(p)\}$, then $v$ is linearly independent from $\tau_1(p)$ and $\tau_2(p)$ and $\operatorname{span}\{\tau_1(p), \tau_2(p), v\}$ is a three-dimensional vector space, which would mean $T_pS = \mathbb{R}^3$. This implies $S = \mathbb{R}^3$, which is a contradiction. Therefore, $T_pS = \operatorname{span}\{\tau_1(p), \tau_2(p)\}$.

Exercises
(1) For each curve $C$ given by $r(t)$, answer the question:
(a) $r(t) = (t\cos(t),\ t\sin(t))$. Find $T_pC$ where $p = r(\pi)$.
(b) $r(t) = (e^{t},\ e^{2t},\ e^{4t})$. Find $T_pC$ where $p = r(\ln(2))$.
(c) $r(t) = (\cos(8t),\ \sin(8t))$. Find a general formula for $T_pC$ at an arbitrary point $p = r(t)$.
(d) $r(t) = (e^{t},\ e^{2t}/\sqrt{2},\ e^{3t}/3)$. Find a general formula for $T_pC$ at an arbitrary point $p = r(t)$.
(2) Draw (by hand) the vector field $F(x, y)$ by sampling a sufficient number of points in each quadrant. Compare your answer with a picture obtained using software.
(a) $F(x, y) = (1 - xy,\ 2 + x + y)$
(b) $F(x, y) = (2x,\ 3y)$
(c) $F(x, y) = (y,\ x - x^2)$
(3) For each surface $S$ given by $z = f(x, y)$, determine $T_pS$ by finding $\tau_1(p)$ and $\tau_2(p)$.
(a) $z = 3x^2 - 4y^2$ at $p = (2, 1, 8)$.
(b) $z = \cos(xy)$ at $p = (1, \pi/2, 0)$.
(c) $z = x^3 - 3xy^2$ at $p = (0, 0, 0)$.
(d) $z = 1 - (x^2 + y^2)$ at $p = (1/2, 1/2, 1/2)$.
(4) Check whether the vectors below are in T p S or not for problems 3(b) and 3(c). (a) v = (2, 3, 2π 3) (c) u = (0, 4, 16)
−
−
(b) w = (1, 2, 1) (d) q = (1, 2, 0).
−
−
(5) $T_pS$ can also be computed for surfaces given in other coordinate systems. In the case of the cylindrical coordinate system, the basis vectors for $T_pS$ are given by evaluating along the $r$ and $\theta$ coordinate lines in the xy-plane. Let $p = (r_0, \theta_0, f(r_0, \theta_0)) \in S$.
(a) Draw a sample picture for $S$ (as in Figure 3.9) showing the curves $C_1$ and $C_2$ on $S$ given by $r_1(t) = (r_0 + t,\ \theta_0,\ f(r_0 + t, \theta_0))$ and $r_2(t) = (r_0,\ \theta_0 + t,\ f(r_0, \theta_0 + t))$.
(b) Compute the derivative of $r_1(t)$ and $r_2(t)$ at $t = 0$.
(c) Argue geometrically that the vectors $f_1(p) = r_1'(0)$ and $f_2(p) = r_2'(0)$ are linearly independent and conclude that $T_pS = \operatorname{span}\{f_1(p), f_2(p)\}$.
(6) Using the method outlined in the previous problem, compute $T_pS$ for $S$ given by $z = f(r, \theta) = 1 - r^2$ at $(r, \theta, z) = (\sqrt{2}/2, \pi/4, 1/2)$. Compare with problem 3(d): are the tangent spaces the same?
(7) Consider the curve $C$ given by $r(t)$ and the point $p$ specified. In each case, answer the question.
(a) $r(t) = (1 + t^2,\ t - 3)$ and $p = r(1) = (2, -2)$: find the vector $v \in T_1\mathbb{R}$ such that $(10, 5) = r'(1)v$.
(b) $r(t) = (t,\ e^{t},\ e^{2t})$ and $p = r(0) = (0, 1, 1)$: is $(4, 4, 8) \in T_pC$? Explain why.
(c) $r(t) = (t^2,\ 2t,\ t^3)$ and $p = r(1) = (1, 2, 1)$: show that if $v_1 = 2 \in T_1\mathbb{R}$ and $v_2 = -3 \in T_1\mathbb{R}$, then $r'(1)(v_1 + v_2) = r'(1)v_1 + r'(1)v_2$.
(8) Show Proposition 3.1.8.
3.2 Differentials

In several elementary books about calculus, the concept of the differential of a function $y = f(x)$ is introduced in a simple fashion, by saying that the differential is $dy = f'(x)\,dx$ where $dx$ is an independent variable taking real values. The differential reappears when discussing definite integrals, under the integral sign, as one takes the limit of Riemann sums to define the integral. The reason for the appearance of $dx$ under the integral sign is either not mentioned, or the author states that it has no meaning save to identify the variable to be integrated, or it is the $\Delta x$ magically transforming into $dx$ as the limit is taken. Finally, when introducing the substitution rule, say $u = g(x)$, then $dx$ suddenly has a meaning again since now the $dx$ under the integral sign must be changed to $du$ using $du = g'(x)\,dx$. After this, one would understand a student being confused about the concept of differential.

Our goal in this section is to put the concept of differential on solid foundations such that its use in differentiation and integration becomes clear. The differential we define in this section must at least satisfy the following properties:
(1) $dx$ should take values in $\mathbb{R}$.
(2) If $y = f(x)$, then we must have $dy = f'(x)\,dx$.
(3) The $dx$ under the $\int$ sign must have a geometric meaning.
3.2.1 The differential in one-dimension

Example 3.2.1. Consider the curve $C$ on $\mathbb{R}$ given by $I(t) = t$.
(1) At a point $t_0 \in \mathbb{R}$, we compute the tangent vector $A = I'(t_0)e_1 = e_1 \in T_{t_0}\mathbb{R}$.
(2) Let $v \in T_{t_0}\mathbb{R}$ and compute the dot product of $v$ with $A$:

$$A \cdot v = (1e_1) \cdot (he_1) = h(e_1 \cdot e_1) = h \in \mathbb{R}.$$

(3) Let $v, w \in T_{t_0}\mathbb{R}$, then using the properties of the scalar product

$$A \cdot (v + w) = A \cdot v + A \cdot w \quad\text{and}\quad A \cdot (bv) = b(A \cdot v).$$
We use this example to define the differential.

Definition 3.2.2. The differential on $\mathbb{R}$, written $(dt)_{t_0}$, is a linear function $(dt)_{t_0} : T_{t_0}\mathbb{R} \to \mathbb{R}$ defined by

$$(dt)_{t_0}(v) := A \cdot v = h,$$

where $v = he_1$ and $A = I'(t_0)e_1$. Because $(dt)_{t_0}$ is the same at all base points $t_0$, we write only $dt$.
Example 3.2.3. The differential and the norm return different information. For instance, let $v = (-5)e_1$, then $dt(v) = -5$ while $\|v\| = 5$.

Thus, we use the tangent vector $A$ to the curve $I(t) = t$ to define a function $dt$ which takes tangent vectors and returns a real number corresponding to the length and the direction of those vectors. The direction of the vector is lost when using the norm instead of the differential. This definition may seem cumbersome and clumsy at the level of $\mathbb{R}$ and one should see this as the initial brick to more complicated constructions. Its strength is that it generalizes in a natural way to higher dimensional spaces $\mathbb{R}^n$, to functions, curves and surfaces. The differential of a general function $f(t)$ is defined by thinking of the curve $C$ on $\mathbb{R}$ with parametrization given by $f(t)$.

Definition 3.2.4. The differential of a differentiable function $f : \mathbb{R} \to \mathbb{R}$ at $t_0 \in \mathbb{R}$ is a function $(df)_{t_0} : T_{t_0}\mathbb{R} \to \mathbb{R}$ defined as follows. Let $A$ be the tangent vector of $f$ at $t_0$, $A = f'(t_0)e_1$, and let $v = he_1$, then

$$(df)_{t_0}(v) := A \cdot v.$$

Part (3) of Example 3.2.1 shows that $dt$ is a linear function and this generalizes automatically to differentiable functions.
Proposition 3.2.5. Let $f : \mathbb{R} \to \mathbb{R}$ be a differentiable function, $v, w \in T_{t_0}\mathbb{R}$ and $\alpha, \beta \in \mathbb{R}$, then

$$(df)_{t_0}(\alpha v + \beta w) = \alpha\,(df)_{t_0}(v) + \beta\,(df)_{t_0}(w).$$

We now can obtain our first important result with our definition of differential.
3.2.2 The classical differential formula
Using the definition above we now obtain the differential as seen in elementary calculus classes. Let $s = f(t)$ and $v = he_1$ and recall that $dt(v) = h$, then

$$(ds)_{t_0}(v) = (df)_{t_0}(v) = (f'(t_0)e_1) \cdot v = f'(t_0)\,h\,(e_1 \cdot e_1) = f'(t_0)\,h = f'(t_0)\,dt(v). \tag{3.1}$$

Therefore, we recover the well-known formula from elementary calculus courses, namely $(ds)_{t_0} = f'(t_0)\,dt$.
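Formula (3.1) treats the differential as a linear function on tangent vectors, which translates directly into code. In this sketch a tangent vector v = h e1 is represented by the number h; the helper name `differential` and the choice f(t) = sin(t) at t0 = 0 are assumptions made for illustration.

```python
import math

def differential(f_prime, t0):
    # (df)_{t0} as a function on tangent vectors v = h e1, represented by h
    def df(h):
        return f_prime(t0) * h
    return df

dt = differential(lambda t: 1.0, 0.0)  # differential of I(t) = t
ds = differential(math.cos, 0.0)       # differential of s = sin(t) at t0 = 0
```

With these definitions `dt(-5.0)` returns the signed value -5.0 rather than the norm 5, and `ds` satisfies the linearity stated in Proposition 3.2.5.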
Fig. 3.12. Geometric representation of the differential as the increment $ds(v)$ along the y-axis given by the projection of a tangent vector to the curve with projection $v = \Delta t\,e_1$ along the x-axis.
The geometric meaning of the differential is similar to that in elementary calculus. Let $t_0 \in \mathbb{R}$ and $v = (\Delta t)e_1 \in T_{t_0}\mathbb{R}$ be an increment from $t_0$. Then,

$$ds(\Delta t) = f'(t_0)\,dt(\Delta t).$$

However, for $s = f(t)$, the formula $ds = f'(t)\,dt$ now has additional meaning since we know that $dt$ (and so $ds$) are functions defined on tangent spaces. We can add and multiply those functions as with any other function and this is useful in the following sections. In particular, this justifies the Leibniz notation for the derivative,

$$\frac{ds}{dt} = f'(t)$$

where the term on the left is genuinely the division of $ds$ by $dt$.
3.2.3 Differentials on higher dimensional spaces
We now extend the differential to tangent spaces of arbitrary dimensions. Essentially, a coordinate line $C_j$ can be seen as a function $\mathbb{R} \to \mathbb{R}$ since all the other coordinates $x_i = x_{i0}$ are constant for $i \neq j$. We begin with $\mathbb{R}^2$. Let $p = (x_0, y_0) \in \mathbb{R}^2$ and $u = (\alpha_1, \alpha_2) \in T_p\mathbb{R}^2$. Because the tangent spaces of the coordinate lines $T_pC_1$ and $T_pC_2$ are one-dimensional subspaces of $T_p\mathbb{R}^2$, we extend our definition of differential to

$$dx,\ dy : T_p\mathbb{R}^2 \to \mathbb{R}$$

as follows. The coordinate lines $C_1$ and $C_2$ are given as $(x_1(t), y_1(t)) = (t, y_0)$ and $(x_2(t), y_2(t)) = (x_0, t)$, then

$$(dx)_{(x_0, y_0)}(u) := (x_1'(t), y_1'(t)) \cdot u = (1, 0) \cdot (\alpha_1, \alpha_2) = \alpha_1$$

and

$$(dy)_{(x_0, y_0)}(u) := (x_2'(t), y_2'(t)) \cdot u = (0, 1) \cdot (\alpha_1, \alpha_2) = \alpha_2.$$
Therefore, the geometric meaning of the differentials $dx$ and $dy$ is that they return the projections of $u$ along the $x$ and $y$ directions respectively. This construction extends automatically to higher dimensional spaces.

Proposition 3.2.6. Let $x_1, \dots, x_n$ be Cartesian coordinates on $\mathbb{R}^n$ and consider the vector $v = (\alpha_1, \dots, \alpha_n) \in T_p\mathbb{R}^n$, then $dx_j(v) = \alpha_j$ for $j = 1, \dots, n$.
We can now begin our discussion of differentials in the context of functions of several variables.
Differentials for functions of several variables
Consider a surface $S$ given by a function $z = f(x, y)$. We see in Section 3.1.4 that the tangent space at a point $p = (x_0, y_0, z_0) \in S$ is given by $\operatorname{span}\{\tau_1(p), \tau_2(p)\}$ where

$$\tau_1(p) = \left(1,\ 0,\ \frac{\partial f}{\partial x}(x_0, y_0)\right) \quad\text{and}\quad \tau_2(p) = \left(0,\ 1,\ \frac{\partial f}{\partial y}(x_0, y_0)\right).$$

We now define $(df)_{(x_0, y_0)}$ on vectors $v \in T_pS$. We need $(df) : T_pS \to \mathbb{R}$. Let $v \in T_pS$, then

$$v = \left(\alpha,\ \beta,\ \alpha\frac{\partial f}{\partial x}(x_0, y_0) + \beta\frac{\partial f}{\partial y}(x_0, y_0)\right).$$

Now,

$$(dx)_p(v) = \alpha, \qquad (dy)_p(v) = \beta$$

and

$$(dz)_p(v) = \alpha\frac{\partial f}{\partial x}(x_0, y_0) + \beta\frac{\partial f}{\partial y}(x_0, y_0).$$

But this means

$$(dz)_p(v) = \frac{\partial f}{\partial x}(x_0, y_0)\,(dx)_p(v) + \frac{\partial f}{\partial y}(x_0, y_0)\,(dy)_p(v).$$

Equivalently, recalling the gradient in Cartesian coordinates, we have

$$(dz)_p(v) = \nabla f(x_0, y_0) \cdot ((dx)_p(v),\ (dy)_p(v)). \tag{3.2}$$
But (3.2) has exactly the form of our previous definitions of differential: "a derivative" scalar product with a vector. Because $z = f(x, y)$ and the values of $dx$ and $dy$ do not depend on the point $p$ explicitly, this leads to the following definition.

Definition 3.2.7. The differential of $f$ at $(x_0, y_0)$ is defined by

$$(df)_{(x_0, y_0)} := \frac{\partial f}{\partial x}(x_0, y_0)\,dx + \frac{\partial f}{\partial y}(x_0, y_0)\,dy. \tag{3.3}$$

In particular, $(df)_{(x_0, y_0)}$ can be evaluated at any vector $v \in T_{(x_0, y_0)}\mathbb{R}^2$.

The formula is written more simply

$$df = \frac{\partial f}{\partial x}\,dx + \frac{\partial f}{\partial y}\,dy.$$
Geometrically, we see that $df(v)$ gives the variation of the function $f$ in the direction of the vector $v$. For this reason, it is also called the directional derivative of $f$. In particular, because we can express

$$df(v) = \nabla f(x, y) \cdot (dx(v),\ dy(v)),$$

by choosing a unit vector $v$, we see that the directional derivative of $f$ is maximal if $v$ is parallel to the vector $\nabla f(x, y)$ and pointing in the same direction. If $v$ is chosen perpendicular to $\nabla f(x, y)$, then $df(v) = 0$ and this corresponds to a direction where $f$ does not vary. This leads us to introduce the concept of level set curves of a function $f(x, y)$, which are defined as

$$f^{-1}(c) := \{(x, y) \mid f(x, y) = c\}$$

where $c \in \mathbb{R}$. Note that for a fixed $c \in \mathbb{R}$, typically, $f(x, y) = c$ determines a curve. The function $f$ is constant along $f^{-1}(c)$ and so the gradient vector is perpendicular to level set curves from the calculation above.

A similar construction of the differential can be done for differentiable functions of more than two variables and leads to the formula:

$$df = \sum_{i=1}^{n} \frac{\partial f}{\partial x_i}\,dx_i.$$

However, the proof of this case must wait for the general definition of tangent spaces for $n$-dimensional surfaces in Chapter 5, Section 5.4. We now look at two examples.
Example 3.2.8. Consider the surface $S$ given by $z = f(x, y) = 3 - 3x + 2xy^2$. Let $p = (1, 2)$; we compute $(df)_p$ evaluated on the vector $v = (3, -4) \in T_p\mathbb{R}^2$:

$$(df)_p(v) = \frac{\partial f}{\partial x}(1, 2)\,dx(v) + \frac{\partial f}{\partial y}(1, 2)\,dy(v) = 5(3) + (8)(-4) = -17.$$

We now determine the direction of maximal increase of $f$. We do it by first determining the direction of no increase of $f$, which is obtained using

$$0 = \nabla f \cdot (v_1, v_2) = (5, 8) \cdot (v_1, v_2)$$

where $(v_1, v_2)$ is a unit vector. Therefore, $v_2 = -5v_1/8$ and $\|(v_1, -5v_1/8)\| = 1$. Therefore,

$$(v_1, v_2) = \left(\frac{8}{\sqrt{89}},\ \frac{-5}{\sqrt{89}}\right).$$

The unit vectors perpendicular to this direction are

$$\pm\left(\frac{5}{\sqrt{89}},\ \frac{8}{\sqrt{89}}\right).$$

Therefore, the direction of maximal increase is given by

$$\left(\frac{5}{\sqrt{89}},\ \frac{8}{\sqrt{89}}\right).$$

Example 3.2.9. Consider the function $f : \mathbb{R}^4 \to \mathbb{R}$ defined by

$$f(x, y, z, w) = x^2 + y^2 + z^2 + w^2.$$

Then

$$df = \frac{\partial f}{\partial x}\,dx + \frac{\partial f}{\partial y}\,dy + \frac{\partial f}{\partial z}\,dz + \frac{\partial f}{\partial w}\,dw = 2x\,dx + 2y\,dy + 2z\,dz + 2w\,dw.$$
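The computations of Example 3.2.8 can be reproduced with a short script. This is a sketch in which the gradient is entered by hand; the names `grad_f`, `df`, `steepest` and `no_change` are introduced here for illustration.

```python
import math

def grad_f(x, y):
    # Gradient of f(x, y) = 3 - 3x + 2xy^2, computed by hand
    return (-3.0 + 2.0 * y * y, 4.0 * x * y)

def df(p, v):
    # (df)_p(v) = f_x(p) dx(v) + f_y(p) dy(v), with dx(v) = v[0] and dy(v) = v[1]
    gx, gy = grad_f(*p)
    return gx * v[0] + gy * v[1]

p = (1.0, 2.0)
value = df(p, (3.0, -4.0))

gx, gy = grad_f(*p)
norm = math.hypot(gx, gy)
steepest = (gx / norm, gy / norm)    # unit vector along the gradient
no_change = (gy / norm, -gx / norm)  # unit vector perpendicular to the gradient
```

Evaluating `df` on `no_change` gives a value that is zero up to rounding, while on `steepest` it returns the length of the gradient, the largest value attained on unit vectors.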
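Fig. 3.13. Basis vectors $\left.\frac{\partial}{\partial r}\right|_p$, $\left.\frac{\partial}{\partial \theta}\right|_p$ and $\left.\frac{\partial}{\partial r}\right|_q$, $\left.\frac{\partial}{\partial \theta}\right|_q$ of $T_p\mathbb{R}^2$ and $T_q\mathbb{R}^2$ where $p = (1, 1)$ and $q = (-\sqrt{3}/2, -1/2)$. The dashed circles emphasize the tangency of $\frac{\partial}{\partial \theta}$ with the circles through $p$ and $q$ and the dashed rays the radial directions of $\frac{\partial}{\partial r}$.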
3.2.4 Differentials of curvilinear coordinate systems
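Differentials can be defined in any coordinate system. Let $(r, \theta)$ be polar coordinates on the plane and let $u_0 = (r_0, \theta_0) \in \mathbb{R}^2$. From Section 3.1, Problem (4), $T_{u_0}\mathbb{R}^2$ is spanned by

$$(1, 0) = \left.\frac{d}{dt}(r_0 + t,\ \theta_0)\right|_{t=0} \quad\text{and}\quad (0, 1) = \left.\frac{d}{dt}(r_0,\ \theta_0 + t)\right|_{t=0}.$$

Let $r_1(t) = ((r_0 + t)\cos(\theta_0),\ (r_0 + t)\sin(\theta_0))$ and $r_2(t) = (r_0\cos(\theta_0 + t),\ r_0\sin(\theta_0 + t))$. Consider the vectors

$$r_1'(0) = (\cos\theta_0,\ \sin\theta_0) \quad\text{and}\quad r_2'(0) = (-r_0\sin\theta_0,\ r_0\cos\theta_0)$$

based at $p$. See Figure 3.13. We define the following vectors at $p = (x, y) = (r\cos\theta, r\sin\theta)$:

$$\left.\frac{\partial}{\partial r}\right|_p := (\cos\theta,\ \sin\theta) \quad\text{and}\quad \left.\frac{\partial}{\partial \theta}\right|_p := (-r\sin\theta,\ r\cos\theta), \tag{3.4}$$

which form an orthogonal basis of the tangent space $T_p\mathbb{R}^2$. Note however that it is not an orthonormal basis since $\|\partial/\partial\theta\| = r$. Thus, we see that the standard orthogonal basis of $T_{(r, \theta)}\mathbb{R}^2$ is sent via the derivative of the vector functions $r_1$ and $r_2$ to the basis $\{\partial/\partial r, \partial/\partial\theta\}$ at $T_p\mathbb{R}^2$. Therefore, a vector $v \in T_p\mathbb{R}^2$ can be written

$$v = v_r\frac{\partial}{\partial r} + v_\theta\frac{\partial}{\partial \theta};$$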
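and in coordinates we write $v = (v_r, v_\theta)$. We conclude this discussion by defining the scalar product for vectors in $T_p\mathbb{R}^2$ written in the basis $(\partial/\partial r, \partial/\partial\theta)$. We know that the scalar product of $v$ and $w$ should be the product of the length of the projection of $v$ onto $w$ times the length of $w$. This means

$$\frac{\partial}{\partial r} \cdot \frac{\partial}{\partial r} = \left\|\frac{\partial}{\partial r}\right\|^2 = 1 \quad\text{and}\quad \frac{\partial}{\partial \theta} \cdot \frac{\partial}{\partial \theta} = \left\|\frac{\partial}{\partial \theta}\right\|^2 = r^2.$$

Hence, letting

$$v = a\frac{\partial}{\partial r} + b\frac{\partial}{\partial \theta} \quad\text{and}\quad w = c\frac{\partial}{\partial r} + d\frac{\partial}{\partial \theta}$$

then $v \cdot w := ac + bd\,r^2$, or we can also write

$$v \cdot w = (a, b)\begin{pmatrix} 1 & 0 \\ 0 & r^2 \end{pmatrix}(c, d)^T. \tag{3.5}$$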
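Formula (3.5) can be compared with the ordinary dot product of the same vectors written in Cartesian components. The sketch below uses the definitions (3.4); the base point, the sample components and the helper names are arbitrary choices made here.

```python
import math

def polar_basis(r, theta):
    # Cartesian components of d/dr and d/dtheta at p = (r cos(theta), r sin(theta))
    d_r = (math.cos(theta), math.sin(theta))
    d_theta = (-r * math.sin(theta), r * math.cos(theta))
    return d_r, d_theta

def dot_polar(v, w, r):
    # Formula (3.5): (a, b) . (c, d) = ac + bd r^2 in the basis (d/dr, d/dtheta)
    return v[0] * w[0] + v[1] * w[1] * r * r

def to_cartesian(v, r, theta):
    # Expand v = v_r d/dr + v_theta d/dtheta in Cartesian components
    d_r, d_theta = polar_basis(r, theta)
    return (v[0] * d_r[0] + v[1] * d_theta[0],
            v[0] * d_r[1] + v[1] * d_theta[1])

r, theta = 2.0, 0.7              # arbitrary base point in polar coordinates
v, w = (1.0, -0.5), (0.25, 3.0)  # arbitrary polar components
vc, wc = to_cartesian(v, r, theta), to_cartesian(w, r, theta)
cartesian_dot = vc[0] * wc[0] + vc[1] * wc[1]
```

The two numbers `dot_polar(v, w, r)` and `cartesian_dot` should agree up to rounding, which is the content of (3.5).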
Fig. 3.14. Basis vectors $\left.\frac{\partial}{\partial r}\right|_p$ and $\left.\frac{\partial}{\partial \theta}\right|_p$ at $p = (1, 1)$ with tangent vector $v = (v_r, v_\theta) = (1, -1)$.
The relationship between differentials in different coordinate systems is straightforward to obtain. We obtain $dr$, $d\theta$ as functions of $dx$ and $dy$ using $r = f(x, y) = \sqrt{x^2 + y^2}$, $\theta = g(x, y) = \arctan(y/x)$. We know

$$dr = \frac{\partial f}{\partial x}\,dx + \frac{\partial f}{\partial y}\,dy = \frac{x}{\sqrt{x^2 + y^2}}\,dx + \frac{y}{\sqrt{x^2 + y^2}}\,dy \tag{3.6}$$

and

$$d\theta = \frac{\partial g}{\partial x}\,dx + \frac{\partial g}{\partial y}\,dy = \frac{-y}{x^2 + y^2}\,dx + \frac{x}{x^2 + y^2}\,dy. \tag{3.7}$$

Proposition 3.2.10. Let $v = (v_r, v_\theta) \in T_p\mathbb{R}^2$. The differentials $dr$ and $d\theta$ at $p$ evaluated at $v$ are given by

$$(dr)_p(v) := v_r \quad\text{and}\quad (d\theta)_p(v) := v_\theta.$$
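Proof. We use (3.6), (3.7) and the definitions (3.4) to compute

$$(dr)_p\left(\frac{\partial}{\partial r}\right) = \frac{x}{r}\,dx\left(\frac{\partial}{\partial r}\right) + \frac{y}{r}\,dy\left(\frac{\partial}{\partial r}\right) = \frac{r\cos\theta}{r}\cos\theta + \frac{r\sin\theta}{r}\sin\theta = 1$$

and

$$(d\theta)_p\left(\frac{\partial}{\partial \theta}\right) = \frac{-y}{r^2}\,dx\left(\frac{\partial}{\partial \theta}\right) + \frac{x}{r^2}\,dy\left(\frac{\partial}{\partial \theta}\right) = \frac{-r\sin\theta}{r^2}(-r\sin\theta) + \frac{r\cos\theta}{r^2}\,r\cos\theta = 1.$$

From the above, it is straightforward to check that $(dr)_p(\partial/\partial\theta) = (d\theta)_p(\partial/\partial r) = 0$. Therefore, by linearity of the differentials, the proposition is verified.

Using these formulae, we can evaluate $dr$ and $d\theta$ on vectors $v \in T_p\mathbb{R}^2$ in Cartesian coordinate systems. Consider for instance, $v = (v_x, v_y) = (1, -1)$ at $p = (1, 1)$ as seen in Figure 3.15; the vector $v$ is tangent to the circle of radius $\sqrt{2}$ and so $(dr)_p$ should be zero and $(d\theta)_p$ nonzero on this vector. We verify with the computation:

$$(dr)_p(v) = \frac{1}{\sqrt{2}}\,dx(1, -1) + \frac{1}{\sqrt{2}}\,dy(1, -1) = 0,$$

$$(d\theta)_p(v) = \frac{-1}{2}\,dx(1, -1) + \frac{1}{2}\,dy(1, -1) = -1.$$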
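Formulas (3.6) and (3.7) are easy to evaluate on Cartesian components. The following sketch uses the function names `dr` and `dtheta`, which are chosen here for illustration, and reproduces the verification at p = (1, 1) with v = (1, -1).

```python
import math

def dr(p, v):
    # Formula (3.6): dr = (x dx + y dy) / sqrt(x^2 + y^2), with dx(v) = v[0], dy(v) = v[1]
    x, y = p
    r = math.hypot(x, y)
    return (x * v[0] + y * v[1]) / r

def dtheta(p, v):
    # Formula (3.7): dtheta = (-y dx + x dy) / (x^2 + y^2)
    x, y = p
    return (-y * v[0] + x * v[1]) / (x * x + y * y)

p = (1.0, 1.0)
v = (1.0, -1.0)  # tangent to the circle of radius sqrt(2) through p
```

At this point the Cartesian components of ∂/∂θ are (-1, 1) by (3.4), and `dtheta(p, (-1.0, 1.0))` returns 1, in agreement with the proof of Proposition 3.2.10.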
Fig. 3.15. Basis vectors $\left.\frac{\partial}{\partial r}\right|_p$ and $\left.\frac{\partial}{\partial \theta}\right|_p$ at $p = (1, 1)$ with tangent vector $v = (v_x, v_y) = (1, -1)$.

Exercises

(1) Using the differential formula (3.3), compute the differential of the functions listed below.
(a) $f(x, y) = x/y$. Find the direction of maximal increase of $f$ at $(1, 1)$.
(b) $f(x, y) = x^2y - y^2x$. Find the direction of no increase of $f$ at $(2, 1)$.
(c) $f(x, y, z) = 3\cos(xyz)$
(d) $f(x_1, \dots, x_n) = \exp((x_1 - a_1) \cdots (x_n - a_n))$.
(2) Show the following statements:
(a) $d(f(t)g(t)) = (f'(t)g(t) + f(t)g'(t))\,dt$.
(b) If $g(t) \neq 0$, $d(f(t)/g(t)) = (f'(t)g(t) - f(t)g'(t))\,g(t)^{-2}\,dt$.
(c) $d((f \circ g)(t)) = f'(g(t))\,g'(t)\,dt$.
(3) Consider a curve $C$ given by $r(t) = (x(t), y(t))$ where $t \in [\alpha, \beta]$. Let $\alpha = t_0 < t_1 < \dots < t_{n-1} < t_n = \beta$ and consider the vector $v_j = t_{j+1} - t_j$ based at $t_j \in \mathbb{R}$ for $j = 0, \dots, n - 1$ (i.e. $v_0 = t_1 - t_0$ is a vector based at $t_0$, $v_1 = t_2 - t_1$ is a vector based at $t_1$, etc). We denote $p_j = r(t_j)$.
(a) Compute $dt(v_j)$.
(b) Explain why the vector $W_j := r'(t_j)\,dt(v_j)$ is in the tangent space $T_{p_j}C$.
(c) Compute $dx(W_j)$, $dy(W_j)$.
(4) Generalize the problem above to a curve $r(t) = (x_1(t), \dots, x_n(t))$ with $t \in [a, b]$ by doing part (c) in this context; that is, compute $dx_i(W_j)$ for $i = 1, \dots, n$.
(5) Evaluate $dr$ and $d\theta$ at the points $p \in \mathbb{R}^2$ and vectors $v \in T_p\mathbb{R}^2$ in polar and Cartesian coordinates using the direct definition of $dr$ and $d\theta$ and the formulae (3.6) and (3.7). Check that the answers agree. Illustrate the vectors at the point $p$.
(a) $p = (2, 0)$; $v = (v_r, v_\theta) = (1/2, 1/2)$ and $v = (v_x, v_y) = (1/2, 1)$.
(b) $p = (1/2, -\sqrt{3}/2)$; $v = (v_r, v_\theta) = (1, -1)$ and $v = (v_x, v_y) = \left(\frac{1 - \sqrt{3}}{2},\ \frac{-(1 + \sqrt{3})}{2}\right)$.
3.2 Differentials
79
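(6) Compute dx and dy in terms of dr and dθ using $x = r\cos\theta$ and $y = r\sin\theta$. Check that your result agrees with solving for dx and dy from the equations (3.6) and (3.7).
(7) Consider the spherical coordinate system (ρ, φ, θ) and compute dρ, dφ and dθ as functions of dx, dy and dz.
(8) Compute dx, dy and dz as functions of dρ, dθ and dφ. You can do it directly using the formulae relating (x, y, z) with (ρ, φ, θ) or solve for dx, dy and dz from the previous problem.
(9) Let $p = (r\cos\theta, r\sin\theta, z)$. By defining curves
$$r_1(t) = ((r+t)\cos\theta, (r+t)\sin\theta, z)^T, \quad r_2(t) = (r\cos(\theta+t), r\sin(\theta+t), z)^T, \quad r_3(t) = (r\cos\theta, r\sin\theta, z+t)^T$$
show that the vectors
$$\left.\frac{\partial}{\partial r}\right|_p := (\cos\theta, \sin\theta, 0), \quad \left.\frac{\partial}{\partial \theta}\right|_p := (-r\sin\theta, r\cos\theta, 0), \quad \left.\frac{\partial}{\partial z}\right|_p := (0, 0, 1)$$
form an orthogonal basis of $T_p\mathbb{R}^3$.
(10) Let $p = (\rho\cos\theta\sin\varphi, \rho\sin\theta\sin\varphi, \rho\cos\varphi)$. By defining curves
$$r_1(t) = ((\rho+t)\cos\theta\sin\varphi, (\rho+t)\sin\theta\sin\varphi, (\rho+t)\cos\varphi)^T,$$
$$r_2(t) = (\rho\cos\theta\sin(\varphi+t), \rho\sin\theta\sin(\varphi+t), \rho\cos(\varphi+t))^T,$$
$$r_3(t) = (\rho\cos(\theta+t)\sin\varphi, \rho\sin(\theta+t)\sin\varphi, \rho\cos\varphi)^T$$
show that the vectors
$$\left.\frac{\partial}{\partial \rho}\right|_p := (\cos\theta\sin\varphi, \sin\theta\sin\varphi, \cos\varphi),$$
$$\left.\frac{\partial}{\partial \varphi}\right|_p := (\rho\cos\theta\cos\varphi, \rho\sin\theta\cos\varphi, -\rho\sin\varphi),$$
$$\left.\frac{\partial}{\partial \theta}\right|_p := (-\rho\sin\theta\sin\varphi, \rho\cos\theta\sin\varphi, 0)$$
form an orthogonal basis of $T_p\mathbb{R}^3$ with
$$\|\partial/\partial\rho\| = 1, \quad \|\partial/\partial\varphi\| = \rho, \quad \|\partial/\partial\theta\| = \rho\sin\varphi.$$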
(11) Show that the scalar products in the bases of the previous two problems are
(a) cylindrical coordinates: $(v_1, v_2, v_3) \cdot (w_1, w_2, w_3) = v_1 w_1 + r^2 v_2 w_2 + v_3 w_3$.
(b) spherical coordinates: $(v_1, v_2, v_3) \cdot (w_1, w_2, w_3) = v_1 w_1 + \rho^2 v_2 w_2 + \rho^2 \sin^2\varphi\, v_3 w_3$.
3.3 1-forms
In this section, we introduce a class of objects called 1-forms, which include the differentials. We begin with a reminder from elementary calculus. The Fundamental Theorem of Calculus states that for any continuous function $f : [a, b] \to \mathbb{R}$, one can find a differentiable function F(x) (the antiderivative) such that $F'(x) = f(x)$, or in the language of differentials
$$dF = F'(x)\,dx = f(x)\,dx.$$
Consider now the problem in two dimensions. Let f(x, y) and g(x, y) be continuous functions. Is it always possible to find F(x, y) such that
$$dF = f(x, y)\,dx + g(x, y)\,dy,$$
which means
$$\frac{\partial F}{\partial x} = f(x, y) \quad \text{and} \quad \frac{\partial F}{\partial y} = g(x, y)? \qquad (3.8)$$
The answer is negative, and the cases for which it is possible are studied in more detail in a forthcoming section. Here is a case where (3.8) cannot be satisfied. Let
$$f(x, y) = y \quad \text{and} \quad g(x, y) = x^2.$$
Indeed, integrating with respect to x,
$$\frac{\partial F}{\partial x} = y \implies F(x, y) = xy + G(y)$$
where G(y) is some arbitrary function of y. But then,
$$\frac{\partial F}{\partial y} = x + G'(y) = x^2$$
cannot be satisfied, because $G'(y)$ does not depend on x. Note however that the expression
$$y\,dx + x^2\,dy$$
on its own has a well-defined mathematical meaning even though it is not the differential of any function. In fact, expressions such as
$$f(x, y)\,dx + g(x, y)\,dy \qquad (3.9)$$
have many uses in physics, as we see below. All differentials and expressions such as (3.9) are examples of mathematical objects called 1-forms.
Definition 3.3.1. Let U be an open subset of $\mathbb{R}^n$ and consider functions $a_j(x) := a_j(x_1, \ldots, x_n)$ for all $j = 1, \ldots, n$.
(1) If $a_j$ is a continuous function for $j = 1, \ldots, n$, then
$$\omega(x) = a_1(x)\,dx_1 + \ldots + a_n(x)\,dx_n$$
is a continuous 1-form on U.
(2) If $a_j$ is a differentiable function for $j = 1, \ldots, n$, then
$$\omega(x) = a_1(x)\,dx_1 + \ldots + a_n(x)\,dx_n$$
is a differentiable 1-form on U.
1-forms are defined using the differentials $dx_1, \ldots, dx_n$; therefore, they act on vectors in the tangent space of points $p \in U \subset \mathbb{R}^n$. That is, for each $p \in U$,
$$\omega(p) : T_p\mathbb{R}^n \to \mathbb{R}.$$
Example 3.3.2. Consider the 1-form
$$\omega(x, y) = y\,dx + x^2\,dy,$$
the point $p = (3, -2) \in \mathbb{R}^2$ and the vector $v = (-1, 5) \in T_p\mathbb{R}^2$. Then
$$\omega(3, -2)\langle -1, 5 \rangle = (-2)\,dx(-1, 5) + (3)^2\,dy(-1, 5) = -2(-1) + 9(5) = 47.$$
We use the notation $\langle\,\cdot\,\rangle$ for the vectors on which ω is applied. This should alleviate possible confusion when writing down such expressions.
1-forms can also be defined using curvilinear coordinate systems.
Example 3.3.3. Consider the 1-form $\omega = r^2\,dr + r\theta\,d\theta$. We evaluate it at $p = (r_0, \theta_0) = (2, \pi/3)$ on the vector $v = (v_r, v_\theta) = (0.2, 1.3) \in T_p\mathbb{R}^2$:
$$\omega(2, \pi/3)\langle 0.2, 1.3 \rangle = (2)^2\,dr(0.2, 1.3) + (2)(\pi/3)\,d\theta(0.2, 1.3) = 4(0.2) + (2\pi/3)(1.3).$$
If the tangent vector is in Cartesian coordinates v = (vx , vy ), then one needs to use the formulae (3.6) and (3.7) to evaluate dr and dθ.
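The evaluation rule used in the last two examples can be sketched in a few lines of Python. This is only an illustration: the helper name `evaluate_one_form` and the representation of a 1-form as a list of coefficient functions are assumptions made here, not notation from the text. It relies on the fact that each coordinate differential returns the corresponding component of the vector.

```python
import math

def evaluate_one_form(coeffs, p, v):
    """Evaluate omega(p)<v> = sum_i a_i(p) * v_i.

    coeffs: list of coefficient functions a_i (hypothetical representation)
    p:      point at which the 1-form is evaluated
    v:      tangent vector at p, in the same coordinates as the differentials
    """
    return sum(a(*p) * v_i for a, v_i in zip(coeffs, v))

# omega = r^2 dr + r*theta dtheta from Example 3.3.3, in polar coordinates
omega = [lambda r, th: r**2, lambda r, th: r * th]
p = (2.0, math.pi / 3)
v = (0.2, 1.3)  # components (v_r, v_theta)

value = evaluate_one_form(omega, p, v)
print(value)  # numerically equal to 4(0.2) + (2*pi/3)(1.3)
```

The same helper applies to a 1-form written in Cartesian coordinates, provided the vector is also given in Cartesian components.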
We conclude this section by looking at the set of (continuous or differentiable) 1-forms in n dimensions at a point p, which we denote
$$\Omega_p^n := \left\{ \omega : T_p\mathbb{R}^n \to \mathbb{R} \;\Big|\; \omega = \sum_{i=1}^{n} a_i(x_1, x_2, \ldots, x_n)\,dx_i \right\}.$$
We have the following result.
Proposition 3.3.4. Ω pn is a vector space (over R).
Proof. One needs to check that the sum of two 1-forms $\omega_1, \omega_2 \in \Omega^n$ is also in $\Omega^n$, and that for any $a \in \mathbb{R}$ and $\omega \in \Omega^n$, $a\omega \in \Omega^n$. We leave the details to the reader.
Let $U \subset \mathbb{R}^n$ be an open set. We denote by $\Omega^n(U)$ the vector space of 1-forms defined on U.
3.3.1 1-Forms in Physics
As noted above, 1-forms are useful in describing quantities in physics, in particular for computing "work". Many more examples exist, but for now we focus on work. In elementary physics, the work (W) done by a force is described as the product of a force F and the distance d travelled by a mass moved in the direction of F: $W = F \cdot d$, with units of newton-meters (N m). A more general and practical way of expressing work uses 1-forms. The force may depend on its location and so
$$F = F(p) = (F_1(p), \ldots, F_n(p))$$
is a vector field where $p = (x_1, \ldots, x_n) \in \mathbb{R}^n$. One defines
$$dW(p) = \sum_{i=1}^{n} F_i(p)\,dx_i. \qquad (3.10)$$
At a point p, one can approximate a small distance in the direction of motion given by r(t) by taking a vector $v = s\,r'(t) \in T_p\mathbb{R}^n$ for s small, and so $dx_i(v) = x_i'(t)\,s = x_i'(t)\,dt(s)$. Therefore, for $v \in T_p\mathbb{R}^n$, $F_i(p)\,dx_i(p)\langle v \rangle$ is the product of the force F in the ith direction times the projection of the distance vector v in the ith direction. Hence, work at p is the linear superposition of the work done in every Cartesian coordinate direction.
Example 3.3.5. Let $F(x, y) = (x, -3)$ be the force of the wind and suppose that a cyclist travels along the path from $P = (3, 2)$ to $Q = (3, -1)$ and then from Q to $R = (-2, -3)$. We compute the work done at each point p by the wind on the cyclist. Suppose that the path PQ is parametrized by $r_1(t) = (3, 2 - t)$ with $t \in [0, 3]$ and that the velocity of the cyclist is given by the tangent vectors along PQ, which are of the form $v = r_1'(t) = (0, -1)$. Then
$$dW(3, 2 - t)\langle v \rangle = x\,dx(v) + (-3)\,dy(v) = (3)(0) - 3(-1) = 3.$$
On the path QR, a parametrization is given by $r_2(t) = (3 - 5t, -1 - 2t)$ with $t \in [0, 1]$, and suppose also that the velocity of the cyclist is given by the tangent vectors $w = r_2'(t) = (-5, -2)$. Then
$$dW(3 - 5t, -1 - 2t)\langle w \rangle = x\,dx(w) + (-3)\,dy(w) = (3 - 5t)(-5) + (-3)(-2) = -9 + 25t.$$
Fig. 3.16. Trajectory taken by cyclist: from $P = (3, 2)$ to $Q = (3, -1)$ and from Q to $R = (-2, -3)$.
In the following section, we show that the total work over a path can be computed by integrating dW.
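As an illustration of formula (3.10), the following Python sketch evaluates the work 1-form of Example 3.3.5 at a few points of each path. The function name `dW` and the sample parameter values are choices made for this sketch only.

```python
def dW(p, v):
    """Work 1-form dW = x dx + (-3) dy for the wind force F(x, y) = (x, -3)."""
    x, y = p
    force = (x, -3.0)
    # dx(v) and dy(v) are simply the components of v
    return force[0] * v[0] + force[1] * v[1]

def r1(t):
    return (3.0, 2.0 - t)              # path from P to Q, t in [0, 3]

def r2(t):
    return (3.0 - 5.0 * t, -1.0 - 2.0 * t)  # path from Q to R, t in [0, 1]

v1 = (0.0, -1.0)   # r1'(t)
v2 = (-5.0, -2.0)  # r2'(t)

work_PQ = [dW(r1(t), v1) for t in (0.0, 1.5, 3.0)]
work_QR = [(t, dW(r2(t), v2)) for t in (0.0, 0.5, 1.0)]
print(work_PQ)
print(work_QR)
```

Along PQ the value is constant, while along QR it follows $-9 + 25t$, in agreement with the computation in the example.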
Exercises
(1) Evaluate the 1-forms at the point p and on the vector v given. (a) ω = 2x dx + (3xy − y2 ) dy , p = (2, −1), v = (−1, 1). (b) ω = cos(x + y) dx + 3xyz dy + (x2 + z 2 ) dz , p = (0, 0, 1), v = (3, −2, 4). (c) ω = x1 x2 dx1 + x 1 x3 dx2 + x 2 x3 dx3 + x 3 x4 dx4 , p = (−3, −2, 1, 1), v = (2, −1, 1, −3). (d) ω = (θ + r) dr + (rθ 2 ) dθ, p = (r, θ) = (1, π) and v = (vr , vθ ) = (1, 2). (e) ω = 2θr 2 dr + (θr) dθ, p = (x, y) = (−1, −1) and v = (vx , vy ) = (2, −3). (2) Consider the 1-form ω = 2xy dx + x 2 dy . Find a function F (x, y) such that dF = ω . Is this function F the unique function for which dF = ω ? (3) Prove that Ωn is a vector space (Proposition 3.3.4). (4) Compute the work 1-form dW done by the force F (x, y) = (3x, 2xy) at each point of the path r(t) = (t2 , 1 − t) with t ∈ [0, 1]. (5) Compute the work 1 -form dW done by the gravitational force
$$\vec{F}(x) = \frac{mMG}{r^2}\,\frac{-x}{\|x\|}$$
on an object of mass m falling on the earth with the trajectory $r(t) = ((1 - t)\cos(t), (1 - t)\sin(t), 1 - t)$ with $t \in [0, 1]$.
4 Line Integrals
We introduce the concept of line integrals starting with the integration of 1-forms. This leads to the first of the important theorems of Vector Calculus: the Fundamental Theorem of Line Integrals, which is a generalization of the Fundamental Theorem of Calculus seen in elementary calculus courses. We then extend these results to the context of vector fields.
4.1 Integration of 1-forms
When introducing differentials, we mentioned that expressions such as dt, dx, dy need to satisfy conditions (1), (2), (3) stated at the beginning of Section 3.2. The first two are satisfied with the definition given above, and we even generalized it to higher dimensions. We now look at condition (3), which has to do with integration. However, as the previous section shows, the concept of 1-forms is more general (at least in dimensions greater than 1), and so we define what it means to integrate 1-forms.
4.1.1 Revisiting integration in one-dimension
We can now properly define integration, not of functions, but of 1-forms over "space curves" in $\mathbb{R}$. This is done using Riemann sums as in elementary calculus.
Fig. 4.1. Curve C is the interval [a, b] with partition $a = t_0 < t_1 < \cdots < t_{n-1} < t_n = b$ and vectors $v_j$, $j = 0, \ldots, n-1$.
Let C be the positively oriented curve given by the interval $[a, b] \subset \mathbb{R}$, with parametrization $r(t) = t$, $t \in [a, b]$. Let $\omega = f(t)\,dt$ be a continuous 1-form defined over C. We begin by defining a partition of [a, b]:
$$a = t_0 < t_1 < \ldots < t_n = b.$$
At each point $t_j$ of the partition, define vectors in the tangent space
$$v_j := t_{j+1} - t_j \in T_{t_j}\mathbb{R}, \quad j = 0, 1, \ldots, n-1,$$
and write $v_j = \Delta t_j\, e_1$ where $\Delta t_j = t_{j+1} - t_j$. Now, evaluate ω at each $t_j$ on $v_j$:
$$\omega(t_j)\langle v_j \rangle = f(t_j)\,dt(v_j)$$
and take the sum over all $j = 0, \ldots, n-1$. Therefore,
$$\sum_{j=0}^{n-1} \omega(t_j)\langle v_j \rangle = \sum_{j=0}^{n-1} f(t_j)\,dt(v_j). \qquad (4.1)$$
By noticing that $dt(v_j) = \Delta t_j$, (4.1) is just a Riemann sum used to define the integral as it is shown in elementary calculus. Adding points to the partition so that every subinterval $[t_{j-1}, t_j]$ is always subdivided, we define
$$\int_C \omega := \lim_{n \to \infty} \sum_{j=0}^{n-1} \omega(t_j)\langle v_j \rangle = \lim_{n \to \infty} \sum_{j=0}^{n-1} f(t_j)\,dt(v_j) = \int_a^b f(t)\,dt.$$
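The sum (4.1) can be tried out numerically. The Python sketch below assumes a uniform partition (the text allows any partition that is refined indefinitely) and uses the sample integrand $f(t) = 2t\cos(t^2)$ on $[0, \pi]$, whose antiderivative $\sin(t^2)$ gives the exact value for comparison. The function name `riemann_sum` is a choice made for this sketch.

```python
import math

def riemann_sum(f, a, b, n):
    """Sum of f(t_j) * dt(v_j) over a uniform partition of [a, b].

    Here v_j = t_{j+1} - t_j, so dt(v_j) is the subinterval length.
    """
    dt = (b - a) / n
    return sum(f(a + j * dt) * dt for j in range(n))

f = lambda t: 2 * t * math.cos(t**2)
exact = math.sin(math.pi**2) - math.sin(0.0)

for n in (10, 100, 1000, 100000):
    print(n, riemann_sum(f, 0.0, math.pi, n))

approx = riemann_sum(f, 0.0, math.pi, 100000)
print("exact:", exact)
```

The printed sums should approach the exact value as n grows, which is the limiting process used in the definition.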
With the formulation given by (4.1), the significance of the dt in the integral is justified because this is how the base of the rectangles in the Riemann sum is computed. As the limit is taken, the $v_j$'s "vanish", but the dt remains. Thus, the integration of 1-forms is well-defined and blends nicely with previously known integration of functions from elementary calculus. In particular, the properties of integration of 1-forms are identical.
Proposition 4.1.1. The integral of a 1-form $\omega = f(t)\,dt$ satisfies the properties below.
(a) If $C = [a, b]$, $C_1 = [a, c]$ and $C_2 = [c, b]$ then
$$\int_C \omega = \int_{C_1} \omega + \int_{C_2} \omega.$$
(b) Let $-C$ be the curve C travelled in the opposite orientation. Then,
$$\int_{-C} \omega = -\int_C \omega.$$
(c) If $a, b \in \mathbb{R}$ and $\omega_i$, $i = 1, 2$ are 1-forms, then
$$\int_C (a\omega_1 + b\omega_2) = a\int_C \omega_1 + b\int_C \omega_2.$$
Proof. These follow from the same properties for the Riemann integral.
The next example not only shows how to use the substitution rule in the context of 1-forms, but it also introduces a new operation called the "pullback", which is how changes of variables are applied to 1-forms (and 2-forms, 3-forms, etc., which are defined in subsequent chapters).
Example 4.1.2 (Substitution rule). Consider the 1-form
$$\omega(t) = 2t\cos(t^2)\,dt$$
over the positively oriented curve $C = [0, \pi]$. We know from elementary calculus that the integral of ω can be done using the substitution rule as follows. Let $s = t^2$, then $ds = 2t\,dt$ and the s-variable takes values in $[0, \pi^2]$, so
$$\int_C \omega = \int_0^{\pi} 2t\cos(t^2)\,dt = \int_0^{\pi^2} \cos s\,ds.$$
Thus, we see that the substitution rule gives rise to a new 1-form
$$\tilde{\omega}(s) = \cos(s)\,ds$$
defined on the curve $\tilde{C} = [0, \pi^2]$ and such that
$$\int_C \omega(t) = \int_{\tilde{C}} \tilde{\omega}(s).$$
Note that we can obtain $\tilde{\omega}(s)$ by using $t = \sqrt{s}$ and $dt = \frac{1}{2}s^{-1/2}\,ds$:
$$\omega = 2t\cos(t^2)\,dt = 2\sqrt{s}\,\cos s\;\tfrac{1}{2}s^{-1/2}\,ds = \cos s\,ds = \tilde{\omega}.$$
Changes of variable are more often expressed in the form of the old variable "t" in terms of a new variable "s", so we favour this last formulation. This process of starting with a 1-form ω and using a change of variables to obtain a new 1-form $\tilde{\omega}$ is what the pullback is about; the exact definition follows.
Definition 4.1.3. Let $\omega(t) = a(t)\,dt$ be a 1-form and $t = g(s)$ where $g : \mathbb{R} \to \mathbb{R}$ is a differentiable function. The pullback of ω by g is also a 1-form, denoted by $g^*\omega$, defined by
$$(g^*\omega)(s) := a(g(s))\,g'(s)\,ds.$$
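The pullback formula can be checked numerically on Example 4.1.2. In the Python sketch below, a 1-form $a(t)\,dt$ is represented only by its coefficient function, and the derivative of g is supplied by hand; both choices, as well as the name `pullback`, are assumptions of this sketch.

```python
import math

def pullback(a, g, dg):
    """Return the coefficient s -> a(g(s)) * g'(s) of the pullback g*omega,
    where omega = a(t) dt and dg is the derivative of g (supplied by hand)."""
    return lambda s: a(g(s)) * dg(s)

# omega(t) = 2t cos(t^2) dt with t = g(s) = sqrt(s), as in Example 4.1.2
a = lambda t: 2 * t * math.cos(t**2)
g = lambda s: math.sqrt(s)
dg = lambda s: 0.5 / math.sqrt(s)

pulled = pullback(a, g, dg)
for s in (0.5, 1.0, 4.0, 9.0):
    print(s, pulled(s), math.cos(s))  # the last two columns should agree
```

The agreement of the two columns reflects the identity $g^*\omega = \cos(s)\,ds$ obtained in the example.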
In Figure 4.2, we see how the reparametrization of the domain between the s and t variables is used to take the 1-form ω(t) and pull back a 1-form in the s-domain.
Example 4.1.4. We compute the pullback of
$$\omega(t) = \frac{t}{\sqrt{1 + t^2}}\,dt$$
by $t = g(s) = \sqrt{s - 1}$. In ω(t), we have that $a(t) = t/\sqrt{1 + t^2}$. Applying the formula for the pullback, we obtain
$$g^*\omega(s) = a(g(s))\,g'(s)\,ds = \frac{\sqrt{s - 1}}{\sqrt{1 + (\sqrt{s - 1})^2}}\,\frac{1}{2\sqrt{s - 1}}\,ds = \frac{1}{2\sqrt{s}}\,ds.$$
Fig. 4.2. Diagram illustrating the relationship between reparametrizations and pullbacks in the case of Example 4.1.2: the reparametrization $t = \sqrt{s}$ relates $r_1(s) = s$, $s \in [0, \pi^2]$, to $r_2(t) = t$, $t \in [0, \pi]$, and with $dt = \frac{1}{2\sqrt{s}}\,ds$ the 1-form $\omega(t) = 2t\cos(t^2)\,dt$ pulls back to $\tilde{\omega}(s) = \cos(s)\,ds$.
4.1.2 Integration of 1-forms
We now show how to integrate the 1-form basis elements over any curve C in $\mathbb{R}^n$. We use the case n = 3 to illustrate the general case. Let $r(t) = (x(t), y(t), z(t))$ with $t \in [a, b]$ define a curve C and consider a point $p = r(t_0)$ on C. If $v \in T_pC$, we can write for some $s \in \mathbb{R}$:
$$v = (s\,x'(t_0), s\,y'(t_0), s\,z'(t_0)).$$
Then, setting $s = dt(s)$ we have
$$(dx)_p(v) = x'(t_0)\,dt(s), \quad (dy)_p(v) = y'(t_0)\,dt(s), \quad (dz)_p(v) = z'(t_0)\,dt(s). \qquad (4.2)$$
Consider the case dx and note that for some small s > 0
$$x(t_0 + s) - x(t_0) = \frac{x(t_0 + s) - x(t_0)}{s}\,s = \frac{x(t_0 + s) - x(t_0)}{s}\,dt(s) \approx x'(t_0)\,dt(s) = dx(v). \qquad (4.3)$$
Therefore, dx(v) approximates the small increment in the x-direction projected from the tangent vector $v \in T_pC$. We now define
$$\int_C dx$$
using the following process.
(1) Partition $[a, b] \subset \mathbb{R}$: $a = t_0 < t_1 < \ldots < t_{n-1} < t_n = b$.
(2) Let $v_j = (t_{j+1} - t_j)\,e_1 \in T_{t_j}\mathbb{R}$, then $dt(v_j) = t_{j+1} - t_j$ and $r'(t_j)\,dt(v_j) \in T_{r(t_j)}C$. Recall that $r'(t_j)\,dt(v_j)$ is the best linear approximation of C near $r(t_j)$.
(3) We take the sum of dx evaluated at the tangent vectors
$$r'(t_j)\,dt(v_j) = (x'(t_j), y'(t_j), z'(t_j))\,dt(v_j)$$
located at the points $r(t_j)$:
$$R_n = \sum_{j=0}^{n-1} dx(r'(t_j)\,dt(v_j)) \approx \sum_{j=0}^{n-1} \left( x(t_{j+1}) - x(t_j) \right) = x(b) - x(a)$$
where the approximation is given by (4.3). But,
$$R_n = \sum_{j=0}^{n-1} x'(t_j)\,dt(v_j)$$
where the right-hand side is a Riemann sum in $\mathbb{R}$.
(4) Therefore,
$$\int_C dx := \lim_{n \to \infty} R_n = \int_a^b x'(t)\,dt \qquad (4.4)$$
and the limit is taken by adding points to the partition so that every interval is subdivided. Another way of writing (4.4) is using the pullback operation on dx:
$$\int_C dx = \int_a^b r^*dx.$$
The formula in terms of the pullback is the one which we use for general 1-forms. We have the formulae
$$\int_C dx := \int_a^b x'(t)\,dt = x(b) - x(a), \quad \int_C dy := \int_a^b y'(t)\,dt = y(b) - y(a), \quad \int_C dz := \int_a^b z'(t)\,dt = z(b) - z(a). \qquad (4.5)$$
We see that those are respectively the variation of C along the x, y and z directions. Another way of seeing (4.5) is in terms of displacement on C along the x, y and z directions as an object travels on C from r(a) to r(b).
Example 4.1.5. Consider a smooth curve C with endpoints at $r(a) = (x(a), y(a)) = (1, 0)$
and r(b) = (x(b), y(b)) = (1, 4).
Fig. 4.3. Smooth curve C from Example 4.1.5.
The displacement on C along the x-direction is 0 and the displacement on C along the y-direction is 4. These can be verified using formulae (4.5):
$$\int_C dx = x(b) - x(a) = 0, \qquad \int_C dy = y(b) - y(a) = 4.$$
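The displacement formulae can be illustrated by building the Riemann sums of $x'(t_j)\,dt(v_j)$ and $y'(t_j)\,dt(v_j)$ directly. The curve in the example is not given explicitly, so the Python sketch below assumes a hypothetical parametrization $r(t) = (1 + \sin(\pi t), 4t)$, $t \in [0, 1]$, which joins (1, 0) to (1, 4); any other smooth curve with these endpoints should give the same two values.

```python
import math

def integrate_component(deriv, a, b, n=100000):
    """Riemann sum of deriv(t_j) * dt(v_j) over a uniform partition of [a, b]."""
    dt = (b - a) / n
    return sum(deriv(a + j * dt) * dt for j in range(n))

# Hypothetical curve r(t) = (1 + sin(pi t), 4 t), t in [0, 1]
x_prime = lambda t: math.pi * math.cos(math.pi * t)
y_prime = lambda t: 4.0

int_dx = integrate_component(x_prime, 0.0, 1.0)
int_dy = integrate_component(y_prime, 0.0, 1.0)
print(int_dx, int_dy)  # approximately x(b) - x(a) and y(b) - y(a)
```

Although the curve wanders away from the line x = 1, the contributions to the x-displacement cancel, as (4.5) predicts.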
We now generalize the pullback to a general 1-form in $\mathbb{R}^n$.
Definition 4.1.6. Let ω be a 1-form on $\mathbb{R}^n$ given by
$$\omega(x) = \sum_{i=1}^{n} a_i(x)\,dx_i$$
and let C be a curve with parametrization $r(t) = (g_1(t), \ldots, g_n(t))$. The pullback of ω along C is the 1-form on $\mathbb{R}$ given by
$$(r^*\omega)(t) = \sum_{i=1}^{n} a_i(r(t))\,g_i'(t)\,dt.$$
Example 4.1.7. Consider the context of Example 3.3.5 with
$$dW = x\,dx + (-3)\,dy.$$
On the path from $P = (3, 2)$ to $Q = (3, -1)$ given by $r_1(t) = (x(t), y(t)) = (3, 2 - t)$, we compute the pullback of $\omega := dW$ using the three-step method outlined below:
(1) Write the formula: for n = 2, the general formula is
$$r^*\omega = (a_1(r(t))\,x'(t) + a_2(r(t))\,y'(t))\,dt.$$
(2) Identify the pieces: from the formula for ω the coefficients are
$$a_1(x, y) = x \quad \text{and} \quad a_2(x, y) = -3,$$
so on $r_1(t)$ we have
$$a_1(r_1(t)) = a_1(3, 2 - t) = 3 \quad \text{and} \quad a_2(r_1(t)) = a_2(3, 2 - t) = -3.$$
The derivatives are $x_1'(t) = 0$ and $y_1'(t) = -1$. For the path $r_2(t) = (3 - 5t, -1 - 2t)$ from Q to $R = (-2, -3)$, we have
$$a_1(r_2(t)) = a_1(3 - 5t, -1 - 2t) = 3 - 5t \quad \text{and} \quad a_2(3 - 5t, -1 - 2t) = -3,$$
with $x_2'(t) = -5$ and $y_2'(t) = -2$.
(3) Assemble the pieces: we can now insert the computations of the second step into the formula:
$$(r_1^*dW)(t) = r_1^*\omega = (3(0) + (-3)(-1))\,dt = 3\,dt,$$
$$(r_2^*dW)(t) = r_2^*\omega = ((3 - 5t)(-5) + (-3)(-2))\,dt = (25t - 9)\,dt.$$
We notice that the pullback is an operator
$$r^* : \Omega^n \to \Omega^1.$$
That is, it takes a 1-form in $\mathbb{R}^n$ and gives a 1-form in $\mathbb{R}$. From the previous section, we know that 1-forms in $\mathbb{R}$ can be integrated in a straightforward way because they correspond to Riemann integrals.
Geometric construction of the integral
We show the geometric construction of the integral of 1-forms in the case $\mathbb{R}^2$ as illustrated in Figure 4.4. Consider a curve C given by r(t) with $t \in [a, b]$ and $\omega = a_1(x, y)\,dx + a_2(x, y)\,dy$.
Fig. 4.4. Geometric construction of the integral of a 1-form: the partition $a = t_0 < t_1 < \cdots < t_n = b$ with vectors $v_j$ is mapped by r(t) to points $p_j = r(t_j)$ on C carrying the tangent vectors $r'(t_j)\,dt(v_j)$.
(1) Partition $[a, b] \subset \mathbb{R}$: $a = t_0 < t_1 < \ldots < t_{n-1} < t_n = b$.
(2) Let $v_j = (t_{j+1} - t_j)\,e_1 \in T_{t_j}\mathbb{R}$, then $dt(v_j) = t_{j+1} - t_j$ and $r'(t_j)\,dt(v_j) \in T_{r(t_j)}C$. Recall that $r'(t_j)\,dt(v_j)$ is the best linear approximation of C near $r(t_j)$.
(3) We take the following sum at the points $r(t_j)$, evaluated at $r'(t_j)\,dt(v_j) = (x'(t_j), y'(t_j))\,dt(v_j)$:
$$R_n = \sum_{j=0}^{n-1} \Big( a_1(r(t_j))\,dx(r'(t_j)\,dt(v_j)) + a_2(r(t_j))\,dy(r'(t_j)\,dt(v_j)) \Big).$$
This sum can be seen as the contributions of areas of rectangles corresponding to heights $a_1$ and $a_2$ and bases dx and dy along the x and y coordinate directions, as seen in Figure 4.5. In the term $a_1(r(t_j))\,dx(r'(t_j)\,dt(v_j))$, the factor $a_1(r(t_j))$ is the height at $r(t_j)$ and $dx(r'(t_j)\,dt(v_j))$ is the base in the x-direction; in the term $a_2(r(t_j))\,dy(r'(t_j)\,dt(v_j))$, the factor $a_2(r(t_j))$ is the height at $r(t_j)$ and $dy(r'(t_j)\,dt(v_j))$ is the base in the y-direction.
Fig. 4.5. Contribution of areas of rectangles in the x and y directions.
Therefore, we can think of $R_n$ as the addition of the Riemann sums along each direction. We define
$$\int_C \omega := \lim_{n \to \infty} R_n$$
where the limit is taken by adding points to the partition so that every interval is always subdivided.
(4) Evaluating dx and dy we obtain
$$R_n = \sum_{j=0}^{n-1} \big( a_1(r(t_j))\,x'(t_j) + a_2(r(t_j))\,y'(t_j) \big)\,dt(v_j)$$
and so
$$\lim_{n \to \infty} R_n = \lim_{n \to \infty} \sum_{j=0}^{n-1} \big( a_1(r(t_j))\,x'(t_j) + a_2(r(t_j))\,y'(t_j) \big)\,dt(v_j) = \int_a^b \big( a_1(r(t))\,x'(t) + a_2(r(t))\,y'(t) \big)\,dt = \int_a^b r^*\omega.$$
(5) Therefore,
$$\int_C \omega = \int_a^b r^*\omega.$$
Using the above geometric construction for n = 2, we now state the general definition.
Definition 4.1.8. Let $\omega \in \Omega^n(U)$ be a 1-form defined on $U \subset \mathbb{R}^n$ and let C be a curve given by the vector function r(t) with $t \in [a, b]$. Then, the integral of ω along C is given by
$$\int_C \omega := \int_a^b r^*\omega.$$
In coordinates, this means that if
$$\omega(x) = \sum_{i=1}^{n} a_i(x)\,dx_i$$
and $r(t) = (x_1(t), \ldots, x_n(t))$ then
$$\int_C \omega = \int_a^b \sum_{i=1}^{n} a_i(r(t))\,x_i'(t)\,dt.$$
Remark 4.1.9. In the context of vector fields (which come up in an upcoming section), the expression line integral along C is preferred, but it can also be used in the context of 1-forms.
We now look at several examples.
Example 4.1.10. Set up the integral of $\omega = y\,dx + 3x\,dy$ over C given by $r(t) = (t, \sin(2\pi t))$ with $t \in [1, 2]$. We first write the pullback of ω:
(1) Formula: the general formula for the 1-form with n = 2 is
$$r^*\omega = (a_1(r(t))\,x'(t) + a_2(r(t))\,y'(t))\,dt.$$
(2) Identify the pieces: $a_1(x, y) = y$ and $a_2(x, y) = 3x$. Then on r(t) we have
$$a_1(r(t)) = \sin(2\pi t) \quad \text{and} \quad a_2(r(t)) = 3t.$$
Also, $x'(t) = 1$ and $y'(t) = 2\pi\cos(2\pi t)$.
(3) Assemble the pieces in the formula:
$$(r^*\omega)(t) = \sin(2\pi t)\,dt + (3t)\,2\pi\cos(2\pi t)\,dt = (\sin(2\pi t) + 6\pi t\cos(2\pi t))\,dt.$$
We can now set up the integral
$$\int_C \omega = \int_1^2 r^*\omega = \int_1^2 (\sin(2\pi t) + 6\pi t\cos(2\pi t))\,dt.$$
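The three-step method translates directly into a short numerical routine. The Python sketch below represents ω by its list of coefficient functions and the curve by r(t) together with a hand-computed derivative; it then approximates the integral of the pullback with the midpoint rule. The function name `integrate_one_form` and the choice of the midpoint rule are assumptions of this sketch, not part of the text.

```python
import math

def integrate_one_form(coeffs, r, dr, a, b, n=20000):
    """Midpoint-rule approximation of the integral over [a, b] of
    sum_i a_i(r(t)) * x_i'(t) dt, i.e. of the pullback r*omega."""
    h = (b - a) / n
    total = 0.0
    for j in range(n):
        t = a + (j + 0.5) * h
        point, velocity = r(t), dr(t)
        total += sum(c(*point) * v_i for c, v_i in zip(coeffs, velocity)) * h
    return total

# omega = y dx + 3x dy along r(t) = (t, sin(2 pi t)), t in [1, 2]
omega = [lambda x, y: y, lambda x, y: 3 * x]
r = lambda t: (t, math.sin(2 * math.pi * t))
dr = lambda t: (1.0, 2 * math.pi * math.cos(2 * math.pi * t))

value = integrate_one_form(omega, r, dr, 1.0, 2.0)
print(value)
```

For this particular example an integration by parts shows that the exact value is 0, so the printed number should be very close to zero.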
Example 4.1.11. We set up the integral of $\omega = (z - y)\,dx + x^2\,dy - zy\,dz$ over C given by $r(t) = (t, 2t, -t)$ with $t \in [0, 1]$.
(1) Write the formula of the pullback:
$$r^*\omega(t) = (a_1(r(t))\,x'(t) + a_2(r(t))\,y'(t) + a_3(r(t))\,z'(t))\,dt.$$
(2) Identify the pieces: we have $a_1(x, y, z) = z - y$, $a_2(x, y, z) = x^2$, and $a_3(x, y, z) = -zy$. Now $r(t) = (x(t), y(t), z(t)) = (t, 2t, -t)$, which implies $r'(t) = (x'(t), y'(t), z'(t)) = (1, 2, -1)$. Therefore,
$$a_1(r(t)) = -3t, \quad a_2(r(t)) = t^2, \quad a_3(r(t)) = 2t^2.$$
(3) Assemble the pieces:
$$r^*\omega(t) = ((-3t)(1) + t^2(2) + 2t^2(-1))\,dt = -3t\,dt.$$
Therefore,
$$\int_C \omega = \int_0^1 (-3t)\,dt.$$
This integral is easily computable and we leave the details to the reader. We can now state the general properties of integrals of 1-forms.
Proposition 4.1.12. The integral of a 1-form $\omega \in \Omega^n$ satisfies the following properties.
(a) If a curve C is the (disjoint) union of $C_1$ and $C_2$ then
$$\int_C \omega = \int_{C_1} \omega + \int_{C_2} \omega.$$
(b) Let $-C$ be the curve C travelled in the opposite orientation. Then,
$$\int_{-C} \omega = -\int_C \omega.$$
(c) If $a, b \in \mathbb{R}$ and $\omega_i$, $i = 1, 2$ are 1-forms, then
$$\int_C (a\omega_1 + b\omega_2) = a\int_C \omega_1 + b\int_C \omega_2.$$
Proof. (a) We suppose that C is given by r(t) with $t \in [a, b]$, with $C_1$ obtained by restricting $t \in [a, c]$ and $C_2$ by restricting $t \in [c, b]$. We can decompose the integral as follows
$$\int_C \omega = \int_a^b r^*\omega = \int_a^c r^*\omega + \int_c^b r^*\omega$$
because the integral $\int_a^b r^*\omega$ is the integral of a 1-form in $\mathbb{R}$. The two integrals on the right-hand side correspond to
$$\int_{C_1} \omega \quad \text{and} \quad \int_{C_2} \omega.$$
(b) Let C be given by $r(t) = (x_1(t), \ldots, x_n(t))$ with $t \in [a, b]$. Then $-C$ is given by $\tilde{r}(t) = r(a + b - t)$ with $t \in [a, b]$. Then, $\tilde{r}(a) = r(b)$, $\tilde{r}(b) = r(a)$ and $\tilde{r}'(t) = -r'(a + b - t)$. We compute
$$\int_{-C} \omega = \int_a^b \sum_{i=1}^{n} a_i(\tilde{r}(t))\,(-x_i'(a + b - t))\,dt = \int_a^b \sum_{i=1}^{n} a_i(r(a + b - t))\,(-x_i'(a + b - t))\,dt.$$
Setting $u = a + b - t$, so that $du = -dt$, this becomes
$$\int_b^a \sum_{i=1}^{n} a_i(r(u))\,x_i'(u)\,du = -\int_a^b \sum_{i=1}^{n} a_i(r(u))\,x_i'(u)\,du = -\int_C \omega.$$
(c) This property is straightforward to check and left as an exercise.
We illustrate the use of part (a) of Proposition 4.1.12.
Example 4.1.13. Find the total work done in Example 3.3.5 on the given path C from P to R. Because the path is piecewise smooth, the integral is the sum of the integrals on both smooth pieces $C_1$ and $C_2$:
$$\int_C dW = \int_{C_1} dW + \int_{C_2} dW = \int_0^3 r_1^*dW + \int_0^1 r_2^*dW = \int_0^3 3\,dt + \int_0^1 (25t - 9)\,dt = 9 + \left[ \frac{25}{2}t^2 - 9t \right]_0^1 = 9 + \frac{25}{2} - 9 = \frac{25}{2}.$$
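The total work can be cross-checked numerically by integrating the pullback of dW over each smooth piece and adding the results, as part (a) of the proposition allows. In this Python sketch the helper names `work_integrand` and `midpoint` are hypothetical, and the midpoint rule is used only because it is exact (up to rounding) for the linear integrands that occur here.

```python
def work_integrand(r, dr):
    """Coefficient t -> F(r(t)) . r'(t) of the pullback r* dW for F(x, y) = (x, -3)."""
    def integrand(t):
        x, y = r(t)
        vx, vy = dr(t)
        return x * vx + (-3.0) * vy
    return integrand

def midpoint(f, a, b, n=1000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / n
    return sum(f(a + (j + 0.5) * h) for j in range(n)) * h

r1, dr1 = (lambda t: (3.0, 2.0 - t)), (lambda t: (0.0, -1.0))                # P to Q
r2, dr2 = (lambda t: (3.0 - 5.0 * t, -1.0 - 2.0 * t)), (lambda t: (-5.0, -2.0))  # Q to R

work_C1 = midpoint(work_integrand(r1, dr1), 0.0, 3.0)
work_C2 = midpoint(work_integrand(r2, dr2), 0.0, 1.0)
total = work_C1 + work_C2
print(work_C1, work_C2, total)
```

The three printed numbers should match 9, 7/2 and 25/2 from the computation above.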
Exercises
(1) Set up the integral of ω = 2xy dx + x 2 dy over the curve C given by r(t) = (1 + t, 3t2 ) with t ∈ [0, 1]. (2) Set up the integral of ω = (z − y) dx + x2 dy − zydz over the curve C given by r(t) = (t, 2t, −t) with t ∈ [0, 1].
(3) Set up $\int_C \omega$ where $\omega = e^{xy}\,dx + xe^{y}\,dy$ over the piecewise smooth curve C consisting of $C_1$, the piece of parabola $y = x^2$ from (0, 0) to (1, 1), followed by the line segment $C_2$ from (1, 1) to (2, 0), and finally the line $C_3$ from (2, 0) to (0, 0).
(4) Compute the integral of $\omega = y\sin(z)\,dx + z\sin(x)\,dy + x\sin(y)\,dz$ over the curve C given by $r(t) = (\cos(t), \sin(t), \sin(5t))$ with $t \in [0, \pi]$.
(5) Set up the integral for the work done by the force $F(x, y, z) = (xy^2, xy + yz, -3z^3 + y^2)$ over the piecewise smooth path C given by the piece of helix $C_1$ parametrized by $r(t) = (\cos(t), \sin(t), t)$ with $t \in [0, \pi]$ and $C_2$ the line segment from $(-1, 0, \pi)$ to $(0, -1, 0)$.
(6) Consider a cyclist of mass 1 on a road up a mountain where the path C is given by $r(t) = (\sqrt{2 - t}\cos(2\pi t), \sqrt{2 - t}\sin(2\pi t), t)$ with $t \in [0, 2]$. (a) Verify that the path satisfies the equation of the elliptic paraboloid $z = 2 - (x^2 + y^2)$. (b) Determine the work needed against gravity ($a = (0, 0, -9.8)$) to climb up the path C. (c) Suppose there is a wind with force $F(x, y, z) = (-3z - 1, 0, 0)$; compute the work done by the cyclist against the wind.
(7) Show part (c) of Proposition 4.1.12.
4.2 Arc-length, Metrics and Applications
We show how to use the ideas from 1-forms to compute important quantities related to curves such as arc-length and curvature. We also introduce integration over the length of a curve and use it to look at the computation of mass and centre of mass for curve-shaped objects. Finally, we make the link between 1-forms and vector fields and between the integration of 1-forms and the Line Integrals of Vector Fields.
4.2.1 Arc-length
Suppose one wants to know the length of a piece of rope lying on the floor. It can just be picked up, straightened and measured using a measuring tape. The length of the rope is independent of its shape on the floor. We exploit this idea to give an intrinsic definition of arc-length by finding a mathematical way of "straightening" a curve C on top of a coordinate axis to obtain the arc-length. As a first step, the parametrization should not travel along the curve several times. Consider the following example.
Example 4.2.1. Let C be the circle given by $r(t) = (\cos(t), \sin(t))$ with $t \in [0, 4\pi]$. The section of curve for $t \in [2\pi, 4\pi]$ is a second winding around the circle. Thus for $t_1 \in [0, 2\pi]$ and $t_2 = t_1 + 2\pi$, this means two different elements in the domain, $t_1 \neq t_2$, are sent to the same point in $\mathbb{R}^2$: $r(t_1) = r(t_2)$. Computing the length of the circle over the whole domain would give us twice the length. See Figure 4.6.
Fig. 4.6. Parametrization of the circle in Example 4.2.1, with $r(t_1) = r(t_1 + 2\pi)$.
This example leads us to consider parametrizations that are one-to-one, also called injective. We recall the definition in the context of vector functions.
Definition 4.2.2. Let r(t) be a vector function with $t \in [a, b]$. Then, r(t) is one-to-one or injective if for $t_1, t_2 \in [a, b]$,
$$r(t_1) = r(t_2) \implies t_1 = t_2.$$
Another way to write this implication is: if $t_1 \neq t_2$ then $r(t_1) \neq r(t_2)$. Consider the following example.
Example 4.2.3. Let C be a curve given by the parametric representation $r(t) = (t^2, t^4)$ with $t \in [-1, 1]$. We check whether this vector function is injective by setting $r(t_1) = r(t_2)$. The goal is to find out if this equality forces $t_1$ to be equal to $t_2$. If so, then it is injective. But, we can check directly that for $t_1 = -1$ and $t_2 = 1$ we have
$$(t_1^2, t_1^4) = (t_2^2, t_2^4).$$
Thus, it is not injective. In fact, for $t_2 = -t_1$ we have $r(t_1) = r(t_2)$. Restricting the domain of r(t) to $t \in [-1, 0]$ or $t \in [0, 1]$ leads to an injective vector function.
As noted in Remark 2.3.1, the domain of the arc-length parametrization corresponds in the examples at the beginning of Section 6.2 to the length of the curve C as we know from basic geometry. This is true in general as the construction below shows.
Arc-Length: Geometric Construction
We use tangent vectors at points along C to build a Riemann sum for the arc-length. See Figure 4.7 for an illustration of this construction. Let r(s) with $s \in [a, b]$ be the arc-length parametrization of a curve C.
(1) Let $a = s_0 < s_1 < \ldots < s_n = b$ be a partition of [a, b].
(2) $\Delta s_i = s_{i+1} - s_i \in T_{s_i}\mathbb{R}$ for $i = 0, \ldots, n-1$. We know that the line segment $r'(s_i)\,ds(\Delta s_i)$ is the best linear approximation of C near $r(s_i)$ and
$$\|r'(s_i)\,ds(\Delta s_i)\| = ds(\Delta s_i)$$
for all $i = 0, \ldots, n-1$ because $\|r'(s_i)\| = 1$ in the arc-length parametrization.
Fig. 4.7. Tangent vectors at points $r(t_j)$ along C.
We see that the arc-length parametrization preserves the length of tangent vectors from $T_{s_j}\mathbb{R}$ to $T_{r(s_j)}C$. Now, summing those line segments, we obtain an approximation of the length of C which is independent of n, the number of elements in the partition:
$$\sum_{i=0}^{n-1} \|r'(s_i)\,ds(\Delta s_i)\| = \sum_{i=0}^{n-1} ds(\Delta s_i) = b - a \qquad (4.6)$$
and in particular
$$\lim_{n \to \infty} \sum_{i=0}^{n-1} \|r'(s_i)\,ds(\Delta s_i)\| = b - a. \qquad (4.7)$$
But, the approximation of C by the line segments $r'(s_i)\,ds(\Delta s_i)$ improves as $ds(\Delta s_i)$ becomes smaller. Thus, in (4.7), we let $n \to \infty$ in such a way that all intervals $\Delta s_i \to 0$ for $i = 0, \ldots, n-1$.
Definition 4.2.4. Let C be a curve with parametrization given by the vector function r(s) with $s \in [a, b]$ corresponding to the arc-length parametrization. Then, the arc-length of C is given by
$$\ell(C) := \int_a^b ds = b - a.$$
The above calculation shows that using the arc-length parametrization, approximations to the curve by tangent vectors always add-up naturally to the length of the domain of the arc-length parametrization. We are in some sense, straightening out pieces of curve on the tangent vectors. As mentioned in the previous chapter, finding the arc-length parametrization is not possible in many cases and so we must obtain a formula from which the arc-length can be expressed no matter which parametrization is used.
Metrics and arc-length
Consider a vector $v = (v_1, \ldots, v_n) \in T_p\mathbb{R}^n$. We know that the differentials $dx_i$ applied on v yield the projections of v on the coordinate axes $x_i$: $dx_i(v) = v_i$. We use the differentials to define an alternative way of computing the length of vectors in $T_p\mathbb{R}^n$ using the metric operator $ds : T_p\mathbb{R}^n \to \mathbb{R}$ defined by
$$ds := \sqrt{\sum_{i=1}^{n} dx_i^2}.$$
Note that the metric is not a 1-form because it is not linear, but it is built using 1-forms. The notation ds for the metric operator should not be confused with the differential ds as in the previous section. We see indeed that
$$ds(v) = \sqrt{\sum_{i=1}^{n} dx_i(v)^2} = \sqrt{\sum_{i=1}^{n} v_i^2} = \|v\|.$$
Note that the norm can be defined in curvilinear coordinate systems. Consider the differentials dr and dθ from polar coordinates. We perform the change of coordinates from dx, dy to dr, dθ:
$$dx = \cos\theta\,dr - r\sin\theta\,d\theta \quad \text{and} \quad dy = \sin\theta\,dr + r\cos\theta\,d\theta.$$
So,
$$dx^2 + dy^2 = (\cos\theta\,dr - r\sin\theta\,d\theta)^2 + (\sin\theta\,dr + r\cos\theta\,d\theta)^2 = dr^2 + r^2\,d\theta^2$$
and therefore,
$$ds = \sqrt{dr^2 + r^2\,d\theta^2}.$$
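The polar form of the metric can be sanity-checked numerically: take a vector with polar components $(v_r, v_\theta)$ at a point $(r, \theta)$, convert it to Cartesian components with the expressions for dx and dy above, and compare the two lengths. The function names and the sample values in this Python sketch are arbitrary choices made for illustration.

```python
import math

def cartesian_components(r, theta, v_r, v_theta):
    """Apply dx = cos(theta) dr - r sin(theta) dtheta and
    dy = sin(theta) dr + r cos(theta) dtheta to the vector (v_r, v_theta)."""
    vx = math.cos(theta) * v_r - r * math.sin(theta) * v_theta
    vy = math.sin(theta) * v_r + r * math.cos(theta) * v_theta
    return vx, vy

def ds_cartesian(vx, vy):
    """Metric in Cartesian coordinates: sqrt(dx^2 + dy^2) applied to v."""
    return math.sqrt(vx**2 + vy**2)

def ds_polar(r, v_r, v_theta):
    """Metric in polar coordinates: sqrt(dr^2 + r^2 dtheta^2) applied to v."""
    return math.sqrt(v_r**2 + r**2 * v_theta**2)

samples = [(2.0, math.pi / 3, 0.2, 1.3), (1.0, -math.pi / 3, 1.0, -1.0), (0.5, 2.0, -0.7, 0.4)]
for r, theta, v_r, v_theta in samples:
    vx, vy = cartesian_components(r, theta, v_r, v_theta)
    print(ds_cartesian(vx, vy), ds_polar(r, v_r, v_theta))  # the two values should agree
```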
We see in the next result that the metric is the right concept to produce a general formula for computing arc-length. We can compute the arc-length of a curve C by integrating the metric ds. We assume that the curve C is given by an injective parametrization r(t) with $t \in [a, b]$.
Theorem 4.2.5. The arc-length $\ell(C)$ of a curve C in $\mathbb{R}^n$ with injective parametrization r(t), $t \in [c, d]$, is given by
$$\ell(C) = \int_c^d \|r'(t)\|\,dt.$$
Proof. Let $\tilde{r}(s)$ with $s \in [a, b]$ be the arc-length parametrization of C, then $\ell(C) = b - a$. The parametrizations $\tilde{r}(s)$ and r(t) are related by $t = u(s)$ where $u : [a, b] \to [c, d]$ is differentiable, and we suppose, without loss of generality, that $u'(s) > 0$. We know that
$$1 = \|\tilde{r}'(s)\| = \|r'(t)\|\,|u'(s)|.$$
We now use ds to obtain an approximation of the length of C. Let
$$c = t_0 < \ldots < t_n = d$$
be a partition of [c, d] and consider the vectors $r'(t_i)\,dt(\Delta t_i)$ based at $r(t_i)$ where $\Delta t_i = t_{i+1} - t_i \in T_{t_i}\mathbb{R}$. Then,
$$\sum_{i=0}^{n-1} ds(r'(t_i)\,dt(\Delta t_i)) = \sum_{i=0}^{n-1} \sqrt{x_1'(t_i)^2 + \cdots + x_n'(t_i)^2}\;dt(\Delta t_i) = \sum_{i=0}^{n-1} \frac{1}{|u'(s_i)|}\,dt(\Delta t_i).$$
But, $dt(\Delta t_i) = u(s_{i+1}) - u(s_i) \approx u'(s_i)\,ds(\Delta s_i)$. Thus,
$$\sum_{i=0}^{n-1} \frac{1}{|u'(s_i)|}\,dt(\Delta t_i) \approx \sum_{i=0}^{n-1} \frac{1}{|u'(s_i)|}\,u'(s_i)\,ds(\Delta s_i) = b - a.$$
Letting $n \to \infty$ one has
$$\int_c^d \|r'(t)\|\,dt = b - a = \ell(C)$$
and this completes the proof.
The following result is immediate from the proof of the above theorem.
Corollary 4.2.6. Let C be a curve. The arc-length is independent of the vector function used to describe C. That is, if $r_1(t)$ with $t \in [a_1, b_1]$ and $r_2(t)$ with $t \in [a_2, b_2]$ are two different parametrizations of C, then
$$\int_{a_1}^{b_1} \|r_1'(t)\|\,dt = \int_{a_2}^{b_2} \|r_2'(t)\|\,dt.$$
Proof. Any parametrization can be reparametrized to the arc-length parametrization and so the formula is the same, with the bounds of integration depending on the parametrization.
Remark 4.2.7. (1) A more widespread notation for arc-length is
$$\int_C ds := \ell(C).$$
(2) Note that the integral for computing arc-length is the same as the one we need to compute the arc-length parametrization.
(3) Reversing the orientation leaves $\int_C ds$ invariant because
$$ds(r'(t_i)\,dt(\Delta t_i)) = ds(-r'(t_i)\,dt(\Delta t_i)).$$
Example 4.2.8. We compute the arc-length of $C$ given by $r(t) = (t\cos(2\pi t), t\sin(2\pi t))$ with $t \in [0, 2]$. The derivative is
\[ r'(t) = (\cos(2\pi t) - 2\pi t\sin(2\pi t),\ \sin(2\pi t) + 2\pi t\cos(2\pi t)) \]
and $\|r'(t)\| = \sqrt{1 + 4\pi^2 t^2}$. Therefore,
\[ \int_C ds = \int_0^2 \sqrt{1 + 4\pi^2 t^2}\,dt = \frac{1}{2\pi}\left[\pi t\sqrt{1 + 4\pi^2 t^2} + \frac{1}{2}\ln\!\left(2\pi t + \sqrt{1 + 4\pi^2 t^2}\right)\right]_0^2 = \frac{1}{2\pi}\left(2\pi\sqrt{1 + 16\pi^2} + \frac{1}{2}\ln\!\left(4\pi + \sqrt{1 + 16\pi^2}\right)\right). \]

Fig. 4.8. Curve $C$ of Example 4.2.8, $r(t) = (t\cos(2\pi t), t\sin(2\pi t))$.
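As a sanity check on Example 4.2.8, the arc-length integral can be approximated numerically and compared with the closed form. The following is a minimal sketch using only the Python standard library; the helper `simpson` (a composite Simpson rule) and the name `speed` are illustrative choices and are not part of the text.

```python
import math

def simpson(g, a, b, n=2000):
    """Composite Simpson rule for the integral of g over [a, b]; n must be even."""
    h = (b - a) / n
    total = g(a) + g(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * g(a + k * h)
    return total * h / 3

def speed(t):
    """||r'(t)|| for r(t) = (t cos(2 pi t), t sin(2 pi t))."""
    w = 2 * math.pi
    dx = math.cos(w * t) - w * t * math.sin(w * t)
    dy = math.sin(w * t) + w * t * math.cos(w * t)
    return math.hypot(dx, dy)

# Numerical value of the arc-length integral over [0, 2].
numeric = simpson(speed, 0.0, 2.0)

# Closed form obtained from the antiderivative of sqrt(1 + 4 pi^2 t^2).
root = math.sqrt(1 + 16 * math.pi ** 2)
closed_form = (2 * math.pi * root + 0.5 * math.log(4 * math.pi + root)) / (2 * math.pi)

print(numeric, closed_form)
```

The two printed numbers should agree to many digits, which confirms both the speed $\sqrt{1 + 4\pi^2 t^2}$ and the antiderivative.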
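Example 4.2.9. Set up the integral for the arc-length of $C$ given by $r(t) = (e^t, te^t, t^2e^t)$ with $t \in [0, 2]$. The derivative is $r'(t) = (e^t,\ e^t + te^t,\ 2te^t + t^2e^t)$ and so
\[ \|r'(t)\|^2 = e^{2t}\big(1 + (1 + t)^2 + (2t + t^2)^2\big) = e^{2t}\big(2 + 2t + 5t^2 + 4t^3 + t^4\big), \]
that is, $\|r'(t)\| = e^t\sqrt{2 + 2t + 5t^2 + 4t^3 + t^4}$. Then
\[ \int_C ds = \int_0^2 e^t\sqrt{2 + 2t + 5t^2 + 4t^3 + t^4}\,dt. \]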
More Metrics
We begin with some examples of metrics in three dimensions, the simplest one being obtained for cylindrical coordinates, which is just a direct extension of the polar coordinate case. We have
\[ ds = \sqrt{dr^2 + r^2\,d\theta^2 + dz^2}. \]
The metric in spherical coordinates requires more work to derive. We begin by writing $dx$, $dy$, $dz$ as functions of $d\rho$, $d\phi$ and $d\theta$. Recall
\[ x = \rho\cos\theta\sin\phi, \qquad y = \rho\sin\theta\sin\phi, \qquad z = \rho\cos\phi. \]
Then,
\[ dx = \cos\theta\sin\phi\,d\rho - \rho\sin\theta\sin\phi\,d\theta + \rho\cos\theta\cos\phi\,d\phi, \]
\[ dy = \sin\theta\sin\phi\,d\rho + \rho\cos\theta\sin\phi\,d\theta + \rho\sin\theta\cos\phi\,d\phi, \]
\[ dz = \cos\phi\,d\rho - \rho\sin\phi\,d\phi, \]
and a straightforward, but lengthy computation shows
\[ ds^2 = dx^2 + dy^2 + dz^2 = d\rho^2 + \rho^2\sin^2\phi\,d\theta^2 + \rho^2\,d\phi^2. \]
This metric is useful for measuring curves lying on a sphere or following a spherical trajectory, possibly of non-constant radius.

Example 4.2.10. We set up the integral for the length of the curve $C$ given by the vector function
\[ r(t) = (t\cos t\sin t,\ t\sin^2 t,\ t\cos t) \]
with $t \in [0, \pi/2]$. We write the curve in spherical coordinates:
\[ \rho = \sqrt{t^2\cos^2 t\sin^2 t + t^2\sin^4 t + t^2\cos^2 t} = t, \]
with $\theta = \arctan(y(t)/x(t))$ and $\cos\phi = z(t)/\rho(t)$. Because $t \in [0, \pi/2]$, we obtain
\[ \theta = t, \qquad \phi = t. \]
Therefore, $d\rho^2 = dt^2$, $d\theta^2 = dt^2$ and $d\phi^2 = dt^2$. Thus,
\[ \int_C ds = \int_0^{\pi/2} \sqrt{1 + t^2\sin^2 t + t^2}\,dt. \]
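The spherical-metric integrand of Example 4.2.10 can be checked against the Cartesian speed $\|r'(t)\|$. The sketch below is a hedged illustration using only the standard library: the derivative is approximated by a central difference (step `h` is an arbitrary choice), and `simpson` is a helper introduced here, not something defined in the text.

```python
import math

def simpson(g, a, b, n=1000):
    """Composite Simpson rule for the integral of g over [a, b]; n must be even."""
    h = (b - a) / n
    total = g(a) + g(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * g(a + k * h)
    return total * h / 3

def r(t):
    """The curve of Example 4.2.10 in Cartesian coordinates."""
    return (t * math.cos(t) * math.sin(t), t * math.sin(t) ** 2, t * math.cos(t))

def speed_cartesian(t, h=1e-6):
    """||r'(t)|| approximated with a central difference."""
    p, q = r(t + h), r(t - h)
    return math.sqrt(sum(((u - v) / (2 * h)) ** 2 for u, v in zip(p, q)))

def speed_spherical(t):
    """Integrand from ds^2 = d rho^2 + rho^2 sin^2(phi) d theta^2 + rho^2 d phi^2 with rho = theta = phi = t."""
    return math.sqrt(1 + t ** 2 * math.sin(t) ** 2 + t ** 2)

length_cartesian = simpson(speed_cartesian, 0.0, math.pi / 2)
length_spherical = simpson(speed_spherical, 0.0, math.pi / 2)
print(length_cartesian, length_spherical)
```

Both lengths should coincide up to the finite-difference error, illustrating that the metric gives the same arc-length in either coordinate system.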
Fig. 4.9. The curve given by $r(t) = (t\cos t\sin t,\ t\sin^2 t,\ t\cos t)$ in Example 4.2.10.
Exercises
(1) For the following curves $C$, determine if the parametrization is injective. If not, restrict the domain of $t$ to make it injective.
    (a) Consider the curve $C$ given by $r(t) = (a\cos(2t), b\sin t)$ with $t \in [-\pi/2, \pi/2]$.
    (b) Consider the curve $C$ given by $r(t) = (t^2, \sin(4\pi t^2))$ with $t \in [-1, 1]$.
(2) Find the length of the curve $C$ given by $r(t) = (t^2/2, t^3/3)$ with $t \in [0, 1]$. Compare your answer with the domain obtained in Example 2.3.5.
(3) Find the length of the Cycloid curve $C$ given by $r(t) = (r(t - \sin t), r(1 - \cos t))$ with $t \in [0, 2\pi]$. Use computer software to plot $C$.
(4) Find the length of the Cardioid curve $C$ given by $r(t) = (a(2\cos t - \cos(2t)), a(2\sin t - \sin(2t)))$ with $t \in [0, 2\pi]$. Use computer software to plot $C$.
(5) Find the length of the Deltoid curve $C$ given by $r(t) = (2a\cos t + a\cos(2t),\ 2a\sin t - a\sin(2t))$ with $t \in [0, 2\pi]$. Use computer software to plot $C$.
(6) Show that for a vector function $r(t) = (t, f(t))$, $ds = \sqrt{1 + f'(t)^2}\,dt$.
(7) Find the length of the helix curve $C$ given by $r(t) = (a\cos(2\pi t), a\sin(2\pi t), t)$ with $t \in [0, 2]$. Use the metric $ds$ in cylindrical coordinates.
(8) Find the length of the curve $C$ given by $r(t) = (t\cos t, t\sin t, t)$ with $t \in [0, 2\pi]$. Is a given metric preferable here?
(9) Find the length of the curve $C$ given by $r(t) = (e^{2t}\cos t, 2, e^{2t}\sin t)$ with $t \in [0, 2\pi]$.
4.2.2 Integral of functions over curves: including applications
Consider the situations shown in Figure 4.10:
Fig. 4.10. Fences of constant height $h$ (left) and variable height $h(p)$ (right) over a curve $C$.
(1) Suppose that a fence of constant height $h > 0$ is built on top of a curve $C$. The intuition, which is correct, is that the area of the fence is given by the length of the curve $C$ times the height of the fence:
\[ h\int_C ds. \]
If the height depends on the location along the fence, $h = h(p)$ where $p \in C$, then by analogy with the area under a curve seen in elementary calculus, we expect the area to be given by an expression such as
\[ \int_C h(p)\,ds. \tag{4.8} \]
(2) A very thin wire (of uniform radial cross-section) is made of a single material and is arranged in the shape given by a curve $C$. It is possible that the material is unevenly distributed along the wire, so the density of material in a small piece $\Delta s$ of the curve varies along $C$. This means that for two different points $p, q \in C$, the mass in equal neighborhoods near $p$ and $q$ could be different. The mass near $p$ and $q$ is approximated by $\rho(p)\,ds(v)$ and $\rho(q)\,ds(w)$, where $\rho$ is the mass density (kg/m) evaluated on $C$ and $ds$ (m) is a small distance near the points $p$ and $q$, with $v \in T_pC$ and $w \in T_qC$. Thus, the mass of the wire should be given by
\[ \int_C \rho(p)\,ds. \tag{4.9} \]
We want to make sense of the expressions (4.8) and (4.9). Let $r(t)$ be a parametrization of a curve $C$ and let $f(x)$ be a function defined on $C$, where $x = (x_1, \ldots, x_n)$. Let $\Delta t_j := t_{j+1} - t_j \in T_{t_j}\mathbb{R}$ and define
\[ R_n := \sum_{j=0}^{n-1} f(r(t_j))\,ds\big(r'(t_j)\,dt(\Delta t_j)\big) = \sum_{j=0}^{n-1} f(r(t_j))\sqrt{x_1'(t_j)^2 + \cdots + x_n'(t_j)^2}\;dt(\Delta t_j). \]
We define
\[ \int_C f(x)\,ds := \lim_{n\to\infty} R_n = \lim_{n\to\infty}\sum_{j=0}^{n-1} f(r(t_j))\sqrt{x_1'(t_j)^2 + \cdots + x_n'(t_j)^2}\;dt(\Delta t_j) = \int_a^b f(r(t))\sqrt{x_1'(t)^2 + \cdots + x_n'(t)^2}\,dt = \int_a^b f(r(t))\,\|r'(t)\|\,dt. \]
This leads us to the following definition.

Definition 4.2.11. Consider a function $f : \mathbb{R}^n \to \mathbb{R}$ and $C$ a curve defined by $r(t)$ with $t \in [a, b]$. The integral of $f$ over $C$ is given by
\[ \int_C f(x_1, \ldots, x_n)\,ds = \int_a^b f(r(t))\,\|r'(t)\|\,dt. \]
We look at a few examples.
(1) Write the formula for the integral of $f(x, y) = xy$ over the curve $C$ defined by $r(t) = (t, t^3)$ with $t \in [0, 1]$. Begin by obtaining the derivative $r'(t) = (1, 3t^2)$ and $\|r'(t)\| = \sqrt{1 + 9t^4}$. Since $f(r(t)) = t \cdot t^3 = t^4$,
\[ \int_C f\,ds = \int_0^1 t^4\sqrt{1 + 9t^4}\,dt. \]
(2) Compute the integral of $f(x, y, z) = 2x$ over the curve $C$ defined by $r(t) = (t, 3\cos t, 3\sin t)$ with $t \in [0, 2\pi]$. We compute directly $\|r'(t)\| = \sqrt{10}$ and so
\[ \int_C f\,ds = \int_0^{2\pi} 2t\sqrt{10}\,dt = \sqrt{10}\,t^2\Big|_0^{2\pi} = 4\pi^2\sqrt{10}. \]
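Definition 4.2.11 translates directly into a numerical recipe: evaluate $f$ along the curve, multiply by the speed, and integrate in $t$. The following sketch assumes the parametrization and its derivative are supplied as Python functions; the names `line_integral_scalar` and `simpson` are hypothetical helpers introduced for illustration.

```python
import math

def simpson(g, a, b, n=2000):
    """Composite Simpson rule for the integral of g over [a, b]; n must be even."""
    h = (b - a) / n
    total = g(a) + g(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * g(a + k * h)
    return total * h / 3

def line_integral_scalar(f, r, dr, a, b, n=2000):
    """Approximate the integral of f over C as the integral of f(r(t)) ||r'(t)|| on [a, b]."""
    def integrand(t):
        speed = math.sqrt(sum(c * c for c in dr(t)))
        return f(*r(t)) * speed
    return simpson(integrand, a, b, n)

# Second example above: f(x, y, z) = 2x on r(t) = (t, 3 cos t, 3 sin t), t in [0, 2 pi].
value = line_integral_scalar(
    lambda x, y, z: 2 * x,
    lambda t: (t, 3 * math.cos(t), 3 * math.sin(t)),
    lambda t: (1.0, -3 * math.sin(t), 3 * math.cos(t)),
    0.0,
    2 * math.pi,
)
print(value, 4 * math.pi ** 2 * math.sqrt(10))
```

Passing the derivative explicitly keeps the sketch free of finite-difference error; a numerical derivative could be substituted when $r'(t)$ is not available in closed form.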
Example 4.2.12. The formula for the integral of $f$ over $C$ is also valid in other coordinate systems. Consider the curve given by the vector function
\[ r(t) = (a\cos t\sin t,\ a\sin^2 t,\ a\cos t) \]
with $t \in [0, \pi/2]$ seen in Figure 4.11. This curve has a form similar to the one of Example 4.2.10 and one can verify that $C$ is located on a sphere of radius $a$. We compute $d\rho$, $d\theta$ and $d\phi$ explicitly. We start with
\[ \rho^2 = x^2 + y^2 + z^2 \]
and take the differential to obtain
\[ 2\rho\,d\rho = 2x\,dx + 2y\,dy + 2z\,dz. \]
Then,
\[ \rho\,d\rho = x\,dx + y\,dy + z\,dz = a^2\cos t\sin t(-\sin^2 t + \cos^2 t)\,dt + a^2\sin^2 t\,(2\sin t\cos t\,dt) + a\cos t\,(-a\sin t\,dt) \]
\[ = a^2(\cos^3 t\sin t + \sin^3 t\cos t - \sin t\cos t)\,dt = a^2\big(\sin t\cos t(\cos^2 t + \sin^2 t) - \sin t\cos t\big)\,dt = 0. \]
Next, since $x^2 + y^2 = \rho^2\sin^2\phi$,
\[ \rho^2\sin^2\phi\,d\theta = -y\,dx + x\,dy = \big(-a^2\sin^2 t(-\sin^2 t + \cos^2 t) + a^2\cos t\sin t\,(2\sin t\cos t)\big)\,dt = a^2(\sin^4 t + \sin^2 t\cos^2 t)\,dt = a^2\sin^2 t\,dt. \]
From $z = \rho\cos\phi = a\cos t$ we have $\phi = t$ on $[0, \pi/2]$, so $\rho^2\sin^2\phi = a^2(1 - \cos^2\phi) = a^2\sin^2 t$ and therefore $d\theta = dt$. Finally, $dz = \cos\phi\,d\rho - \rho\sin\phi\,d\phi$ becomes $-a\sin t\,dt = -a\sin t\,d\phi$, so $d\phi = dt$. Hence
\[ ds = \sqrt{d\rho^2 + \rho^2\sin^2\phi\,d\theta^2 + \rho^2\,d\phi^2} = \sqrt{a^2\sin^2 t + a^2}\,dt = a\sqrt{1 + \sin^2 t}\,dt. \]

Fig. 4.11. Curve $r(t) = (2\cos t\sin t,\ 2\sin^2 t,\ 2\cos t)$ from Example 4.2.12.

The integral of $f(x, y, z) = z$ over $C$ is then, with the substitution $u = \sin t$,
\[ \int_C z\,ds = \int_0^{\pi/2} a\cos t\cdot a\sqrt{1 + \sin^2 t}\,dt = a^2\int_0^1 \sqrt{1 + u^2}\,du = a^2\left[\frac{u}{2}\sqrt{1 + u^2} + \frac{1}{2}\ln\!\left(u + \sqrt{1 + u^2}\right)\right]_0^1 = \frac{a^2}{2}\left(\sqrt{2} + \ln(1 + \sqrt{2})\right). \]
Mass and centre of mass
As described above, for a thin wire with mass density given by a function $\rho(x, y, z)$, the mass $m$ of the wire is given by
\[ m = \int_C \rho(x, y, z)\,ds. \]

Example 4.2.13. We compute the mass of the wire of density $\rho(x, y, z) = z$ of shape $r(t) = (\cos t, \sin t, t)$ with $t \in [0, 4\pi]$. This is done in cylindrical coordinates with $ds = \sqrt{dr^2 + r^2\,d\theta^2 + dz^2}$. We know
\[ r\,dr = x\,dx + y\,dy = \cos t(-\sin t\,dt) + \sin t(\cos t\,dt) = 0, \qquad dz = dt, \]
and
\[ r^2\,d\theta = -y\,dx + x\,dy = -\sin t(-\sin t\,dt) + \cos t(\cos t\,dt) = dt \]
with $r^2 = \cos^2 t + \sin^2 t = 1$. Thus,
\[ \int_C \rho(x, y, z)\,ds = \int_0^{4\pi} t\sqrt{2}\,dt = 8\sqrt{2}\,\pi^2. \]
The centre of mass of a wire with density $\rho(x, y, z)$ is located at $(\bar x, \bar y, \bar z)$ given by the formulae
\[ \bar x = \frac{1}{m}\int_C x\,\rho(x, y, z)\,ds, \qquad \bar y = \frac{1}{m}\int_C y\,\rho(x, y, z)\,ds, \qquad \bar z = \frac{1}{m}\int_C z\,\rho(x, y, z)\,ds, \]
where $m$ is the mass of the wire. For a wire lying in a plane, we need only consider $\rho(x, y)$ and $(\bar x, \bar y)$.

Example 4.2.14. Compute the centre of mass of a wire with constant density $\rho(x, y) = 3$ of shape $C$ given by $r(t) = (\cos t, \sin t)$ with $t \in [0, \pi]$. We begin by computing the mass using $ds = \sqrt{dr^2 + r^2\,d\theta^2}$ (it is also straightforward with $ds = \sqrt{dx^2 + dy^2} = dt$). Because $C$ has constant radius, $dr = 0$ can be easily checked. Now, $\tan\theta = \tan t$ so $\theta = t$, but the domain must be split at $t = \pi/2$. Therefore, $d\theta = dt$ and
\[ m = \int_C 3\,ds = 3\int_0^{\pi/2} dt + 3\int_{\pi/2}^{\pi} dt = 3(\pi/2 + \pi/2) = 3\pi. \]
Fig. 4.12. Wire in a semicircle shape $(\cos t, \sin t)$ and location of the centre of mass in Example 4.2.14.

Then,
\[ \bar x = \frac{1}{3\pi}\int_C 3x\,ds = \frac{1}{\pi}\int_0^{\pi}\cos t\,dt = \frac{1}{\pi}\sin t\,\Big|_0^{\pi} = 0, \]
\[ \bar y = \frac{1}{3\pi}\int_C 3y\,ds = \frac{1}{\pi}\int_0^{\pi}\sin t\,dt = \frac{1}{\pi}(-\cos t)\Big|_0^{\pi} = \frac{2}{\pi}. \]
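The mass and centre-of-mass formulas can be evaluated numerically for the semicircular wire of Example 4.2.14. This is a small sketch under the example's data (constant density 3, $r(t) = (\cos t, \sin t)$, $t \in [0, \pi]$); the helper names `simpson` and `wire_integral` are illustrative assumptions.

```python
import math

def simpson(g, a, b, n=2000):
    """Composite Simpson rule for the integral of g over [a, b]; n must be even."""
    h = (b - a) / n
    total = g(a) + g(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * g(a + k * h)
    return total * h / 3

DENSITY = 3.0  # constant mass density of the wire

def r(t):
    return (math.cos(t), math.sin(t))

def speed(t):
    return math.hypot(-math.sin(t), math.cos(t))  # ||r'(t)||, equal to 1 here

def wire_integral(g):
    """Integral over the wire of g(x, y) * density ds."""
    return simpson(lambda t: g(*r(t)) * DENSITY * speed(t), 0.0, math.pi)

mass = wire_integral(lambda x, y: 1.0)
x_bar = wire_integral(lambda x, y: x) / mass
y_bar = wire_integral(lambda x, y: y) / mass
print(mass, x_bar, y_bar)
```

The output should be close to $3\pi$, $0$ and $2/\pi$. The centre of mass lies inside the semicircle and off the wire itself, as the figure indicates.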
Exercises
(1) Let $C$ be the curve given by $r(t) = (t, t^2)$ for $t \in [0, 1]$ and compute
\[ \int_C x\,ds. \]
(2) Let $C$ be the curve of intersection of the cone $z^2 = x^2 + y^2$ and the plane $z = 2 - x - y$. Compute
\[ \int_C z\,ds. \]
(3) Find the mass and the centre of mass of the wire $C$ with density $\rho(x, y) = 1 + x + y$ with shape given by $r(t) = (1 - t)(-1, 0) + t(2, 3)$.
(4) Let $C$ be the curve given by $r(t) = (e^{-t}\cos t, e^{-t}\sin t)$ with $t \in [0, 2\pi]$ and compute
\[ \int_C x^2 + y^2\,ds. \]
(5) Find the mass and centre of mass of the wire $C$ with density $\rho(x, y, z) = xy$ with shape given by $r(t) = (\cos(2\pi t), \sin(2\pi t), 3t)$ with $t \in [0, 1]$.
(6) Let $C$ be the curve given by $r(t) = ((1 - t)\cos t\sin t,\ (1 - t)\sin^2 t,\ (1 - t)\cos t)$ with $t \in [0, 2\pi]$. Set up the integral
\[ \int_C x\,ds. \]
4.2.3 Curvature

We denote the unit tangent vector to a curve $C$ given by a vector function $r(t)$ by
\[ T(t) := \frac{r'(t)}{\|r'(t)\|} \]
and look at its evolution as $t$ changes. In fact, we begin the discussion by assuming that $r(s)$ is the arc-length parametrization of $C$; then $T(s)\cdot T(s) = 1$ and
\[ \frac{d}{ds}\big(T(s)\cdot T(s)\big) = 2\,T(s)\cdot\frac{dT(s)}{ds} = 0. \tag{4.10} \]

Fig. 4.13. Curve $C$ with unit tangent vector $T(s)$ and the vector $\frac{dT}{ds}$.

This means the vector $\frac{dT(s)}{ds}$ is perpendicular to the tangent vector, but need not be a unit vector. See Figure 4.13.

Definition 4.2.15. The curvature of a curve with the arc-length parametrization is
\[ \kappa = \left\|\frac{dT}{ds}\right\| \]
where $T$ is the unit tangent vector.

We check this definition with some intuitive examples.
Example 4.2.16. Consider two points $p, q \in \mathbb{R}^n$ and $C$ the line joining these points. The arc-length parametrization is given by
\[ r(s) = \frac{(\|q - p\| - s)\,p + s\,q}{\|q - p\|} \]
with $s \in [0, \|q - p\|]$. Since $r'(s) = (q - p)/\|q - p\|$ is constant,
\[ \frac{dT}{ds} = 0 \]
and the curvature $\kappa = 0$. This agrees with our intuition that a straight line has no curvature.
Example 4.2.17. Consider the circle of radius $a$ given by its arc-length parametrization $r(s) = (a\cos(s/a), a\sin(s/a))$ with $s \in [0, 2\pi a]$. Then,
\[ r'(s) = (-\sin(s/a), \cos(s/a)) \]
and
\[ \frac{dT}{ds} = \left(-a^{-1}\cos(s/a),\ -a^{-1}\sin(s/a)\right), \]
which means $\kappa = 1/a$. A circle has constant curvature which is the reciprocal of its radius. Therefore, the larger the circle, the smaller its curvature. In plain words, this means that for arcs of the same length on the small and large circles, the unit tangent vector on the small circle has a greater variation of its direction as compared to the larger circle; it is more curved!

Fig. 4.14. Comparison of the curvature of circles of radius 1 ($\kappa = 1$) and 10 ($\kappa = 1/10$).
As we have seen already several times, the arc-length parametrization is not always computable and so we now obtain formulas for $\kappa$ in terms of general parametrizations $r(t)$.

Theorem 4.2.18. Let $C$ be a curve given by $r(t)$ with $t \in [a, b]$. Then
\[ \kappa = \frac{\|T'(t)\|}{\|r'(t)\|} = \frac{\|r'(t)\times r''(t)\|}{\|r'(t)\|^3}. \]

Proof. Let $t = \phi(s)$ be the reparametrization between $r(t)$ and its arc-length parametrization $\tilde r(s)$. Recall that $ds/dt = \|r'(t)\|$. Then,
\[ T(t) = \frac{r'(t)}{\|r'(t)\|} \]
and so
\[ \frac{dT}{ds} = T'(t)\frac{dt}{ds} = \frac{T'(t)}{ds/dt} = \frac{T'(t)}{\|r'(t)\|}. \]
Thus,
\[ \kappa = \frac{\|T'(t)\|}{\|r'(t)\|}. \]
The second formula is obtained by noticing that
\[ r'(t) = \|r'(t)\|\,T(t). \]
The rest of the proof is an exercise in computing various derivatives. First,
\[ r''(t) = \frac{d^2s}{dt^2}\,T(t) + \frac{ds}{dt}\,T'(t). \]
Using the fact that $T(t)\times T(t) = 0$, one can show that
\[ \|r'(t)\times r''(t)\| = \left(\frac{ds}{dt}\right)^2\|T'(t)\|. \]
Isolating $\|T'(t)\|$ from this equation yields the result.

The formula for $\kappa$ in terms of $r'(t)$ and $r''(t)$ is typically easier to use.

Example 4.2.19. We compute the curvature of the parabola $C$ given by $r(t) = (t, t^2)$. Begin with writing $C$ as $r(t) = (t, t^2, 0)$; then
\[ r'(t) = (1, 2t, 0) \quad\text{and}\quad r''(t) = (0, 2, 0), \]
and
\[ r'(t)\times r''(t) = (0, 0, 2), \qquad \|r'(t)\times r''(t)\| = 2. \]
Now, $\|r'(t)\| = \sqrt{1 + 4t^2}$, which yields
\[ \kappa = \frac{2}{(1 + 4t^2)^{3/2}}. \]
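The cross-product formula for curvature is easy to evaluate once $r'(t)$ and $r''(t)$ are known. The sketch below implements it with the standard library only and checks it on the parabola above and on the helix $r(t) = (\cos t, \sin t, t)$, whose curvature works out to the constant $1/2$; the function names `cross`, `norm` and `curvature` are illustrative.

```python
import math

def cross(u, v):
    """Cross product of two vectors in R^3."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def norm(u):
    return math.sqrt(sum(c * c for c in u))

def curvature(d1, d2):
    """kappa = ||r' x r''|| / ||r'||^3, given the first and second derivatives."""
    return norm(cross(d1, d2)) / norm(d1) ** 3

# Parabola r(t) = (t, t^2, 0) at t = 0.5.
t = 0.5
kappa_parabola = curvature((1.0, 2 * t, 0.0), (0.0, 2.0, 0.0))

# Helix r(t) = (cos t, sin t, t) at t = 1.3.
s = 1.3
kappa_helix = curvature((-math.sin(s), math.cos(s), 1.0),
                        (-math.cos(s), -math.sin(s), 0.0))

print(kappa_parabola, 2 / (1 + 4 * t ** 2) ** 1.5, kappa_helix)
```

Planar curves are handled by appending a zero third component, exactly as done for the parabola in Example 4.2.19.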
Exercises
(1) Compute the curvature of the following curves.
    (a) $r(t) = (e^t\cos t, e^t\sin t)$
    (b) $r(t) = (1 + t, 2 - 5t^2)$
    (c) $r(t) = (t, t^2, t^3)$
    (d) $r(t) = (\cos t, \sin t, t)$
(2) For the first two curves of the previous problem, plot on the same graph $r(t)$ and $\kappa(t)$.
(3) Show that for a curve given by $r(t) = (t, f(t))$, the curvature formula is
\[ \kappa = \frac{|f''(x)|}{\big(1 + (f'(x))^2\big)^{3/2}}. \]
4.3 Line integrals of vector fields

There is a tight link between vector fields and 1-forms. Let $x = (x_1, \ldots, x_n) \in \mathbb{R}^n$, then
\[ \{\text{Vector fields in } \mathbb{R}^n\} \iff \{\text{1-forms on } \mathbb{R}^n\} \]
\[ F(x) = (F_1(x), \ldots, F_n(x)) \iff \omega = F_1(x)\,dx_1 + \cdots + F_n(x)\,dx_n. \]
In particular, one can write
\[ \omega = (F_1(x), \ldots, F_n(x))\cdot(dx_1, \ldots, dx_n). \]
Because of the correspondence above, we can define a line integral of vector fields. If $F$ is a vector field and $C$ is a curve given by $r(t)$ with $t \in [a, b]$, then we can evaluate the contribution of $F$ along $C$ by the formula
\[ F(r(t))\cdot r'(t) \]
for all $t \in [a, b]$.

Definition 4.3.1. The line integral of a vector field $F$ over a curve $C$ given by $r(t)$ with $t \in [a, b]$ is
\[ \int_a^b F(r(t))\cdot r'(t)\,dt. \]
Using (3.6), we obtain the notation
\[ \int_a^b F(r(t))\cdot r'(t)\,dt = \int_C F\cdot dr = \int_C F\cdot T\,ds \]
where $T = r'(t)/\|r'(t)\|$.

Example 4.3.2. Consider the vector field $F(x, y) = (-3x + xy,\ 5xy)$ and let $C$ be the curve given by $r(t) = (t^3, -2t^2)$ with $t \in [0, 1]$. Then, the line integral of $F$ over $C$ is obtained as follows. Compute
\[ F(r(t)) = \big(-3(t^3) + t^3(-2t^2),\ 5t^3(-2t^2)\big) = (-3t^3 - 2t^5,\ -10t^5) \]
and so
\[ F(r(t))\cdot r'(t) = (-3t^3 - 2t^5,\ -10t^5)\cdot(3t^2, -4t) = -9t^5 - 6t^7 + 40t^6. \]
Therefore,
\[ \int_C F\cdot dr = \int_0^1 (-9t^5 - 6t^7 + 40t^6)\,dt = -\frac{3}{2} - \frac{3}{4} + \frac{40}{7}. \]
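The computation of Example 4.3.2 can be reproduced numerically from Definition 4.3.1. The sketch below assumes the field is $F(x, y) = (-3x + xy, 5xy)$ and the curve is $r(t) = (t^3, -2t^2)$, which is how the evaluation $F(r(t)) = (-3t^3 - 2t^5, -10t^5)$ in the example reads; `simpson` is a helper introduced here for illustration.

```python
def simpson(g, a, b, n=2000):
    """Composite Simpson rule for the integral of g over [a, b]; n must be even."""
    h = (b - a) / n
    total = g(a) + g(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * g(a + k * h)
    return total * h / 3

def F(x, y):
    # Vector field assumed from the evaluation F(r(t)) in the example.
    return (-3 * x + x * y, 5 * x * y)

def r(t):
    return (t ** 3, -2 * t ** 2)

def dr(t):
    return (3 * t ** 2, -4 * t)

def integrand(t):
    """F(r(t)) . r'(t), the pull-back of the work form to the parameter interval."""
    fx, fy = F(*r(t))
    dx, dy = dr(t)
    return fx * dx + fy * dy

value = simpson(integrand, 0.0, 1.0)
exact = -3 / 2 - 3 / 4 + 40 / 7
print(value, exact)
```

The same three functions `F`, `r`, `dr` can be swapped out to test other fields and curves, for instance those in the exercises at the end of the section.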
Definition 4.3.3. The following two definitions are in direct correspondence.
(1) A 1-form $\omega$ on $U \subset \mathbb{R}^n$ is exact if there exists a differentiable function $f : U \subset \mathbb{R}^n \to \mathbb{R}$ such that $\omega = df$.
(2) A vector field $F(x)$ defined on $U \subset \mathbb{R}^n$ is said to be gradient or conservative if there exists a differentiable function $f : U \subset \mathbb{R}^n \to \mathbb{R}$ such that $F(x) = \nabla f(x)$. In this case, $f$ is called the potential function.

The name conservative comes from physics where force fields are given by vector fields. In the case of conservative vector fields, we show at the end of the section the Principle of Conservation of Energy, which justifies the use of the term conservative.

Example 4.3.4. Let $\omega = ye^{xy}\,dx + xe^{xy}\,dy$. We show that $\omega$ is exact with $\omega = df$; that is,
\[ \frac{\partial f}{\partial x} = ye^{xy} \quad\text{and}\quad \frac{\partial f}{\partial y} = xe^{xy}. \]
We integrate
\[ f(x, y) = \int \frac{\partial f}{\partial x}\,dx = \int ye^{xy}\,dx = e^{xy} + g(y). \]
Therefore,
\[ \frac{\partial f}{\partial y} = xe^{xy} + g'(y) = xe^{xy}, \]
so $g'(y) = 0$, which implies $g(y) = K$ is a constant. Thus, $f(x, y) = e^{xy} + K$, where $K$ is an integrating constant.

For exact 1-forms (and conservative vector fields), the function $f$ such that $\omega = df$ is called an antiderivative, in analogy with the case of functions of one variable, and so the integration of $\omega$ is done directly using the antiderivative. This is the content of the next result, which is one of the cornerstones of advanced calculus. Before we write the statement, we need to introduce the concept of simple closed curve. A curve $C$ is closed if it can be parametrized by $r(t)$ with $t \in [a, b]$ such that $r(a) = r(b)$. A closed curve $C$ is simple if it has no self-intersection. See Figure 4.15.

Fig. 4.15. Left: a simple closed curve. Right: a closed curve which is not simple because of the self-intersection.
Theorem 4.3.5. [Fundamental Theorem of Calculus for Line Integrals] Consider
\[ \omega = \sum_{i=1}^{n} a_i(x)\,dx_i \iff F(x) = (a_1(x), \ldots, a_n(x)) \]
with $\omega = df$ and so $F(x) = \nabla f(x)$. Let $C$ be a curve given by $r(t)$ with parameter $t \in [a, b]$, then
(a)
\[ \int_C \omega = f(r(b)) - f(r(a)) = \int_C F\cdot dr. \]
(b) If $C$ and $C'$ are two curves joining $p \in \mathbb{R}^n$ to $q \in \mathbb{R}^n$, then
\[ \int_C \omega = \int_{C'} \omega \quad\text{and}\quad \int_C F\cdot dr = \int_{C'} F\cdot dr. \]
The integral is independent of the path joining $p$ and $q$.
(c) If $C$ is a simple closed curve, then
\[ \oint_C \omega = 0 = \oint_C F\cdot dr, \]
where $\oint$ is the symbol used to emphasize that the line integral is taken over a simple closed curve.

Proof. (a) Because $\omega$ is exact, by definition there exists a differentiable function $f$ such that $\omega = df$. Let $r(t) = (x_1(t), \ldots, x_n(t))$, then
\[ r^*\omega = r^*\left(\frac{\partial f}{\partial x_1}(x)\,dx_1 + \cdots + \frac{\partial f}{\partial x_n}(x)\,dx_n\right) \]
\[ = \frac{\partial f}{\partial x_1}(r(t))\,x_1'(t)\,dt + \cdots + \frac{\partial f}{\partial x_n}(r(t))\,x_n'(t)\,dt \]
\[ = \left(\frac{\partial f}{\partial x_1}(r(t))\,x_1'(t) + \cdots + \frac{\partial f}{\partial x_n}(r(t))\,x_n'(t)\right)dt \]
\[ = \nabla f(r(t))\cdot r'(t)\,dt = \frac{d}{dt}f(r(t))\,dt, \]
where the last equality follows from the chain rule. Thus,
\[ \int_C \omega = \int_a^b \frac{d}{dt}f(r(t))\,dt = f(r(b)) - f(r(a)). \]
(b) Because $C$ and $C'$ have the same endpoints, the integrals are equal by part (a).
(c) A simple closed curve is such that $r(b) = r(a)$ and so the integral is zero.

Remark 4.3.6. In particular, the last implication means
\[ \omega \text{ is exact} \implies \oint_C \omega = 0. \]
Example 4.3.7. Consider the 1-form
\[ \omega = \frac{-y\,dx + x\,dy}{x^2 + y^2}. \]
In polar coordinates, $\omega = d\theta$. Consider now a circle $C$ of radius $a$ centered at $(0, 0)$ and compute
\[ \oint_C \omega = \int_0^{2\pi} d\theta = 2\pi \neq 0. \]
Therefore, $\omega$ is not exact.
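The non-exactness of $\omega = (-y\,dx + x\,dy)/(x^2 + y^2)$ can be observed numerically by pulling the form back to a parametrized circle. In this sketch the radius 3 and the second circle (radius 1 centered at $(2, 0)$, which does not enclose the origin) are illustrative choices, and `simpson` and `pullback` are hypothetical helper names.

```python
import math

def simpson(g, a, b, n=2000):
    """Composite Simpson rule for the integral of g over [a, b]; n must be even."""
    h = (b - a) / n
    total = g(a) + g(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * g(a + k * h)
    return total * h / 3

def pullback(r, dr):
    """Return t -> (-y x' + x y') / (x^2 + y^2), the pull-back of omega along r."""
    def g(t):
        x, y = r(t)
        dx, dy = dr(t)
        return (-y * dx + x * dy) / (x * x + y * y)
    return g

# Circle of radius 3 centered at the origin: encloses the point where omega is undefined.
around_origin = simpson(
    pullback(lambda t: (3 * math.cos(t), 3 * math.sin(t)),
             lambda t: (-3 * math.sin(t), 3 * math.cos(t))),
    0.0, 2 * math.pi)

# Circle of radius 1 centered at (2, 0): does not enclose the origin.
away_from_origin = simpson(
    pullback(lambda t: (2 + math.cos(t), math.sin(t)),
             lambda t: (-math.sin(t), math.cos(t))),
    0.0, 2 * math.pi)

print(around_origin, away_from_origin)
```

The first value should be close to $2\pi$ and the second close to $0$, which suggests that whether the curve surrounds the origin is what matters.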
In Chapters 6 and 8, we investigate in more detail the question of exactness of 1-forms and establish a necessary and sufficient condition for a 1-form to be (locally) exact. Example 4.3.7 is crucial in the understanding of this problem.
Example from Physics
We now look at applications of the Fundamental Theorem of Calculus for line integrals in physics.

Example 4.3.8. The radial gravitational force field given by
\[ F(x) = -\frac{mMG}{r^2}\,\frac{x}{\|x\|} \]
where $x = (x, y, z)$ and $r = \|x\|$ is conservative. The potential function is
\[ f(x) = \frac{mMG}{\|x\|}. \]
Indeed,
\[ \nabla f(x) = mMG\left(-\frac{x}{(x^2 + y^2 + z^2)^{3/2}},\ -\frac{y}{(x^2 + y^2 + z^2)^{3/2}},\ -\frac{z}{(x^2 + y^2 + z^2)^{3/2}}\right) = -\frac{mMG}{r^2}\,\frac{x}{\|x\|}. \]
We compute the work done by $F$ on the following paths.
(1) $C$ given by $r(t) = (\cos t\sin t, \sin^2 t, \cos t)$ with $t \in [0, \pi/4]$. Because $F$ is conservative,
\[ \int_C F\cdot dr = f(r(\pi/4)) - f(r(0)) = f\!\left(\tfrac{1}{2}, \tfrac{1}{2}, \tfrac{\sqrt{2}}{2}\right) - f(0, 0, 1) = mMG\left(\frac{1}{\sqrt{\tfrac{1}{4} + \tfrac{1}{4} + \tfrac{1}{2}}} - \frac{1}{\sqrt{0^2 + 0^2 + 1^2}}\right) = 0. \]
The work vanishes because $C$ lies on the unit sphere, so both endpoints are at distance 1 from the origin.
(2) $C$ is given by $r(t) = (\cos t, \sin t, \cos(3t))$ with $t \in [0, 2\pi]$. Then,
\[ \int_C F\cdot dr = f(r(2\pi)) - f(r(0)) = f(1, 0, 1) - f(1, 0, 1) = 0. \]
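Path independence for the gravitational field can be illustrated numerically. The sketch below sets the constant $mMG = 1$ (an assumption made purely for the illustration), so that $F(x) = -x/\|x\|^3$ with potential $f(x) = 1/\|x\|$, and integrates $F\cdot r'$ along the closed curve of item (2) and along a straight segment from $(1, 0, 0)$ to $(0, 2, 0)$, a hypothetical path chosen here that is not in the text.

```python
import math

def simpson(g, a, b, n=2000):
    """Composite Simpson rule for the integral of g over [a, b]; n must be even."""
    h = (b - a) / n
    total = g(a) + g(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * g(a + k * h)
    return total * h / 3

def F(p):
    """Gravitational field with mMG = 1 (illustrative normalization)."""
    d = math.sqrt(sum(c * c for c in p)) ** 3
    return tuple(-c / d for c in p)

def potential(p):
    return 1.0 / math.sqrt(sum(c * c for c in p))

def work(r, dr, a, b):
    """Line integral of F along r(t), t in [a, b]."""
    return simpson(lambda t: sum(u * v for u, v in zip(F(r(t)), dr(t))), a, b)

# Closed curve r(t) = (cos t, sin t, cos 3t), t in [0, 2 pi].
closed = work(lambda t: (math.cos(t), math.sin(t), math.cos(3 * t)),
              lambda t: (-math.sin(t), math.cos(t), -3 * math.sin(3 * t)),
              0.0, 2 * math.pi)

# Straight segment from p = (1, 0, 0) to q = (0, 2, 0).
segment = work(lambda t: (1 - t, 2 * t, 0.0),
               lambda t: (-1.0, 2.0, 0.0),
               0.0, 1.0)
predicted = potential((0.0, 2.0, 0.0)) - potential((1.0, 0.0, 0.0))

print(closed, segment, predicted)
```

The closed-curve work should be numerically zero, and the work along the segment should match the potential difference $f(q) - f(p) = -1/2$, in line with parts (a) and (c) of Theorem 4.3.5.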
Consider the work done by a force $F$ on an object moved between $p$ and $q$ in $\mathbb{R}^n$. If $F$ is conservative, then by Theorem 4.3.5(b), the work done by $F$ is independent of the path chosen. Suppose that an object of mass $m$ is moved on a path $C$ given by $r(t)$ between $p = r(a)$ and $q = r(b)$ in $\mathbb{R}^3$. The force applied on the object at $r(t)$ is
\[ F(r(t)) = m\,r''(t) \]
by Newton's second law. The force $F$ is not necessarily conservative, but the work 1-form $dW = F\cdot(dx, dy, dz)$ on $r(t)$ is
\[ r^*dW = F(r(t))\cdot dr = m\,r''(t)\cdot r'(t)\,dt = \frac{m}{2}\frac{d}{dt}\|r'(t)\|^2\,dt \]
and integrating we have
\[ \int_C dW = \frac{m}{2}\int_a^b \frac{d}{dt}\|r'(t)\|^2\,dt = \frac{m}{2}\left(\|r'(b)\|^2 - \|r'(a)\|^2\right). \]
The quantity $\frac{1}{2}m\|r'(t)\|^2$ is called the kinetic energy of the object and is denoted $K(r(t))$. This gives us the following result.

Theorem 4.3.9. The work done by a force $F$ in moving an object of mass $m$ on a path $C$ between points $p, q \in \mathbb{R}^n$ is the difference in the kinetic energy at points $p$ and $q$.

In the case of a conservative force $F$ with potential function $f$, we know furthermore that
\[ \int_C F\cdot dr = f(r(b)) - f(r(a)). \tag{4.11} \]
The potential energy of an object located at $p$ in a conservative field $F$ is $P(p) = -f(p)$ and therefore, $F(p) = -\nabla P(p)$. Therefore, (4.11) becomes
\[ \int_C F\cdot dr = P(r(a)) - P(r(b)). \]
Together with Theorem 4.3.9 we have
\[ \frac{m}{2}\left(\|r'(b)\|^2 - \|r'(a)\|^2\right) = P(r(a)) - P(r(b)). \]
We define $E(p) := K(p) + P(p)$ to be the total energy at $p$. Rearranging the terms, we obtain
\[ E(r(b)) = K(r(b)) + P(r(b)) = K(r(a)) + P(r(a)) = E(r(a)). \]
This leads us to one of the Fundamental Laws of Physics.

Theorem 4.3.10. [Principle of Conservation of Energy] In a conservative field $F$, the total energy is conserved along any path $C$ joining two points $p$ and $q$.
Exercises
(1) Compute the line integrals of vector fields F over the curves C given below. (a) F (x, y) = (x sin(y), cos(y)) where C is given by r(t) = (cos(t), t) with t ∈ [0, 2π]. √ √ (b) F (x,y,z) = (y 2 ex , z cos(yz), xz) where C is given by r(t) = (ln(t), t, 2t) and t ∈ [1, 3]. (2) Show that the line integrals below are independent of path and compute the integrals. (a)
\[ \int_C (2xy + z^2)\,dx + (x^2 - 2z^2)\,dy + (2xz - 4yz)\,dz \]
where $C$ is any path from $(-1, 2, 0)$ to $(0, 2, 3)$.
    (b)
\[ \int_C (e^x\sin(y) + 3x^2)\,dx + (e^x\cos(y) - e^y)\,dy \]
where $C$ is any path from $(0, 0)$ to $(1, 0)$.
(3) Let $\omega = (-y\,dx + x\,dy)/(x^2 + y^2)$ and show the following integrals.
    (a) $\frac{1}{2\pi}\oint_C \omega = 1$ where $C$ is the square with vertices $(1, 1)$, $(-1, 1)$, $(-1, -1)$, $(1, -1)$ in counterclockwise direction.
    (b) $\frac{1}{2\pi}\oint_C \omega = 0$ where $C$ is the circle of radius 1 centered at $(2, 0)$ in the counterclockwise direction.
    (c) $\frac{1}{2\pi}\oint_C \omega = n$ where $C$ is given by $r(t) = (\cos(nt), \sin(nt))$ with $t \in [0, 2\pi]$ and $n \in \mathbb{N}$ is a fixed positive integer.
(4) From the previous problem, we see that if $\omega = (-y\,dx + x\,dy)/(x^2 + y^2)$, then $\omega$ is either non-exact, or could be exact if the integral $\frac{1}{2\pi}\oint_C \omega$ is zero. Notice that $\omega$ is not defined at $(0, 0)$ and in the first and third integral, the curve $C$ surrounds $(0, 0)$, but not the second curve. Consider $\tilde\omega = (-y\,dx + (x - 2)\,dy)/((x - 2)^2 + y^2)$ and show that
    (a) $\frac{1}{2\pi}\oint_C \tilde\omega = 1$ where $C$ is the circle of radius 1 centered at $(2, 0)$.
    (b) $\frac{1}{2\pi}\oint_C \tilde\omega = 0$ where $C$ is given by $r(t) = (\cos(nt), \sin(nt))$ with $t \in [0, 2\pi]$.
What is your conclusion? (Hint: use the substitution $u = x - 2$).
(5) Consider the electric force field
\[ F(x) = qQ\,\frac{x}{\|x\|^3} \]
generated by a charged particle $Q$ located at the origin on an electric charge $q$ located at $x = (x, y, z)$. Suppose an electron is located at the origin with charge $Q = -1.6\times 10^{-19}$ Coulomb and a positive charge $q > 0$ is located at $(x, y, z) = (10^{-12}, 0, 0)$ (in meters). Find the work done by the electric force field on the positive charge as it moves to the location $(x, y, z) = (0, 10^{-8}, 0)$ (use the value = $8.985\times 10^{9}$).
5 Differential Calculus of Mappings

The goal of this chapter is to generalize the basic results of differential calculus to mappings. We begin by distinguishing between functions (mappings) and the graphs of functions. This is often blurred when studying functions $f : \mathbb{R} \to \mathbb{R}$, but for general mappings it is an essential step. We go on to discuss various ways of visualizing some types of mappings. The following section deals with the concepts of limit and continuity. The definitions are similar to the case of real functions of a single variable; however, examples have much more exotic properties. We continue with the concept of best linear approximation, introduced previously for vector functions, which we now apply to general mappings and use to obtain a definition of derivative. We present the main properties of the derivative, in particular, the Chain Rule. We conclude this chapter with higher derivatives and Taylor expansions.

5.1 Graphs and Level Sets

We begin the study of local properties of general functions or mappings $f : \mathbb{R}^n \to \mathbb{R}^m$ where $n, m \geq 1$. We write those mappings as column vectors. For instance
(1) $r : \mathbb{R} \to \mathbb{R}^3$:
\[ r(t) = \begin{pmatrix} x(t) \\ y(t) \\ z(t) \end{pmatrix}. \]
(2) $f : \mathbb{R}^2 \to \mathbb{R}^2$:
\[ f(x, y) = \begin{pmatrix} f_1(x, y) \\ f_2(x, y) \end{pmatrix} \]
where $f_1, f_2 : \mathbb{R}^2 \to \mathbb{R}$.
(3) $f : \mathbb{R}^5 \to \mathbb{R}^3$:
\[ f(x_1, x_2, x_3, x_4, x_5) = \begin{pmatrix} f_1(x_1, x_2, x_3, x_4, x_5) \\ f_2(x_1, x_2, x_3, x_4, x_5) \\ f_3(x_1, x_2, x_3, x_4, x_5) \end{pmatrix} \]
where $f_1, f_2, f_3 : \mathbb{R}^5 \to \mathbb{R}$.

We now define explicitly the "graph of a function". In the case of functions $f : \mathbb{R} \to \mathbb{R}$ or $f : \mathbb{R}^2 \to \mathbb{R}$, the graph of $f$ can be easily visualized and the function and its graph are often thought of as one and the same. For vector functions of several variables, this is not feasible and we must then make sure that the concepts of a function and its graph are well understood in their own right.
Definition 5.1.1. Let $U \subset \mathbb{R}^n$ be an open set and $f : U \to \mathbb{R}^m$ be a mapping. The graph of $f$ is the set
\[ \operatorname{graph}(f) = \{(x, f(x)) \mid x \in U\}. \]

We look at a few examples.

Example 5.1.2. Consider a function $f : [a, b] \to \mathbb{R}$, then
\[ \operatorname{graph}(f) = \{(x, f(x)) \mid x \in [a, b]\}. \]
We can plot the points of $\operatorname{graph}(f)$ in the plane and this gives us the familiar representation of a curve over an interval in the plane as shown in Figure 5.1.

Fig. 5.1. Graph of a function $f : [a, b] \to \mathbb{R}$.

Let $U \subset \mathbb{R}^2$ and $f : U \to \mathbb{R}$, then
\[ \operatorname{graph}(f) = \{(x, y, f(x, y)) \mid (x, y) \in U\}. \]
As shown in Figure 5.2, we can plot $\operatorname{graph}(f)$ in a representation of three-dimensional space.

Fig. 5.2. Graph of a function $f : U \subset \mathbb{R}^2 \to \mathbb{R}$.

In Chapter 1, we plot several curves in the plane given by vector functions $r : \mathbb{R} \to \mathbb{R}^2$. The graph of $r$ can also often be visualized in three-dimensional space, as seen in Figure 5.3.

Fig. 5.3. Graph of a vector function in $\mathbb{R}^3$.
Graphical Representations
The previous examples show visualizations of graphs of functions and mappings. Those can be obtained for $f : \mathbb{R}^n \to \mathbb{R}^m$ with $n + m \leq 3$. We can complement the graphical representations with other visualizations in the case of vector functions $r : [a, b] \to \mathbb{R}^3$ and vector fields $f : \mathbb{R}^2 \to \mathbb{R}^2$ and $f : \mathbb{R}^3 \to \mathbb{R}^3$ as presented in previous chapters. Another approach to obtain information about mappings (and their graph) for functions of several variables is through level sets. The traces of conics of Chapter 1 are examples of level sets and we briefly mention level set curves in Section 3.2.3. We now formalize the concept.

Definition 5.1.3. Let $U \subset \mathbb{R}^n$, $f : U \to \mathbb{R}$. The level sets of $f$ are given by
\[ \{(x_1, \ldots, x_n) \in U \mid f(x_1, \ldots, x_n) = c\}, \tag{5.1} \]
where $c \in \mathbb{R}$. We denote the set (5.1) by $f^{-1}(c)$ and it is called the inverse image of $c$ by $f$. In particular, $f^{-1}(c)$ can be the empty set.
Note that the definition of inverse image is similar to the inverse of a function; in fact, if a function is invertible, its inverse image consists of isolated points. We now look at a few examples.

Example 5.1.4. The Monkey Saddle is a surface given by the graph of $f(x, y) = x^3 - 3xy^2$, see Figure 5.4. We look at its level sets by computing the inverse image $f^{-1}(c)$ for various values of $c$. It is typical, as a first step, to consider $c = 0$ and one value of $c < 0$ and $c > 0$. We have for $c = 0$ that $f(x, y) = x(x^2 - 3y^2) = 0$ and this holds for $x = 0$ and $x = \pm\sqrt{3}\,y$. See Figure 5.5 for some level set curves. Therefore, $f^{-1}(0)$ is made up of three lines through $(0, 0)$ at angles of $2\pi/3$. For $c \neq 0$ the task is more difficult.

Fig. 5.4. Monkey saddle surface.

Fig. 5.5. Level sets $f^{-1}(1)$, $f^{-1}(-1)$ and $f^{-1}(0)$ of the Monkey saddle surface.
Example 5.1.5. The motion of a pendulum without friction has total energy given by
\[ E(x, y) = \frac{1}{2}y^2 - \cos(x). \]
Figure 5.6 shows the surface given by $E$ and Figure 5.7, level sets for $c = -1$, $c = -1/2$ and $c = 2$. We see a transition from isolated points (at $x = 2k\pi$), to circles, and finally to curves extending to $\pm\infty$ in $x$.

Fig. 5.6. Surface given by the energy $E$ of the pendulum without friction.

Example 5.1.6. Let $x = (x_1, x_2)$ be coordinates on the plane and $y = (y_1, y_2)$ the velocity of the particle at $x$. The harmonic oscillator in the plane has total energy given by
\[ H(x, y) = \frac{1}{2}(y\cdot y + x\cdot x) \]
where $y\cdot y/2$ is the kinetic energy and $x\cdot x/2$ is the potential energy. The level sets of $H$ are given by
\[ H^{-1}(h) = \{(x, y) \in \mathbb{R}^4 \mid y\cdot y + x\cdot x = 2h\}. \]
For $h > 0$ we let $2h = k^2$ for $k > 0$. The level sets $H^{-1}(h)$ correspond to three-dimensional spheres of radius $k$ since they satisfy an equation analogous to the equation of a sphere in $\mathbb{R}^3$, but with an additional variable. For $h = 0$, the level set is only the origin $(0, 0, 0, 0)$, while for $h < 0$ the level set is empty.
Exercises
(1) Write the expression for the graph of the functions.
    (a) $f(x, y, z) = (x^2, xy, yz^3)^T$.
    (b) $g(A) = \det(A)$ where $A$ is a $2\times 2$ matrix.
(2) Write the level sets of the functions and give a geometrical description.
    (a) $g(x, y) = \cos x\cos y$
    (b) $f(x, y, z) = 2x^2 + y^2 + z^2$

5.2 Limits and Continuity

In this section we extend the concepts of limit and continuity to general mappings. The computation of limits for functions of several variables is much trickier than for functions $f : \mathbb{R} \to \mathbb{R}$.
Fig. 5.7. Level sets $E^{-1}(-1)$, $E^{-1}(-\tfrac{1}{2})$ and $E^{-1}(2)$ of the energy $E$ of the pendulum without friction, for $c = -1$, $c = -1/2$ and $c = 2$.

Example 5.2.1. Consider the function $f : \mathbb{R}^2 \to \mathbb{R}$ defined by
\[ f(x, y) = \begin{cases} x & \text{if } y = 0, \\ y & \text{if } x = 0, \\ 1 & \text{otherwise,} \end{cases} \]
and we look at various approaches towards $(0, 0)$. Consider the path $(x(t), y(t)) = (t, t)$, then
\[ \lim_{(t,t)\to(0,0)} f(x, y) = 1 \]
because $f(x, y) = 1$ on the path $(t, t)$. But
\[ \lim_{(x,0)\to(0,0)} f(x, y) = 0. \]
Therefore, the limit cannot exist at $(0, 0)$.

The definition of limit for mappings, therefore, needs to consider all possible approaches of $x$ towards $x_0$.
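Definition 5.2.2. Let $f : \mathbb{R}^n \to \mathbb{R}^m$ be a mapping, then the limit of $f$ at $x_0 \in \mathbb{R}^n$ exists if there exists a unique $\ell \in \mathbb{R}^m$ such that
\[ \lim_{x\to x_0} f(x) = \ell, \]
which means
\[ \forall \epsilon > 0,\ \exists \delta > 0 \quad\text{such that}\quad \|x - x_0\|_{\mathbb{R}^n} < \delta \implies \|f(x) - \ell\|_{\mathbb{R}^m} < \epsilon. \]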
123
Example 5.2.3. We show using the definition that
lim
(u,v)
→(0,0)
|u| + |v| = 0.
We begin by choosing some > 0. The function in our case is f : by f (u, v) = u + v and
2 R
|| ||
→ R defined
||f (x) − ||R = ||u| + |v| − 0| = |u| + |v|. √ We choose (u, v) such that || (u, v) − (0, 0)||R = u2 + v 2 < δ where δ is not yet 1
2
fixed with respect to . Note that the following inequalities hold for all u and v
|u| ≤ Therefore,
u2 + v 2
| |≤
and
|v| ≤
u2 + v 2 .
||f (x) − ||R = |u| + v 2 u2 + v2 < 2δ and setting δ = /2 we obtain ||f (x) − ||R < . This completes the proof. 1
1
As seen in Example 5.2.1, in order to show that a limit does not exist, it is often a good strategy to determine a direction of approach to the limiting point for which the one-dimensional limit does not exist, or to find two approaches which yield different values for the limit. We illustrate this method with the following two examples.

Example 5.2.4. We check whether
\[ \lim_{(x,y)\to(0,0)} \frac{xy}{3x^2 + 2y^2} \]
exists. Substituting $(0, 0)$ into the formula yields $0/0$ and this is an indeterminate form. However, it cannot be resolved using l'Hospital's rule as it is only valid for functions of one variable. Consider the line $y = mx$ and substitute in the expression; we obtain
\[ \frac{xy}{3x^2 + 2y^2} = \frac{mx^2}{3x^2 + 2m^2x^2} = \frac{m}{3 + 2m^2}. \]
Therefore,
\[ \lim_{(x,y)\to(0,0)} \frac{xy}{3x^2 + 2y^2} = \lim_{(x,mx)\to(0,0)} \frac{m}{3 + 2m^2} = \frac{m}{3 + 2m^2}, \]
which means the value of the limit depends on the line of approach towards the origin. We must then conclude that the limit does not exist.
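The dependence on the line of approach in Example 5.2.4 can be observed by evaluating the function at a point very close to the origin on several lines $y = mx$. The sketch below is illustrative: the slopes and the sample abscissa `1e-6` are arbitrary choices, and `along_line` is a hypothetical helper name.

```python
def f(x, y):
    """The function of Example 5.2.4, defined away from the origin."""
    return x * y / (3 * x * x + 2 * y * y)

def along_line(m, x):
    """Value of f at the point (x, m x) on the line y = m x."""
    return f(x, m * x)

slopes = [0.0, 1.0, -1.0, 2.0]
values = {m: along_line(m, 1e-6) for m in slopes}

for m in slopes:
    # Compare the value near the origin with the predicted m / (3 + 2 m^2).
    print(m, values[m], m / (3 + 2 * m * m))
```

Different slopes give different values arbitrarily close to the origin (for instance $1/5$ for $m = 1$ and $2/11$ for $m = 2$), which is exactly why the limit cannot exist.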
We now look at a mapping example.
Example 5.2.5. Consider $f : \mathbb{R}^2 \setminus \{(0, 0)\} \to \mathbb{R}^2$ defined by
\[ f(x, y) = \frac{(x, y)^T}{\|(x, y)\|}. \]
We show that the limit as $(x, y) \to (0, 0)$ does not exist. Let $y = mx$ and substitute in the expression
\[ \lim_{(x,y)\to(0,0)} \frac{(x, y)^T}{\|(x, y)\|} = \lim_{(x,mx)\to(0,0)} \frac{(x, mx)^T}{\|(x, mx)\|} = \lim_{(x,mx)\to(0,0)} \frac{x\,(1, m)^T}{|x|\sqrt{1 + m^2}} = \lim_{(x,mx)\to(0,0)} \frac{x}{|x|}\,\frac{(1, m)^T}{\sqrt{1 + m^2}}. \]
Again, the vector value of the limit depends on the line of approach to the origin, so the limit does not exist.
This example is similar to the case of the function
\[ f(x) = \begin{cases} \dfrac{x}{|x|} & x \neq 0 \\ 0 & x = 0, \end{cases} \]
which is illustrated in Figure 5.8. It is clear that the function does not have a limit at $x = 0$.
Fig. 5.8. Function $f(x)$ with a jump discontinuity at $x = 0$.
For functions of several variables, $f : \mathbb{R}^n \to \mathbb{R}$, the properties of the limit are the same as the ones already known for functions of one variable. Those are listed in the next result.
Proposition 5.2.6. Let $f, g : \mathbb{R}^n \to \mathbb{R}$, $x_0 \in \mathbb{R}^n$, $\alpha \in \mathbb{R}$, $L_1, L_2 \in \mathbb{R}$ and suppose
\[ \lim_{x\to x_0} f(x) = L_1 \quad \text{and} \quad \lim_{x\to x_0} g(x) = L_2. \]
Then,
(1) $\lim_{x\to x_0} \alpha f(x) + g(x) = \alpha L_1 + L_2$ (Linearity Property)
(2) $\lim_{x\to x_0} f(x)g(x) = L_1 L_2$
(3) If $L_2 \neq 0$, then $\lim_{x\to x_0} \dfrac{f(x)}{g(x)} = \dfrac{L_1}{L_2}$.
Proof. We only do the second case and leave the other ones as exercises at the end of the section. We must use the definition of limit. Let $\epsilon > 0$ and consider $|f(x)g(x) - L_1L_2|$. Then,
\[ \begin{aligned} |f(x)g(x) - L_1L_2| &= |f(x)g(x) - L_1g(x) + L_1g(x) - L_1L_2| \\ &= |(f(x) - L_1)g(x) + L_1(g(x) - L_2)| \\ &\le |(f(x) - L_1)g(x)| + |L_1(g(x) - L_2)| \\ &= |f(x) - L_1||g(x)| + |L_1||g(x) - L_2|. \end{aligned} \]
Because the limit of $g(x)$ as $x \to x_0$ exists, there exists $\delta_1 > 0$ such that if $\|x - x_0\| < \delta_1$ then $|g(x) - L_2| < \epsilon$. Rearranging this last inequality we obtain $-\epsilon + L_2 < g(x) < \epsilon + L_2$, thus $|g(x)| < \epsilon + |L_2|$ for all $x$ with $\|x - x_0\| < \delta_1$. For the same $\epsilon > 0$, there exists $\delta_2 > 0$ such that if $\|x - x_0\| < \delta_2$ then $|f(x) - L_1| < \epsilon$. Therefore, choosing $x$ such that $\|x - x_0\| < \min(\delta_1, \delta_2)$ we have
\[ |f(x) - L_1||g(x)| + |L_1||g(x) - L_2| < \epsilon(\epsilon + |L_2| + |L_1|). \]
Setting $\epsilon' = \epsilon(\epsilon + |L_2| + |L_1|)$, which can be made arbitrarily small by taking $\epsilon$ small, the proof is complete.
As it is done for functions of one variable, we use the limit to introduce the concept of continuity of a function. The exact definition follows.
Definition 5.2.7. Let $f : \mathbb{R}^n \to \mathbb{R}^m$ be a mapping. Then $f$ is continuous at $x_0 \in \mathbb{R}^n$ if $f(x_0)$ is defined and
\[ \lim_{x\to x_0} f(x) = f(x_0). \]
Example 5.2.8. Consider $f(x, y) = (|x| + |y|, |x| - |y|)^T$ and compute
\[ \lim_{(x,y)\to(0,0)} f(x, y) = (0, 0)^T = f(0, 0). \]
We now use the definition to prove this statement. Let $\epsilon > 0$. The first step is to estimate $f(x, y) - (0, 0)^T$ in the Euclidean norm on $\mathbb{R}^2$:
\[ \|f(x, y) - (0, 0)^T\| = \sqrt{(|x| + |y|)^2 + (|x| - |y|)^2} = \sqrt{2(|x|^2 + |y|^2)} = \sqrt{2}\sqrt{x^2 + y^2} = \sqrt{2}\,\|(x, y)\|. \]
If $(x, y)$ is chosen so that $\|(x, y)\| < \delta$ and we choose $\delta = \epsilon/\sqrt{2}$ then
\[ \|f(x, y) - (0, 0)^T\| = \sqrt{2}\,\|(x, y)\| < \sqrt{2}\,\delta = \epsilon \]
and the proof is complete.
The properties of continuous functions of several variables follow directly from the properties of limits seen in Proposition 5.2.6.
Proposition 5.2.9. Let $f, g : \mathbb{R}^n \to \mathbb{R}$ be continuous at $x_0$. Then, $f(x) + g(x)$, $f(x)g(x)$ are continuous at $x_0$. Moreover, if $g(x_0) \neq 0$, then $f(x)/g(x)$ is continuous at $x_0$.
Proof. By setting $L_1 = f(x_0)$ and $L_2 = g(x_0)$, Proposition 5.2.6 yields the result.
We now look at properties of limits and continuity of general mappings $f : \mathbb{R}^n \to \mathbb{R}^m$. Of course, the multiplication and division properties of limit do not have an analogue if $m > 1$. However, the linearity property still holds: if $f, g : \mathbb{R}^n \to \mathbb{R}^m$, $v_1, v_2 \in \mathbb{R}^m$, and
\[ \lim_{x\to x_0} f(x) = v_1 \quad \text{and} \quad \lim_{x\to x_0} g(x) = v_2 \]
then for $\alpha \in \mathbb{R}$
\[ \lim_{x\to x_0} \alpha f(x) + g(x) = \alpha v_1 + v_2. \tag{5.2} \]
However, we can use the results for functions of several variables, as follows, to obtain limit and continuity results for vector functions of several variables.
Proposition 5.2.10. Let $f : \mathbb{R}^n \to \mathbb{R}^m$ be a mapping where $f = (f_1, \ldots, f_m)^T$ and $v = (v_1, \ldots, v_m) \in \mathbb{R}^m$. Then,
\[ \lim_{x\to x_0} f(x) = v \]
if and only if $\lim_{x\to x_0} f_j(x) = v_j$ for all $j = 1, \ldots, m$.
Proof. This is an “if and only if” statement and so we need to prove two implications. We begin by assuming the limit for $f$ exists. Let $\epsilon > 0$; then there exists $\delta > 0$ such that for $\|x - x_0\| < \delta$
\[ \|f(x) - v\| = \sqrt{(f_1(x) - v_1)^2 + \cdots + (f_m(x) - v_m)^2} < \epsilon. \]
Recall that $|u| = \sqrt{u^2}$, and therefore
\[ |f_j(x) - v_j| \le \sqrt{(f_1(x) - v_1)^2 + \cdots + (f_m(x) - v_m)^2} < \epsilon \]
for all $j = 1, \ldots, m$. This implies
\[ \lim_{x\to x_0} f_j(x) = v_j \]
for all $j = 1, \ldots, m$ and so one implication is proved. Suppose now that
\[ \lim_{x\to x_0} f_j(x) = v_j \]
for all $j = 1, \ldots, m$. Let $\epsilon > 0$; there exists $\delta > 0$ (the smallest of the values obtained for each component) such that if $\|x - x_0\| < \delta$ then $|f_j(x) - v_j| < \epsilon$ for $j = 1, \ldots, m$. So, if $x$ is such that $\|x - x_0\| < \delta$, then
\[ \|f(x) - v\|^2 = (f_1(x) - v_1)^2 + \cdots + (f_m(x) - v_m)^2 < m\epsilon^2. \]
Thus, $\|f(x) - v\| < \sqrt{m}\,\epsilon$ and the proof is complete.
We can use the above result to determine continuity directly. The proof is straightforward by setting $v_j = f_j(x_0)$ for $j = 1, \ldots, m$.
Corollary 5.2.11. Let $f : \mathbb{R}^n \to \mathbb{R}^m$ where $f = (f_1, \ldots, f_m)^T$, then $f$ is continuous at $x_0$ if and only if $f_j$ is continuous at $x_0$ for all $j = 1, \ldots, m$.
Proposition 5.2.10 and Corollary 5.2.11, along with Propositions 5.2.6 and 5.2.9, give a recipe for verifying whether a mapping has a limit or is continuous. See the next example for an illustration of the procedure.
Example 5.2.12. Consider the mapping
\[ f(x, y, z) = \begin{pmatrix} x^2 \sin(xyz) \\ (x + yz)e^x \end{pmatrix}. \]
We use properties of functions of several variables to show continuity of $f$ as $(x, y, z) \to (0, 1, 1)$. Let $f = (f_1, f_2)^T$ and we study each function in turn. Because $x^2 \to 0$ as $x \to 0$ and $\sin(xyz) \to 0$ as $(x, y, z) \to (0, 1, 1)$ and we have $f_1(0, 1, 1) = 0$, then $\lim_{(x,y,z)\to(0,1,1)} f_1(x, y, z) = f_1(0, 1, 1)$. Checking the second one, we have $(x + yz) \to 1$ as $(x, y, z) \to (0, 1, 1)$ and $e^x \to 1$ as $x \to 0$; therefore, $\lim_{(x,y,z)\to(0,1,1)} f_2(x, y, z) = f_2(0, 1, 1)$. By Corollary 5.2.11, $f$ is continuous at $(0, 1, 1)$.
We can extend the definition of continuity at a point to continuity in a domain.
Definition 5.2.13. Let $U \subset \mathbb{R}^n$ be an open set and $f : U \to \mathbb{R}^m$. Then, $f$ is continuous on $U$ if for any $x_0 \in U$,
\[ \lim_{x\to x_0} f(x) = f(x_0). \]
Before we address the question of derivatives of mappings, let us consider the following example which shows the limitations of partial derivatives. Example 5.2.14. Let
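\[ f(x, y) = \begin{cases} x & \text{if } y = 0 \\ y & \text{if } x = 0 \\ 1 & \text{otherwise.} \end{cases} \]
We begin by showing that the partial derivatives exist at $(0, 0)$:
\[ \frac{\partial f}{\partial x}(0, 0) = \lim_{h\to 0} \frac{f(h, 0) - f(0, 0)}{h} = \lim_{h\to 0} \frac{h}{h} = 1, \]
\[ \frac{\partial f}{\partial y}(0, 0) = \lim_{h\to 0} \frac{f(0, h) - f(0, 0)}{h} = \lim_{h\to 0} \frac{h}{h} = 1. \]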
However, the function itself does not have a limit at (0, 0) as shown above in Example 5.2.1. Therefore, the existence of partial derivatives at a point does not provide any information about the existence of a limit at that same point.
In fact, there are examples where all directional derivatives exist at a point, but the limit does not.
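The behaviour described in Example 5.2.14 can be reproduced with a few evaluations. This is a small sketch under the assumption that the function is the three-case piecewise function of that example; the helper names are illustrative.

```python
def f(x, y):
    """Piecewise function of Example 5.2.14."""
    if y == 0:
        return x
    if x == 0:
        return y
    return 1.0


def partial_x_at_origin(h):
    """Difference quotient (f(h, 0) - f(0, 0)) / h."""
    return (f(h, 0.0) - f(0.0, 0.0)) / h


def partial_y_at_origin(h):
    """Difference quotient (f(0, h) - f(0, 0)) / h."""
    return (f(0.0, h) - f(0.0, 0.0)) / h


for k in range(1, 6):
    t = 10.0 ** (-k)
    # Both quotients equal 1, yet f is 1 on the diagonal and t on the x-axis.
    print(t, partial_x_at_origin(t), partial_y_at_origin(t), f(t, t), f(t, 0.0))
```

The difference quotients only see the coordinate axes, which is why they carry no information about the diagonal approach.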
Exercises
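(1) Compute the following limits.
(a) $\lim_{(x,y)\to(1,0)} x \sin(xy)$
(b) $\lim_{(x,y,z)\to(0,0,0)} \dfrac{e^z}{1 + x^2 + y^2}$
(c) $\lim_{(x,y)\to(2,2)} x^2 - 2xy + y^2 - 2$
(d) $\lim_{(x,y)\to(-1,1)} \begin{pmatrix} 3 & 7 \\ -2 & 4 \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} + \begin{pmatrix} 1 \\ 5 \end{pmatrix}$
(e) $\lim_{(x,y,z)\to(0,0,0)} \|(x - 1, y + 2, z - 3)\|$
(f) $\lim_{(\theta,\phi)\to(\pi,\pi/2)} (2\sin\phi\cos\theta, 2\sin\phi\sin\theta, 2\cos\phi)$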
(2) Show using approaches from various directions that the limits below do not exist. 3y 6 1, y 2) (x,y)→(1,2) (x 3xy 2 lim (b) (Hint: try a quadratic approach) xy 2 (x,y)→(0,0) 2x5
(a)
lim
|| −
−
− ||
−
(y,x, 5z)T lim (c) (x,y,z)→(0,0,0) (x,y,z)
(d) lim qQ
→0
x
||
x , x 3
|| ||
||
where x = (x,y,z).
(3) Show using the definition that the limit exists.
(a) $\lim_{(x,y)\to(0,0)} 7x - 3y = 0$
(b) $\lim_{(x,y)\to(1,-1)} xy + x - y - 1 = 0$
(c) $\lim_{(x,y)\to(0,0)} \dfrac{(x, y)\|(x, y)\|}{1 + \|(x, y)\|} = 0$
(d) $\lim_{(x,y,z)\to(1,1,1)} (x, y, z) \times (1, 1, 1) = (0, 0, 0)$
(4) Show cases (1) and (3) of Proposition 5.2.6. (5) Determine whether the following functions are continuous at the point given. If yes, provide a calculation or use the definition of limit. If not continuous, show why using the method of your choice. (a)
1 + xy (x,y,z)→(π/4,2,π/4) cos(y(x + z)) lim
(b) (c)
lim
(x,y)
→(0,0)
(A(x, y))−1 where A(x, y) =
lim
(θ,φ)
→(π/4,π/4)
2
(cos θ sin φ, sin θ sin φ, cos φ)
3
x
−y
x
(6) Show the continuity of the following mappings using properties of continuity of functions of several variables using Corollary 5.2.11. (a) (b) (c)
lim
(x,y)
→(0,0) lim
(x,y,z)
x2 , e−1/xy 3x + 4xy
→(0,0,0)
tanh(1/(xyz)), x2 + y 2 + z 2
(ρ cos θ sin φ, ρ sin θ sin φ, ρ cos φ) (ρ, 0, 0) (ρ,θ,φ)→(5,0,0) lim
||
||
5.3 Best Linear Approximation and Derivatives
We begin the discussion on derivatives by addressing the concept of tangency at a point.
Example 5.3.1. The following examples show functions which are tangent at a point.
(1) $x^2$ and $x^3$ at $x = 0$.
Fig. 5.9. Tangency of $x^2$ and $x^3$ at $x = 0$.
(2) $\sin x$ and $\tanh(x)$ at $x = 0$.
Fig. 5.10. Tangency of $\sin(x)$ and $\tanh(x)$ at $x = 0$.
Figure 5.9 and Figure 5.10 show the tangency; however, we need a formula from which we can determine whether two curves are tangent at a point. This is given by the next definition.
Definition 5.3.2. Consider mappings $f : \mathbb{R}^n \to \mathbb{R}^m$ and $g : \mathbb{R}^n \to \mathbb{R}^m$. We say $f$ and $g$ are tangent at $x_0 = (x_{10}, \ldots, x_{n0}) \in \mathbb{R}^n$ if $f(x_0) = g(x_0)$ and
\[ \lim_{x\to x_0} \frac{\|f(x) - g(x)\|_{\mathbb{R}^m}}{\|x - x_0\|_{\mathbb{R}^n}} = 0. \]
The idea behind this definition is that tangency occurs if the approach of f (x) − g(x) towards zero as x → x0 happens at a rate much faster than the linear rate of approach given by x − x0 . We verify that the examples above agree with this definition. Example 5.3.3. In the case of f (x) = x 2 and g(x) = x 3 at x = 0 we have
\[ \lim_{x\to 0} \frac{f(x) - g(x)}{x - 0} = \lim_{x\to 0} \frac{x^2 - x^3}{x} = \lim_{x\to 0} x - x^2 = 0. \]
Take $f(x) = \sin x$ and $g(x) = \tanh(x)$. Recall from elementary calculus that
\[ \lim_{x\to 0} \frac{\sin x}{x} = 1, \]
and $\tanh' x = 1 - \tanh^2 x$. Then, using l'Hospital's rule,
\[ \lim_{x\to 0} \frac{\tanh(x)}{x} = 1. \]
Therefore,
\[ \lim_{x\to 0} \frac{\sin x - \tanh(x)}{x} = \lim_{x\to 0} \frac{\sin x}{x} - \lim_{x\to 0} \frac{\tanh(x)}{x} = 0. \]
We now look at a case in several variables.
Example 5.3.4. Consider $f(x, y) = x^2 + y^2$ and $g(x, y) = 2(x + y - 1)$. We show that
\[ \lim_{(x,y)\to(1,1)} \frac{|f(x, y) - g(x, y)|}{\|(x, y) - (1, 1)\|} = 0. \]
Note that
\[ \begin{aligned} |f(x, y) - g(x, y)| &= |x^2 + y^2 - 2(x + y - 1)| \\ &= |(x^2 - 2x + 1) + (y^2 - 2y + 1)| \\ &= |(x - 1)^2 + (y - 1)^2| = \|(x - 1, y - 1)\|^2. \end{aligned} \]
Thus,
\[ \lim_{(x,y)\to(1,1)} \frac{|f(x, y) - g(x, y)|}{\|(x, y) - (1, 1)\|} = \lim_{(x,y)\to(1,1)} \frac{\|(x - 1, y - 1)\|^2}{\|(x - 1, y - 1)\|} = \lim_{(x,y)\to(1,1)} \|(x - 1, y - 1)\| = 0. \]
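The tangency quotient of Example 5.3.4 can also be evaluated numerically. In the sketch below, the approach direction $(0.6, 0.8)$ and the helper name `tangency_ratio` are arbitrary illustrative choices; the computed ratio should equal the distance to $(1, 1)$, up to rounding.

```python
import math


def f(x, y):
    return x**2 + y**2


def g(x, y):
    return 2 * (x + y - 1)


def tangency_ratio(x, y, x0=1.0, y0=1.0):
    """|f - g| divided by the distance from (x, y) to (x0, y0)."""
    return abs(f(x, y) - g(x, y)) / math.hypot(x - x0, y - y0)


for k in range(1, 5):
    r = 10.0 ** (-k)
    # Approach (1, 1) along the unit direction (0.6, 0.8).
    print(r, tangency_ratio(1 + 0.6 * r, 1 + 0.8 * r))
```

The ratio shrinking linearly with the distance is the numerical counterpart of the identity $|f - g| = \|(x - 1, y - 1)\|^2$.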
We now refine the above definition to consider only functions $g$ which are linear and this leads us to the concepts of Best Linear Approximation and the Derivative.
Definition 5.3.5. Consider the mapping $f : \mathbb{R}^n \to \mathbb{R}^m$ and $x_0 = (x_{10}, \ldots, x_{n0}) \in \mathbb{R}^n$:
(1) If $g(x) = a + Lx$ where $a \in \mathbb{R}^m$, $L : \mathbb{R}^n \to \mathbb{R}^m$ is a linear mapping and $g$ is tangent to $f$ at $x = x_0$ then $g(x)$ is called the best linear approximation to $f$ at $x_0$.
(2) If $f$ has a best linear approximation $g(x)$ at $x_0$, then $f$ is said to be differentiable at $x_0$ and $L$ is called the derivative of $f$ at $x_0$. We denote $L = Df(x_0)$.
Properties of Tangency and Best Linear Approximation
We now look at properties of tangency and best linear approximation which are direct consequences of the definitions above.
(1) Suppose $f(x)$ and $g(x)$ are continuous and tangent at $x_0$, then $f(x_0) = g(x_0)$.
Proof. Because $f(x)$ and $g(x)$ are continuous at $x_0$ then
\[ \lim_{x\to x_0} f(x) = f(x_0) \quad \text{and} \quad \lim_{x\to x_0} g(x) = g(x_0). \]
Tangency of $f$ and $g$ implies $\lim_{x\to x_0} f(x) - g(x) = 0$. Therefore, $f(x_0) = g(x_0)$.
(2) If $g$ is a best linear approximation to $f$ at $x_0$ then $g(x) = f(x_0) + L(x - x_0)$.
Proof. We know $g(x) = a + Lx$, so $g(x_0) = a + Lx_0 = f(x_0)$ by the above result. This implies $a = f(x_0) - Lx_0$ and
\[ g(x) = a + Lx = f(x_0) + L(x - x_0). \]
(3) Best linear approximations are unique, which means derivatives are unique.
Proof. Suppose there are two best linear approximations $g_1(x) = f(x_0) + L_1(x - x_0)$ and $g_2(x) = f(x_0) + L_2(x - x_0)$. Then, $g_1$ and $g_2$ are tangent to each other and
\[ 0 = \lim_{x\to x_0} \frac{\|g_1(x) - g_2(x)\|}{\|x - x_0\|} = \lim_{x\to x_0} \frac{\|(L_1 - L_2)(x - x_0)\|}{\|x - x_0\|}. \]
We now proceed by contradiction. Suppose that $L_1 - L_2 \neq 0$. Let $v = x - x_0$ be a vector not in the kernel of $L_1 - L_2$ and consider $w := (L_1 - L_2)v$. Then $\|w\| = \alpha\|v\|$ for some nonzero $\alpha \in \mathbb{R}$. In the $v$ direction, we can write
\[ \lim_{x\to x_0} \frac{\|(L_1 - L_2)(x - x_0)\|}{\|x - x_0\|} = \lim_{t\to 0} \frac{\|(L_1 - L_2)tv\|}{\|tv\|} = \lim_{t\to 0} \frac{\|tw\|}{\|tv\|} = \lim_{t\to 0} \alpha \neq 0. \]
Thus we have a contradiction and so $L_1 = L_2$.
We look at some familiar examples.
(1) For $f : \mathbb{R} \to \mathbb{R}$, the derivative $Df(x)$ is a $1 \times 1$ matrix and corresponds to the regular derivative $f'(x)$.
(2) For vector functions $f : \mathbb{R} \to \mathbb{R}^n$ this is shown in Chapter 2.
(3) For $f : \mathbb{R}^2 \to \mathbb{R}$ the derivative $Df(x)$ is a $1 \times 2$ matrix. The next result shows that $Df(x) = \nabla f(x)$.
Proposition 5.3.6. If $f : \mathbb{R}^n \to \mathbb{R}$ is a differentiable function at a point $x_0$, then the derivative $L = Df(x_0) = \nabla f(x_0)$.
Proof. Let $L = Df(x_0) = (a_1, \ldots, a_n)$ be a $1 \times n$ matrix and let $g(x) = f(x_0) + Df(x_0)(x - x_0)$. Because $f$ is differentiable at $x = x_0$ we know that
\[ \lim_{x\to x_0} \frac{|f(x) - g(x)|}{\|x - x_0\|} = 0 \]
no matter which path is used to approach $x_0$. Let $x_0 = (x_{01}, \ldots, x_{0n})$ and consider the path parallel to the $j$th coordinate axis given by $x(t) = (x_1(t), \ldots, x_j(t), \ldots, x_n(t))$ where $x_i(t) = x_{0i}$ for $i \neq j$ and $x_j(t) = x_{0j} + t$ where $t \in \mathbb{R}$. Then,
\[ \begin{aligned} 0 &= \lim_{x\to x_0} \frac{|f(x) - f(x_0) - Df(x_0)(x - x_0)|}{\|x - x_0\|} \\ &= \lim_{t\to 0} \frac{|f(x_{01}, \ldots, x_{0j} + t, \ldots, x_{0n}) - f(x_0) - Df(x_0)e_j t|}{|t|} \\ &= \lim_{t\to 0} \frac{|f(x_{01}, \ldots, x_{0j} + t, \ldots, x_{0n}) - f(x_0) - a_j t|}{|t|}, \end{aligned} \]
which means that $a_j = \dfrac{\partial f}{\partial x_j}(x_0)$.
For $j = 1, \ldots, n$ we obtain the partial derivatives in all coordinate directions and $Df(x_0) = \nabla f(x_0)$.
Example 5.3.7. The derivative of $f(x_1, x_2, x_3, x_4) = x_2^2 x_3 - e^{x_1 x_4}$ at
\[ x_0 = (x_{01}, x_{02}, x_{03}, x_{04}) = (1, 1, 1, 1) \]
is obtained by computing the gradient
\[ \begin{aligned} Df(x_0) &= \left( \frac{\partial f}{\partial x_1}(x_0), \frac{\partial f}{\partial x_2}(x_0), \frac{\partial f}{\partial x_3}(x_0), \frac{\partial f}{\partial x_4}(x_0) \right) \\ &= \left( -x_4 e^{x_1 x_4}, 2x_2 x_3, x_2^2, -x_1 e^{x_1 x_4} \right)\Big|_{x_0} = (-e, 2, 1, -e). \end{aligned} \]
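The gradient computed in Example 5.3.7 can be cross-checked with central differences. This is a rough numerical sketch; the step size `h` and the helper `grad_central` are illustrative choices, not part of the text.

```python
import math


def f(x):
    """f(x1, x2, x3, x4) = x2^2 x3 - exp(x1 x4), as in Example 5.3.7."""
    x1, x2, x3, x4 = x
    return x2**2 * x3 - math.exp(x1 * x4)


def grad_formula(x):
    """Gradient obtained by hand in the example."""
    x1, x2, x3, x4 = x
    e = math.exp(x1 * x4)
    return [-x4 * e, 2 * x2 * x3, x2**2, -x1 * e]


def grad_central(func, x, h=1e-6):
    """Approximate each partial derivative by a central difference."""
    grad = []
    for j in range(len(x)):
        xp, xm = list(x), list(x)
        xp[j] += h
        xm[j] -= h
        grad.append((func(xp) - func(xm)) / (2 * h))
    return grad


x0 = [1.0, 1.0, 1.0, 1.0]
print(grad_formula(x0))        # components -e, 2, 1, -e
print(grad_central(f, x0))     # numerical approximation of the same vector
```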
The Jacobian Matrix
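Using the characterization of the derivative for a function of several variables $f : \mathbb{R}^n \to \mathbb{R}$ in terms of the gradient, we can extend our study to general mappings $f : \mathbb{R}^n \to \mathbb{R}^m$ for arbitrary $n, m \in \mathbb{N}$. This is given in the next theorem.
Theorem 5.3.8. Let $f : \mathbb{R}^n \to \mathbb{R}^m$ be a mapping and $x = (x_1, \ldots, x_n) \in \mathbb{R}^n$ with
\[ f(x) = \begin{pmatrix} f_1(x_1, \ldots, x_n) \\ f_2(x_1, \ldots, x_n) \\ \vdots \\ f_m(x_1, \ldots, x_n) \end{pmatrix}. \]
Suppose that $Df(x_0)$ exists. Then,
\[ Df(x_0) = \begin{pmatrix} \dfrac{\partial f_1}{\partial x_1}(x_0) & \dfrac{\partial f_1}{\partial x_2}(x_0) & \cdots & \dfrac{\partial f_1}{\partial x_n}(x_0) \\ \dfrac{\partial f_2}{\partial x_1}(x_0) & \dfrac{\partial f_2}{\partial x_2}(x_0) & \cdots & \dfrac{\partial f_2}{\partial x_n}(x_0) \\ \vdots & \vdots & \ddots & \vdots \\ \dfrac{\partial f_m}{\partial x_1}(x_0) & \dfrac{\partial f_m}{\partial x_2}(x_0) & \cdots & \dfrac{\partial f_m}{\partial x_n}(x_0) \end{pmatrix}. \tag{5.3} \]
The matrix (5.3) is called the Jacobian matrix of $f$ and is written $J f(x)$.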
The proof is done in a similar way as for the derivative of a function f : Rn → R and we do not present it here. The interested reader can supply the details. The Jacobian matrix J f (x) depends only on partial derivatives and has importance of its own, and should be computed even before checking that a mapping f is differentiable at some point. It is a useful construction and is fundamental in what follows. However, it should not be confused with the derivative since the derivative can exist at points where the Jacobian matrix cannot be computed as we see in the next section. Example 5.3.9. We compute the derivative of
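\[ f(x, y, z, u, w) = \begin{pmatrix} uz^2 - 3wx^2 \\ \cos(yz) + xuw \\ xe^{-1/yu} \end{pmatrix} \]
at $(x, y, z, u, w) = (0, 1, 0, 1, 0)$. We begin by writing $f = (f_1, f_2, f_3)^T$ and the Jacobian matrix formula is
\[ J f(x, y, z, u, w) = \begin{pmatrix} \dfrac{\partial f_1}{\partial x} & \dfrac{\partial f_1}{\partial y} & \dfrac{\partial f_1}{\partial z} & \dfrac{\partial f_1}{\partial u} & \dfrac{\partial f_1}{\partial w} \\ \dfrac{\partial f_2}{\partial x} & \dfrac{\partial f_2}{\partial y} & \dfrac{\partial f_2}{\partial z} & \dfrac{\partial f_2}{\partial u} & \dfrac{\partial f_2}{\partial w} \\ \dfrac{\partial f_3}{\partial x} & \dfrac{\partial f_3}{\partial y} & \dfrac{\partial f_3}{\partial z} & \dfrac{\partial f_3}{\partial u} & \dfrac{\partial f_3}{\partial w} \end{pmatrix}. \]
We compute each partial derivative:
\[ \frac{\partial f_1}{\partial x} = -6wx, \quad \frac{\partial f_1}{\partial y} = 0, \quad \frac{\partial f_1}{\partial z} = 2uz, \quad \frac{\partial f_1}{\partial u} = z^2, \quad \frac{\partial f_1}{\partial w} = -3x^2, \]
\[ \frac{\partial f_2}{\partial x} = uw, \quad \frac{\partial f_2}{\partial y} = -z\sin(yz), \quad \frac{\partial f_2}{\partial z} = -y\sin(yz), \quad \frac{\partial f_2}{\partial u} = xw, \quad \frac{\partial f_2}{\partial w} = xu, \]
\[ \frac{\partial f_3}{\partial x} = e^{-1/yu}, \quad \frac{\partial f_3}{\partial y} = \frac{x}{uy^2}e^{-1/yu}, \quad \frac{\partial f_3}{\partial z} = 0, \quad \frac{\partial f_3}{\partial u} = \frac{x}{yu^2}e^{-1/yu}, \quad \frac{\partial f_3}{\partial w} = 0, \]
and evaluate at $(x, y, z, u, w) = (0, 1, 0, 1, 0)$ to obtain
\[ J f(0, 1, 0, 1, 0) = \begin{pmatrix} 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \\ e^{-1} & 0 & 0 & 0 & 0 \end{pmatrix}. \]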
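As a cross-check of Example 5.3.9, the Jacobian matrix can be approximated column by column with central differences. The sketch assumes the components $f_1 = uz^2 - 3wx^2$, $f_2 = \cos(yz) + xuw$ and $f_3 = xe^{-1/yu}$; the helper `jacobian_central` and its step size are illustrative.

```python
import math


def f(p):
    """Mapping R^5 -> R^3 assumed from Example 5.3.9."""
    x, y, z, u, w = p
    return [u * z**2 - 3 * w * x**2,
            math.cos(y * z) + x * u * w,
            x * math.exp(-1.0 / (y * u))]


def jacobian_central(func, p, h=1e-6):
    """m x n matrix of central-difference partial derivatives at p."""
    m, n = len(func(p)), len(p)
    J = [[0.0] * n for _ in range(m)]
    for j in range(n):
        pp, pm = list(p), list(p)
        pp[j] += h
        pm[j] -= h
        fp, fm = func(pp), func(pm)
        for i in range(m):
            J[i][j] = (fp[i] - fm[i]) / (2 * h)
    return J


J = jacobian_central(f, [0.0, 1.0, 0.0, 1.0, 0.0])
for row in J:
    print(row)  # only the entry in row 3, column 1 should be far from zero
```

Such a check is useful mainly for catching sign or indexing slips in hand-computed partial derivatives.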
Example 5.3.10. We determine if the Jacobian of
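\[ g(x, y, z) = \begin{pmatrix} y\sin(x\pi) \\ y\ln|1 + x| \\ z^4y^2 + \cos(x\pi) \end{pmatrix} \]
computed at $(x, y, z) = (0, 2, 1)$ is triangular. We begin by writing $g = (g_1, g_2, g_3)^T$ and the Jacobian matrix formula is
\[ J g(x, y, z) = \begin{pmatrix} \dfrac{\partial g_1}{\partial x} & \dfrac{\partial g_1}{\partial y} & \dfrac{\partial g_1}{\partial z} \\ \dfrac{\partial g_2}{\partial x} & \dfrac{\partial g_2}{\partial y} & \dfrac{\partial g_2}{\partial z} \\ \dfrac{\partial g_3}{\partial x} & \dfrac{\partial g_3}{\partial y} & \dfrac{\partial g_3}{\partial z} \end{pmatrix}. \]
We compute each partial derivative:
\[ \frac{\partial g_1}{\partial x} = y\pi\cos(x\pi), \quad \frac{\partial g_1}{\partial y} = \sin(x\pi), \quad \frac{\partial g_1}{\partial z} = 0, \]
\[ \frac{\partial g_2}{\partial x} = \frac{y}{1 + x}, \quad \frac{\partial g_2}{\partial y} = \ln|1 + x|, \quad \frac{\partial g_2}{\partial z} = 0, \]
\[ \frac{\partial g_3}{\partial x} = -\pi\sin(x\pi), \quad \frac{\partial g_3}{\partial y} = 2z^4y, \quad \frac{\partial g_3}{\partial z} = 4z^3y^2, \]
and evaluate at $(x, y, z) = (0, 2, 1)$ to obtain
\[ J g(0, 2, 1) = \begin{pmatrix} 2\pi & 0 & 0 \\ 2 & \ln 1 & 0 \\ 0 & 4 & 16 \end{pmatrix}, \]
which is a lower triangular matrix.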
Alternate terminology and notations are as follows:
(1) The Jacobian is sometimes written
\[ J f(x) = \frac{\partial(f_1, f_2, \ldots, f_m)}{\partial(x_1, \ldots, x_n)}(x). \]
(2) The Jacobian matrix is sometimes referred to as the linear part of $f$ or the linearization of $f$.
We conclude this section with the linearity property for derivatives of mappings.
Proposition 5.3.11. [Linearity Property] Let $f, g : \mathbb{R}^n \to \mathbb{R}^m$ be differentiable mappings on $U \subset \mathbb{R}^n$, and $\alpha \in \mathbb{R}$. Then,
\[ D(f + g)(x) = Df(x) + Dg(x) \quad \text{and} \quad D(\alpha f)(x) = \alpha Df(x). \]
Example 5.3.12. In the study of differential equations, mappings such as this one are often encountered
\[ f(x, y) = A\begin{pmatrix} x \\ y \end{pmatrix} + \begin{pmatrix} 2x^2 + 4xy \\ 5x^3 - 3xy^2 \end{pmatrix} \]
where $A$ is a matrix of constants. The derivative is
\[ Df(x, y) = A + \begin{pmatrix} 4x + 4y & 4x \\ 15x^2 - 3y^2 & -6xy \end{pmatrix}. \]
Exercises
(1) Determine whether the functions are continuous, if so prove using the $\epsilon$-$\delta$ definition. If not, give an explanation why. If possible, find a candidate for the best linear approximation function $g$ and show the tangency with the function at the point. (Hint: the Jacobian matrix may be useful in finding $g$).
(a) $f(x)$ at $x = 0$ where
\[ f(x) = \begin{cases} x^2 & x \ge 0 \\ -x^2 & x < 0. \end{cases} \]
(b) $f(x, y) = (x^2 + y^2)^{5/2}$ at $(0, 0)$.
(c)
\[ f(x, y, z) = \begin{cases} xyz\sin(1/x) & x \neq 0 \\ 0 & x = 0 \end{cases} \]
at $(0, 0, 0)$.
(d) Consider the function f : R2 → R2
− | | x x
y
f (x, y) =
,
x=0
xy
0
0
,
x = 0
at (x, y) = (0, 0). (2) Compute the Jacobian matrix of the following mappings: √ (a) f : R3 → R defined by f (x,y,z) = xyz . (b) f : R3 → R2 defined by f (x,y,z) =
(c) f : R2 → R3 defined by
x2 eyz (1 + xyz)2
√ − xy 2
f (x, y) =
exy
x
(d) f : R4 → R4 defined by
y
x21 +
f (x1 , x2 , x3 , x4 ) =
.
.
x2 x3
x1 x2 + x2 x4 x1 x3 + x2 x4 x2 x3 + x24
5.3.1 Conditions for differentiability
The regularity of a function/mapping is often used to refer to how “nice” a function/mapping is. For instance, we know that a differentiable function $f : \mathbb{R} \to \mathbb{R}$ is automatically continuous. But we know that the opposite statement is not true; for instance $f(x) = |x|$ at $x = 0$ is continuous, but not differentiable. Therefore, one says that differentiable functions have more or higher regularity than continuous functions. In this section, we begin our discussion of regularity of mappings by generalizing the implication just mentioned for functions of one variable.
Theorem 5.3.13. Let $f : \mathbb{R}^n \to \mathbb{R}^m$ be differentiable at $x = x_0$, then it is continuous at $x = x_0$.
Proof. The proof is left as an exercise.
We continue with examples from which we are able to derive sufficient conditions guaranteeing that a mapping is indeed differentiable and in the process, also make a distinction between the derivative of a function and its Jacobian matrix. We begin with an example from elementary calculus. Example 5.3.14. Consider
\[ f(x) = \begin{cases} x^2\sin(1/x) & x \neq 0 \\ 0 & x = 0. \end{cases} \]
We begin by noting that $f(x)$ is continuous at $x = 0$. The details of this are left as an exercise. The Jacobian is given by
\[ J f(x) = 2x\sin(1/x) - \cos(1/x) \]
and this function is not defined at $x = 0$. By the way, it cannot be made continuous at $x = 0$ because of the oscillating nature of the term $\cos(1/x)$. However, $f$ is differentiable at $x = 0$ and $Df(0) = f'(0) = 0$ which can be computed directly via the limit definition of derivative. Thus, we have
\[ Df(x) = \begin{cases} 2x\sin(1/x) - \cos(1/x) & x \neq 0 \\ 0 & x = 0, \end{cases} \]
and $Df(x)$ is defined for all $x$, but $J f(x)$ is not. This means $J f(x)$ is not a good indicator of whether $Df(x)$ exists at $x = 0$.
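Both behaviours in Example 5.3.14 are easy to observe numerically. The sketch below (helper names are illustrative) compares the difference quotient at $0$, which is squeezed by $|h|$, with the formula $2x\sin(1/x) - \cos(1/x)$ evaluated at points where $\cos(1/x) = \pm 1$.

```python
import math


def f(x):
    """x^2 sin(1/x) extended by 0 at x = 0."""
    return x**2 * math.sin(1.0 / x) if x != 0 else 0.0


def difference_quotient_at_zero(h):
    """(f(h) - f(0)) / h, which equals h * sin(1/h)."""
    return (f(h) - f(0.0)) / h


def derivative_formula(x):
    """2x sin(1/x) - cos(1/x); only valid for x != 0."""
    return 2 * x * math.sin(1.0 / x) - math.cos(1.0 / x)


for k in range(1, 7):
    h = 10.0 ** (-k)
    print(h, difference_quotient_at_zero(h))  # bounded by |h|, tends to 0

# Points approaching 0 where the formula is close to -1 and to +1.
print(derivative_formula(1.0 / (2 * math.pi * 1000)))
print(derivative_formula(1.0 / (math.pi * 1001)))
```

The two printed formula values stay near $-1$ and $+1$ however close the sample points are to $0$, which is the oscillation mentioned in the example.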
This example can be easily extended to functions of several variables. Example 5.3.15. Consider
\[ f(x, y) = \begin{cases} (x^2 + y^2)\sin\left(\dfrac{1}{\sqrt{x^2 + y^2}}\right) & (x, y) \neq (0, 0) \\ 0 & (x, y) = (0, 0). \end{cases} \]
The Jacobian matrix is $J f(x, y) = (\partial f/\partial x, \partial f/\partial y)$ with
\[ \frac{\partial f}{\partial x} = 2x\sin\left(\frac{1}{\|(x, y)\|}\right) - \frac{x}{\|(x, y)\|}\cos\left(\frac{1}{\|(x, y)\|}\right), \]
\[ \frac{\partial f}{\partial y} = 2y\sin\left(\frac{1}{\|(x, y)\|}\right) - \frac{y}{\|(x, y)\|}\cos\left(\frac{1}{\|(x, y)\|}\right). \]
Note again that $J f(x, y)$ is not defined at $(0, 0)$ since the formulas for both partial derivatives are not defined at $(0, 0)$. But, we show that $Df(0, 0) = (0, 0)$ using the definition. Let $g$ be the best linear approximation to $f$ at $(0, 0)$: $g(x, y) = g(0, 0) + L(x - 0, y - 0)^T$ with $g(0, 0) = 0$, we let $L = (0, 0)$ and write
\[ \begin{aligned} \lim_{(x,y)\to(0,0)} \frac{f(x, y) - g(x, y)}{\|(x, y) - (0, 0)\|} &= \lim_{(x,y)\to(0,0)} \frac{\|(x, y)\|^2 \sin\left(\frac{1}{\|(x, y)\|}\right)}{\|(x, y)\|} \\ &= \lim_{(x,y)\to(0,0)} \|(x, y)\| \sin\left(\frac{1}{\|(x, y)\|}\right) \\ &= 0, \end{aligned} \]
where the last equality follows from the squeeze theorem since $-1 \le \sin(1/\|(x, y)\|) \le 1$.
Because partial derivatives are easy to compute, we determine a criterion based on partial derivatives which guarantees the existence of $Df$. This is the content of the next result.
Theorem 5.3.16. Let $U \subset \mathbb{R}^n$ be an open set and consider $f : U \to \mathbb{R}^m$ with $f = (f_1, \ldots, f_m)^T$. Then, $f$ is differentiable on $U$ if $f$ is continuous on $U$ and all the partial derivatives
\[ \frac{\partial f_j}{\partial x_i}(x), \qquad j = 1, \ldots, m;\ i = 1, \ldots, n \]
exist and are continuous functions on $U$.
We omit the proof of this theorem. The following definition is helpful in describing levels of differentiability.
Definition 5.3.17. A mapping $f$ differentiable on $U \subset \mathbb{R}^n$ and for which $Df(x)$ is a continuous function on $U$ is said to be of class $C^1$ on $U$. We write $f \in C^1$ on $U \subset \mathbb{R}^n$ or $f \in C^1(U)$.
We can now illustrate the logical relationship between various conditions of regularity of a function.
\[ f \text{ satisfies Theorem 5.3.16} \iff f \text{ is } C^1 \text{ on } U \]
\[ f \text{ is } C^1 \text{ on } U \implies f \text{ is differentiable on } U \]
\[ f \text{ is differentiable on } U \implies f \text{ is continuous on } U. \]
Exercises
(1) Consider the mapping f : R3 → R defined by f (x,y,z) = 3x cos(yz) and verify that f is differentiable for all (x,y,z) ∈ R3 . (2) Consider f (x, y) =
(x + (x
y)4/3
− y)2/3
(x, y) = (0, 0)
(0, 0)T
(x, y) = (0, 0).
Compute the partial derivatives, check to see if the partial derivatives are defined at (0, 0) and if they are continuous at (0, 0). Using this information, can one tell if Df (0, 0) exists? (3) Show Theorem 5.3.13. (4) For each of the mappings below (i) Find the domain of the mapping. (ii) Compute the Jacobian matrix of the mapping. (iii) Find out if the Jacobian matrix is defined at all points of the domain. (a)
− − √ x2 y 2
f (x,y,z,u) =
cos yz
ln(x2
(b)
uz
u2 )
xy + 2z 2
f (x , y , z , u) =
uxeyz
cos(x2
− y2 )
5.4 Tangent spaces
In Sections 3.1.1 and 3.1.4, we define tangent spaces to curves and to surfaces defined by $z = f(x, y)$. The tangent space to a function $y = f(x)$ is given by
\[ T_p C = \{\alpha(1, f'(x_0)) \mid \alpha \in \mathbb{R}\}, \]
where $\alpha \in T_{x_0}\mathbb{R}$. In general, tangent spaces of space curves are given by
\[ T_p C = \{\alpha r'(t_0) \mid \alpha \in \mathbb{R}\}, \]
with $\alpha \in T_{t_0}\mathbb{R}$. Let $S$ be a surface given by $z = f(x, y)$. Then
\[ T_p S = \operatorname{span}\{\tau_1(p), \tau_2(p)\}, \]
where
\[ \tau_1(p) = \left(1, 0, \frac{\partial f}{\partial x}(x_0, y_0)\right) \quad \text{and} \quad \tau_2(p) = \left(0, 1, \frac{\partial f}{\partial y}(x_0, y_0)\right). \]
We now use the derivative to generalize the above constructions to arbitrary mappings $f : \mathbb{R}^n \to \mathbb{R}^m$. Recall that $\operatorname{graph}(f)$ is a surface of dimension $n$ in $\mathbb{R}^n \times \mathbb{R}^m$.
Definition 5.4.1. Consider the set $S$ given by $\operatorname{graph} f$ and suppose that $f$ is differentiable at $x_0$. Then the tangent space at $p = (x_0, f(x_0)) \in S$ is given by
\[ T_p S = \{(v^T, [Df(x_0)v]^T) \in T_{x_0}\mathbb{R}^n \times T_{f(x_0)}\mathbb{R}^m \mid v \in T_{x_0}\mathbb{R}^n\} \]
and recall that $v$ is a column vector.
If $S$ is an $n$-dimensional surface given by a differentiable mapping $f : U \subset \mathbb{R}^n \to \mathbb{R}^m$ then the tangent space at any point $p \in S$ is a vector space of dimension $n$. This can be easily shown with a calculation. We check that a linear combination of elements of $T_p S$ is also in $T_p S$. Let $(v_1^T, [Df(x_0)v_1]^T), (v_2^T, [Df(x_0)v_2]^T) \in T_p S$ and $\alpha \in \mathbb{R}$, then
\[ \begin{aligned} \alpha(v_1^T, [Df(x_0)v_1]^T) + (v_2^T, [Df(x_0)v_2]^T) &= (\alpha v_1^T + v_2^T, \alpha[Df(x_0)v_1]^T + [Df(x_0)v_2]^T) \\ &= (\alpha v_1^T + v_2^T, [Df(x_0)(\alpha v_1 + v_2)]^T) \end{aligned} \]
where this last element is indeed in $T_p S$.
Example 5.4.2. We now verify that the formula of the definition corresponds to the tangent space of a two-dimensional surface. Let $z = f(x, y)$ be differentiable at $(x_0, y_0)$. The definition gives
\[ T_p S = \{(v^T, [Df(x_0, y_0)v]^T) \in T_{(x_0,y_0)}\mathbb{R}^2 \times T_{f(x_0,y_0)}\mathbb{R} \mid v^T = (v_1, v_2) \in T_{(x_0,y_0)}\mathbb{R}^2\} \]
with
\[ Df(x_0, y_0)v = \left(\frac{\partial f}{\partial x}(x_0, y_0), \frac{\partial f}{\partial y}(x_0, y_0)\right)(v_1, v_2)^T = \frac{\partial f}{\partial x}(x_0, y_0)v_1 + \frac{\partial f}{\partial y}(x_0, y_0)v_2. \]
Thus,
\[ \begin{aligned} (v^T, [Df(x_0, y_0)v]^T) &= \left(v_1, v_2, \frac{\partial f}{\partial x}(x_0, y_0)v_1 + \frac{\partial f}{\partial y}(x_0, y_0)v_2\right) \\ &= v_1\left(1, 0, \frac{\partial f}{\partial x}(x_0, y_0)\right) + v_2\left(0, 1, \frac{\partial f}{\partial y}(x_0, y_0)\right), \end{aligned} \]
which confirms the correspondence.
From the definition of $T_p S$, we see that the linear operator $Df(x_0)$ is, in fact, a linear mapping that sends vectors $v$, elements of the tangent space at $x_0$, to vectors in the tangent space at $f(x_0)$:
\[ Df(x_0) : T_{x_0}\mathbb{R}^n \to T_{f(x_0)}\mathbb{R}^m, \qquad v \mapsto Df(x_0)v. \]
Fig. 5.11. The pair $(f, Df)$ maps the pair $(x_0, v_0) \in \mathbb{R}^n \times T_{x_0}\mathbb{R}^n$ to the corresponding pair $(f(x_0), Df(x_0)v_0) \in \mathbb{R}^m \times T_{f(x_0)}\mathbb{R}^m$.
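This is illustrated in Figure 5.11. Similarly, we can say that $T_p S$ is the graph of the linear mapping $Df(x_0)$.
Example 5.4.3. We compute the tangent space of $f(x, y) = (2xy, x^2 + y^2)^T$ at an arbitrary point $p = (x, y, f(x, y))$. $S = \operatorname{graph} f$ is a two-dimensional surface in $\mathbb{R}^2 \times \mathbb{R}^2$. We proceed first by writing $f = (f_1, f_2)^T$ and computing
\[ Df(x, y) = \begin{pmatrix} \dfrac{\partial f_1}{\partial x} & \dfrac{\partial f_1}{\partial y} \\ \dfrac{\partial f_2}{\partial x} & \dfrac{\partial f_2}{\partial y} \end{pmatrix} = \begin{pmatrix} 2y & 2x \\ 2x & 2y \end{pmatrix}. \]
Let $v = (v_1, v_2)^T$ and compute
\[ Df(x, y)v = \begin{pmatrix} 2yv_1 + 2xv_2 \\ 2xv_1 + 2yv_2 \end{pmatrix}. \]
Then,
\[ T_p S = \{(v_1, v_2, 2yv_1 + 2xv_2, 2xv_1 + 2yv_2) \mid (v_1, v_2)^T \in T_{(x,y)}\mathbb{R}^2\}. \]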
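The tangent vectors of Example 5.4.3 can be compared with secant vectors of the graph. In this sketch the names `tangent_vector` and `secant_vector` and the step `t` are illustrative assumptions; the secant of the graph through $(x, y)$ and $(x, y) + t(v_1, v_2)$ should approach the tangent vector as $t$ shrinks.

```python
def f(x, y):
    """Mapping of Example 5.4.3."""
    return (2 * x * y, x**2 + y**2)


def Df(x, y):
    """Derivative computed in the example."""
    return [[2 * y, 2 * x], [2 * x, 2 * y]]


def tangent_vector(x, y, v1, v2):
    """Element (v, Df(x, y) v) of the tangent space to graph(f) at (x, y, f(x, y))."""
    J = Df(x, y)
    return (v1, v2,
            J[0][0] * v1 + J[0][1] * v2,
            J[1][0] * v1 + J[1][1] * v2)


def secant_vector(x, y, v1, v2, t=1e-6):
    """(graph point at (x, y) + t v minus graph point at (x, y)) divided by t."""
    f0 = f(x, y)
    f1 = f(x + t * v1, y + t * v2)
    return (v1, v2, (f1[0] - f0[0]) / t, (f1[1] - f0[1]) / t)


print(tangent_vector(1.0, 2.0, 3.0, -1.0))
print(secant_vector(1.0, 2.0, 3.0, -1.0))
```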
We are now in a position to show the differential formula for functions of several variables. Recall from Chapter 3 that if $f : \mathbb{R}^n \to \mathbb{R}$ is such that all its partial derivatives exist, then
\[ df = \sum_{i=1}^{n} \frac{\partial f}{\partial x_i}\,dx_i. \]
Let $S$ be the $n$-dimensional surface given by the graph of $f(x_1, \ldots, x_n)$. Then
\[ T_p S = \{(\alpha_1, \ldots, \alpha_n, \nabla f \cdot (\alpha_1, \ldots, \alpha_n)^T) \mid \alpha_1, \ldots, \alpha_n \in \mathbb{R}\}. \]
Letting $v = (\alpha_1, \ldots, \alpha_n, \nabla f \cdot (\alpha_1, \ldots, \alpha_n)^T)$ we have $dx_j(v) = \alpha_j$ for $j = 1, \ldots, n$ and
\[ df(v) = dx_{n+1}(v) = \nabla f \cdot (\alpha_1, \ldots, \alpha_n)^T = \sum_{i=1}^{n} \frac{\partial f}{\partial x_i}\,dx_i(v). \]
Exercises
(1) Consider the mapping f (x, y) =
xy 2 x + 2(x3 + y3 )
and consider S = graph (f ). (a) Verify that the point p = (1, −1, 1, 0) ∈ S . (b) Compute T p S . (c) Is the vector w = (2, 1, 0, 20) ∈ T p S ? (2) Consider the differentiable mapping f (x,y,z) =
yx 2
−
y2
+ xyz
3xy + xyz 2
.
(a) Compute Df (x,y,z). Find values of (x,y,z) for which Df is the zero matrix. (b) Compute the general formula for the tangent space to graph (f ) = S for an arbitrary point p = (x , y , z , f ( x,y,z)). (3) Consider the differentiable mapping g(x,y,z,u) = 6xyz
− zx2 − z2 y − 92 x2 + 9x − 12 x4 − 2uy + u.
(a) Compute the general formula for the tangent space to graph (g) = S for an arbitrary point p = (x,y,z,u,g(x , y , z , u)). (b) Find all the values of (x,y,z,u) for which Dg(x , y , z , u) = 0.
5.5 The Chain Rule
Let $f : \mathbb{R}^n \to \mathbb{R}^m$ and $g : \mathbb{R}^m \to \mathbb{R}^k$, then we can compose $f$ and $g$:
\[ (g \circ f)(x) = g(f(x)). \]
We look at two examples.
(1) Let $f(x, y, z) = x^2 + y^2 + z^2$ and $g(u) = \sqrt{u}$, we compute $(g \circ f)(x)$:
\[ (g \circ f)(x) = g(f(x)) = g(x^2 + y^2 + z^2) = \sqrt{x^2 + y^2 + z^2}. \]
(2) Let $f : \mathbb{R}^2 \to \mathbb{R}^4$ be defined by
\[ f(x, y) = \begin{pmatrix} x & -y \\ \cos(x) & 0 \end{pmatrix} \]
and $g$ be the determinant function. We compute $(g \circ f)$:
\[ (g \circ f)(x) = g(f(x)) = g\begin{pmatrix} x & -y \\ \cos(x) & 0 \end{pmatrix} = \det\begin{pmatrix} x & -y \\ \cos(x) & 0 \end{pmatrix} = y\cos(x). \]
We recall the chain rule for functions of one variable. Let $f, g : \mathbb{R} \to \mathbb{R}$ and suppose that $g \circ f : \mathbb{R} \to \mathbb{R}$ is well-defined. The chain rule formula is
\[ (g \circ f)'(x) = \frac{d}{dx}g(f(x)) = g'(f(x))f'(x). \]
Thus, we multiply the derivatives of $g$ and $f$. The same multiplication of derivatives characterizes the chain rule for mappings. However, we must now consider multiplication of matrices with the proper sizes. The exact statement follows.
Proposition 5.5.1. [Chain rule] Let $f : \mathbb{R}^n \to \mathbb{R}^m$ and $g : \mathbb{R}^m \to \mathbb{R}^k$ be two differentiable functions. Then,
\[ D(g \circ f)(x) = Dg(f(x))Df(x). \]
We now show what this formula means and how it can be unpacked.
Example 5.5.2. We compute the derivative of Example (1). The formula is
\[ D(g \circ f)(x) = Dg(f(x))Df(x). \]
We compute each derivative separately:
\[ (Dg)(u) = \frac{1}{2\sqrt{u}}, \qquad Df(x) = (2x, 2y, 2z). \]
Evaluating $Dg(u)$ at $u = f(x)$ we obtain
\[ Dg(f(x)) = \frac{1}{2\sqrt{x^2 + y^2 + z^2}}. \]
Putting these computations together in the formula we conclude that
\[ D(g \circ f)(x) = Dg(f(x))Df(x) = \frac{(x, y, z)}{\sqrt{x^2 + y^2 + z^2}}. \]
In particular, this shows that in fact, $(g \circ f)(x, y, z) = \|(x, y, z)\|$ and we have
\[ D(\|(x, y, z)\|) = \left(\frac{x}{\|(x, y, z)\|}, \frac{y}{\|(x, y, z)\|}, \frac{z}{\|(x, y, z)\|}\right). \]
We compute the derivative of Example (2).
Example 5.5.3. We start by computing each derivative separately. Let
\[ A = \begin{pmatrix} a_1 & a_2 \\ a_3 & a_4 \end{pmatrix} \]
and rewrite $g$ as a mapping from $\mathbb{R}^4 \to \mathbb{R}$:
\[ g(a_1, a_2, a_3, a_4) = a_1a_4 - a_2a_3. \]
Then
\[ Dg(A) = (a_4, -a_3, -a_2, a_1). \]
Write $f$ as $f(x, y) = (x, -y, \cos(x), 0)^T$ and
\[ Df(x, y) = \begin{pmatrix} 1 & 0 \\ 0 & -1 \\ -\sin(x) & 0 \\ 0 & 0 \end{pmatrix}. \]
Evaluating $Dg$ at $A$ written as an $\mathbb{R}^4$ vector $(x, -y, \cos(x), 0)$ we have $Dg(A) = (0, -\cos(x), y, x)$. We now assemble in the formula.
\[ D(g \circ f)(x, y) = (0, -\cos(x), y, x)\begin{pmatrix} 1 & 0 \\ 0 & -1 \\ -\sin(x) & 0 \\ 0 & 0 \end{pmatrix} = (-y\sin(x), \cos(x)). \]
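The matrix product in Example 5.5.3 can be verified by multiplying the two derivatives explicitly and comparing with the derivative of $y\cos(x)$ computed directly. This is a minimal sketch in which matrices are plain nested lists; the function names are illustrative.

```python
import math


def f(x, y):
    """f(x, y) written as the R^4 vector (x, -y, cos x, 0)."""
    return [x, -y, math.cos(x), 0.0]


def Df(x, y):
    """4 x 2 derivative of f."""
    return [[1.0, 0.0], [0.0, -1.0], [-math.sin(x), 0.0], [0.0, 0.0]]


def Dg(a):
    """1 x 4 derivative of g(a1, a2, a3, a4) = a1 a4 - a2 a3."""
    a1, a2, a3, a4 = a
    return [a4, -a3, -a2, a1]


def chain_rule(x, y):
    """Row vector Dg(f(x, y)) times the matrix Df(x, y)."""
    row, J = Dg(f(x, y)), Df(x, y)
    return [sum(row[k] * J[k][j] for k in range(4)) for j in range(2)]


def direct(x, y):
    """Gradient of (g o f)(x, y) = y cos(x) computed directly."""
    return [-y * math.sin(x), math.cos(x)]


print(chain_rule(0.7, 1.3))
print(direct(0.7, 1.3))
```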
Exercises
(1) Compute the derivative of the mappings $g \circ f$ for $f$ and $g$ given below.
(a) $f(x, y, z) = xyz$ and $g(t) = (t, t^2, t^3)$.
(b) $f(x, y) = (xy^2, 2x^2 - y^2)$ and $g(u, v) = (uv, u^2, u - v^2)$.
(c) Let $A, B$ be $n \times n$ matrices, $f(A) = A^2$ and $g(B) = \operatorname{tr} B$.
(d) $f(t) = (\cos(t), t\sin(t), te^{2t})$, $g(x, y, z) = (xy, x^2 - 2y^2)$.
(2) Show that if $x = (x_1, \ldots, x_n) \in \mathbb{R}^n$ and $g(x) = \|x\|$, then
\[ Dg(x) = \frac{x}{\|x\|}. \]
(3) Use the chain rule to compute the following derivatives where $f(x, y, z) = x^2 + yz$, $g(x, y) = y^3 + xy$ and $h(x) = \sin(x)$.
(a) $F(x, y, z) = f(h(x), g(x, y), z)$.
(b) $G(x, y, z) = h(f(x, y, z)g(x, y))$.
(c) $H(x, y, z) = g(f(x, y, h(x)), g(z, y))$.
5.6 Higher Derivatives
As in the case of real functions, the derivative of a mapping $f : \mathbb{R}^n \to \mathbb{R}^m$ is also a mapping. Before looking at a few examples, consider the set $L(\mathbb{R}^n, \mathbb{R}^m)$ of all real linear transformations from $\mathbb{R}^n$ to $\mathbb{R}^m$. This set is a vector space since for any two linear transformations $L_1, L_2 \in L(\mathbb{R}^n, \mathbb{R}^m)$ and $\alpha \in \mathbb{R}$, then $\alpha L_1 + L_2$ is also an element of $L(\mathbb{R}^n, \mathbb{R}^m)$. If one prefers thinking about matrices, then it is straightforward to show that $\operatorname{Mat}_{m \times n}(\mathbb{R})$, the vector space of all $m \times n$ matrices with real entries, and $L(\mathbb{R}^n, \mathbb{R}^m)$ are isomorphic as vector spaces.
(1) Let $f : \mathbb{R} \to \mathbb{R}$, then $Df(x)$ is a $1 \times 1$ matrix. For instance, if $f(x) = \sin x$, then
\[ Df : \mathbb{R} \to L(\mathbb{R}, \mathbb{R}), \qquad x \mapsto (\cos(x)). \]
(2) Let $f : \mathbb{R}^4 \to \mathbb{R}^2$, then $Df(x, y, z, w)$ is a $2 \times 4$ matrix. If $f(x, y, z, w) = (x^2 + yzw, y^2 + xw)^T$, then
\[ Df : \mathbb{R}^4 \to L(\mathbb{R}^4, \mathbb{R}^2), \qquad (x, y, z, w) \mapsto \begin{pmatrix} 2x & zw & yw & yz \\ w & 2y & 0 & x \end{pmatrix}. \]
We are thus led to the following characterization. If $f : \mathbb{R}^n \to \mathbb{R}^m$ is differentiable, then $Df$ is a mapping:
\[ Df : \mathbb{R}^n \to L(\mathbb{R}^n, \mathbb{R}^m), \qquad x = (x_1, \ldots, x_n) \mapsto Df(x). \]
In the case of real functions of one variable, the above characterization (as we know from elementary calculus) shows that iterating derivatives does not change the matrix structure of Df .
Proposition 5.6.1. Let $f : \mathbb{R} \to \mathbb{R}$ be an infinitely differentiable function. Then, for any $k \in \mathbb{N}$, $D^k f : \mathbb{R} \to L(\mathbb{R}, \mathbb{R})$.
For general mappings, the situation becomes complicated quite quickly as this example shows.
Example 5.6.2. Let $f : \mathbb{R}^2 \to \mathbb{R}$, then $Df(x,y) \in L(\mathbb{R}^2, \mathbb{R})$; that is, it can be represented as a $1 \times 2$ matrix. Note that the space of $1 \times 2$ matrices (or $L(\mathbb{R}^2, \mathbb{R})$) can be identified with $\mathbb{R}^2$. Thus, we can write $Df : \mathbb{R}^2 \to \mathbb{R}^2$. We can now take the derivative of the mapping $Df$:
\[
D(Df) : \mathbb{R}^2 \to L(\mathbb{R}^2, \mathbb{R}^2).
\]
That is, we define $D^2 f(x,y) := D(Df)(x,y)$, and it is a $2 \times 2$ matrix. We illustrate with $f(x,y) = x^2 y^3$. We compute $Df(x,y) = (2xy^3, 3x^2y^2)^T$ and so
\[
D^2 f(x,y) = \begin{pmatrix} 2y^3 & 6xy^2 \\ 6xy^2 & 6x^2 y \end{pmatrix}.
\]
Continuing this way, $D^3 f$ should be a three-dimensional $2 \times 2 \times 2$ array, and this can still be visualized. However, for $k \geq 4$ we cannot visualize the derivatives $D^k f$ in this way anymore.
Example 5.6.3. Consider again $f : \mathbb{R}^4 \to \mathbb{R}^2$. Then $Df(x,y,z,w) \in L(\mathbb{R}^4, \mathbb{R}^2)$ is a $2 \times 4$ matrix, therefore $D^2 f$ already must have the form of a three-dimensional array.
These examples illustrate the notational complexity arising for higher order derivatives of general mappings. In the remainder of this section, our focus is on functions of several variables and we do not cover the case of more general mappings. The next section deals specifically with the second derivative which has important applications.
5.6.1 Second Derivative: functions of several variables
We consider only functions $f : \mathbb{R}^n \to \mathbb{R}$ for which $Df$ is differentiable; that is, we say that $f : \mathbb{R}^n \to \mathbb{R}$ is twice differentiable if $Df : \mathbb{R}^n \to L(\mathbb{R}^n, \mathbb{R})$ exists and is a differentiable mapping. In Example 5.6.2, we see that we can identify $L(\mathbb{R}^2, \mathbb{R})$ with $\mathbb{R}^2$. This is true in any dimension and we can write $L(\mathbb{R}^n, \mathbb{R}) \cong \mathbb{R}^n$. Therefore, we also have the identification
\[
L(\mathbb{R}^n, \mathbb{R}^n) \cong L(\mathbb{R}^n, L(\mathbb{R}^n, \mathbb{R}))
\]
where we recall that $L(\mathbb{R}^n, \mathbb{R}^n)$ can be identified with $\mathrm{Mat}_{n \times n}(\mathbb{R})$. We can now state the main theorem of this section.
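Theorem 5.6.4. Let $f : \mathbb{R}^n \to \mathbb{R}$ be twice differentiable, where $x = (x_1, \dots, x_n)$. Then $D^2 f : \mathbb{R}^n \to L(\mathbb{R}^n, L(\mathbb{R}^n, \mathbb{R}))$ can be represented by the $n \times n$ matrix
\[
D^2 f(x) =
\begin{pmatrix}
\dfrac{\partial^2 f}{\partial x_1^2} & \dfrac{\partial^2 f}{\partial x_2 \partial x_1} & \cdots & \dfrac{\partial^2 f}{\partial x_n \partial x_1} \\
\dfrac{\partial^2 f}{\partial x_1 \partial x_2} & \dfrac{\partial^2 f}{\partial x_2^2} & \cdots & \dfrac{\partial^2 f}{\partial x_n \partial x_2} \\
\vdots & \vdots & \ddots & \vdots \\
\dfrac{\partial^2 f}{\partial x_1 \partial x_n} & \dfrac{\partial^2 f}{\partial x_2 \partial x_n} & \cdots & \dfrac{\partial^2 f}{\partial x_n^2}
\end{pmatrix}
\]
where each partial derivative is evaluated at $x = (x_1, \dots, x_n)$.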
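We look at a few examples.
Example 5.6.5. The second derivative of $f(x,y) = \cos(xy)$ is
\[
D^2 f(x,y) =
\begin{pmatrix}
\dfrac{\partial^2 f}{\partial x^2} & \dfrac{\partial^2 f}{\partial y \partial x} \\
\dfrac{\partial^2 f}{\partial x \partial y} & \dfrac{\partial^2 f}{\partial y^2}
\end{pmatrix}
=
\begin{pmatrix}
-y^2 \cos(xy) & -\sin(xy) - xy\cos(xy) \\
-\sin(xy) - xy\cos(xy) & -x^2 \cos(xy)
\end{pmatrix}.
\]
The mixed partial derivatives carry the extra term $-\sin(xy)$ because $\partial f/\partial x = -y\sin(xy)$ depends on $y$ both through the factor $y$ and through $\sin(xy)$.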
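The entries of the second derivative of $\cos(xy)$ can be checked numerically. The following is a minimal sketch, assuming a central-difference approximation with a hand-picked step `h`; the helper names `hessian_fd` and `hessian_exact` are illustrative and do not come from the text. It compares the approximation with the closed-form second partials $-y^2\cos(xy)$, $-\sin(xy) - xy\cos(xy)$ and $-x^2\cos(xy)$.

```python
import math


def hessian_fd(f, point, h=1e-4):
    """Approximate the Hessian of f at `point` with central differences."""
    n = len(point)
    H = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            def shifted(si, sj):
                # Shift coordinate i by si*h and coordinate j by sj*h.
                p = list(point)
                p[i] += si * h
                p[j] += sj * h
                return f(*p)
            H[i][j] = (shifted(1, 1) - shifted(1, -1)
                       - shifted(-1, 1) + shifted(-1, -1)) / (4 * h * h)
    return H


def f(x, y):
    return math.cos(x * y)


def hessian_exact(x, y):
    """Closed-form second partial derivatives of cos(xy)."""
    c, s = math.cos(x * y), math.sin(x * y)
    mixed = -s - x * y * c
    return [[-y * y * c, mixed], [mixed, -x * x * c]]


if __name__ == "__main__":
    print(hessian_fd(f, (0.7, 1.3)))
    print(hessian_exact(0.7, 1.3))
```

The two printed matrices should agree to several decimal places; the step `h` trades truncation error against floating-point round-off, so it is a tuning assumption rather than a fixed rule.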
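Example 5.6.6. We compute $D^2 f$ for $f(x,y,z,w) = x^2 + e^{yz} + w$. From Theorem 5.6.4, we have
\[
D^2 f(x,y,z,w) =
\begin{pmatrix}
\dfrac{\partial^2 f}{\partial x^2} & \dfrac{\partial^2 f}{\partial y \partial x} & \dfrac{\partial^2 f}{\partial z \partial x} & \dfrac{\partial^2 f}{\partial w \partial x} \\
\dfrac{\partial^2 f}{\partial x \partial y} & \dfrac{\partial^2 f}{\partial y^2} & \dfrac{\partial^2 f}{\partial z \partial y} & \dfrac{\partial^2 f}{\partial w \partial y} \\
\dfrac{\partial^2 f}{\partial x \partial z} & \dfrac{\partial^2 f}{\partial y \partial z} & \dfrac{\partial^2 f}{\partial z^2} & \dfrac{\partial^2 f}{\partial w \partial z} \\
\dfrac{\partial^2 f}{\partial x \partial w} & \dfrac{\partial^2 f}{\partial y \partial w} & \dfrac{\partial^2 f}{\partial z \partial w} & \dfrac{\partial^2 f}{\partial w^2}
\end{pmatrix}
=
\begin{pmatrix}
2 & 0 & 0 & 0 \\
0 & z^2 e^{yz} & (1 + zy)e^{yz} & 0 \\
0 & (1 + zy)e^{yz} & y^2 e^{yz} & 0 \\
0 & 0 & 0 & 0
\end{pmatrix}.
\]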
Definition 5.6.7. A mapping $f : \mathbb{R}^n \to \mathbb{R}$ is of class $C^2$ if $D^2 f : \mathbb{R}^n \to L(\mathbb{R}^n, L(\mathbb{R}^n, \mathbb{R}))$ exists and is a continuous function.
A very important property of $C^2$ functions is the following.
Theorem 5.6.8 (Clairaut's Theorem). Suppose that $f : \mathbb{R}^n \to \mathbb{R}$ is a $C^2$ mapping, then the matrix $D^2 f(x)$ is symmetric; that is, all mixed partial derivatives are equal:
\[
\frac{\partial^2 f}{\partial x_i \partial x_j} = \frac{\partial^2 f}{\partial x_j \partial x_i} \qquad \text{for } i, j = 1, \dots, n.
\]
We see that the functions of Examples 5.6.5 and 5.6.6 satisfy the conclusions of Clairaut's theorem. In particular, Clairaut's theorem implies that the second derivative of any $C^2$ function $f$ is a symmetric matrix; that is, $D^2 f(x) = [D^2 f(x)]^T$.
We see above that third and higher derivatives are problematic to represent in terms of matrices. In order to come up with a reasonable way of writing higher derivatives, we look at a more abstract set-up for the second derivative, which generalizes nicely to higher dimensions.
Algebraic meaning of the 2nd derivative
For a function of several variables $f : \mathbb{R}^n \to \mathbb{R}$, $Df(x)$ sends vectors in $T_x\mathbb{R}^n$ to vectors in $T_{f(x)}\mathbb{R}$:
\[
Df(x) : T_x\mathbb{R}^n \to T_{f(x)}\mathbb{R}, \qquad v \mapsto Df(x)v.
\]
We can check that $Df(x)$ is a linear mapping. Let $v, w \in T_x\mathbb{R}^n$ and $\alpha \in \mathbb{R}$, then
\[
Df(x)(\alpha v + w) = Df(x)(\alpha v) + Df(x)w = \alpha Df(x)v + Df(x)w.
\]
Therefore, $Df(x) \in L(T_x\mathbb{R}^n, T_{f(x)}\mathbb{R})$, but because we can make the identification $T_p\mathbb{R}^m \cong \mathbb{R}^m$ for any point $p$ and any $m$, we write $Df(x) \in L(\mathbb{R}^n, \mathbb{R})$ as noted above. We know that $D^2 f(x) \in L(\mathbb{R}^n, L(\mathbb{R}^n, \mathbb{R}))$ where we should think of $\mathbb{R}^n$ as $T_x\mathbb{R}^n$ and $\mathbb{R}$ as $T_{f(x)}\mathbb{R}$. Therefore, the correct expression is $D^2 f(x) \in L(T_x\mathbb{R}^n, L(T_x\mathbb{R}^n, T_{f(x)}\mathbb{R}))$. If we reason in terms of vectors, for $v \in T_x\mathbb{R}^n$ a column vector, $D^2 f(x)$ applied to $v$ must be a $1 \times n$ vector. This is obtained as follows:
\[
D^2 f(x) : T_x\mathbb{R}^n \to L(T_x\mathbb{R}^n, T_{f(x)}\mathbb{R}), \qquad v \mapsto v^T D^2 f(x).
\]
Since $v^T D^2 f(x) \in L(T_x\mathbb{R}^n, T_{f(x)}\mathbb{R})$, if $w$ is a column vector in $T_x\mathbb{R}^n$ then $v^T D^2 f(x)$ applied to $w$ must be a real number. This is done as follows:
\[
v^T D^2 f(x) : T_x\mathbb{R}^n \to T_{f(x)}\mathbb{R}, \qquad w \mapsto v^T D^2 f(x) w.
\]
We see that if $D^2 f(x)$ is applied to two vectors $v, w \in T_x\mathbb{R}^n$, then the formula $v^T D^2 f(x) w$ indeed returns an element of $T_{f(x)}\mathbb{R}$.
Example 5.6.9. Consider Example 5.6.5 with $f(x,y) = \cos(xy)$, and abbreviate $c = \cos(xy)$ and $s = \sin(xy)$. Then, if $v = (v_1, v_2)^T$ and $w = (w_1, w_2)^T$ we have
\[
\begin{aligned}
v^T D^2 f(x,y) w
&= (v_1, v_2)
\begin{pmatrix} -y^2 c & -s - xyc \\ -s - xyc & -x^2 c \end{pmatrix}
\begin{pmatrix} w_1 \\ w_2 \end{pmatrix} \\
&= \bigl(-y^2 c\, v_1 - (s + xyc) v_2, \; -(s + xyc) v_1 - x^2 c\, v_2\bigr)
\begin{pmatrix} w_1 \\ w_2 \end{pmatrix} \\
&= -y^2 c\, v_1 w_1 - (s + xyc)(v_2 w_1 + v_1 w_2) - x^2 c\, v_2 w_2.
\end{aligned}
\]
The following definition formalizes this type of mathematical object.
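Definition 5.6.10. A mapping $B : \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}$ is a real bilinear form on $\mathbb{R}^n$ if for any $u, v, u_1, u_2, v_1, v_2 \in \mathbb{R}^n$ and $\alpha, \beta \in \mathbb{R}$
(1) $B(\alpha u_1 + \beta u_2, v) = \alpha B(u_1, v) + \beta B(u_2, v)$, and
(2) $B(u, \alpha v_1 + \beta v_2) = \alpha B(u, v_1) + \beta B(u, v_2)$.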
We denote by $L^2(\mathbb{R}^n, \mathbb{R})$ the set of all real bilinear forms. Moreover, if $B(u,v) = B(v,u)$ for all $u, v \in \mathbb{R}^n$ then $B$ is called a symmetric bilinear form and the set of all symmetric bilinear forms is denoted $L^2_s(\mathbb{R}^n, \mathbb{R})$. Consider the following example.
Example 5.6.11. Let $A$ be a symmetric $n \times n$ matrix and $v, w \in \mathbb{R}^n$ be two column vectors. Define
\[
B(v, w) = v^T A w. \tag{5.4}
\]
Let $u \in \mathbb{R}^n$ be another vector and $\alpha \in \mathbb{R}$, then
\[
\begin{aligned}
B(\alpha u + v, w) &= (\alpha u + v)^T A w = (\alpha u^T + v^T) A w \\
&= \alpha u^T A w + v^T A w \\
&= \alpha B(u, w) + B(v, w).
\end{aligned}
\]
A similar calculation shows that
\[
B(v, \alpha u + w) = \alpha B(v, u) + B(v, w).
\]
Thus, $B$ defined using (5.4) is a bilinear form. We now show that $B$ is symmetric. Since $v^T A w$ is a real number, it equals its own transpose, and so
\[
v^T A w = (v^T A w)^T = w^T A^T v = w^T A v,
\]
where the last equality holds because $A$ is symmetric. Therefore, $B(v, w) = B(w, v)$.
This last example provides a proof of the following statement.
Proposition 5.6.12. Let $A$ be an $n \times n$ symmetric matrix and $u, v \in \mathbb{R}^n$ be two column vectors. Then,
\[
B(u, v) = u^T A v
\]
is a symmetric bilinear form.
Proposition 5.6.12 shows that we can construct symmetric bilinear forms by using symmetric matrices. A quadratic form $q : \mathbb{R}^n \to \mathbb{R}$ is obtained from a symmetric bilinear form $B : \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}$ by the formula $q(v) := B(v, v)$.
Quadratic forms are important in several fields of mathematics and, in particular, they are encountered in the next section.
Example 5.6.13. Consider the quadratic form $q(x,y) = B((x,y)^T, (x,y)^T)$ where $B$ is obtained from the matrix
\[
A = \begin{pmatrix} 1 & -2 \\ -2 & 3 \end{pmatrix}.
\]
Then, $q(x,y) = x^2 + 3y^2 - 4xy$.
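The construction of a quadratic form from a symmetric matrix can be sketched in a few lines. This is only an illustration under the assumption that vectors are plain Python lists; the function names `bilinear` and `quadratic` are hypothetical. It evaluates $B(v,w) = v^T A w$ and $q(v) = B(v,v)$ for the matrix of Example 5.6.13.

```python
def bilinear(A, v, w):
    """Evaluate B(v, w) = v^T A w for a square matrix A given as nested lists."""
    return sum(v[i] * A[i][j] * w[j]
               for i in range(len(v)) for j in range(len(w)))


def quadratic(A, v):
    """Quadratic form q(v) = B(v, v)."""
    return bilinear(A, v, v)


# Symmetric matrix from Example 5.6.13.
A = [[1, -2], [-2, 3]]

if __name__ == "__main__":
    x, y = 2, 5
    # Both values should coincide: q(x, y) = x^2 + 3y^2 - 4xy.
    print(quadratic(A, [x, y]), x**2 + 3 * y**2 - 4 * x * y)
```

Because `A` is symmetric, swapping the two arguments of `bilinear` leaves the value unchanged, which is the symmetry property of Proposition 5.6.12.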
Exercises
(1) Let $r : \mathbb{R} \to \mathbb{R}^n$ be a vector function defined by $r(t) = (x_1(t), \dots, x_n(t))^T$. Show that if the $k$th derivative of $r$ exists, then $D^k r(t) = (x_1^{(k)}(t), \dots, x_n^{(k)}(t))^T$ where $x_j^{(k)}(t)$ is the $k$th derivative of $x_j(t)$.
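(2) Compute the fourth derivative of $r(t) = (t^2, e^{2t}, \cos(t))^T$.
(3) Compute the second derivative of the following mappings.
(a) $f(x,y,z) = xy \exp(xz)$
(b) $f(x_1, x_2, x_3, x_4) = \det \begin{pmatrix} x_1 & x_2 \\ x_3 & x_4 \end{pmatrix}$
(c) $f(x,y) = (x, y) \begin{pmatrix} 3 & -1 \\ -1 & -2 \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix}$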
(4) Let $f(x,y) = x^2 y^2 + 1$ and $g(u) = e^{2u}$. Compute $D^2(g \circ f)(x,y)$ by first expressing $g \circ f$ explicitly and then taking its second derivative.
(5) Consider the function
\[
f(x,y) =
\begin{cases}
\dfrac{xy(x^2 - y^2)}{x^2 + y^2} & \text{if } (x,y) \neq (0,0), \\
0 & \text{if } (x,y) = (0,0).
\end{cases}
\]
Compute the second partial derivatives $\dfrac{\partial^2 f}{\partial x \partial y}(0,0)$ and $\dfrac{\partial^2 f}{\partial y \partial x}(0,0)$. Is $f$ a $C^2$ function? (Hint: Not an easy question.)
(6) If $q(v) = v^T A v$ where $A$ is a symmetric matrix, compute $D^2 q(v)$ in two ways: by writing $q$ explicitly in coordinate form by setting $v = (x_1, \dots, x_n)^T$, or by using the tangency limit and the best linear approximation formula.
(7) Consider the quadratic form $q : \mathbb{R}^3 \to \mathbb{R}$ where $q(v) = B(v,v)$ and $B$ is given by the matrix
\[
A = \begin{pmatrix} a & b & c \\ b & d & e \\ c & e & f \end{pmatrix}.
\]
Show that
\[
q(x,y,z) = ax^2 + dy^2 + fz^2 + 2bxy + 2cxz + 2eyz.
\]
5.7 Taylor expansions
Recall Taylor's expansion for functions $f : \mathbb{R} \to \mathbb{R}$ at a point $x_0$. If $f$ is differentiable $r + 1$ times, then
\[
f(x) = f(x_0) + \frac{f'(x_0)}{1!}(x - x_0) + \frac{f''(x_0)}{2!}(x - x_0)^2 + \cdots + \frac{f^{(r)}(x_0)}{r!}(x - x_0)^r + R_r(x)
\]
where $R_r(x)$ is the remainder term and can be expressed as
\[
R_r(x) = \frac{f^{(r+1)}(\xi)}{(r+1)!}(x - x_0)^{r+1}
\]
for some $\xi$ between $x_0$ and $x$. We can rewrite the above compactly as
\[
f(x) = \sum_{j=0}^{r} \frac{f^{(j)}(x_0)}{j!}(x - x_0)^j + R_r(x).
\]
Example 5.7.1. We compute the Taylor expansion of $f(x) = \exp(x)$ at $x = 1$ up to order 3:
\[
\begin{aligned}
f(x) &= f(1) + f'(1)(x - 1) + \frac{f''(1)}{2}(x - 1)^2 + \frac{f'''(1)}{3!}(x - 1)^3 + R_3(x) \\
&= e + e(x - 1) + \tfrac{1}{2}e(x - 1)^2 + \tfrac{1}{6}e(x - 1)^3 + R_3(x) \\
&= e\left(1 + (x - 1) + \tfrac{1}{2}(x - 1)^2 + \tfrac{1}{6}(x - 1)^3\right) + R_3(x)
\end{aligned}
\]
where $R_3(x) = \frac{1}{4!}e^{\xi}(x - 1)^4$ for some $\xi$ between $1$ and $x$. Note that because $e^x$ is infinitely differentiable, we can take the Taylor expansion up to any order.
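The remainder formula gives a concrete error bound that can be checked numerically. The sketch below is an illustration only; the function names are hypothetical, and the bound uses the assumption that $e^{\xi} \leq e^{\max(1,x)}$ for $\xi$ between $1$ and $x$. It compares the order-3 Taylor polynomial of $\exp$ at $x_0 = 1$ with the true value.

```python
import math


def taylor_exp_at_1(x, order=3):
    """Taylor polynomial of exp at x0 = 1, truncated at the given order."""
    return sum(math.e * (x - 1) ** j / math.factorial(j)
               for j in range(order + 1))


def remainder_bound(x, order=3):
    """Upper bound on |R_r(x)| using e^xi <= e^max(1, x)."""
    worst = math.exp(max(1.0, x))
    return worst * abs(x - 1) ** (order + 1) / math.factorial(order + 1)


if __name__ == "__main__":
    for x in (0.5, 1.5):
        error = abs(math.exp(x) - taylor_exp_at_1(x))
        print(x, error, remainder_bound(x))
```

For each sample point the printed error should not exceed the printed bound.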
Here is an example where the function is only differentiable a finite number of times.
Example 5.7.2. We compute the Taylor expansion of $f(x) = x^{5/2}$ at $x = 0$ up to order 2 because this function is only twice differentiable at $x = 0$. We obtain $f(0) = 0$, $f'(0) = 0$, $f''(0) = 0$ with $f'''(x) = \frac{15}{8}x^{-1/2}$, therefore
\[
f(x) = f(0) + f'(0)x + \frac{1}{2}f''(0)x^2 + R_2(x) = 0 + R_2(x)
\]
where $R_2(x) = \frac{15}{48}\xi^{-1/2}x^3$ for some $\xi \in (0, x)$.
We now show a Taylor expansion formula for functions $f : \mathbb{R}^n \to \mathbb{R}$ at some point $x_0 = (x_{01}, \dots, x_{0n}) \in \mathbb{R}^n$. We do this by using the function $g : \mathbb{R} \to \mathbb{R}$ defined by
\[
g(t) = f(x_0 + t(x - x_0))
\]
where $x = (x_1, \dots, x_n)$, and begin by computing the Taylor expansion at $t = 0$. We assume that $f$ is infinitely differentiable and so obtain the expression
\[
g(t) = g(0) + g'(0)t + \frac{1}{2!}g''(0)t^2 + \cdots + \frac{g^{(r)}(0)}{r!}t^r + \cdots
\]
where $g(0) = f(x_0)$. We now compute the higher derivatives. Set $u(t) = x_0 + t(x - x_0)$ and note that $u(0) = x_0$ and $u'(t) = x - x_0$. Then
\[
g'(t) = \frac{d}{dt} f(u(t)) = Df(u(t))u'(t)
\]
and so
\[
g'(0) = Df(x_0)(x - x_0).
\]
We now proceed to the second derivative,
\[
g''(t) = \frac{d}{dt}\bigl(Df(u(t))(x - x_0)\bigr).
\]
Note first that $Df(u(t))(x - x_0)$ is a $1 \times 1$ matrix and taking its transpose we have
\[
Df(u(t))(x - x_0) = (x - x_0)^T Df(u(t))^T.
\]
Therefore,
\[
\begin{aligned}
g''(t) &= \frac{d}{dt}\bigl(Df(u(t))(x - x_0)\bigr) = \frac{d}{dt}\bigl((x - x_0)^T Df(u(t))^T\bigr) \\
&= (x - x_0)^T D^2 f(u(t))(x - x_0)
\end{aligned}
\]
where the last equality holds because $D^2 f$ is a symmetric matrix. Thus, we obtain
\[
g''(0) = (x - x_0)^T D^2 f(x_0)(x - x_0).
\]
We stop the calculations here and summarize our findings in the next result.
Theorem 5.7.3. Let $f : U \subset \mathbb{R}^n \to \mathbb{R}$ be a $C^2$ function on $U$. Then, the Taylor expansion of $f$ truncated to second order is
\[
T_2 f(x) := f(x_0) + Df(x_0)(x - x_0) + \frac{1}{2!}(x - x_0)^T D^2 f(x_0)(x - x_0).
\]
This result is obtained by taking the truncation
\[
g(0) + g'(0)t + \frac{1}{2!}g''(0)t^2
\]
of the expansion of $g$ and evaluating at $t = 1$; that is, approximating $g(1) = f(x)$. As one can imagine, higher derivatives of $g$ should yield higher derivatives of $f$. However, since we have not studied formally the higher derivatives of functions of several variables, we do not pursue the Taylor expansions to orders higher than two. Notice also that we do not provide a remainder formula, but only look at the second order truncation. We look at a few examples on how to apply the Taylor expansion formula.
Example 5.7.4. Compute the Taylor expansion truncated to second order of $f(x,y) = \exp(xy)$ at $(x_0, y_0) = (1, 1)$. The first derivative is
\[
Df(x,y) = (y\exp(xy), x\exp(xy)), \qquad Df(1,1) = (e, e).
\]
The second derivative is
\[
D^2 f(x,y) = \begin{pmatrix} y^2\exp(xy) & \exp(xy)(1 + xy) \\ \exp(xy)(1 + xy) & x^2\exp(xy) \end{pmatrix}
\]
and so
\[
D^2 f(1,1) = \begin{pmatrix} e & 2e \\ 2e & e \end{pmatrix}.
\]
Therefore,
\[
\begin{aligned}
T_2 f(x,y) &= e + (e, e)\begin{pmatrix} x - 1 \\ y - 1 \end{pmatrix} + \frac{1}{2}(x - 1, y - 1)\begin{pmatrix} e & 2e \\ 2e & e \end{pmatrix}\begin{pmatrix} x - 1 \\ y - 1 \end{pmatrix} \\
&= e\left(1 + (x - 1) + (y - 1) + \tfrac{1}{2}(x - 1, y - 1)\begin{pmatrix} (x - 1) + 2(y - 1) \\ 2(x - 1) + (y - 1) \end{pmatrix}\right) \\
&= e\left(-1 + x + y + \tfrac{1}{2}(x - 1)^2 + 2(x - 1)(y - 1) + \tfrac{1}{2}(y - 1)^2\right) \\
&= e\left(2 - 2x - 2y + \tfrac{1}{2}x^2 + 2xy + \tfrac{1}{2}y^2\right).
\end{aligned}
\]
As a check, the last expression evaluated at $(1,1)$ gives $e(2 - 2 - 2 + \tfrac{1}{2} + 2 + \tfrac{1}{2}) = e = f(1,1)$.
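A quick numerical comparison helps catch algebra slips in an expansion like the one for $\exp(xy)$ at $(1,1)$. The following sketch is illustrative only (the names `t2` and `t2_expanded` are hypothetical): it evaluates the second-order Taylor polynomial both in the centered form $e(1 + \Delta x + \Delta y + \tfrac12\Delta x^2 + 2\Delta x\Delta y + \tfrac12\Delta y^2)$ and in the expanded form $e(2 - 2x - 2y + \tfrac12 x^2 + 2xy + \tfrac12 y^2)$, and compares with $f$ near the base point.

```python
import math


def f(x, y):
    return math.exp(x * y)


def t2(x, y):
    """Second-order Taylor polynomial of exp(xy) at (1, 1), centered form."""
    dx, dy = x - 1.0, y - 1.0
    return math.e * (1.0 + dx + dy + 0.5 * dx**2 + 2.0 * dx * dy + 0.5 * dy**2)


def t2_expanded(x, y):
    """The same polynomial after expanding the powers of (x - 1), (y - 1)."""
    return math.e * (2.0 - 2.0 * x - 2.0 * y
                     + 0.5 * x**2 + 2.0 * x * y + 0.5 * y**2)


if __name__ == "__main__":
    print(t2(1.3, 0.6), t2_expanded(1.3, 0.6))    # the two forms agree
    print(f(1.01, 0.99), t2(1.01, 0.99))          # close near (1, 1)
```

The agreement near $(1,1)$ degrades as the point moves away, in line with the neglected third-order terms.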
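Example 5.7.5. We compute the Taylor expansion truncated to second order of $f(x,y,z) = \sin(xyz)$ at $(x,y,z) = (1, \pi, 1/2)$. The first derivative is
\[
Df(x,y,z) = (yz\cos(xyz), xz\cos(xyz), xy\cos(xyz)), \qquad Df(1, \pi, 1/2) = (0, 0, 0).
\]
The second derivative is, writing $c = \cos(xyz)$ and $s = \sin(xyz)$,
\[
D^2 f(x,y,z) =
\begin{pmatrix}
-(yz)^2 s & zc - xyz^2 s & yc - xy^2 z s \\
zc - xyz^2 s & -(xz)^2 s & xc - x^2 yz s \\
yc - xy^2 z s & xc - x^2 yz s & -(xy)^2 s
\end{pmatrix}
\]
and so, since $c = 0$ and $s = 1$ at $(1, \pi, 1/2)$,
\[
D^2 f(1, \pi, \tfrac{1}{2}) =
\begin{pmatrix}
-\frac{\pi^2}{4} & -\frac{\pi}{4} & -\frac{\pi^2}{2} \\
-\frac{\pi}{4} & -\frac{1}{4} & -\frac{\pi}{2} \\
-\frac{\pi^2}{2} & -\frac{\pi}{2} & -\pi^2
\end{pmatrix}.
\]
The Taylor expansion truncated to second order is, setting $u = (x - 1, y - \pi, z - \tfrac{1}{2})^T$,
\[
\begin{aligned}
T_2 f(x,y,z) &= 1 - \frac{1}{2} u^T
\begin{pmatrix}
\frac{\pi^2}{4} & \frac{\pi}{4} & \frac{\pi^2}{2} \\
\frac{\pi}{4} & \frac{1}{4} & \frac{\pi}{2} \\
\frac{\pi^2}{2} & \frac{\pi}{2} & \pi^2
\end{pmatrix} u \\
&= 1 - \frac{\pi^2}{8}(x - 1)^2 - \frac{1}{8}(y - \pi)^2 - \frac{\pi^2}{2}\left(z - \tfrac{1}{2}\right)^2 \\
&\quad - \frac{\pi}{4}(x - 1)(y - \pi) - \frac{\pi^2}{2}(x - 1)\left(z - \tfrac{1}{2}\right) - \frac{\pi}{2}(y - \pi)\left(z - \tfrac{1}{2}\right).
\end{aligned}
\]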
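The expansion of $\sin(xyz)$ at $(1, \pi, 1/2)$ can be sanity-checked numerically. This is a hedged sketch rather than part of the derivation; the function names and the sample point are assumptions chosen for illustration. It evaluates the second-order polynomial with coefficients $\pi^2/8$, $1/8$, $\pi^2/2$, $\pi/4$, $\pi^2/2$, $\pi/2$ and compares it with $f$ close to the base point.

```python
import math


def f(x, y, z):
    return math.sin(x * y * z)


def t2(x, y, z):
    """Second-order Taylor polynomial of sin(xyz) at (1, pi, 1/2)."""
    u1, u2, u3 = x - 1.0, y - math.pi, z - 0.5
    pi = math.pi
    return (1.0
            - pi**2 / 8 * u1**2 - u2**2 / 8 - pi**2 / 2 * u3**2
            - pi / 4 * u1 * u2 - pi**2 / 2 * u1 * u3 - pi / 2 * u2 * u3)


if __name__ == "__main__":
    point = (1.01, math.pi + 0.01, 0.49)
    print(f(*point), t2(*point))
```

At the sample point the quadratic polynomial should be noticeably closer to $f$ than the constant approximation $f(1, \pi, 1/2) = 1$.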
Exercises
(1) Compute the Taylor expansions truncated to second order of the following functions.
(a) $f(x,y) = x^4 y^3$ at $(x,y) = (1, -1)$.
(b) $f(x,y,z) = \ln(x + y + z)$ at $(x,y,z) = (1, 0, 0)$.
(2) Consider the case $f(x,y)$ evaluated at $(x_0, y_0)$ and compute explicitly $g'''(0)$. Rearrange the terms of $\frac{1}{3!}g'''(0)$ to obtain
\[
\begin{aligned}
&\frac{1}{3!}\frac{\partial^3 f}{\partial x^3}(x_0, y_0)(x - x_0)^3 + \frac{1}{2!}\frac{\partial^3 f}{\partial x^2 \partial y}(x_0, y_0)(x - x_0)^2(y - y_0) \\
&\quad + \frac{1}{2!}\frac{\partial^3 f}{\partial x \partial y^2}(x_0, y_0)(x - x_0)(y - y_0)^2 + \frac{1}{3!}\frac{\partial^3 f}{\partial y^3}(x_0, y_0)(y - y_0)^3.
\end{aligned}
\]
This is the expression for the cubic terms of the Taylor expansion of $f(x,y)$ at $(x_0, y_0)$. It can be written in short form as $\frac{1}{3!}D^3 f(x_0, y_0)(u - u_0, u - u_0, u - u_0)$ where $u - u_0 = (x - x_0, y - y_0)$. The operator $D^3 f(x,y)$ is an example of a trilinear form.
(3) Using the formula of the previous exercise, compute the cubic terms of the Taylor expansion of
(a) $f(x,y) = x^4 y^3$ at $(1, -1)$.
(b) $f(x,y) = \exp(xy)$ at $(1, 1)$.
(4) Compute the Taylor expansion of $f(x,y) = x^2 + xy - 2y^2$ at $(x,y) = (0,0)$ truncated to second order. Set $y = x$ in the Taylor expansion and determine the curve obtained. Set $y = -x/2$ in the Taylor expansion and determine the curve obtained. Draw each curve in $xyz$-space.
6 Applications of Differential Calculus
This chapter contains a collection of different topics where the concept of the derivative of a mapping is applied. The first section is on optimization, an important topic in Calculus; however, its results are not used in the following chapters. This is followed by a section on parametrizations, which is crucial for the remainder of the book. The next section is on differential operators: it puts on firm footing the concept of differentials and gradients and introduces more differential operators. This section is also used in the following chapters. Finally, we conclude by discussing how Clairaut's theorem is used to complete the theory of exact 1-forms discussed in Chapter 4.
6.1 Optimization
This section is concerned with generalizing the optimization theory of functions $f : \mathbb{R} \to \mathbb{R}$ to the setting of general functions of several variables $f : \mathbb{R}^n \to \mathbb{R}$. We begin by generalizing the concept of critical point.
Critical Points and Extremals
Recall that for differentiable functions $f : \mathbb{R} \to \mathbb{R}$, a point $x_0$ is a critical point if $f'(x_0) = 0$. This generalizes naturally to the $n$-variables case.
Definition 6.1.1. A point $u_0 \in \mathbb{R}^n$ is a critical point of a differentiable mapping $f : \mathbb{R}^n \to \mathbb{R}$ if
\[
Df(u_0) = 0,
\]
which is the same as $\nabla f(u_0) = 0$.
We find the critical points of the following examples.
Example 6.1.2. Let $f(x,y) = 2x^2 - x^3 y + xy$, then
\[
Df(x,y) = (4x - 3x^2 y + y, \; -x^3 + x) = (0, 0)
\]
is solved by seeing first that $-x^3 + x = 0$ has solutions $x = 0$ and $x = \pm 1$. Setting $x = 0$ in the first equation we have the solution $y = 0$. Substituting $x = 1$ in the first equation we have $4 - 3y + y = 0$, which has solution $y = 2$, while $x = -1$ gives $-4 - 3y + y = 0$, which has solution $y = -2$. There are three critical points $(x_0, y_0) = (0, 0)$, $(x_1, y_1) = (1, 2)$ and $(x_2, y_2) = (-1, -2)$.
Example 6.1.3. Consider $f(x,y,z) = 3xz^2 - 3x - 2y + 4xy + z^2 - 15z$. We set
\[
Df(x,y,z) = (3z^2 - 3 + 4y, \; -2 + 4x, \; 6xz + 2z - 15) = (0, 0, 0)
\]
and solve for $(x,y,z)$. We have three equations in three unknowns:
\[
\begin{aligned}
3z^2 - 3 + 4y &= 0 \\
-2 + 4x &= 0 \\
6xz + 2z - 15 &= 0.
\end{aligned}
\]
Solving the second equation we obtain $x = 1/2$, which we substitute in the third equation. Thus, we obtain $5z - 15 = 0$ with solution $z = 3$. Finally, substituting in the first equation we have $24 + 4y = 0$, that is, $y = -6$. Therefore, $f$ has a unique critical point at $u_0 = (x_0, y_0, z_0) = (1/2, -6, 3)$.
Example 6.1.4. Consider the mapping
\[
g(x,y,z,u) = 6xyz - zx^2 - z^2 y - \tfrac{9}{2}x^2 + 9x - \tfrac{1}{2}x^4 - 2uy + u.
\]
Set
\[
Dg(x,y,z,u) = (6yz - 2xz - 9x + 9 - 2x^3, \; 6xz - z^2 - 2u, \; 6xy - x^2 - 2yz, \; -2y + 1) = (0, 0, 0, 0)
\]
and we obtain the nonlinear system of equations
\[
\begin{aligned}
6yz - 2xz - 9x + 9 - 2x^3 &= 0 \\
6xz - z^2 - 2u &= 0 \\
6xy - x^2 - 2yz &= 0 \\
-2y + 1 &= 0.
\end{aligned}
\]
The fourth equation is solved by $y = 1/2$ and we substitute in the other equations to obtain
\[
\begin{aligned}
3z - 2xz - 9x + 9 - 2x^3 &= 0 \\
6xz - z^2 - 2u &= 0 \\
3x - x^2 - z &= 0.
\end{aligned}
\]
Solving the last equation we have $z = 3x - x^2$ and we substitute in the first equation to obtain
\[
3(3x - x^2) - 2x(3x - x^2) - 9x + 9 - 2x^3 = -9x^2 + 9 = 0
\]
which has solutions $x = \pm 1$. Therefore, for $x = 1$ we have $z = 2$ and for $x = -1$ we have $z = -4$. Finally, we solve for $u$ in the second equation,
\[
u = 3xz - \tfrac{1}{2}z^2;
\]
substituting $x = 1$, $z = 2$ we have $u = 4$, while for $x = -1$, $z = -4$ we also have $u = 4$. The critical points are
\[
(x_0, y_0, z_0, u_0) = (1, 1/2, 2, 4) \quad \text{and} \quad (x_1, y_1, z_1, u_1) = (-1, 1/2, -4, 4).
\]
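Candidate critical points are easy to verify by substituting them back into the derivative. The sketch below is illustrative only; the function names are hypothetical, and the gradients are the ones written out for $f(x,y) = 2x^2 - x^3y + xy$, $f(x,y,z) = 3xz^2 - 3x - 2y + 4xy + z^2 - 15z$ and the four-variable mapping $g$ of Example 6.1.4.

```python
def grad_612(x, y):
    """Derivative of f(x, y) = 2x^2 - x^3 y + xy."""
    return (4 * x - 3 * x**2 * y + y, -x**3 + x)


def grad_613(x, y, z):
    """Derivative of f(x, y, z) = 3xz^2 - 3x - 2y + 4xy + z^2 - 15z."""
    return (3 * z**2 - 3 + 4 * y, -2 + 4 * x, 6 * x * z + 2 * z - 15)


def grad_614(x, y, z, u):
    """Derivative of g(x, y, z, u) from Example 6.1.4."""
    return (6 * y * z - 2 * x * z - 9 * x + 9 - 2 * x**3,
            6 * x * z - z**2 - 2 * u,
            6 * x * y - x**2 - 2 * y * z,
            -2 * y + 1)


def is_critical(grad, point, tol=1e-12):
    """True if every component of the derivative vanishes at `point`."""
    return all(abs(component) <= tol for component in grad(*point))


if __name__ == "__main__":
    print([is_critical(grad_612, p) for p in [(0, 0), (1, 2), (-1, -2)]])
    print(is_critical(grad_613, (0.5, -6, 3)))
    print([is_critical(grad_614, p)
           for p in [(1, 0.5, 2, 4), (-1, 0.5, -4, 4)]])
```

A substitution check of this kind confirms that a point is critical but does not show that the list of critical points is complete; that still requires solving the system.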
Recall that for a function $f : \mathbb{R}^n \to \mathbb{R}$, the tangent space at $p = (x, f(x)) \in \operatorname{graph}(f)$ is
\[
T_p S = \{ (v^T, (Df(x)v)^T) \mid v \in T_x\mathbb{R}^n \}.
\]
Thus, at a critical point $u_0$, if $p_0 = (u_0, f(u_0))$ then
\[
T_{p_0} S = \{ (v^T, 0) \mid v \in T_{u_0}\mathbb{R}^n \}
\]
and we can say that $T_{p_0}S$ is parallel to $T_{u_0}\mathbb{R}^n$. For functions $f : \mathbb{R} \to \mathbb{R}$ and $f : \mathbb{R}^2 \to \mathbb{R}$, this leads to saying that $T_{p_0}S$ is "horizontal", since it is perpendicular to the vertical axis. We know that the relationship between critical points and extremals is important. However, the definition of extremal is independent of the concept of critical point. For instance, the function $f(x) = |x|$ has a (global) minimum at $x = 0$, but the function is not differentiable at $x = 0$ and so we cannot say that $x = 0$ is a critical point.
Definition 6.1.5. Let $f : \mathbb{R}^n \to \mathbb{R}$ be a function of several variables.
(1) A point $u_0 \in \mathbb{R}^n$ is a local minimum of $f$ if there exists a number $r > 0$ such that $f(x) \geq f(u_0)$ for all $x \in \mathbb{R}^n$ satisfying $\|x - u_0\| < r$.
(2) A point $u_0 \in \mathbb{R}^n$ is a local maximum of $f$ if there exists a number $R > 0$ such that $f(x) \leq f(u_0)$ for all $x \in \mathbb{R}^n$ satisfying $\|x - u_0\| < R$.
We consider an example.
Example 6.1.6. We show that $(x_0, y_0) = (0, 0)$ is a local maximum of $f(x,y) = 2 - x^2 - x^4$. This is straightforward because $f(0,0) = 2$ and, for all values of $(x,y)$, the expression $x^2 + x^4 \geq 0$. Therefore, $f(x,y) = 2 - (x^2 + x^4) \leq 2 = f(0,0)$.
We establish the link between extremals and critical points.
Theorem 6.1.7. Let $f : \mathbb{R}^n \to \mathbb{R}$ be a differentiable mapping. If $u_0$ is a local maximum or a local minimum of $f$, then $u_0$ is a critical point of $f$.
Proof. Let $u_0$ be a local maximum (or minimum) of $f$, then there exists $R > 0$ (or $r > 0$) such that for all $x \in \mathbb{R}^n$ satisfying $\|x - u_0\| < R$ (or $r$), we have $f(x) \leq f(u_0)$ (or $f(x) \geq f(u_0)$ in the case of a local minimum). Consider the path $x_j(t) = u_0 + te_j$ where $e_j$ is the $j$th canonical basis vector of $\mathbb{R}^n$. Then, $x_j(0) = u_0$, $x_j'(0) = e_j$ and we consider the function $g_j : \mathbb{R} \to \mathbb{R}$ defined by $g_j(t) = f(x_j(t))$. Then $g_j(0) = f(u_0)$ and for $|t|$ small enough we have $\|x_j(t) - u_0\| < R$, which implies $g_j(t) = f(x_j(t)) \leq f(u_0)$ for all $|t|$ small enough. That is, $g_j$ has a local maximum (minimum) at $t = 0$. Therefore, for all $j = 1, \dots, n$
\[
0 = g_j'(0) = \left.\frac{d}{dt} f(x_j(t))\right|_{t=0} = Df(x_j(0))x_j'(0) = Df(u_0)e_j.
\]
This means all partial derivatives of $f$ at $u_0$ vanish and so $Df(u_0) = 0$; $u_0$ is a critical point of $f$.
Theorem 6.1.7 gives us a criterion to identify possible extrema of differentiable functions. As in the simpler case of functions $f : \mathbb{R} \to \mathbb{R}$, the second derivative contains information about whether a critical point is a local minimum, a local maximum or an inflection point. This is the topic of the next section.
6.1.1 Second derivative criterion for extremals
Consider the case of a differentiable function $f : \mathbb{R}^2 \to \mathbb{R}$ with a critical point at $u_0 = (x_0, y_0)$. Theorem 6.1.7 tells us that $u_0$ could be a local maximum or a local minimum. An example such as the hyperbolic paraboloid $z = f(x,y) = x^2 - y^2$ shows that $(0,0)$ is a critical point of $f$, but this critical point is not an extremal. Indeed, for $x = 0$ we have $z = f(0,y) = -y^2$, which is a parabola with a maximum at $(0,0)$, and for $y = 0$, $z = f(x,0) = x^2$ has a minimum at $(0,0)$. This type of critical point is an example of a saddle point: a critical point which is a local minimum in at least one direction and a local maximum in at least one direction. The next example shows another situation.
Example 6.1.8. Consider $z = f(x,y) = x^3 - 3xy^2$; this is the surface shown in Figure 5.4. Then, $Df(x,y) = (3x^2 - 3y^2, -6xy) = (0, 0)$ implies that $(x,y) = (0,0)$ is the only critical point. This critical point is neither a local extremum nor a saddle point. A problem in the exercises asks to show that, in fact, $f$ restricted to any straight line through $(0,0)$ has a point of inflection at $(0,0)$, except for the three lines $x = 0$ and $y = \pm x/\sqrt{3}$, along which $f$ vanishes identically.
The criterion for determining local maxima and minima is the second derivative test and it is sometimes expressed as follows in elementary calculus courses.
Proposition 6.1.9. Let $f : \mathbb{R}^2 \to \mathbb{R}$ be a $C^2$ function with a critical point at $u_0 = (x_0, y_0)$. Define
\[
D(x_0, y_0) := \frac{\partial^2 f}{\partial x^2}(x_0, y_0)\frac{\partial^2 f}{\partial y^2}(x_0, y_0) - \left(\frac{\partial^2 f}{\partial x \partial y}(x_0, y_0)\right)^2.
\]
(a) If $\frac{\partial^2 f}{\partial x^2}(x_0, y_0) > 0$ and $D > 0$ then $(x_0, y_0)$ is a local minimum.
(b) If $\frac{\partial^2 f}{\partial x^2}(x_0, y_0) < 0$ and $D > 0$ then $(x_0, y_0)$ is a local maximum.
(c) If $D(x_0, y_0) < 0$ then $(x_0, y_0)$ is neither a maximum nor a minimum, but a saddle point.
(d) If $D(x_0, y_0) = 0$, then the test is inconclusive.
Proof. (a) Let $v = (v_1, v_2)^T \in \mathbb{R}^2$ be a nonzero vector and consider the straight line path $r(t) = (x_0 + tv_1, y_0 + tv_2)$ through $u_0$ in $\mathbb{R}^2$. Define the function $h(t) = f(r(t))$. Then, one can check that $h(0) = f(u_0)$ and $h'(0) = 0$. For simplicity of notation let
\[
a = \frac{\partial^2 f(u_0)}{\partial x^2}, \qquad b = \frac{\partial^2 f(u_0)}{\partial x \partial y}, \qquad c = \frac{\partial^2 f(u_0)}{\partial y^2}.
\]
Then
\[
\begin{aligned}
h''(0) &= v^T D^2 f(u_0) v \\
&= av_1^2 + cv_2^2 + 2bv_1 v_2 \\
&= a(v_1 + (b/a)v_2)^2 + (c - (b^2/a))v_2^2 \\
&= a(v_1 + (b/a)v_2)^2 + \frac{1}{a}(ac - b^2)v_2^2 > 0
\end{aligned}
\tag{6.1}
\]
because $a > 0$ and $D = ac - b^2 > 0$. Thus, $t = 0$ is a local minimum of $h$ for all straight line paths $r(t)$ through $u_0$. Thus, for $|t|$ small enough $f(r(t)) \geq f(u_0)$ for all paths, hence $f(x,y) \geq f(u_0)$ for all $(x,y)$ close enough to $u_0$ and so $u_0$ is a local minimum. A similar argument works for local maxima in part (b) using (6.1). For part (c), suppose first that $a \neq 0$. By (6.1), the path given by $v = (v_1, 0)$ has $h''(0) = av_1^2$, while the path given by $v = (-(b/a)v_2, v_2)$ has $h''(0) = \frac{1}{a}(ac - b^2)v_2^2$. Since $D = ac - b^2 < 0$, these two values have opposite signs. If $a = 0$, then $D = -b^2 < 0$ forces $b \neq 0$ and $h''(0) = cv_2^2 + 2bv_1v_2$ again takes both signs. Therefore, $u_0$ is a local minimum with respect to one path and a local maximum with respect to another path. Hence, $u_0$ is a saddle point. For part (d), Example 6.1.8 (where $D = 0$) shows that the critical point may be something other than a local extremum or a saddle point, and so we cannot conclude.
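Example 6.1.10. We continue with Example 6.1.2, $f(x,y) = 2x^2 - x^3 y + xy$, and determine the nature of its critical points. Solving $Df(x,y) = (4x - 3x^2 y + y, -x^3 + x) = (0,0)$ gives the critical points $(0,0)$, $(1,2)$ and $(-1,-2)$. We compute
\[
\frac{\partial^2 f}{\partial x^2} = 4 - 6xy
\]
which is positive at $(0,0)$, while it is negative at $(1,2)$ and $(-1,-2)$. However, since $\frac{\partial^2 f}{\partial y^2} = 0$ and $\frac{\partial^2 f}{\partial x \partial y} = -3x^2 + 1$,
\[
D = \frac{\partial^2 f}{\partial x^2}(x_0, y_0)\frac{\partial^2 f}{\partial y^2}(x_0, y_0) - \left(\frac{\partial^2 f}{\partial x \partial y}(x_0, y_0)\right)^2 = -(-3x_0^2 + 1)^2
\]
which means $D$ is negative at every critical point ($D = -1$ at $(0,0)$ and $D = -4$ at the other two). Therefore, all critical points are saddle points.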
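The second derivative test of Proposition 6.1.9 translates directly into a small decision procedure. The following sketch is illustrative (function names are hypothetical); it takes the three second partial derivatives at a critical point and returns the classification, and then applies it to $f(x,y) = 2x^2 - x^3y + xy$, whose second partials are $4 - 6xy$, $0$ and $-3x^2 + 1$.

```python
def second_derivative_test(fxx, fyy, fxy):
    """Classify a critical point of a C^2 function of two variables."""
    D = fxx * fyy - fxy**2
    if D > 0:
        # D > 0 forces fxx != 0, so the sign of fxx decides.
        return "local minimum" if fxx > 0 else "local maximum"
    if D < 0:
        return "saddle point"
    return "inconclusive"


def second_partials(x, y):
    """(f_xx, f_yy, f_xy) for f(x, y) = 2x^2 - x^3 y + xy."""
    return (4 - 6 * x * y, 0, -3 * x**2 + 1)


if __name__ == "__main__":
    for point in [(0, 0), (1, 2), (-1, -2)]:
        print(point, second_derivative_test(*second_partials(*point)))
```

With exact integer inputs the comparison `D == 0` is reliable; with floating-point second derivatives a small tolerance around zero would be a sensible assumption to add.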
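We now show that the second derivative test conditions of Proposition 6.1.9 can be rewritten in terms of the second derivative $D^2 f$. But first, we introduce a widely used terminology for the second derivative when expressed as a matrix.
Definition 6.1.11. Let $f : \mathbb{R}^n \to \mathbb{R}$ be a $C^2$ function. The Hessian matrix of $f$ at $u_0$ is
\[
H(f)(u_0) := D^2 f(u_0).
\]
The Hessian matrix is just another name for the matrix of the second derivative when a function is $C^2$, just as the Jacobian matrix is the matrix of the first derivative. Using the Hessian matrix we notice that for $f : \mathbb{R}^2 \to \mathbb{R}$ a $C^2$ function
\[
\det H(f)(u_0) = \det\begin{pmatrix} \dfrac{\partial^2 f}{\partial x^2} & \dfrac{\partial^2 f}{\partial y \partial x} \\ \dfrac{\partial^2 f}{\partial x \partial y} & \dfrac{\partial^2 f}{\partial y^2} \end{pmatrix}
= \frac{\partial^2 f}{\partial x^2}\frac{\partial^2 f}{\partial y^2} - \left(\frac{\partial^2 f}{\partial y \partial x}\right)^2
= D(x_0, y_0),
\]
where all partial derivatives are evaluated at $u_0 = (x_0, y_0)$.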
Thus, we rewrite the second derivative test as follows.
Proposition 6.1.12. Let $f : \mathbb{R}^2 \to \mathbb{R}$ be a $C^2$ function with a critical point at $u_0 = (x_0, y_0)$.
(a) If $\frac{\partial^2 f}{\partial x^2}(x_0, y_0) > 0$ and $\det H(f)(x_0, y_0) > 0$ then $(x_0, y_0)$ is a local minimum.
(b) If $\frac{\partial^2 f}{\partial x^2}(x_0, y_0) < 0$ and $\det H(f)(x_0, y_0) > 0$ then $(x_0, y_0)$ is a local maximum.
(c) If $\det H(f)(x_0, y_0) < 0$ then $(x_0, y_0)$ is neither a maximum nor a minimum, but a saddle point.
(d) If $\det H(f)(x_0, y_0) = 0$ then the test is inconclusive.
We now come to the last characterization of the second derivative test. Recall that if $A$ is an $n \times n$ matrix with eigenvalues $\lambda_1, \dots, \lambda_n$ then $\det A = \lambda_1 \cdots \lambda_n$. Therefore, if $\lambda_1, \lambda_2$ are the eigenvalues of a $2 \times 2$ Hessian matrix $H(f)(x_0, y_0)$ then
\[
\det H(f)(x_0, y_0) = \lambda_1 \lambda_2.
\]
Using this correspondence, the second derivative test can be rewritten.
Proposition 6.1.13. Let $f : \mathbb{R}^2 \to \mathbb{R}$ be a $C^2$ function with a critical point at $u_0 = (x_0, y_0)$.
(1) If the eigenvalues of $H(f)(x_0, y_0)$ are both positive, then $(x_0, y_0)$ is a local minimum.
(2) If the eigenvalues of $H(f)(x_0, y_0)$ are both negative, then $(x_0, y_0)$ is a local maximum.
(3) If the eigenvalues of $H(f)(x_0, y_0)$ have different signs, then $(x_0, y_0)$ is neither a maximum nor a minimum, but a saddle point.
(4) If $H(f)(x_0, y_0)$ has a zero eigenvalue, then the test is inconclusive.
Proof. Write
\[
H(f)(x_0, y_0) = \begin{pmatrix} a & b \\ b & c \end{pmatrix}
\]
and let $\lambda_1, \lambda_2$ be its eigenvalues. Then $ac - b^2 = \lambda_1 \lambda_2$, and $ac - b^2 > 0$ if and only if $\lambda_1$ and $\lambda_2$ are both positive or both negative. In that case $ac > b^2 \geq 0$, so $a$ and $c$ must have the same sign, and since $a + c = \lambda_1 + \lambda_2$ this sign is the sign of the eigenvalues. This corresponds to the first two cases of local minimum and local maximum from Proposition 6.1.12. If $\lambda_1, \lambda_2$ have different signs then $ac - b^2 < 0$. The last case is straightforward.
Let us apply Proposition 6.1.13 to some examples.
Example 6.1.14. Consider $f(x,y) = \cos x \cos y$ with $0 \leq x < \pi$ and $0 \leq y < \pi$. We set
\[
Df(x,y) = (-\sin x \cos y, -\cos x \sin y) = (0, 0)
\]
and critical points are found at $(x,y) = (0,0)$ and $(x,y) = (\pi/2, \pi/2)$. The Hessian matrix is
\[
H(f)(x,y) = \begin{pmatrix} -\cos x \cos y & \sin x \sin y \\ \sin x \sin y & -\cos x \cos y \end{pmatrix}
\]
and we evaluate at each critical point. We have
\[
H(f)(0,0) = \begin{pmatrix} -1 & 0 \\ 0 & -1 \end{pmatrix} \quad \text{and} \quad H(f)(\pi/2, \pi/2) = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}.
\]
Because $H(f)(0,0)$ is a diagonal matrix, the eigenvalues are the elements of the diagonal: $-1$ with multiplicity two. Thus, $(0,0)$ is a local maximum. The eigenvalues of $H(f)(\pi/2, \pi/2)$ are given by solving the characteristic polynomial $p(\lambda) = \det(\lambda I - H(f)(\pi/2, \pi/2)) = 0$. A calculation shows that $p(\lambda) = \lambda^2 - 1 = 0$ and the eigenvalues are $\pm 1$, which implies that $(\pi/2, \pi/2)$ is a saddle point.
Example 6.1.15. Consider $f(x,y) = x^3 - 3xy^2$, which we know has a unique critical point at $(0,0)$ from Example 6.1.8. The Hessian at $(0,0)$ is
\[
H(f)(0,0) = \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix}
\]
and both eigenvalues are zero. Therefore, the second derivative test is inconclusive.
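For a symmetric $2 \times 2$ matrix the eigenvalues have a closed form, so the eigenvalue version of the test can be sketched without any linear algebra library. This is an illustration under the assumption that the Hessian is passed as its three entries $a$, $b$, $c$; the function names are hypothetical. It uses $\lambda = \tfrac{a+c}{2} \pm \sqrt{\bigl(\tfrac{a-c}{2}\bigr)^2 + b^2}$.

```python
import math


def symmetric_2x2_eigenvalues(a, b, c):
    """Eigenvalues of [[a, b], [b, c]], smallest first."""
    mean = (a + c) / 2.0
    radius = math.hypot((a - c) / 2.0, b)
    return mean - radius, mean + radius


def classify(a, b, c):
    """Eigenvalue form of the second derivative test (Proposition 6.1.13)."""
    low, high = symmetric_2x2_eigenvalues(a, b, c)
    if low > 0:
        return "local minimum"
    if high < 0:
        return "local maximum"
    if low < 0 < high:
        return "saddle point"
    return "inconclusive"  # at least one eigenvalue is zero


if __name__ == "__main__":
    print(classify(-1, 0, -1))  # Hessian of cos x cos y at (0, 0)
    print(classify(0, 1, 0))    # Hessian of cos x cos y at (pi/2, pi/2)
    print(classify(0, 0, 0))    # Hessian of x^3 - 3xy^2 at (0, 0)
```

The exact comparison with zero is adequate for these integer examples; for Hessians computed in floating point, a tolerance would be an additional assumption to introduce.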
We can now turn our attention to the general case of $C^2$ functions of several variables. Before we do so, we restrict our attention to critical points which exclude the cases that are inconclusive in the classification given by Proposition 6.1.13.
Definition 6.1.16. A critical point $u_0$ of $f : \mathbb{R}^n \to \mathbb{R}$ is a nondegenerate critical point of $f$ if $\det H(f)(u_0) \neq 0$.
The characterization of local minima, local maxima and saddle points in terms of eigenvalues generalizes in a straightforward manner to functions of several variables. Note that in $n$ dimensions, the concept of saddle point has many more cases than in the two-dimensional case. We do not seek to classify all those cases, which is the topic of Morse theory; instead, we consider a few examples.
Example 6.1.17. Let $f(x,y,z,u) = x^2 + y^2 + z^2 - u^2$, then
\[
Df(x,y,z,u) = (2x, 2y, 2z, -2u)
\]
and there is a unique critical point at $(0,0,0,0)$. Setting $y = z = u = 0$, $x = z = u = 0$ and $x = y = u = 0$ shows that $(0,0,0,0)$ is a minimum along the respective paths, while for $x = y = z = 0$, the path through $(0,0,0,0)$ has a maximum. Consider instead $f(x,y,z,u) = x^2 + y^2 - z^2 - u^2$. Then, the critical point at $(0,0,0,0)$ is a minimum along the $x$ and $y$ axis paths and a maximum along the $z$ and $u$ axis paths. We see that as the number of dimensions goes above three, there are different types of saddle points with different numbers of (local) minimum and maximum directions.
Considering the variety of saddle points possible, one can imagine that cases where a zero eigenvalue occurs in the Hessian matrix at a critical point can easily get out of control. By focusing on nondegenerate critical points, the situation remains manageable and we have the following result.
Theorem 6.1.18. Let $f : \mathbb{R}^n \to \mathbb{R}$ be a $C^2$ function with a nondegenerate critical point at $u_0$. Then,
(1) If all the eigenvalues of the Hessian $H(f)(u_0)$ are positive, then $u_0$ is a local minimum.
(2) If all the eigenvalues of the Hessian $H(f)(u_0)$ are negative, then $u_0$ is a local maximum.
(3) If the Hessian $H(f)(u_0)$ has both positive and negative eigenvalues, then $u_0$ is a saddle point.
Proof. Using paths as in the proof of Proposition 6.1.9 is quite cumbersome, so we use instead the second-order Taylor expansion of $f$ at the critical point $u_0$:
\[
T_2 f(x) = f(u_0) + \frac{1}{2}(x - u_0)^T H(f)(u_0)(x - u_0)
\]
since $Df(u_0) = 0$. Because the matrix $H(f)(u_0)$ is symmetric, it has only real eigenvalues and is diagonalizable. This is a well-known result which is found in any textbook on linear algebra [6]. Therefore, there exists an orthogonal matrix $P$ such that $P^T H(f)(u_0) P = \operatorname{diag}(\lambda_1, \dots, \lambda_n)$ is the diagonal matrix with entries given by the eigenvalues $\lambda_1, \dots, \lambda_n$ of $H(f)(u_0)$ from top left to bottom right. Note that $H(f)(u_0) = P \operatorname{diag}(\lambda_1, \dots, \lambda_n) P^T$ and so the second-order Taylor expansion is written
\[
T_2 f(x) = f(u_0) + \frac{1}{2}(x - u_0)^T P \operatorname{diag}(\lambda_1, \dots, \lambda_n) P^T (x - u_0).
\]
Set $v = P^T(x - u_0) := (v_1, \dots, v_n)^T$, then
\[
T_2 f(x) = f(u_0) + \frac{1}{2}v^T \operatorname{diag}(\lambda_1, \dots, \lambda_n) v = f(u_0) + \frac{1}{2}\left(\lambda_1 v_1^2 + \cdots + \lambda_n v_n^2\right).
\]
We first neglect the terms of order three and higher, that is, we argue with $T_2 f$ in place of $f$. For part (1), if $\lambda_1 > 0, \dots, \lambda_n > 0$ then $T_2 f(x) \geq f(u_0)$. For part (2), if $\lambda_1 < 0, \dots, \lambda_n < 0$, then $T_2 f(x) \leq f(u_0)$. Finally, for part (3), suppose that $\lambda_1 > 0$ and $\lambda_2 < 0$. Then for $x$ values such that $v_1 \neq 0$ and $v_2 = \cdots = v_n = 0$ we have $T_2 f(x) \geq f(u_0)$, while for $x$ values such that $v_1 = 0$, $v_2 \neq 0$, $v_3 = \cdots = v_n = 0$ we have $T_2 f(x) \leq f(u_0)$. The argument is similar if $\lambda_i < 0$, $\lambda_j > 0$ for $i \neq j$. Adding back the higher-order terms and considering $x$ close enough to $u_0$, the inequalities above are maintained for $f$ and the result holds.
Using Theorem 6.1.18, we can now return to Examples 6.1.3 and 6.1.4, and determine the nature of the critical points.
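Example 6.1.19. For $f(x,y,z) = 3xz^2 - 3x - 2y + 4xy + z^2 - 15z$ the critical point is $(x,y,z) = (1/2, -6, 3)$. The second partial derivatives are $f_{xx} = f_{yy} = f_{yz} = 0$, $f_{xy} = 4$, $f_{xz} = 6z$ and $f_{zz} = 6x + 2$, so the Hessian matrix at the critical point is
$$H(f)(1/2, -6, 3) = \begin{pmatrix} 0 & 4 & 18 \\ 4 & 0 & 0 \\ 18 & 0 & 5 \end{pmatrix}.$$
The characteristic polynomial is given by
$$p(\lambda) = \det\big(H(f)(1/2, -6, 3) - \lambda I\big) = \det \begin{pmatrix} -\lambda & 4 & 18 \\ 4 & -\lambda & 0 \\ 18 & 0 & 5 - \lambda \end{pmatrix} = -\lambda^3 + 5\lambda^2 + 340\lambda - 80.$$
We must solve $p(\lambda) = 0$. This can be done using a mathematical computation software package and we obtain the approximate eigenvalues $\lambda_1 \approx -16.24$, $\lambda_2 \approx 0.23$ and $\lambda_3 \approx 21.01$. Since the eigenvalues have both signs, the critical point is a saddle point.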
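The eigenvalue computation in Example 6.1.19 can be checked numerically. The sketch below is not the software the text refers to; it is a small cyclic Jacobi iteration written for illustration, and the helper names `hessian_f` and `jacobi_eigenvalues` are assumptions. The Hessian entries are the second partial derivatives of $f$ as the function is stated in the example.

```python
import math

def hessian_f(x, y, z):
    # Second partials of f(x,y,z) = 3xz^2 - 3x - 2y + 4xy + z^2 - 15z.
    return [[0.0, 4.0, 6.0 * z],
            [4.0, 0.0, 0.0],
            [6.0 * z, 0.0, 6.0 * x + 2.0]]

def jacobi_eigenvalues(matrix, sweeps=50, tol=1e-18):
    """Eigenvalues of a real symmetric matrix by cyclic Jacobi rotations."""
    a = [row[:] for row in matrix]  # work on a copy
    n = len(a)
    for _ in range(sweeps):
        off = sum(a[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if a[p][q] == 0.0:
                    continue
                # Angle that zeroes entry (p, q): tan(2t) = 2 a_pq / (a_qq - a_pp).
                if a[q][q] == a[p][p]:
                    t = math.pi / 4
                else:
                    t = 0.5 * math.atan(2.0 * a[p][q] / (a[q][q] - a[p][p]))
                c, s = math.cos(t), math.sin(t)
                for k in range(n):  # A <- A J (columns p and q)
                    akp, akq = a[k][p], a[k][q]
                    a[k][p], a[k][q] = c * akp - s * akq, s * akp + c * akq
                for k in range(n):  # A <- J^T A (rows p and q)
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k], a[q][k] = c * apk - s * aqk, s * apk + c * aqk
    return sorted(a[i][i] for i in range(n))

eigs = jacobi_eigenvalues(hessian_f(0.5, -6.0, 3.0))
print(eigs)  # eigenvalues of both signs indicate a saddle point
```

Only the signs of the eigenvalues matter for Theorem 6.1.18, so a modest tolerance is enough here.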
Example 6.1.20. For $g(x,y,z,u) = 6xyz - zx^2 - z^2 y - \tfrac{9}{2}x^2 + 9x - \tfrac{1}{2}x^4 - 2uy + u$, recall that the critical points are $(x_0, y_0, z_0, u_0) = (1, 1/2, 2, 4)$ and $(x_1, y_1, z_1, u_1) = (-1, 1/2, -4, 4)$. The Hessian matrix is
$$H(g)(x,y,z,u) = \begin{pmatrix} -2z - 9 - 6x^2 & 6z & 6y - 2x & 0 \\ 6z & 0 & 6x - 2z & -2 \\ 6y - 2x & 6x - 2z & -2y & 0 \\ 0 & -2 & 0 & 0 \end{pmatrix}.$$
We evaluate the Hessian at each critical point:
$$H(g)(x_0, y_0, z_0, u_0) = \begin{pmatrix} -19 & 12 & 1 & 0 \\ 12 & 0 & 2 & -2 \\ 1 & 2 & -1 & 0 \\ 0 & -2 & 0 & 0 \end{pmatrix}, \qquad H(g)(x_1, y_1, z_1, u_1) = \begin{pmatrix} -7 & -24 & 5 & 0 \\ -24 & 0 & 2 & -2 \\ 5 & 2 & -1 & 0 \\ 0 & -2 & 0 & 0 \end{pmatrix}.$$
The eigenvalues are computed from the characteristic polynomials
$$p_0(\lambda) = \det\big(H(g)(x_0, y_0, z_0, u_0) - \lambda I\big) = \lambda^4 + 20\lambda^3 - 134\lambda^2 - 348\lambda - 72$$
and
$$p_1(\lambda) = \det\big(H(g)(x_1, y_1, z_1, u_1) - \lambda I\big) = \lambda^4 + 8\lambda^3 - 602\lambda^2 - 156\lambda + 72.$$
Numerical solutions can be computed using software. For $p_0(\lambda)$ one obtains three negative eigenvalues ($\approx -24.84$, $\approx -1.84$, $\approx -0.23$) and one positive eigenvalue ($\approx 6.91$). Now, $p_1(\lambda)$ has two negative eigenvalues ($\approx -28.75$, $\approx -0.50$) and two positive eigenvalues ($\approx 0.24$, $\approx 21.01$). In both cases, the critical point is a saddle point.
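The saddle-point conclusion of Example 6.1.20 can be checked without computing any eigenvalue: a symmetric matrix whose quadratic form $v^T H v$ takes both signs has eigenvalues of both signs. The sketch below uses central differences; the helper names `gradient` and `second_directional`, the step sizes and the two test directions are illustrative choices.

```python
def g(x, y, z, u):
    return (6 * x * y * z - z * x**2 - z**2 * y - 4.5 * x**2 + 9 * x
            - 0.5 * x**4 - 2 * u * y + u)

def gradient(f, p, h=1e-6):
    """Central-difference estimate of the gradient of f at the point p."""
    grads = []
    for i in range(len(p)):
        up, dn = list(p), list(p)
        up[i] += h
        dn[i] -= h
        grads.append((f(*up) - f(*dn)) / (2 * h))
    return grads

def second_directional(f, p, v, h=1e-3):
    """Central-difference estimate of v^T H(f)(p) v."""
    fwd = [pi + h * vi for pi, vi in zip(p, v)]
    bwd = [pi - h * vi for pi, vi in zip(p, v)]
    return (f(*fwd) - 2 * f(*p) + f(*bwd)) / h**2

critical_points = [(1.0, 0.5, 2.0, 4.0), (-1.0, 0.5, -4.0, 4.0)]
for p in critical_points:
    print(gradient(g, p))                                # close to the zero vector
    print(second_directional(g, p, (0, 0, 1, 0)))        # negative: g_zz = -2y
    print(second_directional(g, p, (0, 1, 0, -1)))       # positive: g_yy + g_uu - 2 g_yu
```

The two directions only involve the entries $g_{zz}$, $g_{yy}$, $g_{uu}$ and $g_{yu}$, which are the same at both critical points.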
Exercises
(1) Compute the critical points of the following functions and determine if they are nondegenerate. For the nondegenerate cases, determine whether the critical points are local maxima, local minima or saddle points.
(a) $f(x,y) = -x^2 y + 2xy^3 + y$.
(b) $g(x,y) = \ln(1 + x^2 + y^2)$.
(c) $f(x,y,z) = e^{xyz + xy - xz + 2yz}$.
(d) $g(x,y,z) = \cos x \cos y \sin z$, with domain $0 \leq x, y, z \leq \pi/2$.
(e) $h(x,y,z,u) = x^3 - 2y^2 + 3xz - 3zu + z^3 + 3u$.
(f) $k(x,y,z,u) = x^2 + u^2 + 3z^2 + (y+1)^2 - 2y$.
(2) Show that $f(x,y) = x^3 - 3xy^2$ along any straight line path through $(0,0)$ has an inflection point at $(0,0)$, except for three lines. (Hint: this is the function defining the Monkey Saddle of Chapter 5.)
(3) Show that $f(x,y) = x^2 + y^4$ has a degenerate critical point at $(0,0)$. Using the definition, explain why $(0,0)$ is a local minimum. Can you find a similar mapping with a local maximum?
(4) Find a function $g(x,y)$ which has a degenerate critical point at $(0,0)$ but which behaves as a saddle point at $(0,0)$.
6.2 Parametrizations

In this section, we explore parametrizations of regions in $\mathbb{R}^2$ and $\mathbb{R}^3$ and then discuss the parametrization of graphs of functions $f : U \subset \mathbb{R}^2 \to \mathbb{R}$. We consider the case of differentiable parametrizations. The techniques and examples in this section are crucial for the computation of double integrals, triple integrals and surface integrals.
6.2.1 Parametrization of 2D and 3D regions
We state most of the definitions and results for regions in $\mathbb{R}^3$, with the $\mathbb{R}^2$ case naturally included implicitly. The goal of a parametrization is to provide a description of a region via another region which has a simpler description.

Definition 6.2.1. A differentiable parametrization of a region $E \subset \mathbb{R}^3$ (also in $\mathbb{R}^2$) is a differentiable mapping
$$\Phi : R \subset \mathbb{R}^3 \to E$$
such that $\Phi^{-1}$ exists and is a differentiable mapping.

Note that any region $E$ has the trivial parametrization given by the identity mapping. That is, let $R = E$ and $\Phi(x,y,z) = (x,y,z)^T$. However, unless the region $E$ is simple enough, e.g. a rectangle parallel to the axes in $\mathbb{R}^2$ or a cube parallel to the axes in $\mathbb{R}^3$, we seek parametrizations which simplify the description of the region $E$. The main types of parametrizations we use involve curvilinear coordinate systems
(1) Polar: $\Phi(r,\theta) = (r\cos\theta, r\sin\theta)^T$
(2) Cylindrical: $\Phi(r,\theta,z) = (r\cos\theta, r\sin\theta, z)^T$
(3) Spherical: $\Phi(\rho,\varphi,\theta) = (\rho\cos\theta\sin\varphi, \rho\sin\theta\sin\varphi, \rho\cos\varphi)^T$
and linear mappings
$$\Phi(u_1, \dots, u_n) = A(u_1, \dots, u_n)^T$$
where $A$ is an invertible $n \times n$ matrix. Note that in each case, the mappings $\Phi$ are differentiable and
$$\det(D\Phi) \neq 0$$
at all points in the domain. This implies that $\Phi^{-1}$ exists and is a differentiable function. This is proved using a result called the Inverse Function Theorem, which is not covered in this book.

We begin with examples of regions $D$ in $\mathbb{R}^2$ and the important class of parametrizations in $\mathbb{R}^2$ done with polar coordinates.

Example 6.2.2. Consider the domain
$$D = \{(x,y) \in \mathbb{R}^2 \mid a^2 \leq x^2 + y^2 \leq b^2,\ 0 < a < b,\ x, y \geq 0\}.$$
This region is the first-quadrant part of an annulus centered at the origin with radius varying from $a$ to $b$, see Figure 6.1. A parametrization in polar coordinates is given by
$$\Phi(r,\theta) = (r\cos\theta, r\sin\theta)^T$$
and the domain of $\Phi$ is
$$R = \{(r,\theta) \mid a \leq r \leq b,\ 0 \leq \theta \leq \pi/2\}.$$

Fig. 6.1. Annulus region bounded by circles of radius a and b with a < b.

We see in Figure 6.2 that the domain $R$ is a rectangle with sides parallel to the axes in the $(r,\theta)$ plane. Therefore, the parametrization provides a description of $D$ via a domain $R$ which has a simpler form.
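As a quick sanity check of Example 6.2.2, the sketch below samples the rectangle $R$ on a grid and tests that every image point under $\Phi$ satisfies the inequalities defining $D$. The radii $a = 1$, $b = 2$, the grid size and the helper name `in_D` are arbitrary illustrative choices.

```python
import math

def phi(r, theta):
    # Polar parametrization from the example.
    return (r * math.cos(theta), r * math.sin(theta))

def in_D(x, y, a, b, eps=1e-12):
    # Membership test for D, with a small tolerance for rounding on the boundary.
    return (a * a - eps <= x * x + y * y <= b * b + eps) and x >= -eps and y >= -eps

a, b = 1.0, 2.0  # sample radii with 0 < a < b
n = 20
all_inside = all(
    in_D(*phi(a + (b - a) * i / n, (math.pi / 2) * j / n), a, b)
    for i in range(n + 1)
    for j in range(n + 1)
)
print(all_inside)
```

A grid check of this kind does not prove the claim, but it catches sign or range mistakes in a proposed domain quickly.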
Another important type of parametrization is given by linear transformations. These can be useful if the domain is bounded by straight lines.

Example 6.2.3. Consider
$$D = \{(x,y) \mid 0 \leq y \leq 1,\ -y \leq x \leq y\}.$$
This is a triangular region with vertices at $(1,1)$, $(-1,1)$ and $(0,0)$, see Figure 6.3. The goal of the parametrization we use here is to have a triangular domain $R$ with two sides on the horizontal and vertical axes. We define
$$\Phi(u,v) = (u + v, u - v)^T = \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix} \begin{pmatrix} u \\ v \end{pmatrix}$$
with domain
$$R = \{(u,v) \mid 0 \leq u \leq 1,\ u - 1 \leq v \leq 0\}.$$

Fig. 6.2. Domain of the parametrization Φ.

Fig. 6.3. Domain D of Example 6.2.3.

Using $\Phi^{-1}$ we show how $R$ is constructed using $D$. It is straightforward to compute
$$\Phi^{-1}(x,y) = \frac{1}{2} \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} = \left( \tfrac{1}{2}(x + y), \tfrac{1}{2}(x - y) \right)^T.$$
Consider the boundary $y = x$ with $0 \leq x \leq 1$. Then
$$(u,v)^T = \Phi^{-1}(x,x) = (x, 0)^T,$$
so $u = x$ and $0 \leq u \leq 1$. The boundary $y = -x$ with $-1 \leq x \leq 0$ leads to $(u,v)^T = \Phi^{-1}(x,-x) = (0,x)^T$. Thus $v = x$ and $-1 \leq v \leq 0$. From the boundary $y = 1$ with $-1 \leq x \leq 1$, we have $(u,v)^T = \Phi^{-1}(x,1)$, which leads to
$$u = \tfrac{1}{2}(x + 1) \quad \text{and} \quad v = \tfrac{1}{2}(x - 1).$$
Isolating $x$ in each equation, we obtain $x = 2u - 1$ and $x = 2v + 1$, which means $2u - 1 = 2v + 1$. Thus, $v = u - 1$ with $0 \leq u \leq 1$. This completes the characterization of $R$, see Figure 6.4.

Fig. 6.4. Domain R of the parametrization Φ.
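The linear map of Example 6.2.3 and its inverse are easy to check on the corners of the two triangles. This is a minimal sketch; the function names `phi` and `phi_inv` and the sample point are illustrative.

```python
def phi(u, v):
    # Linear parametrization (u, v) -> (x, y).
    return (u + v, u - v)

def phi_inv(x, y):
    # Inverse map (x, y) -> (u, v).
    return ((x + y) / 2, (x - y) / 2)

corners_R = [(0, 0), (1, 0), (0, -1)]
corners_D = [phi(u, v) for (u, v) in corners_R]
print(corners_D)

round_trip = phi_inv(*phi(0.3, -0.2))
print(round_trip)
```

Because a linear map sends triangles to triangles, matching the three corners is enough to see that $\Phi$ maps $R$ onto $D$.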
Our main examples of parametrizations in $\mathbb{R}^3$ use cylindrical and spherical coordinates. We consider an example of each.

Example 6.2.4. Consider the region $E$ inside the cylinder $x^2 + y^2 = 1$ and inside the sphere $x^2 + y^2 + z^2 = 4$, see Figure 6.5 on the left. We can describe this region in Cartesian coordinates as
$$E = \left\{ (x,y,z) \mid -1 \leq x \leq 1,\ -\sqrt{1 - x^2} \leq y \leq \sqrt{1 - x^2},\ -\sqrt{4 - x^2 - y^2} \leq z \leq \sqrt{4 - x^2 - y^2} \right\}.$$
If we use cylindrical coordinates, the cylinder equation becomes $r^2 = 1$ and the sphere $r^2 + z^2 = 4$. The cylindrical coordinates parametrization
$$\Phi(r,\theta,z) = (r\cos\theta, r\sin\theta, z)^T$$
with domain
$$R = \left\{ (r,\theta,z) \mid 0 < r \leq 1,\ 0 \leq \theta \leq 2\pi,\ -\sqrt{4 - r^2} \leq z \leq \sqrt{4 - r^2} \right\}$$
gives a simpler description of $E$. Indeed, the domain $R$ is the region extending above and below the rectangle $0 < r \leq 1$ and $0 \leq \theta \leq 2\pi$, bounded above and below by $z = \pm\sqrt{4 - r^2}$. See Figure 6.5 right for the $z \geq 0$ portion of $R$; the $z < 0$ portion is symmetric.

Fig. 6.5. Region E (left) and the portion z ≥ 0 of the domain R of its parametrization Φ (right).
We now turn to a case where spherical coordinates are useful.

Example 6.2.5. Consider a shell region $E$ between two spheres of radius 1 and 2 in the first octant, see Figure 6.6. We use spherical coordinates. To stay in the first octant, one needs $\theta \in [0, \pi/2]$ and $\varphi \in [0, \pi/2]$. We define
$$R = \left\{ (\rho,\theta,\varphi) \mid 1 \leq \rho \leq 2,\ 0 \leq \theta \leq \frac{\pi}{2},\ 0 \leq \varphi \leq \frac{\pi}{2} \right\}$$
which is a box in $(\rho,\theta,\varphi)$ space with sides parallel to the coordinate planes. The domain $R$ is pictured in Figure 6.7.

Fig. 6.6. Region E between spheres of radius 1 and 2.

Fig. 6.7. Box domain R from Example 6.2.5.
6.2.2 Two-Dimensional Surfaces in $\mathbb{R}^3$
We now show examples of parametrizations of two-dimensional surfaces in $\mathbb{R}^3$. Because a surface $S$ is a two-dimensional set, we need two variables to describe $S$; that is, parametrizations are of the form
$$\Phi : D \subset \mathbb{R}^2 \to S.$$
The main example of parametrization is using graphs of functions. Consider a surface $S$ given by $z = f(x,y)$ where $(x,y) \in D$. A parametrization is given by the graph
$$\Phi(x,y) = (x, y, f(x,y)).$$
All quadric surfaces can be written as combinations of the above.

Example 6.2.6. Consider a hyperbolic paraboloid $z = 3x^2 - 4y^2$ with domain $D = \mathbb{R}^2$. Then,
$$\Phi(x,y) = (x, y, 3x^2 - 4y^2)^T$$
is a parametrization. For a hyperboloid of one sheet $x^2 + y^2 - z^2 = 1$ the parametrization is obtained using two mappings:
$$\Phi_1(x,y) = \left(x, y, +\sqrt{x^2 + y^2 - 1}\right)^T \quad \text{and} \quad \Phi_2(x,y) = \left(x, y, -\sqrt{x^2 + y^2 - 1}\right)^T$$
with domain $D = \{(x,y) \mid x^2 + y^2 \geq 1\}$. The circular nature of the domain suggests that polar coordinates can also be used for the hyperboloid of one sheet. In fact, for all quadrics with circular traces, parametrizations can be achieved with polar coordinates.
Fig. 6.8. Cone intersecting the hemisphere in Example 6.2.8.

Example 6.2.7. Consider the paraboloid $P$ of equation $z = x^2 + y^2$ with disk domain $D$ of radius 3 centered at the origin. A simple parametrization with polar coordinates can be done with $z = r^2$:
$$\Phi(r,\theta) = (r\cos\theta, r\sin\theta, r^2)^T$$
with $D = \{(r,\theta) \mid 0 \leq r \leq 3,\ 0 \leq \theta \leq 2\pi\}$.
Example 6.2.8. Consider the piece of sphere $S$ of radius $a > 0$ lying inside the cone $z^2 = x^2 + y^2$ in the upper hemisphere. Parametrizations exist in Cartesian, cylindrical and spherical coordinates. The cylindrical coordinate system parametrization is obtained with the cone given by $z = r$ and the sphere becomes $r^2 + z^2 = a^2$. The intersection of the cone and sphere is obtained by substituting $z^2 = r^2$ from the cone into the sphere equation, from which we obtain $2r^2 = a^2$. Solving this equation yields $r = a/\sqrt{2}$. Therefore, the cone and the sphere intersect in the upper hemisphere at the height $z = a/\sqrt{2}$. A parametrization of the portion of sphere is
$$\Phi(r,\theta) = \left(r\cos\theta, r\sin\theta, \sqrt{a^2 - r^2}\right)^T$$
with
$$D = \{(r,\theta) \mid 0 \leq r \leq a/\sqrt{2},\ 0 \leq \theta \leq 2\pi\}.$$
We now obtain parametrizations for quadrics with non-circular traces.

Example 6.2.9. Consider the ellipsoid $S$ given by
$$\frac{x^2}{a^2} + \frac{y^2}{b^2} + \frac{z^2}{c^2} = 1.$$
A parametrization for $S$ is
$$\Phi(\varphi,\theta) = (a\cos\theta\sin\varphi, b\sin\theta\sin\varphi, c\cos\varphi)^T$$
with domain $D = \{(\varphi,\theta) \mid 0 \leq \theta \leq 2\pi,\ 0 \leq \varphi \leq \pi\}$. This is verified by substituting $x = a\cos\theta\sin\varphi$, $y = b\sin\theta\sin\varphi$ and $z = c\cos\varphi$ in the ellipsoid equation.

Example 6.2.10. Consider the hyperboloid of one sheet
$$\frac{x^2}{a^2} + \frac{y^2}{b^2} - \frac{z^2}{c^2} = 1.$$
Let $x = au$, $y = bv$ and $z = cw$ and substitute in the equation for the hyperboloid of one sheet. We obtain
$$u^2 + v^2 - w^2 = 1.$$
Isolating $w$, we obtain
$$w^2 = (u^2 + v^2) - 1.$$
Writing $u, v$ in polar coordinates, $u = r\cos\theta$, $v = r\sin\theta$, the equation for $w$ simplifies to $w^2 = r^2 - 1$. A parametrization is given by the mappings
$$\Phi_+(r,\theta) = \left(ar\cos\theta, br\sin\theta, c\sqrt{r^2 - 1}\right)^T, \qquad \Phi_-(r,\theta) = \left(ar\cos\theta, br\sin\theta, -c\sqrt{r^2 - 1}\right)^T$$
with $D = \{(r,\theta) \mid r \geq 1,\ 0 \leq \theta \leq 2\pi\}$.
Here is a more exotic example of a very important surface in mathematics.

Fig. 6.9. Torus

Example 6.2.11. A torus is a two-dimensional surface which can be described as the Cartesian product of two circles. It is the shape of donuts and bagels! See Figure 6.9. Several parametrizations exist and we look at one using angles. Consider a circle $S^1_a$ of radius $a$ in the $yz$-plane with centre at $(0, b, 0)$ where $b > a$. We obtain the torus surface by taking the circle $S^1_a$ and making a full revolution of the circle around the $z$-axis, keeping the centre in the $xy$-plane; the centre thus traces a second circle of radius $b$ which we denote $S^1_b$, see Figure 6.10.

Fig. 6.10. The two circles $S^1_a$ and $S^1_b$ generating the torus

We now give a description in terms of coordinates. The circle $S^1_a$ has parametrization $y = b + a\cos u$, $z = a\sin u$ with $0 \leq u \leq 2\pi$. Because the height $z$ does not depend on the location of $S^1_a$ along its revolution, $z = a\sin u$ on the whole torus. This is not true for the $x$ and $y$ coordinates, which depend on the location along $S^1_b$, parametrized by $x = b\cos v$, $y = b\sin v$ with $0 \leq v \leq 2\pi$. Note that the circle $S^1_a$ has $x = 0$ coordinate, which corresponds to $b\cos v$ with $v = \pi/2$. A point of the revolved circle lies at distance $b + a\cos u$ from the $z$-axis, so putting together the two circle parametrizations we obtain
$$x = (b + a\cos u)\cos v \quad \text{and} \quad y = (b + a\cos u)\sin v.$$
The parametrization of the torus is
$$\Phi(u,v) = \big((b + a\cos u)\cos v, (b + a\cos u)\sin v, a\sin u\big)^T$$
with domain $D = \{(u,v) \mid 0 \leq u \leq 2\pi,\ 0 \leq v \leq 2\pi\}$.
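A torus obtained by revolving a circle of radius $a$ centred at distance $b$ from the $z$-axis satisfies the implicit equation $(\sqrt{x^2 + y^2} - b)^2 + z^2 = a^2$. The sketch below evaluates that equation on a grid of parameter values, using the distance $b + a\cos u$ from the $z$-axis in both the $x$ and $y$ components. The values $a = 1$, $b = 3$ and the function names are illustrative assumptions.

```python
import math

def torus(u, v, a, b):
    rho = b + a * math.cos(u)  # distance of the point from the z-axis
    return (rho * math.cos(v), rho * math.sin(v), a * math.sin(u))

def torus_residual(x, y, z, a, b):
    # Zero exactly on the torus (sqrt(x^2 + y^2) - b)^2 + z^2 = a^2.
    return (math.hypot(x, y) - b) ** 2 + z ** 2 - a ** 2

a, b = 1.0, 3.0  # sample radii with b > a
n = 24
max_residual = max(
    abs(torus_residual(*torus(2 * math.pi * i / n, 2 * math.pi * j / n, a, b), a, b))
    for i in range(n + 1)
    for j in range(n + 1)
)
print(max_residual)  # rounding-level value
```

Replacing $\cos u$ by $\sin u$ in only one of the two horizontal components would make this residual visibly nonzero, so the check is sensitive to that kind of slip.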
Arbitrary Surfaces given by Parametrizations
The examples above show cases where a surface $S$ described either in words or by an equation can be written in terms of a parametrization $\Phi : D \subset \mathbb{R}^2 \to S$. Recall that an arbitrary vector function $r : [a,b] \to \mathbb{R}^n$ defines a curve $C$, but this curve may not necessarily have a description in words or via a formula. The same is true for (differentiable) mappings $\Phi : D \subset \mathbb{R}^2 \to \mathbb{R}^3$: they always define surfaces, but there may not exist an easy equation in Cartesian coordinates describing the surface. The following example is solvable, but it is easy to produce mappings $\Phi$ for which such a description would be challenging or even impossible.

Example 6.2.12. The mapping
$$\Phi(u,v) = (u^2, u + v, v - u)^T$$
with domain $D = \{(u,v) \mid 0 \leq u \leq 1,\ 0 \leq v \leq 1\}$ defines a surface as shown in Figure 6.11. We express the surface given by $\Phi$ as a single equation in Cartesian coordinates. Multiplying $y$ and $z$ we have $yz = v^2 - u^2$ and so $x + yz = v^2$. Since $y^2 + z^2 = 2(u^2 + v^2)$, we can also obtain
$$v^2 = \tfrac{1}{2}(y^2 + z^2) - x.$$
Therefore $\Phi$ is the parametrization of
$$\tfrac{1}{2}(y^2 + z^2) - yz - 2x = 0.$$
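The Cartesian equation found in Example 6.2.12 can be confirmed by substituting the parametrization at sample parameter values. A minimal sketch, with illustrative function names and grid size:

```python
def phi(u, v):
    # Parametrization (x, y, z) = (u^2, u + v, v - u).
    return (u * u, u + v, v - u)

def implicit(x, y, z):
    # Left-hand side of (1/2)(y^2 + z^2) - yz - 2x = 0.
    return 0.5 * (y * y + z * z) - y * z - 2 * x

n = 10
worst = max(
    abs(implicit(*phi(i / n, j / n)))
    for i in range(n + 1)
    for j in range(n + 1)
)
print(worst)  # rounding-level value
```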
Fig. 6.11. Surface given by $\Phi(u,v) = (u^2, u + v, v - u)^T$

Exercises
(1) Find parametrizations of the following regions in $\mathbb{R}^2$.
(a) $E$ is the region enclosed by the circles of radius 1 and 2 in the first, second and third quadrants, the $x$-axis between 1 and 2, and the $y$-axis between $-1$ and $-2$.
(b) $E$ is the region bounded by the $x$-axis between 0 and 1, the line $y = (x - 1)$ for $1 \leq x \leq 2$, the line $y = 1$ and the $y$-axis.
(c) $E$ is the region bounded by the curve $C$ given by
$$r(t) = ((t+1)\cos(2\pi t), (t+1)\sin(2\pi t))$$
with $t \in [0,1]$ and the line joining the points $(1,0)$ and $(2,0)$.
(2) In each case, show that the parametrization $\Phi$ maps $R$ into $E$.
(a) Let $E$ be the half-disk of radius $\sqrt{2}$ with $y \geq -x$,
$$\Phi(r,\theta) = (r\cos\theta, r\sin\theta)$$
and $D = \{(r,\theta) \mid 0 \leq r \leq \sqrt{2},\ -\pi/4 \leq \theta \leq 3\pi/4\}$.
(b) Let $E = \left\{(x,y) \mid -2 \leq x \leq 2,\ -\tfrac{1}{2} + \tfrac{|x|}{4} \leq y \leq \tfrac{1}{2} - \tfrac{|x|}{4}\right\}$,
$$\Phi(u,v) = \left(2(-1 + u + v), \tfrac{1}{2}(-u + v)\right)$$
and $R = \{(u,v) \mid 0 \leq u \leq 1,\ 0 \leq v \leq 1\}$.
(3) Find parametrizations of the following regions in $\mathbb{R}^3$.
(a) $E$ is the region bounded below by the paraboloid $z = x^2 + y^2$ and bounded above by the sphere of radius 1, with $z > 0$.
(b) $E$ is the region bounded by the cylinder $x^2 + z^2 = 1$ with $x \geq 0$, the plane $x = 0$ and the planes $y = -1$, $y = 1$.
(c) $E$ is the region bounded below by the plane $z = 1$, above by the sphere of radius 2 and in the first octant.
(d) $E$ is the region bounded above by the paraboloid $z = x^2 + y^2$, inside the cylinder $(x-1)^2 + y^2 = 1$ and above the $xy$-plane. (Hint: attempt a parametrization using cylindrical coordinates.)
(4) In each case, show that the parametrization $\Phi$ maps $R$ into $E$.
(a) $E = \{(x,y,z) \mid y \geq 0\}$,
$$\Phi(\rho,\theta,\varphi) = (\rho\cos\theta\sin\varphi, \rho\sin\theta\sin\varphi, \rho\cos\varphi)$$
with $R = \{(\rho,\theta,\varphi) \mid \rho > 0,\ 0 \leq \theta \leq \pi,\ 0 \leq \varphi \leq \pi/2\}$.
(b) $E$ is the tetrahedron with vertices $(-1, 0, 0)$, $(1/2, -\sqrt{3}/2, 0)$, $(1/2, \sqrt{3}/2, 0)$ and $(0, 0, 1)$, with
$$\Phi(u,v,w) = \left(-1 + \tfrac{\sqrt{3}}{2}(u + v), -\tfrac{1}{2}(-u + v), w\right)$$
and $R = \{(u,v,w) \mid u, v, w \geq 0,\ 0 \leq u + v + w \leq 1\}$.
(5) Find parametrizations of the following surfaces $S$ in $\mathbb{R}^3$. Note that you have to determine correctly the domain $D$ in each case.
(a) $S = \operatorname{graph}(f)$ where $f(x,y) = 4 - (x^2 + y^2)$, not extending below the $xy$-plane.
(b) $S$ is the hyperbolic paraboloid $z = x^2 - 2y^2$ for $x \geq 0$.
(c) $S$ is the sphere of radius 1 between the planes $z = 0$ and $y = 0$ and for $y \geq 0$.
(d) $S$ is one eighth of the torus of Example 6.2.11 lying in the first octant.
(6) Consider $\Phi_1 : \mathbb{R}^3 \to \mathbb{R}^3$ given by $\Phi_1(u,v,w) = (au, bv, cw)$; it is a rescaling of all of $\mathbb{R}^3$. Show that $\Phi_1$ composed with the parametrization $\Phi_2(\theta,\varphi) = (\cos\theta\sin\varphi, \sin\theta\sin\varphi, \cos\varphi)$ with $0 \leq \theta \leq 2\pi$, $0 \leq \varphi \leq \pi$ gives the parametrization of the ellipsoid in Example 6.2.9.
6.3 Differential Operators

The gradient and differential are examples of differential operators. This section gives formal definitions of those differential operators and introduces new differential operators that are important in Vector Calculus and in the theory of partial differential equations (which is not a topic of this book).
6.3.1 Differentials and Gradients
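In Cartesian coordinates, the link between the differential and the gradient of a differentiable function $f(x_1, \dots, x_n)$ is straightforward:
$$df = \nabla f \cdot (dx_1, \dots, dx_n) = \nabla f\,(dx_1, \dots, dx_n)^T$$
where we use the fact that the scalar product in Cartesian coordinates corresponds to multiplication of a row and column vector. We begin by showing that the structure of the differential is invariant under changes of coordinates. This is illustrated with the next example, which shows the process by which the general case is proved.

Example 6.3.1. Consider $\mathbb{R}^2$ and the change of coordinates to polar
$$x = x(r,\theta) = r\cos\theta \quad \text{and} \quad y = y(r,\theta) = r\sin\theta. \tag{6.2}$$
The inverse change of coordinates is
$$r = r(x,y) = \sqrt{x^2 + y^2} \quad \text{and} \quad \theta = \theta(x,y) = \arctan(y/x). \tag{6.3}$$
Let $f(x,y)$ be a differentiable function, then we write
$$df = \nabla f(x,y)\,(dx, dy)^T.$$
To convert $df$ to polar coordinates, we must obtain the partial derivatives as functions of $r$ and $\theta$ using (6.3):
$$\frac{\partial f}{\partial x} = \frac{\partial f}{\partial r}\frac{\partial r}{\partial x} + \frac{\partial f}{\partial \theta}\frac{\partial \theta}{\partial x} \quad \text{and} \quad \frac{\partial f}{\partial y} = \frac{\partial f}{\partial r}\frac{\partial r}{\partial y} + \frac{\partial f}{\partial \theta}\frac{\partial \theta}{\partial y},$$
where the partial derivatives of $r, \theta$ with respect to $x$ and $y$ are expressed in polar coordinates. That is, in matrix form we have
$$\nabla f(x,y) = \left(\frac{\partial f}{\partial r}, \frac{\partial f}{\partial \theta}\right) \begin{pmatrix} \dfrac{\partial r}{\partial x} & \dfrac{\partial r}{\partial y} \\ \dfrac{\partial \theta}{\partial x} & \dfrac{\partial \theta}{\partial y} \end{pmatrix} = \left(\frac{\partial f}{\partial r}, \frac{\partial f}{\partial \theta}\right) \begin{pmatrix} \dfrac{x}{\sqrt{x^2+y^2}} & \dfrac{y}{\sqrt{x^2+y^2}} \\ -\dfrac{y}{x^2+y^2} & \dfrac{x}{x^2+y^2} \end{pmatrix} = \left(\frac{\partial f}{\partial r}, \frac{\partial f}{\partial \theta}\right) \begin{pmatrix} \cos\theta & \sin\theta \\ -\dfrac{\sin\theta}{r} & \dfrac{\cos\theta}{r} \end{pmatrix}. \tag{6.4}$$
Using (6.2) we compute
$$dx = \frac{\partial x}{\partial r}\,dr + \frac{\partial x}{\partial \theta}\,d\theta \quad \text{and} \quad dy = \frac{\partial y}{\partial r}\,dr + \frac{\partial y}{\partial \theta}\,d\theta \tag{6.5}$$
which we rewrite
$$(dx, dy)^T = \begin{pmatrix} \dfrac{\partial x}{\partial r} & \dfrac{\partial x}{\partial \theta} \\ \dfrac{\partial y}{\partial r} & \dfrac{\partial y}{\partial \theta} \end{pmatrix} (dr, d\theta)^T = \begin{pmatrix} \cos\theta & -r\sin\theta \\ \sin\theta & r\cos\theta \end{pmatrix} (dr, d\theta)^T. \tag{6.6}$$
Notice that the matrices in (6.4) and (6.6) are inverses of each other. Therefore,
$$df = \nabla f(x,y)\,(dx, dy)^T = \left(\frac{\partial f}{\partial r}, \frac{\partial f}{\partial \theta}\right)(dr, d\theta)^T.$$
However, note that we do not write the vector of partial derivatives with respect to the polar coordinates in terms of the $\nabla$ operator. The issue of change of coordinates for the gradient vector is trickier.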
We illustrate the calculation above with an explicit computation.

Example 6.3.2. Let $f(x,y) = x^2 y$ and consider the polar change of coordinates. We compute first $df = 2xy\,dx + x^2\,dy$ and using (6.5) we obtain
$$\begin{aligned} df &= 2r^2\cos\theta\sin\theta\,(\cos\theta\,dr - r\sin\theta\,d\theta) + r^2\cos^2\theta\,(\sin\theta\,dr + r\cos\theta\,d\theta) \\ &= 3r^2\cos^2\theta\sin\theta\,dr + (-2r^3\cos\theta\sin^2\theta + r^3\cos^3\theta)\,d\theta. \end{aligned}$$
Writing $f$ directly in polar coordinates we have $f(r,\theta) = r^3\cos^2\theta\sin\theta$ and
$$\begin{aligned} df &= 3r^2\cos^2\theta\sin\theta\,dr + r^3(-2\cos\theta\sin^2\theta + \cos^3\theta)\,d\theta \\ &= 3r^2\cos^2\theta\sin\theta\,dr + r^3\cos\theta\,(-2\sin^2\theta + \cos^2\theta)\,d\theta. \end{aligned}$$
The two results match, as the previous example tells us they must.
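The coefficients of $dr$ and $d\theta$ in Example 6.3.2 are the partial derivatives of $f(r,\theta) = r^3\cos^2\theta\sin\theta$, so they can be compared against finite differences. The sketch below does this at one sample point; the helper `central`, the step size and the sample point are illustrative choices.

```python
import math

def f_polar(r, t):
    return r**3 * math.cos(t)**2 * math.sin(t)

def coeff_dr(r, t):
    # Coefficient of dr: 3 r^2 cos^2(theta) sin(theta).
    return 3 * r**2 * math.cos(t)**2 * math.sin(t)

def coeff_dtheta(r, t):
    # Coefficient of dtheta: r^3 cos(theta) (cos^2(theta) - 2 sin^2(theta)).
    return r**3 * math.cos(t) * (math.cos(t)**2 - 2 * math.sin(t)**2)

def central(func, x, h=1e-6):
    """Central-difference estimate of func'(x)."""
    return (func(x + h) - func(x - h)) / (2 * h)

r0, t0 = 1.7, 0.4
err_r = abs(central(lambda r: f_polar(r, t0), r0) - coeff_dr(r0, t0))
err_t = abs(central(lambda t: f_polar(r0, t), t0) - coeff_dtheta(r0, t0))
print(err_r, err_t)  # both small
```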
We now state the general result about changes of coordinates.

Proposition 6.3.3. Let $(x_1, \dots, x_n)$ be Cartesian coordinates, $f(x_1, \dots, x_n)$ be a differentiable function and let $x_i = x_i(u_1, \dots, u_n)$ for $i = 1, \dots, n$ be a change of coordinates. Then
$$df = \frac{\partial f}{\partial u_1}\,du_1 + \cdots + \frac{\partial f}{\partial u_n}\,du_n.$$

Proof. The proof follows the approach outlined in the polar coordinates example above. Write the inverse change of coordinates $u_i = u_i(x_1, \dots, x_n)$ for $i = 1, \dots, n$. We know $df = \nabla f \cdot (dx_1, \dots, dx_n)$,
$$\nabla f(x_1, \dots, x_n) = \left(\frac{\partial f}{\partial u_1}, \dots, \frac{\partial f}{\partial u_n}\right) \left.\frac{\partial(u_1, \dots, u_n)}{\partial(x_1, \dots, x_n)}\right|_{x_i = x_i(u_1, \dots, u_n)}$$
and
$$(dx_1, \dots, dx_n)^T = \frac{\partial(x_1, \dots, x_n)}{\partial(u_1, \dots, u_n)}\,(du_1, \dots, du_n)^T.$$
The Jacobian matrices
$$\left.\frac{\partial(u_1, \dots, u_n)}{\partial(x_1, \dots, x_n)}\right|_{x_i = x_i(u_1, \dots, u_n)} \quad \text{and} \quad \frac{\partial(x_1, \dots, x_n)}{\partial(u_1, \dots, u_n)}$$
are inverses of each other as they are obtained via the coordinate change and its inverse. Therefore, the formula follows.
Because the structure of $df$ is unchanged with respect to coordinate changes, this enables us to extend the definition of differential which we obtained for the Cartesian coordinate system to any coordinate system.

Definition 6.3.4. Let $q_1, \dots, q_n$ be a coordinate system in $\mathbb{R}^n$ and $f(q_1, \dots, q_n)$ be a differentiable function. Then, the differential of $f$ is
$$df = \frac{\partial f}{\partial q_1}\,dq_1 + \cdots + \frac{\partial f}{\partial q_n}\,dq_n.$$

As mentioned above, changing coordinates for the gradient is not so straightforward and we see below that the structure of the vector may well be modified from the Cartesian case. But before we can present this, more fundamentally, we have not yet provided a formal definition of gradient. We do this now.

Definition 6.3.5. Let $q_1, \dots, q_n$ be a coordinate system in $\mathbb{R}^n$ and $f(q_1, \dots, q_n)$ be a differentiable function. Then, the gradient of $f$ is the vector $\nabla_q f$ such that
$$df = \nabla_q f \cdot (dq_1, \dots, dq_n),$$
where $\cdot$ is the scalar product for tangent space vectors in the coordinate system $(q_1, \dots, q_n)$.
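We now work in the polar coordinate case again and show how the structure of $\nabla f$ changes, and also that it depends on the basis chosen for the tangent space in polar coordinates.

Example 6.3.6. Consider $\mathbb{R}^2$ with polar coordinates where the scalar product for tangent vectors in $\mathbb{R}^2$ in the basis $(\partial/\partial r, \partial/\partial\theta)$ is the one defined in Chapter 3, see (3.5). For a differentiable function $f(r,\theta)$, let $\nabla_{(r,\theta)} f = (\nabla_r f, \nabla_\theta f)$ and by definition
$$df = \nabla_{(r,\theta)} f \cdot (dr, d\theta) = \nabla_r f\,dr + r^2\,\nabla_\theta f\,d\theta.$$
Therefore,
$$\nabla_{(r,\theta)} f = \left(\frac{\partial f}{\partial r}, \frac{1}{r^2}\frac{\partial f}{\partial \theta}\right).$$
In many other references, the gradient in polar coordinates does not have this form and this is a consequence of those authors normalizing the basis of $T_p\mathbb{R}^2$ in polar coordinates:
$$(e_r, e_\theta) := \left(\frac{\partial}{\partial r}, \frac{1}{r}\frac{\partial}{\partial \theta}\right).$$
The alternative formula is obtained as follows:
$$\begin{aligned} \nabla_{(r,\theta)} f &= \frac{\partial f}{\partial r}\frac{\partial}{\partial r} + \frac{1}{r^2}\frac{\partial f}{\partial \theta}\frac{\partial}{\partial \theta} \\ &= \frac{\partial f}{\partial r}\,e_r + r\,\frac{1}{r^2}\frac{\partial f}{\partial \theta}\left(\frac{1}{r}\frac{\partial}{\partial \theta}\right) \\ &= \frac{\partial f}{\partial r}\,e_r + \frac{1}{r}\frac{\partial f}{\partial \theta}\,e_\theta. \end{aligned}$$
Therefore, the gradient in the orthonormal basis is
$$\nabla_{(r,\theta)} f = \left(\frac{\partial f}{\partial r}, \frac{1}{r}\frac{\partial f}{\partial \theta}\right).$$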
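Exercises
(1) Let $(u_1, \dots, u_n) = A(x_1, \dots, x_n)$ where $A$ is an orthogonal matrix ($A^T A = AA^T = I$). Show that the gradient operator has the form
$$\nabla_u = \left(\frac{\partial}{\partial u_1}, \dots, \frac{\partial}{\partial u_n}\right).$$
(2) In each case, show the formula for the gradient operator in the following coordinate systems in the given basis.
(a) In cylindrical coordinates with basis $(\partial/\partial r, \partial/\partial\theta, e_3)$
$$\nabla_{(r,\theta,z)} f = \left(\frac{\partial f}{\partial r}, \frac{1}{r^2}\frac{\partial f}{\partial \theta}, \frac{\partial f}{\partial z}\right)$$
and in its orthonormalized basis
$$\nabla_{(r,\theta,z)} f = \left(\frac{\partial f}{\partial r}, \frac{1}{r}\frac{\partial f}{\partial \theta}, \frac{\partial f}{\partial z}\right).$$
(b) In spherical coordinates with basis $(\partial/\partial\rho, \partial/\partial\varphi, \partial/\partial\theta)$
$$\nabla_{(\rho,\varphi,\theta)} f = \left(\frac{\partial f}{\partial \rho}, \frac{1}{\rho^2}\frac{\partial f}{\partial \varphi}, \frac{1}{\rho^2\sin^2\varphi}\frac{\partial f}{\partial \theta}\right)$$
and in its orthonormalized basis
$$\nabla_{(\rho,\varphi,\theta)} f = \left(\frac{\partial f}{\partial \rho}, \frac{1}{\rho}\frac{\partial f}{\partial \varphi}, \frac{1}{\rho\sin\varphi}\frac{\partial f}{\partial \theta}\right).$$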
6.3.2 Laplacian, Divergence and Curl Differential Operators
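The gradient operator is useful to define other differential operators. We now introduce the Laplacian, the divergence and the curl operators. The Laplacian operator is defined as
$$\Delta = \nabla^2 := \nabla \cdot \nabla,$$
where $\cdot$ is the dot product. For instance, for functions $\mathbb{R}^2 \to \mathbb{R}$ we have
$$\Delta = \left(\frac{\partial}{\partial x}, \frac{\partial}{\partial y}\right) \cdot \left(\frac{\partial}{\partial x}, \frac{\partial}{\partial y}\right) = \frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2}$$
and for functions from $\mathbb{R}^3 \to \mathbb{R}$
$$\Delta = \left(\frac{\partial}{\partial x}, \frac{\partial}{\partial y}, \frac{\partial}{\partial z}\right) \cdot \left(\frac{\partial}{\partial x}, \frac{\partial}{\partial y}, \frac{\partial}{\partial z}\right) = \frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} + \frac{\partial^2}{\partial z^2}.$$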
Example 6.3.7. We apply the Laplacian operator to $f(x,y) = x^2 y + \sin(xy)$:
$$\Delta f = \frac{\partial^2 f}{\partial x^2} + \frac{\partial^2 f}{\partial y^2} = \big(2y - y^2\sin(xy)\big) + \big(-x^2\sin(xy)\big) = 2y - (x^2 + y^2)\sin(xy).$$
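By transforming the gradient to other coordinate systems, we can obtain the Laplacian after a coordinate change. Of course, this depends on the basis chosen for the tangent vectors. If two relevant choices are available, as in the polar coordinate case, we obtain both.

Example 6.3.8. The Laplacian in polar coordinates is given as follows; recall that the scalar product depends on the basis chosen. In the basis $(\partial/\partial r, \partial/\partial\theta)$, the scalar product has a factor $r^2$ in the second component and
$$\Delta_{(r,\theta)} f = \nabla_{(r,\theta)} \cdot \nabla_{(r,\theta)} f = \left(\frac{\partial}{\partial r}, \frac{1}{r^2}\frac{\partial}{\partial \theta}\right) \cdot \left(\frac{\partial}{\partial r}, \frac{1}{r^2}\frac{\partial}{\partial \theta}\right) f = \frac{\partial^2 f}{\partial r^2} + \frac{1}{r^2}\frac{\partial^2 f}{\partial \theta^2}.$$
We obtain the same result in the orthonormal basis $(e_r, e_\theta)$. Recall that the scalar product does not have an $r^2$ factor now. We have
$$\Delta_{(r,\theta)} f = \nabla_{(r,\theta)} \cdot \nabla_{(r,\theta)} f = \left(\frac{\partial}{\partial r}, \frac{1}{r}\frac{\partial}{\partial \theta}\right) \cdot \left(\frac{\partial}{\partial r}, \frac{1}{r}\frac{\partial}{\partial \theta}\right) f = \frac{\partial^2 f}{\partial r^2} + \frac{1}{r^2}\frac{\partial^2 f}{\partial \theta^2}.$$
Note that this formal product multiplies the components of the operator without differentiating the $r$-dependent scalar product or basis vectors. Transforming $\partial^2 f/\partial x^2 + \partial^2 f/\partial y^2$ directly with the chain rule and (6.2) produces an additional first-order term:
$$\frac{\partial^2 f}{\partial x^2} + \frac{\partial^2 f}{\partial y^2} = \frac{\partial^2 f}{\partial r^2} + \frac{1}{r}\frac{\partial f}{\partial r} + \frac{1}{r^2}\frac{\partial^2 f}{\partial \theta^2}.$$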
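A numerical comparison is a useful safeguard when transforming second-order operators. The sketch below takes $f(x,y) = x^2 y + \sin(xy)$, whose Cartesian Laplacian is $2y - (x^2 + y^2)\sin(xy)$, rewrites it as a function of $(r,\theta)$ and evaluates $f_{rr} + \tfrac{1}{r^2} f_{\theta\theta}$ with and without the first-order term $\tfrac{1}{r} f_r$ of the standard polar Laplacian. Function names, step size and sample point are illustrative assumptions.

```python
import math

def f(x, y):
    return x * x * y + math.sin(x * y)

def laplacian_cartesian(x, y):
    # Closed form of f_xx + f_yy for the function above.
    return 2 * y - (x * x + y * y) * math.sin(x * y)

def F(r, t):
    # The same function expressed in polar coordinates.
    return f(r * math.cos(t), r * math.sin(t))

def laplacian_polar(r, t, h=1e-4, first_order=True):
    """Finite-difference value of F_rr + F_tt / r^2, optionally plus F_r / r."""
    F_rr = (F(r + h, t) - 2 * F(r, t) + F(r - h, t)) / h**2
    F_tt = (F(r, t + h) - 2 * F(r, t) + F(r, t - h)) / h**2
    F_r = (F(r + h, t) - F(r - h, t)) / (2 * h)
    total = F_rr + F_tt / r**2
    if first_order:
        total += F_r / r
    return total

r0, t0 = 1.3, 0.7
exact = laplacian_cartesian(r0 * math.cos(t0), r0 * math.sin(t0))
with_term = laplacian_polar(r0, t0)
without_term = laplacian_polar(r0, t0, first_order=False)
print(exact, with_term, without_term)
```

Only the version containing $\tfrac{1}{r} f_r$ reproduces the Cartesian value at this point, which is why the first-order term cannot be dropped when the operator is applied to a function.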
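We now define two differential operators which appear in the study of vector fields and the integration theory of the next chapters. The divergence operator is defined on differentiable vector fields $F : \mathbb{R}^n \to \mathbb{R}^n$, $F = (f_1, \dots, f_n)^T$, as follows
$$\operatorname{div} F(x) := \nabla \cdot F = \frac{\partial f_1}{\partial x_1} + \cdots + \frac{\partial f_n}{\partial x_n}.$$
The physical meaning of the divergence is discussed in Chapter 10. The curl operator is defined on vector fields in $\mathbb{R}^3$, $F : \mathbb{R}^3 \to \mathbb{R}^3$, $F = (f_1, f_2, f_3)^T$, using the following formula
$$\operatorname{curl} F(x) := \nabla \times F = \begin{vmatrix} i & j & k \\ \dfrac{\partial}{\partial x} & \dfrac{\partial}{\partial y} & \dfrac{\partial}{\partial z} \\ f_1 & f_2 & f_3 \end{vmatrix},$$
where $i, j, k$ correspond to the standard basis vectors of $\mathbb{R}^3$. Therefore, the curl of a vector field in $\mathbb{R}^3$ is also a vector field in $\mathbb{R}^3$. The explicit expression from the formula is
$$\operatorname{curl} F(x) = \left(\frac{\partial f_3}{\partial y} - \frac{\partial f_2}{\partial z},\ \frac{\partial f_1}{\partial z} - \frac{\partial f_3}{\partial x},\ \frac{\partial f_2}{\partial x} - \frac{\partial f_1}{\partial y}\right).$$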
The curl operator can also be applied to vector fields in $\mathbb{R}^2$ of the form $F(x,y) = (F_1(x,y), F_2(x,y))$ by extending the vector field to the $z$ component with 0: $F(x,y,z) = (F_1(x,y), F_2(x,y), 0)$. The physical meaning of the curl operator is also discussed in Chapter 10.

Example 6.3.9. Recall that the radial gravitational force vector field is given by
$$F(\mathbf{x}) = -\frac{mMG}{r^2}\,\frac{\mathbf{x}}{\|\mathbf{x}\|}$$
where $\mathbf{x} = (x,y,z)$ and $r = \|\mathbf{x}\|$. To compute the div and curl, we first decompose $F$ into its components
$$f_1(x,y,z) = \frac{-mMG\,x}{r^3}, \qquad f_2(x,y,z) = \frac{-mMG\,y}{r^3}, \qquad f_3(x,y,z) = \frac{-mMG\,z}{r^3}.$$
Then,
$$\begin{aligned} \operatorname{div} F &= -mMG\,\frac{r^3 - 3rx^2}{r^6} - mMG\,\frac{r^3 - 3ry^2}{r^6} - mMG\,\frac{r^3 - 3rz^2}{r^6} \\ &= -\frac{mMG}{r^6}\big(3r^3 - 3r(x^2 + y^2 + z^2)\big) = 0 \end{aligned}$$
because $x^2 + y^2 + z^2 = r^2$, and
$$\operatorname{curl} F = \frac{mMG}{r^6}\,(3yzr - 3zyr,\ 3rxz - 3rzx,\ 3ryx - 3rxy) = (0, 0, 0).$$
The geometric and physical significance of the values obtained here is explained in Chapter 10.
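The vanishing divergence and curl of the inverse-square field can be checked with central differences away from the origin. In the sketch below the constant `k` stands for the product $mMG$ and is set to 1 arbitrarily; the helper names and the sample point are illustrative.

```python
import math

def F(x, y, z, k=1.0):
    # Inverse-square radial field -k (x, y, z) / r^3; k plays the role of m*M*G.
    r3 = math.sqrt(x * x + y * y + z * z) ** 3
    return (-k * x / r3, -k * y / r3, -k * z / r3)

def partial(i, j, p, h=1e-5):
    """Central-difference estimate of d f_i / d x_j at the point p."""
    up, dn = list(p), list(p)
    up[j] += h
    dn[j] -= h
    return (F(*up)[i] - F(*dn)[i]) / (2 * h)

def divergence(p):
    return sum(partial(i, i, p) for i in range(3))

def curl(p):
    return (partial(2, 1, p) - partial(1, 2, p),
            partial(0, 2, p) - partial(2, 0, p),
            partial(1, 0, p) - partial(0, 1, p))

point = (1.0, 2.0, -0.5)
print(divergence(point), curl(point))  # all entries close to zero
```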
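The gradient, Laplacian, divergence and curl differential operators are linear, that is, for functions $g_1, g_2 : \mathbb{R}^n \to \mathbb{R}$, vector fields $F_1, F_2 : \mathbb{R}^n \to \mathbb{R}^n$, $G_1, G_2 : \mathbb{R}^3 \to \mathbb{R}^3$ and scalars $\alpha, \beta \in \mathbb{R}$,
$$\begin{aligned} \nabla(\alpha g_1 + \beta g_2) &= \alpha \nabla g_1 + \beta \nabla g_2 \\ \Delta(\alpha g_1 + \beta g_2) &= \alpha \Delta g_1 + \beta \Delta g_2 \\ \operatorname{div}(\alpha F_1 + \beta F_2) &= \alpha \operatorname{div} F_1 + \beta \operatorname{div} F_2 \\ \operatorname{curl}(\alpha G_1 + \beta G_2) &= \alpha \operatorname{curl} G_1 + \beta \operatorname{curl} G_2. \end{aligned} \tag{6.7}$$
The linearity of $\nabla$ is automatic from the linearity of the derivative. The linearity of $\Delta$, div and curl is left for the reader in the exercises section.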
Exercises
(1) Apply the differential operator $\Delta$ to the following functions.
(a) $f(x,y) = \exp(x)\cos(y)$
(b) $f(x,y,z) = x^2 y z^3$
(c) $f(x,y,z) = \tfrac{1}{2} z \exp(2x)\cos(2y)$
(d) $f(x,y,z,w) = (w^2 + x^2)e^{zy}$
(2) A function $f$ is said to be harmonic if $\Delta f = 0$. Which functions in exercise 1 are harmonic?
(3) Apply the differential operator div to the following vector fields and curl if applicable. In each case, plot the vector field using a software package and make a guess about the meaning of the div and curl operators on vector fields.
(a) $F(x,y) = (-y + x(x^2 + y^2), x + y(x^2 + y^2))$
(b) $F(x,y,z) = (x,y,z)$
(c) $F(x,y,z,w) = (x, -y, z, -w)$
(d) $F(x,y) = (-2y + 12x^4 y^2, 6x - 16x^3 y^3)$
(e) $F(x,y,z) = (xy, xz^2, y^2 z)$
(4) Show that $\Delta$, div and curl are linear operators. That is, show that the equations (6.7) are satisfied.
(5) Let $u, v$ be functions and $F$ and $G$ be vector fields. Show the following differential operator identities.
(a) $\nabla(uv) = u\nabla v + v\nabla u$
(b) $\nabla \cdot (uF) = (\nabla u) \cdot F + u(\nabla \cdot F)$
(c) $\nabla \cdot (\nabla \times F) = 0$
(d) $\nabla \times (\nabla u) = 0$
(e) $\nabla \cdot (F \times G) = (\nabla \times F) \cdot G - F \cdot (\nabla \times G)$
(6) Show that the Laplacian operator after the coordinate change is given by the formulae below. Do the calculation using the gradient obtained for both bases in Problem (2) of Section 6.3.1.
(a) cylindrical coordinates:
$$\Delta f = \frac{\partial^2 f}{\partial r^2} + \frac{1}{r^2}\frac{\partial^2 f}{\partial \theta^2} + \frac{\partial^2 f}{\partial z^2}.$$
(b) spherical coordinates:
$$\Delta f = \frac{\partial^2 f}{\partial \rho^2} + \frac{1}{\rho^2}\frac{\partial^2 f}{\partial \varphi^2} + \frac{1}{\rho^2\sin^2\varphi}\frac{\partial^2 f}{\partial \theta^2}.$$
6.4 Application of Clairaut's theorem to 1-forms

We begin by considering 1-forms defined in $U \subset \mathbb{R}^2$ where $U$ is open and for which the coefficients are $C^1$. Let
$$\omega = a_1(x,y)\,dx + a_2(x,y)\,dy$$
and suppose that $\omega$ is exact, so that there exists a $C^2$ function $f : U \to \mathbb{R}$ with
$$a_1 = \frac{\partial f}{\partial x} \quad \text{and} \quad a_2 = \frac{\partial f}{\partial y}.$$
Then, from Clairaut's theorem, we have the equality of the mixed second derivatives of $f$. This leads to the following equalities
$$\frac{\partial a_1}{\partial y} = \frac{\partial^2 f}{\partial y \partial x} = \frac{\partial^2 f}{\partial x \partial y} = \frac{\partial a_2}{\partial x}.$$
Thus, we can summarize this result as follows:

Theorem 6.4.1. If $\omega = a_1(x,y)\,dx + a_2(x,y)\,dy$ is exact, then
$$\frac{\partial a_2}{\partial x} - \frac{\partial a_1}{\partial y} = 0. \tag{6.8}$$
This gives us a negative criterion for verifying whether $\omega$ is exact; that is, if the condition on the partial derivatives of $a_1$ and $a_2$ is not satisfied, then $\omega$ is not exact. We look at an example.

Example 6.4.2. Let $\omega = 2xy\,dx + x^2 y\,dy$ with $a_1(x,y) = 2xy$ and $a_2(x,y) = x^2 y$. Then
$$\frac{\partial a_2}{\partial x} - \frac{\partial a_1}{\partial y} = 2xy - 2x \neq 0$$
and so $\omega$ is not exact.
We return to Example 4.3.7 from Chapter 4.

Example 6.4.3. Consider
$$\omega = -\frac{y}{x^2 + y^2}\,dx + \frac{x}{x^2 + y^2}\,dy$$
where the coefficients are $C^1$ on $U = \mathbb{R}^2 \setminus \{(0,0)\}$. We now compute
$$\frac{\partial a_2}{\partial x} - \frac{\partial a_1}{\partial y} = \frac{y^2 - x^2}{(x^2 + y^2)^2} - \frac{y^2 - x^2}{(x^2 + y^2)^2} = 0.$$
But, we know that $\omega$ is not exact.

Therefore, the opposite implication is not necessarily true, that is,
$$\frac{\partial a_2}{\partial x} - \frac{\partial a_1}{\partial y} = 0$$
does not necessarily imply that $\omega = a_1(x,y)\,dx + a_2(x,y)\,dy$ is exact.
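A standard way to see that a 1-form satisfying (6.8) can still fail to be exact is to integrate it around a closed curve: an exact form integrates to zero. The sketch below uses the angular form $(-y\,dx + x\,dy)/(x^2 + y^2)$, the form whose local potential $-\arctan(x/y)$ is quoted later in this section; the function names, step size and number of quadrature points are illustrative assumptions.

```python
import math

def a1(x, y):
    return -y / (x * x + y * y)

def a2(x, y):
    return x / (x * x + y * y)

def closedness_defect(x, y, h=1e-5):
    """Central-difference estimate of d a2/dx - d a1/dy."""
    da2_dx = (a2(x + h, y) - a2(x - h, y)) / (2 * h)
    da1_dy = (a1(x, y + h) - a1(x, y - h)) / (2 * h)
    return da2_dx - da1_dy

def circle_integral(n=1000):
    """Midpoint-rule integral of the 1-form around the unit circle."""
    total = 0.0
    dt = 2 * math.pi / n
    for k in range(n):
        t = (k + 0.5) * dt
        x, y = math.cos(t), math.sin(t)
        dx, dy = -math.sin(t), math.cos(t)  # derivative of the path
        total += (a1(x, y) * dx + a2(x, y) * dy) * dt
    return total

defect = closedness_defect(0.7, -1.2)
loop = circle_integral()
print(defect, loop)  # defect near 0, loop integral near 2*pi
```

The nonzero value of the loop integral rules out a potential defined on all of $\mathbb{R}^2 \setminus \{(0,0)\}$, even though condition (6.8) holds at every point.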
We see below that there are cases where this implication is in fact true. Before addressing this question, we generalize the above argument to $C^1$ differentiable 1-forms in $U \subset \mathbb{R}^n$. Suppose that
$$\omega = \sum_{i=1}^n a_i(x_1, \dots, x_n)\,dx_i$$
is exact, then there exists a $C^2$ function $f : U \subset \mathbb{R}^n \to \mathbb{R}$ such that
$$a_i = \frac{\partial f}{\partial x_i}$$
for all $i = 1, \dots, n$. From Clairaut's theorem, we can write the equality of mixed partial derivatives for every $i, j$, for which we only need to consider $i < j$. We obtain
$$\frac{\partial a_i}{\partial x_j} = \frac{\partial^2 f}{\partial x_j \partial x_i} = \frac{\partial^2 f}{\partial x_i \partial x_j} = \frac{\partial a_j}{\partial x_i}.$$
We illustrate this calculation in the case $n = 3$. We only need to consider the cases $(i,j) = (1,2)$, $(i,j) = (1,3)$ and $(i,j) = (2,3)$, therefore
$$\frac{\partial a_2}{\partial x_1} - \frac{\partial a_1}{\partial x_2} = 0, \qquad \frac{\partial a_3}{\partial x_1} - \frac{\partial a_1}{\partial x_3} = 0, \qquad \frac{\partial a_3}{\partial x_2} - \frac{\partial a_2}{\partial x_3} = 0.$$
We now summarize the calculation above in the following result.

Theorem 6.4.4. Suppose that the $C^1$ differentiable 1-form
$$\omega = \sum_{i=1}^n a_i(x_1, \dots, x_n)\,dx_i$$
defined in $U \subset \mathbb{R}^n$ is exact. Then,
$$\frac{\partial a_j}{\partial x_i} - \frac{\partial a_i}{\partial x_j} = 0 \tag{6.9}$$
for all $i, j = 1, \dots, n$ with $i < j$. This means that if at least one of the conditions in (6.9) is not satisfied, then $\omega$ is not exact.
Let us return to Example 4.3.7 and consider the function f (x, y) =
− arctan(x/y).
It is a straightforward exercise to check that df = ω and this would make f a potential function for ω and therefore exact! How can we explain this seemingly contradictory statement? Note that f is not defined for y = 0, therefore f cannot be a valid potential function for all (x, y). This means that ω is not “globally” exact on all its domain of definition R2 \ {(0, 0)}. We say that a 1-form ω defined on an open subset U ⊂ Rn is locally exact if for every p ∈ U , there exists a neighborhood V ⊂ U of p and a function f : V → R such that ω = df . Therefore, we can say that ω in Example 4.3.7 is locally exact by choosing the domain U = R2 \ {(0, 0)}. We now state a computational criterion which gurarantees whether a 1-form is locally exact.
Theorem 6.4.5. Let $\omega$ be a $C^1$ differentiable 1-form defined in $U \subset \mathbb{R}^n$ where $U$ is an open subset. Then, $\omega$ is locally exact if and only if conditions (6.9) are satisfied.
Proof. We only need to prove that if conditions (6.9) are satisfied, then $\omega$ is locally exact. The other implication is given in Theorem 6.4.4. Let $p = (x_1^0, \ldots, x_n^0) \in U$ and $\delta > 0$ be small enough so that the ball
\[ B_\delta(p) := \{ x = (x_1, \ldots, x_n) \in \mathbb{R}^n \mid \|x - p\| < \delta \} \subset U. \]
Let $x = (x_1, \ldots, x_n) \in B_\delta(p)$ be an arbitrary point and consider the straight line path $L$ joining $p$ to $x$ with parametrization $r(t) = p + t(x - p)$ for $t \in [0, 1]$. Note that
\[ r'(t) = x - p = (x_1 - x_1^0, \ldots, x_n - x_n^0). \]
We are now ready to define the potential function as
\[ f(x) = \int_L \omega = \int_0^1 r^*\omega = \int_0^1 \sum_{i=1}^{n} a_i(r(t))\,(x_i - x_i^0)\, dt. \]
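Thus, we need to verify that $df = \omega$ and this is where the conditions (6.9) are crucial. We do this by computing the partial derivative
\[ \begin{aligned} \frac{\partial f}{\partial x_j}(x) &= \frac{\partial}{\partial x_j} \int_0^1 \sum_{i=1}^{n} a_i(r(t))\,(x_i - x_i^0)\, dt \\ &= \int_0^1 \left[ \sum_{i=1}^{n} \frac{\partial}{\partial x_j}\Big( a_i(p + t(x - p)) \Big)(x_i - x_i^0) + a_j(r(t))\, \frac{\partial (x_j - x_j^0)}{\partial x_j} \right] dt \\ &= \int_0^1 \left[ \sum_{i=1}^{n} \frac{\partial a_i}{\partial x_j}(r(t))\,(x_i - x_i^0)\, t + a_j(r(t)) \right] dt \end{aligned} \tag{6.10} \]
where we can differentiate under the integral sign because the integral is with respect to $t$ and the functions involved are continuous. The factor $t$ in the last line comes from the chain rule applied to $a_i(p + t(x - p))$. We know from conditions (6.9) that for all $i \neq j$ we have
\[ \frac{\partial a_i}{\partial x_j} - \frac{\partial a_j}{\partial x_i} = 0. \]
We make the substitution in the last line of (6.10) to obtain
\[ \frac{\partial f}{\partial x_j}(x) = \int_0^1 \left[ \sum_{i=1}^{n} \frac{\partial a_j}{\partial x_i}(r(t))\,(x_i - x_i^0)\, t + a_j(r(t)) \right] dt. \]
We notice the following simplification
\[ \sum_{i=1}^{n} \frac{\partial a_j}{\partial x_i}(r(t))\,(x_i - x_i^0) = \frac{d}{dt}\, a_j(r(t)) \]
and this leads to
\[ \begin{aligned} \frac{\partial f}{\partial x_j}(x) &= \int_0^1 \left[ t\, \frac{d}{dt}\, a_j(r(t)) + a_j(r(t))\, \frac{dt}{dt} \right] dt \\ &= \int_0^1 \frac{d}{dt}\Big( a_j(r(t))\, t \Big)\, dt = a_j(r(1)) = a_j(x). \end{aligned} \]
So $f$ is indeed the potential function for $\omega$:
\[ df = \sum_{i=1}^{n} \frac{\partial f}{\partial x_i}(x)\, dx_i = \sum_{i=1}^{n} a_i(x)\, dx_i = \omega \]
and this completes the proof.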
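The construction in the proof can be tried out numerically. The sketch below is only an illustration under stated assumptions: the function name `potential` is hypothetical, the integral over $[0, 1]$ is approximated with the midpoint rule (a choice made here, not in the text), and the sample form $\omega = yz\,dx + xz\,dy + xy\,dz$ with base point $p = 0$ is picked because its potential $xyz$ is known in closed form.

```python
def potential(coeffs, p, x, steps=2000):
    """Midpoint-rule approximation of
    f(x) = int_0^1 sum_i a_i(r(t)) (x_i - p_i) dt,  r(t) = p + t (x - p)."""
    d = [xi - pi for xi, pi in zip(x, p)]
    total = 0.0
    for k in range(steps):
        t = (k + 0.5) / steps
        r = [pi + t * di for pi, di in zip(p, d)]
        total += sum(a(r) * di for a, di in zip(coeffs, d))
    return total / steps

# Illustrative closed form: omega = yz dx + xz dy + xy dz, with potential xyz.
coeffs = [lambda q: q[1] * q[2],
          lambda q: q[0] * q[2],
          lambda q: q[0] * q[1]]
p = [0.0, 0.0, 0.0]
x = [1.0, 2.0, 3.0]

f_x = potential(coeffs, p, x)

# Finite-difference check that df/dx_1 agrees with a_1(x) = x_2 x_3.
h = 1e-3
df_dx1 = (potential(coeffs, p, [x[0] + h, x[1], x[2]])
          - potential(coeffs, p, [x[0] - h, x[1], x[2]])) / (2 * h)
print(f_x, df_dx1)
```

Both printed values should be close to 6, which is $xyz$ and $a_1(x) = x_2 x_3$ at the point $(1, 2, 3)$.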
Exercises
(1) Determine whether the 1-forms are locally exact.
(a) $\omega = \dfrac{x}{y}\, dx + \dfrac{y}{x}\, dy$
(b) $\omega = (x \cos(e^y) + z^3)\, dx + \left( -\tfrac{1}{2} x^2 \sin(e^y)\, e^y + 2z \right) dy + (3xz^2 + x + 2y)\, dz$
(c) $\omega = (2xy^2 + 3x^2)\, dx + 2x^2 y\, dy + 2z\, dz$
(d) $\omega = \dfrac{-x}{\|(x, y, z)\|^3}\, dx + \dfrac{-y}{\|(x, y, z)\|^3}\, dy + \dfrac{-z}{\|(x, y, z)\|^3}\, dz$.
7 Double and Triple Integrals

This chapter is concerned with the definitions of double and triple integrals as well as practical formulae for their computation. However, we begin with a section introducing a new concept called the wedge product which we apply to differentials. The wedge product is useful to compute areas and volumes and forms the backbone of the definitions of double and triple integrals. The second section is about the double integral and may be considered as review for some readers, except for the last part of the section where we discuss Green’s theorem. This theorem links the computation of line integrals of 1-forms to double integrals and is considered one of the fundamental theorems of vector calculus, the second presented in this text after Theorem 4.3.5. The final section discusses the triple integral over boxed and more general domains.
7.1 Area and Volume Forms

We show in Chapter 1, Section 1.5 that curves can be given an orientation. We generalize the construction to higher dimensional spaces and surfaces. We say that $\mathbb{R}^2$ is positively (negatively) oriented if angles increase when rotating around the origin in the counterclockwise (clockwise) direction. This means that when defining polar coordinates in the typical way, we are giving $\mathbb{R}^2$ a positive orientation.

Orientations of $\mathbb{R}^3$ are obtained as follows. Choose a positive orientation of $\mathbb{R}^2$ with axes $x$ and $y$ and let the $z$ axis be perpendicular to the $x,y$-plane. If the positive numbers of the $z$ axis point up, we say that $\mathbb{R}^3$ has positive orientation, otherwise, it has negative orientation. One can give a positive and negative orientation to $\mathbb{R}^n$ for any $n$ inductively by choosing a positive orientation for $\mathbb{R}^{n-1}$, then the orientation of the $x_n$ axis determines a positive or negative orientation of $\mathbb{R}^n$. However, it is not possible to visualize the orientation in dimensions higher than three. In Chapter 10, we define orientations for two-dimensional surfaces in $\mathbb{R}^3$.

For computing integrals over two dimensional and three dimensional domains, one approximates the area or volume by splitting the region into small pieces for which the area or volume is known. We use parallelograms and parallelepipeds to decompose our regions.
Parallelogram
The area of a parallelogram of base $b$ and height $h$ is $bh$. However, a parallelogram is often given in terms of vectors defining its sides; for instance, let $u = (a_1, a_2)$ and $v = (b_1, b_2)$ be two vectors in $\mathbb{R}^2$ as Figure 7.1 shows. Recall from linear algebra that the area of the parallelogram given by $u$ and $v$ is also given by the norm of the cross-product $u \times v$ and we know that $u \times v = -v \times u$. Therefore, it seems that the cross-product of vectors is a suitable tool for our purposes. One drawback of the cross-product is that it is not generalizable to dimensions higher than three and this alone is reason enough to consider instead the following approach.

Fig. 7.1. Vectors u and v.

The area $A$ of the parallelogram $P(u, v)$ obtained with $u$ and $v$ is given by
\[ A(u, v) = |a_1 b_2 - a_2 b_1|. \]
Note that $P(u, v)$ and $P(v, u)$ give the same parallelogram and in the formula for the area,
\[ A(u, v) = |a_1 b_2 - a_2 b_1| = |b_1 a_2 - b_2 a_1| = A(v, u) \]
because of the absolute value. If one removes the absolute value, we obtain a signed area of $P(u, v)$ which is the negative of the signed area of $P(v, u)$. This suggests that the ordering of the vectors in the definition of a parallelogram, that is $P(u, v)$ versus $P(v, u)$, can be identified with the orientations of $\mathbb{R}^2$. We want to keep track of the orientation of parallelograms in $\mathbb{R}^2$ when computing area. We know that basic 1-forms $dx$ and $dy$ take values that depend on the orientation of vectors in $\mathbb{R}^2$. Therefore, we define a “product” of 1-forms which does exactly what we want.

Definition 7.1.1. Let $u = (a_1, a_2)$ and $v = (b_1, b_2)$ be two vectors in $\mathbb{R}^2$ in the standard basis. The wedge product of the 1-forms $dx$, $dy$ on $(u, v)$ is a mapping $dx \wedge dy : \mathbb{R}^2 \times \mathbb{R}^2 \to \mathbb{R}$ defined by
\[ (dx \wedge dy)(u, v) := \begin{vmatrix} dx(u) & dy(u) \\ dx(v) & dy(v) \end{vmatrix} = a_1 b_2 - a_2 b_1. \]
The expression $(dx \wedge dy)(u, v)$ is the signed area of the parallelogram generated by the vectors $u$ and $v$.
If $dx$ and $dy$ are interchanged, this causes a change of orientation of $\mathbb{R}^2$ and the sign of the area changes:
\[ (dy \wedge dx)(u, v) = \begin{vmatrix} dy(u) & dx(u) \\ dy(v) & dx(v) \end{vmatrix} = a_2 b_1 - b_2 a_1 = -(dx \wedge dy)(u, v). \]
This is called the “skew-symmetry” property of the wedge product. That is, the interchange of $dx$ and $dy$ induces a change in sign. The wedge product of two differentiable 1-forms is an example of a differentiable 2-form; these are introduced in Chapter 8. We can now return to the problem of area and define the area form as
\[ dA = |dx \wedge dy| \]
which always gives a non-negative area. Note that it is not a 2-form.

Example 7.1.2. We compute the signed area of the parallelogram generated by the vectors $u = (1, 4)$ and $v = (-2, -1)$. We have,
\[ (dx \wedge dy)(u, v) = \begin{vmatrix} dx(u) & dy(u) \\ dx(v) & dy(v) \end{vmatrix} = \begin{vmatrix} 1 & 4 \\ -2 & -1 \end{vmatrix} = 7. \]
The parallelogram $P(u, v)$ is positively oriented and so the signed area is positive.
A geometric criterion to determine the sign of the area of a parallelogram $P(u, v)$ is to use a rotation with smallest angle to align $u$ to $v$. If the smallest angle is achieved via a counterclockwise rotation, then $P(u, v)$ is positively oriented, otherwise it is negatively oriented.

Fig. 7.2. Positive orientation of the parallelogram generated by u and v.

It is straightforward to extend the wedge product of two basic 1-forms $dx_i$, $dx_j$ to vectors in higher dimensions as follows. For vectors $u = (a_1, \ldots, a_n)$ and $v = (b_1, \ldots, b_n)$, we define
\[ (dx_i \wedge dx_j)(u, v) := \begin{vmatrix} dx_i(u) & dx_j(u) \\ dx_i(v) & dx_j(v) \end{vmatrix} = a_i b_j - b_i a_j. \]
The vectors $u$ and $v$ generate a parallelogram $P(u, v)$ in $\mathbb{R}^n$, and $(dx_i \wedge dx_j)(u, v)$ is the signed area of the projection of $P(u, v)$ to the two-dimensional $(x_i, x_j)$-subspace of $\mathbb{R}^n$. Indeed, the projection of $u$ and $v$ to the $(x_i, x_j)$-plane is given by the vectors $(a_i, a_j)$ and $(b_i, b_j)$.

Example 7.1.3. Consider the vectors $u = (2, -2, 3)$ and $v = (-1, -3, 1)$. We compute the signed areas of the projections of the parallelogram $P(u, v)$:
\[ \begin{aligned} (dx_1 \wedge dx_2)(u, v) &= \begin{vmatrix} 2 & -2 \\ -1 & -3 \end{vmatrix} = -8, \\ (dx_1 \wedge dx_3)(u, v) &= \begin{vmatrix} 2 & 3 \\ -1 & 1 \end{vmatrix} = 5, \\ (dx_2 \wedge dx_3)(u, v) &= \begin{vmatrix} -2 & 3 \\ -3 & 1 \end{vmatrix} = 7. \end{aligned} \]
The parallelogram $P(u, v)$ and its projection on the $xy$-plane are illustrated in Figure 7.3.
Consider the parallelogram in $\mathbb{R}^3$ given by the vectors $u = (a_1, a_2, a_3)$ and $v = (b_1, b_2, b_3)$. The formula for the area of the parallelogram is also obtained from the norm of the cross-product
\[ \begin{aligned} \|u \times v\| &= \sqrt{(a_2 b_3 - b_2 a_3)^2 + (a_1 b_3 - b_1 a_3)^2 + (a_1 b_2 - a_2 b_1)^2} \\ &= \sqrt{(dy \wedge dz)^2(u, v) + (dx \wedge dz)^2(u, v) + (dx \wedge dy)^2(u, v)}. \end{aligned} \]
Thus, we automatically have the following theorem.

Theorem 7.1.4. Let $u$ and $v$ be two vectors in $\mathbb{R}^3$. Then, the area $A_P$ of the parallelogram $P$ spanned by $u$ and $v$ is
\[ A_P = \sqrt{[(dy \wedge dz)(u, v)]^2 + [(dz \wedge dx)(u, v)]^2 + [(dx \wedge dy)(u, v)]^2}. \]
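The wedge products of basic 1-forms and the area formula of Theorem 7.1.4 are easy to evaluate directly. The following sketch uses a hypothetical helper `wedge` with 0-based indices (so index 0 stands for $dx_1$) and the vectors of Example 7.1.3; it is meant as an illustration of the definitions, not as code from the text.

```python
import math

def wedge(i, j, u, v):
    """(dx_i ^ dx_j)(u, v) = u_i v_j - u_j v_i, with 0-based indices i, j."""
    return u[i] * v[j] - u[j] * v[i]

u = (2, -2, 3)
v = (-1, -3, 1)

dx_dy = wedge(0, 1, u, v)   # signed area of the projection on the (x1, x2)-plane
dx_dz = wedge(0, 2, u, v)   # signed area of the projection on the (x1, x3)-plane
dy_dz = wedge(1, 2, u, v)   # signed area of the projection on the (x2, x3)-plane

# Area of the parallelogram in R^3, following Theorem 7.1.4.
area = math.sqrt(dy_dz**2 + wedge(2, 0, u, v)**2 + dx_dy**2)

# Skew-symmetry: interchanging the two 1-forms flips the sign.
skew_ok = wedge(1, 0, u, v) == -dx_dy
print(dx_dy, dx_dz, dy_dz, area, skew_ok)
```

The three projected signed areas are -8, 5 and 7, and the area equals $\sqrt{138}$, the norm of the cross-product $u \times v$.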
We now turn our attention to parallelepipeds.
Fig. 7.3. Projection of the parallelogram generated by u and v onto the xy-plane.
Parallelepipeds
Consider a parallelepiped $Q(u, v, w)$ given by three vectors $u = (a_1, a_2, a_3)$, $v = (b_1, b_2, b_3)$ and $w = (c_1, c_2, c_3)$. Similar to parallelograms, the volume of $Q(u, v, w)$ is given by the product of the area of the base of $Q(u, v, w)$ times the height. In terms of the vectors $u, v, w$, recall that the triple product gives this volume up to sign:
\[ \mathrm{Vol}(Q(u, v, w)) = u \cdot (v \times w). \]
Then,
\[ \begin{aligned} \mathrm{Vol}(Q(u, v, w)) &= (a_1, a_2, a_3) \cdot (b_2 c_3 - b_3 c_2,\; -(b_1 c_3 - b_3 c_1),\; b_1 c_2 - b_2 c_1) \\ &= a_1 (b_2 c_3 - b_3 c_2) - a_2 (b_1 c_3 - b_3 c_1) + a_3 (b_1 c_2 - b_2 c_1) \\ &= \det \begin{pmatrix} a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \\ c_1 & c_2 & c_3 \end{pmatrix}. \end{aligned} \]
This shows a pattern similar to the one we have seen with the area of a parallelogram, and we recast the volume of a parallelepiped in terms of wedge products. This is easily done by tacking on another wedge and extending the definition by increasing to a $3 \times 3$ determinant.

Definition 7.1.5. Let $u = (a_1, a_2, a_3)$, $v = (b_1, b_2, b_3)$ and $w = (c_1, c_2, c_3)$ be three vectors in the standard basis of $\mathbb{R}^3$. Then, the wedge product of $dx$, $dy$ and $dz$ is the mapping
\[ dx \wedge dy \wedge dz : \mathbb{R}^3 \times \mathbb{R}^3 \times \mathbb{R}^3 \to \mathbb{R} \]
where
\[ (dx \wedge dy \wedge dz)(u, v, w) = \begin{vmatrix} dx(u) & dy(u) & dz(u) \\ dx(v) & dy(v) & dz(v) \\ dx(w) & dy(w) & dz(w) \end{vmatrix}. \]
This is the signed volume of a parallelepiped. It is an example of a differentiable 3-form. One sees that the wedge product formalism is easily extendable to higher dimensions in the way shown in Definition 7.1.5. The wedge product of Definition 7.1.5 also has skew-symmetry properties as above:
\[ dx \wedge dy \wedge dz = -dx \wedge dz \wedge dy = -dy \wedge dx \wedge dz = -dz \wedge dy \wedge dx, \]
which follow from the interchange of columns in the determinant.

Definition 7.1.6. The volume form is defined as
\[ dV = |dx \wedge dy \wedge dz|. \]
It always gives a non-negative volume. It is not a 3-form.

Example 7.1.7. Compute the signed volume and volume of the parallelepiped given by $u = (2, 1, 2)$, $v = (0, 3, -2)$ and $w = (1, 1, 3)$.
\[ (dx \wedge dy \wedge dz)(u, v, w) = \begin{vmatrix} dx(u) & dy(u) & dz(u) \\ dx(v) & dy(v) & dz(v) \\ dx(w) & dy(w) & dz(w) \end{vmatrix} = \begin{vmatrix} 2 & 1 & 2 \\ 0 & 3 & -2 \\ 1 & 1 & 3 \end{vmatrix} = 14. \]
Therefore $dV(u, v, w) = 14$.
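As a quick check of the signed volume, the $3 \times 3$ determinant can be expanded by hand in code. The helper name `det3` below is hypothetical, and the vectors are taken to be $u = (2, 1, 2)$, $v = (0, 3, -2)$, $w = (1, 1, 3)$; this is a sketch of the definition rather than a general-purpose routine.

```python
def det3(u, v, w):
    """Determinant of the 3x3 matrix with rows u, v, w (cofactor expansion along the first row)."""
    return (u[0] * (v[1] * w[2] - v[2] * w[1])
            - u[1] * (v[0] * w[2] - v[2] * w[0])
            + u[2] * (v[0] * w[1] - v[1] * w[0]))

u, v, w = (2, 1, 2), (0, 3, -2), (1, 1, 3)

signed_volume = det3(u, v, w)   # (dx ^ dy ^ dz)(u, v, w)
volume = abs(signed_volume)     # dV(u, v, w)
swapped = det3(v, u, w)         # interchanging two vectors flips the sign
print(signed_volume, volume, swapped)
```

The signed volume is 14, and swapping $u$ and $v$ gives -14 while the volume $dV$ stays equal to 14.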
The signed areas and signed volume forms are used in a later section when we introduce integration of forms. The area and volume forms are used in the next section to define double and triple integrals.
Exercises
(1) Determine whether the parallelograms P (u, v) below are positively or negatively oriented and compute their areas. (a) P (u, v) with u = (−1, 2), v = (3, 2) (b) P (u, v) with u = (3, −1), v = (2, 4)
√ 3 − (c) P (u, v) with u = (0, 3), v = (1/2, 2 ).
(2) Compute the area of the parallelograms $P(u, v)$ below.
(a) $P(u, v)$ with $u = (3, 1, 0)$, $v = (-1, 2, 3)$
(b) $P(u, v)$ with $u = (1, 0, 3)$, $v = (-1, 4, 1)$
(c) $P(u, v)$ with $u = (2, -3, -3)$, $v = (5, 1, 1)$
(3) Show that $dx_i \wedge dx_j = 0$ if $i = j$ and that $dx_i \wedge dx_j \wedge dx_\ell = 0$ if $i = j$, $i = \ell$ or $j = \ell$.
(4) Compute the volume of the parallelepipeds $Q(u, v, w)$ defined below.
(a) $u = (2, 0, 0)$, $v = (-1, 1, 3)$, $w = (3, 2, 4)$.
(b) $u = (1, 1, 1)$, $v = (-1, -1, -1)$, $w = (1, 0, 1)$.
(c) $u = (-4, 3, 1)$, $v = (2, 1, 3)$, $w = (0, 5, 7)$.
(5) Show that if $u = (a_1, a_2, a_3, a_4)^T$ and $v = (b_1, b_2, b_3, b_4)^T$ are vectors in $\mathbb{R}^4$, then
\[ \Omega(u, v) = (dx_1 \wedge dx_3)(u, v) + (dx_2 \wedge dx_4)(u, v) \]
can be written
\[ \Omega(u, v) = u^T J v = (a_1, a_2, a_3, a_4)\, J \begin{pmatrix} b_1 \\ b_2 \\ b_3 \\ b_4 \end{pmatrix} \quad \text{where} \quad J = \begin{pmatrix} 0 & I_2 \\ -I_2 & 0 \end{pmatrix}, \]
with $I_2$ the $2 \times 2$ identity matrix and $0$ the $2 \times 2$ zero matrix.
(6) Using the representation of $\Omega(u, v)$ of the previous problem, show that $\Omega$ is unchanged if $(u, v)$ is replaced by $(Au, Av)$ where $A$ is a $4 \times 4$ matrix satisfying $A^T J A = J$.
7.2 Double integrals

We now study the question of integrating functions of two variables over rectangular and more general domains in $\mathbb{R}^2$. We begin with functions of two variables over rectangular domains. Suppose we want to compute the volume under a surface $z = f(x, y)$ over a rectangle $R \subset \mathbb{R}^2$. We look at some easy examples.

Example 7.2.1. Define the piecewise constant function $f$ on $R = [1, 2] \times [0, 1]$ by
\[ f(x, y) = \begin{cases} 2, & (x, y) \in R_1 = [1, 3/2] \times [0, 1/2] \\ 1, & (x, y) \in R_2 = [3/2, 2] \times [0, 1] \\ 3, & (x, y) \in R_3 = [1, 5/4] \times [1/2, 1] \\ 3/2, & (x, y) \in R_4 = [5/4, 3/2] \times [1/2, 1]. \end{cases} \]

Fig. 7.4. Splitting of the domain for f(x, y).

See Figure 7.4. Then, the volume under the function $f(x, y)$ is the sum of the volumes of the boxes with heights given by $f$ over the rectangles defined in the domain. We have
\[ \begin{aligned} \text{Volume} &= 2\, \mathrm{Area}(R_1) + 1\, \mathrm{Area}(R_2) + 3\, \mathrm{Area}(R_3) + \tfrac{3}{2}\, \mathrm{Area}(R_4) \\ &= 2(1/4) + (1/2) + 3(1/8) + \tfrac{3}{2}(1/8) = 25/16. \end{aligned} \]
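The volume under a piecewise constant function is a finite sum of height times area, which can be checked with exact rational arithmetic. The sketch below encodes the four rectangles $R_1, \ldots, R_4$ of Example 7.2.1 as (height, x-interval, y-interval) triples; the list name `pieces` is just an illustrative choice.

```python
from fractions import Fraction as F

# (height, (x_min, x_max), (y_min, y_max)) for R1, R2, R3, R4
pieces = [
    (F(2),    (F(1), F(3, 2)),    (F(0), F(1, 2))),
    (F(1),    (F(3, 2), F(2)),    (F(0), F(1))),
    (F(3),    (F(1), F(5, 4)),    (F(1, 2), F(1))),
    (F(3, 2), (F(5, 4), F(3, 2)), (F(1, 2), F(1))),
]

# Sum of box volumes: height * width * depth.
volume = sum(h * (x1 - x0) * (y1 - y0) for h, (x0, x1), (y0, y1) in pieces)

# The four rectangles should tile R = [1, 2] x [0, 1], whose area is 1.
total_area = sum((x1 - x0) * (y1 - y0) for _, (x0, x1), (y0, y1) in pieces)
print(volume, total_area)
```

With these heights and rectangles the sum is 25/16, and the areas add up to 1, the area of $R$.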
Let $R = [a_0, a_1] \times [b_0, b_1]$ be a rectangle in $\mathbb{R}^2$ and $f : R \subset \mathbb{R}^2 \to \mathbb{R}$ with $f(x, y) \geq 0$. We now estimate the volume between the $xy$-plane and the graph of $f$. First, we define a partition of $R$ into small subrectangles by partitioning each interval separately. We have
\[ a_0 = x_0 < x_1 < \cdots < x_{n-1} < x_n = a_1, \qquad b_0 = y_0 < y_1 < \cdots < y_{m-1} < y_m = b_1. \]
There are $mn$ subrectangles labelled $R_{ij} := [x_i, x_{i+1}] \times [y_j, y_{j+1}]$ with $i = 0, \ldots, n-1$ and $j = 0, \ldots, m-1$. We obtain an approximation to the volume between the $xy$-plane and the graph of $f$ by creating a piecewise constant function and evaluating $f$ at a point $(x^*_{ij}, y^*_{ij})$ for each subrectangle $R_{ij}$. We take the sum of the volumes of the boxes with base $R_{ij}$ and height $f(x^*_{ij}, y^*_{ij})$: we obtain
\[ RS := \sum_{i=0}^{n-1} \sum_{j=0}^{m-1} f(x^*_{ij}, y^*_{ij})\, dA(R_{ij}). \]
The expression $RS$ is called a Riemann sum, similar to the definition of the integral of functions $f(x)$ of one variable. Refining the grid of subrectangles $R_{ij}$ by letting $n, m \to \infty$ so that $dA(R_{ij}) \to 0$ for all $i, j$, the question which arises is the existence of the following limit
\[ \lim_{n, m \to \infty} \sum_{i=0}^{n-1} \sum_{j=0}^{m-1} f(x^*_{ij}, y^*_{ij})\, dA(R_{ij}). \tag{7.1} \]

Fig. 7.5. Subrectangle Rij given by the partitions of [a0, a1] and [b0, b1].
The above construction is valid for functions $f(x, y)$ which are not necessarily positive and we have the following definition.

Definition 7.2.2. If the limit of the Riemann sum (7.1) of $f$ exists on $R$, then $f$ is said to be integrable on $R$ and
\[ \iint_R f(x, y)\, dA := \lim_{n, m \to \infty} \sum_{i=0}^{n-1} \sum_{j=0}^{m-1} f(x^*_{ij}, y^*_{ij})\, dA(R_{ij}). \]
It is in general difficult to check for integrability and compute the integral of $f$ directly from the definition. Therefore, we first need to determine general conditions under which a function is integrable. This is done in the next theorem.

Theorem 7.2.3. Let $f : R \subset \mathbb{R}^2 \to \mathbb{R}$. Then,
(1) If $f$ is continuous, then $f$ is integrable on $R$.
(2) If $f$ is bounded on $R$ and has only finite size jump discontinuities over a finite set of smooth curves, then $f$ is integrable on $R$.

Proof. If $f$ is a continuous function on $R$, then it is bounded on $R$. For bounded functions, with finite size jump discontinuities, the values $f(x^*_{ij}, y^*_{ij})$ on each rectangle $R_{ij}$ can be bounded above and below and as the area of $R_{ij}$ shrinks to zero, the limit converges.

As mentioned above, we also need a more effective way of computing the double integral than using the definition. We know from elementary calculus that for simple integrals of functions of one variable, the Fundamental Theorem of Calculus enables one to relate the value of the integral to the existence of an antiderivative, or indefinite integral. There is no such result for double integrals. However, it is possible to decompose the double integral into a succession of two simple integrals as the next result shows.
Theorem 7.2.4 (Fubini). If $f$ is integrable on the rectangle $R = [a_0, a_1] \times [b_0, b_1]$, then
\[ \iint_R f(x, y)\, dA = \int_{a_0}^{a_1} \int_{b_0}^{b_1} f(x, y)\, dy\, dx = \int_{b_0}^{b_1} \int_{a_0}^{a_1} f(x, y)\, dx\, dy. \]
The idea of this theorem is that one can decompose the rectangle $[a_0, a_1] \times [b_0, b_1]$ using slices parallel to one of the axes. This is done as follows, say for slices parallel to the $y$-axis. Fix a value $x \in [a_0, a_1]$ and consider $f(x, y)$ with $y \in [b_0, b_1]$. Then, we can compute the integral of the slice obtained using $f(x, y)$:
\[ \int_{b_0}^{b_1} f(x, y)\, dy. \tag{7.2} \]
Now, (7.2) depends on the value of $x$ chosen to obtain the slice and so the result of (7.2) is a function of $x$. We write
\[ G(x) = \int_{b_0}^{b_1} f(x, y)\, dy. \]

Fig. 7.6. Slice at x = x0 of the domain.

Thus, for every $x \in [a_0, a_1]$, $G(x)$ is the integral of a slice. Fubini’s theorem tells us that summing up all the integrals $G(x)$ for $x \in [a_0, a_1]$ corresponds to the integral of $f(x, y)$ over $R$.

Example 7.2.5. Let $R = [0, 3] \times [-1, 1]$ and $f(x, y) = e^{x+y}$. We compute
\[ \begin{aligned} \iint_R f(x, y)\, dA &= \int_0^3 \int_{-1}^{1} e^{x+y}\, dy\, dx \\ &= \int_0^3 e^x\, dx \int_{-1}^{1} e^y\, dy \\ &= (e - e^{-1})(e^3 - 1). \end{aligned} \]
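The Riemann sum of Definition 7.2.2 can be compared with the closed-form value obtained from Fubini’s theorem. The sketch below assumes a uniform partition and midpoints as sample points, both choices made here for illustration; the function name `riemann_sum` is hypothetical.

```python
import math

def riemann_sum(f, a0, a1, b0, b1, n, m):
    """Riemann sum of f over [a0, a1] x [b0, b1] on a uniform n-by-m grid,
    sampling f at the midpoint of every subrectangle."""
    dx = (a1 - a0) / n
    dy = (b1 - b0) / m
    total = 0.0
    for i in range(n):
        x = a0 + (i + 0.5) * dx
        for j in range(m):
            y = b0 + (j + 0.5) * dy
            total += f(x, y) * dx * dy
    return total

# f(x, y) = exp(x + y) on R = [0, 3] x [-1, 1], as in Example 7.2.5.
approx = riemann_sum(lambda x, y: math.exp(x + y), 0.0, 3.0, -1.0, 1.0, 300, 200)
exact = (math.e - 1.0 / math.e) * (math.exp(3.0) - 1.0)
print(approx, exact)
```

Refining the grid (larger `n` and `m`) brings the sum closer to the value given by the iterated integral.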
Computation of mass and populations
The double integral can be used to compute physical quantities such as total mass or total electric charge if the mass or charge density is known. Suppose that a rectangular plate $R$ of a certain material has density (with units of mass/unit area, say kg/m$^2$) given by a continuous function $\rho(x, y)$. The total mass of the plate can be computed as follows. Decompose $R$ into subrectangles $R_{ij}$ for $i = 1, \ldots, n$, $j = 1, \ldots, m$ as shown in the definition of the double integral. Then, we evaluate $\rho(x, y)$ at an arbitrary point $(x_i^*, y_j^*)$ of $R_{ij}$. Let $A(R_{ij})$ be the area of the rectangle $R_{ij}$, then $\rho(x_i^*, y_j^*) A(R_{ij})$ has units of mass and gives an approximation of the mass of the part of the plate defined by $R_{ij}$. Therefore, we can write a Riemann sum with $\rho(x, y)$ giving an approximate value for the total mass of the plate. Because $\rho$ is a continuous function, by taking the limit of the Riemann sum, the total mass is given by
\[ \iint_R \rho(x, y)\, dA. \]
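Example 7.2.6. Suppose that $R = [-1, 1] \times [-1, 1]$ and $\rho(x, y) = |x| + |y|$, then
\[ \begin{aligned} \iint_R \rho(x, y)\, dA &= \int_{-1}^{1} \int_{-1}^{1} (|x| + |y|)\, dx\, dy \\ &= \int_0^1 \int_0^1 (x + y)\, dx\, dy + \int_0^1 \int_{-1}^{0} (-x + y)\, dx\, dy \\ &\quad + \int_{-1}^{0} \int_0^1 (x - y)\, dx\, dy + \int_{-1}^{0} \int_{-1}^{0} (-x - y)\, dx\, dy \\ &= \int_0^1 \left[ \tfrac{1}{2} x^2 + yx \right]_0^1 dy + \int_0^1 \left[ -\tfrac{1}{2} x^2 + yx \right]_{-1}^{0} dy \\ &\quad + \int_{-1}^{0} \left[ \tfrac{1}{2} x^2 - yx \right]_0^1 dy + \int_{-1}^{0} \left[ -\tfrac{1}{2} x^2 - yx \right]_{-1}^{0} dy \\ &= 4 \end{aligned} \]
where the result is obtained by a simple computation of the integrals.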
We can apply this approach to any density function, for instance, population density.

Example 7.2.7. Suppose that a bacterial colony is cultivated in a square domain $R$ of sides of length $a$ and that the density of population is estimated by $\rho(x, y) = 3 - (x^2 + y^2)$. The total population is obtained by integrating $\rho$ over the domain:
\[ \begin{aligned} \iint_R \rho(x, y)\, dA &= \int_0^a \int_0^a \big( 3 - (x^2 + y^2) \big)\, dy\, dx \\ &= \int_0^a \left[ (3 - x^2)\, y - \frac{y^3}{3} \right]_0^a dx \\ &= \int_0^a \left( (3 - x^2)\, a - \frac{a^3}{3} \right) dx \\ &= \left[ 3ax - \frac{a x^3}{3} - \frac{a^3 x}{3} \right]_0^a \\ &= 3a^2 - \frac{2}{3} a^4. \end{aligned} \]
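Total mass or population is a double integral of a density, so the same Riemann sum idea gives a numerical estimate. The sketch below is an illustration under assumptions: a uniform grid with midpoint sampling, a hypothetical helper `total_from_density`, and the sample value $a = 1$ for the side of the square in the population example.

```python
def total_from_density(rho, a0, a1, b0, b1, n=200, m=200):
    """Midpoint Riemann sum of the density rho over [a0, a1] x [b0, b1]."""
    dx = (a1 - a0) / n
    dy = (b1 - b0) / m
    total = 0.0
    for i in range(n):
        x = a0 + (i + 0.5) * dx
        for j in range(m):
            y = b0 + (j + 0.5) * dy
            total += rho(x, y) * dx * dy
    return total

# Plate with density |x| + |y| on [-1, 1] x [-1, 1].
mass = total_from_density(lambda x, y: abs(x) + abs(y), -1.0, 1.0, -1.0, 1.0)

# Population density 3 - (x^2 + y^2) on a square of side a = 1 (assumed value).
a = 1.0
population = total_from_density(lambda x, y: 3.0 - (x * x + y * y), 0.0, a, 0.0, a)
closed_form = 3.0 * a**2 - (2.0 / 3.0) * a**4
print(mass, population, closed_form)
```

The mass estimate is close to 4 and the population estimate is close to $3a^2 - \tfrac{2}{3}a^4 = 7/3$ for $a = 1$.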
7.2.1 Integrals over general two-dimensional domain

Let $D \subset \mathbb{R}^2$ be a bounded region and $f(x, y)$ be a continuous function defined on $D$. We suppose that the boundary of $D$ is made up of a finite number of smooth curves. Suppose that $D \subset R$ where $R$ is a rectangle in $\mathbb{R}^2$, see Figure 7.7. We show that the double integral over $D$ can be defined. Let
\[ F(x, y) = \begin{cases} f(x, y) & \text{if } (x, y) \in D \\ 0 & \text{if } (x, y) \in R \setminus D. \end{cases} \]
$F(x, y)$ is bounded on $R$ and continuous on the pieces $D$ and $R \setminus D$, with a jump discontinuity on a finite number of smooth curves. By Theorem 7.2.3, $F(x, y)$ is integrable on $R$ and we define
\[ \iint_D f(x, y)\, dA := \iint_R F(x, y)\, dA. \]

Fig. 7.7. Domain D with boundary given by piecewise smooth curves.

This definition is satisfying because the contribution of the double integral of $F(x, y)$ on $R \setminus D$ is zero. We now enumerate a list of cases of domains $D$ in $\mathbb{R}^2$ that are computationally tractable.
Type I:

A planar domain $D$ is said to be of type I if it lies between the graphs of two continuous functions of $x$ with the same domain:
\[ D = \{ (x, y) \in \mathbb{R}^2 \mid a \leq x \leq b,\; g_1(x) \leq y \leq g_2(x) \}. \]
An example of a Type I domain is given in Figure 7.8.

Fig. 7.8. Schematic of a domain of Type I.

The general formula for the double integral is
\[ \iint_D f(x, y)\, dA = \int_a^b \int_{g_1(x)}^{g_2(x)} f(x, y)\, dy\, dx. \]
Example 7.2.8. Compute the volume of the region $\mathcal{R}$ between the $xy$-plane with $y \geq 0$, the elliptic paraboloid $z = x^2 + y^2$ and the cylinder $x^2 + y^2 = 1$. This region is bounded below by the $xy$-plane and above by the elliptic paraboloid. But the domain $D$ in the $xy$-plane is bounded by the cylinder and by the line $y = 0$. We have
\[ D = \{ (x, y) \in \mathbb{R}^2 \mid -1 \leq x \leq 1,\; 0 \leq y \leq \sqrt{1 - x^2} \}. \]
Therefore, the volume of $\mathcal{R}$ is given by
\[ \begin{aligned} \mathrm{Volume}(\mathcal{R}) &= \iint_D (x^2 + y^2)\, dA = \int_{-1}^{1} \int_0^{\sqrt{1 - x^2}} (x^2 + y^2)\, dy\, dx \\ &= \int_{-1}^{1} \left[ x^2 y + \tfrac{1}{3} y^3 \right]_0^{\sqrt{1 - x^2}} dx \\ &= \int_{-1}^{1} \left( x^2 \sqrt{1 - x^2} + \tfrac{1}{3} (1 - x^2)^{3/2} \right) dx \\ &= \Big[ -\tfrac{1}{4} x (1 - x^2)^{3/2} + \tfrac{1}{8} x \sqrt{1 - x^2} + \tfrac{1}{8} \arcsin x \\ &\qquad + \tfrac{1}{3} \left( \tfrac{1}{4} x (1 - x^2)^{3/2} + \tfrac{3}{8} x \sqrt{1 - x^2} + \tfrac{3}{8} \arcsin x \right) \Big]_{-1}^{1} \\ &= \tfrac{1}{4} \pi. \end{aligned} \]
Type II:

A planar domain $D$ is said to be of type II if it lies between the graphs of two continuous functions of $y$:
\[ D = \{ (x, y) \in \mathbb{R}^2 \mid c \leq y \leq d,\; f_1(y) \leq x \leq f_2(y) \}. \]
An example of a type II domain is found in Figure 7.9. The general formula for the double integral is
\[ \iint_D f(x, y)\, dA = \int_c^d \int_{f_1(y)}^{f_2(y)} f(x, y)\, dx\, dy. \]

Fig. 7.9. Schematic domain of Type II.
Example 7.2.9. Compute the volume of the region $\mathcal{R}$ enclosed by the cylinder $x - |y| - 1 = 0$, the planes $y = 1$, $y = -1/2$, $x = 0$ and below the function $f(x, y) = 2y$. We see that the $x$ values are constrained to be between $0$ and the lines given by $x = 1 + |y|$, where $y$ takes its values between $-1/2$ and $1$. This is indeed a region of type II and we write
\[ D = \{ (x, y) \in \mathbb{R}^2 \mid -1/2 \leq y \leq 1,\; 0 \leq x \leq 1 + |y| \}. \]
Therefore, the volume is
\[ \begin{aligned} \iint_D 2y\, dA &= \int_{-1/2}^{1} \int_0^{1 + |y|} 2y\, dx\, dy \\ &= 2 \int_{-1/2}^{1} y (1 + |y|)\, dy \\ &= 2 \left( \int_{-1/2}^{0} y (1 - y)\, dy + \int_0^1 y (1 + y)\, dy \right) \\ &= 2 \left( -\tfrac{1}{6} + \tfrac{5}{6} \right) = \tfrac{4}{3}. \end{aligned} \]
Type III:

A domain $D$ is of type III if it can be written as a type I or as a type II domain. Here are some examples.

(1) $D$ is the region bounded by the ellipse
\[ \frac{x^2}{4} + \frac{y^2}{3} = 1 \]
in the first quadrant, see Figure 7.10. We can write
\[ \begin{aligned} D &= \{ (x, y) \mid 0 \leq x \leq 2,\; 0 \leq y \leq \sqrt{3 (1 - x^2/4)} \} \\ &= \{ (x, y) \mid 0 \leq y \leq \sqrt{3},\; 0 \leq x \leq 2 \sqrt{1 - y^2/3} \}. \end{aligned} \]

(2) $D$ is the region bounded by the curves $x = y^2$ and $y = e^x - 1$, see Figure 7.11. These two curves intersect at $(0, 0)$ and at the point $(x_0, y_0)$ with
\[ y_0 = e^{y_0^2} - 1 \]
where $y_0 \approx 0.7469$, which means $x_0 = y_0^2 \approx 0.5578$. Therefore,
\[ \begin{aligned} D &= \{ (x, y) \mid 0 \leq x \leq x_0,\; e^x - 1 \leq y \leq \sqrt{x} \} \\ &= \{ (x, y) \mid 0 \leq y \leq y_0,\; y^2 \leq x \leq \ln(y + 1) \}. \end{aligned} \]

Fig. 7.10. Domain of Type III bounded by the ellipse in the first quadrant.

Fig. 7.11. Domain of Type III bounded by the exponential and the parabola.
It is often the case that for type III domains, one of the parametrizations of $D$ leads to an easier computation.

Example 7.2.10. Consider the double integral
\[ \iint_D e^{-x^2}\, dA \]
where $D$ is the domain bounded by the lines $y = 0$, $y = x$ and $x = 1$. This is a domain of type III and we write
\[ D = \{ (x, y) \in \mathbb{R}^2 \mid 0 \leq y \leq 1,\; y \leq x \leq 1 \}. \]
Then
\[ \iint_D e^{-x^2}\, dA = \int_0^1 \int_y^1 e^{-x^2}\, dx\, dy \]
but the inner integrand does not have an antiderivative which can be written in what are called “elementary terms”: polynomials, rational functions, trigonometric functions, exponential and logarithmic functions. Hence, the integral cannot be computed explicitly in this order. If we write instead
\[ D = \{ (x, y) \in \mathbb{R}^2 \mid 0 \leq x \leq 1,\; 0 \leq y \leq x \}, \]
then the order of integration is changed and we have
\[ \begin{aligned} \iint_D e^{-x^2}\, dA &= \int_0^1 \int_0^x e^{-x^2}\, dy\, dx \\ &= \int_0^1 e^{-x^2} \int_0^x dy\, dx \\ &= \int_0^1 x e^{-x^2}\, dx \\ &= \frac{1}{2} \int_0^1 e^{-u}\, du \\ &= \frac{1}{2} (1 - e^{-1}). \end{aligned} \]

Fig. 7.12. Triangular domain of Type III.
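Although only one order of integration can be carried out by hand, both orders can be approximated numerically and should agree. The sketch below assumes the triangle bounded by $y = 0$, $y = x$ and $x = 1$, uses a hypothetical helper `iterated` based on the midpoint rule, and compares both orders with the closed-form value $\tfrac{1}{2}(1 - e^{-1})$.

```python
import math

def iterated(f, lo_out, hi_out, lo_in, hi_in, n=300, m=300):
    """Midpoint approximation of
    int_{lo_out}^{hi_out} int_{lo_in(s)}^{hi_in(s)} f(s, t) dt ds."""
    ds = (hi_out - lo_out) / n
    total = 0.0
    for i in range(n):
        s = lo_out + (i + 0.5) * ds
        a, b = lo_in(s), hi_in(s)
        dt = (b - a) / m
        total += sum(f(s, a + (j + 0.5) * dt) for j in range(m)) * dt * ds
    return total

def g(x):
    return math.exp(-x * x)

# Order dy dx: outer variable x in [0, 1], inner variable y in [0, x].
dy_dx = iterated(lambda x, y: g(x), 0.0, 1.0, lambda x: 0.0, lambda x: x)

# Order dx dy: outer variable y in [0, 1], inner variable x in [y, 1].
dx_dy = iterated(lambda y, x: g(x), 0.0, 1.0, lambda y: y, lambda y: 1.0)

exact = 0.5 * (1.0 - math.exp(-1.0))
print(dy_dx, dx_dy, exact)
```

Both numerical values are close to 0.316, which illustrates that the choice of order affects only the difficulty of the hand computation, not the value of the double integral.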
More general domains
If a domain is not of type I, II or III one can in general split the domain into such regions and perform the integrals.

Proposition 7.2.11. Let $D \subset \mathbb{R}^2$ and suppose $D = D_1 \cup D_2$ where the intersection of $D_1$ and $D_2$ is only along smooth curves. Then
\[ \iint_D f\, dA = \iint_{D_1} f\, dA + \iint_{D_2} f\, dA. \]
We illustrate the statement in the next example.

Example 7.2.12. Let $D$ be the region enclosed by the lines $L_1, \ldots, L_6$ with $L_1$ joining $(0, 0)$ to $(0, 2)$, $L_2$ joins $(0, 2)$ to $(1, 3)$, $L_3$ joins $(1, 3)$ to $(2, 2)$, $L_4$ joins $(2, 2)$ to $(\tfrac{3}{2}, 1)$, $L_5$ joins $(\tfrac{3}{2}, 1)$ to $(2, 0)$ and finally $L_6$ joins $(2, 0)$ to $(0, 0)$. See Figure 7.13. We can split $D$ into four regions:
\[ \begin{aligned} D_1 &= \{ (x, y) \mid 0 \leq x \leq 1,\; 0 \leq y \leq x + 2 \} \\ D_2 &= \{ (x, y) \mid 1 \leq x \leq 2,\; 2 \leq y \leq 4 - x \} \\ D_3 &= \{ (x, y) \mid 1 \leq y \leq 2,\; 1 \leq x \leq \tfrac{y}{2} + 1 \} \\ D_4 &= \{ (x, y) \mid 0 \leq y \leq 1,\; 1 \leq x \leq -\tfrac{y}{2} + 2 \}. \end{aligned} \]
Then, if $f : D \to \mathbb{R}$ is integrable we can write
\[ \begin{aligned} \iint_D f\, dA &= \iint_{D_1} f\, dA + \iint_{D_2} f\, dA + \iint_{D_3} f\, dA + \iint_{D_4} f\, dA \\ &= \int_0^1 \int_0^{x+2} f\, dy\, dx + \int_1^2 \int_2^{4-x} f\, dy\, dx \\ &\quad + \int_1^2 \int_1^{1 + y/2} f\, dx\, dy + \int_0^1 \int_1^{2 - y/2} f\, dx\, dy. \end{aligned} \]

Fig. 7.13. Union of domains of Types I and II.
Exercises
(1) Calculate the iterated integrals. 3
5
ˆ ˆ (a) 4xy dy dx ˆ ˆ y cos(xy) dx (b) ˆ ˆ 1
2
(c)
0 1
−1
1
1
0
y
0
2
dy
ey dx dy
(2) Find the volume of the solid that lies under the hyperbolic paraboloid $z = 9y^2 - 3x^2 + 6$ and above the rectangle $R = [-1, 1] \times [1, 2]$.
(3) Consider the domains below and determine if they are of type I, II or III.
(a) $D_1 = \{ (x, y) \mid -1 \leq y \leq 1,\; -y - 2 \leq x \leq y \}$.
(b) $D_2$ is the triangular region with vertices $(0, 1)$, $(1, 2)$, $(4, 1)$.
(c) $D_3 = \{ (x, y) \mid 0 \leq x \leq \pi,\; 0 \leq y \leq \sin(x) \}$
(d) $D_4$ is the circle with centre at the origin and radius 2.
(4) Evaluate the following double integrals
¨ 2y dA where D is bounded by y = x , y = x , x ≥ 0. (a) ¨ x y dA where D is enclosed by y = |x| and y = 2. (b) ¨ 3
D
2 2
D
(x + y) dA where D is enclosed by x3 = y 2 and y = x + 1 .
(c)
D
(5) Compute the double integral of $f(x, y) = x^2 y + 3$ over the domain
\[ D = \{ (x, y) \mid -1 \leq x \leq 1,\; -x + 1 \leq y \leq x + 1 \}. \]
(6) A heavy snow fall leaves on a flat roof of rectangular shape $R$ of dimensions 20 m × 15 m an uneven depth of snow due to the wind. Let $R = [0, 20] \times [0, 15]$ and we approximate the depth of the snow cover (in meters) on the roof by the function $f(x, y) = 1 + \frac{1}{1000} x^2 + \frac{3}{1000} y^2$. Suppose that the snow density is estimated as 50 kg/m$^3$, determine the density of snow per m$^2$ and calculate the total mass of snow on the roof.
(7) The density of woodland caribou (Rangifer tarandus caribou) living in the boreal forest of northern Québec has been estimated (for the Temiscamie herd) at 2.04 caribou/100 km$^2$ [4]. Suppose that for conservation purposes, an area of prime ecological conditions $D$ for the woodland caribou is protected from human disturbances and has the shape given in Figure 7.14. How many woodland caribous can be estimated to inhabit the conservation area?

Fig. 7.14. Domain D of Exercise 7 (edge lengths shown in the figure: 12 km, 12 km, 6 km, 12 km, 45 km, 48 km, 3 km, 6 km, 12 km, 18 km).
7.3 Green’s Theorem

After the Fundamental Theorem of Line Integrals, Green’s theorem is the next major theorem of Vector Calculus. It provides a relation similar to the Fundamental Theorem, but between line integrals of 1-forms in $\mathbb{R}^2$ and double integrals. We begin with the statement in the simplest case, that of a single simple closed curve surrounding a domain where a 1-form is defined.

Theorem 7.3.1 (Green’s Theorem: simplest case). Let $C \subset \mathbb{R}^2$ be a simple closed curve oriented counterclockwise surrounding an open domain $D \subset \mathbb{R}^2$. Let $\omega = a(x, y)\, dx + b(x, y)\, dy$ be a 1-form with $C^2$ coefficients defined for all $(x, y) \in D$. Then,
\[ \oint_C \omega = \iint_D \left( \frac{\partial b}{\partial x} - \frac{\partial a}{\partial y} \right) dA_{(x,y)}. \]
There are several proofs of Green’s theorem and we provide a proof for the more general case described below.

Example 7.3.2. The simplest situation which illustrates Green’s theorem is the case of a curve $C = C_1 \cup C_2 \cup C_3 \cup C_4$ enclosing a rectangular region $D$ given by $[0, \alpha] \times [0, \beta]$. We compute the line integral using parametrizations $r_1(t) = (t, 0)$ for $t \in [0, \alpha]$, $r_2(t) = (\alpha, t)$ for $t \in [0, \beta]$, $r_3(t) = (\alpha - t, \beta)$ for $t \in [0, \alpha]$ and $r_4(t) = (0, \beta - t)$ for $t \in [0, \beta]$. Then, one obtains
\[ \begin{aligned} \oint_C \omega &= \int_{C_1} \omega + \int_{C_2} \omega + \int_{C_3} \omega + \int_{C_4} \omega \\ &= \int_0^\alpha a(t, 0)\, dt + \int_0^\beta b(\alpha, t)\, dt - \int_0^\alpha a(\alpha - t, \beta)\, dt - \int_0^\beta b(0, \beta - t)\, dt \\ &= \int_0^\alpha a(t, 0)\, dt + \int_0^\beta b(\alpha, t)\, dt - \int_0^\alpha a(u, \beta)\, du - \int_0^\beta b(0, u)\, du \\ &= -\int_0^\alpha \big( a(t, \beta) - a(t, 0) \big)\, dt + \int_0^\beta \big( b(\alpha, t) - b(0, t) \big)\, dt \end{aligned} \tag{7.3} \]
where we use the substitution rule above, and return from the $u$ to the $t$ variable in the last line. Using the Fundamental Theorem of Calculus, the last line is equal to the following
\[ \begin{aligned} &= \int_0^\beta \int_0^\alpha \frac{\partial b}{\partial x}(x, t)\, dx\, dt - \int_0^\alpha \int_0^\beta \frac{\partial a}{\partial y}(t, y)\, dy\, dt \\ &= \int_0^\alpha \int_0^\beta \left( \frac{\partial b}{\partial x} - \frac{\partial a}{\partial y} \right) dy\, dx \\ &= \iint_D \left( \frac{\partial b}{\partial x} - \frac{\partial a}{\partial y} \right) dA_{(x,y)}. \end{aligned} \tag{7.4} \]
This example is important not only in understanding the origin of the formula of Green’s theorem, but we use it also in Chapter 10 to provide a more general proof than the one we provide in this chapter. Stay tuned.
We now look at some applications of Green’s theorem.

Example 7.3.3. Let $\omega = x\, dy$. Then $a(x, y) = 0$ and $b(x, y) = x$, so Green’s theorem yields
\[ \oint_C \omega = \iint_D dA_{(x,y)} \]
where the right-hand side is the area of the domain $D$. The reader can check that the same result can be obtained by using the 1-forms $\omega = -y\, dx$ or $\omega = \tfrac{1}{2}(-y\, dx + x\, dy)$. It is a remarkable consequence of Green’s theorem that the line integral of a particular 1-form gives the area of the region it encloses.

Fig. 7.15. Rectangular domain with boundary oriented counterclockwise.
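For a polygonal boundary, the line integral of $\omega = x\, dy$ can be evaluated edge by edge in closed form, which gives a direct way to see the area formula at work. The sketch below is an illustration with hypothetical names (`area_by_line_integral`, and a sample rectangle $[1, 4] \times [2, 7]$ chosen here); on each straight edge from $(x_0, y_0)$ to $(x_1, y_1)$ the pullback of $x\, dy$ integrates to $\tfrac{1}{2}(x_0 + x_1)(y_1 - y_0)$.

```python
def area_by_line_integral(vertices):
    """Closed line integral of x dy along the polygon through the given vertices.

    On the edge r(t) = (x0 + t (x1 - x0), y0 + t (y1 - y0)), t in [0, 1],
    int x dy = int_0^1 (x0 + t (x1 - x0)) (y1 - y0) dt = (x0 + x1) / 2 * (y1 - y0).
    """
    total = 0.0
    k = len(vertices)
    for i in range(k):
        x0, y0 = vertices[i]
        x1, y1 = vertices[(i + 1) % k]
        total += 0.5 * (x0 + x1) * (y1 - y0)
    return total

# Sample rectangle [1, 4] x [2, 7], vertices listed counterclockwise.
rectangle = [(1.0, 2.0), (4.0, 2.0), (4.0, 7.0), (1.0, 7.0)]

area_ccw = area_by_line_integral(rectangle)
area_cw = area_by_line_integral(rectangle[::-1])
print(area_ccw, area_cw)
```

With counterclockwise orientation the line integral equals the area, 15; reversing the orientation changes the sign, which matches the role played by orientation in Green’s theorem.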
The more general version of Green’s theorem takes into account the fact that the boundary of the region $D$ may be made up of a simple closed curve $C$ forming the outer boundary and several simple closed curves surrounding holes within the region bounded by $C$. See Figure 7.16 and note that the closed curves inside $C$ are given the orientation opposite to that of $C$. The convention is to take $C$ with counterclockwise orientation, which is referred to as the positive orientation; the inner curves then have clockwise orientation, which is referred to as the negative orientation. This choice is not arbitrary and guarantees that the formula for Green’s theorem of Theorem 7.3.1 is unchanged for a domain with holes. Indeed, consider the simple situation in $\mathbb{R}^2$ where $D$ is the region bounded by $\partial D = C_1 \cup C_2$ where $C_1$ is a circle of radius 1 and $C_2$ is a circle of radius 2, both oriented counterclockwise, see Figure 7.17. Let $D_i$ be the region enclosed by the curve $C_i$, for $i = 1, 2$. Then,
\[ \iint_D \left( \frac{\partial b}{\partial x} - \frac{\partial a}{\partial y} \right) dA = \iint_{D_2} \left( \frac{\partial b}{\partial x} - \frac{\partial a}{\partial y} \right) dA - \iint_{D_1} \left( \frac{\partial b}{\partial x} - \frac{\partial a}{\partial y} \right) dA. \]
Applying Green’s theorem to $D_2$ and $D_1$, whose boundary curves are $C_2$ and $C_1$, we obtain
\[ \iint_{D_2} \left( \frac{\partial b}{\partial x} - \frac{\partial a}{\partial y} \right) dA - \iint_{D_1} \left( \frac{\partial b}{\partial x} - \frac{\partial a}{\partial y} \right) dA = \oint_{C_2} \omega - \oint_{C_1} \omega. \]
7.3 Green’s Theorem
C
Fig. 7.16. Left: a simple closed curve C surrounding a region with holes bounded by several
simple closed curves. Note the opposite orientation of the inner boundary curves.
y C 2 C 1 x Fig. 7.17. Annular domain bounded
by the curves C 1 and C 2 with counterclockwise orientations.
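Let $-C_1$ be the curve $C_1$ with opposite orientation, then
\[
\oint_{-C_1} \omega = -\oint_{C_1} \omega .
\]
Thus,
\[
\iint_D \left(\frac{\partial b}{\partial x} - \frac{\partial a}{\partial y}\right) dA
= \oint_{C_2} \omega + \oint_{-C_1} \omega
= \oint_{C_2 \cup (-C_1)} \omega .
\]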
This example illustrates why the boundary curves are chosen with opposite orientations; in this case, $\partial D = C_2 \cup (-C_1)$ where $C_2$ and $-C_1$ have opposite orientations.

Theorem 7.3.4. Let $D \subset \mathbb{R}^2$ be a domain and $C = \partial D$. Let $\omega = a(x,y)\,dx + b(x,y)\,dy$ be a 1-form with $C^1$ coefficients defined for all $(x,y) \in D$. Then,
\[
\oint_C \omega = \iint_D \left(\frac{\partial b}{\partial x} - \frac{\partial a}{\partial y}\right) dA_{(x,y)} . \tag{7.5}
\]
Proof. We verify the formula of Green’s theorem for regions of Type I and Type II,
but only do the proof for Type I regions as the proof for Type II regions is completely analogous. More general regions can be decomposed into a union of Type I and Type II regions and we return to this case at the end of the proof.
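Let
\[
D = \{(x,y) \mid \alpha \le x \le \beta,\ g_1(x) \le y \le g_2(x)\}.
\]
The boundary curve $C$ is made up of four pieces
\[
\begin{aligned}
C_1 &= \{(t, g_1(t)) \mid \alpha \le t \le \beta\}, & C_2 &= \{(\beta, t) \mid g_1(\beta) \le t \le g_2(\beta)\},\\
-C_3 &= \{(t, g_2(t)) \mid \alpha \le t \le \beta\}, & -C_4 &= \{(\alpha, t) \mid g_1(\alpha) \le t \le g_2(\alpha)\}.
\end{aligned}
\]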
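We compute
\[
\begin{aligned}
\int_C \omega &= \int_{C_1} \omega + \int_{C_2} \omega + \int_{C_3} \omega + \int_{C_4} \omega \\
&= \int_{C_1} \omega + \int_{C_2} \omega - \int_{-C_3} \omega - \int_{-C_4} \omega .
\end{aligned} \tag{7.6}
\]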
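The following integrals can be verified by the reader using the pullback formula
\[
\begin{aligned}
\int_{C_1} \omega &= \int_\alpha^\beta \bigl(a(t, g_1(t)) + b(t, g_1(t))\,g_1'(t)\bigr)\,dt, \\
\int_{C_2} \omega &= \int_{g_1(\beta)}^{g_2(\beta)} b(\beta, t)\,dt, \\
\int_{-C_3} \omega &= \int_\alpha^\beta \bigl(a(t, g_2(t)) + b(t, g_2(t))\,g_2'(t)\bigr)\,dt, \\
\int_{-C_4} \omega &= \int_{g_1(\alpha)}^{g_2(\alpha)} b(\alpha, t)\,dt .
\end{aligned} \tag{7.7}
\]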
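Collecting the integrals involving $a(x,y)$ from the $C_1$ and $-C_3$ integrals we obtain
\[
\int_\alpha^\beta \bigl(a(t, g_1(t)) - a(t, g_2(t))\bigr)\,dt
= -\int_\alpha^\beta \int_{g_1(t)}^{g_2(t)} \frac{\partial a}{\partial y}(t, y)\,dy\,dt
= -\iint_D \frac{\partial a}{\partial y}\,dA_{(x,y)} .
\]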
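We now consider the computation of the integrals involving $b(x,y)$. Let $B^y(x,y)$ be the antiderivative of $b(x,y)$ with respect to $y$. Then
\[
\int_{g_1(\beta)}^{g_2(\beta)} b(\beta, y)\,dy = B^y(\beta, g_2(\beta)) - B^y(\beta, g_1(\beta))
\]
and
\[
-\int_{g_1(\alpha)}^{g_2(\alpha)} b(\alpha, y)\,dy = -B^y(\alpha, g_2(\alpha)) + B^y(\alpha, g_1(\alpha)).
\]
To compute the integrals involving $b(t, g_1(t))$ and $b(t, g_2(t))$, we compute for $i = 1, 2$,
\[
\begin{aligned}
\frac{dB^y(t, g_i(t))}{dt} &= \frac{\partial B^y}{\partial x}\frac{dx}{dt} + \frac{\partial B^y}{\partial y}\frac{dy}{dt} \\
&= \frac{\partial B^y}{\partial x}(t, g_i(t)) + b(t, g_i(t))\,g_i'(t).
\end{aligned}
\]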
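Integrating with respect to $t$ we have
\[
\begin{aligned}
B^y(\beta, g_i(\beta)) - B^y(\alpha, g_i(\alpha)) &= \int_\alpha^\beta \frac{dB^y(t, g_i(t))}{dt}\,dt \\
&= \int_\alpha^\beta \frac{\partial B^y}{\partial x}(t, g_i(t))\,dt + \int_\alpha^\beta b(t, g_i(t))\,g_i'(t)\,dt .
\end{aligned}
\]
Therefore,
\[
\int_\alpha^\beta b(t, g_i(t))\,g_i'(t)\,dt = B^y(\beta, g_i(\beta)) - B^y(\alpha, g_i(\alpha)) - \int_\alpha^\beta \frac{\partial B^y}{\partial x}(t, g_i(t))\,dt .
\]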
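Collecting the terms together using (7.6) and (7.7), the ones involving only $B^y$ cancel each other out and, because $x = t$ on $C_1$ and $C_3$, we are left with
\[
\begin{aligned}
\int_C \omega &= \int_\alpha^\beta \left(\frac{\partial B^y}{\partial x}(x, g_2(x)) - \frac{\partial B^y}{\partial x}(x, g_1(x))\right) dx - \iint_D \frac{\partial a}{\partial y}\,dA_{(x,y)} \\
&= \int_\alpha^\beta \int_{g_1(x)}^{g_2(x)} \frac{\partial^2 B^y(x,y)}{\partial y\,\partial x}\,dy\,dx - \iint_D \frac{\partial a}{\partial y}\,dA_{(x,y)} \qquad (\ast) \\
&= \int_\alpha^\beta \int_{g_1(x)}^{g_2(x)} \frac{\partial b(x,y)}{\partial x}\,dy\,dx - \iint_D \frac{\partial a}{\partial y}\,dA_{(x,y)} \\
&= \iint_D \left(\frac{\partial b}{\partial x} - \frac{\partial a}{\partial y}\right) dA_{(x,y)},
\end{aligned}
\]
where the equality following $(\ast)$ (on the second line above) is justified by the fact that $B^y$ is a $C^2$ function, so that we can interchange the order of the mixed derivatives, that is:
\[
\frac{\partial^2 B^y(x,y)}{\partial y\,\partial x} = \frac{\partial^2 B^y(x,y)}{\partial x\,\partial y} = \frac{\partial}{\partial x}\,\frac{\partial B^y(x,y)}{\partial y} = \frac{\partial}{\partial x}\,b(x,y).
\]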
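Suppose that $D = D_1 \cup \dots \cup D_k$ where each $D_j$ is a region of Type I or II. Let $C_i$ be the (nonempty) smooth boundary curve between $D_i$ and an adjacent region $\tilde D$. Then, $C_i$ as a portion of $\partial D_i$ has the opposite orientation to $C_i$ as a portion of $\partial \tilde D$. Therefore,
\[
\begin{aligned}
\iint_{D_i \cup \tilde D} \left(\frac{\partial b}{\partial x} - \frac{\partial a}{\partial y}\right) dA
&= \iint_{D_i} \left(\frac{\partial b}{\partial x} - \frac{\partial a}{\partial y}\right) dA + \iint_{\tilde D} \left(\frac{\partial b}{\partial x} - \frac{\partial a}{\partial y}\right) dA \\
&= \oint_{\partial D_i} \omega + \oint_{\partial \tilde D} \omega \\
&= \int_{\partial D_i \setminus C_i} \omega + \int_{C_i} \omega + \int_{\partial \tilde D \setminus -C_i} \omega + \int_{-C_i} \omega \\
&= \int_{\partial D_i \setminus C_i} \omega + \int_{\partial \tilde D \setminus -C_i} \omega \\
&= \oint_{\partial (D_i \cup \tilde D)} \omega
\end{aligned}
\]
and we can continue this process by iterating over the rest of the components of $D$ to obtain the result. This completes the proof.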
7.3.1 Green’s Theorem: examples and applications
We begin with the 1-form from Example 4.3.7 of Chapter 4. We know that this 1-form is not exact and compute its line integral around a circle of radius $a$ centered at the origin. We generalize this result to any simple closed curve surrounding the origin.

Example 7.3.5. Let
\[
\omega = \frac{-y\,dx + x\,dy}{x^2 + y^2}
\]
and $C$ be any simple closed curve around the origin in $\mathbb{R}^2$ oriented in the counterclockwise direction. We now show that
\[
\oint_C \omega = 2\pi .
\]
Let $D$ be the region enclosed by $C$ and consider a circle $C_a$ of radius $a > 0$ centered at the origin, with $a$ small enough such that $C_a \subset D$. We define $\tilde D$ as the region between $C_a$ and $C$, see Figure 7.18. Thus, $\partial \tilde D = C \cup C_a$, where $C$ has positive orientation going counterclockwise and $C_a$ has positive orientation going clockwise. Let $r(t) = (a\cos t, -a\sin t)$ with $t \in [0, 2\pi]$ be a clockwise parametrization of $C_a$. Then,
\[
\oint_{C_a} \omega = -2\pi
\]
and so
\[
\oint_{\partial \tilde D} \omega = \oint_C \omega + \oint_{C_a} \omega = \oint_C \omega - 2\pi .
\]

Fig. 7.18. Curves $C$ and $C_a$ forming the boundary of the domain $\tilde D$.
Recall that
\[
\frac{\partial b}{\partial x} - \frac{\partial a}{\partial y} = 0
\]
and so by Green’s theorem
\[
\oint_{\partial \tilde D} \omega = \iint_{\tilde D} \left(\frac{\partial b}{\partial x} - \frac{\partial a}{\partial y}\right) dA = 0,
\]
which means
\[
\oint_C \omega = 2\pi .
\]
Therefore, the value of the integral of the 1-form does not depend on the curve $C$ itself, but only on whether $C$ winds around the origin, which is the only point where $\omega$ is not defined. We now explain this result from the viewpoint of polar coordinates. Indeed, in polar coordinates, it is straightforward to compute that $\omega = d\theta$. Therefore, $\oint_C \omega$ computes the angular displacement along the curve $C$. For simple closed curves encircling the origin counterclockwise, we obtain $2\pi$.
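The independence of the integral from the particular curve can be observed numerically. The sketch below is an illustration only: the function `winding_integral`, the chord-midpoint quadrature and the two sample curves are assumptions made here, not taken from the text. It approximates the line integral of $(-y\,dx + x\,dy)/(x^2+y^2)$ along a closed curve by summing the integrand over short chords.

```python
import math


def winding_integral(curve, n=20000):
    """Approximate the line integral of (-y dx + x dy)/(x^2 + y^2).

    `curve(t)` returns (x, y) for t in [0, 2*pi] and must be closed and
    avoid the origin. Each chord contributes the integrand evaluated at
    the chord midpoint.
    """
    h = 2.0 * math.pi / n
    total = 0.0
    x0, y0 = curve(0.0)
    for k in range(1, n + 1):
        x1, y1 = curve(k * h)
        xm, ym = 0.5 * (x0 + x1), 0.5 * (y0 + y1)
        total += (-ym * (x1 - x0) + xm * (y1 - y0)) / (xm * xm + ym * ym)
        x0, y0 = x1, y1
    return total


def ellipse(t):
    # Counterclockwise ellipse that encloses the origin.
    return (3.0 * math.cos(t), math.sin(t))


def shifted_circle(t):
    # Counterclockwise circle centered at (5, 0); the origin lies outside.
    return (5.0 + math.cos(t), math.sin(t))


print(winding_integral(ellipse))         # close to 2*pi
print(winding_integral(shifted_circle))  # close to 0
```

The second curve does not wind around the origin, so Green’s theorem applies on the whole enclosed region and the integral vanishes.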
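The following example shows the derivation of one of Green’s formulas [2].

Example 7.3.6. Let $D \subset \mathbb{R}^2$ be an open subset with $\partial D = C$ given by a simple closed curve. Let $u = u(x,y)$ and $v = v(x,y)$. Recall that
\[
\Delta v(x,y) = \frac{\partial^2 v}{\partial x^2} + \frac{\partial^2 v}{\partial y^2}.
\]
Consider the 1-form
\[
\omega = -u\,\frac{\partial v}{\partial y}\,dx + u\,\frac{\partial v}{\partial x}\,dy.
\]
Then, a straightforward computation shows that
\[
d\omega = (u\,\Delta v + \nabla u \cdot \nabla v)\,dx \wedge dy. \tag{7.8}
\]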
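Thus, taking the double integral over $D$ and applying Green’s theorem, we obtain
\[
\oint_C \omega = \iint_D d\omega = \iint_D (u\,\Delta v + \nabla u \cdot \nabla v)\,dA.
\]
Writing the line integral explicitly and rearranging the terms we have
\[
\iint_D \nabla u \cdot \nabla v\,dA = -\iint_D u\,\Delta v\,dA + \oint_C \left(-u\,\frac{\partial v}{\partial y}\,dx + u\,\frac{\partial v}{\partial x}\,dy\right). \tag{7.9}
\]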
We simplify this last integral further as follows. We know that if $C$ has a parametrization $r(t) = (x(t), y(t))$, then $r'(t)$ is tangent to $C$. The vector $N(t) = (y'(t), -x'(t))$ is perpendicular to $r'(t)$ and therefore perpendicular to the boundary curve $C$. Therefore, the pullback of $\omega$ by $r(t)$ is
\[
u(r(t))\left(-\frac{\partial v}{\partial y}(r(t))\,x'(t) + \frac{\partial v}{\partial x}(r(t))\,y'(t)\right) dt = u(r(t))\,\bigl(\nabla v(r(t)) \cdot N(t)\bigr)\,dt.
\]
Thus, we can rewrite (7.9) as
\[
\iint_D \nabla u \cdot \nabla v\,dA = -\iint_D u\,\Delta v\,dA + \oint_C u\,(\nabla v \cdot N)\,dt. \tag{7.10}
\]
This last equation is the first Green’s formula and is a useful tool in the context of partial differential equations.
The following example is called the Bendixson-Dulac theorem and is useful in the geometric theory of differential equations, see for instance [7].

Example 7.3.7. Let $F(x,y) = (f(x,y), g(x,y))$ be a vector field in $\mathbb{R}^2$. If there exists a simple closed curve $C$ with parametrization, say $r(t) = (x(t), y(t))$, such that $r'(t) = F(r(t))$, then $C$ is called a periodic solution of the vector field $F$. Periodic solutions are a fundamental concept in the theory of differential equations. Suppose the vector field $F$ is such that there exists a function $\varphi : \mathbb{R}^2 \to \mathbb{R}$ so that
\[
\frac{\partial (\varphi f)}{\partial x} + \frac{\partial (\varphi g)}{\partial y} > 0
\]
in $\mathbb{R}^2$. We now show that $F$ does not have any periodic solution. We do this by contradiction. We suppose there does exist a simple closed curve $C$ with parametrization $r(t)$ ($t \in [0, 2\pi]$) such that $r'(t) = F(r(t))$ and show that this leads to an impossible situation. Let $D \subset \mathbb{R}^2$ be the region with boundary $\partial D = C$. We consider the 1-form $\omega = -\varphi(x,y)\,g(x,y)\,dx + \varphi(x,y)\,f(x,y)\,dy$. Then, by Green’s theorem
\[
\oint_C \omega = \iint_D d\omega = \iint_D \left(\frac{\partial (\varphi f)}{\partial x} + \frac{\partial (\varphi g)}{\partial y}\right) dA_{(x,y)} > 0
\]
because the integrand is strictly positive. Note that on $C$ we have $dx = x'(t)\,dt = f(x(t), y(t))\,dt$ and $dy = y'(t)\,dt = g(x(t), y(t))\,dt$ because $r(t) = (x(t), y(t))$ is a solution of the differential equation. So
\[
\oint_C \omega = \int_0^{2\pi} \bigl(-\varphi(x(t),y(t))\,g(x(t),y(t))\,f(x(t),y(t)) + \varphi(x(t),y(t))\,f(x(t),y(t))\,g(x(t),y(t))\bigr)\,dt = 0.
\]
Therefore, we have a contradiction, since the line integral of $\omega$ along $C$ cannot be simultaneously positive and zero. This means the vector field $F$ does not admit periodic solutions.
Exercises

(1) Let $D$ be the region bounded by the lines $y = 0$, $y = x - 2$, $x = 0$. Parametrize the curves so that $\partial D$ has positive orientation.
(2) Consider the annular region $D$ bounded by the circles of radius 1 and of radius 4 centered at the origin. Parametrize the curves so that $\partial D$ has positive orientation.
(3) Let $D$ be the region bounded by the ellipse $x^2 + y^2/4 = 1$ and the ellipse $2x^2 + 4y^2 = 1$. Parametrize the curves so that $\partial D$ has negative orientation.
(4) Let $C$ be the curve given by the line from $(0,0)$ to $(1,1)$, the arc of circle from $(1,1)$ to $(-\sqrt{2}, 0)$, and finally the line from $(-\sqrt{2}, 0)$ to $(0,0)$. Let
\[
\omega = (x - x^2 y)\,dx + (x y^2 - y^3)\,dy
\]
and evaluate the integral $\oint_C \omega$.
(5) Compute $\oint_C \omega$ where $\omega = (x e^{-x^2} + 3y^2)\,dx + (6x - \tanh y^2)\,dy$ and $C$ is the positively oriented curve which forms the boundary of the half-disk of radius $a$ where $y \ge x$.
(6) Consider the vector field $F(x,y) = (a(x,y), b(x,y))$. Apply Green’s theorem to the 1-form $\omega = b(x,y)\,dx - a(x,y)\,dy$ and show that for any curve $C$ bounding a domain $D \subset \mathbb{R}^2$
\[
\oint_C F \cdot \hat n\,ds = \iint_D \operatorname{div} F\,dA
\]
where $\hat n$ is the unit vector perpendicular to $T_pC$ at all $p \in C$. This is often referred to as the Divergence theorem in the plane. Chapter 10 discusses the general Divergence theorem.
(7) For each of the following problems, use the Divergence theorem in the plane of Exercise 6.
    (a) Let $F(x,y) = (3x + 4\cos(y^2), 4yx)$ and $C$ be the triangle defined by the vertices $(0,0)$, $(2,0)$ and $(1,1)$.
    (b) Let $F(x,y) = (xy^2, yx^2)$ and $C$ be given by the segment joining $(0,0)$ to $(a,0)$, the portion of the circle of radius $a$ in the quadrants 1, 2 and 3, and finally the segment joining $(-a,0)$ to $(0,0)$.
7.4 Three-dimensional domains

We now extend integration to functions of three variables. We begin with cubic domains, which are the simplest type of domains for which Fubini’s theorem applies. As in the previous section, we define three types of domains for which it is straightforward to decompose the triple integral into three simple integrals. Several examples are presented.
7.4.1 Triple integral over cubes
We now present the construction of integrals of functions of three variables over a box domain $B$ defined by $[a_0, a_1] \times [b_0, b_1] \times [c_0, c_1]$. In this case, we can also define an integral over $B$ as before. Let $f : \mathbb{R}^3 \to \mathbb{R}$ and consider partitions of each interval:
\[
\begin{aligned}
a_0 &= x_0 < x_1 < \dots < x_{n-1} < x_n = a_1, \\
b_0 &= y_0 < y_1 < \dots < y_{m-1} < y_m = b_1, \\
c_0 &= z_0 < z_1 < \dots < z_{\ell-1} < z_\ell = c_1,
\end{aligned}
\]
from which we obtain a decomposition into subboxes $B_{ijk}$ for $i = 1, \dots, n$, $j = 1, \dots, m$ and $k = 1, \dots, \ell$. A subbox $B_{ijk}$ is illustrated in Figure 7.19.

Fig. 7.19. A subbox element $B_{ijk}$ decomposing the box domain $B$.

We define a Riemann sum as in the double integral
\[
RS = \sum_{i=1}^{n} \sum_{j=1}^{m} \sum_{k=1}^{\ell} f(x^*_{ijk}, y^*_{ijk}, z^*_{ijk})\,dV(B_{ijk}) \tag{7.11}
\]
and obtain a similar definition as above.
Definition 7.4.1. Let $(x^*_{ijk}, y^*_{ijk}, z^*_{ijk}) \in B_{ijk}$ for all $i = 1, \dots, n$, $j = 1, \dots, m$ and $k = 1, \dots, \ell$, and $dV(B_{ijk}) \to 0$ as $n, m, \ell \to \infty$. If the limit of the Riemann sum exists then $f$ is said to be integrable on $B$ and
\[
\iiint_B f(x,y,z)\,dV = \lim_{n,m,\ell \to \infty} \sum_{i=1}^{n} \sum_{j=1}^{m} \sum_{k=1}^{\ell} f(x^*_{ijk}, y^*_{ijk}, z^*_{ijk})\,dV(B_{ijk}).
\]
Continuity of $f$ on $B$ is again a sufficient condition for integrability, and piecewise continuous functions are integrable as well. This is described precisely in the following result.

Theorem 7.4.2. Let $f : B \subset \mathbb{R}^3 \to \mathbb{R}$. Then,
(1) If $f$ is continuous on $B$, then it is integrable on $B$.
(2) If $f$ is bounded and has only finite size jump discontinuities over a finite set of smooth surfaces, then $f$ is integrable on $B$.

As in the case of double integrals, Fubini’s theorem is applicable and we can compute the triple integral as an iterated integral.

Theorem 7.4.3. If $f$ is integrable on the box $B = [a_0, a_1] \times [b_0, b_1] \times [c_0, c_1]$, then
\[
\iiint_B f(x,y,z)\,dV = \int_{a_0}^{a_1} \int_{b_0}^{b_1} \int_{c_0}^{c_1} f(x,y,z)\,dz\,dy\,dx
\]
and all other permutations of the order of integration give the same value.
Again, just as with double integrals, we can compute interesting physical quantities such as total mass, moments, centre of mass, total charge, etc. in three-dimensional regions using the triple integral.

Example 7.4.4. Consider an electrically charged region $B$ in the shape of a box. The charge density in $B$ (given in coulombs per cubic metre, C/m$^3$) is given by a continuous function $\sigma(x,y,z)$. The Riemann sum over the region $B$ with $\sigma(x,y,z)$ gives an approximation to the total charge and we define the total charge as the limit of the Riemann sum. Thus, the total charge is defined by
\[
\iiint_B \sigma(x,y,z)\,dV.
\]
Suppose that $B = [0,1] \times [0,1] \times [0,1]$ and $\sigma(x,y,z) = xyz$. The total charge in $B$ is
\[
\iiint_B \sigma(x,y,z)\,dV = \int_0^1 x\,dx \int_0^1 y\,dy \int_0^1 z\,dz = \frac{1}{8}.
\]
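The Riemann sum behind this total charge can be evaluated directly. The following is a minimal sketch under stated assumptions: the function name `riemann_sum_box`, the uniform partition and the choice of midpoints as sample points are illustrative choices, not prescribed by the text.

```python
def riemann_sum_box(f, box, n, m, l):
    """Riemann sum of f over box = ((a0, a1), (b0, b1), (c0, c1)).

    Uses a uniform partition with n, m, l subintervals and the midpoint
    of each subbox as the sample point.
    """
    (a0, a1), (b0, b1), (c0, c1) = box
    dx, dy, dz = (a1 - a0) / n, (b1 - b0) / m, (c1 - c0) / l
    dV = dx * dy * dz  # volume of each subbox
    total = 0.0
    for i in range(n):
        x = a0 + (i + 0.5) * dx
        for j in range(m):
            y = b0 + (j + 0.5) * dy
            for k in range(l):
                z = c0 + (k + 0.5) * dz
                total += f(x, y, z) * dV
    return total


unit_cube = ((0.0, 1.0), (0.0, 1.0), (0.0, 1.0))
total_charge = riemann_sum_box(lambda x, y, z: x * y * z, unit_cube, 20, 20, 20)
print(total_charge)  # close to 1/8 = 0.125
```

For this particular density the midpoint sum factors into three one-dimensional midpoint sums of linear functions, so it reproduces $1/8$ up to rounding; for a general continuous density the sum only approaches the integral as the partition is refined.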
7.4.2 Triple integrals over more general regions
We can define the triple integral over a general region $E$ as in the case of double integrals. Suppose that the boundary of $E$ is made up of a finite number of smooth surfaces and let $E \subset B$ where $B$ is a box in $\mathbb{R}^3$. Assume $f(x,y,z)$ is continuous on $E$. We define
\[
F(x,y,z) =
\begin{cases}
f(x,y,z) & (x,y,z) \in E, \\
0 & (x,y,z) \in B \setminus E.
\end{cases}
\]
Then, $F$ is continuous on $B$ except for finite jump discontinuities over a finite number of smooth surfaces. By Theorem 7.4.2, $F$ is integrable and we define
\[
\iiint_E f(x,y,z)\,dV := \iiint_B F(x,y,z)\,dV.
\]
We can proceed to a similar classification of domains in $\mathbb{R}^3$ into three types.
Type I: A region $E$ is of type I if it lies between the graphs of two continuous functions $u_1(x,y)$ and $u_2(x,y)$ with domain $D$:
\[
E = \{(x,y,z) \mid (x,y) \in D,\ u_1(x,y) \le z \le u_2(x,y)\}.
\]
The general formula is of the form
\[
\iiint_E f(x,y,z)\,dV = \iint_D \left(\int_{u_1(x,y)}^{u_2(x,y)} f(x,y,z)\,dz\right) dA.
\]
Example 7.4.5. We set up the triple integral of $f(x,y,z) = (x + y)z$ on the region $E$ enclosed by the ellipsoid
\[
x^2 + \frac{y^2}{4} + z^2 = 1
\]
and the cylinder
\[
\left(x - \frac{1}{2}\right)^2 + y^2 = \frac{1}{4}.
\]
The region $E$ can be obtained as the region inside the cylinder, bounded below and above by the portion of the ellipsoid inside the cylinder for $z < 0$ and $z > 0$ respectively. Figure 7.20 shows the portion for $z > 0$. We write
\[
D = \left\{(x,y) \;\middle|\; \left(x - \frac{1}{2}\right)^2 + y^2 \le \frac{1}{4}\right\}
\]
and
\[
E = \left\{(x,y,z) \;\middle|\; (x,y) \in D,\ -\sqrt{1 - x^2 - y^2/4} \le z \le \sqrt{1 - x^2 - y^2/4}\right\}.
\]
Then,
\[
\iiint_E (x + y)z\,dV = \iint_D \left(\int_{-\sqrt{1 - x^2 - y^2/4}}^{\sqrt{1 - x^2 - y^2/4}} (x + y)z\,dz\right) dA.
\]

Fig. 7.20. Region $E$ enclosed by the ellipsoid and the cylinder in Example 7.4.5.
Type II: A region $E$ is of type II if it lies between the graphs of two continuous functions $v_1(y,z)$ and $v_2(y,z)$ with domain $D$:
\[
E = \{(x,y,z) \mid (y,z) \in D,\ v_1(y,z) \le x \le v_2(y,z)\}.
\]
The general formula is of the form
\[
\iiint_E f(x,y,z)\,dV = \iint_D \left(\int_{v_1(y,z)}^{v_2(y,z)} f(x,y,z)\,dx\right) dA.
\]
Example 7.4.6. We set up the triple integral of $f(x,y,z) = x^2$ on the region $E$ enclosed by the planes $x - y + z = 2$, $x = -1$, $y = 0$ and $z = 0$. The planes delimiting the region $E$ are shown in Figure 7.21. We choose to let $x = 2 + y - z$ and consider the region in the plane $x = -1$ bounded by the remaining planes. On $x = -1$ the plane $x - y + z = 2$ reduces to the line $y = z - 3$, so we define
\[
D = \{(y,z) \mid 0 \le z \le 3,\ z - 3 \le y \le 0\}.
\]
Then,
\[
E = \{(x,y,z) \mid (y,z) \in D,\ -1 \le x \le 2 + y - z\}.
\]
The triple integral is
\[
\iiint_E x^2\,dV = \iint_D \left(\int_{-1}^{2 + y - z} x^2\,dx\right) dA.
\]

Fig. 7.21. The various planes delimiting the region $E$ in Example 7.4.6.
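As a numerical companion to Example 7.4.6, the sketch below evaluates the iterated integral of $x^2$ over the region bounded by the planes $x - y + z = 2$, $x = -1$, $y = 0$ and $z = 0$. It assumes the description of the region as $0 \le z \le 3$, $z - 3 \le y \le 0$, $-1 \le x \le 2 + y - z$ (the projection onto the $(y,z)$ plane is the triangle with vertices $(0,0)$, $(-3,0)$ and $(0,3)$); the function name and the midpoint rule are choices made here for illustration.

```python
def integral_x_squared_over_E(n=200):
    """Midpoint-rule approximation of the triple integral of x^2 over
    E = {0 <= z <= 3, z - 3 <= y <= 0, -1 <= x <= 2 + y - z}.

    The innermost x-integral is done exactly; the outer (y, z) integral
    over the triangle D uses n x n midpoints.
    """
    total = 0.0
    dz = 3.0 / n
    for i in range(n):
        z = (i + 0.5) * dz
        y_low = z - 3.0            # lower limit of y for this z
        dy = (0.0 - y_low) / n
        for j in range(n):
            y = y_low + (j + 0.5) * dy
            x_top = 2.0 + y - z    # upper limit of x
            # Exact integral of x^2 from -1 to x_top.
            inner = (x_top ** 3 - (-1.0) ** 3) / 3.0
            total += inner * dy * dz
    return total


print(integral_x_squared_over_E())  # close to 1.8
```

Carrying out the three integrations by hand under the same description of $E$ gives $9/5$, which the approximation approaches as `n` grows.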
Type III: A region $E$ is of type III if it lies between the graphs of two continuous functions $w_1(x,z)$ and $w_2(x,z)$ with domain $D$:
\[
E = \{(x,y,z) \mid (x,z) \in D,\ w_1(x,z) \le y \le w_2(x,z)\}.
\]
The general formula is of the form
\[
\iiint_E f(x,y,z)\,dV = \iint_D \left(\int_{w_1(x,z)}^{w_2(x,z)} f(x,y,z)\,dy\right) dA.
\]
Fig. 7.22. Region E bounded by the cone and the sphere in Example 7.4.7.
Example 7.4.7. We set up the triple integral of $f(x,y,z) = xyz$ on the region $E$ enclosed by the cone $y^2 = x^2 + z^2$ and the sphere $x^2 + y^2 + z^2 = 1$ and containing the $y$-axis. See Figure 7.22. The region $E$ projects to a disk in the $(x,z)$ plane with maximal radius $1/\sqrt{2}$. This can be seen by obtaining the intersection curve of the cone and the sphere. We substitute $x^2 + z^2$ by $y^2$ in the sphere equation to obtain $2y^2 = 1$ and so $y^2 = 1/2$. The projection of the intersection curve to the $(x,z)$ plane is indeed $x^2 + z^2 = 1/2$. We can now write
\[
D = \{(x,z) \mid x^2 + z^2 \le 1/2\}
\]
and so
\[
E = \left\{(x,y,z) \;\middle|\; (x,z) \in D,\ -\sqrt{1 - x^2 - z^2} \le y \le \sqrt{1 - x^2 - z^2}\right\}.
\]
From this we can write the triple integral
\[
\iiint_E xyz\,dV = \iint_D \left(\int_{-\sqrt{1 - x^2 - z^2}}^{\sqrt{1 - x^2 - z^2}} xyz\,dy\right) dA.
\]
The following is a straightforward result to prove using Riemann sums.
Proposition 7.4.8. Let $B \subset \mathbb{R}^3$ and $B = B_1 \cup B_2$ where the intersection of $B_1$ and $B_2$ is only along smooth surfaces and let $g$ be a continuous function. Then
\[
\iiint_B g\,dV = \iiint_{B_1} g\,dV + \iiint_{B_2} g\,dV.
\]
We illustrate the use of this result in the next example.
Example 7.4.9. Let $B$ be the region enclosed inside the cone $(z - 1)^2 = x^2 + y^2$, bounded below by $z = 0$ and above by $z = 3$. We split $B = B_1 \cup B_2$ as follows. Let $D_1$ be the disk of radius 1 centered at the origin and $D_2$ be the disk of radius 2 centered at the origin. Then,
\[
\begin{aligned}
B_1 &= \{(x,y,z) \mid (x,y) \in D_1,\ 0 \le z \le 1 - \sqrt{x^2 + y^2}\}, \\
B_2 &= \{(x,y,z) \mid (x,y) \in D_2,\ 1 + \sqrt{x^2 + y^2} \le z \le 3\}.
\end{aligned}
\]
See Figure 7.23. If $g : B \to \mathbb{R}$ is an integrable function, then
\[
\begin{aligned}
\iiint_B g\,dV &= \iiint_{B_1} g\,dV + \iiint_{B_2} g\,dV \\
&= \iint_{D_1} \left(\int_0^{1 - \sqrt{x^2 + y^2}} g\,dz\right) dA + \iint_{D_2} \left(\int_{1 + \sqrt{x^2 + y^2}}^{3} g\,dz\right) dA.
\end{aligned}
\]
Fig. 7.23. Region B made up of two regions B 1 and B2 bounded by cones.
Exercises
(1) Find the volume of the solid that lies between the planes $z = 3 + 2x - y$, $z = -2$ and which projects to the rectangle $R = [-1, 1] \times [1, 2]$.
(2) Compute the volume under the surface $z = 4 - x^2 y$ over the domain $D$ bounded by the lines $y = -1$, $x = 0$, $x = 2$, $y = -x + 4$, $y = x + 2$.
(3) Calculate the iterated integrals. 1
1
2
1 1
1 2
ˆ ˆ ˆ z xy dx dy dz (a) − ˆ ˆ ˆ (b) xy e dz dx dy ˆ ˆ ˆ 0
(c)
1
0 1
−1
0
1
2
0
2
2
2 xyz
cos(πx)yz dy
0
dz
dx
(4) Consider the domains below and determine if they are of type I, II, or III.
    (a) $R_1 = \{(x,y,z) \mid -1 \le y \le 1,\ -2 \le x \le 0,\ -(x + y)^2 \le z \le 4\}$.
    (b) $R_2$ is the tetrahedron determined by the vertices $(0,0,0)$, $(1,0,0)$, $(0,1,0)$ and $(0,0,1)$.
    (c) $R_3$ is the region enclosed by the circular cylinder of radius 1 centered at the origin in the $xz$-plane and the sphere of radius 2 centered at the origin, and containing the $y$-axis.
    (d) $R_4 = \{(x,y,z) \mid 0 \le y \le 3,\ 0 \le z \le \ln(1 + y),\ -3 \le x \le y^2 + z^2\}$.
(5) Find the volume of the solid enclosed by the cylinder $z = 1 - y^2$, the $z = 0$ plane and the planes $x = -1$ and $x = 2 - y$.
(6) Find the volume of the solid $E$ enclosed by the elliptic paraboloid $z = x^2 + y^2$ and the plane $z = 4$. The answer is an integer multiple of $\pi$.
(7) Find the volume of the solid bounded below by the elliptic paraboloid $x = z^2 + y^2 - 1$ and above by the plane $x = 4 - z + 2y$.
8 Wedge Products and Exterior Derivatives

This chapter begins with an extension of the wedge product operation to arbitrary forms. This enables us to obtain formulae relating area and volume forms under changes of coordinates. These calculations are used at the beginning of Chapter 9 when discussing pullbacks. From the definition of wedge product, we can now extend the concept from differentiable 1-forms to differentiable $k$-forms, with an emphasis on 2-forms in $\mathbb{R}^2$ and $\mathbb{R}^3$ and 3-forms in $\mathbb{R}^3$. We then generalize the differential $d$ acting on functions by introducing the exterior derivative, which can be applied to any $k$-form. We do not prove all the properties of the exterior derivative and only show its use on 1- and 2-forms. In particular, we redefine the concepts of closed and exact forms using the exterior derivative.
8.1 More on Wedge Products

We generalize the wedge product defined for differentials $dx_i$ to 1-forms. This is done by a formal extension of the definition.

Definition 8.1.1 (Wedge product for 1-forms). Let $\omega_1, \omega_2$ be 1-forms in $\mathbb{R}^n$ and $u, v \in \mathbb{R}^n$. Then,
\[
(\omega_1 \wedge \omega_2)(u, v) := \det\begin{pmatrix} \omega_1(u) & \omega_2(u) \\ \omega_1(v) & \omega_2(v) \end{pmatrix}.
\]
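The wedge product has the following properties that are easily verifiable using the definition.

Proposition 8.1.2. Let $\omega_1, \omega_2, \omega_3$ be 1-forms on $\mathbb{R}^n$ and $a, b \in \mathbb{R}$.
(a) $\omega_1 \wedge \omega_1 \equiv 0$.
(b) $(a\omega_1 + b\omega_2) \wedge \omega_3 = a(\omega_1 \wedge \omega_3) + b(\omega_2 \wedge \omega_3)$.
(c) $\omega_1 \wedge \omega_2 = -(\omega_2 \wedge \omega_1)$.
(d) $(\omega_1 \wedge \omega_2)(u, u) = 0$.
(e) If $u_1, v_1, u_2, v_2 \in \mathbb{R}^n$ then
\[
(\omega_1 \wedge \omega_2)(a u_1 + b v_1, u_2) = a(\omega_1 \wedge \omega_2)(u_1, u_2) + b(\omega_1 \wedge \omega_2)(v_1, u_2)
\]
and
\[
(\omega_1 \wedge \omega_2)(u_1, a u_2 + b v_2) = a(\omega_1 \wedge \omega_2)(u_1, u_2) + b(\omega_1 \wedge \omega_2)(u_1, v_2).
\]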
Proof. (a) With $\omega_1 = \omega_2$ in the wedge product, the determinant has identical columns and so the determinant is identically zero.
(b) This property is obtained directly by writing
\[
\begin{aligned}
[(a\omega_1 + b\omega_2) \wedge \omega_3](u, v) &= \det\begin{pmatrix} (a\omega_1 + b\omega_2)(u) & \omega_3(u) \\ (a\omega_1 + b\omega_2)(v) & \omega_3(v) \end{pmatrix} \\
&= (a\omega_1 + b\omega_2)(u)\,\omega_3(v) - \omega_3(u)\,(a\omega_1 + b\omega_2)(v) \\
&= a\omega_1(u)\omega_3(v) + b\omega_2(u)\omega_3(v) - a\omega_3(u)\omega_1(v) - b\omega_3(u)\omega_2(v) \\
&= a\bigl(\omega_1(u)\omega_3(v) - \omega_3(u)\omega_1(v)\bigr) + b\bigl(\omega_2(u)\omega_3(v) - \omega_3(u)\omega_2(v)\bigr) \\
&= a(\omega_1 \wedge \omega_3)(u, v) + b(\omega_2 \wedge \omega_3)(u, v).
\end{aligned}
\]
(c) The columns of $\omega_2 \wedge \omega_1$ are interchanged from $\omega_1 \wedge \omega_2$, therefore one is minus the other.
(d) is straightforward from the definition because the two rows of the determinant are equal.
(e) is done by writing explicitly the left-hand side of the formula and computing explicitly the determinant.
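The determinant definition and the properties above are easy to experiment with for constant-coefficient 1-forms. The sketch below is only an illustration: the helper names `one_form` and `wedge`, and the sample coefficients and vectors, are assumptions introduced here. A 1-form with constant coefficients $c_1, \dots, c_n$ acts on a vector $u$ by $\sum_i c_i u_i$, and the wedge product is the $2 \times 2$ determinant $\omega_1(u)\omega_2(v) - \omega_2(u)\omega_1(v)$.

```python
def one_form(coeffs):
    """Constant-coefficient 1-form sum_i coeffs[i] dx_i, as a function of a vector."""
    return lambda vec: sum(c * x for c, x in zip(coeffs, vec))


def wedge(w1, w2):
    """Wedge product of two 1-forms, evaluated on a pair of vectors (u, v)."""
    return lambda u, v: w1(u) * w2(v) - w2(u) * w1(v)


# Hypothetical sample data in R^3.
w1 = one_form([1, 2, 0])   # dx1 + 2 dx2
w2 = one_form([0, 1, 3])   # dx2 + 3 dx3
u = (1, 0, 2)
v = (0, 1, 1)

print(wedge(w1, w2)(u, v))  # -8
print(wedge(w2, w1)(u, v))  # 8, illustrating property (c)
print(wedge(w1, w1)(u, v))  # 0, illustrating property (a)
print(wedge(w1, w2)(u, u))  # 0, illustrating property (d)
```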
We look at some examples of computations using the properties (a), (b), (c) above.

Example 8.1.3. Let
\[
\omega_1 = a_1(x,y)\,dx + a_2(x,y)\,dy \quad\text{and}\quad \omega_2 = b_1(x,y)\,dx + b_2(x,y)\,dy.
\]
Writing down the formula and using property (b) at first we obtain
\[
\begin{aligned}
\omega_1 \wedge \omega_2 &= (a_1\,dx + a_2\,dy) \wedge (b_1\,dx + b_2\,dy) \\
&= a_1 b_1\,dx \wedge dx + a_1 b_2\,dx \wedge dy + a_2 b_1\,dy \wedge dx + a_2 b_2\,dy \wedge dy \\
&= a_1 b_2\,dx \wedge dy + a_2 b_1\,dy \wedge dx && \text{using (a)} \\
&= (a_1 b_2 - a_2 b_1)\,dx \wedge dy && \text{using (c)},
\end{aligned}
\]
where all coefficients are evaluated at $(x,y)$.
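We now show that general wedge products are useful in dealing with changes of coordinates. We begin with a general calculation.

Example 8.1.4. Let $x = x(u,v)$, $y = y(u,v)$ and $z = z(u,v)$ and we compute their differentials:
\[
dx = \frac{\partial x}{\partial u}\,du + \frac{\partial x}{\partial v}\,dv, \qquad
dy = \frac{\partial y}{\partial u}\,du + \frac{\partial y}{\partial v}\,dv, \qquad
dz = \frac{\partial z}{\partial u}\,du + \frac{\partial z}{\partial v}\,dv.
\]
According to the calculation in Example 8.1.3 we set
\[
a_1 = \frac{\partial x}{\partial u}, \quad a_2 = \frac{\partial x}{\partial v}, \quad b_1 = \frac{\partial y}{\partial u}, \quad b_2 = \frac{\partial y}{\partial v}
\]
and so
\[
dx \wedge dy = \left(\frac{\partial x}{\partial u}\frac{\partial y}{\partial v} - \frac{\partial x}{\partial v}\frac{\partial y}{\partial u}\right) du \wedge dv.
\]
This can be rewritten
\[
dx \wedge dy = \det\begin{pmatrix} \dfrac{\partial x}{\partial u} & \dfrac{\partial x}{\partial v} \\[2mm] \dfrac{\partial y}{\partial u} & \dfrac{\partial y}{\partial v} \end{pmatrix} du \wedge dv = \det\left(\frac{\partial (x,y)}{\partial (u,v)}\right) du \wedge dv.
\]
Similar formulae with $dy \wedge dz$ and $dz \wedge dx$ yield
\[
dy \wedge dz = \det\left(\frac{\partial (y,z)}{\partial (u,v)}\right) du \wedge dv, \qquad
dz \wedge dx = \det\left(\frac{\partial (z,x)}{\partial (u,v)}\right) du \wedge dv.
\]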
The formulae of this example can be summarized in the following result for $\mathbb{R}^n$.

Proposition 8.1.5. Consider $(x_1, \dots, x_n) \in \mathbb{R}^n$ and suppose that $x_i = x_i(u,v)$ for $i = 1, \dots, n$ where the dependence is differentiable. Then, for $i \ne j$,
\[
dx_i \wedge dx_j = \det\left(\frac{\partial (x_i, x_j)}{\partial (u,v)}\right) du \wedge dv.
\]
We now consider specific changes of coordinates.
Example 8.1.6. (1) For the transformation to polar coordinates, we set $x = r\cos\theta$, $y = r\sin\theta$, then
\[
dx \wedge dy = \det\begin{pmatrix} \cos\theta & -r\sin\theta \\ \sin\theta & r\cos\theta \end{pmatrix} dr \wedge d\theta = r\,dr \wedge d\theta.
\]
(2) Consider the linear transformation $x = u + v$, $y = u - v$, then
\[
dx \wedge dy = \det\begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix} du \wedge dv = -2\,du \wedge dv.
\]

We translate those results directly to area forms $dA$. In coordinates $x, y$, we write $dA_{(x,y)}$ and recall that $dA_{(x,y)} = |dx \wedge dy|$. We have for $x = x(u,v)$ and $y = y(u,v)$,
\[
dA_{(x,y)} = \left|\det\left(\frac{\partial (x,y)}{\partial (u,v)}\right)\right| dA_{(u,v)}.
\]
(1) For polar coordinates, $dA_{(x,y)} = r\,dA_{(r,\theta)}$.
(2) For $x = u + v$ and $y = u - v$, $dA_{(x,y)} = 2\,dA_{(u,v)}$.
(3) More generally, suppose that
\[
(x, y)^T = B\,(u, v)^T
\]
where $B$ is a $2 \times 2$ matrix of real numbers with $\det(B) \ne 0$. Then $dA_{(x,y)} = |\det(B)|\,dA_{(u,v)}$.
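The Jacobian determinants of Example 8.1.6 can be checked numerically. The following is a rough sketch, assuming a central-difference approximation of the partial derivatives; the function name `jacobian_det_2d`, the step size `h` and the sample points are illustrative choices made here.

```python
import math


def jacobian_det_2d(phi, u, v, h=1e-6):
    """Central-difference approximation of det d(x, y)/d(u, v) at (u, v)
    for a map phi(u, v) -> (x, y)."""
    xp, yp = phi(u + h, v)
    xm, ym = phi(u - h, v)
    x_u, y_u = (xp - xm) / (2 * h), (yp - ym) / (2 * h)
    xp, yp = phi(u, v + h)
    xm, ym = phi(u, v - h)
    x_v, y_v = (xp - xm) / (2 * h), (yp - ym) / (2 * h)
    return x_u * y_v - x_v * y_u


def polar(r, theta):
    return (r * math.cos(theta), r * math.sin(theta))


def linear(u, v):
    return (u + v, u - v)


print(jacobian_det_2d(polar, 2.0, 0.7))    # close to r = 2
print(jacobian_det_2d(linear, 0.3, -1.2))  # close to -2
```

The sign of the second value reflects that the linear map reverses orientation; the area form uses the absolute value, which gives the factor 2.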
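The formulae above are obtained by computing wedge products and from algebraic manipulations. We still need to determine which vectors are arguments of $dx \wedge dy$ and $du \wedge dv$ and look at the geometric meaning of the relationship between the area forms. Consider the change of variables given by the mapping $(x,y)^T = \Phi(u,v)$ and the point $(u_0, v_0) \in \mathbb{R}^2$. Consider now the vectors $\alpha = (\alpha_1, \alpha_2)$ and $\beta = (\beta_1, \beta_2)$ in $T_{(u_0,v_0)}\mathbb{R}^2$. Let $(x_0, y_0) = \Phi(u_0, v_0)$, then the vectors $D\Phi(\alpha)$ and $D\Phi(\beta)$ are in $T_{(x_0,y_0)}\mathbb{R}^2$, and generate a parallelogram $P(D\Phi(\alpha), D\Phi(\beta))$ based at $(x_0, y_0)$. Let $e_1 = (1, 0)^T$ and $e_2 = (0, 1)^T$ be the canonical basis vectors for $T_{(u_0,v_0)}\mathbb{R}^2$. Let $P(e_1, e_2)$ be the parallelogram generated by $e_1$ and $e_2$, then $(du \wedge dv)(e_1, e_2) = 1$. Suppose that $\alpha = e_1$ and $\beta = e_2$. Then
\[
(D\Phi(e_1), D\Phi(e_2)) = \left(\begin{pmatrix} \dfrac{\partial x}{\partial u} \\[2mm] \dfrac{\partial y}{\partial u} \end{pmatrix}, \begin{pmatrix} \dfrac{\partial x}{\partial v} \\[2mm] \dfrac{\partial y}{\partial v} \end{pmatrix}\right).
\]
Therefore,
\[
\begin{aligned}
(dx \wedge dy)(D\Phi(e_1), D\Phi(e_2)) &= \det\begin{pmatrix} \dfrac{\partial x}{\partial u} & \dfrac{\partial x}{\partial v} \\[2mm] \dfrac{\partial y}{\partial u} & \dfrac{\partial y}{\partial v} \end{pmatrix} \\
&= \det\left(\frac{\partial (x,y)}{\partial (u,v)}\right) (du \wedge dv)(e_1, e_2)
\end{aligned}
\]
where the last equality holds because $(du \wedge dv)(e_1, e_2) = 1$. If we let $\alpha, \beta$ be arbitrary vectors,
\[
(D\Phi(\alpha), D\Phi(\beta)) = (\alpha_1 D\Phi(e_1) + \alpha_2 D\Phi(e_2),\ \beta_1 D\Phi(e_1) + \beta_2 D\Phi(e_2))
\]
because the derivative $D\Phi$ is a linear operator. We now compute the area of the parallelogram $P(D\Phi(\alpha), D\Phi(\beta))$:
\[
\begin{aligned}
(dx \wedge dy)(D\Phi(\alpha), D\Phi(\beta)) &= (dx \wedge dy)(\alpha_1 D\Phi(e_1) + \alpha_2 D\Phi(e_2),\ \beta_1 D\Phi(e_1) + \beta_2 D\Phi(e_2)) \\
&= \alpha_1 \beta_2\,(dx \wedge dy)(D\Phi(e_1), D\Phi(e_2)) + \alpha_2 \beta_1\,(dx \wedge dy)(D\Phi(e_2), D\Phi(e_1)) \\
&= (\alpha_1 \beta_2 - \alpha_2 \beta_1)\,\det\left(\frac{\partial (x,y)}{\partial (u,v)}\right) \\
&= \det\left(\frac{\partial (x,y)}{\partial (u,v)}\right) (du \wedge dv)(\alpha, \beta)
\end{aligned}
\]
with the last line obtained because
\[
(du \wedge dv)(\alpha, \beta) = \det\begin{pmatrix} du(\alpha) & dv(\alpha) \\ du(\beta) & dv(\beta) \end{pmatrix} = \alpha_1 \beta_2 - \alpha_2 \beta_1.
\]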
We summarize these computations in the following result.

Theorem 8.1.7. Let $\Phi : \mathbb{R}^2 \to \mathbb{R}^2$ be a change of coordinates (i.e. differentiable and invertible) such that $\Phi(u_0, v_0) = (x_0, y_0)$. Let $\alpha$ and $\beta$ be tangent vectors at $(u_0, v_0)$. Then, $D\Phi(\alpha)$ and $D\Phi(\beta)$ are tangent vectors at $(x_0, y_0)$ and the signed areas of the parallelograms $P(D\Phi(\alpha), D\Phi(\beta))$ and $P(\alpha, \beta)$ are related by the formula
\[
(dx \wedge dy)(D\Phi(\alpha), D\Phi(\beta)) = \det\left(\frac{\partial (x,y)}{\partial (u,v)}\right) (du \wedge dv)(\alpha, \beta).
\]
Fig. 8.1. The rectangle $R$ is mapped by the polar coordinates mapping $\Phi$ to a portion of annulus $\Phi(R)$. The vectors $\alpha$ and $\beta$ are mapped to $D\Phi(\alpha)$ and $D\Phi(\beta)$.
We look at the important case of polar coordinate transformations.

Example 8.1.8. Consider the case of polar coordinates with a rectangle $R$ in $(r, \theta)$ space generated by $\alpha$ and $\beta$, where we assume that $\alpha$ and $\beta$ are relatively small. Then, $\Phi(R)$ is a portion of annulus in $(x,y)$-space as in Figure 8.1. The parallelogram $P(D\Phi(\alpha), D\Phi(\beta))$ provides an approximation to the area of the portion of annulus, with the approximation improving as $\alpha$ and $\beta$ approach zero.
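The quality of this approximation can be made concrete. The sketch below assumes the rectangle is spanned by $\alpha = (\Delta r, 0)$ and $\beta = (0, \Delta\theta)$ at the point $(r, \theta)$; the function names and the sample values are illustrative assumptions. The exact area of the portion of annulus is $\tfrac12\bigl((r + \Delta r)^2 - r^2\bigr)\Delta\theta$, while the parallelogram has area $r\,\Delta r\,\Delta\theta$.

```python
def sector_area(r, dr, dtheta):
    """Exact area of the portion of annulus with radii in [r, r + dr]
    and angles in an interval of length dtheta."""
    return 0.5 * ((r + dr) ** 2 - r ** 2) * dtheta


def parallelogram_area(r, dr, dtheta):
    """Area of the parallelogram spanned by DPhi(alpha) and DPhi(beta)
    for alpha = (dr, 0), beta = (0, dtheta): r * dr * dtheta."""
    return r * dr * dtheta


r, dr, dtheta = 2.0, 0.5, 0.4   # hypothetical sample values
ratios = []
for scale in (1.0, 0.1, 0.01):
    ratio = parallelogram_area(r, scale * dr, scale * dtheta) / sector_area(
        r, scale * dr, scale * dtheta
    )
    ratios.append(ratio)
    print(scale, ratio)  # the ratio approaches 1 as the rectangle shrinks
```

The ratio equals $r / (r + \Delta r / 2)$, so the relative error is of the order of $\Delta r / r$.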
We continue with an application to volume forms. Recall that
\[
dV = |dx \wedge dy \wedge dz|.
\]
We now explore the effect of changes of coordinates in $\mathbb{R}^3$ on the triple wedge product, and therefore the volume form.

Theorem 8.1.9. Let $x = x(u,v,w)$, $y = y(u,v,w)$ and $z = z(u,v,w)$ be a differentiable change of coordinates. Then
\[
dx \wedge dy \wedge dz = \det\left(\frac{\partial (x,y,z)}{\partial (u,v,w)}\right) du \wedge dv \wedge dw.
\]
In particular,
\[
dV_{(x,y,z)} = \left|\det\left(\frac{\partial (x,y,z)}{\partial (u,v,w)}\right)\right| dV_{(u,v,w)}.
\]
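Proof. We write the wedge product explicitly in terms of the differentials
\[
\begin{aligned}
dx &= \frac{\partial x}{\partial u}\,du + \frac{\partial x}{\partial v}\,dv + \frac{\partial x}{\partial w}\,dw, \\
dy &= \frac{\partial y}{\partial u}\,du + \frac{\partial y}{\partial v}\,dv + \frac{\partial y}{\partial w}\,dw, \\
dz &= \frac{\partial z}{\partial u}\,du + \frac{\partial z}{\partial v}\,dv + \frac{\partial z}{\partial w}\,dw.
\end{aligned}
\]
We obtain
\[
\begin{aligned}
dx \wedge dy \wedge dz &= (dx \wedge dy) \wedge dz \\
&= \left[\left(\frac{\partial x}{\partial u}\,du + \frac{\partial x}{\partial v}\,dv + \frac{\partial x}{\partial w}\,dw\right) \wedge \left(\frac{\partial y}{\partial u}\,du + \frac{\partial y}{\partial v}\,dv + \frac{\partial y}{\partial w}\,dw\right)\right] \wedge \left(\frac{\partial z}{\partial u}\,du + \frac{\partial z}{\partial v}\,dv + \frac{\partial z}{\partial w}\,dw\right) \\
&= \left[\left(\frac{\partial x}{\partial u}\frac{\partial y}{\partial v} - \frac{\partial x}{\partial v}\frac{\partial y}{\partial u}\right) du \wedge dv + \left(\frac{\partial x}{\partial u}\frac{\partial y}{\partial w} - \frac{\partial x}{\partial w}\frac{\partial y}{\partial u}\right) du \wedge dw + \left(\frac{\partial x}{\partial v}\frac{\partial y}{\partial w} - \frac{\partial x}{\partial w}\frac{\partial y}{\partial v}\right) dv \wedge dw\right] \\
&\qquad \wedge \left(\frac{\partial z}{\partial u}\,du + \frac{\partial z}{\partial v}\,dv + \frac{\partial z}{\partial w}\,dw\right) \\
&= \left[\left(\frac{\partial x}{\partial u}\frac{\partial y}{\partial v} - \frac{\partial x}{\partial v}\frac{\partial y}{\partial u}\right)\frac{\partial z}{\partial w} - \left(\frac{\partial x}{\partial u}\frac{\partial y}{\partial w} - \frac{\partial x}{\partial w}\frac{\partial y}{\partial u}\right)\frac{\partial z}{\partial v} + \left(\frac{\partial x}{\partial v}\frac{\partial y}{\partial w} - \frac{\partial x}{\partial w}\frac{\partial y}{\partial v}\right)\frac{\partial z}{\partial u}\right] du \wedge dv \wedge dw \\
&= \det\left(\frac{\partial (x,y,z)}{\partial (u,v,w)}\right) du \wedge dv \wedge dw.
\end{aligned}
\]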
The result for volume forms is straightforward.

We apply this result to the main three-dimensional changes of coordinates.

Example 8.1.10 (Cylindrical coordinates). It is a direct extension of the polar case and the details are left to the reader. We obtain
\[
dx \wedge dy \wedge dz = r\,dr \wedge d\theta \wedge dz.
\]
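Example 8.1.11 (Spherical coordinates). For spherical coordinates, we use $x = x(\rho,\varphi,\theta)$, $y = y(\rho,\varphi,\theta)$, $z = z(\rho,\varphi,\theta)$ with
\[
x = \rho\cos\theta\sin\varphi, \qquad y = \rho\sin\theta\sin\varphi, \qquad z = \rho\cos\varphi,
\]
and so the Jacobian matrix is
\[
\frac{\partial (x,y,z)}{\partial (\rho,\varphi,\theta)} = \begin{pmatrix}
\cos\theta\sin\varphi & \rho\cos\theta\cos\varphi & -\rho\sin\theta\sin\varphi \\
\sin\theta\sin\varphi & \rho\sin\theta\cos\varphi & \rho\cos\theta\sin\varphi \\
\cos\varphi & -\rho\sin\varphi & 0
\end{pmatrix}
\]
which has determinant $\rho^2\sin\varphi$. Therefore,
\[
dx \wedge dy \wedge dz = \rho^2\sin\varphi\,d\rho \wedge d\varphi \wedge d\theta.
\]
We denote the volume forms in Cartesian and spherical coordinates respectively by $dV_{(x,y,z)}$ and $dV_{(\rho,\varphi,\theta)}$, then
\[
dV_{(x,y,z)} = |dx \wedge dy \wedge dz| = \rho^2\,|\sin\varphi|\,|d\rho \wedge d\varphi \wedge d\theta| = \rho^2\sin\varphi\,dV_{(\rho,\varphi,\theta)}.
\]
Note that $|\sin\varphi| = \sin\varphi$ since $\varphi \in [0, \pi]$.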
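The value of the spherical Jacobian determinant can be checked numerically. This is a minimal sketch under stated assumptions: it takes the usual spherical coordinates $x = \rho\cos\theta\sin\varphi$, $y = \rho\sin\theta\sin\varphi$, $z = \rho\cos\varphi$ with the variables ordered $(\rho, \varphi, \theta)$, and the function names, the step size and the sample point are choices made here.

```python
import math


def spherical(rho, phi, theta):
    """Spherical coordinates with phi measured from the positive z-axis."""
    return (
        rho * math.cos(theta) * math.sin(phi),
        rho * math.sin(theta) * math.sin(phi),
        rho * math.cos(phi),
    )


def jacobian_det_3d(f, p, h=1e-6):
    """Central-difference approximation of the Jacobian determinant of
    f: R^3 -> R^3 at the point p = (p0, p1, p2)."""
    cols = []
    for k in range(3):
        plus, minus = list(p), list(p)
        plus[k] += h
        minus[k] -= h
        fp, fm = f(*plus), f(*minus)
        # cols[k][i] approximates d f_i / d p_k.
        cols.append([(a - b) / (2 * h) for a, b in zip(fp, fm)])
    m = cols  # transpose of the Jacobian; the determinant is unchanged
    return (
        m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
        - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
        + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0])
    )


rho, phi, theta = 2.0, 0.8, 1.1  # hypothetical sample point
print(jacobian_det_3d(spherical, (rho, phi, theta)))  # numerical determinant
print(rho ** 2 * math.sin(phi))                        # rho^2 sin(phi)
```

With the variables ordered $(\rho, \theta, \varphi)$ instead, the determinant changes sign, which is immaterial for the volume form because of the absolute value.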
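We now state a result similar to Theorem 8.1.7.

Theorem 8.1.12. Let $\Phi : \mathbb{R}^3 \to \mathbb{R}^3$ be a change of coordinates (i.e. differentiable and invertible) such that $\Phi(u_0, v_0, w_0) = (x_0, y_0, z_0)$. Let $\alpha$, $\beta$ and $\gamma$ be tangent vectors at $(u_0, v_0, w_0)$. Then, $D\Phi(\alpha)$, $D\Phi(\beta)$ and $D\Phi(\gamma)$ are tangent vectors at $(x_0, y_0, z_0)$ and the signed volumes of the parallelepipeds $P(D\Phi(\alpha), D\Phi(\beta), D\Phi(\gamma))$ and $P(\alpha, \beta, \gamma)$ are related by the formula
\[
(dx \wedge dy \wedge dz)(D\Phi(\alpha), D\Phi(\beta), D\Phi(\gamma)) = \det\left(\frac{\partial (x,y,z)}{\partial (u,v,w)}\right) (du \wedge dv \wedge dw)(\alpha, \beta, \gamma).
\]
The calculation proceeds as in the 2D case. Consider the box generated by the canonical basis vectors $e_1$, $e_2$ and $e_3$ in the tangent space at $(u_0, v_0, w_0)$. Then,
\[
(D\Phi(e_1), D\Phi(e_2), D\Phi(e_3)) = \left(\begin{pmatrix} \dfrac{\partial x}{\partial u} \\[2mm] \dfrac{\partial y}{\partial u} \\[2mm] \dfrac{\partial z}{\partial u} \end{pmatrix}, \begin{pmatrix} \dfrac{\partial x}{\partial v} \\[2mm] \dfrac{\partial y}{\partial v} \\[2mm] \dfrac{\partial z}{\partial v} \end{pmatrix}, \begin{pmatrix} \dfrac{\partial x}{\partial w} \\[2mm] \dfrac{\partial y}{\partial w} \\[2mm] \dfrac{\partial z}{\partial w} \end{pmatrix}\right)
\]
and so
\[
(dx \wedge dy \wedge dz)(D\Phi(e_1), D\Phi(e_2), D\Phi(e_3)) = \det\left(\frac{\partial (x,y,z)}{\partial (u,v,w)}\right) (du \wedge dv \wedge dw)(e_1, e_2, e_3).
\]
Let $\alpha = (\alpha_1, \alpha_2, \alpha_3)$, $\beta = (\beta_1, \beta_2, \beta_3)$ and $\gamma = (\gamma_1, \gamma_2, \gamma_3)$ be three vectors at $(u_0, v_0, w_0)$, then by writing
\[
\begin{aligned}
\alpha &= \alpha_1 e_1 + \alpha_2 e_2 + \alpha_3 e_3, \\
\beta &= \beta_1 e_1 + \beta_2 e_2 + \beta_3 e_3, \\
\gamma &= \gamma_1 e_1 + \gamma_2 e_2 + \gamma_3 e_3
\end{aligned}
\]
and using the linearity property of the wedge product, we obtain the result of Theorem 8.1.12.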
Exercises
(1) Compute the wedge product of the 1-forms
\[
\omega_1 = 5x_1\,dx_1 + 3x_3 x_5\,dx_2 - x_2^2\,dx_3 + x_3^2 x_4\,dx_4
\]
and
\[
\omega_2 = -2x_5 x_3\,dx_1 - x_4\,dx_2 - x_3^3\,dx_3 + x_1 x_4 x_5\,dx_4 + x_1 x_5^2\,dx_5.
\]
(2) Let $x = a\cosh(u)$ and $y = a\sinh(u)$. Compute $dA_{(x,y)}$ in terms of $dA_{(a,u)}$.
(3) Let $x = u + 2v - 2w$, $y = -2u - 3w$ and $z = 3v + w$. Compute $dV_{(x,y,z)}$ in terms of $dV_{(u,v,w)}$.
(4) Compute $dV_{(x,y,z)}$ in terms of the volume form in cylindrical coordinates $(r, \theta, z)$.
(5) Let
\[
\begin{aligned}
\omega_1 &= a_1(x,y,z)\,dx + a_2(x,y,z)\,dy + a_3(x,y,z)\,dz, \\
\omega_2 &= b_1(x,y,z)\,dx + b_2(x,y,z)\,dy + b_3(x,y,z)\,dz, \\
\omega_3 &= c_1(x,y,z)\,dx + c_2(x,y,z)\,dy + c_3(x,y,z)\,dz.
\end{aligned}
\]
If $u, v, w$ are vectors in $\mathbb{R}^3$, verify that computing $(\omega_1 \wedge \omega_2 \wedge \omega_3)(u, v, w)$ using the definition is equal to
\[
(\omega_1 \wedge \omega_2 \wedge \omega_3)(u, v, w) = \det\begin{pmatrix}
\omega_1(u) & \omega_2(u) & \omega_3(u) \\
\omega_1(v) & \omega_2(v) & \omega_3(v) \\
\omega_1(w) & \omega_2(w) & \omega_3(w)
\end{pmatrix}.
\]
(6) The elliptic coordinate system is given by
\[
x = \cosh\mu\cos\theta, \qquad y = \sinh\mu\sin\theta
\]
where $\mu \ge 0$ and $\theta \in [0, 2\pi)$. Show that
\[
dx \wedge dy = (\sinh^2\mu + \sin^2\theta)\,d\mu \wedge d\theta.
\]
Recall that $(\cosh x)' = \sinh x$, $(\sinh x)' = \cosh x$ and $\cosh^2 x - \sinh^2 x = 1$.
(7) The oblate spheroidal coordinate system is given by the formula
\[
x = \cosh\mu\cos\varphi\cos\nu, \qquad y = \cosh\mu\sin\varphi\cos\nu, \qquad z = \sinh\mu\sin\nu,
\]
where $\mu \ge 0$, $\varphi \in [-\pi, \pi]$ and $\nu \in [-\pi/2, \pi/2]$. (Recall that $\cosh'(x) = \sinh(x)$, $\sinh'(x) = \cosh(x)$ and $\cosh^2(x) - \sinh^2(x) = 1$.) Show that
\[
dx \wedge dy \wedge dz = \cosh\mu\cos\nu\,(\sin^2\nu + \sinh^2\mu)\,d\mu \wedge d\varphi \wedge d\nu.
\]
(8) The parabolic coordinate system is given by
\[
x = \sigma\tau \quad\text{and}\quad y = \tfrac{1}{2}(\tau^2 - \sigma^2)
\]
where $\sigma, \tau \in \mathbb{R}$. Show that $dx \wedge dy = (\sigma^2 + \tau^2)\,d\sigma \wedge d\tau$.
8.2 Differential Forms

Using the wedge product of differentials, we now define differential 2-forms and 3-forms in $\mathbb{R}^2$ and $\mathbb{R}^3$. The subject of differential forms is quite extensive in both mathematics and physics, including differential geometry, control theory, dynamical systems, Hamiltonian mechanics and thermodynamics. We do not launch into the general definition of $k$-forms, but rather build up the differential forms which are needed in this book.

Definition 8.2.1. Differentiable 2-forms in $\mathbb{R}^2$ and $\mathbb{R}^3$ are defined, respectively, as
\[
\omega = a(x,y)\,dx \wedge dy, \qquad
\omega = a_1(x,y,z)\,dy \wedge dz + a_2(x,y,z)\,dz \wedge dx + a_3(x,y,z)\,dx \wedge dy,
\]
where $a, a_1, a_2, a_3$ are $C^1$ functions. Differentiable 3-forms in $\mathbb{R}^3$ are defined as
\[
\omega = b(x,y,z)\,dx \wedge dy \wedge dz
\]
where $b$ is a $C^1$ function.
In Chapter 4, we show that differentiable 1-forms can be defined in $\mathbb{R}^n$ for any $n \in \mathbb{N}$. This is true also of 2-forms and 3-forms. A general study of $k$-forms is beyond the scope of this text. In a nutshell, a $k$-form in $\mathbb{R}^n$ can be defined by taking linear combinations of
\[
dx_{i_1} \wedge dx_{i_2} \wedge \cdots \wedge dx_{i_k}
\]
by $C^1$ functions, where for vectors $u_1, \ldots, u_k$
\[
(dx_{i_1} \wedge dx_{i_2} \wedge \cdots \wedge dx_{i_k})(u_1, \ldots, u_k) := \det
\begin{pmatrix}
dx_{i_1}(u_1) & dx_{i_2}(u_1) & \cdots & dx_{i_k}(u_1)\\
dx_{i_1}(u_2) & dx_{i_2}(u_2) & \cdots & dx_{i_k}(u_2)\\
\vdots & \vdots & & \vdots\\
dx_{i_1}(u_k) & dx_{i_2}(u_k) & \cdots & dx_{i_k}(u_k)
\end{pmatrix}.
\]
By this definition of the multiple wedge product of differentials $dx_{i_1}, \ldots, dx_{i_k}$, we see that if $k > n$ then automatically
\[
dx_{i_1} \wedge dx_{i_2} \wedge \cdots \wedge dx_{i_k} \equiv 0
\]
because two columns of the determinant must be equal. Therefore, $k$-forms in $\mathbb{R}^n$ are nontrivial only for $k \le n$. In particular, differentiable 0-forms are differentiable functions $f : \mathbb{R}^n \to \mathbb{R}$.
Example 8.2.2. Consider the case $n = 4$ with $X = (x_1, x_2, x_3, x_4)$, then we can
define differentiable $k$-forms for $k = 0, 1, 2, 3, 4$. Those are
(1) 0-form: $a(X)$,
(2) 1-form: $\sum_{i=1}^{4} a_i(X)\, dx_i$,
(3) 2-form: $b_1(X)\, dx_1 \wedge dx_2 + b_2(X)\, dx_1 \wedge dx_3 + b_3(X)\, dx_1 \wedge dx_4 + b_4(X)\, dx_2 \wedge dx_3 + b_5(X)\, dx_2 \wedge dx_4 + b_6(X)\, dx_3 \wedge dx_4$,
(4) 3-form: $c_1(X)\, dx_1 \wedge dx_2 \wedge dx_3 + c_2(X)\, dx_1 \wedge dx_2 \wedge dx_4 + c_3(X)\, dx_1 \wedge dx_3 \wedge dx_4 + c_4(X)\, dx_2 \wedge dx_3 \wedge dx_4$,
(5) 4-form: $g(X)\, dx_1 \wedge dx_2 \wedge dx_3 \wedge dx_4$,
where $a : \mathbb{R}^4 \to \mathbb{R}$, $a_i : \mathbb{R}^4 \to \mathbb{R}$ for $i = 1, 2, 3, 4$, $b_i : \mathbb{R}^4 \to \mathbb{R}$ for $i = 1, \ldots, 6$, $c_i : \mathbb{R}^4 \to \mathbb{R}$ for $i = 1, 2, 3, 4$ and $g : \mathbb{R}^4 \to \mathbb{R}$ are all differentiable functions.
We now define the wedge product of arbitrary forms as follows.
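Definition 8.2.3 (Wedge product). Let $X = (x_1, \ldots, x_n)$, $k, \ell \le n$ and
\[
\omega_1 = a(X)\, dx_{i_1} \wedge dx_{i_2} \wedge \cdots \wedge dx_{i_k} \quad\text{and}\quad \omega_2 = b(X)\, dx_{j_1} \wedge dx_{j_2} \wedge \cdots \wedge dx_{j_\ell}.
\]
Then
\[
\omega_1 \wedge \omega_2 = a(X)b(X)\, dx_{i_1} \wedge dx_{i_2} \wedge \cdots \wedge dx_{i_k} \wedge dx_{j_1} \wedge dx_{j_2} \wedge \cdots \wedge dx_{j_\ell}.
\]
In particular, $\omega_1 \wedge \omega_2 \equiv 0$ if $dx_{i_p} = dx_{j_q}$ for some indices $p$ and $q$.
Example 8.2.4. Let $X \in \mathbb{R}^3$, $\omega_1 = a(X)\, dx \wedge dy$, $\omega_2 = b(X)\, dz$ and $\omega_3 = c(X)\, dx$. Then,
\[
\omega_1 \wedge \omega_3 \equiv 0, \qquad \omega_1 \wedge \omega_2 = a(X)b(X)\, dx \wedge dy \wedge dz \qquad\text{and}\qquad \omega_2 \wedge \omega_3 = b(X)c(X)\, dz \wedge dx.
\]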
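For general $k$-forms and $\ell$-forms in $\mathbb{R}^n$, the wedge product is taken by linearity: if $\omega_1 \in \Omega^k(\mathbb{R}^n)$, $\omega_2 \in \Omega^\ell(\mathbb{R}^n)$, $\omega_3 \in \Omega^m(\mathbb{R}^n)$ and $\alpha, \beta \in \mathbb{R}$ then
\[
(\alpha\,\omega_1 + \beta\,\omega_2) \wedge \omega_3 = \alpha\,\omega_1 \wedge \omega_3 + \beta\,\omega_2 \wedge \omega_3.
\]
Example 8.2.5. Let $\omega_1 = a_1(x,y,z)\, dy \wedge dz + a_2(x,y,z)\, dz \wedge dx + a_3(x,y,z)\, dx \wedge dy$ and $\omega_2 = b_1(x,y,z)\, dx + b_2(x,y,z)\, dy + b_3(x,y,z)\, dz$. Then, suppressing the arguments $(x,y,z)$ of the coefficients,
\[
\begin{aligned}
\omega_1 \wedge \omega_2 &= (a_1\, dy \wedge dz + a_2\, dz \wedge dx + a_3\, dx \wedge dy) \wedge (b_1\, dx + b_2\, dy + b_3\, dz)\\
&= a_1 b_1\, dy \wedge dz \wedge dx + a_2 b_2\, dz \wedge dx \wedge dy + a_3 b_3\, dx \wedge dy \wedge dz\\
&= (a_1 b_1 + a_2 b_2 + a_3 b_3)\, dx \wedge dy \wedge dz,
\end{aligned}
\]
where the second equality holds because every other product contains a repeated differential and therefore vanishes, and the last equality follows by interchanging the differentials (twice) in the wedge products of the first two terms.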
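The bookkeeping in Example 8.2.5 (drop products with a repeated differential, count interchanges to fix the sign) can be mirrored in a few lines of code. The following is only a sketch under stated assumptions: forms are evaluated at a single fixed point, a form is stored as a dictionary from index tuples to numerical coefficients, and the names `sort_sign` and `wedge` as well as the sample coefficient values are hypothetical choices made for this illustration.

```python
from itertools import product


def sort_sign(indices):
    """Sort a tuple of indices, tracking the sign of the permutation.

    Returns (0, ()) if an index repeats, since dx_i ^ dx_i = 0.
    """
    idx = list(indices)
    if len(set(idx)) < len(idx):
        return 0, ()
    sign = 1
    # Bubble sort: every interchange of two neighbouring differentials flips the sign.
    for i in range(len(idx)):
        for j in range(len(idx) - 1 - i):
            if idx[j] > idx[j + 1]:
                idx[j], idx[j + 1] = idx[j + 1], idx[j]
                sign = -sign
    return sign, tuple(idx)


def wedge(form1, form2):
    """Wedge product of two forms stored as {index_tuple: coefficient}."""
    result = {}
    for (i1, c1), (i2, c2) in product(form1.items(), form2.items()):
        sign, key = sort_sign(i1 + i2)
        if sign != 0:
            result[key] = result.get(key, 0) + sign * c1 * c2
    return result


# Example 8.2.5 at one point, with the (assumed) labelling x = 1, y = 2, z = 3.
a1, a2, a3 = 2.0, -1.0, 5.0   # hypothetical coefficient values
b1, b2, b3 = 3.0, 4.0, 0.5    # hypothetical coefficient values
omega1 = {(2, 3): a1, (3, 1): a2, (1, 2): a3}   # a1 dy^dz + a2 dz^dx + a3 dx^dy
omega2 = {(1,): b1, (2,): b2, (3,): b3}         # b1 dx + b2 dy + b3 dz
w = wedge(omega1, omega2)
```

With these sample values the only surviving key is `(1, 2, 3)`, and its coefficient equals `a1*b1 + a2*b2 + a3*b3`, in agreement with the example.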
We conclude this section by looking at the space of $k$-forms. We denote by $\Omega^k(\mathbb{R}^n)$ the space of differentiable $k$-forms. One can show that $\Omega^k(\mathbb{R}^n)$ is a vector space by verifying the following property: if $\omega_1, \omega_2 \in \Omega^k(\mathbb{R}^n)$ and $a, b \in \mathbb{R}$ then
\[
a\,\omega_1 + b\,\omega_2 \in \Omega^k(\mathbb{R}^n).
\]
Exercises
(1) Compute $\omega_1 \wedge \omega_2$ where $\omega_1 = x_1\, dx_1 + x_2\, dx_2 + x_3\, dx_3$ and $\omega_2 = x_1\, dx_1 \wedge dx_2 + dx_1 \wedge dx_3$.
(2) Let $\omega_1 = x_2 x_4\, dx_1 \wedge dx_3$ and $\omega_2 = x_1 x_3\, dx_2 \wedge dx_4$. Compute $\omega_1 \wedge \omega_2$.
(3) Denote by $\omega$ the 2-form in $\mathbb{R}^4$ given in Example 8.2.2. Show that $\omega \wedge \omega$ is not necessarily zero.
(4) Show that $\Omega^k(\mathbb{R}^n)$ is a vector space.
(5) This is a generalization of Exercises 5 and 6 of Section 7.1. Consider the coordinates of $\mathbb{R}^{2n}$ given by $(q_1, p_1, \ldots, q_n, p_n)$ and define the 2-form
\[
\omega = \sum_{i=1}^{n} dp_i \wedge dq_i.
\]
(a) Let $u = (u_1, v_1, \ldots, u_n, v_n)$ and $v = (w_1, z_1, \ldots, w_n, z_n)$ be vectors in $\mathbb{R}^{2n}$. Compute $\omega(u, v)$. Notice that it corresponds to the sum of the signed areas of the parallelograms $P((u_i, v_i), (w_i, z_i))$ for $i = 1, \ldots, n$.
(b) Let
\[
J = \begin{pmatrix} 0 & I_n\\ -I_n & 0 \end{pmatrix}
\]
where $I_n$ is the $n \times n$ identity matrix, and let $A$ be a $2n \times 2n$ matrix such that $A^T J A = J$. Show that
\[
\omega(Au, Av) = \omega(u, v).
\]
(Hint: represent $\omega$ using the matrix $J$.) This 2-form is a "symplectic form" and is an important tool for the study of Hamiltonian systems.
8.3 Exterior Derivative
In the first subsection we define a differential operator called the exterior derivative and denoted by $d$ (just as the differential), which we can use on 1-forms and 2-forms. The formal definition extends directly from the one we give here, but we do not present it. The main property of the exterior derivative is that its composition with itself is equal to 0. We show this fact in a particular case and go on to discuss the concepts of closed and exact $k$-forms from this point of view. The second subsection is optional on a first reading and shows how to use the exactness conditions (6.9) for a 1-form to formally construct the exterior derivative for 1-forms. In particular, this construction exhibits directly that $d\omega$ is in fact a 2-form.
8.3.1 Definition and Properties of the Exterior Derivative
In general, the exterior derivative is an operator $d : \Omega^k(\mathbb{R}^n) \to \Omega^{k+1}(\mathbb{R}^n)$. In the case of 1-forms in $\mathbb{R}^n$
\[
\omega(x_1, \ldots, x_n) = \sum_{i=1}^{n} a_i(x_1, \ldots, x_n)\, dx_i
\]
the exterior derivative is defined by
\[
d\omega := \sum_{i=1}^{n} da_i(x_1, \ldots, x_n) \wedge dx_i. \tag{8.1}
\]
For 2-forms in $\mathbb{R}^3$,
\[
\omega(x,y,z) = a_1(x,y,z)\, dy \wedge dz + a_2(x,y,z)\, dz \wedge dx + a_3(x,y,z)\, dx \wedge dy,
\]
then
\[
d\omega(x,y,z) := da_1(x,y,z) \wedge (dy \wedge dz) + da_2(x,y,z) \wedge (dz \wedge dx) + da_3(x,y,z) \wedge (dx \wedge dy).
\]
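The principle of the exterior derivative for $k$-forms follows the recipe above: take the differential of the coefficient functions and wedge it with the wedge product corresponding to each coefficient. We compute some useful general formulae with 1-forms and 2-forms in $\mathbb{R}^3$. As we know from Definition 8.3.6, if
\[
\omega = a_1(x,y,z)\, dx + a_2(x,y,z)\, dy + a_3(x,y,z)\, dz
\]
then
\[
d\omega = \left(\frac{\partial a_1}{\partial y}\, dy + \frac{\partial a_1}{\partial z}\, dz\right) \wedge dx + \left(\frac{\partial a_2}{\partial x}\, dx + \frac{\partial a_2}{\partial z}\, dz\right) \wedge dy + \left(\frac{\partial a_3}{\partial x}\, dx + \frac{\partial a_3}{\partial y}\, dy\right) \wedge dz
\]
which simplifies to
\[
d\omega = \left(\frac{\partial a_2}{\partial x} - \frac{\partial a_1}{\partial y}\right) dx \wedge dy + \left(\frac{\partial a_1}{\partial z} - \frac{\partial a_3}{\partial x}\right) dz \wedge dx + \left(\frac{\partial a_3}{\partial y} - \frac{\partial a_2}{\partial z}\right) dy \wedge dz. \tag{8.2}
\]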
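As a quick illustration of formula (8.2), here is a worked instance. The 1-form below is a hypothetical choice made only for this illustration and does not come from the text.

```latex
% Hypothetical 1-form: a_1 = z, a_2 = x, a_3 = y.
\[
\omega = z\,dx + x\,dy + y\,dz .
\]
% The partial derivatives entering (8.2):
\[
\frac{\partial a_2}{\partial x} - \frac{\partial a_1}{\partial y} = 1 - 0, \qquad
\frac{\partial a_1}{\partial z} - \frac{\partial a_3}{\partial x} = 1 - 0, \qquad
\frac{\partial a_3}{\partial y} - \frac{\partial a_2}{\partial z} = 1 - 0 .
\]
% Substituting into (8.2):
\[
d\omega = dx \wedge dy + dz \wedge dx + dy \wedge dz .
\]
```

The same result follows directly from the definition: $d\omega = dz \wedge dx + dx \wedge dy + dy \wedge dz$, since $da_1 = dz$, $da_2 = dx$ and $da_3 = dy$.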
Consider now the 2-form in $\mathbb{R}^3$
\[
\omega = b_1(x,y,z)\, dy \wedge dz + b_2(x,y,z)\, dz \wedge dx + b_3(x,y,z)\, dx \wedge dy.
\]
In the computation of $db_1$, $db_2$ and $db_3$ we only need to consider the differential term which does not appear in the associated wedge product. We obtain
\[
d\omega = \left(\frac{\partial b_1}{\partial x} + \frac{\partial b_2}{\partial y} + \frac{\partial b_3}{\partial z}\right) dx \wedge dy \wedge dz \tag{8.3}
\]
because the reordering of the wedge products to $dx \wedge dy \wedge dz$ always requires an even number of interchanges. The reader is invited to perform this calculation.
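For readers who want to compare with their own work, the calculation can be sketched term by term as follows. It uses only the definition of $d$ on 2-forms given above and the fact that a wedge product with a repeated differential vanishes.

```latex
\[
\begin{aligned}
db_1 \wedge dy \wedge dz
  &= \Big(\tfrac{\partial b_1}{\partial x}\,dx + \tfrac{\partial b_1}{\partial y}\,dy + \tfrac{\partial b_1}{\partial z}\,dz\Big) \wedge dy \wedge dz
   = \tfrac{\partial b_1}{\partial x}\, dx \wedge dy \wedge dz, \\
db_2 \wedge dz \wedge dx
  &= \tfrac{\partial b_2}{\partial y}\, dy \wedge dz \wedge dx
   = \tfrac{\partial b_2}{\partial y}\, dx \wedge dy \wedge dz
   \quad \text{(two interchanges)}, \\
db_3 \wedge dx \wedge dy
  &= \tfrac{\partial b_3}{\partial z}\, dz \wedge dx \wedge dy
   = \tfrac{\partial b_3}{\partial z}\, dx \wedge dy \wedge dz
   \quad \text{(two interchanges)}.
\end{aligned}
\]
```

Adding the three lines gives the stated expression for $d\omega$.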
Correspondence between k-forms and vector fields
Formulae (8.2) and (8.3) should remind you of the curl and divergence operators applied to vector fields, presented in Chapter 6, Section 6.3.2. We make this correspondence explicit by considering differentiable vector fields $F(x,y,z)$ and $G(x,y,z)$ in $\mathbb{R}^3$. We define the 1-form and the 2-form
\[
\omega_F(x,y,z) = F(x,y,z) \cdot (dx, dy, dz), \qquad \omega_G = G(x,y,z) \cdot (dy \wedge dz, dz \wedge dx, dx \wedge dy).
\]
Then,
\[
d\omega_F = \operatorname{curl} F(x,y,z) \cdot (dy \wedge dz, dz \wedge dx, dx \wedge dy), \qquad d\omega_G = \operatorname{div} G(x,y,z)\, dx \wedge dy \wedge dz.
\]
We return to these relationships in the formulation of Stokes’ theorem in terms of differential operators (see Chapter 10).
Exact and Closed k-forms
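We now come to the fundamental property of the exterior derivative when applied to differentiable $k$-forms. The proof is presented only in the case of 1-forms.
Theorem 8.3.1. For any $k$-form $\omega$ with $C^2$ coefficients,
\[
d^2\omega = d(d\omega) = 0.
\]
Proof for 1-forms: We present the proof for 1-forms in $\mathbb{R}^n$. The general proof follows a similar argument using the general form of a $k$-form explained in the previous section. Let $x = (x_1, \ldots, x_n)$ and
\[
\omega = \sum_{i=1}^{n} a_i(x)\, dx_i.
\]
Then,
\[
d\omega = \sum_{i=1}^{n} \left(\sum_{j=1}^{n} \frac{\partial a_i}{\partial x_j}\, dx_j\right) \wedge dx_i
\]
and
\[
d^2\omega = d(d\omega) = \sum_{i=1}^{n} \sum_{j=1}^{n} \sum_{k=1}^{n} \frac{\partial^2 a_i}{\partial x_k\, \partial x_j}\, dx_k \wedge dx_j \wedge dx_i.
\]
For any term in the triple sum for which $i = j$, $i = k$ or $j = k$, we have $dx_k \wedge dx_j \wedge dx_i \equiv 0$. Suppose now that $i, j, k$ are different and fix $i$. Considering only the terms $j = m_1$, $k = m_2$ and $j = m_2$, $k = m_1$ in the sum, we have
\[
\begin{aligned}
&\frac{\partial^2 a_i}{\partial x_{m_2}\, \partial x_{m_1}}\, dx_{m_2} \wedge dx_{m_1} \wedge dx_i + \frac{\partial^2 a_i}{\partial x_{m_1}\, \partial x_{m_2}}\, dx_{m_1} \wedge dx_{m_2} \wedge dx_i\\
&\qquad = \left(\frac{\partial^2 a_i}{\partial x_{m_2}\, \partial x_{m_1}} - \frac{\partial^2 a_i}{\partial x_{m_1}\, \partial x_{m_2}}\right) dx_{m_2} \wedge dx_{m_1} \wedge dx_i = 0
\end{aligned}
\]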
by the equality of mixed second partial derivatives guaranteed by Clairaut's theorem. That is, the terms in the sum cancel in pairs and this completes the proof for 1-forms.
We extend the definition of exactness to general $k$-forms; moreover, we introduce the concept of closed $k$-forms.
Definition 8.3.2 (exact and closed forms). Consider a differential form $\omega \in \Omega^k(\mathbb{R}^n)$.
(a) $\omega$ is exact if there exists a $(k-1)$-form $\alpha$ such that $\omega = d\alpha$.
(b) $\omega$ is closed if $d\omega = 0$.
These definitions lead directly to the following result.
Proposition 8.3.3. If $\omega$ is an exact $k$-form, then $\omega$ is closed.
Proof. If $\omega$ is exact, there exists a $(k-1)$-form $\alpha$ such that $\omega = d\alpha$. Then, $d\omega = d(d\alpha) = 0$ by Theorem 8.3.1.
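The identity $d(d\omega) = 0$ for a 1-form in $\mathbb{R}^3$ can also be observed numerically, by composing formula (8.2) with the formula for the exterior derivative of a 2-form. The sketch below approximates partial derivatives with central differences; the helper names `partial`, `d_one_form`, `d_two_form`, the step size and the sample coefficients are assumptions made for this illustration, not part of the text.

```python
def partial(f, i, h=1e-3):
    """Return a function approximating the i-th partial derivative of f(x, y, z)."""
    def df(x, y, z):
        hi = [x, y, z]
        lo = [x, y, z]
        hi[i] += h
        lo[i] -= h
        return (f(*hi) - f(*lo)) / (2 * h)
    return df


def d_one_form(a1, a2, a3):
    """Coefficients (b1, b2, b3) of d(omega) on dy^dz, dz^dx, dx^dy, as in (8.2)."""
    def b1(x, y, z):
        return partial(a3, 1)(x, y, z) - partial(a2, 2)(x, y, z)

    def b2(x, y, z):
        return partial(a1, 2)(x, y, z) - partial(a3, 0)(x, y, z)

    def b3(x, y, z):
        return partial(a2, 0)(x, y, z) - partial(a1, 1)(x, y, z)

    return b1, b2, b3


def d_two_form(b1, b2, b3):
    """Coefficient of dx^dy^dz in d(b1 dy^dz + b2 dz^dx + b3 dx^dy)."""
    def coeff(x, y, z):
        return (partial(b1, 0)(x, y, z)
                + partial(b2, 1)(x, y, z)
                + partial(b3, 2)(x, y, z))
    return coeff


# Hypothetical 1-form omega = x^2 y dx + y z^2 dy + x z dz.
a1 = lambda x, y, z: x * x * y
a2 = lambda x, y, z: y * z * z
a3 = lambda x, y, z: x * z

dd_omega = d_two_form(*d_one_form(a1, a2, a3))
residual = dd_omega(0.3, -1.2, 0.7)   # expected to be zero up to rounding error
```

The residual is not exactly zero in floating point, but it should sit at the level of rounding error, which is the numerical counterpart of the cancellation of mixed partial derivatives used in the proof.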
We look at a few examples.
Example 8.3.4. Let $\omega = a(x)\, dx + b(y)\, dy + c(z)\, dz$ where $a(x)$, $b(y)$ and $c(z)$ are $C^1$ functions. Then,
\[
d\omega = a'(x)\, dx \wedge dx + b'(y)\, dy \wedge dy + c'(z)\, dz \wedge dz = 0.
\]
So, $\omega$ is closed and $\omega$ is also exact, where $\omega = df$ with
\[
f(x,y,z) = \int a(x)\, dx + \int b(y)\, dy + \int c(z)\, dz.
\]
We conclude with a generalization of Theorem 6.4.5 to general differentiable k -forms. We do not prove this result, but refer to [1].
Theorem 8.3.5 (Poincaré's Lemma). Let $\omega$ be a $C^2$ $k$-form defined on $U \subset \mathbb{R}^n$. Then $\omega$ is closed if and only if $\omega$ is locally exact.
Exercises
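(1) Show that if $\omega_1$ is a $k$-form and $\omega_2$ is an $s$-form with $k, s \in \{0, 1, 2, 3\}$ then
\[
\omega_1 \wedge \omega_2 = (-1)^{ks}\, (\omega_2 \wedge \omega_1).
\]
(2) Let $\omega = xyz\, dx + yz\, dy + (x + z)\, dz$ and compute $d\omega$.
(3) Show that if $\omega_1, \omega_2$ are both either 1-forms, 2-forms or 3-forms in $\mathbb{R}^3$, then
\[
d(\omega_1 + \omega_2) = d\omega_1 + d\omega_2.
\]
(4) Show that if $\omega = \tfrac{1}{2}(-y\, dx + x\, dy)$ then $d\omega = dx \wedge dy$.
(5) Consider $\omega = (xy + \tfrac{1}{2}y)\, dx - \tfrac{1}{2}x\, dy$. Verify that $\omega = \omega_e + \omega_r$ where
\[
\omega_e = \frac{xy}{2}\, dx + \frac{x^2}{4}\, dy \quad\text{and}\quad \omega_r = \left(\frac{xy}{2} + \frac{y}{2}\right) dx - \left(\frac{x}{2} + \frac{x^2}{4}\right) dy.
\]
Show that $\omega_e$ is closed and find the 0-form $f$ such that $df = \omega_e$. Show that $\omega_r$ is not exact.
(6) Let $\omega = a(x,y)\, dx + b(x,y)\, dy$ where $a, b$ are differentiable and defined for all of $\mathbb{R}^2$, and let $F(x,y) = (\lambda x, \lambda y)$ be a vector field. Consider
\[
H\omega(x,y) := \int_0^1 \omega(\lambda x, \lambda y)\, F(x,y)\, d\lambda.
\]
Then, $H\omega(x,y)$ is a function. Explain why $dH\omega$ is exact. In each case below, compute $H\omega(x,y)$, $dH\omega$ and rewrite
\[
\omega = dH\omega + (\omega - dH\omega).
\]
(a) $\omega = (xy + y/2)\, dx - (x/2)\, dy$,
(b) $\omega = (5x^2 y - 2xy)\, dx + (2x - xy^2)\, dy$,
(c) $\omega = e^{xy}\, dx + x e^{y}\, dy$.
(7) Consider the 2-form
\[
\omega = (xy + 2z^2y^2)\, dy \wedge dz + \left(xz^2 - \frac{y^2}{2}\right) dz \wedge dx + (x^3 - y^2)\, dx \wedge dy.
\]
Verify that
\[
\omega = \omega_e + \omega_r
\]
where $\omega_e = xy\, dy \wedge dz - \frac{y^2}{2}\, dz \wedge dx$ and
\[
\omega_r = 2z^2y^2\, dy \wedge dz + xz^2\, dz \wedge dx + (x^3 - y^2)\, dx \wedge dy.
\]
Show that $\omega_e$ is closed and find a 1-form $\varphi$ such that $d\varphi = \omega_e$. Show that $\omega_r$ is not exact.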
8.3.2 A Construction of the Exterior Derivative
As Theorem 6.4.5 shows, the conditions (6.9) are important quantities to determine whether a 1-form is locally exact. Now, those expressions are obtained from differentiating the coefficients of $\omega$ and therefore we would like to express (6.9) in terms of a "differential operator" we can apply directly to $\omega$, in analogy with differential operators such as $\nabla$, div, curl or $\Delta$ seen in Section 6.3. The exterior derivative defined in the previous section is the correct differential operator for this purpose; however, it is stated there without any justification. In this section, we present one way of constructing the exterior derivative as applied to 1-forms.
Before doing the derivation, recall the following properties of matrices. Let $A$ be an $n \times n$ matrix; then $A$ is symmetric if $A = A^T$, while $A$ is skew-symmetric (or antisymmetric) if $A^T = -A$. Any matrix $A$ can be written as the sum of a symmetric and an antisymmetric matrix. Explicitly,
\[
A = B + C, \quad\text{where}\quad B = \tfrac{1}{2}(A + A^T) \quad\text{and}\quad C = \tfrac{1}{2}(A - A^T).
\]
It is straightforward to check that $B$ is symmetric and $C$ is skew-symmetric. We illustrate with a $2 \times 2$ matrix
\[
\begin{pmatrix} a & b\\ c & d \end{pmatrix}
=
\begin{pmatrix} a & \tfrac{1}{2}(b + c)\\ \tfrac{1}{2}(b + c) & d \end{pmatrix}
+ \frac{1}{2}(c - b) \begin{pmatrix} 0 & -1\\ 1 & 0 \end{pmatrix}.
\]
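The splitting $A = B + C$ is easy to check on a concrete matrix. The following is a minimal sketch with matrices stored as lists of rows; the function names and the sample matrix are hypothetical choices for this illustration.

```python
def transpose(A):
    """Transpose of a square matrix stored as a list of rows."""
    return [list(row) for row in zip(*A)]


def split_symmetric_skew(A):
    """Return (B, C) with B = (A + A^T)/2 symmetric and C = (A - A^T)/2 skew-symmetric."""
    At = transpose(A)
    n = len(A)
    B = [[0.5 * (A[i][j] + At[i][j]) for j in range(n)] for i in range(n)]
    C = [[0.5 * (A[i][j] - At[i][j]) for j in range(n)] for i in range(n)]
    return B, C


# Sample 2 x 2 matrix with a = 1, b = 2, c = 5, d = 4.
A = [[1.0, 2.0], [5.0, 4.0]]
B, C = split_symmetric_skew(A)
```

Here the off-diagonal entries of `B` are both $\tfrac{1}{2}(b + c)$, and `C` equals $\tfrac{1}{2}(c - b)$ times the matrix with rows $(0, -1)$ and $(1, 0)$, matching the $2 \times 2$ illustration.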
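Consider the 1-form $\omega = a(x,y)\, dx + b(x,y)\, dy$ where $a, b$ are $C^1$ functions in $\mathbb{R}^2$. We know that at each $(x,y) \in \mathbb{R}^2$, $\omega$ can be applied to a vector $v \in T_{(x,y)}\mathbb{R}^2$. If the point $(x,y)$ is not specified, the vector $v$ can vary with the base point $(x,y)$ and it is then more convenient to think not only of $\omega$, but of a pair $(\omega, F)$ depending on $(x,y)$ only, where $F = F(x,y)$ is some vector field. Thus, we consider $\omega(x,y)F$ as a function of $(x,y)$. Now, let $G = G(x,y)$ be another vector field and we compute the derivative of $\omega$ applied to vectors in $G$; that is,
\[
D\omega(x,y)F : T_{(x,y)}\mathbb{R}^2 \to \mathbb{R}, \qquad G \mapsto D\omega(x,y)F \cdot G.
\]
In explicit coordinates, we have the following expression:
\[
\begin{aligned}
D\omega(x,y)F \cdot G
&= \left(\frac{\partial a}{\partial x}\, dx(F) + \frac{\partial b}{\partial x}\, dy(F),\ \frac{\partial a}{\partial y}\, dx(F) + \frac{\partial b}{\partial y}\, dy(F)\right) \cdot G\\
&= \left(\frac{\partial a}{\partial x}\, dx(F) + \frac{\partial b}{\partial x}\, dy(F)\right) dx(G) + \left(\frac{\partial a}{\partial y}\, dx(F) + \frac{\partial b}{\partial y}\, dy(F)\right) dy(G)\\
&= (dx(F), dy(F))
\begin{pmatrix}
\frac{\partial a}{\partial x} & \frac{\partial a}{\partial y}\\
\frac{\partial b}{\partial x} & \frac{\partial b}{\partial y}
\end{pmatrix}
\begin{pmatrix} dx(G)\\ dy(G) \end{pmatrix}\\
&= (dx(F), dy(F))
\begin{pmatrix}
\frac{\partial a}{\partial x} & \frac{1}{2}\left(\frac{\partial a}{\partial y} + \frac{\partial b}{\partial x}\right)\\
\frac{1}{2}\left(\frac{\partial a}{\partial y} + \frac{\partial b}{\partial x}\right) & \frac{\partial b}{\partial y}
\end{pmatrix}
\begin{pmatrix} dx(G)\\ dy(G) \end{pmatrix}\\
&\quad + \frac{1}{2}\left(\frac{\partial b}{\partial x} - \frac{\partial a}{\partial y}\right) (dx(F), dy(F))
\begin{pmatrix} 0 & -1\\ 1 & 0 \end{pmatrix}
\begin{pmatrix} dx(G)\\ dy(G) \end{pmatrix}
\end{aligned}
\]
where the last equality comes from the splitting of a matrix into its symmetric and antisymmetric parts. In this form, we see the desired expression
\[
\frac{\partial b}{\partial x} - \frac{\partial a}{\partial y}
\]
appear in the second term. Moreover, the last term corresponds to the area of the parallelogram generated by the vectors $F$ and $G$; in other words, we can rewrite
\[
(dx(F), dy(F))
\begin{pmatrix} 0 & -1\\ 1 & 0 \end{pmatrix}
\begin{pmatrix} dx(G)\\ dy(G) \end{pmatrix}
= (dx \wedge dy)(F, G).
\]
A straightforward computation shows that interchanging the roles of $F$ and $G$ yields
\[
\begin{aligned}
D\omega(x,y)G \cdot F
&= (dx(G), dy(G))
\begin{pmatrix}
\frac{\partial a}{\partial x} & \frac{1}{2}\left(\frac{\partial a}{\partial y} + \frac{\partial b}{\partial x}\right)\\
\frac{1}{2}\left(\frac{\partial a}{\partial y} + \frac{\partial b}{\partial x}\right) & \frac{\partial b}{\partial y}
\end{pmatrix}
\begin{pmatrix} dx(F)\\ dy(F) \end{pmatrix}
+ \frac{1}{2}\left(\frac{\partial b}{\partial x} - \frac{\partial a}{\partial y}\right) (dx \wedge dy)(G, F)\\
&= (dx(G), dy(G))
\begin{pmatrix}
\frac{\partial a}{\partial x} & \frac{1}{2}\left(\frac{\partial a}{\partial y} + \frac{\partial b}{\partial x}\right)\\
\frac{1}{2}\left(\frac{\partial a}{\partial y} + \frac{\partial b}{\partial x}\right) & \frac{\partial b}{\partial y}
\end{pmatrix}
\begin{pmatrix} dx(F)\\ dy(F) \end{pmatrix}
- \frac{1}{2}\left(\frac{\partial b}{\partial x} - \frac{\partial a}{\partial y}\right) (dx \wedge dy)(F, G).
\end{aligned}
\]
We can now conclude this calculation by noticing that the symmetric parts cancel out and so we isolate the antisymmetric part, which is a 2-form:
\[
D\omega(x,y)F \cdot G - D\omega(x,y)G \cdot F = \left(\frac{\partial b}{\partial x} - \frac{\partial a}{\partial y}\right) (dx \wedge dy)(F, G).
\]
Let us look at a similar calculation with a 1-form in $\mathbb{R}^3$, where we use the $(x_1, x_2, x_3)$ notation. Let
\[
\omega = a_1(x_1, x_2, x_3)\, dx_1 + a_2(x_1, x_2, x_3)\, dx_2 + a_3(x_1, x_2, x_3)\, dx_3
\]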
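and $F = (F_1, F_2, F_3)^T$ and $G = (G_1, G_2, G_3)^T$ be vector fields. Then, we obtain
\[
\begin{aligned}
D\omega(x_1, x_2, x_3)F \cdot G
&= F^T
\begin{pmatrix}
\frac{\partial a_1}{\partial x_1} & \frac{1}{2}\left(\frac{\partial a_1}{\partial x_2} + \frac{\partial a_2}{\partial x_1}\right) & \frac{1}{2}\left(\frac{\partial a_1}{\partial x_3} + \frac{\partial a_3}{\partial x_1}\right)\\
\frac{1}{2}\left(\frac{\partial a_1}{\partial x_2} + \frac{\partial a_2}{\partial x_1}\right) & \frac{\partial a_2}{\partial x_2} & \frac{1}{2}\left(\frac{\partial a_2}{\partial x_3} + \frac{\partial a_3}{\partial x_2}\right)\\
\frac{1}{2}\left(\frac{\partial a_1}{\partial x_3} + \frac{\partial a_3}{\partial x_1}\right) & \frac{1}{2}\left(\frac{\partial a_2}{\partial x_3} + \frac{\partial a_3}{\partial x_2}\right) & \frac{\partial a_3}{\partial x_3}
\end{pmatrix}
G\\
&\quad + \frac{1}{2}\left(\frac{\partial a_2}{\partial x_1} - \frac{\partial a_1}{\partial x_2}\right) (dx_1(F), dx_2(F), dx_3(F))
\begin{pmatrix} 0 & -1 & 0\\ 1 & 0 & 0\\ 0 & 0 & 0 \end{pmatrix}
\begin{pmatrix} dx_1(G)\\ dx_2(G)\\ dx_3(G) \end{pmatrix}\\
&\quad + \frac{1}{2}\left(\frac{\partial a_3}{\partial x_1} - \frac{\partial a_1}{\partial x_3}\right) (dx_1(F), dx_2(F), dx_3(F))
\begin{pmatrix} 0 & 0 & -1\\ 0 & 0 & 0\\ 1 & 0 & 0 \end{pmatrix}
\begin{pmatrix} dx_1(G)\\ dx_2(G)\\ dx_3(G) \end{pmatrix}\\
&\quad + \frac{1}{2}\left(\frac{\partial a_3}{\partial x_2} - \frac{\partial a_2}{\partial x_3}\right) (dx_1(F), dx_2(F), dx_3(F))
\begin{pmatrix} 0 & 0 & 0\\ 0 & 0 & -1\\ 0 & 1 & 0 \end{pmatrix}
\begin{pmatrix} dx_1(G)\\ dx_2(G)\\ dx_3(G) \end{pmatrix}.
\end{aligned}
\]
By noticing that
\[
(dx_1(F), dx_2(F), dx_3(F))
\begin{pmatrix} 0 & -1 & 0\\ 1 & 0 & 0\\ 0 & 0 & 0 \end{pmatrix}
\begin{pmatrix} dx_1(G)\\ dx_2(G)\\ dx_3(G) \end{pmatrix}
= (dx_1 \wedge dx_2)(F, G)
\]
and similarly for the other two expressions, we can rewrite this as
\[
\begin{aligned}
D\omega(x_1, x_2, x_3)F \cdot G
&= F^T
\begin{pmatrix}
\frac{\partial a_1}{\partial x_1} & \frac{1}{2}\left(\frac{\partial a_1}{\partial x_2} + \frac{\partial a_2}{\partial x_1}\right) & \frac{1}{2}\left(\frac{\partial a_1}{\partial x_3} + \frac{\partial a_3}{\partial x_1}\right)\\
\frac{1}{2}\left(\frac{\partial a_1}{\partial x_2} + \frac{\partial a_2}{\partial x_1}\right) & \frac{\partial a_2}{\partial x_2} & \frac{1}{2}\left(\frac{\partial a_2}{\partial x_3} + \frac{\partial a_3}{\partial x_2}\right)\\
\frac{1}{2}\left(\frac{\partial a_1}{\partial x_3} + \frac{\partial a_3}{\partial x_1}\right) & \frac{1}{2}\left(\frac{\partial a_2}{\partial x_3} + \frac{\partial a_3}{\partial x_2}\right) & \frac{\partial a_3}{\partial x_3}
\end{pmatrix}
G\\
&\quad + \frac{1}{2}\left(\frac{\partial a_2}{\partial x_1} - \frac{\partial a_1}{\partial x_2}\right) (dx_1 \wedge dx_2)(F, G)
+ \frac{1}{2}\left(\frac{\partial a_3}{\partial x_1} - \frac{\partial a_1}{\partial x_3}\right) (dx_1 \wedge dx_3)(F, G)\\
&\quad + \frac{1}{2}\left(\frac{\partial a_3}{\partial x_2} - \frac{\partial a_2}{\partial x_3}\right) (dx_2 \wedge dx_3)(F, G).
\end{aligned}
\]
Therefore,
\[
\begin{aligned}
D\omega(x_1, x_2, x_3)F \cdot G - D\omega(x_1, x_2, x_3)G \cdot F
&= \left(\frac{\partial a_3}{\partial x_2} - \frac{\partial a_2}{\partial x_3}\right) (dx_2 \wedge dx_3)(F, G)
- \left(\frac{\partial a_3}{\partial x_1} - \frac{\partial a_1}{\partial x_3}\right) (dx_3 \wedge dx_1)(F, G)\\
&\quad + \left(\frac{\partial a_2}{\partial x_1} - \frac{\partial a_1}{\partial x_2}\right) (dx_1 \wedge dx_2)(F, G)
\end{aligned}
\]
because the symmetric parts cancel out. This calculation, unsurprisingly, also yields a 2-form.
Repeating the calculation above with a general 1-form in $\mathbb{R}^n$ we obtain the following formula. Let $x = (x_1, \ldots, x_n)$,
\[
\omega = \sum_{i=1}^{n} a_i(x)\, dx_i,
\]
and let $F = (F_1, \ldots, F_n)$, $G = (G_1, \ldots, G_n)$ be vector fields in $\mathbb{R}^n$. Then,
\[
D\omega(x)F \cdot G - D\omega(x)G \cdot F = \sum_{i<j} \left(\frac{\partial a_j}{\partial x_i} - \frac{\partial a_i}{\partial x_j}\right) (dx_i \wedge dx_j)(F, G).
\]
These calculations motivate the following definition.
Definition 8.3.6. Let $\omega = \sum_{i=1}^{n} a_i(x)\, dx_i$ be a differentiable 1-form in $\mathbb{R}^n$. The exterior derivative of $\omega$ is the differential operator $d : \Omega^1(\mathbb{R}^n) \to \Omega^2(\mathbb{R}^n)$ defined by
\[
d\omega(F, G) := \sum_{i<j} \left(\frac{\partial a_j}{\partial x_i} - \frac{\partial a_i}{\partial x_j}\right) (dx_i \wedge dx_j)(F, G).
\]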
(1) Verify that the exterior derivative formula for a general 1-form
\[
\omega = \sum_{i=1}^{n} a_i(x)\, dx_i
\]
obtained in Definition 8.3.6 corresponds to the exterior derivative computed with formula (8.1).
9 Integration of Forms
In this chapter, we put together many of the results obtained in Chapter 7 and Chapter 8 in order to develop the theory of integration of 2-forms and 3-forms. This includes broadening the discussion on pullbacks, specifically for the integration of 2-forms and 3-forms. The results are used in the second section to establish a change of variables formula which we state for 2-forms and 3-forms, along with examples. We continue with the definition of the surface integral and, in particular, integrating 2-forms on a surface. Next, we present an introduction to the orientation of surfaces. We conclude with a brief section on general pullbacks presenting results linking pullbacks with wedge products and the exterior derivative.

9.1 Pullbacks of k-forms: k = 1, 2, 3
We begin by generalizing the pullback operation described in Chapter 4 for 1 -forms to the case of 2-forms in R2 and R3 and 3-forms in R3 . We then extend the discussion to more general pullbacks. Recall first the pullback operation for 1-forms over a curve C parametrized by r(t) with t ∈ [a, b]. Consider a 1 -form in R2 ω = a(x, y) dx + b(x, y) dy.
The pullback is a way of restricting ω , defined for all of R2 , to the curve C ⊂ R2 . To do this, we evaluate the coefficients a and b on the parametrization r (t) and evaluate the differentials dx and dy on vectors in the tangent space at all points of the curve C . The definition of pullback for 2 and 3 forms (and more general k -forms too) follows a similar process. This time the geometric objects under study are not curves, but two and three dimensional domains, as well as surfaces in R3 . Recall that Section 6.2 shows how to obtain parametrization of such regions. We can now begin our derivations.
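Case 1:
Consider a parametrization of $\mathbb{R}^2$ given by
\[
\Phi : \mathbb{R}^2 \to \mathbb{R}^2, \qquad (x, y)^T = \Phi(u, v).
\]
Let $\omega = a(x,y)\, dx \wedge dy$ be a 2-form defined on $\mathbb{R}^2$. Let $\vec\alpha$ and $\vec\beta$ be vectors in the tangent space of some point in $(u,v)$-space. As in Theorem 8.1.7, we use the mapping $\Phi$ to obtain
\[
(dx \wedge dy)(D\Phi(\vec\alpha), D\Phi(\vec\beta)) = \det\left(\frac{\partial(x,y)}{\partial(u,v)}\right) (du \wedge dv)(\vec\alpha, \vec\beta).
\]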
The pullback of $\omega$ by $\Phi$ is completed by evaluating the coefficient of $\omega$ on the parametrization $\Phi$. Thus, the pullback defines a new 2-form obtained using the substitutions of variables just shown:
\[
(\Phi^*\omega)(u,v) := a(\Phi(u,v)) \det\left(\frac{\partial(x,y)}{\partial(u,v)}\right) du \wedge dv
\]
applied to pairs of vectors $(\vec\alpha, \vec\beta)$ in $T_{(u,v)}\mathbb{R}^2$.
Example 9.1.1. Consider the polar coordinate mapping
\[
\Phi(r, \theta) = (r\cos\theta, r\sin\theta)
\]
and the 2-form in $\mathbb{R}^2$: $\omega = 2x\sqrt{x^2 + y^2}\, dx \wedge dy$. The coefficient is $a(x,y) = 2x\sqrt{x^2 + y^2}$ and so
\[
a(\Phi(r,\theta)) = 2r^2\cos\theta.
\]
Because we know $dx \wedge dy = r\, dr \wedge d\theta$,
\[
(\Phi^*\omega)(r,\theta) = 2r^3\cos\theta\, dr \wedge d\theta.
\]
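The recipe of Case 1 (evaluate the coefficient on $\Phi$, then multiply by the Jacobian determinant) can be checked numerically against Example 9.1.1. This is a sketch under stated assumptions: the Jacobian determinant is approximated by central differences, and the function names, step size and sample point are hypothetical.

```python
import math


def jacobian_det_2d(phi, u, v, h=1e-5):
    """Central-difference approximation of det d(x, y)/d(u, v) for (x, y) = phi(u, v)."""
    x_hi, y_hi = phi(u + h, v)
    x_lo, y_lo = phi(u - h, v)
    x_u, y_u = (x_hi - x_lo) / (2 * h), (y_hi - y_lo) / (2 * h)
    x_hi, y_hi = phi(u, v + h)
    x_lo, y_lo = phi(u, v - h)
    x_v, y_v = (x_hi - x_lo) / (2 * h), (y_hi - y_lo) / (2 * h)
    return x_u * y_v - x_v * y_u


def pullback_coefficient(a, phi, u, v):
    """Coefficient of du ^ dv in the pullback of a(x, y) dx ^ dy by phi."""
    x, y = phi(u, v)
    return a(x, y) * jacobian_det_2d(phi, u, v)


def polar(r, theta):
    return (r * math.cos(theta), r * math.sin(theta))


def a(x, y):
    return 2 * x * math.sqrt(x * x + y * y)


r, theta = 1.5, 0.7                                # arbitrary sample point
approx = pullback_coefficient(a, polar, r, theta)  # numerical pullback coefficient
exact = 2 * r**3 * math.cos(theta)                 # coefficient found in Example 9.1.1
```

The two values `approx` and `exact` should agree up to the finite-difference error.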
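Case 2:
For a parametrization of $\mathbb{R}^3$
\[
\Phi : \mathbb{R}^3 \to \mathbb{R}^3, \qquad (x,y,z)^T = \Phi(u,v,w)
\]
the pullback operation is defined in the same way as for 2-forms. Consider the 3-form on $\mathbb{R}^3$ given by $\omega(x,y,z) = a(x,y,z)\, dx \wedge dy \wedge dz$. We know that
\[
dx \wedge dy \wedge dz = \det\left(\frac{\partial(x,y,z)}{\partial(u,v,w)}\right) du \wedge dv \wedge dw.
\]
We define a new 3-form using the direct substitutions into $\omega$:
\[
(\Phi^*\omega)(u,v,w) = a(\Phi(u,v,w)) \det\left(\frac{\partial(x,y,z)}{\partial(u,v,w)}\right) du \wedge dv \wedge dw
\]
applied to triples of vectors $(\vec\alpha, \vec\beta, \vec\gamma)$ in the tangent space of the point $(u,v,w)$.
Example 9.1.2. Consider the spherical coordinate mapping
\[
\Phi(\rho, \varphi, \theta) = (\rho\cos\theta\sin\varphi,\ \rho\sin\theta\sin\varphi,\ \rho\cos\varphi)
\]
and the 3-form $\omega = e^{x^2 + y^2 + z^2}\, dx \wedge dy \wedge dz$. Then, $a(x,y,z) = e^{x^2 + y^2 + z^2}$ becomes $a(\Phi(\rho,\varphi,\theta)) = e^{\rho^2}$, which implies
\[
(\Phi^*\omega)(\rho, \varphi, \theta) = e^{\rho^2} \rho^2 \sin\varphi\, d\rho \wedge d\varphi \wedge d\theta.
\]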
Case 3:
We now turn to the case of the parametrization of a surface $S$ given by
\[
\Phi : R \subset \mathbb{R}^2 \to \mathbb{R}^3
\]
and consider the 2-form
\[
\omega(x,y,z) = a_1(x,y,z)\, dy \wedge dz + a_2(x,y,z)\, dz \wedge dx + a_3(x,y,z)\, dx \wedge dy.
\]
The pullback operation is defined as follows. We need to evaluate the coefficients on the surface given by the parametrization $\Phi(u,v)$; that is,
\[
a_1(\Phi(u,v)), \qquad a_2(\Phi(u,v)), \qquad a_3(\Phi(u,v)).
\]
We then use the parametrization $\Phi$ to transform the 2-forms $dy \wedge dz$, $dz \wedge dx$ and $dx \wedge dy$ to 2-forms involving $du \wedge dv$. For convenience, we write
\[
(x,y,z)^T = \Phi(u,v) = (x(u,v), y(u,v), z(u,v))^T.
\]
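Therefore,
\[
\begin{aligned}
dx \wedge dy &= \det\left(\frac{\partial(x,y)}{\partial(u,v)}\right) du \wedge dv,\\
dy \wedge dz &= \det\left(\frac{\partial(y,z)}{\partial(u,v)}\right) du \wedge dv,\\
dz \wedge dx &= \det\left(\frac{\partial(z,x)}{\partial(u,v)}\right) du \wedge dv.
\end{aligned}
\tag{9.1}
\]
For vectors $\vec\alpha, \vec\beta \in T_{(u,v)}\mathbb{R}^2$, $D\Phi(\vec\alpha)$ and $D\Phi(\vec\beta)$ are elements of $T_{\Phi(u,v)}S$. The wedge products $dx \wedge dy$, $dy \wedge dz$ and $dz \wedge dx$ are applied to the vectors $D\Phi(\vec\alpha)$ and $D\Phi(\vec\beta)$. Putting all those substitutions together, we define the pullback of $\omega$ by $\Phi$ as
\[
(\Phi^*\omega)(u,v) = \left[a_1(\Phi(u,v)) \det\left(\frac{\partial(y,z)}{\partial(u,v)}\right) + a_2(\Phi(u,v)) \det\left(\frac{\partial(z,x)}{\partial(u,v)}\right) + a_3(\Phi(u,v)) \det\left(\frac{\partial(x,y)}{\partial(u,v)}\right)\right] du \wedge dv
\]
applied to vectors $\vec\alpha, \vec\beta \in T_{(u,v)}\mathbb{R}^2$.
Example 9.1.3. Consider the surface $S$ with parametrization
\[
\Phi(u,v) = (u^2,\ u + v,\ v - u)
\]
and the 2-form
\[
\omega = 3xyz\, dy \wedge dz + zy\, dz \wedge dx + x^2 y\, dx \wedge dy.
\]
Then, $a_1(x,y,z) = 3xyz$, $a_2(x,y,z) = zy$, $a_3(x,y,z) = x^2 y$ and so
\[
\begin{aligned}
a_1(\Phi(u,v)) &= 3u^2(u + v)(v - u) = 3u^2(v^2 - u^2),\\
a_2(\Phi(u,v)) &= (v - u)(u + v) = v^2 - u^2,\\
a_3(\Phi(u,v)) &= (u^2)^2(u + v) = u^5 + u^4 v.
\end{aligned}
\]
Moreover,
\[
\det\left(\frac{\partial(y,z)}{\partial(u,v)}\right) = \det\begin{pmatrix} 1 & 1\\ -1 & 1 \end{pmatrix} = 2, \qquad
\det\left(\frac{\partial(z,x)}{\partial(u,v)}\right) = \det\begin{pmatrix} -1 & 1\\ 2u & 0 \end{pmatrix} = -2u, \qquad
\det\left(\frac{\partial(x,y)}{\partial(u,v)}\right) = \det\begin{pmatrix} 2u & 0\\ 1 & 1 \end{pmatrix} = 2u,
\]
which leads to
\[
\begin{aligned}
(\Phi^*\omega)(u,v) &= \left(6u^2(v^2 - u^2) - 2u(v^2 - u^2) + 2u(u^5 + u^4 v)\right) du \wedge dv\\
&= \left(6u^2v^2 - 6u^4 - 2uv^2 + 2u^3 + 2u^6 + 2u^5 v\right) du \wedge dv.
\end{aligned}
\]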
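The Case 3 formula combines three $2 \times 2$ Jacobian minors, which makes it a natural candidate for a numerical cross-check of Example 9.1.3. The sketch below approximates the minors by central differences; the function names, step size and sample point are assumptions made for this illustration.

```python
def minor(phi, u, v, i, j, h=1e-5):
    """Approximate det d(phi_i, phi_j)/d(u, v) by central differences."""
    hi, lo = phi(u + h, v), phi(u - h, v)
    d_u = [(p - m) / (2 * h) for p, m in zip(hi, lo)]
    hi, lo = phi(u, v + h), phi(u, v - h)
    d_v = [(p - m) / (2 * h) for p, m in zip(hi, lo)]
    return d_u[i] * d_v[j] - d_v[i] * d_u[j]


def pullback_on_surface(a1, a2, a3, phi, u, v):
    """Coefficient of du ^ dv in the pullback of a1 dy^dz + a2 dz^dx + a3 dx^dy."""
    x, y, z = phi(u, v)
    return (a1(x, y, z) * minor(phi, u, v, 1, 2)     # d(y, z)/d(u, v)
            + a2(x, y, z) * minor(phi, u, v, 2, 0)   # d(z, x)/d(u, v)
            + a3(x, y, z) * minor(phi, u, v, 0, 1))  # d(x, y)/d(u, v)


def phi(u, v):
    return (u * u, u + v, v - u)


a1 = lambda x, y, z: 3 * x * y * z
a2 = lambda x, y, z: z * y
a3 = lambda x, y, z: x * x * y

u, v = 1.2, 0.5   # arbitrary sample point
numeric = pullback_on_surface(a1, a2, a3, phi, u, v)
closed_form = (6 * u**2 * v**2 - 6 * u**4 - 2 * u * v**2
               + 2 * u**3 + 2 * u**6 + 2 * u**5 * v)
```

At any sample point, `numeric` should match the polynomial `closed_form` obtained in the example, up to the finite-difference error.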
The main use of pullbacks in our context is to compute integrals as we show in the next section.
Exercises
(1) Compute the following pullbacks of 2-forms and 3-forms.
(a) $\omega(x,y) = (x^2 - y^2)\, dx \wedge dy$ with $\Phi(u,v) = (u\cosh(v), u\sinh(v))$.
(b) $\omega(x,y) = \exp(xy)\, dx \wedge dy$ with $\Phi(u,v) = (u^2 - v^2, u^2 + v^2)$.
(c) $\omega(x,y) = (3 - x^2 + y^2)\, dx \wedge dy$ with $\Phi(u,v) = (u\cos v, u\sin v)$.
(d) $\omega(x,y,z) = (3x - y)\, dy \wedge dz + (y - z)\, dz \wedge dx - (x + y + z)\, dx \wedge dy$ with $\Phi(u,v) = (\tfrac{1}{3}u + v,\ v,\ u - v)$.
(e) $\omega(x,y,z) = x\, dy \wedge dz + y\, dz \wedge dx + z\, dx \wedge dy$ with $\Phi(u,v) = (2\cos u\sin v,\ 2\sin u\sin v,\ 2\cos v)$.
(f) $\omega(x,y,z) = (xz - y)\, dy \wedge dz + (y^2 - z)\, dz \wedge dx + xyz\, dx \wedge dy$ with $\Phi(u,v) = (u,\ uv + u^2,\ v + u)$.
(g) $\omega(x,y,z) = \left(\dfrac{x^2}{2^2} + \dfrac{y^2}{1^2} + \dfrac{z^2}{3^2}\right) dx \wedge dy \wedge dz$ with $\Phi(u,v,w) = (2u, v, 3w)$.
(h) $\omega(x,y,z) = \exp(xyz)\, dx \wedge dy \wedge dz$ with $\Phi(u,v,w) = (\ln(uv), v^2, uw)$.
(i) $\omega(x,y,z) = \|(x,y,z)\|\, dx \wedge dy \wedge dz$ with $\Phi(u,v,w) = (u\cos v, u\sin v, w)$.
9.2 Integrals of Forms: change of variables formula
We begin by considering the integration of 2-forms over two-dimensional domains and 3-forms over three-dimensional domains. We show that the integrals of those forms correspond to regular double and triple integrals as defined in Chapter 7. To make this correspondence, we return to wedge products and orientation. Note that for any rectangle $P(u,v)$ with $u$ parallel to the $x$-axis and $v$ parallel to the $y$-axis, $(dx \wedge dy)(u,v) = dA(P(u,v))$. Similarly, if $B(u,v,w)$ is a box with $u$ parallel to the $x$-axis, $v$ parallel to the $y$-axis and $w$ parallel to the $z$-axis, then $(dx \wedge dy \wedge dz)(B) = dV(B)$.
Definition 9.2.1. Let $\omega^2(x,y) = f(x,y)\, dx \wedge dy$ and $\omega^3(x,y,z) = g(x,y,z)\, dx \wedge dy \wedge dz$. Then for a rectangular domain $D$ in $\mathbb{R}^2$ and a box domain $B$ in $\mathbb{R}^3$,
\[
\iint_D \omega^2 \quad\text{and}\quad \iiint_B \omega^3
\]
are defined as limits of Riemann sums over grids subdividing $D$ and $B$ as in the definition of double and triple integrals; see Definition 7.2.2 and Definition 7.4.1.
Proposition 9.2.2. Let $\omega^2(x,y)$ be a 2-form and $\omega^3(x,y,z)$ be a 3-form, as in Definition 9.2.1. For a domain $D \subset \mathbb{R}^2$, we have
\[
\iint_D \omega^2 = \iint_D f(x,y)\, dA,
\]
and for a domain $E \subset \mathbb{R}^3$,
\[
\iiint_E \omega^3 = \iiint_E g(x,y,z)\, dV,
\]
if the integrals on the right exist. Therefore, all results shown for double and triple integrals apply to integrals of 2-forms and 3-forms.
Proof. Using these properties of $dx \wedge dy$ and $dx \wedge dy \wedge dz$ on $P(u,v)$ and $B(u,v,w)$ respectively shows the equality between the Riemann sums of forms and the Riemann sums defining double and triple integrals. In particular, the existence of integrals is guaranteed over rectangle and box domains for 2-forms and 3-forms having bounded coefficients with a finite number of jump discontinuities of finite size. Therefore, we can define those integrals for domains $D$ and $E$ more general than the rectangle and box domains by setting $\omega^2$ and $\omega^3$ to zero outside $D$ and $E$.
We now present the "change of variables formula", which is just a generalization of the substitution rule (or $u$-substitution) to multiple integrals. In Section 4.1, we show that the substitution rule can be rewritten in terms of pullbacks of 1-forms. Therefore, it is natural that the change of variables formula for multiple integrals can also be obtained using pullbacks, and with the formulae obtained above, it is now a straightforward result.
Theorem 9.2.3 (Change of variables formula). Let $D \subset \mathbb{R}^2$ and $E \subset \mathbb{R}^3$ be subsets for which
\[
\iint_D f(x,y)\, dA, \qquad \iiint_E g(x,y,z)\, dV
\]
exist. Consider the parametrizations $\Phi_1 : R_1 \to D$ and $\Phi_2 : R_2 \to E$:
\[
\Phi_1(u,v) = (x(u,v), y(u,v)) \quad\text{and}\quad \Phi_2(u,v,w) = (x(u,v,w), y(u,v,w), z(u,v,w)).
\]
Then
\[
\begin{aligned}
\iint_D f(x,y)\, dA_{(x,y)} &= \iint_{R_1} f(\Phi_1(u,v)) \det\left(\frac{\partial(x,y)}{\partial(u,v)}\right) dA_{(u,v)},\\
\iiint_E g(x,y,z)\, dV_{(x,y,z)} &= \iiint_{R_2} g(\Phi_2(u,v,w)) \det\left(\frac{\partial(x,y,z)}{\partial(u,v,w)}\right) dV_{(u,v,w)}.
\end{aligned}
\]
Proof. Note that using Proposition 9.2.2, we can work directly with
\[
\iint_D \omega^2 \quad\text{and}\quad \iiint_E \omega^3
\]
where $\omega^2 = f(x,y)\, dx \wedge dy$ and $\omega^3 = g(x,y,z)\, dx \wedge dy \wedge dz$.
Without loss of generality, we consider a rectangular domain $\tilde{D}$ containing $D$ and $\tilde{\omega}^2$ equal to $\omega^2$ on $D$ and zero outside. Decompose $\tilde{D}$ into $nm$ subrectangles $\tilde{D}_{ij}$ with $i = 1, \ldots, n$ and $j = 1, \ldots, m$, where $\tilde{D}_{ij}$ is generated by the vectors $\Delta x_i = x_i - x_{i-1}$ and $\Delta y_j = y_j - y_{j-1}$. Moreover, note
\[
\vec\alpha_i = D\Phi_1^{-1}(\Delta x_i) \quad\text{and}\quad \vec\beta_j = D\Phi_1^{-1}(\Delta y_j).
\]
We write the integral of $\omega^2$ using its Riemann sum
\[
\begin{aligned}
\iint_{\tilde{D}} \omega^2 &= \lim_{n,m \to \infty} \sum_{i=1}^{n} \sum_{j=1}^{m} f(x,y)\, (dx \wedge dy)(\Delta x_i, \Delta y_j)\\
&= \lim_{n,m \to \infty} \sum_{i=1}^{n} \sum_{j=1}^{m} f(\Phi_1(u,v))\, (dx \wedge dy)(\Delta x_i, \Delta y_j)\\
&= \lim_{n,m \to \infty} \sum_{i=1}^{n} \sum_{j=1}^{m} f(\Phi_1(u,v)) \det\left(\frac{\partial(x,y)}{\partial(u,v)}\right) (du \wedge dv)(\vec\alpha_i, \vec\beta_j)\\
&= \iint_{\Phi_1^{-1}(\tilde{D})} \Phi_1^*\omega^2
\end{aligned}
\]
since the next to last line is the Riemann sum of $\Phi_1^*\omega^2$ over $\Phi_1^{-1}(\tilde{D})$. A similar calculation shows that
\[
\iiint_E \omega^3 = \iiint_{R_2} \Phi_2^*\omega^3
\]
and this completes the proof.
We now proceed with several examples of the change of variables formula.
Fig. 9.1. Region E bounded by the cylinder and paraboloid described in Example 9.2.4.
Example 9.2.4. Find the volume of the solid that lies under the paraboloid $z = x^2 + y^2$, inside the cylinder $(x - 1)^2 + y^2 = 1$ and above the $xy$-plane. See Figure 9.1. We begin by describing the domain $E$ enclosed by the paraboloid and the cylinder; we write
\[
E = \{(x,y,z) \mid (x,y) \in D,\ 0 \le z \le x^2 + y^2\}
\]
where
\[
D = \{(x,y) \mid 0 \le (x - 1)^2 + y^2 \le 1\}.
\]
Because the region is defined partly using a cylinder, it is a reasonable guess to move to cylindrical coordinates: $(x,y,z) = \Phi(r,\theta,z)$ where $\Phi : R \to E$. We need to determine the domain $R$. From the change of coordinates, we have
\[
0 \le z \le x^2 + y^2 \iff 0 \le z \le r^2
\]
and
\[
(x - 1)^2 + y^2 = 1 \iff (r\cos\theta - 1)^2 + r^2\sin^2\theta = 1 \iff r = 2\cos\theta.
\]
Thus, the projection of the interior of the cylinder is parametrized by
\[
0 \le r \le 2\cos\theta \quad\text{and}\quad -\frac{\pi}{2} \le \theta \le \frac{\pi}{2}.
\]
Therefore,
\[
R = \left\{(r,\theta,z) \ \middle|\ -\frac{\pi}{2} \le \theta \le \frac{\pi}{2},\ 0 \le r \le 2\cos\theta,\ 0 \le z \le r^2\right\}.
\]
We can now apply the change of variables formula:
\[
\begin{aligned}
\iiint_E dV_{(x,y,z)} &= \iiint_R |\det D\Phi|\, dV_{(r,\theta,z)}\\
&= \iiint_R r\, dV_{(r,\theta,z)}\\
&= \int_{-\pi/2}^{\pi/2} \int_0^{2\cos\theta} \int_0^{r^2} r\, dz\, dr\, d\theta\\
&= 4\int_{-\pi/2}^{\pi/2} \cos^4\theta\, d\theta = \frac{3\pi}{2}.
\end{aligned}
\]
Example 9.2.5. Compute $\iint_D (x^2 - xy + y^2)\, dA$ where $D$ is the region bounded by the ellipse $x^2 - xy + y^2 = 2$. First, set $h(x,y) = x^2 - xy + y^2$ and we use $(x,y) = \Phi(u,v)$ given by
\[
x = \sqrt{2}\, u - \sqrt{2/3}\, v, \qquad y = \sqrt{2}\, u + \sqrt{2/3}\, v.
\]
The domain inside the ellipse is described by
\[
D = \{(x,y) \mid 0 \le x^2 - xy + y^2 \le 2\}
\]
and so $\Phi : D_1 \to D$ is a parametrization of $D$ where
\[
D_1 = \{(u,v) \mid u^2 + v^2 \le 1\}.
\]
From the change of variables formula, we obtain
\[
\iint_D h(x,y)\, dA_{(x,y)} = \iint_{D_1} h(\Phi(u,v))\, |\det D\Phi|\, dA_{(u,v)}.
\]
We just need to compute the terms in the integrand on the right-hand side of the equality. We have
\[
h(\Phi(u,v)) = 2(u^2 + v^2) \quad\text{and}\quad |\det D\Phi| = \frac{4}{\sqrt{3}}.
\]
Therefore,
\[
\iint_D h(x,y)\, dA_{(x,y)} = \frac{8}{\sqrt{3}} \int_{-1}^{1} \int_{-\sqrt{1 - v^2}}^{\sqrt{1 - v^2}} (u^2 + v^2)\, du\, dv.
\]
Now, the iterated integral on the right can be simplified by going to polar coordinates. Let $(u,v) = \Phi_1(r,\theta)$ and $\Phi_1 : D_2 \to D_1$ where $D_2 = \{(r,\theta) \mid 0 \le r \le 1,\ 0 \le \theta \le 2\pi\}$. Then,
\[
\begin{aligned}
\frac{8}{\sqrt{3}} \iint_{D_1} (u^2 + v^2)\, dA_{(u,v)} &= \frac{8}{\sqrt{3}} \iint_{D_2} r^2\, |\det D\Phi_1|\, dA_{(r,\theta)}\\
&= \frac{8}{\sqrt{3}} \int_0^1 \int_0^{2\pi} r^3\, d\theta\, dr = \frac{4\pi}{\sqrt{3}}.
\end{aligned}
\]
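As a sanity check on Example 9.2.5, one can compare a direct Riemann sum over the elliptical region in $(x,y)$ with a Riemann sum of the transformed integrand in polar coordinates. The sketch below uses midpoint sums; the grid sizes, the bounding box half-width and the function names are assumptions made for this illustration.

```python
import math


def h(x, y):
    return x * x - x * y + y * y


def direct_sum(n=400, half_width=2.0):
    """Midpoint Riemann sum of h over D = {h <= 2}, computed in the (x, y) variables."""
    step = 2 * half_width / n
    total = 0.0
    for i in range(n):
        x = -half_width + (i + 0.5) * step
        for j in range(n):
            y = -half_width + (j + 0.5) * step
            value = h(x, y)
            if value <= 2.0:       # the grid point lies inside the ellipse
                total += value
    return total * step * step


def transformed_sum(n=200):
    """Midpoint Riemann sum after both changes of variables, over 0<=r<=1, 0<=theta<=2*pi."""
    jac_linear = 4 / math.sqrt(3)  # |det DPhi| for the linear map of the example
    d_r, d_theta = 1.0 / n, 2 * math.pi / n
    total = 0.0
    for i in range(n):
        r = (i + 0.5) * d_r
        for _ in range(n):
            # h(Phi(u, v)) = 2 r^2, and the polar Jacobian contributes a factor r
            total += 2 * r * r * jac_linear * r
    return total * d_r * d_theta


direct = direct_sum()
transformed = transformed_sum()
target = 4 * math.pi / math.sqrt(3)   # value obtained in Example 9.2.5
```

Both sums should approach `target`; the direct sum converges more slowly because the grid only approximates the curved boundary of the ellipse.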
Example 9.2.6. Compute the volume inside the torus given by
\[
x = (2 + \cos v)\cos u, \qquad y = (2 + \cos v)\sin u, \qquad z = \sin v
\]
where $0 \le u \le 2\pi$ and $0 \le v \le 2\pi$. The parametrization of the region inside the torus is similar to a spherical coordinate case. The torus has rotation axis at distance 2 from the origin and the torus radius is 1. The region $E$ inside the torus is parametrized by $\Phi : R \to E$ given by
\[
\Phi(\rho, u, v) = ((2 + \rho\cos v)\cos u,\ (2 + \rho\cos v)\sin u,\ \rho\sin v)
\]
from a domain
\[
R = \{(\rho, u, v) \mid 0 \le \rho \le 1,\ 0 \le u \le 2\pi,\ 0 \le v \le 2\pi\}.
\]
Thus,
\[
\iiint_E dV_{(x,y,z)} = \iiint_R |\det D\Phi|\, dV_{(\rho,u,v)}
\]
where
\[
\det D\Phi = \det
\begin{pmatrix}
\cos v\cos u & -(2 + \rho\cos v)\sin u & -\rho\sin v\cos u\\
\cos v\sin u & (2 + \rho\cos v)\cos u & -\rho\sin v\sin u\\
\sin v & 0 & \rho\cos v
\end{pmatrix}
= \rho(2 + \rho\cos v) > 0.
\]
This leads to
\[
\begin{aligned}
\iiint_E dV_{(x,y,z)} &= \int_0^1 \int_0^{2\pi} \int_0^{2\pi} \rho(2 + \rho\cos v)\, dv\, du\, d\rho\\
&= \int_0^1 \int_0^{2\pi} \left.(2\rho v + \rho^2\sin v)\right|_0^{2\pi}\, du\, d\rho\\
&= \int_0^1 \int_0^{2\pi} 4\pi\rho\, du\, d\rho\\
&= \int_0^1 8\pi^2\rho\, d\rho\\
&= 4\pi^2.
\end{aligned}
\]
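The torus computation can be reproduced numerically without using the closed form of $\det D\Phi$: approximate the Jacobian determinant by central differences and sum $|\det D\Phi|$ over a midpoint grid in $(\rho, u, v)$. This is only a sketch; the grid size, step size and function names are assumptions made for this illustration.

```python
import math


def torus(rho, u, v):
    """Parametrization Phi(rho, u, v) of the solid torus from Example 9.2.6."""
    ring = 2 + rho * math.cos(v)
    return (ring * math.cos(u), ring * math.sin(u), rho * math.sin(v))


def jacobian_det_3d(phi, p, h=1e-5):
    """Central-difference approximation of det DPhi at the point p = (p0, p1, p2)."""
    m = [[0.0] * 3 for _ in range(3)]      # m[i][k] = d(phi_i)/d(p_k)
    for k in range(3):
        hi, lo = list(p), list(p)
        hi[k] += h
        lo[k] -= h
        f_hi, f_lo = phi(*hi), phi(*lo)
        for i in range(3):
            m[i][k] = (f_hi[i] - f_lo[i]) / (2 * h)
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def torus_volume(n=24):
    """Midpoint Riemann sum of |det DPhi| over 0<=rho<=1, 0<=u<=2*pi, 0<=v<=2*pi."""
    d_rho, d_u, d_v = 1.0 / n, 2 * math.pi / n, 2 * math.pi / n
    total = 0.0
    for i in range(n):
        rho = (i + 0.5) * d_rho
        for j in range(n):
            u = (j + 0.5) * d_u
            for k in range(n):
                v = (k + 0.5) * d_v
                total += abs(jacobian_det_3d(torus, (rho, u, v)))
    return total * d_rho * d_u * d_v


volume = torus_volume()   # to be compared with 4 * pi**2
```

The numerical Jacobian can also be compared pointwise with $\rho(2 + \rho\cos v)$, which is a useful check on the determinant expansion in the example.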
Exercises
(1) Set up the iterated integral for the following integrals of 2-forms and 3-forms over the given domain.
(a) Let $\omega(x,y) = (x^2 - y^2)\, dx \wedge dy$ and the domain $D$ is parametrized by $\Phi(u,v) = (u\cosh(v), u\sinh(v))$ with $1 \le u \le 3$ and $0 \le v \le \pi$.
(b) Let $\omega(x,y) = y e^x\, dx \wedge dy$ with domain $D$ parametrized by $\Phi(u,v) = (\ln(uv), uv)$ where $1 \le u \le 2$, $1 \le v \le 3$.
(c) Let
\[
\omega(x,y,z) = \left(\frac{x^2}{2^2} + \frac{y^2}{1^2} + \frac{z^2}{3^2}\right) dx \wedge dy \wedge dz
\]
and the domain $E$ given by the parametrization $\Phi(u,v,w) = (2u, v, 3w)$ with $0 \le u \le 1$, $0 \le v \le 2$, $-1 \le w \le 1$.
(d) Let $\omega(x,y,z) = (x^2 + y^2 - z^2)\, dx \wedge dy \wedge dz$ and the domain $E$ is parametrized by
\[
\Phi(\mu, \nu, \varphi) = (\cosh\mu\cos\nu\cos\varphi,\ \cosh\mu\cos\nu\sin\varphi,\ \sinh\mu\sin\nu)
\]
where $0 < \mu \le 1$, $-\pi/2 \le \nu \le \pi/2$ and $0 \le \varphi \le \pi$.
(2) Evaluate the following integrals.
(a) $\iint_S (x + y)\, dA$ where $S$ is the region inside the disk $x^2 + y^2 = a^2$ and between the lines $y = x$ and $y = -x$ and containing the $y$-axis.
(b) $\iint_D \tan(x^2 + y^2)\, dA$ where $D$ is the disk of radius 1 centered at $(0, 0)$.
(3) Let $T$ be the region bounded by the triangle with vertices $(-1, 0)$, $(0, 1)$ and $(1, 0)$. Evaluate
\[
\iint_T \left(\frac{y - x}{y + x}\right)^2 dA
\]
by using the change of variables $u = y - x$, $v = y + x$.
(4) Consider the region $E$ bounded by the tetrahedron with vertices $(0, 0, 0)$, $(0, 1, 1)$, $(-\sqrt{3}/2, -1/2, 1)$, $(\sqrt{3}/2, -1/2, 1)$. Evaluate
\[
\iiint_E (xz + 2y)\, dV
\]
using the change of variables u = y
− z2 ,
v =
√ x3 − z2 ,
√ x3 + z2 .
w =
(5) In each case, find the volume of the given region.
(a) The region inside the cone of equation $z = \sqrt{x^2 + y^2}$, below the sphere of equation $x^2 + y^2 + z^2 = a^2$.
(b) The region located in the first octant between the planes of equation $y = 0$, $y = x$ and inside the ellipsoid of equation
\[
\frac{x^2}{a^2} + \frac{y^2}{b^2} + \frac{z^2}{c^2} = 1.
\]
(6) Consider a colony of bacteria living in a Petri dish in the shape of a disk of radius 10 cm. Suppose that the density of bacteria is estimated by $\rho(x, y) = 3 - 2\exp\left((x^2 + y^2)^{-1}\right)$ bacteria/cm^2. Compute the total number of bacteria in the dish and determine the average density of bacteria.
(7) Consider an electrically charged region E in the shape of a torus as in Example 9.2.6. We assume the charge density in E (given in coulombs per cubic meter, C/m^3) to be given by a function σ(x, y, z) = z(x^2 + y^2). Evaluate the total charge in E.
(8) Consider a ball of radius 8 cm filled with sand lying on a table and assume that the south pole is at the origin of R^3. Suppose that the mass density is estimated by the function $\rho(x, y, z) = 5 - \sqrt{z}/4$ g/cm^3. Find the centre of mass $(\bar{x}, \bar{y}, \bar{z})$ of the ball by computing the integrals
\[ \bar{x} = \frac{1}{M}\iiint_E x\rho\,dV, \qquad \bar{y} = \frac{1}{M}\iiint_E y\rho\,dV, \qquad \bar{z} = \frac{1}{M}\iiint_E z\rho\,dV \]
where E is the domain bounded by the ball and M is its total mass.
9.3 Integrals on a surface
Consider a surface S ⊂ R^3 and a point p ∈ S. We begin by defining an area form, but for parallelograms defined in T_p S. Recall (see Section 7.1) that if α, β are vectors in R^3 then the area of the parallelogram P(α, β) is given by the formula
\[ A(P) = \sqrt{(dy \wedge dz)^2(\alpha, \beta) + (dz \wedge dx)^2(\alpha, \beta) + (dx \wedge dy)^2(\alpha, \beta)}. \]
Definition 9.3.1. Let S ⊂ R^3 be a surface for which a unique tangent space exists at each point p ∈ S and let α, β be vectors in T_p S. Then, the surface area form is defined by
\[ dS(\alpha, \beta) = \sqrt{(dy \wedge dz)^2(\alpha, \beta) + (dz \wedge dx)^2(\alpha, \beta) + (dx \wedge dy)^2(\alpha, \beta)}. \]
The surface area form is non-negative and is a generalization of the area form to any two-dimensional surface in R3 . We now use the surface area form to define various surface integrals.
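As a quick numerical illustration of Definition 9.3.1, the following Python sketch evaluates the three 2-forms on a pair of vectors and compares dS(α, β) with the norm of the cross product α × β, the familiar expression for the area of a parallelogram in R^3. The function names and the sample vectors are our own choices and are not part of the text.

```python
import math

def minor(a, b, i, j):
    """Value of (dx_i ^ dx_j)(a, b): the 2x2 determinant built from components i and j."""
    return a[i] * b[j] - a[j] * b[i]

def area_form(a, b):
    """dS(a, b) = sqrt((dy^dz)^2 + (dz^dx)^2 + (dx^dy)^2), each 2-form evaluated on (a, b)."""
    yz = minor(a, b, 1, 2)
    zx = minor(a, b, 2, 0)
    xy = minor(a, b, 0, 1)
    return math.sqrt(yz**2 + zx**2 + xy**2)

def cross_norm(a, b):
    """Norm of the cross product a x b, computed independently for comparison."""
    c = (a[1] * b[2] - a[2] * b[1],
         a[2] * b[0] - a[0] * b[2],
         a[0] * b[1] - a[1] * b[0])
    return math.sqrt(sum(t * t for t in c))

# Sample vectors (arbitrary, chosen for illustration only).
alpha = (1.0, 2.0, 0.0)
beta = (0.0, 1.0, 3.0)
print(area_form(alpha, beta), cross_norm(alpha, beta))
```

Both printed values agree, and swapping the two arguments leaves dS unchanged because each 2-form only changes sign.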
9.3.1 Surface integral
We begin with the surface integral of a real-valued function f on a surface S. This is the simplest type of integral on a surface and one of its main uses is to compute the area of surfaces. Consider a rubber sheet of area A and a curved surface S we would like to cover with the rubber sheet. We take the rubber sheet, stretch it and bend it to fit on the surface. What is the surface area of the curved surface? If we don’t have to stretch the rubber sheet and we can cover the surface entirely, then it is intuitively clear that the curved surface also has area A. Suppose a stretching is needed, and the stretching and bending can be described by the mapping Φ. We show in this section how to use the mapping Φ to compute the surface area of S. The pullback transformation is the key ingredient here.
Fig. 9.2. Surface S decomposed via a grid into portions S ij
Consider a surface S which has a tangent space at all points, partitioned into a grid as in Figure 9.2, with k_i the ith curve parallel to the y-axis and ℓ_j the jth curve parallel to the x-axis on S. Let q_i(t) be a parametrization of k_i with t ∈ [0, 1] and r_j(τ) a parametrization of ℓ_j with τ ∈ [0, 1]. Let p_ij = q_i(t_j) = r_j(τ_i) be the intersection point of k_i and ℓ_j and define
\[ \gamma_{ij} := q_i'(t_j)(t_{j+1} - t_j), \qquad \delta_{ij} := r_j'(\tau_i)(\tau_{i+1} - \tau_i), \]
two vectors based at p_ij and tangent to k_i and ℓ_j respectively. The parallelogram P(γ_ij, δ_ij) approximates the portion of surface S_ij bounded by the curves k_i, k_{i+1}, ℓ_j, ℓ_{j+1}, and it has area dS(γ_ij, δ_ij). By refining the grid, see Figure 9.3, the approximation improves since each S_ij becomes flatter as the pieces become smaller.
Fig. 9.3. Refinement of the grid on the surface S.
Consider now a function f : S → R. The value of f can be approximated at the points p_ij. We use this construction to define the integral of f on the surface S.
Definition 9.3.2. Let S be a differentiable surface and f(x, y, z) a function taking real values on S. That is, f(p) ∈ R for all p ∈ S. Then,
\[ \iint_S f(x, y, z)\,dS \]
is called the surface integral of f on S and its value is given by the limit of the Riemann sum
\[ \lim_{n,m \to \infty} \sum_{i=1}^{n} \sum_{j=1}^{m} f(p_{ij})\,dS(\gamma_{ij}, \delta_{ij}), \tag{9.2} \]
if it exists as the size of γ_ij, δ_ij tends to 0 as n, m → ∞.
Now that we have a definition of the surface integral, we need a computational formula to evaluate it. For this, we need S to be described by a parametrization and our goal is to write dS in terms of this parametrization. Let Φ : D → S, Φ(u, v) = (x(u, v), y(u, v), z(u, v)) be a parametrization of S. Let (u_0, v_0) ∈ D and p = Φ(u_0, v_0) ∈ S. Let α, β ∈ T_(u_0,v_0) R^2, then DΦ(α), DΦ(β) ∈ T_p S and
\[ (dy \wedge dz)(D\Phi(\alpha), D\Phi(\beta)) = \det\left(\frac{\partial(y, z)}{\partial(u, v)}\right)(du \wedge dv)(\alpha, \beta), \]
\[ (dz \wedge dx)(D\Phi(\alpha), D\Phi(\beta)) = \det\left(\frac{\partial(z, x)}{\partial(u, v)}\right)(du \wedge dv)(\alpha, \beta), \]
\[ (dx \wedge dy)(D\Phi(\alpha), D\Phi(\beta)) = \det\left(\frac{\partial(x, y)}{\partial(u, v)}\right)(du \wedge dv)(\alpha, \beta). \]
Therefore, we obtain
\[ dS(D\Phi(\alpha), D\Phi(\beta)) = \sqrt{\det\left(\frac{\partial(y, z)}{\partial(u, v)}\right)^2 + \det\left(\frac{\partial(z, x)}{\partial(u, v)}\right)^2 + \det\left(\frac{\partial(x, y)}{\partial(u, v)}\right)^2}\,(du \wedge dv)(\alpha, \beta) \]
where the right-hand side is similar to a pullback of dS via Φ and we define
\[ (\Phi^* dS) := \sqrt{\det\left(\frac{\partial(y, z)}{\partial(u, v)}\right)^2 + \det\left(\frac{\partial(z, x)}{\partial(u, v)}\right)^2 + \det\left(\frac{\partial(x, y)}{\partial(u, v)}\right)^2}\,(du \wedge dv). \]
This now leads to a computational formula for the surface integral in terms of the pullback of dS.
Proposition 9.3.3. Let S be a surface parametrized by Φ : D ⊂ R^2 → S and let f : S → R be a continuous function. Then,
\[ \iint_S f\,dS = \iint_D f(\Phi(u, v))\,(\Phi^* dS). \]
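Proof. Consider the elements in the Riemann sum (9.2): p_ij = Φ(u_i, v_j) for some (u_i, v_j) ∈ D, and so f(p_ij) = f(Φ(u_i, v_j)). Moreover,
\[ (\gamma_{ij}, \delta_{ij}) = (D\Phi(\alpha_{ij}), D\Phi(\beta_{ij})) \]
for some α_ij, β_ij ∈ T_(u_i,v_j) R^2. Then
\[ dS(\gamma_{ij}, \delta_{ij}) = (\Phi^* dS)(\alpha_{ij}, \beta_{ij}). \]
Substituting in (9.2) we obtain
\[ \iint_S f\,dS = \lim_{n,m \to \infty} \sum_{i=1}^{n} \sum_{j=1}^{m} f(\Phi(u_i, v_j)) \sqrt{\det\left(\frac{\partial(y, z)}{\partial(u, v)}\right)^2 + \det\left(\frac{\partial(z, x)}{\partial(u, v)}\right)^2 + \det\left(\frac{\partial(x, y)}{\partial(u, v)}\right)^2}\,(du \wedge dv)(\alpha_{ij}, \beta_{ij}) \]
where the right-hand side is the Riemann sum leading to the double integral of the function
\[ f(\Phi(u, v)) \sqrt{\det\left(\frac{\partial(y, z)}{\partial(u, v)}\right)^2 + \det\left(\frac{\partial(z, x)}{\partial(u, v)}\right)^2 + \det\left(\frac{\partial(x, y)}{\partial(u, v)}\right)^2} \]
on D. This proves the result.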
We now look at several examples beginning with the area of surfaces obtained if f (x,y,z) = 1.
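Example 9.3.4. Suppose S is a sphere of radius a with parametrization
\[ x = a\cos\theta\sin\varphi, \qquad y = a\sin\theta\sin\varphi, \qquad z = a\cos\varphi \]
where D = {(φ, θ) | 0 ≤ φ ≤ π, 0 ≤ θ ≤ 2π}. Then,
\begin{align*}
\iint_S dS &= \iint_D (\Phi^* dS) \\
&= \int_0^{2\pi}\!\int_0^{\pi} \sqrt{\det\left(\frac{\partial(y, z)}{\partial(\varphi, \theta)}\right)^2 + \det\left(\frac{\partial(z, x)}{\partial(\varphi, \theta)}\right)^2 + \det\left(\frac{\partial(x, y)}{\partial(\varphi, \theta)}\right)^2}\,d\varphi\,d\theta \\
&= \int_0^{2\pi}\!\int_0^{\pi} \sqrt{(a^2\cos\theta\sin^2\varphi)^2 + (a^2\sin\theta\sin^2\varphi)^2 + (a^2\cos^2\theta\sin\varphi\cos\varphi + a^2\sin^2\theta\sin\varphi\cos\varphi)^2}\,d\varphi\,d\theta \\
&= \int_0^{2\pi}\!\int_0^{\pi} a^2\sin\varphi\,d\varphi\,d\theta \\
&= 4\pi a^2.
\end{align*}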
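The formula of Proposition 9.3.3 lends itself to a direct numerical check. The sketch below is only an illustration under our own assumptions: the helper names `jacobian_minors` and `surface_integral` are hypothetical, the Jacobian determinants are approximated by central differences, and the double integral over a rectangular domain D is approximated by a midpoint rule. Applied to the sphere of Example 9.3.4 with f = 1, it should reproduce 4πa^2 up to discretization error.

```python
import math

def jacobian_minors(phi, u, v, h=1e-5):
    """Approximate (d(y,z)/d(u,v), d(z,x)/d(u,v), d(x,y)/d(u,v)) by central differences."""
    pu = [(p - q) / (2 * h) for p, q in zip(phi(u + h, v), phi(u - h, v))]
    pv = [(p - q) / (2 * h) for p, q in zip(phi(u, v + h), phi(u, v - h))]
    return (pu[1] * pv[2] - pu[2] * pv[1],
            pu[2] * pv[0] - pu[0] * pv[2],
            pu[0] * pv[1] - pu[1] * pv[0])

def surface_integral(f, phi, u_range, v_range, n=200):
    """Midpoint-rule approximation of the integral of f over S = phi(D), D a rectangle."""
    (u0, u1), (v0, v1) = u_range, v_range
    du, dv = (u1 - u0) / n, (v1 - v0) / n
    total = 0.0
    for i in range(n):
        u = u0 + (i + 0.5) * du
        for j in range(n):
            v = v0 + (j + 0.5) * dv
            m = jacobian_minors(phi, u, v)
            total += f(*phi(u, v)) * math.sqrt(sum(c * c for c in m)) * du * dv
    return total

a = 2.0  # radius, arbitrary sample value

def sphere(p, t):
    """Parametrization of Example 9.3.4 with (u, v) = (phi, theta)."""
    return (a * math.cos(t) * math.sin(p),
            a * math.sin(t) * math.sin(p),
            a * math.cos(p))

area = surface_integral(lambda x, y, z: 1.0, sphere, (0.0, math.pi), (0.0, 2.0 * math.pi))
print(area, 4.0 * math.pi * a**2)
```

Replacing the constant function by another integrand f(x, y, z) gives an approximation of the corresponding surface integral, provided the parameter domain is a rectangle.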
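We now look at a family of examples of surface integrals for surfaces S given by z = g(x, y) where g : D ⊂ R^2 → R is differentiable and f : S → R. The parametrization is Φ(x, y) = (x, y, g(x, y)), with Φ : D → S. Then,
\[ \det\left(\frac{\partial(y, z)}{\partial(x, y)}\right) = -\frac{\partial g}{\partial x}, \qquad \det\left(\frac{\partial(z, x)}{\partial(x, y)}\right) = -\frac{\partial g}{\partial y} \qquad \text{and} \qquad \det\left(\frac{\partial(x, y)}{\partial(x, y)}\right) = 1. \]
Therefore,
\[ \iint_S f\,dS = \iint_D f(\Phi(x, y)) \sqrt{1 + \left(\frac{\partial g}{\partial x}\right)^2 + \left(\frac{\partial g}{\partial y}\right)^2}\,dx \wedge dy. \]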
Example 9.3.5. Consider the elliptic paraboloid S given by g(x, y) = x^2 + y^2 defined over D, the disk of radius 2, which in polar coordinates is D = {(r, θ) | 0 ≤ r ≤ 2, 0 ≤ θ ≤ 2π}. We write the formula and convert to polar coordinates
\[ \iint_S dS = \iint_D \sqrt{1 + 4x^2 + 4y^2}\,dx \wedge dy = \int_0^{2\pi}\!\int_0^{2} \sqrt{1 + 4r^2}\,r\,dr\,d\theta. \]
The r integral is obtained by the substitution u = 1 + 4r^2 and so
\[ \iint_S dS = \frac{\pi(17^{3/2} - 1)}{6}. \]
We now consider a few applications to quantities expressed as densities. Let ρ(x, y, z) be the density of a quantity (mass, population, biomass, etc.) and we restrict (x, y, z) to S. The total quantity over S is then expressed using the surface integral
\[ \iint_S \rho(x, y, z)\,dS. \]
Example 9.3.6. We compute the mass of a thin material in the shape of a conic surface S with parametric representation Φ(r, θ) = (r cos θ, r sin θ, r) with 0 ≤ r ≤ 1 and 0 ≤ θ ≤ 2π. Let ρ(x, y, z) = z^2 be the mass density at (x, y, z) ∈ S. Then,
\[ m = \iint_S \rho(x, y, z)\,dS = \iint_D r^2\sqrt{2}\,r\,dr \wedge d\theta = 2\sqrt{2}\,\pi \int_0^1 r^3\,dr = \frac{\sqrt{2}\,\pi}{2}. \]
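The mass computed in Example 9.3.6 can be checked with a few lines of Python. This is a minimal sketch under our own assumptions: the function name `cone_mass` is hypothetical, the Jacobian determinants of Φ(r, θ) = (r cos θ, r sin θ, r) are entered by hand, and the double integral is approximated by a midpoint rule.

```python
import math

def cone_mass(n=200):
    """Midpoint-rule value of the integral of rho(Phi(r, theta)) * ||n(r, theta)|| over D."""
    dr, dth = 1.0 / n, 2.0 * math.pi / n
    total = 0.0
    for i in range(n):
        r = (i + 0.5) * dr
        for j in range(n):
            th = (j + 0.5) * dth
            # Jacobian determinants of Phi(r, theta) = (r cos(theta), r sin(theta), r)
            yz = -r * math.cos(th)   # d(y, z)/d(r, theta)
            zx = -r * math.sin(th)   # d(z, x)/d(r, theta)
            xy = r                   # d(x, y)/d(r, theta)
            norm_n = math.sqrt(yz**2 + zx**2 + xy**2)  # equals sqrt(2) * r
            rho = r**2               # rho = z^2 with z = r on the cone
            total += rho * norm_n * dr * dth
    return total

mass = cone_mass()
print(mass, math.sqrt(2.0) * math.pi / 2.0)
```

The two printed numbers agree to several digits, which confirms that the factor √2 r coming from dS has been accounted for correctly.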
Example 9.3.7. We consider a situation where the density of vegetation cover on a mountain depends on the height and on whether we consider the south or north face of the mountain. Suppose the mountain has height z = H > 0 with a cone shape assumed to be circularly symmetric with radius 10H at the base z = 0. We set the top of the mountain at (x, y, z) = (0, 0, H) and assume the density of biomass to be given by the function
\[ \rho(x, y, z) = \begin{cases} 2C\left(1 - \dfrac{z}{H}\right) & \text{if } 0 \le y, \\[2mm] C\left(1 - \dfrac{z}{H}\right) & \text{if } y < 0 \end{cases} \]
where C is some constant. Let S be the surface of the mountain given by
\[ z = g(x, y) := H - \frac{1}{100H}(x^2 + y^2) \]
with D = {(x, y) | x^2 + y^2 ≤ (10H)^2}. The total amount of vegetation is given by
\[ \iint_S \rho(x, y, z)\,dS. \]
The parametrization of S is Φ(x, y) = (x, y, H − (x^2 + y^2)/(100H)) with domain D. To obtain the integral, we begin by splitting D = D_+ ∪ D_− where D_+ = D ∩ {(x, y) | y ≥ 0} and D_− = D ∩ {(x, y) | y < 0}. We define S_+ = S|_{y≥0} and S_− = S|_{y<0} and so S_± is parametrized by Φ with domain D_±. Moreover, ρ|_{D_+} = 2C(1 − z/H) and ρ|_{D_−} = C(1 − z/H). Therefore, we can write
\begin{align*}
\iint_S \rho(x, y, z)\,dS &= \iint_{S_+} \rho_+(x, y, z)\,dS + \iint_{S_-} \rho_-(x, y, z)\,dS \\
&= \iint_{D_+} 2C\left(1 - \frac{g(x, y)}{H}\right)\sqrt{1 + \frac{4(x^2 + y^2)}{(100H)^2}}\,dx \wedge dy \\
&\quad + \iint_{D_-} C\left(1 - \frac{g(x, y)}{H}\right)\sqrt{1 + \frac{4(x^2 + y^2)}{(100H)^2}}\,dx \wedge dy.
\end{align*}
Transforming to polar coordinates in (x, y), the parametrization of S is given by Φ̃(r, θ) = (r cos θ, r sin θ, H − r^2/(100H)) with domains D̃_+ = {(r, θ) | r ≤ 10H, θ ∈ (0, π)} and D̃_− = {(r, θ) | r ≤ 10H, θ ∈ (π, 2π)}. Thus, we obtain
\[ \iint_{D_\pm} \left(1 - \frac{g(x, y)}{H}\right)\sqrt{1 + \frac{4(x^2 + y^2)}{(100H)^2}}\,dx \wedge dy = \iint_{\tilde{D}_\pm} \frac{r^2}{100H^2}\sqrt{1 + \frac{4r^2}{(100H)^2}}\,r\,dr \wedge d\theta. \]
Writing in terms of iterated integrals, using the fact that the integrand is independent of θ, we now have
\[ \iint_{\tilde{D}_\pm} \frac{r^2}{100H^2}\sqrt{1 + \frac{4r^2}{(100H)^2}}\,r\,dr \wedge d\theta = \pi \int_0^{10H} \frac{r^2}{100H^2}\sqrt{1 + \frac{4r^2}{(100H)^2}}\,r\,dr. \]
This last integral can be solved by setting
\[ u = \sqrt{1 + \frac{4r^2}{(100H)^2}} \qquad \text{with} \qquad u\,du = \frac{4}{(100H)^2}\,r\,dr. \]
The bounds of integration are u = 1 (at r = 0) and u = √(1 + 4/100) = √26/5 (at r = 10H). We leave the details to the reader.
Exercises
(1) In each case, set up the integral to compute the surface area of S and, if possible, compute the integral.
(a) S is given by z = 2x^2 + y^2 with domain D = {(x, y) | 0 ≤ 2x^2 + y^2 ≤ 4}.
(b) S is the portion of the cylinder x^2 + z^2 = 2 with z ≥ 0 and −3 ≤ y ≤ 3.
(c) S is given by Φ(u, v) = (uv, uv^2, 2 + u) with −1 ≤ u ≤ 1 and 0 ≤ v ≤ 1.
(d) S is the portion of the sphere of radius 1 with 1/2 ≤ z ≤ 1.
(e) S is the surface of the tetrahedron with vertices (0, 0, 0), (1, 0, 0), (0, 1, 0) and (0, 0, 1).
(f) S is the positively oriented cone with a vertex at (0, 0, 2) with axis given by the z-axis and base given by the disk of radius 2 in the xy-plane.
(g) S is the portion of the sphere given by Φ(φ, θ) = (2 cos θ sin φ, 2 sin θ sin φ, 2 cos φ) with D = {(φ, θ) | π/3 ≤ φ ≤ 2π/3, π/4 ≤ θ ≤ 3π/2}.
(2) Suppose that the population density of a wildlife species A depends on the height of the landscape and is given by the formula ρ(x, y, z) = ρ_0 − z, and that the species does not live at heights greater than ρ_0. Suppose the landscape can be approximated by z = x^4 y^3 + 2x^2 y + 3xy^2 where −1 ≤ x ≤ 1 and 0 ≤ y ≤ 1. Set up the integral for the total population of A.
(3) Cell membranes are covered with channels enabling the transport of ions from the interior of the cell to the extracellular medium. Channels typically only enable the passage of a unique type of ion. Na+ ion channel density on a given type of cell is 60 channels/µm^2 [5]. Suppose that a cell is spherically shaped with radius 2 µm. How many ion channels are on the sphere?
(4) The European otter Lutra lutra exhibits a mean hair density of about 70000 hairs/cm^2 not including appendages [3]. The body of the otter is 57 to 95 cm long not counting the tail. If we assume that the body of the otter is approximately a cylinder of length ℓ (with ℓ ∈ [57, 95]) and radius 8 cm, determine the total number of hairs on the otter.
9.3.2 Integral of a 2-form on a surface
We begin with a formal definition of the integral of a 2-form over a surface using Riemann sums. Then, we use the pullback formula for 2-forms in R^3 via a mapping Φ acting as a parametrization of S to obtain a convenient formula for the integral of 2-forms on surfaces. Consider a surface S in the context given before the definition of surface integral; that is, Definition 9.3.2. Let R = [a, b] × [c, d] ⊂ R^2 be a rectangular region such that D ⊂ R and consider partitions of [a, b] and [c, d] leading to the grid of points (x_i, y_j) ∈ R with 1 ≤ i ≤ n and 1 ≤ j ≤ m. Let p_ij = g(x_i, y_j) ∈ S and γ_ij and δ_ij be vectors in T_{p_ij} S parallel to the x and y axes, respectively.
Definition 9.3.8. Let S be a surface as described above and ω^2 = a_1 dy ∧ dz + a_2 dz ∧ dx + a_3 dx ∧ dy be a differentiable 2-form. Then, the integral of ω^2 on S is defined as
\[ \iint_S \omega^2 = \lim_{n,m \to \infty} \sum_{i=1}^{n} \sum_{j=1}^{m} \Big[ a_1(p_{ij})\,(dy \wedge dz)(\gamma_{ij}, \delta_{ij}) + a_2(p_{ij})\,(dz \wedge dx)(\gamma_{ij}, \delta_{ij}) + a_3(p_{ij})\,(dx \wedge dy)(\gamma_{ij}, \delta_{ij}) \Big] \]
if the limit of the Riemann sum on the right-hand side of the equality sign exists as the size of γ_ij, δ_ij tends to 0 as n, m → ∞.
Theorem 9.3.9. Consider a surface S with a parametrization Φ : D ⊂ R^2 → R^3:
\[ \Phi(u, v) = (x(u, v), y(u, v), z(u, v)). \]
Let ω be a 2-form, then
\[ \iint_S \omega = \iint_D \Phi^*\omega \]
where the term on the right is a double integral in R^2.
Proof. Begin with the Riemann sum formula of Definition 9.3.8. Set ω = a_1 dy ∧ dz + a_2 dz ∧ dx + a_3 dx ∧ dy, p_ij = Φ(u_i, v_j) ∈ S, α_ij, β_ij ∈ T_{p_ij} S and γ_ij = DΦ^{-1}(α_ij), δ_ij = DΦ^{-1}(β_ij), then
\begin{align*}
\iint_S \omega^2 &= \lim_{n,m \to \infty} \sum_{i=1}^{n} \sum_{j=1}^{m} \Big[ a_1(p_{ij})\,(dy \wedge dz)(\alpha_{ij}, \beta_{ij}) + a_2(p_{ij})\,(dz \wedge dx)(\alpha_{ij}, \beta_{ij}) + a_3(p_{ij})\,(dx \wedge dy)(\alpha_{ij}, \beta_{ij}) \Big] \\
&= \lim_{n,m \to \infty} \sum_{i=1}^{n} \sum_{j=1}^{m} \Big[ a_1(\Phi(u_i, v_j)) \frac{\partial(y, z)}{\partial(u, v)}\,(du \wedge dv)(\gamma_{ij}, \delta_{ij}) + a_2(\Phi(u_i, v_j)) \frac{\partial(z, x)}{\partial(u, v)}\,(du \wedge dv)(\gamma_{ij}, \delta_{ij}) \\
&\qquad\qquad + a_3(\Phi(u_i, v_j)) \frac{\partial(x, y)}{\partial(u, v)}\,(du \wedge dv)(\gamma_{ij}, \delta_{ij}) \Big] \\
&= \lim_{n,m \to \infty} \sum_{i=1}^{n} \sum_{j=1}^{m} \left( a_1(\Phi(u_i, v_j)) \frac{\partial(y, z)}{\partial(u, v)} + a_2(\Phi(u_i, v_j)) \frac{\partial(z, x)}{\partial(u, v)} + a_3(\Phi(u_i, v_j)) \frac{\partial(x, y)}{\partial(u, v)} \right)(du \wedge dv)(\gamma_{ij}, \delta_{ij}) \\
&= \iint_D \Phi^*\omega^2,
\end{align*}
where the last equality holds as one recognizes the Riemann sum of the pullback of ω^2 in the previous line.
We illustrate the use of the formula of Theorem 9.3.9 in the following examples.
Example 9.3.10. Let S be parametrized by (x, y, z) = Φ(u, v) = (u^2, u + v, v − u) with D = {(u, v) | 0 ≤ u ≤ 1, 0 ≤ v ≤ 1} and
\[ \omega = 3xyz\,(dy \wedge dz) + zy\,(dz \wedge dx) + x^2 y\,(dx \wedge dy). \]
We compute
\[ \iint_S \omega. \]
This case is straightforward because the domain D and the parametrization Φ are given explicitly. All we need is to compute the pullback of ω. We compute first
\[ \frac{\partial(y, z)}{\partial(u, v)} = \begin{vmatrix} 1 & 1 \\ -1 & 1 \end{vmatrix} = 2, \qquad \frac{\partial(z, x)}{\partial(u, v)} = \begin{vmatrix} -1 & 1 \\ 2u & 0 \end{vmatrix} = -2u, \qquad \frac{\partial(x, y)}{\partial(u, v)} = \begin{vmatrix} 2u & 0 \\ 1 & 1 \end{vmatrix} = 2u. \]
We have
\[ \Phi^*\omega = \big(3u^2(u + v)(v - u)(2) + (v - u)(u + v)(-2u) + u^4(u + v)(2u)\big)\,du \wedge dv \]
and so
\[ \iint_S \omega = \int_0^1\!\int_0^1 (u + v)(6u^2 v - 6u^3 - 2uv + 2u^2 + 2u^5)\,du\,dv. \]
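The recipe of Theorem 9.3.9 for this example can also be carried out numerically. The sketch below is an illustration under our own assumptions: the helper names are hypothetical, the Jacobian determinants are computed from Φ by central differences rather than by hand, and the double integral over D = [0, 1] × [0, 1] is approximated by a midpoint rule. Integrating the pullback term by term gives the exact value 3/35 for this surface and 2-form, which the sum should approach.

```python
def phi(u, v):
    """Parametrization of Example 9.3.10."""
    return (u * u, u + v, v - u)

def coeffs(x, y, z):
    """Coefficients (a1, a2, a3) of omega = a1 dy^dz + a2 dz^dx + a3 dx^dy."""
    return (3.0 * x * y * z, z * y, x * x * y)

def minors(u, v, h=1e-5):
    """Approximate (d(y,z)/d(u,v), d(z,x)/d(u,v), d(x,y)/d(u,v)) by central differences."""
    pu = [(p - q) / (2 * h) for p, q in zip(phi(u + h, v), phi(u - h, v))]
    pv = [(p - q) / (2 * h) for p, q in zip(phi(u, v + h), phi(u, v - h))]
    return (pu[1] * pv[2] - pu[2] * pv[1],
            pu[2] * pv[0] - pu[0] * pv[2],
            pu[0] * pv[1] - pu[1] * pv[0])

def integrate_two_form(n=200):
    """Midpoint-rule approximation of the double integral of the pullback over [0,1] x [0,1]."""
    du = dv = 1.0 / n
    total = 0.0
    for i in range(n):
        u = (i + 0.5) * du
        for j in range(n):
            v = (j + 0.5) * dv
            a = coeffs(*phi(u, v))
            m = minors(u, v)
            total += sum(ai * mi for ai, mi in zip(a, m)) * du * dv
    return total

value = integrate_two_form()
print(value, 3.0 / 35.0)
```

Since the Jacobian determinants are obtained directly from Φ, the same code can be reused for other parametrizations over the unit square by changing `phi` and `coeffs`.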
This last integral is straightforward and we leave the details to the reader.
Example 9.3.11. Let S be the surface which forms the boundary of the region determined by the cylinder x^2 + z^2 = 1, the plane y = 0 and x + y = 2. See Figure 9.4. Let ω = 5(dy ∧ dz) + y(dz ∧ dx) + x(dx ∧ dy). We compute
\[ \iint_S \omega. \]
This computation is much more involved than the previous one and we outline only the major steps. First, there are three separate surfaces making up S and second, the parametrizations are not given explicitly.
Fig. 9.4. Surface S made up of three separate surfaces S_1, S_2 and S_3.
The three separate surfaces are: the disk of radius 1 centered at the origin in the (x, z)-plane (S_1), the cylinder (S_2) and the plane given by x + y = 2 (S_3). Therefore, we write
\[ \iint_S \omega = \iint_{S_1} \omega + \iint_{S_2} \omega + \iint_{S_3} \omega. \]
S_1 is parametrized by Φ_1 : D_1 → S_1 where Φ_1 is the polar coordinates transformation and D_1 = {(r, θ) | 0 ≤ r ≤ 1, 0 ≤ θ ≤ 2π}. We have
\[ S_2 = \{(x, y, z) \mid x^2 + z^2 = 1,\ 0 \le y \le 2 - x\} \]
parametrized by Φ_2 : D_2 → S_2 where
\[ \Phi_2(\theta, y) = (\cos\theta, y, \sin\theta) \]
and D_2 = {(θ, y) | 0 ≤ θ ≤ 2π, 0 ≤ y ≤ 2 − cos θ}. Finally,
\[ S_3 = \{(x, y, z) \mid y = 2 - x,\ x^2 + z^2 \le 1\} \]
and Φ_3 : D_3 → S_3 is
\[ \Phi_3(r, \theta) = (r\cos\theta, 2 - r\cos\theta, r\sin\theta), \qquad D_3 = \{(r, \theta) \mid 0 \le r \le 1,\ 0 \le \theta \le 2\pi\}. \]
One now has all the information necessary to complete the problem (see Exercises).
The link between surface integrals and integrals of 2-forms on a surface
Consider a 2-form ω = a_1 dy ∧ dz + a_2 dz ∧ dx + a_3 dx ∧ dy and a surface S parametrized by Φ : D → S. Notice that the formula given by Theorem 9.3.9 can be transformed as follows:
\begin{align*}
\iint_S \omega &= \iint_D \left( a_1(\Phi(u, v)) \frac{\partial(y, z)}{\partial(u, v)} + a_2(\Phi(u, v)) \frac{\partial(z, x)}{\partial(u, v)} + a_3(\Phi(u, v)) \frac{\partial(x, y)}{\partial(u, v)} \right) du \wedge dv \\
&= \iint_D a(\Phi(u, v)) \cdot \left( \frac{\partial(y, z)}{\partial(u, v)}, \frac{\partial(z, x)}{\partial(u, v)}, \frac{\partial(x, y)}{\partial(u, v)} \right) du \wedge dv \tag{9.3}
\end{align*}
where we define
\[ a(\Phi(u, v)) := (a_1(\Phi(u, v)), a_2(\Phi(u, v)), a_3(\Phi(u, v))). \]
Just as we do for 1-forms in Chapter 4, we can interpret the coefficients of the 2-form as the components of a vector field a(p) := (a_1(p), a_2(p), a_3(p)) ∈ T_p R^3 where p = (x, y, z). Thus, we see that the integrand is the scalar product of the vector field a(p) with a vector
\[ n(p) := \left( \frac{\partial(y, z)}{\partial(u, v)}, \frac{\partial(z, x)}{\partial(u, v)}, \frac{\partial(x, y)}{\partial(u, v)} \right) \]
associated only with the surface S. Note that
\[ \|n(p)\| = \sqrt{\left(\frac{\partial(y, z)}{\partial(u, v)}\right)^2 + \left(\frac{\partial(z, x)}{\partial(u, v)}\right)^2 + \left(\frac{\partial(x, y)}{\partial(u, v)}\right)^2}. \]
Therefore, normalizing the vector n(p) in the integrand of (9.3) we obtain
\[ \iint_S \omega = \iint_D a(p) \cdot \frac{n(p)}{\|n(p)\|}\,\|n(p)\|\,du \wedge dv = \iint_D f(p)\,dS \tag{9.4} \]
where dS corresponds to ||n(p)|| du ∧ dv, and
\[ f(p) := a(p) \cdot \frac{n(p)}{\|n(p)\|} \in \mathbb{R}. \]
Thus, the integral of a 2-form on a surface S corresponds to the surface integral of the function f. In the next section, we give a geometric meaning to n; namely, n(p) is a vector perpendicular to S at p (also called a “normal vector”). The interpretation of the coefficients of the 2-form as a vector field and the geometric significance of f are explained in Chapter 10.
Exercises
(1) Set up the iterated integrals for the computation of the integral of the 2-form on the given surface. Compute the integral if possible.
(a) Let S be the surface given by Φ(u, v) = (u + v, uv, u^2) with 0 ≤ u ≤ 1, −1 ≤ v ≤ 1 and ω = 2xyz dy ∧ dz + 3y^2 x dz ∧ dx + zx^2 dx ∧ dy.
(b) Let S be the portion of the ellipsoid
\[ \frac{x^2}{4} + y^2 + \frac{z^2}{3} = 1 \]
with y ≥ 0 and the 2-form is ω = z dy ∧ dz + xy dz ∧ dx.
(c) Let S be the cylinder with cross-section given by (x − 1)^2 + y^2 = 1 with 0 ≤ z ≤ 2 and the 2-form is ω = x dy ∧ dz + y dz ∧ dx.
(d) Let S be the surface of rotation generated by y = (1 − x)^2 with 0 ≤ x ≤ 1 around the z-axis and the 2-form is ω = x e^{yz} dy ∧ dz + e^{xz} dz ∧ dx + z^2 dx ∧ dy.
(e) Let S be the portion of the sphere given by Φ(φ, θ) = (cos θ sin φ, sin θ sin φ, cos φ) with −π/4 ≤ φ ≤ π/4 and the 2-form is ω = y dy ∧ dz + dz ∧ dx + z dx ∧ dy.
(2) Transform the integrals of the previous problem into surface integrals using formula (9.4).
(3) Perform the computations of Example 9.3.11.
9.4 Orientation of Surfaces
Just as curves have two possible orientations, we now explain how to define and compute the orientation on surfaces. The simplest example of a surface is a plane and we begin with this case as the general case builds on this one. We know that the (x, y)-plane can be given an orientation, positive (counterclockwise rotation) or negative (clockwise rotation). Consider now a plane P ⊂ R^3, not necessarily containing the origin. We center the plane at some p ∈ P and consider an axis perpendicular to P and passing through p. Rotations in one direction are defined as positive while rotations in the opposite direction are defined as negative.
For nonvertical planes P in (x,y,z) space, we say that the plane P has positive orientation at p ∈ P , if the rotation around the axis at p , projected on the (x, y)-plane is in the positive orientation. Therefore, all points on a plane can be given a positive orientation induced by the orientation on the (x, y)-plane. For a vertical plane, we can use the projection to the (y, z) or (x, z) planes to define the orientation.
Fig. 9.5. Consistent orientation at points p and q of a plane P induced by the orientation on the xy-plane.
Let p, q ∈ P. We say that the orientations at p and q are consistent if the orientations at p, q are the same. Therefore, the orientation induced by the (x, y)-plane for all p ∈ P is consistent. See Figure 9.5. Because orientations are well-defined and easily computable for planes, those are used for the definition of orientation for a general surface.
Definition 9.4.1. A two-dimensional surface S is orientable if for any simple closed curve C ⊂ S, it is possible to prescribe a consistent orientation at each point of the curve; that is, for any p, q ∈ C, T_q S and T_p S have the same orientation.
Just as with planes, visualizing the orientation for two-dimensional surfaces in R^3 is done by considering a vector perpendicular to the tangent space, see Figure 9.6.
Definition 9.4.2. Let S be a two-dimensional surface and p ∈ S. A vector n ∈ T_p R^3 is called a normal vector to S if n is perpendicular to T_p S. If n is a normal vector with ||n|| = 1, it is called a unit normal vector.
Given any normal vector v at p ∈ S, the vector −v is also a normal vector. We use this duality to attach a normal vector to each orientation. We adopt the convention that the direction of the normal vector corresponding to the positive orientation should follow the right-hand rule; this means that if you bend your four fingers in the counterclockwise direction on the tangent space to the surface using the axis perpendicular to the tangent space, your thumb points in the direction of the positive normal vector.
Fig. 9.6. Surface S with tangent space T_p S and normal vectors ±n.
Example 9.4.3. A sphere S is orientable. Indeed, for each point p ∈ S, one can always choose the unit normal vector n perpendicular to T_p S as pointing towards the outside of the sphere. Thus, for any curve C joining two points p, q ∈ S, the normal vector n prescribes a consistent orientation for all tangent spaces along the curve C joining p and q.
We now conclude this section with the famous Möbius strip which is the simplest example of a non-orientable surface.
Example 9.4.4. A Möbius strip S is a surface obtained by taking a strip of length L with much smaller width, giving the edges a half-turn and connecting the two ends together as shown in Figure 9.7. The Möbius strip is not orientable.
Fig. 9.7. Thin rectangle used to make a Möbius strip by twisting one end and connecting the two more remote sides.
From Figure 9.8, for a point chosen on the line in the middle of the strip, the curve returns to its original location after one revolution, but the normal vector attached to the path is now pointing in the opposite direction from the original one. This means that as we follow a path along the middle of the strip, the orientation on the surface is not consistent. This is true also for any path which winds around once along the strip. For instance, consider a point at a constant positive distance from the middle of the strip, we see in the picture that the normal vector after one revolution points in the opposite direction from the original normal vector, then joining the starting and end points by a straight line the normal vector is carried to minus the starting point normal vector.
Fig. 9.8. Paths along the Möbius strip with normal vectors attached along the paths.
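The reversal of the normal vector along the middle of the strip can be observed numerically. The sketch below relies on an assumption that is not in the text: it uses one common parametrization of a Möbius strip, Φ(u, v) = ((1 + (v/2) cos(u/2)) cos u, (1 + (v/2) cos(u/2)) sin u, (v/2) sin(u/2)) with 0 ≤ u ≤ 2π and −1 ≤ v ≤ 1, whose middle line is v = 0. The helper names are hypothetical and the vector perpendicular to the tangent space is obtained as the cross product of the two partial derivatives of Φ, approximated by central differences.

```python
import math

def mobius(u, v):
    """A common Moebius strip parametrization (an assumed choice, not taken from the text)."""
    w = 1.0 + 0.5 * v * math.cos(u / 2.0)
    return (w * math.cos(u), w * math.sin(u), 0.5 * v * math.sin(u / 2.0))

def normal(phi, u, v, h=1e-6):
    """Cross product of the partial derivatives of phi, approximated by central differences."""
    pu = [(p - q) / (2 * h) for p, q in zip(phi(u + h, v), phi(u - h, v))]
    pv = [(p - q) / (2 * h) for p, q in zip(phi(u, v + h), phi(u, v - h))]
    return (pu[1] * pv[2] - pu[2] * pv[1],
            pu[2] * pv[0] - pu[0] * pv[2],
            pu[0] * pv[1] - pu[1] * pv[0])

# Follow the middle line v = 0 once around the strip.
start_point = mobius(0.0, 0.0)
end_point = mobius(2.0 * math.pi, 0.0)
n_start = normal(mobius, 0.0, 0.0)
n_end = normal(mobius, 2.0 * math.pi, 0.0)
print(start_point, end_point)  # the same point of the strip, up to rounding
print(n_start, n_end)          # opposite vectors, up to rounding
```

The path closes up, yet the vector carried continuously along it comes back as its own negative, which is the failure of consistency described in Example 9.4.4.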
9.4.1 Computation of Normal Vectors
The previous section shows that the computation of normal vectors perpendicular to a surface is obtained by computing the tangent plane and determining a vector perpendicular to the tangent plane. We now derive a general formula to compute normal vectors for parametrized surfaces. Let S be a surface parametrized by Φ(u, v) = (x(u, v), y(u, v), z(u, v)) and p = Φ(u_0, v_0) ∈ S. A normal vector n(p) = (n_x, n_y, n_z) must satisfy n(p) ⊥ T_p S, but we know that an infinite number of vectors satisfy this condition. Therefore, we also impose a condition on the norm of this vector. One choice would be to restrict to unit vectors. However, we prefer the following condition which has a stronger geometric significance in the context of parametrized surfaces. Consider the horizontal and vertical paths through (u_0, v_0) given by (u, v_0) and (u_0, v). Then Φ(u, v_0) and Φ(u_0, v) are paths in S through p. Let u_1 = (1, 0) and v_1 = (0, 1) be unit vectors at T_(u_0,v_0) R^2 and consider the vectors
\[ \alpha = D\Phi(u_0, v_0)\,u_1 = \left.\left(\frac{\partial x}{\partial u}, \frac{\partial y}{\partial u}, \frac{\partial z}{\partial u}\right)\right|_{(u_0, v_0)} \]
and
\[ \beta = D\Phi(u_0, v_0)\,v_1 = \left.\left(\frac{\partial x}{\partial v}, \frac{\partial y}{\partial v}, \frac{\partial z}{\partial v}\right)\right|_{(u_0, v_0)}. \]
See Figure 9.10 for an illustration.
Fig. 9.9. Horizontal and vertical paths through (u_0, v_0).
Fig. 9.10. Surface S with vectors α and β obtained using the derivative DΦ.
Then, T_p S = span(α, β) and so n(p) · α = n(p) · β = 0 are the orthogonality conditions. The second condition needed to characterize n(p) is that the norm of the normal vector should be equal to the area of the parallelogram P(α, β); that is,
\[ \|n(p)\| = A(P(\alpha, \beta)). \]
The explicit formula is a direct computation as
\[ \|n(p)\| = \sqrt{(dy \wedge dz)^2(\alpha, \beta) + (dz \wedge dx)^2(\alpha, \beta) + (dx \wedge dy)^2(\alpha, \beta)}. \]
Therefore, the normal vector to S at a point p = (x, y, z) is given by
\[ n(p) = \left.\left( \frac{\partial(y, z)}{\partial(u, v)}, \frac{\partial(z, x)}{\partial(u, v)}, \frac{\partial(x, y)}{\partial(u, v)} \right)\right|_{(u_0, v_0)}. \]
With this choice, the normal vector is connected to the formula for dS. We now look at some important examples for which the formulae are used repeatedly in the next sections.
Example 9.4.5. Consider the surface given by the graph of a differentiable function f(x, y). Then, Φ(u, v) = (u, v, f(u, v)) is a parametrization and the normal vector at a point p = (x, y, z) is given by
\[ n(p) = \left( \frac{\partial(y, z)}{\partial(u, v)}, \frac{\partial(z, x)}{\partial(u, v)}, \frac{\partial(x, y)}{\partial(u, v)} \right) = \left.\left( -\frac{\partial f}{\partial u}, -\frac{\partial f}{\partial v}, 1 \right)\right|_{(u_0, v_0)}. \]
This normal vector corresponds to the positive orientation induced from the positive orientation in the (x, y)-plane.
Example 9.4.6. Let S be the sphere of radius 1 centered at the origin with parametrization given by Φ(φ, θ) = (cos θ sin φ, sin θ sin φ, cos φ) with 0 ≤ φ ≤ π, 0 ≤ θ ≤ 2π. Then,
\[ n(p) = \left( \frac{\partial(y, z)}{\partial(\varphi, \theta)}, \frac{\partial(z, x)}{\partial(\varphi, \theta)}, \frac{\partial(x, y)}{\partial(\varphi, \theta)} \right) = (\cos\theta\sin^2\varphi, \sin\theta\sin^2\varphi, \sin\varphi\cos\varphi). \]
Note that n(p) points towards the outside of the sphere. This can be seen by letting θ = 0, φ = π/2, then the normal vector is (1, 0, 0).
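The formula for n(p) is easy to evaluate numerically, which gives a way to check hand computations such as those of Examples 9.4.5 and 9.4.6. The sketch below is an illustration under our own assumptions: the helper `normal` is hypothetical and approximates the three Jacobian determinants by central differences, and the graph f(u, v) = u^2 + v is a sample function chosen only for the test.

```python
import math

def normal(phi, u, v, h=1e-6):
    """n = (d(y,z)/d(u,v), d(z,x)/d(u,v), d(x,y)/d(u,v)), approximated by central differences."""
    pu = [(p - q) / (2 * h) for p, q in zip(phi(u + h, v), phi(u - h, v))]
    pv = [(p - q) / (2 * h) for p, q in zip(phi(u, v + h), phi(u, v - h))]
    return (pu[1] * pv[2] - pu[2] * pv[1],
            pu[2] * pv[0] - pu[0] * pv[2],
            pu[0] * pv[1] - pu[1] * pv[0])

def sphere(p, t):
    """Unit sphere of Example 9.4.6 with (u, v) = (phi, theta)."""
    return (math.cos(t) * math.sin(p), math.sin(t) * math.sin(p), math.cos(p))

def graph(u, v):
    """Graph of the sample function f(u, v) = u^2 + v (our own choice)."""
    return (u, v, u * u + v)

n_sphere = normal(sphere, math.pi / 2.0, 0.0)  # close to (1, 0, 0), pointing outwards
n_graph = normal(graph, 1.0, 2.0)              # close to (-f_u, -f_v, 1) = (-2, -1, 1)
print(n_sphere, n_graph)
```

On the sphere the computed vector is a positive multiple of the position vector for 0 < φ < π, which is another way to see that it points outwards.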
Exercises
(1) Consider the surface S given by the cylinder x^2 + y^2 = 4 with 0 ≤ z ≤ 1 given by the parametrization Φ(θ, z) = (2 cos θ, 2 sin θ, z). Determine (graphically) the positive orientation of S. Show that the normal vector computed with the parametrization Φ automatically gives the positive orientation.
(2) Compute the normal vector on a torus with parametrization Φ(u, v) = ((2 + cos v) cos u, (2 + cos v) sin u, sin v) with 0 ≤ u, v ≤ 2π.
(3) Compute the normal vectors of the surfaces given by the basic quadrics:
\[ \frac{x^2}{a^2} + \frac{y^2}{b^2} + \frac{z^2}{c^2} = 1, \qquad \frac{x^2}{a^2} + \frac{y^2}{b^2} - \frac{z^2}{c^2} = 1, \qquad -\frac{x^2}{a^2} - \frac{y^2}{b^2} + \frac{z^2}{c^2} = 1, \]
\[ \frac{x^2}{a^2} + \frac{y^2}{b^2} - z = 0, \qquad \frac{x^2}{a^2} - \frac{y^2}{b^2} - z = 0, \qquad \frac{z^2}{c^2} = \frac{x^2}{a^2} + \frac{y^2}{b^2}. \]
(4) Compute the normal vector of the helicoid surface given by the parametric representation
\[ \Phi(\rho, \theta) = (\rho\cos(a\theta), \rho\sin(a\theta), \theta) \]
where ρ > 0, θ ∈ [0, 2π) and a is a fixed constant.
(5) Consider the tetrahedron with vertices at (0, 0, 0), (1, 0, 0), (0, 1, 0) and (0, 0, 1). Determine the positively oriented normal vectors for this surface. (Hint: the normal vectors for the coordinate planes are straightforward to obtain.)
9.5 General Pullback Formula
The formulae obtained in the above section are special cases of a pullback formula valid for any k-form as we now explain. Let Ψ : R^n → R^m be a differentiable mapping and ω be a k-form in R^m. That is, ω is evaluated at q ∈ R^m on vectors α_1, . . . , α_k where α_j ∈ T_q R^m for j = 1, . . . , k:
\[ \omega(q)(\alpha_1, \ldots, \alpha_k) \in \mathbb{R}. \]
Using Ψ, we want to define a k-form on R^n denoted by Ψ^*ω and defined at p ∈ R^n on vectors v_1, . . . , v_k where v_j ∈ T_p R^n for j = 1, . . . , k:
\[ (\Psi^*\omega)(p)(v_1, \ldots, v_k) := \omega(\Psi(p))(D\Psi(v_1), \ldots, D\Psi(v_k)) \tag{9.5} \]
where the left-hand side is well-defined at a point p ∈ R^n and applied to vectors in T_p R^n because q = Ψ(p) ∈ R^m and DΨ(v_j) ∈ T_q R^m for j = 1, . . . , k. We do not present the procedure to compute the explicit k-forms in complete generality, but instead illustrate in the following example how one obtains explicitly the new k-form with such a formula.
Example 9.5.1. Let
\[ \omega = a_1(x, y, z)\,dx + a_2(x, y, z)\,dy + a_3(x, y, z)\,dz \]
be a 1-form in R^3 and consider the mapping Ψ : R^2 → R^3. We use the pullback formula and this gives us a 1-form at a point p = (α, β) ∈ R^2 and applied at v ∈ T_p R^2:
\[ (\Psi^*\omega)(p)(v) = a_1(\Psi(\alpha, \beta))\,dx(D\Psi(v)) + a_2(\Psi(\alpha, \beta))\,dy(D\Psi(v)) + a_3(\Psi(\alpha, \beta))\,dz(D\Psi(v)) \tag{9.6} \]
which we now unpack. We write Ψ(α, β) = (Ψ_1(α, β), Ψ_2(α, β), Ψ_3(α, β))^T and for v ∈ T_p R^2, the derivative is
\[ D\Psi(v) = (D\Psi_1(v), D\Psi_2(v), D\Psi_3(v))^T. \]
By definition of the differentials we have
\[ dx(D\Psi(v)) = D\Psi_1(v), \qquad dy(D\Psi(v)) = D\Psi_2(v), \qquad dz(D\Psi(v)) = D\Psi_3(v). \]
Moreover, letting v = (dα(v), dβ(v)), for i = 1, 2, 3 we have
\[ D\Psi_i(v) = \frac{\partial \Psi_i}{\partial \alpha}\,d\alpha(v) + \frac{\partial \Psi_i}{\partial \beta}\,d\beta(v). \]
Collecting this information into equation (9.6) and simplifying, we obtain (eliminating the dependency on v)
\[ (\Psi^*\omega)(\alpha, \beta) = \left( \sum_{i=1}^{3} a_i(\Psi(\alpha, \beta)) \frac{\partial \Psi_i}{\partial \alpha} \right) d\alpha + \left( \sum_{i=1}^{3} a_i(\Psi(\alpha, \beta)) \frac{\partial \Psi_i}{\partial \beta} \right) d\beta. \]
Note that this example contains also the subcase of Ψ : R^2 → R^2 if Ψ_3 = 0.
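The final formula of Example 9.5.1 translates directly into a short computation. The following sketch is only an illustration: the mapping Ψ(α, β) = (α^2, αβ, β) and the 1-form ω = y dx + z dy + x dz are sample choices of ours, the function names are hypothetical, and the partial derivatives of Ψ are approximated by central differences.

```python
def partials(psi, al, be, h=1e-6):
    """Central-difference approximations of dPsi/dalpha and dPsi/dbeta at (al, be)."""
    d_al = [(p - q) / (2 * h) for p, q in zip(psi(al + h, be), psi(al - h, be))]
    d_be = [(p - q) / (2 * h) for p, q in zip(psi(al, be + h), psi(al, be - h))]
    return d_al, d_be

def pullback_one_form(a, psi, al, be):
    """Coefficients (P, Q) such that (Psi^* omega)(al, be) = P d_alpha + Q d_beta."""
    coeff = a(*psi(al, be))            # (a1, a2, a3) evaluated at Psi(al, be)
    d_al, d_be = partials(psi, al, be)
    P = sum(c * d for c, d in zip(coeff, d_al))
    Q = sum(c * d for c, d in zip(coeff, d_be))
    return P, Q

def psi(al, be):
    """Sample mapping from R^2 to R^3 (our own choice)."""
    return (al * al, al * be, be)

def a(x, y, z):
    """Coefficients of the sample 1-form omega = y dx + z dy + x dz."""
    return (y, z, x)

print(pullback_one_form(a, psi, 1.0, 2.0))  # approximately (8.0, 3.0)
```

For these sample choices the formula gives Ψ^*ω = (2α^2 β + β^2) dα + (αβ + α^2) dβ by hand, which evaluates to 8 dα + 3 dβ at (α, β) = (1, 2).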
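From this example, we can now present a result about changes of variables in line integrals which is useful in proving Stokes’ theorem in Chapter 10.
Proposition 9.5.2. Let ω be a 1-form in R^3 and Ψ : R^2 → R^3 be a differentiable mapping. If C is a curve in R^2, then
\[ \int_C \Psi^*\omega = \int_{\Psi(C)} \omega. \]
Proof. Let ω = a_1(x, y, z) dx + a_2(x, y, z) dy + a_3(x, y, z) dz, and r(t) = (α(t), β(t)) = (r_1(t), r_2(t)) with t ∈ [a, b] be a parametrization of C. Then, the curve Ψ(C) has parametrization given by
\[ x(t) = \Psi_1(r(t)), \qquad y(t) = \Psi_2(r(t)), \qquad z(t) = \Psi_3(r(t)) \]
with t ∈ [a, b] and
\begin{align*}
dx(\Psi(r(t))) &= \frac{d}{dt}\Psi_1(r(t))\,dt = \left( \frac{\partial \Psi_1}{\partial \alpha} r_1'(t) + \frac{\partial \Psi_1}{\partial \beta} r_2'(t) \right) dt, \\
dy(\Psi(r(t))) &= \frac{d}{dt}\Psi_2(r(t))\,dt = \left( \frac{\partial \Psi_2}{\partial \alpha} r_1'(t) + \frac{\partial \Psi_2}{\partial \beta} r_2'(t) \right) dt, \\
dz(\Psi(r(t))) &= \frac{d}{dt}\Psi_3(r(t))\,dt = \left( \frac{\partial \Psi_3}{\partial \alpha} r_1'(t) + \frac{\partial \Psi_3}{\partial \beta} r_2'(t) \right) dt.
\end{align*}
Therefore, putting all the formulae above together we obtain
\begin{align*}
\int_{\Psi(C)} \omega &= \int_a^b \Big( a_1(\Psi(r(t)))\,dx(\Psi(r(t))) + a_2(\Psi(r(t)))\,dy(\Psi(r(t))) + a_3(\Psi(r(t)))\,dz(\Psi(r(t))) \Big) \\
&= \int_a^b \left( \sum_{i=1}^{3} a_i(\Psi(r(t))) \frac{\partial \Psi_i}{\partial \alpha}\,r_1'(t) + \sum_{i=1}^{3} a_i(\Psi(r(t))) \frac{\partial \Psi_i}{\partial \beta}\,r_2'(t) \right) dt \\
&= \int_C \Psi^*\omega
\end{align*}
where the last equality follows from Example 9.5.1.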
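Proposition 9.5.2 can be tested numerically by computing both sides independently. The sketch below is an illustration under our own assumptions: the mapping Ψ(α, β) = (α^2, αβ, β), the 1-form ω = y dx + z dy + x dz and the curve r(t) = (t, t^2), 0 ≤ t ≤ 1, are sample choices, the function names are hypothetical, derivatives are approximated by central differences and the integrals by a midpoint rule.

```python
def psi(al, be):
    """Sample mapping from R^2 to R^3 (our own choice)."""
    return (al * al, al * be, be)

def a(x, y, z):
    """Coefficients of the sample 1-form omega = y dx + z dy + x dz."""
    return (y, z, x)

def r(t):
    """Sample curve C in R^2, parametrized over [0, 1]."""
    return (t, t * t)

def deriv(fn, t, h=1e-6):
    """Central-difference derivative of a vector-valued function of one variable."""
    return [(p - q) / (2 * h) for p, q in zip(fn(t + h), fn(t - h))]

def integral_over_image(n=2000):
    """Integral of omega over Psi(C), using the parametrization t -> Psi(r(t))."""
    curve = lambda t: psi(*r(t))
    dt = 1.0 / n
    total = 0.0
    for i in range(n):
        t = (i + 0.5) * dt
        total += sum(c * d for c, d in zip(a(*curve(t)), deriv(curve, t))) * dt
    return total

def integral_of_pullback(n=2000):
    """Integral of Psi^* omega over C, with the coefficients P, Q of Example 9.5.1."""
    dt = 1.0 / n
    total = 0.0
    for i in range(n):
        t = (i + 0.5) * dt
        al, be = r(t)
        coeff = a(*psi(al, be))
        d_al = deriv(lambda s: psi(s, be), al)   # dPsi/dalpha
        d_be = deriv(lambda s: psi(al, s), be)   # dPsi/dbeta
        P = sum(c * d for c, d in zip(coeff, d_al))
        Q = sum(c * d for c, d in zip(coeff, d_be))
        r1, r2 = deriv(r, t)
        total += (P * r1 + Q * r2) * dt
    return total

lhs = integral_of_pullback()
rhs = integral_over_image()
print(lhs, rhs)  # both close to 1.5 for these sample choices
```

For these choices both integrands reduce to 5t^4 + 2t^3, whose integral over [0, 1] is 3/2, so the agreement of the two sums reflects the equality stated in the proposition.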
We conclude with some important properties of the interaction between wedge products, exterior derivatives and pullbacks. These are used in the proof of Stokes’ theorem which is presented in Chapter 10.
Proposition 9.5.3. Let φ_1, . . . , φ_ℓ be 1-forms in R^m, ω and ϕ be two forms in R^m and Ψ : R^n → R^m be a differentiable mapping where Ψ(x_1, . . . , x_n) = (y_1, . . . , y_m). Then,
(1) Ψ^*(φ_1 ∧ . . . ∧ φ_ℓ) = Ψ^*φ_1 ∧ . . . ∧ Ψ^*φ_ℓ,
(2) Ψ^*(ω ∧ ϕ) = Ψ^*ω ∧ Ψ^*ϕ,
(3) d(Ψ^*ω) = Ψ^*(dω).
Proof. The proof of those results follows the ones found in [1]; we reproduce the one for (1) and part of (3) here for completeness.
(1) This is a straightforward application of the general definition of pullback and the definition of wedge product of 1-forms using the determinant. We have
\begin{align*}
\Psi^*(\varphi_1 \wedge \ldots \wedge \varphi_\ell)(p)(v_1, \ldots, v_\ell) &= (\varphi_1 \wedge \ldots \wedge \varphi_\ell)(\Psi(p))(D\Psi(v_1), \ldots, D\Psi(v_\ell)) \\
&= \det\big(\varphi_i(D\Psi(v_j))\big) \\
&= \det\big(\Psi^*\varphi_i(v_j)\big) \\
&= (\Psi^*\varphi_1 \wedge \cdots \wedge \Psi^*\varphi_\ell)(v_1, \ldots, v_\ell).
\end{align*}
(2) One can use part (1) to construct the proof of this case. Working out some simple examples may be helpful.
(3) We begin by proving the result for differentiable 0-forms f : R^m → R. We have
\begin{align*}
\Psi^*(df) &= \Psi^*\left( \sum_{i=1}^{m} \frac{\partial f}{\partial y_i}\,dy_i \right) \\
&= \sum_{i=1}^{m} \left( \frac{\partial f}{\partial y_i} \circ \Psi \right) dy_i(D\Psi) \\
&= \sum_{i=1}^{m} \left( \frac{\partial f}{\partial y_i} \circ \Psi \right) \sum_{j=1}^{n} \frac{\partial \Psi_i}{\partial x_j}\,dx_j
\end{align*}
where this last equality is obtained because dy_i computes the differential of the ith component of DΨ as illustrated in Example 9.5.1. This last term is then equal to
\begin{align*}
\sum_{i=1}^{m} \sum_{j=1}^{n} \left( \frac{\partial f}{\partial y_i} \circ \Psi \right) \frac{\partial \Psi_i}{\partial x_j}\,dx_j &= \sum_{j=1}^{n} \frac{\partial (f \circ \Psi)}{\partial x_j}\,dx_j \\
&= d(f \circ \Psi) = d(\Psi^* f).
\end{align*}
The remainder of the proof uses part (2) and the first part of this proof, but we do not continue it here explicitly.
Exercises
(1) Show that Cases 1,2 and 3 in Section 9.1 can be obtained using the general pullback formula (9.5). (2) Let ω be a 2-form in R4 and Ψ : R3 → R4 be a differentiable mapping. Compute explicitly the 2-form in R3 given by Ψ∗ ω . (3) Complete the proof of Proposition 9.5.3 by consulting [1].
10 Stokes’ Theorem and Applications
We are now at the point of presenting the main theorem of Vector Calculus: Stokes’ Theorem. In the traditional setting of Vector Calculus where one studies vector fields and differential operators without discussing differentiable k-forms and exterior derivatives, Stokes’ theorem comes in three flavours. There is Green’s theorem for vector fields in the plane which is described in Chapter 7, Stokes’ theorem for vector fields on surfaces and the Divergence Theorem (also known as Gauss’ Theorem) for vector fields in R^3. However, the formulae involved are quite different in each case. We present those three theorems using the differential form theory and the formulae have exactly the same structure, depending only on the type of k-form and the dimension of the geometric objects. Stating Stokes’ theorem in complete generality for arbitrary k-forms and in any dimension requires one to develop the theory of differentiable k-forms and the formulae of exterior derivatives at a much deeper level and is beyond the scope of this book. We recommend the reference [1] to the interested reader.
We begin this chapter with a basic discussion of orientation of curves on surfaces as a generalization of the orientation of curves around two-dimensional domains seen for Green’s theorem in Chapter 7. We state Stokes’ theorem in the three versions stated above and present some examples and applications. We then state Stokes’ theorem in the classical way using the differential operators and present some examples.
10.1 More on orientation of curves and surfaces

Before we can state Stokes’ theorem in the three versions, we need to define the orientation of curves on surfaces and the orientation of surfaces forming the boundary of bounded regions in R3.
10.1.1 Orientation of curves on surfaces
Recall the convention for orienting boundary curves in Chapter 7 in the context of Green’s theorem. Let $D \subset \mathbb{R}^2$ be a subset whose boundary $\partial D$ consists of a single simple closed curve $C$. We say that $C$ is oriented positively if it is travelled in the counterclockwise direction. The situation is more subtle if the domain $D$ has holes in its interior, bounded by simple closed curves $C_1, \dots, C_k$: the curves $C_1, \dots, C_k$ have positive orientation if they are travelled in the clockwise direction.

Fig. 10.1. Curve C made up of three distinct curves, bounding a domain in the plane where the orientation of the inner curves is opposite the orientation of the outer curve.

One way to understand this convention is to consider the $xy$-plane in $\mathbb{R}^3$ and take the unit normal vector $n$ pointing in the positive $z$ direction. Then, as the normal vector is moved along $C$, the domain $D$ remains on the left. In the case of a domain with holes, the orientation is chosen so that the domain $D$ is again always on the left of the normal vector $n$ as it is moved along any of the curves $C_1, \dots, C_k$. Figure 10.2 gives a schematic of this process.

Fig. 10.2. The domain D lies to the left of the normal vector as it is moved along the curves C1 and C2.
Let $S$ be an orientable surface with boundary $C = \partial S$ (and no holes). We induce an orientation on $C$ using the orientation of $S$. Because $S$ is orientable, we can choose a consistent orientation for all tangent spaces $T_pS$. Even though $T_pS$ may not be defined at points $p \in C$, we impose the same orientation as $S$ on the boundary curve $C$ by a straightforward limiting process; that is, we take the normal vector in the direction consistent with the normal vectors on $S \setminus \partial S$. The orientation on $C$ is then defined so that, as the normal vector $n$ is moved along $C$, the surface $S$ remains on the left. If $S$ has holes, bounded by simple closed curves $C_1, \dots, C_k$, then the orientation induced from $S$ on each $C_i$ ($i = 1, \dots, k$) is obtained as above, by moving the induced normal vector $n$ along $C_i$ so that the surface $S$ remains on the left. Note that the induced orientation of the boundary curves $C_1, \dots, C_k$ of the holes in $S$ is opposite to the orientation of the (outside) boundary curve $C$. See Figure 10.3.

Fig. 10.3. Surface with outside boundary curve positively oriented in the counterclockwise direction. The boundary curve inside is oriented positively in the clockwise direction.

Example 10.1.1. Consider the surface $S$ given by the sphere with $z \geq 0$. Then, the boundary curve is given by the circle $x^2 + y^2 = 1$ at $z = 0$. The orientation on $S$ is obtained using the parametrization $\Phi(\phi, \theta) = (\sin\phi\cos\theta, \sin\phi\sin\theta, \cos\phi)$ with $0 \leq \phi \leq \pi/2$ and $0 \leq \theta \leq 2\pi$. At a point $p \in S$,
\[
n(p) = (\cos\theta\sin^2\phi,\ \sin\theta\sin^2\phi,\ \sin\phi\cos\phi).
\]
At the boundary, the normal vector is $n(\Phi(\pi/2, \theta)) = (\cos\theta, \sin\theta, 0)$, which points away from $(0, 0, 0)$. This means the positive orientation along $\partial S$ is in the counterclockwise direction. See Figure 10.4.

Fig. 10.4. Surface S of Example 10.1.1 with positively oriented boundary curve.
10.1.2 Orientation of boundary surfaces
Let E ⊂ R3 be an open subset. E is said to be bounded if there exists a ball B of radius R centered at the origin such that E ⊂ B . Moreover, we say that E is connected if for any two points p0 , p1 ∈ E , there exists a continuous curve C ⊂ E joining p0 to p1 (a more general definition of connectedness exists, but we do not need it).
Fig. 10.5. Three-dimensional region E bounded by the closed surface ∂E with positive orientation from the outward pointing normal vectors.
Consider an open, bounded and connected set $E$ whose boundary $S$ is an orientable piecewise smooth surface, made up of a collection of smooth (orientable) surfaces $S_1, \dots, S_k$. We denote $S = \partial E$. We also refer to $E$ as a solid region. The boundary of $E$ is known as a closed surface, as it separates $\mathbb{R}^3$ into two disjoint regions: $E$, which lies inside $\partial E$, and $\mathbb{R}^3 \setminus E$ outside of $E$. Then, we say that $\partial E$ is positively oriented if the normal vector $n(p)$ at every point $p \in S$ points towards $\mathbb{R}^3 \setminus E$. We determine the positive orientation of $\partial E$ for some often encountered parametrizations of surfaces.

Example 10.1.2. Consider a sphere $S$ of radius $R$ with parametrization
\[
\Phi(\phi, \theta) = (R\cos\theta\sin\phi,\ R\sin\theta\sin\phi,\ R\cos\phi).
\]
$S$ is the boundary of the region $E$ which is a ball of radius $R$,
\[
E = \{(x, y, z) \mid x^2 + y^2 + z^2 < R^2\}.
\]
Then, the normal vector is given by
\[
n(p) = \left(\det\frac{\partial(y, z)}{\partial(\phi, \theta)},\ \det\frac{\partial(z, x)}{\partial(\phi, \theta)},\ \det\frac{\partial(x, y)}{\partial(\phi, \theta)}\right)
= (R^2\cos\theta\sin^2\phi,\ R^2\sin\theta\sin^2\phi,\ R^2\sin\phi\cos\phi)
\]
as computed in Example 9.4.6, and so the parametrization $\Phi$ automatically yields the positive orientation on the sphere. This can be verified by setting $\phi = \pi/2$, for which
\[
n(p) = (R^2\cos\theta,\ R^2\sin\theta,\ 0)
\]
points towards $\mathbb{R}^3 \setminus E$.
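The orientation check of this example can also be carried out numerically. The following Python sketch approximates $n = \Phi_\phi \times \Phi_\theta$, whose components are the three determinants above, by central differences and tests whether it points away from the centre of the ball. The helper names `normal`, `points_outward` and `sphere`, the radius `R = 2.0` and the step `h` are illustrative choices, not part of the text, and the test "outward means $p \cdot n > 0$" assumes a region that is star-shaped with respect to the origin.

```python
import math

def normal(param, u, v, h=1e-6):
    """Approximate n = Phi_u x Phi_v using central differences."""
    pu = [(a - b) / (2 * h) for a, b in zip(param(u + h, v), param(u - h, v))]
    pv = [(a - b) / (2 * h) for a, b in zip(param(u, v + h), param(u, v - h))]
    return (pu[1] * pv[2] - pu[2] * pv[1],   # det d(y,z)/d(u,v)
            pu[2] * pv[0] - pu[0] * pv[2],   # det d(z,x)/d(u,v)
            pu[0] * pv[1] - pu[1] * pv[0])   # det d(x,y)/d(u,v)

def points_outward(param, u, v):
    """True if n(u, v) points away from the origin, i.e. p . n > 0."""
    p = param(u, v)
    n = normal(param, u, v)
    return sum(pi * ni for pi, ni in zip(p, n)) > 0

R = 2.0  # illustrative radius

def sphere(phi, theta):
    """Parametrization of the sphere of radius R used in Example 10.1.2."""
    return (R * math.cos(theta) * math.sin(phi),
            R * math.sin(theta) * math.sin(phi),
            R * math.cos(phi))

print(points_outward(sphere, math.pi / 2, 0.3))
print(normal(sphere, math.pi / 2, 0.0))
```

At $\phi = \pi/2$, $\theta = 0$ the computed normal should be close to $(R^2, 0, 0)$, which points out of the ball. Swapping the order of the parameters in `sphere` would flip the sign of the normal and give the negative orientation.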
Example 10.1.3. Consider the region $E$ bounded by the cylinder given by the ellipse
\[
\frac{x^2}{a^2} + \frac{y^2}{b^2} = 1
\]
with $a, b > 0$, and the planes $z = k$ and $z = h$ for $k < h$.

Fig. 10.6. Cylinder bounded at z = k and z = h with k < h.

A parametrization of the cylinder is given in cylindrical coordinates by
\[
\Phi(\theta, z) = (a\cos\theta,\ b\sin\theta,\ z).
\]
Thus,
\[
n(p) = \left(\det\frac{\partial(y, z)}{\partial(\theta, z)},\ \det\frac{\partial(z, x)}{\partial(\theta, z)},\ \det\frac{\partial(x, y)}{\partial(\theta, z)}\right) = (b\cos\theta,\ a\sin\theta,\ 0).
\]
Evaluating at $\theta = 0$, we obtain the vector $n = (b, 0, 0)$, which points towards $\mathbb{R}^3 \setminus E$ and so has positive orientation. Parametrizations of the planes are given by $\Phi_A(x, y) = (x, y, A)$ where $A = k$ and $A = h$. Then, $n(p) = (0, 0, 1)$, which in the case of the plane $z = h$ points to the outside of $E$. However, for $z = k$, we must take $n(p) = (0, 0, -1)$ in order to have positive orientation.
Fig. 10.7. Normal vectors n computed using the formula from the parametrization in Example 10.1.4. To obtain a positive orientation, the normal vectors of the cone must be chosen in the opposite direction.
Example 10.1.4. Let $E$ be the region bounded by the cone $z^2 = x^2 + y^2$ and the elliptic paraboloid $z = 4 - x^2 - y^2$ with $z > 0$. Parametrizations are given by $\Phi_1(r, \theta) = (r\cos\theta, r\sin\theta, r^2)$ and $\Phi_2(r, \theta) = (r\cos\theta, r\sin\theta, 4 - r^2)$ respectively. For the cone we have
\[
n(p) = \left(\det\frac{\partial(y, z)}{\partial(r, \theta)},\ \det\frac{\partial(z, x)}{\partial(r, \theta)},\ \det\frac{\partial(x, y)}{\partial(r, \theta)}\right) = (-2r^2\cos\theta,\ -2r^2\sin\theta,\ r).
\]
Note that the normal vector cannot be computed at the vertex of the cone. Setting $\theta = 0$, the vector $n = (-2r^2, 0, r)$ points to the inside of the region $E$. Therefore, the positive orientation is given by $(2r^2\cos\theta, 2r^2\sin\theta, -r)$. For the elliptic paraboloid we have
\[
n(p) = \left(\det\frac{\partial(y, z)}{\partial(r, \theta)},\ \det\frac{\partial(z, x)}{\partial(r, \theta)},\ \det\frac{\partial(x, y)}{\partial(r, \theta)}\right) = (2r^2\cos\theta,\ 2r^2\sin\theta,\ r),
\]
and we can check that at $\theta = 0$ we obtain $n(r, 0, 4 - r^2) = (2r^2, 0, r)$; this vector points towards $\mathbb{R}^3 \setminus E$ and so has positive orientation. Figure 10.7 shows a few normal vectors.
Exercises
(1) Consider the torus with parametrization $\Phi(u, v) = ((2 + \cos v)\cos u, (2 + \cos v)\sin u, \sin v)$ with $0 \leq u \leq 2\pi$ and $0 \leq v \leq 2\pi$. Let $S$ be the portion of the torus with $y \geq 0$. Identify the boundary of $S$ and find a parametrization so that $\partial S$ has positive orientation.
(2) Consider the sphere of radius 1 and the plane $P$ given by the equation $2z - y - 1 = 0$. Let the surface $S$ be the portion of the sphere lying above the plane $P$. Determine the boundary curve of $S$ and find a parametrization such that $\partial S$ has positive orientation.
(3) Consider the plane $P$ of equation $z = 2 - x - y$ lying in the first octant and consider the circular cylinder of radius $1/4$ centered at $(x, y) = (1, 1/2)$. The surface $S$ is the portion of the plane $P$ lying outside the cylinder. Identify the boundary curve $\partial S$ and find a parametrization so that $\partial S$ has positive orientation.
(4) Let $E$ be the region in the first octant $x, y, z \geq 0$ where $z \leq 2$ and $x^2 + y^2 \leq 4$. Find the normal vectors that give $\partial E$ its positive orientation.
(5) Let $E$ be the region bounded by the planes $y = 0$, $y = 3 - x$ and the cylinder $x^2 + z^2 = 1$. Find the normal vectors that give $\partial E$ its positive orientation.
(6) Let $E$ be the region bounded by the cylinder $x^2 + y^2 = a^2$, the sphere $x^2 + y^2 + z^2 = 2a^2$, and not containing the $z$-axis. Find the normal vectors that give $\partial E$ its positive orientation.
10.2 Stokes’ Theorem

We now come to the statement of Stokes’ theorem in the three versions. First, we rewrite Green’s theorem about the integration of 1-forms along curves in $\mathbb{R}^2$ forming the boundary of some region $D$. Next comes what is classically known as Stokes’ theorem, for the integration of 1-forms along curves forming the boundary of a surface $S$. Finally, we state the Divergence theorem (or Gauss’ theorem) about the integration of 2-forms on closed surfaces.

Theorem 10.2.1 (Stokes’ Theorem).
(1) Green’s theorem (flat surfaces): Let $D \subset \mathbb{R}^2$ and let $\partial D$ be the boundary of $D$ oriented positively. If $\omega$ is a 1-form with $C^1$ coefficients, then
\[
\oint_{\partial D} \omega = \iint_D d\omega.
\]
(2) Classical Stokes’ theorem: Let $S \subset \mathbb{R}^3$ be a positively oriented surface with boundary curve $\partial S$ having the induced orientation. If $\omega$ is a 1-form with $C^1$ coefficients in $\mathbb{R}^3$, then
\[
\oint_{\partial S} \omega = \iint_S d\omega.
\]
(3) Divergence theorem (Gauss’ theorem): Let $E \subset \mathbb{R}^3$ be a solid region with boundary surface $\partial E$ oriented positively. If $\omega$ is a 2-form with $C^1$ coefficients in $\mathbb{R}^3$, then
\[
\oiint_{\partial E} \omega = \iiint_E d\omega.
\]
The proofs of these theorems are postponed to Section 10.2.1. We first show the power of these results with some examples, beginning with (the Classical) Stokes’ theorem.

Example 10.2.2. Let
\[
\omega = (y + \sin x)\,dx + (z^2 + \cos y)\,dy + x^3\,dz
\]
and let $C$ be the curve given by $r(t) = (\sin t, \cos t, \sin 2t)$ for $0 \leq t \leq 2\pi$. Let us evaluate
\[
\oint_C \omega.
\]
Note that if one attempts to compute this integral directly by using the pullback via $r$, the resulting integral cannot be easily computed. We now show that Stokes’ theorem makes the calculation tractable. Note that $C$ lies on the surface $\tilde{S}$ given by $z = 2xy$. Therefore, let $S$ be the portion of $\tilde{S}$ enclosed by $C$. Then, Stokes’ theorem states that
\[
\oint_C \omega = \iint_S d\omega.
\]
We have
\[
\begin{aligned}
d\omega &= d(y + \sin x) \wedge dx + d(z^2 + \cos y) \wedge dy + d(x^3) \wedge dz \\
&= dy \wedge dx + 2z\,dz \wedge dy + 3x^2\,dx \wedge dz \\
&= -2z\,dy \wedge dz - 3x^2\,dz \wedge dx - dx \wedge dy.
\end{aligned}
\]
A parametrization of $S$ is given in polar coordinates by
\[
\Phi(r, \theta) = (r\cos\theta,\ r\sin\theta,\ 2r^2\cos\theta\sin\theta)
\]
with domain $D = \{(r, \theta) \mid 0 \leq r < 1,\ 0 \leq \theta < 2\pi\}$.
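Noticing that $2\cos\theta\sin\theta = \sin 2\theta$, so that $z = r^2\sin 2\theta$ on $S$, we compute
\[
\det\frac{\partial(y, z)}{\partial(r, \theta)} = \det\begin{pmatrix} \sin\theta & r\cos\theta \\ 2r\sin 2\theta & 2r^2\cos 2\theta \end{pmatrix}
= 2r^2(\sin\theta\cos 2\theta - \cos\theta\sin 2\theta) = -2r^2\sin\theta,
\]
\[
\det\frac{\partial(z, x)}{\partial(r, \theta)} = \det\begin{pmatrix} 2r\sin 2\theta & 2r^2\cos 2\theta \\ \cos\theta & -r\sin\theta \end{pmatrix}
= -2r^2(\sin 2\theta\sin\theta + \cos 2\theta\cos\theta) = -2r^2\cos\theta,
\]
where the last equalities in the above two determinants follow by applying the difference of angles formulae $\sin\theta = \sin(2\theta - \theta)$ and $\cos\theta = \cos(2\theta - \theta)$. We know already that
\[
\det\frac{\partial(x, y)}{\partial(r, \theta)} = r.
\]
Therefore, the pullback of $d\omega$ by $\Phi$ is
\[
\begin{aligned}
\Phi^*(d\omega) &= \left(-2r^2\sin 2\theta\,\det\frac{\partial(y, z)}{\partial(r, \theta)} - 3r^2\cos^2\theta\,\det\frac{\partial(z, x)}{\partial(r, \theta)} - \det\frac{\partial(x, y)}{\partial(r, \theta)}\right) dr \wedge d\theta \\
&= \left(4r^4\sin\theta\sin 2\theta + 6r^4\cos^3\theta - r\right) dr \wedge d\theta \\
&=: G(r, \theta)\,dr \wedge d\theta.
\end{aligned}
\]
This leads to
\[
\oint_C \omega = \int_0^1 \int_0^{2\pi} G(r, \theta)\,d\theta\,dr,
\]
where $C$ carries the orientation induced by $\Phi$, that is, counterclockwise when viewed from above. The parametrization $r(t)$ traverses $C$ in the opposite direction, so the line integral computed along $r(t)$ is the negative of this double integral. The double integral is straightforward to compute and we leave the details to the interested reader.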
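As a sanity check on this example, both sides of Stokes’ theorem can be approximated numerically. The Python sketch below uses a plain midpoint rule; the function names and grid sizes are illustrative choices. The Jacobian determinants in `surface_integral` are computed by hand from $\Phi(r, \theta) = (r\cos\theta, r\sin\theta, r^2\sin 2\theta)$, namely $-2r^2\sin\theta$, $-2r^2\cos\theta$ and $r$.

```python
import math

def line_integral(n=2000):
    """Integral of omega along r(t) = (sin t, cos t, sin 2t), midpoint rule."""
    dt = 2 * math.pi / n
    total = 0.0
    for k in range(n):
        t = (k + 0.5) * dt
        x, y, z = math.sin(t), math.cos(t), math.sin(2 * t)
        dx, dy, dz = math.cos(t), -math.sin(t), 2 * math.cos(2 * t)
        total += ((y + math.sin(x)) * dx
                  + (z * z + math.cos(y)) * dy
                  + x ** 3 * dz) * dt
    return total

def surface_integral(nr=200, nth=200):
    """Integral of Phi^*(d omega) over 0 <= r < 1, 0 <= theta < 2 pi."""
    dr, dth = 1.0 / nr, 2 * math.pi / nth
    total = 0.0
    for i in range(nr):
        r = (i + 0.5) * dr
        for j in range(nth):
            th = (j + 0.5) * dth
            x, z = r * math.cos(th), r * r * math.sin(2 * th)
            # 2x2 Jacobian determinants of Phi with respect to (r, theta)
            j_yz = -2 * r * r * math.sin(th)
            j_zx = -2 * r * r * math.cos(th)
            j_xy = r
            # d omega = -2z dy^dz - 3x^2 dz^dx - dx^dy
            total += (-2 * z * j_yz - 3 * x * x * j_zx - j_xy) * dr * dth
    return total

print(line_integral(), surface_integral())
```

The two values should be close to $\pi$ and $-\pi$ respectively. The opposite signs reflect orientation: $r(t) = (\sin t, \cos t, \sin 2t)$ runs clockwise when viewed from above, whereas $\Phi$ induces the upward normal on $S$ and hence the counterclockwise orientation on its boundary.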
Stokes’ theorem is also useful for establishing properties of differential operators. We show an example here.

Example 10.2.3. Let $\phi$ and $\psi$ be two $C^2$ functions from $\mathbb{R}^3 \to \mathbb{R}$ and consider the vector field
\[
F(x, y, z) = \phi\,\nabla\psi.
\]
We define the 1-form $\omega_F = F \cdot (dx, dy, dz)$ and let $S$ be a surface with a simple closed boundary curve $C = \partial S$. By Stokes’ theorem
\[
\oint_C \omega_F = \iint_S d\omega_F.
\]
It is straightforward to verify that
\[
d\omega_F = (\nabla\phi \times \nabla\psi) \cdot (dy \wedge dz,\ dz \wedge dx,\ dx \wedge dy).
\]
Similarly, the vector field
\[
G(x, y, z) = \psi\,\nabla\phi
\]
is used to define the 1-form $\omega_G = G \cdot (dx, dy, dz)$. Then by Stokes’ theorem
\[
\oint_C \omega_G = \iint_S d\omega_G
\]
and
\[
d\omega_G = (\nabla\psi \times \nabla\phi) \cdot (dy \wedge dz,\ dz \wedge dx,\ dx \wedge dy) = -d\omega_F.
\]
Therefore, we conclude that
\[
\oint_C \phi\,\nabla\psi \cdot dr = -\oint_C \psi\,\nabla\phi \cdot dr.
\]
Moreover, this calculation also tells us that
\[
\operatorname{div}(\nabla\psi \times \nabla\phi) = 0
\]
since, up to sign, this is the coefficient of $d^2\omega_F = 0$.
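The identity for $d\omega_F$ used in this example can be checked in a few lines by writing $\omega_F$ as $\phi\,d\psi$. The derivation below is a sketch that relies only on the product rule for $d$ and on $d(d\psi) = 0$, which holds because $\psi$ is $C^2$; subscripts denote partial derivatives.

```latex
\begin{aligned}
\omega_F &= \phi\,(\psi_x\,dx + \psi_y\,dy + \psi_z\,dz) = \phi\,d\psi, \\
d\omega_F &= d\phi \wedge d\psi + \phi\,d(d\psi) = d\phi \wedge d\psi \\
&= (\phi_y\psi_z - \phi_z\psi_y)\,dy \wedge dz
 + (\phi_z\psi_x - \phi_x\psi_z)\,dz \wedge dx
 + (\phi_x\psi_y - \phi_y\psi_x)\,dx \wedge dy \\
&= (\nabla\phi \times \nabla\psi) \cdot (dy \wedge dz,\ dz \wedge dx,\ dx \wedge dy).
\end{aligned}
```

Exchanging the roles of $\phi$ and $\psi$ swaps the two factors of the wedge product, which accounts for the sign in the relation between the two exterior derivatives in the example.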
In the next example, we use Stokes’ theorem to derive a meaning for the curl operator in the context of fluid flows.

Example 10.2.4. Let $v(x, y, z)$ be the vector field describing the velocity flow of some fluid defined in some open set $U \subset \mathbb{R}^3$. We define the circulation 1-form as
\[
\omega = v \cdot (dx, dy, dz).
\]
The circulation of $v$ around a simple closed curve $C$ with parametrization $r(t)$, $t \in [a, b]$, is defined as
\[
\oint_C \omega = \int_a^b v \cdot dr = \int_a^b (v \cdot T(t))\,ds,
\]
where we recall that $T(t) = r'(t)/\|r'(t)\|$ and $ds = \|r'(t)\|\,dt$. Let $p_0 \in U$ and let $S_a$ be a small disk of radius $a > 0$ in $U$ (not necessarily lying in a plane parallel to the $xy$-plane) with $p_0$ at its centre, and let $\partial S_a = C_a$. We apply Stokes’ theorem to this situation, where we recall that $d\omega = \operatorname{curl} v \cdot (dy \wedge dz, dz \wedge dx, dx \wedge dy)$ and we denote by $\hat{n}(p) = n(p)/\|n(p)\|$ the unit normal vector to $S_a$ at $p$. We obtain
\[
\oint_{C_a} \omega = \iint_{S_a} d\omega = \iint_{S_a} \operatorname{curl} v \cdot \hat{n}(p)\,dS.
\]
Now, for $a \approx 0$, we can say that $\operatorname{curl} v(p) \cdot \hat{n}(p) \approx \operatorname{curl} v(p_0) \cdot \hat{n}(p_0)$ for all $p \in S_a$. Thus, we have
\[
\oint_{C_a} \omega \approx \operatorname{curl} v(p_0) \cdot \hat{n}(p_0)\,\mathrm{Area}(S_a) = \operatorname{curl} v(p_0) \cdot \hat{n}(p_0)\,\pi a^2,
\]
which we rearrange as
\[
\operatorname{curl} v(p_0) \cdot \hat{n}(p_0) \approx \frac{1}{\pi a^2} \oint_{C_a} \omega. \tag{10.1}
\]
Finally, we can write
\[
\operatorname{curl} v(p_0) \cdot \hat{n}(p_0) = \lim_{a \to 0} \frac{1}{\pi a^2} \oint_{C_a} \omega.
\]
The interpretation of formula (10.1) is illustrated in Figure 10.8 and described as follows. At a point $p_0 \in \mathbb{R}^3$, choose a unit vector $\hat{n}(p_0)$ and consider the plane $P$ perpendicular to $\hat{n}(p_0)$. For a circle $C_a$ of (small) radius $a$ in $P$ centered at $p_0$, the curl operator of $v$ at $p_0$ defines a vector at $p_0$ whose orientation with respect to $\hat{n}$ measures the amount of rotation the vector field $v$ has near $p_0$ in the plane $P$. Notice that the magnitude of the circulation of the vector field $v$ along the circle $C_a$ is maximized if $\operatorname{curl} v(p_0)$ and $\hat{n}(p_0)$ are parallel. It is minimal if $\operatorname{curl} v(p_0)$ and $\hat{n}(p_0)$ are perpendicular, meaning that there is no rotation of $v$ near $p_0$ in the plane $P$.
Fig. 10.8. Normal vector n̂(p0) and vector field at points on the perpendicular plane P. Note that the vectors do not necessarily lie on the plane.
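Formula (10.1) can be tried out numerically. The Python sketch below computes the circulation of a field around a small circle and divides by the area of the disk; the function name `circulation_density`, the sample field $v = (-y, x, 0)$ (whose curl is $(0, 0, 2)$) and the point `p0` are illustrative assumptions, not taken from the text. The circle lies in the plane spanned by two orthonormal vectors `e1`, `e2`, so that $\hat{n} = e_1 \times e_2$.

```python
import math

def circulation_density(v, p0, e1, e2, a, n=2000):
    """(1 / (pi a^2)) times the circulation of v around the circle of
    radius a centred at p0 in the plane spanned by the orthonormal
    vectors e1, e2. The circle runs from e1 towards e2, which is the
    orientation induced by the normal e1 x e2."""
    dt = 2 * math.pi / n
    total = 0.0
    for k in range(n):
        t = (k + 0.5) * dt
        c, s = math.cos(t), math.sin(t)
        p = [p0[i] + a * (c * e1[i] + s * e2[i]) for i in range(3)]
        dp = [a * (-s * e1[i] + c * e2[i]) for i in range(3)]
        vp = v(*p)
        total += sum(vp[i] * dp[i] for i in range(3)) * dt
    return total / (math.pi * a * a)

def v(x, y, z):
    # illustrative rotating field with curl v = (0, 0, 2)
    return (-y, x, 0.0)

p0 = (0.5, -1.0, 2.0)
# plane parallel to the xy-plane: n_hat = (0, 0, 1), parallel to curl v
in_xy = circulation_density(v, p0, (1, 0, 0), (0, 1, 0), a=0.01)
# plane parallel to the yz-plane: n_hat = (1, 0, 0), perpendicular to curl v
in_yz = circulation_density(v, p0, (0, 1, 0), (0, 0, 1), a=0.01)
print(in_xy, in_yz)
```

For this field the first value should be close to 2 and the second close to 0, matching $\operatorname{curl} v \cdot \hat{n}$ in the two planes. Because the sample field is linear, the agreement does not depend on the radius; for a general field it only holds in the limit $a \to 0$.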
We now look at some applications of the Divergence theorem.
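Example 10.2.5. Let
\[
\omega = z^2x\,dy \wedge dz + \left(\tfrac{1}{3}y^3 + \tan z\right) dz \wedge dx + (x^2z + y^2)\,dx \wedge dy
\]
and let $S$ be the top half of the sphere $x^2 + y^2 + z^2 = 1$. We compute
\[
\iint_S \omega
\]
using the Divergence theorem. Because $S$ is not a closed surface, we need to define a region bounded by a closed surface using $S$. We do this by closing the region defined by $S$ using the disk $D$ of radius 1 centered at the origin of the $xy$-plane. Of course, other choices are possible, but this is one of the simplest. Then, $\tilde{S} := S \cup D$ is a closed surface, $E$ is the solid region with $\partial E = \tilde{S}$, and we choose the positive orientation for $\partial E$, so that the normal vectors point upward on $S$ and downward on $D$. Therefore,
\[
\iiint_E d\omega = \oiint_{\tilde{S}} \omega = \iint_S \omega + \iint_D \omega
\]
and
\[
\iint_S \omega = \iiint_E d\omega - \iint_D \omega.
\]
We can now proceed with the computation. We have
\[
d\omega = (x^2 + y^2 + z^2)\,dx \wedge dy \wedge dz.
\]
For the disk, we use the polar parametrization $\Phi : \tilde{D} \to D$,
\[
\Phi(r, \theta) = (r\cos\theta,\ r\sin\theta,\ 0)
\]
with domain $\tilde{D} = \{(r, \theta) \mid 0 \leq \theta < 2\pi,\ 0 \leq r < 1\}$. Then, $\Phi^*\omega = r^3\sin^2\theta\,dr \wedge d\theta$. Note that $\Phi$ induces the upward normal $(0, 0, 1)$ on $D$, which is opposite to the positive orientation of $\partial E$; hence
\[
\iint_D \omega = -\iint_{\tilde{D}} \Phi^*\omega.
\]
For the triple integral, we use spherical coordinates with domain $R = \{(\rho, \phi, \theta) \mid 0 \leq \rho < 1,\ 0 \leq \phi < \pi/2,\ 0 \leq \theta < 2\pi\}$. Putting it all together we have
\[
\begin{aligned}
\iint_S \omega &= \int_0^1 \int_0^{\pi/2} \int_0^{2\pi} \rho^4\sin\phi\,d\theta\,d\phi\,d\rho + \int_0^1 \int_0^{2\pi} r^3\sin^2\theta\,d\theta\,dr \\
&= \frac{2\pi}{5} + \frac{\pi}{4} = \frac{13\pi}{20}.
\end{aligned}
\]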
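The value of this surface integral can be cross-checked by integrating the pullback of $\omega$ over the hemisphere directly. The Python sketch below does this with a midpoint rule, using the spherical parametrization $\Phi(\phi, \theta) = (\sin\phi\cos\theta, \sin\phi\sin\theta, \cos\phi)$ with $0 \leq \phi \leq \pi/2$, for which $\Phi_\phi \times \Phi_\theta = \sin\phi\,(x, y, z)$ points outward. The function name and the grid sizes are illustrative choices.

```python
import math

def hemisphere_flux(n_phi=200, n_theta=200):
    """Midpoint-rule integral of omega over the upper unit hemisphere,
    oriented by the outward (upward) normal."""
    dphi = (math.pi / 2) / n_phi
    dth = 2 * math.pi / n_theta
    total = 0.0
    for i in range(n_phi):
        phi = (i + 0.5) * dphi
        for j in range(n_theta):
            th = (j + 0.5) * dth
            x = math.sin(phi) * math.cos(th)
            y = math.sin(phi) * math.sin(th)
            z = math.cos(phi)
            # coefficients of dy^dz, dz^dx and dx^dy in omega
            a = z * z * x
            b = y ** 3 / 3 + math.tan(z)
            c = x * x * z + y * y
            # Phi_phi x Phi_theta = sin(phi) * (x, y, z)
            total += (a * x + b * y + c * z) * math.sin(phi) * dphi * dth
    return total

print(hemisphere_flux(), 13 * math.pi / 20)
```

The two printed numbers should agree to several digits. The disk contributes $-\pi/4$ to the closed-surface integral when it carries the downward normal required by the positive orientation of $\partial E$, which is why $\pi/4$ is added to, rather than subtracted from, the volume integral $2\pi/5$.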
Flux through a surface
We begin by giving an interpretation of 2-forms as the rate of flow, also called flux, of a vector field through a surface. Consider a surface $S \subset \mathbb{R}^3$ and a vector field
\[
F(x, y, z) = (F_1(x, y, z), F_2(x, y, z), F_3(x, y, z))^T.
\]
The vector field can have several physical meanings as a flux per unit area: for instance, the flux per unit area of a fluid, an electric field, heat, the concentration of a chemical, etc. We can write $F = \rho v$ where $\rho$ is a density and $v$ a velocity field. Note that $\rho$ has units of kg/m³ and $v$ has units of m/s, therefore $\rho v$ has units of (kg/s)/m², which is indeed a flux per unit area.

Fig. 10.9. Unit normal vectors to S and vector field at points of S. Arrows with thick heads are the normal vectors to S.

Let $p_0 = (x_0, y_0, z_0)$ and let $\hat{n} = n/\|n\|$ be the unit normal vector to $S$. We approximate the flux of $F$ through $S$ near $p_0$ in the direction $\hat{n}$ using
\[
F(p_0) \cdot \hat{n}(p_0).
\]
Because the above is a flux per unit area, let $\alpha$ and $\beta$ be a pair of linearly independent vectors in $T_{p_0}S$ and $P(\alpha, \beta)$ the parallelogram they span, with area given by $\|n\|$ (here $n = \alpha \times \beta$). Then, the flux through $S$ near $p_0$ is approximated by
\[
F(p_0) \cdot \hat{n}(p_0)\,\mathrm{Area}(P(\alpha, \beta)),
\]
which we can rewrite using the surface area form as
\[
F(p_0) \cdot \hat{n}(p_0)\,dS(P(\alpha, \beta)).
\]
Recall that
\[
dS(P(\alpha, \beta)) = \sqrt{(dy \wedge dz)^2(\alpha, \beta) + (dz \wedge dx)^2(\alpha, \beta) + (dx \wedge dy)^2(\alpha, \beta)}.
\]
But, since $\|n\| = dS(P(\alpha, \beta))$, we have
\[
F(p_0) \cdot \hat{n}(p_0)\,dS(P(\alpha, \beta)) = F(p_0) \cdot \big((dy \wedge dz)(\alpha, \beta),\ (dz \wedge dx)(\alpha, \beta),\ (dx \wedge dy)(\alpha, \beta)\big),
\]
where we recognize the right-hand side of the above equation as a 2-form.
Fig. 10.10. Zoom in near a point p0 with the normal vector n̂(p0), the vector field F(p0) and the projection of the vector field on the normal vector.
We summarize as follows:
(a) Let $F$ be a vector field describing a flux per unit area (fluid, heat, etc.).
(b) Define a 2-form: $\omega_F = F \cdot (dy \wedge dz, dz \wedge dx, dx \wedge dy)$.
(c) Let $S$ be a surface with unit normal vector $\hat{n}$.
(d) Then, on $S$ we have: $F \cdot \hat{n}\,dS = \omega_F$.
(e) The flux through $S$ is defined by
\[
\iint_S F \cdot \hat{n}\,dS = \iint_S \omega_F.
\]
(f) If $S$ is parametrized by $\Phi : D \to S$, then
\[
\iint_S F \cdot \hat{n}\,dS = \iint_S \omega_F = \iint_D \Phi^*\omega_F.
\]
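Steps (a) to (f) translate almost line by line into a small numerical routine. The Python sketch below is one possible rendering under stated assumptions: the function name `flux`, the rectangular parameter domain, the midpoint rule and the finite-difference step `h` are all illustrative choices, and the orientation is whatever the parametrization induces through $\Phi_u \times \Phi_v$.

```python
import math

def flux(F, param, u_range, v_range, n=200, h=1e-6):
    """Midpoint-rule approximation of the integral of Phi^* omega_F over
    D = u_range x v_range, where Phi = param and F is the vector field."""
    (u0, u1), (v0, v1) = u_range, v_range
    du, dv = (u1 - u0) / n, (v1 - v0) / n
    total = 0.0
    for i in range(n):
        u = u0 + (i + 0.5) * du
        for j in range(n):
            v = v0 + (j + 0.5) * dv
            # partial derivatives of Phi by central differences
            pu = [(p - q) / (2 * h)
                  for p, q in zip(param(u + h, v), param(u - h, v))]
            pv = [(p - q) / (2 * h)
                  for p, q in zip(param(u, v + h), param(u, v - h))]
            # determinants d(y,z)/d(u,v), d(z,x)/d(u,v), d(x,y)/d(u,v)
            n_vec = (pu[1] * pv[2] - pu[2] * pv[1],
                     pu[2] * pv[0] - pu[0] * pv[2],
                     pu[0] * pv[1] - pu[1] * pv[0])
            f = F(*param(u, v))
            total += sum(fi * ni for fi, ni in zip(f, n_vec)) * du * dv
    return total

def unit_sphere(phi, theta):
    return (math.sin(phi) * math.cos(theta),
            math.sin(phi) * math.sin(theta),
            math.cos(phi))

def radial(x, y, z):
    # illustrative field F = (x, y, z)
    return (x, y, z)

total_flux = flux(radial, unit_sphere, (0.0, math.pi), (0.0, 2 * math.pi))
print(total_flux, 4 * math.pi)
```

For the radial field through the unit sphere, $F \cdot \hat{n} = 1$, so the flux equals the surface area $4\pi$, which the routine should reproduce to several digits. Exchanging the roles of the two parameters reverses the induced normal and changes the sign of the result.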
We now show an example of flux.

Example 10.2.6. A fluid has density $\rho = 870$ kg/m³ and flows with velocity $v = (z, y^2, x^2)$, where $x, y, z$ are measured in meters and the components of $v$ in m/s. We find the flux outward through the side of the cylinder $x^2 + y^2 = 4$ for $0 \leq z \leq 1$. A parametrization of the cylinder is given by
\[
\Phi(\theta, z) = (2\cos\theta,\ 2\sin\theta,\ z)
\]
where $D = \{(\theta, z) \mid 0 \leq \theta < 2\pi,\ 0 \leq z < 1\}$. Let $F = \rho v$ be the flux per unit area; then the flux 2-form is
\[
\begin{aligned}
\omega_F &= F \cdot (dy \wedge dz,\ dz \wedge dx,\ dx \wedge dy) \\
&= 870\,(z\,dy \wedge dz + y^2\,dz \wedge dx + x^2\,dx \wedge dy).
\end{aligned}
\]
We compute the pullback
\[
\begin{aligned}
\Phi^*\omega_F &= 870\left(z\,\det\frac{\partial(y, z)}{\partial(\theta, z)} + y^2\,\det\frac{\partial(z, x)}{\partial(\theta, z)} + x^2\,\det\frac{\partial(x, y)}{\partial(\theta, z)}\right) d\theta \wedge dz \\
&= 870\,(z(2\cos\theta) + 4\sin^2\theta\,(2\sin\theta))\,d\theta \wedge dz \\
&= 870\,(2z\cos\theta + 8\sin^3\theta)\,d\theta \wedge dz.
\end{aligned}
\]
Therefore,
\[
\iint_S F \cdot \hat{n}\,dS = \iint_S \omega_F = \int_0^1 \int_0^{2\pi} 870\,(2z\cos\theta + 8\sin^3\theta)\,d\theta\,dz = 0.
\]
Remark 10.2.7. We introduce the following notation, which is widely used, for the flux through a surface:
\[
\iint_S F \cdot dS := \iint_S F \cdot \hat{n}\,dS.
\]
That is, we define $dS = \hat{n}\,dS$.
The formulation of flux in terms of a 2-form leads directly to the formulation of Stokes’ theorem for vector fields.
10.2.1 Proof of Theorem 10.2.1: Stokes
We now prove Stokes’ theorem. Note that the proof of Green’s theorem found in Chapter 7 is obtained by assuming that the 1-form $\omega$ is $C^2$. This is an unnecessary assumption and we now show the proof for the case where $\omega$ is only $C^1$.

Proof of Green’s theorem. Any domain $D$ in $\mathbb{R}^2$ can be decomposed as a union of domains of Type I and Type II, as discussed in the proof of Green’s theorem in Chapter 7. We focus on the domains of Type I and let $D$ be such a domain; that is, $D = \{(x, y) \mid a \leq x \leq b,\ g_1(x) \leq y \leq g_2(x)\}$. Let $B = [0, 1] \times [0, 1]$. Then there exists a parametrization $\Phi : B \to D$ defined, using $x(u) = a(1 - u) + bu$, by
\[
\Phi(u, v) = \big(x(u),\ g_1(x(u))(1 - v) + g_2(x(u))\,v\big).
\]
By the change of variables formula and Proposition 9.5.3,
\[
\iint_D d\omega = \iint_B \Phi^*d\omega = \iint_B d(\Phi^*\omega).
\]
Moreover, because $\Phi(\partial B) = \partial D$, we have
\[
\oint_{\partial D} \omega = \oint_{\partial B} \Phi^*\omega
\]
by Proposition 9.5.2. But we know from Example 7.3.2 that for a rectangular region
\[
\oint_{\partial B} \Phi^*\omega = \iint_B d(\Phi^*\omega).
\]
Note that Example 7.3.2 only requires $\omega$ to be $C^1$, and so this completes the proof.

We now use Green’s theorem to provide a straightforward proof of the Classical Stokes’ theorem using the pullback and exterior derivative properties obtained at the end of Section 9.1.

Proof of the Classical Stokes’ theorem. Consider a surface $S$ with regular boundary $\partial S$. We assume $S$ can be split into the union of subsurfaces $S_1$, $S_2$ sharing a boundary given by $\tilde{C}$. Therefore $S_1$ and $S_2$ have boundaries given respectively by $C_1 \cup \tilde{C}$ and $C_2 \cup (-\tilde{C})$, and $\partial S = C_1 \cup C_2$. If Stokes’ theorem holds for each of the surfaces $S_1$ and $S_2$, we have
\[
\begin{aligned}
\iint_S d\omega &= \iint_{S_1} d\omega + \iint_{S_2} d\omega \\
&= \oint_{C_1 \cup \tilde{C}} \omega + \oint_{C_2 \cup (-\tilde{C})} \omega \\
&= \int_{C_1} \omega + \int_{\tilde{C}} \omega + \int_{C_2} \omega - \int_{\tilde{C}} \omega \\
&= \int_{C_1} \omega + \int_{C_2} \omega \\
&= \oint_{\partial S} \omega.
\end{aligned}
\]
This construction can be generalized to a finite number of surfaces $S_1, \dots, S_k$, with the common boundaries cancelling out in the same way as in the calculation just above and as in the case of Green’s theorem; see the proof in Chapter 7. Therefore, all we need is to prove the formula of Stokes’ theorem for a surface $S$ given by a parametrization $\Phi : D \subset \mathbb{R}^2 \to \mathbb{R}^3$. Note that $\partial S = \Phi(\partial D)$. For this, we use Green’s theorem and Proposition 9.5.3:
\[
\begin{aligned}
\iint_S d\omega &= \iint_D \Phi^*d\omega \\
&= \iint_D d(\Phi^*\omega) \\
&= \oint_{\partial D} \Phi^*\omega \\
&= \oint_{\Phi(\partial D)} \omega \\
&= \oint_{\partial S} \omega,
\end{aligned}
\]
where the fourth equality holds by Proposition 9.5.2.

In order to provide an argument for the proof of the Divergence theorem with a simple calculation, we begin by noticing that the regions of Type I, Type II and Type III for triple integrals can be given a simple parametrization using a cube as a domain, similar to what is done for domains in $\mathbb{R}^2$ in the proof of Green’s theorem in this chapter. We show the details in the case of a Type I region. Let $E = \{(x, y, z) \in \mathbb{R}^3 \mid (x, y) \in D,\ g_1(x, y) \leq z \leq g_2(x, y)\}$, where we assume that $D \subset \mathbb{R}^2$ is a domain of Type I; that is, $D = \{(x, y) \in \mathbb{R}^2 \mid a \leq x \leq b,\ h_1(x) \leq y \leq h_2(x)\}$. Let $B = [0, 1] \times [0, 1] \times [0, 1]$ and consider the parametrization $\Phi : B \to E$ defined by
\[
x(u) = a(1 - u) + bu, \qquad y(u, v) = h_1(x(u))(1 - v) + h_2(x(u))\,v
\]
and
\[
\Phi(u, v, w) = \big(x(u),\ y(u, v),\ g_1(x(u), y(u, v))(1 - w) + g_2(x(u), y(u, v))\,w\big). \tag{10.2}
\]
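Proof of the Divergence theorem. Any solid region $E$ can be decomposed into a union of regions of Type I, II and III, with each subregion sharing a boundary via a piecewise smooth surface. We return to the decomposition aspect at the end of the proof; for the first part of the proof we assume, without loss of generality, that $E$ is a region of Type I. We know there exists a parametrization $\Phi : B \to E$ as given by (10.2). We use $\Phi$ to pull back the calculation to $B$:
\[
\oiint_{\partial E} \omega = \oiint_{\partial B} \Phi^*\omega
\qquad\text{and}\qquad
\iiint_E d\omega = \iiint_B \Phi^*d\omega = \iiint_B d(\Phi^*\omega),
\]
with the last equality guaranteed by Proposition 9.5.3. Suppose that
\[
\omega = \tilde{a}(x, y, z)\,dy \wedge dz + \tilde{b}(x, y, z)\,dz \wedge dx + \tilde{c}(x, y, z)\,dx \wedge dy.
\]
A straightforward computation shows that
\[
\Phi^*\omega = a(u, v, w)\,dv \wedge dw + b(u, v, w)\,dw \wedge du + c(u, v, w)\,du \wedge dv
\]
where the exact form of the coefficients is not relevant for our purposes. Therefore,
\[
d(\Phi^*\omega) = \left(\frac{\partial a}{\partial u} + \frac{\partial b}{\partial v} + \frac{\partial c}{\partial w}\right) du \wedge dv \wedge dw.
\]
We now parametrize the six sides of $\partial B$ using the mappings $\Psi_{j\delta}(\alpha, \beta)$ with domain $D = [0, 1] \times [0, 1]$, where $j = u, v, w$ describes the direction of the normal vector to the side and $\delta = 0, 1$ the location of the side along the $j$-axis. For instance, $\Psi_{u1}(\alpha, \beta) = (1, \alpha, \beta)$ is the parametrization of the side of the cube located at $u = 1$. Then, it is a straightforward calculation to show that $\Psi_{j\delta}^*(dv \wedge dw) = 0$ for $j = v, w$, $\Psi_{j\delta}^*(dw \wedge du) = 0$ for $j = u, w$ and $\Psi_{j\delta}^*(du \wedge dv) = 0$ for $j = u, v$. Moreover,
\[
\Psi_{u\delta}^*(dv \wedge dw) = d\alpha \wedge d\beta, \qquad
\Psi_{v\delta}^*(dw \wedge du) = -d\alpha \wedge d\beta, \qquad
\Psi_{w\delta}^*(du \wedge dv) = d\alpha \wedge d\beta.
\]
From this, we can conclude that
\[
\begin{aligned}
\oiint_{\partial B} \Phi^*\omega
&= \iint_D a(\Psi_{u1}(\alpha, \beta))\,d\alpha \wedge d\beta - \iint_D a(\Psi_{u0}(\alpha, \beta))\,d\alpha \wedge d\beta \\
&\quad - \iint_D b(\Psi_{v1}(\alpha, \beta))\,d\alpha \wedge d\beta + \iint_D b(\Psi_{v0}(\alpha, \beta))\,d\alpha \wedge d\beta \\
&\quad + \iint_D c(\Psi_{w1}(\alpha, \beta))\,d\alpha \wedge d\beta - \iint_D c(\Psi_{w0}(\alpha, \beta))\,d\alpha \wedge d\beta \\
&= \iint_D (a(1, \alpha, \beta) - a(0, \alpha, \beta))\,d\alpha \wedge d\beta - \iint_D (b(\alpha, 1, \beta) - b(\alpha, 0, \beta))\,d\alpha \wedge d\beta \\
&\quad + \iint_D (c(\alpha, \beta, 1) - c(\alpha, \beta, 0))\,d\alpha \wedge d\beta \\
&= \iint_D \left(\int_0^1 \frac{\partial a(u, v, w)}{\partial u}\,du\right) dv \wedge dw - \iint_D \left(\int_0^1 \frac{\partial b(u, v, w)}{\partial v}\,dv\right) du \wedge dw \\
&\quad + \iint_D \left(\int_0^1 \frac{\partial c(u, v, w)}{\partial w}\,dw\right) du \wedge dv \\
&= \iiint_B \frac{\partial a(u, v, w)}{\partial u}\,du \wedge dv \wedge dw - \iiint_B \frac{\partial b(u, v, w)}{\partial v}\,dv \wedge du \wedge dw \\
&\quad + \iiint_B \frac{\partial c(u, v, w)}{\partial w}\,dw \wedge du \wedge dv \\
&= \iiint_B \left(\frac{\partial a(u, v, w)}{\partial u} + \frac{\partial b(u, v, w)}{\partial v} + \frac{\partial c(u, v, w)}{\partial w}\right) du \wedge dv \wedge dw \\
&= \iiint_B d(\Phi^*\omega).
\end{aligned}
\]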
Suppose now that $E = E_1 \cup \dots \cup E_k$ where the regions $E_j$ are of Type I, II or III and the boundary between $E_j$ and the other regions is given by a piecewise smooth surface $S_j$. That is, we can write $\partial E_j = (\partial E_j \setminus S_j) \cup S_j$. This means that
\[
\partial E = \bigcup_{j=1}^{k} (\partial E_j \setminus S_j).
\]
We must also write explicitly the piecewise smooth surface $S_j$ as
\[
S_j = \bigcup_{i=1}^{\ell_j} S_{ji}
\]
where the $S_{ji}$ are smooth surfaces. But notice that
\[
\sum_{j=1}^{k} \sum_{i=1}^{\ell_j} \iint_{S_{ji}} \omega = 0
\]
because for a fixed $j$, the computation of the integral for $S_{ji}$ has normal vector pointing towards the outside of $E_j$, while for neighbouring regions, the same boundary surfaces have normal vectors pointing in the opposite directions; thus all integrals cancel out. The argument is similar to the one used for Green’s theorem and the details are left to the reader. We can now compute
\[
\begin{aligned}
\iiint_E d\omega &= \sum_{j=1}^{k} \iiint_{E_j} d\omega \\
&= \sum_{j=1}^{k} \oiint_{\partial E_j} \omega \\
&= \sum_{j=1}^{k} \iint_{\partial E_j \setminus S_j} \omega + \sum_{j=1}^{k} \iint_{S_j} \omega \\
&= \sum_{j=1}^{k} \iint_{\partial E_j \setminus S_j} \omega + \sum_{j=1}^{k} \sum_{i=1}^{\ell_j} \iint_{S_{ji}} \omega \\
&= \sum_{j=1}^{k} \iint_{\partial E_j \setminus S_j} \omega \\
&= \oiint_{\partial E} \omega
\end{aligned}
\]
and this completes the proof of the theorem.
Exercises
(1) Evaluate the line integral of $\omega$ over the given simple closed curve $C$ oriented positively.
  (a) Let $\omega = (\cos x + y^3)\,dx + (4x - e^y)\,dy$ and $C$ be the curve bounding the half-disk $x^2 + y^2 \leq a^2$ with $x \geq 0$.
  (b) Let $\omega = (\cos(e^x) - y^3)\,dx + (y^5 + x^3)\,dy$ and $C$ be the boundary of the region $D$ between $y = x$, $y = -x$ and the circle of radius $a$, with $y \geq 0$.
(2) Let $u, v : \mathbb{R}^2 \to \mathbb{R}$ be differentiable and let $D \subset \mathbb{R}^2$ be a bounded, connected region. Let $N(t)$ be a vector normal to $\partial D$. Show that
\[
\int_D (u\Delta v - v\Delta u)\,dA = \int_{\partial D} \big(u(\nabla v \cdot N) - v(\nabla u \cdot N)\big)\,dt.
\]
(3) Show that $\omega_1 = x\,dy$, $\omega_2 = y\,dx$ are 1-forms which yield the area of a bounded region $D \subset \mathbb{R}^2$.
(4) Show that if $\omega$ is an exact 2-form, then $\oiint_S \omega = 0$ for any closed surface $S$.
(5) Use Classical Stokes’ or Divergence theorem to compute the following integrals.
  (a) $\oint_C \omega$ where $\omega = ye^x\,dx + (x + e^x)\,dy + z^2\,dz$ and $C$ is the curve with parametrization $r(t) = (1 + \cos t, 1 + \sin t, 1 - \sin t - \cos t)$ with $0 \leq t \leq 2\pi$. (Hint: the curve $C$ lies on a plane $z = A - Bx - Cy$ where you need to find $A$, $B$ and $C$.)
  (b) $\iint_S d\omega$ where $\omega = 3y\,dx - 2xz\,dy + (x^2 - y^2)\,dz$ and $S$ is the positively oriented hemisphere of equation $x^2 + y^2 + z^2 = a^2$ with $z \geq 0$.
  (c) $\iint_S \omega$ where $\omega = (x + y^2)\,dy \wedge dz + (3x^2y + y^3 - x^3)\,dz \wedge dx + (z + 1)\,dx \wedge dy$ and $S$ is the positively oriented cone with a vertex at $(0, 0, 2)$, axis given by the $z$-axis, and with base given by the disk of radius 2 in the $xy$-plane.
  (d) $\iiint_E d\omega$ where $\omega = (1 - (x^2 + y^2)^3)\,dy \wedge dz + 2(1 - (x^2 + y^2)^3)\,dz \wedge dx + x^2z^2\,dx \wedge dy$ and $E \subset \mathbb{R}^3$ is the region bounded by the cylinder of radius 1 along the $z$-axis and between the planes $z = 0$ and $z = 1$.
(6) Set up the explicit integral for $\iint_S \omega_F$ computing the rate of flow (i.e. flux) of the vector field $F$ across $S$. (Hint: you must find a parametrization $\Phi$ for $S$ along with its domain $D$. Determine the 2-form $\omega_F$ corresponding to $F$, compute the pullback and set up the bounds of the integrals.)
  (a) $F(x, y, z) = (xy, yz, zx)^T$ and $S$ is the part of the paraboloid $z = 4 - x^2 - y^2$ that lies above the square $0 \leq x \leq 1$, $0 \leq y \leq 1$.
  (b) $F(x, y, z) = (-y, z - y, x)^T$ and $S$ is the surface of the tetrahedron with vertices $(0, 0, 0)$, $(1, 0, 0)$, $(0, 1, 0)$ and $(0, 0, 1)$.
  (c) $F(x, y, z) = (x, -z, y)^T$ and $S$ is the part of the sphere $x^2 + y^2 + z^2 = 4$ in the first octant.
(7) Let $S_1$ be the elliptic paraboloid surface $z = x^2 + y^2$ and $S_2$ be the plane $z = 2 - 2x - 2y$. Show that the intersection curve $C$ of the paraboloid and the plane has parametric representation $r(t) = (-1 + 2\cos t, -1 + 2\sin t, 6 - 4\cos t - 4\sin t)$ with $t \in [0, 2\pi)$. Consider the 1-form
\[
\omega = (3xz - y^2)\,dx + (\exp(y^2) - 2xy)\,dy + \left(z^3 + \tfrac{3}{2}x^2\right) dz
\]
and evaluate $\oint_C \omega$.
10.3 Stokes’ Theorem for Vector Fields

We now write explicitly Stokes’ theorem using the vector field formalism. The formulae are straightforward from the correspondence seen in Section 8.3.1 between 1-forms and vector fields, and between the exterior derivative and the curl and divergence differential operators acting on vector fields.

Theorem 10.3.1 (Stokes’ theorem in Vector Calculus).
(1) Let $D \subset \mathbb{R}^2$, let $\partial D$ be positively oriented using $r : [a, b] \to \mathbb{R}^2$ and let $F(x, y) = (f_1(x, y), f_2(x, y))$ be a vector field on $D$. Then,
\[
\oint_{\partial D} F \cdot dr = \iint_D \left(\frac{\partial f_2}{\partial x} - \frac{\partial f_1}{\partial y}\right) dA.
\]
(2) Let $S \subset \mathbb{R}^3$ be a positively oriented surface whose boundary $\partial S$ has the induced orientation, using $r : [a, b] \to \mathbb{R}^3$. If $F : \mathbb{R}^3 \to \mathbb{R}^3$ is a differentiable vector field, then
\[
\oint_{\partial S} F \cdot dr = \iint_S \operatorname{curl} F \cdot \hat{n}\,dS.
\]
(3) Let $E \subset \mathbb{R}^3$ be a solid region with boundary surface $\partial E$ oriented positively. If $F : \mathbb{R}^3 \to \mathbb{R}^3$ is a differentiable vector field, then
\[
\oiint_{\partial E} F \cdot \hat{n}\,dS = \iiint_E \operatorname{div} F\,dV.
\]
We now use the vector field statement of the Divergence theorem to explore the physical meaning of the divergence operator.

Example 10.3.2. Let $F(x, y, z)$ be a differentiable vector field and $p_0 = (x_0, y_0, z_0)$. Let $B_\epsilon$ be a ball of radius $\epsilon > 0$ centered at $p_0$ with boundary sphere $S_\epsilon$. We suppose that $S_\epsilon$ is oriented positively and we use the Divergence theorem for the flux of $F$ through $S_\epsilon$:
\[
\oiint_{S_\epsilon} F \cdot \hat{n}\,dS = \iiint_{B_\epsilon} \operatorname{div} F\,dV.
\]
Now, notice that for $\epsilon \ll 1$, we can approximate $\operatorname{div}(F)(p) \approx \operatorname{div}(F)(p_0)$ for all $p \in B_\epsilon$. Therefore, we have the following computation
\[
\lim_{\epsilon \to 0} \frac{1}{dV(B_\epsilon)} \oiint_{S_\epsilon} F \cdot \hat{n}\,dS = \lim_{\epsilon \to 0} \frac{1}{dV(B_\epsilon)} \iiint_{B_\epsilon} \operatorname{div}(F)\,dV = \operatorname{div}(F)(p_0),
\]
where $dV(B_\epsilon)$ denotes the volume of $B_\epsilon$.

The divergence operator can thus be interpreted in the following way. If $\operatorname{div}(F)(p_0) > 0$, it means that for $S_\epsilon$ with $\epsilon \ll 1$ near $p_0$, the total flux of $F$ points towards $\mathbb{R}^3 \setminus B_\epsilon$, so there is a positive flux away from $p_0$ and we say that $p_0$ acts as a source point for the vector field. If $\operatorname{div}(F)(p_0) < 0$, then there is a total negative flux pointing towards the inside of $B_\epsilon$ and we say that $p_0$ acts as a sink point for the vector field.
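The limit in this example can be observed numerically by computing the flux through a small sphere and dividing by the volume of the ball. The Python sketch below assumes an illustrative field $F = (x^2, y^2, z^2)$, for which $\operatorname{div} F = 2(x + y + z)$, and an illustrative point $p_0 = (1, 2, 3)$; the function name `flux_density` and the grid size are likewise choices made here, not part of the text.

```python
import math

def flux_density(F, p0, eps, n=200):
    """Flux of F through the sphere of radius eps centred at p0
    (outward normal), divided by the volume of the enclosed ball."""
    dphi, dth = math.pi / n, 2 * math.pi / n
    total = 0.0
    for i in range(n):
        phi = (i + 0.5) * dphi
        for j in range(n):
            th = (j + 0.5) * dth
            # outward unit normal at the point p0 + eps * (nx, ny, nz)
            nx = math.sin(phi) * math.cos(th)
            ny = math.sin(phi) * math.sin(th)
            nz = math.cos(phi)
            f = F(p0[0] + eps * nx, p0[1] + eps * ny, p0[2] + eps * nz)
            # surface element: dS = eps^2 sin(phi) dphi dtheta
            total += ((f[0] * nx + f[1] * ny + f[2] * nz)
                      * eps ** 2 * math.sin(phi) * dphi * dth)
    return total / (4.0 / 3.0 * math.pi * eps ** 3)

def F(x, y, z):
    # illustrative field with div F = 2 (x + y + z)
    return (x * x, y * y, z * z)

estimate = flux_density(F, (1.0, 2.0, 3.0), eps=0.1)
print(estimate)
```

The printed value should be close to $\operatorname{div} F(p_0) = 12$, a positive number, so $p_0$ acts as a source for this field. Evaluating at a point with $x + y + z < 0$ would give a negative value, that is, a sink.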
Here are a few more examples with applications of Stokes’ and the Divergence theorem.

Example 10.3.3 (Stokes’ theorem: Balloon). Suppose that air diffuses out of the Mr. Winter hot air balloon. The bottom of the balloon is circular with radius 1. The hot air escapes the porous membrane with velocity $V(x, y, z) = \operatorname{curl} F$ with $F(x, y, z) = (-y, x, 0)$. We compute the outward flux of air through the surface if the density is the constant $\rho(x, y, z) = K$.

Fig. 10.11. Mr. Winter hot air balloon.

The vector field associated with the air flow is given by
\[
G(x, y, z) = \rho V(x, y, z) = K\operatorname{curl}(F).
\]
The crucial point here is that the boundary of the hot air balloon surface is a circle $C$ of radius 1. We assume that the boundary circle lies in the $xy$-plane and we parametrize $C$ by $r(t) = (\cos t, \sin t, 0)$ with $t \in [0, 2\pi]$. By Stokes’ theorem
\[
\begin{aligned}
\iint_S K\operatorname{curl}(F) \cdot \hat{n}\,dS &= K\oint_{\partial S} F \cdot dr \\
&= K\oint_C (-y, x, 0) \cdot (-\sin t, \cos t, 0)\,dt \\
&= K\int_0^{2\pi} (-\sin t, \cos t, 0) \cdot (-\sin t, \cos t, 0)\,dt \\
&= K\int_0^{2\pi} (\sin^2 t + \cos^2 t)\,dt \\
&= 2\pi K.
\end{aligned}
\]
The next example is important in the theory of electricity and magnetism.

Example 10.3.4 (Divergence Theorem: Electric Fields). Consider the electric field
\[
E(x) = \frac{Q}{\|x\|^3}\,x
\]
where the electric charge $Q$ is located at the origin and $x = (x,y,z)$ is a position vector. We show that the electric flux of $E$ through any closed surface $S$ that encloses the origin is (recall Remark 10.2.7 for the notation)
\[
\iint_S E\cdot dS = 4\pi Q.
\]
Notice that $E$ is not defined at $(0,0,0)$, and this should remind you of the 1-form in Example 7.3.5, which is not defined at $(0,0)$. The argument is similar to the one used for Example 7.3.5.
Fig. 10.12. Region $R$ bounded by the surfaces $S$ and $\tilde S$. The unit normal vectors are $\hat n_1$ and $\hat n_2$.
Let $R$ be the region enclosed by the surface $S$ and let $\tilde S$ be a sphere of radius $a$ centered at the origin and lying entirely in $R$. We define $\tilde R$ as the region bounded outside by $S$ and inside by $\tilde S$ and let $\partial\tilde R$ have positive orientation. That is, we denote by $\hat n_1$ the unit normal vector of $S$ and by $\hat n_2$ the unit normal vector of $\tilde S$, which points towards the origin. By the Divergence theorem
\[
\iint_{\partial\tilde R} E\cdot\hat n\,dS = \iiint_{\tilde R} \operatorname{div}(E)\,dV.
\]
But it is a straightforward computation to show that $\operatorname{div}(E) = 0$ on $\tilde R$. Therefore,
\[
0 = \iint_{\partial\tilde R} E\cdot dS = \iint_{S} E\cdot\hat n_1\,dS + \iint_{\tilde S} E\cdot\hat n_2\,dS.
\]
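The claim $\operatorname{div}(E) = 0$ away from the origin follows from $\partial_x\big(x/\|x\|^3\big) = 1/\|x\|^3 - 3x^2/\|x\|^5$ and the analogous formulas in $y$ and $z$, whose sum is $3/\|x\|^3 - 3\|x\|^2/\|x\|^5 = 0$. The Python sketch below checks this with central finite differences; the charge value `Q`, the sample point and the step size `h` are illustrative assumptions.

```python
import math

Q = 1.0  # hypothetical charge

def E(x, y, z):
    # Electric field E(x) = Q x / |x|^3, defined away from the origin.
    r3 = (x * x + y * y + z * z) ** 1.5
    return (Q * x / r3, Q * y / r3, Q * z / r3)

def div_E(x, y, z, h=1e-4):
    """Central finite-difference approximation of div(E) at (x, y, z)."""
    dEx = (E(x + h, y, z)[0] - E(x - h, y, z)[0]) / (2.0 * h)
    dEy = (E(x, y + h, z)[1] - E(x, y - h, z)[1]) / (2.0 * h)
    dEz = (E(x, y, z + h)[2] - E(x, y, z - h)[2]) / (2.0 * h)
    return dEx + dEy + dEz

print(div_E(0.7, -0.4, 1.1))
```

The printed value should be zero up to discretization and rounding error.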
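Using the unit normal vector $-\hat n_2$ instead, which points away from the origin, the last equality can be rewritten as
\[
\iint_{S} E\cdot\hat n_1\,dS = -\iint_{\tilde S} E\cdot\hat n_2\,dS = \iint_{\tilde S} E\cdot(-\hat n_2)\,dS.
\]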
We now compute
\[
\iint_{\tilde S} E\cdot(-\hat n_2)\,dS
\]
by parametrizing the sphere with
\[
\Phi(\phi,\theta) = (a\cos\theta\sin\phi,\ a\sin\theta\sin\phi,\ a\cos\phi),
\]
and on the sphere, we have $\|x\| = a$. We define the 2-form
\[
\omega = E\cdot(dy\wedge dz,\ dz\wedge dx,\ dx\wedge dy).
\]
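Thus,
\[
\begin{aligned}
\Phi^*\omega &= \frac{Q}{a^3}\,\Phi(\phi,\theta)\cdot(a^2\cos\theta\sin^2\phi,\ a^2\sin\theta\sin^2\phi,\ a^2\cos\phi\sin\phi)\,d\phi\wedge d\theta \\
&= Q(\cos^2\theta\sin^3\phi+\sin^2\theta\sin^3\phi+\cos^2\phi\sin\phi)\,d\phi\wedge d\theta \\
&= Q\sin\phi\,d\phi\wedge d\theta
\end{aligned}
\]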
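and we compute
\[
\begin{aligned}
\iint_{\tilde S} E\cdot(-\hat n_2)\,dS &= \iint_{\tilde S}\omega \\
&= \int_0^{\pi}\int_0^{2\pi} Q\sin\phi\,d\theta\,d\phi \\
&= 2\pi Q\int_0^{\pi}\sin\phi\,d\phi = 4\pi Q.
\end{aligned}
\]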
Therefore,
\[
\iint_S E\cdot dS = 4\pi Q
\]
for all closed surfaces $S$ enclosing a region containing the origin.
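The independence of the flux from the enclosing surface can be illustrated numerically. The Python sketch below approximates the outward flux of $E$ through spheres of two different radii and through a cube centered at the origin, and compares each value with $4\pi Q$. The charge value `Q`, the choice of a cube as second test surface, and the midpoint quadrature are illustrative assumptions, not part of the text.

```python
import math

Q = 2.0  # hypothetical charge

def E(x, y, z):
    # Electric field E(x) = Q x / |x|^3.
    r3 = (x * x + y * y + z * z) ** 1.5
    return (Q * x / r3, Q * y / r3, Q * z / r3)

def flux_sphere(a, n=200):
    """Outward flux of E through the sphere of radius a (midpoint rule)."""
    dphi = math.pi / n
    dtheta = 2.0 * math.pi / n
    total = 0.0
    for i in range(n):
        phi = (i + 0.5) * dphi
        for j in range(n):
            theta = (j + 0.5) * dtheta
            nx = math.cos(theta) * math.sin(phi)
            ny = math.sin(theta) * math.sin(phi)
            nz = math.cos(phi)
            ex, ey, ez = E(a * nx, a * ny, a * nz)
            total += (ex * nx + ey * ny + ez * nz) * a * a * math.sin(phi) * dphi * dtheta
    return total

def flux_cube(half, n=200):
    """Outward flux of E through the cube [-half, half]^3.
    By symmetry all six faces contribute equally, so only the face
    x = half (outward normal (1, 0, 0)) is integrated."""
    h = 2.0 * half / n
    face = 0.0
    for i in range(n):
        y = -half + (i + 0.5) * h
        for j in range(n):
            z = -half + (j + 0.5) * h
            face += E(half, y, z)[0] * h * h
    return 6.0 * face

expected = 4.0 * math.pi * Q
print(flux_sphere(0.5), flux_sphere(3.0), flux_cube(1.0), expected)
```

All four printed values should agree to several digits, whatever radius or cube size is chosen.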
Exercises
(1) Use Classical Stokes' or Divergence theorem to compute the following integrals. (Hint: transform the integrand to a 1, 2 or 3 form and use the methods of the previous section.)
  (a) $\iint_S \operatorname{curl} F\cdot\hat n\,dS$ where $F(x,y,z) = (xz - y^3\cos z,\ x^3 e^z,\ xyz\,e^{x^2+y^2+z^2})$ and $S$ is the surface with equation $x^2 + y^2 + 2(z-1)^2 = 6$ with $z \geq 0$, and $\hat n$ is the unit normal vector pointing away from the $z$-axis.
  (b) $\oint_C F\cdot dr$ where $F(x,y,z) = (xy, yz, zx)$ and $C$ is the triangular curve joining the vertices $(1,0,0)$, $(0,1,0)$ and $(0,0,1)$ oriented in the positive direction (counterclockwise).
  (c) Compute the flux of $F(x,y,z) = (y + xz,\ y + yz,\ -2x - z^2)$ in the direction of positive normal orientation across the sphere of radius $a$ in the first octant.
  (d) Let $E$ be the region defined by $x^2 + y^2 + z^2 \leq 2a$, $x^2 + y^2 \geq a^2$. Let $S = \partial E$ be the boundary of $E$, which is the union of a cylindrical surface $S_1$ and a spherical surface $S_2$. Compute the flux of $F(x,y,z) = (x + yz,\ y - xz,\ z - e^x\sin y)$ in the positive orientation, across (a) $S$, (b) $S_1$, (c) $S_2$.
(2) If $\omega$ is an exact 2-form, then $\oiint_S \omega = 0$ for any closed surface $S$. Write this statement in terms of the curl operator.
(3) Find the flux of the vector field $F(x,y,z) = (x + yz,\ y + xz,\ z + xy)^T$ through the boundary of the region $E$ delimited by the first octant $x, y, z \geq 0$ where $z \leq 2$, $x^2 + y^2 \leq 4$ and $\partial E$ is oriented positively.
(4) Find the flux of the vector field $F(x,y,z) = (x^2, y^2, z^2)^T$ through the surface of the sphere of radius $a$ oriented positively.
Bibliography
[1] M. P. do Carmo. Differential Forms and Applications. Springer-Verlag, 1994.
[2] R. Haberman. Applied Partial Differential Equations: With Fourier Series and Boundary Value Problems. Pearson Prentice Hall, 2004.
[3] Rachel A. Kuhn, Hermann Ansorge, Szymon Godynicki, and Wilfried Meyer. Hair density in the Eurasian otter Lutra lutra and the sea otter Enhydra lutris. Acta Theriologica, 55:211–222, 2010.
[4] T. D. Rudolph, P. Drapeau, M-H. St-Laurent, and L. Imbeau. Status of woodland caribou (Rangifer tarandus caribou) in the James Bay region of northern Québec. Technical report, Woodland Caribou Recovery Task Force Scientific Advisory Group - Northern Québec, Montréal, 2012.
[5] Biswa Sengupta, A. Aldo Faisal, Simon B. Laughlin, and Jeremy E. Niven. The effect of cell size and channel density on neuronal information encoding and energy efficiency. Journal of Cerebral Blood Flow & Metabolism, 33:1465–1473, 2013.
[6] G. Strang. Linear Algebra and Its Applications. Thomson, Brooks/Cole, 2006.
[7] S. H. Strogatz. Nonlinear Dynamics and Chaos. Studies in Nonlinearity. Sarat Book House, 2007.
Index 1-form 81 – circulation 283 – exact 112 – exterior derivative 242 – locally exact 185 – space 81 – work 82
arc-length parametrization 54 area form 190 bagels 173 Bendixson-Dulac theorem 214 best linear approximation – function 49 – mapping 131 – vector function 51 bilinear form 149 – symmetric 149 bounded set 276 canonical basis 6 centre of mass – 1d-domain 106 change of variables formula 248 Classical Stokes’ theorem 280 Conics 31 – Ellipse 31 – Hyperbola 31 – Parabola 32 connected set 276 coordinate system 13 – Cartesian 13 – curvilinear 14 – cylindrical 15 – polar 14 – spherical 16 critical point 156 – nondegenerate 162 cross product 11 curl operator 181 curvature 108 curve 27 – closed 112 – helix 29 – parametric representation 27
– simple 112 – smooth 45 cylinder 37 derivative – linearity 135 – mapping 131 differential 179 – R 70 – dθ 77 – dr 76 – dx 72 – dy 72 – function 70 – function of two variables 73 – wedge product 189 directional derivative 73 divergence operator 181 Divergence theorem 281 domains in R2 – type I 200 – type II 201 – type III 202 domains in R3 – type I 218 – type II 219 – type III 220 donuts 173 energy – kinetic 115 – potential 115 – principle of conservation 115 – total 115 exact – 1 -form 112 exterior derivative – for 1-form 242 – more general formula 235 flux 286 forms – closed 237 – exact 237 Fubini’s theorem 197 – R3 217
function 19 – graph 118 – level set 119 – local maximum 158 – local minimum 158 – regularity 137 – Taylor expansion 151 functions of several variables 19 gradient 179 Green’s formula 213 Green’s theorem 207 – using forms 280 Hessian matrix 160 inner product 7 integrability – function of several variables 104 – function of two variables 196 integral of 2-form on surface 260 integration of forms 247 isomorphic vector spaces 10 Jacobian matrix 133 Laplacian operator 180 level set curves 73 line integral of vector fields 111 linear combination 5 linearly dependent 5 linearly independent 5 little o 49 Möbius strip 266 mapping 20 – C 1 138 – C 2 148 – chain rule 143 – continous 125 – continuous on set 127 – differentiable 131 – limit 122 – regularity 137 – second derivative 146 – tangent 130 – tangent space 140 mass – 1d-domains 106
metric operator – Cartesian coordinates 98 – cylindrical coordinates 101 – polar coordinates 98 – spherical coordinates 101 Monkey saddle 119 Newton’s law of gravitation 63 norm 7 normal vector 265 – computation 267 open set 23 orientation – R2 188 – curve on a surface 275 – parallelogram 190 – plane 265 – surface 277 orthogonal 8 orthonormal 8 parallelepiped 192 parallelogram – orientation 190 – signed area 190 parametric representation – speed 53 parametrization – R3 166 partial derivatives 21 population density 198 potential function 112 pullback – 1 -form 89 – 2 -form in R2 244 – 2 -form on surface 245 – 3 -forms in R3 244 – real function 86 quadratic form 150 Quadrics 34 – Cone 35 – Ellipsoid 34 – Elliptic paraboloid 35 – Hyperbolic paraboloid 35 – Hyperboloid of one sheet 35 – Hyperboloid of two sheet 35 – traces 35
reparametrization 54 Riemann sum – 1 -form in R2 91 – dx 88 – double integral 195 – real function 85 scalar product 7 second derivative criterion 159 set 1 – closed 25 smooth parametrization 45 space of k-forms 233 span 6 Stokes’ theorem 280 – vector calculus notation 294 subset 2 surface – closed 277 – flux 286 – nonorientable 266 – orientable 265 – orientation 277 surface area form 254 surface integral 255 tangent line 51 tangent space 58 – Rn 59 – curve 58 – mapping 140 – surface 65 Taylor expansion – function 151 – function of several variables 152
Theorem – Line Integrals 113 torus 29, 173 total charge 218 total mass – plate 197 vector field 62 – conservative 112 – gradient 112 – sink point 295 – source point 295 vector function – affine 50 – continuity 42 – derivative 43 – differentiability 43 – integral 46 – limit 42 – linear 50 – piecewise smooth 45 vector functions 19 vector space 4 – basis 6 – dimension 7 vector subspace 4 volume form 193 wedge product – 1 -form 224 – k -forms 233 – differential 189 – differentials in R3 192 work 82