A detailed solution manual and guide for Schutz’s First Course in General Relativity (Schutz, 2009) Robert B. Scott,1∗ 1
Department of Physics, University of Brest, Brest, France
∗To whom correspondence should be addressed; E-mail:
[email protected]
February 10, 2012
Note to user. This manual is to be used as a companion to the textbook “A First Course in General Relativity”, by Bernard Schutz, 2nd edition, published 2009 by Cambridge University Press. It will only make sense when read with Schutz’s text. Herein you’ll find brief notes meant to clarify and amplify his textbook, presented in the same order as his textbook. Most importantly you’ll find solutions to almost all the exercises. These are presented in great detail, with each step explained including references to the text. You’ll also find my supplementary problems that are meant to establish intermediate goals necessary to solve Schutz’s exercises or to amplify concepts not covered by his exercises. Comments should be sent to
[email protected].
Chapter 1 Special Relativity 1.1
Fundamental principles of special relativity (SR) theory
In footnote 4 on p. 3, the answer to the first question is “no, the soup is unaffected by the acceleration experienced by an astronaut in orbit.” This would appear to also cause problems for SR, since how do we know that an observer is in an inertial frame? The acceleration cannot, necessarily, be measured locally. And there’s no special reference frame with which to measure one’s acceleration.
1.2
Definition of an inertial observer in SR
Gives a “geometrical” definition of an inertial reference frame, or coordinate system. Notes that gravity makes it impossible to construct such an inertial coordinate system.
1.3
New units
Introduces what Misner et al. (1973) called “geometric units”, wherein time is measured in distance of light travel. They claim the motivation is that $c = 3 \times 10^8$ m/s in SI, a “ridiculous value”. I disagree, since then 1/3 of a second becomes the ridiculously large $10^5$ km! A more useful motivation comes from
• velocity becoming a dimensionless parameter,
• space-time diagrams having the same units on all axes, and
• the world lines of light paths having unit slope.
1.4
Spacetime diagrams
Typo: Fig. 1.4: $\mathbf{v}$ is of course a vector, so one should replace this with $v = |\mathbf{v}|$.
1.5
Construction of the coordinates used by another observer
This is an extremely important section. Unfortunately he doesn’t explain why the angle of the $\bar x$-axis to the $x$-axis is $\phi = \arctan(v)$, where $v = |\mathbf{v}|$ is the magnitude of the velocity of $\bar O$ along the $x$-axis. Rather this result appears in Fig. 1.5 without explanation, and is not even delegated as an “exercise for the student”. The result does follow from the construction of the $\bar x$-axis, but the steps involved are not trivial. Please see supplementary problem SP.1 in section 1.15.
1.6
Invariance of the interval
This section purports to provide a proof of the invariance of the interval. But bear in mind that it assumes that the relationship between coordinates in different frames is linear, see discussion before Eq. (1.2). He reduces the relationship between the interval in one frame and another to a function of the relative velocity of their origins, see Eq. (1.5) on p. 10, $\Delta \bar s^2 = -M_{00}\,\Delta s^2 = \phi(\mathbf{v})\,\Delta s^2$. To show that $\phi(\mathbf{v})$ depends only upon the magnitude of the velocity, and not upon its direction, he considers the case of a rod (or two events A and B at the ends of the rod) lying along the $y$-axis.
Rob’s notes on Schutz
A and B are simultaneous in $O$, and he argues that they are therefore also simultaneous in $\bar O$, by constructing the $\bar y$-axis as he did in Fig. 1.3. But now the velocity of $\bar O$ is orthogonal to the constructed axis, so of course the simultaneity of events is not changed by the coordinate transform. The intermediate result is that the space-time interval between A and B in either frame is just the square of the length, so their ratio is the sought-after $\phi(\mathbf{v})$. Now the subtle point is that he then claims that this ratio cannot depend on the direction of the velocity, because the rod is perpendicular to it and there are no preferred directions?! So what?? I think the solution is that $\mathbf{v}$ could be in an arbitrary direction in the $x$-$z$ plane. The ratio of lengths should not depend upon this direction, because then there would be preferred directions. But as far as I can tell, this only shows that the direction of the component orthogonal to the $y$-axis cannot influence $\phi(\mathbf{v})$.
1.7
Invariant hyperbolae
At the end of the section it’s stated that “The lesson of Fig. 1.12b is that tangent to a hyperbola at any event P is line of simultaneity of the Lorentz frame whose time axis joins P to the origin. If this frame has velocity v, the tangent has slope v.” The above is stated without proof or even a hint that there’s some calculation involved. Fortunately it proceeds straightforwardly. We seek the slope of the tangent to a hyperbola. Differentiate any timelike hyperbola, $-t^2 + x^2 = -a^2$, with respect to $x$, to obtain in general
$$\frac{dt}{dx} = \frac{x}{t}.$$
At some point P the slope of the tangent with respect to the $x$-axis is $x_P/t_P$. Now if the $\bar t$-axis is chosen to go through the origin and P, its slope with respect to the $t$-axis will also be $x_P/t_P$, corresponding to $\tan(\phi) = v = x_P/t_P$. But we know from Fig. 1.5 that the corresponding $\bar x$-axis will have slope $v$ relative to the $x$-axis. That is, the tangent is parallel to the $\bar x$-axis, and is therefore a line of simultaneity for $\bar O$. QED.
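As a quick numerical sanity check of the tangent argument (my own sketch, not something from Schutz), one can take the unit hyperbola, pick the event P where the time axis of a frame with velocity v crosses it, and estimate the slope dt/dx there by a central difference. The function name and the choice v = 0.6 are just illustrative assumptions.

```python
import math

def tangent_slope(t_p, x_p, h=1e-6):
    """Central-difference estimate of dt/dx on the timelike hyperbola
    -t^2 + x^2 = -a^2 (upper branch) at the event (t_p, x_p)."""
    a2 = t_p**2 - x_p**2                     # a^2 fixed by the chosen event
    t = lambda x: math.sqrt(a2 + x**2)       # upper branch t(x)
    return (t(x_p + h) - t(x_p - h)) / (2 * h)

v = 0.6
t_p = 1 / math.sqrt(1 - v**2)   # P on the moving frame's time axis, x = v t
x_p = v * t_p
slope = tangent_slope(t_p, x_p)
print(slope, x_p / t_p)          # both should be close to v
```

The two printed numbers agree with v, which is the statement that the tangent at P is parallel to the moving frame's spatial axis.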
1.8
Particularly important results
Time dilation. This was straightforward once one uses the invariant hyperbolae. The event B was constructed so that it had $\bar t = 1$. The corresponding event in $O$ is obtained by tracing the point back to the $t$-axis along the hyperbola with the same interval, $\Delta s^2 = -1$,
$$-t^2 + x^2 = -1.$$
One must also note that the equation for the $\bar t$-axis is $t = x/v$. Substituting this into the hyperbola,
$$-t_B^2 + x_B^2 = -1 \qquad (1.1)$$
$$-t_B^2 + (t_B v)^2 = -1 \qquad (1.2)$$
$$t_B = \frac{1}{\sqrt{1 - v^2}} \qquad (1.3)$$
This gives Eq. (1.8).
Lorentz contraction. I still don’t see how he came up with
$$x_C = \frac{l}{\sqrt{1 - v^2}}.$$
But I obtain the same end result using instead the invariance of the interval, which in $\bar O$ is
$$\Delta \bar s^2_{AC} = -\Delta \bar t^2 + \Delta \bar x^2 = -\bar t^2 + \bar x^2 = 0 + l^2 = l^2,$$
and therefore must also be $l^2$ in $O$. I also used the equation for the $\bar x$-axis, $t = vx$. This was confusing at first since the units look wrong! But it’s clear when you go back to Fig. 1.5 and note that $\tan(\phi) = v$, which was obvious for the $\bar t$-axis since the observer $\bar O$ is moving along the $x$-axis at speed $v$. That the $\bar x$-axis was also inclined at the same angle $\phi$ was more complicated. One also needs a third relation, which is simply $x_C - x_B = v\,t_C$. A little algebra gives the Lorentz contraction:
$$x_B = l\sqrt{1 - v^2}.$$
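The “little algebra” can be spelled out numerically. The sketch below follows the three relations used above (the invariant interval, the equation $t = vx$ for the moving frame's spatial axis, and $x_C - x_B = v\,t_C$); the function names are my own and the values of $l$ and $v$ are arbitrary.

```python
import math

def dilated_time(v):
    """t_B where the moving time axis (x = v t) meets -t^2 + x^2 = -1."""
    return 1 / math.sqrt(1 - v**2)

def contracted_length(l, v):
    """x_B from the three relations used in the text."""
    x_c = l / math.sqrt(1 - v**2)   # from -t_C^2 + x_C^2 = l^2 with t_C = v x_C
    t_c = v * x_c                   # C lies on the moving spatial axis
    assert abs((-t_c**2 + x_c**2) - l**2) < 1e-12   # interval is indeed l^2
    return x_c - v * t_c            # third relation: x_C - x_B = v t_C

print(dilated_time(0.6))            # Lorentz factor for v = 0.6
print(contracted_length(2.0, 0.6))  # l * sqrt(1 - v^2)
```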
1.9
Lorentz transformation
The first step, substituting the equations for the $\bar O$ axes, proceeds immediately to
$$\bar t = \alpha(t - v x) \qquad (1.4)$$
$$\bar x = \sigma(x - v t) \qquad (1.5)$$
I had trouble seeing how $\alpha = \sigma$ from the path of a light ray, so I used the invariant hyperbolae instead. Substituting (1.4) and (1.5) into the equation for the interval from the origin gives
$$\Delta s^2 = -t^2 + x^2 = \Delta \bar s^2 = -\bar t^2 + \bar x^2.$$
The cross term on the RHS involving $x\,t$ must be zero, giving that $\alpha^2 = \sigma^2$. Equating either of the other terms gives the Lorentz factor,
$$\alpha^2 = \frac{1}{1 - v^2}, \qquad \alpha = \pm\frac{1}{\sqrt{1 - v^2}}.$$
As Schutz (2009, p. 22) points out, the positive root is selected so that the coordinates are not inverted when $v = 0$. In retrospect, it is clear how the path of a light ray gives $\alpha = \sigma$. Simply note that the world line of a light ray has $\Delta x = \pm\Delta t$ and $\Delta \bar x = \pm\Delta \bar t$. Substitution into (1.4) and (1.5) gives
$$\frac{\Delta \bar x}{\Delta \bar t} = \frac{\sigma}{\alpha}\,\frac{\Delta x - v\Delta t}{\Delta t - v\Delta x} = 1.$$
So $\sigma = \alpha$. The Lorentz transformation is often said to reduce to the Galilean transformation in the limit $v \ll 1$, but that’s not strictly true. Unlike for the Galilean transformation, in the Lorentz transformation time is affected at large distances even for small velocities.
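To make the last remark concrete, here is a small sketch (my own illustration, with arbitrarily chosen numbers) comparing the two transformations for a small velocity but a very distant event. The $vx$ term in the transformed time is what survives.

```python
import math

def lorentz(t, x, v):
    """Lorentz transformation (1.4)-(1.5) with alpha = sigma = gamma, c = 1."""
    g = 1 / math.sqrt(1 - v**2)
    return g * (t - v * x), g * (x - v * t)

def galilean(t, x, v):
    """Galilean transformation: time is untouched."""
    return t, x - v * t

v = 1e-4            # small velocity
t, x = 1.0, 1.0e6   # event far from the origin (geometric units)
tb_L, xb_L = lorentz(t, x, v)
tb_G, xb_G = galilean(t, x, v)
print(tb_L, tb_G)   # the transformed times differ by roughly v * x
```

Here the Lorentz-transformed time differs from the Galilean one by about $vx = 100$, even though $\gamma - 1$ is of order $10^{-8}$.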
1.10
Velocity composition law
1.11
Paradoxes and physical intuition
1.12
Further reading
A more thoughtful look at fundamentals, Bohm (2008).
1.13
Appendix: the twin paradox dissected
1.14
Exercises
1.14.1
Convert to geometric units
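a) $10\ \mathrm{J} = 10\ \mathrm{N\,m} = 10\ \mathrm{kg\,m^2/s^2} = 10/(9 \times 10^{16})\ \mathrm{kg} = 1.11 \times 10^{-16}\ \mathrm{kg}$.
b) $100\ \mathrm{W} = 100\ \mathrm{J/s} = 1.11 \times 10^{-15}\ \mathrm{kg/s} = 1.11 \times 10^{-15}/(3 \times 10^{8})\ \mathrm{kg/m} = 0.371 \times 10^{-23}\ \mathrm{kg/m}$
c) $\hbar = 1.05 \times 10^{-34}\ \mathrm{J\,s} = \dfrac{1.05 \times 10^{-34}\ \mathrm{J\,s}}{3 \times 10^{8}\ \mathrm{m/s}} = 0.352 \times 10^{-42}\ \mathrm{kg\,m}$
d) Car velocity [108 km/hr]: $v = 30\ \mathrm{m/s} = 10^{-7}$
e) Car momentum: $p = 30\ \mathrm{m/s} \times 1000\ \mathrm{kg} = 10^{-4}\ \mathrm{kg}$
f) Atmospheric pressure: $1\ \mathrm{bar} = 10^{5}\ \mathrm{N\,m^{-2}} = \dfrac{10^{5}\ \mathrm{kg\,m\,s^{-2}}}{9 \times 10^{16}\ \mathrm{m^4\,s^{-2}}} = 1.1 \times 10^{-12}\ \mathrm{kg\,m^{-3}}$
g) Water density: $10^{3}\ \mathrm{kg\,m^{-3}}$
h) Luminosity flux: $10^{6}\ \mathrm{J\,s^{-1}\,cm^{-2}} = 10^{10}\ \mathrm{J\,s^{-1}\,m^{-2}} = \dfrac{10^{10}\ \mathrm{J\,s^{-1}\,m^{-2}} \times 1.11 \times 10^{-17}\ \mathrm{kg\,J^{-1}}}{3 \times 10^{8}\ \mathrm{m\,s^{-1}}} = 3.71 \times 10^{-16}\ \mathrm{kg\,m^{-3}}$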
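All of these conversions follow one rule: every power of seconds in the SI unit is replaced by the same power of $c$ metres. A minimal sketch of that rule, assuming the rounded value $c = 3 \times 10^8$ m/s used above (the helper name `to_geometric` is mine):

```python
C = 3.0e8  # m/s, rounded value used in these notes

def to_geometric(value_si, seconds_exponent):
    """Convert an SI value to c = 1 units.

    seconds_exponent is the power of seconds in the SI unit,
    e.g. -2 for joules (kg m^2 s^-2), -1 for m/s, -3 for watts."""
    return value_si * C ** seconds_exponent

energy = to_geometric(10.0, -2)       # a) 10 J in kg
power = to_geometric(100.0, -3)       # b) 100 W in kg/m
hbar = to_geometric(1.05e-34, -1)     # c) in kg m
speed = to_geometric(30.0, -1)        # d) dimensionless
pressure = to_geometric(1.0e5, -2)    # f) 1 bar in kg m^-3
flux = to_geometric(1.0e10, -3)       # h) in kg m^-3
print(energy, power, hbar, speed, pressure, flux)
```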
1.14.2
Convert from natural units (c = 1) to SI units
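2 (a) Velocity, $v = 10^{-2}$: $v = 10^{-2} \times c\ [\mathrm{m\,s^{-1}}] = 3 \times 10^{6}\ [\mathrm{m\,s^{-1}}]$
2 (b) Pressure, $10^{19}\ [\mathrm{kg\,m^{-3}}]$: $10^{19}\ [\mathrm{kg\,m^{-3}}] \times c^2\ [\mathrm{m^2\,s^{-2}}] = 9 \times 10^{35}\ [\mathrm{N\,m^{-2}}]$
2 (c) Time, $10^{18}\ [\mathrm{m}]$: $\dfrac{10^{18}\ [\mathrm{m}]}{c\ [\mathrm{m\,s^{-1}}]} = 3.3 \times 10^{9}\ [\mathrm{s}]$
2 (d) Energy density, $1\ [\mathrm{kg\,m^{-3}}]$: $1\ [\mathrm{kg\,m^{-3}}] \times c^2\ [\mathrm{m^2\,s^{-2}}] = 9 \times 10^{16}\ [\mathrm{J\,m^{-3}}]$
2 (e) Acceleration, $10\ [\mathrm{m^{-1}}]$: $10\ [\mathrm{m^{-1}}] \times c^2\ [\mathrm{m^2\,s^{-2}}] = 9 \times 10^{17}\ [\mathrm{m\,s^{-2}}]$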
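Going back to SI is the inverse operation: divide by $c$ raised to the power of seconds that the target SI unit carries. A short sketch under the same assumption of $c = 3 \times 10^8$ m/s (again the helper name is my own):

```python
C = 3.0e8  # m/s, rounded value used in these notes

def to_si(value_geometric, seconds_exponent):
    """Convert a c = 1 value to SI.

    seconds_exponent is the power of seconds in the *target* SI unit,
    e.g. -1 for m/s, -2 for N m^-2, J m^-3 or m s^-2, +1 for seconds."""
    return value_geometric / C ** seconds_exponent

velocity = to_si(1e-2, -1)        # (a) m/s
pressure_si = to_si(1e19, -2)     # (b) N m^-2
time_si = to_si(1e18, 1)          # (c) s
energy_density = to_si(1.0, -2)   # (d) J m^-3
acceleration = to_si(10.0, -2)    # (e) m s^-2
print(velocity, pressure_si, time_si, energy_density, acceleration)
```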
1.14.6
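Show that Eq. (1.2) contains only $M_{\alpha\beta} + M_{\beta\alpha}$ when $\alpha \neq \beta$, not $M_{\alpha\beta}$ and $M_{\beta\alpha}$ independently. Argue that this allows us to set $M_{\alpha\beta} = M_{\beta\alpha}$ without loss of generality.
$$\Delta \bar s^2 = \sum_{\alpha=0}^{3}\sum_{\beta=0}^{3} M_{\alpha\beta}\,(\Delta x^\alpha)(\Delta x^\beta)$$
Pick a pair of indices, $\alpha = \alpha'$ and $\beta = \beta'$ say, where $\alpha' \neq \beta'$, and $\alpha' \in \{0 \ldots 3\}$ and $\beta' \in \{0 \ldots 3\}$. So $\Delta \bar s^2$ contains a term like
$$M_{\alpha'\beta'}\,(\Delta x^{\alpha'})(\Delta x^{\beta'}).$$
But $\Delta \bar s^2$ also contains a term like
$$M_{\beta'\alpha'}\,(\Delta x^{\beta'})(\Delta x^{\alpha'}) = M_{\beta'\alpha'}\,(\Delta x^{\alpha'})(\Delta x^{\beta'}).$$
The equality follows because of course the product does not depend upon the order of the factors. So we can group these two terms and factor out the $(\Delta x^{\alpha'})(\Delta x^{\beta'})$, leaving
$$(\Delta x^{\alpha'})(\Delta x^{\beta'})\,(M_{\alpha'\beta'} + M_{\beta'\alpha'}).$$
Because the off-diagonal terms always appear in pairs as above, we could without changing the interval (and therefore without loss of generality) replace them with their mean value
$$\tilde M_{\alpha\beta} \equiv (M_{\alpha\beta} + M_{\beta\alpha})/2.$$
Thus the new tensor $\tilde M_{\alpha\beta}$ is by construction symmetric.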
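The claim is easy to test numerically. The sketch below (my own illustration; the random matrix stands in for an arbitrary $M_{\alpha\beta}$) evaluates the double sum with the original and with the symmetrized coefficients.

```python
import random

def quad_form(M, dx):
    """Double sum M_ab dx^a dx^b over a, b = 0..3."""
    return sum(M[a][b] * dx[a] * dx[b] for a in range(4) for b in range(4))

def symmetrize(M):
    """Replace each entry by the mean of M_ab and M_ba."""
    return [[0.5 * (M[a][b] + M[b][a]) for b in range(4)] for a in range(4)]

random.seed(0)
M = [[random.uniform(-1, 1) for _ in range(4)] for _ in range(4)]
dx = [random.uniform(-1, 1) for _ in range(4)]
M_sym = symmetrize(M)
print(quad_form(M, dx), quad_form(M_sym, dx))  # the two values agree
```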
1.14.7
In the discussion leading up to Eq. (1.2), assume that the coordinates of $\bar O$ are given as the following linear combinations of those of $O$:
$$\bar t = \alpha t + \beta x, \qquad (1.6)$$
$$\bar x = \mu t + \upsilon x, \qquad (1.7)$$
$$\bar y = a y, \qquad (1.8)$$
$$\bar z = b z, \qquad (1.9)$$
where $\alpha, \beta, \mu, \upsilon, a$, and $b$ may be functions of the velocity $\mathbf{v}$ of $\bar O$ relative to $O$, but they do not depend on the coordinates. Find the numbers $\{M_{\alpha\beta},\ \alpha, \beta = 0, \ldots 3\}$ of Eq. (1.2) in terms of $\alpha, \beta, \mu, \upsilon, a$, and $b$.
First note that the origins of the two coordinate systems line up, and that $\Delta t = t$ etc. Then the result follows from straightforward substitution of (1.6) to (1.9) into Eq. (1.1):
$$\Delta \bar s^2 = -\Delta \bar t^2 + \Delta \bar x^2 + \Delta \bar y^2 + \Delta \bar z^2 \qquad (1.10)$$
$$= -(\alpha\Delta t + \beta\Delta x)^2 + (\mu\Delta t + \upsilon\Delta x)^2 + (a\Delta y)^2 + (b\Delta z)^2 \qquad (1.11)$$
Grouping terms we find that $(-\alpha^2 + \mu^2)$ multiplies $\Delta t^2$, so $M_{00} = -\alpha^2 + \mu^2$. Similarly, the term multiplying $\Delta x^2$ is $M_{11} = -\beta^2 + \upsilon^2$. The cross terms give $M_{01} = M_{10} = -\alpha\beta + \mu\upsilon$, and the remaining diagonal terms are $M_{22} = a^2$, $M_{33} = b^2$. Other cross terms are nil.
1.14.8
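a) Derive Eq. (1.3) from Eq. (1.2) for general $M_{\alpha\beta}$.
Start with Eq. (1.2),
$$\Delta \bar s^2 = M_{\alpha\beta}\,\Delta x^\alpha \Delta x^\beta.$$
Separating the time and space parts of the sum,
$$\Delta \bar s^2 = M_{00}\Delta t^2 + M_{0i}\Delta x^i \Delta t + M_{i0}\Delta x^i \Delta t + M_{ij}\Delta x^i \Delta x^j.$$
Note that $M_{i0} = M_{0i}$ (problem 6). Consider the case $\Delta s^2 = 0$, so from Eq. (1.1), $\Delta t = \Delta r = \sqrt{\Delta x^2 + \Delta y^2 + \Delta z^2}$. Then,
$$\Delta \bar s^2 = M_{00}\Delta r^2 + 2M_{0i}\Delta x^i \Delta r + M_{ij}\Delta x^i \Delta x^j,$$
which is Eq. (1.3).
b) Since $\Delta \bar s^2 = 0$ in Eq. (1.3) for any $\{\Delta x^i\}$, replace $\Delta x^i$ by $-\Delta x^i$ in Eq. (1.3) and subtract the resulting equations from Eq. (1.3) to establish that $M_{0i} = 0$ for $i = 1, 2, 3$.
We have set $\Delta s^2 = 0$ and it followed, based upon the universality of the speed of light, that $\Delta \bar s^2 = 0$. Note that changing $\Delta x^i$ to $-\Delta x^i$ does not change $\Delta r$ nor $\Delta s$. So that’s why $\Delta \bar s^2 = 0$ in Eq. (1.3). The only term in Eq. (1.3) to change sign when changing $\Delta x^i$ to $-\Delta x^i$ is the $2M_{0i}\Delta x^i \Delta r$ term. The final term doesn’t because changing $\Delta x^i$ to $-\Delta x^i$ also changes $\Delta x^j$ to $-\Delta x^j$; the $i$ is just a dummy index. So when we subtract from Eq. (1.3) the following
$$\Delta \bar s^2 = M_{00}\Delta r^2 - 2M_{0i}\Delta x^i \Delta r + M_{ij}\Delta x^i \Delta x^j$$
we’re left with $0 = 4M_{0i}\Delta x^i \Delta r$. This must be true for arbitrary $\Delta x^i$ so $M_{0i} = 0$. QED.
c) Derive Eq. (1.4)b.
Required to show:
$$M_{ij} = -M_{00}\,\delta_{ij}, \qquad (i, j = 1, 2, 3).$$
Adding to Eq. (1.3) the following
$$0 = \Delta \bar s^2 = M_{00}\Delta r^2 - 2M_{0i}\Delta x^i \Delta r + M_{ij}\Delta x^i \Delta x^j$$
and dividing by 2 gives
$$0 = M_{00}\Delta r^2 + M_{ij}\Delta x^i \Delta x^j. \qquad (1.12)$$
Suppose $\Delta x = \Delta r$, $\Delta y = \Delta z = 0$. Substituting into (1.12) then gives $M_{00} = -M_{11}$. Or, when $\Delta y = \Delta r$, $\Delta x = \Delta z = 0$, we see that $M_{00} = -M_{22}$. Similarly, $M_{00} = -M_{33}$. To see that the off-diagonal terms are zero, note that it’s also possible that $\Delta x = \Delta y = \Delta r/\sqrt{2}$ and $\Delta z = 0$. Substitution into (1.12), using the diagonal results just found, gives
$$0 = (M_{12} + M_{21})\Delta r^2/2 = \Delta r^2 M_{12}.$$
Similarly, $M_{13} = 0 = M_{23}$. In summary,
$$M_{ij} = -M_{00}\,\delta_{ij}, \qquad (i, j = 1, 2, 3),$$
which is Eq. (1.4)b. QED.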
1.14.18
a) Show that velocity parameters add linearly, b) apply to a specific problem
Define the velocity parameter $W$ through $w = \tanh(W)$. Want to show the velocity addition law,
$$w' = \frac{u + w}{1 + wu},$$
implies linear addition of velocity parameters. Simply substitute the definition of velocity parameter,
$$w' = \frac{\tanh(U) + \tanh(W)}{1 + \tanh(U)\tanh(W)} \qquad (1.13)$$
$$= \frac{(\tanh(U) + \tanh(W))\cosh(W)\cosh(U)}{\cosh(W)\cosh(U) + \sinh(U)\sinh(W)} \qquad (1.14)$$
The numerator can be written as
$$N = \sinh(W)\cosh(U) + \cosh(W)\sinh(U)$$
so that
$$w' = \frac{\sinh(W)\cosh(U) + \cosh(W)\sinh(U)}{\cosh(W)\cosh(U) + \sinh(U)\sinh(W)}.$$
The following identities are useful:
$$\cosh(a)\cosh(b) = \frac{e^{a} + e^{-a}}{2}\,\frac{e^{b} + e^{-b}}{2} = \frac{e^{a+b} + e^{-(a+b)}}{4} + \frac{e^{a-b} + e^{-(a-b)}}{4} = \frac{\cosh(a+b)}{2} + \frac{\cosh(a-b)}{2} \qquad (1.15)$$
$$\sinh(a)\sinh(b) = \frac{e^{a} - e^{-a}}{2}\,\frac{e^{b} - e^{-b}}{2} = \frac{e^{a+b} + e^{-(a+b)}}{4} - \frac{e^{a-b} + e^{-(a-b)}}{4} = \frac{\cosh(a+b)}{2} - \frac{\cosh(a-b)}{2} \qquad (1.16)$$
$$\sinh(a)\cosh(b) = \frac{e^{a} - e^{-a}}{2}\,\frac{e^{b} + e^{-b}}{2} = \frac{e^{a+b} - e^{-(a+b)}}{4} + \frac{e^{a-b} - e^{-(a-b)}}{4} = \frac{\sinh(a+b)}{2} + \frac{\sinh(a-b)}{2} \qquad (1.17)$$
Using (1.15) and (1.16) the denominator above simplifies to $D = \cosh(U + W)$. Using (1.17) twice, and noting that $\sinh(W - U) = -\sinh(U - W)$, the numerator simplifies to $N = \sinh(U + W)$. So,
$$w' = \tanh(U + W),$$
which reveals that we can linearly add velocity parameters, then apply tanh to reduce the final parameter to the final velocity.
b) Velocity of 2nd star relative to first, $u_2 = 0.9$. Velocity of the $n$th star relative to the $(n-1)$th is also 0.9. So the velocity of the $N$th star relative to the first is
$$u'_N = \tanh[(N - 1)U],$$
where $0.9 = \tanh(U)$. My answer disagrees with that given by Schutz. Where I have $N - 1$ he has $N$. Note that my answer is correct for $N = 2$: the 2nd star relative to the first moves at 0.9.
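A quick way to check both parts at once is to compose the velocity 0.9 repeatedly with the addition law and compare with the closed form in terms of the velocity parameter. This is only a sketch of that check; the function names are mine.

```python
import math

def add_velocities(u, w):
    """Einstein velocity composition law (c = 1)."""
    return (u + w) / (1 + u * w)

def nth_star_closed_form(n, step=0.9):
    """tanh[(n - 1) U] with step = tanh(U)."""
    return math.tanh((n - 1) * math.atanh(step))

def nth_star_iterative(n, step=0.9):
    """Apply the composition law (n - 1) times starting from rest."""
    u = 0.0
    for _ in range(n - 1):
        u = add_velocities(u, step)
    return u

for n in (2, 3, 5):
    print(n, nth_star_iterative(n), nth_star_closed_form(n))
```

For n = 2 both expressions return 0.9, consistent with the $N - 1$ in the exponent.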
1.14.19
a) Lorentz transformation using velocity parameter
$$\bar t = \gamma t - \gamma v x, \quad \bar x = -\gamma v t + \gamma x, \quad \bar y = y, \quad \bar z = z \qquad (1.18)$$
Let $v = \tanh(V)$. Note that the Lorentz factor also simplifies,
$$\gamma \equiv \frac{1}{\sqrt{1 - v^2}} = \left[1 - \tanh^2(V)\right]^{-1/2} = \left[\frac{\cosh^2(V)}{\cosh^2(V) - \sinh^2(V)}\right]^{1/2} = \pm\cosh(V) \qquad (1.19)$$
We always take the positive root in the Lorentz factor so that the Lorentz transformation reduces to the identity matrix when $v = 0$. The final equality follows from the following identity (stated without proof in b):
$$\cosh^2(V) - \sinh^2(V) = \left[\frac{e^{V} + e^{-V}}{2}\right]^2 - \left[\frac{e^{V} - e^{-V}}{2}\right]^2 = \frac{e^{2V} + e^{-2V} + 2}{4} - \frac{e^{2V} + e^{-2V} - 2}{4} = 1 \qquad (1.20)$$
Substituting $v = \tanh(V)$ and (1.19) into (1.18) gives the desired result,
$$\bar t = \cosh(V)\,t - \sinh(V)\,x, \quad \bar x = -\sinh(V)\,t + \cosh(V)\,x, \quad \bar y = y, \quad \bar z = z \qquad (1.21)$$
1.14.19
b) invariance of the interval using velocity parameter
The given identity is derived above (1.20). Invariance of the interval follows from straightforward substitution into (1.21).
$$\Delta \bar s^2 = -\Delta \bar t^2 + \Delta \bar x^2 + \Delta \bar y^2 + \Delta \bar z^2 = -(\cosh(V)\Delta t - \sinh(V)\Delta x)^2 + (-\sinh(V)\Delta t + \cosh(V)\Delta x)^2 + \Delta y^2 + \Delta z^2 = \Delta s^2 \qquad (1.22)$$
In the final equality, the cross terms cancelled directly while the squared terms simplified with the identity (1.20).
1.14.19
c) analogy between Lorentz transformation using velocity parameter and Euclidean coordinate transformation
Hyperbolic trigonometric functions replace regular trigonometric functions, but the sign changes for the sine term in the Euclidean coordinate transformation and not the sinh term of the Lorentz transformation. The analog to the interval ∆s2 is the squared distance to the origin. The analog to the invariant hyperbolae are circles. These could be used to calibrate axes of the rotated Euclidean frame.
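The analogy can be seen side by side in a few lines. The sketch below (function names and sample values are my own) applies a boost written with the velocity parameter and an ordinary rotation, and evaluates the quantity each one preserves.

```python
import math

def boost(t, x, V):
    """Lorentz transformation (1.21) in terms of the velocity parameter V."""
    return (math.cosh(V) * t - math.sinh(V) * x,
            -math.sinh(V) * t + math.cosh(V) * x)

def rotate(x, y, theta):
    """Euclidean rotation of the axes; note the single sign change on sin."""
    return (math.cos(theta) * x + math.sin(theta) * y,
            -math.sin(theta) * x + math.cos(theta) * y)

t, x = 2.0, 0.5
tb, xb = boost(t, x, 0.7)
print(-t**2 + x**2, -tb**2 + xb**2)    # interval: invariant hyperbola

x0, y0 = 2.0, 0.5
xr, yr = rotate(x0, y0, 0.7)
print(x0**2 + y0**2, xr**2 + yr**2)    # squared distance: invariant circle
```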
1.14.20
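Lorentz transformation in matrix form
$$\bar{\mathbf{x}} = A\,\mathbf{x}$$
where
$$\bar{\mathbf{x}} = \begin{pmatrix} \bar t \\ \bar x \\ \bar y \\ \bar z \end{pmatrix}, \qquad \mathbf{x} = \begin{pmatrix} t \\ x \\ y \\ z \end{pmatrix}$$
and
$$A = \begin{pmatrix} \cosh(V) & -\sinh(V) & 0 & 0 \\ -\sinh(V) & \cosh(V) & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}$$
1.14.21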
a) Timelike separated events can be transformed to occur at the same point.
Without loss of generality, consider two points, one at the origin and another at $(t, x)$ in inertial frame $O$. For the interval to be timelike, we require
$$\Delta s^2 = -\Delta t^2 + \Delta x^2 = -t^2 + x^2 < 0.$$
Consider another inertial frame $\bar O$ moving at velocity $v$ along the $x$-axis with origins that coincide at $t = 0$. From the Lorentz transformation
$$\bar t = \gamma t - \gamma v x, \qquad \bar x = -\gamma v t + \gamma x \qquad (1.23)$$
so
$$(\Delta \bar x)^2 = \bar x^2 = (-\gamma v t + \gamma x)^2.$$
For the two events to occur at the same point in $\bar O$ we need this to vanish. We set it to zero and divide through by $\gamma^2 t^2$ to reduce this expression to one in a single parameter $\alpha \equiv x/t$, with $|\alpha| < 1$ for the timelike interval:
$$\alpha^2 - 2v\alpha + v^2 = 0$$
so
$$(\alpha - v)^2 = 0.$$
This has solution $\alpha = v$ and this is possible for a realistic velocity boost $|v| < 1$ because $|\alpha| < 1$ for the timelike interval. Q.E.D.
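As an illustration of this result (a sketch with names of my own choosing), the function below takes a timelike-separated event, boosts with $v = x/t$, and returns the transformed coordinates. The spatial separation vanishes and the time separation becomes the proper time.

```python
import math

def boost_to_same_place(t, x):
    """For a timelike separation (|x| < |t|) from the origin, return
    (v, t_bar, x_bar) for the boost v = x / t used in the text."""
    if abs(x) >= abs(t):
        raise ValueError("separation is not timelike")
    v = x / t
    g = 1 / math.sqrt(1 - v**2)
    return v, g * (t - v * x), g * (x - v * t)

v, t_bar, x_bar = boost_to_same_place(5.0, 3.0)
print(v, t_bar, x_bar)   # x_bar vanishes; t_bar is sqrt(t^2 - x^2)
```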
1.15
Rob’s supplementary problems
SP.1 In Fig. 1.5, explain why the angle of the $\bar x$-axis to the $x$-axis is $\phi = \arctan(v)$, where $v = |\mathbf{v}|$ is the magnitude of the velocity of $\bar O$ along the $x$-axis. The result follows from the construction of the $\bar x$-axis, but the steps involved are not trivial.
Call the unknown angle between the $\bar x$-axis and the $x$-axis $\alpha$. Extend the line from P to R all the way to the $t$-axis, and call this intersection Q. Draw two lines parallel to the $x$-axis, one through R, and where it crosses the $t$-axis call this U; the other through P, and where it crosses the $t$-axis call this T. The events $\xi$, P, R form a right triangle, with hypotenuse $\xi R = 2a$. We need the angle at $R\xi P$, which turns out to be $\chi = \pi/4 - \phi$. (Call angle ORQ $\gamma$. Then $\phi + \gamma + \pi/4 = \pi$, and angle $\xi RP$ is $\pi - \gamma = \pi/4 + \phi$. It follows that $\chi = \pi/4 - \phi$.) So now we can compute the length $RP = 2a\sin(\chi) = a\sqrt{2}\,(\cos(\phi) - \sin(\phi))$. $UR = a\sin(\phi)$. Then $QR = UR/\sin(\pi/4) = a\sqrt{2}\sin(\phi)$. Summing the two lengths, $QP = QR + RP = a\sqrt{2}\cos(\phi)$. We’re now after $OT = OQ - TQ$. But $OQ = OU + UQ$, with $UQ = UR = a\sin(\phi)$, and $OU = a\cos(\phi)$. Also, $TQ = QP\cos(\pi/4) = a\cos(\phi)$. So $OT = a(\sin(\phi) + \cos(\phi)) - a\cos(\phi) = a\sin(\phi)$. Note that the sought-after angle $\alpha$ satisfies $\tan(\alpha) = OT/TP = OT/TQ = a\sin(\phi)/(a\cos(\phi))$, so $\alpha = \phi$, the desired result.
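The same conclusion can be checked with coordinates instead of triangles. The sketch below is my own coordinate version of the construction, not the geometric argument above: a light ray leaves the moving time axis at $(t, x) = (-T, -vT)$, is reflected at P, and returns to that axis at $(T, vT)$; P is then found as the intersection of the two rays. The names and the parameter T are assumptions made for the illustration.

```python
import math

def reflection_event(v, T=1.0):
    """Event P = (t_P, x_P) where the outgoing and returning light rays meet."""
    t_e, x_e = -T, -v * T      # emission event on the moving time axis
    t_r, x_r = T, v * T        # reception event on the moving time axis
    c_minus = x_e - t_e        # x - t is constant along the outgoing ray
    c_plus = x_r + t_r         # x + t is constant along the returning ray
    return (c_plus - c_minus) / 2, (c_plus + c_minus) / 2

def xbar_axis_angle(v):
    """Angle between the line OP (the moving spatial axis) and the x-axis."""
    t_p, x_p = reflection_event(v)
    return math.atan2(t_p, x_p)

print(xbar_axis_angle(0.6), math.atan(0.6))   # the two angles agree
```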
1.16
Additional thoughts
I think it’s worth mentioning that the Lorentz transformation, which is linear by construction, transforms lines to lines. This is easily verified by substituting the equation for a line in $O$ and confirming that it’s also a line in $\bar O$. It’s also worth pointing out that a tangent line to a curve in $O$ remains a tangent line in $\bar O$. Of course it would be quite strange if this were not true, but on the other hand it was not immediately obvious to me that it holds.
Chapter 2 Vector Analysis in Special Relativity An elementary introduction to 4-vectors, working with Lorentz transformations. Contains lots of hand-holding about the algebra of working with vectors, the summation convention, changing dummy indices etc.
2.1
Definition of a vector
Regarding the Einstein summation convention introduced on p. 34, Schutz states that it applies whenever there is a repeated index, one up and one down, in the same expression. I find this misleading because it’s actually the same term or same factor.
Buried in footnote 2 on p. 35 is an important notational point.
2.2
Vector algebra
Eq. (2.10) introduces a strange notational twist. Apparently enclosing the vectors $\vec e_\alpha$ with parentheses and writing a superscript $\beta$ implies that we are forming a tensor from the set of these vectors?
$$(\vec e_\alpha)^\beta = \delta_\alpha{}^\beta$$
There’s no comment to explain this. Earlier the author explained that the superscript notation will become clear when he introduces differential geometry. For now I just note that the RHS is the Kronecker delta, which is a second-rank tensor. [Coming back to the above point after having read most of the book (all but Chapters 9 and 12), I’m still amazed at this leap in notation. Generally Schutz is clear and fairly careful but this is an exception. I interpret it as follows. Enclosing the vector in parentheses I believe means taking the set of components, since later when we want to write down the components of a 2nd rank tensor as a matrix we enclose the tensor in parentheses and the analogous operation for a vector would be writing down its 4 components as a row or column vector. Apparently the superscript $\beta$ means then selecting the $\beta$th one of these components. Indeed, this interpretation is consistent with his usage in Eq. (2.21) and later in the book, such as Eq. (5.52).]
On p. 38 he introduces the notation that putting a tensor within square brackets, $[\Lambda^{\bar\beta}{}_\alpha]$, gives the matrix of components. He will oscillate between square brackets and parentheses throughout the book, sometimes in the same paragraph!! For example, the two sentences after Eq. (2.18).
Eq. (2.18) is described as a key formula. Exercise 2.11c is to verify it.
Schutz doesn’t like the terms “contravariant” and “covariant”, used by many others. $A^\alpha$ are the contravariant components of a vector (Hobson et al., 2009).
2.3
The four-velocity
First paragraph. Not clear what “uniformly moving” means. I guess he means not accelerating so that the inertial frame in which the particle is at rest is not constantly changing. This also makes sense because the next paragraph discusses the case of an accelerating particle.
2.4
The four-momentum
Typo on p. 42, in the example, p1 = mv(1 − v 2 )−1/2 should be p1 = mv(1 − v 2 )−1/2 .
2.6
Some applications
Top of p. 48. “In the MCRF $\vec U$ has only a zero component . . .”. It might be better to say “only a time component” since he means “zero” in the sense of the first component.
2.7
Photons
Four-momentum Typo: Last paragraph: “. . . a photon has frequency v and . . . ”. This should be “. . . a photon has frequency ν and . . . ”.
The derivation of Eq. (2.39) seems a bit mysterious until one realizes that it’s based on the idea that the four-momentum must transform under a Lorentz transformation. In particular, the momentum of a photon directed in the $x$-direction is:
$$\vec p \to_O (E, E, 0, 0)$$
If we change reference frames to $\bar O$ then this four-vector must transform under a Lorentz transformation, giving:
$$\vec p \to_{\bar O} (\bar E, \bar E, 0, 0)$$
where $\bar E = \gamma E - E v \gamma = E\gamma(1 - v)$ etc.
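A short numerical illustration of this transformation (my own sketch; the function name and the values are arbitrary) boosts the photon four-momentum $(E, E, 0, 0)$ along $x$ and confirms that the result is still null, with energy $E\gamma(1 - v)$.

```python
import math

def boost_photon(E, v):
    """Boost the photon four-momentum (E, E, 0, 0) along x with velocity v.
    Returns the transformed time and x components."""
    g = 1 / math.sqrt(1 - v**2)
    p0, p1 = E, E
    return g * (p0 - v * p1), g * (p1 - v * p0)

E, v = 1.0, 0.6
E_bar, p_bar = boost_photon(E, v)
print(E_bar, p_bar)                          # equal: still a null vector
print(E * math.sqrt((1 - v) / (1 + v)))      # same number, Doppler form
```

The last line uses $\gamma(1 - v) = \sqrt{(1 - v)/(1 + v)}$, which is the usual way of writing the redshift factor.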
2.9
Exercises
2. Identify the dummy and free indices, count the equations: a) α is the dummy index. One equation. b) ν is the dummy index. µ is the free index. Four equations. c) λ, µ dummy indices; α, γ free indices; 16 equations. d) ν and µ are free indices, and there are 16 equations. Although the indices are repeated, they’re not repeated in the same factor, and one is not superscript.
3 Prove Eq. (2.5). There’s nothing to prove really. It follows immediately from the definition and notation conventions. In particular, the LHS involves a sum over all values of the dummy index β ∈ {0, 1, 2, 3}, see p. 34. The RHS merely spells this out, with the convention that Roman indices like i take all values i ∈ {1, 2, 3}.
4 Practise adding components of vectors, and multiplying by a scalar.
a) $-6\vec A \to_O (-30, 6, 0, -6)$
5 a) Show that the basis vectors are linearly independent. Start with a general linear combination $a_\mu$,
$$0 = a_\mu (\vec e_\alpha)^\mu = a_\mu \delta_\alpha{}^\mu$$
Start with the first component, $\alpha = 0$. The equation above is $0 = a_0 \times 1$, so $a_0$ must be zero. Similarly for the other components. Since this trivial solution is the only solution, the basis vectors must be linearly independent. More formally, one could write this out in matrix notation:
$$\begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} a_0 \\ a_1 \\ a_2 \\ a_3 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \\ 0 \end{pmatrix}$$
It’s a result of elementary linear algebra that this system has nontrivial solutions only if the determinant of the matrix is zero. But the determinant is +1. So there are no nontrivial solutions and thus the basis vectors must be linearly independent, Q.E.D.
5 b) The given set is not linearly independent, since the linear combination with coefficients $(-5, -3, +2, 1)$ gives the zero vector.
6 As in Fig. 1.5, the $\bar t$ and $\bar x$ axes are tilted at an angle $\phi$ relative to their $O$ frame counterparts and toward the world line of the light ray $t = x$. The basis vectors are parallel to these $\bar O$ axes. Here $\tan(\phi) = 0.6$. For $\bar{\bar O}$ the axes will be tilted even further toward $t = x$. The angle $\theta$ of these basis vectors can be computed as $\tan(\theta) = \tanh(2\,\mathrm{arctanh}(0.6))$.
7 a) Verify Eq. (2.10). As mentioned above, this is a strange notational twist. If we write the basis vectors as row vectors as in Eq. (2.9), then the set forms a matrix, and the matrix element is unity when row and column numbers are equal, and zero otherwise, i.e. the identity matrix.
$$\begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}$$
The RHS of Eq. (2.10) can of course be written as the identity matrix too, which demonstrates the equality.
7 b) I’ve always thought of Eq. (2.11) as the definition of the vector, so it seems to me a tautology, rather than something to prove. Perhaps it’s worth stating the result in words. If you use the components of the vector $\vec A$ to form the linear combination of basis vectors $\vec e_\alpha$, i.e.
$$A^\alpha \vec e_\alpha$$
then you, of course, recover the vector $\vec A$. In particular, for the first component of this sum, only the term $A^0 \vec e_0$ contributes, since the other basis vectors are all zero in the first component. Similarly for the other components.
8 a) Prove that the zero vector has the same components in all reference frames. This follows immediately from the use of a linear transformation to go between reference frames. See p. 35, and Eq. (2.7) for the definition of the general (4-) vector and the linear transformation. In other words,
$$A \begin{pmatrix} 0 \\ 0 \\ 0 \\ 0 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \\ 0 \end{pmatrix}$$
for all matrices $A$, and the Lorentz transformation can always be written as a matrix $A$.
8 b) Prove that if two vectors have equal components in one frame their components are equal in all frames.
My first thought is that if their components are equal in a given frame, then they’re the same vector. By the definition of a vector, they are invariant under coordinate transformation. So their components are equal in all other frames. But that doesn’t use 8a. Using 8a, one could subtract the two equal vectors, giving the zero vector in that frame. Under coordinate transformation, this difference vector remains the zero vector. Thus their components must be equal in any other frame.
9 There are 16 terms to write out, which is too much work. It seems convincing enough to me to note that for each term in the sum on the LHS, there is a corresponding term on the RHS. In general these terms look like
$$\Lambda^{\bar\alpha}{}_\beta A^\beta \vec e_{\bar\alpha}$$
Of course the order of summation doesn’t matter for a finite sum. Substituting specific values for the dummy indices might make this more clear, say $\bar\alpha = 0$, $\beta = 1$.
10 Prove Eq. (2.13) from
$$A^\alpha(\Lambda^{\bar\beta}{}_\alpha \vec e_{\bar\beta} - \vec e_\alpha) = 0$$
Eq. (2.13) was:
$$\vec e_\alpha = \Lambda^{\bar\beta}{}_\alpha \vec e_{\bar\beta}$$
Choosing any $A^\alpha$ with only one non-zero entry, like $(1, 0, 0, 0)$, or $(10, 0, 0, 0)$, shows straight away that
$$\Lambda^{\bar\beta}{}_0 \vec e_{\bar\beta} = \vec e_0$$
which is Eq. (2.13) with $\alpha = 0$. Similarly choosing $A^\alpha$ as $(0, 1, 0, 0)$, or $(0, 2, 0, 0)$, shows straight away
$$\Lambda^{\bar\beta}{}_1 \vec e_{\bar\beta} = \vec e_1.$$
So repeating this argument gives the result for the other two basis vectors.
Perhaps more instructive is to note that this result works for more general situations. The quantity inside the parentheses is a set of 4 different vectors $\vec v_\alpha$,
$$(\Lambda^{\bar\beta}{}_\alpha \vec e_{\bar\beta} - \vec e_\alpha) = \vec v_\alpha$$
Then view the $A^\alpha$ as the coefficients of a linear combination of these vectors $\vec v_\alpha$. Now it’s clear that the RHS is not just the number zero, but the 4-vector $(0, 0, 0, 0)$. The linear combination of the set of $\vec v_\alpha$ must sum to the zero vector for arbitrary coefficients of the linear combination. If the first three terms led to a non-zero vector, say
$$\sum_{\alpha=0}^{2} A^\alpha \vec v_\alpha = (2, 4, 6, 8)$$
then $A^3$ would have to be chosen to bring this to zero. For example, if $\vec v_3 = (1, 2, 3, 4)$ one would have to choose $A^3 = -2$. But since $A^\alpha$ was arbitrary, choosing $A^3 = +2$ would then violate the equality. So this means that the only way it could work is if
$$\sum_{\alpha=0}^{2} A^\alpha \vec v_\alpha = 0$$
and $\vec v_3 = 0$. One can now repeat this argument for $\sum_{\alpha=0}^{1} A^\alpha \vec v_\alpha$ etc. and show that all $\vec v_\alpha$ are the zero 4-vector. And the result Eq. (2.13) holds.
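11 (a) Matrix of $\Lambda^\nu{}_{\bar\mu}(-v)$. Exercise 1.20 was to put the Lorentz transformation in matrix form. Note that $\sinh(-V) = -\sinh(V)$, $\cosh(-V) = \cosh(V)$. So we only have to change the sign of the $\sinh(V)$ elements,
$$\Lambda = \begin{pmatrix} \cosh(V) & \sinh(V) & 0 & 0 \\ \sinh(V) & \cosh(V) & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}$$
where $v = \tanh(V)$.
(b) $A^{\bar\alpha}$ for all $\bar\alpha$.
$$A^{\bar 0} = \cosh(V)A^0 - \sinh(V)A^1 \qquad (2.1)$$
$$A^{\bar 1} = -\sinh(V)A^0 + \cosh(V)A^1 \qquad (2.2)$$
$$A^{\bar 2} = A^2 \qquad (2.3)$$
$$A^{\bar 3} = A^3 \qquad (2.4)$$
(c) Verify Eq. (2.18). Written out in matrix form Eq. (2.18) becomes
$$\begin{pmatrix} \cosh(V) & \sinh(V) & 0 & 0 \\ \sinh(V) & \cosh(V) & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} \cosh(V) & -\sinh(V) & 0 & 0 \\ -\sinh(V) & \cosh(V) & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}.$$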
To show this it’s useful to use the hyperbolic function identity, $\cosh^2(x) - \sinh^2(x) = 1$. Eq. (2.18) follows immediately from matrix multiplication. This identity is easy to derive, and can be found at http://en.wikipedia.org/wiki/Hyperbolic_function#Similarities_to_circular_trigonometric_functions along with other properties.
(d) The Lorentz transformation matrix from $\bar O$ to $O$ is just the matrix in (a). Since $\bar O$ is moving toward increasing $x$ with velocity $v$ with respect to $O$, then from the point of view of $\bar O$, $O$ is moving toward increasing $\bar x$ with velocity $-v$.
(e) $A^\alpha$ for all $\alpha$.
$$A^0 = \cosh(V)A^{\bar 0} + \sinh(V)A^{\bar 1} \qquad (2.5)$$
$$A^1 = \sinh(V)A^{\bar 0} + \cosh(V)A^{\bar 1} \qquad (2.6)$$
$$A^2 = A^{\bar 2} \qquad (2.7)$$
$$A^3 = A^{\bar 3} \qquad (2.8)$$
Substituting (2.1) to (2.4) into the right-hand sides returns the original components $A^0, A^1, A^2, A^3$.
Relation to Eq. (2.18): Multiplying the vector $\vec A$ on the left by the Lorentz transformation matrix $\Lambda(v)$ gives the components in the $\bar O$ frame, $A^{\bar\alpha} = \Lambda^{\bar\alpha}{}_\beta(v)\,A^\beta$. Multiplying this vector on the left by the Lorentz transformation matrix $\Lambda(-v)$ should return the vector to the $O$ frame. And indeed it does, when we use Eq. (2.18) in the final step below:
$$\Lambda^\nu{}_{\bar\alpha}(-v)\,A^{\bar\alpha} = A^\nu \qquad (2.9)$$
$$\Lambda^\nu{}_{\bar\alpha}(-v)\,\Lambda^{\bar\alpha}{}_\beta(v)\,A^\beta = A^\nu \qquad (2.10)$$
$$\delta^\nu{}_\beta\,A^\beta = A^\nu \qquad (2.11)$$
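(f) Verify that the order of applying the transformations doesn’t matter. Physically we know this must be true. Mathematically it works out because if we repeat (c) with the matrices in the opposite order, we get the same result:
$$\begin{pmatrix} \cosh(V) & -\sinh(V) & 0 & 0 \\ -\sinh(V) & \cosh(V) & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} \cosh(V) & \sinh(V) & 0 & 0 \\ \sinh(V) & \cosh(V) & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}.$$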
(g) Establish that $\vec e_\alpha = \delta^\nu{}_\alpha\,\vec e_\nu$. I find this a rather strange question. From the definition of the Kronecker delta function, Eq. (1.4c), the result is immediately obvious. Another way to see this is that the Kronecker delta can be written as the identity matrix. And of course, writing the vector on the RHS as a column vector, multiplying by the identity matrix, gives back the original vector.
12 (b) Remember not to add the velocities linearly, but to use the Einstein law of composition of velocities Eq. (1.13), or use the velocity parameters introduced in Exercise 1.18.
(c) Note that the definition of the magnitude of the vector is analogous to the interval introduced in Chapter 1, see Eq. (2.24).
$$\vec A^2 = -0^2 + (-2)^2 + 3^2 + 5^2 = 38.$$
(d) The magnitude should be independent of the reference frame, because of the invariance of the interval.
13 (a) Transformation of coordinates from O to O is can be constructed in two steps. First transform to O, Aγ = Λγµ (v) Aµ . Then transform from O to O, Aα = Λαγ (v0 ) (Λγµ (v)Aµ ). So the Lorentz transformation from O to O is Λαµ = Λαγ (v0 ) Λγµ (v). (b) I thought we just did show that Eq. (2.41) was the matrix product of the two individual Lorentz transformations. Maybe he means write it out in matrix form? I’m not sure what he’s looking for. (c) The was an important exercise for me because I learned that the Lorentz transformation matrix did not have to be symmetric when there are velocity components in two directions. γ(v)γ(v 0 ) −γ(v)vγ(v 0 ) −γ(v 0 )v 0 0 −γ(v)v γ(v) 0 0 . Λαµ = 0 0 0 0 0 −γ(v)γ(v )v γ(v)vγ(v )v γ(v ) 0 0 0 0 1
(d) Show that the interval is invariant under the above transformation.
(e) Show that the order matters in constructing the Lorentz transformation as in (a), i.e.
$$\Lambda^{\alpha}{}_{\gamma}(v)\,\Lambda^{\gamma}{}_{\mu}(v') \neq \Lambda^{\alpha}{}_{\gamma}(v')\,\Lambda^{\gamma}{}_{\mu}(v).$$
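Using the example from (c), the LHS of the above would be
$$\text{LHS} = \Lambda^{\alpha}{}_{\mu} = \begin{pmatrix} \gamma(v)\gamma(v') & -\gamma(v)v & -\gamma(v)\gamma(v')v' & 0 \\ -\gamma(v)v\,\gamma(v') & \gamma(v) & \gamma(v)v\,\gamma(v')v' & 0 \\ -\gamma(v')v' & 0 & \gamma(v') & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}.$$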
Comparison with the matrix in (c) shows it's different. In fact, another observation, not discussed by Schutz, is that one is the transpose of the other. This can be understood because $(AB)^T = B^T A^T = BA$, where the final equality holds because $A$ and $B$ are symmetric. That the order matters is surprising if we think in a Galilean way. However, mathematically we know that in general matrix multiplication is not commutative, http://en.wikipedia.org/wiki/Matrix_multiplication#Common_properties. Physically we know that the Lorentz transformation results in the axes tilting toward the $t = x$ line, as in Fig. 1.5, and, as with rotations, the order matters. For example, rotating the globe 90° to the east about the polar axis, then 45° clockwise about the axis through the Equator at 90° W and 90° E, puts the coordinates 0° N, 0° E where the South Indian Ocean used to be. But performing the same rotations in the opposite order leaves the coordinates 0° N, 0° E on the old Equatorial plane.
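A quick numerical check of 13(c) to (e) can be sketched as follows. The speeds `V = 0.6` along $x$ and `V_PRIME = 0.8` along $y$, the test event `sample`, and all helper names are illustrative assumptions of mine, not part of the notes above.

```python
import math

def gamma(v):
    """Lorentz factor in units with c = 1."""
    return 1.0 / math.sqrt(1.0 - v * v)

def boost_x(v):
    """Matrix of the boost to a frame moving with speed v along x."""
    g = gamma(v)
    return [[g, -g * v, 0.0, 0.0],
            [-g * v, g, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]

def boost_y(v):
    """Matrix of the boost to a frame moving with speed v along y."""
    g = gamma(v)
    return [[g, 0.0, -g * v, 0.0],
            [0.0, 1.0, 0.0, 0.0],
            [-g * v, 0.0, g, 0.0],
            [0.0, 0.0, 0.0, 1.0]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def matvec(a, x):
    return [sum(a[i][k] * x[k] for k in range(4)) for i in range(4)]

def transpose(a):
    return [list(row) for row in zip(*a)]

def interval(x):
    """Interval -t^2 + x^2 + y^2 + z^2 of the event x from the origin."""
    return -x[0] ** 2 + x[1] ** 2 + x[2] ** 2 + x[3] ** 2

V, V_PRIME = 0.6, 0.8                          # assumed illustrative speeds
lam_c = matmul(boost_y(V_PRIME), boost_x(V))   # order used in part (c)
lam_e = matmul(boost_x(V), boost_y(V_PRIME))   # opposite order, part (e)
sample = [1.0, 0.2, -0.3, 0.5]                 # arbitrary test event

print(lam_c[0][1], lam_e[0][1])                # the two orders differ
print(interval(sample), interval(matvec(lam_c, sample)))
```

The two products differ entry by entry, each is the transpose of the other, and the interval of the test event is unchanged by the composite transformation.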
14 (a) $v = -3/5$ in the positive $z$ direction. The off-diagonal term gives the direction, $-v\gamma = 0.75$, and the diagonal term gives $\gamma = 1.25$. One can confirm that $\gamma = 1/\sqrt{1 - v^2}$ once $v$ is found.
(b) Since it's a Lorentz transformation, the inverse is the Lorentz transformation from $\bar O$ back to $O$,
$$\Lambda(-v) = \begin{pmatrix} 1.25 & 0 & 0 & -0.75 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ -0.75 & 0 & 0 & 1.25 \end{pmatrix}.$$
And matrix multiplication confirms this is the inverse.
(c)
$$\begin{pmatrix} 1.25 & 0 & 0 & -0.75 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ -0.75 & 0 & 0 & 1.25 \end{pmatrix} \begin{pmatrix} 1 \\ 2 \\ 0 \\ 0 \end{pmatrix} = \begin{pmatrix} 1.25 \\ 2 \\ 0 \\ -0.75 \end{pmatrix}.$$
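15 (a) The particle 3-velocity is $\mathbf v = (v, 0, 0)$. In the frame $\bar O$ moving with the particle, the 4-velocity is $\vec e_{\bar 0}$, so $\vec A \to_{\bar O} (1, 0, 0, 0)$. The Lorentz transformation back to the $O$ frame is
$$\Lambda(-v) = \begin{pmatrix} \gamma(v) & v\gamma(v) & 0 & 0 \\ v\gamma(v) & \gamma(v) & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}.$$
So $\vec A$ in the $O$ frame has components $\vec A \to_O (\gamma(v), v\gamma(v), 0, 0)$.
(b) Now take a general particle 3-velocity $\mathbf v = (u, v, w)$. Let's start with the slightly less general 3-velocity $\mathbf v = (u, v, 0)$ to make the algebra easier. One could rotate through an angle $\theta$ to a frame where $\mathbf v = (|\mathbf v|, 0, 0)$. Here $\theta$ is such that
$$\begin{pmatrix} u' \\ v' \end{pmatrix} = \begin{pmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{pmatrix} \begin{pmatrix} u \\ v \end{pmatrix} = \begin{pmatrix} |\mathbf v| \\ 0 \end{pmatrix}.$$
Now we have the situation as in (a), so we can apply the Lorentz transformation back from the particle's frame,
$$\Lambda(-|\mathbf v|) = \begin{pmatrix} \gamma(|\mathbf v|) & |\mathbf v|\gamma(|\mathbf v|) & 0 & 0 \\ |\mathbf v|\gamma(|\mathbf v|) & \gamma(|\mathbf v|) & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}.$$
So $\vec A$ in a frame $O'$ at rest relative to $O$ but rotated through $\theta$ has components $\vec A \to_{O'} (\gamma(|\mathbf v|), |\mathbf v|\gamma(|\mathbf v|), 0, 0)$. Finally we rotate through $-\theta$ to obtain $\vec A$ in the $O$ frame:
$$\vec A \to_O (\gamma(|\mathbf v|),\ |\mathbf v|\gamma(|\mathbf v|)\cos\theta,\ |\mathbf v|\gamma(|\mathbf v|)\sin\theta,\ 0) \tag{2.12}$$
$$= (\gamma(|\mathbf v|),\ u\gamma(|\mathbf v|),\ v\gamma(|\mathbf v|),\ 0). \tag{2.13}$$
Finally, there's no reason for the $z$ component to behave differently, so we can generalize this. For a general particle 3-velocity $\mathbf v = (u, v, w)$, the 4-velocity is
$$\vec A \to_O \gamma(|\mathbf v|)\,(1, \mathbf v) \tag{2.14}$$
$$= (\gamma(|\mathbf v|),\ u\gamma(|\mathbf v|),\ v\gamma(|\mathbf v|),\ w\gamma(|\mathbf v|)), \tag{2.15}$$
where $|\mathbf v| = \sqrt{u^2 + v^2 + w^2}$.
(c) Starting with the 4-velocity components $\{U^\alpha\}$, one can write the 3-velocity as
$$\mathbf v = (U^1/\gamma,\ U^2/\gamma,\ U^3/\gamma),$$
where $\gamma \equiv 1/\sqrt{1 - \mathbf v\cdot\mathbf v} = U^0$.
(d) Applying the above formula, if the 4-velocity is given as $(2, 1, 1, 1)$ then the 3-velocity is $\mathbf v = (1/2, 1/2, 1/2)$. Note the magnitude of the 4-velocity is $-4 + 3 = -1$, making it a legitimate example.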
16 A particle moves with speed $w$, say along the $x$-axis, in a reference frame $\bar O$ that itself moves along the $x$-axis of $O$ with speed $v$. We derive Einstein's velocity addition law from a Lorentz transformation of the particle's 4-velocity. The particle's 4-velocity in reference frame $\bar O$ is $\vec U \to_{\bar O} (\gamma(w), \gamma(w)w, 0, 0)$. The Lorentz transformation from $\bar O$ to $O$ is
$$\Lambda(-v) = \begin{pmatrix} \gamma(v) & v\gamma(v) & 0 & 0 \\ v\gamma(v) & \gamma(v) & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}.$$
So the 4-velocity is $\vec U \to_O (\gamma(w)\gamma(v) + vw\,\gamma(w)\gamma(v),\ v\gamma(v)\gamma(w) + w\gamma(v)\gamma(w),\ 0,\ 0)$. Converting this to the 3-velocity using the formula in 15(c),
$$v_x = \frac{U^x}{U^0} \tag{2.16}$$
$$= \frac{\gamma(v)\gamma(w)(v + w)}{\gamma(v)\gamma(w)(1 + vw)} \tag{2.17}$$
$$= \frac{v + w}{1 + vw}. \tag{2.18}$$
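The derivation can be checked numerically with a small sketch; the function names and the test speeds are my own illustrative choices.

```python
import math

def gamma(v):
    """Lorentz factor in units with c = 1."""
    return 1.0 / math.sqrt(1.0 - v * v)

def compose_via_four_velocity(w, v):
    """Speed in O of a particle moving at w in a frame that moves at v."""
    gw, gv = gamma(w), gamma(v)
    # Lambda(-v) applied to the 4-velocity (gamma(w), gamma(w) w, 0, 0)
    u0 = gv * gw + gv * v * gw * w
    u1 = gv * v * gw + gv * gw * w
    return u1 / u0

def einstein_law(w, v):
    """Closed form (v + w) / (1 + v w)."""
    return (v + w) / (1.0 + v * w)

for w, v in [(0.5, 0.5), (0.9, 0.9), (0.1, -0.3)]:
    print(w, v, compose_via_four_velocity(w, v), einstein_law(w, v))
```

Both routes agree to rounding error, and the composed speed stays below 1 even for two speeds of 0.9.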
17 (a) Prove that any timelike vector $\vec U$ for which $U^0 > 0$ and $\vec U\cdot\vec U = -1$ is the four-velocity of some world line.
The four-velocity is the $\vec e_{\bar 0}$ of the MCRF. If $\vec U$ is some world line's four-velocity, then there exists a Lorentz transformation for which $\vec U \to_{\bar O} (1, 0, 0, 0)$. Let's see if that's possible for the given vector $\vec U$. The coordinate system can be rotated so that $U^\alpha = (U^0, u, 0, 0)$, just to make the algebra simpler. Now apply an arbitrary Lorentz transformation,
$$\begin{pmatrix} 1 \\ 0 \\ 0 \\ 0 \end{pmatrix} = \begin{pmatrix} \gamma(v) & -v\gamma(v) & 0 & 0 \\ -v\gamma(v) & \gamma(v) & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} U^0 \\ u \\ 0 \\ 0 \end{pmatrix},$$
for some $v$ and $\gamma(v)$. Thus we require
$$1 = \gamma(U^0 - vu), \tag{2.19}$$
$$0 = \gamma(u - U^0 v). \tag{2.20}$$
But in general we require $\gamma \ge 1$, so the second equation (2.20) requires $v = u/U^0$. We know $U^0 > 0$ (given) and it follows from the fact that $\vec U$ is timelike that $U^0 > |u|$. Thus $|v| < 1$, so $\gamma(v) \ge 1$ and, most importantly, $\gamma(v) \in \mathbb{R}$, i.e. the Lorentz transformation is possible. Does this required Lorentz transformation also bring the time component to unity? The algebra can get messy, but simplifies if we use the fact that $\vec U\cdot\vec U = -1$. Eliminating $v$ in the first equation (2.19) gives
$$\frac{1}{\gamma(v)} = U^0 - \frac{u^2}{U^0} = \frac{1}{U^0}\bigl((U^0)^2 - u^2\bigr) = \frac{1}{U^0}.$$
So $\gamma = U^0$, which agrees with $\gamma(v) = 1/\sqrt{1 - v^2}$ evaluated at $v = u/U^0$. This proves that the required Lorentz transformation to make $\vec U = \vec e_{\bar 0}$ is possible, which is enough to show that the original 4-vector is the 4-velocity of something. Note that the requirement that $U^0 \ge 1$ is hidden in the requirement that $\vec U\cdot\vec U = -1$.
(b) Use this to prove that for any timelike vector $\vec V$ there is a Lorentz frame in which $\vec V$ has zero spatial components.
The magnitude of a vector is the interval between the origin and the coordinates of the vector. For a timelike interval the vector is timelike, and vice versa. Timelike intervals can be transformed via a Lorentz transformation to have zero spatial part, see Exercise 1.21. The corresponding vector will have zero spatial components. If you haven't done Exercise 1.21, you can construct a proof using part 17(a). We are no longer required to make the time part unity; we only require the space part to be zero, i.e. (2.20), $0 = \gamma(v)(u - V^0 v)$, where $u$ is now defined by $u^2 = V^i V_i$. We no longer have $V^0 > 0$, but that doesn't matter. Because it's a timelike vector we have $(V^0)^2 > V^i V_i = u^2$. So (2.20) now implies that $|v| < 1$ and again $\gamma(v) \in \mathbb{R}$, i.e. the Lorentz transformation is possible.
18 (a) The sum of two spacelike orthogonal vectors is spacelike.
By definition, orthogonal vectors have $\vec A\cdot\vec B = 0$, so
$$(\vec A + \vec B)\cdot(\vec A + \vec B) = \vec A\cdot\vec A + \vec B\cdot\vec B + 2\vec A\cdot\vec B \tag{2.21}$$
$$= \vec A\cdot\vec A + \vec B\cdot\vec B > 0. \tag{2.22}$$
Spacelike vectors have positive magnitude, $\vec A\cdot\vec A > 0$. So $(\vec A + \vec B)$ is also spacelike.
(b) A timelike vector and a null vector cannot be orthogonal.
Call the timelike vector $\vec A$. Let's keep the algebra simple and rotate to a coordinate frame such that the space part of the null vector $\vec N$ is all in one component,
$$\vec N \to_O (N^0, N^1, 0, 0) = (N^0, N^0, 0, 0).$$
Because it's a null vector, $N^0 = N^1$. The timelike vector $\vec A$ has unknown components in this frame, but
$$\vec A\cdot\vec N = -A^0 N^0 + A^1 N^0 = N^0 (A^1 - A^0).$$
But if $(A^1 - A^0) = 0$ then
$$\vec A\cdot\vec A = (A^2)^2 + (A^3)^2 \ge 0,$$
which contradicts the stipulation that $\vec A$ is timelike. Thus $(A^1 - A^0) \neq 0$. Thus if $\vec N$ is not the zero vector then $N^0 \neq 0$ and $\vec A\cdot\vec N \neq 0$, so they are not orthogonal.
19 Consider a uniformly accelerated particle. That is, it has a 4-acceleration $\vec a$ of constant magnitude $\vec a\cdot\vec a = \alpha^2$, where $\alpha^2 \ge 0$ is a constant.
(a) Show that $\vec a$ has constant components in the particle's MCRF, and that these components look like normal, Galilean, acceleration terms.
From Eq. (2.32) the 4-acceleration is always normal to the 4-velocity. But in the MCRF $\vec U = \vec e_0$. Without loss of generality we can take $\vec a$ to have only one spatial component, say in the $x$-direction, with increasing $x$ in the direction of the acceleration. So $\vec a = (a^0, a^1, 0, 0)$ with this orientation of the spatial axes. Then Eq. (2.32) requires that $a^0 = 0$, and the magnitude condition $\vec a\cdot\vec a = \alpha^2$ gives $a^1 = \alpha$:
$$\vec a \to_{\rm MCRF} (0, \alpha, 0, 0).$$
In some small amount of time, say $d\bar t = d\tau$, in the MCRF at point $P$, the 4-velocity will change by
$$d\vec U = d\bar t\,\vec a = d\bar t\,(0, \alpha, 0, 0).$$
Since the 4-velocity started at $\vec e_0$ in the MCRF (by definition), the 4-velocity at the end of this small amount of time will be simply
$$\vec U(d\bar t) \approx (1, \alpha\,d\bar t, 0, 0).$$
The approximation can be made arbitrarily accurate for small $d\bar t$, wherein the relativistic effects are negligible. So the 3-velocity changes during $d\bar t$ from zero to $\alpha\,d\bar t$ in the $x$-direction, just as in Galilean acceleration, which is what we were required to prove. (See Exercise 15(c) for converting 4-velocity to 3-velocity.)
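19 (b) Say $\alpha = 10$ m s$^{-2}$ and the particle starts from rest at $t = 0$. Find the general expression for the speed at time $t$.
The trick is to work in the MCRF to find the change in velocity in some small increment of time $d\bar t$, such that relativistic corrections are small, as in (a) above. Then one finds the new 4-velocity in the MCRF. But then one must transform the new 4-velocity and the time increment back to the original frame using a Lorentz transformation. At some arbitrary point $P$ with coordinates $(t_p, x_p, 0, 0)$ the particle will be moving at speed $v_p$ in the positive $x$-direction. So applying the Lorentz transformation from the MCRF back to the original frame, with $\gamma \equiv \gamma(v_p)$, we find the new 4-velocity is just
$$\vec U \to_O \begin{pmatrix} \gamma & v_p\gamma & 0 & 0 \\ v_p\gamma & \gamma & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} 1 \\ \alpha\,d\bar t \\ 0 \\ 0 \end{pmatrix} = \begin{pmatrix} \gamma[1 + v_p\alpha\,d\bar t] \\ \gamma[v_p + \alpha\,d\bar t] \\ 0 \\ 0 \end{pmatrix}.$$
The new 3-velocity is
$$u_x = \frac{U^1}{U^0} = \frac{\gamma[v_p + \alpha\,d\bar t]}{\gamma[1 + v_p\alpha\,d\bar t]} = \frac{v_p + \alpha\,d\bar t}{1 + v_p\alpha\,d\bar t} \approx [v_p + \alpha\,d\bar t][1 - v_p\alpha\,d\bar t] \approx v_p + \alpha\,d\bar t - v_p^2\alpha\,d\bar t + O(d\bar t^2). \tag{2.23}$$
This change in velocity occurred in time increment $d\bar t$ in the MCRF, which corresponds in $O$ to
$$dt = \gamma\,d\bar t + \gamma v_p\,d\bar x = \gamma\,d\bar t + \gamma v_p \tfrac{1}{2}\alpha\,d\bar t^2 \approx \gamma\,d\bar t. \tag{2.24}$$
The acceleration in $O$ is
$$\frac{du_x}{dt} = \frac{v_p + \alpha\,d\bar t - v_p^2\alpha\,d\bar t - v_p}{dt} = \frac{\alpha\,d\bar t - v_p^2\alpha\,d\bar t}{\gamma\,d\bar t} = \frac{\alpha(1 - v_p^2)}{\gamma} = \frac{\alpha}{\gamma^3}. \tag{2.25}$$
This differential equation can be written
$$\gamma^3\,dv = \alpha\,dt, \tag{2.26}$$
which can be integrated immediately:
$$\int_0^{v_p} \gamma^3\,dv = \int_0^{t_p} \alpha\,dt, \qquad \frac{v_p}{\sqrt{1 - v_p^2}} = \alpha t_p. \tag{2.27}$$
Solving for $v_p$,
$$v_p(t) = \frac{\alpha t_p}{\sqrt{1 + \alpha^2 t_p^2}}. \tag{2.28}$$
We can immediately solve for the distance traveled:
$$\frac{dx}{dt} = v_p(t) = \frac{\alpha t_p}{\sqrt{1 + \alpha^2 t_p^2}}, \qquad dx = \frac{\alpha t_p\,dt}{\sqrt{1 + \alpha^2 t_p^2}}, \qquad x = \frac{1}{\alpha}\left(\sqrt{1 + \alpha^2 t^2} - 1\right), \tag{2.29}$$
where the constant of integration is fixed by $x = 0$ at $t = 0$. Setting $v_p = 0.999$ and $\alpha = 10\ {\rm m\,s^{-2}} = \frac{10}{3\times10^8}\ {\rm s^{-1}}$ we find a time of
$$t_p = 6.7\times10^8\ {\rm s} \approx 21\ \text{years},$$
and expressing $\alpha = 10/(3\times10^8)^2\ {\rm m^{-1}}$,
$$x_p = \frac{(3\times10^8)^2}{10}\left(\sqrt{1 + \frac{\alpha^2[{\rm m/s^2}]^2\,t_p^2[{\rm s}]^2}{(3\times10^8)^2}} - 1\right) \approx 1.9\times10^{17}\ {\rm m}.$$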
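19 (c) Find the elapsed proper time for the particle in (b).
Recall from part (a) that the time increment in the MCRF was the proper time increment, $d\bar t = d\tau$, and this was related to the time increment in the original frame via the Lorentz transformation, with the spatial part playing a negligible role:
$$dt = \gamma\,d\bar t + \gamma v_p\,d\bar x = \gamma\,d\bar t + \gamma v_p \tfrac{1}{2}\alpha\,d\bar t^2 \approx \gamma\,d\bar t. \tag{2.30}$$
So
$$d\tau = d\bar t = \frac{1}{\gamma(v)}\,dt.$$
We can use the expression derived above, (2.26),
$$d\tau = \frac{1}{\gamma(v)}\,dt = \frac{1}{\alpha}\gamma(v)^2\,dv,$$
which can also be integrated immediately:
$$\int_0^{\tau_p} d\tau = \int_0^{v_p} \frac{1}{\alpha}\gamma(v)^2\,dv,$$
$$\tau_p = \frac{1}{\alpha}\,\mathrm{arctanh}(v_p), \tag{2.31}$$
$$\tau_p(t_p) = \frac{1}{\alpha}\,\mathrm{arctanh}\!\left(\frac{\alpha t_p}{\sqrt{1 + \alpha^2 t_p^2}}\right). \tag{2.32}$$
We can solve for the proper time elapsed during the 21 years it took to accelerate the particle to $v = 0.999$:
$$\tau_p = \frac{c\,[{\rm m/s}]}{\alpha\,[{\rm m/s^2}]}\,\mathrm{arctanh}(0.999) = 1.14\times10^8\ {\rm s} \approx 3.6\ \text{years}.$$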
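The numbers in 19(b) and 19(c) can be reproduced with a short sketch in SI units. It uses $t = (c/\alpha)\,v\gamma$ from (2.27), $\tau = (c/\alpha)\,\mathrm{arctanh}\,v$ from (2.31), and for the distance the integral of $v(t)$, $x = (c^2/\alpha)(\sqrt{1 + (\alpha t/c)^2} - 1)$. The rounded constants and function names are my assumptions.

```python
import math

C = 3.0e8        # speed of light in m/s (rounded, as in the notes)
ALPHA = 10.0     # proper acceleration in m/s^2
YEAR = 3.156e7   # seconds per year (approximate)

def coordinate_time(v):
    """Time in O to reach speed v (in units of c), from v*gamma = alpha*t/c."""
    return (C / ALPHA) * v / math.sqrt(1.0 - v * v)

def distance(t):
    """Distance covered in O after coordinate time t, starting from rest."""
    at = ALPHA * t / C
    return (C * C / ALPHA) * (math.sqrt(1.0 + at * at) - 1.0)

def proper_time(v):
    """Proper time elapsed on the particle when it reaches speed v."""
    return (C / ALPHA) * math.atanh(v)

t_p = coordinate_time(0.999)
x_p = distance(t_p)
tau_p = proper_time(0.999)
print(t_p, t_p / YEAR)      # seconds, years
print(x_p)                  # metres
print(tau_p, tau_p / YEAR)  # seconds, years
```

This gives roughly 21 years of coordinate time, about 3.6 years of proper time, and a distance of about $1.9\times10^{17}$ m.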
20 The particle moves in a circle in the $x$-$y$ plane of radius $b$, in a clockwise sense when viewed in the direction of decreasing $z$. The circle translates along the $x$-axis at speed $a$. It's stated that $|b\omega| < 1$, but the requirement for a realistic particle is actually that $|a + b\omega| < 1$. The 3-velocity is computed directly by differentiating the given equations, $\mathbf v \to_O (\dot x, \dot y, 0)$, where
$$\dot x = a + \omega b\cos(\omega t), \tag{2.33}$$
$$\dot y = -\omega b\sin(\omega t). \tag{2.34}$$
The 4-velocity is obtained from the 3-velocity using the formula derived in problem 2.15(b),
$$\vec U \to_O (\gamma(v),\ \dot x\gamma(v),\ \dot y\gamma(v),\ 0) = \gamma(v)\,(1,\ a + b\omega\cos(\omega t),\ -\omega b\sin(\omega t),\ 0), \tag{2.35}$$
where $v = |\mathbf v| = \sqrt{(a + \omega b\cos(\omega t))^2 + (\omega b\sin(\omega t))^2} = \sqrt{a^2 + 2a\omega b\cos(\omega t) + \omega^2 b^2}$.
To obtain the 4-acceleration we require the 4-velocity as a function of proper time, $\tau$, not $t$, the time in the inertial frame. But remember that the proper time is the time measured by a clock at, say, the origin of the MCRF. Call this frame $\bar O$, and then $\bar t = \tau = x^{\bar 0}$. And $t = \Lambda(-v)^{0}{}_{\bar\alpha}\,x^{\bar\alpha}$. For simplicity we choose the MCRF with origin at the particle location, so $x^{\bar\alpha} \to_{\bar O} (\tau, 0, 0, 0)$, and $t = \gamma(-v)\tau = \gamma(v)\tau$. Then we obtain the 4-acceleration from the given equations in $t$ and the chain rule,
$$\vec a \equiv \frac{d\vec U}{d\tau} = \frac{d\vec U}{dt}\frac{dt}{d\tau} = \gamma(v)\frac{d\vec U}{dt}.$$
We now confront the question as to whether or not to let $\gamma(v)$ vary in this derivative! The answer is yes. See my supplementary problem SP.3 in the next section, in which we derived a general expression for the 4-acceleration, see (2.53):
$$\vec a = \gamma^3[\dot x\ddot x + \dot y\ddot y + \dot z\ddot z]\,\vec U + \gamma^2(0, \ddot x, \ddot y, \ddot z). \tag{2.36}$$
Substituting the values for our particular problem we find:
$$\vec a = \gamma^3[-\omega^2 ab\sin(\omega t)]\,\vec U + \gamma^2(0,\ -\omega^2 b\sin(\omega t),\ -\omega^2 b\cos(\omega t),\ 0).$$
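As a sanity check on (2.36), the following sketch evaluates the 4-velocity and 4-acceleration for this trajectory, confirms $\vec U\cdot\vec a = 0$, and compares with a finite-difference estimate of $\gamma\,d\vec U/dt$. The parameter values `A`, `B`, `OMEGA`, `T` and the function names are arbitrary choices of mine.

```python
import math

def kinematics(t, a, b, omega):
    """Velocity and acceleration components of the given trajectory."""
    xdot = a + omega * b * math.cos(omega * t)
    ydot = -omega * b * math.sin(omega * t)
    xddot = -omega ** 2 * b * math.sin(omega * t)
    yddot = -omega ** 2 * b * math.cos(omega * t)
    return xdot, ydot, xddot, yddot

def four_velocity(t, a, b, omega):
    xdot, ydot, _, _ = kinematics(t, a, b, omega)
    g = 1.0 / math.sqrt(1.0 - xdot ** 2 - ydot ** 2)
    return (g, g * xdot, g * ydot, 0.0)

def four_acceleration(t, a, b, omega):
    """Eq. (2.36): gamma^3 (v . vdot) U + gamma^2 (0, vdot)."""
    xdot, ydot, xddot, yddot = kinematics(t, a, b, omega)
    u = four_velocity(t, a, b, omega)
    g = u[0]
    s = g ** 3 * (xdot * xddot + ydot * yddot)
    return (s * u[0], s * u[1] + g * g * xddot, s * u[2] + g * g * yddot, 0.0)

def minkowski_dot(p, q):
    return -p[0] * q[0] + p[1] * q[1] + p[2] * q[2] + p[3] * q[3]

A, B, OMEGA, T = 0.3, 1.0, 0.4, 1.7   # assumed values with |a| + |b*omega| < 1
U = four_velocity(T, A, B, OMEGA)
acc = four_acceleration(T, A, B, OMEGA)

# Finite-difference estimate of dU/dtau = gamma * dU/dt
H = 1.0e-6
u_plus = four_velocity(T + H, A, B, OMEGA)
u_minus = four_velocity(T - H, A, B, OMEGA)
acc_fd = tuple(U[0] * (p - m) / (2.0 * H) for p, m in zip(u_plus, u_minus))

print(minkowski_dot(U, acc))
print(acc, acc_fd)
```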
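21 The motion is hyperbolic in frame $O$,
$$x^2 - t^2 = a^2\cosh^2\!\left(\frac{\lambda}{a}\right) - a^2\sinh^2\!\left(\frac{\lambda}{a}\right) = a^2, \tag{2.37}$$
and therefore hyperbolic in all reference frames, $-\bar t^2 + \bar x^2 = a^2$. The velocity is obtained by differentiating with respect to $\lambda$,
$$v = \frac{dx}{dt} = \frac{dx}{d\lambda}\Big/\frac{dt}{d\lambda} = \tanh\!\left(\frac{\lambda}{a}\right).$$
So we notice that $\lambda/a$ is a velocity parameter for $v$, see problem 1.18. The Lorentz transformation to the MCRF can be written in a simple form with the velocity parameter, see problem 1.20:
$$\Lambda = \begin{pmatrix} \cosh\frac{\lambda}{a} & -\sinh\frac{\lambda}{a} \\ -\sinh\frac{\lambda}{a} & \cosh\frac{\lambda}{a} \end{pmatrix}.$$
Thus we find the points transform to
$$\begin{pmatrix} \bar t(\lambda) \\ \bar x(\lambda) \end{pmatrix} = \begin{pmatrix} \cosh\frac{\lambda}{a} & -\sinh\frac{\lambda}{a} \\ -\sinh\frac{\lambda}{a} & \cosh\frac{\lambda}{a} \end{pmatrix} \begin{pmatrix} a\sinh\frac{\lambda}{a} \\ a\cosh\frac{\lambda}{a} \end{pmatrix} = \begin{pmatrix} 0 \\ a \end{pmatrix}.$$
The particle always ends up on the $\bar x$-axis. To show that the parameter $\lambda$ is the proper time, we show that
$$\frac{d\bar t}{d\lambda} = 1$$
for a MCRF and any $\lambda$. This is a bit subtle, because we want to hold the Lorentz transformation fixed (so hold $\lambda = \lambda_{\rm MCRF}$ fixed), so that the MCRF is inertial. But we want to let $\lambda$ vary about $\lambda = \lambda_{\rm MCRF}$ so we can take the derivative of $\bar t(\lambda)$ with respect to $\lambda$. I've written out this dependence explicitly below:
$$\bar t(\lambda) = \cosh\!\left(\frac{\lambda_{\rm MCRF}}{a}\right) a\sinh\!\left(\frac{\lambda}{a}\right) - \sinh\!\left(\frac{\lambda_{\rm MCRF}}{a}\right) a\cosh\!\left(\frac{\lambda}{a}\right).$$
Now differentiate with respect to $\lambda$, and evaluate at $\lambda = \lambda_{\rm MCRF}$, giving
$$\frac{d\bar t}{d\lambda} = \cosh^2\!\left(\frac{\lambda}{a}\right) - \sinh^2\!\left(\frac{\lambda}{a}\right) = 1.$$
The 4-velocity is
$$\vec U \to_O \left(\cosh\frac{\lambda}{a},\ \sinh\frac{\lambda}{a},\ 0,\ 0\right).$$
The 4-acceleration is easy for this problem because we have the 4-velocity as a function of proper time!
$$\vec a \equiv \frac{d\vec U}{d\tau} = \frac{d\vec U}{d\lambda} \to_O \left(\frac{1}{a}\sinh\frac{\lambda}{a},\ \frac{1}{a}\cosh\frac{\lambda}{a},\ 0,\ 0\right).$$
We can check if it's orthogonal to the 4-velocity, as it should be:
$$\vec U\cdot\vec a = -\frac{1}{a}\sinh\frac{\lambda}{a}\cosh\frac{\lambda}{a} + \frac{1}{a}\sinh\frac{\lambda}{a}\cosh\frac{\lambda}{a} = 0.$$
Is it uniformly accelerating?
$$\vec a\cdot\vec a = -\frac{1}{a^2}\sinh^2\frac{\lambda}{a} + \frac{1}{a^2}\cosh^2\frac{\lambda}{a} = \frac{1}{a^2}.$$
And $a$ was given as constant, and the acceleration is always pointing in the $x$-direction, so it is uniformly accelerating (see the definition in problem 2.19).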
22 (a) Given the 4-momentum $\vec p \to_O (4, 1, 1, 0)$ kg, find:
Energy in $O$: In general $\vec p \to_O (E, p^1, p^2, p^3)$, so $E = 4$ kg.
3-velocity in $O$: In general $m\vec U = \vec p$, where $m$ is the rest mass and $\vec U$ is the 4-velocity. And the 3-velocity is related to the 4-velocity as inferred in problem 2.15(b), $U^\alpha = \Lambda(-|\mathbf v|)^{\alpha}{}_{\bar 0}$. So $\vec p \to_O (m\gamma, mu\gamma, mv\gamma, mw\gamma)$, where $\mathbf v \to_O (u, v, w)$ are the components of the 3-velocity. Note that $E = m\gamma$, and simply dividing through by $E$ gives $\mathbf v \to_O (1/4, 1/4, 0)$.
Rest mass:
$$\gamma = \frac{1}{\sqrt{1 - \mathbf v\cdot\mathbf v}} = \frac{4}{\sqrt{14}}.$$
It then follows from $E = m\gamma = 4$ that $m = \sqrt{14} \approx 3.74$ kg.
(b) We must apply the law of conservation of 4-momentum.
$$\vec p_I = \vec p_1 + \vec p_2 \to_O (5, 0, 1, 0)\ \text{kg}.$$
By conservation of 4-momentum,
$$\vec p_F = \vec p_I = \vec p_3 + \vec p_4 + \vec p_5,$$
so
$$\vec p_5 = \vec p_I - \vec p_3 - \vec p_4 \to_O (3, -1/2, 1, 0)\ \text{kg}.$$
Now, as in part (a), we know the 4-momentum. From an analysis just like in (a), we find the 5th particle has, in this same reference frame, $E_5 \to_O 3$ kg and $\mathbf v_5 \to_O (-1/6, 1/3, 0)$. Finally, the rest mass is $m = \sqrt{31}/2 \approx 2.78$ kg.
The CM frame is found by finding the Lorentz transformation that transforms $\vec p_F$ to have only a time component,
$$\Lambda^{\bar\alpha}{}_{\beta}\,p_F^{\beta} \propto (\vec e_{\bar 0})^{\bar\alpha}.$$
This gives the equation for the $y$-direction,
$$-v\gamma\,(5) + \gamma\,(1) = 0.$$
So the CM has 3-velocity $\mathbf v \to_O (0, 1/5, 0)$.
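The bookkeeping in problem 22 is easy to verify with a few lines. This sketch only uses the 4-momenta quoted above; the helper names are my own.

```python
import math

def rest_mass(p):
    """Rest mass from m^2 = E^2 - |p|^2 (units with c = 1)."""
    return math.sqrt(p[0] ** 2 - p[1] ** 2 - p[2] ** 2 - p[3] ** 2)

def three_velocity(p):
    """3-velocity as spatial momentum divided by energy."""
    return tuple(component / p[0] for component in p[1:])

p_a = (4.0, 1.0, 1.0, 0.0)        # part (a), in kg
p_total = (5.0, 0.0, 1.0, 0.0)    # total 4-momentum in part (b), in kg
p_5 = (3.0, -0.5, 1.0, 0.0)       # third product particle in part (b), in kg

print(rest_mass(p_a), three_velocity(p_a))
print(rest_mass(p_5), three_velocity(p_5))
print(three_velocity(p_total))    # velocity of the CM frame
```

It returns $\sqrt{14} \approx 3.74$ kg for part (a), $\sqrt{31}/2 \approx 2.78$ kg for the fifth particle, and a CM velocity of $(0, 1/5, 0)$.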
23 Find the energy given the 3-velocity and rest mass.
First find the 4-momentum, $\vec p = m\vec U = m\gamma(1, u, v, w)$. And the energy is the time part of the 4-momentum,
$$E = m\gamma.$$
We can find an approximate value of $\gamma$ from the binomial series, http://en.wikipedia.org/wiki/Binomial_series. This is just a Taylor series about $x = 0$. Let $x = -\mathbf v\cdot\mathbf v = -v^2$ and $\alpha = -1/2$, so we obtain
$$\gamma = 1 + \frac{1}{2}v^2 + \frac{3}{8}v^4 + \ldots$$
So
$$E \approx m\left(1 + \frac{1}{2}v^2 + \frac{3}{8}v^4 + \ldots\right),$$
i.e. the rest mass, plus the classical kinetic energy, plus a correction of order $O(v^4)$. The correction is 1/2 the kinetic energy when
$$v = \sqrt{2/3}.$$
24 Show that it's impossible for a positron and an electron to annihilate and produce a single $\gamma$-ray.
Apparently particles come and go, but 4-momentum is conserved. Line up the coordinates such that the $x$-axis is aligned with the direction of propagation of the $\gamma$-ray. Then conservation of 4-momentum, $\vec p_{e^+} + \vec p_{e^-} = \vec p_\gamma$, gives two equations. The time part looks like conservation of energy, $p^0_{e^+} + p^0_{e^-} = p^0_\gamma$, while the spatial part looks like traditional conservation of momentum, $p^1_{e^+} + p^1_{e^-} = p^1_\gamma$. It's important to realize that they are not independent, since in a reference frame wherein the electron and positron move with velocities $v_{e^-}$ and $v_{e^+}$, we have
$$m\bigl(\gamma(v_{e^+}) + \gamma(v_{e^-})\bigr) = h\nu, \tag{2.38}$$
$$m\bigl(\gamma(v_{e^+})\,v_{e^+} + \gamma(v_{e^-})\,v_{e^-}\bigr) = h\nu, \tag{2.39}$$
where $m$ is the rest mass of the electron and positron and $\nu$ is the frequency of the $\gamma$-ray. The only mathematical solution is then $v_{e^-} = v_{e^+} = 1$, which is physically impossible because of their non-zero rest mass. Nothing moves at the speed of light, except electromagnetic radiation and possibly gravity waves if they exist. It's possible to produce two $\gamma$-rays. Suppose they travel in opposite directions with equal and opposite momentum in some frame of reference. Then the final total 3-momentum is zero. To satisfy momentum conservation we only require that the positron and electron have equal and opposite momentum in the same frame of reference, so $v_{e^+} = -v_{e^-}$ with arbitrary $v_{e^+}$, which can obviously be satisfied.
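25 Doppler shift. In frame $O$ the photon has 4-momentum
$$\vec p \to_O (h\nu,\ h\nu\cos\theta,\ h\nu\sin\theta,\ 0).$$
Transforming to the frame $\bar O$ moving at speed $v$ along the $x$-axis, we apply the Lorentz transformation
$$\Lambda(v) = \begin{pmatrix} \gamma(v) & -v\gamma(v) & 0 & 0 \\ -v\gamma(v) & \gamma(v) & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}$$
to obtain
$$\vec p \to_{\bar O} \begin{pmatrix} \gamma(v)h\nu - v\gamma(v)h\nu\cos\theta \\ -v\gamma(v)h\nu + \gamma(v)h\nu\cos\theta \\ h\nu\sin\theta \\ 0 \end{pmatrix}.$$
So the Doppler shift is obtained from the time component, i.e. the first component, and can be expressed as
$$\frac{\bar\nu}{\nu} = \gamma(v)(1 - v\cos\theta) = \frac{1 - v\cos\theta}{\sqrt{1 - v^2}},$$
as given.
(b) No Doppler shift occurs when
$$\frac{\bar\nu}{\nu} = 1 = \frac{1 - v\cos\theta}{\sqrt{1 - v^2}}.$$
So
$$v = \frac{2\cos\theta}{1 + \cos^2\theta} \qquad\text{or}\qquad \theta = \arccos\!\left(\frac{1 - \sqrt{1 - v^2}}{v}\right).$$
Extra questions: Does this have solutions? For $|v| \ll 1$ use the binomial series to see that $\theta \approx \pi/2$. What's the maximum angle of no Doppler shift? As $v \to 1$, $\theta \to 0$. Show that at $v = 1/2$, $\theta \approx 74.5^\circ$.
(c) Eq. (2.35) is the frame-invariant expression for the energy $\bar E$ relative to an observer moving with velocity $\vec U_{\rm obs}$,
$$-\vec p\cdot\vec U_{\rm obs} = \bar E,$$
and Eq. (2.38) was just $E = h\nu$. This calculation ends up being exactly the same as above, but allows one to focus on the relevant parts, i.e. just the time component. Since
$$\vec U_{\rm obs} \to_O (\gamma(v),\ \gamma(v)v,\ 0,\ 0)$$
and recall
$$\vec p \to_O (h\nu,\ h\nu\cos\theta,\ h\nu\sin\theta,\ 0),$$
we can immediately find
$$\bar E = \gamma(v)h\nu - v\gamma(v)h\nu\cos\theta,$$
which was the time component of the $\vec p \to_{\bar O}$ found in (a) above.

26 Energy required to accelerate an object with rest mass $m$ from $v$ to $v + \delta v$, to first order in $\delta v$.
$$E = m\gamma(v) = \frac{m}{\sqrt{1 - v^2}},$$
so the change in energy is just
$$\delta E = m\bigl(\gamma(v + \delta v) - \gamma(v)\bigr).$$
When $v \ll 1$ the problem is easy. Just differentiate $\gamma$ with respect to $v$ to get the Taylor series approximation
$$\gamma(v + \delta v) - \gamma(v) = \gamma'(v)\,\delta v + \frac{1}{2}\gamma''(v)\,\delta v^2 + \ldots$$
where
$$\gamma' = \frac{d\gamma}{dv} = v\gamma^3, \tag{2.40}$$
$$\gamma'' = \gamma^3 + 3v\gamma^2\gamma'. \tag{2.41}$$
So
$$\gamma(v + \delta v) - \gamma(v) = v\gamma^3\,\delta v + O(\delta v^2).$$
And so the change in energy is
$$\delta E \approx mv\gamma^3\,\delta v.$$
A subtlety arises when $v$ is not small. The coefficient $\gamma''$ becomes large relative to $\gamma'$, so ignoring the $O(\delta v^2)$ term becomes misleading. The author should have instructed us to check this. In particular,
$$\frac{\gamma''}{\gamma'} = \frac{1}{v} + 3v\gamma^2.$$
When $v \ll 1$ we can replace
$$\frac{1}{2}\gamma''\,\delta v^2 \approx \gamma'\,\delta v\left(\frac{\delta v}{2v}\right) \ll \gamma'(v)\,\delta v,$$
since we're given that $(\delta v/v) \ll 1$. So we're still justified in ignoring the 2nd term in the Taylor series. But when $v$ is not small we need another approach. The above argument is not formally correct when $v$ is not small because the higher order terms in the Taylor series can no longer be ignored. Here is one approach. Write $v = 1 - \epsilon$ where $0 < \epsilon \ll 1$, so we're close to the speed of light. Use $\epsilon \ll 1$ and the binomial series to simplify $\gamma$,
$$\gamma(v) = \frac{1}{\sqrt{\epsilon(2 - \epsilon)}} \approx \frac{1}{\sqrt{2\epsilon}}\left(1 + \frac{\epsilon}{4}\right),$$
and
$$\gamma(v + \delta v) = \frac{1}{\sqrt{(\epsilon - \delta)(2 - \epsilon + \delta)}},$$
where $\delta = \delta v$. To simplify the latter we need to consider the case where $|\delta| \ll \epsilon$. But this is not so restrictive. Then
$$\gamma(v + \delta v) = \frac{1}{\sqrt{(\epsilon - \delta)(2 - \epsilon + \delta)}} \approx \frac{1}{\sqrt{2\epsilon}}\left(1 + \frac{\delta}{2\epsilon}\right)\left(1 + \frac{\epsilon - \delta}{4}\right).$$
To find the perturbation in energy we take the difference; to leading order,
$$\gamma(v + \delta v) - \gamma(v) \approx \frac{1}{\sqrt{2\epsilon}}\,\frac{\delta}{2\epsilon}.$$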
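It's clear that this difference diverges as $\epsilon \to 0$, i.e. as $v \to 1$.
A simpler and better solution: Write $v = 1 - \epsilon$ where $0 < \epsilon \ll 1$, so we're close to the speed of light. Then
$$\gamma(v) = \frac{1}{\sqrt{\epsilon(2 - \epsilon)}}.$$
Now expand this in a Taylor series in $\epsilon$, noting that increasing $v$ by $\delta v$ changes $\epsilon$ by $-\delta v$:
$$\gamma(v + \delta v) - \gamma(v) = \frac{d\gamma}{d\epsilon}(-\delta v) + \frac{1}{2}\frac{d^2\gamma}{d\epsilon^2}(-\delta v)^2 + \ldots$$
And
$$\frac{d\gamma}{d\epsilon} = \frac{-(1 - \epsilon)}{(2\epsilon)^{3/2}(1 - \epsilon/2)^{3/2}} \approx \frac{-(1 - \epsilon)\left(1 + \frac{3}{4}\epsilon\right)}{(2\epsilon)^{3/2}},$$
where the approximation exploits $0 < \epsilon \ll 1$ with the binomial series. It's important to check the size of the 2nd derivative relative to the first. We find, again using the binomial series,
$$\frac{d^2\gamma}{d\epsilon^2} \approx \frac{-3}{2\epsilon}\frac{d\gamma}{d\epsilon},$$
so we're only justified in ignoring the 2nd term if $|\delta v| \ll \epsilon$. In this case, the change in energy is
$$\delta E \approx m\frac{1}{(2\epsilon)^{3/2}}\,\delta v = m\gamma^3\,\delta v\,\bigl(1 + O(\epsilon)\bigr).$$
This actually agrees with the result we would have obtained from using the simple Taylor series above. We're asked to show that the energy becomes infinite when $v \to 1$. This is easily obtained by noting that $\gamma$ is finite for $0 \le v < 1$. However,
$$\lim_{v\to1}\gamma(v) = \infty.$$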
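The condition $|\delta v| \ll \epsilon$ for problem 26 can be illustrated numerically. The sketch below compares the exact $\delta E$ with the first-order estimate $mv\gamma^3\,\delta v$ at $v = 0.999$ (so $\epsilon = 10^{-3}$); the step sizes and names are my own illustrative choices.

```python
import math

def gamma(v):
    """Lorentz factor in units with c = 1."""
    return 1.0 / math.sqrt(1.0 - v * v)

def delta_e_exact(m, v, dv):
    """Exact energy change m (gamma(v + dv) - gamma(v))."""
    return m * (gamma(v + dv) - gamma(v))

def delta_e_first_order(m, v, dv):
    """First-order estimate m v gamma^3 dv."""
    return m * v * gamma(v) ** 3 * dv

def relative_error(m, v, dv):
    exact = delta_e_exact(m, v, dv)
    return abs(delta_e_first_order(m, v, dv) - exact) / exact

V = 0.999                               # epsilon = 1 - V = 1e-3
err_small = relative_error(1.0, V, 1.0e-6)   # dv / epsilon = 1e-3
err_large = relative_error(1.0, V, 5.0e-4)   # dv / epsilon = 0.5
print(err_small, err_large)
```

With $\delta v/\epsilon = 10^{-3}$ the first-order estimate is good to better than a percent, whereas with $\delta v/\epsilon = 0.5$ it is off by tens of percent.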
27 Increasing temperature increases the rest mass. The object has rest mass $m(T_0) = 10$ kg. Its temperature is increased from $T_0$ to $T$ by adding heat $\delta Q = 100$ J. This must be reflected in an increase in rest mass, since in the MCRF of the object, $U^0 = 1$ and $mU^0 = p^0 = E$. So
$$m(T) = m(T_0)\,[{\rm kg}] + \delta Q\,[{\rm J}]/c^2\,[{\rm m^2/s^2}] = 10 + 1.1\times10^{-15}\ {\rm kg}.$$
This problem is interesting to look at from a thermodynamics point of view. The added heat increases the temperature and enthalpy of the object, which is reflected on a microscopic scale by an increase in the motion, relative to the centre of mass of the object, of the elements (atoms or molecules or sea of electrons, depending on the material) composing the object. This motion increases the effective mass of the elements. Say an element has rest mass $m_i$; then when it has thermal speed $v_i$ it has "relativistic mass" $m_{i,\rm rel} = m_i\gamma(v_i)$. I found this website, which expands on these ideas: http://en.wikipedia.org/wiki/Mass–energy_equivalence.
28 Boring.
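29
$$\frac{d}{d\tau}(\vec U\cdot\vec U) = \frac{d}{d\tau}\left[-(U^0)^2 + (U^1)^2 + (U^2)^2 + (U^3)^2\right] \tag{2.42}$$
$$= -2U^0\frac{dU^0}{d\tau} + 2U^i\frac{dU^i}{d\tau} \tag{2.43}$$
$$= 2\vec U\cdot\frac{d\vec U}{d\tau}. \tag{2.44}$$
Q.E.D.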
30 Four-velocity of rocket ship,
$$\vec U \to_O (2, 1, 1, 1).$$
High-velocity cosmic ray with 4-momentum,
$$\vec P \to_O (300, 299, 0, 0)\times10^{-27}\ {\rm kg}.$$
(a) Transform to the MCRF of the rocket ship. We know from Ex. 2.15 that for a general particle 3-velocity $\mathbf v = (u, v, w)$, the 4-velocity is
$$\vec A \to_O \gamma(|\mathbf v|)\,(1, \mathbf v) \tag{2.45}$$
$$= (\gamma(|\mathbf v|),\ u\gamma(|\mathbf v|),\ v\gamma(|\mathbf v|),\ w\gamma(|\mathbf v|)), \tag{2.46}$$
where $|\mathbf v| = \sqrt{u^2 + v^2 + w^2}$.
Inspection of $\vec U$ reveals that
$$\gamma = 2, \qquad u = 1/2, \qquad v = 1/2, \qquad w = 1/2,$$
and $|\mathbf v| = \sqrt{u^2 + v^2 + w^2} = \sqrt{3}/2$. Now we need the Lorentz transformation for a reference frame moving with a 3-velocity with more than one non-zero component. Up to this point we haven't learned this, and I'm a bit surprised Schutz has thrown this at us now. To lead one through the steps to construct a general Lorentz transformation, I've created supplementary problem SP.1 in section 2.10. Here we note that we actually only need the first row of the Lorentz transformation matrix, since we only require $P^{\bar 0} = \bar E$. This first row must be such that it transforms $\vec U \to_{\bar O} (1, 0, 0, 0)$. Thus it must be related to the components of $\vec U$ as follows:
$$\Lambda^{\bar 0}{}_{0} = U^0, \qquad \Lambda^{\bar 0}{}_{i} = -U^i.$$
Applying $\Lambda^{\bar 0}{}_{\alpha}$ to the given $\vec P$ gives $\bar E = 301\times10^{-27}$ kg in the rocket ship frame.
(b)
$$-\vec P\cdot\vec U_{\rm obs} = E_{\rm obs} = 10^{-27}\,(300\cdot2 - 299\cdot1 - 0\cdot1 - 0\cdot1)\ {\rm kg} = 301\times10^{-27}\ {\rm kg}.$$
(c) Of course (b) was faster. The same computations were performed to get the answer, but in (b) we only did the necessary computations.
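Both routes to the energy in problem 30 fit in a few lines. In this sketch momenta are expressed in units of $10^{-27}$ kg so the arithmetic stays exact; the variable names are mine.

```python
def minkowski_dot(p, q):
    """Scalar product with signature (-, +, +, +)."""
    return -p[0] * q[0] + p[1] * q[1] + p[2] * q[2] + p[3] * q[3]

U = (2, 1, 1, 1)          # 4-velocity of the rocket ship in O
P = (300, 299, 0, 0)      # cosmic ray 4-momentum in O, units of 1e-27 kg

# (a) First row of the boost to the rocket's MCRF: (U^0, -U^1, -U^2, -U^3)
first_row = (U[0], -U[1], -U[2], -U[3])
energy_a = sum(row * comp for row, comp in zip(first_row, P))

# (b) Frame-invariant expression E = -P . U_obs
energy_b = -minkowski_dot(P, U)

# The same row must send U itself to time component 1 in its own rest frame
u_time_component = sum(row * comp for row, comp in zip(first_row, U))

print(energy_a, energy_b, u_time_component)
```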
31 Photon reflects off a mirror without changing frequency $\nu$. The angle of incidence is $\theta$.
This appears to be a straightforward application of conservation of 4-momentum, but it is fun because it gets us thinking about all 4 components. Let the mirror lie in the $y$-$z$ plane, with the photon travelling initially in the $x$-$y$ plane, at angle $\theta$ to the $x$-axis. Then the initial 4-momentum of the photon is written
$$\vec P_i = (h\nu,\ h\nu\cos\theta,\ h\nu\sin\theta,\ 0).$$
First let's construct the 4-momentum of the reflected photon, $\vec P_r$. Since the photon frequency doesn't change, we know instantly the time component, $P_r^0 = P_i^0 = h\nu$. For a smooth mirror we assume (actually I'm just guessing!) that the momentum transferred is only in the $x$-direction. So then we can also construct the components
$$P_r^2 = P_i^2 = h\nu\sin\theta, \qquad P_r^3 = P_i^3 = 0.$$
Recall from Eq. (2.37) that the 4-momentum of a photon is orthogonal to itself. This alone gives us two possibilities, $P_r^1 = \pm h\nu\cos\theta$. For the reflected photon, we choose the minus sign. In summary,
$$\vec P_r = (h\nu,\ -h\nu\cos\theta,\ h\nu\sin\theta,\ 0).$$
By conservation of 4-momentum, we see that the momentum transferred to the mirror must be $\Delta P_m^1 = 2h\nu\cos\theta$ in the $x$-direction. How did the mirror acquire $x$-direction momentum without gaining energy? See Supplementary Problem SP 2 in section 2.10. If the photon is absorbed, then the 4-momentum transferred to the mirror has three non-zero components,
$$\Delta\vec P_m = (\Delta E_m, \Delta P_m^1, \Delta P_m^2, 0) = (h\nu,\ h\nu\cos\theta,\ h\nu\sin\theta,\ 0).$$
How did the mirror acquire the extra energy $\Delta E_m = h\nu$? See Supplementary Problem SP 2 in section 2.10.
32 Derive the Compton scattering relationship, Eq. (2.43).
Initially the total 4-momentum in the particle's initial rest frame $O$ is
$$\vec P \to_O (h\nu_i,\ h\nu_i,\ 0,\ 0) + (m, 0, 0, 0).$$
After the scattering event,
$$\vec P \to_O (h\nu_f,\ h\nu_f\cos\theta,\ h\nu_f\sin\theta,\ 0) + m(\gamma,\ v\gamma\cos\phi,\ v\gamma\sin\phi,\ 0),$$
where $v$ and $\phi$ are the speed and the angle of the particle's scattered trajectory in the $x$-$y$ plane relative to the initial direction of the incident photon. Equating the three nonzero components of 4-momentum gives 3 equations for the 3 unknowns $\nu_f$, $v$, $\phi$. In principle one can then eliminate $v$ and $\phi$ to solve for $\nu_f$ in terms of $\nu_i$ and $\theta$, but I found it too tedious to do so.
33 Compton scattering of a cosmic microwave background radiation photon off a cosmic ray (high-energy proton). What's the maximum frequency of the scattered photon?
Very nice problem. At first it appears very challenging, but the extreme difference in energy between the two particles simplifies things. First we note that in the rest frame of the particle, Compton scattering only reduces the frequency, and more so for less massive particles (see also supplementary problem SP 2 below). So how can Compton scattering increase the energy of the photon? The increase in energy is revealed via the Doppler shift.
The key simplification in this problem is that the Compton scattering in the frame of the particle has very little effect on frequency:
$$\frac{1}{h\nu_i} = 5000\ {\rm eV^{-1}} \gg \frac{1}{m_p} = 10^{-9}\ {\rm eV^{-1}}.$$
So the angle of the Compton scattering has very little effect on the final frequency in the particle's initial rest frame. So in considering the effect of the angle, we need only consider its effect on the Doppler shift. Now the problem is easy. The Doppler shift in frequency is given in general by Eq. (2.42). Obviously to maximize the frequency in the cosmic ray frame, $\bar\nu_i$, we want the photon and cosmic ray traveling along a line in opposite directions, i.e. $\theta = \pi$ radians, for which Eq. (2.42) gives
$$h\bar\nu_i = h\nu_i\frac{1 + v}{\sqrt{1 - v^2}} \approx h\nu_i\frac{2}{\sqrt{1 - v^2}} = h\nu_i\;2\times10^9 = 4\times10^5\ {\rm eV}.$$
The Doppler shift has made a tremendous increase in frequency! The Compton scattering will make very little difference, so to maximize the scattered frequency in the Sun's frame, choose the Compton scattering angle to maximize the Doppler shift. That is, choose the scattering angle to be $\pi$. Eq. (2.43) gives
$$\frac{1}{h\bar\nu_f} = \frac{1}{h\bar\nu_i} + \frac{2}{m_p} = 0.25\times10^{-5} + 2\times10^{-9} \approx 0.25\times10^{-5}\ {\rm eV^{-1}}.$$
Compton scattering caused a negligible decrease in energy in the proton's frame. The proton, like the mirror in problem 31, is massive enough to cause little change in the frequency of the photon in the proton's frame. See also Supplementary problem SP 2. Now Lorentz transform back to the Sun's frame. The photon again gains tremendously from the Doppler shift (that's why we chose the scattering angle to be complete reflection):
$$h\nu_f \approx h\bar\nu_f\;2\times10^9 \approx 8\times10^{14}\ {\rm eV}.$$
This is a very hard $\gamma$-ray. A pair of 511 keV photons arising from annihilation of an electron and positron are considered to be $\gamma$-rays. And this is more than a billion times more energetic than that.
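The chain Doppler shift, Compton backscatter, Doppler shift can be traced numerically. In this sketch the photon energy $2\times10^{-4}$ eV and $m_p \approx 10^9$ eV are read off from the inverse values quoted above, and the proton Lorentz factor $\gamma = 10^9$ is inferred from the $2\times10^9$ Doppler factor; treat all three as assumptions. The head-on factor $(1+v)/\sqrt{1-v^2}$ is written as $\gamma + \sqrt{\gamma^2 - 1}$ to avoid the rounding of $v$ to 1 in floating point.

```python
import math

E_PHOTON = 2.0e-4   # initial photon energy h*nu_i in eV (assumed from 1/(h nu_i) = 5000 / eV)
M_PROTON = 1.0e9    # proton rest energy in eV (rounded, as in the notes)
GAMMA = 1.0e9       # proton Lorentz factor (inferred from the 2e9 Doppler factor)

# Head-on Doppler factor (1 + v) / sqrt(1 - v^2) = gamma (1 + v) = gamma + sqrt(gamma^2 - 1)
doppler = GAMMA + math.sqrt(GAMMA * GAMMA - 1.0)

# Photon energy seen in the proton's rest frame before scattering
e_in_proton_frame = E_PHOTON * doppler

# Compton backscatter (scattering angle pi): 1/E_f = 1/E_i + 2/m
e_out_proton_frame = 1.0 / (1.0 / e_in_proton_frame + 2.0 / M_PROTON)

# Transform back to the Sun's frame: a second head-on Doppler boost
e_final = e_out_proton_frame * doppler

print(doppler, e_in_proton_frame, e_out_proton_frame, e_final)
```

The Compton step lowers the energy in the proton frame by less than a tenth of a percent, and the final photon energy comes out close to $8\times10^{14}$ eV.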
34 These are quite trivial. For example, expand out the dot product in terms of components using the definition in Eq. (2.26), and use the linearity property given by Eq. (2.8):
$$(\alpha\vec A)\cdot\vec B = -\alpha A^0B^0 + \alpha A^1B^1 + \alpha A^2B^2 + \alpha A^3B^3 = \alpha(-A^0B^0 + A^1B^1 + A^2B^2 + A^3B^3) = \alpha(\vec A\cdot\vec B).$$
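35 Show that the $\vec e_{\bar\beta}$ obtained from Eq. (2.15), $\vec e_{\bar\mu} = \Lambda^{\nu}{}_{\bar\mu}(-v)\,\vec e_\nu$, obey
$$\vec e_{\bar\alpha}\cdot\vec e_{\bar\beta} = \eta_{\bar\alpha\bar\beta}. \tag{2.47}$$
$$\vec e_{\bar\alpha}\cdot\vec e_{\bar\beta} = \Lambda^{\nu}{}_{\bar\alpha}(-v)\,\vec e_\nu\cdot\Lambda^{\mu}{}_{\bar\beta}(-v)\,\vec e_\mu = \Lambda^{\nu}{}_{\bar\alpha}\Lambda^{\mu}{}_{\bar\beta}\,\vec e_\nu\cdot\vec e_\mu = \Lambda^{\nu}{}_{\bar\alpha}\Lambda^{\mu}{}_{\bar\beta}\,\eta_{\nu\mu}.$$
The LHS is built from vector dot products, so it shouldn't depend upon the orientation of the coordinate axes. So let's rotate the axes so that $\mathbf v$ is oriented along the $x$-axis. Then
$$\Lambda(v) = \begin{pmatrix} \gamma & -v\gamma & 0 & 0 \\ -v\gamma & \gamma & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}.$$
Note that $\Lambda$ is symmetric so we can interchange indices on one without effect,
$$\vec e_{\bar\alpha}\cdot\vec e_{\bar\beta} = \Lambda^{\bar\beta}{}_{\mu}\Lambda^{\nu}{}_{\bar\alpha}\,\eta_{\nu\mu}.$$
For given $\bar\alpha = \bar\beta$, the RHS looks like the product of a row of $\Lambda^{\bar\beta}{}_{\mu}$ times a column of $\Lambda^{\nu}{}_{\bar\alpha}$. It's easy to see that the result is $-1$ for $\bar\alpha = \bar\beta = 0$ and $+1$ for $\bar\alpha = \bar\beta > 0$. When $\bar\alpha \neq \bar\beta$, the RHS $= 0$. Q.E.D.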
2.10
Rob’s supplemental problems
R.1 Suppose the 4-velocity of a rocket ship is $\vec U \to_O (2, 1, \sqrt2, 0)$ in some reference frame $O$.
(a) Show that the given $\vec U$ is a legitimate 4-velocity. Show that $\vec V \to_O (2, 1, 1, 0)$ is not possible.
(b) Find the 3-velocity in $O$. Hint: see Ex. 2.15. (You'll need this for (c).)
(c) Find the matrix that rotates the spatial coordinates such that the 3-velocity has only one non-zero component, in say the $x$-direction. What's the matrix that rotates the 4-velocity to have only one nonzero spatial component?
(d) Find the inverse rotation matrices for the above. Hint: Think physically and check mathematically, i.e. $R_4^{-1}R_4 = I$.
(e) Find the Lorentz transformation from $O$ to the MCRF of the rocket ship. Confirm that it has the correct effect applied to $\vec U$ itself. Hint: The problem here is that we have so far only seen the Lorentz transformation when the 3-velocity has only one non-zero component. Use your rotation matrix from above and its inverse.
Solution:
(a)
$$\vec U\cdot\vec U = -2^2 + 1^2 + (\sqrt2)^2 = -1,$$
which is consistent with Eq. (2.28). On the other hand,
$$\vec V\cdot\vec V = -2^2 + 1^2 + 1^2 = -2,$$
which is inconsistent with Eq. (2.28).
(b) See the solution to Ex. 2.15:
$$\mathbf v \to_O (1/2,\ \sqrt2/2,\ 0).$$
(c) Rotating anticlockwise through the angle $\theta = \arccos(1/\sqrt3)$ aligns the $x$-axis with the 3-velocity. This is accomplished with the matrix $R_3$,
$$R_3 = \begin{pmatrix} \cos\theta & \sin\theta & 0 \\ -\sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{pmatrix}.$$
For the 4-velocity,
$$R_4 = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & \cos\theta & \sin\theta & 0 \\ 0 & -\sin\theta & \cos\theta & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}.$$
(d) To find the inverse of the rotation matrix just change the sign of the angle!
$$R_4^{-1} = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & \cos\theta & -\sin\theta & 0 \\ 0 & \sin\theta & \cos\theta & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}.$$
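(e) The Lorentz transformation for the case $\Lambda(u, v, 0)$ can be built from the above tools. Consider transforming a vector, $\vec U$:
$$\Lambda \vec U = R_4^{-1}\, \Lambda'(u', 0, 0)\, R_4\, \vec U$$
where
$$\Lambda'(u', 0, 0) = \begin{pmatrix} \gamma(u') & -u'\gamma(u') & 0 & 0 \\ -u'\gamma(u') & \gamma(u') & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}$$
and $u' = |v|$ is the speed along the rotated $x$-axis. So this defines the desired Lorentz transformation $\Lambda(u, v, 0)$,
$$\Lambda(u, v, 0) = \begin{pmatrix} \gamma(|v|) & -u\gamma(|v|) & -v\gamma(|v|) & 0 \\ -u\gamma(|v|) & \gamma(|v|)\cos^2\theta + \sin^2\theta & (\gamma(|v|) - 1)\cos\theta\sin\theta & 0 \\ -v\gamma(|v|) & (\gamma(|v|) - 1)\cos\theta\sin\theta & \gamma(|v|)\sin^2\theta + \cos^2\theta & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \tag{2.48}$$
where $|v| = \sqrt{u^2 + v^2}$ and $\theta = \arctan(v/u)$. It’s straightforward, albeit a bit tedious, to show that
$$\Lambda(u, v, 0)\,\vec U = \begin{pmatrix} 1 \\ 0 \\ 0 \\ 0 \end{pmatrix}.$$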
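The tedious check at the end of R.1(e) can also be done numerically. The following is only a sketch in plain Python (the helper names `matmul` and `matvec` are mine, not from the text): it builds the transformation as rotation, boost along $x$, inverse rotation, compares it with the closed form of Eq. (2.48), and applies it to $\vec U$.

```python
import math

# 4-velocity from R.1 and the 3-velocity components (u, v, 0) derived from it.
U = [2.0, 1.0, math.sqrt(2.0), 0.0]
u, v = U[1] / U[0], U[2] / U[0]

speed = math.hypot(u, v)                   # |v| = sqrt(u^2 + v^2)
gamma = 1.0 / math.sqrt(1.0 - speed ** 2)  # Lorentz factor gamma(|v|)
theta = math.atan2(v, u)                   # theta = arctan(v/u)
c, s = math.cos(theta), math.sin(theta)


def matmul(A, B):
    """Product of two square matrices stored as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def matvec(A, x):
    """Matrix times column vector."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]


R4 = [[1, 0, 0, 0], [0, c, s, 0], [0, -s, c, 0], [0, 0, 0, 1]]
R4_inv = [[1, 0, 0, 0], [0, c, -s, 0], [0, s, c, 0], [0, 0, 0, 1]]
boost_x = [[gamma, -speed * gamma, 0, 0],
           [-speed * gamma, gamma, 0, 0],
           [0, 0, 1, 0],
           [0, 0, 0, 1]]

# Lambda(u, v, 0) = R4^{-1} Lambda'(|v|, 0, 0) R4
Lam = matmul(R4_inv, matmul(boost_x, R4))

# Closed form of Eq. (2.48), for comparison.
Lam_closed = [
    [gamma, -u * gamma, -v * gamma, 0],
    [-u * gamma, gamma * c * c + s * s, (gamma - 1) * c * s, 0],
    [-v * gamma, (gamma - 1) * c * s, gamma * s * s + c * c, 0],
    [0, 0, 0, 1],
]

U_mcrf = matvec(Lam, U)  # components of U in the rocket's MCRF
print(U_mcrf)
```

Up to rounding, the printed components are those of a particle at rest, which is what the MCRF requires.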
R.2 (a) How did the mirror in problem 2.31 acquire $x$-direction momentum without acquiring energy when the photon was reflected? (b) How did it acquire the energy when the photon was absorbed?

Solution: (a) The change in 4-momentum is related to the change in 4-velocity of a massive object,
$$\Delta \vec P_m = m\,\Delta \vec U = m(\Delta\gamma, \Delta(u\gamma), 0, 0) = m(\gamma - 1, u\gamma, 0, 0),$$
where the last equality assumes the mirror is initially at rest. Thus the ratio
$$\frac{\Delta E_m}{\Delta P_m^1} = \frac{\Delta P_m^0}{\Delta (m U^1)} = \frac{\gamma - 1}{u\gamma} = \frac{1}{u}\left(1 - \sqrt{1 - u^2}\right) \approx \frac{u}{2}.$$
The approximation applies in the limit $u \ll 1$ using the binomial series. So the change in energy can be arbitrarily small for a given change in momentum if the change in velocity is correspondingly small. This corresponds to the intuition that a more massive mirror would rebound less for a given momentum transfer. I suspect the imposition of “reflection without change in frequency” is an idealization applicable for massive “mirrors”. Indeed the next problem, 2.32, covers Compton scattering, wherein a photon reflects off a particle of mass $m$. In Eq. (2.43) we see that for $m \gg h\nu_i$, where $\nu_i$ is the incident frequency of the photon, the reflected frequency $\nu_f \approx \nu_i$.
(b) For a massive mirror, the energy must have become mostly thermal energy. For a less massive mirror, more of the energy would go into the translational kinetic energy of the rebound.
R.3 Start with the expression for the 4-velocity in terms of the 3-velocity, with the components of the 3-velocity written as a function of time in an inertial frame $O$:
$$\vec U \to_O \gamma(v)(1, \dot x, \dot y, \dot z)$$
where
$$v = |\mathbf v| = \sqrt{\dot x^2 + \dot y^2 + \dot z^2}.$$
Show that the 4-acceleration is orthogonal to the 4-velocity by using the chain rule to derive a general expression for the 4-acceleration involving derivatives with respect to time.

Solution: The 4-acceleration is defined as
$$\vec a \equiv \frac{d\vec U}{d\tau} \quad \text{Eq. (2.32)}$$
$$= \frac{d}{d\tau}\left[\gamma(v)(1, \dot x, \dot y, \dot z)\right]$$
$$= \frac{dt}{d\tau}\frac{d}{dt}\left[\gamma(v)(1, \dot x, \dot y, \dot z)\right] \tag{2.49}$$
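One can always put the origin of the MCRF at the particle location, so that $x^{\bar\alpha} \to_{\bar O} (\tau, 0, 0, 0)$ and thus for short increments in time,
$$t = \Lambda^{0}{}_{\bar\alpha}\, x^{\bar\alpha} = \gamma(-v)\tau = \gamma(v)\tau \tag{2.50}$$
and
$$\frac{dt}{d\tau} = \gamma(v) \tag{2.51}$$
To resolve (2.49) we also require
$$\frac{d}{dt}\gamma(v) = \frac{d}{dt}\left[\frac{1}{\sqrt{1 - \dot x^2 - \dot y^2 - \dot z^2}}\right] = \gamma^3\left[\dot x\ddot x + \dot y\ddot y + \dot z\ddot z\right] \tag{2.52}$$
Substituting (2.51) and (2.52) into (2.49) we find,
$$\vec a = \gamma\left\{\gamma^3[\dot x\ddot x + \dot y\ddot y + \dot z\ddot z](1, \dot x, \dot y, \dot z) + \gamma(0, \ddot x, \ddot y, \ddot z)\right\}$$
$$= \gamma\left\{\gamma^2[\dot x\ddot x + \dot y\ddot y + \dot z\ddot z]\,\vec U + \gamma(0, \ddot x, \ddot y, \ddot z)\right\}$$
$$= \gamma^3[\dot x\ddot x + \dot y\ddot y + \dot z\ddot z]\,\vec U + \gamma^2(0, \ddot x, \ddot y, \ddot z) \tag{2.53}$$
Now take the dot product with $\vec U$:
$$\vec U \cdot \vec a = \gamma^3[\dot x\ddot x + \dot y\ddot y + \dot z\ddot z]\,\vec U\cdot\vec U + \gamma^2\,\vec U\cdot(0, \ddot x, \ddot y, \ddot z)$$
$$= -\gamma^3[\dot x\ddot x + \dot y\ddot y + \dot z\ddot z] + \gamma^3(1, \dot x, \dot y, \dot z)\cdot(0, \ddot x, \ddot y, \ddot z)$$
$$= 0. \tag{2.54}$$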
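As a sanity check on R.3, here is a small numerical sketch in plain Python. The sample velocity and acceleration are arbitrary values I picked (any $|\mathbf v| < 1$ will do), and the function names are my own. It evaluates Eq. (2.53), compares it with a finite-difference estimate of $\gamma\, d\vec U/dt$, and confirms the orthogonality of Eq. (2.54).

```python
import math


def minkowski_dot(a, b):
    """Scalar product with metric diag(-1, 1, 1, 1)."""
    return -a[0] * b[0] + sum(x * y for x, y in zip(a[1:], b[1:]))


def four_velocity(vel):
    """U = gamma (1, xdot, ydot, zdot), in units with c = 1."""
    gamma = 1.0 / math.sqrt(1.0 - sum(x * x for x in vel))
    return [gamma] + [gamma * x for x in vel]


def four_acceleration(vel, acc):
    """Eq. (2.53): a = gamma^3 (v . vdot) U + gamma^2 (0, xddot, yddot, zddot)."""
    U = four_velocity(vel)
    gamma = U[0]
    v_dot_a = sum(x * y for x, y in zip(vel, acc))
    spatial = [0.0] + list(acc)
    return [gamma ** 3 * v_dot_a * Ui + gamma ** 2 * si
            for Ui, si in zip(U, spatial)]


# Arbitrary sample 3-velocity (|v|^2 = 0.5) and 3-acceleration.
vel = [0.3, -0.4, 0.5]
acc = [1.2, 0.7, -2.0]

U = four_velocity(vel)
a = four_acceleration(vel, acc)

# Independent check of Eq. (2.49): a = (dt/dtau) dU/dt = gamma dU/dt,
# with dU/dt estimated by central differences along v(t) = vel + acc * t.
h = 1e-6
U_plus = four_velocity([x + h * w for x, w in zip(vel, acc)])
U_minus = four_velocity([x - h * w for x, w in zip(vel, acc)])
a_numeric = [U[0] * (p - m) / (2 * h) for p, m in zip(U_plus, U_minus)]

print(minkowski_dot(U, U))  # should be close to -1
print(minkowski_dot(U, a))  # should be close to 0
```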
Chapter 3 Tensor Analysis in Special Relativity
3.1
The metric tensor
We learn that ηαβ (introduced in Eq. (2.27)) is the metric tensor, and it provides a frame-invariant way to write the scalar product of two vectors, Eq. (3.1). I don’t see why this is “frame-invariant” when the RHS depends upon the components, which in turn depend upon the frame. Maybe he means the LHS? Of course the scalar product itself is frame-invariant, and perhaps that’s all he means here. In any case, he wants to talk about tensors in general and is using the metric tensor as a concrete example to get the discussion going.
3.2
Definition of tensors
A $\binom{0}{N}$ tensor is defined as a rule for mapping $N$ vectors to a real number that is linear in its arguments. An ordinary function $y = f(t, x, y, z)$ is a rule for mapping reals onto a real and is classed as a $\binom{0}{0}$ tensor because it takes zero vectors as its input. I find this odd because nowhere in the definition of a tensor is there mention of the dimension of the vectors, and scalars are vectors of dimension 0. But this is just semantics.
Aside on the usage of the term ‘function’
Emphasizes that a regular function $y = f(x)$ is more generally thought of as a rule for associating real values of $x$ with real values of $y$. So we should think of tensors as rules for associating vectors with real scalars. For example $\mathbf g$ is the tensor that associates vectors with their dot product; for example, it associates $\vec A$ and $\vec B$ with the number $\vec A \cdot \vec B$.
Components of a tensor
So the components of a tensor, for a given frame, are the values of the tensor applied to the basis vectors of that frame. This gives new insight into Eq. (2.27) that introduced the metric tensor, for now we see the same equation repeated here as Eq. (3.5), but now with the interpretation of the 16 values of ηαβ being the “components of the metric tensor” in basis vectors Eq. (2.9). And the metric tensor provides the rule associated with the dot product.
3.3
The $\binom{0}{1}$ tensors: one-forms
“Covector” = “covariant vector” = “one-form”. The concept of one-form was so confusing (to me at least) in Misner et al. (1973) that I put their book aside and bought Schutz. Here Schutz comes through brilliantly, helping me through this hurdle.
General properties
Typo in Eq. (3.6b): $\tilde r(\vec A) = \alpha\,\vec p(\vec A)$ should be $\tilde r(\vec A) = \alpha\,\tilde p(\vec A)$.
The set of all one-forms forms a vector space. The axioms of a vector space are given in Appendix A on p. 374. BTW, “an abelian group, also called a commutative group, is a group in which the result of applying the group operation to two group elements does not depend on their order (the axiom of commutativity). Abelian groups generalize the arithmetic of addition of integers”, which I got from http://en.wikipedia.org/wiki/Abelian_group.
Components of one-forms transform in the same way as basis vectors do; see Eq. (3.9). A one-form is frame-independent, see p. 59.
Notation for derivatives
Result Eq. (3.20) is very fundamental, yet I found it a bit weakly explained. Up until this point the gradient was defined only for a scalar field, like $\phi(\vec x)$, c.f. Eq. (3.15). Suddenly, and without comment to that effect, in Eq. (3.20) Schutz applies the gradient operator to the component of a vector, $x^\alpha$. One might be tempted to say: but hold $\alpha$ fixed and then $x^\alpha$ is like a scalar field. I find that unsatisfactory because components of vectors are very different things from scalar fields. Components change under a change of basis, while scalars don’t! In any case, Eq. (3.20) appears out of a hat. I now appreciate the approach of Hobson et al. (2009), who start with manifolds and co-ordinate curves therein, then cover vector and tensor calculus on pseudo-Riemann manifolds. Then the notion of basis vector and basis one-form, and their connection with coordinate curves, appears more naturally. In exercise 34(e) we’ll learn by example that the definition in Eq. (3.20) means that the basis one-forms are not necessarily just the metric applied to the basis vectors.
Normal one-forms In my opinion it is necessary to state that the normal one-form is not the zero one-form. This was specified for instance in Problem 12.
3.5
Metric as a mapping of vectors into one-forms
Why distinguish one-forms from vectors
This is the crucial material missing from Misner et al. (1973). There’s also a nice connection made with row vectors and column vectors and Dirac’s bra and ket vector formulation of quantum mechanics. Typo: the LHS of Eq. (3.45) should be $\vec{d}\phi \to$ because it’s a vector gradient.
3.6
Finally: $\binom{M}{N}$ tensors
Typo: In the paragraph before Eq. (3.55), the equation below should have the RHS in regular, not bold, face type because it’s the components:
$$\mathbf R(\tilde\omega^\alpha; \vec e_\beta) := R^{\alpha}{}_{\beta}$$
Similarly for Eq. (3.55), last line, the RHS should be in regular, not bold, face type because it’s the components. Also the first line of this equation should have $\bar\alpha$ on the RHS, not $\vec\alpha$.
3.8
Differentiation of tensors
This is a very important section for later work. Unfortunately it is a bit rushed. As discussed in my solution to problem 28 of § 3.10, I find that Eq. (3.66) is not clearly explained. Rather than saying that we “deduce” Eq. (3.66) from Eq. (3.65), I’d find it more satisfying if he’d said something like: we choose to define the gradient of a rank 2 tensor as in Eq. (3.66). And in so doing, we can then obtain Eq. (3.65), which is desirable because it appears as a straightforward generalization of Eq. (3.14).
3.10
Exercises
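1(a). The double sum is obviously different because it includes the off-diagonal terms $M_{\alpha\beta}$, when $\alpha \neq \beta$.
(b)
$$A^\alpha B^\beta \eta_{\alpha\beta} = \sum_{\alpha=0}^{3}\sum_{\beta=0}^{3} A^\alpha B^\beta \eta_{\alpha\beta} \tag{3.1}$$
$$= A^0 B^0 (-1) + A^1 B^1 + A^2 B^2 + A^3 B^3 \tag{3.2}$$
using $\eta_{\alpha\beta}$ defined after Eq. (2.7).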
2. To prove that the set of all one-forms is a vector space, we must show that this set meets the axioms (1) and (2) given in Appendix A, p. 374.
Axiom (1): The sum of two one-forms must also be a one-form, which is satisfied by Eq. (3.6a), and the order of summation doesn’t matter, which is satisfied by Eq. (3.6b) because a one-form evaluates to a real and the sum of two reals doesn’t depend upon the order. We also require a zero. The zero one-form gives zero for any vector (see p. 60). So say $\tilde q$ is the zero one-form. Then assuming Eq. (3.6a) and by Eq. (3.6b)
$$\tilde s(\vec A) = \tilde p(\vec A) + \tilde q(\vec A) \tag{3.3}$$
$$= \tilde p(\vec A) + 0 \tag{3.4}$$
$$= \tilde p(\vec A) \tag{3.5}$$
so $\tilde p + 0 = \tilde p$ and we have a zero. Axiom (1) is satisfied.
Axiom (2): There are four requirements to meet. Although it’s not made explicit in Eq. (3.6a), let’s assume $\alpha \in \mathbb{R}$. By Eq. (3.6) it’s clear that multiplication of a one-form by a real scalar meets the requirements of Axiom 2.
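3(a). Show $\tilde p(A^\alpha \vec e_\alpha) = A^\alpha \tilde p(\vec e_\alpha)$.
$$\tilde p(A^\alpha \vec e_\alpha) = \tilde p\left(\sum_{\alpha=0}^{3} A^\alpha \vec e_\alpha\right) \tag{3.6}$$
$$= \sum_{\alpha=0}^{3} A^\alpha\, \tilde p(\vec e_\alpha), \quad \text{by linearity in arguments, c.f. p. 56,} \tag{3.7}$$
$$= A^\alpha \tilde p(\vec e_\alpha) \tag{3.8}$$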
3(b). (i)
$$\tilde p(\vec A) = A^\alpha p_\alpha = -2 + 1 + 0 + 0 = -1$$
(ii) +2, (iii) -7, (iv) -7.
4(a). To show the vectors are linearly independent we require that there is no non-trivial linear combination
$$a\vec A + b\vec B + c\vec C + d\vec D = 0 \tag{3.9}$$
$$\begin{pmatrix} \vec A & \vec B & \vec C & \vec D \end{pmatrix} \begin{pmatrix} a \\ b \\ c \\ d \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \\ 0 \end{pmatrix} \tag{3.10}$$
A non-trivial combination (i.e. not $a = b = c = d = 0$) requires the determinant of the matrix with columns formed by the vectors $\vec A$ through $\vec D$ to be zero. But the determinant is -8.
4(b). Find components of $\tilde p$ given $\tilde p(\vec A) = 1$ etc. Using the definition of components Eq. (3.8), we can write a linear system in the four unknown components:
$$\begin{pmatrix} \vec A \\ \vec B \\ \vec C \\ \vec D \end{pmatrix} \begin{pmatrix} p_0 \\ p_1 \\ p_2 \\ p_3 \end{pmatrix} = \begin{pmatrix} \tilde p(\vec A) \\ \tilde p(\vec B) \\ \tilde p(\vec C) \\ \tilde p(\vec D) \end{pmatrix} \tag{3.11}$$
Note the matrix is written with the rows given by the vectors. This can be solved in MATLAB as follows:
A = [2 1 1 0; 1 2 0 0; 0 0 1 1; -3 2 0 0];
b = [1; -1; -1; 0];
x = A\b;
x = -0.2500 -0.3750 1.8750 -2.8750
4(c). Given
$$\vec E \to_O (1, 1, 0, 0)$$
we easily find that
$$\tilde p(\vec E) = E^\alpha p_\alpha = p_0 + p_1 = -5/8$$
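For readers without MATLAB, the same system (3.11) can be solved exactly with Python’s standard library. This is just a sketch: `solve` is a small Gauss-Jordan helper of my own, not a library routine, and it assumes the matrix is non-singular.

```python
from fractions import Fraction


def solve(matrix, rhs):
    """Gauss-Jordan elimination in exact rational arithmetic.

    Assumes the square matrix is non-singular.
    """
    n = len(matrix)
    aug = [[Fraction(x) for x in row] + [Fraction(r)]
           for row, r in zip(matrix, rhs)]
    for col in range(n):
        # Find a row with a non-zero entry in this column and swap it up.
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pivot_value = aug[col][col]
        aug[col] = [x / pivot_value for x in aug[col]]
        # Eliminate this column from every other row.
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[n] for row in aug]


# Rows are the components of the vectors A, B, C, D from 4(b).
vectors = [[2, 1, 1, 0], [1, 2, 0, 0], [0, 0, 1, 1], [-3, 2, 0, 0]]
values = [1, -1, -1, 0]  # p(A), p(B), p(C), p(D)

p = solve(vectors, values)                   # components p_0 .. p_3
E = [1, 1, 0, 0]
p_of_E = sum(e * c for e, c in zip(E, p))    # 4(c): p(E) = E^alpha p_alpha

print(p, p_of_E)
```

Working with `Fraction` gives the components as exact rationals, so the value of $\tilde p(\vec E)$ in 4(c) comes out as a fraction rather than a rounded decimal.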
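4(d). Given the values of the four one-forms, $\tilde p, \tilde q, \tilde r, \tilde s$ applied to the four known vectors $\vec A, \vec B, \vec C, \vec D$ we can, in principle, find all components of all four one-forms, repeating the procedure we did in 4b. And then one could write a matrix $M$ where the columns of $M$ are taken from the one-form components. If the determinant of $M$ is zero the one-forms are linearly dependent. But that’s a lot of work.
I believe there is a simpler way to test for linear dependence. If the one-forms are linearly dependent, then there are non-trivial real numbers $a, b, c, d$ such that
$$a\tilde p + b\tilde q + c\tilde r + d\tilde s = \tilde t = 0$$
$$\begin{pmatrix} \tilde p & \tilde q & \tilde r & \tilde s \end{pmatrix} \begin{pmatrix} a \\ b \\ c \\ d \end{pmatrix} = \tilde t = \begin{pmatrix} 0 \\ 0 \\ 0 \\ 0 \end{pmatrix} \tag{3.12}$$
But then
$$\tilde t \begin{pmatrix} \vec A \\ \vec B \\ \vec C \\ \vec D \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \\ 0 \end{pmatrix} \tag{3.13}$$
By Eq. (3.6) we have
$$\tilde t \begin{pmatrix} \vec A \\ \vec B \\ \vec C \\ \vec D \end{pmatrix} = \begin{pmatrix} \tilde p(\vec A) & \tilde q(\vec A) & \tilde r(\vec A) & \tilde s(\vec A) \\ \tilde p(\vec B) & \tilde q(\vec B) & \tilde r(\vec B) & \tilde s(\vec B) \\ \tilde p(\vec C) & \tilde q(\vec C) & \tilde r(\vec C) & \tilde s(\vec C) \\ \tilde p(\vec D) & \tilde q(\vec D) & \tilde r(\vec D) & \tilde s(\vec D) \end{pmatrix} \begin{pmatrix} a \\ b \\ c \\ d \end{pmatrix} = \begin{pmatrix} 1 & 0 & 2 & -1 \\ -1 & 0 & 0 & -1 \\ -1 & 1 & 0 & 0 \\ 0 & -1 & 0 & 0 \end{pmatrix} \begin{pmatrix} a \\ b \\ c \\ d \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \\ 0 \end{pmatrix} \tag{3.14}$$
The latter can only be true if the determinant is zero, but
$$\begin{vmatrix} 1 & 0 & 2 & -1 \\ -1 & 0 & 0 & -1 \\ -1 & 1 & 0 & 0 \\ 0 & -1 & 0 & 0 \end{vmatrix} = -2 \tag{3.15}$$
so the one-forms must not be linearly dependent.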
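The two determinants quoted in exercise 4 are easy to confirm by cofactor expansion. A minimal sketch in plain Python follows; the function `det` is my own helper and is only sensible for small matrices like these.

```python
def det(m):
    """Determinant by cofactor expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, entry in enumerate(m[0]):
        if entry == 0:
            continue
        # Minor: delete row 0 and column j.
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total


# 4(a): rows are the vectors A, B, C, D (the transpose has the same determinant).
vectors = [[2, 1, 1, 0], [1, 2, 0, 0], [0, 0, 1, 1], [-3, 2, 0, 0]]

# 4(d): rows are (p(X), q(X), r(X), s(X)) for X = A, B, C, D, as in Eq. (3.14).
one_form_values = [[1, 0, 2, -1], [-1, 0, 0, -1], [-1, 1, 0, 0], [0, -1, 0, 0]]

print(det(vectors), det(one_form_values))
```

Both determinants are non-zero, so neither the vectors nor the one-forms are linearly dependent.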
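5. Justify steps from Eq. (3.10a) to Eq. (3.10d).
$$A^{\bar\alpha} p_{\bar\alpha} = (\Lambda^{\bar\alpha}{}_{\beta} A^\beta)(\Lambda^{\mu}{}_{\bar\alpha}\, p_\mu), \quad \text{by Eq. (2.7) and Eq. (3.9) respectively}$$
$$= (\Lambda^{\mu}{}_{\bar\alpha}\, \Lambda^{\bar\alpha}{}_{\beta})(A^\beta p_\mu), \quad \text{just rearranged the terms} \tag{3.16}$$
We claimed that the transformation for one-forms, $\Lambda^{\mu}{}_{\bar\alpha}$, was the same as for basis vectors, Eq. (3.9).
$$A^{\bar\alpha} p_{\bar\alpha} = (\Lambda^{\mu}{}_{\bar\alpha}\, \Lambda^{\bar\alpha}{}_{\beta})(A^\beta p_\mu)$$
$$= \delta^{\mu}{}_{\beta}\,(A^\beta p_\mu), \quad \text{by Eq. (2.18),}$$
$$= A^\mu p_\mu, \quad \text{by the properties of the Kronecker delta.}$$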
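6. Given a basis $\{\vec e_\alpha\}$ of a frame $O$ and a basis $\{\tilde\lambda^0, \tilde\lambda^1, \tilde\lambda^2, \tilde\lambda^3\}$ for the space of one-forms, with
$$\tilde\lambda^0 \to_O (1, 1, 0, 0) \tag{3.17}$$
$$\tilde\lambda^1 \to_O (1, -1, 0, 0) \tag{3.18}$$
$$\tilde\lambda^2 \to_O (0, 0, 1, -1) \tag{3.19}$$
$$\tilde\lambda^3 \to_O (0, 0, 1, 1) \tag{3.20}$$
6a. Consider an arbitrary one-form $\tilde p$ and vector $\vec A$.
$$\tilde p(\vec e_\alpha)\,\tilde\lambda^\alpha(\vec A) = p_\alpha \tilde\lambda^\alpha(\vec A) \tag{3.21}$$
$$= p_\alpha \tilde\lambda^\alpha(\vec e_\beta) A^\beta \tag{3.22}$$
$$= p_\alpha A^\alpha \quad \text{iff} \quad \tilde\lambda^\alpha(\vec e_\beta) = \delta^{\alpha}{}_{\beta} \tag{3.23}$$
but it is clear that $\tilde\lambda^\alpha(\vec e_\beta) \neq \delta^{\alpha}{}_{\beta}$ by inspection of the given basis.
6b. Given $\tilde p \to_O (1, 1, 1, 1)$, find $l_\alpha$ such that
$$\tilde p = l_\alpha \tilde\lambda^\alpha$$
We can write this as a linear system of equations to solve for $l_\alpha$:
$$l_\alpha\, \tilde\lambda^\alpha(\vec e_\beta) = \tilde p(\vec e_\beta) \tag{3.24}$$
$$(\tilde\lambda^\alpha)_\beta\, l_\alpha = p_\beta \tag{3.25}$$
where I’m using that weird notation introduced in Eq. (2.10). This can be solved in MATLAB with
A = [1 1 0 0; 1 -1 0 0; 0 0 1 1; 0 0 -1 1];
p = [1 1 1 1]';
l = A\p;
l = 1 0 0 1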
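The answer to 6b can be checked by direct substitution into $\tilde p = l_\alpha \tilde\lambda^\alpha$, and the failure of the duality condition in 6a can be tested at the same time. A short sketch in plain Python (variable names are mine):

```python
# Components of the one-form basis from Eqs. (3.17)-(3.20): lam[alpha][beta].
lam = [[1, 1, 0, 0],
       [1, -1, 0, 0],
       [0, 0, 1, -1],
       [0, 0, 1, 1]]

l = [1, 0, 0, 1]  # coefficients l_alpha found in 6b

# p_beta = l_alpha (lambda^alpha)_beta, summed over alpha.
p = [sum(l[alpha] * lam[alpha][beta] for alpha in range(4))
     for beta in range(4)]

# 6a: this basis is dual to {e_beta} only if lambda^alpha(e_beta) = delta^alpha_beta.
is_dual = all(lam[a][b] == (1 if a == b else 0)
              for a in range(4) for b in range(4))

print(p, is_dual)
```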
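7. The proof of Eq. (3.13), we’re told on p. 61, is analogous to the corresponding relation for basis vectors, which was given on p. 37.
Imagine that $\tilde p$ is an arbitrary one-form, and $\vec A$ an arbitrary vector. Let $\vec e_\alpha$ and $\vec e_{\bar\alpha}$ be the basis vectors in frame $O$ and $\bar O$ respectively. The components of $\vec A$ in the two frames are related by
$$\Lambda^{\bar\alpha}{}_{\beta} A^\beta = A^{\bar\alpha},$$
see Eq. (2.7) on p. 35. We seek the transformation relating the components of the one-form in frame $O$ and $\bar O$. Let’s call it $T^{\mu}{}_{\bar\alpha}$. Because $\tilde p(\vec A)$ is frame-independent (after all, it’s just a scalar):
$$\tilde p(\vec A) = A^\alpha p_\alpha = A^{\bar\alpha} p_{\bar\alpha}$$
$$= \Lambda^{\bar\alpha}{}_{\beta} A^\beta p_{\bar\alpha}$$
$$= \Lambda^{\bar\alpha}{}_{\beta} A^\beta\, T^{\mu}{}_{\bar\alpha}\, p_\mu$$
$$= A^\beta p_\mu\, \Lambda^{\bar\alpha}{}_{\beta}\, T^{\mu}{}_{\bar\alpha}, \quad \text{just rearranging terms}$$
At this point I believe Schutz would relabel his indices and replace $\beta$ with $\alpha$. I’m a little uncomfortable with this because my understanding is that the sums on the two sides of the equal sign are equal, but we cannot immediately say the individual terms are equal. We can note however that $\vec A$ is arbitrary, and therefore we can imagine the case where all components of $\vec A$ are zero but one, $\vec A = a\vec e_0$ say, and then:
$$\tilde p(\vec A) = A^0 p_0 = A^\beta p_\mu\, \Lambda^{\bar\alpha}{}_{\beta}\, T^{\mu}{}_{\bar\alpha}, \quad \text{from above}$$
$$= A^0 p_\mu\, \Lambda^{\bar\alpha}{}_{0}\, T^{\mu}{}_{\bar\alpha} \tag{3.26}$$
And similarly, we can imagine $\vec A = a\vec e_1$ etc. In this way we can see that indeed, it is valid to set $\alpha = \beta$.
$$\tilde p(\vec A) = A^\alpha p_\alpha = A^\beta p_\mu\, \Lambda^{\bar\alpha}{}_{\beta}\, T^{\mu}{}_{\bar\alpha}, \quad \text{from above}$$
$$= A^\alpha p_\mu\, \Lambda^{\bar\alpha}{}_{\alpha}\, T^{\mu}{}_{\bar\alpha}, \quad \text{relabel } \beta \text{ with } \alpha \tag{3.27}$$
and now it’s clear that
$$\Lambda^{\bar\alpha}{}_{\alpha}\, T^{\mu}{}_{\bar\alpha} = \delta^{\mu}{}_{\alpha}, \qquad T^{\mu}{}_{\bar\alpha} = \Lambda^{\mu}{}_{\bar\alpha} \tag{3.28}$$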
8. The basis one-form $\tilde d t$, viewed from the $t$-$x$ plane, would be equally spaced straight lines through the $t$-axis. The spacing is one unit of $t$ between surfaces so that $\vec e_0$ would cross one surface. The equation for $\tilde d x^\alpha$ is given in Eq. (3.20) in terms of the basis one-forms, which are written out on the top of page 61.
The basis one-form $\tilde d x$, viewed from the $t$-$x$ plane, would be equally spaced straight lines through the $x$-axis. The spacing is one unit of $x$ between surfaces.
9. The components of $\tilde d T$ are given by Eq. (3.15), and the partial derivatives can be estimated from Fig. 3.5, at least for the $x$ and $y$ directions. It’s not clear what we’re supposed to do for the $t$ and $z$ directions.
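10 (a) It’s obvious that
$$\frac{\partial x^\alpha}{\partial x^\beta} = \delta^{\alpha}{}_{\beta} \tag{3.29}$$
because when $\alpha \neq \beta$ we have terms like, say,
$$\frac{\partial x^0}{\partial x^1} = \frac{\partial t}{\partial x} = 0$$
because $t$ and $x$ are independent variables. But when $\alpha = \beta$ we have terms like, say,
$$\frac{\partial x^3}{\partial x^3} = \frac{\partial z}{\partial z} = 1.$$
10 (b)
$$\frac{\partial x^\beta}{\partial x^\mu} = \delta^{\beta}{}_{\mu}, \quad \text{from (a)}$$
$$\frac{\partial\left(\Lambda^{\beta}{}_{\bar\alpha}\, x^{\bar\alpha}\right)}{\partial x^\mu} = \delta^{\beta}{}_{\mu}, \quad \text{sub co-ordinate transform}$$
$$\Lambda^{\beta}{}_{\bar\alpha}\, \frac{\partial x^{\bar\alpha}}{\partial x^\mu} = \delta^{\beta}{}_{\mu}, \quad \text{transform is a constant}$$
$$\Lambda^{\beta}{}_{\bar\alpha}\, \Lambda^{\bar\alpha}{}_{\mu} = \delta^{\beta}{}_{\mu}, \quad \text{from Eq. (3.18)} \tag{3.30}$$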
11 Eq. (3.14) in different notation:
$$\frac{d\phi}{d\tau} = \phi_{,t}\frac{dt}{d\tau} + \phi_{,x}\frac{dx}{d\tau} + \phi_{,y}\frac{dy}{d\tau} + \phi_{,z}\frac{dz}{d\tau} \tag{3.31}$$
Eq. (3.15) in different notation:
$$\tilde d\phi \to_O (\phi_{,t}, \phi_{,x}, \phi_{,y}, \phi_{,z}) \tag{3.32}$$
Eq. (3.18) in different notation:
$$x^{\beta}{}_{,\bar\alpha} = \Lambda^{\beta}{}_{\bar\alpha} \tag{3.33}$$
12 (a) $\vec V$ is not tangent to surface $S$, so it must have a component in the $\vec e_x$ direction, $V^x \neq 0$.
$$\tilde n(\vec V) = n_x V^x \tag{3.34}$$
To show this is nonzero, we must show that $n_x \neq 0$. The normal one-form is the one-form that is zero for every vector tangent to the surface. For the $x = 0$ surface in 3D space,
$$\tilde n \to_O (n_x, 0, 0) \tag{3.35}$$
where $n_x \neq 0$ will serve as a non-zero normal one-form.
12 (b) Suppose
$$\tilde n(\vec V) = V^\alpha n_\alpha = V^x n_x > 0. \tag{3.36}$$
Suppose $\vec W$ has the same sign of $W^x$ as $V^x$. Then,
$$\tilde n(\vec W) = W^\alpha n_\alpha = W^x n_x > 0. \tag{3.37}$$
12 (c) Any normal to the surface $S$ must have $n_y = 0$ and $n_z = 0$. To be non-zero it requires $n_x \neq 0$. So
$$\tilde n \to_O a(n_x, 0, 0) \tag{3.38}$$
where $a \neq 0$ and $a \in \mathbb{R}$ will serve also as a non-zero normal one-form.
12 (d)
To generalize the above results to a 3D surface in 4D space-time, I found it hard to work with surfaces that are not simply one of the coordinates equals a constant. So I suggest that we require that the surface be sufficiently smooth that we can approximate the surface locally by a tangent plane. Then we can also rotate the coordinates so that the tangent plane is x = 0, and then the above immediately holds.
13 Show that the one-form formed from the gradient of a scalar function f is normal to surfaces of constant value of f .
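Consider an arbitrary point $p = (t, x, y, z)$ where $f(t, x, y, z) = f_p$. Now imagine taking an infinitesimal step $(\Delta t, \Delta x, \Delta y, \Delta z)$ such that the change in value of $f$ vanishes,
$$\Delta f = \frac{\partial f}{\partial t}\Delta t + \frac{\partial f}{\partial x}\Delta x + \frac{\partial f}{\partial y}\Delta y + \frac{\partial f}{\partial z}\Delta z = 0. \tag{3.39}$$
This ensures we don’t leave the surface of constant $f$. So a tangent vector to the surface of constant $f$ is obtained from an arbitrary multiple of such a step:
$$\vec A \to_O a(\Delta t, \Delta x, \Delta y, \Delta z) \tag{3.40}$$
where $a \in \mathbb{R}$ and $a \neq 0$. The gradient one-form evaluated with such a tangent vector is
$$\tilde d f(\vec A) = a\left(\frac{\partial f}{\partial t}\Delta t + \frac{\partial f}{\partial x}\Delta x + \frac{\partial f}{\partial y}\Delta y + \frac{\partial f}{\partial z}\Delta z\right) = 0. \tag{3.41}$$
14 Given
$$\tilde p \to_O (1, 1, 0, 0), \qquad \tilde q \to_O (-1, 0, 1, 0),$$
I chose something very easy: $\vec A = \vec e_0$ and $\vec B = \vec e_1$. Then the computations are easy because we only have one component contributing for each vector:
$$\tilde p \otimes \tilde q(\vec A, \vec B) = \tilde p(\vec A)\,\tilde q(\vec B) = 1 \cdot 0 = 0.$$
While changing the order of the vectors gives:
$$\tilde p \otimes \tilde q(\vec B, \vec A) = \tilde p(\vec B)\,\tilde q(\vec A) = 1 \cdot (-1) = -1 \neq \tilde p \otimes \tilde q(\vec A, \vec B).$$
To find the components we must input the basis vectors. Because there are two vector inputs we have a two-index ($4 \times 4$) array of components:
$$\tilde p \otimes \tilde q(\vec e_\alpha, \vec e_\beta) = \tilde p(\vec e_\alpha)\,\tilde q(\vec e_\beta) = (\tilde p \otimes \tilde q)_{\alpha\beta}.$$
Let’s write these in a matrix where $\alpha$ is the row and $\beta$ is the column:
$$(\tilde p \otimes \tilde q)_{\alpha\beta} = \begin{pmatrix} -1 & 0 & 1 & 0 \\ -1 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix}.$$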
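15 Supply the reasoning leading from Eq. (3.23) to Eq. (3.24).
$$\mathbf f = f_{\alpha\beta}\, \tilde\omega^{\alpha\beta}, \quad \text{what we mean by a basis}$$
$$f_{\mu\nu} = \mathbf f(\vec e_\mu, \vec e_\nu), \quad \text{what we mean by components}$$
$$f_{\mu\nu} = f_{\alpha\beta}\, \tilde\omega^{\alpha\beta}(\vec e_\mu, \vec e_\nu), \quad \text{sub first line into 2nd line}$$
$$\tilde\omega^{\alpha\beta}(\vec e_\mu, \vec e_\nu) = \delta^{\alpha}{}_{\mu}\, \delta^{\beta}{}_{\nu}, \quad \text{solving above, verify by substitution, used Eq. (3.12)} \tag{3.42}$$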
16 (a)
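It’s obvious from the definition Eq. (3.69) that the $\binom{0}{2}$ tensor $\mathbf h_{(S)}$ is symmetric. Just interchange the arguments $\vec A$ and $\vec B$ and you obtain the same result because
$$\tfrac{1}{2}\mathbf h(\vec A, \vec B) \quad \text{and} \quad \tfrac{1}{2}\mathbf h(\vec B, \vec A)$$
are just real numbers and the order of addition doesn’t matter for reals.
16 (b) Similar argument for the antisymmetric $\binom{0}{2}$ tensor $\mathbf h_{(A)}$.
16 (c) Components of the symmetric part of
$$(\tilde p \otimes \tilde q)_{\alpha\beta} = \begin{pmatrix} -1 & 0 & 1 & 0 \\ -1 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix}$$
are
$$(\tilde p \otimes \tilde q)_{(S)} = \begin{pmatrix} -1 & -\tfrac{1}{2} & \tfrac{1}{2} & 0 \\ -\tfrac{1}{2} & 0 & \tfrac{1}{2} & 0 \\ \tfrac{1}{2} & \tfrac{1}{2} & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix}.$$
The antisymmetric part is:
$$(\tilde p \otimes \tilde q)_{(A)} = \begin{pmatrix} 0 & \tfrac{1}{2} & \tfrac{1}{2} & 0 \\ -\tfrac{1}{2} & 0 & \tfrac{1}{2} & 0 \\ -\tfrac{1}{2} & -\tfrac{1}{2} & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix}.$$
16 (d) From the definition of an antisymmetric tensor, Eq. (3.32), we know that
$$\mathbf h(\vec A, \vec B) = -\mathbf h(\vec B, \vec A), \quad \forall\, \vec A, \vec B$$
$$\mathbf h(\vec A, \vec A) = -\mathbf h(\vec A, \vec A), \quad \text{a special case}$$
$$\mathbf h(\vec A, \vec A) = 0, \quad \text{solving for LHS.} \tag{3.43}$$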
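The component matrices in exercises 14 and 16(c) can be generated mechanically from the components of $\tilde p$ and $\tilde q$. Here is a sketch in plain Python, using exact fractions so the halves come out exactly (variable names are mine):

```python
from fractions import Fraction

p = [1, 1, 0, 0]    # components of the one-form p
q = [-1, 0, 1, 0]   # components of the one-form q

# (p tensor q)_{alpha beta} = p_alpha q_beta; alpha is the row, beta the column.
outer = [[pa * qb for qb in q] for pa in p]

# Symmetric and antisymmetric parts: (h_ab + h_ba)/2 and (h_ab - h_ba)/2.
sym = [[Fraction(outer[a][b] + outer[b][a], 2) for b in range(4)]
       for a in range(4)]
antisym = [[Fraction(outer[a][b] - outer[b][a], 2) for b in range(4)]
           for a in range(4)]

for name, matrix in (("outer", outer), ("sym", sym), ("antisym", antisym)):
    print(name)
    for row in matrix:
        print("  ", [str(x) for x in row])
```

Adding the two parts entry by entry recovers the original outer product, which is a quick consistency check on the signs.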
16 (e) Number of independent components of $\mathbf h_{(S)}$. For a general $\binom{0}{2}$ tensor there are $4 \times 4 = 16$ components. But for the symmetric tensor we have Eq. (3.28),
$$f_{\alpha\beta} = f_{\beta\alpha}.$$
So the 4 diagonal components and the 6 components above the diagonal are independent, giving 10 independent components, while the 6 components below the diagonal are determined by the symmetry. For an antisymmetric tensor we have Eq. (3.33),
$$f_{\alpha\beta} = -f_{\beta\alpha},$$
which means the diagonal elements are zero, $f_{\alpha\alpha} = 0$, so there are only 6 independent components in total.
17 (a) This problem takes some time to work through. There must be an easier way, but here is my long-winded solution! In general,
$$\mathbf h(\vec C, \vec A) = h_{\gamma\beta}\, C^\gamma A^\beta.$$
Let’s treat $\vec C$ as an arbitrary vector. We’re told that for arbitrary vectors $\vec A$ and $\vec B$, but with $\vec B \neq 0$,
$$\mathbf h(\ \ , \vec A) = \alpha\, \mathbf h(\ \ , \vec B) \tag{3.44}$$
Suppose $C^\gamma \to_O (1, 0, 0, 0)$. Then
$$h_{0\beta} A^\beta = \alpha\, h_{0\mu} B^\mu \tag{3.45}$$
$$\tilde q(\vec A) = \alpha\, \tilde q(\vec B) \tag{3.46}$$
Each side of (3.45) has the form of a one-form applied to a vector, so we wrote that explicitly in (3.46), with $q_\beta \equiv h_{0\beta}$. So far there’s no restriction on $\tilde q$; we simply choose
$$\alpha = \frac{\tilde q(\vec A)}{\tilde q(\vec B)} \tag{3.47}$$
and we note the stipulation that $\vec B \neq 0$.
Now suppose $C^\gamma \to_O (1, 1, 0, 0)$. Then
$$h_{0\beta} A^\beta + h_{1\beta} A^\beta = \alpha\,(h_{0\mu} B^\mu + h_{1\mu} B^\mu) \tag{3.48}$$
$$h_{1\beta} A^\beta = \alpha\,(h_{1\mu} B^\mu) \tag{3.49}$$
For (3.49) we used (3.45), but now $\alpha$ is no longer a free variable, being set by (3.47). Both (3.45) and (3.49) can be satisfied iff
$$h_{1\beta} = a\, h_{0\beta} \tag{3.50}$$
for an arbitrary $a$. And in general, we see simply by repeating the argument above for different $\vec C$, that
$$h_{\mu\beta} = p_\mu\, h_{0\beta}. \tag{3.51}$$
That is, if the tensor $\mathbf h$ is written as a matrix, the rows are arbitrary scalar multiples $p_\mu$ of the first row. And so,
$$h_{\mu\beta} = p_\mu q_\beta \tag{3.52}$$
so the tensor $\mathbf h$ has the form of an outer product
$$\mathbf h = \tilde p \otimes \tilde q. \tag{3.53}$$
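17 (b) Given a $\binom{1}{1}$ tensor $\mathbf T$, show that $\mathbf T(\ \ ; \vec v)$ is a vector. In general, for a $\binom{1}{1}$ tensor $\mathbf T$,
$$\mathbf T(\tilde p; \vec v) = \mathbf T(p_\alpha \tilde\omega^\alpha; v^\beta \vec e_\beta) \tag{3.54}$$
$$= T^{\nu}{}_{\mu}\, p_\alpha v^\beta\, \tilde u^{\mu}{}_{\nu}(\tilde\omega^\alpha; \vec e_\beta) \tag{3.55}$$
$$= T^{\nu}{}_{\mu}\, p_\nu v^\mu \tag{3.56}$$
I’ve used $\tilde u$ as the basis of $\binom{1}{1}$ tensors, and $\tilde p$ as my one-form argument, because I wanted to keep $\tilde\omega$ as the basis for the one-forms. By equating (3.55) and (3.56) we see that
$$\tilde u^{\mu}{}_{\nu}(\tilde\omega^\alpha; \vec e_\beta) = \delta^{\alpha}{}_{\nu}\, \delta^{\mu}{}_{\beta} \tag{3.57}$$
and we identify $\tilde\omega^\mu(\vec e_\beta) = \delta^{\mu}{}_{\beta}$ from Eq. (3.12) on p. 60, and from the duality of one-forms and vectors we identify $\vec e_\nu(\tilde\omega^\alpha) = \delta^{\alpha}{}_{\nu}$, so
$$\tilde u^{\mu}{}_{\nu} = \vec e_\nu \otimes \tilde\omega^\mu. \tag{3.58}$$
Now I believe we can go back and write $\mathbf T$ acting on the vector alone (Eqs. (3.59)-(3.63)):
$$\mathbf T(\ \ ; \vec v) = \mathbf T(\ \ ; v^\beta \vec e_\beta)$$
$$= T^{\nu}{}_{\mu}\, v^\beta\, \tilde u^{\mu}{}_{\nu}(\ \ ; \vec e_\beta)$$
$$= T^{\nu}{}_{\mu}\, v^\beta\, \vec e_\nu \otimes \tilde\omega^\mu(\ \ ; \vec e_\beta)$$
$$= T^{\nu}{}_{\mu}\, v^\beta\, \tilde\omega^\mu(\vec e_\beta)\, \vec e_\nu$$
$$= T^{\nu}{}_{\mu}\, v^\beta\, \delta^{\mu}{}_{\beta}\, \vec e_\nu$$
$$= T^{\nu}{}_{\mu}\, v^\mu\, \vec e_\nu.$$
And now we’re done because the RHS is a scalar $T^{0}{}_{\mu} v^\mu$ times the basis vector $\vec e_0$ etc., which is a vector! And similarly we can go back and write $\mathbf T$ acting on the one-form alone:
$$\mathbf T(\tilde p;\ \ ) = \mathbf T(p_\alpha \tilde\omega^\alpha;\ \ )$$
$$= T^{\nu}{}_{\mu}\, p_\alpha\, \vec e_\nu \otimes \tilde\omega^\mu(\tilde\omega^\alpha;\ \ )$$
$$= T^{\nu}{}_{\mu}\, p_\alpha\, \tilde\omega^\mu\, \delta^{\alpha}{}_{\nu}$$
$$= T^{\nu}{}_{\mu}\, p_\nu\, \tilde\omega^\mu. \tag{3.64}$$
Again we’re done here because the RHS is clearly a one-form.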
18 (a) Applying the metric tensor to a vector gives the corresponding one-form. The result is simply a change in sign of the first component, so
$$\vec A \to_O (1, 0, -1, 0)$$
results in
$$\tilde A \to_O (-1, 0, -1, 0).$$
Note that there is a caution on p. 69 that the metric tensor will change, but for now we’re still within Special Relativity, and the metric tensor has the simple values given on p. 45.
(b) Applying the inverse metric tensor to a one-form gives the corresponding vector. The inverse metric tensor is the same matrix as the metric tensor, so again, the result is simply a change in sign of the first component, so
$$\tilde p \to_O (3, 0, -1, -1)$$
results in
$$\vec p \to_O (-3, 0, -1, -1).$$
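The index lowering and raising in exercise 18 amounts to one matrix-vector product each. A minimal sketch in plain Python (the function name `apply_metric` is mine); since $\eta^{\alpha\beta}$ and $\eta_{\alpha\beta}$ have the same entries in SR, the same helper serves both directions:

```python
# Minkowski metric eta_{alpha beta}; in SR its inverse has the same entries.
ETA = [[-1, 0, 0, 0],
       [0, 1, 0, 0],
       [0, 0, 1, 0],
       [0, 0, 0, 1]]


def apply_metric(components):
    """Contract eta with a set of components (lowers or raises an index in SR)."""
    return [sum(ETA[a][b] * components[b] for b in range(4)) for a in range(4)]


A_vector = [1, 0, -1, 0]
A_one_form = apply_metric(A_vector)      # 18(a): lower the index

p_one_form = [3, 0, -1, -1]
p_vector = apply_metric(p_one_form)      # 18(b): raise the index

print(A_one_form, p_vector)
```

Applying the helper twice returns the original components, reflecting that $\eta$ is its own inverse here.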
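19 (a) The inverse metric tensor $\{\eta^{\alpha\beta}\}$ was given in Eq. (3.44). It’s a matrix with the same values as the metric tensor itself, given first on p. 45.
$$\begin{pmatrix} -1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} -1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \tag{3.65}$$
19 (b) To derive the formula for the inner product of one-forms in terms of components, Eq. (3.53), we start with the definition Eq. (3.52). This involves only the squares. But first we must establish how to obtain the components of the sum of two one-forms. Intuitively one might guess we just add the components. Indeed this is so, but to establish this rigorously we start with the definition of addition in Eq. (3.6). If $\tilde s = \tilde p + \tilde q$ then $\tilde s(\vec A) = \tilde p(\vec A) + \tilde q(\vec A)$ for all $\vec A$. Suppose
$$\vec A \to_O (1, 0, 0, 0).$$
Then
$$\tilde p(\vec A) = p_\alpha A^\alpha = p_0 \tag{3.66}$$
$$\tilde q(\vec A) = q_\alpha A^\alpha = q_0 \tag{3.67}$$
$$\tilde s(\vec A) = s_\alpha A^\alpha = s_0 = p_0 + q_0. \tag{3.68}$$
Similarly for $\vec A \to_O (0, 1, 0, 0)$ etc. This establishes that Eq. (3.6) implies that to add two one-forms one just adds the components. Now we can expand $(\tilde p + \tilde q)^2$ in terms of components using Eq. (3.50).
$$(\tilde p + \tilde q)^2 = \tilde s^2 = \eta^{\alpha\beta} s_\alpha s_\beta, \quad \text{by Eq. (3.50)}$$
$$= \eta^{\alpha\beta}(p_\alpha + q_\alpha)(p_\beta + q_\beta), \quad \text{component-wise addition of one-forms}$$
$$= \eta^{\alpha\beta}(p_\alpha p_\beta + q_\alpha q_\beta + p_\alpha q_\beta + p_\beta q_\alpha), \quad \text{components are just reals} \tag{3.69}$$
Now we are finally ready to deal with the definition:
$$\tilde p \cdot \tilde q = \tfrac{1}{2}\left[(\tilde p + \tilde q)^2 - \tilde p^2 - \tilde q^2\right], \quad \text{by definition Eq. (3.52)}$$
$$= \tfrac{1}{2}\left[\eta^{\alpha\beta}(p_\alpha p_\beta + q_\alpha q_\beta + p_\alpha q_\beta + p_\beta q_\alpha) - \tilde p^2 - \tilde q^2\right], \quad \text{by (3.69)}$$
$$= \tfrac{1}{2}\left[\eta^{\alpha\beta}(p_\alpha p_\beta + q_\alpha q_\beta + p_\alpha q_\beta + p_\beta q_\alpha) - \eta^{\alpha\beta} p_\alpha p_\beta - \eta^{\alpha\beta} q_\alpha q_\beta\right], \quad \text{by Eq. (3.50)}$$
$$= \tfrac{1}{2}\left[\eta^{\alpha\beta}(p_\alpha q_\beta + p_\beta q_\alpha)\right], \quad \text{after cancelling terms}$$
$$= -p_0 q_0 + p_1 q_1 + p_2 q_2 + p_3 q_3, \quad \text{using } \{\eta^{\alpha\beta}\} \text{ from Eq. (3.44)} \tag{3.70}$$
which is Eq. (3.53).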
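The result of 19(b) can be spot-checked numerically: compute $\tilde p \cdot \tilde q$ once from the definition in terms of squares and once from the component formula. A sketch in plain Python follows; the two sample one-forms are arbitrary choices of mine, and the function names are not from the text.

```python
from fractions import Fraction

ETA_INV = [-1, 1, 1, 1]  # diagonal of eta^{alpha beta}


def square(p):
    """p^2 = eta^{alpha beta} p_alpha p_beta, as in Eq. (3.50)."""
    return sum(e * x * x for e, x in zip(ETA_INV, p))


def dot_from_squares(p, q):
    """Definition Eq. (3.52): half of ((p + q)^2 - p^2 - q^2)."""
    s = [x + y for x, y in zip(p, q)]  # one-forms add component-wise
    return Fraction(square(s) - square(p) - square(q), 2)


def dot_from_components(p, q):
    """Eq. (3.53): -p0 q0 + p1 q1 + p2 q2 + p3 q3."""
    return sum(e * x * y for e, x, y in zip(ETA_INV, p, q))


p = [3, 0, -1, -1]   # sample one-form components
q = [-1, 0, 1, 0]

print(dot_from_squares(p, q), dot_from_components(p, q))
```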
20 Suppose we’re in Euclidean 3-space in Cartesian coordinates.
(a) We want to show that
$$A^{\bar\alpha} = \Lambda^{\bar\alpha}{}_{\beta}\, A^\beta \quad \text{and} \quad P_{\bar\beta} = \Lambda^{\alpha}{}_{\bar\beta}\, P_\alpha \tag{3.71}$$
are the same transformation since the matrix $\{\Lambda^{\bar\alpha}{}_{\beta}\}$ is equal to the transpose of its inverse. The inverse transformation takes us back from the $\bar O$ frame to the $O$ frame, and is thus written,
$$\{\Lambda^{\bar\alpha}{}_{\beta}\}^{-1} = \{\Lambda^{\alpha}{}_{\bar\beta}\}. \tag{3.72}$$
If the one-form components on the RHS of (3.71), $\{P_\alpha\}$, are written as a column matrix, then the transformation in (3.71) amounts to multiplying the column vector by the matrix
$$\{\Lambda^{\alpha}{}_{\bar\beta}\}^T = \{\Lambda^{\bar\beta}{}_{\alpha}\}.$$
And if the original matrix is orthogonal then we’re back to the same matrix.
(b) All we’re given is that the metric tensor for Cartesian 3-space is $\{\delta_{ij},\ i, j = 1, 2, 3\}$. The metric tensor is used in forming the inner product of vectors, which we know must be frame invariant. So let’s write the inner product between two 3-space vectors in two different frames,
$$\delta_{\bar i \bar j}\, A^{\bar i} B^{\bar j} = \vec A \cdot \vec B = \delta_{kl}\, A^k B^l = \delta_{kl}\,(A^{\bar i} \Lambda^{k}{}_{\bar i})(B^{\bar j} \Lambda^{l}{}_{\bar j}) \tag{3.73}$$
and so upon cancelling the $A^{\bar i} B^{\bar j}$ on either side and rearranging we see that
$$\delta_{\bar i \bar j} = \Lambda^{k}{}_{\bar i}\, \Lambda^{l}{}_{\bar j}\, \delta_{kl}, \tag{3.74}$$
as required. If one were uncomfortable with cancelling the $A^{\bar i} B^{\bar j}$ on either side, one can write
$$\delta_{\bar i \bar j}\, A^{\bar i} B^{\bar j} = \vec A \cdot \vec B = \delta_{kl}\, A^k B^l = \delta_{kl}\,(A^{\bar m} \Lambda^{k}{}_{\bar m})(B^{\bar n} \Lambda^{l}{}_{\bar n}). \tag{3.75}$$
And then because $\vec A$ and $\vec B$ were arbitrary we can consider cases like
$$\vec A \to_{\bar O} (1, 0, 0).$$
Then there is only one non-zero component, $A^{\bar i}$ or $A^{\bar m}$, on each side, and we can divide through by this single component. And this is true for all the components $A^{\bar i}$ and $B^{\bar i}$. So again, we assert (3.74).
$$\delta_{\bar i \bar j} = \Lambda^{k}{}_{\bar i}\, \Lambda^{l}{}_{\bar j}\, \delta_{kl}, \quad \text{from above} \tag{3.76}$$
$$= \Lambda^{k}{}_{\bar i}\, \Lambda^{k}{}_{\bar j}, \quad \text{after summing over } l. \tag{3.77}$$
The RHS of (3.77) is the product of a matrix by its transpose, and for this to equal the identity matrix (i.e. the LHS), we require the matrix to be orthogonal.
And now I know why we never learned one-forms in undergrad, and called the gradient of a scalar field a vector. Incidentally, it was precisely this point that I didn’t understand in reading the explanation of a one-form by Misner et al. (1973) that made me give up on their book and try Schutz’s book. I couldn’t see the difference between their description of a one-form and the gradient (and of course there is no difference) but I also thought I “knew” that the gradient of a scalar field was a vector. So hats off to Schutz for making this point clear. Though perhaps if I had persisted with Misner et al. (1973) all would have been fine in the end.
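To make exercise 20 concrete, one can take a specific orthogonal transformation and confirm both claims. The sketch below is plain Python, with a rotation about the $z$-axis by an arbitrary angle chosen by me as the example; it checks that the transpose of the inverse matrix equals the original matrix, and that Eq. (3.77) gives the identity.

```python
import math

theta = 0.7  # arbitrary rotation angle about the z-axis (illustrative value)
c, s = math.cos(theta), math.sin(theta)

# Lambda^{ibar}_j: transforms vector components from O to Obar.
lam = [[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]]
# Inverse transformation Lambda^{i}_{jbar}: rotate back through -theta.
lam_inv = [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def transpose(m):
    """Swap rows and columns."""
    return [list(col) for col in zip(*m)]


def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


# One-form components transform with the transpose of the inverse matrix.
one_form_matrix = transpose(lam_inv)

# Eq. (3.77): sum over k of Lambda^k_ibar Lambda^k_jbar should be delta_ij.
gram = matmul(transpose(lam_inv), lam_inv)

print(one_form_matrix == lam)
print(gram)
```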
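21 (a) Starting with the $t = 0$ boundary and moving counterclockwise, so next the $x = 1$ boundary, let’s call the normal one-forms $\tilde a, \tilde b, \tilde c, \tilde d$. An outward normal one-form must give a positive number when applied to an outward-pointing vector. On the $t = 0$ boundary outward vectors have $V^0 < 0$, so $\tilde a(\vec V) = a_0 V^0 > 0$ requires $a_0 < 0$; similarly $c_0 > 0$ on the $t = 1$ boundary. These take on the values:
$$\tilde a \to_O (a_0, 0), \quad a_0 < 0 \tag{3.78}$$
$$\tilde b \to_O (0, b_1), \quad b_1 > 0 \tag{3.79}$$
$$\tilde c \to_O (c_0, 0), \quad c_0 > 0 \tag{3.80}$$
$$\tilde d \to_O (0, d_1), \quad d_1 < 0. \tag{3.81}$$
To obtain the corresponding vectors we simply change the sign of the time component:
$$\vec a \to_O (-a_0, 0), \quad a_0 < 0 \tag{3.82}$$
$$\vec b \to_O (0, b_1), \quad b_1 > 0 \tag{3.83}$$
$$\vec c \to_O (-c_0, 0), \quad c_0 > 0 \tag{3.84}$$
$$\vec d \to_O (0, d_1), \quad d_1 < 0. \tag{3.85}$$
The normal vectors in the time direction, $\vec a$ and $\vec c$, look odd because they point inward. But the metric is such,
$$\eta = \begin{pmatrix} -1 & 0 \\ 0 & 1 \end{pmatrix},$$
that their scalar product with vectors that point outward is positive.
21 (b) The first challenge is finding out what is meant by the “null boundary”. I would guess it’s the surface in the direction of the null vector, a null vector being one whose inner product with itself is zero, e.g.:
$$\vec V \to_O (1, 1).$$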
This vector has the strange property of being orthogonal to itself, see p. 45. The other two boundaries, $x = 1$ and $t = 1$, are easily named, and so process of elimination also points to the boundary between $(1, 0)$ and $(2, 1)$ as the “null boundary”. The outward normal one-form, $\tilde c$, is easily found:
$$\tilde c(\vec V) = 0, \quad \text{definition of normal} \tag{3.86}$$
$$= c_\alpha V^\alpha \tag{3.87}$$
$$= c_0 V^0 + c_1 V^1 \tag{3.88}$$
$$= c_0 + c_1. \tag{3.89}$$
To ensure the outward normal, we require
$$c_0(1) + c_1(-1) = c_0 - c_1 > 0. \tag{3.90}$$
For example,
$$\tilde c \to_O (1, -1).$$
The associated vector is,
$$\vec c \to_O (-1, -1).$$
22 To show that vectors form a vector space, when introduced as functions that take one-forms as arguments, we would proceed as in section 3.3, interchanging the roles of vectors and one-forms in Eq. (3.6). That is, we would define the addition of vectors and multiplication of a vector by a scalar as follows. Suppose
$$\vec A = \vec B + \vec C \tag{3.91}$$
$$\vec D = a\vec B, \quad a \in \mathbb{R}, \tag{3.92}$$
then we require for all one-form arguments $\tilde p$,
$$\vec A(\tilde p) = \vec B(\tilde p) + \vec C(\tilde p) \tag{3.93}$$
$$\vec D(\tilde p) = a\vec B(\tilde p), \quad a \in \mathbb{R}, \tag{3.94}$$
in analogy to Eq. (3.6b).
We need a zero, which is provided by the zero vector, the one that is zero for any one-form argument:
$$\vec V(\tilde q) = 0, \quad \forall\, \tilde q.$$
The two axioms of Appendix A are now clearly satisfied. (See also problems 2 and 23.)
23 (a) Prove that the set of all $\binom{M}{N}$ tensors for fixed $M$ and $N$ forms a vector space.
This is like question 2. But now we need to define what we mean by the addition of two $\binom{M}{N}$ tensors and the multiplication of an $\binom{M}{N}$ tensor by a scalar. Of course we are guided by Eq. (3.6). That is, we note that $\binom{M}{N}$ tensors produce real numbers that can be added like real numbers, so the generalization of Eq. (3.6) is trivial. The tensor $\mathbf S$ where
$$\mathbf S = \mathbf P + \mathbf Q \tag{3.95}$$
is defined to be that which gives the sum of the two values obtained by applying the input to $\mathbf P$ and $\mathbf Q$. That is,
$$\mathbf S(\tilde a^1, \tilde a^2, \ldots, \tilde a^M; \vec b_1, \vec b_2, \ldots, \vec b_N) = \mathbf P(\tilde a^1, \tilde a^2, \ldots, \tilde a^M; \vec b_1, \vec b_2, \ldots, \vec b_N) + \mathbf Q(\tilde a^1, \tilde a^2, \ldots, \tilde a^M; \vec b_1, \vec b_2, \ldots, \vec b_N) \tag{3.96}$$
where I have invented the notation
$$(\tilde a^1, \tilde a^2, \ldots, \tilde a^M; \vec b_1, \vec b_2, \ldots, \vec b_N)$$
to show the $M$ one-form inputs and $N$ vector inputs. The choice of one-forms first, we’ll see later, gives the basis in the order Schutz gave in 23b. I have followed the convention that superscript integers are used as indices of different one-forms. That is, $\tilde a^1$ and $\tilde a^2$ are two different one-forms, not components of the same one-form. Similarly, subscripts are used to denote different vectors. In analogy with Eq. (3.6b) we can define multiplication of an $\binom{M}{N}$ tensor by a scalar $\alpha$,
$$\mathbf R = \alpha \mathbf P \tag{3.97}$$
to be the tensor that, for a given input, gives just $\alpha$ times the real number produced by supplying the input to $\mathbf P$:
$$\mathbf R(\tilde a^1, \tilde a^2, \ldots, \tilde a^M; \vec b_1, \vec b_2, \ldots, \vec b_N) = \alpha\, \mathbf P(\tilde a^1, \tilde a^2, \ldots, \tilde a^M; \vec b_1, \vec b_2, \ldots, \vec b_N) \tag{3.98}$$
The set of $\binom{M}{N}$ tensors for fixed $M$ and $N$ forms a vector space by the same argument as given for question 2. Perhaps it’s worth making explicit what we mean by the zero $\binom{M}{N}$ tensor. This is the tensor that gives zero for any input, $(\tilde a^1, \tilde a^2, \ldots, \tilde a^M; \vec b_1, \vec b_2, \ldots, \vec b_N)$. The set of all $\binom{M}{N}$ tensors, with (3.95) and (3.96), then meets axiom (1) in Appendix A: $\binom{M}{N}$ tensors form an abelian group with the operation of addition.
23(b) Prove that the basis for the vector space formed from the set of $\binom{M}{N}$ tensors for fixed $M$ and $N$ is the set:
\[ \{\vec e_\alpha \otimes \vec e_\beta \otimes \ldots \otimes \vec e_\gamma \otimes \tilde\omega^\mu \otimes \tilde\omega^\nu \otimes \ldots \otimes \tilde\omega^\lambda\} \tag{3.99} \]
with $M$ vectors labeled with $\alpha \ldots \gamma$ and $N$ one-forms labeled $\mu \ldots \lambda$.
This is a nice question because it forces us to think about what we mean by a basis. The answer is a straightforward generalization of the argument for the basis of the $\binom{0}{2}$ tensors starting at the bottom of page 66 and ending with Eq. (3.26) on p. 67. The notation is cumbersome because one needs to refer to $M$ superscripts and $N$ subscripts where $M$ and $N$ are arbitrary. In defining the basis (3.99) Schutz has used a series of Greek letters like $\alpha \ldots \gamma$. I've decided to put subscript indices on the Greek letters, $\alpha_1, \alpha_2, \ldots, \alpha_M$. That way I can be explicit about how many there are. Remember that each Greek letter index can take on 4 values, e.g. $\alpha_1 \in \{0, 1, 2, 3\}$, corresponding to the four dimensions.
As in Eq. (3.23) we write the $\binom{M}{N}$ tensor as a sum of components times the basis that we seek:
\[ \mathbf{R} = R^{\alpha_1,\alpha_2,\ldots,\alpha_M}{}_{\beta_1,\beta_2,\ldots,\beta_N}\; \tilde\omega^{\beta_1,\beta_2,\ldots,\beta_N}{}_{\alpha_1,\alpha_2,\ldots,\alpha_M} \tag{3.100} \]
And furthermore, the components correspond to the real values produced by applying the tensor to arguments that are the basis one-forms and basis vectors. So,
\[ R^{\alpha_1,\alpha_2,\ldots,\alpha_M}{}_{\beta_1,\beta_2,\ldots,\beta_N} = \mathbf{R}(\tilde\omega^{\alpha_1}, \tilde\omega^{\alpha_2}, \ldots, \tilde\omega^{\alpha_M}; \vec e_{\beta_1}, \vec e_{\beta_2}, \ldots, \vec e_{\beta_N}) \tag{3.101} \]
which is the generalization of the formula given between Eq. (3.23) and Eq. (3.24) on p. 67. Now, we simply substitute the tensor (3.100) into (3.101) to obtain:
\[ R^{\mu_1,\mu_2,\ldots,\mu_M}{}_{\nu_1,\nu_2,\ldots,\nu_N} = R^{\alpha_1,\alpha_2,\ldots,\alpha_M}{}_{\beta_1,\beta_2,\ldots,\beta_N}\; \tilde\omega^{\beta_1,\beta_2,\ldots,\beta_N}{}_{\alpha_1,\alpha_2,\ldots,\alpha_M}(\tilde\omega^{\mu_1}, \tilde\omega^{\mu_2}, \ldots, \tilde\omega^{\mu_M}; \vec e_{\nu_1}, \vec e_{\nu_2}, \ldots, \vec e_{\nu_N}) \tag{3.102} \]
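This implies the analogue to Eq. (3.24),
\[ \tilde\omega^{\beta_1,\beta_2,\ldots,\beta_N}{}_{\alpha_1,\alpha_2,\ldots,\alpha_M}(\tilde\omega^{\mu_1}, \tilde\omega^{\mu_2}, \ldots, \tilde\omega^{\mu_M}; \vec e_{\nu_1}, \vec e_{\nu_2}, \ldots, \vec e_{\nu_N}) = \delta^{\mu_1}{}_{\alpha_1} \delta^{\mu_2}{}_{\alpha_2} \ldots \delta^{\mu_M}{}_{\alpha_M}\; \delta^{\beta_1}{}_{\nu_1} \delta^{\beta_2}{}_{\nu_2} \ldots \delta^{\beta_N}{}_{\nu_N} \tag{3.103} \]
Using Eq. (3.12) we identify
\begin{align}
\delta^{\beta_1}{}_{\nu_1} &= \tilde\omega^{\beta_1}(\vec e_{\nu_1}) \nonumber\\
\delta^{\beta_2}{}_{\nu_2} &= \tilde\omega^{\beta_2}(\vec e_{\nu_2}) \nonumber\\
&\;\;\vdots \nonumber\\
\delta^{\beta_N}{}_{\nu_N} &= \tilde\omega^{\beta_N}(\vec e_{\nu_N}) \tag{3.104}
\end{align}
Based upon the dualism between vectors and one-forms (a vector is a linear function of one-forms), we identify:
\begin{align}
\delta^{\mu_1}{}_{\alpha_1} &= \vec e_{\alpha_1}(\tilde\omega^{\mu_1}) \nonumber\\
\delta^{\mu_2}{}_{\alpha_2} &= \vec e_{\alpha_2}(\tilde\omega^{\mu_2}) \nonumber\\
&\;\;\vdots \nonumber\\
\delta^{\mu_M}{}_{\alpha_M} &= \vec e_{\alpha_M}(\tilde\omega^{\mu_M}) \tag{3.105}
\end{align}
So,
\[ \delta^{\mu_1}{}_{\alpha_1} \delta^{\mu_2}{}_{\alpha_2} \ldots \delta^{\mu_M}{}_{\alpha_M} = \vec e_{\alpha_1}(\tilde\omega^{\mu_1})\, \vec e_{\alpha_2}(\tilde\omega^{\mu_2}) \ldots \vec e_{\alpha_M}(\tilde\omega^{\mu_M}). \tag{3.106} \]
So focusing on just the tensor, i.e. dropping the arguments, we're left with the basis that is the analogue to Eq. (3.25),
\[ \tilde\omega^{\beta_1,\beta_2,\ldots,\beta_N}{}_{\alpha_1,\alpha_2,\ldots,\alpha_M} = \vec e_{\alpha_1} \otimes \vec e_{\alpha_2} \otimes \ldots \otimes \vec e_{\alpha_M} \otimes \tilde\omega^{\beta_1} \otimes \tilde\omega^{\beta_2} \otimes \ldots \otimes \tilde\omega^{\beta_N} \tag{3.107} \]
where we have introduced the idea of an outer product of $N$ one-forms as a simple extension of the case $N = 2$ introduced on p. 66. That is, the outer product of $N$ one-forms $\tilde p^1 \otimes \tilde p^2 \otimes \ldots \otimes \tilde p^N$ is simply the tensor that, when supplied with $N$ vector inputs, say $\vec A_1, \vec A_2, \ldots, \vec A_N$, as arguments, produces the number that results from multiplying together each real number that results from applying $\tilde p^n$ to vector argument $\vec A_n$, i.e.
\[ \tilde p^1 \otimes \tilde p^2 \otimes \ldots \otimes \tilde p^N (\vec A_1, \vec A_2, \ldots, \vec A_N) = \tilde p^1(\vec A_1)\, \tilde p^2(\vec A_2) \ldots \tilde p^N(\vec A_N) \]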
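24 (a) (i) The definitions of the symmetric and antisymmetric tensors are given in Eq. (3.31) and Eq. (3.34). Note that $M^{23} = 1$ and $M^{32} = -2$, so $M^{(23)} = -\tfrac12$ and $M^{[23]} = \tfrac32$.
\[ \{M^{(\alpha\beta)}\} = \begin{pmatrix} 0 & 1 & 1 & \tfrac12 \\ 1 & -1 & 0 & 1 \\ 1 & 0 & 0 & -\tfrac12 \\ \tfrac12 & 1 & -\tfrac12 & 0 \end{pmatrix} \tag{3.108} \]
\[ \{M^{[\alpha\beta]}\} = \begin{pmatrix} 0 & 0 & -1 & -\tfrac12 \\ 0 & 0 & 0 & 1 \\ 1 & 0 & 0 & \tfrac32 \\ \tfrac12 & -1 & -\tfrac32 & 0 \end{pmatrix} \tag{3.109} \]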
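(ii) Section 3.7 shows how to raise and lower indices using the metric tensor.
\begin{align}
M^{\alpha}{}_{\beta} &= \eta_{\beta\gamma} M^{\alpha\gamma} \tag{3.110}\\
&= \{M^{\alpha\gamma}\}\{\eta_{\beta\gamma}\}^T
= \begin{pmatrix} 0 & 1 & 0 & 0 \\ 1 & -1 & 0 & 2 \\ 2 & 0 & 0 & 1 \\ 1 & 0 & -2 & 0 \end{pmatrix}
\begin{pmatrix} -1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \tag{3.111}\\
&= \begin{pmatrix} 0 & 1 & 0 & 0 \\ -1 & -1 & 0 & 2 \\ -2 & 0 & 0 & 1 \\ -1 & 0 & -2 & 0 \end{pmatrix} \tag{3.112}
\end{align}
(iii)
\begin{align}
M_{\alpha}{}^{\beta} &= \eta_{\alpha\gamma} M^{\gamma\beta} \tag{3.113}\\
&= \{\eta_{\alpha\gamma}\}\{M^{\gamma\beta}\}
= \begin{pmatrix} -1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}
\begin{pmatrix} 0 & 1 & 0 & 0 \\ 1 & -1 & 0 & 2 \\ 2 & 0 & 0 & 1 \\ 1 & 0 & -2 & 0 \end{pmatrix} \tag{3.114}\\
&= \begin{pmatrix} 0 & -1 & 0 & 0 \\ 1 & -1 & 0 & 2 \\ 2 & 0 & 0 & 1 \\ 1 & 0 & -2 & 0 \end{pmatrix} \tag{3.115}
\end{align}
(iv)
\begin{align}
M_{\alpha\beta} = \eta_{\alpha\gamma} M^{\gamma}{}_{\beta}
&= \begin{pmatrix} -1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}
\begin{pmatrix} 0 & 1 & 0 & 0 \\ -1 & -1 & 0 & 2 \\ -2 & 0 & 0 & 1 \\ -1 & 0 & -2 & 0 \end{pmatrix} \tag{3.116}\\
&= \begin{pmatrix} 0 & -1 & 0 & 0 \\ -1 & -1 & 0 & 2 \\ -2 & 0 & 0 & 1 \\ -1 & 0 & -2 & 0 \end{pmatrix} \tag{3.117}
\end{align}
OR,
\begin{align}
M_{\alpha\beta} &= \eta_{\beta\gamma} M_{\alpha}{}^{\gamma} \tag{3.118}\\
&= \{M_{\alpha}{}^{\gamma}\}\{\eta_{\beta\gamma}\}^T
= \begin{pmatrix} 0 & -1 & 0 & 0 \\ 1 & -1 & 0 & 2 \\ 2 & 0 & 0 & 1 \\ 1 & 0 & -2 & 0 \end{pmatrix}
\begin{pmatrix} -1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \tag{3.119}\\
&= \begin{pmatrix} 0 & -1 & 0 & 0 \\ -1 & -1 & 0 & 2 \\ -2 & 0 & 0 & 1 \\ -1 & 0 & -2 & 0 \end{pmatrix}. \tag{3.120}
\end{align}
Of course the two different ways to obtain $M_{\alpha\beta}$ agree.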
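The matrix manipulations of problem 24 (a) are easy to check by machine. The following is only a sketch in plain Python: it assumes the components $M^{\alpha\beta}$ that appear in the products of parts (ii) to (iv), and the helper names (`matmul`, `symmetric_part`, `antisymmetric_part`) are my own, not anything from Schutz.

```python
from fractions import Fraction

# Components M^{alpha beta}; row index = alpha, column index = beta.
M = [[0, 1, 0, 0],
     [1, -1, 0, 2],
     [2, 0, 0, 1],
     [1, 0, -2, 0]]
# Minkowski metric eta_{alpha beta} = diag(-1, 1, 1, 1).
ETA = [[-1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]


def matmul(a, b):
    """Ordinary matrix product of two 4x4 nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def symmetric_part(m):
    """M^{(alpha beta)} = (M^{alpha beta} + M^{beta alpha}) / 2."""
    return [[Fraction(m[i][j] + m[j][i], 2) for j in range(4)] for i in range(4)]


def antisymmetric_part(m):
    """M^{[alpha beta]} = (M^{alpha beta} - M^{beta alpha}) / 2."""
    return [[Fraction(m[i][j] - m[j][i], 2) for j in range(4)] for i in range(4)]


M_sym = symmetric_part(M)
M_anti = antisymmetric_part(M)

M_up_down = matmul(M, ETA)          # M^alpha_beta = M^{alpha gamma} eta_{gamma beta}
M_down_up = matmul(ETA, M)          # M_alpha^beta = eta_{alpha gamma} M^{gamma beta}
M_down_a = matmul(ETA, M_up_down)   # M_{alpha beta} obtained via M^gamma_beta
M_down_b = matmul(M_down_up, ETA)   # M_{alpha beta} obtained via M_alpha^gamma

for name, mat in [("sym", M_sym), ("anti", M_anti), ("lowered", M_down_a)]:
    print(name, [[str(x) for x in row] for row in mat])
```

Because $\eta$ is diagonal, lowering the first index just flips the sign of the first row, and lowering the second index flips the sign of the first column; the script makes that pattern visible.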
24 (b) Does it make sense to speak of the symmetric and antisymmetric parts of $M^{\alpha}{}_{\beta}$? This $\binom{1}{1}$ tensor is represented by a matrix, so if it did make sense then it would be easy to find the symmetric and antisymmetric parts! But I would guess that it doesn't make sense, because symmetry has to do with the interchange of the order of the arguments. For a $\binom{1}{1}$ tensor, one argument is a vector, the other a one-form. So they cannot be interchanged. (On the other hand, each vector has a corresponding one-form and vice versa, so if we incorporate this into the idea of symmetry, then one could define symmetric and antisymmetric parts.) I guess I'm sitting on the fence. The first view is the correct one. It only makes sense to discuss symmetry after both indices have been raised or lowered, as we've done in this example using the metric tensor.
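24 (c) The $\binom{2}{0}$ tensor $\eta^{\alpha\beta}$ was introduced in section 3.5 as the inverse of the metric tensor. So when we raise an index on the metric tensor, we naturally get the identity matrix:
\begin{align}
\eta^{\alpha}{}_{\beta} &= \eta^{\alpha\gamma}\eta_{\gamma\beta}, && \text{using Eq. (3.58)} \tag{3.121}\\
&= \begin{pmatrix} -1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}
\begin{pmatrix} -1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}, && \text{using Eq. (3.44)} \tag{3.122}\\
&= \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}, \tag{3.123}\\
&= \delta^{\alpha}{}_{\beta}. \tag{3.124}
\end{align}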
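25 Show that $A^{\alpha\beta} B_{\alpha\beta}$ is frame invariant. This simple-looking problem turned out to be quite tedious. Perhaps I'm doing more than required? [In retrospect, having now read most of the book, the solution below of course contains much more detail than required. But it was only through working through this sort of detail that I was able to gain confidence in tensor calculus.] First I believe we need to interpret $A^{\alpha\beta} B_{\alpha\beta}$ in terms of tensor calculus. It looks like it has the form of $\mathbf{A}(\mathbf{B})$, where $\mathbf{A}$ is a $\binom{2}{0}$ tensor and $\mathbf{B}$ is a $\binom{0}{2}$ tensor. To confirm this we write:
\begin{align}
\mathbf{A} &= A^{\alpha\beta}\, \vec e_\alpha \otimes \vec e_\beta, && \text{see problem 23 (b)} \tag{3.125}\\
\mathbf{B} &= B_{\mu\nu}\, \tilde\omega^\mu \otimes \tilde\omega^\nu. \tag{3.126}
\end{align}
Now one applies the tensor $\mathbf{A}$ to the argument $\mathbf{B}$,
\begin{align}
\mathbf{A}(\mathbf{B}) &= A^{\alpha\beta}\, \vec e_\alpha \otimes \vec e_\beta\, (B_{\mu\nu}\, \tilde\omega^\mu \otimes \tilde\omega^\nu) \tag{3.127}\\
&= A^{\alpha\beta} B_{\mu\nu}\, \vec e_\alpha(\tilde\omega^\mu)\, \vec e_\beta(\tilde\omega^\nu) \tag{3.128}\\
&= A^{\alpha\beta} B_{\mu\nu}\, \delta^{\mu}{}_{\alpha}\, \delta^{\nu}{}_{\beta} \tag{3.129}\\
&= A^{\alpha\beta} B_{\alpha\beta}, \tag{3.130}
\end{align}
which confirms our interpretation.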
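At this point, we expect $A^{\alpha\beta} B_{\alpha\beta} = \mathbf{A}(\mathbf{B})$ to be frame-invariant, because it's composed of vectors and one-forms, which are frame-invariant. But we can demonstrate this more explicitly. That is,
\begin{align}
\mathbf{A} &= A^{\bar\alpha\bar\beta}\, \vec e_{\bar\alpha} \otimes \vec e_{\bar\beta}, \tag{3.131}\\
&= A^{\alpha\beta}\, \Lambda^{\bar\alpha}{}_{\alpha} \Lambda^{\bar\beta}{}_{\beta}\, \Lambda^{\alpha'}{}_{\bar\alpha} \Lambda^{\beta'}{}_{\bar\beta}\, \vec e_{\alpha'} \otimes \vec e_{\beta'}, \tag{3.132}\\
&= A^{\alpha\beta}\, \delta^{\alpha'}{}_{\alpha}\, \delta^{\beta'}{}_{\beta}\, \vec e_{\alpha'} \otimes \vec e_{\beta'}, \tag{3.133}\\
&= A^{\alpha\beta}\, \vec e_\alpha \otimes \vec e_\beta. \tag{3.134}
\end{align}
And similarly for $\mathbf{B}$:
\begin{align}
\mathbf{B} &= B_{\bar\mu\bar\nu}\, \tilde\omega^{\bar\mu} \otimes \tilde\omega^{\bar\nu}, \tag{3.135}\\
&= B_{\mu\nu}\, \Lambda^{\mu}{}_{\bar\mu} \Lambda^{\nu}{}_{\bar\nu}\, \Lambda^{\bar\mu}{}_{\mu'} \Lambda^{\bar\nu}{}_{\nu'}\, \tilde\omega^{\mu'} \otimes \tilde\omega^{\nu'}, \tag{3.136}\\
&= B_{\mu\nu}\, \delta^{\mu}{}_{\mu'}\, \delta^{\nu}{}_{\nu'}\, \tilde\omega^{\mu'} \otimes \tilde\omega^{\nu'}, \tag{3.137}\\
&= B_{\mu\nu}\, \tilde\omega^{\mu} \otimes \tilde\omega^{\nu}. \tag{3.138}
\end{align}
Now we should be quite convinced that $A^{\alpha\beta} B_{\alpha\beta} = \mathbf{A}(\mathbf{B})$ is frame-invariant, because both $\mathbf{A}$ and $\mathbf{B}$ are frame-invariant. And of course we can show this in tedious detail:
\begin{align}
\mathbf{A}(\mathbf{B}) &= A^{\bar\alpha\bar\beta}\, \vec e_{\bar\alpha} \otimes \vec e_{\bar\beta}\, (B_{\bar\mu\bar\nu}\, \tilde\omega^{\bar\mu} \otimes \tilde\omega^{\bar\nu}) \tag{3.139}\\
&= A^{\bar\alpha\bar\beta} B_{\bar\mu\bar\nu}\, \vec e_{\bar\alpha} \otimes \vec e_{\bar\beta}\, (\tilde\omega^{\bar\mu} \otimes \tilde\omega^{\bar\nu}) \tag{3.140}\\
&= A^{\bar\alpha\bar\beta} B_{\bar\mu\bar\nu}\, \vec e_{\bar\alpha}(\tilde\omega^{\bar\mu})\, \vec e_{\bar\beta}(\tilde\omega^{\bar\nu}) \tag{3.141}\\
&= A^{\alpha\beta}\, \Lambda^{\bar\alpha}{}_{\alpha} \Lambda^{\bar\beta}{}_{\beta}\, B_{\mu\nu}\, \Lambda^{\mu}{}_{\bar\mu} \Lambda^{\nu}{}_{\bar\nu}\, \Lambda^{\alpha'}{}_{\bar\alpha} \Lambda^{\beta'}{}_{\bar\beta}\, \Lambda^{\bar\mu}{}_{\mu'} \Lambda^{\bar\nu}{}_{\nu'}\, \vec e_{\alpha'}(\tilde\omega^{\mu'})\, \vec e_{\beta'}(\tilde\omega^{\nu'}), \tag{3.142}\\
&= A^{\alpha\beta}\, \delta^{\alpha'}{}_{\alpha}\, \delta^{\beta'}{}_{\beta}\, B_{\mu\nu}\, \delta^{\mu}{}_{\mu'}\, \delta^{\nu}{}_{\nu'}\, \vec e_{\alpha'}(\tilde\omega^{\mu'})\, \vec e_{\beta'}(\tilde\omega^{\nu'}), \tag{3.143}\\
&= A^{\alpha\beta}\, \delta^{\alpha'}{}_{\alpha}\, \delta^{\beta'}{}_{\beta}\, B_{\mu\nu}\, \delta^{\mu}{}_{\mu'}\, \delta^{\nu}{}_{\nu'}\, \delta^{\mu'}{}_{\alpha'}\, \delta^{\nu'}{}_{\beta'}, \tag{3.144}\\
&= A^{\alpha\beta} B_{\alpha\beta}. \tag{3.145}
\end{align}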
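The frame invariance of $A^{\alpha\beta} B_{\alpha\beta}$ can also be checked numerically. The sketch below is my own illustration, not part of Schutz's solution: it assumes a boost with $v = 3/5$ along $x$ (so that $\gamma = 5/4$ exactly), transforms arbitrary integer components with $\Lambda$ for upper indices and with the inverse boost for lower indices, and compares the two contractions using exact rational arithmetic.

```python
from fractions import Fraction

V, GAMMA = Fraction(3, 5), Fraction(5, 4)   # gamma = 1/sqrt(1 - v^2) exactly


def boost(v, gamma):
    """Matrix of a velocity boost v along x (units with c = 1)."""
    return [[gamma, -v * gamma, 0, 0],
            [-v * gamma, gamma, 0, 0],
            [0, 0, 1, 0],
            [0, 0, 0, 1]]


LAM = boost(V, GAMMA)        # Lambda^{bar alpha}_{alpha}: unbarred -> barred
LAM_INV = boost(-V, GAMMA)   # Lambda^{alpha}_{bar alpha}: barred -> unbarred

# Arbitrary illustrative components A^{alpha beta} and B_{mu nu}.
A = [[1, 4, -2, 7], [0, 3, 5, -1], [6, -3, 2, 8], [9, 1, -4, 0]]
B = [[2, -1, 3, 0], [5, 4, -6, 1], [7, 2, 0, -3], [-2, 8, 1, 6]]
R = range(4)


def transform_upper(a):
    """A^{bar a bar b} = Lambda^{bar a}_p Lambda^{bar b}_q A^{pq}."""
    return [[sum(LAM[i][p] * LAM[j][q] * a[p][q] for p in R for q in R)
             for j in R] for i in R]


def transform_lower(b):
    """B_{bar m bar n} = Lambda^p_{bar m} Lambda^q_{bar n} B_{pq}."""
    return [[sum(LAM_INV[p][i] * LAM_INV[q][j] * b[p][q] for p in R for q in R)
             for j in R] for i in R]


def contract(a, b):
    """Full contraction A^{alpha beta} B_{alpha beta}."""
    return sum(a[i][j] * b[i][j] for i in R for j in R)


scalar_unbarred = contract(A, B)
scalar_barred = contract(transform_upper(A), transform_lower(B))
print(scalar_unbarred, scalar_barred)
```

The individual components change under the boost, but the contraction does not, which is the content of Eqs. (3.139) to (3.145).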
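26 $\mathbf{A}$ is an antisymmetric $\binom{2}{0}$ tensor, $\mathbf{B}$ is a symmetric $\binom{0}{2}$ tensor, $\mathbf{C}$ is an arbitrary $\binom{0}{2}$ tensor, $\mathbf{D}$ is an arbitrary $\binom{2}{0}$ tensor. Prove:
(a) $A^{\alpha\beta} B_{\alpha\beta} = 0$. I don't think Schutz discussed the symmetry of $\binom{2}{0}$ tensors explicitly in this chapter. But in Section 3.6 we are introduced to $\binom{M}{0}$ tensors and assured that "All our previous discussions of $\binom{0}{2}$ tensors apply here." So I think it's safe to use the obvious analogues of the symmetry ideas of Section 3.4. [Yes, obviously it is.] Because $\mathbf{A}$ is an antisymmetric $\binom{2}{0}$ tensor, we can write it as the antisymmetric part of some more general tensor, say $\mathbf{A}'$:
\begin{align}
A'^{[\alpha\beta]} &= \tfrac12 \left( A'^{\alpha\beta} - A'^{\beta\alpha} \right), && \text{by Eq. (3.34),} \nonumber\\
&= A^{\alpha\beta}, \tag{3.146}
\end{align}
and because $\mathbf{B}$ is a symmetric $\binom{0}{2}$ tensor, we can write it as follows
\[ B_{\alpha\beta} = B'_{(\alpha\beta)} = \tfrac12 \left( B'_{\alpha\beta} + B'_{\beta\alpha} \right), \qquad \text{by Eq. (3.31).} \tag{3.147} \]
Now we simply apply $\mathbf{A}$ to $\mathbf{B}$:
\begin{align}
A^{\alpha\beta} B_{\alpha\beta} &= A'^{[\alpha\beta]} B'_{(\alpha\beta)} \nonumber\\
&= \tfrac12 \left( A'^{\alpha\beta} - A'^{\beta\alpha} \right) \tfrac12 \left( B'_{\alpha\beta} + B'_{\beta\alpha} \right), \nonumber\\
&= \tfrac14 \left( A'^{\alpha\beta} B'_{\alpha\beta} + A'^{\alpha\beta} B'_{\beta\alpha} - A'^{\beta\alpha} B'_{\alpha\beta} - A'^{\beta\alpha} B'_{\beta\alpha} \right). \tag{3.148}
\end{align}
But we can call the indices $\alpha$ and $\beta$ whatever we like, and so
\begin{align}
A'^{\beta\alpha} B'_{\beta\alpha} &= A'^{\alpha\beta} B'_{\alpha\beta} \nonumber\\
A'^{\beta\alpha} B'_{\alpha\beta} &= A'^{\alpha\beta} B'_{\beta\alpha} \tag{3.149}
\end{align}
just by relabelling. The four terms in (3.148) therefore cancel in pairs, and from this we see that $A^{\alpha\beta} B_{\alpha\beta} = 0$.
(b) Prove that: $A^{\alpha\beta} C_{\alpha\beta} = A^{\alpha\beta} C_{[\alpha\beta]}$.
\begin{align}
A^{\alpha\beta} C_{\alpha\beta} &= A^{\alpha\beta} \left( C_{(\alpha\beta)} + C_{[\alpha\beta]} \right), && \text{by Eq. (3.35),} \nonumber\\
&= A^{\alpha\beta} C_{[\alpha\beta]}, && \text{by part (a)!} \tag{3.150}
\end{align}
(c) Prove that: $B_{\alpha\beta} D^{\alpha\beta} = B_{\alpha\beta} D^{(\alpha\beta)}$.
\begin{align}
B_{\alpha\beta} D^{\alpha\beta} &= B_{\alpha\beta} \left( D^{(\alpha\beta)} + D^{[\alpha\beta]} \right), && \text{by Eq. (3.35)} \nonumber\\
&= B_{\alpha\beta} D^{(\alpha\beta)}, && \text{by part (a)!} \tag{3.151}
\end{align}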
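Parts (a) and (b) of problem 26 can be illustrated with concrete numbers. This is a sketch only: the component arrays `A_PRIME` and `C` are arbitrary values I made up, and the antisymmetric and symmetric tensors are built from them exactly as in Eqs. (3.146) and (3.147).

```python
from fractions import Fraction

R = range(4)

# Arbitrary illustrative components (not from Schutz).
A_PRIME = [[1, 4, -2, 7], [0, 3, 5, -1], [6, -3, 2, 8], [9, 1, -4, 0]]
C = [[2, -1, 3, 0], [5, 4, -6, 1], [7, 2, 0, -3], [-2, 8, 1, 6]]


def sym(m):
    """Symmetric part, Eq. (3.31)."""
    return [[Fraction(m[i][j] + m[j][i], 2) for j in R] for i in R]


def antisym(m):
    """Antisymmetric part, Eq. (3.34)."""
    return [[Fraction(m[i][j] - m[j][i], 2) for j in R] for i in R]


def contract(a, b):
    """Sum over both index pairs, e.g. A^{alpha beta} B_{alpha beta}."""
    return sum(a[i][j] * b[i][j] for i in R for j in R)


A = antisym(A_PRIME)      # an antisymmetric (2 0) tensor
B = sym(C)                # a symmetric (0 2) tensor

part_a = contract(A, B)                    # should vanish
part_b_lhs = contract(A, C)                # A^{ab} C_{ab}
part_b_rhs = contract(A, antisym(C))       # A^{ab} C_{[ab]}
print(part_a, part_b_lhs, part_b_rhs)
```

Only the antisymmetric part of `C` survives the contraction with `A`, which is why the last two numbers printed are equal.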
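27 (a) $\mathbf{A}$ is an antisymmetric $\binom{2}{0}$ tensor. Show that $\{A_{\alpha\beta}\}$ is an antisymmetric $\binom{0}{2}$ tensor.
\[ A_{\alpha\beta} = \eta_{\alpha\mu}\, \eta_{\beta\nu}\, A^{\mu\nu}, \qquad \text{by Eq. (3.56).} \tag{3.152} \]
So in matrix notation:
\begin{align}
\{A_{\alpha\beta}\} &= \begin{pmatrix} -1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}
\begin{pmatrix} 0 & a_{12} & a_{13} & a_{14} \\ -a_{12} & 0 & a_{23} & a_{24} \\ -a_{13} & -a_{23} & 0 & a_{34} \\ -a_{14} & -a_{24} & -a_{34} & 0 \end{pmatrix}
\begin{pmatrix} -1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \nonumber\\
&= \begin{pmatrix} -1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}
\begin{pmatrix} 0 & a_{12} & a_{13} & a_{14} \\ a_{12} & 0 & a_{23} & a_{24} \\ a_{13} & -a_{23} & 0 & a_{34} \\ a_{14} & -a_{24} & -a_{34} & 0 \end{pmatrix} \nonumber\\
&= \begin{pmatrix} 0 & -a_{12} & -a_{13} & -a_{14} \\ a_{12} & 0 & a_{23} & a_{24} \\ a_{13} & -a_{23} & 0 & a_{34} \\ a_{14} & -a_{24} & -a_{34} & 0 \end{pmatrix}, \tag{3.153}
\end{align}
which is antisymmetric.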
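27 (b) Suppose $V^\alpha = W^\alpha$. Prove that $V_\alpha = W_\alpha$.
\begin{align}
V_\alpha &= \eta_{\alpha\beta} V^\beta, && \text{by Eq. (3.39)} \nonumber\\
W_\alpha &= \eta_{\alpha\beta} W^\beta. \tag{3.154}
\end{align}
So
\begin{align}
V_\alpha - W_\alpha &= \eta_{\alpha\beta} V^\beta - \eta_{\alpha\gamma} W^\gamma \nonumber\\
&= \eta_{\alpha\beta} \left( V^\beta - W^\beta \right) \nonumber\\
&= 0. \tag{3.155}
\end{align}
28 Deduce Eq. (3.66) from Eq. (3.65).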
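I would start from comparing Eq. (3.65) with Eq. (3.14). For a scalar function $\phi$, we had
\begin{align}
\frac{d\phi}{d\tau} &= \frac{\partial\phi}{\partial t} U^t + \frac{\partial\phi}{\partial x} U^x + \frac{\partial\phi}{\partial y} U^y + \frac{\partial\phi}{\partial z} U^z, && \text{from Eq. (3.14)} \nonumber\\
&= \tilde d\phi(\vec U), && \text{just different notation.} \tag{3.156}
\end{align}
So I believe we want to write the derivative of the $\binom{1}{1}$ tensor in a similar way, i.e. as something that takes $\vec U$ as an argument and gives the rate of change of that $\binom{1}{1}$ tensor following the world line of $\vec U$:
\begin{align}
\frac{d\mathbf{T}}{d\tau} &= \nabla\mathbf{T}(\vec U), && \text{analogous to (3.156) above,} \nonumber\\
&= T^{\alpha}{}_{\beta,\gamma}\, \vec e_\alpha \otimes \tilde\omega^\beta \otimes \tilde\omega^\gamma\, (U^\mu \vec e_\mu), && \text{Eq. (3.66)} \nonumber\\
&= T^{\alpha}{}_{\beta,\gamma}\, \vec e_\alpha \otimes \tilde\omega^\beta\; U^\mu\, \tilde\omega^\gamma(\vec e_\mu), && \text{rearranging} \nonumber\\
&= T^{\alpha}{}_{\beta,\gamma}\, \vec e_\alpha \otimes \tilde\omega^\beta\; U^\mu\, \delta^{\gamma}{}_{\mu}, && \text{Eq. (3.12)} \nonumber\\
&= T^{\alpha}{}_{\beta,\gamma}\, \vec e_\alpha \otimes \tilde\omega^\beta\; U^\gamma. \tag{3.157}
\end{align}
Thus we were led back to Eq. (3.65) from Eq. (3.66) using the analogy with Eq. (3.14). [In retrospect, the above solution doesn't seem to answer the question posed. But I think the question is not well-posed. The solution given by Schutz merely refers to the arbitrariness of $\vec U$. But there is more going on here in making the step from Eq. (3.65) to Eq. (3.66), and what's involved is the analogy with the gradient of a scalar, as discussed above. In short, I'm happy with neither my solution nor Schutz's, and I blame the question! I discuss this more in my notes on this section above.]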
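29 Prove that tensor differentiation obeys the Leibniz (product) rule,
\[ \nabla(\mathbf{A} \otimes \mathbf{B}) = (\nabla\mathbf{A}) \otimes \mathbf{B} + \mathbf{A} \otimes (\nabla\mathbf{B}). \]
We're not told what $\mathbf{A}$ and $\mathbf{B}$ are, which probably means that it doesn't matter. For convenience, I'll assume that they are $\binom{1}{1}$ tensors:
\begin{align}
\mathbf{A} &= A^{\alpha}{}_{\beta}\, \tilde\omega^\beta \otimes \vec e_\alpha, && \text{Eq. (3.61)} \nonumber\\
\mathbf{B} &= B^{\alpha}{}_{\beta}\, \tilde\omega^\beta \otimes \vec e_\alpha. \tag{3.158}
\end{align}
We then have
\begin{align}
\mathbf{A} \otimes \mathbf{B} &= A^{\alpha}{}_{\beta}\, \tilde\omega^\beta \otimes \vec e_\alpha \otimes (B^{\mu}{}_{\nu}\, \tilde\omega^\nu \otimes \vec e_\mu) \nonumber\\
&= A^{\alpha}{}_{\beta} B^{\mu}{}_{\nu}\; \tilde\omega^\beta \otimes \vec e_\alpha \otimes \tilde\omega^\nu \otimes \vec e_\mu. \tag{3.159}
\end{align}
My strategy is to just redo what Schutz did in Section 3.8. Let's differentiate $\mathbf{A} \otimes \mathbf{B}$ with respect to proper time, as in Eq. (3.63):
\[ \frac{d(\mathbf{A} \otimes \mathbf{B})}{d\tau} = \frac{d(A^{\alpha}{}_{\beta} B^{\mu}{}_{\nu})}{d\tau}\; \tilde\omega^\beta \otimes \vec e_\alpha \otimes \tilde\omega^\nu \otimes \vec e_\mu, \tag{3.160} \]
where we have assumed that the basis one-forms and basis vectors are uniform in spacetime. And $d(A^{\alpha}{}_{\beta} B^{\mu}{}_{\nu})/d\tau$ is just the ordinary derivative of the function $A^{\alpha}{}_{\beta} B^{\mu}{}_{\nu}$ along the world line. Because it's just the ordinary derivative, we can use the ordinary product rule:
\[ \frac{d(A^{\alpha}{}_{\beta} B^{\mu}{}_{\nu})}{d\tau} = \frac{dA^{\alpha}{}_{\beta}}{d\tau} B^{\mu}{}_{\nu} + A^{\alpha}{}_{\beta} \frac{dB^{\mu}{}_{\nu}}{d\tau}. \tag{3.161} \]
Now just substitute this into (3.160):
\begin{align}
\nabla(\mathbf{A} \otimes \mathbf{B})(\vec U) = \frac{d(\mathbf{A} \otimes \mathbf{B})}{d\tau} &= \left( \frac{dA^{\alpha}{}_{\beta}}{d\tau} B^{\mu}{}_{\nu} + A^{\alpha}{}_{\beta} \frac{dB^{\mu}{}_{\nu}}{d\tau} \right) \tilde\omega^\beta \otimes \vec e_\alpha \otimes \tilde\omega^\nu \otimes \vec e_\mu, \nonumber\\
&= \nabla\mathbf{A}(\vec U) \otimes \mathbf{B} + \mathbf{A} \otimes \nabla\mathbf{B}(\vec U). \tag{3.162}
\end{align}
If we wrote out $\vec U = U^\gamma \vec e_\gamma$ everywhere, we'd find that it cancels on the two sides of the equal sign. So it's clear that:
\[ \nabla(\mathbf{A} \otimes \mathbf{B}) = (\nabla\mathbf{A}) \otimes \mathbf{B} + \mathbf{A} \otimes (\nabla\mathbf{B}). \tag{3.163} \]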
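30 Given the following fields:
\begin{align}
\vec U &\underset{\mathcal{O}}{\to} (1 + t^2,\; t^2,\; \sqrt{2}\,t,\; 0) \nonumber\\
\vec D &\underset{\mathcal{O}}{\to} (x,\; 5xt,\; \sqrt{2}\,t,\; 0) \nonumber\\
\rho &= x^2 + t^2 - y^2 \tag{3.164}
\end{align}
(a) By Eq. (2.28) we require a four-velocity to have $\vec U \cdot \vec U = -1$. And $\vec U$ does at all points.
\begin{align}
\vec U \cdot \vec U &= \eta_{\alpha\beta} U^\alpha U^\beta, && \text{see Eq. (3.1)} \nonumber\\
&= -(1 + t^2)^2 + (t^2)^2 + (\sqrt{2}\,t)^2 + 0 \nonumber\\
&= -1. \tag{3.165}
\end{align}
On the other hand, $\vec D$ cannot be a four-velocity because it doesn't meet this property everywhere:
\begin{align}
\vec D \cdot \vec D &= \eta_{\alpha\beta} D^\alpha D^\beta, && \text{see Eq. (3.1)} \nonumber\\
&= -x^2 + (5tx)^2 + (\sqrt{2}\,t)^2 + 0 \nonumber\\
&\neq -1 \text{ everywhere.} \tag{3.166}
\end{align}
And for some reason we're asked to also calculate:
\begin{align}
\vec U \cdot \vec D &= \eta_{\alpha\beta} U^\alpha D^\beta, && \text{see Eq. (3.1)} \nonumber\\
&= -(1 + t^2)x + (t^2)(5tx) + (\sqrt{2}\,t)^2 + 0 \nonumber\\
&= -x - xt^2 + 5t^3 x + 2t^2. \tag{3.167}
\end{align}
(b) Find the spatial velocity $\mathbf{v}$ given the four-velocity $\vec U$. The relation between the components of a four-velocity and the spatial velocity was explained on p. 42: $\vec U \to (\gamma, \gamma v^x, \gamma v^y, \gamma v^z)$, so $v^i = U^i / U^0$. Here,
\[ \mathbf{v} \underset{\mathcal{O}}{\to} \frac{1}{1 + t^2}\left( t^2,\; \sqrt{2}\,t,\; 0 \right). \]
At $t = 0$, $\mathbf{v} = 0$. As $t \to \infty$, $v^x \to 1$ and $v^y \to 0$, so $|\mathbf{v}| \to 1$: the motion approaches the speed of light in the $x$ direction.
(c) Find $U_\alpha$ for all $\alpha$.
\begin{align}
U_\alpha &= \eta_{\alpha\beta} U^\beta, && \text{see section 3.7.} \nonumber\\
U_0 &= -U^0 = -(1 + t^2) \nonumber\\
U_1 &= U^1 = t^2 \nonumber\\
U_2 &= U^2 = \sqrt{2}\,t \nonumber\\
U_3 &= U^3 = 0. \tag{3.168}
\end{align}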
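(d) Find $U^{\alpha}{}_{,\beta}$ for all $\alpha$, $\beta$.
\[ U^{\alpha}{}_{,\beta} \equiv \frac{\partial U^\alpha}{\partial x^\beta}, \qquad \text{see Eq. (3.19)} \]
For $\alpha = 0$:
\begin{align}
U^{0}{}_{,0} &= \frac{\partial U^0}{\partial x^0} = \frac{\partial U^0}{\partial t} = \frac{\partial (1 + t^2)}{\partial t} = 2t, \nonumber\\
U^{0}{}_{,1} &= \frac{\partial U^0}{\partial x^1} = \frac{\partial U^0}{\partial x} = \frac{\partial (1 + t^2)}{\partial x} = 0, \nonumber\\
U^{0}{}_{,2} &= \frac{\partial U^0}{\partial x^2} = \frac{\partial U^0}{\partial y} = \frac{\partial (1 + t^2)}{\partial y} = 0, \nonumber\\
U^{0}{}_{,3} &= \frac{\partial U^0}{\partial x^3} = \frac{\partial U^0}{\partial z} = \frac{\partial (1 + t^2)}{\partial z} = 0. \tag{3.169}
\end{align}
For $\alpha = 1$:
\begin{align}
U^{1}{}_{,0} &= \frac{\partial U^1}{\partial x^0} = \frac{\partial t^2}{\partial t} = 2t, \nonumber\\
U^{1}{}_{,1} &= \frac{\partial U^1}{\partial x^1} = \frac{\partial t^2}{\partial x} = 0, \nonumber\\
U^{1}{}_{,2} &= \frac{\partial U^1}{\partial x^2} = \frac{\partial t^2}{\partial y} = 0, \nonumber\\
U^{1}{}_{,3} &= \frac{\partial U^1}{\partial x^3} = \frac{\partial t^2}{\partial z} = 0. \nonumber
\end{align}
For $\alpha = 2$:
\begin{align}
U^{2}{}_{,0} &= \frac{\partial U^2}{\partial x^0} = \frac{\partial \sqrt{2}\,t}{\partial t} = \sqrt{2}, \nonumber\\
U^{2}{}_{,1} &= \frac{\partial U^2}{\partial x^1} = \frac{\partial \sqrt{2}\,t}{\partial x} = 0, \nonumber\\
U^{2}{}_{,2} &= \frac{\partial U^2}{\partial x^2} = \frac{\partial \sqrt{2}\,t}{\partial y} = 0, \nonumber\\
U^{2}{}_{,3} &= \frac{\partial U^2}{\partial x^3} = \frac{\partial \sqrt{2}\,t}{\partial z} = 0. \nonumber
\end{align}
For $\alpha = 3$:
\[ U^{3}{}_{,\beta} = \frac{\partial U^3}{\partial x^\beta} = \frac{\partial\, 0}{\partial x^\beta} = 0. \tag{3.170} \]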
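(e) Show that $U_\alpha U^{\alpha}{}_{,\beta} = 0$ for all $\beta$. For $\beta = 0$:
\begin{align}
U_\alpha U^{\alpha}{}_{,t} &= -(1 + t^2)\,2t + t^2\, 2t + \sqrt{2}\,t\,\sqrt{2} + 0 \nonumber\\
&= 0. \tag{3.171}
\end{align}
For $\beta > 0$, $U^{\alpha}{}_{,\beta} = 0$. Thus, $U_\alpha U^{\alpha}{}_{,\beta} = 0$.
Show that $U^\alpha U_{\alpha,\beta} = 0$ for all $\beta$.
\begin{align}
U^\alpha U_{\alpha,\beta} &= U^\alpha (\eta_{\alpha\gamma} U^\gamma)_{,\beta} \nonumber\\
&= \eta_{\gamma\alpha} U^\alpha (U^{\gamma}{}_{,\beta}) \nonumber\\
&= U_\gamma (U^{\gamma}{}_{,\beta}) \nonumber\\
&= 0, && \text{see above!} \tag{3.172}
\end{align}
(f) Find $D^{\beta}{}_{,\beta}$.
\begin{align}
D^{\beta}{}_{,\beta} &= \frac{\partial D^0}{\partial t} + \frac{\partial D^1}{\partial x} + \frac{\partial D^2}{\partial y} + \frac{\partial D^3}{\partial z} \nonumber\\
&= \frac{\partial x}{\partial t} + \frac{\partial (5tx)}{\partial x} + \frac{\partial \sqrt{2}\,t}{\partial y} + \frac{\partial\, 0}{\partial z} \nonumber\\
&= 5t. \tag{3.173}
\end{align}
(g) Find $(U^\alpha D^\beta)_{,\beta}$ for all $\alpha$.
\begin{align}
(U^\alpha D^\beta)_{,\beta} &= U^\alpha (D^{\beta}{}_{,\beta}) + U^{\alpha}{}_{,\beta} D^\beta \nonumber\\
&= \{(1 + t^2),\; t^2,\; \sqrt{2}\,t,\; 0\}(5t) + D^0 \{2t,\; 2t,\; \sqrt{2},\; 0\} \nonumber\\
&= \{(1 + t^2),\; t^2,\; \sqrt{2}\,t,\; 0\}(5t) + x \{2t,\; 2t,\; \sqrt{2},\; 0\} \nonumber\\
&= \{5t(1 + t^2) + 2xt,\; 5t^3 + 2xt,\; 5\sqrt{2}\,t^2 + \sqrt{2}\,x,\; 0\} \tag{3.174}
\end{align}
(h) Find $U_\alpha (U^\alpha D^\beta)_{,\beta}$.
\begin{align}
U_\alpha (U^\alpha D^\beta)_{,\beta} &= U_\alpha [U^{\alpha}{}_{,\beta} D^\beta + U^\alpha D^{\beta}{}_{,\beta}] \nonumber\\
&= U_\alpha U^{\alpha}{}_{,\beta} D^\beta + U_\alpha U^\alpha D^{\beta}{}_{,\beta} \nonumber\\
&= U_\alpha U^\alpha D^{\beta}{}_{,\beta}, && \text{using result from (e)} \nonumber\\
&= -D^{\beta}{}_{,\beta}, && \text{using result from (a) and Eq. (2.28)} \nonumber\\
&= -5t, && \text{using result from (f)} \tag{3.175}
\end{align}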
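(i) Find $\rho_{,\alpha}$ for all $\alpha$.
\begin{align}
\rho_{,\alpha} &= \left\{ \frac{\partial\rho}{\partial t},\; \frac{\partial\rho}{\partial x},\; \frac{\partial\rho}{\partial y},\; \frac{\partial\rho}{\partial z} \right\} \nonumber\\
&= \{2t,\; 2x,\; -2y,\; 0\} \tag{3.176}
\end{align}
Find $\rho^{,\alpha}$ for all $\alpha$.
\begin{align}
\rho^{,\alpha} &= \eta^{\alpha\beta} \rho_{,\beta} \nonumber\\
&= \{-2t,\; 2x,\; -2y,\; 0\} \tag{3.177}
\end{align}
Interpretation:
\[ \tilde d\rho \underset{\mathcal{O}}{\to} \{\rho_{,\alpha}\} \tag{3.178} \]
That is, the $\rho_{,\alpha}$ are the components of the one-form that is the gradient of the scalar field $\rho$. The $\rho^{,\alpha}$ are the components of the associated vector.
(j) Find $\nabla_{\vec U}\, \rho$. This is a scalar, the sum over $\alpha$ of $U^\alpha \rho_{,\alpha}$:
\begin{align}
\nabla_{\vec U}\, \rho &= U^\alpha \rho_{,\alpha}, && \text{from Eq. (3.68)} \nonumber\\
&= 2t(1 + t^2) + 2xt^2 - 2\sqrt{2}\,yt. \tag{3.179}
\end{align}
Find $\nabla_{\vec U} \vec D$.
\begin{align}
\nabla_{\vec U} \vec D &\to U^\alpha D^{\beta}{}_{,\alpha} \nonumber\\
&= \{U^\alpha D^{0}{}_{,\alpha},\; U^\alpha D^{1}{}_{,\alpha},\; U^\alpha D^{2}{}_{,\alpha},\; U^\alpha D^{3}{}_{,\alpha}\} \nonumber\\
&= \{t^2,\; 5x(1 + t^2) + 5t(t^2),\; \sqrt{2}(1 + t^2),\; 0\} \tag{3.180}
\end{align}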
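Several of the results of problem 30 can be spot-checked numerically. The sketch below is my own and makes a few assumptions: it evaluates the fields at one arbitrarily chosen event, estimates partial derivatives by central differences (the helper `partial` is hypothetical, not anything from Schutz), and takes the spatial velocity to be $v^i = U^i/U^0$.

```python
import math

ETA_DIAG = [-1.0, 1.0, 1.0, 1.0]   # diagonal of the Minkowski metric


def U(t, x, y, z):
    """Four-velocity field of Eq. (3.164)."""
    return [1 + t**2, t**2, math.sqrt(2) * t, 0.0]


def D(t, x, y, z):
    """Vector field D of Eq. (3.164)."""
    return [x, 5 * x * t, math.sqrt(2) * t, 0.0]


def dot(a, b):
    """Scalar product eta_{alpha beta} a^alpha b^beta."""
    return sum(e * p * q for e, p, q in zip(ETA_DIAG, a, b))


def partial(field, point, beta, h=1e-5):
    """Central-difference estimate of d(field^alpha)/dx^beta for all alpha."""
    up, down = list(point), list(point)
    up[beta] += h
    down[beta] -= h
    return [(a - b) / (2 * h) for a, b in zip(field(*up), field(*down))]


point = (0.7, 1.3, -0.4, 2.0)      # arbitrary event (t, x, y, z)
u = U(*point)

norm = dot(u, u)                                   # part (a): expect -1
speed = math.hypot(u[1], u[2]) / u[0]              # part (b): |v| = |U^i| / U^0
u_late = U(10.0, 0.0, 0.0, 0.0)
speed_late = math.hypot(u_late[1], u_late[2]) / u_late[0]

u_lower = [e * c for e, c in zip(ETA_DIAG, u)]     # part (c)
# part (e): U_alpha U^alpha_{,beta} for each beta
contraction = [sum(u_lower[a] * partial(U, point, b)[a] for a in range(4))
               for b in range(4)]
# part (f): divergence D^beta_{,beta}, expect 5t
div_D = sum(partial(D, point, b)[b] for b in range(4))

print(norm, speed, speed_late, contraction, div_D)
```

The speed stays below 1 and creeps toward it at late times, consistent with $\vec U \cdot \vec U = -1$ holding everywhere.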
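31 There's a mistake in the initial definition of $\mathbf{P}$: We're told that $\vec U$ is a unit timelike vector. This implies its scalar product with itself is both negative, i.e. $U_\alpha U^\alpha < 0$, and of magnitude unity like a 4-velocity. So what we require for this problem is that: $U_\alpha U^\alpha = -1$.
(i) Given the definition of $\mathbf{P}$ and $\vec V_\perp$ it is then simply a matter of taking the scalar product with $\vec U$ to show that it's orthogonal.
\begin{align}
\eta_{\alpha\mu} V_\perp^\alpha U^\mu &= \eta_{\alpha\mu} U^\mu \left( \eta^{\alpha}{}_{\beta} + U^\alpha U_\beta \right) V^\beta \nonumber\\
&= U_\alpha \left( \eta^{\alpha}{}_{\beta} + U^\alpha U_\beta \right) V^\beta, && \text{using Eq. (3.39)} \nonumber\\
&= U_\alpha \left( \delta^{\alpha}{}_{\beta} + U^\alpha U_\beta \right) V^\beta, && \text{using Eq. (3.60)} \nonumber\\
&= \left( U_\beta + U_\alpha U^\alpha U_\beta \right) V^\beta, \nonumber\\
&= \left( U_\beta - U_\beta \right) V^\beta, \nonumber\\
&= 0 \tag{3.181}
\end{align}
(ii) Similarly for showing that $\vec V_\perp$ is unaffected by $\mathbf{P}$.
\begin{align}
P^{\mu}{}_{\alpha} V_\perp^\alpha &= \left( \eta^{\mu}{}_{\alpha} + U^\mu U_\alpha \right) V_\perp^\alpha \nonumber\\
&= \left( \eta^{\mu}{}_{\alpha} + U^\mu U_\alpha \right) \left( \eta^{\alpha}{}_{\beta} + U^\alpha U_\beta \right) V^\beta \nonumber\\
&= \left( \eta^{\mu}{}_{\alpha} \eta^{\alpha}{}_{\beta} + U^\mu U_\alpha \eta^{\alpha}{}_{\beta} + \eta^{\mu}{}_{\alpha} U^\alpha U_\beta + U^\mu U_\alpha U^\alpha U_\beta \right) V^\beta \nonumber\\
&= \left( \eta^{\mu}{}_{\beta} + U^\mu U_\beta + U^\mu U_\beta + (U_\alpha U^\alpha) U^\mu U_\beta \right) V^\beta \nonumber\\
&= \left( \eta^{\mu}{}_{\beta} + U^\mu U_\beta + U^\mu U_\beta + (-1) U^\mu U_\beta \right) V^\beta \nonumber\\
&= \left( \eta^{\mu}{}_{\beta} + U^\mu U_\beta \right) V^\beta \nonumber\\
&= V_\perp^\mu \tag{3.182}
\end{align}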
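Both properties of the projection operator in problem 31 (a) can be checked with exact arithmetic. This is a sketch under my own assumptions: $\vec U$ is taken to be the four-velocity for $v = 3/5$ along $x$, and the vector `V` is an arbitrary choice.

```python
from fractions import Fraction

ETA_DIAG = [-1, 1, 1, 1]
R = range(4)

# Unit timelike vector: gamma (1, v, 0, 0) with v = 3/5, gamma = 5/4.
U = [Fraction(5, 4), Fraction(3, 4), Fraction(0), Fraction(0)]
U_lower = [e * u for e, u in zip(ETA_DIAG, U)]

# Mixed components P^mu_beta = delta^mu_beta + U^mu U_beta.
P = [[(1 if m == b else 0) + U[m] * U_lower[b] for b in R] for m in R]


def dot(a, b):
    """Scalar product eta_{alpha beta} a^alpha b^beta."""
    return sum(e * x * y for e, x, y in zip(ETA_DIAG, a, b))


def apply(p, vec):
    """Components P^mu_beta V^beta."""
    return [sum(p[m][b] * vec[b] for b in R) for m in R]


V = [Fraction(2), Fraction(-1), Fraction(3), Fraction(5)]   # arbitrary vector
V_perp = apply(P, V)
V_perp_twice = apply(P, V_perp)

print(dot(U, U), dot(U, V_perp), V_perp == V_perp_twice)
```

Applying `P` a second time changes nothing, which is the defining property of a projection.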
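(b) Show that the tensor
\[ \eta_{\mu\nu} - \frac{q_\mu q_\nu}{q^\alpha q_\alpha}, \]
with the only restriction being that $\vec q$ is not a null vector, projects orthogonally. Based upon (a) we can guess that "projects orthogonally" means that this tensor converts vectors into one-forms that are orthogonal to $\vec q$. (Note that the given tensor produces a one-form from a vector input because it's a $\binom{0}{2}$ tensor.) Much like in (a) we simply apply the given tensor to an arbitrary vector, say $\vec s$. Here this produces a one-form. And then we apply this one-form to $\vec q$ to show that it's zero, indicating a one-form orthogonal to $\vec q$.
\begin{align}
\left( \eta_{\mu\nu} - \frac{q_\mu q_\nu}{q^\alpha q_\alpha} \right) s^\nu q^\mu &= \left( \eta_{\mu\nu} q^\mu - \frac{q_\mu q^\mu\, q_\nu}{q^\alpha q_\alpha} \right) s^\nu \nonumber\\
&= \left( \eta_{\nu\mu} q^\mu - \frac{q_\mu q^\mu}{q^\alpha q_\alpha}\, q_\nu \right) s^\nu, && \text{metric tensor is its own transpose,} \nonumber\\
&= (q_\nu - q_\nu)\, s^\nu, \nonumber\\
&= 0. \tag{3.183}
\end{align}
This fails when $q_\mu q^\mu = 0$ because then of course we have zero in the denominator of the definition. Now the relation to (a) is clear. In (a) we didn't need the $q_\mu q^\mu$ in the denominator of the 2nd term, nor the negative sign, because it was negative unity for vector $\vec U$.
(c) Show that
\[ \mathbf{P}(\vec V_\perp, \vec W_\perp) = \mathbf{g}(\vec V_\perp, \vec W_\perp). \]
I think it is as simple as this:
\begin{align}
\mathbf{P}(\vec V_\perp, \vec W_\perp) &= P_{\alpha\beta} V_\perp^\alpha W_\perp^\beta, \nonumber\\
&= (\eta_{\alpha\beta} + U_\alpha U_\beta)\, W_\perp^\beta V_\perp^\alpha, && \text{substituting definition given} \nonumber\\
&= \eta_{\alpha\beta} W_\perp^\beta V_\perp^\alpha, && \text{using (a) (i) above,} \nonumber\\
&= \mathbf{g}(\vec V_\perp, \vec W_\perp). \tag{3.184}
\end{align}
32 (a) Prove the given transformation law for $\binom{0}{2}$ tensors.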
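\begin{align}
f_{\bar\alpha\bar\beta} &= \mathbf{f}(\vec e_{\bar\alpha}, \vec e_{\bar\beta}) \nonumber\\
&= f_{\alpha\beta}\, \tilde\omega^\alpha \otimes \tilde\omega^\beta (\vec e_{\bar\alpha}, \vec e_{\bar\beta}), && \text{using frame-independent definition of } \mathbf{f}, \text{ Eq. (3.26)} \nonumber\\
&= f_{\alpha\beta}\, \tilde\omega^\alpha \otimes \tilde\omega^\beta (\Lambda^{\mu}{}_{\bar\alpha} \vec e_\mu, \Lambda^{\nu}{}_{\bar\beta} \vec e_\nu), && \text{using Eq. (2.15)} \nonumber\\
&= f_{\alpha\beta}\, \Lambda^{\mu}{}_{\bar\alpha} \Lambda^{\nu}{}_{\bar\beta}\, \delta^{\alpha}{}_{\mu}\, \delta^{\beta}{}_{\nu}, && \text{using Eq. (3.24, 3.25)} \nonumber\\
&= f_{\alpha\beta}\, \Lambda^{\alpha}{}_{\bar\alpha} \Lambda^{\beta}{}_{\bar\beta}. \tag{3.185}
\end{align}
Let $A = (a_{\alpha\beta})$ be the matrix with components $a_{\alpha\beta}$, and similarly for $B$. It's easy to see that regular matrix multiplication is $(AB)_{\alpha\beta} = a_{\alpha\gamma} b_{\gamma\beta}$, i.e. $AB = (a)(b)$. So we can write
\begin{align}
f_{\bar\alpha\bar\beta} &= \Lambda^{\alpha}{}_{\bar\alpha}\, f_{\alpha\beta}\, \Lambda^{\beta}{}_{\bar\beta}, && \text{just rearranging} \nonumber\\
(\bar f) &= (\Lambda)^T (f)\, (\Lambda). \tag{3.186}
\end{align}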
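32 (b) Prove that the familiar Lorentz transformation associated with a "velocity boost" obeys the generalization suggested. The familiar Lorentz transformation was given on p. 22 by Eq. (1.12), and deemed a "velocity boost" $v$ in the $x$-direction.
\begin{align}
\bar t &= \gamma t - v\gamma x \nonumber\\
\bar x &= \gamma x - v\gamma t \nonumber\\
\bar y &= y \nonumber\\
\bar z &= z \nonumber
\end{align}
where $\gamma = 1/\sqrt{1 - v^2}$. Note this transformation can be written in matrix form
\[ \begin{pmatrix} \bar t \\ \bar x \\ \bar y \\ \bar z \end{pmatrix} = \begin{pmatrix} \gamma & -v\gamma & 0 & 0 \\ -v\gamma & \gamma & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} t \\ x \\ y \\ z \end{pmatrix} \]
Testing whether this transformation matrix meets Eq. (3.71) is just a matter of doing the matrix multiplication. Writing Eq. (3.71) in matrix form:
\[ (\bar\eta) = (\Lambda)^T (\eta)(\Lambda), \qquad \text{see problem 32 (a)} \tag{3.187} \]
Substituting our familiar Lorentz transformation we get
\begin{align}
\text{RHS} &= (\Lambda)^T (\eta)(\Lambda), \nonumber\\
&= \begin{pmatrix} \gamma & -v\gamma & 0 & 0 \\ -v\gamma & \gamma & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}
\begin{pmatrix} -1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}
\begin{pmatrix} \gamma & -v\gamma & 0 & 0 \\ -v\gamma & \gamma & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \nonumber\\
&= \begin{pmatrix} \gamma & -v\gamma & 0 & 0 \\ -v\gamma & \gamma & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}
\begin{pmatrix} -\gamma & v\gamma & 0 & 0 \\ -v\gamma & \gamma & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \nonumber\\
&= \begin{pmatrix} -1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \nonumber\\
&= \text{LHS}, \tag{3.188}
\end{align}
where the last step uses $\gamma^2 (1 - v^2) = 1$.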
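32 (c) Suppose $(L)$ and $(\Lambda)$ are two matrices that satisfy Eq. (3.71). Prove that $(\Lambda)(L)$ also obeys Eq. (3.71). This is, of course, what we would hope since each Lorentz transformation corresponds to changing reference frames, and $(\Lambda)(L) = (N)$ corresponds to changing twice.
\begin{align}
\text{RHS} &= (N)^T (\eta)(N) \nonumber\\
&= ((\Lambda)(L))^T (\eta) ((\Lambda)(L)) \nonumber\\
&= (L)^T (\Lambda)^T (\eta) (\Lambda)(L) \nonumber\\
&= (L)^T \left[ (\Lambda)^T (\eta)(\Lambda) \right] (L) \nonumber\\
&= (L)^T (\eta)\, (L), && \text{using Eq. (3.71)} \nonumber\\
&= (\eta), && \text{using Eq. (3.71)} \nonumber\\
&= \text{LHS}. \tag{3.189}
\end{align}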
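The matrix identities of problems 32 (b) and 32 (c) are simple to confirm by direct multiplication. The following sketch is my own check, not Schutz's: it assumes two boosts along $x$ with speeds $3/5$ and $4/5$, chosen because their $\gamma$ factors ($5/4$ and $5/3$) are rational, so the comparison with $\eta$ is exact.

```python
from fractions import Fraction

ETA = [[-1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
R = range(4)


def matmul(a, b):
    """Ordinary 4x4 matrix product."""
    return [[sum(a[i][k] * b[k][j] for k in R) for j in R] for i in R]


def transpose(a):
    return [[a[j][i] for j in R] for i in R]


def boost(v, gamma):
    """Velocity boost v along x, Eq. (1.12) in matrix form (c = 1)."""
    return [[gamma, -v * gamma, 0, 0],
            [-v * gamma, gamma, 0, 0],
            [0, 0, 1, 0],
            [0, 0, 0, 1]]


def satisfies_3_71(lam):
    """True if Lambda^T eta Lambda == eta, the matrix form of Eq. (3.71)."""
    return matmul(matmul(transpose(lam), ETA), lam) == ETA


LAMBDA = boost(Fraction(3, 5), Fraction(5, 4))
L = boost(Fraction(4, 5), Fraction(5, 3))
N = matmul(LAMBDA, L)          # two successive changes of frame

print(satisfies_3_71(LAMBDA), satisfies_3_71(L), satisfies_3_71(N))
```

The composite `N` is itself a boost, with speed $(3/5 + 4/5)/(1 + 12/25) = 35/37$, in agreement with the relativistic velocity-addition law.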
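33 (a) Find the matrix for the identity element of the Lorentz group. Apparently "identity element" is a very general concept, beyond just group theory: http://en.wikipedia.org/wiki/Identity_element. For the Lorentz group, we seek $I$ such that $IL = L$ for all elements $L$. Clearly the $4 \times 4$ identity matrix $I$ meets this requirement. Note that $\Lambda(v = 0) = I$. The implicit matrix in Eq. (1.12) was
\[ \begin{pmatrix} \gamma & -v\gamma & 0 & 0 \\ -v\gamma & \gamma & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \]
where $\gamma = 1/\sqrt{1 - v^2}$. Its inverse is
\[ \begin{pmatrix} \gamma & v\gamma & 0 & 0 \\ v\gamma & \gamma & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \]
which is obvious on physical grounds, and can be easily confirmed by multiplication:
\begin{align}
\begin{pmatrix} \gamma & v\gamma & 0 & 0 \\ v\gamma & \gamma & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}
\begin{pmatrix} \gamma & -v\gamma & 0 & 0 \\ -v\gamma & \gamma & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}
&= \begin{pmatrix} \gamma^2 (1 - v^2) & 0 & 0 & 0 \\ 0 & \gamma^2 (1 - v^2) & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \nonumber\\
&= \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \tag{3.190}
\end{align}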
33 (b) Prove that the determinant of any matrix representing a Lorentz transformation is $\pm 1$.
It's easy to show that the determinant of the Lorentz transformation associated with the "velocity boost" $v$ in the $x$-direction is $+1$. But what about velocity components in different directions? In Chapter 1 we did find the most general Lorentz transformation for arbitrarily oriented velocity. But it would be messy to find the determinant, and also I'm not sure if Schutz wants us to consider this familiar Lorentz transformation, or his generalization defined by Eq. (3.71). Let's work with the generalization defined by Eq. (3.71).
\begin{align}
|(\eta)| &= |(L)^T (\eta)(L)| \nonumber\\
&= |(L)^T|\, |(\eta)|\, |(L)| \nonumber\\
-1 &= -|(L)|^2 \nonumber\\
|(L)| &= \pm 1, \tag{3.191}
\end{align}
where we have used $|(\eta)| = -1$, $|(L)^T| = |(L)|$, and the properties of the determinant of matrices specified at http://en.wikipedia.org/wiki/Determinant#Multiplicativity_and_matrix_groups. Clearly if $|A| = 1$ and $|B| = 1$, then $|C| = |AB| = 1$, and this forms a subgroup. But if $|A| = -1$ and $|B| = -1$, then $|C| = |AB| = 1$. Thus $C$ is not a member, and the set of matrices with determinant $-1$ does not form a subgroup because it fails to meet the closure axiom. The axioms of a group are given here http://en.wikipedia.org/wiki/Group_(mathematics).
33 (c) Show that the 3D orthogonal matrices form a group. Matrix multiplication meets the following three axioms of a group: associativity, identity, and invertibility. We need to show that the set of 3D orthogonal matrices is closed. That is, if $A$ and $B$ are elements of the set, is $C = AB$ also an element?
\begin{align}
C^T C &= (AB)^T (AB) \nonumber\\
&= (B^T A^T)(A B) \nonumber\\
&= B^T (A^T A)\, B, && \text{associativity} \nonumber\\
&= B^T (I)\, B, && \text{orthogonality of } A \nonumber\\
&= B^T B \nonumber\\
&= I, && \text{orthogonality of } B \nonumber
\end{align}
which implies the orthogonality of $C$. Thus the set is closed, and forms a matrix group. If we remove the first row and column of the $L(4)$ matrices we see that Eq. (3.71) becomes just the condition for orthogonal 3D matrices given in problem 20. I believe this implies that the $O(3)$ matrices are a subgroup of $L(4)$. [Yes, this essentially agrees with Schutz's solution.]
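34 Introduce variables: $u = t - x$ and $v = t + x$.
(a) We are given that $\vec e_u$ connects the event $u = 1$, $v = 0$, $y = 0$, $z = 0$ to the origin ($u = 0$, $v = 0$, $y = 0$, $z = 0$). Therefore
\begin{align}
u &= 1 = t - x \nonumber\\
v &= 0 = t + x \nonumber
\end{align}
so that $t = 1/2$ and $x = -1/2$. And thus $\vec e_u = (\vec e_t - \vec e_x)/2$. Similarly, $\vec e_v$ connects the event $u = 0$, $v = 1$, $y = 0$, $z = 0$ to the origin. This event is
\begin{align}
u &= 0 = t - x \nonumber\\
v &= 1 = t + x \nonumber
\end{align}
and can be written as $t = 1/2$ and $x = 1/2$. And thus $\vec e_v = (\vec e_t + \vec e_x)/2$.
(b) Show that $\vec e_u$, $\vec e_v$, $\vec e_y$, $\vec e_z$ form a basis.
To be a basis, the four vectors must be linearly independent: no one of them can be written as a linear combination of the others. From (a), $\vec e_t = \vec e_u + \vec e_v$ and $\vec e_x = \vec e_v - \vec e_u$, so any vector expanded on $\{\vec e_t, \vec e_x, \vec e_y, \vec e_z\}$ can also be expanded on $\{\vec e_u, \vec e_v, \vec e_y, \vec e_z\}$. Four vectors that span a four-dimensional space are linearly independent, so they form a basis. Note that
\[ \vec e_u \cdot \vec e_y = \vec e_u \cdot \vec e_z = 0, \qquad \vec e_v \cdot \vec e_y = \vec e_v \cdot \vec e_z = 0, \]
but
\[ \vec e_u \cdot \vec e_v = (\vec e_t - \vec e_x)/2 \cdot (\vec e_t + \vec e_x)/2 = \tfrac14 (\vec e_t \cdot \vec e_t - \vec e_x \cdot \vec e_x) = -\tfrac12 \neq 0, \tag{3.192} \]
so the basis is not an orthogonal one.
(c) The metric tensor is given in terms of the basis vectors by Eq. (3.5). The trick to finding the metric tensor is then to do the scalar product in the basis where we know the metric tensor so that we can find the metric tensor in the new basis.
\begin{align}
\eta_{uu} &= \vec e_u \cdot \vec e_u \nonumber\\
&= (\vec e_t - \vec e_x)/2 \cdot (\vec e_t - \vec e_x)/2 \nonumber\\
&= \tfrac14 (\vec e_t \cdot \vec e_t + \vec e_x \cdot \vec e_x) \nonumber\\
&= \tfrac14 (-1 + 1) \nonumber\\
&= 0 \tag{3.193}
\end{align}
\begin{align}
\eta_{uv} = \eta_{vu} &= \vec e_u \cdot \vec e_v \nonumber\\
&= (\vec e_t - \vec e_x)/2 \cdot (\vec e_t + \vec e_x)/2 \nonumber\\
&= \tfrac14 (\vec e_t \cdot \vec e_t - \vec e_x \cdot \vec e_x) \nonumber\\
&= \tfrac14 (-1 - 1) \nonumber\\
&= -\tfrac12 \tag{3.194}
\end{align}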
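\begin{align}
\eta_{uy} &= \vec e_u \cdot \vec e_y \nonumber\\
&= (\vec e_t - \vec e_x)/2 \cdot \vec e_y \nonumber\\
&= 0 \tag{3.195}
\end{align}
\begin{align}
\eta_{uz} &= \vec e_u \cdot \vec e_z \nonumber\\
&= (\vec e_t - \vec e_x)/2 \cdot \vec e_z \nonumber\\
&= 0 \tag{3.196}
\end{align}
And similarly $\eta_{vy} = \eta_{vz} = 0$.
\begin{align}
\eta_{vv} &= \vec e_v \cdot \vec e_v \nonumber\\
&= (\vec e_t + \vec e_x)/2 \cdot (\vec e_t + \vec e_x)/2 \nonumber\\
&= \tfrac14 (\vec e_t \cdot \vec e_t + \vec e_x \cdot \vec e_x) \nonumber\\
&= \tfrac14 (-1 + 1) \nonumber\\
&= 0 \tag{3.197}
\end{align}
While of course the rest of the tensor is the same as the familiar one. Collecting terms:
\[ \eta = \begin{pmatrix} 0 & -\tfrac12 & 0 & 0 \\ -\tfrac12 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}. \tag{3.198} \]
(d) We have already shown that $\vec e_u \cdot \vec e_u = 0$, in (c) above, which means that it is a null vector, c.f. p. 45. (Not sure why he asked us that now, instead of before (c)!?) Similarly $\vec e_v \cdot \vec e_v = 0$, so it is also a null vector. And we have already shown that $\vec e_u \cdot \vec e_v = -1/2 \neq 0$, so $\vec e_u$ and $\vec e_v$ are not orthogonal.
(e) Compute the four one-forms $\tilde du$, $\tilde dv$, $\mathbf{g}(\vec e_u, \;)$, and $\mathbf{g}(\vec e_v, \;)$ in terms of $\tilde dt$, $\tilde dx$.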
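The formula for the gradient is given by Eq. (3.15), but we're asked here for the answer "in terms of $\tilde dt$, $\tilde dx$". So I think we're to use the basis formed from $\vec e_t$, $\vec e_x$ etc., and since the gradient is a one-form the basis elements are $\tilde\omega^t$, $\tilde\omega^x$, etc. Note that Eq. (3.20) lets us write these basis one-forms as $\tilde dt = \tilde\omega^t$, $\tilde dx = \tilde\omega^x$. The computation is trivial:
\begin{align}
\tilde du &= \frac{\partial u}{\partial x^\alpha}\, \tilde dx^\alpha \nonumber\\
&= \frac{\partial (t - x)}{\partial t}\, \tilde dt + \frac{\partial (t - x)}{\partial x}\, \tilde dx \nonumber\\
&= \tilde dt - \tilde dx. \tag{3.199}
\end{align}
And similarly for
\begin{align}
\tilde dv &= \frac{\partial v}{\partial x^\alpha}\, \tilde dx^\alpha \nonumber\\
&= \frac{\partial (t + x)}{\partial t}\, \tilde dt + \frac{\partial (t + x)}{\partial x}\, \tilde dx \nonumber\\
&= \tilde dt + \tilde dx. \tag{3.200}
\end{align}
And now we're asked for $\mathbf{g}(\vec e_u, \;)$. Look in Section 3.5 if you're having trouble here. Working in the $(u, v, y, z)$ basis,
\begin{align}
\mathbf{g}(\vec e_u, \;) &= \eta_{\alpha\beta}\, \tilde\omega^\alpha \otimes \tilde\omega^\beta (\vec e_u, \;) \nonumber\\
&= \eta_{\alpha\beta}\, \tilde\omega^\alpha(\vec e_u)\, \tilde\omega^\beta(\;) \nonumber\\
&= \eta_{\alpha\beta}\, \delta^{\alpha}{}_{u}\, \tilde\omega^\beta(\;) \nonumber\\
&= \eta_{u\beta}\, \tilde\omega^\beta(\;) \tag{3.201}
\end{align}
The "$u$ row" of $\eta$ only has one non-zero element, so we might as well continue to narrow this down:
\begin{align}
\mathbf{g}(\vec e_u, \;) &= \eta_{u\beta}\, \tilde\omega^\beta(\;) \nonumber\\
&= \eta_{uv}\, \tilde\omega^v(\;) \nonumber\\
&= -\tfrac12\, \tilde\omega^v(\;) \nonumber\\
&= -\tfrac12\, \tilde dv(\;) \nonumber\\
&= -\tfrac12\, \tilde dt - \tfrac12\, \tilde dx. \tag{3.202}
\end{align}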
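And similarly for $\mathbf{g}(\vec e_v, \;)$.
\begin{align}
\mathbf{g}(\vec e_v, \;) &= \eta_{\alpha\beta}\, \tilde\omega^\alpha \otimes \tilde\omega^\beta (\vec e_v, \;) \nonumber\\
&= \eta_{\alpha\beta}\, \tilde\omega^\alpha(\vec e_v)\, \tilde\omega^\beta(\;) \nonumber\\
&= \eta_{\alpha\beta}\, \delta^{\alpha}{}_{v}\, \tilde\omega^\beta(\;) \nonumber\\
&= \eta_{v\beta}\, \tilde\omega^\beta(\;) \nonumber\\
&= \eta_{vu}\, \tilde\omega^u(\;), && \text{only non-zero element} \nonumber\\
&= -\tfrac12\, \tilde\omega^u(\;) \nonumber\\
&= -\tfrac12\, \tilde du(\;) \nonumber\\
&= -\tfrac12\, \tilde dt + \tfrac12\, \tilde dx. \tag{3.203}
\end{align}
[Schutz stopped one line before I have above, so he left the answer in terms of $\tilde du$ and $\tilde dv$.]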
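The metric components in the $(u, v, y, z)$ basis, and the one-forms $\mathbf{g}(\vec e_u, \;)$ and $\mathbf{g}(\vec e_v, \;)$, follow mechanically from the $(t, x, y, z)$ components of the new basis vectors. The short sketch below is my own check; the dictionary `basis` and the helper `dot` are illustrative names only.

```python
from fractions import Fraction

HALF = Fraction(1, 2)
ETA_DIAG = [-1, 1, 1, 1]            # metric in the (t, x, y, z) basis

# (t, x, y, z) components of the new basis vectors, from part (a).
basis = {
    "u": [HALF, -HALF, 0, 0],
    "v": [HALF, HALF, 0, 0],
    "y": [0, 0, 1, 0],
    "z": [0, 0, 0, 1],
}
labels = ["u", "v", "y", "z"]


def dot(a, b):
    """Scalar product computed in the (t, x, y, z) basis."""
    return sum(e * p * q for e, p, q in zip(ETA_DIAG, a, b))


# Metric components in the new basis: eta_{ab} = e_a . e_b
metric = [[dot(basis[a], basis[b]) for b in labels] for a in labels]

# g(e_u, ) and g(e_v, ) as components on (dt, dx, dy, dz): lower the index.
g_eu = [e * c for e, c in zip(ETA_DIAG, basis["u"])]
g_ev = [e * c for e, c in zip(ETA_DIAG, basis["v"])]

print(metric)
print(g_eu, g_ev)
```

The zeros on the diagonal of the $u$, $v$ block are the statement that $\vec e_u$ and $\vec e_v$ are null, and the off-diagonal $-1/2$ shows they are not orthogonal to each other.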
Chapter 4 Perfect fluids in Special Relativity
GR books tend to cover mostly the same material: SR, tensor calculus, curvature and the Riemann tensor, Einstein's field equations, Schwarzschild solution, etc. But not necessarily the material of this chapter on fluid mechanics. This chapter has come in lieu of one on electricity and magnetism. Either of these topics would give us practice in working with tensors and developing frame-invariant equations, as we will have to do to understand the development of the Einstein field equations. I like Schutz's choice here because fluid mechanics seems of more general relevance to astrophysical applications of GR. The last question of § 4.10 is on electricity and magnetism, so Schutz is tipping his hat to the other possible choice. See (Hobson et al., 2009) for a full chapter on electricity and magnetism with the aim of preparing the student for the development of the Einstein field equations (but without a fluids chapter).
4.1
Fluids
The continuum hypothesis is discussed in much more detail in classical fluid dynamics texts, see for example (Batchelor, 1967). Traditionally the qualitative distinction between solids and fluids is that solids can sustain a shear stress (a force parallel to an interface between elements) without a strain (relative movement between the elements). I'm not sure why Schutz avoids making this distinction here.
4.2
Dust: the number-flux vector $\vec N$
The discussion at the end of this section, on p. 88, is very fundamental. It’s perhaps worth adding that Einstein arrived at GR by searching for a description of gravity that was consistent with SR in the sense that it was written in terms of tensors that are invariant under Lorentz transformations. One can develop such a tensorial description of classical Electricity and Magnetism, see problem 25 of §4.10, or (Hobson et al., 2009).
4.3
One-forms and surfaces
Number density as a timelike flux
I found this section confusing until I reached the end and realized that all he wanted to say was that the time-component of $\vec N$ in Eq. (4.5) looks like the spatial components [but with velocity $c$]. The mathematics is much clearer than all the words in my opinion. This point is also immediately obvious to anyone who looked at the four-velocity and realized that in the MCRF one is moving at the speed of light in the direction of time.

Representation of a frame by a one-form
The last line of this section,
\[ E = -\vec p \cdot \vec U, \]
looks very confusing because, taken literally, it suggests that energy $E$ is a scalar, i.e. a frame-independent concept! But we know that energy is the zero-component of the four-momentum, and as such depends upon the reference frame (as one would expect!). The resolution is to flip back to p. 48 and remind oneself that Eq. (2.35) was written with $\vec U = \vec U_{\rm obs}$ being the 4-velocity of the observer:
\[ E = -\vec p \cdot \vec U_{\rm obs}. \]
In short, he's being sloppy at the end of this section.

4.4
Dust again: the stress-energy tensor
This is great practice for developing the frame-independent view of tensor equations: energy density must be a component of a $\binom{2}{0}$ tensor since it transforms like $\gamma^2$. Indeed, the stress-energy tensor $\mathbf{T}$ forms the RHS of the Einstein field equations.
4.5
General Fluids
Symmetry of $T^{\alpha\beta}$ in the MCRF
Don't take the lower indices of $F_1^i$ as indicating the covariant components. They are just indices to give names to the forces on the different faces so he can talk about them.

Conservation of energy-momentum
Typo bottom of p. 98: $-l^2\, T^{0x}(x = a)$ should be $-l^2\, T^{0x}(x = l)$.
4.6
Perfect fluids
No viscosity
The only matrix diagonal in all frames is a multiple of the identity matrix. Here by all frames, he means all orientations of the spatial axes. In two spatial dimensions this is easy to see. Just apply the rotation matrix
\[ R = \begin{pmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{pmatrix} \]
Then consider the transformation of some arbitrary diagonal matrix
\[ A = \begin{pmatrix} a & 0 \\ 0 & b \end{pmatrix} \]
We find that in a co-ordinate system rotated by $\theta$ we have
\[ R^T A\, R = \begin{pmatrix} a \cos^2\theta + b \sin^2\theta & (a - b) \cos\theta \sin\theta \\ (a - b) \cos\theta \sin\theta & b \cos^2\theta + a \sin^2\theta \end{pmatrix} \tag{4.1} \]
which is again diagonal for all $\theta$ iff $a = b$.
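Eq. (4.1) is easy to confirm numerically. The sketch below is my own illustration with an arbitrarily chosen angle and diagonal entries; the function name `rotate_diagonal` is hypothetical.

```python
import math


def rotate_diagonal(a, b, theta):
    """Return R^T A R for A = diag(a, b) and R = [[cos, sin], [-sin, cos]]."""
    c, s = math.cos(theta), math.sin(theta)
    r = [[c, s], [-s, c]]
    diag = [[a, 0.0], [0.0, b]]
    # (R^T A R)_{ij} = sum over k, l of R_{ki} A_{kl} R_{lj}
    return [[sum(r[k][i] * diag[k][l] * r[l][j]
                 for k in range(2) for l in range(2))
             for j in range(2)] for i in range(2)]


theta = 0.3
unequal = rotate_diagonal(2.0, 5.0, theta)   # a != b: off-diagonal terms appear
equal = rotate_diagonal(4.0, 4.0, theta)     # a == b: stays diagonal

print(unequal)
print(equal)
```

With $a \neq b$ the off-diagonal entries equal $(a - b)\cos\theta\sin\theta$, and they vanish for every $\theta$ only when $a = b$, i.e. when the matrix is a multiple of the identity.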
The conservation laws
Typo: Eq. (4.52) should read:
\[ (\rho + p)\, U^{i}{}_{,\beta} U^\beta + p_{,\beta}\, \eta^{i\beta} = 0 \]
To see that $U_{i,\beta} U^\beta$ is the definition of the four-acceleration $a_i$, we must piece together the definition in Eq. (2.32) and the notation in Eq. (3.64), and one must lower the index using the metric, which in special relativity is constant and so commutes with derivatives.

It's a bit beyond the scope of this text, but Eq. (4.55) is the famous Euler equation, which is the inviscid form of the Navier-Stokes equations of classical fluid mechanics. As Schutz says, it's the $F = ma$ of fluid mechanics.
4.10
Problems
1. To which of these situations does the continuum hypothesis apply?
(a) Planetary motions in the solar system. The continuum hypothesis does not apply because there are only eight planets, and they have different orbits, periods, velocities, etc.
(b) A lava flow from a volcano is likely well suited to the continuum approximation because the molten rock flows like a liquid. Presumably elements can be found that are much bigger than the molecules of the minerals but small enough to have uniform macroscopic properties like temperature, density, etc.
(c) Traffic on a major road at rush hour is likely to be well suited to the continuum approximation if one considers scales much larger than an individual car so there are many cars in the element, but small enough that the speed and direction of the cars in one element are roughly constant. At rush hour it's more likely to have bumper to bumper traffic, which would force the cars in a vicinity to travel at the same speed.
(d) Traffic at an intersection with stop signs is likely to be not well suited to the continuum approximation. The stop signs ensure that the cars in an element will have different speeds at different times. There is no near-uniform element.
(e) Plasma dynamics is likely to be well suited to the continuum approximation unless the plasma is extremely rarefied. In the latter case there might not be sufficient collisions to bring the ions into statistical equilibrium.
2. "Flux across a surface of constant $x$" is often called "flux in the $x$ direction". This is inappropriate because it implies that "flux in the $x$ direction" is a component of a vector. However, the "flux across a surface of constant $x$" is actually the result of the application of a vector to a one-form, or vice versa:
\[ \vec N(\tilde dx) = \langle \tilde n, \vec N \rangle = \langle \tilde dx, \vec N \rangle \tag{4.2} \]
This was described in Section 4.3.
3. (a) Galilean momentum is frame-dependent in a manner that relativistic momentum is not. Galilean momentum is the ordinary 3-vector velocity $\mathbf{v}$ times the (frame-independent) mass $m$:
\[ \mathbf{p}_g = m \mathbf{v}. \tag{4.3} \]
The velocity depends very much on the frame. It's not just the components that change with reference frame, but the vector itself that changes. In contrast with Galilean momentum, the relativistic momentum, $\vec p$, is a 4-vector, created by the scalar rest mass $m$ times the 4-velocity $\vec U$:
\[ \vec p = m \vec U. \tag{4.4} \]
The rest mass is obviously frame-invariant. The 4-velocity is too, while its components do depend upon reference frame. Note for instance that the magnitude of the 4-velocity is always
\[ \vec U \cdot \vec U = U_\alpha U^\alpha = -1. \tag{4.5} \]
(b) How is the above situation possible given that Galilean momentum is an approximation to relativistic momentum? Hint: Define a Galilean 4-momentum. The Galilean 4-momentum would look like the regular 3-momentum but with a time component,
\[ \vec p_G \underset{\mathcal{O}}{\to} m\{1, v^x, v^y, v^z\}, \tag{4.6} \]
where $\{v^x, v^y, v^z\}$ is the ordinary 3-velocity in frame $\mathcal{O}$. Because $|\mathbf{v}| \ll 1$, the $\vec p_G$ vector is dominated by its first component, just the inertial mass. Thus, for instance, the magnitude of the Galilean 4-momentum is almost frame-invariant.
4. Show that the number density of dust to an arbitrary observer with 4-velocity $\vec{U}_{obs}$ is $-\vec{N}\cdot\vec{U}_{obs}$.
Let’s transform ourselves into the frame of the observer, $O$, so that
\[ \vec{U}_{obs} \to_O \{1, 0, 0, 0\}. \]
Now,
\[ -\vec{N}\cdot\vec{U}_{obs} = -n\,\vec{U}\cdot\vec{U}_{obs} = -n(-U^0) = \frac{n}{\sqrt{1-v^2}}, \tag{4.7} \]
where $v$ is the ordinary velocity of the dust measured in the $O$ frame. But this is exactly what we came up with for the number density in Eq. (4.2).
Now we note that the observer’s 4-velocity and $\vec{N}$ are frame-invariant vectors, and hence $\vec{N}\cdot\vec{U}_{obs}$ is frame-invariant. So the result must be true in all reference frames.
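The frame-invariance argument above can be checked numerically. The following is a minimal sketch, assuming units with c = 1 and the metric signature (−, +, +, +); the density, velocities and helper names (`minkowski_dot`, `four_velocity`, `boost_x`) are hypothetical choices made only for this illustration.

```python
import math

def minkowski_dot(a, b):
    """Inner product with signature (-, +, +, +), in units where c = 1."""
    return -a[0] * b[0] + a[1] * b[1] + a[2] * b[2] + a[3] * b[3]

def four_velocity(vx, vy, vz):
    """4-velocity components for a given 3-velocity (|v| < 1)."""
    gamma = 1.0 / math.sqrt(1.0 - (vx * vx + vy * vy + vz * vz))
    return (gamma, gamma * vx, gamma * vy, gamma * vz)

def boost_x(vec, v):
    """Components of a 4-vector in a frame moving at speed v along +x."""
    gamma = 1.0 / math.sqrt(1.0 - v * v)
    t, x, y, z = vec
    return (gamma * (t - v * x), gamma * (x - v * t), y, z)

n_rest = 2.5                    # hypothetical MCRF number density
dust_v = (0.3, -0.2, 0.1)       # hypothetical dust 3-velocity in frame O
U_dust = four_velocity(*dust_v)
N_flux = tuple(n_rest * u for u in U_dust)   # N = n U
U_obs = (1.0, 0.0, 0.0, 0.0)    # observer at rest in O

# -N.U_obs evaluated in the observer's frame, and the expected n * gamma.
density_in_O = -minkowski_dot(N_flux, U_obs)
expected = n_rest / math.sqrt(1.0 - sum(c * c for c in dust_v))

# The same scalar computed from the components in a boosted frame.
frame_speed = 0.6
density_boosted = -minkowski_dot(boost_x(N_flux, frame_speed),
                                 boost_x(U_obs, frame_speed))
print(density_in_O, expected, density_boosted)
```

All three printed numbers should agree to rounding error, which is the statement that the scalar $-\vec{N}\cdot\vec{U}_{obs}$ does not care which frame's components are used to evaluate it.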
5. Complete the proof that Eq. (4.14) [for the stress-energy tensor] defines a tensor by arguing that it must be linear in both its arguments.
Eq. (4.14) defines the stress-energy tensor
\[ \mathbf{T}(\tilde{d}x^\alpha, \tilde{d}x^\beta) = T^{\alpha\beta}, \]
and $T^{\alpha\beta}$ is the “flux of $\alpha$-momentum across a surface of constant $x^\beta$”. Of course we require this flux to be proportional to the area of the surface of constant $x^\beta$. This requirement is met because $\mathbf{T}(\tilde{d}x^\alpha, \tilde{d}x^\beta)$ is linear in the 2nd argument $\tilde{d}x^\beta$. Furthermore, the $\alpha$-component of the four-momentum is
\[ p^\alpha = \langle \tilde{d}x^\alpha, \vec{p} \rangle, \tag{4.8} \]
which is linear in $\tilde{d}x^\alpha$, the first argument of $\mathbf{T}(\tilde{d}x^\alpha, \tilde{d}x^\beta)$, by the properties of one-forms and vectors.
6. Derive Eq. (4.19).
We only need to show that $\vec{p}\otimes\vec{N}$ has only one non-zero component in the MCRF, namely $(T^{00})_{MCRF} = nm$. In the MCRF, say $O$,
\[ \vec{U} \to_O (1, 0, 0, 0), \qquad \vec{N} = n\vec{U} \to_O (n, 0, 0, 0), \qquad \vec{p} = m\vec{U} \to_O (m, 0, 0, 0). \]
Thus $\mathbf{T} = \vec{p}\otimes\vec{N}$ has only one non-zero component, namely $(T^{00})_{MCRF} = nm$.
7. Derive Eq. (4.21).
The terms in Eq. (4.21) are immediately clear from the preceding expression for the 4-velocity in frame $O$:
\[ \vec{U} \to_O (\gamma, v^x\gamma, v^y\gamma, v^z\gamma), \tag{4.9} \]
where $\gamma = 1/\sqrt{1-v^2}$ and $v^2 = \sum_i v^i v^i$.
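8. (a) Argue that Eqs. (4.25) and (4.26) can be written as statements about one-forms.
In the derivation of Eqs. (4.25) and (4.26), we started with a fluid element in the MCRF and considered how its energy could change by first a finite amount $\Delta E$, and then we took the limit of an infinitesimal change $dE$. If we simply divided by the change in proper time, $\Delta\tau$, then for instance Eq. (4.24) would become
\[ n\frac{\Delta q}{\Delta\tau} = \frac{\Delta\rho}{\Delta\tau} - \frac{\rho+p}{n}\frac{\Delta n}{\Delta\tau}. \tag{4.10} \]
Now, following Schutz, we suppose that the changes are infinitesimal. The RHS becomes
\[ \begin{aligned} \lim_{\Delta\tau\to 0}\left[\frac{\Delta\rho}{\Delta\tau} - \frac{\rho+p}{n}\frac{\Delta n}{\Delta\tau}\right] &= \frac{d\rho}{d\tau} - \frac{\rho+p}{n}\frac{dn}{d\tau} \\ &= \tilde{d}\rho\cdot\vec{U} - \frac{\rho+p}{n}\,\tilde{d}n\cdot\vec{U} \\ &= nT\,\tilde{d}S\cdot\vec{U}. \end{aligned} \tag{4.11} \]
If we strip off the common contraction with $\vec{U}$, we obtain
\[ \tilde{d}\rho - \frac{\rho+p}{n}\,\tilde{d}n = nT\,\tilde{d}S. \tag{4.12} \]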
[Solution above is different from Schutz’s solutions, but I believe the two are compatible.]
8. (b) Show that $\tilde{\Delta}q$ is not a gradient.
Unlike $\rho$ and $n$, there is no property of each element that we could define by $q$. (Yes, formally one can write $q := Q/N$, but what is $Q$? Heat is only defined in terms of transfer of energy.) I find it misleading that Schutz writes $q := Q/N$ on p. 95.
9. Show that Eq. (4.34), when $\alpha = i$, i.e. any spatial index, expresses Newton’s 2nd law.
Let’s assume we’re dealing with dust, so that the stress-energy tensor is given by Eq. (4.19). Let’s start with $\beta = 0$. Then Eq. (4.34) gives the following term:
\[ T^{i0}{}_{,0} = \frac{\partial T^{i0}}{\partial x^0} = \frac{\partial T^{i0}}{\partial t}, \tag{4.13} \]
which we interpret as the time rate of change of the $i$-direction momentum density. The remaining terms
\[ T^{ij}{}_{,j} = \frac{\partial T^{ij}}{\partial x^j} \]
we directly interpret as the divergence of the flux of $i$-direction momentum. The same terms can be written in a form familiar from classical fluid mechanics:
\[ \frac{\partial T^{i0}}{\partial t} = \frac{\partial \rho u^i}{\partial t}, \qquad \frac{\partial T^{ij}}{\partial x^j} = \frac{\partial \rho u^i u^j}{\partial x^j}, \tag{4.14} \]
where on the RHS we have used the standard symbols of fluid mechanics: $\rho$ is the mass density and $u^i$ is the fluid velocity in the $x^i$ direction. So
\[ \frac{\partial T^{i0}}{\partial t} + \frac{\partial T^{ij}}{\partial x^j} = \frac{\partial \rho u^i}{\partial t} + \frac{\partial \rho u^i u^j}{\partial x^j} = 0. \tag{4.15} \]
The RHS is the Navier-Stokes equations but without the pressure gradient term, forcing or dissipation. This is of course a form of Newton’s 2nd law, but for the case when forces are not present. This case applies to dust only. In a perfect fluid the stress-energy tensor is given by Eq. (4.38) and we immediately see the possibility of a pressure gradient term.
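10. Take the limit $|\vec{v}| \ll 1$, showing that Eq. (4.35) reduces to the equation given.
Solution:
\[ \begin{aligned} N^\alpha{}_{,\alpha} &= 0, && \text{restating Eq. (4.35)} \\ \frac{\partial n\gamma}{\partial t} + \frac{\partial n\gamma v^i}{\partial x^i} &= 0, && \text{using Eq. (4.5)} \\ \frac{\partial n}{\partial t} + \frac{\partial n v^i}{\partial x^i} &= 0, && \gamma \to 1 \text{ when } |\vec{v}| \ll 1 \end{aligned} \]
where $\gamma = 1/\sqrt{1-v^2}$.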
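11. (a) Show that the matrix $\delta^{ij}$ is unchanged when transformed by a rotation of the spatial axes.
Why is this question in this chapter? Recall the discussion of viscosity on p. 101. If there is no viscosity then $\mathbf{T}$ should be diagonal in any reference frame.
First one has to remind oneself how to transform matrices under a change of coordinates. I don’t have my linear algebra book with me, but we can find this by pretending that the matrix represents a $\binom{2}{0}$ tensor, say $\mathbf{A}$. This tensor should be an entity that is invariant under the arbitrary choice of coordinate system, so we demand that
\[ \begin{aligned} \mathbf{A} &= A^{\alpha\beta}\,\vec{e}_\alpha\otimes\vec{e}_\beta, && \text{see Chapter 3, section 6 for basis} \\ &= A^{\alpha\beta}\,\Lambda^{\bar\alpha}{}_{\alpha}\,\vec{e}_{\bar\alpha}\otimes\vec{e}_{\bar\beta}\,\Lambda^{\bar\beta}{}_{\beta}, && \text{using bases of } \bar{O} \\ &= A^{\bar\alpha\bar\beta}\,\vec{e}_{\bar\alpha}\otimes\vec{e}_{\bar\beta}, && \text{entity independent of coordinate system.} \end{aligned} \tag{4.16} \]
So it’s now clear that
\[ \begin{aligned} A^{\bar\alpha\bar\beta} &= A^{\alpha\beta}\,\Lambda^{\bar\alpha}{}_{\alpha}\,\Lambda^{\bar\beta}{}_{\beta}, \\ (A^{\bar\alpha\bar\beta}) &= (\Lambda^{\bar\alpha}{}_{\alpha})(A^{\alpha\beta})(\Lambda^{\bar\beta}{}_{\beta})^T, && \text{matrix version of previous line.} \end{aligned} \tag{4.17} \]
Now we can address the problem by replacing $(A^{\alpha\beta})$ with the matrix $\delta^{ij}$ and the transformation $(\Lambda^{\bar\alpha}{}_{\alpha})$ with a rotation $R^{\bar{i}}{}_{i}$. Let’s work in 2D. The matrix $\delta^{ij}$ in the rotated coordinate system $\bar{O}$ will be
\[ \begin{aligned} \delta^{\bar{i}\bar{j}} = R^{\bar{i}}{}_{i}\,\delta^{ij}\,R^{\bar{j}}{}_{j} &= (R)\,I\,(R)^T \\ &= \begin{pmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{pmatrix} \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix} \\ &= \begin{pmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{pmatrix} \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix} \\ &= \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}. \end{aligned} \tag{4.18} \]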
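The matrix rule $(A') = R\,A\,R^T$ used above is easy to try out numerically. Here is a minimal sketch in plain Python, with a hypothetical rotation angle and a hypothetical diagonal-but-anisotropic matrix added for contrast; the helper names are illustrative only.

```python
import math

def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def transpose(a):
    """Transpose of a 2x2 matrix."""
    return [[a[j][i] for j in range(2)] for i in range(2)]

def rotation(theta):
    """Rotation matrix in the same convention as the worked example."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, s], [-s, c]]

def transform(a, r):
    """Components of a (2 0) tensor in the rotated frame: R A R^T."""
    return matmul(matmul(r, a), transpose(r))

theta = 0.7                                   # hypothetical rotation angle
identity = [[1.0, 0.0], [0.0, 1.0]]
anisotropic = [[1.0, 0.0], [0.0, 2.0]]        # diagonal, but not a multiple of delta^{ij}

rotated_identity = transform(identity, rotation(theta))
rotated_anisotropic = transform(anisotropic, rotation(theta))
print(rotated_identity)
print(rotated_anisotropic)
```

The identity comes back unchanged, while the anisotropic matrix picks up off-diagonal entries equal to $\cos\theta\sin\theta$, which illustrates why an isotropic stress has to be proportional to $\delta^{ij}$.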
11. (b) Show that any matrix that has this property is a multiple of δ ij . It is quite obvious that any multiple of the identity matrix has this property. In fact I went through this in my notes above for § 4.6.
12. Derive Eq. (4.37) from Eq. (4.36).
We simply go term by term as Schutz suggested on p. 101. He has already addressed $\alpha = \beta = 0$. Using the convention $i \in \{1, 2, 3\}$, consider the three terms $T^{0i}$:
\[ \begin{aligned} T^{0i} &= (\rho+p)U^0U^i + p\,\eta^{0i} \\ &= (\rho+p)U^i + p\,\eta^{0i}, && \text{in MCRF, } U^0 = 1 \\ &= (\rho+p)U^i, && \eta \text{ is diagonal} \\ &= 0, && \text{in MCRF, nil spatial components of } \vec{U}. \end{aligned} \tag{4.19} \]
Then by symmetry,
\[ T^{i0} = T^{0i} = 0. \tag{4.20} \]
Finally,
\[ \begin{aligned} T^{ij} &= (\rho+p)U^iU^j + p\,\eta^{ij} \\ &= p\,\eta^{ij}, && \text{in MCRF, nil spatial components of } \vec{U} \\ &= p\,\delta^{ij}. \end{aligned} \tag{4.21} \]
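13. Supply reasoning behind Eq. (4.44).
We want the derivative of the dot product of the 4-velocity with itself. This can be written
\[ \begin{aligned} (U^\alpha U^\gamma \eta_{\alpha\gamma})_{,\beta} &= (U^\alpha U^\gamma)_{,\beta}\,\eta_{\alpha\gamma}, && \text{metric tensor is constant} \\ &= (U^\alpha{}_{,\beta}U^\gamma + U^\alpha U^\gamma{}_{,\beta})\,\eta_{\alpha\gamma}, && \text{product rule} \\ &= (2U^\alpha{}_{,\beta}U^\gamma)\,\eta_{\alpha\gamma}, && \text{metric tensor is symmetric.} \end{aligned} \tag{4.22} \]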
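14. Argue that Eq. (4.46) is the time component of Eq. (4.45) in the MCRF.
Recall Eq. (4.45) was
\[ nU^\beta\left(\frac{\rho+p}{n}U^\alpha\right)_{,\beta} + p_{,\beta}\,\eta^{\alpha\beta} = 0. \]
Although there are two indices, $\beta$ is just a dummy index (it appears both upstairs and downstairs, implying a sum over $\beta$). So the time component is when $\alpha = 0$. The following argument is not correct:
\[ \begin{aligned} 0 &= nU^\beta\left(\frac{\rho+p}{n}U^0\right)_{,\beta} + p_{,\beta}\,\eta^{0\beta} \\ &= nU^\beta\left(\frac{\rho+p}{n}\right)_{,\beta} + p_{,\beta}\,\eta^{0\beta}, && \text{since } \vec{U} = \vec{e}_0 \text{ in the MCRF} \\ &= nU^\beta\left(\frac{\rho+p}{n}\right)_{,\beta} - \frac{\partial p}{\partial t}, && \text{recall properties of } \eta \text{, see Eq. (3.44)} \end{aligned} \]
Although $\vec{U} = \vec{e}_0$ in the MCRF, we cannot immediately conclude that the gradient of the time component is zero. Instead, we must first take the gradient in Eq. (4.45), and then set $\alpha = 0$:
\[ \begin{aligned} 0 &= nU^\beta\left(\frac{\rho+p}{n}U^\alpha\right)_{,\beta} + p_{,\beta}\,\eta^{\alpha\beta}, && \text{Eq. (4.45)} \\ &= nU^\beta\left[\left(\frac{\rho+p}{n}\right)_{,\beta}U^\alpha + \frac{\rho+p}{n}\,U^\alpha{}_{,\beta}\right] + p_{,\beta}\,\eta^{\alpha\beta}, && \text{expanding} \end{aligned} \tag{4.23} \]
And now focus attention on the 2nd term in square brackets:
\[ \begin{aligned} nU^\beta\,\frac{\rho+p}{n}\,U^\alpha{}_{,\beta} &= (\rho+p)\,U^\beta U^\alpha{}_{,\beta} \\ &= (\rho+p)\,\nabla_{\vec{U}}\vec{U}, && \text{c.f. Eq. (3.68)} \\ &= (\rho+p)\,\frac{d\vec{U}}{d\tau}, && \text{c.f. Eq. (3.67)} \\ &= 0, && \text{for } t \text{ component in MCRF, c.f. p. 48} \end{aligned} \tag{4.24} \]
So for the above reason, the time component of Eq. (4.45) reduces to
\[ \begin{aligned} 0 &= nU^\beta\left[U^\alpha\left(\frac{\rho+p}{n}\right)_{,\beta} + \frac{\rho+p}{n}\,U^\alpha{}_{,\beta}\right] + p_{,\beta}\,\eta^{\alpha\beta}, && \text{from above} \\ 0 &= nU^\beta U^\alpha\left(\frac{\rho+p}{n}\right)_{,\beta} + p_{,\beta}\,\eta^{\alpha\beta}, && \text{using above} \\ 0 &= nU^\beta\left(\frac{\rho+p}{n}\right)_{,\beta} + p_{,\beta}\,\eta^{0\beta}, && \text{time component} \\ &= nU^\beta\left(\frac{\rho+p}{n}\right)_{,\beta} - \frac{\partial p}{\partial t}, && \text{recall properties of } \eta \text{, see Eq. (3.44)} \end{aligned} \tag{4.25} \]
On the other hand, Eq. (4.46) was
\[ 0 = nU^\beta U_\alpha\left(\frac{\rho+p}{n}U^\alpha\right)_{,\beta} + p_{,\beta}\,\eta^{\alpha\beta}U_\alpha. \]
Consider:
\[ \begin{aligned} U_\alpha\left(\frac{\rho+p}{n}U^\alpha\right)_{,\beta} &= U_\alpha U^\alpha\left(\frac{\rho+p}{n}\right)_{,\beta} + \frac{\rho+p}{n}\,U_\alpha U^\alpha{}_{,\beta}, && \text{product rule} \\ &= -\left(\frac{\rho+p}{n}\right)_{,\beta} + \frac{\rho+p}{n}\,U_\alpha U^\alpha{}_{,\beta}, && \text{using Eq. (2.28)} \\ &= -\left(\frac{\rho+p}{n}\right)_{,\beta}, && \text{using Eq. (4.42).} \end{aligned} \tag{4.26} \]
So Eq. (4.46) becomes:
\[ \begin{aligned} 0 &= -nU^\beta\left(\frac{\rho+p}{n}\right)_{,\beta} + p_{,\beta}\,\eta^{\alpha\beta}U_\alpha \\ &= -nU^\beta\left(\frac{\rho+p}{n}\right)_{,\beta} + p_{,\beta}\,U^\beta \\ &= -nU^\beta\left(\frac{\rho+p}{n}\right)_{,\beta} + \frac{\partial p}{\partial t}, && \text{in MCRF} \\ 0 &= nU^\beta\left(\frac{\rho+p}{n}\right)_{,\beta} - \frac{\partial p}{\partial t}, && \text{same as time component of Eq. (4.45).} \end{aligned} \tag{4.27} \]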
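15. Derive Eq. (4.48) from Eq. (4.47).
This is just trivial manipulation to encourage the student to follow the steps of the argument and become comfortable with the notation:
\[ \begin{aligned} 0 &= U^\beta\left[-n\left(\frac{\rho+p}{n}\right)_{,\beta} + p_{,\beta}\right], && \text{recall Eq. (4.47)} \\ &= U^\beta\left[-\frac{n}{n}\,(\rho_{,\beta} + p_{,\beta}) + \frac{n}{n^2}\,(\rho+p)\,n_{,\beta} + p_{,\beta}\right], && \text{product rule and chain rule} \\ &= -U^\beta\left[\rho_{,\beta} - \frac{\rho+p}{n}\,n_{,\beta}\right], && \text{algebra.} \end{aligned} \tag{4.28} \]
Which is Eq. (4.48).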
16. In the MCRF, $U^i = 0$. Why can’t we assume $U^i{}_{,\beta} = 0$?
The analogous statement in 3D space is also true. In fluid mechanics, for instance, one can always transform the equations into a frame momentarily co-moving with the local fluid velocity, but that doesn’t mean the velocity gradient will be zero. The 3-velocity in fluid mechanics, and the 4-velocity in SR, can depend upon position, so that adjacent fluid elements have different MCRFs.
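17. We have defined $a^\mu = U^\mu{}_{,\beta}U^\beta$. Show that in the nonrelativistic limit:
\[ a^i = \dot{v}^i + (\mathbf{v}\cdot\nabla)v^i = \frac{Dv^i}{Dt}. \]
The 4-velocity can be written in terms of the 3-velocity as
\[ U^\mu = [\gamma, \gamma u, \gamma v, \gamma w], \tag{4.29} \]
where
\[ \gamma = \frac{1}{\sqrt{1-v^2}}. \]
(Recall that $v$ is assumed to be the ordinary 3-velocity divided by $c$.) Then
\[ a^i = U^i{}_{,\beta}U^\beta = \gamma\frac{\partial \gamma u^i}{\partial t} + \gamma u\frac{\partial \gamma u^i}{\partial x} + \gamma v\frac{\partial \gamma u^i}{\partial y} + \gamma w\frac{\partial \gamma u^i}{\partial z}. \tag{4.30} \]
In the nonrelativistic limit $v \ll 1$, so $\gamma \approx 1$ (and the derivatives of $\gamma$ are of order $v^2$, hence also negligible), and
\[ a^i = \frac{\partial u^i}{\partial t} + u\frac{\partial u^i}{\partial x} + v\frac{\partial u^i}{\partial y} + w\frac{\partial u^i}{\partial z} = \frac{Du^i}{Dt}. \tag{4.31} \]
Unfortunately I’ve used $v$ both as the y-component of the 3-velocity (standard in fluid mechanics) and as the magnitude of the 3-velocity in the definition of $\gamma$.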
18. Sharpen the discussion at the end of § 4.6 by showing that $-\nabla p$ is actually the net force per unit volume on the fluid element in the MCRF.
I believe this is simply the same argument used in classical fluid mechanics. Imagine a cube with one corner at the origin, with sides parallel to the Cartesian coordinate axes, and of volume $\delta x\,\delta y\,\delta z$. Without loss of generality let the pressure gradient be in the y-direction. The pressure force on the face at $y = 0$ is $p(y=0)\,\delta x\,\delta z$, while the pressure force on the face at $y = \delta y$ is $-p(y=\delta y)\,\delta x\,\delta z$. So the pressure gradient force (PGF) per unit volume is
\[ \frac{\mathrm{PGF}}{\delta x\,\delta y\,\delta z} = -\left[p(y=\delta y) - p(y=0)\right]\frac{\delta x\,\delta z}{\delta x\,\delta y\,\delta z} = -\frac{p(y=\delta y) - p(y=0)}{\delta y}. \tag{4.32} \]
Taking the limit $\delta x \to 0$, $\delta y \to 0$, $\delta z \to 0$,
\[ \frac{\mathrm{PGF}}{dx\,dy\,dz} = -\frac{\partial p}{\partial y}. \tag{4.33} \]
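The limiting step from Eq. (4.32) to Eq. (4.33) can be illustrated with a short numerical sketch. The pressure profile below is a hypothetical quadratic chosen so that $\partial p/\partial y = 2$ at $y = 0$; the function names are illustrative only.

```python
def pressure_force_per_volume(p, dy):
    """Net y-directed pressure force per unit volume on a cube with faces
    at y = 0 and y = dy (the face area dx*dz cancels against the volume)."""
    return -(p(dy) - p(0.0)) / dy

def pressure(y):
    """Hypothetical pressure profile with dp/dy = 2 at y = 0."""
    return 3.0 + 2.0 * y + 0.5 * y * y

# Shrink the cube and watch the force per unit volume approach -dp/dy = -2.
estimates = [pressure_force_per_volume(pressure, dy) for dy in (1e-1, 1e-2, 1e-3)]
print(estimates)
```

For this profile the estimate is $-(2 + 0.5\,\delta y)$, so it tends to $-\partial p/\partial y = -2$ as the cube shrinks.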
19. Starting with Eq. (4.58) prove Eq. (4.57).
Eq. (4.58) contains the sum of terms like:
\[ \int \left[V^0(t_2) - V^0(t_1)\right] dx\,dy\,dz + \int \left[V^x(x_2) - V^x(x_1)\right] dt\,dy\,dz + \ldots \tag{4.34} \]
Let’s start with the first term.
\[ \int \left[V^0(t_2) - V^0(t_1)\right] dx\,dy\,dz = \int \frac{V^0(t_1+\delta t) - V^0(t_1)}{\delta t}\,\delta t\,dx\,dy\,dz, \tag{4.35} \]
where $\delta t = t_2 - t_1$. Taking the limit $\delta t \to 0$,
\[ \int \left[V^0(t_2) - V^0(t_1)\right] dx\,dy\,dz = \int \frac{\partial V^0(t)}{\partial t}\,dt\,dx\,dy\,dz. \tag{4.36} \]
And similarly for the other terms. For example,
\[ \int \left[V^x(x_2) - V^x(x_1)\right] dt\,dy\,dz = \int \frac{\partial V^x(x)}{\partial x}\,dt\,dx\,dy\,dz. \tag{4.37} \]
Combining these terms we obtain Eq. (4.57).
20. (a) Show that if particles are not conserved but are generated locally at a rate $\varepsilon$ particles per unit volume per unit time in the MCRF, then the conservation law, Eq. (4.35), becomes:
\[ N^\alpha{}_{,\alpha} = \varepsilon. \]
We must essentially derive Eq. (4.35) but including the source term. We were told just before Eq. (4.35) that the procedure was the same as for Eq. (4.34), see p. 98. Consider a fluid element as described in Fig. 4.8 and at the bottom of p. 98. The number density is $n$ in the MCRF, and by Eq. (4.42) it is $\gamma n$ in a reference frame moving at speed $v$ relative to the fluid element, with
\[ \gamma = \frac{1}{\sqrt{1-v^2}}. \]
The 4-velocity of the fluid is, by definition, $\vec{U} = \vec{e}_0$ in the MCRF, and the time component is in general $U^0 = \gamma$. Thus we can write the number of particles in an element of volume $l^3$ as $l^3 n\gamma = n\,l^3\,U^0$.
The rate of flow (or flux) of particles across surface 4 (c.f. Fig. 4.8) is $l^2\,nU^x(x=0)$. (This may seem strange because we know that $U^x = 0$ in the MCRF, but soon we’re going to take a derivative that will not be zero; recall problem 16 above.) The flux of particles across surface 2 is $l^2\,nU^x(x=l)$. Similarly, in the y-direction and z-direction the net inflow of particles is $l^2\,nU^y(y=0) - l^2\,nU^y(y=l)$ and $l^2\,nU^z(z=0) - l^2\,nU^z(z=l)$ respectively. These net inflow terms increase the number of particles in the fluid element at a rate
\[ \begin{aligned} \frac{\partial\, n l^3 U^0}{\partial t} = l^2\big[&(nU^x)(x=0) - (nU^x)(x=l) + (nU^y)(y=0) - (nU^y)(y=l) \\ &+ (nU^z)(z=0) - (nU^z)(z=l)\big] + \ldots \text{other terms.} \end{aligned} \tag{4.38, 4.39} \]
There are other terms contributing now, unlike in deriving Eq. (4.35), because there is also a source term, giving
\[ \begin{aligned} \frac{\partial\, n l^3 U^0}{\partial t} = l^2\big[&(nU^x)(x=0) - (nU^x)(x=l) + (nU^y)(y=0) - (nU^y)(y=l) \\ &+ (nU^z)(z=0) - (nU^z)(z=l)\big] + l^3\varepsilon. \end{aligned} \tag{4.40, 4.41} \]
Note that this relation should be frame-invariant because $n$ is obviously frame-invariant and $\vec{U}$ is also frame-invariant.
Note: $\varepsilon$ is frame-invariant! Recall $\varepsilon$ is the rate of generation of particles per unit volume per unit time in the MCRF. In another reference frame, there is a factor of $\gamma$ to account for the fact that the volume will be smaller, thus tending to increase the generation rate, but the time will be slower by a factor $1/\gamma$. In short, time dilation cancels length contraction.
And we can pull $l^3$ out of the derivative because it is a specified constant, and then divide both sides by $l^3$:
\[ \begin{aligned} \frac{\partial\, nU^0}{\partial t} = &-\frac{(nU^x)(x=l) - (nU^x)(x=0)}{l} - \frac{(nU^y)(y=l) - (nU^y)(y=0)}{l} \\ &- \frac{(nU^z)(z=l) - (nU^z)(z=0)}{l} + \varepsilon. \end{aligned} \tag{4.42, 4.43} \]
In the limit $l \to 0$,
\[ \frac{\partial\, nU^0}{\partial t} = -\frac{\partial\, nU^x}{\partial x} - \frac{\partial\, nU^y}{\partial y} - \frac{\partial\, nU^z}{\partial z} + \varepsilon. \tag{4.44} \]
Or
\[ N^\alpha{}_{,\alpha} = \varepsilon. \]
20. (b) Show that if momentum and energy are not conserved (due to interactions with external systems) then Eq. (4.34) becomes:
\[ T^{\alpha\beta}{}_{,\beta} = F^\alpha, \]
where $F^\alpha$ is the relativistic 4-vector force.
This problem is easier than 20. (a) because we follow the derivation of Eq. (4.34) on pp. 98 and 99. Recall Eq. (4.31):
\[ \frac{\partial T^{00}}{\partial t} = -T^{0i}{}_{,i}, \]
where the LHS is the time rate of change of energy per unit volume, and the RHS is the net influx of energy per unit time per unit volume. Thus for non-conservative systems we must add an $F^0$, which is the net rate of energy forcing (or supply of energy from external sources, not associated with fluxes across the boundaries of the fluid element) per unit time per unit volume:
\[ \frac{\partial T^{00}}{\partial t} = -T^{0i}{}_{,i} + F^0. \tag{4.45} \]
I believe that, unlike the source of particles $\varepsilon$, $F^0$ will be frame-dependent. For suppose it is associated with particles being generated at a rate $\varepsilon$. We argued that $\varepsilon$ was frame-invariant in 20 (a). But the total energy of each particle is $m\gamma$, where $m$ is the rest mass. So the energy source $F^0$ will increase as one moves to a reference frame moving relative to the fluid element. [In retrospect, of course it’s frame-dependent: it’s a component of a 4-vector.]
Similarly for the other components. Consider
\[ \frac{\partial T^{x0}}{\partial t} = -T^{xi}{}_{,i}. \]
The LHS is the time rate of change of x-direction momentum per unit volume, and the RHS is the net influx of x-momentum per unit time per unit volume. Thus we must add any external forces,
\[ \frac{\partial T^{x0}}{\partial t} = -T^{xi}{}_{,i} + F^x, \tag{4.46} \]
where $F^x$ is the external force on the fluid element per unit volume in the MCRF.
21. Find the stress-energy tensor components in an inertial frame $O$.
(a) Group of particles all with the same 3-velocity $\vec{v} = \beta\vec{e}_x$ in frame $O$. The rest-mass density is $\rho_0$ and we can make the continuum approximation.
Making the continuum approximation, we treat the particles as a fluid, and because all the particles have the same velocity the fluid is dust. The stress-energy tensor for dust was discussed in § 4.4, see in particular Eq. (4.19):
\[ \mathbf{T} = \rho\,\vec{U}\otimes\vec{U}. \]
Here the energy density in the rest frame is $\rho = \rho_0$ and the 4-velocity is
\[ \vec{U} \to_O (\gamma, \beta\gamma, 0, 0), \]
with $\gamma = 1/\sqrt{1-\beta^2}$. So the stress-energy tensor is simply
\[ T^{\alpha\beta} \to_O \begin{pmatrix} \rho_0\gamma^2 & \rho_0\gamma^2\beta & 0 & 0 \\ \rho_0\gamma^2\beta & \rho_0\gamma^2\beta^2 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix}. \]
(b) A ring of $N$ similar particles of mass $m$ rotating counter-clockwise in the x-y plane about the origin of $O$. The radius of the ring is $a$, and the circular cross-section of the ring has radius $\delta a \ll a$. Ignore the force needed to keep the particles in circular orbit.
Because we can ignore the forces keeping the particles in the circular orbit, I believe we can treat the fluid as a dust, even though the frame co-moving with the particles is not inertial. [Yes, my solution agrees with Schutz’s solution, so this must be right.]
Let’s assume that the given mass $m$ is the rest mass of each of the particles. The relativistic mass of each particle is then $m\gamma$ measured in frame $O$, where
\[ \gamma = \frac{1}{\sqrt{1 - \frac{\omega^2a^2}{c^2}}}. \]
Let’s assume that $a$ was given in geometric units so that $c = 1$, and
\[ \gamma = \frac{1}{\sqrt{1 - \omega^2a^2}}. \]
The total energy is then $N\gamma m$. And this energy is uniformly distributed over the volume of the torus, so the energy density in frame $O$ is
\[ \rho_r = \frac{N\gamma m}{2a\pi\,(\delta a)^2\pi} = \frac{N\gamma m}{2a\pi^2(\delta a)^2}. \]
But the particles each have speed $\omega a$ in $O$. To transform to the MCRF energy density one must use Eq. (4.13),
\[ \rho = \frac{\rho_r}{\gamma^2} = \frac{Nm}{2a\pi^2(\delta a)^2\gamma}. \]
(Here we’re assuming that we can treat the fluid as dust even though there is centripetal acceleration.) Now all we need is the 4-velocity. In frame $O$, the x-component of velocity is $-\omega a\sin(\theta) = -\omega a\,y/a = -\omega y$. The 3-velocity in frame $O$ is
\[ \vec{v} = (-\omega y, \omega x, 0). \]
The 4-velocity is then
\[ \vec{U} \to_O (\gamma, -\gamma\omega y, \gamma\omega x, 0). \]
We can now easily form the stress-energy tensor
\[ \mathbf{T} = \rho\,\vec{U}\otimes\vec{U}, \]
which in matrix form is
\[ (\mathbf{T}) = \rho\begin{pmatrix} \gamma^2 & -\gamma^2\omega y & \gamma^2\omega x & 0 \\ -\gamma^2\omega y & \gamma^2\omega^2y^2 & -\gamma^2\omega^2xy & 0 \\ \gamma^2\omega x & -\gamma^2\omega^2xy & \gamma^2\omega^2x^2 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix}. \]
(c) For two such rings that are identical in every way except the sense of rotation, and wherein the particles do not interact.
We simply add the two stress-energy tensors, since the energy density is linear in the number of particles and the volume is fixed. The 2nd stress-energy tensor is obtained by just changing the sign of $\omega$ in the first stress-energy tensor.
\[ \begin{aligned} (\mathbf{T}) &= \rho\begin{pmatrix} \gamma^2 & -\gamma^2\omega y & \gamma^2\omega x & 0 \\ -\gamma^2\omega y & \gamma^2\omega^2y^2 & -\gamma^2\omega^2xy & 0 \\ \gamma^2\omega x & -\gamma^2\omega^2xy & \gamma^2\omega^2x^2 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix} + \rho\begin{pmatrix} \gamma^2 & \gamma^2\omega y & -\gamma^2\omega x & 0 \\ \gamma^2\omega y & \gamma^2\omega^2y^2 & -\gamma^2\omega^2xy & 0 \\ -\gamma^2\omega x & -\gamma^2\omega^2xy & \gamma^2\omega^2x^2 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix} \\ &= 2\rho\begin{pmatrix} \gamma^2 & 0 & 0 & 0 \\ 0 & \gamma^2\omega^2y^2 & -\gamma^2\omega^2xy & 0 \\ 0 & -\gamma^2\omega^2xy & \gamma^2\omega^2x^2 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix}. \end{aligned} \tag{4.47} \]
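The cancellation of the momentum-density terms in part (c) can be confirmed with a small numerical sketch. It assumes the dust form $T^{\alpha\beta} = \rho U^\alpha U^\beta$ used above; the values of $\rho$, $\omega$ and the point $(x, y)$ on the ring are hypothetical, as are the helper names.

```python
import math

def ring_four_velocity(omega, x, y):
    """4-velocity of a ring particle at (x, y), angular velocity omega, c = 1."""
    gamma = 1.0 / math.sqrt(1.0 - omega * omega * (x * x + y * y))
    return (gamma, -gamma * omega * y, gamma * omega * x, 0.0)

def dust_stress_energy(rho, U):
    """Components T^{ab} = rho U^a U^b for dust."""
    return [[rho * U[a] * U[b] for b in range(4)] for a in range(4)]

rho = 1.0            # hypothetical MCRF energy density of one ring
omega = 0.2          # hypothetical angular velocity
x, y = 0.6, 0.8      # hypothetical point on a ring of radius a = 1

T_ccw = dust_stress_energy(rho, ring_four_velocity(omega, x, y))
T_cw = dust_stress_energy(rho, ring_four_velocity(-omega, x, y))

# Non-interacting rings: the stress-energy tensors simply add.
T_total = [[T_ccw[a][b] + T_cw[a][b] for b in range(4)] for a in range(4)]
for row in T_total:
    print(row)
```

The $T^{0i}$ entries of the sum vanish while the energy density and the stresses double, in line with the overall factor of 2 in the summed tensor.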
22. (i) Here we must argue that a collection of noncolliding particles, like a galaxy, with random velocities with no preferred direction in the MCRF has the stress-energy tensor of a perfect fluid.
Here we must simply argue that heat conduction and viscosity are zero, so that the conditions of a perfect fluid are met. Heat conduction and momentum diffusion (viscosity) result from the net transfer of energy and momentum, respectively, due to particle motions. In classical fluid dynamics this results from the molecular motions having a preferred direction due to temperature gradients or momentum gradients. But here we are assuming a priori that there is no preferred direction in the MCRF (this must be the same MCRF for the entire system so that there are no gradients). So there can be no net transfer of energy or momentum due to the motion of the particles. Hence there is no heat transfer by conduction or momentum transfer by viscosity. With the conditions of a perfect fluid being met, the arguments of § 4.6 apply for the form of the stress-energy tensor.
This is interesting because the condition of no bias in any direction of the particle velocities (in the MCRF) is a statistical condition, applying in the mean. But if the velocities are truly randomly distributed, then we should expect random fluctuations about the mean, implying a random heat conduction and viscosity effect, albeit with a time mean of zero. In classical atmosphere/ocean fluid dynamics this is only recently being considered under the name of “stochastic parameterization”.
22. (ii) If all particles have the same speed $v$ and mass $m$ (let’s assume that’s rest mass), express $p$ and $\rho$ as functions of $m$, $v$, and $n$.
The magnitude of the momentum of a given particle is $m\gamma v$, so the momentum flux in a given direction is
\[ \sum_i n\,v\cos(\theta_i)\,m\gamma v\cos(\theta_i) = n m\gamma v^2 \sum_i \cos^2(\theta_i). \]
We make the continuum hypothesis and approach the problem statistically. Suppose a particle is at the origin. Because the direction of a given particle’s trajectory is random, the probability of leaving through a given piece of the unit sphere centred at the origin is proportional to the area of the piece. Dividing by the area of the sphere we get the fraction of the total solid angle. This fraction for the strip of width $d\theta$ making an angle $\theta$ to the z-axis is $2\pi\sin(\theta)\,d\theta/4\pi = \sin(\theta)\,d\theta/2$. So the mean value of the momentum flux of one particle in the z-direction is
\[ \frac{m\gamma v^2}{2}\int_0^\pi \sin(\theta)\cos^2(\theta)\,d\theta = -\frac{m\gamma v^2}{2}\,\frac{1}{3}\left[\cos^3(\theta)\right]_0^\pi = \frac{1}{3}m\gamma v^2. \tag{4.48} \]
Multiplying by the number density gives $p = \frac{1}{3}n m\gamma v^2$, while the energy density is $\rho = n m\gamma$.
Originally I integrated only over half the sphere’s area because I mistakenly thought that only half the particles contribute to the momentum flux in a given direction, the other half fluxing momentum in the other direction. But this is wrong because the momentum flux goes like the square of the velocity: a particle moving in the negative x-direction carries negative momentum in the negative x-direction and thus results in a positive momentum flux! This was made clear in the previous problem, see 21 (c).
For a photon gas, say the energy of each photon is $E$ and the MCRF number density is $n$, so that
\[ \rho = nE. \]
The momentum of each photon is $E$, c.f. Eq. (2.37), and its speed is $c = 1$. So the momentum flux of a given photon in a given direction will be
\[ \frac{Ec}{3} = \frac{E}{3}, \]
and the total momentum flux from all photons per unit volume will be
\[ p = n\frac{E}{3} = \frac{\rho}{3}. \]
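The factor of 1/3 from the angular average can be checked by brute-force quadrature. The sketch below uses a midpoint rule over strips of the unit sphere; the strip count and the particle parameters are hypothetical, and the function `pressure_and_density` assumes the identifications $p = n m\gamma v^2\langle\cos^2\theta\rangle$ and $\rho = n m\gamma$.

```python
import math

def mean_cos_squared(num_strips=20000):
    """Average of cos^2(theta) over the unit sphere, weighting each strip of
    width d(theta) by its fraction of the sphere, sin(theta) d(theta) / 2."""
    d_theta = math.pi / num_strips
    total = 0.0
    for k in range(num_strips):
        theta = (k + 0.5) * d_theta          # midpoint of the strip
        total += 0.5 * math.sin(theta) * math.cos(theta) ** 2 * d_theta
    return total

def pressure_and_density(m, v, n):
    """Pressure and energy density of particles of rest mass m, common speed v
    (units with c = 1) and MCRF number density n, with isotropic directions."""
    gamma = 1.0 / math.sqrt(1.0 - v * v)
    rho = n * m * gamma
    p = n * m * gamma * v * v * mean_cos_squared()
    return p, rho

average = mean_cos_squared()
p, rho = pressure_and_density(m=1.0, v=0.6, n=4.0)   # hypothetical values
print(average, p, rho, p / rho)
```

The ratio $p/\rho$ comes out as $v^2/3$, which approaches the photon-gas value of 1/3 as $v \to 1$.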
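23. This question was confusing at first because $d^3x$ is not defined in the text. Is that $dx^1\,dx^2\,dx^3$ or $dx^0\,dx^2\,dx^3$ etc.? Below I just assumed it was the 3 spatial dimensions, $d^3x = dx\,dy\,dz$, and found that it makes sense.
23 (a) Prove
\[ \frac{\partial}{\partial t}\int T^{0\alpha}\,d^3x = 0. \]
\[ \begin{aligned} 0 &= T^{\alpha\beta}{}_{,\beta} && \text{Eq. (4.34)} \\ &= T^{\beta\alpha}{}_{,\beta} && \text{symmetry} \\ &= T^{0\alpha}{}_{,t} + T^{i\alpha}{}_{,i} && \text{expanding} \\ 0 &= \int_\Omega \left(T^{0\alpha}{}_{,t} + T^{i\alpha}{}_{,i}\right) dx\,dy\,dz && \text{integrate over volume } \Omega \\ &= \int_\Omega \left(T^{0\alpha}{}_{,t} + T^{i\alpha}{}_{,i}\right) d^3x && \text{change notation} \\ \int_\Omega T^{0\alpha}{}_{,t}\,d^3x &= -\int_\Omega T^{i\alpha}{}_{,i}\,d^3x && \text{rearrange} \\ &= -\int_{d\Omega} T^{i\alpha}\,n_i\,d^2x && \text{3D version of Gauss’ theorem} \\ &= 0 && \text{choosing surface } d\Omega \text{ in region where } T^{\alpha\beta} = 0 \\ \frac{\partial}{\partial t}\int_\Omega T^{0\alpha}\,d^3x &= 0 && \text{choosing time-independent } d\Omega \end{aligned} \tag{4.49} \]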
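23 (b) Prove
\[ \frac{\partial^2}{\partial t^2}\int T^{00}\,x^i x^j\,d^3x = 2\int T^{ij}\,d^3x. \]
This is a bit trickier. I just played with the integrand such that it would give me terms that appeared in the above. Even if you guess incorrectly, after doing a few computations you’ll learn how things work and can then find the solution.
\[ \begin{aligned} \frac{\partial}{\partial x^\beta}\frac{\partial}{\partial x^\alpha}\left(T^{\beta\alpha}x^\mu x^\nu\right) &= \frac{\partial}{\partial x^\beta}\left(T^{\beta\alpha}{}_{,\alpha}\,x^\mu x^\nu + T^{\beta\alpha}\frac{\partial x^\mu x^\nu}{\partial x^\alpha}\right) \\ &= \frac{\partial}{\partial x^\beta}\left(T^{\beta\alpha}\frac{\partial x^\mu x^\nu}{\partial x^\alpha}\right) && \text{by Eq. (4.34)} \\ &= \frac{\partial}{\partial x^\beta}\left(T^{\alpha\beta}\frac{\partial x^\mu x^\nu}{\partial x^\alpha}\right) && \text{symmetry} \\ &= T^{\alpha\beta}\frac{\partial^2 x^\mu x^\nu}{\partial x^\alpha\,\partial x^\beta} && \text{by Eq. (4.34)} \\ &= T^{\alpha\beta}\frac{\partial}{\partial x^\beta}\left(x^\mu\delta^\nu{}_\alpha + x^\nu\delta^\mu{}_\alpha\right) && \text{product rule of elementary calculus} \\ &= T^{\alpha\beta}\left(\delta^\mu{}_\beta\,\delta^\nu{}_\alpha + \delta^\nu{}_\beta\,\delta^\mu{}_\alpha\right) && \text{product rule of elementary calculus} \\ &= T^{\mu\nu} + T^{\nu\mu} \\ &= 2\,T^{\mu\nu} && \text{symmetry} \end{aligned} \tag{4.50} \]
Now we just restrict attention to spatial ranges of these indices, $\mu = i$ and $\nu = j$, and we integrate over a 3D spatial volume $\Omega$ fixed in time, so that temporal derivatives commute with the spatial integral (without Leibniz terms), and large enough that it entirely encloses the region in which the stress-energy tensor is non-zero:
\[ \begin{aligned} \frac{\partial}{\partial x^\beta}\frac{\partial}{\partial x^\alpha}\left(T^{\beta\alpha}x^i x^j\right) &= 2\,T^{ij} && \text{restricting above} \\ \int_\Omega \frac{\partial}{\partial x^\beta}\frac{\partial}{\partial x^\alpha}\left(T^{\beta\alpha}x^i x^j\right) d^3x &= 2\int_\Omega T^{ij}\,d^3x && \text{integrating} \\ \frac{\partial^2}{\partial t^2}\int_\Omega T^{00}x^i x^j\,d^3x + \int_\Omega \frac{\partial}{\partial x^k}\frac{\partial}{\partial x^l}\left(T^{kl}x^i x^j\right) d^3x &= 2\int_\Omega T^{ij}\,d^3x && \text{expanding} \end{aligned} \tag{4.51} \]
(The mixed terms, with one time derivative and one spatial derivative of $T^{0k}x^i x^j$, are not written out; they vanish for the same reason as the middle term.) The middle term can be shown to be zero because it always involves an integral over a direction for which there is a spatial derivative, giving terms like, say for $k = 2$:
\[ \int_\Omega \frac{\partial}{\partial y}\frac{\partial}{\partial x^l}\left(T^{yl}x^i x^j\right) dx\,dy\,dz = \int \left[\frac{\partial}{\partial x^l}\left(T^{yl}x^i x^j\right)\right]_{y_L}^{y_R} dx\,dz = 0, \]
where $y_R$ and $y_L$ are the y coordinates of the bounding surface on the right and left sides of the volume $\Omega$. But on this surface $\mathbf{T} = 0$ because we choose the surface to be outside the bounded region of non-zero $\mathbf{T}$.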
Maybe I could have used Gauss’ theorem again instead of the final argument above? In any case, we’re left with just
\[ \frac{\partial^2}{\partial t^2}\int_\Omega T^{00}\,x^i x^j\,d^3x = 2\int_\Omega T^{ij}\,d^3x, \tag{4.52} \]
which was what we had to prove.
23 (c) Prove
\[ \frac{\partial^2}{\partial t^2}\int T^{00}\,(x^i x_i)^2\,d^3x = 4\int T^i{}_i\,x^j x_j\,d^3x + 8\int T^{ij}\,x_i x_j\,d^3x. \]
The solution to this problem proceeds much like in (b) but is a bit more complicated. The difficulty is mostly in guessing where to start, and in particular what operator to apply to $T^{\alpha\beta}$. Again I chose something too general and made life difficult for myself, but it soon became clear where I should have started. These problems are actually much easier than they might seem at first.
Here is some “thinking out loud” that might help. We see on the left a second derivative with respect to $t$, so we want to start with a general double derivative of $T^{\alpha\beta}$,
\[ \frac{\partial^2}{\partial x^\alpha\,\partial x^\beta}\,T^{\alpha\beta}. \]
As in parts (a) and (b), we anticipate exploiting the fact that these derivatives are zero for deriving the RHS. Clearly we have to multiply by some combination of $x^\mu$ and $x^\nu$, and there should be four of them. Some will apparently be eliminated somehow, and somehow one of the indices of $T^{\alpha\beta}$ will need to be lowered, but I had no clue how at this point. I started very general, with
\[ \frac{\partial^2}{\partial x^\alpha\,\partial x^\beta}\left(T^{\alpha\beta}x^\mu x^\nu x^\gamma x^\sigma\right). \]
But after a few computations to see how things worked, it became clear I only needed the spatial components and that I only needed two pairs, that is
\[ \frac{\partial^2}{\partial x^\alpha\,\partial x^\beta}\left(T^{\alpha\beta}x^i x^j x_i x_j\right). \]
It wasn’t that tricky. It was obvious what needed to be done to get some of the terms on the desired RHS. It also became clear how to lower the index of $T^{\alpha\beta}$. Let’s just proceed now with the computations.
\[ \begin{aligned} &\frac{\partial^2}{\partial x^\alpha\,\partial x^\beta}\left(T^{\alpha\beta}x^i x^j x_i x_j\right) \\ &= T^{\alpha\beta}\frac{\partial^2}{\partial x^\alpha\,\partial x^\beta}\left(x^i x^j x_i x_j\right) && \text{Eq. (4.34) and symmetry} \\ &= T^{\alpha\beta}\frac{\partial}{\partial x^\beta}\frac{\partial}{\partial x^\alpha}\left(x^i x^j\,\eta_{ki}x^k\,\eta_{lj}x^l\right) && \text{raise indices so we can differentiate} \\ &= T^{\alpha\beta}\eta_{ki}\eta_{lj}\frac{\partial}{\partial x^\beta}\frac{\partial}{\partial x^\alpha}\left(x^i x^j x^k x^l\right) && \text{metric is uniform in space} \\ &= T^{\alpha\beta}\eta_{ki}\eta_{lj}\frac{\partial}{\partial x^\beta}\left(x^i x^j x^k\delta^l{}_\alpha + x^i x^j\delta^k{}_\alpha x^l + x^i\delta^j{}_\alpha x^k x^l + \delta^i{}_\alpha x^j x^k x^l\right) && \text{product rule} \\ &= T^{\alpha\beta}\eta_{ki}\eta_{lj}\,\big\{\,\delta^l{}_\alpha\,[x^i x^j\delta^k{}_\beta + {\color{red} x^i\delta^j{}_\beta x^k} + \delta^i{}_\beta x^j x^k] \\ &\qquad\qquad + \delta^k{}_\alpha\,[x^i x^j\delta^l{}_\beta + x^i\delta^j{}_\beta x^l + {\color{red} \delta^i{}_\beta x^j x^l}] \\ &\qquad\qquad + \delta^j{}_\alpha\,[{\color{red} x^i x^k\delta^l{}_\beta} + x^i\delta^k{}_\beta x^l + \delta^i{}_\beta x^k x^l] \\ &\qquad\qquad + \delta^i{}_\alpha\,[x^j x^k\delta^l{}_\beta + {\color{red} x^j\delta^k{}_\beta x^l} + \delta^j{}_\beta x^k x^l]\,\big\} \end{aligned} \tag{4.53} \]
The terms in red are characterized by having the indices of $x$, say $x^i x^k$ for the first one, that are both within the same metric term, e.g. $\eta_{ki}$, as opposed to spread across two metric terms. Thus it’s only possible to lower one of the two indices. It doesn’t matter which one we lower (you’ll see in a second that, because of the symmetry of $T^{\alpha\beta}$, we can get what we want in the end either way). So the 4 red terms become
\[ 4\,T^i{}_i\,x^j x_j, \]
where we have relabelled dummy indices. And the remaining 8 black terms become
\[ 8\,T^{ij}\,x_i x_j. \]
After integrating over all space this gives the RHS. The LHS follows as it did in (b).
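The counting of 4 + 8 terms can be cross-checked numerically. In flat 3D space with Cartesian coordinates (so upper and lower spatial indices coincide), the claim amounts to $T^{ab}\,\partial_a\partial_b (x^i x_i)^2 = 4\,T^i{}_i\,x^j x_j + 8\,T^{ij}x_i x_j$ for any symmetric $T^{ab}$. The sketch below tests this with central finite differences; the matrix `T`, the point `x` and the step size are hypothetical.

```python
def r_fourth(x):
    """(x^i x_i)^2 for a Cartesian 3-vector x."""
    s = sum(c * c for c in x)
    return s * s

def second_derivative(f, x, a, b, h=1e-3):
    """Central-difference estimate of d^2 f / dx^a dx^b at the point x."""
    def shifted(da, db):
        y = list(x)
        y[a] += da
        y[b] += db
        return f(y)
    return (shifted(h, h) - shifted(h, -h) - shifted(-h, h) + shifted(-h, -h)) / (4.0 * h * h)

# Hypothetical symmetric spatial stress components and evaluation point.
T = [[1.0, 0.2, -0.3],
     [0.2, 2.0, 0.5],
     [-0.3, 0.5, 3.0]]
x = [0.4, -0.7, 0.2]

lhs = sum(T[a][b] * second_derivative(r_fourth, x, a, b)
          for a in range(3) for b in range(3))

trace = sum(T[a][a] for a in range(3))
r_squared = sum(c * c for c in x)
rhs = 4.0 * trace * r_squared + 8.0 * sum(T[a][b] * x[a] * x[b]
                                          for a in range(3) for b in range(3))
print(lhs, rhs)
```

The two numbers agree to within the finite-difference error, which supports the split into four "trace" terms and eight "contracted" terms.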
24. (a) Show that in the rest frame $O$ of a star of constant luminosity $L$ (total energy radiated per second), the stress-energy tensor of the radiation from the star at the event $(t, x, 0, 0)$ has components $T^{00} = T^{0x} = T^{x0} = T^{xx} = L/(4\pi x^2)$. The star sits at the origin.
Assume the star emits radiation isotropically. So a sphere of radius $x$ centred at the origin has radiation flowing out of it at a rate of $L$. The surface area of the sphere is $4\pi x^2$. The flux will be evenly distributed over the surface of the sphere by the assumption of isotropy. Thus the flux per unit time per unit area is everywhere of magnitude $L/(4\pi x^2)$. And in particular at event $(t, x, 0, 0)$ it is also $T^{0x} = L/(4\pi x^2)$.
In time period $\delta t$ the energy flow out of the sphere will be $L\,\delta t$, and this energy will fill a spherical shell of volume $(4\pi x^2)\,c\,\delta t = (4\pi x^2)\,\delta t$, since $c = 1$ in geometric units. Thus the energy density at a distance of $x$ from the origin will be $T^{00} = L\,\delta t/(4\pi x^2\,\delta t) = L/(4\pi x^2)$.
By the symmetry properties of $\mathbf{T}$, we know that in general $T^{0x} = T^{x0}$.
Finally, the energy flux is the photon flux, say $F_p$, times the energy per photon, $h\nu$,
\[ T^{0x} = F_p\,h\nu. \]
And the momentum flux will be the photon flux times the momentum per photon, $h\kappa$,
\[ T^{xx} = F_p\,h\kappa = F_p\,h\frac{\nu}{c} = T^{0x}, \]
because again $c = 1$ in geometric units, c.f. p. 49.
24. (b) By drawing the world lines of photons emitted and absorbed at an event, I can only guess that the definition of “a null vector that separates the emission and reception of the radiation” is a null vector in the direction of the radiation that passes through the event and points in the direction of the emitted radiation and opposite the received radiation. We are given that
\[ \vec{X} \to_O (x, x, 0, 0) \]
is such a vector for the event $(x, x, 0, 0)$. Recall the radiation came from the origin of $O$, wherein the star is at rest, see (a). I don’t see that there’s anything to prove. I guess we’re just supposed to learn the above definition that wasn’t explicitly provided.
To show that the stress-energy tensor has the given frame-invariant form, we must show that it is indeed frame-invariant, and that it reproduces the results of (a) in the MCRF. Vectors are frame-invariant, and so their outer product will form a $\binom{2}{0}$ tensor that is also frame-invariant. The radiation emitted per second, the luminosity, will depend upon reference frame (as we will have to discuss in part (c)), so we must assume that $L$ is the luminosity in the rest frame, even though this was not stated explicitly. Finally, the denominator has an inner product of two four-vectors, and is also frame-invariant. So the given expression for $\mathbf{T}$ is frame-invariant.
In the MCRF it is easy to verify that the expression given produces the results of (a):
\[ T^{00} = \frac{L}{4\pi}\frac{X^0X^0}{(U^\alpha X_\alpha)^4} = \frac{L}{4\pi}\frac{x\cdot x}{(1\cdot x)^4} = \frac{L}{4\pi x^2}. \tag{4.54} \]
And
\[ T^{0x} = \frac{L}{4\pi}\frac{X^0X^1}{(U^\alpha X_\alpha)^4} = \frac{L}{4\pi}\frac{x\cdot x}{(1\cdot x)^4} = \frac{L}{4\pi x^2}, \tag{4.55} \]
and
\[ T^{xx} = \frac{L}{4\pi}\frac{X^1X^1}{(U^\alpha X_\alpha)^4} = \frac{L}{4\pi}\frac{x\cdot x}{(1\cdot x)^4} = \frac{L}{4\pi x^2}. \tag{4.56} \]
Furthermore,
\[ T^{0y} = \frac{L}{4\pi}\frac{X^0X^2}{(U^\alpha X_\alpha)^4} = \frac{L}{4\pi}\frac{x\cdot 0}{(1\cdot x)^4} = 0, \tag{4.57} \]
and similarly for the remaining terms.
24. (c) Find $T^{\bar{0}\bar{x}}$ in a frame $\bar{O}$ moving at speed $v$ along the x-axis of $O$.
Applying the Lorentz transformation to $\mathbf{T}$, we first transform $\vec{X}$:
\[ \vec{X} \to_{\bar{O}} (\gamma(1-v)x, \gamma(1-v)x, 0, 0) \to_{\bar{O}} (R, R, 0, 0), \tag{4.58} \]
where $\gamma = 1/\sqrt{1-v^2}$, see p. 38, and we have found
\[ R = \gamma(1-v)\,x = \sqrt{\frac{1-v}{1+v}}\;x. \]
The 4-velocity of the star in the Earth’s frame of reference is
\[ \vec{U}_s \to_{\bar{O}} (\gamma, -v\gamma, 0, 0), \]
which gives $\vec{U}_s\cdot\vec{X} = -R\gamma(1+v)$. Using the frame-invariant expression for the stress-energy tensor we find in the Earth’s frame of reference:
\[ T^{\bar{0}\bar{x}} = \frac{L}{4\pi}\frac{X^{\bar{0}}X^{\bar{1}}}{(U^\alpha X_\alpha)^4} = \frac{L}{4\pi}\frac{R^2}{(R\gamma(1+v))^4} = \frac{L}{4\pi R^2}\frac{(1-v)^2}{(1+v)^2}. \tag{4.59} \]
Interpretation. Consider the emission of a photon from the star as event A at $(0, 0, 0, 0)$ in frame $O$, see Fig. 4.1. The photon travels along the $x$-axis, which is parallel to the $\bar{x}$-axis. Consider the absorption of the photon in a detector as event B at $(x, x, 0, 0)$ in frame $O$. Suppose that frame $\bar{O}$ coincides with frame $O$ at event A, so both frames observe event A at their origins. During the time period between A and B, $\Delta t = x - 0 = x$, the reference frame $\bar{O}$ has traveled a distance $v\,\Delta t = vx$ in the $x$-direction. So under a Galilean transformation the event B at $(x, x, 0, 0)$ in $O$ would occur at $\bar{x} = x - vt = x(1-v)$ in the frame $\bar{O}$ moving at speed $v$ in the positive $x$-direction. Due to relativistic effects, this becomes $\bar{x} = x\gamma(1-v) = R$. This is different from Lorentz contraction or time dilation because it's neither a ruler nor a clock's reading that is being transformed between Lorentz frames but the spatial part of a light path.

From the point of view of $\bar{O}$ the light emitted at event A spreads out in a spherical shell centred at the origin of $\bar{O}$, despite the fact that the star was moving when the light was emitted. In principle, this makes things simple! $T^{\bar{0}\bar{x}}$ is the flux of energy across a surface of constant $\bar{x} = R$. But note that the luminosity varies with reference frame. Suppose the star emits $N$ photons per second of mean energy $h\nu$ such that $L = N h\nu$ in $O$, the MCRF of the star. The emission rate is a clock of sorts, and moving clocks run slowly, so in frame $\bar{O}$ the emission rate will be $N/\gamma$. But the radiation will be red-shifted by the relativistic Doppler effect,
\[ \bar{\nu} = (1-v)\gamma\,\nu, \]
see Eq. (2.39) on p. 49. Thus the luminosity observed in $\bar{O}$ will be reduced:
\[ \bar{L} = \frac{N}{\gamma}\, h\nu\,(1-v)\gamma = L(1-v). \]
Based on the above argument I would have anticipated
\[ T^{\bar{0}\bar{x}} = \frac{L\,(1-v)}{4\pi R^2}, \]
but this is wrong by a factor of $(1-v)/(1+v)^2$!

A more accurate but only partial answer is to note from the frame-invariant expression given in 24 b) that the components of $\mathbf{T}$ only change because of the $\vec{X} \otimes \vec{X}$ term in the numerator, since the remaining factors are frame-invariant. For all four non-zero terms,
\[ (\vec{X} \otimes \vec{X})^{\bar{\alpha}\bar{\beta}} \to_{\bar{O}} R^2, \qquad \bar{\alpha}, \bar{\beta} \in \{\bar{0}, \bar{x}\}. \]
And $R/x = \gamma(1-v) = \sqrt{(1-v)/(1+v)}$, which implies that the magnitudes of the components of $\mathbf{T}$ change like
\[ \frac{T^{\bar{0}\bar{x}}}{T^{0x}} = \frac{R^2}{x^2} = \frac{1-v}{1+v}. \]
The same result can of course be obtained directly from the Lorentz transformation. Tensors are geometric objects and thus invariant under changes of reference frame, and the Lorentz transformation tells us how the components change. Applying this to $\mathbf{T}$ we find
\[ T^{\bar{\alpha}\bar{\beta}} = \Lambda^{\bar{\alpha}}{}_{\alpha}\, \Lambda^{\bar{\beta}}{}_{\beta}\, T^{\alpha\beta}, \]
which also gives
\[ \frac{T^{\bar{0}\bar{x}}}{T^{0x}} = \frac{R^2}{x^2} = \frac{1-v}{1+v}. \]
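The factor $(1-v)/(1+v)$ can be checked numerically. The following is only a sketch: it assumes, as in the frame-invariant expression, that the components in $O$ are proportional to $X^\alpha X^\beta$ with $\vec{X} \to_O (x, x)$, it keeps only the $t$-$x$ block of the boost, and the function names are illustrative rather than anything from Schutz.

```python
import math

def boost(v):
    """2x2 Lorentz boost (t-x block) for speed v along +x, in units with c = 1."""
    g = 1.0 / math.sqrt(1.0 - v * v)
    return [[g, -v * g], [-v * g, g]]

def transform_rank2(lam, T):
    """Return Tbar^{ab} = lam^a_c lam^b_d T^{cd} for 2x2 nested lists."""
    n = len(T)
    return [[sum(lam[a][c] * lam[b][d] * T[c][d]
                 for c in range(n) for d in range(n))
             for b in range(n)] for a in range(n)]

def flux_ratio(v, x=2.0):
    """Ratio Tbar^{0x} / T^{0x}, assuming T^{ab} is proportional to X^a X^b with X = (x, x)."""
    X = [x, x]
    T = [[X[a] * X[b] for b in range(2)] for a in range(2)]
    Tbar = transform_rank2(boost(v), T)
    return Tbar[0][1] / T[0][1]

v = 0.6
print(flux_ratio(v), (1 - v) / (1 + v))
```

The two printed numbers should agree to rounding error for any $|v| < 1$, and the ratio does not depend on the chosen $x$.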
25. This problem is apparently not required to understand the remainder of the book, so I've put it on hold.
Figure 4.1: Events A and B seen in the star's reference frame ($x$-$t$ axes) and in a reference frame co-moving with Earth but with origin coinciding with the star at event A ($\bar{x}$-$\bar{t}$ axes). The radiation emitted at event A reaches Earth after time $x$, at event B.
Chapter 5 Preface to curvature
5.1
On the relation of gravitation to curvature
Bad choice of symbols in Eq. (5.1): h is Planck's constant when multiplied by frequency, and h is the height of the tower when multiplied by g.
5.2
Tensor calculus in polar coordinates
How did he get Eq. (5.28b) for the magnitudes of the one-form bases? Use Eq. (3.51), but adapted to 2D Cartesian space instead of Minkowski space,
\[ \tilde{p}^2 = p_1^2 + p_2^2. \tag{5.1} \]
So e.g.
\[ |\tilde{d}r| = \sqrt{\cos^2\theta + \sin^2\theta} = 1, \]
using Eq. (5.27) and (5.1).
Some typos:
• Eq. (5.28a): the double == is just a typo.
• Middle of p. 129, just before Eq. (5.54): the final µ should be a superscript, $V^\alpha{}_{;\alpha} = \partial V^\alpha/\partial x^\alpha + \Gamma^\alpha{}_{\mu\alpha} V^\mu$.
• pp. 130 and 131: two separate equations are labeled Eq. (5.64).
5.3
Christoffel symbols and the metric
Clarification: On p. 132, before Eq. (5.70), the superscripted α on the LHS of Vα;β = gαµ V µ;β looks odd. But it is correct. It follows because we’re in Cartesian coordinates wherein gαµ = δαµ .
5.4
Noncoordinate bases
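Verifying Eq. (5.78): First we use Eqs. (5.22, 5.23) for $\vec{e}_r$ and $\vec{e}_\theta$, and Eq. (5.76), to obtain:
\[ \vec{e}_{\hat{r}} = \cos\theta\, \vec{e}_x + \sin\theta\, \vec{e}_y, \qquad \vec{e}_{\hat{\theta}} = -\sin\theta\, \vec{e}_x + \cos\theta\, \vec{e}_y, \tag{5.2} \]
and we use Eqs. (5.26, 5.27) for $\tilde{d}r$ and $\tilde{d}\theta$, and Eq. (5.77), to obtain:
\[ \tilde{\omega}^{\hat{r}} = \tilde{d}r = \cos\theta\, \tilde{d}x + \sin\theta\, \tilde{d}y, \qquad \tilde{\omega}^{\hat{\theta}} = r\,\tilde{d}\theta = -\sin\theta\, \tilde{d}x + \cos\theta\, \tilde{d}y. \tag{5.3} \]
Taking the various dot products of basis vectors we find:
\[ \vec{e}_{\hat{r}} \cdot \vec{e}_{\hat{r}} = \cos^2\theta + \sin^2\theta = 1, \quad \vec{e}_{\hat{r}} \cdot \vec{e}_{\hat{\theta}} = -\sin\theta\cos\theta + \sin\theta\cos\theta = 0, \quad \vec{e}_{\hat{\theta}} \cdot \vec{e}_{\hat{\theta}} = \cos^2\theta + \sin^2\theta = 1. \tag{5.4} \]
Doing the same for the one-form bases gives the analogous result. If one is uncomfortable taking dot products of one-form bases, recall Eq. (3.47) and Eq. (3.52). For example,
\[ \tilde{d}r \cdot \tilde{d}r = (\cos\theta\, \tilde{d}x + \sin\theta\, \tilde{d}y) \cdot (\cos\theta\, \tilde{d}x + \sin\theta\, \tilde{d}y) = \cos^2\theta\; \tilde{d}x \cdot \tilde{d}x + \sin^2\theta\; \tilde{d}y \cdot \tilde{d}y + 2\cos\theta\sin\theta\; \tilde{d}x \cdot \tilde{d}y = \cos^2\theta + \sin^2\theta = 1. \tag{5.5} \]
Eq. (5.84) reduces to
\[ \frac{1}{\sqrt{x^2 + y^2}} \neq 0. \]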
5.8
Exercises
1. Repeat the argument that led to Eq. (5.1) under more realistic assumptions: suppose a fraction ε of the kinetic energy of the mass at the bottom can be converted into a photon and sent back up, the remaining energy staying at ground level in a useful form. Devise a perpetual motion engine if Eq. (5.1) is violated.

Taken literally, ε is the fraction of the kinetic energy of the mass at the bottom, not the fraction of the total energy. The remaining energy then means the rest mass energy plus $(1-\varepsilon)$ of the kinetic energy. Somehow I doubt that's what Schutz meant because it makes the problem needlessly complicated, but that's what he said, so we'll go with that interpretation. Conceptually it doesn't really matter since we're supposed to just see that the Einstein thought experiment carries forward the same message even when inefficiencies are introduced.

Let's introduce an index $i$ to keep track of the iterations of the mass falling and the photon propagating to the top of the tower. Say we start with mass $m_0$ at the top. It falls, gaining kinetic energy
\[ m_0 \frac{gh}{c^2} = m_0\, gh, \]
where the constants are in geometric units so that $c = 1$ and $gh$ is dimensionless. For Earth conditions of course $gh \ll 1$. Of this kinetic energy only a fraction ε is available for generating the photon at the bottom of the tower,
\[ \varepsilon m_0\, gh = 2\pi\hbar\, \nu_0, \]
while the remaining energy is accumulated, apparently in useful form, at the base of the tower:
\[ \tilde{m}_0 = [(1-\varepsilon) gh + 1]\, m_0. \]
The key assumption is that the radiation is unaffected by the gravitational field (in violation of Eq. (5.1)), yielding a photon at the top of the tower of the same energy as at the bottom, $\varepsilon m_0\, gh = 2\pi\hbar\, \nu_0$. Now this is converted into mass
\[ m_1 = 2\pi\hbar\, \nu_0 = m_0\, \varepsilon gh. \]
This yields kinetic energy at the base of the tower
\[ m_1\, gh = m_0\, \varepsilon (gh)^2, \]
of which the fraction ε is available for the 2nd photon:
\[ \varepsilon m_1\, gh = 2\pi\hbar\, \nu_1 = m_0 (\varepsilon gh)^2. \]
The remaining energy accumulates at the bottom:
\[ \tilde{m}_1 = [(1-\varepsilon) gh + 1]\, m_1 = [(1-\varepsilon) gh + 1]\, m_0\, \varepsilon gh. \]
The photon energy at the top of the tower is again taken to equal the photon energy at the base of the tower, yielding a new mass at the top of the tower:
\[ m_2 = 2\pi\hbar\, \nu_1 = m_0 (\varepsilon gh)^2. \]
At the bottom we generate another photon,
\[ 2\pi\hbar\, \nu_2 = m_0 (\varepsilon gh)^3, \]
and accumulate more mass:
\[ \tilde{m}_2 = [(1-\varepsilon) gh + 1]\, m_2 = [(1-\varepsilon) gh + 1] (\varepsilon gh)^2 m_0. \]
The process will repeat indefinitely. After $n+1$ iterations we have accumulated this much mass at the base of the tower:
\[ \sum_{i=0}^{n} \tilde{m}_i = \sum_{i=0}^{n} a\, r^i, \]
with $a = [(1-\varepsilon) gh + 1]\, m_0$ and $r = \varepsilon gh$. As $n \to \infty$, the accumulated mass approaches
\[ M = \frac{a}{1-r} = \frac{[(1-\varepsilon) gh + 1]\, m_0}{1 - \varepsilon gh}, \]
see Boas (1983, Eq. (1.8)) for the sum of an infinite geometric series. Assuming Earth-like values, $gh \ll 1$, and
\[ M \approx [(1-\varepsilon) gh + 1]\, m_0\, (1 + \varepsilon gh) \approx m_0 (1 + gh). \]
The accumulated mass is not much more than the starting mass, so the process is not an efficient way to create energy. However, we gained something for nothing and generated an infinite process. Clearly something is wrong, and in particular, it was the violation of Eq. (5.1) describing the gravitational redshift. Einstein's simple thought experiment is robust to the inclusion of inefficiencies.
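The geometric-series bookkeeping above can be reproduced in a few lines. This is a sketch under the same assumption as the text (the photon climbs without redshift); the function names and the sample values of $gh$ and ε are illustrative only.

```python
def accumulated_mass(m0, gh, eps, n_iter):
    """Sum the mass left at the base after n_iter drops of the hypothetical cycle."""
    total = 0.0
    m = m0                                      # mass currently at the top of the tower
    for _ in range(n_iter):
        total += ((1.0 - eps) * gh + 1.0) * m   # rest mass plus the unused kinetic energy
        m = eps * gh * m                        # photon energy turned back into mass at the top
    return total

def closed_form(m0, gh, eps):
    """Limit a / (1 - r) of the geometric series with r = eps * gh."""
    return ((1.0 - eps) * gh + 1.0) * m0 / (1.0 - eps * gh)

m0, gh, eps = 1.0, 1e-3, 0.5
print(accumulated_mass(m0, gh, eps, 50), closed_form(m0, gh, eps), m0 * (1 + gh))
```

For small $gh$ all three printed values should be close, illustrating that the gain over $m_0$ is only of order $m_0\, gh$.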
2. A uniform external gravitational field would contribute to a uniform acceleration for the solid Earth and its fluid envelopes, engendering no relative motion between them.
3. (a) Consider the coordinate transformation $(x, y) \to (\xi, \eta)$ with $\xi = x$ and $\eta = 1$. Note that $\partial\eta/\partial x = 0$ and $\partial\eta/\partial y = 0$, so the Jacobian determinant vanishes. This violates Eq. (5.6), implying that this coordinate transformation is not good. In fact this same example was worked out on p. 118, complete with an example of a pair of distinct points $(x, y)$ having the same $(\xi, \eta)$ coordinates.
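3. (b) Are the following coordinate transformations good ones? Compute the Jacobian and list any points where the transformations fail.

(i) $\xi = (x^2 + y^2)^{1/2}$, $\eta = \arctan(y/x)$. This is of course Eq. (5.3), the polar coordinate transformation.
\[ \frac{\partial\xi}{\partial x} = \frac{x}{\sqrt{x^2+y^2}}, \quad \frac{\partial\xi}{\partial y} = \frac{y}{\sqrt{x^2+y^2}}, \quad \frac{\partial\eta}{\partial x} = \frac{-y}{x^2+y^2}, \quad \frac{\partial\eta}{\partial y} = \frac{x}{x^2+y^2}. \tag{5.6} \]
The determinant is $1/\sqrt{x^2+y^2}$, so I believe the only problem is at the origin, where $r = 0$ and the derivatives above are undefined.

(ii) $\xi = \ln(x)$, $\eta = y$.
\[ \frac{\partial\xi}{\partial x} = \frac{1}{x}, \quad \frac{\partial\xi}{\partial y} = 0, \quad \frac{\partial\eta}{\partial x} = 0, \quad \frac{\partial\eta}{\partial y} = 1. \tag{5.7} \]
The determinant is $1/x$, so the transformation fails on the line $x = 0$, and of course for all $x \le 0$, where $\xi = \ln(x)$ is undefined.

(iii) $\xi = \arctan(y/x)$, $\eta = (x^2 + y^2)^{-1/2}$.
\[ \frac{\partial\xi}{\partial x} = \frac{-y}{x^2+y^2}, \quad \frac{\partial\xi}{\partial y} = \frac{x}{x^2+y^2}, \quad \frac{\partial\eta}{\partial x} = \frac{-x}{(x^2+y^2)^{3/2}}, \quad \frac{\partial\eta}{\partial y} = \frac{-y}{(x^2+y^2)^{3/2}}. \tag{5.8} \]
The determinant is $1/(x^2+y^2)^{3/2}$, so I believe the only problems are at the origin, where the derivatives above are undefined, and as $x^2 + y^2 \to \infty$, where the determinant tends to zero.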
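The three determinants can be spot-checked with central differences. This is a rough numerical sketch, not part of Schutz's solution: the helper `jacobian_det` is an illustrative name, `math.atan2` is used in place of $\arctan(y/x)$ (they agree for $x > 0$), and the test point is arbitrary.

```python
import math

def jacobian_det(xi, eta, x, y, h=1e-6):
    """Central-difference estimate of the determinant d(xi, eta)/d(x, y) at (x, y)."""
    dxi_dx = (xi(x + h, y) - xi(x - h, y)) / (2 * h)
    dxi_dy = (xi(x, y + h) - xi(x, y - h)) / (2 * h)
    deta_dx = (eta(x + h, y) - eta(x - h, y)) / (2 * h)
    deta_dy = (eta(x, y + h) - eta(x, y - h)) / (2 * h)
    return dxi_dx * deta_dy - dxi_dy * deta_dx

x0, y0 = 1.2, 0.7
r0 = math.hypot(x0, y0)

# (i) polar coordinates, (ii) xi = ln x, (iii) angle and inverse radius
det_i = jacobian_det(lambda x, y: math.hypot(x, y), lambda x, y: math.atan2(y, x), x0, y0)
det_ii = jacobian_det(lambda x, y: math.log(x), lambda x, y: y, x0, y0)
det_iii = jacobian_det(lambda x, y: math.atan2(y, x), lambda x, y: 1.0 / math.hypot(x, y), x0, y0)

print(det_i, 1 / r0)
print(det_ii, 1 / x0)
print(det_iii, 1 / r0**3)
```

Each printed pair should agree to within the finite-difference error.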
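4. A curve is defined by $\{x = f(\lambda),\ y = g(\lambda),\ 0 \le \lambda \le 1\}$. Show that the tangent vector $(dx/d\lambda, dy/d\lambda)$ does actually lie tangent to the curve.

The slope of the tangent to the curve is
\[ \lim_{\Delta x \to 0} \frac{\Delta y}{\Delta x} = \lim_{\Delta\lambda \to 0} \frac{\Delta y/\Delta\lambda}{\Delta x/\Delta\lambda} = \frac{dy/d\lambda}{dx/d\lambda}, \tag{5.9} \]
which is also the slope of the tangent vector.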
5. Sketch the curves, compare paths, find tangent vectors when the parameter is nil.

The computations in this exercise are a bit trivial but still I found it instructive. If one likes this sort of approach, one might like the text by Faber (1983). The plots can be found in the accompanying Maple™ file schutz2009_ch5.mw.

(a)
\[ x = \sin\lambda, \qquad y = \cos\lambda. \tag{5.10} \]
This is a unit circle centred at the origin. When $\lambda = 0$ the tangent vector is at $(0, 1)$, points in the $x$-direction and has unit length.

(b)
\[ x = \cos(2\pi t^2), \qquad y = \sin(2\pi t^2 + \pi). \tag{5.11} \]
This is a unit circle centred at the origin, as in (a). The tangent vector is a bit subtle:
\[ \dot{x} = -\sin(2\pi t^2)\, 4\pi t, \qquad \dot{y} = \cos(2\pi t^2 + \pi)\, 4\pi t. \tag{5.12} \]
When $t = 0$ the tangent vector is the zero vector. But strangely we can still identify its direction! The angle θ of the tangent vector to the $x$-axis satisfies
\[ \cot\theta = \frac{\dot{x}}{\dot{y}} = -\frac{\sin(2\pi t^2)}{\cos(2\pi t^2 + \pi)}. \]
When $t = 0$ it's clear that $\theta = \pm\pi/2$. And we can decide the sign as follows. Let $t = \epsilon$, a small and positive parameter. Then $x$ is close to unity but slightly less, and $y$ is close to zero but negative. As $\epsilon \to 0$ the point approaches $(1, 0)$, so just after $t = 0$ the curve leaves $(1, 0)$ heading toward negative $y$, and the tangent vector points in the $-y$ direction. That is, $\theta = -\pi/2$.
(c)
\[ x = s, \qquad y = s + 4. \tag{5.13} \]
The path is a straight line with $y$ intercept at $y = 4$. The tangent vector is uniform, $(1, 1)$.

(d)
\[ x = s^2, \qquad y = -(s-2)(s+2) = 4 - s^2. \tag{5.14} \]
The path is a straight line with $y$ intercept at $y = 4$, as in (c), but now we're restricted to $x \ge 0$. The tangent vector is not uniform but depends upon $s$:
\[ \dot{x} = 2s, \qquad \dot{y} = -2s. \tag{5.15} \]
As in (b) the tangent vector is the zero vector at $s = 0$, but again we can still define its direction. In this case the direction for $s > 0$ is uniform (although the magnitude is not): $(1, -1)$, or $\theta = -\pi/4$. (For $s < 0$ the same line is traversed in the opposite sense.)

(e)
\[ x = \mu, \qquad y = 1. \tag{5.16} \]
The path is a straight line with $y$ intercept at $y = 1$. The tangent vector is uniform:
\[ \dot{x} = 1, \qquad \dot{y} = 0. \tag{5.17} \]
6. Justify the basis vectors and one-forms in Fig. 5.5.

The $\vec{e}_r$ vectors should point away from the origin, and be of the same length regardless of position because $|\vec{e}_r| = 1$, c.f. Eq. (5.28a).

The $\vec{e}_\theta$ vectors should be orthogonal to $\vec{e}_r$, and point in the anticlockwise direction about the origin. The length should increase linearly with distance from the origin, c.f. Eq. (5.28a).

The $\tilde{d}r$ basis should be surfaces tangent to a curve of constant $r$, so orthogonal to $\vec{e}_r$, which points away from the origin. The "amplitude" (indicated by the spacing of the lines) should be constant regardless of position because $|\tilde{d}r| = 1$, c.f. Eq. (5.28b) or (5.1) in my notes on § 5.2.

The $\tilde{d}\theta$ basis should be surfaces tangent to a curve of constant θ, so orthogonal to $\vec{e}_\theta$. The amplitude (weaker amplitude indicated by greater spacing of the lines) should decrease with distance from the origin, $|\tilde{d}\theta| = r^{-1}$, c.f. Eq. (5.28b).
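7. Let primed indices indicate polar coordinates and unprimed Cartesian. Find $\Lambda^{\alpha'}{}_{\beta}$ and $\Lambda^{\mu}{}_{\nu'}$.

Let's start with $\Lambda^{\mu}{}_{\nu'}$. Using Eq. (5.3) for the coordinates $(x, y)$ in terms of the polar coordinate variables $(r, \theta)$, we calculate the terms of the transformation given in Eq. (5.13):
\[ (\Lambda^{\mu}{}_{\nu'}) = \begin{pmatrix} \cos\theta & -r\sin\theta \\ \sin\theta & r\cos\theta \end{pmatrix}. \tag{5.18} \]
Slightly more awkward is $\Lambda^{\alpha'}{}_{\beta}$. Use Eq. (5.8) with $\xi = r$ and $\eta = \theta$. Note for the second row we must differentiate arctan:
\[ \frac{\partial\theta}{\partial x} = \frac{\partial\eta}{\partial x} = \frac{\partial \arctan(y/x)}{\partial x}. \tag{5.19} \]
I found it simpler to write $\tan\theta = y/x$ and differentiate both sides with respect to $x$:
\[ \frac{\partial \tan\theta}{\partial x} = \frac{d\tan\theta}{d\theta} \frac{\partial\theta}{\partial x} = \frac{1}{\cos^2\theta} \frac{\partial\theta}{\partial x}. \tag{5.20} \]
Now we solve for
\[ \frac{\partial\theta}{\partial x} = \cos^2\theta\, \frac{\partial\tan\theta}{\partial x} = \cos^2\theta\, \frac{\partial (y/x)}{\partial x} = \cos^2\theta\, \frac{-y}{x^2} = \frac{-y}{x^2+y^2}. \tag{5.21} \]
And similarly for
\[ \frac{\partial\theta}{\partial y} = \cos^2\theta\, \frac{\partial\tan\theta}{\partial y} = \cos^2\theta\, \frac{1}{x} = \frac{x}{x^2+y^2}. \tag{5.22} \]
Finally we arrive at:
\[ (\Lambda^{\alpha'}{}_{\beta}) = \begin{pmatrix} \dfrac{x}{\sqrt{x^2+y^2}} & \dfrac{y}{\sqrt{x^2+y^2}} \\[2ex] \dfrac{-y}{x^2+y^2} & \dfrac{x}{x^2+y^2} \end{pmatrix}. \tag{5.23} \]
As a check we can multiply the two transformations together to confirm that they are indeed a pair of inverses. To do this we must choose a common set of variables. Let's use $(r, \theta)$. It's straightforward to convert
\[ (\Lambda^{\alpha'}{}_{\beta}) = \begin{pmatrix} \dfrac{x}{\sqrt{x^2+y^2}} & \dfrac{y}{\sqrt{x^2+y^2}} \\[2ex] \dfrac{-y}{x^2+y^2} & \dfrac{x}{x^2+y^2} \end{pmatrix} = \begin{pmatrix} \cos\theta & \sin\theta \\ -\dfrac{\sin\theta}{r} & \dfrac{\cos\theta}{r} \end{pmatrix}. \tag{5.24} \]
And then we find their product gives the identity matrix,
\[ (\Lambda^{\alpha'}{}_{\beta})(\Lambda^{\beta}{}_{\gamma'}) = \begin{pmatrix} \cos\theta & \sin\theta \\ -\dfrac{\sin\theta}{r} & \dfrac{\cos\theta}{r} \end{pmatrix} \begin{pmatrix} \cos\theta & -r\sin\theta \\ \sin\theta & r\cos\theta \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}. \tag{5.25} \]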
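The inverse-pair check can also be done numerically at an arbitrary point. A minimal sketch, with illustrative function names; the first matrix is built from $(r, \theta)$ as in (5.18) and the second from $(x, y)$ as in (5.23), so the product also tests the conversion between the two sets of variables.

```python
import math

def lam_cart_from_polar(r, theta):
    """Lambda^mu_{nu'}: columns are d(x, y)/dr and d(x, y)/dtheta, as in (5.18)."""
    return [[math.cos(theta), -r * math.sin(theta)],
            [math.sin(theta),  r * math.cos(theta)]]

def lam_polar_from_cart(x, y):
    """Lambda^{alpha'}_beta: rows are dr/d(x, y) and dtheta/d(x, y), as in (5.23)."""
    r2 = x * x + y * y
    r = math.sqrt(r2)
    return [[x / r, y / r],
            [-y / r2, x / r2]]

def matmul(a, b):
    """Plain nested-list matrix product."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

r0, th0 = 2.0, 0.9
x0, y0 = r0 * math.cos(th0), r0 * math.sin(th0)
product = matmul(lam_polar_from_cart(x0, y0), lam_cart_from_polar(r0, th0))
print(product)
```

The printed matrix should be the 2×2 identity up to rounding error.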
8. (a) Let $f = x^2 + y^2 + 2xy$, and consider two vectors in Cartesian coordinates,
\[ \vec{V} \to_c (x^2 + 3y,\ y^2 + 3x), \qquad \vec{W} \to_c (1, 1). \tag{5.26} \]
Find $f$ as a function of $r$ and θ, and find the components of the two vectors on the polar basis.

From Eq. (5.3) we find immediately, by simple substitution, that
\[ f = r^2 + 2r^2 \cos\theta\sin\theta. \tag{5.27} \]
For the vectors, we need to express the Cartesian components in terms of $(r, \theta)$, and we need the transformation matrix $\Lambda^{\alpha'}{}_{\beta}$, where primed refers to the polar coordinates and unprimed to the Cartesian, as in problem 7. The former is, by substitution,
\[ \vec{V} \to_c (x^2 + 3y,\ y^2 + 3x) = (r^2\cos^2\theta + 3r\sin\theta,\ r^2\sin^2\theta + 3r\cos\theta). \tag{5.28} \]
These are still Cartesian components, but expressed as a function of $(r, \theta)$. Using the transformation matrix from problem 7 we have
\[ \vec{V} \to_p \begin{pmatrix} \cos\theta & \sin\theta \\ -\dfrac{\sin\theta}{r} & \dfrac{\cos\theta}{r} \end{pmatrix} \begin{pmatrix} r^2\cos^2\theta + 3r\sin\theta \\ r^2\sin^2\theta + 3r\cos\theta \end{pmatrix} = \begin{pmatrix} r^2(\cos^3\theta + \sin^3\theta) + 6r\sin\theta\cos\theta \\ -r\cos^2\theta\sin\theta + r\sin^2\theta\cos\theta + 3(\cos^2\theta - \sin^2\theta) \end{pmatrix}. \tag{5.29} \]
And similarly for $\vec{W}$, again using the transformation matrix from problem 7, we have simply:
\[ \vec{W} \to_p \begin{pmatrix} \cos\theta & \sin\theta \\ -\dfrac{\sin\theta}{r} & \dfrac{\cos\theta}{r} \end{pmatrix} \begin{pmatrix} 1 \\ 1 \end{pmatrix} = \begin{pmatrix} \sin\theta + \cos\theta \\ -\dfrac{\sin\theta}{r} + \dfrac{\cos\theta}{r} \end{pmatrix}. \tag{5.30} \]
The components of the simple vector $\vec{W}$ are a function of $(r, \theta)$ because the basis vectors change with position, c.f. Eqs. (5.22) & (5.23).
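8. (b) Find the components of the one-form $\tilde{d}f$ in Cartesian coordinates.

Solution: Using Eq. (5.10) we find
\[ \tilde{d}f \to_c \left( \frac{\partial f}{\partial x}, \frac{\partial f}{\partial y} \right) = (2x + 2y,\ 2x + 2y). \tag{5.31} \]
(i) Find the components of the one-form $\tilde{d}f$ in polar coordinates by direct computation.

We use result (5.27) above and Eq. (5.10),
\[ \tilde{d}f \to_p \left( \frac{\partial f}{\partial r}, \frac{\partial f}{\partial\theta} \right) = \left( 2r + 4r\cos\theta\sin\theta,\ 2r^2(\cos^2\theta - \sin^2\theta) \right). \tag{5.32} \]
(ii) Find the components of the one-form $\tilde{d}f$ in polar coordinates by transforming the Cartesian components.

We use of course Eq. (5.14) in general to relate the polar and Cartesian components of a one-form. The result (5.31) above gives the Cartesian components, and the transformation is in general Eq. (5.13) and was found above in result (5.18).
\begin{align*}
(\tilde{d}f)_{\beta'} &= (\tilde{d}f)_{\alpha}\, \Lambda^{\alpha}{}_{\beta'}, \quad \text{Eq. (5.14)} \\
&= \left( \frac{\partial f}{\partial x}, \frac{\partial f}{\partial y} \right) \begin{pmatrix} \partial x/\partial r & \partial x/\partial\theta \\ \partial y/\partial r & \partial y/\partial\theta \end{pmatrix} \\
&= (2x + 2y,\ 2x + 2y) \begin{pmatrix} \cos\theta & -r\sin\theta \\ \sin\theta & r\cos\theta \end{pmatrix} \\
&= \left( 2r(\cos\theta + \sin\theta),\ 2r(\cos\theta + \sin\theta) \right) \begin{pmatrix} \cos\theta & -r\sin\theta \\ \sin\theta & r\cos\theta \end{pmatrix} \\
&= \left( 2r + 4r\cos\theta\sin\theta,\ 2r^2(\cos^2\theta - \sin^2\theta) \right), \tag{5.33}
\end{align*}
which agrees, of course, with the result (5.32).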
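8. (c) (i) Find the components of the one-forms $\tilde{V}$ and $\tilde{W}$ in polar coordinates using the metric tensor.

The metric tensor in polar coordinates was given in Eq. (5.32). We found $\vec{V}$ in polar coordinates in problem 8(a), see result (5.29).
\begin{align*}
V_\alpha &= g_{\alpha\beta} V^\beta, \quad \text{c.f. Eq. (3.39)} \\
&= \begin{pmatrix} 1 & 0 \\ 0 & r^2 \end{pmatrix} \begin{pmatrix} V^r \\ V^\theta \end{pmatrix} = (V^r,\ r^2 V^\theta) \\
&= \begin{pmatrix} r^2(\cos^3\theta + \sin^3\theta) + 6r\sin\theta\cos\theta \\ -r^3\cos^2\theta\sin\theta + r^3\sin^2\theta\cos\theta + 3r^2(\cos^2\theta - \sin^2\theta) \end{pmatrix}. \tag{5.34}
\end{align*}
For $\vec{W}$ in polar coordinates we use result (5.30):
\begin{align*}
W_\alpha &= g_{\alpha\beta} W^\beta, \quad \text{c.f. Eq. (3.39)} \\
&= \begin{pmatrix} 1 & 0 \\ 0 & r^2 \end{pmatrix} \begin{pmatrix} W^r \\ W^\theta \end{pmatrix} = \begin{pmatrix} \sin\theta + \cos\theta \\ -r\sin\theta + r\cos\theta \end{pmatrix}. \tag{5.35}
\end{align*}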
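8. (c) (ii) Find the components of the one-forms $\tilde{V}$ and $\tilde{W}$ in polar coordinates using the transformation from Cartesian components.

Because the metric tensor in Cartesian components corresponds to the identity matrix, the components of the one-forms are identical to those of the vectors:
\begin{align*}
V_\alpha &= g_{\alpha\beta} V^\beta, \quad \text{c.f. Eq. (3.39)} \\
V_\alpha &= \delta_{\alpha\beta} V^\beta, \quad \text{used Eq. (5.29)} \\
V_\alpha &\to_c V^\alpha, \quad \text{also explained p. 131 bottom,} \tag{5.36}
\end{align*}
so we immediately have the components of $\tilde{V}$ and $\tilde{W}$ in Cartesian components. We merely have to transform these to polar coordinates.
\begin{align*}
V_{\alpha'} &= V_\beta\, \Lambda^{\beta}{}_{\alpha'}, \quad \text{Eq. (5.14) or Eq. (3.16)} \\
&= (r^2\cos^2\theta + 3r\sin\theta,\ r^2\sin^2\theta + 3r\cos\theta) \begin{pmatrix} \cos\theta & -r\sin\theta \\ \sin\theta & r\cos\theta \end{pmatrix} \\
&= \left( r^2(\cos^3\theta + \sin^3\theta) + 6r\sin\theta\cos\theta,\ -r^3(\cos^2\theta\sin\theta - \sin^2\theta\cos\theta) + 3r^2(\cos^2\theta - \sin^2\theta) \right), \tag{5.37}
\end{align*}
which agrees with result (5.34). And similarly for the other one-form:
\begin{align*}
W_{\alpha'} &= W_\beta\, \Lambda^{\beta}{}_{\alpha'} \\
&= (1, 1) \begin{pmatrix} \cos\theta & -r\sin\theta \\ \sin\theta & r\cos\theta \end{pmatrix} \\
&= \left( \cos\theta + \sin\theta,\ r(\cos\theta - \sin\theta) \right), \tag{5.38}
\end{align*}
which agrees with the result (5.35).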
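The agreement between the two routes in 8(c) can be confirmed numerically. The sketch below uses illustrative function names; it lowers the index of the polar vector components with $g = \mathrm{diag}(1, r^2)$ and separately transforms the Cartesian one-form components with the matrix of (5.18), then compares both with the closed form (5.34).

```python
import math

def V_cart(x, y):
    """Cartesian components of V from (5.26); they equal the one-form components there."""
    return (x * x + 3 * y, y * y + 3 * x)

def oneform_via_metric(r, th):
    """Transform V to the polar basis, then lower the index with g = diag(1, r^2)."""
    c, s = math.cos(th), math.sin(th)
    vx, vy = V_cart(r * c, r * s)
    v_r = c * vx + s * vy
    v_th = (-s * vx + c * vy) / r
    return (v_r, r * r * v_th)

def oneform_via_transformation(r, th):
    """Transform the Cartesian one-form components with Lambda^beta_{alpha'}."""
    c, s = math.cos(th), math.sin(th)
    px, py = V_cart(r * c, r * s)
    return (px * c + py * s, -r * s * px + r * c * py)

def closed_form(r, th):
    """Result (5.34)."""
    c, s = math.cos(th), math.sin(th)
    return (r * r * (c**3 + s**3) + 6 * r * s * c,
            -r**3 * c * c * s + r**3 * s * s * c + 3 * r * r * (c * c - s * s))

r0, th0 = 1.4, 0.6
print(oneform_via_metric(r0, th0), oneform_via_transformation(r0, th0), closed_form(r0, th0))
```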
9. Draw a figure to explain Eqs. (5.38a, b).

Recall the Eqs. (5.38a, b):
\[ \frac{\partial \vec{e}_\theta}{\partial r} = \frac{1}{r}\, \vec{e}_\theta, \qquad \frac{\partial \vec{e}_\theta}{\partial\theta} = -r\, \vec{e}_r. \]
We see that changing $r$ does not change the direction of the polar coordinate basis vectors. But $\vec{e}_\theta$ does change in magnitude since it must increase in length as one moves further from the origin, albeit more slowly (in relative terms) the farther one is from the origin, see Fig. 5.1. Changing θ on the other hand does change the orientation of the basis vectors. Increasing θ when one is in the first quadrant, for instance, results in $\vec{e}_\theta$ pointing more toward the $-x$-direction, see Fig. 5.2.

Figure 5.1: Moving from point A to B (a displacement $\Delta r$) we see the basis vector $\vec{e}_\theta$ increases in length but doesn't change direction. For larger $r$ the relative change is smaller. Plot was partly generated with Maple™, see accompanying file schutz2009_ch5.mw.

10. Prove that $\nabla\vec{V}$ defined in Eq. (5.52) is a $\binom{1}{1}$ tensor.

First a note on terminology. One might complain that the RHS below
\[ (\nabla\vec{V})^{\alpha}{}_{\beta} \equiv V^{\alpha}{}_{;\beta} \tag{5.39} \]
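is not a tensor because it lacks a basis. In Chapter 3, Schutz was careful to include the basis functions but here they were lost. While this complaint is valid, Schutz is following tradition here by calling the components of a tensor "a tensor" for short. Hobson et al. (2009) were explicit about using such an abbreviation and noted that it is commonly done. In Schutz's partial defence here, in Eq. (5.52) he enclosed the tensor on the LHS in parentheses and pulled off the α and β components explicitly; recall the discussion in my notes on chapter 2 §2.2 of this notation introduced without explanation.

Going back to the subsection The covariant derivative of § 5.3, if we take the gradient of Eq. (5.43) instead of just the derivative, then instead of Eq. (5.46) we obtain:
\[ \tilde{d}x^\beta\, \frac{\partial \vec{V}}{\partial x^\beta} = \tilde{d}x^\beta\, \frac{\partial V^\alpha}{\partial x^\beta}\, \vec{e}_\alpha + \tilde{d}x^\beta\, V^\alpha\, \Gamma^{\mu}{}_{\alpha\beta}\, \vec{e}_\mu. \tag{5.40} \]
Figure 5.2: Moving from point A to B (a displacement $\Delta\theta$) we see the basis vector $\vec{e}_\theta$ changes direction, in this case pointing more toward the $-x$-direction, but doesn't change in length. Plot was partly generated with Maple™, see accompanying file schutz2009_ch5.mw.

In other words, I've just included the one-form basis. Continuing the development on p. 128 leads up to Eq. (5.51), but with the one-form basis:
\[ \tilde{d}x^\beta\, \frac{\partial \vec{V}}{\partial x^\beta} = \tilde{d}x^\beta\, V^{\alpha}{}_{;\beta}\, \vec{e}_\alpha = V^{\alpha}{}_{;\beta}\; \tilde{d}x^\beta \otimes \vec{e}_\alpha. \tag{5.41} \]
To completely prove this is a $\binom{1}{1}$ tensor would include showing that it is invariant under a change of basis. This is easy if we restrict ourselves to Cartesian bases, for then we don't have to deal with the Christoffel symbols. Consider frame $\bar{O}$, related to frame $O$ as follows:
\begin{align*}
V^{\bar{\alpha}} &= \Lambda^{\bar{\alpha}}{}_{\alpha} V^\alpha, \\
\vec{e}_{\bar{\alpha}} &= \Lambda^{\alpha'}{}_{\bar{\alpha}}\, \vec{e}_{\alpha'}, \\
\tilde{d}x^{\bar{\beta}} &= \Lambda^{\bar{\beta}}{}_{\beta}\, \tilde{d}x^\beta, \\
\frac{\partial}{\partial x^{\bar{\beta}}} &= \frac{\partial x^{\beta'}}{\partial x^{\bar{\beta}}} \frac{\partial}{\partial x^{\beta'}} = \Lambda^{\beta'}{}_{\bar{\beta}}\, \frac{\partial}{\partial x^{\beta'}}, \tag{5.42}
\end{align*}
where overbar indices indicate components in the $\bar{O}$ frame, and those without an overbar indicate components in the $O$ frame regardless of whether or not they have primes. We simply substitute these into the expression for the gradient of the vector field in the frame $\bar{O}$:
\begin{align*}
\tilde{d}x^{\bar{\beta}}\, \frac{\partial \vec{V}}{\partial x^{\bar{\beta}}} &= \tilde{d}x^{\bar{\beta}}\, \frac{\partial V^{\bar{\alpha}}}{\partial x^{\bar{\beta}}}\, \vec{e}_{\bar{\alpha}}, \\
&= \Lambda^{\bar{\beta}}{}_{\beta}\, \tilde{d}x^\beta\; \Lambda^{\beta'}{}_{\bar{\beta}}\, \Lambda^{\bar{\alpha}}{}_{\alpha}\, \Lambda^{\alpha'}{}_{\bar{\alpha}}\, \frac{\partial V^\alpha}{\partial x^{\beta'}}\, \vec{e}_{\alpha'}, \\
&= \delta^{\beta'}{}_{\beta}\, \tilde{d}x^\beta\; \delta^{\alpha'}{}_{\alpha}\, \frac{\partial V^\alpha}{\partial x^{\beta'}}\, \vec{e}_{\alpha'}, \\
&= \tilde{d}x^\beta\, \frac{\partial V^\alpha}{\partial x^\beta}\, \vec{e}_\alpha. \tag{5.43}
\end{align*}
Voilà! It transforms properly such that it is frame invariant, at least for frames with a Cartesian basis. [Schutz's solution focuses on showing linearity.]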
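11. For $\vec{V}$ from Exer. 8 above, find:

(a) $V^{\alpha}{}_{,\beta}$ in Cartesian coordinates.
\[ V^{\alpha}{}_{,\beta} \to_c \begin{pmatrix} \partial V^x/\partial x & \partial V^x/\partial y \\ \partial V^y/\partial x & \partial V^y/\partial y \end{pmatrix} = \begin{pmatrix} 2x & 3 \\ 3 & 2y \end{pmatrix} = \begin{pmatrix} 2r\cos\theta & 3 \\ 3 & 2r\sin\theta \end{pmatrix}. \tag{5.44} \]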
11. (b) The transformation to polar coordinates.

Transformation of vectors between different coordinates was explained in § 2.2 and § 5.2. Perhaps it was not thoroughly explained how to transform tensors. This exercise then is very important for clarifying that. The key ingredient is that one needs to apply a transformation matrix for each index or rank of the tensor: Eq. (5.8) for the superscript indices (what other books call the contravariant components) and Eq. (5.13) for the subscript indices (what other books call the covariant components).
\[ \Lambda^{\mu'}{}_{\alpha}\, V^{\alpha}{}_{,\beta}\, \Lambda^{\beta}{}_{\nu'} = \begin{pmatrix} \cos\theta & \sin\theta \\ -\dfrac{\sin\theta}{r} & \dfrac{\cos\theta}{r} \end{pmatrix} \begin{pmatrix} 2r\cos\theta & 3 \\ 3 & 2r\sin\theta \end{pmatrix} \begin{pmatrix} \cos\theta & -r\sin\theta \\ \sin\theta & r\cos\theta \end{pmatrix} = \begin{pmatrix} A & B \\ C & D \end{pmatrix}, \tag{5.45} \]
where
\begin{align*}
A &= 2r(\cos^3\theta + \sin^3\theta) + 3\sin 2\theta, \\
B &= 2r^2(-\cos^2\theta\sin\theta + \sin^2\theta\cos\theta) + 3r\cos 2\theta, \\
C &= 2(-\cos^2\theta\sin\theta + \sin^2\theta\cos\theta) + \frac{3}{r}\cos 2\theta, \\
D &= 2r(\cos^2\theta\sin\theta + \sin^2\theta\cos\theta) - 3\sin 2\theta. \tag{5.46}
\end{align*}
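The triple matrix product in (5.45) is easy to get wrong by hand, so here is a small numerical check of the entries $A$, $B$, $C$, $D$ at an arbitrary point. The function names are illustrative only.

```python
import math

def matmul(a, b):
    """Plain nested-list matrix product."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transformed_gradient(r, th):
    """Evaluate the left-hand side of (5.45) numerically."""
    c, s = math.cos(th), math.sin(th)
    lam_polar = [[c, s], [-s / r, c / r]]          # Lambda^{mu'}_alpha, from (5.24)
    grad = [[2 * r * c, 3.0], [3.0, 2 * r * s]]    # V^alpha_{,beta}, from (5.44)
    lam_cart = [[c, -r * s], [s, r * c]]           # Lambda^beta_{nu'}, from (5.18)
    return matmul(matmul(lam_polar, grad), lam_cart)

def closed_form(r, th):
    """The entries A, B, C, D of (5.46)."""
    c, s = math.cos(th), math.sin(th)
    A = 2 * r * (c**3 + s**3) + 3 * math.sin(2 * th)
    B = 2 * r * r * (-c * c * s + s * s * c) + 3 * r * math.cos(2 * th)
    C = 2 * (-c * c * s + s * s * c) + 3 * math.cos(2 * th) / r
    D = 2 * r * (c * c * s + s * s * c) - 3 * math.sin(2 * th)
    return [[A, B], [C, D]]

print(transformed_gradient(1.3, 0.5))
print(closed_form(1.3, 0.5))
```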
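11. (c) $V^{\mu'}{}_{;\nu'}$ directly in polar coordinates.

First we need the vector field in polar coordinates. We use the result (5.29) found in problem 8(a) above, which was:
\[ \vec{V} \to_p \begin{pmatrix} \cos\theta & \sin\theta \\ -\dfrac{\sin\theta}{r} & \dfrac{\cos\theta}{r} \end{pmatrix} \begin{pmatrix} r^2\cos^2\theta + 3r\sin\theta \\ r^2\sin^2\theta + 3r\cos\theta \end{pmatrix} = \begin{pmatrix} r^2(\cos^3\theta + \sin^3\theta) + 3r\sin 2\theta \\ -r\cos^2\theta\sin\theta + r\sin^2\theta\cos\theta + 3\cos 2\theta \end{pmatrix}. \tag{5.47} \]
From Eq. (5.50), the gradient has two parts. The first part is due to the gradient of the components:
\[ V^{\mu'}{}_{,\nu'} = \begin{pmatrix} \partial V^r/\partial r & \partial V^r/\partial\theta \\ \partial V^\theta/\partial r & \partial V^\theta/\partial\theta \end{pmatrix} = \begin{pmatrix} A & E \\ F & G \end{pmatrix}, \tag{5.48} \]
where
\begin{align*}
A &= 2r(\cos^3\theta + \sin^3\theta) + 3\sin 2\theta, \quad \text{as in (b) above,} \\
E &= 3r^2(-\cos^2\theta\sin\theta + \sin^2\theta\cos\theta) + 6r\cos 2\theta, \\
F &= -\cos^2\theta\sin\theta + \sin^2\theta\cos\theta, \\
G &= r(2\sin^2\theta\cos\theta + 2\cos^2\theta\sin\theta - \cos^3\theta - \sin^3\theta) - 6\sin 2\theta. \tag{5.49}
\end{align*}
The second part is due to the gradient in the basis vectors. Using Eq. (5.45) for the Christoffel symbols:
\[ V^{\gamma'}\, \Gamma^{\mu'}{}_{\gamma'\nu'} = \begin{pmatrix} 0 & V^\theta\, \Gamma^{r}{}_{\theta\theta} \\ V^\theta\, \Gamma^{\theta}{}_{\theta r} & V^r\, \Gamma^{\theta}{}_{r\theta} \end{pmatrix} = \begin{pmatrix} 0 & r^2(\cos^2\theta\sin\theta - \sin^2\theta\cos\theta) - 3r\cos 2\theta \\ (-\cos^2\theta\sin\theta + \sin^2\theta\cos\theta) + \dfrac{3\cos 2\theta}{r} & r(\cos^3\theta + \sin^3\theta) + 3\sin 2\theta \end{pmatrix}. \tag{5.50} \]
Combining these two we obtain
\[ V^{\mu'}{}_{;\nu'} \to_p \tag{5.51} \]
\[ \begin{pmatrix} 2r(\cos^3\theta + \sin^3\theta) + 3\sin 2\theta & 2r^2(-\cos^2\theta\sin\theta + \sin^2\theta\cos\theta) + 3r\cos 2\theta \\ 2(-\cos^2\theta\sin\theta + \sin^2\theta\cos\theta) + \dfrac{3\cos 2\theta}{r} & r(2\sin^2\theta\cos\theta + 2\cos^2\theta\sin\theta) - 3\sin 2\theta \end{pmatrix}. \tag{5.52} \]
And voilà, it agrees with the much simpler calculation in Cartesian coordinates, transformed to polar (5.45).

11. (d) The divergence using results from (a).

The divergence should be frame-independent and is the trace of the matrix of the covariant derivative of the vector. Using the covariant derivative in Cartesian coordinates from (a) above:
\[ V^{\alpha}{}_{,\alpha} = 2x + 2y = 2r(\cos\theta + \sin\theta). \tag{5.53} \]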
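11. (e) The divergence using results from (b).
\begin{align*}
V^{\alpha}{}_{;\alpha} &= 2r(\cos^3\theta + \sin^3\theta + \cos^2\theta\sin\theta + \sin^2\theta\cos\theta), \\
&= 2r[\cos^2\theta(\cos\theta + \sin\theta) + \sin^2\theta(\sin\theta + \cos\theta)], \\
&= 2r(\cos\theta + \sin\theta). \tag{5.54}
\end{align*}
And of course this agrees with the result (5.53) obtained in (d).

11. (f) The divergence using Eq. (5.56).

Recall from (5.29) that we had:
\[ \vec{V} \to_p \begin{pmatrix} r^2(\cos^3\theta + \sin^3\theta) + 6r\sin\theta\cos\theta \\ -r\cos^2\theta\sin\theta + r\sin^2\theta\cos\theta + 3(\cos^2\theta - \sin^2\theta) \end{pmatrix}. \tag{5.55} \]
So applying Eq. (5.56) we get
\begin{align*}
\nabla \cdot \vec{V} &= \frac{1}{r} \frac{\partial}{\partial r} [r V^r] + \frac{\partial}{\partial\theta} V^\theta \\
&= \frac{1}{r} \frac{\partial}{\partial r} \left[ r \left( r^2(\cos^3\theta + \sin^3\theta) + 6r\sin\theta\cos\theta \right) \right] + \frac{\partial}{\partial\theta} \left[ -r\cos^2\theta\sin\theta + r\sin^2\theta\cos\theta + 3(\cos^2\theta - \sin^2\theta) \right] \tag{5.56} \\
&= 3r(\cos^3\theta + \sin^3\theta) + 12\sin\theta\cos\theta + \left[ 2r\cos\theta\sin^2\theta - r\cos^3\theta + 2r\sin\theta\cos^2\theta - r\sin^3\theta - 6\sin(2\theta) \right] \tag{5.57} \\
&= 2r(\cos\theta + \sin\theta). \tag{5.58}
\end{align*}
And of course this also agrees with the result (5.53) obtained in (d).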
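As a cross-check of (d) through (f), the polar divergence formula can be evaluated by central differences and compared with the Cartesian result $2x + 2y$. This is a numerical sketch with illustrative names, using the polar components from (5.29); the step size is an arbitrary choice.

```python
import math

def V_polar(r, th):
    """Polar components (V^r, V^theta) from result (5.29)."""
    c, s = math.cos(th), math.sin(th)
    v_r = r * r * (c**3 + s**3) + 6 * r * s * c
    v_th = -r * c * c * s + r * s * s * c + 3 * (c * c - s * s)
    return v_r, v_th

def divergence_polar(r, th, h=1e-6):
    """(1/r) d(r V^r)/dr + d(V^theta)/dtheta, estimated with central differences."""
    d_rVr = ((r + h) * V_polar(r + h, th)[0] - (r - h) * V_polar(r - h, th)[0]) / (2 * h)
    d_Vth = (V_polar(r, th + h)[1] - V_polar(r, th - h)[1]) / (2 * h)
    return d_rVr / r + d_Vth

def divergence_cartesian(r, th):
    """d(V^x)/dx + d(V^y)/dy = 2x + 2y at the same point."""
    return 2 * r * math.cos(th) + 2 * r * math.sin(th)

print(divergence_polar(1.7, 0.8), divergence_cartesian(1.7, 0.8))
```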
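12. Given $\tilde{p}$ whose components in Cartesian coordinates are the same as those of $\vec{V}$ in Exers. 8(a) and 11.

(a) Find $p_{\alpha,\beta}$ in Cartesian coordinates.
\[ p_{\alpha,\beta} = \frac{\partial p_\alpha}{\partial x^\beta}, \]
and we've done all these calculations in Exer. 11(a).
\[ p_{\alpha,\beta} \to_c \begin{pmatrix} \partial p_x/\partial x & \partial p_x/\partial y \\ \partial p_y/\partial x & \partial p_y/\partial y \end{pmatrix} = \begin{pmatrix} 2x & 3 \\ 3 & 2y \end{pmatrix} = \begin{pmatrix} 2r\cos\theta & 3 \\ 3 & 2r\sin\theta \end{pmatrix}. \tag{5.59} \]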
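(b) Transform $p_{\alpha,\beta}$ from Cartesian coordinates to polar coordinates, $p_{\alpha';\beta'}$.

Note the different notation now for the derivative: because we're using curvilinear coordinates there are derivatives of the basis vectors too, so we use the semicolon instead of the comma. Now we get something different from Exer. 11(b) because the transformation is different for one-forms than for vectors:
\[ p_{\alpha';\beta'} = \Lambda^{\alpha}{}_{\alpha'}\, \Lambda^{\beta}{}_{\beta'}\, p_{\alpha,\beta}. \]
Instead of using matrices, let's try using index notation as follows.
\begin{align*}
p_{r;r} &= \Lambda^{\alpha}{}_{r}\, \Lambda^{\beta}{}_{r}\, p_{\alpha,\beta} \\
&= \Lambda^{x}{}_{r} \Lambda^{x}{}_{r}\, p_{x,x} + \Lambda^{x}{}_{r} \Lambda^{y}{}_{r} (p_{x,y} + p_{y,x}) + \Lambda^{y}{}_{r} \Lambda^{y}{}_{r}\, p_{y,y} \\
&= (\Lambda^{x}{}_{r})^2\, 2x + \Lambda^{x}{}_{r} \Lambda^{y}{}_{r} (3 + 3) + (\Lambda^{y}{}_{r})^2\, 2y \\
&= (\Lambda^{x}{}_{r})^2\, 2r\cos\theta + \Lambda^{x}{}_{r} \Lambda^{y}{}_{r} (6) + (\Lambda^{y}{}_{r})^2\, 2r\sin\theta \\
&= (\cos\theta)^2\, 2r\cos\theta + \cos\theta\sin\theta\, (6) + (\sin\theta)^2\, 2r\sin\theta \\
&= 2r(\cos^3\theta + \sin^3\theta) + 6\cos\theta\sin\theta \\
&= V^{r}{}_{;r}, \quad \text{see result (5.46) above.} \tag{5.60}
\end{align*}
And
\begin{align*}
p_{\theta;r} &= \Lambda^{\alpha}{}_{\theta}\, \Lambda^{\beta}{}_{r}\, p_{\alpha,\beta} \\
&= \Lambda^{x}{}_{\theta} \Lambda^{x}{}_{r}\, p_{x,x} + \Lambda^{x}{}_{\theta} \Lambda^{y}{}_{r}\, p_{x,y} + \Lambda^{y}{}_{\theta} \Lambda^{x}{}_{r}\, p_{y,x} + \Lambda^{y}{}_{\theta} \Lambda^{y}{}_{r}\, p_{y,y} \\
&= \Lambda^{x}{}_{\theta} \Lambda^{x}{}_{r}\, 2r\cos\theta + \Lambda^{x}{}_{\theta} \Lambda^{y}{}_{r}\, 3 + \Lambda^{y}{}_{\theta} \Lambda^{x}{}_{r}\, 3 + \Lambda^{y}{}_{\theta} \Lambda^{y}{}_{r}\, 2r\sin\theta \\
&= [-r\sin\theta\cos\theta]\, 2r\cos\theta + [-r\sin^2\theta]\, 3 + [r\cos^2\theta]\, 3 + [r\sin\theta\cos\theta]\, 2r\sin\theta \\
&= r^2 \left\{ 2[-\cos^2\theta\sin\theta + \sin^2\theta\cos\theta] + \frac{3}{r}\cos(2\theta) \right\} \\
&= r^2\, V^{\theta}{}_{;r}, \quad \text{see result (5.46) above.} \tag{5.61}
\end{align*}
We can save some time by noting that, because $p_{\alpha,\beta} = p_{\beta,\alpha}$,
\[ p_{r;\theta} = \Lambda^{\alpha}{}_{r}\, \Lambda^{\beta}{}_{\theta}\, p_{\alpha,\beta} = p_{\theta;r}. \tag{5.62} \]
And finally,
\begin{align*}
p_{\theta;\theta} &= \Lambda^{\alpha}{}_{\theta}\, \Lambda^{\beta}{}_{\theta}\, p_{\alpha,\beta} \\
&= \Lambda^{x}{}_{\theta} \Lambda^{x}{}_{\theta}\, p_{x,x} + \Lambda^{x}{}_{\theta} \Lambda^{y}{}_{\theta} (p_{x,y} + p_{y,x}) + \Lambda^{y}{}_{\theta} \Lambda^{y}{}_{\theta}\, p_{y,y} \\
&= (\Lambda^{x}{}_{\theta})^2\, 2r\cos\theta + \Lambda^{x}{}_{\theta} \Lambda^{y}{}_{\theta} (6) + (\Lambda^{y}{}_{\theta})^2\, 2r\sin\theta \\
&= (r\sin\theta)^2\, 2r\cos\theta + (-r^2\cos\theta\sin\theta)\, 6 + (r\cos\theta)^2\, 2r\sin\theta \\
&= 2r^3(\cos^2\theta\sin\theta + \sin^2\theta\cos\theta) - 3r^2\sin(2\theta) \\
&= r^2\, V^{\theta}{}_{;\theta}, \quad \text{see result (5.46) above.} \tag{5.63}
\end{align*}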
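12. (c) Obtain $p_{\alpha';\beta'}$ directly in polar coordinates.

We need $p_\alpha$ in polar coordinates. We can use the results we obtained in Exer. 8(c). Because $\vec{V}$ from Exer. 8 had the same components as our one-form $\tilde{p}$ when both were in Cartesian components, $\vec{V}$ must be the vector dual of $\tilde{p}$. This works because $g_{\alpha\beta} = \delta_{\alpha\beta}$ for Cartesian coordinates. So $p_\alpha = V_\alpha$ and, recalling result (5.34) from above,
\begin{align*}
p_r &= r^2(\cos^3\theta + \sin^3\theta) + 6r\sin\theta\cos\theta, \\
p_\theta &= -r^3\cos^2\theta\sin\theta + r^3\sin^2\theta\cos\theta + 3r^2(\cos^2\theta - \sin^2\theta). \tag{5.64}
\end{align*}
Let's separate out the 4 terms in Eq. (5.63) and work on them one at a time.
\begin{align*}
p_{r;r} &= p_{r,r} - p_{\alpha'}\, \Gamma^{\alpha'}{}_{rr}, \quad \text{Eq. (5.63)} \\
&= p_{r,r}, \quad \text{Eq. (5.45) gives } \Gamma^{\alpha'}{}_{rr} = 0 \\
&= \frac{\partial}{\partial r} \left( r^2(\cos^3\theta + \sin^3\theta) + 6r\sin\theta\cos\theta \right) \\
&= 2r(\cos^3\theta + \sin^3\theta) + 3\sin(2\theta), \tag{5.65}
\end{align*}
as in result (5.60) above.
\begin{align*}
p_{r;\theta} &= p_{r,\theta} - p_{\alpha'}\, \Gamma^{\alpha'}{}_{r\theta}, \quad \text{Eq. (5.63)} \\
&= p_{r,\theta} - p_\theta\, \Gamma^{\theta}{}_{r\theta}, \quad \text{only one non-zero Christoffel symbol from Eq. (5.45)} \\
&= \frac{\partial}{\partial\theta} \left( r^2(\cos^3\theta + \sin^3\theta) + 6r\sin\theta\cos\theta \right) - \frac{1}{r}\, p_\theta \\
&= 2r^2(-\cos^2\theta\sin\theta + \sin^2\theta\cos\theta) + 3r\cos(2\theta), \quad \text{several terms canceled,} \tag{5.66}
\end{align*}
as in result (5.61) above. You can save yourself some work by noting that here $p_{\theta,r} = p_{r,\theta}$, and the latter was already calculated above. (This holds because the Cartesian components satisfy $p_{x,y} = p_{y,x}$, i.e. $\tilde{p}$ is the gradient of a scalar, so the order of partial derivatives doesn't matter.) And the Christoffel symbol cannot make a difference because of the symmetry (Eq. (5.74)). Thus $p_{\theta;r} = p_{r;\theta}$:
\begin{align*}
p_{\theta;r} &= p_{\theta,r} - p_{\alpha'}\, \Gamma^{\alpha'}{}_{\theta r}, \quad \text{Eq. (5.63)} \\
&= p_{r,\theta} - p_\theta\, \Gamma^{\theta}{}_{r\theta}, \quad \text{only one non-zero Christoffel symbol from Eq. (5.45)} \\
&= 2r^2(-\cos^2\theta\sin\theta + \sin^2\theta\cos\theta) + 3r\cos(2\theta) \\
&= p_{r;\theta}, \tag{5.67}
\end{align*}
as in result (5.62) above. Finally
\begin{align*}
p_{\theta;\theta} &= p_{\theta,\theta} - p_{\alpha'}\, \Gamma^{\alpha'}{}_{\theta\theta}, \quad \text{Eq. (5.63)} \\
&= p_{\theta,\theta} - p_r\, \Gamma^{r}{}_{\theta\theta}, \quad \text{only one non-zero Christoffel symbol from Eq. (5.45)} \\
&= \frac{\partial}{\partial\theta}(p_\theta) + r\, p_r \\
&= 2r^3(\cos^2\theta\sin\theta + \sin^2\theta\cos\theta) - 3r^2\sin(2\theta), \quad \text{several terms canceled,} \tag{5.68}
\end{align*}
as in result (5.63) above.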
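13. Show that one could have obtained the results in Exer. 12(b) by lowering the index using the metric,
\[ p_{\alpha';\beta'} = g_{\alpha'\sigma'}\, V^{\sigma'}{}_{;\beta'}. \]
Recall in Exer. 12(b) we found
\[ p_{r;r} = \Lambda^{\alpha}{}_{r}\, \Lambda^{\beta}{}_{r}\, p_{\alpha,\beta} = 2r(\cos^3\theta + \sin^3\theta) + 6\cos\theta\sin\theta = V^{r}{}_{;r}. \tag{5.69} \]
But
\[ p_{r;r} = g_{r\sigma'}\, V^{\sigma'}{}_{;r} = g_{rr}\, V^{r}{}_{;r} = V^{r}{}_{;r}, \quad \text{used Eq. (5.31).} \tag{5.70} \]
Recall in Exer. 12(b) we found
\[ p_{\theta;r} = \Lambda^{\alpha}{}_{\theta}\, \Lambda^{\beta}{}_{r}\, p_{\alpha,\beta} = r^2 \left\{ 2[-\cos^2\theta\sin\theta + \sin^2\theta\cos\theta] + \frac{3}{r}\cos(2\theta) \right\} = r^2\, V^{\theta}{}_{;r}. \tag{5.71} \]
But
\[ p_{\theta;r} = g_{\theta\sigma'}\, V^{\sigma'}{}_{;r} = g_{\theta\theta}\, V^{\theta}{}_{;r} = r^2\, V^{\theta}{}_{;r}, \quad \text{used Eq. (5.31).} \tag{5.72} \]
And
\begin{align*}
p_{r;\theta} &= g_{r\sigma'}\, V^{\sigma'}{}_{;\theta} = g_{rr}\, V^{r}{}_{;\theta} = V^{r}{}_{;\theta}, \quad \text{used Eq. (5.31)} \\
&= r^2\, V^{\theta}{}_{;r}, \quad \text{used result (5.46).} \tag{5.73}
\end{align*}
Recall in Exer. 12(b) we found
\[ p_{\theta;\theta} = 2r^3[\cos^2\theta\sin\theta + \sin^2\theta\cos\theta] - 3r^2\sin(2\theta) = r^2\, V^{\theta}{}_{;\theta}. \tag{5.74} \]
But
\[ p_{\theta;\theta} = g_{\theta\sigma'}\, V^{\sigma'}{}_{;\theta} = g_{\theta\theta}\, V^{\theta}{}_{;\theta} = r^2\, V^{\theta}{}_{;\theta}, \quad \text{used Eq. (5.31).} \]

14. Given a $\binom{2}{0}$ tensor $\mathbf{A}$ with components in polar coordinates:
\[ (\mathbf{A}) = \begin{pmatrix} r^2 & r\sin\theta \\ r\cos\theta & \tan\theta \end{pmatrix}, \tag{5.75} \]
find the components of $\nabla\mathbf{A}$.

From Eq. (5.65) there are three contributions:
\[ A^{\alpha\beta}{}_{;\mu} = A^{\alpha\beta}{}_{,\mu} + A^{\alpha\sigma}\, \Gamma^{\beta}{}_{\sigma\mu} + A^{\sigma\beta}\, \Gamma^{\alpha}{}_{\sigma\mu}. \]
The three contributions are respectively:
\[
\begin{array}{lllll}
A^{\alpha\beta}{}_{;\mu} & A^{\alpha\beta}{}_{,\mu} & A^{\alpha\sigma}\Gamma^{\beta}{}_{\sigma\mu} & A^{\sigma\beta}\Gamma^{\alpha}{}_{\sigma\mu} & \text{total} \\
A^{rr}{}_{;r} & = 2r & + 0 & + 0 & = 2r \\
A^{rr}{}_{;\theta} & = 0 & - r^2\sin\theta & - r^2\cos\theta & = -r^2(\sin\theta + \cos\theta) \\
A^{r\theta}{}_{;r} & = \sin\theta & + \sin\theta & + 0 & = 2\sin\theta \\
A^{r\theta}{}_{;\theta} & = r\cos\theta & + r & - r\tan\theta & = r(1 + \cos\theta - \tan\theta) \\
A^{\theta r}{}_{;r} & = \cos\theta & + 0 & + \cos\theta & = 2\cos\theta \\
A^{\theta r}{}_{;\theta} & = -r\sin\theta & - r\tan\theta & + r & = r(1 - \sin\theta - \tan\theta) \\
A^{\theta\theta}{}_{;r} & = 0 & + r^{-1}\tan\theta & + r^{-1}\tan\theta & = 2r^{-1}\tan\theta \\
A^{\theta\theta}{}_{;\theta} & = \sec^2\theta & + \cos\theta & + \sin\theta & = \sec^2\theta + \cos\theta + \sin\theta
\end{array}
\]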
The above solution was verified with MapleTM , please see accompanying file schutz2009_ch5.mw.
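For readers without Maple, the table can also be spot-checked with a short script. The sketch below is not the accompanying worksheet: it hard-codes the three non-zero Christoffel symbols of plane polar coordinates, takes the partial derivatives by central differences, and compares with the totals in the last column. All names are illustrative.

```python
import math

def christoffel(r):
    """G[a][b][c] = Gamma^a_{bc} for plane polar coordinates, with index 0 = r, 1 = theta."""
    G = [[[0.0, 0.0], [0.0, 0.0]] for _ in range(2)]
    G[1][0][1] = G[1][1][0] = 1.0 / r
    G[0][1][1] = -r
    return G

def A(r, th):
    """Components A^{ab} from (5.75)."""
    return [[r * r, r * math.sin(th)], [r * math.cos(th), math.tan(th)]]

def grad_A(r, th, h=1e-6):
    """D[a][b][m] = A^{ab}_{;m}: central-difference partials plus the two connection terms."""
    G = christoffel(r)
    A0 = A(r, th)
    plus = [A(r + h, th), A(r, th + h)]
    minus = [A(r - h, th), A(r, th - h)]
    D = [[[0.0, 0.0] for _ in range(2)] for _ in range(2)]
    for a in range(2):
        for b in range(2):
            for m in range(2):
                partial = (plus[m][a][b] - minus[m][a][b]) / (2 * h)
                conn = sum(A0[a][s] * G[b][s][m] + A0[s][b] * G[a][s][m] for s in range(2))
                D[a][b][m] = partial + conn
    return D

def table_totals(r, th):
    """Last column of the table above, indexed the same way as grad_A."""
    c, s, t = math.cos(th), math.sin(th), math.tan(th)
    return [[[2 * r, -r * r * (s + c)], [2 * s, r * (1 + c - t)]],
            [[2 * c, r * (1 - s - t)], [2 * t / r, 1 / (c * c) + c + s]]]

numeric = grad_A(1.5, 0.4)
expected = table_totals(1.5, 0.4)
print(numeric)
print(expected)
```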
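15. Given the vector field with polar components $V^r = 1$ and $V^\theta = 0$, which points radially from the origin, find $V^{\alpha}{}_{;\mu;\nu}$.

In principle this is quite straightforward, but there are several places one might slip up. First I think it's a good idea to write down the general expression, and then substitute the given vector field. Write
\[ T^{\alpha}{}_{\mu} \equiv V^{\alpha}{}_{;\mu} = V^{\alpha}{}_{,\mu} + V^\sigma\, \Gamma^{\alpha}{}_{\sigma\mu}, \quad \text{Eq. (5.64), the 2nd one that is.} \tag{5.76} \]
Don't substitute the given vector at this point because we're still going to take another derivative:
\[ V^{\alpha}{}_{;\mu;\nu} \equiv T^{\alpha}{}_{\mu;\nu} = T^{\alpha}{}_{\mu,\nu} + T^{\sigma}{}_{\mu}\, \Gamma^{\alpha}{}_{\sigma\nu} - T^{\alpha}{}_{\sigma}\, \Gamma^{\sigma}{}_{\mu\nu}, \quad \text{Eq. (5.66).} \tag{5.77} \]
Now it's straightforward substitution. The problem simplifies tremendously because there are only three nonzero components of the Christoffel symbol, c.f. Eq. (5.45):
\[ \Gamma^{\theta}{}_{r\theta} = \Gamma^{\theta}{}_{\theta r} = \frac{1}{r}, \qquad \Gamma^{r}{}_{\theta\theta} = -r. \tag{5.78} \]
To help you debug, I've split (5.77) into its 3 parts, listing the contributions in the same order as the three terms of (5.77). The only non-zero components are:
\begin{align*}
V^{\theta}{}_{;\theta;r} &= -\frac{1}{r^2} + \frac{1}{r^2} - \frac{1}{r^2} = -\frac{1}{r^2}, \\
V^{\theta}{}_{;r;\theta} &= 0 + 0 - \frac{1}{r^2} = -\frac{1}{r^2}, \\
V^{r}{}_{;\theta;\theta} &= 0 - 1 - 0 = -1. \tag{5.79}
\end{align*}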
The above solution was verified with MapleTM , please see accompanying file schutz2009_ch5.mw.
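The second covariant derivative can likewise be checked without Maple. In the sketch below (illustrative names, central differences with an arbitrary step), $T^{\alpha}{}_{\mu}$ is built from (5.76) for the given field, where only the connection term survives, and (5.77) is then evaluated for all eight components.

```python
def christoffel(r):
    """G[a][b][c] = Gamma^a_{bc} for plane polar coordinates, with index 0 = r, 1 = theta."""
    G = [[[0.0, 0.0], [0.0, 0.0]] for _ in range(2)]
    G[1][0][1] = G[1][1][0] = 1.0 / r
    G[0][1][1] = -r
    return G

def T(r, th):
    """T[a][m] = V^a_{;m} for V = (1, 0); the partial-derivative term vanishes."""
    G = christoffel(r)
    V = [1.0, 0.0]
    return [[sum(V[s] * G[a][s][m] for s in range(2)) for m in range(2)] for a in range(2)]

def second_derivative(r, th, h=1e-5):
    """D[a][m][n] = T^a_{m,n} + T^s_m Gamma^a_{sn} - T^a_s Gamma^s_{mn}, as in (5.77)."""
    G = christoffel(r)
    T0 = T(r, th)
    plus = [T(r + h, th), T(r, th + h)]
    minus = [T(r - h, th), T(r, th - h)]
    D = [[[0.0, 0.0] for _ in range(2)] for _ in range(2)]
    for a in range(2):
        for m in range(2):
            for n in range(2):
                partial = (plus[n][a][m] - minus[n][a][m]) / (2 * h)
                up = sum(T0[s][m] * G[a][s][n] for s in range(2))
                down = sum(T0[a][s] * G[s][m][n] for s in range(2))
                D[a][m][n] = partial + up - down
    return D

r0 = 2.0
D = second_derivative(r0, 0.3)
print(D[1][1][0], D[1][0][1], D[0][1][1])   # compare with -1/r^2, -1/r^2, -1
```

All other components of `D` should vanish to within the finite-difference error.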
16. Fill in steps between Eq. (5.74) and Eq. (5.75). There are no steps to fill in! He has explained every step in detail. See Exer. 20 for a problem that forces one to thoroughly understand each of these steps.
17. Discover how $V^{\beta}{}_{,\alpha}$ transforms under a change of coordinates. Do the same for $V^\mu\, \Gamma^{\beta}{}_{\mu\alpha}$. I've created SP 3 and SP 4 as an alternative to this problem. They carry the same message but in a more straightforward way that follows naturally from what we did in Chapter 2 for vectors.
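18. Verify Eq. (5.78).

Recall Eq. (5.78) gave
\begin{align*}
\vec{e}_{\hat{\alpha}} \cdot \vec{e}_{\hat{\beta}} &\equiv g_{\hat{\alpha}\hat{\beta}} = \delta_{\hat{\alpha}\hat{\beta}}, \\
\tilde{\omega}^{\hat{\alpha}} \cdot \tilde{\omega}^{\hat{\beta}} &\equiv g^{\hat{\alpha}\hat{\beta}} = \delta^{\hat{\alpha}\hat{\beta}}. \tag{5.80}
\end{align*}
So in the first line there are two equalities to verify. Recall how we obtain the components of any tensor, Eq. (3.21), and the metric tensor in particular, Eq. (3.5), consistent with:
\[ \vec{e}_{\hat{\alpha}} \cdot \vec{e}_{\hat{\beta}} \equiv g_{\hat{\alpha}\hat{\beta}}. \]
For the second equality of the first line, we have four terms to verify.
\begin{align*}
\vec{e}_{\hat{r}} \cdot \vec{e}_{\hat{r}} &= \vec{e}_r \cdot \vec{e}_r, \quad \text{substituted Eq. (5.76)} \\
&= g_{rr}, \quad \text{Eq. (3.5)} \\
&= 1, \quad \text{Eq. (5.31a)} \tag{5.81} \\
\vec{e}_{\hat{r}} \cdot \vec{e}_{\hat{\theta}} &= \vec{e}_r \cdot \frac{1}{r}\, \vec{e}_\theta, \quad \text{substituted Eq. (5.76)} \\
&= \frac{1}{r}\, g_{r\theta}, \quad \text{Eq. (3.5)} \\
&= 0, \quad \text{Eq. (5.31b)} \\
&= \vec{e}_{\hat{\theta}} \cdot \vec{e}_{\hat{r}}, \quad \text{order of dot product above doesn't matter.} \tag{5.82} \\
\vec{e}_{\hat{\theta}} \cdot \vec{e}_{\hat{\theta}} &= \frac{1}{r}\, \vec{e}_\theta \cdot \frac{1}{r}\, \vec{e}_\theta, \quad \text{substituted Eq. (5.76)} \\
&= \frac{1}{r^2}\, g_{\theta\theta}, \quad \text{Eq. (3.5)} \\
&= 1, \quad \text{Eq. (5.31a)} \tag{5.83}
\end{align*}
In the second line there are again two equalities to verify. Recall how we obtain the components of any tensor, Eq. (3.21), consistent with:
\[ \tilde{\omega}^{\hat{\alpha}} \cdot \tilde{\omega}^{\hat{\beta}} \equiv g^{\hat{\alpha}\hat{\beta}}. \]
For the second equality of the second line, we have four terms to verify.
\begin{align*}
\tilde{\omega}^{\hat{r}} \cdot \tilde{\omega}^{\hat{r}} &= \tilde{d}r \cdot \tilde{d}r, \quad \text{substituted Eq. (5.77)} \\
&= g^{rr}, \quad \text{see p. 124} \\
&= 1, \quad \text{Eq. (5.34)} \tag{5.84} \\
\tilde{\omega}^{\hat{r}} \cdot \tilde{\omega}^{\hat{\theta}} &= \tilde{d}r \cdot r\, \tilde{d}\theta, \quad \text{substituted Eq. (5.77)} \\
&= r\, g^{r\theta}, \quad \text{see p. 124} \\
&= 0, \quad \text{Eq. (5.34)} \\
&= \tilde{\omega}^{\hat{\theta}} \cdot \tilde{\omega}^{\hat{r}}, \quad \text{order of dot product above doesn't matter.} \tag{5.85} \\
\tilde{\omega}^{\hat{\theta}} \cdot \tilde{\omega}^{\hat{\theta}} &= r\, \tilde{d}\theta \cdot r\, \tilde{d}\theta, \quad \text{substituted Eq. (5.77)} \\
&= r^2\, g^{\theta\theta}, \quad \text{see p. 124} \\
&= 1, \quad \text{Eq. (5.34)} \tag{5.86}
\end{align*}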
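19. Repeating the argument from Eq. (5.81) to Eq. (5.84) using $\tilde{d}r$ and $\tilde{d}\theta$ leads to the conclusion that these are a coordinate basis.

We simply repeat the argument but instead of substituting Eq. (5.77) we use $\tilde{d}r$ and $\tilde{d}\theta$. We find the 2nd line of Eq. (5.81) changes:
$$\begin{aligned}
\tilde{d}r &= \cos\theta\, \tilde{d}x + \sin\theta\, \tilde{d}y , && \text{used Eq. (5.27)} \\
\tilde{d}\theta &= -\frac{1}{r}\sin\theta\, \tilde{d}x + \frac{1}{r}\cos\theta\, \tilde{d}y , && \text{used Eq. (5.26)}
\end{aligned} \tag{5.87}$$
So now instead of Eq. (5.82) we get
$$\frac{\partial\eta}{\partial x} = -\frac{1}{r}\sin\theta , \qquad \frac{\partial\eta}{\partial y} = \frac{1}{r}\cos\theta \tag{5.88}$$
So instead of Eq. (5.83) we have factors of $1/r$ on both sides,
$$\begin{aligned}
\frac{\partial^2\eta}{\partial y\,\partial x} &= \frac{\partial}{\partial y}\left(-\frac{1}{r}\sin\theta\right) \\
&= \frac{\partial}{\partial y}\left(\frac{-y}{x^2+y^2}\right) \\
&= -\frac{1}{x^2+y^2} + \frac{2y^2}{(x^2+y^2)^2} , && \text{chain rule} \\
&= -\frac{x^2+y^2}{(x^2+y^2)^2} + \frac{2y^2}{(x^2+y^2)^2} \\
&= \frac{-x^2+y^2}{(x^2+y^2)^2}
\end{aligned} \tag{5.89}$$
On the other hand,
$$\begin{aligned}
\frac{\partial^2\eta}{\partial x\,\partial y} &= \frac{\partial}{\partial x}\left(\frac{1}{r}\cos\theta\right) \\
&= \frac{\partial}{\partial x}\left(\frac{x}{x^2+y^2}\right) \\
&= \frac{1}{x^2+y^2} - \frac{2x^2}{(x^2+y^2)^2} \\
&= \frac{-x^2+y^2}{(x^2+y^2)^2} \\
&= \frac{\partial^2\eta}{\partial y\,\partial x}
\end{aligned} \tag{5.90}$$
Thus the basis is consistent with a coordinate basis. See SP.5 to complete the proof that $\tilde{d}r$ and $\tilde{d}\theta$ are a coordinate basis.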
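The equality of the mixed partial derivatives in Exer. 19 can also be checked numerically. The following is only a sketch: the function names are my own, the first derivatives are written in Cartesian form using $\sin\theta = y/r$ and $\cos\theta = x/r$, and the second derivatives are approximated with central differences rather than computed exactly.

```python
def eta_x(x, y):
    """d(eta)/dx = -(1/r) sin(theta) = -y / (x^2 + y^2)."""
    return -y / (x * x + y * y)


def eta_y(x, y):
    """d(eta)/dy = (1/r) cos(theta) = x / (x^2 + y^2)."""
    return x / (x * x + y * y)


def central_diff(f, x, y, wrt, h=1e-5):
    """Central-difference approximation to the partial derivative of f(x, y)."""
    if wrt == "x":
        return (f(x + h, y) - f(x - h, y)) / (2 * h)
    return (f(x, y + h) - f(x, y - h)) / (2 * h)


def mixed_partials(x, y):
    """Return (d2eta/dydx, d2eta/dxdy, closed form) at the point (x, y)."""
    d_yx = central_diff(eta_x, x, y, "y")  # d/dy of d(eta)/dx
    d_xy = central_diff(eta_y, x, y, "x")  # d/dx of d(eta)/dy
    exact = (-x * x + y * y) / (x * x + y * y) ** 2
    return d_yx, d_xy, exact


if __name__ == "__main__":
    print(mixed_partials(1.3, 0.7))
```

At any point away from the origin the three returned numbers should agree to within the finite-difference error.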
20. For a noncoordinate basis $\{\vec{e}_\mu\}$, define $\nabla_{\vec{e}_\mu}\vec{e}_\nu - \nabla_{\vec{e}_\nu}\vec{e}_\mu \equiv c^\alpha{}_{\mu\nu}\,\vec{e}_\alpha$ and use this in place of Eq. (5.74) to generalize Eq. (5.75).

The Christoffel symbol [of the second kind] arises from the derivative of the basis vectors, as in Eq. (5.44). For coordinate bases it is symmetric on
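the two lower indices, Eq. (5.74), and this was used to derive Eq. (5.75). Now we must derive the generalization of Eq. (5.75) appropriate for noncoordinate bases. It is clear that $c^\alpha{}_{\mu\nu}$ so defined is antisymmetric on the two lower indices. It is convenient to introduce a new symbol for the connection coefficient appropriate for the noncoordinate bases as follows:
$$\nabla_{\vec{e}_\mu}\vec{e}_\nu \equiv \lambda^\alpha{}_{\mu\nu}\,\vec{e}_\alpha = \left(\Gamma^\alpha{}_{\mu\nu} + \frac{1}{2} c^\alpha{}_{\mu\nu}\right)\vec{e}_\alpha \tag{5.91}$$
For each $\alpha$, $\lambda^\alpha{}_{\mu\nu}$ is a $4\times4$ matrix and it can be uniquely decomposed into its symmetric and antisymmetric parts. The $c^\alpha{}_{\mu\nu}$ is then identified with twice the antisymmetric part and the regular Christoffel symbol with the symmetric part. We then replace $\Gamma^\alpha{}_{\mu\nu} \to \lambda^\alpha{}_{\mu\nu}$ in Eq. (5.72) and repeat the steps on p. 134:
$$\begin{aligned}
g_{\alpha\beta,\mu} &= \lambda^\nu{}_{\alpha\mu}\, g_{\nu\beta} + \lambda^\nu{}_{\beta\mu}\, g_{\alpha\nu} \\
g_{\alpha\mu,\beta} &= \lambda^\nu{}_{\alpha\beta}\, g_{\nu\mu} + \lambda^\nu{}_{\mu\beta}\, g_{\alpha\nu} \\
-g_{\beta\mu,\alpha} &= -\lambda^\nu{}_{\beta\alpha}\, g_{\nu\mu} - \lambda^\nu{}_{\mu\alpha}\, g_{\beta\nu}
\end{aligned} \tag{5.92}$$
Simply add both sides, grouping terms with common factors of $g$, remembering that we can still exploit the symmetry of $g$. We get of course the same as Schutz on p. 134 but with $\Gamma^\alpha{}_{\mu\nu} \to \lambda^\alpha{}_{\mu\nu}$:
$$g_{\alpha\beta,\mu} + g_{\alpha\mu,\beta} - g_{\beta\mu,\alpha} = (\lambda^\nu{}_{\alpha\mu} - \lambda^\nu{}_{\mu\alpha})\, g_{\beta\nu} + (\lambda^\nu{}_{\alpha\beta} - \lambda^\nu{}_{\beta\alpha})\, g_{\nu\mu} + (\lambda^\nu{}_{\beta\mu} + \lambda^\nu{}_{\mu\beta})\, g_{\alpha\nu} \tag{5.93}$$
Now we exploit the fact that $\lambda^\alpha{}_{\mu\nu}$ has been represented in terms of its symmetric and antisymmetric parts. That is,
$$\begin{aligned}
(\lambda^\nu{}_{\alpha\mu} - \lambda^\nu{}_{\mu\alpha}) &= c^\nu{}_{\alpha\mu} \\
(\lambda^\nu{}_{\alpha\beta} - \lambda^\nu{}_{\beta\alpha}) &= c^\nu{}_{\alpha\beta} \\
(\lambda^\nu{}_{\beta\mu} + \lambda^\nu{}_{\mu\beta}) &= 2\Gamma^\nu{}_{\beta\mu}
\end{aligned} \tag{5.94}$$
which gives us
$$\begin{aligned}
g_{\alpha\beta,\mu} + g_{\alpha\mu,\beta} - g_{\beta\mu,\alpha} &= c^\nu{}_{\alpha\mu}\, g_{\beta\nu} + c^\nu{}_{\alpha\beta}\, g_{\nu\mu} + 2\Gamma^\nu{}_{\beta\mu}\, g_{\alpha\nu} \\
g_{\alpha\beta,\mu} + g_{\alpha\mu,\beta} - g_{\beta\mu,\alpha} &= c_{\alpha\mu\beta} + c_{\alpha\beta\mu} + 2\Gamma^\nu{}_{\beta\mu}\, g_{\alpha\nu} , && \text{lowered index}
\end{aligned} \tag{5.95}$$
We're almost there! Multiply by $\frac{1}{2} g^{\alpha\gamma}$ and solve for the Christoffel symbol:
$$\begin{aligned}
\Gamma^\nu{}_{\beta\mu}\, \delta^\gamma{}_\nu &= \frac{1}{2} g^{\alpha\gamma}\left(g_{\alpha\beta,\mu} + g_{\alpha\mu,\beta} - g_{\beta\mu,\alpha} - c_{\alpha\mu\beta} - c_{\alpha\beta\mu}\right) \\
\Gamma^\gamma{}_{\beta\mu} &= \frac{1}{2} g^{\alpha\gamma}\left(g_{\alpha\beta,\mu} + g_{\alpha\mu,\beta} - g_{\beta\mu,\alpha} - c_{\alpha\mu\beta} - c_{\alpha\beta\mu}\right)
\end{aligned} \tag{5.96}$$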
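21a. Hold $\lambda$ fixed and let $a$ vary in Eq. (5.96), showing that these coordinate curves are orthogonal to the world lines (coordinate curves obtained with $a$ fixed and $\lambda$ varying).

Let $\vec{A}$ be the tangent to the curve with $\lambda$ fixed and varying $a$:
$$\begin{aligned}
\vec{A} &= \vec{e}_0 \frac{\partial t(a,\lambda)}{\partial a} + \vec{e}_x \frac{\partial x(a,\lambda)}{\partial a} \\
&= \vec{e}_0 \sinh(\lambda) + \vec{e}_x \cosh(\lambda)
\end{aligned} \tag{5.97}$$
Let $\vec{B}$ be the tangent to the curve with $a$ fixed and varying $\lambda$ (i.e. the world lines):
$$\begin{aligned}
\vec{B} &= \vec{e}_0 \frac{\partial t(a,\lambda)}{\partial \lambda} + \vec{e}_x \frac{\partial x(a,\lambda)}{\partial \lambda} \\
&= \vec{e}_0\, a\cosh(\lambda) + \vec{e}_x\, a\sinh(\lambda)
\end{aligned} \tag{5.98}$$
Now it's easy to show that
$$\vec{A} \cdot \vec{B} = -a\sinh(\lambda)\cosh(\lambda) + a\sinh(\lambda)\cosh(\lambda) = 0 \tag{5.99}$$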
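As a quick numerical sanity check of the orthogonality in Exer. 21a, one can evaluate the two tangent vectors at a few sample points and take their dot product with the metric $\mathrm{diag}(-1, +1)$. This is a minimal sketch; the function names and the sample values of $a$ and $\lambda$ are arbitrary choices of mine.

```python
import math


def minkowski_dot(u, v):
    """Dot product of two (t, x) vectors with the metric diag(-1, +1)."""
    return -u[0] * v[0] + u[1] * v[1]


def tangent_fixed_lambda(a, lam):
    """Components (dt/da, dx/da) of the tangent A for t = a sinh(lam), x = a cosh(lam)."""
    return (math.sinh(lam), math.cosh(lam))


def tangent_fixed_a(a, lam):
    """Components (dt/dlam, dx/dlam) of the tangent B to the world line of fixed a."""
    return (a * math.cosh(lam), a * math.sinh(lam))


if __name__ == "__main__":
    for a, lam in [(1.0, 0.0), (2.0, 1.5), (-0.5, -2.0)]:
        A = tangent_fixed_lambda(a, lam)
        B = tangent_fixed_a(a, lam)
        print(a, lam, minkowski_dot(A, B))
```

The printed dot products should vanish up to rounding error for every choice of $a$ and $\lambda$.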
21b. Show that Eq. (5.96) defines a transformation from coordinates (t, x) to coordinates (λ, a) that form an orthogonal coordinate system. Draw
these coordinates and show they only cover half of the original $t$-$x$ plane. Apparently there are two “disjoint quadrants” separated by $|t| = |x|$.

The transformation matrix is
$$\Lambda^\alpha{}_{\bar\alpha} = \begin{pmatrix} \dfrac{\partial t}{\partial\lambda} & \dfrac{\partial t}{\partial a} \\[1ex] \dfrac{\partial x}{\partial\lambda} & \dfrac{\partial x}{\partial a} \end{pmatrix} = \begin{pmatrix} a\cosh(\lambda) & \sinh(\lambda) \\ a\sinh(\lambda) & \cosh(\lambda) \end{pmatrix} \tag{5.100}$$
And the determinant of this transformation matrix is
$$\det(\Lambda^\alpha{}_{\bar\alpha}) = a$$
which for $a \neq 0$ is a legitimate transformation. The basis vectors will be
$$\Lambda^\alpha{}_{\bar\alpha}\, \vec{e}_\alpha = \vec{e}_{\bar\alpha}$$
so that we have already found the new basis vectors in terms of the old:
$$\vec{e}_a = \vec{A} , \qquad \vec{e}_\lambda = \vec{B} \tag{5.101}$$
and shown they were orthogonal in part (a) above.

“Plotting the coordinates” sounds vague, but certainly the easiest thing to do is plot the curves in the $t$-$x$ plane obtained by holding one of the $a$-$\lambda$ pair fixed and varying the other. These are coordinate curves. Before plotting these coordinate curves (see Fig. 5.3) it's useful to reflect on what they will look like. Say $a = 1$, and $\lambda \ll -1$, then
$$\begin{aligned}
t(1,\lambda) &= 1\cdot\sinh(\lambda) = \frac{\exp(\lambda) - \exp(-\lambda)}{2} \approx -\frac{\exp(-\lambda)}{2} \\
x(1,\lambda) &= 1\cdot\cosh(\lambda) = \frac{\exp(\lambda) + \exp(-\lambda)}{2} \approx +\frac{\exp(-\lambda)}{2}
\end{aligned} \tag{5.102}$$
so the curve approaches the straight line $t = -x$ in the limit $\lambda \to -\infty$. And the family of curves approach this same limit regardless of the value of $a$ as long as it's finite. (We'll discuss $a < 0$ in a minute. For now think of $a > 0$.) At $\lambda = 0$ we have $t = 0$ and $x = a$. So for various $a > 0$ we have a family of curves in the 4th quadrant ($t \le 0$ and $x > 0$).

For $\lambda > 0$ the curves in the 1st quadrant are the reflection about the $x$-axis at $t = 0$ of the curves we just described in the 4th quadrant. In particular, all curves with $a > 0$ approach $t = x$ as $\lambda \to +\infty$. For $a < 0$ we have a family of curves in the 2nd and 3rd quadrants that are the reflection about the $t$-axis at $x = 0$ of the curves we just described in the 1st and 4th quadrants respectively. Since $x^2 - t^2 = a^2 > 0$ on every curve, the regions with $|t| > |x|$, i.e. those between the lines $t = x$ and $t = -x$ that contain the $t$-axis, are not parameterized by $(\lambda, a)$.

Figure 5.3: Coordinate curves for Eq. (5.96), plotted in the $t$-$x$ plane. Plot was partly generated with Maple™, see accompanying file schutz2009 ch5.mw.
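21c. Metric tensor and Christoffel symbols.

We know the metric $\eta_{\alpha\beta}$ in $t$-$x$ coordinates. So we transform this metric to $\lambda$-$a$ coordinates as we did in § 5.2:
$$g_{\bar\alpha\bar\beta} = \Lambda^\alpha{}_{\bar\alpha}\, \Lambda^\beta{}_{\bar\beta}\, g_{\alpha\beta} = \frac{\partial x^\alpha}{\partial x^{\bar\alpha}} \frac{\partial x^\beta}{\partial x^{\bar\beta}}\, g_{\alpha\beta} \tag{5.103}$$
Let's do this one component at a time. There are only 3 to check since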
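$g_{\bar\alpha\bar\beta} = g_{\bar\beta\bar\alpha}$:
$$\begin{aligned}
g_{\bar0\bar0} &= \Lambda^\alpha{}_{\bar0}\, \Lambda^\beta{}_{\bar0}\, g_{\alpha\beta} \\
g_{\lambda\lambda} &= \frac{\partial x^\alpha}{\partial\lambda} \frac{\partial x^\beta}{\partial\lambda}\, g_{\alpha\beta} , && \text{just rewriting} \\
&= \frac{\partial t}{\partial\lambda}\frac{\partial t}{\partial\lambda}\, \eta_{00} + \frac{\partial x}{\partial\lambda}\frac{\partial x}{\partial\lambda}\, \eta_{11} \\
&= \left(\frac{\partial\, a\sinh\lambda}{\partial\lambda}\right)^2 (-1) + \left(\frac{\partial\, a\cosh\lambda}{\partial\lambda}\right)^2 (+1) \\
&= (a\cosh\lambda)^2 (-1) + (a\sinh\lambda)^2 (+1) \\
&= -a^2
\end{aligned} \tag{5.104}$$
And the off-diagonal term:
$$\begin{aligned}
g_{\bar0\bar1} &= \Lambda^\alpha{}_{\bar0}\, \Lambda^\beta{}_{\bar1}\, g_{\alpha\beta} \\
g_{\lambda a} &= \frac{\partial x^\alpha}{\partial\lambda} \frac{\partial x^\beta}{\partial a}\, g_{\alpha\beta} , && \text{just rewriting} \\
&= \frac{\partial t}{\partial\lambda}\frac{\partial t}{\partial a}\, \eta_{00} + \frac{\partial x}{\partial\lambda}\frac{\partial x}{\partial a}\, \eta_{11} \\
&= \frac{\partial\, a\sinh\lambda}{\partial\lambda} \frac{\partial\, a\sinh\lambda}{\partial a} (-1) + \frac{\partial\, a\cosh\lambda}{\partial\lambda} \frac{\partial\, a\cosh\lambda}{\partial a} (+1) \\
&= a\cosh\lambda\, \sinh\lambda\, (-1) + a\sinh\lambda\, \cosh\lambda\, (+1) \\
&= 0 \\
&= g_{a\lambda} , && \text{symmetry of the metric tensor.}
\end{aligned} \tag{5.105}$$
The final component:
$$\begin{aligned}
g_{\bar1\bar1} &= \Lambda^\alpha{}_{\bar1}\, \Lambda^\beta{}_{\bar1}\, g_{\alpha\beta} \\
g_{aa} &= \frac{\partial x^\alpha}{\partial a} \frac{\partial x^\beta}{\partial a}\, g_{\alpha\beta} , && \text{just rewriting} \\
&= \frac{\partial t}{\partial a}\frac{\partial t}{\partial a}\, \eta_{00} + \frac{\partial x}{\partial a}\frac{\partial x}{\partial a}\, \eta_{11} \\
&= \left(\frac{\partial\, a\sinh\lambda}{\partial a}\right)^2 (-1) + \left(\frac{\partial\, a\cosh\lambda}{\partial a}\right)^2 (+1) \\
&= (\sinh\lambda)^2 (-1) + (\cosh\lambda)^2 (+1) \\
&= 1
\end{aligned} \tag{5.106}$$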
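The three metric components found in Exer. 21c, together with the determinant of the transformation matrix from Exer. 21b, can be reproduced numerically by contracting the transformation matrix of Eq. (5.100) with the Minkowski metric. The sketch below uses plain nested lists; the function names and the ordering (index 0 for $\lambda$, index 1 for $a$) are my own conventions.

```python
import math

ETA = [[-1.0, 0.0], [0.0, 1.0]]  # Minkowski metric in (t, x) coordinates


def jacobian(a, lam):
    """Rows are (t, x); columns are (lambda, a), following Eq. (5.100)."""
    return [[a * math.cosh(lam), math.sinh(lam)],
            [a * math.sinh(lam), math.cosh(lam)]]


def jacobian_det(a, lam):
    """Determinant of the 2x2 transformation matrix."""
    L = jacobian(a, lam)
    return L[0][0] * L[1][1] - L[0][1] * L[1][0]


def transformed_metric(a, lam):
    """g[m][n] = sum over alpha, beta of L[alpha][m] * L[beta][n] * ETA[alpha][beta]."""
    L = jacobian(a, lam)
    return [[sum(L[al][m] * L[be][n] * ETA[al][be]
                 for al in range(2) for be in range(2))
             for n in range(2)]
            for m in range(2)]


if __name__ == "__main__":
    a, lam = 2.0, 0.8
    print(jacobian_det(a, lam))        # should be close to a
    print(transformed_metric(a, lam))  # should be close to [[-a**2, 0], [0, 1]]
```

Whatever value of $\lambda$ is used, the result depends only on $a$, which is the observation exploited when computing the Christoffel symbols.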
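There are only $2\times3 = 6$ components of the Christoffel symbol to compute. We use Eq. (5.75) since we're in a coordinate basis. We'll need the inverse metric tensor
$$(g^{\bar\alpha\bar\beta}) = \begin{pmatrix} -a^2 & 0 \\ 0 & 1 \end{pmatrix}^{-1} = \begin{pmatrix} -a^{-2} & 0 \\ 0 & 1 \end{pmatrix} \tag{5.107}$$
$$\Gamma^\alpha{}_{\mu\nu} = \frac{1}{2} g^{\alpha\sigma}\left(g_{\sigma\mu,\nu} + g_{\sigma\nu,\mu} - g_{\mu\nu,\sigma}\right) \tag{5.108}$$
The metric depends only upon $a$ since nowhere does $\lambda$ appear in our components of $g$ above. So we immediately conclude
$$\Gamma^\lambda{}_{\lambda\lambda} = 0. \tag{5.109}$$
$$\begin{aligned}
\Gamma^\lambda{}_{\lambda a} &= \frac{1}{2} g^{\lambda\sigma}\left(g_{\sigma\lambda,a} + g_{\sigma a,\lambda} - g_{\lambda a,\sigma}\right) \\
&= \frac{1}{2} g^{\lambda\lambda}\left(g_{\lambda\lambda,a} + g_{\lambda a,\lambda} - g_{\lambda a,\lambda}\right) , && \text{diagonal metric} \\
&= \frac{1}{2} g^{\lambda\lambda}\left(g_{\lambda\lambda,a}\right) , && \text{diagonal metric} \\
&= \frac{1}{2}\, \frac{-1}{a^2}\, \frac{\partial(-a^2)}{\partial a} = \frac{1}{a} \\
&= \Gamma^\lambda{}_{a\lambda} , && \text{Eq. (5.74)}
\end{aligned} \tag{5.110}$$
$$\begin{aligned}
\Gamma^\lambda{}_{aa} &= \frac{1}{2} g^{\lambda\sigma}\left(g_{\sigma a,a} + g_{\sigma a,a} - g_{aa,\sigma}\right) \\
&= \frac{1}{2} g^{\lambda\lambda}\left(g_{\lambda a,a} + g_{\lambda a,a} - g_{aa,\lambda}\right) \\
&= \frac{1}{2} g^{\lambda\lambda}\left(-g_{aa,\lambda}\right) , && \text{diagonal metric} \\
&= 0.
\end{aligned} \tag{5.111}$$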
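$$\begin{aligned}
\Gamma^a{}_{\lambda\lambda} &= \frac{1}{2} g^{a\sigma}\left(g_{\sigma\lambda,\lambda} + g_{\sigma\lambda,\lambda} - g_{\lambda\lambda,\sigma}\right) \\
&= \frac{1}{2} g^{aa}\left(g_{a\lambda,\lambda} + g_{a\lambda,\lambda} - g_{\lambda\lambda,a}\right) \\
&= \frac{1}{2} g^{aa}\left(-g_{\lambda\lambda,a}\right) , && \text{diagonal metric} \\
&= -\frac{1}{2}\,(+1)\, \frac{\partial(-a^2)}{\partial a} = a
\end{aligned} \tag{5.112}$$
$$\begin{aligned}
\Gamma^a{}_{\lambda a} &= \frac{1}{2} g^{a\sigma}\left(g_{\sigma\lambda,a} + g_{\sigma a,\lambda} - g_{\lambda a,\sigma}\right) \\
&= \frac{1}{2} g^{aa}\left(g_{a\lambda,a} + g_{aa,\lambda} - g_{\lambda a,a}\right) \\
&= \frac{1}{2} g^{aa}\left(+g_{aa,\lambda}\right) \\
&= 0.
\end{aligned} \tag{5.113}$$
$$\begin{aligned}
\Gamma^a{}_{aa} &= \frac{1}{2} g^{a\sigma}\left(g_{\sigma a,a} + g_{\sigma a,a} - g_{aa,\sigma}\right) \\
&= \frac{1}{2} g^{aa}\left(g_{aa,a} + g_{aa,a} - g_{aa,a}\right) \\
&= 0.
\end{aligned} \tag{5.114}$$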
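The six Christoffel symbols of Exer. 21c can be cross-checked by evaluating Eq. (5.108) numerically for the metric $\mathrm{diag}(-a^2, 1)$. The sketch below is an illustration under my own conventions (index 0 for $\lambda$, index 1 for $a$) and it approximates the metric derivatives with central differences, so the results are only accurate to rounding and truncation error.

```python
def metric(lam, a):
    """Metric components in (lambda, a) coordinates: diag(-a^2, 1)."""
    return [[-a * a, 0.0], [0.0, 1.0]]


def inverse_metric(lam, a):
    """Inverse of the diagonal metric, Eq. (5.107)."""
    return [[-1.0 / (a * a), 0.0], [0.0, 1.0]]


def metric_derivative(lam, a, h=1e-6):
    """dg[s][m][n] approximates the partial derivative of g_{mn} w.r.t. coordinate s."""
    plus = [metric(lam + h, a), metric(lam, a + h)]
    minus = [metric(lam - h, a), metric(lam, a - h)]
    return [[[(plus[s][m][n] - minus[s][m][n]) / (2 * h)
              for n in range(2)] for m in range(2)] for s in range(2)]


def christoffel(lam, a):
    """gamma[al][m][n] = (1/2) g^{al s} (g_{s m,n} + g_{s n,m} - g_{m n,s})."""
    ginv = inverse_metric(lam, a)
    dg = metric_derivative(lam, a)
    return [[[0.5 * sum(ginv[al][s] * (dg[n][s][m] + dg[m][s][n] - dg[s][m][n])
                        for s in range(2))
              for n in range(2)] for m in range(2)] for al in range(2)]


if __name__ == "__main__":
    gamma = christoffel(0.3, 2.0)
    print(gamma[0][0][1], gamma[0][1][0])  # Gamma^lambda_{lambda a}, expect about 1/a
    print(gamma[1][0][0])                  # Gamma^a_{lambda lambda}, expect about a
```

All remaining components should come out as zero to within the finite-difference error.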
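22. Show that if $U^\alpha \nabla_\alpha V^\beta = W^\beta$ then $U^\alpha \nabla_\alpha V_\beta = W_\beta$.

We simply use the metric tensor to lower the index $\beta$ on both sides of the equals sign in the first expression:
$$\begin{aligned}
g_{\sigma\beta}\, U^\alpha \nabla_\alpha V^\sigma &= g_{\sigma\beta}\, W^\sigma \\
U^\alpha \nabla_\alpha (g_{\sigma\beta}\, V^\sigma) &= g_{\sigma\beta}\, W^\sigma , && \text{metric commutes with covariant derivative, Eq. (5.71)} \\
U^\alpha \nabla_\alpha V_\beta &= W_\beta , && \text{lower the index}
\end{aligned} \tag{5.115}$$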
5.9
Rob’s supplementary problems
SP.1 Eq. (5.28b) states that the magnitude of the one-form basis
$$|\tilde{d}\theta| = \frac{1}{r}$$
while Eq. (5.28a) implies that
$$|\vec{e}_\theta| = r$$
Does this contradict Eq. (3.47) wherein the magnitude of a one-form was stated to be the same as its associated vector? Hint. The answer is of course “no”. Work through problem 34 of § 3.10, which might help.
Solution: The answer is of course “no, there is no contradiction”. The one-form bases are not simply the associated one-forms of the vector bases. This was stated explicitly by Schutz in §3.3, p. 61. See also the next supplementary problem SP.2.
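SP.2 Find the one-form associated with vector $\vec{e}_\alpha$ for a fixed $\alpha$.

Solution: The one-form associated with a vector $\vec{A}$ is
$$\tilde{A} = g(\vec{A},\ )$$
as stated just before Eq. (5.67), and first introduced in § 3.5. We simply substitute $A^\alpha \vec{e}_\alpha$, with $A^\alpha = 1$ for fixed $\alpha$, into the above expression:
$$\begin{aligned}
\tilde{A} &= g(\vec{A},\ ) \\
&= g_{\mu\nu}\, \tilde\omega^\mu \otimes \tilde\omega^\nu (A^\alpha \vec{e}_\alpha,\ ) \\
&= A^\alpha g_{\mu\nu}\, \tilde\omega^\mu(\vec{e}_\alpha)\, \tilde\omega^\nu(\ ) \\
&= A^\alpha g_{\mu\nu}\, \delta^\mu{}_\alpha\, \tilde\omega^\nu(\ ) \\
&= g_{\alpha\nu}\, \tilde\omega^\nu(\ ) , && \text{used } A^\alpha = 1
\end{aligned} \tag{5.116}$$
So we see that only if the metric is diagonal is $g_{\alpha\alpha}\tilde\omega^\alpha$ the one-form associated with $\vec{e}_\alpha$, for a fixed $\alpha$, and furthermore only if $g_{\alpha\alpha} = 1$ is $\tilde\omega^\alpha$ the one-form associated with $\vec{e}_\alpha$.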
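SP.3 Write the identity matrix as the product of two transformations and take the partial derivative $\partial/\partial x^\mu$ to show that
$$\Lambda^\alpha{}_{\alpha',\mu}\, \Lambda^{\alpha'}{}_\beta + \Lambda^\alpha{}_{\alpha'}\, \Lambda^{\alpha'}{}_{\beta,\mu} = 0$$
Solution:
$$\begin{aligned}
\Lambda^\alpha{}_{\alpha'}\, \Lambda^{\alpha'}{}_\beta &= \delta^\alpha{}_\beta , && \text{a transform and its inverse transform} \\
\Lambda^\alpha{}_{\alpha',\mu}\, \Lambda^{\alpha'}{}_\beta + \Lambda^\alpha{}_{\alpha'}\, \Lambda^{\alpha'}{}_{\beta,\mu} &= 0 , && \text{LHS: product rule, RHS: identity matrix constant}
\end{aligned} \tag{5.117}$$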
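SP.4 Multiply Eq. (5.43) by the one-form basis $\tilde\omega^\beta = \tilde{d}x^\beta$ so that it's a proper tensor equation. Then show that it transforms as we would hope, that is like a $\binom{1}{1}$ tensor. This supplementary problem is meant to be easier than Schutz's problem 17 but to carry the same message. Hint: Go back to Eq. (3.10) and remind yourself how we showed that $\tilde{p}$ was invariant under a change of coordinates. Furthermore, you'll need the result from SP.3.

Solution: Multiplying Eq. (5.43) by the one-form basis $\tilde\omega^\beta = \tilde{d}x^\beta$ gives,
$$\tilde{d}x^\beta \frac{\partial}{\partial x^\beta}(V^\alpha \vec{e}_\alpha) = \tilde{d}x^\beta \left\{ \vec{e}_\alpha \frac{\partial V^\alpha}{\partial x^\beta} + V^\alpha \frac{\partial \vec{e}_\alpha}{\partial x^\beta} \right\}$$
In Eq. (3.10) we showed that one-forms are invariant under a change of coordinates (or rather we assumed they were invariant and found the appropriate transformation). However you interpret it, we do the same manipulations here but for a mixed rank 2 tensor. To be frame invariant, we require
$$\tilde{d}x^\beta \frac{\partial}{\partial x^\beta}(V^\alpha \vec{e}_\alpha) = \tilde{d}x^{\beta'} \frac{\partial}{\partial x^{\beta'}}(V^{\alpha'} \vec{e}_{\alpha'})$$
where, in keeping with Chapter 5 notation, the primes indicate a different coordinate system. Now let's expand the RHS, writing it in terms of known transformations from things in the unprimed frame. Our strategy will be to obtain terms on the RHS like $V^\alpha$ and $\tilde{d}x^\beta$ that we can cancel with terms on the LHS.
$$\begin{aligned}
&\tilde{d}x^\beta \frac{\partial}{\partial x^\beta}(V^\alpha \vec{e}_\alpha) \\
&= \tilde{d}x^{\beta'} \frac{\partial}{\partial x^{\beta'}}(V^{\alpha'} \vec{e}_{\alpha'}) && \text{assumed it's frame invariant} \\
&= \tilde{d}x^{\beta'} \left\{ \vec{e}_{\alpha'} \frac{\partial V^{\alpha'}}{\partial x^{\beta'}} + V^{\alpha'} \frac{\partial \vec{e}_{\alpha'}}{\partial x^{\beta'}} \right\} && \text{product rule (as in Eq. (5.43))} \\
&= (\Lambda^{\beta'}{}_\beta\, \tilde{d}x^\beta) \left\{ (\Lambda^\alpha{}_{\alpha'} \vec{e}_\alpha) \frac{\partial V^{\alpha'}}{\partial x^{\beta'}} + V^{\alpha'} \frac{\partial}{\partial x^{\beta'}} (\Lambda^\alpha{}_{\alpha'} \vec{e}_\alpha) \right\} && \text{unprimed bases} \\
&= (\Lambda^{\beta'}{}_\beta\, \tilde{d}x^\beta) \left\{ (\Lambda^\alpha{}_{\alpha'} \vec{e}_\alpha) \frac{\partial}{\partial x^{\beta'}} (V^\nu \Lambda^{\alpha'}{}_\nu) + (V^\nu \Lambda^{\alpha'}{}_\nu) \frac{\partial}{\partial x^{\beta'}} (\Lambda^\alpha{}_{\alpha'} \vec{e}_\alpha) \right\} && \text{unprimed components} \\
&= (\Lambda^{\beta'}{}_\beta\, \tilde{d}x^\beta) \left\{ (\Lambda^\alpha{}_{\alpha'} \vec{e}_\alpha) \frac{\partial x^\mu}{\partial x^{\beta'}} \frac{\partial}{\partial x^\mu} (V^\nu \Lambda^{\alpha'}{}_\nu) + (V^\nu \Lambda^{\alpha'}{}_\nu) \frac{\partial x^\mu}{\partial x^{\beta'}} \frac{\partial}{\partial x^\mu} (\Lambda^\alpha{}_{\alpha'} \vec{e}_\alpha) \right\} && \text{chain rule} \\
&= (\tilde{d}x^\beta) \left\{ (\Lambda^\alpha{}_{\alpha'} \vec{e}_\alpha)\, \delta^\mu{}_\beta \frac{\partial}{\partial x^\mu} (V^\nu \Lambda^{\alpha'}{}_\nu) + (V^\nu \Lambda^{\alpha'}{}_\nu)\, \delta^\mu{}_\beta \frac{\partial}{\partial x^\mu} (\Lambda^\alpha{}_{\alpha'} \vec{e}_\alpha) \right\} && \text{inverses} \\
&= (\tilde{d}x^\mu) \left\{ (\Lambda^\alpha{}_{\alpha'} \vec{e}_\alpha) \frac{\partial}{\partial x^\mu} (V^\nu \Lambda^{\alpha'}{}_\nu) + (V^\nu \Lambda^{\alpha'}{}_\nu) \frac{\partial}{\partial x^\mu} (\Lambda^\alpha{}_{\alpha'} \vec{e}_\alpha) \right\} && \text{simplified} \\
&= (\tilde{d}x^\mu) \left\{ (\Lambda^\alpha{}_{\alpha'} \vec{e}_\alpha) \left[ V^\nu \Lambda^{\alpha'}{}_{\nu,\mu} + V^\nu{}_{,\mu} \Lambda^{\alpha'}{}_\nu \right] + (V^\nu \Lambda^{\alpha'}{}_\nu) \left[ \Lambda^\alpha{}_{\alpha',\mu}\, \vec{e}_\alpha + \Lambda^\alpha{}_{\alpha'} \frac{\partial \vec{e}_\alpha}{\partial x^\mu} \right] \right\} && \text{product rule} \\
&= (\tilde{d}x^\mu) \left\{ (\Lambda^\alpha{}_{\alpha'} \vec{e}_\alpha)\, V^\nu{}_{,\mu} \Lambda^{\alpha'}{}_\nu + (V^\nu \Lambda^{\alpha'}{}_\nu)\, \Lambda^\alpha{}_{\alpha'} \frac{\partial \vec{e}_\alpha}{\partial x^\mu} \right\} && \text{used SP.3} \\
&= (\tilde{d}x^\mu) \left\{ \delta^\alpha{}_\nu\, \vec{e}_\alpha V^\nu{}_{,\mu} + V^\nu \delta^\alpha{}_\nu \frac{\partial \vec{e}_\alpha}{\partial x^\mu} \right\} && \text{inverse transforms} \\
&= (\tilde{d}x^\mu) \left\{ \vec{e}_\nu V^\nu{}_{,\mu} + V^\nu \frac{\partial \vec{e}_\nu}{\partial x^\mu} \right\} && \text{simplified}
\end{aligned} \tag{5.118}$$
And if we changed dummy indices, we'd be back where we started. So applying the expected transformation rules we found the covariant derivative of a vector does transform as we expected, like a $\binom{1}{1}$ tensor. And notice that, because of the product rule of elementary differential calculus, the covariant derivative involved two terms. Neither of these terms transformed like a tensor on its own, because of the troublesome derivatives of the transformations, terms like $\Lambda^\alpha{}_{\alpha',\mu}$ above. But these two terms cancelled when we used the result of SP.3. This was the point of Schutz's problem 17.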
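SP.5 In Exer. 19 we showed that $\tilde{d}r$ and $\tilde{d}\theta$ are consistent with a coordinate basis. Complete the proof that $\tilde{d}r$ and $\tilde{d}\theta$ are a coordinate basis.

Solution: One must repeat the argument for $\xi$ also. Analogous to Eq. (5.82) we obtain:
$$\frac{\partial\xi}{\partial x} = \cos\theta , \qquad \frac{\partial\xi}{\partial y} = \sin\theta \tag{5.119}$$
Find the common mixed partial derivative,
$$\begin{aligned}
\frac{\partial^2\xi}{\partial y\,\partial x} &= \frac{\partial}{\partial y}(\cos\theta) \\
&= \frac{\partial}{\partial y}\left(\frac{x}{\sqrt{x^2+y^2}}\right) \\
&= -\frac{xy}{(x^2+y^2)^{3/2}} , && \text{chain rule}
\end{aligned} \tag{5.120}$$
And
$$\begin{aligned}
\frac{\partial^2\xi}{\partial x\,\partial y} &= \frac{\partial}{\partial x}(\sin\theta) \\
&= \frac{\partial}{\partial x}\left(\frac{y}{\sqrt{x^2+y^2}}\right) \\
&= -\frac{xy}{(x^2+y^2)^{3/2}} , && \text{chain rule} \\
&= \frac{\partial^2\xi}{\partial y\,\partial x}
\end{aligned} \tag{5.121}$$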
Chapter 6 Curved Manifolds
6.3
Covariant differentiation
Typo in Eqs (6.30) and (6.31). Colon and period should be semicolon and comma. Typo just before Eq. (6.36), reference should be to Eq. (5.52), not (5.53).
6.5
The Curvature Tensor
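Typo in Eq. (6.67), the first upper index should be $\alpha$ not $\sigma$:
$$R^\alpha{}_{\beta\mu\nu} = \frac{1}{2} g^{\alpha\sigma}\left(g_{\sigma\nu,\beta\mu} - g_{\beta\nu,\sigma\mu} - g_{\sigma\mu,\beta\nu} + g_{\beta\mu,\sigma\nu}\right). \tag{6.1}$$
This will be derived in Exerc. 17.

6.6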
Bianchi identities: Ricci and Einstein tensors
Typo in Eq. (6.91), the $\sigma$ should be an $\alpha$, i.e.
$$R_{\alpha\beta} \equiv R^\mu{}_{\alpha\mu\beta} = R_{\beta\alpha}$$
6.9
Exercises
It might help to tackle my supplementary problems first, see § 6.10 below.
1. Decide if the following sets are manifolds and say why. If there are exceptional points at which the sets are not manifolds, give them:

We are given only an intuitive explanation of manifolds in § 6.1, where we're told that a manifold is a space that can be continuously parameterized; that there is a smooth mapping from points of the manifold to a Euclidean space of the same dimension. So I believe such an intuitive explanation is all that is required here.

(a) Phase space of Hamiltonian mechanics, the space of the canonical coordinates and momenta $p_i$ and $q^i$;
I believe this is a typo and should read “. . . the space of the canonical coordinates, i.e. momenta $p_i$ and $q^i$”, since in Hamiltonian mechanics the canonical coordinates are the momenta $p_i$ and the $q^i$, see http://en.wikipedia.org/wiki/Canonical_coordinates. The answer is “yes” of course, since the phase space is a manifold. The intuitive reason is that the generalized momenta $p_i$ and generalized coordinates $q^i$ are parameterized by time, and on physical grounds this parameterization must be continuous; otherwise the particles would jump instantly in position or accelerate infinitely.

(b) The interior of a circle of unit radius in two-dimensional Euclidean space.

“Yes”, because again this can be parameterized by putting the centre of the circle at the origin, say, and parameterizing each interior point by its distance from the centre and the angle of its radius vector to the $x$-axis.

(c) The set of permutations of $n$ objects.

I'm not completely clear on what the objects are nor what permutations of them are, but I believe the answer is “No”, because I don't see how separate objects, potentially with a space between them, can be continuously parameterized.

(d) The subset of Euclidean space of two dimensions (coordinates $x$ and $y$) which is a solution of $x\,y\,(x^2 + y^2 - 1) = 0$.

The solution is the union of the $x$-axis (i.e. all $x$, with $y = 0$), the $y$-axis, and the unit circle centred at the origin. Because these intersect, one could parameterize the set continuously. But it's not differentiable everywhere, so it's not a differential manifold. The exceptional points are the intersections: the origin, and the four points $(\pm1, 0)$ and $(0, \pm1)$ where the circle crosses the axes.
2. Of the manifolds in Exer. 1, on which is it customary to use a metric, and what is that metric? On which would a metric not normally be defined, and why?
(b) The natural coordinates would be polar coordinates, since the domain is then easily defined: $r < 1$ and $0 \le \theta < 2\pi$. The metric tensor was given in § 5.2, c.f. Eq. (5.32).
3. (a) Show that given a diagonal matrix $D$ one can always find a matrix $R$ such that $R^T D R$ is also diagonal, with the same elements as $D$ but in ascending order.

Let's call the reordered matrix $\tilde{D}$,
$$\tilde{D} = R^T D R,$$
and the elements of the matrices $d_{ij}$ for elements of $D$ etc. Then
$$\begin{aligned}
\tilde{d}_{kl} &= (r^T)_{ki}\, d_{ij}\, r_{jl} \\
&= r_{ik}\, d_{ij}\, r_{jl} \\
&= r_{ik}\, d_{ii}\, r_{il}
\end{aligned} \tag{6.2}$$
Suppose we want to move the diagonal element $d_{II}$ into slot $K$ for given (fixed) $I$ and $K$, i.e. we want
$$\tilde{d}_{KK} = d_{II}$$
We simply choose $r_{IK} = 1$ and $r_{iK} = 0\ \forall i \neq I$. But are the off-diagonal terms still zero? Yes, this is guaranteed because when $k \neq l$, we cannot have both $r_{ik} \neq 0$ and $r_{il} \neq 0$ since that would correspond to moving the diagonal element $d_{ii}$ into two different slots $\tilde{d}_{kk}$ and $\tilde{d}_{ll}$. That's not allowed because each diagonal element can only be moved into one slot.

3. (b) Show that the diagonal elements can be scaled such that they are either $-1$, $0$, or $+1$ using another matrix $N$ as follows: $N^T \tilde{D} N$.

As we found above the new elements will be
$$\begin{aligned}
\tilde{\tilde{d}}_{kl} &= (n^T)_{ki}\, \tilde{d}_{ij}\, n_{jl} \\
&= n_{ik}\, \tilde{d}_{ij}\, n_{jl} \\
&= n_{ik}\, \tilde{d}_{ii}\, n_{il}
\end{aligned} \tag{6.3}$$
Suppose $\tilde{d}_{KK} \neq 0$ and we want the diagonal element $\tilde{\tilde{d}}_{KK}$ for given (fixed) $K$ to be $-1$ or $+1$, i.e. we want
$$\tilde{\tilde{d}}_{KK} = 1 \cdot \mathrm{sign}(\tilde{d}_{KK}).$$
We simply choose $n_{KK} = 1/\sqrt{|\tilde{d}_{KK}|}$ and $n_{iK} = 0\ \forall i \neq K$. But if $\tilde{d}_{KK} = 0$ we cannot do this, and must choose $n_{KK} = a$, where $a$ is any nonzero number, and $n_{iK} = 0\ \forall i \neq K$. We then end up with $\tilde{\tilde{d}}_{KK} = 0$.

3. (c) Show that none of the diagonal elements can be zero for the inverse of $A$ to exist.

The equation for the inverse is trivial for a diagonal matrix because one can find one equation with one unknown for each element of the inverse.
$$D D^{-1} = I. \tag{6.4}$$
So for element $(D^{-1})_{ii}$ we have
$$(D^{-1})_{ii} = 1/(D)_{ii}$$
and for off-diagonal elements
$$(D^{-1})_{ij} = 0/(D)_{ii} = 0.$$
So the inverse of a diagonal matrix is also diagonal but with the elements equal to the inverse of the original elements. When the original matrix had zero for one or more diagonal elements, then the inverse doesn't exist because finding it would involve dividing by zero.

3. (d) Show that the metric of Eq. (6.2) can always be found.

As we are reminded in the text (p. 145), the metric tensor $g$ is symmetric by definition (for example if it is defined from the dot product of two vectors, the order of the vectors does not matter). So as stated in Exerc. 3, this implies that $(g)$ can be diagonalized. Furthermore $(g)$ must have an inverse (for the mapping from vectors to one-forms to be invertible). Then the results (a) through (c) show that we can reduce $(g)$ to a matrix with either $-1$ or $+1$ on the diagonal. Since we choose $(g)$ to have one negative eigenvalue we end up with one $-1$ on the diagonal, and the remaining entries $+1$.
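The constructions of $R$ and $N$ in Exer. 3(a) and 3(b) can be made concrete with a few lines of code. The following is a sketch only: the helper names are my own, matrices are plain nested lists, and for a zero diagonal element the arbitrary nonzero number $a$ is simply taken to be 1.

```python
import math


def transpose(m):
    """Transpose of a matrix stored as a list of rows."""
    return [list(row) for row in zip(*m)]


def matmul(p, q):
    """Matrix product of two matrices stored as lists of rows."""
    return [[sum(p[i][k] * q[k][j] for k in range(len(q)))
             for j in range(len(q[0]))] for i in range(len(p))]


def reorder_matrix(diag):
    """R with r[I][K] = 1 when element I of diag moves into slot K (ascending order)."""
    n = len(diag)
    order = sorted(range(n), key=lambda i: diag[i])  # order[K] = I
    r = [[0.0] * n for _ in range(n)]
    for slot, source in enumerate(order):
        r[source][slot] = 1.0
    return r


def scaling_matrix(diag):
    """N with n[K][K] = 1/sqrt(|d_KK|), or 1 when d_KK = 0."""
    n = len(diag)
    s = [[0.0] * n for _ in range(n)]
    for k, d in enumerate(diag):
        s[k][k] = 1.0 / math.sqrt(abs(d)) if d != 0 else 1.0
    return s


def canonical_form(diag):
    """Return (R^T D R, N^T (R^T D R) N) for the diagonal matrix with entries diag."""
    n = len(diag)
    d = [[diag[i] if i == j else 0.0 for j in range(n)] for i in range(n)]
    r = reorder_matrix(diag)
    d_tilde = matmul(matmul(transpose(r), d), r)
    nn = scaling_matrix([d_tilde[k][k] for k in range(n)])
    return d_tilde, matmul(matmul(transpose(nn), d_tilde), nn)


if __name__ == "__main__":
    reordered, scaled = canonical_form([4.0, -9.0, 0.0, 0.25])
    print([reordered[k][k] for k in range(4)])
    print([scaled[k][k] for k in range(4)])
```

For the sample diagonal above the reordered entries should come out in ascending order and the scaled entries should each be $-1$, $0$ or $+1$, up to rounding.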
4. Prove the following results used in the proof of the local flatness theorem in § 6.2:

(a) $\partial^2 x^\alpha / \partial x^{\gamma'} \partial x^{\mu'} |_0$ has 40 independent values.

Consider first the operator $\partial^2 / \partial x^{\gamma'} \partial x^{\mu'} |_0$. It is represented by a $4 \times 4$ symmetric (because partial differentiation commutes) matrix, and has 10 independent elements (4 on the diagonal and 6 in the upper triangle). For each element of this differential operator, there are 4 independent coordinates it can act on, $x^\alpha$. Hence there are $4 \times 10 = 40$ degrees of freedom in $\partial^2 x^\alpha / \partial x^{\gamma'} \partial x^{\mu'} |_0$.

(b) $\partial^3 x^\alpha / \partial x^{\lambda'} \partial x^{\mu'} \partial x^{\nu'} |_0$ has 80 independent values.

Again the problem is to omit the elements made redundant by the fact that the partial derivatives commute. Again, start with just the differential operator which here is $\partial^3 / \partial x^{\lambda'} \partial x^{\mu'} \partial x^{\nu'} |_0$. There are 4 diagonal elements. There are $4 \times 3 = 12$ ways to choose two of the derivatives the same and one different (like $\lambda' = \mu' \neq \nu'$; it's not necessary to also consider $\lambda' \neq \mu' = \nu'$ because we've accounted for these elements already, since the order of the derivatives does not matter). And finally there are the elements where all three indices $\lambda'$, $\mu'$, $\nu'$ differ. Here it is easiest to count them by noting that for any given choice there is only one unused index value, so there are 4 such elements. Adding these three types of terms we get $4 + 12 + 4 = 20$. Again, for each element of this differential operator, there are 4 independent coordinates it can act on, $x^\alpha$. Hence there are $4 \times 20 = 80$ degrees of freedom.

(c) $g_{\alpha\beta,\gamma'\mu'} |_0$ has 100 independent values.

This one is easy. The differential operator has 10 degrees of freedom, see Exerc. 4(a) above. And this is applied to the metric which is a symmetric $4 \times 4$ tensor, and hence has 10 independent values. Thus there are $10 \times 10 = 100$ independent values in total.

5. (a) Prove that $\Gamma^\mu{}_{\alpha\beta} = \Gamma^\mu{}_{\beta\alpha}$ in any coordinate system in a curved Riemannian space.

This Exercise is so important one really must do it. By the local flatness theorem, c.f. § 6.2, on a general Riemann manifold, there is at each point a local inertial (Lorentz) reference frame. In a Lorentz frame spacetime is flat and one can construct a reference frame with basis vectors that do not change with position, so the Christoffel symbols are zero. This is all one needs to reproduce the argument of § 5.4 leading to Eq. (5.74).

5. (b) Use this to prove that Eq. (6.32) can be derived in the same manner as in flat space.

The same principle as in (a), i.e. the local flatness theorem, is involved in deriving Eq. (6.31) (which is identical to Eq. (5.71)). And Eq. (6.32) is identical to Eq. (5.75). The argument leading to Eq. (5.75) can be repeated in curved Riemann space because it used Eqs. (5.71, 5.72, 5.74). Eq. (5.72) is valid in curved space. Eq. (5.74) was proved above in (a). Eq. (5.71) was given.
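The counting arguments of Exer. 4 can be checked by brute-force enumeration: the independent elements of a set of commuting derivative indices are the unordered index choices, which the standard library can list directly. This is a sketch with names of my own choosing.

```python
from itertools import combinations_with_replacement


def symmetric_index_sets(rank, dim=4):
    """Number of distinct unordered choices of `rank` indices, each in range(dim)."""
    return sum(1 for _ in combinations_with_replacement(range(dim), rank))


second_derivs = symmetric_index_sets(2)  # symmetric pairs of indices: 10
third_derivs = symmetric_index_sets(3)   # symmetric triples of indices: 20

counts = {
    "4(a)": 4 * second_derivs,              # 4 coordinates x^alpha acted on
    "4(b)": 4 * third_derivs,               # 4 coordinates x^alpha acted on
    "4(c)": second_derivs * second_derivs,  # symmetric metric, symmetric derivative pair
}

if __name__ == "__main__":
    print(counts)
```

The same helper also counts the 10 independent components of the symmetric metric used in part (c).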
6. Prove the first term in Eq. (6.37) vanishes.

Recall Eq. (6.37),
$$\Gamma^\alpha{}_{\mu\alpha} = \frac{1}{2} g^{\alpha\beta}\left(g_{\beta\mu,\alpha} - g_{\mu\alpha,\beta}\right) + \frac{1}{2} g^{\alpha\beta} g_{\alpha\beta,\mu}$$
As pointed out in the text, we only need to prove that $(g_{\beta\mu,\alpha} - g_{\mu\alpha,\beta})$ is antisymmetric in $\alpha$ and $\beta$. Then we can use the result of Exerc. 26(a) of § 3.10 to show that the term vanishes. First we note that $g_{\beta\mu} = g_{\mu\beta}$ because of the symmetry of the metric tensor.
$$(g_{\beta\mu,\alpha} - g_{\mu\alpha,\beta}) = (g_{\mu\beta,\alpha} - g_{\mu\alpha,\beta}).$$
Then we note that for each $\mu$, the RHS is antisymmetric in $\alpha$ and $\beta$ since obviously
$$(g_{\mu\beta,\alpha} - g_{\mu\alpha,\beta}) = -(g_{\mu\alpha,\beta} - g_{\mu\beta,\alpha}).$$
And for each $\mu$, this term antisymmetric in $\alpha$ and $\beta$ will be multiplied by the inverse metric tensor $g^{\alpha\beta}$ that is obviously symmetric in $\alpha$ and $\beta$. Thus, by the result of Exerc. 26(a) of § 3.10, the first term vanishes.
7. This problem is similar to 8.16 on page 222 in Misner et al. (1973) book.
7. (a) Give the definition of the determinant of a matrix A in terms of cofactors of elements.
7. (b) Differentiate the determinant of an arbitrary 2 × 2 matrix and show that it satisfies Eq. (6.39).
7. (c) Generalize Eq. (6.39) (by induction or otherwise) to arbitrary n×n matrices.
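8. Fill in the missing algebra leading to Eqs. (6.40) and (6.42).

I find it easiest to work backwards. That is, start with Eq. (6.40) and differentiate using the chain rule:
$$\begin{aligned}
\Gamma^\alpha{}_{\mu\alpha} &= \frac{(\sqrt{-g})_{,\mu}}{\sqrt{-g}} \\
&= \frac{1}{2} \frac{(-g)_{,\mu}}{(\sqrt{-g})^2} \\
&= \frac{1}{2} \frac{g_{,\mu}}{g}.
\end{aligned} \tag{6.5}$$
Substitution of Eq. (6.39) leads directly to Eq. (6.38). And for the second part, again I find it easiest to work backwards. That is, start with Eq. (6.42) and differentiate using the product rule and chain rule:
$$\begin{aligned}
V^\alpha{}_{;\alpha} &= \frac{1}{\sqrt{-g}} \left(\sqrt{-g}\, V^\alpha\right)_{,\alpha} \\
&= \frac{\sqrt{-g}}{\sqrt{-g}}\, V^\alpha{}_{,\alpha} + \frac{V^\alpha}{\sqrt{-g}} \left(\sqrt{-g}\right)_{,\alpha} \\
&= V^\alpha{}_{,\alpha} + \frac{V^\alpha}{\sqrt{-g}} \left(\sqrt{-g}\right)_{,\alpha},
\end{aligned} \tag{6.6}$$
which is Eq. (6.41). And Eq. (6.41) was obtained directly by substitution of Eq. (6.40) into Eq. (6.36).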
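9. Show that Eq. (6.42) leads to Eq. (5.56).

This amounts to showing that the general formula for the divergence of a velocity field is consistent with the special case derived in § 5.3 for polar coordinates. We start with Eq. (5.32) for the metric in polar coordinates. The determinant is simply
$$g = \det(g_{\alpha\beta}) = r^2.$$
(For this positive-definite metric $\sqrt{-g}$ is imaginary, but only the ratio $(\sqrt{-g})_{,\alpha}/\sqrt{-g}$ appears, so the result is unaffected.) Substituting into Eq. (6.42) we find
$$\begin{aligned}
V^\alpha{}_{;\alpha} &= \frac{1}{\sqrt{-g}} \left(\sqrt{-g}\, V^\alpha\right)_{,\alpha} \\
&= V^\alpha{}_{,\alpha} + \frac{V^\alpha}{\sqrt{-g}} \left(\sqrt{-g}\right)_{,\alpha}, \\
&= V^r{}_{,r} + V^\theta{}_{,\theta} + \frac{V^r}{\sqrt{-r^2}} \left(\sqrt{-r^2}\right)_{,r}, \\
&= V^r{}_{,r} + V^\theta{}_{,\theta} + \frac{V^r}{r},
\end{aligned} \tag{6.7}$$
consistent with the first line of Eq. (5.56).

Find the divergence formula for the metric given in Eq. (6.19), i.e. that for spherical polar coordinates.

From Eq. (6.19), the determinant is
$$g = r^4 \sin^2\theta,$$
which has two nonzero gradient components,
$$\begin{aligned}
\frac{\partial \sqrt{-r^4\sin^2\theta}}{\partial r} &= \frac{-2r^3\sin^2\theta}{\sqrt{-r^4\sin^2\theta}}, \\
\frac{\partial \sqrt{-r^4\sin^2\theta}}{\partial\theta} &= \frac{-r^4\sin\theta\cos\theta}{\sqrt{-r^4\sin^2\theta}}.
\end{aligned} \tag{6.8}$$
Substituting into Eq. (6.42) we find,
$$\begin{aligned}
V^\alpha{}_{;\alpha} &= V^r{}_{,r} + V^\theta{}_{,\theta} + V^\phi{}_{,\phi} + \frac{V^r}{\sqrt{-r^4\sin^2\theta}}\, \frac{-2r^3\sin^2\theta}{\sqrt{-r^4\sin^2\theta}} + \frac{V^\theta}{\sqrt{-r^4\sin^2\theta}}\, \frac{-r^4\sin\theta\cos\theta}{\sqrt{-r^4\sin^2\theta}}, \\
&= V^r{}_{,r} + V^\theta{}_{,\theta} + V^\phi{}_{,\phi} + \frac{2V^r}{r} + \frac{V^\theta}{\tan\theta}
\end{aligned} \tag{6.9}$$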
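The polar-coordinate divergence formula of Eq. (6.7) can be tested against the familiar Cartesian divergence. The sketch below assumes an illustrative field of my own choosing, $V^x = x^2$, $V^y = y$, whose Cartesian divergence is $2x + 1$; it converts the components to the polar coordinate basis using $\partial r/\partial x = \cos\theta$, $\partial r/\partial y = \sin\theta$, $\partial\theta/\partial x = -\sin\theta/r$, $\partial\theta/\partial y = \cos\theta/r$, and approximates the derivatives with central differences.

```python
import math


def cartesian_field(x, y):
    """Illustrative field (an assumption, not from the text): V^x = x^2, V^y = y."""
    return x * x, y


def polar_components(r, theta):
    """Components (V^r, V^theta) of the field in the polar coordinate basis."""
    x, y = r * math.cos(theta), r * math.sin(theta)
    vx, vy = cartesian_field(x, y)
    v_r = math.cos(theta) * vx + math.sin(theta) * vy
    v_theta = (-math.sin(theta) * vx + math.cos(theta) * vy) / r
    return v_r, v_theta


def polar_divergence(r, theta, h=1e-5):
    """V^r_{,r} + V^theta_{,theta} + V^r / r, with central-difference derivatives."""
    dvr_dr = (polar_components(r + h, theta)[0]
              - polar_components(r - h, theta)[0]) / (2 * h)
    dvth_dth = (polar_components(r, theta + h)[1]
                - polar_components(r, theta - h)[1]) / (2 * h)
    return dvr_dr + dvth_dth + polar_components(r, theta)[0] / r


def cartesian_divergence(x, y):
    """Exact divergence of the illustrative field: d(x^2)/dx + d(y)/dy."""
    return 2 * x + 1


if __name__ == "__main__":
    r, theta = 1.7, 0.6
    print(polar_divergence(r, theta))
    print(cartesian_divergence(r * math.cos(theta), r * math.sin(theta)))
```

The two printed values should agree to within the finite-difference error; dropping the $V^r/r$ term makes them disagree, which shows that the term is needed.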
10. Consider a triangle made up of great circles on a sphere intersecting at points A, B, and C, as in Fig. 6.3 but with B not necessarily on the pole. Show that the amount by which a vector is rotated by parallel transport around such a triangle equals the excess of the sum of the angles over $180^\circ = \pi$ rad.

This is a good problem because it forces us to think carefully through all the steps of the first subsection of § 6.4. We strongly need a diagram to do this one. Let's use that of Fig. 6.3. The only constraint given was that the sides of the triangle form great circles on the sphere. Without loss of generality we can take A and C to be on the equator. We can label the interior angles of the triangle as follows: CAB is the angle at A between the great circle through CA and that through AB, and similarly for the other two.

As in § 6.4, let's start with a vector at A that is parallel to the equator. The angle from the vector to the geodesic through AB is CAB. This angle doesn't change as we move to B because the vector is parallel-transported. Now imagine looking down on the point B. The angle between the extension of geodesic AB and the vector is still CAB. Let $\phi$ be the angle between the vector at B and geodesic BC. Then
$$ABC + \phi + CAB = \pi \text{ rad}.$$
I believe this is the only tricky part of this problem. It stems from the fact that the intersection of two great circles forms four angles that add to $2\pi$ rad, with those on one side of a great circle adding to $\pi$ rad. I take this as visually obvious (unfortunately I can't offer any proof of this). Moving from B to C the angle $\phi$ doesn't change. At C, let $\theta$ be the angle between the vector and AC, i.e. the equator. There,
$$BCA = \theta + \phi.$$
Combining these two equations and solving for θ gives θ = CAB + ABC + BCA − π.
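The excess formula for $\theta$ can be evaluated for concrete triangles by computing the interior angles from the vertex positions. The sketch below only computes the angular excess (it does not simulate parallel transport), and the helper names are my own. In the special case where B is at the pole and A and C lie on the equator, the angles at A and C are right angles, so the predicted rotation is just the angle at B, i.e. the longitude difference between A and C.

```python
import math


def dot(u, v):
    """Euclidean dot product of two 3-vectors."""
    return sum(ui * vi for ui, vi in zip(u, v))


def unit_tangent(p, q):
    """Unit tangent at p (on the unit sphere) along the great circle towards q."""
    d = dot(p, q)
    t = [qi - d * pi for pi, qi in zip(p, q)]
    norm = math.sqrt(dot(t, t))
    return [ti / norm for ti in t]


def interior_angle(p, q, r):
    """Angle at vertex p between the great-circle arcs p->q and p->r."""
    c = dot(unit_tangent(p, q), unit_tangent(p, r))
    return math.acos(max(-1.0, min(1.0, c)))


def spherical_excess(a, b, c):
    """Sum of the three interior angles minus pi."""
    return (interior_angle(a, b, c) + interior_angle(b, c, a)
            + interior_angle(c, a, b) - math.pi)


def equator_pole_triangle(delta):
    """A and C on the equator separated by longitude delta, B at the north pole."""
    a = (1.0, 0.0, 0.0)
    b = (0.0, 0.0, 1.0)
    c = (math.cos(delta), math.sin(delta), 0.0)
    return a, b, c


if __name__ == "__main__":
    print(spherical_excess(*equator_pole_triangle(math.pi / 2)))
```

For the octant triangle all three interior angles are right angles, so the excess, and hence the rotation $\theta$, should come out as $\pi/2$.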
11. When is the gradient of the velocity field zero everywhere?
$$V^\alpha{}_{;\beta} = 0 = V^\alpha{}_{,\beta} + \Gamma^\alpha{}_{\mu\beta} V^\mu \tag{6.10}$$
(a) A necessary condition for the solution of (6.10) is the integrability condition for (6.10), which apparently follows from the commuting of partial derivatives. We are to use the fact that
$$V^\alpha{}_{,\nu\beta} = V^\alpha{}_{,\beta\nu}$$
to derive the equation:
$$\left(\Gamma^\alpha{}_{\mu\beta,\nu} - \Gamma^\alpha{}_{\mu\nu,\beta}\right) V^\mu = \left(\Gamma^\alpha{}_{\mu\beta} \Gamma^\mu{}_{\sigma\nu} - \Gamma^\alpha{}_{\mu\nu} \Gamma^\mu{}_{\sigma\beta}\right) V^\sigma \tag{6.11}$$
This problem is fairly straightforward once one thinks about “what would I do to (6.10) to obtain a term like . . . in (6.11)”. Obviously to obtain a term like $\Gamma^\alpha{}_{\mu\beta,\nu}$ one must take
$$\frac{\partial}{\partial x^\nu} \text{ of (6.10)}.$$
To get rid of the resulting $V^\alpha{}_{,\beta\nu}$ one simply does this twice (once as written and once with $\beta$ and $\nu$ interchanged) and subtracts the two results, using the commuting property of partial differentiation, as Schutz indicated. But then one must also deal with terms like $\Gamma^\alpha{}_{\mu\beta} V^\mu{}_{,\nu}$. For these one uses (6.10) again (without differentiating).

(b) By relabeling indices, work this into the form given. Actually the form given has a typo and should be:
$$\left(\Gamma^\alpha{}_{\mu\beta,\nu} - \Gamma^\alpha{}_{\mu\nu,\beta} + \Gamma^\alpha{}_{\sigma\nu} \Gamma^\sigma{}_{\mu\beta} - \Gamma^\alpha{}_{\sigma\beta} \Gamma^\sigma{}_{\mu\nu}\right) V^\mu = 0. \tag{6.12}$$
This problem is straightforward. The first two terms in (6.12) and (6.11) are identical. Based on the sign of the 3rd term in (6.12) it's obviously the final term in (6.11). So we must replace $\mu \to \sigma$. That leaves the final term in (6.12) as the 3rd in (6.11). So we need to also replace $\sigma \to \mu$.
12. Prove that Eq. (6.52) defines a new affine parameter.
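Using $\phi$ defined in Eq. (6.52) to parameterize the curve, we can find the equation for the geodesics by using Eq. (6.51) and the chain rule. So the operator
$$\begin{aligned}
\frac{d}{d\phi}\frac{d}{d\phi} &= \left(\frac{d\lambda}{d\phi}\right)^2 \frac{d}{d\lambda}\frac{d}{d\lambda} \\
&= \left(\frac{1}{a}\right)^2 \frac{d}{d\lambda}\frac{d}{d\lambda}
\end{aligned} \tag{6.13}$$
since $d\lambda/d\phi = 1/a$ is constant. The product of two first derivatives in Eq. (6.51) picks up the same factor of $1/a^2$. That means we can rewrite Eq. (6.51) in terms of $\phi$ by simply dividing Eq. (6.51) by $a^2$, which is constant. Because Eq. (6.51) is set to zero, this doesn't change the form of the equation at all, as indicated on p. 157, equation below Eq. (6.52).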
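13. (a) Show that if $\vec{A}$ and $\vec{B}$ are parallel-transported along a curve, then $g(\vec{A}, \vec{B}) = \vec{A} \cdot \vec{B}$ is constant on the curve.

A vector that is parallel-transported along a curve is moved in the direction of the tangent to the curve without rotating or changing its length. From this notion it seems obvious that the dot product of two vectors that were parallel-transported along a curve would not change. To demonstrate this mathematically, one could take the derivative along the curve (parameterized by $\lambda$) of the dot product, where $d/d\lambda$ is to be read as the covariant derivative along the curve, $U^\mu \nabla_\mu$:
$$\begin{aligned}
\frac{d}{d\lambda}\, g(\vec{A}, \vec{B}) &= \frac{d}{d\lambda} \left(g_{\alpha\beta} A^\alpha B^\beta\right) \\
&= A^\alpha B^\beta \frac{d}{d\lambda}(g_{\alpha\beta}) + g_{\alpha\beta} B^\beta \frac{d}{d\lambda}(A^\alpha) + g_{\alpha\beta} A^\alpha \frac{d}{d\lambda}(B^\beta)
\end{aligned} \tag{6.14}$$
All the derivatives are zero. The first term is the derivative of the metric along the curve and is zero as a consequence of the local flatness theorem:
$$\begin{aligned}
\frac{d}{d\lambda}(g_{\alpha\beta}) &= U^\mu g_{\alpha\beta;\mu} \\
&= 0, && \text{c.f. Eq. (6.31)}.
\end{aligned} \tag{6.15}$$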
The 2nd and 3rd terms are the derivatives of the components of the vectors. These are zero because these vectors were assumed to be parallel-transported along the curve, c.f. Eq. (6.47).
13. (b) Conclude from the results of (a) that if a geodesic is spacelike (or timelike or null) at some point, it is necessarily spacelike (or timelike or null) at all points.
Vectors were defined as spacelike (or timelike or null) if their magnitude was $> 0$ ($< 0$, $= 0$), c.f. § 2.5 on p. 44. A geodesic is of course not a vector, but it does have a tangent vector at each point along the curve that gives the linear approximation to the displacement along the curve at the point, per unit of the parameter that parameterizes the curve. So it would be reasonable to call a geodesic spacelike at a point if its tangent vector $\vec{U}$ were of positive magnitude at that point,
$$\vec{U} \cdot \vec{U} = g_{\alpha\beta} U^\alpha U^\beta.$$
Can this change as one moves along the curve? The geodesic is, by definition, the curve that parallel-transports its own tangent vector. But from (a) we have that any two vectors that are parallel-transported by any curve keep the same dot product. So the tangent vector, dotted with itself, does not change as it is parallel-transported around the geodesic.
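14. Proper distance along a curve whose tangent is $\vec{V}$ is given by Eq. (6.8). Show that if the curve is a geodesic, then proper length is an affine parameter. (Use results of Exerc. 13).
$$\begin{aligned}
l &= \int_{\lambda_0}^{\lambda_1} |\vec{V} \cdot \vec{V}|^{1/2}\, d\lambda, && \text{Eq. (6.8)} \\
&= \int_{\lambda_0}^{\lambda_1} |\vec{U} \cdot \vec{U}|^{1/2}\, d\lambda, && \text{where } \vec{U} \text{ is tangent vector to geodesic} \\
&= |\vec{U} \cdot \vec{U}|^{1/2} \int_{\lambda_0}^{\lambda_1} d\lambda, && \text{using Exerc. 13b}
\end{aligned} \tag{6.16}$$
Thus the proper distance along the curve has the form,
$$\begin{aligned}
l(\lambda) &= |\vec{U} \cdot \vec{U}|^{1/2} (\lambda - \lambda_0) \\
&= |\vec{U} \cdot \vec{U}|^{1/2} \lambda - \lambda_0 |\vec{U} \cdot \vec{U}|^{1/2}, \\
&= a\lambda + b.
\end{aligned} \tag{6.17}$$
where $a = |\vec{U} \cdot \vec{U}|^{1/2}$ is constant (c.f. Exerc. 13 (a)), and $b = -\lambda_0 a$. This is the same form as Eq. (6.52), which was (hopefully) shown to be an affine parameter in Exerc. 12.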
15. Use Exercs. 13 and 14 to prove that the proper length of a geodesic between two points is unchanged to first order by small changes in the curve that do not change its endpoints. This looks like a nice problem because it implies that geodesics are extrema in distance between two fixed points.
16. (a) Derive Eqs. (6.59) and (6.60) from Eq. (6.58).

This problem makes sure we're following the argument. Eq. (6.59) is found simply by expanding the integrand in a Taylor series (about point A) and keeping only the constant term and the term in the first derivative. The constant terms cancel in both cases leaving only the first derivative terms, leading to Eq. (6.59). Treating the integrands in Eq. (6.59) as constants (in keeping with the Taylor series expansions), the integrals can be performed giving immediately Eq. (6.60).
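16. (b) Provide algebra needed to justify Eq. (6.61).

Again a simple problem to make sure we're following the argument. Let's start with the first term in Eq. (6.60).
$$\begin{aligned}
-\frac{\partial}{\partial x^1}\left(\Gamma^\alpha{}_{\mu2} V^\mu\right) &= -V^\mu \frac{\partial}{\partial x^1}\left(\Gamma^\alpha{}_{\mu2}\right) - \Gamma^\alpha{}_{\mu2} \frac{\partial}{\partial x^1}\left(V^\mu\right) \\
&= -V^\mu \left(\Gamma^\alpha{}_{\mu2,1}\right) - \Gamma^\alpha{}_{\mu2} \left(-\Gamma^\mu{}_{\nu1} V^\nu\right), && \text{using Eq. (6.53)} \\
&= \left[-\Gamma^\alpha{}_{\mu2,1} + \Gamma^\alpha{}_{\nu2} \Gamma^\nu{}_{\mu1}\right] V^\mu,
\end{aligned} \tag{6.18}$$
where we let $\nu \to \mu$ and $\mu \to \nu$ in the 2nd term so that we can pull out the $V^\mu$. This is allowed because they are dummy (repeated) indices. The 2nd term in Eq. (6.60) is dealt with in exactly the same way:
$$\begin{aligned}
\frac{\partial}{\partial x^2}\left(\Gamma^\alpha{}_{\mu1} V^\mu\right) &= V^\mu \frac{\partial}{\partial x^2}\left(\Gamma^\alpha{}_{\mu1}\right) + \Gamma^\alpha{}_{\mu1} \frac{\partial}{\partial x^2}\left(V^\mu\right) \\
&= V^\mu \left(\Gamma^\alpha{}_{\mu1,2}\right) + \Gamma^\alpha{}_{\mu1} \left(-\Gamma^\mu{}_{\nu2} V^\nu\right), && \text{using Eq. (6.53)} \\
&= \left[\Gamma^\alpha{}_{\mu1,2} - \Gamma^\alpha{}_{\nu1} \Gamma^\nu{}_{\mu2}\right] V^\mu,
\end{aligned} \tag{6.19}$$
where we let $\nu \to \mu$ and $\mu \to \nu$ in the 2nd term so that we can pull out the $V^\mu$.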
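17. (a) Prove that Eq. (6.5) implies that $g^{\alpha\beta}{}_{,\mu}(\mathcal{P}) = 0$.

Because the metric tensor applied to its inverse gives the identity matrix, which is of course a constant, we have
$$\begin{aligned}
g_{\alpha\mu}\, g^{\mu\beta} &= g^\beta{}_\alpha = \delta^\beta{}_\alpha, \\
\left(g_{\alpha\mu}\, g^{\mu\beta}\right)_{,\gamma} &= \delta^\beta{}_{\alpha,\gamma} = 0, \\
g_{\alpha\mu,\gamma}\, g^{\mu\beta} + g_{\alpha\mu}\, g^{\mu\beta}{}_{,\gamma} &= 0, \\
g_{\alpha\mu}\, g^{\mu\beta}{}_{,\gamma} &= 0, && \text{using Eq. (6.5)}.
\end{aligned} \tag{6.20}$$
And now for the tricky bit. In general $g_{\alpha\beta}$ is a general tensor and there could in principle be several non-zero terms in each column that cancel to produce zero when multiplied by $g^{\mu\beta}{}_{,\gamma}$. So I believe one must argue as follows. One can always choose one's basis such that $g_{\alpha\beta} = \eta_{\alpha\beta}$, c.f. Eq. (6.2). But then there is only one non-zero term in each column. For instance, let $\alpha = 0$.
$$\begin{aligned}
g_{0\mu}\, g^{\mu\beta}{}_{,\gamma} &= -1\, g^{0\beta}{}_{,\gamma} = 0 \\
\Rightarrow\ g^{0\beta}{}_{,\gamma} &= 0, && \forall \beta, \gamma.
\end{aligned} \tag{6.21}$$
The same argument of course applies to all the other values of $\alpha$, and thus
$$g^{\alpha\beta}{}_{,\gamma} = 0, \qquad \forall \alpha, \beta, \gamma.$$
(Alternatively, since $g_{\alpha\mu}$ is invertible, contracting the last line of Eq. (6.20) with $g^{\sigma\alpha}$ gives the same result directly.)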
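17. (b) Use results of (a) to establish Eq. (6.64). A nice easy problem, perhaps the point being to show us what use the results of (a) can be put to. Starting with Eq. (6.32),
\[
\Gamma^\alpha{}_{\mu\nu} = \frac{1}{2} g^{\alpha\beta}\left(g_{\beta\mu,\nu} + g_{\beta\nu,\mu} - g_{\mu\nu,\beta}\right),
\tag{6.22}
\]
we simply differentiate with respect to $x^\sigma$ and use $g^{\alpha\beta}{}_{,\sigma} = 0$ from (a):
\[
\begin{aligned}
\Gamma^\alpha{}_{\mu\nu,\sigma}
&= \frac{1}{2}\frac{\partial}{\partial x^\sigma}\left[ g^{\alpha\beta}\left(g_{\beta\mu,\nu} + g_{\beta\nu,\mu} - g_{\mu\nu,\beta}\right)\right] \\
&= \frac{1}{2} g^{\alpha\beta}{}_{,\sigma}\left(g_{\beta\mu,\nu} + g_{\beta\nu,\mu} - g_{\mu\nu,\beta}\right) + \frac{1}{2} g^{\alpha\beta}\left(g_{\beta\mu,\nu\sigma} + g_{\beta\nu,\mu\sigma} - g_{\mu\nu,\beta\sigma}\right) \\
&= \frac{1}{2} g^{\alpha\beta}\left(g_{\beta\mu,\nu\sigma} + g_{\beta\nu,\mu\sigma} - g_{\mu\nu,\beta\sigma}\right).
\end{aligned}
\tag{6.23}
\]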
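17. (c) Fill in step needed to establish Eq. (6.68). We start with Eq. (6.63). Because we’re in a local inertial frame at point P, the Christoffel symbol vanishes at P. For the derivative of the Christoffel symbol, we substitute from Eq. (6.64), for the first term making the substitutions
\[
\mu \to \beta, \qquad \sigma \to \mu, \qquad \beta \to \sigma.
\tag{6.24}
\]
For the 2nd term we simply interchange µ and ν in the first term and change the sign, giving first Eq. (6.65). The first and fourth terms then cancel because partial derivatives commute, Eq. (6.66), and we arrive at Eq. (6.67):
\[
\begin{aligned}
R^\alpha{}_{\beta\mu\nu}
&= \frac{1}{2} g^{\alpha\sigma}\left(g_{\sigma\beta,\nu\mu} + g_{\sigma\nu,\beta\mu} - g_{\beta\nu,\sigma\mu} - g_{\sigma\beta,\mu\nu} - g_{\sigma\mu,\beta\nu} + g_{\beta\mu,\sigma\nu}\right) \\
&= \frac{1}{2} g^{\alpha\sigma}\left(g_{\sigma\nu,\beta\mu} - g_{\beta\nu,\sigma\mu} - g_{\sigma\mu,\beta\nu} + g_{\beta\mu,\sigma\nu}\right) \\
&= \frac{1}{2} g^{\alpha\sigma}\left(g_{\sigma\nu,\beta\mu} - g_{\sigma\mu,\beta\nu} + g_{\beta\mu,\sigma\nu} - g_{\beta\nu,\sigma\mu}\right), \quad \text{just changed order.}
\end{aligned}
\tag{6.25}
\]
Finally we must lower the index,
\[
\begin{aligned}
R_{\alpha\beta\mu\nu} &= g_{\alpha\lambda} R^\lambda{}_{\beta\mu\nu} \\
&= \frac{1}{2} g_{\alpha\lambda}\, g^{\lambda\sigma}\left(g_{\sigma\nu,\beta\mu} - g_{\sigma\mu,\beta\nu} + g_{\beta\mu,\sigma\nu} - g_{\beta\nu,\sigma\mu}\right) \\
&= \frac{1}{2} g_\alpha{}^\sigma\left(g_{\sigma\nu,\beta\mu} - g_{\sigma\mu,\beta\nu} + g_{\beta\mu,\sigma\nu} - g_{\beta\nu,\sigma\mu}\right) \\
&= \frac{1}{2} \delta^\sigma{}_\alpha\left(g_{\sigma\nu,\beta\mu} - g_{\sigma\mu,\beta\nu} + g_{\beta\mu,\sigma\nu} - g_{\beta\nu,\sigma\mu}\right) \\
&= \frac{1}{2}\left(g_{\alpha\nu,\beta\mu} - g_{\alpha\mu,\beta\nu} + g_{\beta\mu,\alpha\nu} - g_{\beta\nu,\alpha\mu}\right),
\end{aligned}
\tag{6.26}
\]
which is Eq. (6.68).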
18. (a) Derive Eqs. (6.69) and (6.70) from Eq. (6.68). This question involves trivial (and not so trivial) index manipulation. In Eq. (6.68) one notices that changing the order of α and β changes the sign. This is clear because the first term, $g_{\alpha\nu,\beta\mu}$, is the negative of the last term, $-g_{\beta\nu,\alpha\mu}$, but with α and β in the opposite order. And similarly for the two middle terms. This observation gives the first equality of Eq. (6.69),
\[
R_{\alpha\beta\mu\nu} = -R_{\beta\alpha\mu\nu}.
\]
An analogous observation gives the 2nd equality in Eq. (6.69). That is, the first two terms of Eq. (6.68) are of opposite sign and have the second pair of indices, µ and ν, in the opposite order. Similarly for the 3rd and 4th terms. This gives the second equality of Eq. (6.69),
\[
R_{\alpha\beta\mu\nu} = -R_{\alpha\beta\nu\mu}.
\]
The 3rd and final equality in Eq. (6.69) is not so immediately clear. I found a proof using the following somewhat tedious procedure. Use Eq. (6.68) to find the expression for $R_{\mu\nu\alpha\beta}$. That is, simply use Eq. (6.68) with the following substitutions: α → µ, β → ν, µ → α, ν → β. One finds
\[
R_{\mu\nu\alpha\beta} = \frac{1}{2}\left(g_{\mu\beta,\nu\alpha} - g_{\mu\alpha,\nu\beta} + g_{\nu\alpha,\mu\beta} - g_{\nu\beta,\mu\alpha}\right).
\tag{6.27}
\]
The most direct way to proceed from here is to use the facts that g is always symmetric and that partial derivatives commute. So we can interchange the order of the two indices before the comma, and of the two after the comma, without changing the result. Then one notices that the 3rd term in (6.27) above is the same as the first term in Eq. (6.68),
\[
g_{\nu\alpha,\mu\beta} = g_{\alpha\nu,\beta\mu}.
\tag{6.28}
\]
And furthermore, the 2nd term in (6.27) above is the same as the 2nd term in Eq. (6.68); the 1st term in (6.27) above is the same as the 3rd term in Eq. (6.68); the 4th term in (6.27) above is the same as the 4th term in Eq. (6.68). This gives the 3rd equality in Eq. (6.69), $R_{\alpha\beta\mu\nu} = R_{\mu\nu\alpha\beta}$. Proving Eq. (6.70) is also rather straightforward, albeit rather messy (at least for the brute force method I came up with). I simply wrote down the expressions for all three terms in Eq. (6.70), $R_{\alpha\beta\mu\nu}$, $R_{\alpha\nu\beta\mu}$ and $R_{\alpha\mu\nu\beta}$, using Eq. (6.68):
\[
2R_{\alpha\beta\mu\nu} = +g_{\alpha\nu,\beta\mu} - g_{\alpha\mu,\beta\nu} + g_{\beta\mu,\alpha\nu} - g_{\beta\nu,\alpha\mu}
\tag{6.29}
\]
\[
2R_{\alpha\nu\beta\mu} = +g_{\alpha\mu,\nu\beta} - g_{\alpha\beta,\nu\mu} + g_{\nu\beta,\alpha\mu} - g_{\nu\mu,\alpha\beta}
\tag{6.30}
\]
\[
2R_{\alpha\mu\nu\beta} = +g_{\alpha\beta,\mu\nu} - g_{\alpha\nu,\mu\beta} + g_{\mu\nu,\alpha\beta} - g_{\mu\beta,\alpha\nu}
\tag{6.31}
\]
Recognizing again that the order of the first pair and last pair of indices in terms like $g_{\alpha\beta,\mu\nu}$ doesn’t matter, because of the symmetry of the metric and because partial derivatives commute, we can cancel all the terms. Explicitly: the 1st term of (6.29) cancels the 2nd of (6.31); the 2nd of (6.29) cancels the 1st of (6.30); the 3rd of (6.29) cancels the 4th of (6.31); the 4th of (6.29) cancels the 3rd of (6.30); the 2nd of (6.30) cancels the 1st of (6.31); and the 4th of (6.30) cancels the 3rd of (6.31).
18. (b) Show that Eq. (6.69) reduces the number of independent components of $R_{\alpha\beta\mu\nu}$ to 21. If you’re having trouble here, perhaps it’s worth having a go at Exerc. 35, since that problem involves writing down the independent elements for a specific example, perhaps forcing you to think about it in a more concrete or different way. Of course, Exerc. 35 would be much easier if you solved Exer. 18 first! If you’re still stuck, here’s my solution below. First of all one must notice that terms of the form $R_{\alpha\alpha\mu\nu} = 0$, and of course also $R_{\alpha\beta\mu\mu} = 0$. Speaking from personal experience, one can waste quite some time if one doesn’t appreciate this from the start! But once one imposes this, then the problem simplifies tremendously. For now there are only six independent choices for the first pair α, β (where order doesn’t matter because of Eq. (6.69)). One can establish this a number of ways. For instance, there are 4 ways of choosing α and then only 3 ways of choosing β ≠ α, giving 4 × 3 = 12 pairs. But we have counted them twice, because we counted α = 1, β = 2 separately from α = 2, β = 1, etc., so we must divide this by two to get 12/2 = 6 independent pairs. Similarly for the second pair, but let’s be careful in choosing these because we’re going to have to impose the symmetry given by the last equality in Eq. (6.69). Let’s consider first those pairs µ, ν where of course we require µ ≠ ν, but also we impose that the pair µ, ν differs from the pair α, β (regardless of order). Thus for any given pair α, β we have only 5 pairs µ, ν. This counts twice the permutations $R_{\alpha\beta\mu\nu}$ and $R_{\mu\nu\alpha\beta}$, so we must divide by 2 to get 6 × 5/2 = 15 independent elements. To this we must add the six cases where the pair µ, ν is the same as the pair α, β, i.e. µ = α and ν = β, or µ = β and ν = α. This gives a total of 15 + 6 = 21 independent elements of $R_{\alpha\beta\mu\nu}$ accounting for Eq. (6.69) only. (We account for Eq. (6.70) in the next problem.)
18. (c) Show that Eq. (6.70),
\[
R_{\alpha\beta\mu\nu} + R_{\alpha\nu\beta\mu} + R_{\alpha\mu\nu\beta} = 0,
\]
imposes only one restriction that is independent of Eq. (6.69), and thus reduces the number of independent components of $R_{\alpha\beta\mu\nu}$ to 20. My solution is, as is often the case, rather longwinded. If you find a neater solution, please let me know! First let’s establish that all the indices must be different. I do this by simply considering all the cases. Consider α = β. Then $R_{\alpha\alpha\mu\nu} = 0$, which implies, from Eq. (6.70),
\[
\begin{aligned}
R_{\alpha\alpha\mu\nu} + R_{\alpha\nu\alpha\mu} + R_{\alpha\mu\nu\alpha} &= 0, \\
R_{\alpha\nu\alpha\mu} + R_{\alpha\mu\nu\alpha} &= 0, \\
R_{\alpha\nu\alpha\mu} - R_{\alpha\mu\alpha\nu} &= 0.
\end{aligned}
\tag{6.32}
\]
But this final result is a special case of Eq. (6.69), so is not a new result, and does not provide any constraints on $R_{\alpha\beta\mu\nu}$ beyond those of Eq. (6.69). We proceed like this, next considering α = µ:
\[
\begin{aligned}
R_{\alpha\beta\alpha\nu} + R_{\alpha\nu\beta\alpha} + R_{\alpha\alpha\nu\beta} &= 0, \\
R_{\alpha\beta\alpha\nu} + R_{\alpha\nu\beta\alpha} &= 0, \\
R_{\alpha\beta\alpha\nu} - R_{\alpha\nu\alpha\beta} &= 0.
\end{aligned}
\tag{6.33}
\]
This last equality was also implied by Eq. (6.69) as a special case. Unfortunately we have a few more cases to consider: α = ν, and separately β = µ, β = ν, and µ = ν. They all reduce to special cases of Eq. (6.69). So we conclude that the only new information in Eq. (6.70) must come from the case when the indices are all unique. There is only one set of 4 indices all different, {0, 1, 2, 3}, and these can be split into two pairs in 3 unique ways:
\[
\begin{array}{cc}
\alpha, \beta & \mu, \nu \\
0, 1 & 2, 3 \\
0, 2 & 1, 3 \\
0, 3 & 2, 1
\end{array}
\tag{6.34}
\]
where the order of α, β does not matter, nor does the order of µ, ν. And notice also we don’t distinguish between α, β = 0, 1 while µ, ν = 2, 3 and the case where α, β = 2, 3 while µ, ν = 0, 1, because the elements must have the same value due to the last equality in Eq. (6.69). So of the 21 elements of $R_{\alpha\beta\mu\nu}$ described in (b) above, there are 3 such that the indices are all different. But these three elements obey the relation Eq. (6.70), which can be written, assuming all the indices α, β, µ, ν are unique, as for example
\[
R_{0123} + R_{0312} + R_{0231} = 0.
\]
It is not necessary to distinguish this equation from others that can be obtained by simple application of the rules in Eq. (6.69). This equation then imposes one constraint, independently of Eq. (6.69), and thus reduces the number of independent elements of $R_{\alpha\beta\mu\nu}$ from the 21 we found in (b) to 20.
19. Prove that $R^\alpha{}_{\beta\mu\nu} = 0$ for polar coordinates in the Euclidean plane. First we find the number of independent components of $R_{\alpha\beta\mu\nu}$ in two dimensions. [Refer to Exer. 18(b) for the case of 4-dimensional space.] There is only one degree of freedom associated with the first pair of indices because $R_{r\theta\mu\nu} = -R_{\theta r\mu\nu}$ and $R_{\alpha\alpha\mu\nu} = 0$. And similarly only one degree of freedom associated with the last two indices since $R_{\alpha\beta r\theta} = -R_{\alpha\beta\theta r}$ and $R_{\alpha\beta\mu\mu} = 0$. It’s not possible to use the cyclic identity to reduce this any further, since applying the cyclic identity we discover something that was true by the other symmetry relations, that is:
\[
\begin{aligned}
R_{r\theta r\theta} + R_{r\theta\theta r} + R_{rr\theta\theta} &= 0, \quad \text{by cyclic identity, Eq. (6.70)} \\
R_{r\theta r\theta} - R_{r\theta r\theta} &= 0, \quad \text{symmetry relations, Eq. (6.69)}
\end{aligned}
\tag{6.35}
\]
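The counting arguments of Exer. 18(b), 18(c) and the two-dimensional count above can be cross-checked with a few lines of Python. This is only a sketch of the bookkeeping, not part of Schutz's text: the function name `riemann_component_count` is my own, and it assumes (as argued in Exer. 18(c)) that Eq. (6.70) contributes exactly one new constraint per set of four distinct indices.

```python
from itertools import combinations


def riemann_component_count(n):
    """Count independent components of R_{abcd} in n dimensions.

    Returns (index pairs, count after Eq. (6.69), count after Eq. (6.70)).
    """
    # Antisymmetry in each index pair: only unordered pairs a < b matter.
    n_pairs = len(list(combinations(range(n), 2)))
    # Pair-exchange symmetry R_{abcd} = R_{cdab}: a symmetric matrix in the pairs.
    after_eq_669 = n_pairs * (n_pairs + 1) // 2
    # Assumption from Exer. 18(c): one new cyclic constraint per set of 4 distinct indices.
    cyclic_constraints = len(list(combinations(range(n), 4)))
    return n_pairs, after_eq_669, after_eq_669 - cyclic_constraints


for n in (2, 3, 4):
    print(n, riemann_component_count(n))
```

For n = 4 this reproduces the 6 pairs, 21 components and 20 components found above, and for n = 2 it gives the single component used in Exer. 19.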
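So there is only one independent value to compute. Let’s compute
\[
\begin{aligned}
R^r{}_{\theta r\theta} &= \Gamma^r{}_{\theta\theta,r} - \Gamma^r{}_{\theta r,\theta} + \Gamma^\sigma{}_{\theta\theta}\Gamma^r{}_{\sigma r} - \Gamma^\sigma{}_{\theta r}\Gamma^r{}_{\sigma\theta}, \quad \text{Eq. (6.63)} \\
&= \Gamma^r{}_{\theta\theta,r} + \Gamma^r{}_{\theta\theta}\Gamma^r{}_{rr} - \Gamma^\theta{}_{\theta r}\Gamma^r{}_{\theta\theta}, \quad \text{Eq. (5.45)} \\
&= \Gamma^r{}_{\theta\theta,r} - \Gamma^\theta{}_{\theta r}\Gamma^r{}_{\theta\theta}, \quad \text{Eq. (5.45)} \\
&= \frac{\partial(-r)}{\partial r} - \frac{1}{r}(-r), \quad \text{Eq. (5.45)} \\
&= 0.
\end{aligned}
\tag{6.36}
\]
And this is of course what we expect since (despite the polar coordinates) we’re in Euclidean space, which is flat, and Eq. (6.71) tells us the Riemann tensor must then be zero.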
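As a numerical cross-check of this result, here is a minimal sketch that evaluates every component of $R^\alpha{}_{\beta\mu\nu}$ from Eq. (6.63), using the polar-coordinate Christoffel symbols $\Gamma^r{}_{\theta\theta} = -r$ and $\Gamma^\theta{}_{r\theta} = \Gamma^\theta{}_{\theta r} = 1/r$ used above (Eq. (5.45)). The function names, the sample point and the finite-difference step are my own choices; derivatives are approximated by central differences, so "zero" here means zero to within rounding error.

```python
def christoffel(a, b, c, x):
    """Gamma^a_{bc} for plane polar coordinates x = (r, theta); index 0 = r, 1 = theta."""
    r = x[0]
    if (a, b, c) == (0, 1, 1):
        return -r
    if (a, b, c) in ((1, 0, 1), (1, 1, 0)):
        return 1.0 / r
    return 0.0


def d_christoffel(a, b, c, mu, x, h=1e-5):
    """Central-difference estimate of the derivative of Gamma^a_{bc} along x^mu."""
    xp, xm = list(x), list(x)
    xp[mu] += h
    xm[mu] -= h
    return (christoffel(a, b, c, xp) - christoffel(a, b, c, xm)) / (2 * h)


def riemann(a, b, mu, nu, x):
    """R^a_{b mu nu} from Eq. (6.63)."""
    value = d_christoffel(a, b, nu, mu, x) - d_christoffel(a, b, mu, nu, x)
    for s in range(2):
        value += christoffel(a, s, mu, x) * christoffel(s, b, nu, x)
        value -= christoffel(a, s, nu, x) * christoffel(s, b, mu, x)
    return value


point = (2.0, 0.6)  # arbitrary sample point (r, theta)
largest = max(
    abs(riemann(a, b, mu, nu, point))
    for a in range(2) for b in range(2) for mu in range(2) for nu in range(2)
)
print("largest |R^a_{b mu nu}| at the sample point:", largest)
```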
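20. Fill in the algebra necessary to establish Eq. (6.73). The covariant derivative of a vector is
\[
\nabla_\beta V^\mu = V^\mu{}_{;\beta} = V^\mu{}_{,\beta} + \Gamma^\mu{}_{\sigma\beta} V^\sigma, \quad \text{see Eq. (6.33).}
\tag{6.37}
\]
Now we simply apply the gradient operator another time,
\[
\begin{aligned}
\nabla_\alpha\left(\nabla_\beta V^\mu\right) = \left(V^\mu{}_{;\beta}\right)_{;\alpha}
&= \left(V^\mu{}_{;\beta}\right)_{,\alpha} + \Gamma^\mu{}_{\sigma\alpha} V^\sigma{}_{;\beta} - \Gamma^\sigma{}_{\beta\alpha} V^\mu{}_{;\sigma}, \quad \text{see Eq. (6.34)} \\
&= \left(V^\mu{}_{;\beta}\right)_{,\alpha},
\end{aligned}
\tag{6.38}
\]
where the last step used the fact that Christoffel symbols are zero in local inertial coordinates. But their gradients are not, so
\[
\begin{aligned}
\nabla_\alpha\left(\nabla_\beta V^\mu\right) = \left(V^\mu{}_{;\beta}\right)_{,\alpha}
&= V^\mu{}_{,\beta\alpha} + \left(\Gamma^\mu{}_{\nu\beta} V^\nu\right)_{,\alpha} \\
&= V^\mu{}_{,\beta\alpha} + \Gamma^\mu{}_{\nu\beta,\alpha} V^\nu,
\end{aligned}
\tag{6.39}
\]
which is Eq. (6.73).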
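21. Following Eq. (6.78) it was claimed that one could generalize Eq. (6.77), for the commutator of the covariant derivatives of a vector, to a tensor with Eq. (6.78). Each index got a Riemann tensor and the sign was always positive, even when the index was a lower index. In parentheses it was claimed that this must be the case because the metric tensor g is unaffected by the covariant derivative. This is a great problem. The result Eq. (6.78) seems to contradict Eq. (6.34). I cannot find the explanation Schutz is looking for. All I’ve managed so far on this is to derive Eq. (6.78) (see my supplementary problem SP2 in section 6.10 above) and to show that Eq. (6.34) is at least consistent in the following sense.
\[
\begin{aligned}
V_{\alpha;\beta} &= \left(g_{\alpha\sigma} V^\sigma\right)_{;\beta} \\
&= g_{\alpha\sigma;\beta} V^\sigma + g_{\alpha\sigma} V^\sigma{}_{;\beta} \\
&= \left(g_{\alpha\sigma,\beta} - \Gamma^\mu{}_{\sigma\beta}\, g_{\alpha\mu} - \Gamma^\mu{}_{\alpha\beta}\, g_{\mu\sigma}\right) V^\sigma + g_{\alpha\sigma} V^\sigma{}_{;\beta} \\
&= \left(-\Gamma^\mu{}_{\sigma\beta}\, g_{\alpha\mu} - \Gamma^\mu{}_{\alpha\beta}\, g_{\mu\sigma}\right) V^\sigma + g_{\alpha\sigma} V^\sigma{}_{;\beta} \\
&= \left(-\Gamma^\mu{}_{\sigma\beta}\, g_{\alpha\mu} - \Gamma^\mu{}_{\alpha\beta}\, g_{\mu\sigma}\right) V^\sigma + g_{\alpha\sigma}\left(V^\sigma{}_{,\beta} + \Gamma^\sigma{}_{\nu\beta} V^\nu\right)
\end{aligned}
\tag{6.40}
\]
Let’s relabel the dummy indices so that it’s more clear which terms cancel:
\[
\begin{aligned}
V_{\alpha;\beta} &= \left(-\Gamma^\mu{}_{\sigma\beta}\, g_{\alpha\mu} - \Gamma^\mu{}_{\alpha\beta}\, g_{\mu\sigma}\right) V^\sigma + g_{\alpha\mu}\left(V^\mu{}_{,\beta} + \Gamma^\mu{}_{\sigma\beta} V^\sigma\right) \\
&= -\Gamma^\mu{}_{\sigma\beta}\, g_{\alpha\mu} V^\sigma - \Gamma^\mu{}_{\alpha\beta}\, g_{\mu\sigma} V^\sigma + g_{\alpha\mu} V^\mu{}_{,\beta} + g_{\alpha\mu}\Gamma^\mu{}_{\sigma\beta} V^\sigma \\
&= g_{\alpha\mu} V^\mu{}_{,\beta} - \Gamma^\mu{}_{\alpha\beta}\, g_{\mu\sigma} V^\sigma \\
&= V_{\alpha,\beta} - \Gamma^\mu{}_{\alpha\beta} V_\mu,
\end{aligned}
\tag{6.41}
\]
which is consistent with Eq. (6.34). Of course we used Eq. (6.34) in obtaining this, so it’s not a proof. But it is a check on internal consistency.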
22. Establish Eqs. (6.84), (6.85), and (6.86).
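23. Prove Eq. (6.88). Eq. (6.88) gives the partial derivative of the Riemann curvature tensor. As suggested in § 6.6, we can find it by starting with the definition of the Riemann curvature tensor given in Eq. (6.63).
\[
\begin{aligned}
R_{\alpha\beta\mu\nu,\lambda} &= \left(g_{\alpha\gamma} R^\gamma{}_{\beta\mu\nu}\right)_{,\lambda} \\
&= g_{\alpha\gamma} R^\gamma{}_{\beta\mu\nu,\lambda}, \quad \text{using Eq. (6.5)} \\
&= g_{\alpha\gamma}\left(\Gamma^\gamma{}_{\beta\nu,\mu\lambda} - \Gamma^\gamma{}_{\beta\mu,\nu\lambda} + \Gamma^\gamma{}_{\sigma\mu,\lambda}\Gamma^\sigma{}_{\beta\nu} + \Gamma^\gamma{}_{\sigma\mu}\Gamma^\sigma{}_{\beta\nu,\lambda} - \Gamma^\gamma{}_{\sigma\nu,\lambda}\Gamma^\sigma{}_{\beta\mu} - \Gamma^\gamma{}_{\sigma\nu}\Gamma^\sigma{}_{\beta\mu,\lambda}\right) \\
&= g_{\alpha\gamma}\left(\Gamma^\gamma{}_{\beta\nu,\mu\lambda} - \Gamma^\gamma{}_{\beta\mu,\nu\lambda}\right), \quad \text{in local inertial frame.}
\end{aligned}
\tag{6.42}
\]
We now use Eq. (6.32), which is purportedly correct in any coordinate system, to write Christoffel symbols in terms of the metric tensor. We must differentiate Eq. (6.32) with respect to $x^\sigma$:
\[
\Gamma^\gamma{}_{\mu\nu,\sigma} = \frac{1}{2} g^{\gamma\beta}\left(g_{\beta\mu,\nu\sigma} + g_{\beta\nu,\mu\sigma} - g_{\mu\nu,\beta\sigma}\right) + \frac{1}{2} g^{\gamma\beta}{}_{,\sigma}\left(g_{\beta\mu,\nu} + g_{\beta\nu,\mu} - g_{\mu\nu,\beta}\right)
\tag{6.43}
\]
We might note that to arrive at Eq. (6.64), we would eliminate the terms that are zero by Eq. (6.5) and the results of Exerc. 17(a), i.e. $g^{\gamma\beta}{}_{,\sigma} = 0$, leading to:
\[
\Gamma^\gamma{}_{\mu\nu,\sigma} = \frac{1}{2} g^{\gamma\beta}\left(g_{\beta\mu,\nu\sigma} + g_{\beta\nu,\mu\sigma} - g_{\mu\nu,\beta\sigma}\right)
\tag{6.44}
\]
But since we are going to differentiate this again, to be on the safe side, let’s keep the terms in (6.43) that we would eliminate because of $g^{\gamma\beta}{}_{,\sigma} = 0$. Now differentiate (6.43) with respect to $x^\lambda$ to arrive at:
\[
\begin{aligned}
\Gamma^\gamma{}_{\mu\nu,\sigma\lambda} &= \frac{1}{2} g^{\gamma\beta}\left(g_{\beta\mu,\nu\sigma\lambda} + g_{\beta\nu,\mu\sigma\lambda} - g_{\mu\nu,\beta\sigma\lambda}\right) + \frac{1}{2} g^{\gamma\beta}{}_{,\lambda}\left(g_{\beta\mu,\nu\sigma} + g_{\beta\nu,\mu\sigma} - g_{\mu\nu,\beta\sigma}\right) \\
&\quad + \frac{1}{2} g^{\gamma\beta}{}_{,\sigma\lambda}\left(g_{\beta\mu,\nu} + g_{\beta\nu,\mu} - g_{\mu\nu,\beta}\right) + \frac{1}{2} g^{\gamma\beta}{}_{,\sigma}\left(g_{\beta\mu,\nu\lambda} + g_{\beta\nu,\mu\lambda} - g_{\mu\nu,\beta\lambda}\right) \\
&= \frac{1}{2} g^{\gamma\beta}\left(g_{\beta\mu,\nu\sigma\lambda} + g_{\beta\nu,\mu\sigma\lambda} - g_{\mu\nu,\beta\sigma\lambda}\right)
\end{aligned}
\tag{6.45}
\]
The last three terms in the first equality didn’t contribute because each contains a factor that is a first derivative of the metric or of its inverse, which vanishes. So being careful didn’t amount to anything here; we could have differentiated Eq. (6.64) right away. Armed with this 2nd derivative of the Christoffel symbol, we can substitute it into (6.42), giving:
\[
\begin{aligned}
R_{\alpha\beta\mu\nu,\lambda} &= \frac{1}{2} g_{\alpha\gamma}\, g^{\gamma\sigma}\left[g_{\sigma\beta,\nu\mu\lambda} + g_{\sigma\nu,\beta\mu\lambda} - g_{\beta\nu,\sigma\mu\lambda} - \left(g_{\sigma\beta,\mu\nu\lambda} + g_{\sigma\mu,\beta\nu\lambda} - g_{\beta\mu,\sigma\nu\lambda}\right)\right] \\
&= \frac{1}{2} g_\alpha{}^\sigma\left[g_{\sigma\nu,\beta\mu\lambda} - g_{\beta\nu,\sigma\mu\lambda} - \left(g_{\sigma\mu,\beta\nu\lambda} - g_{\beta\mu,\sigma\nu\lambda}\right)\right] \\
&= \frac{1}{2} \delta^\sigma{}_\alpha\left[g_{\sigma\nu,\beta\mu\lambda} - g_{\beta\nu,\sigma\mu\lambda} - \left(g_{\sigma\mu,\beta\nu\lambda} - g_{\beta\mu,\sigma\nu\lambda}\right)\right] \\
&= \frac{1}{2}\left[g_{\alpha\nu,\beta\mu\lambda} - g_{\beta\nu,\alpha\mu\lambda} - g_{\alpha\mu,\beta\nu\lambda} + g_{\beta\mu,\alpha\nu\lambda}\right]
\end{aligned}
\tag{6.46}
\]
which is Eq. (6.88).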
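24. Establish Eq. (6.89) from Eq. (6.88). The solution follows from straightforward substitution of Eq. (6.88) into Eq. (6.89). For the 2nd term, make the substitutions:
\[
\mu \to \lambda, \qquad \nu \to \mu, \qquad \lambda \to \nu.
\tag{6.47}
\]
For the 3rd term, make the substitutions:
\[
\mu \to \nu, \qquad \nu \to \lambda, \qquad \lambda \to \mu.
\tag{6.48}
\]
Matching up terms that cancel is simply a matter of finding the terms with the same first two indices (in either order, since g is symmetric) and opposite sign. These pairs of terms will cancel because the remaining indices (the 3 after the comma) will necessarily correspond, and again order doesn’t matter because partial derivatives commute.
\[
\begin{aligned}
2R_{\alpha\beta\mu\nu,\lambda} + 2R_{\alpha\beta\lambda\mu,\nu} + 2R_{\alpha\beta\nu\lambda,\mu}
&= \left[g_{\alpha\nu,\beta\mu\lambda} - g_{\beta\nu,\alpha\mu\lambda} - g_{\alpha\mu,\beta\nu\lambda} + g_{\beta\mu,\alpha\nu\lambda}\right] \\
&\quad + \left[g_{\alpha\mu,\beta\lambda\nu} - g_{\beta\mu,\alpha\lambda\nu} - g_{\alpha\lambda,\beta\mu\nu} + g_{\beta\lambda,\alpha\mu\nu}\right] \\
&\quad + \left[g_{\alpha\lambda,\beta\nu\mu} - g_{\beta\lambda,\alpha\nu\mu} - g_{\alpha\nu,\beta\lambda\mu} + g_{\beta\nu,\alpha\lambda\mu}\right] \\
&= 0.
\end{aligned}
\tag{6.49}
\]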
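The term matching in (6.49) is mechanical enough to hand to a short script. The sketch below is my own bookkeeping device, not anything from Schutz: each term $\pm g_{ab,cde}$ of Eq. (6.88) is stored as a sign plus two index groups, and each group is sorted so that terms equal by the symmetry of g and the commuting of partial derivatives get the same key. The names `d_riemann_terms`, `canonical` and `bianchi_sum` are hypothetical helpers introduced for illustration.

```python
from collections import Counter


def d_riemann_terms(a, b, m, n, l):
    """Terms of 2 R_{abmn,l} from Eq. (6.88) as (sign, metric indices, derivative indices)."""
    return [
        (+1, (a, n), (b, m, l)),
        (-1, (b, n), (a, m, l)),
        (-1, (a, m), (b, n, l)),
        (+1, (b, m), (a, n, l)),
    ]


def canonical(metric_idx, deriv_idx):
    """Sort each group: g is symmetric and partial derivatives commute."""
    return (tuple(sorted(metric_idx)), tuple(sorted(deriv_idx)))


def bianchi_sum(a, b, m, n, l):
    """Net coefficient of each distinct term in 2(R_{abmn,l} + R_{ablm,n} + R_{abnl,m})."""
    total = Counter()
    for args in ((a, b, m, n, l), (a, b, l, m, n), (a, b, n, l, m)):
        for sign, g_idx, d_idx in d_riemann_terms(*args):
            total[canonical(g_idx, d_idx)] += sign
    # Keep only the terms that fail to cancel.
    return {key: coeff for key, coeff in total.items() if coeff != 0}


leftover = bianchi_sum("alpha", "beta", "mu", "nu", "lambda")
print("terms that fail to cancel:", leftover)
```

An empty dictionary means every one of the twelve terms has found its partner, which is the content of Eq. (6.89) in a local inertial frame.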
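25. (a) Prove that the Ricci tensor $R^\mu{}_{\alpha\mu\beta}$ is the only independent contraction of $R^\alpha{}_{\beta\mu\nu}$, since all others are multiples of it [or they are zero as pointed out in the text]. First we need the result that $R^\alpha{}_{\beta\mu\nu} = -R_\beta{}^\alpha{}_{\mu\nu}$, as was proven in supplementary problem SP.4 above. It follows that
\[
R^\alpha{}_{\alpha\mu\nu} = -R_\alpha{}^\alpha{}_{\mu\nu} = 0, \quad \forall \mu, \nu,
\]
since zero is the only number equal to its own negative. Similarly,
\[
R_{\alpha\beta}{}^\mu{}_\mu = 0, \quad \forall \alpha, \beta.
\]
It remains to consider $R^\mu{}_{\alpha\beta\mu}$, $R_\alpha{}^\mu{}_{\mu\beta}$ and $R_\alpha{}^\mu{}_{\beta\mu}$. These candidates were identified by stepping through the possibilities systematically: 1st and 2nd, 1st and 3rd, 1st and 4th (that’s it for those involving the first index), 2nd and 3rd (1st and 2nd already considered), 2nd and 4th, 3rd and 4th. That’s all.
\[
\begin{aligned}
-R^\mu{}_{\alpha\beta\mu} &= -g^{\sigma\mu} R_{\sigma\alpha\beta\mu} = g^{\sigma\mu} R_{\sigma\alpha\mu\beta}, \quad \text{by Eq. (6.69)} \\
&= R^\mu{}_{\alpha\mu\beta} = R_{\alpha\beta}, \quad \text{i.e. the Ricci tensor.}
\end{aligned}
\tag{6.50}
\]
\[
\begin{aligned}
-R_\alpha{}^\mu{}_{\mu\beta} &= R^\mu{}_{\alpha\mu\beta}, \quad \text{using results of SP.4} \\
&= R_{\alpha\beta}, \quad \text{i.e. the Ricci tensor.}
\end{aligned}
\tag{6.51}
\]
\[
\begin{aligned}
R_\alpha{}^\mu{}_{\beta\mu} &= -R^\mu{}_{\alpha\beta\mu}, \quad \text{using results of SP.4} \\
&= R^\mu{}_{\alpha\mu\beta} = R_{\alpha\beta}, \quad \text{i.e. the Ricci tensor.}
\end{aligned}
\tag{6.52}
\]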
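25. (b) Show the Ricci tensor is symmetric.
\[
\begin{aligned}
R_{\alpha\beta} &= g^{\sigma\mu} R_{\sigma\alpha\mu\beta} = g^{\sigma\mu} R_{\mu\beta\sigma\alpha}, \quad \text{by Eq. (6.69)} \\
&= R^\sigma{}_{\beta\sigma\alpha} \\
&= R_{\beta\alpha}.
\end{aligned}
\tag{6.53}
\]
26. Use Exer. 17(a) to prove Eq. (6.94).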
In Exer. 17(a) we proved that $g^{\alpha\beta}{}_{,\mu}(P) = 0$ at some event P. If we choose a local inertial reference frame, then $\Gamma^\sigma{}_{\alpha\mu}(P) = 0$, so
\[
g^{\alpha\beta}{}_{,\mu}(P) = g^{\alpha\beta}{}_{;\mu}(P) = 0,
\]
just as in the un-numbered equation on p. 151 between Eqs. (6.30) and (6.31). The 2nd equality is a valid tensor equation, which is then valid in all reference frames. So, just as for Eq. (6.31), we also have:
\[
g^{\alpha\beta}{}_{;\mu}(P) = 0, \quad \text{in any basis.}
\]
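27. Fill in the steps needed to establish Eqs. (6.95), (6.97), and (6.99). The 2nd term in Eq. (6.95) comes from contracting the 2nd term in Eq. (6.93):
\[
\begin{aligned}
g^{\alpha\mu} R_{\alpha\beta\lambda\mu;\nu} &= \left(g^{\alpha\mu} R_{\alpha\beta\lambda\mu}\right)_{;\nu}, \quad \text{using Eq. (6.94)} \\
&= \left(-g^{\alpha\mu} R_{\alpha\beta\mu\lambda}\right)_{;\nu}, \quad \text{using Eq. (6.69)} \\
&= \left(-R^\mu{}_{\beta\mu\lambda}\right)_{;\nu} \\
&= \left(-R_{\beta\lambda}\right)_{;\nu} \\
&= -R_{\beta\lambda;\nu}
\end{aligned}
\tag{6.54}
\]
Although not asked for this, note that the first and 3rd terms in Eq. (6.95) follow immediately from multiplication by the inverse metric tensor, so we have established Eq. (6.95) as well. To establish Eq. (6.97) we need Eq. (6.96), which is the contraction of Eq. (6.95) with the inverse metric:
\[
\begin{aligned}
0 &= g^{\beta\nu}\left[R_{\beta\nu;\lambda} - R_{\beta\lambda;\nu} + R^\mu{}_{\beta\nu\lambda;\mu}\right] \\
&= R_{;\lambda} + g^{\beta\nu}\left[-R_{\beta\lambda;\nu} + R^\mu{}_{\beta\nu\lambda;\mu}\right], \quad \text{used Eqs. (6.92), (6.94)} \\
&= R_{;\lambda} - R^\mu{}_{\lambda;\mu} + g^{\beta\nu} R^\mu{}_{\beta\nu\lambda;\mu}, \quad \text{relabelled } \nu \to \mu
\end{aligned}
\tag{6.55}
\]
The 3rd term is more involved. I found it easiest to work backwards, starting at the result in Eq. (6.96):
\[
\begin{aligned}
-R^\mu{}_{\lambda;\mu} &= -g^{\sigma\mu} R_{\sigma\lambda;\mu}, \quad \text{used Eqs. (6.92), (6.94)} \\
&= -g^{\sigma\mu} R^\nu{}_{\sigma\nu\lambda;\mu}, \quad \text{used Eq. (6.91)} \\
&= -g^{\sigma\mu} g^{\beta\nu} R_{\beta\sigma\nu\lambda;\mu} \\
&= g^{\sigma\mu} g^{\beta\nu} R_{\sigma\beta\nu\lambda;\mu}, \quad \text{used Eq. (6.69)} \\
&= g^{\beta\nu} R^\mu{}_{\beta\nu\lambda;\mu}
\end{aligned}
\tag{6.56}
\]
Substituting this for the 3rd term in (6.55) gives Eq. (6.96):
\[
0 = R_{;\lambda} - R^\mu{}_{\lambda;\mu} - R^\mu{}_{\lambda;\mu}
\tag{6.57}
\]
Multiplying by −1 gives:
\[
2R^\mu{}_{\lambda;\mu} - R_{;\lambda} = 0
\tag{6.58}
\]
The Ricci scalar R is a scalar, and the components $\delta^\mu{}_\lambda$ are constant for all values of λ and µ. So we can write,
\[
R_{;\lambda} = \left(\delta^\mu{}_\lambda R\right)_{;\mu} = \delta^\mu{}_\lambda R_{;\mu}
\tag{6.59}
\]
Substitution into (6.58) gives Eq. (6.97):
\[
\left(2R^\mu{}_\lambda - \delta^\mu{}_\lambda R\right)_{;\mu} = 0
\tag{6.60}
\]
The first equality in Eq. (6.98) is a definition (for some reason he’s changed notation from := to ≡, both being standard nomenclature). The 2nd equality follows from the symmetry of $R_{\alpha\beta}$ as follows:
\[
\begin{aligned}
R^{\alpha\beta} = R_{\mu\nu}\, g^{\alpha\mu} g^{\beta\nu} &= R_{\nu\mu}\, g^{\alpha\mu} g^{\beta\nu}, \quad \text{c.f. Eq. (6.91)} \\
&= R_{\nu\mu}\, g^{\mu\alpha} g^{\nu\beta}, \quad \text{symmetry of (inverse) metric tensor} \\
&= R^{\beta\alpha}
\end{aligned}
\tag{6.61}
\]
Working backwards again, relabel the indices in Eq. (6.99) (α → β, β → µ) and multiply by $2g_{\beta\lambda}$, giving:
\[
\begin{aligned}
2g_{\beta\lambda}\, G^{\beta\mu}{}_{;\mu} &= 2g_{\beta\lambda}\, G^{\mu\beta}{}_{;\mu}, \quad \text{symmetry of } G^{\mu\beta} \\
&= 2g_{\beta\lambda}\left(R^{\mu\beta} - \frac{1}{2} g^{\mu\beta} R\right)_{;\mu}, \quad \text{using def. in Eq. (6.98)} \\
&= \left(2R^\mu{}_\lambda - g^\mu{}_\lambda R\right)_{;\mu} \\
&= \left(2R^\mu{}_\lambda - \delta^\mu{}_\lambda R\right)_{;\mu}
\end{aligned}
\tag{6.62}
\]
which is Eq. (6.97).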
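28.(a) Derive Eq. (6.19) by using the usual coordinate transformation from Cartesian to spherical polars. [On the one hand these problems working with familiar coordinates like spherical polar may help build physical intuition. But on the other hand, you lose generality by working with a specific coordinate system.] First one needs the coordinate transformation equations:
\[
x = r\sin\theta\cos\phi, \qquad y = r\sin\theta\sin\phi, \qquad z = r\cos\theta.
\tag{6.63}
\]
We’ll need the transformation matrix, $\partial x^\alpha/\partial x^{\beta'}$, where the $x^{\beta'}$ refer to the spherical-polar coordinates (r, θ, φ) in that order:
\[
\begin{array}{lll}
\dfrac{\partial x}{\partial r} = \sin\theta\cos\phi, & \dfrac{\partial x}{\partial\theta} = r\cos\theta\cos\phi, & \dfrac{\partial x}{\partial\phi} = -r\sin\theta\sin\phi, \\[2ex]
\dfrac{\partial y}{\partial r} = \sin\theta\sin\phi, & \dfrac{\partial y}{\partial\theta} = r\cos\theta\sin\phi, & \dfrac{\partial y}{\partial\phi} = r\sin\theta\cos\phi, \\[2ex]
\dfrac{\partial z}{\partial r} = \cos\theta, & \dfrac{\partial z}{\partial\theta} = -r\sin\theta, & \dfrac{\partial z}{\partial\phi} = 0.
\end{array}
\]
The metric tensor for Euclidean space in Cartesian coordinates is simply
\[
g_{ij} = \delta_{ij}, \quad \text{Eq. (5.29)}
\]
which is immediately clear from considering the dot product of the Cartesian coordinate basis vectors in 3D Euclidean space. So we can obtain the metric of 3D Euclidean space in spherical-polar coordinates through the transformation
\[
g_{i'j'} = g_{ij}\frac{\partial x^i}{\partial x^{i'}}\frac{\partial x^j}{\partial x^{j'}} = \frac{\partial x^i}{\partial x^{i'}}\frac{\partial x^i}{\partial x^{j'}}
\tag{6.64}
\]
with summation over i implied in the last expression. Component by component:
\[
\begin{aligned}
g_{rr} &= g_{ij}\frac{\partial x^i}{\partial r}\frac{\partial x^j}{\partial r} = \left(\frac{\partial x^i}{\partial r}\right)^2, \quad \text{with summation over } i \\
&= \left(\frac{\partial x}{\partial r}\right)^2 + \left(\frac{\partial y}{\partial r}\right)^2 + \left(\frac{\partial z}{\partial r}\right)^2 \\
&= (\sin\theta\cos\phi)^2 + (\sin\theta\sin\phi)^2 + (\cos\theta)^2 = 1.
\end{aligned}
\tag{6.65}
\]
\[
\begin{aligned}
g_{r\theta} &= g_{ij}\frac{\partial x^i}{\partial r}\frac{\partial x^j}{\partial\theta} = \frac{\partial x^i}{\partial r}\frac{\partial x^i}{\partial\theta}, \quad \text{with summation over } i \\
&= \frac{\partial x}{\partial r}\frac{\partial x}{\partial\theta} + \frac{\partial y}{\partial r}\frac{\partial y}{\partial\theta} + \frac{\partial z}{\partial r}\frac{\partial z}{\partial\theta} \\
&= (\sin\theta\cos\phi)(r\cos\theta\cos\phi) + (\sin\theta\sin\phi)(r\cos\theta\sin\phi) + (\cos\theta)(-r\sin\theta) = 0.
\end{aligned}
\tag{6.66}
\]
\[
\begin{aligned}
g_{r\phi} &= g_{ij}\frac{\partial x^i}{\partial r}\frac{\partial x^j}{\partial\phi} = \frac{\partial x^i}{\partial r}\frac{\partial x^i}{\partial\phi}, \quad \text{with summation over } i \\
&= \frac{\partial x}{\partial r}\frac{\partial x}{\partial\phi} + \frac{\partial y}{\partial r}\frac{\partial y}{\partial\phi} + \frac{\partial z}{\partial r}\frac{\partial z}{\partial\phi} \\
&= (\sin\theta\cos\phi)(-r\sin\theta\sin\phi) + (\sin\theta\sin\phi)(r\sin\theta\cos\phi) + (\cos\theta)(0) = 0.
\end{aligned}
\tag{6.67}
\]
\[
g_{\theta r} = g_{r\theta}, \quad \text{all metrics are symmetric}
\tag{6.68}
\]
\[
\begin{aligned}
g_{\theta\theta} &= g_{ij}\frac{\partial x^i}{\partial\theta}\frac{\partial x^j}{\partial\theta} = \left(\frac{\partial x^i}{\partial\theta}\right)^2, \quad \text{with summation over } i \\
&= \left(\frac{\partial x}{\partial\theta}\right)^2 + \left(\frac{\partial y}{\partial\theta}\right)^2 + \left(\frac{\partial z}{\partial\theta}\right)^2 \\
&= (r\cos\theta\cos\phi)^2 + (r\cos\theta\sin\phi)^2 + (-r\sin\theta)^2 = r^2.
\end{aligned}
\tag{6.69}
\]
\[
\begin{aligned}
g_{\theta\phi} &= g_{ij}\frac{\partial x^i}{\partial\theta}\frac{\partial x^j}{\partial\phi} = \frac{\partial x^i}{\partial\theta}\frac{\partial x^i}{\partial\phi}, \quad \text{with summation over } i \\
&= \frac{\partial x}{\partial\theta}\frac{\partial x}{\partial\phi} + \frac{\partial y}{\partial\theta}\frac{\partial y}{\partial\phi} + \frac{\partial z}{\partial\theta}\frac{\partial z}{\partial\phi} \\
&= (r\cos\theta\cos\phi)(-r\sin\theta\sin\phi) + (r\cos\theta\sin\phi)(r\sin\theta\cos\phi) + 0 = 0.
\end{aligned}
\tag{6.70}
\]
\[
g_{\phi r} = g_{r\phi}, \quad \text{all metrics are symmetric}
\tag{6.71}
\]
\[
g_{\phi\theta} = g_{\theta\phi}, \quad \text{all metrics are symmetric}
\tag{6.72}
\]
\[
\begin{aligned}
g_{\phi\phi} &= g_{ij}\frac{\partial x^i}{\partial\phi}\frac{\partial x^j}{\partial\phi} = \left(\frac{\partial x^i}{\partial\phi}\right)^2, \quad \text{with summation over } i \\
&= \left(\frac{\partial x}{\partial\phi}\right)^2 + \left(\frac{\partial y}{\partial\phi}\right)^2 + \left(\frac{\partial z}{\partial\phi}\right)^2 \\
&= (r\sin\theta\sin\phi)^2 + (r\sin\theta\cos\phi)^2 + 0 = r^2\sin^2\theta.
\end{aligned}
\tag{6.73}
\]
The above metric is consistent with that presented in matrix form in Eq. (6.19).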
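The nine sums above amount to forming $J^T J$, where J is the matrix of partial derivatives of (x, y, z) with respect to (r, θ, φ). A minimal numerical sketch of that check follows; the function names and the sample point are my own, and the Jacobian entries are typed in directly from the table of derivatives in this exercise.

```python
import math


def jacobian(r, theta, phi):
    """Rows: x, y, z. Columns: derivatives with respect to r, theta, phi."""
    st, ct = math.sin(theta), math.cos(theta)
    sp, cp = math.sin(phi), math.cos(phi)
    return [
        [st * cp, r * ct * cp, -r * st * sp],
        [st * sp, r * ct * sp, r * st * cp],
        [ct, -r * st, 0.0],
    ]


def induced_metric(jac):
    """g_{a'b'} = sum over i of (dx^i/dx^a')(dx^i/dx^b'), valid because g_ij = delta_ij."""
    n = len(jac[0])
    return [[sum(row[a] * row[b] for row in jac) for b in range(n)] for a in range(n)]


r, theta, phi = 2.0, 0.7, 1.3  # arbitrary sample point
g = induced_metric(jacobian(r, theta, phi))
expected = [
    [1.0, 0.0, 0.0],
    [0.0, r ** 2, 0.0],
    [0.0, 0.0, (r * math.sin(theta)) ** 2],
]
for computed_row, expected_row in zip(g, expected):
    print(computed_row, expected_row)
```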
28.(b) Deduce from Eq. (6.19) that the metric of the surface of a sphere of radius r has components
\[
g_{\theta\theta} = r^2, \qquad g_{\phi\phi} = r^2\sin^2\theta, \qquad g_{\theta\phi} = 0
\]
in spherical coordinates. On the surface of a sphere, the variable r is held fixed at the radius, so dr = 0 and the line element becomes
\[
dl^2 = r^2\, d\theta^2 + r^2\sin^2\theta\, d\phi^2,
\]
consistent with the metric given:
\[
g_{\theta\theta} = r^2, \qquad g_{\phi\phi} = r^2\sin^2\theta, \qquad g_{\theta\phi} = 0.
\]
28.(c) Find the components of $g^{\alpha\beta}$ for the sphere. Recall from elementary linear algebra, or the results of Exer. 3(c), that the inverse of a diagonal matrix is just the matrix of the inverses of the diagonal elements:
\[
\left(D^{-1}\right)_{ij} = \frac{1}{D_{ij}} \text{ when } i = j, \text{ and } 0 \text{ otherwise.}
\]
Applying that here we find
\[
g^{\theta\theta} = r^{-2}, \qquad g^{\phi\phi} = r^{-2}\sin^{-2}\theta, \qquad g^{\theta\phi} = 0.
\]
29. Find the Riemann curvature tensor on the surface of a sphere of radius r = 1 in polar coordinates. There’s only one independent component, $R_{\theta\phi\theta\phi}$; see Exer. 18(b) for explanation. Even though there’s only one component, and we have the metric from Exer. 28, this question still involves a lot of work. I suggest we keep r as a variable since it’s no extra effort and it gains us a more general result. Let’s break it into three steps. (i) Find the basis vectors $\vec e_\phi$ and $\vec e_\theta$.
30. Boring.
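31. Show that covariant differentiation obeys the usual product rule, e.g.
\[
\left(V^{\alpha\beta} W_{\beta\gamma}\right)_{;\mu} = V^{\alpha\beta}{}_{;\mu} W_{\beta\gamma} + V^{\alpha\beta} W_{\beta\gamma;\mu}
\]
Hint: Use a locally inertial frame. In a locally inertial frame, the Christoffel symbol vanishes and covariant derivatives equal partial derivatives, so
\[
\begin{aligned}
\left(V^{\alpha\beta} W_{\beta\gamma}\right)_{;\mu} &= \left(V^{\alpha\beta} W_{\beta\gamma}\right)_{,\mu} \\
&= \frac{\partial}{\partial x^\mu}\left(\sum_\beta V^{\alpha\beta} W_{\beta\gamma}\right), \quad \text{from here forward suspend usual summation convention} \\
&= \sum_\beta \frac{\partial}{\partial x^\mu}\left(V^{\alpha\beta} W_{\beta\gamma}\right) \\
&= \sum_\beta \left(W_{\beta\gamma}\frac{\partial}{\partial x^\mu} V^{\alpha\beta} + V^{\alpha\beta}\frac{\partial}{\partial x^\mu} W_{\beta\gamma}\right) \\
&= \sum_\beta \left(W_{\beta\gamma} V^{\alpha\beta}{}_{,\mu} + V^{\alpha\beta} W_{\beta\gamma,\mu}\right) \\
&= W_{\beta\gamma} V^{\alpha\beta}{}_{,\mu} + V^{\alpha\beta} W_{\beta\gamma,\mu}, \quad \text{reinvoke usual summation convention} \\
&= W_{\beta\gamma} V^{\alpha\beta}{}_{;\mu} + V^{\alpha\beta} W_{\beta\gamma;\mu}, \quad \text{because we’re in a locally inertial frame.}
\end{aligned}
\tag{6.74}
\]
The last equality is a valid tensor equation, valid in all reference frames.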
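32. A 4D manifold has coordinates (u, v, w, p) in which the metric has components
\[
g_{uv} = 1, \qquad g_{ww} = 1, \qquad g_{pp} = 1
\tag{6.75}
\]
and all other independent components vanishing (emphasis my own; by symmetry $g_{vu} = g_{uv} = 1$ as well). (a) Show that the manifold is flat and the signature is +2. By § 6.7 point (6), we have that a flat space has Riemann curvature identically zero, $R^\alpha{}_{\beta\mu\nu} = 0$. The Riemann curvature tensor was expressed in terms of the metric in Eq. (6.68). The important point is that it depends only upon 2nd derivatives of the metric. But the metric here is given as a constant, independent of event or point on the manifold. So the derivatives vanish and the Riemann curvature tensor is identically zero, $R^\alpha{}_{\beta\mu\nu} = 0$. To find the signature we only need to count the number of positive and negative eigenvalues. But since in (b) we will need to diagonalize the metric tensor, we might as well do it here. Then it’s trivial to find the eigenvalues. Simply by playing around, I found the following transformation worked:
\[
H = \begin{pmatrix}
-\frac{\sqrt 2}{2} & -\frac{\sqrt 2}{2} & 0 & 0 \\
+\frac{\sqrt 2}{2} & -\frac{\sqrt 2}{2} & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1
\end{pmatrix}
\]
Then $H^T (g) H = (\eta)$. Since (η) = diag(−1, 1, 1, 1), there is one negative and three positive eigenvalues and the signature is −1 + 1 + 1 + 1 = +2.
(b) Find the transformation to the usual coordinates t, x, y, z. We found the transformation matrix in (a) to be:
\[
H = \begin{pmatrix}
-\frac{\sqrt 2}{2} & -\frac{\sqrt 2}{2} & 0 & 0 \\
+\frac{\sqrt 2}{2} & -\frac{\sqrt 2}{2} & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1
\end{pmatrix}
\tag{6.76}
\]
This means that
\[
g_{\alpha\beta} = \eta_{\alpha\beta} = \Lambda^{\bar\alpha}{}_\alpha\, g_{\bar\alpha\bar\beta}\, \Lambda^{\bar\beta}{}_\beta
\]
where α, β are indices on the coordinates in (t, x, y, z) and $\bar\alpha$, $\bar\beta$ are indices on the coordinates in (u, v, w, p), and say $u^{\bar 0} = u$, $u^{\bar 1} = v$, . . . So,
\[
H = \left(\Lambda^{\bar\beta}{}_\beta\right) = \left(\frac{\partial u^{\bar\beta}}{\partial x^\beta}\right).
\]
Thus, integrating these derivatives, finally we arrive at:
\[
\begin{aligned}
u &= -\frac{\sqrt 2}{2}\, t - \frac{\sqrt 2}{2}\, x, \\
v &= +\frac{\sqrt 2}{2}\, t - \frac{\sqrt 2}{2}\, x, \\
w &= y, \\
p &= z.
\end{aligned}
\tag{6.77}
\]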
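The claim $H^T (g) H = (\eta)$ is quick to verify numerically. The sketch below assumes the matrix H as reconstructed from (6.77), with rows labelled by (u, v, w, p) and columns by (t, x, y, z), and the metric of (6.75) with $g_{vu} = g_{uv} = 1$; the helper functions `matmul` and `transpose` are my own.

```python
import math

s = math.sqrt(2) / 2

# H[row][col] = d(u, v, w, p)[row] / d(t, x, y, z)[col], read off from (6.77).
H = [
    [-s, -s, 0.0, 0.0],
    [+s, -s, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
]

# Metric in (u, v, w, p): g_uv = g_vu = 1, g_ww = g_pp = 1, all else zero.
g = [
    [0.0, 1.0, 0.0, 0.0],
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
]


def transpose(A):
    return [list(col) for col in zip(*A)]


def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


eta = matmul(transpose(H), matmul(g, H))
signature = sum(round(eta[i][i]) for i in range(4))

for row in eta:
    print([round(value, 12) for value in row])
print("signature:", signature)
```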
33. A 3-sphere is the 3D surface in a 4D Euclidean space (coordinates x, y, z, w), given by the equation
\[
x^2 + y^2 + z^2 + w^2 = r^2
\]
where r is the radius of the three-sphere. (a) Define new coordinates (r, θ, φ, χ) by the equations
\[
\begin{aligned}
w &= r\cos\chi, \\
z &= r\sin\chi\cos\theta, \\
y &= r\sin\chi\sin\theta\sin\phi, \\
x &= r\sin\chi\sin\theta\cos\phi.
\end{aligned}
\tag{6.78}
\]
Show that (θ, φ, χ) are coordinates for the sphere. If we simply substitute these equations (6.78) into the equation for the 3-sphere, we find that the equation is satisfied for fixed r for all values of (θ, φ, χ). So these coordinates can vary and we “stay on the 3-sphere”. But to show that these are truly coordinates, I believe we must also show that the transformation defined by (6.78) is not singular, c.f. Eq. (5.6). After a whole lot of algebra, I found the determinant
\[
\det\begin{pmatrix}
\dfrac{\partial w}{\partial r} & \dfrac{\partial w}{\partial\theta} & \dfrac{\partial w}{\partial\phi} & \dfrac{\partial w}{\partial\chi} \\
\dfrac{\partial z}{\partial r} & \dfrac{\partial z}{\partial\theta} & \cdots & \\
\dfrac{\partial y}{\partial r} & \cdots & & \\
\dfrac{\partial x}{\partial r} & \cdots & & \dfrac{\partial x}{\partial\chi}
\end{pmatrix} = -r^3\sin^2(\chi)\sin(\theta).
\]
So just as in spherical-polar coordinates there are singular points, but the transformation is generally non-singular. (b) Find the metric tensor of the three-sphere of radius r in these coordinates. (Use method of Exer. 28.) The off-diagonal terms of the metric tensor are zero if the basis vectors are orthogonal. In spherical coordinates this was obvious but it’s not a priori obvious for the three-sphere (at least for me). So I simply calculated all the terms of the metric tensor. There are only 6 independent terms among the 9 (because of symmetry, $g_{\bar\alpha\bar\beta} = g_{\bar\beta\bar\alpha}$). Let’s use an overbar to indicate indices on the bases in θ, φ, χ, with
\[
x^{\bar 1} = \theta, \qquad x^{\bar 2} = \phi, \qquad x^{\bar 3} = \chi,
\tag{6.79}
\]
and indices without overbar to indicate the original coordinates in x, y, z, w. Then in general
\[
g_{\bar\alpha\bar\beta} = g_{\alpha\beta}\, \Lambda^\alpha{}_{\bar\alpha}\, \Lambda^\beta{}_{\bar\beta}
\tag{6.80}
\]
where
\[
\Lambda^\alpha{}_{\bar\alpha} = \frac{\partial x^\alpha}{\partial x^{\bar\alpha}}.
\tag{6.81}
\]
The calculus is tedious but straightforward. For instance,
\[
\begin{aligned}
g_{\bar 1\bar 1} = g_{\theta\theta} &= g_{xx}\left(\frac{\partial x}{\partial\theta}\right)^2 + g_{yy}\left(\frac{\partial y}{\partial\theta}\right)^2 + g_{zz}\left(\frac{\partial z}{\partial\theta}\right)^2 + g_{ww}\left(\frac{\partial w}{\partial\theta}\right)^2 \\
&= r^2\sin^2\chi\cos^2\theta\cos^2\phi + r^2\sin^2\chi\cos^2\theta\sin^2\phi + r^2\sin^2\chi\sin^2\theta \\
&= r^2\sin^2\chi.
\end{aligned}
\tag{6.82}
\]
The only question for me was “what’s the metric tensor in the 4D space”? It turns out that to get the right answer one must assume
\[
g_{\alpha\beta} = \begin{cases} +1 & \text{if } \alpha = \beta, \\ 0 & \text{if } \alpha \neq \beta. \end{cases}
\tag{6.83}
\]
In a similar manner one can easily show the off-diagonal terms are zero.
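Since the algebra here is tedious, a numerical spot check is reassuring. The sketch below differentiates the embedding (6.78) by central differences (so no hand-computed derivatives are assumed), forms the induced metric on (θ, φ, χ) with the Euclidean 4D metric of (6.83), and evaluates the Jacobian determinant with rows (w, z, y, x) and columns (r, θ, φ, χ). The function names, step size and sample point are my own choices. The values it is compared against are $g_{\theta\theta} = r^2\sin^2\chi$, $g_{\phi\phi} = r^2\sin^2\chi\sin^2\theta$, $g_{\chi\chi} = r^2$, and determinant $-r^3\sin^2\chi\sin\theta$.

```python
import math
from itertools import permutations


def embed(q):
    """Eq. (6.78): map (r, theta, phi, chi) to (w, z, y, x)."""
    r, theta, phi, chi = q
    return (
        r * math.cos(chi),
        r * math.sin(chi) * math.cos(theta),
        r * math.sin(chi) * math.sin(theta) * math.sin(phi),
        r * math.sin(chi) * math.sin(theta) * math.cos(phi),
    )


def jacobian(q, h=1e-6):
    """J[i][a] = d(embed_i)/d(q_a), estimated by central differences."""
    J = [[0.0] * 4 for _ in range(4)]
    for a in range(4):
        qp, qm = list(q), list(q)
        qp[a] += h
        qm[a] -= h
        fp, fm = embed(qp), embed(qm)
        for i in range(4):
            J[i][a] = (fp[i] - fm[i]) / (2 * h)
    return J


def det(M):
    """Determinant by the Leibniz permutation sum (fine for a 4x4 matrix)."""
    n = len(M)
    total = 0.0
    for perm in permutations(range(n)):
        inversions = sum(
            1 for i in range(n) for j in range(i + 1, n) if perm[i] > perm[j]
        )
        term = -1.0 if inversions % 2 else 1.0
        for row, col in enumerate(perm):
            term *= M[row][col]
        total += term
    return total


def induced_metric(J, cols):
    """Metric on the chosen coordinates, using g_ab = delta_ab in the 4D space."""
    return [[sum(J[i][a] * J[i][b] for i in range(4)) for b in cols] for a in cols]


q = (1.5, 0.8, 0.3, 1.1)  # arbitrary sample point (r, theta, phi, chi)
r, theta, phi, chi = q
J = jacobian(q)
g = induced_metric(J, cols=(1, 2, 3))  # theta, phi, chi
expected_g = [
    [(r * math.sin(chi)) ** 2, 0.0, 0.0],
    [0.0, (r * math.sin(chi) * math.sin(theta)) ** 2, 0.0],
    [0.0, 0.0, r ** 2],
]
expected_det = -(r ** 3) * math.sin(chi) ** 2 * math.sin(theta)
print(g)
print(det(J), expected_det)
```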
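34. Establish the following identities for a general metric tensor in a general coordinate system. Eqs. (6.39) and (6.40) are sometimes helpful. In my humble opinion, these problems provide some nice practice with tensor calculus, but are not essential for understanding the material of Chapter 6.
(a)
\[
\Gamma^\mu{}_{\mu\nu} = \frac{1}{2}\left(\ln|g|\right)_{,\nu}
\]
\[
\begin{aligned}
\Gamma^\mu{}_{\mu\nu} &= \Gamma^\mu{}_{\nu\mu}, \quad \text{cf. Exer. 5} \\
&= \frac{1}{2} g^{\alpha\beta} g_{\alpha\beta,\nu}, \quad \text{cf. Eq. (6.38)} \\
&= \frac{1}{2}\frac{g_{,\nu}}{g}, \quad \text{using Eq. (6.39)} \\
&= \frac{1}{2}\left(\ln|g|\right)_{,\nu}, \quad \text{chain rule.}
\end{aligned}
\tag{6.84}
\]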
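To see identity (a) at work on a metric that is not diagonal, here is a small numerical sketch. The test metric is my own choice, not from Schutz: the 2D metric $g_{ab} = \delta_{ab} + x_a x_b$, whose determinant is $1 + u^2 + v^2$, so that the right-hand side is $x_\nu/(1 + u^2 + v^2)$. The Christoffel symbols are built from Eq. (6.32) with central differences; all function names are hypothetical helpers.

```python
def metric(x):
    """Assumed test metric g_ab = delta_ab + x_a x_b in two dimensions."""
    u, v = x
    return [[1 + u * u, u * v], [u * v, 1 + v * v]]


def inverse2(g):
    """Inverse of a 2x2 matrix."""
    d = g[0][0] * g[1][1] - g[0][1] * g[1][0]
    return [[g[1][1] / d, -g[0][1] / d], [-g[1][0] / d, g[0][0] / d]]


def d_metric(x, c, h=1e-5):
    """Central-difference estimate of g_{ab,c}, returned as a 2x2 list."""
    xp, xm = list(x), list(x)
    xp[c] += h
    xm[c] -= h
    gp, gm = metric(xp), metric(xm)
    return [[(gp[a][b] - gm[a][b]) / (2 * h) for b in range(2)] for a in range(2)]


def christoffel(x):
    """gamma[a][b][c] = Gamma^a_{bc} from Eq. (6.32)."""
    ginv = inverse2(metric(x))
    dg = [d_metric(x, c) for c in range(2)]  # dg[c][a][b] = g_{ab,c}
    return [
        [
            [
                0.5 * sum(
                    ginv[a][s] * (dg[c][s][b] + dg[b][s][c] - dg[s][b][c])
                    for s in range(2)
                )
                for c in range(2)
            ]
            for b in range(2)
        ]
        for a in range(2)
    ]


x0 = (0.4, -0.7)  # arbitrary sample point (u, v)
gamma = christoffel(x0)
lhs = [sum(gamma[mu][mu][nu] for mu in range(2)) for nu in range(2)]
# (1/2) d/dx^nu ln(1 + u^2 + v^2) = x^nu / (1 + u^2 + v^2)
rhs = [x0[nu] / (1 + x0[0] ** 2 + x0[1] ** 2) for nu in range(2)]
print(lhs)
print(rhs)
```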
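(b) Personally I found this by far the hardest of these 5 identities. If you have trouble with this one, try (d) first.
\[
g^{\mu\nu}\Gamma^\alpha{}_{\mu\nu} = -\frac{\left(g^{\alpha\beta}\sqrt{-g}\right)_{,\beta}}{\sqrt{-g}}
\]
Expand the RHS using the product rule of differential calculus,
\[
\begin{aligned}
-\frac{\left(g^{\alpha\beta}\sqrt{-g}\right)_{,\beta}}{\sqrt{-g}} &= -g^{\alpha\beta}\frac{\left(\sqrt{-g}\right)_{,\beta}}{\sqrt{-g}} - g^{\alpha\beta}{}_{,\beta} \\
&= -g^{\alpha\beta}\Gamma^\sigma{}_{\beta\sigma} - g^{\alpha\beta}{}_{,\beta}, \quad \text{using Eq. (6.40)} \\
&= -\frac{1}{2} g^{\alpha\beta} g^{\mu\nu} g_{\mu\nu,\beta} - g^{\alpha\beta}{}_{,\beta}, \quad \text{using Eq. (6.38)}
\end{aligned}
\tag{6.85}
\]
Expand the LHS using Eq. (6.32):
\[
\begin{aligned}
g^{\mu\nu}\Gamma^\alpha{}_{\mu\nu} &= g^{\mu\nu}\left[\frac{1}{2} g^{\alpha\beta}\left(g_{\beta\mu,\nu} + g_{\beta\nu,\mu} - g_{\mu\nu,\beta}\right)\right] \\
&= -\frac{1}{2} g^{\alpha\beta} g^{\mu\nu} g_{\mu\nu,\beta} + \frac{1}{2} g^{\mu\nu} g^{\alpha\beta}\left(g_{\beta\mu,\nu} + g_{\beta\nu,\mu}\right), \quad \text{rearranging.}
\end{aligned}
\tag{6.86}
\]
Subtracting these two results we find it remains to prove only that
\[
-g^{\alpha\beta}{}_{,\beta} = \frac{1}{2} g^{\mu\nu} g^{\alpha\beta}\left(g_{\beta\mu,\nu} + g_{\beta\nu,\mu}\right)
\tag{6.87}
\]
This follows easily using the result from (d):
\[
\begin{aligned}
\frac{1}{2} g^{\mu\nu} g^{\alpha\beta} g_{\beta\mu,\nu} &= -\frac{1}{2} g^{\mu\nu} g^{\alpha\beta}{}_{,\nu}\, g_{\beta\mu} \\
&= -\frac{1}{2} g^\nu{}_\beta\, g^{\alpha\beta}{}_{,\nu} \\
&= -\frac{1}{2} g^{\alpha\beta}{}_{,\beta}.
\end{aligned}
\tag{6.88}
\]
Similarly for the other term, giving the result we required.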
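(c) Suppose $F^{\mu\nu}$ is antisymmetric. Then
\[
F^{\mu\nu}{}_{;\nu} = \frac{\left(F^{\mu\nu}\sqrt{-g}\right)_{,\nu}}{\sqrt{-g}}
\]
Expand the RHS using the product rule of differential calculus,
\[
\begin{aligned}
\frac{\left(F^{\mu\nu}\sqrt{-g}\right)_{,\nu}}{\sqrt{-g}} &= F^{\mu\nu}\frac{\left(\sqrt{-g}\right)_{,\nu}}{\sqrt{-g}} + F^{\mu\nu}{}_{,\nu} \\
&= F^{\mu\nu}\Gamma^\alpha{}_{\nu\alpha} + F^{\mu\nu}{}_{,\nu}, \quad \text{using Eq. (6.40)}
\end{aligned}
\tag{6.89}
\]
Expand the LHS using Eq. (6.35):
\[
\begin{aligned}
F^{\mu\nu}{}_{;\nu} &= F^{\mu\nu}{}_{,\nu} + F^{\mu\sigma}\Gamma^\nu{}_{\sigma\nu} + F^{\sigma\nu}\Gamma^\mu{}_{\sigma\nu} \\
&= F^{\mu\nu}{}_{,\nu} + F^{\mu\nu}\Gamma^\alpha{}_{\nu\alpha}, \quad \text{relabelling dummy indices.}
\end{aligned}
\tag{6.90}
\]
The 3rd term vanished because $F^{\sigma\nu}$ is antisymmetric (given) and $\Gamma^\mu{}_{\sigma\nu}$ is symmetric in σν and they are contracted on these indices (vanishes by the results of Exer. 26 of § 3.10).
(d) Prove that
\[
g^{\alpha\sigma} g_{\sigma\beta,\gamma} = -g_{\sigma\beta}\, g^{\alpha\sigma}{}_{,\gamma}
\tag{6.91}
\]
in all bases. Solution: $g^{\alpha\sigma} g_{\sigma\beta} = g^\alpha{}_\beta = \delta^\alpha{}_\beta$ in all bases because the inverse metric tensor is, by definition, the inverse of the metric tensor. Simply differentiate this formula:
\[
\begin{aligned}
\left(g^{\alpha\sigma} g_{\sigma\beta}\right)_{,\gamma} &= \delta^\alpha{}_{\beta,\gamma} \\
g^{\alpha\sigma} g_{\sigma\beta,\gamma} + g^{\alpha\sigma}{}_{,\gamma}\, g_{\sigma\beta} &= 0 \\
g^{\alpha\sigma} g_{\sigma\beta,\gamma} &= -g_{\sigma\beta}\, g^{\alpha\sigma}{}_{,\gamma}.
\end{aligned}
\tag{6.92}
\]
(e) Prove
\[
g^{\mu\nu}{}_{,\alpha} = -\Gamma^\mu{}_{\beta\alpha}\, g^{\beta\nu} - \Gamma^\nu{}_{\beta\alpha}\, g^{\mu\beta}
\]
This is a one-liner that follows immediately from Eq. (6.31) and (6.35).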
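35. Given the line element
\[
ds^2 = -\exp(2\Phi(r))\, dt^2 + \exp(2\Lambda(r))\, dr^2 + r^2\, d\theta^2 + r^2\sin^2\theta\, d\phi^2
\]
find the Riemann curvature tensor. I found this problem instructive on several levels. It should be done after (or concurrently with) Exer. 18, for it helps clarify the symmetry relations and the implied reduction in degrees of freedom of the Riemann curvature tensor. It also helps reveal how much information is packed in the line element equation! Later we’ll learn that the given form of the metric leads to the Schwarzschild metric, which represents the simplest solution of the Einstein equations. It is therefore extremely useful to know. (i) The coordinates are {t, r, θ, φ}. This is clear because these form the differential variables of the line element. The metric tensor is
\[
(g_{\alpha\beta}) = \begin{pmatrix}
-\exp(2\Phi) & 0 & 0 & 0 \\
0 & \exp(2\Lambda) & 0 & 0 \\
0 & 0 & r^2 & 0 \\
0 & 0 & 0 & r^2\sin^2\theta
\end{pmatrix}
\]
A fair question is “why is the metric tensor diagonal”? The answer is that there are no cross-terms in the line element,
\[
dl = \left|g_{\alpha\beta}\, dx^\alpha dx^\beta\right|^{1/2},
\]
which was given just before Eq. (6.6). (ii) The inverse of the metric tensor:
\[
(g^{\alpha\beta}) = \begin{pmatrix}
-\exp(-2\Phi) & 0 & 0 & 0 \\
0 & \exp(-2\Lambda) & 0 & 0 \\
0 & 0 & r^{-2} & 0 \\
0 & 0 & 0 & r^{-2}\sin^{-2}\theta
\end{pmatrix}
\]
See Exer. 3 for computing the inverse of a diagonal matrix.
The Christoffel symbol can be computed from the metric tensor using Eq. (6.32). One needs the first derivatives of the metric tensor:
\[
\begin{aligned}
g_{tt,r} &= \frac{\partial}{\partial r}\left[-\exp(2\Phi)\right] = -2\exp(2\Phi)\, \Phi', \\
g_{rr,r} &= \frac{\partial}{\partial r}\left[\exp(2\Lambda)\right] = 2\exp(2\Lambda)\, \Lambda', \\
g_{\theta\theta,r} &= \frac{\partial}{\partial r}\left[r^2\right] = 2r, \\
g_{\phi\phi,r} &= \frac{\partial}{\partial r}\left[r^2\sin^2(\theta)\right] = 2r\sin^2(\theta), \\
g_{\phi\phi,\theta} &= \frac{\partial}{\partial\theta}\left[r^2\sin^2(\theta)\right] = 2r^2\sin(\theta)\cos(\theta) = r^2\sin(2\theta).
\end{aligned}
\tag{6.93}
\]
All other first derivatives of the metric tensor are zero. Here are the nonzero Christoffel symbols (with indices 0, 1, 2, 3 for t, r, θ, φ, and those related by symmetry in the lower indices not repeated):
\[
\begin{aligned}
\Gamma^0{}_{01} &= \Phi' \\
\Gamma^1{}_{00} &= \exp(-2\Lambda)\exp(2\Phi)\, \Phi' \\
\Gamma^1{}_{11} &= \Lambda' \\
\Gamma^1{}_{22} &= -\exp(-2\Lambda)\, r \\
\Gamma^1{}_{33} &= -\exp(-2\Lambda)\, r\sin^2(\theta) \\
\Gamma^2{}_{12} &= \frac{1}{r} \\
\Gamma^2{}_{33} &= -\sin(\theta)\cos(\theta) \\
\Gamma^3{}_{13} &= \frac{1}{r} \\
\Gamma^3{}_{23} &= \frac{\cos(\theta)}{\sin(\theta)}
\end{aligned}
\tag{6.94}
\]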
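Given that my later answers disagree with Schutz, it seems worth checking the Christoffel symbols (6.94) independently. The sketch below evaluates Eq. (6.32) for the diagonal metric above with central differences and compares against (6.94). The particular functions $\Phi(r) = 0.1 r^2$ and $\Lambda(r) = 0.3 r$ are arbitrary test choices of mine (any smooth functions would do), as are the function names and the sample point.

```python
import math


def phi(r):
    return 0.1 * r * r  # assumed test function Phi(r)


def dphi(r):
    return 0.2 * r  # its derivative Phi'(r)


def lam(r):
    return 0.3 * r  # assumed test function Lambda(r)


def dlam(r):
    return 0.3  # its derivative Lambda'(r)


def metric_diag(x):
    """Diagonal of g_ab for coordinates x = (t, r, theta, phi)."""
    _, r, th, _ = x
    return [-math.exp(2 * phi(r)), math.exp(2 * lam(r)), r * r, (r * math.sin(th)) ** 2]


def d_metric_diag(x, k, h=1e-6):
    """Central-difference derivative of the diagonal metric entries along x^k."""
    xp, xm = list(x), list(x)
    xp[k] += h
    xm[k] -= h
    return [(p - m) / (2 * h) for p, m in zip(metric_diag(xp), metric_diag(xm))]


def christoffel(a, b, c, x):
    """Gamma^a_{bc} from Eq. (6.32), specialised to a diagonal metric."""
    g = metric_diag(x)

    def dg(i, j, k):  # g_{ij,k}, zero off the diagonal
        return d_metric_diag(x, k)[i] if i == j else 0.0

    return 0.5 / g[a] * (dg(a, b, c) + dg(a, c, b) - dg(b, c, a))


x0 = (0.0, 2.0, 0.9, 0.4)  # arbitrary sample point (t, r, theta, phi)
r, th = x0[1], x0[2]
expected = {  # Eq. (6.94), keyed by (upper, lower, lower)
    (0, 0, 1): dphi(r),
    (1, 0, 0): math.exp(2 * phi(r) - 2 * lam(r)) * dphi(r),
    (1, 1, 1): dlam(r),
    (1, 2, 2): -math.exp(-2 * lam(r)) * r,
    (1, 3, 3): -math.exp(-2 * lam(r)) * r * math.sin(th) ** 2,
    (2, 1, 2): 1 / r,
    (2, 3, 3): -math.sin(th) * math.cos(th),
    (3, 1, 3): 1 / r,
    (3, 2, 3): math.cos(th) / math.sin(th),
}

max_error = 0.0
for a in range(4):
    for b in range(4):
        for c in range(4):
            # Symbols not listed (up to symmetry in the lower indices) should vanish.
            target = expected.get((a, b, c), expected.get((a, c, b), 0.0))
            max_error = max(max_error, abs(christoffel(a, b, c, x0) - target))
print("largest deviation from Eq. (6.94):", max_error)
```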
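(iii) Deciding the 20 terms of $R_{\alpha\beta\mu\nu}$ to calculate. This is not as simple as it might sound. Here it helps tremendously if one has solved Exer. 18. Recall $R_{\alpha\alpha\mu\nu} = 0$, as does $R_{\alpha\beta\nu\nu}$, because of Eq. (6.69) (see Exer. 18). We organize the terms as recommended in the hint for Exer. 18: we choose pairs of α ≠ β (there are 6 of them accounting for the fact that order doesn’t matter), and similarly there are 6 pairs of µ ≠ ν. These would give 6 × 6 = 36 elements, but, because of the symmetry $R_{\alpha\beta\mu\nu} = R_{\mu\nu\alpha\beta}$, we must divide the number of off-diagonal elements by two to get 5 × 6/2 + 6 = 21. (We’ll deal with the reduction to 20 by Eq. (6.70) in a minute.) I found it too difficult to attempt to write down these terms immediately based on the above prescription. Instead it was much easier to write down all 6 × 6 = 36 terms and eliminate the lower triangle:
\[
\begin{array}{cccccc}
R_{trtr} & R_{trt\theta} & R_{trt\phi} & R_{trr\theta} & R_{trr\phi} & R_{tr\theta\phi} \\
R_{t\theta tr} & R_{t\theta t\theta} & R_{t\theta t\phi} & R_{t\theta r\theta} & R_{t\theta r\phi} & R_{t\theta\theta\phi} \\
R_{t\phi tr} & R_{t\phi t\theta} & R_{t\phi t\phi} & R_{t\phi r\theta} & R_{t\phi r\phi} & R_{t\phi\theta\phi} \\
R_{r\theta tr} & R_{r\theta t\theta} & R_{r\theta t\phi} & R_{r\theta r\theta} & R_{r\theta r\phi} & R_{r\theta\theta\phi} \\
R_{r\phi tr} & R_{r\phi t\theta} & R_{r\phi t\phi} & R_{r\phi r\theta} & R_{r\phi r\phi} & R_{r\phi\theta\phi} \\
R_{\theta\phi tr} & R_{\theta\phi t\theta} & R_{\theta\phi t\phi} & R_{\theta\phi r\theta} & R_{\theta\phi r\phi} & R_{\theta\phi\theta\phi}
\end{array}
\tag{6.95}
\]
Note that we’re not writing them down randomly. Instead, we step the 2nd pair of indices, i.e. µν, systematically by increasing the ν most rapidly with increasing column, µ more slowly with increasing column. Similarly we increase the first pair of indices, i.e. αβ, with row, and β more rapidly than α. These were arbitrary choices of course, but having a system and sticking to it makes it easy. Recall only the upper triangle is necessary to determine the tensor because of the symmetry $R_{\alpha\beta\mu\nu} = R_{\mu\nu\alpha\beta}$:
\[
\begin{array}{cccccc}
R_{trtr} & R_{trt\theta} & R_{trt\phi} & R_{trr\theta} & R_{trr\phi} & R_{tr\theta\phi} \\
 & R_{t\theta t\theta} & R_{t\theta t\phi} & R_{t\theta r\theta} & R_{t\theta r\phi} & R_{t\theta\theta\phi} \\
 & & R_{t\phi t\phi} & R_{t\phi r\theta} & R_{t\phi r\phi} & R_{t\phi\theta\phi} \\
 & & & R_{r\theta r\theta} & R_{r\theta r\phi} & R_{r\theta\theta\phi} \\
 & & & & R_{r\phi r\phi} & R_{r\phi\theta\phi} \\
 & & & & & R_{\theta\phi\theta\phi}
\end{array}
\tag{6.96}
\]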
Now we must also impose the condition in Eq. (6.70), which, as explained in Exer. 18, only applies when none of the indices are equal. There are three such terms in (6.96): $R_{tr\theta\phi}$, $R_{t\theta r\phi}$ and $R_{t\phi r\theta}$. One of these can be determined from the other two. Let’s evaluate a few of these in full detail. It’s important to use Eq. (6.63), which is true in all coordinate bases, and not Eq. (6.68), which is only true in a local inertial frame. From Eq. (6.63) and the Christoffel symbols above (6.94) we find
$$
R_{trtr} = g_{tt} R^{t}{}_{rtr}
= -\exp(2\Phi)\left[-\Gamma^{0}{}_{01,r} + \Gamma^{0}{}_{10}\Gamma^{1}{}_{11} - \Gamma^{0}{}_{01}\Gamma^{0}{}_{01}\right]
= \exp(2\Phi)\left[(\Phi')^2 + \Phi'' - \Phi'\Lambda'\right]
\tag{6.97}
$$
It’s important to note that if one were to use Eq. (6.68),
$$
R_{\alpha\beta\mu\nu} = \frac{1}{2}\left(g_{\alpha\nu,\beta\mu} - g_{\alpha\mu,\beta\nu} + g_{\beta\mu,\alpha\nu} - g_{\beta\nu,\alpha\mu}\right),
$$
one would miss the cross term:
$$
\begin{aligned}
R_{trtr} &= \frac{1}{2}\left(g_{tr,rt} - g_{tt,rr} + g_{rt,tr} - g_{rr,tt}\right) \\
&= 0 - \frac{1}{2}\left(-4\exp(2\Phi)(\Phi')^2 - 2\exp(2\Phi)\Phi''\right) + 0 - 0 \\
&= 2\exp(2\Phi)(\Phi')^2 + \exp(2\Phi)\Phi''
\end{aligned}
\tag{6.98}
$$
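I’m finding my answers disagree with those provided by Schutz, so it’s important to work these out in more detail. For the next one, $R_{t\theta t\theta}$, I found
$$
\begin{aligned}
R_{t\theta t\theta} &= g_{tt}\left[\Gamma^{0}{}_{22,t} - \Gamma^{0}{}_{20,\theta} + \Gamma^{0}{}_{\sigma 0}\Gamma^{\sigma}{}_{22} - \Gamma^{0}{}_{\sigma 2}\Gamma^{\sigma}{}_{20}\right] \\
&= g_{tt}\,\Gamma^{0}{}_{\sigma 0}\Gamma^{\sigma}{}_{22} \\
&= g_{tt}\,\Gamma^{0}{}_{10}\Gamma^{1}{}_{22} \\
&= -\exp(2\Phi)\left[\Phi'\left(-r\exp(-2\Lambda)\right)\right] \\
&= +r\Phi'\exp(2\Phi - 2\Lambda)
\end{aligned}
\tag{6.99}
$$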
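After a lot of algebra one finds that only the diagonal elements of (6.96) are nonzero:
$$
(R_{\alpha\beta\mu\nu}) =
\begin{pmatrix}
R_{trtr} & 0 & 0 & 0 & 0 & 0 \\
0 & R_{t\theta t\theta} & 0 & 0 & 0 & 0 \\
0 & 0 & R_{t\phi t\phi} & 0 & 0 & 0 \\
0 & 0 & 0 & R_{r\theta r\theta} & 0 & 0 \\
0 & 0 & 0 & 0 & R_{r\phi r\phi} & 0 \\
0 & 0 & 0 & 0 & 0 & R_{\theta\phi\theta\phi}
\end{pmatrix}
\tag{6.100}
$$
and the other $256 - 36 = 220$ terms are determined by the symmetry relations in Eq. (6.69). These 6 nonzero terms are
$$
\begin{aligned}
R_{trtr} &= \exp(2\Phi)\left[(\Phi')^2 + \Phi'' - \Phi'\Lambda'\right] \\
R_{t\theta t\theta} &= r\Phi'\exp(2\Phi)\exp(-2\Lambda) \\
R_{t\phi t\phi} &= \sin^2(\theta)\, r\Phi'\exp(2\Phi)\exp(-2\Lambda) \\
R_{r\theta r\theta} &= r\Lambda' \\
R_{r\phi r\phi} &= r\Lambda'\sin^2(\theta) \\
R_{\theta\phi\theta\phi} &= -r^2\left[\cos^2(\theta) - 1 + \exp(-2\Lambda) - \cos^2(\theta)\exp(-2\Lambda)\right] \\
&= r^2\sin^2(\theta)\left(1 - \exp(-2\Lambda)\right)
\end{aligned}
\tag{6.101}
$$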
36. A four-dimensional manifold has coordinates $(t, x, y, z)$ and a line element
$$ds^2 = -(1 + 2\phi)\,dt^2 + (1 - 2\phi)(dx^2 + dy^2 + dz^2),$$
with $|\phi(t, x, y, z)| \ll 1$ everywhere. At any point $P$ with coordinates $(t_0, x_0, y_0, z_0)$, find a coordinate transformation to a locally inertial coordinate system, to first order in $\phi$. At what rate does such a frame accelerate with respect to the original coordinates, again to first order in $\phi$?

I’ve noticed in the other chapters that the questions near the end of the Exercises section generally anticipate results we’ll need later. This is probably of that genre, since I don’t immediately see the point of it (i.e. it’s probably important). (Indeed this metric will reappear in Chapter 7, Eq. (7.8).) First we need the metric, which we can find as we did in Exer. 35. Here the metric is
$$
(g_{\alpha\beta}) =
\begin{pmatrix}
-(1 + 2\phi) & 0 & 0 & 0 \\
0 & (1 - 2\phi) & 0 & 0 \\
0 & 0 & (1 - 2\phi) & 0 \\
0 & 0 & 0 & (1 - 2\phi)
\end{pmatrix}
$$
So we seek a transformation $\Lambda^{\alpha}{}_{\bar\alpha}$ such that
$$\eta_{\bar\alpha\bar\beta} = \Lambda^{\alpha}{}_{\bar\alpha}\, g_{\alpha\beta}\, \Lambda^{\beta}{}_{\bar\beta}.$$
By inspection (which was aided by solving Exer. 3), we see that
$$
(\Lambda^{\alpha}{}_{\bar\alpha}) =
\begin{pmatrix}
(1 + 2\phi_P)^{-1/2} & 0 & 0 & 0 \\
0 & (1 - 2\phi_P)^{-1/2} & 0 & 0 \\
0 & 0 & (1 - 2\phi_P)^{-1/2} & 0 \\
0 & 0 & 0 & (1 - 2\phi_P)^{-1/2}
\end{pmatrix}
$$
where $\phi_P = \phi(P) = \phi(t_0, x_0, y_0, z_0)$. The question specifically asks for the “coordinate transformation to a locally inertial coordinate system, to first order in $\phi$”. I guess that means Schutz wants us to approximate this transformation, perhaps using the binomial theorem: $(1 + 2\phi_P)^{-1/2} = (1 - \phi_P) + O(\phi_P^2)$. I’m not sure, because I don’t see the point of this. In any case one would get
$$
(\Lambda^{\alpha}{}_{\bar\alpha}) \approx
\begin{pmatrix}
(1 - \phi_P) & 0 & 0 & 0 \\
0 & (1 + \phi_P) & 0 & 0 \\
0 & 0 & (1 + \phi_P) & 0 \\
0 & 0 & 0 & (1 + \phi_P)
\end{pmatrix}
$$
I confess I don’t know what is meant by “the rate such a frame accelerates with respect to the original coordinates”. But it is clear that this transformation works exactly only at the event $P$, and with the binomial approximation it only applies approximately there as well. One can write down how the transformed metric departs from the local inertial metric, $\eta$, by simply applying the approximation (writing the diagonal elements only):
$$
\begin{aligned}
g_{\bar\alpha\bar\beta} &\approx \mathrm{diag}\left(-(1 + 2\phi)(1 - \phi_P)^2,\ (1 - 2\phi)(1 + \phi_P)^2,\ (1 - 2\phi)(1 + \phi_P)^2,\ (1 - 2\phi)(1 + \phi_P)^2\right) \\
&\approx \mathrm{diag}\left(-(1 + 2\phi)(1 - 2\phi_P),\ (1 - 2\phi)(1 + 2\phi_P),\ (1 - 2\phi)(1 + 2\phi_P),\ (1 - 2\phi)(1 + 2\phi_P)\right) \\
&\approx \mathrm{diag}\left(-(1 + 2(\phi - \phi_P)),\ (1 - 2(\phi - \phi_P)),\ (1 - 2(\phi - \phi_P)),\ (1 - 2(\phi - \phi_P))\right).
\end{aligned}
\tag{6.102}
$$
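A quick numerical check of this first-order transformation may be reassuring. The sketch below is my own illustration (the function name `transformed_metric` and the sample values of $\phi$ are assumptions made for the example). Because everything is diagonal, the transformation $\Lambda^{\alpha}{}_{\bar\alpha}\, g_{\alpha\beta}\, \Lambda^{\beta}{}_{\bar\beta}$ reduces to multiplying each diagonal metric element by the square of the corresponding element of $\Lambda$.

```python
def transformed_metric(phi_P, phi):
    """Diagonal of Lambda^T g Lambda using the first-order Lambda built at P.

    phi_P is the potential at the event P; phi is the potential at the point
    where the metric is evaluated (equal to phi_P at P itself).
    """
    lam = [1 - phi_P, 1 + phi_P, 1 + phi_P, 1 + phi_P]
    g = [-(1 + 2 * phi), 1 - 2 * phi, 1 - 2 * phi, 1 - 2 * phi]
    return [l * g_aa * l for l, g_aa in zip(lam, g)]


phi_P = 1e-4                      # illustrative small potential at P
at_P = transformed_metric(phi_P, phi_P)
nearby = transformed_metric(phi_P, phi_P + 1e-5)

# At P the result differs from diag(-1, 1, 1, 1) only at second order in phi
print([round(x, 6) for x in at_P])
# Away from P the departure is first order in (phi - phi_P)
print([x - e for x, e in zip(nearby, [-1.0, 1.0, 1.0, 1.0])])
```

At $P$ the residual is of order $\phi_P^2$, while away from $P$ the departure from $\eta$ is about $\mp 2(\phi - \phi_P)$, as in the last line of the approximation above.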
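37. (a) ‘Proper volume’ of a 2D manifold is usually called ‘proper area’. Using the metric of Exer. 28, integrate Eq. (6.18) to find the proper area of a sphere of radius $r$.
$$
\begin{aligned}
\int\!\!\int g^{1/2}\, dx^1\, dx^2 &= \int_0^{\pi}\!\!\int_0^{2\pi} g^{1/2}\, d\phi\, d\theta && \text{changed sign of } \det|g|, \text{ cf. Eq. (6.19)} \\
&= \int_0^{\pi}\!\!\int_0^{2\pi} \left(r^2\, r^2\sin^2\theta\right)^{1/2} d\phi\, d\theta \\
&= \int_0^{\pi}\!\!\int_0^{2\pi} r^2\sin\theta\; d\phi\, d\theta \\
&= r^2\, 2\pi \int_0^{\pi} \sin\theta\; d\theta \\
&= -r^2\, 2\pi\left[\cos\theta\right]_0^{\pi} = 4\pi r^2.
\end{aligned}
$$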
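The integral is simple enough to do by hand, but the same recipe (integrate $g^{1/2}$ over the coordinates) can be sketched numerically. This is only an illustration with a midpoint rule; the function name `sphere_area` and the number of steps are my own choices.

```python
import math


def sphere_area(r, n=2000):
    """Proper area of a 2-sphere of radius r by midpoint-rule integration.

    The integrand is sqrt(g) = r^2 sin(theta). It does not depend on phi,
    so the phi integral contributes a factor of 2*pi.
    """
    dtheta = math.pi / n
    total = 0.0
    for i in range(n):
        theta = (i + 0.5) * dtheta
        total += r * r * math.sin(theta) * dtheta
    return 2.0 * math.pi * total


r = 2.0
print(sphere_area(r), 4.0 * math.pi * r * r)
```

The two printed numbers agree to within the discretization error of the midpoint rule.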
37. (b) Analogous problem for Exer. 33, the three-sphere. I don’t see that we learn anything new here.
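39. Defines the Lie bracket in Eq. (6.100):
$$[\vec U, \vec V]^{\alpha} = U^{\beta}\nabla_{\beta}V^{\alpha} - V^{\beta}\nabla_{\beta}U^{\alpha}$$
(a) That
$$[\vec U, \vec V]^{\alpha} = -[\vec V, \vec U]^{\alpha}$$
follows immediately from the definition in Eq. (6.100). Show that
$$[\vec U, \vec V]^{\alpha} = U^{\beta}V^{\alpha}{}_{,\beta} - V^{\beta}U^{\alpha}{}_{,\beta} \tag{6.103}$$
We start with the definition of Eq. (6.100):
$$
\begin{aligned}
[\vec U, \vec V]^{\alpha} &= U^{\beta}\nabla_{\beta}V^{\alpha} - V^{\beta}\nabla_{\beta}U^{\alpha} \\
&= U^{\beta}V^{\alpha}{}_{;\beta} - V^{\beta}U^{\alpha}{}_{;\beta} && \text{notation change only} \\
&= U^{\beta}\left(V^{\alpha}{}_{,\beta} + \Gamma^{\alpha}{}_{\mu\beta}V^{\mu}\right) - V^{\beta}\left(U^{\alpha}{}_{,\beta} + \Gamma^{\alpha}{}_{\mu\beta}U^{\mu}\right) && \text{used Eq. (6.33)} \\
&= U^{\beta}V^{\alpha}{}_{,\beta} - V^{\beta}U^{\alpha}{}_{,\beta} + U^{\beta}\Gamma^{\alpha}{}_{\mu\beta}V^{\mu} - V^{\beta}\Gamma^{\alpha}{}_{\mu\beta}U^{\mu} && \text{rearranging only}
\end{aligned}
\tag{6.104}
$$
where the first two terms match what we are required to prove. So we only need to show that the last two terms, those containing Christoffel symbols, vanish:
$$
\begin{aligned}
U^{\beta}\Gamma^{\alpha}{}_{\mu\beta}V^{\mu} - V^{\beta}\Gamma^{\alpha}{}_{\mu\beta}U^{\mu} &= \Gamma^{\alpha}{}_{\beta\mu}V^{\beta}U^{\mu} - V^{\beta}\Gamma^{\alpha}{}_{\mu\beta}U^{\mu} && \text{relabelled dummy indices on first term} \\
&= 0,
\end{aligned}
\tag{6.105}
$$
because in any coordinate system $\Gamma^{\alpha}{}_{\mu\beta} = \Gamma^{\alpha}{}_{\beta\mu}$, see Exer. 5.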
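Since the Christoffel terms drop out, the Lie bracket can be evaluated with partial derivatives alone. Here is a minimal numerical sketch of that formula using central finite differences. The function `lie_bracket` and the two example fields (a rotation generator and a translation in the flat $x$-$y$ plane) are my own illustrative choices, not something from Schutz.

```python
def lie_bracket(U, V, x, h=1e-5):
    """Components [U, V]^a = U^b V^a_{,b} - V^b U^a_{,b} at the point x.

    U and V are functions that take a list of coordinates and return a list
    of components. Derivatives are taken by central finite differences.
    """
    n = len(x)

    def partial(F, b):
        up, dn = list(x), list(x)
        up[b] += h
        dn[b] -= h
        F_up, F_dn = F(up), F(dn)
        return [(F_up[a] - F_dn[a]) / (2.0 * h) for a in range(n)]

    u, v = U(x), V(x)
    result = [0.0] * n
    for b in range(n):
        dV, dU = partial(V, b), partial(U, b)
        for a in range(n):
            result[a] += u[b] * dV[a] - v[b] * dU[a]
    return result


def rotation(p):
    # Generator of rotations about the origin: (-y, x)
    return [-p[1], p[0]]


def translation(p):
    # Constant field along x
    return [1.0, 0.0]


point = [0.7, -0.3]
print(lie_bracket(rotation, translation, point))
print(lie_bracket(translation, rotation, point))
```

For these two fields the bracket works out by hand to $(0, -1)$, and swapping the arguments flips the sign, which illustrates the antisymmetry noted in part (a).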
6.10 Rob’s supplementary problems
SP.1 What are the numerical values of the elements of the Riemann curvature tensor $R_{\alpha\alpha\mu\nu}$ and $R_{\alpha\beta\mu\mu}$, with summation not implied? (Hint: Think about the implications of the symmetry relations contained in Eq. (6.69).) I recommend doing this problem before attempting Exer. 18(b) in § 6.9.
Solution: $R_{\alpha\alpha\mu\nu} = 0$ and $R_{\alpha\beta\mu\mu} = 0$ because we must have $R_{\alpha\beta\mu\nu} = -R_{\beta\alpha\mu\nu}$, and likewise $R_{\alpha\beta\mu\nu} = -R_{\alpha\beta\nu\mu}$. For the case where $\alpha = \beta$ (or $\mu = \nu$), the element must have a numerical value equal to its own negative. The only number equal to its own negative is zero.
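SP.2 Generalize Eq. (6.73) to the case of a $\binom{1}{1}$ tensor, $F^{\mu}{}_{\nu}$.
Solution: This is a straightforward generalization of the argument leading to Eq. (6.73). See also the solution to Schutz’s Exer. 20.
The covariant derivative of a $\binom{1}{1}$ tensor can be inferred from Eqs. (6.34) and (6.35):
$$\nabla_{\beta}F^{\mu}{}_{\nu} = F^{\mu}{}_{\nu;\beta} = F^{\mu}{}_{\nu,\beta} + \Gamma^{\mu}{}_{\sigma\beta}F^{\sigma}{}_{\nu} - \Gamma^{\sigma}{}_{\nu\beta}F^{\mu}{}_{\sigma} \tag{6.106}$$
Now we simply apply the gradient operator another time, initially (for compactness) without expanding $F^{\mu}{}_{\nu;\beta}$:
$$
\begin{aligned}
\nabla_{\alpha}\left(\nabla_{\beta}F^{\mu}{}_{\nu}\right) &= \left(F^{\mu}{}_{\nu;\beta}\right)_{;\alpha} \\
&= \left(F^{\mu}{}_{\nu;\beta}\right)_{,\alpha} + \Gamma^{\mu}{}_{\sigma\alpha}F^{\sigma}{}_{\nu;\beta} - \Gamma^{\sigma}{}_{\nu\alpha}F^{\mu}{}_{\sigma;\beta} - \Gamma^{\sigma}{}_{\beta\alpha}F^{\mu}{}_{\nu;\sigma} \\
&= \left(F^{\mu}{}_{\nu;\beta}\right)_{,\alpha}
\end{aligned}
\tag{6.107}
$$
where the last step used the fact that the Christoffel symbols are zero in local inertial coordinates. But their gradients are not, so
$$
\begin{aligned}
\nabla_{\alpha}\left(\nabla_{\beta}F^{\mu}{}_{\nu}\right) &= \left(F^{\mu}{}_{\nu;\beta}\right)_{,\alpha} \\
&= F^{\mu}{}_{\nu,\beta\alpha} + \left(\Gamma^{\mu}{}_{\sigma\beta}F^{\sigma}{}_{\nu} - \Gamma^{\sigma}{}_{\nu\beta}F^{\mu}{}_{\sigma}\right)_{,\alpha} && \text{using (6.106) above} \\
&= F^{\mu}{}_{\nu,\beta\alpha} + \Gamma^{\mu}{}_{\sigma\beta,\alpha}F^{\sigma}{}_{\nu} - \Gamma^{\sigma}{}_{\nu\beta,\alpha}F^{\mu}{}_{\sigma}
\end{aligned}
\tag{6.108}
$$
which generalizes Eq. (6.73).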
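SP.3 Derive Eq. (6.78) in a manner analogous to the derivation of Eq. (6.77). Use the results of SP.2 above. I posed this problem because I couldn’t find the solution to Exer. 21, wherein one is to explain the positive signs in Eq. (6.78).
Solution: Using result (6.108) from SP.2 above, we start by changing the order of the derivatives, i.e. swapping the indices $\alpha$ and $\beta$. Taking the difference between the two derivatives we obtain (the second partial derivatives of $F^{\mu}{}_{\nu}$ cancel because partial derivatives commute)
$$
\begin{aligned}
[\nabla_{\alpha}, \nabla_{\beta}]F^{\mu}{}_{\nu} &\equiv \nabla_{\alpha}\left(\nabla_{\beta}F^{\mu}{}_{\nu}\right) - \nabla_{\beta}\left(\nabla_{\alpha}F^{\mu}{}_{\nu}\right) \\
&= F^{\mu}{}_{\nu,\beta\alpha} + \Gamma^{\mu}{}_{\sigma\beta,\alpha}F^{\sigma}{}_{\nu} - \Gamma^{\sigma}{}_{\nu\beta,\alpha}F^{\mu}{}_{\sigma} - \left(F^{\mu}{}_{\nu,\alpha\beta} + \Gamma^{\mu}{}_{\sigma\alpha,\beta}F^{\sigma}{}_{\nu} - \Gamma^{\sigma}{}_{\nu\alpha,\beta}F^{\mu}{}_{\sigma}\right) \\
&= \left(\Gamma^{\mu}{}_{\sigma\beta,\alpha} - \Gamma^{\mu}{}_{\sigma\alpha,\beta}\right)F^{\sigma}{}_{\nu} - \left(\Gamma^{\sigma}{}_{\nu\beta,\alpha} - \Gamma^{\sigma}{}_{\nu\alpha,\beta}\right)F^{\mu}{}_{\sigma} && \text{collecting common factors}
\end{aligned}
\tag{6.109}
$$
Use Eq. (6.63), but in a reference frame where the Christoffel symbols all vanish:
$$[\nabla_{\alpha}, \nabla_{\beta}]F^{\mu}{}_{\nu} = R^{\mu}{}_{\sigma\alpha\beta}F^{\sigma}{}_{\nu} - R^{\sigma}{}_{\nu\alpha\beta}F^{\mu}{}_{\sigma} \tag{6.110}$$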
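SP.4 Do the symmetry relations in Eq. (6.69) apply also when an index is raised? Prove that $R^{\beta}{}_{\alpha\mu\nu} = -R_{\alpha}{}^{\beta}{}_{\mu\nu}$. This result will be useful for Exer. 25(a).
Yes.
$$
\begin{aligned}
-R_{\alpha}{}^{\beta}{}_{\mu\nu} &= -g^{\sigma\beta}R_{\alpha\sigma\mu\nu} \\
&= g^{\sigma\beta}R_{\sigma\alpha\mu\nu} && \text{by Eq. (6.69)} \\
&= R^{\beta}{}_{\alpha\mu\nu}.
\end{aligned}
\tag{6.111}
$$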
SP.5 What’s the Ricci tensor and Ricci scalar of the metric of Exer. 35 (which turns out to be of the form of the Schwarzschild metric)?
SP.6 Why is it generally important to use Eq. (6.63) for the computation of the Riemann tensor components and not the one in terms of the metric, given by Eq. (6.68)?
Chapter 7 Physics in curved spacetime
7.2 Physics in slightly curved spacetimes
Eq. (7.14) should be $\Gamma^{0}{}_{00} \approx (1 - 2\phi)\,\phi_{,t} = \phi_{,t} + O(\phi)$. On p. 177, $[g^{\alpha\beta}]$ is apparently the matrix associated with the inverse metric tensor. In earlier chapters the notation $(g^{\alpha\beta})$ was used, an annoying inconsistency.
7.6 Exercises
It might help to tackle my supplementary problems first, see § 7.7 below.
1. (i) If Eq. (7.3) were the correct generalization of Eq. (7.1) to a curved spacetime, how would you interpret it? (ii) What would happen to the number of particles in a comoving volume of fluid, as time evolves? (iii) In principle, can we distinguish experimentally between Eqs. (7.2) and (7.3)?

(i) Recall the hypothetical Eq. (7.3) was
$$(nU^{\alpha})_{;\alpha} = q R,$$
where $q$ was some constant and $R$ the Ricci scalar. Based on kinematics, this equation states that, for $q > 0$, there is a source of particles for positively curved space, $R > 0$. This was shown in Exer. 20(a) of § 4.10, which showed that the divergence of the 4-vector $\vec N = n\vec U$ is interpreted as the rate of generation of particles per unit volume. In Chapter 4 we were working in flat spacetime, but on a curved manifold the divergence would include terms from the rate of change of the basis vectors:
$$N^{\alpha}{}_{;\alpha} = \varepsilon = (nU^{\alpha})_{;\alpha} \qquad \text{using definitions Eq. (4.1) and Eq. (4.4)} \tag{7.1}$$
If $qR$ were negative there would be a sink of particles. The fact that, by Eq. (7.1),
$$(nU^{\alpha})_{,\alpha} = 0$$
while
$$(nU^{\alpha})_{;\alpha} = q R \neq 0$$
would mean, at least mathematically, that the nonzero source results from the derivatives of the basis vectors:
$$
\begin{aligned}
(nU^{\alpha})_{;\alpha} = N^{\alpha}{}_{;\alpha} &= N^{\alpha}{}_{,\alpha} + \Gamma^{\alpha}{}_{\mu\alpha}N^{\mu} && \text{Eq. (6.36)} \\
&= \Gamma^{\alpha}{}_{\mu\alpha}N^{\mu} && \text{using Eq. (7.1)} \\
&= N^{\alpha}\left(\ln\sqrt{-g}\right)_{,\alpha} && \text{using Eq. (6.41)} \\
&= q R && \text{by the hypothetical Eq. (7.3).}
\end{aligned}
\tag{7.2}
$$
I suppose physically we could interpret this as a source of particles, per unit volume per unit time, $qR$, resulting from the flux of particles $N^{\alpha}$ up the gradient of the natural logarithm of the magnitude of the determinant of the metric, whatever that would mean! I’m not sure how to understand or interpret this more completely. Of course this would be admittedly strange, because it would mean that there would be a source of particles in some frames but not in the locally inertial frame (where by Eq. (6.5) we had vanishing first derivatives of all components of the metric).

(ii) What would happen to the number of particles in a co-moving volume of fluid, as time evolves? Let’s recall the solution to Exer. 20 of § 4.10, where we derived (4.44):
$$\frac{\partial (nU^{0})}{\partial t} = -\frac{\partial (nU^{x})}{\partial x} - \frac{\partial (nU^{y})}{\partial y} - \frac{\partial (nU^{z})}{\partial z} + \varepsilon.$$
In the co-moving frame, $U^{0}$ is always unity, such that:
$$\frac{\partial n}{\partial t} = -\frac{\partial (nU^{x})}{\partial x} - \frac{\partial (nU^{y})}{\partial y} - \frac{\partial (nU^{z})}{\partial z} + \varepsilon. \tag{7.3}$$
How the number of particles per unit volume $n$ evolves in time depends upon the spatial convergence of the number flux, $-N^{i}{}_{;i}$, and the source term, $\varepsilon = qR$.

(iii) Could we ever distinguish experimentally between Eqs. (7.2) and (7.3)? Recall Eq. (7.2) had
$$(nU^{\alpha})_{;\alpha} = 0.$$
So “yes”, I believe one could, in principle, measure the terms on the RHS of (7.3) and thereby distinguish between Eqs. (7.2) and (7.3).
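2. To first order in $\phi$, compute $g^{\alpha\beta}$ for [the line element given by] Eq. (7.8).

See Exers. 35 and 36 from § 6.9 for how to calculate the metric from the line element. Here the metric is
$$
(g_{\alpha\beta}) =
\begin{pmatrix}
-(1 + 2\phi) & 0 & 0 & 0 \\
0 & (1 - 2\phi) & 0 & 0 \\
0 & 0 & (1 - 2\phi) & 0 \\
0 & 0 & 0 & (1 - 2\phi)
\end{pmatrix}
$$
and in fact, it’s exactly the same as in Exer. 36 of § 6.9. Now we are asked for the inverse which, because the metric is diagonal, follows immediately as the inverse of the terms on the diagonal of the metric:
$$
(g^{\alpha\beta}) =
\begin{pmatrix}
-(1 + 2\phi)^{-1} & 0 & 0 & 0 \\
0 & (1 - 2\phi)^{-1} & 0 & 0 \\
0 & 0 & (1 - 2\phi)^{-1} & 0 \\
0 & 0 & 0 & (1 - 2\phi)^{-1}
\end{pmatrix}
\approx
\begin{pmatrix}
-(1 - 2\phi) & 0 & 0 & 0 \\
0 & (1 + 2\phi) & 0 & 0 \\
0 & 0 & (1 + 2\phi) & 0 \\
0 & 0 & 0 & (1 + 2\phi)
\end{pmatrix}
\tag{7.4}
$$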
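3. Calculate all the Christoffel symbols for the metric given by Eq. (7.8), to first order in $\phi(t, x, y, z)$.

This question requires tons of algebra, yet I found I didn’t learn anything. On the other hand, given the importance of this metric, perhaps a complete set of Christoffel symbols will come in handy later. First let’s count the number of independent Christoffel symbols, $\Gamma^{\alpha}{}_{\mu\nu}$, to calculate. For each $\alpha$ there are only ten independent terms because $\Gamma^{\alpha}{}_{\mu\nu} = \Gamma^{\alpha}{}_{\nu\mu}$ in any basis. Given the metric and inverse metric, see Exer. 2, we can calculate the Christoffel symbols using Eq. (6.32). The calculation simplifies tremendously because $(g^{\alpha\beta})$ is diagonal. Thus we need only consider the $\beta = \alpha$ contribution in Eq. (6.32). Note that $g^{tt} \approx -(1 - 2\phi)$ and $g_{tt,\nu} = -2\phi_{,\nu}$, so the two minus signs cancel in the first few symbols.
$$\Gamma^{0}{}_{00} = \Gamma^{t}{}_{tt} = \tfrac{1}{2} g^{tt} g_{tt,t} = \tfrac{1}{2}\left[-(1 - 2\phi)\right]\left[-2\phi_{,t}\right] = (1 - 2\phi)\,\phi_{,t} \tag{7.5}$$
$$\Gamma^{0}{}_{01} = \Gamma^{t}{}_{tx} = \tfrac{1}{2} g^{tt} g_{tt,x} = (1 - 2\phi)\,\phi_{,x} \tag{7.6}$$
$$\Gamma^{0}{}_{02} = \Gamma^{t}{}_{ty} = \tfrac{1}{2} g^{tt} g_{tt,y} = (1 - 2\phi)\,\phi_{,y} \tag{7.7}$$
$$\Gamma^{0}{}_{03} = \Gamma^{t}{}_{tz} = \tfrac{1}{2} g^{tt} g_{tt,z} = (1 - 2\phi)\,\phi_{,z} \tag{7.8}$$
$$\Gamma^{0}{}_{10} = \Gamma^{0}{}_{01} = \Gamma^{t}{}_{tx} = (1 - 2\phi)\,\phi_{,x} \tag{7.9}$$
but let’s not bother with redundant ones any more.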
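$$\Gamma^{0}{}_{11} = \Gamma^{t}{}_{xx} = \tfrac{1}{2} g^{tt}\left(-g_{xx,t}\right) = \tfrac{1}{2}\left[-(1 - 2\phi)\right]\left[2\phi_{,t}\right] = -(1 - 2\phi)\,\phi_{,t} \tag{7.10}$$
$$\Gamma^{0}{}_{12} = \Gamma^{t}{}_{xy} = \tfrac{1}{2} g^{tt}\,(0) = 0 \tag{7.11}$$
and in general when all the indices are different, the result is nil.
$$\Gamma^{0}{}_{13} = \Gamma^{t}{}_{xz} = 0 \tag{7.12}$$
$$\Gamma^{0}{}_{22} = \Gamma^{t}{}_{yy} = \tfrac{1}{2} g^{tt}\left(-g_{yy,t}\right) = -(1 - 2\phi)\,\phi_{,t} \tag{7.13}$$
$$\Gamma^{0}{}_{23} = \Gamma^{t}{}_{yz} = 0 \tag{7.14}$$
$$\Gamma^{0}{}_{33} = \Gamma^{t}{}_{zz} = \tfrac{1}{2} g^{tt}\left(-g_{zz,t}\right) = -(1 - 2\phi)\,\phi_{,t} \tag{7.15}$$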
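$$\Gamma^{1}{}_{00} = \Gamma^{x}{}_{tt} = \tfrac{1}{2} g^{xx}\left(-g_{tt,x}\right) = (1 + 2\phi)\,\phi_{,x} \tag{7.16}$$
$$\Gamma^{1}{}_{01} = \Gamma^{x}{}_{tx} = \tfrac{1}{2} g^{xx}\left(g_{xx,t}\right) = -(1 + 2\phi)\,\phi_{,t} \tag{7.17}$$
$$\Gamma^{1}{}_{02} = \Gamma^{x}{}_{ty} = 0 \tag{7.18}$$
$$\Gamma^{1}{}_{03} = \Gamma^{x}{}_{tz} = 0 \tag{7.19}$$
$$\Gamma^{1}{}_{11} = \Gamma^{x}{}_{xx} = \tfrac{1}{2} g^{xx}\left(g_{xx,x}\right) = -(1 + 2\phi)\,\phi_{,x} \tag{7.20}$$
$$\Gamma^{1}{}_{12} = \Gamma^{x}{}_{xy} = \tfrac{1}{2} g^{xx}\left(g_{xx,y}\right) = -(1 + 2\phi)\,\phi_{,y} \tag{7.21}$$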
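$$\Gamma^{1}{}_{13} = \Gamma^{x}{}_{xz} = \tfrac{1}{2} g^{xx}\left(g_{xx,z}\right) = -(1 + 2\phi)\,\phi_{,z} \tag{7.22}$$
$$\Gamma^{1}{}_{22} = \Gamma^{x}{}_{yy} = \tfrac{1}{2} g^{xx}\left(-g_{yy,x}\right) = (1 + 2\phi)\,\phi_{,x} \tag{7.23}$$
$$\Gamma^{1}{}_{23} = \Gamma^{x}{}_{yz} = 0 \tag{7.24}$$
$$\Gamma^{1}{}_{33} = \Gamma^{x}{}_{zz} = \tfrac{1}{2} g^{xx}\left(-g_{zz,x}\right) = (1 + 2\phi)\,\phi_{,x} \tag{7.25}$$
$$\Gamma^{2}{}_{00} = \Gamma^{y}{}_{tt} = \tfrac{1}{2} g^{yy}\left(-g_{tt,y}\right) = (1 + 2\phi)\,\phi_{,y} \tag{7.26}$$
$$\Gamma^{2}{}_{01} = \Gamma^{y}{}_{tx} = 0 \tag{7.27}$$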
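$$\Gamma^{2}{}_{02} = \Gamma^{y}{}_{ty} = \tfrac{1}{2} g^{yy}\left(g_{yy,t}\right) = -(1 + 2\phi)\,\phi_{,t} \tag{7.28}$$
$$\Gamma^{2}{}_{03} = \Gamma^{y}{}_{tz} = 0 \tag{7.29}$$
$$\Gamma^{2}{}_{11} = \Gamma^{y}{}_{xx} = \tfrac{1}{2} g^{yy}\left(-g_{xx,y}\right) = (1 + 2\phi)\,\phi_{,y} \tag{7.30}$$
$$\Gamma^{2}{}_{12} = \Gamma^{y}{}_{xy} = \tfrac{1}{2} g^{yy}\left(g_{yy,x}\right) = -(1 + 2\phi)\,\phi_{,x} \tag{7.31}$$
$$\Gamma^{2}{}_{13} = \Gamma^{y}{}_{xz} = 0 \tag{7.32}$$
$$\Gamma^{2}{}_{22} = \Gamma^{y}{}_{yy} = \tfrac{1}{2} g^{yy}\left(g_{yy,y}\right) = -(1 + 2\phi)\,\phi_{,y} \tag{7.33}$$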
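$$\Gamma^{2}{}_{23} = \Gamma^{y}{}_{yz} = \tfrac{1}{2} g^{yy}\left(g_{yy,z}\right) = -(1 + 2\phi)\,\phi_{,z} \tag{7.34}$$
$$\Gamma^{2}{}_{33} = \Gamma^{y}{}_{zz} = \tfrac{1}{2} g^{yy}\left(-g_{zz,y}\right) = (1 + 2\phi)\,\phi_{,y} \tag{7.35}$$
$$\Gamma^{3}{}_{00} = \Gamma^{z}{}_{tt} = \tfrac{1}{2} g^{zz}\left(-g_{tt,z}\right) = (1 + 2\phi)\,\phi_{,z} \tag{7.36}$$
$$\Gamma^{3}{}_{01} = \Gamma^{z}{}_{tx} = 0 \tag{7.37}$$
$$\Gamma^{3}{}_{02} = \Gamma^{z}{}_{ty} = 0 \tag{7.38}$$
$$\Gamma^{3}{}_{03} = \Gamma^{z}{}_{tz} = \tfrac{1}{2} g^{zz}\left(g_{zz,t}\right) = -(1 + 2\phi)\,\phi_{,t} \tag{7.39}$$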
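$$\Gamma^{3}{}_{11} = \Gamma^{z}{}_{xx} = \tfrac{1}{2} g^{zz}\left(-g_{xx,z}\right) = (1 + 2\phi)\,\phi_{,z} \tag{7.40}$$
$$\Gamma^{3}{}_{12} = \Gamma^{z}{}_{xy} = 0 \tag{7.41}$$
$$\Gamma^{3}{}_{13} = \Gamma^{z}{}_{xz} = \tfrac{1}{2} g^{zz}\left(g_{zz,x}\right) = -(1 + 2\phi)\,\phi_{,x} \tag{7.42}$$
$$\Gamma^{3}{}_{22} = \Gamma^{z}{}_{yy} = \tfrac{1}{2} g^{zz}\left(-g_{yy,z}\right) = (1 + 2\phi)\,\phi_{,z} \tag{7.43}$$
$$\Gamma^{3}{}_{23} = \Gamma^{z}{}_{yz} = \tfrac{1}{2} g^{zz}\left(g_{zz,y}\right) = -(1 + 2\phi)\,\phi_{,y} \tag{7.44}$$
$$\Gamma^{3}{}_{33} = \Gamma^{z}{}_{zz} = \tfrac{1}{2} g^{zz}\left(g_{zz,z}\right) = -(1 + 2\phi)\,\phi_{,z} \tag{7.45}$$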
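With this much algebra it is worth having an independent check. The sketch below is my own illustration: the test potential `phi`, the sample point, and the function names are assumptions made for the example. It evaluates Eq. (6.32) for a diagonal metric using central finite differences, so that individual symbols can be compared against the first-order formulas, for instance $\Gamma^{t}{}_{tt} \approx (1 - 2\phi)\phi_{,t}$ (the value quoted just before Eq. (7.14) in the text) and $\Gamma^{x}{}_{tt} \approx (1 + 2\phi)\phi_{,x}$.

```python
import math


def phi(t, x, y, z):
    # Hypothetical small, smooth potential used only for this check
    return 1e-3 * math.sin(t) * math.exp(-(x * x + y * y + z * z))


def metric_diag(p):
    """Diagonal elements (g_tt, g_xx, g_yy, g_zz) of the metric of Eq. (7.8)."""
    f = phi(*p)
    return [-(1 + 2 * f), 1 - 2 * f, 1 - 2 * f, 1 - 2 * f]


def d_metric(p, mu, nu, h=1e-5):
    """Central-difference derivative of g_{mu mu} with respect to x^nu."""
    up, dn = list(p), list(p)
    up[nu] += h
    dn[nu] -= h
    return (metric_diag(up)[mu] - metric_diag(dn)[mu]) / (2.0 * h)


def christoffel(p, a, m, n):
    """Gamma^a_{mn} from Eq. (6.32), specialised to a diagonal metric."""
    g = metric_diag(p)

    def dg(i, j, k):
        # g_{ij,k} vanishes for i != j because the metric is diagonal
        return d_metric(p, i, k) if i == j else 0.0

    return (dg(a, m, n) + dg(a, n, m) - dg(m, n, a)) / (2.0 * g[a])


p = (0.3, 0.1, -0.2, 0.4)
f = phi(*p)
f_t = 1e-3 * math.cos(p[0]) * math.exp(-(p[1] ** 2 + p[2] ** 2 + p[3] ** 2))
f_x = -2.0 * p[1] * f

print(christoffel(p, 0, 0, 0), (1 - 2 * f) * f_t)   # Gamma^t_tt
print(christoffel(p, 1, 0, 0), (1 + 2 * f) * f_x)   # Gamma^x_tt
print(christoffel(p, 1, 1, 1), -(1 + 2 * f) * f_x)  # Gamma^x_xx
```

The numerical values use the exact inverse metric, so they differ from the first-order expressions only by terms of order $\phi^2$ times a derivative of $\phi$.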
4. Verify that the results of Eqs. (7.15) and (7.24) depended only on $g_{00}$; the form of $g_{xx}$ doesn’t affect them, as long as it is $(1 + O(\phi))$.

It’s not clear where we’re expected to start. I’ll go back to Eq. (7.9), since this is clearly the equation for a geodesic. Let $\lambda = \tau/m$, where $\tau$ is proper time and $m$ is the rest mass. Let $\vec x$ be the position of the particle, so the tangent vector
$$\vec U = \frac{d\vec x}{d\tau}$$
is also the four-velocity of the particle. The momentum $\vec p$ is
$$\vec p = m\vec U = m\frac{d\vec x}{d\tau} = \frac{d\vec x}{d\lambda}. \tag{7.46}$$
And thus the momentum is the tangent vector to the same path parameterized by $\lambda$. So the particle’s path also satisfies Eq. (7.10), the equation for a geodesic written in terms of its tangent vector, cf. Eq. (6.49). To justify Eq. (7.11) from Eq. (7.10), we write out Eq. (7.10) using the notation used in Eq. (6.47):
$$
\begin{aligned}
\nabla_{\vec p}\,\vec p = \frac{d\vec p}{d\lambda} &= \frac{d}{d\lambda}\left(p^{\alpha}\vec e_{\alpha}\right) \\
&= \vec e_{\alpha}\frac{d p^{\alpha}}{d\lambda} + p^{\alpha}\frac{d\vec e_{\alpha}}{d\lambda} \\
&= \vec e_{\alpha}\frac{d p^{\alpha}}{d\lambda} + p^{\alpha}\frac{\partial\vec e_{\alpha}}{\partial x^{\beta}}\frac{dx^{\beta}}{d\lambda} \\
&= \vec e_{\alpha}\frac{d p^{\alpha}}{d\lambda} + p^{\alpha}\,\Gamma^{\mu}{}_{\alpha\beta}\,\vec e_{\mu}\frac{dx^{\beta}}{d\lambda} && \text{using definition of Christoffel symbol, cf. Eq. (5.44)} \\
&= \vec e_{\alpha}\frac{d p^{\alpha}}{d\lambda} + p^{\alpha}\,\Gamma^{\mu}{}_{\alpha\beta}\,\vec e_{\mu}\,p^{\beta} && \text{using (7.46)} \\
&= \vec e_{\mu}\left(\frac{d p^{\mu}}{d\lambda} + p^{\alpha}\,\Gamma^{\mu}{}_{\alpha\beta}\,p^{\beta}\right) && \text{relabelling dummy indices to allow factoring} \\
&= 0.
\end{aligned}
\tag{7.47}
$$
The quantity in parentheses is a scalar for each $\mu$, and the basis vectors $\vec e_{\mu}$ are all orthogonal (and hence linearly independent), since the line element in Eq. (7.8) does not contain off-diagonal terms; see the discussion in the solution of Exer. 35 of § 6.9. That implies that each component of Eq. (7.10) must vanish (something that was not immediately obvious). So we can consider just the $\mu = 0$ (i.e. time) component, which gives Eq. (7.11). Recall the 4-momentum is
$$\vec p \underset{O}{\rightarrow} (E, p^1, p^2, p^3) = (m\gamma, m\gamma v^1, m\gamma v^2, m\gamma v^3)$$
where $v^i$ is the $i$th component of the 3-velocity and
$$\gamma \equiv \frac{1}{\sqrt{1 - v^2/c^2}}.$$
When the magnitude of the three-velocity $v \ll c = 1$, we have $p^0 \gg p^i$, and clearly Eq. (7.11) simplifies to Eq. (7.12). The $\Gamma^{0}{}_{00}$ for this metric was computed in Exer. 3 and agrees with that given just before Eq. (7.14). But I would get, instead of Eq. (7.14),
$$\Gamma^{0}{}_{00} \approx (1 - 2\phi)\,\phi_{,t} = \phi_{,t} + O(\phi).$$
Furthermore, $p^0 = E$, the total energy,
$$E = m\gamma = m\frac{1}{\sqrt{1 - v^2}} \approx m,$$
see bottom of p. 42. Substitution into Eq. (7.11) gives Eq. (7.15).
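5. (a) For a perfect fluid, verify that the spatial components of Eq. (7.6) in the Newtonian limit reduce to the Euler equations (Eq. 7.38) for the metric Eq. (7.8).

The stress-energy tensor $T^{\mu\nu}$ for a perfect fluid was given in Eq. (7.7); the only modification from that given by Eq. (4.37) in Chapter 4 on fluids in SR is the more general metric tensor. Substituting this into Eq. (7.6) we get:
$$
\begin{aligned}
T^{\mu\nu}{}_{;\nu} &= \left[(\rho + p)U^{\mu}U^{\nu}\right]_{;\nu} + \left[p\,g^{\mu\nu}\right]_{;\nu} = 0 \\
&= \left[(\rho + p)U^{\mu}U^{\nu}\right]_{;\nu} + p_{;\nu}\,g^{\mu\nu} && \text{because } g^{\mu\nu}{}_{;\nu} = 0 \\
&\approx \left[\rho\,U^{\mu}U^{\nu}\right]_{;\nu} + p_{,\nu}\,g^{\mu\nu} && \text{because } p \text{ is a scalar}
\end{aligned}
\tag{7.48}
$$
and because $p \ll \rho$. This follows because the pressure arises from the random motion of the particles, which provides them with negligible kinetic energy relative to the rest mass energy in the non-relativistic limit. Furthermore, $\rho \approx \rho_0 = nm$, again because the rest mass dominates the energy in the non-relativistic limit. (See Table 4.1 for a definition of symbols.)
$$
\begin{aligned}
T^{\mu\nu}{}_{;\nu} &= m\left[n\,U^{\mu}U^{\nu}\right]_{;\nu} + p_{,\nu}\,g^{\mu\nu} \\
&= m\,n\,U^{\nu}\left[U^{\mu}\right]_{;\nu} + p_{,\nu}\,g^{\mu\nu} && \text{using Eq. (7.2)} \\
&= m\,n\,U^{\nu}\left[U^{\mu}{}_{,\nu} + \Gamma^{\mu}{}_{\sigma\nu}U^{\sigma}\right] + p_{,\nu}\,g^{\mu\nu} && \text{using Eq. (6.33)} \\
&= \rho_0\,U^{\nu}\left[U^{\mu}{}_{,\nu} + \Gamma^{\mu}{}_{\sigma\nu}U^{\sigma}\right] + p_{,\nu}\,g^{\mu\nu}
\end{aligned}
\tag{7.49}
$$
where $m\,n = \rho_0$, the rest mass density. Let’s now restrict attention to the $\mu = i$ (spatial) components:
$$T^{i\nu}{}_{;\nu} = \rho_0\,U^{\nu}\left[U^{i}{}_{,\nu} + \Gamma^{i}{}_{\sigma\nu}U^{\sigma}\right] + p_{,\nu}\,g^{i\nu} \tag{7.50}$$
Let’s look at the terms in (7.50) one at a time. The first term is the time derivative, in particular the Eulerian part when $\nu = 0$ and the advective part when $\nu = j > 0$ (by convention). To see this, expand the first term when $\nu = 0$:
$$\rho_0\,U^{0}U^{i}{}_{,0} = \rho_0\,U^{0}U^{i}{}_{,t} = \rho_0\,\gamma^2\,v^{i}{}_{,t} \tag{7.51}$$
where $\gamma = 1/\sqrt{1 - v^2}$, $v^i$ is the $i$th component of the 3-velocity and $v$ is its magnitude.
$$
\begin{aligned}
\rho_0\,\gamma^2\,v^{i}{}_{,t} &\approx \rho_0\,v^{i}{}_{,t} && \text{Newtonian limit} \\
&= \rho_0\,\frac{\partial \mathbf{v}}{\partial t} && \text{changing to vector notation.}
\end{aligned}
\tag{7.52}
$$
And the advective part corresponds to this term with $\nu = j$:
$$
\begin{aligned}
\rho_0\,\gamma^2\,v^{j}v^{i}{}_{,j} &\approx \rho_0\,v^{j}v^{i}{}_{,j} && \text{Newtonian limit} \\
&= \rho_0\,(\mathbf{v}\cdot\nabla)\mathbf{v} && \text{changing to vector notation.}
\end{aligned}
\tag{7.53}
$$
The next term in (7.50) contains the Christoffel symbol and can be written:
$$
\begin{aligned}
\rho_0\,\Gamma^{i}{}_{\sigma\nu}U^{\sigma}U^{\nu} &\approx \rho_0\,\Gamma^{i}{}_{00}U^{0}U^{0} && \text{in Newtonian limit } U^{0} \gg U^{i} \\
&= \rho_0\,\phi_{,i}\,U^{0}U^{0} && \text{Eq. (7.23)} \\
&= \rho_0\,\phi_{,i}\,\gamma^2 \\
&\approx \rho_0\,\phi_{,i} && \text{in Newtonian limit} \\
&= \rho_0\,\nabla\phi
\end{aligned}
\tag{7.54}
$$
which is the gravitational force per unit volume of fluid. The final term in (7.50) is the pressure gradient force per unit volume,
$$
\begin{aligned}
p_{,\nu}\,g^{i\nu} &= p_{,i}\,(1 - 2\phi)^{-1} && \text{Eq. (7.20)} \\
&\approx p_{,i}\,(1 + 2\phi) && \text{because } |\phi| \ll 1 \\
&\approx p_{,i} && \text{because } |\phi| \ll 1 \\
&= \nabla p.
\end{aligned}
\tag{7.55}
$$
Collecting (7.52) to (7.55), the spatial components of Eq. (7.6) become
$$\rho_0\left[\frac{\partial\mathbf{v}}{\partial t} + (\mathbf{v}\cdot\nabla)\mathbf{v}\right] + \rho_0\,\nabla\phi + \nabla p = 0.$$
And that’s the Euler equation.

5. (b) Examine the time component of Eq. (7.6) under the Newtonian limit, and interpret each term.

We go back to (7.49) and restrict attention to the $\mu = 0$ (time) component:
$$
\begin{aligned}
T^{\mu\nu}{}_{;\nu} &= \rho_0\,U^{\nu}\left[U^{\mu}{}_{,\nu} + \Gamma^{\mu}{}_{\sigma\nu}U^{\sigma}\right] + p_{,\nu}\,g^{\mu\nu} \\
T^{0\nu}{}_{;\nu} &= \rho_0\,U^{\nu}\left[U^{0}{}_{,\nu} + \Gamma^{0}{}_{\sigma\nu}U^{\sigma}\right] + p_{,\nu}\,g^{0\nu}
\end{aligned}
\tag{7.56}
$$
Let’s look at the terms in (7.56) one at a time. The first term is the time derivative. To see this, expand the first term when $\nu = 0$:
$$\rho_0\,U^{0}U^{0}{}_{,0} = \rho_0\,U^{0}U^{0}{}_{,t} = \rho_0\,\gamma\,\gamma_{,t} \approx \rho_0\,\frac{\partial}{\partial t}\left(\frac{1}{2}v^2\right) \tag{7.57}$$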
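which is the Eulerian time rate of change of the kinetic energy density. The final approximation comes about by expanding $\gamma$ and using the binomial approximation that applies in the Newtonian limit:
$$\gamma = \frac{1}{\sqrt{1 - v^2}} \approx 1 + \frac{v^2}{2}. \tag{7.58}$$
And the advective part corresponds to this term with $\nu = j$:
$$
\begin{aligned}
\rho_0\,U^{j}U^{0}{}_{,j} &= \rho_0\,\gamma\,v^{j}\,\gamma_{,j} \\
&\approx \rho_0\,v^{j}\left(\frac{1}{2}v^2\right)_{,j} && \text{Newtonian limit} \\
&= \rho_0\,(\mathbf{v}\cdot\nabla)\left(\frac{1}{2}v^2\right) && \text{changing to vector notation,}
\end{aligned}
\tag{7.59}
$$
which is the advection of the kinetic energy per unit volume. The next term in (7.56) contains the Christoffel symbol and can be written:
$$
\begin{aligned}
\rho_0\,\Gamma^{0}{}_{\sigma\nu}U^{\sigma}U^{\nu} &\approx \rho_0\,\Gamma^{0}{}_{0\nu}U^{0}U^{\nu} && \text{in Newtonian limit } U^{0} \gg U^{i} \\
&= \rho_0\,U^{0}U^{\nu}\,\tfrac{1}{2}g^{0\beta}\left(g_{\beta\nu,0} + g_{\beta 0,\nu} - g_{0\nu,\beta}\right) \\
&= \rho_0\,U^{0}U^{\nu}\,\tfrac{1}{2}g^{00}\left(g_{0\nu,0} + g_{00,\nu} - g_{0\nu,0}\right) && \text{(inverse) metric is diagonal} \\
&= \rho_0\,U^{0}U^{\nu}\,\tfrac{1}{2}g^{00}\left(g_{00,\nu}\right) \\
&= \rho_0\,U^{0}U^{\nu}\,\tfrac{1}{2}\left[-(1 + 2\phi)^{-1}\right]\left[-(1 + 2\phi)_{,\nu}\right] \\
&\approx \rho_0\,U^{0}U^{\nu}\,(1 - 2\phi)\,\phi_{,\nu} && \text{binomial approximation} \\
&\approx \rho_0\,U^{0}U^{\nu}\,\phi_{,\nu} \\
&= \rho_0\,\gamma^2\,v^{\nu}\phi_{,\nu}
\end{aligned}
\tag{7.60}
$$
When $\nu = 0$ this gives the tidal forcing:
$$\rho_0\,\Gamma^{0}{}_{\sigma 0}U^{\sigma}U^{0} \approx \rho_0\,\phi_{,t} \tag{7.61}$$
When $\nu = i$ this gives the work done by potential forces (or change in potential energy):
$$\rho_0\,\Gamma^{0}{}_{\sigma i}U^{\sigma}U^{i} \approx \rho_0\,v^{i}\phi_{,i} \tag{7.62}$$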
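The final term in (7.56) involves the pressure,
$$
\begin{aligned}
p_{,\nu}\,g^{0\nu} &= p_{,0}\,g^{00} && \text{because the inverse metric is diagonal} \\
&= p_{,t}\left[-(1 + 2\phi)^{-1}\right] && \text{Eq. (7.8), take the inverse of diagonal matrix} \\
&\approx p_{,t}\left[-(1 - 2\phi)\right] && \text{binomial approximation} \\
&\approx -p_{,t}
\end{aligned}
\tag{7.63}
$$
This looks like a change from internal energy to kinetic energy, but I’m confused because this should result from divergence of the velocity (compression/expansion of fluid parcels working against ambient pressure).

5. (c) Derive the relativistic hydrostatic balance, given by Eq. (7.40), from Eq. (7.6). [I recommend you try my supplementary problem SP 1 in § 7.7 before tackling this one.]

We start again with the divergence of the stress-energy tensor given above in (7.48):
$$
\begin{aligned}
T^{\mu\nu}{}_{;\nu} &= \left[(\rho + p)U^{\mu}U^{\nu}\right]_{;\nu} + \left[p\,g^{\mu\nu}\right]_{;\nu} = 0 \\
&= \left[(\rho + p)U^{\mu}\right]U^{\nu}{}_{;\nu} + \left[(\rho + p)U^{\nu}\right]U^{\mu}{}_{;\nu} + \left[U^{\mu}U^{\nu}\right](\rho + p)_{;\nu} + p_{;\nu}\,g^{\mu\nu} + p\,g^{\mu\nu}{}_{;\nu}
\end{aligned}
\tag{7.64}
$$
Let’s step through the terms of (7.64) starting with the last, applying the static condition when appropriate. The final term vanishes: $p\,g^{\mu\nu}{}_{;\nu} = 0$, as we have used many times, see § 6.9 Exer. 17(a) and Eq. (6.31). The 4th term is:
$$
\begin{aligned}
p_{;\nu}\,g^{\mu\nu} &= p_{,\nu}\,g^{\mu\nu} && \text{because } p \text{ is a scalar} \\
&= p_{,i}\,g^{\mu i} && \text{static metric.}
\end{aligned}
\tag{7.65}
$$
The 3rd term vanishes:
$$
\begin{aligned}
\left[U^{\mu}U^{\nu}\right](\rho + p)_{;\nu} &= \left[U^{\mu}U^{\nu}\right](\rho + p)_{,\nu} && \text{because } \rho + p \text{ is a scalar} \\
&= \left[U^{0}U^{0}\right](\rho + p)_{,0} && \text{static fluid} \\
&= \left[U^{0}U^{0}\right](\rho + p)_{,t} \\
&= 0 && \text{static fluid.}
\end{aligned}
\tag{7.66}
$$
The 2nd term is
$$
\begin{aligned}
\left[(\rho + p)U^{\nu}\right]U^{\mu}{}_{;\nu} &= \left[(\rho + p)U^{0}\right]U^{\mu}{}_{;0} && \text{static fluid} \\
&= \left[(\rho + p)U^{0}\right]\left(U^{\mu}{}_{,t} + \Gamma^{\mu}{}_{\sigma t}U^{\sigma}\right) && \text{Eq. (6.33)} \\
&= (\rho + p)\,U^{0}\,\Gamma^{\mu}{}_{\sigma t}U^{\sigma} && \text{static fluid} \\
&= (\rho + p)\,U^{0}\,\Gamma^{\mu}{}_{00}\,U^{0} && \text{static fluid}
\end{aligned}
\tag{7.67}
$$
The first term vanishes:
$$
\begin{aligned}
(\rho + p)\,U^{\mu}\,U^{\nu}{}_{;\nu} &= (\rho + p)\,U^{\mu}\left[U^{\nu}{}_{,\nu} + U^{\nu}\left(\ln\sqrt{-g}\right)_{,\nu}\right] \\
&= (\rho + p)\,U^{\mu}\left[U^{0}{}_{,t} + U^{0}\left(\ln\sqrt{-g}\right)_{,t}\right] && \text{static fluid} \\
&= 0 && \text{static fluid}
\end{aligned}
\tag{7.68}
$$
So there’s a (relativistic hydrostatic) balance between the 4th and 2nd terms:
$$(\rho + p)\,U^{0}\,\Gamma^{\mu}{}_{00}\,U^{0} + p_{,i}\,g^{\mu i} = 0 \tag{7.69}$$
Now let’s simplify the Christoffel symbol,
$$
\begin{aligned}
\Gamma^{\mu}{}_{00} &= \tfrac{1}{2}g^{\mu\beta}\left(-g_{00,\beta}\right) && \text{using Eq. (5.75) and static metric} \\
&= \tfrac{1}{2}g^{\mu i}\left(-g_{00,i}\right) && \text{static metric.}
\end{aligned}
\tag{7.70}
$$
Substitution in our hydrostatic balance equation gives:
$$
\begin{aligned}
0 &= (\rho + p)\,U^{0}\,\Gamma^{\mu}{}_{00}\,U^{0} + p_{,i}\,g^{\mu i} && \text{recall above} \\
&= (\rho + p)\,\tfrac{1}{2}g^{\mu i}\left(-g_{00,i}\right)U^{0}U^{0} + p_{,i}\,g^{\mu i}
\end{aligned}
\tag{7.71}
$$
And now for the tricky bit! Recall we learned in SR that the four-velocity of a stationary particle was the speed of light in the direction of time, cf. § 2.2, so that
$$
\begin{aligned}
\vec U\cdot\vec U &= g_{\alpha\beta}U^{\alpha}U^{\beta} \\
&= U^{0}U^{0}g_{00} \\
&= 1\cdot 1\cdot(-1) = -1, && \text{Eq. (2.28).}
\end{aligned}
\tag{7.72}
$$
Now in GR the metric has changed, but do we keep the magnitude $\vec U\cdot\vec U = -1$ and change the components of $U^{\alpha}$ accordingly? Yes, we do. [Personally I think this wasn’t explained clearly by Schutz and it’s one of my few criticisms of this text.] For the static fluid this means $g_{00}U^{0}U^{0} = -1$, i.e. $U^{0}U^{0} = 1/(-g_{00})$. Once one sees this small but important step, the result follows easily. One either considers the $\mu = 0$ component or factors $g^{\mu i}$ out:
$$
\begin{aligned}
0 &= (\rho + p)\,\tfrac{1}{2}g^{\mu i}\left(-g_{00,i}\right)U^{0}U^{0} + p_{,i}\,g^{\mu i} \\
&= (\rho + p)\,\tfrac{1}{2}g^{\mu i}\left(-g_{00,i}\right)\frac{1}{-g_{00}} + p_{,i}\,g^{\mu i} \\
&= g^{\mu i}\left[(\rho + p)\,\tfrac{1}{2}\left(-g_{00,i}\right)\frac{1}{-g_{00}} + p_{,i}\right] \\
\Rightarrow\quad 0 &= (\rho + p)\,\tfrac{1}{2}\left(-g_{00,i}\right)\frac{1}{-g_{00}} + p_{,i} \\
&= (\rho + p)\,\tfrac{1}{2}\left[\ln(-g_{00})\right]_{,i} + p_{,i} && \text{chain rule of differential calculus}
\end{aligned}
\tag{7.73}
$$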
5. (d) There appears to be a relationship between $g_{00}$ and $-\exp(2\phi)$, where $\phi$ is the Newtonian potential. Show that Eq. (7.8) and Exer. 4 are consistent with this.

First we have to make sense of “there is a close relation”. Let’s check if they are approximately equal in some situations (such as the Newtonian limit). If so, then the term
$$\tfrac{1}{2}\left[\ln(-g_{00})\right]_{,i} = \tfrac{1}{2}\left[\ln(\exp(2\phi))\right]_{,i} = \phi_{,i}$$
is the Newtonian gravitational force per unit mass that appears in the Newtonian fluid hydrostatic balance (except that we then have to neglect $p$ relative to $\rho$, since $p \ll \rho$). So yes, approximate equality (in some situations) appears to be what the relation is. Let’s assume that the Newtonian potential is small, $|\phi| \ll 1$ in non-dimensional units. Then
$$
\begin{aligned}
-\exp(2\phi) &= -\left(1 + 2\phi + O(\phi^2)\right) && \text{Taylor series about } \phi = 0 \\
&\approx -(1 + 2\phi) && \text{if } |\phi| \ll 1 \\
&= g_{00} && \text{as in Eq. (7.8).}
\end{aligned}
\tag{7.74}
$$
And in Exer. 4 we were required to show that in the Newtonian limit of Eq. (7.10), which was Eq. (7.15) and Eq. (7.24), there was only a dependence on $g_{00}$, and not on the other components of the metric. These equations correspond to the energy and momentum of a particle in a time-varying gravitational field. In the Newtonian limit we expect classical mechanics to apply, of course, from which we know that the energy depends upon the tidal forcing (time variation of the gravitational potential), and the momentum is altered by the gravitational force, which is the gradient of the gravitational potential. So $g_{00} \approx -\exp(2\phi)$ is consistent with this.
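To get a feel for how good the approximation in (7.74) is, one can compare $-\exp(2\phi)$ with $-(1 + 2\phi)$ for a few small values of $\phi$. The snippet below is only an illustration; the function names and the sample values of $\phi$ are my own choices.

```python
import math


def g00_exponential(phi):
    # The form suggested by the hydrostatic balance, g_00 = -exp(2 phi)
    return -math.exp(2.0 * phi)


def g00_weak_field(phi):
    # The weak-field form of Eq. (7.8), g_00 = -(1 + 2 phi)
    return -(1.0 + 2.0 * phi)


for phi in (1e-2, 1e-3, 1e-4):
    difference = g00_weak_field(phi) - g00_exponential(phi)
    # The leading neglected Taylor term is (2 phi)^2 / 2 = 2 phi^2
    print(phi, difference, 2.0 * phi ** 2)
```

The difference tracks $2\phi^2$, the first neglected term of the Taylor series, so the two forms agree to first order in $\phi$.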
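6. Deduce Eq. (7.25) from Eq. (7.10).

Eq. (7.10) was the equation for a geodesic,
$$
\begin{aligned}
\nabla_{\vec p}\,(\vec p) &= 0 \\
g_{\beta\mu}\left[\nabla_{\vec p}\,(\vec p)\right]^{\mu} = 0 &= g_{\beta\mu}\left[p^{\alpha}p^{\mu}{}_{;\alpha}\right] \\
&= p^{\alpha}\left[g_{\beta\mu}p^{\mu}\right]_{;\alpha} && \text{using Eq. (6.31)} \\
&= p^{\alpha}p_{\beta;\alpha}
\end{aligned}
\tag{7.75}
$$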
7. We’re given four expressions for the line elements corresponding to four different metrics (i) . . . (iv). (i) is the Minkowski metric.

(a) For each metric, find as many conserved components of $p_{\alpha}$ of a freely falling particle’s four momentum as possible.

(i) All four components of $\tilde p$, i.e. $p_{\alpha}$, are conserved.

(ii) The conserved components of $\tilde p$ are:
$$p_t, \quad p_{\phi} \tag{7.76}$$
(iii) The conserved components of $\tilde p$ are:
$$p_t, \quad p_{\phi} \tag{7.77}$$
(iv) The conserved components of $\tilde p$ are:
$$p_{\phi} \tag{7.78}$$
(b) Use the results of Exer. 28 from § 6.9 to transform the Minkowski metric (i) to the form
$$ds'^2 = -dt^2 + dr^2 + r^2\left(d\theta^2 + \sin^2\theta\, d\phi^2\right).$$
Use this to argue that (ii) and (iv) are spherically symmetric. Does this increase the number of conserved components of $p_{\alpha}$?

Recall Exer. 28 from § 6.9 gave us the metric on the surface of a sphere in 3D Cartesian space. But the 3 spatial coordinates of Minkowski space look like those of 3D Cartesian space. So we can represent them in spherical polar coordinates with the same metric, while keeping the time component. I suppose one also needs to know that $dr^2$ gives the radial contribution to the line element in spherical polar coordinates, yet this was not provided in Exer. 28. I’m not sure about the correct answer to the remainder of this question, but I’ll have a stab at it. It’s clear for the Schwarzschild metric (ii) that for fixed $r$ (so on the surface of a 3-sphere manifold in this 4-space), the Schwarzschild metric has the same form (to within constants) as the metric for Minkowski space written in spherical coordinates. We know all components of $p_{\alpha}$ are conserved for the Minkowski metric. This indicates that linear combinations of components of $p_{\alpha}$ on the surface of the 3-sphere manifold in the Schwarzschild space are conserved. There is a linear transformation that converts this metric to the Minkowski metric everywhere on the 3-sphere. For the Robertson-Walker metric (iv), for fixed $r$ and $t$, the metric has the same form to within constants as the Minkowski metric written in spherical coordinates. So again this indicates that linear combinations of components of $p_{\alpha}$ on this manifold would be conserved.

(c) For metrics (i’), (ii), (iii), and (iv), a geodesic that begins tangent to the equatorial plane stays on the equatorial plane (i.e. starts with $\theta = \pi/2$ and $p^{\theta} = 0$ and keeps $\theta = \pi/2$ and $p^{\theta} = 0$). For cases (i’), (ii), and (iii), use the equation $\vec p\cdot\vec p = -m^2$ to solve for $p^r$ in terms of $m$, other conserved quantities, and known functions of position.

For (i’),
$$
\begin{aligned}
-m^2 = \vec p\cdot\vec p &= p^{\alpha}p^{\beta}g_{\alpha\beta} \\
&= -(p^t)^2 + (p^r)^2 + r^2\left[(p^{\theta})^2 + \sin^2\theta\,(p^{\phi})^2\right] && \text{writing out terms with metric (i’)} \\
&= -(p^t)^2 + (p^r)^2 + r^2(p^{\phi})^2 && \text{tangent to equatorial plane} \\
p^r &= \pm\sqrt{(p^t)^2 - r^2(p^{\phi})^2 - m^2}
\end{aligned}
\tag{7.79}
$$
We’re almost there, but remember it’s $p_t$ and $p_{\phi}$ that are conserved. So we need to relate $p^t$ and $p^{\phi}$ to them to fulfill the requirement of “other conserved quantities”. Because $(g_{\alpha\beta})$ is diagonal for metrics (i’) and (ii), a single factor relates each component:
$$
\begin{aligned}
p_t &= p^{\alpha}g_{\alpha t} = p^{t}g_{tt} = p^{t}(-1) \\
p_{\phi} &= p^{\alpha}g_{\alpha\phi} = p^{\phi}g_{\phi\phi} = p^{\phi}r^2\sin^2\theta = p^{\phi}r^2
\end{aligned}
\tag{7.80}
$$
Substitution in the above equation gives:
$$p^r = \pm\sqrt{(-p_t)^2 - (p_{\phi})^2/r^2 - m^2} \tag{7.81}$$
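For (ii),
$$
\begin{aligned}
-m^2 &= p^{\alpha}p^{\beta}g_{\alpha\beta} \\
&= -\left[1 - 2M/r\right](p^t)^2 + \frac{1}{1 - 2M/r}(p^r)^2 + r^2\left[(p^{\theta})^2 + \sin^2\theta\,(p^{\phi})^2\right] && \text{expanding} \\
&= -\left[1 - 2M/r\right](p^t)^2 + \frac{1}{1 - 2M/r}(p^r)^2 + r^2(p^{\phi})^2 && \text{tangent to equatorial plane} \\
p^r &= \pm\sqrt{1 - 2M/r}\,\sqrt{\left[1 - 2M/r\right](p^t)^2 - r^2(p^{\phi})^2 - m^2}
\end{aligned}
\tag{7.82}
$$
Again we must find the corresponding conserved quantities:
$$
\begin{aligned}
p_t &= p^{t}g_{tt} = p^{t}(-1)\left[1 - 2M/r\right] \\
p_{\phi} &= p^{\phi}g_{\phi\phi} = p^{\phi}r^2
\end{aligned}
\tag{7.83}
$$
Substitution in the above equation gives:
$$p^r = \pm\sqrt{1 - 2M/r}\,\sqrt{(p_t)^2/\left[1 - 2M/r\right] - (p_{\phi})^2/r^2 - m^2} \tag{7.84}$$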
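A result like (7.84) is easy to get wrong by a factor of $1 - 2M/r$, so here is a small consistency check. It is a sketch under my own assumptions (the function names and the numerical values of $M$, $r$, $m$, $p_t$, $p_{\phi}$ are chosen only for illustration): it computes $p^r$ from the conserved lower components using (7.84), raises the indices with (7.83), and confirms that $\vec p\cdot\vec p = -m^2$ on the equatorial plane.

```python
import math


def p_r_schwarzschild(M, r, m, p_t, p_phi):
    """Outgoing radial component p^r from Eq. (7.84), equatorial plane."""
    f = 1.0 - 2.0 * M / r
    radicand = f * (p_t ** 2 / f - p_phi ** 2 / r ** 2 - m ** 2)
    if radicand < 0.0:
        raise ValueError("no real p^r: this radius is not reachable")
    return math.sqrt(radicand)


def norm_squared(M, r, pt_up, pr_up, pphi_up):
    """p.p on the equatorial plane of the Schwarzschild metric, cf. (7.82)."""
    f = 1.0 - 2.0 * M / r
    return -f * pt_up ** 2 + pr_up ** 2 / f + r ** 2 * pphi_up ** 2


M, r, m = 1.0, 10.0, 1.0       # illustrative values in geometric units
p_t, p_phi = -1.2, 3.0         # conserved lower components (illustrative)

f = 1.0 - 2.0 * M / r
pt_up = -p_t / f               # inverting Eq. (7.83)
pphi_up = p_phi / r ** 2
pr_up = p_r_schwarzschild(M, r, m, p_t, p_phi)

print(pr_up, norm_squared(M, r, pt_up, pr_up, pphi_up))
```

Setting $M = 0$ reduces this to the Minkowski result (7.81).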
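For (iii), it’s instructive to note immediately that on the equatorial plane $\rho^2 = r^2 + a^2\cos^2\theta = r^2$, so that
$$
\begin{aligned}
-m^2 &= p^{\alpha}p^{\beta}g_{\alpha\beta} \\
&= -\frac{\Delta - a^2}{r^2}(p^t)^2 - \frac{4aM}{r}(p^t p^{\phi}) + \frac{(r^2 + a^2)^2 - a^2\Delta}{r^2}(p^{\phi})^2 + \frac{r^2}{\Delta}(p^r)^2
\end{aligned}
\tag{7.85}
$$
The above equation involves known functions of position, and $p^t$ and $p^{\phi}$, which are not conserved. Because the Kerr metric is not diagonal, it’s more involved to find the corresponding conserved quantities:
$$
\begin{aligned}
p_t &= p^{\sigma}g_{t\sigma} = g_{tt}\,p^{t} + g_{t\phi}\,p^{\phi} \\
p_{\phi} &= p^{\sigma}g_{\phi\sigma} = g_{\phi\phi}\,p^{\phi} + g_{t\phi}\,p^{t}
\end{aligned}
\tag{7.86}
$$
We solve this $2 \times 2$ system for $p^t$ and $p^{\phi}$ to find:
$$
\begin{aligned}
p^{t} &= -p_t\,\frac{g_{\phi\phi}}{g_{\phi t}^2 - g_{tt}g_{\phi\phi}} + p_{\phi}\,\frac{g_{t\phi}}{g_{\phi t}^2 - g_{tt}g_{\phi\phi}} \\
p^{\phi} &= p_t\,\frac{g_{t\phi}}{g_{\phi t}^2 - g_{tt}g_{\phi\phi}} - p_{\phi}\,\frac{g_{tt}}{g_{\phi t}^2 - g_{tt}g_{\phi\phi}}
\end{aligned}
\tag{7.87}
$$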
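The inversion in (7.87) can be checked with a round trip: lower the indices with (7.86), raise them again with (7.87), and confirm that the original components come back. The sketch below is my own illustration. The function names are hypothetical, and the metric values are arbitrary numbers with the right signs, not actual Kerr values.

```python
def lower_t_phi(g_tt, g_tphi, g_phiphi, pt_up, pphi_up):
    """Eq. (7.86): lower the t and phi components with a non-diagonal metric."""
    p_t = g_tt * pt_up + g_tphi * pphi_up
    p_phi = g_phiphi * pphi_up + g_tphi * pt_up
    return p_t, p_phi


def raise_t_phi(g_tt, g_tphi, g_phiphi, p_t, p_phi):
    """Eq. (7.87): solve the 2 x 2 system for the upper components."""
    denom = g_tphi ** 2 - g_tt * g_phiphi
    pt_up = (-p_t * g_phiphi + p_phi * g_tphi) / denom
    pphi_up = (p_t * g_tphi - p_phi * g_tt) / denom
    return pt_up, pphi_up


# Arbitrary illustrative values (not computed from the Kerr line element)
g_tt, g_tphi, g_phiphi = -0.8, -0.3, 25.0
pt_up, pphi_up = 1.5, 0.04

p_t, p_phi = lower_t_phi(g_tt, g_tphi, g_phiphi, pt_up, pphi_up)
print(p_t, p_phi)
print(raise_t_phi(g_tt, g_tphi, g_phiphi, p_t, p_phi))
```

The second printed pair reproduces the starting values of $p^t$ and $p^{\phi}$ up to rounding.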
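Substitution in the above equation would give the required equation, but it would be very messy. We simply note that we have met the requirement in principle because $g_{tt}$, $g_{\phi t}$, $g_{\phi\phi}$ are known functions of position.

(d) For (iv), spherical symmetry implies that if a geodesic begins with $p^{\theta} = p^{\phi} = 0$, these remain zero. Use this to show from Eq. (7.29) that when $k = 0$, $p_r$ is conserved.

When $k = 0$ the Robertson-Walker metric simplifies to:
$$
(g_{\alpha\beta}) =
\begin{pmatrix}
-1 & 0 & 0 & 0 \\
0 & R^2(t) & 0 & 0 \\
0 & 0 & r^2R^2(t) & 0 \\
0 & 0 & 0 & r^2\sin^2\theta\,R^2(t)
\end{pmatrix}
$$
Writing out Eq. (7.29) for $p_r$ and this metric we get:
$$
\begin{aligned}
m\frac{d p_{\beta}}{d\tau} &= \tfrac{1}{2}g_{\nu\alpha,\beta}\,p^{\nu}p^{\alpha} && \text{Eq. (7.29)} \\
m\frac{d p_{r}}{d\tau} &= \tfrac{1}{2}g_{\nu\alpha,r}\,p^{\nu}p^{\alpha} \\
&= \tfrac{1}{2}\left[g_{tt,r}(p^t)^2 + g_{rr,r}(p^r)^2 + g_{\theta\theta,r}(p^{\theta})^2 + g_{\phi\phi,r}(p^{\phi})^2\right] \\
&= \tfrac{1}{2}\left[g_{\theta\theta,r}(p^{\theta})^2 + g_{\phi\phi,r}(p^{\phi})^2\right] && \text{clear from metric} \\
&= 0 && \text{clear from given initial conditions and “spherical symmetry”.}
\end{aligned}
\tag{7.88}
$$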
8. Suppose that in some coordinate system the components of the metric $g_{\alpha\beta}$ are independent of some coordinate $x^{\mu}$. (a) Show that the conservation law $T^{\nu}{}_{\mu;\nu} = 0$ for any stress-energy tensor becomes that given by Eq. (7.41).

If it’s not obvious where this conservation law came from, have a go at supplementary problem SP 2 in § 7.7 below. Let’s start with expanding Eq. (7.41) to see what’s really there.
$$
\begin{aligned}
0 &= \frac{1}{\sqrt{-g}}\left(\sqrt{-g}\,T^{\nu}{}_{\mu}\right)_{,\nu} \\
&= T^{\nu}{}_{\mu,\nu} + T^{\nu}{}_{\mu}\frac{(\sqrt{-g})_{,\nu}}{\sqrt{-g}} && \text{product rule of differential calculus} \\
&= T^{\nu}{}_{\mu,\nu} + T^{\nu}{}_{\mu}\,\Gamma^{\alpha}{}_{\nu\alpha} && \text{Eq. (6.40).}
\end{aligned}
\tag{7.89}
$$
This now looks somewhat like the given conservation law, for if we expand it using Eqs. (6.34) and (6.35), we obtain
$$T^{\nu}{}_{\mu;\nu} = T^{\nu}{}_{\mu,\nu} + T^{\nu}{}_{\mu}\,\Gamma^{\alpha}{}_{\nu\alpha} - T^{\nu}{}_{\sigma}\,\Gamma^{\sigma}{}_{\mu\nu} \tag{7.90}$$
Comparing (7.89) and (7.90) we see that we merely have to show that, under the given conditions on the metric, the final term in (7.90) vanishes. Let’s start with the most general expression relating the Christoffel symbol to the metric tensor, and then apply the restriction given here:
$$
\begin{aligned}
T^{\nu}{}_{\sigma}\,\Gamma^{\sigma}{}_{\mu\nu} &= T^{\nu}{}_{\sigma}\,\tfrac{1}{2}g^{\sigma\beta}\left[g_{\beta\mu,\nu} + g_{\beta\nu,\mu} - g_{\mu\nu,\beta}\right] && \text{Eq. (6.32)} \\
&= T^{\nu}{}_{\sigma}\,\tfrac{1}{2}g^{\sigma\beta}\left[g_{\beta\mu,\nu} - g_{\mu\nu,\beta}\right] && \text{when } g \text{ independent of } x^{\mu} \\
&= T^{\nu\beta}\,\tfrac{1}{2}\left[g_{\beta\mu,\nu} - g_{\mu\nu,\beta}\right] && \text{just raised index} \\
&= 0,
\end{aligned}
\tag{7.91}
$$
because $T^{\nu\beta}$ is symmetric on $\nu, \beta$ while the quantity in $[\ ]$ is antisymmetric on these indices, cf. Exer. 26 of § 3.10.

(b) Suppose that in these coordinates $T^{\alpha\beta} \neq 0$ only in some bounded region of each spacelike hypersurface $x^0 = \text{const}$. Show that Eq. (7.41) implies
$$\int_{x^0 = \text{const.}} T^{\nu}{}_{\mu}\sqrt{-g}\; n_{\nu}\, d^3x$$
is independent of time $x^0$, if $n_{\nu}$ is the unit normal to the hypersurface.

I believe this is a typo and should be written with $\nu = 0$, i.e.:
$$\int_{x^0 = \text{const.}} T^{0}{}_{\mu}\sqrt{-g}\; n_{0}\, d^3x$$
Starting with Eq. (7.41) we integrate both sides, using the proper volume element, $\sqrt{-g}\,d^4x$ (cf. Eq. (6.18)):
$$
\begin{aligned}
\int \frac{1}{\sqrt{-g}}\left(\sqrt{-g}\,T^{\nu}{}_{\mu}\right)_{,\nu}\sqrt{-g}\; d^4x &= 0 \\
\int \left(\sqrt{-g}\,T^{\nu}{}_{\mu}\right)_{,\nu}\, d^4x &= 0 \\
\oint \sqrt{-g}\,T^{\nu}{}_{\mu}\, n_{\nu}\, d^3S &= 0 && \text{Eq. (6.44)}
\end{aligned}
\tag{7.92}
$$
I found it helpful at this point to picture using Gauss’s law in a simple setting of 2D Cartesian space, with some field being nonzero only in some region of finite extent. In this simple case one would choose the bounding limits to be straight lines $x = x_1$ and $x = x_2$ and $y = y_1$ and $y = y_2$ that lie outside the region of nonzero field. By analogy, we expand the LHS of (7.92)
7.7
Rob’s supplementary problems
SP 1. Recall we learned in SR that the four velocity of a stationary particle was the speed of light in the direction of time, c.f. § 2.2, so that
\[
\vec{U}\cdot\vec{U} = g_{\alpha\beta}U^{\alpha}U^{\beta} = U^{0}U^{0}g_{00} = 1\cdot 1\cdot(-1) = -1, \qquad \text{Eq. (2.28).}
\tag{7.93}
\]
Now in GR the metric has changed, but do we keep the magnitude $\vec{U}\cdot\vec{U} = -1$ and change the components of $U^{\alpha}$ accordingly? I recommend you do this problem before tackling Exer. 5(c).
Yes, we do. [Personally I think this wasn’t explained clearly by Schutz up to this point in the text and it’s one of my few criticisms.] See Eq. (10.18).
SP 2. Show that T νµ;ν = 0 can be derived from the conservation law Eq. (7.6).
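We simply multiply Eq. (7.6) by the metric tensor to lower the index.
\[
\begin{aligned}
0 &= T^{\sigma\nu}{}_{;\nu} && \text{Eq. (7.6)} \\
  &= T^{\nu\sigma}{}_{;\nu} && \text{symmetry of the stress-energy tensor} \\
0 &= g_{\mu\sigma}\,T^{\nu\sigma}{}_{;\nu} && \text{multiplying both sides by } g \\
  &= \left(g_{\mu\sigma}\,T^{\nu\sigma}\right)_{;\nu} && \text{using Eq. (6.31)} \\
  &= T^{\nu}{}_{\mu;\nu} && \text{et voilà!}
\end{aligned}
\tag{7.94}
\]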
Chapter 8 The Einstein Field Equations
8.1
Purpose and justification of the field equations
Re-reading this long after Chapter 7, I found it strange that he was justifying Eq. (7.6) based upon the Einstein equivalence principle, see discussion between Eqs. (8.4) and (8.5) on p. 185. But re-reading Chapter 7 it is clear that he’s just referring to the application of the “comma goes to semi-colon” rule. That is, Eq. (7.5), the conservation of four-momentum in SR, generalizes (by the Einstein equivalence principle) to Eq. (7.6).
8.2
Einstein’s equations
8.3
Einstein’s equations for weak gravitational fields
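It’s perhaps helpful to go through the derivation of Eq. (8.22).
\[
\begin{aligned}
g_{\alpha'\beta'} &= \Lambda^{\mu}{}_{\alpha'}\Lambda^{\nu}{}_{\beta'}\,g_{\mu\nu} && \text{transformation of a 2nd rank tensor, like Eq. (8.16)} \\
 &= \Lambda^{\mu}{}_{\alpha'}\Lambda^{\nu}{}_{\beta'}\,(\eta_{\mu\nu} + h_{\mu\nu}) && \text{substitute Eq. (8.12)} \\
 &= [\delta^{\mu}{}_{\alpha} - \varepsilon^{\mu}{}_{,\alpha}][\delta^{\nu}{}_{\beta} - \varepsilon^{\nu}{}_{,\beta}]\,(\eta_{\mu\nu} + h_{\mu\nu}) && \text{substitute Eq. (8.21)} \\
 &= \eta_{\alpha\beta} + h_{\alpha\beta} - \eta_{\alpha\nu}\,\varepsilon^{\nu}{}_{,\beta} - \eta_{\beta\mu}\,\varepsilon^{\mu}{}_{,\alpha} && \text{keeping just the largest terms} \\
 &= \eta_{\alpha\beta} + h_{\alpha\beta} - \varepsilon_{\alpha,\beta} - \varepsilon_{\beta,\alpha}
\end{aligned}
\tag{8.1}
\]
which gives Eq. (8.22). It’s tempting to justify the final step above by the metric tensor lowering the indices $\mu$ and $\nu$. But note that Schutz is careful not to say this, but instead says that we define $\varepsilon_{\alpha} \equiv \eta_{\alpha\beta}\,\varepsilon^{\beta}$, see his Eq. (8.23). I guess this is because $\eta$ is not the full metric, but only the dominant piece of it.

When $g_{\alpha\beta}$ is given by Eq. (8.12) then Eq. (8.25) follows immediately from Eq. (6.67), which is $R_{\alpha\beta\mu\nu}$ in a local inertial frame. All one has to do is substitute Eq. (8.12) and lower the index. Note the typo in Eq. (6.67).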
8.6
Exercises
1. Show that Eq. (8.2) is a solution to Eq. (8.1) using the method suggested.
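Gauss’ law in 3D space is
\[
\int_{\Omega} \nabla\cdot(\nabla\phi)\,dV = \int_{\partial\Omega} \hat{n}\cdot(\nabla\phi)\,dA
\]
where $\Omega$ is a volume, $dV$ a differential volume element, $\partial\Omega$ is the surface bounding $\Omega$, $\hat{n}$ is the outward pointing unit normal vector, and $dA$ is an area element on the bounding surface. As suggested, we consider a point particle at the origin of a spherical co-ordinate system. Then the left-hand integral is trivial:
\[
\begin{aligned}
\int_{\Omega} \nabla\cdot(\nabla\phi)\,dV &= \int_{\Omega} 4\pi G\rho\,dV, && \text{using Eq. (8.1)} \\
 &= 4\pi G\int_{\Omega}\rho\,dV \\
 &= 4\pi G m,
\end{aligned}
\tag{8.2}
\]
where $m$ is the mass of the particle. We have assumed that the particle is in a vacuum so that $\phi$ is only due to this particle. The right-hand side gives
\[
\begin{aligned}
\int_{\partial\Omega} \hat{n}\cdot(\nabla\phi)\,dA &= \int_{\partial\Omega} \frac{d\phi}{dr}\,dA \\
 &= \frac{d\phi}{dr}\,4\pi R^2, && \text{using surface of sphere for } \partial\Omega
\end{aligned}
\tag{8.3}
\]
where $R$ is the radius of the sphere and by spherical symmetry the integrand is constant on the surface of the sphere. Combining these two results gives the 1st order ODE:
\[
\frac{d\phi}{dr} = \frac{Gm}{r^2}
\]
To solve this we integrate both sides, imposing the BC $\phi(\infty) = 0$,
\[
\begin{aligned}
\int_R^{\infty} \frac{d\phi}{dr}\,dr &= \int_R^{\infty} \frac{Gm}{r^2}\,dr \\
\left[\phi\right]_R^{\infty} &= -\left[\frac{Gm}{r}\right]_R^{\infty} \\
\phi(R) &= -\frac{Gm}{R}
\end{aligned}
\tag{8.4}
\]
which is Eq. (8.2).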
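2 (a). Derive the two given conversion factors.

One simply plugs in the given constants with values in SI units to calculate the numerical value. One should also convince oneself that the units are correct. The SI values and units were given in Table 8.1. Personally I find it easy to remember the formulae that the constants are used in and figure out the units from there.
\[
\begin{aligned}
\frac{G_{SI}\,[\mathrm{m^3\,kg^{-1}\,s^{-2}}]}{c_{SI}^2\,[\mathrm{m\,s^{-1}}]^2}
 &= \frac{G_{SI}\,[\mathrm{m^3\,kg^{-1}\,s^{-2}}]}{c_{SI}^2\,[\mathrm{m^2\,s^{-2}}]} \\
 &= \frac{G_{SI}}{c_{SI}^2}\,\frac{[\mathrm{m}]}{[\mathrm{kg}]} \\
 &= \frac{6.674\times10^{-11}}{(2.998\times10^{8})^2}\,[\mathrm{m\,kg^{-1}}] \\
 &= 7.425\times10^{-28}\,[\mathrm{m\,kg^{-1}}]
\end{aligned}
\tag{8.5}
\]
\[
\begin{aligned}
\frac{c_{SI}^5\,[\mathrm{m\,s^{-1}}]^5}{G_{SI}\,[\mathrm{m^3\,kg^{-1}\,s^{-2}}]}
 &= \frac{c_{SI}^5\,[\mathrm{m^5\,s^{-5}}]}{G_{SI}\,[\mathrm{m^3\,kg^{-1}\,s^{-2}}]} \\
 &= \frac{c_{SI}^5}{G_{SI}}\,\frac{[\mathrm{m^2\,kg}]}{[\mathrm{s^3}]} \\
 &= \frac{(2.998\times10^{8})^5}{6.674\times10^{-11}}\,[\mathrm{N\,m\,s^{-1}}] \\
 &= 3.629\times10^{52}\,[\mathrm{J\,s^{-1}}]
\end{aligned}
\tag{8.6}
\]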
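The arithmetic for the two conversion factors in Exercise 2(a) can be checked with a few lines of Python. This is only a sketch: the function names are my own (hypothetical), and the rounded SI values of G and c are the ones quoted in the exercise.

```python
# Minimal check of the conversion factors G/c^2 and c^5/G from Exercise 2(a).
# Function names are illustrative; constants are the rounded SI values used above.

G_SI = 6.674e-11  # gravitational constant [m^3 kg^-1 s^-2]
C_SI = 2.998e8    # speed of light [m s^-1]


def mass_to_length_factor(G=G_SI, c=C_SI):
    """Return G/c^2, which converts kilograms to metres [m kg^-1]."""
    return G / c**2


def power_to_unitless_factor(G=G_SI, c=C_SI):
    """Return c^5/G, the SI power equal to 1 in geometrized units [J s^-1]."""
    return c**5 / G


if __name__ == "__main__":
    print(f"G/c^2 = {mass_to_length_factor():.3e} m/kg")
    print(f"c^5/G = {power_to_unitless_factor():.3e} J/s")
```

Multiplying a mass in kg by the first factor gives its geometrized value in metres, e.g. the 1476 m used for the Sun in Exercise 3.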
2 (b). Derive the constants in Table 8.1 in geometrized units.

Here I’ll just get you started. The numerical computations were performed in the corresponding Maple™ script for Chapter 8.

Table 8.1: Conversion between SI and geometrized units

Constant      Geometrized units   Value in terms of constants in SI
$c$           unitless            $c_{SI}\,(c_{SI})^{-1}$
$G$           unitless            $G_{SI}\,G_{SI}^{-1}$
$\hbar$       m$^2$               $\hbar_{SI}\,G_{SI}\,c_{SI}^{-3}$
$m_e$         m                   $m_{e\,SI}\,G_{SI}\,c_{SI}^{-2}$
$M_\odot$     m                   $M_{\odot\,SI}\,G_{SI}\,c_{SI}^{-2}$
$L_\odot$     unitless            $L_{\odot\,SI}\,G_{SI}\,c_{SI}^{-5}$
2 (c) Express the following in geometrized units.
The trick is again to find the right combination of $c$ and $G$ that gives the right units. By “right units” we mean that you’re not allowed to have [kg] or [s] or units derived from these, like [N]. But you are allowed to have [m].

(i) Density of a neutron star.
\[
\rho_{SI}\,G_{SI}\,c_{SI}^{-2}\ [\mathrm{m^{-2}}]
\tag{8.7}
\]
(ii) Pressure in a neutron star.
\[
p_{SI}\,G_{SI}\,c_{SI}^{-4}\ [\mathrm{m^{-2}}]
\tag{8.8}
\]
(iii) Acceleration.
\[
g_{SI}\,c_{SI}^{-2}\ [\mathrm{m^{-1}}]
\tag{8.9}
\]
(iv) Luminosity of a neutron star.
\[
L_{SI}\,G_{SI}\,c_{SI}^{-5}\ [\text{unitless}]
\tag{8.10}
\]
2 (d). Find the Planck length, mass, and time all in SI units.

Now the goal is to find the right combination of $c$ and $G$ and $\hbar$ such that the quantity has dimensions of length [m], mass [kg], and time [s] respectively. Computations were performed in the accompanying Maple™ worksheet for Chapter 8.

(i) Planck length.
\[
L_P = \left(\hbar_{SI}\,G_{SI}\,c_{SI}^{-3}\right)^{1/2}[\mathrm{m}] = 1.62\times10^{-35}\,[\mathrm{m}]
\tag{8.11}
\]
(ii) Planck mass.
\[
M_P = \left(\hbar_{SI}\,G_{SI}^{-1}\,c_{SI}\right)^{1/2}[\mathrm{kg}] = 2.18\times10^{-8}\,[\mathrm{kg}]
\tag{8.12}
\]
(iii) Planck time.
\[
T_P = \left(\hbar_{SI}\,G_{SI}\,c_{SI}^{-5}\right)^{1/2}[\mathrm{s}] = 5.39\times10^{-44}\,[\mathrm{s}]
\tag{8.13}
\]
Now we are to compare these three scales with elementary particles. Elementary particles are considered “point particles”, with no inherent radius
(Perkins, 2009). Experiments in high-energy particle physics explore scales down to $10^{-17}$ m, or 100 times smaller than the charge radius of a proton (Perkins, 2009, p. 3), which is still much, much larger than the Planck scale. The heavier leptons, i.e. the muon and the tauon, are unstable and decay with mean lifetimes of $2.2\times10^{-6}$ and $2.9\times10^{-13}$ [s], much, much longer than the Planck time. In contrast, the Planck mass appears to be rather large, much heavier than the electron.
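The Planck scales of Exercise 2(d) can be reproduced without Maple. The following is a minimal sketch, not the accompanying worksheet: the function names are hypothetical, and the values of ħ and of the electron mass are standard rounded SI values that I am assuming here rather than quoting from the notes.

```python
import math

# Rounded SI constants. HBAR_SI and M_ELECTRON_SI are assumed standard values,
# not taken from the notes above.
HBAR_SI = 1.0546e-34        # reduced Planck constant [J s]
G_SI = 6.674e-11            # gravitational constant [m^3 kg^-1 s^-2]
C_SI = 2.998e8              # speed of light [m s^-1]
M_ELECTRON_SI = 9.109e-31   # electron mass [kg]


def planck_length():
    """(hbar G / c^3)^(1/2) in metres."""
    return math.sqrt(HBAR_SI * G_SI / C_SI**3)


def planck_mass():
    """(hbar c / G)^(1/2) in kilograms."""
    return math.sqrt(HBAR_SI * C_SI / G_SI)


def planck_time():
    """(hbar G / c^5)^(1/2) in seconds."""
    return math.sqrt(HBAR_SI * G_SI / C_SI**5)


if __name__ == "__main__":
    print(f"L_P = {planck_length():.3e} m")
    print(f"M_P = {planck_mass():.3e} kg")
    print(f"T_P = {planck_time():.3e} s")
    # How much heavier the Planck mass is than the electron.
    print(f"M_P / m_e = {planck_mass() / M_ELECTRON_SI:.2e}")
```

The last line makes the closing remark quantitative: with these inputs the Planck mass exceeds the electron mass by more than twenty orders of magnitude.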
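3 (a). Calculate the following in geometrized units:

(i) Newtonian potential of Sun at the surface of the Sun.
\[
\phi = -\frac{GM_\odot}{r} = -\frac{1\cdot 1476\ \mathrm{m}}{6.96\times10^{8}\ \mathrm{m}} = -2.12\times10^{-6}
\tag{8.14}
\]
(ii) Newtonian potential of Sun at the radius of the Earth’s orbit.
\[
\phi = -\frac{GM_\odot}{r} = -\frac{1\cdot 1476\ \mathrm{m}}{1.496\times10^{11}\ \mathrm{m}} = -9.87\times10^{-9}
\tag{8.15}
\]
(iii) Newtonian potential of Earth at the surface of the Earth.
\[
\phi = -\frac{GM_\oplus}{r} = -\frac{1\cdot 4.434\times10^{-3}\ \mathrm{m}}{6.371\times10^{6}\ \mathrm{m}} = -6.96\times10^{-10}
\tag{8.16}
\]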
3 (b). Why is the Sun’s potential at the Earth’s radius greater than the Earth’s own potential there, yet we feel the Earth’s gravitational pull more?

The force of gravity per unit mass is determined by the gradient of the gravitational potential. While the Sun’s potential is larger in magnitude at the surface of the Earth, its gradient there is much smaller than that of the Earth’s own gravitational potential.
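To make the comparison in Exercise 3(a) and 3(b) concrete, here is a small Python sketch that evaluates both the potentials and their radial gradients in geometrized units. The helper names are hypothetical; the masses and radii are the ones used in Exercise 3(a).

```python
# Newtonian potentials and gradients in geometrized units (G = c = 1).
# Masses and radii are those quoted in Exercise 3(a); names are illustrative.

M_SUN = 1476.0       # solar mass [m]
M_EARTH = 4.434e-3   # Earth mass [m]
R_SUN = 6.96e8       # solar radius [m]
R_ORBIT = 1.496e11   # radius of the Earth's orbit [m]
R_EARTH = 6.371e6    # Earth radius [m]


def potential(mass, radius):
    """Newtonian potential phi = -M/r (dimensionless)."""
    return -mass / radius


def potential_gradient(mass, radius):
    """Magnitude of d(phi)/dr = M/r^2 [m^-1]."""
    return mass / radius**2


if __name__ == "__main__":
    print(f"phi_Sun(surface)    = {potential(M_SUN, R_SUN):.3e}")
    print(f"phi_Sun(at Earth)   = {potential(M_SUN, R_ORBIT):.3e}")
    print(f"phi_Earth(surface)  = {potential(M_EARTH, R_EARTH):.3e}")
    sun_grad = potential_gradient(M_SUN, R_ORBIT)
    earth_grad = potential_gradient(M_EARTH, R_EARTH)
    print(f"gradient ratio Earth/Sun = {earth_grad / sun_grad:.0f}")
```

With these numbers the Sun's potential at the Earth is roughly 14 times deeper than the Earth's own surface potential, yet the Earth's potential gradient is over a thousand times steeper, which is the point of part (b).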
3 (c). Show that a circular orbit around a body of mass M has an orbital velocity, in Newtonian theory, of $v^2 = -\phi$, where $\phi$ is the Newtonian potential.

The centripetal acceleration of a body in a circular orbit of radius $r$ is
\[
\vec{\Omega}\times(\vec{\Omega}\times r\,\vec{e}_r) = -\frac{v^2}{r}\,\vec{e}_r
\tag{8.17}
\]
as can be found in elementary texts on mechanics (or see Lesson 2 in my course notes http://stockage.univ-brest.fr/~scott/GFD1/2012/index_gfd1.html). Equating this to the Newtonian gravitational acceleration, $-\nabla\phi = -(M/r^2)\,\vec{e}_r$ for $\phi = -M/r$, gives $v^2 = M/r = -\phi$.
4 (a). Let A be an $n \times n$ matrix whose entries are all very small, $|A_{i,j}| \ll 1/n$, and let I be the unit matrix. Show that
\[
(I + A)^{-1} = I - A + A^2 - A^3 \ldots
\]
by first showing that (i) the series on the RHS converges absolutely, and (ii) $(I + A)$ times the RHS gives I.

(i) A series
\[
\sum_{p=0}^{\infty} a_p
\]
is absolutely convergent if
\[
\sum_{p=0}^{\infty} |a_p| < \infty
\]
see http://en.wikipedia.org/wiki/Absolute_convergence. Let’s say that the largest term is
\[
\max |A_{i,j}| = \epsilon\,\frac{1}{n}, \qquad \epsilon \ll 1,
\]
where the maximum is over all $i$ and $j$. Then consider an arbitrary term of the RHS,
\[
R_{i,j} = \sum_{p=0}^{\infty} a_p,
\]
with $a_0 = 1$ from I (on the diagonal). The next contribution from the series, $p = 1$, is $-A_{i,j}$, which makes contribution
\[
|a_1| = |-A_{i,j}| \le \epsilon\,\frac{1}{n}
\]
The next term in the series $a_2$ has magnitude
\[
|a_2| = |A_{i,k}A_{k,j}| \le n \times \left(\epsilon\,\frac{1}{n}\right)^2 = \epsilon^2\,\frac{1}{n}
\]
The next term in the series $a_3$ has magnitude
\[
|a_3| = |A_{i,k}A_{k,l}A_{l,j}| \le n \times n \times \left(\epsilon\,\frac{1}{n}\right)^3 = \epsilon^3\,\frac{1}{n}
\]
So we see that
\[
|a_p| \le \epsilon^p\,\frac{1}{n}
\]
which is $1/n$ times the geometric series from 1 to $\infty$ with $|\epsilon| < 1$, for which the sum is finite. Thus
\[
\sum_{p=0}^{\infty} |a_p| \le 1 + \frac{1}{n}\,\frac{\epsilon}{1-\epsilon}
\]
which proves that the RHS is absolutely convergent.

(ii) The next step is actually easier. Multiplying the RHS by $(I + A)$ gives two sets of terms. The first set comes from I and is of course just the RHS again. The next set of terms is like the RHS but each term has A to one higher power. But because the terms alternate in sign, the second set cancels all the terms in the first set but the I!
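The result of Exercise 4(a) is easy to test numerically. The sketch below, in plain Python with nested lists, sums the first few terms of $I - A + A^2 - \ldots$ and checks that $(I + A)$ times the partial sum is close to the identity. The helper names and the particular test matrix are my own illustrative choices; the only property used is that every entry satisfies $|A_{i,j}| \le \epsilon/n$.

```python
# Numerical illustration of (I + A)^(-1) = I - A + A^2 - A^3 ... for small A.
# Helper names and the test matrix are illustrative, not from the text.

def identity(n):
    """n x n identity matrix as nested lists."""
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def matmul(X, Y):
    """Product of two square matrices stored as nested lists."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def neumann_inverse(A, terms):
    """Partial sum I - A + A^2 - ... containing `terms` terms."""
    n = len(A)
    total = identity(n)
    power = identity(n)
    for p in range(1, terms):
        power = matmul(power, A)              # power is now A^p
        sign = -1.0 if p % 2 else 1.0         # alternating signs
        total = [[total[i][j] + sign * power[i][j] for j in range(n)]
                 for i in range(n)]
    return total


def small_matrix(n, eps):
    """Test matrix with |A_ij| <= eps/n, as required in the exercise."""
    return [[(eps / n) * (-1) ** (i + j) * (i + j + 1) / (2 * n - 1)
             for j in range(n)] for i in range(n)]


def max_deviation_from_identity(A, terms):
    """Largest entry of |(I + A) * partial_sum - I|."""
    n = len(A)
    I = identity(n)
    I_plus_A = [[I[i][j] + A[i][j] for j in range(n)] for i in range(n)]
    product = matmul(I_plus_A, neumann_inverse(A, terms))
    return max(abs(product[i][j] - I[i][j]) for i in range(n) for j in range(n))


if __name__ == "__main__":
    A = small_matrix(3, 0.1)
    for terms in (1, 2, 4, 8):
        print(terms, max_deviation_from_identity(A, terms))
```

The deviation shrinks geometrically with the number of terms, as the bound $|a_p| \le \epsilon^p/n$ suggests. Truncating after two terms is exactly the first-order approximation needed for part (b).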
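4 (b). Use results from (a) to establish Eq. (8.21) from Eq. (8.20).

First we must identify Eqs. (8.20) with the matrix equation:
\[
\begin{aligned}
(\Lambda^{\alpha'}{}_{\beta}) &= I + A \\
(\delta^{\alpha'}{}_{\beta}) &= I \\
(\varepsilon^{\alpha'}{}_{,\beta}) &= A
\end{aligned}
\tag{8.18}
\]
And of course we know $\Lambda^{\alpha'}{}_{\beta}$ and $\Lambda^{\alpha}{}_{\beta'}$ are inverse transforms. Alternatively it’s also clear from basic calculus that (Dirac, 1996):
\[
\frac{\partial x^{\alpha'}}{\partial x^{\beta}}\,\frac{\partial x^{\beta}}{\partial x^{\gamma'}} = \frac{\partial x^{\alpha'}}{\partial x^{\gamma'}} = \delta^{\alpha'}{}_{\gamma'}
\]
So then Eq. (8.21) can be written in matrix form:
\[
(I + A)^{-1} = I - A + O(A^2)
\tag{8.19}
\]
and of course this is a straightforward application of the result in exercise 4 (a).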
Chapter 10 Spherical solutions for stars
10.2
Static spherically symmetric spacetimes
10.2.1
The metric
10.2.2
Physical interpretation of metric terms
Typo p. 259, just before Eq. (10.11), has $\vec{U}\cdot\vec{U} = 1$ but of course this should be
\[
\vec{U}\cdot\vec{U} = -1
\]
c.f. Eq. (2.28) or Eq. (10.18). It really is just a typo. He wants to say
\[
\begin{aligned}
-1 &= \vec{U}\cdot\vec{U} \\
   &= U^{\alpha}U^{\beta}g_{\alpha\beta} \\
   &= U^{0}U^{0}g_{00} && \text{because she’s in the MCRF} \\
   &= U^{0}U^{0}\,(-e^{2\Phi}) && \text{using metric term from Eq. (10.7)}
\end{aligned}
\tag{10.1}
\]
which implies, as Schutz concludes, $U^0 = \exp(-\Phi)$.
10.7
Realistic stars and gravitational collapse
White dwarfs Typo: p. 273. “. . . the Fermi momentum rises (Eq. (10.5))”. I believe this should be Eq. (10.75), which shows that Fermi momentum $p_f$ increases like $n^{1/3}$ where $n$ is the number density.
10.9
Exercises
4. Calculate the diagonal components of the Einstein tensor, c.f Eqs. (10.14 . . . 10.17) for the spherically symmetric static metric Eq. (10.7). For this we can use the results of Problem 35 § 6.9 wherein we found the Riemann tensor for a metric of this form. However, I found my Riemann tensor disagreed with that of Schutz’s solutions. For that reason I’ll show my intermediate results as well.
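First the Ricci tensor obtained by contracting the Riemann tensor (see Eq. 6.91):
\[
\begin{aligned}
R_{tt} &= R^{\sigma}{}_{t\sigma t} = g^{rr}R_{rtrt} + g^{\theta\theta}R_{\theta t\theta t} + g^{\phi\phi}R_{\phi t\phi t} \\
 &= \exp(-2\Lambda)\left\{\exp(2\Phi)\left[(\Phi')^2 + \Phi'' - \Phi'\Lambda'\right]\right\} + r^{-2}\left\{r\Phi'\exp(2\Phi)\exp(-2\Lambda)\right\} \\
 &\quad + r^{-2}\sin^{-2}(\theta)\left\{\sin^2(\theta)\,r\Phi'\exp(2\Phi)\exp(-2\Lambda)\right\} \\
 &= \exp(2\Phi - 2\Lambda)\left[(\Phi')^2 + \Phi'' - \Phi'\Lambda' + 2\frac{\Phi'}{r}\right]
\end{aligned}
\tag{10.2}
\]
\[
\begin{aligned}
R_{rr} &= R^{\sigma}{}_{r\sigma r} = g^{tt}R_{trtr} + g^{\theta\theta}R_{\theta r\theta r} + g^{\phi\phi}R_{\phi r\phi r} \\
 &= -\exp(-2\Phi)\left\{\exp(2\Phi)\left[(\Phi')^2 + \Phi'' - \Phi'\Lambda'\right]\right\} + r^{-2}\{r\Lambda'\} + r^{-2}\sin^{-2}(\theta)\{r\Lambda'\sin^2(\theta)\} \\
 &= -(\Phi')^2 - \Phi'' + \Phi'\Lambda' + 2\frac{\Lambda'}{r}
\end{aligned}
\tag{10.3}
\]
\[
\begin{aligned}
R_{\theta\theta} &= R^{\sigma}{}_{\theta\sigma\theta} = g^{tt}R_{t\theta t\theta} + g^{rr}R_{\theta r\theta r} + g^{\phi\phi}R_{\phi\theta\phi\theta} \\
 &= -\exp(-2\Phi)\left[r\Phi'\exp(2\Phi)\exp(-2\Lambda)\right] + \exp(-2\Lambda)[r\Lambda'] \\
 &\quad + r^{-2}\sin^{-2}(\theta)\left[r^2\sin^2(\theta)\,(1 - \exp(-2\Lambda))\right] \\
 &= -r\Phi'\exp(-2\Lambda) + r\Lambda'\exp(-2\Lambda) + (1 - \exp(-2\Lambda))
\end{aligned}
\tag{10.4}
\]
\[
\begin{aligned}
R_{\phi\phi} &= R^{\sigma}{}_{\phi\sigma\phi} = g^{tt}R_{t\phi t\phi} + g^{rr}R_{\phi r\phi r} + g^{\theta\theta}R_{\phi\theta\phi\theta} \\
 &= -\exp(-2\Phi)\left[\sin^2(\theta)\,r\Phi'\exp(2\Phi)\exp(-2\Lambda)\right] + \exp(-2\Lambda)\left[r\Lambda'\sin^2(\theta)\right] \\
 &\quad + r^{-2}\left[r^2\sin^2(\theta)\,(1 - \exp(-2\Lambda))\right] \\
 &= \exp(-2\Lambda)\sin^2(\theta)\left[-r\Phi' + \exp(2\Lambda) - 1 + r\Lambda'\right]
\end{aligned}
\tag{10.5}
\]
Second the Ricci scalar obtained by contracting the Ricci tensor (see Eq. 6.92):
\[
\begin{aligned}
R &= g^{\alpha\beta}R_{\alpha\beta} \\
  &= 2\exp(-2\Lambda)\left[-2\frac{\Phi'}{r} + 2\frac{\Lambda'}{r} - \frac{1}{r^2} + \frac{\exp(2\Lambda)}{r^2} + \Phi'\Lambda' - \Phi'^2 - \Phi''\right]
\end{aligned}
\tag{10.6}
\]
Finally we obtain the Einstein tensor via Eq. (6.98). But notice that for Eqs. (10.14 . . . 10.17) we want the covariant components. So we use
\[
G_{\alpha\beta} = R_{\alpha\beta} - \frac{1}{2}g_{\alpha\beta}R
\tag{10.7}
\]
We find:
\[
\begin{aligned}
G_{00} &= R_{tt} - \frac{1}{2}g_{00}R \\
 &= \exp(2\Phi - 2\Lambda)\left[(\Phi')^2 + \Phi'' - \Phi'\Lambda' + 2\frac{\Phi'}{r}\right] \\
 &\quad - \frac{1}{2}\left[-\exp(2\Phi)\right]\left\{2\exp(-2\Lambda)\left[-2\frac{\Phi'}{r} + 2\frac{\Lambda'}{r} - \frac{1}{r^2} + \frac{\exp(2\Lambda)}{r^2} + \Phi'\Lambda' - \Phi'^2 - \Phi''\right]\right\} \\
 &= \exp(2\Phi - 2\Lambda)\left[2\frac{\Lambda'}{r} - \frac{1}{r^2} + \frac{\exp(2\Lambda)}{r^2}\right] \\
 &= \frac{1}{r^2}\exp(2\Phi)\,\frac{d}{dr}\left[r\,(1 - \exp(-2\Lambda))\right], \quad \text{the form of Eq. (10.14).}
\end{aligned}
\tag{10.8}
\]
\[
\begin{aligned}
G_{11} &= R_{rr} - \frac{1}{2}g_{rr}R \\
 &= -(\Phi')^2 - \Phi'' + \Phi'\Lambda' + 2\frac{\Lambda'}{r} \\
 &\quad - \frac{1}{2}\exp(2\Lambda)\,2\exp(-2\Lambda)\left[-2\frac{\Phi'}{r} + 2\frac{\Lambda'}{r} - \frac{1}{r^2} + \frac{\exp(2\Lambda)}{r^2} + \Phi'\Lambda' - \Phi'^2 - \Phi''\right] \\
 &= -\frac{\exp(2\Lambda)}{r^2}\left[1 - \exp(-2\Lambda)\right] + 2\frac{\Phi'}{r}, \quad \text{the form of Eq. (10.15).}
\end{aligned}
\tag{10.9}
\]
\[
\begin{aligned}
G_{22} &= R_{\theta\theta} - \frac{1}{2}g_{\theta\theta}R \\
 &= -r\Phi'\exp(-2\Lambda) + r\Lambda'\exp(-2\Lambda) + (1 - \exp(-2\Lambda)) \\
 &\quad - \frac{1}{2}r^2\left\{2\exp(-2\Lambda)\left[-2\frac{\Phi'}{r} + 2\frac{\Lambda'}{r} - \frac{1}{r^2} + \frac{\exp(2\Lambda)}{r^2} + \Phi'\Lambda' - \Phi'^2 - \Phi''\right]\right\} \\
 &= r^2\exp(-2\Lambda)\left[\Phi'' + \Phi'^2 + \frac{\Phi'}{r} - \Phi'\Lambda' - \frac{\Lambda'}{r}\right], \quad \text{the form of Eq. (10.16).}
\end{aligned}
\tag{10.10}
\]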
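Since the algebra from the Ricci components to the Einstein tensor is long, a numerical spot check is reassuring. The sketch below evaluates $G_{\alpha\beta} = R_{\alpha\beta} - \frac{1}{2}g_{\alpha\beta}R$ from the Ricci components (10.2) to (10.4) and the Ricci scalar (10.6), and compares with the closed forms of Eqs. (10.14) to (10.16). It assumes the prefactor $2\exp(-2\Lambda)$ in the Ricci scalar and the factor $\exp(2\Lambda)$ in $G_{rr}$; the function names and the sample profiles $\Phi = ar^2$, $\Lambda = br^2$ are hypothetical choices made only for the test.

```python
import math

# Spot check: Einstein tensor built from Ricci pieces vs. closed forms.
# Inputs are the values of Phi, Phi', Phi'', Lambda, Lambda' at radius r.


def einstein_from_ricci(r, Phi, dPhi, d2Phi, Lam, dLam):
    """Return (G_tt, G_rr, G_thetatheta) as R_ab - (1/2) g_ab R."""
    bracket = dPhi**2 + d2Phi - dPhi * dLam
    R_tt = math.exp(2 * Phi - 2 * Lam) * (bracket + 2 * dPhi / r)
    R_rr = -bracket + 2 * dLam / r
    R_thth = math.exp(-2 * Lam) * (r * dLam - r * dPhi) + 1 - math.exp(-2 * Lam)
    R = 2 * math.exp(-2 * Lam) * (-2 * dPhi / r + 2 * dLam / r - 1 / r**2
                                  + math.exp(2 * Lam) / r**2 - bracket)
    g_tt, g_rr, g_thth = -math.exp(2 * Phi), math.exp(2 * Lam), r**2
    return (R_tt - 0.5 * g_tt * R, R_rr - 0.5 * g_rr * R, R_thth - 0.5 * g_thth * R)


def einstein_closed_form(r, Phi, dPhi, d2Phi, Lam, dLam):
    """Closed forms corresponding to Schutz's Eqs. (10.14)-(10.16)."""
    # d/dr [ r (1 - exp(-2 Lambda)) ] written out explicitly.
    ddr = 1 - math.exp(-2 * Lam) + 2 * r * dLam * math.exp(-2 * Lam)
    G_tt = math.exp(2 * Phi) / r**2 * ddr
    G_rr = -math.exp(2 * Lam) / r**2 * (1 - math.exp(-2 * Lam)) + 2 * dPhi / r
    G_thth = r**2 * math.exp(-2 * Lam) * (d2Phi + dPhi**2 + dPhi / r
                                          - dPhi * dLam - dLam / r)
    return (G_tt, G_rr, G_thth)


def sample_profile(r, a=0.3, b=0.2):
    """Arbitrary smooth test profiles Phi = a r^2, Lambda = b r^2."""
    return (a * r**2, 2 * a * r, 2 * a, b * r**2, 2 * b * r)


def schwarzschild_profile(r, M=1.0):
    """Vacuum exterior: Phi = (1/2) ln(1 - 2M/r), Lambda = -Phi."""
    Phi = 0.5 * math.log(1 - 2 * M / r)
    dPhi = M / (r * (r - 2 * M))
    d2Phi = -M * (2 * r - 2 * M) / (r**2 - 2 * M * r) ** 2
    return (Phi, dPhi, d2Phi, -Phi, -dPhi)


if __name__ == "__main__":
    r = 1.5
    print(einstein_from_ricci(r, *sample_profile(r)))
    print(einstein_closed_form(r, *sample_profile(r)))
    print(einstein_from_ricci(5.0, *schwarzschild_profile(5.0)))
```

The first two printed tuples should agree to rounding error, and the Schwarzschild profile should give components consistent with zero, as expected for a vacuum solution.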
And finally for the $\phi$ component we need not do any further computations if we note the relationship with the corresponding $\theta$ components of the Ricci tensor and metric tensor:
\[
\begin{aligned}
G_{33} &= R_{\phi\phi} - \frac{1}{2}g_{\phi\phi}R \\
 &= \sin^2(\theta)\,R_{\theta\theta} - \frac{1}{2}\sin^2(\theta)\,g_{\theta\theta}R \\
 &= \sin^2(\theta)\,G_{22}.
\end{aligned}
\tag{10.11}
\]
10.10
Rob’s supplementary problems
SP. 1
Chapter 11 Schwarzschild geometry and black holes
11.1
Trajectories in the Schwarzschild spacetime
Conserved quantities Eqs. (11.5) and (11.6) look strange at first, since they appear to contradict the definitions of p. 42. Why is $E = -p_0$ here, yet $E = p^0$ on p. 42? Why is $L = p_\phi$? I found it instructive to note that $p^\alpha = mU^\alpha = m\,dx^\alpha/d\tau$, and $x^\alpha$ can be a coordinate like $\phi$ in spherical coordinates. And $d\phi/d\tau$ is an angular velocity, so it doesn’t have units of velocity. But $p_\phi = r^2\sin^2\theta\; m\,d\phi/d\tau$, which includes the velocity $r\sin\theta\,d\phi/d\tau$ and the momentum arm length $r\sin\theta$. We can also just accept $p_0 = -E$ and $p_\phi = L$ as convenient definitions of quantities that are conserved in the Schwarzschild metric; certainly E is “energy like” and L is “angular momentum like”, which I guess is how Schutz meant us to take them.
Types of orbits Typo: p. 286, first sentence of page, too many “(”. Regarding Eq. (11.19), Schutz refers to the “minimum” radius of a particle’s circular orbit. But it is important to note that this is the minimum only for the stable, larger orbit obtained from the positive root in Eq. (11.17). For the unstable, smaller orbit obtained from the negative root in Eq. (11.17), $r = 6M$ is the maximum radius. In the latter case, the minimum is obtained by taking the limit $\tilde{L} \to \infty$, which gives
\[
r \to 3M
\]
See also the solution to Exercise 4 in § 11.7.
Perihelion shift Typo: p. 288, 3rd line, “These opposite of ‘peri’ is ‘ap’ ” should be “The opposite of ‘peri’ is ‘ap’ ”. Typo: p. 290, line above Eq. (11.42), “Each orbit take . . . ” should be “Each orbit takes . . . ”. Typo: p. 291, last line of 1st paragraph “predict” should be “predicts”.
[I sent these notes to Schutz in Dec. 2011 and he didn’t reply, so I’m no longer bothering with cosmetic typos.]
11.3
General black holes
Ergoregion Typo in Eq. (11.78), the τ should be a t obviously, since it came from the previous line.
11.7
Exercises
1. (a). A particle or photon in an orbit [careful, it’s not necessarily a closed orbit] in the Schwarzschild metric with a certain E and L, at a radius $r \gg M$. Show that if space-time were really flat, the particle [or photon] would travel on a straight line that would pass a distance $b \equiv L/\sqrt{E^2 - m^2}$ from the centre of coordinates $r = 0$. This ratio b is called the impact parameter. (b). Show also that photon orbits that follow Eq. (11.12) depend only on b.

(a) If we rotate the coordinate system such that $\theta = \pi/2$ then $p^\theta = 0$ always. For a particle or a photon $p^\alpha p_\alpha = -m^2$ where m is the rest mass, which is nil for a photon, c.f. Eqs. (2.33) and (2.40). At the point where r is a minimum
\[
U^r = \frac{dr}{d\tau} = 0
\]
because it’s an extremum of the particle path. The Schwarzschild metric is diagonal so that this implies $p_r = 0$ at $r = b$. So we have only two non-zero components of the momentum:
\[
\begin{aligned}
-m^2 &= p^\alpha p_\alpha && \text{in general} \\
 &= p^t p_t + p^\phi p_\phi \\
 &= g^{0\alpha}p_\alpha p_t + g^{\phi\alpha}p_\alpha p_\phi \\
 &= g^{00}p_t p_t + g^{\phi\phi}p_\phi p_\phi && \text{diagonal metric} \\
 &= -\left(1 - \frac{2M}{r}\right)^{-1}E^2 + r^{-2}L^2, && \text{used Eqs. (11.5), (11.6)} \\
 &= -E^2 + r^{-2}L^2 && \text{flat space approximation} \\
 &= -E^2 + b^{-2}L^2 && r = b
\end{aligned}
\tag{11.1}
\]
Solve for b and choose the positive root since b is the spatial distance:
\[
b = \frac{L}{\sqrt{E^2 - m^2}}.
\]
(b) For a photon $m = 0$, Eq. (2.40),
\[
b = \frac{L}{E}.
\]
Note that we can rewrite Eq. (11.12) by absorbing the E into the parameter $\lambda$, defining for instance $\gamma = E\lambda$, which is consistent with Eq. (6.52), so we retain an affine parameter $\gamma$ and the paths are still geodesics,
\[
\begin{aligned}
\left(\frac{dr}{d\lambda}\right)^2 &= E^2 - \left(1 - \frac{2M}{r}\right)\frac{L^2}{r^2} \\
 &= E^2 - \left(1 - \frac{2M}{r}\right)\frac{b^2E^2}{r^2} && \text{substituted } L = bE \\
 &= E^2\left[1 - \left(1 - \frac{2M}{r}\right)\frac{b^2}{r^2}\right] \\
\left(\frac{dr}{d\gamma}\right)^2 &= 1 - \left(1 - \frac{2M}{r}\right)\frac{b^2}{r^2}
\end{aligned}
\tag{11.2}
\]
so besides M , the only parameter is b.
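2. Prove Eqs. (11.17) and (11.18).

For the particle orbit, start with
\[
0 = \frac{d}{dr}\left[\left(1 - \frac{2M}{r}\right)\left(1 + \frac{\tilde{L}^2}{r^2}\right)\right]
\tag{11.3}
\]
If one does the obvious thing, and simply integrates this, one runs into trouble. One can show that the integration constant is $E^2$. But then one has a cubic to solve. Even knowing the solution, it’s not obvious that it is a solution to the cubic. However, if one first differentiates then everything works out easily, with only a quadratic to solve,
\[
\begin{aligned}
0 &= \frac{d}{dr}\left[\left(1 - \frac{2M}{r}\right)\left(1 + \frac{\tilde{L}^2}{r^2}\right)\right] \\
  &= \frac{2M}{r^2}\left(1 + \frac{\tilde{L}^2}{r^2}\right) + \left(1 - \frac{2M}{r}\right)\left(-2\frac{\tilde{L}^2}{r^3}\right) \\
  &= r^2 - r\frac{\tilde{L}^2}{M} + 3\tilde{L}^2 && \text{after multiplying by } \frac{r^4}{2M}
\end{aligned}
\tag{11.4}
\]
The quadratic formula gives the solution:
\[
r = \frac{\tilde{L}^2}{2M}\left(1 \pm \sqrt{1 - \frac{12M^2}{\tilde{L}^2}}\right)
\]
For the photon orbit, start with
\[
0 = \frac{d}{dr}\left[\left(1 - \frac{2M}{r}\right)\frac{L^2}{r^2}\right]
\tag{11.5}
\]
Again, it’s much easier, perhaps essential, to start by differentiating,
\[
\begin{aligned}
0 &= \frac{d}{dr}\left[\left(1 - \frac{2M}{r}\right)\frac{L^2}{r^2}\right] \\
  &= \frac{2M}{r^2}\frac{L^2}{r^2} + \left(1 - \frac{2M}{r}\right)\left(-2\frac{L^2}{r^3}\right) \\
r &= 3M && \text{after dividing by } L^2 \text{ and simplifying}
\end{aligned}
\tag{11.6}
\]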
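The two roots of Eq. (11.17) can be explored numerically, which also makes the limiting radii 3M and 6M discussed in my notes on § 11.1 easy to see. This is a minimal sketch with hypothetical function names; it evaluates the circular-orbit radii for a given $\tilde{L}$ and checks them against the derivative of $(1 - 2M/r)(1 + \tilde{L}^2/r^2)$.

```python
import math

# Circular particle orbits in Schwarzschild: roots of Eq. (11.17).
# Function names are illustrative, not from the text.


def circular_orbit_radii(L_tilde, M=1.0):
    """Return (r_unstable, r_stable), or None if L_tilde^2 < 12 M^2."""
    disc = 1.0 - 12.0 * M**2 / L_tilde**2
    if disc < 0.0:
        return None  # no circular orbit exists for this angular momentum
    root = math.sqrt(disc)
    prefactor = L_tilde**2 / (2.0 * M)
    return prefactor * (1.0 - root), prefactor * (1.0 + root)


def potential_derivative(r, L_tilde, M=1.0):
    """d/dr of (1 - 2M/r)(1 + L_tilde^2/r^2); vanishes on a circular orbit."""
    return (2.0 * M / r**2 * (1.0 + L_tilde**2 / r**2)
            - (1.0 - 2.0 * M / r) * 2.0 * L_tilde**2 / r**3)


if __name__ == "__main__":
    for L_tilde in (3.0, 4.0, 10.0, 1000.0):
        radii = circular_orbit_radii(L_tilde)
        if radii is None:
            print(f"L = {L_tilde}: no circular orbit")
        else:
            inner, outer = radii
            print(f"L = {L_tilde}: unstable r = {inner:.6f}, stable r = {outer:.6f}")
```

For $M = 1$ and $\tilde{L} = 4$ the roots are $r = 4$ and $r = 12$, and as $\tilde{L}$ grows the inner (unstable) root approaches $3M$ from above while the outer (stable) root grows without bound.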
4. What kind of orbits are possible outside a star of radius (a) R = 2.5M, (b) R = 4M, (c) R = 10M?

See the last paragraph of § 11.1, subsection Types of orbits, where Schutz points out that if the radius of the star exceeds the radius of the orbit, the orbit is not possible, simply because the star is in the way. In my notes above I qualify in what sense $r_{MIN} = 6M$ is a minimum in Eq. (11.19). The plot below is useful for interpreting this question.

(a) Note that R < 3M, which is the photon’s circular orbit radius, so a circular, unstable, photon orbit is possible. This is point A in Fig. 11.2. Note that $R < 6M = r_{MIN}$ so a circular, unstable, particle orbit is possible, like point A in Fig. 11.1. Because R < 3M, which is the minimum unstable orbit radius (see my notes on section 11.1), the unstable orbit is allowed for all $\tilde{L}$. The larger, stable, circular orbit is also allowed for all $\tilde{L}$, point B in Fig. 11.1.

(b) Note that R > 3M, which is the photon’s circular orbit radius, so a circular, unstable, photon orbit is not possible. Note that $R = 4M < 6M = r_{MIN}$ so a circular, unstable, particle orbit is possible, like point A in Fig. 11.1. Because R > 3M, which is the minimum unstable orbit radius (see my notes on section 11.1), the unstable orbit is only possible for a finite range of $\tilde{L}$, see my Fig. 11.1. The larger, stable, circular orbit is allowed for all $\tilde{L}$, point B in Fig. 11.1.

(c) Note that R = 10M > 3M, which is the photon’s circular orbit radius, so a circular, unstable, photon orbit is not possible. Note that R > 6M, which is the particle’s maximum unstable circular orbit radius, so a circular, unstable, particle orbit is not possible for any $\tilde{L}$. The larger, stable, circular orbit is allowed for a range of $\tilde{L}$. Let $\tilde{L}$ become large in Eq. (11.17) and it’s clear that the larger orbit can be at arbitrarily large r, and so can be such that r > R = 10M; this orbit must exist. See also the green line in my Fig. 11.1.

Figure 11.1: Radius of circular orbit r vs. $\tilde{L}$ for M = 1 from Eq. (11.17). Green line from taking the positive root, red line from taking the negative root.
5 (a). Find the radius $R_{0.01}$ at which $-g_{00}$ differs from the ‘Newtonian’ value $1 - 2M/R$ by only 1%.

This question doesn’t make sense because
\[
-g_{00} = 1 - \frac{2M}{R}
\]
in the Schwarzschild metric without approximation. Please see my supplementary problem below.

5 (b). How many normal [Sun-like] stars can fit in the region between $R_{0.01}$ and the radius 2M?

This question does make sense, once one answers 5(a). It is a simple geometry question, no tricks.
11.8
Rob’s Supplementary Problems
SP. 1 Recall from Exercise 5d of § 7.6 that $g_{00}$ is “closely related to” $-\exp(2\phi)$, where $\phi$ is the Newtonian potential for a similar [non-relativistic] situation. This was derived for a hydrostatic fluid in Exercise 5d, but use this fact here to find at what distance R from a black hole of mass $10^6 M_\odot$ the Schwarzschild $g_{00}$ differs from the Newtonian potential by 1%.

Solution: The Schwarzschild metric has, precisely,
\[
-g_{00} = 1 - \frac{2M}{R}
\]
with G = 1 of course. When the weak gravity limit applies, the Newtonian potential $\phi = -M/R$ for a similar [non-relativistic] situation obeys $|\phi| \ll 1$ and we have
\[
-g_{00} \approx \exp(2\phi), \qquad \text{Exercise 5d of § 7.6.}
\tag{11.7}
\]
Use a Taylor series about $\phi = 0$ to approximate the exponential function,
\[
\begin{aligned}
-g_{00} &\approx \exp(2\phi) \approx 1 + 2\phi + 2\phi^2 + \frac{4}{3}\phi^3 \ldots \\
1 + 2\phi &\approx 1 + 2\phi + 2\phi^2 + \frac{4}{3}\phi^3 \ldots
\end{aligned}
\tag{11.8}
\]
For $|\phi| \ll 1$ the LHS differs from the RHS by approximately $2\phi^2$. Setting this difference to 1% of $-g_{00} \approx 1$ we solve for the radius when $M = 10^6 M_\odot$,
\[
\frac{1}{100} = 2\phi^2 = 2\left(-\frac{M}{R}\right)^2
\]
so
\[
R = 10\sqrt{2}\,M = 14.1 \times 10^6 \times 1.5\ \mathrm{km}, \qquad \text{see Table 8.1}
\]
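The final number in SP. 1 is left as a product, so here is a short Python sketch that evaluates it and also compares the quadratic estimate $2\phi^2$ with the exact difference $\exp(2\phi) - (1 + 2\phi)$. The function names are hypothetical, and I assume the geometrized solar mass of 1476 m used in Exercise 3 of § 8.6.

```python
import math

# Radius at which exp(2*phi) and 1 + 2*phi differ by a given fraction,
# using the leading-order estimate 2*phi^2 = fraction with phi = -M/R.
# Function names are illustrative; M_SUN_METRES is the value from Exercise 3, § 8.6.

M_SUN_METRES = 1476.0  # geometrized solar mass [m]


def radius_for_difference(mass_m, fraction=0.01):
    """Solve 2 (M/R)^2 = fraction for R, in the same units as mass_m."""
    return mass_m * math.sqrt(2.0 / fraction)


def exact_difference(mass_m, radius_m):
    """exp(2*phi) - (1 + 2*phi) with phi = -M/R, without truncating the series."""
    phi = -mass_m / radius_m
    return math.exp(2.0 * phi) - (1.0 + 2.0 * phi)


if __name__ == "__main__":
    M = 1.0e6 * M_SUN_METRES
    R = radius_for_difference(M)
    print(f"R = {R / M:.3f} M = {R / 1000.0:.3e} km")
    print(f"exact difference at this R = {exact_difference(M, R):.5f}")
```

With these inputs R comes out at about $2.1\times10^{7}$ km, and the exact difference is slightly below 1% because of the negative cubic term in the Taylor series.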
Chapter 12 Cosmology
12.2
Cosmological kinematics: observing the expanding universe
Three types of universe Typo: in Eq. (12.15), missing square on dΩ. Eq. (12.21) uses z for redshift. This was defined in Eq. (10.12) and also in Eq. (12.37).
12.6
Exercises
1. Use the metric of the 2-sphere to prove the statement associated with Fig. 12.1 that the rate of increase of the distance between any two points as the sphere expands (as measured on the sphere) is proportional to the distance between them.

The metric of the 2-sphere is apparent from the line element
\[
dl^2 = r^2\,(d\theta^2 + \sin^2\theta\,d\phi^2)
\]
Here we assume that $r = r(t)$ and pick two points on the sphere, p and q. We can always rotate the coordinates such that the two points are on the equator, $\theta = \pi/2$, so the distance is
\[
\int dl = l = r(t)\int d\phi = r(t)\,(\phi_p - \phi_q)
\]
Differentiating with respect to time one obtains the rate of increase of their separation
\[
\dot{l} = \dot{r}\,(\phi_p - \phi_q) = \frac{\dot{r}}{r}\,l
\tag{12.1}
\]
So for a given r and rate of change of r, the rate of change of the distance l is proportional to l.
2. Find the parsec in meters given the radius of the Earth’s orbit as about 1011 m and the definition of the parsec, which is the parallax of 1 second of arc.
This question is of course quite trivial. But it’s nice to go through it once since one then feels more comfortable with the parsec and gains appreciation of the astronomers who measure positions of stars to within an arc second 6 months apart. The distance from the Sun to the star l is
\[
\begin{aligned}
l &= \frac{r}{\sin(1\,[\text{arc second}])} \\
  &\approx \frac{r}{\frac{\pi}{180}\,[\text{rad per degree}]\;\frac{1}{60}\,[\text{degree per minute}]\;\frac{1}{60}\,[\text{minute per second}]} \\
  &= \frac{10^{11}\times 180\times 60\times 60}{\pi}
\end{aligned}
\tag{12.2}
\]
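Evaluating the last expression is a one-liner, sketched here in Python. The function name is hypothetical; the second call uses the more precise orbital radius of $1.496\times10^{11}$ m quoted in Exercise 3 of § 8.6, which is my addition rather than part of this exercise.

```python
import math

# Distance at which the radius of the Earth's orbit subtends one arc second.
# The function name is illustrative, not from the text.


def parsec_in_metres(orbit_radius_m):
    """Return orbit_radius / sin(1 arc second)."""
    one_arc_second = math.pi / 180.0 / 60.0 / 60.0  # radians
    return orbit_radius_m / math.sin(one_arc_second)


if __name__ == "__main__":
    print(f"rough orbit radius 1e11 m:     {parsec_in_metres(1.0e11):.3e} m")
    print(f"orbit radius 1.496e11 m:       {parsec_in_metres(1.496e11):.3e} m")
```

The rough radius gives about $2\times10^{16}$ m, and the more precise radius gives about $3.1\times10^{16}$ m. The small-angle approximation used in Eq. (12.2) changes nothing at this precision.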
3. Newtonian cosmology.

3 (a). Apply Newton’s law of gravity to the study of cosmology by showing that the general solution of $\nabla^2\phi = 4\pi\rho$ for constant $\rho$ is a quadratic polynomial in Cartesian coordinates that is not necessarily isotropic.

Newton’s law is a Poisson equation with constant coefficient. By inspection it is immediately clear that
\[
\Phi = \Phi_0\,(a^2x^2 + b^2y^2 + c^2z^2)
\]
is a solution with $\Phi_0$ some constant and
\[
2\,(a^2 + b^2 + c^2) = \frac{4\pi\rho}{\Phi_0}
\]
But to this one can add an arbitrary solution to Laplace’s equation, so our solution is by no means a completely general solution! It is clear that it is not isotropic because we do not require a = b = c.

3 (b) Show that if the universe consists of a region where $\rho$ is constant, outside of which there is a vacuum, then, if the boundary is not spherical, the field will not be isotropic. The field will show significant deviations from sphericity throughout the interior, even at the centre.

Suppose we stretch our Cartesian co-ordinates such that:
\[
x' = ax; \qquad y' = by; \qquad z' = cz
\]
Then our “general solution” is
\[
\Phi = \Phi_0\,(x'^2 + y'^2 + z'^2) = \Phi_0\,r'^2
\]
and is spherically symmetric in this stretched coordinate system. Outside the region of nonzero $\rho$, so in the vacuum, we know the solution must be
\[
\Phi = -\frac{4\pi\rho}{3}\,\frac{1}{r'}, \qquad \text{see Eq. (8.2) and Exercise 1 of § 8.6}
\]
And this solution must match the solution $\Phi_0\,r'^2$ at the boundary. But this proves that the boundary must be a surface of constant $r' = R$, for otherwise it would be impossible to match the interior and exterior solutions. And this gives us the result we required. For the boundary is affecting the shape of the solution everywhere, including the whole interior and even its central point $r' = 0$. And if the boundary R is not a true sphere (i.e. a sphere in the original unstretched coordinates, such that $a \neq b \neq c$) then the anisotropy is apparent, after transforming back to the original coordinates, for instance in the second derivatives:
\[
\Phi_{,xx} = \Phi_0\,2a^2 \;\neq\; \Phi_{,yy} = \Phi_0\,2b^2 \;\neq\; \Phi_{,zz} = \Phi_0\,2c^2
\tag{12.3}
\]
3 (c) Show that an experiment done locally could determine the shape of the boundary.

One could measure the gravitational acceleration on a test particle in 3 orthogonal directions as a function of position. Under the assumption of $\rho =$ constant [or possibly correcting for local inhomogeneities; not sure how feasible that would be in practice] one would obtain $\Phi_{,xx} = \Phi_0\,2a^2$ from the x-direction gradient of the x-direction gravitational acceleration, and similarly for the other two directions. This gives the relative sizes of a, b, c which would determine the shape of an ellipsoidal boundary.
For a boundary shape more general than ellipsoidal, it is not immediately clear how to proceed. The problem lies in our solution in terms of quadratic terms in x, y, z not being truly a general solution. I guess the mathematics will become complicated rather quickly and it’s not clear it’s worth the effort. For we have already shown that the boundary effects of an ellipsoidal shaped Newtonian universe would in principle be observable throughout the interior, and clearly we don’t observe such effects.
4. Show that if $h_{ij}(t_1) \neq f(t_1, t_0)\,h_{ij}(t_0)$ for all i and j in Eq. (12.3), then distances between galaxies would increase anisotropically: the Hubble law would have to be written as $v^i = H^i{}_j\,x^j$ for a matrix $H^i{}_j$ not proportional to the identity.

We replace $h_{ij}(t_1) = f(t_1, t_0)\,h_{ij}(t_0)$ by a more general expression
\[
h_{ij}(t_1) = (a^i{}_j)^2(t_1, t_0)\,h_{ij}(t_0)
\]
with no summation over the double i or j indices on the RHS. Consider a galaxy at some distance $l_0$ at $t = t_0$. We can orient our Cartesian coordinates such that we are at the centre and the galaxy is in the x-direction.
\[
\begin{aligned}
dl^2 &= h_{ij}(t_0)\,dx^i dx^j = h_{xx}\,(dx)^2 \\
dl &= \sqrt{h_{xx}(t_0)}\,dx \\
\int dl &= \int \sqrt{h_{xx}(t_0)}\,dx = l_0
\end{aligned}
\tag{12.4}
\]
Then at a later time $t_1$
\[
\begin{aligned}
dl^2 &= h_{ij}(t_1)\,dx^i dx^j = (a^i{}_j)^2\,h_{ij}(t_0)\,dx^i dx^j = (a^x{}_x)^2\,h_{xx}\,(dx)^2 \\
dl &= a^x{}_x\,\sqrt{h_{xx}(t_0)}\,dx \\
\int dl &= a^x{}_x \int \sqrt{h_{xx}(t_0)}\,dx = a^x{}_x\,l_0
\end{aligned}
\tag{12.5}
\]
So the rate of change of position of this galaxy observed at Earth is
\[
\dot{l} = v = \dot{a}^x{}_x\,l_0
\tag{12.6}
\]
But if our $(a^i{}_j)^2$ is not just a constant times $\delta^i{}_j$ then we measure a different Hubble parameter in each of the 3 orthogonal directions:
\[
\begin{aligned}
v^x &= \dot{a}^x{}_x\,l_0^x \\
v^y &= \dot{a}^y{}_y\,l_0^y \\
v^z &= \dot{a}^z{}_z\,l_0^z
\end{aligned}
\tag{12.7}
\]
And performing a rotation to orient our Cartesian coordinates to an arbitrary direction we obtain the generalized Hubble law
\[
v^j = H^j{}_i\,x^i
\]
6. (a) Prove the statement leading to Eq. (12.8), that we can deduce $G_{ij}$ of our three-spaces by setting $\Phi = 0$ in Eq. (10.15)-(10.17).

I’m going to be a bit deconstructionist for this question since I don’t like how it’s posed. Recall that the Einstein tensor is defined in Eq. (6.98), which, after lowering the indices, is
\[
G_{\alpha\beta} = R_{\alpha\beta} - \frac{1}{2}g_{\alpha\beta}R
\]
So we want the spatial part of that, $G_{ij}$. But this involves the temporal dimension through the Riemann tensor. Recall the Ricci tensor was defined by Eq. (6.91), which after fixing the typo, is
\[
R_{\alpha\beta} = R^{\mu}{}_{\alpha\mu\beta}
\]
So even when we restrict ourselves to $G_{ij}$ we have contributions from the temporal dimension through $\mu = 0$ in the contraction of the Riemann tensor. So I think it’s better to say we want to ensure that the metric in Eq. (12.6) is homogeneous at a given point in time, $t = t_0$. Then $R^2(t_0)$ is a constant and without loss of generality it can be rescaled to $R^2 = 1$:
\[
ds^2 = -dt^2 + R^2(t)\,h_{ij}\,dx^i dx^j = -dt^2 + \left(e^{2\Lambda(r)}dr^2 + r^2 d\Omega^2\right)
\]
If we set $\Phi = 0$ in Eq. (10.7) then we obtain the same metric. So of course it will have the same Riemann tensor and all that follows (Ricci tensor and scalar and Einstein tensor).
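6. (b) Derive Eq. (12.9).

To obtain Eq. (12.9) we substitute Eq. (12.8) for the diagonal components of the Einstein tensor into the equation for the trace, $G = G_{ij}\,\delta^{ij}$, and the rest is just algebra. There’s nothing to derive.

7. Show the metric in Eq. (12.7) is only flat at r = 0 if A = 0 in Eq. (12.11).

For the metric to be flat we require the Riemann tensor to be zero, see Eq. (6.71),
\[
R^{\alpha}{}_{\beta\mu\nu} = 0
\]
Careful, it’s not correct to show that
\[
R_{\alpha\beta\mu\nu} = 0
\]
I’ve learned from experience one can waste hours working with this incorrect equation! Our metric was given by Eq. (12.7):
\[
dl^2 = \exp(2\Lambda(r))\,dr^2 + r^2 d\Omega^2
\]
which has the spatial part the same as metric Eq. (10.7). For the Riemann tensor, we can pull off the spatial part without fear of contamination from the temporal part of the metric since our metric is diagonal and there is no contraction involved in the definition of the Riemann tensor $R^{\alpha}{}_{\beta\mu\nu}$. Using the results from Exer. 35 of § 6.9, we see there are only three nonzero terms to consider (and the ones obtainable from these 3 by symmetry relations), see result (6.101),
\[
\begin{aligned}
R_{r\theta r\theta} &= r\Lambda' \\
R_{r\phi r\phi} &= r\Lambda'\sin^2(\theta) \\
R_{\theta\phi\theta\phi} &= r^2\sin^2(\theta)\,(1 - \exp(-2\Lambda))
\end{aligned}
\tag{12.8}
\]
We’ll be using Eq. (12.11)
\[
g_{rr} = \exp(2\Lambda) = \frac{1}{1 + \frac{1}{3}\kappa r^2 - \frac{A}{r}}
\]
Consider case $A \neq 0$. Then
\[
\begin{aligned}
R^{r}{}_{\theta r\theta} &= g^{r\alpha}R_{\alpha\theta r\theta} \\
 &= \frac{1}{g_{rr}}R_{r\theta r\theta}, && \text{for diagonal metric} \\
 &= \frac{1}{g_{rr}}\,r\Lambda', && \text{substitute from result (12.8)} \\
 &= \frac{r}{2(g_{rr})^2}\frac{dg_{rr}}{dr}, && \text{elementary operations} \\
 &= -\frac{1}{2}\left(\frac{2}{3}\kappa r^2 + \frac{A}{r}\right)
\end{aligned}
\tag{12.9}
\]
which is nonzero (in fact singular) as $r \to 0$. On the other hand, if A = 0 then $R^{r}{}_{\theta r\theta} = 0$ at r = 0. But we still need to check the other two non-zero components of the Riemann tensor when A = 0 to confirm that space is flat there. Note that
\[
R^{r}{}_{\phi r\phi} = R^{r}{}_{\theta r\theta}\,\sin^2(\theta) = 0 \qquad \text{at } r = 0.
\]
\[
\begin{aligned}
R^{\theta}{}_{r\theta r} &= g^{\theta\alpha}R_{\alpha r\theta r} \\
 &= \frac{1}{g_{\theta\theta}}R_{\theta r\theta r}, && \text{for diagonal metric} \\
 &= \frac{1}{g_{\theta\theta}}R_{r\theta r\theta}, && \text{symmetry} \\
 &= \frac{1}{r^2}\,r\Lambda', && \text{substitute from result (12.8)} \\
 &= \frac{-\kappa}{3 + \kappa r^2}, && \text{elementary operations}
\end{aligned}
\tag{12.10}
\]
\[
\begin{aligned}
R^{\theta}{}_{\phi\theta\phi} &= g^{\theta\alpha}R_{\alpha\phi\theta\phi} \\
 &= \frac{1}{g_{\theta\theta}}R_{\theta\phi\theta\phi}, && \text{for diagonal metric} \\
 &= \frac{1}{r^2}\,r^2\sin^2(\theta)\,(1 - \exp(-2\Lambda)), && \text{substitute from result (12.8)} \\
 &= 0 \quad \text{at } r = 0.
\end{aligned}
\tag{12.11}
\]
\[
\begin{aligned}
R^{\phi}{}_{\theta\phi\theta} &= g^{\phi\alpha}R_{\alpha\theta\phi\theta} \\
 &= \frac{1}{g_{\phi\phi}}R_{\phi\theta\phi\theta}, && \text{for diagonal metric} \\
 &= \frac{1}{g_{\phi\phi}}R_{\theta\phi\theta\phi}, && \text{symmetry} \\
 &= \frac{1}{r^2\sin^2\theta}R_{\theta\phi\theta\phi} = \frac{1}{\sin^2\theta}R^{\theta}{}_{\phi\theta\phi} \\
 &= 0 \quad \text{at } r = 0.
\end{aligned}
\]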
8. Find the coordinate transform leading to Eq. (12.19).

It’s immediately clear by inspection of the term in $d\Omega^2$ that
\[
r = \sinh\chi
\tag{12.12}
\]
is the transform. To confirm this works we substitute it in Eq. (12.13) when k = −1. We’ll use the identity
\[
\cosh^2\chi - \sinh^2\chi = 1
\]
and
\[
d\sinh\chi = \cosh\chi\,d\chi
\]
These identities all follow easily from the definitions of the hyperbolic functions, http://en.wikipedia.org/wiki/Hyperbolic_function. Omitting the temporal part we find
\[
\begin{aligned}
dl^2 &= R^2(t_0)\left[\frac{dr^2}{1 + r^2} + r^2 d\Omega^2\right], && \text{spatial part of Eq. (12.13)} \\
 &= R^2(t_0)\left[\frac{\cosh^2(\chi)\,d\chi^2}{1 + \sinh^2(\chi)} + \sinh^2(\chi)\,d\Omega^2\right], && \text{sub transform} \\
 &= R^2(t_0)\left[d\chi^2 + \sinh^2(\chi)\,d\Omega^2\right], && \text{use hyperbolic identities}
\end{aligned}
\tag{12.13}
\]
12.7
Rob’s supplementary problems
SP. 1 Let’s get a feel for the order of magnitude of the terms in the expanding universe.

SP. 1(a) At the current rate of expansion of the universe, how long will it take for a meter stick to double in size? For the universe to double in size?

SP. 1(b) Plate tectonics is causing the Atlantic ocean to spread at a rate of about 25 mm/year, see http://en.wikipedia.org/wiki/Mid-ocean_ridge. So stationed in Brest, France one observes New York City to recede at v = 25 mm/year. Assuming Hubble’s law applies locally, compare this to the rate at which New York recedes from Brest due to expansion of the universe.
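To get started on SP. 1, one possible order-of-magnitude sketch in Python follows. Everything numerical here is an assumption on my part and not from the notes: a Hubble constant of 70 km/s/Mpc, a Brest to New York distance of roughly 5500 km, and the (unphysical) premise stated in the problem that Hubble’s law applies locally with a constant Hubble parameter. The function names are hypothetical.

```python
import math

# Order-of-magnitude estimates for SP. 1 under loudly stated assumptions:
#   H0 = 70 km/s/Mpc (assumed round value),
#   Brest to New York distance ~ 5.5e6 m (assumed rough value),
#   Hubble's law applied locally with a constant Hubble parameter.

H0_KM_S_MPC = 70.0
METRES_PER_MPC = 3.086e22
SECONDS_PER_YEAR = 3.156e7
BREST_TO_NEW_YORK_M = 5.5e6


def hubble_rate_per_year():
    """Hubble constant converted to fractional expansion per year."""
    return H0_KM_S_MPC * 1.0e3 / METRES_PER_MPC * SECONDS_PER_YEAR


def doubling_time_years():
    """Time for any length to double if dl/dt = H0 * l with constant H0."""
    return math.log(2.0) / hubble_rate_per_year()


def recession_mm_per_year(distance_m):
    """Hubble recession speed v = H0 * d, in mm per year."""
    return hubble_rate_per_year() * distance_m * 1.0e3


if __name__ == "__main__":
    print(f"H0 = {hubble_rate_per_year():.2e} per year")
    print(f"doubling time ~ {doubling_time_years():.2e} years")
    v = recession_mm_per_year(BREST_TO_NEW_YORK_M)
    print(f"Hubble recession Brest-New York ~ {v:.2f} mm/year (tectonics: 25 mm/year)")
```

Under these assumptions the doubling time is of order $10^{10}$ years for any length, and the Hubble recession across the Atlantic is a fraction of a millimetre per year, well below the 25 mm/year from plate tectonics.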
Bibliography
Batchelor, G. K., 1967: An Introduction to Fluid Dynamics. Cambridge University Press, 615 pp.
Boas, M. L., 1983: Mathematical methods in the physical sciences. John Wiley and Sons, 793 pp.
Dirac, P. A. M., 1996: The General Theory of Relativity. Princeton University Press, 71 pp.
Faber, R. L., 1983: Differential geometry and relativity theory: An Introduction. Marcel Dekker, 255 + X pp.
Hobson, M., G. Efstathiou, and A. Lasenby, 2009: General Relativity: An introduction for physicists. Cambridge, 572 + XVIII pp.
Misner, C. W., K. S. Thorne, and J. A. Wheeler, 1973: Gravitation. W. H. Freeman and company, 1279 + XXVI pp.
Perkins, D., 2009: Particle Astrophysics. Oxford University Press, 2nd ed., 339 + IX pp.
Schutz, B., 2009: A first course in General Relativity. Cambridge University Press, 2nd ed., 393 + XV pp.