Berger GR 2006

General Relativity

Lecture Notes C348

© Mitchell A Berger, Mathematics, University College London, 2006

Contents

1 Manifolds, Vectors, and Gradients
  1.1 Manifolds
    1.1.1 Product Manifolds
  1.2 Co-ordinate Transformations
  1.3 Things that Live on Manifolds
    1.3.1 Scalar fields
    1.3.2 Curves
    1.3.3 Parametrized Surfaces
    1.3.4 Vectors
    1.3.5 Gradients
  1.4 Transformation Laws
    1.4.1 Vectors
    1.4.2 Gradients
    1.4.3 Notation
  1.5 Duality between Vectors and Gradients
    1.5.1 The directional derivative
    1.5.2 Invariance
    1.5.3 Geometric Interpretation

2 Tensors and Metrics
  2.1 Tensors
  2.2 Getting the transformation correct
  2.3 Things to do with tensors
  2.4 The Levi–Civita Tensor
    2.4.1 Odd / Even Permutations
  2.5 3-Vector Identities
  2.6 Metrics
  2.7 Euclidean Metrics
    2.7.1 E3 Euclidean 3-space
    2.7.2 E3: Cylindrical Co-ordinates
  2.8 Arc Length
  2.9 Scalar Product and Magnitude for Vectors
  2.10 Raising and Lowering Operators
  2.11 Signature of the Metric
  2.12 Riemannian and Pseudo-Riemannian metrics
  2.13 Map Projections
    2.13.1 Cylindrical Map
    2.13.2 Mercator Projection

3 Special Relativity
  3.1 Minkowski Space-time
  3.2 Units
  3.3 Einstein’s Axioms of Special Relativity
  3.4 Space-Time Diagrams
  3.5 The Poincare and Lorentz Groups
    3.5.1 Group Axioms
  3.6 Lorentz Boosts
    3.6.1 Deriving the transformation matrix
  3.7 Simultaneity
  3.8 Length Contraction
  3.9 Relativistic Dynamics
    3.9.1 The 4-momentum
    3.9.2 Forces
    3.9.3 Energy-Momentum Conservation
    3.9.4 Photons

4 Maxwell’s Equations in Tensor Form
  4.1 Maxwell’s Equations – Review
    4.1.1 Internal Structure Equations
    4.1.2 Source Equations
    4.1.3 Lorentz Force
    4.1.4 Charge Conservation
  4.2 The Faraday Tensor
  4.3 Internal Structure Equations
  4.4 Source Equations
  4.5 Charge Conservation
  4.6 Lorentz Force
  4.7 Potential Form
    4.7.1 Advantage – Internal Structure Equations
    4.7.2 Advantage – Source Equations
  4.8 Gauge Transformations
  4.9 Lorentz Gauge
  4.10 Light Waves

5 The Equivalence Principle
  5.1 Inertial mass
  5.2 Free Fall
    5.2.1 Locally Inertial Frames
  5.3 Geodesics
    5.3.1 Examples
    5.3.2 The Geodesic Equation
    5.3.3 Covariant Acceleration

6 Covariant Derivatives
  6.1 Non-Euclidean Geometry
  6.2 The Covariant Derivative
  6.3 Derivatives of Other Tensors
    6.3.1 The gradient of the metric in General Relativity
  6.4 Covariant Directional Derivatives and Acceleration
  6.5 Newton’s Law of motion
  6.6 Twin Paradox
    6.6.1 The rapidity

7 Orbits
  7.1 Noether’s Theorem
  7.2 The Schwarzschild Metric
    7.2.1 Symmetries and Conserved Quantities
    7.2.2 Orbits in the Equatorial Plane
  7.3 Precession of Mercury’s Orbit
    7.3.1 Method
    7.3.2 Newtonian Solution
    7.3.3 Relativistic Correction
  7.4 Deflection of Starlight
    7.4.1 Newtonian Theory
    7.4.2 Relativistic Theory
  7.5 Energy Conservation on Geodesics

Chapter 1

Manifolds, Vectors, and Gradients

... ‘You must follow me carefully. I shall have to controvert one or two ideas that are almost universally accepted. The geometry, for instance, they taught you at school is founded on a misconception.’

‘Is not that rather a large thing to expect us to begin upon?’ said Filby ...

The Time Machine, HG Wells 1895

1.1 Manifolds

Mathematics provides a common term for curved surfaces, curved spaces, and even curved space-times: the manifold. In essence, a manifold is an N-dimensional surface. This means that each point of the manifold can be located by specifying N numbers, or coordinates. More formally,

Definition 1.1 Manifold

• A manifold M is a set of points which can locally be mapped into $\mathbb{R}^N$ for some N = 0, 1, 2, . . . . The number N will be called the dimension of the manifold.

• The mapping must be one-to-one.

• If two mappings overlap, one must be a differentiable function of the other.

Example 1.1 Let M be a two dimensional surface. Suppose we have a point P and we wish to say where this point is. We can do this by specifying two coordinates x and y for P. In figure 1.1 P is mapped to (x, y) = (0.5, 1.5), i.e. P is given the co-ordinates (0.5, 1.5). Note that the lines of constant x never cross each other (similarly for the lines of constant y). If they did, then the coordinates of the crossing point would not be unique.

Example 1.2 Suppose we try to map the same surface M using polar coordinates r and φ. Here we run into difficulties:

• The mapping is not one-to-one at r = 0, because φ is not single-valued there. Any coordinate pair (r, φ) = (0, φ) refers to the origin.

• The mapping is not continuous for all values of φ, as angles only have a range of 2π. We must cut the plane, say along the line φ = ±π. Then φ will not be continuous across this line.


Figure 1.1: The manifold in example 1.1.

Figure 1.2: The plane divided into two regions. Polar coordinates work in Region I, even if they do not work in all space.


Figure 1.3: Spherical coordinates.

Note that polars are still well-defined in regions which avoid the origin and the cut. In the figure, polar coordinates are single valued and continuous in region I, but not in the entire plane. We remedy the situation by using a different co-ordinate system in region II. (For example, we could employ Cartesian coordinates in region II.)

Example 1.3 $M = S^1$ (circle). There is one coordinate φ which can be chosen to go from −π to π. The point mapped to φ = π also maps to φ = −π. Thus we will need at least two co-ordinate patches.

To summarize: we cover a manifold M with co-ordinate patches. In each patch, the coordinates form a cross-hatch. The minimum number of patches needed depends on the topology of M. In the previous example, $M = S^1$ needs two patches. A manifold $M = \mathbb{R}^2$ consisting of the entire plane needs only one (using Cartesian coordinates rather than polars).

Example 1.4 The 2-sphere $M = S^2$. Spherical coordinates do not cover the entire sphere in a one-to-one continuous fashion; the azimuthal angle φ is many-valued at the poles and is discontinuous at φ = ±π. On a globe of the Earth, φ corresponds to longitude, which is discontinuous at the international date line drawn near longitude ±180°.

Exercise 1.1 Cover the 2-sphere $S^2$ with just two coordinate patches.

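Example 1.5 $S^n$ (n-sphere)

$$S^n = \left\{ (X^1, X^2, \ldots, X^{n+1}) \;\middle|\; (X^1)^2 + (X^2)^2 + \cdots + (X^{n+1})^2 = 1 \right\} \tag{1.1}$$

e.g.

$$\begin{aligned}
S^2 &= \left\{ (x, y, z) \mid x^2 + y^2 + z^2 = 1 \right\} \\
S^1 &= \left\{ (x, y) \mid x^2 + y^2 = 1 \right\} \\
S^0 &= \left\{ x \mid x^2 = 1 \right\} = \{-1, 1\}
\end{aligned} \tag{1.2}$$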

Example 1.6 $B^n$ (n-ball)

$$B^n = \left\{ (X^1, X^2, \ldots, X^n) \;\middle|\; (X^1)^2 + (X^2)^2 + \cdots + (X^n)^2 < 1 \right\} \tag{1.3}$$

i.e. $B^n$ is the solid volume inside $S^{n-1}$.

1.1.1 Product Manifolds

Example 1.7 $M = T^2$ (2-torus). The 2-torus can be represented as the surface of a doughnut. Let θ represent the angle the short way around, and φ the angle the long way around. Both these coordinates are discontinuous at ±π.

[Figure: the torus $M = T^2$ with the angles θ and φ marked.]

We can also represent $T^2$ as a rectangular box (slit and open out a torus into a rectangular shape). Opposite edges coincide (θ = π is equivalent to θ = −π, similarly for φ).

Thus we can write $T^2$ as the set of points

$$T^2 = \left\{ (\theta, \phi) \mid -\pi \le \theta \le \pi,\ -\pi \le \phi \le \pi \right\}. \tag{1.4}$$


Note that in the definition of $T^2$, the coordinates θ and φ are completely independent. By themselves, each gives a circle $S^1$. We can express this independence with the notation $T^2 = S^1 \times S^1$. We call $T^2$ a product manifold, as it can be generated by considering all combinations of θ (the first $S^1$) and φ (the second $S^1$).

1.2 Co-ordinate Transformations

Suppose we know the coordinates of points in a manifold in one coordinate system. We may need to be able to find the same points in other coordinate systems as well, for various reasons. Somebody we work with may use another system, so there will be a communication problem unless we can translate between systems. Or perhaps the equations we wish to solve are easier in a different system. Also, if we know general ways of going from one system to another, we can make sure that our solutions work independently of which particular coordinate system we use. We will only deal with coordinate systems which are differentiable functions of each other.

Suppose point P has co-ordinates $X = (X^1, X^2, \ldots, X^n)$ in one co-ordinate system, and $Y = (Y^1, Y^2, \ldots, Y^n)$ in another. Then the co-ordinates $(Y^1, Y^2, \ldots, Y^n)$ must be smooth (differentiable) functions of $(X^1, X^2, \ldots, X^n)$. For example,

$$Y^1 = Y^1(X^1, X^2, \ldots, X^n). \tag{1.5}$$

Also, the co-ordinate (Jacobian) transformation matrix is

$$\frac{\partial Y^a}{\partial X^b}, \tag{1.6}$$

where a labels the row, and b labels the column in the matrix.

1.3 Things that Live on Manifolds

1.3.1 Scalar fields

Definition 1.2 Scalar Fields Scalar fields are functions which assign numbers to points on the manifold. More formally, a scalar field is a function f which maps a manifold M to the set of real numbers: f : M → R.

Example 1.8

M = Surface of the Earth
f(P) = Temperature at point P (→ weather map)

N.B. We may wish to use complex functions ψ : M → C, for example to describe quantum wave-functions.

1.3.2 Curves

The scalar fields defined above map the manifold M to the real line R. Suppose we now do the reverse. For each real number λ, we will obtain a point γ(λ) on M. If we string these points together, we will get a curve on M.


Figure 1.4: A curve γ : R →M

Example 1.9 Suppose $M = \mathbb{R}^3$, infinite three-dimensional space. Let

$$\gamma(\lambda) = (7\cos 3\lambda,\ 7\sin 3\lambda,\ \lambda). \tag{1.7}$$

This curve has the shape of a helix. The helix winds around the z axis at a radius of 7, and makes a complete turn each time λ increases by 2π/3.
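The two claims about the helix can be checked numerically. The following is a minimal sketch using only the Python standard library; the function name `helix` and the sample value λ = 0.4 are our own choices, not part of the notes.

```python
import math

def helix(lam, radius=7.0, rate=3.0):
    """Point gamma(lam) = (7 cos 3lam, 7 sin 3lam, lam) of equation (1.7)."""
    return (radius * math.cos(rate * lam), radius * math.sin(rate * lam), lam)

# Claimed increase in lambda needed for one complete turn.
period = 2 * math.pi / 3

x0, y0, z0 = helix(0.4)
x1, y1, z1 = helix(0.4 + period)

# Distance from the z axis at the starting point.
radius_at_start = math.hypot(x0, y0)
# After one full turn the (x, y) position should repeat.
horizontal_shift = math.hypot(x1 - x0, y1 - y0)
# Meanwhile the curve has climbed by exactly one period in z.
rise_per_turn = z1 - z0
```

Here `radius_at_start` comes out as 7 and `horizontal_shift` as zero (up to rounding), while `rise_per_turn` equals 2π/3.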

Definition 1.3 Curves A curve is a mapping of the real line (or part of the real line, or a circle) into the manifold M. Formally this is written γ : R → M for the whole real line, or γ : [0, 1] → M (unit interval), or γ : $S^1$ → M (circle).

1.3.3 Parametrized Surfaces

Most manifolds cannot be visualized, especially if their dimension is much greater than 3. Fortunately, one and two-dimensional manifolds provide many useful visual examples. We must first be able to imbed, or place, the manifold in ordinary 3-space $\mathbb{R}^3$. A one-dimensional manifold can be drawn using a curve with one parameter λ as in the previous section. For two-dimensional manifolds (surfaces) imbedded in 3-space, we can specify the surface by giving two parameters.

Example 1.10 Drawing a sphere. We will use spherical polar coordinates θ and φ as parameters, accepting that there will be coordinate singularities at the poles (and a discontinuity at φ = ±π). The imbedding satisfies

$$\begin{aligned}
S : (\theta, \phi) &\to (x, y, z) \\
x = x(\theta, \phi) &= \sin\theta \cos\phi \\
y = y(\theta, \phi) &= \sin\theta \sin\phi \\
z = z(\theta, \phi) &= \cos\theta
\end{aligned} \tag{1.8}$$
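As a quick sanity check on the imbedding of Example 1.10, the sketch below evaluates (1.8) at a few sample parameter values and confirms that the image points satisfy x² + y² + z² = 1. It also illustrates the coordinate singularity: at θ = 0 every value of φ lands on the same point. The function names and sample values are our own assumptions.

```python
import math

def sphere_embedding(theta, phi):
    """Map (theta, phi) to (x, y, z) following equation (1.8)."""
    return (math.sin(theta) * math.cos(phi),
            math.sin(theta) * math.sin(phi),
            math.cos(theta))

def on_unit_sphere(point, tol=1e-12):
    """True if the point satisfies x^2 + y^2 + z^2 = 1 to within tol."""
    x, y, z = point
    return abs(x * x + y * y + z * z - 1.0) < tol

# Arbitrary sample parameter values (theta, phi).
samples = [(0.3, -2.0), (1.2, 0.5), (2.9, 3.0)]
all_on_sphere = all(on_unit_sphere(sphere_embedding(t, p)) for t, p in samples)

# Coordinate singularity: at theta = 0, different phi give the same point.
north_pole_images = {
    tuple(round(c, 12) + 0.0 for c in sphere_embedding(0.0, phi))
    for phi in (-1.0, 0.0, 2.0)
}
```

The set `north_pole_images` contains a single point, the north pole (0, 0, 1), even though three different values of φ were used.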

Exercise 1.2 Consider an ellipsoid E given by the equation

$$9x^2 + 4y^2 + z^2 = 36. \tag{1.9}$$

Find a parametrization E : (θ, φ) → (x, y, z) which satisfies this equation. That is, find the functions x(θ, φ), y(θ, φ), and z(θ, φ).


Exercise 1.3 Here we find a parametrization of the Northern half of a sphere which does not use spherical coordinates: let the two parameters be t and u, where

$$\begin{aligned}
S : (t, u) &\to (x, y, z) \\
x = x(t, u) &= t \\
y = y(t, u) &= u \\
z = z(t, u) &= \ ?
\end{aligned} \tag{1.10}$$

Find the function z(t, u).

1.3.4 Vectors

A vector has:

• A Magnitude

• A Direction

• A Base Point

N.B. Forget “position vectors”. Position is given by co-ordinates, not by vectors. In flat Euclidean space, one can draw an arrow from the origin to any point, and in some elementary books this is called a vector. We will not do this, as we may wish to study highly warped manifolds where such arrows will also need to be warped (and hence their direction and magnitude will not be well defined). Vectors in differential geometry always have a base point, and give a direction and magnitude proceeding from that point.

Start with a curve γ : R → M (or [0, 1] → M, or $S^1$ → M). The curve provides a set of n coordinate functions of λ, i.e.

$$\gamma(\lambda) = \left( X^1(\lambda), \ldots, X^n(\lambda) \right). \tag{1.11}$$

These n functions have derivatives which show how fast they increase with λ. Taken together, they show the direction the curve is travelling.

Definition 1.4 Tangent Vectors The tangent vector to the curve γ is given by

$$\mathbf{V}(\lambda) = \begin{pmatrix} dX^1/d\lambda \\ \vdots \\ dX^n/d\lambda \end{pmatrix}. \tag{1.12}$$

Note that the set of coordinates of a point (e.g. $(X^1, \ldots, X^n)$) is not a vector!
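To make Definition 1.4 concrete, the sketch below computes the tangent vector of the helix of equation (1.7) in two ways: from the derivatives worked out by hand, and from a central finite difference of the coordinate functions. The function names, the step size h, and the sample value of λ are our own choices.

```python
import math

def gamma(lam):
    """Coordinate functions of the helix in equation (1.7)."""
    return (7 * math.cos(3 * lam), 7 * math.sin(3 * lam), lam)

def tangent_exact(lam):
    """Components dX^a/dlam obtained by differentiating gamma by hand."""
    return (-21 * math.sin(3 * lam), 21 * math.cos(3 * lam), 1.0)

def tangent_numeric(curve, lam, h=1e-6):
    """Central-difference estimate of dX^a/dlam for each coordinate."""
    ahead, behind = curve(lam + h), curve(lam - h)
    return tuple((a - b) / (2 * h) for a, b in zip(ahead, behind))

lam0 = 0.25  # arbitrary sample parameter value
max_error = max(abs(e - n) for e, n in
                zip(tangent_exact(lam0), tangent_numeric(gamma, lam0)))
```

The two results agree to well within the accuracy of the finite difference. Note that the tangent vector is built from derivatives of the coordinates along the curve, not from the coordinates themselves.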

1.3.5 Gradients

Gradients are formed from scalar functions.

Definition 1.5 Gradients Given a function f : M → R,

$$\nabla f = \left( \frac{\partial f}{\partial X^1}, \frac{\partial f}{\partial X^2}, \ldots, \frac{\partial f}{\partial X^n} \right). \tag{1.13}$$

This is different to a tangent vector. For one thing, it is determined everywhere on M, whereas a tangent vector is only defined on a single curve.

Example 1.11 Consider an Ordnance Survey map, giving height h as a function of position on the Earth’s surface, h : $S^2$ → R.

The contours are lines of constant h. But note that these are not parametrized curves! (We have not been given a way to choose λ = 0, 1, 2, . . . etc.) Thus we have no way of defining tangent vectors to the contours (at least not until we introduce metrics, in §2.6).

Exercise 1.4 Consider a torus $T^2$ with coordinates $(X^1, X^2) = (u, v)$, where −π < u ≤ π, −π < v ≤ π. Suppose there is a curve γ : R → $T^2$, where γ(λ) = (2πλ, 6πλ) (i.e. u(λ) = 2πλ and v(λ) = 6πλ).

Find the tangent vector

$$\mathbf{V}(\lambda) = \frac{d\gamma}{d\lambda}.$$

Consider the function f(u, v) = sin(v). Find the gradient ∇f and the directional derivative V · ∇f.


Figure 1.5: Curve on a torus.

1.4 Transformation Laws

1.4.1 Vectors

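In co-ordinates $X = (X^1, \ldots, X^N)$

$$\mathbf{V}_X = \begin{pmatrix} dX^1/d\lambda \\ \vdots \\ dX^N/d\lambda \end{pmatrix}. \tag{1.14}$$

In another co-ordinate system $Y = (Y^1, \ldots, Y^N)$

$$\mathbf{V}_Y = \begin{pmatrix} dY^1/d\lambda \\ \vdots \\ dY^N/d\lambda \end{pmatrix}. \tag{1.15}$$

How do we relate $\mathbf{V}_X$ and $\mathbf{V}_Y$? By the chain rule

$$\frac{dY^1}{d\lambda} = \sum_{a=1}^{N} \frac{\partial Y^1}{\partial X^a} \frac{dX^a}{d\lambda} \tag{1.16}$$

$$\therefore\ \mathbf{V}_Y = \begin{pmatrix} \sum_{a=1}^{N} \dfrac{\partial Y^1}{\partial X^a} \dfrac{dX^a}{d\lambda} \\ \vdots \\ \sum_{a=1}^{N} \dfrac{\partial Y^N}{\partial X^a} \dfrac{dX^a}{d\lambda} \end{pmatrix}. \tag{1.17}$$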


1.4.2 Gradients

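Chain rule again. This time, we let c be the ‘dummy’ index which is summed from c = 1 to c = N (we can use any letter, of course).

$$\frac{\partial f}{\partial Y^1} = \sum_{c=1}^{N} \frac{\partial f}{\partial X^c} \frac{\partial X^c}{\partial Y^1} \tag{1.18}$$

and so the transformation is

$$\nabla_Y f = \left( \sum_{c=1}^{N} \frac{\partial f}{\partial X^c} \frac{\partial X^c}{\partial Y^1},\ \ldots,\ \sum_{c=1}^{N} \frac{\partial f}{\partial X^c} \frac{\partial X^c}{\partial Y^N} \right). \tag{1.19}$$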

1.4.3 Notation

A) Einstein’s Summation Convention The placement of indices is important in geometry and relativity. The vectors we have looked at (like $V^1$) have been given indices on top, i.e. superscripts. Meanwhile, the gradients (like $\nabla_1 f$) have indices lower down, i.e. subscripts. This makes it easier to distinguish between them. It also allows us to define consistent rules for calculating inner products. But first we will simplify the notation.

In numerous expressions in geometry and relativity (and particle theory as well) there are places where an index is repeated, once as a superscript and once as a subscript, and summed over. If we do not bother to write down the summation sign, then the expressions will be less cluttered.

Definition 1.6 Einstein Summation Given two objects, one indexed with superscripts, $A = (A^1, \ldots, A^N)$, and one with subscripts, $B = (B_1, \ldots, B_N)$, we define

$$B_c A^c \equiv \sum_{c=1}^{N} B_c A^c \tag{1.20}$$

In a derivative like $\partial f / \partial X^c$ the index c will be considered a subscript.
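In code, the summation convention of equation (1.20) is just an explicit sum over the repeated index. The sketch below is one way to spell it out in plain Python; the function name `contract` and the component values are hypothetical, chosen only for illustration.

```python
def contract(lower, upper):
    """Return B_c A^c: the sum over the repeated index c, as in equation (1.20)."""
    if len(lower) != len(upper):
        raise ValueError("both objects must have the same number of components")
    return sum(b * a for b, a in zip(lower, upper))

# Hypothetical components in N = 3 dimensions.
A_upper = [1.0, 2.0, 3.0]   # A^1, A^2, A^3
B_lower = [4.0, 5.0, 6.0]   # B_1, B_2, B_3

scalar = contract(B_lower, A_upper)   # 4*1 + 5*2 + 6*3
```

The name of the dummy index does not appear anywhere in the result, which is why any letter may be used for it.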

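Example 1.12 In the previous section, equation (1.19) becomes

$$\nabla_Y f = \left( \frac{\partial f}{\partial X^c} \frac{\partial X^c}{\partial Y^1},\ \ldots,\ \frac{\partial f}{\partial X^c} \frac{\partial X^c}{\partial Y^N} \right). \tag{1.21}$$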

B) Co-ordinate system labels We will label co-ordinate systems either by primed (X′) and unprimed (X) symbols, or by capital letters; for example

$$\begin{aligned}
\text{Earth Frame} \quad & E = (E^1, \ldots, E^N) \\
\text{Spaceship Frame} \quad & S = (S^1, \ldots, S^N) \\
\text{Lab Frame} \quad & L = (L^1, \ldots, L^N)
\end{aligned} \tag{1.22}$$

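C) Differentials

$$\frac{\partial f}{\partial X^a} = \partial_a f, \qquad \frac{\partial f}{\partial X'^a} = \partial'_a f, \qquad \frac{\partial f}{\partial E^a} = \partial_{E^a} f. \tag{1.23}$$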


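D) Vector Components In, for example, the lab frame L the tangent vector to the curve

$$\gamma(\lambda) = (L^1(\lambda), \ldots, L^N(\lambda)) \tag{1.24}$$

is

$$\mathbf{V} = \frac{d\gamma}{d\lambda} = \begin{pmatrix} dL^1/d\lambda \\ \vdots \\ dL^N/d\lambda \end{pmatrix} \tag{1.25}$$

with components

$$V^a_L = \frac{dL^a}{d\lambda}. \tag{1.26}$$

We sometimes refer to the whole of the vector V by referring to a typical component $V^a$. Similarly, we may refer to ∇f as $\partial_a f$.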

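E) Transforms

• Vectors:

For transformations between the X frame and the Y frame, equation (1.16) becomes

$$V^1_Y = \frac{dY^1}{d\lambda} = \frac{\partial Y^1}{\partial X^a} \frac{dX^a}{d\lambda} = \frac{\partial Y^1}{\partial X^a} V^a_X. \tag{1.27}$$

The formula for an arbitrary component of V is

$$V^b_Y = \frac{\partial Y^b}{\partial X^a} V^a_X. \tag{1.28}$$

• Gradients: an arbitrary component of ∇f in equation (1.19) can now be expressed as

$$\partial_{Y^b} f = \frac{\partial X^c}{\partial Y^b}\, \partial_{X^c} f \tag{1.29}$$

• For transformations between primed and unprimed co-ordinates, these expressions become

$$V'^b = \frac{\partial X'^b}{\partial X^a} V^a, \tag{1.30}$$

$$\partial'_b f = \frac{\partial X^c}{\partial X'^b}\, \partial_c f \tag{1.31}$$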

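• Note that the transformation matrices in equations (1.28) and (1.29), i.e. $\partial E^b / \partial S^a$ and $\partial S^c / \partial E^b$, are inverses. Proof: The (a, c) component of the product matrix is

\begin{align}
\sum_{b=1}^{N} \frac{\partial E^b}{\partial S^a} \frac{\partial S^c}{\partial E^b} &= \sum_{b=1}^{N} \frac{\partial S^c}{\partial E^b} \frac{\partial E^b}{\partial S^a} \tag{1.32} \\
&= \frac{\partial S^c}{\partial S^a} \tag{1.33}
\end{align}

by the chain rule. But

$$\frac{\partial S^c}{\partial S^a} = \begin{cases} 1 & c = a \\ 0 & c \neq a \end{cases} \tag{1.34}$$

i.e.

$$\frac{\partial S^c}{\partial E^b} \frac{\partial E^b}{\partial S^a} = \delta^c_a \tag{1.35}$$

where

$$\delta^c_a = \begin{pmatrix} 1 & 0 & 0 & \cdots \\ 0 & 1 & 0 & \cdots \\ 0 & 0 & 1 & \cdots \\ \vdots & \vdots & & \ddots \end{pmatrix} \tag{1.36}$$

is the identity matrix. QED

This result is very important. Gradients and vectors have different (in fact, inverse) transformation laws. Thus a gradient is not a vector!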

Exercise 1.5 Show that for any vector W with components $W^a$,

$$\delta^c_a W^a = W^c. \tag{1.37}$$

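• How do two successive transformations work? Suppose we transform from co-ordinates X → Y → Z:

\begin{align}
V^a_Z &= \frac{\partial Z^a}{\partial Y^b} V^b_Y, \quad \text{where} \tag{1.38} \\
V^b_Y &= \frac{\partial Y^b}{\partial X^c} V^c_X, \quad \Rightarrow \tag{1.39} \\
V^a_Z &= \frac{\partial Z^a}{\partial Y^b} \frac{\partial Y^b}{\partial X^c} V^c_X. \tag{1.40}
\end{align}

But by the chain rule

$$\frac{\partial Z^a}{\partial Y^b} \frac{\partial Y^b}{\partial X^c} = \frac{\partial Z^a}{\partial X^c} \tag{1.41}$$

so (as we should expect)

$$V^a_Z = \frac{\partial Z^a}{\partial X^c} V^c_X. \tag{1.42}$$

Now try X → Y → X:

\begin{align}
V^a_X &= \frac{\partial X^a}{\partial Y^b} \frac{\partial Y^b}{\partial X^c} V^c_X \tag{1.43} \\
&= \delta^a_c V^c_X \tag{1.44} \\
&= V^a_X \quad \text{good!} \tag{1.45}
\end{align}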

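Example 1.13 Polar Co-ordinates.

\begin{align}
C &= (C^1, C^2) = (x, y) && \text{Cartesian} \tag{1.46} \\
P &= (P^1, P^2) = (r, \phi) && \text{Polar} \tag{1.47}
\end{align}

where

\begin{align}
C^1 &= x = r\cos\phi = P^1 \cos P^2 \tag{1.48} \\
C^2 &= y = r\sin\phi = P^1 \sin P^2 \tag{1.49}
\end{align}

or, going the other way,

\begin{align}
P^1 &= r = \sqrt{x^2 + y^2} = \sqrt{(C^1)^2 + (C^2)^2} \tag{1.50} \\
P^2 &= \phi = \arctan\frac{y}{x} = \arctan\frac{C^2}{C^1} \tag{1.51}
\end{align}

The transformation matrix (Jacobian) is:

\begin{align}
\frac{\partial C^a}{\partial P^b} &= \begin{pmatrix} \partial C^1/\partial P^1 & \partial C^1/\partial P^2 \\ \partial C^2/\partial P^1 & \partial C^2/\partial P^2 \end{pmatrix} \tag{1.52} \\
&= \begin{pmatrix} \partial x/\partial r & \partial x/\partial\phi \\ \partial y/\partial r & \partial y/\partial\phi \end{pmatrix} \tag{1.53} \\
&= \begin{pmatrix} \cos\phi & -r\sin\phi \\ \sin\phi & r\cos\phi \end{pmatrix} \tag{1.54}
\end{align}

and the other way

\begin{align}
\frac{\partial P^b}{\partial C^d} &= \begin{pmatrix} \partial r/\partial x & \partial r/\partial y \\ \partial\phi/\partial x & \partial\phi/\partial y \end{pmatrix} \tag{1.55} \\
&= \begin{pmatrix} x/r & y/r \\ -y/r^2 & x/r^2 \end{pmatrix} \tag{1.56} \\
&= \begin{pmatrix} \cos\phi & \sin\phi \\ -\dfrac{\sin\phi}{r} & \dfrac{\cos\phi}{r} \end{pmatrix}. \tag{1.57}
\end{align}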

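And the two matrices are inverses.

Next: Check the transformation law for tangent vectors in polar co-ordinates.

Let γ be a circle of radius R, parametrized by λ = φ/(2π).

Cartesian coordinates

\begin{align}
\gamma(\lambda) &= (C^1, C^2) \tag{1.58} \\
&= (x(\lambda), y(\lambda)) \tag{1.59} \\
&= (R\cos 2\pi\lambda,\ R\sin 2\pi\lambda) \tag{1.60}
\end{align}

Polar coordinates

\begin{align}
\gamma(\lambda) &= (P^1, P^2) \tag{1.61} \\
&= (r(\lambda), \phi(\lambda)) \tag{1.62} \\
&= (R,\ 2\pi\lambda) \tag{1.63}
\end{align}

The tangent vector can be computed separately for each coordinate system:

Cartesian coordinates

\begin{align}
V^a_C(\lambda) &= \frac{dC^a}{d\lambda} \tag{1.64} \\
&= \begin{pmatrix} dx/d\lambda \\ dy/d\lambda \end{pmatrix} \tag{1.65} \\
&= \begin{pmatrix} -2\pi R \sin 2\pi\lambda \\ 2\pi R \cos 2\pi\lambda \end{pmatrix}. \tag{1.66}
\end{align}

Polar coordinates

\begin{align}
V^a_P(\lambda) &= \frac{dP^a}{d\lambda} \tag{1.67} \\
&= \begin{pmatrix} dr/d\lambda \\ d\phi/d\lambda \end{pmatrix} \tag{1.68} \\
&= \begin{pmatrix} 0 \\ 2\pi \end{pmatrix}. \tag{1.69}
\end{align}

Let’s check the transformation law, using equation (1.54):

\begin{align}
V^a_C(\lambda) &= \frac{\partial C^a}{\partial P^b} V^b_P \quad (\text{at } r = R,\ \phi = 2\pi\lambda) \tag{1.70} \\
&= \begin{pmatrix} \cos 2\pi\lambda & -R\sin 2\pi\lambda \\ \sin 2\pi\lambda & R\cos 2\pi\lambda \end{pmatrix} \begin{pmatrix} 0 \\ 2\pi \end{pmatrix} \tag{1.71} \\
&= \begin{pmatrix} -2\pi R \sin 2\pi\lambda \\ 2\pi R \cos 2\pi\lambda \end{pmatrix}. \tag{1.72}
\end{align}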
IT WORKS!
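Both claims of Example 1.13 (that the two Jacobians are inverses, and that the tangent vector to the circle transforms correctly) can also be confirmed numerically. The following is a minimal sketch in plain Python; the helper names and the sample values R = 2 and λ = 0.1 are our own assumptions.

```python
import math

def jacobian_cartesian_wrt_polar(r, phi):
    """Matrix dC^a/dP^b of equation (1.54): rows (x, y), columns (r, phi)."""
    return [[math.cos(phi), -r * math.sin(phi)],
            [math.sin(phi),  r * math.cos(phi)]]

def jacobian_polar_wrt_cartesian(r, phi):
    """Matrix dP^b/dC^d of equation (1.57): rows (r, phi), columns (x, y)."""
    return [[math.cos(phi), math.sin(phi)],
            [-math.sin(phi) / r, math.cos(phi) / r]]

def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def matvec(m, v):
    """Product of a 2x2 matrix with a 2-component column vector."""
    return [sum(m[i][k] * v[k] for k in range(2)) for i in range(2)]

R, lam = 2.0, 0.1            # arbitrary sample radius and parameter value
phi = 2 * math.pi * lam

# Should be the identity matrix, as in equation (1.35).
product = matmul(jacobian_cartesian_wrt_polar(R, phi),
                 jacobian_polar_wrt_cartesian(R, phi))

V_polar = [0.0, 2 * math.pi]                                          # (1.69)
V_from_law = matvec(jacobian_cartesian_wrt_polar(R, phi), V_polar)    # (1.70)
V_direct = [-2 * math.pi * R * math.sin(phi),
            2 * math.pi * R * math.cos(phi)]                          # (1.66)
```

Up to rounding error, `product` is the 2 × 2 identity and `V_from_law` agrees with `V_direct`.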

1.5 Duality between Vectors and Gradients

1.5.1 The directional derivative

Two objects in maths are called ‘dual’ if they combine to give a single number (in R or C). In elementary matrix algebra, row vectors and column vectors multiply to give a number. In quantum theory, a bra vector 〈ψ| and a ket vector |φ〉 combine to give the complex number 〈ψ|φ〉.


On a manifold, we can combine a vector, V, and a gradient, ∇f, naturally by using Einstein summation:

$$\mathbf{V} \cdot \nabla f = V^a\, \partial_a f = \sum_{a=1}^{N} V^a\, \partial_a f. \tag{1.73}$$

This sum gives a number called the directional derivative.

Why is this ‘natural’?

1.5.2 Invariance

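Exercise 1.6 Prove from the transformation laws that the directional derivative is the same in all co-ordinate systems.

Solution For the transformation from X → X′,

$$\begin{aligned}
V^a &= \frac{\partial X^a}{\partial X'^b} V'^b \\
\partial_a f &= \frac{\partial X'^c}{\partial X^a}\, \partial'_c f \\
\Rightarrow\ \mathbf{V} \cdot \nabla f = V^a \partial_a f &= \left( \frac{\partial X^a}{\partial X'^b} \frac{\partial X'^c}{\partial X^a} \right) V'^b\, \partial'_c f \\
&= \left( \delta^c_b \right) V'^b\, \partial'_c f \\
&= V'^c\, \partial'_c f
\end{aligned} \tag{1.74}$$

Thus

$$\mathbf{V} \cdot \nabla f = V^a \partial_a f = V'^c \partial'_c f. \tag{1.75}$$

The directional derivative has the same form in any two co-ordinate systems, and gives the same number.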
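The invariance can be seen in a concrete case by reusing the circle of Example 1.13. In the sketch below we assume a hypothetical scalar field f = x²y, which in polar co-ordinates reads f = r³ cos²φ sin φ, and evaluate V · ∇f in both systems. The field, the function names, and the sample values of R and λ are our own choices.

```python
import math

def grad_cartesian(x, y):
    """(df/dx, df/dy) for the assumed field f = x^2 y."""
    return (2 * x * y, x * x)

def grad_polar(r, phi):
    """(df/dr, df/dphi) for the same field, f = r^3 cos^2(phi) sin(phi)."""
    c, s = math.cos(phi), math.sin(phi)
    return (3 * r * r * c * c * s, r ** 3 * (c ** 3 - 2 * c * s * s))

R, lam = 1.5, 0.2            # arbitrary sample radius and parameter value
phi = 2 * math.pi * lam
x, y = R * math.cos(phi), R * math.sin(phi)

# Tangent vector to the circle in each system, equations (1.66) and (1.69).
V_cart = (-2 * math.pi * R * math.sin(phi), 2 * math.pi * R * math.cos(phi))
V_polar = (0.0, 2 * math.pi)

# V^a d_a f evaluated separately in each co-ordinate system.
dd_cart = sum(v * g for v, g in zip(V_cart, grad_cartesian(x, y)))
dd_polar = sum(v * g for v, g in zip(V_polar, grad_polar(R, phi)))
```

Although the components of V and of ∇f are quite different in the two systems, `dd_cart` and `dd_polar` agree to rounding error.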

1.5.3 Geometric Interpretation

Suppose f is the temperature as a function of position. We can measure f(λ) as we travel along γ(λ).

We can find df/dλ by the chain rule:

$$\begin{aligned}
\frac{df}{d\lambda} &= \sum_{a=1}^{N} \frac{\partial f}{\partial X^a} \frac{dX^a}{d\lambda} \\
&= \partial_a f\, V^a \\
&= \mathbf{V} \cdot \nabla f.
\end{aligned} \tag{1.76}$$

In words, if the curve γ(λ) has tangent vector V, then the directional derivative V · ∇f gives the rate of change of f along the curve, df/dλ.
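Equation (1.76) can be illustrated along the helix of Example 1.9. The sketch below assumes a hypothetical temperature field f = xy + z² and compares V · ∇f with a finite-difference estimate of df/dλ measured along the curve. The field, function names, and step size are our own assumptions.

```python
import math

def gamma(lam):
    """The helix of equation (1.7)."""
    return (7 * math.cos(3 * lam), 7 * math.sin(3 * lam), lam)

def tangent(lam):
    """Tangent vector components dX^a/dlam of the helix."""
    return (-21 * math.sin(3 * lam), 21 * math.cos(3 * lam), 1.0)

def f(x, y, z):
    """Assumed temperature field (hypothetical)."""
    return x * y + z * z

def grad_f(x, y, z):
    """Gradient components (df/dx, df/dy, df/dz) of the assumed field."""
    return (y, x, 2 * z)

lam0 = 0.3                   # arbitrary sample parameter value
h = 1e-6                     # finite-difference step

# Directional derivative V^a d_a f at the point gamma(lam0).
directional = sum(v * g for v, g in zip(tangent(lam0), grad_f(*gamma(lam0))))

# Rate of change of f actually measured along the curve.
measured = (f(*gamma(lam0 + h)) - f(*gamma(lam0 - h))) / (2 * h)
```

The two numbers agree to within the accuracy of the finite difference.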

Chapter 2

Tensors and Metrics

‘You know of course that a mathematical line, a line of thickness nil, has no real existence. They taught you that? Neither has a mathematical plane. These things are mere abstractions.’ ‘That is all right,’ said the Psychologist. ‘Nor, having only length, breadth, and thickness, can a cube have a real existence.’ ‘There I object,’ said Filby. ‘Of course a solid body may exist. All real things’ ‘So most people think. But wait a moment. Can an instantaneous cube exist?’ ‘Don’t follow you,’ said Filby. ‘Can a cube that does not last for any time at all, have a real existence?’

Filby became pensive.

The Time Machine, HG Wells 1895

2.1 Tensors

Tensors are classified by their rank:

Rank   Description
0      Scalars (functions)
1      Vectors and forms (including gradients)
2      Two indices: appear like matrices
3      Three indices: appear like a stack of matrices
...    ...

0th Rank

Definition 2.1 Scalars Tensors with no subscripts or superscripts. They are functions of position on the manifold, and are completely independent of coordinates.

1st Rank

Definition 2.2 Vectors Any tensor with one superscript which transforms like a tangent vector. In many books these are called contravariant vectors. We give the symbol for a vector an overline; $\overline{V}$ has components $V^a$.

Definition 2.3 One-forms Any tensor with one subscript which transforms like a gradient. We will usually refer to one-forms simply as forms. We give forms an underline; the form $\underline{W}$ has components $W_a$ indexed by a subscript.

Two-forms will be introduced in section 4.2 (they not only have two subscripts but also must be antisymmetric). Many books refer to forms as covariant vectors or co-vectors.


Example 2.1 Suppose W = f ∇g, where f, g are functions. Then, going from X → X′ we have

\begin{align}
f &\to f' = f \tag{2.1} \\
g &\to g' = g \tag{2.2} \\
\partial_b &\to \partial'_b = \frac{\partial X^a}{\partial X'^b}\, \partial_a \tag{2.3} \\
\Rightarrow\ W_b &\to W'_b = \frac{\partial X^a}{\partial X'^b}\, W_a \tag{2.4}
\end{align}

NOTE: W has been constructed from a scalar function f and the gradient of another scalar g. However, it may not itself be a gradient, i.e. there may not exist a third function h such that W = ∇h.

Higher Rank Tensors

There are 2 equivalent definitions of 2nd order (and higher) tensors.

1) A tensor with (for example) one superscript and one subscript transforms as a vector on the superscript and as a form on the subscript.

Example 2.2 Let M be a mixed tensor with components $M^a{}_b$. Then

$$M'^a{}_b = \frac{\partial X'^a}{\partial X^c} \frac{\partial X^d}{\partial X'^b}\, M^c{}_d \tag{2.5}$$

2) A tensor with 1 superscript and 1 subscript is dual to the product of one vector and one form. Thus given V and W, the components $M^a{}_b$ of M must have the following property: the quantity

$$\alpha = M^a{}_b\, V^b W_a \tag{2.6}$$

is a scalar, i.e. the same in all reference frames.

In general, a tensor with p upper indices and q lower indices will ‘eat’ p forms and q vectors, and return a scalar.

Definition 2.4 Type $\binom{p}{q}$ tensors A type $\binom{p}{q}$ tensor has p upper indices and q lower indices. The product with q vectors and p forms (summing over all indices) returns a scalar.

For example, if $T^{ab}{}_{c}{}^{d}{}_{ef}{}^{g}$ is a type $\binom{4}{3}$ tensor (7th rank) then

$$\mu = T^{ab}{}_{c}{}^{d}{}_{ef}{}^{g}\, A_a B_b C^c D_d E^e F^f G_g \tag{2.7}$$

is a scalar, where A, B, D, G are forms, and C, E, F are vectors.
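The equivalence of the two definitions can be illustrated for a mixed tensor in two dimensions. The sketch below assumes a linear change of co-ordinates X′ = L X, so that ∂X′ᵃ/∂Xᶜ is the constant matrix L and ∂Xᵈ/∂X′ᵇ is its inverse. It transforms V, W and M by their respective laws and checks that α of equation (2.6) is unchanged. All numerical values and names here are hypothetical.

```python
N = 2

# Assumed linear change of co-ordinates X' = L X (determinant 1).
L = [[2, 1], [1, 1]]          # L[a][c] = dX'^a / dX^c
L_inv = [[1, -1], [-1, 2]]    # L_inv[d][b] = dX^d / dX'^b

# Hypothetical components in the unprimed system.
V = [2, 3]                    # vector V^b
W = [4, 5]                    # form W_a
M = [[4, 3], [2, 1]]          # mixed tensor M^a_b, stored as M[a][b]

def alpha(M, V, W):
    """The scalar M^a_b V^b W_a of equation (2.6)."""
    return sum(M[a][b] * V[b] * W[a] for a in range(N) for b in range(N))

# Vector law (1.30): V'^a = (dX'^a/dX^c) V^c.
V_p = [sum(L[a][c] * V[c] for c in range(N)) for a in range(N)]
# Form law (2.4): W'_b = (dX^a/dX'^b) W_a.
W_p = [sum(L_inv[a][b] * W[a] for a in range(N)) for b in range(N)]
# Mixed tensor law (2.5): M'^a_b = (dX'^a/dX^c)(dX^d/dX'^b) M^c_d.
M_p = [[sum(L[a][c] * L_inv[d][b] * M[c][d]
            for c in range(N) for d in range(N))
        for b in range(N)] for a in range(N)]

alpha_unprimed = alpha(M, V, W)
alpha_primed = alpha(M_p, V_p, W_p)
```

With these values both `alpha_unprimed` and `alpha_primed` equal 103, even though every individual component has changed.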

2.2 Getting the transformation correct

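Example 2.3

$$R'^a{}_{bc} = \ ?\ R^d{}_{ef} \tag{2.8}$$

Step 1 Write three sets of $\frac{\partial X}{\partial X}$ on the right:

$$R'^a{}_{bc} = \frac{\partial X}{\partial X} \frac{\partial X}{\partial X} \frac{\partial X}{\partial X}\, R^d{}_{ef} \tag{2.9}$$

Step 2 Pair up indices:

\begin{align}
a &\to d \tag{2.10} \\
b &\to e \tag{2.11} \\
c &\to f \tag{2.12}
\end{align}

Step 3 The a, b, and c indices correspond to primed coordinates. Fill in the a, b, and c indices, together with primes, in the same positions as they appear in $R'^a{}_{bc}$ (a on top, b and c on the bottom):

$$R'^a{}_{bc} = \frac{\partial X'^a}{\partial X} \frac{\partial X}{\partial X'^b} \frac{\partial X}{\partial X'^c}\, R^d{}_{ef}. \tag{2.13}$$

Step 4 Fill in the remaining indices by the pairings from Step 2:

$$R'^a{}_{bc} = \frac{\partial X'^a}{\partial X^d} \frac{\partial X^e}{\partial X'^b} \frac{\partial X^f}{\partial X'^c}\, R^d{}_{ef}. \tag{2.14}$$

Note that the d, e, and f indices appear once on top and once on bottom, so they are summed over.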

Exercise 2.1 How does the tensor

$$R^{ab}{}_{cd} \tag{2.15}$$

transform under a coordinate transformation?

Exercise 2.2 Suppose that in some coordinate system the tensor $\delta^a_b$ has the form

$$\delta^a_b = \begin{cases} 0 & \text{if } a \neq b \\ 1 & \text{if } a = b. \end{cases} \tag{2.16}$$

Show that it also has this form in any other coordinate system.

2.3 Things to do with tensors

Each of these results in a new tensor:

a. Addition

$$R^a{}_{bc} + S^a{}_{bc} = T^a{}_{bc} \tag{2.17}$$

All terms have the same indices!

b. Composition

$$M^a{}_b = V^a W_b \tag{2.18}$$

Example 2.4 $V^1 = 1$, $V^2 = 2$, $W_1 = 20$, $W_2 = 30$ gives

$$M^a{}_b = \begin{pmatrix} 20 & 30 \\ 40 & 60 \end{pmatrix} \tag{2.19}$$

c. Contraction between Tensors (Einstein summation)

$$\mu = V^a W_a \quad (= 80 \text{ in the above example}) \tag{2.20}$$

d. Contraction within a Tensor

\begin{align}
P^{ab}{}_{bc} &= Q^a{}_c \tag{2.21} \\
M^a{}_a &= 80 = \mu \quad \text{in the above example} \tag{2.22}
\end{align}

For 2nd rank mixed tensors, $M^a{}_a = \mathrm{Tr}(M)$, the trace of M.


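e. Symmetrizing and anti-Symmetrizing (for 2nd rank tensors)

Given a tensor with components $T^{ab}$, let

\begin{align}
S^{ab} &= \tfrac{1}{2}\left( T^{ab} + T^{ba} \right) \tag{2.23} \\
A^{ab} &= \tfrac{1}{2}\left( T^{ab} - T^{ba} \right) \tag{2.24}
\end{align}

Then

\begin{align}
&(1) \quad S^{ab} + A^{ab} = T^{ab} \tag{2.25} \\
&(2) \quad S^{ab} = S^{ba} \quad \text{(Symmetric)} \tag{2.26} \\
&(3) \quad A^{ab} = -A^{ba} \quad \text{(Antisymmetric)} \tag{2.27}
\end{align}

Example 2.5

$$T^{ab} = \begin{pmatrix} 0 & 1 \\ 2 & 3 \end{pmatrix}, \qquad T^{ba} = \begin{pmatrix} 0 & 2 \\ 1 & 3 \end{pmatrix} = (T^T)^{ab} \tag{2.28}$$

$$S^{ab} = \begin{pmatrix} 0 & \tfrac{3}{2} \\ \tfrac{3}{2} & 3 \end{pmatrix}, \qquad A^{ab} = \begin{pmatrix} 0 & -\tfrac{1}{2} \\ \tfrac{1}{2} & 0 \end{pmatrix} \tag{2.29}$$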
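The operations in this list are easy to reproduce with nested lists. The sketch below recomputes Example 2.4 (composition, contraction, and trace) and Example 2.5 (symmetric and antisymmetric parts), using exact fractions so that the halves are not rounded. The variable names are our own.

```python
from fractions import Fraction

n = 2

# Example 2.4: composition M^a_b = V^a W_b, stored as M[a][b].
V = [1, 2]
W = [20, 30]
M = [[V[a] * W[b] for b in range(n)] for a in range(n)]

# Contraction between tensors (2.20) and within a tensor (2.22).
mu = sum(V[a] * W[a] for a in range(n))
trace_M = sum(M[a][a] for a in range(n))

# Example 2.5: symmetric and antisymmetric parts, equations (2.23), (2.24).
T = [[0, 1], [2, 3]]
S = [[Fraction(T[a][b] + T[b][a], 2) for b in range(n)] for a in range(n)]
A = [[Fraction(T[a][b] - T[b][a], 2) for b in range(n)] for a in range(n)]

# Property (2.25): the two parts add back up to T.
recombined = [[S[a][b] + A[a][b] for b in range(n)] for a in range(n)]
```

The results reproduce the matrix (2.19), the value 80 for both `mu` and `trace_M`, and the matrices in (2.29).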

Tip: Watch out for bad tensor expressions:

• $W_b = G^a{}_a V^a{}_b$.

This equation does not make sense because there are three a’s, so we do not know which two to sum over.

• $\psi = U^{aa}$.

Two a’s on top: summing would not give a scalar.

• $V^c = X_a W^a T^c{}_a A^a$

Bad: now there are four a’s. Perhaps this really means $V^c = X_a W^a T^c{}_b A^b$, but it could also mean $V^c = X_a W^b T^c{}_b A^a$.

Exercise 2.3 Determine which tensor equations are valid, and describe the errors in the other equations.

a. Dab = TaWa

b

b. Eab = F cbaCc + LdSdba

c. Zmn = Y am

an

d. Pc = JaaK

aaRc

e. Aab = Bba + gabDcDc

f. F cb = GcaHda

g. f = JaKaLaMa +N b

b


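Exercise 2.4 Let space be two dimensional, with coordinates $(X^1, X^2)$. Suppose tensors with components $V^a$, $W_a$, $P^{ab}$, $Q_{ab}$, and $M^{a}{}_{b}$ are measured to have the values
$$ V^1 = 2, \quad V^2 = 3; \tag{2.30} $$
$$ W_1 = 4, \quad W_2 = 5; \tag{2.31} $$
$$ P^{ab} = \begin{pmatrix} 2 & -1 \\ 3 & 6 \end{pmatrix}; \quad Q_{ab} = \begin{pmatrix} 0 & 2 \\ 4 & 7 \end{pmatrix}; \quad M^{a}{}_{b} = \begin{pmatrix} 4 & 3 \\ 2 & 1 \end{pmatrix}. \tag{2.32} $$
Calculate the following tensors:
$$ \alpha = V^{a} W_{a}; \tag{2.33} $$
$$ T^{b} = P^{ab} W_{a}; \tag{2.34} $$
$$ F^{a}{}_{c} = P^{ab} Q_{bc}; \tag{2.35} $$
$$ G_{ab} = M^{c}{}_{b}\, Q_{ca}. \tag{2.36} $$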

Theorem. Let $S_{ab}$ be a symmetric tensor, and $A^{ab}$ an anti-symmetric tensor. Then their double contraction vanishes:
$$ A^{ab} S_{ab} = 0. \tag{2.37} $$
Proof. To evaluate the double contraction of $A^{ab}$ and $S_{ab}$, note that we can exchange the dummy labels a and b. Thus
$$ \mu \equiv A^{ab} S_{ab} = A^{ba} S_{ba}. \tag{2.38} $$
Now use the symmetry and anti-symmetry of $S_{ab}$ and $A^{ab}$:
$$ \mu = A^{ba} S_{ba} = (-A^{ab})(+S_{ab}) = -\mu. \tag{2.39} $$
This can only be true if the double contraction $\mu \equiv A^{ab} S_{ab}$ is zero.
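As a quick numerical check of the theorem, here is a small sketch using the symmetric and antisymmetric parts found in Example 2.5. The helper `double_contraction` is a name made up for this illustration; it simply sums the products of matching components and ignores index positions.

```python
def double_contraction(A, S):
    """Sum A[a][b] * S[a][b] over both indices."""
    n = len(A)
    return sum(A[a][b] * S[a][b] for a in range(n) for b in range(n))

S = [[0.0, 1.5], [1.5, 3.0]]    # symmetric part from Example 2.5
A = [[0.0, -0.5], [0.5, 0.0]]   # antisymmetric part from Example 2.5
T = [[0.0, 1.0], [2.0, 3.0]]    # the full tensor, T = S + A

mu = double_contraction(A, S)      # vanishes, as the theorem states
mu_T = double_contraction(A, T)    # only the antisymmetric part of T survives
mu_AA = double_contraction(A, A)

print(mu, mu_T, mu_AA)
```

One consequence visible here is that contracting A with the full tensor T gives the same number as contracting A with itself, because the symmetric part drops out.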

2.4 The Levi–Civita Tensor

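The Levi–Civita tensor is completely antisymmetric. It is actually a tensor density (see section ??).

In 2 dimensions:
$$ \epsilon_{ab} = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix} $$
In 3 dimensions:
$$ \epsilon_{ab1} = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & -1 & 0 \end{pmatrix}, \tag{2.40} $$
$$ \epsilon_{ab2} = \begin{pmatrix} 0 & 0 & -1 \\ 0 & 0 & 0 \\ 1 & 0 & 0 \end{pmatrix}, \tag{2.41} $$
$$ \epsilon_{ab3} = \begin{pmatrix} 0 & 1 & 0 \\ -1 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}. \tag{2.42} $$
Note that all these matrices are anti-symmetric.

In general, in $N$ dimensions:
$$ \epsilon_{ab\ldots N} = \begin{cases} 0 & \text{if any 2 of } a, b, c, \ldots, N \text{ are equal} \\ 1 & \text{if } (a, b, c, \ldots, N) \text{ is an even permutation} \\ -1 & \text{if } (a, b, c, \ldots, N) \text{ is an odd permutation} \end{cases} $$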


2.4.1 Odd / Even Permutations

A permutation is a re-ordering, e.g. (4, 2, 1, 3) is a permutation of (1, 2, 3, 4). For N = 3, there are 6 permutations:

(1, 2, 3) (1, 3, 2)
(2, 3, 1) (2, 1, 3)
(3, 1, 2) (3, 2, 1)

We can get from the start (1, 2, 3) to any other permutation by swapping pairs of numbers. Thus swapping 1,2 sends (1,2,3) → (2,1,3), swapping 1,3 sends (2,1,3) → (2,3,1), etc.

Similarly, all permutations on N objects can be reached from the identity (1, 2, . . . , N) by a sequence of swaps. Sometimes two different sequences of swaps will result in the same permutation. But when this happens, either both sequences consist of an even number of swaps, or both consist of an odd number of swaps.

Definition 2.5 Even and Odd Permutations Starting from (1, 2, . . . , N), an even number of swaps results in an even permutation; an odd number of swaps results in an odd permutation.

Example 2.6 N = 4:
$$ \epsilon_{1234} = 1, \qquad \epsilon_{1243} = -1, \qquad \epsilon_{2143} = 1 \tag{2.43} $$
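The definition can be turned into a short function. This is only a sketch: instead of constructing an explicit sequence of swaps, it counts inversions (pairs that appear out of order), which has the same parity as any sequence of swaps producing the permutation. The name `levi_civita` is chosen here for illustration.

```python
def levi_civita(*indices):
    """Levi-Civita symbol for indices drawn from 1..N, where N = len(indices)."""
    n = len(indices)
    if sorted(indices) != list(range(1, n + 1)):
        return 0  # a repeated index, so not a permutation of (1, ..., N)
    # Count pairs that are out of order; the parity of this count
    # equals the parity of the number of swaps.
    inversions = sum(
        1
        for i in range(n)
        for j in range(i + 1, n)
        if indices[i] > indices[j]
    )
    return 1 if inversions % 2 == 0 else -1

print(levi_civita(1, 2, 3, 4), levi_civita(1, 2, 4, 3), levi_civita(2, 1, 4, 3))
```

The three values printed correspond to the three cases of Example 2.6.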

2.5 3-Vector Identities

a. $\vec{A} \cdot \vec{B} = A_i B_i$. (2.44)

b. $(\nabla f)_i = \partial_i f$. (2.45)

c. $(\nabla \times \vec{A})_i = \epsilon_{ijk}\, \partial_j A_k$ (2.46)

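Example 2.7 Verify equation (2.46) for the y-component of the curl.

Solution The y-component of $\nabla \times \vec{A}$ is
$$ (\nabla \times \vec{A})_y = \epsilon_{2jk}\, \partial_j A_k \quad \text{(2nd component = y-component)} $$
$$ = \epsilon_{231}\, \partial_3 A_1 + \epsilon_{213}\, \partial_1 A_3 $$
$$ = (1)\frac{\partial}{\partial z} A_x + (-1)\frac{\partial}{\partial x} A_z $$
$$ = \frac{\partial A_x}{\partial z} - \frac{\partial A_z}{\partial x}. $$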

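Example 2.8 Show that
$$ \vec{A} \times \vec{B} = -\vec{B} \times \vec{A}. \tag{2.47} $$
Solution Translate to
$$ (\vec{A} \times \vec{B})_i = \epsilon_{ijk} A_j B_k, $$
but $\epsilon_{ijk} = -\epsilon_{ikj}$
$$ \Rightarrow (\vec{A} \times \vec{B})_i = -\epsilon_{ikj} A_j B_k = -\epsilon_{ikj} B_k A_j $$
($A_j$ and $B_k$ are just numbers, so we can reverse their order in a multiplication without affecting the result).

Now we replace the dummy indices j → k and k → j. They are being summed over, so it does not matter which one is called which:
$$ \Rightarrow (\vec{A} \times \vec{B})_i = -\epsilon_{ijk} B_j A_k = -(\vec{B} \times \vec{A})_i $$
$$ \therefore\ \vec{A} \times \vec{B} = -\vec{B} \times \vec{A}. $$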
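The index form of the cross product is easy to test on numbers. The sketch below is an illustration with made-up helper names; indices run over 0, 1, 2 rather than 1, 2, 3, and the closed-form product used for the three-index symbol is a convenience that reproduces the values +1, −1 and 0.

```python
def epsilon(i, j, k):
    """Three-index Levi-Civita symbol for indices in {0, 1, 2}."""
    # The product is +2 for even permutations, -2 for odd ones,
    # and 0 whenever two indices coincide.
    return (i - j) * (j - k) * (k - i) // 2

def cross(A, B):
    """(A x B)_i = epsilon_ijk A_j B_k, summed over j and k."""
    return tuple(
        sum(epsilon(i, j, k) * A[j] * B[k] for j in range(3) for k in range(3))
        for i in range(3)
    )

A = (1, 2, 3)
B = (4, 5, 6)
print(cross(A, B), cross(B, A))
```

Swapping the two arguments flips the sign of every component, as equation (2.47) requires.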

Example 2.9 Show that

∇×∇f = 0. (2.48)

Solution Translate to

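$$ (\nabla \times \nabla f)_i = \epsilon_{ijk}\, \partial_j \partial_k f $$
$$ = \epsilon_{ijk}\, \partial_k \partial_j f \quad \text{(partial derivatives commute)} $$
$$ = -\epsilon_{ikj}\, \partial_k \partial_j f $$
Relabel j ↔ k:
$$ \Rightarrow (\nabla \times \nabla f)_i = -\epsilon_{ijk}\, \partial_j \partial_k f $$
$$ \Rightarrow (\nabla \times \nabla f)_i = -(\nabla \times \nabla f)_i $$
$$ \Rightarrow (\nabla \times \nabla f)_i = 0 $$
$$ \therefore\ \nabla \times \nabla f = 0 $$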

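Exercise 2.5 Translate the following 3-vector identities into index notation, and prove them:
$$ \mathbf{A} \cdot (\mathbf{B} \times \mathbf{C}) = \mathbf{B} \cdot (\mathbf{C} \times \mathbf{A}) = \mathbf{C} \cdot (\mathbf{A} \times \mathbf{B}); \tag{2.49} $$
$$ \nabla \cdot (f\mathbf{A}) = \mathbf{A} \cdot \nabla f + f\, \nabla \cdot \mathbf{A}; \tag{2.50} $$
$$ \nabla \cdot (\mathbf{A} \times \mathbf{B}) = \mathbf{B} \cdot \nabla \times \mathbf{A} - \mathbf{A} \cdot \nabla \times \mathbf{B}. \tag{2.51} $$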


2.6 Metrics

Definition 2.6 Metric Given two nearby points, $(X^1, \ldots, X^N)$ and $(X^1 + dX^1, \ldots, X^N + dX^N)$, a distance $ds$ can be defined by introducing a new object, the metric tensor $g_{ab}$. The distance satisfies
$$ ds^2 = g_{11}\, dX^1 dX^1 + g_{12}\, dX^1 dX^2 + \ldots + g_{NN}\, dX^N dX^N \tag{2.52} $$
or
$$ ds^2 = g_{ab}\, dX^a dX^b. \tag{2.53} $$
N.B. Throughout these notes, $ds^2$ means $(ds)^2$, not $d(s^2)$.

Example 2.10 In Euclidean geometry with N = 3
$$ ds^2 = (dX^1)^2 + (dX^2)^2 + (dX^3)^2 \quad \text{or} \quad ds^2 = g_{ij}\, dX^i dX^j \quad \text{with } g_{ij} = \delta_{ij} \ (i, j = 1, 2, 3) \tag{2.54} $$

Let’s decompose $g_{ab}$ into symmetric and anti-symmetric parts:
$$ g_{ab} = S_{ab} + A_{ab} \tag{2.55} $$
$$ ds^2 = (S_{ab} + A_{ab})\left(dX^a dX^b\right) \tag{2.56} $$
where $S_{ab} = S_{ba}$ and $A_{ab} = -A_{ba}$.

Consider the anti-symmetric contribution to $ds^2$, $A_{ab}\, dX^a dX^b$. This is the double contraction of an anti-symmetric tensor $A_{ab}$ with a symmetric tensor $dX^a dX^b$, so by equation (2.37)
$$ A_{ab}\, dX^a dX^b = 0. \tag{2.57} $$
We have just shown that $A_{ab}$ is useless, and so we get rid of it. Thus we define $g_{ab}$ to be symmetric:
$$ g_{ab} = g_{ba}. \tag{2.58} $$


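How many components does $g_{ab}$ have? If the manifold is $N$ dimensional, then an arbitrary tensor of rank $r$ has $N^r$ components. But since the metric is symmetric, not all these components will be independent.
$$ g_{ab} = \underbrace{\begin{pmatrix} g_{11} & g_{12} & \ldots & g_{1N} \\ g_{21} & g_{22} & \ldots & g_{2N} \\ \vdots & & & \vdots \\ g_{N1} & g_{N2} & \ldots & g_{NN} \end{pmatrix}}_{N^2 \text{ components}}. \tag{2.59} $$
By symmetry ($g_{ab} = g_{ba}$),
$$ \Rightarrow g_{ab} = \begin{pmatrix} g_{11} & g_{12} & g_{13} & \ldots & g_{1N} \\ g_{12} & g_{22} & g_{23} & \ldots & g_{2N} \\ g_{13} & g_{23} & g_{33} & \ldots & g_{3N} \\ \vdots & & & & \vdots \\ g_{1N} & g_{2N} & g_{3N} & \ldots & g_{NN} \end{pmatrix} = g_{ba} \tag{2.60} $$
so there are $N(N+1)/2$ independent components: the $N$ diagonal entries plus the $N(N-1)/2$ entries above the diagonal.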

2.7 Euclidean Metrics

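E2 – Euclidean Plane

Cartesian: By the Pythagorean theorem,
$$ ds^2 = (dC^1)^2 + (dC^2)^2 = dx^2 + dy^2 \tag{2.61} $$
$$ = g^{\mathrm{C}}_{11}\, dx^2 + g^{\mathrm{C}}_{12}\, dx\, dy + g^{\mathrm{C}}_{21}\, dy\, dx + g^{\mathrm{C}}_{22}\, dy^2. \tag{2.62} $$
So $g^{\mathrm{C}}_{11} = g^{\mathrm{C}}_{22} = 1$ while $g^{\mathrm{C}}_{12} = g^{\mathrm{C}}_{21} = 0$, or
$$ g^{\mathrm{C}}_{ab} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}. $$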

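Polars: Two methods to find the metric:

Method 1: Draw pictures:
$$ a = 2(r + dr)\sin\left(\frac{d\phi}{2}\right) \tag{2.63} $$
$$ = 2(r + dr)\left(\frac{d\phi}{2}\right) \ \left(+\,O(d\phi)^3\right) \tag{2.64} $$
$$ = r\, d\phi \ \left(+\,O(dr\, d\phi)\right). \tag{2.65} $$
Thus
$$ ds^2 = dr^2 + r^2 d\phi^2 \tag{2.66} $$
or, in terms of the metric tensor, $ds^2 = g_{ab}\, dP^a dP^b$ with
$$ g^{\mathrm{P}}_{ab} = \begin{pmatrix} 1 & 0 \\ 0 & r^2 \end{pmatrix}. \tag{2.67} $$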

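Method 2: We know that
$$ g^{\mathrm{C}}_{ab} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \quad \therefore \quad g^{\mathrm{P}}_{ab} = \frac{\partial C^c}{\partial P^a}\,\frac{\partial C^d}{\partial P^b} \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}_{cd}. $$
(Cartesians on top (c,d), polars on bottom (a,b)).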
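Method 2 can be carried out numerically as a check on Method 1. The sketch below assumes the usual relation between the two coordinate systems, $x = r\cos\phi$ and $y = r\sin\phi$, and evaluates the transformation law at one sample point; the function name and the sample values are illustrative.

```python
import math

def polar_metric(r, phi):
    """g^P_ab = (dC^c/dP^a)(dC^d/dP^b) delta_cd for C = (x, y), P = (r, phi)."""
    # Jacobian J[c][a] = dC^c/dP^a, assuming x = r cos(phi), y = r sin(phi)
    J = [
        [math.cos(phi), -r * math.sin(phi)],
        [math.sin(phi), r * math.cos(phi)],
    ]
    # The Cartesian metric is the identity, so the double sum over c, d
    # collapses to a single sum over c.
    return [
        [sum(J[c][a] * J[c][b] for c in range(2)) for b in range(2)]
        for a in range(2)
    ]

g = polar_metric(2.0, 0.7)
print(g)
```

At r = 2 the result should agree, up to rounding, with the matrix in equation (2.67): diagonal entries 1 and r² = 4, and vanishing off-diagonal entries.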

2.7.1 E3 Euclidean 3-space

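Cartesian:
$$ g^{\mathrm{C}}_{ab} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}. \tag{2.68} $$
Spherical:

• If we just move in r:
$$ ds_r^2 = dr^2 \tag{2.69} $$
• If we just move in θ, we move on a circle of radius r:
$$ ds_\theta^2 = r^2 d\theta^2 \tag{2.70} $$
• If we just move in φ, we move on a circle of constant radius r sin θ:
$$ ds_\phi^2 = r^2 \sin^2\theta\, d\phi^2 \tag{2.71} $$
Thus
$$ ds^2 = dr^2 + r^2 d\theta^2 + r^2 \sin^2\theta\, d\phi^2 \tag{2.72} $$
$$ g^{\mathrm{P}}_{ab} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & r^2 & 0 \\ 0 & 0 & r^2 \sin^2\theta \end{pmatrix}. \tag{2.73} $$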

2.7.2 E3: Cylindrical Co-ordinates

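$$ (X^1, X^2, X^3) = (\rho, \phi, z) \tag{2.74} $$
$$ ds^2 = d\rho^2 + \rho^2 d\phi^2 + dz^2 \tag{2.75} $$
$$ \Rightarrow g^{\mathrm{Cyl}}_{ab} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & \rho^2 & 0 \\ 0 & 0 & 1 \end{pmatrix} \tag{2.76} $$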

2.8 Arc Length

What is the arc length P → Q? We define
$$ L = \int_P^Q ds(\lambda) \tag{2.77} $$
Thus
$$ L = \int_P^Q \sqrt{ds^2} = \int_P^Q \sqrt{g_{ab}\, dX^a dX^b} \tag{2.78} $$


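Example 2.11 E2 with Cartesian coordinates. Here
$$ L = \int_P^Q \sqrt{dx^2 + dy^2}. \tag{2.79} $$
We can divide by $\sqrt{d\lambda^2}$ and multiply by $d\lambda$ to make
$$ L = \int_P^Q \sqrt{\left(\frac{dx}{d\lambda}\right)^2 + \left(\frac{dy}{d\lambda}\right)^2}\; d\lambda. \tag{2.80} $$
If we are able to choose λ = x as our parameter, so that y = y(x) (i.e. y is a well behaved function of x; see diagram),

[Figure: two curves in the (x, y) plane, one labelled “Bad” and one labelled “Good”.]

then
$$ L = \int_P^Q \sqrt{\left(\frac{dy}{dx}\right)^2 + 1}\; dx $$
e.g. y = sin x:
$$ L = \int_P^Q \sqrt{1 + \cos^2(x)}\; dx $$
[Figure: graph of y = sin x against x, with y ranging between −1 and 1.]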
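The integral for y = sin x has no elementary closed form, but it is easy to evaluate numerically. The following sketch uses Simpson's rule; the end points x = 0 and x = π are chosen here purely for illustration, and the function name is made up.

```python
import math

def arc_length(dydx, x0, x1, n=1000):
    """Simpson's rule for the integral of sqrt(1 + (dy/dx)^2) dx from x0 to x1.

    n is the number of sub-intervals and must be even.
    """
    f = lambda x: math.sqrt(1.0 + dydx(x) ** 2)
    h = (x1 - x0) / n
    total = f(x0) + f(x1)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(x0 + i * h)
    return total * h / 3

# y = sin x, so dy/dx = cos x; integrate over half a period.
L = arc_length(math.cos, 0.0, math.pi)
print(L)

# Sanity check on a straight line y = x between x = 0 and x = 1.
L_line = arc_length(lambda x: 1.0, 0.0, 1.0)
print(L_line)
```

The half-period of the sine curve comes out at roughly 3.82, somewhat longer than the straight-line distance π between its end points, and the straight line returns √2 as expected.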

2.9 Scalar Product and Magnitude for Vectors

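Given a vector, $\mathbf{V}$, with components $V^a$, the sum $\sum_a V^a V^a$ is not invariant under co-ordinate transformations, and so it is meaningless!

But, if we have a metric, we can define
$$ \mathbf{V} \cdot \mathbf{V} = g_{ab} V^a V^b \quad \text{This is invariant!} \tag{2.81} $$
The magnitude of $\mathbf{V}$ is
$$ |\mathbf{V}| = \sqrt{\mathbf{V} \cdot \mathbf{V}} \tag{2.82} $$
Similarly, with two vectors $\mathbf{V}$, $\mathbf{W}$
$$ \mathbf{V} \cdot \mathbf{W} = g_{ab} V^a W^b \tag{2.83} $$
For two forms $A$, $B$
$$ A \cdot B = g^{ab} A_a B_b \tag{2.84} $$
where $g^{ab}$ is the inverse of the metric tensor, in the sense that
$$ g^{ab} g_{bc} = \delta^{a}{}_{c}. \tag{2.85} $$
Also
$$ |A| = \sqrt{A \cdot A}. \tag{2.86} $$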

2.10 Raising and Lowering Operators

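Note that $\mathbf{V} \cdot \mathbf{W} = g_{ab} V^a W^b$ is a scalar. We can write this as
$$ \mathbf{V} \cdot \mathbf{W} = (g_{ab} V^a)\, W^b \tag{2.87} $$
Then $g_{ab} V^a$ is ‘dual’ to the set of vectors: it takes a vector, $\mathbf{W}$, and returns a scalar number. Therefore, given a vector $\mathbf{V}$, we can define a form $V$ where
$$ V_b = g_{ab} V^a. \tag{2.88} $$
Similarly
$$ \mathbf{V} \cdot \mathbf{W} = V^a \left(g_{ab} W^b\right) \tag{2.89} $$
$$ \Rightarrow W_a = g_{ab} W^b \tag{2.90} $$
$$ \Rightarrow \mathbf{V} \cdot \mathbf{W} = V^a W_a \tag{2.91} $$
$$ = V_b W^b \tag{2.92} $$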
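Equations (2.87) to (2.92) can be checked on a concrete case. The sketch below uses the polar metric (2.67) evaluated at r = 2, with two arbitrary vectors chosen for illustration; the helper names `lower` and `dot` are not from the notes.

```python
def lower(g, V):
    """Lower an index: V_b = g_ab V^a."""
    n = len(V)
    return [sum(g[a][b] * V[a] for a in range(n)) for b in range(n)]

def dot(g, V, W):
    """Scalar product V . W = g_ab V^a W^b."""
    n = len(V)
    return sum(g[a][b] * V[a] * W[b] for a in range(n) for b in range(n))

r = 2.0
g = [[1.0, 0.0], [0.0, r**2]]   # polar metric (2.67) at r = 2
V = [1.0, 3.0]                  # illustrative components V^a
W = [2.0, 5.0]                  # illustrative components W^a

V_low = lower(g, V)
W_low = lower(g, W)
full = dot(g, V, W)                                   # g_ab V^a W^b
via_V_low = sum(V_low[b] * W[b] for b in range(2))    # V_b W^b
via_W_low = sum(V[a] * W_low[a] for a in range(2))    # V^a W_a
print(V_low, full, via_V_low, via_W_low)
```

All three ways of forming the scalar product give the same number, which is the content of equations (2.91) and (2.92).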

By contraction with the metric, we can ‘raise’ or ‘lower’ the index.

Exercise 2.6 Suppose that $A^{ab}$ is anti-symmetric. Let $A_{ab} = g_{ac}\, g_{bd}\, A^{cd}$ where $g_{ac}$ is the (symmetric) metric tensor. Show that $A_{ab}$ is antisymmetric as well.

Exercise 2.7 Let
$$ T^{ab} = \begin{pmatrix} 2 & 3 \\ 5 & 7 \end{pmatrix}. \tag{2.93} $$
Find its symmetric and antisymmetric parts $S^{ab}$ and $A^{ab}$. Next, let the metric be
$$ g_{ab} = \begin{pmatrix} 1 & 2 \\ 2 & 1 \end{pmatrix}. \tag{2.94} $$
Find $A_{ab}$. Show explicitly that
$$ A_{ab} S^{ab} = 0. \tag{2.95} $$


2.11 Signature of the Metric

One can show, using linear algebra, that any metric, $g_{ab}$, can be diagonalized at a given point by transforming to a suitable co-ordinate system.

e.g. If in coordinates $X$, the metric is
$$ g_{ab} = \begin{pmatrix} -3 & -5 \\ -5 & 62 \end{pmatrix} $$
then there exist coordinates $X'$ where
$$ g'_{ab} = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} $$
In general, we can make the values on the diagonal 1 or −1 by transforming to suitable co-ordinates:

• Find the eigenvectors and eigenvalues of $g_{ab}$

• Transform to co-ordinates along the eigenvectors (this diagonalizes the metric, with the eigenvalues on the diagonal)

• Rescale each new co-ordinate by the square root of the absolute value of its eigenvalue (all diagonal entries go to 1 or −1)

There may be a few ways of doing this. However, it can be shown that the sum of the diagonal elements will always be the same. This is called the signature of the metric.

Example: E3 with Cartesian coordinates
$$ g_{ab} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} \tag{2.96} $$
Signature = 3.

Example: Minkowski Metric
$$ g_{ab} = \eta_{ab} = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix} \tag{2.97} $$
Signature = -2.
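For a two-dimensional metric the signature can be read off from the signs of the two eigenvalues, since each positive eigenvalue ends up as +1 on the diagonal and each negative one as −1. The sketch below uses the closed-form eigenvalues of a symmetric 2×2 matrix; it assumes the metric is non-degenerate (no zero eigenvalue), and the function name is illustrative.

```python
import math

def signature_2x2(g):
    """Signature of a symmetric, non-degenerate 2x2 metric [[a, b], [b, d]]."""
    (a, b), (_, d) = g
    mean = (a + d) / 2
    radius = math.hypot((a - d) / 2, b)
    eigenvalues = (mean + radius, mean - radius)
    # Each positive eigenvalue contributes +1, each negative one -1.
    return sum(1 if lam > 0 else -1 for lam in eigenvalues)

sig_example = signature_2x2([[-3, -5], [-5, 62]])   # the example metric above
sig_plane = signature_2x2([[1, 0], [0, 1]])         # Euclidean plane, Cartesian
sig_polar = signature_2x2([[1, 0], [0, 4]])         # polar metric (2.67) at r = 2
print(sig_example, sig_plane, sig_polar)
```

The example metric has one eigenvalue of each sign, so its signature is 0, matching the diagonal form with entries 1 and −1. The Cartesian and polar forms of the plane metric share the signature 2, as they must.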

2.12 Riemannian and Pseudo-Riemannian metrics

Is the magnitude of a vector always positive?

Definition 2.7 Riemannian metric For a Riemannian metric, any vector $\mathbf{V} \neq 0$ satisfies
$$ |\mathbf{V}|^2 = \mathbf{V} \cdot \mathbf{V} > 0. \tag{2.98} $$
A metric which is not Riemannian is called Pseudo-Riemannian.

• Signature = N metrics are Riemannian.
To see this note that we can calculate $\mathbf{V} \cdot \mathbf{V}$ (at some point P) in any coordinate system, including the system with only ±1 diagonal elements for the metric. Then
$$ \mathbf{V} \cdot \mathbf{V} = g_{ab} V^a V^b = \pm (V^1)^2 \pm (V^2)^2 \pm \ldots \pm (V^N)^2. \tag{2.99} $$
If signature = N then there will be only pluses, and the sum will be positive.

• Signature < N metrics are pseudo-Riemannian.
On the other hand, if the signature is less than N, then there will be some minuses in the sum. Say for example that $\mathbf{V} \cdot \mathbf{V} = (V^1)^2 + (V^2)^2 + \cdots + (V^{N-1})^2 - (V^N)^2$. Then we could choose a vector $\mathbf{V}$ whose only non-zero element is $V^N$. This would make $\mathbf{V} \cdot \mathbf{V} < 0$.

Euclidean metrics are Riemannian, while the Minkowski metric is pseudo-Riemannian.

2.13 Map Projections

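The Earth, radius r = R = constant, has the metric line element (in spherical coordinates S = (θ, φ))
$$ ds^2 = R^2\left(d\theta^2 + \sin^2\theta\, d\phi^2\right), \tag{2.100} $$
with 0 ≤ θ ≤ π, −π ≤ φ ≤ π.
$$ \Rightarrow g^{\mathrm{S}}_{ab} = R^2 \begin{pmatrix} 1 & 0 \\ 0 & \sin^2\theta \end{pmatrix} \tag{2.101} $$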

2.13.1 Cylindrical Map

[Figure: a globe and a sheet of paper, labelled “Project out onto paper”, with map axes x and y.]

We wish to project the Earth onto a piece of paper of width w and height h. The co-ordinates on the map will be M = (x, y) with
$$ x = \frac{w}{2\pi}\,\phi \tag{2.102} $$
$$ y = \frac{h}{2}\cos\theta \tag{2.103} $$
[Figure: cross-section of the globe with the labels y, θ and R cos θ.]

Let us call the metric for the map projection $g^{\mathrm{M}}_{ab}$. Using this metric we can calculate the arclength of any path drawn on the map (from Buenos Aires to Glasgow for example). The calculation will give the distance along the corresponding path on the Earth’s surface.


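Problem: find the metric $g^{\mathrm{M}}_{ab}$.

Method 1: Use the transformation law for tensors (old coordinates S on top, new coordinates M on bottom):
$$ g^{\mathrm{M}}_{ab} = \frac{\partial S^c}{\partial M^a}\,\frac{\partial S^d}{\partial M^b}\, g^{\mathrm{S}}_{cd}. \tag{2.104} $$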

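Method 2: Directly transform the metric line element: first,
$$ ds^2 = R^2\left(d\theta^2 + \sin^2\theta\, d\phi^2\right) \tag{2.105} $$
$$ d\phi = \frac{2\pi}{w}\, dx \tag{2.106} $$
$$ \Rightarrow ds^2 = R^2\left(d\theta^2 + \frac{4\pi^2}{w^2}\sin^2\theta\, dx^2\right). \tag{2.107} $$
Next, from the formula for y,
$$ dy = -\frac{h}{2}\sin\theta\, d\theta \tag{2.108} $$
$$ \Rightarrow d\theta^2 = \frac{4}{h^2}\,\frac{dy^2}{\sin^2\theta}. \tag{2.109} $$
Also,
$$ \sin^2\theta = 1 - \left(\frac{2y}{h}\right)^2 \tag{2.110} $$
$$ \Rightarrow d\theta^2 = \frac{4\, dy^2}{h^2 - 4y^2}. \tag{2.111} $$
Thus
$$ ds^2 = R^2\left(\frac{4\pi^2}{w^2}\,\frac{h^2 - 4y^2}{h^2}\, dx^2 + \frac{4}{h^2 - 4y^2}\, dy^2\right). \tag{2.112} $$
Changing the aspect ratio w/h will stretch or compress the map in the vertical direction. We should choose this ratio to give the least distortion. Note that at y = 0 (the equator) the y dependence disappears. The metric will be symmetric in x and y at the equator if w = πh. In this case, the line element and metric are
$$ ds^2 = 4R^2\left(\frac{h^2 - 4y^2}{h^4}\, dx^2 + \frac{1}{h^2 - 4y^2}\, dy^2\right). \tag{2.113} $$
$$ g_{ab} = 4R^2 \begin{pmatrix} \dfrac{h^2 - 4y^2}{h^4} & 0 \\ 0 & \dfrac{1}{h^2 - 4y^2} \end{pmatrix}. \tag{2.114} $$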
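To see that the map metric really returns distances on the Earth, one can measure two simple paths drawn on the map. The sketch below assumes w = πh as in equation (2.114); the radius and map height are illustrative values, the function names are made up, and the vertical path length is computed with Simpson's rule.

```python
import math

def map_metric(y, R, h):
    """Diagonal components (g_xx, g_yy) of (2.114); valid for w = pi*h, |y| < h/2."""
    g_xx = 4 * R**2 * (h**2 - 4 * y**2) / h**4
    g_yy = 4 * R**2 / (h**2 - 4 * y**2)
    return g_xx, g_yy

def vertical_length(y0, y1, R, h, n=1000):
    """Length of a map segment at constant x: Simpson's rule on sqrt(g_yy) dy."""
    f = lambda y: math.sqrt(map_metric(y, R, h)[1])
    step = (y1 - y0) / n
    total = f(y0) + f(y1)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(y0 + i * step)
    return total * step / 3

R, h = 6371.0, 1.0      # illustrative: radius in km, map height in map units
w = math.pi * h

# Along the equator (y = 0) the metric is constant, so length = sqrt(g_xx) * w.
equator = math.sqrt(map_metric(0.0, R, h)[0]) * w

# From the equator up to y = h/4, i.e. cos(theta) = 1/2, which is 30 degrees of latitude.
meridian_piece = vertical_length(0.0, h / 4, R, h)
print(equator, meridian_piece)
```

The equator comes out as 2πR and the piece of meridian as Rπ/6, the great-circle distances one expects on a sphere of radius R.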

2.13.2 Mercator Projection

Idea: Align compass bearings to constant directions on the map.


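Let a be the displacement along a circle of latitude and b the displacement along a meridian:
$$ ds^2 = a^2 + b^2 \tag{2.115} $$
$$ a = R\sin\theta\, d\phi \tag{2.116} $$
$$ b = -R\, d\theta \tag{2.117} $$
$$ \Rightarrow ds^2 = R^2\left(d\theta^2 + \sin^2\theta\, d\phi^2\right) \tag{2.118} $$
Again, let $x = \frac{w}{2\pi}\phi$.

Choose y so that the slope $dy/dx$ of a curve is a constant for constant ψ:
$$ \cot\psi = \frac{-R\, d\theta}{R\sin\theta\, d\phi} = -\frac{1}{\sin\theta}\,\frac{d\theta}{d\phi}. \tag{2.119} $$
On the map, however,
$$ \cot\psi = \frac{dy}{dx} \tag{2.120} $$
$$ = -\frac{1}{\sin\theta}\,\frac{d\theta}{d\phi} = -\frac{w}{2\pi}\,\frac{1}{\sin\theta}\,\frac{d\theta}{dx} \tag{2.121} $$
$$ \Rightarrow \frac{dy}{d\theta} = -\frac{w}{2\pi}\,\frac{1}{\sin\theta} \tag{2.122} $$
After integration, taking y = 0 on the equator (θ = π/2),
$$ y = \int_0^y dy' = -\frac{w}{2\pi}\int_{\pi/2}^{\theta} \frac{d\theta'}{\sin\theta'} \tag{2.123} $$
$$ \therefore\ y = \frac{w}{2\pi}\log\left[\cot\frac{\theta}{2}\right] \tag{2.124} $$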

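We can now find $g^{\mathrm{M}}_{ab}$. First, we derive a simple identity: let
$$ \bar{y} \equiv \frac{2\pi}{w}\, y = \log\left[\cot\frac{\theta}{2}\right]. $$
Then
$$ \sin\theta = \frac{1}{\cosh\bar{y}} = \operatorname{sech}\bar{y}. $$
Proof:
$$ \operatorname{sech}\bar{y} = \frac{2}{e^{\bar{y}} + e^{-\bar{y}}} = \frac{2}{\cot\theta/2 + \tan\theta/2} = \frac{2}{\cot\theta/2\left(1 + \tan^2\theta/2\right)} = \frac{2}{\cot\theta/2\,\left(\sec^2\theta/2\right)} = 2\cos\theta/2\,\sin\theta/2 = \sin\theta. $$
Now, using $d\theta = -\frac{2\pi}{w}\sin\theta\, dy$ from (2.122) and $d\phi = \frac{2\pi}{w}\, dx$,
$$ ds^2 = R^2\left(d\theta^2 + \sin^2\theta\, d\phi^2\right) = \left(\frac{2\pi}{w}\right)^2 R^2\left((-\sin\theta\, dy)^2 + \sin^2\theta\, dx^2\right) $$
Thus we have found the Mercator line element:
$$ ds^2 = \left(\frac{2\pi R}{w}\right)^2 \operatorname{sech}^2\bar{y}\,\left(dx^2 + dy^2\right), \tag{2.125} $$
and the Mercator metric:
$$ g^{\mathrm{M}}_{ab} = \left(\frac{2\pi R}{w}\right)^2 \operatorname{sech}^2\bar{y} \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}. \tag{2.126} $$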
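The identity sin θ = sech(2πy/w) behind the Mercator metric can be spot-checked numerically. This is a sketch with illustrative function names and sample values (w = 2, R = 1, θ = 1 radian); it takes y = 0 on the equator, consistent with equation (2.124).

```python
import math

def mercator_y(theta, w):
    """y = (w / 2 pi) log(cot(theta / 2)) for colatitude 0 < theta < pi."""
    return w / (2 * math.pi) * math.log(1.0 / math.tan(theta / 2))

def conformal_factor(y, R, w):
    """(2 pi R / w)^2 sech^2(2 pi y / w), the factor multiplying dx^2 + dy^2."""
    return (2 * math.pi * R / w) ** 2 / math.cosh(2 * math.pi * y / w) ** 2

w, R = 2.0, 1.0          # illustrative map width and sphere radius
theta = 1.0              # sample colatitude in radians
y = mercator_y(theta, w)

sech_value = 1.0 / math.cosh(2 * math.pi * y / w)
factor = conformal_factor(y, R, w)
y_equator = mercator_y(math.pi / 2, w)
print(y, sech_value, math.sin(theta), factor, y_equator)
```

The hyperbolic secant reproduces sin θ, so the common factor equals (2πR/w)² sin²θ: the map is stretched equally in x and y at every point, which is what keeps compass bearings straight.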

Exercise 2.8 London has latitude 51◦ and longitude 0◦. Suppose you travel on the surface of the earth, following a geodesic (great circle) which leaves London in the direction due East. Where will you first hit the equator? (This problem can be done without any equations!)

Chapter 3

Special Relativity

‘Really this is what is meant by the Fourth Dimension, though some people who talk about the Fourth Dimension do not know they mean it. It is only another way of looking at Time. There is no difference between time and any of the three dimensions of space except that our consciousness moves along it. But some foolish people have got hold of the wrong side of that idea. You have all heard what they have to say about this Fourth Dimension?’

3.1 Minkowski Space-time

In modern terms, special relativity is the study of physics in a universe governed by the Minkowski metric, equation (2.97). The Minkowski metric has coordinates
$$ (X^0, X^1, X^2, X^3) = (ct, x, y, z) \tag{3.1} $$
where t is time and c is the speed of light. Note that the time component, by convention, is distinguished by being given the index 0. Also, the metric is diagonal,
$$ \eta_{ab} = \mathrm{Diagonal}(1, -1, -1, -1), \tag{3.2} $$
with the time component of opposite sign to the spatial components.

We will examine the geometry of the Minkowski metric in detail. First recall that the spatial part of $\eta_{ab}$ is Cartesian, apart from an overall minus sign. We could, of course, transform to another coordinate system such as spherical polars. However, this would introduce an apparent dependence of the metric elements on position, which would obscure the simplicity and symmetry of the metric. Thus for this chapter we will exclusively use Cartesian spatial coordinates.

We first note that the Minkowski metric is independent of position and time. This fact gives it the property of homogeneity.

Definition 3.1 Homogeneity An object or physical law is homogeneous if it has the same form at all places, i.e. its form is invariant to translations.

A rotation of the spatial (x, y, z) axes leaves the Minkowski metric unchanged.

Exercise 3.1 Show that the Minkowski metric is unchanged by a rotation by an angle ψ about the z axis, where
$$ t \to t; \tag{3.3} $$
$$ x \to x\cos\psi - y\sin\psi; \tag{3.4} $$
$$ y \to y\cos\psi + x\sin\psi; \tag{3.5} $$
$$ z \to z. \tag{3.6} $$

Thus it has the property of isotropy.

Definition 3.2 Isotropy An object or physical law is isotropic if it has the same form in all directions, i.e. its form is invariant to rotations, about any central point and axis.

Objects can be homogeneous but not isotropic. Examples:

• A uniform magnetic field. The field looks the same at all points in space, but points in a particular direction.

• A regular crystal. The crystal structure may appear the same at different places, but the molecular bonds are oriented in particular directions.

• Most fabrics are woven with a warp and a weft, with the result that their ability to stretch depends on direction.

It is not possible to be isotropic but non-homogeneous. For example, consider a distribution of stars. If the distribution is non-homogeneous, then two regions (A and B) differ. An observer placed between them sees more stars in one direction than in the other, so the distribution is not isotropic about that observer.
$$ \text{Isotropic} \Rightarrow \text{Homogeneous}. \tag{3.7} $$
Also note that spherical symmetry about some central point does not imply isotropy about all points.

3.2 Units

The zeroth coordinate in Minkowski space-time is $X^0 = ct$. The presence of the factor of c makes many of the equations more complicated. But we need this factor because traditional units for time and space are different. In order to understand space and time in a unified way, we need to employ a system of units which treats space and time more equally.

Exercise 3.2 Suppose there were a move to convert the measure of distance on British roads to kilometers. However, this move was fiercely resisted by half of the population. In a political compromise, it was decided to measure East-West distances with kilometers, and North-South distance with miles.

Imagine coping with this mixed system. What would be the distance from London to Manchester? What would speedometers and odometers look like?

Relativistic Units

In conventional units the speed of light in a vacuum is $c = 2.997\ldots \times 10^{8}\ \mathrm{m\,s^{-1}}$. In a relativistic system of units c = 1. There are two ways of constructing such systems.

a. Use a basic unit of time; the length unit will be the distance travelled by light in that time.

• A) Choose the time unit to be the second (s), and define the unit of length to be the light-second (ℓs)
$$ 1\ \ell\mathrm{s} = 3 \times 10^{8}\ \mathrm{m}. \tag{3.8} $$
In these units, c = 1 ℓs/s. Usually, we do not bother writing the ℓs/s, and so c = 1.

• B) Time unit: year (y); length unit: light-year (ℓy).
$$ 1\ \ell\mathrm{y} \approx 3 \times 10^{8}\ \frac{\mathrm{m}}{\mathrm{s}} \cdot 3.2 \times 10^{7}\ \mathrm{s} \tag{3.9} $$
$$ \approx 10^{16}\ \mathrm{m} \tag{3.10} $$
and c = 1 ℓy/y. Again we will ignore the ℓy/y and just say c = 1.

b. Use a basic unit of length; the time unit will be the interval of time needed for light to travel that distance.

• Choose the length unit to be the metre (m), and the time unit to be the light-metre (ℓm)
$$ 1\ \ell\mathrm{m} \approx 3 \times 10^{-9}\ \mathrm{s} \tag{3.11} $$
and c = 1 m/ℓm = 1.

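Example 3.1 Express Watts in relativistic units with basic units second, kilogram.

Solution
$$ 1\ \mathrm{W} = 1\ \mathrm{kg\, m^2\, s^{-3}} \tag{3.12} $$
$$ = 1\ \mathrm{kg\, s^{-1}}\ \frac{\mathrm{m}^2}{\mathrm{s}^2}\left(\frac{\ell\mathrm{s}}{3 \times 10^{8}\ \mathrm{m}}\right)^2 \tag{3.13} $$
$$ = \frac{1}{9 \times 10^{16}}\ \mathrm{kg\, s^{-1}}\left(\frac{\ell\mathrm{s}}{\mathrm{s}}\right)^2 \tag{3.14} $$
$$ \approx 1.1 \times 10^{-17}\ \mathrm{kg\, s^{-1}} \tag{3.15} $$
N.B. We could have reached this by multiplying by $c$ or $c^{-1}$ until only the units kg and s were left (cancel out as many factors of c as necessary to get the units right).

In reverse: what is $1\ \mathrm{kg\, s^{-1}}$ in Watts?

Solution Multiply by $c^2$ to obtain the right units
$$ \Rightarrow 1\ \mathrm{kg\, s^{-1}} = 9 \times 10^{16}\ \mathrm{kg\, m^2\, s^{-3}} \tag{3.16} $$
$$ = 9 \times 10^{16}\ \mathrm{W}. \tag{3.17} $$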
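The two conversions amount to dividing or multiplying by c². A minimal sketch, using the rounded value of c from equation (3.8); the function names are illustrative.

```python
C = 3.0e8  # speed of light in m/s, rounded as in equation (3.8)

def watts_to_kg_per_s(power_watts):
    """Convert a power in W (kg m^2 s^-3) to relativistic units kg s^-1."""
    return power_watts / C**2

def kg_per_s_to_watts(rate):
    """Convert a rate in kg s^-1 (relativistic units) back to W."""
    return rate * C**2

one_watt = watts_to_kg_per_s(1.0)
one_kg_per_s = kg_per_s_to_watts(1.0)
print(one_watt, one_kg_per_s)
```

The first number is about 1.1 × 10⁻¹⁷, as in (3.15), and the second is 9 × 10¹⁶, as in (3.17).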

Exercise 3.3 The gravitational constant is $G = 6.67 \times 10^{-8}\ \mathrm{cm^3\, g^{-1}\, s^{-2}}$. Express G in relativistic units, with the basic units being grams and centimetres (time measured in light-cms). Next in relativistic units calculate the escape velocity V from the surface of the earth. Also calculate γ − 1. (Recall that the gravitational potential energy of an object of mass m at the surface of the Earth is $-G M_\oplus m / R_\oplus$, where $M_\oplus$ is the mass of the Earth and $R_\oplus$ is its radius.)
Earth mass: $M_\oplus = 6 \times 10^{27}$ g.
Earth radius: $R_\oplus = 6.4 \times 10^{3}$ km.

Exercise 3.4 The acceleration due to gravity at the Earth’s surface is $1g = 9.8\ \mathrm{m\, s^{-2}}$. Express this in relativistic units with basic unit being the year (i.e. lengths are measured in light years). (1 year ≈ $3.2 \times 10^{7}$ s.)


3.3 Einstein’s Axioms of Special Relativity

The Minkowski metric, as we have seen, is invariant to translations and spatial rotations. However, in a four dimensional space-time manifold, we can also consider rotations involving both space and time coordinates. Such mixing of space and time coordinates may seem mysterious, but actually the effect is simple: the spatial origin x = y = z = 0 in the rotated system moves at a constant velocity with respect to the original system. This is called a boost.

Definition 3.3 Boost A boost is a transformation to a coordinate system moving at a constant relative velocity with respect to the original system.

Einstein’s famous 1905 paper demonstrated that we needed a new conception of space-time if we were to have a theory of electromagnetism which looked the same in all coordinate systems, especially ones reached via a boost transformation. He started out with the idea of a coordinate system, or reference frame, in which there are no inertial forces such as centrifugal or Coriolis forces.

Definition 3.4 Inertial Reference Frame An inertial reference frame (IRF) is a co-ordinate system for space-time with Cartesian spatial co-ordinates, and where there exist no inertial (fictitious) forces.

The invariance of Maxwell’s equations led Einstein to believe that the speed of light does not change when boosting to a moving reference frame. He wrote down two axioms for physics in general, and electromagnetic radiation in particular:

a. The laws of physics are invariant to translations, rotations and boosts.

b. The speed of light is the same in all IRF’s.

3.4 Space-Time Diagrams

Space-time diagrams place time on the vertical axis, with one space dimension on the horizontal axis (or two in a horizontal plane).

Definition 3.5 World–line The world–line of an object is the path it traces in space-time.

If we just look at one space dimension (x), the velocity of the object is dx/dt. Since the vertical axis is time, however, the slope of the world line is dt/dx.

For photons c = dx/dt = 1. In special relativity light moves at an angle of arctan(1) = π/4 on a space-time diagram.

Choose a point on an object’s world-line, P, to be τ = 0. Then let τ be the arc–length away from P,
$$ \tau = \int_P^Q \sqrt{g_{ab}\, dX^a dX^b}. \tag{3.18} $$

Definition 3.6 Proper Time The arc–length τ along a world–line is called the proper time.


[Figure: space-time diagram with axes x, y and t.]

Figure 3.1: The object with the world–line to the left is at rest. The object to the right is moving in a circular orbit.

[Figure: a world-line in the (x, t) plane through events P and Q, labelled with the proper times τ(P) and τ(Q).]

Definition 3.7 Event A space-time event P is a point in space-time.

Suppose at event P a camera flash goes off, sending an expanding sphere of light into space (and time). We can most easily picture this expanding sphere by suppressing one dimension. For example, suppose the z co-ordinate of P is 0. We can then consider how the light travels in the z = 0 plane. The expanding sphere of light intersects this plane as an expanding circle. In a space-time diagram showing the x, y, and t directions, the expanding circle traces out a cone.

The future light cone at an event P shows how a pulse of light emitted at P travels through space-time. A light-cone splits space-time into space-like and time-like parts. Objects moving slower than light (i.e. all massive objects!) can only reach the time-like parts. One would need to move faster than light to reach the space-like parts.

Figure 3.2: The interval PQ is light-like (∆τ = 0), as Q is on P’s past light cone. The interval PR is space-like ($\Delta\tau^2 < 0$), while PS is time-like ($\Delta\tau^2 > 0$).

A particle with 3-velocity $\vec{V} = (dx/dt, dy/dt, dz/dt)$ satisfies
$$ |\vec{V}|^2\, dt^2 = dx^2 + dy^2 + dz^2. \tag{3.19} $$
For a photon, $|\vec{V}|^2 = c^2 = 1$, and so $dt^2 = dx^2 + dy^2 + dz^2$ between any two events along a photon’s world-line. By Einstein’s second axiom, this is true for a photon as seen in any IRF.

Suppose we define a small interval between two points dτ by
$$ d\tau^2 \equiv dt^2 - dx^2 - dy^2 - dz^2. \tag{3.20} $$
Then
$$ d\tau^2 = 0 \tag{3.21} $$
along a photon path in all IRFs. Also, $d\tau^2$ is precisely the line element
$$ d\tau^2 = g_{ab}\, dX^a dX^b \tag{3.22} $$
resulting from the Minkowski metric
$$ g_{ab} = \eta_{ab} = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix}. \tag{3.23} $$
Space-time equipped with $\eta_{ab}$ is called “Minkowski space” or $M^4$. A world-line is a curve in $M^4$.

Theorem Proper time equals clock time in the rest frame of the object.

Proof: In the rest frame R,
$$ dR^1 = dR^2 = dR^3 = 0, \tag{3.24} $$
so $d\tau^2 = (dR^0)^2 = dt_R^2$. We generally choose to have coordinate time increase in the same direction as proper time, so we can take the positive square root:
$$ d\tau = dt_R. \tag{3.25} $$

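The tangent vector to the world line is the 4-vector
$$ U^a = \frac{dX^a}{d\tau} = \begin{pmatrix} dt/d\tau \\ dx/d\tau \\ dy/d\tau \\ dz/d\tau \end{pmatrix} \tag{3.26} $$
What is the corresponding form $U_a$?
$$ U_a = \eta_{ab} U^b \tag{3.27} $$
$$ = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix} \begin{pmatrix} dt/d\tau \\ dx/d\tau \\ dy/d\tau \\ dz/d\tau \end{pmatrix} \tag{3.28} $$
$$ = \left(\frac{dt}{d\tau}, -\frac{dx}{d\tau}, -\frac{dy}{d\tau}, -\frac{dz}{d\tau}\right). \tag{3.29} $$
$$ |U|^2 = U_a U^a \tag{3.30} $$
$$ = \left(\frac{dt}{d\tau}\right)^2 - \left(\frac{dx}{d\tau}\right)^2 - \left(\frac{dy}{d\tau}\right)^2 - \left(\frac{dz}{d\tau}\right)^2 \tag{3.31} $$
$$ = \frac{dt^2 - dx^2 - dy^2 - dz^2}{d\tau^2} \tag{3.32} $$
$$ = \frac{ds^2}{d\tau^2} = \left(\frac{d\tau}{d\tau}\right)^2 = 1 \tag{3.33} $$
$$ \Rightarrow U_a U^a = 1 \quad \text{in all reference frames} \tag{3.34} $$
This is easiest to see in the rest frame, where $U^a = (1, 0, 0, 0)$.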

Exercise 3.5 Suppose a spaceship moves at velocity $\vec{V}$ in the Earth frame. What is $dt_E/d\tau$, where $t_E$ is the Earth time and τ is proper time inside the spaceship? Next suppose the spaceship is investigating some scalar function of position $f(t_E, x_E, y_E, z_E)$ (e.g. temperature of the interplanetary medium). The ship measures and records $f(\tau)$ as it travels through space. Find $df/d\tau$ in terms of $\partial f/\partial t_E$ and the spatial gradient $(\partial f/\partial x_E,\ \partial f/\partial y_E,\ \partial f/\partial z_E)$.

Can you express your result in 4-vector notation?


3.5 The Poincare and Lorentz Groups

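These are the sets of transforms from one IRF to another, i.e. those that preserve $g_{ab} = \eta_{ab}$. Thus going from coordinates $X_A$ to $X_B$ with transform $L_{A\to B}$ gives
$$ g^{A}_{ab} = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix}; \qquad g^{B}_{ab} = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix} \tag{3.35} $$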

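Example 3.2 Rotation about Axes.

[Figure: axes $x_B$, $y_B$ rotated by an angle θ relative to the axes $x_A$, $y_A$.]

$L_{A\to B}$ has the rule
$$ t_B = t_A \tag{3.36} $$
$$ x_B = \cos\theta\, x_A + \sin\theta\, y_A \tag{3.37} $$
$$ y_B = -\sin\theta\, x_A + \cos\theta\, y_A \tag{3.38} $$
$$ z_B = z_A \tag{3.39} $$
$$ \Rightarrow L_{A\to B} = \frac{\partial B^a}{\partial A^b} = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & \cos\theta & \sin\theta & 0 \\ 0 & -\sin\theta & \cos\theta & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \tag{3.40} $$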
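That the rotation matrix (3.40) belongs to the Lorentz group can be confirmed by transforming the Minkowski metric with it. The sketch below is illustrative: the function names are made up, and the angle is an arbitrary sample value.

```python
import math

ETA = [[1, 0, 0, 0], [0, -1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]

def rotation_z(theta):
    """The matrix of equation (3.40): a rotation by theta about the z axis."""
    c, s = math.cos(theta), math.sin(theta)
    return [[1, 0, 0, 0], [0, c, s, 0], [0, -s, c, 0], [0, 0, 0, 1]]

def transformed_metric(L, g):
    """L^c_a L^d_b g_cd, i.e. the matrix product L^T g L."""
    n = len(g)
    return [
        [sum(L[c][a] * L[d][b] * g[c][d] for c in range(n) for d in range(n))
         for b in range(n)]
        for a in range(n)
    ]

L = rotation_z(0.3)
g_new = transformed_metric(L, ETA)
max_diff = max(abs(g_new[a][b] - ETA[a][b]) for a in range(4) for b in range(4))
print(max_diff)
```

The transformed metric agrees with $\eta_{ab}$ up to rounding error, so the rotation preserves the Minkowski metric.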

Example 3.3 Translations.

Consider the transformation $P_{A\to B}$
$$ t_B = t_A + 3, \quad x_B = x_A - 2, \quad y_B = y_A - 5, \quad z_B = z_A - 4 \tag{3.41} $$
The orientation of the axes does not change: $\partial B^a / \partial A^b = I_4$. However, the origin moves: the origin of the B system ($t_B = x_B = y_B = z_B = 0$) is at $(t_A, x_A, y_A, z_A) = (-3, 2, 5, 4)$.

The set of all $\eta_{ab}$ preserving transformations is called the Poincare group, while those which leave the origin fixed (no translation, just rotation) are the Lorentz group. The Lorentz group is a subgroup of the Poincare group.


3.5.1 Group Axioms

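a. Closure:
$$ X_A \xrightarrow{P_{A\to B}} X_B \xrightarrow{P_{B\to C}} X_C \tag{3.42} $$
$$ = X_A \xrightarrow{P_{A\to C}} X_C. \tag{3.43} $$
Let $P_{A\to B}$ and $P_{B\to C}$ be any elements of the Poincare group. Then $P_{A\to C} = P_{B\to C} P_{A\to B}$ (composition of the two transformations) preserves $\eta_{ab}$ if both $P_{A\to B}$ and $P_{B\to C}$ do.

b. Identity:

If $X_A = X_B$, then
$$ P_{A\to B} = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}. \tag{3.44} $$
c. Inverse:
$$ P^{-1}_{A\to B} = P_{B\to A} \tag{3.45} $$
d. Associative:
$$ (P_{C\to D} P_{B\to C})\, P_{A\to B} = P_{C\to D}\, (P_{B\to C} P_{A\to B}) \tag{3.46} $$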

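Theorem: For any Lorentz transform $L_{A\to B}$
$$ |\det(L_{A\to B})| = 1 \tag{3.47} $$
Proof: Both A and B are inertial frames, so $g^{A}_{cd} = \eta_{cd}$ and $g^{B}_{ab} = \eta_{ab}$. Thus
$$ \eta_{ab} = \frac{\partial A^c}{\partial B^a}\,\frac{\partial A^d}{\partial B^b}\,\eta_{cd} \tag{3.48} $$
$$ = L^{c}{}_{a}\, L^{d}{}_{b}\, \eta_{cd}, \tag{3.49} $$
where $L^{c}{}_{a} = \partial A^c / \partial B^a$ is the matrix of the inverse transform $L_{B\to A}$. Now, the determinant of the product of matrices is the product of the determinants, so
$$ \det(\eta) = \det(L)^2 \det(\eta) \tag{3.50} $$
$$ \Longrightarrow \det(L)^2 = 1. \tag{3.51} $$
Since $\det(L_{A\to B}) = 1/\det(L_{B\to A})$,
$$ \therefore\ |\det(L_{A\to B})| = 1. \tag{3.52} $$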

Definition 3.8 Proper and Improper Transforms

• ‘Proper’ Lorentz transforms have det(L) = 1

• ‘Improper’ Lorentz transforms have det(L) = −1

Example 3.4 Mirror Transform
$$ \begin{pmatrix} t \\ x \\ y \\ z \end{pmatrix}_B = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} t \\ x \\ y \\ z \end{pmatrix}_A. \tag{3.53} $$
This improper transform reflects objects in the x direction.

3.6 Lorentz Boosts

3.6.1 Deriving the transformation matrix

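Suppose a spaceship moves with velocity V along the x axis with respect to the Earth:

Ship’s rest frame: S (3.54)

Earth’s rest frame: E (3.55)

Assume the origins coincide: $S^a = 0$ is the same event as $E^a = 0$. There will be many Lorentz transformations which go from the Earth frame to the Ship’s frame; these will differ by rotations in space. We can guess, however, that there will be a simple one where the y and z coordinates do not change: $y_S = y_E$ and $z_S = z_E$. Thus we will try transforms of the form:
$$ \begin{pmatrix} t \\ x \\ y \\ z \end{pmatrix}_S = \begin{pmatrix} ? & ? & 0 & 0 \\ ? & ? & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} t \\ x \\ y \\ z \end{pmatrix}_E \tag{3.56} $$
$$ \Longrightarrow \begin{pmatrix} t \\ x \end{pmatrix}_S = \begin{pmatrix} \gamma & \delta \\ \mu & \nu \end{pmatrix} \begin{pmatrix} t \\ x \end{pmatrix}_E \tag{3.57} $$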

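1) c = 1 in both frames

[Figure: space-time diagram with axes x and t showing two photon world-lines through the origin, one through (t, t) and one through (t, −t).]

In all frames, a photon moving to the right passes through events with
$$ \begin{pmatrix} t \\ x \end{pmatrix} = \begin{pmatrix} t \\ t \end{pmatrix}. \tag{3.58} $$
A photon moving to the left, on the other hand, passes through
$$ \begin{pmatrix} t \\ x \end{pmatrix} = \begin{pmatrix} t \\ -t \end{pmatrix}. \tag{3.59} $$
Suppose in the Earth frame a photon passes through the event
$$ \begin{pmatrix} t \\ x \end{pmatrix}_E = \begin{pmatrix} 1 \\ 1 \end{pmatrix}_E. $$
In ship’s co-ordinates,
$$ \begin{pmatrix} t \\ x \end{pmatrix}_S = \begin{pmatrix} t \\ t \end{pmatrix}_S = \begin{pmatrix} \gamma & \delta \\ \mu & \nu \end{pmatrix} \begin{pmatrix} 1 \\ 1 \end{pmatrix}_E \tag{3.60} $$
Thus
$$ t_S = \gamma + \delta \tag{3.61} $$
$$ = \mu + \nu \tag{3.62} $$
$$ \Rightarrow \gamma + \delta = \mu + \nu. \tag{3.63} $$
A photon could also pass through $\begin{pmatrix} t \\ x \end{pmatrix}_E = \begin{pmatrix} 1 \\ -1 \end{pmatrix}_E$, which has ship’s co-ordinates
$$ \begin{pmatrix} t \\ x \end{pmatrix}_S = \begin{pmatrix} \gamma & \delta \\ \mu & \nu \end{pmatrix} \begin{pmatrix} 1 \\ -1 \end{pmatrix}_E \tag{3.64} $$
$$ = \begin{pmatrix} t \\ -t \end{pmatrix}_S \tag{3.65} $$
$$ \Rightarrow t_S = \gamma - \delta \tag{3.66} $$
$$ -t_S = \mu - \nu \tag{3.67} $$
$$ \Rightarrow \gamma - \delta = \nu - \mu \tag{3.68} $$
Combining these two results gives
$$ \gamma + \delta = \mu + \nu \tag{3.69} $$
$$ \gamma - \delta = \nu - \mu \tag{3.70} $$
$$ \Rightarrow 2\gamma = 2\nu \tag{3.71} $$
$$ \Rightarrow \nu = \gamma, \quad \mu = \delta \tag{3.72} $$
We now have
$$ \begin{pmatrix} t \\ x \end{pmatrix}_S = \begin{pmatrix} \gamma & \delta \\ \delta & \gamma \end{pmatrix} \begin{pmatrix} t \\ x \end{pmatrix}_E. \tag{3.73} $$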

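2) Follow spatial origin in ship’s co-ordinates

On the ship, $\begin{pmatrix} t \\ x \end{pmatrix}_S = \begin{pmatrix} t \\ 0 \end{pmatrix}_S$. But Earthlings see this move at speed V
$$ \Rightarrow \begin{pmatrix} t \\ x \end{pmatrix}_E = \begin{pmatrix} t \\ Vt \end{pmatrix} \tag{3.74} $$
$$ \Rightarrow \begin{pmatrix} t \\ x \end{pmatrix}_S = \begin{pmatrix} t \\ 0 \end{pmatrix}_S \tag{3.75} $$
$$ = \begin{pmatrix} \gamma & \delta \\ \delta & \gamma \end{pmatrix} \begin{pmatrix} t \\ Vt \end{pmatrix}_E \tag{3.76} $$
And so,
$$ t_S = \gamma t_E + \delta V t_E \tag{3.77} $$
$$ = (\gamma + \delta V)\, t_E \tag{3.78} $$
$$ x_S = \delta t_E + \gamma V t_E \tag{3.79} $$
$$ = (\delta + \gamma V)\, t_E \tag{3.80} $$
$$ = 0 \tag{3.81} $$
$$ \Rightarrow \delta + \gamma V = 0 \tag{3.82} $$
$$ \Rightarrow \delta = -\gamma V \tag{3.83} $$
Thus, we have
$$ \begin{pmatrix} t \\ x \end{pmatrix}_S = \gamma \begin{pmatrix} 1 & -V \\ -V & 1 \end{pmatrix} \begin{pmatrix} t \\ x \end{pmatrix}_E. \tag{3.84} $$
3) Apply det(L) = 1
$$ \det \begin{pmatrix} \gamma & -\gamma V \\ -\gamma V & \gamma \end{pmatrix} = 1 \tag{3.85} $$
$$ = \gamma^2 - \gamma^2 V^2 \tag{3.86} $$
$$ = \gamma^2 \left(1 - V^2\right) \tag{3.87} $$
$$ \Rightarrow \gamma^2 = \frac{1}{1 - V^2} \tag{3.88} $$
$$ \therefore\ \gamma = \frac{1}{\sqrt{1 - V^2}}. \tag{3.89} $$
4) Inverse Transformation

From the ship’s frame to the Earth’s frame
$$ \begin{pmatrix} t \\ x \end{pmatrix}_E = \begin{pmatrix} \gamma & \gamma V \\ \gamma V & \gamma \end{pmatrix} \begin{pmatrix} t \\ x \end{pmatrix}_S \tag{3.90} $$
This is the inverse transform to the one from Earth to ship, as V → −V.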
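The properties used in steps 1 to 4 can be verified for a sample speed. The sketch below works in units with c = 1 and uses V = 0.6 purely as an illustration; the helper names are made up.

```python
import math

def boost(V):
    """2x2 boost acting on (t, x), equation (3.84), in units with c = 1 (|V| < 1)."""
    gamma = 1.0 / math.sqrt(1.0 - V**2)
    return [[gamma, -gamma * V], [-gamma * V, gamma]]

def matmul(A, B):
    """Product of two 2x2 matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def apply(L, event):
    """Apply a 2x2 transform to an event (t, x)."""
    t, x = event
    return (L[0][0] * t + L[0][1] * x, L[1][0] * t + L[1][1] * x)

V = 0.6
L = boost(V)

det = L[0][0] * L[1][1] - L[0][1] * L[1][0]     # step 3: should be 1
photon = apply(L, (1.0, 1.0))                   # step 1: stays on t = x
ship_origin = apply(L, (1.0, V))                # step 2: x_S should vanish
round_trip = matmul(boost(-V), L)               # step 4: should be the identity
print(det, photon, ship_origin, round_trip)
```

For V = 0.6 the factor γ is 1.25. The photon event keeps t = x in the ship frame, the ship's origin maps to $x_S = 0$, and boosting by −V undoes the boost by V.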


3.7 Simultaneity

A ‘surface of simultaneity’ is a set of points where t = constant in some reference frame. Let’s look at the Ship’s surface of simultaneity in the Earth’s co-ordinate frame.

The line $t_S = 0$ contains events occurring simultaneously in the ship’s frame.

[Figure: space-time diagram in Earth coordinates ($x_E$, $t_E$) showing the lines $t_E = 0$ and $t_S = 0$ and the events P, Q and R.]

From the inverse transformation, with $t_S = 0$,
$$ t_E = \gamma V x_S \tag{3.91} $$
$$ x_E = \gamma x_S \tag{3.92} $$
$$ \Rightarrow t_E = V x_E. \tag{3.93} $$
This is the line in the Earth’s frame of reference corresponding to $t_S = 0$.

Events P, Q are simultaneous in the Earth’s frame, but not in the Ship’s. Events P, R are simultaneous in the Ship’s frame, but not in the Earth’s.

Exercise 3.6 Suppose a spaceship moves at velocity $V\hat{x}$ with respect to the Earth, where V = 1/2. Let the coordinates in the rest frame of the ship be
$$ S^a = (S^0, S^1) = (t_S, x_S), \tag{3.94} $$
(ignoring the y and z components). Similarly, let $E^a = (E^0, E^1) = (t_E, x_E)$ be coordinates in the rest frame of the Earth. Also let $(0, 0)_S = (0, 0)_E$. Draw a space-time diagram where the horizontal axis gives $x_E$ and the vertical axis gives $t_E$. On this diagram draw the line $x_S = 0$ (the time axis in the ship rest frame) and the line $t_S = 0$ (the space axis in the ship rest frame). What is the angle between these lines?


3.8 Length Contraction

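Definition Length: The length of an object is the spatial distance between the ends, measured simultaneously in some reference frame.

[Figure: space-time diagram showing the events P, Q and R and the lengths $L_S$ and $L_E$.]

Consider a metre stick at rest on the spaceship. The space travellers measure the position of the ends of the stick simultaneously at P, R. Earthlings see P, Q as simultaneous events corresponding to the ends of the stick at $t_E = 0$.

Thus,

        t_E    x_E    t_S    x_S
  P     0      0      0      0
  Q     0      L_E    ?      ?
  R     ?      ?      0      L_S

Apply the Lorentz transform to find the ?’s.
$$ Q: \quad \begin{pmatrix} t_{QS} \\ x_{QS} \end{pmatrix} = \begin{pmatrix} \gamma & -\gamma V \\ -\gamma V & \gamma \end{pmatrix} \begin{pmatrix} 0 \\ L_E \end{pmatrix} = \gamma \begin{pmatrix} -V L_E \\ L_E \end{pmatrix} \tag{3.95} $$
$$ R: \quad \begin{pmatrix} t_{RE} \\ x_{RE} \end{pmatrix} = \begin{pmatrix} \gamma & \gamma V \\ \gamma V & \gamma \end{pmatrix} \begin{pmatrix} 0 \\ L_S \end{pmatrix} = \gamma \begin{pmatrix} V L_S \\ L_S \end{pmatrix} \tag{3.96} $$
The right end of the stick moves from Q → R with speed V w.r.t. Earth
$$ \Rightarrow x_{RE} = x_{QE} + V (t_{RE} - t_{QE}) \tag{3.97} $$
$$ = x_{QE} + V t_{RE} \tag{3.98} $$
If we combine this with the Lorentz transform, we get
$$ \gamma L_S = L_E + V (\gamma V L_S) \tag{3.99} $$
$$ \Longrightarrow L_E = L_S (\gamma - V^2 \gamma) \tag{3.100} $$
$$ = L_S\, \gamma (1 - V^2) \tag{3.101} $$
$$ = L_S\, \gamma \gamma^{-2} \tag{3.102} $$
$$ = L_S\, \gamma^{-1} \tag{3.103} $$
$$ \therefore\ L_E = L_S\, \gamma^{-1} \tag{3.104} $$
and since γ > 1,
$$ L_E < L_S. \tag{3.105} $$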
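The argument can be replayed with numbers. The sketch below takes V = 0.6 and a stick of rest length 1 as illustrative values, builds event R in Earth coordinates from the inverse transform (3.90), and then traces the moving end of the stick back to $t_E = 0$ to find Q; the function names are made up.

```python
import math

def gamma(V):
    """Lorentz factor in units with c = 1, equation (3.89)."""
    return 1.0 / math.sqrt(1.0 - V**2)

def contracted_length(L_S, V):
    """L_E = L_S / gamma, equation (3.104)."""
    return L_S / gamma(V)

V, L_S = 0.6, 1.0
L_E = contracted_length(L_S, V)

# Event R = (t_S, x_S) = (0, L_S), expressed in Earth coordinates via (3.90).
t_RE = gamma(V) * V * L_S
x_RE = gamma(V) * L_S

# The right end of the stick moves at speed V, so at t_E = 0 it was at x_QE.
x_QE = x_RE - V * t_RE
print(L_E, t_RE, x_RE, x_QE)
```

With these values γ = 1.25, so the Earth frame measures a length of 0.8, and the position of the right end at $t_E = 0$ agrees with $L_S/\gamma$.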

3.9 Relativistic Dynamics

3.9.1 The 4-momentum

The three-momentum of an object is defined as $\vec{p} = m\vec{V}$. In space-time, we use $\mathbf{U}$ instead of $\vec{V}$. For an object travelling at velocity $\vec{V}$, a Lorentz transformation from the rest frame of the object gives

$$ U^a = \begin{pmatrix} \gamma \\ \gamma\vec{V} \end{pmatrix}. \tag{3.106} $$

We can check that $|\mathbf{U}|^2 = 1$:

$$ U_a U^a = (\gamma, -\gamma\vec{V}) \begin{pmatrix} \gamma \\ \gamma\vec{V} \end{pmatrix} = \gamma^2 (1 - V^2) = 1. \tag{3.107} $$

Next, we extend momentum from the three-dimensional $\vec{p}$ to a four-dimensional object. We do this by including the energy $E$. This makes intuitive sense: classical mechanics conserves $E$ as well as the three components of momentum. Another rationale for combining energy and momentum follows from the symmetries of space-time. Noether’s theorem (see chapter 7) shows that momentum conservation is a direct consequence of the homogeneity of space (i.e. invariance to spatial translations). But Noether also showed that energy conservation follows in the same way from the homogeneity of time (invariance to time translations). Thus combining space and time into space-time corresponds to combining energy and momentum into one object as well.

As we will see later, when analyzing orbits, the 4-momentum works most naturally as a form rather than a vector. In other areas of physics momentum also appears as dual or conjugate to vectors, in particular the position vector. Recall, for example, that in quantum theory momentum appears paired with the position vector $\vec{x}$, e.g. in the Fourier transform term $\exp(i\vec{p}\cdot\vec{x}/\hbar)$. In these cases $\vec{p}$ combines with $\vec{x}$ to form a scalar, just as a form combines with a vector to form a scalar. This is why we will begin with its definition as a form. We are free to give names to the components of $\mathbf{p}$. In fact, we will give the symbol $E$ to $p_0$ and $-\vec{p}$ to the other components. Afterwards, we justify these names, showing that $E$ acts like energy and $\vec{p}$ acts like the non-relativistic momentum.

Definition 4-momentum The 4-momentum is a form $\mathbf{p}$ defined by

$$ p_a \equiv m g_{ab} U^b = m U_a. \tag{3.108} $$

The components of the 4-momentum will be given names:

$$ p_a \equiv (E, -\vec{p}). \tag{3.109} $$


The raised form of the 4-momentum $p^a \equiv g^{ab} p_b$ is simply

$$ p^a = m U^a. \tag{3.110} $$

In Special Relativity, where $g_{ab} = \eta_{ab}$, the raised form of the 4-momentum is

$$ p^a = \eta^{ab} p_b = \begin{pmatrix} E \\ \vec{p} \end{pmatrix}. \tag{3.111} $$

This implies

$$ p^a = \begin{pmatrix} \gamma m \\ \gamma m \vec{V} \end{pmatrix} \tag{3.112} $$

so that

$$ \vec{p} = \gamma m \vec{V}. \tag{3.113} $$

The right hand side gives the non-relativistic 3-momentum, apart from the factor of $\gamma$ (which is very nearly 1 in non-relativistic situations). Thus we are justified in our choice of the symbol $\vec{p}$ for the spatial components of the 4-momentum.

The 0’th component of p resembles the non-relativistic energy, plus just a bit extra:

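\begin{align}
E = p^0 &= \gamma m \tag{3.114} \\
&= m (1 - V^2)^{-1/2} \tag{3.115} \\
&= m \left(1 + \tfrac{1}{2} V^2 + O(V^4)\right) \tag{3.116} \\
&\approx m + \tfrac{1}{2} m V^2 + O(V^4). \tag{3.117}
\end{align}

We interpret this as the rest mass ($m$) + kinetic energy ($mV^2/2$) + relativistic correction ($O(V^4)$).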
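To see how small the relativistic correction is at everyday speeds, one can compare $\gamma m$ with $m + \tfrac{1}{2} m V^2$ directly. This is a minimal sketch with illustrative function names and sample speeds (assumptions, not from the notes); the leading correction to the expansion is $\tfrac{3}{8} m V^4$, so the last column should approach 0.375 as $V$ decreases.

```python
import math


def energy_exact(m, v):
    """Relativistic energy E = gamma * m (units with c = 1)."""
    return m / math.sqrt(1.0 - v * v)


def energy_newtonian(m, v):
    """Rest mass plus Newtonian kinetic energy."""
    return m + 0.5 * m * v * v


# Sample speeds as fractions of the speed of light (illustrative values).
for v in (0.01, 0.1, 0.5):
    exact = energy_exact(1.0, v)
    approx = energy_newtonian(1.0, v)
    ratio = (exact - approx) / v**4
    print(f"V = {v:4.2f}  exact = {exact:.10f}  approx = {approx:.10f}  "
          f"(exact - approx) / V^4 = {ratio:.4f}")
```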

In the rest frame γ = 1 and we have Einstein’s famous equation

$$ E = m \tag{3.118} $$

(or in non-relativistic units $E = mc^2$).

Exercise 3.7

a. Show that if the particle has three-velocity $\vec{V}$, then $\vec{p} = E\vec{V}$.

b. Show that $E^2 = |\vec{p}|^2 + m^2$.

3.9.2 Forces

Newton: Classically, $F = ma$. In Special Relativity, this becomes

$$ f^a = m a^a \tag{3.119} $$

where

$$ a^a \equiv \frac{dU^a}{d\tau} = \frac{d^2 X^a}{d\tau^2}. \tag{3.120} $$

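Theorem:

$$ a_a U^a = 0 \tag{3.121} $$

i.e. the 4-acceleration is perpendicular to the 4-velocity.

Proof:

\begin{align}
\mathbf{a}\cdot\mathbf{U} &= \frac{d\mathbf{U}}{d\tau}\cdot\mathbf{U} \tag{3.122} \\
&= \frac{1}{2}\frac{d(\mathbf{U}\cdot\mathbf{U})}{d\tau} \tag{3.123} \\
&= \frac{1}{2}\frac{d}{d\tau}(1) \tag{3.124} \\
&= 0. \tag{3.125}
\end{align}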

Corollary Since the force f = ma , with m scalar, we also have

f ·U = 0 . (3.126)
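The orthogonality can be illustrated on a concrete world-line. The sketch below is an illustration under assumptions: it uses the hyperbolic world-line $U^a = (\cosh g\tau, \sinh g\tau)$ (motion with constant proper acceleration $g$ in 1+1 dimensions), which is not introduced at this point in the notes, and estimates $dU^a/d\tau$ by a central difference. The helper names are hypothetical.

```python
import math


def minkowski_dot(a, b):
    """Scalar product in 1+1 dimensions with signature (+, -)."""
    return a[0] * b[0] - a[1] * b[1]


def four_velocity(tau, g=1.0):
    """4-velocity (U^0, U^1) on a hyperbolic world-line (assumed example)."""
    return (math.cosh(g * tau), math.sinh(g * tau))


def four_acceleration(tau, g=1.0, h=1e-5):
    """Central-difference estimate of dU/dtau."""
    up = four_velocity(tau + h, g)
    um = four_velocity(tau - h, g)
    return tuple((p - m) / (2.0 * h) for p, m in zip(up, um))


tau = 0.7
U = four_velocity(tau)
a = four_acceleration(tau)
print("U.U =", minkowski_dot(U, U))   # should be close to 1
print("a.U =", minkowski_dot(a, U))   # should be close to 0
```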

3.9.3 Energy-Momentum Conservation

Consider 2 particles colliding:

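[Figure: two particles with 4-momenta $p_1$ and $p_2$ collide and leave with 4-momenta $p_3$ and $p_4$.]

Total 4-momentum

before: $$ p_1^a + p_2^a \tag{3.127} $$

after: $$ p_3^a + p_4^a \tag{3.128} $$

Conservation of Energy and Momentum:

$$ \Longrightarrow\ p_1^a + p_2^a = p_3^a + p_4^a \tag{3.129} $$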
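As a small illustration of how the conservation law is used, consider a special case that is not in the notes: two equal-mass particles collide head-on and merge into a single object, so the right-hand side of (3.129) has only one term. The sketch below (illustrative names and values, 1+1 dimensions) adds the 4-momenta and reads off the mass of the product from $m^2 = E^2 - |\vec{p}|^2$.

```python
import math


def four_momentum(m, v):
    """Raised 4-momentum (E, p) of a particle of mass m and velocity v (c = 1)."""
    gamma = 1.0 / math.sqrt(1.0 - v * v)
    return (gamma * m, gamma * m * v)


def invariant_mass(p):
    """Mass from the norm of the 4-momentum: m^2 = E^2 - p^2."""
    return math.sqrt(p[0] ** 2 - p[1] ** 2)


# Two unit-mass particles approaching each other at V = 0.6 (assumed values).
p1 = four_momentum(1.0, 0.6)
p2 = four_momentum(1.0, -0.6)
total = (p1[0] + p2[0], p1[1] + p2[1])

print("total 4-momentum:", total)
print("mass of merged object:", invariant_mass(total))
```

The merged mass exceeds the sum of the two rest masses: the kinetic energy of the incoming particles has been absorbed into the rest mass of the product.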

3.9.4 Photons

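The four-velocity becomes ill-defined when $V \to 1$:

$$ U^a = \begin{pmatrix} \gamma \\ \gamma\vec{V} \end{pmatrix} \to \begin{pmatrix} \infty \\ \infty\vec{V} \end{pmatrix}. \tag{3.130} $$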


Fortunately, the 4-momentum still makes sense. Let $m \to 0$ while $V \to 1$, keeping $E = \gamma m$ constant:

$$ p_a = (\gamma m, -\gamma m \vec{V}) = (E, -E\vec{V}). \tag{3.131} $$

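Note that $|\mathbf{p}|^2 = p_a p^a = E^2 (1 - V^2) = m^2$, which tends to zero in this limit: the 4-momentum of a photon is null.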

The 3-vector $\vec{V}$ becomes a unit vector as its modulus $V \to 1$. Let $\hat{k} = \lim_{V\to 1} \vec{V}$. Then $p_a = (E, -E\hat{k})$. Here $\hat{k}$ tells us the direction of travel of the photon.

In quantum theory, $E = \hbar\omega$ for a photon, where $\omega$ is the angular frequency of the light. This implies

$$ p_a = \hbar (\omega, -\vec{k}) \tag{3.132} $$

where $\vec{k} = \omega\hat{k}$. We will write $k_a = (\omega, -\vec{k})$ for the wave-number

$$ \Longrightarrow\ \mathbf{p} = \hbar\mathbf{k}. \tag{3.133} $$

N.B. The world-line of a photon cannot be parameterized by $\tau$ (proper time), since proper time does not exist for a photon ($d\tau = 0$ along the path of a photon). We can still use other parameters, for example the coordinate time $t$ in some reference frame.

Chapter 4

Maxwell’s Equations in Tensor Form

‘Well, I do not mind telling you I have been at work upon this geometry of Four Dimensions for some time. Some of my results are curious. For instance, here is a portrait of a man at eight years old, another at fifteen, another at seventeen, another at twenty-three, and so on. All these are evidently sections, as it were, Three-Dimensional representations of his Four-Dimensioned being, which is a fixed and unalterable thing.’

4.1 Maxwell’s Equations – Review

We will use units where ε0 = µ0 = c = 1 (Vacuum Equations)

4.1.1 Internal Structure Equations

The internal structure equations involve the fields only; matter terms involving charges and currents do not appear.

$$ \nabla\cdot\vec{B} = 0, \tag{4.1} $$

$$ \nabla\times\vec{E} + \partial_t \vec{B} = 0. \tag{4.2} $$

Equation (4.1) implies there are no magnetic monopoles: lines of magnetic flux have no endpoints. The meaning of Equation (4.2) can be seen by integrating over a surface $S$ bounded by a curve $C$:

[Figure: surface $S$ bounded by the curve $C$, with unit normal $\hat{n}$.]

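\begin{align}
\int_S (\nabla\times\vec{E})\cdot\hat{n}\, d^2x &= -\int_S \partial_t \vec{B}\cdot\hat{n}\, d^2x \tag{4.3} \\
&= -\frac{d}{dt}\int_S \vec{B}\cdot\hat{n}\, d^2x \tag{4.4} \\
&= -\frac{d}{dt}[\text{magnetic flux through } S] \tag{4.5}
\end{align}

but, by Stokes’ theorem,

\begin{align}
\int_S (\nabla\times\vec{E})\cdot\hat{n}\, d^2x &= \oint_C \vec{E}\cdot\vec{dl} \tag{4.6} \\
&= [\text{electromotive force round } C] \tag{4.7}
\end{align}

Thus, changes in the magnetic flux produce an electromotive force (and vice-versa).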

4.1.2 Source Equations

$$ \nabla\cdot\vec{E} = \rho_c, \tag{4.8} $$

$$ \nabla\times\vec{B} - \partial_t \vec{E} = \vec{J}. \tag{4.9} $$

Equation (4.8) implies that electric field lines start and stop at electric charges.

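For non-relativistic applications, $\partial_t \vec{E}$ is small, and equation (4.9) gives

$$ \nabla\times\vec{B} \approx \vec{J} \tag{4.10} $$

i.e. magnetic field lines circle currents.

[Figure: magnetic field lines $\vec{B}$ circling a current $\vec{J}$.]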

Maxwell’s equations comprise 2 scalar equations and 2 vector equations, giving 8 component equations in total.


4.1.3 Lorentz Force

The source equations tell us how matter generates fields. We need a supplemental equation to see how fields affect matter: the Lorentz Force equation. For a particle of charge $q$,

$$ \vec{F} = q (\vec{E} + \vec{V}\times\vec{B}). \tag{4.11} $$

4.1.4 Charge Conservation

Charge conservation is expressed by the equation

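$$ \partial_t \rho_c + \nabla\cdot\vec{J} = 0. \tag{4.12} $$

If we integrate this over a volume $V$, bounded by the surface $S$ and containing charge $Q$:

\begin{align}
\int_V \partial_t \rho_c\, d^3x &= -\int_V \nabla\cdot\vec{J}\, d^3x \tag{4.13} \\
\Longrightarrow\ \frac{d}{dt}\int_V \rho_c\, d^3x &= -\oint_S \vec{J}\cdot\hat{n}\, d^2x \quad \text{(by the Divergence Theorem)} \tag{4.14} \\
\Longrightarrow\ \frac{dQ}{dt} &= -[\text{flow of charge out of } V] \tag{4.15}
\end{align}

[Figure: volume $V$ containing charge $Q$, bounded by the surface $S$, with current $\vec{J}$ flowing out.]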

A physical system of fields and matter can be represented as follows:


4.2 The Faraday Tensor

We define the Faraday Tensor as the antisymmetric tensor

$$ F_{ab} = \begin{pmatrix} 0 & -E_x & -E_y & -E_z \\ E_x & 0 & B_z & -B_y \\ E_y & -B_z & 0 & B_x \\ E_z & B_y & -B_x & 0 \end{pmatrix} \tag{4.16} $$

(rows are labelled by $a$, columns by $b$).

Definition 4.1 two-forms A two-form is an antisymmetric second rank tensor with two lower indices.

Thus the Faraday tensor is a two-form. In general, we will define fields as forms (like gradients), and particles (i.e. 4-velocities, currents) as vectors. However, at times we can raise or lower using the metric, e.g. $U_a = g_{ab} U^b$.

Exercise 4.1

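a. Find the raised version of $F_{ab}$, i.e. find $F^{cd} = \eta^{ce}\eta^{df} F_{ef}$. Be careful if you use matrix multiplication!

b. Next find ‘the dual Faraday tensor’

$$ {}^*F^{ab} \equiv \tfrac{1}{2} \varepsilon^{abcd} F_{cd}. \tag{4.17} $$

Answer:

$$ {}^*F^{ab} = \begin{pmatrix} 0 & +B_x & +B_y & +B_z \\ -B_x & 0 & -E_z & E_y \\ -B_y & E_z & 0 & -E_x \\ -B_z & -E_y & E_x & 0 \end{pmatrix} \tag{4.18} $$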
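Index gymnastics of this kind is easy to get wrong by hand, so a numerical cross-check can help. The sketch below is an illustration, not the notes' own method: it builds $F_{ab}$ from (4.16), raises both indices with $\eta = \mathrm{diag}(1, -1, -1, -1)$, and contracts with the Levi-Civita symbol. The sign convention $\varepsilon^{0123} = +1$ is an assumption made here; with it the result reproduces (4.18). All function names are hypothetical.

```python
from itertools import permutations

ETA = [1.0, -1.0, -1.0, -1.0]  # diagonal entries of the Minkowski metric


def faraday_lower(E, B):
    """F_ab as in (4.16); rows are a, columns are b."""
    Ex, Ey, Ez = E
    Bx, By, Bz = B
    return [
        [0.0, -Ex, -Ey, -Ez],
        [Ex, 0.0, Bz, -By],
        [Ey, -Bz, 0.0, Bx],
        [Ez, By, -Bx, 0.0],
    ]


def raise_indices(F_lower):
    """F^ab = eta^ac eta^bd F_cd for a diagonal metric."""
    return [[ETA[a] * ETA[b] * F_lower[a][b] for b in range(4)] for a in range(4)]


def levi_civita(indices):
    """Sign of the permutation; 0 if an index repeats (assumes eps^0123 = +1)."""
    if len(set(indices)) < len(indices):
        return 0
    sign = 1
    for i in range(len(indices)):
        for j in range(i + 1, len(indices)):
            if indices[i] > indices[j]:
                sign = -sign  # each inversion flips the sign
    return sign


def dual(F_lower):
    """*F^ab = (1/2) eps^abcd F_cd."""
    star = [[0.0] * 4 for _ in range(4)]
    for a, b, c, d in permutations(range(4)):
        star[a][b] += 0.5 * levi_civita((a, b, c, d)) * F_lower[c][d]
    return star


# Arbitrary test fields (illustrative values): E = (1, 2, 3), B = (4, 5, 6).
F = faraday_lower((1.0, 2.0, 3.0), (4.0, 5.0, 6.0))
for row in raise_indices(F):
    print(row)
for row in dual(F):
    print(row)
```

Using distinct numbers for the six field components makes it easy to see which component lands in which slot of the printed matrices.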

4.3 Internal Structure Equations

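$$ \partial_a F_{bc} + \partial_b F_{ca} + \partial_c F_{ab} = 0 \tag{4.19} $$

True for any $a, b, c = 0, 1, 2, 3$.

Example 4.1 $a = 1$, $b = 2$, $c = 3$

\begin{align}
\partial_1 F_{23} + \partial_2 F_{31} + \partial_3 F_{12} &= 0 \\
\Longrightarrow\ \frac{\partial}{\partial x} B_x + \frac{\partial}{\partial y} B_y + \frac{\partial}{\partial z} B_z &= 0 \\
\therefore\ \nabla\cdot\vec{B} &= 0
\end{align}

which is the first Maxwell equation.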

Note: There are 64 combinations of a, b, c, but most are useless!


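Example 4.2 $a = 1$, $b = 2$, $c = 2$

\begin{align}
\partial_1 F_{22} + \partial_2 F_{21} + \partial_2 F_{12} &= 0 \\
\Longrightarrow\ \frac{\partial}{\partial x}(0) + \frac{\partial}{\partial y}(-B_z) + \frac{\partial}{\partial y}(B_z) &= 0 \\
\Longrightarrow\ 0 &= 0
\end{align}

which is true automatically, and tells us nothing.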

Only 4 choices of a, b, c are useful – those where all three are different.

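Exercise 4.2 Consider the equation

$$ \partial_b\, {}^*F^{ab} = 0. \tag{4.20} $$

Find the four equations for $\vec{E}$ and $\vec{B}$ generated by letting $a = 0$, $a = 1$, $a = 2$, and $a = 3$. Show that these are just the Internal Maxwell equations

$$ \nabla\cdot\vec{B} = 0, \qquad \nabla\times\vec{E} + \frac{\partial\vec{B}}{\partial t} = 0. \tag{4.21} $$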

4.4 Source Equations

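$$ \partial_b F^{ab} = j^a \tag{4.22} $$

where $j^0 = \rho_c$, $(j^1, j^2, j^3) = \vec{J}$, and

$$ F^{ab} = \eta^{ac}\eta^{bd} F_{cd} \quad \text{(Special Relativity)}; \tag{4.23} $$

$$ F^{ab} = g^{ac} g^{bd} F_{cd} \quad \text{(General Relativity)}. \tag{4.24} $$

Special Relativity:

$$ F^{ab} = \begin{pmatrix} 0 & E_x & E_y & E_z \\ -E_x & 0 & B_z & -B_y \\ -E_y & -B_z & 0 & B_x \\ -E_z & B_y & -B_x & 0 \end{pmatrix}. \tag{4.25} $$

There are 4 equations, for $a = 0, 1, 2, 3$.

E.g. $a = 0$:

\begin{align}
\partial_b F^{0b} &= j^0 \\
\partial_0 F^{00} + \partial_1 F^{01} + \partial_2 F^{02} + \partial_3 F^{03} &= j^0 \\
\partial_t (0) + \partial_x E_x + \partial_y E_y + \partial_z E_z &= \rho_c \\
\therefore\ \nabla\cdot\vec{E} &= \rho_c
\end{align}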


4.5 Charge Conservation

$$ \frac{\partial\rho_c}{\partial t} + \nabla\cdot\vec{J} = 0 \tag{4.26} $$

$$ \Rightarrow\ \partial_0 j^0 + (\partial_1 j^1 + \partial_2 j^2 + \partial_3 j^3) = 0 \tag{4.27} $$

or

$$ \partial_a j^a = 0. \tag{4.28} $$

This equation follows immediately from the Source equation:

$$ \partial_a j^a = \partial_a \partial_b F^{ab} = 0 \tag{4.29} $$

as $\partial_a \partial_b$ is symmetric while $F^{ab}$ is antisymmetric.

The 4-divergence

In general, when the 4-divergence, $\partial_a V^a$, of a vector field vanishes, we say that $V^a$ is conserved.

The 4-current is $j^a = \begin{pmatrix} \rho_c \\ \vec{J} \end{pmatrix}$. The time component $j^0 = \rho_c$ gives the amount of charge moving in the time direction per unit (space) volume. The components of $\vec{J}$ give the amount of charge moving in each space direction per unit time.

Exercise 4.3 Let inertial frame B move with velocity $V\hat{x}$ with respect to inertial frame A. Suppose in frame A the magnetic field components vanish. Using the Faraday Tensor, find the magnetic field components and electric field components in frame B.

4.6 Lorentz Force

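$$ f^a = q U_b F^{ba}. \tag{4.30} $$

Example 4.3 What is the $a = 1$ component of the force?

Solution: Using $U_a = (\gamma, -\gamma\vec{V})$,

\begin{align}
f^1 &= q U_b F^{b1} \\
&= q\gamma (F^{01} - V_x F^{11} - V_y F^{21} - V_z F^{31}) \\
&= q\gamma (E_x - V_x (0) - V_y (-B_z) - V_z B_y) \\
&= q\gamma (E_x + (V_y B_z - V_z B_y)) \\
&= q\gamma (E_x + (\vec{V}\times\vec{B})_x)
\end{align}

and so

$$ (f^1, f^2, f^3) = q\gamma (\vec{E} + \vec{V}\times\vec{B}). $$

Checking that $f^\alpha U_\alpha = 0$:

\begin{align}
f^\alpha U_\alpha &= (q U_\beta F^{\beta\alpha}) U_\alpha \\
&= q F^{\beta\alpha} U_\beta U_\alpha
\end{align}

but $F^{\beta\alpha}$ is anti-symmetric, while $U_\beta U_\alpha$ is symmetric; the double contraction therefore gives 0.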
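Both statements (the spatial components reduce to $q\gamma(\vec{E} + \vec{V}\times\vec{B})$, and $f^\alpha U_\alpha = 0$) can be confirmed with a few lines of arithmetic. This is a minimal sketch with hypothetical helper names and arbitrary sample fields; it uses the raised tensor (4.25) and the lowered 4-velocity $U_a = (\gamma, -\gamma\vec{V})$.

```python
import math


def faraday_upper(E, B):
    """F^ab as in (4.25)."""
    Ex, Ey, Ez = E
    Bx, By, Bz = B
    return [
        [0.0, Ex, Ey, Ez],
        [-Ex, 0.0, Bz, -By],
        [-Ey, -Bz, 0.0, Bx],
        [-Ez, By, -Bx, 0.0],
    ]


def cross(a, b):
    """Ordinary 3-vector cross product."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def lorentz_force(q, V, E, B):
    """Return (f^a, U_a) with f^a = q U_b F^ba, as in (4.30)."""
    gamma = 1.0 / math.sqrt(1.0 - sum(v * v for v in V))
    U_lower = [gamma] + [-gamma * v for v in V]
    F = faraday_upper(E, B)
    f = [q * sum(U_lower[b] * F[b][a] for b in range(4)) for a in range(4)]
    return f, U_lower


# Arbitrary sample values (assumptions for illustration only).
q, V = 2.0, (0.1, 0.2, 0.3)
E, B = (1.0, 2.0, 3.0), (4.0, 5.0, 6.0)

f, U_lower = lorentz_force(q, V, E, B)
gamma = U_lower[0]
VxB = cross(V, B)
expected = [q * gamma * (E[i] + VxB[i]) for i in range(3)]

print("spatial part of f^a      :", f[1:])
print("q gamma (E + V x B)      :", expected)
print("f^a U_a (should vanish)  :", sum(f[a] * U_lower[a] for a in range(4)))
```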

Exercise 4.4 Express the Lorentz scalars

$$ F^{ab} F_{ab}, \qquad {}^*F^{ab} F_{ab} \tag{4.31} $$

in terms of $\vec{E}$ and $\vec{B}$. Suppose that $\vec{E}$ vanishes in some inertial frame. Show that $\vec{E}$ must be perpendicular to $\vec{B}$ in all frames. Is it possible for $\vec{B} = 0$, $\vec{E} \neq 0$ in one frame and $\vec{E} = 0$, $\vec{B} \neq 0$ in another?

4.7 Potential Form

Definition 4.2 Electromagnetic potential The electromagnetic potential is given by the form

$$ \phi_a = (\phi_e, -\vec{A}) \tag{4.32} $$

where $\phi_e$ is the static electric potential and $\vec{A}$ is the magnetic vector potential.

The Faraday tensor involves antisymmetric derivatives of $\phi$:

$$ F_{ab} = \partial_b \phi_a - \partial_a \phi_b. \tag{4.33} $$

This definition is consistent with the assignments

$$ \vec{E} = -\nabla\phi_e - \partial_t \vec{A}, \tag{4.34} $$

$$ \vec{B} = \nabla\times\vec{A}. \tag{4.35} $$

4.7.1 Advantage – Internal Structure Equations

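These are automatically satisfied:

$$ \partial_a F_{bc} + \partial_b F_{ca} + \partial_c F_{ab} = \ ? $$

Using the electromagnetic potential,

$$ \Longrightarrow\ \partial_a (\partial_c \phi_b - \partial_b \phi_c) + \partial_b (\partial_a \phi_c - \partial_c \phi_a) + \partial_c (\partial_b \phi_a - \partial_a \phi_b) = 0 $$

just by cancellations.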

4.7.2 Advantage – Source Equations

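The equation

$$ \partial_b F^{ab} = j^a \tag{4.36} $$

becomes

$$ \partial_b (\partial^b \phi^a - \partial^a \phi^b) = j^a \tag{4.37} $$

where $\phi^a = g^{ab}\phi_b$.

We can write this as

$$ \Box^2 \phi^a - \partial^a (\partial_b \phi^b) = j^a $$

where:

\begin{align}
\Box^2 &\equiv \text{d’Alembertian} \\
&= \partial_b \partial^b \\
&= \frac{\partial^2}{\partial t^2} - \frac{\partial^2}{\partial x^2} - \frac{\partial^2}{\partial y^2} - \frac{\partial^2}{\partial z^2} \\
&= \frac{\partial^2}{\partial t^2} - \nabla^2
\end{align}

Thus, Maxwell’s equations reduce to a single source equation, with the internal equations being automatic and no longer needed.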

4.8 Gauge Transformations

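Recall that $\vec{B} = \nabla\times\vec{A}$. Suppose we apply the gauge transformation $\vec{A}' = \vec{A} + \nabla\psi$ for some function $\psi$. Then

\begin{align}
\vec{B}' &= \nabla\times\vec{A}' \tag{4.38} \\
&= \nabla\times\vec{A} + \nabla\times\nabla\psi \tag{4.39} \\
&= \nabla\times\vec{A} + 0 \tag{4.40} \\
&= \vec{B}. \tag{4.41}
\end{align}

Similarly, if $\phi'_a = \phi_a + \partial_a \psi$, then

\begin{align}
F'_{ab} &= \partial_b (\phi_a + \partial_a \psi) - \partial_a (\phi_b + \partial_b \psi) \\
&= \partial_b \phi_a - \partial_a \phi_b + (\partial_b \partial_a \psi - \partial_a \partial_b \psi) \\
&= F_{ab}.
\end{align}

The potentials are therefore not unique, and we are free to choose the most convenient potential, $\phi_a$, to solve the problem.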

4.9 Lorentz Gauge

Suppose we try a potential, $\phi'$, where

$$ \partial^a \phi'_a = h $$

for some function $h$. Then we may apply the gauge transformation

$$ \phi'_a = \phi_a + \partial_a \psi $$

where $\psi$ satisfies

$$ \partial_a \partial^a \psi = h. $$

This equation can be shown to always have a solution, and so we are left with a new potential $\phi$ which satisfies

$$ \partial^a \phi_a = 0. \quad \text{“Lorentz Gauge”} \tag{4.42} $$

In Lorentz gauge, the source equation becomes

$$ \Box^2 \phi^a = j^a. \tag{4.43} $$

4.10 Light Waves

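In the vacuum, $j^a = 0$

\begin{align}
\Longrightarrow\ \Box^2 \phi_a &= 0 \tag{4.44} \\
\Longrightarrow\ \partial_b \partial^b \phi_a &= 0 \tag{4.45} \\
\Longrightarrow\ \frac{\partial^2 \phi_a}{\partial t^2} - \frac{\partial^2 \phi_a}{\partial x^2} - \frac{\partial^2 \phi_a}{\partial y^2} - \frac{\partial^2 \phi_a}{\partial z^2} &= 0 \tag{4.46}
\end{align}

which is the wave equation, with solution of the form

\begin{align}
\phi_a &= C_a e^{i k_b x^b} \tag{4.47} \\
&= C_a e^{i(\omega t - \vec{k}\cdot\vec{x})}. \tag{4.48}
\end{align}

Here $k_a = (\omega, -\vec{k})$ is the wave vector, and $C_a$ is the amplitude.

Check:

\begin{align}
\frac{\partial^2 \phi_a}{\partial t^2} &= -\omega^2 \phi_a \tag{4.49} \\
\frac{\partial^2 \phi_a}{\partial x^2} &= -k_x^2 \phi_a \tag{4.50} \\
&\ \ \text{etc.} \ldots \tag{4.51}
\end{align}

Substituting these into the wave equation gives

$$ -\omega^2 + \vec{k}^2 = 0 \tag{4.53} $$

or

$$ \omega = \pm\left|\vec{k}\right|. \tag{4.54} $$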
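The dispersion relation can also be seen numerically. The following sketch (illustrative names and values; second derivatives are estimated with central differences rather than computed exactly) applies the d’Alembertian to a single plane-wave component and shows that the residual is small when $\omega = |\vec{k}|$ and large otherwise.

```python
import cmath
import math


def plane_wave(omega, k):
    """Return the function phi(t, x, y, z) = exp(i (omega t - k.x))."""
    def phi(t, x, y, z):
        return cmath.exp(1j * (omega * t - (k[0] * x + k[1] * y + k[2] * z)))
    return phi


def dalembertian(f, point, h=1e-3):
    """Central-difference estimate of (d_t^2 - d_x^2 - d_y^2 - d_z^2) f at point."""
    def second(i):
        plus, minus = list(point), list(point)
        plus[i] += h
        minus[i] -= h
        return (f(*plus) - 2.0 * f(*point) + f(*minus)) / (h * h)
    return second(0) - second(1) - second(2) - second(3)


k = (1.0, 2.0, 2.0)                       # |k| = 3 (illustrative choice)
k_norm = math.sqrt(sum(c * c for c in k))
point = (0.3, 0.1, -0.2, 0.5)             # arbitrary event (t, x, y, z)

on_shell = dalembertian(plane_wave(k_norm, k), point)
off_shell = dalembertian(plane_wave(2.0 * k_norm, k), point)
print("|Box^2 phi| with omega = |k|  :", abs(on_shell))
print("|Box^2 phi| with omega = 2|k| :", abs(off_shell))
```

The small non-zero residual in the first case is discretisation error from the finite step `h`, not a failure of (4.54).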

Exercise 4.5 Suppose that magnetic monopoles exist in nature. Then, in addition to the electric charge-current 4-vector $j_e$, there is a magnetic charge-current 4-vector $j_m = (\rho_m, j_{m\,x}, j_{m\,y}, j_{m\,z})$ where $\rho_m$ is the magnetic charge density and $j_{m\,x}$ is the current of magnetic charge in the $x$ direction. The Maxwell equations become

$$ \partial_b F^{ab} = j_e^a $$

$$ \partial_b\, {}^*F^{ab} = j_m^a. $$

a. Consider the second equation $\partial_b\, {}^*F^{ab} = j_m^a$. Find the four equations for $\vec{E}$ and $\vec{B}$ generated by letting $a = 0$, $a = 1$, $a = 2$, and $a = 3$.

b. Show that magnetic charge is conserved; i.e. show that

$$ \partial_a j_m^a = 0. $$

c. The Lorentz force on a magnetic monopole of charge $q_m$ and 4-velocity $U^a$ is

$$ f^a = \frac{dp^a}{d\tau} = q_m U_b\, {}^*F^{ab}. $$

Find the four equations generated by letting $a = 0$, $a = 1$, $a = 2$, and $a = 3$. Express these in terms of the three-velocity $\vec{V} = d\vec{x}/dt$ and $\gamma = (1 - V^2)^{-1/2}$.

d. Show that the Lorentz force in the previous item is perpendicular to $\mathbf{U}$ in the sense that

$$ \mathbf{f}\cdot\mathbf{U} = 0. $$

e. Suppose that the Faraday tensor $F_{ab}$ can be written in the form

$$ F_{ab} = \partial_b \phi_a - \partial_a \phi_b $$

for some four-potential $\phi$. Show that the magnetic current 4-vector must vanish, i.e.

$$ j_m = 0. $$

Chapter 5

The Equivalence Principle

There was a minute’s pause perhaps. The Psychologist seemed about to speak to me, but changed his mind. Then the Time Traveller put forth his finger towards the lever. ‘No,’ he said suddenly. ‘Lend me your hand.’ And turning to the Psychologist, he took that individual’s hand in his own and told him to put out his forefinger. So that it was the Psychologist himself who sent forth the model Time Machine on its interminable voyage. We all saw the lever turn. I am absolutely certain there was no trickery. There was a breath of wind, and the lamp flame jumped.

5.1 Inertial mass

“Gravitational mass is equivalent to inertial mass.”

Newton’s second law states that the force on an object is proportional to mass times acceleration. In this section, we will call the mass which appears in this law the inertial mass:

$$ \vec{F} = m_I \vec{a}. \tag{5.1} $$

For example, consider the electro-static interaction between two particles with masses $m_1, m_2$, and charges $q_1, q_2$. Particle 2 feels a force

$$ \vec{F}_2 = \frac{q_1 q_2}{r_{12}^2}\, \hat{r}_{12}. \tag{5.2} $$

The acceleration felt by particle 2 can then be found by combining these two equations:

$$ \vec{a}_2 = \left(\frac{q_2}{m_{2\,I}}\right) \frac{q_1}{r_{12}^2}\, \hat{r}_{12}. \tag{5.3} $$

Thus, $\vec{a}_2$ depends on the ratio of charge to inertial mass, $q_2/m_{2\,I}$.

Next, consider a particle falling to the ground. The Newtonian gravitational force resembles the electrostatic force (both are inverse square laws), with masses replacing electrical charges. We will simply call the gravitational charge $m$. Let $M_\oplus$ be the mass of the Earth, $r$ the distance to the centre of the Earth, and $\hat{r}$ the upwards unit vector. Then the force on an apple falling to the ground can be written

$$ \vec{F} = -C \frac{M_\oplus m}{r^2}\, \hat{r}, \tag{5.4} $$

where $C$ is a constant.

Now, by Newton’s 2nd law

\begin{align}
\vec{F} &= m_I \vec{a} \tag{5.5} \\
\Longrightarrow\ \vec{a} &= \left(\frac{m}{m_I}\right) \left(-C \frac{M_\oplus}{r^2}\right) \hat{r}. \tag{5.6}
\end{align}

Thus, $\vec{a}$ depends on the ratio of gravitational mass to inertial mass, $m/m_I$. Galileo’s experiments showed that $\vec{a}$ is the same for all materials, once air resistance is neglected. This implies that $m/m_I$ is the same constant for all matter.

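Thus, we can combine the two constants $C$ and $m/m_I$ into a single constant

\begin{align}
G &= C \left(\frac{m}{m_I}\right) \tag{5.7} \\
\Longrightarrow\ \vec{a} &= -\frac{G M_\oplus}{r^2}\, \hat{r}. \tag{5.8}
\end{align}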

We now see a fundamental difference between the electrical force and the gravitational force: the former depends on a charge-to-inertial-mass ratio, but the latter does not. The inertial forces (also known as fictitious forces) are similar to gravity in this respect: the acceleration has no dependence on mass.

5.2 Free Fall

Einstein had difficulty incorporating both gravity and inertial forces into special relativity. His great insight was to treat them together, using the principle of equivalence to eliminate the dependence on mass. Now, inertial forces can be eliminated by transferring to a non-accelerating frame. Einstein reasoned that gravitational forces can be removed in a similar way by transferring to a free-fall frame.

Example: Consider an object in the Earth’s gravitational field near $r = R_\oplus$. Let $z$ be the vertical direction.

$$ \frac{d^2 z}{dt^2} = -g, \qquad g = \frac{G M_\oplus}{R_\oplus^2} \approx 10\ \mathrm{m\,s^{-2}} $$

Initial conditions:

$$ z_0 = z(t = 0) = h, \qquad \dot{z}_0 = V_0 $$

$$ \Longrightarrow\ \dot{z}(t) = -g t + V_0, \qquad z(t) = h - \tfrac{1}{2} g t^2 + V_0 t $$

Let’s transform to new co-ordinates:

$$ \xi(z, t) = z + \tfrac{1}{2} g t^2 $$

Then:

$$ \xi = h + V_0 t, \qquad \dot{\xi} = V_0 = \text{constant}, \qquad \ddot{\xi} = 0 $$

Since the acceleration is zero, there is no gravitational force.
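The change of coordinates can be tried out numerically. In this sketch (hypothetical function names; $g = 10$, $h = 100$ and $V_0 = 5$ in SI units are illustrative values) the second time derivative is estimated by a central difference, first for the height $z(t)$ and then for the free-fall coordinate $\xi(t)$.

```python
def z(t, h=100.0, v0=5.0, g=10.0):
    """Height of the falling object as a function of time."""
    return h + v0 * t - 0.5 * g * t * t


def xi(t, h=100.0, v0=5.0, g=10.0):
    """Free-fall coordinate xi = z + g t^2 / 2."""
    return z(t, h, v0, g) + 0.5 * g * t * t


def second_derivative(f, t, dt=1e-3):
    """Central-difference estimate of d^2 f / dt^2."""
    return (f(t + dt) - 2.0 * f(t) + f(t - dt)) / (dt * dt)


t = 2.0
print("acceleration in z  :", second_derivative(z, t))    # close to -g
print("acceleration in xi :", second_derivative(xi, t))   # close to 0
```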

5.2.1 Locally Inertial Frames

Definition Locally Inertial Frame (LIF) An LIF is a reference frame with origin at space-time event $P$. An object at $P$ is in free-fall (if there are no external forces). Near $P$, there are no gravitational forces, and special relativity holds.

In an LIF

a. $$ g_{ab}\big|_P = \eta_{ab}, \tag{5.9} $$

b. $$ \partial_c g_{ab}\big|_P = 0 \tag{5.10} $$

for all $a, b, c$. The second condition follows from the isotropy of space-time.

5.3 Geodesics

Definition Geodesic

a. A geodesic is a path on a manifold $M$ which is an extremum of length (i.e. max or min distance between two points).

b. A geodesic is a path which has zero covariant acceleration (to be defined later).


5.3.1 Examples

a. $\mathbb{R}^2$. Geodesics are straight lines.

b. $S^2$. Geodesics are the great circles, for example the equator.

c. Minkowski Space $M^4$.

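In special relativity, “distance” becomes proper time, $\tau$:

$$ d\tau^2 = ds^2 = dt^2 - dx^2 - dy^2 - dz^2 $$

[Figure: space-time diagram with axes $x$ and $t$, showing two paths from P to Q: Path 1 (object at rest) and Path 2.]

Path 1:

$$ \tau_1 = \int d\tau = \int dt = \Delta t. \tag{5.11} $$

Path 2:

\begin{align}
\tau_2 &= \int \sqrt{dt^2 - dx^2 - dy^2 - dz^2} \tag{5.12} \\
&< \int \sqrt{dt^2} = \Delta t. \tag{5.13}
\end{align}

Thus, $\tau_1 > \tau_2$. An object at rest has maximum proper time (and follows a geodesic in space-time).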
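A concrete pair of paths makes the inequality tangible. The sketch below is an illustration with made-up events: it sums $\sqrt{\Delta t^2 - \Delta x^2}$ over straight segments, first for a path at rest from P to Q and then for a path that moves out at speed 0.6 and returns. The function name `proper_time` is hypothetical.

```python
import math


def proper_time(path):
    """Proper time along a piecewise-straight path of (t, x) events (c = 1).

    Each segment must be time-like, i.e. |dx| < dt.
    """
    total = 0.0
    for (t0, x0), (t1, x1) in zip(path, path[1:]):
        dt, dx = t1 - t0, x1 - x0
        total += math.sqrt(dt * dt - dx * dx)
    return total


# Illustrative events: P = (0, 0), Q = (10, 0).
path_1 = [(0.0, 0.0), (10.0, 0.0)]              # at rest
path_2 = [(0.0, 0.0), (5.0, 3.0), (10.0, 0.0)]  # out and back at speed 0.6

print("tau_1 =", proper_time(path_1))
print("tau_2 =", proper_time(path_2))
```

With these numbers the straight path accumulates 10 units of proper time and the bent path 8, in agreement with $\tau_1 > \tau_2$.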

5.3.2 The Geodesic Equation

In general relativity, the orbit of a satellite is a geodesic in a space-time distorted by the mass of the Earth. How can we find the orbit? We seek an expression for the acceleration of the satellite in ordinary coordinates fixed to the Earth. In the locally inertial (free-fall) frame, of course, the acceleration is easy: it is zero!

Let $(\xi^0, \xi^1, \xi^2, \xi^3)$ be coordinates in the satellite’s LIF, while $X$ are coordinates fixed to the Earth. In other words, near the spacetime event $(\xi^0, \xi^1, \xi^2, \xi^3) = (0, 0, 0, 0)$, the satellite is at rest in the LIF coordinates, and experiences no forces and no acceleration:


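$$ U^a_{LIF} = \frac{d\xi^a}{d\tau} = \begin{pmatrix} 1 \\ 0 \\ 0 \\ 0 \end{pmatrix}; \tag{5.14} $$

$$ \frac{dU^a_{LIF}}{d\tau} = \frac{d^2 \xi^a}{d\tau^2} = 0. \tag{5.15} $$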

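Let $\mathbf{U}$ be the 4-velocity in Earth coordinates $X$. We now transform $\mathbf{U}_{LIF}$ to $\mathbf{U}$:

\begin{align}
0 = \frac{d}{d\tau} U^d_{LIF} &= \frac{d}{d\tau} \left(\frac{\partial\xi^d}{\partial X^b} U^b\right) \tag{5.16} \\
&= \left(\frac{d}{d\tau} \frac{\partial\xi^d}{\partial X^b}\right) U^b + \frac{\partial\xi^d}{\partial X^b} \frac{d}{d\tau} U^b \tag{5.17}
\end{align}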

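by the product rule. To understand the first term, we use the fact that

\begin{align}
\frac{d}{d\tau} &= \frac{dt}{d\tau} \frac{\partial}{\partial t} + \frac{dx}{d\tau} \frac{\partial}{\partial x} + \ldots \tag{5.18} \\
&= U^0 \partial_0 + U^1 \partial_1 + \ldots \tag{5.19} \\
&= U^c \partial_c \tag{5.20}
\end{align}

to obtain

\begin{align}
0 &= U^b \left(U^c \partial_c \frac{\partial\xi^d}{\partial X^b}\right) + \frac{\partial\xi^d}{\partial X^b} \frac{d}{d\tau} U^b \tag{5.21} \\
&= U^b U^c \frac{\partial^2 \xi^d}{\partial X^b \partial X^c} + \frac{\partial\xi^d}{\partial X^b} \frac{d}{d\tau} U^b. \tag{5.22}
\end{align}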

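The last term contains what we are searching for: the acceleration in the Earth frame $dU^b/d\tau$. However, we need to free this expression from the transformation matrix $\partial\xi^d/\partial X^b$. To rid ourselves of this unwanted matrix, we multiply by its inverse $\partial X^a/\partial\xi^d$:

$$ \frac{\partial X^a}{\partial\xi^d} \frac{\partial\xi^d}{\partial X^b} = \delta^a_b. \tag{5.23} $$

This gives

\begin{align}
0 &= \frac{\partial X^a}{\partial\xi^d} \left(U^b U^c \frac{\partial^2 \xi^d}{\partial X^b \partial X^c} + \frac{\partial\xi^d}{\partial X^b} \frac{d}{d\tau} U^b\right) \tag{5.24} \\
&= \frac{\partial X^a}{\partial\xi^d} \frac{\partial^2 \xi^d}{\partial X^b \partial X^c} U^b U^c + \delta^a_b \frac{d}{d\tau} U^b. \tag{5.25}
\end{align}

Now, $\delta^a_b\, dU^b/d\tau = dU^a/d\tau$. Rearranging terms, we finally obtain

$$ \frac{dU^a}{d\tau} + \frac{\partial X^a}{\partial\xi^d} \frac{\partial^2 \xi^d}{\partial X^b \partial X^c} U^b U^c = 0. \tag{5.26} $$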

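Equation (5.26) can be written in the form

$$ \frac{dU^a}{d\tau} + \Gamma^a_{bc} U^b U^c = 0, \tag{5.27} $$

where the Christoffel symbols $\Gamma^a_{bc}$ are given by

$$ \Gamma^a_{bc} = \frac{\partial X^a}{\partial\xi^d} \frac{\partial^2 \xi^d}{\partial X^b \partial X^c}. \tag{5.28} $$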

Equation (5.26) is called the geodesic equation, and governs the motion of matter in the absence of forces. A more useful formula for the Christoffel symbols will be derived in the exercise below, and (in a different way) in the next chapter. Equation (5.26) has been introduced here because of its physical meaning. It computes the apparent acceleration $dU^a/d\tau$ of an object in one frame ($X$) in terms of transformations from the LIF. We experience gravitational forces only because we insist on viewing things from a non-inertial frame! Like any other fictitious or inertial force, gravitation arises from the acceleration of one frame of reference with respect to the inertial frames. We do feel the effects of weight, particularly after a long hike uphill, but we can now view this as the effect of forces coming from the ground under our feet, accelerating us away from our natural state: free-fall.
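Formula (5.28) can be exercised on a simple stand-in that is not taken from the notes: the flat plane, with Cartesian coordinates $\xi = (r\cos\theta, r\sin\theta)$ playing the role of the inertial frame and polar coordinates $X = (r, \theta)$ playing the role of the non-inertial frame. The sketch below (hypothetical function name) evaluates the Jacobian, its inverse and the second derivatives analytically and contracts them as in (5.28).

```python
import math


def christoffel_polar(r, theta):
    """Gamma^a_{bc} from (5.28) for X = (r, theta), xi = (r cos(theta), r sin(theta)).

    Returns a nested list gamma[a][b][c]; index 0 is r, index 1 is theta.
    """
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    # jac[d][b] = d xi^d / d X^b
    jac = [[cos_t, -r * sin_t],
           [sin_t, r * cos_t]]
    det = jac[0][0] * jac[1][1] - jac[0][1] * jac[1][0]  # equals r
    # inv[a][d] = d X^a / d xi^d (inverse of the 2x2 Jacobian)
    inv = [[jac[1][1] / det, -jac[0][1] / det],
           [-jac[1][0] / det, jac[0][0] / det]]
    # hess[d][b][c] = d^2 xi^d / (d X^b d X^c)
    hess = [
        [[0.0, -sin_t], [-sin_t, -r * cos_t]],
        [[0.0, cos_t], [cos_t, -r * sin_t]],
    ]
    return [[[sum(inv[a][d] * hess[d][b][c] for d in range(2))
              for c in range(2)] for b in range(2)] for a in range(2)]


gamma = christoffel_polar(2.0, 0.3)
print("Gamma^r_{theta theta} =", gamma[0][1][1])   # expected -r
print("Gamma^theta_{r theta} =", gamma[1][0][1])   # expected 1/r
```

The non-zero symbols here encode the familiar centrifugal and Coriolis-like terms of polar coordinates, a purely "fictitious" acceleration in a space with no gravity at all.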

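Exercise 5.1 Here we derive an expression for the Christoffel symbols in terms of the metric. To simplify the notation, $g_{ab}$ will denote the metric in the non-inertial frame, $g^L_{ab}$ the metric in the LIF, and $\partial_a \equiv \partial/\partial X^a$ (i.e. the shorthand for partial derivatives applies only to the non-inertial $X$ frame).

a. Show that

$$ \partial_c g_{ab} = \left[(\partial_c \partial_a \xi^f)(\partial_b \xi^g) + (\partial_a \xi^f)(\partial_c \partial_b \xi^g)\right] g^L_{fg}. \tag{5.29} $$

b. Show that

$$ \Gamma_{abc} = g_{ae} \Gamma^e_{bc} = (\partial_a \xi^f)(\partial_b \partial_c \xi^g)\, g^L_{fg}. \tag{5.30} $$

c. Show that

$$ \Gamma_{abc} + \Gamma_{bca} = \partial_c g_{ab}. \tag{5.31} $$

d. Hence show that

$$ \Gamma^a_{bc} = \tfrac{1}{2} g^{ad} (\partial_b g_{cd} + \partial_c g_{db} - \partial_d g_{bc}). \tag{5.32} $$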

5.3.3 Covariant Acceleration

Theorem. If the components of a tensor vanish in one co-ordinate system, then they vanish in all frames.

Proof. This follows directly from the transformation laws for a tensor. For example, if in frame A we have $M^{ab}_A = 0$ for all $a, b$, then in frame B

$$ M^{cd}_B = \frac{\partial B^c}{\partial A^a} \frac{\partial B^d}{\partial A^b} (0)^{ab} = 0. \tag{5.33} $$


Note that the acceleration term $dU^a/d\tau$ is zero in the LIF, but non-zero in the Earth frame. Thus it is not a tensor! To understand the motion of objects in general relativity further, we must learn how to differentiate vectors in a covariant way (i.e. so that the result is a tensor).

A vector $\mathbf{U}$ involves not only its components $U^a$, but also the basis vectors of the coordinate system. If space-time is warped, or even if we are simply using non-Cartesian coordinates, these basis vectors will point in different directions at different points. In the next chapter, we will show how to differentiate vectors by including both components and basis vectors.

Chapter 6

Covariant Derivatives

One of the candles on the mantel was blown out, and the little machine suddenly swung round, became indistinct, was seen as a ghost for a second perhaps, as an eddy of faintly glittering brass and ivory; and it was gone–vanished! Save for the lamp the table was bare.

How do we differentiate vectors (& tensors)?

There are many examples in physics where derivatives need to be extended. For example, in fluid mechanics the Navier-Stokes force equation (for unit density) reads

$$ \frac{D\vec{V}}{Dt} = -\nabla p + \nu\nabla^2 \vec{V} \tag{6.1} $$

where $D/Dt$ is the total Lagrangian derivative

$$ D/Dt = \partial/\partial t + \vec{V}\cdot\nabla. \tag{6.2} $$

In Quantum mechanics, Schrödinger’s equation has the form

$$ \left(\frac{1}{2m} \left(\frac{\hbar}{i}\nabla\right)^2 + V\right) \psi = E\psi. \tag{6.3} $$

With an applied magnetic field the gradient $\nabla$ is replaced by the ‘gauge covariant derivative’ $\nabla - \frac{ie}{\hbar}\vec{A}$, where $\nabla\times\vec{A} = \vec{B}$. In this context the vector potential $\vec{A}$ is sometimes called the ‘electromagnetic connection’.

6.1 Non-Euclidean Geometry

Basis Vectors A Co-ordinate Line is a line parameterized by one of the co-ordinates. A basis vector is a tangent vector to a co-ordinate line. Let $\mathbf{e}_{ⓐ}$ be the basis vector tangent to the co-ordinate line following $x^a$.

Note A circled subscript appears because the subscript chooses between the vectors in a set, e.g. the set $\{\mathbf{e}_{⓪}, \mathbf{e}_{①}, \mathbf{e}_{②}, \mathbf{e}_{③}\}$; ordinary subscripts choose the component of a single vector.


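Thus, in component form (for co-ordinates $X$)

$$ \mathbf{e}_{①} = \begin{pmatrix} dX^0/dX^1 \\ dX^1/dX^1 \\ dX^2/dX^1 \\ dX^3/dX^1 \end{pmatrix} = \begin{pmatrix} 0 \\ 1 \\ 0 \\ 0 \end{pmatrix} \tag{6.4} $$

$$ \Longrightarrow\ e_{①}^{\,0} = 0, \quad e_{①}^{\,1} = 1, \ \ldots \tag{6.5} $$

$$ \text{or} \quad e_{①}^{\,a} = \delta^a_1 \tag{6.6} $$

In general,

$$ e_{ⓒ}^{\,a} = \delta^a_c. \tag{6.7} $$

Note that the basis vectors need not be orthogonal or of unit size:

\begin{align}
\mathbf{e}_{ⓑ}\cdot\mathbf{e}_{ⓒ} &= g_{ad}\, e_{ⓑ}^{\,a}\, e_{ⓒ}^{\,d} \tag{6.8} \\
&= g_{ad}\, \delta^a_b\, \delta^d_c \tag{6.9} \\
&= g_{bc} \tag{6.10}
\end{align}

i.e. the scalar product of basis vectors $\mathbf{e}_{ⓑ}$ and $\mathbf{e}_{ⓒ}$ is equal to element $g_{bc}$ of the metric.

All vectors can be written as sums of basis vectors:

$$ \mathbf{V} = \begin{pmatrix} V^0 \\ V^1 \\ V^2 \\ V^3 \end{pmatrix} = V^c \mathbf{e}_{ⓒ}. \tag{6.11} $$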


6.2 The Covariant Derivative

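Derivatives must satisfy the product rule. Thus for the derivative in the $X^b$ direction

$$ \nabla_b \mathbf{V} = \nabla_b (V^c \mathbf{e}_{ⓒ}) = (\nabla_b V^c)\, \mathbf{e}_{ⓒ} + V^c (\nabla_b \mathbf{e}_{ⓒ}). \tag{6.12} $$

$V^c$ is a number at each point (i.e. a function of position), so we can write

\begin{align}
(\nabla_b V^c) &= \partial_b V^c = \frac{\partial V^c}{\partial X^b} \tag{6.13} \\
\Longrightarrow\ \nabla_b \mathbf{V} &= (\partial_b V^c)\, \mathbf{e}_{ⓒ} + V^c (\nabla_b \mathbf{e}_{ⓒ}). \tag{6.14}
\end{align}

We need to define the last term in brackets. The object $\nabla_b \mathbf{e}_{ⓒ}$ is itself a vector. We will name its components with the capital Greek letter $\Gamma$:

$$ \nabla_b \mathbf{e}_{ⓒ} = \begin{pmatrix} \Gamma^0_{bc} \\ \Gamma^1_{bc} \\ \Gamma^2_{bc} \\ \Gamma^3_{bc} \end{pmatrix}. \tag{6.15} $$

In terms of the basis vectors,

$$ \nabla_b \mathbf{e}_{ⓒ} = \Gamma^0_{bc} \mathbf{e}_{⓪} + \Gamma^1_{bc} \mathbf{e}_{①} + \Gamma^2_{bc} \mathbf{e}_{②} + \Gamma^3_{bc} \mathbf{e}_{③}, \tag{6.16} $$

or

$$ \nabla_b \mathbf{e}_{ⓒ} = \Gamma^a_{bc} \mathbf{e}_{ⓐ}. \tag{6.17} $$

The object $\Gamma^a_{bc}$ is called a metric connection, or alternatively a Christoffel symbol.

Let us go back to calculating the gradient of a vector:

\begin{align}
\nabla_b \mathbf{V} &= (\partial_b V^c)\, \mathbf{e}_{ⓒ} + V^c (\nabla_b \mathbf{e}_{ⓒ}) \tag{6.18} \\
&= (\partial_b V^c)\, \mathbf{e}_{ⓒ} + V^c (\Gamma^a_{bc} \mathbf{e}_{ⓐ}). \tag{6.19}
\end{align}

Exchange $a \leftrightarrow c$ in the 1st term on the RHS:

\begin{align}
\Longrightarrow\ \nabla_b \mathbf{V} &= (\partial_b V^a)\, \mathbf{e}_{ⓐ} + V^c \Gamma^a_{bc} \mathbf{e}_{ⓐ} \tag{6.20} \\
&= (\partial_b V^a + V^c \Gamma^a_{bc})\, \mathbf{e}_{ⓐ}. \tag{6.21}
\end{align}

This is a vector, with components

$$ (\nabla_b \mathbf{V})^a = \partial_b V^a + \Gamma^a_{bc} V^c. $$

To recap: The covariant derivative, $\nabla_b$ (derivative in the $X^b$ direction)

a. Produces tensors.

b. Obeys the product rule.

c. For a scalar function $f$,

$$ \nabla_b f = \partial_b f = \frac{\partial f}{\partial X^b}. \tag{6.22} $$

d. There exists a set of numbers $\Gamma^a_{bc}$, where

$$ (\nabla_b \mathbf{V})^a = \partial_b V^a + \Gamma^a_{bc} V^c. $$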


6.3 Derivatives of Other Tensors

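Use the properties listed above.

Example: Given a 2nd rank tensor, $M_{cd}$, find $(\nabla_b M)_{cd}$.

To do this, let $V^c, W^d$ be arbitrary vectors. Let $f = M_{cd} V^c W^d$ be a scalar function.

By the product rule for $\partial_b$,

$$ \partial_b f = (\partial_b M_{cd}) V^c W^d + M_{cd} (\partial_b V^c) W^d + M_{cd} V^c (\partial_b W^d). \tag{6.23} $$

Also by the product rule for $\nabla_b$,

$$ \nabla_b f = (\nabla_b M)_{cd} V^c W^d + M_{cd} (\nabla_b \mathbf{V})^c W^d + M_{cd} V^c (\nabla_b \mathbf{W})^d. \tag{6.24} $$

Meanwhile, $\nabla_b f = \partial_b f$, so

$$ (\nabla_b M)_{cd} V^c W^d = (\partial_b M_{cd}) V^c W^d + M_{cd} \left(\partial_b V^c - (\nabla_b \mathbf{V})^c\right) W^d + M_{cd} V^c \left(\partial_b W^d - (\nabla_b \mathbf{W})^d\right). $$

Now apply rule (d):

$$ (\nabla_b M)_{cd} V^c W^d = (\partial_b M_{cd}) V^c W^d + M_{cd} \left[(-\Gamma^c_{ba} V^a) W^d + V^c (-\Gamma^d_{ba} W^a)\right]. \tag{6.25} $$

Next we factor out $V$ and $W$. To do this, we swap $a \leftrightarrow c$ in the second to last term, and $a \leftrightarrow d$ in the last term:

$$ (\nabla_b M)_{cd} V^c W^d = (\partial_b M_{cd}) V^c W^d - (\Gamma^a_{bc} M_{ad} + \Gamma^a_{bd} M_{ca}) V^c W^d. \tag{6.26} $$

As the above is true for any values of $V^c$ and $W^d$ we can cancel $V^c W^d$ from both sides, with the final result

$$ (\nabla_b M)_{cd} = \partial_b M_{cd} - \Gamma^a_{bc} M_{ad} - \Gamma^a_{bd} M_{ca}. \tag{6.27} $$

In general, we will obtain one $\Gamma$ for each index of a tensor, with a minus sign for each subscript (form) index and a plus sign for each superscript (vector) index.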

Exercise 6.1 The covariant derivative $\nabla_b$ of a vector field $X^a$ is

$$ \nabla_b X^a = \partial_b X^a + \Gamma^a_{bc} X^c. \tag{6.28} $$

Also, the covariant derivative of a scalar is the same as the partial derivative:

$$ \nabla_a f = \partial_a f. \tag{6.29} $$

Use these equations to find an expression for the covariant derivative $\nabla_b$ of a form $W_c$, i.e. find $\nabla_b W_c$.

Exercise 6.2 In General Relativity, the relation between the Faraday tensor and the vector potential $\phi_a$ is defined by

$$ F_{ab} = (\nabla_b \phi)_a - (\nabla_a \phi)_b. \tag{6.30} $$

Write out the right hand side in terms of Christoffel symbols. Show that the Christoffel terms cancel, leaving

$$ F_{ab} = (\partial_b \phi)_a - (\partial_a \phi)_b. \tag{6.31} $$

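Theorem: Suppose the $\nabla$ operator (covariant derivative) satisfies

a. $$ \nabla \mathbf{g} = 0 \quad (\mathbf{g} = \text{metric}) \tag{6.32} $$

b. $$ \Gamma^a_{bc} = \Gamma^a_{cb} \quad \text{(“zero torsion”)} \tag{6.33} $$

then the Christoffel symbols are determined by

$$ \Gamma^a_{bc} = \tfrac{1}{2} g^{ad} (\partial_b g_{cd} + \partial_c g_{db} - \partial_d g_{bc}). \tag{6.34} $$

Proof: $g_{cd}$ is a 2nd order tensor with two lower indices, so

$$ (\nabla_b g)_{cd} = \partial_b g_{cd} - \Gamma^a_{bc} g_{ad} - \Gamma^a_{bd} g_{ca}. \tag{6.35} $$

Define

$$ \Gamma_{dbc} \equiv g_{ad} \Gamma^a_{bc}. \tag{6.36} $$

Then

\begin{align}
(\nabla_b g)_{cd} &= \partial_b g_{cd} - \Gamma_{dbc} - \Gamma_{cbd} \tag{6.37} \\
&= 0 \tag{6.38} \\
\Longrightarrow\ \partial_b g_{cd} &= \Gamma_{dbc} + \Gamma_{cbd}. \tag{6.39}
\end{align}

Next apply assumption (b): $\Gamma_{cdb} = \Gamma_{cbd}$, so

$$ \partial_b g_{cd} = \Gamma_{dbc} + \Gamma_{cdb}. \tag{6.40} $$

Obtain two more equations by cycling $b \to c$, $c \to d$, and $d \to b$:

\begin{align}
\Longrightarrow\ \partial_c g_{db} &= \Gamma_{bcd} + \Gamma_{dbc} \tag{6.41} \\
\partial_d g_{bc} &= \Gamma_{cdb} + \Gamma_{bcd}. \tag{6.42}
\end{align}

Take the sum (equation (6.40) + equation (6.41) - equation (6.42)):

\begin{align}
\partial_b g_{cd} + \partial_c g_{db} - \partial_d g_{bc} &= 2\Gamma_{dbc} + 0 + 0 \tag{6.43} \\
\Longrightarrow\ \Gamma_{dbc} &= \tfrac{1}{2} (\partial_b g_{cd} + \partial_c g_{db} - \partial_d g_{bc}). \tag{6.44}
\end{align}

Finally, use $g_{db} = g_{bd}$ in the second to last term and let

$$ \Gamma^a_{bc} = g^{ad} \Gamma_{dbc} \tag{6.45} $$

to prove the theorem.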
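Formula (6.34) lends itself to a small numerical routine. The sketch below is an illustration under stated assumptions: it handles diagonal metrics only (so the inverse metric is just the reciprocal of each diagonal entry), estimates $\partial_d g_{ab}$ by central differences, and is tried on the flat plane in polar coordinates, $g = \mathrm{diag}(1, r^2)$, which is an example chosen here rather than one from the notes. The function names are hypothetical.

```python
def christoffel_from_metric(metric_diag, x, h=1e-6):
    """Gamma^a_{bc} from (6.34) for a diagonal metric.

    metric_diag(x) must return the list of diagonal entries g_aa at the point x.
    Returns a nested list gamma[a][b][c].
    """
    n = len(x)

    def d_diag(d, i):
        """Central-difference estimate of d g_ii / d X^d at x."""
        plus, minus = list(x), list(x)
        plus[d] += h
        minus[d] -= h
        return (metric_diag(plus)[i] - metric_diag(minus)[i]) / (2.0 * h)

    def dg(d, i, j):
        """d g_ij / d X^d; off-diagonal components vanish by assumption."""
        return d_diag(d, i) if i == j else 0.0

    g = metric_diag(x)
    # Only the d = a term of g^{ad} survives for a diagonal metric.
    return [[[0.5 / g[a] * (dg(b, c, a) + dg(c, a, b) - dg(a, b, c))
              for c in range(n)] for b in range(n)] for a in range(n)]


def polar_metric(x):
    """Diagonal of the flat-plane metric in polar coordinates X = (r, theta)."""
    r = x[0]
    return [1.0, r * r]


gamma = christoffel_from_metric(polar_metric, [2.0, 0.3])
print("Gamma^r_{theta theta} =", gamma[0][1][1])   # expected -r
print("Gamma^theta_{r theta} =", gamma[1][0][1])   # expected 1/r
```

A full implementation would invert a general (non-diagonal) metric; the diagonal restriction keeps the sketch short and is enough for many textbook coordinate systems.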

Exercise 6.3 Consider a sphere of radius 1 as a 2-dimensional manifold with coordinates $X^1 = \theta$ (colatitude) and $X^2 = \phi$ (longitude). What is the metric $g_{ab}$ and its inverse $g^{bc}$? Find the Christoffel symbols $\Gamma^a_{bc}$ (there are 8 of these for a 2-dimensional manifold). Suppose a geodesic on the sphere is parameterized by $\lambda$. Use the geodesic equation

$$ \frac{d^2 X^a}{d\lambda^2} + \Gamma^a_{bc} \frac{dX^b}{d\lambda} \frac{dX^c}{d\lambda} = 0 \tag{6.46} $$

to find $\dfrac{d^2\theta}{d\lambda^2}$ and $\dfrac{d^2\phi}{d\lambda^2}$.

6.3.1 The gradient of the metric in General Relativity

For an object in free-fall (no external forces) the geodesic equation gives

$$ \frac{dU^a}{d\tau} + \Gamma^a_{bc} U^b U^c = 0. \tag{6.47} $$

This is true in all coordinate frames. But in the LIF the object will appear to be at rest (or in uniform motion). Thus the components of $\mathbf{U}$ in the LIF remain constant, i.e. $dU^a/d\tau = 0$. This implies that

$$ \Gamma^a_{bc} U^b U^c = 0 \quad \text{in LIF}. \tag{6.48} $$

Since this holds for arbitrary 4-velocities $\mathbf{U}$, we must have the Christoffel symbols vanishing,

$$ \Gamma^a_{bc} = 0 \quad \text{in LIF}. \tag{6.49} $$

(Strictly speaking, only the part of $\Gamma^a_{bc}$ symmetric in the lower indices $b$ and $c$ need vanish. Einstein’s theory employs the simplest assumption, that the antisymmetric part of $\Gamma^a_{bc}$, called the torsion, is always zero. Some modified theories of gravity include a non-zero torsion.)

Now,

$$ (\nabla_c g)_{ab} = \partial_c g_{ab} - \Gamma^d_{ca} g_{db} - \Gamma^d_{cb} g_{ad}. \tag{6.50} $$

In the LIF, however, $\Gamma^d_{ca} = \Gamma^d_{cb} = 0$. Also, by the definition of locally inertial frames, $\partial_c g_{ab} = 0$. Thus

$$ \nabla \mathbf{g} = 0. \tag{6.51} $$

And, since $\nabla \mathbf{g}$ is a tensor, it must vanish in all frames. Thus in General Relativity equation (6.34) can be used to calculate the Christoffel symbols.

6.4 Covariant Directional Derivatives and Acceleration

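Consider a curve $\gamma$ parameterized by $\lambda$ whose tangent vector is $\mathbf{V}$ (see section 1.5.1). For a function $f$, the derivative in the direction along the curve is (recall equation (1.76)):

\begin{align}
\frac{df}{d\lambda} &= \mathbf{V}\cdot\nabla f \tag{6.52} \\
&= V^b \partial_b f. \tag{6.53}
\end{align}

For the directional derivative of a vector along the curve, the covariant gradient $\nabla$ now includes Christoffel symbols. Thus for a vector $\mathbf{W}$,

$$ \frac{D\mathbf{W}}{D\lambda} = \mathbf{V}\cdot\nabla\mathbf{W}, \tag{6.54} $$

or, expressed in components,

\begin{align}
\left(\frac{D\mathbf{W}}{D\lambda}\right)^a &= (\mathbf{V}\cdot\nabla\mathbf{W})^a \tag{6.55} \\
&= V^b (\nabla_b \mathbf{W})^a \tag{6.56} \\
&= V^b (\partial_b W^a + \Gamma^a_{bc} W^c) \tag{6.57} \\
&= \frac{dW^a}{d\lambda} + \Gamma^a_{bc} V^b W^c. \tag{6.58}
\end{align}

We can apply this to find the acceleration 4-vector. Let the world-line of an object be parameterized by its proper time $\tau$, with tangent vector the 4-velocity $\mathbf{U}$. Then the 4-acceleration is

\begin{align}
\mathbf{a} &= \frac{D\mathbf{U}}{D\tau} = \mathbf{U}\cdot\nabla\mathbf{U}; \tag{6.59} \\
a^a &= U^b (\nabla_b \mathbf{U})^a = U^b (\partial_b U^a + \Gamma^a_{bc} U^c) \tag{6.60} \\
&= \frac{dU^a}{d\tau} + \Gamma^a_{bc} U^b U^c. \tag{6.61}
\end{align}

Note the similarity with the geodesic equation, equation (6.47). The geodesic equation can now be written in a very simple form:

$$ \mathbf{a} = 0. \tag{6.62} $$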

6.5 Newton’s Law of motion

Newton’s 2nd law becomes, for an external force $f^a$,

$$ f^a = m a^a = m \left(\frac{D\mathbf{U}}{D\tau}\right)^a. \tag{6.63} $$

If $f^a = 0$, an object follows a geodesic.

We can now describe the apparent acceleration of component $a$ of the 4-velocity, $U^a$:

$$ \frac{dU^a}{d\tau} = \frac{1}{m} f^a - \Gamma^a_{bc} U^b U^c. \tag{6.64} $$

The first term on the RHS arises from external forces, while the second term arises from fictitious inertial forces, including gravity.


6.6 Twin Paradox

In the year 2000, one twin sets off for a distant planet, while the other twin stays home.

Flight Plan (in ship time = proper time):

a. 5 year acceleration at $1g \approx 10\ \mathrm{m\,s^{-2}}$, the surface acceleration of the Earth.

b. 5 year deceleration at $1g$.

c. 1 year on planet.

d. 5 year acceleration at $1g$.

e. 5 year deceleration at $1g$.

In relativistic units $g \approx 1.03\ \mathrm{yr^{-1}}$.

When does the twin arrive back on Earth?

We need to compare Earth time $t_E$ with the proper time $\tau$. Conveniently, as measured in the Earth frame the zeroth component of the 4-velocity $U^0_E$ is

$$ U^0_E = \frac{dt_E}{d\tau} = \gamma. \tag{6.65} $$

So (setting $t_E = \tau = 0$ at the start of the journey)

$$ t_E(\tau) = \int_0^\tau \gamma(\tau')\, d\tau'. \tag{6.66} $$

Strategy: Compare $U^a$ and $a^a$ in both the Earth and the spaceship frame. We will ignore the y and z components, considering only the t and x components.

Ship: in the spaceship rest frame,

$$U^a_S(\tau) = \begin{pmatrix} 1 \\ 0 \end{pmatrix}. \tag{6.67}$$

The astronauts feel a force of one Earth gravity, which allows them to walk around the spaceship rather than float about. This force is the normal force from the floor acting on the feet of the astronauts. For an astronaut of mass m, Newton’s second law states that the normal force (in the forward $S^1 = x$ direction) is

$$F = mg. \tag{6.68}$$

Let us write this in covariant form: the covariant Newton’s 2nd law reads

$$F^a_S = m\left(\frac{D\mathbf{U}_S}{D\tau}\right)^a = mg. \tag{6.69}$$

Expanding the covariant derivative in terms of ordinary derivatives and Christoffel symbols,

$$F^a_S = m\left(\frac{dU^a_S}{d\tau} + \Gamma^a_{S\,bc}\, U^b_S U^c_S\right). \tag{6.70}$$


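But in the ship’s rest frame

\begin{align}
U^a_S &= \begin{pmatrix} 1 \\ 0 \end{pmatrix} = \text{constant}, \tag{6.71}\\
\frac{dU^a_S}{d\tau} &= 0. \tag{6.72}
\end{align}

Thus

$$F^a_S = m\,\Gamma^a_{S\,00}. \tag{6.73}$$

In particular,

$$\Gamma^1_{S\,00} = g\ (\approx 1.03\ \mathrm{year}^{-1}). \tag{6.74}$$

Thus $\Gamma^1_{S\,00}$ gives the fictitious (inertial) force felt by the astronauts. Also, as $F^a_S U_{S\,a} = 0$, we must have $F^0_S = 0$ and $\Gamma^0_{S\,00} = 0$.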

Earth: Assume the Earth frame is an inertial frame, $\Gamma^a_{E\,bc} = 0$, i.e. ignore the Earth’s own gravity. Then

$$\left(\frac{D\mathbf{U}_E}{D\tau}\right)^a = \frac{dU^a_E}{d\tau} + 0. \tag{6.75}$$

Now,

$$U^a_E(\tau) = \frac{\partial E^a}{\partial S^b}\, U^b_S(\tau). \tag{6.76}$$

The co-ordinate transformation between the Earth frame and the spaceship frame depends on the speed and hence the position of the ship. The position of the ship is parameterized by the proper time τ. For a ship travelling at speed V(τ) the Lorentz boost formula gives

$$\frac{\partial E^a}{\partial S^b} = \begin{pmatrix} \gamma(\tau) & \gamma V(\tau) \\ \gamma V(\tau) & \gamma(\tau) \end{pmatrix}. \tag{6.77}$$

6.6.1 The rapidity

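Let the rapidity φ be defined by $\phi = \tanh^{-1} V$. Thus the quantities V, $\gamma = (1 - V^2)^{-1/2}$, and γV are given by simple hyperbolic functions:

\begin{align}
V &= \tanh\phi, \tag{6.78}\\
\gamma &= \cosh\phi, \tag{6.79}\\
\gamma V &= \sinh\phi. \tag{6.80}
\end{align}

In terms of the rapidity, the transformation matrix has the simple form

$$\frac{\partial E^a}{\partial S^b} = \begin{pmatrix} \cosh\phi(\tau) & \sinh\phi(\tau) \\ \sinh\phi(\tau) & \cosh\phi(\tau) \end{pmatrix}. \tag{6.81}$$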
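The hyperbolic identities (6.78)-(6.80) can be checked numerically. The sketch below is only an illustration: the function name `boost_from_rapidity` and the sample rapidity are assumptions made for this example, not part of the notes.

```python
import math

def boost_from_rapidity(phi):
    """Return (V, gamma, gamma*V) for rapidity phi, in units where c = 1."""
    return math.tanh(phi), math.cosh(phi), math.sinh(phi)

phi = 1.5  # arbitrary sample rapidity chosen for illustration
V, gamma, gamma_V = boost_from_rapidity(phi)

# gamma computed directly from the speed, to compare with cosh(phi)
gamma_direct = 1.0 / math.sqrt(1.0 - V * V)

print(f"V = {V:.6f}")
print(f"gamma from cosh = {gamma:.6f}, gamma from speed = {gamma_direct:.6f}")
print(f"gamma*V from sinh = {gamma_V:.6f}, product = {gamma * V:.6f}")
```

Both routes to γ and γV should agree to rounding error for any real rapidity.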


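Because both the velocity 4-vector and the covariant acceleration are tensors, they can be readily transformed to Earth coordinates from the spaceship frame as in equation (6.76):

\begin{align}
\mathbf{U}_E(\tau) &= \begin{pmatrix} \cosh\phi & \sinh\phi \\ \sinh\phi & \cosh\phi \end{pmatrix} \begin{pmatrix} 1 \\ 0 \end{pmatrix} \tag{6.82}\\
&= \begin{pmatrix} \cosh\phi(\tau) \\ \sinh\phi(\tau) \end{pmatrix}; \tag{6.83}\\
\frac{D\mathbf{U}_E}{D\tau} &= \begin{pmatrix} \cosh\phi & \sinh\phi \\ \sinh\phi & \cosh\phi \end{pmatrix} \frac{D\mathbf{U}_S}{D\tau} \tag{6.84}\\
&= \begin{pmatrix} \cosh\phi & \sinh\phi \\ \sinh\phi & \cosh\phi \end{pmatrix} \begin{pmatrix} 0 \\ g \end{pmatrix} \tag{6.85}\\
&= g \begin{pmatrix} \sinh\phi(\tau) \\ \cosh\phi(\tau) \end{pmatrix}. \tag{6.86}
\end{align}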

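Meanwhile, the ordinary derivative of $U^a_E$ is

\begin{align}
\frac{dU^a_E}{d\tau} &= \frac{d}{d\tau} \begin{pmatrix} \cosh\phi(\tau) \\ \sinh\phi(\tau) \end{pmatrix} \tag{6.87}\\
&= \dot\phi \begin{pmatrix} \sinh\phi(\tau) \\ \cosh\phi(\tau) \end{pmatrix}, \tag{6.88}
\end{align}

where $\dot\phi = d\phi/d\tau$. In the Earth frame the covariant and ordinary derivatives coincide (equation (6.75)), so comparing equations (6.86) and (6.88) we find $\dot\phi = g$, which integrates to

$$\phi(\tau) = g\tau. \tag{6.89}$$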

We can now go back to equation (6.66). As $U^0_E = \gamma = \cosh(g\tau)$, we have

\begin{align}
t_E(\tau) &= \int_0^{\tau} \cosh(g\tau')\, d\tau' \tag{6.90}\\
\therefore\quad t_E(\tau) &= \frac{1}{g}\sinh(g\tau). \tag{6.91}
\end{align}

Now $g = 1.03\ \mathrm{yr}^{-1}$, and so at τ = 5 years,

$$t_E = \sinh(5.15)/1.03 = 83.7\ \mathrm{yr}. \tag{6.92}$$

At this point, Earth time is 2083, Ship time is 2005. Similarly,

• the 5 year deceleration takes 83.7 years on Earth;

• 1 year on alien planet takes 1 year on Earth;

• the return journey takes 2 × 83.7 years on Earth.

So the twin returns to Earth 21 years older, in the year 2335.

Note the asymmetry between the stay-at-home twin and the space-faring twin. Earth people see the spaceship moving away and returning. But people on the spaceship also see the Earth moving away and returning! However, there is no true symmetry here: only the spaceship resides in a non-inertial rest frame. This results in a true difference between the flow of time on the spaceship and on the Earth.
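The Earth-frame durations of the flight plan follow directly from equation (6.91). The following is a minimal sketch of that arithmetic; the helper name `earth_time` and the assumption that every 5 year burn takes the same Earth time (as the text argues by symmetry) are the only ingredients.

```python
import math

G_PER_YEAR = 1.03  # 1g in relativistic units (yr^-1), as quoted in the text

def earth_time(tau, g=G_PER_YEAR):
    """Earth-frame time elapsed after proper time tau at constant acceleration g, eq. (6.91)."""
    return math.sinh(g * tau) / g

leg = earth_time(5.0)          # Earth time for one 5 year burn
total_earth = 4 * leg + 1.0    # four burns plus one year on the planet
total_ship = 4 * 5.0 + 1.0     # proper time experienced by the travelling twin

print(f"one 5 year burn: {leg:.1f} yr on Earth")
print(f"round trip: {total_earth:.1f} yr on Earth, {total_ship:.0f} yr on the ship")
```

Changing `G_PER_YEAR` or the burn duration shows how strongly the Earth-frame time depends on the product gτ, since sinh grows exponentially.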

Chapter 7

Orbits

Everyone was silent for a minute. Then Filby said he was damned.

7.1 Noether’s Theorem

For any continuous symmetry of a physical system, there is a conserved quantity.

This theorem is most often expressed in the context of Hamiltonian or Lagrangian mechanics, either quantum or classical. For example, if H is the Hamiltonian for a physical system, then:

a. If ∂H/∂t = 0 (H symmetric to time translation), energy is conserved.

b. If ∂H/∂x = 0 (H symmetric to translation in the x direction) then the x component of linear momentum is conserved.

c. If ∂H/∂φ = 0 (H symmetric to rotation), then angular momentum is conserved.

d. In electromagnetism, if H is independent of gauge, then charge is conserved.

e. In particle theory, gauge symmetry can imply conservation of other kinds of ‘charge’.For example SU(3) symmetry implies conservation of ‘colour’ (strong force) charge.

To employ symmetry arguments in the analysis of orbits, we first prove a variant of Noether’s theorem applicable to geodesics.

Theorem:

$$\frac{dp_a}{d\tau} = \frac{m}{2}\left(\partial_a g_{bc}\right) U^b U^c. \tag{7.1}$$

Thus, for example, if $\partial_0 g_{bc} = 0$, then $dp_0/d\tau = 0$. In this case the energy $E = p_0$ will be conserved.

Proof 7.1

We derive the corresponding equation for the lowered form of the 4-velocity $\mathbf{U}$. Here $\mathbf{U} = \mathbf{p}/m$ can be interpreted as energy-momentum per unit mass.


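First, we write $U_a = g_{ae} U^e$ and apply the product rule:

\begin{align}
\frac{dU_a}{d\tau} &= \frac{d}{d\tau}\left(g_{ae} U^e\right) \tag{7.2}\\
&= g_{ae}\frac{dU^e}{d\tau} + U^e \frac{dg_{ae}}{d\tau}. \tag{7.3}
\end{align}

Apply the geodesic equation, equation (6.47), to the first term, and write $d/d\tau = U^b \partial_b$ in the second term:

$$\frac{dU_a}{d\tau} = g_{ae}\left(-\Gamma^e_{bc} U^b U^c\right) + U^e\left(U^b \partial_b g_{ae}\right). \tag{7.4}$$

Next, we can change the dummy variable e → c in the last term, so that we can factor out $U^b U^c$:

\begin{align}
\frac{dU_a}{d\tau} &= g_{ae}\left(-\Gamma^e_{bc} U^b U^c\right) + U^c\left(U^b \partial_b g_{ac}\right) \tag{7.5}\\
&= \left(\partial_b g_{ac} - g_{ae}\Gamma^e_{bc}\right) U^b U^c. \tag{7.6}
\end{align}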

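Note that $g^{ed}$ is the inverse metric tensor, so by equation (6.34),

\begin{align}
\Gamma_{abc} \equiv g_{ae}\Gamma^e_{bc} &= \frac{1}{2} g_{ae} g^{ed}\left(\partial_b g_{cd} + \partial_c g_{db} - \partial_d g_{bc}\right) \tag{7.7}\\
&= \frac{1}{2}\delta_a^{\,d}\left(\partial_b g_{cd} + \partial_c g_{db} - \partial_d g_{bc}\right) \tag{7.8}\\
&= \frac{1}{2}\left(\partial_b g_{ca} + \partial_c g_{ab} - \partial_a g_{bc}\right). \tag{7.9}
\end{align}

Thus

\begin{align}
\frac{dU_a}{d\tau} &= \left[\partial_b g_{ac} - \frac{1}{2}\left(\partial_b g_{ca} + \partial_c g_{ab} - \partial_a g_{bc}\right)\right] U^b U^c \tag{7.10}\\
&= \frac{1}{2}\left(\partial_a g_{bc} + \partial_b g_{ac} - \partial_c g_{ab}\right) U^b U^c, \tag{7.11}
\end{align}

using $g_{ca} = g_{ac}$ to combine the first two terms in equation (7.10).

Finally, the last two terms in equation (7.11) involve the factor $(\partial_b g_{ac} - \partial_c g_{ab})$, which is anti-symmetric in b and c. But this factor double-contracts with $U^b U^c$, which is symmetric in b and c. Contraction of symmetric and anti-symmetric tensors gives 0, i.e.

$$\left(\partial_b g_{ac} - \partial_c g_{ab}\right) U^b U^c = 0 \tag{7.12}$$

so we are left with

$$\frac{dU_a}{d\tau} = \frac{1}{2}\left(\partial_a g_{bc}\right) U^b U^c. \tag{7.13}$$

QED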

7.2 The Schwarzschild Metric

Consider the space surrounding a planet or star or black hole of total mass M. We will assume that the central object is spherically symmetric (so we ignore rotation) and time independent (so we ignore time evolution). One can show from the Einstein field equations that the metric line element is

$$d\tau^2 = \left(1 - \frac{r_s}{r}\right) dt^2 - \left(1 - \frac{r_s}{r}\right)^{-1} dr^2 - r^2 d\theta^2 - r^2 \sin^2\theta\, d\phi^2, \tag{7.14}$$

where $r_s \equiv 2GM$ is called the Schwarzschild Radius.

Equivalently the metric tensor is

$$g_{ab} = \begin{pmatrix}
\left(1 - \frac{r_s}{r}\right) & 0 & 0 & 0 \\
0 & -\left(1 - \frac{r_s}{r}\right)^{-1} & 0 & 0 \\
0 & 0 & -r^2 & 0 \\
0 & 0 & 0 & -r^2 \sin^2\theta
\end{pmatrix}. \tag{7.15}$$

The Schwarzschild radius is quite small: for the Sun, $M = M_\odot$, $r_s = 3$ km. The radius of the sun, however, is $R_\odot = 7 \times 10^5$ km $\gg r_s$. Thus for anything orbiting the sun, even in a very low orbit, $r > R_\odot$ so the ratio $r_s/r \ll 1$.

For the Earth, $M = M_\oplus$, $r_s = 0.886$ cm. Again, for anything orbiting the Earth, $r_s/r \ll 1$.

Also note that if we neglect the $r_s/r$ terms in the metric we get back to the Minkowski metric.
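The quoted Schwarzschild radii can be reproduced by restoring the factor of c, since in SI units $r_s = 2GM/c^2$. The sketch below uses rounded standard values for G, c and the masses; these constants and the function name are assumptions of the example rather than values taken from the notes.

```python
G = 6.674e-11       # gravitational constant, m^3 kg^-1 s^-2 (rounded)
C = 2.998e8         # speed of light, m/s (rounded)
M_SUN = 1.989e30    # solar mass, kg (rounded)
M_EARTH = 5.972e24  # Earth mass, kg (rounded)
R_SUN = 6.96e8      # solar radius, m (rounded)

def schwarzschild_radius(mass_kg):
    """Schwarzschild radius r_s = 2GM/c^2 in metres (the text sets c = 1)."""
    return 2.0 * G * mass_kg / C**2

rs_sun = schwarzschild_radius(M_SUN)
rs_earth = schwarzschild_radius(M_EARTH)

print(f"Sun:   r_s = {rs_sun / 1e3:.2f} km")
print(f"Earth: r_s = {rs_earth * 100:.3f} cm")
print(f"r_s / R_sun = {rs_sun / R_SUN:.2e}")
```

The last line gives the largest value of $r_s/r$ available to anything orbiting outside the Sun's surface.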

7.2.1 Symmetries and Conserved Quantities

To find a planetary orbit, we solve the geodesic equation for objects moving in the curved space described by the Schwarzschild metric. This is difficult to do directly. However, we can take advantage of two symmetries in the problem.

The two symmetries are time invariance and rotational invariance:

\begin{align}
\partial_0 g_{bc} &= \partial_t g_{bc} = 0 \quad \text{for all } b, c = 0, 1, 2, 3 \tag{7.16}\\
\partial_3 g_{bc} &= \partial_\phi g_{bc} = 0 \quad \text{for all } b, c = 0, 1, 2, 3. \tag{7.17}
\end{align}

The corresponding conserved quantities are energy E and angular momentum L:

\begin{align}
\frac{dp_0}{d\tau} &= \frac{dE}{d\tau} = 0; \tag{7.18}\\
\frac{dp_3}{d\tau} &= \frac{dL}{d\tau} = 0. \tag{7.19}
\end{align}

For massive particles we can define the energy per unit mass k = E/m and the angular momentum per unit mass h = L/m. For constant rest mass, h and k will also be constant along a geodesic. We first consider k:

\begin{align}
k = U_0 &= g_{0b} U^b \tag{7.20}\\
&= \left(1 - \frac{r_s}{r}\right) U^0. \tag{7.21}
\end{align}


As $U^0 = dt/d\tau$, we have

$$\frac{dt}{d\tau} = k\left(1 - \frac{r_s}{r}\right)^{-1}. \tag{7.22}$$

Thus the Noether symmetry arguments lead to an expression for how co-ordinate time t varies with proper time τ.

Next consider the angular momentum per unit mass h:

\begin{align}
h = -U_3 &= -g_{3b} U^b \tag{7.23}\\
&= r^2 \sin^2\theta\, U^3. \tag{7.24}
\end{align}

Now $U^3 = d\phi/d\tau$, so

$$\frac{d\phi}{d\tau} = \frac{h}{r^2 \sin^2\theta}. \tag{7.25}$$

Note how the velocity expressed as a vector $\mathbf{U}$ with upper indices $U^a$ has a different physical meaning from the form $\mathbf{U}$ with lower indices $U_a$. The vector $\mathbf{U}$ shows us where the object is going (as it represents the tangent to the world line). The form $\mathbf{U}$, on the other hand, tells us how much energy and momentum (per unit mass) the object carries.

7.2.2 Orbits in the Equatorial Plane

Consider geodesics in the equatorial plane θ = π/2. For the solar system this plane is called the ecliptic. (The constellations seen on the ecliptic are known as the zodiac.) For motion on this plane dθ = 0 and sin θ = 1, so the Schwarzschild metric line element simplifies to

$$d\tau^2 = \left(1 - \frac{r_s}{r}\right) dt^2 - \left(1 - \frac{r_s}{r}\right)^{-1} dr^2 - r^2 d\phi^2. \tag{7.26}$$

Let us find orbit equations in terms of h and k:

a. Divide the metric line element by $d\tau^2$ and use equations (7.22) and (7.25) (with sin θ = 1):

\begin{align}
1 &= k^2\left(1 - \frac{r_s}{r}\right)^{-1} - \left(1 - \frac{r_s}{r}\right)^{-1}\left(\frac{dr}{d\tau}\right)^2 - \frac{h^2}{r^2} \tag{7.27}\\
\implies \frac{dr}{d\tau} &= \sqrt{k^2 - \left(1 + \frac{h^2}{r^2}\right)\left(1 - \frac{r_s}{r}\right)}. \tag{7.28}
\end{align}

This gives us a differential equation for dr/dτ. Unfortunately, it is quite non-linear and difficult to solve in this form.

Exercise 7.1

(a) Starting with equation (7.28), derive an expression for dr/dτ in the form

$$\frac{1}{2}\left(\frac{dr}{d\tau}\right)^2 + V(r) = C$$

where C is a constant, and the effective potential is

$$V(r) = -\frac{1}{2}\left(\frac{r_s}{r} - \frac{h^2}{r^2} + \frac{r_s h^2}{r^3}\right).$$

What is the effective energy C?

(b) Find the radii $r_1$ and $r_2$, $r_1 < r_2$, where the effective potential has an extremum (maximum or minimum). Show that if $C = V(r_1)$ or $C = V(r_2)$ then the requirements for a circular orbit ($dr/d\tau = d^2r/d\tau^2 = 0$) are satisfied. Show that $h \ge \sqrt{3}\, r_s$ for these orbits. Also show that $r_2 \ge 3 r_s$.

(c) Let $h = 2 r_s$. What are $r_1$ and $r_2$? Show that for the outer orbit at $r_2$, $V''(r_2) > 0$ and hence that this orbit is stable. Is the inner orbit at $r_1$ stable?

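b. To simplify the equation, we change the independent variable from τ → φ, and find r(φ):

\begin{align}
\frac{dr}{d\tau} &= \left(\frac{dr}{d\phi}\right)\left(\frac{d\phi}{d\tau}\right) = \left(\frac{dr}{d\phi}\right)\left(\frac{h}{r^2}\right) \tag{7.29}\\
\implies \left(\frac{dr}{d\phi}\right)^2 \frac{h^2}{r^4} &= k^2 - \left(1 + \frac{h^2}{r^2}\right)\left(1 - \frac{r_s}{r}\right). \tag{7.30}
\end{align}

c. Next we change the dependent variable r → u = 1/r. We will denote differentiation by φ with a prime, e.g. u′ = du/dφ. Thus

\begin{align}
r' = \frac{dr}{d\phi} &= \frac{dr}{du}\frac{du}{d\phi} \tag{7.31}\\
&= -\frac{1}{u^2} u' \tag{7.32}\\
\implies \left(-\frac{u'}{u^2}\right)^2 u^4 h^2 &= k^2 - \left(1 + h^2 u^2\right)\left(1 - r_s u\right) \tag{7.33}\\
\implies u'^2 &= \frac{k^2}{h^2} - \frac{\left(1 + h^2 u^2\right)\left(1 - r_s u\right)}{h^2}. \tag{7.34}
\end{align}

We now have an equation for u′ which at least has no terms in the denominator:

$$u'^2 = \left(\frac{k^2 - 1}{h^2}\right) + \frac{r_s u}{h^2} - u^2 + r_s u^3. \quad \text{(Einstein)} \tag{7.35}$$

The corresponding Newtonian orbit equation leaves out the last term:

$$u'^2 = k_N + \frac{r_s u}{h^2} - u^2, \quad \text{(Newton)} \tag{7.36}$$

where the Newtonian energy per unit mass is

$$k_N = \frac{V^2}{2} - \frac{GM}{r} = \frac{V^2}{2} - \frac{r_s}{2r}. \tag{7.37}$$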


d. We can simplify further by differentiating with respect to φ:

\begin{align}
2u'u'' &= \frac{r_s}{h^2} u' - 2uu' + 3 r_s u^2 u' \tag{7.38}\\
\implies u'' + u &= \frac{r_s}{2h^2} + \frac{3 r_s}{2} u^2. \tag{7.39}
\end{align}

Thus General Relativity predicts that orbits satisfy

$$u'' + u = \frac{r_s}{2h^2} + \frac{3 r_s}{2} u^2. \quad \text{(Einstein)} \tag{7.40}$$

In contrast, the Newtonian orbit equation is

$$u'' + u = \frac{r_s}{2h^2}. \quad \text{(Newton)} \tag{7.41}$$

Compare the relativistic correction term (the last term in the Einstein version) to the linear term u:

$$\frac{3 r_s u^2/2}{u} = \frac{3}{2}\frac{r_s}{r} = \frac{3GM}{r}. \tag{7.42}$$

For planets orbiting the sun at a radius $r > 100 R_\odot$,

$$\frac{3GM_\odot}{100 R_\odot} \sim 10^{-7} \tag{7.43}$$

and so the relativistic correction term results in very small deviations from the Newtonian predictions. These deviations have, however, been observed!
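To get a feel for the size of the ratio in equation (7.42), it can be evaluated for a few planets. The sketch below is illustrative only: the orbital radii are rounded textbook values assumed for this example, and `correction_ratio` is a hypothetical helper name.

```python
RS_SUN_KM = 2.95  # Schwarzschild radius of the Sun in km (about 3 km in the text)

# Approximate mean orbital radii in km (rounded, assumed values)
ORBIT_RADII_KM = {
    "Mercury": 5.8e7,
    "Earth": 1.496e8,
    "Jupiter": 7.78e8,
}

def correction_ratio(r_km, rs_km=RS_SUN_KM):
    """Ratio of the relativistic term 3*r_s*u^2/2 to the linear term u, eq. (7.42)."""
    return 1.5 * rs_km / r_km

ratios = {name: correction_ratio(r) for name, r in ORBIT_RADII_KM.items()}
for name, ratio in ratios.items():
    print(f"{name:8s} ratio = {ratio:.2e}")
```

The ratio falls off as 1/r, which is why the innermost planet shows the largest relativistic effect.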

Exercise 7.2

a. Consider a sphere of radius 1 as a 2-dimensional manifold with coordinates $x^1 = \theta$, $x^2 = \phi$, and line-element

$$ds^2 = d\theta^2 + \sin^2\theta\, d\phi^2.$$

Show that geodesics have a conserved quantity (call it H).

b. Using the metric line element, or otherwise, derive an equation for dθ/ds in terms of H.

c. Let X = cos θ. Show that X satisfies

$$\left(\frac{dX}{ds}\right)^2 = 1 - X^2 - H^2.$$

d. Obtain a second-order differential equation for X(s) and write down its general solution. Suppose a geodesic starts at co-latitude $\theta_0$ heading due East. Find X(s) and hence θ(s) explicitly in terms of $\theta_0$. Show that the total length of the geodesic (i.e. the length needed to go all the way around the sphere once) is independent of $\theta_0$.


Figure 7.1: A visualization of the Schwarzschild Metric. More precisely, a 2-manifold imbedded in three dimensional space which has the same spatial metric as a constant time equatorial (t = constant, θ = π/2) slice of the Schwarzschild metric (equation (7.26)).

Exercise 7.3

a. Consider a surface embedded in 3 dimensional Euclidean space (e.g. the surface of a bowl). Using cylindrical coordinates (r, φ, z), the surface is specified by the function z = Z(r). Let the two coordinates on the surface be $x^1 = r$ and $x^2 = \phi$. Show that the metric of the surface is given by

$$g_{ab} = \begin{pmatrix} 1 + Z'^2 & 0 \\ 0 & r^2 \end{pmatrix}. \tag{7.44}$$

b. Next consider geodesics on the surface. Since the surface is purely spatial, we replace τ by arclength s in the geodesic equations. Show that the metric has a symmetry, and hence there exists a conserved quantity (call it h) along each geodesic. In other words, find a quantity h such that dh/ds = 0.

c. Let u = 1/r and let u′ = du/dφ. Derive the equation

$$u'^2 = \left(\frac{1}{h^2} - u^2\right)\frac{1}{1 + Z'^2}.$$

d. Now suppose $Z(r) = 2\sqrt{r - 1}$. Show that this gives a surface whose metric is the same as the spatial part of the Schwarzschild metric, equation (7.26), (for θ = π/2) in units where $r_s = 1$ (see figure 7.1). What is the equation for $u'^2$? How does this compare with the orbit equation derived from the full Schwarzschild metric?

7.3 Precession of Mercury’s Orbit

Perihelion = Closest approach to the Sun.

Observations show that the perihelion of Mercury precesses about 1000 arcsec / century. Influences of the other planets account for all but 43″ of this. Einstein found the 43″ could be explained by the extra term in the orbit equation. A numerical approach might be to go back to the first order equation, equation (7.35), and integrate directly:

$$\int \frac{du}{\sqrt{(k^2 - 1)/h^2 + r_s u/h^2 - u^2 + r_s u^3}} = \int d\phi. \tag{7.45}$$

However, for planetary orbits we can exploit the fact that the relativistic correction to Newtonian theory is very small. This will enable us to find a simple analytic solution (approximate, but then so are numerical solutions!).

7.3.1 Method

The angle φ measures the net angle through which the planet has orbited the sun. During the first orbit 0 ≤ φ ≤ 2π. During the second orbit, 2π ≤ φ ≤ 4π and so on. But each successive orbit does not exactly follow the previous orbit, so for example u(2π) ≠ u(0). Thus the function u(φ) cannot only contain terms periodic in φ like sin φ. There may be a linear term as well. We will solve for the function u(φ), then invert to obtain φ(u).

Now at perihelion $r = r_{\min}$, so $u = u_{\max}$. We will compare successive values of $\phi(u_{\max})$. Because the orbits are not exactly alike,

$$\underbrace{\phi(u_{\max})}_{\text{orbit 2}} - \underbrace{\phi(u_{\max})}_{\text{orbit 1}} = 2\pi + \delta\phi \tag{7.46}$$

for some angle δφ. We will call δφ the precession.

But first we need to solve for u(φ). We will do this by writing this function as the sum of the Newtonian solution plus a small correction term (we derive the Newtonian solution below):

\begin{align}
u(\phi) &= u_0\Big(\underbrace{(1 + \varepsilon \sin\phi)}_{\text{Newtonian Orbit}} + \underbrace{y(\phi)}_{\text{Correction}}\Big); \tag{7.47}\\
u_0 &\equiv \frac{r_s}{2h^2}. \tag{7.48}
\end{align}

We plug this into the orbit equation to obtain a new differential equation for y(φ). As y(φ) is small, we drop the non-linear terms (those in $y^2$ etc) to obtain a linear equation in y, which can then be readily solved.


7.3.2 Newtonian Solution

First, look at the solution to the Newtonian equations (7.36) and (7.41). The second order equation has sines and cosines as complementary functions and a constant as a particular solution, so the general solution can be written

$$u(\phi) = A \cos\phi + B \sin\phi + u_0. \tag{7.49}$$

Comment: Here there are two unknown constants of integration, as expected for a second order differential equation. These could be determined, for example, by setting initial conditions $u(\phi_0)$ and $u'(\phi_0)$ at some angle $\phi_0$. However, we started with a first order equation, equation (7.36), which should only have one constant of integration. What has changed? When we differentiated to obtain the second order equation, we lost some information about the orbit. In particular, we lost the term $(k^2 - 1)/h^2$. This is the only term which tells us about the energy k. A solution of the first order equation may be characterized by just one of the initial conditions as well as k and h. The second order equation loses the explicit k dependence. Of course, the energy will be derivable as a function of the second initial condition, and vice-versa.

The perihelion occurs at minimum r, hence maximum u. Let us suppose this occurs at $\phi_0 = \pi/2$. Then one initial condition gives

$$u'\left(\frac{\pi}{2}\right) = 0 \;\Rightarrow\; A = 0. \tag{7.50}$$

Now plug into the first order Newtonian equation (7.36) (without the relativistic correction term $r_s u^3$). The result gives

$$B = \left(k_N + \frac{r_s^2}{4h^4}\right)^{1/2}. \tag{7.51}$$

Finally, define the eccentricity ε by

$$\varepsilon = \frac{B}{u_0} \tag{7.52}$$

to yield

$$u = u_0\left(1 + \varepsilon \sin\phi\right). \tag{7.53}$$

The eccentricity tells us the shape of the orbit. Thus ε = 0 gives a circle, 0 < ε < 1 gives an ellipse, ε = 1 gives a parabola, and ε > 1 gives a hyperbola (the latter two are open orbits where an object comes in from ∞, is deflected, and escapes to ∞). The eccentricity of Mercury’s orbit is about ε = 0.21.

Note that for a circular orbit

\begin{align}
u_{\text{circ}} = u_0 &= \frac{r_s}{2h_{\text{circ}}^2} \tag{7.54}\\
\Rightarrow\; h_{\text{circ}} &= \left(\frac{r_{\text{circ}}}{2 r_s}\right)^{1/2} r_s. \tag{7.55}
\end{align}

Thus for planetary orbits $h_{\text{circ}} \gg r_s$.


7.3.3 Relativistic Correction

We now substitute equation (7.47) into the full relativistic orbit equation (7.40). This yields the following differential equation for y(φ):

\begin{align}
y'' + y &= \frac{3 r_s u_0}{2}\left((1 + \varepsilon \sin\phi) + y\right)^2 \tag{7.56}\\
&= \frac{3 r_s u_0}{2}\left((1 + \varepsilon \sin\phi)^2 + 2(1 + \varepsilon \sin\phi)\, y + y^2\right). \tag{7.57}
\end{align}

We can ignore the $y^2$ term, as $y \ll 1$. Next, compare the terms linear in y. There are two terms: on the left hand side, with coefficient 1, and on the right, with coefficient $3 r_s u_0 (1 + \varepsilon \sin\phi)$. The latter term is of order $r_s/r \ll 1$. Neglecting this term gives

$$y'' + y \approx \frac{3 r_s u_0}{2}\left(1 + 2\varepsilon \sin\phi + \varepsilon^2 \sin^2\phi\right). \tag{7.58}$$

The terms on the right act as forcing functions for the harmonic oscillator on the left. The most interesting forcing function is the 2ε sin φ term, as this is in resonance with the oscillator (the complementary functions include sin φ). This resonance drives the precession.

For initial conditions y(π/2) = 0, y′(π/2) = 0, the solution is

\begin{align}
y(\phi) &= \frac{3 r_s u_0}{2}\left[(1 - \sin\phi) + \varepsilon\left(\frac{\pi}{2} - \phi\right)\cos\phi + \frac{\varepsilon^2}{3}\left(2 - \sin^2\phi - \sin\phi\right)\right] \tag{7.59}\\
&= \frac{3 r_s u_0}{2}\, \varepsilon\left(\frac{\pi}{2} - \phi\right)\cos\phi + \text{periodic terms}. \tag{7.60}
\end{align}

Recall the discussion leading to equation (7.46). The first orbit has perihelion at $\phi_1 = \pi/2$, while the second orbit has perihelion at $\phi_2 = 5\pi/2 + \delta\phi$. Solving $u'(\phi_2) = 0$ gives

$$\delta\phi \approx 3\pi r_s u_0. \tag{7.61}$$

The mean radius is $r \approx u_0^{-1} = 58 \times 10^6$ km. Thus

$$\delta\phi \approx 0.1\ \text{arcseconds/orbit}. \tag{7.62}$$

The Mercury year is 88 earth days, making the precession 1 arcsecond every 880 days, or 43 arcseconds per century.
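The perturbative result (7.61) can be compared against a direct numerical integration of the orbit equation (7.40). The sketch below is an illustration under stated assumptions: it works with the dimensionless variable w = u/u0, so that (7.40) becomes w″ + w = 1 + (3/2)(r_s u0) w², uses a hand-written fourth-order Runge-Kutta step, and takes an exaggerated value r_s u0 = 0.001 so that the shift is easy to resolve. The function name, step size and sample values are choices made for this example.

```python
import math

def perihelion_shift(rs_u0, ecc=0.2, steps_per_rad=2000):
    """Integrate w'' + w = 1 + 1.5*rs_u0*w**2 (w = u/u0) from one perihelion
    to the next and return the angle travelled minus 2*pi."""
    def deriv(w, v):
        # first-order system: w' = v, v' = 1 + 1.5*rs_u0*w^2 - w
        return v, 1.0 + 1.5 * rs_u0 * w * w - w

    h = 1.0 / steps_per_rad
    phi, w, v = 0.0, 1.0 + ecc, 0.0  # start at perihelion: w maximal, w' = 0
    while True:
        # one classical RK4 step
        k1w, k1v = deriv(w, v)
        k2w, k2v = deriv(w + 0.5 * h * k1w, v + 0.5 * h * k1v)
        k3w, k3v = deriv(w + 0.5 * h * k2w, v + 0.5 * h * k2v)
        k4w, k4v = deriv(w + h * k3w, v + h * k3v)
        w_new = w + h * (k1w + 2 * k2w + 2 * k3w + k4w) / 6.0
        v_new = v + h * (k1v + 2 * k2v + 2 * k3v + k4v) / 6.0
        # next perihelion: w' crosses zero from positive to negative
        if phi > 1.0 and v > 0.0 and v_new <= 0.0:
            frac = v / (v - v_new)  # linear interpolation of the crossing
            return phi + frac * h - 2.0 * math.pi
        phi, w, v = phi + h, w_new, v_new

rs_u0 = 1.0e-3                         # exaggerated value for illustration
numerical = perihelion_shift(rs_u0)
predicted = 3.0 * math.pi * rs_u0      # equation (7.61)
print(f"numerical shift = {numerical:.6f} rad, 3*pi*rs*u0 = {predicted:.6f} rad")

# Mercury estimate with the numbers quoted in the text
ARCSEC_PER_RAD = 180.0 * 3600.0 / math.pi
per_orbit = 3.0 * math.pi * (2.95 / 58e6) * ARCSEC_PER_RAD
per_century = per_orbit * 36525.0 / 88.0
print(f"Mercury: {per_orbit:.3f} arcsec/orbit, {per_century:.0f} arcsec/century")
```

The two shifts should agree to within terms of higher order in r_s u0, which are dropped in the linearised analysis above.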


7.4 Deflection of Starlight

As light travels from a star, its path is distorted by the curvature of space-time. Thus photons passing near a massive object like the sun will curve around the object. When the light is viewed, it will appear in the wrong place on the sky. This effect has been seen not only for starlight in the sun’s gravitational field, but also for light from distant galaxies or quasars passing by nearer galaxies on the way to Earth.

The effect on starlight is best seen during a solar eclipse. During an eclipse we can see stars in the sky close to the sun without being swamped by daylight. Photons from these stars receive the maximum deflection.

Figure 7.4 shows the path of the starlight. The photon starts at the star, with

$$u_\star \approx 0; \qquad \phi_\star = \pi + \frac{\Delta}{2}. \tag{7.63}$$

Because of the gravitational deflection, we observe the photon coming in from the angle φ = π − ∆/2. Thus the angle of deflection is ∆.

First we look at the Newtonian prediction, then derive the relativistic deflection.

7.4.1 Newtonian Theory

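The Newtonian orbit for m ≠ 0 is, from equation (7.47)

$$u(\phi) = u_0\left(1 + \varepsilon \sin\phi\right); \qquad u_0 = \frac{r_s}{2h^2} = \frac{GM_\odot}{h^2}. \tag{7.64}$$

The angular momentum per unit mass h is

$$h = r^2 \dot\phi = r V_\phi. \tag{7.65}$$

In figure 7.4, the photon travelling from the star to our telescope has its closest approach (perihelion) at φ = π/2. At perihelion dr/dt = 0, so $V_r = 0$ and thus $|V| = V_\phi$. But for a photon |V| = c = 1 in relativistic units. So

$$h = r_{\min} \approx R_\odot \tag{7.66}$$

if the starlight just grazes the surface of the sun on its way to our telescope. Thus

$$u(\phi) = \frac{GM_\odot}{R_\odot^2}\left(1 + \varepsilon \sin\phi\right). \tag{7.67}$$

At the star, equation (7.63) gives

\begin{align}
0 &\approx \frac{GM_\odot}{R_\odot^2}\left(1 + \varepsilon \sin\left(\pi + \frac{\Delta}{2}\right)\right) \tag{7.68}\\
&\approx \frac{GM_\odot}{R_\odot^2}\left(1 - \varepsilon\left(\frac{\Delta}{2}\right)\right), \tag{7.69}
\end{align}

as sin(π + x) ≈ −x for small x. Thus ε ≈ 2/∆. We now have

$$u(\phi) \approx \frac{GM_\odot}{R_\odot^2}\left(1 + \frac{2}{\Delta}\sin\phi\right). \tag{7.70}$$

Finally, at φ = π/2 the starlight reaches $r = R_\odot$, so

\begin{align}
\frac{1}{R_\odot} &\approx \frac{GM_\odot}{R_\odot^2}\left(1 + \frac{2}{\Delta}\right), \tag{7.71}\\
&\approx \frac{GM_\odot}{R_\odot^2}\left(\frac{2}{\Delta}\right), \tag{7.72}
\end{align}

using 2/∆ ≫ 1. Thus

$$\Delta \approx \frac{2GM_\odot}{R_\odot}. \tag{7.73}$$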

7.4.2 Relativistic Theory

As photons are massless, we consider the orbit equations in the limit m → 0. Photons do have energy and momentum (and angular momentum), so we hold energy E = mk and angular momentum L = mh constant. Substituting k = E/m and h = L/m, the first order equation (equation (7.35)) becomes

$$u'^2 = \left(\frac{E^2 - m^2}{L^2}\right) + \frac{r_s u\, m^2}{L^2} - u^2 + r_s u^3. \tag{7.74}$$

In the limit m → 0, then,

$$u'^2 = \frac{E^2}{L^2} - u^2 + r_s u^3. \tag{7.75}$$

Differentiation gives the second order equation

$$u'' + u = \frac{3}{2} r_s u^2. \tag{7.76}$$


First, consider the $r_s = 0$ case. Here M = 0 and space-time is flat:

\begin{align}
u'' + u &= 0 \tag{7.77}\\
\implies u &= A \cos\phi + B \sin\phi. \tag{7.78}
\end{align}

Apply boundary conditions that the light reaches r = ∞ at φ = 0, and (again) that the perihelion of the light path occurs at $(r, \phi) = (R_\odot, \pi/2)$. As a result, A = 0 and $B = R_\odot^{-1}$, i.e.

$$u = \frac{1}{R_\odot}\sin\phi. \tag{7.80}$$

This is the equation of a straight horizontal line ($y = R_\odot$) in polar coordinates. Light passing near a massive object, however, will be perturbed by the missing $r_s$ term. As in the analysis of the precession of Mercury we try a small non-linear correction to u:

$$u(\phi) = \frac{1}{R_\odot}\left(\sin\phi + y(\phi)\right) \tag{7.81}$$

where $y(\phi) \ll 1$. Then equation (7.76) gives

\begin{align}
\frac{1}{R_\odot}\left(-\sin\phi + y''\right) + \frac{1}{R_\odot}\left(\sin\phi + y\right) &= \frac{3 r_s}{2R_\odot^2}\left(\sin\phi + y\right)^2 \tag{7.82}\\
\implies y'' + y &= \frac{3 r_s}{2R_\odot}\left(\sin\phi + y\right)^2. \tag{7.83}
\end{align}

Include only the terms of lowest order in y and $r_s$ (as y, $r_s$ are both small):

$$y'' + y \approx \frac{3 r_s}{2R_\odot}\sin^2\phi = \frac{3 r_s}{4R_\odot}\left(1 - \cos 2\phi\right). \tag{7.84}$$

Combining complementary functions and particular integrals gives

$$y = A \cos\phi + B \sin\phi + C + D \cos 2\phi + E \sin 2\phi. \tag{7.85}$$

We can determine the constants as follows:

a. Plugging equation (7.85) into equation (7.84) gives C = 3D, $D = r_s/4R_\odot$ and E = 0.

b. At perihelion, y(π/2) = 0, so B = −2D.

c. Also, as seen in the figure, the photon path is symmetric about φ = π/2. Hence, for example, y(0) = y(π), which implies A = 0.


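We now have

$$y = \frac{r_s}{4R_\odot}\left(3 - 2\sin\phi + \cos 2\phi\right). \tag{7.86}$$

Finally, we can apply the boundary conditions at the star, equation (7.63). These give

\begin{align}
0 &= R_\odot\, u\left(\pi + \Delta/2\right) \tag{7.87}\\
&= \sin\left(\pi + \Delta/2\right) + \frac{r_s}{4R_\odot}\left(3 - 2\sin\left(\pi + \Delta/2\right) + \cos\left(2\pi + \Delta\right)\right) \tag{7.88}\\
&= -\sin(\Delta/2) + \frac{r_s}{4R_\odot}\left(3 + 2\sin(\Delta/2) + \cos\Delta\right) \tag{7.89}\\
&\approx -\frac{\Delta}{2} + \frac{r_s}{4R_\odot}\left(4 + \mathrm{Order}(\Delta)\right). \tag{7.90}
\end{align}

We can ignore the terms of order $(r_s/R_\odot)\Delta$, giving

$$\Delta \approx \frac{2 r_s}{R_\odot} = \frac{4GM_\odot}{R_\odot} \approx 1.74\ \text{arcseconds}. \tag{7.91}$$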

Thus relativity predicts a deflection almost exactly double the Newtonian result.
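The two deflection formulas, (7.73) and (7.91), are easy to evaluate for light grazing the Sun. In the sketch below the solar values are rounded standard figures assumed for the example, and the variable names are illustrative.

```python
import math

RS_SUN_KM = 2.95    # Schwarzschild radius of the Sun, km (rounded)
R_SUN_KM = 6.96e5   # radius of the Sun, km (rounded)
ARCSEC_PER_RAD = 180.0 * 3600.0 / math.pi

# Newtonian: Delta = 2GM/R = r_s/R, equation (7.73)
newton_arcsec = (RS_SUN_KM / R_SUN_KM) * ARCSEC_PER_RAD

# Relativistic: Delta = 2 r_s/R = 4GM/R, equation (7.91)
einstein_arcsec = (2.0 * RS_SUN_KM / R_SUN_KM) * ARCSEC_PER_RAD

print(f"Newtonian deflection:    {newton_arcsec:.2f} arcsec")
print(f"Relativistic deflection: {einstein_arcsec:.2f} arcsec")
```

With these rounded inputs the relativistic value comes out close to the figure quoted in equation (7.91); small differences reflect the rounding of $r_s$ and $R_\odot$.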

7.5 Energy Conservation on Geodesics

According to Newtonian theory, an object moving in a central gravitational field has energy

$$E = \frac{1}{2} m V^2 + m\Phi(r); \qquad \Phi(r) = -\frac{GM}{r}, \tag{7.92}$$

where Φ(r) is the potential energy per unit mass. We now show that for weak gravitational fields ($r \gg r_s$) and small velocities ($V \ll 1$), the relativistic energy is approximately the same as the Newtonian energy, apart from the contribution from rest mass.

Recall the Schwarzschild metric line element in the equatorial plane, equation (7.26). In terms of Φ,

$$d\tau^2 = (1 + 2\Phi)\, dt^2 - (1 + 2\Phi)^{-1} dr^2 - r^2 d\phi^2. \tag{7.93}$$

Next, divide by $dt^2$, using equation (7.21) for dτ/dt:

\begin{align}
\left(\frac{d\tau}{dt}\right)^2 &= \frac{1}{k^2}(1 + 2\Phi)^2 \tag{7.94}\\
&= (1 + 2\Phi) - (1 + 2\Phi)^{-1}\left(\frac{dr}{dt}\right)^2 - r^2\left(\frac{d\phi}{dt}\right)^2 \tag{7.95}\\
&= (1 + 2\Phi) - (1 + 2\Phi)^{-1} V_r^2 - V_\phi^2. \tag{7.96}
\end{align}

Now $|\Phi| \ll 1$ and $V^2 \ll 1$ so we can expand the right hand side, ignoring terms like $\Phi^2$ or $\Phi V^2$:

\begin{align}
\frac{1}{k^2}(1 + 2\Phi)^2 &\approx (1 + 2\Phi) - (1 - 2\Phi) V_r^2 - V_\phi^2 \tag{7.97}\\
&\approx (1 + 2\Phi) - \left(V_r^2 + V_\phi^2\right) \tag{7.98}\\
&= \left(1 + 2\Phi - V^2\right). \tag{7.99}
\end{align}


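Thus

\begin{align}
k^2 &\approx \frac{(1 + 2\Phi)^2}{1 + 2\Phi - V^2} \tag{7.100}\\
&\approx (1 + 4\Phi)(1 - 2\Phi + V^2) \tag{7.101}\\
&\approx 1 + V^2 + 2\Phi. \tag{7.102}
\end{align}

Finally, we take the square root of this approximation, using $(1 + x)^{1/2} \approx 1 + x/2$. With E = mk the Newtonian correspondence becomes clear:

$$E \approx m + \frac{1}{2} m V^2 + m\Phi. \tag{7.103}$$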
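The quality of the weak-field approximation can be checked by solving equations (7.94)-(7.96) for k exactly and comparing with 1 + V²/2 + Φ. The sketch below does this for values roughly appropriate to the Earth's orbit around the Sun; the sample numbers and function names are assumptions made for this illustration.

```python
import math

def k_exact(phi_pot, v_r, v_phi):
    """Energy per unit mass k obtained from equations (7.94)-(7.96) without approximation."""
    f = 1.0 + 2.0 * phi_pot
    return math.sqrt(f * f / (f - v_r**2 / f - v_phi**2))

def k_newtonian(phi_pot, v_r, v_phi):
    """Weak-field, slow-motion estimate: rest mass + kinetic + potential energy per unit mass."""
    return 1.0 + 0.5 * (v_r**2 + v_phi**2) + phi_pot

# Roughly the Earth's orbit, in units with c = 1 (assumed illustrative values)
phi_pot = -9.86e-9   # Phi = -GM/r
v_r = 0.0            # circular orbit: no radial velocity
v_phi = 9.93e-5      # about 30 km/s

exact = k_exact(phi_pot, v_r, v_phi)
approx = k_newtonian(phi_pot, v_r, v_phi)
print(f"k exact - 1  = {exact - 1.0:.6e}")
print(f"k approx - 1 = {approx - 1.0:.6e}")
print(f"difference   = {abs(exact - approx):.2e}")
```

The difference is of second order in Φ and V², which is the size of the terms dropped between equations (7.96) and (7.102).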
