Lecture Notes

THREE LECTURES

ON SPECIAL RELATIVITY

Then God said, "Let there be

light," and there was light. And

God saw that the light was

good.

Genesis

And he made it the fastest thing in the Universe

By Taras Plakhotnik

School of Mathematics and Physics, The University of Queensland

Other learning resources on relativity

Wolfgang Rindler, Introduction to Special Relativity (Oxford University Press)

Wolfgang Rindler, Relativity : special, general, and cosmological

Hans Stephani, Relativity, (Cambridge University Press)

D.W. Hogg, Web notes on special relativity, http://cosmo.nyu.edu/hogg/sr/

W.S.C. Williams, Introduction to Special Relativity, (Taylor and Francis, 2002)

Michael Tsamparlis, Special Relativity (Springer, 2010), available as an electronic book from the UQ library


GLOSSARY

Reference Frame is a space co-ordinate system and a set of clocks located at every point of space. All these clocks are stationary in their reference frame (that is, they have permanent, time-independent coordinates) and are synchronized with each other. The time of an event is measured by the clock located at the same place as the event. Other physical quantities (electrical field, magnetic field, charge density, etc.) are always explicitly or implicitly related to a reference frame when one specifies a point in the co-ordinate space and the time reading (by the clock nearest to that point) at which a particular quantity was determined. For example, an electrical field is frequently described as a three-dimensional vector function of four variables, $\mathbf{E}(x, y, z, t)$.

Inertial Reference Frame is a frame in which Newton’s first law is valid. Strictly speaking, this is a rather abstract theoretical concept. In practice, we expect that an object which is far from all other objects is a good approximation for the origin of an inertial reference frame. If an object moves with acceleration in a certain reference frame, we search for a force acting on that object, and if the force cannot be identified despite all efforts, we assume that the reference frame is not inertial. The centre of mass of our nearest star is a very good approximation to the origin of an inertial reference frame. Even a point stationary relative to the surface of our planet will do reasonably well in many cases.

Observer is a fictional person introduced in textbooks to confuse the students. We will use an observer only if we are interested in the actual visual impression which a real person would receive if he/she were present at a certain location and at a certain time.

Rest Frame is a concept used to confuse those who are not yet confused. There is no such thing as an absolute rest frame. But there is always a frame in which a selected object is at rest. Such a frame is called a co-moving frame (with the object selected).

Event has a more restricted meaning in physics than in conventional English. An event occurs at a point in space (it does not have any size) and at one instant of time. It can be described by one set of co-ordinates (x, y, z, t) in a specified frame. Remember that time is measured by the clock at the location of the event.

Simultaneous events are events which occur at the same time in a specified reference frame.

Squared Interval is defined between any two events as

$$\Delta s_{12}^2 \equiv c^2 (t_2 - t_1)^2 - (x_2 - x_1)^2 - (y_2 - y_1)^2 - (z_2 - z_1)^2 .$$

Note that the squared interval can be negative.

Proper Time is a confusing name for the interval divided by the speed of light in vacuum, $\tau \equiv \sqrt{\Delta s_{12}^2}/c$ (note that this definition makes sense only if $\Delta s_{12}^2 \geq 0$). The proper

time coincides with the time between two events measured in a reference frame in which the two events have the same space coordinates.

Length Contraction and Time Dilation are somewhat misleading concepts widely used in textbooks. They simply represent two special cases of Lorentz transformations. The distance between two events measured in a reference frame where they are simultaneous is shorter (by a factor of $\gamma$) than the distance between these events in a reference frame where they are not simultaneous (do not have the same time). The time between two events measured in a reference frame where they are at the same place is shorter (by a factor of $\gamma$) than the time between these events in a reference frame where they are not at the same place (do not have the same space coordinates). The concepts are misleading because they lead away from the complete Lorentz transformations, where time and position are interrelated (this relation is the essence of Special Relativity).

Relative velocity is the velocity of one object relative to another. That is, the velocity of object A relative to object B is the velocity of object A in a reference frame co-moving with object B (see Rest Frame).

Thought-experiment is not an experiment but a mental exercise designed to illustrate theoretical concepts for educational purposes. At most, such “experiments” can demonstrate that the theory is not self-contradictory, but they cannot prove that the theory is correct.



Lecture 1

Concepts: Events. Reference frame. Relativity postulates. Transformation of coordinates. Galilean transformations. Deriving Lorentz transformations. Invariance of the squared interval under Lorentz transformations. Absolute past and absolute future.

Questions for consideration: How to measure the length of a moving rod?

Events and a reference frame

Physical events are described by four numbers which refer to a certain reference frame.

A reference frame is made of an infinite number of synchronized clocks covering the

whole space. Each clock has definitive and fixed coordinates (see Fig. 1). For every

event, three numbers $(x, y, z)$ specify the space coordinates of the clock nearest to the event (in theory the clock location coincides with the location of the event) and the fourth number tells the time $t$ shown by that clock. We will say that this event is described by its four coordinates in reference frame O. The event coordinates are $(x', y', z')$ and $t'$ respectively in reference frame O′. The second reference frame is not fundamentally different from the first frame but these two frames can move relative to each other. The four coordinates of an event will be written in several ways such as $[x, y, z, t]$, $[x_1, x_2, x_3, x_4]$, or $[x_1, x_2, x_3, t]$ depending on the circumstances and convenience.

Because the frame-forming clocks are distributed in space, some attention should be

paid to their synchronization. Only clocks located at the same location can be compared

directly. One way to synchronize two clocks is to move an exact replica of one clock to

the location of the second clock. This motion should be done with a very slow speed (in

theory a limit of zero speed should be taken which would require infinitely long time to

cover a finite distance) to avoid acceleration. Alternatively, a signal with a known

propagation velocity υ can be sent from one clock to the other when, for example, the

first clock shows zero time. When the pulse arrives at the second clock, that clock can be set to the time $t = L/\upsilon$, where $L$ is the distance between the clocks. There is nothing special about using a pulse of light for this purpose, except that light is able to propagate in vacuum and that its speed in vacuum is known to high accuracy (in fact, this speed is used as a standard in modern metrology and therefore its value is known exactly).

A concept useful for understanding relativity is the world line, shown in Fig. 2. An object with a constant velocity has a straight world line. The speed of the object equals $\cot\theta$, where the angle $\theta$ determines the gradient of the world line as shown in Fig. 2.

Relativity Principles

1. Absolute uniform motion (motion with a constant velocity) cannot be detected (Galileo, Newton, etc.).

In other words, all laws of mechanics (later extended to laws of physics) are identical in

all reference frames moving with a constant velocity relative to each other. An example

of such a law is Newton’s First Law – Every object will stay in uniform motion unless

an external force is applied. A body moving with a constant velocity in reference frame

O will move with a constant velocity in reference frame O′ if the O′ moves with a

constant velocity relative to O. O′ may have different orientation (directions of the

three axes) relative to O and a different location of its origin. But what is the reference

frame where the law holds? It holds in inertial reference frames. In fact, Newton’s First Law postulates the existence of such a frame. From the principle formulated above it follows that Newton’s First Law also holds in a reference frame which moves uniformly relative to a reference frame which has already been proved to be inertial.

Figure 1. An infinite space-grid of identical clocks is set up in every reference frame (shown: axes x, y, z with origin O, and axes x′, y′, z′ with origin O′ moving with velocity u). These clocks are at rest and synchronized in the corresponding reference frame. When an event happens, its space coordinates and time are read from the coordinates of the nearest clock and the time shown by that clock.

Figure 2. If at time $t_1$ the location of an object is $x_1$, this can be represented by a point on the x-t plane. The motion of an object is then represented by a curve which is called a world line (the angle $\theta$ is marked in the figure).

2. The speed of light in vacuum is the same in all reference frames (Einstein).

The historically first experiment verifying that the speed of light is independent of the

reference frame was performed by Michelson and Morley. They observed that the speed

of light relative to Earth is the same despite the orbital motion of the Earth and/or

different directions of light propagation. Modern particle accelerators are able to accelerate particles to velocities extremely close to the speed of light in vacuum. The design of accelerators and the analysis of experimental results obtained with particles colliding at high speeds rely on the relativistic form of Newton's laws, which is derived later in the course. The constancy of the speed of light also follows from Maxwell's equations and the assumption that these equations are valid in all inertial reference frames.

Transformation of coordinates. “Standard” pair of reference frames

Apparently, there must be a relation between the sets of 4-coordinates in different reference frames. We assume that there is a universal transformation describing how to calculate the primed coordinates of an event given its not primed coordinates and the relative velocity of the two frames.

Any straight world line remains a straight world line in any inertial reference frame

because if the velocity is constant in one inertial reference frame it is also constant (time

independent) in any other inertial reference frame. The only transformation of

coordinates which transforms a straight line into a straight line is a linear

transformation.

Conventionally for simplicity, the following conditions are assumed valid.

1) The primed reference frame moves in the x-direction with velocity $u$.

2) The corresponding axes of the two reference frames are parallel to each other.

3) When the locations of the two origins O and O′ coincide in space, the two clocks located at the origins of the two reference frames show zero time.

These two frames will be called a standard pair of reference frames or simply standard reference frames for brevity (see Fig. 1). Since the velocity is parallel to the x-axis, it is fully described by just one number $u$. The sign of this number is positive if the velocity vector points from negative x to positive x and negative if it points in the negative x-direction.

These conditions are not limitations of the theory, which can easily be generalized to arbitrary inertial reference frames. If the above conditions are not satisfied, one can use reference frames O′′ and O′′′ which are not moving relative to O and O′ respectively but which satisfy the above conditions.

First, O′′ and O′′′ are rotated in space relative to O and O′ respectively in such a way that 1) the velocity $u$ is parallel to $x''$ and $x'''$; 2) $y'' \parallel y'''$; 3) $z'' \parallel z'''$. Obviously, O′′ and O′′′ satisfy the conditions 1) and 2).

The space coordinates in O and O′′ (and similarly in O′ and O′′′) are related through expressions based on ordinary Euclidean geometry. A vector-matrix notation can be used to write

$$\begin{pmatrix} x'' \\ y'' \end{pmatrix} = \hat{\Re} \begin{pmatrix} x \\ y \end{pmatrix},$$

where $\hat{\Re}$ stands for a rotation matrix. For example, in the two-dimensional case shown in the figure below, a point at distance $r$ from the origin and at polar angle $\varphi$ has coordinates $x = r\cos\varphi$, $y = r\sin\varphi$, and for axes rotated by the angle $\phi$ the relation between the coordinates is derived as follows

$$x'' = r\cos(\varphi - \phi) = r\cos\varphi\cos\phi + r\sin\varphi\sin\phi = x\cos\phi + y\sin\phi$$

$$y'' = r\sin(\varphi - \phi) = r\sin\varphi\cos\phi - r\cos\varphi\sin\phi = -x\sin\phi + y\cos\phi$$

This can be written as

$$\begin{pmatrix} x'' \\ y'' \end{pmatrix} = \begin{pmatrix} \cos\phi & \sin\phi \\ -\sin\phi & \cos\phi \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix}$$

[Figure: axes $x''$, $y''$ rotated by the angle $\phi$ relative to the axes $x$, $y$; a point at distance $r$ from the origin and at polar angle $\varphi$.]

The time in reference frame O′′ is synchronized with the time in O, that is, $t'' = t$. The time in reference frame O′′′ can be set with an arbitrary shift $\Delta t$ so that generally $t''' = t' + \Delta t$. But $\Delta t$ can be chosen to satisfy convention 3) above, that is, the clock at the origin of O′′′ shows zero when this origin coincides in space with the origin of O′′.

For the rest of the course we will deal with the standard configuration of the reference frames.

Lorentz transformations

For a pair of standard reference frames the transformation of the y and z coordinates is very simple: $y' = y$ and $z' = z$. It is not so for the x-coordinate and time.

In classical Newtonian mechanics, the relation between the coordinates of an event in the two frames is given by the Galilean transformations

$$t' = t, \qquad x' = x - ut .$$

It is easy to see that these transformations violate Einstein’s second postulate: the velocity of anything (including light) depends on the reference frame according to the Galilean velocity addition formula $\upsilon' = \upsilon - u$.
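The conflict with the second postulate can be checked numerically. The short sketch below is only an illustration: the function names and the chosen value $u = 0.5c$ are assumptions made for this example. It transforms two events on the world line of a light pulse and computes the speed of the pulse in the primed frame.

```python
C = 299_792_458.0  # speed of light in vacuum, m/s


def galilean(x, t, u):
    """Galilean transformation to a frame moving with velocity u along x."""
    return x - u * t, t


def speed_between(event_1, event_2):
    """Speed of an object whose world line passes through two events (x, t)."""
    (x1, t1), (x2, t2) = event_1, event_2
    return (x2 - x1) / (t2 - t1)


u = 0.5 * C  # hypothetical velocity of the primed frame

# Two events on the world line x = c t of a light pulse.
events = [(0.0, 0.0), (C * 1.0, 1.0)]
primed_events = [galilean(x, t, u) for x, t in events]

# In the primed frame the pulse moves with speed c - u, not c.
speed_in_primed = speed_between(*primed_events)
print(speed_in_primed / C)
```

The printed ratio differs from 1, which is exactly the disagreement with the second postulate described above.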

A general form for a linear transformation of x and t from O to O′ is

$$t' = \alpha x + \beta t, \qquad x' = \gamma x + \delta t \tag{1}$$

All four coefficients may be functions of velocity u , the velocity of O′ relative to O .

The world line of the origin of the primed reference frame is $x' \equiv 0$ in the primed reference frame and satisfies the equation $x = ut$ in the not primed reference frame. When these two equalities are substituted into the equation $x' = \gamma x + \delta t$, one gets

$$0 = \gamma u t + \delta t \tag{2}$$

Therefore we must have $\delta = -\gamma u$. Thus, only three parameters are left in the linear relation (Eq. 1) and we are seeking a transformation in the form

$$t' = \alpha x + \beta t, \qquad x' = \gamma (x - ut) \tag{3}$$

which is obtained from Eq. (1) by the substitution $\delta = -\gamma u$. These equations should satisfy Einstein’s postulates.

The speed of light must be the same in all reference frames; therefore, under the substitution $x' = \gamma(x - ut)$ and $t' = \alpha x + \beta t$,

(i) the world line $x' = ct'$ should transform into the world line $x = ct$;

(ii) the world line $x' = -ct'$ should transform into the world line $x = -ct$.

Substitution of the expressions for $x'$ and $t'$ into $x' = ct'$ and $x' = -ct'$ gives

$$\gamma(x - ut) = c(\alpha x + \beta t), \qquad \gamma(x - ut) = -c(\alpha x + \beta t) \tag{4}$$

respectively. By solving each of these equations for x one gets


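$$x = \frac{c\beta + u\gamma}{\gamma - c\alpha}\, t, \qquad x = \frac{-c\beta + u\gamma}{\gamma + c\alpha}\, t . \tag{5}$$

The required linear dependencies $x = ct$ and $x = -ct$ emerge only if

$$\frac{c\beta + u\gamma}{\gamma - c\alpha} = c, \qquad \frac{-c\beta + u\gamma}{\gamma + c\alpha} = -c \tag{6}$$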

or in an equivalent form

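$$c\beta + u\gamma = c\gamma - c^2\alpha, \qquad c\beta - u\gamma = c\gamma + c^2\alpha \tag{7}$$

where by solving for $\alpha$ and $\beta$ (adding the two equations gives $\beta$, subtracting them gives $\alpha$) we obtain

$$\beta = \gamma, \qquad \alpha = -\frac{u}{c^2}\gamma \tag{8}$$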

Thus, the transformation from not primed to primed coordinates must be

$$x' = \gamma(x - ut), \qquad t' = \gamma\left(t - \frac{u}{c^2}x\right), \tag{9}$$

where the only parameter not yet determined is $\gamma$. This transformation can be inverted (to obtain the transformation from the primed to the not primed reference frame) by solving Equations (9) for $x$ and $t$. We begin by solving the upper equation for $x$

$$x = \frac{1}{\gamma}x' + ut \tag{10}$$

next, substitute this solution into the lower equation

$$t' = \gamma\left(1 - \frac{u^2}{c^2}\right)t - \frac{u}{c^2}x', \tag{11}$$

solve it for $t$ and then substitute the solution into Equation (10)

$$t = \frac{1}{\gamma\left(1 - u^2/c^2\right)}\left(t' + \frac{u}{c^2}x'\right)$$

$$x = \frac{1}{\gamma}x' + \frac{u}{\gamma\left(1 - u^2/c^2\right)}\left(t' + \frac{u}{c^2}x'\right) = \frac{1}{\gamma\left(1 - u^2/c^2\right)}\left(x' + ut'\right) \tag{12}$$

Now we exploit the symmetry between the primed and not primed reference frames. We use the fact that there is no difference between the primed and not primed reference frames except for the value of the relative velocity. Remember that the laws of physics (and the Lorentz transformations are among such laws) should be identical in all reference frames. Therefore the transformation from $x'$ and $t'$ to $x$ and $t$ can also be obtained from Eq. (9) by replacing $u$ with $u'$, the velocity of O relative to O′, and exchanging the primed and not primed coordinates:

$$x = \gamma(u')\left(x' - u't'\right), \qquad t = \gamma(u')\left(t' - \frac{u'}{c^2}x'\right) \tag{13}$$

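The agreement between Eq. (13) and Eq. (12) appears only if

$$\gamma(u') = \frac{1}{\gamma(u)\left(1 - u^2/c^2\right)}, \qquad \gamma(u')\,u' = -\frac{u}{\gamma(u)\left(1 - u^2/c^2\right)} \tag{14}$$

Therefore

$$u' = -u, \qquad \gamma(-u)\,\gamma(u) = \frac{1}{1 - u^2/c^2} .$$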

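[Figure: A) the primed frame moving with velocity $u$ along $x$; B) the mirror image, with velocity $-u$ along $-x$.] Drawing A) can be obtained from B) by flipping left and right. Because there is no physical difference between left and right, the Lorentz transformation should not change if $x$ is replaced by $-x$ and $u$ is replaced by $-u$. Therefore $\gamma(u)$ must be equal to $\gamma(-u)$, and hence $\gamma(u) = \left(1 - u^2/c^2\right)^{-1/2}$ (the positive root, so that the transformation reduces to the identity at $u = 0$).

Thus, the transformation relating the 4-coordinates of the same event when described in two reference frames reads

$$x' = \left(1 - \frac{u^2}{c^2}\right)^{-1/2}(x - ut), \qquad t' = \left(1 - \frac{u^2}{c^2}\right)^{-1/2}\left(t - \frac{u}{c^2}x\right) \tag{15}$$

and is called the Lorentz transformation.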
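Eq. (15) translates directly into a few lines of code. The sketch below is a minimal illustration, not part of the original notes: the function names are hypothetical and units with $c = 1$ are assumed for numerical convenience. It applies the transformation, inverts it by replacing $u$ with $-u$, and checks that the world line $x = ct$ of a light pulse is mapped to $x' = ct'$.

```python
import math

C = 1.0  # assumption: units in which the speed of light equals 1


def gamma(u, c=C):
    """Lorentz factor (1 - u^2/c^2)^(-1/2)."""
    return 1.0 / math.sqrt(1.0 - (u / c) ** 2)


def lorentz(x, t, u, c=C):
    """Eq. (15): coordinates of the event (x, t) in the primed frame."""
    g = gamma(u, c)
    return g * (x - u * t), g * (t - u * x / c**2)


def inverse_lorentz(x_p, t_p, u, c=C):
    """Inverse transformation, obtained by replacing u with -u."""
    return lorentz(x_p, t_p, -u, c)


u = 0.6                      # hypothetical relative velocity
x, t = 2.0, 5.0              # hypothetical event
x_p, t_p = lorentz(x, t, u)
x_back, t_back = inverse_lorentz(x_p, t_p, u)

# An event on the world line x = c t of a light pulse stays on x' = c t'.
light_x, light_t = lorentz(C * 3.0, 3.0, u)
print(x_back, t_back, light_x / light_t)
```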

Transformation of a particle’s trajectory to a different RF

If a trajectory is given by the equations

$$x = g(t), \qquad y = f(t),$$

where $g(t)$ and $f(t)$ are arbitrary functions of time (of course there is a limit for the velocity which a physical particle can have and therefore $\left(\frac{dg}{dt}\right)^2 + \left(\frac{df}{dt}\right)^2 \leq c^2$ should hold), we can transform this trajectory to a primed reference frame. Because the Lorentz transformations expressing the not primed coordinates through the primed ones (standard configuration) are

$$x = \gamma(x' + ut'), \qquad y = y', \qquad t = \gamma\left(t' + \frac{u}{c^2}x'\right),$$

we can substitute these relations into the trajectory equations

$$\gamma(x' + ut') = g\left(\gamma\left(t' + \frac{u}{c^2}x'\right)\right), \qquad y' = f\left(\gamma\left(t' + \frac{u}{c^2}x'\right)\right).$$

It may be possible to solve the top equation for $x'$ and get $x' = G(t')$, where $G(t')$ is a function of time in the primed reference frame. This function can be substituted into the equation for $y'$. Thus one gets

$$x' = G(t'), \qquad y' = f\left(\gamma\left(t' + \frac{u}{c^2}G(t')\right)\right),$$

the trajectory of the same particle in the primed reference frame.
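As an illustration of this procedure, take the simplest trajectory $g(t) = \upsilon t$, $f(t) = 0$ (uniform motion along x; this choice is an assumption made for the example). Solving $\gamma(x' + ut') = \upsilon\gamma(t' + ux'/c^2)$ for $x'$ gives $G(t') = \frac{\upsilon - u}{1 - u\upsilon/c^2}\,t'$. The sketch below, with hypothetical names and units where $c = 1$, confirms this by transforming several events of the trajectory directly.

```python
import math

C = 1.0  # assumption: units in which the speed of light equals 1


def lorentz(x, t, u, c=C):
    """Lorentz transformation of the event (x, t), standard configuration."""
    g = 1.0 / math.sqrt(1.0 - (u / c) ** 2)
    return g * (x - u * t), g * (t - u * x / c**2)


v = 0.8  # hypothetical particle velocity in the not primed frame
u = 0.5  # hypothetical velocity of the primed frame


def g_trajectory(t):
    """Trajectory x = g(t) in the not primed frame (uniform motion)."""
    return v * t


# Slope of G(t') obtained by solving the top equation for x'.
expected = (v - u) / (1.0 - u * v / C**2)

# Transform sample events of the trajectory and compare x'/t' with it.
samples = [lorentz(g_trajectory(t), t, u) for t in (1.0, 2.0, 3.0)]
slopes = [x_p / t_p for x_p, t_p in samples]
print(expected, slopes)
```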

Properties of Lorentz transformations



1. Non relativistic limit . If the speed of light is formally set to infinity, the Lorentz

transformations are equivalent to the Galilean transformations. It is a good idea to look

at the limit c → ∞ in any relativistic problem to make sure that the solution converges

to the non relativistic Newtonian mechanics.
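This limit can be watched numerically. In the sketch below (hypothetical names and numbers, chosen only for illustration) the event and the frame velocity are kept fixed while $c$ is increased; the difference between the Lorentz and Galilean results shrinks towards zero.

```python
import math


def lorentz(x, t, u, c):
    """Lorentz transformation for a given value of the speed of light c."""
    g = 1.0 / math.sqrt(1.0 - (u / c) ** 2)
    return g * (x - u * t), g * (t - u * x / c**2)


def galilean(x, t, u):
    """Galilean transformation."""
    return x - u * t, t


x, t, u = 100.0, 2.0, 30.0  # hypothetical event and frame velocity

diffs = []
for c in (1e2, 1e4, 1e6, 1e8):
    x_l, t_l = lorentz(x, t, u, c)
    x_g, t_g = galilean(x, t, u)
    # Largest deviation of the Lorentz result from the Galilean one.
    diffs.append(max(abs(x_l - x_g), abs(t_l - t_g)))

print(diffs)
```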

2. Interval. Let us calculate in different reference frames the value of $s_{12}^2 \equiv c^2(t_2 - t_1)^2 - (x_2 - x_1)^2 \equiv c^2\Delta t^2 - \Delta x^2$, where the subscripts refer to two events. In a primed reference frame the value is

$$\begin{aligned}
s_{12}'^{\,2} &\equiv c^2\Delta t'^2 - \Delta x'^2 = c^2\gamma^2\left(\Delta t - \frac{u}{c^2}\Delta x\right)^2 - \gamma^2\left(\Delta x - u\Delta t\right)^2 \\
&= \gamma^2 c^2\Delta t^2 - 2\gamma^2 u\,\Delta t\,\Delta x + \gamma^2\frac{u^2}{c^2}\Delta x^2 - \gamma^2\Delta x^2 + 2\gamma^2 u\,\Delta t\,\Delta x - \gamma^2 u^2\Delta t^2 \\
&= \gamma^2\left(c^2 - u^2\right)\Delta t^2 - \gamma^2\left(1 - \frac{u^2}{c^2}\right)\Delta x^2 = c^2\Delta t^2 - \Delta x^2 \equiv s_{12}^2
\end{aligned} \tag{16}$$

The quantity $s_{12}^2$ is the same in all reference frames for any two events. This number is called a squared interval.

Generally, in 3D space the squared interval is defined by

$$s_{12}^2 \equiv c^2\Delta t_{12}^2 - \Delta x_{12}^2 - \Delta y_{12}^2 - \Delta z_{12}^2 \tag{17}$$

It is also true in 3D that the squared interval does not change (it is an invariant) under Lorentz transformations.

If a squared interval is larger than zero, it is called time-like. If a squared interval is smaller than zero, it is called space-like.

Note 1: Distance between events is invariant under rotation of the space coordinates but

is not invariant under Lorentz transformation.
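Both statements, the invariance of the squared interval and the non-invariance of the distance, can be verified with a short calculation. The sketch below uses hypothetical events and assumes units with $c = 1$; it is an illustration of Eq. (16), not a proof.

```python
import math

C = 1.0  # assumption: units in which the speed of light equals 1


def lorentz(x, t, u, c=C):
    """Lorentz transformation of the event (x, t), standard configuration."""
    g = 1.0 / math.sqrt(1.0 - (u / c) ** 2)
    return g * (x - u * t), g * (t - u * x / c**2)


def squared_interval(event_1, event_2, c=C):
    """s_12^2 = c^2 (t2 - t1)^2 - (x2 - x1)^2 for events given as (x, t)."""
    (x1, t1), (x2, t2) = event_1, event_2
    return c**2 * (t2 - t1) ** 2 - (x2 - x1) ** 2


event_1 = (0.0, 0.0)   # hypothetical events (x, t)
event_2 = (3.0, 5.0)
u = 0.8                # hypothetical frame velocity

primed_1 = lorentz(*event_1, u)
primed_2 = lorentz(*event_2, u)

s2 = squared_interval(event_1, event_2)
s2_primed = squared_interval(primed_1, primed_2)

dx = event_2[0] - event_1[0]            # distance in the not primed frame
dx_primed = primed_2[0] - primed_1[0]   # distance in the primed frame
print(s2, s2_primed, dx, dx_primed)
```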

Absolute past and absolute future

The value of $s_{12}^2$ as defined above can have any sign (do not be confused by the superscript 2). Because $s_{12}^2$ is an invariant (does not depend on the reference frame), the sign of $s_{12}^2$ is the same in all reference frames. This sign is very important because it tells a lot about possible relations between the two events for which the squared interval is calculated.

1. When $s_{12}^2 > 0$, the chronological order of the two events is absolute: it is the same in all reference frames. Two such events are said to have timelike separation or are simply called timelike events.

Note that when $c$ is infinitely large (that is, when Newton’s mechanics is valid) this inequality always holds and therefore all events are timelike.

Proof. The Lorentz transformation for the time separation between two events is

$$t_2' - t_1' \equiv \Delta t' = \gamma\left(\Delta t - \frac{u}{c^2}\Delta x\right) \tag{18}$$

If the squared interval is larger than zero, then $c^2\Delta t^2 > \Delta x^2 \rightarrow |\Delta t| > |\Delta x|/c$ and therefore

$$|\Delta t| > \frac{|u|}{c^2}|\Delta x| \tag{19}$$

as long as $|u| < c$. The latter inequality holds for any physically allowed reference frame and hence, as follows from Eqs. (18) and (19), the sign of $\Delta t'$ coincides with the sign of $\Delta t$.

In other words, if $t_2 > t_1$, then $t_2' > t_1'$ and if $t_2 < t_1$, then $t_2' < t_1'$.

The locations of two timelike events can always be made identical by choosing an appropriate reference frame. $x_2' - x_1' \equiv \Delta x' = \gamma(\Delta x - u\Delta t)$ is zero if $\Delta x = u\Delta t$, that is if

$$u = \frac{\Delta x}{\Delta t} \tag{20}$$

Because $|u| < c$ for a physically allowed reference frame, the equality can be achieved only if $c > |\Delta x/\Delta t| \rightarrow c^2\Delta t^2 - \Delta x^2 > 0$, that is, when the squared interval is larger than zero.
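A numerical illustration of both properties of timelike events follows. The separation values are hypothetical and units with $c = 1$ are assumed. Because the Lorentz transformation is linear, it can be applied directly to the differences $\Delta x$, $\Delta t$.

```python
import math

C = 1.0  # assumption: units in which the speed of light equals 1


def lorentz(dx, dt, u, c=C):
    """Lorentz transformation applied to a separation (dx, dt)."""
    g = 1.0 / math.sqrt(1.0 - (u / c) ** 2)
    return g * (dx - u * dt), g * (dt - u * dx / c**2)


dx, dt = 3.0, 5.0  # hypothetical timelike separation: c^2 dt^2 - dx^2 > 0

# Eq. (20): the frame in which both events occur at the same place.
u_same_place = dx / dt
dx_primed, dt_primed = lorentz(dx, dt, u_same_place)

# The chronological order is the same for every allowed frame velocity.
orders = [lorentz(dx, dt, u)[1] > 0 for u in (-0.99, -0.5, 0.0, 0.5, 0.99)]
print(dx_primed, dt_primed, orders)
```

In the frame found from Eq. (20) the time separation equals $\sqrt{s_{12}^2}/c$, in agreement with the invariance of the squared interval.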


2. When $s_{12}^2 < 0$, the chronological order of the two events depends on the reference frame. Two such events are said to have spacelike separation or are simply called spacelike events. These events cannot be considered as being a physical cause and consequence of each other because such a relation should not depend on the reference frame (logically the cause cannot follow the consequence).

For example, two spacelike events can be made simultaneous because by choosing an appropriate primed reference frame we can get

$$\Delta t' = \gamma\left(\Delta t - \frac{u}{c^2}\Delta x\right) = 0 . \tag{21}$$

This can be achieved if the primed reference frame is moving with velocity

$$u = c^2\frac{\Delta t}{\Delta x} \tag{22}$$

And if $c^2\Delta t^2 - \Delta x^2 < 0 \rightarrow c\,|\Delta t/\Delta x| < 1$, the required speed is physically allowed ($|u| \equiv c \cdot c\,|\Delta t/\Delta x| < c$).
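The frame dependence of the chronological order can be seen in a small example. The separation values and frame velocities below are hypothetical, and units with $c = 1$ are assumed.

```python
import math

C = 1.0  # assumption: units in which the speed of light equals 1


def lorentz(dx, dt, u, c=C):
    """Lorentz transformation applied to a separation (dx, dt)."""
    g = 1.0 / math.sqrt(1.0 - (u / c) ** 2)
    return g * (dx - u * dt), g * (dt - u * dx / c**2)


dx, dt = 5.0, 3.0  # hypothetical spacelike separation: c^2 dt^2 - dx^2 < 0

# Eq. (22): the frame in which the two events are simultaneous.
u_simultaneous = C**2 * dt / dx
dt_simultaneous = lorentz(dx, dt, u_simultaneous)[1]

# A slower frame keeps the order of the events, a faster one reverses it.
dt_slow = lorentz(dx, dt, 0.0)[1]
dt_fast = lorentz(dx, dt, 0.8)[1]
print(u_simultaneous, dt_simultaneous, dt_slow, dt_fast)
```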

Final remarks for Lecture 1

In these Lecture Notes we do not talk about time dilation and length contraction, the central topics of many elementary textbooks. Actually, the best advice I can give you is to avoid such terminology. A typical statement which can be seen here and there, “time runs slower in a moving reference frame”, is quite misleading. Time does not do it! If taken seriously and/or without an explanation, such a statement is logically absurd. You have to make sense of time being slower in the primed frame than in the not primed frame (because the primed is moving relative to the not primed) and at the same time being slower in the not primed frame because it is moving relative to the primed. But the theory is built on logic. Is this the instance when the impossibility of understanding the theory is revealed? I do not think so!

The SR theory interconnects time and the x, y, z coordinates in a single 4-dimensional space. In other (slightly simpler) words, time in the primed frame (for example) is expressed as a linear combination of the x coordinate and time in the not primed frame. Consider a pair of events 1 and 2 such that their location is the same in the primed frame, that is, $\Delta x'_{1,2} = 0$. Then

$$\Delta t_{1,2} = \gamma\left(\Delta t'_{1,2} + \frac{u}{c^2}\Delta x'_{1,2}\right) = \gamma\,\Delta t'_{1,2}$$

Because $\gamma > 1$ we conclude that $\Delta t_{1,2} > \Delta t'_{1,2}$. So indeed, the time interval is smaller in the primed frame and one may say that time runs slower in the primed frame. However, take two events (3 and 4) such that they are at the same location in the not primed frame ($\Delta x_{3,4} = 0$). Then

$$\Delta t'_{3,4} = \gamma\left(\Delta t_{3,4} - \frac{u}{c^2}\Delta x_{3,4}\right) = \gamma\,\Delta t_{3,4} .$$

The $\gamma$ is on the wrong side!

Thus, for some events the time interval is shorter in the primed frame and for some others the time interval is shorter in the not primed frame. Is this a surprise? No!

Look at an example of an ordinary linear transformation (rotation) of two space frames. For point A, $x_A > x'_A$ but for point B, $x_B < x'_B$. Very few people have ever complained or have called it a paradox.

[Figure: axes $x'$, $y'$ rotated relative to axes $x$, $y$, with two points A and B.]

The same problem can be seen with length contraction. Just consider two events which happen at the same time in the primed frame and then another pair of events which happen at the same time in the not primed frame.
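The two cases discussed in these final remarks can be reproduced with the Lorentz transformation alone. The sketch below is an illustration under stated assumptions: units with $c = 1$, a hypothetical relative velocity $u = 0.6c$, and unit time separations chosen for convenience.

```python
import math

C = 1.0  # assumption: units in which the speed of light equals 1


def lorentz(dx, dt, u, c=C):
    """Lorentz transformation applied to a separation (dx, dt)."""
    g = 1.0 / math.sqrt(1.0 - (u / c) ** 2)
    return g * (dx - u * dt), g * (dt - u * dx / c**2)


u = 0.6  # hypothetical velocity of the primed frame

# Events 1 and 2: same place in the primed frame (dx' = 0).
dt_primed_12 = 1.0
# The inverse transformation is obtained by replacing u with -u.
dx_12, dt_12 = lorentz(0.0, dt_primed_12, -u)

# Events 3 and 4: same place in the not primed frame (dx = 0).
dt_34 = 1.0
dx_primed_34, dt_primed_34 = lorentz(0.0, dt_34, u)

# In each case the shorter interval belongs to the frame in which
# the two events happen at the same place.
print(dt_12, dt_primed_12, dt_34, dt_primed_34)
```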

Lecture 2

Concepts: Vectors. Vectors in Euclidean space. Scalar product in Minkowski space. Proper time. 4-velocity. Quotient rule. 4-Wave-vector. Doppler effect. Aberration effect. Transformation of the phase velocity.

A geometrical concept of a vector is useful because any relations between vectors are

frame invariant. These relations hold after any allowed transformation of coordinates.

For example, $\vec{a} + \vec{b} = \vec{c}$ holds after any 3D rotation or after a translation in space.

Vectors

Generally, vectors are geometrical objects which can be added to each other and can be multiplied by a number. Given a coordinate system (base vectors), each N-dimensional vector is identified by its N components. For example, a 2-dimensional vector $\vec{a} \rightarrow [a_1, a_2]$. These numbers can be transformed to a different coordinate system (a different set of base vectors) according to the transformation defined by N relations, one for each $m$,

$$a'_m = \sum_{n=1}^{N} p_{mn} a_n , \tag{23}$$

where the $N^2$ numbers $p_{mn}$ are the same for all vectors (but depend on the choice of the coordinate systems involved in the transformation). There must be a one-to-one correspondence between $[a'_1, a'_2, \ldots, a'_N]$ and $[a_1, a_2, \ldots, a_N]$; therefore $\det(p_{mn}) \neq 0$ must hold to ensure that the linear equations (23) can be solved for $a_n$.

Addition of two vectors and multiplication of a vector by a number read in coordinate representation as

$$\vec{c} = \vec{a} + \vec{b} \rightarrow \vec{c} \equiv [a_1 + b_1, a_2 + b_2, \ldots, a_N + b_N] \tag{24}$$

and

$$\alpha\vec{c} \equiv [\alpha c_1, \alpha c_2, \ldots, \alpha c_N] \tag{25}$$

Figure 3. An example of two vectors $\vec{a}$ and $\vec{b}$ which are geometrically added to produce vector $\vec{c}$.

Vectors in Euclidean space

Under certain transformations of the coordinate system such as rotation, translation, and their combinations, the following quantity

$$\vec{a}^{\,2} \equiv a_1^2 + a_2^2 + \ldots + a_N^2 \tag{26}$$

does not change. Such a quantity is said to be invariant. The particular quantity defined by Eq. (26) is called the squared length and is a non-negative number. In a space where the axioms and postulates of Euclidean geometry apply, $a_1^2 + a_2^2 + \ldots + a_N^2$ is an invariant. Such a space is called a Euclidean vector space.

Scalar- or dot-product

Because

$$\left(\vec{a} + \vec{b}\right)^2 = (a_1 + b_1)^2 + (a_2 + b_2)^2 + \ldots + (a_N + b_N)^2 = \vec{a}^{\,2} + \vec{b}^{\,2} + 2\left(a_1 b_1 + a_2 b_2 + \ldots + a_N b_N\right) \tag{27}$$

and $\vec{a}^{\,2}$, $\vec{b}^{\,2}$, $\left(\vec{a} + \vec{b}\right)^2$ are invariant under rotations, translations and their combinations, $a_1 b_1 + a_2 b_2 + \ldots + a_N b_N$ should also be invariant under these transformations. An invariant which can be defined for any two vectors is called a scalar product. In Euclidean space the scalar product is therefore defined as

$$\vec{a}\cdot\vec{b} \equiv a_1 b_1 + a_2 b_2 + \ldots + a_N b_N \tag{28}$$

This scalar product has obvious properties:

$$\vec{c}\cdot\vec{a} = \vec{a}\cdot\vec{c} \tag{29}$$

$$\left(\vec{a} + \vec{b}\right)\cdot\vec{c} = \vec{a}\cdot\vec{c} + \vec{b}\cdot\vec{c} \tag{30}$$

$$\left(\alpha\vec{a}\right)\cdot\vec{b} = \alpha\left(\vec{a}\cdot\vec{b}\right) \tag{31}$$

Note that the definition of the scalar product depends on the transformation of the coordinates. Not every transformation preserves the length as defined by $\vec{a}^{\,2} \equiv a_1^2 + a_2^2 + \ldots + a_N^2$.

Examples of vectors are Newtonian momentum, velocity, acceleration, force, electrical field, etc. These will be called 3-vectors because they are 3-dimensional and their scalar products are defined by Eq. (28).
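The invariance of the Euclidean scalar product under rotation is easy to check in two dimensions, using the rotation of the axes by an angle $\phi$ written out in Lecture 1. The vectors and the angle below are hypothetical values chosen for the illustration.

```python
import math


def rotate(vector, phi):
    """Components of a 2D vector in axes rotated by the angle phi."""
    x, y = vector
    return (x * math.cos(phi) + y * math.sin(phi),
            -x * math.sin(phi) + y * math.cos(phi))


def dot(a, b):
    """Euclidean scalar product, Eq. (28)."""
    return sum(p * q for p, q in zip(a, b))


a = (3.0, 4.0)    # hypothetical vectors
b = (-1.0, 2.0)
phi = 0.7         # hypothetical rotation angle in radians

a_rot = rotate(a, phi)
b_rot = rotate(b, phi)

# Both the scalar product and the squared length keep their values.
print(dot(a, b), dot(a_rot, b_rot), dot(a, a), dot(a_rot, a_rot))
```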

Definition of 4-vectors in Minkowski space

In Minkowski space the coordinates are transformed according to Lorentz transformations. In these lectures we will use lower-case letters for vectors in Euclidean space. Capital letters will label 4-vectors in Minkowski space (or, for brevity, simply 4-vectors). This convention will help avoid possible confusion.

A 4-vector $\vec{A} \equiv [A_1, A_2, A_3, A_4]$ can be transformed to a primed reference frame according to

$$A'_1 = \gamma\left(A_1 - \frac{u}{c}A_4\right); \qquad A'_2 = A_2; \qquad A'_3 = A_3; \qquad A'_4 = \gamma\left(A_4 - \frac{u}{c}A_1\right) \tag{32}$$

These transformations are identical to the Lorentz transformations derived above if we set $A_1 = x$, $A_2 = y$, $A_3 = z$, and $A_4 = ct$.

Minkowski space also allows ordinary 3-dimensional rotations of the first three components of a 4-vector but we will not consider them here for simplicity. When rotations are excluded, the transformation matrix is given by

$$[p_{mn}] = \begin{pmatrix} \gamma & 0 & 0 & -\gamma u/c \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ -\gamma u/c & 0 & 0 & \gamma \end{pmatrix} \tag{33}$$

Note: Sometimes (when convenient) we will use for 4-vectors the notation $\vec{A} \equiv [A_1, A_2, A_3, A_4] \equiv [A_x, A_y, A_z, A_t]$.

A quantity

$$\vec{A}^{\,2} \equiv A_t^2 - A_x^2 - A_y^2 - A_z^2 \tag{34}$$

is invariant under Lorentz transformation because it has the same form as the squared interval considered previously. Therefore a scalar product of two 4-vectors is defined as

$$\vec{A}\cdot\vec{B} \equiv A_t B_t - A_x B_x - A_y B_y - A_z B_z \tag{35}$$

because, as required, it is an invariant under Lorentz transformations:

$$\vec{A}'\cdot\vec{B}' = \vec{A}\cdot\vec{B} \tag{36}$$

Some of the properties of such scalar products are identical to the properties of ordinary scalar products in Euclidean space

$$\left(\alpha\vec{A}\right)\cdot\vec{B} = \vec{A}\cdot\left(\alpha\vec{B}\right) = \alpha\left(\vec{A}\cdot\vec{B}\right) \tag{37}$$

$$\vec{A}\cdot\left(\vec{B} + \vec{C}\right) = \vec{A}\cdot\vec{B} + \vec{A}\cdot\vec{C} \tag{38}$$

$$\vec{A}\cdot\vec{B} = \vec{B}\cdot\vec{A} \tag{39}$$
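Eq. (36) can be checked numerically with the transformation matrix of Eq. (33). The sketch below is an illustration only: the helper names and the component values are hypothetical, and the components are ordered as $[A_1, A_2, A_3, A_4]$ with $A_4$ the time component.

```python
import math


def boost_matrix(beta):
    """Matrix of Eq. (33) for beta = u/c."""
    g = 1.0 / math.sqrt(1.0 - beta**2)
    return [[g, 0.0, 0.0, -g * beta],
            [0.0, 1.0, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [-g * beta, 0.0, 0.0, g]]


def transform(p, a):
    """Eq. (23): a'_m = sum over n of p_mn a_n."""
    return [sum(p[m][n] * a[n] for n in range(4)) for m in range(4)]


def minkowski_dot(a, b):
    """Eq. (35): A_t B_t - A_x B_x - A_y B_y - A_z B_z."""
    return a[3] * b[3] - a[0] * b[0] - a[1] * b[1] - a[2] * b[2]


A = [1.0, 2.0, 3.0, 4.0]     # hypothetical 4-vectors
B = [0.5, -1.0, 2.0, 3.0]
P = boost_matrix(0.6)        # hypothetical u/c

A_primed = transform(P, A)
B_primed = transform(P, B)
print(minkowski_dot(A, B), minkowski_dot(A_primed, B_primed))
```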

An example of a 4-vector is the displacement between two events

$$\Delta\vec{R} \equiv [\Delta x_{12}, \Delta y_{12}, \Delta z_{12}, c\Delta t_{12}] \tag{40}$$

Other examples of 4-vectors can be obtained using properties of the Lorentz transformations and will be considered below and in the following lectures. But first we introduce a new scalar in Minkowski space, the proper time.

Proper time (a frame independent scalar expressed in seconds)

A proper time $\Delta\tau_{12}$ between two timelike events is defined by the relation

$$\Delta\tau_{12} \equiv \pm\frac{\sqrt{s_{12}^2}}{c} \tag{41}$$

The sign in the above equation equals the sign of $\Delta t_{12}$.

1. Obviously, the proper time is an invariant under Lorentz transformations. For timelike events the squared interval is not negative and therefore the proper time is a real number and has a direct physical meaning.


2. Proper time equals the time interval between two events if they take place at the same location. Indeed, for two events at the same location $\Delta x^2 + \Delta y^2 + \Delta z^2 = 0$. Therefore

$$\Delta\tau_{12}^2 \equiv \frac{s_{12}^2}{c^2} \equiv \Delta t^2 - \frac{\Delta x^2 + \Delta y^2 + \Delta z^2}{c^2} = \Delta t^2 . \tag{42}$$

This simple result lets us define the proper time also as a time interval shown by a clock at rest (in this case, two readings of the clock are the two events taking place at the same location, the location of the clock).

3. In different reference frame the time intervals between two events are related by

Lorentz transformation

( )2/t t u x cγ′∆ = ∆ − ∆ (43)

But if t∆ is the time measured in a reference frame where the two events are located at

the same place (that is, t∆ is the proper time between the two events) then t τ∆ = ∆ ,

0x∆ = and therefore

t γ τ′∆ = ∆ (44)

Note that $\Delta\tau$ is never longer than $\Delta t'$ (since $\gamma \ge 1$). This result is sometimes stated by saying that “a moving clock appears to run slower”. When two times (say 1 pm and 2 pm) are displayed by the moving clock, the time interval between these two events, as read from the clocks of the reference frame which is used to describe the trajectory of the moving clock, will be longer (two clocks are required to do this because the clocks are not moving in their own reference frame).
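As a small numerical illustration of Eq. (44), the sketch below computes the interval read in a frame where the clock moves. The function names `gamma` and `dilated_interval`, and the chosen speed of 0.6c, are illustrative assumptions and not part of the notes.

```python
import math

C = 299_792_458.0  # speed of light in vacuum, m/s

def gamma(u: float) -> float:
    """Lorentz factor for a relative speed u (|u| < C)."""
    return 1.0 / math.sqrt(1.0 - (u / C) ** 2)

def dilated_interval(proper_interval: float, u: float) -> float:
    """Eq. (44): interval between two readings of a clock moving with speed u."""
    return gamma(u) * proper_interval

# One hour of proper time (1 pm to 2 pm on the moving clock), clock speed 0.6c.
tau = 3600.0
dt_prime = dilated_interval(tau, 0.6 * C)
print(f"proper time: {tau:.0f} s, frame time: {dt_prime:.1f} s")
```

For a speed of 0.6c the factor gamma is 1.25, so the frame clocks record about 4500 s between the two readings.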

4-Velocity (a new example of a 4-vector)

Because the proper time is an invariant, any 4-vector can be divided by the proper time (which is a real number for time-like events) and the result will be a 4-vector (it will be transformed as prescribed by the Lorentz transformation). If the displacement 4-vector is divided by the proper time, the result is called 4-velocity. The first three components of the 4-velocity can be related to the ordinary velocity $\boldsymbol{\upsilon}$, which we will call 3-velocity.


4-Velocity of a particle

4-Velocity of a particle is defined as

\[ \vec{\mathbf{V}} \equiv \frac{d}{d\tau}[x, y, z, ct] = \frac{dt}{d\tau}\frac{d}{dt}[x, y, z, ct] = \gamma(\upsilon)\cdot[\upsilon_x, \upsilon_y, \upsilon_z, c] \tag{45} \]

The factor $\gamma$ equals the ratio $dt/d\tau$, as derived in the caption to Fig. 4. Note that $\gamma(\upsilon) \equiv (1 - \upsilon^2/c^2)^{-1/2}$. The part $[\upsilon_x, \upsilon_y, \upsilon_z]$ represents the ordinary 3D velocity, that is, for example, $\upsilon_x = dx/dt$, etc.

Because $d\tau$ is a scalar and $d[x, y, z, ct]$ is a 4-vector, the ratio $d[x, y, z, ct]/d\tau$ is a 4-vector too, unlike $d[x, y, z, ct]/dt$, which is not a 4-vector because $dt$ is not a scalar (it changes if we change the reference frame!).

The four components of the 4-velocity obey (as any other 4-vector) Lorentz

transformations when the reference-frame changes.

\[ \begin{aligned} V_1' &= \gamma\left(V_1 - (u/c)\,V_4\right); \\ V_2' &= V_2; \\ V_3' &= V_3; \\ V_4' &= \gamma\left(V_4 - (u/c)\,V_1\right) \end{aligned} \tag{46} \]

Note that in these equations $\gamma = \gamma(u) \equiv (1 - u^2/c^2)^{-1/2}$. Equations (46) are identical to Eq. (36), which should hold for any 4-vector.

The squared 4-velocity of a particle should be a scalar which does not change under

Lorentz transformations. Indeed

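Figure 4. A particle is moving along its world line (axes: x, t). The proper time between two events, $[x, t]$ and $[x + dx, t + dt]$, is (one space-dimension case)

\[ d\tau = \sqrt{dt^2 - \frac{dx^2}{c^2}} = dt\sqrt{1 - \frac{1}{c^2}\left(\frac{dx}{dt}\right)^2} = dt\sqrt{1 - \frac{\upsilon^2}{c^2}} \]

This also holds in three dimensions:

\[ d\tau = \sqrt{dt^2 - \frac{dr^2}{c^2}} = dt\sqrt{1 - \frac{1}{c^2}\left(\frac{dr}{dt}\right)^2} = dt\sqrt{1 - \frac{\upsilon^2}{c^2}}, \]

where $dr^2 = dx^2 + dy^2 + dz^2$.
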

\[ \vec{\mathbf{V}}^2 \equiv V_4^2 - V_1^2 - V_2^2 - V_3^2 = \frac{c^2 - \upsilon_x^2 - \upsilon_y^2 - \upsilon_z^2}{1 - \upsilon^2/c^2} = c^2 \tag{47} \]

Note. If two or more particles are present, each of them will have a 4-velocity. One can add these 4-vectors and obtain another 4-vector. For example, for two particles, a and b, $\vec{\mathbf{V}}_\Sigma \equiv \vec{\mathbf{V}}_a + \vec{\mathbf{V}}_b$. However, the square of this new 4-vector $\vec{\mathbf{V}}_\Sigma$ is not equal to $c^2$:

\[ \vec{\mathbf{V}}_\Sigma^2 = (\vec{\mathbf{V}}_a + \vec{\mathbf{V}}_b)^2 = \vec{\mathbf{V}}_a^2 + 2\,\vec{\mathbf{V}}_a\cdot\vec{\mathbf{V}}_b + \vec{\mathbf{V}}_b^2 = 2c^2 + 2\,\vec{\mathbf{V}}_a\cdot\vec{\mathbf{V}}_b . \]

The last term depends on the speed of one particle relative to the other (that is, the speed of particle a in a reference frame where particle b is at rest). If the relative speed is zero, one gets $\vec{\mathbf{V}}_\Sigma^2 = 4c^2$ (to see this immediately, consider a reference frame where both particles are at rest).
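The statements of Eq. (47) and of the note on the sum of two 4-velocities can be checked numerically. The sketch below is only an illustration: it assumes units with c = 1, the component ordering [x, y, z, ct] used in these notes, and arbitrary example velocities; the helper names are hypothetical.

```python
import math

C = 1.0  # assumption: units in which the speed of light equals 1

def minkowski_dot(a, b):
    """Scalar product A4*B4 - A1*B1 - A2*B2 - A3*B3 for vectors ordered [x, y, z, ct]."""
    return a[3] * b[3] - a[0] * b[0] - a[1] * b[1] - a[2] * b[2]

def four_velocity(vx, vy, vz):
    """Eq. (45): V = gamma(v) * [vx, vy, vz, c]."""
    g = 1.0 / math.sqrt(1.0 - (vx * vx + vy * vy + vz * vz) / C ** 2)
    return [g * vx, g * vy, g * vz, g * C]

va = four_velocity(0.5, 0.2, 0.0)    # particle a (example values)
vb = four_velocity(-0.3, 0.0, 0.4)   # particle b (example values)
v_sum = [x + y for x, y in zip(va, vb)]

print(minkowski_dot(va, va))         # square of a single 4-velocity
print(minkowski_dot(v_sum, v_sum))   # square of the sum
```

For particles in relative motion the square of the sum comes out larger than 4c^2, and it equals 4c^2 only when both particles have the same 3-velocity.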

The relation between the 4-velocity and the 3-velocity can be obtained from Eq. (45).

For example, for the first three coordinates of the 4-velocity one gets

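$[V_1, V_2, V_3] = \gamma(\upsilon)[\upsilon_1, \upsilon_2, \upsilon_3]$ and for the fourth component $V_4 = \gamma(\upsilon)\,c$. Using these relations, the Lorentz transformations (46) read

\[ \begin{aligned} \gamma(\upsilon')\,\upsilon_1' &= \gamma(u)\left[\gamma(\upsilon)\,\upsilon_1 - \frac{u}{c}\,\gamma(\upsilon)\,c\right] = \gamma(u)\,\gamma(\upsilon)\,(\upsilon_1 - u); \\ \gamma(\upsilon')\,\upsilon_2' &= \gamma(\upsilon)\,\upsilon_2; \\ \gamma(\upsilon')\,\upsilon_3' &= \gamma(\upsilon)\,\upsilon_3; \\ \gamma(\upsilon')\,c &= \gamma(u)\left[\gamma(\upsilon)\,c - \frac{u}{c}\,\gamma(\upsilon)\,\upsilon_1\right] \end{aligned} \tag{48} \]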

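We can solve the first three equations for the components of $\boldsymbol{\upsilon}'$ to get the transformations for the components of the 3-velocity

\[ \upsilon_1' = \frac{\gamma(u)\,\gamma(\upsilon)}{\gamma(\upsilon')}\,(\upsilon_1 - u); \qquad \upsilon_2' = \frac{\gamma(\upsilon)}{\gamma(\upsilon')}\,\upsilon_2; \qquad \upsilon_3' = \frac{\gamma(\upsilon)}{\gamma(\upsilon')}\,\upsilon_3 \tag{49} \]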

A useful equality

\[ \frac{\gamma(u)\,\gamma(\upsilon)}{\gamma(\upsilon')} = \frac{1}{1 - u\upsilon_1/c^2} \tag{50} \]

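can be obtained from the transformation of $V_4$. The final result for the relativistic transformation of 3-velocity reads

\[ \upsilon_1' = \frac{\upsilon_1 - u}{1 - u\upsilon_1/c^2}; \qquad \upsilon_2' = \frac{\upsilon_2}{\gamma(u)\left(1 - u\upsilon_1/c^2\right)}; \qquad \upsilon_3' = \frac{\upsilon_3}{\gamma(u)\left(1 - u\upsilon_1/c^2\right)} \tag{51} \]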
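The route from Eq. (46) to Eq. (51) can be cross-checked numerically: boost the 4-velocity as a 4-vector, read the 3-velocity back from its components, and compare with the closed-form result. This is a sketch under the assumption c = 1; the helper names and the sample velocities are illustrative only.

```python
import math

C = 1.0  # assumption: units with c = 1

def gamma(speed):
    return 1.0 / math.sqrt(1.0 - (speed / C) ** 2)

def four_velocity(v):
    """Eq. (45) for a 3-velocity v = [v1, v2, v3]."""
    g = gamma(math.sqrt(sum(comp * comp for comp in v)))
    return [g * v[0], g * v[1], g * v[2], g * C]

def boost_x(vec, u):
    """Eq. (46): Lorentz transformation of a 4-vector [A1, A2, A3, A4] along x."""
    g = gamma(u)
    return [g * (vec[0] - u / C * vec[3]), vec[1], vec[2],
            g * (vec[3] - u / C * vec[0])]

def three_velocity(V):
    """Recover the 3-velocity from a 4-velocity: v_i = c * V_i / V_4."""
    return [C * V[i] / V[3] for i in range(3)]

def transform_velocity(v, u):
    """Eq. (51): direct transformation of the 3-velocity."""
    denom = 1.0 - u * v[0] / C ** 2
    g = gamma(u)
    return [(v[0] - u) / denom, v[1] / (g * denom), v[2] / (g * denom)]

v = [0.5, 0.3, -0.2]   # example 3-velocity in the unprimed frame
u = 0.6                # velocity of the primed frame along x
via_four_vector = three_velocity(boost_x(four_velocity(v), u))
via_eq_51 = transform_velocity(v, u)
print(via_four_vector)
print(via_eq_51)
```

Both routes should agree to rounding error, which is a convenient sanity check when the signs of u or of the velocity components are in doubt.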

Note. The transformation of the 3-velocity can also be obtained directly from the Lorentz transformations. The result, of course, is the same as in Eq. (51). For example,

\[ dx' = \gamma(u)\,(dx - u\,dt) \]

\[ dt' = \gamma(u)\left(dt - \frac{u}{c^2}\,dx\right) \]

We divide the top equation by the bottom one to get

\[ \upsilon_x' = \frac{dx'}{dt'} = \frac{\gamma(u)\,(dx - u\,dt)}{\gamma(u)\left(dt - \frac{u}{c^2}\,dx\right)} = \frac{\frac{dx}{dt} - u}{1 - \frac{u}{c^2}\frac{dx}{dt}} = \frac{\upsilon_x - u}{1 - u\upsilon_x/c^2} \]

Quotient rule

This rule helps to identify (based on physical arguments) some quantities as being 4-vectors.

Theorem. If for any 4-vector $\vec{\mathbf{A}}$ in Minkowski space

\[ A_4Y_4 - A_1Y_1 - A_2Y_2 - A_3Y_3 \tag{52} \]

is invariant (independent of the choice of the coordinate system), then $\vec{\mathbf{Y}} \equiv [Y_1, Y_2, Y_3, Y_4]$ is a 4-vector.

Proof:

When the coordinates are transformed, $A_m$ is replaced by $\sum_{n=1}^{4} p_{mn}A_n$ and $Y_m$ is replaced by $Y_m'$ (we do not know yet how $Y_m$ and $Y_m'$ are related). Because $A_4Y_4 - A_1Y_1 - A_2Y_2 - A_3Y_3$ is an invariant, the following equality

\[ -Y_1'\sum_{n=1}^{4} p_{1n}A_n - Y_2'\sum_{n=1}^{4} p_{2n}A_n - Y_3'\sum_{n=1}^{4} p_{3n}A_n + Y_4'\sum_{n=1}^{4} p_{4n}A_n = -A_1Y_1 - A_2Y_2 - A_3Y_3 + A_4Y_4 , \tag{53} \]

where the matrix $[p_{mn}]$ is defined in Eq. (33), holds for any choice of $\vec{\mathbf{A}}$. Therefore the system of equations

\[ \begin{aligned} -p_{11}Y_1' - p_{21}Y_2' - p_{31}Y_3' + p_{41}Y_4' &= -Y_1 \\ -p_{12}Y_1' - p_{22}Y_2' - p_{32}Y_3' + p_{42}Y_4' &= -Y_2 \\ -p_{13}Y_1' - p_{23}Y_2' - p_{33}Y_3' + p_{43}Y_4' &= -Y_3 \\ -p_{14}Y_1' - p_{24}Y_2' - p_{34}Y_3' + p_{44}Y_4' &= Y_4 \end{aligned} \tag{54} \]

must be satisfied (the coefficients in front of $A_1$, $A_2$, $A_3$, and $A_4$ should be equal on the right- and left-hand sides). Because $\det(p_{mn}) \neq 0$, this system of linear equations for $\{Y_n'\}$ has only one solution. But we know that if $Y_n' = \sum_{m=1}^{4} p_{nm}Y_m$, that is, if $\{Y_n\}$ is transformed as a 4-vector, then the equations are satisfied. Since there is no other solution, the transformation of the Y-numbers to a new reference frame is given by $Y_n' = \sum_{m=1}^{4} p_{nm}Y_m$. Therefore $\{Y_n\}$ is a 4-vector.

Now we can use the rule and physical arguments to generate a new 4-vector.

4 - Wave vector

Note that in some books $\theta - \pi$ is denoted by $\theta$. To obtain the same equations as in such books, $\cos\theta$ and $\sin\theta$ should be replaced by $-\cos\theta$ and $-\sin\theta$ in the following expressions.

Propagation of a plane wave is described by the equation $F = F_0\sin(\omega t - \mathbf{k}\cdot\mathbf{r})$, where F is any quantity (pressure/displacement for sound waves, or electric/magnetic fields for radio waves and light, etc.) and $\mathbf{r}$ is a radius-vector, that is, the displacement vector from the origin of the coordinates to the point where the wave is observed. The phase velocity of the wave is defined as $\upsilon = \omega/k$.

In an experiment, a recorder (filled box in the Figure) measures the oscillating variable F related to the propagating wave and shows the number of detected maxima on its display. This experiment can be described using any inertial reference frame. The phase, that is $\omega t - \mathbf{k}\cdot\mathbf{r}$, should have the same value in all these frames because its change (divided by $2\pi$) tells how many maxima have been recorded by the recorder. This outcome of the experiment (counting the maxima) should not depend on the choice of the reference frame. It is too “uncomfortable” to think that the displayed number of maxima is different in a reference frame where the box is at rest and in a reference frame moving relative to the box.

Figure 4. A plane wave propagates in the direction determined by its wave vector $\mathbf{k}$. The angle $\theta$ is the polar angle of the wave vector $\mathbf{k}$ defined relative to the x-axis as shown in the figure. With such a definition of the angle one gets $k_x = k\cos\theta$ and $k_y = k\sin\theta$.

Therefore the following equality must hold

\[ \omega' t' - k_x' r_x' - k_y' r_y' - k_z' r_z' = \omega t - k_x r_x - k_y r_y - k_z r_z \tag{55} \]

In other words

\[ \frac{\omega}{c}\,ct - k_x r_x - k_y r_y - k_z r_z \tag{56} \]

is Lorentz invariant. Therefore (see the quotient rule and note that $[r_x, r_y, r_z, ct]$ is a 4-vector)

\[ \vec{\mathbf{K}} \equiv \left[\mathbf{k}, \frac{\omega}{c}\right] \tag{57} \]

is a 4-vector, the 4-wave vector, and must be transformed according to the Lorentz transformations. Thus, the invariance of the phase requires the following relation between the values of $\mathbf{k}$ and $\omega$ expressed in different inertial reference frames.

\[ k_x' = \gamma\left(k_x - \frac{u}{c^2}\,\omega\right); \qquad k_y' = k_y; \qquad k_z' = k_z; \qquad \omega' = \gamma\left(\omega - u k_x\right) \tag{58} \]

Note that $\vec{\mathbf{K}}\cdot\vec{\mathbf{K}} = 0$ for EM waves in vacuum.
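The invariance of the phase and of the square of the 4-wave vector can be illustrated numerically. The sketch below assumes units with c = 1 and picks an arbitrary event and an arbitrary EM wave in vacuum; the helper names are hypothetical and the numbers are examples only.

```python
import math

C = 1.0  # assumption: units with c = 1

def gamma(u):
    return 1.0 / math.sqrt(1.0 - (u / C) ** 2)

def minkowski_dot(a, b):
    """A4*B4 - A1*B1 - A2*B2 - A3*B3 for 4-vectors ordered [1, 2, 3, 4]."""
    return a[3] * b[3] - a[0] * b[0] - a[1] * b[1] - a[2] * b[2]

def boost_x(vec, u):
    """Lorentz transformation of any 4-vector for a frame moving along x with velocity u."""
    g = gamma(u)
    return [g * (vec[0] - u / C * vec[3]), vec[1], vec[2],
            g * (vec[3] - u / C * vec[0])]

omega, theta = 2.0, 0.7
k = omega / C                                   # EM wave in vacuum: k = omega / c
K = [k * math.cos(theta), k * math.sin(theta), 0.0, omega / C]   # Eq. (57)
R = [1.5, -0.4, 0.0, C * 2.3]                   # an event [x, y, z, ct]

u = 0.4
K_prime, R_prime = boost_x(K, u), boost_x(R, u)

phase = minkowski_dot(K, R)                     # omega*t - k.r
phase_prime = minkowski_dot(K_prime, R_prime)
print(phase, phase_prime)
print(minkowski_dot(K_prime, K_prime))          # stays zero for light in vacuum
```

The fourth component of `K_prime`, multiplied by c, is the transformed angular frequency of Eq. (58).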

Doppler Effect

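The transformation of the fourth component of the 4-wave vector reads

\[ \frac{\omega'}{c} = \gamma\left(\frac{\omega}{c} - \frac{u}{c}\,k_x\right) \tag{59} \]

and given that $k = \omega/\upsilon$ and $k_x = k\cos\theta$ one gets the transformation of the angular frequency

\[ \omega' = \gamma\left(1 - \frac{u}{\upsilon}\cos\theta\right)\omega \tag{60} \]

This is a Doppler frequency shift which can be observed for any wave. For EM waves in vacuum,

\[ \omega' = \gamma\left(1 - \frac{u}{c}\cos\theta\right)\omega . \tag{61} \]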


The Doppler shift is not necessarily a relativistic effect. The difference between the non-relativistic Doppler shift and the relativistic one is the factor gamma in the above expression. Because of this factor, the relativistic Doppler shift is also present if $\theta$ equals 90 degrees (this is called the transverse Doppler shift). The transverse Doppler shift has been observed experimentally for atoms in motion using precise spectroscopy. It is a relativistic effect and a manifestation of the “slowed down” time in a “moving” reference frame.

If the frequency is known in one reference frame, the Doppler shift can be used to measure the velocity of any other reference frame (where the same wave is detectable) relative to the first one. For example, for a simple case when $\cos\theta = -1$ (this is when the wave propagates in the direction of negative x, opposite to the velocity of the primed reference frame, which moves in the direction of positive x) and the wave is an EM wave in vacuum, one gets

\[ \omega' = \gamma\left(1 + \frac{u}{c}\right)\omega = \left(\frac{1 + u/c}{1 - u/c}\right)^{1/2}\omega \tag{62} \]

The Doppler Effect can be used to determine the velocity u if the ratio of two

frequencies is known because Eq. (62) can be solved for u.
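A short sketch of how Eq. (61) and Eq. (62) might be used in practice is given below. Solving Eq. (62) for u/c gives u/c = (r^2 - 1)/(r^2 + 1), where r is the frequency ratio; the function names are illustrative assumptions, and beta stands for u/c.

```python
import math

def doppler_ratio(beta: float, cos_theta: float) -> float:
    """Eq. (61): omega'/omega for an EM wave in vacuum, with beta = u/c."""
    g = 1.0 / math.sqrt(1.0 - beta * beta)
    return g * (1.0 - beta * cos_theta)

def beta_from_head_on_ratio(ratio: float) -> float:
    """Eq. (62) solved for u/c; valid only for cos(theta) = -1."""
    r2 = ratio * ratio
    return (r2 - 1.0) / (r2 + 1.0)

ratio = doppler_ratio(0.3, -1.0)          # wave travelling against the frame velocity
recovered = beta_from_head_on_ratio(ratio)
transverse = doppler_ratio(0.3, 0.0)      # theta = 90 degrees: pure gamma factor
print(ratio, recovered, transverse)
```

The transverse value differs from 1 only through the factor gamma, which is the purely relativistic part of the shift discussed above.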

Aberration effect

The change in the direction of the wave vector is called the aberration effect. Because the direction of the wave vector is determined by the value of the polar angle $\theta$, the aberration effect can be described in terms of this angle.

For any wave (this treatment is valid for any wave, not only light),

\[ \begin{aligned} k'\sin\theta' &\equiv k_y' = k_y = k\sin\theta \\ k'\cos\theta' &\equiv k_x' = \gamma\left(k_x - \frac{u}{c^2}\,\omega\right) = \gamma k\left(\cos\theta - \frac{u\upsilon}{c^2}\right). \end{aligned} \]

Dividing the upper equation by the lower one, one gets the direction of the wave vector in the primed reference frame.

\[ \tan\theta' = \frac{k_y'}{k_x'} = \frac{k\sin\theta}{\gamma k\left(\cos\theta - \frac{u\upsilon}{c^2}\right)} = \frac{\sin\theta}{\gamma\left(\cos\theta - \frac{u\upsilon}{c^2}\right)} \tag{63} \]

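Other functions of the primed angle are easy to derive.

\[ \sin\theta' = \frac{k_y'}{k'} = \frac{\omega\sin\theta/\upsilon}{\omega'/\upsilon'} = \frac{\upsilon'}{\upsilon}\,\frac{\sin\theta}{\gamma\left(1 - \frac{u}{\upsilon}\cos\theta\right)} \tag{64} \]

\[ \cos\theta' = \frac{k_x'}{k'} = \frac{\gamma\,\frac{\omega}{\upsilon}\left(\cos\theta - \frac{u\upsilon}{c^2}\right)}{\omega'/\upsilon'} = \frac{\upsilon'}{\upsilon}\,\frac{\cos\theta - \frac{u\upsilon}{c^2}}{1 - \frac{u}{\upsilon}\cos\theta} \tag{65} \]

In Eqs. (64) and (65) the previously derived ratio $\omega'/\omega$ is used (see Eq. (60)).

This is a general result applicable to any kind of wave.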

Useful relations for EM waves in vacuum

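For EM waves in vacuum $\upsilon = \upsilon' = c$ and

\[ \tan\theta' = \frac{\sin\theta}{\gamma\left(\cos\theta - u/c\right)} \tag{66} \]

\[ \sin\theta' = \frac{\sin\theta}{\gamma\left(1 - \frac{u}{c}\cos\theta\right)} \tag{67} \]

\[ \cos\theta' = \frac{\cos\theta - \frac{u}{c}}{1 - \frac{u}{c}\cos\theta} \tag{68} \]

\[ \tan\frac{\theta'}{2} = \sqrt{\frac{1 + u/c}{1 - u/c}}\,\tan\frac{\theta}{2} \tag{69} \]

To derive the last one, you need the identity

\[ \frac{\sin\theta}{1 + \cos\theta} = \frac{2\sin\frac{\theta}{2}\cos\frac{\theta}{2}}{1 + \cos^2\frac{\theta}{2} - \sin^2\frac{\theta}{2}} = \frac{2\sin\frac{\theta}{2}\cos\frac{\theta}{2}}{2\cos^2\frac{\theta}{2}} = \tan\frac{\theta}{2} \tag{70} \]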

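The details of the derivation are below.

\[ \begin{aligned} \tan\frac{\theta'}{2} &= \frac{\sin\theta'}{1 + \cos\theta'} = \frac{\dfrac{\sin\theta}{\gamma\left(1 - \frac{u}{c}\cos\theta\right)}}{1 + \dfrac{\cos\theta - \frac{u}{c}}{1 - \frac{u}{c}\cos\theta}} = \frac{\sin\theta}{\gamma\left(1 - \frac{u}{c}\cos\theta + \cos\theta - \frac{u}{c}\right)} \\ &= \frac{\sqrt{1 - u^2/c^2}\,\sin\theta}{\left(1 - \frac{u}{c}\right)\left(1 + \cos\theta\right)} = \sqrt{\frac{1 + u/c}{1 - u/c}}\;\frac{\sin\theta}{1 + \cos\theta} = \sqrt{\frac{1 + u/c}{1 - u/c}}\;\tan\frac{\theta}{2} \end{aligned} \]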
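The equivalence of Eqs. (67), (68) and the half-angle form (69) can be checked numerically for EM waves in vacuum. The sketch below is illustrative only: the function names are hypothetical, beta stands for u/c, and the angle is restricted to the range 0 < theta < pi so that both routes return values in the same branch.

```python
import math

def aberrated_angle(theta: float, beta: float) -> float:
    """Primed polar angle from Eqs. (67) and (68), with beta = u/c."""
    g = 1.0 / math.sqrt(1.0 - beta * beta)
    denom = 1.0 - beta * math.cos(theta)
    sin_p = math.sin(theta) / (g * denom)
    cos_p = (math.cos(theta) - beta) / denom
    return math.atan2(sin_p, cos_p)

def aberrated_angle_half(theta: float, beta: float) -> float:
    """Primed polar angle from the half-angle formula, Eq. (69)."""
    factor = math.sqrt((1.0 + beta) / (1.0 - beta))
    return 2.0 * math.atan(factor * math.tan(theta / 2.0))

beta = 0.5
for theta_deg in (10, 60, 90, 150):
    theta = math.radians(theta_deg)
    print(theta_deg,
          math.degrees(aberrated_angle(theta, beta)),
          math.degrees(aberrated_angle_half(theta, beta)))
```

With positive u the primed angle is larger than the unprimed one: in the primed frame the wave vector is tilted away from the direction of u.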

Note. The definition of the angles is sometimes confusing. If in doubt, write down the transformations for the components of the 4-wave vector and then get the angles from the change of the 3-wave vector direction.

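Phase velocity transformation

Of course, there is also a relation between the phase velocities of the same wave in two reference frames. The magnitude of the phase velocity equals $\omega/k$. This transformation is easier to get if you recall that $\vec{\mathbf{K}}\cdot\vec{\mathbf{K}}$ is an invariant. Therefore

\[ \frac{\omega'^2}{c^2} - \frac{\omega'^2}{\upsilon'^2} = \frac{\omega^2}{c^2} - \frac{\omega^2}{\upsilon^2} , \tag{71} \]

where we have used the equality $k^2 = \omega^2/\upsilon^2$. We substitute into the above equation the expression for the frequency transformation

\[ \frac{\gamma^2\left(1 - \frac{u}{\upsilon}\cos\theta\right)^2\omega^2}{c^2} - \frac{\gamma^2\left(1 - \frac{u}{\upsilon}\cos\theta\right)^2\omega^2}{\upsilon'^2} = \frac{\omega^2}{c^2} - \frac{\omega^2}{\upsilon^2} \tag{72} \]

\[ \frac{1}{\upsilon'^2} = \frac{1}{c^2} + \frac{\left(\frac{1}{\upsilon^2} - \frac{1}{c^2}\right)\left(1 - u^2/c^2\right)}{\left(1 - \frac{u}{\upsilon}\cos\theta\right)^2} = \frac{1}{c^2} + \frac{\left(1 - \upsilon^2/c^2\right)\left(1 - u^2/c^2\right)}{\left(\upsilon - u\cos\theta\right)^2} \]

and solve it for the phase velocity in the primed reference frame $\upsilon'$

\[ \begin{aligned} \upsilon' &= \frac{\upsilon - u\cos\theta}{\sqrt{\dfrac{(\upsilon - u\cos\theta)^2}{c^2} + \left(1 - \dfrac{\upsilon^2}{c^2}\right)\left(1 - \dfrac{u^2}{c^2}\right)}} = \frac{\upsilon - u\cos\theta}{\sqrt{1 - \dfrac{2u\upsilon}{c^2}\cos\theta + \dfrac{u^2\upsilon^2}{c^4} - \dfrac{u^2}{c^2}\sin^2\theta}} \\ &= \frac{\upsilon - u\cos\theta}{\sqrt{\left(1 - \dfrac{u\upsilon}{c^2}\cos\theta\right)^2 + \left(\dfrac{u^2\upsilon^2}{c^4} - \dfrac{u^2}{c^2}\right)\sin^2\theta}} \end{aligned} \tag{73} \]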

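For example, if $\cos\theta = \mp 1$ and therefore $\sin\theta = 0$, then

\[ \upsilon' = \frac{\upsilon \pm u}{\sqrt{\left(1 \pm u\upsilon/c^2\right)^2}} = \frac{\upsilon \pm u}{1 \pm u\upsilon/c^2} \]

For EM waves in vacuum Eq. (72) reduces to

\[ \upsilon' = \frac{\gamma\,(c - u\cos\theta)\,c}{\sqrt{\gamma^2\,(c - u\cos\theta)^2 - c^2 + c^2}} = c \tag{74} \]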

This is not a surprising result since the whole theory is based on the invariance of the

speed of light.
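Eq. (73) can be compared with a direct computation: transform the 4-wave vector with Eq. (58) and form the ratio of the new angular frequency to the new wave number. The sketch below assumes units with c = 1 and a wave whose phase velocity is below c (for example, a wave in a medium); all names and numbers are illustrative.

```python
import math

C = 1.0  # assumption: units with c = 1

def gamma(u):
    return 1.0 / math.sqrt(1.0 - (u / C) ** 2)

def phase_velocity_via_k(v, u, theta):
    """Phase velocity omega'/k' obtained by transforming the 4-wave vector, Eq. (58)."""
    omega = 1.0                      # the value of omega cancels in the ratio
    k = omega / v
    kx, ky = k * math.cos(theta), k * math.sin(theta)
    g = gamma(u)
    kx_prime = g * (kx - u * omega / C ** 2)
    omega_prime = g * (omega - u * kx)
    return omega_prime / math.hypot(kx_prime, ky)

def phase_velocity_eq_73(v, u, theta):
    """Closed form of Eq. (73)."""
    c, s = math.cos(theta), math.sin(theta)
    numerator = v - u * c
    radicand = (1.0 - u * v * c / C ** 2) ** 2 - (u / C) ** 2 * (1.0 - (v / C) ** 2) * s * s
    return numerator / math.sqrt(radicand)

v, u, theta = 0.5, 0.3, 1.0
print(phase_velocity_via_k(v, u, theta), phase_velocity_eq_73(v, u, theta))
print(phase_velocity_eq_73(C, u, theta))   # light in vacuum: stays equal to c
```

The example values are chosen so that v - u*cos(theta) is positive; otherwise the transformed frequency changes sign and the two expressions differ by that sign.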

Note that there are three velocities related to the wave propagation problem in different

reference frames.

Final remarks

There are three velocities related to the 4-wavevector problem.

1. Speed of light. This is a fundamental physical constant conventionally labelled by c .

2. Relative velocity of reference frames labelled by u

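3. Phase velocity of a wave. This is denoted as $\upsilon$. The relation between $\upsilon$, $\omega$, and $k$ is $k = \omega/\upsilon$. The 4-wave vector is $\vec{\mathbf{K}} \equiv [K_1, K_2, K_3, K_4] \equiv [k_x, k_y, k_z, \omega/c]$ for all waves. For electromagnetic waves in vacuum $k = \omega/c$ and therefore $\vec{\mathbf{K}} \equiv [k_x, k_y, k_z, k]$.

For all waves (not only EM waves)

\[ \begin{aligned} K_1' &= \gamma\left(K_1 - \frac{u}{c}\,K_4\right) \\ K_2' &= K_2; \qquad K_3' = K_3 \\ K_4' &= \gamma\left(K_4 - \frac{u}{c}\,K_1\right) \end{aligned} \]
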

To use the equations in this Lecture where angles are involved, you have to determine the angle $\theta$ correctly. This is the polar angle of the 3D wave vector in the corresponding reference frame. The wave vector points in the direction where the wave propagates, not in the direction to the source of the wave. Aberration effect, Doppler etc. have nothing to do with the source! First, you draw the x and x′ axes; x and x′ point in the same direction. The direction of these axes should be parallel to the direction of $\mathbf{u}$. In the above equations u is positive if $\mathbf{u}$ points in the direction of increasing x. If $\mathbf{u}$ points in the direction opposite to the direction of x, the value of u is negative. Then you identify the angle as shown below and use your favourite equation.

My favourite for EM waves in vacuum:

\[ \tan\frac{\theta'}{2} = \sqrt{\frac{1 + u/c}{1 - u/c}}\,\tan\frac{\theta}{2} \]

For the Doppler effect (all waves):

\[ \omega' = \omega\,\frac{1 - \frac{u}{\upsilon}\cos\theta}{\sqrt{1 - u^2/c^2}} \]

Figure. The polar angle $\theta$ is measured from the x-axis to the wave vector $\mathbf{k}$.


Lecture 3

Concepts: 4-momentum. Conservation of 4-momentum. Relativistic 3-momentum. Total energy. 4-acceleration. 4-force. Transformation of magnetic and electrical fields.

4-momentum

By the analogy with Newtonian mechanics, 4-momentum of a particle is defined as a

product of its mass and its 4-velocity

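\[ \vec{\mathbf{P}} \equiv m\vec{\mathbf{V}} = m\gamma\,[\upsilon_x, \upsilon_y, \upsilon_z, c] = \frac{m}{\sqrt{1 - \upsilon^2/c^2}}\,[\boldsymbol{\upsilon}, c] \tag{75} \]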

In these lectures m is a frame-independent intrinsic property of a particle, sometimes also called “the rest mass of a particle”. Note that in some textbooks the rest mass is labelled as $m_0$ and $m \equiv \gamma m_0$ is called “relativistic mass”. The concept of “relativistic mass” creates more problems than it can possibly solve and therefore should be avoided. For example, the gravity created by a moving particle is not simply enhanced by the factor gamma. It has a more complicated dependence on the velocity, and the effect of gravity is considered in General Relativity.

Similar to its non relativistic counterpart, the 4-momentum is the same before and after

collision of any number of particles. You can think of momentum conservation as being

a basic law of physics or a mathematical axiom of the theory. But remember that the

validity of an axiom in physics is subject to experimental testing. Momentum

conservation law agrees with all the experiments done so far. Mathematically, the 4-momentum conservation reads

\[ \Big(\sum_n \vec{\mathbf{P}}_n\Big)_{\text{before collision}} = \Big(\sum_n \vec{\mathbf{P}}_n\Big)_{\text{after collision}} , \tag{76} \]

where $\vec{\mathbf{P}}_n$ is the 4-momentum of the n-th particle. Because this relation is stated in terms of 4-vectors, the equality is automatically Lorentz invariant. That is, once it is valid in one inertial reference frame it is also valid in all inertial reference frames.

The 4-momentum is not conserved when you change the reference frame (it will be transformed according to the Lorentz transformation). Therefore

\[ \Big(\sum_n \vec{\mathbf{P}}_n\Big)_{\text{before collision}} \neq \Big(\sum_n \vec{\mathbf{P}}_n'\Big)_{\text{before collision}} , \]

where the prime indicates that the 4-momenta are referred to a different inertial reference frame.

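The equality (76) can be written separately for the first three components of the 4-momentum and for its fourth component

\[ \Big(\sum_n m_n\gamma_n\boldsymbol{\upsilon}_n\Big)_{\text{before collision}} = \Big(\sum_n m_n\gamma_n\boldsymbol{\upsilon}_n\Big)_{\text{after collision}} \]

\[ \Big(\sum_n m_n\gamma_n\Big)_{\text{before collision}} = \Big(\sum_n m_n\gamma_n\Big)_{\text{after collision}} \]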

From Newton’s physics we know two quantities which are conserved in any collision

(one is a vector and the second is a scalar). These quantities are the 3-momentum and

the total energy (note that the kinetic energy is conserved only in elastic collisions).
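As a worked illustration of the two conservation equalities, consider a hypothetical one-dimensional collision in which a moving particle hits a particle at rest and the two stick together. The sketch below assumes units with c = 1 and uses the standard relation M^2 c^4 = E^2 - p^2 c^2 for the composite particle; the masses, the speed and the function name are example choices, not taken from the notes.

```python
import math

C = 1.0  # assumption: units with c = 1

def gamma(v):
    return 1.0 / math.sqrt(1.0 - (v / C) ** 2)

def stick_together(m1, v1, m2):
    """Particle 1 (mass m1, velocity v1 along x) hits particle 2 (mass m2, at rest).

    Returns (mass, velocity) of the composite particle from conservation of
    the total energy and of the relativistic 3-momentum.
    """
    energy = gamma(v1) * m1 * C ** 2 + m2 * C ** 2     # fourth component times c
    momentum = gamma(v1) * m1 * v1                     # first component
    mass = math.sqrt(energy ** 2 - (momentum * C) ** 2) / C ** 2
    velocity = momentum * C ** 2 / energy
    return mass, velocity

M, V = stick_together(1.0, 0.6, 1.0)
print(M, V)
```

In this example the composite mass comes out larger than the sum of the two initial masses, because part of the initial kinetic energy is stored inside the composite particle.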

Relativistic 3-momentum and total energy

We identify the vector $m\gamma\boldsymbol{\upsilon}$ as a relativistic 3-momentum

\[ \mathbf{p}^{(rel)} \equiv m\gamma\boldsymbol{\upsilon} . \]

It is a vector in Euclidean space. You can add a relativistic 3-momentum to another 3-momentum, you can rotate it in 3D space, and you can calculate the scalar product as $\mathbf{p}_1^{(rel)}\cdot\mathbf{p}_2^{(rel)} \equiv p_{1x}^{(rel)}p_{2x}^{(rel)} + p_{1y}^{(rel)}p_{2y}^{(rel)} + p_{1z}^{(rel)}p_{2z}^{(rel)}$. So defined, $\mathbf{p}_1^{(rel)}\cdot\mathbf{p}_2^{(rel)}$ is not invariant under Lorentz transformations. However, in reference frames which are not moving relative to each other but only rotated relative to each other, $\mathbf{p}_1^{(rel)}\cdot\mathbf{p}_2^{(rel)}$ as defined above is an invariant (that is, it is a scalar product if only rotation and translation of a reference frame are allowed). If the reference frame is replaced by a new reference frame which is moving relative to the first one, the Lorentz transformations are applied to the 4-momentum, where the 3-momentum represents only the first three coordinates.

The fourth coordinate of the 4-momentum is $\gamma mc$. If we multiply it by $c$, the result can be identified as the relativistic total energy.

\[ \gamma mc^2 \equiv E \tag{77} \]


There are several reasons for identifying $\gamma mc^2$ as the total energy. (1) This value has the units of energy (mass multiplied by the square of velocity), (2) it is conserved in all collisions if calculated for the total of all involved particles, (3) it looks nice, and (4) it gives the correct value for the kinetic energy in the non-relativistic limit, as shown below.

First, we note that when the 3-velocity of the particle is zero, its total energy is $mc^2$. If the particle is moving, its energy is increased due to kinetic energy. Therefore, the kinetic energy KE of a freely moving particle is given by

\[ KE = \frac{mc^2}{\sqrt{1 - \upsilon^2/c^2}} - mc^2 \approx mc^2\left(1 + \frac{\upsilon^2}{2c^2} + \dots\right) - mc^2 \approx \frac{m\upsilon^2}{2} , \tag{78} \]

where a Taylor expansion assuming a small value of $\upsilon^2/c^2$ is used. The kinetic energy has limited application in relativity because the splitting of the total energy between the potential and the kinetic energy is not always obvious.
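The quality of the approximation in Eq. (78) is easy to probe numerically. The sketch below assumes units with c = 1 and m = 1; the function names are illustrative.

```python
import math

C = 1.0  # assumption: units with c = 1
M = 1.0  # assumption: unit mass

def kinetic_energy_relativistic(v: float) -> float:
    """Eq. (78), exact form: (gamma - 1) m c^2."""
    return M * C ** 2 * (1.0 / math.sqrt(1.0 - (v / C) ** 2) - 1.0)

def kinetic_energy_newtonian(v: float) -> float:
    """Leading term of the Taylor expansion: m v^2 / 2."""
    return 0.5 * M * v ** 2

for beta in (0.01, 0.1, 0.5, 0.9):
    exact = kinetic_energy_relativistic(beta * C)
    approx = kinetic_energy_newtonian(beta * C)
    print(f"v/c = {beta}: exact = {exact:.6g}, m v^2/2 = {approx:.6g}, "
          f"relative difference = {(exact - approx) / exact:.3g}")
```

At v/c = 0.01 the two values differ by less than one part in ten thousand, while at v/c = 0.9 the Newtonian expression underestimates the kinetic energy by more than a factor of three.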

With the new definitions in place, the 4-momentum can now also be written as

\[ \vec{\mathbf{P}} \equiv \left[\mathbf{p}^{(rel)}, E/c\right] \tag{79} \]

The square of the 4-momentum reads

\[ \vec{\mathbf{P}}^2 \equiv E^2/c^2 - \left(p^{(rel)}\right)^2 = m^2c^2 \tag{80} \]

Note 1: $\left(p^{(rel)}\right)^2 \equiv \left(p_x^{(rel)}\right)^2 + \left(p_y^{(rel)}\right)^2 + \left(p_z^{(rel)}\right)^2$ is the squared length of the relativistic 3-vector of the momentum.

Note 2: For photons $E = p^{(rel)}c$ and therefore $\vec{\mathbf{P}}^2 = 0$. The fact that the 4-momentum squared is zero for photons should not confuse. Of course, the 3-momentum and the total energy are never zero for a photon. $E = p^{(rel)}c$ can be derived from the Maxwell equations and also from the fact that for photons $m = 0$. If m were not zero for photons, their energies would be infinitely large because photons move with the speed of light in vacuum and the corresponding factor $\gamma \equiv (1 - \upsilon^2/c^2)^{-1/2}$ is infinitely large.

4-acceleration


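As in the non-relativistic case, the 4-acceleration is defined as a time derivative of the 4-velocity. But to ensure that the obtained quantity is a 4-vector, we should take the derivative with respect to the proper time.

\[ \vec{\mathbf{A}} \equiv \frac{d\vec{\mathbf{V}}}{d\tau} = \frac{d}{d\tau}\left(\gamma\,[\upsilon_x, \upsilon_y, \upsilon_z, c]\right) = \gamma\,\frac{d}{dt}\left(\gamma\,[\upsilon_x, \upsilon_y, \upsilon_z, c]\right) \tag{81} \]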

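When calculating the derivative, we should not forget that $\gamma$ will also depend on time if the speed changes.

\[ \frac{d\gamma}{dt} = \frac{d}{dt}\,\frac{1}{\left(1 - \upsilon^2/c^2\right)^{1/2}} = \frac{\upsilon/c^2}{\left(1 - \upsilon^2/c^2\right)^{3/2}}\,\frac{d\upsilon}{dt} = \gamma^3\,\frac{\upsilon}{c^2}\,\frac{d\upsilon}{dt} \tag{82} \]

Useful and equivalent expressions for 4-acceleration are

\[ \vec{\mathbf{A}} = \gamma^2\left[\frac{d\boldsymbol{\upsilon}}{dt}, 0\right] + \gamma^4\,\frac{\upsilon}{c^2}\,\frac{d\upsilon}{dt}\,[\boldsymbol{\upsilon}, c] = \gamma^2\,[\mathbf{a}, 0] + \gamma^4\,\frac{\upsilon}{c^2}\,\frac{d\upsilon}{dt}\,[\boldsymbol{\upsilon}, c] \tag{83} \]

The 3-vector $\mathbf{a}$ in the above expressions is the 3-vector of acceleration, $\mathbf{a} \equiv d\boldsymbol{\upsilon}/dt$.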

Examples and Some Interesting Results

1. If the length of the 3-velocity vector (that is, the speed) is time independent, then $\vec{\mathbf{A}} = \gamma^2\,[\mathbf{a}, 0]$.

2. At the moment when the instantaneous 3-velocity is zero, $\vec{\mathbf{A}} = [\mathbf{a}, 0]$.

Consider a co-moving frame where the instantaneous velocity of a particle is zero. Let the direction of the 3-acceleration in the co-moving frame be in the direction of the x-axis. Hence the instantaneous 4-acceleration in the co-moving frame is $\vec{\mathbf{A}}' = [\alpha_x', 0, 0, 0]$. In a reference frame where the co-moving frame moves with the velocity of the particle, the 4-acceleration is given by Eq. (83). On the other hand, this acceleration can be obtained by Lorentz transformation of $\vec{\mathbf{A}}'$, which results in $\vec{\mathbf{A}} = \gamma\left[\alpha_x', 0, 0, \frac{\upsilon}{c}\,\alpha_x'\right]$. Therefore, comparing the fourth component in this expression and in Eq. (83), one gets

\[ \frac{d\upsilon}{dt} = \alpha_x'\left(1 - \upsilon^2/c^2\right)^{3/2} \tag{84} \]

Figure. The co-moving frame (axes x′, y′) moves with velocity $\upsilon$ along the x-axis of the frame with axes x, y. The 3-acceleration in the co-moving frame is $\boldsymbol{\alpha}' = [\alpha_x', 0, 0]$.

This differential equation describes how the velocity changes if a particle is accelerated with a constant acceleration in its co-moving frame. Integration of this equation (with $\upsilon = 0$ at $t = 0$) leads to $\upsilon\left(1 - \upsilon^2/c^2\right)^{-1/2} = \alpha_x' t$, which is straightforward to solve for $\upsilon$ and get

\[ \upsilon = \frac{\alpha_x' t}{\left(1 + \alpha_x'^2 t^2/c^2\right)^{1/2}} = \frac{\alpha_x' t\,c}{\left(c^2 + \alpha_x'^2 t^2\right)^{1/2}} \tag{85} \]

The velocity increases in time but never reaches the speed of light in vacuum. When the velocity is small relative to the speed of light we recover the non-relativistic relation $\upsilon = \alpha_x' t$.
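The closed-form result (85) can be compared with a direct numerical integration of Eq. (84). The sketch below uses a classical fourth-order Runge-Kutta step and assumes units with c = 1, a unit acceleration in the co-moving frame, and zero initial velocity; the names `ALPHA`, `dv_dt` and `integrate_velocity` are illustrative.

```python
import math

C = 1.0       # assumption: units with c = 1
ALPHA = 1.0   # assumption: constant acceleration in the co-moving frame

def dv_dt(v: float) -> float:
    """Right-hand side of Eq. (84)."""
    return ALPHA * (1.0 - (v / C) ** 2) ** 1.5

def integrate_velocity(t_end: float, steps: int) -> float:
    """Integrate Eq. (84) from v(0) = 0 with the classical Runge-Kutta method."""
    dt = t_end / steps
    v = 0.0
    for _ in range(steps):
        k1 = dv_dt(v)
        k2 = dv_dt(v + 0.5 * dt * k1)
        k3 = dv_dt(v + 0.5 * dt * k2)
        k4 = dv_dt(v + dt * k3)
        v += dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return v

def velocity_eq_85(t: float) -> float:
    """Closed-form solution, Eq. (85)."""
    return ALPHA * t / math.sqrt(1.0 + (ALPHA * t / C) ** 2)

print(integrate_velocity(2.0, 2000), velocity_eq_85(2.0))
print(velocity_eq_85(100.0))   # close to, but below, the speed of light
```

For small t both results follow the non-relativistic relation, and for large t the velocity saturates below c.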

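3. The scalar product of the 4-acceleration and the 4-velocity of the same particle is always zero. To prove this, note that in the reference frame where the instantaneous 3-velocity is zero, the 4-velocity is $[\mathbf{0}, c]$. In such a reference frame, the 4-acceleration is $\vec{\mathbf{A}} = [\mathbf{a}, 0]$. The scalar product is therefore $A_4V_4 - A_1V_1 - A_2V_2 - A_3V_3 = 0\cdot c - \mathbf{a}\cdot\mathbf{0} = 0$, and because a scalar product of two 4-vectors is invariant, it is zero in all inertial reference frames.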

4-force

In line with classical mechanics, the relativistic Newton’s law for 4-vectors reads

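\[ \frac{d\vec{\mathbf{P}}}{d\tau} = \vec{\mathbf{F}} \tag{86} \]

The law can also be written in terms of 4-velocity and 4-acceleration

\[ \frac{d\vec{\mathbf{P}}}{d\tau} = \frac{d(m\vec{\mathbf{V}})}{d\tau} = \frac{dm}{d\tau}\,\vec{\mathbf{V}} + m\,\frac{d\vec{\mathbf{V}}}{d\tau} = \frac{dm}{d\tau}\,\vec{\mathbf{V}} + m\vec{\mathbf{A}} = \vec{\mathbf{F}} \tag{87} \]

Note that in the non-relativistic counterpart of Newton's second law the derivative of mass with respect to time is also present if the mass is not a constant. Therefore there is nothing really new in this equation.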

To solve dynamical problems (for example, to calculate the trajectory of a particle) one

needs an expression of the force in Eq. (87). Since gravity is excluded from SR (you

need general relativity to deal with gravity), electromagnetic force is the only

fundamental force which can be easily included in the theory (see section about the

electromagnetic fields below).

3-force and 4-force


The relation between 3-force and 4-force follows from

\[ \vec{\mathbf{F}} = \frac{d\vec{\mathbf{P}}}{d\tau} = \frac{dt}{d\tau}\,\frac{d}{dt}\left[\mathbf{p}^{(rel)}, \frac{E}{c}\right] = \gamma(\upsilon)\left[\mathbf{f}, \frac{1}{c}\frac{dE}{dt}\right] , \tag{88} \]

where the 3-force is defined as a time derivative of the relativistic 3-momentum

\[ \mathbf{f} \equiv \frac{d\mathbf{p}^{(rel)}}{dt} \tag{89} \]

Note: The relativistic 3-momentum $\mathbf{p}^{(rel)} \equiv m\gamma\boldsymbol{\upsilon}$ represents the first three coordinates of the 4-momentum $\vec{\mathbf{P}}$, that is, $[P_x, P_y, P_z] = \mathbf{p}^{(rel)}$. However, the 3-force is a time derivative of the relativistic 3-momentum and therefore $[F_x, F_y, F_z] = \gamma\mathbf{f}$, because the first three components of the 4-force are the derivative of the 3-momentum with respect to the proper time.

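Useful equalities describing properties of the 4-force are derived below.

\[ \vec{\mathbf{F}}\cdot\vec{\mathbf{V}} = \frac{dm}{d\tau}\,c^2 + m\,\vec{\mathbf{A}}\cdot\vec{\mathbf{V}} = \frac{dm}{d\tau}\,c^2 \tag{90} \]

On the other hand

\[ \vec{\mathbf{F}}\cdot\vec{\mathbf{V}} = \gamma^2\,\frac{dE}{dt} - \gamma^2\,\mathbf{f}\cdot\boldsymbol{\upsilon} \tag{91} \]

Therefore

\[ c^2\,\frac{dm}{d\tau} = \gamma^2\,\frac{dE}{dt} - \gamma^2\,\mathbf{f}\cdot\boldsymbol{\upsilon} \tag{92} \]

and if $dm/d\tau = 0$ then

\[ \frac{dE}{dt} = \mathbf{f}\cdot\boldsymbol{\upsilon} \tag{93} \]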

Relativistic transformation of the 3-force

The transformations of three components of the 3-force are similar to the transformation

derived for the 3-velocity. This is not surprising because there is a clear analogy

between the expressions for the 4-force and for the 4-velocity.

Figure 5. In the unprimed reference frame (axes X, Y), a particle moves with 3-velocity $\boldsymbol{\upsilon}$ and experiences 3-force $\mathbf{f}$. The primed reference frame (axes X′, Y′) moves with velocity $\mathbf{u}$, as accepted in the standard configuration.


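\[ \vec{\mathbf{F}} = \gamma(\upsilon)\left[\mathbf{f}, \frac{1}{c}\frac{dE}{dt}\right] \tag{94} \]

\[ \vec{\mathbf{V}} = \gamma(\upsilon)\,[\boldsymbol{\upsilon}, c] \tag{95} \]

For example, using the transformation of $\vec{\mathbf{F}}$ as a 4-vector one can derive

\[ \gamma(\upsilon')\,f_1' = \gamma(u)\left[\gamma(\upsilon)\,f_1 - \frac{u}{c}\,\gamma(\upsilon)\,\frac{1}{c}\frac{dE}{dt}\right] \]

and therefore

\[ f_1' = \frac{\gamma(u)\,\gamma(\upsilon)}{\gamma(\upsilon')}\left(f_1 - \frac{u}{c^2}\,\frac{dE}{dt}\right) \]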

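This can be further simplified by using Eq. (50). The complete set of transformations for the 3-force reads

\[ \begin{aligned} f_1' &= \frac{1}{1 - u\upsilon_1/c^2}\left(f_1 - \frac{u}{c^2}\,\frac{dE}{dt}\right); \\ f_2' &= \frac{f_2}{\gamma(u)\left(1 - u\upsilon_1/c^2\right)}; \\ f_3' &= \frac{f_3}{\gamma(u)\left(1 - u\upsilon_1/c^2\right)} \end{aligned} \tag{96} \]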

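The fourth coordinate of the 4-force gives the transformation for power.

\[ \frac{dE'}{dt'} = \frac{1}{1 - u\upsilon_1/c^2}\left(\frac{dE}{dt} - u f_1\right) \tag{97} \]

A rest-mass-preserving force is a force such that $dm/d\tau = 0$. In this case $dE/dt = \mathbf{f}\cdot\boldsymbol{\upsilon}$ and therefore

\[ f_1' = \frac{1}{1 - u\upsilon_1/c^2}\left(f_1 - \frac{u}{c^2}\,\mathbf{f}\cdot\boldsymbol{\upsilon}\right) \tag{98} \]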

An example of such a force is Lorentz 3-force acting on a moving charged particle.

Transformation of magnetic and electrical fields

The Lorentz 3-force acting on a moving charged particle reads

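\[ \mathbf{f} = q\,\boldsymbol{\upsilon}\times\mathbf{b} + q\,\mathbf{e} , \tag{99} \]

where $\mathbf{b}$ is a 3-vector of magnetic field and $\mathbf{e}$ is a 3-vector of electrical field, $\boldsymbol{\upsilon}$ is the 3-velocity of the particle, and $q$ is the charge of the particle.

Note: The coefficients in this equation depend on the units used. For example, in the Gaussian units the Lorentz 3-force is $\mathbf{f}_G = (q_G/c)\,\boldsymbol{\upsilon}\times\mathbf{b}_G + q_G\,\mathbf{e}_G$. The relation between the Gaussian units and the international system of units for the electrical and magnetic fields and electrical charge is given below for your reference

\[ \mathbf{b}_{SI} = \left(\frac{\mu_0}{4\pi}\right)^{1/2}\mathbf{b}_G ; \qquad \mathbf{e}_{SI} = \left(4\pi\varepsilon_0\right)^{-1/2}\mathbf{e}_G ; \qquad \rho_{SI} = \left(4\pi\varepsilon_0\right)^{1/2}\rho_G ; \qquad \varepsilon_0\mu_0 = 1/c^2 \]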

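The electrical charge q is proportional to the number of elementary particles (for example, electrons for a negative charge) and is invariant for all reference frames. For brevity we set $q = 1$. The vector product in the Lorentz force reads

\[ \boldsymbol{\upsilon}\times\mathbf{b} = (\upsilon_2 b_3 - \upsilon_3 b_2)\,\mathbf{i} + (\upsilon_3 b_1 - \upsilon_1 b_3)\,\mathbf{j} + (\upsilon_1 b_2 - \upsilon_2 b_1)\,\mathbf{k} , \tag{100} \]

where $\mathbf{i}$, $\mathbf{j}$, and $\mathbf{k}$ are the corresponding orthogonal unit vectors.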

In a way, the Lorentz 3-force defines the electrical and magnetic fields. The

transformation of b and e fields can be derived using transformations already derived

for 3-force and 3-velocity.

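We begin by writing down the components of the Lorentz force in the non-primed and primed reference frames, which read

\[ \begin{aligned} f_1 &= \upsilon_2 b_3 - \upsilon_3 b_2 + e_1 , & f_1' &= \upsilon_2' b_3' - \upsilon_3' b_2' + e_1' \\ f_2 &= \upsilon_3 b_1 - \upsilon_1 b_3 + e_2 , & f_2' &= \upsilon_3' b_1' - \upsilon_1' b_3' + e_2' \\ f_3 &= \upsilon_1 b_2 - \upsilon_2 b_1 + e_3 , & f_3' &= \upsilon_1' b_2' - \upsilon_2' b_1' + e_3' \end{aligned} \tag{101} \]

respectively.

Fig. Magnetic-field part of the Lorentz force. The magnetic field vector lies in the xz-plane. The vector product is perpendicular to the xz-plane.

\[ \boldsymbol{\upsilon}\times\mathbf{b} = \begin{vmatrix} \mathbf{i}_x & \mathbf{i}_y & \mathbf{i}_z \\ \upsilon_x & \upsilon_y & \upsilon_z \\ b_x & b_y & b_z \end{vmatrix} \equiv \mathbf{i}_x(\upsilon_y b_z - \upsilon_z b_y) - \mathbf{i}_y(\upsilon_x b_z - \upsilon_z b_x) + \mathbf{i}_z(\upsilon_x b_y - \upsilon_y b_x) \]

$\mathbf{i}_x$, $\mathbf{i}_y$, and $\mathbf{i}_z$ are unit vectors in the x, y, and z directions respectively.

Then, we use the velocity transformations

\[ \upsilon_1' = \frac{\upsilon_1 - u}{1 - u\upsilon_1/c^2} ; \qquad \upsilon_2' = \frac{\upsilon_2}{\gamma(u)\left(1 - u\upsilon_1/c^2\right)} ; \qquad \upsilon_3' = \frac{\upsilon_3}{\gamma(u)\left(1 - u\upsilon_1/c^2\right)} \tag{102} \]

to express the primed force in terms of the non-primed velocity. For example, for the first component of the primed force we get

\[ f_1' = \frac{\upsilon_2}{\gamma(u)\left(1 - u\upsilon_1/c^2\right)}\,b_3' - \frac{\upsilon_3}{\gamma(u)\left(1 - u\upsilon_1/c^2\right)}\,b_2' + e_1' \tag{103} \]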

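On the other hand, we can use the relativistic transformation of a 3-force instead. This transformation states that

\[ f_1' = \frac{f_1 - (u/c^2)\,\mathbf{f}\cdot\boldsymbol{\upsilon}}{1 - u\upsilon_1/c^2} . \tag{104} \]

One can now substitute the expressions for $\mathbf{f}$ in the non-primed reference frame (see Eqs. (101)), note that $\mathbf{f}\cdot\boldsymbol{\upsilon} = \mathbf{e}\cdot\boldsymbol{\upsilon}$ because $(\boldsymbol{\upsilon}\times\mathbf{b})\cdot\boldsymbol{\upsilon} = 0$, and get

\[ \begin{aligned} f_1' &= \frac{\upsilon_2 b_3 - \upsilon_3 b_2 + e_1 - (u/c^2)\left(e_1\upsilon_1 + e_2\upsilon_2 + e_3\upsilon_3\right)}{1 - u\upsilon_1/c^2} \\ &= \frac{\upsilon_2 b_3 - \upsilon_3 b_2 + e_1 - e_1\upsilon_1 u/c^2 - e_2\upsilon_2 u/c^2 - e_3\upsilon_3 u/c^2}{1 - u\upsilon_1/c^2} \\ &= \upsilon_2\,\frac{b_3 - e_2 u/c^2}{1 - u\upsilon_1/c^2} - \upsilon_3\,\frac{b_2 + e_3 u/c^2}{1 - u\upsilon_1/c^2} + e_1 \end{aligned} \tag{105} \]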

The two expressions for $f_1'$ (Eqs. (103) and (105)) must be equivalent no matter what the values of $\upsilon_1$, $\upsilon_2$, and $\upsilon_3$ are. Therefore the factors in front of $\upsilon_2$ and $\upsilon_3$ must be equal. This gives expressions for $b_3'$ and $b_2'$.

\[ b_3' = \gamma\left(b_3 - e_2 u/c^2\right) \tag{106} \]

\[ b_2' = \gamma\left(b_2 + e_3 u/c^2\right) \tag{107} \]

The terms independent of the velocity $\boldsymbol{\upsilon}$ must also be equal. It follows that

\[ e_1' = e_1 \tag{108} \]

Expressions for $e_2'$, $e_3'$, and $b_1'$ can be obtained when the expressions for $f_2'$ and $f_3'$, derived in two different ways (as was done above for $f_1'$), are compared. All the results are summarized below.

\[ \begin{aligned} e_1' &= e_1 \\ e_2' &= \gamma\left(e_2 - u b_3\right) \\ e_3' &= \gamma\left(e_3 + u b_2\right) \\ b_1' &= b_1 \\ b_2' &= \gamma\left(b_2 + \frac{u}{c^2}\,e_3\right) \\ b_3' &= \gamma\left(b_3 - \frac{u}{c^2}\,e_2\right) \end{aligned} \tag{109} \]
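The transformations (109) can be sketched as a small function. As a consistency check, the two standard field invariants, e·b and e^2 - c^2 b^2 (a well-known property that is not derived in these notes), should come out the same in both frames. The field values below are arbitrary SI examples and the function names are illustrative.

```python
import math

C = 299_792_458.0  # speed of light in vacuum, m/s

def transform_fields(e, b, u):
    """Eq. (109): fields in a frame moving with velocity u along x (SI units)."""
    g = 1.0 / math.sqrt(1.0 - (u / C) ** 2)
    e_prime = [e[0], g * (e[1] - u * b[2]), g * (e[2] + u * b[1])]
    b_prime = [b[0], g * (b[1] + u * e[2] / C ** 2), g * (b[2] - u * e[1] / C ** 2)]
    return e_prime, b_prime

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

e = [1.0e3, 2.0e3, -0.5e3]      # electric field, V/m (example values)
b = [1.0e-6, -2.0e-6, 3.0e-6]   # magnetic field, T (example values)
u = 0.5 * C

e_p, b_p = transform_fields(e, b, u)
print(dot(e, b), dot(e_p, b_p))
print(dot(e, e) - C ** 2 * dot(b, b), dot(e_p, e_p) - C ** 2 * dot(b_p, b_p))
```

A purely electric field in one frame acquires a magnetic component in a moving frame; for instance, with b = 0 the last line of Eq. (109) gives a nonzero third magnetic component whenever the second electric component is nonzero.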

Concluding remarks

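Maxwell equations

\[ \operatorname{div}\mathbf{b} = 0 ; \qquad \operatorname{curl}\mathbf{b} = \mu_0\varepsilon_0\,\frac{\partial\mathbf{e}}{\partial t} + \mu_0\,\mathbf{j} \tag{110} \]

\[ \operatorname{div}\mathbf{e} = \frac{\rho}{\varepsilon_0} ; \qquad \operatorname{curl}\mathbf{e} = -\frac{\partial\mathbf{b}}{\partial t} \tag{111} \]

stay valid if the Lorentz transformations of space-time are used, the $\mathbf{e}$ and $\mathbf{b}$ fields are transformed as derived above, and the current density $\mathbf{j}$ and the charge density $\rho$ are changed as components of a 4-current density

\[ \vec{\mathbf{J}} \equiv \rho_0\vec{\mathbf{V}} = \rho_0\gamma\,[\boldsymbol{\upsilon}, c] \equiv [\mathbf{j}, c\rho] \tag{112} \]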

This can be verified directly by substitution of the appropriate transformation derived in

this course.

One can also introduce an electromagnetic field tensor (a generalization of a 4-vector)

and write the Maxwell equations in a 4-tensor form but we will not develop this

technique in these lectures. Those who are interested may read one of the recommended

books.



Some General Hints for Solving Problems

Problems about colliding particles, or particles breaking into parts, are usually a type of problem where Special Relativity is easy to use. What you need to do is to write down four equalities: one for the total energy and one for each of the three components of the relativistic 3-momentum vector (thus you cover all four components of the 4-momentum vector). In each of these equalities you should have the energy/momentum of all particles added together before the collision (say, on the left-hand side) and the total energy/momentum of the involved particles after the collision on the other side of the equality. The expressions for the total energy and the relativistic 3-momentum are

\[ E = \frac{mc^2}{\sqrt{1 - \upsilon^2/c^2}} , \qquad p_x^{(rel)} = \frac{m\upsilon_x}{\sqrt{1 - \upsilon^2/c^2}} , \qquad p_y^{(rel)} = \frac{m\upsilon_y}{\sqrt{1 - \upsilon^2/c^2}} , \qquad p_z^{(rel)} = \frac{m\upsilon_z}{\sqrt{1 - \upsilon^2/c^2}} \]

It is useful to use symmetry and choose the directions of x, y, and z so that some of the momenta are obviously zero. This will reduce the number of equations. In exceptional situations a trick can help to answer the question. Note that m in these expressions is the mass of a particle (in some other texts the term rest mass is used instead of simply mass, the term used in these notes). If the particle is made of parts, you cannot add the masses of these parts together to get the mass of the composite particle. This is because the primary particles can interact with each other or can move relative to each other. Either of these will change the mass. Therefore, unless the particles are elementary (like electrons, for example), you cannot consider the masses to be the same before and after the collision, and the problem, generally speaking, cannot be solved if some extra information is not provided. This can be, for example, information that the particles stick together after the collision. In this case the velocities of the parts after the collision are equal and the number of unknowns in the equations is dramatically reduced.

The electrical/magnetic fields may need conversion from one reference frame to another. If it is easier to solve the problem in some reference frame, then you can get the solution for a different frame by applying the appropriate transformations

\[ \begin{aligned} e_1' &= e_1 \\ e_2' &= \gamma\left(e_2 - u b_3\right) \\ e_3' &= \gamma\left(e_3 + u b_2\right) \\ b_1' &= b_1 \\ b_2' &= \gamma\left(b_2 + \frac{u}{c^2}\,e_3\right) \\ b_3' &= \gamma\left(b_3 - \frac{u}{c^2}\,e_2\right) \end{aligned} \]

But watch the sign of u! In these equations u is positive if the primed reference frame moves in the direction of increasing values of x of the non-primed reference frame.

Some problems require solving differential equations describing the dynamics of the

system.

\[ \frac{d}{dt}\,\frac{m\boldsymbol{\upsilon}}{\sqrt{1 - \upsilon^2/c^2}} = \mathbf{f} \]

Note that each of the components of the 3-velocity and the speed (the magnitude of the velocity) may be time dependent. Of course, you need an expression for the force $\mathbf{f}$ to write down the actual equation. Since gravity is excluded from SR, a typical example of a force is the Lorentz force $\mathbf{f} = q\,\boldsymbol{\upsilon}\times\mathbf{b} + q\,\mathbf{e}$.

The Twin Paradox

Everyone who teaches or studies Special Relativity should have an opinion about the Twin Paradox.
