The Dirac Equation


Acknowledgements

I would like to thank Prof. D. Baye for his remarks on the intermediate version of this document, as well as for his courses, which confirmed my interest in quantum mechanics and gave me the incentive to learn more about relativistic quantum mechanics.

1 Introduction

In 1928, Paul Adrien Maurice Dirac (1902-1984) discovered the relativistic equation which now bears his name while trying to overcome the difficulties of negative probability densities of the Klein-Gordon equation.¹ For a long time, it was believed that the Dirac equation was the only valid equation for massive particles. It was only after Pauli reinterpreted the K-G equation as a field theory in 1934 that this belief was shaken. Even now the Dirac equation has special importance because it describes particles of spin 1/2, which is the case of the electron as well as of many other elementary particles. In fact, it is a theoretical conjecture that all the elementary particles found in Nature obeying Fermi statistics have spin 1/2 ([8]).

It is therefore useful to study the Dirac equation, not only from a theoretical point of view but also from a practical one: some phenomena, like $\beta^+$ decay (positron emission), can be explained by the Dirac equation, as can some phenomena where the non-relativistic quantum theory is unable to explain experimental facts, such as the anomalous Zeeman effect.

First, this document sets out the postulates of the theory, stating the framework in which the relativistic electron theory has been built. This is followed by a brief review of relativistic notation and of the Lorentz transformations. Then, we follow Dirac's revolutionary way of thinking, which leads us to the Dirac equation for the free particle, in the absence and in the presence of an electromagnetic field. The role of the spin as an internal degree of freedom and the existence of negative energy particles are discussed. The next step is to apply the theory to some simple systems, to see what the solutions of the equation are and how to interpret them. Finally, an attempt at interpreting the theory requires us to introduce a new representation, the Foldy-Wouthuysen representation, and to highlight its main advantages.

¹ This equation is derived by inserting the operator substitutions $E \to i\hbar\,\partial_t$, $\mathbf{p} \to -i\hbar\nabla$ into $E^2 = c^2 p^2 + \mu^2 c^4$, the relativistic relation between energy and momentum for a free particle of mass $\mu$.


2 Postulates of the theory

The relativistic electron theory being a quantum mechanical theory, certain of its postulates are common to general quantum mechanical theories. In addition, the relativistic theory must be consistent with the special principle of relativity. The postulates of the theory are listed here, followed by a brief discussion (see [7] for more details). For a more complete discussion, the reader is referred to the work of Dirac [3].

I. The theory shall be formulated in terms of a field, quantitatively represented by an amplitude function $\psi$, in such a way that the statistical interpretation of quantum phenomena will be valid.

II. The description of physical phenomena in the theory will be based on an equation of motion describing the development in time of the system or of the field amplitude $\psi$.

III. The superposition principle shall hold, requiring the equation of motion to be linear in $\psi$.

IV. The equations of motion must be consistent with the special principle of relativity.² This requires that they can be written in covariant form.

V. From postulate I, it must be possible to define a probability density $\rho$ that is positive definite:

$$\rho > 0$$

and whose space integral satisfies:

$$\int \rho \, d^3x = \int \rho' \, d^3x' \tag{1a}$$

$$\frac{d}{dt} \int \rho \, d^3x = 0 \tag{1b}$$

Condition (1a) expresses that the space integral of $\rho$ is a relativistic invariant, and together the two conditions give a Lorentz-invariant meaning to a normalization condition such as

$$\int \rho \, d^3x = 1$$

VI. The theory should be consistent with the correspondence principle and, in its non-relativistic limit, should reduce to the standard form of quantum mechanics applicable at low velocities. Furthermore, in its non-quantum limit, the theory should yield the mechanics of special relativity.

² Since general relativity is required only when dealing with gravitational forces, which are quite unimportant in atomic phenomena, there is no need to make the theory conform to general relativity.

Postulates I and III appear to be necessary in view of such experimental facts as scattering and the attendant diffraction effects. The $\psi$-function referred to will be called a wave function. It will, in general, depend on the four space-time coordinates $x^\mu$ and may be a multi-component wave function (as it must be if the theory is to account for spin properties).

Postulate II implies the existence of an operator equation of the form:

$$H\psi = i\hbar \frac{\partial \psi}{\partial t}, \tag{2a}$$

or, switching to natural units by setting $\hbar = c = 1$ (as will be the case throughout this document):

$$H\psi = i \frac{\partial \psi}{\partial t}. \tag{2b}$$

In connection with postulate IV, the occurrence of the first time derivative in the equation of motion implies that the space derivatives should also occur to first order. The obvious requirement of symmetry in all four space-time variables is clearly not fulfilled by the non-relativistic form of quantum mechanics. However, the symmetrical appearance of the four $x^\mu$ in the equations of motion is not a sufficient condition for relativistic covariance; this covariance must therefore be, and will be, demonstrated.

Regarding postulate V, the fact that $\rho$ is positive definite implies that we speak of a particle density and not of a charge density. It is not clear a priori whether the goal of $\rho$ being positive definite is attainable in a given theory. Note also that (1b) is assured if a continuity equation exists and if $\psi$ vanishes sufficiently rapidly at the boundaries of the system. That is, a particle current density $\mathbf{j}$ must exist such that

$$\nabla \cdot \mathbf{j} + \frac{\partial \rho}{\partial t} = 0. \tag{3}$$

This equation has the usual interpretation that a particle cannot disappear from a volume of space unless it crosses the surface bounding that volume. In fact, electrons can disappear without crossing that surface, by means of pair annihilation. Thus the creation or destruction of particles and antiparticles contradicts the conservation of particles, but not the conservation of charge. There is clearly a contradiction here. However, this difficulty, which disappears in a quantized field theory, raises no real problem for the questions discussed in this document.


3 The relativistic notation

Before starting on the path toward developing the relativistic wave equation, a few words on relativistic notation are in order. In relativity, 4-vectors and their transformations are the most important quantities. The most fundamental 4-vector is the one that describes space and time, $x^\mu = (t, x, y, z)$. Its transformation properties are defined in terms of the invariant quantity $s^2 = x^\mu x_\mu = t^2 - x^2 - y^2 - z^2$. This introduces a second quantity, $x_\mu = (t, -x, -y, -z)$. The 4-vector with the upper index is a contravariant vector, while that with the lower index is a covariant vector. To transform between the two types of vectors, we introduce the metric tensor:

$$g_{\mu\nu} = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix} \quad\Rightarrow\quad x_\mu = g_{\mu\nu} x^\nu \tag{4}$$
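As a small numerical illustration of equation (4), the following sketch lowers an index with the metric tensor and evaluates the invariant $s^2$. The helper names `lower` and `interval` and the sample event are assumptions introduced only for this example.

```python
import numpy as np

# Metric tensor g_{mu nu} = diag(1, -1, -1, -1), as in equation (4).
g = np.diag([1.0, -1.0, -1.0, -1.0])

def lower(x_up):
    """Lower an index: x_mu = g_{mu nu} x^nu."""
    return g @ x_up

def interval(x_up):
    """Invariant s^2 = x^mu x_mu = t^2 - x^2 - y^2 - z^2."""
    return float(x_up @ lower(x_up))

# Hypothetical event (t, x, y, z), chosen only for illustration.
x_up = np.array([2.0, 1.0, 0.5, -1.0])
x_down = lower(x_up)
s2 = interval(x_up)
```

Lowering the index only flips the sign of the three space components, which is why the contraction $x^\mu x_\mu$ reproduces $t^2 - x^2 - y^2 - z^2$.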

The momentum $\mathbf{p}$, whose components will be written $p_1$, $p_2$, $p_3$, is equal to the operator

$$p_r = -i \frac{\partial}{\partial x_r} \quad (r = 1, 2, 3). \tag{5}$$

To bring (5) into a relativistic theory, we must first write it with balanced suffixes, $p_r = i\,\partial/\partial x^r$ $(r = 1, 2, 3)$, and extend it to the complete 4-vector notation,

$$p_\mu = i \frac{\partial}{\partial x^\mu}. \tag{6}$$

We thus have to introduce $p_0$, equal to the operator $i\,\partial/\partial x^0$. Since the latter forms a 4-vector when combined with the momenta $p_r$, it must have the physical meaning of the energy of the particle divided by $c$. We now have to develop the theory treating the four $p$'s on the same footing, just like the four $x$'s.

4 Lorentz transformations

The Lorentz group is the group of transformations that preserve the length $s^2 = x^\mu x_\mu$. Some of the continuous transformations that do this are the ordinary (space) rotations and the "boosts", or imaginary rotations, which correspond to passing from one inertial frame to another one moving relative to the first.

For example, the homogeneous Lorentz transformation to a frame with velocity $v$ along the $x^1$-axis is given by:

$$\Lambda^\mu{}_\nu = \begin{pmatrix} \gamma & -\gamma\beta & 0 & 0 \\ -\gamma\beta & \gamma & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \quad\Rightarrow\quad x'^\mu = \Lambda^\mu{}_\nu x^\nu, \tag{7}$$

where $\gamma = 1/\sqrt{1 - \beta^2}$ and $\beta = v/c = v$.

The transformations satisfying $\det \Lambda = +1$ constitute the subgroup of proper Lorentz transformations, while the transformations satisfying $\det \Lambda = -1$ are the improper transformations. The latter include:

(a) space reflections: $\Lambda^i{}_k = -\delta^i_k$, $\Lambda^0{}_0 = 1$, $\Lambda^j{}_0 = \Lambda^0{}_j = 0$

(b) time reflections: $\Lambda^i{}_k = \delta^i_k$, $\Lambda^0{}_0 = -1$, $\Lambda^j{}_0 = \Lambda^0{}_j = 0$

(c) as well as any product of a proper transformation with a space or time reflection.

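The covariant vector transforms differently from the contravariant vector, $x'_\mu = \Lambda_\mu{}^\nu x_\nu$, where the two transformations are related by the invariance of $x^\mu x_\mu$: $x'^\mu x'_\mu = \Lambda^\mu{}_\nu \Lambda_\mu{}^\rho\, x^\nu x_\rho = x^\nu x_\nu$. This imposes the condition $\Lambda^\mu{}_\nu \Lambda_\mu{}^\rho = \delta_\nu^\rho$, that is, they are inverse transformations of each other:

$$\Lambda_\mu{}^\nu = \begin{pmatrix} \gamma & \gamma\beta & 0 & 0 \\ \gamma\beta & \gamma & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \quad\Rightarrow\quad x'_\mu = \Lambda_\mu{}^\nu x_\nu \tag{8}$$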
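The properties of the boost (7) and of its covariant counterpart (8) can be checked numerically. The sketch below is only an illustration: the function name `boost_x`, the velocity $\beta = 0.6$ and the sample event are assumptions made for the example.

```python
import numpy as np

g = np.diag([1.0, -1.0, -1.0, -1.0])

def boost_x(beta):
    """Lambda^mu_nu of equation (7) for velocity beta (c = 1) along x^1."""
    gamma = 1.0 / np.sqrt(1.0 - beta**2)
    lam = np.eye(4)
    lam[0, 0] = lam[1, 1] = gamma
    lam[0, 1] = lam[1, 0] = -gamma * beta
    return lam

beta = 0.6                    # illustrative value, for which gamma = 1.25
L_up = boost_x(beta)          # acts on contravariant components x^mu
L_down = g @ L_up @ g         # acts on covariant components x_mu, equation (8)

# Hypothetical event (t, x, y, z).
x_up = np.array([2.0, 1.0, 0.5, -1.0])
s2_before = x_up @ g @ x_up
x_new = L_up @ x_up
s2_after = x_new @ g @ x_new
```

Raising and lowering both indices with the metric changes the sign of the mixed time-space entries, so the covariant matrix is simply the boost with velocity $-\beta$, i.e. the inverse transformation.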

5 The Dirac wave equation

The reasoning followed here is, at least for its first part, inspired by Dirac's book [3].

Let us consider the case of the motion of an electron in the absence of an electromagnetic field, so that the problem is that of the free particle, with the possible addition of internal degrees of freedom.

The relativistic Hamiltonian provided by classical mechanics (through the relation $E = \sqrt{p^2 + m^2}$) leads to the wave equation

$$\{p_0 - (m^2 + p_1^2 + p_2^2 + p_3^2)^{1/2}\}\,\Psi = 0 \tag{9}$$

where the $p$'s are interpreted as in (6). This equation is however very unsatisfactory, as it is very unsymmetrical between $p_0$ and the other $p$'s. We must therefore search for another wave equation.


Multiplying (9) on the left by $\{p_0 + (m^2 + p_1^2 + p_2^2 + p_3^2)^{1/2}\}$, we obtain the equation

$$\{p_0^2 - m^2 - p_1^2 - p_2^2 - p_3^2\}\,\Psi = 0, \tag{10}$$

which is of a relativistically invariant form. However, equation (10) is not completely equivalent to (9): every solution of (9) is a solution of (10), but not every solution of (10) is a solution of (9).

At this point, equation (10) is not of the form required by the laws of the quantum theory, on account of its being quadratic in $p_0$. We need a wave equation linear in $p_0$ and roughly equivalent to (10). In order for it to transform in a simple way under Lorentz transformations, we shall try to make that equation rational and linear in $p_\mu$, and thus of the form

$$\{p_0 - \alpha_1 p_1 - \alpha_2 p_2 - \alpha_3 p_3 - \beta m\}\,\Psi = 0, \tag{11}$$

in which the $\alpha$'s and $\beta$ are independent of the $x$'s and the $p$'s. They therefore describe some new degree of freedom, belonging to some internal motion in the electron. In fact, as we shall see, they bring in the spin of the electron.

Multiplying (11) by $\{p_0 + \alpha_1 p_1 + \alpha_2 p_2 + \alpha_3 p_3 + \beta m\}$ on the left, we obtain

$$\{p_0^2 - [\alpha_i^2 p_i^2 + (\alpha_i\alpha_j + \alpha_j\alpha_i)\, p_i p_j + (\alpha_i\beta + \beta\alpha_i)\, p_i m + \beta^2 m^2]\}\,\psi = 0, \tag{12}$$

summation being implied over repeated suffixes, with the condition $i > j$ imposed in the second term. This is the same as (10) if the $\alpha$'s and $\beta$ satisfy

$$\alpha_i^2 = \beta^2 = 1; \quad \alpha_i\alpha_j + \alpha_j\alpha_i = 2\delta_{ij}; \quad \alpha_i\beta + \beta\alpha_i = 0. \tag{13}$$

Thus, by giving suitable properties to the $\alpha$'s and $\beta$, we can make equation (11) equivalent to (10), in so far as the motion of the electron as a whole is concerned. We may now assume that

$$\{p_0 - [\boldsymbol{\alpha} \cdot \mathbf{p} + \beta m]\}\,\psi = 0, \tag{14}$$

or, in the form (2b),

$$E\psi = i\frac{\partial \psi}{\partial t} = [\boldsymbol{\alpha} \cdot \mathbf{p} + \beta m]\,\psi = H_D \psi, \tag{15}$$

is the correct relativistic equation for the motion of an electron in the absence of a field. Taking into account that this equation is not exactly equivalent to (9), we shall, for the moment, consider only those solutions corresponding to positive values of $p_0$, the negative values not corresponding to any actually observable motion of an electron. We shall come back to that point later.

To generalize this equation to the case when there is an electromagnetic field present, we follow the classical rule of replacing $p_0$ and $\mathbf{p}$ by $p_0 - qA_0$ and $\mathbf{p} - q\mathbf{A}$, $A_0$ and $\mathbf{A}$ being the scalar and vector potentials of the field at the place where the particle is. This gives the equation

$$\{p_0 - qA_0 - [\boldsymbol{\alpha} \cdot (\mathbf{p} - q\mathbf{A})] - \beta m\}\,\psi = 0, \tag{16}$$

the Hamiltonian being

$$H_D^F = \boldsymbol{\alpha} \cdot \boldsymbol{\pi} + \beta m + qA_0 \tag{17}$$

and

$$\boldsymbol{\pi} = \mathbf{p} - q\mathbf{A} \tag{18}$$

being the standard kinetic momentum operator in the general case of a particle with charge $q$. For an electron, $\boldsymbol{\pi} = \mathbf{p} + e\mathbf{A}$.

6 The Dirac matrices

It is obvious that relations (13) require the $\alpha$'s and $\beta$ to be matrices. To determine the form of the matrices, some conditions need to be imposed:

- The wave function should be a column vector in order that the probability density be given as $\psi^\dagger\psi$.³ This imposes the condition that the matrices must be square.

- The Hamiltonian must be hermitian so that its eigenvalues are real. This forces the four matrices to be hermitian as well.

The $\alpha$'s and $\beta$ have properties similar to those of the Pauli $\sigma$ matrices (21), which are $2 \times 2$ matrices. However, so long as we keep working with $2 \times 2$ matrices, we can get a representation of no more than three anticommuting quantities.

The dimension $n$ of those matrices must be even. This can be shown by observing that for each of the four matrices there is another matrix which anti-commutes with it. Therefore, if $b_\mu$ is any of the four matrices and $b_\nu$ is a matrix which anti-commutes with $b_\mu$, we have

$$\mathrm{Tr}[b_\mu] = \mathrm{Tr}[b_\mu b_\nu^2] = \mathrm{Tr}[b_\nu b_\mu b_\nu] = -\mathrm{Tr}[b_\mu b_\nu^2] = -\mathrm{Tr}[b_\mu] = 0 \tag{19}$$

since each $b_\nu^2 = 1$ and $\mathrm{Tr}[AB] = \mathrm{Tr}[BA]$. Each matrix thus has zero trace.

There exists a representation in which any $b_\mu$ can be brought to diagonal form and, since $b_\mu^2 = 1$ and $\mathrm{Tr}[b_\mu] = 0$ are independent of the representation, we conclude that the eigenvalues of $b_\mu$ in diagonal form are $\pm 1$ and that there are as many $+1$ as $-1$ eigenvalues. Thus the number of rows and columns must be even.

³ The notation $\dagger$ denotes the conjugate transpose.


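The minimum possible number for $n$ is 4, and a $4 \times 4$ representation exists. For example,

$$\boldsymbol{\alpha} = \begin{pmatrix} 0 & \boldsymbol{\sigma} \\ \boldsymbol{\sigma} & 0 \end{pmatrix}, \quad \beta = \begin{pmatrix} I_2 & 0 \\ 0 & -I_2 \end{pmatrix}, \tag{20}$$

where $I_2$ is the $2 \times 2$ unit matrix and $\boldsymbol{\sigma}$ represents the three Pauli matrices

$$\sigma_1 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \quad \sigma_2 = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \quad \sigma_3 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}. \tag{21}$$

If we consider the direct matrix product between two matrices operating in different spaces, we can write all the Dirac matrices previously defined as a direct product of two $2 \times 2$ matrices: one operating in the Dirac space, referring to the four blocks of the $4 \times 4$ matrices, and the other operating in the Pauli space, referring to the four elements within each of these four blocks. Thus

$$\alpha_j = \rho_1 \otimes \sigma_j, \quad \beta = \rho_3 \otimes I_2 \tag{22}$$

where the three matrices operating in Dirac space

$$\rho_1 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \quad \rho_2 = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \quad \rho_3 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \tag{23}$$

form, with $I_2$, a complete set like $I_2, \sigma_1, \sigma_2, \sigma_3$. Since it is to be understood that the direct product is always implied for matrices operating in different spaces, the $\otimes$ symbol can be omitted.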
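The representation (20)-(23) can be built and checked numerically. The following sketch uses the Kronecker product `np.kron` as the direct product of (22) and verifies relations (13), hermiticity and the vanishing trace; the variable names and the helper `anticommutator` are assumptions made for this illustration.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
# Pauli matrices, equation (21).
sigma = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]
# The rho matrices of (23) have the same entries but act in Dirac space.
rho1, rho2, rho3 = sigma

# Direct products of equation (22).
alpha = [np.kron(rho1, s) for s in sigma]
beta = np.kron(rho3, I2)

def anticommutator(a, b):
    return a @ b + b @ a

dirac = alpha + [beta]
I4 = np.eye(4)
# Relations (13): squares equal to 1, distinct matrices anticommute.
algebra_ok = all(
    np.allclose(anticommutator(a, b), 2 * I4 if i == j else 0 * I4)
    for i, a in enumerate(dirac)
    for j, b in enumerate(dirac)
)
hermitian_ok = all(np.allclose(m, m.conj().T) for m in dirac)
traceless_ok = all(abs(np.trace(m)) < 1e-12 for m in dirac)
```

With this ordering of the factors, the first argument of `np.kron` selects the $2 \times 2$ block (Dirac space) and the second fills each block (Pauli space), which reproduces the block form (20).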

7 Covariant form of the Dirac wave equation

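Although it is in Hamiltonian form, the Dirac equation (15) given above does not include the time and space coordinates in a symmetric manner. To transform the equation, we first rewrite it using the usual operator substitutions (i.e. $E \to i\,\partial/\partial t$, $\mathbf{p} \to -i\nabla$) and then multiply both sides by $\beta$ on the left:

$$[-i\boldsymbol{\alpha} \cdot \nabla + \beta m]\,\psi = i\frac{\partial \psi}{\partial t} \quad\Rightarrow\quad [-i\beta\boldsymbol{\alpha} \cdot \nabla + m]\,\psi = i\beta\frac{\partial \psi}{\partial t} \tag{24}$$

We can now introduce the Dirac $\gamma$ matrices $\gamma^\mu = (\beta, \beta\boldsymbol{\alpha})$ and rewrite the Dirac equation as:

$$[i\gamma^\mu \partial_\mu - m]\,\psi = 0 \quad (\partial_\mu = \partial/\partial x^\mu), \tag{25}$$

which puts the time and position coordinates on an equal footing.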


8 Dirac γ matrices

We now consider the complete set of matrices which can be constructed from the four $\gamma^\mu$ matrices defined in the previous section by multiplication. There are 16 different matrices $\gamma_A$ which can be formed this way. They can be classified into five groups (as done in [7]):

- Group S. This consists of a single matrix, the identity matrix. It can be formed in at least four ways, since with the metric (4) each $(\gamma^\mu)^2$ equals the identity up to a sign: $(\gamma^0)^2 = 1$, $(\gamma^i)^2 = -1$.

- Group V. These are just the four $\gamma^\mu$ matrices.

- Group T. These are the six matrices formed by the relation $i\gamma^\mu\gamma^\nu$ $(\mu \neq \nu)$, the phase factor $i$ being chosen in [7] so that $(\gamma_A)^2 = 1$ in all cases; with the metric (4), this holds up to a sign.

- Group P. This is the single matrix formed by multiplying all four $\gamma^\mu$: $\gamma^5 = \gamma^0\gamma^1\gamma^2\gamma^3$.

- Group A. These are the four possible products of three $\gamma^\mu$: $i\gamma^\mu\gamma^\nu\gamma^\xi$ ($\mu$, $\nu$, $\xi$ all different). They can be written using the $\gamma^5$ matrix in the form $i\gamma^5\gamma^\mu$.

The designation "group" used above does not mean that these 16 matrices form a group in the technical sense. Nevertheless, this set does form a mathematical entity: a Clifford algebra.

Proceeding from the rules in (13), some relations can be derived for the $\gamma$ matrices:

- Multiplying $\beta\alpha_i + \alpha_i\beta = 0$ by $\beta$ from the left, we get:

$$\beta(\beta\alpha_i) + (\beta\alpha_i)\beta = \gamma^0\gamma^i + \gamma^i\gamma^0 = 0. \tag{26}$$

- Now taking $\alpha_i\alpha_j + \alpha_j\alpha_i = 2\delta_{ij}$ and multiplying it from both sides by $\beta$, we get, using the $\alpha\beta$ anti-commutation relation:

$$(\beta\alpha_i)(\alpha_j\beta) + (\beta\alpha_j)(\alpha_i\beta) = 2\delta_{ij}\,\beta\beta \quad\Rightarrow\quad \gamma^i\gamma^j + \gamma^j\gamma^i = -2\delta_{ij}. \tag{27}$$

- Putting the previous two equations together, with $(\gamma^0)^2 = \beta^2 = 1$, yields:

$$\gamma^\mu\gamma^\nu + \gamma^\nu\gamma^\mu = \{\gamma^\mu, \gamma^\nu\} = 2g^{\mu\nu} \tag{28}$$

- The hermiticity properties of the $\gamma^\mu$ matrices can be derived in a similar manner, as the $\alpha$'s and $\beta$ are hermitian. $\gamma^0 = \beta$ is obviously hermitian. The other components are given by:

$$(\gamma^i)^\dagger = (\beta\alpha_i)^\dagger = \alpha_i\beta = -\beta\alpha_i = -\gamma^i \tag{29}$$

and these components are thus anti-hermitian.

For more information about the $\gamma$ matrices, the reader is referred to [8] and [7].
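Relations (28) and (29), as well as the count of 16 independent matrices, can be verified numerically in the representation (20). The sketch below is an illustration only; the variable names are assumptions, and the 16 matrices are assembled following the S, V, T, P, A classification above.

```python
import numpy as np
from itertools import combinations

I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)
sigma = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]
# Representation (20).
beta = np.block([[I2, Z2], [Z2, -I2]])
alpha = [np.block([[Z2, s], [s, Z2]]) for s in sigma]

# gamma^mu = (beta, beta * alpha), as in section 7.
gamma = [beta] + [beta @ a for a in alpha]
g = np.diag([1.0, -1.0, -1.0, -1.0])
I4 = np.eye(4, dtype=complex)

# Anticommutation relation (28).
clifford_ok = all(
    np.allclose(gamma[m] @ gamma[n] + gamma[n] @ gamma[m], 2 * g[m, n] * I4)
    for m in range(4) for n in range(4)
)
# Hermiticity properties, equation (29).
gamma0_hermitian = np.allclose(gamma[0], gamma[0].conj().T)
gamma_i_antihermitian = all(
    np.allclose(gamma[i].conj().T, -gamma[i]) for i in (1, 2, 3)
)

# The 16 matrices of groups S, V, T, P and A.
gamma5 = gamma[0] @ gamma[1] @ gamma[2] @ gamma[3]
basis = (
    [I4]
    + gamma
    + [1j * gamma[m] @ gamma[n] for m, n in combinations(range(4), 2)]
    + [gamma5]
    + [1j * gamma5 @ gm for gm in gamma]
)
# Rank of the 16 flattened matrices: 16 means they are linearly independent.
rank = np.linalg.matrix_rank(np.array([b.flatten() for b in basis]))
```

A rank of 16 means that the 16 matrices span the whole space of complex $4 \times 4$ matrices, so any $4 \times 4$ matrix can be expanded on them.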

9 Dirac wave functions

Each wave function is a 4-component vector with 4 rows and 1 column,

$$\psi(\mathbf{r}, t) = \begin{pmatrix} \psi_1(\mathbf{r}, t) \\ \psi_2(\mathbf{r}, t) \\ \psi_3(\mathbf{r}, t) \\ \psi_4(\mathbf{r}, t) \end{pmatrix} = \begin{pmatrix} \psi_u \\ \psi_l \end{pmatrix} \tag{30}$$

where $\psi_u$ and $\psi_l$ refer to upper and lower and are each two-component spinors. The spin of the electron requires the wave function to have two components.⁴ The fact that our theory gives four is due to our wave equation (11) having twice as many solutions as it ought to have, half of them corresponding to states of negative energy ($p_0 < 0$).

Looking at how operators act on the four-component wave functions, we may for example calculate

$$\rho_2\psi = \begin{pmatrix} -i\psi_l \\ i\psi_u \end{pmatrix} \tag{31}$$

or

$$\boldsymbol{\alpha}\psi = \rho_1\boldsymbol{\sigma}\psi = \begin{pmatrix} \boldsymbol{\sigma}\psi_l \\ \boldsymbol{\sigma}\psi_u \end{pmatrix}. \tag{32}$$

So the matrices operating in the Dirac space act on $\psi_u$ and $\psi_l$, while the matrices operating in Pauli space act on the two components in $\psi_u$ $(\psi_1, \psi_2)$ and in $\psi_l$ $(\psi_3, \psi_4)$. The four-component $\psi$ will be called a spinor (or four-spinor).

The Dirac matrices from section 6, like $\rho_3$, with zero elements in the upper right and lower left quadrants are called even in the Dirac sense; those, like $\rho_1$ and $\rho_2$, with zeroes in the upper left and lower right quadrants are called odd. Even Dirac matrices couple $\psi_u$ with $\psi_u$ and $\psi_l$ with $\psi_l$, while odd ones couple $\psi_u$ and $\psi_l$. This will, as we shall see, have some consequences.

⁴ In fact, the appearance of a multi-component wave function is characteristic of the existence of a non-vanishing spin [7].


10 The particle current density

With the Dirac equation (15) and a wave function in the form of a 4-component vector (30), we can have a look at the associated probability current. We take (15) multiplied on the left by $\psi^\dagger$ and subtract from it its adjoint multiplied on the right by $\psi$:

$$\psi^\dagger\, i\frac{\partial}{\partial t}\psi = \psi^\dagger(-i\boldsymbol{\alpha} \cdot \nabla + \beta m)\psi, \tag{33a}$$

$$-\left(i\frac{\partial}{\partial t}\psi^\dagger\right)\psi = (i\nabla\psi^\dagger \cdot \boldsymbol{\alpha} + m\psi^\dagger\beta)\psi, \tag{33b}$$

resulting in

$$i\frac{\partial}{\partial t}(\psi^\dagger\psi) = -i\nabla \cdot (\psi^\dagger\boldsymbol{\alpha}\psi), \tag{34}$$

which is the continuity equation (3), expressing the conservation of probability density if we define it in the usual way, i.e. as

$$\rho = j^0 = \psi^\dagger\psi, \tag{35}$$

and the probability current as

$$\mathbf{j} = \psi^\dagger\boldsymbol{\alpha}\psi. \tag{36}$$

The postulated property (1b) of $\rho$ is automatically valid with $H$ hermitian and with (35), which is obviously positive definite. For, if (35) is assumed, we have

$$\int \frac{\partial \rho}{\partial t}\, d^3x = -i\int \left[\psi^\dagger H\psi - (H\psi)^\dagger\psi\right] d^3x = 0 \tag{37}$$

by virtue of $H^\dagger = H$.

Thus, with the wave equation defined and the form of the wave function known, (35) allows us to specify the current density $\mathbf{j}$ implied by postulate V.

We now have the constancy of the total probability of finding the electron at any point of space (37). We have now apparently solved the problem of finding a relativistic generalization of the Schrödinger equation.

11 Invariance under Lorentz transformations

From the previous section, it seems that our relativistic generalization is complete. But we must still verify the invariance of the Dirac equation under Lorentz transformations.

Whereas in the preceding section we derived the continuity equation using $\boldsymbol{\alpha}$ and $\beta$, in this section we will use the $\gamma$ matrices which appear in the covariant form of the equation.

Starting with the covariant form of the Dirac equation (25) from section 7, we will show (as in [8]) that the Dirac equation is form invariant under an inhomogeneous Lorentz transformation

$$x' = \Lambda x + a \tag{38}$$

if we define

$$\psi'(x') = S(\Lambda)\,\psi(x) = S(\Lambda)\,\psi(\Lambda^{-1}(x' - a)), \tag{39}$$

where $S(\Lambda)$ is a $4 \times 4$ matrix operating on the components of $\psi$ and satisfying

$$S^{-1}(\Lambda)\,\gamma^\lambda\, S(\Lambda) = \Lambda^\lambda{}_\mu \gamma^\mu. \tag{40}$$

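As

$$\frac{\partial}{\partial x^\mu} = \frac{\partial x'^\nu}{\partial x^\mu}\frac{\partial}{\partial x'^\nu} = \Lambda^\nu{}_\mu \partial'_\nu, \tag{41}$$

the Dirac equation

$$[i\gamma^\mu\partial_\mu - m]\,\psi(x) = 0 \tag{42}$$

can be re-expressed in the form

$$\left[i\Lambda^\nu{}_\mu\gamma^\mu \frac{\partial}{\partial x'^\nu} S^{-1} - mS^{-1}\right]\psi'(x') = 0, \tag{43}$$

provided that the $\gamma$ matrices remain unaltered under Lorentz transformation.

Multiplying (43) by $S$ on the left yields

$$\left[iS(\Lambda^\nu{}_\mu\gamma^\mu)S^{-1}\partial'_\nu - m\right]\psi'(x') = 0, \tag{44}$$

which is the same as (42) provided that $S$ satisfies (40).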

Theorem (Fundamental Pauli theorem [7]). If two sets of matrices $\gamma^\mu$ and $\gamma'^\mu$ obey the anticommutation rules (28), then there must exist a non-singular matrix $S$ which connects the two sets according to

$$\gamma'^\mu S = S\gamma^\mu. \tag{45}$$

The fundamental theorem of Pauli guarantees the existence of a non-singular $S$. In fact, the condition (40) uniquely determines $S$ up to a factor ([8]).

We now know that the Dirac wave equation is form invariant under any Lorentz transformation. The equation now fulfills the main requirements for being a relativistic generalization of the Schrödinger equation.
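To make condition (40) concrete, the sketch below checks it numerically for the boost (7) along the $x^1$-axis. The explicit form $S = \cosh(\eta/2) - \gamma^0\gamma^1\sinh(\eta/2)$, with rapidity $\eta = \operatorname{artanh} v$, is not given in the text above; it is an assumption of this example (a standard choice for this sign convention), and the code simply tests whether it satisfies (40). The function names are likewise hypothetical.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)
sigma = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]
beta = np.block([[I2, Z2], [Z2, -I2]])
alpha = [np.block([[Z2, s], [s, Z2]]) for s in sigma]
gamma = [beta] + [beta @ a for a in alpha]

def boost_matrix(v):
    """Lambda^mu_nu of equation (7), velocity v along x^1 (c = 1)."""
    gam = 1.0 / np.sqrt(1.0 - v**2)
    lam = np.eye(4)
    lam[0, 0] = lam[1, 1] = gam
    lam[0, 1] = lam[1, 0] = -gam * v
    return lam

def spinor_boost(v):
    """Candidate S(Lambda) for the boost above (assumed form, see lead-in)."""
    eta = np.arctanh(v)  # rapidity: cosh(eta) = gamma, sinh(eta) = gamma * v
    return np.cosh(eta / 2) * np.eye(4) - np.sinh(eta / 2) * (gamma[0] @ gamma[1])

v = 0.6  # illustrative velocity
lam = boost_matrix(v)
S = spinor_boost(v)
S_inv = np.linalg.inv(S)

# Condition (40): S^-1 gamma^l S = Lambda^l_m gamma^m for every l.
condition_40 = all(
    np.allclose(S_inv @ gamma[l] @ S, sum(lam[l, m] * gamma[m] for m in range(4)))
    for l in range(4)
)
```

Since $(\gamma^0\gamma^1)^2 = 1$, the inverse of $S$ is obtained by reversing the velocity, which mirrors the behaviour of $\Lambda$ itself.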


12 Magnetic moment of the electron

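Now that we have defined the framework we will be working in, we will use it to highlight some of the advances the theory brought about. Following Dirac in [3], we start with one of the biggest successes of Dirac's theory: the theoretical explanation of the electron having a magnetic moment.

Suppose we put the electron in a magnetic field $\mathbf{B} = \nabla \times \mathbf{A}$. The Hamiltonian (17) determines the equation of motion. From it, we get for the electron

$$(H_D^F + eA_0)^2 = (\boldsymbol{\alpha} \cdot \boldsymbol{\pi} + \beta m)^2 = (\boldsymbol{\alpha} \cdot \boldsymbol{\pi})^2 + m^2 = \boldsymbol{\pi}^2 + m^2 + e\boldsymbol{\Sigma} \cdot \mathbf{B}. \tag{46}$$

The last equality follows from the relation

$$(\boldsymbol{\sigma} \cdot \mathbf{B})(\boldsymbol{\sigma} \cdot \mathbf{C}) = (\mathbf{B} \cdot \mathbf{C}) + i[\boldsymbol{\sigma} \cdot (\mathbf{B} \times \mathbf{C})] \tag{47}$$

and from introducing the spin matrix

$$\boldsymbol{\Sigma} = I_2 \otimes \boldsymbol{\sigma} = (\rho_1 \otimes I_2)\,\boldsymbol{\alpha} = \begin{pmatrix} \boldsymbol{\sigma} & 0 \\ 0 & \boldsymbol{\sigma} \end{pmatrix} \tag{48}$$

(so named to make a distinction with the $2 \times 2$ Pauli matrices), which give

$$(\boldsymbol{\alpha} \cdot \boldsymbol{\pi})^2 = \boldsymbol{\pi}^2 + i\boldsymbol{\Sigma} \cdot (\boldsymbol{\pi} \times \boldsymbol{\pi}) = \boldsymbol{\pi}^2 + e\boldsymbol{\Sigma} \cdot \mathbf{B}, \tag{49}$$

with $\boldsymbol{\pi} \times \boldsymbol{\pi} = -ie\nabla \times \mathbf{A} = -ie\mathbf{B}$.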
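The identity (47) can be checked numerically for ordinary (commuting) vectors. The sketch below does so for two random real vectors; the helper `sigma_dot` and the random seed are assumptions of the example. Note that this check does not cover the operator case used in (49), where the components of $\boldsymbol{\pi}$ do not commute and $\boldsymbol{\pi} \times \boldsymbol{\pi}$ does not vanish.

```python
import numpy as np

# Pauli matrices stacked in one array of shape (3, 2, 2).
sigma = np.array([
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
], dtype=complex)

def sigma_dot(v):
    """Return the 2x2 matrix sigma . v for a numeric 3-vector v."""
    return np.einsum("i,ijk->jk", v, sigma)

rng = np.random.default_rng(0)
B = rng.normal(size=3)
C = rng.normal(size=3)

# Left- and right-hand sides of identity (47).
lhs = sigma_dot(B) @ sigma_dot(C)
rhs = np.dot(B, C) * np.eye(2) + 1j * sigma_dot(np.cross(B, C))
```

For $\mathbf{B} = \mathbf{C}$ the cross product vanishes and the identity reduces to $(\boldsymbol{\sigma} \cdot \mathbf{B})^2 = \mathbf{B}^2$, the form used later for $(\boldsymbol{\sigma} \cdot \mathbf{p})^2 = \mathbf{p}^2$.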

In the non-relativistic limit, i.e. for an electron moving slowly, with a small momentum, we may expect a Hamiltonian of the form $m + H_1$, where $H_1$ is small compared to $m$. Putting this Hamiltonian for $H_D^F$ in (46) and neglecting $H_1^2$ and other terms involving $e^2$, we get, on dividing by $2m$,

$$H_1 + eA_0 = \frac{1}{2m}(\boldsymbol{\pi}^2 + e\boldsymbol{\Sigma} \cdot \mathbf{B}). \tag{50}$$

The Hamiltonian given by this last equation is the same as the classical Hamiltonian for a slow electron, except for its last term,

$$\frac{e}{2m}\boldsymbol{\Sigma} \cdot \mathbf{B},$$

which may be considered as an additional potential energy which a slow electron has. This extra energy can be interpreted as arising from the electron having a magnetic moment

$$\boldsymbol{\mu} = -\frac{e}{2m}\boldsymbol{\Sigma}, \tag{51}$$

which implies that the g-factor of the electron is 2, which is very nearly the case.

It is remarkable that the Uhlenbeck-Goudsmit hypothesis, namely that the observed spectral features of the anomalous Zeeman effect are matched by assigning to the electron a magnetic moment given in terms of the operator $\boldsymbol{\mu} = -(e/m)\,\mathbf{s}$, where $\mathbf{s} = \frac{1}{2}\boldsymbol{\sigma}$, emerges from the Dirac theory.

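This discussion suggests that the relativistic particle has an intrinsic angular momentum 1/2, so that the total angular momentum is

$$\mathbf{J} = \mathbf{L} + \frac{1}{2}\boldsymbol{\Sigma}, \quad \mathbf{L} = \mathbf{r} \times \mathbf{p}. \tag{52}$$

The Hamiltonian from (15) commutes with the total angular momentum,

$$[\mathbf{J}, H] = [\mathbf{L}, H] + \frac{1}{2}[\boldsymbol{\Sigma}, H] = i\boldsymbol{\alpha} \times \mathbf{p} - i\boldsymbol{\alpha} \times \mathbf{p} = 0, \tag{53}$$

which can be obtained using $\boldsymbol{\Sigma} = \frac{1}{2i}\boldsymbol{\alpha} \times \boldsymbol{\alpha}$. This verifies the rotational invariance of the Dirac equation.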

13 Solutions of the Dirac equation

The Dirac equation admits plane wave solutions of the form

$$\psi(x) = e^{-ip \cdot x}\,u(p), \quad p \cdot x = p_\mu x^\mu, \tag{54}$$

where $u(p)$ is a four-component spinor which satisfies the equation

$$(\gamma \cdot p - m)\,u(p) = (\gamma^\mu p_\mu - m)\,u(p) = 0. \tag{55}$$

Equation (55) is a system of four linear homogeneous equations for the components of $u$, for which non-trivial solutions exist only if

$$\det(\gamma^\mu p_\mu - m) = (p_0^2 - \mathbf{p}^2 - m^2)^2 = 0. \tag{56}$$

Solutions therefore only exist if $p_0^2 = \mathbf{p}^2 + m^2$, i.e. only if $p_0 = \pm\sqrt{\mathbf{p}^2 + m^2}$.

Let $u_+(p)$ be a solution for $p_0 = E(p) = +\sqrt{\mathbf{p}^2 + m^2}$, so that $u_+(p)$ satisfies the Dirac equation

$$(\boldsymbol{\alpha} \cdot \mathbf{p} + \beta m)\,u_+(p) = E(p)\,u_+(p). \tag{57}$$

Using the decomposition of the wave function in Dirac space, as in the second relation of (30), we may write

$$u_+ = \begin{pmatrix} u_u \\ u_l \end{pmatrix},$$

where $u_u$ and $u_l$ have two components each. Adopting the representation (20) for $\boldsymbol{\alpha}$ and $\beta$, we find that $u_u$ and $u_l$ obey the following equations:

$$(\boldsymbol{\sigma} \cdot \mathbf{p})\,u_l + m\,u_u = E(p)\,u_u \tag{58a}$$

$$(\boldsymbol{\sigma} \cdot \mathbf{p})\,u_u - m\,u_l = E(p)\,u_l. \tag{58b}$$

Since $E(p) + m \neq 0$, equation (58b) gives

$$u_l = \frac{\boldsymbol{\sigma} \cdot \mathbf{p}}{E(p) + m}\,u_u \tag{59}$$

and substituting this value back into (58a), we find

$$\left(\frac{(\boldsymbol{\sigma} \cdot \mathbf{p})^2}{E(p) + m} + m\right) u_u = E(p)\,u_u. \tag{60}$$

Using (47), $(\boldsymbol{\sigma} \cdot \mathbf{p})^2 = \mathbf{p}^2$ and

$$\frac{\mathbf{p}^2}{E(p) + m} = \frac{E^2(p) - m^2}{E(p) + m} = E(p) - m,$$

we see that equation (60) is identically satisfied. There are therefore two linearly independent positive energy solutions for each momentum $\mathbf{p}$, which correspond, for example, to choosing $u_u$ equal to

$$\begin{pmatrix} 1 \\ 0 \end{pmatrix} \quad\text{or}\quad \begin{pmatrix} 0 \\ 1 \end{pmatrix},$$

which are respectively equal to $\chi_{+1/2}$ and $\chi_{-1/2}$ in Pauli's spinor notation.

This can also be seen using the operators and some of their properties. The Hamiltonian operator $H_D = \boldsymbol{\alpha} \cdot \mathbf{p} + \beta m$ commutes with the hermitian operator

$$s(\mathbf{p}) = \frac{\boldsymbol{\Sigma} \cdot \mathbf{p}}{|\mathbf{p}|}, \tag{61}$$

where $\boldsymbol{\Sigma}$ is defined by (48).

$s(\mathbf{p})$ is the helicity operator, or helicity of the particle, and physically corresponds to the spin of the particle parallel to the direction of motion. The solutions can therefore be chosen as simultaneous eigenfunctions of $H$ and $s(\mathbf{p})$. Since $s^2(\mathbf{p}) = 1$, the eigenvalues of $s(\mathbf{p})$ are $\pm 1$. The solutions can therefore be classified according to the eigenvalues $+1$ or $-1$.

A similar classification can be made for the negative energy solutions, for which $p_0 = -\sqrt{\mathbf{p}^2 + m^2}$ and where, for a given momentum, there are again two linearly independent solutions.

So, for a given momentum $\mathbf{p}$, there are four linearly independent solutions of the Dirac equation. These are characterized by $p_0 = \pm E(p)$ and $s(\mathbf{p}) = \pm 1$.

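As an example, we may write out explicitly two linearly independent solutions for positive energy and momentum $\mathbf{p}$:

$$u_+^{(u)}(p) = \sqrt{\frac{E(p) + m}{2E(p)}} \begin{pmatrix} \chi_{+1/2} \\ \dfrac{\boldsymbol{\sigma} \cdot \mathbf{p}}{E(p) + m}\,\chi_{+1/2} \end{pmatrix} \tag{62a}$$

$$u_+^{(l)}(p) = \sqrt{\frac{E(p) + m}{2E(p)}} \begin{pmatrix} \chi_{-1/2} \\ \dfrac{\boldsymbol{\sigma} \cdot \mathbf{p}}{E(p) + m}\,\chi_{-1/2} \end{pmatrix}, \tag{62b}$$

where the normalization constant is determined by the requirement that $u^\dagger u = 1$.

In the non-relativistic limit ($v \ll 1$, so that $p = mv$ and $E(p) \simeq m$),⁵ the components $u_l$ of a positive energy solution are of order $v$ times $u_u$ and therefore small. For a negative energy particle, it is the two upper components of the wave function which are small.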
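The solutions (62a) and (62b) can be checked numerically in the representation (20). The sketch below builds both spinors for an arbitrary momentum and tests the eigenvalue equation (57), the normalization, the commutation of $H_D$ with the helicity operator (61), and the current (36). The mass, the momentum and all function names are assumptions chosen for this illustration.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)
sigma = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]
alpha = [np.block([[Z2, s], [s, Z2]]) for s in sigma]
beta = np.block([[I2, Z2], [Z2, -I2]])

def sigma_dot(p):
    return sum(p[i] * sigma[i] for i in range(3))

def hamiltonian(p, m):
    """Free Dirac Hamiltonian H_D = alpha . p + beta m of equation (15)."""
    return sum(p[i] * alpha[i] for i in range(3)) + m * beta

def u_plus(p, m, chi):
    """Positive energy spinor of (62a)/(62b) for a two-component chi."""
    energy = np.sqrt(p @ p + m**2)
    norm = np.sqrt((energy + m) / (2 * energy))
    return norm * np.concatenate([chi, sigma_dot(p) @ chi / (energy + m)])

# Hypothetical mass and momentum in natural units.
m = 1.0
p = np.array([0.3, -0.4, 1.2])
E = np.sqrt(p @ p + m**2)
H = hamiltonian(p, m)
u_up = u_plus(p, m, np.array([1, 0], dtype=complex))
u_down = u_plus(p, m, np.array([0, 1], dtype=complex))

# Helicity operator (61) and current (36) for the first solution.
Sigma = [np.block([[s, Z2], [Z2, s]]) for s in sigma]
helicity = sum(p[i] * Sigma[i] for i in range(3)) / np.linalg.norm(p)
current = np.array([np.vdot(u_up, a @ u_up).real for a in alpha])
```

For these normalized spinors the current $u^\dagger\boldsymbol{\alpha}u$ equals $\mathbf{p}/E$, the velocity of the particle, which gives a direct physical reading of (36).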

14 Exactly solvable problems

There are only a few problems for which the Dirac equation can be solved exactly ([8]). Some of them are, in (3+1)-dimensional space-time:

- The Coulomb potential.

- The case of a homogeneous magnetic field extending over all space.

- The field of an electromagnetic plane wave.

- The so-called Dirac oscillator, which is a relativistic extension of the oscillator problem.

In (2+1)-dimensional space-time, we may cite the Dirac oscillator.

⁵ As we are in natural units for quantum mechanics, i.e. $\hbar = c = 1$.


15 The Dirac equation in an electric field

Following [6], we start with the Dirac Hamiltonian in the presence of a field (17). As we did in section 13, we may express the Dirac equation as two coupled equations for the time-independent solutions $u_u$, $u_l$:

$$[\boldsymbol{\sigma} \cdot \boldsymbol{\pi}]\,u_l + [m + qA_0]\,u_u = E\,u_u \tag{63a}$$

$$[\boldsymbol{\sigma} \cdot \boldsymbol{\pi}]\,u_u - [m - qA_0]\,u_l = E\,u_l. \tag{63b}$$

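From those, we get by substitution

$$(\boldsymbol{\sigma} \cdot \boldsymbol{\pi})\frac{1}{E + m - qA_0}(\boldsymbol{\sigma} \cdot \boldsymbol{\pi})\,u_u + qA_0\,u_u = (E - m)\,u_u. \tag{64}$$

Now we will assume $\mathbf{A} = 0$, $E = E' + m$ and that

$$\frac{1}{E' + 2m - qA_0} \simeq \frac{1}{2m}\left[1 - \frac{E' - qA_0}{2m}\right]. \tag{65}$$

We thus have

$$-\frac{1}{2m}\left[1 - \frac{E' - qA_0}{2m}\right](\boldsymbol{\sigma} \cdot \nabla)^2 u_u - \frac{q}{4m^2}(\boldsymbol{\sigma} \cdot \nabla A_0)(\boldsymbol{\sigma} \cdot \nabla u_u) + qA_0\,u_u = E' u_u. \tag{66}$$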

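Using the relation (47) and assuming that $A_0(r)$ is spherically symmetric, we get, with the orbital momentum operator $\mathbf{L} = \mathbf{r} \times \mathbf{p}$ and the Pauli spin operator $\mathbf{s} = \frac{1}{2}\boldsymbol{\sigma}$:

$$\left[-\frac{1}{2m}\nabla^2 + qA_0 + \frac{1}{2m}\left[\frac{E' - qA_0}{2m}\right]\nabla^2 + \frac{q}{2m^2}\frac{1}{r}\frac{dA_0}{dr}\,\mathbf{L} \cdot \mathbf{s} - \frac{q}{4m^2}\frac{dA_0}{dr}\frac{\partial}{\partial r}\right] u_u = E' u_u, \tag{67}$$

knowing that

$$\nabla A_0(r) = \frac{1}{r}\frac{dA_0}{dr}\,\mathbf{r}$$

$$\nabla A_0 \cdot \nabla u_u = \frac{dA_0}{dr}\frac{\partial u_u}{\partial r}$$

$$i\boldsymbol{\sigma} \cdot [\nabla A_0 \times \nabla u_u] = -2\,\frac{1}{r}\frac{dA_0}{dr}\,\mathbf{L} \cdot \mathbf{s}\,u_u$$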

Let us now look at equation (67) more closely:

- The first and second terms form the non-relativistic Hamiltonian for a particle of mass $m$ and charge $q$ in a central potential $A_0(r)$.

- The third term is a relativistic correction to the kinetic energy operator. It can be written

$$\frac{1}{2m}\left[\frac{E' - qA_0}{2m}\right]\nabla^2 \simeq -\frac{p^4}{8m^3}, \tag{68}$$

as $E' - qA_0 \simeq p^2/2m$.

- The fourth term is the spin-orbit interaction.

- The fifth term is non-Hermitian. C.G. Darwin showed [2] that it could be written as

$$\frac{q}{8m^2}\,\nabla^2 A_0(r), \tag{69}$$

which is

$$\frac{4\pi}{8m^2}\left(\frac{Ze^2}{4\pi\varepsilon_0}\right)\delta(\mathbf{r}) \tag{70}$$

for a Coulomb potential. This term only affects the s-states. It comes from the fact that, in quantum mechanics, the electron's wave function is spread out. In the non-relativistic limit, the electron therefore "feels" the electric field of the proton over a finite volume whose radius is approximately the Compton wavelength of the electron, $\hbar/mc$ ([8]).
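The passage from (69) to (70) can be sketched as follows, assuming an electron of charge $q = -e$ in the field of a point nucleus of charge $Ze$ (this sign convention is an assumption made here for illustration):

```latex
qA_0(r) = -\frac{Ze^2}{4\pi\varepsilon_0}\,\frac{1}{r},
\qquad
\nabla^2\frac{1}{r} = -4\pi\,\delta(\mathbf{r})
\quad\Longrightarrow\quad
\frac{q}{8m^2}\,\nabla^2 A_0
 = \frac{4\pi}{8m^2}\left(\frac{Ze^2}{4\pi\varepsilon_0}\right)\delta(\mathbf{r}) .
```

Since $\delta(\mathbf{r})$ only samples the wave function at the origin, only states with a non-vanishing density there, i.e. the s-states, are shifted.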

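The Dirac equation can be solved exactly for the hydrogen atom. The energy eigenvalues are given by ([8]):

$$E^D_{nj} = m\left\{1 + \left(\frac{Z\alpha}{n - j - 1/2 + \sqrt{(j + 1/2)^2 - Z^2\alpha^2}}\right)^2\right\}^{-1/2}, \tag{71}$$

where $\alpha = \frac{e^2}{4\pi\hbar c} \simeq \frac{1}{137}$. If we expand in powers of $Z\alpha$, we get

$$E^D_{nj} = m\left[1 - \frac{(Z\alpha)^2}{2n^2} + \frac{(Z\alpha)^4(6j + 3 - 8n)}{8(2j + 1)n^4} + O\left((Z\alpha)^6\right)\right], \tag{72}$$

which leads to

$$E_{nj} = E^D_{nj} - m = E^{(0)}_n\left[1 - \frac{(Z\alpha)^2(6j + 3 - 8n)}{4(2j + 1)n^2} + O\left((Z\alpha)^4\right)\right], \tag{73}$$

where

$$E^{(0)}_n = -\frac{m(Z\alpha)^2}{2n^2} \tag{74}$$

are the eigenvalues obtained with the non-relativistic Schrödinger equation.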
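A quick numerical comparison may help to see how well the expansion reproduces the exact eigenvalues. The following is only a minimal sketch: the function names are hypothetical, the numerical values of $\alpha$ and of the electron rest energy are approximate, and a point nucleus of infinite mass is assumed.

```python
import math

# Approximate constants, used only for illustration.
ALPHA = 1 / 137.035999   # fine-structure constant
M_EV = 510998.95         # electron rest energy in eV (natural units, c = 1)


def dirac_energy(n, j, z=1, m=M_EV, alpha=ALPHA):
    """Exact eigenvalue of equation (71), rest energy included."""
    za = z * alpha
    denom = n - j - 0.5 + math.sqrt((j + 0.5) ** 2 - za ** 2)
    return m / math.sqrt(1.0 + (za / denom) ** 2)


def expanded_energy(n, j, z=1, m=M_EV, alpha=ALPHA):
    """Expansion (72), truncated after the (Z alpha)^4 term."""
    za2 = (z * alpha) ** 2
    quartic = za2 ** 2 * (6 * j + 3 - 8 * n) / (8 * (2 * j + 1) * n ** 4)
    return m * (1.0 - za2 / (2 * n ** 2) + quartic)


def schrodinger_energy(n, z=1, m=M_EV, alpha=ALPHA):
    """Non-relativistic eigenvalue (74); negative for bound states."""
    return -m * (z * alpha) ** 2 / (2 * n ** 2)


if __name__ == "__main__":
    for n, j in [(1, 0.5), (2, 0.5), (2, 1.5)]:
        exact = dirac_energy(n, j) - M_EV
        approx = expanded_energy(n, j) - M_EV
        print(f"n={n}, j={j}: exact={exact:.7f} eV, "
              f"expansion={approx:.7f} eV, "
              f"Schrodinger={schrodinger_energy(n):.7f} eV")
```

The levels with the same $n$ but different $j$ are split by the $(Z\alpha)^4$ term, which is the fine structure that the Schrödinger eigenvalues alone do not show.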

16 The sea of negative energy

We have noted that the Dirac equation admits negative energy solutions. Their interpretation presented a great deal of difficulty for some time; for example, a negative energy particle would be accelerated in the direction opposite to the external force.

In a classical theory, the negative energy states cause no trouble because no transition between positive and negative energy states occurs. Therefore, if a particle occupies a positive energy state at any time, it will never appear in a negative energy state. The anomalous negative energy states are then eliminated as a result of initial conditions stipulating that no such state occurred in the past. In a quantum theory, this device is no longer admissible, as spontaneous emission of radiation can occur as long as a state of lower energy is unoccupied and as long as conservation of angular and linear momenta can be fulfilled. These conservation principles can always be fulfilled under appropriate conditions. There is nothing to prevent an electron from radiating energy in making a transition to lower and lower states ([7]).

In 1930, Dirac resolved the difficulties of interpretation by suggesting his so-called "hole" theory, which he formulated as follows ([3]):

Assume that nearly all the negative energy states are occupied, with one electron in each state in accordance with the exclusion principle of Pauli. The exclusion principle makes it impossible for positive energy electrons to make transitions to negative energy states unless these are emptied by some means. Such an unoccupied negative energy state will now appear as something with positive energy, since to make it disappear, i.e. to fill it up, we should have to add an electron with negative energy. We assume that these unoccupied negative-energy states are the positrons. The "hole" would have a charge opposite to that of the positive energy particle. The positron was experimentally discovered in 1932 by Carl D. Anderson [1]. It has the same spin operator as the electron but has opposite energy, momentum and angular momentum operators. Therefore it also has the opposite helicity, which is well known as an experimental result in beta decay [7].

Dirac also suggested that there has to be a distribution of electrons of infinite density everywhere in the world and that a perfect vacuum is a region where all the states of positive energy are unoccupied and all those of negative energy are occupied. However, this infinite distribution does not contribute to the electric field, as, of course, Maxwell's equation in a perfect vacuum, $\nabla\cdot\mathbf{E} = 0$, must be valid. Thus only departures from the distribution in a vacuum will contribute to the electric density. There will be a contribution $-e$ for each occupied state of positive energy and a contribution $+e$ for each unoccupied state of negative energy.

The exclusion principle will ordinarily prevent a positive-energy electron from making transitions to states of negative energy. It will still be possible, however, for such an electron to drop into an unoccupied state of negative energy. In this case we should have an electron and a positron disappearing simultaneously, their energy being emitted in the form of radiation. The converse process would consist in the creation of an electron and a positron from electromagnetic radiation.

Although the prediction of the positron is a brilliant success of Dirac's theory, some questions still arise. With a completely filled "negative energy sea", the theory can no longer be a single-particle theory. The treatment of problems of electrodynamics is complicated by the requisite elaborate structure of the vacuum. However, the effect of the "crowded" vacuum on the mass and charge of a Dirac particle is to change them to new values, which must be identified with the observed mass and charge.

17 Foldy-Wouthuysen representation

The Dirac equation in the form described above does not lend itself easily to a simple interpretation. Let us consider for example the operator

$$\dot{\mathbf{x}} = i\,[H_D, \mathbf{x}] = \boldsymbol{\alpha}, \tag{75}$$

which we would like to call the velocity operator. Since $\alpha_i^2 = 1$, the absolute magnitude of the "velocity" in any given direction is always 1, which is, since we have set $c = \hbar = 1$, the speed of light. This is, of course, not physically reasonable.
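The statement about the eigenvalues of the velocity components can be checked directly on the matrices. The sketch below assumes the standard (Dirac) representation, in which each $\alpha_i$ has the block form with $\sigma_i$ in the off-diagonal blocks; this choice of representation and all helper names are assumptions made for illustration.

```python
# Pauli matrices as nested lists of complex numbers.
SIGMA = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]


def alpha_matrix(sigma):
    """Return the 4x4 block matrix [[0, sigma], [sigma, 0]]."""
    zeros = [0, 0]
    upper = [zeros + row for row in sigma]   # blocks (0, sigma)
    lower = [row + zeros for row in sigma]   # blocks (sigma, 0)
    return upper + lower


def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def matadd(a, b):
    """Element-wise sum of two square matrices."""
    n = len(a)
    return [[a[i][j] + b[i][j] for j in range(n)] for i in range(n)]


def trace(a):
    return sum(a[i][i] for i in range(len(a)))


def is_close(a, b, tol=1e-12):
    n = len(a)
    return all(abs(a[i][j] - b[i][j]) < tol
               for i in range(n) for j in range(n))


IDENTITY = [[1 if i == j else 0 for j in range(4)] for i in range(4)]
ZERO = [[0] * 4 for _ in range(4)]
ALPHAS = [alpha_matrix(s) for s in SIGMA]

if __name__ == "__main__":
    for i, a in enumerate(ALPHAS, start=1):
        squares_to_one = is_close(matmul(a, a), IDENTITY)
        print(f"alpha_{i}^2 = 1: {squares_to_one}, trace = {trace(a)}")
```

A Hermitian matrix whose square is the identity can only have the eigenvalues $+1$ and $-1$, and a vanishing trace means that each occurs twice. A measurement of any component of $\boldsymbol{\alpha}$ would therefore always give $\pm c$.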

From this example, we have to conclude that there must exist another representation of the Dirac equation in which the physical interpretation is more transparent. This can also be inferred from the fact that the two independent states associated with each value of the momentum of a positive energy Dirac particle, which correspond to the two possible directions of the spin, have, according to quantum mechanics, to be represented by exactly two vectors in Hilbert space ([8]). There is therefore a redundancy in the representation of those vectors.

This problem was solved in 1950 by Foldy and Wouthuysen ([4]), who noticed that the main reason for this redundancy is the presence of odd operators, i.e., as has been said before, operators which connect upper and lower components of the wave function. If it were possible to perform a canonical transformation on the Hamiltonian $H_D$ and bring it to a form free of odd operators, it would be possible to represent the solutions by two-component spinors.

The suggested transformation,

$$\psi \to e^{iS}\psi = \psi' \tag{76a}$$

$$H \to e^{iS}H_D\,e^{-iS} = H'_D, \tag{76b}$$

with $S$ of the form

$$S = -\left(\frac{i}{2m}\right)\beta\,\boldsymbol{\alpha}\cdot\mathbf{p}\;\omega\!\left(\frac{|\mathbf{p}|}{m}\right), \tag{77}$$

$\omega$ being a real function to be determined such that $H'_D$ is free of odd operators, leads to a new position operator

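$$\mathbf{X} = e^{iS}\mathbf{x}\,e^{-iS} = \mathbf{x} + \frac{i\beta\boldsymbol{\alpha}}{2E(p)} - i\,\frac{\beta(\boldsymbol{\alpha}\cdot\mathbf{p})\,\mathbf{p} + i\,[\boldsymbol{\sigma}\times\mathbf{p}]\,|\mathbf{p}|}{2E(p)\,(E(p) + m)\,|\mathbf{p}|} \tag{78}$$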

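and to a new spin operator, called the "mean spin operator",

$$\boldsymbol{\Sigma}_M = \boldsymbol{\Sigma} - \frac{i\beta(\boldsymbol{\alpha}\times\mathbf{p})}{E(p)} - \frac{\mathbf{p}\times(\boldsymbol{\sigma}\times\mathbf{p})}{E(p)\,(E(p) + m)}, \tag{79}$$

where

$$\boldsymbol{\Sigma} = e^{iS}\boldsymbol{\Sigma}_M\,e^{-iS} = \frac{1}{2i}\,(\boldsymbol{\alpha}\times\boldsymbol{\alpha}) \tag{80}$$

is the spin operator defined by (48).

The complete reasoning can be followed in the original article [4] as well as in [8].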
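To see how a suitable $\omega$ removes the odd operators, here is a sketch of the standard free-particle computation. It assumes the free Hamiltonian $H_D = \boldsymbol{\alpha}\cdot\mathbf{p} + \beta m$; the angle $\theta \equiv \frac{|\mathbf{p}|}{2m}\,\omega$ and the unit vector $\hat{\mathbf{p}} = \mathbf{p}/|\mathbf{p}|$ are notations introduced here for illustration.

```latex
\begin{align*}
e^{iS} &= \exp\bigl(\beta\,\boldsymbol{\alpha}\cdot\hat{\mathbf{p}}\;\theta\bigr)
        = \cos\theta + \beta\,\boldsymbol{\alpha}\cdot\hat{\mathbf{p}}\,\sin\theta ,
\qquad
(\beta\,\boldsymbol{\alpha}\cdot\hat{\mathbf{p}})^2 = -1 , \\
H'_D &= e^{iS}\,(\boldsymbol{\alpha}\cdot\mathbf{p} + \beta m)\,e^{-iS}
      = e^{2iS}\,(\boldsymbol{\alpha}\cdot\mathbf{p} + \beta m) \\
     &= \boldsymbol{\alpha}\cdot\mathbf{p}\left(\cos 2\theta - \frac{m}{|\mathbf{p}|}\sin 2\theta\right)
      + \beta\,\bigl(m\cos 2\theta + |\mathbf{p}|\sin 2\theta\bigr) , \\
\tan 2\theta &= \frac{|\mathbf{p}|}{m}
\quad\Longrightarrow\quad
H'_D = \beta\,\sqrt{\mathbf{p}^2 + m^2} = \beta\,E(p) .
\end{align*}
```

The second line uses the fact that $\beta\,\boldsymbol{\alpha}\cdot\hat{\mathbf{p}}$ anticommutes with both $\boldsymbol{\alpha}\cdot\mathbf{p}$ and $\beta$. The odd term $\boldsymbol{\alpha}\cdot\mathbf{p}$ drops out for $\omega(|\mathbf{p}|/m) = \frac{m}{|\mathbf{p}|}\arctan\frac{|\mathbf{p}|}{m}$, and the remaining Hamiltonian is even.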

Here are some of the consequences of this canonical transformation:

- In this representation, positive and negative energy states are separately represented by two-component wave functions.

- The position and spin operators differ from those of the conventional representation.

- The components of the time derivative of the new position operator all commute and have as eigenvalues all values between $-1$ and $+1$, i.e. between $-c$ and $+c$ in non-reduced units.

- The new spin operator is now a constant of the motion, which was not the case before.

- It is these new operators, rather than the conventional ones, which pass over into the position and spin operators of the Pauli theory in the non-relativistic limit.
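The statement about the new velocity can be made concrete for the free particle. The sketch below takes as an assumption that the transformed free Hamiltonian is $H'_D = \beta E(p)$ with $E(p) = \sqrt{\mathbf{p}^2 + m^2}$, and that the new position operator acts as $\mathbf{x}$ in the new representation:

```latex
\frac{d\mathbf{X}}{dt}
  = i\,[\beta E(p),\,\mathbf{x}]
  = \beta\,\frac{\partial E(p)}{\partial \mathbf{p}}
  = \beta\,\frac{\mathbf{p}}{E(p)} ,
\qquad
\left|\frac{p_i}{E(p)}\right| = \frac{|p_i|}{\sqrt{\mathbf{p}^2 + m^2}} < 1 .
```

The components depend only on $\mathbf{p}$ and $\beta$, so they commute with each other, and their magnitude is that of the classical relativistic velocity $\mathbf{p}/E$, which is what one expects of a velocity operator.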

The Foldy-Wouthuysen representation is particularly useful for the discussion of the non-relativistic limit of the Dirac equation, since the operators representing physical quantities are in one-to-one correspondence with the operators of the Pauli theory. There also exists another limit of considerable interest, namely the ultrarelativistic one, where the mass of the particle can be neglected in comparison with its kinetic energy. The corresponding form of the Dirac equation is obtained by choosing an appropriate $\omega$ (see [8] for more details).