PHYS812

Introductory comments and syllabus

• The course is PHYS812, the third semester of the Department’s Quantum Mechanics sequence.

• I am Dr. Pittel and my office is in Sharp Lab Rm 202.

• The formal text for the course is the same as was used in PHYS610 and PHYS811,

namely Principles of Quantum Mechanics, Second Edition, by R. Shankar. Since

Shankar does not discuss some of the topics I wish to speak about, I am recommending

for supplemental reading the book Quantum Mechanics, Third Edition, by E.

Merzbacher.

• The topics to be discussed in the course are:

1. Scattering Theory,

2. Second quantization,

3. Non-relativistic many-body theory,

4. Relativistic quantum mechanics,

5. Applications of the Feynman Path Integral Approach (time permitting).

• The material on Scattering Theory can be found in Chapter 19 of Shankar, with valuable

additional material in Chapters 13 and 20 of Merzbacher. There is no discussion

unfortunately on Second Quantization or Many-Body Theory in Shankar, but some

useful material on those topics in Chapters 21 and 22 of Merzbacher. My material

on Relativistic Quantum Mechanics will derive to some extent from Chapter 20 of

Shankar, but there is also some useful discussion in Chapter 24 of Merzbacher. The

material I will be discussing on the Feynman Path Integral Approach, time permitting,

can be found in Chapter 21 of Shankar.

• Where appropriate, I will be assigning weekly reading from Shankar. I will at the same

time be preparing my own detailed set of lecture notes, which I will make available

on the web prior to the associated lecture. Feel free to bring those lecture notes with

you, so that you do not have to spend the entire lecture taking notes. The web site

where the lecture notes will reside is: www.physics.udel.edu/ pittel.

• I will also be assigning weekly problem assignments, sometimes from the text and

sometimes not. All such assignments should be handed in (typically) a week after

assignment and will be graded. Students may work in groups but I request that the

assignments be written up independently. The homework assignments will likewise be

made available on the web site given above.

• I will give a mid-term examination and a final examination, both in class and closed-

book.

• At the end of the semester your grade will be obtained by weighting the homework

problems 20%, the midterm 30% and the final examination 50%.

• Office hours: I will have office hours from 2-4pm on Fridays.

An introduction to scattering theory

Over the next several weeks, I will discuss the quantum theory of collision or scattering

processes. I will focus initially on elastic scattering of spinless particles, and only at the end

will begin to put in some of the generalizations.

You should begin reading Chapter 19 of Shankar where his discussion of scattering theory

takes place.

Qualitative description of a scattering experiment

A typical scattering experiment involves the various ingredients shown schematically in

Fig. 1.

[FIG. 1: Schematic illustration of an elastic scattering experiment, with source S, collimator C, target T and detector D]

1. A source S of incident particles, for example a particle accelerator. The output of the

accelerator is a beam of particles, each described by a wave packet.

2. A beam collimator C. The beam is collimated by passing it through a narrow slit,

thereby giving it a small but finite spatial spread in the vertical direction.

3. A target T. The target provides a force field through which the incident particle interacts

with it. In principle the particles in the target are also described by wave

packets.

4. A detector D. As a result of the interaction of the incident wave packet with the

target wave packet, a portion of the incident wave packet is scattered and a portion

is transmitted (i.e. unscattered). The scattered part moves radially outward from

the force center, as represented by the succession of circles around T. The “almost”

parallel lines represent the “almost” plane waves, part of which pass through the target

unaffected, i.e. unscattered.

Experimentally, as long as the detector D is not in the path of the beam (i.e. is not placed at very small

scattering angles), the only particles that reach it are those that have been scattered, as a result of the

use of the collimating slit C.

The source, the collimator and the detector can all be considered as infinitely far from

the target.

The scattering cross section

Suppose we bombard a group of n target particles with an almost parallel flux of N

particles per unit area per unit time, and count the number of incident particles that reach

the detector per unit time. Let’s assume that the detector subtends a solid angle dΩ about

a direction at polar angles θ, ϕ with respect to the incident beam, which we assume to move

in the z direction. We shall further assume that the detector is so placed that it receives

only scattered particles and none from the transmitted wave.

The number of particles detected per unit time in dΩ is proportional to N, n and dΩ, viz:

$$\text{number detected per unit time} = n\,N\,\frac{d\sigma}{d\Omega}\,d\Omega$$

The proportionality constant $d\sigma/d\Omega$ is called the differential cross section.

The total cross section σ is defined as

$$\sigma = \int \frac{d\sigma}{d\Omega}\,d\Omega$$

It represents the total number of particles scattered per unit time, per target particle and per unit incident flux.

Note that dimensionally both $d\sigma/d\Omega$ and σ have the dimensions of an area, e.g. fm² or angstroms².
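The definition of σ as an integral of dσ/dΩ over solid angle can be checked numerically. The following is a minimal sketch, not part of the original notes: the function name `total_cross_section`, the radius `a` and the choice of an isotropic dσ/dΩ = a²/4 (the classical hard-sphere value, for which σ = πa²) are all assumptions made for illustration.

```python
import math

def total_cross_section(dsigma_domega, n_theta=400, n_phi=8):
    """Integrate dσ/dΩ over the full solid angle with the midpoint rule.

    dsigma_domega(theta, phi) returns the differential cross section
    (an area per unit solid angle) at polar angles theta, phi.
    """
    d_theta = math.pi / n_theta
    d_phi = 2.0 * math.pi / n_phi
    total = 0.0
    for i in range(n_theta):
        theta = (i + 0.5) * d_theta
        for j in range(n_phi):
            phi = (j + 0.5) * d_phi
            # Solid-angle element: dΩ = sin(theta) dtheta dphi
            total += dsigma_domega(theta, phi) * math.sin(theta) * d_theta * d_phi
    return total

# Assumed example: isotropic scattering with dσ/dΩ = a² / 4 (a in fm).
a = 2.0  # hypothetical radius in fm
sigma = total_cross_section(lambda theta, phi: a * a / 4.0)
print(sigma, math.pi * a * a)  # the two numbers should agree closely
```

Any other angular dependence can be passed in the same way, as long as the function returns an area per unit solid angle.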

Continuum solutions of the Time-Independent Schrodinger equation

We will show shortly that despite the fact that collision processes involve wave pack-

ets, which are not energy eigenstates, it is nevertheless possible to describe most physical

scattering processes in terms of appropriate continuum eigenstates of the Time-Independent

Schrodinger equation. So as to provide the necessary background for that discussion, I would

now like to remind you of some of the features of continuum solutions of the Schrodinger

equation for two spinless particles interacting via a central potential. The relevant Time-

Independent Schrodinger equation (after removal of the dependence on the CM variable)

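is

$$\left(-\frac{\hbar^2}{2\mu}\nabla^2 + V(r)\right)\Psi(\mathbf r) = E\,\Psi(\mathbf r) \tag{1}$$

For simplicity, we introduce

$$U = \frac{2\mu V}{\hbar^2}, \qquad k^2 = \frac{2\mu E}{\hbar^2}$$

and rewrite (1) as

$$\left(\nabla^2 + k^2\right)\Psi(\mathbf r) = U(r)\,\Psi(\mathbf r) \tag{2}$$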
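To get a feeling for the scaled quantities U and k², here is a small numerical sketch. It is an illustration only: the helper names, the use of nuclear units (MeV and fm, with ħc ≈ 197.327 MeV fm) and the sample reduced mass are assumptions, not values taken from the notes.

```python
import math

HBAR_C = 197.327  # ħc in MeV fm

def wave_number(energy_mev, reduced_mass_mev):
    """k = sqrt(2 µ E) / ħ in fm^-1, with µc² and E both given in MeV."""
    return math.sqrt(2.0 * reduced_mass_mev * energy_mev) / HBAR_C

def scaled_potential(v_mev, reduced_mass_mev):
    """U = 2 µ V / ħ² in fm^-2 for a potential value V in MeV."""
    return 2.0 * reduced_mass_mev * v_mev / HBAR_C ** 2

mu = 469.5  # assumed: roughly half a nucleon mass (µc² in MeV)
k = wave_number(10.0, mu)
print(k)                             # k in fm^-1 for E = 10 MeV
print(scaled_potential(-50.0, mu))   # U in fm^-2 for V = -50 MeV
```

With these definitions both k² and U carry units of inverse length squared, which is what makes equation (2) dimensionally consistent.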

The above differential equation by itself does not fully specify the scattering problem. Due

to time reversal or parity invariance, it has degenerate solutions at all positive energies E. By

specifying the boundary conditions, i.e. the conditions as r → ∞, we can choose the linear

combination of degenerate solutions which are appropriate for the scattering problem of

interest. Thus, to fully specify the appropriate continuum solutions for scattering problems,

we must supplement the differential equation (2) by boundary conditions. If, however, we

transform the differential equation (2) into an integral equation, we can incorporate the

desired boundary conditions into the same equation.

Introduction of the Free-Particle Green’s Function

To make the transformation from a differential equation with supplementary boundary conditions to an integral equation, we introduce the free-particle Green’s function $g_0(\mathbf r, \mathbf r')$, which we define by

$$\left[\nabla^2 + k^2\right] g_0(\mathbf r, \mathbf r') = \delta(\mathbf r - \mathbf r')$$

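With the above definition of $g_0(\mathbf r, \mathbf r')$, we see that

$$\begin{aligned}
\left(\nabla^2 + k^2\right)\left[\int d\mathbf r'\, g_0(\mathbf r, \mathbf r')\,U(r')\,\Psi(\mathbf r')\right]
&= \int d\mathbf r'\,\left[\left(\nabla^2 + k^2\right) g_0(\mathbf r, \mathbf r')\right] U(r')\,\Psi(\mathbf r') \\
&= \int d\mathbf r'\,\delta(\mathbf r - \mathbf r')\,U(r')\,\Psi(\mathbf r') \\
&= U(r)\,\Psi(\mathbf r)
\end{aligned}$$

Thus, if we set

$$\Psi(\mathbf r) = \int d\mathbf r'\, g_0(\mathbf r, \mathbf r')\,U(r')\,\Psi(\mathbf r')$$

then this Ψ(r) is guaranteed to satisfy the Schrodinger equation

$$\left[\nabla^2 + k^2\right]\Psi = U\Psi$$

The equation

$$\Psi(\mathbf r) = \int d\mathbf r'\, g_0(\mathbf r, \mathbf r')\,U(r')\,\Psi(\mathbf r')$$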

is an integral equation. Note that the unknown function Ψ also appears in the integrand.

Question: Is the above integral equation the only one that is consistent with the

Schrodinger equation?

Answer: No! To see this, consider instead the integral equation

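$$\Psi(\mathbf r) = \Phi(\mathbf r) + \int d\mathbf r'\, g_0(\mathbf r, \mathbf r')\,U(r')\,\Psi(\mathbf r')$$

in which the function Φ(r) is assumed to satisfy

$$\left(\nabla^2 + k^2\right)\Phi(\mathbf r) = 0$$

Then

$$\left(\nabla^2 + k^2\right)\left[\Phi + \int d\mathbf r'\, g_0(\mathbf r, \mathbf r')\,U(r')\,\Psi(\mathbf r')\right] = \left(\nabla^2 + k^2\right)\left[\int d\mathbf r'\, g_0(\mathbf r, \mathbf r')\,U(r')\,\Psi(\mathbf r')\right] = U(r)\,\Psi(\mathbf r)$$

Thus, we can always add any solution of the homogeneous equation

$$\left(\nabla^2 + k^2\right)\Phi = 0$$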

and obtain a new integral equation which is also consistent with the Schrodinger equation.

This freedom can be used to incorporate the desired boundary conditions of the scattering

problem.

Note that the equation

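$$\left(\nabla^2 + k^2\right)\Phi = 0$$

is just the free-particle Schrodinger equation. Among its solutions, as we know, are the normalized plane waves

$$\Phi(\mathbf r) = \frac{1}{(2\pi)^{3/2}}\,e^{i\mathbf k\cdot\mathbf r}$$

With this choice of Φ(r), the integral equation becomes

$$\Psi(\mathbf r) = \frac{1}{(2\pi)^{3/2}}\,e^{i\mathbf k\cdot\mathbf r} + \int d\mathbf r'\, g_0(\mathbf r, \mathbf r')\,U(r')\,\Psi(\mathbf r') \tag{3}$$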

This has the desired separation of Ψ(r) into an incoming plane wave plus another term.

We will now show that it is possible to choose g0(r, r ′) such that the second term behaves

for large r like an outgoing spherical wave. Our integral equation will then not only be

consistent with the Schrodinger equation, but it will also incorporate the desired scattering

boundary conditions.

Explicit Construction of g0

Because of the symmetry of the problem, the Green’s function g0(r, r′) can only depend

on r − r ′. Define

$$\mathbf x = \mathbf r - \mathbf r'$$

so that

$$\left(\nabla^2 + k^2\right) g_0(\mathbf x) = \delta(\mathbf x) \tag{4}$$

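To obtain $g_0(\mathbf x)$, we first expand it as a Fourier integral

$$g_0(\mathbf x) = \frac{1}{(2\pi)^3}\int e^{i\mathbf q\cdot\mathbf x}\,g_0(\mathbf q)\,d\mathbf q \tag{5}$$

Plugging (5) into (4), we see that

$$\frac{1}{(2\pi)^3}\int \left(-q^2 + k^2\right) e^{i\mathbf q\cdot\mathbf x}\,g_0(\mathbf q)\,d\mathbf q = \delta(\mathbf x) \tag{6}$$

But we know that

$$\frac{1}{(2\pi)^3}\int e^{i\mathbf q\cdot\mathbf x}\,d\mathbf q = \delta(\mathbf x)$$

so that $g_0(\mathbf q)$ must satisfy

$$\left(k^2 - q^2\right) g_0(\mathbf q) = 1 \tag{7}$$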

Since (7) must be satisfied for all positive q, including q = k, we cannot trivially invert this

equation to obtain g0(q). The proper technique for inverting such singular equations is to

first consider the generalization to complex k. Once k is assumed complex (with Im k ≠ 0), we can obtain $g_0(\mathbf q)$ for all q. This can then be plugged into (5) and the integration can be carried out to obtain $g_0(\mathbf x)$. At that point, we can take the limit Im k → 0, since the physical value of k is real. What we will find is that this procedure does not yield a unique answer. The result will depend on whether Im k → 0 from above the real axis or from below. We shall use the notation $g_0^+(\mathbf x)$ to denote the result when Im k → 0+, i.e. from the positive imaginary side, and correspondingly $g_0^-(\mathbf x)$ to denote the result when Im k → 0−, i.e. from the negative imaginary side. Physically, these will be seen to yield wave functions with different boundary conditions. Both are, however, legitimate mathematical solutions.

Mathematically, we can do the above by replacing $k^2$ in (7) by $k^2 \pm i\epsilon$, with the understanding that after the eventual integration of (4) we will take the limit ϵ → 0. Thus, we replace (7) by

$$\left(k^2 \pm i\epsilon - q^2\right) g_0^\pm(\mathbf q) = 1$$

for which the solutions are

$$g_0^\pm(\mathbf q) = \frac{1}{k^2 - q^2 \pm i\epsilon}$$

Inserting this into (5) and taking the limit as ϵ → 0 gives

$$g_0^\pm(\mathbf x) = \frac{1}{(2\pi)^3}\lim_{\epsilon\to 0}\int \frac{e^{i\mathbf q\cdot\mathbf x}}{k^2 - q^2 \pm i\epsilon}\,d\mathbf q$$

To evaluate the integral, we choose the coordinate system for q in such a way that its

z-axis is along the vector x. Then

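$$e^{i\mathbf q\cdot\mathbf x} = e^{iqx\cos\theta}$$

The angular integrals can then be carried out immediately, giving

$$\int_{-1}^{+1}\int_0^{2\pi} e^{iqx\cos\theta}\,d(\cos\theta)\,d\phi = \frac{2\pi}{iqx}\left[e^{iqx} - e^{-iqx}\right]$$

so that

$$\begin{aligned}
g_0^\pm(x) &= \frac{1}{4\pi^2 i x}\lim_{\epsilon\to 0}\int_0^\infty \frac{e^{iqx} - e^{-iqx}}{k^2 - q^2 \pm i\epsilon}\,q\,dq \\
&= \frac{1}{8\pi^2 i x}\lim_{\epsilon\to 0}\int_{-\infty}^{\infty} \frac{e^{iqx} - e^{-iqx}}{k^2 - q^2 \pm i\epsilon}\,q\,dq
\end{aligned}$$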

To evaluate the remaining integral, we use contour integration. For the first term,

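$$\lim_{\epsilon\to 0}\int_{-\infty}^{\infty} \frac{e^{iqx}}{k^2 - q^2 \pm i\epsilon}\,q\,dq ,$$

we close the contour in the upper half plane. For the second term,

$$\lim_{\epsilon\to 0}\int_{-\infty}^{\infty} \frac{e^{-iqx}}{k^2 - q^2 \pm i\epsilon}\,q\,dq ,$$

we close it in the lower half plane. The relevant poles of the integrand are as follows:

(A) $g_0^+(x)$

(1) First term: $q = \sqrt{k^2 + i\epsilon} \approx k + \dfrac{i\epsilon}{2k}$

(2) Second term: $q = -\sqrt{k^2 + i\epsilon} \approx -k - \dfrac{i\epsilon}{2k}$

(B) $g_0^-(x)$

(1) First term: $q = -\sqrt{k^2 - i\epsilon} \approx -k + \dfrac{i\epsilon}{2k}$

(2) Second term: $q = \sqrt{k^2 - i\epsilon} \approx k - \dfrac{i\epsilon}{2k}$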

where I have made use of the fact that ϵ is very small so that we need only keep the term

linear in it.

All told (and you should convince yourself that these results are correct)

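$$g_0^+(x) = \frac{1}{8\pi^2 i x}\,\pi i\left[-e^{ikx} - e^{ikx}\right] = -\frac{1}{4\pi x}\,e^{ikx}$$

Similarly,

$$g_0^-(x) = \frac{1}{8\pi^2 i x}\,\pi i\left[-e^{-ikx} - e^{-ikx}\right] = -\frac{1}{4\pi x}\,e^{-ikx}$$

Combining the two, we find that

$$g_0^\pm(x) = -\frac{1}{4\pi x}\,e^{\pm ikx}$$

Putting back $\mathbf x = \mathbf r - \mathbf r'$ gives our final result for the free-particle Green’s function(s)

$$g_0^\pm(\mathbf r - \mathbf r') = -\frac{1}{4\pi|\mathbf r - \mathbf r'|}\,e^{\pm ik|\mathbf r - \mathbf r'|} \tag{8}$$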
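One way to convince yourself of result (8) away from the singular point x = 0 is a direct numerical check that the outgoing solution satisfies the homogeneous equation there. The sketch below is an illustration under assumptions: the function names and the sample values of k and r are made up, and the Laplacian is approximated by a central finite difference using the radial form ∇²g = (1/r) d²(rg)/dr² valid for spherically symmetric g.

```python
import cmath
import math

def g0_plus(r, k):
    """Outgoing free-particle Green's function -exp(i k r) / (4 pi r)."""
    return -cmath.exp(1j * k * r) / (4.0 * math.pi * r)

def helmholtz_residual(g, r, k, h=1e-3):
    """Finite-difference estimate of (∇² + k²) g at a radius r > 0.

    For a spherically symmetric function, ∇² g = (1/r) d²(r g)/dr².
    """
    u = lambda s: s * g(s, k)
    second_derivative = (u(r + h) - 2.0 * u(r) + u(r - h)) / (h * h)
    return second_derivative / r + k * k * g(r, k)

k, r = 2.0, 1.5  # assumed sample values
residual = helmholtz_residual(g0_plus, r, k)
print(abs(residual), abs(k * k * g0_plus(r, k)))  # residual is tiny compared with k²|g|
```

The delta function on the right-hand side of (4) lives entirely at x = 0, which this check deliberately avoids.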

Return to the integral equation for Ψ(r)

We now insert the two possible Green’s functions into the integral equation (3) for Ψ(r),

obtaining

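$$\Psi^\pm_{\mathbf k}(\mathbf r) = \frac{e^{i\mathbf k\cdot\mathbf r}}{(2\pi)^{3/2}} - \frac{1}{4\pi}\int \frac{e^{\pm ik|\mathbf r - \mathbf r'|}}{|\mathbf r - \mathbf r'|}\,U(r')\,\Psi^\pm_{\mathbf k}(\mathbf r')\,d\mathbf r' \tag{9}$$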

Note that there are two different solutions, one for each of the Green’s functions. Also, note

that I now include a subscript k to make clear that these wave functions correspond to an

incoming plane wave with momentum k.

We now examine this in the asymptotic limit, namely when r → ∞. Then

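$$\begin{aligned}
|\mathbf r - \mathbf r'| &= \left(r^2 + r'^2 - 2\,\mathbf r\cdot\mathbf r'\right)^{1/2} \\
&= r\left(1 + \left(\frac{r'}{r}\right)^2 - \frac{2}{r}\,\hat{\mathbf r}\cdot\mathbf r'\right)^{1/2} \\
&\to r\left(1 - \frac{\hat{\mathbf r}}{r}\cdot\mathbf r'\right)
\end{aligned}$$

and

$$|\mathbf r - \mathbf r'|^{-1} \to r^{-1}$$

All told,

$$\frac{e^{\pm ik|\mathbf r - \mathbf r'|}}{|\mathbf r - \mathbf r'|} \to \frac{e^{\pm ikr}}{r}\,e^{\mp i\mathbf k'\cdot\mathbf r'}$$

where the vector $\mathbf k'$ is defined as

$$\mathbf k' = k\,\hat{\mathbf r}$$

i.e. it has the same magnitude as k but is in the direction of r.

Finally,

$$\Psi^\pm_{\mathbf k}(\mathbf r) \to \frac{e^{i\mathbf k\cdot\mathbf r}}{(2\pi)^{3/2}} - \frac{e^{\pm ikr}}{4\pi r}\int e^{\mp i\mathbf k'\cdot\mathbf r'}\,U(r')\,\Psi^\pm_{\mathbf k}(\mathbf r')\,d\mathbf r' \tag{10}$$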

Thus, the solution $\Psi^+_{\mathbf k}(\mathbf r)$, corresponding to the Green’s function $g_0^+(\mathbf r, \mathbf r')$, has the desired

behavior of an incident plane wave plus an outgoing scattered wave. It therefore represents

the desired continuum solution for a description of a physical elastic scattering process.

The other solution, Ψ−k(r), is another continuum solution at the same energy, but it is

not of direct relevance to a physical scattering process. However, we will indeed make use

of it later in our formal development of scattering theory.
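The large-r replacement |r − r′| → r − r̂·r′ used to reach (10) can be tested numerically. The following sketch is illustrative only: the observation direction and the point r′ are arbitrary assumed values, and the helper names are hypothetical.

```python
import math

def norm(v):
    return math.sqrt(sum(c * c for c in v))

def exact_distance(r_vec, rp_vec):
    """|r - r'| computed directly."""
    return norm([a - b for a, b in zip(r_vec, rp_vec)])

def asymptotic_distance(r_vec, rp_vec):
    """Leading large-r form r - r_hat · r'."""
    r = norm(r_vec)
    return r - sum(a * b for a, b in zip(r_vec, rp_vec)) / r

direction = (1.0 / 3.0, 2.0 / 3.0, 2.0 / 3.0)  # unit vector, assumed for illustration
r_prime = (0.3, -0.5, 0.8)                     # assumed point inside the potential region
errors = []
for r in (10.0, 100.0, 1000.0):
    r_vec = [r * c for c in direction]
    errors.append(abs(exact_distance(r_vec, r_prime) - asymptotic_distance(r_vec, r_prime)))
print(errors)  # the error falls off roughly like 1/r
```

Since the neglected term is of order r′²/r, the phase in the exponential is only reliable when k r′²/r is small, which is the practical meaning of "the detector is infinitely far from the target".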

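Equation (10) can be rewritten in the form

$$\Psi^\pm_{\mathbf k}(\mathbf r) \to \frac{1}{(2\pi)^{3/2}}\left[e^{i\mathbf k\cdot\mathbf r} + f^\pm_{\mathbf k}(\hat{\mathbf r})\,\frac{e^{\pm ikr}}{r}\right] \tag{11}$$

where

$$f^\pm_{\mathbf k}(\hat{\mathbf r}) = -\frac{(2\pi)^{3/2}}{4\pi}\int e^{\mp i\mathbf k'\cdot\mathbf r'}\,U(r')\,\Psi^\pm_{\mathbf k}(\mathbf r')\,d\mathbf r' \tag{12}$$

The coefficient $f^+_{\mathbf k}(\hat{\mathbf r})$ for the physical continuum solution is called the scattering amplitude. It is often just denoted $f_{\mathbf k}(\hat{\mathbf r})$ without the superscript.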

Modification of notation

It is useful to recast the equations we have obtained in terms of the true potential V (r)

rather than the scaled potential U(r). To do so, we note that

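$$V(r) = \frac{\hbar^2}{2\mu}\,U(r)$$

Also, we would like to express our equations in terms of a slightly different (free particle) Green’s function, defined by

$$(E - H_0)\,G_0(\mathbf r, \mathbf r') = \delta(\mathbf r - \mathbf r')$$

where

$$E = \frac{\hbar^2 k^2}{2\mu}$$

and

$$H_0 = -\frac{\hbar^2}{2\mu}\nabla^2$$

Then, clearly,

$$\begin{aligned}
G_0^\pm(\mathbf r, \mathbf r') &= \frac{2\mu}{\hbar^2}\,g_0^\pm(\mathbf r, \mathbf r') \\
&= -\frac{\mu}{2\pi\hbar^2}\,\frac{e^{\pm ik|\mathbf r - \mathbf r'|}}{|\mathbf r - \mathbf r'|}
\end{aligned} \tag{13}$$

The integral equation governing the scattering process can now be rewritten as

$$\Psi^\pm_{\mathbf k}(\mathbf r) = \frac{1}{(2\pi)^{3/2}}\,e^{i\mathbf k\cdot\mathbf r} + \int d\mathbf r'\,G_0^\pm(\mathbf r, \mathbf r')\,V(r')\,\Psi^\pm_{\mathbf k}(\mathbf r') \tag{14}$$

and the scattering amplitude as

$$f_{\mathbf k}(\hat{\mathbf r}) = f^+_{\mathbf k}(\hat{\mathbf r}) = -\frac{(2\pi)^{1/2}\mu}{\hbar^2}\int e^{-i\mathbf k'\cdot\mathbf r'}\,V(r')\,\Psi^+_{\mathbf k}(\mathbf r')\,d\mathbf r' \tag{15}$$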

Scattering of wave packets

We now return to a discussion of a real elastic scattering experiment, in which the incident

projectile and the target are appropriately described by wave packets and not by energy

eigenstates. We shall prove, however, that because of specific features of the wave packets, we

can neglect their spread in energy (or momentum) and describe real scattering experiments

in terms of the aforementioned $\Psi^+_{\mathbf k}(\mathbf r)$ continuum eigenstates of H.

In a real scattering experiment, the projectile is prepared at time t0 in the form of a

wave packet, centered about a point z0 and with some average momentum k0. A suitable

expression for the wave packet is

$$\Psi(\mathbf r, t_0) = A(\mathbf r - \mathbf z_0)\,e^{i\mathbf k_0\cdot\mathbf r} \tag{16}$$

A(r − z0) is a narrow envelope function that expresses its spatial localization about z0.

We shall define the zero of time as the time at which the projectile and target would

coincide were there no interaction. Then clearly t0 < 0 and furthermore

$$v_0 t_0 = -|\mathbf z_0| \tag{17}$$

where

$$\mathbf v_0 = \frac{\hbar\mathbf k_0}{\mu}$$

Let us now introduce a Fourier decomposition of the narrow envelope function,

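$$A(\mathbf r - \mathbf z_0) = \frac{1}{(2\pi)^{3/2}}\int a(\mathbf k)\,e^{i\mathbf k\cdot\mathbf r}\,d\mathbf k \tag{18}$$

The components in this decomposition are given by

$$a(\mathbf k) = \frac{1}{(2\pi)^{3/2}}\int e^{-i\mathbf k\cdot\mathbf r}\,A(\mathbf r - \mathbf z_0)\,d\mathbf r \tag{19}$$

We can also Fourier decompose $\Psi(\mathbf r, t_0)$. The coefficients of that expansion are

$$\begin{aligned}
\frac{1}{(2\pi)^{3/2}}\int e^{-i\mathbf k\cdot\mathbf r}\,\Psi(\mathbf r, t_0)\,d\mathbf r
&= \frac{1}{(2\pi)^{3/2}}\int e^{i(\mathbf k_0 - \mathbf k)\cdot\mathbf r}\,A(\mathbf r - \mathbf z_0)\,d\mathbf r \\
&= a(\mathbf k - \mathbf k_0)
\end{aligned} \tag{20}$$

where the last equality followed from (19). Thus,

$$\Psi(\mathbf r, t_0) = \frac{1}{(2\pi)^{3/2}}\int a(\mathbf k - \mathbf k_0)\,e^{i\mathbf k\cdot\mathbf r}\,d\mathbf k \tag{21}$$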

Obviously, the coefficients a(k− k0) in this Fourier expansion are only large if k− k0 ≈ 0.

In fact, the important values of momentum lie in a range

$$\Delta k \approx w^{-1}$$

where w is a typical spatial width over which significant changes in the envelope function

A(r − z0) occur.
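The statement that the Fourier coefficients are peaked at k ≈ k0 with a width of order 1/w can be illustrated in one dimension. This is only a sketch under stated assumptions: the Gaussian envelope A(x) = exp(−x²/2w²), the one-dimensional reduction of (20), and all names and sample numbers are choices made here, not taken from the notes.

```python
import cmath
import math

def fourier_coefficient(k, k0, w, z0=0.0, n=4000, span=10.0):
    """1D analogue of Eq. (20): (2π)^(-1/2) ∫ exp(-i k x) A(x - z0) exp(i k0 x) dx,
    evaluated by the midpoint rule for an assumed Gaussian envelope of width w."""
    lo, hi = z0 - span * w, z0 + span * w
    dx = (hi - lo) / n
    total = 0.0 + 0.0j
    for i in range(n):
        x = lo + (i + 0.5) * dx
        envelope = math.exp(-((x - z0) ** 2) / (2.0 * w * w))
        total += envelope * cmath.exp(1j * (k0 - k) * x) * dx
    return total / math.sqrt(2.0 * math.pi)

k0, w = 5.0, 2.0  # assumed central momentum and spatial width
for dk in (0.0, 1.0 / w, 5.0 / w):
    print(dk, abs(fourier_coefficient(k0 + dk, k0, w)))
```

For this envelope the magnitude falls off as w·exp(−(k − k0)²w²/2), so components more than a few times 1/w away from k0 are negligible.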

In principle, the wave function at any time t > t0 can be evaluated by acting with the

time evolution operator on the wave packet at time t0. Namely it can be evaluated as

$$\Psi(\mathbf r, t) = e^{-\frac{i}{\hbar}H(t - t_0)}\,\Psi(\mathbf r, t_0) \tag{22}$$

where H is the full hamiltonian of the system.

As a reminder, the Fourier decomposition (21) was an expansion of the initial wave packet

in terms of the free-particle eigenstates eik · r of H0. If we wish to evaluate the effect of the

full time evolution operator (which involves H, not H0) on the initial wave packet, it is

useful to express it instead as an expansion in terms of the eigenfunctions Ψk(r) of the full

hamiltonian H, i.e. the wave functions $\Psi^+_{\mathbf k}(\mathbf r)$ obtained earlier. So, let’s now see how we can

do this.

If we express

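$$\Psi(\mathbf r, t_0) = \int b(\mathbf k)\,\Psi^+_{\mathbf k}(\mathbf r)\,d\mathbf k \tag{23}$$

then

$$\begin{aligned}
b(\mathbf k) &= \int \Psi^{+*}_{\mathbf k}(\mathbf r)\,\Psi(\mathbf r, t_0)\,d\mathbf r \\
&= \frac{1}{(2\pi)^{3/2}}\int e^{-i\mathbf k\cdot\mathbf r}\,\Psi(\mathbf r, t_0)\,d\mathbf r - \frac{\mu}{2\pi\hbar^2}\int \frac{e^{-ik|\mathbf r - \mathbf r'|}}{|\mathbf r - \mathbf r'|}\,V(r')\,\Psi^{+*}_{\mathbf k}(\mathbf r')\,\Psi(\mathbf r, t_0)\,d\mathbf r\,d\mathbf r'
\end{aligned}$$

First term: Overlap with the incident plane wave.

$$\begin{aligned}
\frac{1}{(2\pi)^{3/2}}\int e^{-i\mathbf k\cdot\mathbf r}\,\Psi(\mathbf r, t_0)\,d\mathbf r
&= \frac{1}{(2\pi)^{3}}\int\!\!\int a(\mathbf k' - \mathbf k_0)\,e^{i(\mathbf k' - \mathbf k)\cdot\mathbf r}\,d\mathbf k'\,d\mathbf r \\
&= \int a(\mathbf k' - \mathbf k_0)\,\delta(\mathbf k' - \mathbf k)\,d\mathbf k' \\
&= a(\mathbf k - \mathbf k_0)
\end{aligned}$$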

Second term: Overlap with the outgoing scattered wave.

To evaluate the second term, we note first that Ψ(r, t0) is a highly localized wave packet,

which is only non-zero over a small region of space. Furthermore, at time t0 (when it was

prepared), the small region of space is very far (|z0| ≈ ∞) from the target.

Thus, the overlap between Ψ(r, t0) and the scattered wave can only be non-zero for r ≈ ∞

(z ≈ −∞). It thus suffices to look at the scattered wave in the asymptotic region, for which

we see from (11) that it behaves like

$$\frac{e^{ikr}}{(2\pi)^{3/2}\,r}\times f_{\mathbf k}(\hat{\mathbf r})$$

From this, we see that in the localized region of overlap with Ψ(r, t0), the scattered wave has

a well-defined momentum which is opposite in direction to the average incident momentum

k.

But for any reasonable wave packet, the range of important momenta ∆k is such that

$$\frac{\Delta k}{k_0} \ll 1$$

where k0 is the central momentum of the packet.

I claim therefore that since any such scattered wave has a small range of momenta which

are all opposite in direction and of comparable magnitude to the small range of momenta

of the plane wave components of the wave packet there can not be any significant overlap

between them. Put another way, the second term gives zero contribution.

Putting our results for the first and second terms together, we arrive at the important

conclusion that

b(k) ≈ a(k − k0) (24)

As a reminder, crucial to our reaching this conclusion were that

1. the wave packet is spatially sufficiently well localized so that at t = t0, it does not yet

feel the potential, i.e. it is in the asymptotic region, and

2. its range of momenta is small compared to its average momentum.

Both of these criteria are invariably realized in real scattering experiments.

Now we continue by inserting (24) into (23), thereby reexpressing the t = t0 wave packet

as

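$$\Psi(\mathbf r, t_0) = \int a(\mathbf k - \mathbf k_0)\,\Psi^+_{\mathbf k}(\mathbf r)\,d\mathbf k \tag{25}$$

This wave packet at subsequent times can be obtained by applying the time development operator,

$$\begin{aligned}
\Psi(\mathbf r, t) &= e^{-\frac{i}{\hbar}H(t - t_0)}\,\Psi(\mathbf r, t_0) \\
&= \int a(\mathbf k - \mathbf k_0)\,e^{-\frac{i}{\hbar}E_k(t - t_0)}\,\Psi^+_{\mathbf k}(\mathbf r)\,d\mathbf k
\end{aligned} \tag{26}$$

where

$$E_k = \frac{\hbar^2 k^2}{2\mu}$$

is the energy eigenvalue associated with the eigenfunction $\Psi^+_{\mathbf k}(\mathbf r)$.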

We are now interested in evaluating Ψ(r, t) in the asymptotic limit, i.e. at r → ∞, since

this is where the detector is located and thus experimental results can be obtained. Using

(11), this can be written as

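$$\Psi(\mathbf r, t) \to \frac{1}{(2\pi)^{3/2}}\int a(\mathbf k - \mathbf k_0)\,e^{-\frac{i}{\hbar}E_k(t - t_0)} \times \left[e^{i\mathbf k\cdot\mathbf r} + f_{\mathbf k}(\hat{\mathbf r})\,\frac{e^{ikr}}{r}\right] d\mathbf k \tag{27}$$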

To evaluate this integral, we again make use of the fact that a(k − k0) is only non-zero

for very small values of k − k0. We thus introduce a new variable q = k − k0 and expand

the various phase factors as power series in its magnitude q.

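$$\begin{aligned}
k^2 &= (\mathbf k_0 + \mathbf q)\cdot(\mathbf k_0 + \mathbf q) \\
&= k_0^2 + q^2 + 2\,\mathbf k_0\cdot\mathbf q \\
&\approx k_0^2 + 2\,\mathbf k_0\cdot\mathbf q
\end{aligned}$$

since q can only assume small values.

Likewise,

$$\begin{aligned}
k &= (k^2)^{1/2} \\
&\approx \left[k_0^2\left(1 + \frac{2\,\hat{\mathbf k}_0\cdot\mathbf q}{k_0}\right)\right]^{1/2} \\
&\approx k_0\left(1 + \frac{\hat{\mathbf k}_0\cdot\mathbf q}{k_0}\right) \\
&= k_0 + \hat{\mathbf k}_0\cdot\mathbf q
\end{aligned}$$

Finally,

$$\begin{aligned}
E_k &= \frac{\hbar^2}{2\mu}\,k^2 \\
&\approx \frac{\hbar^2}{2\mu}\,k_0^2 + \frac{\hbar^2}{\mu}\,\mathbf k_0\cdot\mathbf q
\end{aligned}$$

Defining

$$\omega_0 = \frac{E_{k_0}}{\hbar} = \frac{\hbar k_0^2}{2\mu}$$

and, as earlier,

$$\mathbf v_0 = \frac{\hbar\mathbf k_0}{\mu}$$

we find that

$$E_k \approx \hbar\omega_0 + \hbar\,\mathbf v_0\cdot\mathbf q$$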
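How good the linearization in q is can be seen with a few numbers. The sketch below is illustrative: the vectors chosen for k0 and q are arbitrary assumptions, and the helper names are hypothetical. It compares the exact magnitude |k0 + q| with the first-order form k0 + k̂0·q.

```python
import math

def exact_k(k0_vec, q_vec):
    """|k0 + q| computed directly."""
    return math.sqrt(sum((a + b) ** 2 for a, b in zip(k0_vec, q_vec)))

def linearized_k(k0_vec, q_vec):
    """First-order expansion k0 + k0_hat · q."""
    k0 = math.sqrt(sum(a * a for a in k0_vec))
    return k0 + sum(a * b for a, b in zip(k0_vec, q_vec)) / k0

k0_vec = (0.0, 0.0, 5.0)    # assumed central momentum along z
q_shape = (0.3, 0.4, 0.5)   # assumed direction of the small deviation q
errors = {}
for scale in (0.1, 0.01):
    q_vec = [scale * c for c in q_shape]
    errors[scale] = abs(exact_k(k0_vec, q_vec) - linearized_k(k0_vec, q_vec))
print(errors)  # shrinking q by 10 shrinks the error by about 100
```

The quadratic scaling of the error is what justifies dropping the q² terms when the packet satisfies Δk ≪ k0.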

In contrast to the phase factors, for which rapid variations with q are possible, we do

not expect (except under resonance conditions) that the scattering amplitude fk(r) should

vary much over the small range of important k values. Thus, we assume that over this small

range

$$f_{\mathbf k}(\hat{\mathbf r}) \approx f_{\mathbf k_0}(\hat{\mathbf r})$$

We can now plug all of these results into (27) leading to

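$$\begin{aligned}
\Psi(\mathbf r, t) \to{}& e^{-i\omega_0(t - t_0)} \times \left[\frac{1}{(2\pi)^{3/2}}\int a(\mathbf q)\,e^{i(\mathbf k_0 + \mathbf q)\cdot\mathbf r - i\mathbf q\cdot\mathbf v_0(t - t_0)}\,d\mathbf q \right. \\
&\left. {}+ \frac{f_{\mathbf k_0}(\hat{\mathbf r})}{(2\pi)^{3/2}\,r}\int a(\mathbf q)\,e^{ik_0 r + i\mathbf q\cdot\hat{\mathbf k}_0 r - i\mathbf q\cdot\mathbf v_0(t - t_0)}\,d\mathbf q\right] \\
={}& e^{-i\omega_0\Delta t}\left[\frac{e^{i\mathbf k_0\cdot\mathbf r}}{(2\pi)^{3/2}}\int a(\mathbf q)\,e^{i\mathbf q\cdot(\mathbf r - \mathbf v_0\Delta t)}\,d\mathbf q \right. \\
&\left. {}+ \frac{f_{\mathbf k_0}(\hat{\mathbf r})\,e^{ik_0 r}}{(2\pi)^{3/2}\,r}\int a(\mathbf q)\,e^{i\mathbf q\cdot(\hat{\mathbf k}_0 r - \mathbf v_0\Delta t)}\,d\mathbf q\right]
\end{aligned} \tag{28}$$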

where I have now introduced ∆t = t− t0.

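Comparing the two integrals in (28) with the integral in (18) for the Fourier transformation of the envelope function, we see that each integral, together with its prefactor $(2\pi)^{-3/2}$, is just the envelope function evaluated at a shifted argument. We can therefore rewrite (28) as

$$\Psi(\mathbf r, t) \to e^{-i\omega_0\Delta t}\left[e^{i\mathbf k_0\cdot\mathbf r}\,A(\mathbf r - \mathbf v_0\Delta t - \mathbf z_0) + f_{\mathbf k_0}(\hat{\mathbf r})\,\frac{e^{ik_0 r}}{r}\,A(\hat{\mathbf k}_0 r - \mathbf v_0\Delta t - \mathbf z_0)\right] \tag{29}$$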

But

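$$\begin{aligned}
\mathbf v_0\Delta t + \mathbf z_0 &= \mathbf v_0 t - \mathbf v_0 t_0 + \mathbf z_0 \\
&= \mathbf v_0 t - \hat{\mathbf v}_0\,v_0 t_0 + \mathbf z_0 \\
&= \mathbf v_0 t - \hat{\mathbf v}_0\,v_0 t_0 - |\mathbf z_0|\,\hat{\mathbf v}_0 \\
&= \mathbf v_0 t - \hat{\mathbf v}_0\left(v_0 t_0 + |\mathbf z_0|\right) \\
&= \mathbf v_0 t
\end{aligned}$$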

where the last equality follows from (17).

Thus, after all this work, we find that

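$$\Psi(\mathbf r, t) \to e^{-i\omega_0(t - t_0)}\left[e^{i\mathbf k_0\cdot\mathbf r}\,A(\mathbf r - \mathbf v_0 t) + f_{\mathbf k_0}(\hat{\mathbf r})\,\frac{e^{ik_0 r}}{r}\,A(\hat{\mathbf k}_0 r - \mathbf v_0 t)\right] \tag{30}$$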

The physical interpretation of the two terms in (30) is straightforward.

• The first term is just the ongoing incident wave packet; its center moves classically

with velocity v0 and its shape does not change.

• The second term is also a wave packet in the form of a spherical shell of flux moving

outward radially with velocity v0. This spherical wave packet only exists for t ≥ 0, i.e.

after the projectile and the target come close together.

Cross Sections

Now that we have the full wave functions in the asymptotic limit for all times t > 0, we

can determine the experimentally meaningful differential cross section for elastic scattering.

Remember that the concept of a differential cross section was introduced at the beginning

of the lectures on scattering, on pages 3-4 of these notes. The differential cross section can

be expressed verbally as

$$\frac{d\sigma}{d\Omega} = \frac{\text{outgoing flux per unit solid angle}}{\text{incident flux per unit area}}$$

Both the incident and outgoing fluxes can be obtained from the associated probability currents.

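Let

$$\begin{aligned}
\mathbf j_{in}(\mathbf r, t) &= \text{prob. current associated with the incident wave packet} \\
&= \mathrm{Re}\left[\Psi^*_{in}(\mathbf r, t)\,\frac{\hbar}{\mu i}\,\nabla\Psi_{in}(\mathbf r, t)\right]
\end{aligned}$$

But

$$\begin{aligned}
\nabla\Psi_{in} &= \nabla\,\frac{1}{(2\pi)^{3/2}}\int a(\mathbf k - \mathbf k_0)\,e^{i\mathbf k\cdot\mathbf r}\,d\mathbf k \\
&= \frac{1}{(2\pi)^{3/2}}\int i\mathbf k\,a(\mathbf k - \mathbf k_0)\,e^{i\mathbf k\cdot\mathbf r}\,d\mathbf k
\end{aligned}$$

As before, we let $\mathbf k - \mathbf k_0 = \mathbf q$ and keep only the lowest terms in a series expansion in powers of the small variable q. This gives

$$\nabla\Psi_{in}(\mathbf r, t) \approx i\mathbf k_0\,\Psi_{in}(\mathbf r, t)$$

Thus,

$$\begin{aligned}
\mathbf j_{in}(\mathbf r, t) &= \frac{\hbar}{\mu}\,\mathbf k_0\,\Psi^*_{in}(\mathbf r, t)\,\Psi_{in}(\mathbf r, t) \\
&= \frac{\hbar\mathbf k_0}{\mu}\,|A(\mathbf r - \mathbf v_0 t)|^2 \\
&= \mathbf v_0\,|A(\mathbf r - \mathbf v_0 t)|^2
\end{aligned} \tag{31}$$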

The total incoming flux passing the target is

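$$F_{in} = \int_{t_0}^{\infty} \hat{\mathbf k}_0\cdot\mathbf j_{in}(\mathbf r = 0, t)\,dt$$

where I’ve used the fact that the wave packet is prepared at time t0 and that the target is by definition at r = 0. Thus,

$$F_{in} = v_0\int_{t_0}^{\infty} |A(-\mathbf v_0 t)|^2\,dt$$

Letting

$$\xi = -v_0 t$$

so that

$$d\xi = -v_0\,dt$$

and noting that

$$\mathbf v_0 = v_0\,\hat{\mathbf k}_0$$

we obtain that

$$F_{in} = -\int_{-v_0 t_0}^{-\infty} |A(\xi\hat{\mathbf k}_0)|^2\,d\xi$$

But from (17)

$$|\mathbf z_0| = -v_0 t_0$$

so that

$$\begin{aligned}
F_{in} &= -\int_{|\mathbf z_0|}^{-\infty} |A(\xi\hat{\mathbf k}_0)|^2\,d\xi \\
&= \int_{-\infty}^{|\mathbf z_0|} |A(\xi\hat{\mathbf k}_0)|^2\,d\xi
\end{aligned}$$

Finally since |z0| is very large compared to the length of the packet, we can replace |z0| by ∞ in the upper limit of the integral. Thus, finally

$$F_{in} \approx \int_{-\infty}^{\infty} |A(\xi\hat{\mathbf k}_0)|^2\,d\xi \tag{32}$$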

Next we let

jout(r, t) = prob. current associated with the scattered wave packet

An analogous treatment in which we only keep the lowest order term in q gives

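$$\mathbf j_{out}(\mathbf r, t) = v_0\,\hat{\mathbf r}\,\frac{|f_{\mathbf k_0}(\hat{\mathbf r})|^2}{r^2}\,|A(\hat{\mathbf k}_0 r - \mathbf v_0 t)|^2$$

Since the area subtended by the solid angle dΩ at a distance r is $r^2 d\Omega$, the total radial flux coming to the detector is

$$\begin{aligned}
F_{out} &= \lim_{r\to\infty}\int_0^{\infty} \hat{\mathbf r}\cdot r^2\,\mathbf j_{out}(\mathbf r, t)\,dt \\
&= \lim_{r\to\infty} v_0\,|f_{\mathbf k_0}(\hat{\mathbf r})|^2\int_0^{\infty} |A(\hat{\mathbf k}_0 r - \mathbf v_0 t)|^2\,dt
\end{aligned}$$

Letting

$$\xi = r - v_0 t$$

so that

$$d\xi = -v_0\,dt$$

we obtain

$$\begin{aligned}
F_{out} &= -|f_{\mathbf k_0}(\hat{\mathbf r})|^2\lim_{r\to\infty}\int_r^{-\infty} |A(\xi\hat{\mathbf k}_0)|^2\,d\xi \\
&= -|f_{\mathbf k_0}(\hat{\mathbf r})|^2\int_{\infty}^{-\infty} |A(\xi\hat{\mathbf k}_0)|^2\,d\xi \\
&= |f_{\mathbf k_0}(\hat{\mathbf r})|^2\int_{-\infty}^{\infty} |A(\xi\hat{\mathbf k}_0)|^2\,d\xi
\end{aligned} \tag{33}$$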

Finally,

$$\frac{d\sigma}{d\Omega} = \frac{F_{out}}{F_{in}} = |f_{\mathbf k_0}(\hat{\mathbf r})|^2 \tag{34}$$

We see therefore that all wave packet aspects cancel out and the differential cross section

is given solely by the scattering amplitude associated with the “average” energy (or momentum).

Thus, to describe an elastic scattering process we can disregard the wave packet

features and merely study the time-independent energy eigenstate with this average energy.

As a reminder, this depended on the fact that the scattering amplitude did not vary

rapidly over the momenta contained in the wave packet, which is only true as long as we are

not looking at a resonant scattering process.
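The cancellation of the wave packet in the ratio (34) can be illustrated numerically along the beam axis. The sketch below makes several assumptions that are not in the notes: a Gaussian envelope with |A(ξ)|² = exp(−ξ²/w²), a fixed value of |f| for the chosen direction, finite integration limits in place of infinity, and hypothetical function names and sample numbers.

```python
import math

def envelope_sq(xi, w):
    """|A(ξ)|² along the beam axis for an assumed Gaussian envelope of width w."""
    return math.exp(-((xi / w) ** 2))

def time_integrated_flux(offset, v0, w, t_start, t_end, n=20000):
    """v0 ∫ |A(offset - v0 t)|² dt by the midpoint rule."""
    dt = (t_end - t_start) / n
    total = 0.0
    for i in range(n):
        t = t_start + (i + 0.5) * dt
        total += envelope_sq(offset - v0 * t, w) * dt
    return v0 * total

def flux_ratio(f_abs, v0, w, z0, r):
    """F_out / F_in for a packet starting a distance z0 upstream, detector at radius r."""
    t0 = -z0 / v0
    # Incident flux through the target position, from preparation until long after passage.
    f_in = time_integrated_flux(0.0, v0, w, t0, -t0)
    # Outgoing flux through the detector, from t = 0 until long after the shell has passed.
    f_out = f_abs ** 2 * time_integrated_flux(r, v0, w, 0.0, 2.0 * r / v0)
    return f_out / f_in

print(flux_ratio(0.7, 3.0, 1.5, 75.0, 90.0))  # close to |f|² = 0.49
```

Changing v0 or w leaves the ratio unchanged as long as z0 and r remain large compared with w, which is the content of (34).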

The lab versus the CM frame

We have been discussing two-particle elastic scattering problems. As we know, the two-

particle Schrodinger equation can be reduced to an effective one-particle problem, with the

“particle” having the reduced mass of the two-particle system. And this is precisely what

we did. But of necessity, this means that we have been working in the center-of-mass (CM)

frame of the two-particle system. Thus, the information we have obtained referred to the

CM system, including for example the scattering amplitude. Thus, the differential cross

section we would obtain from it, using (34), is likewise in the CM frame, and is a function

of the CM scattering angles.

Experiments, however, are carried out in the laboratory frame of reference. As discussed

on pages 1-2, we have a projectile incident on a “stationary” target. The measured cross

sections will be in this frame of reference, as a function of the laboratory angles. How can we

obtain laboratory cross sections theoretically from the simpler-to-obtain CM cross sections?

I would now like to discuss this briefly.

Consider a particle of mass m1 and velocity v1 incident on a particle of mass m2, initially

at rest in the lab. Schematically this is illustrated in Fig. 2.

What is the velocity of the CM of this system? Denoting this as V, we see that

$$(m_1 + m_2)\,V = m_1 v_1$$

or that

$$V = \frac{m_1 v_1}{m_1 + m_2}$$

[FIG. 2: Schematic illustration of elastic scattering kinematics in the lab frame. Labels: incident particle m1 with velocity v1; target m2 with v2 = 0; outgoing particle m1 with velocity v′1 at angles (θlab, ϕlab); outgoing particle m2 with velocity v′2]

Clearly, then, particle 1 is moving towards the CM with a velocity

$$V_1 = v_1 - V = \frac{m_2 v_1}{m_1 + m_2}$$

whereas particle 2 is moving towards the CM with velocity

$$V_2 = V = \frac{m_1 v_1}{m_1 + m_2}$$

After an elastic collision, the particles go off in the lab frame as also shown in Fig. 2. In

the CM frame, however, they go off in opposite directions and with the same velocities as

before the collision. This is shown schematically in Fig. 3.

[FIG. 3: Schematic illustration of elastic scattering kinematics in the CM frame. Labels: particle m1 with speed m2v1/(m1+m2) and particle m2 with speed m1v1/(m1+m2), before and after the collision; scattering angles (θCM, ϕCM)]

Let’s now obtain a relationship between the scattering angles in the lab frame (θlab, ϕlab) and those in the CM frame (θCM, ϕCM). To do this, let’s focus on the velocity of particle 1 in the lab frame, after the collision, which we denote v′1. As shown in Fig. 4 it is given by the vector sum of its outgoing velocity in the CM frame V1 and the velocity of the CM (which is always V).

We are assuming that the incident projectile is moving along the z-axis. Thus, the ϕ

angles are irrelevant and indeed

ϕCM = ϕlab

Only the θ angles change under transformation between the two frames.

From Fig. 4, we see further that

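$$V + V_1\cos\theta_{CM} = v'_1\cos\theta_{lab} \tag{35}$$

and

$$V_1\sin\theta_{CM} = v'_1\sin\theta_{lab} \tag{36}$$

Dividing (36) by (35) eliminates v′1, yielding

$$\tan\theta_{lab} = \frac{V_1\sin\theta_{CM}}{V + V_1\cos\theta_{CM}} = \frac{\sin\theta_{CM}}{\gamma + \cos\theta_{CM}} \tag{37}$$

where

$$\gamma = \frac{V}{V_1} = \frac{m_1}{m_2}$$

as can be readily shown from our earlier relations for V and V1.

[FIG. 4: Schematic illustration of the velocity of particle 1. Labels: v′1, V, V1, the angles θCM and θlab, and the z-axis]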

Note that in the limit m2 = ∞, this reduces to θlab = θCM , as it must. The two frames

are identical when the target is infinitely massive.
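Relation (37) is easy to evaluate numerically. The following is a small sketch, with a hypothetical function name; using `atan2` rather than a plain arctangent is a choice made here so that θlab stays in the range 0 to π even when γ + cos θCM is negative.

```python
import math

def theta_lab(theta_cm, gamma):
    """Lab polar angle from Eq. (37): tan θ_lab = sin θ_CM / (γ + cos θ_CM).

    gamma is the mass ratio m1/m2. atan2 picks the branch with θ_lab in [0, π].
    """
    return math.atan2(math.sin(theta_cm), gamma + math.cos(theta_cm))

theta_cm = math.radians(60.0)
print(math.degrees(theta_lab(theta_cm, 0.0)))  # infinitely heavy target: lab and CM angles coincide
print(math.degrees(theta_lab(theta_cm, 1.0)))  # equal masses: the lab angle is half the CM angle
```

For equal masses the half-angle result follows from the identity tan(θ/2) = sin θ / (1 + cos θ), and it implies that particle 1 can never be scattered beyond 90° in the lab.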

Now that we know how to relate scattering angles, let’s turn to differential cross sections.

Clearly the number of particles scattered into a solid angle dΩlab around (θlab, ϕlab) must

be identical to the number scattered into the corresponding dΩCM around (θCM , ϕCM).

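Mathematically,

$$\left(\frac{d\sigma}{d\Omega}\right)_{lab}\sin\theta_{lab}\,d\theta_{lab}\,d\phi_{lab} = \left(\frac{d\sigma}{d\Omega}\right)_{CM}\sin\theta_{CM}\,d\theta_{CM}\,d\phi_{CM}$$

What we now want to do is to relate $\left(\frac{d\sigma}{d\Omega}\right)_{lab}$ to $\left(\frac{d\sigma}{d\Omega}\right)_{CM}$. From the previous expression, we see that

$$\left(\frac{d\sigma}{d\Omega}\right)_{lab} = \frac{\sin\theta_{CM}\,d\theta_{CM}}{\sin\theta_{lab}\,d\theta_{lab}}\left(\frac{d\sigma}{d\Omega}\right)_{CM} \tag{38}$$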

where I’ve made use of the fact that dϕCM = dϕlab, since ϕCM = ϕlab.

To get the needed ratio, we return to (37)

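$$\tan\theta_{lab} = \frac{\sin\theta_{CM}}{\gamma + \cos\theta_{CM}}$$

which we can rewrite as

$$\frac{\cos\theta_{lab}}{\sin\theta_{lab}} = \frac{\gamma + \cos\theta_{CM}}{\sin\theta_{CM}}$$

Let’s define the right hand side of this equation to be A, viz:

$$A = \frac{\gamma + \cos\theta_{CM}}{\sin\theta_{CM}}$$

Then

$$\frac{\cos^2\theta_{lab}}{\sin^2\theta_{lab}} = A^2$$

which can be readily solved for cos θlab. The result is

$$\cos\theta_{lab} = \frac{A}{\sqrt{1 + A^2}}$$

Taking differentials, we find that

$$-\sin\theta_{lab}\,d\theta_{lab} = \frac{d}{dA}\left[\frac{A}{\sqrt{1 + A^2}}\right]\frac{dA}{d\theta_{CM}}\,d\theta_{CM}$$

The two derivatives that enter are

$$\frac{d}{dA}\left[\frac{A}{\sqrt{1 + A^2}}\right] = \frac{1}{\sqrt{1 + A^2}} - \frac{1}{2}\,\frac{2A^2}{(1 + A^2)^{3/2}} = \frac{1}{(1 + A^2)^{3/2}}$$

and

$$\begin{aligned}
\frac{dA}{d\theta_{CM}} &= -\frac{1}{\sin\theta_{CM}}\,\sin\theta_{CM} - \frac{\gamma + \cos\theta_{CM}}{\sin^2\theta_{CM}}\,\cos\theta_{CM} \\
&= -\left[\frac{1}{\sin\theta_{CM}} + \frac{(\gamma + \cos\theta_{CM})\cos\theta_{CM}}{\sin^3\theta_{CM}}\right]\sin\theta_{CM} \\
&= -\frac{1 + \gamma\cos\theta_{CM}}{\sin^2\theta_{CM}}
\end{aligned}$$

Note also that

$$1 + A^2 = \frac{1 + \gamma^2 + 2\gamma\cos\theta_{CM}}{\sin^2\theta_{CM}}$$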

Putting this all together, we finally arrive at the ratio needed for (38)

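$$\frac{\sin\theta_{CM}\,d\theta_{CM}}{\sin\theta_{lab}\,d\theta_{lab}} = \frac{\left(1 + \gamma^2 + 2\gamma\cos\theta_{CM}\right)^{3/2}}{1 + \gamma\cos\theta_{CM}}$$

so that

$$\left(\frac{d\sigma}{d\Omega}\right)_{lab} = \frac{\left(1 + \gamma^2 + 2\gamma\cos\theta_{CM}\right)^{3/2}}{1 + \gamma\cos\theta_{CM}}\left(\frac{d\sigma}{d\Omega}\right)_{CM} \tag{39}$$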

Thus, if we calculate a cross section in the CM frame by solving the reduced-mass

Schrodinger equation, we can then use (39) to calculate from it the corresponding lab cross

section, as needed to make contact with experiment.
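The conversion factor in (39) can be cross-checked against a direct numerical derivative d(cos θCM)/d(cos θlab). This sketch is an illustration only: the function names, step size and sample values are assumptions, and it is restricted to γ < 1, where 1 + γ cos θCM stays positive.

```python
import math

def lab_over_cm(theta_cm, gamma):
    """Factor in Eq. (39): (1 + γ² + 2γ cos θ_CM)^(3/2) / (1 + γ cos θ_CM)."""
    c = math.cos(theta_cm)
    return (1.0 + gamma * gamma + 2.0 * gamma * c) ** 1.5 / (1.0 + gamma * c)

def cos_theta_lab(theta_cm, gamma):
    """cos θ_lab = A / sqrt(1 + A²) written out: (γ + cos θ_CM) / sqrt(1 + γ² + 2γ cos θ_CM)."""
    c = math.cos(theta_cm)
    return (gamma + c) / math.sqrt(1.0 + gamma * gamma + 2.0 * gamma * c)

def numerical_factor(theta_cm, gamma, h=1e-5):
    """d(cos θ_CM) / d(cos θ_lab) by central differences in θ_CM."""
    d_cos_cm = math.cos(theta_cm + h) - math.cos(theta_cm - h)
    d_cos_lab = cos_theta_lab(theta_cm + h, gamma) - cos_theta_lab(theta_cm - h, gamma)
    return d_cos_cm / d_cos_lab

theta, gamma = 1.0, 0.5  # assumed sample values (θ_CM in radians, γ = m1/m2)
print(lab_over_cm(theta, gamma), numerical_factor(theta, gamma))
```

To convert a computed CM cross section, multiply it by `lab_over_cm` and report the result at the lab angle given by (37).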

More formal aspects of scattering theory

Return to the time-independent Schrodinger equation

We showed in the last few lectures that by analyzing appropriate solutions of the time-

independent Schrodinger equation we can describe elastic scattering processes. We also

showed that by introducing the free-particle Green’s function we could obtain an integral

equation for these continuum solutions that incorporated the necessary asymptotic conditions.

We shall now discuss how to cast the equations of elastic scattering into Dirac notation.

This will facilitate subsequent generalization and analysis of the equations.

The Green’s operator

As a reminder, the free-particle Green’s functions are defined so as to satisfy the equation

$$(E - H_0)\,G_0(\mathbf r, \mathbf r') = \delta(\mathbf r - \mathbf r') \tag{40}$$

where

$$H_0 = -\frac{\hbar^2}{2\mu}\nabla^2$$

is the free-particle hamiltonian.

There are two solutions, called $G_0^\pm(\mathbf r, \mathbf r')$. The one with a plus superscript leads to

an outgoing spherical wave in the continuum solution of the Time Independent Schrodinger

Equation and the one with a minus superscript leads to a solution with an incoming spherical

wave.

Now let’s consider these two Green’s functions as the coordinate-space representations of

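operators $G_0^\pm$ in abstract Hilbert space, viz:

$$G_0^\pm(\mathbf r, \mathbf r') = \langle\mathbf r|\,G_0^\pm\,|\mathbf r'\rangle$$

I now claim that

$$G_0^\pm = \lim_{\eta\to 0}\,(E - H_0 \pm i\eta)^{-1}$$

Let’s now see how we can prove this. Since $G_0^\pm$, as defined above, are operators in Hilbert space, we can evaluate their matrix elements in any representation. Let’s do so in momentum representation, i.e.,

$$\langle\mathbf q|\,G_0^\pm\,|\mathbf q'\rangle = \lim_{\eta\to 0}\,\langle\mathbf q|\,(E - H_0 \pm i\eta)^{-1}\,|\mathbf q'\rangle$$

But

$$H_0\,|\mathbf q'\rangle = \frac{\hbar^2 (q')^2}{2\mu}\,|\mathbf q'\rangle$$

and

$$E = \frac{\hbar^2 k^2}{2\mu}$$

Thus,

$$\langle\mathbf q|\,G_0^\pm\,|\mathbf q'\rangle = \lim_{\eta\to 0}\,\frac{1}{\dfrac{\hbar^2 k^2}{2\mu} - \dfrac{\hbar^2 q^2}{2\mu} \pm i\eta}\,\delta(\mathbf q - \mathbf q')$$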

Knowing this, let’s now look at the matrix representation of the operators $G_0^{\pm}$ in coordinate representation. We will indeed see that it is precisely the $G_0^{\pm}(\mathbf r,\mathbf r')$ that we derived earlier. This will then prove our assertion.

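The coordinate representation of the operators $G_0^{\pm}$ can be expressed as

$$\langle \mathbf r|G_0^{\pm}|\mathbf r'\rangle = \int\!\!\int d\mathbf q\,d\mathbf q'\,\langle \mathbf r|\mathbf q\rangle\,\langle \mathbf q|G_0^{\pm}|\mathbf q'\rangle\,\langle \mathbf q'|\mathbf r'\rangle$$

where all I’ve done is to insert two identity operators,

$$I = \int d\mathbf q\,|\mathbf q\rangle\langle \mathbf q| \quad\text{and}\quad I = \int d\mathbf q'\,|\mathbf q'\rangle\langle \mathbf q'|$$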

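But we know all of the matrix elements and overlaps in the integrand. Putting them in, we get

$$\langle \mathbf r|G_0^{\pm}|\mathbf r'\rangle = \lim_{\eta\to 0}\int\!\!\int d\mathbf q\,d\mathbf q'\,\frac{e^{i\mathbf q\cdot\mathbf r}}{(2\pi)^{3/2}}\,\frac{1}{\frac{\hbar^2k^2}{2\mu} - \frac{\hbar^2q^2}{2\mu} \pm i\eta}\,\delta(\mathbf q - \mathbf q')\,\frac{e^{-i\mathbf q'\cdot\mathbf r'}}{(2\pi)^{3/2}}$$

$$= \frac{1}{(2\pi)^3}\lim_{\eta\to 0}\int d\mathbf q\,\frac{e^{i\mathbf q\cdot(\mathbf r-\mathbf r')}}{\frac{\hbar^2k^2}{2\mu} - \frac{\hbar^2q^2}{2\mu} \pm i\eta}$$

$$= \frac{2\mu}{\hbar^2}\,\frac{1}{(2\pi)^3}\lim_{\epsilon\to 0}\int d\mathbf q\,\frac{e^{i\mathbf q\cdot(\mathbf r-\mathbf r')}}{k^2 - q^2 \pm i\epsilon}$$

where

$$\epsilon = \frac{2\mu}{\hbar^2}\,\eta$$

But on page 8, we showed that

$$g_0^{\pm}(\mathbf r,\mathbf r') = \frac{1}{(2\pi)^3}\lim_{\epsilon\to 0}\int d\mathbf q\,\frac{e^{i\mathbf q\cdot(\mathbf r-\mathbf r')}}{k^2 - q^2 \pm i\epsilon}$$

Using this and (13), we see that

$$\langle \mathbf r|G_0^{\pm}|\mathbf r'\rangle = \frac{2\mu}{\hbar^2}\,g_0^{\pm}(\mathbf r,\mathbf r') = G_0^{\pm}(\mathbf r,\mathbf r')$$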

Thus, we have proven that these Green’s functions are indeed just the coordinate space representation of the operators

$$G_0^{\pm} = \lim_{\eta\to 0}\,(E - H_0 \pm i\eta)^{-1}$$

as advertised.

Having now found the explicit form of these so-called Green’s operators, let’s use them. In analogy with our earlier discussion, $G_0^{+}$ is the particularly interesting one, as it is the one whose coordinate representation produces scattering wave functions with outgoing spherical waves.

A quite useful form for this Green’s operator can be obtained by again inserting an identity operator $I = \int d\mathbf q\,|\mathbf q\rangle\langle \mathbf q|$, whence

$$G_0^{+} = \lim_{\eta\to 0}\int d\mathbf q\,\frac{|\mathbf q\rangle\langle \mathbf q|}{E - \frac{\hbar^2q^2}{2\mu} + i\eta}$$

The denominator is now just a complex scalar and not the inverse of an operator.

The integral equation in Dirac notation

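The physical scattering wave function $\Psi_{\mathbf k}^{+}(\mathbf r)$ was seen earlier to be a solution of the integral equation

$$\Psi_{\mathbf k}^{+}(\mathbf r) = \Phi_{\mathbf k}(\mathbf r) + \int d\mathbf r'\,G_0^{+}(\mathbf r,\mathbf r')\,V(\mathbf r')\,\Psi_{\mathbf k}^{+}(\mathbf r')$$

In Dirac notation this becomes

$$\langle \mathbf r|\Psi_{\mathbf k}^{+}\rangle = \langle \mathbf r|\Phi_{\mathbf k}\rangle + \int d\mathbf r'\,\langle \mathbf r|G_0^{+}|\mathbf r'\rangle\langle \mathbf r'|V|\Psi_{\mathbf k}^{+}\rangle = \langle \mathbf r|\Phi_{\mathbf k}\rangle + \langle \mathbf r|G_0^{+}V|\Psi_{\mathbf k}^{+}\rangle$$

Thus,

$$|\Psi_{\mathbf k}^{+}\rangle = |\Phi_{\mathbf k}\rangle + G_0^{+}V|\Psi_{\mathbf k}^{+}\rangle \qquad (41)$$

This integral equation in Hilbert space is very convenient for formal manipulations related to scattering. It is known as the Lippmann-Schwinger equation.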

The transition matrix (or T matrix)


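The scattering amplitude for a system with incident relative momentum $\mathbf k_a$ was given in eq. (12) in terms of the scaled potential $U$ as

$$f_{\mathbf k_a}(\hat{\mathbf r}) = -\frac{(2\pi)^{3/2}}{4\pi}\int e^{-i\mathbf k_b\cdot\mathbf r'}\,U(\mathbf r')\,\Psi_{\mathbf k_a}^{+}(\mathbf r')\,d\mathbf r'$$

This can be rewritten in terms of the full potential $V$ as

$$f_{\mathbf k_a}(\hat{\mathbf r}) = -\frac{\sqrt{2\pi}\,\mu}{\hbar^2}\int e^{-i\mathbf k_b\cdot\mathbf r'}\,V(\mathbf r')\,\Psi_{\mathbf k_a}^{+}(\mathbf r')\,d\mathbf r'$$

As a reminder, $\mathbf k_b = k_a\hat{\mathbf r}$, i.e. it has the same magnitude as $\mathbf k_a$ but is pointed along $\hat{\mathbf r}$.

The differential cross section can be obtained from the scattering amplitude by taking the absolute magnitude squared, i.e.

$$\frac{d\sigma}{d\Omega} = |f_{\mathbf k_a}(\hat{\mathbf r})|^2 = \frac{2\pi\mu^2}{\hbar^4}\left|\int e^{-i\mathbf k_b\cdot\mathbf r'}\,V(\mathbf r')\,\Psi_{\mathbf k_a}^{+}(\mathbf r')\,d\mathbf r'\right|^2 \qquad (42)$$

As a reminder, this is the differential cross section in the CM system.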

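Since

$$\frac{1}{(2\pi)^{3/2}}\,e^{i\mathbf k\cdot\mathbf r}$$

is just the coordinate-space eigenfunction of $H_0$ with momentum $\mathbf k$, we denote

$$\frac{1}{(2\pi)^{3/2}}\,e^{i\mathbf k\cdot\mathbf r} = \langle \mathbf r|\mathbf k\rangle$$

so that

$$e^{-i\mathbf k_b\cdot\mathbf r'} = (2\pi)^{3/2}\,\langle \mathbf k_b|\mathbf r'\rangle$$

Similarly, we can write $\Psi_{\mathbf k_a}^{+}(\mathbf r')$ in Dirac notation as

$$\Psi_{\mathbf k_a}^{+}(\mathbf r') = \langle \mathbf r'|\Psi_a^{+}\rangle$$

where I’ve used the notation $|\Psi_a^{+}\rangle$ to denote the scattering state associated with incoming momentum $\mathbf k_a$.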

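Thus, equation (42) for the differential cross section can be rewritten as

$$\frac{d\sigma}{d\Omega} = \frac{(2\pi)^4\mu^2}{\hbar^4}\left|\int d\mathbf r'\,\langle \mathbf k_b|\mathbf r'\rangle\langle \mathbf r'|V|\Psi_a^{+}\rangle\right|^2 = \frac{(2\pi)^4\mu^2}{\hbar^4}\left|\langle \mathbf k_b|V|\Psi_a^{+}\rangle\right|^2$$

Definition: The transition operator $T$ is defined by the equation

$$T|\mathbf k_a\rangle = V|\Psi_a^{+}\rangle \qquad (43)$$

In terms of this new operator

$$\frac{d\sigma}{d\Omega} = \frac{(2\pi)^4\mu^2}{\hbar^4}\left|\langle \mathbf k_b|T|\mathbf k_a\rangle\right|^2 \qquad (44)$$

The quantities $\langle \mathbf k_b|T|\mathbf k_a\rangle$ are called transition matrix elements or simply T matrix elements.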

Some properties of the transition operator T

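(1) $T = V + V G_0^{+} T$

Proof:

From equation (41),

$$|\Psi_a^{+}\rangle = |\mathbf k_a\rangle + G_0^{+}V|\Psi_a^{+}\rangle$$

Thus,

$$\langle \mathbf k_b|V|\Psi_a^{+}\rangle = \langle \mathbf k_b|V|\mathbf k_a\rangle + \langle \mathbf k_b|V G_0^{+} V|\Psi_a^{+}\rangle$$

Inserting (43) twice, we then obtain

$$\langle \mathbf k_b|T|\mathbf k_a\rangle = \langle \mathbf k_b|V|\mathbf k_a\rangle + \langle \mathbf k_b|V G_0^{+} T|\mathbf k_a\rangle$$

From this we confirm that

$$T = V + V G_0^{+} T$$

(2) Define $G^{\pm} = \lim_{\eta\to 0}\,(E - H \pm i\eta)^{-1}$, where $H$ is the full hamiltonian (i.e. $H = H_0 + V$).

Then $T = V + V G^{+} V$.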

You will be asked to prove this in a homework problem.

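(3) Consider two eigenvectors $\mathbf k_a$ and $\mathbf k_b$ of $H_0$ with the same eigenvalue $E$ (thus $|\mathbf k_a| = |\mathbf k_b|$). Then

$$T_{ab} - T^{\dagger}_{ab} = -2\pi i\int d\mathbf k_n\,T_{an}\,T^{\dagger}_{nb}\,\delta(E_a - E_n)$$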

Proof:

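Consider

$$G^{+} = \lim_{\eta\to 0}\frac{1}{E - H + i\eta}$$

Then

$$(G^{+})^{\dagger} = \lim_{\eta\to 0}\frac{1}{E - H - i\eta} = G^{-}$$

since $H$ is hermitean.

Thus,

$$T = V + V G^{+} V$$

and

$$T^{\dagger} = V + V G^{-} V$$

where here too I have assumed that $V$ is hermitean.

Thus,

$$T - T^{\dagger} = V\left(G^{+} - G^{-}\right)V$$

Taking matrix elements of this equation gives

$$T_{ab} - T^{\dagger}_{ab} = \lim_{\eta\to 0}\langle \mathbf k_a|\,V\left[\frac{1}{E - H + i\eta} - \frac{1}{E - H - i\eta}\right]V\,|\mathbf k_b\rangle$$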

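We now insert before the second $V$ a complete set of eigenvectors $|\Psi_n^{+}\rangle$ of $H$ with energy $E_n$ and incident momentum $\mathbf k_n$. Then using the fact that $H|\Psi_n^{+}\rangle = E_n|\Psi_n^{+}\rangle$, we obtain

$$T_{ab} - T^{\dagger}_{ab} = \lim_{\eta\to 0}\int d\mathbf k_n\left[\frac{1}{E - E_n + i\eta} - \frac{1}{E - E_n - i\eta}\right]\langle \mathbf k_a|V|\Psi_n^{+}\rangle\langle \Psi_n^{+}|V|\mathbf k_b\rangle \qquad (45)$$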

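It is straightforward to convince yourselves that

$$\lim_{\eta\to 0}\left[\frac{1}{E - E_n + i\eta} - \frac{1}{E - E_n - i\eta}\right] = -2\pi i\,\delta(E - E_n) \qquad (46)$$

by using the standard form for the Dirac Delta function

$$\delta(x) = \lim_{\eta\to 0}\frac{1}{\pi}\,\frac{\eta}{x^2 + \eta^2}$$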

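Inserting (46) into (45), we find that

$$T_{ab} - T^{\dagger}_{ab} = -2\pi i\int d\mathbf k_n\,\delta(E - E_n)\,\langle \mathbf k_a|V|\Psi_n^{+}\rangle\langle \Psi_n^{+}|V|\mathbf k_b\rangle$$

We now use the defining relation for $T$ and thus also $T^{\dagger}$ to rewrite this as

$$T_{ab} - T^{\dagger}_{ab} = -2\pi i\int d\mathbf k_n\,\delta(E - E_n)\,\langle \mathbf k_a|T|\mathbf k_n\rangle\langle \mathbf k_n|T^{\dagger}|\mathbf k_b\rangle = -2\pi i\int d\mathbf k_n\,\delta(E - E_n)\,T_{an}\,T^{\dagger}_{nb}$$

QED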

This equation can be expressed in operator notation as

T − T † = −2πi T T † (47)

with the understanding, however, that when we insert a complete set of states between the

two operators T and T †, we only include states at the same energy.
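The operator identities in properties (1) to (3) can be checked numerically in a finite-dimensional toy model. The sketch below is only an illustration: it replaces the continuum by a hypothetical two-level system (the matrices `H0` and `V` are invented for the example) and keeps η finite, in which case T = V + V G_0^+ T, T = V + V G^+ V and T − T† = V (G^+ − G^−) V all hold exactly as matrix identities.

```python
def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def combine(a, b, sign=1):
    """Return a + sign * b for 2x2 matrices."""
    return [[a[i][j] + sign * b[i][j] for j in range(2)] for i in range(2)]

def inverse(a):
    """Inverse of a 2x2 matrix."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [[a[1][1] / det, -a[0][1] / det],
            [-a[1][0] / det, a[0][0] / det]]

def dagger(a):
    """Hermitian adjoint of a 2x2 matrix."""
    return [[complex(a[j][i]).conjugate() for j in range(2)] for i in range(2)]

def resolvent(energy, h, eta):
    """(E - h + i*eta)^(-1) for a 2x2 hamiltonian h."""
    z = complex(energy, eta)
    return inverse([[z - h[0][0], -h[0][1]], [-h[1][0], z - h[1][1]]])

def max_diff(a, b):
    """Largest absolute difference between corresponding matrix elements."""
    return max(abs(a[i][j] - b[i][j]) for i in range(2) for j in range(2))

# Hypothetical toy model: diagonal H0 and a hermitian V, with finite eta.
E, ETA = 1.0, 0.05
IDENTITY = [[1.0, 0.0], [0.0, 1.0]]
H0 = [[0.4, 0.0], [0.0, 1.7]]
V = [[0.3, 0.2 - 0.1j], [0.2 + 0.1j, -0.5]]
H = combine(H0, V)

G0_PLUS = resolvent(E, H0, ETA)
G_PLUS = resolvent(E, H, ETA)
G_MINUS = resolvent(E, H, -ETA)

# Property (1): T = V + V G0 T, solved as T = (1 - V G0)^(-1) V.
T_FROM_LS = matmul(inverse(combine(IDENTITY, matmul(V, G0_PLUS), sign=-1)), V)
# Property (2): T = V + V G+ V.
T_FROM_FULL_G = combine(V, matmul(matmul(V, G_PLUS), V))
# Step used in property (3): T - T^dagger = V (G+ - G-) V.
LHS = combine(T_FROM_FULL_G, dagger(T_FROM_FULL_G), sign=-1)
RHS = matmul(matmul(V, combine(G_PLUS, G_MINUS, sign=-1)), V)

if __name__ == "__main__":
    print(max_diff(T_FROM_LS, T_FROM_FULL_G), max_diff(LHS, RHS))
```

This is not a substitute for the proofs; it only confirms that the algebra is self-consistent in a case where every operator is an explicit matrix.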


QED

Theorem: The set of all vectors $|\Psi_a^{-}\rangle$ is orthonormal.

Proof: Almost identical to the one above for the continuum solutions $|\Psi_a^{+}\rangle$ with outgoing spherical waves.

The two sets are, however, not complete, if $H$ admits discrete states. If the discrete states are added to the set $|\Psi_a^{+}\rangle$ or the set $|\Psi_a^{-}\rangle$ the resulting sets are complete.

Introduction of the S matrix

From the preceding discussion, it is clear that the continuum solutions $|\Psi_a^{+}\rangle$ can be expressed as linear combinations of the solutions $|\Psi_a^{-}\rangle$. We define an operator $S$ which transforms from one set to the other, i.e.

$$S_{ab} = \langle \Psi_a^{-}|\Psi_b^{+}\rangle \qquad (50)$$

Note: Since the bound states are orthogonal to the continuum states, the expansion does

not require the discrete part of the set(s).

Important point: Since two eigenvectors of H belonging to different eigenvalues must be

orthogonal, S must be diagonal with respect to energy. This is in contrast to the T operator

which need not be.

We see therefore that the S matrix expands a continuum solution at a given energy with

outgoing spherical waves in terms of all of those at the same energy with incoming spherical

waves (and vice versa).

Connection between the S operator and the T operator

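Consider

$$S_{ba} = \langle \Psi_b^{-}|\Psi_a^{+}\rangle$$

As before, we use

$$|\Psi_a^{+}\rangle = |\Phi_a\rangle + G^{+}V|\Phi_a\rangle = |\Phi_a\rangle + (E_a - H + i\eta)^{-1}V|\Phi_a\rangle \qquad (51)$$

Thus,

$$S_{ba} = \langle \Psi_b^{-}|\Phi_a\rangle + \langle \Psi_b^{-}|(E_a - H + i\eta)^{-1}V|\Phi_a\rangle$$

$$= \langle \Psi_b^{-}|\Phi_a\rangle + \langle \Psi_b^{-}|(E_a - E_b + i\eta)^{-1}V|\Phi_a\rangle$$

$$= \langle \Psi_b^{-}|\Phi_a\rangle + \langle \Psi_b^{-}|V(E_a - E_b + i\eta)^{-1}|\Phi_a\rangle$$

$$= \langle \Psi_b^{-}|\Phi_a\rangle - \langle \Psi_b^{-}|V(E_b - E_a - i\eta)^{-1}|\Phi_a\rangle \qquad (52)$$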

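If we now consider the Lippmann-Schwinger equation for $|\Psi_b^{-}\rangle$ and take its hermitean adjoint, we see that

$$\langle \Psi_b^{-}| = \langle \Phi_b| + \langle \Psi_b^{-}|V(E_b - H_0 + i\eta)^{-1}$$

Inserting this into the first term in (52) gives

$$S_{ba} = \langle \Phi_b|\Phi_a\rangle + \langle \Psi_b^{-}|V(E_b - H_0 + i\eta)^{-1}|\Phi_a\rangle - \langle \Psi_b^{-}|V(E_b - E_a - i\eta)^{-1}|\Phi_a\rangle$$

$$= \langle \Phi_b|\Phi_a\rangle + \langle \Psi_b^{-}|V\left[\frac{1}{E_b - E_a + i\eta} - \frac{1}{E_b - E_a - i\eta}\right]|\Phi_a\rangle$$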

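As in a previous theorem, we use the fact that

$$\lim_{\eta\to 0}\frac{1}{2\pi i}\left(\frac{1}{x - i\eta} - \frac{1}{x + i\eta}\right) = \lim_{\eta\to 0}\frac{1}{2\pi i}\left(\frac{2i\eta}{x^2 + \eta^2}\right) = \lim_{\eta\to 0}\frac{1}{\pi}\,\frac{\eta}{x^2 + \eta^2} = \delta(x)$$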

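Thus,

$$S_{ba} = \langle \Phi_b|\Phi_a\rangle - 2\pi i\,\langle \Psi_b^{-}|V|\Phi_a\rangle\,\delta(E_a - E_b)$$

$$= \delta_{ab} - 2\pi i\,\langle \Phi_b|T|\Phi_a\rangle\,\delta(E_a - E_b)$$

$$= \delta_{ab} - 2\pi i\,T_{ba}\,\delta(E_a - E_b) \qquad (53)$$

which in operator notation becomes

$$S = I - 2\pi i\,T \qquad (54)$$

as long as we restrict ourselves to states with the same energy.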
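The Lorentzian representation of the delta function used in (46) and in the derivation of (53) can be checked numerically. The sketch below is an illustration under stated assumptions: the Gaussian test function, the integration window and the step count are arbitrary choices made for the example, and the closed form exp(η²) erfc(η) is the known value of the Lorentzian-weighted integral of exp(−x²).

```python
import math

def lorentzian(x, eta):
    """Nascent delta function (1/pi) * eta / (x^2 + eta^2)."""
    return eta / (math.pi * (x * x + eta * eta))

def resolvent_difference(x, eta):
    """1/(x + i*eta) - 1/(x - i*eta), which equals -2*pi*i times the Lorentzian."""
    return 1.0 / complex(x, eta) - 1.0 / complex(x, -eta)

def smeared_value(f, eta, half_width=8.0, steps=160000):
    """Midpoint-rule integral of lorentzian(x, eta) * f(x) over [-half_width, half_width]."""
    h = 2.0 * half_width / steps
    total = 0.0
    for i in range(steps):
        x = -half_width + (i + 0.5) * h
        total += lorentzian(x, eta) * f(x)
    return total * h

def gaussian(x):
    """Smooth test function with gaussian(0) = 1 (an arbitrary choice)."""
    return math.exp(-x * x)

if __name__ == "__main__":
    for eta in (1e-1, 1e-2, 1e-3):
        # The integral approaches gaussian(0) = 1 as eta decreases.
        print(eta, smeared_value(gaussian, eta))
```

The deviation from f(0) shrinks roughly linearly with η, which is the sense in which the limit η → 0 reproduces δ(x).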

For calculations, it is useful to have (53) in a slightly different form, in which the operative delta function is in momentum rather than energy. To do so, we use the fact that

$$\delta(E_a - E_b) = \frac{\mu}{\hbar^2 k_a}\,\delta(k_a - k_b)$$

Then (53) can be rewritten as

$$S_{ba} = \delta_{ab} - \frac{2\pi\mu i}{k_a\hbar^2}\,T_{ba}\,\delta(k_a - k_b) \qquad (55)$$

The scattering amplitude in terms of S-matrix elements

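The scattering amplitude $f_{\mathbf k}(\hat{\mathbf r})$ can be expressed in terms of the T matrix elements $\langle \mathbf k'|T|\mathbf k\rangle$ according to [see eqs. (42) and (44)]

$$f_{\mathbf k}(\hat{\mathbf r}) = -\frac{(2\pi)^2\mu}{\hbar^2}\,\langle \mathbf k'|T|\mathbf k\rangle$$

Using (55), we see that

$$f_{\mathbf k}(\hat{\mathbf r}) = -\frac{(2\pi)^2\mu}{\hbar^2}\,\frac{k\hbar^2}{2\pi\mu i}\,\langle \mathbf k'|I - S|\mathbf k\rangle = -2\pi i k\,\langle \mathbf k'|S - I|\mathbf k\rangle \qquad (56)$$

Note that we are using here the S matrix elements that involve the momentum delta function because the states $|\mathbf k\rangle$ and $|\mathbf k'\rangle$ are in momentum representation.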

The cross section in terms of S matrix elements

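The differential cross section for scattering from $\mathbf k$ to $\mathbf k'$ can now be expressed in terms of the S-matrix elements as

$$\frac{d\sigma}{d\Omega} = |f_{\mathbf k}(\hat{\mathbf r})|^2 = 4\pi^2k^2\,|\langle \mathbf k'|S - I|\mathbf k\rangle|^2 \qquad (57)$$

where as a reminder $|\mathbf k'| = |\mathbf k| = k$.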

Theorem: The S operator is unitary.

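Proof:

$$S S^{\dagger} = (I - 2\pi i\,T)(I + 2\pi i\,T^{\dagger}) = I - 2\pi i\,(T - T^{\dagger}) - 2\pi i\,(2\pi i)\,T T^{\dagger}$$

where I have used (54) for both $S$ and $S^{\dagger}$.

We now use (47) to rewrite this as

$$S S^{\dagger} = I - 2\pi i\,(-2\pi i\,T T^{\dagger}) - 2\pi i\,(2\pi i)\,T T^{\dagger} = I$$

QED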

Rotational invariance and the S matrix


The S-matrix elements we have been discussing so far were in a plane wave basis. Clearly

in such a basis, the S matrix is not diagonal; otherwise elastic scattering processes would

not occur.

If the potential that governs the scattering process is a central potential, it is clear that the S-matrix elements cannot depend on the absolute orientation of $\mathbf k$ and $\mathbf k'$ but only on their relative orientation. Then assuming that the particles have no spin (or intrinsic angular momentum) we can carry out a partial wave expansion of the S-matrix elements, viz:

$$\langle \mathbf k'|S|\mathbf k\rangle = \delta(k - k')\sum_{l=0}^{\infty}\frac{2l+1}{4\pi k^2}\,S_l(k)\,P_l(\hat{\mathbf k}\cdot\hat{\mathbf k}') \qquad (58)$$

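We can evaluate the coefficients $S_l(k)$ in this expansion by invoking the unitarity property of the S-matrix elements, i.e.

$$S S^{\dagger} = I$$

which in the plane wave representation becomes

$$\int d\mathbf k''\,\langle \mathbf k'|S|\mathbf k''\rangle\langle \mathbf k''|S^{\dagger}|\mathbf k\rangle = \delta(\mathbf k - \mathbf k') \qquad (59)$$

Inserting (58) into (59) we find that

$$\delta(\mathbf k - \mathbf k') = \int \delta(k' - k'')\,\delta(k - k'')\sum_{ll'}\frac{(2l+1)(2l'+1)}{16\pi^2k^2k'^2}\,S_l(k')\,S_{l'}^{*}(k'')\,P_l(\hat{\mathbf k}'\cdot\hat{\mathbf k}'')\,P_{l'}(\hat{\mathbf k}\cdot\hat{\mathbf k}'')\,d\mathbf k''$$

$$= \delta(k - k')\sum_{ll'}\frac{(2l+1)(2l'+1)}{16\pi^2k^2}\,S_l(k)\,S_{l'}^{*}(k)\int P_l(\hat{\mathbf k}'\cdot\hat{\mathbf k}'')\,P_{l'}(\hat{\mathbf k}\cdot\hat{\mathbf k}'')\,d\hat{\mathbf k}''$$

where one factor of $k^{-2}$ was cancelled by the $k''^2$ factor in $k''^2\,dk''$ and one momentum delta function by the integration over $dk''$.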

To evaluate the remaining angular integral over $d\hat{\mathbf k}''$, we use the spherical harmonic addition theorem (see Merzbacher’s text, Quantum Mechanics, Third Edition, pg 251), which says that

$$P_k(\hat{\mathbf r}_1\cdot\hat{\mathbf r}_2) = \frac{4\pi}{2k+1}\sum_m Y_k^{m}(\hat{\mathbf r}_1)\,Y_k^{m*}(\hat{\mathbf r}_2)$$

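Thus

$$\int P_l(\hat{\mathbf k}'\cdot\hat{\mathbf k}'')\,P_{l'}(\hat{\mathbf k}\cdot\hat{\mathbf k}'')\,d\hat{\mathbf k}'' = \frac{16\pi^2}{(2l+1)(2l'+1)}\sum_{mm'}\int Y_l^{m}(\hat{\mathbf k}')\,Y_l^{m*}(\hat{\mathbf k}'')\,Y_{l'}^{m'*}(\hat{\mathbf k})\,Y_{l'}^{m'}(\hat{\mathbf k}'')\,d\hat{\mathbf k}''$$

$$= \frac{16\pi^2}{(2l+1)(2l'+1)}\sum_{mm'}\delta_{ll'}\,\delta_{mm'}\,Y_l^{m}(\hat{\mathbf k}')\,Y_{l'}^{m'*}(\hat{\mathbf k})$$

$$= \frac{16\pi^2}{(2l+1)(2l'+1)}\,\delta_{ll'}\sum_m Y_l^{m}(\hat{\mathbf k}')\,Y_l^{m*}(\hat{\mathbf k})$$

$$= \frac{4\pi}{2l+1}\,\delta_{ll'}\,P_l(\hat{\mathbf k}'\cdot\hat{\mathbf k})$$

where the last equality again followed from use of the spherical harmonic addition theorem.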

Thus, all told,

$$\delta(\mathbf k - \mathbf k') = \delta(k - k')\sum_l\frac{2l+1}{4\pi k^2}\,|S_l(k)|^2\,P_l(\hat{\mathbf k}\cdot\hat{\mathbf k}')$$

But the Dirac delta function has the well-known partial wave expansion

$$\delta(\mathbf k - \mathbf k') = \frac{\delta(k - k')}{k^2}\sum_l\frac{2l+1}{4\pi}\,P_l(\hat{\mathbf k}\cdot\hat{\mathbf k}')$$

Thus, we immediately see that

$$|S_l(k)|^2 = 1 \qquad (60)$$

from which we can introduce the parametrization

$$S_l(k) = e^{2i\delta_l(k)} \qquad (61)$$

The quantity $\delta_l(k)$ appearing in (61) is referred to as the phase shift in the lth partial wave, for reasons that will become clear shortly.

What phases are being shifted?

Let’s now discuss the physical significance of the term “phase shift” for δl(k).

Consider the full scattering wave function, assuming that the incident momentum is directed along the z-axis. Asymptotically as $r \to \infty$

$$\Psi_{\mathbf k}^{+}(\mathbf r) \to \frac{1}{(2\pi)^{3/2}}\left[e^{ikz} + f_k(\theta)\,\frac{e^{ikr}}{r}\right]$$

Let’s look first at the plane wave part, which would be the wave function in the absence of a scattering potential. We can perform a partial wave expansion of $e^{ikz}$, yielding

$$e^{ikz} = \sum_{l=0}^{\infty} i^l\,(2l+1)\,j_l(kr)\,P_l(\cos\theta)$$

Asymptotically as $r \to \infty$, the spherical Bessel function behaves like

$$j_l(kr) \to \frac{1}{kr}\,\sin\left(kr - \frac{l\pi}{2}\right)$$

Thus, in the lth partial wave the plane wave behaves asymptotically like a sine function.
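Both statements above, the partial wave expansion of the plane wave and the asymptotic sine behaviour of j_l, are easy to test numerically. The following is a minimal sketch using only the standard library; the helper names, the downward (Miller) recurrence for j_l and the particular test values are choices made for this illustration rather than anything prescribed in these notes.

```python
import cmath
import math

def spherical_jn_list(lmax, x, extra=30):
    """[j_0(x), ..., j_lmax(x)] from downward (Miller) recurrence, normalized to j_0 or j_1."""
    top = max(lmax, int(x) + 1) + extra
    values = [0.0] * (top + 2)
    values[top] = 1e-30  # arbitrary tiny seed; the overall scale is fixed below
    for l in range(top, 0, -1):
        values[l - 1] = (2 * l + 1) / x * values[l] - values[l + 1]
    j0 = math.sin(x) / x
    j1 = math.sin(x) / (x * x) - math.cos(x) / x
    if abs(j0) >= abs(j1):
        scale = j0 / values[0]
    else:
        scale = j1 / values[1]
    return [v * scale for v in values[: lmax + 1]]

def legendre_list(lmax, c):
    """[P_0(c), ..., P_lmax(c)] from the three-term recurrence."""
    p = [1.0, c]
    for l in range(1, lmax):
        p.append(((2 * l + 1) * c * p[l] - l * p[l - 1]) / (l + 1))
    return p[: lmax + 1]

def plane_wave_partial_sum(kr, cos_theta, lmax):
    """Sum over l <= lmax of i^l (2l+1) j_l(kr) P_l(cos theta)."""
    j = spherical_jn_list(lmax, kr)
    p = legendre_list(lmax, cos_theta)
    return sum((1j ** l) * (2 * l + 1) * j[l] * p[l] for l in range(lmax + 1))

if __name__ == "__main__":
    kr, c = 5.0, 0.3
    print(plane_wave_partial_sum(kr, c, 25), cmath.exp(1j * kr * c))
    # Asymptotic form of j_2 at large argument
    x = 200.0
    print(spherical_jn_list(2, x)[2], math.sin(x - math.pi) / x)
```

The number of partial waves needed grows with kr, since j_l(kr) is negligible only for l well above kr.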

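What happens when we add the scattered wave that results from the presence of the potential? Now

$$e^{ikz} + f_k(\theta)\,\frac{e^{ikr}}{r} = e^{ikz} - 2\pi i k\,\langle \mathbf k'|S - I|\mathbf k\rangle\,\frac{e^{ikr}}{r}$$

We expand $e^{ikz}$ as earlier. For the S-matrix we use the expansion given in (58) and furthermore replace $S_l(k) \to e^{2i\delta_l(k)}$. Then

$$e^{ikz} - 2\pi i k\,\langle \mathbf k'|S - I|\mathbf k\rangle\,\frac{e^{ikr}}{r} \to \sum_l i^l(2l+1)\,\frac{\sin\left(kr - \frac{l\pi}{2}\right)}{kr}\,P_l(\cos\theta) - 2\pi i k\sum_l\frac{2l+1}{4\pi k^2}\left(e^{2i\delta_l} - 1\right)\frac{e^{ikr}}{r}\,P_l(\cos\theta)$$

$$= \sum_l i^l\,(2l+1)\,\frac{P_l(\cos\theta)}{kr}\,e^{i\delta_l}\left[\sin\left(kr - \frac{l\pi}{2}\right)e^{-i\delta_l} - i^{-l}\,\frac{i}{2}\left(e^{i\delta_l} - e^{-i\delta_l}\right)e^{ikr}\right]$$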

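Let’s now focus on the term in brackets. Using the fact that

$$e^{i\delta_l} - e^{-i\delta_l} = 2i\sin\delta_l$$

and that

$$i^{-l} = e^{-il\pi/2}$$

we can write the term in brackets as

$$[\ldots] = \sin\left(kr - \frac{l\pi}{2}\right)e^{-i\delta_l} + \sin\delta_l\,e^{ikr}\,e^{-il\pi/2}$$

$$= \sin\left(kr - \frac{l\pi}{2}\right)(\cos\delta_l - i\sin\delta_l) + \sin\delta_l\left(\cos\left(kr - \frac{l\pi}{2}\right) + i\sin\left(kr - \frac{l\pi}{2}\right)\right)$$

$$= \sin\left(kr - \frac{l\pi}{2}\right)\cos\delta_l + \cos\left(kr - \frac{l\pi}{2}\right)\sin\delta_l - i\left(\sin\left(kr - \frac{l\pi}{2}\right)\sin\delta_l - \sin\left(kr - \frac{l\pi}{2}\right)\sin\delta_l\right)$$

$$= \sin\left(kr - \frac{l\pi}{2}\right)\cos\delta_l + \cos\left(kr - \frac{l\pi}{2}\right)\sin\delta_l$$

$$= \sin\left(kr - \frac{l\pi}{2} + \delta_l\right)$$

where in the last equality I used the fact that sin (A+B) = sin A cos B + cos A sin B.

[FIG. 5: A hard-sphere potential (regions I and II, boundary at r0).]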

We see therefore that the effect of the potential on the lth partial wave is to modify the phase of the asymptotic sine function from $kr - \frac{l\pi}{2}$ to $kr - \frac{l\pi}{2} + \delta_l$. Thus, $\delta_l$ represents the shift in the phase of the lth partial wave on passing through the complete region in which the potential acts.
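The key step of this argument, that the bracketed combination of the free sine wave and the outgoing scattered wave collapses to a single phase-shifted sine, can be verified with a few lines of complex arithmetic. This is only a sketch; the variable `x` stands for kr − lπ/2 and the sample values are arbitrary.

```python
import cmath
import math

def bracket_term(x, delta):
    """sin(x) e^{-i delta} + sin(delta) e^{i x}, with x = kr - l*pi/2."""
    return math.sin(x) * cmath.exp(-1j * delta) + math.sin(delta) * cmath.exp(1j * x)

def shifted_sine(x, delta):
    """The claimed result sin(x + delta)."""
    return math.sin(x + delta)

if __name__ == "__main__":
    for x, delta in [(0.3, 0.7), (12.5, -1.1), (100.0, 0.05)]:
        # The imaginary part cancels and the real part is the shifted sine.
        print(bracket_term(x, delta), shifted_sine(x, delta))
```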

An example

Let’s now discuss a specific example in which the phase shifts can be determined analytically. The example involves scattering by a hard sphere potential,

$$V(r) = \begin{cases}\infty & r \le r_0\\ 0 & r > r_0\end{cases}$$

This potential looks schematically as in figure 5.

Clearly, in region I ($0 < r < r_0$), the wave function must be identically zero.

Outside the region of the hard sphere, i.e. in region II, the Schrodinger equation is just that of a free particle, for which the solution is

$$\Psi_{lm}(\mathbf r) \propto R_l(r)\,Y_l^{m}(\hat{\mathbf r})$$

with

$$R_l(r) = A_l\,j_l(kr) + B_l\,\eta_l(kr)$$

Here $j_l$ is the spherical Bessel function of order $l$ and $\eta_l$ the corresponding spherical Neumann function of order $l$. [Note: If you haven’t seen it yet, you may look at pages 346-349 of Shankar for a treatment of the free-particle Schrodinger equation in spherical coordinates.]

One of the two coefficients can be determined by the δ function normalization of this continuum wave function. The second comes from the condition that the wave function must be continuous at $r = r_0$.

$$R_l^{I}(r_0) = 0$$

$$R_l^{II}(r_0) = A_l\,j_l(kr_0) + B_l\,\eta_l(kr_0)$$

Equating

$$R_l^{I}(r_0) = R_l^{II}(r_0)$$

then leads to

$$\frac{B_l}{A_l} = -\frac{j_l(kr_0)}{\eta_l(kr_0)} \qquad (62)$$

Bear in mind, however, that this is not an eigenvalue condition. For scattering, all energies $E = \frac{\hbar^2k^2}{2\mu}$ are allowed.

Now we can look at the radial wave function in the lth partial wave (i.e. $R_l(r)$) asymptotically (i.e. for $r \to \infty$), as we must do to extract the phase shift.

$$R_l(r) = A_l\,j_l(kr) + B_l\,\eta_l(kr) \to \frac{1}{kr}\left[A_l\sin\left(kr - \frac{l\pi}{2}\right) - B_l\cos\left(kr - \frac{l\pi}{2}\right)\right]$$

where I have used the known asymptotic forms of the spherical Bessel and Neumann functions, given for example on page 348 of Shankar.

Using (62), we then find that

$$R_l(r) \to \frac{A_l}{kr}\left[\sin\left(kr - \frac{l\pi}{2}\right) + \frac{j_l(kr_0)}{\eta_l(kr_0)}\cos\left(kr - \frac{l\pi}{2}\right)\right]$$

Defining

$$\tan X = \frac{j_l(kr_0)}{\eta_l(kr_0)}$$

we can rewrite this as

$$R_l(r) \to \frac{A_l}{kr\cos X}\left[\sin\left(kr - \frac{l\pi}{2}\right)\cos X + \cos\left(kr - \frac{l\pi}{2}\right)\sin X\right] = \frac{A_l}{kr\cos X}\,\sin\left(kr - \frac{l\pi}{2} + X\right)$$

We see, therefore, that $X$ is precisely the phase shift of the lth partial wave.

But

$$X = \tan^{-1}\frac{j_l(kr_0)}{\eta_l(kr_0)}$$

so that

$$\delta_l(k) = \tan^{-1}\frac{j_l(kr_0)}{\eta_l(kr_0)}$$
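The hard-sphere result is simple enough to evaluate directly. The sketch below computes δ_l from the formula just derived, using upward recurrence for the spherical Bessel and Neumann functions with the sign convention η_0(x) = −cos(x)/x implied by the asymptotic form used above. The function names are hypothetical, the upward recurrence is only adequate for small l, and the total cross section uses the standard partial-wave sum σ = (4π/k²) Σ (2l+1) sin² δ_l, which is assumed here rather than taken from the text so far.

```python
import math

def spherical_bessel_pair(l, x):
    """(j_l(x), n_l(x)) by upward recurrence; convention n_0(x) = -cos(x)/x."""
    j_prev, j_curr = math.sin(x) / x, math.sin(x) / x**2 - math.cos(x) / x
    n_prev, n_curr = -math.cos(x) / x, -math.cos(x) / x**2 - math.sin(x) / x
    if l == 0:
        return j_prev, n_prev
    for m in range(1, l):
        j_prev, j_curr = j_curr, (2 * m + 1) / x * j_curr - j_prev
        n_prev, n_curr = n_curr, (2 * m + 1) / x * n_curr - n_prev
    return j_curr, n_curr

def hard_sphere_phase_shift(l, k, r0):
    """delta_l = arctan(j_l(k r0) / n_l(k r0)), principal branch of arctan."""
    j, n = spherical_bessel_pair(l, k * r0)
    return math.atan(j / n)

def hard_sphere_total_cross_section(k, r0, lmax=2):
    """Assumed partial-wave sum (4 pi / k^2) * sum_l (2l+1) sin^2(delta_l), truncated at lmax."""
    total = sum((2 * l + 1) * math.sin(hard_sphere_phase_shift(l, k, r0)) ** 2
                for l in range(lmax + 1))
    return 4.0 * math.pi / k**2 * total

if __name__ == "__main__":
    k, r0 = 1.0, 0.5
    for l in range(3):
        print(l, hard_sphere_phase_shift(l, k, r0))
```

For l = 0 the formula gives tan δ_0 = −tan(k r_0), i.e. δ_0 = −k r_0 on the principal branch, and at low energy the cross section approaches 4π r_0².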

The optical theorem

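We have already seen that unitarity of the S matrix is equivalent to the T-matrix relation

$$T - T^{\dagger} = -2\pi i\,T T^{\dagger}$$

I shall now derive an important consequence of this relation, which can accordingly also be viewed as a consequence of S-matrix unitarity.

Taking matrix elements of the above relation in a plane wave basis

$$\langle \mathbf k_a|T|\mathbf k_b\rangle - \langle \mathbf k_a|T^{\dagger}|\mathbf k_b\rangle = -2\pi i\int d\mathbf k_n\,\delta(E_a - E_n)\,\langle \mathbf k_a|T|\mathbf k_n\rangle\langle \mathbf k_n|T^{\dagger}|\mathbf k_b\rangle$$

But, from earlier discussion [see (42) and (44)]

$$\langle \mathbf k_a|T|\mathbf k_b\rangle = -\frac{\hbar^2}{(2\pi)^2\mu}\,f_{k_a}(\hat{\mathbf k}_a\cdot\hat{\mathbf k}_b)$$

Thus,

$$-\frac{\hbar^2}{(2\pi)^2\mu}\left[f_{k_a}(\hat{\mathbf k}_a\cdot\hat{\mathbf k}_b) - f^{*}_{k_a}(\hat{\mathbf k}_a\cdot\hat{\mathbf k}_b)\right] = -2\pi i\,\frac{\hbar^4}{(2\pi)^4\mu^2}\int d\mathbf k_n\,f_{k_a}(\hat{\mathbf k}_a\cdot\hat{\mathbf k}_n)\,f^{*}_{k_b}(\hat{\mathbf k}_b\cdot\hat{\mathbf k}_n)\,\delta(E_a - E_n) \qquad (63)$$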

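Let’s now consider (63) for the case in which $\mathbf k_a = \mathbf k_b$, so that $\hat{\mathbf k}_a\cdot\hat{\mathbf k}_b = 1$, i.e. $\theta_{ab} = 0$. Then

$$f_{k_a}(0) - f^{*}_{k_a}(0) = 2\pi i\,\frac{\hbar^2}{(2\pi)^2\mu}\int d\mathbf k_n\,|f_{k_a}(\hat{\mathbf k}_a\cdot\hat{\mathbf k}_n)|^2\,\delta(E_a - E_n) \qquad (64)$$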

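But

$$d\mathbf k_n = k_n^2\,dk_n\,d\hat{\mathbf k}_n, \qquad E_n = \frac{\hbar^2}{2\mu}\,k_n^2, \qquad dE_n = \frac{\hbar^2}{\mu}\,k_n\,dk_n$$

Thus,

$$k_n^2\,dk_n = \frac{\mu}{\hbar^2}\,k_n\,dE_n = \frac{\mu}{\hbar^2}\sqrt{\frac{2\mu}{\hbar^2}}\,\sqrt{E_n}\,dE_n$$

Finally,

$$\int d\mathbf k_n\,|f_{k_a}(\hat{\mathbf k}_a\cdot\hat{\mathbf k}_n)|^2\,\delta(E_a - E_n) = \frac{\mu}{\hbar^2}\sqrt{\frac{2\mu}{\hbar^2}}\,\sqrt{E_a}\int d\hat{\mathbf k}_n\,|f_{k_a}(\hat{\mathbf k}_a\cdot\hat{\mathbf k}_n)|^2 = \frac{\mu}{\hbar^2}\,k_a\int d\hat{\mathbf k}_n\,|f_{k_a}(\hat{\mathbf k}_a\cdot\hat{\mathbf k}_n)|^2$$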

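Plugging this into (64) gives

$$f_{k_a}(0) - f^{*}_{k_a}(0) = -\frac{k_a}{2\pi i}\int d\hat{\mathbf k}_n\,|f_{k_a}(\hat{\mathbf k}_a\cdot\hat{\mathbf k}_n)|^2 \qquad (65)$$

Now let’s define

$$\sigma = \int d\hat{\mathbf k}_n\,|f_{k_a}(\hat{\mathbf k}_a\cdot\hat{\mathbf k}_n)|^2$$

Clearly $\sigma$ is the total cross section for elastic scattering for an incoming plane wave of momentum $\mathbf k_a$. For any complex number $A$, we know that $A - A^{*} = 2i\,\mathrm{Im}\,A$. We can thus rewrite (65) as

$$2i\,\mathrm{Im}\,f_{k_a}(0) = -\frac{k_a}{2\pi i}\,\sigma$$

or

$$\sigma = \frac{4\pi}{k_a}\,\mathrm{Im}\,f_{k_a}(0) \qquad (66)$$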

This relation is referred to as the Optical Theorem, in analogy with the process of light

passing through a medium, for which the imaginary part of the complex index of refraction

is related to the total absorption cross section.


Since fka(0) is the forward scattering amplitude, what the optical theorem is expressing

is the fact that all of the flux that is removed from the forward direction (i.e. from the

beam direction) goes into a scattering process. Thus, we see a close link between S-matrix

unitarity and flux conservation, which we’ll indeed confirm shortly.
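The optical theorem (66) can be tested numerically for any set of real phase shifts. The sketch below assumes the partial-wave form of the elastic amplitude, f(θ) = (1/k) Σ_l (2l+1) e^{iδ_l} sin δ_l P_l(cos θ), which follows from combining (56), (58) and (61). The phase shifts and wave number in the example are made-up illustrative values, and the total cross section is obtained by brute-force angular integration rather than from a closed formula, so that the check is not circular.

```python
import cmath
import math

def legendre(l, c):
    """P_l(c) from the three-term recurrence."""
    p_prev, p_curr = 1.0, c
    if l == 0:
        return p_prev
    for m in range(1, l):
        p_prev, p_curr = p_curr, ((2 * m + 1) * c * p_curr - m * p_prev) / (m + 1)
    return p_curr

def amplitude(cos_theta, k, deltas):
    """Assumed partial-wave amplitude (1/k) sum_l (2l+1) e^{i d_l} sin(d_l) P_l(cos theta)."""
    return sum((2 * l + 1) * cmath.exp(1j * d) * math.sin(d) * legendre(l, cos_theta)
               for l, d in enumerate(deltas)) / k

def total_cross_section(k, deltas, steps=2000):
    """2 pi * integral over cos(theta) of |f|^2, by Simpson's rule (steps must be even)."""
    h = 2.0 / steps
    total = 0.0
    for i in range(steps + 1):
        weight = 1 if i in (0, steps) else (4 if i % 2 else 2)
        total += weight * abs(amplitude(-1.0 + i * h, k, deltas)) ** 2
    return 2.0 * math.pi * total * h / 3.0

def optical_theorem_rhs(k, deltas):
    """(4 pi / k) Im f(theta = 0), the right-hand side of eq. (66)."""
    return 4.0 * math.pi / k * amplitude(1.0, k, deltas).imag

if __name__ == "__main__":
    k = 1.3                         # illustrative wave number
    deltas = [-0.5, -0.12, -0.01]   # hypothetical phase shifts for l = 0, 1, 2
    print(total_cross_section(k, deltas), optical_theorem_rhs(k, deltas))
```

If one of the phase shifts is given an imaginary part (so that |S_l| < 1), the two numbers no longer agree, which reflects the loss of flux from the elastic channel.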

Generalization to complex collision processes

Up to now we have restricted our discussion to elastic scattering processes. Such processes are characterized by the fact that the internal structure of the projectile and the target are unaffected by the collision, so that the relative energy (i.e. the magnitude of the relative momentum) of the two is unchanged in the collision. What is changed in an elastic collision is the direction of the relative momentum vector. Our restriction to elastic scattering was implicit in our use of the asymptotic wave function

$$\Psi_{\mathbf k}^{+} \to \frac{1}{(2\pi)^{3/2}}\left[e^{i\mathbf k\cdot\mathbf r} + f_{\mathbf k}(\hat{\mathbf r})\,\frac{e^{ikr}}{r}\right]$$

for which the spherically outgoing scattered wave has the same $k$ as the incident plane wave.

I would now like to discuss briefly how to generalize this to a wider variety of collision

processes. I will still consider two-body collisions only, in which a “projectile” a strikes a

“target” A and emerging from the collision are two objects b and B. [Note: Either the pro-

jectile or the target or both can be complex (many-particle) objects with internal structure.]

Such processes can be expressed either as

a+ A → b+B

or

A(a, b)B

Some examples are:

1. Elastic scattering A(a, a)A

2. Inelastic scattering A(a, a′)A∗

Here A∗ is an excited state of the target and a′ represents the projectile with appro-

priately diminished energy.


3. Rearrangement collisions A(a, b)B

Here what emerges from the collision are two different particles.

An example might be the nuclear reaction p + (Z,A) → d + (Z,A − 1), in which a

neutron in the target nucleus attaches itself to the incident proton, which then leaves

the reaction as a deuteron. The target which started with Z protons and A − Z

neutrons now has Z protons and A − Z − 1 neutrons. Note: This is often called a

pickup reaction, since the projectile picks up a neutron from the target.

For a given incident projectile and target at a given relative energy, each of the possi-

ble two–fragment states defines a two-body reaction channel. Those which are allowed by

conservation of energy are called open channels. Those which cannot satisfy energy conser-

vation are called closed channels. And as you might imagine, there can also in principle be

many-body channels with more than two fragments emerging.

Up to now, our analysis has focussed on elastic scattering only, which is of course always

an open channel. Furthermore, we have only discussed processes in which either the two

fragments carried no angular momentum (or spin) of their own or in which their intrinsic

angular momenta were irrelevant so that they could be ignored. My generalization will

permit other reaction processes to occur, but will still be limited to cases in which the

intrinsic angular momentum can be ignored. Further generalization to include angular

momentum and/or intrinsic spin is feasible, but notationally very complex. Furthermore, as

suggested above we will only be considering two-body channels.

Collisions of “Spinless” fragments

We shall let $\chi_s$ denote the intrinsic wave function in channel $s$ and we shall assume that $\chi_s$ carries no intrinsic angular momentum. Then a process in which the incident channel is $s$ can be described by a wave function with the asymptotic form

$$\frac{1}{\sqrt{v_s}}\,e^{ik_sz_s}\,\chi_s + \sum_t f_{st}(\hat{\mathbf r}_t)\,\frac{e^{ik_tr_t}}{\sqrt{v_t}\,r_t}\,\chi_t \qquad (67)$$

This is the appropriate generalization of the asymptotic wave function we introduced earlier when only considering elastic collisions. The factors $1/\sqrt{v_s}$ and $1/\sqrt{v_t}$ are introduced so that the incident and outgoing components all contain unit flux (see pages 17-19). We also assume here that the incident (relative) momentum is $\mathbf k_s = k_s\hat{\mathbf z}$, i.e. along the z-axis. Finally, the sum over $t$ goes over all open channels.

In this formalism, the differential cross section for a process in which $s$ denotes the incident channel and $t$ the final channel is

$$\frac{d\sigma}{d\Omega}(s \to t) = |f_{st}(\hat{\mathbf r})|^2 \qquad (68)$$

As we have often seen, there are many continuum solutions of the Time Independent Schrodinger Equation at the same energy. Our choice of (67) was dictated by the physical boundary conditions of the scattering problem. There is another particularly interesting one, which I will call $\Phi_s^{l}$. It is by definition the solution which has the asymptotic form

$$\Phi_s^{l} \to \frac{1}{\sqrt{v_s}}\,\frac{e^{-i\left(k_sr_s - \frac{l\pi}{2}\right)}}{r_s}\,\chi_s\,P_l(\cos\theta_s) - \sum_t S_{st}^{l}\,\frac{1}{\sqrt{v_t}}\,\frac{e^{i\left(k_tr_t - \frac{l\pi}{2}\right)}}{r_t}\,\chi_t\,P_l(\cos\theta_t) \qquad (69)$$

This solution contains an incoming spherical wave in channel s and outgoing spherical waves

in all open channels. In contrast to the solution (67), this solution is also an angular

momentum eigenstate and thus reflects the spherical symmetry of the problem. The reason

that the physical solution (67) was not an angular momentum eigenstate was that the

physical problem had a preferred direction in space, namely the direction of the incident

(relative) momentum vector ks.

Equation (69) defines the S-matrix elements in the lth partial wave for the various allowed collision processes $s \to t$. We will soon confirm that it is indeed the appropriate generalization of the $S_l(k)$ coefficients introduced on pages 36-37 and that $S_{ss}^{l}$ is identical to the $S_l(k)$ introduced there.

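Now let’s consider how to relate measurable quantities, e.g. cross sections, to these S-matrix elements. To do this, we note that

$$e^{ik_sz_s} = \sum_{l=0}^{\infty} i^l(2l+1)\,j_l(k_sr_s)\,P_l(\cos\theta_s)$$

$$\to \sum_{l=0}^{\infty} i^l(2l+1)\,\frac{\sin\left(k_sr_s - \frac{l\pi}{2}\right)}{k_sr_s}\,P_l(\cos\theta_s)$$

$$= \sum_{l=0}^{\infty} i^l(2l+1)\,\frac{e^{i\left(k_sr_s - \frac{l\pi}{2}\right)} - e^{-i\left(k_sr_s - \frac{l\pi}{2}\right)}}{2ik_sr_s}\,P_l(\cos\theta_s) \qquad (70)$$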

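With the above result in mind, let’s now consider the following superposition of the angular momentum eigenstates $\Phi_s^{l}$ given in (69):

$$\Psi_s = \sum_{l=0}^{\infty} i^{l+1}\,\frac{2l+1}{2k_s}\,\Phi_s^{l}$$

$$\to \frac{1}{\sqrt{v_s}}\sum_l i^l\,(2l+1)\,\frac{e^{i\left(k_sr_s - \frac{l\pi}{2}\right)} - e^{-i\left(k_sr_s - \frac{l\pi}{2}\right)}}{2ik_sr_s}\,\chi_s\,P_l(\cos\theta_s) - \sum_l\frac{2l+1}{2k_s}\,i^{l+1}\sum_t\left(S_{st}^{l} - \delta_{st}\right)\frac{e^{i\left(k_tr_t - \frac{l\pi}{2}\right)}}{\sqrt{v_t}\,r_t}\,\chi_t\,P_l(\cos\theta_t)$$

where the $\delta_{st}$ term was included in the last line to balance the extra term that was added on the previous line.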

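Using (70), we can rewrite this as

$$\Psi_s = \frac{1}{\sqrt{v_s}}\,e^{ik_sz_s}\,\chi_s - \sum_t\frac{1}{\sqrt{v_t}}\,\frac{e^{ik_tr_t}}{r_t}\,\chi_t\left[\sum_l\frac{(2l+1)\,i^{l+1}}{2k_s}\left(S_{st}^{l} - \delta_{st}\right)e^{-i\frac{l\pi}{2}}\,P_l(\cos\theta_t)\right] \qquad (71)$$

This is indeed the linear combination of the $\Phi_s^{l}$ angular momentum eigenstates that has the physical asymptotic behavior (67). Thus, on comparing (71) with (67), we can read off the scattering amplitudes, which are

$$f_{st}(\hat{\mathbf r}_t) = -\sum_l\frac{(2l+1)\,i^{l+1}}{2k_s}\left(S_{st}^{l} - \delta_{st}\right)e^{-i\frac{l\pi}{2}}\,P_l(\cos\theta_t)$$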

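Finally, the differential cross section for the process $s \to t$ can be obtained from the corresponding scattering amplitude with the result being (see (68))

$$\frac{d\sigma}{d\Omega}(s \to t) = \left|\sum_l\frac{2l+1}{2k_s}\left(S_{st}^{l} - \delta_{st}\right)P_l(\cos\theta_t)\right|^2 \qquad (72)$$

In particular,

$$\frac{d\sigma}{d\Omega}(s \to s) = \frac{1}{4k_s^2}\left|\sum_l(2l+1)\left(S_{ss}^{l} - 1\right)P_l(\cos\theta_s)\right|^2 \qquad (73)$$

and

$$\frac{d\sigma}{d\Omega}(s \to t \ne s) = \frac{1}{4k_s^2}\left|\sum_l(2l+1)\,S_{st}^{l}\,P_l(\cos\theta_t)\right|^2 \qquad (74)$$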

We have now reparametrized the cross sections for all two-body processes in terms of the associated S-matrix elements for each partial wave. To calculate the cross sections, we of course need to know the $S_{st}^{l}$, which in general we do not, since they require that we solve the entire Schrodinger equation, which is, to say the least, difficult. However, we can use conservation theorems to gain some insight into the S-matrix elements without ever fully solving the problem. Let’s see how this is done.

1. Conservation of flux in $\Phi_s^{l}$, the lth partial wave:

Incoming flux = 1

Outgoing flux = $\sum_t |S_{st}^{l}|^2$

If there are no sources or sinks in the problem, then the incoming and outgoing fluxes in each partial wave must be the same, i.e.

$$\sum_t |S_{st}^{l}|^2 = 1 \qquad (75)$$

We can now derive from this several important conclusions:

1. If only elastic scattering is energetically allowed, then the sum over $t$ reduces to a single term $t = s$, and then

$$|S_{ss}^{l}|^2 = 1 \qquad (76)$$

This is just the same as eq. (60). We have thus confirmed that in the one-channel elastic scattering problem, the $S_{ss}^{l}$ coefficients we just introduced in our generalized formalism are the same as the $S_l(k)$ coefficients introduced in our earlier pure elastic scattering formalism when we carried out a partial wave decomposition. As in the earlier discussion, we can on the basis of (76) parametrize

$$S_{ss}^{l} = e^{2i\delta_l}$$

where $\delta_l$ is the elastic scattering phase shift in the lth partial wave. Thus, we now see that the applicability of phase shift analysis to elastic scattering is a direct consequence of flux conservation.

2. The flux conservation equation (75) furthermore shows that

$$|S_{st}^{l}|^2 \le 1$$

for every $s$, $t$.

3. If we integrate eq. (72) over all solid angles, we find that the total cross section for the process $s \to t$ is

$$\sigma(s \to t) = \frac{\pi}{k_s^2}\sum_l(2l+1)\,|S_{st}^{l} - \delta_{st}|^2$$

We next define

$$\sigma_l(s \to t) = \frac{\pi}{k_s^2}\,(2l+1)\,|S_{st}^{l} - \delta_{st}|^2$$

to be the total cross section in the lth partial wave.

Clearly $\sigma_l(s \to s)$ is largest when $S_{ss}^{l} = -1$, in which case

(a) $\sigma_l(s \to s) = \frac{4(2l+1)\pi}{k_s^2}$

(b) $S_{st}^{l} = 0$, for all $t \ne s$

(c) $\sigma_l(s \to t) = 0$ for all $t \ne s$

Similarly $\sigma_l(s \to t \ne s)$ is largest when $|S_{st}^{l}| = 1$, in which case

(a) $\sigma_l(s \to t) = \frac{(2l+1)\pi}{k_s^2}$

(b) $\sigma_l(s \to s) = \frac{(2l+1)\pi}{k_s^2}$

(c) $\sigma_l(s \to t') = 0$ for all $t' \ne s, t$

From this, we see that as long as there is a non-elastic process, so that some $S_{st}^{l} \ne 0$ with $t \ne s$, then there must be some elastic scattering also taking place in that partial wave.

As indicated earlier, this more general formalism can be extended even more generally -

albeit with enormous complication - to two-body processes with intrinsic spin and even to

more complex many-body channels.
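The limiting cases listed in conclusion 3 can be reproduced with a few lines of code. The sketch below treats one row of the partial-wave S matrix, i.e. the elements S^l_{st} for a fixed incident channel s, as a plain list of complex numbers; the function names and the sample rows are hypothetical and serve only to illustrate the formulas above.

```python
import math

def partial_cross_section(l, k_s, s_element, elastic):
    """sigma_l(s -> t) = (pi / k_s^2) (2l+1) |S_st - delta_st|^2."""
    delta_st = 1.0 if elastic else 0.0
    return math.pi / k_s**2 * (2 * l + 1) * abs(s_element - delta_st) ** 2

def flux_is_conserved(s_row, tol=1e-12):
    """Check eq. (75): the sum over final channels of |S_st|^2 equals 1."""
    return abs(sum(abs(x) ** 2 for x in s_row) - 1.0) < tol

def channel_cross_sections(l, k_s, s_row, incident=0):
    """Cross sections into every channel t, for incident channel index `incident`."""
    return [partial_cross_section(l, k_s, x, elastic=(t == incident))
            for t, x in enumerate(s_row)]

if __name__ == "__main__":
    l, k_s = 1, 2.0
    unit = (2 * l + 1) * math.pi / k_s**2
    # Pure elastic case with S_ss = -1: elastic cross section is 4 * unit.
    print(channel_cross_sections(l, k_s, [-1.0, 0.0]), 4 * unit)
    # Complete absorption from the elastic channel: S_ss = 0, |S_st| = 1.
    print(channel_cross_sections(l, k_s, [0.0, 1.0]), unit)
    # An intermediate, flux-conserving case.
    print(flux_is_conserved([0.6, 0.8j]), channel_cross_sections(l, k_s, [0.6, 0.8j]))
```

The second case makes the last conclusion concrete: even when all of the flux leaves through the reaction channel, the elastic cross section in that partial wave is not zero.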


Approximation techniques in scattering theory

As in bound-state problems, quantum mechanical scattering problems are very rarely

exactly solvable, making approximate methods of solution critical. We shall discuss two such

methods: (a) the Plane Wave Born Approximation (PWBA) and (b) the Distorted Wave

Born Approximation (DWBA). Both are essentially applications of perturbation theory to

scattering problems. We shall for simplicity only discuss elastic scattering of “spinless”

particles.

1. The Plane Wave Born Approximation

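Elastic scattering from an incident (relative) momentum $\mathbf k$ to a final (relative) momentum $\mathbf k'$ is described in terms of a differential cross section

$$\frac{d\sigma}{d\Omega}(\theta) = |f_k(\theta)|^2$$

where

$$\cos\theta = \hat{\mathbf k}\cdot\hat{\mathbf k}'$$

and the scattering amplitude $f_k(\theta)$ is given by

$$f_k(\theta) = -\frac{\sqrt{2\pi}\,\mu}{\hbar^2}\int e^{-i\mathbf k'\cdot\mathbf r'}\,V(\mathbf r')\,\Psi_{\mathbf k}^{+}(\mathbf r')\,d\mathbf r' \qquad (77)$$

and where the full scattering wave function $\Psi_{\mathbf k}^{+}(\mathbf r)$ satisfies the integral equation

$$\Psi_{\mathbf k}^{+}(\mathbf r) = \frac{1}{(2\pi)^{3/2}}\,e^{i\mathbf k\cdot\mathbf r} + \int G_0^{+}(\mathbf r,\mathbf r')\,V(\mathbf r')\,\Psi_{\mathbf k}^{+}(\mathbf r')\,d\mathbf r' \qquad (78)$$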

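Inserting (78) into (77) gives

$$f_k(\theta) = -\frac{\mu}{2\pi\hbar^2}\int e^{-i\mathbf k'\cdot\mathbf r'}\,V(\mathbf r')\,e^{i\mathbf k\cdot\mathbf r'}\,d\mathbf r' - \frac{\sqrt{2\pi}\,\mu}{\hbar^2}\int\!\!\int e^{-i\mathbf k'\cdot\mathbf r'}\,V(\mathbf r')\,G_0^{+}(\mathbf r',\mathbf r'')\,V(\mathbf r'')\,\Psi_{\mathbf k}^{+}(\mathbf r'')\,d\mathbf r'\,d\mathbf r'' \qquad (79)$$

We could again insert the integral equation (78) into the second term of (79) and this would give

$$f_k(\theta) = -\frac{\mu}{2\pi\hbar^2}\int e^{-i\mathbf k'\cdot\mathbf r'}\,V(\mathbf r')\,e^{i\mathbf k\cdot\mathbf r'}\,d\mathbf r'$$

$$\qquad - \frac{\mu}{2\pi\hbar^2}\int\!\!\int e^{-i\mathbf k'\cdot\mathbf r'}\,V(\mathbf r')\,G_0^{+}(\mathbf r',\mathbf r'')\,V(\mathbf r'')\,e^{i\mathbf k\cdot\mathbf r''}\,d\mathbf r'\,d\mathbf r''$$

$$\qquad - \frac{\sqrt{2\pi}\,\mu}{\hbar^2}\int\!\!\int\!\!\int e^{-i\mathbf k'\cdot\mathbf r'}\,V(\mathbf r')\,G_0^{+}(\mathbf r',\mathbf r'')\,V(\mathbf r'')\,G_0^{+}(\mathbf r'',\mathbf r''')\,V(\mathbf r''')\,\Psi_{\mathbf k}^{+}(\mathbf r''')\,d\mathbf r'\,d\mathbf r''\,d\mathbf r''' \qquad (80)$$

By successively inserting for $\Psi_{\mathbf k}^{+}$ the full integral equation (78) we generate an infinite series of terms for the scattering amplitude. Each successive term has an extra potential $V$, an extra free-particle Green’s function $G_0^{+}$, and an extra three-dimensional integral. This series is called the Born Series. Keeping only the first term in the series is called the 1st order Born Approximation, or more correctly the 1st order Plane Wave Born Approximation (PWBA). Keeping more terms gives rise to higher-order PWBA’s.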

The Born series in operator language

The transition operator $T$ satisfies the integral equation (see property 1 on page 30)

$$T = V + V G_0^{+} T \qquad (81)$$

Inserting the complete integral equation (81) for the operator $T$ appearing on the right hand side gives

$$T = V + V G_0^{+} V + V G_0^{+} V G_0^{+} T$$

By successively replacing the $T$ on the right hand side by the full integral equation, we generate an operator series

$$T = V + V G_0^{+} V + V G_0^{+} V G_0^{+} V + V G_0^{+} V G_0^{+} V G_0^{+} V + \ldots$$
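The structure of this operator series is easiest to see in a deliberately oversimplified model in which every operator is a single complex number. The sketch below is such a toy, not a scattering calculation: `v` plays the role of V, `g0` the role of G_0^+ at finite η, and all numerical values are invented for the illustration. In this case (81) is solved exactly by T = V / (1 − V G_0^+), and the Born series is a geometric series that converges only when |V G_0^+| < 1.

```python
def free_propagator(energy, level, eta):
    """Scalar stand-in for G0+ = 1 / (E - e0 + i*eta), with finite eta."""
    return 1.0 / complex(energy - level, eta)

def exact_t(v, g0):
    """Exact solution of T = V + V G0 T when all quantities are numbers."""
    return v / (1.0 - v * g0)

def born_partial_sum(v, g0, n_terms):
    """First n_terms of T = V + V G0 V + V G0 V G0 V + ... (n_terms = 1 is first order)."""
    term, total = complex(v), 0j
    for _ in range(n_terms):
        total += term
        term = v * g0 * term
    return total

if __name__ == "__main__":
    g0 = free_propagator(1.0, 0.5, 0.1)
    weak, strong = 0.2, 1.0   # hypothetical coupling strengths
    for n in (1, 2, 5, 20):
        print(n, abs(born_partial_sum(weak, g0, n) - exact_t(weak, g0)))
    # For the strong coupling |v * g0| > 1 and the partial sums grow without bound,
    # although the exact T still exists.
    print(abs(strong * g0), abs(born_partial_sum(strong, g0, 40)), abs(exact_t(strong, g0)))
```

The same pattern, rapid convergence for a weak potential and failure of the series for a strong one, is what motivates asking later when the first-order approximation can be trusted.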

The various orders of PWBA are obtained by evaluating $\langle \mathbf k'|T|\mathbf k\rangle$ with successively more terms in the series expansion for $T$. For example, 1st order PWBA is obtained by approximating

$$\langle \mathbf k'|T|\mathbf k\rangle \approx \langle \mathbf k'|V|\mathbf k\rangle$$

First-order PWBA

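The first-order PWBA gives rise to a scattering amplitude for elastic scattering of the form

$$f_k(\theta) = -\frac{\mu}{2\pi\hbar^2}\int e^{-i\mathbf k'\cdot\mathbf r'}\,V(\mathbf r')\,e^{i\mathbf k\cdot\mathbf r'}\,d\mathbf r' = -\frac{\mu}{2\pi\hbar^2}\int e^{i\mathbf q\cdot\mathbf r'}\,V(\mathbf r')\,d\mathbf r'$$

where

$$\mathbf q = \mathbf k - \mathbf k'$$

is the momentum that is transferred in the elastic collision.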

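Since $V(r')$ is spherically symmetric, the angular integration is done straightforwardly. We choose a coordinate system in which

$$\mathbf q = q\,\hat{\mathbf z}'$$

Thus,

$$f_k(\theta) = -\frac{\mu}{2\pi\hbar^2}\int_0^{2\pi}\!\!\int_{-1}^{1}\!\!\int_0^{\infty} e^{iqr'\cos\theta'}\,V(r')\,r'^2\,dr'\,d(\cos\theta')\,d\phi'$$

$$= -\frac{2\pi\mu}{2\pi\hbar^2}\int_0^{\infty}\left[\frac{1}{iqr'}\,e^{iqr'\cos\theta'}\right]_{-1}^{1}V(r')\,r'^2\,dr'$$

$$= -\frac{\mu}{\hbar^2}\int_0^{\infty}\frac{e^{iqr'} - e^{-iqr'}}{iqr'}\,V(r')\,r'^2\,dr'$$

Finally,

$$f_k(\theta) = -\frac{2\mu}{\hbar^2q}\int_0^{\infty}\sin(qr')\,V(r')\,r'\,dr' \qquad (82)$$

Note that all of the θ dependence of the scattering amplitude is contained in $q$. More specifically,

$$q = |\mathbf k - \mathbf k'| = \sqrt{k^2 + k'^2 - 2kk'\cos\theta}$$

For elastic scattering $k = k'$, so that

$$q = k\sqrt{2 - 2\cos\theta}$$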

An example

Consider elastic scattering of an electron by a neutral atom with atomic number $Z$ via a screened Coulomb potential of the form

$$V(r) = -\left(\frac{Ze^2}{r}\right)e^{-r/a}$$

where $a$ is the range of the potential.

Then in first-order PWBA,

$$f_k(\theta) = \frac{2\mu Ze^2}{\hbar^2q}\int_0^{\infty}\sin(qr')\,e^{-r'/a}\,dr'$$

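We write

$$\sin qr' = \frac{1}{2i}\left(e^{iqr'} - e^{-iqr'}\right)$$

so that

$$f_k(\theta) = \frac{2\mu Ze^2}{2i\hbar^2q}\left[\int_0^{\infty}e^{iqr' - \frac{r'}{a}}\,dr' - \int_0^{\infty}e^{-iqr' - \frac{r'}{a}}\,dr'\right]$$

$$= \frac{\mu Ze^2}{i\hbar^2q}\left[\frac{1}{iq - \frac{1}{a}}\left(e^{iqr' - \frac{r'}{a}}\right)\Big|_0^{\infty} - \frac{1}{-iq - \frac{1}{a}}\left(e^{-iqr' - \frac{r'}{a}}\right)\Big|_0^{\infty}\right]$$

$$= \frac{\mu Ze^2}{i\hbar^2q}\left[\frac{1}{iq - \frac{1}{a}}\,(-1) - \frac{1}{-iq - \frac{1}{a}}\,(-1)\right]$$

$$= \frac{\mu Ze^2}{i\hbar^2q}\;\frac{iq + \frac{1}{a} + iq - \frac{1}{a}}{q^2 + \left(\frac{1}{a}\right)^2}$$

$$= \frac{2\mu Ze^2}{\hbar^2\left(q^2 + \frac{1}{a^2}\right)}$$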

The differential cross section in PWBA for elastic scattering by the screened potential is

given by the square of the magnitude of the scattering amplitude, namely by

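$$\frac{d\sigma}{d\Omega}(\theta) = \frac{4\mu^2Z^2e^4}{\hbar^4\left(q^2+\frac{1}{a^2}\right)^2}$$

As noted earlier, all θ dependence is contained in $q^2$. Using the fact that $q^2 = 2k^2(1-\cos\theta)$, we find that

$$\frac{d\sigma}{d\Omega}(\theta) = \frac{4\mu^2Z^2e^4}{\hbar^4\left(2k^2(1-\cos\theta)+\frac{1}{a^2}\right)^2}$$

But

$$1-\cos\theta = 2\sin^2\frac{\theta}{2}$$

so that

$$\frac{d\sigma}{d\Omega}(\theta) = \frac{4\mu^2Z^2e^4}{\hbar^4\left(4k^2\sin^2\frac{\theta}{2}+\frac{1}{a^2}\right)^2}$$

In the limit that a → ∞, the screened Coulomb potential reduces to the pure Coulomb potential between two point charges, one with charge e and the other with charge Ze. In this case, the differential cross section becomes

$$\frac{d\sigma}{d\Omega}(\theta) = \frac{\mu^2Z^2e^4}{\hbar^4}\,\frac{1}{4k^4\sin^4\frac{\theta}{2}}$$

Noting that

$$\hbar k = p$$

this can be rewritten as

$$\frac{d\sigma}{d\Omega}(\theta) = \frac{\mu^2Z^2e^4}{4p^4}\,\mathrm{cosec}^4\frac{\theta}{2}$$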

which is exactly the same result as the Rutherford cross section obtained from classical

scattering theory. It is, in fact, also possible to do an exact Quantum Mechanical calculation,

rather than a first-order PWBA calculation. As noted by Shankar on page 531, the exact

QM calculation also leads to the same Rutherford formula.
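The approach of the screened-potential cross section to the Rutherford form as a → ∞ can be illustrated numerically. This is a small sketch under assumed units (μ = Z = e² = ħ = 1); the function names are hypothetical and not taken from the notes.

```python
import math

def dsigma_screened(theta, k, a, mu=1.0, Z=1.0, e2=1.0, hbar=1.0):
    """First-order PWBA cross section for V(r) = -(Z e^2 / r) exp(-r/a)."""
    q2 = 4.0 * k**2 * math.sin(theta / 2.0) ** 2     # q^2 = 4 k^2 sin^2(theta/2)
    return 4.0 * mu**2 * Z**2 * e2**2 / (hbar**4 * (q2 + 1.0 / a**2) ** 2)

def dsigma_rutherford(theta, k, mu=1.0, Z=1.0, e2=1.0, hbar=1.0):
    """Rutherford cross section mu^2 Z^2 e^4 / (4 p^4 sin^4(theta/2)), p = hbar k."""
    p = hbar * k
    return mu**2 * Z**2 * e2**2 / (4.0 * p**4 * math.sin(theta / 2.0) ** 4)

if __name__ == "__main__":
    theta, k = 1.0, 2.0
    for a in (1.0, 10.0, 100.0, 1.0e6):
        ratio = dsigma_screened(theta, k, a) / dsigma_rutherford(theta, k)
        print(f"a = {a:g}: screened / Rutherford = {ratio:.9f}")
```

For any finite range a the screened result lies below the Rutherford value, and it stays finite in the forward direction θ → 0, where the pure Coulomb cross section diverges.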

When should the Born approximation be applicable?

We would like to generate criteria from which to determine whether the first-order PWBA

is appropriate for a given problem. To do this, we first remember that the full scattering

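wave function corresponding to an incident relative energy $\hbar^2k^2/2\mu$ satisfies

$$\Psi^+_{\mathbf{k}}(\mathbf{r}) = \frac{1}{(2\pi)^{3/2}}\,e^{i\mathbf{k}\cdot\mathbf{r}} - \frac{\mu}{2\pi\hbar^2}\int \frac{e^{ik|\mathbf{r}-\mathbf{r}'|}}{|\mathbf{r}-\mathbf{r}'|}\,V(r')\,\Psi^+_{\mathbf{k}}(\mathbf{r}')\,d\mathbf{r}'$$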

In PWBA, only the plane wave term is retained. For all r, the plane wave term has a magnitude of $1/(2\pi)^{3/2}$. First-order PWBA is thus justified if the remaining term (which I will denote v(r)) is much smaller in magnitude than $1/(2\pi)^{3/2}$ in the region of the potential.

Mathematically, this condition is

$$|v(\mathbf{r})| = \frac{\mu}{2\pi\hbar^2}\left|\int \frac{e^{ik|\mathbf{r}-\mathbf{r}'|}}{|\mathbf{r}-\mathbf{r}'|}\,V(r')\,\Psi^+_{\mathbf{k}}(\mathbf{r}')\,d\mathbf{r}'\right| \ll \frac{1}{(2\pi)^{3/2}}$$

We shall estimate |v(r)| at r = 0. Since V (r) is usually strongest at r = 0, this is the

place where |v(r)| should be largest.

We shall estimate |v(0)| using first-order PWBA. Then

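$$\begin{aligned}
|v(0)| &\approx \frac{\mu}{2\pi\hbar^2}\,\frac{1}{(2\pi)^{3/2}}\left|\int \frac{e^{ikr'}}{r'}\,V(r')\,e^{i\mathbf{k}\cdot\mathbf{r}'}\,d\mathbf{r}'\right| \\
&= \frac{\mu}{(2\pi)^{5/2}\hbar^2}\,|w(0)|
\end{aligned}$$

with

$$\begin{aligned}
w(0) &= 2\pi\int_0^{\infty}\int_{-1}^{1} e^{ikr'}\,V(r')\,e^{ikr'\cos\theta'}\,r'\,dr'\,d(\cos\theta') \\
&= \frac{2\pi}{ik}\int_0^{\infty}\left(e^{2ikr'}-1\right)V(r')\,dr'
\end{aligned}$$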

In terms of w(0), the first-order PWBA is justified if

$$\frac{\mu}{2\pi\hbar^2}\,|w(0)| \ll 1$$

We now consider a prototype potential with range a and depth V0 of the form

$$V(r') = -V_0\,e^{-r'/a}$$

Though this potential is hardly general, it will suffice to provide some feel for the condi-

tions under which first-order PWBA is justified. For this potential,

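$$\begin{aligned}
w(0) &= -\frac{2\pi V_0}{ik}\int_0^{\infty}\left(e^{2ikr'}-1\right)e^{-r'/a}\,dr' \\
&= \frac{2\pi i V_0}{k}\left[-\frac{1}{2ik-\frac{1}{a}} - a\right] \\
&= \frac{2\pi i V_0}{k}\,\frac{-2ika}{2ik-\frac{1}{a}}
\end{aligned}$$

Thus

$$|w(0)| = \frac{2\pi V_0}{k}\,\frac{2ka}{\sqrt{4k^2+\frac{1}{a^2}}} = \frac{4\pi V_0a^2}{\sqrt{4k^2a^2+1}}$$

Thus, the condition defining the validity of first-order PWBA becomes

$$\frac{2\mu V_0a^2}{\hbar^2\sqrt{4k^2a^2+1}} \ll 1$$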

We now consider two cases:

1. Low energies: ka ≪ 1

Then

$$\frac{2\mu V_0a^2}{\hbar^2} \ll 1$$

or

$$V_0 \ll \frac{\hbar^2}{2\mu a^2}$$

Thus, for low-energy scattering, first-order PWBA can be applied only if the potential is sufficiently weak.

2. High energies: ka ≫ 1

Now the PWBA applicability condition becomes

$$\frac{2\mu V_0a^2}{2\hbar^2ka} \ll 1$$

or

$$V_0 \ll \frac{\hbar^2}{\mu a^2}\,ka$$

If ka is sufficiently large, this condition will be satisfied for any reasonable depth. Thus, at sufficiently high energies, first-order PWBA is usually appropriate.

In summary, first-order PWBA can be applied to elastic scattering processes if (a) the

incident energy is sufficiently high, or (b) the potential is sufficiently weak. More detailed

criteria would require explicit consideration of the form of the scattering potential for the

problem of interest.
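For the prototype exponential potential, the validity criterion is easy to evaluate. The sketch below is illustrative only: the function names and the units (μ = ħ = 1) are assumptions. It computes the dimensionless parameter that must be much smaller than 1, and cross-checks it against the complex expression for w(0).

```python
import math

def born_parameter(V0, a, k, mu=1.0, hbar=1.0):
    """2 mu V0 a^2 / (hbar^2 sqrt(4 k^2 a^2 + 1)); PWBA requires this << 1."""
    return 2.0 * mu * V0 * a**2 / (hbar**2 * math.sqrt(4.0 * k**2 * a**2 + 1.0))

def w0_exponential(V0, a, k):
    """w(0) for V(r) = -V0 exp(-r/a), evaluated from the two elementary integrals."""
    oscillating = -1.0 / (2j * k - 1.0 / a)   # int_0^inf exp((2ik - 1/a) r) dr
    damped = a                                 # int_0^inf exp(-r/a) dr
    return -(2.0 * math.pi * V0 / (1j * k)) * (oscillating - damped)

if __name__ == "__main__":
    V0, a = 0.2, 1.5
    for k in (0.01, 1.0, 100.0):
        from_w0 = abs(w0_exponential(V0, a, k)) / (2.0 * math.pi)   # (mu / 2 pi hbar^2) |w(0)|
        print(f"k a = {k * a:g}: parameter = {born_parameter(V0, a, k):.6f}, from w(0) = {from_w0:.6f}")
```

At small ka the parameter tends to 2μV₀a²/ħ², so only weak potentials qualify; at large ka it falls off like 1/(ka), which is the statement that the approximation improves with energy.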

Scattering from two potentials

PWBA is equivalent to treating the full interaction potential V (r) by perturbation theory.

If the criteria we just discussed do not apply, such a perturbative expansion in powers of V

will not be useful.

Under such circumstances, it is often useful to decompose V into two parts, V1 and V2,

where scattering due to V1 can be treated exactly whereas scattering due to V2 cannot. If

this decomposition is made appropriately, it might be possible to treat scattering due to V2

using perturbation theory.

The above approach is similar in spirit to bound-state perturbation theory, in which the

total hamiltonian is decomposed into a part H0 that can be treated exactly and another

part V or H1 that can be treated as a perturbation.

A brief aside - An alternative form for Tba

We have customarily expressed the matrix elements of T in the form

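$$\langle \mathbf{k}_b|T|\mathbf{k}_a\rangle = \langle \mathbf{k}_b|V|\Psi_a^+\rangle$$

where $|\Psi_a^+\rangle$ satisfies the integral equation

$$|\Psi_a^+\rangle = |\mathbf{k}_a\rangle + G_0^+V\,|\Psi_a^+\rangle$$

I would now like to show you that the same T matrix elements can be written alternatively as

$$\langle \mathbf{k}_b|T|\mathbf{k}_a\rangle = \langle \Psi_b^-|V|\mathbf{k}_a\rangle \qquad (83)$$

where $|\Psi_b^-\rangle$ satisfies the integral equation

$$|\Psi_b^-\rangle = |\mathbf{k}_b\rangle + G_0^-V\,|\Psi_b^-\rangle \qquad (84)$$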

so that the state vector |Ψ−b > appearing in (84) contains spherically incoming waves. Note

that such state vectors were introduced on pages 33-34 in the context of our S-matrix

discussion.


To prove that the T matrix elements can be expressed in this alternative (but equivalent)

form, consider

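$$\begin{aligned}
\langle \mathbf{k}_b|T|\mathbf{k}_a\rangle &= \langle \mathbf{k}_b|V + VG^+V|\mathbf{k}_a\rangle \\
&= \langle \mathbf{k}_b|V|\mathbf{k}_a\rangle + \langle \mathbf{k}_b|VG^+V|\mathbf{k}_a\rangle
\end{aligned}$$

Thus, if (83) is to be satisfied, $\langle\Psi_b^-|$ must be given by

$$\langle\Psi_b^-| = \langle \mathbf{k}_b| + \langle \mathbf{k}_b|\,VG^+ \qquad (85)$$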

Let’s now prove that this is equivalent to (84).

To do so, we first rewrite (85) as

$$\langle\Psi_b^-| = \langle \mathbf{k}_b|\,(I + VG^+) \qquad (86)$$

I now claim that

$$I + VG^+ = (I - VG_0^+)^{-1}$$

To prove this, consider

$$(I + VG^+)(I - VG_0^+) = I + V(G^+ - G_0^+) - VG^+VG_0^+$$

But in an earlier homework problem, we showed that

$$G^+ - G_0^+ = G^+VG_0^+$$

so that

$$(I + VG^+)(I - VG_0^+) = I + VG^+VG_0^+ - VG^+VG_0^+ = I$$

QED
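The operator identity can also be checked in a finite-dimensional toy model, where the Green's functions are ordinary matrix inverses. The sketch below is an illustration under stated assumptions: the 2×2 matrices H0 and V and the complex energy z (standing in for E + iη) are arbitrary hypothetical choices, and the helper functions are written by hand so that only the standard library is needed.

```python
def matmul(A, B):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def combine(A, B, sign=1):
    """Return A + sign * B for 2x2 matrices."""
    return [[A[i][j] + sign * B[i][j] for j in range(2)] for i in range(2)]

def inverse(A):
    """Inverse of a 2x2 matrix via the adjugate formula."""
    (a, b), (c, d) = A
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def resolvent(H, z):
    """(z - H)^(-1) for a 2x2 matrix H and a complex number z."""
    zI = [[z, 0.0], [0.0, z]]
    return inverse(combine(zI, H, sign=-1))

IDENTITY = [[1.0, 0.0], [0.0, 1.0]]

# Hypothetical toy model: any Hermitian H0 and V would serve equally well.
H0 = [[0.2, 0.0], [0.0, 1.3]]
V = [[0.5, 0.3], [0.3, -0.4]]
z = 0.7 + 0.05j                        # plays the role of E + i*eta

G0 = resolvent(H0, z)                  # free Green's function G0+
G = resolvent(combine(H0, V), z)       # full Green's function G+

# (I + V G+)(I - V G0+) should equal the identity.
product = matmul(combine(IDENTITY, matmul(V, G)),
                 combine(IDENTITY, matmul(V, G0), sign=-1))
max_deviation = max(abs(product[i][j] - IDENTITY[i][j])
                    for i in range(2) for j in range(2))

# The relation G+ - G0+ = G+ V G0+ used in the proof.
lhs = combine(G, G0, sign=-1)
rhs = matmul(G, matmul(V, G0))
resolvent_deviation = max(abs(lhs[i][j] - rhs[i][j])
                          for i in range(2) for j in range(2))

if __name__ == "__main__":
    print(max_deviation, resolvent_deviation)
```

Both deviations should come out at the level of floating-point rounding, independent of the particular matrices chosen, as long as z is not an eigenvalue of H0 or of H0 + V.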

Operating to the right on equation (86) with $(I - VG_0^+)$ and using the result just proven, we find that

$$\langle\Psi_b^-|\,(I - VG_0^+) = \langle \mathbf{k}_b|$$

which can be rewritten as

$$\langle\Psi_b^-| - \langle\Psi_b^-|\,VG_0^+ = \langle \mathbf{k}_b|$$

Thus,

$$\langle\Psi_b^-| = \langle \mathbf{k}_b| + \langle\Psi_b^-|\,VG_0^+$$

Taking the Hermitean adjoint of this equation and noting that $(G_0^+)^\dagger = G_0^-$, we obtain

$$|\Psi_b^-\rangle = |\mathbf{k}_b\rangle + G_0^-V\,|\Psi_b^-\rangle$$

which is identical to (84), as we set out to prove.

Return to scattering by two potentials

Now consider two state vectors

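$$|\chi_a^+\rangle = |\mathbf{k}_a\rangle + G_0^+V_1\,|\chi_a^+\rangle \qquad (87)$$

and

$$|\chi_b^-\rangle = |\mathbf{k}_b\rangle + G_0^-V_1\,|\chi_b^-\rangle \qquad (88)$$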

which describe the scattering due to V1 alone.

Then,

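$$\begin{aligned}
\langle \mathbf{k}_b|T|\mathbf{k}_a\rangle &= \langle \mathbf{k}_b|V_1+V_2|\Psi_a^+\rangle \\
&= \langle\chi_b^-|V_1+V_2|\Psi_a^+\rangle - \langle\chi_b^-|V_1G_0^+(V_1+V_2)|\Psi_a^+\rangle
\end{aligned}$$

where I have used the hermitean adjoint of (88).

But

$$|\Psi_a^+\rangle = |\mathbf{k}_a\rangle + G_0^+(V_1+V_2)\,|\Psi_a^+\rangle$$

or

$$G_0^+(V_1+V_2)\,|\Psi_a^+\rangle = |\Psi_a^+\rangle - |\mathbf{k}_a\rangle$$

Thus,

$$\begin{aligned}
\langle \mathbf{k}_b|T|\mathbf{k}_a\rangle &= \langle\chi_b^-|V_1+V_2|\Psi_a^+\rangle - \langle\chi_b^-|V_1|\Psi_a^+\rangle + \langle\chi_b^-|V_1|\mathbf{k}_a\rangle \\
&= \langle\chi_b^-|V_2|\Psi_a^+\rangle + \langle\chi_b^-|V_1|\mathbf{k}_a\rangle
\end{aligned}$$

But

$$\langle\chi_b^-|V_1|\mathbf{k}_a\rangle = \langle \mathbf{k}_b|T_1|\mathbf{k}_a\rangle$$

i.e. the T matrix element associated with scattering by potential V1 alone.

All told,

$$\langle \mathbf{k}_b|T|\mathbf{k}_a\rangle = \langle \mathbf{k}_b|T_1|\mathbf{k}_a\rangle + \langle\chi_b^-|V_2|\Psi_a^+\rangle \qquad (89)$$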

If we know how to treat V1 exactly, we can calculate its T matrix elements, < kb|T1| ka >.

We still need to treat the second term, however, which involves the scattering due to V2.

We discuss how this might be done below under appropriate circumstances.

The relationship (89) is called the Gell-Mann Goldberger relation.

The Distorted Wave Born Approximation

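If V2 is appropriately weak compared to V1, then it is reasonable to approximate $|\Psi_a^+\rangle$ by $|\chi_a^+\rangle$, the scattering wave function due to V1 alone. Then (89) becomes

$$\langle \mathbf{k}_b|T|\mathbf{k}_a\rangle = \langle \mathbf{k}_b|T_1|\mathbf{k}_a\rangle + \langle\chi_b^-|V_2|\chi_a^+\rangle \qquad (90)$$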

This is known as the first-order Distorted Wave Born Approximation or sometimes just

the Distorted Wave Born Approximation (DWBA). It is the first term in a perturbation

series expansion in powers of V2. Higher-order DWBA approximations can be systematically

generated by using the appropriate integral equation relating $|\Psi_a^+\rangle$ and $|\chi_a^+\rangle$.

The philosophy of the DWBA is that the distortion of the incoming plane wave due to V1

is treated exactly whereas scattering effects due to the weaker V2 are treated perturbatively.

A simple application of DWBA - elastic scattering of an electron from a nucleus

Choose

$$V_1 = \text{Coulomb interaction between electron and a point nucleus} = -\frac{Ze^2}{r}$$

Then choose

$$V_2 = V - V_1 = \text{modifications due to finite size of nucleus}$$

The scattering due to V1 can be treated exactly using the simple Rutherford scattering

formula given earlier, but that due to V2 cannot. But it is expected to be fairly weak.

Thus, we treat the scattering due to V2 using perturbation theory. I will not go through

the detailed analysis here, as I merely wanted to illustrate the basic ideas and uses of the

DWBA.

Let me close by mentioning that both PWBA and DWBA ideas can be used not only in

treating elastic scattering but also in treating more complex collision processes. But this is

for another course.


A review of Identical Particles

The next major topic in the course will be Second Quantization, which as we will see is

a way to deal with systems of many identical particles.

An introduction to identical particles was already presented in PHYS610 and is discussed

in some detail in Chapter 10 of Shankar on pages 260-277. If you haven’t already read this,

you should.

Rather than assume that all of you are familiar with the notion of identical particles and

their description in Quantum Mechanics, I thought I would briefly review some of the key

points that were made in PHYS610. Following that, I will also briefly discuss how to modify

our formalism of scattering theory to accommodate identical particles.

Let’s begin by reminding you of why we have to take specific care of identical particles

in Quantum Mechanics. To do so, let’s consider the scattering (either elastic or inelastic)

of an electron from a hydrogen atom. Experimentally, we detect an outgoing electron. In

doing so, we are faced with a dilemma. Is the electron that we detect the incident projectile

or is it the electron that was originally bound in the hydrogen atom? The mere fact that

we detect an electron is not sufficient to answer this question. This is a consequence of the

fact that the electrons in question are identical particles and thus indistinguishable.

Classically, we can answer the question of which particle we are detecting by making

further measurements. In particular, by tracing the trajectory or path of the electron through

space we can determine where it originates.

Quantum mechanically, we cannot define such a path. In a QM framework, the particles

are described by wave packets and classical trajectories only exist on the average. If the two

wave packets never overlap appreciably, we can to a reasonable approximation neglect the

indistinguishability of the two particles. But in general we need to take into account the

indistinguishability of the two electrons in our QM treatment.

In the Schrödinger picture, a system of n identical particles can be described in terms of the solutions

$$\Psi(1, 2, 3, \ldots, n; t)$$

of the time-dependent Schrödinger equation

$$-\frac{\hbar}{i}\,\frac{\partial\Psi}{\partial t} = \left[\sum_{i=1}^{n}\left(-\frac{\hbar^2\nabla_i^2}{2m_i}\right) + V(1, 2, \ldots, n)\right]\Psi = H(1, 2, \ldots, n)\,\Psi$$

Here i is meant to denote the full set of coordinate and (if necessary) spin dynamical variables

of particle i, viz: ri and σi. Note that I have assumed here that the potential does not change

with time.

To say that the particles 1, .., n are indistinguishable means that H is symmetric under

the interchange of two of its arguments, i.e. particles i and j can be interchanged in H

without changing it.

The permutation operator and the exchange operator

To understand the consequences of indistinguishability of identical particles, it is useful

to introduce two operators, the permutation operator P and the exchange operator X.

For a set of n ordered objects 1, 2, ..., n, the permutation operator P has the effect of permuting these objects. There are n! possible such permutations. For three objects, for example, the 6 possible permutations are (1, 2, 3), (1, 3, 2), (2, 1, 3), (2, 3, 1), (3, 1, 2), (3, 2, 1).

The exchange operator X interchanges two of the objects. There are obviously n(n − 1)/2 possible interchanges. For three objects, for example, the three possibilities are

$$(1, 2, 3) \to (2, 1, 3), \qquad (1, 2, 3) \to (3, 2, 1), \qquad (1, 2, 3) \to (1, 3, 2)$$

Clearly, any permutation of a set of objects can be expressed as a product of exchanges, though usually not uniquely. Thus, for example,

$$P_{(1,2,3)\to(2,3,1)} = X_{1\leftrightarrow 2}\,X_{1\leftrightarrow 3}$$
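The counting and the decomposition into exchanges can be made concrete with a few lines of code. This sketch assumes one particular convention, which the notes do not spell out: an exchange swaps the entries sitting in two positions of the ordered list, and in a product the rightmost factor acts first. The function names are hypothetical.

```python
from itertools import combinations, permutations

def exchange(seq, i, j):
    """Return a copy of seq with the entries in positions i and j (1-based) swapped."""
    out = list(seq)
    out[i - 1], out[j - 1] = out[j - 1], out[i - 1]
    return tuple(out)

def apply_product(seq, exchanges):
    """Apply a product X_1 X_2 ... X_m of exchanges; the rightmost factor acts first."""
    for i, j in reversed(exchanges):
        seq = exchange(seq, i, j)
    return seq

objects = (1, 2, 3)
n_permutations = len(list(permutations(objects)))        # n! = 6
n_exchanges = len(list(combinations(objects, 2)))        # n(n-1)/2 = 3

# X(1<->2) X(1<->3) acting on (1, 2, 3)
via_first_route = apply_product(objects, [(1, 2), (1, 3)])
# A different product giving the same permutation (non-uniqueness)
via_second_route = apply_product(objects, [(2, 3), (1, 2)])

if __name__ == "__main__":
    print(n_permutations, n_exchanges, via_first_route, via_second_route)
```

Both routes end at (2, 3, 1), which illustrates the remark that the decomposition into exchanges is usually not unique; what is fixed is whether the number of exchanges is even or odd.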

Permutation symmetry

Now let’s define the operator that effects a permutation of an identical particle wave

function

$$U_P\,\Psi_\alpha(1, 2, \ldots, n) = \Psi_\alpha(P^{-1}(1, 2, \ldots, n))$$

Further, let’s assume that $\Psi_\alpha$ is an eigenfunction of the hamiltonian H of the system,

$$H\,\Psi_\alpha(1, 2, \ldots, n) = E_\alpha\,\Psi_\alpha(1, 2, \ldots, n) \qquad (91)$$

What about the wave function $U_P\,\Psi_\alpha(1, 2, \ldots, n)$ obtained by permuting the labels in some way? Since the particles are identical, such a state is indistinguishable from the state $\Psi_\alpha(1, 2, \ldots, n)$ and thus it too must be an eigenstate of H with the same energy $E_\alpha$,

$$H\,U_P\,\Psi_\alpha(1, 2, \ldots, n) = E_\alpha\,U_P\,\Psi_\alpha(1, 2, \ldots, n)$$

From this we can readily show that

$$[H, U_P] = 0$$

i.e. that H commutes with UP . Thus, it is possible to find simultaneous eigenstates of H

and UP .

Systems of two identical particles

We first focus on systems of two identical particles, for which the permutation and ex-

change operators are obviously the same. Consider a given eigenstate of H, Ψα(1, 2). From

what we saw earlier,

$$U_P\,\Psi_\alpha(1, 2) = \Psi_\alpha(P^{-1}(1, 2)) = \Psi_\alpha(2, 1)$$

is also an eigenstate of H with the same energy. And thus any linear combination of Ψα(1, 2)

and Ψα(2, 1) is also an eigenstate of H with the same energy.

Of these, there are two that are especially interesting:

$$\Psi^S_\alpha(1, 2) = \frac{1}{\sqrt{2}}\left(\Psi_\alpha(1, 2) + \Psi_\alpha(2, 1)\right)$$

$$\Psi^A_\alpha(1, 2) = \frac{1}{\sqrt{2}}\left(\Psi_\alpha(1, 2) - \Psi_\alpha(2, 1)\right)$$

They are the two possible simultaneous eigenstates of H and the permutation operator P. In particular,

$$U_P\,\Psi^S_\alpha(1, 2) = +\Psi^S_\alpha(1, 2)$$

and

$$U_P\,\Psi^A_\alpha(1, 2) = -\Psi^A_\alpha(1, 2)$$

$\Psi^S_\alpha(1, 2)$ is called the symmetric eigenstate, whereas $\Psi^A_\alpha(1, 2)$ is called the antisymmetric eigenstate for reasons to be made clearer shortly.

Three identical particles


In the case of more than two identical particles, the permutation and exchange operators

are no longer the same. To make the generalization simpler, let’s begin with three particles.

Consider an eigenstate Ψα(1, 2, 3) of some three-particle hamiltonian H with eigenvalue

Eα. The fact that the particles are identical means that there are (at least) six degenerate

states at energy Eα,

Ψα(1, 2, 3) , Ψα(2, 1, 3) , Ψα(1, 3, 2) , Ψα(3, 1, 2) , Ψα(2, 3, 1) , Ψα(3, 2, 1)

Clearly any linear combination of them will also be an eigenstate of H with the same energy

Eα.

Which are also eigenstates of UP ?

I claim that to be an eigenstate of UP , the state must be a simultaneous eigenstate of

UX(1 ↔ 2) , UX(1 ↔ 3) and UX(2 ↔ 3)

having the same eigenvalue for all three exchanges.

Put another way, it must either be symmetric under all three interchanges or antisym-

metric under all three interchanges. And there is a unique linear combination for each of

these two possibilities.

1. Symmetric under all three interchanges:

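$$\Psi^S_\alpha(1, 2, 3) = \frac{1}{\sqrt{3!}}\left[\Psi_\alpha(1, 2, 3) + \Psi_\alpha(1, 3, 2) + \Psi_\alpha(2, 3, 1) + \Psi_\alpha(2, 1, 3) + \Psi_\alpha(3, 1, 2) + \Psi_\alpha(3, 2, 1)\right]$$

2. Antisymmetric under all three interchanges:

$$\Psi^A_\alpha(1, 2, 3) = \frac{1}{\sqrt{3!}}\left[\Psi_\alpha(1, 2, 3) - \Psi_\alpha(1, 3, 2) + \Psi_\alpha(2, 3, 1) - \Psi_\alpha(2, 1, 3) + \Psi_\alpha(3, 1, 2) - \Psi_\alpha(3, 2, 1)\right]$$

Note: the factor $1/\sqrt{3!}$ is included so that they would be normalized assuming the individual terms are.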

I claim that both are eigenstates of UP for any permutation, as you can readily confirm.
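The claim can be confirmed mechanically. In the sketch below, a state is stored as a dictionary mapping an ordering of the particle labels to its coefficient; that representation, the function names, and the use of inversion counting to obtain the sign of a permutation are all illustrative assumptions rather than anything prescribed in the notes.

```python
from itertools import permutations
from math import factorial, sqrt

def parity(perm):
    """+1 for an even permutation of (1..n), -1 for an odd one, via inversion counting."""
    inversions = sum(1 for i in range(len(perm))
                     for j in range(i + 1, len(perm)) if perm[i] > perm[j])
    return -1 if inversions % 2 else 1

def combination(n, antisymmetric):
    """Coefficients of Psi(P(1..n)) in the symmetric or antisymmetric combination."""
    norm = 1.0 / sqrt(factorial(n))
    return {p: norm * (parity(p) if antisymmetric else 1)
            for p in permutations(range(1, n + 1))}

def exchange_labels(state, a, b):
    """Interchange particle labels a and b in every term of the state."""
    swap = {a: b, b: a}
    return {tuple(swap.get(x, x) for x in args): c for args, c in state.items()}

def negated(state):
    return {args: -c for args, c in state.items()}

psi_s = combination(3, antisymmetric=False)
psi_a = combination(3, antisymmetric=True)

if __name__ == "__main__":
    for a, b in ((1, 2), (1, 3), (2, 3)):
        print((a, b),
              exchange_labels(psi_s, a, b) == psi_s,
              exchange_labels(psi_a, a, b) == negated(psi_a))
```

Each line should print True twice: the symmetric combination is unchanged by every exchange, and the antisymmetric one changes sign under every exchange. Since any permutation is a product of exchanges, both are eigenstates of every permutation.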

An arbitrary number of identical particles


The same conclusions are true for any n. Namely, there are two possible states of n

identical particles at a given energy Eα that are also eigenstates of UP for any permutation.

One is called the symmetric state and the other is called the antisymmetric state.

Time evolution of a state of definite permutation symmetry

What happens if we have a system of identical particles in a state of definite permutation symmetry, either symmetric or antisymmetric, and let it evolve in time? At this point, I do not care whether it is in an eigenstate of H or not.

The state evolves via the time evolution operator,

$$T(t, t_0) = e^{-\frac{i}{\hbar}(t - t_0)H}$$

And since the hamiltonian commutes with the permutation operator, it should thus be

clear that an eigenstate of the permutation operator will preserve its permutation symmetry

character for all times.

An analogous situation occurs for parity. If a hamiltonian is invariant under space in-

version, then a state of definite parity will remain forever in a state of definite parity. On

the other hand, it is possible to form a quantum system in a state of mixed parity, as we do

for example when we prepare a beam of particles in a definite direction in a 1D scattering

experiment.

It is here that the analogy between parity transformations and permutations breaks

down. It has been found necessary to impose another fundamental postulate in our quantum

mechanical formalism when considering identical particles. This new postulate, sometimes

referred to as the symmetrization postulate, asserts that “the states of a system containing

N identical particles are either all symmetric or all antisymmetric with respect to exchanges

of the particles”.

It should be emphasized that this is a postulate, much like the other postulates on which

QM is based. But, here too, the postulate seems to be borne out by experiment.

Which of the two prescriptions should be applied to a given problem depends on the nature

of the identical particles involved. Particles for which all states are symmetric are called

bosons. Those for which all states are antisymmetric are called fermions. All fundamental

particles with half integral intrinsic spins behave as fermions, whereas those with integral

intrinsic spins all behave as bosons.


Independent particle wave functions

Often in QM, both for identical and for non-identical particles, a useful first approxima-

tion can be made by neglecting the interactions between the particles and then treating the

interactions between them later using some form of perturbation theory.

The approximate (or unperturbed) hamiltonian for these “noninteracting” particles will

just be the sum of the hamiltonians for each one,

H(1, 2, .., n) = h(1) + h(2) + ...+ h(n) (92)

Obviously, each particle feels the same hamiltonian, since they are identical.

If we denote the eigenfunctions and eigenvalues of the single-particle hamiltonians h(i) as $u_a(i)$ and $\epsilon_a$, respectively, i.e.

$$h(i)\,u_a(i) = \epsilon_a\,u_a(i)$$

then the eigenfunctions and eigenvalues of the independent particle hamiltonian (92) are given by

$$H\,\Psi(1, \ldots, n) = E\,\Psi(1, \ldots, n)$$

where

$$\Psi(1, \ldots, n) = u_{a_1}(1)\,u_{a_2}(2)\cdots u_{a_n}(n)$$

and

$$E = \epsilon_{a_1} + \epsilon_{a_2} + \ldots + \epsilon_{a_n}$$

Clearly, this system has degeneracies. For n = 2, for example, the two states

$$\Psi_1(1, 2) = u_a(1)\,u_b(2)$$

and

$$\Psi_2(1, 2) = u_b(1)\,u_a(2)$$

have the same energy

$$E = \epsilon_a + \epsilon_b$$

Neither $\Psi_1(1, 2)$ nor $\Psi_2(1, 2)$ is a state with definite permutation symmetry. But clearly

$$\Psi^A(1, 2) = \frac{1}{\sqrt{2}}\left[u_a(1)\,u_b(2) - u_b(1)\,u_a(2)\right]$$

and

$$\Psi^S(1, 2) = \frac{1}{\sqrt{2}}\left[u_a(1)\,u_b(2) + u_b(1)\,u_a(2)\right]$$

are. The state ΨA is antisymmetric under particle exchange and the state ΨS is symmetric.

If we are dealing with fermions, we must use the antisymmetric solution ΨA whereas if we

are dealing with bosons we must use the symmetric solution ΨS. These are the appropriate

independent particle solutions to use when dealing with systems of two identical particles.

Many-particle states for independent identical fermions

We have seen that the appropriate antisymmetric state for two independent fermions is

$$\Psi^A_{ab}(1, 2) = \frac{1}{\sqrt{2}}\left[u_a(1)\,u_b(2) - u_b(1)\,u_a(2)\right]$$

where $u_a$ and $u_b$ are eigenstates of the associated single-particle hamiltonian. A convenient way to rewrite this is

$$\Psi^A_{ab}(1, 2) = \frac{1}{\sqrt{2}}\begin{vmatrix} u_a(1) & u_a(2) \\ u_b(1) & u_b(2) \end{vmatrix}$$

in terms of a 2 × 2 determinant.

In this form, it can be readily generalized to the case of n identical independent fermions.

The appropriate generalization is to the n× n determinant

$$\Psi^A_{a,b,\ldots}(1, 2, \ldots, n) = \frac{1}{\sqrt{n!}}\begin{vmatrix} u_a(1) & u_a(2) & \cdots & u_a(n) \\ u_b(1) & u_b(2) & \cdots & u_b(n) \\ \vdots & \vdots & & \vdots \end{vmatrix} \qquad (93)$$

This is referred to as a Slater determinant.

Clearly, (93) is an n-particle eigenstate of

$$H = \sum_{i=1}^{n} h(i)$$

with eigenvalue

$$E = \epsilon_a + \epsilon_b + \ldots$$

By making use of the properties of determinants under the interchange of two columns, it

is straightforward to confirm that it is fully antisymmetric.


The Pauli Exclusion Principle

What happens if we try to construct the antisymmetric Slater determinant for a state with two particles in the same single-particle state? Then the Slater determinant can be written as

$$\Psi^A_{a,b,b,\ldots}(1, 2, 3, \ldots, n) = \frac{1}{\sqrt{n!}}\begin{vmatrix} u_a(1) & u_a(2) & \cdots & u_a(n) \\ u_b(1) & u_b(2) & \cdots & u_b(n) \\ u_b(1) & u_b(2) & \cdots & u_b(n) \\ \vdots & \vdots & & \vdots \end{vmatrix}$$

Note: I have assumed that it is the single-particle state b that contains two particles, but it could be any other and the same conclusion would arise.

Making use of another property of determinants, which states that any one which has two identical rows (or two identical columns) must be zero, we can easily show that

$$\Psi^A_{a,b,b,\ldots}(1, 2, 3, \ldots, n) = 0$$

What this says is that it is impossible to construct an antisymmetric state of identical

fermions in which more than one particle occupies the same single-particle state. This is of

course the well-known Pauli Exclusion Principle.
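Both determinant properties used here, the sign change under interchange of two columns and the vanishing for two identical rows, can be seen in a small numerical experiment. The sketch below evaluates a Slater determinant as an explicit sum over permutations. The three one-dimensional "orbitals" are hypothetical stand-ins chosen only for illustration (they are not orthonormal, so the overall normalization is not meaningful), and the function names are assumptions.

```python
from itertools import permutations
from math import factorial, sqrt

def parity(perm):
    """+1 for an even permutation, -1 for an odd one, via inversion counting."""
    inversions = sum(1 for i in range(len(perm))
                     for j in range(i + 1, len(perm)) if perm[i] > perm[j])
    return -1 if inversions % 2 else 1

def slater(orbitals, coords):
    """Value of the Slater determinant; rows are orbitals, columns are particles."""
    n = len(orbitals)
    total = 0.0
    for perm in permutations(range(n)):           # n! terms
        term = float(parity(perm))
        for row, col in enumerate(perm):          # n factors per term
            term *= orbitals[row](coords[col])
        total += term
    return total / sqrt(factorial(n))

# Hypothetical one-dimensional single-particle functions, for illustration only.
u_a = lambda x: 1.0
u_b = lambda x: x
u_c = lambda x: x * x

coords = (0.3, -1.1, 0.8)
swapped = (-1.1, 0.3, 0.8)                        # particles 1 and 2 interchanged

psi = slater([u_a, u_b, u_c], coords)
psi_swapped = slater([u_a, u_b, u_c], swapped)    # equals -psi
psi_pauli = slater([u_a, u_b, u_b], coords)       # state b occupied twice: vanishes

if __name__ == "__main__":
    print(psi, psi_swapped, psi_pauli)
```

The explicit sum has n! terms of n factors each, so this direct evaluation becomes impractical very quickly as n grows.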

Two versus many identical particles

As is evident from the Slater determinant in (93), systems of many identical fermions

require wave functions with an enormous number of terms. This is in fact equally true

for systems of many identical bosons, even though it is not a determinant that is required

there. As such, it will be very difficult to treat systems of many identical particles using the

framework we have developed so far. Soon we will develop a method for handling systems

of many identical particles that captures all of the exchange and permutation symmetry

character of the systems, but with a much more efficient scheme for handling the many

terms that appear in the wave functions. The framework is called Second Quantization to

distinguish it from the ordinary or First Quantized framework we have used so far.


Two-particle wave functions for spin-1/2 particles

Let’s now discuss in a bit of detail the wave functions for two identical spin-1/2 particles.

Such wave functions obviously depend both on the spin and spatial degrees of freedom. For

my subsequent discussion of the scattering of identical spin-1/2 particles, it will be necessary

for us to know how the spin part of such wave functions behaves under particle interchange.

So, let’s look at that a bit.

Let’s consider therefore a spin wave function for two identical spin-1/2 particles with

total spin S, namely

$$\chi_{\frac{1}{2}\frac{1}{2};SM_S}(\sigma_1, \sigma_2)$$

How does such a spin wave function behave under the interchange of its two spin labels σ1

and σ2?

To answer this, we consider

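$$\begin{aligned}
\chi_{\frac{1}{2}\frac{1}{2};SM_S}(\sigma_2, \sigma_1) &= \sum_{m_1m_2}\left(\tfrac{1}{2}m_1\,\tfrac{1}{2}m_2\,|\,SM_S\right)\chi_{\frac{1}{2}m_1}(\sigma_2)\,\chi_{\frac{1}{2}m_2}(\sigma_1) \\
&= \sum_{m_1m_2}\left(\tfrac{1}{2}m_1\,\tfrac{1}{2}m_2\,|\,SM_S\right)\chi_{\frac{1}{2}m_2}(\sigma_1)\,\chi_{\frac{1}{2}m_1}(\sigma_2) \\
&= (-)^{1-S}\sum_{m_1m_2}\left(\tfrac{1}{2}m_2\,\tfrac{1}{2}m_1\,|\,SM_S\right)\chi_{\frac{1}{2}m_2}(\sigma_1)\,\chi_{\frac{1}{2}m_1}(\sigma_2) \\
&= (-)^{1-S}\,\chi_{\frac{1}{2}\frac{1}{2};SM_S}(\sigma_1, \sigma_2)
\end{aligned}$$

where in the third equality I have used the property of the Clebsch Gordan coefficients which says that

$$(j_1m_1\,j_2m_2\,|\,JM) = (-)^{j_1+j_2-J}\,(j_2m_2\,j_1m_1\,|\,JM) \qquad (94)$$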

As we know, there are two possible values for the total spin, S = 0 and S = 1. We

have now shown that the S = 0 state is antisymmetric under interchange of the two spins,

whereas the S = 1 state is symmetric.
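The exchange behaviour can be checked directly on the four coupled spin states. The sketch below is an illustration: it stores each coupled state as a dictionary of amplitudes over the product basis (m1, m2), using the standard Clebsch-Gordan values for two spin-1/2 particles in the usual Condon-Shortley phase convention (an assumption, since the notes do not fix the phase convention explicitly). The function names are hypothetical.

```python
from math import sqrt

UP, DOWN = +0.5, -0.5

# Amplitudes (1/2 m1, 1/2 m2 | S M) keyed by (S, M), then by (m1, m2).
CG = {
    (1, 1):  {(UP, UP): 1.0},
    (1, 0):  {(UP, DOWN): 1 / sqrt(2), (DOWN, UP): 1 / sqrt(2)},
    (1, -1): {(DOWN, DOWN): 1.0},
    (0, 0):  {(UP, DOWN): 1 / sqrt(2), (DOWN, UP): -1 / sqrt(2)},
}

def swap_particles(state):
    """Interchange the spin labels of particles 1 and 2 in every term."""
    return {(m2, m1): c for (m1, m2), c in state.items()}

def exchange_sign(S, M):
    """Return +1 if the coupled state is symmetric, -1 if antisymmetric."""
    state = CG[(S, M)]
    swapped = swap_particles(state)
    if swapped == state:
        return +1
    if swapped == {key: -c for key, c in state.items()}:
        return -1
    raise ValueError("state has no definite exchange symmetry")

if __name__ == "__main__":
    for (S, M) in CG:
        print(S, M, exchange_sign(S, M), (-1) ** (1 - S))
```

For each of the four states the computed sign should match the factor with exponent 1 − S: +1 for the three S = 1 states and −1 for the S = 0 state.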

Scattering of spin-1/2 particles

Now let’s discuss how the indistinguishable nature of identical particles reflects itself in

scattering experiments. I will focus on the scattering of two identical spin-1/2 particles off

one another. The generalization to arbitrary intrinsic spin particles is straightforward.


Let’s denote the relative position vector of the two colliding spin-1/2 particles by $\mathbf{r} = \mathbf{r}_1 - \mathbf{r}_2$. Assuming that the incident particle has spin projection $m_1\hbar$ and the target particle has spin projection $m_2\hbar$, we might naively write down the wave function prior to the scattering as

$$\frac{1}{(2\pi)^{3/2}}\,e^{i\mathbf{k}\cdot\mathbf{r}}\,\left|\tfrac{1}{2}m_1, \tfrac{1}{2}m_2\right\rangle$$

Of course, in the usual experimental setup, both the incident particle and the target are

unpolarized, i.e. contain a mixture of all possible spin projections with none preferentially

favored. In such a scenario, we would of course have to take averages over the possible spin

projections m1 and m2.

Alternatively, as we know, we could consider the two spins coupled together to total spin

S and projection MS and try writing down the wave function before scattering as

$$\frac{1}{(2\pi)^{3/2}}\,e^{i\mathbf{k}\cdot\mathbf{r}}\,\left|\tfrac{1}{2}\tfrac{1}{2}, SM_S\right\rangle$$

This is in fact the preferred representation, and the one we will use as we proceed. When

working in this representation, we would reflect an unpolarized scenario by averaging over

all possible values of S and MS. We’ll discuss how to do this shortly.

Why did I say naively when I wrote down the above two possible wave functions? The reason is that neither of the two wave functions I wrote down is proper for describing systems of two identical spin-1/2 particles, as neither is fully antisymmetric. To obtain a properly antisymmetric two-fermion wave function, we would need to consider the behavior under interchange of the two particles. Under particle interchange, I claim that

$$\mathbf{r} \to -\mathbf{r}$$

Thus, the properly antisymmetric plane wave incident state is

$$\frac{1}{(2\pi)^{3/2}}\,\frac{1}{\sqrt{2}}\left[e^{i\mathbf{k}\cdot\mathbf{r}} + (-)^S\,e^{-i\mathbf{k}\cdot\mathbf{r}}\right]\left|\tfrac{1}{2}\tfrac{1}{2}, SM_S\right\rangle$$

The factor (−)S expresses the fact that for S = 0 the spin part of the wave function is

antisymmetric so that a symmetric spatial wave function is required for full antisymmetry.

Correspondingly, for S = 1, the spin part is symmetric and an antisymmetric spatial wave

function is needed.

Now let’s look at the spherically outgoing asymptotic wave. Before antisymmetrization,

i.e. while naivete still reigned supreme, it would be

$$f(\theta, \phi)\,\frac{e^{ikr}}{r}\,\left|\tfrac{1}{2}\tfrac{1}{2}, S'M_S\right\rangle$$

In general it is possible for the final spin S ′ to differ from the incident spin S. However, if

the interaction that governs the scattering is spin-independent it is impossible for the spin

of the system to change in the scattering process and S ′ = S. Also, the scattering amplitude

in such cases would not depend on S.

As noted earlier, under particle interchange r → −r. Thus θ → π − θ and ϕ → π + ϕ. A

properly antisymmetrized spherically outgoing wave can thus be written (assuming no spin

dependence of the hamiltonian) as

$$\frac{1}{\sqrt{2}}\left[f(\theta, \phi) + (-)^S f(\pi - \theta, \pi + \phi)\right]\frac{e^{ikr}}{r}\,\left|\tfrac{1}{2}\tfrac{1}{2}, SM_S\right\rangle$$

As we have seen, the coefficient of the spherically outgoing asymptotic wave determines

the differential cross section. For a given spin state S and spin-independent scattering of

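identical particles, the appropriate expression is

$$\begin{aligned}
\left(\frac{d\sigma}{d\Omega}\right)_S &= \left|f(\theta, \phi) + (-)^S f(\pi - \theta, \pi + \phi)\right|^2 \\
&= |f(\theta, \phi)|^2 + |f(\pi - \theta, \pi + \phi)|^2 + 2(-)^S\,\mathrm{Re}\left[f(\theta, \phi)\,f^*(\pi - \theta, \pi + \phi)\right]
\end{aligned}$$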

As in ordinary scattering theory it is useful to choose the z-axis along the incident (rel-

ative) momentum vector k. For a central potential, this will ensure that the scattering

amplitude f(θ, ϕ) only depends on θ (i.e. the angle between k and k ′).

With this choice of axes,

$$\left(\frac{d\sigma}{d\Omega}\right)_S = |f(\theta)|^2 + |f(\pi - \theta)|^2 + 2(-)^S\,\mathrm{Re}\left[f(\theta)\,f^*(\pi - \theta)\right]$$

We now note that in an unpolarized experiment S = 1 occurs three times more often

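than S = 0, since S = 1 has three spin projections and S = 0 only has one. Thus,

$$\begin{aligned}
\left(\frac{d\sigma}{d\Omega}\right)_{\text{unpolarized}} &= \frac{1}{4}\left(\frac{d\sigma}{d\Omega}\right)_{S=0} + \frac{3}{4}\left(\frac{d\sigma}{d\Omega}\right)_{S=1} \\
&= \frac{1}{4}|f(\theta)|^2 + \frac{1}{4}|f(\pi - \theta)|^2 + \frac{1}{2}\,\mathrm{Re}\left[f(\theta)\,f^*(\pi - \theta)\right] \\
&\quad + \frac{3}{4}|f(\theta)|^2 + \frac{3}{4}|f(\pi - \theta)|^2 - \frac{3}{2}\,\mathrm{Re}\left[f(\theta)\,f^*(\pi - \theta)\right] \\
&= |f(\theta)|^2 + |f(\pi - \theta)|^2 - \mathrm{Re}\left[f(\theta)\,f^*(\pi - \theta)\right]
\end{aligned}$$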
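The spin average is simple enough to verify numerically for any amplitude. In the sketch below the amplitude `toy_amplitude` is an invented complex function of θ used only to exercise the formulas; it does not correspond to any particular potential, and the function names are assumptions.

```python
import cmath
import math

def dsigma_spin(f, theta, S):
    """Cross section for total spin S: |f(theta) + (-)^S f(pi - theta)|^2."""
    amplitude = f(theta) + (-1) ** S * f(math.pi - theta)
    return abs(amplitude) ** 2

def dsigma_unpolarized(f, theta):
    """Statistical average: weight 1/4 for S = 0 and 3/4 for S = 1."""
    return 0.25 * dsigma_spin(f, theta, 0) + 0.75 * dsigma_spin(f, theta, 1)

def dsigma_closed_form(f, theta):
    """|f(theta)|^2 + |f(pi - theta)|^2 - Re[f(theta) f*(pi - theta)]."""
    direct = complex(f(theta))
    exchanged = complex(f(math.pi - theta))
    return abs(direct) ** 2 + abs(exchanged) ** 2 - (direct * exchanged.conjugate()).real

def toy_amplitude(theta):
    """Hypothetical complex scattering amplitude, for illustration only."""
    return cmath.exp(0.4j * theta) / (1.5 - math.cos(theta))

if __name__ == "__main__":
    for theta in (0.3, 1.0, math.pi / 2, 2.5):
        print(theta, dsigma_unpolarized(toy_amplitude, theta),
              dsigma_closed_form(toy_amplitude, theta))
```

At θ = π/2 the direct and exchange amplitudes coincide, so the S = 1 contribution vanishes and the unpolarized cross section reduces to |f(π/2)|².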

Note that this gives the differential cross section for scattering through an angle θ relative to the incident direction. Furthermore, bear in mind that it gives it in the center-of-mass frame of reference, since it is the relative motion Schrodinger equation that we solved to get the scattering amplitude. But we can of course go from this to the lab cross section straightforwardly.

FIG. 6: Schematic illustration of an elastic scattering experiment of two identical fermions. Panels (a) and (b) each show a detector.

As a result of the indistinguishability of the two fermions (perhaps two protons), the cross section contains contributions from two processes, as shown in figure 6. The term $|f(\theta)|^2$ is the cross section resulting from process (a) alone. It would have been the only contribution were the two particles not identical. The term $|f(\pi - \theta)|^2$ is the cross section for scattering through an angle π − θ and thus would come from process (b) alone. The interference between the two processes (since in QM we must add amplitudes) gives rise to the third term $-\mathrm{Re}\left[f(\theta)\,f^*(\pi - \theta)\right]$. It is important to note that there is no way to distinguish between processes (a) and (b) by merely detecting the single outgoing particle (e.g. the outgoing proton).

Second quantization

We have seen that to describe in coordinate (or momentum) space a system of identical

fermions, we must use as our independent-particle basis functions Slater determinants. For

a system of A identical particles, these basis functions contain A! terms, each one containing

A factors. If A is fairly small, we can easily keep track of these terms and all is fine. But

when A is large, the bookkeeping associated with such a complicated wave function becomes

incredibly difficult.

And this bookkeeping problem is no simpler for systems of identical bosons. In such

cases, we need not make sure that each particle is in a different “level”, as prescribed by

the Pauli principle. But we must still deal with fully symmetrized wave functions, and they

involve the same number of terms as do Slater determinants for fermions.

What we need is a formalism for treating many-particle systems of identical particles

that automatically takes into account the antisymmetry or symmetry of the associated wave

functions, but which avoids the terrible bookkeeping issues.

The solution: Second Quantization.

Unfortunately, our textbook by Shankar does not discuss second quantization. So, you’ll

have to depend on my notes, which I personally think are pretty good and pretty complete,

or try reading another book. Merzbacher’s book does have extensive discussion of second

quantization, though I think that mine is better.

So now let me begin to tell you about second quantization.

Fock Space

In a second-quantized formalism, state vectors of identical particle systems are defined

in so-called Fock space. A state in Fock space is characterized by giving

• the possible states, which I’ll denote | α >, that a single particle can occupy, and

• the occupation numbers for these single-particle states.

The single-particle states | α > are (typically) taken to be the complete set of eigenstates

of some single-particle hamiltonian h, i.e.

h| α >= ϵα| α > (95)


We can completely characterize the eigenstates of h in terms of a complete set of quantum numbers, related to a complete set of commuting observables. For example, in discussing a spin-1/2 electron in an atom, h could be the Coulomb hamiltonian for a single electron in the nuclear field. Its eigenstates would then be characterized by the principal quantum number n, the orbital angular momentum l, the total angular momentum j and its z-projection m. With such a choice, $|\alpha\rangle = |nljm\rangle$. We could of course use any alternative complete set of commuting observables associated with the problem.

The occupation number of a given single-particle state $|\alpha\rangle$ tells how many particles occupy that single-particle state in the many-body state.

The fundamental assumption is that one obtains a complete set of many-particle states

by distributing the particles in all allowed ways over a complete set of single-particle states.

I will use the following notation for a state of n particles in Fock space:

| α1, α2, ..., αn >

This means that the n particles occupy single-particle states α1, α2, up to αn. For fermions,

all αi are distinct; for bosons they need not be.

An important state in Fock space is the one with no particles. We call it the vacuum

state and denote it by | 0 >. Fock space includes all distinct state vectors | α1, α2, ..., αn >

for any number of particles. It includes the vacuum state | 0 >, all possible one-particle

state vectors | α1 >, all possible two-particle state vectors | α1, α2 >, etc.

The set of ket vectors | α1, α2, ..., αn > define a linear vector space. There is also a dual

space, defined by a corresponding set of bra vectors < α1, α2, ..., αn |. The scalar product

< α1, α2, ..., αn | β1, β2, ..., βm >

is zero unless the set of occupied states α1, α2, ..., αn and β1, β2, ..., βm are the same.

This includes the possibility that the elements are the same, but their ordering is different.

Next I introduce the concept of standard order of Fock space states. This is done by

introducing an order of the single-particle states, say

α1 < α2 < α3...

This order is arbitrary, but once chosen must be maintained. A state in Fock space is in

standard order if its indices conform to the prescribed order. Thus, | α1, α2 > is in standard

72

order, but | α2, α1 > is not. We define our metric in Fock space according to

< α1, α2, ..., αn | α1, α2, ..., αn >= 1

as long as both states have the same occupation numbers and are in the same order. If they

are not in the same order, the scalar product between the two states will be determined by

whatever factors are required to bring them to the same order. As we’ll see, this requires

consideration of the indistinguishability of the identical particles.

Creation and annihilation operators

The next crucial ingredients in our second quantized formalism are the operators that

connect states in Fock space. Since Fock space states can have different numbers of particles,

we will in general need operators that change particle number. The simplest are those that

create or annihilate a particle.

We define the single-particle creation operator a†β by

\[ a^\dagger_\beta\,|\alpha_1,\alpha_2,\ldots,\alpha_n\rangle = \sqrt{n_\beta+1}\;|\beta,\alpha_1,\alpha_2,\ldots,\alpha_n\rangle \qquad (96) \]

where nβ is the occupation number of state β in | α1, α2, ..., αn >. Thus, a†β creates a

particle in state β, although the resulting state is not necessarily in standard order.

Clearly any ket state in Fock space can be built up by acting with creation operators

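systematically on the vacuum state. In particular,

\[ |\alpha_1,\alpha_2,\ldots,\alpha_n\rangle = N\, a^\dagger_{\alpha_1} a^\dagger_{\alpha_2}\cdots a^\dagger_{\alpha_n}|0\rangle \qquad (97) \]

where

\[ N = \prod_{i=1,n}\left(\frac{1}{\sqrt{n_{\alpha_i}}}\right) \]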

Analogously we introduce the single-particle annihilation operator aβ by

\[ a_\beta\,|\beta,\alpha_1,\alpha_2,\ldots,\alpha_n\rangle = \sqrt{n_\beta}\;|\alpha_1,\alpha_2,\ldots,\alpha_n\rangle \qquad (98) \]

where nβ is the occupation number of state β in | β, α1, α2, ..., αn >.

Clearly aβ annihilates a particle in state β, as long as it is in the first position. If there

is a particle in state β (i.e. nβ ≠ 0) but it is not in the first position, then before applying

the above defining relation we must first put it in the first position. Once again, we see the

need to rearrange Fock space states, or equivalently the order of creation operators.

73

On the basis of what I’ve just said, it should be clear that

aβ| 0 >= 0, for any β

and

aβa†β| 0 >= | 0 >
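As a quick consistency check of the defining relations (96) and (98), here is a small worked example for bosons and a single state β. It is only an illustration of how the square-root factors combine, under the assumption that no other states are occupied.

```latex
% Build up from the vacuum with (96), then step back down with (98):
a^\dagger_\beta|0\rangle = |\beta\rangle , \qquad
a^\dagger_\beta|\beta\rangle = \sqrt{2}\,|\beta,\beta\rangle , \qquad
a_\beta|\beta,\beta\rangle = \sqrt{2}\,|\beta\rangle , \qquad
a_\beta|\beta\rangle = |0\rangle

% Hence the two orderings acting on the one-particle state give
a_\beta a^\dagger_\beta|\beta\rangle = 2\,|\beta\rangle , \qquad
a^\dagger_\beta a_\beta|\beta\rangle = |\beta\rangle
```

The two orderings differ by exactly | β >, the same pattern familiar from harmonic oscillator raising and lowering operators.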

Indistinguishability of identical particles

Now let’s incorporate the indistinguishability of identical particles in QM in our formal-

ism. In doing so, we will pin down the earlier issues of how to relate state vectors that differ

only in the order in which particles were created.

To do this, it is useful to consider the coordinate state representations of these Fock space

states.

For one particle,

< r | α >= ϕα(r)

which is just the one-particle wave function in coordinate space.

For two particles,

< r1, r2 | α, β >= ϕαβ(r1, r2)

But we know that the wave functions for two identical particles in coordinate space

depend on whether they are fermions or bosons. For fermions,

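\[ \phi_{\alpha\beta}(r_1,r_2) = \frac{1}{\sqrt{2}}\left[\phi_\alpha(r_1)\phi_\beta(r_2) - \phi_\beta(r_1)\phi_\alpha(r_2)\right] = -\phi_{\beta\alpha}(r_1,r_2) \]

In contrast, for bosons,

\[ \phi_{\alpha\beta}(r_1,r_2) = \frac{1}{\sqrt{2}}\left[\phi_\alpha(r_1)\phi_\beta(r_2) + \phi_\beta(r_1)\phi_\alpha(r_2)\right] = \phi_{\beta\alpha}(r_1,r_2) \]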

We can trivially incorporate this in our second quantized formalism by imposing the

condition

| α, β >= λ| β, α >

where λ = −1 for fermions and +1 for bosons.

74

Many particles

It is trivial to generalize the above conditions to Fock space states involving n particles.

The generalization is

| α1, α2, ..., αn >= λ| α2, α1, ..., αn >

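Now let’s consider the action of $a^\dagger_{\beta_1}a^\dagger_{\beta_2}$ on an arbitrary state in Fock space. From earlier definitions

\[
\begin{aligned}
a^\dagger_{\beta_1}a^\dagger_{\beta_2}\,|\alpha_1,\alpha_2,\ldots,\alpha_n\rangle
&= \sqrt{(n_{\beta_1}+\delta_{\beta_1,\beta_2}+1)(n_{\beta_2}+1)}\;|\beta_1,\beta_2,\alpha_1,\alpha_2,\ldots,\alpha_n\rangle \\
&= \lambda\sqrt{(n_{\beta_1}+\delta_{\beta_1,\beta_2}+1)(n_{\beta_2}+1)}\;|\beta_2,\beta_1,\alpha_1,\alpha_2,\ldots,\alpha_n\rangle \\
&= \lambda\, a^\dagger_{\beta_2}a^\dagger_{\beta_1}\,|\alpha_1,\alpha_2,\ldots,\alpha_n\rangle
\end{aligned}
\]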

Since this holds for any state vector | α1, α2, ..., αn >, we can conclude that

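\[
a^\dagger_{\beta_1}a^\dagger_{\beta_2} = \lambda\, a^\dagger_{\beta_2}a^\dagger_{\beta_1} =
\begin{cases}
\;\;\,a^\dagger_{\beta_2}a^\dagger_{\beta_1} & \text{for bosons} \\
-a^\dagger_{\beta_2}a^\dagger_{\beta_1} & \text{for fermions}
\end{cases}
\]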

Let’s now focus for a moment on fermions, for which the negative sign applies, and

consider the case β1 = β2. Then

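\[ a^\dagger_{\beta_1}a^\dagger_{\beta_1} = -a^\dagger_{\beta_1}a^\dagger_{\beta_1} \]

which can only be satisfied if

\[ a^\dagger_{\beta_1}a^\dagger_{\beta_1} = 0 , \quad \text{for all } \beta_1 \]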

Thus our second quantized formalism automatically accommodates the Pauli principle in

that two fermions cannot be created in the same state.

Notation:

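\[ [A,B] = AB - BA , \quad \text{commutator} \]

\[ \{A,B\} = AB + BA , \quad \text{anticommutator} \]

We can summarize the above relations as

\[ [a^\dagger_\alpha, a^\dagger_\beta] = 0 , \quad \text{for bosons} \qquad (99) \]

\[ \{a^\dagger_\alpha, a^\dagger_\beta\} = 0 , \quad \text{for fermions} \qquad (100) \]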

75

Further properties of the creation and annihilation operators

Let’s now discuss some important properties of the creation and annihilation operators.

The first point is that the creation operator a†α and the annihilation operator aα are her-

mitean adjoints of one another. I will leave this as a homework problem for you to prove.

To do so, you need merely confirm that for any Fock space state vectors | ϕ > and | Ψ > in

standard order, the following relation between the matrix elements of a†α and aα is satisfied:

< ϕ | a†α | Ψ >=< Ψ | aα | ϕ >∗

I earlier showed that

\[ a^\dagger_\alpha a^\dagger_\beta = \lambda\, a^\dagger_\beta a^\dagger_\alpha \]

where λ = 1 for bosons and −1 for fermions. Taking the hermitean adjoint of both sides we

get

aβaα = λaαaβ

We thus see that the annihilation operators likewise satisfy analogous commutation or

anticommutation relations

[aα, aβ] = 0 , for bosons (101)

{aα, aβ} = 0 , for fermions (102)

We have looked at what happens when we interchange the order of two creation or two

annihilation operators. But what about when we interchange the order of a creation and an

annihilation operator? More specifically, what is the relation, if any, between the operators

a†αaβ and aβa†α?

I now make two claims which will again be left as a homework assignment for you to

prove.

(a) aαa†β = λa†βaα, for α ≠ β.

(b) aαa†α = λa†αaα + I,

where I is the identity operator. Both can be proven by acting on an arbitrary ket vector

in standard order.

76

The above two relations can also be summarized compactly as

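\[ [a_\alpha, a^\dagger_\beta] = \delta_{\alpha\beta}\, I , \quad \text{for bosons} \qquad (103) \]

\[ \{a_\alpha, a^\dagger_\beta\} = \delta_{\alpha\beta}\, I , \quad \text{for fermions} \qquad (104) \]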
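The sign bookkeeping behind these relations can be checked mechanically. What follows is a minimal Python sketch, assuming fermions (λ = −1) and a toy space of M = 3 single-particle states. Representing a Fock state as a dictionary from standard-ordered occupied indices to amplitudes, and the helper names `create`, `destroy`, `add`, and `anticommutators_ok`, are illustrative choices and not part of the notes.

```python
from itertools import combinations

M = 3  # number of single-particle states (toy value, assumed for illustration)

def create(k, state):
    """Apply a†_k to a state {standard-ordered tuple of occupied indices: amplitude}."""
    out = {}
    for occ, amp in state.items():
        if k in occ:
            continue  # Pauli principle: a†_k a†_k = 0
        # one factor of lambda = -1 for each occupied state that k must pass
        sign = (-1) ** sum(1 for i in occ if i < k)
        new = tuple(sorted(occ + (k,)))
        out[new] = out.get(new, 0) + sign * amp
    return out

def destroy(k, state):
    """Apply a_k: bring k to the first position (with signs), then remove it."""
    out = {}
    for occ, amp in state.items():
        if k not in occ:
            continue
        sign = (-1) ** sum(1 for i in occ if i < k)
        new = tuple(i for i in occ if i != k)
        out[new] = out.get(new, 0) + sign * amp
    return out

def add(s1, s2):
    """Sum of two states, dropping components whose amplitude cancels to zero."""
    out = dict(s1)
    for occ, amp in s2.items():
        out[occ] = out.get(occ, 0) + amp
    return {occ: amp for occ, amp in out.items() if amp != 0}

# every basis state of the toy Fock space, from the vacuum up to M particles
basis = [c for n in range(M + 1) for c in combinations(range(M), n)]

def anticommutators_ok():
    for occ in basis:
        psi = {occ: 1}
        for a in range(M):
            for b in range(M):
                # {a†_a, a†_b} = 0
                if add(create(a, create(b, psi)), create(b, create(a, psi))):
                    return False
                # {a_a, a_b} = 0
                if add(destroy(a, destroy(b, psi)), destroy(b, destroy(a, psi))):
                    return False
                # {a_a, a†_b} = delta_ab
                lhs = add(destroy(a, create(b, psi)), create(b, destroy(a, psi)))
                if lhs != (psi if a == b else {}):
                    return False
    return True

print(anticommutators_ok())
```

Checking the relations on every basis state is enough here because the operators are linear. The same skeleton would work for bosons if occupation numbers and the square-root factors of (96) and (98) replaced the signs.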

Number-conserving operators

Up to now, we’ve focussed on the simplest possible operators, namely those that either

create or annihilate/destroy a single particle. Such operators connect states with different

numbers of particles. I will now turn to a very important class of operators, called “number-

conserving operators” which by definition only connect states with the same number of

particles. That such operators should play a preeminent role in QM is already evident from

the fact that the dynamics of (non-relativistic) quantum systems is governed by the hamil-

tonian operator, which is clearly number conserving. So, let’s now see how such operators

are built up in terms of the fundamental creation and annihilation operators.

The number operator

Let me begin by discussing a simple example, namely the number operator. This operator,

which I will denote N , is the one which when acting on an arbitrary state in Fock space tells

you how many particles are in that state. Consider a state in Fock space |α1, α2, ..., αn > in

standard order. Assume that αi is the first time that the single-particle state β occurs in

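this ordered state vector. Then

\[ a_\beta\,|\alpha_1,\alpha_2,\ldots,\alpha_n\rangle = \sqrt{n_\beta}\;\lambda^{i-1}\,|\alpha_1,\alpha_2,\ldots,\alpha_{i-1},\alpha_{i+1},\ldots,\alpha_n\rangle \]

Furthermore,

\[
\begin{aligned}
a^\dagger_\beta a_\beta\,|\alpha_1,\alpha_2,\ldots,\alpha_n\rangle
&= \sqrt{n_\beta}\;\lambda^{i-1}\sqrt{n_\beta}\;|\beta,\alpha_1,\alpha_2,\ldots,\alpha_{i-1},\alpha_{i+1},\ldots,\alpha_n\rangle \\
&= n_\beta\,|\alpha_1,\alpha_2,\ldots,\alpha_n\rangle
\end{aligned}
\]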

Thus, a†βaβ counts the number of particles in state β in a given Fock space state.

If we then define

\[ N = \sum_\beta a^\dagger_\beta a_\beta \qquad (105) \]

77

then

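\[
\begin{aligned}
N\,|\alpha_1,\alpha_2,\ldots,\alpha_n\rangle &= \sum_\beta n_\beta\,|\alpha_1,\alpha_2,\ldots,\alpha_n\rangle \\
&= n\,|\alpha_1,\alpha_2,\ldots,\alpha_n\rangle
\end{aligned}
\]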

confirming that N as defined above is indeed the total number operator.

Some useful relations which you will be asked to confirm in the homework are that

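\[ [a_\alpha, N] = a_\alpha \qquad (106) \]

\[ [a^\dagger_\alpha, N] = -a^\dagger_\alpha \qquad (107) \]

\[ [a^\dagger_\alpha a_\beta, N] = [a_\beta a^\dagger_\alpha, N] = 0 \qquad (108) \]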

From (108) we see that any operator containing one creation and one annihilation opera-

tor commutes with the number operator. Thus, such an operator cannot change the number

of particles in a state on which it operates and is therefore a “number-conserving operator”.

It is trivial to generalize this (already fairly obvious) statement to reach the more gen-

eral conclusion that any operator containing the same number of creation and annihilation

operators commutes with N and is thus “number conserving”.
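Relations (106) to (108) can also be spot-checked numerically. The sketch below does this for bosons, assuming a toy space of M = 2 single-particle states and occupation numbers up to 3. The occupation-number dictionary and the helper names `create`, `destroy`, `number`, `minus`, `same`, and `relations_hold` are hypothetical names chosen for this example.

```python
from math import sqrt
from itertools import product

M = 2  # number of single-particle states (toy value, assumed for illustration)

def create(k, state):
    """a†_k on a state {tuple of occupation numbers: amplitude}, factor sqrt(n_k + 1)."""
    out = {}
    for occ, amp in state.items():
        new = occ[:k] + (occ[k] + 1,) + occ[k + 1:]
        out[new] = out.get(new, 0.0) + sqrt(occ[k] + 1) * amp
    return out

def destroy(k, state):
    """a_k with factor sqrt(n_k); components with n_k = 0 are annihilated."""
    out = {}
    for occ, amp in state.items():
        if occ[k] == 0:
            continue
        new = occ[:k] + (occ[k] - 1,) + occ[k + 1:]
        out[new] = out.get(new, 0.0) + sqrt(occ[k]) * amp
    return out

def number(state):
    """Total number operator N = sum_k a†_k a_k."""
    out = {}
    for k in range(M):
        for occ, amp in create(k, destroy(k, state)).items():
            out[occ] = out.get(occ, 0.0) + amp
    return out

def minus(s1, s2):
    """Return the state s1 - s2."""
    out = dict(s1)
    for occ, amp in s2.items():
        out[occ] = out.get(occ, 0.0) - amp
    return out

def same(s1, s2, tol=1e-9):
    """Compare two states component by component within a tolerance."""
    return all(abs(s1.get(o, 0.0) - s2.get(o, 0.0)) < tol for o in set(s1) | set(s2))

def relations_hold(nmax=3):
    for occ in product(range(nmax + 1), repeat=M):
        psi = {occ: 1.0}
        for a in range(M):
            # (106): a N - N a = a
            if not same(minus(destroy(a, number(psi)), number(destroy(a, psi))), destroy(a, psi)):
                return False
            # (107): a† N - N a† = -a†, written here as N a† - a† N = a†
            if not same(minus(number(create(a, psi)), create(a, number(psi))), create(a, psi)):
                return False
            for b in range(M):
                # (108): a†_a a_b commutes with N
                hop = lambda s: create(a, destroy(b, s))
                if not same(minus(hop(number(psi)), number(hop(psi))), {}):
                    return False
    return True

print(relations_hold())
```

No truncation of the boson space is needed in this form, since the operators act on states rather than being stored as finite matrices.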

Of this large class of number-conserving operators, the ones of most interest are those in

which there is just one creation operator followed by one annihilation operator and those

in which there are just two creation operators followed by two annihilation operators. As

we will now confirm, operators with just one creation operator followed by one annihila-

tion operator correspond to quantum one-body operators, whereas those with two creation

operators followed by two annihilation operators correspond to quantum two-body opera-

tors. And indeed, many of the most important operators in quantum mechanics are one- or

two-body operators.

I will not try to convince you of these remarks in general, since (as noted earlier) the

bookkeeping associated with many-particle states is very difficult. Rather, I will just confirm

them for systems with two particles.

For two particles, a general one-body operator can be written as

Ω = ω(1) + ω(2)

78

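Now let’s evaluate the matrix elements of Ω between states of two identical particles, either fermions or bosons,

\[ \frac{1}{\sqrt{2}}\left[\phi_{\alpha_1}(1)\phi_{\alpha_2}(2) + \lambda\,\phi_{\alpha_2}(1)\phi_{\alpha_1}(2)\right] \]

and

\[ \frac{1}{\sqrt{2}}\left[\phi_{\alpha_3}(1)\phi_{\alpha_4}(2) + \lambda\,\phi_{\alpha_4}(1)\phi_{\alpha_3}(2)\right] \]

We’ll assume that α1 ≠ α2 and α3 ≠ α4; otherwise the states wouldn’t exist for fermions.

We obtain for the matrix element of Ω,

\[
\begin{aligned}
\frac{1}{2}\Big[\;
& \langle\phi_{\alpha_1}(1)|\omega(1)|\phi_{\alpha_3}(1)\rangle\,\delta_{\alpha_2\alpha_4}
 + \langle\phi_{\alpha_2}(2)|\omega(2)|\phi_{\alpha_4}(2)\rangle\,\delta_{\alpha_1\alpha_3} \\
& + \lambda\langle\phi_{\alpha_2}(1)|\omega(1)|\phi_{\alpha_3}(1)\rangle\,\delta_{\alpha_1\alpha_4}
 + \lambda\langle\phi_{\alpha_1}(2)|\omega(2)|\phi_{\alpha_4}(2)\rangle\,\delta_{\alpha_2\alpha_3} \\
& + \lambda\langle\phi_{\alpha_1}(1)|\omega(1)|\phi_{\alpha_4}(1)\rangle\,\delta_{\alpha_2\alpha_3}
 + \lambda\langle\phi_{\alpha_2}(2)|\omega(2)|\phi_{\alpha_3}(2)\rangle\,\delta_{\alpha_1\alpha_4} \\
& + \langle\phi_{\alpha_2}(1)|\omega(1)|\phi_{\alpha_4}(1)\rangle\,\delta_{\alpha_1\alpha_3}
 + \langle\phi_{\alpha_1}(2)|\omega(2)|\phi_{\alpha_3}(2)\rangle\,\delta_{\alpha_2\alpha_4}\;\Big] \\
= {} & \langle\phi_{\alpha_1}|\omega|\phi_{\alpha_3}\rangle\,\delta_{\alpha_2\alpha_4}
 + \lambda\langle\phi_{\alpha_1}|\omega|\phi_{\alpha_4}\rangle\,\delta_{\alpha_2\alpha_3} \\
& + \lambda\langle\phi_{\alpha_2}|\omega|\phi_{\alpha_3}\rangle\,\delta_{\alpha_1\alpha_4}
 + \langle\phi_{\alpha_2}|\omega|\phi_{\alpha_4}\rangle\,\delta_{\alpha_1\alpha_3}
\end{aligned}
\]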

Next, we consider the Fock space operator

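\[ \hat\Omega = \sum_{k,k'=1}^{\infty} \omega_{kk'}\, a^\dagger_k a_{k'} \]

with

\[ \omega_{kk'} = \langle\phi_k|\,\omega\,|\phi_{k'}\rangle \]

Note that I put a hat on top of Ω to make clear that $\hat\Omega$ is a second-quantized operator. Note also that $\omega_{kk'}$ is just a matrix of c-numbers, and that all operator dependence in $\hat\Omega$ is contained in $a^\dagger_k a_{k'}$. Let’s evaluate the matrix elements of $\hat\Omega$ between the two-particle Fock space states | α1, α2 > and | α3, α4 >, where as before we assume α1 ≠ α2 and α3 ≠ α4.

Then

\[ |\alpha_1,\alpha_2\rangle = a^\dagger_{\alpha_1}a^\dagger_{\alpha_2}|0\rangle \]

and

\[ |\alpha_3,\alpha_4\rangle = a^\dagger_{\alpha_3}a^\dagger_{\alpha_4}|0\rangle \]

so that

\[ \langle\alpha_1,\alpha_2|\,\hat\Omega\,|\alpha_3,\alpha_4\rangle = \sum_{k,k'} \omega_{kk'}\, A \]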

79

where

\[ A = \langle 0|\, a_{\alpha_2}a_{\alpha_1}\, a^\dagger_k a_{k'}\, a^\dagger_{\alpha_3}a^\dagger_{\alpha_4}\,|0\rangle \]

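To evaluate A, note that

\[
\begin{aligned}
a_{k'}a^\dagger_{\alpha_3}a^\dagger_{\alpha_4}|0\rangle
&= \lambda\, a^\dagger_{\alpha_3}a_{k'}a^\dagger_{\alpha_4}|0\rangle + \delta_{\alpha_3k'}\,a^\dagger_{\alpha_4}|0\rangle \\
&= a^\dagger_{\alpha_3}a^\dagger_{\alpha_4}a_{k'}|0\rangle + \lambda\,\delta_{\alpha_4k'}\,a^\dagger_{\alpha_3}|0\rangle + \delta_{\alpha_3k'}\,a^\dagger_{\alpha_4}|0\rangle \\
&= \lambda\,\delta_{\alpha_4k'}\,a^\dagger_{\alpha_3}|0\rangle + \delta_{\alpha_3k'}\,a^\dagger_{\alpha_4}|0\rangle
\end{aligned}
\]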

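From this we find that

\[
A = \langle 0|a_{\alpha_2}a_{\alpha_1}a^\dagger_k a_{k'}a^\dagger_{\alpha_3}a^\dagger_{\alpha_4}|0\rangle
= \lambda\,\langle 0|a_{\alpha_2}a_{\alpha_1}a^\dagger_k a^\dagger_{\alpha_3}|0\rangle\,\delta_{\alpha_4k'}
+ \langle 0|a_{\alpha_2}a_{\alpha_1}a^\dagger_k a^\dagger_{\alpha_4}|0\rangle\,\delta_{\alpha_3k'}
\]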

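Thus,

\[
\langle\alpha_1,\alpha_2|\,\hat\Omega\,|\alpha_3,\alpha_4\rangle
= \sum_k \Big[ \lambda\,\omega_{k\alpha_4}\,\langle 0|a_{\alpha_2}a_{\alpha_1}a^\dagger_k a^\dagger_{\alpha_3}|0\rangle
+ \omega_{k\alpha_3}\,\langle 0|a_{\alpha_2}a_{\alpha_1}a^\dagger_k a^\dagger_{\alpha_4}|0\rangle \Big]
\]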

Both terms involve a generic matrix element of the form < 0| aαaβa†γa†δ| 0 >. In a

homework problem, I will ask you to confirm that

< 0| aαaβa†γa†δ| 0 >= λδβδδαγ + δβγδαδ

Using this result, we find finally that

< α1, α2| Ω| α3, α4 > = λωα1α4δα2α3 + ωα2α4δα1α3

+ωα1α3δα2α4 + λωα2α3δα1α4

which is identical to the result we obtained on page 79 for the matrix element of Ω between

two-particle states in ordinary (first-quantized) space.

Indeed, the conclusion is general. A Fock space operator of the form $\hat\Omega = \sum_{kk'} \omega_{kk'}\,a^\dagger_k a_{k'}$ gives exactly the same matrix elements when taken between n-particle Fock space states as does the corresponding general one-body operator $\Omega = \sum_{i=1}^{n} \omega(i)$ when taken between the corresponding ordinary space states of n particles.
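The two-particle result can be confirmed by brute force. The following Python sketch assumes fermions (λ = −1), a toy space of M = 4 single-particle states, and a made-up matrix of c-numbers `omega`. It builds | α1, α2 > and | α3, α4 > with creation operators, applies the Fock space operator, and compares the overlap against the four-term closed form. All function names are illustrative.

```python
from itertools import permutations

M = 4      # number of single-particle states (toy value)
LAM = -1   # fermions

# Arbitrary c-number matrix omega[k][kp]; the entries are made-up test values.
omega = [[10 * (k + 1) + (kp + 1) for kp in range(M)] for k in range(M)]

def create(k, state):
    """a†_k on a fermion state {standard-ordered occupied indices: amplitude}."""
    out = {}
    for occ, amp in state.items():
        if k in occ:
            continue
        sign = (-1) ** sum(1 for i in occ if i < k)
        new = tuple(sorted(occ + (k,)))
        out[new] = out.get(new, 0) + sign * amp
    return out

def destroy(k, state):
    """a_k on a fermion state, with the sign from moving k to the first position."""
    out = {}
    for occ, amp in state.items():
        if k not in occ:
            continue
        sign = (-1) ** sum(1 for i in occ if i < k)
        new = tuple(i for i in occ if i != k)
        out[new] = out.get(new, 0) + sign * amp
    return out

def apply_one_body(state):
    """Apply sum over k, k' of omega[k][k'] a†_k a_k' to a state."""
    out = {}
    for k in range(M):
        for kp in range(M):
            for occ, amp in create(k, destroy(kp, state)).items():
                out[occ] = out.get(occ, 0) + omega[k][kp] * amp
    return out

def overlap(bra, ket):
    """Scalar product of two states with real amplitudes."""
    return sum(amp * ket.get(occ, 0) for occ, amp in bra.items())

def two_particle(a, b):
    """|a, b> = a†_a a†_b |0>"""
    return create(a, create(b, {(): 1}))

def closed_form(a1, a2, a3, a4):
    """The four-term result derived in the text."""
    d = lambda x, y: 1 if x == y else 0
    return (LAM * omega[a1][a4] * d(a2, a3) + omega[a2][a4] * d(a1, a3)
            + omega[a1][a3] * d(a2, a4) + LAM * omega[a2][a3] * d(a1, a4))

def formula_matches():
    pairs = list(permutations(range(M), 2))  # all ordered pairs with distinct labels
    return all(
        overlap(two_particle(a1, a2), apply_one_body(two_particle(a3, a4)))
        == closed_form(a1, a2, a3, a4)
        for a1, a2 in pairs for a3, a4 in pairs
    )

print(formula_matches())
```

Integer entries in `omega` keep every comparison exact, so no floating-point tolerance is needed.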

In fact, one can prove exactly the same thing for a general two-body operator

\[ V = \frac{1}{2}\sum_{i\neq j} V(i,j) \]

80

Note: Since we are dealing with identical particles V (i, j) = V (j, i). The factor of 1/2 is

included since for each pair (i, j) with i ≠ j, there also occurs the pair (j, i) for which the

interaction is identical.

The two-body matrix elements of V (i, j) can be written in the form

Vαβ,α′β′ =< ϕα(i)ϕβ(j)|V (i, j)|ϕα′(i)ϕβ′(j) > +λ < ϕα(i)ϕβ(j)|V (i, j)|ϕβ′(i)ϕα′(j) >

as a sum of two integrals, the first called the direct integral and the second called the

exchange integral, as it arises from the exchange of the labels α ′ and β ′.

I now claim that the Fock space operator

\[ \hat V = \frac{1}{4}\sum_{k_1k_2k_3k_4=1}^{\infty} V_{k_1k_2k_3k_4}\; a^\dagger_{k_1}a^\dagger_{k_2}a_{k_4}a_{k_3} \]

has the same matrix elements in Fock space as given above in ordinary first-quantized space.

Note that the inverted order of the last two annihilation operators is indeed not a typo.

My discussion up to now on one- and two-body operators has been fairly general. Nev-

ertheless, from the notation you might have guessed that I was gearing up for a discussion

of a specific QM operator, the hamiltonian. So, consider a hamiltonian which in coordinate

space could be written as

H = H0 + V

where

\[ H_0 = \sum_{i=1}^{n} h_0(i) \]

and

\[ V = \frac{1}{2}\sum_{i\neq j=1}^{n} v(i,j) \]

For example, in discussing atoms, h0 might refer to the kinetic energy of electron i plus

its Coulomb interaction with the nuclear core and v(i, j) might be the electron-electron

Coulomb interaction.

From the preceding discussion, we can immediately write down the corresponding Fock

space hamiltonian as

\[ \hat H = \sum_{kk'} \langle k|\,h_0\,|k'\rangle\, a^\dagger_k a_{k'} + \frac{1}{4}\sum_{ijkl} V_{ijkl}\; a^\dagger_i a^\dagger_j a_l a_k \]

81

Had we been clever enough to define our Fock space in terms of the single-particle eigen-

states of h0, i.e. by h0|k >= ϵk|k >, then

\[ \hat H = \sum_k \epsilon_k\, a^\dagger_k a_k + \frac{1}{4}\sum_{ijkl} V_{ijkl}\; a^\dagger_i a^\dagger_j a_l a_k \]

There is an important feature of this Fock space formalism relative to the coordinate

space formalism that I would now like to note. The point is that the coordinate space

hamiltonian H depends on the number of particles n, whereas the corresponding Fock space

hamiltonian H applies to systems with any number of particles n. Of course, since H is

a number-conserving operator, its matrix decomposes into submatrices along the diagonal,

each one corresponding to a different number of particles.

At first glance, it seems that we have gained little by going over to Fock space, except for

the bookkeeping simplifications I’ve mentioned earlier. [In fact, many of you might not yet

be convinced that the bookkeeping has been simplified, but take my word for it that it has.]

But we’ve gained something else. Since the Fock space hamiltonian is independent of n, we

can construct its complete matrix corresponding to all states at the same time. Once we

have expressed our physical problem in terms of the mathematical problem of diagonalizing

a certain matrix, we can resort to any mathematically justifiable procedure we want. The

larger the matrix at our disposal, the richer are our choices. In fact, a very powerful way of

approximately diagonalizing H involves carrying out a transformation to a representation

which mixes states with different numbers of particles. This is the basis of the so-called BCS

approximation used to describe superconductivity, as we will discuss later in the semester.

Such a number non-conserving method would not be possible without the use of Fock space,

where all numbers of particles can (if we wish) be treated at the same time.

Symmetries

We have just finished a discussion of “number conserving operators”, which play a par-

ticularly important role in treatments of nonrelativistic quantum systems which typically

have a well defined number of particles. That physical systems have a well-defined number

of particles is an example of a symmetry principle or conservation law. We have seen that

this has important consequences in our second quantized description of the system. On the

one hand, it tells us that the (true) hamiltonian describing the system can be built up in

terms of number-conserving operators only. And, furthermore, it permits us (if we wish) to

82

diagonalize H in a subspace of the full Fock space, namely the subspace of states with the

correct number of particles.

There are many other important symmetry principles for quantum systems, which likewise

can be used to simplify the treatment of complex many-body systems of identical particles,

in much the same way as did Conservation of Particle Number. Some of the better known

symmetries, all of which were discussed in PHYS811, include Rotational Invariance, Invari-

ance under Space Reflections and Invariance under Time Reversal. I would now like to briefly

show how some of these symmetries are manifested in our second quantized framework.

Rotational invariance and angular momentum

As we have seen, invariance under spatial rotations leads to the concept of conservation

of angular momentum. In analogy with conservation of particle number, this leads to two

principal consequences:

1. that hamiltonians for isolated systems must be scalars (or, equivalently, Irreducible

Tensor Operators of rank 0), and

2. that we can (if we wish) only diagonalize H in subspaces corresponding to states with

given total angular momentum.

Of course, it is possible that under certain circumstances it might be useful to relax the

symmetry to develop useful approximation schemes, as we discussed doing for particle num-

ber conservation. Indeed, this is what is often done in the so-called Hartree Fock procedure,

which I will describe later in the semester.

Considering the importance of rotational invariance in so many quantum systems, it is

useful to discuss in greater detail how it enters in our Fock space formalism.

If the total hamiltonian H is rotationally invariant, it is natural (though not essential) to

define the Fock space states in terms of a single-particle hamiltonian h which is rotationally

invariant. The eigenstates of such a hamiltonian can be expressed as | αi ji mi >, where ji

is the total angular momentum quantum number, mi is its z projection, and αi are all other

quantum numbers (e.g. the principal quantum number). These states satisfy the following

eigenvalue equations

83

h| αi ji mi > = ϵi| αi ji mi >

J2| αi ji mi > = ji(ji + 1)| αi ji mi >

Jz| αi ji mi > = mi| αi ji mi >

where I am for notational simplicity setting ħ = 1.

Let’s now consider the form of these various operators in our second quantized formalism.

The angular momentum operator is an Irreducible Tensor Operator of rank one. The

three components of this ITO are

\[ J^0_1 = J_z \]

\[ J^1_1 = -\frac{1}{\sqrt{2}}\,J_+ \]

\[ J^{-1}_1 = \frac{1}{\sqrt{2}}\,J_- \]

The total angular momentum squared operator is

\[ J^2 = J\cdot J = J^0_1J^0_1 - J^1_1J^{-1}_1 - J^{-1}_1J^1_1 \]

Now let’s look at all these operators in Fock space. Since J is a one-body operator, each

of its three components are given by

\[ J^q_1 = \sum_{\alpha_1j_1m_1,\;\alpha_2j_2m_2} \langle\alpha_1j_1m_1|\,J^q_1\,|\alpha_2j_2m_2\rangle\; a^\dagger_{\alpha_1j_1m_1}a_{\alpha_2j_2m_2} \]

The one-particle matrix elements of Jq1 can be obtained using the Wigner Eckart theorem.

The result, as derived last semester, is

\[ \langle\alpha_1j_1m_1|\,J^q_1\,|\alpha_2j_2m_2\rangle = -\sqrt{j_1(j_1+1)}\;(1q\,j_2m_2|j_1m_1)\;\delta_{\alpha_1\alpha_2}\delta_{j_1j_2} \]

Thus,

\[ J^q_1 = -\sum_{\alpha jm} \sqrt{j(j+1)}\;(1q\,jm|j\,m{+}q)\; a^\dagger_{\alpha j\,m+q}\,a_{\alpha jm} \]

where I have taken into account that the Clebsch Gordan coefficient is only non-zero when

the m values sum appropriately.

84

Putting in explicit formulae for the Clebsch Gordan coefficients, we obtain

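\[ J^0_1 = \sum_{\alpha jm} m\; a^\dagger_{\alpha jm}a_{\alpha jm} \]

\[ J^1_1 = -\frac{1}{\sqrt{2}}\sum_{\alpha jm}\sqrt{(j+m+1)(j-m)}\; a^\dagger_{\alpha j\,m+1}a_{\alpha jm} \]

\[ J^{-1}_1 = \frac{1}{\sqrt{2}}\sum_{\alpha jm}\sqrt{(j-m+1)(j+m)}\; a^\dagger_{\alpha j\,m-1}a_{\alpha jm} \]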

Next we consider

\[ J^2 = J^0_1J^0_1 - J^1_1J^{-1}_1 - J^{-1}_1J^1_1 \]

in terms of the Fock space forms for $J^q_1$. The net result will be a sum of a one-body term and a two-body term. This is already obvious by considering the first term $J^0_1J^0_1$, which is

\[ J^0_1J^0_1 = \sum_{\alpha jm,\;\alpha'j'm'} m\,m'\; a^\dagger_{\alpha jm}a_{\alpha jm}\,a^\dagger_{\alpha'j'm'}a_{\alpha'j'm'} \]

Noting that

aαjma†α′j′m′ = λa†α′j′m′aαjm + δαα′δjj′δmm′

we can rewrite this as

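\[
J^0_1J^0_1 = \sum_{\alpha jm,\;\alpha'j'm'} \lambda\, m\,m'\; a^\dagger_{\alpha jm}a^\dagger_{\alpha'j'm'}a_{\alpha jm}a_{\alpha'j'm'}
+ \sum_{\alpha jm} m^2\, a^\dagger_{\alpha jm}a_{\alpha jm}
\]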

The first term is a two-body operator and the second term is a one-body operator.

Next I would like to consider the angular momentum properties of the creation and

annihilation operators a†αjm and aαjm. I make the following assertions:

1. that a†αjm is an Irreducible Tensor Operator (ITO) of rank j and projection m,

2. that aαjm is not an ITO, and

3. that $\tilde a_{\alpha jm} = (-)^{j+m}\,a_{\alpha j\,-m}$ is an ITO of rank j and projection m.

You will be asked to prove these assertions as a homework problem, by using the Fock space

analogs of the commutation relations that define an ITO.

The point of all of this is that, as we know, ITO’s can be coupled together using Clebsch

Gordan coefficients to produce new ITO’s, just as we can couple together wave functions

with good angular momentum properties to get product states of good angular momentum.

85

Thus, if we have two ITO’s, $A^{q_1}_{k_1}$ and $B^{q_2}_{k_2}$, and we construct the linear combination

\[ \sum_{q_1q_2\,(q_1+q_2=M)} (k_1q_1\,k_2q_2|KM)\; A^{q_1}_{k_1}B^{q_2}_{k_2} \]

we will end up with an ITO of rank K and projection M . I will denote this beast as

\[ \left[A_{k_1}\,B_{k_2}\right]^M_K \]

We can use these ideas to make explicit the angular momentum properties of Fock space

operators. Consider, for example, the number operator

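\[ N = \sum_{\alpha jm} a^\dagger_{\alpha jm}a_{\alpha jm} \]

As an exercise, you should convince yourself that

\[ N = \sum_{\alpha j} \sqrt{2j+1}\;\left[a^\dagger_{\alpha j}\,\tilde a_{\alpha j}\right]^0_0 \]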

In this form, it is clear that N is an angular momentum scalar, as clearly it must be.

Using the same techniques, we can also show that both the one- and two-body parts of

the hamiltonian H are scalar operators. And this is of course consistent with the fact that

the eigenstates of H have definite angular momenta.

I’d now like to briefly address the physical significance of the operator $\tilde a_{\alpha jm}$ I introduced

earlier. As I asserted, and you will prove, it is this operator, not aαjm, that has the properties

of an ITO of rank j and projection m. To understand this, consider a fermion state vector

in which all of the 2j +1 possible m substates associated with a given j value are occupied,

i.e.

a†αjj a†αjj−1 ... a†αj−j | 0 >

It is easy to prove, and you will be asked to do so, that such a state is an eigenstate of

J2 and Jz with total angular momentum J = 0 and total projection M = 0. I will denote

this state as

\[ \Phi^0_0\left(a^\dagger_{\alpha j}\right)^{2j+1}|0\rangle \]

This state can be decomposed using Clebsch Gordan coefficients as

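\[ \Phi^0_0\left(a^\dagger_{\alpha j}\right)^{2j+1}|0\rangle \;\propto\; \sum_m (jm\,j\,{-m}|00)\; a^\dagger_{\alpha jm}\,\Phi^{-m}_j\left(a^\dagger_{\alpha j}\right)^{2j}|0\rangle \]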

86

where $\Phi^{-m}_j\left(a^\dagger_{\alpha j}\right)^{2j}$ is a 2j-particle state with total angular momentum j and total projection −m.

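If we act with aαjm on $\Phi^0_0\left(a^\dagger_{\alpha j}\right)^{2j+1}|0\rangle$, we obtain

\[ a_{\alpha jm}\,\Phi^0_0\left(a^\dagger_{\alpha j}\right)^{2j+1}|0\rangle \;\propto\; (jm\,j\,{-m}|00)\;\Phi^{-m}_j\left(a^\dagger_{\alpha j}\right)^{2j}|0\rangle \]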

since the aαjm annihilates one a†αjm.

But

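\[ (jm\,j\,{-m}|00) = \frac{(-)^{j-m}}{\sqrt{2j+1}} \]

so that

\[ a_{\alpha jm}\,\Phi^0_0\left(a^\dagger_{\alpha j}\right)^{2j+1}|0\rangle \;\propto\; (-)^{j-m}\,\Phi^{-m}_j\left(a^\dagger_{\alpha j}\right)^{2j}|0\rangle \]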

Clearly aαjm acting on a closed-shell system (which had angular momentum 0) does not

produce a state with angular momentum j and projection m. On the other hand,

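\[
\begin{aligned}
\tilde a_{\alpha jm}\,\Phi^0_0\left(a^\dagger_{\alpha j}\right)^{2j+1}|0\rangle
&= (-)^{j+m}\,a_{\alpha j\,-m}\,\Phi^0_0\left(a^\dagger_{\alpha j}\right)^{2j+1}|0\rangle \\
&\propto \Phi^{m}_j\left(a^\dagger_{\alpha j}\right)^{2j}|0\rangle
\end{aligned}
\]

Thus, $\tilde a_{\alpha jm}$ when acting on a closed-shell state does produce a state of definite angular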

momentum j and projection m, as an ITO must.

Such a result is not unique to the angular momentum representation. Something very

similar occurs in the linear momentum representation, which is often more useful when

dealing with condensed matter systems. Here we define our single-particle state vectors,

and thus our Fock space, to be eigenstates of the linear momentum operator

p|k >= k|k >

In this representation, a†k creates a particle of momentum k and ak annihilates a particle of momentum k.

What happens if we act with ak on a many-body state with total K = 0? If the state |k > is occupied, the operator will annihilate a particle in that state, leaving behind a state with momentum −k (conservation of total linear momentum). The operator which produces a momentum eigenstate with eigenvalue k is

\[ \tilde a_k = a_{-k} \]

87

What, if anything, is the relationship between the two operators $\tilde a_{\alpha jm}$ (in an angular

momentum representation) and $\tilde a_k$ (in a linear momentum representation)? The answer is

contained in the properties of the time reversal operator we learned about last semester. So,

let’s see how.

Denoting the time reversal operator by θ and acting with it on an angular momentum

eigenstate |nljm > gives

\[ \theta\,|nljm\rangle = (-)^{j-m}\,|nlj\,{-m}\rangle \]

as you should remember from last semester.

Similarly, one-particle momentum eigenstates transform (as should be obvious) under

time reversal as

θ|k >= | − k >

Comparing the above relations with the defining relations for the tilded operators $\tilde a_{\alpha jm}$ and $\tilde a_k$, we see that

\[ \tilde a_{\alpha jm} = \theta\, a_{\alpha jm}\, \theta^{-1} \]

and

\[ \tilde a_k = \theta\, a_k\, \theta^{-1} \]

Thus, $\tilde a_{\alpha jm}$ and $\tilde a_k$ do not annihilate particles in | αjm > and | k >, respectively. Rather, they annihilate particles in the time reversed single-particle states $(-)^{j-m}|\alpha j\,{-m}\rangle$ and | −k >, respectively.

I will at times use the notation $|\tilde\alpha\rangle$ to denote the time-reversed state of | α >, viz:

\[ |\tilde\alpha\rangle = \theta\,|\alpha\rangle \]

From the above, it should be obvious that

\[ \tilde a_\alpha = a_{\tilde\alpha} \]

Time reversal invariance

When a single-particle hamiltonian for spin-1/2 fermions is time-reversal invariant, its one-particle eigenstates must be doubly degenerate. This, as you hopefully remember from last semester, is known as Kramers degeneracy. In the above notation, this means that for any single-particle state |α >, there corresponds another single-particle state $|\tilde\alpha\rangle$ such that

\[ \epsilon_\alpha = \epsilon_{\tilde\alpha} \]

88

To incorporate this time-reversal property into our formalism (assuming of course we are

dealing with spin-1/2 fermions), it is useful to split the full space of single-particle states |α > into two parts, which we denote by α > 0 and α < 0. To each state |α > with α > 0 there will correspond another state $|\tilde\alpha\rangle$ with $\tilde\alpha < 0$ and the same energy. For example, for each single-particle eigenstate |nljmj > of a rotationally invariant hamiltonian with mj > 0, there is a degenerate time-reversed state $|\widetilde{nljm_j}\rangle$ whose projection is negative. We can span the full Fock space in terms of the creation operators

\[ a^\dagger_\alpha \quad\text{and}\quad a^\dagger_{\tilde\alpha} \]

with α > 0. We will often use this simplification.

Quasiparticles

The formalism we have developed so far involves the following ingredients:

(a) a vacuum state | 0 >, and

(b) creation and annihilation operators a†α and aα, with either commutation or anticommutation relations, depending on whether we are dealing with bosons or fermions. These

operators create/annihilate a particle in the chosen single-particle basis | α >.

All state vectors in Fock space are generated by acting with creation operators on the

vacuum. All operators are generated in terms of the fundamental creation and annihilation

operators. The link between the two ingredients (a) and (b) of the formalism is contained

in the relation

aα| 0 >= 0 , for all α

As should be obvious, the above formalism is in terms of “real” (honest-to-goodness)

particles. Our creation and annihilation operators create or annihilate particles; our vacuum

state has zero particles, etc.

I now claim that it is also possible to develop alternative algebras that are

mathematically identical to the one just described except that they are not in terms of

real particles. The kinds of beasts that enter in this mathematically equivalent formalism

will be called quasiparticles and will be useful in several different areas of nonrelativistic

many-body theory. I will focus on them solely for systems of spin-1/2 fermions with its

characteristic time reversal symmetry properties.

89

In this quasiparticle formalism, one introduces quasiparticle creation and annihilation

operators c†α and cα which are related to the particle creation and annihilation operators a†α

and aα by

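\[ c^\dagger_\alpha = u_\alpha a^\dagger_\alpha - v_\alpha a_{\tilde\alpha} \]

\[ c_{\tilde\alpha} = u_\alpha a_{\tilde\alpha} + v_\alpha a^\dagger_\alpha \]

with

\[ u_\alpha^2 + v_\alpha^2 = 1 \]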

Clearly, c†α does not create a real particle in state α. It creates in part a particle in the

state α and in part a hole in the time reversed state $\tilde\alpha$. The funny thing it creates is called

a quasiparticle.

I now make the following claims which you will be asked to verify in the homework :

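1. In order for the equations for c†α and cα to be consistent it is necessary that

\[ u_{\tilde\alpha} = u_\alpha \quad\text{and}\quad v_{\tilde\alpha} = -v_\alpha \]

2. The set of quasiparticle creation and annihilation operators satisfy the following set of anticommutation relations:

\[ \{c^\dagger_\alpha, c^\dagger_\beta\} = \{c_\alpha, c_\beta\} = 0 \]

\[ \{c_\alpha, c^\dagger_\beta\} = \delta_{\alpha\beta} \]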

Thus, the quasiparticle operators satisfy exactly the same anticommutation algebra as

do the real particle operators.

But as noted earlier, the introduction of creation and annihilation operators is not enough

to specify a second-quantized algebra. One must also introduce an “appropriate” vacuum

state. Let’s denote this vacuum state by $|\tilde 0\rangle$ and refer to it as the quasiparticle vacuum.

If quasiparticles are to have mathematically the same structure as real particles, the quasiparticle vacuum state must satisfy

\[ c_\alpha|\tilde 0\rangle = 0 , \quad \text{for all } \alpha \]

I claim that this would be the case if we chose the quasiparticle vacuum state to be

\[ |\tilde 0\rangle = \prod_\beta c_\beta\,|0\rangle \]

90

That this quasiparticle vacuum is indeed annihilated by all quasiparticle annihilation

operators can be shown simply by rewriting

\[ |\tilde 0\rangle = \pm\, c_\alpha \prod_{\beta\,(\beta\neq\alpha)} c_\beta\,|0\rangle \]

where the ± sign comes from anticommuting cα past as many cβ’s as are necessary to get it in the first position. Then since cαcα = 0 for any α, it is clear that $c_\alpha|\tilde 0\rangle = 0$ for any α.

We can in fact express the quasiparticle vacuum state in terms of the real vacuum and

real particles as follows:

\[ |\tilde 0\rangle = \prod_\alpha \left(u_\alpha a_\alpha - v_\alpha a^\dagger_{\tilde\alpha}\right)|0\rangle \]

Following my discussion on pages 88-89, I divide the full set α into those with α > 0 and

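those with α < 0. We can then rewrite the quasiparticle vacuum as

\[ |\tilde 0\rangle = \prod_{\alpha>0} \left(u_\alpha a_\alpha - v_\alpha a^\dagger_{\tilde\alpha}\right)\left(u_\alpha a_{\tilde\alpha} + v_\alpha a^\dagger_\alpha\right)|0\rangle \]

where I have used the fact that $u_{\tilde\alpha} = u_\alpha$ and $v_{\tilde\alpha} = -v_\alpha$, which as discussed earlier was required for consistency. Expanding out, we get

\[ |\tilde 0\rangle = \prod_{\alpha>0} \left(u_\alpha^2\, a_\alpha a_{\tilde\alpha} + u_\alpha v_\alpha\, a_\alpha a^\dagger_\alpha - u_\alpha v_\alpha\, a^\dagger_{\tilde\alpha}a_{\tilde\alpha} - v_\alpha^2\, a^\dagger_{\tilde\alpha}a^\dagger_\alpha\right)|0\rangle \]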

The first and third terms in brackets give zero contribution, since aα| 0 >= 0. The second

term can be rewritten as

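\[ u_\alpha v_\alpha\, a_\alpha a^\dagger_\alpha|0\rangle = -u_\alpha v_\alpha\, a^\dagger_\alpha a_\alpha|0\rangle + u_\alpha v_\alpha|0\rangle = u_\alpha v_\alpha|0\rangle \]

since aα| 0 > = 0. Thus,

\[
\begin{aligned}
|\tilde 0\rangle &= \prod_{\alpha>0} \left(u_\alpha v_\alpha - v_\alpha^2\, a^\dagger_{\tilde\alpha}a^\dagger_\alpha\right)|0\rangle \\
&= \prod_{\alpha>0} \left(u_\alpha v_\alpha + v_\alpha^2\, a^\dagger_\alpha a^\dagger_{\tilde\alpha}\right)|0\rangle \\
&= \prod_{\alpha>0} u_\alpha v_\alpha \prod_{\alpha>0}\left(1 + \frac{v_\alpha}{u_\alpha}\, a^\dagger_\alpha a^\dagger_{\tilde\alpha}\right)|0\rangle
\end{aligned}
\]

The first factor is just a number and is there to guarantee that

\[ \langle\tilde 0|\tilde 0\rangle = 1 \]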

91

i.e. that the quasiparticle vacuum is normalized. If we expand the second factor, we get

a term with zero real creation operators, then one with 2 real creation operators, then one

with 4 real creation operators, etc. Clearly, $|\tilde 0\rangle$ is not an eigenstate of real particle number.

But as we’ll see later, it is nevertheless quite useful.

Now let’s consider the occupation number of a given single-particle state α in the quasi-

particle vacuum. I will denote it as ηα and it is given by

\[ \eta_\alpha = \frac{\langle\tilde 0|\, a^\dagger_\alpha a_\alpha\,|\tilde 0\rangle}{\langle\tilde 0|\tilde 0\rangle} \]

To evaluate this, we invert the defining relations that connect c†α and cα to a†α and aα.

The inverted equations are, as you can easily confirm,

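\[ a^\dagger_\alpha = u_\alpha c^\dagger_\alpha + v_\alpha c_{\tilde\alpha} \]

\[ a_{\tilde\alpha} = u_\alpha c_{\tilde\alpha} - v_\alpha c^\dagger_\alpha \]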

Thus,

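\[ \eta_\alpha = \frac{\langle\tilde 0|\left(u_\alpha c^\dagger_\alpha + v_\alpha c_{\tilde\alpha}\right)\left(u_\alpha c_\alpha + v_\alpha c^\dagger_{\tilde\alpha}\right)|\tilde 0\rangle}{\langle\tilde 0|\tilde 0\rangle} \]

Using the fact that $c_\alpha|\tilde 0\rangle = c_{\tilde\alpha}|\tilde 0\rangle = 0$ and that $\langle\tilde 0|c^\dagger_\alpha = \langle\tilde 0|c^\dagger_{\tilde\alpha} = 0$, we obtain

\[ \eta_\alpha = \frac{v_\alpha^2\,\langle\tilde 0|\, c_{\tilde\alpha}c^\dagger_{\tilde\alpha}\,|\tilde 0\rangle}{\langle\tilde 0|\tilde 0\rangle} \]

But

\[
\begin{aligned}
\langle\tilde 0|\, c_{\tilde\alpha}c^\dagger_{\tilde\alpha}\,|\tilde 0\rangle &= -\langle\tilde 0|\, c^\dagger_{\tilde\alpha}c_{\tilde\alpha}\,|\tilde 0\rangle + \langle\tilde 0|\tilde 0\rangle \\
&= \langle\tilde 0|\tilde 0\rangle
\end{aligned}
\]

so that

\[ \eta_\alpha = v_\alpha^2 \]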

Thus, $v_\alpha^2$ measures the “fullness” of state α. Then since $u_\alpha^2 + v_\alpha^2 = 1$, it is clear that $u_\alpha^2$ measures its “emptiness”.
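The statement that the occupation equals v² can be checked on the smallest possible example. The Python sketch below assumes a single pair of fermion states, α and its time-reversed partner (called `abar` in the code), and the arbitrary choice u = 0.6, v = 0.8. It takes the quasiparticle vacuum for that pair in the form u| 0 > + v a†α a†abar | 0 >, normalized to unity here, and verifies that both quasiparticle annihilation operators give zero on it. All names in the code are illustrative.

```python
# Index 0 labels the state alpha, index 1 its time-reversed partner (abar).
ALPHA, ABAR = 0, 1
u, v = 0.6, 0.8  # assumed values; any real pair with u**2 + v**2 = 1 would do

def create(k, state):
    """a†_k on a fermion state {standard-ordered occupied indices: amplitude}."""
    out = {}
    for occ, amp in state.items():
        if k in occ:
            continue
        sign = (-1) ** sum(1 for i in occ if i < k)
        new = tuple(sorted(occ + (k,)))
        out[new] = out.get(new, 0.0) + sign * amp
    return out

def destroy(k, state):
    """a_k on a fermion state, with the sign from moving k to the first position."""
    out = {}
    for occ, amp in state.items():
        if k not in occ:
            continue
        sign = (-1) ** sum(1 for i in occ if i < k)
        new = tuple(i for i in occ if i != k)
        out[new] = out.get(new, 0.0) + sign * amp
    return out

def combo(c1, s1, c2, s2):
    """Return the linear combination c1*s1 + c2*s2 of two states."""
    out = {}
    for c, s in ((c1, s1), (c2, s2)):
        for occ, amp in s.items():
            out[occ] = out.get(occ, 0.0) + c * amp
    return out

def c_alpha(state):
    # c_alpha = u a_alpha - v a†_abar, the adjoint of c†_alpha = u a†_alpha - v a_abar
    return combo(u, destroy(ALPHA, state), -v, create(ABAR, state))

def c_abar(state):
    # c_abar = u a_abar + v a†_alpha
    return combo(u, destroy(ABAR, state), v, create(ALPHA, state))

def norm2(state):
    """Squared norm of a state with real amplitudes."""
    return sum(amp * amp for amp in state.values())

vacuum = {(): 1.0}
pair = create(ALPHA, create(ABAR, vacuum))   # a†_alpha a†_abar |0>
qp_vacuum = combo(u, vacuum, v, pair)        # u|0> + v a†_alpha a†_abar |0>

# occupation of alpha: <a†_alpha a_alpha> = |a_alpha |qp_vacuum>|^2 / norm
occupation = norm2(destroy(ALPHA, qp_vacuum)) / norm2(qp_vacuum)
print(norm2(c_alpha(qp_vacuum)), norm2(c_abar(qp_vacuum)), occupation)
```

With several pairs the vacuum would be a product of such factors, one per pair, and the same check applies pair by pair.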

A simple application

Assume an ordering of (doubly-degenerate) single-particle energies (i.e. the eigenvalues

of h) such that if α > β (both positive) then ϵα > ϵβ.

92

FIG. 7: Occupation numbers for a system involving a set of levels filled up to the Fermi energy. (Vertical axis: $v_\alpha^2$, from 0 to 1.)

Next let’s assume that all of the single-particle levels up to a given (positive) λ (with

energy ϵλ) are completely occupied, whereas all those with energies greater than ϵλ are

completely empty. Pictorially, if we plot $v_\alpha^2$ versus the single-particle level (or single-particle
energy) it will look as in figure 7. The separation point is called the Fermi surface and $\epsilon_\lambda$ is

referred to as the Fermi energy.

For such a scenario, I claim that

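$$v_\alpha = 1 , \quad u_\alpha = 0 , \quad \text{for } \alpha \le \lambda$$

$$v_\alpha = 0 , \quad u_\alpha = 1 , \quad \text{for } \alpha > \lambda$$

Thus,

$$c^\dagger_\alpha = a^\dagger_\alpha , \quad c_\alpha = a_\alpha , \quad \text{for } \alpha > \lambda$$

and

$$c^\dagger_\alpha = -a_{\bar\alpha} , \quad c_\alpha = -a^\dagger_{\bar\alpha} , \quad \text{for } \alpha \le \lambda$$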

The quasiparticle vacuum associated with this scenario is simply

$$|\tilde{0}\rangle = \prod_{0<\alpha\le\lambda} a^\dagger_\alpha a^\dagger_{\bar\alpha}\,|0\rangle$$

93

which you can readily confirm satisfies the requirement that

$$c_\alpha|\tilde{0}\rangle = 0 , \quad \text{for all } \alpha$$

What is the significance of the creation operator $c^\dagger_\alpha$ in this system? Clearly, for particles outside the “filled inert core”, it creates real particles. For particles within the inert core, it annihilates real particles, or equivalently it creates real holes.

By using this (simple) quasiparticle transformation, we have devised a formalism whereby

particles outside an inert core and holes inside that same core are treated on an equal footing.

This permits us to straightforwardly isolate just the few valence particles and/or holes in the system, rather than treating all of the particles in the complex many-body system. Since it requires lots of energy to lift particles from deep below the Fermi surface or to lift particles to levels high above the Fermi surface, the dominant excitations in a system near its ground

state will just involve the levels fairly near the Fermi surface. Our simple quasiparticle

transformation enables us to focus on them.

Another application

In a couple of weeks, I will show you how it is possible to treat the uα and vα coefficients

as variational parameters, so as to find the optimum set of noninteracting quasiparticles in

a fermionic system. Such an approach will be meaningful whenever the hamiltonian of the

problem is dominated by so-called pairing correlations. The approximation that will emerge

is the BCS (or Bardeen-Cooper-Schrieffer) approximation, appropriate to superconducting

systems.

Second quantization in coordinate representation - Introduction of field operators

Our discussion of second quantization so far has involved the introduction of a single-

particle basis associated with some single-particle hamiltonian. I would now like to discuss

an alternative representation for second quantized operators, in which we use the continuous

coordinate representation to define Fock space.

In coordinate representation, the relevant single-particle states, |r >, are eigenstates of

the coordinate operator

$$\mathbf{r}_{op}\, |\mathbf{r}\rangle = \mathbf{r}\, |\mathbf{r}\rangle$$

94

Now let’s define the operator that creates a particle at the point r as Ψ†(r), viz:

| r >= Ψ†(r) | 0 > (109)

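These operators can, if we wish, be related to the single-particle creation operators $a^\dagger_k$ appropriate to the set of single-particle Fock space states $|k\rangle$ by inserting the identity operator $I = \sum_k |k\rangle\langle k|$ into (109), yielding

$$\begin{aligned}
|\mathbf{r}\rangle &= \sum_k |k\rangle\langle k|\,\Psi^\dagger(\mathbf{r})\,|0\rangle \\
&= \sum_k |k\rangle\langle k|\mathbf{r}\rangle \\
&= \sum_k \phi^*_k(\mathbf{r})\,|k\rangle
\end{aligned}$$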

where ϕk(r) is the coordinate-space wave function associated with the single-particle state

| k >, i.e.

ϕk(r) =< r| k >

Thus,

$$\Psi^\dagger(\mathbf{r}) = \sum_k \phi^*_k(\mathbf{r})\, a^\dagger_k \qquad (110)$$

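Now what about the operator $\Psi(\mathbf{r})$ that annihilates a particle at point r? Clearly, it satisfies

$$\langle\mathbf{r}| = \langle 0|\,\Psi(\mathbf{r})$$

Inserting, as before, a complete set of states $I = \sum_k |k\rangle\langle k|$, we find that

$$\begin{aligned}
\langle\mathbf{r}| &= \sum_k \langle 0|\,\Psi(\mathbf{r})\,|k\rangle\langle k| \\
&= \sum_k \langle\mathbf{r}|k\rangle\langle k| \\
&= \sum_k \phi_k(\mathbf{r})\,\langle k|
\end{aligned}$$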

Thus,

$$\Psi(\mathbf{r}) = \sum_k \phi_k(\mathbf{r})\, a_k \qquad (111)$$

which as expected is the hermitean adjoint of Ψ†(r).

The operators Ψ†(r) and Ψ(r) are called field operators.

95

Let’s now assume that this refers to the creation and annihilation of fermions at point r.

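Then if we consider the anticommutation relation between $\Psi(\mathbf{r})$ and $\Psi^\dagger(\mathbf{r}')$, we obtain

$$\begin{aligned}
\left\{\Psi(\mathbf{r}) , \Psi^\dagger(\mathbf{r}')\right\} &= \sum_{k,k'} \phi_k(\mathbf{r})\,\phi^*_{k'}(\mathbf{r}')\left\{a_k , a^\dagger_{k'}\right\} \\
&= \sum_k \phi_k(\mathbf{r})\,\phi^*_k(\mathbf{r}') \\
&= \delta(\mathbf{r} - \mathbf{r}')
\end{aligned}$$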

As expected, the field operators associated with the creation and annihilation of fermions

satisfy fermion anticommutation relations, but with Dirac delta functions to reflect the fact

that it is a continuous space.
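The last step above is the completeness relation $\sum_k \phi_k(\mathbf{r})\phi^*_k(\mathbf{r}') = \delta(\mathbf{r}-\mathbf{r}')$. A minimal numerical sketch of its discrete analogue follows. The one-dimensional grid, the harmonic potential, and the units $\hbar = m = 1$ are all assumptions made for illustration and are not part of these notes; on a grid with spacing `dx` the delta function becomes $\delta_{ij}/dx$.

```python
import numpy as np

# Assumed setup (not from the notes): 1D grid, harmonic potential, hbar = m = 1.
n_points = 200
x = np.linspace(-10.0, 10.0, n_points)
dx = x[1] - x[0]

# Finite-difference single-particle hamiltonian h = -(1/2) d^2/dx^2 + x^2/2.
kinetic = (np.diag(np.full(n_points, 1.0 / dx**2))
           - np.diag(np.full(n_points - 1, 0.5 / dx**2), 1)
           - np.diag(np.full(n_points - 1, 0.5 / dx**2), -1))
h = kinetic + np.diag(0.5 * x**2)

# Columns of `vecs` are orthonormal eigenvectors; dividing by sqrt(dx)
# turns them into grid wave functions phi_k(x_i) normalized with weight dx.
_, vecs = np.linalg.eigh(h)
phi = vecs / np.sqrt(dx)

# Completeness: sum_k phi_k(x_i) phi_k^*(x_j) = delta_ij / dx.
completeness = phi @ phi.conj().T
delta_grid = np.eye(n_points) / dx
max_deviation = np.max(np.abs(completeness - delta_grid))
print(max_deviation)
```

The deviation is at the level of floating-point round-off only because the sum runs over the complete set of eigenvectors; truncating the sum over k would give a smeared approximation to the delta function.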

If instead the operators referred to the creation and annihilation of bosons, we could show analogously that

$$\left[\Psi(\mathbf{r}) , \Psi^\dagger(\mathbf{r}')\right] = \delta(\mathbf{r} - \mathbf{r}')$$

namely that the field operators satisfy boson commutation relations. Thus, the field opera-

tors we have introduced carry the permutation symmetry character of the particles they are

creating or annihilating.

At this point, it is useful to include intrinsic spin in the discussion, as it is through the

spin that the symmetry character enters. This can be done straightforwardly, as follows. The field operator that creates a particle at point r with spin orientation s is denoted $\Psi^\dagger_s(\mathbf{r})$. Likewise the field operator that annihilates a particle at point r with spin orientation s is denoted $\Psi_s(\mathbf{r})$. These operators satisfy either fermion anticommutation relations

$$\left\{\Psi_s(\mathbf{r}) , \Psi^\dagger_{s'}(\mathbf{r}')\right\} = \delta(\mathbf{r} - \mathbf{r}')\,\delta_{ss'}$$

or boson commutation relations

$$\left[\Psi_s(\mathbf{r}) , \Psi^\dagger_{s'}(\mathbf{r}')\right] = \delta(\mathbf{r} - \mathbf{r}')\,\delta_{ss'}$$

depending on the permutation symmetry character of the particles involved.

And you can readily convince yourselves that the commutation/anticommutation relations between two field creation operators or two field annihilation operators likewise take the expected form.

From the field creation operators we can build up coordinate space wave functions for

identical particles that appropriately reflect the exchange character of the system. For

96

example, an antisymmetric wave function for identical fermions in coordinate space can be

expressed as

$$\Psi^\dagger_{s_1}(\mathbf{r}_1)\, \Psi^\dagger_{s_2}(\mathbf{r}_2) \cdots \Psi^\dagger_{s_n}(\mathbf{r}_n)\, |0\rangle$$

which you can readily convince yourselves is antisymmetric under the interchange of the

spatial and spin labels.

Now let’s discuss how we would write the usual operators we are familiar with in terms

of these field operators. I will focus on the operators from which we build the hamiltonian

of the system, namely the kinetic and potential energy operators. As in earlier discussion,

we will assume that we have a two-body interaction only.

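Let’s first consider the one-body kinetic energy operator, which in coordinate space was

$$T = \frac{\hbar^2}{2m}\, \overleftarrow{\nabla} \cdot \overrightarrow{\nabla}$$

(the arrows indicate whether each gradient acts on the function to its left or to its right). If we write this in terms of an arbitrary Fock space, with creation and annihilation operators $a^\dagger_n$ and $a_n$, respectively, we find that

$$T = \sum_{n,n'} \langle n|\, T\, |n'\rangle\, a^\dagger_n a_{n'}$$

where the matrix element $\langle n|\, T\, |n'\rangle$ in coordinate space is given by

$$\langle n|\, T\, |n'\rangle = \frac{\hbar^2}{2m} \int d\mathbf{r}\; \nabla\phi^*_n(\mathbf{r}) \cdot \nabla\phi_{n'}(\mathbf{r})$$

Then

$$\begin{aligned}
T &= \frac{\hbar^2}{2m} \sum_{nn'} \int d\mathbf{r}\; \nabla\phi^*_n(\mathbf{r}) \cdot \nabla\phi_{n'}(\mathbf{r})\; a^\dagger_n a_{n'} \\
&= \frac{\hbar^2}{2m} \int d\mathbf{r}\; \nabla\Psi^\dagger(\mathbf{r}) \cdot \nabla\Psi(\mathbf{r})
\end{aligned}$$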

Note that this maintains the natural form we would expect for a second-quantized kinetic

energy operator in coordinate space, except that through the introduction of field operators

we see directly the one-body nature of the kinetic energy operator and furthermore we have

an operator which when its matrix elements are evaluated will automatically reflect the

exchange or permutation character of the identical particles under discussion. It should also

be emphasized that this is the kinetic energy operator to be applied to a many-body system

of identical particles.

97

We can of course do the same thing for the two-body potential operator, which in arbitrary

Fock space takes the form

$$V = \frac{1}{4} \sum_{n_1 n_2 n_3 n_4} \langle n_1 n_2|\, V\, |n_3 n_4\rangle\; a^\dagger_{n_1} a^\dagger_{n_2} a_{n_4} a_{n_3}$$

The two-body matrix elements of V that enter are given in coordinate space by

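$$\begin{aligned}
\langle n_1 n_2|\, V\, |n_3 n_4\rangle = \int\!\!\int d\mathbf{r}_1 d\mathbf{r}_2 \Big( &\phi^*_{n_1}(\mathbf{r}_1)\phi^*_{n_2}(\mathbf{r}_2)\, V(\mathbf{r}_1 - \mathbf{r}_2)\, \phi_{n_3}(\mathbf{r}_1)\phi_{n_4}(\mathbf{r}_2) \\
\pm\, &\phi^*_{n_1}(\mathbf{r}_1)\phi^*_{n_2}(\mathbf{r}_2)\, V(\mathbf{r}_1 - \mathbf{r}_2)\, \phi_{n_3}(\mathbf{r}_2)\phi_{n_4}(\mathbf{r}_1) \Big)
\end{aligned}$$

where I use the ± sign to reflect the fact that we need to add or subtract the exchange integral depending on the symmetry character of the particles.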

Then,

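$$\begin{aligned}
V = \; &\frac{1}{4} \sum_{n_1 n_2 n_3 n_4} \int\!\!\int d\mathbf{r}_1 d\mathbf{r}_2\, \phi^*_{n_1}(\mathbf{r}_1)\phi^*_{n_2}(\mathbf{r}_2)\, V(\mathbf{r}_1 - \mathbf{r}_2)\, \phi_{n_3}(\mathbf{r}_1)\phi_{n_4}(\mathbf{r}_2)\; a^\dagger_{n_1} a^\dagger_{n_2} a_{n_4} a_{n_3} \\
\pm\, &\frac{1}{4} \sum_{n_1 n_2 n_3 n_4} \int\!\!\int d\mathbf{r}_1 d\mathbf{r}_2\, \phi^*_{n_1}(\mathbf{r}_1)\phi^*_{n_2}(\mathbf{r}_2)\, V(\mathbf{r}_1 - \mathbf{r}_2)\, \phi_{n_3}(\mathbf{r}_2)\phi_{n_4}(\mathbf{r}_1)\; a^\dagger_{n_1} a^\dagger_{n_2} a_{n_4} a_{n_3}
\end{aligned}$$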

$$= \frac{1}{2} \int\!\!\int d\mathbf{r}_1 d\mathbf{r}_2\; \Psi^\dagger(\mathbf{r}_1)\Psi^\dagger(\mathbf{r}_2)\, V(\mathbf{r}_1 - \mathbf{r}_2)\, \Psi(\mathbf{r}_2)\Psi(\mathbf{r}_1) \qquad (112)$$

In obtaining the last equality I made use of the fact that the field annihilation operators are symmetric or antisymmetric under interchange, depending on their exchange character, so that the two terms give the same contribution. The ordering $\Psi(\mathbf{r}_2)\Psi(\mathbf{r}_1)$ follows from the ordering $a_{n_4} a_{n_3}$, since in the direct term $n_4$ is attached to $\mathbf{r}_2$ and $n_3$ to $\mathbf{r}_1$.

Note that here too we arrive at the natural form we would expect for a second-quantized two-body potential energy operator in coordinate space.

It is useful to give a verbal interpretation to the potential operator expressed in terms of the quantum field operators. Basically, what it says is the following:

1. The operator first tries to remove particles from points r1 and r2;

2. If it is successful, it contributes an interaction strength V (r1 − r2);

3. It then replaces the particles at those same points, taking care to replace first the particle it removed last so that it doesn’t introduce any inadvertent sign changes;

4. It then sums over all possible pairs of points r1 and r2 from which the particles can

be removed and then put back;

98

5. Finally, it compensates for counting the same pair of points twice through the factor 1/2.

This is all I would like to say about the use of quantum fields at this time in my intro-

duction to second quantization.

99

Approximation techniques for non-relativistic many-body systems

I would now like to discuss the use of second quantization in developing practical approxi-

mation techniques for dealing with non-relativistic many-particle systems involving identical

particles. The two specific techniques I will develop and discuss are

(1) The Hartree Fock Approximation

(2) The BCS Approximation.

The Hartree Fock Approximation

In the Hartree Fock method, one approximates a many-body system of interacting

fermions by a system of non-interacting fermions, each of which moves in a field created

by all of the others. The method is variational in the sense that one searches for the best

possible such description.

Clearly, such an independent-particle (or mean-field) approximation will be different for

fermions than for bosons. In the case of fermions, each independent particle must occupy a

different independent-particle state, and the lowest state for N particles involves filling up

the N lowest states (a Slater determinant wave function). In the case of bosons, the Pauli

principle does not apply and such a mean-field variational principle would describe the lowest

state of the system by putting all N particles in the energetically-lowest independent-particle

state.

As noted above, the Hartree Fock (HF) method applies to fermions and this is the method

I will discuss. The Hartree Bose approximation, the corresponding variational mean-field

approximation for bosons, is in fact somewhat simpler and I will have you develop it as a

homework assignment.

So, let’s now assume that we have some single-particle hamiltonian

h0(i) = t(i) + U(i) (113)

which is used to generate a set of single-particle states

| i >= a†i | 0 > (114)

The hamiltonian for the system can be expressed in terms of the creation operators a†i

and their hermitean conjugate annihilation operators ai as

100

$$H = \sum_{ij} t_{ij}\, a^\dagger_i a_j + \frac{1}{4} \sum_{ijkl} V_{ijkl}\; a^\dagger_i a^\dagger_j a_l a_k \qquad (115)$$

where

tij =< i| t| j >

and

Vijkl =< ij| V | kl >

In Hartree Fock theory, we wish to find the best possible Slater determinant state vec-

tor. Of course, it is not necessarily the Slater determinant built up by putting particles in

the energetically lowest single-particle states | i >, since that set of single-particle states

was chosen arbitrarily, or perhaps for convenience. So, let’s assume that the HF Slater

determinant can be written as

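$$|\Phi\rangle = b^\dagger_{\lambda_1} b^\dagger_{\lambda_2} \cdots b^\dagger_{\lambda_N}\, |0\rangle \qquad (116)$$

where

$$b^\dagger_\lambda = \sum_i c^{\lambda}_i\, a^\dagger_i \qquad (117)$$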

Since the a†i form a complete set of single-particle creation operators, we can certainly expand

any single-particle creation operator as a linear combination of them.

What we shall do is to determine the $c^{\lambda}_i$ such that

$$\frac{\langle\Phi|\, H\, |\Phi\rangle}{\langle\Phi|\Phi\rangle}$$

is minimized. This is a well-defined variational problem, with the $c^{\lambda}_i$ as our variational parameters.

Before doing this, however, it’s useful to carry out a bit of preliminary analysis. Clearly,

we want the new set of single-particle state vectors

| λ >= b†λ | 0 >

to form an orthonormal set, i.e.

< λ|λ ′ >= δλλ ′ (118)

Thus,

< 0| bλb†λ ′ | 0 >= δλλ ′

101

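or

$$\sum_{ij} c^{\lambda *}_i c^{\lambda'}_j\, \langle 0|\, a_i a^\dagger_j\, |0\rangle = \delta_{\lambda\lambda'}$$

But

$$\langle 0|\, a_i a^\dagger_j\, |0\rangle = \langle i|j\rangle = \delta_{ij}$$

so that

$$\sum_i c^{\lambda *}_i c^{\lambda'}_i = \delta_{\lambda\lambda'} \qquad (119)$$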

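Next we consider

$$\left\{b^\dagger_\lambda , b^\dagger_{\lambda'}\right\} = \sum_{ij} c^{\lambda}_i c^{\lambda'}_j \left\{a^\dagger_i , a^\dagger_j\right\} = 0 \qquad (120)$$

$$\left\{b_\lambda , b_{\lambda'}\right\} = \sum_{ij} c^{\lambda *}_i c^{\lambda' *}_j \left\{a_i , a_j\right\} = 0 \qquad (121)$$

and

$$\begin{aligned}
\left\{b_\lambda , b^\dagger_{\lambda'}\right\} &= \sum_{ij} c^{\lambda *}_i c^{\lambda'}_j \left\{a_i , a^\dagger_j\right\} \\
&= \sum_{ij} c^{\lambda *}_i c^{\lambda'}_j\, \delta_{ij} \\
&= \sum_i c^{\lambda *}_i c^{\lambda'}_i \\
&= \delta_{\lambda\lambda'} \qquad (122)
\end{aligned}$$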

Thus, the b†λ and bλ ′ have the same anticommutation relations as the original creation

and annihilation operators, as expected.

Finally, let’s consider the inverse of the expansion (117), viz:

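$$a^\dagger_i = \sum_\lambda d^{\lambda}_i\, b^\dagger_\lambda \qquad (123)$$

so that

$$a^\dagger_i = \sum_{\lambda j} d^{\lambda}_i c^{\lambda}_j\, a^\dagger_j$$

This requires that

$$\sum_\lambda d^{\lambda}_i c^{\lambda}_j = \delta_{ij}$$

Multiplying through by $c^{\lambda' *}_j$ and summing over j gives

$$\sum_{\lambda j} d^{\lambda}_i\, c^{\lambda' *}_j c^{\lambda}_j = \sum_j \delta_{ij}\, c^{\lambda' *}_j = c^{\lambda' *}_i$$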

102

The appropriate variational condition is that

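$$\frac{\partial}{\partial c^{\lambda *}_m} \left[ \langle\Phi|\, H\, |\Phi\rangle - e_\lambda \left( \sum_j |c^{\lambda}_j|^2 - 1 \right) \right] = 0 \qquad (134)$$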

where eλ is a Lagrange multiplier introduced to guarantee that the normalization < Φ| Φ >=

1 is preserved under the variation.

Carrying out the differentiation, we obtain

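$$\sum_j t_{mj}\, c^{\lambda}_j + \frac{1}{2} \sum_{jkl} V_{mjkl} \left( \sum_{\lambda_2 \le N} c^{\lambda_2 *}_j c^{\lambda_2}_l \right) c^{\lambda}_k + \frac{1}{2} \sum_{ikl} V_{imkl} \left( \sum_{\lambda_1 \le N} c^{\lambda_1 *}_i c^{\lambda_1}_k \right) c^{\lambda}_l - e_\lambda c^{\lambda}_m = 0 \qquad (135)$$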

The second and third terms are identical, as I will now confirm.

We can rewrite

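$$\sum_{jkl} V_{mjkl} \left( \sum_{\lambda_2 \le N} c^{\lambda_2 *}_j c^{\lambda_2}_l \right) c^{\lambda}_k = \sum_{jkl} V_{mkjl} \left( \sum_{\lambda' \le N} c^{\lambda' *}_k c^{\lambda'}_l \right) c^{\lambda}_j$$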

by interchanging the dummy summation indices k and j and by replacing the dummy index

λ2 by λ ′. Similarly, we can rewrite

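$$\begin{aligned}
\sum_{ikl} V_{imkl} \left( \sum_{\lambda_1 \le N} c^{\lambda_1 *}_i c^{\lambda_1}_k \right) c^{\lambda}_l &= \sum_{jkl} V_{jmkl} \left( \sum_{\lambda' \le N} c^{\lambda' *}_j c^{\lambda'}_k \right) c^{\lambda}_l \\
&= \sum_{jkl} V_{kmjl} \left( \sum_{\lambda' \le N} c^{\lambda' *}_k c^{\lambda'}_j \right) c^{\lambda}_l \\
&= \sum_{jkl} V_{mkjl} \left( \sum_{\lambda' \le N} c^{\lambda' *}_k c^{\lambda'}_l \right) c^{\lambda}_j
\end{aligned}$$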

Note: The first equality followed from replacing i with j and λ1 with λ ′; the second equality

involved interchanging j with k; the third equality followed from interchanging j with l and

noting that Vkmlj = Vmkjl.

Thus, (135) can be rewritten as (now replacing m by i)

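$$\sum_j t_{ij}\, c^{\lambda}_j + \sum_{jkl} V_{ijkl} \left( \sum_{\lambda_2 \le N} c^{\lambda_2 *}_j c^{\lambda_2}_l \right) c^{\lambda}_k - e_\lambda c^{\lambda}_i = 0$$

or

$$\sum_j \left[ t_{ij} + \sum_{kl} V_{ikjl} \sum_{\lambda' \le N} c^{\lambda' *}_k c^{\lambda'}_l \right] c^{\lambda}_j = e_\lambda\, c^{\lambda}_i \qquad (136)$$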

105

The one-body hamiltonian h whose eigenvectors give the Hartree Fock single-particle

state vectors can be written in terms of its matrix elements as

$$h_{ij} = t_{ij} + \sum_{\lambda' \le N} V_{i\lambda' j\lambda'} \qquad (141)$$

In summary, we have shown that the best Slater determinant type wave function for

a system of N identical fermions is obtained by filling up the N lowest eigenstates of the

single-particle Schrodinger equation

h| λ >= eλ| λ > (142)

where h is a one-body operator with matrix elements

$$h_{ij} = t_{ij} + \sum_{\lambda' \le N} V_{i\lambda' j\lambda'} \qquad (143)$$

Despite the fact that (142) is a one-body Schrodinger equation, it is more complicated

than the usual one-body Schrodinger equation. The reason is that the hamiltonian h depends

on its eigenvectors |λ > through (143). Thus, eqs. (142) and (143) must be solved self

consistently, so that the eigenvectors that go into the construction of Uij also come out of

the diagonalization of hij.

The usual method of obtaining self-consistent HF solutions proceeds in the following

iterative way:

1. Make an initial guess of the N occupied HF single-particle state vectors $|\lambda\rangle$, according to

$$|\lambda^{(0)}\rangle = \sum_i c^{\lambda (0)}_i\, |i\rangle$$

where | i > are the single-particle basis states and the superscript (0) means that this

is the zeroth-order approximation.

2. Evaluate the matrix of t + U in the basis $|i\rangle$ and diagonalize it, yielding a set of eigenvalues $e^{(1)}_\lambda$ and a new set of eigenvectors $c^{\lambda (1)}_i$. [The superscript (1) means this is the first-order approximation.] In constructing the matrix U from (127) we of course only sum over the N energetically-lowest single-particle states $|\lambda^{(0)}\rangle$.

3. Construct a new set of single-particle states

$$|\lambda^{(1)}\rangle = \sum_i c^{\lambda (1)}_i\, |i\rangle$$

107

and use them to reevaluate the matrix of t + U in the same basis $|i\rangle$. Diagonalize this matrix, thereby obtaining a new set of eigenvalues $e^{(2)}_\lambda$ and eigenvectors $c^{\lambda (2)}_i$.

4. Keep repeating the above procedure until the set of eigenvectors $c^{\lambda (n)}_i$ from the nth iteration agrees to within some reasonable numerical accuracy with the set $c^{\lambda (n-1)}_i$ from the previous iteration. At this point, self consistency has been achieved, since the states $|\lambda\rangle$ which go into the construction of $U_{ij}$ are the same as those that emerge from the subsequent diagonalization of t + U. Once self consistency has been achieved, the resulting eigenvalues $e^{(n)}_\lambda$ and eigenvectors $c^{\lambda (n)}_i$ are indeed the Hartree Fock single-particle energies $e_\lambda$ and the self-consistent eigenvectors.
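The four steps above can be sketched in a few lines of NumPy. This is only an illustrative toy, not a production HF code: the basis size, the evenly spaced one-body energies, the weak random antisymmetrized interaction, and the function names are all hypothetical choices made here. The loop also tests convergence on the density matrix $\rho_{lk} = \sum_{\lambda \le N} c^{\lambda}_l c^{\lambda *}_k$ rather than on the eigenvectors themselves, since a numerical diagonalizer may return eigenvectors with arbitrary overall signs.

```python
import numpy as np

def antisymmetrized_interaction(n_basis, scale, rng):
    """Random real V_ijkl with V_ijkl = -V_jikl = -V_ijlk = V_klij (toy model)."""
    w = rng.normal(size=(n_basis,) * 4)
    a = w - w.transpose(1, 0, 2, 3)
    a = a - a.transpose(0, 1, 3, 2)
    return scale * (a + a.transpose(2, 3, 0, 1))

def hartree_fock(t, v, n_particles, tol=1e-10, max_iter=500):
    """Iterate h = t + U[rho] until the occupied-state density matrix stops changing."""
    # Step 1: initial guess, occupy the N lowest eigenstates of t alone.
    _, c = np.linalg.eigh(t)
    occ = c[:, :n_particles]
    rho = occ @ occ.conj().T      # rho_lk = sum_{lambda<=N} c^lambda_l c^lambda*_k
    for iteration in range(1, max_iter + 1):
        # Steps 2-3: U_ij = sum_kl V_ikjl rho_lk, then diagonalize t + U.
        u = np.einsum('ikjl,lk->ij', v, rho)
        e, c = np.linalg.eigh(t + u)
        occ = c[:, :n_particles]
        rho_new = occ @ occ.conj().T
        # Step 4: stop when input and output densities agree.
        if np.max(np.abs(rho_new - rho)) < tol:
            return e, c, rho_new, iteration
        rho = rho_new
    raise RuntimeError("HF iteration did not converge")

rng = np.random.default_rng(0)
n_basis, n_particles = 8, 3
t = np.diag(np.arange(n_basis, dtype=float))   # assumed one-body energies 0, 1, 2, ...
v = antisymmetrized_interaction(n_basis, scale=0.005, rng=rng)
e, c, rho, n_iter = hartree_fock(t, v, n_particles)
print(n_iter, e[:n_particles])
```

The interaction is deliberately weak compared with the level spacing, so that the plain iteration converges; for stronger interactions one would typically need to mix old and new densities between iterations.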

Note that the fermionic nature of the HF description enters in two places:

1. In the fact that our self-consistent product state has one particle in each of the N

lowest self-consistent single-particle states (as required by the Pauli principle), and

2. in the fact that the two-body matrix elements Viλ′jλ′ that enter in the construction of

Uij are appropriately antisymmetrized, with both direct and exchange terms.

Now let’s return to the energy of the N -particle system that results from the self-

consistent HF minimization procedure. It is given by (see eq. (133))

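$$\langle\Phi|\, H\, |\Phi\rangle = \sum_{\lambda \le N} \sum_{ij} t_{ij}\, c^{\lambda *}_i c^{\lambda}_j + \frac{1}{2} \sum_{\lambda_1 \lambda_2 \le N} \sum_{ijkl} V_{ijkl}\, c^{\lambda_1 *}_i c^{\lambda_2 *}_j c^{\lambda_1}_k c^{\lambda_2}_l$$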

As on page 106, we rewrite

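$$\begin{aligned}
\sum_{\lambda \le N,\, ij} t_{ij}\, c^{\lambda *}_i c^{\lambda}_j &= \sum_{\lambda \le N,\, ij} c^{\lambda *}_i\, \langle i|\, t\, |j\rangle\, c^{\lambda}_j \\
&= \sum_{\lambda \le N} \langle\lambda|\, t\, |\lambda\rangle \\
&= \sum_{\lambda \le N} t_{\lambda\lambda}
\end{aligned}$$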

Likewise we rewrite

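$$\begin{aligned}
\sum_{\lambda_1 \lambda_2 \le N} \sum_{ijkl} V_{ijkl}\, c^{\lambda_1 *}_i c^{\lambda_2 *}_j c^{\lambda_1}_k c^{\lambda_2}_l &= \sum_{\lambda_1 \lambda_2 \le N} \sum_{ijkl} \langle ij|\, V\, |kl\rangle\, c^{\lambda_1 *}_i c^{\lambda_2 *}_j c^{\lambda_1}_k c^{\lambda_2}_l \\
&= \sum_{\lambda_1 \lambda_2 \le N} V_{\lambda_1 \lambda_2 \lambda_1 \lambda_2}
\end{aligned}$$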

108

Thus,

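$$\langle\Phi|\, H\, |\Phi\rangle = \sum_{\lambda \le N} t_{\lambda\lambda} + \frac{1}{2} \sum_{\lambda_1 \lambda_2 \le N} V_{\lambda_1 \lambda_2 \lambda_1 \lambda_2} \qquad (144)$$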

But

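$$e_\lambda = h_{\lambda\lambda} = t_{\lambda\lambda} + U_{\lambda\lambda} = t_{\lambda\lambda} + \sum_{\lambda' \le N} V_{\lambda\lambda'\lambda\lambda'}$$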

Thus, the sum of the HF single-particle energies for the N lowest states is

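$$\sum_{\lambda \le N} e_\lambda = \sum_{\lambda \le N} t_{\lambda\lambda} + \sum_{\lambda\lambda' \le N} V_{\lambda\lambda'\lambda\lambda'}$$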

and we see that the total energy in Hartree Fock (144) is not just the sum of the self-

consistent single-particle energies of the N particles, as we might have naively expected for

an independent particle solution.

Rather, we see that

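$$\langle\Phi|\, H\, |\Phi\rangle = \sum_{\lambda \le N} e_\lambda - \frac{1}{2} \sum_{\lambda\lambda' \le N} V_{\lambda\lambda'\lambda\lambda'} \qquad (145)$$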

The reason for this difference is that each $e_\lambda$ contains a contribution from the interaction of particle λ with all the other N − 1 particles. Thus $\sum_\lambda e_\lambda$ contains the interaction between every pair of particles twice. The second term in (145) removes half of the two-body interaction contribution to give the correct HF energy.
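As a quick check of (144) and (145), here is a worked sketch for the smallest nontrivial case, N = 2, with the occupied self-consistent states labeled 1 and 2. It uses only the antisymmetry of the two-body matrix elements, which implies $V_{\lambda\lambda\lambda\lambda} = 0$ and $V_{1212} = V_{2121}$.

$$\begin{aligned}
\langle\Phi|\, H\, |\Phi\rangle &= t_{11} + t_{22} + \tfrac{1}{2}\left(V_{1212} + V_{2121}\right) = t_{11} + t_{22} + V_{1212} \\
e_1 + e_2 &= \left(t_{11} + V_{1212}\right) + \left(t_{22} + V_{2121}\right) = t_{11} + t_{22} + 2V_{1212} \\
e_1 + e_2 - \tfrac{1}{2}\left(V_{1212} + V_{2121}\right) &= t_{11} + t_{22} + V_{1212} = \langle\Phi|\, H\, |\Phi\rangle
\end{aligned}$$

The single pair interaction $V_{1212}$ appears once in the total energy but twice in the sum of single-particle energies, which is exactly the double counting that the second term of (145) removes.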

Improvements

The general hamiltonian, when written in the self-consistent HF basis, takes the form

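$$H = \sum_{\lambda_1 \lambda_2} t_{\lambda_1 \lambda_2}\, b^\dagger_{\lambda_1} b_{\lambda_2} + \frac{1}{4} \sum_{\lambda_1 \lambda_2 \lambda_3 \lambda_4} V_{\lambda_1 \lambda_2 \lambda_3 \lambda_4}\, b^\dagger_{\lambda_1} b^\dagger_{\lambda_2} b_{\lambda_4} b_{\lambda_3} \qquad (146)$$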

Note that in evaluating the expectation value of H in the self-consistent product state

| Φ >, only a part of the full hamiltonian (146) enters, namely the part in which all λi ≤ N .

Those parts of H involving higher states | λi > with λi > N do not contribute.

Let us now decompose

H = Hs.p. + Hint. (147)

where Hs.p. is by definition the part that only involves the N lowest self-consistent states

and Hint. is everything else. I now make several claims:

109

FIG. 8: Self-consistent single-particle levels in HF approximation (levels labeled Occupied and Unoccupied, separated by a gap)

• If Hint. is sufficiently weak and/or the gap between the uppermost occupied level and

the lowermost unoccupied level (see figure 8) is sufficiently large, then one could use

perturbation theory to incorporate the effects of Hint. and improve upon the Hartree

Fock energy and eigenvector.

On the other hand, if Hint. is not sufficiently weak or there is no large gap between

occupied and unoccupied levels, then the effects of Hint. can neither be neglected nor

treated in perturbation theory. In such cases, the choice of an independent-particle

trial state vector was not adequate. The ground state of such a system has very strong

correlations between particles, which cannot be described by a mere perturbation of a

Slater determinant. For such problems, alternative approximation strategies must be

sought. And we’ll discuss one soon.

• As a homework problem, you will be asked to prove that H cannot induce one-particle

one-hole admixtures in the ground state, namely that

< Φ| Hb†λ1bλ2 | Φ >= 0 , for λ1 > N and λ2 ≤ N

It can however induce two-particle two-hole admixtures and it is such admixtures that

should be included in perturbation theory, if it is applicable.

Symmetries

110

Although the full hamiltonian H may contain many symmetries (e.g. rotational invari-

ance), the hamiltonian Hs.p. which is being considered in Hartree Fock need not have these

same symmetries. Thus, the self-consistent product eigenfunctions of H do not in general

contain the symmetries of the original hamiltonian. Without going into details, let me

simply note that

(a) Hs.p. is still a number-conserving operator so that its product eigenstates have definite

particle number;

(b) in general Hs.p. will be neither translationally invariant nor rotationally invariant (i.e.

an ITO of rank zero.).

The lack of translational invariance will be reflected in the fact that our product state

does not have a well-defined total momentum but rather can be a mixture of states with

different momenta. The lack of rotational invariance means that the product state need not

have definite total angular momentum, but rather can be a mixture of states with different

values of J . Those mixtures will be manifested in the expansion (117) for the self-consistent

single-particle creation operator. If we were working in a linear momentum representation,

then the states | λ > would be a superposition of momentum eigenstates having different

values of k. Alternatively, were we working in an angular momentum representation, then

the states |λ > would be a mixture of single-particle states with different j and m values.

After the Hartree Fock minimization has been carried out and the set of self-consistent states

| λ > generated, we can project from the many-particle product states those components

with good K values or good J and M values. Such projections will be necessary to make

contact with the true physical states of the system, for which one indeed has conserved

symmetries and thus good quantum numbers.

You should perhaps be asking yourselves “Why did we have to give up the symmetries of

translational and rotational invariance?” The reason is that in the Hartree Fock procedure

we are searching for a simple description of the system as a set of independent particles, and

symmetries are not compatible with such a simple description. Symmetries of necessity imply

some correlations. In an isolated many-body system, you cannot change the momentum (or

angular momentum) of only one particle. The conservation laws require that at least one

other particle also changes its momentum (or angular momentum). Thus, the motion of the

particles cannot be completely independent if symmetries are to be preserved.

111

So, symmetry laws are incompatible with independent-particle motion. Nevertheless, we

often know (or at least expect) from physical considerations that many-body systems to a

good approximation involve simple independent particle motion. Data tell us this and data

don’t lie. To the extent that this is indeed the dominant physics in play, we would like to be

able to get to it directly. And, as we’ve just seen, the only way we can do this is by relaxing

the symmetry requirements. Only after we have isolated the dominant independent-particle

motion do we wish to put back in the corrections required to restore the symmetry (e.g. by

momentum and/or angular momentum projection).

To perhaps make these ideas more palatable, let me discuss them from a slightly differ-

ent perspective. Let’s focus for now on translational invariance and conservation of linear

momentum. Clearly, pure independent particle motion is only compatible with this con-

servation law if all particles have definite momentum. In such a case, all particles are

completely unlocalized. But we know that bound states of real quantum systems, examples

being atoms or molecules or nuclei, are localized, and furthermore seem to involve essentially

independent-particle motion. What’s the resolution to this apparent paradox?

The resolution is that for such real systems the dominant independent-particle motion is

not in the lab frame but rather in the localized body-fixed frame. Once an object is localized,

its center of mass is confined and thus doesn’t have well-defined momentum. Within the

body-fixed frame, the particles can move independently of one another. If one particle

changes its momentum it is not necessary that any other particle responds. Rather the

center of mass of the system can change its momentum to preserve the total momentum of

the system.

Thus, independent-particle motion in the body-fixed system can occur without any vio-

lation of conservation of total momentum. However, the momentum in the body-fixed frame

is not conserved.

Thus we see that independent-particle motion does not preclude cooperative or collective

phenomena. By spontaneously breaking symmetries, we can define an intrinsic frame in

which independent particle motion occurs but which as a whole moves “collectively”. And

this is indeed the essential philosophy underlying the Hartree Fock method as well as the

BCS method we will be introducing next.

112

BCS Theory

As noted on page 110 of these notes, the Hartree Fock method, either by itself or in
conjunction with perturbation theory, can only be expected to give a good description

of the ground state of the system when the system does not have strong particle-particle

correlations. When such correlations are important, it is necessary to use different methods

for approximately solving the Schrodinger equation. Often it is possible to use these methods

in conjunction with the Hartree Fock method.

Over the next several lectures, I will consider a particular type of hamiltonian which

indeed gives rise to strong particle-particle correlations and which is amenable to an accurate,

though approximate, treatment. It is the so-called BCS approximation, named after its

inventors Bardeen, Cooper and Schrieffer. The hamiltonian I will consider is (in Fock space)

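$$H = \sum_\alpha \epsilon_\alpha\, a^\dagger_\alpha a_\alpha - G \sum_{\alpha,\gamma>0} a^\dagger_\alpha a^\dagger_{\bar\alpha}\, a_{\bar\gamma} a_\gamma \qquad (148)$$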

with G > 0. The second term in the hamiltonian is the so-called pairing interaction.
Indeed, hamiltonians of essentially this type arise in many branches of physics, including

condensed matter physics, nuclear physics, and cold atomic gases.

To put such a hamiltonian in clearer perspective, let me first consider an arbitrary hamil-

tonian

H = T + V

Choosing some arbitrary (but time-reversal invariant) single-particle potential U (perhaps

the HF self-consistent potential), we can rewrite this as

H = H0 + Vres

where

H0 = T + U

and

Vres = V − U

We define our Fock space in terms of the one-particle eigenstates of H0, viz:

H0| α >= ϵα| α >

113

where

| α >= a†α| 0 >

The hamiltonian (148) is based on a residual pairing interaction

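$$V_{res} = -G \sum_{\alpha,\gamma>0} a^\dagger_\alpha a^\dagger_{\bar\alpha}\, a_{\bar\gamma} a_\gamma$$

which is defined via its two-body matrix elements

$$V_{\alpha\beta\gamma\delta} = -G\, \delta_{\beta\bar\alpha}\, \delta_{\delta\bar\gamma} , \qquad (\alpha, \gamma > 0)$$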

Note: By invoking symmetry conditions on Vαβγδ, it is easy to convince yourself that when

we restrict the sum to α, γ > 0, we need not include the customary factor of 1/4. In

subsequent developments, I will denote the pairing interaction for simplicity as VP .

The principal property of the pairing interaction $V_P$ is that it is only felt by pairs of particles in time-reversed single-particle states (e.g. $|\alpha\rangle$ and $|\bar\alpha\rangle$). Furthermore, the strength G that governs how strongly it scatters particles from one time-reversed pair $(\alpha, \bar\alpha)$ to another $(\gamma, \bar\gamma)$ is independent of which states are involved.

Finally, it is usually only necessary to consider the pairing force to act in a finite set of

(active) single-particle states, so that the sums in (148) only involve finite numbers of terms.

Let’s now assume that the active single-particle eigenvalues ϵα are ordered as in the figure

($\epsilon_{\alpha_1} \le \epsilon_{\alpha_2} \le \ldots$), with each level (due to the assumed time-reversal symmetry of $H_0$) being at least doubly degenerate, i.e. $\epsilon_\alpha = \epsilon_{\bar\alpha}$. Furthermore, let’s assume that the system under discussion has N particles with N even. Then, the lowest N-particle eigenvector of $H_0$ is

$$|\Psi_0\rangle = |\alpha_1, \bar\alpha_1, \alpha_2, \bar\alpha_2, \ldots, \alpha_{N/2}, \bar\alpha_{N/2}\rangle$$

in which two particles occupy each of the N/2 lowest doubly-degenerate levels. It is customary to refer to the energy $\epsilon_{\alpha_{N/2}}$ of the uppermost “occupied” single-particle level as the Fermi energy, and to denote it as λ (see my discussion on pages 92-93). Note of course that

it is only the uppermost occupied level vis a vis the independent-particle hamiltonian H0

and not the full hamiltonian H.

In addition to | Ψ0 >, H0 has many other eigenstates at higher energies, which are

obtained by promoting particles from occupied levels (within the Fermi sea) to unoccupied

levels (outside the Fermi sea). A particularly interesting one is

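$$|\Psi_1\rangle = |\alpha_1, \bar\alpha_1, \alpha_2, \bar\alpha_2, \ldots, \alpha_{\frac{N}{2}-1}, \bar\alpha_{\frac{N}{2}-1}, \alpha_{\frac{N}{2}+1}, \bar\alpha_{\frac{N}{2}+1}\rangle$$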

114

FIG. 9: Self-consistent single-particle levels in HF approximation (levels labeled 1, 2, ..., N/2 in order of increasing energy)

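in which a pair of particles is lifted from level $\alpha_{N/2}$ into the first unoccupied level $\alpha_{N/2+1}$. If we denote

$$H_0|\Psi_0\rangle = E_0|\Psi_0\rangle$$

and

$$H_0|\Psi_1\rangle = E_1|\Psi_1\rangle$$

then

$$\Delta E = E_1 - E_0 = 2\left(\epsilon_{\frac{N}{2}+1} - \epsilon_{\frac{N}{2}}\right)$$

Furthermore, it is straightforward to show that

$$\langle\Psi_1|\, V_P\, |\Psi_0\rangle = -G$$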

I now claim that

(a) If ∆E >> G, the pairing interaction will not be effective in mixing the states | Ψ1 > and

| Ψ0 >, as is evident from simple perturbation theory arguments. In fact, extension

115

FIG. 10: Occupation probabilities for a scenario with weak pairing. (Vertical axis: occupation probability, from 0 to 1; horizontal axis: level k.)

of these qualitative arguments suggests that under such circumstances, the pairing

force will be ineffective in mixing any excited eigenstate of H0 with | Ψ0 >. Clearly,

in such cases the “true” ground state of the system will be given “essentially” by

| Ψ0 > and the particles are “uncorrelated”. Of course, small admixtures of excited

configurations such as | Ψ1 > are possible and they can be treated using perturbation

theory. In such cases, the occupation number ηk (see pages 92-93) will look as in figure

10. The occupation numbers for levels below λ are 1 whereas those above λ are zero,

albeit now with some slight smoothing of the distribution around the Fermi surface to

reflect the small perturbative admixtures.

(b) If ∆E ≤ G, then pairs of particles can easily scatter from the uppermost occupied

levels to the lowermost unoccupied ones. If the pairing matrix element G is sufficiently

strong, then it can also excite particles from deep within the Fermi sea to levels well

outside. Clearly, in such cases it is not feasible to use perturbation theory to ascertain

116

FIG. 11: Occupation probabilities for a strongly correlated scenario. (Vertical axis: occupation probability, from 0 to 1; horizontal axis: level k.)

the true ground state. Qualitatively, the ground state will be very different from $|\Psi_0\rangle$ and must be obtained in some non-perturbative approach. The occupation probability in such cases will look something like that shown in figure 11. The scattering of particles from inside the Fermi sea to outside will “smear out” the Fermi surface.

Just how much smearing takes place depends sensitively on the specifics of G and the

ϵα. In such cases, the system clearly involves “correlations” between particles.
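The two regimes (a) and (b) can be illustrated with a deliberately oversimplified sketch that keeps only the two configurations $|\Psi_0\rangle$ and $|\Psi_1\rangle$. Truncating to two states is an assumption made here for illustration; the real problem involves many configurations. In this 2×2 model the unperturbed energies differ by ΔE, the off-diagonal element is −G, and the diagonal pairing contributions are dropped because they are the same for both states. The function name is hypothetical.

```python
import numpy as np

def psi1_admixture(delta_e, g):
    """Probability of |Psi_1> in the lowest eigenstate of the 2x2 model."""
    h = np.array([[0.0, -g],
                  [-g, delta_e]])
    _, vecs = np.linalg.eigh(h)
    # Column 0 is the lowest eigenvector; row 1 is its |Psi_1> component.
    return vecs[1, 0] ** 2

for ratio in (10.0, 1.0, 0.1):
    print(ratio, psi1_admixture(delta_e=ratio, g=1.0))
```

For ΔE much larger than G the admixture is of order $(G/\Delta E)^2$, as perturbation theory suggests, while for ΔE comparable to or smaller than G the two configurations mix strongly and the admixture approaches 1/2.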

Degenerate pairing theory

To get a sense as to how the pairing force admixes such configurations to produce a

correlated ground state, I shall now consider an idealized, but exactly solvable, problem in

which the hamiltonian is related to (148), except that we will assume that all the active

single-particle energies ϵα are equal. With such an assumption, the hamiltonian reduces to

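$$H = -G\sum_{\alpha,\gamma=1}^{\Omega} a^\dagger_\alpha a^\dagger_{\bar\alpha}\, a_{\bar\gamma} a_\gamma \qquad (149)$$

Here $\bar\alpha$ denotes the time-reversed partner of the state $\alpha$.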


117

Note that in writing the degenerate pairing hamiltonian in this way, I have

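(a) neglected the term $\epsilon\sum_{\alpha=1}^{\Omega}\left(a^\dagger_\alpha a_\alpha + a^\dagger_{\bar\alpha}a_{\bar\alpha}\right) = \epsilon N$, since this can’t contribute to excitation

energies in a given system (with N fixed), and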

(b) made explicit the fact that H only acts over Ω active (doubly-degenerate) levels.

For the hamiltonian (149), we can explicitly solve the Schrodinger equation

H| Ψ >= E| Ψ >

To see how this is done, we first introduce the coherent “pair creation operator”

$$A^\dagger = \sum_{\alpha=1}^{\Omega} a^\dagger_\alpha a^\dagger_{\bar\alpha}$$

and rewrite

H = −GA†A

I now make the following claims, which you will be asked to prove in a homework assign-

ment.

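(a) $[A^\dagger, A] = \hat N - \Omega$, where $\hat N = \sum_{\alpha=1}^{\Omega}\left(a^\dagger_\alpha a_\alpha + a^\dagger_{\bar\alpha}a_{\bar\alpha}\right)$, i.e. $\hat N$ counts the number of

particles in the active levels 1 thru Ω.

(b) $[H, A^\dagger] = -G A^\dagger\left(\Omega - \hat N\right)$

Now suppose that we have a v-particle state (v ≤ Ω), which I denote | Ψv > and which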

satisfies

A| Ψv >= 0

and thus

H| Ψv >= 0

Examples are all of the states

| Ψv >= |α1, α2, ...αv >

with all αi > 0. Such a state is said to have seniority v.

Next consider the state $(A^\dagger)^{(N-v)/2}|\Psi_v\rangle$. This is a state of N particles and is still said

to have seniority v. It can be readily shown, and you will also be asked to do this in the

homework, that

118

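$$H\,(A^\dagger)^{(N-v)/2}|\Psi_v\rangle = -\frac{G}{4}\left[N(2\Omega - N + 2) - v(2\Omega - v + 2)\right](A^\dagger)^{(N-v)/2}|\Psi_v\rangle \qquad (150)$$

Thus, $(A^\dagger)^{(N-v)/2}|\Psi_v\rangle$ is an eigenvector of H with eigenvalue

$$E_{N,v} = -\frac{G}{4}\left[N(2\Omega - N + 2) - v(2\Omega - v + 2)\right] = E_{N0} + \frac{vG}{4}(2\Omega - v + 2) \qquad (151)$$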

I will denote this state as | ΨNv >.

Now let’s focus on systems with an even number of particles.

If G > 0, all states with v > 0 are energetically above the state with v = 0. Thus, the

ground state will be

$$|\Psi_{N0}\rangle \propto (A^\dagger)^{N/2}|0\rangle$$

with eigenvalue

$$E_{N0} = -\frac{G}{4}N(2\Omega - N + 2)$$

The lowest excited states are those with v = 2, viz:

$$|\Psi_{N2}\rangle \propto (A^\dagger)^{N/2-1}|\Psi_{v=2}\rangle$$

Clearly there are many such states, since there are many ways to choose two out of the Ω

positive αi values. All of these states are degenerate at an excitation energy

EN2 − EN0 = GΩ

Thus if Ω is large and G is large, the pairing force will produce a large gap between the

(non-degenerate) ground state and the (highly-degenerate) first excited states. Thus, for

even N and a strong pairing force, one state will be pulled down relative to all others.

What are the occupation probabilities ηk associated with the ground state | ΨN0 >? They

are given by

$$\eta_k = \frac{\langle\Psi_{N0}|a^\dagger_k a_k|\Psi_{N0}\rangle}{\langle\Psi_{N0}|\Psi_{N0}\rangle}$$

Clearly we only need consider this for k > 0, since the results for k < 0 must be identical.

As an exercise you should confirm that

$$\eta_k = \frac{N}{2\Omega}$$

119

which is independent of k. Thus, for the degenerate pairing problem all active levels are

populated equally in the ground state.
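The closed-form results above can be checked numerically for a small system. The following is a minimal sketch, not part of the original derivation. It assumes we work only in the space of fully paired configurations (each of the Ω doubly-degenerate levels is either empty or holds one time-reversed pair), in which H = −G A†A removes a pair from one level and places it in any empty level. The function names are illustrative choices.

```python
import itertools
import numpy as np

def pair_hamiltonian(omega, pairs, G):
    """Matrix of H = -G A^dagger A in the basis of fully paired configurations."""
    basis = [frozenset(c) for c in itertools.combinations(range(omega), pairs)]
    index = {s: i for i, s in enumerate(basis)}
    H = np.zeros((len(basis), len(basis)))
    for s, i in index.items():
        for gamma in s:                  # A removes the pair sitting in level gamma
            rest = s - {gamma}
            for alpha in range(omega):   # A^dagger creates a pair in an empty level
                if alpha not in rest:
                    H[index[rest | {alpha}], i] -= G
    return H, basis

def exact_energy(N, v, omega, G):
    """Closed-form eigenvalue E_{N,v} of eq. (151)."""
    return -G / 4.0 * (N * (2 * omega - N + 2) - v * (2 * omega - v + 2))

def level_occupation(state, basis, k):
    """Probability that level k holds a pair, i.e. eta_k for the given state."""
    return sum(abs(c) ** 2 for c, s in zip(state, basis) if k in s)

omega, N, G = 4, 4, 0.5
H, basis = pair_hamiltonian(omega, N // 2, G)
energies, vectors = np.linalg.eigh(H)    # eigenvalues in ascending order
ground = vectors[:, 0]
print(energies[0], exact_energy(N, 0, omega, G))
print(energies[1] - energies[0], G * omega)
print([level_occupation(ground, basis, k) for k in range(omega)])
```

The lowest eigenvalue should reproduce $E_{N0}$, the first excitation energy should equal GΩ, and every level should carry the occupation N/(2Ω).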

And this is as it obviously should be. All doubly-degenerate levels are obviously equiv-

alent in this problem. They are all degenerate with one another and furthermore the pair

scattering from one level to another is independent of the levels involved.

Clearly when we remove the “degenerate model” assumption and let the ϵα be different,

the main qualitative difference will be that no longer are all levels equivalent and thus

equally populated. Rather, we would expect the ηk to look as in figure 11 on page 117, with

occupations near unity for the lowest levels transitioning smoothly into occupations much

smaller for states at the highest energies.

Nevertheless, assuming that the “pairing correlations” are sufficiently strong, we would

expect that the ground state wave function should still have the basic structure of the ground

state from the degenerate theory, namely a condensate of correlated pairs, viz:

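$$|\Phi\rangle \propto (A^\dagger)^{N/2}|0\rangle$$

Of course, we would no longer expect that

$$A^\dagger = \sum_{k>0} a^\dagger_k a^\dagger_{\bar k}$$

since this would lead to equal population of all states, as we’ve seen. Rather, we’d expect

that

$$A^\dagger = \sum_{k>0} c_k\, a^\dagger_k a^\dagger_{\bar k}$$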

with the expansion coefficients ck somehow reflecting the fact that ηk should follow a pattern

like that shown in figure 11, with the lowest levels being populated most and then the higher

levels successively less.

On this basis, one is led to consider as a physically reasonable trial state vector for systems

dominated by pairing correlations one of the form

$$|\Phi\rangle = \Big(\sum_{k>0} c_k\, a^\dagger_k a^\dagger_{\bar k}\Big)^{N/2}|0\rangle \qquad (152)$$

and to consider the ck as variational parameters, which we would determine by minimizing

< Φ|H| Φ >. Such a procedure is feasible, but very difficult to implement especially for

systems with many particles, as arise for example in Condensed Matter Physics. I now wish

120

to convince you that by introducing a quasi-particle transformation, as described on pages

89-92 of these notes, we are (more or less) doing the same thing, but much more simply.

The Bogolyubov quasiparticle transformation

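So, let’s consider a transformation from real-particle operators $a^\dagger_\alpha, a_\alpha$ to quasiparticle

operators $c^\dagger_\alpha, c_\alpha$ according to

$$c^\dagger_\alpha = u_\alpha a^\dagger_\alpha - v_\alpha a_{\bar\alpha}, \qquad c_{\bar\alpha} = u_\alpha a_{\bar\alpha} + v_\alpha a^\dagger_\alpha \qquad (153)$$

with

$$u_\alpha^2 + v_\alpha^2 = 1$$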

The transformation (153) is customarily referred to as a Bogolyubov transformation after

the physicist who first introduced it.

In our earlier discussion, we showed that we could develop a quasiparticle algebra that was

mathematically equivalent to the real-particle algebra if we could introduce an appropriate

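quasiparticle vacuum state $|\tilde 0\rangle$ (written simply as | 0 > wherever the context makes clear that the quasiparticle vacuum is meant) for which

$$c_\alpha|\tilde 0\rangle = 0, \quad \text{for all } \alpha$$

We showed furthermore that such a quasiparticle vacuum is related to the real particle

operators and the real vacuum $|0\rangle$ by

$$|\tilde 0\rangle = \prod_{\alpha>0}u_\alpha \prod_{\alpha>0}\Big(1 + \frac{v_\alpha}{u_\alpha}\,a^\dagger_\alpha a^\dagger_{\bar\alpha}\Big)|0\rangle \qquad (154)$$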

We also showed that the number of particles in state k in this quasiparticle vacuum is

given by v2k.

Let me now look at the above quasiparticle vacuum state in a slightly different way. To

do so, we note that

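$$e^{x a^\dagger_\alpha a^\dagger_{\bar\alpha}}|0\rangle = \sum_{n=0}^{\infty}\frac{1}{n!}\left(x a^\dagger_\alpha a^\dagger_{\bar\alpha}\right)^n|0\rangle$$

$$= \Big[1 + x a^\dagger_\alpha a^\dagger_{\bar\alpha} + \frac{1}{2}x^2\left(a^\dagger_\alpha a^\dagger_{\bar\alpha}\right)^2 + \dots\Big]|0\rangle$$

$$= \Big[1 + x a^\dagger_\alpha a^\dagger_{\bar\alpha}\Big]|0\rangle$$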

121

where in the last line I used the fact that all higher terms, which involve powers of $a^\dagger_\alpha a^\dagger_{\bar\alpha}$,

cannot contribute since you cannot have more than one particle in the same state α.

Thus,

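$$\prod_{\alpha>0}\Big(1 + \frac{v_\alpha}{u_\alpha}\,a^\dagger_\alpha a^\dagger_{\bar\alpha}\Big)|0\rangle = \prod_{\alpha>0} e^{\frac{v_\alpha}{u_\alpha}a^\dagger_\alpha a^\dagger_{\bar\alpha}}|0\rangle = e^{\sum_{\alpha>0}\frac{v_\alpha}{u_\alpha}a^\dagger_\alpha a^\dagger_{\bar\alpha}}|0\rangle \qquad (155)$$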

Note that in deriving this I have used the operator identity

$$e^{A+B} = e^{A}e^{B}$$

which applies for operators A and B that commute, i.e. for which [A,B] = 0. And clearly

$$\left[a^\dagger_\alpha a^\dagger_{\bar\alpha},\; a^\dagger_\beta a^\dagger_{\bar\beta}\right] = 0$$

for all α and β.

Again using the expansion

$$e^{x} = \sum_{p=0}^{\infty}\frac{1}{p!}x^{p}$$

we can rewrite (155) as

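$$\prod_{\alpha>0}\Big(1 + \frac{v_\alpha}{u_\alpha}\,a^\dagger_\alpha a^\dagger_{\bar\alpha}\Big)|0\rangle = \sum_{p=0}^{\infty}\frac{1}{p!}\Big(\sum_{\alpha>0}\frac{v_\alpha}{u_\alpha}\,a^\dagger_\alpha a^\dagger_{\bar\alpha}\Big)^{p}|0\rangle \qquad (156)$$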

which shows that, in accord with the comments on page 91-92, | 0 > contains contributions

with all even numbers of particles. However, it also makes clear that the component with

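N particles has the structure

$$\Big(\sum_{\alpha>0}\frac{v_\alpha}{u_\alpha}\,a^\dagger_\alpha a^\dagger_{\bar\alpha}\Big)^{N/2}|0\rangle$$

This is exactly the form we postulated as being appropriate for describing a pairing hamiltonian, namely something of the form

$$(A^\dagger)^{N/2}|0\rangle$$

where

$$A^\dagger = \sum_{k>0} c_k\, a^\dagger_k a^\dagger_{\bar k}$$

and where ck is related to ηk.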

122

I therefore claim that if we choose | 0 > as our trial state and minimize < 0|H| 0 > with

respect to the vk coefficients (as a reminder the uk are related to the vk by u2k + v2k = 1),

then we are basically doing what I suggested we should do for a pairing hamiltonian.

I say “basically” because | 0 > is not a state with exactly N particles, as we would

certainly prefer. Nevertheless we will be able to guarantee (through the introduction of a

Lagrange multiplier) that it has N particles on average, i.e. that

< 0| N | 0 >= N

In this way, we will be able to approximately solve the Schrodinger equation for a pairing

hamiltonian in a way that is much simpler than trying to minimize < Φ| H| Φ > with |Φ >

having the form (152). In other words, by relaxing the requirement that our trial state has

exactly the correct number of particles, we’ll greatly simplify our treatment of pairing-like

hamiltonians.

The method that emerges from this number-nonconserving variational treatment of pair-

ing hamiltonians is called the BCS approximation, after John Bardeen, Leon Cooper and

Bob Schrieffer, its developers.

Derivation of the BCS equations

The relevant variational equation that we will solve is

δ < 0|H − λN | 0 >= 0 (157)

subject to a constraint that

< 0|N | 0 >= N (158)

The variations are done with respect to the vk parameters that define the Bogolyubov

transformation to quasiparticles (or equivalently the related uk parameters).

The operator in (157) can be written as

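$$H' = H - \lambda\hat N = \sum_{\alpha>0}(\epsilon_\alpha - \lambda)\left(a^\dagger_\alpha a_\alpha + a^\dagger_{\bar\alpha}a_{\bar\alpha}\right) - G\sum_{\alpha,\gamma>0} a^\dagger_\alpha a^\dagger_{\bar\alpha}\, a_{\bar\gamma}a_\gamma \qquad (159)$$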


We now express H ′ in terms of the quasiparticle operators $c^\dagger_\alpha$, $c^\dagger_{\bar\alpha}$, $c_\alpha$ and $c_{\bar\alpha}$. As a

reminder, on page 92, we gave the inverted Bogolyubov transformation, which expresses the

particle operators in terms of the quasiparticle operators. The relevant relations are

123

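$$a^\dagger_\alpha = u_\alpha c^\dagger_\alpha + v_\alpha c_{\bar\alpha}, \qquad a_{\bar\alpha} = u_\alpha c_{\bar\alpha} - v_\alpha c^\dagger_\alpha \qquad (160)$$

From these, we can also get the hermitean adjoint relations, which are

$$a_\alpha = u_\alpha c_\alpha + v_\alpha c^\dagger_{\bar\alpha}, \qquad a^\dagger_{\bar\alpha} = u_\alpha c^\dagger_{\bar\alpha} - v_\alpha c_\alpha \qquad (161)$$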

We now have all that is needed to rewrite the operator H ′ in terms of quasiparticle

operators, which we now do term by term.

First Term:

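$$\sum_{\alpha>0}(\epsilon_\alpha - \lambda)\Big[\left(u_\alpha c^\dagger_\alpha + v_\alpha c_{\bar\alpha}\right)\left(u_\alpha c_\alpha + v_\alpha c^\dagger_{\bar\alpha}\right) + \left(u_\alpha c^\dagger_{\bar\alpha} - v_\alpha c_\alpha\right)\left(u_\alpha c_{\bar\alpha} - v_\alpha c^\dagger_\alpha\right)\Big]$$

$$= \sum_{\alpha>0}(\epsilon_\alpha - \lambda)\Big[u_\alpha^2 c^\dagger_\alpha c_\alpha + u_\alpha v_\alpha c^\dagger_\alpha c^\dagger_{\bar\alpha} + v_\alpha u_\alpha c_{\bar\alpha}c_\alpha + v_\alpha^2 c_{\bar\alpha}c^\dagger_{\bar\alpha} + u_\alpha^2 c^\dagger_{\bar\alpha}c_{\bar\alpha} - u_\alpha v_\alpha c^\dagger_{\bar\alpha}c^\dagger_\alpha - v_\alpha u_\alpha c_\alpha c_{\bar\alpha} + v_\alpha^2 c_\alpha c^\dagger_\alpha\Big]$$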

Now what we do is to use the anticommutation relations to put all quasiparticle creation

operators to the left and all quasiparticle annihilation operators to the right. This is known

as putting the operators in normal order. We get (noting that α = α)

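$$\sum_{\alpha>0}(\epsilon_\alpha - \lambda)\Big[u_\alpha^2 c^\dagger_\alpha c_\alpha + u_\alpha v_\alpha c^\dagger_\alpha c^\dagger_{\bar\alpha} + v_\alpha u_\alpha c_{\bar\alpha}c_\alpha - v_\alpha^2 c^\dagger_{\bar\alpha}c_{\bar\alpha} + v_\alpha^2 + u_\alpha^2 c^\dagger_{\bar\alpha}c_{\bar\alpha} - u_\alpha v_\alpha c^\dagger_{\bar\alpha}c^\dagger_\alpha - v_\alpha u_\alpha c_\alpha c_{\bar\alpha} - v_\alpha^2 c^\dagger_\alpha c_\alpha + v_\alpha^2\Big]$$

$$= 2\sum_{\alpha>0}(\epsilon_\alpha - \lambda)v_\alpha^2 + \sum_{\alpha>0}(\epsilon_\alpha - \lambda)\left(u_\alpha^2 - v_\alpha^2\right)\left(c^\dagger_\alpha c_\alpha + c^\dagger_{\bar\alpha}c_{\bar\alpha}\right) + 2\sum_{\alpha>0}(\epsilon_\alpha - \lambda)u_\alpha v_\alpha\left(c^\dagger_\alpha c^\dagger_{\bar\alpha} + c_{\bar\alpha}c_\alpha\right) \qquad (162)$$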

Note my separation into pieces involving no c† or c operators, one c† and one c operator,

two c† operators, and two c operators.

124

We can obviously do the same thing for the potential term, though it’s certainly much

more tedious. We first replace all its a and a† operators by quasiparticle c and c† operators;

we then put everything in normal order in which all creation operators are to the left of all

annihilation operators. Since it is so tedious, let me just quote the end result for the various

pieces of H ′ combined.

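$$H' = U' + H'_{11} + H'_{20} + H_{int} \qquad (163)$$

where

$$U' = \sum_{\alpha>0}\left[2(\epsilon_\alpha - \lambda)v_\alpha^2 - Gv_\alpha^4\right] - G\sum_{\alpha,\gamma>0}u_\alpha v_\alpha u_\gamma v_\gamma \qquad (164)$$

$$H'_{11} = \sum_{\alpha>0}\Big\{\left(u_\alpha^2 - v_\alpha^2\right)\left(\epsilon_\alpha - \lambda - Gv_\alpha^2\right) + 2Gu_\alpha v_\alpha\sum_{\gamma>0}u_\gamma v_\gamma\Big\}\left(c^\dagger_\alpha c_\alpha + c^\dagger_{\bar\alpha}c_{\bar\alpha}\right) \qquad (165)$$

$$H'_{20} = \sum_{\alpha>0}\Big\{2\left(\epsilon_\alpha - \lambda - Gv_\alpha^2\right)u_\alpha v_\alpha - G\left(u_\alpha^2 - v_\alpha^2\right)\sum_{\gamma>0}u_\gamma v_\gamma\Big\}\left(c^\dagger_\alpha c^\dagger_{\bar\alpha} + c_{\bar\alpha}c_\alpha\right) \qquad (166)$$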

and Hint is everything else. Indeed, Hint will contain all terms involving four quasiparticle

operators “in normal order” (i.e. with all c† operators to the left of all c operators.). These

include terms like c†c†c†c†, c†c†c†c, c†c†cc, c†ccc and cccc.

I now make the claim that

< 0| H − λN | 0 >= U ′ (167)

The reason is that all other pieces of H − λN have an annihilation operator on the right

and/or a creation operator on the left, and in either case they annihilate the relevant vacuum

state, viz:

< 0| c†α = cα| 0 >= 0

Thus, all such terms give a zero contribution to the quasiparticle vacuum expectation value.

Thus, the variational equation

δ < 0| H − λN | 0 >= 0

reduces to a set of coupled equations, one for each level,

$$\frac{\partial U'}{\partial v_k} = 0, \quad \text{for all } k \qquad (168)$$

125

where

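$$U' = \sum_{\alpha>0}\left[2(\epsilon_\alpha - \lambda)v_\alpha^2 - Gv_\alpha^4\right] - G\sum_{\alpha,\gamma>0}u_\alpha v_\alpha u_\gamma v_\gamma \qquad (169)$$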

We will solve this system of equations subject to a constraint on the average number of

particles in the quasiparticle vacuum

< 0| N | 0 >= N (170)

If we differentiate U ′ with respect to vk and equate to zero, we get

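$$4(\epsilon_k - \lambda)v_k - 4Gv_k^3 - G\sum_{\alpha>0}u_\alpha v_\alpha\Big(u_k + v_k\frac{\partial u_k}{\partial v_k}\Big) - G\sum_{\gamma>0}u_\gamma v_\gamma\Big(u_k + v_k\frac{\partial u_k}{\partial v_k}\Big) = 0$$

Combining terms and then dividing by 2 gives

$$2(\epsilon_k - \lambda)v_k - 2Gv_k^3 - G\Big(u_k + v_k\frac{\partial u_k}{\partial v_k}\Big)\sum_{\gamma>0}u_\gamma v_\gamma = 0 \qquad (171)$$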

But

$$u_k^2 + v_k^2 = 1$$

so that

$$2u_k\frac{\partial u_k}{\partial v_k} + 2v_k = 0$$

or

$$\frac{\partial u_k}{\partial v_k} = -\frac{v_k}{u_k} \qquad (172)$$

Inserting (172) into (171) and then multiplying thru by uk gives

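$$2(\epsilon_k - \lambda)v_ku_k - 2Gv_k^3u_k - G\left(u_k^2 - v_k^2\right)\sum_{\gamma>0}u_\gamma v_\gamma = 0 \qquad (173)$$

We now define

$$\tilde\epsilon_k = \epsilon_k - \lambda - Gv_k^2 \qquad (174)$$

and

$$\Delta = G\sum_{\gamma>0}u_\gamma v_\gamma \qquad (175)$$

Then (173) becomes

$$2\tilde\epsilon_k u_k v_k - \left(u_k^2 - v_k^2\right)\Delta = 0 \qquad (176)$$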

Equation (176) emerged by minimizing < 0| H − λN | 0 > with respect to vk. We must

still, however, impose the constraint that < 0| N | 0 >= N . Following our earlier treatment

126

of < 0| H − λN | 0 >, we rewrite N in terms of quasiparticle creation and annihilation

operators and then < 0| N | 0 >= N is just the constant term that emerges when the

operators are put in normal order. By simple analogy with what we did in the derivation of

(162), but replacing ϵα − λ by 1 in the sums, we find that

$$\langle 0|\hat N|0\rangle = 2\sum_{\alpha>0}v_\alpha^2$$

Thus our constraint is that

$$2\sum_{\alpha>0}v_\alpha^2 = N \qquad (177)$$

which confirms our earlier interpretation of $v_\alpha^2$ as the occupation probability for level α.

[Note: The 2 arises because the level α is doubly-degenerate, with equal occupations of $\alpha$ and $\bar\alpha$.]

Equations (174-177) are the fundamental equations of BCS theory. Their solution is

facilitated, however, by first putting them in a slightly different form, which we will now do.

We first rewrite (176) as

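$$\Delta\left(u_k^2 - v_k^2\right) = 2\tilde\epsilon_k u_k v_k \qquad (178)$$

Squaring it gives

$$\Delta^2\left(u_k^2 - v_k^2\right)^2 = 4\tilde\epsilon_k^2u_k^2v_k^2 \qquad (179)$$

But

$$\left(u_k^2 - v_k^2\right)^2 = u_k^4 + v_k^4 - 2u_k^2v_k^2 = \left(u_k^2 + v_k^2\right)^2 - 4v_k^2u_k^2 = 1 - 4v_k^2u_k^2$$

Thus, (179) becomes

$$\Delta^2 - 4v_k^2u_k^2\Delta^2 = 4\tilde\epsilon_k^2u_k^2v_k^2$$

or

$$u_k^2v_k^2\left(4\tilde\epsilon_k^2 + 4\Delta^2\right) = \Delta^2$$

so that

$$2u_kv_k = \pm\frac{\Delta}{\sqrt{\tilde\epsilon_k^2 + \Delta^2}} \qquad (180)$$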

To determine the correct sign, we remember that (see eqs. (169), (174) and (175))

$$U' = \sum_{k>0}\left(2\tilde\epsilon_k v_k^2 + Gv_k^4 - \Delta u_kv_k\right)$$

127

Now, since we wish to minimize this, it is clear that ukvk and ∆ must have the same sign

for all k, so that the correct sign in (180) is + and therefore

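$$2u_kv_k = \frac{\Delta}{\sqrt{\tilde\epsilon_k^2 + \Delta^2}} \qquad (181)$$

Putting this into (178) gives

$$\Delta\left(u_k^2 - v_k^2\right) = \frac{\Delta\,\tilde\epsilon_k}{\sqrt{\tilde\epsilon_k^2 + \Delta^2}}$$

or

$$u_k^2 - v_k^2 = \frac{\tilde\epsilon_k}{\sqrt{\tilde\epsilon_k^2 + \Delta^2}}$$

Combining this with

$$u_k^2 + v_k^2 = 1$$

gives

$$u_k^2 = \frac{1}{2}\left[1 + \frac{\tilde\epsilon_k}{\sqrt{\tilde\epsilon_k^2 + \Delta^2}}\right] \qquad (182)$$

and

$$v_k^2 = \frac{1}{2}\left[1 - \frac{\tilde\epsilon_k}{\sqrt{\tilde\epsilon_k^2 + \Delta^2}}\right] \qquad (183)$$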
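As a quick numerical sanity check of (182) and (183), the sketch below evaluates the amplitudes on a grid of $\tilde\epsilon_k$ values and confirms that they satisfy the stationarity condition (176) and the product relation (181). It is only an illustration: the values of $\tilde\epsilon_k$ and Δ are arbitrary test inputs, the function name is a hypothetical choice, and both amplitudes are taken non-negative (the + sign selected in (181), with Δ > 0).

```python
import numpy as np

def bcs_amplitudes(eps_tilde, delta):
    """Return (u_k, v_k) from eqs. (182)-(183), both taken non-negative."""
    eps_tilde = np.asarray(eps_tilde, dtype=float)
    e_qp = np.sqrt(eps_tilde**2 + delta**2)      # sqrt(eps_tilde^2 + Delta^2)
    u = np.sqrt(0.5 * (1.0 + eps_tilde / e_qp))
    v = np.sqrt(0.5 * (1.0 - eps_tilde / e_qp))
    return u, v

eps_tilde = np.linspace(-3.0, 3.0, 7)   # arbitrary test values of eps_tilde_k
delta = 0.8                             # arbitrary positive gap parameter
u, v = bcs_amplitudes(eps_tilde, delta)

# Left-hand side of eq. (176); it should vanish level by level.
residual_176 = 2.0 * eps_tilde * u * v - (u**2 - v**2) * delta
print(np.max(np.abs(residual_176)))
```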

But from (175) we know that

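$$\Delta = G\sum_{k>0}u_kv_k$$

so that, using (181),

$$\Delta = \frac{G}{2}\sum_{k>0}\frac{\Delta}{\sqrt{\tilde\epsilon_k^2 + \Delta^2}}$$

or

$$\frac{2}{G} = \sum_{k>0}\frac{1}{\sqrt{\tilde\epsilon_k^2 + \Delta^2}} \qquad (184)$$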

Finally, from (177),

$$N = 2\sum_{k>0}v_k^2$$

which after inserting (183) gives

$$N = \sum_{k>0}\left[1 - \frac{\tilde\epsilon_k}{\sqrt{\tilde\epsilon_k^2 + \Delta^2}}\right] \qquad (185)$$

128

Equation (184) is called the gap equation and (185) is called the number equation. It is

in this form that the BCS equations are most readily solved. I will discuss their iterative

solution a bit later.

Some features of the BCS equations

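(1) Minimizing < 0| H − λN | 0 > with respect to the vk parameters led to equation (176)

$$2\tilde\epsilon_k u_kv_k - \left(u_k^2 - v_k^2\right)\Delta = 0$$

where

$$\tilde\epsilon_k = \epsilon_k - \lambda - Gv_k^2$$

and

$$\Delta = G\sum_{\alpha>0}u_\alpha v_\alpha$$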

Clearly ukvk = 0 for all k is a solution to this system of equations. In such cases,

∆ = 0

This is called the normal solution to the BCS equations.

For the normal solution, the occupation probabilities v2k are given by

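$$v_k^2 = \frac{1}{2}\left(1 - \frac{\tilde\epsilon_k}{|\tilde\epsilon_k|}\right)$$

and similarly $u_k^2$ is given by

$$u_k^2 = \frac{1}{2}\left(1 + \frac{\tilde\epsilon_k}{|\tilde\epsilon_k|}\right)$$

In the limit G → 0,

$$\frac{\tilde\epsilon_k}{|\tilde\epsilon_k|} = \begin{cases} 1 & \text{for } \epsilon_k > \lambda \\ -1 & \text{for } \epsilon_k < \lambda \end{cases}$$

so that

$$v_k^2 = \begin{cases} 0 & \text{for } \epsilon_k > \lambda \\ 1 & \text{for } \epsilon_k < \lambda \end{cases}$$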

Thus, in the limit of noninteracting particles (for which as we’ll soon confirm the normal

solution applies) the Lagrange multiplier λ indeed plays the role of the Fermi energy, dividing

the single-particle states into two groups, one that is occupied and one that is empty. When

129

the interaction is turned on (i.e. G ≠ 0), λ is referred to as the chemical potential, although

as we’ll soon see it still has more or less the significance of a Fermi energy.

The normal solution corresponds to filling up the lowest N/2 levels, exactly as for non-

interacting particles. As we have seen, the BCS equations admit such a solution even when

there is a pairing interaction; however, it may not be the energetically lowest solution.

(2) Question: Under what conditions is there a lower solution?

To answer this, consider the gap equation (184)

$$\frac{2}{G} = \sum_{k>0}\frac{1}{\sqrt{\tilde\epsilon_k^2 + \Delta^2}}$$

Clearly this equation can only have a solution for G > 0. We already noted in our dis-

cussion of “degenerate pairing theory” that G > 0 was required to produce pairing solutions

for the ground state.

We now rewrite the gap equation as

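$$\frac{2}{G} = \sum_{k>0}\frac{1}{\sqrt{\tilde\epsilon_k^2 + \Delta^2}} \le \sum_{k>0}\frac{1}{\sqrt{\tilde\epsilon_k^2}}$$

or equivalently

$$\sum_{k>0}\frac{G}{2|\tilde\epsilon_k|} \ge 1$$

If this inequality is not satisfied, it will not be possible to have another solution to the

BCS equations. Put another way, if

$$\sum_{k>0}\frac{G}{2|\tilde\epsilon_k|} \le 1$$

the only solution of the BCS equations is the normal one.

But

$$|\tilde\epsilon_k| = |\epsilon_k - \lambda - Gv_k^2|$$

so that this latter equation is equivalent to

$$\sum_{k>0}\frac{G}{2|\epsilon_k - \lambda - Gv_k^2|} \le 1 \qquad (186)$$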
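The threshold behaviour can be illustrated with a short sketch. It makes a simplifying assumption that is not part of the derivation above: the $\tilde\epsilon_k$ are treated as fixed, nonzero inputs (their dependence on λ and on $v_k^2$ is ignored), so that the gap equation (184) becomes a single equation for Δ. The function name and the bisection approach are illustrative choices.

```python
import numpy as np

def solve_gap(eps_tilde, G, iters=200):
    """Solve 2/G = sum_k 1/sqrt(eps_tilde_k^2 + Delta^2) for Delta >= 0.

    Assumes fixed, nonzero eps_tilde_k. Returns 0.0 (normal solution) when
    sum_k G / (2 |eps_tilde_k|) <= 1, i.e. when no Delta > 0 exists.
    """
    eps_tilde = np.asarray(eps_tilde, dtype=float)
    if np.sum(G / (2.0 * np.abs(eps_tilde))) <= 1.0:
        return 0.0
    f = lambda d: np.sum(1.0 / np.sqrt(eps_tilde**2 + d**2)) - 2.0 / G
    lo, hi = 0.0, G * eps_tilde.size / 2.0   # f(lo) > 0 and f(hi) <= 0
    for _ in range(iters):                   # plain bisection
        mid = 0.5 * (lo + hi)
        if f(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

levels = np.array([-1.5, -0.5, 0.5, 1.5])   # illustrative eps_tilde_k values
print(solve_gap(levels, 0.3))   # weak pairing: only the normal solution
print(solve_gap(levels, 1.0))   # stronger pairing: a nonzero gap appears
```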

Bearing in mind that λ plays the role of the Fermi energy, this equation is very reminiscent

of our earlier intuitive equation

∆E ≥ G

130

given on pages 116-117. If the energy cost of lifting particles from occupied to unoccupied

single-particle levels is too large compared to the associated pairing matrix element then

the pairing force cannot effectively scatter particles across the Fermi surface. In such cases,

the only solution to the BCS equations is the normal solution and any corrections to it are

perturbative.

However, if ∆E ≤ G, it becomes possible to generate another solution, called the

superconducting solution. Furthermore, it can be shown (though the analysis is difficult

and I won’t show it) that when the superconducting solution exists it is always energetically

lower than the normal solution. An estimate of the gain in energy of the superconducting

solution relative to the normal solution is

$$\delta E = U_{superconducting} - U_{normal} \approx -\frac{\Delta^2}{2D}$$

where D is the average spacing of single-particle levels.

(3) Now let’s define

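$$\eta_k = \epsilon_k - Gv_k^2 \qquad (187)$$

so that

$$\tilde\epsilon_k = \eta_k - \lambda$$

and

$$v_k^2 = \frac{1}{2}\left[1 - \frac{\eta_k - \lambda}{\sqrt{(\eta_k - \lambda)^2 + \Delta^2}}\right]$$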

When a superconducting solution exists, this looks like (see figure 12)

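When $\eta_k = \lambda$, it is clear that $v_k^2 = 1/2$. If we now expand about $\eta_k = \lambda$ we obtain

$$\frac{\eta_k - \lambda}{\sqrt{(\eta_k - \lambda)^2 + \Delta^2}} = \frac{\eta_k - \lambda}{\Delta}\,\frac{1}{\sqrt{1 + \frac{(\eta_k - \lambda)^2}{\Delta^2}}} \approx \frac{\eta_k - \lambda}{\Delta}\left[1 - \frac{1}{2}\left(\frac{\eta_k - \lambda}{\Delta}\right)^2\right]$$

Thus, for $\eta_k \approx \lambda$,

$$\frac{\partial}{\partial\eta_k}v_k^2 \approx -\frac{1}{2\Delta}$$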

From this, we see that the region of transition from “filled” to “unfilled” levels has a width

of roughly 2∆. Since ∆ ∝ G, we see that the amount that the Fermi surface is smeared out

is proportional to the strength of the pairing force.

131

FIG. 12: Occupation probabilities for a superconducting scenario. [Plot of $v_k^2$ (vertical axis, 0 to 1, with the value 1/2 marked) versus k.]

Iterative solution of the BCS equations

The basic BCS equations can be summarized as

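$$\eta_k = \epsilon_k - Gv_k^2 \qquad (188)$$

$$\Delta = G\sum_{k>0}u_kv_k \qquad (189)$$

$$v_k^2 = \frac{1}{2}\left[1 - \frac{\eta_k - \lambda}{\sqrt{(\eta_k - \lambda)^2 + \Delta^2}}\right] \qquad (190)$$

$$u_k^2 = 1 - v_k^2 \qquad (191)$$

$$N = \sum_{k>0}\left[1 - \frac{\eta_k - \lambda}{\sqrt{(\eta_k - \lambda)^2 + \Delta^2}}\right] \qquad (192)$$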

We solve this set of equations in the following iterative fashion.

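Step 0: Choose starting values for the uk and vk coefficients (denoted $u_k^{(0)}$ and $v_k^{(0)}$).

Step 1: Use (188) and (189) to determine $\eta_k^{(0)}$ and $\Delta^{(0)}$.

Step 2: Determine $\lambda^{(0)}$ so that [see eq. (192)]

$$N = \sum_{k>0}\left[1 - \frac{\eta_k^{(0)} - \lambda^{(0)}}{\sqrt{\left(\eta_k^{(0)} - \lambda^{(0)}\right)^2 + \left(\Delta^{(0)}\right)^2}}\right]$$

Step 3: Using (190) and (191), evaluate

$$v_k^{(1)} = \left\{\frac{1}{2}\left[1 - \frac{\eta_k^{(0)} - \lambda^{(0)}}{\sqrt{\left(\eta_k^{(0)} - \lambda^{(0)}\right)^2 + \left(\Delta^{(0)}\right)^2}}\right]\right\}^{1/2}$$

and

$$u_k^{(1)} = \sqrt{1 - \left(v_k^{(1)}\right)^2}$$

Step 4: Return to Step 1 and iterate the first three steps until the $v_k^{(n)}$ emerging from a

given iteration agree with the $v_k^{(n-1)}$ from the previous one (the (n − 1)st) to within a

chosen level of accuracy.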
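The steps above can be sketched in a few lines of code. This is a minimal illustration under stated assumptions rather than a production solver: the starting values are taken to be the uniform occupations $v_k^2 = N/(2\Omega)$ (so that Δ starts out nonzero), λ is found in Step 2 by simple bisection, and the names `solve_lambda` and `solve_bcs` are hypothetical. Convergence of the plain fixed-point iteration is not guaranteed in general, so the function reports whether it converged.

```python
import numpy as np

def solve_lambda(eta, delta, n_particles, iters=200):
    """Step 2: find lambda satisfying the number equation (192) by bisection.

    Assumes delta != 0 and 0 < n_particles < 2 * len(eta).
    """
    def number(lam):
        x = eta - lam
        return np.sum(1.0 - x / np.sqrt(x**2 + delta**2))
    span = 100.0 * (abs(delta) + 1.0)
    lo, hi = eta.min() - span, eta.max() + span   # number(lo) ~ 0, number(hi) ~ 2*Omega
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if number(mid) < n_particles:             # number() increases with lambda
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def solve_bcs(eps, n_particles, G, tol=1e-10, max_iter=1000):
    """Iterate eqs. (188)-(192); returns (v_k^2, Delta, lambda, converged)."""
    eps = np.asarray(eps, dtype=float)
    v2 = np.full(eps.shape, n_particles / (2.0 * eps.size))   # Step 0 (assumed start)
    for _ in range(max_iter):
        eta = eps - G * v2                                     # Step 1, eq. (188)
        uv = np.sqrt(np.clip(v2 * (1.0 - v2), 0.0, None))
        delta = G * np.sum(uv)                                 # Step 1, eq. (189)
        lam = solve_lambda(eta, delta, n_particles)            # Step 2
        x = eta - lam
        v2_new = 0.5 * (1.0 - x / np.sqrt(x**2 + delta**2))    # Step 3, eq. (190)
        if np.max(np.abs(v2_new - v2)) < tol:                  # Step 4
            return v2_new, delta, lam, True
        v2 = v2_new
    return v2, delta, lam, False

# Illustrative input: 8 equally spaced doubly-degenerate levels, 8 particles.
v2, delta, lam, converged = solve_bcs(np.arange(8.0), 8, 1.0)
print(converged, delta, lam)
print(v2)
```

For degenerate levels the uniform starting point is already the self-consistent solution, which gives a simple check of the implementation.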

Let’s now examine in some detail the various pieces of H ′ (= H − λN) under conditions

of the “optimum” Bogolyubov transformation. Remember that

$$H' = U' + H'_{11} + H'_{20} + H_{int}$$

as given in eq. (163).

The various components at minimum are given by

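$$U' = \sum_{\alpha>0}\left[2(\epsilon_\alpha - \lambda)v_\alpha^2 - Gv_\alpha^4\right] - G\sum_{\alpha,\gamma>0}u_\alpha v_\alpha u_\gamma v_\gamma = \sum_{\alpha>0}\left[\left(2\tilde\epsilon_\alpha + Gv_\alpha^2\right)v_\alpha^2 - \Delta u_\alpha v_\alpha\right] \qquad (193)$$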

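$$H'_{11} = \sum_{\alpha>0}\Big\{\left(u_\alpha^2 - v_\alpha^2\right)\left(\epsilon_\alpha - \lambda - Gv_\alpha^2\right) + 2Gu_\alpha v_\alpha\sum_{\gamma>0}u_\gamma v_\gamma\Big\}\left(c^\dagger_\alpha c_\alpha + c^\dagger_{\bar\alpha}c_{\bar\alpha}\right)$$

$$= \sum_{\alpha>0}\Big\{\left(u_\alpha^2 - v_\alpha^2\right)\tilde\epsilon_\alpha + 2\Delta u_\alpha v_\alpha\Big\}\left(c^\dagger_\alpha c_\alpha + c^\dagger_{\bar\alpha}c_{\bar\alpha}\right)$$

$$= \sum_{\alpha>0}\Big\{\frac{2\tilde\epsilon_\alpha^2u_\alpha v_\alpha}{\Delta} + 2\Delta u_\alpha v_\alpha\Big\}\left(c^\dagger_\alpha c_\alpha + c^\dagger_{\bar\alpha}c_{\bar\alpha}\right)$$

$$= \sum_{\alpha>0}\Big\{\frac{\tilde\epsilon_\alpha^2}{\sqrt{\tilde\epsilon_\alpha^2 + \Delta^2}} + \frac{\Delta^2}{\sqrt{\tilde\epsilon_\alpha^2 + \Delta^2}}\Big\}\left(c^\dagger_\alpha c_\alpha + c^\dagger_{\bar\alpha}c_{\bar\alpha}\right) = \sum_{\alpha>0}\sqrt{\tilde\epsilon_\alpha^2 + \Delta^2}\,\left(c^\dagger_\alpha c_\alpha + c^\dagger_{\bar\alpha}c_{\bar\alpha}\right) \qquad (194)$$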

where the third equality followed from (178) and the fourth from (181).

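$$H'_{20} = \sum_{\alpha>0}\Big\{\left[2(\epsilon_\alpha - \lambda) - 2Gv_\alpha^2\right]u_\alpha v_\alpha - G\left(u_\alpha^2 - v_\alpha^2\right)\sum_{\gamma>0}u_\gamma v_\gamma\Big\}\left(c^\dagger_\alpha c^\dagger_{\bar\alpha} + c_{\bar\alpha}c_\alpha\right)$$

$$= \sum_{\alpha>0}\Big\{2\tilde\epsilon_\alpha u_\alpha v_\alpha - \Delta\left(u_\alpha^2 - v_\alpha^2\right)\Big\}\left(c^\dagger_\alpha c^\dagger_{\bar\alpha} + c_{\bar\alpha}c_\alpha\right) = 0 \qquad (195)$$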

where the last equality followed from (178).

Thus, if the uα and vα are chosen so as to minimize < 0| H − λN | 0 >, they will at the

same time guarantee that

H ′20 = 0 (196)

This result is the BCS analog of our Hartree Fock result for the self-consistent independent-

particle solution

< Φ|Hb†λ1bλ2 | Φ >= 0 , for λ1 > N and λ2 ≤ N ,

given on page 110, which you also proved in a homework problem.

Indeed, Bogolyubov showed that in general it is equivalent to minimize U ′ or to set

H ′20 = 0. The latter is referred to as “removing the dangerous terms”.

Thus, for the “optimum” Bogolyubov transformation,

H ′ = U ′ + H ′11 + Hint

If we make the assumption that Hint is weak, which is analogous to neglecting Hint in HF,

then

H ′ = U ′ + H ′11 (197)

I now make several claims:

1. Clearly the full H ′ commutes with N , so that the true physical eigenstates have well-

defined particle number. However, if we neglect Hint, then the approximate H ′ given

134

by (197) does not commute with N , so that its eigenstates (such as its ground state,

| 0 >) do not have fixed numbers of particles. However, U ′ + H ′11 does commute with

$\sum_\alpha c^\dagger_\alpha c_\alpha$, which is the quasiparticle number operator. Thus, its eigenstates have fixed

numbers of quasiparticles. We have so far only discussed its ground state | 0 >, which

has zero quasiparticles. Now we will discuss its excited eigenstates.

2. Clearly the approximate hamiltonian (197) is the hamiltonian for independent quasi-

particles. Thus, all of its eigenstates can be written as products of single-quasiparticle

creation operators acting on the quasiparticle vacuum,

$$c^\dagger_{k_1}\cdots c^\dagger_{k_n}|0\rangle$$

where n can be any nonnegative integer. The eigenvalue of this n-quasiparticle state

is

$$U' + \sum_{i=1}^{n}\sqrt{\tilde\epsilon_{k_i}^2 + \Delta^2}$$

[Note: Were $H'_{20} \neq 0$ we would not arrive at an independent quasiparticle hamiltonian,

explaining why setting it to zero is called “removing the dangerous terms”.]

I now make the following claims:

(a) states with an even number of quasiparticles n correspond to systems with an

even number of real particles, and

(b) states with an odd number of quasiparticles n correspond to systems with an odd

number of real particles.

Both can be straightforwardly proven by making use of our knowledge of the real

particle structure of the quasiparticle vacuum and the quasiparticle creation operators.

Thus, we can separately discuss even-n and odd-n systems.

(a) Even-n:

(1) Ground state: | 0 >

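(2) Lowest excited states: $c^\dagger_{k_1}c^\dagger_{k_2}|0\rangle$

The excitation energies of such two quasiparticle states, and there are lots of them, are

$$E(k_1, k_2) = \sqrt{\tilde\epsilon_{k_1}^2 + \Delta^2} + \sqrt{\tilde\epsilon_{k_2}^2 + \Delta^2} \ge 2\Delta$$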

Thus, the lowest possible excited states in a system with an even number of

particles will occur at an excitation energy of at least 2∆. Stated another way,

in even-n systems there is a gap in the spectrum between the ground state and

excited states, and this gap increases with increasing G (since ∆ is proportional

to G). Such a large gap is one of the features that characterizes superconducting

systems.

(b) Odd-n:

In odd-n systems, the lowest states are one-quasiparticle states. The excitation

energy of the first excited state is

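$$\delta E = \sqrt{\tilde\epsilon_{k_2}^2 + \Delta^2} - \sqrt{\tilde\epsilon_{k_1}^2 + \Delta^2}$$

where $\tilde\epsilon_{k_1}$ is the lowest value of $|\tilde\epsilon_k|$ and $\tilde\epsilon_{k_2}$ is the second lowest. Here, no gap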

occurs and no state is preferentially picked out and lowered with respect to all

the others.

Accuracy of the BCS approximation

The BCS approximation is of course not exact since the true eigenstates have definite

particle number and the BCS quasiparticle vacuum does not. We can assess the level of

“inaccuracy” of the BCS approximation by considering its application to the degenerate

pairing hamiltonian for which pairing correlations are certainly dominant but for which the

solution can be obtained exactly.

So, let’s assume that we have n particles (n even) moving in Ω levels subject to the

degenerate pairing hamiltonian

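$$H = \sum_\alpha \epsilon\,\hat N_\alpha - G\sum_{\alpha,\gamma=1}^{\Omega} a^\dagger_\alpha a^\dagger_{\bar\alpha}\, a_{\bar\gamma}a_\gamma$$

The exact ground state energy of this system is given by (151), viz:

$$E_{n0} = n\epsilon - \frac{G}{4}n(2\Omega - n + 2) \qquad (198)$$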

136

[Note that I am including a constant single-particle energy ϵ for all levels]. Furthermore, as

discussed on page 120 all states have the same occupation number,

$$v_k^2 = \frac{n}{2\Omega}$$

and thus of course

$$u_k^2 = 1 - \frac{n}{2\Omega}$$

Finally, the lowest excited states are those with seniority v = 2 and they all occur at

$$E_{n2} = E_{n0} + G\Omega, \qquad (199)$$

likewise from (151).

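Now let’s see what the BCS approximation yields for these quantities. The ground state

energy is given by

$$\langle 0|H|0\rangle = U' + \lambda n = \sum_{\alpha=1}^{\Omega}\left[2\epsilon v_\alpha^2 - Gv_\alpha^4\right] - G\sum_{\alpha,\gamma=1}^{\Omega}u_\alpha v_\alpha u_\gamma v_\gamma$$

But

$$u_\alpha = \sqrt{1 - \frac{n}{2\Omega}}$$

and

$$v_\alpha = \sqrt{\frac{n}{2\Omega}}$$

so that

$$\langle 0|H|0\rangle = n\epsilon - G\Omega\left(\frac{n}{2\Omega}\right)^2 - G\left(1 - \frac{n}{2\Omega}\right)\left(\frac{n}{2\Omega}\right)\Omega^2$$

$$= n\epsilon - \frac{G}{4\Omega}n^2 - \frac{G\Omega n}{2} + \frac{G}{4}n^2$$

$$= n\epsilon - \frac{G}{4}n\left(2\Omega - n + \frac{n}{\Omega}\right) \qquad (200)$$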

Comparing (200) with (198), we see that the error in the BCS result relative to the exact

result is

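$$\left|\frac{E_{BCS} - E_{Exact}}{E_{Exact}}\right| = \frac{2 - \frac{n}{\Omega}}{2\Omega - n + 2} = \frac{1}{\Omega}\,\frac{2 - \frac{n}{\Omega}}{2 - \frac{n}{\Omega} + \frac{2}{\Omega}}$$

(quoted in magnitude and for ϵ = 0; the BCS energy lies above the exact one, as a variational estimate must).

In the limit in which Ω is large, so that there are many active single-particle levels,

$$\left|\frac{E_{BCS} - E_{Exact}}{E_{Exact}}\right| \sim \frac{1}{\Omega}$$

which is very small.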
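The 1/Ω behaviour is easy to confirm numerically. The sketch below evaluates the BCS energy directly from the sums in U′ + λn, using the uniform occupations $v_\alpha^2 = n/(2\Omega)$ of the degenerate model, and compares it with the exact result (198). The function names are illustrative, and ϵ = 0 is assumed when forming the relative error.

```python
import numpy as np

def exact_ground_energy(n, omega, G, eps=0.0):
    """Exact degenerate-pairing ground state energy, eq. (198)."""
    return n * eps - G / 4.0 * n * (2 * omega - n + 2)

def bcs_ground_energy(n, omega, G, eps=0.0):
    """BCS estimate evaluated from the sums, with v^2 = n / (2 Omega) for every level."""
    v2 = np.full(omega, n / (2.0 * omega))
    uv = np.sqrt(v2 * (1.0 - v2))
    # The double sum over alpha and gamma factorizes into (sum of u v)^2.
    return float(np.sum(2.0 * eps * v2 - G * v2**2) - G * np.sum(uv) ** 2)

def relative_error(n, omega, G):
    """Magnitude of (E_BCS - E_Exact) / E_Exact for eps = 0."""
    exact = exact_ground_energy(n, omega, G)
    return abs((bcs_ground_energy(n, omega, G) - exact) / exact)

for omega in (10, 100, 1000):            # half-filled systems, n = Omega
    print(omega, relative_error(omega, omega, 1.0))
```

At half filling (n = Ω) the relative error reduces to 1/(Ω + 2), so it falls off as 1/Ω as the number of active levels grows.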

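The BCS approximation produces as the lowest excited states (for even n) those with two

quasiparticles. They occur at

$$E_{2qp} = E_{0qp} + \sqrt{\tilde\epsilon_{k_1}^2 + \Delta^2} + \sqrt{\tilde\epsilon_{k_2}^2 + \Delta^2} = E_{0qp} + 2\sqrt{\tilde\epsilon^2 + \Delta^2}$$

where I’ve used the fact that all single-particle energies are the same (ϵ) in the degenerate-orbit problem, so that all $\tilde\epsilon_k$ share a common value $\tilde\epsilon$. The square root quantity can be obtained from the gap equation

$$\frac{2}{G} = \sum_{\alpha=1}^{\Omega}\frac{1}{\sqrt{\tilde\epsilon^2 + \Delta^2}} = \frac{\Omega}{\sqrt{\tilde\epsilon^2 + \Delta^2}}$$

Thus,

$$\sqrt{\tilde\epsilon^2 + \Delta^2} = \frac{G\Omega}{2}$$

and

$$E_{2qp} = E_{0qp} + G\Omega$$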

which is identical to the result given in (199) for the exact calculation. Thus, BCS ap-

proximation (despite giving up number conservation) very accurately reproduces the exact

spectrum in the vicinity of the ground state.

Superconductivity in solids

I would now like to very briefly discuss the relevance of the BCS formalism we have

just developed to superconductivity in solids. My discussion will be very qualitative, but

hopefully will give you some of the flavor of why BCS is so important.

Certain solids have been known to exhibit superconductivity ever since the classic exper-

iments in the laboratory of Onnes in 1911 [H. K. Onnes, Commun. Phys. Lab. 12, 120

(1911)]. The quantum mechanical theory of superconductivity was put forth by Bardeen,

Cooper and Schrieffer in 1957 [J. Bardeen, L. N. Cooper and J. R. Schrieffer, Phys. Rev.

108, 1175 (1957)], whereby superconductivity was described as the condensation of a set

138

of correlated pairs averaged over the whole system. The mathematical framework in which

this theory was implemented was the number-nonconserving BCS theory.

That the formalism we have just developed applies to solids under appropriate conditions

can be seen from the following heuristic discussion. Consider the hamiltonian for electrons

in a lattice. It is most conveniently expressed in terms of so-called Bloch states, specified

by a wave vector k and by a spin σ = ±1/2. Then the hamiltonian describing the motion

of the electrons in the lattice can be expressed as

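$$H = \sum_{k>k_F,\,\sigma}\epsilon_k\, a^\dagger_{k\sigma}a_{k\sigma} + \sum_{k<k_F,\,\sigma}|\epsilon_k|\left(1 - a^\dagger_{k\sigma}a_{k\sigma}\right) + H_{Coulomb} + \frac{1}{2}\sum_{k,k',\sigma,\sigma',\kappa}\frac{2\hbar\omega_\kappa|M_\kappa|^2}{(\epsilon_k - \epsilon_{k+\kappa})^2 - (\hbar\omega_\kappa)^2}\, a^\dagger_{k'-\kappa,\sigma'}a^\dagger_{k+\kappa,\sigma}a_{k'\sigma'}a_{k\sigma} \qquad (201)$$

Here, $k_F$ is the so-called Fermi momentum, corresponding to the Fermi energy $\epsilon_{k_F}$ we discussed earlier.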

The third term represents the (screened) Coulomb interaction between the electrons.

The fourth term is the so-called phonon interaction. It is the part of the electron-electron

interaction that derives from the virtual exchange of phonons with the lattice. The basic

idea is that the electrons interact with the lattice and produce a collective excitation called a

phonon. But we are not explicitly including the lattice and thus phonon degrees of freedom

in our treatment, which involves the electrons only. Thus we take into account the excitation

of these phonons in second-order perturbation theory. The form of the phonon interaction

indicates that it will be attractive (i.e. negative) for single-particle excitation energies $|\epsilon_k - \epsilon_{k+\kappa}| < \hbar\omega_\kappa$. This is to be contrasted with the screened Coulomb interaction, which is of

course repulsive and which can be expressed approximately as $4\pi e^2/\kappa^2$.

The overall residual interaction between electrons will be attractive if

$$\left\langle -\frac{2|M_\kappa|^2}{\hbar\omega_\kappa} + \frac{4\pi e^2}{\kappa^2}\right\rangle_{ave} < 0$$

Note that in the phonon interaction term the scattering takes place from a two-particle state

with momentum k + k ′ to another two-particle state with the same total momentum, i.e.

it conserves momentum as it must.

Cooper [Phys. Rev. 104, 1189 (1956)] showed that two electrons in a lattice interact

most strongly with one another via the phonon interaction when their total momentum

139

k + k ′ = 0, i.e. when the two electrons have equal and opposite momenta. Furthermore,

he showed that the interaction between the two electrons is stronger when their spins are

antiparallel than when they are parallel, since in the parallel-spin case the exchange matrix

elements tend to reduce the interaction. Based on this, he proposed that the interaction

between electrons in a solid could be approximated by an interaction that only acted on

such pairs,

$$-G\sum_{k,\,\kappa>0} a^\dagger_{-k-\kappa,\downarrow}\,a^\dagger_{k+\kappa,\uparrow}\,a_{k,\uparrow}\,a_{-k,\downarrow} \qquad (202)$$

which indeed has the above characteristics. The resulting hamiltonian is then

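$$H = \sum_{k>k_F,\,\sigma}\epsilon_k\, a^\dagger_{k\sigma}a_{k\sigma} - G\sum_{k,\,\kappa>0} a^\dagger_{-k-\kappa,\downarrow}\,a^\dagger_{k+\kappa,\uparrow}\,a_{k,\uparrow}\,a_{-k,\downarrow} \qquad (203)$$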

With this as the hamiltonian, he then considered a state vector

$$|\Phi\rangle = \Gamma^\dagger|FS\rangle = \sum_{k>k_F}\frac{1}{2\epsilon_k - E}\,a^\dagger_{k\uparrow}a^\dagger_{-k\downarrow}|FS\rangle \qquad (204)$$

and showed that for such an interaction it would produce a bound state on top of the Fermi

sea (FS) for any attractive pairing interaction. The energy E for this bound state is given

by the lowest solution of the equation

$$\frac{1}{G} = \sum_{k>k_F}\frac{1}{2\epsilon_k - E} \qquad (205)$$

which can be obtained numerically. The resulting bound collective pair is called a Cooper

pair.
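A numerical solution of (205) can be sketched as follows. This is an illustration only: the single-particle energies are arbitrary test values assumed to lie above the Fermi surface, the function name is a hypothetical choice, and bisection is used because the right-hand side of (205) increases monotonically with E below the threshold $2\epsilon_{min}$.

```python
import numpy as np

def cooper_pair_energy(eps, G, iters=200):
    """Lowest solution E of 1/G = sum_k 1/(2 eps_k - E), which lies below 2*min(eps)."""
    eps = np.asarray(eps, dtype=float)
    threshold = 2.0 * eps.min()
    f = lambda E: np.sum(1.0 / (2.0 * eps - E)) - 1.0 / G
    # At E = threshold - G * M the sum is at most 1/G, so f(lo) <= 0.
    lo, hi = threshold - G * eps.size, threshold
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if mid >= hi or f(mid) > 0.0:   # never evaluate f at the pole itself
            hi = mid
        else:
            lo = mid
    return lo

levels = np.linspace(1.0, 2.0, 11)      # illustrative energies above the Fermi surface
E = cooper_pair_energy(levels, 0.1)
print(E, 2.0 * levels.min() - E)        # pair energy and its binding energy
```

However weak the attraction G > 0, the returned energy lies below $2\epsilon_{min}$, which is the bound-state statement made above. For a single level the solution reduces to E = 2ϵ − G.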

Bardeen, Cooper and Schrieffer followed up on this idea by considering this hamiltonian

and a trial state vector made up as a condensate of these collective Cooper pairs. Because

of the difficulty in treating a number-conserving condensate

$$(\Gamma^\dagger)^n|FS\rangle$$

they instead considered the number-nonconserving state

$$e^{\Gamma^\dagger}|FS\rangle$$

which we remember as one of the forms for the BCS quasiparticle vacuum [see eqs.

(154, 155)]. When we minimize the expectation value of the pairing hamiltonian (203)

for such a trial state, we obtain the BCS solution given earlier.

140

Of course, from my earlier remarks it is clear that (202) will only be the dominant piece

of the residual effective interaction between electrons in a lattice when it dominates over

the repulsive screened Coulomb interaction. Pines has shown that the condition that the

phonon interaction dominates over the Coulomb interaction is in qualitative agreement with

earlier empirical rules established by Matthias for the occurrence of superconductivity in

solids. So all seems to be consistent.

As you probably all know, superconductivity in solids has many interesting phenomena

associated with it, not just the existence of a large gap in the spectrum. Some of these are

(a) that in superconductors the electrical resistance disappears below a certain temperature;

(b) that superconductors exhibit a second-order phase transition at the critical tempera-

ture; and

(c) that superconductors exhibit the so-called Meissner effect, i.e. they exclude magnetic

fields.

All of these effects are in fact closely related to the existence of a large pairing gap. BCS

theory reproduces all of these phenomena.

Some further comments on BCS theory and the pairing problem

I’d like to close my discussion of pairing in many-body quantum systems with two im-

portant comments, both of which I will briefly discuss.

(a) BCS theory and more general interactions:

I developed BCS theory for a pure pairing hamiltonian, characterized by an interaction

which only acts on pairs in time-reversed states and furthermore for which the strength by

which such a pair is scattered into another such pair is independent of the pairs involved.

Such a hamiltonian is obviously dominated by pairing correlations between time-reversed

pairs, since those are the only pairs that feel the interaction.

On the other hand, it is possible that pairing correlations will also be dominant for more

general interactions. Put another way, after the Hartree Fock correlations are taken into

account, this may be the only other piece of the residual hamiltonian strong enough to


produce two-body correlations. And indeed, even for a more general interaction,

$$ \frac{1}{4}\sum_{k_1 k_2 k_3 k_4} V_{k_1 k_2 k_3 k_4}\, a^\dagger_{k_1} a^\dagger_{k_2} a_{k_4} a_{k_3} $$

it is conceptually straightforward to minimize the expectation value of the hamiltonian in the

BCS quasiparticle vacuum. The equations are somewhat more cumbersome, but are nev-

ertheless obtained using the same basic approach and solved using the same basic iterative

method.

(b) Pairing correlations in atomic nuclei:

In 1958, soon after the development of BCS theory, Bohr, Mottelson and Pines (Phys.

Rev. 110 (1958)) suggested that a similar pairing phenomenon could explain the large gaps

in the spectra of nuclei with an even number of neutrons and an even number of protons.

There, however, the pairing was not between particles in states (k, σ) and (−k,−σ) but

rather between identical nucleons in time-reversed single-nucleon states, (njm) and (nj, −m).

Bohr, Mottelson and Pines noted, however, that in systems with as few particles as atomic

nuclei the violation of number conservation inherent in the BCS theory could cause fairly

serious errors and that development of a number-conserving theory was desirable. Indeed,

for such systems it was soon shown by Dietrich, Mang and Pradal [Phys. Rev. 135, B22

(1964)] how to restore particle number in the BCS formalism by using a trial state

$$ (\Gamma^\dagger)^n\,|0\rangle $$

This is referred to as projected BCS (PBCS) approximation and is precisely what I proposed

doing on page 121 of these notes [see eq. (152)]. While PBCS is very difficult to implement

for systems in condensed matter, where the number of particles is very large (and where, fortunately, it isn’t very critical to use it), it can be implemented in atomic nuclei, with their fairly small number of particles. And like BCS theory, it can be implemented for more general

hamiltonians than just the pairing hamiltonian.


The Dirac Equation

The next topic we will be discussing concerns how to merge relativity with Quantum

Mechanics. On this topic, Shankar has a nice presentation, so I would like to ask you to

begin reading Chapter 20 in his text.

Let me begin by reminding you that the Schrodinger equation, on which we have focused

so far, was obtained by quantizing classical mechanics. All of the invariance properties of

the classical hamiltonian are thus also present in the corresponding quantum hamiltonian.

As such, all physical properties derived from the Schrodinger equation are invariant under a

Galilean transformation of the reference frame. But they are not invariant under a Lorentz

transformation, as prescribed by the principle of relativity. Of course, in the limit v << c,

we know that a Galilean transformation approximates a Lorentz transformation. What we

conclude therefore is that the non-relativistic Schrodinger theory is appropriate for describing

phenomena for which v << c. And indeed experiment confirms that this is so.

Clearly, however, when the condition v << c is not realized, we will need a quantum

theory that properly respects full Lorentz invariance. And this is what we will now set out

to develop.

Building a theory that respects full Lorentz invariance is unfortunately not the full an-

swer for relativistic systems. In a relativistic theory, mass and energy are equivalent. Thus

whenever the interactions involved give rise to energy transfers that exceed the rest mass

of the particles, particles can be created. To be a complete theory of relativistic quantum

phenomena, our theory must not only respect Lorentz invariance, but it must also accom-

modate states that differ in the number - and perhaps even nature - of the particles from

which it derives. To do this properly for both boson and fermion systems we must resort

to Relativistic Quantum Field Theory. Because of lack of time, I will restrict myself to

but a few simple comments about Relativistic Quantum Field Theory at the very end of

my discussion on Relativistic Quantum Mechanics. Instead, I will focus my discussion on

the first step historically taken for incorporating relativity into quantum physics, the Dirac

equation. The Dirac equation is a relativistic theory for spin-1/2 particles (i.e. fermions) in

a given force field. As we will see, it does have many important features and as a result is

used in many problems. Some of its more attractive features are:

1. It is Lorentz invariant;


2. It naturally incorporates the concept of intrinsic spin, which therefore does not have

to be introduced ad hoc as in Schrodinger theory;

3. It does admit the creation of particles;

4. In the limit of small velocities, it reduces to the Schrodinger equation.

One of the things we will see is that in looking at the nonrelativistic, i.e. small v/c, limit

of the theory, we not only recover the Schrodinger equation, but also have a well-defined

prescription for looking at small (but often interesting and important) effects of purely

relativistic origin. We will discuss this in some detail for the hydrogen atom.

I will start out by considering the simplest case possible, that of a free particle. Classically,

the energy (or hamiltonian) of a nonrelativistic free particle is

$$ E = \frac{p^2}{2m} $$

If we promote E and p to quantum operators via the substitutions

$$ E \to i\hbar\frac{\partial}{\partial t} \qquad (206) $$

and

$$ \mathbf{p} \to \mathbf{P} \qquad (207) $$

and let both sides of the resulting operator equation act on a state vector |Ψ(t) >, we obtain

the time-dependent Schrodinger equation

$$ i\hbar\frac{\partial}{\partial t}|\Psi(t)\rangle = \frac{P^2}{2m}|\Psi(t)\rangle $$

Now let’s consider a free particle at large velocities. The corresponding equation for the

classical energy E is

$$ E^2 = c^2p^2 + m^2c^4 $$

or equivalently

$$ E = \left(c^2p^2 + m^2c^4\right)^{1/2} $$

This is the energy-momentum relation appropriate to relativistic particles which we would like (somehow) to quantize.

The simplest way we might imagine doing this is to make the same substitutions (206,207)

as before, namely to raise E and p to quantum operators, via those equations. Doing this

and acting on a state vector | Ψ(t) > gives

$$ i\hbar\frac{\partial}{\partial t}|\Psi(t)\rangle = \left(c^2P^2 + m^2c^4\right)^{1/2}|\Psi(t)\rangle \qquad (208) $$

This isn’t terribly appealing. First of all, square root operators are not very nice. Even

more importantly, the equation seems to treat space and time in an asymmetric fashion,

suggesting that it will not be able to preserve the Lorentz invariance of relativity. To see

this, consider the equation in momentum representation, whence

$$ \langle \mathbf{p}|\Psi(t)\rangle = \Psi(\mathbf{p},t) $$

These states are eigenfunctions of the momentum operator, so that (208) becomes

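$$ i\hbar\frac{\partial}{\partial t}\Psi(\mathbf{p},t) = c\left(p^2 + m^2c^2\right)^{1/2}\Psi(\mathbf{p},t) = mc^2\left(1 + \frac{p^2}{2m^2c^2} - \frac{p^4}{8m^4c^4} + \dots\right)\Psi(\mathbf{p},t) $$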

Now transform to coordinate space, where $p^2$ becomes $-\hbar^2\nabla^2$, etc. We see then that

we have an inherently different dependence on time (a single derivative) and space (lots of

higher-order derivatives). What we would like is an equation which is of the same order in

space and time. So what should we do?

An interesting (and reasonable) thought is to consider directly the relation

$$ E^2 = c^2p^2 + m^2c^4 $$

and quantize it. Let’s see what that gives.

$$ E^2 \to i\hbar\frac{\partial}{\partial t}\, i\hbar\frac{\partial}{\partial t} = -\hbar^2\frac{\partial^2}{\partial t^2} $$

and

$$ p^2 \to P^2 = -\hbar^2\nabla^2 $$

in coordinate representation. Acting on a state Ψ(r, t) gives

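$$ -\hbar^2\frac{\partial^2}{\partial t^2}\Psi(\mathbf{r},t) = -c^2\hbar^2\nabla^2\Psi(\mathbf{r},t) + m^2c^4\Psi(\mathbf{r},t) $$

or

$$ \left[\frac{1}{c^2}\frac{\partial^2}{\partial t^2} - \nabla^2 + \left(\frac{mc}{\hbar}\right)^2\right]\Psi(\mathbf{r},t) = 0 \qquad (209) $$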

In this equation, time and space enter compatibly, which is nice. But we won’t use it!

Why? The reason is that the wave function Ψ(r, t) only depends on r and t. But we know

that to describe a particle with spin-1/2 we need a spinor wave function that depends not

only on r and t but also on the spin orientation. An equation such as (209) can thus never

reduce to the Schrodinger equation for a particle with spin in the limit v/c << 1.

In fact the above equation is not without interest or use. Equation (209), called the

Klein-Gordon equation, is often used as a relativistic wave equation for a spinless (or spin-0)

particle.

Now let’s discuss the correct way to obtain an equation appropriate to a relativistic

quantum spin-1/2 particle. To do this, let’s return to (208), which was given on page 147

and then rejected:

$$ i\hbar\frac{\partial}{\partial t}|\Psi(t)\rangle = \left(c^2P^2 + m^2c^4\right)^{1/2}|\Psi(t)\rangle $$

Ideally, we’d like to stay with this equation, rather than its squared version (209), since it

is first order in time. This will, for reasons I won’t discuss, make simpler a probabilistic

interpretation of the proper quantum theory that emerges from it (I promise).

The problem with it, as I said earlier, is the messy square root. This is why we discarded it

then. Dirac’s inspiration was to see whether it was somehow possible to rewrite the quantity

in the square root as a perfect square. If so, the square root could be taken trivially and

all should be fine. Indeed, we would then end up with an equation first order both in space

and time, as we would also like. So, let’s see how to do this.

Let’s look at the factor in the square root, after removing a c2, and then try to express

it as a perfect square, i.e.

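$$ p^2 + m^2c^2 = \left(\alpha_x p_x + \alpha_y p_y + \alpha_z p_z + \beta mc\right)^2 = \left(\boldsymbol{\alpha}\cdot\mathbf{p} + \beta mc\right)^2 \qquad (210) $$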

where from now on (for simplicity) I will use p rather than P to refer to the momentum

operator. Is it possible to determine α and β such that this holds? Matching the two sides

of (210), we obtain

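$$ \begin{aligned} p_x^2 + p_y^2 + p_z^2 + m^2c^2 ={}& \alpha_x^2 p_x^2 + \alpha_y^2 p_y^2 + \alpha_z^2 p_z^2 + \beta^2 m^2c^2 \\ &+ p_x p_y\left(\alpha_x\alpha_y + \alpha_y\alpha_x\right) + \text{cyclic permutations} \\ &+ mc\,p_x\left(\alpha_x\beta + \beta\alpha_x\right) + (x \to y) + (x \to z) \end{aligned} \qquad (211) $$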

(a) From the first line in (211), we see that

$$ \alpha_i^2 = 1 \ (i = x, y, z) \quad\text{and}\quad \beta^2 = 1 \qquad (212) $$

(b) From the second line, we see that

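$$ \alpha_x\alpha_y + \alpha_y\alpha_x = \alpha_x\alpha_z + \alpha_z\alpha_x = \alpha_y\alpha_z + \alpha_z\alpha_y = 0 $$

or equivalently

$$ \{\alpha_i, \alpha_j\} = 0, \quad \text{for all } i \neq j \qquad (213) $$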

(c) From the third line, we see that

$$ \alpha_i\beta + \beta\alpha_i = \{\alpha_i, \beta\} = 0, \quad \text{for all } i \qquad (214) $$

If we find αi and β that satisfy (212-214), we will have achieved our goal.

Some observations:

(1) From (213,214), it is clear that αi and β cannot be simply c-numbers. They must be

matrices that anticommute with one another. And this is nice, since as we noted earlier we

want a theory that in the nonrelativistic (NR) limit will give a nonrelativistic Schrodinger

equation involving two-component spinors.

(2) Since we want our hamiltonian to be hermitean, so that a probabilistic interpretation is

possible, it is clear that these matrices must be hermitean.

(3) From (212), it is clear that each of the matrices can only have eigenvalues ±1.

(4) It is clear, though at this point disheartening, that they can’t be 2 × 2 matrices. The

only 2 × 2 matrices that satisfy all these conditions are the Pauli spin matrices. But there

are only three Pauli spin matrices, and we need four - αx, αy, αz and β.

(5) It is possible to prove that the lowest dimensional matrices possible for satisfying all

these requirements are 4× 4.

The simplest and most common (though not unique) 4× 4 matrices that satisfy all these

requirements are:

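$$ \boldsymbol{\alpha} = \begin{pmatrix} 0 & \boldsymbol{\sigma} \\ \boldsymbol{\sigma} & 0 \end{pmatrix} \quad\text{and}\quad \beta = \begin{pmatrix} I & 0 \\ 0 & -I \end{pmatrix} \qquad (215) $$

Here, σ are the usual 2 × 2 Pauli spin matrices and

$$ I = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} $$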

is the 2× 2 identity matrix.
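The conditions (212)-(214) can be checked numerically for the matrices in (215). The following is a minimal sketch, assuming NumPy is available; the helper names `dirac_matrices` and `anticommutator` are illustrative and not part of these notes.

```python
import numpy as np

# Pauli matrices and 2x2 building blocks
sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]], dtype=complex)
sz = np.array([[1, 0], [0, -1]], dtype=complex)
I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)


def dirac_matrices():
    """Return ([alpha_x, alpha_y, alpha_z], beta) in the representation of eq. (215)."""
    alphas = [np.block([[Z2, s], [s, Z2]]) for s in (sx, sy, sz)]
    beta = np.block([[I2, Z2], [Z2, -I2]])
    return alphas, beta


def anticommutator(a, b):
    return a @ b + b @ a


alphas, beta = dirac_matrices()
mats = alphas + [beta]
I4 = np.eye(4)

# (212): every matrix squares to the identity
squares_ok = all(np.allclose(m @ m, I4) for m in mats)
# (213) and (214): any two distinct matrices anticommute
anticommute_ok = all(
    np.allclose(anticommutator(mats[i], mats[j]), 0)
    for i in range(4)
    for j in range(i + 1, 4)
)
# Observation (2): all four matrices are hermitean
hermitian_ok = all(np.allclose(m, m.conj().T) for m in mats)

print(squares_ok, anticommute_ok, hermitian_ok)
```

Any other representation obtained from this one by a unitary transformation would pass the same checks, which is why the choice (215) is common but not unique.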

Summarizing, with this choice of α and β, a proper relativistic quantum equation of

motion for a free (spin-1/2) particle is

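$$ i\hbar\frac{\partial}{\partial t}|\Psi(t)\rangle = c\left(\boldsymbol{\alpha}\cdot\mathbf{p} + \beta mc\right)|\Psi(t)\rangle $$

or

$$ \left(i\hbar\frac{\partial}{\partial t} - c\,\boldsymbol{\alpha}\cdot\mathbf{p} - \beta mc^2\right)|\Psi(t)\rangle = 0 \qquad (216) $$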

This is the free-particle Dirac equation.

Let’s now discuss some features of the free-particle Dirac equation:

1. The first thing to note is that as suggested earlier the equation is indeed first order

both in space and time, suggesting that it can indeed preserve the Lorentz invariance

required of a relativistic theory.

2. Since the Pauli spin matrices σ naturally appear in the formalism (via the matrices αx,

αy and αz) and since they relate to the spin operator for a spin-1/2 particle, it would

seem that the Dirac equation contains the requisite physics appropriate to spin-1/2

particles, e.g. the electron. This is to be contrasted with the Klein-Gordon equation

presented earlier, which had no chance of representing the physics of particles with

intrinsic spin.

3. Note further that spin arose here completely naturally, rather than having to be put

in ad hoc as in our non-relativistic quantum theory. This suggests that intrinsic spin

is an inherently relativistic concept, even though we were able to append it by hand

to our NR formalism.

4. Since the matrices α and β that enter the (free-particle) Dirac equation are 4-

dimensional, it is clear that the state vector |Ψ(t) > is a 4-dimensional object as

well. At first glance, this may not seem a very happy outcome. We’ll address this

shortly, however, and see that in the small v/c NR limit of the Dirac theory, we indeed

recover the 2-component spinor theory of NR Schrodinger theory.


5. Finally, note that the free-particle Hamiltonian in this theory

$$ H = c\,\boldsymbol{\alpha}\cdot\mathbf{p} + \beta mc^2 $$

is clearly hermitean, as you can readily convince yourselves.

The probabilistic interpretation of the Dirac theory follows naturally from the hermiticity

of the hamiltonian, exactly as in the NR quantum theory. In particular, one defines

$$ \Psi^\dagger(\mathbf{r},t)\,\Psi(\mathbf{r},t) $$

as the probability density of finding the spin-1/2 particle at point r at time t. Furthermore, by using the Dirac equation, one can readily confirm that

$$ \int \Psi^\dagger(\mathbf{r},t)\,\Psi(\mathbf{r},t)\,d\mathbf{r} = \text{constant} $$

i.e. that the norm of a given state remains constant in time, exactly as in the nonrelativistic

Schrodinger theory. So, all seems fine!

An alternative form for the Dirac equation

Often the Dirac equation is presented in a slightly different form, in which its Lorentz

covariance is more apparent. In particular, we can introduce instead of the α and β matrices,

a related set of four matrices,

$$ \gamma^0 = \beta, \qquad \gamma^1 = \beta\alpha_x, \qquad \gamma^2 = \beta\alpha_y, \qquad \gamma^3 = \beta\alpha_z \qquad (217) $$

The free-particle Dirac equation can be expressed in terms of these new matrices as

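$$ \gamma^\mu\frac{\partial\Psi}{\partial x^\mu} + i\kappa\Psi = 0 \qquad (218) $$

where

$$ \kappa = mc/\hbar $$

and

$$ \gamma^\mu\frac{\partial}{\partial x^\mu} = \gamma^0\frac{\partial}{\partial x^0} + \boldsymbol{\gamma}\cdot\nabla = \beta\,\frac{1}{c}\frac{\partial}{\partial t} + \beta\boldsymbol{\alpha}\cdot\nabla $$

Finally the γ matrices can be shown to satisfy anticommutation relations

$$ \gamma^\mu\gamma^\nu + \gamma^\nu\gamma^\mu = 2g^{\mu\nu}I $$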

in terms of the familiar tensor gµν .
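The anticommutation relations of the γ matrices defined in (217) can be confirmed by direct multiplication. This is a small sketch, assuming NumPy and assuming the metric signature (+, −, −, −), which is the one implied by the definitions γ0 = β and γi = βαi.

```python
import numpy as np

# Pauli matrices and 2x2 building blocks
sigma = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]
I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)

beta = np.block([[I2, Z2], [Z2, -I2]])
alphas = [np.block([[Z2, s], [s, Z2]]) for s in sigma]

# Eq. (217): gamma^0 = beta, gamma^i = beta * alpha_i
gammas = [beta] + [beta @ a for a in alphas]

# Metric tensor with signature (+, -, -, -); this sign convention is an assumption
g = np.diag([1.0, -1.0, -1.0, -1.0])

clifford_ok = all(
    np.allclose(
        gammas[mu] @ gammas[nu] + gammas[nu] @ gammas[mu],
        2 * g[mu, nu] * np.eye(4),
    )
    for mu in range(4)
    for nu in range(4)
)
print(clifford_ok)
```

Note that γ0 squares to +I while the three spatial γ matrices square to −I, which is where the relative sign in the metric comes from.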

Incorporation of electromagnetism

As a first step towards generalizing the Dirac equation to other than free-particle systems,

let’s discuss what happens when we couple a spin-1/2 fermion (for concreteness, an electron)

to an electromagnetic field.

You’ve already seen this discussed in a nonrelativistic context and in fact the same ap-

proach applies here as well. All we have to do is to replace the operator p by p− qA/c. As

we learned last semester in our discussion of the quantum theory of electromagnetism, A is

also a quantum operator. Note that in principle we should also include the scalar potential

ϕ in our discussion, but as we did then we will work in a gauge in which ϕ = 0. Then the

Dirac equation for a spin-1/2 particle in an EM field becomes

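$$ i\hbar\frac{\partial}{\partial t}|\Psi(t)\rangle = \left[c\,\boldsymbol{\alpha}\cdot\left(\mathbf{p} - q\mathbf{A}/c\right) + \beta mc^2\right]|\Psi(t)\rangle \qquad (219) $$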

As before, the stationary states are of the form

$$ |\Psi(t)\rangle = |\Psi\rangle\, e^{-iEt/\hbar} \qquad (220) $$

which when plugged into (219) yields

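$$ E|\Psi\rangle = \left(c\,\boldsymbol{\alpha}\cdot\boldsymbol{\pi} + \beta mc^2\right)|\Psi\rangle \qquad (221) $$

where

$$ \boldsymbol{\pi} = \mathbf{p} - q\mathbf{A}/c $$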

Now let’s write the four-component eigenvector |Ψ > as

$$ |\Psi\rangle = \begin{pmatrix} \chi \\ \phi \end{pmatrix} \qquad (222) $$

where χ and ϕ are themselves two-component vectors (or spinors).

Using the fact that

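$$ \beta = \begin{pmatrix} I & 0 \\ 0 & -I \end{pmatrix} $$

and

$$ \boldsymbol{\alpha} = \begin{pmatrix} 0 & \boldsymbol{\sigma} \\ \boldsymbol{\sigma} & 0 \end{pmatrix} $$

we can rewrite the time-independent Dirac equation (221) as

$$ \begin{pmatrix} (E - mc^2)I & -c\,\boldsymbol{\sigma}\cdot\boldsymbol{\pi} \\ -c\,\boldsymbol{\sigma}\cdot\boldsymbol{\pi} & (E + mc^2)I \end{pmatrix} \begin{pmatrix} \chi \\ \phi \end{pmatrix} = 0 \qquad (223) $$

Thus

$$ \left(E - mc^2\right)\chi - c\,\boldsymbol{\sigma}\cdot\boldsymbol{\pi}\,\phi = 0 \qquad (224) $$

and

$$ -c\,\boldsymbol{\sigma}\cdot\boldsymbol{\pi}\,\chi + \left(E + mc^2\right)\phi = 0 \qquad (225) $$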

Thus, the two-component spinors χ and ϕ are coupled.

Solving the coupled equations (224) and (225) gives the eigensolutions for an electron in

an EM field. A bit later in the semester, we will discuss the solutions of these equations in

the absence of an EM field.

The nonrelativistic limit

I would like now to focus on the solutions of the coupled equations (224,225) at low

velocities (v/c << 1), to see how nonrelativistic Schrodinger theory emerges in this limit.

As we’ll see, it emerges with precisely the features we’d like.

From (225) we see that

$$ \phi = \frac{c\,\boldsymbol{\sigma}\cdot\boldsymbol{\pi}}{E + mc^2}\,\chi $$

The energy appearing here is the full relativistic energy, including the rest mass mc2. Since

the energy in Schrodinger theory doesn’t include the rest mass, let’s define the Schrodinger

energy as

$$ E_S = E - mc^2 $$

Then

$$ \phi = \frac{c\,\boldsymbol{\sigma}\cdot\boldsymbol{\pi}}{E_S + 2mc^2}\,\chi $$

At very low velocities, i.e. in the nonrelativistic domain, ES is much less than the

electron’s rest mass. Thus

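$$ E_S + 2mc^2 \approx 2mc^2 $$

and

$$ \phi \approx \frac{c\,\boldsymbol{\sigma}\cdot\boldsymbol{\pi}}{2mc^2}\,\chi = \frac{\boldsymbol{\sigma}\cdot\boldsymbol{\pi}}{2mc}\,\chi \qquad (226) $$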

The numerator in (226) contains the electron’s momentum operator and thus its expectation value will be of order mv, where v is the electron’s velocity. Thus,

$$ \left|\frac{\phi}{\chi}\right| \approx \frac{mv}{2mc} = \frac{1}{2}\frac{v}{c} $$

We see therefore that in the nonrelativistic limit, ϕ is very small compared to χ. For this

reason, ϕ is referred to as the “small component” of |Ψ > and χ as the “large component”,

as long as we are talking about the NR domain. More generally, χ and ϕ are referred to as the “upper” and “lower” components, respectively, for equally obvious reasons.

We now begin to see how our two-component spinor theory will emerge in the NR limit.

It will emerge when we focus on the large component χ, albeit perhaps including effects due

to the small component ϕ perturbatively. Let’s see how this plays out in a bit more detail.

To do this, let’s plug (226) into (224). Remembering that ES = E −mc2, we obtain

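$$ E_S\,\chi \approx \frac{(\boldsymbol{\sigma}\cdot\boldsymbol{\pi})(\boldsymbol{\sigma}\cdot\boldsymbol{\pi})}{2m}\,\chi \qquad (227) $$

Using the identity

$$ (\boldsymbol{\sigma}\cdot\mathbf{A})(\boldsymbol{\sigma}\cdot\mathbf{B}) = \mathbf{A}\cdot\mathbf{B} + i\boldsymbol{\sigma}\cdot(\mathbf{A}\times\mathbf{B}) $$

we obtain

$$ (\boldsymbol{\sigma}\cdot\boldsymbol{\pi})(\boldsymbol{\sigma}\cdot\boldsymbol{\pi}) = \boldsymbol{\pi}\cdot\boldsymbol{\pi} + i\boldsymbol{\sigma}\cdot(\boldsymbol{\pi}\times\boldsymbol{\pi}) $$

But

$$ \boldsymbol{\pi}\times\boldsymbol{\pi} = \frac{iq\hbar}{c}\,\mathbf{B} $$

where B is the magnetic field. Thus,

$$ (\boldsymbol{\sigma}\cdot\boldsymbol{\pi})(\boldsymbol{\sigma}\cdot\boldsymbol{\pi}) = \pi^2 - \frac{q\hbar}{c}\,\boldsymbol{\sigma}\cdot\mathbf{B} $$

Plugging this into (227) we obtain

$$ \left[\frac{\left(\mathbf{p} - \frac{q}{c}\mathbf{A}\right)^2}{2m} - \frac{q\hbar}{2mc}\,\boldsymbol{\sigma}\cdot\mathbf{B}\right]\chi = E_S\,\chi \qquad (228) $$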

This is precisely in the form of a NR Schrodinger equation for a spin-1/2 particle (as a

reminder, χ is a two-component spinor) in an EM field.
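The Pauli-matrix identity used in this derivation is easy to verify numerically for ordinary (commuting) vectors. The sketch below assumes NumPy; the random vectors `A` and `B` and the helper `sigma_dot` are purely illustrative. For commuting vectors A × A vanishes, so the σ · B term in (228) arises only because the components of the operator π do not commute with one another.

```python
import numpy as np

rng = np.random.default_rng(0)

# Pauli matrices stacked into an array of shape (3, 2, 2)
sigma = np.array([
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
], dtype=complex)


def sigma_dot(v):
    """Return the 2x2 matrix sigma . v for an ordinary 3-vector v."""
    return np.einsum("i,ijk->jk", v, sigma)


A = rng.normal(size=3)
B = rng.normal(size=3)

lhs = sigma_dot(A) @ sigma_dot(B)
rhs = np.dot(A, B) * np.eye(2) + 1j * sigma_dot(np.cross(A, B))
identity_ok = np.allclose(lhs, rhs)

# Special case A = B: the cross product vanishes and (sigma . A)^2 = |A|^2
square_ok = np.allclose(sigma_dot(A) @ sigma_dot(A), np.dot(A, A) * np.eye(2))

print(identity_ok, square_ok)
```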


Note in particular the second term in the square brackets on the left hand side of the

equation. It is precisely the interaction that arises for a spin-1/2 particle in a magnetic field

B, with a gyromagnetic ratio g = 2. As we saw in PHYS811 last semester, the gyromagnetic

ratio was a problem when we treated things nonrelativistically. To achieve agreement with

observation, we had to postulate there that the gyromagnetic ratio associated with intrinsic

spin was g = 2. But at that time, it was purely a postulate. Now we see where it comes

from. It arises from a nonrelativistic approximation to the full (and correct) Dirac equation

for spin-1/2 particles in an EM field. And it arises from the coupling of the dominant large

component of the Dirac wave function to the less-important (but nonetheless necessary)

small component.

Incorporation of an interaction potential

Next let’s discuss how we would incorporate an interaction potential for a spin-1/2 particle

into its relativistic Dirac equation. We do it precisely as you might expect. The hamiltonian

of the free spin-1/2 particle in an EM field is, as we’ve just seen,

$$ H_{free} = c\,\boldsymbol{\alpha}\cdot\boldsymbol{\pi} + \beta mc^2 $$

In the presence of a potential V it becomes

$$ H = c\,\boldsymbol{\alpha}\cdot\boldsymbol{\pi} + V + \beta mc^2 $$

and the Dirac equation then becomes

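$$ i\hbar\frac{\partial}{\partial t}|\Psi\rangle = \left[c\,\boldsymbol{\alpha}\cdot\boldsymbol{\pi} + V + \beta mc^2\right]|\Psi\rangle \qquad (229) $$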

Bear in mind, however, that the potential that enters (229) may involve the α and β (or

γ) matrices, i.e. they need not be pure scalar potentials.

Application of the Dirac equation to the Hydrogen atom

Now let’s apply the Dirac formalism to the hydrogen atom. Though an essentially nonrelativistic system (v/c << 1 for the electron) we should nevertheless be able to apply Dirac

theory to the problem and then take the appropriate NR limit to obtain a Schrodinger

description. We will see, in doing this, that some small, but interesting and observable rel-

ativistic effects creep in. This is for example the origin of the so-called hydrogen atom fine

structure that splits some of the degeneracies inherent in the simple NR Coulomb problem.


We will focus on the stationary eigenstates described by the time-independent Dirac

equation

$$ E|\Psi\rangle = \left[c\,\boldsymbol{\alpha}\cdot\mathbf{p} + V(r) + \beta mc^2\right]|\Psi\rangle \qquad (230) $$

We will ultimately use the Coulomb potential

$$ V(r) = -\frac{e^2}{r} $$

As earlier, we decompose

$$ |\Psi\rangle = \begin{pmatrix} \chi \\ \phi \end{pmatrix} $$

into its “large” and “small” components.

These two components satisfy the coupled equations

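$$ \left(E - V - mc^2\right)\chi - c\,\boldsymbol{\sigma}\cdot\mathbf{p}\,\phi = 0 \qquad (231) $$

and

$$ \left(E - V + mc^2\right)\phi - c\,\boldsymbol{\sigma}\cdot\mathbf{p}\,\chi = 0 \qquad (232) $$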

From (232), we find that

$$ \phi = \left(E - V + mc^2\right)^{-1} c\,\boldsymbol{\sigma}\cdot\mathbf{p}\,\chi \qquad (233) $$

Note that the order of the operators is important, as p in coordinate space is a differential

operator. We now plug (233) into (231), again respecting the operator order, and get

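$$ \left(E - V - mc^2\right)\chi = c\,\boldsymbol{\sigma}\cdot\mathbf{p}\left(E - V + mc^2\right)^{-1} c\,\boldsymbol{\sigma}\cdot\mathbf{p}\,\chi \qquad (234) $$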

Writing, as before,

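$$ E = E_S + mc^2 $$

where ES is the Schrodinger energy, gives

$$ \left[E_S - V - c\,\boldsymbol{\sigma}\cdot\mathbf{p}\left(E_S - V + 2mc^2\right)^{-1} c\,\boldsymbol{\sigma}\cdot\mathbf{p}\right]\chi = 0 \qquad (235) $$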

Consider now the operator $(E_S - V + 2mc^2)^{-1}$ entering (235). In the NR limit, $E_S - V$ is very small compared to $2mc^2$ [Of course, I mean this in expectation value.] Thus, we can carry out a series expansion in powers of $(E_S - V)/2mc^2$ and it should converge fairly rapidly. So, let’s expand!

$$ \frac{1}{E_S - V + 2mc^2} = \frac{1}{2mc^2}\left[1 + \frac{E_S - V}{2mc^2}\right]^{-1} = \frac{1}{2mc^2}\left[1 - \frac{E_S - V}{2mc^2} + \dots\right] $$

Plugging this into (235) and only keeping the terms shown explicitly gives

$$ \left[E_S - V - \frac{c^2(\boldsymbol{\sigma}\cdot\mathbf{p})^2}{2mc^2} + \frac{c^2\,\boldsymbol{\sigma}\cdot\mathbf{p}\,(E_S - V)\,\boldsymbol{\sigma}\cdot\mathbf{p}}{4m^2c^4}\right]\chi = 0 \qquad (236) $$

Let’s now see what happens when we only consider the first term in the curly brackets.

Then (236) reduces to

$$ \left[E_S - V - \frac{(\boldsymbol{\sigma}\cdot\mathbf{p})^2}{2m}\right]\chi = 0 \qquad (237) $$

But

$$ (\boldsymbol{\sigma}\cdot\mathbf{p})^2 = p^2 $$

so that (237) becomes

$$ \left[E_S - V - \frac{p^2}{2m}\right]\chi = 0 \qquad (238) $$

which we recognize as the ordinary Schrodinger equation for the Hydrogen atom.

Thus, we confirm that in lowest order we indeed recover the nonrelativistic Schrodinger

equation.

Now we’d like to study the effect of the second term in the curly bracket, which we

anticipate will give us corrections to the NR hamiltonian that are suppressed by (ES −

V )/2mc2 with respect to the usual one.

To put this “suppression” in a clearer light, let’s look for a moment at the lowest-order

equation (238), slightly reorganized,

$$ (E_S - V)\,\chi = \frac{p^2}{2m}\,\chi $$

Since

$$ p^2 \approx m^2v^2 $$

it is clear that

$$ E_S - V \approx \frac{1}{2}mv^2 $$

which isn’t very surprising. Thus, our expansion parameter behaves qualitatively like

$$ \frac{E_S - V}{2mc^2} \approx \frac{v^2}{4c^2} $$

As expected, our expansion in powers of (ES − V )/2mc2 is connected with an expansion in

powers of v2/c2, which we recognize as the appropriate expansion parameter in the NR limit

of a proper relativistic theory.
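To get a feeling for how small this expansion parameter is in hydrogen, one can insert a typical electron velocity. The sketch below assumes the Bohr-model estimate v/c ≈ α (the fine-structure constant) for the ground state; that estimate and the numerical value of α are inputs not derived in these notes.

```python
# Fine-structure constant (dimensionless), approximate value
ALPHA = 1.0 / 137.035999

# Assumption: Bohr-model ground-state speed of the electron, v/c = alpha
v_over_c = ALPHA

# Expansion parameter of the NR reduction: (E_S - V) / 2mc^2 ~ v^2 / 4c^2
expansion_parameter = v_over_c**2 / 4.0

print(f"v/c       = {v_over_c:.5f}")
print(f"v^2/4c^2  = {expansion_parameter:.3e}")
```

The resulting number is of order 10^-5, which is why the relativistic corrections discussed next can safely be treated in first-order perturbation theory.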

What we will now try to do is to obtain the lowest-order corrections to the Hydrogen

atom, those that are suppressed roughly by v2/c2 with respect to the main terms.

So, let’s now rewrite (236) in the form familiar from Schrodinger theory,

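$$ E_S\,\chi = \left(\frac{p^2}{2m} + V - \frac{\boldsymbol{\sigma}\cdot\mathbf{p}\,(E_S - V)\,\boldsymbol{\sigma}\cdot\mathbf{p}}{4m^2c^2}\right)\chi \qquad (239) $$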

where I have again used the fact that (σ · p)2 = p2.

At first glance, this looks like a mess. So, let’s try to clean it up a bit. Consider

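$$ (E_S - V)\,\boldsymbol{\sigma}\cdot\mathbf{p}\,\chi = \boldsymbol{\sigma}\cdot\mathbf{p}\,(E_S - V)\,\chi + \boldsymbol{\sigma}\cdot[E_S - V, \mathbf{p}]\,\chi = \boldsymbol{\sigma}\cdot\mathbf{p}\,(E_S - V)\,\chi + \boldsymbol{\sigma}\cdot[\mathbf{p}, V]\,\chi $$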

Finally, the contribution to the NR hamiltonian from this correction term is

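$$ H_{rel} = -\frac{\boldsymbol{\sigma}\cdot\mathbf{p}\,(E_S - V)\,\boldsymbol{\sigma}\cdot\mathbf{p}}{4m^2c^2} = -\frac{(\boldsymbol{\sigma}\cdot\mathbf{p})^2\,p^2}{8m^3c^2} - \frac{\boldsymbol{\sigma}\cdot\mathbf{p}\ \boldsymbol{\sigma}\cdot[\mathbf{p}, V]}{4m^2c^2} = -\frac{p^4}{8m^3c^2} - \frac{\boldsymbol{\sigma}\cdot\mathbf{p}\ \boldsymbol{\sigma}\cdot[\mathbf{p}, V]}{4m^2c^2} \qquad (240) $$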

where the second equation arose by keeping only the first term in ES − V , as is needed to

get things to the right order.

The first term in (240) does not depend on the potential, but only on the momentum

operator. It is just the relativistic correction to the kinetic energy operator. It is clearly

suppressed by m2v2/4m2c2 (or by v2/4c2) with respect to the ordinary kinetic energy term.

The second term is the relativistic correction to the potential V = −e2/r. We shall now

analyze it in some detail.

We begin by rewriting this term in a somewhat more convenient form, by invoking the

identity

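$$ \boldsymbol{\sigma}\cdot\mathbf{A}\ \boldsymbol{\sigma}\cdot\mathbf{B} = \mathbf{A}\cdot\mathbf{B} + i\boldsymbol{\sigma}\cdot\mathbf{A}\times\mathbf{B} $$

Thus,

$$ \boldsymbol{\sigma}\cdot\mathbf{p}\ \boldsymbol{\sigma}\cdot[\mathbf{p}, V] = \mathbf{p}\cdot[\mathbf{p}, V] + i\boldsymbol{\sigma}\cdot\mathbf{p}\times[\mathbf{p}, V] $$

and

$$ V_{rel} = -\frac{i\boldsymbol{\sigma}\cdot\mathbf{p}\times[\mathbf{p}, V]}{4m^2c^2} - \frac{\mathbf{p}\cdot[\mathbf{p}, V]}{4m^2c^2} \qquad (241) $$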

Both terms involve, in addition to the potential V , two p operators in the numerator and

4m2c2 in the denominator. Thus, they are both suppressed relative to V by

$$ p^2/4m^2c^2 \approx m^2v^2/4m^2c^2 \approx v^2/4c^2 , $$

again as expected.

Let’s now look at the first term in (241), which is straightforward to analyze. To begin,

we note that

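$$ [\mathbf{p}, V]\,\chi = -i\hbar\left[\nabla(V\chi) - V(\nabla\chi)\right] = -i\hbar\left[(\nabla V)\chi + V\nabla\chi - V\nabla\chi\right] = -i\hbar\,(\nabla V)\,\chi $$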

Thus, the first term in Vrel is

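$$ V^{(1)}_{rel} = -\frac{i\boldsymbol{\sigma}\cdot\mathbf{p}\times[\mathbf{p}, V]}{4m^2c^2} = -\frac{\hbar\,\boldsymbol{\sigma}\cdot\mathbf{p}\times\nabla\left(-\frac{e^2}{r}\right)}{4m^2c^2} $$

But

$$ \nabla\left(-\frac{e^2}{r}\right) = e^2\,\frac{\hat{\mathbf{r}}}{r^2} = e^2\,\frac{\mathbf{r}}{r^3} $$

so that

$$ V^{(1)}_{rel} = -\frac{\hbar e^2}{4m^2c^2r^3}\,\boldsymbol{\sigma}\cdot\mathbf{p}\times\mathbf{r} = \frac{\hbar e^2}{4m^2c^2r^3}\,\boldsymbol{\sigma}\cdot\mathbf{r}\times\mathbf{p} \qquad (242) $$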

Note: The reason we were able to write p × r = −r × p is that the components of r on

which the momentum operator acts in p× r are always in a direction orthogonal to that of

p. As such, the noncommutativity of p and r doesn’t play a role when dealing with their

cross product.

Now, since r × p = L, we can rewrite (242) as

$$ V^{(1)}_{rel} = \frac{\hbar e^2}{4m^2c^2r^3}\,\boldsymbol{\sigma}\cdot\mathbf{L} = \frac{e^2}{2m^2c^2r^3}\,\mathbf{S}\cdot\mathbf{L} \qquad (243) $$

The first of the two relativistic corrections to the hydrogen atom potential is thus the familiar

spin-orbit interaction. Since

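$$ \mathbf{S}\cdot\mathbf{L} = \frac{1}{2}\left[(\mathbf{L} + \mathbf{S})\cdot(\mathbf{L} + \mathbf{S}) - \mathbf{L}\cdot\mathbf{L} - \mathbf{S}\cdot\mathbf{S}\right] = \frac{1}{2}\left[\mathbf{J}\cdot\mathbf{J} - \mathbf{L}\cdot\mathbf{L} - \mathbf{S}\cdot\mathbf{S}\right], $$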

this term splits the degeneracy between states with the same l values (and of course the

same s = 1/2 values) but different j values (i.e. j = l + 1/2 and j = l − 1/2). Note further

that these remarks obviously don’t apply to l = 0 s states, for which only j = l+1/2 exists.

Now let’s look at the second of the relativistic corrections to the potential V in (241),

namely

$$ V^{(2)}_{rel} = -\frac{\mathbf{p}\cdot[\mathbf{p}, V]}{4m^2c^2} \qquad (244) $$

Consider the numerator

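$$ O = \mathbf{p}\cdot[\mathbf{p}, V] = \mathbf{p}\cdot\mathbf{p}\,V - \mathbf{p}\cdot V\mathbf{p} $$

Taking its hermitean adjoint, we find

$$ O^\dagger = V\,\mathbf{p}\cdot\mathbf{p} - \mathbf{p}\cdot V\mathbf{p} $$

where I’ve used the fact that both V and p are hermitean. Comparing these two terms we see that $O \neq O^\dagger$, so that $V^{(2)}_{rel}$ is not hermitean. And that clearly isn’t very nice. As we have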

often seen, hermitean hamiltonians are a necessary ingredient of a quantum theory with a

meaningful probabilistic interpretation, i.e. one in which probability is conserved.

Does this mean that all is lost in our efforts to extract the NR limit of the Dirac theory

for the Hydrogen atom? No, it just means that we haven’t been sufficiently careful. The fact

that $V^{(2)}_{rel}$, as written down, is not hermitean says that the part of the probability associated with the “large component” χ is not conserved. But Dirac theory never said that the “large component” of the probability must be conserved, just that the total probability must be

conserved. Mathematically

$$ \int \Psi^\dagger\Psi\,d\mathbf{r} = \int\left(|\chi|^2 + |\phi|^2\right)d\mathbf{r} = \text{constant} $$

but $\int|\chi|^2\,d\mathbf{r}$ need not be.

On the other hand, the Schrodinger theory that emerges should itself have a conserved

probability. Put another way, if the Schrodinger wave function is χS, then it should satisfy

$$ \int|\chi_S|^2\,d\mathbf{r} = \text{constant} $$

What this tells us is that $\chi_S \neq \chi$. So, what is it?

To answer this, consider again

$$ \int\left(\chi^\dagger\chi + \phi^\dagger\phi\right)d\mathbf{r} $$

We showed that to lowest order in v2/c2,

$$ \phi \approx \frac{\boldsymbol{\sigma}\cdot\mathbf{p}}{2mc}\,\chi $$

Thus, to lowest order in v2/c2,

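$$ \int\left(\chi^\dagger\chi + \chi^\dagger\,\frac{\boldsymbol{\sigma}\cdot\mathbf{p}}{2mc}\,\frac{\boldsymbol{\sigma}\cdot\mathbf{p}}{2mc}\,\chi\right)d\mathbf{r} = \text{constant} $$

or

$$ \int\left(\chi^\dagger\chi + \chi^\dagger\,\frac{p^2}{4m^2c^2}\,\chi\right)d\mathbf{r} = \text{constant} $$

or

$$ \int\chi^\dagger\left[1 + \frac{p^2}{4m^2c^2}\right]\chi\,d\mathbf{r} = \text{constant} $$

But

$$ \left(1 + \frac{p^2}{8m^2c^2}\right)^2 = 1 + \frac{p^2}{4m^2c^2} + O\!\left(\frac{p^4}{64m^4c^4}\right) $$

Thus, to $O(v^2/c^2)$,

$$ \int\chi^\dagger\left[1 + \frac{p^2}{8m^2c^2}\right]^2\chi\,d\mathbf{r} = \int\left[\left(1 + \frac{p^2}{8m^2c^2}\right)\chi\right]^\dagger\left[\left(1 + \frac{p^2}{8m^2c^2}\right)\chi\right]d\mathbf{r} $$

From all of this, we see that if we choose

$$ \chi_S = \left(1 + \frac{p^2}{8m^2c^2}\right)\chi $$

then

$$ \int|\chi_S|^2\,d\mathbf{r} = \text{constant} $$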

through the desired order in v2/c2, just as we would like.

Basically, what we have done here is to recognize that relativistic effects must be treated

consistently, both in the operators of the resulting NR theory and in the wave functions.

So, let’s now reconsider the NR Schrodinger-like equation (239)

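$$ E_S\,\chi = \left(\frac{p^2}{2m} + V - \frac{\boldsymbol{\sigma}\cdot\mathbf{p}\,(E_S - V)\,\boldsymbol{\sigma}\cdot\mathbf{p}}{4m^2c^2}\right)\chi = H\,\chi $$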

that we wrote down earlier. As a reminder, H includes the various relativistic corrections

we have been discussing, including the non-hermitean one we didn’t especially like.

Now replace χ on both sides by

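$$ \chi \to \left(1 + \frac{p^2}{8m^2c^2}\right)^{-1}\chi_S $$

Then

$$ E_S\left(1 + \frac{p^2}{8m^2c^2}\right)^{-1}\chi_S = H\left(1 + \frac{p^2}{8m^2c^2}\right)^{-1}\chi_S $$

Premultiplying both sides by $\left(1 + \frac{p^2}{8m^2c^2}\right)$ gives

$$ E_S\,\chi_S = \left(1 + \frac{p^2}{8m^2c^2}\right)H\left(1 + \frac{p^2}{8m^2c^2}\right)^{-1}\chi_S $$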

Expanding the inverse operator and keeping only the lowest order terms (thru O(v2/c2))

gives

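$$ E_S\,\chi_S = \left(H + \frac{p^2}{8m^2c^2}H - H\frac{p^2}{8m^2c^2}\right)\chi_S = \left(H + \frac{1}{8m^2c^2}\left[p^2, H\right]\right)\chi_S $$

To this order only the potential V in H fails to commute with $p^2$. Thus, the appropriate NR hamiltonian to use in conjunction with a properly normalizable χS is

$$ H_S = H + \frac{1}{8m^2c^2}\left[p^2, V\right] $$

I now claim that when we add the new term

$$ \frac{1}{8m^2c^2}\left[p^2, V\right] $$

to the non-hermitean term (244)

$$ -\frac{\mathbf{p}\cdot[\mathbf{p}, V]}{4m^2c^2} $$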

we obtained earlier, we end up with something that is inherently hermitean, as I will now

confirm.

Let’s call the sum of the two terms the Darwin term and then look at it a bit.

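$$ V_{Darwin} = -\frac{\mathbf{p}\cdot[\mathbf{p}, V]}{4m^2c^2} + \frac{1}{8m^2c^2}\left[p^2, V\right] = \frac{1}{8m^2c^2}\left([\mathbf{p}\cdot\mathbf{p}, V] - 2\,\mathbf{p}\cdot[\mathbf{p}, V]\right) $$

But

$$ [\mathbf{p}\cdot\mathbf{p}, V] = \mathbf{p}\cdot[\mathbf{p}, V] + [\mathbf{p}, V]\cdot\mathbf{p} $$

so that

$$ [\mathbf{p}\cdot\mathbf{p}, V] - 2\,\mathbf{p}\cdot[\mathbf{p}, V] = [\mathbf{p}, V]\cdot\mathbf{p} - \mathbf{p}\cdot[\mathbf{p}, V] $$

Let’s now denote this operator as Q and obtain its hermitean adjoint, i.e.

$$ Q = \mathbf{p}V\cdot\mathbf{p} - V\,\mathbf{p}\cdot\mathbf{p} - \mathbf{p}\cdot\mathbf{p}\,V + \mathbf{p}\cdot V\mathbf{p} $$

and

$$ Q^\dagger = \mathbf{p}V\cdot\mathbf{p} - \mathbf{p}\cdot\mathbf{p}\,V - V\,\mathbf{p}\cdot\mathbf{p} + \mathbf{p}V\cdot\mathbf{p} = Q $$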

Thus, as I suggested earlier, the Darwin term is hermitean. The “wave function renormal-

ization” indeed cured the problem.
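The operator algebra behind this statement does not depend on the specific form of p or V, only on their hermiticity, so it can be illustrated with finite matrices. In the sketch below, random hermitean matrices `P` and `V` stand in for a single component of the momentum and for the potential; this is a toy stand-in chosen for illustration, not a discretization of the hydrogen problem.

```python
import numpy as np

rng = np.random.default_rng(1)


def random_hermitian(n):
    """Random n x n hermitean matrix (illustrative stand-in for an operator)."""
    a = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    return (a + a.conj().T) / 2


def is_hermitian(m):
    return np.allclose(m, m.conj().T)


P = random_hermitian(6)   # stand-in for one component of p
V = random_hermitian(6)   # stand-in for the potential

comm = P @ V - V @ P      # [p, V]

O = P @ comm              # numerator of eq. (244): p [p, V]
Q = comm @ P - P @ comm   # combination entering the Darwin term: [p, V] p - p [p, V]

print("O hermitean:", is_hermitian(O))
print("Q hermitean:", is_hermitian(Q))

# Q equals [p^2, V] - 2 p [p, V], as in the text
Q_alt = (P @ P @ V - V @ P @ P) - 2 * O
print("Q matches [p^2, V] - 2 p [p, V]:", np.allclose(Q, Q_alt))
```

For generic matrices the difference between O and its adjoint is just [p^2, V], which is exactly the piece that the wave function renormalization supplies.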

Working out the Darwin term in some detail leads to the result

$$ V_{Darwin} = \frac{\hbar^2}{8m^2c^2}\,\nabla^2 V = \frac{e^2\hbar^2\pi}{2m^2c^2}\,\delta^3(\mathbf{r}) \qquad (245) $$

i.e. the Darwin term only acts when the electron is at the origin. Since it is only for s states

that the electron can be at the origin, the Darwin term acts only on s states. This is in

contrast to the spin-orbit term which acted everywhere but on s states.

When we include in perturbation theory the various relativistic corrections just discussed

we get almost perfect agreement with the experimentally measured properties of Hydrogen.

The one discrepancy remaining involves an experimentally observed slight splitting between

the 2s1/2 and 2p1/2 levels, whereas the theory up to now gives them as degenerate. This

phenomenon, called the Lamb shift, requires full-blown Quantum Electrodynamics (QED)

for its explanation.
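As a rough numerical illustration of these statements, the sketch below evaluates the standard closed-form results for hydrogen: the first-order fine-structure formula that results from combining the three corrections (relativistic kinetic energy, spin-orbit and Darwin), and the exact Dirac energy for a point Coulomb potential. Neither formula is derived in these notes; both, together with the numerical constants, are quoted here as assumptions.

```python
import math

ALPHA = 1.0 / 137.035999   # fine-structure constant (approximate)
MC2_EV = 510998.95         # electron rest energy in eV (approximate)


def perturbative_energy(n, j):
    """Bohr energy plus first-order fine-structure correction, in eV."""
    bohr = -MC2_EV * ALPHA**2 / (2 * n**2)
    return bohr * (1 + (ALPHA**2 / n**2) * (n / (j + 0.5) - 0.75))


def dirac_energy(n, j):
    """Exact Dirac energy for a point Coulomb potential, rest energy subtracted, in eV."""
    k = j + 0.5
    denom = n - k + math.sqrt(k**2 - ALPHA**2)
    return MC2_EV * (1 / math.sqrt(1 + (ALPHA / denom) ** 2) - 1)


# Both expressions depend only on n and j, so 2s1/2 and 2p1/2 (n=2, j=1/2)
# come out degenerate; only 2p3/2 is split off.
e_2_half = perturbative_energy(2, 0.5)
e_2_three_half = perturbative_energy(2, 1.5)
splitting_pert = e_2_three_half - e_2_half
splitting_dirac = dirac_energy(2, 1.5) - dirac_energy(2, 0.5)

print(f"n=2 fine-structure splitting (perturbative): {splitting_pert:.4e} eV")
print(f"n=2 fine-structure splitting (exact Dirac):  {splitting_dirac:.4e} eV")
```

The two splittings agree to the order kept in the expansion, and neither expression distinguishes 2s1/2 from 2p1/2, which is the degeneracy that the Lamb shift lifts.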


A return to the free-particle Dirac equation

Now that we’ve completed our discussion of the NR reduction of the Dirac equation,

I would like to return to the full relativistic equation and explore some important conse-

quences. I will focus on the free-particle problem, since many interesting conclusions will

already be evident there.

As a reminder, the free-particle Dirac equation is given by

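$$ i\hbar\frac{\partial}{\partial t}|\Psi(t)\rangle = \left(c\,\boldsymbol{\alpha}\cdot\mathbf{p} + \beta mc^2\right)|\Psi(t)\rangle \qquad (246) $$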

We will again look at the stationary states, which as always are of the form

$$ |\Psi(t)\rangle = |\Psi\rangle\,e^{-iEt/\hbar} \qquad (247) $$

Plugging this into (246) leads to a relativistic eigenvalue equation

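$$ E|\Psi\rangle = H_D|\Psi\rangle = \left(c\,\boldsymbol{\alpha}\cdot\mathbf{p} + \beta mc^2\right)|\Psi\rangle \qquad (248) $$

where for a free particle the Dirac hamiltonian is

$$ H_D = c\,\boldsymbol{\alpha}\cdot\mathbf{p} + \beta mc^2 \qquad (249) $$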

For a free particle, the Dirac hamiltonian obviously commutes with the three components

of momentum, p. As a consequence, it is possible to find simultaneous eigenstates of HD

and p, which I’ll denote |ΨE, p > and which satisfy

$$ E\,|\Psi_{E,\mathbf{p}}\rangle = \left(c\,\boldsymbol{\alpha}\cdot\mathbf{p} + \beta mc^2\right)|\Psi_{E,\mathbf{p}}\rangle $$

As before, we decompose the four-component Dirac eigenvectors into their upper and

lower components,

$$ |\Psi_{E,\mathbf{p}}\rangle = \begin{pmatrix} \chi_{E,\mathbf{p}} \\ \phi_{E,\mathbf{p}} \end{pmatrix} $$

This then leads to the set of coupled equations

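$$ \left(E - mc^2\right)\chi_{E,\mathbf{p}} - c\,\boldsymbol{\sigma}\cdot\mathbf{p}\,\phi_{E,\mathbf{p}} = 0 \qquad (250) $$

and

$$ -c\,\boldsymbol{\sigma}\cdot\mathbf{p}\,\chi_{E,\mathbf{p}} + \left(E + mc^2\right)\phi_{E,\mathbf{p}} = 0 \qquad (251) $$

From (251) we see that

$$ \phi_{E,\mathbf{p}} = \frac{c\,\boldsymbol{\sigma}\cdot\mathbf{p}}{E + mc^2}\,\chi_{E,\mathbf{p}} $$

which when plugged into (250) gives

$$ \left(E - mc^2\right)\chi_{E,\mathbf{p}} - \frac{c^2(\boldsymbol{\sigma}\cdot\mathbf{p})^2}{E + mc^2}\,\chi_{E,\mathbf{p}} = 0 $$

or, after multiplying through by $E + mc^2$,

$$ \left[E^2 - m^2c^4 - c^2(\boldsymbol{\sigma}\cdot\mathbf{p})^2\right]\chi_{E,\mathbf{p}} = 0 $$

Using as always that

$$ (\boldsymbol{\sigma}\cdot\mathbf{p})^2 = p^2 $$

gives

$$ E^2 - m^2c^4 - c^2p^2 = 0 $$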

As a reminder, p is now the momentum eigenvalue which is why we did not need to keep

the eigenvector χE,p in the equation.

We see therefore that for a free particle there is a connection between E and p2 (as of

course there is for a NR free particle). For a relativistic particle it is

$$ E^2 - m^2c^4 - c^2p^2 = 0 $$

or

$$ E^2 = c^2p^2 + m^2c^4 \qquad (252) $$

as expected.

For a given three-momentum p, we see that there are two energies that satisfy (252),

namely

$$ E = +c\sqrt{p^2 + m^2c^2} \qquad (253) $$

and

$$ E = -c\sqrt{p^2 + m^2c^2} \qquad (254) $$

Thus, for a given free-particle momentum p, the Dirac equation admits two possible

solutions, one with positive energy and one with negative energy. The positive energy

solution is fine, but what in the world does the negative energy solution mean?

Solutions of the free-particle Dirac equation


Before addressing this question, let me first obtain the eigensolutions to the free-particle

Dirac equation, both the positive and negative energy solutions. For notational purposes,

let’s denote the positive energy solution by E = Ep and the negative energy solution by

E = −Ep, where

$$ E_p = c\sqrt{p^2 + m^2c^2} $$

The simplest positive-energy solutions are obtained by assuming that the momentum

vector p points along the z-direction. Then α · p = αzp and (248) reduces to the four-

dimensional matrix eigenvalue equation

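$$ \begin{pmatrix} mc^2 & 0 & cp & 0 \\ 0 & mc^2 & 0 & -cp \\ cp & 0 & -mc^2 & 0 \\ 0 & -cp & 0 & -mc^2 \end{pmatrix} \begin{pmatrix} u_1 \\ u_2 \\ u_3 \\ u_4 \end{pmatrix} = E_p \begin{pmatrix} u_1 \\ u_2 \\ u_3 \\ u_4 \end{pmatrix} \qquad (255) $$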

where I have introduced for the four-component eigenvector the specific notation

$$ |\Psi\rangle = \begin{pmatrix} u_1 \\ u_2 \\ u_3 \\ u_4 \end{pmatrix} $$

This reduces to four coupled equations

$$ \begin{aligned} mc^2\,u_1 + cp\,u_3 &= E_p\,u_1 \\ cp\,u_1 - mc^2\,u_3 &= E_p\,u_3 \\ mc^2\,u_2 - cp\,u_4 &= E_p\,u_2 \\ -cp\,u_2 - mc^2\,u_4 &= E_p\,u_4 \end{aligned} $$

There are two linearly independent and orthogonal (albeit non-normalized) solutions to

this set of equations,

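$$ u^{(R)} \propto \begin{pmatrix} 1 \\ 0 \\ \dfrac{cp}{E_p + mc^2} \\ 0 \end{pmatrix} $$

and

$$ u^{(L)} \propto \begin{pmatrix} 0 \\ 1 \\ 0 \\ -\dfrac{cp}{E_p + mc^2} \end{pmatrix} $$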

I have introduced the superscripts R and L to denote the helicity of these two eigenvectors.

Both are eigenvectors of the helicity operator, introduced last semester, with eigenvalues +1

and −1 respectively. As a reminder, the helicity is the spin projection along the momentum

vector.

In the case of the negative energy solutions, the resulting eigenvectors associated with

energy −Ep are

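$$ u^{(R)} \propto \begin{pmatrix} -\dfrac{cp}{E_p + mc^2} \\ 0 \\ 1 \\ 0 \end{pmatrix} $$

and

$$ u^{(L)} \propto \begin{pmatrix} 0 \\ \dfrac{cp}{E_p + mc^2} \\ 0 \\ 1 \end{pmatrix} $$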
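As a quick numerical sanity check of the four spinors just written down, here is a minimal Python sketch. It assumes units with m = c = 1 and an arbitrarily chosen momentum p = 0.75 along z; the helper names (dirac_matrix, residual, spinors) are illustrative only and do not come from Shankar or from these notes. It builds the 4x4 matrix of (255) and confirms that each spinor is reproduced, up to the factor +Ep or −Ep, when the matrix acts on it.

```python
import math

def dirac_matrix(p, m=1.0, c=1.0):
    """4x4 free-particle Dirac Hamiltonian of eq. (255), momentum p along z."""
    mc2, cp = m * c**2, c * p
    return [
        [mc2, 0.0,  cp,  0.0],
        [0.0, mc2,  0.0, -cp],
        [cp,  0.0, -mc2, 0.0],
        [0.0, -cp,  0.0, -mc2],
    ]

def residual(H, u, E):
    """Largest component of H u - E u; zero for an exact eigenvector."""
    Hu = [sum(H[i][j] * u[j] for j in range(4)) for i in range(4)]
    return max(abs(Hu[i] - E * u[i]) for i in range(4))

def spinors(p, m=1.0, c=1.0):
    """Unnormalized positive- and negative-energy spinors with their energies."""
    Ep = c * math.sqrt(p**2 + (m * c)**2)
    a = c * p / (Ep + m * c**2)
    return Ep, {
        "positive, R": ([1.0, 0.0, a, 0.0], +Ep),
        "positive, L": ([0.0, 1.0, 0.0, -a], +Ep),
        "negative, R": ([-a, 0.0, 1.0, 0.0], -Ep),
        "negative, L": ([0.0, a, 0.0, 1.0], -Ep),
    }

p = 0.75                      # assumed test value, in units with m = c = 1
H = dirac_matrix(p)
Ep, solutions = spinors(p)
residuals = {name: residual(H, u, E) for name, (u, E) in solutions.items()}
for name, r in residuals.items():
    print(f"{name}: max|Hu - Eu| = {r:.2e}")
```

All four residuals should vanish to rounding accuracy, which confirms that the two spinors with energy +Ep and the two with energy −Ep together exhaust the four eigenvectors of the matrix.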

The negative energy solutions

Now let’s discuss a bit the physical significance of the negative energy solutions. First,

let’s consider the positive-energy free-particle spectrum. From (253), we see that these solutions begin

at a threshold energy of $mc^2$, the rest energy of the particle, and then extend upwards in a

continuum to +∞. Likewise from (254) we see that the negative energy solutions begin at

−mc2 and then extend downwards in a continuum to −∞. This is represented schematically

in fig. 13, albeit with the continuum nature of the solutions above and below the two

thresholds not made evident.

Dirac postulated that the “vacuum” state of nature (i.e. the state with no particles)

corresponds to all of the negative energy states being filled. This set of filled negative-

energy states is called the Dirac Sea. And, again according to Dirac, this filled sea of

negative-energy particles is not observable.

FIG. 13: Schematic illustration of the spectrum of a relativistic free particle. (Energy axis with the values $-mc^2$, 0 and $+mc^2$ marked.)

Sounds radical! Sounds cute! So, let’s now see what such a postulate buys us.

Let’s first ask what happens if you add a particle (we are of course thinking of electrons,

but any spin-1/2 particle will do) to the vacuum.

As we have discussed, particles of spin-1/2 are fermions and satisfy the Pauli exclusion

principle. Thus, no two such particles can occupy the same quantum state. Thus, when you

add a particle to the vacuum it cannot go into any of the negative-energy states, since all

are already occupied. Thus, it must go into one of the positive-energy states. And that is

the usual picture of an electron with positive energy.

Does this mean that the negative-energy states are irrelevant? No! To see why not, let’s

now ask what would happen if we were to hit the system in its vacuum state with something,

say a photon, and in doing this we transferred to the system an energy in excess of 2mc2.

This is enough energy to lift one of the negative-energy electrons within the Dirac Sea across

the 2mc2 energy gap into a positive energy state, as shown schematically in figure 14.
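To put a number on the gap for electrons, the following short Python sketch evaluates $2mc^2$ in MeV. The constants are the CODATA 2018 SI values, which are my input and not part of the notes; the function name rest_energy_mev is just illustrative.

```python
# CODATA 2018 values in SI units (assumed inputs, not taken from the notes)
ELECTRON_MASS_KG = 9.1093837015e-31
SPEED_OF_LIGHT = 299792458.0          # m/s (exact by definition)
ELEMENTARY_CHARGE = 1.602176634e-19   # C (exact); numerically also joules per eV

def rest_energy_mev(mass_kg):
    """Rest energy m c^2 converted from joules to MeV."""
    joules = mass_kg * SPEED_OF_LIGHT**2
    return joules / ELEMENTARY_CHARGE / 1.0e6

gap_mev = 2.0 * rest_energy_mev(ELECTRON_MASS_KG)
print(f"2 m c^2 = {gap_mev:.4f} MeV")
```

The result is roughly 1.022 MeV, i.e. twice the familiar 0.511 MeV electron rest energy.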

FIG. 14: Schematic illustration of a particle-hole excitation of the Dirac Sea, producing a particle and an antiparticle. (Energy axis with $-mc^2$, 0 and $+mc^2$ marked, showing the $2mc^2$ gap, the particle and the hole.)

The net effect is that we now have an electron in a positive-energy state and a hole in

one of the negative-energy states within the Dirac Sea.

Clearly, the electron has positive energy and charge −e. What about the hole in the

Dirac Sea?

When you remove something with charge −e from something with no charge, you leave

behind something with charge +e relative to the (unobservable) filled Dirac Sea. Likewise

removing something with energy −E leaves behind something with energy +E, again relative

to the filled Dirac Sea. Thus, the hole has energy +E (positive) and charge +e (opposite to

that of the electron). The bottom line is that the hole in the Dirac Sea behaves for all intents

and purposes as a particle with the same mass as the electron, but with opposite charge.

Dirac called this strange beast a positron (for positive electron). And, lo and behold, it was

discovered experimentally only a few years after this bold (to say the least) hypothesis.


In our discussion last semester of electromagnetic interactions with quantum systems (e.g.

atoms), we saw that whenever there can be a quantum transition induced upwards through

the absorption of energy, there can also be a spontaneous transition downwards with the

emission of energy in the form of photons. And there’s no reason not to expect the same

thing to happen here.

So, imagine an electron and a positron coming together. The electron, being in a positive-

energy state, sees the corresponding hole in a negative-energy state in the Dirac Sea, i.e.

the positron. Since there is a hole available at a lower energy than the particle, the electron

can fall into that hole. In doing so, it will of course emit a photon to carry away its loss of

energy. Obviously, this loss of energy must be at least 2mc2, which is the minimum energy

gap between the positive and negative energy states. This I claim is completely analogous

to what happens when a quantum system in an excited state decays to a lower state by

emitting a photon.

We see therefore that the simple Dirac picture not only predicts the positron, but also

predicts electron-positron annihilation. And as you all know this too is seen experimentally.

But what about bosons?

The Dirac equation, and all that followed from it, applied to fermions only. When dealing

with spinless bosons, the Dirac equation doesn’t apply. There we discussed the fact that a

natural equation of motion was the Klein-Gordon equation, which we gave earlier. Rewriting

it here, it reads

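$$ \left[ \frac{1}{c^2}\frac{\partial^2}{\partial t^2} - \nabla^2 + \left(\frac{mc}{\hbar}\right)^2 \right] \Psi(\mathbf{r}, t) = 0 \qquad (256) $$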

It is straightforward to convince yourselves that this equation, like the Dirac equation,

has negative-energy solutions as well as positive-energy solutions. So, what do the negative

energy solutions mean here, i.e. for spinless bosons satisfying the Klein-Gordon equation?

Obviously, Dirac’s interpretation of the negative-energy solutions as being a filled (and

unobservable) sea of particles cannot work here. And the reason is simple. Dirac’s interpre-

tation depended critically on the fact that for fermions the Pauli principle applies and no two

fermions can occupy the same single-particle level. This is what guaranteed the stability of the Dirac

sea, i.e. the fact that positive-energy particles cannot fall into the negative-energy states.

But bosons do not satisfy a Pauli principle, and thus there is no way for this interpretation

to apply to the negative-energy solutions of the Klein-Gordon equation. There is no way to

have a stable sea of particles with all negative-energy states filled.

FIG. 15: Schematic illustration of a negative-energy particle of charge e moving backwards in time. (Space-time diagram with axes x and t; the points c and d and the times tc and td are marked.)

How do we get around this? To do this, we instead use an idea due to Feynman, which

indeed applies not only to fermions but to bosons as well, as I will now briefly discuss.

Feynman’s interpretation of the negative-energy particles, whether fermions or bosons, is

that negative-energy particles can only move backwards in time.

This is schematically represented in figure 15 in which we consider a negative-energy

particle (perhaps an electron with negative charge −e) created at a space-time point c

which then travels backwards in time to space-time point d where it is destroyed.

What do we, people who move forward in time and see space-time in equal-time slices,

think is happening?

1. t < td . As far as we can tell there is nothing anywhere.

2. t = td . At this time, a negative energy −|E| and negative charge −e are destroyed, so

that the world energy goes up by |E| and the charge goes up by e. It would seem to

us, therefore, that an antiparticle with this charge and energy is born

here.

3. t = tc . At this time, negative energy is created and charge −e is created. Thus, from

our perspective, the antiparticle is wiped out at this time.

4. t > tc. Once again, there does not seem to us to be anything anywhere.

Note, however, that nowhere in this discussion did it really matter whether the negative-

energy particle that was created at tc and destroyed at td was a fermion or a boson.

A nice, albeit brief, description of how we can accommodate such particles moving back-

wards in time in our quantum formalism is provided by Shankar towards the end of Chapter

20. I would now like to review the key steps for you.

The starting point of my discussion is a return to non-relativistic Quantum Mechanics

and to focus on the so-called propagator. As a reminder the propagator gives the amplitude

for propagating from a point r ′ at time t′ to a point r at a subsequent time t. In coordinate

representation, it can be written as

$$ U_S(\mathbf{r}, t; \mathbf{r}', t') = \sum_n \Psi_n(\mathbf{r})\,\Psi_n^*(\mathbf{r}')\,e^{-iE_n(t-t')} $$

in terms of the complete set of eigenstates Ψn of the Schrodinger hamiltonian H, i.e.

HΨn(r) = EnΨn(r)

Given this $U_S$ and Ψ(t′) at some given time t′ we can get Ψ(t) at a later time t.

But even though we use $U_S$ to propagate forward in time, i.e. to calculate how the wave

function evolves to later times, it can also propagate backwards in time, since $U_S \neq 0$ for

t < t′.

To avoid this possibility, it is useful to introduce a propagator that does not allow prop-

agation back in time,

$$ G_S(\mathbf{r}, t; \mathbf{r}', t') = \theta(t-t')\,U_S(\mathbf{r}, t; \mathbf{r}', t') $$

in terms of the usual theta function, which is by definition zero for t < t′ and 1 for t > t′.


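This new propagator, which only applies for t > t′, satisfies

$$ \begin{aligned} \left( i\frac{\partial}{\partial t} - H \right) G_S &= \left[ i\frac{\partial}{\partial t}\theta(t-t') \right] \sum_n \Psi_n(\mathbf{r})\,\Psi_n^*(\mathbf{r}')\,e^{-iE_n(t-t')} \\ &= i\,\delta(t-t')\,\delta^3(\mathbf{r}-\mathbf{r}') \\ &= i\,\delta^4(x-x') \end{aligned} $$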

where I use the notation x to refer to the 4-vector t, r.

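Note, to derive this I made use of the fact that

$$ \frac{d}{dt}\theta(t-t') = \delta(t-t') $$

and also that $U_S$ satisfies the usual equation of a propagator, together with its equal-time value,

$$ \left( i\frac{\partial}{\partial t} - H \right) U_S = 0, \qquad U_S(\mathbf{r}, t'; \mathbf{r}', t') = \sum_n \Psi_n(\mathbf{r})\,\Psi_n^*(\mathbf{r}') = \delta^3(\mathbf{r}-\mathbf{r}') $$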

As a reminder, we need the complete set of eigenstates Ψn to recreate the 3D delta function.

So, that is what the Schrodinger propagator looks like when restricted to moving forward

in time.

Analogous treatment of the free-particle Dirac propagator would show that it too satisfies

an analogous equation

$$ \left( i\frac{\partial}{\partial t} - H^0 \right) G^0_D(x, x') = i\,\delta^4(x-x') $$

where H0 is the free-particle Dirac Hamiltonian.

Now, however, when we expand it in terms of the eigenfunctions of the free-particle Dirac

hamiltonian, we must make sure to include the complete set, namely those with positive and

negative energies. Schematically, we write this as

$$ G^0_D(x, x') = \theta(t-t') \left( \sum_{n_+} + \sum_{n_-} \right) $$

where n+ refers to those at positive energies and n− to those at negative energies. All are

needed to recreate the full δ4 on the right hand side of the equation.

While this is a fine $G^0_D$, it doesn’t satisfy our needs, as it contains negative-energy solutions

propagating forward in time.

It is at this point that Feynman suggested the needed trick. The above equation is not

unique. We can add or subtract from it any solution to the free-particle Dirac equation.


But in doing so, we must subtract it for all times. Thus, he suggested that we subtract all

negative-energy solutions at all times.

This gives us a new, but equivalent propagator,

$$ G^0_F = \theta(t-t') \sum_{n_+} \; - \; \theta(t'-t) \sum_{n_-} $$

which is called the Feynman propagator.

Let’s now assume we had a state Ψi(t′) composed only of positive-energy states. This

propagator will propagate it forward in time, since it is orthogonal to all the negative-energy

states. But what if we had a state built out of negative-energy components only? Since it

is orthogonal to all positive energy states, it will get backwards propagated in time through

the second term that goes as θ(t′ − t).

If now we are in some external potential, the exact propagation of a particle in an arbitrary

state will be given schematically by

$$ \Psi_f(t) = G^0_F(t, t')\,\Psi_i(t') + \sum_{t''} G^0_F(t, t'')\,V(t'')\,G^0_F(t'', t')\,\Psi_i(t') + \dots $$

in terms of a series of multiple scattering diagrams. This is analogous to what we wrote

down earlier for the propagation of a state in time-dependent perturbation theory, but there

we used the ordinary Schrodinger propagator. Here we use the relativistic Feynman propa-

gator, involving forward propagation of positive-energy particles and backward propagation

of negative-energy particles.

A pictorial flavor of the competing types of processes that could now occur, once we

include the possibility of such negative-energy particles moving backwards in time is given

in figure 16. Both are second-order processes, i.e. processes that involve two scattering

events.

Figure 16a represents a typical two-step process, in which a particle with positive energy

(let’s for definiteness call it an electron, although it need not be) gets scattered forward in

time twice. The two scatterings take place at the space-time points 1 and 2, respectively.

Figure 16b represents another two-step process, leading from the same initial state i to

the same final state f , i.e. both have the same x and t (or at least they would, had

I drawn them better) and both start and end with the same positive energy particle.

But now the scattering at point 1 kicks the particle backward in time and then at point

2 forward in time. As we move forward in time, we first see the electron, then at time 2

(which is before 1) we see two electrons and a positron (i.e. we have created an e+e− pair),

and then at time 1 we again have only an electron. At the end, i.e. at space-time point f,

we have exactly the same final state (an electron) as we did in process (a).

FIG. 16: Two second-order processes that can take place when one includes the possibility of negative-energy particles moving backwards in time. (Panels (a) and (b); each is a space-time diagram with axes x and t, showing a path from i to f through the scattering points 1 and 2.)

Clearly, as we go to higher and higher order in the interaction, the electron can wiggle

and jiggle any number of times, creating lots of intermediate states with any number of e+e−

pairs.

So, even though we started out with a one-particle equation, particle production creeps

in through the negative-energy solutions (or the solutions that flow backwards in time). In

the case of fermions, this can either be viewed as resulting from excitations of the infinite

Dirac sea or because the single electron is allowed to go back and forth in time.

While we haven’t yet derived an appropriate propagator for bosons, we would imagine

that there too we would get a propagator in which negative-energy particles propagate


backward in time. This will then enable the development of a theory with creation of

particle-antiparticle pairs in boson systems as well.

As I said earlier, the framework for implementing these ideas in a consistent fashion is

relativistic quantum field theory. But that is for another course and another time.


Application of the Path Integral Formalism

At this point, I will begin the last topic in the course, the application of Feynman’s

Path Integral formalism. This is a follow-up to the preliminary discussion of Feynman’s

Path Integral formalism that took place in PHYS610 and which derived from Chapter 8 of

Shankar. The more detailed discussion on which we now embark can be found in Chapter

21 of Shankar, which you should now start reading.

Let me briefly summarize what was said in our earlier discussion.

At that time, we focused on the free-particle propagator, which as a reminder is the coor-

dinate space representative of the free-particle time evolution operator. In one dimension we

can readily derive this propagator using the standard Schrodinger or Hamiltonian approach

and would obtain

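$$ \begin{aligned} U(x, t; x', t') &= \langle x|U(t, t')|x'\rangle \\ &= \langle x|U(t-t')|x'\rangle \\ &= \sqrt{\frac{m}{2\pi i\hbar(t-t')}}\; \exp\left[ -\frac{m(x-x')^2}{2i\hbar(t-t')} \right] \end{aligned} \qquad (257) $$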

We then showed that the propagator gives the amplitude for a system propagating from

one point in space and time to another point in space and later in time.
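One way to convince yourself that (257) really is the free-particle propagator is to check numerically that it satisfies the free Schrodinger equation, $i\hbar\,\partial U/\partial t = -(\hbar^2/2m)\,\partial^2 U/\partial x^2$, for t > t′. The Python sketch below does this with central finite differences. The units ($\hbar$ = m = 1), the sample point (x, t) = (0.7, 1.3) with x′ = t′ = 0, and the step size are all assumptions made purely for illustration.

```python
import cmath
import math

HBAR = 1.0   # assumed units
MASS = 1.0

def free_propagator(x, t, x0=0.0, t0=0.0):
    """Eq. (257): amplitude to go from (x0, t0) to (x, t) for a free particle."""
    dt = t - t0
    prefactor = cmath.sqrt(MASS / (2j * math.pi * HBAR * dt))
    return prefactor * cmath.exp(1j * MASS * (x - x0)**2 / (2.0 * HBAR * dt))

def schrodinger_residual(x, t, h=1e-4):
    """i*hbar dU/dt + (hbar^2/2m) d^2U/dx^2, estimated by central differences."""
    dU_dt = (free_propagator(x, t + h) - free_propagator(x, t - h)) / (2.0 * h)
    d2U_dx2 = (free_propagator(x + h, t) - 2.0 * free_propagator(x, t)
               + free_propagator(x - h, t)) / h**2
    return 1j * HBAR * dU_dt + HBAR**2 / (2.0 * MASS) * d2U_dx2

amplitude = free_propagator(0.7, 1.3)
residual = abs(schrodinger_residual(0.7, 1.3))
print("|U| =", abs(amplitude))
print("|residual| =", residual)
```

The residual should be tiny compared with |U| itself, limited only by the finite-difference step, while |U| equals the prefactor $\sqrt{m/2\pi\hbar t}$ since the exponential is a pure phase.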

We then postulated following Feynman that we could alternatively obtain the propagator

connecting two points in space-time using the Lagrangian formalism by summing over all

possible paths in space-time that connect them. Feynman furthermore gave a procedure for

implementing this:

• To obtain the propagator between two points in space time (denoted 1 and 2), we need

to determine the sum of an infinity of partial amplitudes UΓ(2, 1), each one associated

with a possible space-time path Γ from (r1, t1) to (r2, t2).

• The partial amplitude associated with the path Γ is determined in the following way:

1. We first determine the classical action SΓ along the path Γ from the classical

Lagrangian L according to the usual formula

$$ S_\Gamma = \int_\Gamma L(\mathbf{r}, \dot{\mathbf{r}}, t)\,dt $$


2. We then determine the partial amplitude UΓ associated with this path as

$$ U_\Gamma(2, 1) = N\,e^{\frac{i}{\hbar}S_\Gamma} $$

where N is a normalization constant which must be, and can be, evaluated.

When we implemented this procedure for a free particle in one dimension, we recovered

precisely the free-particle propagator obtained using the Schrodinger formalism and given

in (257).

I would now like to follow the reverse strategy. Rather than postulating that the propa-

gator can be obtained via a path integral and then proving that the postulate gives the right

results, I would instead like to show you explicitly that we can start with the hamiltonian

formalism and derive the propagator as a path integral. In doing so, we will indeed see sev-

eral key points emerging. One is that there are in fact several possible path integrals that

can be derived. In all we will have to make use of the resolution of the identity operator to

derive the path integral. We will then see that the existence of several possible path integral

formalisms is related to the fact that there are several possible resolutions of the identity

operator that we can use. Finally, once we have done this, we will discuss how one can use

the path integral formalism in its many possible manifestations to treat a key problem in

contemporary many-body quantum physics. I will not have time to treat all the applications in

Shankar’s Chapter 21, but will limit my discussion to just one. Furthermore, as emphasized

by Shankar in his presentation, we will not give a detailed development of this application,

but hopefully enough that one can then go to the literature and learn more, now that we

are such proficient Quantum Mechanicians.

Derivation of the Path Integral

So now let’s turn to the derivation of the Path Integral representation for a one-

dimensional propagator governed by a time-independent hamiltonian

$$ H = \frac{P^2}{2m} + V(X) \qquad (258) $$

As we remember, the propagator is defined as the coordinate-space representation of the

time evolution operator, or

$$ U(x, t; x', t' = 0) = \langle x|\exp\left( -\frac{i}{\hbar}Ht \right)|x'\rangle \qquad (259) $$


Note that we are considering propagation from time t′ = 0 to a subsequent time t.

Now let’s see how we can demonstrate our earlier conjecture that this propagator can be

written as a sum over all possible paths between the two space-time points (x′, 0) and (x, t).

The first point to note is that the operator entering in (259) can be expressed as a product

of N operators

$$ \exp\left( -\frac{i}{\hbar}Ht \right) = \left[ \exp\left( -\frac{i}{\hbar}H\frac{t}{N} \right) \right]^N \qquad (260) $$

for any integer N . This follows from the Baker-Hausdorff formula that tells how a product

of exponential operators can be combined,

$$ e^A\,e^B = e^{A + B + \frac{1}{2}[A,B] + \dots} \qquad (261) $$

where the ... refers to all higher commutators that enter. Note of course that [H,H] = 0,

which is why we arrive at a simple product of the N operators.

So, let’s now write

$$ \epsilon = \frac{t}{N} \qquad (262) $$

and look at this in the limit that N → ∞.

We again use the Baker-Hausdorff formula to write

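$$ \exp\left[ -\frac{i\epsilon}{\hbar}\left( \frac{P^2}{2m} + V(X) \right) \right] \approx \exp\left( -\frac{i\epsilon}{2m\hbar}P^2 \right) \exp\left( -\frac{i\epsilon}{\hbar}V(X) \right) \qquad (263) $$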

The reason this follows approximately is that all commutators that enter involve higher

powers of ϵ which will go to zero as ϵ → 0.
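The statement that the splitting error disappears as N → ∞ can be illustrated on a toy problem. The Python sketch below is not the P²/2m + V(X) problem of the notes: as a stand-in, and purely as an assumption for illustration, it uses two non-commuting 2x2 matrices, A = σx and B = σz, with t = 1 and $\hbar$ = 1, for which all the exponentials are known in closed form. It compares the N-fold product of split factors with the exact exponential of −it(A + B).

```python
import math

def matmul(A, B):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def exp_pauli(n, angle):
    """exp(-i*angle*(n.sigma)) = cos(angle) - i sin(angle) n.sigma, n a unit vector."""
    nx, ny, nz = n
    c, s = math.cos(angle), math.sin(angle)
    return [[c - 1j * s * nz, -1j * s * (nx - 1j * ny)],
            [-1j * s * (nx + 1j * ny), c + 1j * s * nz]]

def split_product(t, N):
    """[exp(-i eps sigma_x) exp(-i eps sigma_z)]^N with eps = t/N."""
    eps = t / N
    step = matmul(exp_pauli((1.0, 0.0, 0.0), eps), exp_pauli((0.0, 0.0, 1.0), eps))
    result = [[1.0, 0.0], [0.0, 1.0]]
    for _ in range(N):
        result = matmul(result, step)
    return result

def exact(t):
    """exp(-i t (sigma_x + sigma_z)); note sigma_x + sigma_z = sqrt(2) n.sigma."""
    r = math.sqrt(2.0)
    return exp_pauli((1.0 / r, 0.0, 1.0 / r), r * t)

def max_error(t, N):
    S, E = split_product(t, N), exact(t)
    return max(abs(S[i][j] - E[i][j]) for i in range(2) for j in range(2))

errors = {N: max_error(1.0, N) for N in (10, 100, 1000)}
for N, err in errors.items():
    print(f"N = {N:5d}   error = {err:.3e}")
```

In this toy case the error falls roughly in proportion to 1/N, i.e. linearly in ε, which is the behavior expected when the leading neglected term is the commutator of order ε² in each of the N factors.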

Thus, what we will have to compute to get the propagator is the matrix element

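$$ \langle x|\exp\left( -\frac{i\epsilon}{2m\hbar}P^2 \right) \exp\left( -\frac{i\epsilon}{\hbar}V(X) \right) \exp\left( -\frac{i\epsilon}{2m\hbar}P^2 \right) \exp\left( -\frac{i\epsilon}{\hbar}V(X) \right) \cdots |x'\rangle \qquad (264) $$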

with the operator product $\exp\left( -\frac{i\epsilon}{2m\hbar}P^2 \right)\exp\left( -\frac{i\epsilon}{\hbar}V(X) \right)$ entering N times.

Now we insert the resolution of the identity operator between every pair of operators that

enters. In our current development, we will use the resolution of the identity operator in

coordinate representation, namely

$$ I = \int_{-\infty}^{+\infty} dx\; |x\rangle\langle x| \qquad (265) $$


To see how this plays out, we will focus on the case of N = 3, and then subsequently

generalize. Following Shankar, I will rename x and x′ by x3 and x0, respectively, whereby

the matrix element becomes

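$$ \begin{aligned} U(x_3, x_0, t) = \int dx_1\,dx_2\; &\langle x_3|\exp\left( -\frac{i\epsilon}{2m\hbar}P^2 \right) \exp\left( -\frac{i\epsilon}{\hbar}V(X) \right)|x_2\rangle \\ \times\; &\langle x_2|\exp\left( -\frac{i\epsilon}{2m\hbar}P^2 \right) \exp\left( -\frac{i\epsilon}{\hbar}V(X) \right)|x_1\rangle \\ \times\; &\langle x_1|\exp\left( -\frac{i\epsilon}{2m\hbar}P^2 \right) \exp\left( -\frac{i\epsilon}{\hbar}V(X) \right)|x_0\rangle \end{aligned} \qquad (266) $$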

Now let’s look at the generic matrix element

$$ \langle x_n|\exp\left( -\frac{i\epsilon}{2m\hbar}P^2 \right) \exp\left( -\frac{i\epsilon}{\hbar}V(X) \right)|x_{n-1}\rangle \qquad (267) $$

When the operator V (X) acts to the right on | xn−1 >, the operator X gets replaced by

its eigenvalue xn−1. Thus,

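$$ \langle x_n|\exp\left( -\frac{i\epsilon}{2m\hbar}P^2 \right) \exp\left( -\frac{i\epsilon}{\hbar}V(X) \right)|x_{n-1}\rangle = \langle x_n|\exp\left( -\frac{i\epsilon}{2m\hbar}P^2 \right)|x_{n-1}\rangle\, \exp\left( -\frac{i\epsilon}{\hbar}V(x_{n-1}) \right) \qquad (268) $$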

Now what about the remaining matrix element $\langle x_n|\exp\left( -\frac{i\epsilon}{2m\hbar}P^2 \right)|x_{n-1}\rangle$? This

is nothing more than the free-particle propagator for propagating from xn−1 to xn over a

time period ϵ. We worked this out in PHYS610 and the result can be found on page 153 of

Shankar and in eq. (257) of these notes. It is simply

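$$ \langle x_n|\exp\left( -\frac{i\epsilon}{2m\hbar}P^2 \right)|x_{n-1}\rangle = \left( \frac{m}{2\pi\hbar i\epsilon} \right)^{1/2} e^{im(x_n - x_{n-1})^2/2\hbar\epsilon} \qquad (269) $$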

Putting this all together, we find that

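$$ \langle x_n|\exp\left( -\frac{i\epsilon}{2m\hbar}P^2 \right) \exp\left( -\frac{i\epsilon}{\hbar}V(X) \right)|x_{n-1}\rangle = \left( \frac{m}{2\pi\hbar i\epsilon} \right)^{1/2} e^{im(x_n - x_{n-1})^2/2\hbar\epsilon}\; e^{-\frac{i\epsilon}{\hbar}V(x_{n-1})} \qquad (270) $$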

Now when we combine the three matrix elements that enter (266) we obtain

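$$ \begin{aligned} U(x_3, x_0, t) = \int dx_1\,dx_2\; &\left( \frac{m}{2\pi\hbar i\epsilon} \right)^{1/2} e^{im(x_3 - x_2)^2/2\hbar\epsilon}\; e^{-\frac{i\epsilon}{\hbar}V(x_2)} \\ \times\; &\left( \frac{m}{2\pi\hbar i\epsilon} \right)^{1/2} e^{im(x_2 - x_1)^2/2\hbar\epsilon}\; e^{-\frac{i\epsilon}{\hbar}V(x_1)} \\ \times\; &\left( \frac{m}{2\pi\hbar i\epsilon} \right)^{1/2} e^{im(x_1 - x_0)^2/2\hbar\epsilon}\; e^{-\frac{i\epsilon}{\hbar}V(x_0)} \\ = \left( \frac{m}{2\pi\hbar i\epsilon} \right)^{1/2} &\left[ \int \prod_{n=1}^{2} \left( \frac{m}{2\pi\hbar i\epsilon} \right)^{1/2} dx_n \right] \exp\left[ \sum_{n=1}^{3} \left( \frac{im(x_n - x_{n-1})^2}{2\hbar\epsilon} - \frac{i\epsilon}{\hbar}V(x_{n-1}) \right) \right] \end{aligned} \qquad (271) $$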

The generalization to arbitrary N is straightforward. All we need do is replace

$$ \prod_{n=1}^{2} \rightarrow \prod_{n=1}^{N-1} \qquad \text{and} \qquad \sum_{n=1}^{3} \rightarrow \sum_{n=1}^{N} $$

whence

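$$ U(x_N, x_0, t) = \left( \frac{m}{2\pi\hbar i\epsilon} \right)^{1/2} \left[ \int \prod_{n=1}^{N-1} \left( \frac{m}{2\pi\hbar i\epsilon} \right)^{1/2} dx_n \right] \exp\left[ \sum_{n=1}^{N} \left( \frac{im(x_n - x_{n-1})^2}{2\hbar\epsilon} - \frac{i\epsilon}{\hbar}V(x_{n-1}) \right) \right] \qquad (272) $$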

Now consider the exponential

$$ \exp\left[ \sum_{n=1}^{N} \left( \frac{im(x_n - x_{n-1})^2}{2\hbar\epsilon} - \frac{i\epsilon}{\hbar}V(x_{n-1}) \right) \right] $$

that appears in (272). It can be straightforwardly rewritten as

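$$ \exp\left[ \sum_{n=1}^{N} \left( \frac{im(x_n - x_{n-1})^2}{2\hbar\epsilon} - \frac{i\epsilon}{\hbar}V(x_{n-1}) \right) \right] = \exp\left[ \frac{i}{\hbar} \sum_{n=1}^{N} \left( \frac{m(x_n - x_{n-1})^2}{2\epsilon} - \epsilon\,V(x_{n-1}) \right) \right] \qquad (273) $$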

We recognize this as precisely the discretized version of Feynman’s $e^{iS/\hbar}$, as discussed

in Chapter 8. This can be seen from equation 8.4.3 in Shankar, generalized to include a

potential.

We can thus if we wish give a continuum version of this result, namely that

$$ U(x, x', t) = \int [Dx]\, \exp\left[ \frac{i}{\hbar} \int_0^t L(x, \dot{x})\,dt \right] \qquad (274) $$

where by definition

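$$ \int [Dx] = \lim_{N\to\infty} \left( \frac{m}{2\pi\hbar i\epsilon} \right)^{1/2} \left[ \int \prod_{n=1}^{N-1} \left( \frac{m}{2\pi\hbar i\epsilon} \right)^{1/2} dx_n \right] \qquad (275) $$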

and L is the Lagrangian and is a function of $x$ and $\dot{x}$.


We refer to this path integral description of the propagator as the Configuration Space

Path Integral, as it derives by inserting the resolution of the identity operator in coordinate

or configuration space.

Now let’s return to the propagator expressed earlier as a matrix element of N pairs of

operators

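$$ \langle x|\exp\left( -\frac{i\epsilon}{2m\hbar}P^2 \right) \exp\left( -\frac{i\epsilon}{\hbar}V(X) \right) \exp\left( -\frac{i\epsilon}{2m\hbar}P^2 \right) \exp\left( -\frac{i\epsilon}{\hbar}V(X) \right) \cdots |x'\rangle $$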

and evaluate it in a different way, namely by making use of a different resolution of the

identity operator. More specifically, let’s now introduce both

$$ I = \int dx\; |x\rangle\langle x| $$

and

$$ I = \int \frac{dp}{2\pi\hbar}\; |p\rangle\langle p| $$

When we consider this for N = 3, we find that we need to introduce three momentum-

space resolutions of I and two coordinate-space resolutions, viz:

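$$ \begin{aligned} &\langle x_3|\, e^{-\frac{i\epsilon}{2m\hbar}P^2}\, e^{-\frac{i\epsilon}{\hbar}V(X)}\, e^{-\frac{i\epsilon}{2m\hbar}P^2}\, e^{-\frac{i\epsilon}{\hbar}V(X)}\, e^{-\frac{i\epsilon}{2m\hbar}P^2}\, e^{-\frac{i\epsilon}{\hbar}V(X)} \,|x_0\rangle \\ &\quad = \frac{1}{(2\pi\hbar)^3} \int dp_3\,dp_2\,dp_1\,dx_2\,dx_1\; \langle x_3|\, e^{-\frac{i\epsilon}{2m\hbar}P^2} |p_3\rangle \langle p_3|\, e^{-\frac{i\epsilon}{\hbar}V(X)} |x_2\rangle \\ &\qquad \times \langle x_2|\, e^{-\frac{i\epsilon}{2m\hbar}P^2} |p_2\rangle \langle p_2|\, e^{-\frac{i\epsilon}{\hbar}V(X)} |x_1\rangle \\ &\qquad \times \langle x_1|\, e^{-\frac{i\epsilon}{2m\hbar}P^2} |p_1\rangle \langle p_1|\, e^{-\frac{i\epsilon}{\hbar}V(X)} |x_0\rangle \end{aligned} \qquad (276) $$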

As earlier, we note that

$$ V(X)|x\rangle = V(x)|x\rangle $$

Also,

$$ P^2|p\rangle = p^2|p\rangle $$

Furthermore, we note that

$$ \langle x|p\rangle = e^{ipx/\hbar} $$

consistent with our introduction of the factor $\frac{1}{2\pi\hbar}$ in the momentum-space resolution of the

identity operator.


Putting this all together and combining terms, we find that

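$$ \begin{aligned} U(x_3, x_0, t) = \frac{1}{(2\pi\hbar)^3} \int dp_3\,dp_2\,dp_1\,dx_2\,dx_1\; &e^{-\frac{i\epsilon p_3^2}{2m\hbar}}\, e^{ip_3x_3/\hbar}\, e^{-\frac{i\epsilon}{\hbar}V(x_2)}\, e^{-ip_3x_2/\hbar} \\ \times\; &e^{-\frac{i\epsilon p_2^2}{2m\hbar}}\, e^{ip_2x_2/\hbar}\, e^{-\frac{i\epsilon}{\hbar}V(x_1)}\, e^{-ip_2x_1/\hbar} \\ \times\; &e^{-\frac{i\epsilon p_1^2}{2m\hbar}}\, e^{ip_1x_1/\hbar}\, e^{-\frac{i\epsilon}{\hbar}V(x_0)}\, e^{-ip_1x_0/\hbar} \end{aligned} \qquad (277) $$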

At this point we can combine terms. In particular we can collect those terms that involve

$p_n^2$ in the exponent, terms that involve $p_n x_m$ in the exponent and of course terms involving

V (xn). When we do this we find

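$$ U(x_3, x_0, t) = \frac{1}{(2\pi\hbar)^3} \int dp_1\,dp_2\,dp_3\,dx_1\,dx_2\; \exp\left[ \sum_{n=1}^{3} \left( -\frac{i\epsilon}{2m\hbar}p_n^2 + \frac{i}{\hbar}p_n(x_n - x_{n-1}) - \frac{i\epsilon}{\hbar}V(x_{n-1}) \right) \right] \qquad (278) $$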

At this point we can generalize to an arbitrary number of time steps N , whereby

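$$ U(x_N, x_0, t) = \int \prod_{n=1}^{N} \frac{dp_n}{2\pi\hbar} \prod_{n=1}^{N-1} dx_n\; \exp\left[ \sum_{n=1}^{N} \left( -\frac{i\epsilon}{2m\hbar}p_n^2 + \frac{i}{\hbar}p_n(x_n - x_{n-1}) - \frac{i\epsilon}{\hbar}V(x_{n-1}) \right) \right] \qquad (279) $$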

Here too we can write it in its continuum form by introducing the classical hamiltonian

$$ H = \frac{p^2}{2m} + V(x) \qquad (280) $$

and also a notation for the integration variables over all the momenta and coordinates

$$ \int [Dp\,Dx] = \lim_{N\to\infty} \int \prod_{n=1}^{N} \frac{dp_n}{2\pi\hbar} \prod_{n=1}^{N-1} dx_n \qquad (281) $$

Then

$$ U(x, x', t) = \int [Dp\,Dx]\, \exp\left[ \frac{i}{\hbar} \int_0^t \left( p\dot{x} - H(x, p) \right) dt \right] \qquad (282) $$

This is referred to as the Phase-Space Path Integral for the propagator, as we integrate

over both the momenta and the associated coordinates.

Knowing the momentum dependence in the hamiltonian and furthermore since it is a

(simple) quadratic dependence, we can in fact carry out all the momentum integrals. When

we do this in the discretized form (279), we find that

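$$ \prod_{n=1}^{N} \int_{-\infty}^{\infty} \frac{dp_n}{2\pi\hbar}\; \exp\left[ \sum_{n=1}^{N} \left( -\frac{i\epsilon}{2m\hbar}p_n^2 + \frac{i}{\hbar}p_n(x_n - x_{n-1}) \right) \right] = \prod_{n=1}^{N} \left( \frac{m}{2\pi i\hbar\epsilon} \right)^{1/2} \exp\left[ \frac{im(x_n - x_{n-1})^2}{2\hbar\epsilon} \right] \qquad (283) $$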


If we now plug this into (279), we not surprisingly recover the Configuration Space Path

Integral given in (272). But note that this depended on having a hamiltonian in which the

dependence on momentum was purely a quadratic. If this is not the case, we cannot carry

out the momentum space integrals. But we can still use the Phase Space Path Integral.

I would now like to close my lectures by discussing a problem of contemporary importance

in physics in which we make use of the Path Integral formalism just developed. Further

examples, as noted earlier, can be found in Chapter 21 of Shankar.

The Berry Phase

The topic that I will discuss concerns what is called the Berry Phase.

This concerns what happens when we make a very slow or adiabatic change on a quantum

system. We addressed this last semester in our discussion of time-dependent Perturbation

Theory where we showed that when we apply an adiabatic perturbation to a quantum

system, the system evolves by remaining in a given state of the system, but with the state

itself changing adiabatically. Put another way, if we start off in the ground state of the

system and change the system sufficiently slowly, the system will remain in the ground state

of the hamiltonian at every instant.

Put a bit more formally, let’s assume that the hamiltonian of the system is given by

H(R(t)) where R is some external coordinate that enters the hamiltonian parametrically

and which changes slowly with time. What we stated qualitatively above is that if we start

off in the nth eigenstate of H(R(0)) at time t = 0 we will be in the nth eigenstate of H(R(t))

at the later time t.

A natural way to write the time-dependent wave function of the system |Ψ(t) > in this

approximation is

$$ |\Psi(t)\rangle = \exp\left( -\frac{i}{\hbar} \int_0^t E_n(t')\,dt' \right) |n(t)\rangle \qquad (284) $$

where

$$ H(t)\,|n(t)\rangle = E_n(t)\,|n(t)\rangle \qquad (285) $$

is the instantaneous time-independent Schrodinger equation at time t.


Of course, if H were not a function of time, En would not be a function of time and this

would just be the familiar

$$ |\Psi_n(t)\rangle = \exp\left( -iE_nt/\hbar \right) |n\rangle \qquad (286) $$

Equation (284) recognizes that in the presence of a slowly varying time-dependent hamil-

tonian, the phase that gets built up over time should depend on the instantaneous and

time-dependent energy.

But as we will now see, the ansatz (284) misses some important physics, namely the

physics of what is called the Berry phase. To see just what is missing, let’s try to parametrize

what may be wrong through the introduction of a slightly modified ansatz

$$ |\Psi(t)\rangle = c(t)\, \exp\left( -\frac{i}{\hbar} \int_0^t E_n(t')\,dt' \right) |n(t)\rangle \qquad (287) $$

If the ansatz (284) were right, we would just find that c(t) = 1. If c(t) ≠ 1, then something

is obviously missing.

So let’s try to determine c(t) by plugging (287) into the time-dependent Schrodinger

equation

$$ \left( i\hbar\frac{\partial}{\partial t} - H(t) \right) |\Psi(t)\rangle = 0 \qquad (288) $$

The derivative gives three contributions, since each of the three factors depends on t.

The derivative of the phase factor gives rise to a term

$$ c(t)\,E_n(t)\,|n(t)\rangle $$

which simply cancels the term obtained by acting with H(t) on |Ψ(t) >, since

$$ H(t)\,|n(t)\rangle = E_n(t)\,|n(t)\rangle $$

What is left behind are the other two derivative terms

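$$ \dot{c}(t)\, \exp\left( -\frac{i}{\hbar} \int_0^t E_n(t')\,dt' \right) |n(t)\rangle + c(t)\, \exp\left( -\frac{i}{\hbar} \int_0^t E_n(t')\,dt' \right) \left| \frac{d}{dt}n(t) \right\rangle = 0 $$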

We now take the overlap of this expression with < n(t)| and get

$$ \dot{c}(t) = -c(t)\, \langle n(t)|\frac{d}{dt}|n(t)\rangle \qquad (289) $$

which has as its solution

$$ c(t) = c(0)\, \exp\left[ -\int_0^t \langle n(t')|\frac{d}{dt'}|n(t')\rangle\, dt' \right] \qquad (290) $$


Defining

$$ \gamma = i \int_0^t \langle n(t')|\frac{d}{dt'}|n(t')\rangle\, dt' \qquad (291) $$

we find that

$$ c(t) = c(0)\,e^{i\gamma} \qquad (292) $$

The additional phase γ is called the Berry phase. It is not so interesting that we got an

extra phase from our analysis, since as we have often seen, phases usually don’t matter. But

in fact this phase can indeed have observable consequences and thus does matter.
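To make (291) concrete, here is a small numerical sketch in Python for a standard textbook example that is not treated in these notes: a spin-1/2 state that follows a direction (θ, φ) on the unit sphere while φ is taken slowly once around from 0 to 2π at fixed θ. The choice of system, the particular phase convention for the state, and the function names are all assumptions made for illustration. The integral in (291) is discretized using $\langle n(\phi)|n(\phi + d\phi)\rangle \approx 1 + \langle n|dn\rangle$, so that each small step contributes minus the phase of the overlap to γ.

```python
import cmath
import math

def spin_up_along(theta, phi):
    """Spin-1/2 state with spin up along (theta, phi); one common phase convention."""
    return [complex(math.cos(theta / 2.0)),
            cmath.exp(1j * phi) * math.sin(theta / 2.0)]

def overlap(a, b):
    """Inner product <a|b> of two 2-component states."""
    return sum(x.conjugate() * y for x, y in zip(a, b))

def berry_phase(theta, steps=2000):
    """Discretized eq. (291) for one full circuit of phi at fixed theta."""
    gamma = 0.0
    for k in range(steps):
        phi0 = 2.0 * math.pi * k / steps
        phi1 = 2.0 * math.pi * (k + 1) / steps
        ov = overlap(spin_up_along(theta, phi0), spin_up_along(theta, phi1))
        # <n|dn> is purely imaginary, <n|dn> ~ i*arg(ov), so i<n|dn> ~ -arg(ov)
        gamma -= cmath.phase(ov)
    return gamma

theta = math.pi / 3.0
gamma = berry_phase(theta)
expected = -math.pi * (1.0 - math.cos(theta))   # closed form for this convention
print("numerical gamma:", gamma)
print("closed form    :", expected)
```

For this convention the closed-form result is γ = −π(1 − cos θ), i.e. minus half the solid angle swept out by the direction, and the discretized sum reproduces it. The point of the exercise is that γ does not depend on how fast the circuit is traversed, only on the path taken in parameter space.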

So let’s now assume that we have a non-zero Berry phase and see what it can do. The

problem we will consider is that of an electron orbiting around a nucleus, which itself can

move. We will let R = R(t) denote the coordinate of the nucleus and r denote that of the

electron orbiting it.

Now let’s look at the effects of the slow motion of the nucleus, slow compared to that of

the electron. Thus, as the nucleus moves the electron adapts to the motion of the nucleus,

staying in the same instantaneous eigenstate |n(t) >.

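We first express the Berry phase in a slightly different form. Since |n(t)> depends on time only through R(t), the chain rule lets us rewrite the exponential containing the Berry phase as

$$ \begin{aligned} \exp\!\left(-\int_0^t \langle n(t')|\frac{d}{dt'}|n(t')\rangle\,dt'\right) &= \exp\!\left(\frac{i}{\hbar}\int_0^t i\hbar\,\langle n(t')|\frac{d}{dt'}|n(t')\rangle\,dt'\right) \\ &= \exp\!\left(\frac{i}{\hbar}\int_0^t i\hbar\,\langle n(R)|\frac{d}{dR}|n(R)\rangle\,\frac{dR}{dt'}\,dt'\right) \\ &= \exp\!\left(\frac{i}{\hbar}\int_0^t A_n(R)\,\frac{dR}{dt'}\,dt'\right) \end{aligned} \tag{293} $$

where

$$ A_n(R) = i\hbar\,\langle n(R)|\frac{d}{dR}|n(R)\rangle \tag{294} $$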

We refer to A_n(R) as the Berry potential. It is obviously a vector potential as it couples to the velocity dR/dt of the nucleus. Note that the Berry potential depends on the state n that the electron is in.

Now let’s construct the path integral corresponding to the nuclear degrees of freedom.

The resolution of the identity that we will use is

$$ I = \int dR \sum_n |R, n(R)\rangle\langle n(R), R| \tag{295} $$


where

$$ |R, n(R)\rangle = |R\rangle \otimes |n(R)\rangle \tag{296} $$

At each value of R we pick our basis for the resolution of the identity as the one that diagonalizes the instantaneous electronic hamiltonian H_e(R, r, p), namely the states satisfying

$$ H_e(R, r, p)\,|R, n(R)\rangle = E_n(R)\,|R, n(R)\rangle \tag{297} $$

At this point, we will impose the adiabatic approximation, whereby an electron that starts off in an instantaneous eigenstate |n> will remain in that instantaneous eigenstate forever. When we impose this adiabatic condition, we are able to approximate the identity operator in terms of a single term,

$$ I \approx \int dR\, |R, n(R)\rangle\langle n(R), R| \tag{298} $$

and thus drop the sum over n.

Now let’s consider the configuration space path integral in the nuclear variable R. A

typical factor for a given time slice ϵ will look like

$$ \langle n(R(t+\epsilon)), R(t+\epsilon)|\,\exp\!\left[-\frac{i\epsilon}{\hbar}H_N(R,P)\right]\exp\!\left[-\frac{i\epsilon}{\hbar}H_e(R,r,p)\right]|n(R(t)), R(t)\rangle \tag{299} $$

Note that I use capital letters to refer to the nuclear variables and small letters to refer to

the electronic variables. And note that we have both the nuclear hamiltonian HN and the

electronic hamiltonian He contributing to the propagator. Lastly, note that the electronic

hamiltonian depends on the nuclear variable, through its parametric dependence.

Let’s first look at the matrix element of the nuclear part of the hamiltonian, taken between

the nuclear part of the eigenstates,

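$$ \langle R(t+\epsilon)|\,\exp\!\left[-\frac{i\epsilon}{\hbar}H_N(R,P)\right]|R(t)\rangle = \sqrt{\frac{m}{2\pi\hbar i\epsilon}}\,\exp\!\left[\frac{i\epsilon}{\hbar}\left(\frac{m}{2\epsilon^2}\big(R(t+\epsilon)-R(t)\big)^2 - V(R)\right)\right] \tag{300} $$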

as we remember from our earlier analysis of the configuration space path integral.

Next let’s look at the matrix element of the electronic part of the hamiltonian taken

between the electronic eigenstates (including their parametric dependence on R),

$$ \langle n(R(t+\epsilon))|\,\exp\!\left[-\frac{i\epsilon}{\hbar}H_e(R,r,p)\right]|n(R(t))\rangle \tag{301} $$


When the electronic hamiltonian acts to the right it gives us a factor of

$$ \exp\!\left[-\frac{i\epsilon}{\hbar}E_n(R)\right] \tag{302} $$

But then we are still left with the overlap between the initial electronic state | n(R(t)) >

and the final electronic state | n(R(t+ ϵ)) >, which we still need to evaluate. Indeed, as we

will soon see, all of the interesting physics will arise when we consider this overlap.

To consider this overlap, we will first rewrite it as

$$ \langle n(R(t+\epsilon))|n(R(t))\rangle = \langle n(R')|n(R)\rangle \tag{303} $$

and then carry out a Taylor series expansion in the difference between R and R′, which we will denote as η. We will in fact go through order η², since this (as we’ll soon see) will lead to results good to order ϵ in the time slice.

So let’s now go back to our earlier discussion in Chapter 8 on how to derive the Schrodinger

equation from the Feynman Path Integral for a single time slice.

As a reminder, we saw there that we could obtain the state of the system Ψ(x, ϵ) from

the state of the system at an earlier time 0 using

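$$ \Psi(x,\epsilon) = \int_{-\infty}^{\infty} U(x,\epsilon;x',0)\,\Psi(x',0)\,dx' $$

where

$$ U(x,\epsilon;x',0) = \sqrt{\frac{m}{2\pi i\hbar\epsilon}}\,\exp\left\{\frac{i}{\hbar}\left[\frac{m(x-x')^2}{2\epsilon} - \epsilon V\!\left(\frac{x+x'}{2}\right)\right]\right\} $$

is the propagator associated with a particle moving in one dimension subject to a potential V.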

We then showed how we could recover the time-dependent Schrodinger equation for an infinitesimal time step by expanding the integrand appropriately. There too we introduced the variable η = x′ − x and carried out a series expansion in η to the order needed to get the wave function correct to first order in ϵ.

We will now repeat that discussion, but for the problem at hand, focusing on the nuclear degree of freedom R. For simplicity, as it does not affect what emerges, we will ignore the potential. But for reasons just discussed we will need to include the overlap function <n(R′)|n(R′ + η)>. The relevant expression we need to treat is

$$ \Psi(R',\epsilon) = \left(\frac{m}{2\pi\hbar i\epsilon}\right)^{1/2}\int_{-\infty}^{\infty} e^{im\eta^2/2\hbar\epsilon}\,\langle n(R')|n(R'+\eta)\rangle\,\Psi(R'+\eta,0)\,d\eta \tag{304} $$


We would now like to see what effect that overlap function has on the resulting infinitesimal

Schrodinger equation that emerges.

As in the discussion of Chapter 8, there is only a small region of η that can contribute,

defined by

$$ |\eta| \approx \left(\frac{2\pi\hbar\epsilon}{m}\right)^{1/2} \tag{305} $$

For η values outside this region the phase in the integral varies very rapidly and the contributions thus cancel. Since (305) shows that η² is of order ϵ, we indeed confirm my earlier remark that we must go to order η² to get results good to order ϵ (exactly as in the discussion of Chapter 8).

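So let’s now expand both the wave function Ψ(R′ + η, 0) and the overlap <n(R′)|n(R′ + η)> to this order. We find

$$ \begin{aligned} \Psi(R'+\eta,0) &= \Psi(R',0) + \eta\,\frac{\partial\Psi}{\partial R} + \frac{\eta^2}{2}\,\frac{\partial^2\Psi}{\partial R^2} \\ \langle n(R')|n(R'+\eta)\rangle &= 1 + \eta\,\langle n|\partial n\rangle + \frac{\eta^2}{2}\,\langle n|\partial^2 n\rangle \end{aligned} \tag{306} $$

where |∂n> and |∂²n> denote the first and second derivatives of |n(R)> with respect to R, and all derivatives are evaluated at R′.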

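What we now do is to plug (306) into (304), through order η², and do the appropriate Gaussian integrals. What we end up with when all is said and done is

$$ i\hbar\big(\Psi(R,\epsilon) - \Psi(R,0)\big) = \epsilon\left[-\frac{\hbar^2}{2m}\frac{\partial^2\Psi}{\partial R^2} - \frac{\hbar^2}{m}\langle n|\partial n\rangle\frac{\partial\Psi}{\partial R} - \frac{\hbar^2}{2m}\langle n|\partial^2 n\rangle\Psi\right] \tag{307} $$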
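The Gaussian integrals behind this result can be sketched as follows. The overline notation for the normalized Gaussian average is introduced here only for illustration, and the oscillatory integrals are assumed to be defined in the usual way, by giving ϵ a small negative imaginary part. The label R′ is renamed R at the end.

```latex
\[
\overline{f} \equiv \left(\frac{m}{2\pi\hbar i\epsilon}\right)^{1/2}
\int_{-\infty}^{\infty} e^{im\eta^2/2\hbar\epsilon}\, f(\eta)\, d\eta ,
\qquad
\overline{1} = 1, \quad \overline{\eta} = 0, \quad
\overline{\eta^2} = \frac{i\hbar\epsilon}{m} .
\]
% Product of the two expansions, kept through order eta^2:
\[
\langle n(R')|n(R'+\eta)\rangle\,\Psi(R'+\eta,0)
= \Psi + \eta\left[\frac{\partial\Psi}{\partial R} + \langle n|\partial n\rangle\Psi\right]
+ \eta^2\left[\frac12\frac{\partial^2\Psi}{\partial R^2}
 + \langle n|\partial n\rangle\frac{\partial\Psi}{\partial R}
 + \frac12\langle n|\partial^2 n\rangle\Psi\right] .
\]
% Averaging term by term (the linear term drops out):
\[
\Psi(R,\epsilon) - \Psi(R,0) = \frac{i\hbar\epsilon}{m}
\left[\frac12\frac{\partial^2\Psi}{\partial R^2}
 + \langle n|\partial n\rangle\frac{\partial\Psi}{\partial R}
 + \frac12\langle n|\partial^2 n\rangle\Psi\right] .
\]
```

Multiplying both sides by iħ turns the prefactor into −ħ²ϵ/m, which reproduces the three terms on the right-hand side of (307).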

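With a little work, we can cast this into the form of an infinitesimal Schrodinger equation and then read off the hamiltonian from it. The result is

$$ H = \frac{1}{2m}(P - A_n)^2 + \Phi_n \tag{308} $$

$$ A_n = i\hbar\,\langle n|\partial n\rangle, \qquad \Phi_n = \frac{\hbar^2}{2m}\Big[\langle\partial n|\partial n\rangle - \langle\partial n|n\rangle\langle n|\partial n\rangle\Big] \tag{309} $$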
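The "little work" can be sketched by acting with the hamiltonian (308) on Ψ and comparing with (307). This assumes the usual coordinate representation P = −iħ ∂/∂R for the nuclear momentum, which the notes do not write out at this point.

```latex
\begin{align*}
(P - A_n)^2\Psi &= -\hbar^2\frac{\partial^2\Psi}{\partial R^2}
  + 2i\hbar A_n\frac{\partial\Psi}{\partial R}
  + i\hbar\frac{\partial A_n}{\partial R}\Psi + A_n^2\Psi , \\
2i\hbar A_n &= -2\hbar^2\langle n|\partial n\rangle , \\
i\hbar\frac{\partial A_n}{\partial R} &=
  -\hbar^2\Big[\langle\partial n|\partial n\rangle + \langle n|\partial^2 n\rangle\Big] , \\
A_n^2 &= -\hbar^2\langle n|\partial n\rangle^2 , \\
\Phi_n &= \frac{\hbar^2}{2m}\Big[\langle\partial n|\partial n\rangle + \langle n|\partial n\rangle^2\Big]
  \quad\text{since } \langle\partial n|n\rangle = -\langle n|\partial n\rangle . \\[4pt]
\Rightarrow\quad H\Psi &= -\frac{\hbar^2}{2m}\frac{\partial^2\Psi}{\partial R^2}
  - \frac{\hbar^2}{m}\langle n|\partial n\rangle\frac{\partial\Psi}{\partial R}
  - \frac{\hbar^2}{2m}\langle n|\partial^2 n\rangle\Psi .
\end{align*}
```

The terms in <∂n|∂n> and <n|∂n>² cancel between the kinetic piece and Φ_n, leaving exactly the bracket in (307). This is why the scalar potential has to accompany the vector potential.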

What we find is that indeed the hamiltonian includes the coupling to the Berry vector

potential. But, furthermore, it includes another term, Φn which is a scalar potential.

And there is no way to get rid of these potentials that arise when we consider the coupling

of the fast and slow degrees of freedom in the problem. And we have seen how they arise of

necessity from the use of the path integral formalism.


Now I’d like to turn to an example which shows that the Berry phase and the associated

Berry potentials do indeed lead to observable consequences, especially when one considers

periodic trajectories arising from periodic hamiltonians. In particular, I will briefly show

why it is for periodic hamiltonians that such consequences may arise.

The problem we will discuss involves a particle of mass M moving slowly in a circular

path of radius a. Already we see the idea of periodic trajectories entering.

Furthermore we will assume that there is a magnetic field pointing perpendicular to the plane of the circular path (i.e. in the z direction) with field strength B1. Also, there is a second magnetic field, produced by passing a current through a wire along the z-axis, with strength B2 at the radius of the path; this field is azimuthal, i.e. tangent to the circle. The total magnetic field thus has a strength

$$ B = \sqrt{B_1^2 + B_2^2} $$

and is at an angle

$$ \theta = \arctan\frac{B_2}{B_1} $$

with respect to the z-axis.

At this point, let’s assume that the particle has no spin. Then there is no coupling of the particle to the magnetic field, and the hamiltonian describing its motion can be written simply as

$$ H = \frac{L^2}{2I} \tag{310} $$

where

$$ I = Ma^2 \tag{311} $$

is the moment of inertia and

$$ L = -i\hbar\frac{\partial}{\partial\phi} \tag{312} $$

is the angular momentum operator, with ϕ being the azimuthal angle that defines the periodic motion of the particle around the circle.

It is easy to solve this problem. The eigenvalues are

$$ E_m = \frac{\hbar^2}{2I}m^2 \tag{313} $$

where m are the quantized values of the z component of the angular momentum, namely m = 0, ±1, ±2, ....


Now let’s assume that the particle moving around the circle has spin-1/2 so that it does

indeed couple to the magnetic field. The field that the particle feels will of course depend

on the angle ϕ. The total hamiltonian for the particle is now

$$ H = \frac{L^2}{2I} - C\,\boldsymbol{\sigma}\cdot\mathbf{B}(\phi) \tag{314} $$

where C is a constant that measures the energy splitting between the two spin states.

We will now assume that the energy splitting between the two spin states is very large compared to the energy splitting between the rotational states, which goes as ħ²/2I. Put another way, the process of flipping spin in the magnetic field is very fast compared to the very slow motion of the particle as it moves around the circular path. As such, the particle will not jump between the two spin states as it moves around the circle, but will adjust adiabatically.

Now what are the energies of the resulting states? An initial guess is that they would be simply

$$ E_m = \frac{\hbar^2 m^2}{2I} \mp CB \tag{315} $$

namely for each value of m there would be a splitting associated with the two Larmor states. And that would be wrong, because it fails to take into account the effects of the Berry potentials that result from the periodic motion of the particle.

To see this, let’s focus on the lower of the two solutions, in which the spin points along the total field the particle sees. The relevant spinor state can be shown to be

$$ |\theta,\phi\rangle = \begin{pmatrix} \cos\frac{\theta}{2} \\ i\sin\frac{\theta}{2}\,e^{i\phi} \end{pmatrix} \tag{316} $$

From this one can find the two Berry potentials. The vector potential turns out to be

$$ A_+(\phi) = i\hbar\,\langle\theta,\phi|\frac{\partial}{\partial\phi}|\theta,\phi\rangle = -\hbar\sin^2\frac{\theta}{2} \tag{317} $$

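whereas the scalar potential, obtained from (309) with the mass m replaced by the moment of inertia I, turns out to be

$$ \Phi = \frac{\hbar^2}{2I}\left[\sin^2\frac{\theta}{2} - \sin^4\frac{\theta}{2}\right] = \frac{\hbar^2}{8I}\sin^2\theta \tag{318} $$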
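The two potentials can be checked numerically from the spinor (316). The following is a minimal sketch, not part of the original notes: the function names `spinor` and `berry_potentials`, the choice of units with ħ = 1, and the finite-difference step are all assumptions made for illustration. It evaluates iħ<n|∂n> and the bracket <∂n|∂n> − <∂n|n><n|∂n> that appears in (309), with the derivative taken with respect to ϕ.

```python
import numpy as np

HBAR = 1.0  # illustrative units with hbar = 1 (an assumption, not from the notes)

def spinor(theta, phi):
    """Spin state aligned with the total field, as written in (316)."""
    return np.array([np.cos(theta / 2),
                     1j * np.sin(theta / 2) * np.exp(1j * phi)])

def berry_potentials(theta, phi, step=1e-5):
    """Return (i*hbar*<n|dn>, <dn|dn> - <dn|n><n|dn>) using a central difference in phi."""
    n = spinor(theta, phi)
    dn = (spinor(theta, phi + step) - spinor(theta, phi - step)) / (2 * step)
    n_dn = np.vdot(n, dn)  # <n|dn>; np.vdot conjugates its first argument
    vector_potential = (1j * HBAR * n_dn).real
    bracket = (np.vdot(dn, dn) - np.vdot(dn, n) * n_dn).real
    return vector_potential, bracket

theta, phi = 0.7, 1.3  # arbitrary test angles
A_num, bracket_num = berry_potentials(theta, phi)
print("A_+ numeric:", A_num, " analytic:", -HBAR * np.sin(theta / 2) ** 2)
print("bracket numeric:", bracket_num, " analytic:", np.sin(theta) ** 2 / 4)
```

Both quantities come out independent of ϕ, as expected from the symmetry of the ring. The bracket equals sin²θ/4, which is then multiplied by the prefactor of (309) to give the scalar potential.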

Since there is a vector potential, we will need to revise the angular momentum operator to accommodate it, through the replacement

$$ L_z \rightarrow L_z - A_+ \tag{319} $$


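Thus, the angular equation of motion associated with L_z changes to

$$ \left(-i\hbar\frac{\partial}{\partial\phi} - A_+\right)\Psi = \lambda\Psi \tag{320} $$

for which the solutions are

$$ \lambda = m\hbar - A_+ = \left(m + \sin^2\frac{\theta}{2}\right)\hbar \tag{321} $$

$$ \Psi = e^{im\phi} \tag{322} $$

again for m = 0, ±1, ±2, ....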

Now when we calculate the total energy of the spin-up (i.e. the lower) solution it will be

$$ E_+ = \frac{1}{2I}\lambda^2 - CB = \frac{1}{2I}\left(m + \sin^2\frac{\theta}{2}\right)^2\hbar^2 - CB \tag{323} $$

There is an extra contribution to the energy that derives from the Berry vector potential compared to (315). Had we not included the Berry phase in this problem, we would have arrived at the wrong answer. Indeed, we would have predicted a degeneracy under the replacement of m by −m (for m ≠ 0) and that would have been incorrect.
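The lifting of the ±m degeneracy can be made concrete with a few numbers. This is a minimal sketch under stated assumptions: the function names `rotor_energy` and `pm_splitting`, the units ħ = I = 1, and the choice B2 = B1 are hypothetical and only for illustration, and the constant −CB is left out since it is the same for every m.

```python
import numpy as np

def rotor_energy(m, theta, hbar=1.0, inertia=1.0, with_berry=True):
    """Rotational part of the energy in (323); the constant -CB is left out.

    hbar = inertia = 1 are illustrative units, not values from the notes.
    """
    shift = np.sin(theta / 2) ** 2 if with_berry else 0.0
    return hbar**2 * (m + shift) ** 2 / (2 * inertia)

def pm_splitting(m, theta, **kwargs):
    """Energy difference between the +m and -m levels."""
    return rotor_energy(m, theta, **kwargs) - rotor_energy(-m, theta, **kwargs)

theta = np.arctan2(1.0, 1.0)  # hypothetical choice B2 = B1, i.e. theta = pi/4
for m in (1, 2, 3):
    with_A = pm_splitting(m, theta)
    without_A = pm_splitting(m, theta, with_berry=False)
    print(f"m = {m}: splitting with Berry potential = {with_A:.6f}, without = {without_A:.6f}")
```

Expanding the square in (323) shows that the splitting between +m and −m is 2m sin²(θ/2) ħ²/I, so it vanishes only when θ = 0, that is when the wire carries no current.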

Of course the problem of having fast and slow degrees of freedom is not new to us. We

have dealt with it when we studied for example the long-range Van der Waals interaction

between two hydrogen atoms. There we had two time scales, one associated with the slow

relative motion of the two nuclei and the other the much more rapid motion of the electronic

degrees of freedom. What we did was to evaluate the interaction between the two atoms as a

function of the parametric distance between the two nuclei, integrating over the fast motion

of the electrons. Once we have the interaction between the two atoms, we can just solve the

two-body problem in the relative coordinate between the two nuclei. This is known as the

Born-Oppenheimer approximation and is the key tool that has historically been used when

we have two very different time scales involved in a many-body problem. Much the same

philosophy is used to build a description of the interaction of two nuclei, where we integrate

over the much faster degrees of freedom associated with the motion of the constituent quarks.

In the Born-Oppenheimer approach, nowhere do we take into account any Berry potentials

when we integrate over the fast degrees of freedom.

What is different between those problems and the kind of problem we just treated involving motion of particles with spin moving on a ring?


The answer is that Born and Oppenheimer focused on problems in which the hamiltonian

could always be chosen real and thus whose wave functions could always be chosen real.

While this is fine for any problem involving motion in an open space, it is not appropriate

when dealing with closed loops, where the particle can return to the same position but with

the wave function changing sign when it returns. Thus, to allow for such closed trajectories

Berry considered the possibility of complex hamiltonians which then led naturally to the

Berry phase and the associated Berry potentials and to the new physics it could produce.

This completes what I would like to say about the Berry phase, about the Path Integral

approach in general, and even more generally about Quantum Mechanics.


The End
