PHYS812
Introductory comments and syllabus
• The course is PHYS812, the third semester of the Department’s Quantum Mechanics sequence.
• I am Dr. Pittel and my office is in Sharp Lab Rm 202.
• The formal text for the course is the same as was used in PHYS610 and PHYS811,
namely Principles of Quantum Mechanics, Second Edition, by R. Shankar. Since
Shankar does not discuss some of the topics I wish to speak about, I am recommend-
ing for supplemental reading the book Quantum Mechanics, Third Edition, by E.
Merzbacher.
• The topics to be discussed in the course are:
1. Scattering Theory,
2. Second quantization,
3. Non-relativistic many-body theory,
4. Relativistic quantum mechanics
5. Applications of the Feynman Path Integral Approach (time permitting).
• The material on Scattering Theory can be found in Chapter 19 of Shankar, with valu-
able additional material in Chapter 13 and 20 of Merzbacher. There is no discussion
unfortunately on Second Quantization or Many-Body Theory in Shankar, but some
useful material on those topics in Chapters 21 and 22 of Merzbacher. My material
on Relativistic Quantum Mechanics will derive to some extent from Chapter 20 of
Shankar, but there is also some useful discussion in Chapter 24 of Merzbacher. The
material I will be discussing on the Feynman Path Integral Approach, time permitting,
can be found in Chapter 21 of Shankar.
• Where appropriate, I will be assigning weekly reading from Shankar. I will at the same
time be preparing my own detailed set of lecture notes, which I will make available
on the web prior to the associated lecture. Feel free to bring those lecture notes with
you, so that you do not have to spend the entire lecture taking notes. The web site
where the lecture notes will reside is: www.physics.udel.edu/ pittel.
• I will also be assigning weekly problem assignments, sometimes from the text and
sometimes not. All such assignments should be handed in (typically) a week after
assignment and will be graded. Students may work in groups but I request that the
assignments be written up independently. The homework assignments will likewise be
made available on the web site given above.
• I will give a mid-term examination and a final examination, both in class and closed-
book.
• At the end of the semester your grade will be obtained by weighting the homework
problems 20%, the midterm 30% and the final examination 50%.
• Office hours: I will have office hours from 2-4pm on Fridays.
An introduction to scattering theory
Over the next several weeks, I will discuss the quantum theory of collision or scattering
processes. I will focus initially on elastic scattering of spinless particles, and only at the end
will begin to put in some of the generalizations.
You should begin reading Chapter 19 of Shankar where his discussion of scattering theory
takes place.
Qualitative description of a scattering experiment
A typical scattering experiment involves the various ingredients shown schematically in
Fig. 1.
FIG. 1: Schematic illustration of an elastic scattering experiment (source S, collimator C, target T, detector D)
1. A source S of incident particles, for example a particle accelerator. The output of the
accelerator is a beam of particles, each described by a wave packet.
2. A beam collimator C. The beam is collimated by passing it through a narrow slit,
thereby giving it a small but finite spatial spread in the vertical direction.
3. A target T. The target provides a force field through which the incident particle in-
teracts with it. In principle the particles in the target are also described by wave
packets.
4. A detector D. As a result of the interaction of the incident wave packet with the
target wave packet, a portion of the incident wave packet is scattered and a portion
is transmitted (i.e. unscattered). The scattered part moves radially outward from
the force center, as represented by the succession of circles around T. The “almost”
parallel lines represent the “almost” plane waves, part of which pass through the target
unaffected, i.e. unscattered.
Experimentally, as long as the detector D is not in the path of the beam (i.e. not at very
small scattering angles), the only particles that reach it are those that have been scattered,
as a result of the use of the collimating slit C.
The source, the collimator and the detector can all be considered as infinitely far from
the target.
The scattering cross section
Suppose we bombard a group of n target particles with an almost parallel flux of N
particles per unit area per unit time, and count the number of incident particles that reach
the detector per unit time. Let’s assume that the detector subtends a solid angle dΩ about
a direction at polar angles θ, ϕ with respect to the incident beam, which we assume to move
in the z direction. We shall further assume that the detector is so placed that it receives
only scattered particles and none from the transmitted wave.
The number of particles detected per unit time in dΩ is proportional to N , n and dΩ,
viz:
$$\text{number detected per unit time} = n\,N\,\frac{d\sigma}{d\Omega}\,d\Omega$$
The proportionality constant $d\sigma/d\Omega$ is called the differential cross section.
The total cross section $\sigma$ is defined as
$$\sigma = \int \frac{d\sigma}{d\Omega}\,d\Omega$$
It represents the total number of particles scattered per unit time, per target particle and
per unit incident flux.
Note that both $d\sigma/d\Omega$ and $\sigma$ have the dimensions of an area, e.g. fm$^2$ or
angstroms$^2$.
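As a small numerical illustration of the definition of $\sigma$, the sketch below integrates an azimuthally symmetric differential cross section over the full solid angle. The function name `total_cross_section` and the isotropic test case are assumptions made for illustration; they are not part of the notes.

```python
import math

def total_cross_section(dsigma_domega, n_theta=2000):
    """Integrate an azimuthally symmetric dsigma/dOmega(theta) over all solid angle.

    Uses the midpoint rule in theta with dOmega = 2*pi*sin(theta)*dtheta.
    """
    h = math.pi / n_theta
    total = 0.0
    for i in range(n_theta):
        theta = (i + 0.5) * h
        total += dsigma_domega(theta) * math.sin(theta) * h
    return 2.0 * math.pi * total

# Hypothetical example: an isotropic differential cross section of 1 fm^2 per steradian
sigma_iso = total_cross_section(lambda theta: 1.0)
print(sigma_iso)  # close to 4*pi fm^2
```

For an isotropic $d\sigma/d\Omega$ the integral is simply $4\pi$ times its constant value, which gives a quick check of the routine.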
Continuum solutions of the Time-Independent Schrodinger equation
We will show shortly that despite the fact that collision processes involve wave pack-
ets, which are not energy eigenstates, it is nevertheless possible to describe most physical
scattering processes in terms of appropriate continuum eigenstates of the Time-Independent
Schrodinger equation. So as to provide the necessary background for that discussion, I would
now like to remind you of some of the features of continuum solutions of the Schrodinger
equation for two spinless particles interacting via a central potential. The relevant Time-
Independent Schrodinger equation (after removal of the dependence on the CM variable)
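is
$$\left(-\frac{\hbar^2}{2\mu}\nabla^2 + V(r)\right)\Psi(\mathbf{r}) = E\,\Psi(\mathbf{r}) \qquad (1)$$
For simplicity, we introduce
$$U = \frac{2\mu V}{\hbar^2}, \qquad k^2 = \frac{2\mu E}{\hbar^2}$$
and rewrite (1) as
$$\left(\nabla^2 + k^2\right)\Psi(\mathbf{r}) = U(r)\,\Psi(\mathbf{r}) \qquad (2)$$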
The above differential equation by itself does not fully specify the scattering problem. Due
to time reversal or parity invariance, it has degenerate solutions at all positive energies E. By
specifying the boundary conditions, i.e. the conditions as r → ∞, we can choose the linear
combination of degenerate solutions which are appropriate for the scattering problem of
interest. Thus, to fully specify the appropriate continuum solutions for scattering problems,
we must supplement the differential equation (2) by boundary conditions. If, however, we
transform the differential equation (2) into an integral equation, we can incorporate the
desired boundary conditions into the same equation.
Introduction of the Free-Particle Green’s Function
To make the transformation from a differential equation with supplementary boundary
conditions to an integral equation, we introduce the free-particle Green’s function g0(r, r′),
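which we define by
$$\left[\nabla^2 + k^2\right] g_0(\mathbf{r},\mathbf{r}') = \delta(\mathbf{r}-\mathbf{r}')$$
With the above definition of $g_0(\mathbf{r},\mathbf{r}')$, we see that
$$\begin{aligned}
\left(\nabla^2 + k^2\right)\left[\int d\mathbf{r}'\, g_0(\mathbf{r},\mathbf{r}')\,U(r')\,\Psi(\mathbf{r}')\right]
&= \int d\mathbf{r}'\,\left(\nabla^2 + k^2\right) g_0(\mathbf{r},\mathbf{r}')\,U(r')\,\Psi(\mathbf{r}') \\
&= \int d\mathbf{r}'\,\delta(\mathbf{r}-\mathbf{r}')\,U(r')\,\Psi(\mathbf{r}') \\
&= U(r)\,\Psi(\mathbf{r})
\end{aligned}$$
Thus, if we set
$$\Psi(\mathbf{r}) = \int d\mathbf{r}'\, g_0(\mathbf{r},\mathbf{r}')\,U(r')\,\Psi(\mathbf{r}')$$
then this $\Psi(\mathbf{r})$ is guaranteed to satisfy the Schrodinger equation
$$\left[\nabla^2 + k^2\right]\Psi = U\Psi$$
The equation
$$\Psi(\mathbf{r}) = \int d\mathbf{r}'\, g_0(\mathbf{r},\mathbf{r}')\,U(r')\,\Psi(\mathbf{r}')$$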
is an integral equation. Note that the unknown function Ψ also appears in the integrand.
Question: Is the above integral equation the only one that is consistent with the
Schrodinger equation?
Answer: No! To see this, consider instead the integral equation
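$$\Psi(\mathbf{r}) = \Phi(\mathbf{r}) + \int d\mathbf{r}'\, g_0(\mathbf{r},\mathbf{r}')\,U(r')\,\Psi(\mathbf{r}')$$
in which the function $\Phi(\mathbf{r})$ is assumed to satisfy
$$\left(\nabla^2 + k^2\right)\Phi(\mathbf{r}) = 0$$
Then
$$\left(\nabla^2 + k^2\right)\left[\Phi + \int d\mathbf{r}'\, g_0(\mathbf{r},\mathbf{r}')\,U(r')\,\Psi(\mathbf{r}')\right]
= \left(\nabla^2 + k^2\right)\left[\int d\mathbf{r}'\, g_0(\mathbf{r},\mathbf{r}')\,U(r')\,\Psi(\mathbf{r}')\right]
= U(r)\,\Psi(\mathbf{r})$$
Thus, we can always add any solution of the homogeneous equation
$$\left(\nabla^2 + k^2\right)\Phi = 0$$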
and obtain a new integral equation which is also consistent with the Schrodinger equation.
This freedom can be used to incorporate the desired boundary conditions of the scattering
problem.
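Note that the equation
$$\left(\nabla^2 + k^2\right)\Phi = 0$$
is just the free-particle Schrodinger equation. Among its solutions, as we know, are the
normalized plane waves
$$\Phi(\mathbf{r}) = \frac{1}{(2\pi)^{3/2}}\,e^{i\mathbf{k}\cdot\mathbf{r}}$$
With this choice of $\Phi(\mathbf{r})$, the integral equation becomes
$$\Psi(\mathbf{r}) = \frac{1}{(2\pi)^{3/2}}\,e^{i\mathbf{k}\cdot\mathbf{r}} + \int d\mathbf{r}'\, g_0(\mathbf{r},\mathbf{r}')\,U(r')\,\Psi(\mathbf{r}') \qquad (3)$$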
This has the desired separation of Ψ(r) into an incoming plane wave plus another term.
We will now show that it is possible to choose g0(r, r ′) such that the second term behaves
for large r like an outgoing spherical wave. Our integral equation will then not only be
consistent with the Schrodinger equation, but it will also incorporate the desired scattering
boundary conditions.
Explicit Construction of g0
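Because of the symmetry of the problem, the Green’s function $g_0(\mathbf{r},\mathbf{r}')$ can only depend
on $\mathbf{r}-\mathbf{r}'$. Define
$$\mathbf{x} = \mathbf{r}-\mathbf{r}'$$
so that
$$\left(\nabla^2 + k^2\right) g_0(\mathbf{x}) = \delta(\mathbf{x}) \qquad (4)$$
To obtain $g_0(\mathbf{x})$, we first expand it in a Fourier integral
$$g_0(\mathbf{x}) = \frac{1}{(2\pi)^3}\int e^{i\mathbf{q}\cdot\mathbf{x}}\,g_0(\mathbf{q})\,d\mathbf{q} \qquad (5)$$
Plugging (5) into (4), we see that
$$\frac{1}{(2\pi)^3}\int \left(-q^2 + k^2\right) e^{i\mathbf{q}\cdot\mathbf{x}}\,g_0(\mathbf{q})\,d\mathbf{q} = \delta(\mathbf{x}) \qquad (6)$$
But we know that
$$\frac{1}{(2\pi)^3}\int e^{i\mathbf{q}\cdot\mathbf{x}}\,d\mathbf{q} = \delta(\mathbf{x})$$
so that $g_0(\mathbf{q})$ must satisfy
$$\left(k^2 - q^2\right) g_0(\mathbf{q}) = 1 \qquad (7)$$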
Since (7) must be satisfied for all positive q, including q = k, we cannot trivially invert this
equation to obtain g0(q). The proper technique for inverting such singular equations is to
first consider the generalization to complex k. Once k is assumed complex (with Im k ≠ 0),
we can obtain g0(q) for all q. This can then be plugged into (5) and the integration can
be carried out to obtain g0(x). At that point, we can take the limit Im k → 0, since the
physical value of k is real. What we will find is that this procedure does not yield a unique
answer. The result will depend on whether Im k → 0 from above the real axis or from
below. We shall use the notation $g_0^+(x)$ to denote the result when Im k → 0+, i.e. from the
positive imaginary side, and correspondingly $g_0^-(x)$ to denote the result when Im k → 0−,
i.e. from the negative imaginary side. Physically, these will be seen to yield wave functions
with different boundary conditions. Both are, however, legitimate mathematical solutions.
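Mathematically, we can do the above by replacing $k^2$ in (7) by $k^2 \pm i\epsilon$, with the
understanding that after the eventual integration of (4) we will take the limit $\epsilon \to 0$. Thus, we
replace (7) by
$$\left(k^2 \pm i\epsilon - q^2\right) g_0^{\pm}(\mathbf{q}) = 1$$
for which the solutions are
$$g_0^{\pm}(\mathbf{q}) = \frac{1}{k^2 - q^2 \pm i\epsilon}$$
Inserting this into (5) and taking the limit as $\epsilon \to 0$ gives
$$g_0^{\pm}(\mathbf{x}) = \frac{1}{(2\pi)^3}\lim_{\epsilon\to 0}\int \frac{e^{i\mathbf{q}\cdot\mathbf{x}}}{k^2 - q^2 \pm i\epsilon}\,d\mathbf{q}$$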
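To evaluate the integral, we choose the coordinate system for q in such a way that its
z-axis is along the vector x. Then
$$e^{i\mathbf{q}\cdot\mathbf{x}} = e^{iqx\cos\theta}$$
The angular integrals can then be carried out immediately, giving
$$\int_{-1}^{+1}\int_0^{2\pi} e^{iqx\cos\theta}\,d(\cos\theta)\,d\phi = \frac{2\pi}{iqx}\left[e^{iqx} - e^{-iqx}\right]$$
so that
$$\begin{aligned}
g_0^{\pm}(x) &= \frac{1}{4\pi^2 i x}\lim_{\epsilon\to 0}\int_0^{\infty} \frac{e^{iqx} - e^{-iqx}}{k^2 - q^2 \pm i\epsilon}\,q\,dq \\
&= \frac{1}{8\pi^2 i x}\lim_{\epsilon\to 0}\int_{-\infty}^{\infty} \frac{e^{iqx} - e^{-iqx}}{k^2 - q^2 \pm i\epsilon}\,q\,dq
\end{aligned}$$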
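To evaluate the remaining integral, we use contour integration. For the first term,
$$\lim_{\epsilon\to 0}\int_{-\infty}^{\infty} \frac{e^{iqx}}{k^2 - q^2 \pm i\epsilon}\,q\,dq\,,$$
we close the contour in the upper half plane. For the second term,
$$\lim_{\epsilon\to 0}\int_{-\infty}^{\infty} \frac{e^{-iqx}}{k^2 - q^2 \pm i\epsilon}\,q\,dq\,,$$
we close it in the lower half plane. The relevant poles of the integrand are as follows:
(A) $g_0^+(x)$
(1) First term: $q = \sqrt{k^2 + i\epsilon} \approx k + \dfrac{i\epsilon}{2k}$
(2) Second term: $q = -\sqrt{k^2 + i\epsilon} \approx -k - \dfrac{i\epsilon}{2k}$
(B) $g_0^-(x)$
(1) First term: $q = -\sqrt{k^2 - i\epsilon} \approx -k + \dfrac{i\epsilon}{2k}$
(2) Second term: $q = \sqrt{k^2 - i\epsilon} \approx k - \dfrac{i\epsilon}{2k}$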
where I have made use of the fact that ϵ is very small so that we need only keep the term
linear in it.
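All told (and you should convince yourself that these results are correct)
$$g_0^+(x) = \frac{1}{8\pi^2 i x}\,\pi i\left[-e^{ikx} - e^{ikx}\right] = -\frac{1}{4\pi x}\,e^{ikx}$$
Similarly,
$$g_0^-(x) = \frac{1}{8\pi^2 i x}\,\pi i\left[-e^{-ikx} - e^{-ikx}\right] = -\frac{1}{4\pi x}\,e^{-ikx}$$
Combining the two, we find that
$$g_0^{\pm}(x) = -\frac{1}{4\pi x}\,e^{\pm ikx}$$
Putting back $\mathbf{x} = \mathbf{r}-\mathbf{r}'$ gives our final result for the free-particle Green’s function(s)
$$g_0^{\pm}(\mathbf{r}-\mathbf{r}') = -\frac{1}{4\pi|\mathbf{r}-\mathbf{r}'|}\,e^{\pm ik|\mathbf{r}-\mathbf{r}'|} \qquad (8)$$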
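One way to convince yourself that the result (8) is sensible is to check numerically that it satisfies the homogeneous equation away from the source point. The sketch below does this with a finite-difference second derivative, using the identity $\nabla^2 g = \frac{1}{x}\frac{d^2}{dx^2}(x g)$ for a function that depends only on $x = |\mathbf{r}-\mathbf{r}'|$. The helper names, the step size and the value of k are illustrative assumptions.

```python
import cmath
import math

def g0(x, k, sign=+1):
    """Free-particle Green's function of eq. (8) as a function of x = |r - r'|."""
    return -cmath.exp(sign * 1j * k * x) / (4.0 * math.pi * x)

def helmholtz_residual(x, k, sign=+1, h=1e-3):
    """(del^2 + k^2) g0 at x > 0, with del^2 g = (1/x) d^2(x g)/dx^2 for radial g."""
    u = lambda s: s * g0(s, k, sign)
    d2u = (u(x + h) - 2.0 * u(x) + u(x - h)) / h**2
    return d2u / x + k**2 * g0(x, k, sign)

k = 1.3  # arbitrary illustrative wave number
residuals = [abs(helmholtz_residual(x, k, s)) for x in (0.5, 1.0, 4.0) for s in (+1, -1)]
print(max(residuals))  # should be tiny: only finite-difference error remains
```

This check covers only $x > 0$; the delta-function normalization at $x = 0$ is fixed by the $-1/(4\pi x)$ prefactor and is not tested here.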
Return to the integral equation for Ψ(r)
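We now insert the two possible Green’s functions into the integral equation (3) for $\Psi(\mathbf{r})$,
obtaining
$$\Psi^{\pm}_{\mathbf{k}}(\mathbf{r}) = \frac{e^{i\mathbf{k}\cdot\mathbf{r}}}{(2\pi)^{3/2}} - \frac{1}{4\pi}\int \frac{e^{\pm ik|\mathbf{r}-\mathbf{r}'|}}{|\mathbf{r}-\mathbf{r}'|}\,U(r')\,\Psi^{\pm}_{\mathbf{k}}(\mathbf{r}')\,d\mathbf{r}' \qquad (9)$$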
Note that there are two different solutions, one for each of the Green’s functions. Also, note
that I now include a subscript k to make clear that these wave functions correspond to an
incoming plane wave with momentum k.
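We now examine this in the asymptotic limit, namely when r → ∞. Then
$$\begin{aligned}
|\mathbf{r}-\mathbf{r}'| &= \left(r^2 + r'^2 - 2\,\mathbf{r}\cdot\mathbf{r}'\right)^{1/2} \\
&= r\left(1 + \left(\frac{r'}{r}\right)^2 - \frac{2}{r}\,\hat{\mathbf{r}}\cdot\mathbf{r}'\right)^{1/2} \\
&\to r\left(1 - \frac{1}{r}\,\hat{\mathbf{r}}\cdot\mathbf{r}'\right)
\end{aligned}$$
and
$$|\mathbf{r}-\mathbf{r}'|^{-1} \to r^{-1}$$
All told,
$$\frac{e^{\pm ik|\mathbf{r}-\mathbf{r}'|}}{|\mathbf{r}-\mathbf{r}'|} \to \frac{e^{\pm ikr}}{r}\,e^{\mp i\mathbf{k}'\cdot\mathbf{r}'}$$
where the vector $\mathbf{k}'$ is defined as
$$\mathbf{k}' = k\,\hat{\mathbf{r}}$$
i.e. it has the same magnitude as k but is in the direction of r.
Finally,
$$\Psi^{\pm}_{\mathbf{k}}(\mathbf{r}) \to \frac{e^{i\mathbf{k}\cdot\mathbf{r}}}{(2\pi)^{3/2}} - \frac{e^{\pm ikr}}{4\pi r}\int e^{\mp i\mathbf{k}'\cdot\mathbf{r}'}\,U(r')\,\Psi^{\pm}_{\mathbf{k}}(\mathbf{r}')\,d\mathbf{r}' \qquad (10)$$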
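The quality of the large-r expansion $|\mathbf{r}-\mathbf{r}'| \approx r - \hat{\mathbf{r}}\cdot\mathbf{r}'$ used above can be seen in a quick numerical sketch. The specific vectors below are hypothetical values chosen only for illustration; the error should fall off roughly like 1/r.

```python
import math

def exact_distance(r_vec, rp_vec):
    """|r - r'| computed directly."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(r_vec, rp_vec)))

def asymptotic_distance(r_vec, rp_vec):
    """Leading large-r approximation |r - r'| ~ r - rhat . r'."""
    r = math.sqrt(sum(a * a for a in r_vec))
    return r - sum(a * b for a, b in zip(r_vec, rp_vec)) / r

rp = (0.3, -0.4, 0.5)        # hypothetical point inside the range of the potential
direction = (0.0, 0.6, 0.8)  # unit vector rhat pointing toward the detector
errors = []
for r in (10.0, 100.0, 1000.0):
    r_vec = tuple(r * c for c in direction)
    errors.append(abs(exact_distance(r_vec, rp) - asymptotic_distance(r_vec, rp)))
print(errors)  # each entry is about ten times smaller than the previous one
```

Because the neglected term is of order $r'^2/r$, the expansion is adequate whenever the detector distance is large compared with the range of the potential.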
Thus, the solution $\Psi^+_{\mathbf{k}}(\mathbf{r})$, corresponding to the Green’s function $g_0^+(\mathbf{r},\mathbf{r}')$, has the desired
behavior of an incident plane wave plus an outgoing scattered wave. It therefore represents
the desired continuum solution for the description of a physical elastic scattering process.
The other solution, $\Psi^-_{\mathbf{k}}(\mathbf{r})$, is another continuum solution at the same energy, but it is
not of direct relevance to a physical scattering process. However, we will indeed make use
of it later in our formal development of scattering theory.
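Equation (10) can be rewritten in the form
$$\Psi^{\pm}_{\mathbf{k}}(\mathbf{r}) \to \frac{1}{(2\pi)^{3/2}}\left[e^{i\mathbf{k}\cdot\mathbf{r}} + f^{\pm}_{\mathbf{k}}(\hat{\mathbf{r}})\,\frac{e^{\pm ikr}}{r}\right] \qquad (11)$$
where
$$f^{\pm}_{\mathbf{k}}(\hat{\mathbf{r}}) = -\frac{(2\pi)^{3/2}}{4\pi}\int e^{\mp i\mathbf{k}'\cdot\mathbf{r}'}\,U(r')\,\Psi^{\pm}_{\mathbf{k}}(\mathbf{r}')\,d\mathbf{r}' \qquad (12)$$
The coefficient $f^+_{\mathbf{k}}(\hat{\mathbf{r}})$ for the physical continuum solution is called the scattering amplitude.
It is often just denoted $f_{\mathbf{k}}(\hat{\mathbf{r}})$ without the superscript.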
Modification of notation
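It is useful to recast the equations we have obtained in terms of the true potential V (r)
rather than the scaled potential U(r). To do so, we note that
$$V(r) = \frac{\hbar^2}{2\mu}\,U(r)$$
Also, we would like to express our equations in terms of a slightly different (free particle)
Green’s function, defined by
$$(E - H_0)\,G_0(\mathbf{r},\mathbf{r}') = \delta(\mathbf{r}-\mathbf{r}')$$
where
$$E = \frac{\hbar^2 k^2}{2\mu}$$
and
$$H_0 = -\frac{\hbar^2}{2\mu}\nabla^2$$
Then, clearly,
$$G_0^{\pm}(\mathbf{r},\mathbf{r}') = \frac{2\mu}{\hbar^2}\,g_0^{\pm}(\mathbf{r},\mathbf{r}') = -\frac{\mu}{2\pi\hbar^2}\,\frac{e^{\pm ik|\mathbf{r}-\mathbf{r}'|}}{|\mathbf{r}-\mathbf{r}'|} \qquad (13)$$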
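The integral equation governing the scattering process can now be rewritten as
$$\Psi^{\pm}_{\mathbf{k}}(\mathbf{r}) = \frac{1}{(2\pi)^{3/2}}\,e^{i\mathbf{k}\cdot\mathbf{r}} + \int d\mathbf{r}'\,G_0^{\pm}(\mathbf{r},\mathbf{r}')\,V(r')\,\Psi^{\pm}_{\mathbf{k}}(\mathbf{r}') \qquad (14)$$
and the scattering amplitude as
$$f_{\mathbf{k}}(\hat{\mathbf{r}}) = f^+_{\mathbf{k}}(\hat{\mathbf{r}}) = -\frac{(2\pi)^{1/2}\mu}{\hbar^2}\int e^{-i\mathbf{k}'\cdot\mathbf{r}'}\,V(r')\,\Psi^+_{\mathbf{k}}(\mathbf{r}')\,d\mathbf{r}' \qquad (15)$$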
Scattering of wave packets
We now return to a discussion of a real elastic scattering experiment, in which the incident
projectile and the target are appropriately described by wave packets and not by energy
eigenstates. We shall prove, however, that because of specific features of the wave packets, we
can neglect their spread in energy (or momentum) and describe real scattering experiments
in terms of the aforementioned $\Psi^+_{\mathbf{k}}(\mathbf{r})$ continuum eigenstates of H.
In a real scattering experiment, the projectile is prepared at time t0 in the form of a
wave packet, centered about a point z0 and with some average momentum k0. A suitable
expression for the wave packet is
$$\Psi(\mathbf{r},t_0) = A(\mathbf{r}-\mathbf{z}_0)\,e^{i\mathbf{k}_0\cdot\mathbf{r}} \qquad (16)$$
A(r − z0) is a narrow envelope function that expresses its spatial localization about z0.
We shall define the zero of time as the time at which the projectile and target would
coincide were there no interaction. Then clearly $t_0 < 0$ and furthermore
$$v_0 t_0 = -|\mathbf{z}_0| \qquad (17)$$
where
$$v_0 = \frac{\hbar k_0}{\mu}$$
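Let us now introduce a Fourier decomposition of the narrow envelope function,
$$A(\mathbf{r}-\mathbf{z}_0) = \frac{1}{(2\pi)^{3/2}}\int a(\mathbf{k})\,e^{i\mathbf{k}\cdot\mathbf{r}}\,d\mathbf{k} \qquad (18)$$
The components in this decomposition are given by
$$a(\mathbf{k}) = \frac{1}{(2\pi)^{3/2}}\int e^{-i\mathbf{k}\cdot\mathbf{r}}\,A(\mathbf{r}-\mathbf{z}_0)\,d\mathbf{r} \qquad (19)$$
We can also Fourier decompose $\Psi(\mathbf{r},t_0)$. The coefficients of that expansion are
$$\begin{aligned}
\frac{1}{(2\pi)^{3/2}}\int e^{-i\mathbf{k}\cdot\mathbf{r}}\,\Psi(\mathbf{r},t_0)\,d\mathbf{r}
&= \frac{1}{(2\pi)^{3/2}}\int e^{i(\mathbf{k}_0-\mathbf{k})\cdot\mathbf{r}}\,A(\mathbf{r}-\mathbf{z}_0)\,d\mathbf{r} \\
&= a(\mathbf{k}-\mathbf{k}_0)
\end{aligned} \qquad (20)$$
where the last equality followed from (19). Thus,
$$\Psi(\mathbf{r},t_0) = \frac{1}{(2\pi)^{3/2}}\int a(\mathbf{k}-\mathbf{k}_0)\,e^{i\mathbf{k}\cdot\mathbf{r}}\,d\mathbf{k} \qquad (21)$$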
Obviously, the coefficients $a(\mathbf{k}-\mathbf{k}_0)$ in this Fourier expansion are only large if $\mathbf{k}-\mathbf{k}_0 \approx 0$.
In fact, the important values of momentum lie in a range
$$\Delta k \approx w^{-1}$$
where w is a typical spatial width over which significant changes in the envelope function
$A(\mathbf{r}-\mathbf{z}_0)$ occur.
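The estimate $\Delta k \approx w^{-1}$ can be illustrated with a one-dimensional Gaussian envelope, for which the root-mean-square widths of $|A|^2$ and $|a|^2$ multiply to exactly 1/2. The sketch below is a minimal numerical check under that assumption; the Gaussian form, the grid ranges and the helper `rms_width` are illustrative choices, not taken from the notes.

```python
import math

def rms_width(xs, weights):
    """Root-mean-square width of a distribution sampled on a uniform grid."""
    norm = sum(weights)
    mean = sum(x * w for x, w in zip(xs, weights)) / norm
    return math.sqrt(sum((x - mean) ** 2 * w for x, w in zip(xs, weights)) / norm)

w = 2.0  # hypothetical spatial width of the envelope A(x) = exp(-x^2 / (2 w^2))
xs = [-20.0 + 0.1 * i for i in range(401)]
A = [math.exp(-x * x / (2.0 * w * w)) for x in xs]

ks = [-3.0 + 0.03 * j for j in range(201)]
# a(k) is proportional to the integral of exp(-i k x) A(x); A is even,
# so only the cosine part survives
a = [sum(Ax * math.cos(k * x) for x, Ax in zip(xs, A)) * 0.1 for k in ks]

dx = rms_width(xs, [v * v for v in A])
dk = rms_width(ks, [v * v for v in a])
print(dx, dk, dx * dk)  # the product is close to 0.5
```

Doubling w in this sketch halves dk, which is the scaling the text relies on.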
In principle, the wave function at any time t > t0 can be evaluated by acting with the
time evolution operator on the wave packet at time $t_0$. Namely it can be evaluated as
$$\Psi(\mathbf{r},t) = e^{-\frac{i}{\hbar}H(t-t_0)}\,\Psi(\mathbf{r},t_0) \qquad (22)$$
where H is the full hamiltonian of the system.
As a reminder, the Fourier decomposition (21) was an expansion of the initial wave packet
in terms of the free-particle eigenstates eik · r of H0. If we wish to evaluate the effect of the
full time evolution operator (which involves H, not H0) on the initial wave packet, it is
useful to express it instead as an expansion in terms of the eigenfunctions $\Psi_{\mathbf{k}}(\mathbf{r})$ of the full
hamiltonian H, i.e. the wave functions $\Psi^+_{\mathbf{k}}(\mathbf{r})$ obtained earlier. So, let’s now see how we can
do this.
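If we express
$$\Psi(\mathbf{r},t_0) = \int b(\mathbf{k})\,\Psi^+_{\mathbf{k}}(\mathbf{r})\,d\mathbf{k} \qquad (23)$$
then
$$\begin{aligned}
b(\mathbf{k}) &= \int \Psi^{+*}_{\mathbf{k}}(\mathbf{r})\,\Psi(\mathbf{r},t_0)\,d\mathbf{r} \\
&= \frac{1}{(2\pi)^{3/2}}\int e^{-i\mathbf{k}\cdot\mathbf{r}}\,\Psi(\mathbf{r},t_0)\,d\mathbf{r}
- \frac{\mu}{2\pi\hbar^2}\int\!\!\int \frac{e^{-ik|\mathbf{r}-\mathbf{r}'|}}{|\mathbf{r}-\mathbf{r}'|}\,V(r')\,\Psi^{+*}_{\mathbf{k}}(\mathbf{r}')\,\Psi(\mathbf{r},t_0)\,d\mathbf{r}\,d\mathbf{r}'
\end{aligned}$$
First term: Overlap with the incident plane wave.
$$\begin{aligned}
\frac{1}{(2\pi)^{3/2}}\int e^{-i\mathbf{k}\cdot\mathbf{r}}\,\Psi(\mathbf{r},t_0)\,d\mathbf{r}
&= \frac{1}{(2\pi)^{3}}\int\!\!\int a(\mathbf{k}'-\mathbf{k}_0)\,e^{i(\mathbf{k}'-\mathbf{k})\cdot\mathbf{r}}\,d\mathbf{k}'\,d\mathbf{r} \\
&= \int a(\mathbf{k}'-\mathbf{k}_0)\,\delta(\mathbf{k}'-\mathbf{k})\,d\mathbf{k}' \\
&= a(\mathbf{k}-\mathbf{k}_0)
\end{aligned}$$
Here (21) was used for $\Psi(\mathbf{r},t_0)$, which supplies the second factor of $(2\pi)^{-3/2}$.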
Second term: Overlap with the outgoing scattered wave.
To evaluate the second term, we note first that Ψ(r, t0) is a highly localized wave packet,
which is only non-zero over a small region of space. Furthermore, at time t0 (when it was
prepared), the small region of space is very far (|z0| ≈ ∞) from the target.
Thus, the overlap between Ψ(r, t0) and the scattered wave can only be non-zero for r ≈ ∞
(z ≈ −∞). It thus suffices to look at the scattered wave in the asymptotic region, for which
we see from (11) that it behaves like
$$\frac{e^{ikr}}{(2\pi)^{3/2}\,r} \times f_{\mathbf{k}}(\hat{\mathbf{r}})$$
From this, we see that in the localized region of overlap with Ψ(r, t0), the scattered wave has
a well-defined momentum which is opposite in direction to the average incident momentum
k.
But for any reasonable wave packet, the range of important momenta $\Delta k$ is such that
$$\frac{\Delta k}{k_0} \ll 1$$
where $k_0$ is the central momentum of the packet.
I claim therefore that since any such scattered wave has a small range of momenta which
are all opposite in direction and of comparable magnitude to the small range of momenta
of the plane wave components of the wave packet there can not be any significant overlap
between them. Put another way, the second term gives zero contribution.
Putting our results for the first and second terms together, we arrive at the important
conclusion that
b(k) ≈ a(k − k0) (24)
As a reminder, crucial to our reaching this conclusion were that
1. the wave packet is spatially sufficiently well localized so that at t = t0, it does not yet
feel the potential, i.e. it is in the asymptotic region, and
2. its range of momenta is small compared to its average momentum.
Both of these criteria are invariably realized in real scattering experiments.
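Now we continue by inserting (24) into (23), thereby reexpressing the $t = t_0$ wave packet
as
$$\Psi(\mathbf{r},t_0) = \int a(\mathbf{k}-\mathbf{k}_0)\,\Psi^+_{\mathbf{k}}(\mathbf{r})\,d\mathbf{k} \qquad (25)$$
This wave packet at subsequent times can be obtained by applying the time development
operator,
$$\begin{aligned}
\Psi(\mathbf{r},t) &= e^{-\frac{i}{\hbar}H(t-t_0)}\,\Psi(\mathbf{r},t_0) \\
&= \int a(\mathbf{k}-\mathbf{k}_0)\,e^{-\frac{i}{\hbar}E_k(t-t_0)}\,\Psi^+_{\mathbf{k}}(\mathbf{r})\,d\mathbf{k}
\end{aligned} \qquad (26)$$
where
$$E_k = \frac{\hbar^2 k^2}{2\mu}$$
is the energy eigenvalue associated with the eigenfunction $\Psi^+_{\mathbf{k}}(\mathbf{r})$.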
We are now interested in evaluating Ψ(r, t) in the asymptotic limit, i.e. at r → ∞, since
this is where the detector is located and thus experimental results can be obtained. Using
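(11), this can be written as
$$\Psi(\mathbf{r},t) \to \frac{1}{(2\pi)^{3/2}}\int a(\mathbf{k}-\mathbf{k}_0)\,e^{-\frac{i}{\hbar}E_k(t-t_0)}\left[e^{i\mathbf{k}\cdot\mathbf{r}} + f_{\mathbf{k}}(\hat{\mathbf{r}})\,\frac{e^{ikr}}{r}\right]d\mathbf{k} \qquad (27)$$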
To evaluate this integral, we again make use of the fact that a(k − k0) is only non-zero
for very small values of k − k0. We thus introduce a new variable q = k − k0 and expand
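the various phase factors as power series in its magnitude q:
$$\begin{aligned}
k^2 &= (\mathbf{k}_0+\mathbf{q})\cdot(\mathbf{k}_0+\mathbf{q}) \\
&= k_0^2 + q^2 + 2\,\mathbf{k}_0\cdot\mathbf{q} \\
&\approx k_0^2 + 2\,\mathbf{k}_0\cdot\mathbf{q}
\end{aligned}$$
since q can only assume small values.
Likewise,
$$\begin{aligned}
k = (k^2)^{1/2} &\approx \left[k_0^2\left(1 + \frac{2\,\hat{\mathbf{k}}_0\cdot\mathbf{q}}{k_0}\right)\right]^{1/2} \\
&\approx k_0\left(1 + \frac{\hat{\mathbf{k}}_0\cdot\mathbf{q}}{k_0}\right) \\
&= k_0 + \hat{\mathbf{k}}_0\cdot\mathbf{q}
\end{aligned}$$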
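Finally,
$$E_k = \frac{\hbar^2}{2\mu}\,k^2 \approx \frac{\hbar^2}{2\mu}\,k_0^2 + \frac{\hbar^2}{\mu}\,\mathbf{k}_0\cdot\mathbf{q}$$
Defining
$$\omega_0 = \frac{E_{k_0}}{\hbar} = \frac{\hbar k_0^2}{2\mu}$$
and, as earlier,
$$\mathbf{v}_0 = \frac{\hbar\,\mathbf{k}_0}{\mu}$$
we find that
$$E_k \approx \hbar\omega_0 + \hbar\,\mathbf{v}_0\cdot\mathbf{q}$$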
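The size of the term dropped in this linearization is easy to check numerically: the exact and linearized energies differ by exactly $\hbar^2 q^2/2\mu$. The sketch below works in illustrative units with $\hbar = \mu = 1$ and uses hypothetical values of $\mathbf{k}_0$ and $\mathbf{q}$; none of these numbers come from the notes.

```python
HBAR = 1.0  # illustrative units with hbar = 1
MU = 1.0    # and reduced mass mu = 1

def energy(k_vec):
    """Exact free-particle energy E_k = hbar^2 k^2 / (2 mu)."""
    return HBAR**2 * sum(c * c for c in k_vec) / (2.0 * MU)

def linearized_energy(k0_vec, q_vec):
    """E_k ~ hbar*omega0 + hbar*v0.q with omega0 = hbar k0^2 / (2 mu), v0 = hbar k0 / mu."""
    omega0 = HBAR * sum(c * c for c in k0_vec) / (2.0 * MU)
    v0 = [HBAR * c / MU for c in k0_vec]
    return HBAR * omega0 + HBAR * sum(v * q for v, q in zip(v0, q_vec))

k0 = (0.0, 0.0, 5.0)       # hypothetical central momentum
q = (0.01, -0.02, 0.03)    # hypothetical small deviation, |q| << k0
k = tuple(a + b for a, b in zip(k0, q))
exact = energy(k)
approx = linearized_energy(k0, q)
print(exact, approx, exact - approx)  # difference equals hbar^2 q^2 / (2 mu)
```

The neglected piece is second order in q, which is why the packet keeps its shape in the results that follow: spreading of the wave packet is precisely the effect of this dropped term.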
In contrast to the phase factors, for which rapid variations with q are possible, we do
not expect (except under resonance conditions) that the scattering amplitude fk(r) should
vary much over the small range of important k values. Thus, we assume that over this small
range
$$f_{\mathbf{k}}(\hat{\mathbf{r}}) \approx f_{\mathbf{k}_0}(\hat{\mathbf{r}})$$
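We can now plug all of these results into (27) leading to
$$\begin{aligned}
\Psi(\mathbf{r},t) \to{}& e^{-i\omega_0(t-t_0)}\left[\frac{1}{(2\pi)^{3/2}}\int a(\mathbf{q})\,e^{i(\mathbf{k}_0+\mathbf{q})\cdot\mathbf{r} - i\mathbf{q}\cdot\mathbf{v}_0(t-t_0)}\,d\mathbf{q}
+ \frac{f_{\mathbf{k}_0}(\hat{\mathbf{r}})}{(2\pi)^{3/2}\,r}\int a(\mathbf{q})\,e^{ik_0 r + i\mathbf{q}\cdot\hat{\mathbf{k}}_0 r - i\mathbf{q}\cdot\mathbf{v}_0(t-t_0)}\,d\mathbf{q}\right] \\
={}& e^{-i\omega_0\Delta t}\left[\frac{e^{i\mathbf{k}_0\cdot\mathbf{r}}}{(2\pi)^{3/2}}\int a(\mathbf{q})\,e^{i\mathbf{q}\cdot(\mathbf{r}-\mathbf{v}_0\Delta t)}\,d\mathbf{q}
+ \frac{f_{\mathbf{k}_0}(\hat{\mathbf{r}})\,e^{ik_0 r}}{(2\pi)^{3/2}\,r}\int a(\mathbf{q})\,e^{i\mathbf{q}\cdot(\hat{\mathbf{k}}_0 r-\mathbf{v}_0\Delta t)}\,d\mathbf{q}\right]
\end{aligned} \qquad (28)$$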
where I have now introduced ∆t = t− t0.
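Comparing the two integrals in (28) with the integral in (18) for the Fourier transformation
of the envelope function, each integral together with its prefactor $(2\pi)^{-3/2}$ is just the
envelope function at a shifted argument, so we can rewrite (28) as
$$\Psi(\mathbf{r},t) \to e^{-i\omega_0\Delta t}\left[e^{i\mathbf{k}_0\cdot\mathbf{r}}\,A(\mathbf{r}-\mathbf{v}_0\Delta t-\mathbf{z}_0)
+ \frac{f_{\mathbf{k}_0}(\hat{\mathbf{r}})\,e^{ik_0 r}}{r}\,A(\hat{\mathbf{k}}_0 r-\mathbf{v}_0\Delta t-\mathbf{z}_0)\right] \qquad (29)$$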
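But
$$\begin{aligned}
\mathbf{v}_0\Delta t + \mathbf{z}_0 &= \mathbf{v}_0 t - \mathbf{v}_0 t_0 + \mathbf{z}_0 \\
&= \mathbf{v}_0 t - \hat{\mathbf{v}}_0\,v_0 t_0 + \mathbf{z}_0 \\
&= \mathbf{v}_0 t - \hat{\mathbf{v}}_0\,v_0 t_0 - |\mathbf{z}_0|\,\hat{\mathbf{v}}_0 \\
&= \mathbf{v}_0 t - \hat{\mathbf{v}}_0\left(v_0 t_0 + |\mathbf{z}_0|\right) \\
&= \mathbf{v}_0 t
\end{aligned}$$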
where the last equality follows from (17).
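Thus, after all this work, we find that
$$\Psi(\mathbf{r},t) \to e^{-i\omega_0(t-t_0)}\left[e^{i\mathbf{k}_0\cdot\mathbf{r}}\,A(\mathbf{r}-\mathbf{v}_0 t)
+ \frac{f_{\mathbf{k}_0}(\hat{\mathbf{r}})\,e^{ik_0 r}}{r}\,A(\hat{\mathbf{k}}_0 r-\mathbf{v}_0 t)\right] \qquad (30)$$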
The physical interpretation of the two terms in (30) is straightforward.
• The first term is just the ongoing incident wave packet; its center moves classically
with velocity $\mathbf{v}_0$ and its shape does not change.
• The second term is also a wave packet in the form of a spherical shell of flux moving
outward radially with velocity v0. This spherical wave packet only exists for t ≥ 0, i.e.
after the projectile and the target come close together.
Cross Sections
Now that we have the full wave functions in the asymptotic limit for all times t > 0, we
can determine the experimentally meaningful differential cross section for elastic scattering.
Remember that the concept of a differential cross section was introduced at the beginning
of the lectures on scattering, on pages 3-4 of these notes. The differential cross section can
be expressed verbally as
$$\frac{d\sigma}{d\Omega} = \frac{\text{outgoing flux per unit solid angle}}{\text{incident flux per unit area}}$$
Both the incident and outgoing fluxes can be obtained from the associated probability currents.
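Let
$$\begin{aligned}
\mathbf{j}_{in}(\mathbf{r},t) &= \text{prob. current associated with the incident wave packet} \\
&= \mathrm{Re}\left[\Psi^*_{in}(\mathbf{r},t)\,\frac{\hbar}{\mu i}\,\nabla\Psi_{in}(\mathbf{r},t)\right]
\end{aligned}$$
But
$$\begin{aligned}
\nabla\Psi_{in} &= \frac{1}{(2\pi)^{3/2}}\,\nabla\!\int a(\mathbf{k}-\mathbf{k}_0)\,e^{i\mathbf{k}\cdot\mathbf{r}}\,d\mathbf{k} \\
&= \frac{1}{(2\pi)^{3/2}}\int i\mathbf{k}\,a(\mathbf{k}-\mathbf{k}_0)\,e^{i\mathbf{k}\cdot\mathbf{r}}\,d\mathbf{k}
\end{aligned}$$
As before, we let $\mathbf{k}-\mathbf{k}_0 = \mathbf{q}$ and keep only the lowest terms in a series expansion in powers
of the small variable q. This gives
$$\nabla\Psi_{in}(\mathbf{r},t) \approx i\mathbf{k}_0\,\Psi_{in}(\mathbf{r},t)$$
Thus,
$$\begin{aligned}
\mathbf{j}_{in}(\mathbf{r},t) &= \frac{\hbar}{\mu}\,\mathbf{k}_0\,\Psi^*_{in}(\mathbf{r},t)\,\Psi_{in}(\mathbf{r},t) \\
&= \frac{\hbar\mathbf{k}_0}{\mu}\,|A(\mathbf{r}-\mathbf{v}_0 t)|^2 \\
&= \mathbf{v}_0\,|A(\mathbf{r}-\mathbf{v}_0 t)|^2
\end{aligned} \qquad (31)$$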
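The total incoming flux passing the target is
$$F_{in} = \int_{t_0}^{\infty} \hat{\mathbf{k}}_0\cdot\mathbf{j}_{in}(\mathbf{r}=0,t)\,dt$$
where I’ve used the fact that the wave packet is prepared at time $t_0$ and that the target is
by definition at r = 0. Thus,
$$F_{in} = v_0\int_{t_0}^{\infty} |A(-\mathbf{v}_0 t)|^2\,dt$$
Letting
$$\xi = -v_0 t$$
so that
$$d\xi = -v_0\,dt$$
and noting that
$$\mathbf{v}_0 = v_0\,\hat{\mathbf{k}}_0$$
we obtain that
$$F_{in} = -\int_{-v_0 t_0}^{-\infty} |A(\xi\hat{\mathbf{k}}_0)|^2\,d\xi$$
But from (17)
$$|\mathbf{z}_0| = -v_0 t_0$$
so that
$$F_{in} = -\int_{|\mathbf{z}_0|}^{-\infty} |A(\xi\hat{\mathbf{k}}_0)|^2\,d\xi = \int_{-\infty}^{|\mathbf{z}_0|} |A(\xi\hat{\mathbf{k}}_0)|^2\,d\xi$$
Finally, since $|\mathbf{z}_0|$ is very large compared to the length of the packet, we can replace $|\mathbf{z}_0|$ by
∞ in the upper limit of the integral. Thus, finally
$$F_{in} \approx \int_{-\infty}^{\infty} |A(\xi\hat{\mathbf{k}}_0)|^2\,d\xi \qquad (32)$$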
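Next we let
$$\mathbf{j}_{out}(\mathbf{r},t) = \text{prob. current associated with the scattered wave packet}$$
An analogous treatment in which we only keep the lowest order term in q gives
$$\mathbf{j}_{out}(\mathbf{r},t) = v_0\,\hat{\mathbf{r}}\,\frac{|f_{\mathbf{k}_0}(\hat{\mathbf{r}})|^2}{r^2}\,|A(\hat{\mathbf{k}}_0 r-\mathbf{v}_0 t)|^2$$
Since the area subtended by the solid angle dΩ at a distance r is $r^2 d\Omega$, the total radial flux
coming to the detector is
$$\begin{aligned}
F_{out} &= \lim_{r\to\infty}\int_0^{\infty} \hat{\mathbf{r}}\cdot r^2\,\mathbf{j}_{out}(\mathbf{r},t)\,dt \\
&= \lim_{r\to\infty} v_0\,|f_{\mathbf{k}_0}(\hat{\mathbf{r}})|^2\int_0^{\infty} |A(\hat{\mathbf{k}}_0 r-\mathbf{v}_0 t)|^2\,dt
\end{aligned}$$
Letting
$$\xi = r - v_0 t$$
so that
$$d\xi = -v_0\,dt$$
we obtain
$$\begin{aligned}
F_{out} &= -|f_{\mathbf{k}_0}(\hat{\mathbf{r}})|^2\lim_{r\to\infty}\int_{r}^{-\infty} |A(\xi\hat{\mathbf{k}}_0)|^2\,d\xi \\
&= -|f_{\mathbf{k}_0}(\hat{\mathbf{r}})|^2\int_{\infty}^{-\infty} |A(\xi\hat{\mathbf{k}}_0)|^2\,d\xi \\
&= |f_{\mathbf{k}_0}(\hat{\mathbf{r}})|^2\int_{-\infty}^{\infty} |A(\xi\hat{\mathbf{k}}_0)|^2\,d\xi
\end{aligned} \qquad (33)$$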
Finally,
$$\frac{d\sigma}{d\Omega} = \frac{F_{out}}{F_{in}} = |f_{\mathbf{k}_0}(\hat{\mathbf{r}})|^2 \qquad (34)$$
We see therefore that all wave packet aspects cancel out and the differential cross section
is given solely by the scattering amplitude associated with the “average” energy (or mo-
mentum). Thus, to describe an elastic scattering process we can disregard the wave packet
features and merely study the time-independent energy eigenstate with this average energy.
As a reminder, this depended on the fact that the scattering amplitude did not vary
rapidly over the momenta contained in the wave packet, which is only true as long as we are
not looking at a resonant scattering process.
The lab versus the CM frame
We have been discussing two-particle elastic scattering problems. As we know, the two-
particle Schrodinger equation can be reduced to an effective one-particle problem, with the
“particle” having the reduced mass of the two-particle system. And this is precisely what
we did. But of necessity, this means that we have been working in the center-of-mass (CM)
frame of the two-particle system. Thus, the information we have obtained referred to the
CM system, including for example the scattering amplitude. Thus, the differential cross
section we would obtain from it, using (34), is likewise in the CM frame, and is a function
of the CM scattering angles.
Experiments, however, are carried out in the laboratory frame of reference. As discussed
on pages 1-2, we have a projectile incident on a “stationary” target. The measured cross
sections will be in this frame of reference, as a function of the laboratory angles. How can we
obtain laboratory cross sections theoretically from the simpler-to-obtain CM cross sections?
I would now like to discuss this briefly.
Consider a particle of mass m1 and velocity v1 incident on a particle of mass m2, initially
at rest in the lab. Schematically this is illustrated in Fig. 2.
What is the velocity of the CM of this system? Denoting this as V , we see that
$$(m_1 + m_2)\,V = m_1 v_1$$
or that
$$V = \frac{m_1 v_1}{m_1 + m_2}$$
FIG. 2: Schematic illustration of elastic scattering kinematics in the lab frame (particle 1 incident with velocity v1 on particle 2 at rest, v2 = 0; after the collision they move with velocities v′1 and v′2, particle 1 emerging at angles (θlab, ϕlab))
Clearly, then, particle 1 is moving towards the CM with a velocity
$$V_1 = v_1 - V = \frac{m_2 v_1}{m_1 + m_2}$$
whereas particle 2 is moving towards the CM with velocity
$$V_2 = V = \frac{m_1 v_1}{m_1 + m_2}$$
After an elastic collision, the particles go off in the lab frame as also shown in Fig. 2. In
the CM frame, however, they go off in opposite directions and with the same velocities as
before the collision. This is shown schematically in Fig. 3.
Let’s now obtain a relationship between the scattering angles in the lab frame (θlab, ϕlab)
and those in the CM frame (θCM , ϕCM). To do this, let’s focus on the velocity of particle 1
in the lab frame, after the collision, which we denote v′1. As shown in Fig. 4 it is given by
the vector sum of its outgoing velocity in the CM frame V1 and the velocity of the CM
(which is always V ).
FIG. 3: Schematic illustration of elastic scattering kinematics in the CM frame (the two particles approach and leave the CM back to back, particle 1 emerging at angles (θCM , ϕCM))
We are assuming that the incident projectile is moving along the z-axis. Thus, the ϕ
angles are irrelevant and indeed
ϕCM = ϕlab
Only the θ angles change under transformation between the two frames.
From Fig. 4, we see further that
V + V1 cos θCM = v′1 cos θlab (35)
and
V1 sin θCM = v′1 sin θlab (36)
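FIG. 4: Schematic illustration of the velocity of particle 1 (v′1 is the vector sum of V, along the z-axis, and V1; the angles θCM and θlab are measured from the z-axis)
Dividing (36) by (35) eliminates v′1, yielding
$$\tan\theta_{lab} = \frac{V_1\sin\theta_{CM}}{V + V_1\cos\theta_{CM}} = \frac{\sin\theta_{CM}}{\gamma + \cos\theta_{CM}} \qquad (37)$$
where
$$\gamma = \frac{V}{V_1} = \frac{m_1}{m_2}$$
as can be readily shown from our earlier relations for V and V1.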
Note that in the limit m2 = ∞, this reduces to θlab = θCM , as it must. The two frames
are identical when the target is infinitely massive.
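Relation (37) is straightforward to evaluate numerically. The sketch below is a minimal illustration; the function name `theta_lab` is an assumption, and `atan2` is used so that the lab angle stays in the range 0 to π even when γ + cos θCM is negative.

```python
import math

def theta_lab(theta_cm, gamma):
    """Lab scattering angle from eq. (37); gamma = m1/m2, angles in radians."""
    return math.atan2(math.sin(theta_cm), gamma + math.cos(theta_cm))

# Equal masses (gamma = 1): tan(theta_lab) = tan(theta_CM / 2)
print(math.degrees(theta_lab(math.radians(90.0), 1.0)))  # 45 degrees
# Infinitely heavy target (gamma = 0): the two frames coincide
print(math.degrees(theta_lab(math.radians(90.0), 0.0)))  # 90 degrees
```

For equal masses the half-angle identity sin θ/(1 + cos θ) = tan(θ/2) gives θlab = θCM/2, so the projectile can never be scattered beyond 90° in the lab.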
Now that we know how to relate scattering angles, let’s turn to differential cross sections.
Clearly the number of particles scattered into a solid angle dΩlab around (θlab, ϕlab) must
be identical to the number scattered into the corresponding dΩCM around (θCM , ϕCM).
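Mathematically,
$$\left(\frac{d\sigma}{d\Omega}\right)_{lab}\sin\theta_{lab}\,d\theta_{lab}\,d\phi_{lab} = \left(\frac{d\sigma}{d\Omega}\right)_{CM}\sin\theta_{CM}\,d\theta_{CM}\,d\phi_{CM}$$
What we now want to do is to relate $(d\sigma/d\Omega)_{lab}$ to $(d\sigma/d\Omega)_{CM}$.
From the previous expression, we see that
$$\left(\frac{d\sigma}{d\Omega}\right)_{lab} = \frac{\sin\theta_{CM}\,d\theta_{CM}}{\sin\theta_{lab}\,d\theta_{lab}}\left(\frac{d\sigma}{d\Omega}\right)_{CM} \qquad (38)$$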
where I’ve made use of the fact that dϕCM = dϕlab, since ϕCM = ϕlab.
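To get the needed ratio, we return to (37)
$$\tan\theta_{lab} = \frac{\sin\theta_{CM}}{\gamma + \cos\theta_{CM}}$$
which we can rewrite as
$$\frac{\cos\theta_{lab}}{\sin\theta_{lab}} = \frac{\gamma + \cos\theta_{CM}}{\sin\theta_{CM}}$$
Let’s define the right hand side of this equation to be A, viz:
$$A = \frac{\gamma + \cos\theta_{CM}}{\sin\theta_{CM}}$$
Then
$$\frac{\cos^2\theta_{lab}}{\sin^2\theta_{lab}} = A^2$$
which can be readily solved for cos θlab. The result is
$$\cos\theta_{lab} = \frac{A}{\sqrt{1 + A^2}}$$
Taking differentials, we find that
$$-\sin\theta_{lab}\,d\theta_{lab} = \frac{d}{dA}\left[\frac{A}{\sqrt{1 + A^2}}\right]\frac{dA}{d\theta_{CM}}\,d\theta_{CM}$$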
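The two derivatives that enter are
$$\frac{d}{dA}\left[\frac{A}{\sqrt{1 + A^2}}\right] = \frac{1}{\sqrt{1 + A^2}} - \frac{1}{2}\,\frac{2A^2}{(1 + A^2)^{3/2}} = \frac{1}{(1 + A^2)^{3/2}}$$
and
$$\begin{aligned}
\frac{dA}{d\theta_{CM}} &= -\frac{1}{\sin\theta_{CM}}\,\sin\theta_{CM} - \frac{\gamma + \cos\theta_{CM}}{\sin^2\theta_{CM}}\,\cos\theta_{CM} \\
&= -\left[\frac{1}{\sin\theta_{CM}} + \frac{(\gamma + \cos\theta_{CM})\cos\theta_{CM}}{\sin^3\theta_{CM}}\right]\sin\theta_{CM}
\end{aligned}$$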
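Putting this all together, we finally arrive at the ratio needed for (38)
$$\frac{\sin\theta_{CM}\,d\theta_{CM}}{\sin\theta_{lab}\,d\theta_{lab}} = \frac{\left(1 + \gamma^2 + 2\gamma\cos\theta_{CM}\right)^{3/2}}{1 + \gamma\cos\theta_{CM}}$$
so that
$$\left(\frac{d\sigma}{d\Omega}\right)_{lab} = \frac{\left(1 + \gamma^2 + 2\gamma\cos\theta_{CM}\right)^{3/2}}{1 + \gamma\cos\theta_{CM}}\left(\frac{d\sigma}{d\Omega}\right)_{CM} \qquad (39)$$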
Thus, if we calculate a cross section in the CM frame by solving the reduced-mass
Schrodinger equation, we can then use (39) to calculate from it the corresponding lab cross
section, as needed to make contact with experiment.
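The conversion factor in (39) can be cross-checked numerically, since the ratio in (38) is just d(cos θCM)/d(cos θlab). The sketch below compares the closed form with a central-difference estimate built from (37); the function names, the step size and the test values of γ and θCM are illustrative assumptions.

```python
import math

def cos_theta_lab(theta_cm, gamma):
    """cos(theta_lab) implied by eq. (37), i.e. A / sqrt(1 + A^2)."""
    c = math.cos(theta_cm)
    return (gamma + c) / math.sqrt(1.0 + gamma**2 + 2.0 * gamma * c)

def jacobian_formula(theta_cm, gamma):
    """Factor multiplying the CM cross section in eq. (39)."""
    c = math.cos(theta_cm)
    return (1.0 + gamma**2 + 2.0 * gamma * c) ** 1.5 / (1.0 + gamma * c)

def jacobian_numeric(theta_cm, gamma, h=1e-5):
    """d(cos theta_CM) / d(cos theta_lab) estimated by central differences."""
    d_cos_cm = math.cos(theta_cm + h) - math.cos(theta_cm - h)
    d_cos_lab = cos_theta_lab(theta_cm + h, gamma) - cos_theta_lab(theta_cm - h, gamma)
    return d_cos_cm / d_cos_lab

gamma, theta_cm = 0.5, 1.0  # hypothetical mass ratio m1/m2 and CM angle in radians
print(jacobian_formula(theta_cm, gamma), jacobian_numeric(theta_cm, gamma))
```

The two numbers should agree to many digits for γ < 1. For γ > 1 the denominator 1 + γ cos θCM changes sign at some angle, and the formula should then be read with an absolute value.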
More formal aspects of scattering theory
Return to the time-independent Schrodinger equation
We showed in the last few lectures that by analyzing appropriate solutions of the time-
independent Schrodinger equation we can describe elastic scattering processes. We also
showed that by introducing the free-particle Green’s function we could obtain an integral
equation for these continuum solutions that incorporated the necessary asymptotic condi-
tions.
We shall now discuss how to cast the equations of elastic scattering into Dirac notation.
This will facilitate subsequent generalization and analysis of the equations.
The Green’s operator
As a reminder, the free-particle Green’s functions are defined so as to satisfy the equation
$$(E - H_0)\,G_0(\mathbf{r},\mathbf{r}') = \delta(\mathbf{r}-\mathbf{r}') \qquad (40)$$
where
$$H_0 = -\frac{\hbar^2}{2\mu}\nabla^2$$
is the free-particle hamiltonian.
There are two solutions, called $G_0^{\pm}(\mathbf{r},\mathbf{r}')$. The one with a plus superscript leads to
an outgoing spherical wave in the continuum solution of the Time Independent Schrodinger
Equation and the one with a minus superscript leads to a solution with an incoming spherical
wave.
Now let’s consider these two Green’s functions as the coordinate-space representations of
operators $G_0^{\pm}$ in abstract Hilbert space, viz:
$$G_0^{\pm}(\mathbf{r},\mathbf{r}') = \langle \mathbf{r}|G_0^{\pm}|\mathbf{r}'\rangle$$
I now claim that
$$G_0^{\pm} = \lim_{\eta\to 0}\,(E - H_0 \pm i\eta)^{-1}$$
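Let’s now see how we can prove this. Since $G_0^{\pm}$, as defined above, are operators in Hilbert
space, we can evaluate their matrix elements in any representation. Let’s do so in momentum
representation, i.e.,
$$\langle \mathbf{q}|G_0^{\pm}|\mathbf{q}'\rangle = \lim_{\eta\to 0}\,\langle \mathbf{q}|(E - H_0 \pm i\eta)^{-1}|\mathbf{q}'\rangle$$
But
$$H_0|\mathbf{q}'\rangle = \frac{\hbar^2 (q')^2}{2\mu}\,|\mathbf{q}'\rangle$$
and
$$E = \frac{\hbar^2 k^2}{2\mu}$$
Thus,
$$\langle \mathbf{q}|G_0^{\pm}|\mathbf{q}'\rangle = \lim_{\eta\to 0}\,\frac{1}{\frac{\hbar^2 k^2}{2\mu} - \frac{\hbar^2 q^2}{2\mu} \pm i\eta}\,\delta(\mathbf{q}-\mathbf{q}')$$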
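Knowing this, let’s now look at the matrix representation of the operators $G_0^{\pm}$ in coordinate
representation. We will indeed see that it is precisely the $G_0^{\pm}(\mathbf{r},\mathbf{r}')$ that we derived
earlier. This will then prove our assertion.
The coordinate representation of the operators $G_0^{\pm}$ can be expressed as
$$\langle \mathbf{r}|G_0^{\pm}|\mathbf{r}'\rangle = \int\!\!\int d\mathbf{q}\,d\mathbf{q}'\,\langle \mathbf{r}|\mathbf{q}\rangle\,\langle \mathbf{q}|G_0^{\pm}|\mathbf{q}'\rangle\,\langle \mathbf{q}'|\mathbf{r}'\rangle$$
where all I’ve done is to insert two identity operators,
$$I = \int d\mathbf{q}\,|\mathbf{q}\rangle\langle \mathbf{q}| \quad\text{and}\quad I = \int d\mathbf{q}'\,|\mathbf{q}'\rangle\langle \mathbf{q}'|$$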
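But we know all of the matrix elements and overlaps in the integrand. Putting them in,
we get
$$\begin{aligned}
\langle \mathbf{r}|G_0^{\pm}|\mathbf{r}'\rangle &= \lim_{\eta\to 0}\int\!\!\int d\mathbf{q}\,d\mathbf{q}'\,\frac{e^{i\mathbf{q}\cdot\mathbf{r}}}{(2\pi)^{3/2}}\,\frac{1}{\frac{\hbar^2 k^2}{2\mu} - \frac{\hbar^2 q^2}{2\mu} \pm i\eta}\,\delta(\mathbf{q}-\mathbf{q}')\,\frac{e^{-i\mathbf{q}'\cdot\mathbf{r}'}}{(2\pi)^{3/2}} \\
&= \frac{1}{(2\pi)^3}\lim_{\eta\to 0}\int d\mathbf{q}\,\frac{e^{i\mathbf{q}\cdot(\mathbf{r}-\mathbf{r}')}}{\frac{\hbar^2 k^2}{2\mu} - \frac{\hbar^2 q^2}{2\mu} \pm i\eta} \\
&= \frac{2\mu}{\hbar^2}\,\frac{1}{(2\pi)^3}\lim_{\epsilon\to 0}\int d\mathbf{q}\,\frac{e^{i\mathbf{q}\cdot(\mathbf{r}-\mathbf{r}')}}{k^2 - q^2 \pm i\epsilon}
\end{aligned}$$
where
$$\epsilon = \frac{2\mu}{\hbar^2}\,\eta$$
But on page 8, we showed that
$$g_0^{\pm}(\mathbf{r},\mathbf{r}') = \frac{1}{(2\pi)^3}\lim_{\epsilon\to 0}\int d\mathbf{q}\,\frac{e^{i\mathbf{q}\cdot(\mathbf{r}-\mathbf{r}')}}{k^2 - q^2 \pm i\epsilon}$$
Using this and (13), we see that
$$\langle \mathbf{r}|G_0^{\pm}|\mathbf{r}'\rangle = \frac{2\mu}{\hbar^2}\,g_0^{\pm}(\mathbf{r},\mathbf{r}') = G_0^{\pm}(\mathbf{r},\mathbf{r}')$$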
Thus, we have proven that these Green’s functions are indeed just the coordinate space representation of the operators
$$G_0^{\pm} = \lim_{\eta \to 0} (E - H_0 \pm i\eta)^{-1}$$
as advertised.
Having now found the explicit form of these so-called Green’s operators, let’s use them. In analogy with our earlier discussion, $G_0^{+}$ is the particularly interesting one, as it is the one whose coordinate representation produces scattering wave functions with outgoing spherical waves.
A quite useful form for this Green’s operator can be obtained by again inserting an identity operator $I = \int d\mathbf{q}\, | \mathbf{q} \rangle \langle \mathbf{q} |$, whence
$$G_0^{+} = \lim_{\eta \to 0} \int d\mathbf{q}\, \frac{| \mathbf{q} \rangle \langle \mathbf{q} |}{E - \frac{\hbar^2 q^2}{2\mu} + i\eta}$$
The denominator is now just a complex scalar and not the inverse of an operator.
The integral equation in Dirac notation
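The physical scattering wave function $\Psi^{+}_{\mathbf{k}}(\mathbf{r})$ was seen earlier to be a solution of the integral equation
$$\Psi^{+}_{\mathbf{k}}(\mathbf{r}) = \Phi_{\mathbf{k}}(\mathbf{r}) + \int d\mathbf{r}'\, G_0^{+}(\mathbf{r}, \mathbf{r}')\, V(\mathbf{r}')\, \Psi^{+}_{\mathbf{k}}(\mathbf{r}')$$
In Dirac notation this becomes
$$\begin{aligned}
\langle \mathbf{r} | \Psi^{+}_{\mathbf{k}} \rangle &= \langle \mathbf{r} | \Phi_{\mathbf{k}} \rangle + \int d\mathbf{r}'\, \langle \mathbf{r} | G_0^{+} | \mathbf{r}' \rangle \langle \mathbf{r}' | V | \Psi^{+}_{\mathbf{k}} \rangle \\
&= \langle \mathbf{r} | \Phi_{\mathbf{k}} \rangle + \langle \mathbf{r} | G_0^{+} V | \Psi^{+}_{\mathbf{k}} \rangle
\end{aligned}$$
Thus,
$$| \Psi^{+}_{\mathbf{k}} \rangle = | \Phi_{\mathbf{k}} \rangle + G_0^{+} V | \Psi^{+}_{\mathbf{k}} \rangle \qquad (41)$$
This integral equation in Hilbert space is very convenient for formal manipulations related to scattering. It is known as the Lippmann-Schwinger equation.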
The transition matrix (or T matrix)
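The scattering amplitude for a system with incident relative momentum $\mathbf{k}_a$ was given in eq. (12) in terms of the scaled potential $U$ as
$$f_{\mathbf{k}_a}(\hat{\mathbf{r}}) = -\frac{(2\pi)^{3/2}}{4\pi} \int e^{-i\mathbf{k}_b\cdot\mathbf{r}'}\, U(\mathbf{r}')\, \Psi^{+}_{\mathbf{k}_a}(\mathbf{r}')\, d\mathbf{r}'$$
This can be rewritten in terms of the full potential $V$ as
$$f_{\mathbf{k}_a}(\hat{\mathbf{r}}) = -\frac{\sqrt{2\pi}\,\mu}{\hbar^2} \int e^{-i\mathbf{k}_b\cdot\mathbf{r}'}\, V(\mathbf{r}')\, \Psi^{+}_{\mathbf{k}_a}(\mathbf{r}')\, d\mathbf{r}'$$
As a reminder, $\mathbf{k}_b = k_a \hat{\mathbf{r}}$, i.e. it has the same magnitude as $\mathbf{k}_a$ but is pointed along $\hat{\mathbf{r}}$.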
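The differential cross section can be obtained from the scattering amplitude by taking the absolute magnitude squared, i.e.
$$\frac{d\sigma}{d\Omega} = |f_{\mathbf{k}_a}(\hat{\mathbf{r}})|^2 = \frac{2\pi\mu^2}{\hbar^4} \left| \int e^{-i\mathbf{k}_b\cdot\mathbf{r}'}\, V(\mathbf{r}')\, \Psi^{+}_{\mathbf{k}_a}(\mathbf{r}')\, d\mathbf{r}' \right|^2 \qquad (42)$$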
As a reminder, this is the differential cross section in the CM system.
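Since $\frac{1}{(2\pi)^{3/2}} e^{i\mathbf{k}\cdot\mathbf{r}}$ is just the coordinate-space eigenfunction of $H_0$ with momentum $\mathbf{k}$, we denote
$$\frac{1}{(2\pi)^{3/2}}\, e^{i\mathbf{k}\cdot\mathbf{r}} = \langle \mathbf{r} | \mathbf{k} \rangle$$
so that
$$e^{-i\mathbf{k}_b\cdot\mathbf{r}'} = (2\pi)^{3/2}\, \langle \mathbf{k}_b | \mathbf{r}' \rangle$$
Similarly, we can write $\Psi^{+}_{\mathbf{k}_a}(\mathbf{r}')$ in Dirac notation as
$$\Psi^{+}_{\mathbf{k}_a}(\mathbf{r}') = \langle \mathbf{r}' | \Psi^{+}_a \rangle$$
where I’ve used the notation $| \Psi^{+}_a \rangle$ to denote the scattering state associated with incoming momentum $\mathbf{k}_a$.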
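Thus, equation (42) for the differential cross section can be rewritten as
$$\begin{aligned}
\frac{d\sigma}{d\Omega} &= \frac{(2\pi)^4 \mu^2}{\hbar^4} \left| \int d\mathbf{r}'\, \langle \mathbf{k}_b | \mathbf{r}' \rangle \langle \mathbf{r}' | V | \Psi^{+}_a \rangle \right|^2 \\
&= \frac{(2\pi)^4 \mu^2}{\hbar^4}\, | \langle \mathbf{k}_b | V | \Psi^{+}_a \rangle |^2
\end{aligned}$$
Definition: The transition operator $T$ is defined by the equation
$$T | \mathbf{k}_a \rangle = V | \Psi^{+}_a \rangle \qquad (43)$$
In terms of this new operator
$$\frac{d\sigma}{d\Omega} = \frac{(2\pi)^4 \mu^2}{\hbar^4}\, | \langle \mathbf{k}_b | T | \mathbf{k}_a \rangle |^2 \qquad (44)$$
The quantities $\langle \mathbf{k}_b | T | \mathbf{k}_a \rangle$ are called transition matrix elements or simply T matrix elements.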
Some properties of the transition operator T
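(1) $T = V + V G_0^{+} T$
Proof:
From equation (41),
$$| \Psi^{+}_a \rangle = | \mathbf{k}_a \rangle + G_0^{+} V | \Psi^{+}_a \rangle$$
Thus,
$$\langle \mathbf{k}_b | V | \Psi^{+}_a \rangle = \langle \mathbf{k}_b | V | \mathbf{k}_a \rangle + \langle \mathbf{k}_b | V G_0^{+} V | \Psi^{+}_a \rangle$$
Inserting (43) twice, we then obtain
$$\langle \mathbf{k}_b | T | \mathbf{k}_a \rangle = \langle \mathbf{k}_b | V | \mathbf{k}_a \rangle + \langle \mathbf{k}_b | V G_0^{+} T | \mathbf{k}_a \rangle$$
From this we confirm that
$$T = V + V G_0^{+} T$$
(2) Define $G^{\pm} = \lim_{\eta \to 0} (E - H \pm i\eta)^{-1}$, where $H$ is the full hamiltonian (i.e. $H = H_0 + V$). Then $T = V + V G^{+} V$.
You will be asked to prove this in a homework problem.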
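(3) Consider two eigenvectors $| \mathbf{k}_a \rangle$ and $| \mathbf{k}_b \rangle$ of $H_0$ with the same eigenvalue $E$ (thus $|\mathbf{k}_a| = |\mathbf{k}_b|$). Then
$$T_{ab} - T^{\dagger}_{ab} = -2\pi i \int d\mathbf{k}_n\, T_{an}\, T^{\dagger}_{nb}\, \delta(E_a - E_n)$$
Proof:
Consider
$$G^{+} = \lim_{\eta \to 0} \frac{1}{E - H + i\eta}$$
Then
$$(G^{+})^{\dagger} = \lim_{\eta \to 0} \frac{1}{E - H - i\eta} = G^{-}$$
since $H$ is hermitean.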
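Thus,
$$T = V + V G^{+} V$$
and
$$T^{\dagger} = V + V G^{-} V$$
where here too I have assumed that $V$ is hermitean.
Thus,
$$T - T^{\dagger} = V \left( G^{+} - G^{-} \right) V$$
Taking matrix elements of this equation gives
$$T_{ab} - T^{\dagger}_{ab} = \lim_{\eta \to 0} \langle \mathbf{k}_a | V \left[ \frac{1}{E - H + i\eta} - \frac{1}{E - H - i\eta} \right] V | \mathbf{k}_b \rangle$$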
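We now insert before the second $V$ a complete set of eigenvectors $| \Psi^{+}_n \rangle$ of $H$ with energy $E_n$ and incident momentum $\mathbf{k}_n$. Then using the fact that $H | \Psi^{+}_n \rangle = E_n | \Psi^{+}_n \rangle$, we obtain
$$T_{ab} - T^{\dagger}_{ab} = \lim_{\eta \to 0} \int d\mathbf{k}_n \left[ \frac{1}{E - E_n + i\eta} - \frac{1}{E - E_n - i\eta} \right] \langle \mathbf{k}_a | V | \Psi^{+}_n \rangle \langle \Psi^{+}_n | V | \mathbf{k}_b \rangle \qquad (45)$$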
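It is straightforward to convince yourselves that
$$\lim_{\eta \to 0} \left[ \frac{1}{E - E_n + i\eta} - \frac{1}{E - E_n - i\eta} \right] = -2\pi i\, \delta(E - E_n) \qquad (46)$$
by using the standard form for the Dirac Delta function
$$\delta(x) = \lim_{\eta \to 0} \frac{1}{\pi}\, \frac{\eta}{x^2 + \eta^2}$$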
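Inserting (46) into (45), we find that
$$T_{ab} - T^{\dagger}_{ab} = -2\pi i \int d\mathbf{k}_n\, \delta(E - E_n)\, \langle \mathbf{k}_a | V | \Psi^{+}_n \rangle \langle \Psi^{+}_n | V | \mathbf{k}_b \rangle$$
We now use the defining relation for $T$ and thus also $T^{\dagger}$ to rewrite this as
$$\begin{aligned}
T_{ab} - T^{\dagger}_{ab} &= -2\pi i \int d\mathbf{k}_n\, \delta(E - E_n)\, \langle \mathbf{k}_a | T | \mathbf{k}_n \rangle \langle \mathbf{k}_n | T^{\dagger} | \mathbf{k}_b \rangle \\
&= -2\pi i \int d\mathbf{k}_n\, \delta(E - E_n)\, T_{an}\, T^{\dagger}_{nb}
\end{aligned}$$
QED
This equation can be expressed in operator notation as
$$T - T^{\dagger} = -2\pi i\, T\, T^{\dagger} \qquad (47)$$
with the understanding, however, that when we insert a complete set of states between the two operators $T$ and $T^{\dagger}$, we only include states at the same energy.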
The scattering or S matrix - some necessary background
I would now like to introduce the scattering or S matrix. To provide the necessary
background, it is useful to first carry out some preliminary analysis.
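First, let’s consider the set of all state vectors $| \Psi^{+}_a \rangle$.
Theorem: The set of all vectors $| \Psi^{+}_a \rangle$ is orthonormal.
Proof:
$$| \Psi^{+}_a \rangle = | \Phi_a \rangle + G_0^{+} V | \Psi^{+}_a \rangle$$
Thus,
$$\left( 1 - G_0^{+} V \right) | \Psi^{+}_a \rangle = | \Phi_a \rangle$$
or
$$| \Psi^{+}_a \rangle = \left( 1 - G_0^{+} V \right)^{-1} | \Phi_a \rangle = | \Phi_a \rangle + G^{+} V | \Phi_a \rangle$$
where the last result derives from a homework assignment.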
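Thus,
$$\begin{aligned}
\langle \Psi^{+}_b | \Psi^{+}_a \rangle &= \langle \Psi^{+}_b | \Phi_a \rangle + \langle \Psi^{+}_b | (E_a - H + i\eta)^{-1} V | \Phi_a \rangle \\
&= \langle \Psi^{+}_b | \Phi_a \rangle + \langle \Psi^{+}_b | (E_a - E_b + i\eta)^{-1} V | \Phi_a \rangle
\end{aligned}$$
where for simplicity of notation I am now suppressing the $\lim_{\eta \to 0}$.
Now, since $(E_a - E_b + i\eta)^{-1}$ is a c-number, it can be moved to the right of $V$. We then note that
$$E_a | \Phi_a \rangle = H_0 | \Phi_a \rangle$$
Thus,
$$\langle \Psi^{+}_b | \Psi^{+}_a \rangle = \langle \Psi^{+}_b | \Phi_a \rangle + \langle \Psi^{+}_b | V (H_0 - E_b + i\eta)^{-1} | \Phi_a \rangle \qquad (48)$$
If we take the hermitean adjoint of the Lippmann-Schwinger equation (41), we find that
$$\langle \Psi^{+}_b | = \langle \Phi_b | + \langle \Psi^{+}_b | V (E_b - H_0 - i\eta)^{-1}$$
or
$$\langle \Phi_b | = \langle \Psi^{+}_b | - \langle \Psi^{+}_b | V (E_b - H_0 - i\eta)^{-1} \qquad (49)$$
Inserting (49) into (48), we find finally that
$$\langle \Psi^{+}_b | \Psi^{+}_a \rangle = \langle \Phi_b | \Phi_a \rangle = \delta_{ab}$$
QED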
Theorem: The set of all vectors $| \Psi^{-}_a \rangle$ is orthonormal.
Proof: Almost identical to the one above for the continuum solutions $| \Psi^{+}_a \rangle$ with outgoing spherical waves.
The two sets are, however, not complete if $H$ admits discrete states. If the discrete states are added to the set $| \Psi^{+}_a \rangle$ or the set $| \Psi^{-}_a \rangle$, the resulting sets are complete.
Introduction of the S matrix
From the preceding discussion, it is clear that the continuum solutions $| \Psi^{+}_a \rangle$ can be expressed as linear combinations of the solutions $| \Psi^{-}_a \rangle$. We define an operator $S$ which transforms from one set to the other, i.e.
$$S_{ab} = \langle \Psi^{-}_a | \Psi^{+}_b \rangle \qquad (50)$$
Note: Since the bound states are orthogonal to the continuum states, the expansion does
not require the discrete part of the set(s).
Important point: Since two eigenvectors of H belonging to different eigenvalues must be
orthogonal, S must be diagonal with respect to energy. This is in contrast to the T operator
which need not be.
We see therefore that the S matrix expands a continuum solution at a given energy with
outgoing spherical waves in terms of all of those at the same energy with incoming spherical
waves (and vice versa).
Connection between the S operator and the T operator
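Consider
$$S_{ba} = \langle \Psi^{-}_b | \Psi^{+}_a \rangle$$
As before, we use
$$| \Psi^{+}_a \rangle = | \Phi_a \rangle + G^{+} V | \Phi_a \rangle = | \Phi_a \rangle + (E_a - H + i\eta)^{-1} V | \Phi_a \rangle \qquad (51)$$
Thus,
$$\begin{aligned}
S_{ba} &= \langle \Psi^{-}_b | \Phi_a \rangle + \langle \Psi^{-}_b | (E_a - H + i\eta)^{-1} V | \Phi_a \rangle \\
&= \langle \Psi^{-}_b | \Phi_a \rangle + \langle \Psi^{-}_b | (E_a - E_b + i\eta)^{-1} V | \Phi_a \rangle \\
&= \langle \Psi^{-}_b | \Phi_a \rangle + \langle \Psi^{-}_b | V (E_a - E_b + i\eta)^{-1} | \Phi_a \rangle \\
&= \langle \Psi^{-}_b | \Phi_a \rangle - \langle \Psi^{-}_b | V (E_b - E_a - i\eta)^{-1} | \Phi_a \rangle \qquad (52)
\end{aligned}$$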
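If we now consider the Lippmann-Schwinger equation for $| \Psi^{-}_b \rangle$ and take its hermitean adjoint, we see that
$$\langle \Psi^{-}_b | = \langle \Phi_b | + \langle \Psi^{-}_b | V (E_b - H_0 + i\eta)^{-1}$$
Inserting this into the first term in (52) gives
$$\begin{aligned}
S_{ba} &= \langle \Phi_b | \Phi_a \rangle + \langle \Psi^{-}_b | V (E_b - H_0 + i\eta)^{-1} | \Phi_a \rangle - \langle \Psi^{-}_b | V (E_b - E_a - i\eta)^{-1} | \Phi_a \rangle \\
&= \langle \Phi_b | \Phi_a \rangle + \langle \Psi^{-}_b | V \left[ \frac{1}{E_b - E_a + i\eta} - \frac{1}{E_b - E_a - i\eta} \right] | \Phi_a \rangle
\end{aligned}$$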
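As in a previous theorem, we use the fact that
$$\lim_{\eta \to 0} \frac{1}{2\pi i} \left( \frac{1}{x - i\eta} - \frac{1}{x + i\eta} \right) = \lim_{\eta \to 0} \frac{1}{2\pi i} \left( \frac{2i\eta}{x^2 + \eta^2} \right) = \lim_{\eta \to 0} \frac{1}{\pi}\, \frac{\eta}{x^2 + \eta^2} = \delta(x)$$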
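Thus,
$$\begin{aligned}
S_{ba} &= \langle \Phi_b | \Phi_a \rangle - 2\pi i\, \langle \Psi^{-}_b | V | \Phi_a \rangle\, \delta(E_a - E_b) \\
&= \delta_{ab} - 2\pi i\, \langle \Phi_b | T | \Phi_a \rangle\, \delta(E_a - E_b) \\
&= \delta_{ab} - 2\pi i\, T_{ba}\, \delta(E_a - E_b) \qquad (53)
\end{aligned}$$
which in operator notation becomes
$$S = I - 2\pi i\, T \qquad (54)$$
as long as we restrict ourselves to states with the same energy.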
For calculations, it is useful to have (53) in a slightly different form, in which the operative delta function is in momentum rather than energy. To do so, we use the fact that
$$\delta(E_a - E_b) = \frac{\mu}{\hbar^2 k_a}\, \delta(k_a - k_b)$$
Then (53) can be rewritten as
$$S_{ba} = \delta_{ab} - \frac{2\pi \mu i}{k_a \hbar^2}\, T_{ba}\, \delta(k_a - k_b) \qquad (55)$$
The scattering amplitude in terms of S-matrix elements
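The scattering amplitude $f_{\mathbf{k}}(\hat{\mathbf{r}})$ can be expressed in terms of the T matrix elements $\langle \mathbf{k}' | T | \mathbf{k} \rangle$ according to [see eqs. (42) and (44)]
$$f_{\mathbf{k}}(\hat{\mathbf{r}}) = -\frac{(2\pi)^2 \mu}{\hbar^2}\, \langle \mathbf{k}' | T | \mathbf{k} \rangle$$
Using (55), we see that
$$\begin{aligned}
f_{\mathbf{k}}(\hat{\mathbf{r}}) &= -\frac{(2\pi)^2 \mu}{\hbar^2}\, \frac{k \hbar^2}{2\pi \mu i}\, \langle \mathbf{k}' | I - S | \mathbf{k} \rangle \\
&= -2\pi i k\, \langle \mathbf{k}' | S - I | \mathbf{k} \rangle \qquad (56)
\end{aligned}$$
Note that we are using here the S matrix elements that involve the momentum delta function because the states $| \mathbf{k} \rangle$ and $| \mathbf{k}' \rangle$ are in momentum representation.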
The cross section in terms of S matrix elements
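The differential cross section for scattering from $\mathbf{k}$ to $\mathbf{k}'$ can now be expressed in terms of the S-matrix elements as
$$\frac{d\sigma}{d\Omega} = |f_{\mathbf{k}}(\hat{\mathbf{r}})|^2 = 4\pi^2 k^2\, | \langle \mathbf{k}' | S - I | \mathbf{k} \rangle |^2 \qquad (57)$$
where as a reminder $|\mathbf{k}'| = |\mathbf{k}| = k$.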
Theorem: The S operator is unitary.
Proof:
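$$S S^{\dagger} = I - 2\pi i \left( T - T^{\dagger} \right) - 2\pi i\, (2\pi i)\, T T^{\dagger}$$
where I have used (54) for both $S$ and $S^{\dagger}$.
We now use (47) to rewrite this as
$$S S^{\dagger} = I - 2\pi i \left( -2\pi i\, T T^{\dagger} \right) - 2\pi i\, (2\pi i)\, T T^{\dagger} = I$$
QED
Rotational invariance and the S matrix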
The S-matrix elements we have been discussing so far were in a plane wave basis. Clearly
in such a basis, the S matrix is not diagonal; otherwise elastic scattering processes would
not occur.
If the potential that governs the scattering process is a central potential, it is clear that
the S-matrix elements cannot depend on the absolute orientation of k and k ′ but only on
their relative orientation. Then assuming that the particles have no spin (or intrinsic angular
momentum) we can carry out a partial wave expansion of the S-matrix elements, viz:
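$$\langle \mathbf{k}' | S | \mathbf{k} \rangle = \delta(k - k') \sum_{l=0}^{\infty} \frac{2l + 1}{4\pi k^2}\, S_l(k)\, P_l(\hat{\mathbf{k}} \cdot \hat{\mathbf{k}}') \qquad (58)$$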
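We can evaluate the coefficients $S_l(k)$ in this expansion by invoking the unitarity property of the S-matrix elements, i.e.
$$S S^{\dagger} = I$$
which in the plane wave representation becomes
$$\int d\mathbf{k}''\, \langle \mathbf{k}' | S | \mathbf{k}'' \rangle \langle \mathbf{k}'' | S^{\dagger} | \mathbf{k} \rangle = \delta(\mathbf{k} - \mathbf{k}') \qquad (59)$$
Inserting (58) into (59) we find that
$$\begin{aligned}
\delta(\mathbf{k} - \mathbf{k}') &= \int \delta(k' - k'')\, \delta(k - k'') \sum_{l l'} \frac{(2l + 1)(2l' + 1)}{16\pi^2 k^2 k'^2}\, S_l(k')\, S_{l'}^{*}(k'')\, P_l(\hat{\mathbf{k}}' \cdot \hat{\mathbf{k}}'')\, P_{l'}(\hat{\mathbf{k}} \cdot \hat{\mathbf{k}}'')\, d\mathbf{k}'' \\
&= \delta(k - k') \sum_{l l'} \frac{(2l + 1)(2l' + 1)}{16\pi^2 k^2}\, S_l(k)\, S_{l'}^{*}(k) \int P_l(\hat{\mathbf{k}}' \cdot \hat{\mathbf{k}}'')\, P_{l'}(\hat{\mathbf{k}} \cdot \hat{\mathbf{k}}'')\, d\hat{\mathbf{k}}''
\end{aligned}$$
where one factor of $k^{-2}$ was cancelled by the $k''^2$ factor in $d\mathbf{k}'' = k''^2\, dk''\, d\hat{\mathbf{k}}''$ and one momentum delta function by the integration over $dk''$.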
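To evaluate the remaining angular integral over $d\hat{\mathbf{k}}''$, we use the spherical harmonic addition theorem (see Merzbacher’s text, Quantum Mechanics, Third Edition, pg 251), which says that
$$P_k(\hat{\mathbf{r}}_1 \cdot \hat{\mathbf{r}}_2) = \frac{4\pi}{2k + 1} \sum_m Y_k^{m}(\hat{\mathbf{r}}_1)\, Y_k^{m*}(\hat{\mathbf{r}}_2)$$
Thus
$$\begin{aligned}
\int P_l(\hat{\mathbf{k}}' \cdot \hat{\mathbf{k}}'')\, P_{l'}(\hat{\mathbf{k}} \cdot \hat{\mathbf{k}}'')\, d\hat{\mathbf{k}}'' &= \frac{16\pi^2}{(2l + 1)(2l' + 1)} \sum_{m m'} \int Y_l^{m}(\hat{\mathbf{k}}')\, Y_l^{m*}(\hat{\mathbf{k}}'')\, Y_{l'}^{m'*}(\hat{\mathbf{k}})\, Y_{l'}^{m'}(\hat{\mathbf{k}}'')\, d\hat{\mathbf{k}}'' \\
&= \frac{16\pi^2}{(2l + 1)(2l' + 1)} \sum_{m m'} \delta_{l l'}\, \delta_{m m'}\, Y_l^{m}(\hat{\mathbf{k}}')\, Y_{l'}^{m'*}(\hat{\mathbf{k}}) \\
&= \frac{16\pi^2}{(2l + 1)(2l' + 1)}\, \delta_{l l'} \sum_m Y_l^{m}(\hat{\mathbf{k}}')\, Y_l^{m*}(\hat{\mathbf{k}}) \\
&= \frac{4\pi}{2l + 1}\, \delta_{l l'}\, P_l(\hat{\mathbf{k}}' \cdot \hat{\mathbf{k}})
\end{aligned}$$
where the last equality again followed from use of the spherical harmonic addition theorem.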
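Thus, all told,
$$\delta(\mathbf{k} - \mathbf{k}') = \delta(k - k') \sum_l \frac{2l + 1}{4\pi k^2}\, |S_l(k)|^2\, P_l(\hat{\mathbf{k}} \cdot \hat{\mathbf{k}}')$$
But the Dirac delta function has the well-known partial wave expansion
$$\delta(\mathbf{k} - \mathbf{k}') = \frac{\delta(k - k')}{k^2} \sum_l \frac{2l + 1}{4\pi}\, P_l(\hat{\mathbf{k}} \cdot \hat{\mathbf{k}}')$$
Thus, we immediately see that
$$|S_l(k)|^2 = 1 \qquad (60)$$
from which we can introduce the parametrization
$$S_l(k) = e^{2i\delta_l(k)} \qquad (61)$$
The quantity $\delta_l(k)$ appearing in (61) is referred to as the phase shift in the $l$th partial wave, for reasons that will become clear shortly.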
What phases are being shifted?
Let’s now discuss the physical significance of the term “phase shift” for δl(k).
Consider the full scattering wave function, assuming that the incident momentum is directed along the z-axis. Asymptotically as $r \to \infty$
$$\Psi^{+}_{\mathbf{k}}(\mathbf{r}) \to \frac{1}{(2\pi)^{3/2}} \left[ e^{ikz} + f_k(\theta)\, \frac{e^{ikr}}{r} \right]$$
Let’s look first at the plane wave part, which would be the wave function in the absence of a scattering potential. We can perform a partial wave expansion of $e^{ikz}$, yielding
$$e^{ikz} = \sum_{l=0}^{\infty} i^l\, (2l + 1)\, j_l(kr)\, P_l(\cos\theta)$$
Asymptotically as $r \to \infty$, the Bessel function behaves like
$$j_l(kr) \to \frac{1}{kr} \sin\left( kr - \frac{l\pi}{2} \right)$$
Thus, in the $l$th partial wave the plane wave behaves asymptotically like a sine function.
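What happens when we add the scattered wave that results from the presence of the potential? Now
$$e^{ikz} + f_k(\theta)\, \frac{e^{ikr}}{r} = e^{ikz} - 2\pi i k\, \langle \mathbf{k}' | S - I | \mathbf{k} \rangle\, \frac{e^{ikr}}{r}$$
We expand $e^{ikz}$ as earlier. For the S-matrix we use the expansion given in (58) and furthermore replace $S_l(k) \to e^{2i\delta_l(k)}$. Then
$$\begin{aligned}
e^{ikz} - 2\pi i k\, \langle \mathbf{k}' | S - I | \mathbf{k} \rangle\, \frac{e^{ikr}}{r} &\to \sum_l i^l (2l + 1)\, \frac{\sin\left( kr - \frac{l\pi}{2} \right)}{kr}\, P_l(\cos\theta) - 2\pi i k \sum_l \frac{2l + 1}{4\pi k^2} \left( e^{2i\delta_l} - 1 \right) \frac{e^{ikr}}{r}\, P_l(\cos\theta) \\
&= \sum_l i^l (2l + 1)\, \frac{P_l(\cos\theta)}{kr}\, e^{i\delta_l} \left[ \sin\left( kr - \frac{l\pi}{2} \right) e^{-i\delta_l} - i^{-l}\, \frac{i}{2} \left( e^{i\delta_l} - e^{-i\delta_l} \right) e^{ikr} \right]
\end{aligned}$$
Let’s now focus on the term in brackets. Using the fact that
$$e^{i\delta_l} - e^{-i\delta_l} = 2i \sin\delta_l$$
and that
$$i^{-l} = e^{-il\pi/2}$$
we can write the term in brackets as
$$\begin{aligned}
\left[ \ldots \right] &= \sin\left( kr - \frac{l\pi}{2} \right) e^{-i\delta_l} + \sin\delta_l\, e^{ikr}\, e^{-il\pi/2} \\
&= \sin\left( kr - \frac{l\pi}{2} \right) \left( \cos\delta_l - i \sin\delta_l \right) + \sin\delta_l \left[ \cos\left( kr - \frac{l\pi}{2} \right) + i \sin\left( kr - \frac{l\pi}{2} \right) \right] \\
&= \sin\left( kr - \frac{l\pi}{2} \right) \cos\delta_l + \cos\left( kr - \frac{l\pi}{2} \right) \sin\delta_l - i \left[ \sin\left( kr - \frac{l\pi}{2} \right) \sin\delta_l - \sin\left( kr - \frac{l\pi}{2} \right) \sin\delta_l \right] \\
&= \sin\left( kr - \frac{l\pi}{2} \right) \cos\delta_l + \cos\left( kr - \frac{l\pi}{2} \right) \sin\delta_l \\
&= \sin\left( kr - \frac{l\pi}{2} + \delta_l \right)
\end{aligned}$$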
where in the last equality I used the fact that $\sin(A + B) = \sin A \cos B + \cos A \sin B$.
We see therefore that the effect of the potential on the $l$th partial wave is to modify the phase of the asymptotic sine function from $kr - \frac{l\pi}{2}$ to $kr - \frac{l\pi}{2} + \delta_l$. Thus, $\delta_l$ represents the shift in the phase of the $l$th partial wave on passing through the complete region in which the potential acts.
An example
Let’s now discuss a specific example in which the phase shifts can be determined analytically. The example involves scattering by a hard sphere potential,
$$V(r) = \begin{cases} \infty & r \le r_0 \\ 0 & r > r_0 \end{cases}$$
This potential looks schematically as in figure 5.
FIG. 5: A hard-sphere potential
Clearly, in region I (0 < r < r0), the wave function must be identically zero.
Outside the region of the hard sphere, i.e. in region II, the Schrodinger equation is just
that of a free particle, for which the solution is
$$\Psi_{lm}(\mathbf{r}) \propto R_l(r)\, Y_l^{m}(\hat{\mathbf{r}})$$
with
$$R_l(r) = A_l\, j_l(kr) + B_l\, \eta_l(kr)$$
Here jl is the spherical Bessel function of order l and ηl the corresponding spherical Neumann
function of order l. [Note: If you haven’t seen it yet, you may look at pages 346-349 of
Shankar for a treatment of the free-particle Schrodinger equation in spherical coordinates.]
One of the two coefficients can be determined by the δ function normalization of this
continuum wave function. The second comes from the condition that the wave function
must be continuous at r = r0.
$$R^{I}_l(r_0) = 0$$
$$R^{II}_l(r_0) = A_l\, j_l(kr_0) + B_l\, \eta_l(kr_0)$$
Equating
$$R^{I}_l(r_0) = R^{II}_l(r_0)$$
then leads to
$$\frac{B_l}{A_l} = -\frac{j_l(kr_0)}{\eta_l(kr_0)} \qquad (62)$$
Bear in mind, however, that this is not an eigenvalue condition. For scattering, all energies $E = \frac{\hbar^2 k^2}{2\mu}$ are allowed.
Now we can look at the radial wave function in the $l$th partial wave (i.e. $R_l(r)$) asymptotically (i.e. for $r \to \infty$), as we must do to extract the phase shift.
$$\begin{aligned}
R_l(r) &= A_l\, j_l(kr) + B_l\, \eta_l(kr) \\
&\to \frac{1}{kr} \left[ A_l \sin\left( kr - \frac{l\pi}{2} \right) - B_l \cos\left( kr - \frac{l\pi}{2} \right) \right]
\end{aligned}$$
where I have used the known asymptotic forms of the spherical Bessel and Neumann functions, given for example on page 348 of Shankar.
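Using (62), we then find that
$$R_l(r) \to \frac{A_l}{kr} \left[ \sin\left( kr - \frac{l\pi}{2} \right) + \frac{j_l(kr_0)}{\eta_l(kr_0)} \cos\left( kr - \frac{l\pi}{2} \right) \right]$$
Defining
$$\tan X = \frac{j_l(kr_0)}{\eta_l(kr_0)}$$
we can rewrite this as
$$\begin{aligned}
R_l(r) &\to \frac{A_l}{kr \cos X} \left[ \sin\left( kr - \frac{l\pi}{2} \right) \cos X + \cos\left( kr - \frac{l\pi}{2} \right) \sin X \right] \\
&= \frac{A_l}{kr \cos X} \sin\left( kr - \frac{l\pi}{2} + X \right)
\end{aligned}$$
We see, therefore, that $X$ is precisely the phase shift of the $l$th partial wave.
But
$$X = \tan^{-1} \frac{j_l(kr_0)}{\eta_l(kr_0)}$$
so that
$$\delta_l(k) = \tan^{-1} \frac{j_l(kr_0)}{\eta_l(kr_0)}$$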
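As an illustration of this result, the hard-sphere phase shifts can be evaluated numerically. The sketch below is not from the original notes. It builds $j_l$ and $\eta_l$ by the standard upward recurrence, takes the principal branch of the arctangent, and then uses the standard partial-wave sum $\sigma = \frac{4\pi}{k^2} \sum_l (2l+1) \sin^2\delta_l$ for the total elastic cross section. The function names, the cutoff `l_max` and the sample values of $k$ and $r_0$ are illustrative assumptions.

```python
import math

def spherical_jn_nn(l_max, x):
    """Return lists [j_0..j_lmax] and [n_0..n_lmax] via upward recurrence.

    Upward recurrence is adequate for the small l used here; it loses
    accuracy for j_l when l is much larger than x.
    """
    j = [math.sin(x) / x]
    n = [-math.cos(x) / x]
    if l_max >= 1:
        j.append(math.sin(x) / x**2 - math.cos(x) / x)
        n.append(-math.cos(x) / x**2 - math.sin(x) / x)
    for l in range(1, l_max):
        j.append((2 * l + 1) / x * j[l] - j[l - 1])
        n.append((2 * l + 1) / x * n[l] - n[l - 1])
    return j, n

def hard_sphere_phase_shifts(k, r0, l_max):
    """delta_l = arctan(j_l(k r0) / n_l(k r0)), principal branch."""
    j, n = spherical_jn_nn(l_max, k * r0)
    return [math.atan(jl / nl) for jl, nl in zip(j, n)]

def total_cross_section(k, r0, l_max):
    """sigma = (4 pi / k^2) * sum_l (2l+1) sin^2(delta_l)."""
    deltas = hard_sphere_phase_shifts(k, r0, l_max)
    partial = sum((2 * l + 1) * math.sin(d) ** 2 for l, d in enumerate(deltas))
    return 4 * math.pi / k**2 * partial

# For l = 0 the formula reduces to delta_0 = -k r0 (for k r0 < pi/2).
print(hard_sphere_phase_shifts(0.5, 1.0, 2))
# At low energy the cross section, in units of 4 pi r0^2, approaches 1.
print(total_cross_section(0.01, 1.0, 2) / (4 * math.pi))
```

The low-energy limit printed in the last line reflects the fact that only the $l = 0$ wave contributes appreciably when $kr_0 \ll 1$.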
The optical theorem
We have already seen that unitarity of the S matrix is equivalent to the T-matrix relation
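$$T - T^{\dagger} = -2\pi i\, T T^{\dagger}$$
I shall now derive an important consequence of this relation, which can accordingly also be viewed as a consequence of S-matrix unitarity.
Taking matrix elements of the above relation in a plane wave basis
$$\langle \mathbf{k}_a | T | \mathbf{k}_b \rangle - \langle \mathbf{k}_a | T^{\dagger} | \mathbf{k}_b \rangle = -2\pi i \int d\mathbf{k}_n\, \delta(E_a - E_n)\, \langle \mathbf{k}_a | T | \mathbf{k}_n \rangle \langle \mathbf{k}_n | T^{\dagger} | \mathbf{k}_b \rangle$$
But, from earlier discussion [see (42) and (44)]
$$\langle \mathbf{k}_a | T | \mathbf{k}_b \rangle = -\frac{\hbar^2}{(2\pi)^2 \mu}\, f_{k_a}(\hat{\mathbf{k}}_a \cdot \hat{\mathbf{k}}_b)$$
Thus,
$$-\frac{\hbar^2}{(2\pi)^2 \mu} \left[ f_{k_a}(\hat{\mathbf{k}}_a \cdot \hat{\mathbf{k}}_b) - f^{*}_{k_a}(\hat{\mathbf{k}}_a \cdot \hat{\mathbf{k}}_b) \right] = -2\pi i\, \frac{\hbar^4}{(2\pi)^4 \mu^2} \int d\mathbf{k}_n\, f_{k_a}(\hat{\mathbf{k}}_a \cdot \hat{\mathbf{k}}_n)\, f^{*}_{k_b}(\hat{\mathbf{k}}_b \cdot \hat{\mathbf{k}}_n)\, \delta(E_a - E_n) \qquad (63)$$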
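Let’s now consider (63) for the case in which $\mathbf{k}_a = \mathbf{k}_b$, so that $\hat{\mathbf{k}}_a \cdot \hat{\mathbf{k}}_b = 1$, i.e. $\theta_{ab} = 0$. Then
$$f_{k_a}(0) - f^{*}_{k_a}(0) = 2\pi i\, \frac{\hbar^2}{(2\pi)^2 \mu} \int d\mathbf{k}_n\, |f_{k_a}(\hat{\mathbf{k}}_a \cdot \hat{\mathbf{k}}_n)|^2\, \delta(E_a - E_n) \qquad (64)$$
But
$$d\mathbf{k}_n = k_n^2\, dk_n\, d\hat{\mathbf{k}}_n$$
$$E_n = \frac{\hbar^2}{2\mu}\, k_n^2$$
$$dE_n = \frac{\hbar^2}{\mu}\, k_n\, dk_n$$
Thus,
$$k_n^2\, dk_n = \frac{\mu}{\hbar^2}\, k_n\, dE_n = \frac{\mu}{\hbar^2} \sqrt{\frac{2\mu}{\hbar^2}}\, \sqrt{E_n}\, dE_n$$
Finally,
$$\begin{aligned}
\int d\mathbf{k}_n\, |f_{k_a}(\hat{\mathbf{k}}_a \cdot \hat{\mathbf{k}}_n)|^2\, \delta(E_a - E_n) &= \frac{\mu}{\hbar^2} \sqrt{\frac{2\mu}{\hbar^2}}\, \sqrt{E_a} \int d\hat{\mathbf{k}}_n\, |f_{k_a}(\hat{\mathbf{k}}_a \cdot \hat{\mathbf{k}}_n)|^2 \\
&= \frac{\mu}{\hbar^2}\, k_a \int d\hat{\mathbf{k}}_n\, |f_{k_a}(\hat{\mathbf{k}}_a \cdot \hat{\mathbf{k}}_n)|^2
\end{aligned}$$
Plugging this into (64) gives
$$f_{k_a}(0) - f^{*}_{k_a}(0) = -\frac{1}{2\pi i}\, k_a \int d\hat{\mathbf{k}}_n\, |f_{k_a}(\hat{\mathbf{k}}_a \cdot \hat{\mathbf{k}}_n)|^2 \qquad (65)$$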
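Now let’s define
$$\sigma = \int d\hat{\mathbf{k}}_n\, |f_{k_a}(\hat{\mathbf{k}}_a \cdot \hat{\mathbf{k}}_n)|^2$$
Clearly $\sigma$ is the total cross section for elastic scattering for an incoming plane wave of momentum $\mathbf{k}_a$. For any complex number $A$, we know that $A - A^{*} = 2i\, \mathrm{Im}\, A$. We can thus rewrite (65) as
$$2i\, \mathrm{Im}\, f_{k_a}(0) = -\frac{1}{2\pi i}\, k_a\, \sigma$$
or
$$\sigma = \frac{4\pi}{k_a}\, \mathrm{Im}\, f_{k_a}(0) \qquad (66)$$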
This relation is referred to as the Optical Theorem, in analogy with the process of light
passing through a medium, for which the imaginary part of the complex index of refraction
is related to the total absorption cross section.
Since fka(0) is the forward scattering amplitude, what the optical theorem is expressing
is the fact that all of the flux that is removed from the forward direction (i.e. from the
beam direction) goes into a scattering process. Thus, we see a close link between S-matrix
unitarity and flux conservation, which we’ll indeed confirm shortly.
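The optical theorem (66) can be checked numerically for any set of real phase shifts. The following sketch is not from the original notes. It builds the elastic amplitude from the partial-wave form $f(\theta) = \frac{1}{k} \sum_l (2l+1)\, e^{i\delta_l} \sin\delta_l\, P_l(\cos\theta)$, which follows from (56), (58) and (61), integrates $|f|^2$ over solid angle, and compares with $\frac{4\pi}{k} \mathrm{Im}\, f(0)$. The wave number and phase shifts are arbitrary hypothetical values chosen only for the check.

```python
import cmath
import math

def legendre(l_max, x):
    """Return [P_0(x), ..., P_lmax(x)] from the Bonnet recurrence."""
    p = [1.0, x]
    for l in range(1, l_max):
        p.append(((2 * l + 1) * x * p[l] - l * p[l - 1]) / (l + 1))
    return p[: l_max + 1]

def amplitude(k, deltas, x):
    """f = (1/k) sum_l (2l+1) e^{i delta_l} sin(delta_l) P_l(x), x = cos(theta)."""
    p = legendre(len(deltas) - 1, x)
    return sum((2 * l + 1) * cmath.exp(1j * d) * math.sin(d) * p[l]
               for l, d in enumerate(deltas)) / k

def sigma_by_integration(k, deltas, n=20_000):
    """2 pi times the integral of |f|^2 over cos(theta), midpoint rule."""
    h = 2.0 / n
    total = sum(abs(amplitude(k, deltas, -1.0 + (i + 0.5) * h)) ** 2
                for i in range(n))
    return 2 * math.pi * total * h

def sigma_by_optical_theorem(k, deltas):
    """(4 pi / k) times the imaginary part of the forward amplitude."""
    return 4 * math.pi / k * amplitude(k, deltas, 1.0).imag

k = 1.3                      # hypothetical wave number
deltas = [0.7, -0.3, 0.1]    # hypothetical phase shifts for l = 0, 1, 2
print(sigma_by_integration(k, deltas), sigma_by_optical_theorem(k, deltas))
```

The two numbers agree only because the phase shifts are real, i.e. because $|S_l| = 1$; this is the numerical counterpart of the link between unitarity and flux conservation noted above.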
Generalization to complex collision processes
Up to now we have restricted our discussion to elastic scattering processes. Such processes
are characterized by the fact that the internal structure of the projectile and the target are
unaffected by the collision, so that the relative energy (i.e. the magnitude of the relative
momentum) of the two is unchanged in the collision. What is changed in an elastic collision
is the direction of the relative momentum vector. Our restriction to elastic scattering was
implicit in our use of the asymptotic wave function
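$$\Psi^{+}_{\mathbf{k}} \to \frac{1}{(2\pi)^{3/2}} \left[ e^{i\mathbf{k}\cdot\mathbf{r}} + f_{\mathbf{k}}(\hat{\mathbf{r}})\, \frac{e^{ikr}}{r} \right]$$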
for which the spherically outgoing scattered wave has the same k as the incident plane wave.
I would now like to discuss briefly how to generalize this to a wider variety of collision
processes. I will still consider two-body collisions only, in which a “projectile” a strikes a
“target” A and emerging from the collision are two objects b and B. [Note: Either the pro-
jectile or the target or both can be complex (many-particle) objects with internal structure.]
Such processes can be expressed either as
a+ A → b+B
or
A(a, b)B
Some examples are:
1. Elastic scattering A(a, a)A
2. Inelastic scattering A(a, a′)A∗
Here A∗ is an excited state of the target and a′ represents the projectile with appro-
priately diminished energy.
3. Rearrangement collisions A(a, b)B
Here what emerges from the collision are two different particles.
An example might be the nuclear reaction p + (Z,A) → d + (Z,A − 1), in which a
neutron in the target nucleus attaches itself to the incident proton, which then leaves
the reaction as a deuteron. The target which started with Z protons and A − Z
neutrons now has Z protons and A − Z − 1 neutrons. Note: This is often called a
pickup reaction, since the projectile picks up a neutron from the target.
For a given incident projectile and target at a given relative energy, each of the possi-
ble two–fragment states defines a two-body reaction channel. Those which are allowed by
conservation of energy are called open channels. Those which cannot satisfy energy conser-
vation are called closed channels. And as you might imagine, there can also in principle be
many-body channels with more than two fragments emerging.
Up to now, our analysis has focussed on elastic scattering only, which is of course always
an open channel. Furthermore, we have only discussed processes in which either the two
fragments carried no angular momentum (or spin) of their own or in which their intrinsic
angular momenta were irrelevant so that they could be ignored. My generalization will
permit other reaction processes to occur, but will still be limited to cases in which the
intrinsic angular momentum can be ignored. Further generalization to include angular
momentum and/or intrinsic spin is feasible, but notationally very complex. Furthermore, as
suggested above we will only be considering two-body channels.
Collisions of “Spinless” fragments
We shall let $\chi_s$ denote the intrinsic wave function in channel $s$ and we shall assume that $\chi_s$ carries no intrinsic angular momentum. Then a process in which the incident channel is $s$ can be described by a wave function with the asymptotic form
$$\frac{1}{\sqrt{v_s}}\, e^{i k_s z_s}\, \chi_s + \sum_t f_{st}(\hat{\mathbf{r}}_t)\, \frac{e^{i k_t r_t}}{\sqrt{v_t}\, r_t}\, \chi_t \qquad (67)$$
This is the appropriate generalization of the asymptotic wave function we introduced earlier when only considering elastic collisions. The factors $1/\sqrt{v_s}$ and $1/\sqrt{v_t}$ are introduced so that the incident and outgoing components all contain unit flux (see pages 17-19). We also assume here that the incident (relative) momentum is $\mathbf{k}_s = k_s \hat{\mathbf{z}}$, i.e. along the z-axis. Finally, the sum over $t$ goes over all open channels.
In this formalism, the differential cross section for a process in which $s$ denotes the incident channel and $t$ the final channel is
$$\frac{d\sigma}{d\Omega}(s \to t) = |f_{st}(\hat{\mathbf{r}})|^2 \qquad (68)$$
As we have often seen, there are many continuum solutions of the Time Independent
Schrodinger Equation at the same energy. Our choice of (67) was dictated by the physical
boundary conditions of the scattering problem. There is another particularly interesting
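one, which I will call $\Phi^{l}_s$. It is by definition the solution which has the asymptotic form
$$\Phi^{l}_s \to \frac{1}{\sqrt{v_s}}\, \frac{e^{-i(k_s r_s - \frac{l\pi}{2})}}{r_s}\, \chi_s\, P_l(\cos\theta_s) - \sum_t S^{l}_{st}\, \frac{1}{\sqrt{v_t}}\, \frac{e^{i(k_t r_t - \frac{l\pi}{2})}}{r_t}\, \chi_t\, P_l(\cos\theta_t) \qquad (69)$$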
This solution contains an incoming spherical wave in channel s and outgoing spherical waves
in all open channels. In contrast to the solution (67), this solution is also an angular
momentum eigenstate and thus reflects the spherical symmetry of the problem. The reason
that the physical solution (67) was not an angular momentum eigenstate was that the
physical problem had a preferred direction in space, namely the direction of the incident
(relative) momentum vector ks.
Equation (69) defines the S-matrix elements in the $l$th partial wave for the various allowed collision processes $s \to t$. We will soon confirm that it is indeed the appropriate generalization of the $S_l(k)$ coefficients introduced on pages 36-37 and that $S^{l}_{ss}$ is identical to the $S_l(k)$ introduced there.
Now let’s consider how to relate measurable quantities, e.g. cross sections, to these
S-matrix elements. To do this, we note that
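$$\begin{aligned}
e^{i k_s z_s} &= \sum_{l=0}^{\infty} i^l (2l + 1)\, j_l(k_s r_s)\, P_l(\cos\theta_s) \\
&\to \sum_{l=0}^{\infty} i^l (2l + 1)\, \frac{\sin\left( k_s r_s - \frac{l\pi}{2} \right)}{k_s r_s}\, P_l(\cos\theta_s) \\
&= \sum_{l=0}^{\infty} i^l (2l + 1)\, \frac{e^{i(k_s r_s - \frac{l\pi}{2})} - e^{-i(k_s r_s - \frac{l\pi}{2})}}{2 i k_s r_s}\, P_l(\cos\theta_s) \qquad (70)
\end{aligned}$$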
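With the above result in mind, let’s now consider the following superposition of the angular momentum eigenstates $\Phi^{l}_s$ given in (69):
$$\begin{aligned}
\Psi_s &= \sum_{l=0}^{\infty} i^{l+1}\, \frac{2l + 1}{2 k_s}\, \Phi^{l}_s \\
&\to \frac{1}{\sqrt{v_s}} \sum_l i^l (2l + 1)\, \frac{e^{i(k_s r_s - \frac{l\pi}{2})} - e^{-i(k_s r_s - \frac{l\pi}{2})}}{2 i k_s r_s}\, \chi_s\, P_l(\cos\theta_s) - \sum_l \frac{2l + 1}{2 k_s}\, i^{l+1} \sum_t \left( S^{l}_{st} - \delta_{st} \right) \frac{e^{i(k_t r_t - \frac{l\pi}{2})}}{\sqrt{v_t}\, r_t}\, \chi_t\, P_l(\cos\theta_t)
\end{aligned}$$
where the $\delta_{st}$ term was included in the last line to balance the extra term that was added on the previous line.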
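Using (70), we can rewrite this as
$$\Psi_s = \frac{1}{\sqrt{v_s}}\, e^{i k_s z_s}\, \chi_s - \sum_t \frac{1}{\sqrt{v_t}}\, \frac{e^{i k_t r_t}}{r_t}\, \chi_t \left[ \sum_l \frac{(2l + 1)\, i^{l+1}}{2 k_s} \left( S^{l}_{st} - \delta_{st} \right) e^{-i\frac{l\pi}{2}}\, P_l(\cos\theta_t) \right] \qquad (71)$$
This is indeed the linear combination of the $\Phi^{l}_s$ angular momentum eigenstates that has the physical asymptotic behavior (67). Thus, we can read off of it the scattering amplitudes by comparing (71) with (67), which gives
$$f_{st}(\hat{\mathbf{r}}_t) = -\sum_l \frac{(2l + 1)\, i^{l+1}}{2 k_s} \left( S^{l}_{st} - \delta_{st} \right) e^{-i\frac{l\pi}{2}}\, P_l(\cos\theta_t)$$
Since $i^{l+1} e^{-il\pi/2} = i$, the overall factor multiplying each term has unit modulus and drops out of the cross section.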
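Finally, the differential cross section for the process $s \to t$ can be obtained from the corresponding scattering amplitude with the result being (see (68))
$$\frac{d\sigma}{d\Omega}(s \to t) = \left| \sum_l \frac{2l + 1}{2 k_s} \left( S^{l}_{st} - \delta_{st} \right) P_l(\cos\theta_t) \right|^2 \qquad (72)$$
In particular,
$$\frac{d\sigma}{d\Omega}(s \to s) = \frac{1}{4 k_s^2} \left| \sum_l (2l + 1) \left( S^{l}_{ss} - 1 \right) P_l(\cos\theta_s) \right|^2 \qquad (73)$$
and
$$\frac{d\sigma}{d\Omega}(s \to t \ne s) = \frac{1}{4 k_s^2} \left| \sum_l (2l + 1)\, S^{l}_{st}\, P_l(\cos\theta_t) \right|^2 \qquad (74)$$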
We have now reparametrized the cross sections for all two-body processes in terms of the associated S-matrix elements for each partial wave. To calculate the cross sections, we of course need to know the $S^{l}_{st}$, which in general we do not since they require that we solve the entire Schrodinger equation, which is, to say the least, difficult. However, we can use conservation theorems to gain some insight into the S-matrix elements without ever fully solving the problem. Let’s see how this is done.
1. Conservation of flux in $\Phi^{l}_s$, the $l$th partial wave:
Incoming flux $= 1$
Outgoing flux $= \sum_t |S^{l}_{st}|^2$
If there are no sources or sinks in the problem, then the incoming and outgoing fluxes in each partial wave must be the same, i.e.
$$\sum_t |S^{l}_{st}|^2 = 1 \qquad (75)$$
We can now derive from this several important conclusions:
1. If only elastic scattering is energetically allowed, then the sum over $t$ reduces to a single term $t = s$, and then
$$|S^{l}_{ss}|^2 = 1 \qquad (76)$$
This is just the same as eq. (60). We have thus confirmed that in the one-channel elastic scattering problem, the $S^{l}_{ss}$ coefficients we just introduced in our generalized formalism are the same as the $S_l(k)$ coefficients introduced in our earlier pure elastic scattering formalism when we carried out a partial wave decomposition. As in the earlier discussion, we can on the basis of (76) parametrize
$$S^{l}_{ss} = e^{2i\delta_l}$$
where $\delta_l$ is the elastic scattering phase shift in the $l$th partial wave. Thus, we now see that the applicability of phase shift analysis to elastic scattering is a direct consequence of flux conservation.
2. The flux conservation equation (75) furthermore shows that
$$|S^{l}_{st}|^2 \le 1$$
for every $s$, $t$.
3. If we integrate eq. (72) over all solid angles, we find that the total cross section for the process $s \to t$ is
$$\sigma(s \to t) = \frac{\pi}{k_s^2} \sum_l (2l + 1)\, |S^{l}_{st} - \delta_{st}|^2$$
We next define
$$\sigma_l(s \to t) = \frac{\pi}{k_s^2}\, (2l + 1)\, |S^{l}_{st} - \delta_{st}|^2$$
to be the total cross section in the $l$th partial wave.
Clearly $\sigma_l(s \to s)$ is largest when $S^{l}_{ss} = -1$, in which case
(a) $\sigma_l(s \to s) = \frac{4(2l + 1)\pi}{k_s^2}$
(b) $S^{l}_{st} = 0$ for all $t \ne s$
(c) $\sigma_l(s \to t) = 0$ for all $t \ne s$
Similarly $\sigma_l(s \to t \ne s)$ is largest when $|S^{l}_{st}| = 1$, in which case
(a) $\sigma_l(s \to t) = \frac{(2l + 1)\pi}{k_s^2}$
(b) $\sigma_l(s \to s) = \frac{(2l + 1)\pi}{k_s^2}$
(c) $\sigma_l(s \to t') = 0$ for all $t' \ne s, t$
From this, we see that as long as there is a non-elastic process, so that some $S^{l}_{st} \ne 0$ with $t \ne s$, then there must be some elastic scattering also taking place in that partial wave.
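The two limiting cases listed above can be reproduced with a few lines of code. The sketch below is not from the original notes. It evaluates $\sigma_l(s \to t) = \frac{\pi}{k_s^2}(2l+1)|S^{l}_{st} - \delta_{st}|^2$ for a hypothetical two-channel problem in which flux conservation (75) fixes $|S^{l}_{st}| = \sqrt{1 - |S^{l}_{ss}|^2}$. The function names, the parameter `eta` (the modulus of $S^{l}_{ss}$) and the chosen phases are illustrative assumptions.

```python
import cmath
import math

def partial_cross_section(k_s, l, s_element, elastic):
    """sigma_l(s -> t) = (pi / k_s^2) (2l+1) |S^l_st - delta_st|^2."""
    delta_st = 1.0 if elastic else 0.0
    return math.pi / k_s**2 * (2 * l + 1) * abs(s_element - delta_st) ** 2

def two_channel_row(eta, phase_ss, phase_st):
    """Hypothetical row (S_ss, S_st) of a two-channel S matrix with |S_ss| = eta.

    Flux conservation, sum_t |S_st|^2 = 1, fixes |S_st| = sqrt(1 - eta^2).
    """
    s_ss = eta * cmath.exp(1j * phase_ss)
    s_st = math.sqrt(1.0 - eta * eta) * cmath.exp(1j * phase_st)
    return s_ss, s_st

k_s, l = 1.0, 0
for eta in (1.0, 0.5, 0.0):
    s_ss, s_st = two_channel_row(eta, math.pi, 0.0)
    # Columns: |S_ss|, elastic partial cross section, non-elastic partial cross section.
    print(eta,
          partial_cross_section(k_s, l, s_ss, elastic=True),
          partial_cross_section(k_s, l, s_st, elastic=False))
```

With `eta = 1` and phase $\pi$ the elastic value is $4\pi(2l+1)/k_s^2$ and the non-elastic one vanishes, while with `eta = 0` both equal $\pi(2l+1)/k_s^2$, so maximal absorption is always accompanied by elastic scattering.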
As indicated earlier, this more general formalism can be extended even more generally -
albeit with enormous complication - to two-body processes with intrinsic spin and even to
more complex many-body channels.
Approximation techniques in scattering theory
As in bound-state problems, quantum mechanical scattering problems are very rarely
exactly solvable, making approximate methods of solution critical. We shall discuss two such
methods: (a) the Plane Wave Born Approximation (PWBA) and (b) the Distorted Wave
Born Approximation (DWBA). Both are essentially applications of perturbation theory to
scattering problems. We shall for simplicity only discuss elastic scattering of “spinless”
particles.
1. The Plane Wave Born Approximation
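Elastic scattering from an incident (relative) momentum $\mathbf{k}$ to a final (relative) momentum $\mathbf{k}'$ is described in terms of a differential cross section
$$\frac{d\sigma}{d\Omega}(\theta) = |f_{\mathbf{k}}(\theta)|^2$$
where
$$\cos\theta = \hat{\mathbf{k}} \cdot \hat{\mathbf{k}}'$$
and the scattering amplitude $f_{\mathbf{k}}(\theta)$ is given by
$$f_{\mathbf{k}}(\theta) = -\frac{\sqrt{2\pi}\,\mu}{\hbar^2} \int e^{-i\mathbf{k}'\cdot\mathbf{r}'}\, V(\mathbf{r}')\, \Psi^{+}_{\mathbf{k}}(\mathbf{r}')\, d\mathbf{r}' \qquad (77)$$
and where the full scattering wave function $\Psi^{+}_{\mathbf{k}}(\mathbf{r})$ satisfies the integral equation
$$\Psi^{+}_{\mathbf{k}}(\mathbf{r}) = \frac{1}{(2\pi)^{3/2}}\, e^{i\mathbf{k}\cdot\mathbf{r}} + \int G_0^{+}(\mathbf{r}, \mathbf{r}')\, V(\mathbf{r}')\, \Psi^{+}_{\mathbf{k}}(\mathbf{r}')\, d\mathbf{r}' \qquad (78)$$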
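Inserting (78) into (77) gives
$$\begin{aligned}
f_{\mathbf{k}}(\theta) = &-\frac{\mu}{2\pi\hbar^2} \int e^{-i\mathbf{k}'\cdot\mathbf{r}'}\, V(\mathbf{r}')\, e^{i\mathbf{k}\cdot\mathbf{r}'}\, d\mathbf{r}' \\
&-\frac{\sqrt{2\pi}\,\mu}{\hbar^2} \int\!\!\int e^{-i\mathbf{k}'\cdot\mathbf{r}'}\, V(\mathbf{r}')\, G_0^{+}(\mathbf{r}', \mathbf{r}'')\, V(\mathbf{r}'')\, \Psi^{+}_{\mathbf{k}}(\mathbf{r}'')\, d\mathbf{r}'\, d\mathbf{r}'' \qquad (79)
\end{aligned}$$
We could again insert the integral equation (78) into the second term of (79) and this would give
$$\begin{aligned}
f_{\mathbf{k}}(\theta) = &-\frac{\mu}{2\pi\hbar^2} \int e^{-i\mathbf{k}'\cdot\mathbf{r}'}\, V(\mathbf{r}')\, e^{i\mathbf{k}\cdot\mathbf{r}'}\, d\mathbf{r}' \\
&-\frac{\mu}{2\pi\hbar^2} \int\!\!\int e^{-i\mathbf{k}'\cdot\mathbf{r}'}\, V(\mathbf{r}')\, G_0^{+}(\mathbf{r}', \mathbf{r}'')\, V(\mathbf{r}'')\, e^{i\mathbf{k}\cdot\mathbf{r}''}\, d\mathbf{r}'\, d\mathbf{r}'' \\
&-\frac{\sqrt{2\pi}\,\mu}{\hbar^2} \int\!\!\int\!\!\int e^{-i\mathbf{k}'\cdot\mathbf{r}'}\, V(\mathbf{r}')\, G_0^{+}(\mathbf{r}', \mathbf{r}'')\, V(\mathbf{r}'')\, G_0^{+}(\mathbf{r}'', \mathbf{r}''')\, V(\mathbf{r}''')\, \Psi^{+}_{\mathbf{k}}(\mathbf{r}''')\, d\mathbf{r}'\, d\mathbf{r}''\, d\mathbf{r}''' \qquad (80)
\end{aligned}$$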
By successively inserting for $\Psi^{+}_{\mathbf{k}}$ the full integral equation (78) we generate an infinite series of terms for the scattering amplitude. Each successive term has an extra potential $V$, an extra free-particle Green’s function $G_0^{+}$, and an extra three-dimensional integral. This series is called the Born Series. Keeping only the first term in the series is called the 1st order Born Approximation, or more correctly the 1st order Plane Wave Born Approximation (PWBA). Keeping more terms gives rise to higher-order PWBA’s.
The Born series in operator language
The transition operator $T$ satisfies the integral equation (see property 1 on page 30)
$$T = V + V G_0^{+} T \qquad (81)$$
Inserting the complete integral equation (81) for the operator $T$ appearing on the right hand side gives
$$T = V + V G_0^{+} V + V G_0^{+} V G_0^{+} T$$
By successively replacing the $T$ on the right hand side by the full integral equation, we generate an operator series
$$T = V + V G_0^{+} V + V G_0^{+} V G_0^{+} V + V G_0^{+} V G_0^{+} V G_0^{+} V + \ldots$$
The various orders of PWBA are obtained by evaluating $\langle \mathbf{k}' | T | \mathbf{k} \rangle$ with successively more terms in the series expansion for $T$. For example, 1st order PWBA is obtained by approximating
$$\langle \mathbf{k}' | T | \mathbf{k} \rangle \approx \langle \mathbf{k}' | V | \mathbf{k} \rangle$$
First-order PWBA
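The first-order PWBA gives rise to a scattering amplitude for elastic scattering of the form
$$\begin{aligned}
f_{\mathbf{k}}(\theta) &= -\frac{\mu}{2\pi\hbar^2} \int e^{-i\mathbf{k}'\cdot\mathbf{r}'}\, V(\mathbf{r}')\, e^{i\mathbf{k}\cdot\mathbf{r}'}\, d\mathbf{r}' \\
&= -\frac{\mu}{2\pi\hbar^2} \int e^{i\mathbf{q}\cdot\mathbf{r}'}\, V(\mathbf{r}')\, d\mathbf{r}'
\end{aligned}$$
where
$$\mathbf{q} = \mathbf{k} - \mathbf{k}'$$
is the momentum that is transferred in the elastic collision.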
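Since $V(r')$ is spherically symmetric, the angular integration is done straightforwardly. We choose a coordinate system in which
$$\mathbf{q} = q\, \hat{\mathbf{z}}'$$
Thus,
$$\begin{aligned}
f_{\mathbf{k}}(\theta) &= -\frac{\mu}{2\pi\hbar^2} \int_0^{2\pi}\!\! \int_{-1}^{1}\!\! \int_0^{\infty} e^{i q r' \cos\theta'}\, V(r')\, r'^2\, dr'\, d(\cos\theta')\, d\phi' \\
&= -\frac{2\pi\mu}{2\pi\hbar^2} \int_0^{\infty} \left[ \frac{1}{i q r'}\, e^{i q r' \cos\theta'} \right]_{-1}^{1} V(r')\, r'^2\, dr' \\
&= -\frac{\mu}{\hbar^2} \int_0^{\infty} \frac{e^{i q r'} - e^{-i q r'}}{i q r'}\, V(r')\, r'^2\, dr'
\end{aligned}$$
Finally,
$$f_{\mathbf{k}}(\theta) = -\frac{2\mu}{\hbar^2 q} \int_0^{\infty} \sin(q r')\, V(r')\, r'\, dr' \qquad (82)$$
Note that all of the $\theta$ dependence of the scattering amplitude is contained in $q$. More specifically,
$$q = |\mathbf{k} - \mathbf{k}'| = \sqrt{k^2 + k'^2 - 2 k k' \cos\theta}$$
For elastic scattering $k = k'$, so that
$$q = k \sqrt{2 - 2\cos\theta}$$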
An example
Consider elastic scattering of an electron by a neutral atom with atomic charge $Z$ via a screened Coulomb potential of the form
$$V(r) = -\left( \frac{Z e^2}{r} \right) e^{-r/a}$$
where $a$ is the range of the potential.
Then in first-order PWBA,
$$f_{\mathbf{k}}(\theta) = \frac{2\mu Z e^2}{\hbar^2 q} \int_0^{\infty} \sin(q r')\, e^{-r'/a}\, dr'$$
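We write
$$\sin(q r') = \frac{1}{2i} \left( e^{i q r'} - e^{-i q r'} \right)$$
so that
$$\begin{aligned}
f_{\mathbf{k}}(\theta) &= \frac{2\mu Z e^2}{2 i \hbar^2 q} \left[ \int_0^{\infty} e^{i q r' - \frac{r'}{a}}\, dr' - \int_0^{\infty} e^{-i q r' - \frac{r'}{a}}\, dr' \right] \\
&= \frac{\mu Z e^2}{i \hbar^2 q} \left[ \frac{1}{i q - \frac{1}{a}} \left( e^{i q r' - \frac{r'}{a}} \right)_0^{\infty} - \frac{1}{-i q - \frac{1}{a}} \left( e^{-i q r' - \frac{r'}{a}} \right)_0^{\infty} \right] \\
&= \frac{\mu Z e^2}{i \hbar^2 q} \left[ \frac{1}{i q - \frac{1}{a}}\, (-1) - \frac{1}{-i q - \frac{1}{a}}\, (-1) \right] \\
&= \frac{\mu Z e^2}{i \hbar^2 q}\; \frac{i q + \frac{1}{a} + i q - \frac{1}{a}}{q^2 + \left( \frac{1}{a} \right)^2} \\
&= \frac{2\mu Z e^2}{\hbar^2 \left( q^2 + \frac{1}{a^2} \right)}
\end{aligned}$$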
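The closed form just obtained can be cross-checked by doing the radial integral numerically. The following sketch is not from the original notes. It evaluates $\frac{2\mu Z e^2}{\hbar^2 q} \int_0^\infty \sin(qr')\, e^{-r'/a}\, dr'$ with the composite Simpson rule and compares it with $\frac{2\mu Z e^2}{\hbar^2 (q^2 + 1/a^2)}$. Units with $\mu = Z = e^2 = \hbar = 1$, the cutoff at $40a$ and the grid size are illustrative assumptions.

```python
import math

def born_amplitude_numeric(q, a, mu=1.0, Z=1.0, e2=1.0, hbar=1.0, n=100_000):
    """Evaluate (2 mu Z e^2 / (hbar^2 q)) * int_0^inf sin(q r) exp(-r/a) dr.

    The upper limit is cut off at 40 a, where exp(-r/a) is negligible,
    and the integral is done with the composite Simpson rule (n even).
    """
    r_max = 40.0 * a
    h = r_max / n

    def g(r):
        return math.sin(q * r) * math.exp(-r / a)

    total = g(0.0) + g(r_max)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(i * h)
    return 2 * mu * Z * e2 / (hbar**2 * q) * total * h / 3

def born_amplitude_closed(q, a, mu=1.0, Z=1.0, e2=1.0, hbar=1.0):
    """Closed form 2 mu Z e^2 / (hbar^2 (q^2 + 1/a^2))."""
    return 2 * mu * Z * e2 / (hbar**2 * (q * q + 1.0 / (a * a)))

for q in (0.3, 1.0, 4.0):
    print(q, born_amplitude_numeric(q, a=2.0), born_amplitude_closed(q, a=2.0))
```

The same numerical routine could be pointed at any other central potential by replacing the integrand with $\sin(qr')\, V(r')\, r'$ from (82), which is useful when no closed form is available.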
The differential cross section in PWBA for elastic scattering by the screened potential is
given by the square of the magnitude of the scattering amplitude, namely by
dσ
dΩ(θ) =
4µ2Z2e4
h4(q2 + 1a2)2
As noted earlier, all θ dependence is contained in q2. Using the fact that q2 = 2k2(1 −
cos θ), we find thatdσ
dΩ(θ) =
4µ2Z2e4
h4(2k2(1− cos θ) + 1a2)2
But
1− cos θ = 2sin2 θ
2
so thatdσ
dΩ(θ) =
4µ2Z2e4
h4(4k2 sin2 θ
2+ 1
a2
)2In the limit that a → ∞, the screened Coulomb potential reduces to the pure Coulomb
potential between two point charges, one with charge e and the other with charge Ze. In
this case, the differential cross section becomes
dσ
dΩ(θ) =
µ2Z2e4
h4
1
4k4 sin4 θ2
Noting that
hk = p
52
this can be rewritten asdσ
dΩ(θ) =
µ2Z2e4
4p2cosec4
θ
2
which is exactly the same result as the Rutherford cross section obtained from classical
scattering theory. It is, in fact, also possible to do an exact Quantum Mechanical calculation,
rather than a first-order PWBA calculation. As noted by Shankar on page 531, the exact
QM calculation also leads to the same Rutherford formula.
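The approach to the Rutherford limit can be illustrated numerically. The following is a minimal sketch, assuming units in which the lumped constant `strength` = µZe²/ħ² is 1; the function names are hypothetical and chosen only for this illustration.

```python
import math

def screened_cross_section(theta, k, a, strength=1.0):
    """First-order PWBA cross section for the screened Coulomb potential.

    strength stands for mu*Z*e^2/hbar^2 (illustrative lumped constant).
    """
    q_squared = 4.0 * k * k * math.sin(theta / 2.0) ** 2
    return 4.0 * strength ** 2 / (q_squared + 1.0 / (a * a)) ** 2

def rutherford_cross_section(theta, k, strength=1.0):
    """The a -> infinity limit: strength^2 / (4 k^4 sin^4(theta/2))."""
    return strength ** 2 / (4.0 * k ** 4 * math.sin(theta / 2.0) ** 4)

if __name__ == "__main__":
    theta, k = 1.0, 2.0
    for a in (1.0, 10.0, 100.0, 1.0e6):
        ratio = screened_cross_section(theta, k, a) / rutherford_cross_section(theta, k)
        print(a, ratio)  # the ratio tends to 1 as the screening range grows
```

Note that for finite a the screened result stays finite in the forward direction, whereas the Rutherford expression diverges as θ → 0.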
When should the Born approximation be applicable?
We would like to generate criteria from which to determine whether the first-order PWBA
is appropriate for a given problem. To do this, we first remember that the full scattering
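wave function corresponding to an incident relative energy ħ²k²/2µ satisfies
$$\Psi^+_{\mathbf{k}}(\mathbf{r}) = \frac{1}{(2\pi)^{3/2}}\, e^{i\mathbf{k}\cdot\mathbf{r}} - \frac{\mu}{2\pi\hbar^2} \int \frac{e^{ik|\mathbf{r}-\mathbf{r}'|}}{|\mathbf{r}-\mathbf{r}'|}\, V(r')\, \Psi^+_{\mathbf{k}}(\mathbf{r}')\, d\mathbf{r}'$$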
In PWBA, only the plane wave term is retained. For all r, the plane wave term has a
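magnitude of $1/(2\pi)^{3/2}$. First-order PWBA is thus justified if the remaining term (which I
will denote v(r)) is much smaller in magnitude than $1/(2\pi)^{3/2}$ in the region of the potential.
Mathematically, this condition is
$$|v(\mathbf{r})| = \frac{\mu}{2\pi\hbar^2} \left|\int \frac{e^{ik|\mathbf{r}-\mathbf{r}'|}}{|\mathbf{r}-\mathbf{r}'|}\, V(r')\, \Psi^+_{\mathbf{k}}(\mathbf{r}')\, d\mathbf{r}'\right| \ll \frac{1}{(2\pi)^{3/2}}$$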
We shall estimate |v(r)| at r = 0. Since V (r) is usually strongest at r = 0, this is the
place where |v(r)| should be largest.
We shall estimate |v(0)| using first-order PWBA. Then
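$$|v(0)| \approx \frac{\mu}{2\pi\hbar^2}\, \frac{1}{(2\pi)^{3/2}} \left|\int \frac{e^{ikr'}}{r'}\, V(r')\, e^{i\mathbf{k}\cdot\mathbf{r}'}\, d\mathbf{r}'\right| = \frac{\mu}{(2\pi)^{5/2}\hbar^2}\, |w(0)|$$
with
$$w(0) = 2\pi \int_0^\infty\!\int_{-1}^{1} e^{ikr'}\, V(r')\, e^{ikr'\cos\theta'}\, r'\, dr'\, d(\cos\theta') = \frac{2\pi}{ik} \int_0^\infty \left(e^{2ikr'} - 1\right) V(r')\, dr'$$
In terms of w(0), the first-order PWBA is justified if
$$\frac{\mu}{2\pi\hbar^2}\, |w(0)| \ll 1$$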
We now consider a prototype potential with range a and depth V0 of the form
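$$V(r') = -V_0\, e^{-r'/a}$$
Though this potential is hardly general, it will suffice to provide some feel for the conditions
under which first-order PWBA is justified. For this potential,
$$w(0) = -\frac{2\pi V_0}{ik} \int_0^\infty \left(e^{2ikr'} - 1\right) e^{-r'/a}\, dr'$$
$$= \frac{2\pi i V_0}{k} \left[-\frac{1}{2ik - \frac{1}{a}} - a\right]$$
$$= \frac{2\pi i V_0}{k} \left[\frac{-2ika}{2ik - \frac{1}{a}}\right]$$
Thus
$$|w(0)| = \frac{2\pi V_0}{k}\, \frac{2ka}{\sqrt{4k^2 + \frac{1}{a^2}}} = \frac{4\pi V_0 a^2}{\sqrt{4k^2a^2 + 1}}$$
Thus, the condition defining the validity of first-order PWBA becomes
$$\frac{2\mu V_0 a^2}{\hbar^2\sqrt{4k^2a^2 + 1}} \ll 1$$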
We now consider two cases:
1. low energies: ka ≪ 1
Then
$$\frac{2\mu V_0 a^2}{\hbar^2} \ll 1$$
or
$$V_0 \ll \frac{\hbar^2}{2\mu a^2}$$
Thus, for low-energy scattering, first-order PWBA can be applied only if the potential
is sufficiently weak.
2. high energies: ka ≫ 1
Now the PWBA applicability condition becomes
$$\frac{2\mu V_0 a^2}{2\hbar^2 ka} \ll 1$$
or
$$V_0 \ll \frac{\hbar^2}{\mu a^2}\, ka$$
If ka is sufficiently large, this condition will be satisfied for any reasonable depth.
Thus, at sufficiently high energies, first-order PWBA is usually appropriate.
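The two limits can be checked against the general criterion with a few lines of code. This is only a sketch for the prototype exponential potential, assuming units with µ = ħ = 1 by default; `pwba_parameter` is a hypothetical helper name.

```python
import math

def pwba_parameter(V0, a, k, mu=1.0, hbar=1.0):
    """Dimensionless quantity 2*mu*V0*a^2 / (hbar^2 * sqrt(4 k^2 a^2 + 1)).

    First-order PWBA is expected to be reasonable when this is much less than 1.
    """
    return 2.0 * mu * V0 * a * a / (hbar ** 2 * math.sqrt(4.0 * k * k * a * a + 1.0))

if __name__ == "__main__":
    V0, a = 1.0, 1.0
    for k in (0.01, 0.1, 1.0, 10.0, 100.0):
        low = 2.0 * V0 * a * a        # ka << 1 limit of the parameter
        high = V0 * a / k             # ka >> 1 limit of the parameter
        print(k, pwba_parameter(V0, a, k), low, high)
```

For a fixed depth and range, the parameter saturates at 2µV0a²/ħ² at low energy and falls off like 1/k at high energy, in line with the two cases above.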
In summary, first-order PWBA can be applied to elastic scattering processes if (a) the
incident energy is sufficiently high, or (b) the potential is sufficiently weak. More detailed
criteria would require explicit consideration of the form of the scattering potential for the
problem of interest.
Scattering from two potentials
PWBA is equivalent to treating the full interaction potential V (r) by perturbation theory.
If the criteria we just discussed do not apply, such a perturbative expansion in powers of V
will not be useful.
Under such circumstances, it is often useful to decompose V into two parts, V1 and V2,
where scattering due to V1 can be treated exactly whereas scattering due to V2 cannot. If
this decomposition is made appropriately, it might be possible to treat scattering due to V2
using perturbation theory.
The above approach is similar in spirit to bound-state perturbation theory, in which the
total hamiltonian is decomposed into a part H0 that can be treated exactly and another
part V or H1 that can be treated as a perturbation.
A brief aside - An alternative form for Tba
We have customarily expressed the matrix elements of T in the form
$$\langle k_b|T|k_a\rangle = \langle k_b|V|\Psi_a^+\rangle$$
where $|\Psi_a^+\rangle$ satisfies the integral equation
$$|\Psi_a^+\rangle = |k_a\rangle + G_0^+ V |\Psi_a^+\rangle$$
I would now like to show you that the same T matrix elements can be written alternatively
as
$$\langle k_b|T|k_a\rangle = \langle \Psi_b^-|V|k_a\rangle \qquad (83)$$
where $|\Psi_b^-\rangle$ satisfies the integral equation
$$|\Psi_b^-\rangle = |k_b\rangle + G_0^- V |\Psi_b^-\rangle \qquad (84)$$
so that the state vector $|\Psi_b^-\rangle$ appearing in (84) contains spherically incoming waves. Note
that such state vectors were introduced on pages 33-34 in the context of our S-matrix
discussion.
To prove that the T matrix elements can be expressed in this alternative (but equivalent)
form, consider
$$\langle k_b|T|k_a\rangle = \langle k_b|V + VG^+V|k_a\rangle = \langle k_b|V|k_a\rangle + \langle k_b|VG^+V|k_a\rangle$$
Thus, if (83) is to be satisfied, $\langle\Psi_b^-|$ must be given by
$$\langle\Psi_b^-| = \langle k_b| + \langle k_b|VG^+ \qquad (85)$$
Let’s now prove that this is equivalent to (84).
To do so, we first rewrite (85) as
$$\langle\Psi_b^-| = \langle k_b|\,(I + VG^+) \qquad (86)$$
I now claim that
$$I + VG^+ = (I - VG_0^+)^{-1}$$
To prove this, consider
$$(I + VG^+)(I - VG_0^+) = I + V(G^+ - G_0^+) - VG^+VG_0^+$$
But in an earlier homework problem, we showed that
$$G^+ - G_0^+ = G^+ V G_0^+$$
so that
$$(I + VG^+)(I - VG_0^+) = I + VG^+VG_0^+ - VG^+VG_0^+ = I$$
QED
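The operator identity just proven can be spot-checked in a finite-dimensional toy model. The sketch below replaces the operators by 2×2 matrices and takes G0 = (E − H0)⁻¹ and G = (E − H0 − V)⁻¹ with a complex E standing in for E + iη; the matrices, the energy and all helper names are illustrative assumptions, not anything taken from the notes.

```python
IDENTITY = [[1.0, 0.0], [0.0, 1.0]]

def mat_mul(A, B):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def mat_add(A, B, scale=1.0):
    """Return A + scale * B for 2x2 matrices."""
    return [[A[i][j] + scale * B[i][j] for j in range(2)] for i in range(2)]

def mat_inv(A):
    """Inverse of a 2x2 matrix via the adjugate formula."""
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    return [[A[1][1] / det, -A[0][1] / det], [-A[1][0] / det, A[0][0] / det]]

def resolvent_identity_error(H0, V, E):
    """Largest entry of |(I + V G)(I - V G0) - I| for the toy resolvents."""
    E_minus_H0 = mat_add([[E * x for x in row] for row in IDENTITY], H0, -1.0)
    G0 = mat_inv(E_minus_H0)                      # free Green's function
    G = mat_inv(mat_add(E_minus_H0, V, -1.0))     # full Green's function
    left = mat_add(IDENTITY, mat_mul(V, G))
    right = mat_add(IDENTITY, mat_mul(V, G0), -1.0)
    product = mat_mul(left, right)
    return max(abs(product[i][j] - IDENTITY[i][j]) for i in range(2) for j in range(2))

if __name__ == "__main__":
    H0 = [[0.3, 0.0], [0.0, 1.1]]
    V = [[0.2, 0.5], [0.5, -0.4]]
    print(resolvent_identity_error(H0, V, 0.7 + 0.05j))  # rounding-level deviation
```

A matrix check of this kind does not replace the proof, but it is a cheap way to catch sign or ordering mistakes in operator manipulations.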
Operating to the right on equation (86) with $(I - VG_0^+)$ and using the result just proven,
we find that
$$\langle\Psi_b^-|\,(I - VG_0^+) = \langle k_b|$$
which can be rewritten as
$$\langle\Psi_b^-| - \langle\Psi_b^-|VG_0^+ = \langle k_b|$$
Thus,
$$\langle\Psi_b^-| = \langle k_b| + \langle\Psi_b^-|VG_0^+$$
Taking the Hermitean adjoint of this equation and noting that $(G_0^+)^\dagger = G_0^-$, we obtain
$$|\Psi_b^-\rangle = |k_b\rangle + G_0^- V|\Psi_b^-\rangle$$
which is identical to (84), as we set out to prove.
Return to scattering by two potentials
Now consider two state vectors
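$$|\chi_a^+\rangle = |k_a\rangle + G_0^+ V_1 |\chi_a^+\rangle \qquad (87)$$
and
$$|\chi_b^-\rangle = |k_b\rangle + G_0^- V_1 |\chi_b^-\rangle \qquad (88)$$
which describe the scattering due to V1 alone.
Then,
$$\langle k_b|T|k_a\rangle = \langle k_b|V_1 + V_2|\Psi_a^+\rangle$$
$$= \langle\chi_b^-|V_1 + V_2|\Psi_a^+\rangle - \langle\chi_b^-|V_1 G_0^+ (V_1 + V_2)|\Psi_a^+\rangle$$
where I have used the hermitean adjoint of (88).
But
$$|\Psi_a^+\rangle = |k_a\rangle + G_0^+ (V_1 + V_2)|\Psi_a^+\rangle$$
or
$$G_0^+ (V_1 + V_2)|\Psi_a^+\rangle = |\Psi_a^+\rangle - |k_a\rangle$$
Thus,
$$\langle k_b|T|k_a\rangle = \langle\chi_b^-|V_1 + V_2|\Psi_a^+\rangle - \langle\chi_b^-|V_1|\Psi_a^+\rangle + \langle\chi_b^-|V_1|k_a\rangle$$
$$= \langle\chi_b^-|V_2|\Psi_a^+\rangle + \langle\chi_b^-|V_1|k_a\rangle$$
But
$$\langle\chi_b^-|V_1|k_a\rangle = \langle k_b|T_1|k_a\rangle$$
i.e. the T matrix element associated with scattering by potential V1 alone.
All told,
$$\langle k_b|T|k_a\rangle = \langle k_b|T_1|k_a\rangle + \langle\chi_b^-|V_2|\Psi_a^+\rangle \qquad (89)$$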
If we know how to treat V1 exactly, we can calculate its T matrix elements, < kb|T1| ka >.
We still need to treat the second term, however, which involves the scattering due to V2.
We discuss how this might be done below under appropriate circumstances.
The relationship (89) is called the Gell-Mann Goldberger relation.
The Distorted Wave Born Approximation
If V2 is appropriately weak compared to V1, then it is reasonable to approximate $|\Psi_a^+\rangle$
by $|\chi_a^+\rangle$, the scattering wave function due to V1 alone. Then (89) becomes
$$\langle k_b|T|k_a\rangle = \langle k_b|T_1|k_a\rangle + \langle\chi_b^-|V_2|\chi_a^+\rangle \qquad (90)$$
This is known as the first-order Distorted Wave Born Approximation or sometimes just
the Distorted Wave Born Approximation (DWBA). It is the first term in a perturbation
series expansion in powers of V2. Higher-order DWBA approximations can be systematically
generated by using the appropriate integral equation relating $|\Psi_a^+\rangle$ and $|\chi_a^+\rangle$.
The philosophy of the DWBA is that the distortion of the incoming plane wave due to V1
is treated exactly whereas scattering effects due to the weaker V2 are treated perturbatively.
A simple application of DWBA - elastic scattering of an electron from a nucleus
Choose
V1 = Coulomb interaction between electron and a point nucleus = $-Ze^2/r$
Then choose
V2 = V − V1 = modifications due to finite size of nucleus
The scattering due to V1 can be treated exactly using the simple Rutherford scattering
formula given earlier, but that due to V2 cannot. But it is expected to be fairly weak.
Thus, we treat the scattering due to V2 using perturbation theory. I will not go through
the detailed analysis here, as I merely wanted to illustrate the basic ideas and uses of the
DWBA.
Let me close by mentioning that both PWBA and DWBA ideas can be used not only in
treating elastic scattering but also in treating more complex collision processes. But this is
for another course.
A review of Identical Particles
The next major topic in the course will be Second Quantization, which as we will see is
a way to deal with systems of many identical particles.
An introduction to identical particles was already presented in PHYS610 and is discussed
in some detail in Chapter 10 of Shankar on pages 260-277. If you haven’t already read this,
you should.
Rather than assume that all of you are familiar with the notion of identical particles and
their description in Quantum Mechanics, I thought I would briefly review some of the key
points that were made in PHYS610. Following that, I will also briefly discuss how to modify
our formalism of scattering theory to accommodate identical particles.
Let’s begin by reminding you of why we have to take specific care of identical particles
in Quantum Mechanics. To do so, let’s consider the scattering (either elastic or inelastic)
of an electron from a hydrogen atom. Experimentally, we detect an outgoing electron. In
doing so, we are faced with a dilemma. Is the electron that we detect the incident projectile
or is it the electron that was originally bound in the hydrogen atom? The mere fact that
we detect an electron is not sufficient to answer this question. This is a consequence of the
fact that the electrons in question are identical particles and thus indistinguishable.
Classically, we can answer the question of which particle we are detecting by making
further measurements. In particular, by tracing the trajectory or path of the electron through
space we can determine where it originates.
Quantum mechanically, we cannot define such a path. In a QM framework, the particles
are described by wave packets and classical trajectories only exist on the average. If the two
wave packets never overlap appreciably, we can to a reasonable approximation neglect the
indistinguishability of the two particles. But in general we need to take into account the
indistinguishability of the two electrons in our QM treatment.
In the Schrodinger picture, a system of n identical particles can be described in terms of
the solutions
Ψ(1, 2, 3, ..., n; t)
of the time-dependent Schrodinger equation
$$-\frac{\hbar}{i}\frac{\partial\Psi}{\partial t} = \left[\sum_{i=1}^{n}\left(-\frac{\hbar^2\nabla_i^2}{2m_i}\right) + V(1, 2, ..., n)\right]\Psi = H(1, 2, ..., n)\,\Psi$$
Here i is meant to denote the full set of coordinate and (if necessary) spin dynamical variables
of particle i, viz: ri and σi. Note that I have assumed here that the potential does not change
with time.
To say that the particles 1, .., n are indistinguishable means that H is symmetric under
the interchange of two of its arguments, i.e. particles i and j can be interchanged in H
without changing it.
The permutation operator and the exchange operator
To understand the consequences of indistinguishability of identical particles, it is useful
to introduce two operators, the permutation operator P and the exchange operator X.
For a set of n ordered objects; 1, 2, ..., n, the permutation operator P has the effect of
permuting these objects. There are n! possible such permutations. For three objects, for ex-
ample, the 6 possible permutations are (1, 2, 3), (1, 3, 2), (2, 1, 3), (2, 3, 1), (3, 1, 2), (3, 2, 1).
The exchange operator X interchanges two of the objects. There are obviously n(n−1)/2
possible interchanges. For three objects, for example, the three possibilities are
(1, 2, 3) → (2, 1, 3)
→ (3, 2, 1)
→ (1, 3, 2)
Clearly, any permutation of a set of objects can be expressed as a product of exchanges,
though usually not uniquely. Thus, for example,
$$P_{(1,2,3)\to(2,3,1)} = X_{1\leftrightarrow 2}\, X_{1\leftrightarrow 3}$$
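The counting statements and the decomposition into exchanges can be illustrated with a short script. This is a minimal sketch: `exchanges_for` and `apply_exchanges` are hypothetical helpers, and the exchanges here swap positions in the ordered list, which is just one of the several (non-unique) ways to factor a permutation and need not coincide with the factorization written above.

```python
from itertools import combinations, permutations

def exchanges_for(target):
    """Return a list of position swaps (i, j), 1-based, that turn (1, ..., n) into target."""
    n = len(target)
    current = list(range(1, n + 1))
    swaps = []
    for pos in range(n):
        if current[pos] != target[pos]:
            other = current.index(target[pos])
            current[pos], current[other] = current[other], current[pos]
            swaps.append((pos + 1, other + 1))
    return swaps

def apply_exchanges(start, swaps):
    """Apply the position swaps, in order, to the tuple start."""
    items = list(start)
    for i, j in swaps:
        items[i - 1], items[j - 1] = items[j - 1], items[i - 1]
    return tuple(items)

if __name__ == "__main__":
    print(len(list(permutations((1, 2, 3)))))        # n! orderings of three objects
    print(len(list(combinations((1, 2, 3), 2))))     # n(n-1)/2 possible exchanges
    swaps = exchanges_for((2, 3, 1))
    print(swaps, apply_exchanges((1, 2, 3), swaps))
```

The number of exchanges in any such factorization has a definite parity (even or odd), which is what fixes the signs in antisymmetric wave functions.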
Permutation symmetry
Now let’s define the operator that effects a permutation of an identical particle wave
function
$$U_P\, \Psi_\alpha(1, 2, ..., n) = \Psi_\alpha(P^{-1}(1, 2, ..., n))$$
Further, let’s assume that $\Psi_\alpha$ is an eigenfunction of the hamiltonian H of the system,
$$H\, \Psi_\alpha(1, 2, ..., n) = E_\alpha\, \Psi_\alpha(1, 2, ..., n) \qquad (91)$$
What about the wave function UP Ψα(1, 2, ..., n) obtained by permuting the labels in
some way? Since the particles are identical, such a state is indistinguishable from the state
Ψα(1, 2, ..., n) and thus it too must be an eigenstate of H with the same energy Eα,
H UP Ψα(1, 2, ..., n) = Eα UP Ψα(1, 2, ..., n)
From this we can readily show that
[H,UP ] = 0
i.e. that H commutes with UP . Thus, it is possible to find simultaneous eigenstates of H
and UP .
Systems of two identical particles
We first focus on systems of two identical particles, for which the permutation and ex-
change operators are obviously the same. Consider a given eigenstate of H, Ψα(1, 2). From
what we saw earlier,
UP Ψα(1, 2) = Ψα(P−1(1, 2)) = Ψα(2, 1)
is also an eigenstate of H with the same energy. And thus any linear combination of Ψα(1, 2)
and Ψα(2, 1) is also an eigenstate of H with the same energy.
Of these, there are two that are especially interesting:
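$$\Psi^S_\alpha(1, 2) = \frac{1}{\sqrt{2}}\left(\Psi_\alpha(1, 2) + \Psi_\alpha(2, 1)\right)$$
$$\Psi^A_\alpha(1, 2) = \frac{1}{\sqrt{2}}\left(\Psi_\alpha(1, 2) - \Psi_\alpha(2, 1)\right)$$
They are the two possible simultaneous eigenstates of H and the permutation operator
P. In particular,
$$U_P\, \Psi^S_\alpha(1, 2) = +\Psi^S_\alpha(1, 2)$$
and
$$U_P\, \Psi^A_\alpha(1, 2) = -\Psi^A_\alpha(1, 2)$$
$\Psi^S_\alpha(1, 2)$ is called the symmetric eigenstate, whereas $\Psi^A_\alpha(1, 2)$ is called the antisymmetric
eigenstate for reasons to be made clearer shortly.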
Three identical particles
In the case of more than two identical particles, the permutation and exchange operators
are no longer the same. To make the generalization simpler, let’s begin with three particles.
Consider an eigenstate Ψα(1, 2, 3) of some three-particle hamiltonian H with eigenvalue
Eα. The fact that the particles are identical means that there are (at least) six degenerate
states at energy Eα,
Ψα(1, 2, 3) , Ψα(2, 1, 3) , Ψα(1, 3, 2) , Ψα(3, 1, 2) , Ψα(2, 3, 1) , Ψα(3, 2, 1)
Clearly any linear combination of them will also be an eigenstate of H with the same energy
Eα.
Which are also eigenstates of UP ?
I claim that to be an eigenstate of UP , the state must be a simultaneous eigenstate of
UX(1 ↔ 2) , UX(1 ↔ 3) and UX(2 ↔ 3)
having the same eigenvalue for all three exchanges.
Put another way, it must either be symmetric under all three interchanges or antisym-
metric under all three interchanges. And there is a unique linear combination for each of
these two possibilities.
1. Symmetric under all three interchanges:
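$$\Psi^S_\alpha(1, 2, 3) = \frac{1}{\sqrt{3!}}\left[\Psi_\alpha(1, 2, 3) + \Psi_\alpha(1, 3, 2) + \Psi_\alpha(2, 3, 1) + \Psi_\alpha(2, 1, 3) + \Psi_\alpha(3, 1, 2) + \Psi_\alpha(3, 2, 1)\right]$$
2. Antisymmetric under all three interchanges:
$$\Psi^A_\alpha(1, 2, 3) = \frac{1}{\sqrt{3!}}\left[\Psi_\alpha(1, 2, 3) - \Psi_\alpha(1, 3, 2) + \Psi_\alpha(2, 3, 1) - \Psi_\alpha(2, 1, 3) + \Psi_\alpha(3, 1, 2) - \Psi_\alpha(3, 2, 1)\right]$$
Note: the factor $1/\sqrt{3!}$ is included so that they would be normalized assuming the individual
terms are.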
I claim that both are eigenstates of UP for any permutation, as you can readily confirm.
An arbitrary number of identical particles
The same conclusions are true for any n. Namely, there are two possible states of n
identical particles at a given energy Eα that are also eigenstates of UP for any permutation.
One is called the symmetric state and the other is called the antisymmetric state.
Time evolution of a state of definite permutation symmetry
What happens if we have a system of identical particles in a state of definite permutation
symmetry, either symmetric or antisymmetric, and let it evolve in time? At this point, I do
not care whether it is in an eigenstate of H or not.
The state evolves via the time evolution operator,
$$T(t, t_0) = e^{-\frac{i}{\hbar}(t - t_0)H}$$
And since the hamiltonian commutes with the permutation operator, it should thus be
clear that an eigenstate of the permutation operator will preserve its permutation symmetry
character for all times.
An analogous situation occurs for parity. If a hamiltonian is invariant under space in-
version, then a state of definite parity will remain forever in a state of definite parity. On
the other hand, it is possible to form a quantum system in a state of mixed parity, as we do
for example when we prepare a beam of particles in a definite direction in a 1D scattering
experiment.
It is here that the analogy between parity transformations and permutations breaks
down. It has been found necessary to impose another fundamental postulate in our quantum
mechanical formalism when considering identical particles. This new postulate, sometimes
referred to as the symmetrization postulate, asserts that “the states of a system containing
N identical particles are either all symmetric or all antisymmetric with respect to exchanges
of the particles”.
It should be emphasized that this is a postulate, much like the other postulates on which
QM is based. But, here too, the postulate seems to be borne out by experiment.
Which of the two prescriptions should be applied to a given problem depends on the nature
of the identical particles involved. Particles for which all states are symmetric are called
bosons. Those for which all states are antisymmetric are called fermions. All fundamental
particles with half integral intrinsic spins behave as fermions, whereas those with integral
intrinsic spins all behave as bosons.
Independent particle wave functions
Often in QM, both for identical and for non-identical particles, a useful first approxima-
tion can be made by neglecting the interactions between the particles and then treating the
interactions between them later using some form of perturbation theory.
The approximate (or unperturbed) hamiltonian for these “noninteracting” particles will
just be the sum of the hamiltonians for each one,
H(1, 2, .., n) = h(1) + h(2) + ...+ h(n) (92)
Obviously, each particle feels the same hamiltonian, since they are identical.
If we denote the eigenfunctions and eigenvalues of the single-particle hamiltonians h(i)
as $u_a(i)$ and $\epsilon_a$, respectively, i.e.
$$h(i)\, u_a(i) = \epsilon_a\, u_a(i)$$
then the eigenfunctions and eigenvalues of the independent particle hamiltonian (92) are
given by
$$H\,\Psi(1, ..., n) = E\,\Psi(1, ..., n)$$
where
$$\Psi(1, ..., n) = u_{a_1}(1)\, u_{a_2}(2) \cdots u_{a_n}(n)$$
and
$$E = \epsilon_{a_1} + \epsilon_{a_2} + ... + \epsilon_{a_n}$$
Clearly, this system has degeneracies. For n = 2, for example, the two states
$$\Psi_1(1, 2) = u_a(1)\, u_b(2)$$
and
$$\Psi_2(1, 2) = u_b(1)\, u_a(2)$$
have the same energy
$$E = \epsilon_a + \epsilon_b$$
Neither $\Psi_1(1, 2)$ nor $\Psi_2(1, 2)$ is a state with definite permutation symmetry. But clearly
$$\Psi^A(1, 2) = \frac{1}{\sqrt{2}}\left[u_a(1)u_b(2) - u_b(1)u_a(2)\right]$$
and
$$\Psi^S(1, 2) = \frac{1}{\sqrt{2}}\left[u_a(1)u_b(2) + u_b(1)u_a(2)\right]$$
are. The state ΨA is antisymmetric under particle exchange and the state ΨS is symmetric.
If we are dealing with fermions, we must use the antisymmetric solution ΨA whereas if we
are dealing with bosons we must use the symmetric solution ΨS. These are the appropriate
independent particle solutions to use when dealing with systems of two identical particles.
Many-particle states for independent identical fermions
We have seen that the appropriate antisymmetric state for two independent fermions is
$$\Psi^A_{ab}(1, 2) = \frac{1}{\sqrt{2}}\left[u_a(1)u_b(2) - u_b(1)u_a(2)\right]$$
where $u_a$ and $u_b$ are eigenstates of the associated single-particle hamiltonian. A convenient
way to rewrite this is
$$\Psi^A_{ab}(1, 2) = \frac{1}{\sqrt{2}} \begin{vmatrix} u_a(1) & u_a(2) \\ u_b(1) & u_b(2) \end{vmatrix}$$
in terms of a 2 × 2 determinant.
In this form, it can be readily generalized to the case of n identical independent fermions.
The appropriate generalization is to the n × n determinant
$$\Psi^A_{a,b,...}(1, 2, ..., n) = \frac{1}{\sqrt{n!}} \begin{vmatrix} u_a(1) & u_a(2) & \cdots & u_a(n) \\ u_b(1) & u_b(2) & \cdots & u_b(n) \\ \vdots & \vdots & & \vdots \end{vmatrix} \qquad (93)$$
This is referred to as a Slater determinant.
Clearly, (93) is an n-particle eigenstate of
$$H = \sum_{i=1}^{n} h(i)$$
with eigenvalue
$$E = \epsilon_a + \epsilon_b + ...$$
By making use of the properties of determinants under the interchange of two columns, it
is straightforward to confirm that it is fully antisymmetric.
The Pauli Exclusion Principle
What happens if we try to construct the antisymmetric Slater determinant for a state
with two particles in the same single-particle state? Then the Slater determinant can be
written as
$$\Psi^A_{a,b,b,...}(1, 2, 3, ..., n) = \frac{1}{\sqrt{n!}} \begin{vmatrix} u_a(1) & u_a(2) & \cdots & u_a(n) \\ u_b(1) & u_b(2) & \cdots & u_b(n) \\ u_b(1) & u_b(2) & \cdots & u_b(n) \\ \vdots & \vdots & & \vdots \end{vmatrix}$$
Note: I have assumed that it is the single-particle state b that contains two particles, but it
could be any other and the same conclusion would arise.
Making use of another property of determinants, which states that any one which has
two identical rows (or two identical columns) must be zero, we can easily show that
ΨAa,b,b,...(1, 2, 3, ..., n) = 0
What this says is that it is impossible to construct an antisymmetric state of identical
fermions in which more than one particle occupies the same single-particle state. This is of
course the well-known Pauli Exclusion Principle.
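Both properties of the Slater determinant, antisymmetry under exchange and the vanishing for a doubly occupied single-particle state, can be seen in a small numerical sketch. It expands the determinant as a signed sum over permutations; the toy orbitals (1, x, x²) are arbitrary illustrative functions rather than eigenstates of any particular hamiltonian, and the function names are assumptions.

```python
import math
from itertools import permutations

def parity(perm):
    """+1 for an even permutation, -1 for an odd one (counted via inversions)."""
    inversions = sum(
        1
        for i in range(len(perm))
        for j in range(i + 1, len(perm))
        if perm[i] > perm[j]
    )
    return -1 if inversions % 2 else 1

def slater_determinant(orbitals, coords):
    """(1/sqrt(n!)) * det[ orbitals[row](coords[col]) ] as a sum of n! signed terms."""
    n = len(orbitals)
    total = 0.0
    for perm in permutations(range(n)):
        term = float(parity(perm))
        for row, col in enumerate(perm):
            term *= orbitals[row](coords[col])
        total += term
    return total / math.sqrt(math.factorial(n))

if __name__ == "__main__":
    u_a = lambda x: 1.0
    u_b = lambda x: x
    u_c = lambda x: x * x
    print(slater_determinant([u_a, u_b, u_c], (0.2, 0.7, 1.5)))
    print(slater_determinant([u_a, u_b, u_c], (0.7, 0.2, 1.5)))  # opposite sign
    print(slater_determinant([u_a, u_b, u_b], (0.2, 0.7, 1.5)))  # vanishes
```

The explicit loop over n! permutations also makes the bookkeeping problem concrete: the cost of this direct expansion grows factorially with the number of particles.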
Two versus many identical particles
As is evident from the Slater determinant in (93), systems of many identical fermions
require wave functions with an enormous number of terms. This is in fact equally true
for systems of many identical bosons, even though it is not a determinant that is required
there. As such, it will be very difficult to treat systems of many identical particles using the
framework we have developed so far. Soon we will develop a method for handling systems
of many identical particles that captures all of the exchange and permutation symmetry
character of the systems, but with a much more efficient scheme for handling the many
terms that appear in the wave functions. The framework is called Second Quantization to
distinguish it from the ordinary or First Quantized framework we have used so far.
Two-particle wave functions for spin-1/2 particles
Let’s now discuss in a bit of detail the wave functions for two identical spin-1/2 particles.
Such wave functions obviously depend both on the spin and spatial degrees of freedom. For
my subsequent discussion of the scattering of identical spin-1/2 particles, it will be necessary
for us to know how the spin part of such wave functions behaves under particle interchange.
So, let’s look at that a bit.
Let’s consider therefore a spin wave function for two identical spin-1/2 particles with
total spin S, namely
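$$\chi_{\frac{1}{2}\frac{1}{2};SM_S}(\sigma_1, \sigma_2)$$
How does such a spin wave function behave under the interchange of its two spin labels σ1
and σ2?
To answer this, we consider
$$\chi_{\frac{1}{2}\frac{1}{2};SM_S}(\sigma_2, \sigma_1) = \sum_{m_1 m_2} (\tfrac{1}{2} m_1 \tfrac{1}{2} m_2 | S M_S)\, \chi_{\frac{1}{2} m_1}(\sigma_2)\, \chi_{\frac{1}{2} m_2}(\sigma_1)$$
$$= \sum_{m_1 m_2} (\tfrac{1}{2} m_1 \tfrac{1}{2} m_2 | S M_S)\, \chi_{\frac{1}{2} m_2}(\sigma_1)\, \chi_{\frac{1}{2} m_1}(\sigma_2)$$
$$= (-)^{1-S} \sum_{m_1 m_2} (\tfrac{1}{2} m_2 \tfrac{1}{2} m_1 | S M_S)\, \chi_{\frac{1}{2} m_2}(\sigma_1)\, \chi_{\frac{1}{2} m_1}(\sigma_2)$$
$$= (-)^{1-S}\, \chi_{\frac{1}{2}\frac{1}{2};SM_S}(\sigma_1, \sigma_2)$$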
where in the third equality I have used the property of the Clebsch Gordan coefficients which
says that
$$(j_1 m_1 j_2 m_2 | JM) = (-)^{j_1 + j_2 - J}\, (j_2 m_2 j_1 m_1 | JM) \qquad (94)$$
As we know, there are two possible values for the total spin, S = 0 and S = 1. We
have now shown that the S = 0 state is antisymmetric under interchange of the two spins,
whereas the S = 1 state is symmetric.
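The exchange behavior can be verified directly for the four coupled states. The sketch below hard-codes the standard Clebsch-Gordan combinations for two spin-1/2 particles as dictionaries mapping (m1, m2) to amplitudes; the representation and the helper names are illustrative choices.

```python
import math

UP, DOWN = 0.5, -0.5

def coupled_spin_state(S, M):
    """Amplitudes of |S M> in the (m1, m2) product basis for two spin-1/2 particles."""
    r = 1.0 / math.sqrt(2.0)
    table = {
        (1, 1): {(UP, UP): 1.0},
        (1, 0): {(UP, DOWN): r, (DOWN, UP): r},
        (1, -1): {(DOWN, DOWN): 1.0},
        (0, 0): {(UP, DOWN): r, (DOWN, UP): -r},
    }
    return table[(S, M)]

def exchange(state):
    """Interchange the two spin labels: amplitude(m1, m2) -> amplitude(m2, m1)."""
    return {(m2, m1): amp for (m1, m2), amp in state.items()}

def exchange_sign(S, M):
    """Return +1 or -1 if the state is an eigenstate of exchange, else raise."""
    state = coupled_spin_state(S, M)
    swapped = exchange(state)
    for sign in (1.0, -1.0):
        if all(abs(swapped[key] - sign * amp) < 1e-12 for key, amp in state.items()):
            return sign
    raise ValueError("state has no definite exchange symmetry")

if __name__ == "__main__":
    for S, M in ((1, 1), (1, 0), (1, -1), (0, 0)):
        print(S, M, exchange_sign(S, M), (-1) ** (1 - S))
```

In each case the sign found by direct exchange matches the factor (−)^{1−S} derived from the Clebsch-Gordan symmetry relation (94).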
Scattering of spin-1/2 particles
Now let’s discuss how the indistinguishable nature of identical particles reflects itself in
scattering experiments. I will focus on the scattering of two identical spin-1/2 particles off
one another. The generalization to arbitrary intrinsic spin particles is straightforward.
Let’s denote the relative position vector of the two colliding spin-1/2 particles by $\mathbf{r} = \mathbf{r}_1 - \mathbf{r}_2$.
Assuming that the incident particle has spin projection $m_1\hbar$ and the target particle has spin
projection $m_2\hbar$, we might naively write down the wave function prior to the scattering as
$$\frac{1}{(2\pi)^{3/2}}\, e^{i\mathbf{k}\cdot\mathbf{r}}\, |\tfrac{1}{2} m_1, \tfrac{1}{2} m_2\rangle$$
Of course, in the usual experimental setup, both the incident particle and the target are
unpolarized, i.e. contain a mixture of all possible spin projections with none preferentially
favored. In such a scenario, we would of course have to take averages over the possible spin
projections m1 and m2.
Alternatively, as we know, we could consider the two spins coupled together to total spin
S and projection $M_S$ and try writing down the wave function before scattering as
$$\frac{1}{(2\pi)^{3/2}}\, e^{i\mathbf{k}\cdot\mathbf{r}}\, |\tfrac{1}{2}\tfrac{1}{2}, S M_S\rangle$$
This is in fact the preferred representation, and the one we will use as we proceed. When
working in this representation, we would reflect an unpolarized scenario by averaging over
all possible values of S and MS. We’ll discuss how to do this shortly.
Why did I say naively when I wrote down the above two possible wave functions? The
reason is that neither of the two wave functions I wrote down is proper for describing
systems of two identical spin-1/2 particles, as neither is fully antisymmetric. To obtain a
properly antisymmetric two-fermion wave function, we would need to consider the behavior
under interchange of the two particles. Under particle interchange, I claim that
$$\mathbf{r} \to -\mathbf{r}$$
Thus, the properly antisymmetric plane wave incident state is
$$\frac{1}{(2\pi)^{3/2}}\, \frac{1}{\sqrt{2}} \left[e^{i\mathbf{k}\cdot\mathbf{r}} + (-)^S e^{-i\mathbf{k}\cdot\mathbf{r}}\right] |\tfrac{1}{2}\tfrac{1}{2}, S M_S\rangle$$
The factor (−)S expresses the fact that for S = 0 the spin part of the wave function is
antisymmetric so that a symmetric spatial wave function is required for full antisymmetry.
Correspondingly, for S = 1, the spin part is symmetric and an antisymmetric spatial wave
function is needed.
Now let’s look at the spherically outgoing asymptotic wave. Before antisymmetrization,
i.e. while naivete still reigned supreme, it would be
$$f(\theta, \phi)\, \frac{e^{ikr}}{r}\, |\tfrac{1}{2}\tfrac{1}{2}, S' M_S\rangle$$
In general it is possible for the final spin S′ to differ from the incident spin S. However, if
the interaction that governs the scattering is spin-independent it is impossible for the spin
of the system to change in the scattering process and S′ = S. Also, the scattering amplitude
in such cases would not depend on S.
As noted earlier, under particle interchange r → −r. Thus θ → π − θ and ϕ → π + ϕ. A
properly antisymmetrized spherically outgoing wave can thus be written (assuming no spin
dependence of the hamiltonian) as
$$\frac{1}{\sqrt{2}} \left[f(\theta, \phi) + (-)^S f(\pi - \theta, \pi + \phi)\right] \frac{e^{ikr}}{r}\, |\tfrac{1}{2}\tfrac{1}{2}, S M_S\rangle$$
As we have seen, the coefficient of the spherically outgoing asymptotic wave determines
the differential cross section. For a given spin state S and spin-independent scattering of
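identical particles, the appropriate expression is
$$\left(\frac{d\sigma}{d\Omega}\right)_S = |f(\theta, \phi) + (-)^S f(\pi - \theta, \pi + \phi)|^2$$
$$= |f(\theta, \phi)|^2 + |f(\pi - \theta, \pi + \phi)|^2 + 2(-)^S\, \mathrm{Re}\left[f(\theta, \phi)\, f^*(\pi - \theta, \pi + \phi)\right]$$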
As in ordinary scattering theory it is useful to choose the z-axis along the incident (rel-
ative) momentum vector k. For a central potential, this will ensure that the scattering
amplitude f(θ, ϕ) only depends on θ (i.e. the angle between k and k ′).
With this choice of axes,
$$\left(\frac{d\sigma}{d\Omega}\right)_S = |f(\theta)|^2 + |f(\pi - \theta)|^2 + 2(-)^S\, \mathrm{Re}\left[f(\theta)\, f^*(\pi - \theta)\right]$$
We now note that in an unpolarized experiment S = 1 occurs three times more often
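than S = 0, since S = 1 has three spin projections and S = 0 only has one. Thus,
$$\left(\frac{d\sigma}{d\Omega}\right)_{\mathrm{unpolarized}} = \frac{1}{4}\left(\frac{d\sigma}{d\Omega}\right)_{S=0} + \frac{3}{4}\left(\frac{d\sigma}{d\Omega}\right)_{S=1}$$
$$= \frac{1}{4}|f(\theta)|^2 + \frac{1}{4}|f(\pi - \theta)|^2 + \frac{1}{2}\mathrm{Re}\left[f(\theta) f^*(\pi - \theta)\right] + \frac{3}{4}|f(\theta)|^2 + \frac{3}{4}|f(\pi - \theta)|^2 - \frac{3}{2}\mathrm{Re}\left[f(\theta) f^*(\pi - \theta)\right]$$
$$= |f(\theta)|^2 + |f(\pi - \theta)|^2 - \mathrm{Re}\left[f(\theta) f^*(\pi - \theta)\right]$$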
Note that this gives the differential cross section for scattering through an angle θ relative
to the incident direction. Furthermore, bear in mind that it gives it in the center-of-mass
frame of reference, since it is the relative motion Schrodinger equation that we solved to
get the scattering amplitude. But we can of course go from this to the lab cross section
straightforwardly.
[FIG. 6: Schematic illustration of an elastic scattering experiment of two identical fermions. Panels (a) and (b) each show the detector.]
As a result of the indistinguishability of the two fermions (perhaps two protons), the cross
section contains contributions from two processes, as shown in figure 6. The term $|f(\theta)|^2$ is
the cross section resulting from process (a) alone. It would have been the only contribution
were the two particles not identical. The term $|f(\pi - \theta)|^2$ is the cross section for scattering
through an angle π − θ and thus would come from process (b) alone. The interference
between the two processes (since in QM we must add amplitudes) gives rise to the third
term $-\mathrm{Re}\left[f(\theta)\, f^*(\pi - \theta)\right]$. It is important to note that there is no way to distinguish between
processes (a) and (b) by merely detecting the single outgoing particle (e.g. the outgoing
proton).
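The spin-weighted average can be checked numerically for any trial amplitude. In the sketch below, `toy_amplitude` is a made-up complex function of θ used purely for illustration (it is not the amplitude of any specific potential); the other function names are likewise assumptions.

```python
import math

def toy_amplitude(theta):
    """Hypothetical complex scattering amplitude, for illustration only."""
    return complex(1.0 / (1.5 - math.cos(theta)), 0.3 * math.sin(theta))

def cross_section_S(f, theta, S):
    """|f(theta) + (-1)^S f(pi - theta)|^2 for total spin S = 0 or 1."""
    amplitude = f(theta) + (-1) ** S * f(math.pi - theta)
    return abs(amplitude) ** 2

def unpolarized_cross_section(f, theta):
    """Statistical average: weight 1/4 for S = 0 and 3/4 for S = 1."""
    return 0.25 * cross_section_S(f, theta, 0) + 0.75 * cross_section_S(f, theta, 1)

def unpolarized_closed_form(f, theta):
    """|f(theta)|^2 + |f(pi - theta)|^2 - Re[f(theta) f*(pi - theta)]."""
    direct, exchanged = f(theta), f(math.pi - theta)
    return abs(direct) ** 2 + abs(exchanged) ** 2 - (direct * exchanged.conjugate()).real

if __name__ == "__main__":
    for theta in (0.3, 0.7, math.pi / 2):
        print(theta,
              unpolarized_cross_section(toy_amplitude, theta),
              unpolarized_closed_form(toy_amplitude, theta))
```

At θ = π/2 the direct and exchange amplitudes coincide, so the S = 1 contribution vanishes and the unpolarized cross section reduces to |f(π/2)|².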
Second quantization
We have seen that to describe in coordinate (or momentum) space a system of identical
fermions, we must use as our independent-particle basis functions Slater determinants. For
a system of A identical particles, these basis functions contain A! terms, each one containing
A factors. If A is fairly small, we can easily keep track of these terms and all is fine. But
when A is large, the bookkeeping associated with such a complicated wave function becomes
incredibly difficult.
And this bookkeeping problem is no simpler for systems of identical bosons. In such
cases, we need not make sure that each particle is in a different “level”, as prescribed by
the Pauli principle. But we must still deal with fully symmetrized wave functions, and they
involve the same number of terms as do Slater determinants for fermions.
What we need is a formalism for treating many-particle systems of identical particles
that automatically takes into account the antisymmetry or symmetry of the associated wave
functions, but which avoids the terrible bookkeeping issues.
The solution: Second Quantization.
Unfortunately, our textbook by Shankar does not discuss second quantization. So, you’ll
have to depend on my notes, which I personally think are pretty good and pretty complete,
or try reading another book. Merzbacher’s book does have extensive discussion of second
quantization, though I think that mine is better.
So now let me begin to tell you about second quantization.
Fock Space
In a second-quantized formalism, state vectors of identical particle systems are defined
in so-called Fock space. A state in Fock space is characterized by giving
• the possible states, which I’ll denote | α >, that a single particle can occupy, and
• the occupation numbers for these single-particle states.
The single-particle states | α > are (typically) taken to be the complete set of eigenstates
of some single-particle hamiltonian h, i.e.
$$h\,|\alpha\rangle = \epsilon_\alpha\, |\alpha\rangle \qquad (95)$$
We can completely characterize the eigenstates of h in terms of a complete set of quantum
#’s, related to a complete set of commuting observables. For example, in discussing a spin-
1/2 electron in an atom, h could be the Coulomb hamiltonian for a single electron in the
nuclear field. Its eigenstates would then be characterized by the principal quantum # n,
the orbital angular momentum l, the total angular momentum j and its z-projection m.
With such a choice, | α >= | nljm >. We could of course use any alternative complete set
of commuting observables associated with the problem.
The occupation number of a given single-particle state | α > tells how many particles are
in that single-particle state.
The fundamental assumption is that one obtains a complete set of many-particle states
by distributing the particles in all allowed ways over a complete set of single-particle states.
I will use the following notation for a state of n particles in Fock space:
| α1, α2, ..., αn >
This means that the n particles occupy single-particle states α1, α2, up to αn. For fermions,
all αi are distinct; for bosons they need not be.
An important state in Fock space is the one with no particles. We call it the vacuum
state and denote it by | 0 >. Fock space includes all distinct state vectors | α1, α2, ..., αn >
for any number of particles. It includes the vacuum state | 0 >, all possible one-particle
state vectors | α1 >, all possible two-particle state vectors | α1, α2 >, etc.
The set of ket vectors | α1, α2, ..., αn > define a linear vector space. There is also a dual
space, defined by a corresponding set of bra vectors < α1, α2, ..., αn |. The scalar product
< α1, α2, ..., αn | β1, β2, ..., βm >
is zero unless the set of occupied states α1, α2, ..., αn and β1, β2, ..., βm are the same.
This includes the possibility that the elements are the same, but their ordering is different.
Next I introduce the concept of standard order of Fock space states. This is done by
introducing an order of the single-particle states, say
α1 < α2 < α3...
This order is arbitrary, but once chosen must be maintained. A state in Fock space is in
standard order if its indices conform to the prescribed order. Thus, | α1, α2 > is in standard
order, but | α2, α1 > is not. We define our metric in Fock space according to
< α1, α2, ..., αn | α1, α2, ..., αn >= 1
as long as both states have the same occupation numbers and are in the same order. If they
are not in the same order, the scalar product between the two states will be determined by
whatever factors are required to bring them to the same order. As we’ll see, this requires
consideration of the indistinguishability of the identical particles.
Creation and annihilation operators
The next crucial ingredients in our second quantized formalism are the operators that
connect states in Fock space. Since Fock space states can have different numbers of particles,
we will in general need operators that change particle number. The simplest are those that
create or annihilate a particle.
We define the single-particle creation operator a†β by
$$a^\dagger_\beta\, |\alpha_1, \alpha_2, \ldots, \alpha_n\rangle = \sqrt{n_\beta + 1}\; |\beta, \alpha_1, \alpha_2, \ldots, \alpha_n\rangle \qquad (96)$$
where nβ is the occupation number of state β in | α1, α2, ..., αn >. Thus, a†β creates a
particle in state β, although the resulting state is not necessarily in standard order.
Clearly any ket state in Fock space can be built up by acting with creation operators
systematically on the vacuum state. In particular,
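$$|\alpha_1, \alpha_2, \ldots, \alpha_n\rangle = \mathcal{N}\, a^\dagger_{\alpha_1} a^\dagger_{\alpha_2} \cdots a^\dagger_{\alpha_n} |0\rangle \qquad (97)$$
where
$$\mathcal{N} = \prod_{i=1,n} \frac{1}{\sqrt{n_{\alpha_i}!}}$$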
Analogously we introduce the single-particle annihilation operator aβ by
$$a_\beta\, |\beta, \alpha_1, \alpha_2, \ldots, \alpha_n\rangle = \sqrt{n_\beta}\; |\alpha_1, \alpha_2, \ldots, \alpha_n\rangle \qquad (98)$$
where nβ is the occupation number of state β in | β, α1, α2, ..., αn >.
Clearly aβ annihilates a particle in state β, as long as it is in the first position. If there
is a particle in state β (i.e. nβ ≠ 0) but it is not in the first position, then before applying
the above defining relation we must first put it in the first position. Once again, we see the
need to rearrange Fock space states, or equivalently the order of creation operators.
On the basis of what I’ve just said, it should be clear that
aβ| 0 >= 0, for any β
and
aβa†β| 0 >= | 0 >
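The defining relations (96) and (98) can be tried out numerically for a single bosonic state. The following is only a sketch under stated assumptions: the cutoff `n_max` is a hypothetical truncation of the occupation number (not part of the formalism), and the matrices are written in the basis | n > of states with n particles in the one single-particle state considered.

```python
import numpy as np

n_max = 6  # hypothetical cutoff: keep occupation numbers 0..n_max

# Creation operator in the basis |0>, |1>, ..., |n_max>:
# a_dag |n> = sqrt(n + 1) |n + 1>, as in relation (96)
a_dag = np.zeros((n_max + 1, n_max + 1))
for n in range(n_max):
    a_dag[n + 1, n] = np.sqrt(n + 1)
a = a_dag.T  # annihilation operator: a |n> = sqrt(n) |n - 1>, as in (98)

vacuum = np.zeros(n_max + 1)
vacuum[0] = 1.0

a_on_vacuum = a @ vacuum               # a |0>, expected to vanish
a_adag_on_vacuum = a @ a_dag @ vacuum  # a a_dag |0>, expected to equal |0>

# Three particles in the same state: |3> = (a_dag)^3 |0> / sqrt(3!)
state_3 = a_dag @ a_dag @ a_dag @ vacuum / np.sqrt(6.0)

print(np.allclose(a_on_vacuum, 0.0), np.allclose(a_adag_on_vacuum, vacuum))
print(state_3)
```

The factor 1/sqrt(3!) in the last step is the normalization needed when one state is occupied three times.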
Indistinguishability of identical particles
Now let’s incorporate the indistinguishability of identical particles in QM in our formal-
ism. In doing so, we will pin down the earlier issues of how to relate state vectors that differ
only in the order in which particles were created.
To do this, it is useful to consider the coordinate state representations of these Fock space
states.
For one particle,
< r | α >= ϕα(r)
which is just the one-particle wave function in coordinate space.
For two particles,
< r1, r2 | α, β >= ϕαβ(r1, r2)
But we know that the wave functions for two identical particles in coordinate space
depend on whether they are fermions or bosons. For fermions,
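$$\varphi_{\alpha\beta}(r_1, r_2) = \frac{1}{\sqrt{2}} \left[ \varphi_\alpha(r_1)\varphi_\beta(r_2) - \varphi_\beta(r_1)\varphi_\alpha(r_2) \right] = -\varphi_{\beta\alpha}(r_1, r_2)$$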
In contrast, for bosons,
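$$\varphi_{\alpha\beta}(r_1, r_2) = \frac{1}{\sqrt{2}} \left[ \varphi_\alpha(r_1)\varphi_\beta(r_2) + \varphi_\beta(r_1)\varphi_\alpha(r_2) \right] = \varphi_{\beta\alpha}(r_1, r_2)$$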
We can trivially incorporate this in our second quantized formalism by imposing the
condition
| α, β >= λ| β, α >
where λ = −1 for fermions and +1 for bosons.
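As a quick illustration of the exchange property, here is a minimal sketch in which the coordinate r is replaced by a small grid. The grid size and the random single-particle wave functions are hypothetical stand-ins chosen only for the check.

```python
import numpy as np

rng = np.random.default_rng(0)
n_grid = 5  # hypothetical number of grid points standing in for r

# Two arbitrary single-particle wave functions sampled on the grid
phi_a = rng.normal(size=n_grid)
phi_b = rng.normal(size=n_grid)

def two_particle(phi_1, phi_2, lam):
    """[phi_1(r1) phi_2(r2) + lam * phi_2(r1) phi_1(r2)] / sqrt(2) on the grid."""
    return (np.outer(phi_1, phi_2) + lam * np.outer(phi_2, phi_1)) / np.sqrt(2.0)

for lam in (+1, -1):  # +1 for bosons, -1 for fermions
    psi_ab = two_particle(phi_a, phi_b, lam)
    psi_ba = two_particle(phi_b, phi_a, lam)
    # Interchanging the labels alpha and beta multiplies the wave function by lam
    print(lam, np.allclose(psi_ab, lam * psi_ba))
```

For fermions the diagonal r1 = r2 of the two-particle wave function vanishes identically, which is the coordinate-space face of the sign λ = −1.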
Many particles
It is trivial to generalize the above conditions to Fock space states involving n particles.
The generalization is
| α1, α2, ..., αn >= λ| α2, α1, ..., αn >
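Now let’s consider the action of $a^\dagger_{\beta_1} a^\dagger_{\beta_2}$ on an arbitrary state in Fock space. From earlier definitions
$$\begin{aligned}
a^\dagger_{\beta_1} a^\dagger_{\beta_2} |\alpha_1, \alpha_2, \ldots, \alpha_n\rangle
&= \sqrt{(n_{\beta_1} + \delta_{\beta_1,\beta_2} + 1)(n_{\beta_2} + 1)}\; |\beta_1, \beta_2, \alpha_1, \alpha_2, \ldots, \alpha_n\rangle \\
&= \lambda \sqrt{(n_{\beta_1} + \delta_{\beta_1,\beta_2} + 1)(n_{\beta_2} + 1)}\; |\beta_2, \beta_1, \alpha_1, \alpha_2, \ldots, \alpha_n\rangle \\
&= \lambda\, a^\dagger_{\beta_2} a^\dagger_{\beta_1} |\alpha_1, \alpha_2, \ldots, \alpha_n\rangle
\end{aligned}$$
Since this holds for any state vector | α1, α2, ..., αn >, we can conclude that
$$a^\dagger_{\beta_1} a^\dagger_{\beta_2} = \lambda\, a^\dagger_{\beta_2} a^\dagger_{\beta_1} =
\begin{cases}
a^\dagger_{\beta_2} a^\dagger_{\beta_1} & \text{for bosons} \\
-a^\dagger_{\beta_2} a^\dagger_{\beta_1} & \text{for fermions}
\end{cases}$$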
Let’s now focus for a moment on fermions, for which the negative sign applies, and
consider the case β1 = β2. Then
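$$a^\dagger_{\beta_1} a^\dagger_{\beta_1} = -a^\dagger_{\beta_1} a^\dagger_{\beta_1}$$
which can only be satisfied if
$$a^\dagger_{\beta_1} a^\dagger_{\beta_1} = 0 , \quad \text{for all } \beta_1$$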
Thus our second quantized formalism automatically accommodates the Pauli principle in
that two fermions cannot be created in the same state.
Notation:
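$$[A, B] = AB - BA \quad \text{(commutator)}$$
$$\{A, B\} = AB + BA \quad \text{(anticommutator)}$$
We can summarize the above relations as
$$[a^\dagger_\alpha, a^\dagger_\beta] = 0 , \quad \text{for bosons} \qquad (99)$$
$$\{a^\dagger_\alpha, a^\dagger_\beta\} = 0 , \quad \text{for fermions} \qquad (100)$$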
Further properties of the creation and annihilation operators
Let’s now discuss some important properties of the creation and annihilation operators.
The first point is that the creation operator a†α and the annihilation operator aα are her-
mitean adjoints of one another. I will leave this as a homework problem for you to prove.
To do so, you need merely confirm that for any Fock space state vectors | ϕ > and | Ψ > in
standard order, the following relation between the matrix elements of a†α and aα is satisfied:
< ϕ | a†α | Ψ >=< Ψ | aα | ϕ >∗
I earlier showed that
$$a^\dagger_\alpha a^\dagger_\beta = \lambda\, a^\dagger_\beta a^\dagger_\alpha$$
where λ = 1 for bosons and −1 for fermions. Taking the hermitean adjoint of both sides we
get
aβaα = λaαaβ
We thus see that the annihilation operators likewise satisfy analogous commutation or
anticommutation relations
$$[a_\alpha, a_\beta] = 0 , \quad \text{for bosons} \qquad (101)$$
$$\{a_\alpha, a_\beta\} = 0 , \quad \text{for fermions} \qquad (102)$$
We have looked at what happens when we interchange the order of two creation or two
annihilation operators. But what about when we interchange the order of a creation and an
annihilation operator? More specifically, what is the relation, if any, between the operators
a†αaβ and aβa†α?
I now make two claims which will again be left as a homework assignment for you to
prove.
(a) aαa†β = λa†βaα, for α ≠ β.
(b) aαa†α = λa†αaα + I,
where I is the identity operator. Both can be proven by acting on an arbitrary ket vector
in standard order.
The above two relations can also be summarized compactly as
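$$[a_\alpha, a^\dagger_\beta] = \delta_{\alpha\beta}\, I , \quad \text{for bosons} \qquad (103)$$
$$\{a_\alpha, a^\dagger_\beta\} = \delta_{\alpha\beta}\, I , \quad \text{for fermions} \qquad (104)$$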
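The fermion relations (100), (102) and (104) can be checked with explicit matrices. The sketch below uses the Jordan-Wigner construction, which is a standard matrix representation and not something introduced in these notes; the helper name `fermion_annihilators` and the choice of three single-particle states are assumptions made for the illustration.

```python
import numpy as np
from functools import reduce

def fermion_annihilators(n_modes):
    """Jordan-Wigner matrices a_0, ..., a_{n_modes-1} on the 2**n_modes-dim Fock space."""
    z = np.diag([1.0, -1.0])                    # (-1)^occupation of one state
    lower = np.array([[0.0, 1.0], [0.0, 0.0]])  # occupied -> empty for one state
    one = np.eye(2)
    return [reduce(np.kron, [z] * k + [lower] + [one] * (n_modes - k - 1))
            for k in range(n_modes)]

def anticommutator(x, y):
    return x @ y + y @ x

n_modes = 3
a = fermion_annihilators(n_modes)
a_dag = [op.T for op in a]  # real matrices, so the adjoint is the transpose
dim = 2 ** n_modes

relations_hold = all(
    np.allclose(anticommutator(a[i], a[j]), 0.0)
    and np.allclose(anticommutator(a_dag[i], a_dag[j]), 0.0)
    and np.allclose(anticommutator(a[i], a_dag[j]), (i == j) * np.eye(dim))
    for i in range(n_modes) for j in range(n_modes)
)
# Pauli principle: two fermions cannot be created in the same state
pauli_holds = all(np.allclose(op @ op, 0.0) for op in a_dag)
print(relations_hold, pauli_holds)
```

The string of `z` factors in front of each operator is what supplies the sign λ = −1 when two different operators are interchanged.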
Number-conserving operators
Up to now, we’ve focussed on the simplest possible operators, namely those that either
create or annihilate/destroy a single particle. Such operators connect states with different
numbers of particles. I will now turn to a very important class of operators, called “number-
conserving operators” which by definition only connect states with the same number of
particles. That such operators should play a preeminent role in QM is already evident from
the fact that the dynamics of (non-relativistic) quantum systems is governed by the hamil-
tonian operator, which is clearly number conserving. So, let’s now see how such operators
are built up in terms of the fundamental creation and annihilation operators.
The number operator
Let me begin by discussing a simple example, namely the number operator. This operator,
which I will denote N , is the one which when acting on an arbitrary state in Fock space tells
you how many particles are in that state. Consider a state in Fock space |α1, α2, ..., αn > in
standard order. Assume that αi is the first time that the single-particle state β occurs in
this ordered state vector. Then
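$$a_\beta |\alpha_1, \alpha_2, \ldots, \alpha_n\rangle = \sqrt{n_\beta}\; \lambda^{i-1}\, |\alpha_1, \alpha_2, \ldots, \alpha_{i-1}, \alpha_{i+1}, \ldots, \alpha_n\rangle$$
Furthermore,
$$\begin{aligned}
a^\dagger_\beta a_\beta |\alpha_1, \alpha_2, \ldots, \alpha_n\rangle
&= \sqrt{n_\beta}\; \lambda^{i-1} \sqrt{n_\beta}\; |\beta, \alpha_1, \alpha_2, \ldots, \alpha_{i-1}, \alpha_{i+1}, \ldots, \alpha_n\rangle \\
&= n_\beta\, |\alpha_1, \alpha_2, \ldots, \alpha_n\rangle
\end{aligned}$$
where in the last step β is moved back to position i, which supplies a second factor $\lambda^{i-1}$, and $\lambda^{2(i-1)} = 1$.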
Thus, a†βaβ counts the number of particles in state β in a given Fock space state.
If we then define
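$$N = \sum_\beta a^\dagger_\beta a_\beta \qquad (105)$$
then
$$\begin{aligned}
N |\alpha_1, \alpha_2, \ldots, \alpha_n\rangle &= \sum_\beta n_\beta\, |\alpha_1, \alpha_2, \ldots, \alpha_n\rangle \\
&= n\, |\alpha_1, \alpha_2, \ldots, \alpha_n\rangle
\end{aligned}$$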
confirming that N as defined above is indeed the total number operator.
Some useful relations which you will be asked to confirm in the homework are that
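$$[a_\alpha, N] = a_\alpha \qquad (106)$$
$$[a^\dagger_\alpha, N] = -a^\dagger_\alpha \qquad (107)$$
$$[a^\dagger_\alpha a_\beta, N] = [a_\beta a^\dagger_\alpha, N] = 0 \qquad (108)$$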
From (108) we see that any operator containing one creation and one annihilation opera-
tor commutes with the number operator. Thus, such an operator cannot change the number
of particles in a state on which it operates and is therefore a “number-conserving operator”.
It is trivial to generalize this (already fairly obvious) statement to reach the more gen-
eral conclusion that any operator containing the same number of creation and annihilation
operators commutes with N and is thus “number conserving”.
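Relations (106)-(108), and the generalization to operators with equal numbers of creation and annihilation operators, can be confirmed numerically for fermions. This is a sketch only: the Jordan-Wigner helper `fermion_annihilators` and the choice of three single-particle states are assumptions made for the illustration.

```python
import numpy as np
from functools import reduce

def fermion_annihilators(n_modes):
    """Jordan-Wigner matrices for n_modes fermion annihilation operators."""
    z = np.diag([1.0, -1.0])
    lower = np.array([[0.0, 1.0], [0.0, 0.0]])
    one = np.eye(2)
    return [reduce(np.kron, [z] * k + [lower] + [one] * (n_modes - k - 1))
            for k in range(n_modes)]

def commutator(x, y):
    return x @ y - y @ x

n_modes = 3
a = fermion_annihilators(n_modes)
a_dag = [op.T for op in a]
N = sum(ad @ op for ad, op in zip(a_dag, a))  # total number operator (105)

eq_106 = all(np.allclose(commutator(op, N), op) for op in a)
eq_107 = all(np.allclose(commutator(ad, N), -ad) for ad in a_dag)
eq_108 = all(np.allclose(commutator(a_dag[i] @ a[j], N), 0.0)
             for i in range(n_modes) for j in range(n_modes))

# Two creation operators followed by two annihilation operators
two_body = a_dag[0] @ a_dag[1] @ a[2] @ a[1]
two_body_conserves = np.allclose(commutator(two_body, N), 0.0)
print(eq_106, eq_107, eq_108, two_body_conserves)
```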
Of this large class of number-conserving operators, the ones of most interest are those in
which there is just one creation operator followed by one annihilation operator and those
in which there are just two creation operators followed by two annihilation operators. As
we will now confirm, operators with just one creation operator followed by one annihila-
tion operator correspond to quantum one-body operators, whereas those with two creation
operators followed by two annihilation operators correspond to quantum two-body opera-
tors. And indeed, many of the most important operators in quantum mechanics are one- or
two-body operators.
I will not try to convince you of these remarks in general, since (as noted earlier) the
bookkeeping associated with many-particle states is very difficult. Rather, I will just confirm
them for systems with two particles.
For two particles, a general one-body operator can be written as
Ω = ω(1) + ω(2)
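Now let’s evaluate the matrix elements of Ω between states of two identical particles, either fermions or bosons,
$$\frac{1}{\sqrt{2}} \left[ \varphi_{\alpha_1}(1)\varphi_{\alpha_2}(2) + \lambda\, \varphi_{\alpha_2}(1)\varphi_{\alpha_1}(2) \right]$$
and
$$\frac{1}{\sqrt{2}} \left[ \varphi_{\alpha_3}(1)\varphi_{\alpha_4}(2) + \lambda\, \varphi_{\alpha_4}(1)\varphi_{\alpha_3}(2) \right]$$
We’ll assume that $\alpha_1 \neq \alpha_2$ and $\alpha_3 \neq \alpha_4$; otherwise the states wouldn’t exist for fermions.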
We obtain for the matrix element of Ω,
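$$\begin{aligned}
\frac{1}{2} \Big\{ &\langle \varphi_{\alpha_1}(1)|\omega(1)|\varphi_{\alpha_3}(1)\rangle\, \delta_{\alpha_2\alpha_4} + \langle \varphi_{\alpha_2}(2)|\omega(2)|\varphi_{\alpha_4}(2)\rangle\, \delta_{\alpha_1\alpha_3} \\
&+ \lambda \langle \varphi_{\alpha_2}(1)|\omega(1)|\varphi_{\alpha_3}(1)\rangle\, \delta_{\alpha_1\alpha_4} + \lambda \langle \varphi_{\alpha_1}(2)|\omega(2)|\varphi_{\alpha_4}(2)\rangle\, \delta_{\alpha_2\alpha_3} \\
&+ \lambda \langle \varphi_{\alpha_1}(1)|\omega(1)|\varphi_{\alpha_4}(1)\rangle\, \delta_{\alpha_2\alpha_3} + \lambda \langle \varphi_{\alpha_2}(2)|\omega(2)|\varphi_{\alpha_3}(2)\rangle\, \delta_{\alpha_1\alpha_4} \\
&+ \langle \varphi_{\alpha_2}(1)|\omega(1)|\varphi_{\alpha_4}(1)\rangle\, \delta_{\alpha_1\alpha_3} + \langle \varphi_{\alpha_1}(2)|\omega(2)|\varphi_{\alpha_3}(2)\rangle\, \delta_{\alpha_2\alpha_4} \Big\} \\
= \; &\langle \varphi_{\alpha_1}|\omega|\varphi_{\alpha_3}\rangle\, \delta_{\alpha_2\alpha_4} + \lambda \langle \varphi_{\alpha_1}|\omega|\varphi_{\alpha_4}\rangle\, \delta_{\alpha_2\alpha_3} \\
&+ \lambda \langle \varphi_{\alpha_2}|\omega|\varphi_{\alpha_3}\rangle\, \delta_{\alpha_1\alpha_4} + \langle \varphi_{\alpha_2}|\omega|\varphi_{\alpha_4}\rangle\, \delta_{\alpha_1\alpha_3}
\end{aligned}$$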
Next, we consider the Fock space operator
$$\hat{\Omega} = \sum_{k,k'=1}^{\infty} \omega_{kk'}\, a^\dagger_k a_{k'}$$
with
$$\omega_{kk'} = \langle \varphi_k|\omega|\varphi_{k'}\rangle$$
Note that I put a hat on top of Ω to make clear that it is a second-quantized operator. Note also that $\omega_{kk'}$ is just a matrix of c-numbers, and that all operator dependence in $\hat{\Omega}$ is contained in $a^\dagger_k a_{k'}$. Let’s evaluate the matrix elements of $\hat{\Omega}$ between the two-particle Fock space states | α1, α2 > and | α3, α4 >, where as before we assume $\alpha_1 \neq \alpha_2$ and $\alpha_3 \neq \alpha_4$.
Then
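$$|\alpha_1, \alpha_2\rangle = a^\dagger_{\alpha_1} a^\dagger_{\alpha_2} |0\rangle$$
and
$$|\alpha_3, \alpha_4\rangle = a^\dagger_{\alpha_3} a^\dagger_{\alpha_4} |0\rangle$$
so that
$$\langle \alpha_1, \alpha_2|\hat{\Omega}|\alpha_3, \alpha_4\rangle = \sum_{k,k'} \omega_{kk'}\, A$$
where
$$A = \langle 0|\, a_{\alpha_2} a_{\alpha_1} a^\dagger_k a_{k'} a^\dagger_{\alpha_3} a^\dagger_{\alpha_4} |0\rangle$$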
To evaluate A, note that
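$$\begin{aligned}
a_{k'} a^\dagger_{\alpha_3} a^\dagger_{\alpha_4} |0\rangle
&= \lambda\, a^\dagger_{\alpha_3} a_{k'} a^\dagger_{\alpha_4} |0\rangle + \delta_{\alpha_3 k'}\, a^\dagger_{\alpha_4} |0\rangle \\
&= a^\dagger_{\alpha_3} a^\dagger_{\alpha_4} a_{k'} |0\rangle + \lambda\, \delta_{\alpha_4 k'}\, a^\dagger_{\alpha_3} |0\rangle + \delta_{\alpha_3 k'}\, a^\dagger_{\alpha_4} |0\rangle \\
&= \lambda\, \delta_{\alpha_4 k'}\, a^\dagger_{\alpha_3} |0\rangle + \delta_{\alpha_3 k'}\, a^\dagger_{\alpha_4} |0\rangle
\end{aligned}$$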
From this we find that
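$$\begin{aligned}
A = \langle 0|\, a_{\alpha_2} a_{\alpha_1} a^\dagger_k a_{k'} a^\dagger_{\alpha_3} a^\dagger_{\alpha_4} |0\rangle
= \; &\lambda\, \langle 0|\, a_{\alpha_2} a_{\alpha_1} a^\dagger_k a^\dagger_{\alpha_3} |0\rangle\, \delta_{\alpha_4 k'} \\
&+ \langle 0|\, a_{\alpha_2} a_{\alpha_1} a^\dagger_k a^\dagger_{\alpha_4} |0\rangle\, \delta_{\alpha_3 k'}
\end{aligned}$$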
Thus,
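$$\langle \alpha_1, \alpha_2|\hat{\Omega}|\alpha_3, \alpha_4\rangle = \sum_k \left[ \lambda\, \omega_{k\alpha_4} \langle 0|\, a_{\alpha_2} a_{\alpha_1} a^\dagger_k a^\dagger_{\alpha_3} |0\rangle + \omega_{k\alpha_3} \langle 0|\, a_{\alpha_2} a_{\alpha_1} a^\dagger_k a^\dagger_{\alpha_4} |0\rangle \right]$$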
Both terms involve a generic matrix element of the form < 0| aαaβa†γa†δ| 0 >. In a
homework problem, I will ask you to confirm that
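$$\langle 0|\, a_\alpha a_\beta a^\dagger_\gamma a^\dagger_\delta |0\rangle = \lambda\, \delta_{\beta\delta}\delta_{\alpha\gamma} + \delta_{\beta\gamma}\delta_{\alpha\delta}$$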
Using this result, we find finally that
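$$\begin{aligned}
\langle \alpha_1, \alpha_2|\hat{\Omega}|\alpha_3, \alpha_4\rangle = \; &\lambda\, \omega_{\alpha_1\alpha_4}\delta_{\alpha_2\alpha_3} + \omega_{\alpha_2\alpha_4}\delta_{\alpha_1\alpha_3} \\
&+ \omega_{\alpha_1\alpha_3}\delta_{\alpha_2\alpha_4} + \lambda\, \omega_{\alpha_2\alpha_3}\delta_{\alpha_1\alpha_4}
\end{aligned}$$
which is identical to the result we obtained earlier for the matrix element of Ω between two-particle states in ordinary (first-quantized) space.
Indeed, the conclusion is general. A Fock space operator of the form $\hat{\Omega} = \sum_{kk'} \omega_{kk'}\, a^\dagger_k a_{k'}$ gives exactly the same matrix elements when taken between n-particle Fock space states as does the corresponding general one-body operator $\Omega = \sum_{i=1}^{n} \omega(i)$ when taken between the corresponding ordinary space states of n particles.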
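The two-particle result can also be checked numerically for fermions (λ = −1). In this sketch the matrix `omega` is a random stand-in for the one-body matrix elements, the four single-particle states are an arbitrary choice, and the Jordan-Wigner helper `fermion_annihilators` is an assumption made for the illustration rather than part of the notes.

```python
import numpy as np
from functools import reduce
from itertools import product

def fermion_annihilators(n_modes):
    """Jordan-Wigner matrices for n_modes fermion annihilation operators."""
    z = np.diag([1.0, -1.0])
    lower = np.array([[0.0, 1.0], [0.0, 0.0]])
    one = np.eye(2)
    return [reduce(np.kron, [z] * k + [lower] + [one] * (n_modes - k - 1))
            for k in range(n_modes)]

n_modes = 4
a = fermion_annihilators(n_modes)
a_dag = [op.T for op in a]
vacuum = np.zeros(2 ** n_modes)
vacuum[0] = 1.0

rng = np.random.default_rng(1)
omega = rng.normal(size=(n_modes, n_modes))  # hypothetical one-body matrix omega_{kk'}
Omega_hat = sum(omega[k, kp] * a_dag[k] @ a[kp]
                for k in range(n_modes) for kp in range(n_modes))

def ket(i, j):
    """|i, j> = a_dag_i a_dag_j |0>."""
    return a_dag[i] @ a_dag[j] @ vacuum

def closed_form(a1, a2, a3, a4, lam=-1):
    """Right-hand side of the two-particle matrix element derived in the text."""
    d = lambda x, y: float(x == y)
    return (lam * omega[a1, a4] * d(a2, a3) + omega[a2, a4] * d(a1, a3)
            + omega[a1, a3] * d(a2, a4) + lam * omega[a2, a3] * d(a1, a4))

max_error = max(
    abs(ket(a1, a2) @ Omega_hat @ ket(a3, a4) - closed_form(a1, a2, a3, a4))
    for a1, a2, a3, a4 in product(range(n_modes), repeat=4)
    if a1 != a2 and a3 != a4
)
print(max_error)
```

Because the matrices are real, `ket(a1, a2) @ Omega_hat @ ket(a3, a4)` is the matrix element with the bra taken as the adjoint of the ket, i.e. with the annihilation operators in reversed order.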
In fact, one can prove exactly the same thing for a general two-body operator
$$V = \frac{1}{2} \sum_{i \neq j} V(i, j)$$
Note: Since we are dealing with identical particles V (i, j) = V (j, i). The factor of 1/2 is included since for each pair (i, j) with i ≠ j, there also occurs the pair (j, i) for which the interaction is identical.
The two-body matrix elements of V (i, j) can be written in the form
Vαβ,α′β′ =< ϕα(i)ϕβ(j)|V (i, j)|ϕα′(i)ϕβ′(j) > +λ < ϕα(i)ϕβ(j)|V (i, j)|ϕβ′(i)ϕα′(j) >
as a sum of two integrals, the first called the direct integral and the second called the
exchange integral, as it arises from the exchange of the labels α ′ and β ′.
I now claim that the Fock space operator
$$\hat{V} = \frac{1}{4} \sum_{k_1 k_2 k_3 k_4 = 1}^{\infty} V_{k_1 k_2 k_3 k_4}\, a^\dagger_{k_1} a^\dagger_{k_2} a_{k_4} a_{k_3}$$
has the same matrix elements in Fock space as given above in ordinary first-quantized space.
Note that the inverted order of the last two annihilation operators is indeed not a typo.
My discussion up to now on one- and two-body operators has been fairly general. Nev-
ertheless, from the notation you might have guessed that I was gearing up for a discussion
of a specific QM operator, the hamiltonian. So, consider a hamiltonian which in coordinate
space could be written as
H = H0 + V
where
$$H_0 = \sum_{i=1}^{n} h_0(i)$$
and
$$V = \frac{1}{2} \sum_{i \neq j = 1}^{n} v(i, j)$$
For example, in discussing atoms, h0 might refer to the kinetic energy of electron i plus
its Coulomb interaction with the nuclear core and v(i, j) might be the electron-electron
Coulomb interaction.
From the preceding discussion, we can immediately write down the corresponding Fock
space hamiltonian as
$$\hat{H} = \sum_{kk'} \langle k|h_0|k'\rangle\, a^\dagger_k a_{k'} + \frac{1}{4} \sum_{ijkl} V_{ijkl}\, a^\dagger_i a^\dagger_j a_l a_k$$
Had we been clever enough to define our Fock space in terms of the single-particle eigen-
states of h0, i.e. by h0|k >= ϵk|k >, then
$$\hat{H} = \sum_k \epsilon_k\, a^\dagger_k a_k + \frac{1}{4} \sum_{ijkl} V_{ijkl}\, a^\dagger_i a^\dagger_j a_l a_k$$
There is an important feature of this Fock space formalism relative to the coordinate
space formalism that I would now like to note. The point is that the coordinate space
hamiltonian H depends on the number of particles n, whereas the corresponding Fock space
hamiltonian $\hat{H}$ applies to systems with any number of particles n. Of course, since $\hat{H}$ is
a number-conserving operator, its matrix decomposes into submatrices along the diagonal,
each one corresponding to a different number of particles.
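The block structure can be made concrete with a small fermion example. Everything specific here is a hypothetical choice for illustration: four single-particle states, the energies `eps`, the strength `g`, and a simple two-body term that moves a pair between levels. The Jordan-Wigner helper is likewise an assumption, not part of the notes.

```python
import numpy as np
from functools import reduce

def fermion_annihilators(n_modes):
    """Jordan-Wigner matrices for n_modes fermion annihilation operators."""
    z = np.diag([1.0, -1.0])
    lower = np.array([[0.0, 1.0], [0.0, 0.0]])
    one = np.eye(2)
    return [reduce(np.kron, [z] * k + [lower] + [one] * (n_modes - k - 1))
            for k in range(n_modes)]

n_modes = 4
a = fermion_annihilators(n_modes)
a_dag = [op.T for op in a]

eps = [0.0, 1.0, 2.0, 3.0]  # hypothetical single-particle energies
g = 0.5                     # hypothetical interaction strength

N = sum(a_dag[k] @ a[k] for k in range(n_modes))
pair_01 = a_dag[0] @ a_dag[1]  # creates a pair in states 0 and 1
pair_23 = a_dag[2] @ a_dag[3]  # creates a pair in states 2 and 3
H = (sum(eps[k] * a_dag[k] @ a[k] for k in range(n_modes))
     - g * (pair_01 @ pair_23.T + pair_23 @ pair_01.T))

# N is diagonal in this basis, so each basis state has a definite particle number
particle_number = np.rint(np.diag(N)).astype(int)
same_n = particle_number[:, None] == particle_number[None, :]
max_off_block = np.abs(H[~same_n]).max()    # largest element linking different n
block_sizes = np.bincount(particle_number)  # dimension of each n-particle block
print(max_off_block, block_sizes)
```

A single 16 x 16 matrix thus contains the hamiltonian for n = 0, 1, 2, 3 and 4 particles at once, with no elements connecting different n.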
At first glance, it seems that we have gained little by going over to Fock space, except for
the bookkeeping simplifications I’ve mentioned earlier. [In fact, many of you might not yet
be convinced that the bookkeeping has been simplified, but take my word for it that it has.]
But we’ve gained something else. Since the Fock space hamiltonian is independent of n, we
can construct its complete matrix corresponding to all states at the same time. Once we
have expressed our physical problem in terms of the mathematical problem of diagonalizing
a certain matrix, we can resort to any mathematically justifiable procedure we want. The
larger the matrix at our disposal, the richer are our choices. In fact, a very powerful way of
approximately diagonalizing H involves carrying out a transformation to a representation
which mixes states with different numbers of particles. This is the basis of the so-called BCS
approximation used to describe superconductivity, as we will discuss later in the semester.
Such a number non-conserving method would not be possible without the use of Fock space,
where all numbers of particles can (if we wish) be treated at the same time.
Symmetries
We have just finished a discussion of “number conserving operators”, which play a par-
ticularly important role in treatments of nonrelativistic quantum systems which typically
have a well defined number of particles. That physical systems have a well-defined number
of particles is an example of a symmetry principle or conservation law. We have seen that
this has important consequences in our second quantized description of the system. On the
one hand, it tells us that the (true) hamiltonian describing the system can be built up in
terms of number-conserving operators only. And, furthermore, it permits us (if we wish) to
diagonalize H in a subspace of the full Fock space, namely the subspace of states with the
correct number of particles.
There are many other important symmetry principles for quantum systems, which likewise
can be used to simplify the treatment of complex many-body systems of identical particles,
in much the same way as did Conservation of Particle Number. Some of the better known
symmetries, all of which were discussed in PHYS811, include Rotational Invariance, Invari-
ance under Space Reflections and Invariance under Time Reversal. I would now like to briefly
show how some of these symmetries are manifested in our second quantized framework.
Rotational invariance and angular momentum
As we have seen, invariance under spatial rotations leads to the concept of conservation
of angular momentum. In analogy with conservation of particle number, this leads to two
principal consequences:
1. that hamiltonians for isolated systems must be scalars (or, equivalently, Irreducible
Tensor Operators of rank 0), and
2. that we can (if we wish) only diagonalize H in subspaces corresponding to states with
given total angular momentum.
Of course, it is possible that under certain circumstances it might be useful to relax the
symmetry to develop useful approximation schemes, as we discussed doing for particle num-
ber conservation. Indeed, this is what is often done in the so-called Hartree Fock procedure,
which I will describe later in the semester.
Considering the importance of rotational invariance in so many quantum systems, it is
useful to discuss in greater detail how it enters in our Fock space formalism.
If the total hamiltonian H is rotationally invariant, it is natural (though not essential) to
define the Fock space states in terms of a single-particle hamiltonian h which is rotationally
invariant. The eigenstates of such a hamiltonian can be expressed as | αi ji mi >, where ji
is the total angular momentum quantum number, mi is its z projection, and αi are all other
quantum numbers (e.g. the principal quantum number). These states satisfy the following
eigenvalue equations
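$$\begin{aligned}
h\, |\alpha_i\, j_i\, m_i\rangle &= \epsilon_i\, |\alpha_i\, j_i\, m_i\rangle \\
J^2\, |\alpha_i\, j_i\, m_i\rangle &= j_i(j_i + 1)\, |\alpha_i\, j_i\, m_i\rangle \\
J_z\, |\alpha_i\, j_i\, m_i\rangle &= m_i\, |\alpha_i\, j_i\, m_i\rangle
\end{aligned}$$
where I am for notational simplicity setting $\hbar = 1$.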
Let’s now consider the form of these various operators in our second quantized formalism.
The angular momentum operator is an Irreducible Tensor Operator of rank one. The
three components of this ITO are
$$J^0_1 = J_z , \qquad J^1_1 = -\frac{1}{\sqrt{2}} J_+ , \qquad J^{-1}_1 = \frac{1}{\sqrt{2}} J_-$$
The total angular momentum squared operator is
$$J^2 = \mathbf{J} \cdot \mathbf{J} = J^0_1 J^0_1 - J^1_1 J^{-1}_1 - J^{-1}_1 J^1_1$$
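Now let’s look at all these operators in Fock space. Since J is a one-body operator, each of its three components is given by
$$J^q_1 = \sum_{\alpha_1 j_1 m_1,\, \alpha_2 j_2 m_2} \langle \alpha_1 j_1 m_1|J^q_1|\alpha_2 j_2 m_2\rangle\; a^\dagger_{\alpha_1 j_1 m_1} a_{\alpha_2 j_2 m_2}$$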
The one-particle matrix elements of Jq1 can be obtained using the Wigner Eckart theorem.
The result, as derived last semester, is
$$\langle \alpha_1 j_1 m_1|J^q_1|\alpha_2 j_2 m_2\rangle = -\sqrt{j_1(j_1 + 1)}\; (1 q\, j_2 m_2|j_1 m_1)\; \delta_{\alpha_1\alpha_2}\delta_{j_1 j_2}$$
Thus,
$$J^q_1 = -\sum_{\alpha j m} \sqrt{j(j + 1)}\; (1 q\, j m|j\, m{+}q)\; a^\dagger_{\alpha j\, m+q}\, a_{\alpha j m}$$
where I have taken into account that the Clebsch Gordan coefficient is only non-zero when
the m values sum appropriately.
Putting in explicit formulae for the Clebsch Gordan coefficients, we obtain
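$$J^0_1 = \sum_{\alpha j m} m\; a^\dagger_{\alpha j m} a_{\alpha j m}$$
$$J^1_1 = -\frac{1}{\sqrt{2}} \sum_{\alpha j m} \sqrt{(j + m + 1)(j - m)}\; a^\dagger_{\alpha j\, m+1}\, a_{\alpha j m}$$
$$J^{-1}_1 = \frac{1}{\sqrt{2}} \sum_{\alpha j m} \sqrt{(j - m + 1)(j + m)}\; a^\dagger_{\alpha j\, m-1}\, a_{\alpha j m}$$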
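These Fock space expressions can be tested for a single j shell. The sketch below takes j = 3/2 for fermions (an arbitrary illustrative choice), builds the operators from Jordan-Wigner matrices (the helper `fermion_annihilators` is an assumption, not part of the notes), and checks the angular momentum algebra on the full 16-dimensional Fock space.

```python
import numpy as np
from functools import reduce

def fermion_annihilators(n_modes):
    """Jordan-Wigner matrices for n_modes fermion annihilation operators."""
    z = np.diag([1.0, -1.0])
    lower = np.array([[0.0, 1.0], [0.0, 0.0]])
    one = np.eye(2)
    return [reduce(np.kron, [z] * k + [lower] + [one] * (n_modes - k - 1))
            for k in range(n_modes)]

j = 1.5
m_values = [-1.5, -0.5, 0.5, 1.5]  # state k carries projection m_values[k]
a = fermion_annihilators(len(m_values))
a_dag = [op.T for op in a]

J_z = sum(m * a_dag[k] @ a[k] for k, m in enumerate(m_values))
J_plus = sum(np.sqrt((j - m) * (j + m + 1)) * a_dag[k + 1] @ a[k]
             for k, m in enumerate(m_values[:-1]))
J_minus = J_plus.T

# Spherical components and J^2 as written in the text
J0, Jp1, Jm1 = J_z, -J_plus / np.sqrt(2.0), J_minus / np.sqrt(2.0)
J_squared = J0 @ J0 - Jp1 @ Jm1 - Jm1 @ Jp1

vacuum = np.zeros(2 ** len(m_values))
vacuum[0] = 1.0
one_particle = a_dag[2] @ vacuum  # a single particle with m = +1/2

algebra_ok = (np.allclose(J_plus @ J_minus - J_minus @ J_plus, 2.0 * J_z)
              and np.allclose(J_z @ J_plus - J_plus @ J_z, J_plus))
j_squared_ok = np.allclose(J_squared @ one_particle, j * (j + 1) * one_particle)
print(algebra_ok, j_squared_ok)
```

The commutators hold on states with any number of particles, not just on one-particle states, which is the content of writing J as a one-body Fock space operator.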
Next we consider
$$J^2 = J^0_1 J^0_1 - J^1_1 J^{-1}_1 - J^{-1}_1 J^1_1$$
in terms of the Fock space forms for $J^q_1$. The net result will be a sum of a one-body term and a two-body term. This is already obvious by considering the first term $J^0_1 J^0_1$, which is
$$J^0_1 J^0_1 = \sum_{\alpha j m,\, \alpha' j' m'} m\, m'\; a^\dagger_{\alpha j m} a_{\alpha j m}\, a^\dagger_{\alpha' j' m'} a_{\alpha' j' m'}$$
Noting that
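$$a_{\alpha j m}\, a^\dagger_{\alpha' j' m'} = \lambda\, a^\dagger_{\alpha' j' m'}\, a_{\alpha j m} + \delta_{\alpha\alpha'}\delta_{jj'}\delta_{mm'}$$
we can rewrite this as
$$J^0_1 J^0_1 = \sum_{\alpha j m,\, \alpha' j' m'} \lambda\, m\, m'\; a^\dagger_{\alpha j m} a^\dagger_{\alpha' j' m'}\, a_{\alpha j m} a_{\alpha' j' m'} + \sum_{\alpha j m} m^2\; a^\dagger_{\alpha j m} a_{\alpha j m}$$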
The first term is a two-body operator and the second term is a one-body operator.
Next I would like to consider the angular momentum properties of the creation and
annihilation operators a†αjm and aαjm. I make the following assertions:
1. that $a^\dagger_{\alpha j m}$ is an Irreducible Tensor Operator (ITO) of rank j and projection m,
2. that $a_{\alpha j m}$ is not an ITO, and
3. that $\tilde{a}_{\alpha j m} = (-)^{j+m}\, a_{\alpha j\, -m}$ is an ITO of rank j and projection m.
You will be asked to prove these assertions as a homework problem, by using the Fock space
analogs of the commutation relations that define an ITO.
The point of all of this is that, as we know, ITO’s can be coupled together using Clebsch
Gordan coefficients to produce new ITO’s, just as we can couple together wave functions
with good angular momentum properties to get product states of good angular momentum.
Thus, if we have two ITO’s, $A^{q_1}_{k_1}$ and $B^{q_2}_{k_2}$, and we construct the linear combination
$$\sum_{q_1 q_2\, (q_1 + q_2 = M)} (k_1 q_1\, k_2 q_2|K M)\; A^{q_1}_{k_1} B^{q_2}_{k_2}$$
we will end up with an ITO of rank K and projection M. I will denote this beast as
$$\left[ A_{k_1} B_{k_2} \right]^M_K$$
We can use these ideas to make explicit the angular momentum properties of Fock space
operators. Consider, for example, the number operator
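$$N = \sum_{\alpha j m} a^\dagger_{\alpha j m} a_{\alpha j m}$$
As an exercise, you should convince yourself that
$$N = \sum_{\alpha j} \sqrt{2j + 1}\; \left[ a^\dagger_{\alpha j}\, \tilde{a}_{\alpha j} \right]^0_0$$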
In this form, it is clear that N is an angular momentum scalar, as clearly it must be.
Using the same techniques, we can also show that both the one- and two-body parts of
the hamiltonian H are scalar operators. And this is of course consistent with the fact that
the eigenstates of H have definite angular momenta.
I’d now like to briefly address the physical significance of the operator $\tilde{a}_{\alpha j m}$ I introduced earlier. As I asserted, and you will prove, it is this operator, not $a_{\alpha j m}$, that has the properties of an ITO of rank j and projection m. To understand this, consider a fermion state vector in which all of the 2j + 1 possible m substates associated with a given j value are occupied, i.e.
$$a^\dagger_{\alpha j\, j}\, a^\dagger_{\alpha j\, j-1} \cdots a^\dagger_{\alpha j\, -j}\, |0\rangle$$
It is easy to prove, and you will be asked to do so, that such a state is an eigenstate of $J^2$ and $J_z$ with total angular momentum J = 0 and total projection M = 0. I will denote this state as
$$\Phi^0_0 \left( a^\dagger_{\alpha j} \right)^{2j+1} |0\rangle$$
This state can be decomposed using Clebsch Gordan coefficients as
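$$\Phi^0_0 \left( a^\dagger_{\alpha j} \right)^{2j+1} |0\rangle \propto \sum_m (j m\, j\, {-m}|0 0)\; a^\dagger_{\alpha j m}\; \Phi^{-m}_j \left( a^\dagger_{\alpha j} \right)^{2j} |0\rangle$$
where $\Phi^{-m}_j \left( a^\dagger_{\alpha j} \right)^{2j}$ is a 2j-particle state with total angular momentum j and total projection −m.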
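If we act with $a_{\alpha j m}$ on $\Phi^0_0 \left( a^\dagger_{\alpha j} \right)^{2j+1} |0\rangle$, we obtain
$$a_{\alpha j m}\, \Phi^0_0 \left( a^\dagger_{\alpha j} \right)^{2j+1} |0\rangle \propto (j m\, j\, {-m}|0 0)\; \Phi^{-m}_j \left( a^\dagger_{\alpha j} \right)^{2j} |0\rangle$$
since the $a_{\alpha j m}$ annihilates one $a^\dagger_{\alpha j m}$.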
But
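$$(j m\, j\, {-m}|0 0) = \frac{(-)^{j-m}}{\sqrt{2j + 1}}$$
so that
$$a_{\alpha j m}\, \Phi^0_0 \left( a^\dagger_{\alpha j} \right)^{2j+1} |0\rangle \propto (-)^{j-m}\; \Phi^{-m}_j \left( a^\dagger_{\alpha j} \right)^{2j} |0\rangle$$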
Clearly aαjm acting on a closed-shell system (which had angular momentum 0) does not
produce a state with angular momentum j and projection m. On the other hand,
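$$\begin{aligned}
\tilde{a}_{\alpha j m}\, \Phi^0_0 \left( a^\dagger_{\alpha j} \right)^{2j+1} |0\rangle
&= (-)^{j+m}\, a_{\alpha j\, -m}\, \Phi^0_0 \left( a^\dagger_{\alpha j} \right)^{2j+1} |0\rangle \\
&\propto \Phi^{m}_j \left( a^\dagger_{\alpha j} \right)^{2j} |0\rangle
\end{aligned}$$
Thus, $\tilde{a}_{\alpha j m}$ when acting on a closed-shell state does produce a state of definite angular momentum j and projection m, as an ITO must.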
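The closed-shell argument can be checked numerically for a j = 3/2 shell of fermions. This is a sketch under assumptions: the shell, the choice m = 1/2, and the Jordan-Wigner helper `fermion_annihilators` are illustrative and not taken from the notes.

```python
import numpy as np
from functools import reduce

def fermion_annihilators(n_modes):
    """Jordan-Wigner matrices for n_modes fermion annihilation operators."""
    z = np.diag([1.0, -1.0])
    lower = np.array([[0.0, 1.0], [0.0, 0.0]])
    one = np.eye(2)
    return [reduce(np.kron, [z] * k + [lower] + [one] * (n_modes - k - 1))
            for k in range(n_modes)]

j = 1.5
m_values = [-1.5, -0.5, 0.5, 1.5]
a = fermion_annihilators(len(m_values))
a_dag = [op.T for op in a]
vacuum = np.zeros(2 ** len(m_values))
vacuum[0] = 1.0

J_z = sum(m * a_dag[k] @ a[k] for k, m in enumerate(m_values))
J_plus = sum(np.sqrt((j - m) * (j + m + 1)) * a_dag[k + 1] @ a[k]
             for k, m in enumerate(m_values[:-1]))
J_squared = J_z @ J_z + 0.5 * (J_plus @ J_plus.T + J_plus.T @ J_plus)

# All 2j + 1 substates occupied
closed_shell = a_dag[3] @ a_dag[2] @ a_dag[1] @ a_dag[0] @ vacuum
closed_is_scalar = (np.allclose(J_squared @ closed_shell, 0.0)
                    and np.allclose(J_z @ closed_shell, 0.0))

m = 0.5
k_plus, k_minus = m_values.index(m), m_values.index(-m)
plain = a[k_plus] @ closed_shell                                # a_{jm}
tilded = (-1) ** int(round(j + m)) * a[k_minus] @ closed_shell  # (-)^{j+m} a_{j,-m}

plain_projection = plain @ J_z @ plain     # projection left behind by a_{jm}
tilded_projection = tilded @ J_z @ tilded  # projection left behind by the tilded operator
print(closed_is_scalar, plain_projection, tilded_projection)
```

Removing the particle with projection m leaves total projection −m, whereas the tilded operator leaves +m, in line with the discussion above.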
Such a result is not unique to the angular momentum representation. Something very
similar occurs in the linear momentum representation, which is often more useful when
dealing with condensed matter systems. Here we define our single-particle state vectors,
and thus our Fock space, to be eigenstates of the linear momentum operator
p|k >= k|k >
In this representation, $a^\dagger_k$ creates a particle of momentum k and $a_k$ annihilates a particle of momentum k.
What happens if we act with $a_k$ on a many-body state with total K = 0? If the state |k > is occupied, the operator will annihilate a particle in that state, leaving behind a state with momentum −k (conservation of total linear momentum). The operator which produces a momentum eigenstate with eigenvalue k is
$$\tilde{a}_k = a_{-k}$$
What if anything is the relationship between the two operators $\tilde{a}_{\alpha j m}$ (in an angular
momentum representation) and $\tilde{a}_k$ (in a linear momentum representation)? The answer is
contained in the properties of the time reversal operator we learned about last semester. So,
let’s see how.
Denoting the time reversal operator by θ and acting with it on an angular momentum
eigenstate |nljm > gives
$$\theta\, |n l j m\rangle = (-)^{j-m}\, |n l j\, {-m}\rangle$$
as you should remember from last semester.
Similarly, one-particle momentum eigenstates transform (as should be obvious) under
time reversal as
θ|k >= | − k >
Comparing the above relations with the defining relations for the tilded operators $\tilde{a}_{\alpha j m}$ and $\tilde{a}_k$, we see that
$$\tilde{a}_{\alpha j m} = \theta\, a_{\alpha j m}\, \theta^{-1}$$
and
$$\tilde{a}_k = \theta\, a_k\, \theta^{-1}$$
Thus, $\tilde{a}_{\alpha j m}$ and $\tilde{a}_k$ do not annihilate particles in | αjm > and | k >, respectively. Rather, they annihilate particles in the time reversed single-particle states $(-)^{j-m}\, |\alpha j\, {-m}\rangle$ and | − k >, respectively.
I will at times use the notation $|\bar{\alpha}\rangle$ to denote the time-reversed state of | α >, viz:
$$|\bar{\alpha}\rangle = \theta\, |\alpha\rangle$$
From the above, it should be obvious that
$$\tilde{a}_\alpha = a_{\bar{\alpha}}$$
Time reversal invariance
When a single-particle hamiltonian for spin-1/2 fermions is time-reversal invariant, its one-particle eigenstates must be doubly degenerate. This, as you hopefully remember from last semester, is known as Kramers’ degeneracy. In the above notation, this means that for any single-particle state |α >, there corresponds another single-particle state $|\bar{\alpha}\rangle$ such that
$$\epsilon_\alpha = \epsilon_{\bar{\alpha}}$$
To incorporate this time-reversal property into our formalism (assuming of course we are
dealing with spin-1/2 fermions), it is useful to split the full space of single-particle states
|α > into two parts, which we denote by α > 0 and α < 0. To each state |α > with α > 0 there will correspond another state $|\bar{\alpha}\rangle$ with $\bar{\alpha} < 0$ and the same energy. For example, for each single-particle eigenstate |nljmj > of a rotationally invariant hamiltonian with mj > 0, there is a degenerate time-reversed partner state for which mj < 0. We can span the full Fock space in terms of the creation operators
$$a^\dagger_\alpha \quad \text{and} \quad a^\dagger_{\bar{\alpha}}$$
with α > 0. We will often use this simplification.
Quasiparticles
The formalism we have developed so far involves the following ingredients:
(a) a vacuum state | 0 >, and
(b) creation and annihilation operators a†α and aα, with either commutation or anticommutation
relations, depending on whether we are dealing with bosons or fermions. These
operators create/annihilate a particle in the chosen single-particle basis | α >.
All state vectors in Fock space are generated by acting with creation operators on the
vacuum. All operators are generated in terms of the fundamental creation and annihilation
operators. The link between the two ingredients (a) and (b) of the formalism is contained
in the relation
aα| 0 >= 0 , for all α
As should be obvious, the above formalism is in terms of “real” (honest-to-goodness)
particles. Our creation and annihilation operators create or annihilate particles; our vacuum
state has zero particles, etc.
I now claim that it is also possible to develop alternative algebras that are
mathematically identical to the one just described except that they are not in terms of
real particles. The kinds of beasts that enter in this mathematically equivalent formalism
will be called quasiparticles and will be useful in several different areas of nonrelativistic
many-body theory. I will focus on them solely for systems of spin-1/2 fermions with their
characteristic time reversal symmetry properties.
In this quasiparticle formalism, one introduces quasiparticle creation and annihilation
operators c†α and cα which are related to the particle creation and annihilation operators a†α
and aα by
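$$c^\dagger_\alpha = u_\alpha\, a^\dagger_\alpha - v_\alpha\, a_{\bar{\alpha}}$$
$$c_{\bar{\alpha}} = u_\alpha\, a_{\bar{\alpha}} + v_\alpha\, a^\dagger_\alpha$$
with
$$u^2_\alpha + v^2_\alpha = 1$$
Clearly, $c^\dagger_\alpha$ does not create a real particle in state α. It creates in part a particle in the state α and in part a hole in the time reversed state $\bar{\alpha}$. The funny thing it creates is called a quasiparticle.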
I now make the following claims which you will be asked to verify in the homework :
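1. In order for the equations for $c^\dagger_\alpha$ and $c_{\bar{\alpha}}$ to be consistent it is necessary that
$$u_{\bar{\alpha}} = u_\alpha \quad \text{and} \quad v_{\bar{\alpha}} = -v_\alpha$$
2. The set of quasiparticle creation and annihilation operators satisfy the following set of anticommutation relations:
$$\{c^\dagger_\alpha, c^\dagger_\beta\} = \{c_\alpha, c_\beta\} = 0$$
$$\{c_\alpha, c^\dagger_\beta\} = \delta_{\alpha\beta}$$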
Thus, the quasiparticle operators satisfy exactly the same anticommutation algebra as
do the real particle operators.
But as noted earlier, the introduction of creation and annihilation operators is not enough
to specify a second-quantized algebra. One must also introduce an “appropriate” vacuum
state. Let’s denote this vacuum state by $|\tilde{0}\rangle$ and refer to it as the quasiparticle vacuum.
If quasiparticles are to have mathematically the same structure as real particles, the quasiparticle vacuum state must satisfy
$$c_\alpha\, |\tilde{0}\rangle = 0 , \quad \text{for all } \alpha$$
I claim that this would be the case if we chose the quasiparticle vacuum state to be
$$|\tilde{0}\rangle = \prod_\beta c_\beta\, |0\rangle$$
That this quasiparticle vacuum is indeed annihilated by all quasiparticle annihilation operators can be shown simply by rewriting
$$|\tilde{0}\rangle = \pm\, c_\alpha \prod_{\beta\, (\beta \neq \alpha)} c_\beta\, |0\rangle$$
where the ± sign comes from anticommuting $c_\alpha$ past as many $c_\beta$’s as are necessary to get it in the first position. Then since $c_\alpha c_\alpha = 0$ for any α, it is clear that $c_\alpha\, |\tilde{0}\rangle = 0$ for any α.
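We can in fact express the quasiparticle vacuum state in terms of the real vacuum and real particles as follows:
$$|\tilde{0}\rangle = \prod_\alpha \left( u_\alpha\, a_\alpha - v_\alpha\, a^\dagger_{\bar{\alpha}} \right) |0\rangle$$
Following my earlier discussion of time reversal invariance, I divide the full set α into those with α > 0 and those with α < 0. We can then rewrite the quasiparticle vacuum as
$$|\tilde{0}\rangle = \prod_{\alpha > 0} \left( u_\alpha\, a_\alpha - v_\alpha\, a^\dagger_{\bar{\alpha}} \right) \left( u_\alpha\, a_{\bar{\alpha}} + v_\alpha\, a^\dagger_\alpha \right) |0\rangle$$
where I have used the fact that $u_{\bar{\alpha}} = u_\alpha$ and $v_{\bar{\alpha}} = -v_\alpha$, which as discussed earlier was required for consistency. Expanding out, we get
$$|\tilde{0}\rangle = \prod_{\alpha > 0} \left( u^2_\alpha\, a_\alpha a_{\bar{\alpha}} + u_\alpha v_\alpha\, a_\alpha a^\dagger_\alpha - u_\alpha v_\alpha\, a^\dagger_{\bar{\alpha}} a_{\bar{\alpha}} - v^2_\alpha\, a^\dagger_{\bar{\alpha}} a^\dagger_\alpha \right) |0\rangle$$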
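The first and third terms in brackets give zero contribution, since $a_{\bar{\alpha}}\, |0\rangle = 0$. The second term can be rewritten as
$$u_\alpha v_\alpha\, a_\alpha a^\dagger_\alpha\, |0\rangle = -u_\alpha v_\alpha\, a^\dagger_\alpha a_\alpha\, |0\rangle + u_\alpha v_\alpha\, |0\rangle = u_\alpha v_\alpha\, |0\rangle$$
since $a_\alpha\, |0\rangle = 0$. Thus,
$$\begin{aligned}
|\tilde{0}\rangle &= \prod_{\alpha > 0} \left( u_\alpha v_\alpha - v^2_\alpha\, a^\dagger_{\bar{\alpha}} a^\dagger_\alpha \right) |0\rangle \\
&= \prod_{\alpha > 0} \left( u_\alpha v_\alpha + v^2_\alpha\, a^\dagger_\alpha a^\dagger_{\bar{\alpha}} \right) |0\rangle \\
&= \prod_{\alpha > 0} u_\alpha v_\alpha \prod_{\alpha > 0} \left( 1 + \frac{v_\alpha}{u_\alpha}\, a^\dagger_\alpha a^\dagger_{\bar{\alpha}} \right) |0\rangle
\end{aligned}$$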
The first factor is just a number and is there to guarantee that
$$\langle \tilde{0}|\tilde{0}\rangle = 1$$
i.e. that the quasiparticle vacuum is normalized. If we expand the second factor, we get a term with zero real creation operators, then one with 2 real creation operators, then one with 4 real creation operators, etc. Clearly, $|\tilde{0}\rangle$ is not an eigenstate of real particle number. But as we’ll see later, it is nevertheless quite useful.
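Now let’s consider the occupation number of a given single-particle state α in the quasiparticle vacuum. I will denote it as $\eta_\alpha$ and it is given by
$$\eta_\alpha = \frac{\langle \tilde{0}|\, a^\dagger_\alpha a_\alpha\, |\tilde{0}\rangle}{\langle \tilde{0}|\tilde{0}\rangle}$$
To evaluate this, we invert the defining relations that connect the quasiparticle operators to the particle operators. The inverted equations are, as you can easily confirm,
$$a^\dagger_\alpha = u_\alpha\, c^\dagger_\alpha + v_\alpha\, c_{\bar{\alpha}}$$
$$a_{\bar{\alpha}} = u_\alpha\, c_{\bar{\alpha}} - v_\alpha\, c^\dagger_\alpha$$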
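Thus,
$$\eta_\alpha = \frac{\langle \tilde{0}| \left( u_\alpha\, c^\dagger_\alpha + v_\alpha\, c_{\bar{\alpha}} \right) \left( u_\alpha\, c_\alpha + v_\alpha\, c^\dagger_{\bar{\alpha}} \right) |\tilde{0}\rangle}{\langle \tilde{0}|\tilde{0}\rangle}$$
Using the fact that $c_\alpha\, |\tilde{0}\rangle = c_{\bar{\alpha}}\, |\tilde{0}\rangle = 0$ and that $\langle \tilde{0}|\, c^\dagger_\alpha = \langle \tilde{0}|\, c^\dagger_{\bar{\alpha}} = 0$, we obtain
$$\eta_\alpha = \frac{v^2_\alpha\, \langle \tilde{0}|\, c_{\bar{\alpha}} c^\dagger_{\bar{\alpha}}\, |\tilde{0}\rangle}{\langle \tilde{0}|\tilde{0}\rangle}$$
But
$$\begin{aligned}
\langle \tilde{0}|\, c_{\bar{\alpha}} c^\dagger_{\bar{\alpha}}\, |\tilde{0}\rangle &= -\langle \tilde{0}|\, c^\dagger_{\bar{\alpha}} c_{\bar{\alpha}}\, |\tilde{0}\rangle + \langle \tilde{0}|\tilde{0}\rangle \\
&= \langle \tilde{0}|\tilde{0}\rangle
\end{aligned}$$
so that
$$\eta_\alpha = v^2_\alpha$$
Thus, $v^2_\alpha$ measures the “fullness” of state α. Then since $u^2_\alpha + v^2_\alpha = 1$, it is clear that $u^2_\alpha$ measures its “emptiness”.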
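The quasiparticle algebra can be checked for a single pair of states, α and its time-reversed partner (written `abar` in the code). This is a sketch under assumptions: the mixing angle `theta` is a hypothetical value, the Jordan-Wigner helper is not part of the notes, and the vacuum is written in the normalized form (u + v a†α a†ᾱ)| 0 >, which is proportional to the product form derived above.

```python
import numpy as np
from functools import reduce

def fermion_annihilators(n_modes):
    """Jordan-Wigner matrices for n_modes fermion annihilation operators."""
    z = np.diag([1.0, -1.0])
    lower = np.array([[0.0, 1.0], [0.0, 0.0]])
    one = np.eye(2)
    return [reduce(np.kron, [z] * k + [lower] + [one] * (n_modes - k - 1))
            for k in range(n_modes)]

def anticommutator(x, y):
    return x @ y + y @ x

a_alpha, a_abar = fermion_annihilators(2)  # state alpha and its partner abar
theta = 0.7                                # hypothetical mixing angle
u, v = np.cos(theta), np.sin(theta)        # guarantees u^2 + v^2 = 1

# Quasiparticle creation operators, using u_abar = u and v_abar = -v
c_dag_alpha = u * a_alpha.T - v * a_abar
c_dag_abar = u * a_abar.T + v * a_alpha
c_alpha, c_abar = c_dag_alpha.T, c_dag_abar.T

identity = np.eye(4)
algebra_ok = (np.allclose(anticommutator(c_alpha, c_dag_alpha), identity)
              and np.allclose(anticommutator(c_abar, c_dag_abar), identity)
              and np.allclose(anticommutator(c_alpha, c_abar), 0.0)
              and np.allclose(anticommutator(c_alpha, c_dag_abar), 0.0))

vacuum = np.array([1.0, 0.0, 0.0, 0.0])
qp_vacuum = (u * identity + v * a_alpha.T @ a_abar.T) @ vacuum
annihilated = (np.allclose(c_alpha @ qp_vacuum, 0.0)
               and np.allclose(c_abar @ qp_vacuum, 0.0))

occupation = (qp_vacuum @ a_alpha.T @ a_alpha @ qp_vacuum) / (qp_vacuum @ qp_vacuum)
print(algebra_ok, annihilated, occupation, v ** 2)
```

The state `qp_vacuum` mixes the zero-particle and two-particle sectors, so it has no definite particle number even though the occupation of α is well defined.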
A simple application
Assume an ordering of (doubly-degenerate) single-particle energies (i.e. the eigenvalues
of h) such that if α > β (both positive) then ϵα > ϵβ.
[Figure 7: plot with vertical axis $v^2$ running from 0 to 1]
FIG. 7: Occupation numbers for a system involving a set of levels filled up to the Fermi energy.
Next let’s assume that all of the single-particle levels up to a given (positive) λ (with
energy ϵλ) are completely occupied, whereas all those with energies greater than ϵλ are
completely empty. Pictorially, if we plot v2α versus the single-particle level (or single-particle
energy) it will look as in figure 7. The separation point is called the Fermi surface and ϵλ is
referred to as the Fermi energy.
For such a scenario, I claim that
vα = 1 , uα = 0 , for α ≤ λ
vα = 0 , uα = 1 , for α > λ
Thus,
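$$c^\dagger_\alpha = a^\dagger_\alpha , \qquad c_{\bar{\alpha}} = a_{\bar{\alpha}} , \qquad \text{for } \alpha > \lambda$$
and
$$c^\dagger_\alpha = -a_{\bar{\alpha}} , \qquad c_{\bar{\alpha}} = a^\dagger_\alpha , \qquad \text{for } \alpha \leq \lambda$$
The quasiparticle vacuum associated with this scenario is simply
$$|\tilde{0}\rangle = \prod_{0 < \alpha \leq \lambda} a^\dagger_\alpha a^\dagger_{\bar{\alpha}}\, |0\rangle$$
which you can readily confirm satisfies the requirement that
$$c_\alpha\, |\tilde{0}\rangle = 0 , \quad \text{for all } \alpha$$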
What is the significance of the creation operator c†α in this system? Clearly, for particles
outside the “filled inert core”, it creates real particles. For particles within the inert core, it
annihilates real particles, or equivalently it creates real holes.
By using this (simple) quasiparticle transformation, we have devised a formalism whereby
particles outside an inert core and holes inside that same core are treated on an equal footing.
This permits us to straightforwardly concentrate on just the few valence particles and/or holes in
the system, rather than treating all of the particles in the complex many-body system. Since
it requires lots of energy to lift particles from deep below the Fermi surface or to lift particles
to high above the Fermi surface, the dominant excitations in a system near its ground
state will just involve the levels fairly near the Fermi surface. Our simple quasiparticle
transformation enables us to focus on them.
Another application
In a couple of weeks, I will show you how it is possible to treat the uα and vα coefficients
as variational parameters, so as to find the optimum set of noninteracting quasiparticles in
a fermionic system. Such an approach will be meaningful whenever the hamiltonian of the
problem is dominated by so-called pairing correlations. The approximation that will emerge
is the BCS (or Bardeen-Cooper-Schrieffer) approximation, appropriate to superconducting
systems.
Second quantization in coordinate representation - Introduction of field operators
Our discussion of second quantization so far has involved the introduction of a single-
particle basis associated with some single-particle hamiltonian. I would now like to discuss
an alternative representation for second quantized operators, in which we use the continuous
coordinate representation to define Fock space.
In coordinate representation, the relevant single-particle states, |r >, are eigenstates of
the coordinate operator
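$$ \mathbf{r}_{op}\, |\mathbf{r}\rangle = \mathbf{r}\, |\mathbf{r}\rangle $$
Now let’s define the operator that creates a particle at the point r as Ψ†(r), viz:
$$ |\mathbf{r}\rangle = \Psi^\dagger(\mathbf{r})\, |0\rangle \qquad (109) $$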
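These operators can, if we wish, be related to the single-particle creation operators $a_k^\dagger$
appropriate to the set of single-particle Fock space states | k > by inserting the identity
operator $I = \sum_k |k\rangle\langle k|$ into (109), yielding
$$ \begin{aligned}
|\mathbf{r}\rangle &= \sum_k |k\rangle\langle k|\Psi^\dagger(\mathbf{r})|0\rangle \\
&= \sum_k |k\rangle\langle k|\mathbf{r}\rangle \\
&= \sum_k \phi_k^*(\mathbf{r})\, |k\rangle
\end{aligned} $$
where ϕk(r) is the coordinate-space wave function associated with the single-particle state
| k >, i.e.
$$ \phi_k(\mathbf{r}) = \langle \mathbf{r}|k\rangle $$
Thus,
$$ \Psi^\dagger(\mathbf{r}) = \sum_k \phi_k^*(\mathbf{r})\, a_k^\dagger \qquad (110) $$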
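Now what about the operator Ψ(r) that annihilates a particle at point r? Clearly, it
satisfies
$$ \langle \mathbf{r}| = \langle 0|\Psi(\mathbf{r}) $$
Inserting, as before, a complete set of states $I = \sum_k |k\rangle\langle k|$, we find that
$$ \begin{aligned}
\langle \mathbf{r}| &= \sum_k \langle 0|\Psi(\mathbf{r})|k\rangle\langle k| \\
&= \sum_k \langle \mathbf{r}|k\rangle\langle k| \\
&= \sum_k \phi_k(\mathbf{r})\, \langle k|
\end{aligned} $$
Thus,
$$ \Psi(\mathbf{r}) = \sum_k \phi_k(\mathbf{r})\, a_k \qquad (111) $$
which as expected is the hermitean adjoint of Ψ†(r).
The operators Ψ†(r) and Ψ(r) are called field operators.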
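Let’s now assume that this refers to the creation and annihilation of fermions at point r.
Then if we consider the anticommutation relation between Ψ(r) and Ψ†(r′), we obtain
$$ \begin{aligned}
\{\Psi(\mathbf{r}), \Psi^\dagger(\mathbf{r}')\} &= \sum_{k,k'} \phi_k(\mathbf{r})\,\phi_{k'}^*(\mathbf{r}')\, \{a_k, a_{k'}^\dagger\} \\
&= \sum_k \phi_k(\mathbf{r})\,\phi_k^*(\mathbf{r}') \\
&= \delta(\mathbf{r}-\mathbf{r}')
\end{aligned} $$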
As expected, the field operators associated with the creation and annihilation of fermions
satisfy fermion anticommutation relations, but with Dirac delta functions to reflect the fact
that it is a continuous space.
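The last step above is just the completeness of the single-particle basis. A minimal numerical sketch of the discrete analogue follows. It assumes, purely for illustration, that a lattice of `n_sites` points stands in for continuous space and that a random Hermitian matrix `h_sp` plays the role of the single-particle hamiltonian; neither choice comes from these notes. On a lattice the delta function δ(r − r′) becomes a Kronecker delta.

```python
import numpy as np

# Hypothetical discretization: n_sites lattice points stand in for continuous r.
n_sites = 8
rng = np.random.default_rng(1)

# Any Hermitian single-particle hamiltonian will do; this one is random.
m = rng.normal(size=(n_sites, n_sites)) + 1j * rng.normal(size=(n_sites, n_sites))
h_sp = (m + m.conj().T) / 2

# Columns of phi are the wave functions: phi[r, k] = phi_k(r).
_, phi = np.linalg.eigh(h_sp)

# sum_k phi_k(r) phi_k^*(r') for every pair of lattice points (r, r').
completeness = phi @ phi.conj().T

max_deviation = np.abs(completeness - np.eye(n_sites)).max()
print(f"largest deviation from the Kronecker delta: {max_deviation:.2e}")
```

The same check works for any orthonormal basis, which is why the anticommutator does not depend on the single-particle hamiltonian used to define the states | k >.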
If instead the operators referred to the creation and annihilation of bosons, we could
show analogously that
$$ [\Psi(\mathbf{r}), \Psi^\dagger(\mathbf{r}')] = \delta(\mathbf{r}-\mathbf{r}') $$
namely that the field operators satisfy boson commutation relations. Thus, the field operators
we have introduced carry the permutation symmetry character of the particles they are
creating or annihilating.
At this point, it is useful to include intrinsic spin in the discussion, as it is through the
spin that the symmetry character enters. This can be done straightforwardly, as follows.
The field operator that creates a particle at point r with spin orientation s is denoted $\Psi_s^\dagger(\mathbf{r})$.
Likewise the field operator that annihilates a particle at point r with spin orientation s is
denoted $\Psi_s(\mathbf{r})$. These operators satisfy either fermion anticommutation relations
$$ \{\Psi_s(\mathbf{r}), \Psi_{s'}^\dagger(\mathbf{r}')\} = \delta(\mathbf{r}-\mathbf{r}')\,\delta_{ss'} $$
or boson commutation relations
$$ [\Psi_s(\mathbf{r}), \Psi_{s'}^\dagger(\mathbf{r}')] = \delta(\mathbf{r}-\mathbf{r}')\,\delta_{ss'} $$
depending on the permutation symmetry character of the particles involved.
And you can readily convince yourselves that the commutation/anticommutation relations
between two field creation operators or two field annihilation operators likewise take
the expected form.
From the field creation operators we can build up coordinate space wave functions for
identical particles that appropriately reflect the exchange character of the system. For
example, an antisymmetric wave function for identical fermions in coordinate space can be
expressed as
$$ \Psi_{s_1}^\dagger(\mathbf{r}_1)\, \Psi_{s_2}^\dagger(\mathbf{r}_2) \cdots \Psi_{s_n}^\dagger(\mathbf{r}_n)\, |0\rangle $$
which you can readily convince yourselves is antisymmetric under the interchange of the
spatial and spin labels.
Now let’s discuss how we would write the usual operators we are familiar with in terms
of these field operators. I will focus on the operators from which we build the hamiltonian
of the system, namely the kinetic and potential energy operators. As in earlier discussion,
we will assume that we have a two-body interaction only.
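Let’s first consider the one-body kinetic energy operator, which in coordinate space was
$$ T = \frac{\hbar^2}{2m}\, \overleftarrow{\nabla} \cdot \overrightarrow{\nabla} $$
(a form equivalent to $-\hbar^2\nabla^2/2m$ after an integration by parts, for wave functions that
vanish at infinity).
If we write this in terms of an arbitrary Fock space, with creation and annihilation
operators $a_n^\dagger$ and $a_n$, respectively, we find that
$$ T = \sum_{n,n'} \langle n|T|n'\rangle\, a_n^\dagger a_{n'} $$
where the matrix element < n| T | n′ > in coordinate space is given by
$$ \langle n|T|n'\rangle = \frac{\hbar^2}{2m} \int d\mathbf{r}\; \nabla\phi_n^*(\mathbf{r}) \cdot \nabla\phi_{n'}(\mathbf{r}) $$
Then
$$ \begin{aligned}
T &= \frac{\hbar^2}{2m} \sum_{nn'} \int d\mathbf{r}\; \nabla\phi_n^*(\mathbf{r}) \cdot \nabla\phi_{n'}(\mathbf{r})\; a_n^\dagger a_{n'} \\
&= \frac{\hbar^2}{2m} \int d\mathbf{r}\; \nabla\Psi^\dagger(\mathbf{r}) \cdot \nabla\Psi(\mathbf{r})
\end{aligned} $$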
Note that this maintains the natural form we would expect for a second-quantized kinetic
energy operator in coordinate space, except that through the introduction of field operators
we see directly the one-body nature of the kinetic energy operator and furthermore we have
an operator which when its matrix elements are evaluated will automatically reflect the
exchange or permutation character of the identical particles under discussion. It should also
be emphasized that this is the kinetic energy operator to be applied to a many-body system
of identical particles.
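We can of course do the same thing for the two-body potential operator, which in arbitrary
Fock space takes the form
$$ V = \frac{1}{4} \sum_{n_1 n_2 n_3 n_4} \langle n_1 n_2|V|n_3 n_4\rangle\; a_{n_1}^\dagger a_{n_2}^\dagger a_{n_4} a_{n_3} $$
The two-body matrix elements of V that enter are given in coordinate space by
$$ \begin{aligned}
\langle n_1 n_2|V|n_3 n_4\rangle = \int\!\!\int d\mathbf{r}_1 d\mathbf{r}_2 \Big( & \phi_{n_1}^*(\mathbf{r}_1)\phi_{n_2}^*(\mathbf{r}_2)\, V(\mathbf{r}_1-\mathbf{r}_2)\, \phi_{n_3}(\mathbf{r}_1)\phi_{n_4}(\mathbf{r}_2) \\
& \pm\, \phi_{n_1}^*(\mathbf{r}_1)\phi_{n_2}^*(\mathbf{r}_2)\, V(\mathbf{r}_1-\mathbf{r}_2)\, \phi_{n_3}(\mathbf{r}_2)\phi_{n_4}(\mathbf{r}_1) \Big)
\end{aligned} $$
where I use the ± sign to reflect the fact that we need to add or subtract the exchange
integral depending on the symmetry character of the particles.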
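Then,
$$ \begin{aligned}
V &= \frac{1}{4} \sum_{n_1 n_2 n_3 n_4} \int\!\!\int d\mathbf{r}_1 d\mathbf{r}_2\; \phi_{n_1}^*(\mathbf{r}_1)\phi_{n_2}^*(\mathbf{r}_2)\, V(\mathbf{r}_1-\mathbf{r}_2)\, \phi_{n_3}(\mathbf{r}_1)\phi_{n_4}(\mathbf{r}_2)\; a_{n_1}^\dagger a_{n_2}^\dagger a_{n_4} a_{n_3} \\
&\quad \pm \frac{1}{4} \sum_{n_1 n_2 n_3 n_4} \int\!\!\int d\mathbf{r}_1 d\mathbf{r}_2\; \phi_{n_1}^*(\mathbf{r}_1)\phi_{n_2}^*(\mathbf{r}_2)\, V(\mathbf{r}_1-\mathbf{r}_2)\, \phi_{n_3}(\mathbf{r}_2)\phi_{n_4}(\mathbf{r}_1)\; a_{n_1}^\dagger a_{n_2}^\dagger a_{n_4} a_{n_3} \\
&= \frac{1}{2} \int\!\!\int d\mathbf{r}_1 d\mathbf{r}_2\; \Psi^\dagger(\mathbf{r}_1)\Psi^\dagger(\mathbf{r}_2)\, V(\mathbf{r}_1-\mathbf{r}_2)\, \Psi(\mathbf{r}_2)\Psi(\mathbf{r}_1)
\end{aligned} \qquad (112) $$
In obtaining the last equality I made use of the fact that the field annihilation operators
commute or anticommute with one another depending on their exchange character, so that the
two terms give the same contribution. The ordering of the annihilation operators, with Ψ(r2)
to the left of Ψ(r1), mirrors the ordering of $a_{n_4} a_{n_3}$ in the Fock space expression.
Note that here too we arrive at the natural form we would expect for a second-quantized
two-body potential energy operator in coordinate space.
It is useful to give a verbal interpretation to the potential operator expressed in terms of
the quantum field operators. Basically, what it says is the following:
1. The operator first tries to remove particles from points r1 and r2 (first from r1, then
from r2);
2. If it is successful, it contributes an interaction strength V (r1 − r2);
3. It then replaces the particles at those same points, taking care to replace the last
particle it removed first so that it doesn’t introduce any inadvertent sign changes;
4. It then sums over all possible pairs of points r1 and r2 from which the particles can
be removed and then put back;
5. Finally, it compensates for counting each pair of points twice through the
factor 1/2.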
This is all I would like to say about the use of quantum fields at this time in my intro-
duction to second quantization.
Approximation techniques for non-relativistic many-body systems
I would now like to discuss the use of second quantization in developing practical approxi-
mation techniques for dealing with non-relativistic many-particle systems involving identical
particles. The two specific techniques I will develop and discuss are
(1) The Hartree Fock Approximation
(2) The BCS Approximation.
The Hartree Fock Approximation
In the Hartree Fock method, one approximates a many-body system of interacting
fermions by a system of non-interacting fermions, each of which moves in a field created
by all of the others. The method is variational in the sense that one searches for the best
possible such description.
Clearly, such an independent-particle (or mean-field) approximation will be different for
fermions than for bosons. In the case of fermions, each independent particle must occupy a
different independent-particle state, and the lowest state for N particles involves filling up
the N lowest states (a Slater determinant wave function). In the case of bosons, the Pauli
principle does not apply and such a mean-field variational principle would describe the lowest
state of the system by putting all N particles in the energetically-lowest independent-particle
state.
As noted above, the Hartree Fock (HF) method applies to fermions and this is the method
I will discuss. The Hartree Bose approximation, the corresponding variational mean-field
approximation for bosons, is in fact somewhat simpler and I will have you develop it as a
homework assignment.
So, let’s now assume that we have some single-particle hamiltonian
h0(i) = t(i) + U(i) (113)
which is used to generate a set of single-particle states
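$$ |i\rangle = a_i^\dagger\, |0\rangle \qquad (114) $$
The hamiltonian for the system can be expressed in terms of the creation operators $a_i^\dagger$
and their hermitean conjugate annihilation operators $a_i$ as
$$ H = \sum_{ij} t_{ij}\, a_i^\dagger a_j + \frac{1}{4} \sum_{ijkl} V_{ijkl}\, a_i^\dagger a_j^\dagger a_l a_k \qquad (115) $$
where
$$ t_{ij} = \langle i|t|j\rangle $$
and
$$ V_{ijkl} = \langle ij|V|kl\rangle $$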
In Hartree Fock theory, we wish to find the best possible Slater determinant state vec-
tor. Of course, it is not necessarily the Slater determinant built up by putting particles in
the energetically lowest single-particle states | i >, since that set of single-particle states
was chosen arbitrarily, or perhaps for convenience. So, let’s assume that the HF Slater
determinant can be written as
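$$ |\Phi\rangle = b_{\lambda_1}^\dagger b_{\lambda_2}^\dagger \cdots b_{\lambda_N}^\dagger\, |0\rangle \qquad (116) $$
where
$$ b_\lambda^\dagger = \sum_i c_i^\lambda\, a_i^\dagger \qquad (117) $$
Since the $a_i^\dagger$ form a complete set of single-particle creation operators, we can certainly expand
any single-particle creation operator as a linear combination of them.
What we shall do is to determine the $c_i^\lambda$ such that
$$ \frac{\langle\Phi|H|\Phi\rangle}{\langle\Phi|\Phi\rangle} $$
is minimized. This is a well-defined variational problem, with the $c_i^\lambda$ as our variational
parameters.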
Before doing this, however, it’s useful to carry out a bit of preliminary analysis. Clearly,
we want the new set of single-particle state vectors
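$$ |\lambda\rangle = b_\lambda^\dagger\, |0\rangle $$
to form an orthonormal set, i.e.
$$ \langle\lambda|\lambda'\rangle = \delta_{\lambda\lambda'} \qquad (118) $$
Thus,
$$ \langle 0|\, b_\lambda b_{\lambda'}^\dagger\, |0\rangle = \delta_{\lambda\lambda'} $$
or
$$ \sum_{ij} c_i^{\lambda *} c_j^{\lambda'}\, \langle 0|\, a_i a_j^\dagger\, |0\rangle = \delta_{\lambda\lambda'} $$
But
$$ \langle 0|\, a_i a_j^\dagger\, |0\rangle = \langle i|j\rangle = \delta_{ij} $$
so that
$$ \sum_i c_i^{\lambda *} c_i^{\lambda'} = \delta_{\lambda\lambda'} \qquad (119) $$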
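Next we consider
$$ \{b_\lambda^\dagger, b_{\lambda'}^\dagger\} = \sum_{ij} c_i^{\lambda} c_j^{\lambda'}\, \{a_i^\dagger, a_j^\dagger\} = 0 \qquad (120) $$
$$ \{b_\lambda, b_{\lambda'}\} = \sum_{ij} c_i^{\lambda *} c_j^{\lambda' *}\, \{a_i, a_j\} = 0 \qquad (121) $$
and
$$ \begin{aligned}
\{b_\lambda, b_{\lambda'}^\dagger\} &= \sum_{ij} c_i^{\lambda *} c_j^{\lambda'}\, \{a_i, a_j^\dagger\} \\
&= \sum_{ij} c_i^{\lambda *} c_j^{\lambda'}\, \delta_{ij} \\
&= \sum_i c_i^{\lambda *} c_i^{\lambda'} \\
&= \delta_{\lambda\lambda'}
\end{aligned} \qquad (122) $$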
Thus, the b†λ and bλ ′ have the same anticommutation relations as the original creation
and annihilation operators, as expected.
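Relations (120)-(122) can be checked numerically for a small system. The sketch below is an illustration under stated assumptions, not part of the derivation: it represents three fermion modes by explicit matrices using the Jordan-Wigner construction (a standard device that these notes do not introduce), and it takes the coefficients $c_i^\lambda$ to be the rows of a randomly generated unitary matrix. The helper names `fermion_ops` and `anticommutator` are hypothetical.

```python
import numpy as np
from functools import reduce

def fermion_ops(n_modes):
    """Annihilation operators a_0 .. a_{n-1} as matrices (Jordan-Wigner construction)."""
    identity = np.eye(2)
    parity = np.diag([1.0, -1.0])                 # (-1)^n for a single mode
    lower = np.array([[0.0, 1.0], [0.0, 0.0]])    # maps occupied -> empty
    return [reduce(np.kron, [parity] * p + [lower] + [identity] * (n_modes - p - 1))
            for p in range(n_modes)]

def anticommutator(x, y):
    return x @ y + y @ x

n_modes = 3
a = fermion_ops(n_modes)
dim = 2 ** n_modes

# Random unitary matrix; row lam holds the coefficients c^lam_i of eq. (117).
rng = np.random.default_rng(0)
m = rng.normal(size=(n_modes, n_modes)) + 1j * rng.normal(size=(n_modes, n_modes))
c, _ = np.linalg.qr(m)

# b^dagger_lam = sum_i c^lam_i a^dagger_i, and b_lam is its adjoint.
b_dag = [sum(c[lam, i] * a[i].conj().T for i in range(n_modes)) for lam in range(n_modes)]
b = [op.conj().T for op in b_dag]

pairs = [(l, lp) for l in range(n_modes) for lp in range(n_modes)]
err_120 = max(np.abs(anticommutator(b_dag[l], b_dag[lp])).max() for l, lp in pairs)
err_121 = max(np.abs(anticommutator(b[l], b[lp])).max() for l, lp in pairs)
err_122 = max(np.abs(anticommutator(b[l], b_dag[lp]) - (l == lp) * np.eye(dim)).max()
              for l, lp in pairs)
print(err_120, err_121, err_122)
```

All three deviations should vanish to machine precision. Replacing `c` by a non-unitary matrix spoils (122), which is the numerical counterpart of the orthonormality condition (119).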
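Finally, let’s consider the inverse of the expansion (117), viz:
$$ a_i^\dagger = \sum_\lambda d_i^\lambda\, b_\lambda^\dagger \qquad (123) $$
so that
$$ a_i^\dagger = \sum_{\lambda j} d_i^\lambda c_j^\lambda\, a_j^\dagger $$
This requires that
$$ \sum_\lambda d_i^\lambda c_j^\lambda = \delta_{ij} $$
Multiplying through by $c_j^{\lambda' *}$ and summing over j gives
$$ \sum_{\lambda j} d_i^\lambda\, c_j^{\lambda' *} c_j^\lambda = \sum_j \delta_{ij}\, c_j^{\lambda' *} = c_i^{\lambda' *} $$
But
$$ \sum_j c_j^{\lambda' *} c_j^\lambda = \delta_{\lambda\lambda'} $$
so that
$$ d_i^{\lambda'} = c_i^{\lambda' *} $$
Plugging this into (123), we find that
$$ a_i^\dagger = \sum_\lambda c_i^{\lambda *}\, b_\lambda^\dagger \qquad (124) $$
and
$$ a_i = \sum_\lambda c_i^\lambda\, b_\lambda \qquad (125) $$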
With the preliminaries out of the way, we are now in position to consider the expectation
value of H in the trial state |Φ >, the quantity we wish to minimize to define the optimal
Slater determinant.
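$$ \begin{aligned}
\langle\Phi|H|\Phi\rangle &= \sum_{ij} t_{ij}\, \langle\Phi|a_i^\dagger a_j|\Phi\rangle + \frac{1}{4} \sum_{ijkl} V_{ijkl}\, \langle\Phi|a_i^\dagger a_j^\dagger a_l a_k|\Phi\rangle \\
&= \sum_{ij\lambda_1\lambda_2} t_{ij}\, c_i^{\lambda_1 *} c_j^{\lambda_2}\, \langle\Phi|b_{\lambda_1}^\dagger b_{\lambda_2}|\Phi\rangle \\
&\quad + \frac{1}{4} \sum_{ijkl\lambda_1\lambda_2\lambda_3\lambda_4} V_{ijkl}\, c_i^{\lambda_1 *} c_j^{\lambda_2 *} c_k^{\lambda_3} c_l^{\lambda_4}\, \langle\Phi|b_{\lambda_1}^\dagger b_{\lambda_2}^\dagger b_{\lambda_4} b_{\lambda_3}|\Phi\rangle
\end{aligned} \qquad (126) $$
I now claim that the sums over all λi must be restricted to λi ≤ N . The reason is that if
λi > N then
$$ b_{\lambda_i} |\Phi\rangle = 0 \quad \text{and} \quad \langle\Phi|\, b_{\lambda_i}^\dagger = 0 \qquad (127) $$
since state |λi > is unoccupied.
Thus
$$ \begin{aligned}
\langle\Phi|H|\Phi\rangle &= \sum_{\lambda_1\lambda_2 \le N} \Big( \sum_{ij} t_{ij}\, c_i^{\lambda_1 *} c_j^{\lambda_2} \Big) \langle\Phi|b_{\lambda_1}^\dagger b_{\lambda_2}|\Phi\rangle \\
&\quad + \frac{1}{4} \sum_{\lambda_1\lambda_2\lambda_3\lambda_4 \le N} \Big( \sum_{ijkl} V_{ijkl}\, c_i^{\lambda_1 *} c_j^{\lambda_2 *} c_k^{\lambda_3} c_l^{\lambda_4} \Big) \langle\Phi|b_{\lambda_1}^\dagger b_{\lambda_2}^\dagger b_{\lambda_4} b_{\lambda_3}|\Phi\rangle
\end{aligned} \qquad (128) $$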
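Next we consider (for λ, λ′ ≤ N)
$$ \begin{aligned}
\langle\Phi|b_\lambda^\dagger b_{\lambda'}|\Phi\rangle &= -\langle\Phi|b_{\lambda'} b_\lambda^\dagger|\Phi\rangle + \delta_{\lambda\lambda'} \langle\Phi|\Phi\rangle \\
&= \delta_{\lambda\lambda'} \langle\Phi|\Phi\rangle \\
&= \delta_{\lambda\lambda'}
\end{aligned} \qquad (129) $$
where in the next to last line I used the fact that $b_\lambda^\dagger |\Phi\rangle = 0$ for λ ≤ N and in the last line
I used the fact that since each single-particle state |λ > is normalized and since each is also
distinct the overall state |Φ > is normalized.
Similarly, for λ1, λ2, λ3, λ4 ≤ N ,
$$ \begin{aligned}
\langle\Phi|b_{\lambda_1}^\dagger b_{\lambda_2}^\dagger b_{\lambda_4} b_{\lambda_3}|\Phi\rangle &= -\langle\Phi|b_{\lambda_1}^\dagger b_{\lambda_4} b_{\lambda_2}^\dagger b_{\lambda_3}|\Phi\rangle + \delta_{\lambda_2\lambda_4} \langle\Phi|b_{\lambda_1}^\dagger b_{\lambda_3}|\Phi\rangle \\
&= \langle\Phi|b_{\lambda_1}^\dagger b_{\lambda_4} b_{\lambda_3} b_{\lambda_2}^\dagger|\Phi\rangle - \delta_{\lambda_2\lambda_3} \langle\Phi|b_{\lambda_1}^\dagger b_{\lambda_4}|\Phi\rangle + \delta_{\lambda_2\lambda_4} \langle\Phi|b_{\lambda_1}^\dagger b_{\lambda_3}|\Phi\rangle \\
&= -\delta_{\lambda_2\lambda_3} \langle\Phi|b_{\lambda_1}^\dagger b_{\lambda_4}|\Phi\rangle + \delta_{\lambda_2\lambda_4} \langle\Phi|b_{\lambda_1}^\dagger b_{\lambda_3}|\Phi\rangle
\end{aligned} \qquad (130) $$
where I have again used the fact that $b_{\lambda_2}^\dagger |\Phi\rangle = 0$ for λ2 ≤ N .
Next, we use (129) to evaluate the last two terms and obtain (for λ1, λ2, λ3, λ4 ≤ N)
$$ \langle\Phi|b_{\lambda_1}^\dagger b_{\lambda_2}^\dagger b_{\lambda_4} b_{\lambda_3}|\Phi\rangle = -\delta_{\lambda_2\lambda_3}\delta_{\lambda_1\lambda_4} + \delta_{\lambda_2\lambda_4}\delta_{\lambda_1\lambda_3} \qquad (131) $$
where I have again used the fact that |Φ > is normalized.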
Inserting (129) and (131) into (128), we see that
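$$ \begin{aligned}
\langle\Phi|H|\Phi\rangle &= \sum_{\lambda \le N} \sum_{ij} t_{ij}\, c_i^{\lambda *} c_j^{\lambda} \\
&\quad - \frac{1}{4} \sum_{\lambda_1\lambda_2 \le N} \sum_{ijkl} V_{ijkl}\, c_i^{\lambda_1 *} c_j^{\lambda_2 *} c_k^{\lambda_2} c_l^{\lambda_1} \\
&\quad + \frac{1}{4} \sum_{\lambda_1\lambda_2 \le N} \sum_{ijkl} V_{ijkl}\, c_i^{\lambda_1 *} c_j^{\lambda_2 *} c_k^{\lambda_1} c_l^{\lambda_2}
\end{aligned} \qquad (132) $$
Using the fact that the summation indices k and l are just dummy indices and that (for
fermions) $V_{ijkl} = -V_{ijlk}$, we can combine the second and third terms to give
$$ \langle\Phi|H|\Phi\rangle = \sum_{\lambda \le N} \sum_{ij} t_{ij}\, c_i^{\lambda *} c_j^{\lambda} + \frac{1}{2} \sum_{\lambda_1\lambda_2 \le N} \sum_{ijkl} V_{ijkl}\, c_i^{\lambda_1 *} c_j^{\lambda_2 *} c_k^{\lambda_1} c_l^{\lambda_2} \qquad (133) $$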
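The appropriate variational condition is that
$$ \frac{\partial}{\partial c_m^{\lambda *}} \Big[ \langle\Phi|H|\Phi\rangle - e_\lambda \Big( \sum_j |c_j^\lambda|^2 - 1 \Big) \Big] = 0 \qquad (134) $$
where eλ is a Lagrange multiplier introduced to guarantee that the normalization < Φ| Φ >=
1 is preserved under the variation.
Carrying out the differentiation, we obtain
$$ \sum_j t_{mj}\, c_j^\lambda + \frac{1}{2} \sum_{jkl} V_{mjkl} \Big( \sum_{\lambda_2 \le N} c_j^{\lambda_2 *} c_l^{\lambda_2} \Big) c_k^\lambda + \frac{1}{2} \sum_{ikl} V_{imkl} \Big( \sum_{\lambda_1 \le N} c_i^{\lambda_1 *} c_k^{\lambda_1} \Big) c_l^\lambda - e_\lambda c_m^\lambda = 0 \qquad (135) $$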
The second and third terms are identical, as I will now confirm.
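We can rewrite
$$ \sum_{jkl} V_{mjkl} \Big( \sum_{\lambda_2 \le N} c_j^{\lambda_2 *} c_l^{\lambda_2} \Big) c_k^\lambda = \sum_{jkl} V_{mkjl} \Big( \sum_{\lambda' \le N} c_k^{\lambda' *} c_l^{\lambda'} \Big) c_j^\lambda $$
by interchanging the dummy summation indices k and j and by replacing the dummy index
λ2 by λ′. Similarly, we can rewrite
$$ \begin{aligned}
\sum_{ikl} V_{imkl} \Big( \sum_{\lambda_1 \le N} c_i^{\lambda_1 *} c_k^{\lambda_1} \Big) c_l^\lambda &= \sum_{jkl} V_{jmkl} \Big( \sum_{\lambda' \le N} c_j^{\lambda' *} c_k^{\lambda'} \Big) c_l^\lambda \\
&= \sum_{jkl} V_{kmjl} \Big( \sum_{\lambda' \le N} c_k^{\lambda' *} c_j^{\lambda'} \Big) c_l^\lambda \\
&= \sum_{jkl} V_{mkjl} \Big( \sum_{\lambda' \le N} c_k^{\lambda' *} c_l^{\lambda'} \Big) c_j^\lambda
\end{aligned} $$
Note: The first equality followed from replacing i with j and λ1 with λ′; the second equality
involved interchanging j with k; the third equality followed from interchanging j with l and
noting that $V_{kmlj} = V_{mkjl}$.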
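Thus, (135) can be rewritten as (now replacing m by i)
$$ \sum_j t_{ij}\, c_j^\lambda + \sum_{jkl} V_{ijkl} \Big( \sum_{\lambda_2 \le N} c_j^{\lambda_2 *} c_l^{\lambda_2} \Big) c_k^\lambda - e_\lambda c_i^\lambda = 0 $$
or
$$ \sum_j \Big[ t_{ij} + \sum_{kl} V_{ikjl} \sum_{\lambda' \le N} c_k^{\lambda' *} c_l^{\lambda'} \Big] c_j^\lambda = e_\lambda c_i^\lambda \qquad (136) $$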
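I now claim that (136) is of the form of a single-particle eigenvalue equation
$$ h\, |\lambda\rangle = e_\lambda\, |\lambda\rangle \qquad (137) $$
To see this, we take the overlap of (137) with the bra vector < i|, yielding
$$ \langle i|h|\lambda\rangle = e_\lambda \langle i|\lambda\rangle $$
Inserting $|\lambda\rangle = \sum_j c_j^\lambda |j\rangle$ gives
$$ \sum_j \langle i|h|j\rangle\, c_j^\lambda = e_\lambda c_i^\lambda $$
or equivalently
$$ \sum_j h_{ij}\, c_j^\lambda = e_\lambda c_i^\lambda $$
If we thus identify
$$ h_{ij} = t_{ij} + U_{ij} \qquad (138) $$
with
$$ U_{ij} = \sum_{kl} V_{ikjl} \sum_{\lambda' \le N} c_k^{\lambda' *} c_l^{\lambda'} \qquad (139) $$
then indeed (136) is precisely of this form.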
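We can simplify the form of the “single-particle” potential U somewhat by expressing
$$ V_{ikjl} = \langle ik|V|jl\rangle $$
so that
$$ U_{ij} = \sum_{\lambda' \le N} \sum_{kl} c_k^{\lambda' *}\, \langle ik|V|jl\rangle\, c_l^{\lambda'} $$
Then since
$$ |\lambda'\rangle = \sum_l c_l^{\lambda'}\, |l\rangle $$
and
$$ \langle\lambda'| = \sum_k c_k^{\lambda' *}\, \langle k| $$
we can rewrite
$$ U_{ij} = \sum_{\lambda' \le N} \langle i\lambda'|V|j\lambda'\rangle = \sum_{\lambda' \le N} V_{i\lambda' j\lambda'} \qquad (140) $$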
The one-body hamiltonian h whose eigenvectors give the Hartree Fock single-particle
state vectors can be written in terms of its matrix elements as
$$ h_{ij} = t_{ij} + \sum_{\lambda' \le N} V_{i\lambda' j\lambda'} \qquad (141) $$
In summary, we have shown that the best Slater determinant type wave function for
a system of N identical fermions is obtained by filling up the N lowest eigenstates of the
single-particle Schrödinger equation
$$ h\, |\lambda\rangle = e_\lambda\, |\lambda\rangle \qquad (142) $$
where h is a one-body operator with matrix elements
$$ h_{ij} = t_{ij} + \sum_{\lambda' \le N} V_{i\lambda' j\lambda'} \qquad (143) $$
Despite the fact that (142) is a one-body Schrödinger equation, it is more complicated
than the usual one-body Schrödinger equation. The reason is that the hamiltonian h depends
on its eigenvectors |λ > through (143). Thus, eqs. (142) and (143) must be solved
self-consistently, so that the eigenvectors that go into the construction of Uij also come out of
the diagonalization of hij.
The usual method of obtaining self-consistent HF solutions proceeds in the following
iterative way:
1. Make an initial guess of the N occupied HF single-particle state vectors |λ >, according
to
$$ |\lambda^{(0)}\rangle = \sum_i c_i^{\lambda (0)}\, |i\rangle $$
where | i > are the single-particle basis states and the superscript (0) means that this
is the zeroth-order approximation.
2. Evaluate the matrix of t + U in the basis | i > and diagonalize it, yielding a set of
eigenvalues $e_\lambda^{(1)}$ and a new set of eigenvectors $c_i^{\lambda (1)}$. [The superscript (1) means this is
the first-order approximation.] In constructing the matrix U from (139) we of course
only sum over the N energetically-lowest single-particle states | λ(0) >.
3. Construct a new set of single-particle states
$$ |\lambda^{(1)}\rangle = \sum_i c_i^{\lambda (1)}\, |i\rangle $$
and use them to reevaluate the matrix of t + U in the same basis | i >. Diagonalize
this matrix, thereby obtaining a new set of eigenvalues $e_\lambda^{(2)}$ and eigenvectors $c_i^{\lambda (2)}$.
4. Keep repeating the above procedure until the set of eigenvectors $c_i^{\lambda (n)}$ from the nth
iteration agrees to within some reasonable numerical accuracy with the set $c_i^{\lambda (n-1)}$ from the
previous iteration. At this point, self consistency has been achieved, since
the states | λ > which go into the construction of Uij are the same as those that
emerge from the subsequent diagonalization of t + U . Once self consistency has been
achieved, the resulting eigenvalues $e_\lambda^{(n)}$ and eigenvectors $c_i^{\lambda (n)}$ are indeed the Hartree
Fock single-particle energies eλ and the self-consistent eigenvectors.
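The four steps above can be sketched in a few lines of code. What follows is a minimal illustration under stated assumptions rather than a production implementation. The function name `hartree_fock`, the six-level model with equally spaced one-body energies, and the weak random two-body interaction are all hypothetical choices made only so that the example runs. The sketch also judges convergence on the density matrix Σλ′≤N cλ′l c∗λ′k rather than on the eigenvectors themselves, since eigenvectors are only defined up to a phase.

```python
import numpy as np

def hartree_fock(t, v, n_particles, tol=1e-10, max_iter=500):
    """Iterate h = t + U[rho] to self-consistency.

    t[i, j]       : one-body matrix elements t_ij
    v[i, j, k, l] : antisymmetrized two-body matrix elements V_ijkl
    Returns (energy, single-particle energies, density matrix, U, converged).
    """
    # Step 1: zeroth-order guess, here the lowest eigenvectors of t alone.
    _, vecs = np.linalg.eigh(t)
    occ = vecs[:, :n_particles]            # occ[i, lam] = c^lam_i
    rho = occ @ occ.conj().T               # rho[l, k] = sum_lam c^lam_l c^lam*_k
    converged = False
    for _ in range(max_iter):
        # Steps 2-3: build U_ij = sum_kl V_ikjl rho_lk, eq. (139), and diagonalize t + U.
        u = np.einsum("ikjl,lk->ij", v, rho)
        e, vecs = np.linalg.eigh(t + u)
        occ = vecs[:, :n_particles]
        rho_new = occ @ occ.conj().T
        # Step 4: compare successive iterations.
        converged = np.linalg.norm(rho_new - rho) < tol
        rho = rho_new
        if converged:
            break
    # Energy of the final determinant, eq. (133), written with the density matrix.
    u = np.einsum("ikjl,lk->ij", v, rho)
    energy = (np.trace(t @ rho) + 0.5 * np.trace(u @ rho)).real
    return energy, e, rho, u, converged

# Hypothetical model: 6 equally spaced levels, 3 fermions, weak random interaction.
n_levels, n_particles = 6, 3
rng = np.random.default_rng(0)
t = np.diag(np.arange(n_levels, dtype=float))
w = 0.01 * rng.normal(size=(n_levels,) * 4)
w = w + w.transpose(2, 3, 0, 1)            # hermiticity: V_ijkl = V_klij (real)
v = w - w.transpose(1, 0, 2, 3) - w.transpose(0, 1, 3, 2) + w.transpose(1, 0, 3, 2)

energy, e, rho, u, converged = hartree_fock(t, v, n_particles)
print("converged:", converged)
print("HF energy:", energy)
print("occupied single-particle energies:", e[:n_particles])
```

The interaction is deliberately weak compared with the level spacing so that the plain iteration converges. For stronger interactions the simple loop can oscillate, and practical codes usually mix the new and old density matrices at each step.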
Note that the fermionic nature of the HF description enters in two places:
1. In the fact that our self-consistent product state has one particle in each of the N
lowest self-consistent single-particle states (as required by the Pauli principle), and
2. in the fact that the two-body matrix elements Viλ′jλ′ that enter in the construction of
Uij are appropriately antisymmetrized, with both direct and exchange terms.
Now let’s return to the energy of the N -particle system that results from the self-
consistent HF minimization procedure. It is given by (see eq. (133))
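$$ \langle\Phi|H|\Phi\rangle = \sum_{\lambda \le N} \sum_{ij} t_{ij}\, c_i^{\lambda *} c_j^{\lambda} + \frac{1}{2} \sum_{\lambda_1\lambda_2 \le N} \sum_{ijkl} V_{ijkl}\, c_i^{\lambda_1 *} c_j^{\lambda_2 *} c_k^{\lambda_1} c_l^{\lambda_2} $$
As in the derivation of (140), we rewrite
$$ \begin{aligned}
\sum_{\lambda \le N} \sum_{ij} t_{ij}\, c_i^{\lambda *} c_j^{\lambda} &= \sum_{\lambda \le N} \sum_{ij} c_i^{\lambda *}\, \langle i|t|j\rangle\, c_j^{\lambda} \\
&= \sum_{\lambda \le N} \langle\lambda|t|\lambda\rangle \\
&= \sum_{\lambda \le N} t_{\lambda\lambda}
\end{aligned} $$
Likewise we rewrite
$$ \begin{aligned}
\sum_{\lambda_1\lambda_2 \le N} \sum_{ijkl} V_{ijkl}\, c_i^{\lambda_1 *} c_j^{\lambda_2 *} c_k^{\lambda_1} c_l^{\lambda_2} &= \sum_{\lambda_1\lambda_2 \le N} \sum_{ijkl} \langle ij|V|kl\rangle\, c_i^{\lambda_1 *} c_j^{\lambda_2 *} c_k^{\lambda_1} c_l^{\lambda_2} \\
&= \sum_{\lambda_1\lambda_2 \le N} V_{\lambda_1\lambda_2\lambda_1\lambda_2}
\end{aligned} $$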
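Thus,
$$ \langle\Phi|H|\Phi\rangle = \sum_{\lambda \le N} t_{\lambda\lambda} + \frac{1}{2} \sum_{\lambda_1\lambda_2 \le N} V_{\lambda_1\lambda_2\lambda_1\lambda_2} \qquad (144) $$
But
$$ e_\lambda = h_{\lambda\lambda} = t_{\lambda\lambda} + U_{\lambda\lambda} = t_{\lambda\lambda} + \sum_{\lambda' \le N} V_{\lambda\lambda'\lambda\lambda'} $$
Thus, the sum of the HF single-particle energies for the N lowest states is
$$ \sum_{\lambda \le N} e_\lambda = \sum_{\lambda \le N} t_{\lambda\lambda} + \sum_{\lambda\lambda' \le N} V_{\lambda\lambda'\lambda\lambda'} $$
and we see that the total energy in Hartree Fock (144) is not just the sum of the
self-consistent single-particle energies of the N particles, as we might have naively expected for
an independent particle solution.
Rather, we see that
$$ \langle\Phi|H|\Phi\rangle = \sum_{\lambda \le N} e_\lambda - \frac{1}{2} \sum_{\lambda\lambda' \le N} V_{\lambda\lambda'\lambda\lambda'} \qquad (145) $$
The reason for this difference is that each eλ contains a contribution from the interaction of
particle λ with all the other N − 1 particles. Thus the sum of the eλ over the occupied states
contains the interaction between every pair of particles twice. The second term in (145)
removes half of the two-body interaction contribution to give the correct HF energy.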
Improvements
The general hamiltonian, when written in the self-consistent HF basis, takes the form
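$$ H = \sum_{\lambda_1\lambda_2} t_{\lambda_1\lambda_2}\, b_{\lambda_1}^\dagger b_{\lambda_2} + \frac{1}{4} \sum_{\lambda_1\lambda_2\lambda_3\lambda_4} V_{\lambda_1\lambda_2\lambda_3\lambda_4}\, b_{\lambda_1}^\dagger b_{\lambda_2}^\dagger b_{\lambda_4} b_{\lambda_3} \qquad (146) $$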
Note that in evaluating the expectation value of H in the self-consistent product state
| Φ >, only a part of the full hamiltonian (146) enters, namely the part in which all λi ≤ N .
Those parts of H involving higher states | λi > with λi > N do not contribute.
Let us now decompose
H = Hs.p. + Hint. (147)
where Hs.p. is by definition the part that only involves the N lowest self-consistent states
and Hint. is everything else. I now make several claims:
FIG. 8: Self-consistent single-particle levels in HF approximation. (The figure shows the
occupied levels separated by a gap from the unoccupied levels.)
• If Hint. is sufficiently weak and/or the gap between the uppermost occupied level and
the lowermost unoccupied level (see figure 8) is sufficiently large, then one could use
perturbation theory to incorporate the effects of Hint. and improve upon the Hartree
Fock energy and eigenvector.
On the other hand, if Hint. is not sufficiently weak or there is no large gap between
occupied and unoccupied levels, then the effects of Hint. can neither be neglected nor
treated in perturbation theory. In such cases, the choice of an independent-particle
trial state vector was not adequate. The ground state of such a system has very strong
correlations between particles, which cannot be described by a mere perturbation of a
Slater determinant. For such problems, alternative approximation strategies must be
sought. And we’ll discuss one soon.
• As a homework problem, you will be asked to prove that H cannot induce one-particle
one-hole admixtures in the ground state, namely that
$$ \langle\Phi|\, H\, b_{\lambda_1}^\dagger b_{\lambda_2}\, |\Phi\rangle = 0 , \quad \text{for } \lambda_1 > N \text{ and } \lambda_2 \le N $$
It can however induce two-particle two-hole admixtures and it is such admixtures that
should be included in perturbation theory, if it is applicable.
Symmetries
Although the full hamiltonian H may contain many symmetries (e.g. rotational invari-
ance), the hamiltonian Hs.p. which is being considered in Hartree Fock need not have these
same symmetries. Thus, the self-consistent product eigenfunctions of Hs.p. do not in general
contain the symmetries of the original hamiltonian. Without going into details, let me
simply note that
(a) Hs.p. is still a number-conserving operator so that its product eigenstates have definite
particle number;
(b) in general Hs.p. will be neither translationally invariant nor rotationally invariant (i.e.
an irreducible tensor operator of rank zero).
The lack of translational invariance will be reflected in the fact that our product state
does not have a well-defined total momentum but rather can be a mixture of states with
different momenta. The lack of rotational invariance means that the product state need not
have definite total angular momentum, but rather can be a mixture of states with different
values of J . Those mixtures will be manifested in the expansion (117) for the self-consistent
single-particle creation operator. If we were working in a linear momentum representation,
then the states | λ > would be a superposition of momentum eigenstates having different
values of k. Alternatively, were we working in an angular momentum representation, then
the states |λ > would be a mixture of single-particle states with different j and m values.
After the Hartree Fock minimization has been carried out and the set of self-consistent states
| λ > generated, we can project from the many-particle product states those components
with good K values or good J and M values. Such projections will be necessary to make
contact with the true physical states of the system, for which one indeed has conserved
symmetries and thus good quantum numbers.
You should perhaps be asking yourselves “Why did we have to give up the symmetries of
translational and rotational invariance?” The reason is that in the Hartree Fock procedure
we are searching for a simple description of the system as a set of independent particles, and
symmetries are not compatible with such a simple description. Symmetries of necessity imply
some correlations. In an isolated many-body system, you cannot change the momentum (or
angular momentum) of only one particle. The conservation laws require that at least one
other particle also changes its momentum (or angular momentum). Thus, the motion of the
particles cannot be completely independent if symmetries are to be preserved.
So, symmetry laws are incompatible with independent-particle motion. Nevertheless, we
often know (or at least expect) from physical considerations that many-body systems to a
good approximation involve simple independent particle motion. The data tell us this and data
don’t lie. To the extent that this is indeed the dominant physics in play, we would like to be
able to get to it directly. And, as we’ve just seen, the only way we can do this is by relaxing
the symmetry requirements. Only after we have isolated the dominant independent-particle
motion do we wish to put back in the corrections required to restore the symmetry (e.g. by
momentum and/or angular momentum projection).
To perhaps make these ideas more palatable, let me discuss them from a slightly differ-
ent perspective. Let’s focus for now on translational invariance and conservation of linear
momentum. Clearly, pure independent particle motion is only compatible with this con-
servation law if all particles have definite momentum. In such a case, all particles are
completely unlocalized. But we know that bound states of real quantum systems, examples
being atoms or molecules or nuclei, are localized, and furthermore seem to involve essentially
independent-particle motion. What’s the resolution to this apparent paradox?
The resolution is that for such real systems the dominant independent-particle motion is
not in the lab frame but rather in the localized body-fixed frame. Once an object is localized,
its center of mass is confined and thus doesn’t have well-defined momentum. Within the
body-fixed frame, the particles can move independently of one another. If one particle
changes its momentum it is not necessary that any other particle responds. Rather the
center of mass of the system can change its momentum to preserve the total momentum of
the system.
Thus, independent-particle motion in the body-fixed system can occur without any vio-
lation of conservation of total momentum. However, the momentum in the body-fixed frame
is not conserved.
Thus we see that independent-particle motion does not preclude cooperative or collective
phenomena. By spontaneously breaking symmetries, we can define an intrinsic frame in
which independent particle motion occurs but which as a whole moves “collectively”. And
this is indeed the essential philosophy underlying the Hartree Fock method as well as the
BCS method we will be introducing next.
BCS Theory
As noted on page 110 of these notes, the Hartree Fock method, either by itself or in
conjunction with perturbation theory, can only be expected to give a good description
of the ground state of the system when the system does not have strong particle-particle
correlations. When such correlations are important, it is necessary to use different methods
for approximately solving the Schrodinger equation. Often it is possible to use these methods
in conjunction with the Hartree Fock method.
Over the next several lectures, I will consider a particular type of hamiltonian which
indeed gives rise to strong particle-particle correlations and which is amenable to an accurate,
though approximate, treatment. It is the so-called BCS approximation, named after its
inventors Bardeen, Cooper and Schrieffer. The hamiltonian I will consider is (in Fock space)
$$ H = \sum_\alpha \epsilon_\alpha\, a_\alpha^\dagger a_\alpha - G \sum_{\alpha,\gamma>0} a_\alpha^\dagger a_{\bar\alpha}^\dagger a_{\bar\gamma} a_\gamma \qquad (148) $$
and with G > 0. The second term in the hamiltonian is the so-called pairing interaction.
Indeed, hamiltonians of essentially this type arise in many branches of physics, including
condensed matter physics, nuclear physics, and cold atomic gases.
To put such a hamiltonian in clearer perspective, let me first consider an arbitrary hamil-
tonian
H = T + V
Choosing some arbitrary (but time-reversal invariant) single-particle potential U (perhaps
the HF self-consistent potential), we can rewrite this as
H = H0 + Vres
where
H0 = T + U
and
Vres = V − U
We define our Fock space in terms of the one-particle eigenstates of H0, viz:
$$ H_0\, |\alpha\rangle = \epsilon_\alpha\, |\alpha\rangle $$
where
$$ |\alpha\rangle = a_\alpha^\dagger\, |0\rangle $$
The hamiltonian (148) is based on a residual pairing interaction
$$ V_{res} = -G \sum_{\alpha,\gamma>0} a_\alpha^\dagger a_{\bar\alpha}^\dagger a_{\bar\gamma} a_\gamma $$
which is defined via its two-body matrix elements
$$ V_{\alpha\beta\gamma\delta} = -G\, \delta_{\beta\bar\alpha}\, \delta_{\delta\bar\gamma} , \quad (\alpha, \gamma > 0) $$
Note: By invoking symmetry conditions on Vαβγδ, it is easy to convince yourself that when
we restrict the sum to α, γ > 0, we need not include the customary factor of 1/4. In
subsequent developments, I will denote the pairing interaction for simplicity as VP .
The principal property of the pairing interaction VP is that it is only felt by pairs of
particles in time-reversed single-particle states (e.g. $|\alpha\rangle$ and $|\bar\alpha\rangle$). Furthermore, the
strength G that governs how strongly it scatters particles from one time-reversed pair $(\alpha, \bar\alpha)$
to another $(\gamma, \bar\gamma)$ is independent of which states are involved.
Finally, it is usually only necessary to consider the pairing force to act in a finite set of
(active) single-particle states, so that the sums in (148) only involve finite numbers of terms.
Let’s now assume that the active single-particle eigenvalues ϵα are ordered as in figure 9
($\epsilon_{\alpha_1} \le \epsilon_{\alpha_2} \le \ldots$), with each level (due to the assumed time-reversal symmetry of H0) being
at least doubly degenerate, i.e. $\epsilon_\alpha = \epsilon_{\bar\alpha}$. Furthermore, let’s assume that the system under
discussion has N particles with N even. Then, the lowest N -particle eigenvector of H0 is
$$ |\Psi_0\rangle = |\alpha_1, \bar\alpha_1, \alpha_2, \bar\alpha_2, \ldots, \alpha_{N/2}, \bar\alpha_{N/2}\rangle $$
in which two particles occupy each of the N/2 lowest doubly-degenerate levels. It is customary
to refer to the energy $\epsilon_{\alpha_{N/2}}$ of the uppermost “occupied” single-particle level as the
Fermi energy, and to denote it as λ (see my discussion on pages 92-93). Note of course that
it is only the uppermost occupied level vis-à-vis the independent-particle hamiltonian H0
and not the full hamiltonian H.
In addition to | Ψ0 >, H0 has many other eigenstates at higher energies, which are
obtained by promoting particles from occupied levels (within the Fermi sea) to unoccupied
levels (outside the Fermi sea). A particularly interesting one is
$$ |\Psi_1\rangle = |\alpha_1, \bar\alpha_1, \alpha_2, \bar\alpha_2, \ldots, \alpha_{N/2-1}, \bar\alpha_{N/2-1}, \alpha_{N/2+1}, \bar\alpha_{N/2+1}\rangle $$
in which a pair of particles is lifted from level $\alpha_{N/2}$ into the first unoccupied level $\alpha_{N/2+1}$. If
we denote
$$ H_0\, |\Psi_0\rangle = E_0\, |\Psi_0\rangle $$
and
$$ H_0\, |\Psi_1\rangle = E_1\, |\Psi_1\rangle $$
then
$$ \Delta E = E_1 - E_0 = 2 \left( \epsilon_{\alpha_{N/2+1}} - \epsilon_{\alpha_{N/2}} \right) $$
Furthermore, it is straightforward to show that
$$ \langle\Psi_1|V_P|\Psi_0\rangle = -G $$
FIG. 9: Self-consistent single-particle levels in HF approximation. (Vertical axis: Energy;
levels labeled 1, 2, ..., N/2.)
I now claim that
(a) If ∆E >> G, the pairing interaction will not be effective in mixing the states | Ψ1 > and
| Ψ0 >, as is evident from simple perturbation theory arguments. In fact, extension
of these qualitative arguments suggests that under such circumstances, the pairing
force will be ineffective in mixing any excited eigenstate of H0 with | Ψ0 >. Clearly,
in such cases the “true” ground state of the system will be given “essentially” by
| Ψ0 > and the particles are “uncorrelated”. Of course, small admixtures of excited
configurations such as | Ψ1 > are possible and they can be treated using perturbation
theory. In such cases, the occupation number ηk (see pages 92-93) will look as in figure
10. The occupation numbers for levels below λ are 1 whereas those above λ are zero,
albeit now with some slight smoothing of the distribution around the Fermi surface to
reflect the small perturbative admixtures.
FIG. 10: Occupation probabilities for a scenario with weak pairing. (Vertical axis from 0 to 1;
horizontal axis: level k.)
(b) If ∆E ≤ G, then pairs of particles can easily scatter from the uppermost occupied
levels to the lowermost unoccupied ones. If the pairing matrix element G is sufficiently
strong, then it can also excite particles from deep within the Fermi sea to levels well
outside. Clearly, in such cases it is not feasible to use perturbation theory to ascertain
the true ground state. Qualitatively, the ground state will be very different from
| Ψ0 > and must be obtained in some non-perturbative approach. The occupation
probability in such cases will look something like that shown in figure 11. The scattering
of particles from inside the Fermi sea to outside will “smear out” the Fermi surface.
Just how much smearing takes place depends sensitively on the specifics of G and the
ϵα. In such cases, the system clearly involves “correlations” between particles.
FIG. 11: Occupation probabilities for a strongly correlated scenario. (Vertical axis from 0 to 1;
horizontal axis: level k.)
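The two regimes in (a) and (b) can be illustrated by exact diagonalization of the hamiltonian (148) for a very small system. The sketch below rests on several assumptions that are not part of these notes: four equally spaced doubly degenerate levels, four particles, two arbitrarily chosen values of G, and a Jordan-Wigner matrix representation of the fermion operators. The helper names `fermion_ops` and `level_occupations` are hypothetical.

```python
import numpy as np
from functools import reduce

def fermion_ops(n_modes):
    """Annihilation operators as real matrices (Jordan-Wigner construction)."""
    identity = np.eye(2)
    parity = np.diag([1.0, -1.0])
    lower = np.array([[0.0, 1.0], [0.0, 0.0]])
    return [reduce(np.kron, [parity] * p + [lower] + [identity] * (n_modes - p - 1))
            for p in range(n_modes)]

def level_occupations(eps, g, n_particles):
    """Exact ground-state occupation probability of each doubly degenerate level."""
    n_levels = len(eps)
    a = fermion_ops(2 * n_levels)     # mode 2k is alpha_k, mode 2k+1 its time reverse
    # The matrices are real, so the transpose is the hermitean adjoint.
    pair_dag = [a[2 * k].T @ a[2 * k + 1].T for k in range(n_levels)]
    counts = [a[2 * k].T @ a[2 * k] + a[2 * k + 1].T @ a[2 * k + 1]
              for k in range(n_levels)]
    big_a_dag = sum(pair_dag)
    # H = sum_k eps_k n_k - G * (sum_k P_k^dagger)(sum_k P_k), i.e. eq. (148).
    h = sum(e * n for e, n in zip(eps, counts)) - g * big_a_dag @ big_a_dag.T
    number = sum(counts)
    # Restrict to the sector with the requested particle number.
    sector = np.flatnonzero(np.isclose(np.diag(number), n_particles))
    _, vecs = np.linalg.eigh(h[np.ix_(sector, sector)])
    ground = vecs[:, 0]
    # Divide by 2 so that a completely filled level has occupation 1.
    return [0.5 * ground @ n[np.ix_(sector, sector)] @ ground for n in counts]

eps = [0.0, 1.0, 2.0, 3.0]            # hypothetical level energies, so Delta E = 2
weak = level_occupations(eps, g=0.05, n_particles=4)
strong = level_occupations(eps, g=2.0, n_particles=4)
for k, (n_weak, n_strong) in enumerate(zip(weak, strong), start=1):
    print(f"level {k}: weak pairing {n_weak:.4f}, strong pairing {n_strong:.4f}")
```

With G much smaller than ∆E the two lowest levels stay essentially full and the two highest essentially empty, as in figure 10. With G comparable to ∆E the occupations near the Fermi surface move substantially away from 1 and 0, as in figure 11.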
Degenerate pairing theory
To get a sense as to how the pairing force admixes such configurations to produce a
correlated ground state, I shall now consider an idealized, but exactly solvable, problem in
which the hamiltonian is related to (148), except that we will assume that all the active
single-particle energies ϵα are equal. With such an assumption, the hamiltonian reduces to
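\[ H = -G \sum_{\alpha,\gamma=1}^{\Omega} a^\dagger_\alpha a^\dagger_{\bar\alpha} a_{\bar\gamma} a_\gamma \tag{149} \]
where $\bar\alpha$ denotes the time-reversed partner of the state $\alpha$.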
Note that in writing the degenerate pairing hamiltonian in this way, I have
(a) neglected the term $\epsilon \sum_{\alpha=1}^{\Omega} \left(a^\dagger_\alpha a_\alpha + a^\dagger_{\bar\alpha} a_{\bar\alpha}\right) = \epsilon N$, since this can’t contribute to excitation
energies in a given system (with N fixed), and
(b) made explicit the fact that H only acts over Ω active (doubly-degenerate) levels.
For the hamiltonian (149), we can explicitly solve the Schrödinger equation
\[ H|\Psi\rangle = E|\Psi\rangle \]
To see how this is done, we first introduce the coherent “pair creation operator”
\[ A^\dagger = \sum_{\alpha=1}^{\Omega} a^\dagger_\alpha a^\dagger_{\bar\alpha} \]
and rewrite
\[ H = -G A^\dagger A \]
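I now make the following claims, which you will be asked to prove in a homework assignment.
(a) $[A^\dagger, A] = \hat N - \Omega$, where $\hat N = \sum_{\alpha=1}^{\Omega} \left(a^\dagger_\alpha a_\alpha + a^\dagger_{\bar\alpha} a_{\bar\alpha}\right)$, i.e. $\hat N$ counts the number of
particles in the active levels 1 thru Ω.
(b) $[H, A^\dagger] = -G A^\dagger \left(\Omega - \hat N\right)$
Now suppose that we have a v-particle state (v ≤ Ω), which I denote | Ψv > and which
satisfies
\[ A|\Psi_v\rangle = 0 \]
and thus
\[ H|\Psi_v\rangle = 0 \]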
Examples are all of the states
| Ψv >= |α1, α2, ...αv >
with all αi > 0. Such a state is said to have seniority v.
Next consider the state $(A^\dagger)^{(N-v)/2}|\Psi_v\rangle$. This is a state of N particles and is still said
to have seniority v. It can be readily shown, and you will also be asked to do this in the
homework, that
\[ H (A^\dagger)^{(N-v)/2}|\Psi_v\rangle = -\frac{G}{4}\left[N(2\Omega-N+2) - v(2\Omega-v+2)\right](A^\dagger)^{(N-v)/2}|\Psi_v\rangle \tag{150} \]
Thus, $(A^\dagger)^{(N-v)/2}|\Psi_v\rangle$ is an eigenvector of H with eigenvalue
\[ E_{N,v} = -\frac{G}{4}\left[N(2\Omega-N+2) - v(2\Omega-v+2)\right] = E_{N0} + \frac{vG}{4}(2\Omega-v+2) \tag{151} \]
I will denote this state as $|\Psi_{Nv}\rangle$.
Now let’s focus on systems with an even number of particles.
If G > 0, all states with v > 0 are energetically above the state with v = 0. Thus, the
ground state will be
\[ |\Psi_{N0}\rangle \propto (A^\dagger)^{N/2}|0\rangle \]
with eigenvalue
\[ E_{N0} = -\frac{G}{4} N (2\Omega - N + 2) \]
The lowest excited states are those with v = 2, viz:
\[ |\Psi_{N2}\rangle \propto (A^\dagger)^{N/2-1}|\Psi_{v=2}\rangle \]
Clearly there are many such states, since there are many ways to choose two out of the Ω
positive αi values. All of these states are degenerate at an excitation energy
\[ E_{N2} - E_{N0} = \frac{2G}{4}(2\Omega - 2 + 2) = G\Omega \]
Thus if Ω is large and G is large, the pairing force will produce a large gap between the
(non-degenerate) ground state and the (highly-degenerate) first excited states. Thus, for
even N and a strong pairing force, one state will be pulled down relative to all others.
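As a rough numerical cross-check of the v = 0 eigenvalue in (151), the sketch below applies H = −GA†A to the equal-amplitude superposition of all ways of placing N/2 pairs in Ω levels. It rests on bookkeeping that is my own assumption rather than notation from these notes: a seniority-zero configuration is stored as a frozenset of pair-occupied levels, and pair operators on different levels are taken to commute, so no fermionic signs appear. The function names are hypothetical.

```python
from itertools import combinations


def apply_pairing_hamiltonian(state, omega, g):
    """Apply H = -G A†A to a seniority-zero state.

    `state` maps frozenset(pair-occupied levels) -> amplitude.
    """
    # A removes one pair from any occupied level.
    after_a = {}
    for occupied, amplitude in state.items():
        for level in occupied:
            key = occupied - {level}
            after_a[key] = after_a.get(key, 0.0) + amplitude
    # A† adds one pair to any empty level; the factor -G completes H.
    result = {}
    for occupied, amplitude in after_a.items():
        for level in range(omega):
            if level not in occupied:
                key = occupied | {level}
                result[key] = result.get(key, 0.0) - g * amplitude
    return result


def exact_energy(n_particles, seniority, omega, g):
    """Eigenvalue (151) for N particles and seniority v."""
    n, v = n_particles, seniority
    return -0.25 * g * (n * (2 * omega - n + 2) - v * (2 * omega - v + 2))


omega, g, n_particles = 5, 0.4, 4
ground = {frozenset(c): 1.0 for c in combinations(range(omega), n_particles // 2)}
image = apply_pairing_hamiltonian(ground, omega, g)
# If the state is an eigenvector, every amplitude is rescaled by the same factor.
ratios = [image[config] / ground[config] for config in ground]
gap = exact_energy(n_particles, 2, omega, g) - exact_energy(n_particles, 0, omega, g)
print(ratios[0], exact_energy(n_particles, 0, omega, g), gap)
```

Every ratio should coincide with the v = 0 value of (151), and the computed gap with GΩ. The check only covers the seniority-zero ground state; the v > 0 states would need the full fermionic algebra.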
What are the occupation probabilities ηk associated with the ground state | ΨN0 >? They
are given by
\[ \eta_k = \frac{\langle\Psi_{N0}| a^\dagger_k a_k |\Psi_{N0}\rangle}{\langle\Psi_{N0}|\Psi_{N0}\rangle} \]
Clearly we only need consider this for k > 0, since the results for k < 0 must be identical.
As an exercise you should confirm that
\[ \eta_k = \frac{N}{2\Omega} \]
which is independent of k. Thus, for the degenerate pairing problem all active levels are
populated equally in the ground state.
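The equal-population result can be illustrated by simple counting. The sketch below assumes (my assumption, consistent with the form $(A^\dagger)^{N/2}|0\rangle$) that the ground state carries the same amplitude on every way of distributing N/2 pairs over the Ω levels and that these configurations are orthonormal, so the occupation of a level is just the fraction of configurations that contain it. The function name is hypothetical.

```python
from fractions import Fraction
from itertools import combinations


def ground_state_occupation(omega, n_particles, level):
    """Fraction of equal-weight pair configurations in which `level` is occupied."""
    configs = list(combinations(range(omega), n_particles // 2))
    containing = sum(1 for config in configs if level in config)
    return Fraction(containing, len(configs))


omega, n_particles = 6, 4
occupations = [ground_state_occupation(omega, n_particles, k) for k in range(omega)]
print(occupations)
```

Each entry should equal N/(2Ω), here 1/3, independent of the level index.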
And this is as it obviously should be. All doubly-degenerate levels are obviously equiv-
alent in this problem. They are all degenerate with one another and furthermore the pair
scattering from one level to another is independent of the levels involved.
Clearly when we remove the “degenerate model” assumption and let the ϵα be different,
the main qualitative difference will be that no longer are all levels equivalent and thus
equally populated. Rather, we would expect the ηk to look as in figure 11 on page 117, with
occupations near unity for the lowest levels transitioning smoothly into occupations much
smaller for states at the highest energies.
Nevertheless, assuming that the “pairing correlations” are sufficiently strong, we would
expect that the ground state wave function should still have the basic structure of the ground
state from the degenerate theory, namely a condensate of correlated pairs, viz:
\[ |\Phi\rangle \propto (A^\dagger)^{N/2}|0\rangle \]
Of course, we would no longer expect that
\[ A^\dagger = \sum_{k>0} a^\dagger_k a^\dagger_{\bar k} \]
since this would lead to equal population of all states, as we’ve seen. Rather, we’d expect
that
\[ A^\dagger = \sum_{k>0} c_k a^\dagger_k a^\dagger_{\bar k} \]
with the expansion coefficients ck somehow reflecting the fact that ηk should follow a pattern
like that shown in figure 11, with the lowest levels being populated most and then the higher
levels successively less.
On this basis, one is led to consider as a physically reasonable trial state vector for systems
dominated by pairing correlations one of the form
\[ |\Phi\rangle = \left(\sum_{k>0} c_k a^\dagger_k a^\dagger_{\bar k}\right)^{N/2}|0\rangle \tag{152} \]
and to consider the ck as variational parameters, which we would determine by minimizing
< Φ|H| Φ >. Such a procedure is feasible, but very difficult to implement especially for
systems with many particles, as arise for example in Condensed Matter Physics. I now wish
to convince you that by introducing a quasi-particle transformation, as described on pages
89-92 of these notes, we are (more or less) doing the same thing, but much more simply.
The Bogolyubov quasiparticle transformation
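So, let’s consider a transformation from real-particle operators $a^\dagger_\alpha$, $a_\alpha$ to quasiparticle
operators $c^\dagger_\alpha$, $c_\alpha$ according to
\[ c^\dagger_\alpha = u_\alpha a^\dagger_\alpha - v_\alpha a_{\bar\alpha}, \qquad c_{\bar\alpha} = u_\alpha a_{\bar\alpha} + v_\alpha a^\dagger_\alpha \tag{153} \]
with
\[ u_\alpha^2 + v_\alpha^2 = 1 \]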
The transformation (153) is customarily referred to as a Bogolyubov transformation after
the physicist who first introduced it.
In our earlier discussion, we showed that we could develop a quasiparticle algebra that was
mathematically equivalent to the real-particle algebra if we could introduce an appropriate
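quasiparticle vacuum state $|\tilde 0\rangle$ for which
\[ c_\alpha|\tilde 0\rangle = 0, \quad \text{for all } \alpha \]
We showed furthermore that such a quasiparticle vacuum is related to the real particle
operators and the real vacuum $|0\rangle$ by
\[ |\tilde 0\rangle = \prod_{\alpha>0} u_\alpha \prod_{\alpha>0}\left(1 + \frac{v_\alpha}{u_\alpha} a^\dagger_\alpha a^\dagger_{\bar\alpha}\right)|0\rangle \tag{154} \]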
We also showed that the number of particles in state k in this quasiparticle vacuum is
given by v2k.
Let me now look at the above quasiparticle vacuum state in a slightly different way. To
do so, we note that
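\begin{align*}
e^{x a^\dagger_\alpha a^\dagger_{\bar\alpha}}|0\rangle &= \sum_{n=0}^{\infty}\frac{1}{n!}\left(x a^\dagger_\alpha a^\dagger_{\bar\alpha}\right)^n|0\rangle \\
&= \left[1 + x a^\dagger_\alpha a^\dagger_{\bar\alpha} + \frac{1}{2}x^2\left(a^\dagger_\alpha a^\dagger_{\bar\alpha}\right)^2 + \dots\right]|0\rangle \\
&= \left[1 + x a^\dagger_\alpha a^\dagger_{\bar\alpha}\right]|0\rangle
\end{align*}
where in the last line I used the fact that all higher terms, which involve powers of $a^\dagger_\alpha a^\dagger_{\bar\alpha}$,
cannot contribute since you cannot have more than one particle in the same state α.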
Thus,
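\begin{align*}
\prod_{\alpha>0}\left(1 + \frac{v_\alpha}{u_\alpha} a^\dagger_\alpha a^\dagger_{\bar\alpha}\right)|0\rangle &= \prod_{\alpha>0} e^{\frac{v_\alpha}{u_\alpha} a^\dagger_\alpha a^\dagger_{\bar\alpha}}|0\rangle \\
&= e^{\sum_{\alpha>0}\frac{v_\alpha}{u_\alpha} a^\dagger_\alpha a^\dagger_{\bar\alpha}}|0\rangle \tag{155}
\end{align*}
Note that in deriving this I have used the operator identity
\[ e^{A+B} = e^{A}e^{B} \]
which applies for operators A and B that commute, i.e. for which [A,B] = 0. And clearly
\[ \left[a^\dagger_\alpha a^\dagger_{\bar\alpha},\, a^\dagger_\beta a^\dagger_{\bar\beta}\right] = 0 \]
for all α and β.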
Again using the expansion
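\[ e^{x} = \sum_{p=0}^{\infty}\frac{1}{p!}x^{p} \]
we can rewrite (155) as
\[ \prod_{\alpha>0}\left(1 + \frac{v_\alpha}{u_\alpha} a^\dagger_\alpha a^\dagger_{\bar\alpha}\right)|0\rangle = \sum_{p=0}^{\infty}\frac{1}{p!}\left(\sum_{\alpha>0}\frac{v_\alpha}{u_\alpha} a^\dagger_\alpha a^\dagger_{\bar\alpha}\right)^{p}|0\rangle \tag{156} \]
which shows that, in accord with the comments on pages 91-92, the quasiparticle vacuum $|\tilde 0\rangle$ contains contributions
with all even numbers of particles. However, it also makes clear that the component with
N particles has the structure
\[ \left(\sum_{\alpha>0}\frac{v_\alpha}{u_\alpha} a^\dagger_\alpha a^\dagger_{\bar\alpha}\right)^{N/2}|0\rangle \]
This is exactly the form we postulated as being appropriate for describing a pairing hamiltonian,
namely something of the form
\[ (A^\dagger)^{N/2}|0\rangle \]
where
\[ A^\dagger = \sum_{k>0} c_k a^\dagger_k a^\dagger_{\bar k} \]
and where ck is related to ηk.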
I therefore claim that if we choose the quasiparticle vacuum $|\tilde 0\rangle$ as our trial state and minimize $\langle\tilde 0|H|\tilde 0\rangle$ with
respect to the vk coefficients (as a reminder the uk are related to the vk by $u_k^2 + v_k^2 = 1$),
then we are basically doing what I suggested we should do for a pairing hamiltonian.
I say “basically” because $|\tilde 0\rangle$ is not a state with exactly N particles, as we would
certainly prefer. Nevertheless we will be able to guarantee (through the introduction of a
Lagrange multiplier) that it has N particles on average, i.e. that
\[ \langle\tilde 0|\hat N|\tilde 0\rangle = N \]
In this way, we will be able to approximately solve the Schrödinger equation for a pairing
hamiltonian in a way that is much simpler than trying to minimize $\langle\Phi|H|\Phi\rangle$ with $|\Phi\rangle$
having the form (152). In other words, by relaxing the requirement that our trial state has
exactly the correct number of particles, we’ll greatly simplify our treatment of pairing-like
hamiltonians.
The method that emerges from this number-nonconserving variational treatment of pairing
hamiltonians is called the BCS approximation, after John Bardeen, Leon Cooper and
Bob Schrieffer, its developers.
Derivation of the BCS equations
The relevant variational equation that we will solve is
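\[ \delta\langle\tilde 0|H - \lambda\hat N|\tilde 0\rangle = 0 \tag{157} \]
subject to a constraint that
\[ \langle\tilde 0|\hat N|\tilde 0\rangle = N \tag{158} \]
The variations are done with respect to the vk parameters that define the Bogolyubov
transformation to quasiparticles (or equivalently the related uk parameters).
The operator in (157) can be written as
\[ H' = H - \lambda\hat N = \sum_{\alpha>0}(\epsilon_\alpha - \lambda)\left(a^\dagger_\alpha a_\alpha + a^\dagger_{\bar\alpha} a_{\bar\alpha}\right) - G\sum_{\alpha,\gamma>0} a^\dagger_\alpha a^\dagger_{\bar\alpha} a_{\bar\gamma} a_\gamma \tag{159} \]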
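We now express H′ in terms of the quasiparticle operators $c^\dagger_\alpha$, $c^\dagger_{\bar\alpha}$, $c_\alpha$ and $c_{\bar\alpha}$. As a
reminder, on page 92, we gave the inverted Bogolyubov transformation, which expresses the
particle operators in terms of the quasiparticle operators. The relevant relations are
\[ a^\dagger_\alpha = u_\alpha c^\dagger_\alpha + v_\alpha c_{\bar\alpha}, \qquad a_{\bar\alpha} = u_\alpha c_{\bar\alpha} - v_\alpha c^\dagger_\alpha \tag{160} \]
From these, we can also get the hermitean adjoint relations, which are
\[ a_\alpha = u_\alpha c_\alpha + v_\alpha c^\dagger_{\bar\alpha}, \qquad a^\dagger_{\bar\alpha} = u_\alpha c^\dagger_{\bar\alpha} - v_\alpha c_\alpha \tag{161} \]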
We now have all that is needed to rewrite the operator H ′ in terms of quasiparticle
operators, which we now do term by term.
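First Term:
\begin{align*}
&\sum_{\alpha>0}(\epsilon_\alpha-\lambda)\left[\left(u_\alpha c^\dagger_\alpha + v_\alpha c_{\bar\alpha}\right)\left(u_\alpha c_\alpha + v_\alpha c^\dagger_{\bar\alpha}\right) + \left(u_\alpha c^\dagger_{\bar\alpha} - v_\alpha c_\alpha\right)\left(u_\alpha c_{\bar\alpha} - v_\alpha c^\dagger_\alpha\right)\right] \\
&= \sum_{\alpha>0}(\epsilon_\alpha-\lambda)\left[u_\alpha^2 c^\dagger_\alpha c_\alpha + u_\alpha v_\alpha c^\dagger_\alpha c^\dagger_{\bar\alpha} + v_\alpha u_\alpha c_{\bar\alpha} c_\alpha + v_\alpha^2 c_{\bar\alpha} c^\dagger_{\bar\alpha}\right. \\
&\qquad\left. + u_\alpha^2 c^\dagger_{\bar\alpha} c_{\bar\alpha} - u_\alpha v_\alpha c^\dagger_{\bar\alpha} c^\dagger_\alpha - v_\alpha u_\alpha c_\alpha c_{\bar\alpha} + v_\alpha^2 c_\alpha c^\dagger_\alpha\right]
\end{align*}
Now what we do is to use the anticommutation relations to put all quasiparticle creation
operators to the left and all quasiparticle annihilation operators to the right. This is known
as putting the operators in normal order. We get (noting that $\bar{\bar\alpha} = \alpha$)
\begin{align*}
&\sum_{\alpha>0}(\epsilon_\alpha-\lambda)\left[u_\alpha^2 c^\dagger_\alpha c_\alpha + u_\alpha v_\alpha c^\dagger_\alpha c^\dagger_{\bar\alpha} + v_\alpha u_\alpha c_{\bar\alpha} c_\alpha - v_\alpha^2 c^\dagger_{\bar\alpha} c_{\bar\alpha} + v_\alpha^2\right. \\
&\qquad\left. + u_\alpha^2 c^\dagger_{\bar\alpha} c_{\bar\alpha} - u_\alpha v_\alpha c^\dagger_{\bar\alpha} c^\dagger_\alpha - v_\alpha u_\alpha c_\alpha c_{\bar\alpha} - v_\alpha^2 c^\dagger_\alpha c_\alpha + v_\alpha^2\right] \\
&= 2\sum_{\alpha>0}(\epsilon_\alpha-\lambda)\,v_\alpha^2 + \sum_{\alpha>0}(\epsilon_\alpha-\lambda)\left(u_\alpha^2 - v_\alpha^2\right)\left[c^\dagger_\alpha c_\alpha + c^\dagger_{\bar\alpha} c_{\bar\alpha}\right] \\
&\qquad + 2\sum_{\alpha>0}(\epsilon_\alpha-\lambda)\,u_\alpha v_\alpha\left[c^\dagger_\alpha c^\dagger_{\bar\alpha} + c_{\bar\alpha} c_\alpha\right] \tag{162}
\end{align*}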
Note my separation into pieces involving no c† or c operators, one c† and one c operator,
two c† operators, and two c operators.
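We can obviously do the same thing for the potential term, though it’s certainly much
more tedious. We first replace all its a and a† operators by quasiparticle c and c† operators;
we then put everything in normal order in which all creation operators are to the left of all
annihilation operators. Since it is so tedious, let me just quote the end result for the various
pieces of H′ combined.
\[ H' = U' + H'_{11} + H'_{20} + H_{int} \tag{163} \]
where
\[ U' = \sum_{\alpha>0}\left[2(\epsilon_\alpha-\lambda)v_\alpha^2 - Gv_\alpha^4\right] - G\sum_{\alpha,\gamma>0}u_\alpha v_\alpha u_\gamma v_\gamma \tag{164} \]
\[ H'_{11} = \sum_{\alpha>0}\left[\left(u_\alpha^2 - v_\alpha^2\right)\left(\epsilon_\alpha - \lambda - Gv_\alpha^2\right) + 2Gu_\alpha v_\alpha\sum_{\gamma>0}u_\gamma v_\gamma\right]\left(c^\dagger_\alpha c_\alpha + c^\dagger_{\bar\alpha} c_{\bar\alpha}\right) \tag{165} \]
\[ H'_{20} = \sum_{\alpha>0}\left[2\left(\epsilon_\alpha - \lambda - Gv_\alpha^2\right)u_\alpha v_\alpha - G\left(u_\alpha^2 - v_\alpha^2\right)\sum_{\gamma>0}u_\gamma v_\gamma\right]\left(c^\dagger_\alpha c^\dagger_{\bar\alpha} + c_{\bar\alpha} c_\alpha\right) \tag{166} \]
and Hint is everything else. Indeed, Hint will contain all terms involving four quasiparticle
operators “in normal order” (i.e. with all c† operators to the left of all c operators). These
include terms like c†c†c†c†, c†c†c†c, c†c†cc, c†ccc and cccc.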
I now make the claim that
\[ \langle\tilde 0|H - \lambda\hat N|\tilde 0\rangle = U' \tag{167} \]
The reason is that all other pieces of H − λN have an annihilation operator on the right
and/or a creation operator on the left, and in either case they annihilate the relevant vacuum
state, viz:
\[ \langle\tilde 0|\,c^\dagger_\alpha = 0, \qquad c_\alpha|\tilde 0\rangle = 0 \]
Thus, all such terms give a zero contribution to the quasiparticle vacuum expectation value.
Thus, the variational equation
\[ \delta\langle\tilde 0|H - \lambda\hat N|\tilde 0\rangle = 0 \]
reduces to a set of equations, one for each k,
\[ \frac{\partial U'}{\partial v_k} = 0, \quad \text{for all } k \tag{168} \]
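where
\[ U' = \sum_{\alpha>0}\left[2(\epsilon_\alpha-\lambda)v_\alpha^2 - Gv_\alpha^4\right] - G\sum_{\alpha,\gamma>0}u_\alpha v_\alpha u_\gamma v_\gamma \tag{169} \]
We will solve this system of equations subject to a constraint on the average number of
particles in the quasiparticle vacuum
\[ \langle\tilde 0|\hat N|\tilde 0\rangle = N \tag{170} \]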
If we differentiate U ′ with respect to vk and equate to zero, we get
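\[ 4(\epsilon_k-\lambda)v_k - 4Gv_k^3 - G\sum_{\alpha>0}u_\alpha v_\alpha\left(u_k + v_k\frac{\partial u_k}{\partial v_k}\right) - G\sum_{\gamma>0}u_\gamma v_\gamma\left(u_k + v_k\frac{\partial u_k}{\partial v_k}\right) = 0 \]
Combining terms and then dividing by 2 gives
\[ 2(\epsilon_k-\lambda)v_k - 2Gv_k^3 - G\left(u_k + v_k\frac{\partial u_k}{\partial v_k}\right)\sum_{\gamma>0}u_\gamma v_\gamma = 0 \tag{171} \]
But
\[ u_k^2 + v_k^2 = 1 \]
so that
\[ 2u_k\frac{\partial u_k}{\partial v_k} + 2v_k = 0 \]
or
\[ \frac{\partial u_k}{\partial v_k} = -\frac{v_k}{u_k} \tag{172} \]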
Inserting (172) into (171) and then multiplying thru by uk gives
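\[ 2(\epsilon_k-\lambda)v_k u_k - 2Gv_k^3 u_k - G\left(u_k^2 - v_k^2\right)\sum_{\gamma>0}u_\gamma v_\gamma = 0 \tag{173} \]
We now define
\[ \tilde\epsilon_k = \epsilon_k - \lambda - Gv_k^2 \tag{174} \]
and
\[ \Delta = G\sum_{\gamma>0}u_\gamma v_\gamma \tag{175} \]
Then (173) becomes
\[ 2\tilde\epsilon_k u_k v_k - \left(u_k^2 - v_k^2\right)\Delta = 0 \tag{176} \]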
Equation (176) emerged by minimizing $\langle\tilde 0|H - \lambda\hat N|\tilde 0\rangle$ with respect to vk. We must
still, however, impose the constraint that $\langle\tilde 0|\hat N|\tilde 0\rangle = N$. Following our earlier treatment
of $\langle\tilde 0|H - \lambda\hat N|\tilde 0\rangle$, we rewrite $\hat N$ in terms of quasiparticle creation and annihilation
operators and then $\langle\tilde 0|\hat N|\tilde 0\rangle$ is just the constant term that emerges when the
operators are put in normal order. By simple analogy with what we did in the derivation of
(162), but replacing ϵα − λ by 1 in the sums, we find that
\[ \langle\tilde 0|\hat N|\tilde 0\rangle = 2\sum_{\alpha>0}v_\alpha^2 \]
Thus our constraint is that
\[ 2\sum_{\alpha>0}v_\alpha^2 = N \tag{177} \]
which confirms our earlier interpretation of $v_\alpha^2$ as the occupation probability for level α.
[Note: The 2 arises because of the fact that the level α is doubly-degenerate, with equal
occupations of $\alpha$ and $\bar\alpha$.]
Equations (174-177) are the fundamental equations of BCS theory. Their solution is
facilitated, however, by first putting them in a slightly different form, which we will now do.
We first rewrite (176) as
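\[ \Delta\left(u_k^2 - v_k^2\right) = 2\tilde\epsilon_k u_k v_k \tag{178} \]
Squaring it gives
\[ \Delta^2\left(u_k^2 - v_k^2\right)^2 = 4\tilde\epsilon_k^2 u_k^2 v_k^2 \tag{179} \]
But
\[ \left(u_k^2 - v_k^2\right)^2 = u_k^4 + v_k^4 - 2u_k^2 v_k^2 = \left(u_k^2 + v_k^2\right)^2 - 4v_k^2 u_k^2 = 1 - 4v_k^2 u_k^2 \]
Thus, (179) becomes
\[ \Delta^2 - 4v_k^2 u_k^2\Delta^2 = 4\tilde\epsilon_k^2 u_k^2 v_k^2 \]
or
\[ u_k^2 v_k^2\left(4\tilde\epsilon_k^2 + 4\Delta^2\right) = \Delta^2 \]
so that
\[ 2u_k v_k = \pm\frac{\Delta}{\sqrt{\tilde\epsilon_k^2 + \Delta^2}} \tag{180} \]
To determine the correct sign, we remember that (see eqs. (169), (174) and (175))
\[ U' = \sum_{k>0}\left(2\tilde\epsilon_k v_k^2 + Gv_k^4 - \Delta u_k v_k\right) \]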
Now, since we wish to minimize this, it is clear that ukvk and ∆ must have the same sign
for all k, so that the correct sign in (180) is + and therefore
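\[ 2u_k v_k = \frac{\Delta}{\sqrt{\tilde\epsilon_k^2 + \Delta^2}} \tag{181} \]
Putting this into (178) gives
\[ \Delta\left(u_k^2 - v_k^2\right) = \frac{\Delta\,\tilde\epsilon_k}{\sqrt{\tilde\epsilon_k^2 + \Delta^2}} \]
or
\[ u_k^2 - v_k^2 = \frac{\tilde\epsilon_k}{\sqrt{\tilde\epsilon_k^2 + \Delta^2}} \]
Combining this with
\[ u_k^2 + v_k^2 = 1 \]
gives
\[ u_k^2 = \frac{1}{2}\left[1 + \frac{\tilde\epsilon_k}{\sqrt{\tilde\epsilon_k^2 + \Delta^2}}\right] \tag{182} \]
and
\[ v_k^2 = \frac{1}{2}\left[1 - \frac{\tilde\epsilon_k}{\sqrt{\tilde\epsilon_k^2 + \Delta^2}}\right] \tag{183} \]
But from (175) we know that
\[ \Delta = G\sum_{k>0}u_k v_k \]
so that, using (181),
\[ \Delta = \frac{G}{2}\sum_{k>0}\frac{\Delta}{\sqrt{\tilde\epsilon_k^2 + \Delta^2}} \]
or
\[ \frac{2}{G} = \sum_{k>0}\frac{1}{\sqrt{\tilde\epsilon_k^2 + \Delta^2}} \tag{184} \]
Finally, from (177),
\[ N = 2\sum_{k>0}v_k^2 \]
which after inserting (183) gives
\[ N = \sum_{k>0}\left[1 - \frac{\tilde\epsilon_k}{\sqrt{\tilde\epsilon_k^2 + \Delta^2}}\right] \tag{185} \]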
Equation (184) is called the gap equation and (185) is called the number equation. It is
in this form that the BCS equations are most readily solved. I will discuss their iterative
solution a bit later.
Some features of the BCS equations
(1) Minimizing $\langle\tilde 0|H - \lambda\hat N|\tilde 0\rangle$ with respect to the vk parameters led to equation (176)
\[ 2\tilde\epsilon_k u_k v_k - \left(u_k^2 - v_k^2\right)\Delta = 0 \]
where
\[ \tilde\epsilon_k = \epsilon_k - \lambda - Gv_k^2 \]
and
\[ \Delta = G\sum_{\alpha>0}u_\alpha v_\alpha \]
Clearly ukvk = 0 for all k is a solution to this system of equations. In such cases,
\[ \Delta = 0 \]
This is called the normal solution to the BCS equations.
For the normal solution, the occupation probabilities $v_k^2$ are given by
\[ v_k^2 = \frac{1}{2}\left(1 - \frac{\tilde\epsilon_k}{|\tilde\epsilon_k|}\right) \]
and similarly $u_k^2$ is given by
\[ u_k^2 = \frac{1}{2}\left(1 + \frac{\tilde\epsilon_k}{|\tilde\epsilon_k|}\right) \]
In the limit G → 0,
\[ \frac{\tilde\epsilon_k}{|\tilde\epsilon_k|} = \begin{cases} 1 & \text{for } \epsilon_k > \lambda \\ -1 & \text{for } \epsilon_k < \lambda \end{cases} \]
so that
\[ v_k^2 = \begin{cases} 0 & \text{for } \epsilon_k > \lambda \\ 1 & \text{for } \epsilon_k < \lambda \end{cases} \]
Thus, in the limit of noninteracting particles (for which as we’ll soon confirm the normal
solution applies) the Lagrange multiplier λ indeed plays the role of the Fermi energy, dividing
the single-particle states into two groups, one that is occupied and one that is empty. When
the interaction is turned on (i.e. G ≠ 0), λ is referred to as the chemical potential, although
as we’ll soon see it still has more or less the significance of a Fermi energy.
The normal solution corresponds to filling up the lowest N/2 levels, exactly as for non-
interacting particles. As we have seen, the BCS equations admit such a solution even when
there is a pairing interaction; however, it may not be the energetically lowest solution.
(2) Question: Under what conditions is there a lower solution?
To answer this, consider the gap equation (184)
\[ \frac{2}{G} = \sum_{k>0}\frac{1}{\sqrt{\tilde\epsilon_k^2 + \Delta^2}} \]
Clearly this equation can only have a solution for G > 0. We already noted in our discussion
of “degenerate pairing theory” that G > 0 was required to produce pairing solutions
for the ground state.
We now rewrite the gap equation as
\[ \frac{2}{G} = \sum_{k>0}\frac{1}{\sqrt{\tilde\epsilon_k^2 + \Delta^2}} \le \sum_{k>0}\frac{1}{\sqrt{\tilde\epsilon_k^2}} \]
or equivalently
\[ \sum_{k>0}\frac{G}{2|\tilde\epsilon_k|} \ge 1 \]
If this inequality is not satisfied, it will not be possible to have another solution to the
BCS equations. Put another way, if
\[ \sum_{k>0}\frac{G}{2|\tilde\epsilon_k|} \le 1 \]
the only solution of the BCS equations is the normal one.
But
\[ |\tilde\epsilon_k| = |\epsilon_k - \lambda - Gv_k^2| \]
so that this latter equation is equivalent to
\[ \sum_{k>0}\frac{G}{2|\epsilon_k - \lambda - Gv_k^2|} \le 1 \tag{186} \]
Bearing in mind that λ plays the role of the Fermi energy, this equation is very reminiscent
of our earlier intuitive equation
\[ \Delta E \ge G \]
given on pages 116-117. If the energy cost of lifting particles from occupied to unoccupied
single-particle levels is too large compared to the associated pairing matrix element then
the pairing force cannot effectively scatter particles across the Fermi surface. In such cases,
the only solution to the BCS equations is the normal solution and any corrections to it are
perturbative.
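The criterion (186) is easy to evaluate once the shifted energies are known. The sketch below is only illustrative: the list of $\tilde\epsilon_k$ values is hypothetical (in practice they depend on λ and on the normal-solution occupations), the function names are my own, and no level is assumed to sit exactly at $\tilde\epsilon_k = 0$.

```python
def pairing_sum(shifted_energies, g):
    """Left-hand side of (186): sum over k > 0 of G / (2 |ε̃_k|)."""
    return sum(g / (2.0 * abs(e)) for e in shifted_energies)


def superconducting_solution_possible(shifted_energies, g):
    """True when the sum exceeds 1, so a solution with Δ > 0 is not excluded."""
    return pairing_sum(shifted_energies, g) > 1.0


# Hypothetical ε̃_k values, symmetric about the Fermi energy.
shifted = [-2.5, -1.5, -0.5, 0.5, 1.5, 2.5]
weak = superconducting_solution_possible(shifted, 0.1)
strong = superconducting_solution_possible(shifted, 1.0)
print(pairing_sum(shifted, 0.1), weak, pairing_sum(shifted, 1.0), strong)
```

For this level scheme the weak coupling leaves only the normal solution, while the stronger coupling passes the test.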
However, if ∆E ≤ G, it becomes possible to generate another solution, called the
superconducting solution. Furthermore, it can be shown (though the analysis is difficult
and I won’t show it) that when the superconducting solution exists it is always energetically
lower than the normal solution. An estimate of the gain in energy of the superconducting
solution relative to the normal solution is
\[ \delta E = U_{\rm superconducting} - U_{\rm normal} \approx -\frac{\Delta^2}{2D} \]
where D is the average spacing of single-particle levels.
(3) Now let’s define
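\[ \eta_k = \epsilon_k - Gv_k^2 \tag{187} \]
so that
\[ \tilde\epsilon_k = \eta_k - \lambda \]
and
\[ v_k^2 = \frac{1}{2}\left[1 - \frac{\eta_k - \lambda}{\sqrt{(\eta_k - \lambda)^2 + \Delta^2}}\right] \]
When a superconducting solution exists, this looks as shown in figure 12.
When ηk = λ, it is clear that $v_k^2 = 1/2$. If we now expand about ηk = λ we obtain
\[ \frac{\eta_k - \lambda}{\sqrt{(\eta_k - \lambda)^2 + \Delta^2}} = \frac{\eta_k - \lambda}{\Delta}\,\frac{1}{\sqrt{1 + \frac{(\eta_k - \lambda)^2}{\Delta^2}}} \approx \frac{\eta_k - \lambda}{\Delta}\left[1 - \frac{1}{2}\left(\frac{\eta_k - \lambda}{\Delta}\right)^2\right] \]
Thus, for ηk ≈ λ,
\[ \frac{\partial}{\partial\eta_k}v_k^2 \approx -\frac{1}{2\Delta} \]
From this, we see that the region of transition from “filled” to “unfilled” levels has a width
of roughly 2∆. Since ∆ ∝ G, we see that the amount that the Fermi surface is smeared out
is proportional to the strength of the pairing force.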
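The half-filling value and the slope at the Fermi surface can be checked with a few lines of arithmetic. This is a small sketch with arbitrary illustrative values of λ and ∆ (not taken from the notes); the function name is hypothetical and the slope is estimated by a central finite difference.

```python
import math


def occupation(eta, lam, delta):
    """v_k^2 as a function of η_k for chemical potential λ and gap Δ > 0."""
    x = eta - lam
    return 0.5 * (1.0 - x / math.hypot(x, delta))


lam, delta, step = 0.0, 0.8, 1e-5
at_fermi_surface = occupation(lam, lam, delta)
# Central difference approximation to d(v_k^2)/dη_k at η_k = λ.
slope = (occupation(lam + step, lam, delta) - occupation(lam - step, lam, delta)) / (2 * step)
print(at_fermi_surface, slope, -1.0 / (2.0 * delta))
```

The numerical slope should agree with −1/(2∆) up to the finite-difference error, so a linear fall from 1 to 0 would indeed span an energy interval of about 2∆.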
FIG. 12: Occupation probabilities for a superconducting scenario. (Vertical axis: $v_k^2$, with ticks at 0, 1/2 and 1; horizontal axis: k.)
Iterative solution of the BCS equations
The basic BCS equations can be summarized as
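\[ \eta_k = \epsilon_k - Gv_k^2 \tag{188} \]
\[ \Delta = G\sum_{k>0}u_k v_k \tag{189} \]
\[ v_k^2 = \frac{1}{2}\left[1 - \frac{\eta_k - \lambda}{\sqrt{(\eta_k - \lambda)^2 + \Delta^2}}\right] \tag{190} \]
\[ u_k^2 = 1 - v_k^2 \tag{191} \]
\[ N = \sum_{k>0}\left[1 - \frac{\eta_k - \lambda}{\sqrt{(\eta_k - \lambda)^2 + \Delta^2}}\right] \tag{192} \]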
We solve this set of equations in the following iterative fashion.
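Step 0: Choose starting values for the uk and vk coefficients (denoted $u_k^{(0)}$ and $v_k^{(0)}$).
Step 1: Use (188) and (189) to determine $\eta_k^{(0)}$ and $\Delta^{(0)}$.
Step 2: Determine $\lambda^{(0)}$ so that [see eq. (192)]
\[ N = \sum_{k>0}\left[1 - \frac{\eta_k^{(0)} - \lambda^{(0)}}{\sqrt{\left(\eta_k^{(0)} - \lambda^{(0)}\right)^2 + \left(\Delta^{(0)}\right)^2}}\right] \]
Step 3: Using (190) and (191), evaluate
\[ v_k^{(1)} = \left\{\frac{1}{2}\left[1 - \frac{\eta_k^{(0)} - \lambda^{(0)}}{\sqrt{\left(\eta_k^{(0)} - \lambda^{(0)}\right)^2 + \left(\Delta^{(0)}\right)^2}}\right]\right\}^{1/2} \]
and
\[ u_k^{(1)} = \sqrt{1 - \left(v_k^{(1)}\right)^2} \]
Step 4: Return to Step 1 and iterate the first three steps until the $v_k^{(n)}$ emerging from a
given iteration agree with the $v_k^{(n-1)}$ from the previous one (the (n − 1)st) to within a
chosen level of accuracy.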
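The steps above can be sketched in a few dozen lines. What follows is a minimal illustration, not a production solver: the function names, the uniform starting guess $v_k^2 = N/(2\Omega)$, the bisection used in Step 2 and the convergence tolerance are all my own assumptions, and the plain fixed-point iteration is not guaranteed to converge for every level scheme (hence the returned `converged` flag).

```python
import math


def occupation(eta, lam, delta):
    """Eq. (190): v_k^2 for given η_k, λ and Δ."""
    x = eta - lam
    e = math.hypot(x, delta)
    return 0.5 if e == 0.0 else 0.5 * (1.0 - x / e)


def solve_lambda(etas, delta, n_particles, iterations=200):
    """Step 2: bisect on λ until the number equation (192) is satisfied."""
    pad = 1e3 * (1.0 + abs(delta) + max(etas) - min(etas))
    lo, hi = min(etas) - pad, max(etas) + pad
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        number = 2.0 * sum(occupation(eta, mid, delta) for eta in etas)
        if number < n_particles:   # the particle number grows monotonically with λ
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def solve_bcs(eps, g, n_particles, tol=1e-10, max_iter=1000):
    """Iterate Steps 1-3; returns (v, lam, delta, converged)."""
    omega = len(eps)
    v = [math.sqrt(n_particles / (2.0 * omega))] * omega           # Step 0
    lam, delta, converged = 0.0, 0.0, False
    for _ in range(max_iter):
        u = [math.sqrt(max(0.0, 1.0 - vk * vk)) for vk in v]      # (191)
        etas = [e - g * vk * vk for e, vk in zip(eps, v)]          # (188)
        delta = g * sum(uk * vk for uk, vk in zip(u, v))           # (189)
        lam = solve_lambda(etas, delta, n_particles)               # (192)
        v_new = [math.sqrt(max(0.0, occupation(eta, lam, delta)))  # (190)
                 for eta in etas]
        converged = max(abs(a - b) for a, b in zip(v_new, v)) < tol
        v = v_new
        if converged:
            break
    return v, lam, delta, converged


# Degenerate test case: all ε_k equal, for which v_k^2 = N/(2Ω) is known.
omega, g, n_particles = 8, 0.3, 6
v, lam, delta, converged = solve_bcs([0.0] * omega, g, n_particles)
print(converged, v[0] ** 2, lam, delta)
```

For a non-degenerate spectrum one would pass the actual list of ϵk and inspect `converged`; if the iteration oscillates, mixing the old and new vk is a common remedy, though that goes beyond the scheme described here.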
Let’s now examine in some detail the various pieces of H ′ (= H − λN) under conditions
of the “optimum” Bogolyubov transformation. Remember that
\[ H' = U' + H'_{11} + H'_{20} + H_{int} \]
as given in eq. (163).
The various components at minimum are given by
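\begin{align*}
U' &= \sum_{\alpha>0}\left[2(\epsilon_\alpha-\lambda)v_\alpha^2 - Gv_\alpha^4\right] - G\sum_{\alpha,\gamma>0}u_\alpha v_\alpha u_\gamma v_\gamma \\
&= \sum_{\alpha>0}\left[\left(2\tilde\epsilon_\alpha + Gv_\alpha^2\right)v_\alpha^2 - \Delta u_\alpha v_\alpha\right] \tag{193}
\end{align*}
\begin{align*}
H'_{11} &= \sum_{\alpha>0}\left[\left(u_\alpha^2 - v_\alpha^2\right)\left(\epsilon_\alpha - \lambda - Gv_\alpha^2\right) + 2Gu_\alpha v_\alpha\sum_{\gamma>0}u_\gamma v_\gamma\right]\left(c^\dagger_\alpha c_\alpha + c^\dagger_{\bar\alpha} c_{\bar\alpha}\right) \\
&= \sum_{\alpha>0}\left[\left(u_\alpha^2 - v_\alpha^2\right)\tilde\epsilon_\alpha + 2\Delta u_\alpha v_\alpha\right]\left(c^\dagger_\alpha c_\alpha + c^\dagger_{\bar\alpha} c_{\bar\alpha}\right) \\
&= \sum_{\alpha>0}\left[\frac{2\tilde\epsilon_\alpha^2 u_\alpha v_\alpha}{\Delta} + 2\Delta u_\alpha v_\alpha\right]\left(c^\dagger_\alpha c_\alpha + c^\dagger_{\bar\alpha} c_{\bar\alpha}\right) \\
&= \sum_{\alpha>0}\left[\frac{\tilde\epsilon_\alpha^2}{\sqrt{\tilde\epsilon_\alpha^2 + \Delta^2}} + \frac{\Delta^2}{\sqrt{\tilde\epsilon_\alpha^2 + \Delta^2}}\right]\left(c^\dagger_\alpha c_\alpha + c^\dagger_{\bar\alpha} c_{\bar\alpha}\right) \\
&= \sum_{\alpha>0}\sqrt{\tilde\epsilon_\alpha^2 + \Delta^2}\,\left(c^\dagger_\alpha c_\alpha + c^\dagger_{\bar\alpha} c_{\bar\alpha}\right) \tag{194}
\end{align*}
where the third equality followed from (178) and the fourth from (181).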
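\begin{align*}
H'_{20} &= \sum_{\alpha>0}\left\{\left[2(\epsilon_\alpha - \lambda) - 2Gv_\alpha^2\right]u_\alpha v_\alpha - G\left(u_\alpha^2 - v_\alpha^2\right)\sum_{\gamma>0}u_\gamma v_\gamma\right\}\left(c^\dagger_\alpha c^\dagger_{\bar\alpha} + c_{\bar\alpha} c_\alpha\right) \\
&= \sum_{\alpha>0}\left[2\tilde\epsilon_\alpha u_\alpha v_\alpha - \Delta\left(u_\alpha^2 - v_\alpha^2\right)\right]\left(c^\dagger_\alpha c^\dagger_{\bar\alpha} + c_{\bar\alpha} c_\alpha\right) = 0 \tag{195}
\end{align*}
where the last equality followed from (178).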
Thus, if the uα and vα are chosen so as to minimize $\langle\tilde 0|H - \lambda\hat N|\tilde 0\rangle$, they will at the
same time guarantee that
\[ H'_{20} = 0 \tag{196} \]
This result is the BCS analog of our Hartree Fock result for the self-consistent independent-particle
solution
\[ \langle\Phi|H\,b^\dagger_{\lambda_1} b_{\lambda_2}|\Phi\rangle = 0, \quad \text{for } \lambda_1 > N \text{ and } \lambda_2 \le N, \]
given on page 110, which you also proved in a homework problem.
Indeed, Bogolyubov showed that in general it is equivalent to minimize U ′ or to set
H ′20 = 0. The latter is referred to as “removing the dangerous terms”.
Thus, for the “optimum” Bogolyubov transformation,
H ′ = U ′ + H ′11 + Hint
If we make the assumption that Hint is weak, which is analogous to neglecting Hint in HF,
then
H ′ = U ′ + H ′11 (197)
I now make several claims:
1. Clearly the full H′ commutes with $\hat N$, so that the true physical eigenstates have well-defined
particle number. However, if we neglect Hint, then the approximate H′ given
by (197) does not commute with $\hat N$, so that its eigenstates (such as its ground state,
$|\tilde 0\rangle$) do not have fixed numbers of particles. However, U′ + H′11 does commute with
$\sum_\alpha c^\dagger_\alpha c_\alpha$, which is the quasiparticle number operator. Thus, its eigenstates have fixed
numbers of quasiparticles. We have so far only discussed its ground state $|\tilde 0\rangle$, which
has zero quasiparticles. Now we will discuss its excited eigenstates.
2. Clearly the approximate hamiltonian (197) is the hamiltonian for independent quasi-
particles. Thus, all of its eigenstates can be written as products of single-quasiparticle
creation operators acting on the quasiparticle vacuum,
\[ c^\dagger_{k_1}\cdots c^\dagger_{k_n}|\tilde 0\rangle \]
where n can be any nonnegative integer. The eigenvalue of this n-quasiparticle state
is
\[ U' + \sum_{i=1}^{n}\sqrt{\tilde\epsilon_{k_i}^2 + \Delta^2} \]
[Note: Were $H'_{20} \neq 0$ we would not arrive at an independent quasiparticle hamiltonian,
explaining why setting it to zero is called “removing the dangerous terms”.]
I now make the following claims:
(a) states with an even number of quasiparticles n correspond to systems with an
even number of real particles, and
(b) states with an odd number of quasiparticles n correspond to systems with an odd
number of real particles.
Both can be straightforwardly proven by making use of our knowledge of the real
particle structure of the quasiparticle vacuum and the quasiparticle creation operators.
Thus, we can separately discuss even-n and odd-n systems.
(a) Even-n:
(1) Ground state: $|\tilde 0\rangle$
(2) Lowest excited states: $c^\dagger_{k_1} c^\dagger_{k_2}|\tilde 0\rangle$
The excitation energies of such two quasiparticle states, and there are
lots of them, are
\[ E(k_1, k_2) = \sqrt{\tilde\epsilon_{k_1}^2 + \Delta^2} + \sqrt{\tilde\epsilon_{k_2}^2 + \Delta^2} \ge 2\Delta \]
Thus, the lowest possible excited states in a system with an even number of
particles will occur at an excitation energy of at least 2∆. Stated another way,
in even-n systems there is a gap in the spectrum between the ground state and
excited states, and this gap increases with increasing G (since ∆ is proportional
to G). Such a large gap is one of the features that characterizes superconducting
systems.
(b) Odd-n:
In odd-n systems, the lowest states are one-quasiparticle states. The excitation
energy of the first excited state is
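\[ \delta E = \sqrt{\tilde\epsilon_{k_2}^2 + \Delta^2} - \sqrt{\tilde\epsilon_{k_1}^2 + \Delta^2} \]
where $|\tilde\epsilon_{k_1}|$ is the lowest value of $|\tilde\epsilon_k|$ and $|\tilde\epsilon_{k_2}|$ is the second lowest. Here, no gap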
occurs and no state is preferentially picked out and lowered with respect to all
the others.
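The contrast between even and odd systems can be made concrete with a short calculation. In this sketch the $\tilde\epsilon_k$ values and ∆ are hypothetical numbers chosen for illustration, the list is taken to enumerate the available single-quasiparticle states, and the function name is my own.

```python
import math
from itertools import combinations


def quasiparticle_energies(shifted_energies, delta):
    """E_k = sqrt(ε̃_k^2 + Δ^2) for each single-quasiparticle state."""
    return [math.hypot(e, delta) for e in shifted_energies]


shifted = [-1.4, -0.3, 0.6, 1.7]   # hypothetical ε̃_k values
delta = 0.7
energies = sorted(quasiparticle_energies(shifted, delta))
# Even-n system: lowest excitation is the cheapest two-quasiparticle state.
even_gap = min(a + b for a, b in combinations(energies, 2))
# Odd-n system: lowest excitation is the spacing of the two lowest one-quasiparticle states.
odd_spacing = energies[1] - energies[0]
print(even_gap, 2 * delta, odd_spacing)
```

For these numbers the even-system excitation lies above 2∆, while the odd-system spacing is much smaller, in line with the discussion above.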
Accuracy of the BCS approximation
The BCS approximation is of course not exact since the true eigenstates have definite
particle number and the BCS quasiparticle vacuum does not. We can assess the level of
“inaccuracy” of the BCS approximation by considering its application to the degenerate
pairing hamiltonian for which pairing correlations are certainly dominant but for which the
solution can be obtained exactly.
So, let’s assume that we have n particles (n even) moving in Ω levels subject to the
degenerate pairing hamiltonian
\[ H = \sum_{\alpha}\epsilon\,\hat N_\alpha - G\sum_{\alpha,\gamma=1}^{\Omega} a^\dagger_\alpha a^\dagger_{\bar\alpha} a_{\bar\gamma} a_\gamma \]
The exact ground state energy of this system is given by (151), viz:
\[ E_{n0} = n\epsilon - \frac{G}{4}n(2\Omega - n + 2) \tag{198} \]
[Note that I am including a constant single-particle energy ϵ for all levels]. Furthermore, as
discussed on page 120 all states have the same occupation number,
\[ v_k^2 = \frac{n}{2\Omega} \]
and thus of course
\[ u_k^2 = 1 - \frac{n}{2\Omega} \]
Finally, the lowest excited states are those with seniority v = 2 and they all occur at
\[ E_{n2} = E_{n0} + G\Omega, \tag{199} \]
likewise from (151).
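Now let’s see what the BCS approximation yields for these quantities. The ground state
energy is given by
\begin{align*}
\langle\tilde 0|H|\tilde 0\rangle &= U' + \lambda n \\
&= \sum_{\alpha=1}^{\Omega}\left[2\epsilon v_\alpha^2 - Gv_\alpha^4\right] - G\sum_{\alpha,\gamma=1}^{\Omega}u_\alpha v_\alpha u_\gamma v_\gamma
\end{align*}
But
\[ u_\alpha = \sqrt{1 - \frac{n}{2\Omega}} \]
and
\[ v_\alpha = \sqrt{\frac{n}{2\Omega}} \]
so that
\begin{align*}
\langle\tilde 0|H|\tilde 0\rangle &= n\epsilon - G\Omega\left(\frac{n}{2\Omega}\right)^2 - G\left(1 - \frac{n}{2\Omega}\right)\left(\frac{n}{2\Omega}\right)\Omega^2 \\
&= n\epsilon - \frac{G}{4\Omega}n^2 - \frac{G\Omega}{2}n + \frac{G}{4}n^2 \\
&= n\epsilon - \frac{G}{4}n\left(2\Omega - n + \frac{n}{\Omega}\right) \tag{200}
\end{align*}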
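Comparing (200) with (198), we see that the magnitude of the error in the BCS result relative to the exact
result (taking ϵ = 0, so that only the interaction energy is compared) is
\[ \left|\frac{E_{BCS} - E_{Exact}}{E_{Exact}}\right| = \frac{2 - \frac{n}{\Omega}}{2\Omega - n + 2} = \frac{1}{\Omega}\,\frac{2 - \frac{n}{\Omega}}{2 - \frac{n}{\Omega} + \frac{2}{\Omega}} \]
In the limit in which Ω is large, so that there are many active single-particle levels,
\[ \left|\frac{E_{BCS} - E_{Exact}}{E_{Exact}}\right| \sim \frac{1}{\Omega} \]
which is very small.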
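The 1/Ω behaviour is easy to see numerically. The sketch below simply evaluates (198) and (200) with ϵ = 0 at half filling; the function names and the choice n = Ω are illustrative assumptions of mine.

```python
def exact_ground_energy(n, omega, g, eps=0.0):
    """Exact ground state energy, eq. (198)."""
    return n * eps - 0.25 * g * n * (2 * omega - n + 2)


def bcs_ground_energy(n, omega, g, eps=0.0):
    """BCS ground state energy, eq. (200)."""
    return n * eps - 0.25 * g * n * (2 * omega - n + n / omega)


def relative_error(n, omega, g):
    """Magnitude of the relative error of BCS with respect to the exact result."""
    exact = exact_ground_energy(n, omega, g)
    return abs((bcs_ground_energy(n, omega, g) - exact) / exact)


for omega in (10, 100, 1000):
    n = omega   # half filling
    print(omega, relative_error(n, omega, g=1.0), 1.0 / omega)
```

At half filling the error works out to 1/(Ω + 2), which approaches 1/Ω as the number of active levels grows.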
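The BCS approximation produces as the lowest excited states (for even n) those with two
quasiparticles. They occur at
\begin{align*}
E_{2qp} &= E_{0qp} + \sqrt{\tilde\epsilon_{k_1}^2 + \Delta^2} + \sqrt{\tilde\epsilon_{k_2}^2 + \Delta^2} \\
&= E_{0qp} + 2\sqrt{\tilde\epsilon^2 + \Delta^2}
\end{align*}
where I’ve used the fact that all single-particle energies are the same (ϵ) in the degenerate-orbit
problem, so that all the $\tilde\epsilon_k$ share a common value $\tilde\epsilon$. The square root quantity can be obtained from the gap equation
\[ \frac{2}{G} = \sum_{\alpha=1}^{\Omega}\frac{1}{\sqrt{\tilde\epsilon^2 + \Delta^2}} = \frac{\Omega}{\sqrt{\tilde\epsilon^2 + \Delta^2}} \]
Thus,
\[ \sqrt{\tilde\epsilon^2 + \Delta^2} = \frac{G\Omega}{2} \]
and
\[ E_{2qp} = E_{0qp} + G\Omega \]
which is identical to the result given in (199) for the exact calculation. Thus, the BCS approximation
(despite giving up number conservation) very accurately reproduces the exact
spectrum in the vicinity of the ground state.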
Superconductivity in solids
I would now like to very briefly discuss the relevance of the BCS formalism we have
just developed to superconductivity in solids. My discussion will be very qualitative, but
hopefully will give you some of the flavor of why BCS is so important.
Certain solids have been known to exhibit superconductivity ever since the classic experiments
in the laboratory of Onnes in 1911 [H. K. Onnes, Commun. Phys. Lab. 12, 120
(1911)]. The quantum mechanical theory of superconductivity was put forth by Bardeen,
Cooper and Schrieffer in 1957 [J. Bardeen, L. N. Cooper and J. R. Schrieffer, Phys. Rev.
108, 1175 (1957)], whereby superconductivity was described as the condensation of a set
of correlated pairs averaged over the whole system. The mathematical framework in which
this theory was implemented was the number-nonconserving BCS theory.
That the formalism we have just developed applies to solids under appropriate conditions
can be seen from the following heuristic discussion. Consider the hamiltonian for electrons
in a lattice. It is most conveniently expressed in terms of so-called Bloch states, specified
by a wave vector k and by a spin σ = ±1/2. Then the hamiltonian describing the motion
of the electrons in the lattice can be expressed as
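\[ H = \sum_{k>k_F,\,\sigma}\epsilon_k\,a^\dagger_{k\sigma}a_{k\sigma} + \sum_{k<k_F,\,\sigma}|\epsilon_k|\left(1 - a^\dagger_{k\sigma}a_{k\sigma}\right) + H_{\rm Coulomb} + \frac{1}{2}\sum_{k,k',\sigma,\sigma',\kappa}\frac{2\hbar\omega_\kappa|M_\kappa|^2}{(\epsilon_k - \epsilon_{k+\kappa})^2 - (\hbar\omega_\kappa)^2}\,a^\dagger_{k'-\kappa,\sigma'}a^\dagger_{k+\kappa,\sigma}a_{k'\sigma'}a_{k\sigma} \tag{201} \]
Here, kF is the so-called Fermi momentum, corresponding to the Fermi energy $\epsilon_{k_F}$ we discussed
earlier.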
The third term represents the (screened) Coulomb interaction between the electrons.
The fourth term is the so-called phonon interaction. It is the part of the electron-electron
interaction that derives from the virtual exchange of phonons with the lattice. The basic
idea is that the electrons interact with the lattice and produce a collective excitation called a
phonon. But we are not explicitly including the lattice and thus phonon degrees of freedom
in our treatment, which involves the electrons only. Thus we take into account the excitation
of these phonons in second-order perturbation theory. The form of the phonon interaction
indicates that it will be attractive (i.e. negative) for single-particle excitation energies $|\epsilon_k - \epsilon_{k+\kappa}| < \hbar\omega_\kappa$.
This is to be contrasted with the screened Coulomb interaction, which is of
course repulsive and which can be expressed approximately as $4\pi e^2/\kappa^2$.
The overall residual interaction between electrons will be attractive if
\[ \left\langle -\frac{2|M_\kappa|^2}{\hbar\omega_\kappa} + \frac{4\pi e^2}{\kappa^2}\right\rangle_{\rm ave} < 0 \]
Note that in the phonon interaction term the scattering takes place from a two-particle state
with momentum k + k ′ to another two-particle state with the same total momentum, i.e.
it conserves momentum as it must.
Cooper [Phys. Rev. 104, 1189 (1956)] showed that two electrons in a lattice interact
most strongly with one another via the phonon interaction when their total momentum
k + k ′ = 0, i.e. when the two electrons have equal and opposite momenta. Furthermore,
he showed that the interaction between the two electrons is stronger when their spins are
antiparallel than when they are parallel, since in the parallel-spin case the exchange matrix
elements tend to reduce the interaction. Based on this, he proposed that the interaction
between electrons in a solid could be approximated by an interaction that only acted on
such pairs,
\[ -G\sum_{k,\,\kappa>0} a^\dagger_{-k-\kappa,\downarrow}\,a^\dagger_{k+\kappa,\uparrow}\,a_{k,\uparrow}\,a_{-k,\downarrow} \tag{202} \]
which indeed has the above characteristics. The resulting hamiltonian is then
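\[ H = \sum_{k>k_F,\,\sigma}\epsilon_k\,a^\dagger_{k\sigma}a_{k\sigma} - G\sum_{k,\,\kappa>0} a^\dagger_{-k-\kappa,\downarrow}\,a^\dagger_{k+\kappa,\uparrow}\,a_{k,\uparrow}\,a_{-k,\downarrow} \tag{203} \]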
With this as the hamiltonian, he then considered a state vector
\[ |\Phi\rangle = \Gamma^\dagger|FS\rangle = \sum_{k>k_F}\frac{1}{2\epsilon_k - E}\,a^\dagger_{k\uparrow}a^\dagger_{-k\downarrow}|FS\rangle \tag{204} \]
and showed that for such an interaction it would produce a bound state on top of the Fermi
sea (FS) for any attractive pairing interaction. The energy E for this bound state is given
by the lowest solution of the equation
\[ \frac{1}{G} = \sum_{k>k_F}\frac{1}{2\epsilon_k - E} \tag{205} \]
which can be obtained numerically. The resulting bound collective pair is called a Cooper
pair.
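To illustrate how (205) might be solved numerically, here is a minimal bisection sketch. It relies on the observation that for E below twice the lowest level the right-hand side of (205) grows monotonically from 0 to infinity, so exactly one root lies there for any G > 0. The function name, the bracketing strategy and the test levels are my own assumptions, and G is assumed not to be vanishingly small.

```python
def cooper_pair_energy(levels, g, iterations=200):
    """Lowest solution E of 1/G = sum_k 1/(2 ε_k - E), found by bisection.

    `levels` holds the single-particle energies ε_k above the Fermi surface.
    """
    upper = 2.0 * min(levels)   # the bound state lies below 2 ε_min

    def mismatch(energy):
        return sum(1.0 / (2.0 * e - energy) for e in levels) - 1.0 / g

    width = 1.0
    while mismatch(upper - width) > 0.0:   # walk down until the sum drops below 1/G
        width *= 2.0
    lo, hi = upper - width, upper
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if mid >= hi or mismatch(mid) > 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


# Check against a case solvable by hand: M degenerate levels at ε give E = 2ε - M G.
levels, g = [1.0] * 4, 0.25
energy = cooper_pair_energy(levels, g)
print(energy, 2.0 * levels[0] - len(levels) * g)
```

For a realistic band one would pass the actual list of ϵk above the Fermi surface; the pair is bound whenever the returned E lies below 2ϵ for the lowest available level.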
Bardeen, Cooper and Schrieffer followed up on this idea by considering this hamiltonian
and a trial state vector made up as a condensate of these collective Cooper pairs. Because
of the difficulty in treating a number-conserving condensate
\[ (\Gamma^\dagger)^n|FS\rangle \]
they instead considered the number-nonconserving state
\[ e^{\Gamma^\dagger}|FS\rangle \]
which we remember as one of the forms for the BCS quasiparticle vacuum [see eqs.
(154, 155)]. When we minimize the expectation value of the pairing hamiltonian (203)
for such a trial state, we obtain the BCS solution given earlier.
Of course, from my earlier remarks it is clear that (202) will only be the dominant piece
of the residual effective interaction between electrons in a lattice when it dominates over
the repulsive screened Coulomb interaction. Pines has shown that the condition that the
phonon interaction dominates over the Coulomb interaction is in qualitative agreement with
earlier empirical rules established by Matthias for the occurrence of superconductivity in
solids. So all seems to be consistent.
As you probably all know, superconductivity in solids has many interesting phenomena
associated with it, not just the existence of a large gap in the spectrum. Some of these are
(a) that in superconductors the electrical resistance disappears below a certain temperature;
(b) that superconductors exhibit a second-order phase transition at the critical tempera-
ture; and
(c) that superconductors exhibit the so-called Meissner effect, i.e. they exclude magnetic
fields.
All of these effects are in fact closely related to the existence of a large pairing gap. BCS
theory reproduces all of these phenomena.
Some further comments on BCS theory and the pairing problem
I’d like to close my discussion of pairing in many-body quantum systems with two im-
portant comments, both of which I will briefly discuss.
(a) BCS theory and more general interactions:
I developed BCS theory for a pure pairing hamiltonian, characterized by an interaction
which only acts on pairs in time-reversed states and furthermore for which the strength by
which such a pair is scattered into another such pair is independent of the pairs involved.
Such a hamiltonian is obviously dominated by pairing correlations between time-reversed
pairs, since those are the only pairs that feel the interaction.
On the other hand, it is possible that pairing correlations will also be dominant for more
general interactions. Put another way, after the Hartree Fock correlations are taken into
account, this may be the only other piece of the residual hamiltonian strong enough to
141
produce two-body correlations. And indeed, even for a more general interaction,
$$\frac{1}{4}\sum_{k_1k_2k_3k_4} V_{k_1k_2k_3k_4}\, a^\dagger_{k_1}a^\dagger_{k_2}a_{k_4}a_{k_3} ,$$
it is conceptually straightforward to minimize the expectation value of the hamiltonian in the
BCS quasiparticle vacuum. The equations are somewhat more cumbersome, but are nev-
ertheless obtained using the same basic approach and solved using the same basic iterative
method.
(b) Pairing correlations in atomic nuclei:
In 1958, soon after the development of BCS theory, Bohr, Mottelson and Pines (Phys.
Rev. 110 (1958)) suggested that a similar pairing phenomenon could explain the large gaps
in the spectra of nuclei with an even number of neutrons and an even number of protons.
There, however, the pairing was not between particles in states (k, σ) and (−k,−σ) but
rather between identical nucleons in time-reversed single-nucleon states, $(njm)$ and $(nj\,{-m})$.
Bohr, Mottelson and Pines noted, however, that in systems with as few particles as atomic
nuclei the violation of number conservation inherent in the BCS theory could cause fairly
serious errors and that development of a number-conserving theory was desirable. Indeed,
for such systems it was soon shown by Dietrich, Mang and Pradal [Phys. Rev. 135, B22
(1964)] how to restore particle number in the BCS formalism by using a trial state
$$(\Gamma^\dagger)^n\,|0\rangle$$
This is referred to as the projected BCS (PBCS) approximation and is precisely what I proposed
doing on page 121 of these notes [see eq. (152)]. While PBCS is very difficult to implement
for systems in condensed matter, where the number of particles is so large but where fortunately
it isn’t very critical to use it, it can be implemented in atomic nuclei with their fairly
small number of particles. And like BCS theory it can be implemented for more general
hamiltonians than just the pairing hamiltonian.
The Dirac Equation
The next topic we will be discussing concerns how to merge relativity with Quantum
Mechanics. On this topic, Shankar has a nice presentation, so I would like to ask you to
begin reading Chapter 20 in his text.
Let me begin by reminding you that the Schrodinger equation, on which we have focused
so far, was obtained by quantizing classical mechanics. All of the invariance properties of
the classical hamiltonian are thus also present in the corresponding quantum hamiltonian.
As such, all physical properties derived from the Schrodinger equation are invariant under a
Galilean transformation of the reference frame. But they are not invariant under a Lorentz
transformation, as prescribed by the principle of relativity. Of course, in the limit v << c,
we know that a Galilean transformation approximates a Lorentz transformation. What we
conclude therefore is that the non-relativistic Schrodinger theory is appropriate for describing
phenomena for which v << c. And indeed experiment confirms that this is so.
Clearly, however, when the condition v << c is not realized, we will need a quantum
theory that properly respects full Lorentz invariance. And this is what we will now set out
to develop.
Building a theory that respects full Lorentz invariance is unfortunately not the full an-
swer for relativistic systems. In a relativistic theory, mass and energy are equivalent. Thus
whenever the interactions involved give rise to energy transfers that exceed the rest mass
of the particles, particles can be created. To be a complete theory of relativistic quantum
phenomena, our theory must not only respect Lorentz invariance, but it must also accom-
modate states that differ in the number - and perhaps even nature - of the particles from
which it derives. To do this properly for both boson and fermion systems we must resort
to Relativistic Quantum Field Theory. Because of lack of time, I will restrict myself to
but a few simple comments about Relativistic Quantum Field Theory at the very end of
my discussion on Relativistic Quantum Mechanics. Instead, I will focus my discussion on
the first step historically taken for incorporating relativity into quantum physics, the Dirac
equation. The Dirac equation is a relativistic theory for spin-1/2 particles (i.e. fermions) in
a given force field. As we will see, it does have many important features and as a result is
used in many problems. Some of its more attractive features are:
1. It is Lorentz invariant;
2. It naturally incorporates the concept of intrinsic spin, which therefore does not have
to be introduced ad hoc as in Schrodinger theory;
3. It does admit the creation of particles;
4. In the limit of small velocities, it reduces to the Schrodinger equation.
One of the things we will see is that in looking at the nonrelativistic, i.e. small v/c, limit
of the theory, we not only recover the Schrodinger equation, but also have a well-defined
prescription for looking at small (but often interesting and important) effects of purely
relativistic origin. We will discuss this in some detail for the hydrogen atom.
I will start out by considering the simplest case possible, that of a free particle. Classically,
the energy (or hamiltonian) of a nonrelativistic free particle is
$$E = \frac{p^2}{2m}$$
If we promote E and p to quantum operators via the substitutions
$$E \to i\hbar\frac{\partial}{\partial t} \tag{206}$$
and
$$\mathbf{p} \to \mathbf{P} \tag{207}$$
and let both sides of the resulting operator equation act on a state vector |Ψ(t) >, we obtain
the time-dependent Schrodinger equation
$$i\hbar\frac{\partial}{\partial t}|\Psi(t)\rangle = \frac{P^2}{2m}|\Psi(t)\rangle$$
Now let’s consider a free particle at large velocities. The corresponding equation for the
classical energy E is
$$E^2 = c^2p^2 + m^2c^4$$
or equivalently
$$E = \left(c^2p^2 + m^2c^4\right)^{1/2}$$
This is the energy-momentum relation appropriate to relativistic particles which we would
like (somehow) to quantize.
The simplest way we might imagine doing this is to make the same substitutions (206,207)
as before, namely to raise E and p to quantum operators, via those equations. Doing this
and acting on a state vector | Ψ(t) > gives
$$i\hbar\frac{\partial}{\partial t}|\Psi(t)\rangle = \left(c^2P^2 + m^2c^4\right)^{1/2}|\Psi(t)\rangle \tag{208}$$
This isn’t terribly appealing. First of all, square root operators are not very nice. Even
more importantly, the equation seems to treat space and time in an asymmetric fashion,
suggesting that it will not be able to preserve the Lorentz invariance of relativity. To see
this, consider the equation in momentum representation, whence
$$\langle \mathbf{p}|\Psi(t)\rangle = \Psi(\mathbf{p}, t)$$
These states are eigenfunctions of the momentum operator, so that (208) becomes
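$$i\hbar\frac{\partial}{\partial t}\Psi(\mathbf{p},t) = c\left(p^2+m^2c^2\right)^{1/2}\Psi(\mathbf{p},t) = mc^2\left(1 + \frac{p^2}{2m^2c^2} - \frac{p^4}{8m^4c^4} + \dots\right)\Psi(\mathbf{p},t)$$
Now transform to coordinate space, where $p^2$ becomes $-\hbar^2\nabla^2$, etc. We see then that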
we have an inherently different dependence on time (a single derivative) and space (lots of
higher-order derivatives). What we would like is an equation which is of the same order in
space and time. So what should we do?
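As a quick numerical illustration of the expansion above, the following sketch compares the exact square-root energy with the three terms shown explicitly. The function names and the choice of units m = c = 1 are my own assumptions, made only to keep the numbers simple.

```python
import math


def exact_energy(p, m=1.0, c=1.0):
    """Exact relativistic free-particle energy, c * sqrt(p^2 + m^2 c^2)."""
    return c * math.sqrt(p**2 + (m * c) ** 2)


def series_energy(p, m=1.0, c=1.0):
    """First three terms of the expansion in powers of p^2 / (m c)^2."""
    x = (p / (m * c)) ** 2
    return m * c**2 * (1.0 + x / 2.0 - x**2 / 8.0)


# Illustrative momenta in units of m c (assumed values).
for p in (0.01, 0.1, 0.3):
    print(p, exact_energy(p), series_energy(p))
```

The two columns agree closely for small p/mc and drift apart as p grows, which is the sense in which the series is a low-velocity expansion.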
An interesting (and reasonable) thought is to consider directly the relation
$$E^2 = c^2p^2 + m^2c^4$$
and quantize it. Let’s see what that gives.
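$$E^2 \to i\hbar\frac{\partial}{\partial t}\, i\hbar\frac{\partial}{\partial t} = -\hbar^2\frac{\partial^2}{\partial t^2}$$
and
$$p^2 \to P^2 = -\hbar^2\nabla^2$$
in coordinate representation. Acting on a state $\Psi(\mathbf{r}, t)$ gives
$$-\hbar^2\frac{\partial^2}{\partial t^2}\Psi(\mathbf{r},t) = -c^2\hbar^2\nabla^2\Psi(\mathbf{r},t) + m^2c^4\Psi(\mathbf{r},t)$$
or
$$\left[\frac{1}{c^2}\frac{\partial^2}{\partial t^2} - \nabla^2 + \left(\frac{mc}{\hbar}\right)^2\right]\Psi(\mathbf{r},t) = 0 \tag{209}$$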
In this equation, time and space enter compatibly, which is nice. But we won’t use it!
Why? The reason is that the wave function Ψ(r, t) only depends on r and t. But we know
that to describe a particle with spin-1/2 we need a spinor wave function that depends not
only on r and t but also on the spin orientation. An equation such as (209) can thus never
reduce to the Schrodinger equation for a particle with spin in the limit v/c << 1.
In fact the above equation is not without interest or use. Equation (209), called the
Klein-Gordon equation, is often used as a relativistic wave equation for a spinless (or spin-0)
particle.
Now let’s discuss the correct way to obtain an equation appropriate to a relativistic
quantum spin-1/2 particle. To do this, let’s return to (208), which was given earlier
and then rejected:
$$i\hbar\frac{\partial}{\partial t}|\Psi(t)\rangle = \left(c^2P^2 + m^2c^4\right)^{1/2}|\Psi(t)\rangle$$
Ideally, we’d like to stay with this equation, rather than its squared version (209), since it
is first order in time. This will, for reasons I won’t discuss, make simpler a probabilistic
interpretation of the proper quantum theory that emerges from it (I promise).
The problem with it, as I said earlier, is the messy square root. This is why we discarded it
then. Dirac’s inspiration was to see whether it was somehow possible to rewrite the quantity
in the square root as a perfect square. If so, the square root could be taken trivially and
all should be fine. Indeed, we would then end up with an equation first order both in space
and time, as we would also like. So, let’s see how to do this.
Let’s look at the factor in the square root, after removing a c2, and then try to express
it as a perfect square, i.e.
$$p^2 + m^2c^2 = (\alpha_xp_x + \alpha_yp_y + \alpha_zp_z + \beta mc)^2 = (\boldsymbol{\alpha}\cdot\mathbf{p} + \beta mc)^2 \tag{210}$$
where from now on (for simplicity) I will use p rather than P to refer to the momentum
operator. Is it possible to determine α and β such that this holds? Matching the two sides
of (210), we obtain
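$$\begin{aligned} p_x^2 + p_y^2 + p_z^2 + m^2c^2 = {} & \alpha_x^2p_x^2 + \alpha_y^2p_y^2 + \alpha_z^2p_z^2 + \beta^2m^2c^2 \\ & + p_xp_y\,(\alpha_x\alpha_y + \alpha_y\alpha_x) + \text{cyclic permutations} \\ & + mc\,p_x\,(\alpha_x\beta + \beta\alpha_x) + (x \to y) + (x \to z) \end{aligned} \tag{211}$$
(a) From the first line in (211), we see that
$$\alpha_i^2 = 1 \ \ (i = x, y, z) \quad\text{and}\quad \beta^2 = 1 \tag{212}$$
(b) From the second line, we see that
$$\alpha_x\alpha_y + \alpha_y\alpha_x = \alpha_x\alpha_z + \alpha_z\alpha_x = \alpha_y\alpha_z + \alpha_z\alpha_y = 0$$
or equivalently
$$\{\alpha_i, \alpha_j\} = 0 , \quad\text{for all } i \neq j \tag{213}$$
(c) From the third line, we see that
$$\alpha_i\beta + \beta\alpha_i = \{\alpha_i, \beta\} = 0 , \quad\text{for all } i \tag{214}$$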
If we find αi and β that satisfy (212-214), we will have achieved our goal.
Some observations:
(1) From (213,214), it is clear that αi and β cannot be simply c-numbers. They must be
matrices that anticommute with one another. And this is nice, since as we noted earlier we
want a theory that in the nonrelativistic (NR) limit will give a nonrelativistic Schrodinger
equation involving two-component spinors.
(2) Since we want our hamiltonian to be hermitean, so that a probabilistic interpretation is
possible, it is clear that these matrices must be hermitean.
(3) From (212), it is clear that each of the matrices can only have eigenvalues ±1.
(4) It is clear, though at this point disheartening, that they can’t be 2 × 2 matrices. The
only 2 × 2 matrices that satisfy all these conditions are the Pauli spin matrices. But there
are only three Pauli spin matrices, and we need four - αx, αy, αz and β.
(5) It is possible to prove that the lowest dimensional matrices possible for satisfying all
these requirements are 4× 4.
The simplest and most common (though not unique) 4× 4 matrices that satisfy all these
requirements are:
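$$\boldsymbol{\alpha} = \begin{pmatrix} 0 & \boldsymbol{\sigma} \\ \boldsymbol{\sigma} & 0 \end{pmatrix} \quad\text{and}\quad \beta = \begin{pmatrix} I & 0 \\ 0 & -I \end{pmatrix} \tag{215}$$
Here, $\boldsymbol{\sigma}$ are the usual 2 × 2 Pauli spin matrices and
$$I = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$$
is the 2 × 2 identity matrix.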
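The conditions (212)-(214), together with hermiticity, can be checked by brute force for the choice (215). The sketch below is one way to do it using only nested lists; the helper names (`block`, `matmul`, `anticommutator`, `dirac_conditions_hold`) are hypothetical and not part of the notes.

```python
# Pauli matrices as nested lists, so only the standard library is needed.
SIGMA = [
    [[0, 1], [1, 0]],        # sigma_x
    [[0, -1j], [1j, 0]],     # sigma_y
    [[1, 0], [0, -1]],       # sigma_z
]
I2 = [[1, 0], [0, 1]]
Z2 = [[0, 0], [0, 0]]
MINUS_I2 = [[-1, 0], [0, -1]]


def block(a, b, c, d):
    """Assemble the 4x4 matrix [[a, b], [c, d]] from 2x2 blocks."""
    return [a[i] + b[i] for i in range(2)] + [c[i] + d[i] for i in range(2)]


def matmul(x, y):
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def anticommutator(x, y):
    xy, yx = matmul(x, y), matmul(y, x)
    n = len(x)
    return [[xy[i][j] + yx[i][j] for j in range(n)] for i in range(n)]


def dagger(x):
    n = len(x)
    return [[complex(x[j][i]).conjugate() for j in range(n)] for i in range(n)]


def close(x, y, tol=1e-12):
    n = len(x)
    return all(abs(x[i][j] - y[i][j]) < tol for i in range(n) for j in range(n))


ALPHA = [block(Z2, s, s, Z2) for s in SIGMA]   # alpha_x, alpha_y, alpha_z
BETA = block(I2, Z2, Z2, MINUS_I2)


def dirac_conditions_hold():
    """Check hermiticity and {M_i, M_j} = 2 delta_ij for M = alpha_x,y,z, beta."""
    mats = ALPHA + [BETA]
    for i, a in enumerate(mats):
        if not close(a, dagger(a)):
            return False
        for j, b in enumerate(mats):
            expected = [[2 if (i == j and r == c) else 0 for c in range(4)]
                        for r in range(4)]
            if not close(anticommutator(a, b), expected):
                return False
    return True


print(dirac_conditions_hold())
```

Writing all three conditions as the single statement {M_i, M_j} = 2δ_ij keeps the check compact: the diagonal cases reproduce (212) and the off-diagonal cases reproduce (213) and (214).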
Summarizing, with this choice of α and β, a proper relativistic quantum equation of
motion for a free (spin-1/2) particle is
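$$i\hbar\frac{\partial}{\partial t}|\Psi(t)\rangle = c\,(\boldsymbol{\alpha}\cdot\mathbf{p} + \beta mc)\,|\Psi(t)\rangle$$
or
$$\left(i\hbar\frac{\partial}{\partial t} - c\boldsymbol{\alpha}\cdot\mathbf{p} - \beta mc^2\right)|\Psi(t)\rangle = 0 \tag{216}$$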
This is the free-particle Dirac equation.
Let’s now discuss some features of the free-particle Dirac equation:
1. The first thing to note is that as suggested earlier the equation is indeed first order
both in space and time, suggesting that it can indeed preserve the Lorentz invariance
required of a relativistic theory.
2. Since the Pauli spin matrices σ naturally appear in the formalism (via the matrices αx,
αy and αz) and since they relate to the spin operator for a spin-1/2 particle, it would
seem that the Dirac equation contains the requisite physics appropriate to spin-1/2
particles, e.g. the electron. This is to be contrasted with the Klein-Gordon equation
presented earlier, which had no chance of representing the physics of particles with
intrinsic spin.
3. Note further that spin arose here completely naturally, rather than having to be put
in ad hoc as in our non-relativistic quantum theory. This suggests that intrinsic spin
is an inherently relativistic concept, even though we were able to append it by hand
to our NR formalism.
4. Since the matrices α and β that enter the (free-particle) Dirac equation are 4-
dimensional, it is clear that the state vector |Ψ(t) > is a 4-dimensional object as
well. At first glance, this may not seem a very happy outcome. We’ll address this
shortly, however, and see that in the small v/c NR limit of the Dirac theory, we indeed
recover the 2-component spinor theory of NR Schrodinger theory.
5. Finally, note that the free-particle Hamiltonian in this theory
$$H = c\boldsymbol{\alpha}\cdot\mathbf{p} + \beta mc^2$$
is clearly hermitean, as you can readily convince yourselves.
The probabilistic interpretation of the Dirac theory follows naturally from the hermiticity
of the hamiltonian, exactly as in the NR quantum theory. In particular, one defines
$$\Psi^\dagger(\mathbf{r}, t)\Psi(\mathbf{r}, t)$$
as the probability density of finding the spin-1/2 particle at point r at time t. Furthermore,
by using the Dirac equation, one can readily confirm that
$$\int \Psi^\dagger(\mathbf{r}, t)\Psi(\mathbf{r}, t)\,d\mathbf{r} = \text{constant}$$
i.e. that the norm of a given state remains constant in time, exactly as in the nonrelativistic
Schrodinger theory. So, all seems fine!
An alternative form for the Dirac equation
Often the Dirac equation is presented in a slightly different form, in which its Lorentz
covariance is more apparent. In particular, we can introduce instead of the α and β matrices,
a related set of four matrices,
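$$\gamma^0 = \beta , \quad \gamma^1 = \beta\alpha_x , \quad \gamma^2 = \beta\alpha_y , \quad \gamma^3 = \beta\alpha_z \tag{217}$$
The free-particle Dirac equation can be expressed in terms of these new matrices as
$$\gamma^\mu\frac{\partial\Psi}{\partial x^\mu} + i\kappa\Psi = 0 \tag{218}$$
where
$$\kappa = mc/\hbar$$
and
$$\gamma^\mu\frac{\partial}{\partial x^\mu} = \gamma^0\frac{\partial}{\partial x^0} + \boldsymbol{\gamma}\cdot\nabla = \beta\,\frac{1}{c}\frac{\partial}{\partial t} + \beta\boldsymbol{\alpha}\cdot\nabla$$
Finally the γ matrices can be shown to satisfy anticommutation relations
$$\gamma^\mu\gamma^\nu + \gamma^\nu\gamma^\mu = 2g^{\mu\nu}I$$
in terms of the familiar tensor $g^{\mu\nu}$.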
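The anticommutation relations for the γ matrices can be confirmed numerically from the definitions (217). In the sketch below I assume the metric signature g = diag(+1, −1, −1, −1), which is what the definitions imply since (γ^0)² = β² = 1 and (γ^i)² = βα_iβα_i = −1; the helper names are hypothetical.

```python
SIGMA = [
    [[0, 1], [1, 0]],        # sigma_x
    [[0, -1j], [1j, 0]],     # sigma_y
    [[1, 0], [0, -1]],       # sigma_z
]
I2 = [[1, 0], [0, 1]]
Z2 = [[0, 0], [0, 0]]
MINUS_I2 = [[-1, 0], [0, -1]]


def block(a, b, c, d):
    """Assemble the 4x4 matrix [[a, b], [c, d]] from 2x2 blocks."""
    return [a[i] + b[i] for i in range(2)] + [c[i] + d[i] for i in range(2)]


def matmul(x, y):
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


BETA = block(I2, Z2, Z2, MINUS_I2)
ALPHA = [block(Z2, s, s, Z2) for s in SIGMA]

# gamma^0 = beta, gamma^i = beta * alpha_i, as in (217).
GAMMA = [BETA] + [matmul(BETA, a) for a in ALPHA]

# Assumed metric signature (+, -, -, -).
METRIC = [1, -1, -1, -1]


def clifford_algebra_holds(tol=1e-12):
    """Check gamma^mu gamma^nu + gamma^nu gamma^mu = 2 g^{mu nu} I."""
    for mu in range(4):
        for nu in range(4):
            ab = matmul(GAMMA[mu], GAMMA[nu])
            ba = matmul(GAMMA[nu], GAMMA[mu])
            g = METRIC[mu] if mu == nu else 0
            for r in range(4):
                for c in range(4):
                    expected = 2 * g if r == c else 0
                    if abs(ab[r][c] + ba[r][c] - expected) > tol:
                        return False
    return True


print(clifford_algebra_holds())
```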
Incorporation of electromagnetism
As a first step towards generalizing the Dirac equation to other than free-particle systems,
let’s discuss what happens when we couple a spin-1/2 fermion (for concreteness, an electron)
to an electromagnetic field.
You’ve already seen this discussed in a nonrelativistic context and in fact the same ap-
proach applies here as well. All we have to do is to replace the operator p by p− qA/c. As
we learned last semester in our discussion of the quantum theory of electromagnetism, A is
also a quantum operator. Note that in principle we should also include the scalar potential
ϕ in our discussion, but as we did then we will work in a gauge in which ϕ = 0. Then the
Dirac equation for a spin-1/2 particle in an EM field becomes
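$$i\hbar\frac{\partial}{\partial t}|\Psi(t)\rangle = \left[c\boldsymbol{\alpha}\cdot\left(\mathbf{p} - q\mathbf{A}/c\right) + \beta mc^2\right]|\Psi(t)\rangle \tag{219}$$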
As before, the stationary states are of the form
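$$|\Psi(t)\rangle = |\Psi\rangle\, e^{-iEt/\hbar} \tag{220}$$
which when plugged into (219) yields
$$E|\Psi\rangle = \left(c\boldsymbol{\alpha}\cdot\boldsymbol{\pi} + \beta mc^2\right)|\Psi\rangle \tag{221}$$
where
$$\boldsymbol{\pi} = \mathbf{p} - q\mathbf{A}/c$$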
Now let’s write the four-component eigenvector |Ψ > as
$$|\Psi\rangle = \begin{pmatrix} \chi \\ \phi \end{pmatrix} \tag{222}$$
where χ and ϕ are themselves two-component vectors (or spinors).
Using the fact that
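$$\beta = \begin{pmatrix} I & 0 \\ 0 & -I \end{pmatrix}$$
and
$$\boldsymbol{\alpha} = \begin{pmatrix} 0 & \boldsymbol{\sigma} \\ \boldsymbol{\sigma} & 0 \end{pmatrix}$$
we can rewrite the time-independent Dirac equation (221) as
$$\begin{pmatrix} (E - mc^2)I & -c\boldsymbol{\sigma}\cdot\boldsymbol{\pi} \\ -c\boldsymbol{\sigma}\cdot\boldsymbol{\pi} & (E + mc^2)I \end{pmatrix} \begin{pmatrix} \chi \\ \phi \end{pmatrix} = 0 \tag{223}$$
Thus
$$\left(E - mc^2\right)\chi - c\boldsymbol{\sigma}\cdot\boldsymbol{\pi}\,\phi = 0 \tag{224}$$
and
$$-c\boldsymbol{\sigma}\cdot\boldsymbol{\pi}\,\chi + \left(E + mc^2\right)\phi = 0 \tag{225}$$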
Thus, the two-component spinors χ and ϕ are coupled.
Solving the coupled equations (224) and (225) gives the eigensolutions for an electron in
an EM field. A bit later in the semester, we will discuss the solutions of these equations in
the absence of an EM field.
The nonrelativistic limit
I would like now to focus on the solutions of the coupled equations (224,225) at low
velocities (v/c << 1), to see how nonrelativistic Schrodinger theory emerges in this limit.
As we’ll see, it emerges with precisely the features we’d like.
From (225) we see that
$$\phi = \frac{c\boldsymbol{\sigma}\cdot\boldsymbol{\pi}}{E + mc^2}\,\chi$$
The energy appearing here is the full relativistic energy, including the rest mass mc2. Since
the energy in Schrodinger theory doesn’t include the rest mass, let’s define the Schrodinger
energy as
$$E_S = E - mc^2$$
Then
$$\phi = \frac{c\boldsymbol{\sigma}\cdot\boldsymbol{\pi}}{E_S + 2mc^2}\,\chi$$
At very low velocities, i.e. in the nonrelativistic domain, ES is much less than the
electron’s rest mass. Thus
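$$E_S + 2mc^2 \approx 2mc^2$$
and
$$\phi \approx \frac{c\boldsymbol{\sigma}\cdot\boldsymbol{\pi}}{2mc^2}\,\chi = \frac{\boldsymbol{\sigma}\cdot\boldsymbol{\pi}}{2mc}\,\chi \tag{226}$$
The numerator in (226) contains the electron’s momentum operator and thus its expectation
value will be of order mv, where v is the electron’s velocity. Thus,
$$\left|\frac{\phi}{\chi}\right| \approx \frac{mv}{2mc} = \frac{1}{2}\frac{v}{c}$$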
We see therefore that in the nonrelativistic limit, ϕ is very small compared to χ. For this
reason, ϕ is referred to as the “small component” of |Ψ > and χ as the “large component”,
as long as we are talking about the NR domain. More generally χ and ϕ are referred to as the
“upper” and “lower” components, respectively, for equally obvious reasons.
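To get a feel for how small the “small component” really is, here is a rough numerical sketch for a free particle (A = 0), where the exact ratio of lower to upper component is cp/(E + mc²). The function name and the units m = c = 1 are assumptions of mine; the comparison column is the estimate v/2c obtained above.

```python
import math


def small_to_large_ratio(v_over_c):
    """Exact |phi/chi| = c p / (E + m c^2) for a free particle, with m = c = 1."""
    gamma = 1.0 / math.sqrt(1.0 - v_over_c**2)
    p = gamma * v_over_c      # relativistic momentum, p = gamma m v
    energy = gamma            # total energy, E = gamma m c^2
    return p / (energy + 1.0)


# Compare the exact ratio with the nonrelativistic estimate v / 2c.
for beta in (0.001, 0.01, 0.1, 0.5):
    print(beta, small_to_large_ratio(beta), beta / 2.0)
```

For v/c of order 0.01 the two agree to a few parts in 10⁵, and even at v/c = 0.5 the lower component is still only about a quarter of the upper one.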
We now begin to see how our two-component spinor theory will emerge in the NR limit.
It will emerge when we focus on the large component χ, albeit perhaps including effects due
to the small component ϕ perturbatively. Let’s see how this plays out in a bit more detail.
To do this, let’s plug (226) into (224). Remembering that ES = E −mc2, we obtain
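$$E_S\,\chi \approx \frac{(\boldsymbol{\sigma}\cdot\boldsymbol{\pi})(\boldsymbol{\sigma}\cdot\boldsymbol{\pi})}{2m}\,\chi \tag{227}$$
Using the identity
$$(\boldsymbol{\sigma}\cdot\mathbf{A})(\boldsymbol{\sigma}\cdot\mathbf{B}) = \mathbf{A}\cdot\mathbf{B} + i\boldsymbol{\sigma}\cdot(\mathbf{A}\times\mathbf{B})$$
we obtain
$$(\boldsymbol{\sigma}\cdot\boldsymbol{\pi})(\boldsymbol{\sigma}\cdot\boldsymbol{\pi}) = \boldsymbol{\pi}\cdot\boldsymbol{\pi} + i\boldsymbol{\sigma}\cdot(\boldsymbol{\pi}\times\boldsymbol{\pi})$$
But
$$\boldsymbol{\pi}\times\boldsymbol{\pi} = \frac{iq\hbar}{c}\,\mathbf{B}$$
where B is the magnetic field. Thus,
$$(\boldsymbol{\sigma}\cdot\boldsymbol{\pi})(\boldsymbol{\sigma}\cdot\boldsymbol{\pi}) = \pi^2 - \frac{q\hbar}{c}\,\boldsymbol{\sigma}\cdot\mathbf{B}$$
Plugging this into (227) we obtain
$$\left[\frac{\left(\mathbf{p} - \frac{q}{c}\mathbf{A}\right)^2}{2m} - \frac{q\hbar}{2mc}\,\boldsymbol{\sigma}\cdot\mathbf{B}\right]\chi = E_S\,\chi \tag{228}$$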
This is precisely in the form of a NR Schrodinger equation for a spin-1/2 particle (as a
reminder, χ is a two-component spinor) in an EM field.
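The Pauli identity used in this derivation is easy to test numerically for ordinary (commuting) vectors. The sketch below does so for two arbitrarily chosen real vectors; the function names are hypothetical. Note that for c-number vectors A × A vanishes, whereas for the operator π the cross product π × π does not, which is exactly where the σ · B term came from.

```python
SIGMA = [
    [[0, 1], [1, 0]],        # sigma_x
    [[0, -1j], [1j, 0]],     # sigma_y
    [[1, 0], [0, -1]],       # sigma_z
]


def sigma_dot(v):
    """Return the 2x2 matrix sigma . v for a 3-vector of numbers v."""
    return [[sum(v[k] * SIGMA[k][i][j] for k in range(3)) for j in range(2)]
            for i in range(2)]


def matmul(x, y):
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def identity_residual(a, b):
    """Largest element of (s.a)(s.b) - [a.b I + i s.(a x b)] in absolute value."""
    lhs = matmul(sigma_dot(a), sigma_dot(b))
    sc = sigma_dot(cross(a, b))
    d = dot(a, b)
    rhs = [[(d if i == j else 0) + 1j * sc[i][j] for j in range(2)]
           for i in range(2)]
    return max(abs(lhs[i][j] - rhs[i][j]) for i in range(2) for j in range(2))


# Two arbitrary test vectors (illustrative values only).
print(identity_residual([0.3, -1.2, 2.0], [1.5, 0.4, -0.7]))
```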
Note in particular the second term in the square brackets on the left hand side of the
equation. It is precisely the interaction that arises for a spin-1/2 particle in a magnetic field
B, with a gyromagnetic ratio g = 2. As we saw in PHYS811 last semester, the gyromagnetic
ratio was a problem when we treated things nonrelativistically. To achieve agreement with
observation, we had to postulate there that the gyromagnetic ratio associated with intrinsic
spin was g = 2. But at that time, it was purely a postulate. Now we see where it comes
from. It arises from a nonrelativistic approximation to the full (and correct) Dirac equation
for spin-1/2 particles in an EM field. And it arises from the coupling of the dominant large
component of the Dirac wave function to the less-important (but nonetheless necessary)
small component.
Incorporation of an interaction potential
Next let’s discuss how we would incorporate an interaction potential for a spin-1/2 particle
into its relativistic Dirac equation. We do it precisely as you might expect. The hamiltonian
of the free spin-1/2 particle in an EM field is, as we’ve just seen,
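$$H_{\rm free} = c\boldsymbol{\alpha}\cdot\boldsymbol{\pi} + \beta mc^2$$
In the presence of a potential V it becomes
$$H = c\boldsymbol{\alpha}\cdot\boldsymbol{\pi} + V + \beta mc^2$$
and the Dirac equation then becomes
$$i\hbar\frac{\partial}{\partial t}|\Psi\rangle = \left[c\boldsymbol{\alpha}\cdot\boldsymbol{\pi} + V + \beta mc^2\right]|\Psi\rangle \tag{229}$$
Bear in mind, however, that the potential that enters (229) may involve the α and β (or
γ) matrices, i.e. it need not be a pure scalar potential.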
Application of the Dirac equation to the Hydrogen atom
Now let’s apply the Dirac formalism to the hydrogen atom. Though an essentially nonrelativistic
system (v/c << 1 for the electron) we should nevertheless be able to apply Dirac
theory to the problem and then take the appropriate NR limit to obtain a Schrodinger
description. We will see, in doing this, that some small, but interesting and observable rel-
ativistic effects creep in. This is for example the origin of the so-called hydrogen atom fine
structure that splits some of the degeneracies inherent in the simple NR Coulomb problem.
We will focus on the stationary eigenstates described by the time-independent Dirac
equation
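$$E|\Psi\rangle = \left[c\boldsymbol{\alpha}\cdot\mathbf{p} + V(r) + \beta mc^2\right]|\Psi\rangle \tag{230}$$
We will ultimately use the Coulomb potential
$$V(r) = -\frac{e^2}{r}$$
As earlier, we decompose
$$|\Psi\rangle = \begin{pmatrix} \chi \\ \phi \end{pmatrix}$$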
into its “large” and “small” components.
These two components satisfy the coupled equations
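$$\left(E - V - mc^2\right)\chi - c\boldsymbol{\sigma}\cdot\mathbf{p}\,\phi = 0 \tag{231}$$
and
$$\left(E - V + mc^2\right)\phi - c\boldsymbol{\sigma}\cdot\mathbf{p}\,\chi = 0 \tag{232}$$
From (232), we find that
$$\phi = \left(E - V + mc^2\right)^{-1} c\boldsymbol{\sigma}\cdot\mathbf{p}\,\chi \tag{233}$$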
Note that the order of the operators is important, as p in coordinate space is a differential
operator. We now plug (233) into (231), again respecting the operator order, and get
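$$\left(E - V - mc^2\right)\chi = c\boldsymbol{\sigma}\cdot\mathbf{p}\,\left(E - V + mc^2\right)^{-1} c\boldsymbol{\sigma}\cdot\mathbf{p}\,\chi \tag{234}$$
Writing, as before,
$$E = E_S + mc^2$$
where $E_S$ is the Schrodinger energy, gives
$$\left[E_S - V - c\boldsymbol{\sigma}\cdot\mathbf{p}\,\left(E_S - V + 2mc^2\right)^{-1} c\boldsymbol{\sigma}\cdot\mathbf{p}\right]\chi = 0 \tag{235}$$
Consider now the operator $(E_S - V + 2mc^2)^{-1}$ entering (235). In the NR limit, $E_S - V$
is very small compared to $2mc^2$ (of course, I mean this in expectation value). Thus, we can
carry out a series expansion in powers of $(E_S - V)/2mc^2$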
and it should converge fairly rapidly. So,
let’s expand!
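$$\frac{1}{E_S - V + 2mc^2} = \frac{1}{2mc^2}\left[1 + \frac{E_S - V}{2mc^2}\right]^{-1} = \frac{1}{2mc^2}\left[1 - \frac{E_S - V}{2mc^2} + \dots\right]$$
Plugging this into (235) and only keeping the terms shown explicitly gives
$$\left[E_S - V - \frac{c^2(\boldsymbol{\sigma}\cdot\mathbf{p})^2}{2mc^2} + \frac{c^2\,\boldsymbol{\sigma}\cdot\mathbf{p}\,(E_S - V)\,\boldsymbol{\sigma}\cdot\mathbf{p}}{4m^2c^4}\right]\chi = 0 \tag{236}$$
Let’s now see what happens when we keep only the first term of the expansion, i.e. when we
drop the last term in (236). Then (236) reduces to
$$\left[E_S - V - \frac{(\boldsymbol{\sigma}\cdot\mathbf{p})^2}{2m}\right]\chi = 0 \tag{237}$$
But
$$(\boldsymbol{\sigma}\cdot\mathbf{p})^2 = p^2$$
so that (237) becomes
$$\left[E_S - V - \frac{p^2}{2m}\right]\chi = 0 \tag{238}$$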
which we recognize as the ordinary Schrodinger equation for the Hydrogen atom.
Thus, we confirm that in lowest order we indeed recover the nonrelativistic Schrodinger
equation.
Now we’d like to study the effect of the last term in (236), which we
anticipate will give us corrections to the NR hamiltonian that are suppressed by
$(E_S - V)/2mc^2$ with respect to the usual one.
To put this “suppression” in a clearer light, let’s look for a moment at the lowest-order
equation (238), slightly reorganized,
$$(E_S - V)\,\chi = \frac{p^2}{2m}\,\chi$$
Since
$$p^2 \approx m^2v^2$$
it is clear that
$$E_S - V \approx \frac{1}{2}mv^2$$
which isn’t very surprising. Thus, our expansion parameter behaves qualitatively like
$$\frac{E_S - V}{2mc^2} \approx \frac{v^2}{4c^2}$$
As expected, our expansion in powers of (ES − V )/2mc2 is connected with an expansion in
powers of v2/c2, which we recognize as the appropriate expansion parameter in the NR limit
of a proper relativistic theory.
What we will now try to do is to obtain the lowest-order corrections to the Hydrogen
atom, those that are suppressed roughly by v2/c2 with respect to the main terms.
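To attach a number to this suppression, the sketch below evaluates v²/4c² under the standard Bohr-model estimate that the ground-state electron in hydrogen has v/c roughly equal to the fine-structure constant, about 1/137. That estimate is my own assumption and is not derived in these notes; the names are hypothetical.

```python
# Approximate fine-structure constant; used here as an estimate of v/c
# for the hydrogen ground state (an assumption, not taken from the notes).
ALPHA_FS = 1.0 / 137.036


def expansion_parameter(v_over_c):
    """The quantity v^2 / 4c^2 that controls the size of the corrections."""
    return v_over_c**2 / 4.0


print(expansion_parameter(ALPHA_FS))
```

The result is of order 10⁻⁵, which is why the relativistic corrections show up as fine structure rather than as a gross change to the spectrum.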
So, let’s now rewrite (236) in the form familiar from Schrodinger theory,
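$$E_S\,\chi = \left(\frac{p^2}{2m} + V - \frac{\boldsymbol{\sigma}\cdot\mathbf{p}\,(E_S - V)\,\boldsymbol{\sigma}\cdot\mathbf{p}}{4m^2c^2}\right)\chi \tag{239}$$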
where I have again used the fact that (σ · p)2 = p2.
At first glance, this looks like a mess. So, let’s try to clean it up a bit. Consider
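$$\begin{aligned} (E_S - V)\,\boldsymbol{\sigma}\cdot\mathbf{p}\,\chi &= \boldsymbol{\sigma}\cdot\mathbf{p}\,(E_S - V)\,\chi + \boldsymbol{\sigma}\cdot[E_S - V, \mathbf{p}]\,\chi \\ &= \boldsymbol{\sigma}\cdot\mathbf{p}\,(E_S - V)\,\chi + \boldsymbol{\sigma}\cdot[\mathbf{p}, V]\,\chi \end{aligned}$$
Finally, the contribution to the NR hamiltonian from this correction term is
$$\begin{aligned} H_{\rm rel} &= -\frac{\boldsymbol{\sigma}\cdot\mathbf{p}\,(E_S - V)\,\boldsymbol{\sigma}\cdot\mathbf{p}}{4m^2c^2} \\ &= -\frac{(\boldsymbol{\sigma}\cdot\mathbf{p})^2\,p^2}{8m^3c^2} - \frac{\boldsymbol{\sigma}\cdot\mathbf{p}\ \boldsymbol{\sigma}\cdot[\mathbf{p}, V]}{4m^2c^2} \\ &= -\frac{p^4}{8m^3c^2} - \frac{\boldsymbol{\sigma}\cdot\mathbf{p}\ \boldsymbol{\sigma}\cdot[\mathbf{p}, V]}{4m^2c^2} \end{aligned} \tag{240}$$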
where the second equation arose by keeping only the first term in ES − V , as is needed to
get things to the right order.
The first term in (240) does not depend on the potential, but only on the momentum
operator. It is just the relativistic correction to the kinetic energy operator. It is clearly
suppressed by $m^2v^2/4m^2c^2$ (or by $v^2/4c^2$) with respect to the ordinary kinetic energy term.
The second term is the relativistic correction to the potential $V = -e^2/r$. We shall now
analyze it in some detail.
We begin by rewriting this term in a somewhat more convenient form, by invoking the
identity
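$$(\boldsymbol{\sigma}\cdot\mathbf{A})(\boldsymbol{\sigma}\cdot\mathbf{B}) = \mathbf{A}\cdot\mathbf{B} + i\boldsymbol{\sigma}\cdot(\mathbf{A}\times\mathbf{B})$$
Thus,
$$\boldsymbol{\sigma}\cdot\mathbf{p}\ \boldsymbol{\sigma}\cdot[\mathbf{p}, V] = \mathbf{p}\cdot[\mathbf{p}, V] + i\boldsymbol{\sigma}\cdot\left(\mathbf{p}\times[\mathbf{p}, V]\right)$$
and
$$V_{\rm rel} = -\frac{i\boldsymbol{\sigma}\cdot\left(\mathbf{p}\times[\mathbf{p}, V]\right)}{4m^2c^2} - \frac{\mathbf{p}\cdot[\mathbf{p}, V]}{4m^2c^2} \tag{241}$$
Both terms involve, in addition to the potential V , two p operators in the numerator and
$4m^2c^2$ in the denominator. Thus, they are both suppressed relative to V by
$$p^2/4m^2c^2 \approx m^2v^2/4m^2c^2 \approx v^2/4c^2 ,$$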
again as expected.
Let’s now look at the first term in (241), which is straightforward to analyze. To begin,
we note that
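$$\begin{aligned} [\mathbf{p}, V]\,\chi &= -i\hbar\left[\nabla(V\chi) - V(\nabla\chi)\right] \\ &= -i\hbar\left[(\nabla V)\chi + V\nabla\chi - V\nabla\chi\right] \\ &= -i\hbar\,(\nabla V)\,\chi \end{aligned}$$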
Thus, the first term in Vrel is
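$$V^{(1)}_{\rm rel} = -\frac{i\boldsymbol{\sigma}\cdot\left(\mathbf{p}\times[\mathbf{p}, V]\right)}{4m^2c^2} = -\frac{\hbar\,\boldsymbol{\sigma}\cdot\left(\mathbf{p}\times\nabla\left(-\frac{e^2}{r}\right)\right)}{4m^2c^2}$$
But
$$\nabla\left(-\frac{e^2}{r}\right) = e^2\,\frac{\hat{\mathbf{r}}}{r^2} = e^2\,\frac{\mathbf{r}}{r^3}$$
so that
$$V^{(1)}_{\rm rel} = -\frac{\hbar e^2}{4m^2c^2r^3}\,\boldsymbol{\sigma}\cdot(\mathbf{p}\times\mathbf{r}) = \frac{\hbar e^2}{4m^2c^2r^3}\,\boldsymbol{\sigma}\cdot(\mathbf{r}\times\mathbf{p}) \tag{242}$$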
Note: The reason we were able to write p × r = −r × p is that the components of r on
which the momentum operator acts in p× r are always in a direction orthogonal to that of
p. As such, the noncommutativity of p and r doesn’t play a role when dealing with their
cross product.
Now, since r × p = L, we can rewrite (242) as
$$V^{(1)}_{\rm rel} = \frac{\hbar e^2}{4m^2c^2r^3}\,\boldsymbol{\sigma}\cdot\mathbf{L} = \frac{e^2}{2m^2c^2r^3}\,\mathbf{S}\cdot\mathbf{L} \tag{243}$$
The first of the two relativistic corrections to the hydrogen atom potential is thus the familiar
spin-orbit interaction. Since
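$$\mathbf{S}\cdot\mathbf{L} = \frac{1}{2}\left[(\mathbf{L}+\mathbf{S})\cdot(\mathbf{L}+\mathbf{S}) - \mathbf{L}\cdot\mathbf{L} - \mathbf{S}\cdot\mathbf{S}\right] = \frac{1}{2}\left[\mathbf{J}\cdot\mathbf{J} - \mathbf{L}\cdot\mathbf{L} - \mathbf{S}\cdot\mathbf{S}\right] ,$$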
this term splits the degeneracy between states with the same l values (and of course the
same s = 1/2 values) but different j values (i.e. j = l + 1/2 and j = l − 1/2). Note further
that these remarks obviously don’t apply to l = 0 s states, for which only j = l+1/2 exists.
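The splitting described here can be made explicit by evaluating S · L on states of good l, s and j, where it takes the value (ħ²/2)[j(j+1) − l(l+1) − s(s+1)]. The short sketch below tabulates this in units of ħ²; the function name is hypothetical.

```python
def s_dot_l(l, j, s=0.5):
    """Eigenvalue of S.L in units of hbar^2 for a state with given l, s, j."""
    return 0.5 * (j * (j + 1) - l * (l + 1) - s * (s + 1))


# For each l > 0 there are two j values, split by the spin-orbit term.
for l in (1, 2, 3):
    print(l, s_dot_l(l, l + 0.5), s_dot_l(l, l - 0.5))

# For l = 0 only j = 1/2 exists and the eigenvalue vanishes.
print(0, s_dot_l(0, 0.5))
```

For general l the two values are l/2 and −(l+1)/2, so the j = l + 1/2 and j = l − 1/2 levels are pushed in opposite directions.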
Now let’s look at the second of the relativistic corrections to the potential V in (241),
namely
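$$V^{(2)}_{\rm rel} = -\frac{\mathbf{p}\cdot[\mathbf{p}, V]}{4m^2c^2} \tag{244}$$
Consider the numerator
$$O = \mathbf{p}\cdot[\mathbf{p}, V] = \mathbf{p}\cdot\mathbf{p}\,V - \mathbf{p}\cdot V\mathbf{p}$$
Taking its hermitean adjoint, we find
$$O^\dagger = V\,\mathbf{p}\cdot\mathbf{p} - \mathbf{p}\cdot V\mathbf{p}$$
where I’ve used the fact that both V and p are hermitean. Comparing these two terms we
see that $O \neq O^\dagger$, so that $V^{(2)}_{\rm rel}$ is not hermitean. And that clearly isn’t very nice. As we have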
often seen, hermitean hamiltonians are a necessary ingredient of a quantum theory with a
meaningful probabilistic interpretation, i.e. one in which probability is conserved.
Does this mean that all is lost in our efforts to extract the NR limit of the Dirac theory
for the Hydrogen atom? No, it just means that we haven’t been sufficiently careful. The fact
that $V^{(2)}_{\rm rel}$, as written down, is not hermitean says that the part of the probability associated
with the “large component” χ is not conserved. But Dirac theory never said that the “large
component” of the probability must be conserved, just that the total probability must be
conserved. Mathematically
$$\int \Psi^\dagger\Psi\,d\mathbf{r} = \int\left(|\chi|^2 + |\phi|^2\right)d\mathbf{r} = \text{constant}$$
but $\int|\chi|^2\,d\mathbf{r}$ need not be.
On the other hand, the Schrodinger theory that emerges should itself have a conserved
probability. Put another way, if the Schrodinger wave function is $\chi_S$, then it should satisfy
$$\int|\chi_S|^2\,d\mathbf{r} = \text{constant}$$
What this tells us is that $\chi_S \neq \chi$. So, what is it?
To answer this, consider again
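$$\int\left(\chi^\dagger\chi + \phi^\dagger\phi\right)d\mathbf{r}$$
We showed that to lowest order in $v^2/c^2$,
$$\phi \approx \frac{\boldsymbol{\sigma}\cdot\mathbf{p}}{2mc}\,\chi$$
Thus, to lowest order in $v^2/c^2$,
$$\int\left(\chi^\dagger\chi + \chi^\dagger\,\frac{\boldsymbol{\sigma}\cdot\mathbf{p}}{2mc}\,\frac{\boldsymbol{\sigma}\cdot\mathbf{p}}{2mc}\,\chi\right)d\mathbf{r} = \text{constant}$$
or
$$\int\left(\chi^\dagger\chi + \chi^\dagger\,\frac{p^2}{4m^2c^2}\,\chi\right)d\mathbf{r} = \text{constant}$$
or
$$\int\chi^\dagger\left[1 + \frac{p^2}{4m^2c^2}\right]\chi\,d\mathbf{r} = \text{constant}$$
But
$$\left(1 + \frac{p^2}{8m^2c^2}\right)^2 = 1 + \frac{p^2}{4m^2c^2} + O\!\left(\frac{p^4}{64m^4c^4}\right)$$
Thus, to $O(v^2/c^2)$,
$$\int\chi^\dagger\left[1 + \frac{p^2}{8m^2c^2}\right]^2\chi\,d\mathbf{r} = \int\left[\left(1 + \frac{p^2}{8m^2c^2}\right)\chi\right]^\dagger\left[\left(1 + \frac{p^2}{8m^2c^2}\right)\chi\right]d\mathbf{r}$$
From all of this, we see that if we choose
$$\chi_S = \left(1 + \frac{p^2}{8m^2c^2}\right)\chi$$
then
$$\int|\chi_S|^2\,d\mathbf{r} = \text{constant}$$
through the desired order in $v^2/c^2$, just as we would like.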
Basically, what we have done here is to recognize that relativistic effects must be treated
consistently, both in the operators of the resulting NR theory and in the wave functions.
So, let’s now reconsider the NR Schrodinger-like equation (239)
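$$E_S\,\chi = \left(\frac{p^2}{2m} + V - \frac{\boldsymbol{\sigma}\cdot\mathbf{p}\,(E_S - V)\,\boldsymbol{\sigma}\cdot\mathbf{p}}{4m^2c^2}\right)\chi = H\,\chi$$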
that we wrote down earlier. As a reminder, H includes the various relativistic corrections
we have been discussing, including the non-hermitean one we didn’t especially like.
Now replace χ on both sides by
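$$\chi \to \left(1 + \frac{p^2}{8m^2c^2}\right)^{-1}\chi_S$$
Then
$$E_S\left(1 + \frac{p^2}{8m^2c^2}\right)^{-1}\chi_S = H\left(1 + \frac{p^2}{8m^2c^2}\right)^{-1}\chi_S$$
Premultiplying both sides by $\left(1 + \frac{p^2}{8m^2c^2}\right)$ gives
$$E_S\,\chi_S = \left(1 + \frac{p^2}{8m^2c^2}\right)H\left(1 + \frac{p^2}{8m^2c^2}\right)^{-1}\chi_S$$
Expanding the inverse operator and keeping only the lowest order terms (through $O(v^2/c^2)$)
gives
$$E_S\,\chi_S = \left(H + \frac{p^2}{8m^2c^2}H - H\frac{p^2}{8m^2c^2}\right)\chi_S = \left(H + \frac{1}{8m^2c^2}\left[p^2, H\right]\right)\chi_S$$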
Thus, the appropriate NR hamiltonian to use in conjunction with a properly normalizable
χS is
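$$H_S = H + \frac{1}{8m^2c^2}\left[p^2, V\right]$$
(only V contributes to the commutator at this order, since $p^2$ commutes with the kinetic
energy and the remaining terms in H are already of order $v^2/c^2$). I now claim that when we
add the new term
$$\frac{1}{8m^2c^2}\left[p^2, V\right]$$
to the non-hermitean term (244)
$$-\frac{\mathbf{p}\cdot[\mathbf{p}, V]}{4m^2c^2}$$
we obtained earlier, we end up with something that is inherently hermitean, as I will now
confirm.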
Let’s call the sum of the two terms the Darwin term and then look at it a bit.
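$$V_{\rm Darwin} = -\frac{\mathbf{p}\cdot[\mathbf{p}, V]}{4m^2c^2} + \frac{1}{8m^2c^2}\left[p^2, V\right] = \frac{1}{8m^2c^2}\left([\mathbf{p}\cdot\mathbf{p}, V] - 2\,\mathbf{p}\cdot[\mathbf{p}, V]\right)$$
But
$$[\mathbf{p}\cdot\mathbf{p}, V] = \mathbf{p}\cdot[\mathbf{p}, V] + [\mathbf{p}, V]\cdot\mathbf{p}$$
so that
$$[\mathbf{p}\cdot\mathbf{p}, V] - 2\,\mathbf{p}\cdot[\mathbf{p}, V] = [\mathbf{p}, V]\cdot\mathbf{p} - \mathbf{p}\cdot[\mathbf{p}, V]$$
Let’s now denote this operator as Q and obtain its hermitean adjoint, i.e.
$$Q = \mathbf{p}V\cdot\mathbf{p} - V\mathbf{p}\cdot\mathbf{p} - \mathbf{p}\cdot\mathbf{p}V + \mathbf{p}\cdot V\mathbf{p}$$
and
$$Q^\dagger = \mathbf{p}V\cdot\mathbf{p} - \mathbf{p}\cdot\mathbf{p}V - V\mathbf{p}\cdot\mathbf{p} + \mathbf{p}V\cdot\mathbf{p} = Q$$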
Thus, as I suggested earlier, the Darwin term is hermitean. The “wave function renormal-
ization” indeed cured the problem.
Working out the Darwin term in some detail leads to the result
$$V_{\rm Darwin} = \frac{\hbar^2}{8m^2c^2}\,\nabla^2V = \frac{e^2\hbar^2\pi}{2m^2c^2}\,\delta^3(\mathbf{r}) \tag{245}$$
i.e. the Darwin term only acts when the electron is at the origin. Since it is only for s states
that the electron can be at the origin, the Darwin term acts only on s states. This is in
contrast to the spin-orbit term which acted everywhere but on s states.
When we include in perturbation theory the various relativistic corrections just discussed
we get almost perfect agreement with the experimentally measured properties of Hydrogen.
The one discrepancy remaining involves an experimentally observed slight splitting between
the $2s_{1/2}$ and $2p_{1/2}$ levels, whereas the theory up to now gives them as degenerate. This
phenomenon, called the Lamb shift, requires full-blown Quantum Electrodynamics (QED)
for its explanation.
A return to the free-particle Dirac equation
Now that we’ve completed our discussion of the NR reduction of the Dirac equation,
I would like to return to the full relativistic equation and explore some important conse-
quences. I will focus on the free-particle problem, since many interesting conclusions will
already be evident there.
As a reminder, the free-particle Dirac equation is given by
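$$i\hbar\frac{\partial}{\partial t}|\Psi(t)\rangle = \left(c\boldsymbol{\alpha}\cdot\mathbf{p} + \beta mc^2\right)|\Psi(t)\rangle \tag{246}$$
We will again look at the stationary states, which as always are of the form
$$|\Psi(t)\rangle = |\Psi\rangle\, e^{-iEt/\hbar} \tag{247}$$
Plugging this into (246) leads to a relativistic eigenvalue equation
$$E|\Psi\rangle = H_D|\Psi\rangle = \left(c\boldsymbol{\alpha}\cdot\mathbf{p} + \beta mc^2\right)|\Psi\rangle \tag{248}$$
where for a free particle the Dirac hamiltonian is
$$H_D = c\boldsymbol{\alpha}\cdot\mathbf{p} + \beta mc^2 \tag{249}$$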
For a free particle, the Dirac hamiltonian obviously commutes with the three components
of momentum, p. As a consequence, it is possible to find simultaneous eigenstates of HD
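and p, which I’ll denote $|\Psi_{E,\mathbf{p}}\rangle$ and which satisfy
$$E|\Psi_{E,\mathbf{p}}\rangle = \left(c\boldsymbol{\alpha}\cdot\mathbf{p} + \beta mc^2\right)|\Psi_{E,\mathbf{p}}\rangle$$
As before, we decompose the four-component Dirac eigenvectors into their upper and
lower components,
$$|\Psi_{E,\mathbf{p}}\rangle = \begin{pmatrix} \chi_{E,\mathbf{p}} \\ \phi_{E,\mathbf{p}} \end{pmatrix}$$
This then leads to the set of coupled equations
$$\left(E - mc^2\right)\chi_{E,\mathbf{p}} - c\boldsymbol{\sigma}\cdot\mathbf{p}\,\phi_{E,\mathbf{p}} = 0 \tag{250}$$
and
$$-c\boldsymbol{\sigma}\cdot\mathbf{p}\,\chi_{E,\mathbf{p}} + \left(E + mc^2\right)\phi_{E,\mathbf{p}} = 0 \tag{251}$$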
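From (251) we see that
$$\phi_{E,\mathbf{p}} = \frac{c\boldsymbol{\sigma}\cdot\mathbf{p}}{E + mc^2}\,\chi_{E,\mathbf{p}}$$
which when plugged into (250) gives
$$\left(E - mc^2\right)\chi_{E,\mathbf{p}} - \frac{c^2(\boldsymbol{\sigma}\cdot\mathbf{p})^2}{E + mc^2}\,\chi_{E,\mathbf{p}} = 0$$
or, after multiplying through by $E + mc^2$,
$$\left[E^2 - m^2c^4 - c^2(\boldsymbol{\sigma}\cdot\mathbf{p})^2\right]\chi_{E,\mathbf{p}} = 0$$
Using as always that
$$(\boldsymbol{\sigma}\cdot\mathbf{p})^2 = p^2$$
gives
$$E^2 - m^2c^4 - c^2p^2 = 0$$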
As a reminder, p is now the momentum eigenvalue which is why we did not need to keep
the eigenvector χE,p in the equation.
We see therefore that for a free particle there is a connection between E and p2 (as of
course there is for a NR free particle). For a relativistic particle it is
$$E^2 - m^2c^4 - c^2p^2 = 0$$
or
$$E^2 = c^2p^2 + m^2c^4 \tag{252}$$
as expected.
For a given three-momentum p, we see that there are two energies that satisfy (252),
namely
$$E = +c\sqrt{p^2 + m^2c^2} \tag{253}$$
and
$$E = -c\sqrt{p^2 + m^2c^2} \tag{254}$$
Thus, for a given free-particle momentum p, the Dirac equation admits two possible
solutions, one with positive energy and one with negative energy. The positive energy
solution is fine, but what in the world does the negative energy solution mean?
Solutions of the free-particle Dirac equation
Before addressing this question, let me first obtain the eigensolutions to the free-particle
Dirac equation, both the positive and negative energy solutions. For notational purposes,
let’s denote the positive energy solution by E = Ep and the negative energy solution by
E = −Ep, where
$$E_p = c\sqrt{p^2 + m^2c^2}$$
The simplest positive-energy solutions are obtained by assuming that the momentum
vector p points along the z-direction. Then α · p = αzp and (248) reduces to the four-
dimensional matrix eigenvalue equation
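$$\begin{pmatrix} mc^2 & 0 & cp & 0 \\ 0 & mc^2 & 0 & -cp \\ cp & 0 & -mc^2 & 0 \\ 0 & -cp & 0 & -mc^2 \end{pmatrix} \begin{pmatrix} u_1 \\ u_2 \\ u_3 \\ u_4 \end{pmatrix} = E_p \begin{pmatrix} u_1 \\ u_2 \\ u_3 \\ u_4 \end{pmatrix} \tag{255}$$
where I have introduced for the four-component eigenvector the specific notation
$$|\Psi\rangle = \begin{pmatrix} u_1 \\ u_2 \\ u_3 \\ u_4 \end{pmatrix}$$
This reduces to four coupled equations
$$\begin{aligned} mc^2u_1 + cp\,u_3 &= E_pu_1 \\ cp\,u_1 - mc^2u_3 &= E_pu_3 \\ mc^2u_2 - cp\,u_4 &= E_pu_2 \\ -cp\,u_2 - mc^2u_4 &= E_pu_4 \end{aligned}$$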
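There are two linearly independent and orthogonal (albeit non-normalized) solutions to
this set of equations,
$$u^{(R)} \propto \begin{pmatrix} 1 \\ 0 \\ \dfrac{cp}{E_p + mc^2} \\ 0 \end{pmatrix}$$
and
$$u^{(L)} \propto \begin{pmatrix} 0 \\ 1 \\ 0 \\ -\dfrac{cp}{E_p + mc^2} \end{pmatrix}$$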
I have introduced the superscripts R and L to denote the helicity of these two eigenvectors.
Both are eigenvectors of the helicity operator, introduced last semester, with eigenvalues +1
and −1 respectively. As a reminder, the helicity is the spin projection along the momentum
vector.
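In the case of the negative energy solutions, the resulting eigenvectors associated with
energy $-E_p$ are
$$u^{(R)} \propto \begin{pmatrix} -\dfrac{cp}{E_p + mc^2} \\ 0 \\ 1 \\ 0 \end{pmatrix}$$
and
$$u^{(L)} \propto \begin{pmatrix} 0 \\ \dfrac{cp}{E_p + mc^2} \\ 0 \\ 1 \end{pmatrix}$$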
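As a sanity check, one can plug the four spinors above back into the matrix of (255) and confirm that they are eigenvectors with energies $\pm E_p$. What follows is a minimal sketch in plain Python, not part of the original derivation; the function names and the labels "R+", "L+", "R-", "L-" are hypothetical, and the default units m = c = 1 are an assumption made for convenience.

```python
import math


def dirac_matrix(p, m=1.0, c=1.0):
    """Free-particle Dirac Hamiltonian for momentum p along z, as in (255)."""
    mc2, cp = m * c * c, c * p
    return [[mc2, 0.0, cp, 0.0],
            [0.0, mc2, 0.0, -cp],
            [cp, 0.0, -mc2, 0.0],
            [0.0, -cp, 0.0, -mc2]]


def apply(h, u):
    """Apply the 4x4 matrix h to the 4-component vector u."""
    return [sum(h[i][j] * u[j] for j in range(4)) for i in range(4)]


def spinors(p, m=1.0, c=1.0):
    """Return {label: (energy, unnormalized spinor)} for the four solutions."""
    mc2 = m * c * c
    e_p = c * math.sqrt(p * p + m * m * c * c)
    k = c * p / (e_p + mc2)
    return {
        "R+": (e_p, [1.0, 0.0, k, 0.0]),
        "L+": (e_p, [0.0, 1.0, 0.0, -k]),
        "R-": (-e_p, [-k, 0.0, 1.0, 0.0]),
        "L-": (-e_p, [0.0, k, 0.0, 1.0]),
    }


def max_residual(p, m=1.0, c=1.0):
    """Largest component of H u - E u over all four solutions."""
    h = dirac_matrix(p, m, c)
    worst = 0.0
    for energy, u in spinors(p, m, c).values():
        hu = apply(h, u)
        worst = max(worst, max(abs(hu[i] - energy * u[i]) for i in range(4)))
    return worst
```

Calling max_residual(p) for any momentum should return a number at the level of floating-point rounding, confirming that the two positive-energy and the two negative-energy spinors all solve the eigenvalue problem.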
The negative energy solutions
Now let’s discuss a bit the physical significance of the negative energy solutions. First,
let’s consider the positive-energy free-particle spectrum. From (253), we see that the energies begin
at a threshold of $mc^2$, the rest energy of the particle, and then extend upwards in a
continuum to +∞. Likewise from (254) we see that the negative energy solutions begin at
$-mc^2$ and then extend downwards in a continuum to −∞. This is represented schematically
in fig. 13, albeit with the continuum nature of the solutions above and below the two
thresholds not made evident.
Dirac postulated that the “vacuum” state of nature (i.e. the state with no particles)
corresponds to all of the negative energy states being filled. This set of filled negative-
energy states is called the Dirac Sea. And, again according to Dirac, this filled sea of
negative-energy particles is not observable.
[Figure: energy axis with levels marked at $-mc^2$, 0, and $+mc^2$.]
FIG. 13: Schematic illustration of the spectrum of a relativistic free particle.
Sounds radical! Sounds cute! So, let’s now see what such a postulate buys us.
Let’s first ask what happens if you add a particle (we are of course thinking of electrons,
but any spin-1/2 particle will do) to the vacuum.
As we have discussed, particles of spin-1/2 are fermions and satisfy the Pauli exclusion
principle. Thus, no two such particles can occupy the same quantum state. Thus, when you
add a particle to the vacuum it cannot go into any of the negative-energy states, since all
are already occupied. Thus, it must go into one of the positive-energy states. And that is
the usual picture of an electron with positive energy.
Does this mean that the negative-energy states are irrelevant? No! To see why not, let’s
now ask what would happen if we were to hit the system in its vacuum state with something,
say a photon, and in doing this we transferred to the system an energy in excess of $2mc^2$.
This is enough energy to lift one of the negative-energy electrons within the Dirac Sea across
the $2mc^2$ energy gap into a positive energy state, as shown schematically in figure 14.
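[Figure: energy axis with levels marked at $-mc^2$, 0, and $+mc^2$, the gap $2mc^2$ indicated, and markers o and x for the excitation.]
FIG. 14: Schematic illustration of a particle-hole excitation of the Dirac Sea, producing a particle
and an antiparticle.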
The net effect is that we now have an electron in a positive-energy state and a hole in
one of the negative-energy states within the Dirac Sea.
Clearly, the electron has positive energy and charge −e. What about the hole in the
Dirac Sea?
When you remove something with charge −e from something with no charge, you leave
behind something with charge +e relative to the (unobservable) filled Dirac Sea. Likewise
removing something with energy −E leaves behind something with energy +E, again relative
to the filled Dirac Sea. Thus, the hole has energy +E (positive) and charge +e (opposite to
that of the electron). The bottom line is that the hole in the Dirac Sea behaves for all intents
and purposes as a particle with the same mass as the electron, but with opposite charge.
Dirac called this strange beast a positron (for positive electron). And, lo and behold, it was
discovered experimentally only a few years after this bold (to say the least) hypothesis.
In our discussion last semester of electromagnetic interactions with quantum systems (e.g.
atoms), we saw that whenever there can be a quantum transition induced upwards through
the absorption of energy, there can also be a spontaneous transition downwards with the
emission of energy in the form of photons. And there’s no reason not to expect the same
thing to happen here.
So, imagine an electron and a positron coming together. The electron, being in a positive-
energy state, sees the corresponding hole in a negative-energy state in the Dirac Sea, i.e.
the positron. Since there is a hole available at a lower energy than the particle, the electron
can fall into that hole. In doing so, it will of course emit a photon to carry away its loss of
energy. Obviously, this loss of energy must be at least $2mc^2$, which is the minimum energy
gap between the positive and negative energy states. This I claim is completely analogous
to what happens when a quantum system in an excited state decays to a lower state by
emitting a photon.
We see therefore that the simple Dirac picture not only predicts the positron, but also
predicts electron-positron annihilation. And as you all know this too is seen experimentally.
But what about bosons?
The Dirac equation, and all that followed from it, applied to fermions only. When dealing
with spinless bosons, the Dirac equation doesn’t apply. There we discussed the fact that a
natural equation of motion was the Klein-Gordon equation, which we gave earlier. Rewriting
it here, it reads
$$\left[\frac{1}{c^2}\frac{\partial^2}{\partial t^2} - \nabla^2 + \left(\frac{mc}{\hbar}\right)^2\right]\Psi(\mathbf{r},t) = 0 \tag{256}$$
It is straightforward to convince yourselves that this equation, like the Dirac equation,
has negative-energy solutions as well as positive-energy solutions. So, what do the negative
energy solutions mean here, i.e. for spinless bosons satisfying the Klein-Gordon equation?
Obviously, Dirac’s interpretation of the negative-energy solutions as being a filled (and
unobservable) sea of particles cannot work here. And the reason is simple. Dirac’s interpretation
depended critically on the fact that for fermions the Pauli principle applies and no two
fermions can occupy the same single-particle level. This is what guaranteed the stability of the Dirac
sea, i.e. the fact that positive-energy particles cannot fall into the negative-energy states.
But bosons do not satisfy a Pauli principle, and thus there is no way for this interpretation
to apply to the negative-energy solutions of the Klein-Gordon equation. There is no way to
have a stable sea of particles with all negative-energy states filled.
[Figure: space-time diagram (axes x and t) showing the points c and d at times $t_c$ and $t_d$.]
FIG. 15: Schematic illustration of a negative-energy particle of charge e moving backwards in time.
How do we get around this? To do this, we instead use an idea due to Feynman, which
indeed applies not only to fermions but to bosons as well, as I will now briefly discuss.
Feynman’s interpretation of the negative-energy particles, whether fermions or bosons, is
that negative-energy particles can only move backwards in time.
This is schematically represented in figure 15 in which we consider a negative-energy
particle (perhaps an electron with negative charge −e) created at a space-time point c
which then travels backwards in time to space-time point d where it is destroyed.
What do we, people who move forward in time and see space-time in equal-time slices,
think is happening?
1. $t < t_d$. As far as we can tell there is nothing anywhere.
2. $t = t_d$. At this time, a negative energy −|E| and negative charge −e are destroyed, so
that the world energy goes up by |E| and the charge goes up by e. It would seem to
us, therefore, that an antiparticle with this charge and energy is born here.
3. $t = t_c$. At this time, negative energy is created and charge −e is created. Thus, from
our perspective, the antiparticle is wiped out at this time.
4. $t > t_c$. Once again, there does not seem to us to be anything anywhere.
Note, however, that nowhere in this discussion did it really matter whether the
negative-energy particle that was created at $t_c$ and destroyed at $t_d$ was a fermion or a boson.
A nice, albeit brief, description of how we can accommodate such particles moving back-
wards in time in our quantum formalism is provided by Shankar towards the end of Chapter
20. I would now like to review the key steps for you.
The starting point of my discussion is a return to non-relativistic Quantum Mechanics
and to focus on the so-called propagator. As a reminder the propagator gives the amplitude
for propagating from a point r′ at time t′ to a point r at a subsequent time t. In coordinate
representation, it can be written as
$$U_S(\mathbf{r},t;\mathbf{r}',t') = \sum_n \Psi_n(\mathbf{r})\,\Psi_n^*(\mathbf{r}')\,e^{-iE_n(t-t')}$$
in terms of the complete set of eigenstates $\Psi_n$ of the Schrodinger hamiltonian H, i.e.
$$H\Psi_n(\mathbf{r}) = E_n\Psi_n(\mathbf{r})$$
Given this $U_S$ and Ψ(t′) at some given time t′ we can get Ψ(t) at a later time t.
But even though we use $U_S$ to propagate forward in time, i.e. to calculate how the wave
function evolves to later times, it can also propagate backwards in time, since $U_S \neq 0$ for
t < t′.
To avoid this possibility, it is useful to introduce a propagator that does not allow prop-
agation back in time,
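$$G_S(\mathbf{r},t;\mathbf{r}',t') = \theta(t-t')\,U_S(\mathbf{r},t;\mathbf{r}',t')$$
in terms of the usual theta function, which is by definition zero for t < t′ and 1 for t > t′.
This new propagator, which only applies for t > t′, satisfies
$$\left(i\frac{\partial}{\partial t} - H\right)G_S = \left[i\frac{\partial}{\partial t}\theta(t-t')\right]\sum_n \Psi_n(\mathbf{r})\,\Psi_n^*(\mathbf{r}')\,e^{-iE_n(t-t')} = i\,\delta(t-t')\,\delta^3(\mathbf{r}-\mathbf{r}') = i\,\delta^4(x-x')$$
where I use the notation x to refer to the 4-vector (t, r).
Note, to derive this I made use of the fact that
$$\frac{\partial}{\partial t}\theta(t-t') = \delta(t-t')$$
and also that $U_S$ satisfies the Schrodinger equation
$$\left(i\frac{\partial}{\partial t} - H\right)U_S = 0$$
so that only the derivative of the theta function survives. The delta function $\delta(t-t')$ then sets t = t′ in the sum, which reduces to $\delta^3(\mathbf{r}-\mathbf{r}')$.
As a reminder, we need the complete set of eigenstates $\Psi_n$ to recreate the 3D delta function.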
So, that is what the Schrodinger propagator looks like when restricted to moving forward
in time.
Analogous treatment of the free-particle Dirac propagator would show that it too satisfies
an analogous equation
$$\left(i\frac{\partial}{\partial t} - H^0\right)G^0_D(x,x') = i\,\delta^4(x-x')$$
where H0 is the free-particle Dirac Hamiltonian.
Now, however, when we expand it in terms of the eigenfunctions of the free-particle Dirac
hamiltonian, we must make sure to include the complete set, namely those with positive and
negative energies. Schematically, we write this as
$$G^0_D(x,x') = \theta(t-t')\left(\sum_{n_+} + \sum_{n_-}\right)$$
where $n_+$ refers to those at positive energies and $n_-$ to those at negative energies. All are
needed to recreate the full $\delta^4$ on the right hand side of the equation.
While this is a fine $G^0_D$, it doesn’t satisfy our needs, as it contains negative-energy solutions
propagating forward in time.
It is at this point that Feynman suggested the needed trick. The above equation is not
unique. We can add or subtract from it any solution to the free-particle Dirac equation.
But in doing so, we must subtract it for all times. Thus, he suggested that we subtract all
negative-energy solutions at all times.
This gives us a new, but equivalent propagator,
$$G^0_F = \theta(t-t')\sum_{n_+} - \theta(t'-t)\sum_{n_-}$$
which is called the Feynman propagator.
Let’s now assume we had a state Ψi(t′) composed only of positive-energy states. This
propagator will propagate it forward in time, since it is orthogonal to all the negative-energy
states. But what if we had a state built out of negative-energy components only? Since it
is orthogonal to all positive energy states, it will get backwards propagated in time through
the second term that goes as θ(t′ − t).
If now we are in some external potential, the exact propagation of a particle in an arbitrary
state will be given schematically by
$$\Psi_f(t) = G^0_F(t,t')\,\Psi_i(t') + \sum_{t''} G^0_F(t,t'')\,V(t'')\,G^0_F(t'',t')\,\Psi_i(t') + \ldots$$
in terms of a series of multiple scattering diagrams. This is analogous to what we wrote
down earlier for the propagation of a state in time-dependent perturbation theory, but there
we used the ordinary Schrodinger propagator. Here we use the relativistic Feynman propa-
gator, involving forward propagation of positive-energy particles and backward propagation
of negative-energy particles.
A pictorial flavor of the competing types of processes that could now occur, once we
include the possibility of such negative-energy particles moving backwards in time is given
in figure 16. Both are second-order processes, i.e. processes that involve two scattering
events.
Figure 16a represents a typical two-step process, in which a particle with positive energy
(let’s for definiteness call it an electron, although it need not be) gets scattered forward in
time twice. The two scatterings take place at the space-time points 1 and 2, respectively.
Figure 16b represents another two-step process, leading from the same initial state i to
the same final state f, i.e. both have the same x and t (or at least they would have, could
I have drawn them better) and both start and end with the same positive energy particle.
But now the scattering at point 1 kicks the particle backward in time and then at point
2 forward in time. As we move forward in time, we first see the electron, then at time 2
(which is before 1) we see two electrons and a positron (i.e. we have created an $e^+e^-$ pair),
and then at time 1 we again have only an electron. At the end, i.e. at space time point f
we have exactly the same final state (an electron) as we did in process (a).
[Figure: two space-time diagrams (axes x and t), panels (a) and (b), each running from the initial point i to the final point f through the scattering points 1 and 2.]
FIG. 16: Two second-order processes that can take place when one includes the possibility of
negative-energy particles moving backwards in time.
Clearly, as we go to higher and higher order in the interaction, the electron can wiggle
and jiggle any number of times, creating lots of intermediate states with any number of $e^+e^-$
pairs.
So, even though we started out with a one-particle equation, particle production creeps
in through the negative-energy solutions (or the solutions that flow backwards in time). In
the case of fermions, this can either be viewed as resulting from excitations of the infinite
Dirac sea or because the single electron is allowed to go back and forth in time.
While we haven’t yet derived an appropriate propagator for bosons, we would imagine
that there too we would get a propagator in which negative-energy particles propagate
backward in time. This will then enable the development of a theory with creation of
particle-antiparticle pairs in boson systems as well.
As I said earlier, the framework for implementing these ideas in a consistent fashion is
relativistic quantum field theory. But that is for another course and another time.
Application of the Path Integral Formalism
At this point, I will begin the last topic in the course, the application of Feynman’s
Path Integral formalism. This is a follow-up to the preliminary discussion of Feynman’s
Path Integral formalism that took place in PHYS610 and which derived from Chapter 8 of
Shankar. The more detailed discussion on which we now embark can be found in Chapter
21 of Shankar, which you should now start reading.
Let me briefly summarize what was said in our earlier discussion.
At that time, we focused on the free-particle propagator, which as a reminder is the coor-
dinate space representative of the free-particle time evolution operator. In one dimension we
can readily derive this propagator using the standard Schrodinger or Hamiltonian approach
and would obtain
$$U(x,t;x',t') = \langle x|U(t,t')|x'\rangle = \langle x|U(t-t')|x'\rangle = \sqrt{\frac{m}{2\pi i\hbar(t-t')}}\;\exp\left[-\frac{m(x-x')^2}{2i\hbar(t-t')}\right] \tag{257}$$
We then showed that the propagator gives the amplitude for a system propagating from
one point in space and time to another point in space and later in time.
We then postulated following Feynman that we could alternatively obtain the propagator
connecting two points in space-time using the Lagrangian formalism by summing over all
possible paths in space-time that connect them. Feynman furthermore gave a procedure for
implementing this:
• To obtain the propagator between two points in space time (denoted 1 and 2), we need
to determine the sum of an infinity of partial amplitudes UΓ(2, 1), each one associated
with a possible space-time path Γ from (r1, t1) to (r2, t2).
• The partial amplitude associated with the path Γ is determined in the following way:
1. We first determine the classical action SΓ along the path Γ from the classical
Lagrangian L according to the usual formula
$$S_\Gamma = \int_\Gamma L(\mathbf{r},\dot{\mathbf{r}},t)\,dt$$
2. We then determine the partial amplitude $U_\Gamma$ associated with this path as
$$U_\Gamma(2,1) = N\,e^{\frac{i}{\hbar}S_\Gamma}$$
where N is a normalization constant which must be, and can be, evaluated.
When we implemented this procedure for a free particle in one dimension, we recovered
precisely the free-particle propagator obtained using the Schrodinger formalism and given
in (257).
I would now like to follow the reverse strategy. Rather than postulating that the propa-
gator can be obtained via a path integral and then proving that the postulate gives the right
results, I would instead like to show you explicitly that we can start with the hamiltonian
formalism and derive the propagator as a path integral. In doing so, we will indeed see sev-
eral key points emerging. One is that there are in fact several possible path integrals that
can be derived. In all we will have to make use of the resolution of the identity operator to
derive the path integral. We will then see that the existence of several possible path integral
formalisms is related to the fact that there are several possible resolutions of the identity
operator that we can use. Finally, once we have done this, we will discuss how one can use
the path integral formalism in its many possible manifestations to treat a key problem in
contemporary many-body quantum physics. I will not have time to treat all that are in
Shankar’s Chapter 21, but will limit my discussion to just one. Furthermore, as emphasized
by Shankar in his presentation, we will not give a detailed development of this application,
but hopefully enough that one can then go to the literature and learn more, now that we
are such proficient Quantum Mechanicians.
Derivation of the Path Integral
So now let’s turn to the derivation of the Path Integral representation for a one-
dimensional propagator governed by a time-independent hamiltonian
$$H = \frac{P^2}{2m} + V(X) \tag{258}$$
As we remember, the propagator is defined as the coordinate-space representation of the
time evolution operator, or
$$U(x,t;x',t'=0) = \langle x|\exp\left(-\frac{i}{\hbar}Ht\right)|x'\rangle \tag{259}$$
Note that we are considering propagation from time t′ = 0 to a subsequent time t.
Now let’s see how we can demonstrate our earlier conjecture that this propagator can be
written as a sum over all possible paths between the two space-time points (x′, 0) and (x, t).
The first point to note is that the operator entering in (259) can be expressed as a product
of N operators
$$\exp\left(-\frac{i}{\hbar}Ht\right) = \left[\exp\left(-\frac{i}{\hbar}H\frac{t}{N}\right)\right]^N \tag{260}$$
for any integer N . This follows from the Baker-Hausdorff formula that tells how a product
of exponential operators can be combined,
$$e^A\,e^B = e^{A+B+\frac{1}{2}[A,B]+\ldots} \tag{261}$$
where the ... refers to all higher commutators that enter. Note of course that [H,H] = 0,
which is why we arrive at a simple product of the N operators.
So, let’s now write
$$\epsilon = \frac{t}{N} \tag{262}$$
and look at this in the limit that N → ∞.
We again use the Baker-Hausdorff formula to write
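$$\exp\left[-\frac{i\epsilon}{\hbar}\left(\frac{P^2}{2m} + V(X)\right)\right] \approx \exp\left(-\frac{i\epsilon}{2m\hbar}P^2\right)\exp\left(-\frac{i\epsilon}{\hbar}V(X)\right) \tag{263}$$
The reason this follows approximately is that the correction terms all involve commutators
of the two exponents and are therefore of order $\epsilon^2$ or higher, so they vanish faster than the terms we keep as ϵ → 0.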
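The size of the error made in a splitting like (263) can be illustrated on a toy problem. The sketch below is only an illustration under stated assumptions: it replaces $P^2/2m$ and $V(X)$ by two non-commuting 2×2 matrices (the Pauli matrices $\sigma_x$ and $\sigma_z$, my own stand-ins), sets ħ = 1, and uses a hand-written Taylor-series matrix exponential. All function names are hypothetical.

```python
def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def add(a, b):
    return [[a[i][j] + b[i][j] for j in range(2)] for i in range(2)]


def scale(a, s):
    return [[s * a[i][j] for j in range(2)] for i in range(2)]


def expm(a, terms=40):
    """Matrix exponential from its Taylor series (adequate for small 2x2 matrices)."""
    identity = [[1.0, 0.0], [0.0, 1.0]]
    result, term = identity, identity
    for k in range(1, terms):
        term = scale(matmul(term, a), 1.0 / k)
        result = add(result, term)
    return result


def max_abs_diff(a, b):
    return max(abs(a[i][j] - b[i][j]) for i in range(2) for j in range(2))


# Toy stand-ins for the two non-commuting pieces of H (assumption, hbar = 1).
KINETIC = [[0.0, 1.0], [1.0, 0.0]]      # plays the role of P^2/2m
POTENTIAL = [[1.0, 0.0], [0.0, -1.0]]   # plays the role of V(X)


def split_step(eps):
    """exp(-i eps K) exp(-i eps V), the right-hand side of the splitting."""
    return matmul(expm(scale(KINETIC, -1j * eps)),
                  expm(scale(POTENTIAL, -1j * eps)))


def single_step_error(eps):
    """Error of one split step relative to exp(-i eps (K + V))."""
    exact = expm(scale(add(KINETIC, POTENTIAL), -1j * eps))
    return max_abs_diff(exact, split_step(eps))


def full_product_error(t, n):
    """Error of the N-fold product of split steps relative to exp(-i t (K + V))."""
    step = split_step(t / n)
    product = [[1.0, 0.0], [0.0, 1.0]]
    for _ in range(n):
        product = matmul(product, step)
    exact = expm(scale(add(KINETIC, POTENTIAL), -1j * t))
    return max_abs_diff(exact, product)
```

Halving ϵ should reduce single_step_error by roughly a factor of four, the signature of an order-$\epsilon^2$ error, while full_product_error(t, n) shrinks as the number of slices n grows. That is the sense in which the product of N split factors reproduces the full evolution operator as N → ∞.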
Thus, what we will have to compute to get the propagator is the matrix element
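$$\langle x|\, e^{-\frac{i\epsilon}{2m\hbar}P^2}\, e^{-\frac{i\epsilon}{\hbar}V(X)}\, e^{-\frac{i\epsilon}{2m\hbar}P^2}\, e^{-\frac{i\epsilon}{\hbar}V(X)} \cdots |x'\rangle \tag{264}$$
with the operator product $e^{-\frac{i\epsilon}{2m\hbar}P^2}\, e^{-\frac{i\epsilon}{\hbar}V(X)}$ entering N times.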
Now we insert the resolution of the identity operator between every pair of operators that
enters. In our current development, we will use the resolution of the identity operator in
coordinate representation, namely
$$I = \int_{-\infty}^{+\infty} dx\; |x\rangle\langle x| \tag{265}$$
To see how this plays out, we will focus on the case of N = 3, and then subsequently
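generalize. Following Shankar, I will rename x and x′ by $x_3$ and $x_0$, respectively, whereby
the matrix element becomes
$$\begin{aligned} U(x_3,x_0,t) = \int dx_1\,dx_2\; &\langle x_3|\, e^{-\frac{i\epsilon}{2m\hbar}P^2}\, e^{-\frac{i\epsilon}{\hbar}V(X)}\,|x_2\rangle \\ \times\, &\langle x_2|\, e^{-\frac{i\epsilon}{2m\hbar}P^2}\, e^{-\frac{i\epsilon}{\hbar}V(X)}\,|x_1\rangle \\ \times\, &\langle x_1|\, e^{-\frac{i\epsilon}{2m\hbar}P^2}\, e^{-\frac{i\epsilon}{\hbar}V(X)}\,|x_0\rangle \end{aligned} \tag{266}$$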
Now let’s look at the generic matrix element
$$\langle x_n|\, e^{-\frac{i\epsilon}{2m\hbar}P^2}\, e^{-\frac{i\epsilon}{\hbar}V(X)}\,|x_{n-1}\rangle \tag{267}$$
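When the operator V(X) acts to the right on $|x_{n-1}\rangle$, the operator X gets replaced by
its eigenvalue $x_{n-1}$. Thus,
$$\langle x_n|\, e^{-\frac{i\epsilon}{2m\hbar}P^2}\, e^{-\frac{i\epsilon}{\hbar}V(X)}\,|x_{n-1}\rangle = \langle x_n|\, e^{-\frac{i\epsilon}{2m\hbar}P^2}\,|x_{n-1}\rangle\; e^{-\frac{i\epsilon}{\hbar}V(x_{n-1})} \tag{268}$$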
Now what about the remaining matrix element $\langle x_n|\, e^{-\frac{i\epsilon}{2m\hbar}P^2}\,|x_{n-1}\rangle$? This
is nothing more than the free-particle propagator for propagating from $x_{n-1}$ to $x_n$ over a
time period ϵ. We worked this out in PHYS610 and the result can be found on page 153 of
Shankar and in eq. (257) of these notes. It is simply
$$\langle x_n|\, e^{-\frac{i\epsilon}{2m\hbar}P^2}\,|x_{n-1}\rangle = \left(\frac{m}{2\pi\hbar i\epsilon}\right)^{1/2} e^{im(x_n - x_{n-1})^2/2\hbar\epsilon} \tag{269}$$
Putting this all together, we find that
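$$\langle x_n|\, e^{-\frac{i\epsilon}{2m\hbar}P^2}\, e^{-\frac{i\epsilon}{\hbar}V(X)}\,|x_{n-1}\rangle = \left(\frac{m}{2\pi\hbar i\epsilon}\right)^{1/2} e^{im(x_n - x_{n-1})^2/2\hbar\epsilon}\; e^{-\frac{i\epsilon}{\hbar}V(x_{n-1})} \tag{270}$$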
Now when we combine the three matrix elements that enter (266) we obtain
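$$\begin{aligned} U(x_3,x_0,t) &= \int dx_1\,dx_2\, \left(\frac{m}{2\pi\hbar i\epsilon}\right)^{1/2} e^{im(x_3-x_2)^2/2\hbar\epsilon}\, e^{-\frac{i\epsilon}{\hbar}V(x_2)} \left(\frac{m}{2\pi\hbar i\epsilon}\right)^{1/2} e^{im(x_2-x_1)^2/2\hbar\epsilon}\, e^{-\frac{i\epsilon}{\hbar}V(x_1)} \left(\frac{m}{2\pi\hbar i\epsilon}\right)^{1/2} e^{im(x_1-x_0)^2/2\hbar\epsilon}\, e^{-\frac{i\epsilon}{\hbar}V(x_0)} \\ &= \left(\frac{m}{2\pi\hbar i\epsilon}\right)^{1/2} \left[\int \prod_{n=1}^{2} \left(\frac{m}{2\pi\hbar i\epsilon}\right)^{1/2} dx_n\right] \exp\left[\sum_{n=1}^{3} \left(\frac{im(x_n - x_{n-1})^2}{2\hbar\epsilon} - \frac{i\epsilon}{\hbar}V(x_{n-1})\right)\right] \end{aligned} \tag{271}$$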
The generalization to arbitrary N is straightforward. All we need do is replace
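$$\prod_{n=1}^{2} \rightarrow \prod_{n=1}^{N-1} \quad\text{and}\quad \sum_{n=1}^{3} \rightarrow \sum_{n=1}^{N}$$
whence
$$U(x_N,x_0,t) = \left(\frac{m}{2\pi\hbar i\epsilon}\right)^{1/2} \left[\int \prod_{n=1}^{N-1} \left(\frac{m}{2\pi\hbar i\epsilon}\right)^{1/2} dx_n\right] \exp\left[\sum_{n=1}^{N} \left(\frac{im(x_n - x_{n-1})^2}{2\hbar\epsilon} - \frac{i\epsilon}{\hbar}V(x_{n-1})\right)\right] \tag{272}$$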
Now consider the exponential
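$$\exp\left[\sum_{n=1}^{N} \left(\frac{im(x_n - x_{n-1})^2}{2\hbar\epsilon} - \frac{i\epsilon}{\hbar}V(x_{n-1})\right)\right]$$
that appears in (272). It can be straightforwardly rewritten as
$$\exp\left[\sum_{n=1}^{N} \left(\frac{im(x_n - x_{n-1})^2}{2\hbar\epsilon} - \frac{i\epsilon}{\hbar}V(x_{n-1})\right)\right] = \exp\left[\frac{i}{\hbar}\sum_{n=1}^{N} \left(\frac{m(x_n - x_{n-1})^2}{2\epsilon} - \epsilon\,V(x_{n-1})\right)\right] \tag{273}$$
We recognize this as precisely the discretized version of Feynman’s $e^{iS/\hbar}$, as discussed
in Chapter 8. This can be seen from equation 8.4.3 in Shankar, generalized to include a
potential.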
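To make the discretized action in the exponent of (273) concrete, here is a minimal sketch in plain Python that evaluates the sum for a given list of positions $x_0, \ldots, x_N$. The function names and the choice of a free particle with unit mass in the usage note are assumptions made for illustration only.

```python
def discretized_action(path, total_time, mass=1.0, potential=lambda x: 0.0):
    """Sum over time slices of m (x_n - x_{n-1})^2 / (2 eps) - eps V(x_{n-1}).

    `path` is the list [x_0, x_1, ..., x_N]; eps = total_time / N.
    """
    n_slices = len(path) - 1
    eps = total_time / n_slices
    action = 0.0
    for n in range(1, n_slices + 1):
        dx = path[n] - path[n - 1]
        action += mass * dx * dx / (2.0 * eps) - eps * potential(path[n - 1])
    return action


def straight_path(x_start, x_end, n_slices):
    """Equally spaced positions from x_start to x_end (the free classical path)."""
    return [x_start + (x_end - x_start) * n / n_slices
            for n in range(n_slices + 1)]
```

For a free particle the straight-line path gives the classical action $m(x_N - x_0)^2/2t$ whatever the number of slices, and displacing any interior point raises the discretized action. Paths far from the classical one therefore contribute rapidly varying phases to the sum over paths.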
We can thus if we wish give a continuum version of this result, namely that
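$$U(x,x',t) = \int [Dx]\, \exp\left[\frac{i}{\hbar}\int_0^t L(x,\dot{x})\,dt\right] \tag{274}$$
where by definition
$$\int [Dx] = \lim_{N\to\infty} \left(\frac{m}{2\pi\hbar i\epsilon}\right)^{1/2} \left[\int \prod_{n=1}^{N-1} \left(\frac{m}{2\pi\hbar i\epsilon}\right)^{1/2} dx_n\right] \tag{275}$$
and L is the Lagrangian and is a function of x and $\dot{x}$.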
We refer to this path integral description of the propagator as the Configuration Space
Path Integral, as it derives by inserting the resolution of the identity operator in coordinate
or configuration space.
Now let’s return to the propagator expressed earlier as a matrix element of N pairs of
operators
$$\langle x|\, e^{-\frac{i\epsilon}{2m\hbar}P^2}\, e^{-\frac{i\epsilon}{\hbar}V(X)}\, e^{-\frac{i\epsilon}{2m\hbar}P^2}\, e^{-\frac{i\epsilon}{\hbar}V(X)} \cdots |x'\rangle$$
and evaluate it in a different way, namely by making use of a different resolution of the
identity operator. More specifically, let’s now introduce both
$$I = \int dx\; |x\rangle\langle x|$$
and
$$I = \int \frac{dp}{2\pi\hbar}\; |p\rangle\langle p|$$
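When we consider this for N = 3, we find that we need to introduce three momentum-space
resolutions of I and two coordinate-space resolutions, viz:
$$\begin{aligned} &\langle x_3|\, e^{-\frac{i\epsilon}{2m\hbar}P^2}\, e^{-\frac{i\epsilon}{\hbar}V(X)}\, e^{-\frac{i\epsilon}{2m\hbar}P^2}\, e^{-\frac{i\epsilon}{\hbar}V(X)}\, e^{-\frac{i\epsilon}{2m\hbar}P^2}\, e^{-\frac{i\epsilon}{\hbar}V(X)}\,|x_0\rangle \\ &= \frac{1}{(2\pi\hbar)^3}\int dp_3\,dp_2\,dp_1\,dx_2\,dx_1\; \langle x_3|\, e^{-\frac{i\epsilon}{2m\hbar}P^2}\,|p_3\rangle\langle p_3|\, e^{-\frac{i\epsilon}{\hbar}V(X)}\,|x_2\rangle \\ &\qquad\times \langle x_2|\, e^{-\frac{i\epsilon}{2m\hbar}P^2}\,|p_2\rangle\langle p_2|\, e^{-\frac{i\epsilon}{\hbar}V(X)}\,|x_1\rangle \\ &\qquad\times \langle x_1|\, e^{-\frac{i\epsilon}{2m\hbar}P^2}\,|p_1\rangle\langle p_1|\, e^{-\frac{i\epsilon}{\hbar}V(X)}\,|x_0\rangle \end{aligned} \tag{276}$$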
As earlier, we note that
$$V(X)\,|x\rangle = V(x)\,|x\rangle$$
Also,
$$P^2\,|p\rangle = p^2\,|p\rangle$$
Furthermore, we note that
$$\langle x|p\rangle = e^{ipx/\hbar}$$
consistent with our introduction of the factor $\frac{1}{2\pi\hbar}$ in the momentum-space resolution of the
identity operator.
Putting this all together and combining terms, we find that
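$$\begin{aligned} U(x_3,x_0,t) = \frac{1}{(2\pi\hbar)^3}\int dp_3\,dp_2\,dp_1\,dx_2\,dx_1\; & e^{-\frac{i\epsilon p_3^2}{2m\hbar}}\, e^{ip_3x_3/\hbar}\, e^{-\frac{i\epsilon}{\hbar}V(x_2)}\, e^{-ip_3x_2/\hbar} \\ \times\, & e^{-\frac{i\epsilon p_2^2}{2m\hbar}}\, e^{ip_2x_2/\hbar}\, e^{-\frac{i\epsilon}{\hbar}V(x_1)}\, e^{-ip_2x_1/\hbar} \\ \times\, & e^{-\frac{i\epsilon p_1^2}{2m\hbar}}\, e^{ip_1x_1/\hbar}\, e^{-\frac{i\epsilon}{\hbar}V(x_0)}\, e^{-ip_1x_0/\hbar} \end{aligned} \tag{277}$$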
At this point we can combine terms. In particular we can collect those terms that involve
$p_n^2$ in the exponent, terms that involve $p_n x_m$ in the exponent and of course terms involving
$V(x_n)$. When we do this we find
$$U(x_3,x_0,t) = \frac{1}{(2\pi\hbar)^3}\int dp_1\,dp_2\,dp_3\,dx_1\,dx_2\; \exp\left[\sum_{n=1}^{3}\left(-\frac{i\epsilon}{2m\hbar}p_n^2 + \frac{i}{\hbar}p_n(x_n - x_{n-1}) - \frac{i\epsilon}{\hbar}V(x_{n-1})\right)\right] \tag{278}$$
At this point we can generalize to an arbitrary number of time steps N , whereby
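$$U(x_N,x_0,t) = \int \prod_{n=1}^{N}\frac{dp_n}{2\pi\hbar} \prod_{n=1}^{N-1} dx_n\; \exp\left[\sum_{n=1}^{N}\left(-\frac{i\epsilon}{2m\hbar}p_n^2 + \frac{i}{\hbar}p_n(x_n - x_{n-1}) - \frac{i\epsilon}{\hbar}V(x_{n-1})\right)\right] \tag{279}$$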
Here too we can write it in its continuum form by introducing the classical hamiltonian
$$H = \frac{p^2}{2m} + V(x) \tag{280}$$
and also a notation for the integration variables over all the momenta and coordinates
$$\int [Dp\,Dx] = \lim_{N\to\infty} \int \prod_{n=1}^{N}\frac{dp_n}{2\pi\hbar} \prod_{n=1}^{N-1} dx_n \tag{281}$$
Then
$$U(x,x',t) = \int [Dp\,Dx]\, \exp\left[\frac{i}{\hbar}\int_0^t \left(p\dot{x} - H(x,p)\right)dt\right] \tag{282}$$
This is referred to as the Phase-Space Path Integral for the propagator, as we integrate
over both the momenta and the associated coordinates.
Knowing the momentum dependence in the hamiltonian and furthermore since it is a
(simple) quadratic dependence, we can in fact carry out all the momentum integrals. When
we do this in the discretized form (279), we find that
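$$\prod_{n=1}^{N} \int_{-\infty}^{\infty} \frac{dp_n}{2\pi\hbar}\, \exp\left[-\frac{i\epsilon}{2m\hbar}p_n^2 + \frac{i}{\hbar}p_n(x_n - x_{n-1})\right] = \prod_{n=1}^{N} \left(\frac{m}{2\pi i\hbar\epsilon}\right)^{1/2} \exp\left[\frac{im(x_n - x_{n-1})^2}{2\hbar\epsilon}\right] \tag{283}$$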
If we now plug this into (279), we not surprisingly recover the Configuration Space Path
Integral given in (272). But note that this depended on having a hamiltonian in which the
dependence on momentum was purely a quadratic. If this is not the case, we cannot carry
out the momentum space integrals. But we can still use the Phase Space Path Integral.
I would now like to close my lectures by discussing a problem of contemporary importance
in physics in which we make use of the Path Integral formalism just developed. Further
examples, as noted earlier, can be found in Chapter 21 of Shankar.
The Berry Phase
The topic that I will discuss concerns what is called the Berry Phase.
This concerns what happens when we make a very slow or adiabatic change on a quantum
system. We addressed this last semester in our discussion of time-dependent Perturbation
Theory where we showed that when we apply an adiabatic perturbation to a quantum
system, the system evolves by remaining in a given state of the system, but with the state
itself changing adiabatically. Put another way, if we start off in the ground state of the
system and change the system sufficiently slowly, the system will remain in the ground state
of the hamiltonian at every instant.
Put a bit more formally, let’s assume that the hamiltonian of the system is given by
H(R(t)) where R is some external coordinate that enters the hamiltonian parametrically
and which changes slowly with time. What we stated qualitatively above is that if we start
off in the nth eigenstate of H(R(0)) at time t = 0 we will be in the nth eigenstate of H(R(t))
at the later time t.
A natural way to write the time-dependent wave function of the system |Ψ(t) > in this
approximation is
$$|\Psi(t)\rangle = \exp\left(-\frac{i}{\hbar}\int_0^t E_n(t')\,dt'\right)|n(t)\rangle \tag{284}$$
where
$$H(t)\,|n(t)\rangle = E_n(t)\,|n(t)\rangle \tag{285}$$
is the instantaneous time-independent Schrodinger equation at time t.
Of course, if H were not a function of time, $E_n$ would not be a function of time and this
would just be the familiar
$$|\Psi_n(t)\rangle = \exp\left(-iE_nt/\hbar\right)|n\rangle \tag{286}$$
Equation (284) recognizes that in the presence of a slowly varying time-dependent hamil-
tonian, the phase that gets built up over time should depend on the instantaneous and
time-dependent energy.
But as we will now see, the ansatz (284) misses some important physics, namely the
physics of what is called the Berry phase. To see just what is missing, let’s try to parametrize
what may be wrong through the introduction of a slightly modified ansatz
$$|\Psi(t)\rangle = c(t)\,\exp\left(-\frac{i}{\hbar}\int_0^t E_n(t')\,dt'\right)|n(t)\rangle \tag{287}$$
If the ansatz (284) were right, we would just find that c(t) = 1. If c(t) ≠ 1, then something
is obviously missing.
So let’s try to determine c(t) by plugging (287) into the time-dependent Schrodinger
equation
$$\left(i\hbar\frac{\partial}{\partial t} - H(t)\right)|\Psi(t)\rangle = 0 \tag{288}$$
The derivative gives three contributions, since each of the three factors depends on t.
The derivative of the phase factor gives rise to a term
$$c(t)\,E_n(t)\,\exp\left(-\frac{i}{\hbar}\int_0^t E_n(t')\,dt'\right)|n(t)\rangle$$
which simply cancels the term obtained by acting with H(t) on $|\Psi(t)\rangle$, since
$$H(t)\,|n(t)\rangle = E_n(t)\,|n(t)\rangle$$
What is left behind are the other two derivative terms
$$\dot{c}(t)\,\exp\left(-\frac{i}{\hbar}\int_0^t E_n(t')\,dt'\right)|n(t)\rangle + c(t)\,\exp\left(-\frac{i}{\hbar}\int_0^t E_n(t')\,dt'\right)\frac{d}{dt}|n(t)\rangle = 0$$
We now take the overlap of this expression with $\langle n(t)|$ and get
$$\dot{c}(t) = -c(t)\,\langle n(t)|\frac{d}{dt}|n(t)\rangle \tag{289}$$
which has as its solution
$$c(t) = c(0)\,\exp\left[-\int_0^t \langle n(t')|\frac{d}{dt'}|n(t')\rangle\,dt'\right] \tag{290}$$
Defining
$$\gamma = i\int_0^t \langle n(t')|\frac{d}{dt'}|n(t')\rangle\,dt' \tag{291}$$
we find that
$$c(t) = c(0)\,e^{i\gamma} \tag{292}$$
The additional phase γ is called the Berry phase. It is not so interesting that we got an
extra phase from our analysis, since as we have often seen, phases usually don’t matter. But
in fact this phase can indeed have observable consequences and thus does matter.
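A standard illustration, not worked out in these notes, is a spin-1/2 whose quantization axis is carried slowly once around a cone of half-angle θ. The sketch below, in plain Python, discretizes the definition (291) by using $\langle n(t)|n(t+dt)\rangle \approx 1 + \langle n|\dot{n}\rangle dt = e^{-i\,d\gamma}$ for each small step. The choice of system, the particular phase convention for the spinor, and all function names are assumptions made for this example.

```python
import cmath
import math


def spin_up_along(theta, phi):
    """Spin-1/2 state aligned with the direction (theta, phi), in one gauge choice."""
    return (complex(math.cos(theta / 2.0)),
            cmath.exp(1j * phi) * math.sin(theta / 2.0))


def overlap(a, b):
    """Inner product <a|b> of two two-component states."""
    return a[0].conjugate() * b[0] + a[1].conjugate() * b[1]


def berry_phase_cone(theta, steps=2000):
    """Discretized gamma of (291) for one full circuit of phi at fixed theta."""
    total_phase = 0.0
    for k in range(steps):
        phi_now = 2.0 * math.pi * k / steps
        phi_next = 2.0 * math.pi * (k + 1) / steps
        step_overlap = overlap(spin_up_along(theta, phi_now),
                               spin_up_along(theta, phi_next))
        # Each overlap is approximately exp(-i d_gamma).
        total_phase += cmath.phase(step_overlap)
    return -total_phase
```

With the sign convention of (291), a full circuit should give $\gamma = -\pi(1 - \cos\theta)$, i.e. minus half the solid angle enclosed by the path, defined modulo 2π. Because the path is closed, this value does not depend on the phase convention chosen for the states, which is what allows the phase to have observable consequences.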
So let’s now assume that we have a non-zero Berry phase and see what it can do. The
problem we will consider is that of an electron orbiting around a nucleus, which itself can
move. We will let R = R(t) denote the coordinate of the nucleus and r denote that of the
electron orbiting it.
Now let’s look at the effects of the slow motion of the nucleus, slow compared to that of
the electron. Thus, as the nucleus moves the electron adapts to the motion of the nucleus,
staying in the same instantaneous eigenstate |n(t) >.
We first express the Berry phase in a slightly different form. We consider the exponential
containing the Berry phase factor as
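$$\begin{aligned} \exp\left(-\int_0^t \langle n(t')|\frac{d}{dt'}|n(t')\rangle\,dt'\right) &= \exp\left(\frac{i}{\hbar}\,i\hbar\int_0^t \langle n(t')|\frac{d}{dt'}|n(t')\rangle\,dt'\right) \\ &= \exp\left(\frac{i}{\hbar}\int_0^t i\hbar\,\langle n(R)|\frac{d}{dR}|n(R)\rangle\,\frac{dR}{dt'}\,dt'\right) \\ &= \exp\left(\frac{i}{\hbar}\int_0^t A_n(R)\,\frac{dR}{dt'}\,dt'\right) \end{aligned} \tag{293}$$
where
$$A_n(R) = i\hbar\,\langle n(R)|\frac{d}{dR}|n(R)\rangle \tag{294}$$
We refer to $A_n(R)$ as the Berry potential. It is obviously a vector potential as it couples
to the velocity $\frac{dR}{dt}$ of the nucleus. Note that the Berry potential depends on the state n that
the electron is in.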
Now let’s construct the path integral corresponding to the nuclear degrees of freedom.
The resolution of the identity that we will use is
$$I = \int dR \sum_n |R, n(R)\rangle\langle n(R), R| \tag{295}$$
where
$$|R, n(R)\rangle = |R\rangle \otimes |n(R)\rangle \tag{296}$$
At each value of R we pick our basis for the resolution of the identity as the one that
diagonalizes the instantaneous electronic hamiltonian $H_e(R, r, p)$, namely the eigenstates of
$$H_e(R,r,p)\,|R, n(R)\rangle = E_n(R)\,|R, n(R)\rangle \tag{297}$$
At this point, we will impose the adiabatic approximation, whereby an electron that
starts off in an instantaneous eigenstate | n >, will remain in that instantaneous eigenstate
forever. When we impose this adiabatic condition, we are able to approximate the identity
operator in terms of a single term,
$$I \approx \int dR\; |R, n(R)\rangle\langle n(R), R| \tag{298}$$
and thus drop the sum over n.
Now let’s consider the configuration space path integral in the nuclear variable R. A
typical factor for a given time slice ϵ will look like
$$\langle n(R(t+\epsilon)), R(t+\epsilon)|\, \exp\left[-\frac{i\epsilon}{\hbar}H_N(R,P)\right] \exp\left[-\frac{i\epsilon}{\hbar}H_e(R,r,p)\right] |n(R(t)), R(t)\rangle \tag{299}$$
Note that I use capital letters to refer to the nuclear variables and small letters to refer to
the electronic variables. And note that we have both the nuclear hamiltonian HN and the
electronic hamiltonian He contributing to the propagator. Lastly, note that the electronic
hamiltonian depends on the nuclear variable, through its parametric dependence.
Let’s first look at the matrix element of the nuclear part of the hamiltonian, taken between
the nuclear part of the eigenstates,
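$$\langle R(t+\epsilon)|\, \exp\left[-\frac{i\epsilon}{\hbar}H_N(R,P)\right] |R(t)\rangle = \sqrt{\frac{m}{2\pi\hbar i\epsilon}}\; \exp\left[\frac{i\epsilon}{\hbar}\left(\frac{m}{2\epsilon^2}\left(R(t+\epsilon) - R(t)\right)^2 - V(R)\right)\right] \tag{300}$$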
as we remember from our earlier analysis of the configuration space path integral.
Next let’s look at the matrix element of the electronic part of the hamiltonian taken
between the electronic eigenstates (including their parametric dependence on R),
$$\langle n(R(t+\epsilon))|\, \exp\left[-\frac{i\epsilon}{\hbar} H_e(R, r, p)\right] |n(R(t))\rangle \tag{301}$$
When the electronic hamiltonian acts to the right it gives us a factor of
$$\exp\left[-\frac{i\epsilon}{\hbar} E_n(R)\right] \tag{302}$$
But then we are still left with the overlap between the initial electronic state | n(R(t)) >
and the final electronic state | n(R(t+ ϵ)) >, which we still need to evaluate. Indeed, as we
will soon see, all of the interesting physics will arise when we consider this overlap.
To consider this overlap, we will first rewrite it as
$$\langle n(R(t+\epsilon))|n(R(t))\rangle = \langle n(R')|n(R)\rangle \tag{303}$$
and then carry out a Taylor series expansion in the difference between R and R ′, which we
will denote as η. We will in fact go through order $\eta^2$, since this (as we’ll soon see) will lead
to results good to order ϵ in the time slice.
So let’s now go back to our earlier discussion in Chapter 8 on how to derive the Schrodinger
equation from the Feynman Path Integral for a single time slice.
As a reminder, we saw there that we could obtain the state of the system Ψ(x, ϵ) from
the state of the system at an earlier time 0 using
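$$\Psi(x, \epsilon) = \int_{-\infty}^{\infty} U(x, \epsilon; x', 0)\, \Psi(x', 0)\, dx'$$
where
$$U(x, \epsilon; x', 0) = \sqrt{\frac{m}{2\pi i\hbar\epsilon}}\, \exp\left\{\frac{i}{\hbar}\left[\frac{m(x - x')^2}{2\epsilon} - \epsilon V\!\left(\frac{x + x'}{2}\right)\right]\right\}$$
is the propagator associated with a particle moving in one dimension subject to a potential
V .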
We then showed how we could recover the time-dependent Schrodinger equation for an
infinitesimal time step by expanding the integrand appropriately. There too we introduced
the variable η = x′ − x and carried out a series expansion in η to the order needed to get
the wave function correct to first order in ϵ.
We will now repeat that discussion, but for the problem at hand, focusing on the nuclear
degree of freedom R. For simplicity, as it does not affect what emerges, we will ignore
the potential. But for reasons just discussed we will need to include the overlap function
< n(R ′)|n(R ′ + η) >. The relevant expression we need to treat is
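$$\Psi(R', \epsilon) = \left(\frac{m}{2\pi\hbar i\epsilon}\right)^{1/2} \int_{-\infty}^{\infty} e^{im\eta^2/2\hbar\epsilon}\, \langle n(R')|n(R' + \eta)\rangle\, \Psi(R' + \eta, 0)\, d\eta \tag{304}$$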
We would now like to see what effect that overlap function has on the resulting infinitesimal
Schrodinger equation that emerges.
As in the discussion of Chapter 8, there is only a small region of η that can contribute,
defined by
$$|\eta| \approx \left(\frac{2\pi\hbar\epsilon}{m}\right)^{1/2} \tag{305}$$
For η values outside this region the phase in the integral varies very rapidly and the
contributions thus cancel. Since (305) shows that $\eta^2$ is itself of order ϵ, we indeed confirm my
earlier remark that we must go to order $\eta^2$ to get results good to order ϵ (exactly as in the
discussion of Chapter 8).
So let’s now expand both the wave function Ψ(R ′ + η, 0) and the overlap < n(R ′)| n(R ′ + η) >
to this order. We find
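$$\Psi(R' + \eta, 0) = \Psi(R', 0) + \eta \frac{\partial \Psi}{\partial \eta} + \frac{\eta^2}{2} \frac{\partial^2 \Psi}{\partial \eta^2}$$
$$\langle n(R')|n(R' + \eta)\rangle = 1 + \eta\, \langle n|\partial n\rangle + \frac{\eta^2}{2}\, \langle n|\partial^2 n\rangle \tag{306}$$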
where all derivatives are evaluated at R ′.
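The claim that truncating the overlap at second order leaves an error of third order in η can be checked numerically. The following is only a minimal sketch: the two-component state, the mixing angle ALPHA and the sample point are hypothetical choices made for illustration, not the electronic states of the problem at hand.

```python
import cmath
import math

ALPHA = 0.6  # hypothetical fixed mixing angle of the illustrative state


def state(R):
    """Hypothetical normalized two-component state |n(R)>."""
    return (complex(math.cos(ALPHA)), math.sin(ALPHA) * cmath.exp(1j * R))


def braket(bra, ket):
    """Inner product <bra|ket> for tuple-valued states."""
    return sum(b.conjugate() * k for b, k in zip(bra, ket))


# Analytic derivatives for this particular state:
# <n|dn> = i sin^2(ALPHA) and <n|d^2 n> = -sin^2(ALPHA)
n_dn = 1j * math.sin(ALPHA) ** 2
n_d2n = -math.sin(ALPHA) ** 2


def truncation_error(R, eta):
    """|exact overlap - second-order expansion| at displacement eta."""
    exact = braket(state(R), state(R + eta))
    approx = 1 + eta * n_dn + 0.5 * eta ** 2 * n_d2n
    return abs(exact - approx)


err_full = truncation_error(0.3, 0.1)
err_half = truncation_error(0.3, 0.05)
ratio = err_full / err_half  # near 8 if the error scales as eta^3
print(ratio)
```

Halving η should reduce the error by roughly a factor of eight, which is what one expects if the first neglected term is cubic in η.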
What we now do is to plug (306) into (304), through order η2, and do the appropriate
Gaussian integrals. What we end up with when all is said and done is
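$$i\hbar\left[\Psi(R, \epsilon) - \Psi(R, 0)\right] = \epsilon\left[-\frac{\hbar^2}{2m}\frac{\partial^2 \Psi}{\partial R^2} - \frac{\hbar^2}{m}\langle n|\partial n\rangle \frac{\partial \Psi}{\partial R} - \frac{\hbar^2}{2m}\langle n|\partial^2 n\rangle\, \Psi\right] \tag{307}$$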
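For reference, the Gaussian integrals needed to pass from (304) and (306) to (307) can be sketched as follows. The shorthand N for the normalization factor is introduced here only for convenience, and R ′ is relabeled as R in the last step.

```latex
N \equiv \left(\frac{m}{2\pi\hbar i\epsilon}\right)^{1/2}, \qquad
N\int_{-\infty}^{\infty} e^{im\eta^2/2\hbar\epsilon}\, d\eta = 1, \qquad
N\int_{-\infty}^{\infty} \eta\, e^{im\eta^2/2\hbar\epsilon}\, d\eta = 0, \qquad
N\int_{-\infty}^{\infty} \eta^2\, e^{im\eta^2/2\hbar\epsilon}\, d\eta = \frac{i\hbar\epsilon}{m}

% Collecting the terms of order \eta^2 in the product of the two expansions:
\Psi(R', \epsilon) = \Psi(R', 0) + \frac{i\hbar\epsilon}{m}
\left[\frac{1}{2}\frac{\partial^2 \Psi}{\partial R^2}
+ \langle n|\partial n\rangle \frac{\partial \Psi}{\partial R}
+ \frac{1}{2}\langle n|\partial^2 n\rangle\, \Psi\right] + O(\epsilon^2)
```

Multiplying the second line by iℏ and moving Ψ(R ′, 0) to the left-hand side gives the three terms on the right-hand side of (307).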
With a little work, we can cast this into the form of an infinitesimal Schrodinger equation
and then read off the hamiltonian from it. The result is
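$$H = \frac{1}{2m}(P - A_n)^2 + \Phi_n \tag{308}$$
$$A_n = i\hbar\, \langle n|\partial n\rangle$$
$$\Phi_n = \frac{\hbar^2}{2m}\left[\langle \partial n|\partial n\rangle - \langle \partial n|n\rangle \langle n|\partial n\rangle\right] \tag{309}$$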
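The "little work" can be spot-checked numerically. Expanding the square in (308) with P = −iℏ ∂/∂R, the term multiplying ∂Ψ/∂R matches (307) by inspection, so the sketch below compares only the terms multiplying Ψ itself. It is an illustration under stated assumptions: the state |n(R)>, the units ℏ = m = 1, the sample point and the finite-difference step are all hypothetical choices.

```python
import cmath
import math

HBAR, MASS = 1.0, 1.0  # illustrative units, not taken from the notes


def state(R):
    """Hypothetical normalized two-component state |n(R)>."""
    return (complex(math.cos(R)), cmath.exp(1j * R * R) * math.sin(R))


def braket(bra, ket):
    """Inner product <bra|ket> for tuple-valued states."""
    return sum(b.conjugate() * k for b, k in zip(bra, ket))


def d1(R, h=1e-4):
    """Central-difference first derivative |dn>."""
    return tuple((p - q) / (2 * h) for p, q in zip(state(R + h), state(R - h)))


def d2(R, h=1e-4):
    """Central-difference second derivative |d^2 n>."""
    return tuple((p - 2 * c + q) / h ** 2
                 for p, c, q in zip(state(R + h), state(R), state(R - h)))


def vector_potential(R):
    """A_n = i hbar <n|dn>, as in (309)."""
    return 1j * HBAR * braket(state(R), d1(R))


def scalar_potential(R):
    """Phi_n = hbar^2/(2m) [<dn|dn> - <dn|n><n|dn>], as in (309)."""
    n, dn = state(R), d1(R)
    return HBAR ** 2 / (2 * MASS) * (braket(dn, dn) - braket(dn, n) * braket(n, dn))


def psi_coefficient_from_308(R, h=1e-4):
    """Coefficient of Psi from expanding (P - A_n)^2/(2m) + Phi_n."""
    A = vector_potential(R)
    dA = (vector_potential(R + h) - vector_potential(R - h)) / (2 * h)
    return (1j * HBAR * dA + A * A) / (2 * MASS) + scalar_potential(R)


def psi_coefficient_from_307(R):
    """Coefficient of Psi inside the bracket of (307)."""
    return -HBAR ** 2 / (2 * MASS) * braket(state(R), d2(R))


R0 = 0.7
mismatch = abs(psi_coefficient_from_308(R0) - psi_coefficient_from_307(R0))
print(mismatch)
```

The mismatch should be at the level of the finite-difference error, which suggests that the scalar potential is exactly what is needed to reconcile the squared form (308) with (307).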
What we find is that indeed the hamiltonian includes the coupling to the Berry vector
potential. But, furthermore, it includes another term, Φn which is a scalar potential.
And there is no way to get rid of these potentials that arise when we consider the coupling
of the fast and slow degrees of freedom in the problem. And we have seen how they arise of
necessity from the use of the path integral formalism.
Now I’d like to turn to an example which shows that the Berry phase and the associated
Berry potentials do indeed lead to observable consequences, especially when one considers
periodic trajectories arising from periodic hamiltonians. In particular, I will briefly show
why it is for periodic hamiltonians that such consequences may arise.
The problem we will discuss involves a particle of mass M moving slowly in a circular
path of radius a. Already we see the idea of periodic trajectories entering.
Furthermore we will assume that there is a magnetic field pointing perpendicular to the
circular path (i.e. in the z direction) with field strength B1. Also, there is a second magnetic
field produced by passing a current along a wire along the z-axis, with strength B2. The
total magnetic field thus has a strength
$$B = \sqrt{B_1^2 + B_2^2}$$
and is at an angle
$$\theta = \arctan\frac{B_2}{B_1}$$
with respect to the z-axis.
At this point, let’s assume that the particle has no spin. Then there is no coupling of the
particle to the magnetic field, and the hamiltonian describing its motion can be written
simply as
$$H = \frac{L^2}{2I} \tag{310}$$
where
$$I = M a^2 \tag{311}$$
is the moment of inertia and
$$L = -i\hbar \frac{\partial}{\partial \phi} \tag{312}$$
is the angular momentum operator, with ϕ being the azimuthal angle that defines the
periodic motion of the particle around the circle.
It is easy to solve this problem. The eigenvalues are
$$E_m = \frac{\hbar^2}{2I} m^2 \tag{313}$$
where m labels the quantized values of the z component of the angular momentum in units
of $\hbar$, namely
m = 0, ± 1, ± 2, ....
Now let’s assume that the particle moving around the circle has spin-1/2 so that it does
indeed couple to the magnetic field. The field that the particle feels will of course depend
on the angle ϕ. The total hamiltonian for the particle is now
$$H = \frac{L^2}{2I} - C\, \boldsymbol{\sigma} \cdot \mathbf{B}(\phi) \tag{314}$$
where C is a constant that measures the energy splitting between the two spin states.
We will now assume that the energy splitting between the two spin states is very large
compared to the energy splitting between the rotational states, which goes as $\hbar^2/2I$. Put another
way, the process of flipping spin in the magnetic field is very fast compared to the very slow
motion of the electron as it moves around the circular path. As such, the particle will not
jump between the two spin states as the particle moves around the circle, but will adjust
adiabatically.
Now what are the energies of the resulting states? An initial guess is that it would be
simply
$$E_m = \frac{\hbar^2}{2I} m^2 \mp CB \tag{315}$$
namely for each value of m there would be a splitting associated with the two Larmor states.
And that would be wrong, because it fails to take into account the effects of the Berry
potentials that result from the periodic motion of the electron.
To see this, let’s focus on the lower of the two solutions, in which the spin points along the
total field that the particle sees. The relevant spinor state can be shown to be
$$|\theta, \phi\rangle = \begin{pmatrix} \cos\frac{\theta}{2} \\ i \sin\frac{\theta}{2}\, e^{i\phi} \end{pmatrix} \tag{316}$$
From this one can find the two Berry potentials. The vector potential turns out to be
$$A_+(\phi) = i\hbar\, \langle \theta, \phi| \frac{\partial}{\partial \phi} |\theta, \phi\rangle = -\hbar \sin^2\frac{\theta}{2} \tag{317}$$
whereas the scalar potential turns out to be
$$\Phi = \frac{\hbar^2}{4} \sin^2\theta \tag{318}$$
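These results can be spot-checked numerically from the spinor (316). The sketch below is a hedged illustration: it assumes ℏ = 1, picks arbitrary sample angles, and assumes that the field of the wire is purely azimuthal, so that the unit vector along the total field is (−sin θ sin ϕ, sin θ cos ϕ, cos θ). It reproduces (317) and the angular dependence sin²θ/4 of the bracket in (309); it does not address the overall prefactor of the scalar potential.

```python
import cmath
import math

HBAR = 1.0  # illustrative units


def spinor(theta, phi):
    """Spin state of (316)."""
    return (complex(math.cos(theta / 2)),
            1j * math.sin(theta / 2) * cmath.exp(1j * phi))


def braket(bra, ket):
    """Inner product <bra|ket> for tuple-valued states."""
    return sum(b.conjugate() * k for b, k in zip(bra, ket))


def d_phi(theta, phi, h=1e-5):
    """Central-difference derivative of the spinor with respect to phi."""
    up, down = spinor(theta, phi + h), spinor(theta, phi - h)
    return tuple((p - q) / (2 * h) for p, q in zip(up, down))


def sigma_dot_field(theta, phi, chi):
    """Apply sigma.B/|B| to chi, ASSUMING the wire field is azimuthal."""
    nx = -math.sin(theta) * math.sin(phi)
    ny = math.sin(theta) * math.cos(phi)
    nz = math.cos(theta)
    return (nz * chi[0] + (nx - 1j * ny) * chi[1],
            (nx + 1j * ny) * chi[0] - nz * chi[1])


theta, phi = 0.9, 2.3  # arbitrary sample angles
chi = spinor(theta, phi)
dchi = d_phi(theta, phi)

# Berry vector potential, to compare with (317)
A_plus = (1j * HBAR * braket(chi, dchi)).real
A_expected = -HBAR * math.sin(theta / 2) ** 2

# Bracket <dn|dn> - <dn|n><n|dn> that enters the scalar potential in (309)
geometric = (braket(dchi, dchi) - braket(dchi, chi) * braket(chi, dchi)).real
geometric_expected = math.sin(theta) ** 2 / 4

# sigma.B/|B| acting on chi should return chi (spin along the field)
aligned = sigma_dot_field(theta, phi, chi)
eigen_residual = max(abs(a - c) for a, c in zip(aligned, chi))
print(A_plus, A_expected, geometric, geometric_expected, eigen_residual)
```

Note that A+ does not depend on ϕ, so integrating it once around the ring simply gives −2πℏ sin²(θ/2).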
Since there is a vector potential, we will need to revise the angular momentum operator
to accommodate it, through the replacement
$$L_z \to L_z - A_+ \tag{319}$$
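Thus, the angular equation of motion associated with Lz changes to
$$\left(-i\hbar \frac{\partial}{\partial \phi} - A_+\right)\Psi = \lambda \Psi \tag{320}$$
for which the solutions are
$$\lambda = m\hbar - A_+ = \left(m + \sin^2\frac{\theta}{2}\right)\hbar \tag{321}$$
$$\Psi = e^{im\phi} \tag{322}$$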
again for m = 0, ± 1, ± 2, ....
Now when we calculate the total energy of the spin-up (i.e. the lower) solution it will be
$$E_+ = \frac{1}{2I}\lambda^2 - CB = \frac{1}{2I}\left(m + \sin^2\frac{\theta}{2}\right)^2 \hbar^2 - CB \tag{323}$$
There is an extra contribution to the energy that derives from the Berry vector potential
compared to (315). Had we not included the Berry phase in this problem, we would have
arrived at the wrong answer. Indeed, we would have predicted a degeneracy under the
replacement of m by −m (for m ≠ 0) and that would have been incorrect.
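To make the lifted degeneracy concrete, here is a small numerical sketch comparing the guess (315), read with the rotational factor m² of (313), against (323) for the lower spin state. All parameter values (ℏ, the moment of inertia, the product CB and the field ratio) are illustrative assumptions, not values from the lecture.

```python
import math

HBAR = 1.0       # illustrative units
INERTIA = 1.0    # moment of inertia I = M a^2, illustrative value
C_TIMES_B = 5.0  # spin coupling C*B, illustrative value


def naive_energy(m):
    """Guess (315) for the lower spin state, with no Berry potential."""
    return HBAR ** 2 * m ** 2 / (2 * INERTIA) - C_TIMES_B


def berry_energy(m, theta):
    """Energy (323), including the Berry vector potential."""
    lam = (m + math.sin(theta / 2) ** 2) * HBAR
    return lam ** 2 / (2 * INERTIA) - C_TIMES_B


theta = math.atan2(1.0, 2.0)  # hypothetical fields with B2/B1 = 1/2
for m in (1, 2):
    naive_gap = naive_energy(m) - naive_energy(-m)
    berry_gap = berry_energy(m, theta) - berry_energy(-m, theta)
    print(m, naive_gap, berry_gap)
```

In the naive estimate the gap between m and −m vanishes identically, whereas with the Berry potential it equals 2mℏ² sin²(θ/2)/I, which is nonzero whenever the wire field B2 is present.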
Of course the problem of having fast and slow degrees of freedom is not new to us. We
have dealt with it when we studied for example the long-range Van der Waals interaction
between two hydrogen atoms. There we had two time scales, one associated with the slow
relative motion of the two nuclei and the other the much more rapid motion of the electronic
degrees of freedom. What we did was to evaluate the interaction between the two atoms as a
function of the parametric distance between the two nuclei, integrating over the fast motion
of the electrons. Once we have the interaction between the two atoms, we can just solve the
two-body problem in the relative coordinate between the two nuclei. This is known as the
Born-Oppenheimer approximation and is the key tool that has historically been used when
we have two very different time scales involved in a many-body problem. Much the same
philosophy is used to build a description of the interaction of two nuclei, where we integrate
over the much faster degrees of freedom associated with the motion of the constituent quarks.
In the Born-Oppenheimer approach, nowhere do we take into account any Berry potentials
when we integrate over the fast degrees of freedom.
What is different between those problems and the kind of problem we just treated,
involving the motion of particles with spin on a ring?
The answer is that Born and Oppenheimer focused on problems in which the hamiltonian
could always be chosen real and thus whose wave functions could always be chosen real.
While this is fine for any problem involving motion in an open space, it is not appropriate
when dealing with closed loops, where the particle can return to the same position but with
the wave function changing sign when it returns. Thus, to allow for such closed trajectories
Berry considered the possibility of complex hamiltonians which then led naturally to the
Berry phase and the associated Berry potentials and to the new physics it could produce.
This completes what I would like to say about the Berry phase, about the Path Integral
approach in general, and even more generally about Quantum Mechanics.
The End