canonical - UBC Physics & Astronomy | UBC Physics & Astronomy



Canonical ensembles

So far, we have studied classical and quantum microcanonical ensembles, i.e. ensembles of large numbers of isolated thermodynamic systems that are all in the same (equilibrium) macrostate. We saw that if we're able to find the multiplicity Ω of a macrostate, we immediately find the entropy S = k_B ln Ω, and then we can just go ahead and calculate everything that we like.

The problem is that calculating multiplicities can lead to quite difficult (or even impossible) mathematical problems, so we can only do it for a handful of very simple problems. We certainly can't do it for quantum ideal gases, which is what we'd like to study and understand in this course. So we need to find an alternative formulation of statistical mechanics, since this one, although quite beautiful and certainly complete, is too difficult to use because of the math involved.

So what can we change, in order to get a different formulation? Well, what we can change about our system is how it is separated from the outside world – let us see what happens if our model system is not isolated, but closed (it exchanges energy but not particles with the outside, see Fig. below). The "outside" is called a bath or reservoir, and we picture it as a very very large system, also in equilibrium and at some temperature T_R that we can set to whatever value we like. By very very large we really mean here that E_R ≫ E. The reason is this: suppose we prepare our system in some initial state, and then we set it in thermal contact with the reservoir. We know that the system will evolve towards equilibrium, and that in the process some energy (heat) will be transferred from the system to the reservoir or vice versa. If E ≪ E_R, this amount of exchanged heat will be negligibly small from the point of view of the reservoir – so the reservoir will remain at the same temperature T_R. That's precisely what we mean by a reservoir – its state remains the same no matter how the state of our system changes. To fulfill this condition, it must be very big compared to the system.

[Figure: a system (E, V, N, T, p, ...) inside a much larger reservoir (E_R, V_R, N_R, T_R, p_R, ...)]

Figure 1: Model of a closed thermodynamic system. The system has some energy E, volume V, etc., and is in thermal contact with a bath or reservoir with energy E_R, volume V_R, etc. The bath is always assumed to be much much bigger than the system, so that E ≪ E_R, V ≪ V_R, N ≪ N_R. We know that in thermal equilibrium T = T_R.

As already discussed, the macrostate of such a system will now be characterized by its temperature T (which will be the same as that of the outside reservoir, T = T_R), its volume V and number of particles N (again, I am using a classical gas as a typical example. We know that for other systems we might change some of these variables; for instance, we might not need the volume if we deal with a crystal). The key point is that we must use T instead of E to characterize the macrostate.

We call an ensemble of very many copies of our closed system, all prepared in the same macrostate T, V, N, a canonical ensemble. It is classical or quantum, depending on the problem at hand. Let us first study classical canonical ensembles, and then we will see the easy generalization to quantum canonical ensembles.

1 Classical canonical ensemble

For a classical system, we know that its microstate is characterized by all generalized coordinates (q,p). The second postulate of classical stat mech says that if we can find the density of probability ρ_c(q,p) to find a system in the canonical ensemble in microstate (q,p), then we can calculate any macroscopic property as an ensemble average. For example, the internal energy is just the average of the system's energy:

U = 〈H〉 = ∫ (dq dp / (G_N h^{Nf})) ρ_c(q,p) H(q,p)

Why? Well, because dq dp ρ_c(q,p) / (G_N h^{Nf}) is, by definition, the probability to have a microstate in between q, q + dq and p, p + dp, and these microstates all have energy H(q,p). If we sum over contributions from all of the microstates, we find the average energy in the system, which is what we mean by "internal energy". As before, the Gibbs factor G_N is there to make sure we do not over-count microstates. The factor h^{Nf} is there for convenience, so that dq dp / (G_N h^{Nf}) is dimensionless. As in the general case, the normalization condition is:

1 = ∫ (dq dp / (G_N h^{Nf})) ρ_c(q,p)

Another quantity that we can calculate is the entropy, since we know that

S = −k_B 〈ln ρ〉 = −k_B ∫ (dq dp / (G_N h^{Nf})) ρ_c(q,p) ln ρ_c(q,p)

Remember that S = −k_B 〈ln ρ〉 is always true, so it must also hold when we use the canonical density of probability ρ_c(q,p). Etc.

So we need to figure out ρ_c(q,p). For the microcanonical system, this was the point where we used the 3rd postulate, which said that for an isolated system any allowed microstate is equally likely – from there we concluded that the microcanonical density of probability ρ_mc(q,p) = const = 1/Ω if E ≤ H(q,p) ≤ E + δE, and zero otherwise. In contrast, our closed system can have absolutely any energy, since in principle it can exchange any amount of heat with the reservoir. We expect that microstates corresponding to whatever energy is consistent with the system's fixed temperature T are more likely than microstates with very very different energy ... so we know something here, we can't claim full ignorance. In any event, this is not an isolated system. So what can we do?

Well, as is traditional, we reduce the problem to one that we already know how to solve, by noticing that the total system = system + reservoir is isolated, and so we can use microcanonical statistics for it. The microstate of the total system is characterized by the generalized coordinates q,p;q_R,p_R, where the latter describe the microsystems making up the reservoir. Its Hamiltonian is H_T(q,p;q_R,p_R) = H(q,p) + H_R(q_R,p_R) – this just says that the total energy is the sum of the energies of the components. (Note: you might wonder about adding an extra interaction term, since microsystems inside the system have to somehow interact with those from the outside, if they can exchange energy. In fact, this is not a problem, because one can argue that this interaction, whatever it is, must be proportional to the contact surface between the two systems. As such, for the large thermodynamic systems that we consider, it is much much smaller than either H or H_R, which are proportional to the volumes.)


We know that the density of probability to find the total system in the microstate (q,p;q_R,p_R) is then:

ρ_mc(q,p;q_R,p_R) = 1/Ω_T(E_T, δE_T, V, V_R, N, N_R)  if E_T ≤ H_T(q,p;q_R,p_R) ≤ E_T + δE_T, and 0 otherwise,

where Ω_T(E_T, δE_T, V, V_R, N, N_R) is the multiplicity for the total system. This is the same as saying that the total probability to find the system in a microstate between q, q + dq and p, p + dp AND the reservoir in a microstate between q_R, q_R + dq_R and p_R, p_R + dp_R is:

(dq dp dq_R dp_R / (G_N G_{N_R} h^{Nf + N_R f_R})) ρ_mc(q,p;q_R,p_R)

(particles cannot go through the wall, so the Gibbs factor is a product of the two factors – particles inside and outside can never interchange places).

But we don't care what the reservoir is doing; all we want to know is the probability that the system itself is in a microstate between q, q + dq and p, p + dp. To find that, we must simply sum over all the possible reservoir microstates, while keeping the system in the desired microstate. Therefore

ρ_c(q,p) dq dp / (G_N h^{Nf}) = ∫_Res (dq dp dq_R dp_R / (G_N G_{N_R} h^{Nf + N_R f_R})) ρ_mc(q,p;q_R,p_R)

where the integral is over all the reservoir’s degrees of freedom. After simplifying, we have:

ρ_c(q,p) = ∫_Res (dq_R dp_R / (G_{N_R} h^{N_R f_R})) ρ_mc(q,p;q_R,p_R) = (1/Ω_T) ∫_{E_T ≤ H(q,p) + H_R(q_R,p_R) ≤ E_T + δE_T} dq_R dp_R / (G_{N_R} h^{N_R f_R})

since ρ_mc = 1/Ω_T when this energy condition is satisfied, and zero otherwise. However, we can rewrite the condition as:

ρ_c(q,p) = (1/Ω_T) ∫_{E_T − H(q,p) ≤ H_R(q_R,p_R) ≤ E_T − H(q,p) + δE_T} dq_R dp_R / (G_{N_R} h^{N_R f_R})

ρ_c(q,p) = Ω_R(E_T − H(q,p), δE_T, V_R, N_R) / Ω_T(E_T, δE_T, V, V_R, N, N_R)

since the integral is by definition just the multiplicity for the reservoir to be in a macrostate of energy E_T − H(q,p), with V_R and N_R. Using the link between the entropy of the reservoir and its multiplicity, S_R = k_B ln Ω_R (because the reservoir is so big and insensitive to what the system does, all microcanonical formulae valid for an isolated system apply to the reservoir), we then have:

Ω_R(E_T − H(q,p), δE_T, V_R, N_R) = e^{S_R(E_T − H(q,p), δE_T, V_R, N_R)/k_B} ≈ e^{[S_R(E_T, δE_T, V_R, N_R) − H(q,p) ∂S_R/∂E_R]/k_B}

where I used the fact that the energy of the system H(q,p) is very small compared to the total energy, and used a Taylor expansion to first order.

But we know that ∂S_R/∂E_R = 1/T_R = 1/T at equilibrium. We define β = 1/(k_B T), and find the major result:

ρ_c(q,p) = (1/Z) e^{−βH(q,p)}    (1)


where Z is a constant (what we obtain when we collect all the terms that do not depend on (q,p)), called the canonical partition function. We can find its value from the normalization condition:

1 = ∫ (dq dp / (G_N h^{Nf})) ρ_c(q,p)  →  Z(T, V, N) = ∫ (dq dp / (G_N h^{Nf})) e^{−βH(q,p)}    (2)

Note that Z is a function of T (through the β in the exponent), of V (the integrals over positions are restricted to the volume V), and of N, which appears in the number of degrees of freedom, the number of integrals, and possibly in the Gibbs factor.

The result in Eq. (1) should make you very happy: this is what is known as the Boltzmann distribution or Boltzmann probability. I am sure you've been told before that the probability to find a system at temperature T to have energy E is proportional to e^{−βE} – now we've derived this formula from the basic postulates. Moreover, we know when we can apply it: it holds for closed systems only! (We already saw that for a microcanonical ensemble ρ_mc is very different, and you'll have to believe me, for the time being, that the ρ_gc that we'll find for grand-canonical ensembles – open systems – will also be different.)
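You can even watch this mechanism emerge numerically. Here is a minimal sketch (a hypothetical toy model, not from the notes, with k_B = 1): the reservoir is N_R two-level units sharing a fixed total energy E_T with a small system, so the probability of the system holding energy ε is proportional to the reservoir multiplicity Ω_R, and the ratios indeed follow e^{−βε} with β = ∂S_R/∂E_R:

```python
from math import comb, exp, log

# Toy reservoir (hypothetical, k_B = 1): N_R two-level units, each with
# energy 0 or 1; total energy E_T is shared with the small system.
N_R, E_T = 10_000, 2_000

def Omega_R(eps):
    # reservoir multiplicity when the system holds energy eps
    return comb(N_R, E_T - eps)

# P(eps) is proportional to Omega_R(eps); look at the ratios P(eps)/P(0)
ratios = [Omega_R(e) / Omega_R(0) for e in range(4)]

# reservoir temperature from 1/T = dS_R/dE_R = ln((N_R - E_R)/E_R):
beta = log((N_R - E_T) / E_T)

for e, r in enumerate(ratios):
    print(f"eps={e}: multiplicity ratio = {r:.5f}, e^(-beta*eps) = {exp(-beta * e):.5f}")
```

The agreement improves as the reservoir grows, which is exactly the E ≪ E_R condition in the derivation.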

Now that we know the canonical density of probability, we can calculate the internal energy (see the discussion at the beginning):

U = 〈H(q,p)〉 = ∫ (dq dp / (G_N h^{Nf})) ρ_c(q,p) H(q,p) = (1/Z) ∫ (dq dp / (G_N h^{Nf})) H(q,p) e^{−βH(q,p)}

However, this can be simplified with a trick we've already discussed a few times, by rewriting H e^{−βH} = −(∂/∂β) e^{−βH}, so that:

U = −(1/Z) (∂/∂β) ∫ (dq dp / (G_N h^{Nf})) e^{−βH(q,p)} = −(1/Z) ∂Z/∂β

since the integral is exactly the partition function (see Eq. (2)). So as soon as we know the canonical partition function Z(T, V, N), we can immediately find the internal energy as:

U = −(1/Z) ∂Z/∂β = −∂ ln Z/∂β    (3)
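As a quick sanity check of this derivative trick, here is a sketch (under assumed units m = ω = h = k_B = 1, not part of the notes) that builds the partition function of a single classical 1D harmonic oscillator by brute-force numerical integration over phase space and compares −∂ ln z/∂β with the expected equipartition result U = k_B T:

```python
from math import exp, log

# z(beta) = Int dq dp e^{-beta(p^2/2 + q^2/2)} for one classical 1D
# oscillator (units m = omega = h = k_B = 1), via a midpoint Riemann sum.
def z(beta, L=20.0, n=4000):
    dx = 2.0 * L / n
    # the 2D phase-space integral factorizes into two identical Gaussians
    g = sum(exp(-beta * x * x / 2.0) for x in
            (-L + (i + 0.5) * dx for i in range(n))) * dx
    return g * g

beta, h = 0.8, 1e-5
# Eq. (3): U = -d ln z / d beta, via a symmetric numerical derivative
U = -(log(z(beta + h)) - log(z(beta - h))) / (2 * h)
print(U, 1.0 / beta)   # both ≈ k_B T = 1/beta (two quadratic degrees of freedom)
```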

How about the entropy? Using the expression of ρ_c in the general Boltzmann formula we discussed in the beginning, we find that:

S = −k_B ∫ (dq dp / (G_N h^{Nf})) ρ_c(q,p) ln ρ_c(q,p) = −k_B ∫ (dq dp / (G_N h^{Nf})) (1/Z) e^{−βH(q,p)} [−ln Z − βH(q,p)]  →  S = k_B ln Z + k_B β U

since the first integral is related to the normalization condition, while the second is just the average of H. From this we find that the free energy is:

F(T, V, N) = U − TS = −k_B T ln Z(T, V, N)    (4)

I hope you are now really impressed with how beautifully everything holds together. If you remember, when we reviewed thermodynamics, we decided that for a closed system, if we managed to calculate its free energy F(T, V, N), then we would use

dF = −SdT − pdV + µdN (5)

to find S, p and µ as its partial derivatives; and anything else can be obtained by further derivatives.


So let us summarize how we solve a classical canonical ensemble problem. As always, we begin by identifying the number of degrees of freedom f per microsystem, all the needed generalized coordinates q,p that fully characterize a microstate, and the Hamiltonian of the system H(q,p).

Then we calculate the partition function from Eq. (2). Once we have it, we know that F = −k_B T ln Z. Once we have F, we can find

S = −(∂F/∂T)_{V,N};  p = −(∂F/∂V)_{T,N};  µ = (∂F/∂N)_{T,V}

The internal energy comes from Eq. (3) – that's the simplest way to get it, from the statistical average. Of course, we could also use the fact that U = F + TS, and that we already know F and S – the result will be the same. From U we can calculate C_V and whatever else we might like.

For example, we might also want to find the standard deviation of the energy (we know that the average is U, but how big are the fluctuations about this average?). For this, we use the definition of any statistical average to find:

〈H²〉 = ∫ (dq dp / (G_N h^{Nf})) ρ_c(q,p) [H(q,p)]² = (1/Z) ∫ (dq dp / (G_N h^{Nf})) [H(q,p)]² e^{−βH(q,p)} = (1/Z) ∂²Z/∂β²

where I used again the same trick (if you learn this trick, it'll make your lives a lot easier. We can do the calculation by brute force as well, see example below – but it's better to avoid that if you can). Then, since U = 〈H〉 = −(1/Z) ∂Z/∂β, we find the variance (squared standard deviation) of the energy to be:

σ_E² = 〈H²〉 − 〈H〉² = (1/Z) ∂²Z/∂β² − [(1/Z) ∂Z/∂β]² = (∂/∂β)[(1/Z) ∂Z/∂β] = −∂U/∂β = k_B T² ∂U/∂T = k_B T² C_V    (6)

Note that this must be true for any system whatsoever, since we made no special assumption about what the Hamiltonian looks like, or anything like that – this is always valid. In fact, this last equation is an example of the kind of results one obtains from the so-called "fluctuation-dissipation theorem", a very powerful theorem. If you continue on to graduate school in physics, you'll learn to recognize and love this theorem, but we will drop this topic for now.
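To make Eq. (6) concrete, here is a minimal numerical sketch (a hypothetical two-level system with energies {0, 1}, k_B = 1, not from the notes): the variance of the energy computed directly from the Boltzmann probabilities matches T² C_V, with C_V obtained by numerically differentiating U(T):

```python
from math import exp

# Hypothetical two-level system, energies {0, 1}, k_B = 1.
def U_of(T):
    b = 1.0 / T
    Z = 1.0 + exp(-b)
    return exp(-b) / Z            # U = sum_a p_a E_a

T = 0.7
b = 1.0 / T
Z = 1.0 + exp(-b)
avg_H = exp(-b) / Z
avg_H2 = exp(-b) / Z              # E^2 = E when E is 0 or 1
var = avg_H2 - avg_H ** 2         # sigma_E^2 from the distribution itself

h = 1e-6
C_V = (U_of(T + h) - U_of(T - h)) / (2 * h)   # numerical dU/dT
print(var, T * T * C_V)           # Eq. (6): the two agree
```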

Before looking at some examples, let us generalize this discussion to:

2 Quantum canonical ensembles

As we already know, the main difference between a classical and a quantum system is how we characterize the microstates. For a classical system, we use q,p, which are continuous variables. So if we want to "sum over all microstates" (for example in order to calculate an ensemble average) we actually have to do many integrals over the whole phase-space.

In contrast, microstates of a quantum system are its eigenstates, and are characterized by the appropriate quantum numbers for the particular problem. Their energies are always discrete, so if we want to "sum over all microstates" (for example in order to calculate an ensemble average) in this case we really have to do a sum over all the eigenstates.

To be more specific, let us assume we have a quantum system described by a Hamiltonian H (an operator), and let E_α be its eigenenergies, where α is one or more quantum numbers (however many are required to fully identify the eigenstate for the problem at hand). What is the canonical probability to find the system in a given microstate α? Well, here we should repeat all the arguments we used for the classical system: the total system is isolated, so we can use the microcanonical ensemble


formalism for it, after which we can sum over all the microstates of the reservoir since we don't care what the reservoir is doing ... to find out at the end of the day that the probability to find the closed quantum system in a microstate α is simply:

p_α = (1/Z) e^{−βE_α}    (7)

This is the direct analog of the classical expression which had e^{−βH(q,p)}. If you look again at how we derived that, you'll see that nowhere did we need to worry whether the energy of the system in that microstate (which is H(q,p) for a classical system, respectively E_α for a quantum system) is continuous or discrete. This is why the quantum and classical cases give similar looking results. Again we call Z the canonical partition function, and again we calculate it from the normalization condition, which is now:

1 = ∑_α p_α  →  Z(T, V, N) = ∑_α e^{−βE_α}    (8)

Note: here we need to be very careful. The sum is over all the microstates, not over the energy levels. If an energy level is degenerate (i.e., there are several different microstates all with the same energy E_β, let's say), then the total contribution from that level is g_β e^{−βE_β}, where g_β is the degeneracy of that level – each microstate of energy E_β contributes an e^{−βE_β} to the sum.
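A one-line numerical illustration of this bookkeeping (a hypothetical spectrum, not from the notes): summing e^{−βE} over microstates gives exactly the same Z as summing g_E e^{−βE} over distinct levels:

```python
from math import exp
from collections import Counter

# Hypothetical spectrum: level E=0 with g=1, E=1 with g=3, E=2.5 with g=2.
beta = 0.9
microstates = [0.0, 1.0, 1.0, 1.0, 2.5, 2.5]

# sum over microstates:
Z_states = sum(exp(-beta * E) for E in microstates)

# equivalent sum over levels, weighted by the degeneracy g_E:
Z_levels = sum(g * exp(-beta * E) for E, g in Counter(microstates).items())

print(Z_states, Z_levels)   # the two sums agree
```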

Now that we have the probability, we can calculate ensemble averages. For example:

U = 〈H〉 = ∑_α p_α E_α

since when in microstate α, the system has energy E_α. But we can now do exactly the same trick again:

U = (1/Z) ∑_α E_α e^{−βE_α} = −(1/Z) ∂Z/∂β = −∂ ln Z/∂β

Similarly

S = −k_B 〈ln p〉 = −k_B ∑_α p_α ln p_α = ... = k_B ln Z + U/T

if you go through precisely the same kinds of steps we did for classical systems. The only difference is that the integrals over microstates for the classical case turn here into sums over quantum numbers; the rest is the same. So from here, we find again that

F(T, V, N) = −k_B T ln Z(T, V, N)

and then all the rest with the partial derivatives to find S, p, µ and U and C_V goes precisely the same. Even σ_E = √(C_V k_B T²) remains true as well (check!).

So the only difference is at the step where we calculate the partition function Z – we do an integral over the whole phase space for a classical system, whereas for a quantum system we do a sum over all possible eigenstates. Let's see some examples.

2.1 Classical ideal gas

As usual, assume N identical simple atoms, with f = 3 degrees of freedom each, inside a volume V and kept at a temperature T. The microstate is characterized by ~r_1, ..., ~r_N, ~p_1, ..., ~p_N. Because there are no interactions, the Hamiltonian inside the box is simply

H = ∑_{i=1}^N ~p_i²/(2m)


Then, by definition (see Eq. (2)):

Z = (1/(N! h^{3N})) ∫ d~r_1 ··· ∫ d~r_N ∫ d~p_1 ··· ∫ d~p_N e^{−β ∑_{i=1}^N ~p_i²/(2m)}

since the Gibbs factor is N!. Each ∫ d~r_i = V, since each particle can only be located inside the box. We still have to do 3N integrals over all 3N components of the N momenta. Since ~p_i² = p_{i,x}² + p_{i,y}² + p_{i,z}² and ∫ d~p = ∫_{−∞}^{∞} dp_{i,x} ∫_{−∞}^{∞} dp_{i,y} ∫_{−∞}^{∞} dp_{i,z}, note that the remaining integrals factorize into a product of simple Gaussian integrals:

Z = (V^N/(N! h^{3N})) [∫_{−∞}^{∞} dp_{1,x} e^{−β p_{1,x}²/(2m)}] ··· [∫_{−∞}^{∞} dp_{N,z} e^{−β p_{N,z}²/(2m)}]

So we have 3N perfectly identical integrals, each of which equals √(2πm/β) = √(2mπk_B T), and so

Z(T, V, N) = (V^N/N!) (2mπk_B T/h²)^{3N/2}

Then:

F = −k_B T ln Z = −N k_B T ln[V (2mπk_B T/h²)^{3/2}] + k_B T ln N!

After using Stirling's formula ln N! ≈ N ln N − N, we can group terms together to find:

F(T, V, N) = −k_B T ln Z = −N k_B T ln[(V/N)(2mπk_B T/h²)^{3/2}] − N k_B T

At this point you should stop and verify that (i) this indeed has units of energy, and (ii) this is indeed an extensive quantity.
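The extensivity check can even be done numerically. A minimal sketch (assumed units k_B = m = h = 1, not part of the notes): doubling both V and N at fixed T doubles F:

```python
from math import log, pi

# Ideal-gas free energy in units k_B = m = h = 1:
# F = -N*T*[ln((V/N)*(2*pi*m*T)^(3/2)) + 1]
def F(T, V, N):
    return -N * T * (log((V / N) * (2.0 * pi * T) ** 1.5) + 1.0)

T, V, N = 2.0, 10.0, 5.0
print(F(T, 2 * V, 2 * N), 2 * F(T, V, N))   # doubling (V, N) doubles F
```

Note that without the 1/N! (i.e. without Stirling's −N k_B T ln N + ... regrouping into V/N), this check would fail; that is the Gibbs factor doing its job.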

We now take partial derivatives and find:

S = −(∂F/∂T)_{V,N} = N k_B ln[(V/N)(2mπk_B T/h²)^{3/2}] + (5/2) N k_B

p = −(∂F/∂V)_{T,N} = N k_B T/V

µ = (∂F/∂N)_{T,V} = −k_B T ln[(V/N)(2mπk_B T/h²)^{3/2}]

If we want U, we can either use U = F + TS, or better, we can use:

U = −∂ ln Z/∂β = −(∂/∂β)[(3N/2) ln(1/β) + ...] = (3/2) N k_B T

where the dots stand for terms that do not depend on T or β, so they do not contribute to the derivative. Of course, we could also calculate U as an ensemble average:

U = 〈H〉 = (1/(N! h^{3N})) ∫ d~r_1 ··· ∫ d~r_N ∫ d~p_1 ··· ∫ d~p_N (1/Z) e^{−β ∑_{i=1}^N ~p_i²/(2m)} [∑_{i=1}^N ~p_i²/(2m)]


This can be done! There are now 3N different terms (there are 3N contributions to the total energy from the 3N momentum components) and each multiple integral is doable. In fact they all give the same result – and I might force you to go through this calculation once in the next assignment, just so you see how much you have to work if you don't learn the nice tricks and instead do things by brute force. 〈H²〉 can also be calculated by brute force, but it has (3N)² multiple integrals ... so the trick with the derivative is really useful, and we will use it in future calculations as well. Learn it!

Some comments now: (1) I hope you agree that calculating Z is a lot simpler than calculating Ω was. In fact, we'll soon learn another trick, called "the factorization theorem", that makes things even simpler. In any case, we'll never have to deal with hyperspheres again, only simple Gaussian integrals; (2) we obviously got the right results – you should compare this with what we found for the classical microcanonical ensemble, and convince yourself that these and those relations are perfectly equivalent. But this agreement should puzzle you, in fact. Is it obvious that we should get the same relationships between the macroscopic variables whether the system is isolated or closed? These are very different conditions! We'll see in a bit that the reason we get the same results is because these are thermodynamic (big) systems. If they weren't, the results might be very different indeed.

2.2 Chain of classical 1D harmonic oscillators

In this case f = 1,

H = ∑_{i=1}^N (p_{x,i}²/(2m) + mω²u_i²/2)

(see the discussion for the microcanonical ensemble for notation, if needed). Then, since G_N = 1 in this case, using the definition of the partition function we have:

Z = (1/h^N) ∫_{−∞}^{∞} du_1 ··· ∫_{−∞}^{∞} dp_{x,N} e^{−β ∑_{i=1}^N (p_{x,i}²/(2m) + mω²u_i²/2)}

The exponential again factorizes into simple Gaussian integrals:

Z = (1/h^N) ∫_{−∞}^{∞} du_1 e^{−βmω²u_1²/2} ··· ∫_{−∞}^{∞} dp_{x,N} e^{−β p_{x,N}²/(2m)}

Each spatial integral equals √(2π/(βmω²)) and each momentum integral equals √(2πm/β), so that:

Z = (1/h^N) [2π/(βmω²)]^{N/2} [2πm/β]^{N/2} = (k_B T/(ℏω))^N

Then,

F = −k_B T ln Z = −N k_B T ln(k_B T/(ℏω))

and we find:

U = −∂ ln Z/∂β = ... = N/β → U = N k_B T

etc. (You should check that again we get all the results in agreement with the ones we had for microcanonical ensembles. Again, you should wonder why that is so.)

Before looking at some new systems (problems we could not treat with the microcanonical ensemble formalism, but which we will be able to easily solve with the canonical ensemble formalism), let us notice that in all the cases above, we had to do sets of N identical integrals. This allows us to make the following simplification:


2.3 Factorization theorem – works only for non-interacting systems!

For non-interacting systems, the total Hamiltonian is the sum of the Hamiltonians of each microsystem (e.g., the atoms that make up the gas), H = ∑_{i=1}^N h_i. The Hamiltonian of each microsystem depends only on the generalized coordinates and momenta of that particular microsystem, let's call them q_i, p_i, so:

H = ∑_{i=1}^N h_i(q_i, p_i)

Using this in the definition of the partition function, we find that for non-interacting systems we can rewrite:

Z(T, V, N) = (1/G_N) [z(T, V)]^N

where

z(T, V) = ∫ (dq dp / h^f) e^{−βh(q,p)}

and the integrals are only over the coordinates/momenta associated with a single microsystem, not with all N of them. If you think about it, z(T, V) = Z(T, V, N = 1) is just the partition function for a system with a single microsystem inside. What these formulae show is that we only need to do one set of integrals over the coordinates and momenta of a single particle (2f integrals). All particles are identical, so the contributions from the integrals of the other particles have to be equal to these.

Once we have Z, we use F = −k_B T ln Z, etc. For example, for the chain of classical harmonic oscillators, we have f = 1, so:

z(T, V) = (1/h) ∫_{−∞}^{∞} du ∫_{−∞}^{∞} dp e^{−β(p²/(2m) + mω²u²/2)} = (1/h) √(2π/(βmω²)) √(2πm/β) = k_B T/(ℏω)

and since G_N = 1, we find the same Z as before. So we only need to do 2f integrals, not 2Nf. We'll see that there is an analogous factorization theorem for quantum canonical ensembles – we'll wait with that until we look at some quantum examples.
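The factorization theorem has a simple discrete analog that is easy to check by brute force. A sketch (a hypothetical example, not from the notes, k_B = 1): for N distinguishable non-interacting two-level units sitting on distinct sites (so G_N = 1), summing e^{−βE} over all 2^N microstates reproduces z^N with z = 1 + e^{−β}:

```python
from math import exp
from itertools import product

# N distinguishable, non-interacting two-level units with energies {0, 1};
# G_N = 1 because they occupy distinct lattice sites.
beta, N = 0.6, 10
z = 1.0 + exp(-beta)                       # single-unit partition function

# brute-force sum over all 2^N = 1024 microstates:
Z_brute = sum(exp(-beta * sum(config)) for config in product((0, 1), repeat=N))

print(Z_brute, z ** N)   # factorization theorem: the two agree
```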

But first, let us study a system we could not investigate within the microcanonical ensemble, because we could not calculate the multiplicity:

2.4 N classical non-interacting spins (aka paramagnetic spins)

Remember that we have studied N non-interacting spins-1/2. This will be the classical version of that problem, where we assume that the spins are simple (classical) vectors that can point in any direction (so, no quantization that only allows certain spin projections). More precisely, we assume that each atom has a magnetic moment ~m, which has a known value m, but is free to point in any direction in space. If we assume the atoms fixed in a lattice, then the only degrees of freedom per atom are the two angles θ, φ that describe the orientation of its magnetic moment. Of course, these come with two angular momenta p_θ, p_φ that characterize how fast these angles change in time.

So f = 2. Now we need the Hamiltonian. As we've already discussed in assignment 2, the rotational kinetic energy for a rotating object with angular momentum ~l is:

~l²/(2I) = (1/(2I)) (p_θ² + p_φ²/sin²θ)


where I is the moment of inertia and we use p_θ and p_φ as names for the projections of the angular momentum along the corresponding directions. The potential energy (coming from the interaction with an external magnetic field, which we assume to be oriented along the z-axis) is:

−~m · ~B = −mB cos θ

(see Fig. 2). So the total Hamiltonian of one classical spin is:

h = (1/(2I)) (p_θ² + p_φ²/sin²θ) − mB cos θ

Because the spins are non-interacting, the total Hamiltonian is just the sum of the individual Hamiltonians, and we can use the factorization theorem. Since the volume is fixed (atoms locked in a crystal) there is no dependence on V. Therefore:

[Figure]

Fig. 2: Spherical coordinates θ, φ characterizing the orientation of the magnetic moment ~m of an atom.

z(T) = (1/h²) ∫_0^π dθ ∫_0^{2π} dφ ∫_{−∞}^{∞} dp_θ ∫_{−∞}^{∞} dp_φ e^{−β[(1/(2I))(p_θ² + p_φ²/sin²θ) − mB cos θ]}

Of these 4 integrals, the one over φ is trivial and gives 2π. The integral over p_θ is just a Gaussian, and gives √(2πI/β). The integral over p_φ is also a Gaussian, and gives √(2πI sin²θ/β). So we are left with:

. So we are left with:

z(T ) =1

h2

∫ π

0dθ2π

π2I

β

π2I sin2 θ

βeβmB cos θ =

4π2I

βh2

∫ π

0dθ sin θeβmB cos θ

Using a new variable u = cos θ → du = −sin θ dθ, we have θ = 0 → u = 1; θ = π → u = −1, so:

z(T) = (4π²I/(βh²)) ∫_{−1}^{1} du e^{βmBu} = (4π²I/(βh²)) · (2 sinh(βmB))/(βmB)  →  z(T) = (2I(k_B T)²/(mBℏ²)) sinh(mB/(k_B T))
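The only nontrivial step here is the angular integral, and it is easy to verify numerically. A minimal sketch (with a = βmB set to an arbitrary example value): ∫_0^π sin θ e^{a cos θ} dθ should equal 2 sinh(a)/a:

```python
from math import exp, sin, cos, sinh, pi

# Midpoint Riemann sum for Int_0^pi sin(t) exp(a cos t) dt,
# compared with the closed form 2 sinh(a)/a; a = beta*m*B.
a, n = 1.3, 100_000
dt = pi / n
numeric = sum(sin(t) * exp(a * cos(t))
              for t in ((i + 0.5) * dt for i in range(n))) * dt

print(numeric, 2.0 * sinh(a) / a)   # agree to high accuracy
```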

You must admit that these integrals were rather trivial (especially compared with the microcanonical version, which you should now try and see how far you can take).

Since the spins are locked in a lattice, G_N = 1, so Z(T, N) = [z(T)]^N, and:

F(T, N) = −N k_B T ln z(T) = −N k_B T ln[(2I(k_B T)²/(mBℏ²)) sinh(mB/(k_B T))]

We can calculate the internal energy:

U = −∂ ln Z/∂β = −N ∂ ln z/∂β = 2N k_B T − N mB coth(mB/(k_B T))
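Again the derivative trick can be checked numerically. A sketch (k_B = 1, arbitrary example values of mB and T, keeping only the β-dependent part of ln z, since the constants drop out of the derivative): −∂ ln z/∂β per spin should equal 2k_B T − mB coth(mB/(k_B T)):

```python
from math import cosh, log, sinh

# Per spin: ln z = const - 2 ln(beta) + ln sinh(beta*m*B)  (k_B = 1)
mB, T = 0.9, 1.4

def lnz(beta):
    # beta-dependent part of ln z only
    return -2.0 * log(beta) + log(sinh(beta * mB))

beta, h = 1.0 / T, 1e-6
U_per_spin = -(lnz(beta + h) - lnz(beta - h)) / (2 * h)
coth = cosh(beta * mB) / sinh(beta * mB)
print(U_per_spin, 2.0 * T - mB * coth)   # the two agree
```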

and then the specific heat, entropy, etc. More interesting in this case is to calculate the magnetic properties of this system; in particular, we would like to know the average magnetization:

we would like to know what is the average magnetization:

〈~M〉 = 〈∑_{i=1}^N ~m_i〉 = ∫ (dq dp / (G_N h^{Nf})) ρ_c(q,p) ∑_{i=1}^N ~m_i


where, of course, in this particular case:

dq dp / (G_N h^{Nf}) = (dθ_1···dθ_N dφ_1···dφ_N dp_{θ,1}···dp_{θ,N} dp_{φ,1}···dp_{φ,N}) / h^{2N}

and, by definition

ρ_c(q,p) = (1/Z) e^{−βH(q,p)} = (1/z^N) e^{−β ∑_{i=1}^N [(1/(2I))(p_{θ_i}² + p_{φ_i}²/sin²θ_i) − mB cos θ_i]}

Of course, we could jump into doing the integrals, but let us think about this for a second. It should be apparent that 〈~m_1〉 = 〈~m_2〉 = ... = 〈~m_N〉, since the spins are identical and placed in identical conditions (same magnetic field), so they should all have the same average magnetization. This tells us that it's enough to find the value of one of them, and then 〈~M〉 = N〈~m_1〉, for example. Now, consider:

〈~m_1〉 = ∫ (dθ_1···dθ_N dφ_1···dφ_N dp_{θ,1}···dp_{θ,N} dp_{φ,1}···dp_{φ,N} / h^{2N}) (1/z^N) e^{−β ∑_{i=1}^N [(1/(2I))(p_{θ_i}² + p_{φ_i}²/sin²θ_i) − mB cos θ_i]} ~m_1

Since the quantity we're averaging, ~m_1 = m(sin θ_1 cos φ_1, sin θ_1 sin φ_1, cos θ_1), only depends on the angles of the first spin θ_1, φ_1, the integrals over all coordinates and momenta with i ≥ 2 are the same as when we calculated Z. In fact, the 4 integrals over the momenta and coordinates of each of the spins 2, 3, ..., N will each just give a z, so that in the end we are left with:

〈~m_1〉 = (1/h²) ∫_0^π dθ_1 ∫_0^{2π} dφ_1 ∫_{−∞}^{∞} dp_{θ,1} ∫_{−∞}^{∞} dp_{φ,1} (1/z) e^{−β[(1/(2I))(p_{θ,1}² + p_{φ,1}²/sin²θ_1) − mB cos θ_1]} ~m_1

If we stop to think about it, this formula is very reasonable. Since we're calculating an average, everything multiplying the averaged quantity (i.e. ~m_1) must be the corresponding density of probability to find spin 1 pointing in the direction θ_1, φ_1 and with angular momenta p_{θ,1}, p_{φ,1}. The formula above says that this density of probability is (1/z) e^{−βh_1}, where h_1 is the energy of spin 1 in short notation. We could infer this result directly, without doing the integrals over the other spins' angles and momenta: since the spins do not interact with one another, they behave independently, and the total probability ρ_c must be the product of the probabilities for each spin to do its own thing. Since ρ_c is a product of N terms (1/z) e^{−βh_i}, one for each spin i = 1, ..., N, each of these terms must be the probability for the corresponding spin to be in its corresponding microstate. Another way to think about this is that in the absence of interactions, the spin of interest would behave just the same if it were the only spin in the system. In the limit N → 1, ρ_c → (1/z) e^{−βh}. The bottom line is that for non-interacting microsystems, the probability for one of them to be in a microstate is (1/z) e^{−βh}, where h is the energy of the microsystem in that microstate and z is the corresponding normalization factor, which indeed equals the partition function for a system with a single particle inside.

For interacting systems this is not true, however, and there we must start from the full $\rho_c$ and integrate over all the possible states of all the other microsystems.

Coming back to our expectation value, we can now do the angular momenta integrals, since $\vec{m}_1 = m(\sin\theta_1\cos\phi_1, \sin\theta_1\sin\phi_1, \cos\theta_1)$ does not depend on them. As already discussed, those integrals are simple Gaussians and will cancel against the corresponding terms from $z$. Using the expression of $z$ and simplifying those terms, we find:

$$\langle \vec{m}_1\rangle = \frac{\beta m B}{4\pi \sinh(\beta m B)} \int_0^{2\pi} d\phi_1 \int_0^{\pi} d\theta_1 \sin\theta_1\, e^{\beta m B \cos\theta_1}\, \vec{m}_1$$


Let’s start calculating averages for the individual components:

$$\langle m_{1,x}\rangle = \frac{\beta m B}{4\pi \sinh(\beta m B)} \int_0^{2\pi} d\phi_1 \int_0^{\pi} d\theta_1 \sin\theta_1\, e^{\beta m B \cos\theta_1}\, m\sin\theta_1\cos\phi_1 = 0$$

because $\int_0^{2\pi} d\phi \cos\phi = 0$. Similarly, we find $\langle m_{1,y}\rangle = 0$. This is expected since there is nothing favoring some direction in the $xy$ plane more than the others. As we just said, the $xy$ projection of the spin can point with equal likelihood in any direction, so the average in the $xy$ plane must be 0. This is the result of the symmetry of the problem, and so far as solutions at the exam are concerned, I am perfectly satisfied if you answer a problem like this by saying "$\langle M_x\rangle = \langle M_y\rangle = 0$ because of symmetries".

However, $\langle m_{1,z}\rangle \neq 0$, since the spin is more likely to point up than down, so the average should be some positive value. Indeed:

$$\langle m_{1,z}\rangle = \frac{\beta m B}{4\pi \sinh(\beta m B)} \int_0^{2\pi} d\phi_1 \int_0^{\pi} d\theta_1 \sin\theta_1\, e^{\beta m B \cos\theta_1}\, m\cos\theta_1 = \frac{m\,\beta m B}{2 \sinh(\beta m B)} \int_{-1}^{1} du\, u\, e^{\beta m B u}$$

where we did the $\phi$ integral, and changed to the new variable $u = \cos\theta_1$. Here we can either integrate by parts, or observe that our favorite trick holds: $u\, e^{\beta m B u} = \frac{1}{\beta m}\frac{\partial}{\partial B} e^{\beta m B u}$, and we've already done the integral of the exponential alone. Either way, we obtain:

$$\langle m_{1,z}\rangle = m\left[\coth(\beta m B) - \frac{1}{\beta m B}\right] = m L(\beta m B)$$

where the function $L(x) = \coth(x) - \frac{1}{x}$ is called the Langevin function, and looks as shown in the figure below. So the total magnetization of the system is:

$$\langle \vec{M}\rangle = \vec{e}_z\, N m L(\beta m B)$$
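If you want to convince yourself numerically, the remaining $u$-integral can be done on a computer and compared against $mL(\beta mB)$. A minimal sketch (the values of $m$, $B$ and $\beta$ are made up, chosen only for the check; units with $k_B = 1$):

```python
import numpy as np

def langevin(x):
    """Langevin function L(x) = coth(x) - 1/x."""
    return 1.0 / np.tanh(x) - 1.0 / x

# Arbitrary illustrative values for m, B and beta (units with k_B = 1)
m, B, beta = 1.3, 0.7, 2.0
x = beta * m * B

# <m_{1,z}> = m * x/(2 sinh x) * \int_{-1}^{1} u e^{x u} du, after u = cos(theta_1)
u = np.linspace(-1.0, 1.0, 20001)
f = u * np.exp(x * u)
integral = np.sum(0.5 * (f[1:] + f[:-1])) * (u[1] - u[0])  # trapezoid rule
mz_numeric = m * x / (2.0 * np.sinh(x)) * integral

mz_exact = m * langevin(x)
print(abs(mz_numeric - mz_exact) < 1e-6)  # True
```

The agreement is a direct check of the integration by parts done above.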

Now before analyzing the meaning of this, let me show you the smart way of solving this problem. It is quite similar to the trick for calculating $U = \langle H\rangle$ by derivatives from $Z$, instead of doing the integrals. In the derivation above we did the integrals – which is fine, but takes time. Here's the short, trick-based solution. First, as I said, we notice that we must have $\langle M_x\rangle = \langle M_y\rangle = 0$ because of the symmetry of the problem – all $xy$ in-plane directions are perfectly equivalent to each other. Then, by definition:

$$\langle M_z\rangle = \int \frac{dq\, dp}{G_N h^{Nf}}\, \frac{1}{Z} e^{-\beta H}\, M_z$$

Next, we notice that $H = \mathrm{K.E.} - B\sum_{i=1}^{N} m_{i,z} = \mathrm{K.E.} - B M_z$. So we can use the trick, since:

$$M_z\, e^{-\beta(\mathrm{K.E.} - B M_z)} = \frac{1}{\beta}\frac{\partial}{\partial B}\, e^{-\beta(\mathrm{K.E.} - B M_z)}$$

(note that the kinetic energy does not depend on the magnetic field $B$, so the derivative with respect to $B$ gives the right answer). Therefore:

$$\langle M_z\rangle = \frac{1}{Z}\frac{1}{\beta}\frac{\partial}{\partial B} \int \frac{dq\, dp}{G_N h^{Nf}}\, e^{-\beta H} = \frac{1}{Z}\frac{1}{\beta}\frac{\partial Z}{\partial B} = k_B T\, \frac{\partial \ln Z}{\partial B} = N k_B T\, \frac{\partial \ln z}{\partial B}$$

Calculating this derivative we find the answer $\langle M_z\rangle = N m L(\beta m B)$. I hope you agree that this was a lot easier, and that it's worth paying attention: if you are asked for the average of some quantity


which is somehow part of the Hamiltonian, the trick can be used. You should now be able to calculate $\langle M_z^2\rangle$ quite easily as well, and then the standard deviation of the magnetization.
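The trick itself is easy to test numerically: differentiate $\ln z$ with respect to $B$ by finite differences and compare with $NmL(\beta mB)$. A sketch with made-up values for $N, m, B, T$, keeping only the $B$-dependent (angular) part of $z$, since the momentum Gaussians drop out of the derivative:

```python
import numpy as np

def langevin(x):
    return 1.0 / np.tanh(x) - 1.0 / x

# Made-up values for the check (units with k_B = 1)
N, m, B, T = 100, 0.8, 1.1, 0.5
beta = 1.0 / T

# Only the B-dependent (angular) part of ln z matters: the momentum
# Gaussians do not depend on B and drop out of the derivative.
def ln_z_angular(b):
    x = beta * m * b
    return np.log(4.0 * np.pi * np.sinh(x) / x)

h = 1e-6
Mz_trick = N * T * (ln_z_angular(B + h) - ln_z_angular(B - h)) / (2.0 * h)
Mz_exact = N * m * langevin(beta * m * B)
print(abs(Mz_trick - Mz_exact) < 1e-5)  # True
```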

The Langevin function $L(x) = \coth(x) - \frac{1}{x}$ is plotted at right. It asymptotically goes to the value 1, since $1/x \to 0$ when $x \to \infty$, while $\coth(x) \to 1$. For small values of $x$, using Taylor expansions, you should be able to verify that $L(x) \approx \frac{x}{3} + \dots$.

For us, the argument is $x = \frac{mB}{k_B T}$. Large $x$ means $k_B T \ll mB$, i.e. large magnetic fields and low temperatures. In this limit, we find $L(x) \approx 1$ and so:

$$\langle M_z\rangle \approx N m$$

Fig 3. The Langevin function $L(x)$. At small $x \ll 1$, $L(x) \approx x/3$, while for $x \to \infty$, $L(x) \approx 1$.

showing that at small temperature $k_B T \ll mB$, the spins will go to the lowest energy state (low temperature), which consists of all pointing in the positive $z$-direction. This makes sense.

At high temperatures $k_B T \gg mB$, we have $x = mB/(k_B T) \ll 1$, and therefore:

$$\langle M_z\rangle \approx N m\, \frac{x}{3} = \frac{N m^2 B}{3 k_B T} \to 0$$

This also makes sense. At high temperatures the spins have lots and lots of (kinetic) energy, so they rotate fast through all possible orientations, and the magnetization becomes smaller and smaller on average.
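Both limits of the Langevin function are easy to check numerically; a short sketch:

```python
import numpy as np

def langevin(x):
    return 1.0 / np.tanh(x) - 1.0 / x

# Small-x behaviour: L(x)/x -> 1/3, i.e. <M_z> ≈ N m^2 B/(3 k_B T)
assert abs(langevin(1e-3) / 1e-3 - 1.0 / 3.0) < 1e-6
# Large-x behaviour: saturation, L(x) -> 1, i.e. <M_z> ≈ N m
assert langevin(200.0) > 0.99
print("both limits check out")
```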

Experimentalists can measure the magnetization and have confirmed this behavior. In fact, they prefer to define the so-called magnetic susceptibility:

$$\chi = \frac{\partial \langle M_z\rangle}{\partial B}$$

which measures how the average magnetization changes with changing the applied magnetic field (while keeping the temperature and number of spins constant).

Fig 4. Magnetic susceptibility of paramagnetic spins as a function of $k_B T/(mB)$; at high $T$ it decays $\sim 1/T$ (the Curie law).

Using the exact expression of $\langle M_z\rangle$ in terms of the Langevin function and taking the derivative with respect to $B$, we find:

$$\chi = N m^2 \beta \left[\frac{1}{(\beta m B)^2} - \frac{1}{\sinh^2(\beta m B)}\right]$$

The shape of this is plotted in Fig. 4. At low temperatures $k_B T \ll mB$, one finds that $\chi \to 0$. This is expected since in this limit we had $\langle M_z\rangle = Nm = \text{const}$. At high temperatures $k_B T \gg mB$ we find:

$$\chi = \frac{N m^2}{3 k_B T}$$

(here $\langle M_z\rangle = \frac{N m^2 B}{3 k_B T}$, see above). This is well known as the Curie law. In fact, one of the first things one does when one has a new material to investigate is to measure its magnetic susceptibility. If it decreases like $1/T$ at high $T$, one knows for sure that there are some non-interacting (paramagnetic) magnetic impurities in that sample. We will see soon that the high-$T$ results agree with what the quantum theory predicts (as they should).
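The crossover to the Curie law can be checked by comparing the exact $\chi$ with $Nm^2/(3k_BT)$ at small $x = mB/(k_BT)$. A sketch with arbitrary illustrative values ($k_B = 1$):

```python
import numpy as np

# Arbitrary illustrative values (k_B = 1): weak field, high temperature
N, m, B, T = 1000, 1.0, 0.01, 50.0
beta = 1.0 / T
x = beta * m * B          # x = mB/(k_B T) << 1 in this regime

chi_exact = N * m**2 * beta * (1.0 / x**2 - 1.0 / np.sinh(x)**2)
chi_curie = N * m**2 / (3.0 * T)
print(abs(chi_exact / chi_curie - 1.0) < 1e-4)  # True: Curie law at high T
```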


Let me show you one more neat thing. If you look at the expression of the internal energy, you can easily show that we can rewrite it as:

$$U = N k_B T - \langle M_z\rangle B$$

However, we know that $U = \langle H\rangle = \langle \mathrm{K.E.}\rangle - B\langle M_z\rangle$, so it follows that for this particular kinetic energy, $\langle \mathrm{K.E.}\rangle = N k_B T$. Since the average must be the same for each spin (no reason why one would rotate slower or faster on average), it follows that the average kinetic energy per spin must be $k_B T$.

You might have noticed that we got similar results in previous cases. For example, for a simple ideal classical gas, the average kinetic energy per atom was $\frac{3}{2}k_B T$. The difference is that the kinetic energy of the spin is the sum of two quadratic terms (one proportional to $p_\theta^2$, one proportional to $p_\phi^2$), while the kinetic energy of a moving atom has three quadratic terms (one proportional to $p_x^2$, one proportional to $p_y^2$ and one proportional to $p_z^2$). So we may guess that the average of each quadratic term in the energy (per microsystem) is $k_B T/2$. This "guess" also agrees with what we found for 1D classical harmonic oscillators: there are two quadratic terms in the energy of each particle (one proportional to $p_x^2$, one to $u^2$), and indeed we found the average energy per particle to be $k_B T = 2 \cdot k_B T/2$.

One can demonstrate that for classical systems this is indeed true: the expectation value of any term in the Hamiltonian that is quadratic in a generalized momentum or a generalized coordinate is always $k_B T/2$ (per microsystem). If the Hamiltonian of a microsystem is the sum of $g$ quadratic terms, and there are no interactions between microsystems, then we have $U = N g k_B T/2$ – this is called the equipartition theorem and you'll have the pleasant task to prove it yourselves in the next assignment. What the example of paramagnetic spins shows is that if we have terms which do not depend quadratically on some momentum or coordinate (like the potential energy, which is proportional to $\cos\theta$, not $\theta^2$), then we can get very different expectation values for those terms (in this case, something proportional to the Langevin function).
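The equipartition statement is easy to verify numerically for any single quadratic term $aq^2$: its Boltzmann average should be $k_BT/2$ no matter what the stiffness $a$ is. A sketch (the values of $a$ and $T$ are arbitrary; $k_B = 1$):

```python
import numpy as np

# Boltzmann average of a single quadratic term a*q^2 (k_B = 1):
# the result should be T/2 regardless of the stiffness a.
def avg_quadratic(a, T, qmax=50.0, n=200001):
    q = np.linspace(-qmax, qmax, n)
    w = np.exp(-a * q**2 / T)          # Boltzmann weight
    return np.sum(a * q**2 * w) / np.sum(w)

for a in (0.1, 1.0, 7.3):
    assert abs(avg_quadratic(a, T=2.0) - 1.0) < 1e-6  # k_B T/2 = 1.0

print("each quadratic term averages to k_B T / 2")
```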

Before looking at some quantum examples, let us discuss why the microcanonical and the canonical predictions are identical. It is not obvious that this should be so, since one might expect that relationships between the macroscopic variables might depend on whether the system is isolated (or not) from the rest of the universe.

3 Fluctuations

The main difference between an isolated and a closed system has to do with their energies. For an isolated system, we know that the system is only allowed to have energies in a narrow interval $[E_{mc}, E_{mc} + \delta E]$. If we plot the density of probability to find the isolated system to have some energy $E$, it therefore looks as shown in Fig. 5: it is zero everywhere except in the allowed interval, where it is a constant (any allowed microstate is equally likely). Note that I use $E_{mc}$ to represent the allowed value for the energy of the isolated system, since $E$ can be any energy.

Fig 5. Density of probability to find an isolated system to have energy $E$: it is constant on $[E_{mc}, E_{mc} + \delta E]$ and zero elsewhere.

On the other hand, since a closed system can exchange energy with the outside, it follows that it


could have any energy whatsoever. Let's find the probability $p_c(E)\delta E$ to find a closed system with energy between some values $E, E + \delta E$. We know the probability $\rho_c(q,p)\frac{dq\,dp}{G_N h^{Nf}}$ to find the closed system in a microstate in the vicinity of $q,p$. So the desired answer must be:

$$p_c(E)\delta E = \int_{E \le H(q,p) \le E+\delta E} \frac{dq\, dp}{G_N h^{Nf}}\, \rho_c(q,p)$$

i.e. we keep contributions only from the microstates which have the desired energy, and sum over the probabilities to be in these microstates.

But $\rho_c(q,p) = \frac{1}{Z}e^{-\beta H(q,p)}$, and for all the microstates contributing to the integral $H(q,p) = E$, so it follows that:

$$p_c(E)\delta E = \frac{1}{Z} e^{-\beta E} \int_{E \le H(q,p) \le E+\delta E} \frac{dq\, dp}{G_N h^{Nf}} = \frac{1}{Z} e^{-\beta E}\, \Omega(E, \delta E, N, \dots)$$

since the phase-space integral is just the multiplicity of the macrostate $E, \delta E, N, \dots$. It follows that:

$$p_c(E) = \frac{1}{Z} e^{-\beta E} g(E)$$

where

$$g(E) = \frac{\Omega(E, \delta E, N, \dots)}{\delta E}$$

is called the density of states, because it is the number of microstates within an energy interval $\delta E$, divided by $\delta E$. We call such quantities densities (for example, particle density is the number of particles in a certain volume, divided by the volume. It's the same here, except we count the number of states within a certain energy interval, divided by the energy interval).

We would like to plot this probability $p_c(E)$ and see how different it looks from the microcanonical one. First we need to figure out how $\Omega(E, \delta E, N, \dots)$ depends on the energy $E$. If you look at all the examples we've investigated, every single time we found that $\Omega(E, \delta E, N, \dots) \sim E^{xN}\delta E$ where $x$ is some number, e.g. $x = 3/2$ for classical ideal gases, $x = 1$ for 1D classical harmonic oscillators, etc. (strictly speaking, we found $E^{xN-1}$, but then we always used $xN - 1 \approx xN$). It follows that $p_c(E) \sim E^{xN} e^{-\beta E}$, i.e. it is the product of a function that increases fast with $E$, and one that decreases fast with $E$ (see Fig. 6). We expect it then to have some maximum somewhere at a finite value, let's call it $E_c$. So far, this looks quite different from $p_{mc}(E)$. However, let us try to be more precise and locate where the maximum $E_c$

is, as well as what is the width of this peak. The maximum comes from asking that:

$$\frac{dp_c(E)}{dE} = 0 = \frac{1}{Z\,\delta E}\left[-\beta\,\Omega + \frac{\partial \Omega}{\partial E}\right] e^{-\beta E} \;\to\; \beta = \frac{1}{\Omega}\frac{\partial \Omega}{\partial E} \;\to\; \frac{1}{T} = k_B \frac{\partial \ln\Omega}{\partial E} = \left.\frac{\partial S}{\partial E}\right|_{E_c}$$

Fig 6. Density of probability to find a closed system to have energy $E$: $p_c(E) \sim E^{xN} e^{-\beta E}$, sharply peaked at $E_c$ with width $\delta E$.

We found that the maximum is at a value $E_c$ where $\frac{1}{T} = \frac{\partial S}{\partial E}\big|_{E_c}$. However, for the isolated system we know that $\frac{1}{T_{mc}} = \frac{\partial S}{\partial E}\big|_{E_{mc}}$ – that's how we find the temperature of an isolated system. Comparing the two, it follows that if conditions are arranged such that $T = T_{mc}$, i.e. the closed system is kept at the temperature equal to that of the isolated system, then $E_c = E_{mc}$, i.e. the peak in $p_c(E)$ is at


the same value where $p_{mc}(E)$ is finite. So at least the maxima of these probabilities have the same location, if the temperatures are equal.

How about the width? Well, we've already showed that the standard deviation of the energy, relative to its mean, is always:

$$\delta E \sim \frac{\sqrt{\langle H^2\rangle - \langle H\rangle^2}}{\langle H\rangle} = \frac{\sqrt{k_B T^2 C_V}}{U}$$

However, both $U$ and $C_V$ are extensive quantities, $U \sim N$, $C_V \sim N$ (e.g., for a classical ideal gas of simple atoms we had $U = 3Nk_BT/2$, $C_V = 3Nk_B/2$), so it follows that the relative width of the peak is always:

$$\delta E \sim \frac{1}{\sqrt{N}} \to 0$$

for thermodynamic systems with very large $N$, such as we consider. So in fact $p_c(E)$ also has a very narrow and sharp peak at $E_c = E_{mc}$, just like the microcanonical probability $p_{mc}$. This explains why both ensembles give the same predictions. However, note that this only holds for large $N$. As we've discussed when looking at the statistical meaning of the entropy, for such large systems the most probable state (energy $E_c$, in this case) becomes so overwhelmingly more likely than any other state, that the probability to have the system in any other state is virtually zero. So even though the closed system can in principle have any energy, in fact its energy will stay put at its average value $U$, and the fluctuations about this value are extremely small, $\delta E \sim \frac{1}{\sqrt{N}} \to 0$. As a result, this looks very similar to (and will behave the same as) an isolated system, where the energy is fixed at a desired value $E_{mc}$. However, for small systems the fluctuations in the energy of a closed system can be substantial, and then it will make a difference if the system is closed or isolated.
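For a concrete feel of how small these fluctuations are, take the monatomic classical ideal gas, where $U = \frac{3}{2}Nk_BT$ and $C_V = \frac{3}{2}Nk_B$, so the relative width is exactly $\sqrt{2/(3N)}$. A sketch ($k_B = 1$):

```python
import numpy as np

# Monatomic classical ideal gas: U = (3/2) N k_B T, C_V = (3/2) N k_B
# (units with k_B = 1); relative width sqrt(k_B T^2 C_V)/U = sqrt(2/(3N))
def relative_width(N, T=1.0):
    U = 1.5 * N * T
    Cv = 1.5 * N
    return np.sqrt(T**2 * Cv) / U

for N in (100, 10_000, 1_000_000):
    assert abs(relative_width(N) - np.sqrt(2.0 / (3.0 * N))) < 1e-12

print(relative_width(1e22))  # ~ 8e-12: utterly negligible for macroscopic N
```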

The good news for us is that for large, thermodynamic systems, we can use whichever ensemble is most convenient (easiest calculation) and get the same results (we'll see later that grandcanonical ensembles also give the same relations between the macroscopic variables, and it will be for the same reasons: in principle, for those ensembles the number of particles also varies and can be anything. But in reality, one can show that with overwhelming probability the actual value is fixed to the most probable value and fluctuations around it are very small, so the actual conditions are very similar to those of closed or isolated systems).

There is one more interesting property that holds for both quantum and classical canonical systems. Let me mention it here briefly, without demonstration (see textbook for it).

4 Minimization principle

Remember that for an isolated system in equilibrium, the entropy (which is the important thermodynamic potential in this case, from which we can derive everything else) has a maximum.

Well, interestingly enough, one can show that for a closed system in equilibrium, the free energy

(which is the important thermodynamic potential in this case, from which we can derive everything else) has a minimum. As I said, I will not prove this, but the interesting thing is that equilibrium corresponds to an extremum of the appropriate thermodynamic potential.

In fact, from this we can infer quite a lot. Remember that $F = U - TS$, and that $S$ and $T$ are always positive quantities. Now, in equilibrium this quantity is minimized, as I just said. At low $T$, the way to minimize this is to make $U$ as small as possible (if $T$ is small, we expect that the term $TS$ is less important than the term $U$). Since $U = \langle H\rangle$, minimizing it means going towards the ground-state (the state of minimum possible energy). Indeed, in all cases we studied we found that


at low $T$, the equilibrium state looks more and more like the ground-state of the system. Usually this state is non-degenerate (for a quantum system), so it has multiplicity $\Omega = 1$ and so $S = k_B \ln\Omega = 0$. We call such a state "ordered", because we know precisely what each microsystem is doing. For example, for spins, each spin is in the state with maximum projection all the time, which is a very orderly state if you think about it.

However, at high $T$, the term $TS$ becomes important and we can minimize $F$ by maximizing $S$ (in that case, subtracting $TS$ from $U$ will give the smallest possible value for $F$). But maximizing $S$ means going to a macrostate with the largest possible multiplicity, i.e. the most "disordered" state possible, where by "disorder" we mean that many choices are available to each microsystem and they go through all of them, giving the large multiplicity. For example, for spins, at high $T$ we saw that they point with some probability in any direction, and the larger $T$ is, the more equally likely all directions become (the average was going towards zero). So if we take a snapshot of the spins at high $T$, at any moment they'll be pointing every which way and changing their orientations from snapshot to snapshot ... which is a very disordered state.

Based on these general ideas, we can now understand why at low $T$ matter goes into a crystal (solid phase) and as we raise the temperature it has transitions to liquid and then gas phases: the solid is the most orderly of them, since each atom is pinned in some position (it can oscillate about it, but it will be at all times in the expected neighborhood). A liquid is more disorderly, since we don't know anymore where each atom is; however, the average distance between neighboring atoms is not that different from what it was in a solid, so there is still some remnant of order. In a gas, however, any atom can be anywhere, and two "neighboring" atoms could be anywhere from in direct contact (when they collide) to extremely large distances apart, so this is an extremely disordered state. Another example is that of interacting ferromagnetic spins (we will not discuss this problem in this course, but I assume you may have heard of these ideas). In this case, at low $T$ the system is "ferromagnetic", with all spins aligned with each other. Again, that is a very orderly state, but now it comes about because of interactions between spins (there is no externally applied field). At high temperatures, we expect a transition to a "disordered" state, i.e. one where each spin points every which way – this is called a "paramagnetic state". We can use stat mech very well to study this transition; the only complication is that now we have to deal with interactions, and in this course we only focus on non-interacting systems. If you take a grad-level course on stat mech, it's guaranteed that this example would be one of the very first interacting systems you would study.

Many other general trends of evolution with $T$ can be understood based on this general idea, of going from the most ordered to the most disordered possible state as $T$ increases. If you follow graduate studies in physics, you'll hear a lot more about this.

We are not quite done with classical canonical systems: a bit later on I will show you, for fun, how we deal with a weakly-interacting classical gas, and what is the difference from the ideal classical gas. That will give you a taste of what "real" calculations with interactions added in are like. However, let us first quickly consider some quantum non-interacting canonical ensembles, just so we see how these sorts of problems work as well.

5 Quantum canonical ensembles

Let me first say that we will only be able to deal with quantum problems where the microsystems are distinguishable by position, i.e. they are locked in a crystal and cannot exchange positions (e.g., quantum harmonic oscillators, quantum spins, etc). To treat quantum gases, where microsystems


can interchange their locations, we will need to use grand-canonical ensembles.

Note that we could still treat "mixed" problems, where some degrees of freedom are treated as quantum while the overall motion is treated as classical (for example, a gas of atoms with classical translation, but with quantum spins; stay tuned for assignments). Such approaches make sense if we are at such temperatures that the kinetic translational energy can be treated as classical, but the spin degree of freedom (for example) is still quantum. But we still won't be able to treat quantum translational motion in this approach.

Let’s get started. We know that here microstates correspond to eigenstates of the total Hamilto-nian. For non-interacting systems, these eigenstates are characterized by a set of quantum numbersα1, ..., αN , where α1 are the quantum numbers characterizing the eigenenergy of the first microsys-tem, etc. For example, for a chain of quantum harmonic oscillators, the microstate is characterizedby the positive integers n1, ..., nN , with the total energy being

$$E_{n_1,\dots,n_N} = \sum_{i=1}^{N} \hbar\omega\left(n_i + \frac{1}{2}\right)$$

Similarly, for a chain of spins-$\frac{1}{2}$, the quantum state of each spin is characterized by the quantum number $s_z = \pm\frac{1}{2}$, so that the energy of a spin is $e_{s_z} = -g\mu_B B s_z$ (see prev. set of notes). As a result, the microstate of the entire system is characterized by $s_{z,1}, \dots, s_{z,N}$ and $E_{s_{z,1},\dots,s_{z,N}} = -\sum_{i=1}^{N} g\mu_B B s_{z,i}$.

Etc. Let me go back to calling the quantum numbers (one or more) per microsystem $\alpha$, so that the microstate is described by $\alpha_1, \dots, \alpha_N$ and has the total energy $E_{\alpha_1,\dots,\alpha_N}$.

Then, as discussed, the probability to be in the microstate $\alpha_1, \dots, \alpha_N$ is:

$$p_{\alpha_1,\dots,\alpha_N} = \frac{1}{Z} e^{-\beta E_{\alpha_1,\dots,\alpha_N}}$$

where the canonical partition function is obtained from the normalization condition:

$$\sum_{\alpha_1,\dots,\alpha_N} p_{\alpha_1,\dots,\alpha_N} = 1 \;\to\; Z = \sum_{\alpha_1,\dots,\alpha_N} e^{-\beta E_{\alpha_1,\dots,\alpha_N}}$$

Note that here we also have a factorization theorem, which holds for non-interacting systems for which $E_{\alpha_1,\dots,\alpha_N} = \sum_{i=1}^{N} e_{\alpha_i}$, where $e_{\alpha_i}$ is the contribution of the $i$th microsystem. In this case, we can rewrite:

$$Z = \sum_{\alpha_1} e^{-\beta e_{\alpha_1}} \cdots \sum_{\alpha_N} e^{-\beta e_{\alpha_N}} = z^N$$

where

$$z = \sum_{\alpha} e^{-\beta e_\alpha}$$

is the single-particle canonical partition function. The sum is over all possible values of the quantum number(s) $\alpha$ of a single particle. Once we have $z$ and $Z = z^N$, then we use $F = -k_B T \ln Z = -N k_B T \ln z$ and we're on our way.

Let’s see first the examples we’ve done using microcanonical ensembles. First, for spins 1/2 (ortwo level systems) we have:

$$z = \sum_{s_z = -\frac{1}{2}}^{\frac{1}{2}} e^{\beta g\mu_B B s_z} = e^{-\frac{\beta g\mu_B B}{2}} + e^{\frac{\beta g\mu_B B}{2}} = 2\cosh\left(\frac{\beta g\mu_B B}{2}\right)$$
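This verification is easy to do numerically: compute $U = -\partial \ln Z/\partial\beta$ by a finite difference and compare with the closed form. A sketch with arbitrary values of $N$, $\epsilon_0 = g\mu_B B/2$ and $\beta$:

```python
import numpy as np

# Arbitrary values for the check: eps0 = g*mu_B*B/2, units with k_B = 1
N, eps0, beta = 50, 0.9, 1.7

def ln_Z(b):
    # ln Z = N ln z with z = 2 cosh(beta * eps0)
    return N * np.log(2.0 * np.cosh(b * eps0))

h = 1e-6
U_numeric = -(ln_Z(beta + h) - ln_Z(beta - h)) / (2.0 * h)
U_exact = -N * eps0 * np.tanh(beta * eps0)
print(abs(U_numeric - U_exact) < 1e-6)  # True
```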


You should now verify that the resulting $Z$ and $F$ give the same results as we obtained before, for example $U = -N\frac{g\mu_B B}{2}\tanh\left(\beta\frac{g\mu_B B}{2}\right)$ (remember that we called $\epsilon_0 = \frac{g\mu_B B}{2}$). You could also calculate average magnetizations, etc., but we'll do these for a more general problem very soon. Let us also look at quantum harmonic oscillators. In this case:

$$z = \sum_{n=0}^{\infty} e^{-\beta\hbar\omega\left(n + \frac{1}{2}\right)} = e^{-\frac{\beta\hbar\omega}{2}} \sum_{n=0}^{\infty} \left[e^{-\beta\hbar\omega}\right]^n$$

This is a geometric series, and I hope you remember that

$$\sum_{n=0}^{N} x^n = \frac{1 - x^{N+1}}{1 - x}$$

and so, for any $|x| < 1$, we have:

$$\sum_{n=0}^{\infty} x^n = \frac{1}{1 - x}$$

For us, $x = e^{-\beta\hbar\omega} < 1$ indeed, so we find:

$$z = \frac{e^{-\frac{\beta\hbar\omega}{2}}}{1 - e^{-\beta\hbar\omega}} = \frac{1}{2\sinh\left(\frac{\beta\hbar\omega}{2}\right)}$$

in nicer form. We should again check that using $Z = z^N$, $F = -k_B T \ln Z$, $U = -\frac{\partial}{\partial\beta}\ln Z$, etc., we recover all the relationships we found in the microcanonical ensemble. I hope you agree that these calculations are a lot easier – it took quite a bit of ingenuity to figure out the multiplicity of the macrostate (especially for quantum harmonic oscillators), whereas these calculations are just straightforward, we only need to do some simple sums! They're actually even simpler than the classical examples, where we had to do some integrals.
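Both checks for the oscillator (the closed form of $z$ and the resulting $U$) can be done numerically; a sketch with $\hbar\omega = 1$ and $k_B = 1$ (arbitrary choices):

```python
import numpy as np

# One oscillator; hbar*omega = 1 and k_B = 1 are arbitrary choices
hw = 1.0

def z_sum(beta, nmax=500):
    # direct sum over the lowest nmax levels (the tail is negligible)
    n = np.arange(nmax)
    return np.sum(np.exp(-beta * hw * (n + 0.5)))

def z_closed(beta):
    return 1.0 / (2.0 * np.sinh(beta * hw / 2.0))

beta = 0.8
assert abs(z_sum(beta) - z_closed(beta)) < 1e-12

# U = -d ln z/d beta should give (hbar*omega/2) coth(beta*hbar*omega/2)
h = 1e-6
U_numeric = -(np.log(z_closed(beta + h)) - np.log(z_closed(beta - h))) / (2.0 * h)
U_exact = (hw / 2.0) / np.tanh(beta * hw / 2.0)
print(abs(U_numeric - U_exact) < 1e-7)  # True
```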

5.1 Quantum paramagnetic spins

Let us consider $N$ non-interacting quantum spins (locked in a lattice). The spins have magnitude $S$, where $S$ can be any integer or half-integer (so we will deal with all possible values at once), and are placed in an external uniform magnetic field $\vec{B} = B\vec{e}_z$. As discussed, the Hamiltonian for a single spin is (the Zeeman interaction):

$$h = -\frac{g\mu_B}{\hbar}\vec{S}\cdot\vec{B} = -\frac{g\mu_B}{\hbar} S_z B$$

where $S_z$ is the $z$-component of the spin operator of that particular spin. This operator has the eigenvalues $-S\hbar, -(S-1)\hbar, \dots, (S-1)\hbar, S\hbar$, i.e. $2S+1$ different allowed projections. It follows that the single-spin eigenvalues are:

$$e_m = -g\mu_B B m$$

where $m = -S, -S+1, \dots, S-1, S$ can take $2S+1$ values. If the spins are non-interacting, we can use the factorization theorem $Z = z^N$, where:

$$z(T) = \sum_{m=-S}^{S} e^{-\beta e_m} = \sum_{m=-S}^{S} e^{\beta g\mu_B B m}$$


To simplify notation, let me call $x = e^{\beta g\mu_B B}$ for a bit, so that:

$$z = \sum_{m=-S}^{S} x^m = x^{-S} + \dots + x^S = x^{-S}\left[1 + x + \dots + x^{2S}\right] = x^{-S}\,\frac{x^{2S+1} - 1}{x - 1} = \frac{x^{S+\frac{1}{2}} - x^{-S-\frac{1}{2}}}{x^{\frac{1}{2}} - x^{-\frac{1}{2}}}$$

since whether $S$ is integer or half-integer, $2S$ is certainly an integer and we can use the usual geometric series result.

Going back to the original notation, we have:

$$z(T) = \frac{e^{\beta g\mu_B B\left(S+\frac{1}{2}\right)} - e^{-\beta g\mu_B B\left(S+\frac{1}{2}\right)}}{e^{\frac{\beta g\mu_B B}{2}} - e^{-\frac{\beta g\mu_B B}{2}}} = \frac{\sinh\left[\beta g\mu_B B\left(S + \frac{1}{2}\right)\right]}{\sinh\left[\frac{\beta g\mu_B B}{2}\right]}$$

Then, $F = -N k_B T \ln z = -N k_B T \ln\frac{\sinh\left[\beta g\mu_B B\left(S+\frac{1}{2}\right)\right]}{\sinh\left[\frac{\beta g\mu_B B}{2}\right]}$. Since $dF = -SdT + \mu dN$, we can find the entropy $S = -\frac{\partial F}{\partial T}$. You should verify that $S \to 0$ when $T \to 0$, as the 3rd law of thermodynamics requires (this is a quantum system, so it should give a correct description at low $T$).

The internal energy is:

$$U = -\frac{\partial}{\partial\beta}\ln Z = \dots = -N g\mu_B B S\left[\left(1 + \frac{1}{2S}\right)\coth\left[\beta g\mu_B B\left(S + \frac{1}{2}\right)\right] - \frac{1}{2S}\coth\left(\frac{\beta g\mu_B B}{2}\right)\right]$$

after simplifying the derivative. Let us introduce the so-called Brillouin function

$$B_S(x) = \left(1 + \frac{1}{2S}\right)\coth\left[x\left(S + \frac{1}{2}\right)\right] - \frac{1}{2S}\coth\frac{x}{2}$$

then:

$$U = -N g\mu_B B S\, B_S(\beta g\mu_B B)$$

But:

$$U = \langle H\rangle = -\frac{g\mu_B}{\hbar} B \sum_{i=1}^{N} \langle \hat{S}_{i,z}\rangle$$

and since we expect that the ensemble average of each spin should be the same, we can conclude that we must have:

$$\langle \hat{S}_{1,z}\rangle = \dots = \langle \hat{S}_{N,z}\rangle = S\hbar\, B_S(\beta g\mu_B B)$$

(note that this has the right units: $S$ and $B_S(\beta g\mu_B B)$ are dimensionless numbers, so the units of the spin, which is an angular momentum, are indeed $\hbar$).

Before looking at how this varies with temperature, let us calculate this average from the definition of the ensemble average. For example:

$$\langle \hat{S}_{1,z}\rangle = \sum_{m_1,\dots,m_N} \hbar m_1\, p_{m_1,\dots,m_N} = \frac{\hbar}{z} \sum_{m_1=-S}^{S} m_1\, e^{\beta g\mu_B B m_1}$$

(the sums over $m_2, \dots, m_N$ each equal $z$, as before). Note that this means that the probability to find spin 1 to have projection $\hbar m_1$ (irrespective of what the other spins are doing) is:

$$p_{m_1} = \sum_{m_2,\dots,m_N} p_{m_1,\dots,m_N} = \frac{1}{z} e^{-\beta e_{m_1}} = \frac{1}{z} e^{\beta g\mu_B B m_1}$$


in agreement with the equation above, showing that the most likely state is the one with the highest projection $+S\hbar$, while the least likely state is the one with $-S\hbar$. This is reasonable. To carry out the sum, we again use our trick:

$$m_1\, e^{\beta g\mu_B B m_1} = \frac{1}{\beta g\mu_B}\frac{\partial}{\partial B}\, e^{\beta g\mu_B B m_1}$$

so that:

$$\langle \hat{S}_{1,z}\rangle = \hbar\,\frac{1}{z}\,\frac{1}{\beta g\mu_B}\frac{\partial z}{\partial B} = \hbar\,\frac{1}{\beta g\mu_B}\frac{\partial \ln z}{\partial B} = \dots = S\hbar\, B_S(\beta g\mu_B B)$$

after the derivative is calculated. The total magnetization is defined as:

$$\langle M_z\rangle = \frac{g\mu_B}{\hbar}\sum_{i=1}^{N}\langle \hat{S}_{i,z}\rangle$$

(the magnetic moment operator is defined as $\hat{\vec{m}} = \frac{g\mu_B}{\hbar}\vec{S}$, i.e. it is proportional to the spin operator, with the constants in front so that units are correct). From what we discussed, we see that the average magnetization is:

$$\langle M_z\rangle = N g\mu_B S\, B_S(\beta g\mu_B B)$$

(and $U = -B\langle M_z\rangle$, similar to the potential energy of classical spins). The Brillouin function $B_S(x)$ looks somewhat similar to what the Langevin function for classical spins looked like, though there are some differences. It also goes to 1 as $x \to \infty$; however, at small $x$ it is $B_S(x) \approx (S+1)x/3 + \dots$ (the Langevin function was $x/3$ at small $x$). This means that at low temperatures $k_B T \ll g\mu_B B$, where $x \gg 1$, we have:

$$\langle M_z\rangle = N g\mu_B S$$

Fig 7. The Brillouin function $B_S(x)$: at small $x$, $B_S(x) \approx (S+1)x/3$, while for $x \to \infty$, $B_S(x) \to 1$.

which is indeed the maximum possible magnetization, when all spins have the projection $m = S$. This makes sense, since we know that at low $T$ the system should go towards its ground-state, and in the ground-state all spins are fully polarized by the magnetic field.
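The stated limits of $B_S(x)$, and the fact that for $S = 1/2$ it reduces to the two-level result $\tanh(x/2)$, are easy to confirm numerically. A sketch using the convention of these notes:

```python
import numpy as np

# Brillouin function in the convention of these notes:
#   B_S(x) = (1 + 1/(2S)) coth[x(S + 1/2)] - (1/(2S)) coth(x/2)
def brillouin(S, x):
    return ((1.0 + 1.0 / (2 * S)) / np.tanh(x * (S + 0.5))
            - (1.0 / (2 * S)) / np.tanh(x / 2.0))

for S in (0.5, 1.0, 2.5, 7.0):
    assert abs(brillouin(S, 1e-4) / 1e-4 - (S + 1.0) / 3.0) < 1e-4  # slope (S+1)/3
    assert abs(brillouin(S, 30.0) - 1.0) < 1e-10                    # saturation

# For S = 1/2 it reduces to tanh(x/2), the two-level (spin-1/2) result
x = 0.9
print(abs(brillouin(0.5, x) - np.tanh(x / 2.0)) < 1e-12)  # True
```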

At high temperatures $k_B T \gg g\mu_B B$, where our $x \ll 1$, we have:

$$\langle M_z\rangle = N g\mu_B S\,\frac{g\mu_B B(S+1)}{3 k_B T} = N\,\frac{S(S+1)(g\mu_B)^2}{3 k_B T}\, B$$

It follows that the susceptibility at high temperatures is:

$$\chi = \frac{\partial\langle M_z\rangle}{\partial B} = N\,\frac{S(S+1)(g\mu_B)^2}{3 k_B T}$$

i.e. it decreases like $1/T$, as predicted by the Curie law. In fact, the agreement with the classical prediction is better than this. Remember that the classical susceptibility at high $T$ was found to be:

$$\chi = \frac{N m^2}{3 k_B T}$$

where $m$ was the magnetic moment of the spin. Now, in the quantum version we expect to replace $m^2 = \vec{m}^2 \to \hat{\vec{m}}^2$, i.e. we replace numbers by operators. As just discussed, $\hat{\vec{m}} = \frac{g\mu_B}{\hbar}\vec{S} \to$


$\hat{\vec{m}}^2 = \left(\frac{g\mu_B}{\hbar}\right)^2 \vec{S}^2$. However, the operator $\vec{S}^2 = \hbar^2 S(S+1)$ always (this is its only eigenvalue), so we

find that $m^2 \to S(S+1)(g\mu_B)^2$, which is exactly what we got in the quantum result. So indeed the classical and the quantum predictions agree at high temperatures if we use the proper expression for $m^2$. The Curie law, when observed experimentally, tells us not only that there are magnetic impurities in the sample, but also their number and their spin $S$. Knowing the spin, we can narrow down the search for what impurities are responsible for this, because the spin of an atom depends on its type and on how many electrons it has occupying its last shell (remember Hund's rule?).

Ok, so now we know all there is to know about non-interacting spins, whether classical or quantum. I hope you appreciate that the canonical calculation was quite simple for any spin $S$, whereas we could only do the microcanonical one for spins $S = 1/2$. One might again wonder why we get answers for canonical systems in agreement with the ones for microcanonical systems; the answer (as for the classical problems) is that this is due to the fact that we deal with very large systems, $N \sim 10^{23}$, and as a result energy fluctuations about the average are extraordinarily small. The demonstration of this proceeds very similarly to the one we had for the classical systems, and I won't redo it.

As I said in the beginning, in the canonical formalism we only treat quantum problems where the microsystems are "distinguishable" through being located in different regions of space (locked in a crystal). Let us understand what goes wrong if we're trying to treat an "ideal quantum gas" problem, where the atoms can move around and interchange positions. After all, we know the spectrum of that problem, so why can't we just calculate the canonical partition function and be done?

To see why, consider $N$ non-interacting atoms of mass $m$ inside a cubic box of volume $V = L^3$. The potential is zero inside, and infinite outside the box. We know how to solve the Schrödinger problem for a single particle with Hamiltonian

$$h = -\frac{\hbar^2}{2m}\left(\frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} + \frac{\partial^2}{\partial z^2}\right)$$

and find its spectrum:

$$e_{n_x,n_y,n_z} = \frac{h^2}{8mL^2}\left(n_x^2 + n_y^2 + n_z^2\right)$$

where $n_x, n_y, n_z = 1, 2, \dots$ are strictly positive integers. So here, the quantum numbers $\alpha = (n_x, n_y, n_z)$ characterizing the state of a particle are 3 natural numbers. If there was a single atom in the system, it could occupy any eigenstate $\phi_\alpha(\vec{r})$. How about if there are two atoms? Well, we might think that if one is in state $\alpha$ and one is in

state $\beta$, then the total eigenfunction must be $\phi_\alpha(\vec{r}_1)\phi_\beta(\vec{r}_2)$. However, this is not quite true. If you remember what you learned about identical particles in quantum mechanics, the wavefunction for two identical particles must be either symmetric (for so-called bosonic particles) or antisymmetric (for so-called fermionic particles) when we exchange the two particles, so in fact

$$\phi(\vec{r}_1, \vec{r}_2) = \phi_\alpha(\vec{r}_1)\phi_\beta(\vec{r}_2) \pm \phi_\beta(\vec{r}_1)\phi_\alpha(\vec{r}_2)$$

This shows that for fermions we cannot place both atoms in the same level $\alpha = \beta$, since in this case $\phi(\vec{r}_1, \vec{r}_2) = 0$. This is called Pauli's principle: we cannot have more than one fermion occupying a single-particle level (also called orbital), although we can have any number of bosons occupying the same level.

For $N$ atoms, we then have to specify how many atoms are occupying which levels – this will define the microstate. The orderly way to do this is to introduce occupation numbers: let $n_\alpha$ be the


number of particles occupying state α. If there is no particle in that state, then n_α = 0. For fermions, we must have n_α ≤ 1. For bosons, n_α can have any value 0, 1, 2, .... Note that we make no attempt to specify which particles are occupying which level, we only say how many are in a given level. This is consistent with the fact that they are identical and we cannot know which is which. In fact, we know that the wavefunction will be the fully symmetrized (or antisymmetrized) combination of the corresponding single-particle orbitals, so in fact any particle is, with the appropriate probability, in any occupied level.
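As a bookkeeping illustration (the orbital labels, toy values, and helper name are ours), a microstate can be stored as a map from orbital labels to occupation numbers, with Pauli’s principle enforced for fermions:

```python
# a microstate: orbital label (nx, ny, nz) -> occupation number n_alpha
micro = {(1, 1, 1): 1, (2, 1, 1): 1, (1, 2, 1): 0, (1, 1, 2): 1}

def is_valid_microstate(micro, N, fermions=True):
    # fermions: Pauli's principle forbids n_alpha > 1
    if fermions and any(n > 1 for n in micro.values()):
        return False
    # total occupation must equal the particle number N
    return sum(micro.values()) == N

print(is_valid_microstate(micro, N=3))   # True: three fermions, all n_alpha <= 1
print(is_valid_microstate(micro, N=2))   # False: wrong particle number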

Then, each microstate is characterized by the set of occupation numbers {n_α} – there is one value for each possible single-particle level, telling us whether that orbital is empty or occupied in this particular microstate. Of course, since there are N particles, we must have Σ_α n_α = N. Because the particles are non-interacting, the total energy in this microstate is simply:

E_{n_α} = Σ_α n_α e_α

i.e., we go through each orbital and multiply its energy by the number of particles occupying it. Since the canonical partition function is the sum over all microstates of e^{−β E_microstate}, we then have:

Z = Σ_{n_α1} Σ_{n_α2} ... e^{−β Σ_α n_α e_α}

The sums run over all possible orbitals (could be an infinite number of sums if there is an infinite number of orbitals, as is the case for the quantum ideal gas), and each sum runs over all allowed values of the occupation numbers: for fermions, n_α = 0, 1, while for bosons n_α = 0, 1, 2, ..., however with the restriction that Σ_α n_α = N. It is this restriction that makes the calculation impossible. Because of it, the sums cannot be carried out independently of each other; instead we must always make sure that the restriction is obeyed. We simply cannot calculate this. However, if we had no restriction, then the sums would factorize into independent sums over each occupation number, and we could do them very easily. This is why we need to go to grandcanonical systems, where the number of particles is not fixed anymore, so this troublesome restriction is lifted and we can finally study quantum ideal gases.
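To see the difference concretely, here is a small brute-force sketch for fermions in a toy system of four orbitals (the energies are made up for illustration): the unconstrained sum factorizes into a product of independent sums, one per orbital, while the canonical sum with Σ_α n_α = N fixed admits no such factorization and must be enumerated configuration by configuration:

```python
from itertools import product as configs
from math import exp, prod

eps = [0.0, 1.0, 2.5, 4.0]   # toy single-particle energies e_alpha (fermions)
beta, N = 1.0, 2

def boltzmann(ns):
    # e^{-beta * sum_alpha n_alpha e_alpha} for occupation numbers ns
    return exp(-beta * sum(n * e for n, e in zip(ns, eps)))

# canonical: only configurations with sum(n_alpha) == N contribute
Z_canonical = sum(boltzmann(ns) for ns in configs((0, 1), repeat=len(eps))
                  if sum(ns) == N)

# unconstrained: every configuration contributes ...
Z_free = sum(boltzmann(ns) for ns in configs((0, 1), repeat=len(eps)))
# ... and the sum factorizes into a product over orbitals
Z_factor = prod(1.0 + exp(-beta * e) for e in eps)

print(Z_canonical, Z_free, Z_factor)  # Z_free equals Z_factor
```

Here brute-force enumeration is possible only because there are four orbitals; for the quantum ideal gas, with infinitely many orbitals, it is the factorized (unconstrained) form that we will exploit in the grandcanonical ensemble.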

We do this next!
