Oono - StatMech Primer


Description: A short introduction to statistical mechanics at the undergraduate level (a transcript of Oono - StatMech Primer).


Primer to

“Introduction to Equilibrium Statistical Mechanics”

Oct. 2011 version¹

Yoshi Oono, [email protected] (3111ESB), Physics and IGB, UIUC

These notes are for those who have not attended any undergraduate statistical thermodynamics courses except for very rudimentary courses at the 200 level; this is essentially a set of notes for a one-semester undergraduate statistical mechanics course. The notes may be used as your basic knowledge checklist; simply scan the bold-face titles of entries and the index (hyperreferenced). The IESM is a critical introduction to equilibrium statistical mechanics, but this primer is rather conventional, so some critical comments are in the footnotes.²

¹Errors and typos in version 0 (2004) were fixed thanks to Bo Liu.
²Chapter 2 covers standard elementary topics, but the author of this memo always suspects that ideal quantum gases are excessively discussed in elementary courses because it is easy to compose elementary but not-so-trivial exam questions. They are important, but other fascinating topics are squeezed out because of them. Therefore, those who do not wish to go into solid-state or low-temperature physics may browse through ideal quantum-gas-related topics (Sections 2.12-2.14).


Contents

1 Thermodynamics Primer
   1.1 Introduction
   1.2 Zeroth law of thermodynamics
   1.3 First law of thermodynamics
   1.4 Fourth law of thermodynamics
   1.5 Second law of thermodynamics
   1.6 Clausius’ inequality
   1.7 Various thermodynamic potentials
   1.8 Manipulation of thermodynamic formulas
   1.9 Consequences of stability of equilibrium states
   1.10 Ideal rubber
   1.11 Third law of thermodynamics

2 Statistical Mechanics Primer
   2.1 Basic hypothesis of equilibrium statistical mechanics
   2.2 Boltzmann’s principle
   2.3 Equilibrium at constant temperature
   2.4 Simple systems
   2.5 Classical statistical mechanics
   2.6 Heat capacity of solid
   2.7 Classical ideal gas
   2.8 Open systems
   2.9 Ideal particle systems — quantum statistics
   2.10 Free fermion gas
   2.11 Free bosons and Bose-Einstein condensation
   2.12 Phonons and photons
   2.13 Phase coexistence and phase rule
   2.14 Phase transition


Chapter 1

Thermodynamics Primer

1.1 Introduction

1.1.1 Why do we start with thermodynamics?
When a macroscopic object is isolated and left alone for a long time, it reaches an equilibrium state. The state can be described by a set of macroscopic quantities such as temperature, pressure, volume, etc., and these quantities obey thermodynamics. Thermodynamics summarizes our empirical knowledge about macroscopic objects in equilibrium. Statistical mechanics tries to elucidate thermodynamics in terms of the statistical behavior of systems consisting of many objects obeying mechanics. However, its theoretical framework cannot be derived from mechanics. Its justification comes from its consistency with empirical facts (thermodynamics).

We should not forget that the fundamental framework of statistical mechanics was established by Gibbs well before the so-called quantum revolution. Gibbs constructed a framework that is consistent with thermodynamics. Thermodynamics not only survived this revolution, but it is fair to say that it was an important guiding principle for making a sound theoretical framework. Statistical mechanics almost survived the revolution, which is not surprising because Gibbs made statistical mechanics rely heavily on thermodynamics. Do not forget that Gibbs was a foremost expert in thermodynamics. We can easily guess that statistical mechanics is largely independent of the actual mechanics of the microscopic world. Furthermore, if we accept an obvious fact that the ultimate judge of physics is empirical results, we could say that thermodynamics is more fundamental than the microscopic mechanical description of macroscopic objects, which is beyond our direct experimental confirmation.

It is a prejudice that a more microscopic description is more fundamental. Therefore, we start with thermodynamics. After discussing elementary thermodynamics, we proceed to statistical mechanics. Rudiments of probability and combinatorics will be given in due course.

1.1.2 Macroscopic objects and equilibrium state
Our empirical facts about macroscopic objects in equilibrium are usually summarized in the five laws (“axioms”¹) of thermodynamics. In standard approaches to thermodynamics we do not explicitly define the word “macroscopic” nor “equilibrium.” In standard thermodynamics, these words are implicitly defined through the fundamental laws, just as points and lines are in Euclidean geometry.

However, their intuitive meaning is as follows. We say an object is macroscopic if its halves are again macroscopic. This implicitly implies that we may ignore the surface effect completely. Usually, a macroscopic object contains 10²⁰ or more molecules, and the range of intermolecular forces extends only over distances of the order of the size of molecules; most molecules do not feel the surface of the object. Hence, the surface effect is almost surely negligible for ordinary macroscopic objects.

A system is said to be in an equilibrium state when all the fast processes have occurred but all the slow processes have not. This characterization of equilibrium may sound very pragmatic, but this is the honest characterization of the word ‘equilibrium.’

1.1.3 Five fundamental laws of equilibrium thermodynamics: Summary
There are five fundamental thermodynamic laws:
[0] The existence of equilibrium states, and temperature (the zeroth law).²
[1] The conservation of energy (the first law).
[2] The variational principle selecting equilibrium states (the second law).
[3] The impossibility to reach the absolute zero temperature (the third law).
[4] Thermodynamic quantities are either extensive or intensive (the fourth law).

The ordering above is not logical, but here we follow the conventional scheme. [4] is often not recognized as a fundamental law, but it is quite important. The terms ‘intensive’ and ‘extensive’ will be explained in 1.4.1.

¹However, these ‘axioms’ are far from sufficient to reconstruct thermodynamics mathematically. Thus, although we informally call them ‘axioms,’ we should regard them as important principles.
²Strictly speaking, it is logically impossible to introduce temperature without other laws, but in this primer we proceed intuitively, just as many rudimentary textbooks do.


1.2 Zeroth law of thermodynamics

1.2.1 Thermal equilibrium of isolated systems
A macroscopic system is said to be isolated if the system has no interaction at all with its surrounding environment. If an isolated system is left undisturbed for a long time, the system would reach a macroscopic state which would not change any more. This final state is called a thermal equilibrium state.

1.2.2 Zeroth law consists of two assertions
The zeroth law consists of two assertions in the conventional exposition:

Th0a For a given isolated system there is a thermal equilibrium state.

There is a special way of making contact between two systems called thermal contact. Thermal contact is a contact through a special wall which does not allow the systems to exchange work, matter, or any systematic macroscopic interaction (such as electromagnetic interactions).

If two systems A and B are in thermal contact and are in equilibrium as a compound system, we say A and B are in thermal equilibrium.

Th0b If the systems A and B are in thermal equilibrium, and so are the systems B and C, then the systems A and C are in thermal equilibrium. That is, the thermal equilibrium relation is an equivalence relation.

The second assertion implies the existence of a scalar quantity called temperature (or more precisely, an empirical temperature): there is a quantity called temperature which takes identical values for two systems in thermal equilibrium. Here, we do not mathematically demonstrate this, but this should not be counterintuitive. Notice, however, that up to this point the introduced temperature does not imply that hotter objects have higher temperatures. The definition of temperature simply tells us that hotter and colder objects have different temperatures.

1.3 First law of thermodynamics

1.3.1 Thermodynamic variables and thermodynamic space
Empirically, it is known that equilibrium states of macroscopic (and spatially homogeneous) objects can be macroscopically uniquely specified by a few variables (called thermodynamic variables) such as temperature, volume, etc. For example, an equilibrium state of a simple fluid (ordinary liquids and gases of pure substances like liquid water, helium gas, etc.) is uniquely specified by (P, V, M), where P is the pressure, V the volume, and M the mass or molarity of the system.

However, strictly speaking, if phase coexistence can occur, not every set of thermodynamic variables can uniquely specify the equilibrium state. Only the (internal) energy E and the work coordinates X_1, ··· can, where work coordinates are thermodynamic variables that are used to express the work done to the system, whose mechanical meanings are unambiguously clear, and which are extensive (see, e.g., 1.3.8 and 1.3.9). The set E, X_1, ··· is called the thermodynamic coordinate system. The space spanned by these thermodynamic coordinates may be called the thermodynamic space. For a given macroscopic system, each of its equilibrium states uniquely corresponds to a point in its own thermodynamic space.

Some readers might point out that there are many more macroscopic observables we can observe for a given object: shape, orientation, etc. Precisely speaking, thermodynamic states are equivalence classes of macroscopically distinguishable states according to the values of the thermodynamic coordinates.

1.3.2 Simple system and compound system
A system that is macroscopically spatially uniform unless there is phase coexistence, and that can be thermodynamically uniquely described by a set of thermodynamic coordinates {E, X_i} with a single internal energy E, is called a simple system. A system that may be described as a join of simple systems is called a compound system, whose thermodynamic space may be interpreted as a direct product of the thermodynamic spaces of the constituent simple systems. In these notes, we mainly discuss simple systems.

1.3.3 Quasistatic process and path in thermodynamic space
Any (experimentally realizable) process that consists of states extremely (infinitesimally) close (experimentally indistinguishably close) to equilibrium states is called a quasistatic process. Quasistatic processes may be expressed as curves in the thermodynamic space. In these notes, a quasistatic process is synonymous with a process that has a curve representation in the thermodynamic space (Fig. 1.1). Whether it is reversible (retraceable) or not is not directly related to the quasistatic nature of the process.

Certainly, the processes not described as curves in the thermodynamic space are nonequilibrium processes. For example, if a process is sufficiently rapid, it cannot have any corresponding path in the thermodynamic space, because states along the process are not infinitesimally close to equilibrium states. Only when the process is sufficiently slow are all the instantaneous macroscopic states during the process infinitesimally close to equilibrium states. In this case the path lies in the thermodynamic space. It is certain that there is a way to realize any path corresponding to a quasistatic process reversibly. (Equilibrium) thermodynamics can tell us how to compute changes of thermodynamic quantities along any quasistatic path.

Fig. 1.1 A and B are equilibrium states. A quasistatic process connecting A and B is in the thermodynamic space. From A to B a process need not be quasistatic. Then, such a process cannot be described in the thermodynamic space (red).

However, whether a given quasistatic process is reversible or not depends on the context. For example, suppose a system is in contact with a cold bath across a thermally fairly well insulating wall. At each instant, the state of the system is very close to an equilibrium state, so the process is a quasiequilibrium process, but what is going on in this process is a cooling process, so the system + the cold bath undergoes an irreversible process. Still, there is a way to realize this temperature change reversibly for the system alone, so we can use equilibrium thermodynamics to study every state along the process.

1.3.4 State functions
If in an equilibrium state the value of a macroscopic quantity is uniquely specified by the corresponding point in the thermodynamic space, the macroscopic quantity is called a state function. Its value is indifferent to how the state is realized.

Once the thermodynamic space is established, we may say a (univalent) function defined on the thermodynamic space is a state function. That is, a function of the thermodynamic coordinates is called a state function. For example, the equilibrium volume of a system is a state function; temperature is another example.

When the initial and final equilibrium states are given, the variation of a state function does not depend on the actual process but only on the initial and final equilibrium states. Even if the actual process connecting these two states is not a quasistatic process (i.e., does not lie in the state space), we can thermodynamically compute the variation of any state function during the process with the aid of an appropriate (appropriately devised) quasistatic process connecting the same end points.

Actually, the essence of thermodynamic computation is to devise a quasistatic path connecting two equilibrium states that may be in practice connected by an irreversible process.

1.3.5 Joule demonstrated that energy is a state function
We have already included internal energy E in the thermodynamic coordinates, but whether E is a state function or not was not clear before Joule. Joule experimentally proved in 1843³ that, when the initial and the final equilibrium states are specified, the necessary (mechanical and electromagnetic) work W for any process connecting these two states of a ‘thermally isolated system’⁴ is independent of the actual procedure (the actual way of supplying work) and depends only on both ends of the process. From this we may conclude that there is a state function E whose change for an isolated system is given by

∆E = W. (1.3.1)

Remark. A more precise statement of (a generalization of) Joule’s finding is as follows. There is a special wall called an adiabatic wall such that, for a system surrounded by this wall, the necessary work to bring the system from a given initial equilibrium state to a specified final equilibrium state is independent of the actual process but is dependent only on these two end states of the process. Here, work is defined by mechanics and electrodynamics. □

1.3.6 Closed system, heat, and internal energy
(1.3.1) does not hold when the process is not adiabatic, even if the system does not exchange matter with its environment. A system which does not exchange matter with its environment is called a closed system. Now, even if a system is closed, if it is surrounded by an ‘energetically leaky’ wall, all the energy supplied to the system as work W may not stay in the system, or perhaps more energy could seep through the wall into the system.

Empirically, we know that if we have two equilibrium states A and B, we can bring at least one of the states into the other adiabatically (in a Dewar jar), say A → B, supplying only work W from outside. The process need not be a quasistatic process (perhaps we can heat the system by friction inside). In any case, in this way the energy difference of B relative to A, ∆E = E_B − E_A = W,⁵ can be measured in terms of mechanics (+ electrodynamics).

³[1843: the first Anglo-Maori war in New Zealand; Tahiti became a French colony.]
⁴E.g., (intuitively) any system contained in a Dewar jar.
⁵The difference due to a process is always defined as the final quantity − the initial quantity.

If we wish to realize the change A → B by a quasistatic and reversible process, we cannot always do so adiabatically. Now, suppose a quasistatic and non-adiabatic process A → B requires work W, which is usually smaller than ∆E (see the second law below). The deficit ∆E − W is understood as supplied through the wall as heat Q: Q = ∆E − W. In this way heat is introduced. That is, we wish to keep E as a state function even for non-isolated closed systems:

∆E = W +Q. (1.3.2)

(1.3.2) is the conservation law of energy extended to non-mechanical processes. E is called internal energy in thermodynamics, because we do not take the mechanical energy due to the mechanical motion of the system as a whole into account, even if the system is moving as a whole.⁶

Notice that although E is a state function, neither W nor Q is a state function; they depend explicitly on the path connecting the initial and the final equilibrium states (the path may not be in the thermodynamic space⁷).

1.3.7 Open system and general form of the first law
When not only heat but matter can be exchanged between the system and its environment (in this case the system is called an open system), (1.3.2) does not hold anymore. To rescue the equality, we introduce the term called the mass action Z:

∆E = W + Q + Z.     (1.3.3)

Z is not a state function, either. Now, we can summarize the first law of thermodynamics:

ThI The internal energy E defined by (1.3.3) is a state function.

The first law may be regarded as a special case of the general law of energy conservation. Strictly speaking, however, the first law only discusses the processes connecting two equilibrium states.

For an infinitesimally small change, we write (1.3.3) as follows:

dE = d′W + d′Q + d′Z,     (1.3.4)

where d′ is used to emphasize that these changes are not the changes of state functions (not path-independent changes).⁸

⁶Precisely speaking, the total energy of the system observed from the co-moving and co-rotating observer is the internal energy of the system.
⁷Notice that W is purely mechanically defined; irrespective of the nature of the process, it is macroscopically measurable, but Q is usually not directly measurable in nonequilibrium processes. It is computed at the end to satisfy (1.3.2).
⁸Mathematically, it is not a total differential, i.e., not a closed form.


1.3.8 Work due to volume change
When the change is quasistatic, W, Q and Z in (1.3.3) are determined by the equilibrium states of the system along the quasistatic path.

For example, let us consider the work required to change the system volume from V to V + dV (V is clearly a state function, so d′ is not used here). The necessary work supplied to the system reads (see Fig. 1.2)

Fig. 1.2 Work done by volume change.

d′W = −Fdl = −PdV, (1.3.5)

where P is the pressure and the force F is given by the following formula, if the process is sufficiently slow:

F = A × P,     (1.3.6)

where A is the cross section of the ‘piston.’ Here, we use the sign convention such that the energy gained by the system becomes positive. Hence, in the present example, d′W should be positive when we compress the system (i.e., when dV < 0).

If the process is fast, there would not be sufficient time for the system to equilibrate. For example, when we compress the system, the force necessary and the force given by (1.3.6) can be different; the pressure P may not be well defined. Consequently, (1.3.5) does not hold (the work actually done is larger than given by (1.3.5)).
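To make (1.3.5) concrete, here is a minimal numerical sketch of the quasistatic work −∫P dV along an explicit path. The ideal-gas law P = nRT/V, the isothermal condition, and all numerical values are assumptions introduced only for this illustration; they are not taken from the text above.

```python
# Minimal sketch (not from the text): quasistatic work W = -∫ P dV for an
# assumed isothermal compression of an ideal gas, P = n R T / V.
import numpy as np

n, R, T = 1.0, 8.314, 300.0              # mol, J/(mol K), K -- hypothetical values
V = np.linspace(2.0e-3, 1.0e-3, 2001)    # compress from 2 L to 1 L (in m^3)
P = n * R * T / V                        # quasistatic: P is well defined at every step

# d'W = -P dV, accumulated along the path (trapezoidal sum)
W = -np.sum(0.5 * (P[1:] + P[:-1]) * np.diff(V))
print(W)                                  # positive: compression adds energy to the gas
print(n * R * T * np.log(2.0))            # analytic value n R T ln(V_initial/V_final)
```

For a fast compression the actual work would exceed this quasistatic value, as the preceding paragraph notes, because P in (1.3.5) is then not well defined.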

1.3.9 Electromagnetic work
The electromagnetic work can be written as

d′W = H · dM,     (1.3.7)

d′W = E · dP,     (1.3.8)

where H is the magnetic field, M the magnetization, E the electric field, and P the polarization.


1.3.10 Mass action and chemical potential
The mass action is empirically written as

d′Z = ∑_i µ_i dN_i,     (1.3.9)

where N_i is the number of the i-th particles (or the molarity of the i-th chemical species), and µ_i is its chemical potential.

1.3.11 The first law is not identical to the law of energy conservation
The first law is about the internal energy, which is defined only for equilibrium states. If a system is not in equilibrium, it may have a material flow (e.g., fluid flow) that carries macroscopic kinetic energy. Such an energy is not regarded as a part of the internal energy. Needless to say, if we count all the energies, the conservation of energy rigorously holds for an isolated system, but the total energy is equal to the internal energy only when the system is in equilibrium and is not moving or rotating with respect to the observer who measures the internal energy. Thus, the conservation of energy implies the first law, but the converse is not true.

1.4 Fourth law of thermodynamics

1.4.1 Extensive and intensive thermodynamic variables
Thermodynamic observables which are proportional to the number of particles (or mass) in the system are called extensive quantities. For example, the internal energy E of the system is doubled when we piece together identical systems in the identical thermodynamic state. In contrast, the temperature is not doubled by the same procedure. Temperature is independent of the amount of mass in the system. Thermodynamic quantities independent of the number of particles (or mass) in the system are called intensive quantities. Temperature and pressure are examples.

Notice that the infinitesimal form of the first law of thermodynamics can be written in general as follows:

dE = d′Q + ∑_i x_i dX_i,     (1.4.1)

where X_i are extensive quantities and x_i are intensive quantities. The pair (x_i, X_i) is called a thermodynamic conjugate pair (with respect to energy).


1.4.2 The fourth law of thermodynamics
The fourth law claims:

ThIV All thermodynamic observables are either extensive or intensive.

This empirical law is vital when we construct a statistical mechanical framework to explain macroscopic properties of matter.

Also, the law is practically useful to obtain the equation of state applicable to any amount of matter from experiments that actually use a particular amount of the matter, as shown in the following example.

Example 1. This example contains thermodynamic variables we have not yet discussed at all, but what matters is whether a particular variable is extensive or intensive, so don’t worry. An empirical equation of state of a magnetic substance (2 moles) is obtained as

A = T^{−1/2} M^2,     (1.4.2)

where A is the Helmholtz free energy, which is an extensive variable we will discuss later, T the absolute temperature, which is intensive, and M the magnetization, which is extensive. Find the equation of state for the free energy A for N moles of the same substance.

We use the fourth law. If we replace the extensive quantities A and M in the empirical equation of state (1.4.2) by their N-mole counterparts, we have

2A/N = T^{−1/2} (2M/N)^2,     (1.4.3)

so we obtain the following formula:

A = 2 T^{−1/2} M^2/N.     (1.4.4)

□
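The scaling argument in Example 1 can be checked mechanically. The following sympy sketch (the symbols and the candidate answer (1.4.4) are taken from the example; the scaling factor 2/N is my own bookkeeping) rescales the extensive quantities A and M back to 2 moles and confirms that (1.4.2) is recovered.

```python
# A sympy check of Example 1 in 1.4.2: scaling the extensive A and M by 2/N
# (with intensive T fixed) must reproduce the 2-mole equation of state (1.4.2).
import sympy as sp

T, M, N = sp.symbols('T M N', positive=True)
A_N = 2 * M**2 / (N * sp.sqrt(T))          # candidate N-mole answer (1.4.4)

lam = 2 / N                                # scale the amount from N moles down to 2 moles
two_mole_lhs = lam * A_N                   # A for 2 moles
two_mole_rhs = (lam * M)**2 / sp.sqrt(T)   # (1.4.2) evaluated at the scaled magnetization
print(sp.simplify(two_mole_lhs - two_mole_rhs))   # 0  =>  (1.4.4) is consistent
```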

1.5 Second law of thermodynamics

1.5.1 The second law of thermodynamics
We know empirically that not all conceivable processes are realizable in Nature. The second law summarizes this as follows:

ThIIcl Clausius’ law: Heat cannot spontaneously be transferred from a colder to a hotter body.⁹

ThIIk Kelvin’s law: A process cannot occur whose only effect is the complete conversion of heat into work. (There is no perpetuum mobile of the second kind; there is no engine which can produce work without a radiator.)

ThIIp Planck’s law: In an adiabatic process, if all the thermodynamic coordinates except for E return to their original values, then ∆E ≥ 0.

1.5.2 Planck’s law in thermodynamic space
The first law implies that, adiabatically,

dE = ∑_i x_i dX_i,     (1.5.1)

where (x_i, X_i) are conjugate pairs for work coordinates (non-thermal variables). The variables E and X_i span the thermodynamic space (1.3.1).

Fig. 1.3 The path in the thermodynamic space corresponds to the quasistatic process (notice, however, that generally Planck’s law does not require quasistatic processes). The vertical move implies a purely thermal process. Adiabatically, there is no way to move from a state to another state that is vertically below it, according to Planck’s law. (The figure shows states A and B and the axes X_1, X_2, E.)

1.5.3 Planck’s law, Kelvin’s law and Clausius’ law are equivalent
If Kelvin’s law could be violated, then we could absorb heat from a colder body, and then produce work. If we simply dissipate the work into heat, we could add it to a hotter body. Thus, Clausius’ law could be violated. Therefore, Clausius’ law implies Kelvin’s law.

Conversely, if Clausius’ law could be violated, we could split a body into two halves, and one of them could be made hotter than the other. This body can be used to push the state from A to B in Fig. 1.3. Thus, Kelvin’s law would be violated. We have completed a demonstration of the equivalence of ThIIk and ThIIcl.

If Planck’s law could be violated, then due to work alone ∆E = W < 0 would be possible (i.e., the system does work on the environment), so, supplying this as heat Q, we could violate Kelvin’s law. Hence, Kelvin implies Planck. If Kelvin is violated, we can first absorb heat (say, A to B in Fig. 1.3) and then go from B to A by converting it into work, violating Planck, so Planck implies Kelvin.

⁹The presentation here has not defined temperature clearly yet, so strictly speaking, we cannot describe this law properly here. The author thinks Planck’s law is the most elegant formulation of the second law.

1.5.4 Caratheodory’s principle
Planck’s law tells us that there is no way to return the system from B to A adiabatically in Fig. 1.3. Actually, notice that Planck tells us much more than we can illustrate in the state space. Recall that if a process is not quasistatic, we cannot draw the path corresponding to it in the state space. Still, Planck tells us that, irrespective of whether the processes involved are quasistatic or not, we cannot go from B to any state below it (along the constant work coordinate line) in the thermodynamic space.

Therefore, it is obvious that any thermodynamic state has another state that cannot be reached from the former by any adiabatic process. It is often convenient to promote this to be a form of the second law:

ThIIca Caratheodory’s principle: Each point in the thermodynamic space has in its every neighborhood a point which is adiabatically inaccessible.

We have already said that Planck implies Caratheodory. The converse requires more conditions; this should be obvious, because Planck requires that there is a state that cannot be accessed from a given state in a particular way, while Caratheodory claims only that there is a state that cannot be accessed by any means.

1.5.5 Existence of entropy
Starting with 1.5.6 we demonstrate that the second law implies the existence of entropy, the quantity that foliates the thermodynamic space into ‘adiabats’ or ‘isentropic hypersurfaces.’ Roughly speaking, if the entropies of two states are identical, these states can be connected adiabatically and reversibly. If state A has larger entropy than state B, then we can never go from A to B adiabatically.

1.5.6 Adiabat expresses adiabatic accessibility limit
Choose an arbitrary point P in the thermodynamic space and a quasistatic adiabatic path connecting P and L, a line parallel to the energy axis (a constant work coordinate line; Fig. 1.4).

Suppose the path lands on L at point Q. Can we quasistatically and adiabatically go from P to A or B distinct from Q on the line L? Any quasistatic path may be realized by a reversible process, so PQ can be traveled in either direction, but neither the cycle PQBP nor PAQP should be allowed. B may be reached adiabatically from P by an irreversible process, but A cannot be reached adiabatically by any means. Therefore, we conclude that the point Q is the state with the lowest internal energy on L that can be reached from P adiabatically; every state on L below Q cannot be reached from P adiabatically.

Fig. 1.4 The cycle PAQP implies that the system can absorb heat along AQ and then convert it into work without any trace (returning to P), violating Planck’s law. In contrast, PBQP is realizable, because we simply waste work into heat and discard it during BQ. Notice that PQ and QP are adiabatically allowed, and AQ, QA, BQ and QB are possible with the aid of an appropriate heat bath. (The portion of L below Q is adiabatically inaccessible from P.)

Now, moving the stick L throughout the space, keeping it parallel to the energy axis, we can construct a hypersurface consisting of points adiabatically, quasistatically and reversibly accessible from point P. This is an adiabat (the totality of the states that can be quasistatically reached from P without any heat exchange with the environment).

1.5.7 Adiabats foliate thermodynamic space
See Fig. 1.5 to understand that these sheets (= adiabats) cannot cross. This implies that we can define a state function S, whose level sets are given by these sheets (S = constant defines an adiabat).


Fig. 1.5 If two adiabats cross or touch, then we can make a cycle that can be traced in any direction, because PQ and P′Q can be traced in either direction (reversibility of quasistatic processes), and so can PP′ with the aid of an appropriate heat bath. Planck’s law is violated.

1.5.8 Adiabat can be parameterized by an increasing function of energy
The adiabats parameterized by this state function S can have no overhang, if we regard the energy coordinate direction as the vertical direction, as can be seen from Fig. 1.6.

Fig. 1.6 Just as in Fig. 1.5, an overhang violates Planck’s law.

Therefore, adiabats can be parameterized by a continuous state function S that increases monotonically with energy. This state function is essentially entropy.

1.5.9 Relation between heat and entropy
For a given system, we have seen that we can introduce a state function S that is a monotone increasing function of E (if other state variables are kept constant).

We can change entropy keeping all X_i variables constant; that is, we can change S by supplying or removing heat Q. Since energy is extensive, so is Q.¹⁰

If d′Q > 0, then dS > 0 should be required. We may choose these two differentials to be proportional: dS ∝ d′Q. This automatically implies that we assume S to be extensive, and the proportionality constant must be intensive.

¹⁰That is, if we double the system, we must double the heat to reach the same thermodynamic state characterized by the same intensive parameters and densities (= extensive variables per volume).


Suppose two systems are in contact through a wall that allows only the exchange of heat, and they are in thermal equilibrium. Exchange of heat d′Q between the systems is a reversible process (say, system I gains d′Q_I = d′Q and II d′Q_II = −d′Q), so this process occurs within a single adiabat of the compound system. If we write d′Q_X = θ_X dS_X (X = I or II),

0 = dS_I + dS_II = d′Q (1/θ_I − 1/θ_II).     (1.5.2)

This implies θ_I = θ_II. That is, when two systems have the same temperature, the proportionality constant is also the same. Hence, we may interpret the proportionality factor as a temperature (cf. the zeroth law). The introduced temperature can be chosen as a universal temperature T called the absolute temperature. Hence, in a quasistatic process we can write

d′Q = TdS. (1.5.3)

Remark. The discussion given above is a crude version of the standard thermodynamic demonstration of the existence of an integrating factor for d′Q. The usual demonstration with the aid of a classical ideal gas must be avoided, because the classical ideal gas is not consistent with thermodynamics.

1.5.10 Infinitesimal form of the first law with the aid of entropy
Now we can write down the infinitesimal version of the first law of thermodynamics for quasistatic processes as follows:

dE = TdS − PdV + µdN + H · dM + ···.     (1.5.4)

This is called the Gibbs relation. Notice that each term consists of a product of an intensive factor and d[the corresponding (i.e., conjugate) extensive quantity].


1.6 Clausius’ inequality

1.6.1 Entropy cannot decrease in isolated systems
In the preceding section we have shown that whenever there is an irreversible change in an isolated system (more generally, an adiabatic system), the second law of thermodynamics implies that the entropy increases; more precisely, we have shown that we can introduce the concept of entropy to satisfy this condition. See Fig. 1.4 in 1.5.6 again. From P we can reach the portion of L no lower than Q; we have shown that the entropy can be introduced as an increasing function of E along L. In an adiabatic system, only when the change is quasistatic does the entropy stay constant. Thus, when d′Q = 0 (adiabatic),

∆S ≥ 0.     (1.6.1)

Here, ∆ implies the difference between the values of the final and the initial equilibrium states.¹¹ This is called Clausius’ inequality (for the isolated system).

1.6.2 Stability of state and evolution criterion
Since all spontaneous processes are irreversible, we may say that for a system to evolve (under an adiabatic condition) from an initial state to a final state, the entropy change (which is completely determined by these end points in equilibrium) must be positive. If not, there cannot be any spontaneous change; the system is in a thermodynamically stable state. Thus, for an isolated system

δS < 0 ⟺ the state is thermodynamically stable,     (1.6.2)

δS > 0 ⟺ the state spontaneously evolves,     (1.6.3)

where δ implies virtual changes of states. The first line above is the stability condition, and the second the evolution criterion. If δS = 0, the equilibrium state can be changed to another adiabatically and reversibly and/or can drift to another equilibrium state.

Here, ‘virtual changes’ might be taken as ‘fictitious changes,’ but in the actual system these changes are realized by thermal fluctuations. Thus, if the evolution criterion δS > 0 is satisfied, in most cases the system actually moves away from the current state.

¹¹The reader might feel this is a bit strange, because equilibrium states should not evolve further. Here, the situation ∆ indicates is as follows: initially the system is in an equilibrium state. Then, some (environmental) conditions are changed (say, the volume is changed); whether this change is slow or rapid, we do not care. After this change, the system (in suitable isolation as required by the adiabatic condition) would settle down to another (new) equilibrium state (as guaranteed by the zeroth law). ∆ compares this new equilibrium state to the original equilibrium state before the change.


1.6.3 Variational principle for equilibrium state
If an isolated system arrives at a stable equilibrium state, its entropy must be maximized. Therefore, the second law gives us a variational principle (the entropy maximization principle) to find a stable equilibrium state for an isolated system.

1.6.4 Extension to non-isolated system
Next, we would like to extend our inequality for isolated systems to non-isolated systems. The following argument is a standard strategy that we use repeatedly throughout statistical thermodynamics. To consider a system which is not isolated, that is, a system which is interacting with its environment, we construct an isolated system composed of the system itself (I) and its interacting environment (II) (Fig. 1.7). We assume that both systems are macroscopic, so we may safely ignore the surface effect.

Fig. 1.7 The system II is the environment (a reservoir) for the system we are interested in, I. II is sufficiently large, so no change in I significantly affects II.

The environment is a stationary one, whose intensive thermodynamic variables such as temperature are kept constant. To realize this we take a sufficiently big system (called a reservoir, like a thermostat or a chemostat) as the environmental system II. Even if a change is a rather drastic one for the system I itself, it would be negligible for the system II, because it is very large. Therefore, we may assume that any process in the system I is a quasistatic process for system II. This means that the entropy change of the compound system I+II is given by the sum of the entropy change of the system I, denoted by ∆S_I, and that of the environment II, denoted by ∆S_II.

1.6.5 Clausius’ inequality for general cases
Since the whole system I+II is isolated, the second law or Clausius’ inequality (1.6.1) for isolated systems tells us that

∆S_I + ∆S_II ≥ 0.     (1.6.4)

Let Q (> 0) be the heat transferred to the system I from the environment II. From our assumption, we have

∆S_II = −Q/T_e,     (1.6.5)


where T_e is the temperature of the environment. The minus sign is because II is losing heat to I. Combining (1.6.4) and (1.6.5) yields the following inequality:

∆S_I ≥ Q/T_e.     (1.6.6)

This is Clausius’ inequality for non-isolated systems. Of course, for isolated systems Q vanishes, so we recover (1.6.1).

1.6.6 Intrinsic change of entropy
If a process is quasistatic (and isothermal), then the fundamental relation between entropy and heat reads

∆S_I|_reversible = Q/T,     (1.6.7)

where T is the temperature of the system I. For this process T_e must be identical to T. Hence the entropy change in this reversible process is solely due to the transfer of heat (i.e., solely due to the interaction with the environment).

When irreversibility occurs, the equality (1.6.7) is violated. To describe this, de Donder split the entropy change into two parts and introduced the concept of the intrinsic change of entropy due to irreversibility:

∆_i S ≡ ∆S − Q/T_e.     (1.6.8)

The second law reads

∆_i S ≥ 0.     (1.6.9)

The intrinsic change is interpreted as the portion of the entropy change produced inside the system due to the very irreversibility of the process.

1.6.7 Clausius’ inequality in terms of internal energy
Clausius’ inequality can be rewritten in terms of the internal energy as follows. We have

∆E = Q + W + Z.     (1.6.10)

Combining this with Clausius’ inequality (1.6.6), we get

∆E − W − Z ≤ T_e∆S,     (1.6.11)

or

∆E ≤ T_e∆S − P_e∆V + µ_e∆N + ···,     (1.6.12)

where quantities with subscript e are all for the environment.


1.6.8 Equilibrium conditions for two systems in contact
As an application of the entropy maximization principle 1.6.3, let us study the equilibrium condition for two systems I and II interacting through various walls.

Fig. 1.8 The thick vertical segment is the wall that selectively allows the exchange of a certain extensive quantity between systems I and II.

i) Consider a rigid impermeable wall which is diathermal. Thus, the two systems in contact through this wall exchange energy (internal energy) in the form of heat. The total entropy of the system S is the sum of the entropy of each system, S_I and S_II. The total internal energy E is also the sum of the subsystem internal energies E_I and E_II (extensivity). We isolate the compound system and ask the equilibrium condition for the system. We should maximize the total entropy with respect to the variation of E_I and E_II (the variational principle for equilibrium; see 1.6.3):

δS = (∂S_I/∂E_I) δE_I + (∂S_II/∂E_II) δE_II = (∂S_I/∂E_I − ∂S_II/∂E_II) δE_I = 0,     (1.6.13)

where we have used that δE = 0, or δE_I = −δE_II. Hence, the equilibrium condition is

∂S_I/∂E_I = ∂S_II/∂E_II,     (1.6.14)

or T_I = T_II.

ii) Consider a diathermal impermeable wall which is movable. In this case the two systems can exchange energy and volume. If we assume that the total volume of the system is kept constant, the equilibrium condition should be

δS = (∂S_I/∂V_I) δV_I + (∂S_II/∂V_II) δV_II = (∂S_I/∂V_I − ∂S_II/∂V_II) δV_I = 0,     (1.6.15)

and T_I = T_II; that is,

∂S_I/∂V_I = ∂S_II/∂V_II     (1.6.16)

and T_I = T_II. Therefore, P_I = P_II is also required.

If the wall is adiabatic, then it cannot exchange heat, so there is no way to exchange entropy. This suggests that it is convenient to use the Gibbs relation (1.5.4) directly. P_I = P_II is the condition; we cannot say anything about the temperatures.

iii) Consider a wall semi-permeable to the i-th chemical species. In this case it is natural to assume that the wall is diathermal. Hence, the two systems can exchange the molarity N_i of chemical species i and internal energy. The total number of the i-th particles is conserved, so, quite analogously to i) and ii), we get the following equilibrium conditions:

∂S_I/∂N_iI = ∂S_II/∂N_iII,   ∂S_I/∂E_I = ∂S_II/∂E_II.     (1.6.17)

That is, T_I = T_II, and µ_iI = µ_iII.
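The entropy-maximization calculation in case i) can also be illustrated numerically. In the sketch below, the toy entropies S = c ln E (ideal-gas-like, with heat capacities c₁, c₂ and a total energy chosen arbitrarily) are assumptions, not from the text; the point is only that the entropy-maximizing partition of the energy equalizes ∂S/∂E, i.e., the temperatures.

```python
# Toy illustration of 1.6.8 (i): maximize S_I(E_I) + S_II(E_total - E_I).
# Assumed entropies S = c * ln(E), so 1/T = dS/dE = c/E, i.e. T = E/c.
import numpy as np

c1, c2, E_total = 3.0, 5.0, 8.0                   # hypothetical values
E1 = np.linspace(0.01, E_total - 0.01, 100001)
S_total = c1 * np.log(E1) + c2 * np.log(E_total - E1)

E1_star = E1[np.argmax(S_total)]                  # entropy-maximizing partition
print(E1_star)                                    # ≈ 3: where c1/E1 = c2/(E_total - E1)
print(E1_star / c1, (E_total - E1_star) / c2)     # both ≈ 1: the temperatures match
```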

1.6.9 Phase coexistence condition
When two phases are in equilibrium, the phase boundary is a wall which allows exchanges of all the extensive quantities in the system. Therefore, the equilibrium condition (that is, the phase coexistence condition) for these two phases is that all the intensive quantities of both phases are identical. See Section 2.5 (e.g., 2.13.1, 2.13.2).

1.7 Various thermodynamic potentials

1.7.1 Helmholtz free energy
In reality, the variables S, V, N, ··· to be controlled for the ordinary Gibbs relation (1.5.4) are often hard to control, or at least awkward. For example, to keep the volume constant may be more difficult than to keep the pressure constant. Perhaps, to keep the temperature constant is easier than to maintain the adiabatic condition.

In order to change the independent variables from S, V, N, ··· to T, V, N, ···, we perform the following Legendre transformation:

E −→ E − TS.     (1.7.1)

The introduced quantity E − TS is called the Helmholtz free energy and is usually written as A.¹² The total differential of A reads

dA = dE − TdS − SdT = −SdT − PdV + µdN + ···,     (1.7.2)

¹²Older literature uses F.


where we have used the Gibbs relation (1.5.4).

The Helmholtz free energy should be a good thermodynamic potential under constant T, V, N, ···. When we compute dS, we regard S as a function of T, V, N, ···. Under constant T, (1.6.12) reads

∆E − T_e∆S = ∆A ≤ −P_e∆V + µ_e∆N + ···,     (1.7.3)

so under constant T, V, N, ··· for any change¹³

∆A ≤ 0.     (1.7.4)

Hence, in the stable equilibrium state under constant T, V, N, ··· the Helmholtz free energy must be the global minimum (i.e., in particular, δA > 0).¹⁴
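The mechanics of the Legendre transformation (1.7.1) can be seen on a toy example. The fundamental relation E(S) = S² below is an arbitrary convex function chosen only for illustration (it is not a physical equation of state); the sketch computes T = dE/dS, eliminates S, and checks that dA/dT = −S.

```python
# Toy sympy sketch of the Legendre transformation E -> A = E - T S (assumed E(S) = S**2).
import sympy as sp

S, T = sp.symbols('S T', positive=True)
E = S**2                                    # assumed toy fundamental relation
T_of_S = sp.diff(E, S)                      # T = dE/dS = 2 S
S_of_T = sp.solve(sp.Eq(T, T_of_S), S)[0]   # invert: S = T/2
A = sp.simplify((E - T*S).subs(S, S_of_T))  # Helmholtz free energy as a function of T
print(A)                                    # -T**2/4
print(sp.diff(A, T))                        # -T/2 = -S, consistent with dA = -S dT + ...
```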

1.7.2 Under constant pressure and temperature: Gibbs free energy
If we wish to study a closed system under constant T and P, we should further change the independent variables from T, V, N, ··· to T, P, N, ···. The necessary Legendre transformation is (notice that the conjugate quantity of V is −P, not P)

A + PV = E − TS + PV ≡ G,     (1.7.5)

which is called the Gibbs free energy. We have

dG = −SdT + V dP + µdN.     (1.7.6)

Exercise 1. Demonstrate that in the stable equilibrium state under constant T, P, N, ···, G is minimum. □

If we wish to study a system in which S, P, and N are kept constant, as is easily guessed, the following thermodynamic potential, called the enthalpy H, is convenient:

H ≡ E + PV.     (1.7.7)

Find the total differential of H and the stability condition for the equilibrium state under constant S, P, N, ···.

Remark. The Legendre transformation is a more general concept than discussed above. Convex analysis is the mathematical topic covering the general discussion of the Legendre transformation. □

¹³‘Any change’ here means literally any change; the change need not be small (that is why ∆ is used instead of δ), and can be any ‘local change’ in the system; for example, we could interpret a single simple system as a compound system and manipulate the thermodynamic variables of the constituent subsystems freely within the required overall constraints that T, V, N, ··· are constant.
¹⁴In equilibrium thermodynamics, usually, a local minimum implies the global minimum, but global minima may not be unique. If the minima are isolated in the thermodynamic space, then locally δA > 0 holds, but this does not imply ∆A > 0.


1.7.3 Gibbs-Duhem relation
Let us pursue a consequence of the fourth law (1.4.2) of thermodynamics. If we increase the amount of all materials in the system from N_i to (1 + δλ)N_i, then all the extensive quantities are multiplied by 1 + δλ, and all the intensive quantities remain unaltered. Therefore, (1.5.4) now reads

d[(1 + δλ)E] = T d[(1 + δλ)S] − P d[(1 + δλ)V] + ∑_i µ_i d[(1 + δλ)N_i] + ···,     (1.7.8)

or

E dδλ = TS dδλ − PV dδλ + ∑_i µ_i N_i dδλ + ···.     (1.7.9)

That is, we have

E = TS − PV + ∑_i µ_i N_i + ···.     (1.7.10)

Combining the total differential of this formula and the Gibbs relation (1.5.4), we arrive at

S dT − V dP + ∑_i N_i dµ_i + ··· = 0.     (1.7.11)

This important relation is called the Gibbs-Duhem relation.

Exercise 1. Demonstrate the following formulas in two ways: 1) with the aid of the definitions of various thermodynamic potentials, 2) directly from the total differential formulas such as (1.7.2) and (1.7.6), using the same logic we have just used to demonstrate the Gibbs-Duhem relation.

A = −PV + ∑_i µ_i N_i + ···,     (1.7.12)

H = TS + ∑_i µ_i N_i + ···,     (1.7.13)

G = ∑_i µ_i N_i + ···.     (1.7.14)

□

The last formula in the exercise implies that for a simple pure fluid

µ = G/N.     (1.7.15)
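The Euler relation (1.7.10), and hence G = ∑µᵢNᵢ, rests only on the first-order homogeneity demanded by the fourth law. As a check, the sketch below uses an ideal-gas-like fundamental relation E(S, V, N) (the explicit form, with constants a and k, is an assumption chosen merely to have something concrete that is homogeneous of degree one) and verifies E = TS − PV + µN symbolically.

```python
# A sympy check of the Euler relation (1.7.10) for a degree-one homogeneous E(S, V, N).
import sympy as sp

S, V, N, a, k = sp.symbols('S V N a k', positive=True)
# Assumed fundamental relation (ideal-gas-like); E(λS, λV, λN) = λ E(S, V, N).
E = a * N**sp.Rational(5, 3) * V**sp.Rational(-2, 3) * sp.exp(2*S/(3*N*k))

T  = sp.diff(E, S)            # temperature
P  = -sp.diff(E, V)           # pressure (the conjugate of V is -P)
mu = sp.diff(E, N)            # chemical potential

print(sp.simplify(E - (T*S - P*V + mu*N)))   # 0, i.e. (1.7.10) holds
```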


1.8 Manipulation of thermodynamic formulas

1.8.1 Symmetry of mixed second partial derivatives
Let f be a function of x and y. If its partial derivatives f_x and f_y exist and are differentiable in a domain D, then

f_xy = f_yx     (1.8.1)

in D (Young’s theorem).

Remark. More precisely, if f_x, f_y and f_xy exist and f_xy is continuous, then f_yx exists and is identical to f_xy (Schwarz’s theorem). Notice that f_xx and f_yy do not necessarily exist under this condition. □

Hence, so long as thermodynamic potentials are smooth, we may apply Young’s theorem to them. The resultant equations corresponding to (1.8.1) are collectively called Maxwell’s relations. Some examples follow.

dE = TdS − PdV + ···     (1.8.2)

gives

∂T/∂V|_S = −∂P/∂S|_V.     (1.8.3)

Notice the conjugate pairs appearing upstairs and in the conditions (the variables held constant). If we start from the Helmholtz free energy, we get

∂S/∂V|_T = ∂P/∂T|_V.     (1.8.4)
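A Maxwell relation is nothing but Young’s theorem applied to a thermodynamic potential, and this is easy to verify on a concrete function. The Helmholtz free energy A(T, V) below is a van der Waals-like toy expression (an assumption, not taken from the text); the sketch defines S and P from dA = −S dT − P dV and checks (1.8.4).

```python
# A sympy illustration of (1.8.4): (dS/dV)|_T = (dP/dT)|_V follows from f_xy = f_yx.
import sympy as sp

T, V, N, k, a, b, c = sp.symbols('T V N k a b c', positive=True)
# Assumed toy Helmholtz free energy (van der Waals-like, plus a T*ln(T) term).
A = -N*k*T*sp.log(V - N*b) - a*N**2/V - c*N*T*sp.log(T)

S = -sp.diff(A, T)            # from dA = -S dT - P dV
P = -sp.diff(A, V)

print(sp.simplify(sp.diff(S, V) - sp.diff(P, T)))   # 0 -- the Maxwell relation (1.8.4)
```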

1.8.2 Jacobian technique to manipulate partial derivatives
To manipulate many partial derivatives, it is very convenient to use the so-called Jacobian technique. The Jacobian for two independent variables is defined as the following determinant:

∂(X, Y)/∂(x, y) ≡ det [ ∂X/∂x|_y  ∂X/∂y|_x ; ∂Y/∂x|_y  ∂Y/∂y|_x ] = (∂X/∂x|_y)(∂Y/∂y|_x) − (∂Y/∂x|_y)(∂X/∂y|_x).     (1.8.5)

In particular, we have

∂(X, y)/∂(x, y) = ∂X/∂x|_y.     (1.8.6)


This and some simple algebraic relations in 1.8.3 below are the keys.

Remark. More generally, the Jacobian of n functions {f_i}_{i=1}^n of n independent variables {x_i}_{i=1}^n is defined by

∂(f_1, f_2, ···, f_n)/∂(x_1, x_2, ···, x_n) ≡ det(∂f_i/∂x_j).     (1.8.7)

□

1.8.3 Useful elementary relations involving Jacobians
From the properties of the determinant, if we change the order of variables or functions, there is a sign change:

∂(X, Y)/∂(x, y) = −∂(X, Y)/∂(y, x) = ∂(Y, X)/∂(y, x) = −∂(Y, X)/∂(x, y).     (1.8.8)

If we assume that X and Y are functions of a and b, and that a and b are, in turn, functions of x and y, we have the following multiplicative relation:

[∂(X, Y)/∂(a, b)] [∂(a, b)/∂(x, y)] = ∂(X, Y)/∂(x, y).     (1.8.9)

This is a disguised chain rule. The proof of this relation is left to the readers. Use

∂X/∂x|_y = (∂X/∂a|_b)(∂a/∂x|_y) + (∂X/∂b|_a)(∂b/∂x|_y).     (1.8.10)

The rest is straightforward algebra.

From (1.8.9) we get at once

∂(X, Y)/∂(x, y) = 1 / [∂(x, y)/∂(X, Y)].     (1.8.11)

In particular, we have

∂X/∂x|_Y = 1 / (∂x/∂X|_Y).     (1.8.12)

Using these relations, we can easily demonstrate

∂X/∂y|_x = −(∂x/∂y|_X) / (∂x/∂X|_y)     (1.8.13)


as follows:

∂(X, x)/∂(y, x) = [∂(y, X)/∂(y, x)] [∂(X, x)/∂(y, X)]   (by (1.8.9))
                = −[∂(x, X)/∂(y, X)] [∂(X, y)/∂(x, y)]   (by (1.8.8)).     (1.8.14)

Then, use (1.8.11). A concrete example of this formula is

∂p/∂T|_V = −(∂V/∂T|_p) / (∂V/∂p|_T),     (1.8.15)

which relates thermal expansivity and isothermal compressibility.
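As a quick sanity check of (1.8.15), one can plug in any explicit equation of state. The sketch below uses the ideal-gas law PV = NkT purely as a convenient example (an assumption; the text deliberately avoids relying on the ideal gas for foundational arguments).

```python
# A sympy check of the cyclic relation (1.8.15) for the assumed ideal-gas law p V = N k T.
import sympy as sp

T, V, p, N, k = sp.symbols('T V p N k', positive=True)
p_of_TV = N*k*T/V                 # p as a function of (T, V)
V_of_Tp = N*k*T/p                 # V as a function of (T, p)

lhs = sp.diff(p_of_TV, T)                              # (dp/dT)|_V
rhs = -sp.diff(V_of_Tp, T) / sp.diff(V_of_Tp, p)       # -(dV/dT)|_p / (dV/dp)|_T
print(sp.simplify(lhs - rhs.subs(p, p_of_TV)))         # 0
```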

1.8.4 Maxwell’s relation in terms of Jacobians
All the Maxwell’s relations can be unified in the following form:

∂(X, x)/∂(Y, y) = −1,     (1.8.16)

where (x, X) and (y, Y) are conjugate pairs.

1.9 Consequences of stability of equilibrium states

1.9.1 Origin of definite signs of many second derivatives
Clausius’ inequality was interpreted as an evolution criterion of equilibrium states in 1.6.2. From this inequality we derived other inequalities in terms of various thermodynamic potentials. These inequalities dictate the signs of many derivatives of thermodynamic quantities.

1.9.2 Second differential must have a definite sign¹⁵
Let us start with (1.6.12). This is an evolution criterion, i.e., if this inequality is satisfied, the system evolves spontaneously. Therefore, if the equilibrium state under consideration is stable (or neutral), we must have

δE ≥ T_e δS − P_e δV + µ_e δN + ··· + x_ie δX_i + ···,     (1.9.1)

where δ implies virtual changes of variables and ··· denotes other first-order differential terms. On the other hand, the expansion of δE to the second order reads

δE = T δS − P δV + µ δN + ··· + x_i δX_i + ··· + (1/2) ∑_{i,j} (∂²E/∂X_i∂X_j) δX_i δX_j + higher-order differential terms.     (1.9.2)

¹⁵This is a topic of convex analysis: E is a convex function of the extensive variables.


Since the system is in equilibrium with the environment with T_e, P_e, µ_e, etc., T = T_e, P = P_e, etc. hold. Therefore, combining these two formulas, we conclude that

(1/2) ∑_{i,j} (∂²E/∂X_i∂X_j) δX_i δX_j ≥ 0     (1.9.3)

for any δX_i. That is, the matrix (Hessian) Matr.(∂²E/∂X_i∂X_j) is positive semidefinite (if the system is really stable, positive definite).¹⁶

1.9.3 Simple consequence of positive semidefiniteness of Hessian matrix
A necessary and sufficient condition for a matrix to be positive definite is that all its principal minors are positive. Therefore, (1.9.3) implies that all the diagonal elements are non-negative:

∂²E/∂X_i² ≥ 0.     (1.9.4)

For example,

∂²E/∂S² > 0 ⇒ ∂S/∂T|_V > 0 ⇒ C_V > 0,     (1.9.5)

∂²E/∂V² > 0 ⇒ −∂P/∂V|_S > 0 ⇒ κ_S > 0,     (1.9.6)

where C_V (≡ T(∂S/∂T)_V) is the specific heat under constant volume, and κ_S (≡ −(∂V/∂P)_S/V) is the adiabatic compressibility.

That the positivity of these quantities implies the stability of the system is intuitively understandable. Suppose C_V < 0. Then, if the system gains energy (as heat), the temperature of the system decreases. Consequently, the system becomes a heat sink, and sucks in all the energy of the universe.
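For a concrete convex fundamental relation, the positivity statements (1.9.4)-(1.9.6) can be verified directly. The sketch below takes a toy E(S, V) of the ideal-gas type at fixed N (the constants a and b are assumptions made only for this example) and checks that the 2 × 2 Hessian with respect to (S, V) is positive definite.

```python
# A sympy sketch of 1.9.3: the Hessian of a toy convex E(S, V) is positive definite,
# so in particular d2E/dS2 > 0 (C_V > 0) and d2E/dV2 > 0 (kappa_S > 0).
import sympy as sp

S, V, a, b = sp.symbols('S V a b', positive=True)
E = a * V**sp.Rational(-2, 3) * sp.exp(b*S)   # assumed toy fundamental relation (N fixed)

H = sp.hessian(E, (S, V))
print(sp.simplify(H[0, 0]))    # a*b**2*exp(b*S)/V**(2/3)           > 0
print(sp.simplify(H[1, 1]))    # 10*a*exp(b*S)/(9*V**(8/3))         > 0
print(sp.simplify(H.det()))    # 2*a**2*b**2*exp(2*b*S)/(3*V**(10/3)) > 0
```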

1.9.4 More general consequences of positive semidefiniteness of Hessian matrix
Generally, the above-mentioned necessary and sufficient condition for the positive definiteness of the matrix Matr.(∂²E/∂X_i∂X_j) implies

∂(x_i, x_j, ···, x_l)/∂(X_i, X_j, ···, X_l) > 0,     (1.9.7)

where x_i is the conjugate variable of X_i:

x_i ≡ ∂E/∂X_i|_{X_1 X_2 ··· (except X_i) ··· X_l}.     (1.9.8)

¹⁶If the internal energy E is twice differentiable. E is always a C¹-function: the intensive variables are continuous functions, but twice differentiability may not be guaranteed.


In particular, we have

∂(T, −P)/∂(S, V) > 0 ⇒ ∂(T, P)/∂(S, V) < 0.     (1.9.9)

Notice that, whenever you use general formulas, the conjugate of V is not P but −P.

1.9.5 Use of other thermodynamic potentials
We can start with any stability inequality. For example, we may start from the inequality (1.7.4) for the Helmholtz free energy. In this case the temperature must be kept constant at T_e. The general condition corresponding to (1.9.3) is

(1/2) ∑_{i,j} (∂²A/∂Y_i∂Y_j)|_{T_e, other Y’s} δY_i δY_j ≥ 0,     (1.9.10)

where Y_i are the extensive natural variables for A.

Example 1. Show that any specific heat C_a is positive if the system is stable, where the suffix a implies that a is kept constant; a can be any variable, intensive or extensive.

Specific heat is always defined by

C_a ≡ T ∂S/∂T|_a  (= ∂′Q/∂T|_a).     (1.9.11)

If a is extensive, the derivative is just a diagonal element in Matr.(∂²E/∂X_i∂X_j); notice that (∂²E/∂S²)_Y = T/C_Y. If a is intensive, we use

∂(S, x)/∂(T, x) = [∂(S, X)/∂(T, x)] [∂(S, x)/∂(S, X)],   (by (1.8.9))     (1.9.12)

where X is the extensive conjugate variable of x. The first factor on the RHS (= right-hand side) is just a 2 × 2 case of (1.9.7), while the second factor is its 1 × 1 case. Thus, we may conclude, for example, that C_P is positive. □

Generally, we can show

∂X/∂x|_... > 0,     (1.9.13)

where (X, x) is a conjugate pair and ··· denotes various constraints.

1.9.6 Le Chatelier’s principle
Suppose an equilibrium state is disturbed by applying a small change in X. Le Chatelier’s principle asserts that the direct effect of this change on x occurs in the direction that eases the effect of the change in X. This is an interpretation of

∂x_i/∂X_i|_{··· except X_i ···} ≥ 0.     (1.9.14)


For example, suppose we introduce heat into a system in equilibrium. This isinterpreted as increasing S of the system. If the system temperature went down(i.e., ∆T < 0), then more heat could flow in. This causes further decrease of thesystem temperature and the situation runs away. Certainly, the equilibrium stateof the system cannot be stable. Therefore, increasing S (i.e., ∆S ≥ 0) must imply∆T ≥ 0 if the equilibrium is stable. This implies the positivity of the specific heat.

1.9.7 Le Chatelier-Braun's principle
Suppose an equilibrium state is disturbed by applying a small change in X. Le Chatelier-Braun's principle asserts that the indirect effect of this change on y occurs in the direction to ease the effect of X. This is an interpretation of

(∂x/∂X)_Y ≥ (∂x/∂X)_y,  (1.9.15)

or we may write

(∆x)_Y ≥ (∆x)_y.  (1.9.16)

Let us demonstrate (1.9.15).

(∂x/∂X)_y = ∂(x, y)/∂(X, y) = [∂(x, y)/∂(X, Y)] [∂(X, Y)/∂(X, y)]  (1.9.17)
= (∂Y/∂y)_X [ (∂x/∂X)_Y (∂y/∂Y)_X − (∂x/∂Y)_X (∂y/∂X)_Y ]  (1.9.18)
= (∂x/∂X)_Y − (∂Y/∂y)_X (∂x/∂Y)_X (∂y/∂X)_Y  (1.9.19)
= (∂x/∂X)_Y − (∂Y/∂y)_X [∂(x, X)/∂(Y, X)] (∂y/∂X)_Y  (1.9.20)
= (∂x/∂X)_Y − (∂Y/∂y)_X [∂(x, X)/∂(Y, y)] [∂(Y, y)/∂(Y, X)] (∂y/∂X)_Y  (1.9.21)
= (∂x/∂X)_Y − (∂Y/∂y)_X [(∂y/∂X)_Y]²,  (1.9.22)

where Maxwell's relation (1.8.16) (→1.8.4) was used in the last step. We realize that the subtracted term in (1.9.22) is non-negative, since (∂Y/∂y)_X > 0 by (1.9.13). Therefore, we obtain (1.9.15).

For example, suppose we introduce heat into a system in equilibrium. This isinterpreted as increasing S of the system. Then the system temperature increases.Let us keep the system volume. (∆T )V denotes the temperature increase under


constant volume. Now, we allow the system to change its volume under constant P .The temperature change (∆T )P under this condition should be smaller:

(∂T/∂S)_P ≤ (∂T/∂S)_V.  (1.9.23)

That is,

C_P ≥ C_V.  (1.9.24)
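For the monatomic classical ideal gas (an assumed concrete example) the inequality (1.9.24) can be checked directly; the following sympy sketch uses S(T, V) = Nk_B[ln V + (3/2) ln T] up to an additive constant, which does not affect the derivatives.

import sympy as sp

T, V, P, N, kB = sp.symbols('T V P N k_B', positive=True)

S_TV = N*kB*(sp.log(V) + sp.Rational(3, 2)*sp.log(T))
C_V = sp.simplify(T*sp.diff(S_TV, T))        # T (∂S/∂T)_V = 3NkB/2

S_TP = S_TV.subs(V, N*kB*T/P)                # switch to (T, P) via PV = NkBT
C_P = sp.simplify(T*sp.diff(S_TP, T))        # T (∂S/∂T)_P = 5NkB/2

print(C_V, C_P, sp.simplify(C_P - C_V))      # the difference N kB is >= 0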

1.10 Ideal rubber

1.10.1 Ideal rubber band
A rubber band is made of many flexible chain molecules. In an idealization we assume that no energy change is required to alter the conformations of the molecules, and that the molecules do not interact with each other. Due to thermal motion, the monomers making up each polymer tend to point in random spatial directions.

[Fig. 1.9 A polymer chain (a) and a chain of dancing children (b, after N. Saito).]

This means that the chains spontaneously coil up. Thus if a rubber band, which may be regarded as a bundle of these chains, is stretched, it resists stretching. As illustrated in Fig. 1.9, monomers (arrows in a) are like children hand-in-hand, dancing around as in b. The two end flag poles are surely pulled inward.

The internal energy of a rubber band can be written as (for simplicity, we consider a 1D stretch)

dE = TdS + FdL,  (1.10.1)


where F is the tensile force and L the total length of the rubber band. For ideal rubber, no volume change occurs upon stretching.

1.10.2 Entropic elasticity of rubber band
Under constant stretching force F, the length should become shorter if the temperature is raised. Hence, we assume

(∂L/∂T)_F < 0.  (1.10.2)

Thermodynamics cannot demonstrate this inequality. It should come from empirical data or from more microscopic considerations. Many interesting conclusions follow from this single inequality and the general thermodynamic framework.

The inequality (1.10.2) is opposite to the behavior of usual substances, which expand upon heating. (1.10.2) is a signature of 'entropic' elasticity: elasticity due to the moving around of the microscopic constituents. Usual solids soften due to thermal motion, because their elasticity comes from the energetic interactions among constituents, so the moving around that causes thermal expansion weakens their elastic constants.

1.10.3 Upon stretching entropy decreases
What happens to the entropy of the rubber band, if we stretch it under constant temperature? The Maxwell relation

(∂S/∂F)_T = +(∂L/∂T)_F  (< 0)  (1.10.3)

is enough to answer the question. Stretching certainly hinders the motion of the chain, or of the children in Fig. 1.9 in 1.10.1. The above inequality tells us that entropy indicates the randomness of the microscopic constituents of a macroscopic system.

1.10.4 Adiabatic stretching raises temperature
What happens to the temperature, if we adiabatically (i.e., under constant S) stretch the band? We wish to know the sign of

(∂T/∂F)_S = ∂(T, S)/∂(F, S) = [∂(T, S)/∂(F, L)] [∂(F, L)/∂(F, S)] = −(∂L/∂S)_F,  (1.10.4)

where (1.8.16) was used in the last step.

The relation tells us that we can answer the above question through answering thequestion about the length change of a band, when entropy is increased under constantstretching force. The latter question may not be intuitively easy. Let us use the ideathat entropy indicates the randomness of the chain (or how many different shapes the


chain can assume relatively easily). Increasing entropy requires easier movement ofthe chain. Then, L should decrease. Therefore, (1.10.4) implies that the temperatureof the chain should rise upon increasing the stretching force.

Let us perform a formal calculation:

(∂T/∂F)_S = ∂(T, S)/∂(F, S) = [∂(F, T)/∂(F, S)] [∂(T, S)/∂(F, T)] = −(T/C_F)(∂S/∂F)_T > 0.  (1.10.5)

CF is the specific heat under constant force, which is positive (see (1.9.11)). We havealso used (1.10.3). Thus, the rubber band indeed becomes warm.

Perform an experiment! You can easily feel this temperature increase by touchinga rapidly stretched rubber band (a thick one such as used to bind broccoli in thegrocery store) with your lip. If you rapidly relax the stretched rubber band after itequilibrates with the room temperature, you can again feel with your lip that theband has cooled off.

In the above inequality F can be replaced with L:

(∂T/∂L)_S = (∂T/∂F)_S (∂F/∂L)_S > 0.  (1.10.6)

The second factor on the RHS is positive, since it is a diagonal element (cf. (1.9.13)).

What happens if we stretch the rubber band under constant temperature? Let us study the entropy.

(∂S/∂L)_T = ∂(S, T)/∂(L, T) = [∂(S, L)/∂(L, T)] [∂(S, T)/∂(S, L)] = −(C_L/T)(∂T/∂L)_S < 0,  (1.10.7)

where CL is the specific heat under constant length, and we have used (1.10.6).

Exercise 1. How about the following derivatives?

(∂F/∂S)_L,  (∂L/∂S)_F,  (∂F/∂S)_T.  (1.10.8)

□

1.11 Third law of thermodynamics

1.11.1 Nernst's law and the third law
Nernst empirically found that all the derivatives of entropy S vanish as T → 0 (Nernst's law). For example,

(∂S/∂P)_T = −(∂V/∂T)_P → 0.  (1.11.1)

All specific heats vanish as T → 0. Notice that these observations contradict the ideal gas law. Nernst concluded that entropy becomes constant (independent of any thermodynamic variables) in the T → 0 limit. Later, Planck chose this constant to be zero (the concept of absolute entropy). We will discuss absolute entropy later in light of statistical mechanics.

S = 0 at T = 0 is sometimes chosen as the third law of thermodynamics. We adopt the following:

ThIII Reversible change of entropy ∆S vanishes in the T → 0 limit.

1.11.2 Adiabatic cooling
To cool a system we often use adiabatic cooling. During a reversible adiabatic process entropy is kept constant. Therefore, if entropy depends not only on T but also on some thermodynamic variable a, there should be a way to decrease T by changing a, since

S(T, a) = const.  (1.11.2)

We already know a good example. When we relax a stretched rubber band sufficiently rapidly, the temperature of the band decreases. The mechanism for this cooling is almost identical to the adiabatic demagnetization method often employed to reach very low temperatures.

A piece of a magnetic material contains many spins (or microscopic magnetic moments17). If we connect these spins in a head-to-tail fashion, we get something like a polymer chain. Stretching the chain corresponds to aligning the spins, which we can accomplish with an external magnetic field. Hence, if we remove the external magnetic field, the temperature of the system decreases.

More formally, we can compute

(∂T/∂H)_S = −[∂(S, T)/∂(H, T)] [∂(H, T)/∂(H, S)] = −(T/C_H)(∂S/∂H)_T > 0,  (1.11.3)

where C_H is the heat capacity under constant magnetic field, and (∂S/∂H)_T < 0 has been used:

(∂S/∂H)_T = (∂M/∂T)_H < 0.  (1.11.4)

17However, the interactions aligning the moments are not moment-moment interactions butelectron exchange interactions.


This inequality must be assumed within thermodynamics, but an intuitive microscopic understanding is not hard. Hence, (1.11.3) implies that if H is decreased, T decreases.
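The following sympy sketch verifies these signs on a concrete model not treated in the text: N independent moments m in a field H (the standard two-level paramagnet), for which A(T, H) = −Nk_BT ln[2 cosh(mH/k_BT)] is assumed.

import sympy as sp

T, H, N, kB, m = sp.symbols('T H N k_B m', positive=True)
A = -N*kB*T*sp.log(2*sp.cosh(m*H/(kB*T)))     # assumed free energy A(T, H)

S = -sp.diff(A, T)                            # entropy
M = -sp.diff(A, H)                            # magnetization
dSdH = sp.simplify(sp.diff(S, H))             # (∂S/∂H)_T
dMdT = sp.simplify(sp.diff(M, T))             # (∂M/∂T)_H, negative
print(sp.simplify(dSdH - dMdT))               # 0: the Maxwell relation (1.11.4)

C_H = sp.simplify(T*sp.diff(S, T))            # heat capacity at constant H (> 0)
print(sp.simplify(-(T/C_H)*dSdH))             # (∂T/∂H)_S simplifies to T/H > 0,
                                              # so lowering H lowers T adiabatically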

1.11.3 Absolute zero is unattainable
Since the derivatives of S become smaller as we come closer to the absolute zero temperature (Nernst's law), any cooling method becomes inefficient sufficiently close to T = 0. Thus, we cannot reach T = 0.18

18 However, this does not imply the third law.


Chapter 2

Statistical Mechanics Primer

Rudimentary probability and combinatorics are summarized in Appendices 2.A and2.B.

2.1 Basic hypothesis of equilibrium statistical mechanics

2.1.1 Phase space
We describe a given macroscopic system microscopically in terms of mechanics. At every instant, the system takes a definite microscopic state. The totality Ω of all the admissible microscopic states is called the phase space of the system. Microstates are the elementary events in the terminology of probability theory (→2.A.8).

If we assume that the system can be described by classical mechanics, every microscopic state is designated by the positions and velocities (momenta) of all the particles constituting the system. Therefore, we introduce a 6N-dimensional space called the phase space, and a microstate corresponds uniquely to a point in this space.

Quantum mechanically, the phase space may be identified with the vector spacespanned by all the eigenstates of the Hamiltonian of the system.1

1Precisely speaking, only the ‘direction’ of the vector matters, so the phase space is a collectionof rays.


2.1.2 Why statistical descriptions?
We cannot expect to be able to describe a macroscopic system completely at the microstate level. At best we hope for a kind of statistical description of the system. This is because our macroscopic observations are not instantaneous, and also because macroscopic objects can be regarded as an ensemble of statistically (more or less) independent subsystems which are again macroscopic. This is empirically guaranteed by the fourth law of thermodynamics (→1.4.1).

To develop a probabilistic description, we must know the probabilities of elemen-tary events, i.e., microstates. Even if we assume that the world is completely describ-able by mechanics, we cannot derive the necessary fundamental probabilities frommechanics alone. Thus, we postulate a general probabilistic law about microscopicevents, whose justification comes from the success of the framework a posteriori.

2.1.3 Principle of equal probability
Consider an isolated system. The fundamental postulate of equilibrium statistical mechanics is:

Principle of equal probability: All microstates (i.e., elementary events) are equallyprobable.

Of course, the said microstates must be the ones compatible with the constraintsimposed on the macroscopic system.

As is mentioned above, there is no justification of this postulate from the atomisticmechanical picture of the world; invariably, something extramechanical creeps intothe derivation.

We must not forget that there cannot be any truly isolated system in this universe. A famous argument by E. Borel goes as follows: if one gram of matter is displaced by 1 cm on Sirius (11 light years away), the gravitational field around us changes by one part in 10^100. This tiny change is, however, enough to completely destroy the intrinsic mechanical behavior (say, particle trajectories in classical mechanics) of the system after 1 nsec.

2.2 Boltzmann’s principle

2.2.1 Probability and entropy
In statistical mechanics any macroscopic state is interpreted as a set of microstates


which give the same macroscopic observable values (the same thermodynamic quan-tities). If the system is isolated, all the microstates are equally likely (→2.1.3), sothey have the same probability P to be observed.

We have learned the interpretation that entropy is a measure of microscopic dis-order (cf. 1.10.3, 1.10.4 in Chapter 1). If a macrostate has more microstates thatare compatible with it, then its entropy should be larger. Therefore, in any case it issensible to assume that entropy S is a function of P : S = S(P ). This is the crucialpoint Boltzmann realized about 100 years ago.

2.2.2 Boltzmann's principle
Let us 'derive' Boltzmann's principle: the entropy of a macrostate is given by

S = kB log Ω, (2.2.1)

where kB is the Boltzmann constant, and Ω is the number of microscopic states (orthe phase volume of the set of microstates) compatible with the macrostate of anisolated system.

A crucial observation is that entropy is an extensive quantity. If we form a com-pound system by combining two systems I and II already in thermal equilibriumwith each other, the entropy of the compound system is the sum of that of eachcomponent.

The interaction introduced by the contact of the two systems is, for macroscopicsystems, a very weak one. In any case, the effect is confined to the boundary layerwhose thickness is microscopic. The probability to observe a microstate of the com-pound system can be computed by simply multiplying the probabilities to observethe microstate for each subsystem. In other words, the two subsystems may be re-garded statistically independent (cf. (2.A.12)).

Combining the above considerations, we have arrived at the following functionalrelation:

S(PIPII) = S(PI) + S(PII), (2.2.2)

where the suffixes denote the subsystems.

Assume that S is a sufficiently smooth function. We conclude from this relation that S is proportional to log P. The fundamental postulate tells us that P = 1/Ω. Therefore, we have arrived at (2.2.1). The proportionality coefficient k_B must be positive, because entropy should be larger for larger E, which corresponds to larger Ω (see 1.5.5).

2.2.3 Equilibrium state is the most probable state
We know that the equilibrium state corresponds to the maximum entropy for an isolated system (1.6.3). Formula (2.2.1) implies that the equilibrium state is the


most probable macrostate (meaning that it corresponds to the largest number of microstates). Thermodynamic irreversibility is due to the change of less likely macrostates towards more probable macrostates. Usually, an ordered state has fewer compatible microstates than a disordered state, so that spontaneous processes increase the microscopic disorder of a system. In this way the origin of irreversibility is intuitively understood.

2.2.4 Calculation of intensive quantities
Once we know the number of microscopic configurations Ω of an isolated system as a function of energy E, volume V and the number (or molarity) of particles N, we can compute T, P, and µ with the aid of thermodynamic relations: it is convenient to rewrite the Gibbs relation (1.3.5) as follows (be careful about the signs):

dS = (1/T)dE + (P/T)dV − (µ/T)dN − (H/T)·dM + ··· .  (2.2.3)

From this, we get

∂S/∂E = 1/T,  (2.2.4)

∂S/∂V = P/T,  (2.2.5)

∂S/∂N = −µ/T.  (2.2.6)

These derivatives are called conjugate intensive variables (with respect to entropy).

2.2.5 Example: Schottky defects
Let us consider an isolated crystal with point defects (vacancies) on the lattice sites (Schottky defects). To create one such defect we must move an atom from a lattice point to the surface of the crystal. The energy cost for this is assumed to be ε. Although the number n of vacancies is macroscopic, we may still assume it to be very small compared to the number N of all lattice sites. Hence, we may assume that the volume of the system is constant. The energy of the system is a macroscopic (thermodynamic) variable which completely specifies macrostates.

We must compute Ω(ℰ) as a function of the total energy ℰ, which is given by

ℰ = nε.  (2.2.7)

The average of ℰ is the internal energy E. We may consider Ω as a function of n. Obviously,

Ω(n) = N!/[n!(N − n)!].  (2.2.8)

To compute this, we use Stirling's formula to evaluate N! asymptotically:

N! ≈ (N/e)^N,  (2.2.9)

or

log N! ≈ N log N − N.  (2.2.10)

Boltzmann's principle gives us

S = k_B log Ω(n) ≃ k_B[N log N − n log n − (N − n) log(N − n)].  (2.2.11)

Using (2.2.4), we get

1/T = (∂S/∂ℰ)_V = (1/ε) dS/dn = (k_B/ε) log[(N − n)/n].  (2.2.12)

If the temperature is sufficiently low or ε is sufficiently large so that ε/k_BT ≫ 1, the above formula reduces to

ε/k_BT ≃ log(N/n).  (2.2.13)

Hence, under this low temperature condition, the internal energy E reads

E = εN e^{−ε/k_BT}.  (2.2.14)

The constant volume specific heat C_V of the system can be obtained as

C_V = dE/dT = Nk_B (ε/k_BT)² e^{−ε/k_BT}.  (2.2.15)

Exercise 1. Find the formula for C_V correct for all T. Notice that C_V has a peak when k_BT is of order ε. □
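A numerical sketch of this exercise: (2.2.12) without the low-temperature approximation gives n/N = 1/(e^{ε/k_BT} + 1), so E = Nε/(e^{ε/k_BT} + 1), and the exact C_V and its peak can be evaluated directly.

import numpy as np

def schottky_cv(T, eps=1.0, N=1.0, kB=1.0):
    """Exact specific heat C_V of N two-level (Schottky) defects."""
    x = eps / (kB * T)
    return N * kB * x**2 * np.exp(x) / (np.exp(x) + 1.0)**2

T = np.linspace(0.05, 3.0, 60)                 # in units of eps/kB
cv = schottky_cv(T)
print("peak near kB*T/eps =", T[np.argmax(cv)])  # ~0.4, i.e. kB T of order eps
# low-T check against (2.2.15):
print(schottky_cv(0.1), (1/0.1)**2 * np.exp(-1/0.1))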

2.3 Equilibrium at constant temperature

2.3.1 System + thermostat is considered as isolated
Let us consider a closed system in a thermostat (or a heat bath). In this case, the total energy of the system is no longer constant, but can fluctuate. Instead, the temperature T is kept constant. We assume V and N are also kept constant (the


system is in a rigid container). To study this system we use the same trick used inthermodynamics (see 1.6.4). We embed our system (I) in the heat bath (II), andassume that the composite system is an isolated equilibrium system. The theorydeveloped in the preceding section then applies.

The total energy E0 of the system is given by

E0 = EI + EII. (2.3.1)

The number of microscopic states for system I (resp., II) with energy EI (resp.,EII) is denoted by ΩI(EI) (resp., ΩII(EII)). Thermal contact is a very weak in-teraction, so the two systems are statistically independent. Hence, the number ofmicrostates for the composite system with the energies EI in I and EII in II is givenby

ΩI(EI)ΩII(EII). (2.3.2)

The total number Ω(E0) of microstates for the composite system must be the sumof this product over all the ways to distribute energy to I and II. Therefore, we get

Ω(E_0) = Σ_{0 ≤ E_I ≤ E_0} Ω_I(E_I) Ω_II(E_0 − E_I).  (2.3.3)

2.3.2 Derivation of canonical distribution
The fundamental postulate of equilibrium statistical mechanics 2.1.3 implies that the probability for the system I to have energy ℰ (or more precisely, energy between ℰ and ℰ + dℰ) is given by

P(E_I ∼ ℰ) = Ω_I(ℰ) Ω_II(E_0 − ℰ)/Ω(E_0).  (2.3.4)

Consider the equilibrium state of the composite system; the subsystems must also be in thermal equilibrium. We may use Boltzmann's principle 2.2.2 to rewrite Ω_II(E_II) as²

Ω_II(E_II) = exp(S_II(E_II)/k_B).  (2.3.5)

The system II is huge compared with I. Expand the entropy as follows:

S_II(E_0 − ℰ) = S_II(E_0) − ℰ ∂S_II/∂E_II + (1/2) ℰ² ∂²S_II/∂E_II² + ··· .  (2.3.6)

2 Strictly speaking, the thermodynamic entropy S is defined only for equilibrium states, so although S(E) is defined, the general S(ℰ) has not been defined. We must say that this general S is defined by (2.3.5). However, here we may interpret, for example, S(E_II) as the entropy of system II in equilibrium with internal energy E_II.


and denote the temperature of the heat bath (i.e., system II) by T:

∂S_II/∂E_II = 1/T.  (2.3.7)

The most probable ℰ should be close to the internal energy of system I, so that due to the extensivity of internal energy it should be of order N_I, the total number of particles in system I. The second derivative in (2.3.6) should be of order N_II^{-1}, so the ratio of the third term to the second term in (2.3.6) is of order N_I/N_II, which is negligibly small. Thus, we can streamline (2.3.4) as

P(E_I ∼ ℰ) ∝ Ω_I(ℰ) e^{−βℰ},  (2.3.8)

where the standard notation

β = 1/k_BT  (2.3.9)

is used.

2.3.3 Canonical partition function
To compute the probability we need the normalization constant for (2.3.8),

Z = Σ_ℰ Ω_I(ℰ) e^{−βℰ},  (2.3.10)

which is called the canonical partition function. The sum may become an integral. A more microscopic expression is also possible:

Z = Σ_{all microstates} e^{−βℰ} = Tr e^{−βH}.  (2.3.11)

If we decompose the sum as follows, we can easily understand this formula:

Σ_{all microstates} = Σ_ℰ Σ_{all microstates with energy ∼ ℰ},  (2.3.12)

but

Σ_{all microstates with energy ∼ ℰ} e^{−βℰ} = Ω(ℰ) e^{−βℰ}.  (2.3.13)

The probability distribution we have obtained,

P(ℰ) = (1/Z) Ω_I(ℰ) e^{−βℰ},  (2.3.14)

is called the Gibbs-Boltzmann distribution.


2.3.4 Calculation of internal energy
Once the canonical partition function is known, the internal energy of the system can be obtained easily:

E = ⟨ℰ⟩ = Σ_ℰ P(ℰ) ℰ = (1/Z) Σ_ℰ ℰ Ω_I(ℰ) e^{−βℰ} = −∂ log Z(β)/∂β,  (2.3.15)

where Z (cf. (2.3.11)) is explicitly written as a function of β. The formula should be easily understood from the corresponding general formula (2.A.17) for the generating function. Indeed, the canonical partition function is the generating function of energy.

2.3.5 Calculation of Helmholtz free energy
Boltzmann's principle tells us that Ω(ℰ) ∼ exp(S(ℰ)/k_B), an increasing function of ℰ. Due to the extensivity of entropy, S(ℰ)/k_B is of order N. The energy of the system is also of the same order. Hence, Ω(ℰ) exp(−βℰ) is sharply peaked around the most probable value of ℰ, which should be very close to the internal energy E, the average of ℰ.

This knowledge can be used to evaluate the canonical partition function Z:

Z = Σ_ℰ Ω(ℰ) e^{−βℰ} = Σ_ℰ e^{−β(ℰ − TS(ℰ))} ≃ e^{−β(E − TS(E))},  (2.3.16)

where E is the internal energy that is identified with the most probable value of the energy. The error in this estimate of log Z, which itself is extensive (i.e., of order N), is only of order log N, so the estimate is extremely accurate.

From this we conclude the following equation of immense importance:

A = −kBT logZ. (2.3.17)

This is a consequence of Boltzmann’s principle, but practically it is much more usefulthan the principle itself.

It turns out that (2.3.15) is a thermodynamically well-known formula:

∂(A/T)/∂(1/T)|_V = E,  or  ∂(βA)/∂β|_V = E,  (2.3.18)

the Gibbs-Helmholtz formula.

2.3.6 Example: Schottky defects revisited
This is a continuation of 2.2.5. With Ω(n) known, it is easy to compute Z:

Z = Σ_n Ω(n) e^{−βnε} = (1 + e^{−βε})^N,  (2.3.19)


where we have used the binomial theorem (2.B.6). From this the Helmholtz free energy immediately follows:

A = −Nk_BT log(1 + e^{−βε}).  (2.3.20)

We can get the entropy by differentiation:

S = −∂A/∂T = Nk_B log(1 + e^{−βε}) + (Nε/T) e^{−βε}/(1 + e^{−βε}).  (2.3.21)

Compare this with the formula obtained directly from Boltzmann's principle. □

2.4 Simple systems

2.4.1 Non-interacting 1D harmonic oscillators
Consider a collection of N 1-dimensional harmonic oscillators which do not interact with each other at all.

Let us first examine a single oscillator of frequency ν. Elementary quantum mechanics tells us that the energy of the oscillator is quantized as

ε = (1/2 + n) hν,  n = 0, 1, 2, ··· .  (2.4.1)

Each eigenstate is nondegenerate. Thus, if we specify the quantum number n, the state of a single oscillator is completely specified. The canonical partition function for a single oscillator reads

Z_1 = Σ_{n=0}^∞ exp[−β(1/2 + n)hν].  (2.4.2)

Using (1 − x)^{-1} = 1 + x + x² + x³ + ··· (|x| < 1), we get

Z_1 = e^{−βhν/2}(1 − e^{−βhν})^{-1} = (2 sinh(βhν/2))^{-1}.  (2.4.3)

There are N independent oscillators, so the canonical partition function for the system should be

Z = Z_1^N.  (2.4.4)


You can honestly proceed as follows, too. The state of the system should be uniquely specified (cf. Fig. 2.1 in 2.4.5) if we know all the quantum numbers of the oscillators {n_1, n_2, ···, n_N}; we may identify this table and the microstate. The energy ℰ of the microstate {n_1, n_2, ···, n_N} is given by the sum of the energies of the individual oscillators:

ℰ = Σ_{i=1}^N (1/2 + n_i) hν.  (2.4.5)

The canonical partition function is, by definition,

Z = Σ_{n_1=0}^∞ Σ_{n_2=0}^∞ ··· Σ_{n_N=0}^∞ exp(−βNhν/2 − β Σ_{i=1}^N n_i hν) = (e^{−βhν/2} Σ_{n=0}^∞ e^{−βnhν})^N.  (2.4.6)

Thus, we have arrived at (2.4.4).

From (2.4.4) we obtain

A(N) = Nk_BT log(2 sinh(βhν/2)),  (2.4.7)

and

E = (Nhν/2) coth(βhν/2).  (2.4.8)

Exercise 1. Compute S and C_V. Notice that A(2N) = 2A(N), that is, the fourth law of thermodynamics is satisfied. We will return to this model when we study the specific heat of insulators (→2.6.3). □
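A sympy sketch of this exercise (it assumes the form (2.4.7) with the factor 2 inside the logarithm, consistent with (2.4.3)):

import sympy as sp

T, N, h, nu, kB = sp.symbols('T N h nu k_B', positive=True)
x = h*nu/(2*kB*T)                                 # βhν/2

A = N*kB*T*sp.log(2*sp.sinh(x))                   # (2.4.7)
S = sp.simplify(-sp.diff(A, T))                   # entropy
E = sp.simplify(A + T*S)                          # equivalent to (2.4.8)
C_V = sp.simplify(sp.diff(E, T))                  # specific heat

print(E)
print(C_V)
print(sp.simplify(A.subs(N, 2*N) - 2*A))          # 0: A(2N) = 2A(N), fourth law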

2.4.2 Ideal gas
Consider a gas consisting of N identical noninteracting particles. Each particle can have internal degrees of freedom which may be thermally excited. The gas is rare enough that we may ignore any quantum effect due to the indistinguishability of identical particles. For this to be true the average de Broglie wavelength of each particle must be much smaller than the average interparticle distance. The de Broglie wavelength λ is, on average,

λ ∼ h/√(mk_BT),  (2.4.9)

where m is the mass of the particle, and h is Planck's constant. The mean particle distance is (V/N)^{1/3}, so the condition we want is

N/V ≪ (mk_BT/h²)^{3/2}.  (2.4.10)

When this inequality is satisfied, we say the gas is classical. Notice that the dynamics of internal degrees of freedom such as vibration and rotation need not be


classical (cf. 2.7.4). Since there are no interactions between particles, each particlecannot sense the density. Consequently, the internal energy of the system must be afunction of T only: E = E(T ). This is a good characterization of ideal gases.
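An order-of-magnitude sketch of the criterion (2.4.10), for nitrogen at room temperature and atmospheric pressure (standard SI values; the gas species is an arbitrary illustrative choice):

kB = 1.381e-23      # J/K
h  = 6.626e-34      # J s
m  = 28 * 1.661e-27 # kg, N2 molecule
T  = 300.0          # K
P  = 1.013e5        # Pa

n_density = P / (kB * T)                 # N/V from the ideal gas law, ~2.4e25 m^-3
n_quantum = (m * kB * T / h**2)**1.5     # (m kB T / h^2)^{3/2}, ~9e30 m^-3

print(n_density / n_quantum)             # ~3e-6 << 1: the gas is safely classical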

2.4.3 Quantum calculation of one particle
Let us first compute the number of microstates allowed for a single particle in a box of volume V. To this end we solve the Schrödinger equation in a cube with edges of length L:

−(ℏ²/2m) ∆ψ = Eψ;  (2.4.11)

∆ is the Laplacian, and a homogeneous Dirichlet boundary condition ψ = 0 at the wall is imposed. As is well known, the eigenfunctions are

ψ_k ∝ sin(k_x x) sin(k_y y) sin(k_z z),  (2.4.12)

and the boundary condition requires the following quantization condition:

k ≡ (k_x, k_y, k_z) = (π/L)(n_x, n_y, n_z) ≡ (π/L) n.  (2.4.13)

Here, n_x, ··· are positive integers, 1, 2, ··· . The eigenfunction ψ_k belongs to the eigenvalue (energy) k²ℏ²/2m.

The number of states with wavevectors k in the range k to k + dk is

#{k | k < |k| < k + dk} = #{n | (L/π)k < |n| < (L/π)(k + dk)}
= (1/8) 4πn² dn = (1/8)(L³/π³) 4πk² dk = (1/2π²) V k² dk.  (2.4.14)

The factor 1/8 is required because the relevant k are only in the first octant.

Now we can compute the canonical partition function for a single particle using its definition:

Z_t = Σ_{n_x>0, n_y>0, n_z>0} exp(−βk²ℏ²/2m) ≃ (1/8)(V/π³) ∫_0^∞ 4πk² dk exp(−βk²ℏ²/2m).  (2.4.15)

The integration is readily performed:

Z_t(V) = V (2πmk_BT/h²)^{3/2}.  (2.4.16)


The important point of this result is that Z_t ∝ V.

If a particle has internal degrees of freedom, we must multiply by the corresponding "internal" partition function Z_i to get the one-particle partition function Z_1:

Z_1(V) = Z_t(V) Z_i.  (2.4.17)

For later convenience V is explicitly specified. This formula can easily be understood, because the energy of a particle is a sum of the kinetic energy of its translational motion and the energy of its internal degrees of freedom, E = E_t + E_i, and the two terms share no common variables. Z_i does not depend on V, because the internal degrees of freedom are not affected by the external environment (if the system is dilute enough).

2.4.4 Gibbs paradox
The partition function Z of the system consisting of N identical particles may seem to be

"Z = Z_1^N",  (2.4.18)

just as (2.4.4), because particles do not interact.

If we assume that all the particles are distinguishable as ordinary objects we observe daily on our length scale, the microstate is specified by a table of integer vectors {n_i}_{i=1}^N,³ where n_i is the vector appearing in (2.4.13) for the i-th particle.

Then, the canonical partition function should read

"Z = Σ_{n_1} Σ_{n_2} ··· Σ_{n_N} exp[−β(E_1 + E_2 + ··· + E_N)]",  (2.4.19)

where Ei is the energy of the i-th particle. The sum can readily be done and we get(2.4.18). Let us compute the Helmholtz free energy of the system. The fundamentalrelation (2.3.17) gives

A(N, V ) = −NkBT logZ1(V ). (2.4.20)

Now, prepare two identical systems each of volume V with N particles. The freeenergy of each system is given by A(N, V ). Next, combine these two systems to makea single system. The resultant system has 2N particles and volume 2V , so its freeenergy should be A(2N, 2V ). The fourth law of thermodynamics (1.4.2) requiresthat

A(2N, 2V ) = 2A(N, V ). (2.4.21)

3and internal states of each particle; here for simplicity we ignore internal degrees of freedom


Unfortunately, as you can easily check, this is not satisfied by (2.4.20). Thus wemust conclude (2.4.18) is wrong. This is the famous Gibbs paradox. Since the fourthlaw is an empirical fact, we must modify (2.4.18) to

Z = f(N) Z_1^N,  (2.4.22)

where f(N) is an as yet unspecified function of N. (2.4.21) requires

2 log f(N) = 2N log 2 + log f(2N),  (2.4.23)

or

f(N)² = 2^{2N} f(2N)  (more generally, f(N)^α = α^{αN} f(αN)).  (2.4.24)

The general solution to this functional equation is

f(N) = (f(1)/N)^N ∝ (N!)^{-1}.  (2.4.25)

∝ (N !)−1. (2.4.25)

Therefore, thermodynamics forces us to write

Z =1

N !ZN

1 , (2.4.26)

where we have discarded the unimportant multiplicative factor.
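The role of the 1/N! is easy to see numerically. The sketch below compares A/(Nk_BT) for a system and for its doubled copy, with Z_1 taken simply proportional to V (the prefactor of Z_1 drops out of the comparison):

from math import lgamma, log

def a_per_particle(N, V, with_Nfact):
    """A/(N kB T) for Z = Z1^N (optionally divided by N!), with Z1 = V."""
    logZ = N * log(V)
    if with_Nfact:
        logZ -= lgamma(N + 1)               # subtract log N!
    return -logZ / N

for with_Nfact in (False, True):
    a1 = a_per_particle(10_000, 1.0, with_Nfact)
    a2 = a_per_particle(20_000, 2.0, with_Nfact)   # doubled system
    print(with_Nfact, a1, a2)    # equal (up to O(log N / N)) only with the 1/N!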

2.4.5 What can we distinguish, and what not?
Why do we need 1/N! for gases and not for oscillators? The most important difference is: while in the case of oscillators each oscillator cannot move in space but simply sits on, say, a lattice point, each gas particle can move around. See Fig. 2.1. The combinatorial interpretation of N! (→2.B.17) implies that configurations (1) and (2) are distinguishable, but (3) and (4) are not.

We must conclude that each gas particle is indistinguishable from other particles.A configuration of gas particles is just like one pattern on a TV screen; each pixelis indistinguishable. Thermodynamics has forced us to abandon the naive realism;molecules are not the same as the macroscopic objects whose distinguishability wetake for granted.

2.5 Classical statistical mechanics


[Fig. 2.1 Left: harmonic oscillators, where the numbers are quantum numbers. Right: gas particles, where the numbers denote 'different' particles. Configurations (1) and (2) are distinguishable, but (3) and (4) are not.]

2.5.1 Classical formulation of canonical partition function
In classical mechanics a many-body system consisting of N point particles can be completely (microscopically) described by a table of all instantaneous positions and momenta of the particles {q_i, p_i}_{i=1}^N, where q_i is the position of the i-th particle, and p_i the momentum of the i-th particle. The vector (q_1, q_2, ···, q_N, p_1, p_2, ···, p_N) spans the phase space. It is natural to interpret the sum in the definition of the canonical partition function as an integral over phase space. Thus the classical canonical partition function would be written

"Z" = ∫···∫ dq_1 dq_2 ··· dq_N dp_1 dp_2 ··· dp_N e^{−βH},  (2.5.1)

where H is the total energy (the Hamiltonian) of the system. If the system consists of itinerant particles, Z must be divided by N! as discussed in 2.4.5. Compare this formula with the partition function for the ideal gas already computed as (2.4.15). Integration over space (which gives V) and the integration over k in that formula can be rewritten as

(V/2π²) ∫_0^∞ k² dk = (1/h³) ∫ dq_1 ∫ dp_1,  (2.5.2)

so Z becomes

Z = (1/N!h^{3N}) ∫ dq_1 ∫ dq_2 ··· ∫ dq_N ∫ dp_1 ∫ dp_2 ··· ∫ dp_N e^{−βH},  (2.5.3)

or, introducing the phase space volume element

dΓ ≡ dq_1 dq_2 ··· dq_N dp_1 dp_2 ··· dp_N,  (2.5.4)

we may write

Z = (1/h^{3N}N!) ∫ dΓ e^{−βH}.  (2.5.5)


2.5.2 Quantum-classical correspondence
To understand why this is the correct choice, we must reflect upon what is actually distinguishable as a distinct microstate. Heisenberg's uncertainty principle tells us that the errors (mean square errors) in the coordinate x and its conjugate momentum p_x must satisfy

∆x ∆p_x ≥ h.  (2.5.6)

For a one-dimensional system consisting of a single particle, the phase space of the system is a plane spanned by x and p_x. Due to the uncertainty principle we can at best distinguish squares of √h × √h. Hence, the number of distinguishable microstates in a domain of phase area A should be A/h.

This can easily be generalized to many-body systems. The number of microstates in a volume element dq_1 dq_2 ··· dq_N dp_1 dp_2 ··· dp_N should be dq_1 dq_2 ··· dq_N dp_1 dp_2 ··· dp_N/h^{3N}. Since the energy does not strongly depend on the phase position within such a cell, when the summation is replaced with the integration we may replace the energy with the function H (Hamiltonian = total energy). Thus the classical canonical partition function is defined by (2.5.5).

In fact, the formula (2.5.5) without Planck's constant was introduced by Gibbs before the advent of quantum mechanics. Later, it was realized that the correction factor 1/h^{3N} makes the transition between quantum and classical mechanics smooth. However, in most cases this factor gives an uninteresting additive term to the Helmholtz free energy, so we may often ignore it.

2.6 Heat capacity of solid

2.6.1 Harmonic solids
Consider a crystal made of N atoms, having 3N mechanical degrees of freedom. Small displacements of the atoms around their mechanical equilibrium positions should be a kind of harmonic oscillation. Thus, we may regard the crystal as a set of 3N independent harmonic oscillators (modes) of various frequencies (due to the coupling among atoms). As we have already shown, the partition function of the total system is the product of the partition functions of the individual harmonic modes.

2.6.2 Classical harmonic solids
Treating the system completely classically and using the definition of the classical


partition function (2.5.5), we get

Z_1 = (1/h) ∫ dp ∫ dq e^{−β(p² + ω²q²)/2} = 2πk_BT/hω = k_BT/ℏω.  (2.6.1)

The contribution of this oscillator to the internal energy is readily obtained as

E1 = kBT. (2.6.2)

This is independent of the frequency of the oscillator (equipartition of energy), so the total internal energy of the crystal is simply

E = 3NkBT. (2.6.3)

If the volume is kept constant, the frequencies are also kept constant. Therefore, theconstant volume specific heat CV is given by

CV = 3NkB, (2.6.4)

which is independent of temperature, a contradiction to the third law of thermody-namics (1.11.1).

Actual experimental results can be summarized as

C_V ∼ T³  (2.6.5)

at low temperatures. At higher temperatures (2.6.4) is correct and is called Dulong-Petit’s law.

2.6.3 Quantum harmonic solids
If we take quantum effects into account, the energy of each oscillator must be discrete. Most importantly, there is a gap between the ground state and the first excited state. Therefore, at low enough temperatures, excitation becomes prohibitively difficult, and the specific heat vanishes. Thus the quantum effect is the key to understanding the third law of thermodynamics (1.11.1).

This point was recognized by Einstein: he treated an ensemble of 3N identical 1Doscillators quantum mechanically (Einstein model) as a model of solid. Since the 1DEinstein model was already studied in 2.4.1, here we have only to replace N with3N in (2.4.7), (2.4.8), etc., so the internal energy is

E = (3/2)Nℏω coth(βℏω/2) = 3N [ (1/2)ℏω + ℏω/(e^{βℏω} − 1) ].  (2.6.6)


Hence, the specific heat is

C_V = 3Nk_B (ℏω/k_BT)² e^{βℏω}/(e^{βℏω} − 1)².  (2.6.7)

At sufficiently high temperatures (ℏω/k_BT ≪ 1) quantum effects should not be important. As expected we recover the classical result (2.6.4):

CV → 3NkB. (2.6.8)

For sufficiently low temperatures (ℏω/k_BT ≫ 1) (2.6.7) reduces to

C_V ≃ 3Nk_B (ℏω/k_BT)² e^{−βℏω}.  (2.6.9)

Thus, C_V vanishes at T = 0, and the third-law behavior is exhibited.

However, C_V goes to zero exponentially fast, at variance with the empirical law

(2.6.5) mentioned above. It is a rule that whenever there is a finite energy gap ε be-tween the ground and the first excited states, the specific heat behaves like exp(−βε)at low temperatures. The empirical result implies that there is no finite energy gapin a real crystal.

2.6.4 Real harmonic solids: density of states
Now think about an actual three-dimensional crystal. It is very hard to displace every other atom (i.e., one sublattice relative to the other sublattice), but it is easy to propagate sound waves, which are long-wavelength (relative to the atomic spacing) vibrations. Thus, an actual crystal should contain low-frequency oscillators (modes). Denote by f(ω)dω the number of vibrational modes with angular frequencies between ω and ω + dω. f(ω) is called the density of states. The number of modes must be identical to the number of degrees of freedom 3N (ω_D = highest allowed frequency):

∫_0^{ω_D} f(ω) dω = 3N.  (2.6.10)

Now consider a 3D lattice with a³N atoms, but with the lattice spacing 1/a of the original spacing. It has the same shape and size as the original crystal. The highest frequency becomes aω_D (Fig. 2.3). Hence,

∫_0^{aω_D} f(ω) dω = 3a³N.  (2.6.11)


[Fig. 2.3 Highest frequency modes (minimum wavelength λ_min; the sketch shows the case a = 3).]

Differentiating this equation with respect to a, we get

ω_D f(aω_D) = 9Na².  (2.6.12)

If the actual microscopic details do not matter, then this relation should hold for any a,⁴ so we conclude

f(x) ∝ x².  (2.6.13)

(You should have noticed that the 2 in this formula is actually d − 1, d being the spatial dimensionality.) N is extensive, but ω_D is not (it is volume independent), so f must be proportional to V. Thus, we may conclude that f(ω) = αω²V, where α is a proportionality constant, which we can fix with the aid of (2.6.10):

αω_D³V/3 = 3N.  (2.6.14)

Thus the density of states is

f(ω) = 9Nω²/ω_D³.  (2.6.15)

2.6.5 Debye model
The total Helmholtz free energy of a 3d crystal is given by

A = Σ_ω A_ω = ∫_0^{ω_D} dω f(ω) A_ω,  (2.6.16)

where (see (2.4.7))

A_ω = k_BT log(2 sinh(βℏω/2)).  (2.6.17)

If we use (2.6.15) for the density of states f, the model of the solid is called the Debye model. The Gibbs-Helmholtz formula gives for this model

E = (9/8)Nℏω_D + (9ℏN/ω_D³) ∫_0^{ω_D} ω³/(e^{βℏω} − 1) dω.  (2.6.18)

4Needless to say, this assumption is not true. However, if we are interested in the relatively lowfrequency modes (long-wavelength modes), this is not a bad assumption. This is enough for ourpurpose.


The high temperature limit is obtained by

∫_0^{ω_D} ω³/(e^{βℏω} − 1) dω → ω_D³/(3βℏ)  ⇒  E = 3Nk_BT + const.,  (2.6.19)

which of course agrees with the classical result.

In the low temperature limit the integral in the internal energy can be approximated as

∫_0^{ω_D} ω³/(e^{βℏω} − 1) dω ≃ ∫_0^∞ ω³/(e^{βℏω} − 1) dω ∝ T⁴.  (2.6.20)

From these formulas we can get

C_V → 3Nk_B  for ℏω_D ≪ k_BT,  and  C_V ∝ T³  for ℏω_D ≫ k_BT.  (2.6.21)

Warning. Do not confuse the word "state" used in "density of states" and in "number of microscopic states". In the former, "state" implies "eigenmode" of the system Hamiltonian, while in the latter "state" implies "eigenstate" of the system Hamiltonian. Thus, even for a single mode there are many different excitation states. □

2.7 Classical ideal gas

2.7.1 Free energy of classical ideal gas
We have already discussed the translational part of the partition function Z_T for the classical ideal gas (2.4.16). The partition function of a classical ideal gas consisting of N molecules is given by (see (2.4.26))

Z = (1/N!) Z_T(T, V)^N Z_i(T)^N,  (2.7.1)

where the independent thermodynamic variables are explicitly specified. From this formula it follows that

A(T, V, N) = −Nk_BT log[ (eV/N) (2πmk_BT/h²)^{3/2} Z_i(T) ],  (2.7.2)

where we have used Stirling's formula.

As concluded in 2.4.2 without any computation, the Gibbs-Helmholtz equation


gives the internal energy independent of the volume of the system (or of the density ρ ≡ N/V of the gas):

E(N, T) = E_i + (3/2)Nk_BT,  (2.7.3)

where E_i is the contribution from the internal degrees of freedom. The equation of state is found to be

P = −(∂A/∂V)_T = Nk_BT/V.  (2.7.4)

This is the well-known equation of state of the classical ideal gas.

Remark. Independence of the internal energy E from V, i.e.,

(∂E/∂V)_T = 0,  (2.7.5)

implies

(∂P/∂T)_V = P/T  (2.7.6)

(demonstrate this), so

P f(V) = T,  (2.7.7)

where f is a function of V. But since P and T are both intensive, this relation implies

P f(ρ) = T.  (2.7.8)

The equation of state of the ideal gas (2.7.4) is an example. □

2.7.2 Fundamental equation of state
It is clear from the derivation of the equation of state that the information about the internal degrees of freedom in the free energy is completely lost.

On the other hand, the entropy preserves this information, as is clear from

S = (E − A)/T = (3/2)Nk_B + Nk_B log[ (eV/N) (2πmk_BT/h²)^{3/2} Z_i(T) ].  (2.7.9)

From this S can be written in terms of V and E (Try it). The function S = S(E, V )is called by Gibbs the fundamental equation (of state). In contrast to the ordinaryequation of state such as (2.7.8), this equation contains the information not only forthe equation of state but also for the heat capacity.

The constant volume heat capacity CV can be obtained from (2.7.9) as

C_V = (3/2)Nk_B + C_{V,int},  (2.7.10)


where C_{V,int} is the contribution of the internal degrees of freedom:

C_{V,int} ≡ dE_i/dT,  with  E_i = Nk_BT² (d log Z_i/dT).  (2.7.11)

2.7.3 Absolute entropy
The entropy obtained above contradicts the third law of thermodynamics, because S ∝ log T (→ −∞ as T → 0). However, since S ∝ −log ρ + (3/2) log T, by making the density of the gas sufficiently low (ρ → 0) we can push down indefinitely the temperature below which the unphysical nature of the ideal gas emerges. In this sense, the classical ideal gas is a useful idealization of a real dilute gas.

Combining the equation of state (2.7.4) and the formula for the entropy (2.7.9), we can derive

log P = (5/2) log T + 5/2 + log[ k_B^{5/2} (2πm/h²)^{3/2} ] − S/Nk_B.  (2.7.12)

The entropy in this formula can be measured, if the ideal gas (i.e., a sufficientlydilute gas) is in equilibrium with a condensed phase. Thus, we can directly checkwhether the value of absolute entropy is correct or not.
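A numerical sketch of such a check for a monatomic gas (argon, Z_i = 1, 300 K, 1 atm; standard SI constants). The result lands close to the tabulated calorimetric entropy of argon, about 155 J mol⁻¹ K⁻¹.

from math import log, pi, e

kB, h, NA = 1.381e-23, 6.626e-34, 6.022e23
m, T, P = 39.95 * 1.661e-27, 300.0, 1.013e5   # argon mass, temperature, pressure

V_per_N = kB * T / P                          # V/N from the ideal gas law
S_per_NkB = log(e * V_per_N * (2*pi*m*kB*T/h**2)**1.5) + 1.5   # (2.7.9), Zi = 1
print(S_per_NkB * kB * NA)                    # ~ 155 J/(mol K)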

2.7.4 Contribution of internal degrees of freedom
The partition function Z_i(T) for the internal degrees of freedom is not easily computed, since we have to take into account the nature of the nuclei in the molecule.⁵

There are essentially four important internal degrees of freedom: nuclear spins,electrons, molecular rotation and vibrations.

Since the energy gaps between different states of nuclear spins are usually verysmall, we may assume that all the states are equally populated except at very lowtemperatures. Hence, this gives Zi a constant multiplicative factor (spin multiplic-ity) independent of T . That is, nuclear spins do not contribute to the heat capacityexcept for extremely low temperatures.

The electronic degrees of freedom have very large energy gaps between their ground and first excited states (∼ 5 eV, the order of ionization potentials), so they are virtually frozen up to a few thousand K (k_B = 8.62 × 10⁻⁵ eV/K, so 5 eV corresponds to ∼58,000 K). Thus electrons do not contribute to the heat capacity, either.

Vibrational degrees of freedom have energy gaps (or energy quanta) of order 0.1

5Here, we discuss only qualitative features except for this complication that occurs for lighthomonuclear diatomic molecules.


eV, so they cannot fully contribute at room temperatures. If T < 100K, we mayusually totally ignore vibrational degrees of freedom.

Rotational degrees of freedom have small energy quanta of order 10K or less exceptfor light diatomic molecules such as H2,D2,T2, etc; for polyatomic gases, we mayalways treat rotational degrees of freedom classically (except for methanes). Thusthe heat capacity of most classical ideal gases consists of translational and rotationalcontributions.

2.7.5 Momentum distribution of classical ideal gas particles
Denote the density distribution function for momentum p by f(p); its meaning is

f(p) dp = Prob{ a particle has momentum p such that its i-th component (i = x, y, z) is between p_i and p_i + dp_i }.  (2.7.13)

We can use the relation between the indicator and the probability (→2.A.14):

f(p′) = ⟨δ(p − p′)⟩,  (2.7.14)

where ⟨•⟩ is the equilibrium ensemble average,

⟨•⟩ ≡ [1/(h^{3N}N!)] ∫···∫ dq_1 dq_2 ··· dq_N dp_1 dp_2 ··· dp_N • e^{−β Σ p_i²/2m} / (Z_T^N/N!),  (2.7.15)

and irrelevant internal degrees of freedom have been ignored. Thus we get

f(p) = (2πmk_BT)^{−3/2} exp(−p²/2mk_BT).  (2.7.16)

This is the Maxwell distribution.

Exercise 1. Compute the average square velocity. Also compute the average square of the relative velocity of two arbitrary molecules. □
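A Monte Carlo sketch of this exercise in units m = k_B = T = 1, where each velocity component sampled from (2.7.16) is a standard normal variable:

import numpy as np

rng = np.random.default_rng(0)
v1 = rng.standard_normal((1_000_000, 3))       # velocities of molecule 1
v2 = rng.standard_normal((1_000_000, 3))       # velocities of molecule 2

print((v1**2).sum(axis=1).mean())              # ~ 3 = 3 kB T / m
print(((v1 - v2)**2).sum(axis=1).mean())       # ~ 6 = 2 x (3 kB T / m)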

2.8 Open systems

2.8.1 Open system
When a system can exchange not only energy but also particles with its environment, we call it an open system. Let us find an equilibrium distribution function for the number of particles in the system as well as for the energy. The strategy here is parallel to the one adopted for the study of closed systems. Embed our system in a reservoir


of energy and chemical species. Then, consider the system (I) and reservoir (II, thistime it is not only a heat bath but a chemostat) as a single isolated system (see1.6.4); Boltzmann’s principle may then be applied to the composite system. Wediscuss a system consisting of a single chemical species first, and later generalize ourresult to multicomponent systems.

2.8.2 System + chemostat at constant temperature
The total energy E_0 and the total number of particles N_0 of the system are given by

E0 = EI + EII, (2.8.1)

N0 = NI +NII. (2.8.2)

The number of microscopic states for the system I (II) with energy EI (EII) andparticle number NI (NII) is denoted by ΩI(EI, NI) (ΩII(EII, NII)). We assume theinteraction between the two systems to be very weak, so both systems are statisticallyindependent. Hence, the number of microstates for the composite system with anenergy EI in I and EII in II and with a number of particles NI in I and NII in II,respectively, is given by

ΩI(EI, NI)ΩII(EII, NII). (2.8.3)

The total number Ω(E_0, N_0) of microstates for the composite system must be the sum of this product over all the ways to distribute energy and molarity to I and II. Therefore, we get

Ω(E_0, N_0) = Σ_{0 ≤ E_I ≤ E_0, 0 ≤ N_I ≤ N_0} Ω_I(E_I, N_I) Ω_II(E_II, N_II).  (2.8.4)

2.8.3 Probability of the system macrostates
This entry is quite parallel to 2.3.2.

The fundamental postulate of equilibrium statistical mechanics 2.1.3 implies that the probability for the system I to have energy ℰ and molarity 𝒩 (or more precisely, energy between ℰ and ℰ + dℰ, and molarity between 𝒩 and 𝒩 + d𝒩) is given by

P(E_I ∼ ℰ, N_I ∼ 𝒩) = Ω_I(ℰ, 𝒩) Ω_II(E_0 − ℰ, N_0 − 𝒩)/Ω(E_0, N_0).  (2.8.5)

Now consider the equilibrium state of the composite system. The subsystems mustalso be in equilibrium. We may use Boltzmann’s principle to rewrite ΩII(EII, NII) as

ΩII(EII, NII) = exp[SII(EII, NII)/kB]. (2.8.6)


Since the reservoir (II) is huge compared to the system (I), the entropy may be expanded as

S_II(E_0 − ℰ, N_0 − 𝒩) = S_II(E_0, N_0) − ℰ ∂S_II/∂E_II − 𝒩 ∂S_II/∂N_II + second order terms.  (2.8.7)

Denote the temperature of the reservoir by T:

∂S_II/∂E_II = 1/T,  (2.8.8)

and introduce the chemical potential µ:

∂S_II/∂N_II = −µ/T.  (2.8.9)

The most probable E and N should be close to their macroscopically observablevalues for the system I, so that due to the extensivity of internal energy and particlenumber, they should be of order NI. Thus, as was discussed previously, the secondorder terms in (2.8.7) are of order NI/NII, which are negligibly small.

2.8.4 Grand canonical partition function
Now, the consideration in 2.8.3 allows us to streamline (2.8.5) as

P(E_I ∼ ℰ, N_I ∼ 𝒩) ∝ Ω_I(ℰ, 𝒩) e^{−βℰ + βµ𝒩}.  (2.8.10)

To get the probability we need the normalization constant Ξ:

Ξ = Σ_{ℰ,𝒩} Ω_I(ℰ, 𝒩) e^{−βℰ + βµ𝒩},  (2.8.11)

which is called the grand canonical partition function (or grand partition function). The sum over ℰ may become an integral. A more microscopic expression is also possible:

Ξ = Σ_{all microstates} e^{−βℰ + βµ𝒩}.  (2.8.12)

The probability density distribution calculated according to the grand canonical formalism reads

P(ℰ, 𝒩) = (1/Ξ) Ω_I(ℰ, 𝒩) e^{−βℰ + βµ𝒩}.  (2.8.13)


2.8.5 Relation between thermodynamics and grand partition function
The next task is to find the relation between what has been obtained and thermodynamics. We use the same idea as was used for the canonical partition function (→2.3.5):

Ξ = Σ_{ℰ,𝒩} Ω(ℰ, 𝒩) e^{−βℰ + βµ𝒩} = Σ_{ℰ,𝒩} e^{−β(ℰ − µ𝒩 − TS(ℰ,𝒩))} ≃ e^{−β(E − µN − TS(E,N))},  (2.8.14)

where in the rightmost expression E, N are the most probable values of energy and particle number, respectively. The error in this approximation for log Ξ, which is of order N, is only of order log N, so it is extremely accurate just as before. From this we get the following relation:

TS − E + µN = kBT log Ξ. (2.8.15)

Using

E = TS − PV + µN,  (2.8.16)

we finally conclude that

PV/T = k_B log Ξ.  (2.8.17)

PV/T is sometimes called Kramers' q-potential. From this the equation of state directly follows.

The Gibbs free energy can be easily obtained as

G = Nµ, (2.8.18)

as was already discussed thermodynamically (→1.7.3). Thus the easiest way to compute the Helmholtz free energy from the grand partition function is via

A = G − PV.  (2.8.19)

2.9 Ideal particle systems — quantum statistics

2.9.1 Specification of microstate for indistinguishable particle system
Let i (= 1, 2, ···) denote the i-th one-particle state, and n_i be the number (occupation number) of particles in this state. Since all the particles are indistinguishable, the


table of the occupation numbers n1, n2, · · · should be sufficient to specify an ele-mentary microstate completely. Thus we may identify this table and the microstate.

Let ε_i be the energy of the i-th one-particle state. The total energy E and the total number of particles N can be written as

E = Σ_i ε_i n_i,  (2.9.1)

and

N = Σ_i n_i.  (2.9.2)

2.9.2 Grand partition function of indistinguishable particle system
Let us compute the grand canonical partition function (2.8.12) for the system in 2.9.1:

Ξ(β, µ) = Σ_{n_1, n_2, ···} e^{−βE + βµN}.  (2.9.3)

Using the microscopic descriptions of E and N ((2.9.1) and (2.9.2)), we can rearrange the summation as

Ξ = Π_i Ξ_i,  (2.9.4)

where

Ξ_i ≡ Σ_{n_i} exp[−β(ε_i − µ)n_i].  (2.9.5)

This quantity may be called the grand canonical partition function for the one par-ticle state i. As was warned in 2.6.5, the term “state” is used here in the sense of“mode” (“i-th mode is occupied by ni particles”).

2.9.3 There are only fermions and bosons in the world
In the world it seems that there are only two kinds of particles:
bosons: there is no upper bound for the occupation number;
fermions: the occupation number can be at most 1 (the Pauli exclusion principle).

This is an empirical fact. Electrons, protons, etc., are fermions, and photons,phonons (= quanta of sound wave) are bosons.

There is the so-called spin-statistics relation: particles with half-odd-integer spins are fermions, and those with integer spins are bosons. The rule applies also to compound particles such as hydrogen atoms. Thus, H and T are bosons, but their nuclei are fermions. D and ³He are fermions. ⁴He is a boson, and so is its nucleus. A complication mentioned about the heat capacity of ideal classical gases in 2.7.4


is due to these differences.

2.9.4 Average occupation number for bosons
For bosons, any number of particles can occupy the same one-particle state, so

Ξ_i = Σ_{n=0}^∞ e^{−β(ε_i − µ)n} = (1 − e^{−β(ε_i − µ)})^{-1}.  (2.9.6)

The mean occupation number of the i-th state is given by

⟨n_i⟩ = Σ_{n=0}^∞ n e^{−β(ε_i − µ)n}/Ξ_i,  (2.9.7)

so we conclude

⟨n_i⟩ = ∂ log Ξ_i/∂(βµ)|_β = k_BT (∂ log Ξ_i/∂µ)_T = 1/(e^{β(ε_i − µ)} − 1).  (2.9.8)

This distribution is called the Bose-Einstein distribution.

Notice that the chemical potential must be smaller than the ground state energy

to maintain the positivity of the average occupation number.

2.9.5 Average occupation number for fermions
For fermions no one-particle state can be occupied by more than one particle, so the sum over the occupation number is merely the sum for n = 0 and n = 1:

Ξi = 1 + e−β(εi−µ). (2.9.9)

Thus, the mean occupation number is given by

⟨n_i⟩ = k_BT (∂ log Ξ_i/∂µ)_T = 1/(e^{β(ε_i − µ)} + 1).  (2.9.10)

This distribution is called the Fermi-Dirac distribution.

It is very important to recognize the qualitative features of the Fermi-Dirac distribution function (see Fig. 2.3).

2.9.6 Classical limit of occupation number
In order to obtain the classical limit, we must take the limit of vanishing occupation numbers to avoid quantum interference among particles (cf. 2.4.2). The chemical potential µ is a measure of the "strength" of the chemostat to push particles into the system.


[Fig. 2.3 The expected occupation number as a function of ε: it drops from 1 to 0 around ε = µ, and the cliff has a width of order k_BT. µ is called the Fermi potential. The curve is symmetric around the point (µ, 1/2); this is the so-called particle-hole symmetry.]

Thus, we must make the chemical potential extremely small: µ → −∞.

In this limit both the Bose-Einstein (2.9.8) and Fermi-Dirac distributions (2.9.10) reduce to the Maxwell-Boltzmann distribution, as expected:

⟨n_i⟩ → 𝒩 e^{−βε_i},  (2.9.11)

where 𝒩 = e^{βµ} is the normalization constant determined by the total number of particles in the system.
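A small numerical sketch of this limit, tabulating (2.9.8), (2.9.10) and (2.9.11) as functions of x = β(ε − µ):

import numpy as np

x = np.array([0.5, 1.0, 2.0, 5.0, 10.0])
n_BE = 1.0 / np.expm1(x)            # Bose-Einstein
n_FD = 1.0 / (np.exp(x) + 1.0)      # Fermi-Dirac
n_MB = np.exp(-x)                   # Maxwell-Boltzmann (classical limit)

for row in zip(x, n_BE, n_FD, n_MB):
    print("%5.1f  %.4e  %.4e  %.4e" % row)
# for x >~ 5 (occupation << 1) the three coincide, as claimed above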

2.10 Free fermion gas

2.10.1 Free electron model of metal
As an application of the Fermi-Dirac distribution (2.9.10), let us consider a free electron gas. Electrons are negatively charged particles, so the Coulomb interaction among them is very strong. However, in a metal the positive charges on the lattice neutralize the total charge, and due to the screening effect even the charge density fluctuations do not interact strongly. Thus, the free electron model of metals is at least a good "zeroth order" model of real metals.

2.10.2 We can apply grand canonical theory to closed systems
We apply the grand canonical scheme to a piece of metal. Since this is not an open system, it might appear that the grand canonical scheme is not applicable. In the large system limit (thermodynamic limit), however, the use of the grand canonical scheme to compute thermodynamic quantities is fully justified mathematically.⁶

6This is thanks to the so-called ensemble equivalence.


Intuitive understanding of this fact is not hard. A macroscopic piece of metal inequilibrium should be uniform, so we can get all the thermodynamic properties ofthe whole piece from a tiny (but still macroscopic) part. For this tiny portion, therest of the specimen acts as a reservoir. Or, we may rely on the following typicalstatistical mechanical assertion claiming that the properties of a macroscopic systemare virtually independent of the boundary conditions.

In a piece of metal the number of electrons N is fixed (or rather, the electron den-sity is fixed), so a correct electron chemical potential must be chosen to be consistentwith this density.

2.10.3 Total number of particles
The total number of particles must be

N = Σ_i ⟨n_i⟩.  (2.10.1)

The sum is over all the different one-particle states of the free electron. The state of a free electron is completely fixed by its spin (up or down) and its momentum p. We already know the density of states for free particles (cf. 2.4.3), so the density of states f for the free electron should read

f(p) dp = 2 × V 4πp² dp/h³,  (2.10.2)

where p ≡ |p|, V is the volume of the system (the factor 2 comes from the spin).It is convenient to write everything in terms of energy:

ε = p2/2m, (2.10.3)

where m is the mass of the electron. The density of states reads

f(ε)dε =4πV

h3(2m)3/2ε1/2dε. (2.10.4)

Thus, we have arrived at

N =4πV

h3(2m)3/2

∫ ∞

0

dεε1/2

eβ(ε−µ) + 1, (2.10.5)

which implicitly fixes µ as a function of T, V and N .


2.10.4 Free fermion system at T = 0
At T = 0, the Fermi-Dirac distribution is a step function:

⟨n(ε)⟩ = θ(ε_F − ε),   (2.10.6)

where ε_F = µ is the Fermi energy. Therefore, it is easy to explicitly compute the integral in (2.10.5) to get

N = (4πV/h³)(2m)^{3/2} (2/3) ε_F^{3/2},   (2.10.7)

or

ε_F = (h²/2m)(3N/8πV)^{2/3}.   (2.10.8)

For ordinary metals, ε_F is of the order of a few eV (for copper 7.00 eV, for gold 5.90 eV). This kinetic energy corresponds to a speed of electrons of order 1% of the speed of light.

Exercise 1. Compute the internal energy at T = 0. (Answer: 3Nε_F/5; this is the lowest possible energy for the system.)
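A short numerical check of (2.10.8) (a sketch, not part of the original text; the conduction-electron density of copper, about 8.5 × 10^28 m^−3 with one free electron per atom, is an assumed input):

import numpy as np

# Evaluate (2.10.8) with N/V = n; the Fermi velocity follows from eps_F = m v_F^2 / 2.
h, m_e, eV, c = 6.62607015e-34, 9.1093837015e-31, 1.602176634e-19, 2.99792458e8
n = 8.5e28                                   # assumed electron density of copper, m^-3

eps_F = h**2 / (2 * m_e) * (3 * n / (8 * np.pi))**(2 / 3)
v_F = np.sqrt(2 * eps_F / m_e)
print(eps_F / eV, v_F / c)                   # about 7 eV, and v_F/c of order 1%

The output reproduces the few-eV scale quoted above.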

2.10.5 Equation of state of free fermion gas
The equation of state reads

PV/k_BT = log Ξ = ∫ dε f(ε) log(1 + e^{−β(ε−µ)}),   (2.10.9)

and the easiest way to obtain the Helmholtz free energy is through (2.8.19):

A = Nµ − k_BT ∫ dε f(ε) log(1 + e^{−β(ε−µ)}).   (2.10.10)

2.10.6 Specific heat of fermion gas
Let us intuitively discuss the electronic heat capacity of metals at low temperatures. We may assume that the Fermi-Dirac distribution is (almost) a step function. From Fig. 2.3 in 2.9.5 we can infer that the number of excitable electrons (those within about k_BT of the Fermi level) is ∼ N k_BT/ε_F, and each of them picks up an energy of order k_BT. Therefore, their energetic contribution is ∆E ∼ N(k_BT)²/ε_F. Hence, C_V ∝ T at lower temperatures. Thus at sufficiently low temperatures this linear term dominates the heat capacity of metals, because there the phonon contribution ∝ T³ is much smaller than the electronic contribution ∝ T.
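The linear-T behavior can be checked numerically. The following sketch (not from the text) fixes N by solving for µ(T) and differentiates the energy; units are chosen so that k_B = 1, ε_F = 1 and the density of states is f(ε) = √ε:

import numpy as np
from scipy.integrate import quad
from scipy.optimize import brentq
from scipy.special import expit     # expit(x) = 1/(1 + e^{-x}), a stable Fermi factor

EMAX = 40.0                                    # integration cutoff >> eps_F and T

def n_of_mu(mu, T):
    return quad(lambda e: np.sqrt(e) * expit(-(e - mu) / T), 0, EMAX, points=[1.0])[0]

def energy(T, N):
    mu = brentq(lambda m: n_of_mu(m, T) - N, -1.0, 2.0)   # adjust mu so that N stays fixed
    return quad(lambda e: e**1.5 * expit(-(e - mu) / T), 0, EMAX, points=[1.0])[0]

N = 2.0 / 3.0                                  # N = integral of sqrt(eps) from 0 to eps_F = 1
for T in (0.01, 0.02, 0.04):
    dT = 1e-3
    cv = (energy(T + dT, N) - energy(T - dT, N)) / (2 * dT)
    print(f"T = {T:.2f}   C_V/T = {cv / T:.3f}")

The printed ratio C_V/T stays close to π²/3 ≈ 3.29 (the Sommerfeld coefficient in these units), confirming C_V ∝ T.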


2.11 Free bosons and Bose-Einstein condensation

2.11.1 Total number of particles of free boson gas
Let us take the ground state energy of the system to be the origin of energy.7 Also, the chemical potential cannot be positive. The total number of particles in the system of free bosons is given by

N = Σ_i 1/(e^{β(ε_i−µ)} − 1).   (2.11.1)

If T is sufficiently small, the terms corresponding to the energy levels near the ground state can become very large, so in general it is dangerous to approximate (2.11.1) by an integral with the aid of a smooth density of states (see Fig. 2.4). (In the case of fermions, each term cannot be larger than 1, so there is no problem at all in this approximation.)

7 The ground state energy must be finite for a system to exist as stable matter.

[Fig. 2.4: expected occupation number versus ε for T > Tc and for T < Tc; below Tc the ground state holds a macroscopic number N_0 of particles.]
Fig. 2.4 If T < Tc, where Tc is the Bose-Einstein condensation temperature (→2.11.2), then the ground state is occupied by N_0 = O[N] particles, so the approximation of (2.11.1) by an integral becomes grossly incorrect.

2.11.2 Bose-Einstein condensation
Let us try to approximate (2.11.1) by integration in 3d. In this case the density of states has the form f(ε) = Cε^{1/2}, C being a constant independent of T:

N ≈ N_1 ≡ C ∫_0^∞ dε ε^{1/2}/(e^{β(ε−µ)} − 1) ≤ C(k_BT)^{3/2} ∫_0^∞ dz z^{1/2}/(e^z − 1).   (2.11.2)

The equality holds when µ = 0. The integral is finite, so N_1 can be made indefinitely close to 0 by reducing T. However, the system should have N bosons independent of T, so there must be a temperature Tc at which

N = N1 (2.11.3)

and below it

N > N_1.   (2.11.4)

The temperature is called the Bose-Einstein condensation temperature; below this the continuum approximation breaks down.

What actually happens is that a macroscopic number N_0 (= N − N_1) of particles fall into the lowest-energy one-particle state (see Fig. 2.4 in 2.11.1). This phenomenon is called Bose-Einstein condensation.

2.11.3 Number of particles that do not undergo condensation
Now let us study N_0, the number of particles in the ground state, more closely. From (2.11.2), we know that N_1 is an increasing function of µ, but we cannot increase µ indefinitely; µ must be non-positive. Hence, at or below a certain particular temperature Tc, µ vanishes. Then, the equality holds in (2.11.2), so Tc is fixed by the condition

N = N_1 = C(k_BTc)^{3/2} ∫_0^∞ dz z^{1/2}/(e^z − 1).   (2.11.5)

Hence, we get for T < Tc (Fig. 2.5)

N_1 = N (T/Tc)^{3/2}.   (2.11.6)

[Fig. 2.5: N_1/N versus T, rising from 0 at T = 0 to 1 at T = Tc.]
Fig. 2.5 The ratio N_1/N of non-condensate atoms has a singularity at the Bose-Einstein condensation point Tc.

Remark. There is no Bose-Einstein condensation in one- and two-dimensional spaces, because the integral in (2.11.2) for these cases does not converge.
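For orientation, the condition (2.11.5) can be rewritten as n λ_Tc³ = ζ(3/2) with the thermal de Broglie wavelength λ_T = h/√(2πm k_BT), i.e. Tc = (2πħ²/m k_B)(n/ζ(3/2))^{2/3}. A numerical sketch (the ⁸⁷Rb mass and the density n = 10^19 m^−3 are assumed, illustrative values only):

import numpy as np
from scipy.special import zeta

hbar, kB, u = 1.054571817e-34, 1.380649e-23, 1.66053907e-27
m, n = 87 * u, 1e19                             # assumed: 87Rb atoms at n = 1e19 m^-3

Tc = (2 * np.pi * hbar**2 / (m * kB)) * (n / zeta(1.5))**(2 / 3)
print(Tc)                                       # ~ 1e-7 K, i.e. of the order of 100 nK

# condensate fraction below Tc, from N0 = N - N1 and (2.11.6): N0/N = 1 - (T/Tc)^(3/2)
for x in (0.25, 0.5, 0.75, 1.0):
    print(x, 1 - x**1.5)

The tiny Tc explains why Bose-Einstein condensation of dilute atomic gases requires nanokelvin temperatures.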

2.11.4 Internal energy and specific heat of ideal Bose gas
The Bose-Einstein condensate does not contribute to internal energy, so we may use the continuum approximation to compute the internal energy. Below Tc we may set µ = 0, so

E = ∫_0^∞ dε f(ε) ε/(e^{βε} − 1) = C ∫_0^∞ dε ε^{3/2}/(e^{βε} − 1).   (2.11.7)

Therefore, for T < Tc

E ∝ β^{−5/2}.   (2.11.8)

Of course, E must be extensive, so

E = N k_BTc (T/Tc)^{5/2} × const.,   (2.11.9)

where the constant is, according to a more detailed computation, equal to 0.770··· (correspondingly, C_V = 1.93··· N k_B (T/Tc)^{3/2}). From this the low-temperature heat capacity behaves as

C_V ∝ (T/Tc)^{3/2}.   (2.11.10)

This goes to zero with T as required by the third law (1.11.1).

2.12 Phonons and photons

2.12.1 Phonons
Phonons are quanta of sound waves. A harmonic mode with angular frequency ω has states with energy (see 2.4.1)

ε_n = (n + 1/2)ħω   (n = 0, 1, 2, ···).   (2.12.1)

When a harmonic mode has energy ε_n, we say there are n phonons in the harmonic mode. The canonical partition function of the phonon system has the same structure as the grand canonical partition function of free bosons with a zero chemical potential (see 2.4.1, if we ignore the contribution of the zero-point energy).8 Therefore, the average number of phonons of a harmonic mode is given by

⟨n⟩ = 1/(e^{βħω} − 1).   (2.12.2)

8 This is a purely formal mathematical relation; do not read too much physics into it.


2.12.2 Phonon contribution to internal energy
The phonon contribution to the internal energy of a system may be computed just as we did for the Debye model (→2.6.5). We need the density of states (i.e., the phonon spectrum) f(ω). The internal energy of all the phonons is given by

E = Σ_modes ⟨n(ω)⟩ħω = ∫ dω f(ω) ħω/(e^{βħω} − 1).   (2.12.3)

This is the internal energy without the contribution of zero-point energy. The latter contribution is a mere constant shifting the origin of energy, so it is thermodynamically irrelevant.

The approximation of the sum in (2.12.3) by an integral is always allowed, because the factor ħω removes the difficulty encountered in the Bose-Einstein condensation. The total number of phonons of a given mode diverges as ω → 0 (infrared catastrophe), but this is quite harmless, since these phonons do not have much energy.
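A numerical sketch of (2.12.3) (not from the text), assuming the Debye form f(ω) = 9Nω²/ω_D³ for ω < ω_D referred to in 2.6.5; units with k_B = ħ = ω_D = N = 1:

import numpy as np
from scipy.integrate import quad

def energy(T):
    integrand = lambda w: 9 * w**3 / np.expm1(w / T)       # f(w) * hbar*w * <n(w)>
    return quad(integrand, 0, 1.0, epsabs=1e-13, epsrel=1e-11)[0]

def heat_capacity(T, dT=1e-4):
    return (energy(T + dT) - energy(T - dT)) / (2 * dT)

for T in (0.02, 0.05, 0.1, 1.0, 5.0):
    C = heat_capacity(T)
    print(f"T = {T:4.2f}   C = {C:8.5f}   C/T^3 = {C / T**3:9.2f}")
# low T: C/T^3 levels off near 12*pi^4/5 = 233.8 (Debye T^3 law);
# high T: C approaches the Dulong-Petit value 3 N k_B = 3.

This reproduces the qualitative low- and high-temperature behavior of lattice heat capacities.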

2.12.3 Photons
Photons are quanta of light, i.e., of electromagnetic waves. Photons have spin 1, but they travel at the speed of light, so the z-component of their spin takes only the values ±1. This corresponds to the polarization vector of light. As with phonons, there is no constraint on the number of photons. Photons result from quantization of electromagnetic waves, so mathematically they behave just as phonons do. Thus, formally, we may treat photons as bosons with two internal states and with µ = 0. Hence, the number of photons of a given mode reads exactly the same as (2.12.2).

2.12.4 Planck's radiation law: derivation
Let f(ω)dω be the number of modes with angular frequency between ω and ω + dω (the photon spectrum). The internal energy dE_ω and the number dN_ω of photons in this range are given by

dE_ω = 2f(ω) ħω/(e^{βħω} − 1) dω,   (2.12.4)

dN_ω = 2f(ω) 1/(e^{βħω} − 1) dω.   (2.12.5)

The factor 2 comes from the polarization states.
A standard way to obtain the density of states is to study the wave equation governing the electromagnetic waves, but here we use a shortcut. The magnitude of the photon momentum is p = ħk = ħω/c, so

d³p d³q/h³ ⇒ (V/h³) 4πp² dp = (V/2π²c³) ω² dω,   (2.12.6)


i.e.,

f(ω) = Vω²/(2π²c³).   (2.12.7)

Therefore, the photon energy per unit volume reads

u(T, ω)dω = dE_ω/V = (ħω³/π²c³) · 1/(e^{βħω} − 1) dω.   (2.12.8)

This is called Planck’s radiation law.

2.12.5 Planck's radiation law: qualitative features
It is important to know some qualitative features of this law (Fig. 2.6).

Fig. 2.6 Classical electrodynamics gives the Rayleigh-Jeans formula (2.12.10) (green); this is the result of equipartition of energy, and due to the many UV modes the density is not integrable (the total energy diverges). Wien reached (2.12.11) empirically (red). Planck arrived at his formula (black) originally by interpolation of these results. Notice that the peak position is proportional to the temperature.

Planck's law can explain why the spectrum blue-shifts as temperature increases; this was not possible within the classical theory.

2.12.6 Total energy of radiation field
The total energy density u(T) of a radiation field is obtained by integration:

u(T) = ∫_0^∞ dω u(T, ω).   (2.12.9)

With Planck’s law (2.12.8) this is always finite. If the limit ~ → 0 is taken (theclassical limit), we get

u(T, ω) =kBTω

2

π2c3( = 2f(ω)kBT ) , (2.12.10)

which is the formula obtained by classical physics. Upon integration, the classical limit gives an infinite u(T). This divergence is obviously due to the contribution from the high-frequency modes; the difficulty is therefore called the ultraviolet catastrophe, which destroyed classical physics. Empirically, Wien proposed

u(T, ω) ≃ (ħω³/π²c³) e^{−βħω}.   (2.12.11)

The formula can be obtained from Planck's law in the limiting case ħω ≫ k_BT. Using Planck's law of radiation, we immediately get

u(T) ∝ T⁴,   (2.12.12)

which is called the Stefan-Boltzmann law. This was derived purely thermodynamically by Boltzmann before the advent of quantum mechanics (→2.12.8). The proportionality constant contains ħ, so it was impossible to obtain the factor theoretically before quantum theory (Stefan obtained it experimentally).
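A numerical sketch of the integration (the closed form π²k_B⁴/15ħ³c³ quoted in the comment is the standard result of doing the integral analytically, not something derived in this primer):

import numpy as np
from scipy.integrate import quad

# With x = beta*hbar*omega, (2.12.9) becomes
# u(T) = (k_B T)^4/(pi^2 hbar^3 c^3) * integral of x^3/(e^x - 1), which equals pi^4/15.
hbar, kB, c = 1.054571817e-34, 1.380649e-23, 2.99792458e8

x_integral = quad(lambda x: x**3 / np.expm1(x), 0, 60)[0]
print(x_integral, np.pi**4 / 15)                # both ≈ 6.4939

def u(T):
    return (kB * T)**4 / (np.pi**2 * (hbar * c)**3) * x_integral

print(u(300.0) / u(150.0))                      # = 16 = 2^4, i.e. u ∝ T^4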

2.12.7 Radiation pressure
Photons may be treated as ideal bosons with µ = 0. If µ = 0, then A = −PV, so the equation of state is immediately obtained as

PV/k_BT = log Ξ = log Z = −∫ dε f(ε) log(1 − e^{−βε}).   (2.12.13)

The density of states has the form f(ε) = Cε², where C is a numerical constant. We use

∫ dε Cε² log(1 − e^{−βε}) = ∫ dε (C/3)(ε³)′ log(1 − e^{−βε}) = −(Cβ/3) ∫ dε ε³ 1/(e^{βε} − 1),   (2.12.14)

where we have used integration by parts. This implies that

∫ dε f(ε) log(1 − e^{−βε}) = −(β/3) ∫ dε f(ε) ε/(e^{βε} − 1),   (2.12.15)

or, with the aid of (2.12.3) and (2.12.13)

PV/k_BT = (β/3)E.   (2.12.16)

That is, we have obtained

P = u(T)/3.   (2.12.17)

Exercise 1. Derive the formula corresponding to (2.12.17) for D-dimensional space.


2.12.8 Thermodynamic derivation of Stefan-Boltzmann law
Now, assuming (2.12.17), we can derive the Stefan-Boltzmann law purely thermodynamically. Since we know A = −PV as noted in 2.12.7, we have

A = E − TS = −(V/3)u(T).   (2.12.18)

Thus, we can get S as

S = (4V/3T)u(T).   (2.12.19)

We use (∂E/∂S)_V = T, or

du(T) = T d(4u(T)/3T).   (2.12.20)

That is,

T du = 4u dT,   (2.12.21)

which implies the Stefan-Boltzmann law (2.12.12).

2.13 Phase coexistence and phase rule

2.13.1 Coexistence of two phases
Consider a one-component fluid system consisting of two coexisting phases. The system is isolated as a whole. The phase boundary allows exchange of energy, volume, and particles. Then the maximum entropy principle (cf. 1.6.8) tells us that the equilibrium conditions for these coexisting phases are

T^I = T^II,  P^I = P^II,  µ^I = µ^II,   (2.13.1)

where we use the usual symbols, and superscripts denote the two phases I and II. We can rewrite the last equality in (2.13.1) as

µI(T, P ) = µII(T, P ). (2.13.2)

This functional relation determines a curve called the coexistence curve in the T-P diagram.

Along this line

G = N^I µ^I + N^II µ^II.   (2.13.3)

Thus, without changing the value of G, any mass ratio of the two coexisting phases is realizable (as we know well with water and ice).


2.13.2 Number of possible coexisting phases for pure substance
How many phases can coexist at a given T and P? Suppose we have X coexisting phases. The following conditions must be satisfied:

µ^I(T, P) = µ^II(T, P) = · · · = µ^X(T, P).   (2.13.4)

We believe that in the generic case the µ's are sufficiently functionally independent. To be able to solve for T and P, we can allow at most two independent relations. That is, at most three phases can coexist at a given T and P for a pure substance.

For a pure substance, if three phases coexist, T and P are uniquely fixed. This point on the T-P diagram is called the triple point. The Kelvin scale of temperature is defined so that the triple point of water is at T = 273.16 K; t = T − 273.15 is the temperature in Celsius.

[Fig. 2.8: generic P-T phase diagram showing a triple point and a critical point.]
Fig. 2.8 A generic phase diagram for a pure fluid. You must be able to specify which zone corresponds to which phase: solid, liquid or gas. The phase diagram for water near T = 273 K and 1 atm has a slight difference from this. What is it?

2.13.3 Gibbs phase rule
Consider a more general case of a system consisting of c chemically independent components (i.e., c is the number of components we can change independently). For example, H3O+ in pure water should not be counted, if we count H2O among the independent chemical components.

Suppose there are φ coexisting phases. The equilibrium conditions are:
(1) T and P must be common to all the phases,
(2) the chemical potentials of the c chemical species must be common to all the phases.

To specify the composition of a phase we need c − 1 variables, because we need only the concentration ratios. Thus, the chemical potential for a chemical species depends on T, P and c − 1 mole fractions, which are not necessarily common to all the phases. That is, the µ's are functions of c + 1 variables, and we have 2 + φ(c − 1) unknown variables. We have φ − 1 equalities among the chemical potentials in different phases for each


chemical species, so the number of equalities we have is (φ − 1) × c. Consequently, in the generic case we can choose f = 2 + φ(c − 1) − c(φ − 1) = c + 2 − φ variables freely. This number f is called the number of thermodynamic degrees of freedom.

We have arrived at the Gibbs phase rule:

f = c+ 2− φ. (2.13.5)

As astute readers have probably sensed already, the derivation is not watertight. Rigorously speaking, we cannot derive the phase rule from the fundamental laws of thermodynamics.

2.13.4 Clapeyron-Clausius relation
For a pure substance, as we have seen, the chemical potentials of coexisting phases must be identical. Before and after the phase transition from phase I to II or vice versa, there is no change of the Gibbs free energy

∆G_CC = 0,   (2.13.6)

where CC means "along the coexistence curve" and ∆ implies the difference across the coexistence curve (say, phase I − phase II). This should be true even if we change T and P simultaneously along the coexistence curve. Hence, along CC,

−∆S dT + ∆V dP = 0  ⇒  ∂P/∂T|_CC = ∆S/∆V.   (2.13.7)

At the phase transition ∆H = T∆S, where ∆H is the latent heat and T the phase transition temperature. Thus, we can rewrite (2.13.7) as

∂P/∂T|_CC = ∆_{I→II}H/(T ∆_{I→II}V),   (2.13.8)

where ∆_{I→II}X denotes X^II − X^I. This relation is called the Clapeyron-Clausius relation.

If we may assume that one phase is an ideal gas phase, and if we ignore the volume of the other phase, then

∆V ≃ V_G = NRT/P,   (2.13.9)

where N is the mole number of the substance. Therefore, we can integrate (2.13.8) as

P ∝ exp(−L/NRT),   (2.13.10)

where L is the latent heat (heat of evaporation). This gives the vapor pressure of the condensed phase.
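A small application sketch (hypothetical numbers, not from the text): with a constant molar latent heat l, the integrated relation reads ln(P₂/P₁) = −(l/R)(1/T₂ − 1/T₁), which can be used to estimate how the boiling point of water shifts with pressure.

import numpy as np

R, l = 8.314, 40.7e3                # l ≈ 40.7 kJ/mol for water, assumed T-independent
P1, T1 = 101325.0, 373.15           # normal boiling point
P2 = 70e3                           # roughly the pressure at 3000 m altitude (assumed)

T2 = 1.0 / (1.0 / T1 - R * np.log(P2 / P1) / l)
print(T2 - 273.15)                  # ≈ 90 °C: water boils noticeably cooler at altitude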


2.14 Phase transition

2.14.1 Phase transition as singularity of free energy
When the free energy A becomes singular, such as when it becomes nondifferentiable or ceases to have higher-order derivatives, we say the system exhibits a phase transition. When A itself becomes nondifferentiable, the phase transition is called a first order phase transition. Other phase transitions are collectively called second order phase transitions, continuous phase transitions or higher order phase transitions. A typical behavior of the Gibbs free energy G is illustrated below.

[Fig. 2.8: G versus T for a pure substance, with kinks at the melting point and the boiling point separating the solid (S), liquid (L) and gas (G) branches.]
Fig. 2.8 Typical behavior of the Gibbs free energy for a pure substance. The free energy loses differentiability at first order phase transition points.

First order phase transitions often depend strongly on the details of the individual material, so a general theory is hard to construct. For second order phase transitions or critical phenomena, long-wavelength fluctuations become very important, so that there are many features independent of the details of individual systems (microscopic details). Thus, there is a nice set of general theories for second order phase transitions.

2.14.2 Typical second order phase transition
A typical second order phase transition is the one from the paramagnetic to the ferromagnetic phase.

A magnet can be understood as a lattice of spins interacting with each other locally in space. The interaction between two spins has a tendency to align them in parallel. At higher temperatures, due to vigorous thermal motions, this interaction cannot quite make order among the spins, but at lower temperatures the entropic effect becomes less significant, so the spins order globally. There is a special temperature Tc below which this ordering occurs. We say an order-disorder transition occurs at this temperature.

The Ising model is the simplest model of this transition. At each lattice point is a (classical) spin σ which takes only +1 (up) or −1 (down). A nearest-neighbor spin


pair has the following interaction energy:

− Jσiσj, (2.14.1)

where J is called the coupling constant, which is positive in our example (the ferromagnetic case). We assume all the spin-spin interaction energies are superposable, so the total energy of the system for a lattice is given by

H = −Σ_{⟨i,j⟩} Jσ_iσ_j − Σ_i hσ_i,   (2.14.2)

where ⟨ ⟩ implies the nearest-neighbor pairs, and h is the external magnetic field. The partition function for this system reads

Z = Σ_{σ_i=±1} e^{−βH}.   (2.14.3)

Here, the sum is over all spin configurations.

2.14.3 Necessity of thermodynamic limit
If the lattice size is finite, the sum in (2.14.3) is a finite sum of positive terms. Each term in this sum is analytic in T and h, so the sum itself is analytic in T and h. Furthermore, Z cannot be zero, because each term in the sum is strictly positive. Therefore, its logarithm is analytic in T and h; the free energy of the finite lattice system cannot exhibit any singularity. That is, there is no phase transition for this system. Strictly speaking, there is no phase transition for any finite system, unless each spin has infinitely many states.

Even in the actual systems we study experimentally, there are only a finite number of atoms, but this number is huge. Thus, the question of phase transitions from the statistical physics point of view is: is there any singularity in A in the large system limit? The large system limit, with proper caution not to increase the surface area faster than the order of V^{2/3}, where V is the system volume, is called the thermodynamic limit. Strictly speaking, phase transitions can occur only in this limit.

2.14.4 Spatial dimensionality is crucial
For the existence of a phase transition, not only the system size but also the spatial dimensionality of the system is crucial.

Let us consider a one-dimensional Ising model (Ising chain), whose total energy reads

H = −J Σ_{−∞<i<+∞} σ_iσ_{i+1}.   (2.14.4)


We have ignored the external magnetic field for simplicity. Compare the energies of the following two spin configurations (+ denotes up spins and − down spins) (Fig. 2.9):

[Fig. 2.9: two Ising-chain configurations, one with all spins up, the other with an island of L down spins in a sea of up spins.]
Fig. 2.9 Ising chain with a spin-flipped island.

The bottom one has a larger energy than the top by 2J × 2 due to the existence of the down-spin island. However, this energy difference is independent of the size L of the island. Therefore, so long as T > 0 there is a finite chance of making big down-spin islands amidst the ocean of up spins. If a down-spin island becomes large, there is a finite probability for a large lake of up spins on it. This implies that no ordering is possible for T > 0.

As you can easily guess, there is no ordered phase in any one-dimensional lattice system with local interactions for T > 0.
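This conclusion can be checked with a few lines of numerics (a sketch, not part of the text): the transfer-matrix method gives the Ising chain exactly through the largest eigenvalue λ_max of the 2×2 matrix T(σ, σ') = exp[β(Jσσ' + h(σ + σ')/2)], and the magnetization per spin m = (1/β) ∂ln λ_max/∂h vanishes as h → 0 at any T > 0.

import numpy as np

def magnetization(beta, J, h, dh=1e-6):
    def log_lambda(hh):
        s = np.array([1.0, -1.0])
        T = np.exp(beta * (J * np.outer(s, s) + hh * (s[:, None] + s[None, :]) / 2))
        return np.log(np.linalg.eigvalsh(T).max())     # largest transfer-matrix eigenvalue
    return (log_lambda(h + dh) - log_lambda(h - dh)) / (2 * beta * dh)

for h in (1e-1, 1e-2, 1e-3, 1e-4):
    print(h, magnetization(beta=2.0, J=1.0, h=h))       # m -> 0 as h -> 0, even at low T

For small h the magnetization is simply proportional to h, i.e. there is no spontaneous magnetization, in agreement with the island argument above.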

2.14.5 In 2-space the Ising model exhibits a phase transition
Consider the two-dimensional Ising model with h = 0. Imagine there is an ocean of up spins (Fig. 2.10). To make a circular down-spin island of radius L, we need 4πJL more energy than for the completely ordered phase.

[Fig. 2.10: a circular island of down spins of radius L in an ocean of up spins.]
Fig. 2.10 Peierls' argument illustrated.

This energy depends on L, making the formation of a larger island harder. That is, to destroy the global order we need a macroscopic amount of energy, so for sufficiently low temperatures the ordered phase cannot be destroyed spontaneously. Of course, small local islands could be made, but they never become very large. Hence, we may conclude that a phase transition is possible for a two-dimensional system with local interactions even for T > 0. The above argument is known as Peierls' argument, and can be made rigorous.9

9 There is at least one more crucial factor governing the existence of a phase transition: the spin dimension, i.e., the number of degrees of freedom of each spin. Ising spins cannot point in different directions, only up or down (their spin dimension is 1). However, true atomic magnets can orient in any direction (their spin dimension is 3). This freedom makes ordering harder. Actually, in 2D space ferromagnetic ordering at T > 0 by spins with a larger spin dimension than Ising spins is impossible.


2.14.6 Long range interactions
What happens if the range of interactions is not finite? Peierls' argument is still applicable. Obviously, if each spin can interact with all the spins in the system uniformly, an ordered phase is possible even in one-dimensional space. If the coupling constant J decays more slowly than 1/r², then an order-disorder phase transition is still possible at a finite temperature in one-dimensional space.

Exercise 1. Intuitively explain the last statement.

We have learned that for phase transitions, system size, dimensionality of space,

and the range of interactions are crucial.



Appendix 2A: Rudiments of probability

2.A.7 Introductory examples
Suppose we have a jar containing 5 white balls and 2 black balls. What is your degree of confidence, on a 0-1 scale, that you will pick a white ball out of the jar? We expect that on average we will take a white ball out 5 times out of 7. Hence, it is sensible to say that our confidence in the above statement is 5/7.

What do we mean when we say that there is a 70% chance of rain tomorrow? In this case, in contrast to the preceding example, we cannot repeat tomorrow again and again. However, the meaning seems clear.

We conclude that the probability of an event should be a measure of confidence in the occurrence of the event. A particular measure may be realistic or unrealistic, but this is not a concern of probability theory. In probability theory we study the consequences of the general abstract definition of probability.

2.A.8 Elementary events
An event which cannot (or need not) be analyzed further into a combination of events is called an elementary event. For example, "it rains tomorrow" is an elementary event (if all you ask is whether to prepare rain gear), but "it rains or it snows tomorrow" is a combination of two elementary events.

Denote by Ω the totality of elementary events allowed in the situation or system under study. Any (compound) event under consideration can be identified with a subset of Ω. When we say an event corresponding to a subset A of Ω occurs, we mean that one of the elements in A occurs.

2.A.9 Probability is a volume of confidence
Let us denote the probability of A ⊂ Ω by P(A). Since probability should measure the degree of our confidence on a 0-1 scale, we demand that

P (Ω) = 1; (2.A.1)

something must happen. Then, it is also sensible to assume

P (∅) = 0. (2.A.2)

Now, consider two mutually exclusive events, A and B. This means that whenever an elementary event in A occurs, no elementary event in B occurs, and vice versa. Hence, A ∩ B = ∅. Thus, it is sensible to demand

P (A ∪B) = P (A) + P (B), if A ∩B = ∅. (2.A.3)

For example, if you know that with 30% chance it will rain tomorrow and with 20% chance it will snow tomorrow (excluding sleet), then you can be sure that with 50%


chance it will rain or snow tomorrow.
We already know quantities satisfying (2.A.3): length, area, or volume. Thus,

probability is a kind of volume to measure one’s confidence.10

Remark. The concept of volume is abstracted as measure in mathematics; a measure with its total mass equal to unity is called a probability measure.

Example 1. We throw three fair coins. Find the probability of having at least two heads.

In this case the elementary events are the outcomes of one trial, say HHT (H = head, T = tail). Thus there are 8 elementary events, and we have

Ω = {HHH, HHT, HTH, THH, HTT, THT, TTH, TTT}.   (2.A.4)

The word "fair" means that all elementary events are equally likely. Hence, the probability of any elementary event should be 1/8. The event A, defined by having at least 2 H's, is given by A = {HHH, HHT, HTH, THH}. Of course, elementary events are mutually exclusive, so P(A) = 1/2.

2.A.10 Some rudimentary facts about probability
It is easy to check that

P (A ∪B) ≤ P (A) + P (B), (2.A.5)

A ⊂ B ⇒ P (A) ≤ P (B). (2.A.6)

Denoting Ω \ A by Ac (complement), we get

P (Ac) = 1− P (A). (2.A.7)

Example 1. There are r people in a room. What is the probability of having at least two persons sharing the same birthday?

Let A_r be the event of there being at least one such pair. Then A_r^c = "all the people have distinct birthdays." It is easier to compute P(A_r^c). Assume, for simplicity, that one year consists of 365 days, and that the human birth rate is uniform throughout the year. We get

P(A_r^c) = 1 · (1 − 1/365) · (1 − 2/365) · · · · · (1 − (r − 1)/365).   (2.A.8)

This rapidly converges to 0: P(A_30) = 1 − P(A_30^c) ≃ 0.706, and P(A_50) ≃ 0.97.
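The product (2.A.8) is trivial to evaluate numerically (a sketch):

import numpy as np

def p_shared_birthday(r, days=365):
    p_distinct = np.prod(1.0 - np.arange(r) / days)   # P(A_r^c), the product in (2.A.8)
    return 1.0 - p_distinct

print(p_shared_birthday(30))   # ≈ 0.706
print(p_shared_birthday(50))   # ≈ 0.970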

10 Why is such a subjective quantity meaningful objectively? Because our subjectivity is selected to match objectivity through phylogenetic learning. Those whose subjectivity is not well matched to objectivity have been selected out during the past 4 billion years.


2.A.11 Conditional probability
Suppose we know for sure that an elementary event in B has occurred. Under this condition, what is the probability of the occurrence of the event A? Thus we need the concept of conditional probability. We write this conditional probability as P(A|B), and define it through

P (A ∩B) = P (A|B)P (B). (2.A.9)

2.A.12 Statistical independence
When the occurrence of an elementary event in a set (i.e., an event) A has nothing to do with that in a set B, we say the two events A and B are (statistically) independent.11 Since knowing about the event B does not help us to obtain more information about A if A and B are independent, we should get

P (A|B) = P (A). (2.A.10)

It follows from (2.A.9) that

P (A ∩B) = P (A|B) · P (B) = P (A) · P (B). (2.A.11)

We use this as a definition of the (statistical) independence of two events A and B: Two events A and B are said to be (statistically) independent if

P (A ∩B) = P (A) · P (B). (2.A.12)

For example, when we use two fair dice a and b and ask the probability for a to exhibit a number less than or equal to 2 (event A), and b a number larger than 3 (event B), we have only to know the probability of each event, A = {1, 2} and B = {4, 5, 6}. Thus, the answer is P(A) · P(B) = 1/3 · 1/2 = 1/6.

2.A.13 Expectation value
Suppose a probability P is given on a set of events Ω = {ω_i}, and there is an observable F such that F(ω_i) is its value when the elementary event ω_i actually occurs. The expectation value (= average) of F with respect to the probability P is written as E_P(F) or ⟨F⟩_P and is defined by

E_P(F) ≡ ⟨F⟩_P ≡ Σ_{ω∈Ω} P(ω)F(ω).   (2.A.13)

Often the suffix P is omitted. The sum often becomes an integration when we study events which are specified by a continuous parameter.

11 The concepts 'independent' and 'uncorrelated' should not be confused. 'Uncorrelated' often means that the correlation vanishes.


2.A.14 Indicator
The indicator χ_A of a set A is defined by

χ_A(ω) ≡ 1 if ω ∈ A, 0 if ω ∉ A.   (2.A.14)

Notice that

⟨χ_A⟩_P = P(A).   (2.A.15)

This is a very important relation for the computation of probabilities. The formula implies that if we can write down an event as a set-theoretical formula, we can compute its probability by summation (or more generally, by integration).

2.A.15 Generating function
It is often convenient to introduce a generating function Γ of a probability distribution P with respect to an observable F:

Γ(t) ≡ Σ_{ω∈Ω} e^{t·F(ω)} P(ω).   (2.A.16)

From this definition follows

d log Γ(t)/dt|_{t=0} = ⟨F⟩_P.   (2.A.17)

That is, if we know the generating function, we can compute the expectation value by differentiation.

2.A.16 Variance measures fluctuation
δF ≡ F − ⟨F⟩_P describes the fluctuation around the expectation value. ⟨δF²⟩_P is a measure of fluctuation called the variance of F:

〈δF 2〉P = 〈F 2〉P − 〈F 〉2P . (2.A.18)

Since

d/dt (1/Γ · dΓ/dt)|_{t=0} = [−(1/Γ²)(dΓ/dt)² + (1/Γ)(d²Γ/dt²)]|_{t=0} = ⟨F²⟩_P − ⟨F⟩²_P,   (2.A.19)

we get

d²/dt² log Γ(t)|_{t=0} = ⟨F²⟩_P − ⟨F⟩²_P.   (2.A.20)


Example 1. Let {X_i}_{i=1}^N be a set of independently and identically distributed (iid12) random variables. Their expectation value is M and their variance is V. The expectation value of the sum Y_N ≡ Σ_{i=1}^N X_i is given by NM, and its variance by NV. Therefore, the relative fluctuation of Y_N, defined as √⟨δY_N²⟩/⟨Y_N⟩, is √(NV)/(NM) ∝ 1/√N. This implies that Y_N clusters relatively more tightly as N increases. This is the reason for the law of large numbers.

12This is a standard abbreviation.
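A quick simulation sketch (uniform random numbers are an arbitrary choice) showing the 1/√N shrinkage of the relative fluctuation:

import numpy as np

rng = np.random.default_rng(0)
for N in (10, 100, 1000, 10000):
    Y = rng.random((10000, N)).sum(axis=1)          # 10000 independent samples of Y_N
    print(N, Y.std() / Y.mean(), 1 / np.sqrt(N))    # both columns scale as 1/sqrt(N)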


Appendix 2B: Rudiments of combinatorics

In statistical mechanics we must be able to compute the number of elementary events (i.e., microscopic events) under various constraints. We should know the rudiments of combinatorics. C. L. Liu, Introduction to Combinatorial Mathematics (McGraw-Hill) is a nice introduction to the subject with many (practical) examples.

2.B.17 Sequential arrangement of distinguishable objects: nPr

Suppose there is a set of n distinguishable objects. How many ways are there to make sequential arrangements of r objects taken from this set? This number is denoted by nPr ≡ P(n, r).

There are two ways to get an explicit formula for this number.
(i) There are n ways of selecting the first object. To choose the second object, there are (n − 1) ways, because we have already taken out the first one. Here, the distinguishability of each object is crucial. In this way we arrive at

P(n, r) = n · (n − 1) · · · (n − r + 1) = n!/(n − r)!,   (2.B.1)

where n! = 1 · 2 · 3 · · · (n − 1) · n; n factorial is the number of ways n distinguishable objects can be arranged in a sequence.
(ii) This derivation is an interpretation of the rightmost formula in (2.B.1). For each arrangement of r objects in a linear order, there are (n − r)! ways to complete one arrangement of n (all) objects. The total number of ways of arranging n objects is n!, so we must factor (n − r)! out.

2.B.18 Selection of distinguishable objects: binomial coefficient
Under the same distinguishability condition, we now disregard the order in the arrangement of r objects. That is, we wish to answer the question: how many ways are there to choose r objects from a set of n distinguishable objects?

Since we disregard the ordering in each arrangement of r objects, the answer should be

nCr ≡ (n choose r) ≡ nPr/r! = n!/((n − r)! r!).   (2.B.2)

The number (n choose r) is called the binomial coefficient due to a reason clear from (2.B.6).

Exercise 1. Show the following equality and give combinatorial explanations:

nPr = (n choose r) · rPr,   (2.B.3)

(n choose r) = (n−1 choose r−1) + (n−1 choose r).   (2.B.4)

2.B.19 Multinomial coefficient
Suppose there are k species of particles, with q_i particles of the i-th species. We assume that particles of the same species are not distinguishable. The total number of particles is n ≡ Σ_{i=1}^k q_i. How many ways are there to arrange these particles in a one-dimensional array?

If we assume that all the particles are distinguishable, the answer is n!. However, the particles of the same species cannot be distinguished, so we need not worry which particle of the i-th species is chosen first. Hence, we have overcounted the number of ways by the factor q_i! for the i-th species. The same holds for all species. Thus we arrive at

n!/(q_1! q_2! · · · q_{k−1}! q_k!).   (2.B.5)

This is called the multinomial coefficient (→2.B.21).

2.B.20 Binomial theorem
Consider the n-th power of x + y. There exists an expansion formula called the binomial expansion:

(x + y)^n = Σ_{r=0}^n (n choose r) x^{n−r} y^r.   (2.B.6)

This can be seen easily as follows. We wish to expand the product of n factors (x + y):

(x + y)(x + y)(x + y) · · · (x + y).   (2.B.7)

As an example take the term x²y^{n−2}. To produce this term by expanding the above product, we must choose 2 x's from the n factors (x + y). There are (n choose 2) ways to do this, so the coefficient must be (n choose 2).

2.B.21 Multinomial theorem
There is a generalization of (2.B.6) to the case of more than two variables, called the multinomial expansion. It can be understood from (2.B.5):

(x_1 + x_2 + x_3 + · · · + x_m)^n = Σ_{q_1+q_2+···+q_m=n, q_i≥0} [n!/(q_1! q_2! · · · q_m!)] x_1^{q_1} x_2^{q_2} · · · x_m^{q_m}.   (2.B.8)


2.B.22 Arrangement of indistinguishable objects in distinguishable boxes
Consider n indistinguishable objects. We wish to distribute them into r distinguishable boxes. How many distinguishable arrangements can be made?

Since the boxes are distinguishable, we arrange them in a fixed sequence, and then distribute the indistinguishable objects.

[Fig. 2B.1: n indistinguishable balls on a line, separated into r groups by r − 1 bars.]
Fig. 2B.1 Indistinguishable objects

Hence, the problem is equivalent to counting the number of arrangements of n indistinguishable balls and r − 1 indistinguishable bars on a line (Fig. 2B.1). Apply (2.B.5) to obtain the answer:

(n + r − 1)!/(n!(r − 1)!) = (n+r−1 choose n).   (2.B.9)
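A brute-force sketch confirming (2.B.9) for small n and r:

from itertools import product
from math import comb

# Count the ways to put n indistinguishable objects into r distinguishable boxes
# by enumerating all occupation-number vectors, and compare with (2.B.9).
def count_arrangements(n, r):
    return sum(1 for occ in product(range(n + 1), repeat=r) if sum(occ) == n)

for n, r in [(3, 2), (4, 3), (5, 4)]:
    print(n, r, count_arrangements(n, r), comb(n + r - 1, n))   # the two counts agree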


Index

absolute entropy, 36
absolute temperature, 18
adiabat, 17
adiabatic cooling, 36
adiabatic demagnetization, 36

binomial coefficient, 87
binomial expansion, 88
binomial theorem, 88
Boltzmann constant, 41
Boltzmann's principle, 41
Bose-Einstein distribution, 65
Bose-Einstein condensation, 70
boson, 64

canonical partition function, 45
Caratheodory's principle, 16
chemical potential, 62
Clapeyron-Clausius relation, 77
classical gas, 48
Clausius' inequality, 20, 22
Clausius' law, 14
closed system, 10
coexistence curve, 75
compound system, 8
conditional probability, 84
conjugate variables, 42
convex analysis, 25

Debye model, 56
Dulong-Petit's law, 54

Einstein model, 54
elementary event, 82
ensemble equivalence, 66
entropic elasticity, 34
entropy, 18–20
entropy maximization principle, 21
equilibrium state, 6
equipartition of energy, 54
event, 82
evolution criterion, 20
expectation value, 84
extensive quantity, 13

Fermi energy, 68
Fermi-Dirac distribution, 65
fermion, 64
ferromagnetic phase transition, 78
first law, 11
first order phase transition, 78
fluctuation, 85
fourth law, 14, 26
free electron gas, 66
fundamental equation (of state), 58

generating function, 85
Gibbs, 5
Gibbs free energy, 25, 63
Gibbs paradox, 51
Gibbs phase rule, 77
Gibbs relation, 19
Gibbs-Boltzmann distribution, 45
Gibbs-Duhem relation, 26
Gibbs-Helmholtz formula, 46
grand canonical partition function, 62
grand partition function, 62

heat, 11
Helmholtz free energy, 24, 46, 63

ideal gas, 58
independent events, 84
indicator, 85
intensive quantity, 13
internal degrees of freedom, 59
internal energy, 11
Ising model, 78

Jacobian technique, 27

Kelvin's law, 15
Kramers' q, 63

laws of thermodynamics, 6
Le Chatelier's principle, 31
Le Chatelier-Braun's principle, 32
Legendre transformation, 24

macroscopic object, 6
mass action, 11
Maxwell distribution, 60
Maxwell's relation, 27
Maxwell's relation in terms of Jacobian, 29
measure, 83
multinomial coefficient, 88
multinomial expansion, 88
multinomial theorem, 88

Nernst's law, 36

open system, 60
order-disorder phase transition, 78

Peierls' argument, 80
phase coexistence condition, 24
phase space, 39, 52
phase transition, 78
phonon, 71
photon, 72
Planck's law, 15
Planck's radiation law, 73
principle of equal probability, 40
probability measure, 83
probability theory, 82

quasistatic process, 8

reservoir, 21
rubber band, 33

Schottky defect, 42
second law, 14
second order phase transitions, 78
sign convention, of energy exchange, 12
simple system, 8
spin-statistics relation, 64
stability condition, 20
state function, 9
Stefan-Boltzmann law, 74, 75
Stirling's formula, 43

temperature, 7
thermal equilibrium, 7
thermodynamic conjugate pair, 13
thermodynamic coordinates, 8
thermodynamic degrees of freedom, 77
thermodynamic limit, 66, 79
thermodynamic space, 8
third law, 36, 54, 59, 71
triple point, 76

variance, 85

work coordinates, 8

Young’s theorem, 27

zeroth law, 7