

Introduction to Statistical Physics

Prof. Dr. G. M. Pastor

Institut für Theoretische Physik

Fachbereich Mathematik und Naturwissenschaften

Universität Kassel

February 3, 2016

The purpose of these lectures is to study the fundamental principles of Statistical Mechanics, also known as Statistical Physics, which govern the behaviour of macroscopic systems, i.e., systems composed of a large number of atoms or molecules. Although large macroscopic systems are subject to the same fundamental laws of quantum mechanics as small systems, their behaviour follows new universal laws which are specific to the presence of a very large number of degrees of freedom and to the boundary conditions imposed by experiment. Instead of becoming increasingly intricate and obscure, new regularities appear as the number of particles becomes very large, regularities which escape a purely mechanical deterministic interpretation. These statistical laws cannot be reduced to purely mechanical laws. New concepts and principles, which do not apply to small systems, are required in order to derive them.

One of our first goals will be to understand why the laws of quantum mechanics (or classical mechanics) are not enough for understanding macroscopic systems, and to lay down the principles of statistical mechanics as clearly and explicitly as possible. Many introductory courses and a number of remarkable books on the theory of macroscopic systems (e.g., K. Huang, Statistical Mechanics, Wiley) follow the historical development. They start by formulating the first and second laws of thermodynamics and develop their consequences at length by establishing relations between macroscopic measurable quantities. Important concepts like entropy, heat and temperature are introduced in an ad hoc phenomenological way. The concepts of Statistical Mechanics are introduced later in order to provide a microscopic justification or rationalization of the thermodynamic principles. This is the pathway followed by R. Clausius (1822-1888), J.C. Maxwell (1831-1879), L.E. Boltzmann (1844-1906) and J.W. Gibbs (1839-1903). While such an evolution is understandable at a time when no microscopic theory of matter was available, approaching the subject in this way would be an anachronism. At present it is far more meaningful


to set first of all the basis of Statistical Mechanics, taking advantage of our knowledge of classical and quantum mechanics, and to derive thermodynamics as a byproduct.

This approach, which will be adopted in the following, has a number of important advantages. From a conceptual perspective it is important to state the principles of the theory of macroscopic systems in a clear, transparent way, so that we understand where they are applicable, where they are not, and what kind of consequences to expect in borderline cases (e.g., nanoscale systems). In this way we set the focus on the general fundamental theory rather than on its phenomenological consequences. Statistical mechanics is free from a number of limitations of thermodynamics. For example, thermodynamics concerns systems in equilibrium, leaving no room for thermal fluctuations of the physical observables. It provides no microscopic explanation of the observed behaviour but simply general relations between different measurable properties. Since no connection to quantum mechanics can be established within thermodynamics, it is not possible to predict any observable property (e.g., the melting point of a given metal or the Curie temperature of a ferromagnet). The equation of state of a material in a specific phase can never arise from thermodynamics. Statistical mechanics shares with thermodynamics its universal character, since its principles and laws apply to all systems alike. But in addition, it opens the possibility of a detailed theory of matter, including the calculation of material-specific properties. Subtle quantum-mechanical effects such as superconductivity, magnetism, metal-insulator transitions and all other condensed-matter properties can thus be predicted. In fact, the rigorous bridge between quantum theory and macroscopic behaviour has only been established thanks to Statistical Mechanics. This allows a detailed comparison between microscopic theory and macroscopic experiments, the cornerstone of scientific progress.

Any theory of the properties of macroscopic systems has to be based on, or at least be consistent with, quantum mechanics (QM), which is our microscopic theory of matter. QM defines very precise methods for determining the eigenstates and eigenvalues of different observables, in particular the stationary states and eigenenergies. This is certainly not an easy task for systems containing of the order of 10²³ atoms. In addition, QM predicts the time evolution of any given state |ψ⟩. The same holds for a classical description, whenever it is applicable. However, an atomistic deterministic mechanical theory is not able to predict in which state a given system will be found under specified macroscopic constraints. Answering this question is one of the goals of Statistical Mechanics. In a deterministic dynamics the state is given by the initial conditions and the elapsed time. A simple example, which I borrowed from E. Müller-Hartmann, allows us to illustrate the different scopes of deterministic and statistical mechanics: Consider a large number of H₂O molecules in a given volume. A typical statement of QM would be that water and ice are possible states of the system. Based on this QM result Statistical


Mechanics allows us to conclude that in summer one finds water and in winterice.

G. M. Pastor, WS 13/14, Kassel.

Contents

1 Basic statistical concepts
  1.1 Random variables and probabilities
  1.2 A single continuous random variable
    1.2.1 Moments of a PDF
    1.2.2 Cumulants of a PDF*
    1.2.3 Computation of moments in terms of cumulants*
  1.3 The Gaussian or normal distribution
  1.4 Many random variables
    1.4.1 Joint cumulants of independent random variables*
  1.5 The Gaussian distribution for many variables*
  1.6 The central limit theorem
  1.7 Information content of a probability distribution
    1.7.1 Inferring unbiased probability distributions
    1.7.2 Entropy of continuous probability densities

2 Mixed states, density matrices and distribution functions
  2.1 Introduction
  2.2 Mixed states
  2.3 Density matrix
  2.4 Time dependence of the density operator of isolated systems
  2.5 The statistical distribution function
  2.6 Time dependence of ρ(p, q): Liouville theorem
    2.6.1 Time independence of the entropy of the distribution function ρ(p, q) in strictly isolated systems

3 Equilibrium statistical ensembles
  3.1 Statistical independence of macroscopic subsystems
  3.2 The statistical independence of extensive additive properties
  3.3 The importance of additive constants of motion
  3.4 The density operator: General formulation
    3.4.1 The microcanonical ensemble
    3.4.2 The grand canonical ensemble
    3.4.3 The canonical ensemble
  3.5 Explicit forms of the density operator
    3.5.1 Microcanonical ensemble
    3.5.2 Canonical ensemble
    3.5.3 Grand canonical ensemble
    3.5.4 Grand canonical pressure ensemble
  3.6 Implicit notation

4 Entropy
  4.1 Maximum entropy theorem
  4.2 The approach to equilibrium
  4.3 Thermodynamic equivalent descriptions
    4.3.1 Euler theorem for homogeneous functions
  4.4 Thermodynamic potentials: General formulation

5 Thermodynamic properties
  5.1 Thermodynamic potentials: special cases
    5.1.1 Energy E and entropy S
    5.1.2 Helmholtz free energy F
    5.1.3 Enthalpy H
    5.1.4 Free enthalpy G
    5.1.5 Grand canonical potential Φ
  5.2 Derived thermodynamic properties
    5.2.1 Heat capacities and specific heats
    5.2.2 Compressibilities
    5.2.3 Thermal expansion
    5.2.4 Charge compressibility
    5.2.5 Assessing fluctuations
  5.3 Minimum free-energy theorem

6 Thermodynamic relations
  6.1 Duhem-Gibbs relations for thermodynamic potentials
  6.2 Intensive nature of the derivatives of extensive properties
  6.3 Integrability of the differential of thermodynamic potentials
  6.4 Jacobi-determinant manipulations
  6.5 Measuring the absolute temperature scale

7 Thermodynamic processes
  7.1 Carnot cycle
  7.2 Joule-Thomson process
  7.3 Adiabatic expansion in vacuum

8 The Nernst theorem: The Third law of thermodynamics

9 The principle of indistinguishability of identical particles
  9.1 Many-particle wave functions
  9.2 Fermions
  9.3 Bosons

10 The classical limit
  10.1 Boltzmann counting and configurational integrals
  10.2 Configurational integrals
  10.3 Virial and equipartition theorems
  10.4 The ideal classical gas
  10.5 Derivation of the classical limit

11 The ideal quantum gases

12 Fermi systems: The ideal Fermi gas
  12.1 Energy-pressure relation
  12.2 High temperatures and low densities
  12.3 Low temperatures and high densities

13 Bose systems: Photons and black body radiation

14 Phonons

15 Bose-Einstein condensation

16 Bibliography

1 Basic statistical concepts

A full mechanical description of the dynamics of a macroscopic system is both hopeless and not very meaningful, since we lack the required precise information on the initial conditions of all degrees of freedom and on the exact form of the Hamiltonian governing the system and its environment. Moreover, for describing the equilibrium properties a precise knowledge of all its constituents is not necessary at all. What we actually need to know is only the probability of finding the macroscopic system in each one of its possible microscopic states. It is the goal of statistical mechanics to provide such an inherently probabilistic description of macroscopic systems. This motivates a brief discussion of some basic concepts of probability theory.

The sections marked with an asterisk (*) in this chapter are not needed for understanding the rest of the notes. They may be skipped without consequences.

1.1 Random variables and probabilities

Variables whose outcome cannot be predicted with certainty are usually known as random variables. We consider a random variable x with a set of possible outcomes S, which may be discrete (e.g., S = {x_1, x_2, . . .}) or continuous (e.g., S ≡ ℝ).

An event E is a subset of outcomes E ⊆ S, for example, E = {even result of a die throw} = {2, 4, 6}. To each event E ⊆ S we assign a probability P(E) with the following three fundamental properties:

i) Positive definiteness: P(E) ≥ 0.

ii) Additivity: P(E_1) + P(E_2) = P(E_1 ∪ E_2) if E_1 ∩ E_2 = ∅ and E_1, E_2 ⊆ S. E_1 and E_2 are then said to be disconnected (disjoint) events.

iii) Normalization: P (S) = 1.

Probabilities may be assigned in two ways:

i) Experimentally, as the limiting relative frequency

P(E) = lim_{N→∞} N_E / N,

where N_E is the number of actual occurrences of the event E after N "throws" or outcomes.

ii) Theoretically, by means of an estimation based on the determination of the set of outcomes S and some hypothesis about the relative probabilities for a complete set of events. For instance, knowing that S = {1, 2, 3, 4, 5, 6} for a die and assuming equal probabilities P(i) = 1/6 ∀ i ∈ S, we conclude that P(even) = 3 · 1/6 = 1/2. Due to the lack of knowledge of the precise mechanical


properties of the die (i.e., the system) and of the way of throwing it (i.e., the initial condition and the Hamiltonian including the environment), and in the absence of any reason to believe that the die is biased, we assume that all six elements of S (i.e., the states of the system) are equally probable. All probability assignments in statistical mechanics are theoretical or "subjective" (as opposed to the "objective" assignment in i). Their validity needs to be verified by contrasting them with experiment.
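The two assignments can be contrasted in a minimal numerical sketch (the code and its parameters are illustrative additions, not part of the original notes): the relative frequency of the event E = {2, 4, 6} over many simulated throws is compared with the theoretical value P(even) = 1/2.

```python
import random

random.seed(0)
N = 100_000

# i) "Experimental" assignment: relative frequency of E = {2, 4, 6}
n_even = sum(1 for _ in range(N) if random.randint(1, 6) % 2 == 0)
p_empirical = n_even / N

# ii) Theoretical assignment: P(i) = 1/6 for every face, so P(even) = 3 * 1/6
p_theoretical = 3 * (1 / 6)

print(p_empirical, p_theoretical)
```

For large N the two numbers agree within fluctuations of order 1/√N, which is the sense in which the theoretical assignment is verified by experiment.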

1.2 A single continuous random variable

We consider a continuous random variable x ∈ ℝ. The cumulative probability function (CPF) is defined as the probability P(x) for any outcome x′ to be smaller than x:

P(x) = prob{x′ ∈ (−∞, x]}.

The basic properties of a CPF are

i) P(−∞) = 0.

ii) P(x) is monotonically increasing, i.e.,

P(x + Δ) = P(x) + prob{x′ ∈ [x, x + Δ]} ≥ P(x),

since any probability satisfies additivity and positiveness.

iii) Finally, the normalization condition implies P(+∞) = 1.

The probability density function (PDF) is defined as p(x) = dP/dx. Consequently,

p(x) dx = prob{x′ ∈ [x, x + dx]}.

Notice that, in contrast to probabilities, which satisfy P(x) ≤ 1, there is no upper bound for p(x).

If x is a random variable, any function F(x) of x is also a random variable with its own PDF, which is given by

p_F(f) df = prob{F(x) ∈ [f, f + df]}.

Let x_i with i = 1, . . . , ν be the solutions of F(x_i) = f. Then we have

p_F(f) df = Σ_{i=1}^{ν} p(x_i) dx_i   ⟹   p_F(f) = Σ_{i=1}^{ν} p(x_i) |dx/dF|_{x = x_i = F⁻¹(f)}.   (1.1)


Notice that |dx/dF| is the Jacobian for the change of variables. Eq. (1.1) may be written as

p_F(f) = Σ_{i=1}^{ν} p(x_i = F⁻¹(f)) · 1 / |dF/dx|_{x = x_i = F⁻¹(f)}.

Such a change of variables may lead to divergencies in p_F(f), which remain of course integrable, as one may easily verify by changing variables back to x.

Exercise 1.1: A metastable nucleus decays through β emission. What is the probability density p(φ, θ) for the electron to be emitted with a polar angle θ (relative to the z axis) and azimuthal angle φ? Imagine the nucleus at the coordinate origin and consider

p(φ, θ) dφ dθ = prob{emission with polar angle in [θ, θ + dθ] and azimuthal angle in [φ, φ + dφ]}.

Exercise 1.2: Consider the Gaussian distribution

p(x) = (1/√(2πσ²)) e^{−x²/(2σ²)}

with x ∈ ℝ, and let f = F(x) = x². Show that

p_F(f) = (1/√(2πσ²)) e^{−f/(2σ²)} / √f  for f > 0,  and  p_F(f) = 0 for f < 0.

Verify the normalization of p_F(f).

The expectation value of a function F(x) of the random variable x is given by

⟨F(x)⟩ = ∫_{−∞}^{+∞} F(x) p(x) dx.

1.2.1 Moments of a PDF

Particularly important expectation values are the moments of the PDF

⟨x^n⟩ = ∫ x^n p(x) dx

and the characteristic function χ(k)

χ(k) = ⟨e^{−ikx}⟩ = ∫ e^{−ikx} p(x) dx,


which is just the Fourier transform of the PDF. Note that χ(0) = 1. The characteristic function is the generator of the moments:

χ(k) = ⟨ Σ_{n=0}^{+∞} ((−ik)^n / n!) x^n ⟩ = Σ_{n=0}^{+∞} ((−ik)^n / n!) ⟨x^n⟩,   (1.2)

from which we obtain

i^n d^n χ(k)/dk^n |_{k=0} = ⟨x^n⟩.

The PDF can be recovered from χ(k) by the inverse transformation

p(x) = (1/2π) ∫ e^{ikx} χ(k) dk.   (1.3)

One can also easily obtain the moments around any other point x_0 from

e^{ikx_0} χ(k) = ⟨e^{−ik(x − x_0)}⟩ = Σ_{n=0}^{+∞} ((−ik)^n / n!) ⟨(x − x_0)^n⟩.

It is clear that the knowledge of all the moments of a PDF defines it univocally, since ⟨x^n⟩ defines χ(k), from which p(x) can be obtained. See Eqs. (1.2) and (1.3).
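The moment-generating property can be sketched numerically (an illustrative addition, not part of the original notes): we build the empirical characteristic function of exponentially distributed samples, for which ⟨x⟩ = 1 and ⟨x²⟩ = 2, and differentiate it at k = 0 by finite differences.

```python
import cmath
import random

random.seed(2)
xs = [random.expovariate(1.0) for _ in range(100_000)]  # exponential: <x> = 1, <x^2> = 2

def phi(k):
    # empirical characteristic function chi(k) = < exp(-i k x) >
    return sum(cmath.exp(-1j * k * x) for x in xs) / len(xs)

h = 1e-3
# chi'(0)  = -i <x>    =>  <x>   =  i chi'(0)   (central difference)
mean_est = (1j * (phi(h) - phi(-h)) / (2 * h)).real
# chi''(0) = -<x^2>    =>  <x^2> = -chi''(0)
m2_est = (-(phi(h) - 2 * phi(0) + phi(-h)) / h ** 2).real

print(mean_est, m2_est)
```

Since the same samples enter all three evaluations of χ, the finite-difference noise largely cancels and the estimates agree with the exact moments within sampling error.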

1.2.2 Cumulants of a PDF*

The logarithm of the characteristic function χ(k) is known as the cumulant generating function:

ln χ(k) = Σ_{n=1}^{+∞} ((−ik)^n / n!) ⟨x^n⟩_c.   (1.4)

The cumulants ⟨x^n⟩_c are defined implicitly by means of the previous series expansion of ln χ(k). Note that ln χ(k = 0) = ln 1 = 0.

The cumulants and the moments are of course related. One can obtain the cumulants from

χ(k) = 1 + Σ_{n=1}^{+∞} ((−ik)^n / n!) ⟨x^n⟩ ≡ 1 + ε   (1.5)

and

ln χ(k) = ln(1 + ε) = Σ_{l=1}^{+∞} (−1)^{l+1} ε^l / l.   (1.6)


Using the definition of the cumulants (1.4) and replacing ε from (1.5) in (1.6) we have

Σ_{n=1}^{+∞} ((−ik)^n / n!) ⟨x^n⟩_c = Σ_{l=1}^{+∞} ((−1)^{l+1} / l) [ Σ_{m=1}^{+∞} ((−ik)^m / m!) ⟨x^m⟩ ]^l.

This leads to

⟨x⟩_c = ⟨x⟩   (mean)

⟨x²⟩_c = ⟨x²⟩ − ⟨x⟩²   (variance)

⟨x³⟩_c = ⟨x³⟩ − 3⟨x²⟩⟨x⟩ + 2⟨x⟩³   (skewness)

⟨x⁴⟩_c = ⟨x⁴⟩ − 4⟨x³⟩⟨x⟩ − 3⟨x²⟩² + 12⟨x²⟩⟨x⟩² − 6⟨x⟩⁴   (kurtosis).

A PDF can be described equivalently in terms of its cumulants or of its moments.
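These relations are easy to check numerically (an illustrative addition, not part of the original notes). For exponentially distributed samples with unit rate, all cumulants are known exactly, ⟨x^n⟩_c = (n − 1)!, so the combinations of raw moments above must reproduce 1, 1, 2 for n = 1, 2, 3.

```python
import random

random.seed(3)
xs = [random.expovariate(1.0) for _ in range(200_000)]  # cumulants <x^n>_c = (n-1)!

def moment(n):
    # raw moment <x^n> estimated from the samples
    return sum(x ** n for x in xs) / len(xs)

m1, m2, m3 = moment(1), moment(2), moment(3)

c1 = m1                                  # mean, expected 1
c2 = m2 - m1 ** 2                        # variance, expected 1
c3 = m3 - 3 * m2 * m1 + 2 * m1 ** 3      # third cumulant, expected 2

print(c1, c2, c3)
```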

1.2.3 Computation of moments in terms of cumulants*

Theorem: The m-th moment ⟨x^m⟩ is obtained by considering all possible subdivisions of m points in p_n groups or connected clusters of n points each. Of course Σ_n p_n n = m. Each possible subdivision contributes with the product of the cumulants ⟨x^n⟩_c associated to the connected clusters having n points.

Examples:

⟨x⟩ = (•) = ⟨x⟩_c

⟨x²⟩ = (•)(•) + (• •) = ⟨x⟩²_c + ⟨x²⟩_c

⟨x³⟩ = (•)(•)(•) + 3 (•)(• •) + (• • •) = ⟨x⟩³_c + 3 ⟨x⟩_c ⟨x²⟩_c + ⟨x³⟩_c

Exercise 1.3: Obtain the expression for ⟨x⁴⟩ in terms of ⟨x^l⟩_c with l = 1–4. Deduce ⟨x^l⟩_c in terms of ⟨x^l⟩ for l ≤ 4.


The theorem can be demonstrated by noting that

χ(k) = Σ_{m=0}^{+∞} ((−ik)^m / m!) ⟨x^m⟩ = exp{ln χ(k)}

     = exp{ Σ_{n=1}^{+∞} ((−ik)^n / n!) ⟨x^n⟩_c }

     = Π_{n=1}^{+∞} exp{ ((−ik)^n / n!) ⟨x^n⟩_c }

     = Π_{n=1}^{+∞} [ Σ_{p_n=0}^{+∞} ((−ik)^{n p_n} / (n!)^{p_n}) ⟨x^n⟩_c^{p_n} / p_n! ].

Matching the coefficients of the powers of (−ik)^m with all the possibilities yielding Σ_n p_n n = m we have

⟨x^m⟩ / m! = Σ_{{p_n}} Π_n ⟨x^n⟩_c^{p_n} / (p_n! (n!)^{p_n}),   (1.7)

where the sum runs over all the possibilities of forming subgroups with Σ_n p_n n = m. After rewriting Eq. (1.7) as

⟨x^m⟩ = Σ_{{p_n}} Π_n ( m! / (p_n! (n!)^{p_n}) ) ⟨x^n⟩_c^{p_n}

we can identify the different variables and factors as follows:

n: number of points in one cluster.
p_n: number of clusters with the same number n of points inside.
m!: number of permutations of all the m points.
(n!)^{p_n}: permutations of the points within each cluster.
p_n!: number of permutations of the clusters with n points among them.
m! / (p_n! (n!)^{p_n}): number of ways of splitting m points in {p_n} subgroups with n points each.

1.3 The Gaussian or normal distribution

The Gaussian distribution is given by

p(x) = (1/√(2πσ²)) e^{−(x−λ)²/(2σ²)},

where λ = ⟨x⟩ is the mean value and σ² = ⟨x²⟩ − ⟨x⟩² is the variance. The characteristic function

χ(k) = (1/√(2πσ²)) ∫ dx e^{−(x−λ)²/(2σ²) − ikx}


has also a Gaussian form. In order to prove it we set ξ = x − λ and rewrite the exponent as

(x − λ)²/(2σ²) + ikx = ξ²/(2σ²) + ikξ + ikλ

                     = (ξ + iσ²k)²/(2σ²) + σ²k²/2 + ikλ.

One obtains

χ(k) = e^{−ikλ − k²σ²/2} (1/√(2πσ²)) ∫_{−∞}^{+∞} dξ e^{−(ξ + iσ²k)²/(2σ²)} = e^{−ikλ − k²σ²/2},

since the normalized Gaussian integral over ξ equals 1.

The cumulant generating function is simply given by

ln χ(k) = −ikλ − k²σ²/2,

which implies

⟨x⟩_c = ⟨x⟩ = λ,

⟨x²⟩_c = ⟨x²⟩ − ⟨x⟩² = σ²,

and

⟨x^n⟩_c = 0 for n ≥ 3.

This makes the calculations using the cluster expansion particularly simple, since the graphical expansions involve only one- and two-point clusters (see Sec. 1.2.3).

1.4 Many random variables

For more than one variable, x⃗ = (x_1, . . . , x_N) ∈ ℝ^N, the set of outcomes is S ⊆ ℝ^N. The joint probability distribution function (PDF) is defined by

p(x⃗) Π_{i=1}^{N} dx_i = prob{event x⃗′ with x_i < x′_i < x_i + dx_i ∀ i},

which satisfies the normalization condition

∫ p(x⃗) d^N x = 1.

If, and only if, the variables are independent we have p(x⃗) = Π_{i=1}^{N} p_i(x_i), where p_i(x) is the PDF of the random variable x_i.


The unconditional probability density for a subset of random variables x_1, . . . , x_m is given by

p(x_1 . . . x_m) = ∫ Π_{i=m+1}^{N} dx_i p(x_1 . . . x_m x_{m+1} . . . x_N).

It describes the behavior of the variables x_1 . . . x_m irrespective of all the others. For instance, p(x⃗) = ∫ d³v p(x⃗, v⃗) gives the particle density (i.e., the probability distribution for the position) irrespective of the velocity v⃗.

The conditional PDF p(x_1 . . . x_m | x_{m+1} . . . x_N) describes the behavior of some variables x_1 . . . x_m, subject to the constraint that the other variables x_{m+1} . . . x_N have specified values. For example, one may search for the velocity distribution at a given point x⃗, which we denote by p(v⃗ | x⃗).

The joint probability is given by

p(x_1 . . . x_m, x_{m+1} . . . x_N) = p(x_{m+1} . . . x_N) p(x_1 . . . x_m | x_{m+1} . . . x_N),

where p(x_{m+1} . . . x_N) is the unconditional probability density for x_{m+1} . . . x_N, irrespective of the other variables x_1 . . . x_m, and p(x_1 . . . x_m | x_{m+1} . . . x_N) is the probability of x_1 . . . x_m given the values of x_{m+1} . . . x_N. Thus

p(x_1 . . . x_m | x_{m+1} . . . x_N) = p(x_1 . . . x_N) / p(x_{m+1} . . . x_N),

where p(x_1 . . . x_N) is the number of events x_1 . . . x_m, x_{m+1} . . . x_N (divided by the number of trials) and p(x_{m+1} . . . x_N) is the number of events x_{m+1} . . . x_N (divided by the number of trials). An example would be p(v⃗ | x⃗) = p(x⃗, v⃗)/p(x⃗).

The expectation value of some function F(x⃗) is calculated as usual from

⟨F(x⃗)⟩ = ∫ p(x⃗) F(x⃗) d^N x.

The joint characteristic function is given by the Fourier transform

χ(k⃗) = ∫ dx⃗ e^{−i k⃗·x⃗} p(x⃗) = ⟨ e^{−i Σ_{j=1}^{N} k_j x_j} ⟩,

and the joint cumulant generating function is ln χ(k⃗). In the particular case of independent variables we have p(x⃗) = Π_{i=1}^{N} p_i(x_i), which implies χ(k⃗) = Π_{i=1}^{N} χ_i(k_i) and ln χ(k⃗) = Σ_{i=1}^{N} ln χ_i(k_i), where χ_i(k) is the characteristic function of the probability distribution of the variable x_i.

The joint moments and joint cumulants are then obtained from

⟨x_1^{n_1} · · · x_N^{n_N}⟩ = [∂/∂(−ik_1)]^{n_1} · · · [∂/∂(−ik_N)]^{n_N} χ(k⃗) |_{k⃗=0}


and

⟨x_1^{n_1} ∗ · · · ∗ x_N^{n_N}⟩_c = [∂/∂(−ik_1)]^{n_1} · · · [∂/∂(−ik_N)]^{n_N} ln χ(k⃗) |_{k⃗=0}.

The graphical relation between moments and cumulants, which was demonstrated for one variable, also applies to N variables. For instance,

⟨x_1 x_2⟩ = (•₁)(•₂) + (•₁ •₂) = ⟨x_1⟩_c ⟨x_2⟩_c + ⟨x_1 ∗ x_2⟩_c

or

⟨x_1² x_2⟩ = (•₁)(•₁)(•₂) + (•₁ •₁)(•₂) + 2 (•₁ •₂)(•₁) + (•₁ •₁ •₂)

          = ⟨x_1⟩²_c ⟨x_2⟩_c + ⟨x_1²⟩_c ⟨x_2⟩_c + 2 ⟨x_1 ∗ x_2⟩_c ⟨x_1⟩_c + ⟨x_1² ∗ x_2⟩_c.

1.4.1 Joint cumulants of independent random variables*

It is easy to see that ⟨x_α ∗ x_β⟩_c = 0 if x_α and x_β are independent random variables. Let the PDF be of the form

p(x⃗) = p_1(x_1 . . . x_m) p_2(x_{m+1} . . . x_N).

Then

χ(k⃗) = ∫ dx⃗ e^{−i k⃗·x⃗} p(x⃗) = ⟨ e^{−i Σ_{j=1}^{m} k_j x_j} ⟩_1 ⟨ e^{−i Σ_{j=m+1}^{N} k_j x_j} ⟩_2 = χ_1(k⃗_1) χ_2(k⃗_2).

The joint moment

⟨x_α x_β⟩ = [∂χ_1/∂(−ik_α)] [∂χ_2/∂(−ik_β)] = ⟨x_α⟩_1 ⟨x_β⟩_2

for 1 ≤ α ≤ m and m+1 ≤ β ≤ N. It follows that

ln χ(k⃗) = ln χ_1(k⃗_1) + ln χ_2(k⃗_2)

and consequently

(∂/∂k_α)(∂/∂k_β) ln χ(k⃗) = 0

if 1 ≤ α ≤ m and m+1 ≤ β ≤ N. The joint cumulant ⟨x_α ∗ x_β⟩_c is also known as the connected correlation.


1.5 The Gaussian distribution for many variables*

The generalization to N variables of the normal distribution has the form

p(x⃗) = (1/√((2π)^N det Σ)) exp{ −(1/2) Σ_{mn} (Σ⁻¹)_{mn} (x_m − λ_m)(x_n − λ_n) },

where Σ is a positive-definite symmetric matrix and Σ⁻¹ refers to its inverse. Note that Σ⁻¹ is also positive definite. In other words, the argument of the exponential involves an arbitrary positive-definite quadratic form.

The characteristic function is given by

χ(k⃗) = ∫ dx⃗ e^{−i k⃗·x⃗} e^{−(1/2)(x⃗ − λ⃗)·Σ⁻¹(x⃗ − λ⃗)} (1/√((2π)^N det Σ)),

where we have introduced k⃗ = (k_1, . . . , k_N) and λ⃗ = (λ_1, . . . , λ_N). One may easily verify the normalization of p(x⃗) and compute χ(k⃗) by changing variables to y⃗ = x⃗ − λ⃗, so that the Gaussian distribution is centered at the origin of the coordinate system, and by performing an orthogonal transformation U such that Uᵗ Σ⁻¹ U is diagonal (Σ⁻¹ is symmetric). The Jacobian of the orthogonal transformation is equal to 1 (det U = 1). Denoting the eigenvalues of Σ⁻¹ by 1/σ_m² > 0 we have

(Uᵗ Σ⁻¹ U)_{mn} = δ_{mn} / σ_m²  with  Uᵗ U = 1,  and  (Uᵗ Σ U)_{mn} = δ_{mn} σ_m².

Setting U ξ⃗ = y⃗ = x⃗ − λ⃗ we have

χ(k⃗) = (e^{−i k⃗·λ⃗} / √((2π)^N det Σ)) ∫ dξ⃗ e^{−i k⃗·U ξ⃗} exp{ −(1/2) ξ⃗ · (Uᵗ Σ⁻¹ U) ξ⃗ },

where we used (U ξ⃗) · Σ⁻¹ (U ξ⃗) = ξ⃗ · Uᵗ Σ⁻¹ U ξ⃗.

If we set for a moment k⃗ = 0 to verify the normalization, we see that the integral splits into a product of N one-dimensional Gaussians, each yielding an integral √(2π σ_m²), so that Π_m √(2π σ_m²) = √((2π)^N det Σ). The joint PDF is therefore properly normalized.

In order to compute χ(k⃗) one can use the result for one-dimensional Gaussians with k⃗′ = Uᵗ k⃗, noting that (Uᵗ Σ⁻¹ U)_{mn} = δ_{mn}/σ_m². In this way one has

χ(k⃗) = e^{−i k⃗·λ⃗} Π_m e^{−k′_m² σ_m²/2} = e^{−i k⃗·λ⃗} exp{ −(1/2) Σ_m k′_m² σ_m² }.


Using that

\sum_m k_m'^{\,2}\, \sigma_m^2 = \sum_m k_m'\, \sigma_m^2\, k_m' = \vec{k}' \cdot U^t\, \Sigma\, U\, \vec{k}' = U^t\vec{k} \cdot U^t\, \Sigma\, \vec{k} = \vec{k}\cdot\Sigma\,\vec{k} = \sum_{mn} \Sigma_{mn}\, k_m\, k_n,

we finally obtain

\phi(\vec{k}) = e^{-i\vec{k}\cdot\vec{\lambda}\, -\, \frac{1}{2}\vec{k}\cdot\Sigma\,\vec{k}} = \exp\Big\{ -i \sum_m k_m\, \lambda_m - \frac{1}{2} \sum_{mn} \Sigma_{mn}\, k_m\, k_n \Big\}.

Consequently,

\ln \phi(\vec{k}) = -i \sum_m k_m\, \lambda_m - \frac{1}{2} \sum_{mn} \Sigma_{mn}\, k_m\, k_n,

which implies

\langle x_m \rangle_c = \lambda_m, \qquad \langle x_m\, x_n \rangle_c = \Sigma_{mn},

and all higher cumulants vanish.

In the special case of vanishing mean values, i.e., $\lambda_m = 0\ \forall m$, all odd cumulants vanish. Thus all odd moments vanish, and any even moment is given by the sum of the products of cumulants obtained from all possible ways of forming pairs of variables. For instance,

\langle x_\alpha\, x_\beta\, x_\gamma\, x_\delta \rangle = \langle x_\alpha\, x_\beta \rangle_c\, \langle x_\gamma\, x_\delta \rangle_c + \langle x_\alpha\, x_\gamma \rangle_c\, \langle x_\beta\, x_\delta \rangle_c + \langle x_\alpha\, x_\delta \rangle_c\, \langle x_\beta\, x_\gamma \rangle_c.

This is analogous to Wick's theorem in many-body Green's function theory.
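Since the pairing rule is purely a statement about Gaussian moments, it can be checked numerically by sampling. A minimal sketch in Python (the covariance matrix is an arbitrary positive-definite example, not taken from the text):

```python
import numpy as np

# Check <x_a x_b x_c x_d> = S_ab S_cd + S_ac S_bd + S_ad S_bc for a
# zero-mean multivariate Gaussian with covariance matrix Sigma.
rng = np.random.default_rng(0)
A = rng.normal(size=(3, 3))
Sigma = A @ A.T + 3.0 * np.eye(3)          # positive definite by construction

x = rng.multivariate_normal(np.zeros(3), Sigma, size=1_000_000)

a, b, c, d = 0, 1, 2, 0                     # indices may repeat
moment = np.mean(x[:, a] * x[:, b] * x[:, c] * x[:, d])
wick = (Sigma[a, b] * Sigma[c, d]
        + Sigma[a, c] * Sigma[b, d]
        + Sigma[a, d] * Sigma[b, c])

print(moment, wick)                         # agree within sampling error
```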

1.6 The central limit theorem

We consider the average $\bar{x} = \frac{1}{N}\sum_{\nu=1}^N x_\nu$ of $N$ random variables $x_1, \ldots, x_N$ having the joint PDF $p(x_1, \ldots, x_N)$. The PDF for the random variable $\bar{x}$ is

p_{\bar{x}}(x) = \int dx_1 \ldots dx_N\;\, p(x_1, \ldots, x_N)\;\, \delta\Big( x - \frac{1}{N} \sum_{\nu=1}^N x_\nu \Big)


and the characteristic function is

\phi_{\bar{x}}(k) = \int dx\;\, p_{\bar{x}}(x)\, e^{-i k x}
= \int dx\;\, e^{-i k x} \int dx_1 \ldots dx_N\;\, p(x_1, \ldots, x_N)\;\, \delta\Big( x - \frac{1}{N} \sum_\nu x_\nu \Big)
= \int dx_1 \ldots dx_N\;\, p(x_1, \ldots, x_N)\;\, e^{-i \frac{k}{N} \sum_{\nu=1}^N x_\nu}
= \phi_p\Big( k_1 = \frac{k}{N}, \ldots, k_N = \frac{k}{N} \Big),

where $\phi_p(\vec{k})$ is the characteristic function of $p(x_1, \ldots, x_N)$ [$\phi_p : \mathbb{R}^N \to \mathbb{C}$]. Let us now assume that the $x_\nu$ are independent variables, all having the same PDF $p_1(x)$, i.e.,

p(x_1, \ldots, x_N) = \prod_{\nu=1}^N p_1(x_\nu),

where the index 1 in $p_1(x_\nu)$ indicates that only one variable is involved. The characteristic function associated with $p(\vec{x})$ is

\phi_p(\vec{k}) = \prod_{i=1}^N \phi_1(k_i), \qquad (1.8)

where $\phi_1(k)$ is the characteristic function of $p_1(x)$. The characteristic function for the PDF of the average $\bar{x} = \frac{1}{N}\sum_\nu x_\nu$ then takes the form

\phi_{\bar{x}}(k) = \prod_{\nu=1}^N \Big( \int dx_\nu\;\, p_1(x_\nu)\, e^{-i \frac{k}{N} x_\nu} \Big) = \Big[ \phi_1\Big( \frac{k}{N} \Big) \Big]^N,

The cumulant generating function for $\bar{x}$ therefore reads

\ln \phi_{\bar{x}}(k) = N\, \ln \phi_1\Big( \frac{k}{N} \Big). \qquad (1.9)

We can now expand $\phi_1$ or $\ln \phi_1$ for small $k/N$, i.e., large $N$, in terms of the cumulants of the probability distribution for one variable:

\ln \phi_1\Big( \frac{k}{N} \Big) = -i\, \frac{k}{N}\, \langle x \rangle_c - \frac{1}{2} \Big( \frac{k}{N} \Big)^2 \langle x^2 \rangle_c + \mathcal{O}(N^{-3}). \qquad (1.10)

Combining Eqs. (1.9) and (1.10) we have

\ln \phi_{\bar{x}}(k) = -i k\, \langle x \rangle_c - \frac{1}{2}\, k^2\, \frac{\langle x^2 \rangle_c}{N} + \mathcal{O}(N^{-2}).


The average of $\bar{x}$ is, as expected, equal to $\langle x \rangle$, and the variance is given by $\langle \bar{x}^2 \rangle_c = \frac{\langle x^2 \rangle_c}{N}$, which vanishes for $N \to \infty$. Transforming back to $p_{\bar{x}}(x)$ we have asymptotically

p_{\bar{x}}(x) \cong \frac{1}{\sqrt{2\pi \sigma_{\bar{x}}^2}} \exp\Big\{ -\frac{1}{2 \sigma_{\bar{x}}^2} \big( x - \langle x \rangle \big)^2 \Big\}

with variance $\sigma_{\bar{x}}^2 = \sigma^2 / N$. The distribution of the average thus becomes Gaussian for $N \to \infty$. Moreover, the variance of $p_{\bar{x}}(x)$, the probability distribution of the average resulting from $N$ trials, tends to zero for $N \to \infty$. Notice that this holds independently of the form of the PDF of the random variable $x$, i.e., for any $p_1(x)$. The details of the physics behind $p_1(x)$ are therefore irrelevant. This important result is known as the central limit theorem.

The same reasoning applies to the sum of $N$ independent random variables $X = \sum_\nu x_\nu$. In this case the average is $\langle X \rangle = N \langle x \rangle$ and $\langle X^2 \rangle_c = N \langle x^2 \rangle_c$, so that the fluctuations around the mean, as measured by the standard deviation $\sigma_X = \sqrt{\langle X^2 \rangle_c} = \sqrt{N}\, \sigma$, scale with $\sqrt{N}$. The relative fluctuation

\frac{\sigma_X}{\langle X \rangle} \sim \frac{1}{\sqrt{N}} \to 0 \quad\text{for}\quad N \to \infty.
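Both statements — the $1/N$ variance of the average and the suppression of the fluctuations relative to the mean — are easy to observe numerically. A sketch assuming an exponential $p_1(x)$ (an arbitrary, strongly non-Gaussian choice):

```python
import numpy as np

# Averages of N i.i.d. exponential variables (<x> = 1, var = 1): the mean
# of xbar stays at <x>, its variance scales as 1/N, and the standardized
# average loses the strong skewness of the parent distribution.
rng = np.random.default_rng(1)
N, trials = 100, 50_000
xbar = rng.exponential(scale=1.0, size=(trials, N)).mean(axis=1)

print(xbar.mean())            # close to <x> = 1
print(xbar.var() * N)         # close to <x^2>_c = 1

z = (xbar - 1.0) * np.sqrt(N)               # standardized average
print(np.mean(z**3))          # far below the skewness 2 of a single exponential
```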

The consequences of the central limit theorem are far reaching. Consider for example a gas, liquid or solid which is allowed to exchange energy with its environment, and ask yourself how the probability distribution for the energy per atom should look. If the system is macroscopic (i.e., containing of the order of $10^{20}$ atoms and molecules) one may divide it into a large number $N$ of equal subvolumes, still containing a very large number of atoms, so that their behavior can be regarded as essentially uncorrelated. The probability of finding the subsystems in the states $q_1, q_2, \ldots, q_N$ is thus given to a good approximation by $p(q_1, q_2, \ldots, q_N) = \prod_{i=1}^N p_1(q_i)$, where $p_1(q)$ is the probability of finding a subsystem in the state $q$. Assuming homogeneity, the $p_1$ should all be the same. As pointed out by Gibbs, these conditions apply to the physics of macroscopic systems in general. We may now focus on the probability distribution of the energy. Consider the energy per atom $\varepsilon_i$ in subsystem $i$ as an independent random variable, and assume that each $\varepsilon_i$ follows some unknown probability distribution $p_1(\varepsilon)$. The energy per particle of the whole system is given by the average $\bar{\varepsilon} = \sum_i \varepsilon_i / N$ of the random variables $\varepsilon_i$. The central limit theorem implies that the energy per particle of any macroscopic system in equilibrium is normally distributed, with a relative mean-square deviation tending to zero as $1/\sqrt{N}$, where $N$ is proportional to the number of particles. The fact that this holds independently of the nature of the system and its internal interactions, provided that they are short ranged, demonstrates the universality of the statistical approach.

1.7 Information content of a probability distribution

Consider a random variable with a discrete set of outcomes $S = \{x_i,\ i = 1, \ldots, M\}$ occurring with probabilities $\{p_i\}$. Suppose we construct a message $x_1 \ldots x_N$ with $N$ independent outcomes of the random variable. We intend to quantify the possible information content of such a message as a function of the probability distribution $\{p_i,\ i = 1, \ldots, M\}$. Analyzing the number of different possible messages will allow us to infer how much of


the apparent information content of the message is already contained in the probability distribution. For instance, if the probability distribution is $\{p_1 = 1,\ p_i = 0\ \forall i > 1\}$ there is just one possible message $(x_1 \ldots x_1)$ and actually no information can be conveyed. All the information is in the probability distribution (PD). In the other extreme case, where the $x_i$ are uniformly distributed, the PD carries no information at all.

Let us first consider the case where the values of $x_i$ in the message $x_1 \ldots x_N$ can be chosen at will. Since there are $M$ possibilities for each $x_i$, the number of different messages is $g = M^N$. The number of bits $K$ necessary to transmit such a message, or, if you want, the number of bits needed to distinguish one message from another, is $K = \log_2 g = \log_2 M^N = N \log_2 M$ (since $2^K = M^N$). On the other hand, if the $x_i$ are taken from the probability distribution $p_i$, the possible choices of $x_i$ are limited. For instance, if $p_1 \gg p_2$ it is unlikely to construct a message with more occurrences of $x_2$ than $x_1$. In the limit of a large number of message elements $N$, the number of occurrences of $x_i$ in the message approaches asymptotically $N_i = p_i N$. In fact, the probability of finding $|N_i - N p_i| > \sqrt{N_i}$ becomes exponentially small as $N \to \infty$. Taking into account the restriction that the message contains $N_i$ occurrences of $x_i$, the number of possible messages is reduced to

g = \frac{N!}{\prod_{i=1}^M N_i!}.

This corresponds to the number of possible ways of arranging the $N_1, \ldots, N_M$ occurrences of $x_1 \ldots x_M$. To specify the message we therefore need

K = \log_2 g = \log_2 N! - \sum_{i=1}^M \log_2 N_i!
\cong N \log_2 N - N - \sum_{i=1}^M \big( N_i \log_2 N_i - N_i \big)
= -N \sum_{i=1}^M \frac{N_i}{N} \log_2 \frac{N_i}{N} = -N \sum_{i=1}^M p_i \log_2 p_i

bits of information. As expected, we recover the two limiting cases discussed above: $\log_2 g = 0$ for $p_i = 1$, $p_j = 0\ \forall j \neq i$, and $\log_2 g = N \log_2 M$ for $p_i = 1/M\ \forall i$ (uniform distribution). For any non-uniform probability distribution the information content of the message, $\log_2 g$, is smaller than $N \log_2 M$, which is the information content in the absence of any knowledge of the relative probabilities $p_i$. One assigns this difference to the information carried by the probability distribution $\{p_i\}$. The information content of the probability distribution $\{p_i\}$ is thus defined as

I\{p_i\} = \log_2 M + \sum_{i=1}^M p_i \log_2 p_i,

i.e., the reduction of information per unit message-length or transmitted token.
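The counting argument behind $I\{p_i\}$ can be tested directly: already for moderate $N$, the exact $\log_2 g = \log_2\!\big(N!/\prod_i N_i!\big)$ is very close to $-N\sum_i p_i \log_2 p_i$. A sketch with an arbitrary three-outcome distribution:

```python
import math

# Compare the exact message count log2(g) with the estimate
# -N * sum_i p_i log2 p_i, and evaluate the information content I{p_i}.
p = [0.5, 0.3, 0.2]                 # arbitrary example distribution
N = 10_000
Ni = [round(pi * N) for pi in p]    # occurrences N_i = p_i N

log2_g = (math.lgamma(N + 1) - sum(math.lgamma(n + 1) for n in Ni)) / math.log(2)
shannon = -N * sum(pi * math.log2(pi) for pi in p)
print(log2_g, shannon)              # agree to leading order in N

M = len(p)
I = math.log2(M) + sum(pi * math.log2(pi) for pi in p)
print(I)                            # bits carried by {p_i} per transmitted token
```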


Introducing the definition of the entropy of a probability distribution

S\{p_i\} = -\sum_{i=1}^M p_i \ln p_i = -\langle \ln p_i \rangle \ge 0 \qquad (0 \le p_i \le 1)

we have

I\{p_i\} = \big( S_{\max} - S\{p_i\} \big) / \ln 2,

where

S_{\max} = -\sum_{i=1}^M \frac{1}{M} \ln \frac{1}{M} = \ln M

is the maximum of $S\{p_i\}$, corresponding to $p_i = 1/M$. A distribution with maximum entropy carries the least possible information. Therefore, $S$ gives a measure of the diversity of the distribution. $S$ is actually the logarithm of the number of possible microscopically different states (messages) that can be constructed with elements satisfying $N_i/N = p_i$. For the distribution $p_i = \delta_{ij}$ (for some $j$) there is only one possible microscopic configuration or message $(x_j x_j \ldots x_j)$. In this case, and only in this case, we have $S = 0$.

The entropy does not depend on the values of the random variables. Any one-to-one mapping $x_i \to f_i = F(x_i)$ leaves the entropy unchanged, since $p(x_i) = p(f_i)$. This implies in particular that the (non-equilibrium) entropy of a system of interacting particles (e.g., an interacting electron gas) with occupation probabilities $n_{\vec{k}}$ for each quasi-particle state $\vec{k}$ is the same as the entropy of a non-interacting system having the same $n_{\vec{k}}$. The actual equilibrium entropy at a given temperature $T$ will of course be different, since in equilibrium the entropy corresponds to the maximum value of $S\{n_{\vec{k}}\}$ compatible with the constraint of a fixed average energy $\langle E \rangle$.

In contrast, any many-to-one mapping will reduce the entropy of the probability distribution, since it reduces its diversity or, in other words, increases the definiteness or the information content. For example, given $p_1$ and $p_2$, the mapping

\{x_1, x_2\} \longrightarrow f

gives

p(f) = p_1 + p_2.

The resulting change in the entropy reads

\Delta S = -p(f) \ln p(f) + (p_1 \ln p_1 + p_2 \ln p_2)
= S_f - S_{12}
= p_1 \ln \frac{p_1}{p_1 + p_2} + p_2 \ln \frac{p_2}{p_1 + p_2},

which is negative, provided that $p_2 \neq 0$ (or $p_1 \neq 0$). Conversely, removing a constraint in a probability distribution systematically increases $S$.
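A two-line numerical check of this formula (the four-outcome distribution is an arbitrary example):

```python
import math

# Merging the outcomes x1, x2 -> f reduces the entropy by
# Delta S = p1 ln(p1/(p1+p2)) + p2 ln(p2/(p1+p2)) <= 0.
def entropy(p):
    return -sum(pi * math.log(pi) for pi in p if pi > 0)

p = [0.4, 0.3, 0.2, 0.1]
merged = [0.4 + 0.3, 0.2, 0.1]      # many-to-one mapping on the first two outcomes

dS_direct = entropy(merged) - entropy(p)
p1, p2 = 0.4, 0.3
dS_formula = p1 * math.log(p1 / (p1 + p2)) + p2 * math.log(p2 / (p1 + p2))
print(dS_direct, dS_formula)        # identical, and negative
```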


1.7.1 Inferring unbiased probability distributions

The entropy $S$ can also be used to infer subjective (theoretical) estimates of probability distributions. For instance, in the absence of any information on the $p_i$, the best unbiased estimate is that all $M$ possible outcomes are equally probable, i.e., $p_i = 1/M$. This distribution maximizes the entropy, the diversity of the distribution and the number of possible microscopic states for the given $M$ available states or outcomes. One may also say that this choice minimizes the information content of the probability distribution $\{p_i\}$.

If additional information is available, the unbiased estimate for $p_i$ is obtained by maximizing $S$ subject to the constraints imposed by the available information. As an example, let us assume that we know the average value $\langle F(x) \rangle = f$ of some function $F(x)$ of the random variable $x$. In this case we obtain the unbiased $p_i$ from the extremum of

L\{p_i\} = \underbrace{S\{p_i\}}_{-\sum_{i=1}^M p_i \ln p_i} - \alpha \Big( \sum_{i=1}^M p_i - 1 \Big) - \beta \Big( \sum_{i=1}^M p_i\, F(x_i) - f \Big),

where $\alpha$ and $\beta$ are Lagrange multipliers. Straightforward differentiation yields

\frac{\partial L}{\partial p_i} = -\ln p_i - 1 - \alpha - \beta\, F(x_i) = 0
\;\Rightarrow\; p_i = e^{-(1+\alpha)}\, e^{-\beta F(x_i)}.

Imposing the normalization condition $\sum_i p_i = 1$ we obtain $e^{1+\alpha} = \sum_i e^{-\beta F(x_i)}$ and thus

p_i = \frac{e^{-\beta F(x_i)}}{\sum_{i=1}^M e^{-\beta F(x_i)}}, \qquad (1.11)

where $\beta$ is such that

f = \langle F(x) \rangle = \frac{\sum_{i=1}^M e^{-\beta F(x_i)}\, F(x_i)}{\sum_i e^{-\beta F(x_i)}}. \qquad (1.12)
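In practice $\beta$ must be found numerically from Eq. (1.12). A sketch for $F(x) = x$ on the six faces of a die with a prescribed mean $\langle x \rangle = 4.5$ (both choices are arbitrary illustrations); since $\langle F \rangle$ is monotonically decreasing in $\beta$, simple bisection suffices:

```python
import math

# Unbiased distribution p_i ~ exp(-beta*F(x_i)) subject to <F(x)> = f,
# with beta fixed by Eq. (1.12) via bisection.
xs = [1, 2, 3, 4, 5, 6]
f_target = 4.5

def mean_F(beta):
    w = [math.exp(-beta * x) for x in xs]
    return sum(wi * x for wi, x in zip(w, xs)) / sum(w)

lo, hi = -5.0, 5.0                  # <F> is monotonically decreasing in beta
for _ in range(100):
    mid = 0.5 * (lo + hi)
    if mean_F(mid) > f_target:
        lo = mid                    # mean too large -> increase beta
    else:
        hi = mid
beta = 0.5 * (lo + hi)

Z = sum(math.exp(-beta * x) for x in xs)
p = [math.exp(-beta * x) / Z for x in xs]
print(beta, p)                      # beta < 0: large faces must be favoured
```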

Exercise 1.4:
Find the unbiased probability $p(x_i)$ for a random variable $x_i$ ($i = 1, \ldots, M$) knowing the first $n$ moments of $p(x_i)$ (i.e., $\langle x^\nu \rangle = M_\nu$ for $\nu \le n$).

i) Show that $p(x_i) \propto \exp\Big( \sum_{\nu=0}^n a_\nu\, x_i^\nu \Big)$ with certain coefficients $a_\nu$.

ii) Consider the partition function $Z = \sum_{i=1}^M \exp\Big\{ \sum_{\nu=0}^n a_\nu\, x_i^\nu \Big\}$ and show that the coefficients are given by the equations

\frac{\partial \ln Z}{\partial a_\nu} = M_\nu.


1.7.2 Entropy of continuous probability densities

In analogy with the discrete case we can define the entropy of a continuous probability density distribution $p(\vec{x})$ as

S = -\langle \ln p(\vec{x}) \rangle = -\int p(\vec{x}) \ln p(\vec{x})\;\, d\vec{x}.

However, notice that this definition does not share some of the nice properties of $S = -\sum_i p_i \ln p_i$ for discrete random variables. For instance, for a uniform distribution in the interval $[a, b]$, i.e.,

p(x) = \begin{cases} 1/(b-a) & \text{for } a \le x \le b \\ 0 & \text{elsewhere,} \end{cases}

we have

S = -\int_a^b \frac{1}{b-a} \ln\Big( \frac{1}{b-a} \Big)\, dx = \ln(b-a).

For large intervals this is positive and diverges logarithmically [$S \to +\infty$ for $(b-a) \to +\infty$]. In the opposite limit of a very narrow PDF around some point $x_0$, we have

p(x) = \begin{cases} 1/\varepsilon & \text{for } x_0 - \varepsilon/2 < x < x_0 + \varepsilon/2 \\ 0 & \text{elsewhere,} \end{cases}

and $S \to -\infty$ for $\varepsilon \to 0$. Notice that $S$ can take negative values for very sharp $p(x) \simeq \delta(x)$, since $p(x)$ is not bounded. These situations, however, never appear in the description of macroscopic systems. It is interesting to observe that $S$ decreases as the diversity of the distribution decreases (e.g., $S[\delta(x)] \to -\infty$), as in the case of discrete random variables.

In order to avoid this problem, or rather to understand the origin of the divergence of $\langle \ln p(x) \rangle$ for $p(x) = \sum_i p_i\, \delta(x - x_i)$, it is useful to derive the expression of $S$ for a continuous PDF starting from the expression for discrete variables

S = -\sum_i p_i \ln p_i

and introducing a finite resolution or threshold $\Delta$, below which two outcomes (differing by less than $\Delta$) are considered to be equivalent. With this coarse graining, the logarithm of the number of possible messages (i.e., the diversity of the probability distribution) is given by

S = -\sum_i P(x_i < x < x_i + \Delta)\;\, \ln\big[ P(x_i < x < x_i + \Delta) \big]


where $x_i = i\Delta$ ($x_{i+1} = x_i + \Delta$) and $p_i = P(x_i < x < x_i + \Delta)$ is the probability for $x$ to lie in the interval $[x_i, x_i + \Delta]$. Using that

P(x_i < x < x_i + \Delta) = P(x_i + \Delta) - P(x_i) = \int_{x_i}^{x_i + \Delta} p(x)\, dx,

where $P(x)$ refers to the cumulative probability function and $p(x) = \frac{dP}{dx}$ to the probability density function, we have

S = -\sum_i \Big( \int_{x_i}^{x_i+\Delta} p(x)\, dx \Big)\, \ln\Big( \int_{x_i}^{x_i+\Delta} p(x)\, dx \Big) \ge 0.

If the spectrum of outcomes is discrete, or shows very narrow peaks (narrower than $\Delta$), we can still compute $S$ and recover the limit of discrete random variables. However, if $p(x)$ is smooth we can write

S \cong -\sum_i \Big( \int_{x_i}^{x_i+\Delta} p(x)\, dx \Big)\, \ln\big[ p(x_i)\, \Delta \big]
\cong -\sum_i \Big( \int_{x_i}^{x_i+\Delta} p(x)\, dx \Big)\, \big[ \ln p(x_i) + \ln \Delta \big]
\cong -\int p(x) \ln\big[ p(x) \big]\, dx - \ln \Delta.
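This relation between the coarse-grained entropy and the differential entropy can be tested numerically. A sketch for a standard Gaussian, whose differential entropy is $\frac{1}{2}\ln(2\pi e)$ (the bin width $\Delta = 0.01$ is an arbitrary choice):

```python
import math

# Binned entropy -sum_i P_i ln P_i of a smooth p(x) versus the
# differential entropy minus ln(Delta).
delta = 0.01

def cdf(x):                          # standard-normal cumulative distribution
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

S_binned, x = 0.0, -10.0
while x < 10.0:                      # tails beyond +-10 are negligible
    P = cdf(x + delta) - cdf(x)      # exact bin probability
    if P > 0.0:
        S_binned -= P * math.log(P)
    x += delta

S_diff = 0.5 * math.log(2.0 * math.pi * math.e)
print(S_binned, S_diff - math.log(delta))    # agree closely
```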

The term $\ln \Delta$ cancels the divergence of $S = -\langle \ln p(x) \rangle$ for $p(x) \to \delta(x)$.

Another potential problem of the definition

S = -\int p(x) \ln\big[ p(x) \big]\, dx

is that it is not necessarily invariant under a bijective mapping $f = F(x)$. In fact, one has

p(f)\, df = p(x)\, dx \;\Rightarrow\; p(f) = p(x)\, \frac{1}{\big| \frac{dF}{dx} \big|}

and therefore

S[p(f)] = -\int p(f) \ln p(f)\, df = -\int p(x) \Big[ \ln p(x) - \ln \Big| \frac{dF}{dx} \Big| \Big]\, dx
= -\int p(x) \ln p(x)\, dx + \int p(x) \ln \Big| \frac{dF}{dx} \Big|\, dx
= S[p(x)] + \Big\langle \ln \Big| \frac{dF}{dx} \Big| \Big\rangle.


A simple change of scale, i.e., $f = \alpha x$, would thus change the value of $S$. An expansion ($\alpha > 1$) increases $S$, since it increases the diversity, while a compression ($\alpha < 1$) reduces $S$, since it flattens the variations of the random variable.
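For a Gaussian of width $\sigma$, whose entropy is $S = \frac{1}{2}\ln(2\pi e \sigma^2)$, this shift can be written out explicitly; the values of $\sigma$ and $\alpha$ below are arbitrary:

```python
import math

# Under f = alpha*x a Gaussian of width sigma becomes one of width
# alpha*sigma, and S increases by exactly <ln|dF/dx|> = ln(alpha).
sigma, alpha = 1.0, 3.0
S_x = 0.5 * math.log(2.0 * math.pi * math.e * sigma**2)
S_f = 0.5 * math.log(2.0 * math.pi * math.e * (alpha * sigma)**2)
print(S_f - S_x, math.log(alpha))   # equal
```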

In the case of many random variables we define $S$ analogously as

S = -\langle \ln p(\vec{x}) \rangle = -\int p(\vec{x}) \ln p(\vec{x})\;\, d\vec{x}.

Following a change of variables $\vec{x} \to \vec{f}$ we have

S\big[ p(\vec{f}) \big] = S\big[ p(\vec{x}) \big] + \big\langle \ln \big| J(\vec{f}) \big| \big\rangle,

where $J(\vec{f}) = \big| \frac{\partial f_i}{\partial x_j} \big|$ is the Jacobian of the variable transformation $f_i = f_i(x_1, \ldots, x_N)$. The entropy is thus invariant under canonical transformations in classical mechanics and unitary transformations in quantum mechanics, since in these cases the Jacobian is equal to 1.

Exercise 1.5: Loaded die
A die is loaded such that $p_6 = n\, p_1$ (e.g., $n = 2$). In other words, six occurs $n$ times more often than one.

i) Find the unbiased probabilities for the six faces of the die.

ii) What is the information content of the probability distribution $\{p_i,\ i = 1, \ldots, 6\}$ as a function of $n$?


2 Mixed states, density matrices and distribution functions

2.1 Introduction

The usual point of view in mechanics, either quantum or classical, is to predict the state of a system at a given time $t$, provided that one knows the state of the system at a previous time $t_0 < t$ and the exact form of the Hamiltonian. This is actually the scope of any deterministic mechanical theory. The information required to define a state, the very nature of it, and its connection to the observable properties are very different in the quantum and classical cases. Still, the time evolution is in both cases deterministic. While we all agree that quantum mechanics should be the basis of any theory of matter, it is easy to see that a deterministic approach to macroscopic systems is neither practical nor meaningful. From a practical perspective, first, it is clear that we can never succeed in recording all the information needed to determine the initial state of a system composed of $10^{20}$ atoms. Even the exact form of the Hamiltonian governing the system and its environment is, strictly speaking, inaccessible. Needless to say, solving the time evolution of such a system is impossible, both analytically and numerically. The other important reason not to pursue the deterministic perspective is understanding. Suppose we were able to propagate for 1 µs a wave function, also known as a pure state, depending on $10^{20}$ coordinates. What would we do with all these numbers? From an experimental perspective we are only interested in mean values, either within an interval of time, or in space, or both. The information consisting of the complete wave function (or of the coordinates and momenta of all particles in the classical case) as a function of time does not provide any understanding.

In order to successfully approach this problem a change of perspective is needed. Instead of following the dynamics of a given system having a given initial state as a function of time, it is far more meaningful to consider a large number of equivalent systems, which have a wide variety of initial states, ideally all possible states compatible with the global boundary conditions and constants of motion. This ensemble of systems should be distributed across the accessible states with some initial probability distribution. Actually, only the probability of finding a system in a given state is needed in order to compute any average value. Notice that by average values we mean not only global properties of the system (e.g., total energy, kinetic energy, magnetization, equilibrium volume, etc.) but also the most detailed microscopic information, such as the spin-polarized density distribution, or density and spin correlation functions, including their dependence on experimentally tunable external fields. The challenge is then to determine the time evolution of the probability distribution once the initial distribution has been given. This is the perspective of statistical mechanics as proposed by Maxwell, Boltzmann and Gibbs. The properties of such statistical ensembles of systems are characterized by density operators $\hat{\rho}$ in quantum mechanics and by statistical distribution functions $\rho(p, q)$ in classical mechanics. In the following we shall review the properties of $\hat{\rho}$ and $\rho(p, q)$. A rigorous justification of the ensemble or mixed-state approach to macroscopic systems will be given later on. As we shall see, mixed states and ensembles appear quite naturally once the crucial, even if very weak, coupling between the system and its environment is taken into account.


2.2 Mixed states

The fundamental principles of quantum mechanics establish that the state of any system is characterized by an element $|\psi\rangle$ of a vector space or Hilbert space $V$. The superposition principle implies that the set of all possible states forms a vector space, since for all $|\psi_1\rangle$ and $|\psi_2\rangle \in V$ we have $a|\psi_1\rangle + b|\psi_2\rangle \in V$ for all $a, b \in \mathbb{C}$. These states are called pure states, or microstates in the broader context of statistical mechanics. In order to characterize $|\psi\rangle$ one usually expands it in a basis of states having well-defined values of a complete set of compatible observables $f$:

|\psi\rangle = \sum_n a_n\, |f_n\rangle + \int a_f\, |f\rangle\;\, df

It is clear that knowing all these expansion coefficients is impossible in practice, except for very small systems (atoms and small molecules). Strictly speaking, all possible states of the system at any time $t$ are pure states, i.e., elements of $V$. Nevertheless, this is not the only physical situation that one may encounter.

In many cases of interest, and in particular in the study of macroscopic systems, one has to deal with an incoherent superposition of microstates $|\alpha_i\rangle$, each having a probability $w_i$ but bearing no correlation whatsoever. Such ensembles of microstates are called mixed states or mixed ensembles.

Mixed states are necessary in order to describe the properties of ensembles of physically equivalent, statistically independent systems, each being in a state $|\alpha_i\rangle$ with probability $w_i$. The reader is probably familiar with calculating the properties of a beam of atoms coming out of an oven in the context of the Stern-Gerlach experiment. In the case of a macroscopic system it is reasonable to regard a large system (containing for example $10^{20}$ particles) as an ensemble of $N$ subsystems, each containing a large number of particles as well (for example, $10^{10}$ subsystems with $10^{10}$ particles each). In the macroscopic limit the subsystems are also macroscopic, and the interactions between them are weak, so that their states can be regarded as statistically independent. The surface-to-volume ratio of the subsystems tends to zero as their size tends to infinity. It is important to note that full statistical independence requires the system to be open, or to be part of a much larger closed system, since in the case of a closed system the subsystems must satisfy the usual conservation laws. For example, the sum of the energies of all the subsystems of a closed system must be constant. In any case it is clear that the predictions of a theory of macroscopic systems must be the same whether we consider fewer larger subsystems or a larger number of smaller ones, provided that they remain macroscopic.

We consider an ensemble of systems distributed over the pure states $|\alpha_i\rangle$ with probabilities $w_i$. The fractional populations of the states $|\alpha_i\rangle$ satisfy the normalization condition

\sum_i w_i = 1


The pure states $|\alpha_i\rangle$ are properly normalized,

\langle \alpha_i | \alpha_i \rangle = 1 \quad \forall i

However, the different $|\alpha_i\rangle$ need not be orthogonal to each other. Clearly, the states of different subsystems bear no correlation to each other and therefore need not be orthogonal.

The average of any observable $A$ when a large number of measurements is performed on the ensemble is

\langle A \rangle = \sum_i w_i\, \langle \alpha_i | \hat{A} | \alpha_i \rangle

Notice that $\langle A \rangle$ involves the usual quantum mechanical average of the observable $A$ in the pure state $|\alpha_i\rangle$, weighted by the probability $w_i$ of finding the system in $|\alpha_i\rangle$.

It is interesting to observe that ensemble averages also appear when one considers the properties of a single isolated system which has interacted with its environment prior to isolation. This is the most relevant situation we encounter in real experiments, since preparing a macroscopic system in a given state requires that some interactions with the environment have taken place, whose details cannot all be controlled. We consider a system, which is characterized by a complete set of states $\{|\phi_n\rangle\}$, and its environment, which is characterized by the complete basis set $\{|\chi_m\rangle\}$. The entity 'system plus environment' (S+E) is assumed to be strictly isolated. Thus, according to the laws of quantum mechanics, S+E must be in a well-defined pure state of the form

|\psi\rangle = \sum_{mn} a_{mn}\, |\chi_m\rangle\, |\phi_n\rangle,

even though we are unable to know the values of the expansion coefficients $a_{mn} \in \mathbb{C}$. We would like to compute the average value of an operator $\hat{A}$ which concerns some property of the system and thus acts only on the variables $|\phi_n\rangle$. To this aim we rewrite

|\psi\rangle = \sum_m |\chi_m\rangle \sum_n a_{mn}\, |\phi_n\rangle \qquad (2.1)

and define

b_m\, |\alpha_m\rangle = \sum_n a_{mn}\, |\phi_n\rangle

with

\langle \alpha_m | \alpha_m \rangle = 1 \quad \forall m.

The coefficients $|b_m|^2$ are given by

|b_m|^2 = \sum_{n,n'} a^*_{mn'}\, a_{mn}\, \langle \phi_{n'} | \phi_n \rangle = \sum_n |a_{mn}|^2.

We may take $b_m \in \mathbb{R}$, so that

|\alpha_m\rangle = \frac{\sum_n a_{mn}\, |\phi_n\rangle}{\sqrt{\sum_n |a_{mn}|^2}}.


$|b_m|^2 > 0$ represents the probability of finding the environment in the state $|\chi_m\rangle$ when the isolated system plus environment is in the pure state $|\psi\rangle$ given by Eq. (2.1), which we may write as

|\psi\rangle = \sum_m b_m\, |\chi_m\rangle\, |\alpha_m\rangle.

Notice that $\langle \alpha_m | \alpha_{m'} \rangle \neq 0$ for $m \neq m'$ in general. The normalization of the state $|\psi\rangle$ and the fact that $\langle \alpha_m | \alpha_m \rangle = 1\ \forall m$ imply

\sum_m |b_m|^2 = 1.

We may now compute $\langle \psi | \hat{A} | \psi \rangle$ for any operator concerning the system, i.e., acting only on the variables $|\phi_n\rangle$ or $|\alpha_m\rangle$:

\langle \psi | \hat{A} | \psi \rangle = \sum_{m,m'} b^*_{m'}\, b_m\, \underbrace{\langle \chi_{m'} | \chi_m \rangle}_{\delta_{mm'}}\, \langle \alpha_{m'} | \hat{A} | \alpha_m \rangle = \sum_m |b_m|^2\, \langle \alpha_m | \hat{A} | \alpha_m \rangle.

Remarkably, this takes the same form as the ensemble average with $w_m = |b_m|^2$, even though the state of the system plus environment is pure. Moreover, this expression holds for all times, since the eigenstates of the environment remain orthogonal to each other once the system and environment are decoupled. In practice it is impossible to keep a system decoupled from its environment for a very long time. All real isolated systems are actually quasi-closed at best. One can imagine that the system and the environment interact from time to time. This corresponds to a change or redefinition of the expansion coefficients $a_{mn}$ defining the global state $|\psi\rangle$. Consequently, the interaction with the environment implies sampling different states $|\alpha_m\rangle$ and weights $w_m = |b_m|^2$. It is precisely the system-environment interaction, however weak, that finally leads to the establishment of thermodynamic equilibrium.
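This identity can be checked with random coefficients $a_{mn}$. A numerical sketch (the dimensions and the hermitian observable are arbitrary choices):

```python
import numpy as np

# A pure state |psi> = sum_{mn} a_mn |chi_m>|phi_n> of environment (x) system
# reproduces the ensemble average <psi|A|psi> = sum_m |b_m|^2 <alpha_m|A|alpha_m>
# for an operator A acting on the system only.
rng = np.random.default_rng(2)
M, n = 4, 3                                      # dim(environment), dim(system)
a = rng.normal(size=(M, n)) + 1j * rng.normal(size=(M, n))
a /= np.linalg.norm(a)                           # <psi|psi> = 1

A = rng.normal(size=(n, n))
A = A + A.T                                      # a hermitian observable

# Direct expectation value; the environment index m is a spectator.
direct = np.einsum('mi,ij,mj->', a.conj(), A, a).real

# Ensemble form: b_m |alpha_m> = sum_n a_mn |phi_n>, weights w_m = |b_m|^2.
w = np.sum(np.abs(a)**2, axis=1)
alphas = a / np.sqrt(w)[:, None]                 # normalized |alpha_m>
ensemble = sum(w[m] * (alphas[m].conj() @ A @ alphas[m]).real for m in range(M))

print(direct, ensemble)                          # identical
```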

It is important to note that the weights $w_m = |b_m|^2$ are independent of time as long as the system remains perfectly isolated from the environment, i.e., as long as the system and the environment do not interact. Indeed, a lack of interaction means that the Hamiltonian takes the form $H = H_{\rm env} + H_{\rm sys}$, where $H_{\rm sys}$ ($H_{\rm env}$) acts only on the variables of the system (environment). The states $|\phi_n\rangle$ and $|\chi_m\rangle$ remain eigenstates of the system and the environment if no interaction is present:

H_{\rm env}\, |\chi_m\rangle = E_m\, |\chi_m\rangle

and

H_{\rm sys}\, |\phi_n\rangle = \varepsilon_n\, |\phi_n\rangle

Consequently,

|\psi(t)\rangle = \sum_{mn} \underbrace{a_{mn}\, e^{-\frac{i}{\hbar}(E_m + \varepsilon_n)\, t}}_{a_{mn}(t)}\;\, |\chi_m\rangle\, |\phi_n\rangle

and

|b_m(t)|^2 = \sum_n |a_{mn}(t)|^2 = \sum_n |a_{mn}|^2


is independent of time. This implies that for a strictly isolated system all the time dependence of a mixed state must be ascribed to the time dependence of the states $|\alpha_m\rangle$. As we shall see, this leads to the Liouville equation of motion for the density operator $\hat{\rho}$.

Summarizing, mixed states or mixed ensembles provide a more general perspective on quantum mechanical systems, which includes pure states as a particular case ($w_m = 1$ for a given $m$) and which allows us to describe macroscopic systems from two complementary perspectives:

i) Macroscopic systems can be regarded as a statistical ensemble of a large number of macroscopic subsystems, and

ii) macroscopic systems can be regarded as quasi-closed systems having a very weak though non-vanishing interaction with the environment.

2.3 Density matrix

The expression for the average value of an operator in a mixed state,

\langle A \rangle = \sum_i w_i\, \langle \alpha_i | \hat{A} | \alpha_i \rangle,

has a very clear physical interpretation. However, one would like to express $\langle A \rangle$ in an invariant form which clearly separates the factors that depend on the ensemble from those that depend on the property under study. To this aim we introduce the completeness relation $\sum_n |n\rangle\langle n| = 1$ and obtain

\langle A \rangle = \sum_i w_i\, \langle \alpha_i |\, \hat{A} \Big( \sum_n |n\rangle\langle n| \Big) | \alpha_i \rangle
= \sum_n \sum_i \langle n | \alpha_i \rangle\, w_i\, \langle \alpha_i | \hat{A} | n \rangle
= {\rm Tr}\Big\{ \Big( \sum_i |\alpha_i\rangle\, w_i\, \langle \alpha_i| \Big)\, \hat{A} \Big\} = {\rm Tr}\big\{ \hat{\rho}\, \hat{A} \big\},

where we have introduced the density-matrix operator or density operator (DO)

\hat{\rho} = \sum_i |\alpha_i\rangle\, w_i\, \langle \alpha_i|.

The DO depends only on the considered mixed state, since it is given by the participating pure states $|\alpha_i\rangle$ and the corresponding weights $w_i$. As we shall see, $\hat{\rho}$ defines the mixed state completely. For mixed states it actually takes over the role played by the wave function for pure states. Note that $\hat{\rho}$ is independent of the observable $\hat{A}$ under consideration. It applies to the average of any observable.
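A brief numerical sketch of the construction (dimension, weights and the observable are arbitrary; the states are random and deliberately non-orthogonal):

```python
import numpy as np

# Build rho = sum_i w_i |alpha_i><alpha_i| from normalized, generally
# non-orthogonal states and verify <A> = Tr(rho A) and Tr(rho) = 1.
rng = np.random.default_rng(3)
dim = 4
w = np.array([0.5, 0.3, 0.2])
states = rng.normal(size=(3, dim)) + 1j * rng.normal(size=(3, dim))
states /= np.linalg.norm(states, axis=1)[:, None]    # <alpha_i|alpha_i> = 1

rho = sum(wi * np.outer(s, s.conj()) for wi, s in zip(w, states))

A = rng.normal(size=(dim, dim))
A = A + A.T                                          # hermitian observable

avg_sum = sum(wi * (s.conj() @ A @ s).real for wi, s in zip(w, states))
avg_trace = np.trace(rho @ A).real
print(np.trace(rho).real)        # 1: unit trace
print(avg_sum, avg_trace)        # identical
```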


Let us recall some properties of the trace of an operator,

{\rm Tr}\big\{ \hat{A} \big\} = \sum_n \langle n | \hat{A} | n \rangle,

where $\{|n\rangle\}$ is a complete orthonormal basis. It is easy to see that

{\rm Tr}\big\{ \hat{A}\hat{B} \big\} = {\rm Tr}\big\{ \hat{B}\hat{A} \big\}

for any operators $\hat{A}$ and $\hat{B}$. This also implies that ${\rm Tr}\{\hat{A}\}$ is independent of the orthonormal basis used for performing the sum. If $U$ is a unitary transformation ($U^\dagger U = 1$) such that

|u_n\rangle = U\, |n\rangle

we have

\sum_n \langle u_n | \hat{A} | u_n \rangle = \sum_n \langle n | U^\dagger \hat{A}\, U | n \rangle = {\rm Tr}\big\{ U^\dagger \hat{A}\, U \big\} = {\rm Tr}\big\{ \hat{A} \big\}

Finally, it is also useful to recall that

{\rm Tr}\big\{ |\alpha\rangle\langle\alpha| \big\} = \sum_n |\langle n | \alpha \rangle|^2 = 1

for any state $|\alpha\rangle$ whose norm $\langle \alpha | \alpha \rangle = 1$.

The density operator $\hat{\rho}$ has the following important properties:

i) Hermiticity: $\hat{\rho}^\dagger = \hat{\rho}$, since $w_i \in \mathbb{R}$.

ii) Normalized trace: ${\rm Tr}\{\hat{\rho}\} = 1$, since $\sum_i w_i = 1$.

iii) Since $\hat{\rho}^\dagger = \hat{\rho}$, it is diagonalizable. Thus $\hat{\rho}$ has a spectral representation of the form

\hat{\rho} = \sum_k \rho_k\, |\psi_k\rangle\langle\psi_k|

where $\hat{\rho}\, |\psi_k\rangle = \rho_k\, |\psi_k\rangle$, $\rho_k \in \mathbb{R}$ and $\langle \psi_k | \psi_{k'} \rangle = \delta_{kk'}$. The eigenstates of $\hat{\rho}$ form a complete orthonormal basis. Notice that the weights $w_i$ entering the definition of $\hat{\rho}$ are not necessarily the eigenvalues of $\hat{\rho}$, since in general $\langle \alpha_i | \alpha_j \rangle \neq \delta_{ij}$.

iv) The eigenvalues of $\hat{\rho}$ satisfy $0 \le \rho_k \le 1$. Let us first show that $\hat{\rho}$ is positive semi-definite:

\langle \psi | \hat{\rho} | \psi \rangle = \sum_i w_i\, |\langle \psi | \alpha_i \rangle|^2 \ge 0

for any state $|\psi\rangle$. This implies that the eigenvalues satisfy $\rho_k \ge 0$. In addition, since $\sum_k \rho_k = {\rm Tr}\,\hat{\rho} = 1$, we must have $\rho_k \le 1$.


v) The square of ρ satisfies Tr{ρ²} ≤ 1. This is easily proven by noting that

1 = (Tr{ρ})² = (∑_k ρ_k)² = ∑_k ρ_k² + ∑_{k≠l} ρ_k ρ_l.

Since ∑_{k≠l} ρ_k ρ_l ≥ 0, we have

Tr{ρ²} = ∑_k ρ_k² ≤ 1.

vi) Pure-state characterization:

Tr{ρ²} = 1  ⇔  ρ = |ψ_k⟩⟨ψ_k|

for some k. In other words, Tr{ρ²} = 1 if and only if ρ describes a pure state. The proof follows by noting that

Tr{ρ²} = ∑_k ρ_k² = 1  ⇔  ∑_{k≠l} ρ_k ρ_l = 0  ⇔  ρ_k ρ_l = 0  ∀ k ≠ l.

Therefore, only one eigenvalue ρ_k can be different from zero, which must then be equal to one. Consequently, ρ = |ψ_k⟩⟨ψ_k| and ρ² = ρ. We conclude that Tr{ρ²} allows us to distinguish mixed states from pure states.
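
As an aside (not part of the original notes), the purity criterion Tr{ρ²} is easy to check numerically. The sketch below builds ρ = ∑_i w_i |α_i⟩⟨α_i| from random kets; the dimension, states and weights are arbitrary choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def random_state(dim):
    """Return a normalized random ket |alpha> of the given dimension."""
    v = rng.normal(size=dim) + 1j * rng.normal(size=dim)
    return v / np.linalg.norm(v)

def density_operator(states, weights):
    """rho = sum_i w_i |alpha_i><alpha_i| for normalized kets |alpha_i>."""
    return sum(w * np.outer(s, s.conj()) for w, s in zip(weights, states))

dim = 4
pure = density_operator([random_state(dim)], [1.0])
mixed = density_operator([random_state(dim) for _ in range(3)], [0.5, 0.3, 0.2])

purity_pure = np.trace(pure @ pure).real    # equals 1 for a pure state
purity_mixed = np.trace(mixed @ mixed).real  # strictly below 1 for a genuinely mixed state
```

Note that both operators have unit trace, but only the one-term (pure) ρ satisfies Tr{ρ²} = 1.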

vii) Tr{ρ²} is not only independent of the representation but also independent of time. This implies that pure states can never evolve into mixed states, and vice versa, as long as the system is perfectly isolated. The reason for this is that the dynamics of a system which is perfectly decoupled from the environment follows the unitary time-evolution operator U(t, t₀). The proof is straightforward by noting that

ρ² = ∑_{ij} |α_i⟩ w_i ⟨α_i|α_j⟩ w_j ⟨α_j|

and thus

Tr{ρ²} = ∑_{nij} ⟨n|α_i⟩ w_i ⟨α_i|α_j⟩ w_j ⟨α_j|n⟩ = ∑_{ij} w_i w_j |⟨α_i|α_j⟩|².

Since U(t, t₀) is unitary, the time evolution |α_i, t⟩ = U(t, t₀)|α_i, t₀⟩ does not modify the scalar products ⟨α_i|α_j⟩. Consequently, Tr{ρ²} is independent of time in an isolated system.


Summarizing so far, we may say that the set of all operators of the form

ρ = ∑_i w_i |α_i⟩⟨α_i|    (2.2)

with ⟨α_i|α_i⟩ = 1 ∀i, and the set of all hermitian, positive semi-definite operators with trace equal to one are identical. However, notice that the representation of ρ in the form (2.2) is not unique, since the |α_i⟩ are not necessarily orthogonal to each other. This means that mixed states which look different in terms of {|α_i⟩} and {w_i} may have the same density operator ρ and may thus be physically identical. On the other hand, ρ characterizes a physical state fully and univocally concerning the results of every possible measurement. The knowledge of ρ defines not only the average value of any observable A through

⟨A⟩ = Tr{ρ A}

but also the average

⟨F(A)⟩ = Tr{ρ F(A)}

of any function of it. Consequently, ρ defines the probability density distribution P_A(a) that the result of a measurement of A yields the value a. This probability density is in fact given by

P_A(a) = ⟨δ(a − A)⟩ = Tr{ρ δ(a − A)}.

Exercise 2.6: Show the validity of the above expression for P_A(a).

From a more mathematical perspective it is interesting to note that the set of all density operators is convex. Let ρ₁ and ρ₂ be two density operators describing two different (mixed) states. Then the operator

ρ = α₁ρ₁ + α₂ρ₂

with α₁ + α₂ = 1 (α_i ≥ 0) is also a density operator describing a possible mixed state. Pure states are particular cases of the more general concept of mixed states, which can be perfectly described with density operators. It is easy to see that a density operator ρ corresponds to a pure state if and only if any of the following equivalent conditions is satisfied:

i) There is a state |α⟩ such that ρ = |α⟩⟨α|,

ii) ρ² = ρ,

iii) Tr{ρ²} = 1,

32

Page 33: Introduction to Statistical Physics - uni-kassel.de · Introduction to Statistical Physics Prof. Dr. G. M. Pastor Institut für Theoretische Physik Fachbereich Mathematik und Naturwissenschaften

iv) Tr{ρ ln ρ} = ⟨ln ρ⟩ = 0, or

v) ρ cannot be written as the combination of two different ρ₁ and ρ₂, i.e., there exist no density operators ρ₁ ≠ ρ₂ such that ρ = α₁ρ₁ + α₂ρ₂ with α₁ + α₂ = 1 and α_i > 0.
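
The convexity of the set of density operators and the equivalence of the pure-state criteria can be illustrated numerically. This is a sketch (not from the notes); the two diagonal two-level states and the mixing weights are arbitrary choices.

```python
import numpy as np

def is_density_operator(rho, tol=1e-10):
    """Check the defining properties: hermitian, unit trace, positive semi-definite."""
    hermitian = np.allclose(rho, rho.conj().T, atol=tol)
    unit_trace = abs(np.trace(rho) - 1.0) < tol
    positive = np.all(np.linalg.eigvalsh(rho) > -tol)
    return bool(hermitian and unit_trace and positive)

# Two mixed states of a two-level system (diagonal for simplicity):
rho1 = np.diag([0.7, 0.3]).astype(complex)
rho2 = np.diag([0.2, 0.8]).astype(complex)
rho = 0.4 * rho1 + 0.6 * rho2            # convex combination, alpha1 + alpha2 = 1

pure = np.array([[1.0, 0.0], [0.0, 0.0]], dtype=complex)  # rho = |0><0|
```

The pure state satisfies both ρ² = ρ and Tr{ρ²} = 1, while the convex combination is again a valid density operator with Tr{ρ²} < 1.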

2.4 Time dependence of the density operator of isolated systems

We consider here the time dependence of the density operator of strictly isolated systems, for which we know that the weights w_i of the states |α_i⟩ building the mixed state are independent of time. The quantum states |α_i⟩ follow the Schrödinger equation. We consider ρ(t) in the Schrödinger picture:

ρ_s(t) = ∑_i w_i |α_i, t⟩⟨α_i, t|,

where the Schrödinger kets |α_i, t⟩ = U(t, t₀)|α_i, t₀⟩ satisfy the Schrödinger equation

iℏ ∂/∂t |α_i, t⟩ = H |α_i, t⟩

or equivalently

−iℏ ∂/∂t ⟨α_i, t| = ⟨α_i, t| H.

The Hamilton operator H(t) may depend on time. It is of course hermitian, since |α_i⟩ preserves its norm as a function of time. We then have

iℏ ∂ρ_s/∂t = ∑_i w_i ( H |α_i, t⟩⟨α_i, t| − |α_i, t⟩⟨α_i, t| H )

or equivalently

iℏ ∂ρ_s/∂t = [H, ρ_s].    (2.3)

This is known as the Liouville equation. It describes the time dependence of the operator ρ_s in the Schrödinger picture, which originates in the time dependence of the Schrödinger kets |α_i, t⟩. As already said, the density operator plays, in the more general case of mixed states, the role taken by the wave function for pure states. It depends on time in the Schrödinger picture and, as we shall see, is independent of time in the Heisenberg picture.

Thermodynamic equilibrium is defined by requiring that the macrostate of the system is independent of time. This is equivalent to requiring that ∂ρ_s/∂t = 0, since ρ_s defines the state of the system. The Liouville equation implies that in thermodynamic or statistical equilibrium we have

[H, ρ_s] = 0.


In this case the eigenstates |ψ_k⟩ of ρ_s can be chosen to be eigenstates of H. This is a very important hint for our search of the form of ρ_s describing equilibrium.

An alternative to Liouville's differential equation is to express ρ_s using the time-evolution operator:

|α_i, t⟩ = U(t, t₀)|α_i, t₀⟩

and

⟨α_i, t| = ⟨α_i, t₀| U†(t, t₀).

In this way we have

ρ_s(t) = U(t, t₀) ρ_s(t₀) U†(t, t₀).

Notice that the time evolution of ρ_s(t) corresponds simply to a unitary transformation (U U† = 1). This form of ρ_s(t) is particularly useful in order to demonstrate the time independence of Tr{ρ}, Tr{ρ²} and of the eigenvalues ρ_k of ρ.
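
The invariance of Tr{ρ}, Tr{ρ²} and the spectrum under ρ_s(t) = U ρ_s(t₀) U† can be checked numerically. In the sketch below (an illustration, not from the notes) the 4×4 Hamiltonian, the time t and the initial weights are arbitrary; U = exp(−iHt) with ℏ = 1 is built from the spectral decomposition of H.

```python
import numpy as np

rng = np.random.default_rng(1)

# A random hermitian Hamiltonian and the evolution operator U = exp(-i H t), hbar = 1:
A = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
H = (A + A.conj().T) / 2
w, V = np.linalg.eigh(H)
t = 0.7
U = V @ np.diag(np.exp(-1j * w * t)) @ V.conj().T

# A mixed initial state rho_s(t0), diagonal in some basis, and its evolved counterpart:
rho0 = np.diag([0.5, 0.3, 0.2, 0.0]).astype(complex)
rho_t = U @ rho0 @ U.conj().T
```

The trace, the purity and the full eigenvalue spectrum of ρ_s are unchanged by the unitary evolution, exactly as stated above.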

Exercise 2.7:

i) Find the density operator ρ_H(t) in the Heisenberg picture. Verify that it is independent of time.

ii) Verify the equivalence of the average values of any operator A in the Schrödinger and Heisenberg pictures.

Exercise 2.8: The entropy of a mixed state is defined by S = −k_B ⟨ln ρ⟩ = −k_B ∑_k ρ_k ln ρ_k, where k_B is the Boltzmann constant, having units of energy divided by temperature. Show that S is independent of time for strictly isolated systems (i.e., following Liouville's equation). As we shall see, this implies that an isolated system can never reach equilibrium on its own, i.e., without a however weak interaction with the environment.

Before closing this section it is probably useful to recall that the time derivative of an operator f in quantum mechanics is defined by the condition

⟨ψ| df/dt |ψ⟩ = d/dt ⟨ψ| f |ψ⟩

for any |ψ(t)⟩. It follows that

df/dt = ∂f/∂t + (i/ℏ) [H, f].


Notice that df/dt is the operator of a new physical observable in the Schrödinger picture (e.g., the velocity v_x = dx/dt). Applying this to the density operator we have

dρ_s/dt = ∂ρ_s/∂t + (i/ℏ) [H, ρ_s] = 0,

where the second equality follows from the Liouville equation (2.3). This means that ρ_s is a constant of motion for strictly isolated systems, even though it depends explicitly on time (∂ρ_s/∂t ≠ 0) in the most general out-of-equilibrium case. In other words, the propagation of the kets |ψ(t)⟩ = U(t, t₀)|ψ(t₀)⟩ and the time dependence of ρ(t) = U(t, t₀) ρ(t₀) U†(t, t₀) are such that any average ⟨ψ(t)|ρ(t)|ψ(t)⟩ = ⟨ψ(t₀)|ρ(t₀)|ψ(t₀)⟩ involving ρ, or any function of ρ, is independent of time. We can also say that all eigenstates |ψ_k⟩ of ρ(t₀) at a given time t₀ remain eigenstates at any subsequent time t. In equilibrium we have in addition ∂ρ/∂t = 0 and [H, ρ] = 0.

2.5 The statistical distribution function

We would like to transpose the concepts of statistical ensembles and density operator to systems which can be described by classical mechanics. A detailed formal derivation of the classical limit of quantum statistical mechanics will be presented later on. The dynamic state of a classical system with s degrees of freedom is defined by its generalized coordinates q = (q₁, …, q_s) and conjugate momenta p = (p₁, …, p_s). Each point (p, q) represents a state of the entire macroscopic N-particle system (s = 3N). As in the quantum case we refer to these states, which contain the maximum detail of information on the system, as microstates or pure states. The 2s-dimensional space containing all the microstates (p, q) = (p₁, …, p_s, q₁, …, q_s) is known as the Γ-space or phase space of the system. As in the quantum case it is meaningful to change the perspective of the mechanical description from the deterministic dynamics of a single system in a precisely defined state (p, q) to the dynamics of the probability distribution of a large ensemble of systems distributed throughout all possible microstates. This broader physical situation is described by the statistical distribution function, or simply distribution function,

ρ(p, q) = ρ(p₁, …, p_s, q₁, …, q_s),

which represents the joint probability density function for finding the system in the microstate (p, q) = (p₁, …, p_s, q₁, …, q_s). In other words,

ρ(p, q) dp dq = ρ(p₁, …, p_s, q₁, …, q_s) dp₁ … dp_s dq₁ … dq_s

is the probability of finding the system in the phase-space volume element dp dq centered at the point (p, q). As any joint probability density, ρ(p, q) satisfies ρ(p, q) ≥ 0 and ∫ ρ(p, q) dp dq = 1. These conditions correspond to w_i ≥ 0 and ∑_i w_i = 1 in the quantum case.

The physical properties of classical systems are given by functions A(p, q) of the generalized variables. The corresponding average value and variance are given by

⟨A⟩ = ∫ dp dq A(p, q) ρ(p, q)


and

ΔA² = ⟨(A − ⟨A⟩)²⟩ = ⟨A²⟩ − ⟨A⟩² ≥ 0.
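
As a concrete illustration (the single degree of freedom, the Gaussian ρ(p, q) and the observable are assumptions, not specified in the notes), ⟨A⟩ and ΔA² can be estimated by sampling the distribution function:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 200_000

# Sample (p, q) from an assumed Gaussian rho(p, q) with unit variances:
p = rng.normal(size=n)
q = rng.normal(size=n)

# Example observable: harmonic-oscillator energy A(p, q) = p^2/2 + q^2/2
# (m = omega = 1), for which <A> = 1 and <A^2> - <A>^2 = 1 with this rho.
vals = 0.5 * p**2 + 0.5 * q**2
mean_A = vals.mean()   # Monte Carlo estimate of <A>
var_A = vals.var()     # Monte Carlo estimate of Delta A^2
```

The sample mean and variance converge to the phase-space integrals defined above as the number of samples grows.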

The motivations for considering statistical ensembles in order to describe macroscopicsystems are the same as in the quantum case:

i) One may regard a macroscopic system as an ensemble of equivalent, statistically independent subsystems, each occupying a state (p_i, q_i). It is the distribution of the microstates of the subsystems throughout Γ-space that leads to the distribution function ρ(p, q).

ii) One may follow the time evolution of a system or of any macroscopic part of a largersystem and record the state (pi, qi) of the system at a large number of equally spacedinstants ti. From this perspective ⇢(p, q) corresponds to the probability of findingthe system in the state (p, q) averaged over a large period of time. The average ofobservables corresponds to time averages.

iii) One may consider the system as being quasi closed, i.e., almost perfectly isolated,thus experiencing relatively rare weak interactions with the environment. Theselead to changes of state beyond the time dependence of (p, q) corresponding toa strictly isolated system. Again, the actual microstate of the system cannot beknown with certainty. Thus, the notions of statistical ensemble and probabilitydistribution ⇢(p, q) prevail.

2.6 Time dependence of ρ(p, q): Liouville theorem

We would like now to investigate the time dependence of ρ(p, q), assuming that the system is isolated or, equivalently, considering quasi-closed systems between two interactions with the environment. In this case, the generalized coordinates of each element in the ensemble follow Hamilton's equations

ṗ_i = −∂H/∂q_i  and  q̇_i = ∂H/∂p_i

for i = 1, …, s, where H = H(p, q) is the Hamilton function of the isolated system. These equations describe how each microstate, also known as representative point, moves in Γ-space as a function of time. Moreover, since (p, q) defines the mechanical state completely, one may consider as initial condition any point [p(t₀), q(t₀)] in the trajectory. In this way one always obtains the same distinct trajectory, i.e., the same unique (since deterministic) curve connecting the points (p, q) at subsequent instants t > t₀. This implies that a trajectory (p, q) in Γ-space can never cross itself or another distinct trajectory, since in this case the crossing point (p₀, q₀) would not define the mechanical state unambiguously. A trajectory can be a simple closed loop, though. However, two trajectories cannot merge into one and a trajectory cannot evolve into a closed loop. The latter would imply


that there is a point (p₀, q₀) in Γ-space where the trajectory bifurcates, which again contradicts the deterministic character of classical motion. In order to demonstrate the last statement we observe that Hamilton's equations are invariant upon time reversal. More precisely, let [p(t), q(t)] be a solution of Hamilton's equations and consider the trajectory q'(t) = q(2t₀ − t) and p'(t) = −p(2t₀ − t), which reverses the path along the same coordinates with reversed momenta. Notice that the locations of the points (p, q) and (p', q') are indeed different in Γ-space due to the momentum reversal. It is easy to see that

dq'_i/dt (t) = −dq_i/dt (2t₀ − t) = −∂H/∂p_i (2t₀ − t) = ∂H/∂p'_i (t)

and

dp'_i/dt (t) = dp_i/dt (2t₀ − t) = −∂H/∂q_i (2t₀ − t) = −∂H/∂q'_i (t),

which implies that (p', q') is also a trajectory in Γ-space. Thus, in the reverse time sequence, merging transforms into bifurcation. We conclude that the different trajectories in phase space preserve their identity. It is therefore reasonable to expect that along the time evolution each point in Γ-space will carry along its ρ(p, q), so that the total derivative dρ/dt = 0. We shall prove this statement in the following.

Before doing that, let us consider some physical property f, which is given by a function f(p, q, t) of the coordinates, momenta and time. Let us now compute the total time derivative along the classical trajectory:

df/dt = ∂f/∂t + ∑_k ( ∂f/∂q_k q̇_k + ∂f/∂p_k ṗ_k ).

Using Hamilton's equations we have

df/dt = ∂f/∂t + ∑_k ( ∂H/∂p_k ∂f/∂q_k − ∂f/∂p_k ∂H/∂q_k ) = ∂f/∂t + {H, f},    (2.4)

where we have introduced the Poisson bracket between two functions of (p, q):

{f, g} = ∑_k ( ∂f/∂p_k ∂g/∂q_k − ∂g/∂p_k ∂f/∂q_k ).

Poisson brackets are bilinear functions of f and g. They have very similar properties as the commutator between two operators:

{αf₁ + βf₂, g} = α{f₁, g} + β{f₂, g},

{f, g} = −{g, f},

and

{f₁f₂, g} = f₁{f₂, g} + {f₁, g} f₂.
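
The Poisson-bracket properties above can be verified symbolically. The sketch below (sympy is an assumption; the notes are purely analytic) uses one degree of freedom and the bracket in the sign convention of the notes, {f, g} = ∂f/∂p ∂g/∂q − ∂g/∂p ∂f/∂q, together with a harmonic-oscillator test Hamiltonian.

```python
import sympy as sp

p, q = sp.symbols('p q')

def poisson(f, g):
    """Poisson bracket in the convention of the notes (one degree of freedom):
    {f, g} = df/dp * dg/dq - dg/dp * df/dq."""
    return sp.diff(f, p) * sp.diff(g, q) - sp.diff(g, p) * sp.diff(f, q)

# Harmonic oscillator H = p^2/2 + q^2/2 (m = omega = 1, an arbitrary test case).
# With df/dt = {H, f}, the bracket reproduces Hamilton's equations:
H = p**2 / 2 + q**2 / 2
qdot = poisson(H, q)   # dq/dt = {H, q} = p
pdot = poisson(H, p)   # dp/dt = {H, p} = -q
```

Antisymmetry and the product rule hold identically for arbitrary polynomial test functions.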


Notice the analogy between Eq. (2.4) and the time derivative of observables in quantum mechanics:

df̂/dt = ∂f̂/∂t + (i/ℏ) [Ĥ, f̂].

One may also say that the classical df/dt and the quantum df̂/dt are connected by the correspondence relation

{H, f} ↔ (i/ℏ) [Ĥ, f̂].

We would like to extend the continuity equation, well known from electromagnetism, to density fields and the associated velocity fields in ℝⁿ. Let ρ(x⃗): ℝⁿ → ℝ be the density distribution of some fictitious particles in ℝⁿ and let v⃗(x⃗): ℝⁿ → ℝⁿ be the velocity field corresponding to these particles, whose density is ρ(x⃗). The current density is then given by j⃗ = ρ v⃗. For any arbitrary volume V ⊂ ℝⁿ, the change in the number of particles inside V per unit time can be calculated in two ways:

d/dt ( ∫_V ρ(x⃗) dⁿx ) = − ∫_{S(V)} j⃗ · n̂ da,

where the right-hand side corresponds to the flux of the current density across the surface S(V) of the volume V. The normal n̂ to the surface points, as usual, in the outward direction. Notice that we have taken into account that there are no external sources or drains of particles, so that the change in the number of particles within V can only result from particles actually crossing the surface S(V). Using Gauss' theorem on the right-hand side we have

∫_V ( ∂ρ/∂t + ∇⃗ · j⃗ ) dⁿx = 0,

where ∇⃗ = (∂/∂x₁, …, ∂/∂x_n) stands for the nabla operator in ℝⁿ. Since this holds for any V we obtain

∂ρ/∂t + ∇⃗ · j⃗ = ∂ρ/∂t + ∇⃗ · (ρ v⃗) = 0.    (2.5)

This continuity equation simply expresses the conservation of the number of particles underlying ρ(x⃗) and moving according to the velocity field v⃗(x⃗).

In order to investigate the time dependence of ρ(p, q, t) it is useful to regard ρ(p, q) as the density of representative points in Γ-space, whose motion leads to the current density j⃗ = ρ v⃗. Since ρ depends on (p, q), the velocity field is given by

v⃗ = (ṗ₁, …, ṗ_s, q̇₁, …, q̇_s) ∈ ℝ²ˢ

or in a more compact form by

v⃗ = (ṗ, q̇) = ( −∂H/∂q, ∂H/∂p ).

It is easy to see that

∇⃗ · v⃗ = ∑_{k=1}^{s} ( −∂²H/∂p_k∂q_k + ∂²H/∂q_k∂p_k ) = 0,


where we have used that

∇⃗ = ( ∂/∂p₁, …, ∂/∂p_s, ∂/∂q₁, …, ∂/∂q_s ).

The divergence of the current is then simply

∇⃗ · j⃗ = ∇⃗ · (ρ v⃗) = v⃗ · ∇⃗ρ + ρ ∇⃗ · v⃗ = v⃗ · ∇⃗ρ.

Developing v⃗ · ∇⃗ρ we have

∇⃗ · j⃗ = ∑_{k=1}^{s} ( ṗ_k ∂ρ/∂p_k + q̇_k ∂ρ/∂q_k ) = ∑_{k=1}^{s} ( ∂H/∂p_k ∂ρ/∂q_k − ∂ρ/∂p_k ∂H/∂q_k ) = {H, ρ}.

Applying the expression for the total derivative [Eq. (2.4)] to ρ, and using the continuity equation (2.5), we finally obtain

dρ/dt = ∂ρ/∂t + {H, ρ} = 0.

This result, known as Liouville's theorem, tells us that the total derivative of ρ vanishes at all times. In other words, each representative point carries the probability density around it all along its trajectory in phase space. The result holds, of course, provided that the time evolution follows from the Hamilton function of an isolated system. Consequently, ρ is a constant of motion, even though ∂ρ/∂t ≠ 0. One also says that the flow of the fluid associated with the motion of the microstates in a statistical ensemble is incompressible. Notice that the flow is incompressible but not the fluid, since ∂ρ/∂t ≠ 0 in general. However, in the particular case of thermodynamical equilibrium we have ∂ρ/∂t = 0, since the very notion of equilibrium implies time independence. Thus, in equilibrium,

{H, ρ} = 0

and ρ is then a constant of motion in the narrow sense, such as the total energy, momentum or angular momentum. As in the quantum case, this gives us a very important hint in order to derive the actual expression of the equilibrium ρ as a function of (p, q).
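
The two key identities of this derivation, ∇⃗ · v⃗ = 0 and ∇⃗ · j⃗ = {H, ρ}, can be checked symbolically for an arbitrary Hamilton function. The sketch below (sympy is an assumption; one degree of freedom for brevity) keeps H and ρ fully generic.

```python
import sympy as sp

p, q = sp.symbols('p q')
H = sp.Function('H')(p, q)      # arbitrary Hamilton function H(p, q)
rho = sp.Function('rho')(p, q)  # arbitrary distribution function rho(p, q)

# Phase-space velocity field v = (pdot, qdot) from Hamilton's equations:
pdot = -sp.diff(H, q)
qdot = sp.diff(H, p)

div_v = sp.diff(pdot, p) + sp.diff(qdot, q)              # divergence of v (vanishes)
div_j = sp.diff(rho * pdot, p) + sp.diff(rho * qdot, q)  # divergence of j = rho * v
bracket = sp.diff(H, p) * sp.diff(rho, q) - sp.diff(rho, p) * sp.diff(H, q)  # {H, rho}
```

The mixed second derivatives of H cancel, so the flow is divergence-free and the continuity equation reduces to ∂ρ/∂t + {H, ρ} = 0, i.e., Liouville's theorem.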

Since ρ(p, q) is the same for all the microstates visited along a phase-space trajectory, one may be tempted to conclude that ρ should be constant within the hypersurface in Γ-space containing all the points that are compatible with the given set of constants of motion (e.g., energy, number of particles, volume, etc.). This actually holds for ergodic systems, i.e., systems in which the representative points (p, q) as a function of time cover the entire accessible phase space. More precisely, the ergodic condition requires that, if one waits a long enough time, any representative point (p, q) must eventually come arbitrarily close to any other point in the accessible part of Γ-space. It is important to remark that an arbitrarily long time T_erg might be required for the system to reach a state that is arbitrarily close to some point (p₀, q₀). However, this long time T_erg has nothing to do with the time that the system needs to reach equilibrium. The latter, known as the relaxation time τ_rel, is a well-defined property of every macroscopic system, which is


not only much shorter, but also much more important physically. τ_rel is independent of any particular choice of the initial and target states. However, it depends on the system size. Short-range equilibrium is reached in a much shorter time than equilibrium across long distances. Moreover, different relaxation times often apply to different degrees of freedom (e.g., electronic translational and spin degrees of freedom, lattice vibrational degrees of freedom, etc.) or to different properties (e.g., the density of particles and the chemical composition in thermally activated chemical reactions).

The equal a priori probability of all accessible states is in fact the fundamental principleof statistical mechanics. Its validity does not rely on the ergodic hypothesis but onquantum mechanics and statistical arguments.

2.6.1 Time independence of the entropy of the distribution function ρ(p, q) in strictly isolated systems

It is interesting to observe that the entropy

S = −⟨ln ρ⟩ = − ∫ dp dq ρ(p, q) ln[ρ(p, q)]

of the statistical distribution ρ(p, q) of a strictly isolated system, i.e., one following a deterministic Hamiltonian dynamics for which Liouville's theorem holds, is independent of time.

A brute-force proof could follow the lines

∂S/∂t = − ∫ dp dq ( ∂ρ/∂t ln ρ + ∂ρ/∂t ) = − ∫ dp dq ∂ρ/∂t ln ρ,

where we have used that ∫ dp dq ρ(p, q) = 1. However, a quite elegant proof can be obtained by using some previous results on the entropy of continuous probability distributions and the properties of canonical transformations. Consider the entropy at a time τ, given by

S(τ) = − ∫ ρ_τ(p, q) ln ρ_τ(p, q) dp dq,    (2.6)

where ρ_τ(p, q) is the statistical distribution at time τ. Liouville's theorem states that dρ/dt = 0, which implies

ρ_τ(p_τ, q_τ) = ρ₀(p₀, q₀),    (2.7)

where (p_τ, q_τ) is obtained from (p₀, q₀) as a result of the time evolution. Obviously (p_τ, q_τ) is a well-defined function of (p₀, q₀). The transformation

p_τ = p_τ(p₀, q₀),  q_τ = q_τ(p₀, q₀)

and its inverse

p₀ = p₀(p_τ, q_τ),  q₀ = q₀(p_τ, q_τ)

40

Page 41: Introduction to Statistical Physics - uni-kassel.de · Introduction to Statistical Physics Prof. Dr. G. M. Pastor Institut für Theoretische Physik Fachbereich Mathematik und Naturwissenschaften

are canonical transformations, since a displacement in time does not alter the form of Hamilton's equations. Moreover, the Jacobian J[∂(p₀, q₀)/∂(p_τ, q_τ)] = 1, as in any canonical transformation [see, for instance, L.D. Landau and E.M. Lifshitz, Mechanics, 3rd ed. (Elsevier, Amsterdam, 1976), p. 143 ff.]. Instead of integrating ρ_τ as a function of (p, q), we integrate ρ₀ as a function of the point (p₀, q₀) which evolves into (p, q) at time τ. Replacing Eq. (2.7) in Eq. (2.6) we may then write

S(τ) = − ∫ ρ₀[p₀(p, q), q₀(p, q)] ln{ρ₀[p₀(p, q), q₀(p, q)]} dp dq.    (2.8)

It is now meaningful to change variables as p₀ = p₀(p, q) and q₀ = q₀(p, q). The volume elements, in general related by

dp₀ dq₀ = J[∂(p₀, q₀)/∂(p, q)] dp dq,

are in the present case the same, since the Jacobian J = 1. Replacing with (p₀, q₀) in Eq. (2.8) we obtain

S(τ) = − ∫ ρ₀(p₀, q₀) ln[ρ₀(p₀, q₀)] dp₀ dq₀ = S(0).

Therefore, the entropy of a strictly isolated system is not altered by time evolution, as in the quantum case.

In order to discuss the consequences of dS/dt = 0 in strictly isolated systems, two different situations should be considered. If the system is in equilibrium, ∂ρ/∂t = 0 and ρ(p, q) does not depend explicitly on time at any representative point (p, q). It is clear that S also remains constant as a function of time. However, if the system is out of equilibrium, with some value of S that is different from the equilibrium one, the above result implies that either no equilibrium is reached, no matter how large the system is, or that S remains constant along the equilibration process, both of which are physically wrong and in contradiction with experimental observations. The reason for this apparent conceptual problem is that we have completely neglected the interaction with the environment. Physically, the interaction with the environment can never be avoided over a long period of time, even if the system is closed and the processes do not involve energy, particle or volume exchange. We conclude, as in the quantum case, that these interactions play a crucial role in the process of reaching equilibrium. They cause changes in ρ(p, q) which do not result from the dynamics of the isolated system alone. In fact, it can be shown, by using time-dependent perturbation theory in quantum mechanics and taking into account the transitions induced by the interaction with the environment, that dS/dt ≥ 0 in closed systems. Processes in which S is conserved are reversible, while the others, having dS > 0, are irreversible (i.e., they cannot occur spontaneously in the reverse direction). Consequently, the equilibrium states of isolated systems not only have dS/dt = 0, but are such that the entropy takes the maximum value compatible with the given boundary conditions and constants of motion.


3 Equilibrium statistical ensembles

3.1 Statistical independence of macroscopic subsystems

We would like to discuss here how the statistical distribution and density operator of an isolated macroscopic system can be expressed in terms of the statistical distributions of the subsystems into which it can be divided. We consider any two subsystems 1 and 2 of a large macroscopic system. If the subsystems are themselves macroscopic, the interactions with their surroundings must be small, since the surface-to-volume ratio is small. The subsystems can be regarded as quasi-closed, in fact with a degree of accuracy that increases as the system and subsystem sizes increase. Under these circumstances, i.e., if the interactions with the surroundings are weak, the microstates of the different subsystems become statistically independent from each other. The probability of finding subsystem 1 in a given state |α_n⟩ [classically (p₁, q₁)] is independent of the state |β_m⟩ [or (p₂, q₂)] of subsystem 2. In other words, subsystem 1 can take a variety of different states |α_n⟩ without taking care of, or having any influence on, subsystem 2. Nevertheless, in the case of isolated systems, the statistical independence is restricted by the known fundamental conservation laws. For example, if we divide an isolated system into a large number N of subsystems, the states of these subsystems are independent provided that the sums of the individual energies, momenta, numbers of particles, etc. remain constant. For the moment we assume that the system is open, without any conservation constraints, or equivalently, we consider subsystems that are a small part of a much larger isolated system, so that the previous restrictions do not apply.

Mathematically, statistical independence between two events a and b means that the conditional probability p(a|b) of finding the event a given b is equal to the unconditional probability p(a) of finding a. Knowing that

p(a|b) = p(a, b) / p(b),

where p(a, b) is the joint probability of finding a and b, we conclude that statistically independent events are characterized by

p(a, b) = p(a) p(b).

Therefore, the statistical independence of two subsystems 1 and 2 implies that the statistical distribution ρ₁₂(p₁, p₂, q₁, q₂) of finding the ensemble 1 + 2 in the state (p₁, q₁) for subsystem 1 and (p₂, q₂) for subsystem 2 is given by

ρ₁₂(p₁, p₂, q₁, q₂) = ρ₁(p₁, q₁) ρ₂(p₂, q₂),

where ρ₁ and ρ₂ are the distribution functions of the subparts. The converse is of course also true. Indeed, if ρ₁₂ is given by the product of two distributions, each depending only on the variables of one subsystem, then the subsystems are statistically independent.

The same holds for density operators in the quantum case. Let

ρ₁₂ = ∑_{mn} w¹²_{mn} |α_m⟩|β_n⟩ ⟨β_n|⟨α_m|


be the density operator of the 1 + 2 system, where w¹²_{mn} is the probability of finding subsystem 1 in state |α_m⟩ and subsystem 2 in state |β_n⟩. The lack of interactions between the parts implies that the states of 1 + 2 are product states. Statistical independence corresponds thus to

w¹²_{mn} = w¹_m w²_n  with  ∑_m w¹_m = ∑_n w²_n = 1.

Therefore,

ρ₁₂ = ( ∑_m w¹_m |α_m⟩⟨α_m| ) ( ∑_n w²_n |β_n⟩⟨β_n| ) = ρ₁ ρ₂.

Note that ρ₁ (ρ₂) acts only on the variables of system 1 (2). Consequently, [ρ₁, ρ₂] = 0.
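
In a finite-dimensional representation the product ρ₁₂ = ρ₁ ρ₂ is a tensor (Kronecker) product. The sketch below (an illustration, not from the notes; the diagonal weights are arbitrary) shows that the joint weights are the products w¹_m w²_n and that traces and purities factorize.

```python
import numpy as np

# Density operators of two non-interacting subsystems, diagonal in their own bases
# (weights w^1_m and w^2_n chosen arbitrarily for illustration):
rho1 = np.diag([0.6, 0.4]).astype(complex)
rho2 = np.diag([0.9, 0.1]).astype(complex)

# rho_12 = rho_1 (x) rho_2; the diagonal of rho_12 carries the joint weights w^1_m w^2_n.
rho12 = np.kron(rho1, rho2)
```

Since ρ₁ and ρ₂ act on different factors of the product space, ρ₁₂ is again a valid density operator of the combined system.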

3.2 The statistical independence of extensive additive properties

The statistical independence of the subsystems of a larger macroscopic system has far-reaching consequences. For example, for any two physical observables f₁ and f₂ concerning, respectively, the subsystems 1 and 2, the mean value of the product is given by

⟨f₁ f₂⟩ = Tr{ρ₁₂ f₁ f₂} = ( ∑_m w¹_m ⟨α_m|f₁|α_m⟩ ) ( ∑_n w²_n ⟨β_n|f₂|β_n⟩ ) = ⟨f₁⟩₁ ⟨f₂⟩₂.

Let us now focus on an additive (also known as extensive) property f of a macroscopic system, which we imagine divided into a large number N of statistically independent subsystems. Let f_i be the operator of this property in the subsystem i. The additivity of f implies that for the whole system we have

F = ∑_{i=1}^{N} f_i.

If the subsystems are equivalent (same composition and size) the probability distribution P_{f_i}(f) for the values of f_i is the same in all subsystems, i.e.,

P_{f_i}(f) = ⟨δ(f_i − f)⟩_i = p(f)

is independent of i. The joint probability of measuring f₁, …, f_N in the various subsystems is then

P(f₁, …, f_N) = ∏_{i=1}^{N} p(f_i).

Thus, the hypotheses of the central limit theorem are fulfilled, since the values of f in the various subsystems are independent random variables governed by the same probability density function. It follows that the value of F in the whole system, the sum of the f_i, follows a Gaussian distribution with average

⟨F⟩ = ∑_{i=1}^{N} ⟨f_i⟩ = N ⟨f⟩


and variance

⟨(F − ⟨F⟩)²⟩ = ∑_{i=1}^{N} ⟨(f_i − ⟨f_i⟩)²⟩ + ∑_{i≠j} ⟨f_i − ⟨f_i⟩⟩ ⟨f_j − ⟨f_j⟩⟩,

where each factor ⟨f_i − ⟨f_i⟩⟩ = 0, so that

⟨ΔF²⟩ = N ⟨Δf²⟩.

Consequently, the relative fluctuation of F in the entire macroscopic system is given by

√⟨ΔF²⟩ / ⟨F⟩ = (1/√N) √⟨Δf²⟩ / ⟨f⟩ → 0

for N → ∞. One concludes that the relative fluctuation of any additive property of a macroscopic system decreases as 1/√N_p, where N_p refers to the number of particles in the system. When the body is macroscopic and N_p → ∞, the extensive quantities practically do not fluctuate in relative terms, even if they are not conserved a priori. The absolute fluctuations √⟨ΔF²⟩ actually increase with size, although more slowly than the actual average value, so that the relative fluctuations vanish. Even if the system is open, for instance if it can exchange energy through a thermal contact, these quantities can be regarded as practically conserved in average. Indeed, the probability of observing any value away from the average per particle is negligibly small. More precisely, the probability of observing a relative deviation of F from its average value by any small finite amount decreases exponentially with increasing N [with σ = √⟨Δf²⟩, P(|F − ⟨F⟩| > Σ = √N σ) ≃ 33% and P(|f − ⟨f⟩| > σ/√N) ≃ 33%].

Notice that the previous considerations hold for any additive property and for anyform of the statistical distribution, provided that it is the same in all subsystems. Thisimplies that the equilibrium properties of open and closed macroscopic bodies are verysimilar, even if the boundary conditions lead to different statistical distributions.
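
The 1/√N scaling of the relative fluctuation is easy to demonstrate by simulation. In the sketch below (an illustration, not from the notes) the f_i are iid exponential random variables of unit mean, an arbitrary non-Gaussian choice that underlines the universality of the result.

```python
import numpy as np

rng = np.random.default_rng(3)

def relative_fluctuation(N, samples=2000):
    """Estimate sqrt(<dF^2>)/<F> for F = sum of N iid exponential variables (mean 1)."""
    F = rng.exponential(1.0, size=(samples, N)).sum(axis=1)
    return F.std() / F.mean()

r_100 = relative_fluctuation(100)       # close to 1/sqrt(100)  = 0.1
r_10000 = relative_fluctuation(10_000)  # close to 1/sqrt(10000) = 0.01
```

Increasing the number of subsystems by a factor of 100 reduces the relative fluctuation by a factor of 10, as predicted by ⟨ΔF²⟩ = N⟨Δf²⟩.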

3.3 The importance of additive constants of motion

From now on we focus on conservative systems, i.e., systems whose Hamiltonian does not depend on time. Our purpose is to derive the form of $\rho(p,q)$ and $\rho$ in thermodynamical equilibrium, i.e., when $\rho(p,q)$ and $\rho$ do not depend explicitly on time. Constants of motion play a central role in both classical and quantum mechanics. They are extremely important in statistical mechanics as well.

From mechanics we know that the constants of motion reflect fundamental symmetries of the physical laws, for instance, the homogeneity of space and time and the isotropy of space. They allow us to classify the trajectories or the stationary states according to their values and are therefore important for understanding the dynamics. Having said this, it is clear that it makes no sense to consider all possible constants of motion, since there are infinitely many of them. Any linear combination (or product) of constants of motion is also a constant of motion. It is in fact sufficient to consider a complete set of linearly independent additive constants of motion, since all other constants of motion can be expressed in terms of them. We shall denote by $\{F_i,\ i = 1, \dots, s\}$ the complete set of linearly independent additive constants of motion of the system. In classical mechanics the observables $F_i = F_i(p,q)$ are functions of the generalized coordinates and conjugate momenta of the particles, while in quantum mechanics $F_i$ is a Hermitian operator with eigenvalues $f_i$.

The additive constants of motion are well known from mechanics:

i) The total energy $H(p,q)$ or $H$, which reflects the invariance with respect to translation in time,

ii) the total linear momentum $\vec P = (p_x, p_y, p_z)$, which reflects the homogeneity of space, i.e., the invariance with respect to translation of the system as a whole, and

iii) the total angular momentum $\vec J = (J_x, J_y, J_z)$, which reflects the isotropy of space and the invariance with respect to rotations.

To these seven fundamental additive integrals of motion we should add

iv) the volume V of the system, and

v) the total number of particles $N$. In multi-component systems the number $N_i$ of particles of each type is conserved.

These last two constants are parameters that define the system. They do not depend on the generalized coordinates $(p,q)$. In quantum mechanics $N$ can be an operator if the second-quantization formalism is used.

In almost all cases of experimental interest, one considers macroscopic bodies which are at rest and which do not rotate.¹ One can therefore imagine the system in a fixed box or volume and take this box at rest. In this case $\vec P$ and $\vec J$ are no longer constants of motion, which leaves us with $E$, $N$ and $V$ as the only additive linearly independent constants of motion.

Constants of motion are characterized by having a vanishing Poisson bracket or commutator with the Hamiltonian. As stated by the Liouville and von Neumann theorems, the distribution function $\rho(p,q)$ and the density operator $\rho$ in equilibrium ($\partial\rho/\partial t = 0$) fulfill this condition and are thus constants of motion. The statistical independence of subsystems implies that $\ln\rho$ is an additive constant of motion. Indeed,

\[
\ln \rho_{12} = \ln(\rho_1 \rho_2) = \ln \rho_1 + \ln \rho_2 ,
\]

or in the classical limit

\[
\ln \rho_{12}(p_1, p_2, q_1, q_2) = \ln[\rho_1(p_1, q_1)\, \rho_2(p_2, q_2)] = \ln[\rho_1(p_1, q_1)] + \ln[\rho_2(p_2, q_2)] ,
\]

for any two macroscopic subsystems of a large macroscopic system.² Consequently, $\ln\rho$ must be a linear combination of the linearly independent additive constants of motion $F_1, \dots, F_s$ (i.e., $F_i \equiv E$, $\vec P$, $\vec L$, $V$, $N$):
\[
\ln \rho = \sum_{i=1}^{s} \lambda_i F_i , \tag{3.1}
\]
or in classical mechanics
\[
\ln \rho(p,q) = \sum_{i=1}^{s} \lambda_i F_i(p,q) , \tag{3.2}
\]

¹ The reader may wish to examine the Einstein–de Haas effect for a remarkable exception: A. Einstein and W.J. de Haas, Verh. Dtsch. Phys. Ges. 17, 152 (1915); 18, 173 (1916).

² Notice that statistical independence also implies $[\rho_1, \rho_2] = 0$, which is a necessary condition for the validity of the operator relation $\ln(\rho_1\rho_2) = \ln\rho_1 + \ln\rho_2$.

with some real coefficients $\lambda_i$. This is the central expression for the density operator in equilibrium. In particular, it implies that the equilibrium density operator and distribution function can only depend on the dynamical variables $(p,q)$ through the additive constants of motion. All microstates having the same additive constants of motion are equally probable!

3.4 The density operator: General formulation

In the following we derive the general form of the density operator describing mixed states in different equilibrium situations, ranging from complete isolation to complete openness. The formulation is admittedly somewhat abstract —for instance, the conserved quantities are denoted by $F_i$ with $i = 1, \dots, s$, instead of $E$, $V$, $N$, etc.— in order to stress the universal validity and the analogies between the different statistical ensembles. An explicit form of the density operator in the most usual cases of physical interest is given thereafter.

3.4.1 The microcanonical ensemble

In order to obtain $\rho$ we could simply take the exponential of $\ln\rho$. However, in the case of isolated systems a much more direct derivation is possible. In isolated systems the constants of motion have precise values $f_i$ for $i = 1, \dots, s$. Thus, the probability of finding the system in a state having $F_i \neq f_i$ is zero. In other words, $\rho(p,q) = 0$ for all $(p,q)$ having $F_i(p,q) \neq f_i$ for some $i \in [1,s]$. The same holds in quantum mechanics: $w_n = 0$ for all states $|\alpha_n\rangle$ for which $F_i|\alpha_n\rangle \neq f_i|\alpha_n\rangle$ for some $i$. On the other hand, the trace or integral of $\rho$ over all possible states must be 1. The only function satisfying these conditions is the delta function. It follows that

\[
\rho = a \prod_{i=1}^{s} \delta(F_i - f_i) \tag{3.3}
\]
with
\[
a^{-1} = \mathrm{Tr}\Big\{ \prod_{i=1}^{s} \delta(F_i - f_i) \Big\} , \tag{3.4}
\]
or in the classical case,
\[
\rho(p,q) = a \prod_{i=1}^{s} \delta[F_i(p,q) - f_i] \tag{3.5}
\]
with
\[
a^{-1} = \int dp\, dq\, \prod_{i=1}^{s} \delta[F_i(p,q) - f_i] . \tag{3.6}
\]

The $f_i$ are the values of the constants of motion in the particular macrostate under consideration. Notice that the mixed state describing equilibrium, i.e., the macrostate, is fully characterized by the very few numbers $f_1, \dots, f_s$. Out of the $10^{20}$ degrees of freedom needed to characterize a microstate, we are left with only 9 at most! The statistical ensemble characterized by $\rho$ or $\rho(p,q)$ as given by Eqs. (3.3)–(3.6) is known as the microcanonical ensemble.

The expressions given above are valid for any physical system and any set of linearly independent additive constants of motion. They can be significantly simplified in the most usual situation, where $\vec P = 0$, $\vec L = 0$, and the volume $V$ and number of particles $N$ are fixed. In this case we may write

\[
\rho_{\mathrm{mc}} = \frac{1}{Z_{\mathrm{mc}}}\, \delta(H - E),
\]
where the normalization constant
\[
Z_{\mathrm{mc}} = \mathrm{Tr}\big\{ \delta(H - E) \big\} ,
\]

is known as the microcanonical partition function. The Hilbert space involved in the trace contains only states having $N$ particles in a volume $V$. $\rho_{\mathrm{mc}}$ thus depends on $E$, $V$ and $N$.

3.4.2 The grand canonical ensemble

We turn now to systems which are not isolated. These systems can also be regarded as parts of a much larger system, which can be isolated or not. There are different physical situations to consider. We focus first on completely open systems, whose statistical description corresponds to the grand canonical ensemble. By completely open, we mean that none of the usual constants of motion is conserved in the system. For example, energy, particles and even accessible volume can be exchanged with the environment. The system 1 is said to be completely open if the interaction between 1 and 2 allows exchange of all the additive constants of motion of the ensemble, so that none of the quantities $F_i(1)$ has a fixed value. The role of the environment 2 is only to fix the average value $\langle F_i \rangle$ of $F_i$ in 1. Since the influence of the interaction surface between 1 and 2 relative to the volumes can be assumed to be negligible, the probability of finding the subsystem 1 in a given microstate can be taken to be independent of the microstate of system 2. In fact, part 1 of the system can take a variety of microstates without having any influence on, or being affected by, the state of part 2. Still, an important conservation constraint remains when the ensemble 12 is isolated, since in this case the sum of the constants of motion must be preserved:

\[
F_i(2) = F_i(12) - F_i(1).
\]

The values of the constants of motion in the subsystems of an isolated system are not independent from each other. In open systems, however, these constraints do not apply and statistical independence implies

\[
\rho_{12} = \rho_1\, \rho_2 .
\]

Moreover, $[\rho_1, \rho_2] = 0$, since they act on different degrees of freedom. The states of the joint system 12 are simply the direct products of the states of the parts 1 and 2:

\[
|12\rangle = |1\rangle\, |2\rangle .
\]

It follows that $\ln\rho$ is an additive constant of motion. Therefore, it can be expressed as a linear combination of the linearly independent additive constants of motion $F_i$ with $i = 1, \dots, s$:

\[
\ln \rho = \sum_{i=1}^{s} \lambda_i F_i ,
\]

where the $\lambda_i$ are some real coefficients. The corresponding density operator is given by

\[
\rho_{\mathrm{gc}} = e^{\sum_{i=1}^{s} \lambda_i F_i} . \tag{3.7}
\]

The statistical ensemble or mixed state characterized by $\rho_{\mathrm{gc}}$ is called the grand canonical ensemble.

In the grand canonical ensemble the coefficients $\lambda_1, \dots, \lambda_s$ are fixed by the requirement that the average values of the additive constants of motion take the predefined values $f_1, \dots, f_s$. This gives the conditions

\[
\langle F_i \rangle = \mathrm{Tr}\big\{ \rho_{\mathrm{gc}}\, F_i \big\} = \mathrm{Tr}\big\{ e^{\sum_{j=1}^{s} \lambda_j F_j}\, F_i \big\} = f_i
\]

for $i = 1, \dots, s$. Conversely, the $\lambda_1, \dots, \lambda_s$ define all the mean values $f_1, \dots, f_s$. An important relation among the $\lambda_i$ results from Eq. (3.7) and the normalization of $\rho_{\mathrm{gc}}$:

\[
\mathrm{Tr}\{\rho_{\mathrm{gc}}\} = \mathrm{Tr}\big\{ e^{\sum_{i=1}^{s} \lambda_i F_i} \big\} = 1. \tag{3.8}
\]

Thus, the coefficients $\lambda_i$ ($i = 1, \dots, s$) are not all independent from each other, even in a completely open system. Notice that the operators $F_i$ describing the linearly independent additive constants of motion are independent of the $\lambda_i$. Later on, we shall see that the $\lambda_i$ are related to intensive (i.e., size-independent) properties of the system such as temperature, pressure or density. The normalization condition (3.8) expresses a relation between the $\lambda_i$ which represents the equation of state of the material.

3.4.3 The canonical ensemble

The canonical ensemble describes the intermediate situation in which the system is open with respect to $r$ additive constants of motion $F_1, \dots, F_r$ with $r < s$, and closed with respect to the remaining ones $F_{r+1}, \dots, F_s$. Thus, the conditions on the mean values $\langle F_i \rangle$ read

\[
\langle F_i \rangle = \mathrm{Tr}\{\rho_{\mathrm{c}}\, F_i\} = f_i
\]

for $i = 1, \dots, r$, where $\rho_{\mathrm{c}}$ denotes the canonical density operator. The coefficients $\lambda_1, \dots, \lambda_s$ of the linear expansion must take into account the normalization condition

\[
\mathrm{Tr}\{\rho_{\mathrm{c}}\} = \mathrm{Tr}\Big\{ e^{\sum_{i=1}^{r} \lambda_i F_i} \prod_{j=r+1}^{s} \delta(F_j - f_j) \Big\}\, e^{\sum_{j=r+1}^{s} \lambda_j f_j} = 1 .
\]

This implies that $\rho_{\mathrm{c}}$ actually does not depend on the parameters $\lambda_{r+1}, \dots, \lambda_s$ associated with the conserved quantities $F_{r+1}, \dots, F_s$. They only enter $\rho_{\mathrm{c}}$ through the normalization constant. We may thus write

\[
\rho_{\mathrm{c}} = \frac{1}{Z_r}\, e^{\sum_{i=1}^{r} \lambda_i F_i} \prod_{j=r+1}^{s} \delta(F_j - f_j) ,
\]

where

\[
Z_r = \mathrm{Tr}\Big\{ e^{\sum_{i=1}^{r} \lambda_i F_i} \prod_{j=r+1}^{s} \delta(F_j - f_j) \Big\} = e^{-\sum_{j=r+1}^{s} \lambda_j f_j}
\]

is known as the canonical partition function. $Z_r$ is of great practical importance in statistical mechanics. It can be directly related to the thermodynamic state function appropriate for the corresponding physical situation or ensemble.

The expression for $\rho_{\mathrm{c}}$ describes the most general density operator in thermal equilibrium. It includes in particular the grand canonical ensemble ($r = s$) and the microcanonical ensemble ($r = 0$). In the grand canonical ensemble we have simply $Z_s = 1$, and in the microcanonical ensemble we have

\[
Z_0 = Z_{\mathrm{mc}} = \mathrm{Tr}\Big\{ \prod_{j=1}^{s} \delta(F_j - f_j) \Big\} .
\]

3.5 Explicit forms of the density operator

After having discussed the general formulation, we would like to apply the formalism to the cases of physical interest. Before doing that, let us recall that the additive constants of motion in mechanics are

i) the energy $E$ given by $H$,

ii) the linear momentum $\vec P$,

iii) the angular momentum $\vec L$,

iv) the number of particles $N$ (or $N_i$ for each component in multi-component systems), and

v) the volume V.

Each one of these is clearly related to a particular symmetry of the physical laws or to a boundary condition. We focus on the situation where the system is at rest ($\vec P = 0$ and $\vec L = 0$). This is experimentally realized by keeping the system in a fixed container, in which case $\vec P$ and $\vec L$ are no longer conserved. Under these circumstances, the system is actually open with respect to linear and angular momentum transfer (due to collisions or constraints at the boundary of the volume), keeping the average values $\langle \vec P \rangle = 0$ and $\langle \vec L \rangle = 0$. Thus, the terms corresponding to $\vec P$ and $\vec L$ disappear from $\rho$.

3.5.1 Microcanonical ensemble

The ensemble corresponding to a system which is closed with respect to all remaining additive constants of motion $E$, $N$ and $V$ is known as the microcanonical ensemble. The microcanonical density operator can be written as

\[
\rho_{\mathrm{mc}} = \frac{1}{Z_{\mathrm{mc}}}\, \delta(H - E)
\]
with
\[
Z_{\mathrm{mc}} = \mathrm{Tr}\{\delta(H - E)\},
\]

where it is understood that the Hilbert space on which $\rho$ acts, and within which the partition function $Z_{\mathrm{mc}}$ is calculated, contains only states having a fixed volume $V$ and a fixed number of particles $N$. $\rho_{\mathrm{mc}}$ and $Z_{\mathrm{mc}}$ thus depend on $E$, $V$ and $N$.

Let us discuss the physical meaning of $Z_{\mathrm{mc}}$. The number of states having an energy lower than $E$ is given by

\[
\Sigma(E) = \sum_n \theta(E - E_n) = \mathrm{Tr}\{\theta(E - H)\}, \tag{3.9}
\]

where $\theta(x)$ is the Heaviside function [$\theta(x) = 1$ for $x > 0$ and $\theta(x) = 0$ for $x < 0$] with $d\theta/dx = \delta(x)$. The number of states having an energy in the interval $[E, E + \Delta]$ is

\[
\Gamma(E) = \Sigma(E + \Delta) - \Sigma(E) = \Delta\, \Omega(E) + O(\Delta^2) ,
\]

where we have introduced the density of states of the system $\Omega(E) = d\Sigma/dE$ at the energy $E$ and assumed $\Delta \to 0$. $\Omega(E)$ represents the number of accessible states per unit energy, i.e., the number of states per unit energy in the spectrum of the system at the energy $E$. From Eq. (3.9) it is clear that

\[
\Omega(E) = \frac{d\Sigma}{dE} = \mathrm{Tr}\{\delta(E - H)\} = Z_{\mathrm{mc}}(E) .
\]

The microcanonical partition function $Z_{\mathrm{mc}}$ represents the density of accessible states of the system at the energy $E$ for the given total volume $V$ and particle number $N$.

In the microcanonical ensemble all the states having the same energy $E$ are equally probable. In classical mechanics all the states $(p,q)$ belonging to the hypersurface $H(p,q) = E$ are equally probable. The latter is of course consistent with $\rho$ being a constant of motion and with the ergodic hypothesis. Notice, however, that we didn't invoke any ergodic hypothesis, but simply the concept of statistical independence of the subsystems.
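For a discrete toy spectrum, $\Sigma(E)$ and its finite-difference derivative $\Omega(E)$ can be counted directly. A minimal sketch, assuming $N$ two-level particles with unit level spacing (an illustrative choice, not an example from the text):

```python
from itertools import product

# Toy spectrum: N two-level "particles", each contributing 0 or 1 to E.
N = 10
energies = sorted(sum(cfg) for cfg in product((0, 1), repeat=N))

def Sigma(E):
    """Number of microstates with energy below E: Sigma(E) = Tr theta(E - H)."""
    return sum(1 for En in energies if En < E)

def Omega(E, delta=1.0):
    """States in [E, E + delta) per unit energy: discrete analogue of dSigma/dE."""
    return (Sigma(E + delta) - Sigma(E)) / delta

# The degeneracy of energy E is binomial(N, E), e.g. C(10, 5) = 252 at E = 5.
print(Sigma(3.5), Omega(5.0))
```

The counting reproduces the binomial degeneracies, and $\Sigma(E)$ saturates at $2^N$ above the top of the spectrum.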

Exercise 3.9: Compute the density of states $\Omega(E)$ of a system of $N$ independent identical particles of mass $m$ in a volume $V = L^D$ in $D$ dimensions. Suppose that each particle has the energy $\varepsilon_i$ ($\sum_i \varepsilon_i = E$) and calculate the number of one-particle states with energy $\varepsilon < \varepsilon_i$. Derive the total number of states $\Sigma(\varepsilon_1, \dots, \varepsilon_N)$ for the case when the individual particles have the energies $\varepsilon_1, \varepsilon_2, \dots, \varepsilon_N$. How would this expression be modified if you take into account the principle of indistinguishability of identical particles? Maximize $\Sigma(\varepsilon_1, \dots, \varepsilon_N)$ under the total energy constraint $\sum_i \varepsilon_i = E$ and approximate $\Sigma(E)$ by the maximum value. Derive $\Omega(E)$ and analyse the dependence of $\Sigma$ and $\Omega$ on $E$, $V$, $N$ and $v = V/N$.

3.5.2 Canonical ensemble

A system having a fixed volume $V$ and a fixed number of particles $N$, which can exchange energy with its environment, is described by the canonical ensemble in the narrow sense. In this case one says that the system has a heat contact with the environment. The energy exchanged in this way, i.e., keeping $V$ and $N$ fixed, is commonly known as heat. The canonical density operator is given by

\[
\rho_{\mathrm{c}} = \frac{1}{Z_{\mathrm{c}}}\, e^{-\beta H}
\]
with
\[
Z_{\mathrm{c}} = \mathrm{Tr}\{e^{-\beta H}\},
\]

where we have implicitly assumed that the Hilbert space contains all the states having $N$ particles in a volume $V$. Thus, $\rho_{\mathrm{c}}$ and $Z_{\mathrm{c}}$ depend on $\beta$, $V$ and $N$. The parameter $\beta$ defines the average energy $E$ of the system:

\[
E(\beta) = \langle H \rangle = \mathrm{Tr}\{\rho_{\mathrm{c}}\, H\} = \frac{\mathrm{Tr}\{H e^{-\beta H}\}}{\mathrm{Tr}\{e^{-\beta H}\}} = -\frac{\partial}{\partial\beta} \ln Z_{\mathrm{c}} .
\]

It is important to note that the converse is also true, i.e., the average energy $E$ defines $\beta$ univocally, since $E(\beta)$ is a strictly monotonically decreasing function of $\beta$:

\[
\frac{\partial E}{\partial \beta} = \frac{-\mathrm{Tr}\{H^2 e^{-\beta H}\}\, \mathrm{Tr}\{e^{-\beta H}\} + \big(\mathrm{Tr}\{H e^{-\beta H}\}\big)^2}{\big(\mathrm{Tr}\{e^{-\beta H}\}\big)^2} = -\langle H^2 \rangle + \langle H \rangle^2 = -\langle (\Delta H)^2 \rangle < 0 .
\]

Since $\langle (\Delta H)^2 \rangle$ is always strictly positive and tends to zero only for $\beta \to \infty$, the relation between $E$ and $\beta$ is bijective.
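The strict monotonicity of $E(\beta)$ is easy to verify on a toy spectrum. A sketch with an assumed three-level system (levels and degeneracies are arbitrary illustrative choices):

```python
import math

# Assumed toy spectrum: (energy, degeneracy) pairs.
levels = [(0.0, 1), (1.0, 3), (2.5, 2)]

def average_energy(beta):
    """E(beta) = Tr{H e^{-beta H}} / Tr{e^{-beta H}} = -d ln Zc / d beta."""
    Z = sum(g * math.exp(-beta * E) for E, g in levels)
    return sum(g * E * math.exp(-beta * E) for E, g in levels) / Z

# E(beta) decreases strictly, from the unweighted level average at beta = 0
# toward the ground-state energy as beta -> infinity, so every attainable
# average energy corresponds to exactly one beta.
betas = [-2.0, -1.0, 0.0, 1.0, 2.0, 5.0]
Es = [average_energy(b) for b in betas]
assert all(e1 > e2 for e1, e2 in zip(Es, Es[1:]))
print(Es)
```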

In the canonical ensemble the total energy $E$ of the system is not fixed. The statistical distribution $w(E)$ of $E$, i.e., the probability density of measuring the energy $E$, or of finding the energy $E$ in the system, is given by

\[
w(E) = \langle \delta(E - H) \rangle = \frac{1}{Z_{\mathrm{c}}} \mathrm{Tr}\{e^{-\beta H}\, \delta(E - H)\} = \frac{1}{Z_{\mathrm{c}}} \mathrm{Tr}\{e^{-\beta E}\, \delta(E - H)\} = \frac{1}{Z_{\mathrm{c}}}\, e^{-\beta E}\, \Omega(E). \tag{3.10}
\]

In words, the probability density of measuring the value $E$ of the total energy is equal to the probability $e^{-\beta E}/Z_{\mathrm{c}}$ of finding the system in a particular microstate of energy $E$, multiplied by the number of states $\Omega(E)$ having this energy.

The canonical partition function $Z_{\mathrm{c}}$ can be readily related to the microcanonical partition function $Z_{\mathrm{mc}}(E) = \Omega(E)$ by noting that

\[
\begin{aligned}
Z_{\mathrm{c}} &= \mathrm{Tr}\{e^{-\beta H}\} = \mathrm{Tr}\Big\{ e^{-\beta H} \int dE\, \delta(E - H) \Big\} \\
&= \int dE\, \mathrm{Tr}\{e^{-\beta H}\, \delta(E - H)\} = \int dE\, e^{-\beta E}\, \mathrm{Tr}\{\delta(E - H)\} \\
&= \int dE\, e^{-\beta E}\, \Omega(E) .
\end{aligned} \tag{3.11}
\]

This could have been directly inferred from Eq. (3.10) and the normalization of $w(E)$.

In the canonical ensemble the total energy $E$ of the system does not have a well-defined value. It is therefore very interesting to analyze its probability distribution $w(E)$ in order to assess the potential importance of energy fluctuations. In the following we will show that $w(E)$ is very sharply peaked at the average value. Let us first notice that the density of states $\Omega(E)$ of a macroscopic system increases extremely rapidly with $E$. We already know from our discussion of the statistical independence of subsystems that $\ln\Omega$ is an additive property. We may therefore write

\[
\ln \Omega(E) = N \ln\!\left[ \omega\!\left( \frac{E}{N}, \frac{V}{N} \right) \right], \qquad \Omega(E) = \omega\!\left( \frac{E}{N}, \frac{V}{N} \right)^{\!N} ,
\]

where $\omega(\varepsilon, v)$ is some increasing function of the energy per particle $\varepsilon = E/N$, representing the density of states per particle or per subsystem. We conclude that the probability density $w(E)$ for the total energy $E$ is the product of a very rapidly increasing function and a very rapidly decreasing function of $E$ [see Eq. (3.10)]. Consequently, $w(E)$ shows an extremely narrow maximum. In addition, we already know from the central limit theorem that
\[
\frac{\sqrt{\langle (\Delta H)^2 \rangle}}{\langle H \rangle} \propto \frac{1}{\sqrt{N}} \to 0 \quad \text{for } N \to \infty.
\]

In the macroscopic limit ($N \to \infty$) we may therefore identify the average energy $E = \langle H \rangle$ with the energy $\bar E$ at which $w(E)$ has its maximum (saddle-point integration):

\[
E = \langle H \rangle = \int E'\, w(E')\, dE' = \frac{1}{Z_{\mathrm{c}}} \int E'\, e^{-\beta E'}\, \Omega(E')\, dE' \;\xrightarrow{\,N \to \infty\,}\; \int E'\, \delta(E' - \bar E)\, dE' = \bar E,
\]

where $\bar E$ is given by the condition
\[
\left. \frac{dw(E)}{dE} \right|_{E = \bar E} = 0.
\]

Knowing that
\[
\frac{dw}{dE} = \frac{1}{Z_{\mathrm{c}}} \left[ -\beta\, e^{-\beta E}\, \Omega(E) + e^{-\beta E}\, \frac{d\Omega}{dE} \right],
\]

the energy yielding the maximum in $w(E)$ is obtained from
\[
\frac{dw}{dE} = 0 \;\Leftrightarrow\; -\beta\, \Omega + \frac{d\Omega}{dE} = 0 \;\Leftrightarrow\; \beta = \frac{1}{\Omega(\bar E)} \left. \frac{d\Omega}{dE} \right|_{\bar E},
\]

which implies
\[
\beta(\bar E) = \left. \frac{d}{dE} \ln \Omega(E) \right|_{E = \bar E} > 0 .
\]

This allows us to directly obtain $\beta$ as a function of the average energy $E$, without needing to solve $E(\beta) = \mathrm{Tr}\{\rho_{\mathrm{c}}\, H\}$. Since $\Omega(E)$ is in most cases an increasing function of $E$, we usually have $\beta > 0$.
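The sharpness of $w(E)$ and the maximum condition $\beta = d\ln\Omega/dE$ can be illustrated with an assumed power-law density of states $\Omega(E) = E^a$, which mimics $\omega(E/N)^N$ with $a \sim N$ (the values of $a$ and $\beta$ below are arbitrary illustrative choices):

```python
import math

a, beta = 300.0, 2.0  # assumed: Omega(E) = E**a, canonical parameter beta

def log_w(E):
    """Log of the unnormalized canonical energy distribution e^{-beta E} Omega(E)."""
    return -beta * E + a * math.log(E)

# The condition beta = d ln(Omega)/dE = a/E predicts the peak at E_bar = a/beta.
E_bar = a / beta
grid = [E_bar * (0.5 + 0.001 * k) for k in range(1001)]
E_max = max(grid, key=log_w)
print(E_max, E_bar)  # the numerical peak sits at the predicted E_bar

# Sharpness: already 20% away from the peak, w(E) has dropped by > e^4.
assert log_w(E_bar) - log_w(1.2 * E_bar) > 4.0
```

Increasing $a$ (i.e., $N$) makes the relative width of the peak shrink like $1/\sqrt{a}$, in agreement with the central limit theorem.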

The spectrum of macroscopic systems has a lower bound, the ground-state energy $E_0$, and is usually unbounded at high energies. The density of states is then an increasing function of $E$, and therefore $\beta$ is in most cases positive. Notice that the trace giving the partition function converges only if the spectrum is bounded at least on one side. If the spectrum has only a lower bound, the trace converges only for $\beta > 0$. There are, however, systems in which the spectrum is bounded on both sides, i.e., in which there is a maximum achievable energy $E_{\max}$. An example of a bounded spectrum is found in spin systems. In these cases $\Omega(E)$ first increases with $E$, and then decreases as we approach the upper bound. Thus, for energies close to $E_{\max}$ one finds $\beta < 0$. As a simple example, consider the Ising model with nearest-neighbor interactions on a one-dimensional chain:

\[
H = -J \sum_{i=1}^{N} s_i\, s_{i+1}
\]

with $s_i = \pm 1$. In this case one has $E_0 = -JN$ and $E_{\max} = JN$.
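The bounded Ising spectrum can be enumerated exactly for a small ring. A sketch (the choices $N = 12$ sites, $J = 1$ and periodic boundary conditions $s_{N+1} = s_1$ are assumptions for illustration) showing that $\Omega(E)$ first grows and then shrinks, so that $\beta = d\ln\Omega/dE$ changes sign near the top of the spectrum:

```python
import math
from collections import Counter
from itertools import product

# Exact enumeration of the N-site Ising ring H = -J sum_i s_i s_{i+1}.
N, J = 12, 1.0
omega = Counter()  # omega[E] = number of spin configurations with energy E
for spins in product((-1, 1), repeat=N):
    E = -J * sum(spins[i] * spins[(i + 1) % N] for i in range(N))
    omega[E] += 1

print(min(omega), max(omega))  # bounded spectrum: E0 = -J*N, Emax = +J*N here

def beta_of(E, dE=4):
    """Discrete estimate of beta(E) = d ln(Omega)/dE."""
    return (math.log(omega[E + dE]) - math.log(omega[E - dE])) / (2 * dE)

# beta > 0 in the lower part of the spectrum, beta < 0 near the upper bound.
print(beta_of(-8), beta_of(8))
```

Here $\Omega(E)$ is largest at $E = 0$ and falls to a degeneracy of 2 at both edges, which is why a negative $\beta$ appears for energies close to $E_{\max}$.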

3.5.3 Grand canonical ensemble

The equilibrium state of systems which are open with respect to energy and particle exchange is described by the grand canonical ensemble. Examples of such systems are metals kept at some given electrostatic potential, where electrons can be freely exchanged, or chemical solutions exchanging ions with the electrodes. The corresponding density operator is given by

\[
\rho_{\mathrm{gc}} = \frac{1}{Z_{\mathrm{gc}}}\, e^{-\beta (H - \sum_i \mu_i N_i)}
\]
with the partition function
\[
Z_{\mathrm{gc}} = \mathrm{Tr}\{e^{-\beta (H - \sum_i \mu_i N_i)}\} .
\]

The operator $N_i$ counts the number of particles of each component $i$ in the system. As usual, the parameters $\beta$ and $\mu_i$ define the average values of the total energy

\[
E = \langle H \rangle = \mathrm{Tr}\{\rho_{\mathrm{gc}}\, H\} = E(\beta, \mu_i)
\]

and particle numbers
\[
N_i = \langle N_i \rangle = \mathrm{Tr}\{\rho_{\mathrm{gc}}\, N_i\} = N_i(\beta, \mu_i).
\]

It is understood that average values and traces are calculated for states having a fixed volume $V$. Thus, $\rho_{\mathrm{gc}}$ and $Z_{\mathrm{gc}}$ depend on $\beta$, $\mu_i$ and $V$. For simplicity, we focus on one-component systems in the following.

It is easy to express the grand canonical partition function in terms of the canonical one
\[
Z_{\mathrm{c}}(\beta, N) = \mathrm{Tr}\{e^{-\beta H}\}_N ,
\]

which corresponds to a fixed number of particles $N$. Splitting the grand canonical trace into separate sums over all states having the same $N$, we have

\[
Z_{\mathrm{gc}}(\beta, \mu) = \sum_{N=0}^{+\infty} \mathrm{Tr}\{e^{-\beta H}\}_N\, e^{\beta \mu N} = \sum_{N=0}^{+\infty} e^{\beta \mu N}\, Z_{\mathrm{c}}(\beta, N).
\]
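This relation between $Z_{\mathrm{gc}}$ and the fixed-$N$ canonical partition functions can be verified on a toy system. A sketch assuming four single-particle levels occupied by spinless fermions (an illustrative choice; for such independent levels $Z_{\mathrm{gc}}$ also factorizes into a product over levels, which gives an independent cross-check):

```python
import math
from itertools import combinations

eps = [0.0, 0.5, 1.0, 2.0]  # assumed single-particle levels (0/1 occupation)
beta, mu = 1.3, 0.7         # assumed parameters

def Zc(N):
    """Canonical partition function at fixed particle number N."""
    return sum(math.exp(-beta * sum(subset))
               for subset in combinations(eps, N))

# Grand canonical sum over particle-number sectors ...
Zgc_sum = sum(math.exp(beta * mu * N) * Zc(N) for N in range(len(eps) + 1))
# ... equals the independent-level product Tr e^{-beta(H - mu N)}.
Zgc_product = math.prod(1.0 + math.exp(-beta * (e - mu)) for e in eps)
print(Zgc_sum, Zgc_product)  # the two values agree
```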

Some authors, such as K. Huang, introduce the fugacity $z = e^{\beta\mu}$ and write $Z_{\mathrm{gc}} = \sum_{N=0}^{\infty} z^N Z_{\mathrm{c}}$. Although sometimes quite practical, we shall not follow this notation, since it spoils the correspondence between the $\lambda_i$ and the $f_i$.

The probability of finding an energy $E$ and a number of particles $N$ is given by

\[
w(E, N) = \langle \delta(E - H)\, \delta(N - \hat N) \rangle = \frac{1}{Z_{\mathrm{gc}}} \mathrm{Tr}\{\delta(E - H)\, \delta(N - \hat N)\}\, e^{-\beta(E - \mu N)} = \frac{1}{Z_{\mathrm{gc}}}\, e^{-\beta(E - \mu N)}\, \Omega_N(E) ,
\]

where $\Omega_N(E)$ is the density of states for a system containing $N$ particles in the volume $V$. Again we find that the probability density distribution for $E$ and $N$ is very narrow around the average values. The equations defining the position of the maximum of $w(E,N)$ allow us to express $\beta$ and $\mu$ as functions of $E$ and $N$.

Exercise 3.10: Show that
\[
\frac{\partial \langle N \rangle}{\partial \mu} = \beta\, \big( \langle N^2 \rangle - \langle N \rangle^2 \big).
\]
Show that $\langle N \rangle \to 0$ for $\mu \to -\infty$ ($\beta > 0$). Explain why, for any $\beta > 0$, there is always one and only one solution of the equation $\langle N \rangle = N$ for any average number of particles $N > 0$.
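The fluctuation relation in Exercise 3.10 can be checked numerically (this is a consistency check, not a proof). A sketch on an assumed toy system of three fermionic levels, comparing a finite-difference derivative of $\langle N \rangle$ with $\beta\,(\langle N^2 \rangle - \langle N \rangle^2)$:

```python
import math
from itertools import product

eps = [0.0, 0.8, 1.5]  # assumed levels, each holding 0 or 1 particle
beta = 2.0             # assumed inverse temperature

def moments(mu):
    """Return <N> and <N^2> in the grand canonical ensemble."""
    Z = w1 = w2 = 0.0
    for occ in product((0, 1), repeat=len(eps)):
        N = sum(occ)
        E = sum(o * e for o, e in zip(occ, eps))
        w = math.exp(-beta * (E - mu * N))
        Z += w; w1 += N * w; w2 += N * N * w
    return w1 / Z, w2 / Z

mu = 0.5
n, n2 = moments(mu)
h = 1e-6
derivative = (moments(mu + h)[0] - moments(mu - h)[0]) / (2 * h)
print(derivative, beta * (n2 - n * n))  # the two expressions agree
```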

Finally, we would like to show that the mapping $(\beta, \mu) \to (E, N)$ with $N = \langle \hat N \rangle$ and $E = \langle H \rangle$ is locally invertible for all $\beta$ and $\mu$. This implies that the solution of the equations $E(\beta, \mu) = E$ and $N(\beta, \mu) = N$ is always unique.

The proof is rather complicated if we use the variables $\beta$ and $\mu$, which are certainly more physical, than if we use $\lambda_1$ and $\lambda_2$ with $F_1 = H$ and $F_2 = N$. We consider the grand canonical density operator in the form

\[
\rho_{\mathrm{gc}} = \frac{e^{-(\lambda_1 F_1 + \lambda_2 F_2)}}{Z_{\mathrm{gc}}}
\]
with
\[
Z_{\mathrm{gc}} = \mathrm{Tr}\{e^{-\lambda_1 F_1 - \lambda_2 F_2}\}
\]

and $[F_1, F_2] = 0$. It follows that
\[
\frac{\partial \ln Z_{\mathrm{gc}}}{\partial \lambda_i} = -\langle F_i \rangle
\]
and
\[
\frac{\partial \langle F_i \rangle}{\partial \lambda_i} = -\big( \langle F_i^2 \rangle - \langle F_i \rangle^2 \big)
\]

for $i = 1, 2$. Moreover,
\[
\frac{\partial \langle F_1 \rangle}{\partial \lambda_2} = -\big[ \langle F_1 F_2 \rangle - \langle F_1 \rangle \langle F_2 \rangle \big] = \frac{\partial \langle F_2 \rangle}{\partial \lambda_1} .
\]

The Jacobian matrix of the transformation is then given by
\[
J = \left( \frac{\partial \langle F_i \rangle}{\partial \lambda_j} \right) = -\begin{pmatrix} \langle F_1^2 \rangle - \langle F_1 \rangle^2 & \langle F_1 F_2 \rangle - \langle F_1 \rangle \langle F_2 \rangle \\ \langle F_1 F_2 \rangle - \langle F_1 \rangle \langle F_2 \rangle & \langle F_2^2 \rangle - \langle F_2 \rangle^2 \end{pmatrix}.
\]

We need to show that the determinant of the Jacobian is never zero, whatever the density matrix $\rho$ used for performing the averages. For this, we use the Cauchy–Schwarz inequality with an appropriate inner product in an ad hoc vector space.

First notice that
\[
\big\langle (F_1 - \langle F_1 \rangle)(F_2 - \langle F_2 \rangle) \big\rangle = \langle F_1 F_2 \rangle - \langle F_1 \rangle \langle F_2 \rangle.
\]

We consider $s$ compatible linear operators $F_1, \dots, F_s$ with $[F_i, F_j] = 0$ for all $i, j$. We define the vector space $\mathcal{V}$ of all the linear combinations of the zero-average operators $f_i = F_i - \langle F_i \rangle$. In this space we introduce the bilinear form

\[
\langle f | g \rangle = \langle f\, g \rangle ,
\]

where the averages are taken with respect to some fixed density matrix $\rho$. $\langle f | g \rangle$ has all the properties of an inner product. First, $\langle f | f \rangle \geq 0$ for all $f$, and $\langle f | f \rangle = 0 \Rightarrow f = 0$. Here we used that $\langle f^2 \rangle = 0$ implies that $f$ is constant and thus zero. Second, $\langle f | g \rangle$ is bilinear and symmetric, since $[F_i, F_j] = 0$. Consequently, we can apply the Cauchy–Schwarz inequality

\[
\langle f | g \rangle^2 \leq \langle f | f \rangle\, \langle g | g \rangle ,
\]

which implies
\[
0 \leq \big( \langle F_1^2 \rangle - \langle F_1 \rangle^2 \big) \big( \langle F_2^2 \rangle - \langle F_2 \rangle^2 \big) - \big( \langle F_1 F_2 \rangle - \langle F_1 \rangle \langle F_2 \rangle \big)^2 = \det(J).
\]

The equal sign holds only when the vectors (operators) are linearly dependent, which of course is not the case for $H$ and $N$. Since the Jacobian determinant always has the same sign (and is never zero), the change of variables is invertible, which implies the uniqueness of the solution of the equations

\[
E(\beta, \mu) = \langle H \rangle_{\beta\mu} = E
\]
and
\[
N(\beta, \mu) = \langle \hat N \rangle_{\beta\mu} = N .
\]
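The strict positivity of $\det(J)$ can be checked numerically by accumulating the covariance matrix of $H$ and $N$ in a toy grand canonical ensemble (an assumed three-level fermionic system; levels and parameters are illustrative):

```python
import math
from itertools import product

eps = [0.3, 1.0, 1.7]  # assumed single-particle levels (0/1 occupation)
beta, mu = 1.0, 0.9    # assumed parameters

# Accumulate Z and the first and second moments of F1 = H and F2 = N.
Z = h1 = h2 = n1 = n2 = hn = 0.0
for occ in product((0, 1), repeat=len(eps)):
    N = sum(occ)
    H = sum(o * e for o, e in zip(occ, eps))
    w = math.exp(-beta * (H - mu * N))
    Z += w; h1 += H * w; h2 += H * H * w
    n1 += N * w; n2 += N * N * w; hn += H * N * w

var_H = h2 / Z - (h1 / Z) ** 2
var_N = n2 / Z - (n1 / Z) ** 2
cov = hn / Z - (h1 / Z) * (n1 / Z)
# For the 2x2 matrix J = -C, det(J) = det(C) = var_H var_N - cov^2.
det_J = var_H * var_N - cov ** 2
print(det_J > 0.0)  # → True
```

The determinant is strictly positive because $H$ is not a multiple of $N$ here: the energy varies within each fixed-$N$ sector, so the operators are linearly independent.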

Exercise 3.11: Show that
\[
\frac{\partial \ln Z_{\mathrm{gc}}}{\partial \beta} = -\big( \langle H \rangle - \mu \langle N \rangle \big)
\]
and
\[
\frac{\partial}{\partial \beta} \big( \langle H \rangle - \mu \langle N \rangle \big) = -\Big[ \big\langle (H - \mu N)^2 \big\rangle - \big\langle H - \mu N \big\rangle^2 \Big] < 0.
\]
Since $\langle H \rangle - \mu \langle N \rangle$ is a monotonically decreasing function of $\beta$ for all $\mu$, conclude that the equation $\langle H \rangle - \mu \langle N \rangle = \text{constant}$ always has a unique solution as a function of $\beta$. Perform the corresponding analysis concerning the chemical potential $\mu$.

[Figure]

3.5.4 Grand canonical pressure ensemble

A system which is open with respect to all constants of motion is characterized by the density operator

\[
\rho_{\mathrm{gcp}} = e^{-\beta (H - \mu N + p V)} ,
\]

where the normalization is such that $\mathrm{Tr}\{\rho_{\mathrm{gcp}}\} = 1$. Thus, there is no need for a normalizing factor. The relevant parameters or variables entering $\rho_{\mathrm{gcp}}$ are here $\beta$, $\mu$ and $p$, which define the average energy $E = \langle H \rangle$, the average number of particles $\langle N \rangle$, and the average volume $\langle V \rangle$. In this case volume fluctuations come into play. We find this physical situation when we consider a hot-air balloon, an air bubble in a liquid, or a warm-air bubble climbing its way up in the atmosphere on a sunny day. This ensemble is known as the grand canonical pressure ensemble.

The probability of finding an energy E, number of particles N and volume V is

\[
w(E, N, V) = \langle \delta(E - H)\, \delta(N - \hat N)\, \delta(V - \hat V) \rangle = e^{-\beta(E - \mu N + p V)}\, \Omega_{N,V}(E),
\]

where $\Omega_{N,V}(E)$ is the density of states for a system containing $N$ particles in a volume $V$. As in the previous cases, $w(E, N, V)$ has an extremely narrow distribution around the averages $\langle E \rangle$, $\langle N \rangle$ and $\langle V \rangle$, which coincide with the values of $E$, $N$ and $V$ that maximize $w(E, N, V)$.

3.6 Implicit notation

Throughout the previous discussion of the density operators and partition functions in the different ensembles we made use of a widespread convention, which consists in dropping the $\delta$ functions in the expressions for $\rho_{\mathrm{c}}$ and $Z_r$. This is possible, and should lead to no confusion, since we implicitly assumed that the Hilbert space on which $\rho_{\mathrm{c}}$ acts, and with respect to which the trace in $Z_r$ is calculated, contains only states $|\psi\rangle$ having well-defined values of the conserved constants of motion:

\[
F_i |\psi\rangle = f_i |\psi\rangle
\]

for $i = r+1, \dots, s$. Keeping this in mind is, of course, crucial for the calculations. Using this convention, the general expressions for the canonical ensemble take the form

\[
\rho_{\mathrm{c}} = \frac{1}{Z_r}\, e^{\sum_{i=1}^{r} \lambda_i F_i} ,
\]
where
\[
Z_r = \mathrm{Tr}\{e^{\sum_{i=1}^{r} \lambda_i F_i}\} ,
\]

which is clearly more compact and elegant.

4 Entropy

The entropy is one of the most important concepts in statistical mechanics. It is defined as

\[
S = -k_B \langle \ln \rho \rangle = -k_B\, \mathrm{Tr}\{\rho \ln \rho\},
\]

where $k_B = 1.38 \times 10^{-23}$ J/K is the Boltzmann constant. This constant is introduced in order to be consistent with the historical definition of entropy change, which is an energy difference (reversible heat) divided by the temperature at which the heat transfer occurs. Thus, $k_B$ and $S$ have units of energy divided by temperature. Here, K stands for kelvin, after William Thomson, 1st Baron Kelvin (1824–1907), who measured the value of the absolute zero of temperature ($\simeq -273\,^{\circ}$C). A temperature difference of 1 K is the same as 1 °C; only the reference zero is different. In the Kelvin or absolute temperature scale, zero corresponds to the lowest attainable temperature value, whose existence was already known to Carnot (1824). The Boltzmann constant is not a fundamental physical constant. Its value follows from the convention which assigns the value $T = 273.16$ K to the triple point of water. Thus, maybe some day one might be able to compute $k_B$ in terms of $\hbar$, $c$ and the masses and charges of the elementary particles. $k_B$ allows us to relate temperature and energy scales. Roughly speaking, for simple estimates, one may take $1\,\mathrm{eV} \approx 10^4$ K (more precisely, $1\,\mathrm{eV}/k_B \approx 11600$ K). As we shall see, this rough conversion is useful for judging which energy differences are important at a given temperature.

If we denote by $\rho_\nu$ the eigenvalues of $\rho$, we have
\[
S = -k_B \sum_{\nu} \rho_\nu \ln \rho_\nu \geq 0 ,
\]

since $x \ln x \leq 0$ for $0 \leq x \leq 1$. We already know that $S = 0$ only for pure states. The definition of $S$ shows that $S$ is nothing but the average of the additive constant of motion $-\ln\rho$, which we inferred under the assumption of statistical independence of macroscopic subsystems. For open systems, in the grand canonical ensemble, we know that $\ln\rho$ is an additive constant of motion, which can thus be expressed as

\[
\ln \rho = \sum_{i=1}^{s} \lambda_i F_i ,
\]

in terms of the set $F_1, \dots, F_s$ of linearly independent additive constants of motion. In the grand canonical ensemble, the additivity of $S$ follows immediately:

\[
S = -k_B \langle \ln \rho \rangle = -k_B \sum_{i=1}^{s} \lambda_i \langle F_i \rangle = -k_B \sum_{i=1}^{s} \lambda_i f_i .
\]

If the system is not fully open, i.e., if some additive properties are conserved, the statistical independence of the subsystems does not hold strictly. Different subsystems of an isolated system can be regarded as statistically independent, except for the fact that the sum of the conserved additive properties over all subsystems must always take a well-defined value. Therefore, one could doubt the additivity of $S$ a priori. However, we have seen that for macroscopic systems such constraints are not essential, since the density operators of the different microcanonical and grand canonical ensembles are not very different in practice. For instance, the energy distribution in the canonical ensemble is extremely sharp, even though $E$ is not strictly conserved. In fact, as we shall see, the additivity of $S$ can be demonstrated in general.

Let us consider the general canonical ensemble, which is open with respect to $F_1, \ldots, F_r$ and closed with respect to $F_{r+1}, \ldots, F_s$. In this case we have

\[
\rho_c = \frac{1}{Z_r} \, e^{\sum_{i=1}^{r} \lambda_i F_i} , \tag{4.1}
\]

where
\[
Z_r = \mathrm{Tr}\{ e^{\sum_{i=1}^{r} \lambda_i F_i} \} . \tag{4.2}
\]

Moreover, we know that
\[
Z_r = e^{-\sum_{i=r+1}^{s} \lambda_i f_i} . \tag{4.3}
\]

The entropy is then given by

\[
S = -k_B \left[ \sum_{i=1}^{r} \lambda_i \langle F_i \rangle - \ln Z_r \right] = -k_B \sum_{i=1}^{s} \lambda_i f_i . \tag{4.4}
\]

Notice that $S$ depends on the $r$ coefficients $\lambda_1, \ldots, \lambda_r$, which control the average values of the non-conserved additive observables $\langle F_i \rangle = f_i$ for $1 \leq i \leq r$. It also depends on the conserved properties $F_{r+1}, \ldots, F_s$ with fixed values $f_{r+1}, \ldots, f_s$, which define the ensemble. For example, in the canonical ensemble, $S$ depends on the temperature, which defines the average energy $E$, on the number of particles $N$, and on the volume $V$.

4.1 Maximum entropy theorem

We would like to establish a variational principle in terms of the entropy by which we can characterize the state of equilibrium. To this aim we consider the entropy

\[
S = -k_B \langle \ln \rho \rangle = -k_B \mathrm{Tr}\{\rho \ln \rho\} \tag{4.5}
\]

as a functional $S[\rho]$ defined for any density operator $\rho$ acting in the given Hilbert space of our system. The properties of $\rho$ are clear: $0 \leq \rho_\nu \leq 1$ and $\rho^\dagger = \rho$. The underlying Hilbert space is spanned by the states having well-defined constants of motion $F_{r+1}, \ldots, F_s$.

The following fundamental property holds. Among all the density operators $\rho$ satisfying $\langle F_i \rangle = \mathrm{Tr}\{\rho F_i\} = f_i$ for $i = 1, \ldots, r$, the one yielding the largest $S$ is the canonical density operator $\rho_c$ given by Eqs. (4.1) and (4.2), which describes the state of equilibrium. This is equivalent to the inequality

\[
S[\rho] \leq S[\rho_c]
\]

or, using Eqs. (4.4) and (4.5), to

\[
S[\rho] = -k_B \mathrm{Tr}\{\rho \ln \rho\} \leq S[\rho_c] = -k_B \left[ \sum_{i=1}^{r} \lambda_i \langle F_i \rangle - \ln Z_r \right] .
\]


Thus, the state of equilibrium is the one which maximises $S$ under the given constraints $\langle F_i \rangle = f_i$ for $i = 1, \ldots, r$ and $F_i = f_i$ for $i = r+1, \ldots, s$.
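This variational statement is easy to probe numerically: build the canonical distribution for a toy spectrum, then perturb it without changing the normalization or the mean energy, and compare entropies (a sketch with $k_B = 1$; the three-level spectrum and the value of $\beta$ are arbitrary choices):

```python
import math

def entropy(p):
    return -sum(x * math.log(x) for x in p if x > 0.0)

E = [0.0, 1.0, 2.0]                     # toy energy levels
beta = 0.7                              # arbitrary inverse temperature

w = [math.exp(-beta * e) for e in E]
Z = sum(w)
p_c = [x / Z for x in w]                # canonical distribution rho_c
E_mean = sum(p * e for p, e in zip(p_c, E))

# Shift along (1, -2, 1): leaves both sum(p) and <E> unchanged,
# because the levels are equally spaced
eps = 0.05
p = [p_c[0] + eps, p_c[1] - 2 * eps, p_c[2] + eps]

assert abs(sum(p) - 1.0) < 1e-12
assert abs(sum(pi * e for pi, e in zip(p, E)) - E_mean) < 1e-12
print(entropy(p) < entropy(p_c))        # True: rho_c has the largest S
```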

In order to prove this important statement we first show the following lemma. Given any two density operators $\rho$ and $\rho'$, we have

\[
\mathrm{Tr}\{\rho \ln \rho'\} \leq \mathrm{Tr}\{\rho \ln \rho\} . \tag{4.6}
\]
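For commuting (diagonal) density operators the lemma reduces to the classical Gibbs inequality $\sum_n p_n \ln q_n \leq \sum_n p_n \ln p_n$, which is easy to test on random distributions (a sketch; the distributions are randomly generated examples):

```python
import math
import random

random.seed(0)

def random_distribution(n):
    w = [random.random() for _ in range(n)]
    s = sum(w)
    return [x / s for x in w]

p = random_distribution(5)
q = random_distribution(5)

lhs = sum(pi * math.log(qi) for pi, qi in zip(p, q))   # sum p ln q
rhs = sum(pi * math.log(pi) for pi in p)               # sum p ln p
print(lhs <= rhs)   # True; equality holds only for q = p
```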

The starting point is the inequality $\ln x \leq x - 1$ for any $x > 0$. This implies that for any Hermitian operator $A$ with eigenvectors $|n\rangle$ and eigenvalues $a_n \geq 0$, we have

\[
\langle \psi | \ln A | \psi \rangle = \sum_n |\langle \psi | n \rangle|^2 \ln a_n \leq \sum_n |\langle \psi | n \rangle|^2 (a_n - 1) = \langle \psi | (A - 1) | \psi \rangle . \tag{4.7}
\]

We consider the diagonal representation of
\[
\rho = \sum_n \rho_n |n\rangle \langle n|
\]
with $\rho_n > 0$ and obtain

\[
\mathrm{Tr}\{\rho \,[\ln(\rho') - \ln(\rho)]\} = \sum_n \rho_n \langle n | [\ln(\rho') - \ln(\rho_n)] | n \rangle = \sum_n \rho_n \langle n | \ln\!\left(\frac{\rho'}{\rho_n}\right) | n \rangle ,
\]

where we have used that $\rho_n > 0$. The operators $\rho'/\rho_n$ are all positive semi-definite, since $\rho'$ has positive or zero eigenvalues and $\rho_n > 0$. Therefore, using (4.7) we have

\[
\langle n | \ln\!\left(\frac{\rho'}{\rho_n}\right) | n \rangle \leq \langle n | \frac{\rho'}{\rho_n} - 1 | n \rangle
\]

for all $n$. It follows that
\[
\mathrm{Tr}\{\rho (\ln \rho' - \ln \rho)\} \leq \sum_n \rho_n \langle n | \left(\frac{\rho'}{\rho_n} - 1\right) | n \rangle = \sum_n \langle n | \rho' | n \rangle - \mathrm{Tr}\{\rho\} = \mathrm{Tr}\{\rho'\} - 1 = 0 .
\]

This proves the inequality (4.6).

We turn now our attention to the entropy $S[\rho]$ of an arbitrary density operator $\rho$:

\[
S[\rho] = -k_B \mathrm{Tr}\{\rho \ln \rho\} \leq -k_B \mathrm{Tr}\{\rho \ln \rho_c\} = -k_B \mathrm{Tr}\Big\{ \rho \Big( \sum_{i=1}^{r} \lambda_i F_i - \ln Z_r \Big) \Big\}
= -k_B \left[ \sum_{i=1}^{r} \lambda_i \langle F_i \rangle - \ln Z_r \right] = -k_B \sum_{i=1}^{s} \lambda_i f_i = S[\rho_c] ,
\]

where we have used (4.6) and the explicit form $\ln \rho_c = \sum_{i=1}^{r} \lambda_i F_i - \ln Z_r = \sum_{i=1}^{r} \lambda_i F_i + \sum_{i=r+1}^{s} \lambda_i f_i$ from Eqs. (4.1) and (4.3). Notice that the logarithm of the density operator at equilibrium is a linear combination of additive constants of motion. Therefore, the average $\langle \ln \rho_c \rangle = \mathrm{Tr}\{\rho \ln \rho_c\}$ is the same for all density operators having the same average values $\langle F_i \rangle = f_i$ of the properties $F_i$ with respect to which the system is open ($1 \leq i \leq r$) and the same constants of motion $f_j$ with respect to which the system is closed ($r+1 \leq j \leq s$).

We have already discussed the meaning of $-\langle \ln \rho \rangle = -\sum_n \rho_n \ln \rho_n$ for an arbitrary probability distribution $\rho_n$, where $0 \leq \rho_n \leq 1$ and $\sum_n \rho_n = 1$. We have seen that $-\langle \ln \rho \rangle$ gives a measure of the degree of disorder in the distribution or, equivalently, a measure of the deviation of the mixed state described by $\rho$ from a pure state. In fact, $S = 0$ only for pure states, and $S$ increases as the number of states participating in the mixed state increases. A larger number of participating states reduces the averages $\langle \rho \rangle = \mathrm{Tr}\{\rho^2\}$ and $\langle \ln \rho \rangle = \mathrm{Tr}\{\rho \ln \rho\}$, since $\sum_n \rho_n = 1$ remains constant. We have also interpreted $-\langle \ln \rho \rangle$ as a measure of the degree of uncertainty (or lack of knowledge) that we have on the actual state of the system. With this in mind, the principle of maximum entropy tells us that among all the states of a system satisfying $\langle F_i \rangle = f_i$ for $i = 1, \ldots, r$ and $F_i \equiv f_i$ for $i = r+1, \ldots, s$, the equilibrium state is the one with the highest degree of disorder, the farthest from pure, or the one about which we know least. This fundamental property is physically most sound; one can actually grasp the state of equilibrium in this way.

4.2 The approach to equilibrium

An alternative perspective on the principle of maximum entropy can be obtained by investigating the time dependence of the entropy $S = -k_B \langle \ln \rho \rangle$ of a system in a mixed state $\rho$ (not necessarily in equilibrium), which results from the interactions between the system and its environment. We already know that if a system is strictly isolated, $\rho$ is a constant of motion and the entropy is independent of time.

We consider a system in an arbitrary mixed state, not necessarily in equilibrium, which is described by the density operator
\[
\rho = \sum_m |m\rangle P_m \langle m| .
\]

The states $|m\rangle$ are the eigenstates of $\rho$ with $\langle m | m' \rangle = \delta_{mm'}$ and $\sum_m P_m = 1$. We consider the system under the action of some external (unspecified) time-dependent perturbation $V$, which describes the interaction between the system and the environment. Since the interactions with the environment are uncontrolled, i.e., their outcome cannot be predicted with certainty, we shall describe them as a stochastic process. This is defined as a time-indexed succession of states
\[
m_1 t_1, \; m_2 t_2, \; \ldots, \; m_{n-1} t_{n-1}
\]

with $t_1 < t_2 < \cdots < t_{n-1}$. Moreover, we assume that the probability for the system to be in the state $m_k$ at time $t_k$ is completely determined by the state $m_{k-1}$ which the system had at time $t_{k-1}$. Such processes are called Markovian. They lack any memory or inertial effects. The time evolution of the system is controlled by the transition probability

\[
p(m', t' \,|\, m, t) , \tag{4.8}
\]


which represents the conditional probability of finding the system in the state $m'$ at time $t'$, knowing that it was in state $m$ at time $t$. Clearly,
\[
p(m', t \,|\, m, t) = \delta_{mm'} \quad \forall m, m' \text{ and } t \tag{4.9}
\]
and
\[
\sum_{m'} p(m', t' \,|\, m, t) = 1 \quad \forall m, t, \text{ and } t' > t . \tag{4.10}
\]

For small time differences $\Delta t = t' - t$, it is useful to perform a linear expansion and to introduce the transition rate or rate constant $k_{m'm}$, which represents the transition probability per unit time:

\[
p(m', t + \Delta t \,|\, m, t) = k_{m'm} \Delta t + O(\Delta t^2) , \tag{4.11}
\]

where $m' \neq m$. The probability of staying in the state $m$ at $t + \Delta t$ is obtained from the normalization condition (4.10):
\[
p(m, t + \Delta t \,|\, m, t) = 1 - \sum_{m' \neq m} k_{m'm} \Delta t + O(\Delta t^2) . \tag{4.12}
\]

Notice that (4.11) and (4.12) satisfy Eq. (4.9) to first order in $\Delta t$. The dynamics of Markov processes is controlled by the transition rates $k_{m'm}$ and by the initial probability distribution $\{P_m\}$.

The probability of finding the system in the state $m'$ at time $t + \Delta t$ is equal to the sum of the probabilities of being in any state $m$ (including $m'$) at time $t$ multiplied by the transition probability $p(m', t + \Delta t \,|\, m, t)$ from the state $m$ to the state $m'$. This is written as
\[
P_{m'}(t + \Delta t) = \sum_m p(m', t + \Delta t \,|\, m, t) \, P_m(t) .
\]

Replacing (4.11) for $m' \neq m$ and (4.12) for $m = m'$, we have
\[
P_{m'}(t + \Delta t) = \sum_{m \neq m'} k_{m'm} P_m(t) \Delta t + P_{m'}(t) - \sum_{m \neq m'} k_{mm'} P_{m'}(t) \Delta t ,
\]

which implies
\[
\frac{dP_m}{dt} = \sum_{n \neq m} (k_{mn} P_n - k_{nm} P_m) . \tag{4.13}
\]

This is known as the master equation. The first term on the right-hand side represents the probability of landing on $m$ coming from elsewhere ($n \neq m$), and the second, negative term is the probability of going elsewhere starting from $m$.
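The master equation with symmetric rates can be integrated numerically; starting from a pure state, the entropy grows monotonically and the distribution relaxes to the uniform one (a sketch with $k_B = 1$; the rates, time step and system size are arbitrary choices):

```python
import math
import random

random.seed(1)
n = 4

# Symmetric transition rates k[m][l] = k[l][m] (microscopic reversibility)
k = [[0.0] * n for _ in range(n)]
for a in range(n):
    for b in range(a + 1, n):
        k[a][b] = k[b][a] = random.uniform(0.5, 1.5)

def entropy(P):
    return -sum(p * math.log(p) for p in P if p > 0.0)

P = [1.0, 0.0, 0.0, 0.0]     # pure initial state
dt = 0.01
S_prev = entropy(P)

for _ in range(2000):
    # Euler step of dP_m/dt = sum_{l != m} (k_{ml} P_l - k_{lm} P_m)
    dP = [sum(k[m][l] * P[l] - k[l][m] * P[m] for l in range(n) if l != m)
          for m in range(n)]
    P = [p + dt * d for p, d in zip(P, dP)]
    S = entropy(P)
    assert S >= S_prev - 1e-12          # dS/dt >= 0 at every step
    S_prev = S

print(max(abs(p - 1.0 / n) for p in P) < 1e-3)   # True: P -> uniform
```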

In order to compute the time dependence of $P_m$ and of $S$ we need some information on the transition rates $k_{mn}$. This is provided by time-dependent perturbation theory in quantum mechanics. Given a periodic perturbation $V(t)$ with frequency $\omega$ and matrix elements $V_{nm} = \langle n | V | m \rangle$, the transition probability per unit time from an unperturbed state $|m\rangle$ to a state $|n\rangle$ is given by the Fermi golden rule
\[
k_{nm} = \frac{2\pi}{\hbar} |V_{nm}|^2 \, \delta(E_n - E_m \pm \hbar \omega) .
\]

Important for us is simply the microscopic reversibility
\[
k_{nm} = k_{mn} , \tag{4.14}
\]
which is valid at the quantum level, since $V$ is Hermitian ($V_{nm} = V_{mn}^*$). This tells us that the probability for a transition from state $|m\rangle$ to $|n\rangle$ is the same as the probability for a transition from $|n\rangle$ to $|m\rangle$.

We can finally turn our attention to the entropy
\[
S = -k_B \langle \ln \rho \rangle = -k_B \sum_m P_m \ln P_m
\]

and calculate
\[
\frac{dS}{dt} = -k_B \sum_m \left( \frac{dP_m}{dt} \ln P_m + \frac{dP_m}{dt} \right) .
\]

Taking into account that
\[
\sum_m P_m = 1 \;\Rightarrow\; \sum_m \frac{dP_m}{dt} = 0 ,
\]

we have
\[
\frac{dS}{dt} = -k_B \sum_m \frac{dP_m}{dt} \ln P_m
= -k_B \sum_m \sum_{n \neq m} (k_{mn} P_n - k_{nm} P_m) \ln P_m
= -k_B \sum_{m \neq n} k_{mn} (P_n - P_m) \ln P_m
= -k_B \sum_{m < n} k_{mn} (P_n - P_m)(\ln P_m - \ln P_n) .
\]

Since $\ln x$ is an increasing function of $x$, the signs of $(P_n - P_m)$ and $(\ln P_m - \ln P_n)$ are always opposite. This implies
\[
\frac{dS}{dt} \geq 0 \quad \forall t . \tag{4.15}
\]

One concludes that the entropy always increases along the process of reaching equilibrium. The equilibrium state can only be the one with maximum entropy.

Notice that the previous derivation of the dynamical entropy increase is intimately related to the microscopic reversibility $k_{nm} = k_{mn}$. It is also clear that symmetric rates always lead to an increase of the diversity or disorder in the probability distribution $\{P_m\}$. Consider for example a pure state having $P_n = 1$ and $P_m = 0$ for $m \neq n$. Since $P_n = 1$ is very large, the transition rate $k_{mn}$ (from $n$ to $m$) will lead to an increase of $P_m$. However, $P_m$ is very small at the beginning. Therefore, the probability for the system to go back from $m$ to $n$ is very small, even though the rate $k_{nm}$ is the same. Instead, the system will make transitions from $m$ to other states $l$, and so on, until the maximum diversity or disorder (maximum $S$) is reached.

Let us finally note that in many applications of the master equation to the stochastic dynamics of open systems we do have $k_{mn} \neq k_{nm}$. In this case a reduction of $S$ along the dynamics cannot be excluded, for example, along the process of folding of a protein into its native state. Of course, the entropy of the protein plus the surrounding liquid or tissue certainly increases along this spontaneous process.

If the system under study does not satisfy $k_{nm} = k_{mn}$, then the equilibrium situation does not necessarily correspond to the configuration of maximum entropy but simply to $dP_m/dt = 0$. In this case we also have, of course, $dS/dt = 0$. A simple transparent condition for equilibrium is in this case the detailed balance condition $k_{mn} P_n = k_{nm} P_m$ for all $n$ and $m$. However, this is not the only possibility [see Eq. (4.13)].

The principle of maximum entropy (either in its variational or dynamical form) allows us to predict or understand the evolution of systems from a non-equilibrium situation to the equilibrium state. For example, if two systems 1 and 2 are brought into thermal contact, i.e., if they are allowed to exchange energy without volume or particle exchange, the energy will flow from the system having the lowest $\partial S / \partial E$ to the system having the largest $\partial S / \partial E$, since in this way $S = S_1 + S_2$ is increased. As we shall see, $\partial S / \partial E = 1/T$, where $T$ is the absolute temperature of the system. Thus, energy can only flow spontaneously from the hottest to the coolest system. Equilibrium is reached only when the temperatures are equal.
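A two-state example with asymmetric rates illustrates the detailed balance condition (a sketch; the rate values are arbitrary, and the index convention $k_{mn}$ = rate from $n$ to $m$ follows the text):

```python
# Rates for a two-state system without microscopic reversibility
k12, k21 = 2.0, 1.0    # k12: rate 2 -> 1, k21: rate 1 -> 2

# Stationarity dP1/dt = k12*P2 - k21*P1 = 0 gives detailed balance:
# P1 / P2 = k12 / k21
P1 = k12 / (k12 + k21)
P2 = k21 / (k12 + k21)

print(P1, P2)                # 2/3 and 1/3
print(abs(P1 - P2) > 0.0)    # True: not the maximum-entropy (uniform) state
```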


4.3 Thermodynamic equivalent descriptions

We have learned that systems having a given set of additive constants of motion $f_1, \ldots, f_s$ are described by different density operators depending on the degree of isolation, i.e., on whether $f_i$ is a truly conserved property with a well-defined value ($F_i = f_i$) or whether $f_i$ is actually the average value $\langle F_i \rangle = f_i$ of a property which allows exchanges and thus fluctuates. However, we have also seen that these descriptions are essentially equivalent, since the probability of observing a significant fluctuation tends to zero as $1/\sqrt{N}$ as the size of the system grows. We therefore conclude that the various mixed states or density operators $\rho_r$ satisfying $\langle F_i \rangle = f_i$ for $i = 1, \ldots, r$ and $F_i = f_i$ for $i = r+1, \ldots, s$ are thermodynamically equivalent descriptions of the same macroscopic state. More specifically, the microcanonical, canonical and grand canonical descriptions of a system with given $E$, $N$ and $V$ are equivalent, regardless of whether $E$ is strictly conserved or just $E = \langle H \rangle$, for example.
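The $1/\sqrt{N}$ suppression of fluctuations can be made explicit for $N$ independent two-level systems in the canonical ensemble, where the mean and the variance of the energy are additive (a sketch with $k_B = 1$; the level splitting and temperature are arbitrary choices):

```python
import math

def relative_energy_fluctuation(N, beta=1.0, delta=1.0):
    # N independent two-level systems with levels (0, delta)
    p = math.exp(-beta * delta) / (1.0 + math.exp(-beta * delta))
    mean = N * delta * p                    # <E> = N * delta * p
    var = N * delta ** 2 * p * (1.0 - p)    # variances add for independent parts
    return math.sqrt(var) / mean            # Delta E / <E>  ~  1 / sqrt(N)

for N in (10**2, 10**4, 10**6):
    print(N, relative_energy_fluctuation(N))   # drops 10x per factor 100 in N
```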

We thus have different possibilities of characterizing a macroscopic state: either by means of the additive observables $f_1, \ldots, f_s$ or by any of the sets of $s$ parameters
\[
\lambda_1, \ldots, \lambda_r, f_{r+1}, \ldots, f_s ,
\]
provided that the $\lambda_i$ yield $\langle F_i \rangle = f_i$ for $i = 1, \ldots, r$. Since the $\lambda_i$, together with $f_{r+1}, \ldots, f_s$, define
\[
\rho_r = \frac{1}{Z_r} \, e^{\sum_{i=1}^{r} \lambda_i F_i} ,
\]

one could in principle attempt to obtain $\lambda_1, \ldots, \lambda_s$ as functions of $f_1, \ldots, f_s$ by solving the coupled equations
\[
f_i = \langle F_i \rangle = \mathrm{Tr}\{\rho_r F_i\} . \tag{4.16}
\]

However, the entropy and the partition function $Z_r$ provide us with a much simpler way to solve this problem.

From the definition of entropy we have

\[
S = -k_B \langle \ln \rho_r \rangle = -k_B \left[ \sum_{i=1}^{r} \lambda_i \langle F_i \rangle - \ln Z_r \right] = -k_B \left[ \sum_{i=1}^{r} \lambda_i f_i - \ln Z_r \right] . \tag{4.17}
\]

This includes in particular the microcanonical case ($r = 0$), where
\[
S = k_B \ln Z_0 = k_B \ln \Omega ,
\]

and the grand canonical case ($r = s$), where
\[
S = -k_B \sum_{i=1}^{s} \lambda_i \langle F_i \rangle = -k_B \sum_{i=1}^{s} \lambda_i f_i ,
\]

since $Z_s = 1$. It is easy to see that all the different ways to compute $S$, using different ensembles, yield the same result, since
\[
Z_r = \mathrm{Tr}\Big\{ e^{\sum_{i=1}^{r} \lambda_i F_i} \prod_{j=r+1}^{s} \delta(f_j - F_j) \Big\} = e^{-\sum_{j=r+1}^{s} \lambda_j f_j} . \tag{4.18}
\]

Substituting (4.18) in (4.17) we obtain
\[
S = -k_B \left[ \sum_{i=1}^{r} \lambda_i f_i + \sum_{j=r+1}^{s} \lambda_j f_j \right] = -k_B \sum_{i=1}^{s} \lambda_i f_i , \tag{4.19}
\]

where we have used that $\langle F_i \rangle = f_i$ for $1 \leq i \leq r$. In order to characterize the entropy, or any partition function $Z_r$, we are in principle free to use any set of $s$ independent variables. Nevertheless, it is useful to describe $Z_r$ in terms of $\lambda_1, \ldots, \lambda_r, f_{r+1}, \ldots, f_s$, since these are the variables which one controls directly in the corresponding ensemble. In the case of the entropy we have noticed that $S = k_B \ln Z_0$ is (apart from a multiplicative constant) equal to the logarithm of the microcanonical partition function. Therefore, it is useful to regard
\[
S = S(f_1, \ldots, f_s)
\]
as a function of the additive constants of motion. In the most usual explicit case we have
\[
S = S(E, V, N) .
\]

We would like to understand the dependence of $S$ on its fundamental variables $f_1, \ldots, f_s$, the additive constants of motion, by computing $\partial S / \partial f_j$ (keeping the other $f_i$ fixed), so that we can write
\[
dS = \sum_i \frac{\partial S}{\partial f_i} \, df_i ,
\]

for example,
\[
dS = \frac{\partial S}{\partial E} \, dE + \frac{\partial S}{\partial V} \, dV + \frac{\partial S}{\partial N} \, dN .
\]

In the general canonical case, it is easy to see that
\[
\frac{\partial \ln Z_r}{\partial \lambda_i} = \langle F_i \rangle = f_i \tag{4.20}
\]
for $i = 1, \ldots, r$, whereby all other natural variables $\lambda_j$ for $j \neq i$ and $f_{r+1}, \ldots, f_s$ are kept constant when performing the partial differentiation. This is another important reason for using $\lambda_1, \ldots, \lambda_r, f_{r+1}, \ldots, f_s$ to describe $Z_r$.

We intend to calculate the partial derivatives of $S$ with respect to $f_j$, keeping all other $f_k$ for $k \neq j$ fixed. From (4.17) we have for $1 \leq j \leq r$ that

\[
\left.\frac{\partial S}{\partial f_j}\right|_{f_{k \neq j}} = -k_B \left[ \lambda_j + \sum_{i=1}^{r} f_i \frac{\partial \lambda_i}{\partial f_j} - \sum_{i=1}^{r} \frac{\partial \ln Z_r}{\partial \lambda_i} \frac{\partial \lambda_i}{\partial f_j} \right]
= -k_B \lambda_j - k_B \sum_{i=1}^{r} \frac{\partial \lambda_i}{\partial f_j} (f_i - f_i)
= -k_B \lambda_j , \tag{4.21}
\]

while for $r < j \leq s$ we have
\[
\left.\frac{\partial S}{\partial f_j}\right|_{f_{k \neq j}} = -k_B \left[ \sum_{i=1}^{r} f_i \frac{\partial \lambda_i}{\partial f_j} - \left.\frac{\partial \ln Z_r}{\partial f_j}\right|_{f_{k \neq j}} \right] . \tag{4.22}
\]

Notice that the $\lambda_i$ with $1 \leq i \leq r$ depend on $f_j$ also for $r+1 \leq j \leq s$, in order to keep the averages $f_1, \ldots, f_r$ fixed. The dependence of $\ln Z_r$ on $f_j$ for $f_i$ fixed ($i \neq j$) has therefore two sources: first, $\partial \lambda_i / \partial f_j$ for $1 \leq i \leq r$, in order to keep the averages $f_1, \ldots, f_r$ fixed, and second, the explicit dependence of $Z_r$ on the additive constant of motion $f_j$ for $r+1 \leq j \leq s$. Developing Eq. (4.22) we have

\[
\frac{\partial S}{\partial f_j} = -k_B \left[ \sum_{i=1}^{r} f_i \frac{\partial \lambda_i}{\partial f_j} - \sum_{i=1}^{r} \frac{\partial \ln Z_r}{\partial \lambda_i} \frac{\partial \lambda_i}{\partial f_j} - \frac{\partial \ln Z_r}{\partial f_j} \right] .
\]


Using Eq. (4.20) the first two terms cancel. Moreover, recalling that $\ln Z_r = -\sum_{j=r+1}^{s} \lambda_j f_j$, we have $\partial \ln Z_r / \partial f_j = -\lambda_j$ due to the additivity of $\ln \rho_r$. Thus, we finally have
\[
\left.\frac{\partial S}{\partial f_j}\right|_{f_{i \neq j}} = k_B \left.\frac{\partial \ln Z_r}{\partial f_j}\right|_{\lambda_i} = -k_B \lambda_j \tag{4.23}
\]

for $r < j \leq s$. In conclusion, we obtain the important result
\[
\left.\frac{\partial S}{\partial f_j}\right|_{f_i} = -k_B \lambda_j \tag{4.24}
\]

for all $j = 1, \ldots, s$.

Example: Consider a single-component system with constants of motion $E$, $N$ and $V$. In the grand canonical ensemble we write
\[
\rho_{gc} = e^{-\beta(H - \mu N + pV)} .
\]

The derivatives of $S = S(E, N, V)$ are then
\[
\left.\frac{\partial S}{\partial E}\right|_{N,V} = -k_B(-\beta) = \frac{1}{T} , \tag{4.25}
\]
\[
\left.\frac{\partial S}{\partial N}\right|_{E,V} = -k_B(\beta\mu) = -\frac{\mu}{T} \tag{4.26}
\]
and
\[
\left.\frac{\partial S}{\partial V}\right|_{E,N} = -k_B(-\beta p) = \frac{p}{T} . \tag{4.27}
\]
We may therefore write
\[
dS = \frac{1}{T} \, dE - \frac{\mu}{T} \, dN + \frac{p}{T} \, dV . \tag{4.28}
\]

The entropy allows us to calculate the parameters $\lambda_1, \ldots, \lambda_r$ directly from the derivatives with respect to $f_1, \ldots, f_r$, without needing to solve the equations $f_i = f_i(\lambda_1, \ldots, \lambda_r) = \langle F_i \rangle$ for $1 \leq i \leq r$.

In addition we know from (4.19) that
\[
S = -k_B \sum_{i=1}^{s} \lambda_i f_i . \tag{4.29}
\]

Taking into account (4.24) we conclude that $S$ satisfies the Euler condition
\[
S = \sum_{i=1}^{s} \frac{\partial S}{\partial f_i} \, f_i . \tag{4.30}
\]

Applying Euler's theorem (see Section 4.3.1) we conclude that the entropy is a homogeneous function of degree 1 and thus satisfies
\[
S(\alpha f_1, \ldots, \alpha f_s) = \alpha \, S(f_1, \ldots, f_s) \tag{4.31}
\]
for all $\alpha > 0$. This important property means that if one changes the values $f_i$ of all additive constants of motion by the same factor $\alpha$, then the entropy also changes by the factor $\alpha$. Properties of this kind are said to be extensive.

Consider now two isolated systems $A$ and $B$ with additive constants of motion $f_i^A$ and $f_i^B$. The entropy of the ensemble, before allowing any exchange of $f_i$, is $S(f_1^A, \ldots, f_s^A) + S(f_1^B, \ldots, f_s^B)$, since $S$ is additive. If exchanges are now allowed, the pair $AB$ (which is in general out of equilibrium at the beginning) will evolve towards equilibrium by exchanging $f_i$, keeping the sum $f_i^A + f_i^B$ constant. Since equilibration occurs spontaneously, the entropy must increase (or remain the same) during this process. Therefore, the entropy of the ensemble $S(f_1^A + f_1^B, \ldots, f_s^A + f_s^B)$ at equilibrium satisfies
\[
S(f_1^A, \ldots, f_s^A) + S(f_1^B, \ldots, f_s^B) \leq S(f_1^A + f_1^B, \ldots, f_s^A + f_s^B) .
\]

The equal sign holds when the two systems were already in equilibrium before being put together.

In order to discuss the previous statement in more detail, suppose that the ratios between all the $f_i$'s are the same in both systems:
\[
\frac{f_i^A}{f_1^A} = \frac{f_i^B}{f_1^B} , \qquad \frac{f_i^B}{f_i^A} = \frac{f_1^B}{f_1^A} = \alpha , \qquad f_i^B = \alpha f_i^A \;\; \forall i .
\]

It then follows that
\[
S(f_i^A) + S(f_i^B) = S(f_i^A) + S(\alpha f_i^A) = S(f_i^A)(1 + \alpha) = S[(1 + \alpha) f_i^A] = S(f_i^A + f_i^B) .
\]

One concludes that if the ratios $f_i / f_j$ are kept constant, the entropy is just proportional to the size of the system.

What about the converse? What relation holds between the intensive properties $\lambda_i$ of two systems $A$ and $B$ for which the entropy of the equilibrated ensemble $AB$ is simply equal to the sum of the entropies before contact? In this case
\[
S(f_i^A) + S(f_i^B) = S(f_i^A + f_i^B)
\]
implies
\[
\frac{\partial S}{\partial f_i^A} = \left.\frac{\partial S}{\partial f_i}\right|_{f = f_i^A + f_i^B} = \frac{\partial S}{\partial f_i^B} \;\Rightarrow\; \lambda_i^A = \lambda_i^B
\]

for all $i$. We conclude that the equal sign holds when the parts were already in equilibrium before exchanges of the additive constants of motion are allowed.

The coefficients $\lambda_i$ defining $\rho_r$ and $Z_r$ are partial derivatives of $S$, which is a homogeneous function of first degree:
\[
\frac{\partial S}{\partial f_j} = -k_B \lambda_j .
\]

Therefore, they are homogeneous functions of degree zero, i.e.,
\[
\lambda_i(\alpha f_1, \ldots, \alpha f_s) = \lambda_i(f_1, \ldots, f_s) . \tag{4.32}
\]


Such properties are said to be intensive. They do not change when the extensive properties are scaled, i.e., changed by keeping the ratios $f_i / f_j$ constant. The $\lambda_i$ depend only on $s - 1$ independent ratios, for example, on $f_2/f_1, f_3/f_1, \ldots, f_s/f_1$. That the $\lambda_i$ are not all independent is consistent with the fact that they satisfy the equation of state given by the normalization condition $\mathrm{Tr}\{\rho_{gc}\} = 1$.

Examples of intensive properties are the temperature $T$ or $\beta = 1/k_B T$, the chemical potential $\mu$, and the pressure $p$. Note that any function of intensive properties is also intensive. In particular $\beta$, $\mu$ and $p$ depend on the intensive variables $E/N$ and $V/N$.

To summarize, a few comments are due:

i) The knowledge of the parameters $\lambda_1, \ldots, \lambda_s$ gives us no information on the size of the system. For instance, knowing the temperature and pressure gives no information on the number of atoms or the volume.

ii) The $\lambda_i$ with $i = 1, \ldots, s$ cannot all be independent of each other, i.e., they cannot all be varied at will, since they depend on only $s - 1$ densities, such as $f_2/f_1, f_3/f_1, \ldots, f_s/f_1$. Therefore, $T$, $p$ and $\mu$ cannot be chosen at will. As an example, consider a non-interacting classical gas, for which we have $pV = N k_B T$, or $p \frac{V}{N} = k_B T$.

iii) The relation between the $\lambda_i$'s depends of course on the type of system that we are studying, i.e., on its Hamiltonian, composition and interactions, not on the size of the system. This relation is known as the equation of state of the material. It is a result of statistical mechanics, which cannot be obtained by thermodynamical arguments.

iv) The dependence of the $\lambda_i$ on each other has already been mentioned when introducing the grand canonical density operator
\[
\rho_{gc} = e^{\sum_{i=1}^{s} \lambda_i F_i} ,
\]
which satisfies
\[
Z_{gc} = \mathrm{Tr}\{ e^{\sum_{i=1}^{s} \lambda_i F_i} \} = 1 .
\]
In fact, the condition $Z_{gc} = 1$ defines the relation between the intensive quantities $\lambda_1, \ldots, \lambda_s$ known as the equation of state. Note that $Z_{gc}$ depends only on the $\lambda_i$ and on the operators defining the system under study: $H$, $N$, $V$.

The logarithm of the partition function $Z_r$ is also extensive. This can be seen by noting that
\[
k_B \ln Z_r = S + k_B \sum_{i=1}^{r} \lambda_i f_i \tag{4.33}
\]
from Eq. (4.17), or by the relation
\[
\ln Z_r = -\sum_{i=r+1}^{s} \lambda_i f_i . \tag{4.34}
\]


This means that
\[
\ln Z_r(\lambda_1, \ldots, \lambda_r, \alpha f_{r+1}, \ldots, \alpha f_s) = \alpha \ln Z_r(\lambda_1, \ldots, \lambda_r, f_{r+1}, \ldots, f_s) . \tag{4.35}
\]

From Eq. (4.23) we know that
\[
\frac{\partial \ln Z_r}{\partial f_j} = -\lambda_j \tag{4.36}
\]
for $r + 1 \leq j \leq s$. Replacing in Eq. (4.34) we have
\[
\ln Z_r = \sum_{i=r+1}^{s} f_i \frac{\partial \ln Z_r}{\partial f_i} ,
\]

which is consistent with Euler's theorem.

As a final consequence of the extensiveness of the logarithm of the partition functions, it is interesting to revisit the dependence of the density of states $\Omega(E)$ on system size. We know that $S(E, V, N) = k_B \ln \Omega(E)$ is an extensive property, which implies $\ln \Omega(E, V, N) = N \ln \Omega(E/N, V/N, 1)$. Denoting $\omega(E/N, V/N) = \Omega(E/N, V/N, 1)$ we have
\[
\Omega(E, V, N) = \left[ \omega\!\left( \frac{E}{N}, \frac{V}{N} \right) \right]^N , \tag{4.37}
\]

which confirms that the density of states is an extremely rapidly increasing function of the system size.

4.3.1 Euler theorem for homogeneous functions

Let $\varphi : \mathbb{R}^n \to \mathbb{R}$ be a continuously differentiable function. We say that $\varphi$ is homogeneous of degree $k$ if
\[
\varphi(\alpha \vec{x}) = \alpha^k \varphi(\vec{x}) .
\]

Euler's theorem states that $\varphi(x_1, \ldots, x_n)$ is homogeneous of degree $k$ if and only if
\[
\sum_{i=1}^{n} \frac{\partial \varphi}{\partial x_i} \, x_i = k \varphi . \tag{4.38}
\]
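The theorem is easy to check numerically for a concrete homogeneous function (a sketch; $\varphi(x, y) = x^2 y / (x + y)$, homogeneous of degree 2, is an arbitrary example, and the derivatives are taken by finite differences):

```python
def phi(x, y):
    # Homogeneous of degree 2: phi(a*x, a*y) = a**2 * phi(x, y)
    return x ** 2 * y / (x + y)

def euler_residual(x, y, k=2, h=1e-6):
    dphi_dx = (phi(x + h, y) - phi(x - h, y)) / (2.0 * h)
    dphi_dy = (phi(x, y + h) - phi(x, y - h)) / (2.0 * h)
    return abs(x * dphi_dx + y * dphi_dy - k * phi(x, y))

x, y = 1.3, 2.7
print(euler_residual(x, y) < 1e-7)                     # True: Eq. (4.38)
print(abs(phi(2 * x, 2 * y) - 4 * phi(x, y)) < 1e-12)  # True: degree 2
```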

Proof: Let us assume that $\varphi(\alpha \vec{x}) = \alpha^k \varphi(\vec{x})$. Then
\[
\frac{d \varphi(\alpha \vec{x})}{d\alpha} = k \alpha^{k-1} \varphi(\vec{x}) .
\]

In addition we know that in general
\[
\frac{d \varphi(\alpha x_1, \ldots, \alpha x_n)}{d\alpha} = \sum_i \left.\frac{\partial \varphi}{\partial x_i}\right|_{\alpha \vec{x}} x_i . \tag{4.39}
\]

Consequently,
\[
\sum_i \left.\frac{\partial \varphi}{\partial x_i}\right|_{\alpha \vec{x}} x_i = k \alpha^{k-1} \varphi(\vec{x}) .
\]


Multiplying by $\alpha$ we obtain
\[
\sum_i \left.\frac{\partial \varphi}{\partial x_i}\right|_{\alpha \vec{x}} \alpha x_i = k \varphi(\alpha \vec{x}) ,
\]
which is Eq. (4.38) at the point $\alpha \vec{x}$; taking $\alpha = 1$ yields Eq. (4.38) at $\vec{x}$.

In order to prove the converse we start from Eq. (4.38) at the point $\alpha \vec{x}$:
\[
\alpha \sum_i \left.\frac{\partial \varphi}{\partial x_i}\right|_{\alpha \vec{x}} x_i = k \varphi(\alpha \vec{x}) ,
\]
which combined with the general relation (4.39) implies
\[
\alpha \frac{d}{d\alpha} \varphi(\alpha \vec{x}) = k \varphi(\alpha \vec{x}) .
\]

The solution of this differential equation is $\varphi(\alpha \vec{x}) = A \alpha^k$, where $A$ is independent of $\alpha$ and thus $A = \varphi(\vec{x})$. We conclude that $\varphi(\vec{x}) \alpha^k = \varphi(\alpha \vec{x})$, which means that $\varphi(\vec{x})$ is homogeneous of degree $k$.

Lemma: If $\varphi(\vec{x})$ is homogeneous of degree $k$, then $\frac{\partial \varphi}{\partial x_j}$ is homogeneous of degree $k - 1$ for all $j$. The homogeneity of $\varphi(\vec{x})$ implies
\[
\sum_{i=1}^{n} x_i \frac{\partial \varphi}{\partial x_i} = k \varphi
\;\Rightarrow\;
\sum_{i=1}^{n} x_i \frac{\partial}{\partial x_i}\!\left( \frac{\partial \varphi}{\partial x_j} \right) + \frac{\partial \varphi}{\partial x_j} = k \frac{\partial \varphi}{\partial x_j}
\;\Rightarrow\;
\sum_{i=1}^{n} x_i \frac{\partial}{\partial x_i}\!\left( \frac{\partial \varphi}{\partial x_j} \right) = (k - 1) \frac{\partial \varphi}{\partial x_j} .
\]
This means that $\frac{\partial \varphi}{\partial x_j}$ is homogeneous of degree $k - 1$ for all $j$.

4.4 Thermodynamic potentials: General formulation

Energy plays a central role in any mechanical theory, and statistical mechanics is no exception. We have seen that this reflects itself in the various canonical and grand canonical ensembles and in the convention used for denoting the intensive variables $\lambda_i$ associated to the observables $F_i$. The first and most important observable is $F_1 = H$, and the corresponding parameter is $\lambda_1 = -\beta$, so that the canonical density operator is

\[
\rho_c = \frac{1}{Z_c} \, e^{-\beta H}
\]
with
\[
Z_c = \mathrm{Tr}\{ e^{-\beta H} \} .
\]

We have also seen that
\[
\left.\frac{\partial S}{\partial E}\right|_{N,V} = k_B \beta .
\]


If two previously isolated systems 1 and 2 are brought into contact, so that energy can be exchanged between them, some energy transfer $\Delta E_1 = -\Delta E_2$ will in general occur until a new equilibrium situation is reached. Since the ensemble of 1 and 2 is supposed to be isolated, we have that $E = E_1 + E_2$ remains constant. The additivity of $S$ implies that
\[
S = S_1(E_1) + S_2(E_2) = S_1(E_1) + S_2(E - E_1) .
\]

The change of entropy associated to some small energy transfer $\Delta E_1 = -\Delta E_2$ is
\[
\Delta S = \frac{\partial S_1}{\partial E} \Delta E_1 - \frac{\partial S_2}{\partial E} \Delta E_1 = k_B (\beta_1 - \beta_2) \Delta E_1 + O(\Delta E_1^2) .
\]

Since $\Delta S$ must necessarily be positive or zero, we have
\[
\Delta E_1 > 0 \;\Leftrightarrow\; \beta_1 > \beta_2
\]
or, equivalently,
\[
\Delta E_1 = -\Delta E_2 > 0 \;\Leftrightarrow\; \frac{1}{\beta_1} < \frac{1}{\beta_2} .
\]

Thus, energy is absorbed by the system having the smallest $1/\beta$, and equilibrium is reached when $\beta_1 = \beta_2$, i.e., when $\Delta S = 0$.

Notice that the total entropy change $\Delta S = \Delta S_1 + \Delta S_2$ is always positive. However, this is not necessarily true for the subparts. For example, for $\beta_1 > \beta_2$, we have $\Delta S_1 = k_B \beta_1 \Delta E_1 > 0$ and $\Delta S_2 = k_B \beta_2 \Delta E_2 = -k_B \beta_2 \Delta E_1 < 0$. The total entropy increases, the entropy of the cooler subsystem increases, and the entropy of the hotter subsystem decreases. Therefore, $1/\beta$ has the properties of a temperature in the thermodynamical sense. We therefore define the absolute temperature

\[
T = \frac{1}{k_B \beta} , \qquad \beta = \frac{1}{k_B T} . \tag{4.40}
\]

The other intensive variables are redefined with respect to $\beta$ as
\[
\lambda_i = -\beta \alpha_i ,
\]
or equivalently
\[
-k_B \lambda_i = \frac{\alpha_i}{T}
\]
for $i = 2, \ldots, s$. For $i = 1$ we have $\lambda_1 = -\beta$, $\alpha_1 = 1$ and $f_1 = E$. As we shall see, the intensive variables $\alpha_i$ are physically more appealing than the $\lambda_i$'s. For example, $\alpha_2 = -\mu$ and $\alpha_3 = p$.

Consequently, we write
\[
\rho_r = \frac{1}{Z_r} \, e^{-\beta (H + \sum_{i=2}^{r} \alpha_i F_i)} \tag{4.41}
\]
with
\[
Z_r = \mathrm{Tr}\{ e^{-\beta (H + \sum_{i=2}^{r} \alpha_i F_i)} \} . \tag{4.42}
\]


Furthermore, we have seen that $\ln Z_r$ is an extensive property from which a good deal of important information on the equilibrium state can be derived, particularly in the form of derivatives with respect to $\lambda_i$ and $f_i$. For instance, we know that
\[
\frac{\partial \ln Z_r}{\partial \lambda_i} = \langle F_i \rangle = f_i .
\]
This is equivalent to
\[
-\frac{1}{\beta} \frac{\partial \ln Z_r}{\partial \alpha_i} = f_i . \tag{4.43}
\]

It is therefore very useful to introduce the thermodynamic potential
\[
\Phi_r = -\frac{1}{\beta} \ln Z_r = -k_B T \ln Z_r \tag{4.44}
\]
for the canonical ensemble that is open with respect to $r$ additive constants of motion. In terms of $\Phi_r$ we have
\[
\frac{\partial \Phi_r}{\partial \alpha_i} = f_i \tag{4.45}
\]

for $i = 2, \ldots, r$. Concerning the remaining quantities $f_j$ with $j = r+1, \ldots, s$, we have
\[
\frac{\partial \Phi_r}{\partial f_j} = -\frac{1}{\beta} \frac{\partial \ln Z_r}{\partial f_j} = \frac{1}{\beta} \lambda_j = -\alpha_j . \tag{4.46}
\]

In order to compute $\partial \Phi_r / \partial T$ it is useful to note the following simple relation:
\[
T \frac{\partial}{\partial T} = -\beta \frac{\partial}{\partial \beta} . \tag{4.47}
\]

We then have
\[
\left. T \frac{\partial \Phi_r}{\partial T} \right|_{\alpha_i} = -\beta \frac{\partial}{\partial \beta} \left( -\frac{1}{\beta} \ln Z_r \right) = \beta \left. \frac{\partial}{\partial \beta} \left( \frac{1}{\beta} \ln Z_r \right) \right|_{\alpha_i}
= \beta \left[ -\frac{1}{\beta^2} \ln Z_r + \frac{1}{\beta} \sum_{i=1}^{r} \frac{\partial \ln Z_r}{\partial \lambda_i} \frac{\partial \lambda_i}{\partial \beta} \right] .
\]

Since $\lambda_1 = -\beta$ and $\lambda_i = -\beta \alpha_i$ for $i = 2, \ldots, r$, and recalling that $\frac{\partial \ln Z_r}{\partial \lambda_i} = f_i$ for $1 \leq i \leq r$ ($f_1 = E$), we have
\[
\left. T \frac{\partial \Phi_r}{\partial T} \right|_{\alpha_i} = -k_B T \ln Z_r - E - \sum_{i=2}^{r} \alpha_i f_i = \Phi_r - E - \sum_{i=2}^{r} \alpha_i f_i . \tag{4.48}
\]


Let us now express $\Phi_r$ in terms of $E$, $S$ and the intensive variables $T, \alpha_2, \ldots, \alpha_r$. From (4.19) we have
\[
\Phi_r = -k_B T \ln Z_r = -TS - k_B T \sum_{i=1}^{r} \lambda_i f_i
= -TS + E + \sum_{i=2}^{r} \alpha_i f_i
= E - TS + \sum_{i=2}^{r} \alpha_i f_i . \tag{4.49}
\]

We also know that
\[
\ln Z_r = -\sum_{j=r+1}^{s} \lambda_j f_j ,
\]
which implies
\[
\Phi_r = -\frac{1}{\beta} \ln Z_r = \sum_{j=r+1}^{s} \frac{\lambda_j}{\beta} f_j = -\sum_{j=r+1}^{s} \alpha_j f_j . \tag{4.50}
\]

Coming back to Eq. (4.48) and replacing in it Eq. (4.49), we obtain
\[
T \frac{\partial \Phi_r}{\partial T} = E - TS + \sum_{i=2}^{r} \alpha_i f_i - E - \sum_{i=2}^{r} \alpha_i f_i ,
\]
which implies
\[
\frac{\partial \Phi_r}{\partial T} = -S . \tag{4.51}
\]

For the sake of completeness we should give the derivatives of $S$ with respect to $f_i$ in terms of $T$ and $\alpha_i$. Eq. (4.24) reads
\[
\frac{\partial S}{\partial f_1} = -k_B \lambda_1 = k_B \beta = \frac{1}{T}
\]
and
\[
\frac{\partial S}{\partial f_i} = -k_B \lambda_i = k_B \beta \alpha_i = \frac{\alpha_i}{T}
\]

for $i \geq 2$. The different thermodynamic potentials $\Phi_r$ are related through Legendre transformations. In order to obtain $\Phi_r$ from $\Phi_{r-1}$ one should first solve
\[
\frac{\partial \Phi_{r-1}}{\partial f_r}(T, \alpha_2, \ldots, \alpha_{r-1}, f_r, \ldots, f_s) = -\alpha_r
\]
in order to obtain
\[
f_r = f_r(T, \alpha_2, \ldots, \alpha_r, f_{r+1}, \ldots, f_s) .
\]
The thermodynamic potential is then given by
\[
\Phi_r(T, \alpha_2, \ldots, \alpha_r, f_{r+1}, \ldots, f_s) = \Phi_{r-1}(T, \alpha_2, \ldots, \alpha_{r-1}, f_r, \ldots, f_s) + \alpha_r f_r . \tag{4.52}
\]


It is easy to verify that $\Phi_r$, constructed in this way, has the same partial derivatives as $\Phi_r$ given by Eq. (4.44), and therefore coincides with it. In fact, from Eq. (4.52) we have
\[
\frac{\partial \Phi_r}{\partial T} = \frac{\partial \Phi_{r-1}}{\partial T} + \underbrace{\frac{\partial \Phi_{r-1}}{\partial f_r}}_{-\alpha_r} \frac{\partial f_r}{\partial T} + \alpha_r \frac{\partial f_r}{\partial T} = -S
\]
and
\[
\frac{\partial \Phi_r}{\partial \alpha_i} = \frac{\partial \Phi_{r-1}}{\partial \alpha_i} + \underbrace{\frac{\partial \Phi_{r-1}}{\partial f_r}}_{-\alpha_r} \frac{\partial f_r}{\partial \alpha_i} + \alpha_r \frac{\partial f_r}{\partial \alpha_i} = f_i
\]
for $2 \leq i \leq r - 1$. Moreover,
\[
\frac{\partial \Phi_r}{\partial \alpha_r} = \underbrace{\frac{\partial \Phi_{r-1}}{\partial f_r}}_{-\alpha_r} \frac{\partial f_r}{\partial \alpha_r} + f_r + \alpha_r \frac{\partial f_r}{\partial \alpha_r} = f_r
\]
and finally
\[
\frac{\partial \Phi_r}{\partial f_j} = \underbrace{\frac{\partial \Phi_{r-1}}{\partial f_r}}_{-\alpha_r} \frac{\partial f_r}{\partial f_j} + \underbrace{\frac{\partial \Phi_{r-1}}{\partial f_j}}_{-\alpha_j} + \alpha_r \frac{\partial f_r}{\partial f_j} = -\alpha_j
\]

for r + 1 j s.Before closing this section we would like to discuss the thermodynamic potential cor-

responding to the microcanonical ensemble, which cannot be obtained from Eq. (4.44)or Eq. (4.49). In this case the extensive properties are E, N and V and in addition theentropy S. Those four properties are of course related since E, N and V alone definethe macroscopic state.

From Eq. (4.49) and (4.50) we have

F = �1 = E � TS = �sX

i=2

↵ifi .

�1, usually denoted by F , is known as Helmholtz free energy. We thus obtain

E = E(S, f2, . . . , fs) = TS + �1 = TS �sX

i=2

↵ifi. (4.53)

The partial derivatives are given by

@E

@fi

����S,f

j 6=i

=@�1

@fi|{z}�↵

i

+@�1

@T|{z}�S

@T

@fi+ S

@T

@fi= �↵i (4.54)

for i = 2, . . . , s, and@E

@S

����fi

= T + S@T

@S+@�1

@T|{z}�S

@T

@S= T. (4.55)

75

Page 76: Introduction to Statistical Physics - uni-kassel.de · Introduction to Statistical Physics Prof. Dr. G. M. Pastor Institut für Theoretische Physik Fachbereich Mathematik und Naturwissenschaften

We may thus write

dE = TdS �sX

i=2

↵i dfi. (4.56)

The extensiveness of E implies (Euler theorem) that

E = TS �sX

i=2

↵i fi . (4.57)

The same result could have been obtained from Eq. (4.29):

S = �kB

sXi=1

�i fi =1

TE +

sXi=2

↵i

Tfi . (4.58)

76

Page 77: Introduction to Statistical Physics - uni-kassel.de · Introduction to Statistical Physics Prof. Dr. G. M. Pastor Institut für Theoretische Physik Fachbereich Mathematik und Naturwissenschaften

5 Thermodynamic properties

The purpose of this chapter is to discuss the thermodynamic potentials and derived ther-modynamic properties of practical interest by applying the general formulation developedin the previous chapter.

5.1 Thermodynamic potentials: special cases

5.1.1 Energy E and Entropy S

We consider a system composed of particles of the same kind. The extensive constantsof motion which characterize the macroscopic state are the energy E, the number ofparticles N and the volume V . A further important extensive property of statisticalinterest is the entropy S. The relation between these four properties can be expressed as

E = E(S,N, V ) ,

where the partial derivatives are the intensive properties known as temperature

@E

@S

����V,N

= T , (5.1)

chemical potential@E

@N

����S,V

= µ , (5.2)

and pressure@E

@V

����S,N

= �p. (5.3)

The physical meaning of the intensive properties becomes clear if one considers the totaldifferential

dE = TdS � pdV + µdN. (5.4)

TdS = �QR is the change in energy by constant volume (isochoric process) and constantnumber of particles. This is known as reversible heat.

The chemical potential µ represents the change in energy when a particle is added tothe system by keeping the volume and the entropy constant. In other words, µ is theenergy that a particle must bring in order to keep the entropy unchanged by constantvolume V . Finally, p is the pressure since �W = �pdV is the work done by the externalforces, when a reversible volume change occurs without involving any heat or particleexchange (dS = 0 and dN = 0). Processes without volume change (dV = 0) are knownas isochoric. When the entropy is constant we call them isoentropic. Processes withoutheat exchange (�Q = 0) are known as adiabatic. Processes in which the system is alwaysin thermodynamical equilibrium are called reversible. Thus, reversible and adiabaticmeans isoentropic since TdS = �QR = 0.

The extensiveness of E implies

E = TS � pV + µN (5.5)

77

Page 78: Introduction to Statistical Physics - uni-kassel.de · Introduction to Statistical Physics Prof. Dr. G. M. Pastor Institut für Theoretische Physik Fachbereich Mathematik und Naturwissenschaften

We can also express S = S(E,N, V ) as a function of the mechanical constant of motion:

S =E

T� T

µN +

p

TV (5.6)

anddS =

dE

T� T

µdN +

p

TdV. (5.7)

The energy and the entropy are known as thermodynamic potentials, since the thermo-dynamic properties such as T , p and µ, which characterize the macroscopic state of thesystem, can be derived from them.

5.1.2 Helmhotz free energy F

In addition, there are other four potentials which have great practical significance. Thefirst one is

F = �1 = �kBT lnZc = �kBT lnhTr{e��H}

i. (5.8)

From Eq. (4.49) we haveF = E � TS (5.9)

and applying Eqs. (4.46) and (4.51) we have

@F

@T= �S (5.10)

@F

@N= µ (5.11)

@F

@V= �p . (5.12)

This impliesdF = �S dT + µdN � p dV . (5.13)

Applying Eq. (4.50), or recalling that F is extensive and using Euler’s theorem, we canexpress F as a linear combination of its natural extensive variables V and N :

F = µN � pV . (5.14)

It is clear that all these expressions could have been derived by applying the generalformalism developed in the previous section to the present specific case, as is usuallydone in most thermodynamic textbooks. For instance, if one wants consider physicalsituations in which T instead of S is the variable under control, one performs a Lagrangetransformation on E by subtracting TS and considers the thermodynamic potentialF = E � TS. The differential of F is thus dF = dE � TdS � SdT = TdS � pdV +µdN �TdS�SdT = �SdT �pdV +µdN , which is equivalent to Eqs. (5.10)–(5.12). Theextensiveness of F , which follows from the extensiveness of E and S and the intensivenessof T , implies Eq. (5.14).

78

Page 79: Introduction to Statistical Physics - uni-kassel.de · Introduction to Statistical Physics Prof. Dr. G. M. Pastor Institut für Theoretische Physik Fachbereich Mathematik und Naturwissenschaften

In order to provide a physical interpretation to F we consider a process in which thevolume is changed keeping the temperature and the particle number constant. In thiscase we have

dF = �pdV

since dT = 0 and dN = 0. Thus, dF represents the available energy, or free energy, inisothermal processes. Note that the corresponding change in internal energy would bedE = TdS � pdV . If the system expands heat has to be absorbed in order to keep thetemperature constant.

Taking into account that

F = �kBT lnhTr

ne��H

oiwe may write

Zc = e��F (5.15)and

⇢c =1

Zce��H = e�(F�H) = e��(H�F ) (5.16)

From a statistical perspective, F can be obtained, as all the other thermodynamicalpotentials, directly from the corresponding partition function.

We may also verify that the usual statistical definition of entropy

S = �kBhln ⇢ci (5.17)

in the canonical ensemble and its well-known physical interpretation coincide with thethermodynamic concept of entropy, since Eqs. (5.17) and Eq. (5.9) are actually the same.Indeed, from Eq. (5.16) we have

�kBhln ⇢ci = kB�hH � F i = E � F

T.

5.1.3 Enthalpy H

The enthalpy is defined by means of a Legendre transformation on E in order to replacethe volume by the pressure as the natural variable:

H(S,N, p) = E + pV , (5.18)

which implies

dH = T dS + µdN � p dV + p dV + V dp

= T dS + µdN + V dp . (5.19)

Again, the extensiveness of H implies

H = TS + µN . (5.20)

In an isobaric process (dp = 0) keeping dN = 0, the exchanged reversible heat �QR = TdScorresponds to the change in enthalpy, which is therefore sometimes also called heatfunction. Notice that if �QR > 0, work has to be done by the system (dV > 0) in orderto keep the pressure constant. Thus, the change in energy is smaller than the enthalpychange: dE < dH = TdS.

79

Page 80: Introduction to Statistical Physics - uni-kassel.de · Introduction to Statistical Physics Prof. Dr. G. M. Pastor Institut für Theoretische Physik Fachbereich Mathematik und Naturwissenschaften

5.1.4 Free enthalpy G

The free enthalpy G is obtained by changing variables to T , N and µ. This is obtainedby the Legendre transformation

G = F + pV = H � TS (5.21)

which implies

dG = �S dT + µdN � p dV + p dV + V dp

= �S dT + µdN + V dp . (5.22)

In terms of N , the only extensive natural variable of G, we have

G = µN . (5.23)

Therefore, in a system having just one kind of particles, G per particle is equal to thechemical potential.

5.1.5 Grand canonical potential �

Another important thermodynamic potential is the grand canonical potential

� = �kBT lnhTr

ne��(H�µN)

oi= �kBT lnZgc , (5.24)

which is directly obtained from the grand canonical partition Zgc. The natural variablesof � are T , µ and V . In this ensemble the volume is fixed, while energy and particleexchanges are possible. From Eq. (4.45) and (4.51) we have

@�

@T= �S (5.25)

@�

@µ= �N (5.26)

@�

@V= �p . (5.27)

Consequently,d� = �S dT � p dV �N dµ (5.28)

and� = �pV = F � µN = E � TS � µN . (5.29)

The density of � per unit volume is equal to minus the pressure p = ��/V .Knowing from Eq. (5.24) that

Zgc = e��� (5.30)

we may write⇢gc = e�(��H+µN) = e��(H�µN��) . (5.31)

80

Page 81: Introduction to Statistical Physics - uni-kassel.de · Introduction to Statistical Physics Prof. Dr. G. M. Pastor Institut für Theoretische Physik Fachbereich Mathematik und Naturwissenschaften

5.2 Derived thermodynamic properties

5.2.1 Heat capacities and specific heats

Besides the thermodynamical potentials, there is a number of derived thermodynamicproperties, which have a great physical interest. One of the first question one may askin order to characterize the thermal properties of a material is how much heat one musttransfer to the system in a reversible form in order to obtain a certain temperatureincrease. This property, known as specific heat capacity, is defined by

C =�QR

dT= T

dS

dT. (5.32)

Let us recall that TdS = dE when dV = 0 and dN = 0. The amount of heat involved in agiven temperature change�T depends of course on the type of process that one considers.For example, the heat transfered in a transformation at constant pressure, having a given�T , will always be larger than the heat required for the same temperature change atconstant V , since in the former case the system will expand (assuming �QR > 0) and dowork in the environment.

One usually keeps the number of particles N constant. One possibility is to considera transformation at constant volume (isochoric process) in which case we have

CV,N = CV = T@S

@T

����V,N

=@E

@T

����V,N

. (5.33)

The other most common situation is to keep the pressure constant (isobaric process) inwhich case we have

Cp,N = Cp = T@S

@T

����p,N

=@H

@T

����p,N

. (5.34)

Notice that the heat capacities, as defined by Eqs. (5.32), (5.33) and (5.34), are extensiveproperties. Material specific values per particle or per mole, are known as specific heatcapacity or simply specific heat.

5.2.2 Compressibilities

A further interesting question is to quantify the change in volume of a system associatedto a given reversible change in pressure. One therefore defines the isothermal compress-ibility as

T,N = T = � 1

V

@V

@p

����T,N

(5.35)

and the adiabatic compressibility as

S,N = S = � 1

V

@V

@p

����S,N

. (5.36)

In the former case one compresses the system at constant temperature (keeping thesystem in contact with a thermal bath) and in the latter one compresses the system

81

Page 82: Introduction to Statistical Physics - uni-kassel.de · Introduction to Statistical Physics Prof. Dr. G. M. Pastor Institut für Theoretische Physik Fachbereich Mathematik und Naturwissenschaften

precluding any heat exchange. is obviously an intensive property. Compressing asystem at constant T allows it to release internal energy to the environment (�p > 0),which is not possible in an isoentropic process. If the system is isolated the temperaturewill increase thus rendering a reduction of volume more difficult. Consequently, we expectT > S in general.

5.2.3 Thermal expansion

A further property of general interest is the thermal expansion coefficient

↵ =1

V

@V

@p

����p,N

. (5.37)

This is an intensive property which measures the volume change resulting from a giventemperature change at constant pressure. Note that ↵ can be positive (the most usualcase) or negative. Examples of negative ↵ are found in water close to the freezing point(T < 3, 984 �C) as it approaches the transition to ice, but also in several semiconductors.

5.2.4 Charge compressibility

Finally, it is interesting to consider how the number of particles change when the chem-ical potential is changed. By analogy with the compressibility we define the chargecompressibility

c =1

N

@N

����V,T

, (5.38)

which is also known as charge susceptibility. The latter designation, borrowed from theanalogy with the magnetic susceptibility, seems less appropriate, since one should keepthe term charge susceptibility to indicate changes in the charge distribution resultingfrom the action of some external potential, which usually depends on the wave vector ~qand frequency !. The charge compressibility can be regarded as the many-body densityof states at the chemical potential µ, since �N = cN�µ. It is often used to characterizemetal-insulator transitions, since c > 0 is a characteristic of metallic behavior.

5.2.5 Assessing fluctuations

The above considered properties, which have a very clear thermodynamical interpreta-tion, should also have very interesting relations to the fluctuations of the constants ofmotion E, V and N in the equilibrium states of macroscopic systems. Let us first of allconsider the canonical ensemble with given V and N . Recalling that

E = hHi =Tr

ne��HH

oTr

ne��H

owe have

CV =@E

@T

����V,N

= � 1

T�@E

@�

����V,N

=1

kBT 2

⇣hH2i � hHi2

⌘.

82

Page 83: Introduction to Statistical Physics - uni-kassel.de · Introduction to Statistical Physics Prof. Dr. G. M. Pastor Institut für Theoretische Physik Fachbereich Mathematik und Naturwissenschaften

We conclude that the heat capacity at constant volume

CV,N =(�H)2

kBT 2� 0 (5.39)

provides a measure of the energy fluctuations. Larger specific heat implies strongerfluctuations of the energy at a given temperature. The extensitivity of CV implies thatcV = CV /N and CV /E are independent of system size. Thus, the standard deviationp

(�H)2 /pN . We recover the known resultp

(�H)2

hHi/ 1p

N,

which implies that microcanonical and canonical ensembles are equivalent for N ! 1.In order to compute the compressibility we turn to the grand-canonical pressure en-

semble and write

V = hV i =Tr

nV e��(H�µN+pV )

oTr

ne��(H�µN+pV )

o ,

where we have formally introduced the operator V measuring the volume of the system.It follows that

@V

@p

����T,N

= ��⇣hV 2i � hV i2

⌘= �(�V )2

kBT

andT,N =

1

kBT

(�V )2

V� 0 . (5.40)

Larger compressibility implies that the volume fluctuations (for a given T and V ) aremore important. The intensiveness of implies vanishing relative volume fluctuationsfor N ! 1: p

(�V )2

V=

rkBTT,N

V/ 1p

N.

In order to relate the charge compressibility to particle-number fluctutations in thegrand canonical ensemble we start from

N =Tr

ne��(H�µN)N

oTr

ne��(H�µN)

oand obtain

@N

@µ= �

⇣hN2i � hNi2

⌘=

(�N)2

kBT.

The charge crompressibility is then given by

c =1

N

@N

����T,V

=1

kBT

(�N)2

N. (5.41)

83

Page 84: Introduction to Statistical Physics - uni-kassel.de · Introduction to Statistical Physics Prof. Dr. G. M. Pastor Institut für Theoretische Physik Fachbereich Mathematik und Naturwissenschaften

The intensiveness of c implies that the relative fluctuations of the particle numberp(�N)2

N/ 1p

N

vanish for N ! 1.

5.3 Minimum free-energy theorem

In section 4.1 we have demonstrated a fundamental variational principle for the entropy

S[⇢] = �kBhln ⇢i = �kBTr {⇢ ln ⇢}

regarded as a functional of the density matrix ⇢. The maximum entropy theorem statesthat all the density operators ⇢, which comply with the normalization condition Tr⇢ = 1and yield some given average or fixed values of a complete set of additive constants ofmotion f1, . . . , fs always satisfy the inequality

S[⇢] S[⇢r] , (5.42)

where ⇢r is the corresponding equilibrium canonical density operator. It is important tobe clear about the constraints imposed to ⇢ in order that (5.42) holds:

hFii = Tr{⇢ Fi} = fi for i = 1, . . . , r

andFi = fi for i = r + 1, . . . , s.

For example, in a grand canonical ensemble energy and particles can be exchanged butthe volume is conserved (r = 2). The macrostate is characterized by some given values ofthe averages E = hHi and N = hNi and of the fixed volume V . The maximum entropytheorem says that among all the ⇢ having E = hHi and N = hNi in a volume V theequilibrium one is the one yielding the largest S [⇢] = �kBhln ⇢i. This important resultimplies the second law of thermodynamics. It allows us to predict the sense of evolutionof processes that occur spontaneously. In particular we have shown, taking advantage ofthe additivity of S that

S�fA1 , . . . , fA

s

�+ S

�fB1 , . . . , fB

s

� S

⇣fA+B1 , . . . , fA+B

s

⌘when two systems A and B are brought into contact and are allowed to exchange additiveconstants of motion (e.g., E and N).

We seek for an appropriate variational principle for the canonical and grand-canonicalensembles. To be explicit we focus on the grand canonical ensemble with a fixed volumeV , temperature T and chemical potential µ. We consider the grand canonical potential

�[⇢] = E[⇢] + µN [⇢]� TS[⇢]

= Trn⇢⇣H � µN + TkB ln ⇢

⌘o(5.43)

84

Page 85: Introduction to Statistical Physics - uni-kassel.de · Introduction to Statistical Physics Prof. Dr. G. M. Pastor Institut für Theoretische Physik Fachbereich Mathematik und Naturwissenschaften

as a functional of ⇢. The parameters T and µ characterize the ensemble and are thereforefixed, as well as the volume V . The following minimum free-energy theorem holds. Theminimum of �[⇢] among all the density operators ⇢, satisfying Tr⇢ = 1, is achieved when⇢ is equal to the equilibrium grand canonical density operator

⇢gc =1

Zgce��(H�µN).

The volume V is the same for all the ⇢. This can be written as

� [⇢gc] � [⇢] 8⇢ (Tr⇢ = 1) (5.44)

or asmin⇢

Tr⇢=1

{� [⇢]} = �[⇢gc] . (5.45)

The proof is simple if we use the maximum entropy theorem. However, in order to beable to do that, we need to keep in mind that S[⇢] is maximal among all the ⇢ yieldinga given hHi, hNi and V . We therefore perform the minimization in Eq. (5.45) in twosteps:

min⇢

{� [⇢]} = minE,N

8><>: min⇢!hHi=E

hNi=N

hTr

n⇢⇣H � µN + T kB ln ⇢

⌘oi9>=>; .

The inner minimization runs over all the ⇢ yielding some given average values of E = hHiand N = hNi, while the outer minimization removes this constraint by minimizing overall possible values of E and N . Since all the ⇢ inside the outer minimization yield thesame hHi and hNi for the given volume V , we can write

min⇢

{� [⇢]} = minE,N

8><>:E � µN + min⇢!hHi=E

hNi=N

hT kBTr {⇢ ln ⇢}

i9>=>; .

The maximum entropie theorem implies that

min⇢!hHi=E

hNi=N

hT kBTr {⇢ ln ⇢}

i= �T S(E,N, V ) ,

where S(E,N, V ) is the entropy corresponding to the average energy E, particle numberN and volume V . Moreover, the maximum S is achieved by the grand canonical densityoperator

⇢gc =1

Zgce��(H�µN)

where � = �(E,N, V ) and µ = µ(E,N, V ) are the inverse temperature and chemicalpotential yielding the given values of E, N for the volume V . In order to determine theoptimal � and µ we must perform the minimization with respect to E and N :

min⇢

{� [⇢]} = minE,N

{E � µN � T S(E,N, V )} .

85

Page 86: Introduction to Statistical Physics - uni-kassel.de · Introduction to Statistical Physics Prof. Dr. G. M. Pastor Institut für Theoretische Physik Fachbereich Mathematik und Naturwissenschaften

Denoting by F = E�µN�T S(E, V,N) the grand-canonical potential which correspondsto � and µ, we find that the optimal E and N are given by

@F

@E

����N,V

= 1� T@S

@E

����N,V

= 0 ) @S

@E=

1

T

and@F

@N

����E,N

= �µ� T@S

@N

����E,V

= 0 ) @S

@N

����E,V

= �µ

T.

Since @S/@E = kB� = 1/T and @S/@N = �kB�µ = �(µ/T ), we deduce that theminimum is achieved when T = T and µ = µ. This proves the theorem.

We finally come to the conclusion that processes occuring spontaneously at a giventemperature and chemical potential will evolve in order to minimize the grand canonicalpotential �[⇢]. Maximizing the entropy under the constraint of fixed hHi, hNi and V isequivalent to minimizing the grand canonical potential �[⇢] at constant T , µ and V . Thesame minimal property hold for the Helmhotz free energy F [⇢], regarded as a functionalof ⇢, in the case where the temperature, particle number and volume are the relevantvariables.

86

Page 87: Introduction to Statistical Physics - uni-kassel.de · Introduction to Statistical Physics Prof. Dr. G. M. Pastor Institut für Theoretische Physik Fachbereich Mathematik und Naturwissenschaften

6 Thermodynamic relations

In the last chapter we have introduced a variety of thermodynamic properties (thermody-namic potentials, derived extensive and intensive properties) which allow us to describedifferent aspects of macroscopic equilibrium states, as well as the processes connectingthem. Taking into account that there are far more properties of physical interest than thefew additive extensive or intensive quantities needed for characterizing the equilibriumstate, it is most reasonable to attempt to establish general relations between them. In ad-dition we know that the potentials and variables in thermodynamics fulfill very specific,mathematically restricting properties, for example, the extensiveness of the potentials orthe intensiveness of pressure, temperature and chemical potential. These mathematicalrelations are important in order to derive new properties from a few known ones. In ad-dition, they set rigorous constraints on the validity of microscopic models of the theory ofmatter and on the accuracy of the approximations used to solve them. It is the purposeof this section to establish some of these relations and to discuss a number of differentways to derive them.

6.1 Durhem-Gibbs relations for thermodynamic potentials

The first form of thermodynamic relations is given by the Durhem-Gibbs relation amongthe potentials. These follow from the extensive character of the potential and the cor-responding Euler differential equation

Pi

@�@x

i

xi = �. Examples of this kind of relationsare

E = TS � pV + µN

F = E � TS = µN � pV

� = E � TS � µN = �pV.

6.2 Intensive nature of the derivatives of extensive properties

Another source of relations is the intensive character of properties like p, µ and T . If weconsider, for example, the pressure p = p(T, V,N) we know that

p(T, V,N) = p(T,↵V,↵N). (6.1)

Taking the derivative with respect to ↵ we have

0 =@p

@V

����T,N

V +@p

@N

����T,V

N, (6.2)

which allow to exchange @/@V by @/@N in the case of intensive properties. Anotherexample of this kind concerns the chemical potential

µ(T, V,N) = µ(T,↵V,↵N) , (6.3)

from which we obtain0 =

@V

����T,N

V +@µ

@N

����T,V

N . (6.4)

87

Page 88: Introduction to Statistical Physics - uni-kassel.de · Introduction to Statistical Physics Prof. Dr. G. M. Pastor Institut für Theoretische Physik Fachbereich Mathematik und Naturwissenschaften

6.3 Integrability of the differential of thermodynamic potentials

We know that the thermodynamic potentials are state functions, whose values do notdepend on the previous history of the system. Therefore, the change in a thermodynamicpotential f between any two states 1 and 2 is independent of the details of the processinvolved and of the integration path:

f2 � f1 =

Z 2

1df .

This is usually stated by requiring thatIdf = 0

for any closed path. If the differential form of f is given by

df = f1 dx+ f2 dy ,

the integrability condition reads@f1@y

=@f2@x

or@

@y

✓@f

@x

◆=

@

@x

✓@f

@y

◆. (6.5)

Once applied to the thermodynamic potentials, this simple condition on the partialderivatives becomes a very useful source of thermodynamic relations.

Let us considerdF = �SdT � pdV + µdN

and write@S

@V

����T,N

=@p

@T

����V,N

. (6.6)

An analogous relation can be derived from the grand canonical potential

d� = �S dT � p dV �N dµ ,

namely,@S

@V

����T,µ

=@p

@T

����V,µ

. (6.7)

The only difference between (6.6) and (6.7) is that the derivatives are now taken for fixedµ instead of fixed N .

There are three pairs of extensive/intensive variables: (S, T ), (V, p) and (N,µ). Wecan therefore construct 23 = 8 different thermodynamic potentials, which depend ondifferent variables, by picking one variable from each pair. In each case there are threechoices for the two variables whose derivatives are exchanged. In the examples above thepotentials are F = F (T, V,N) and � = �(T, V, µ) and the variables are V and T . We

88

Page 89: Introduction to Statistical Physics - uni-kassel.de · Introduction to Statistical Physics Prof. Dr. G. M. Pastor Institut für Theoretische Physik Fachbereich Mathematik und Naturwissenschaften

have therefore 3⇥ 8 = 24 different thermodynamic relations of this form. Actually, theycan be reduced to 12, if we ignore which is the third variable that is kept fixed on bothsides of the equality. In the examples above, we could simply write

@S

@V

����T

=@p

@T

����V

regardless of whether N or µ is kept fixed.In an analogous way we obtain, starting from

dH = TdS + V dp+ µdN ,

the relation@T

@p

����S,N

=@V

@S

����p,N

, (6.8)

which looks pretty useless a priori. If we consider the free enthalpy

dG = �S dT + V dp+ µdN

we obtain@S

@p

����T

= �@V@T

����p

= �↵V (6.9)

where ↵ = 1V

@V@T

��p

is the thermal expansion coefficient. From dF = �SdT � pdV + µdNfollows

@V

����N,T

= � @p

@N

����V,T

. (6.10)

The right hand side can be transformed by replacing @p/@N by @p/@V using (6.2) toobtain

@V

����N,T

=@p

@V

����N,T

V

N=

1

N

V@V@p

��N,T

@V

����N,T

= � 1

N

1

T,N. (6.11)

Since µ is an intensive property, we can replace @µ/@V by @µ/@N and relate it to thecharge compressibility c as

@V

����N,T

= � @µ

@N

����V,T

N

V= � 1

V

N@N@µ

��V,T

= � 1

V

1

c. (6.12)

We conclude thatT,N =

V

Nc =

V

N2

@N

����V,T

, (6.13)

which relates the volume compressibility to the charge compressibility, and allows us todetermine T,N in the framework of the grand canonical ensemble (T, V, µ).

89

Page 90: Introduction to Statistical Physics - uni-kassel.de · Introduction to Statistical Physics Prof. Dr. G. M. Pastor Institut für Theoretische Physik Fachbereich Mathematik und Naturwissenschaften

One can also derive interesting relations for the derivatives of the thermodynamicalpotentials with respect to variables which are not the natural ones. Let us consider theenergy

E = TS � pV + µN

and its differentialdE = T dS � p dV + µ, dN .

We may then write

@E

@V

����T,N

= T@S

@V

����T,N

� p = T@p

@T

����V,N

� p . (6.14)

In the first step we have used that a change in volume V at constant T and N causesan energy change dE = TdS � pdV , provided that dS refers to the entropy change atconstant temperature. In the second step we have used the condition on the crossedderivatives in dF .

Alternatively we could have started from the Durhem-Gibbs relation for E and takethe derivative with respect to V straightforwardly:

@E

@V

����T,N

= T@S

@V

����T,N

� @p

@V

����T,N

V � p+@µ

@V

����T,N

N ,

which looks a lot more complicated at first. However, we know that

@V

����T,N

= � @p

@N

����T,V

=@p

@V

����T,N

V

N,

which brings us back to Eq. (6.14) as it should. Other relations can be derived in asimilar way (e.g., for @F

@V

��S,N

).

6.4 Jacobi-determinant manipulations

A large number of relations in thermodynamics are obtained by changing an extensivevariable (e.g., V ) by its conjugated intensive variable (e.g., p) in partial derivatives.This concern both exchanging derivatives with respect to these variable or changing the

variables that are kept fixed, for example, from @E@V

����T,N

to @E@p

����T,N

or from @E@T

����V,N

to

@E@T

����p,N

. These kind of manipulations can be performed in a systematic way by using the

properties of the 2⇥ 2 Jacobi determinant.Let us first recall the definition and some useful properties of the Jacobian. Given two

functions of the variables u and v, namely, f, g : R2 ! R, f = f(u, v) and g = g(u, v),the Jacobi determinant is defined by

@(f, g)

@(u, v)=

�����@f@u

@f@v

@g@u

@g@v

����� = @f

@u

@g

@v� @g

@u

@f

@v. (6.15)

90

Page 91: Introduction to Statistical Physics - uni-kassel.de · Introduction to Statistical Physics Prof. Dr. G. M. Pastor Institut für Theoretische Physik Fachbereich Mathematik und Naturwissenschaften

The properties of the determinant imply

@(f, g)

@(u, v)= �@(g, f)

@(u, v)= �@(f, g)

@(v, u)=@(g, f)

@(v, u). (6.16)

In addition we have

@(f, v)

@(u, v)=

�����@f@u

@f@v

0 1

����� = @f

@u

����v

=@(v, f)

@(v, u). (6.17)

If the variables u and v are themselves function u = u(x, y) and v(x, y), we can applythe usual chain rule for the partial derivatives:

@f

@x=@f

@u

@u

@x+@f

@v

@v

@x,

@f

@y=@f

@u

@u

@y+@f

@v

@v

@y

and similarly for g. These relations can be written in a compact matrix form as @f@x

@f@y

@g@x

@g@y

!=

@f@u

@f@v

@g@u

@g@v

! @u@x

@u@y

@v@x

@v@y

!.

Taking determinants on both sides we have

@(f, g)

@(x, y)=@(f, g)

@(u, v)· @(u, v)@(x, y)

. (6.18)

If the mapping (u, v) ! (f, g) is bijective, we can invert it, i.e., solve for u and v theequations f = f(u, v) and g = g(u, v). The functions u(f, g) and v(f, g) obtained in thisway satisfy

f�u(f 0, g0), v(f 0, g0)

�= f 0

andg�u(f 0, g0), v(f 0, g0)

�= g0 .

We can therefore write1 =

@(f, g)

@(f, g)=@(f, g)

@(u, v)· @(u, v)@(f, g)

. (6.19)

As a first application of the Jacobi determinant method we consider @p@T

��V

and try torelate it with other properties by changing the variable V by p. We thus write

@p

@T

����V

=@(p, V )

@(T, V )=@(p, V )

@(p, T )

@(p, T )

@(T, V )=@V

@T

����p

� @p

@V

����T

�=

V ↵

�@V@p

��T

=↵

T. (6.20)

91

Page 92: Introduction to Statistical Physics - uni-kassel.de · Introduction to Statistical Physics Prof. Dr. G. M. Pastor Institut für Theoretische Physik Fachbereich Mathematik und Naturwissenschaften

Notice that the sign of @p@T

��V

is the same as the sign of the thermal expansion coefficient↵.

A further interesting application is to relate the heat capacities Cp and CV by changingthe pressure p by the volume V as the variable that is kept constant:

Cp = T@S

@T

����p

= T@(S, p)

@(T, p)= T

@(S, p)

@(S, V )

@(S, V )

@(T, p).

In this first step we replace p by V , keeping S constant [@(T, p) ! @(S, V )]. We recognize

@(S, p)

@(S, V )=

1@V@p

��S

= � 1

V S.

In a second step we intend to find CV by replacing @(T, p) by @(T, V ) in the denominator.Thus we have

Cp = T

✓� 1

V S

◆@(S, V )

@(T, V )

@(T, V )

@(T, p).

Noting that

T@(S, V )

@(T, V )= T

@S

@T

����V

= CV and@(T, V )

@(T, p)=@V

@p

����T

= �V T ,

we haveCp = CV

TS

orCp

CV=TS

. (6.21)

There are also some Jacobi determinants which take a simple form and can thus beused in order to establish new relations. For example:

@(S, V )

@(T, p)=

�����@S@T

��p

@S@p

��T

@V@T

��p

@V@p

��T

����� =����� Cp

1T �↵V

V ↵ �V T

����� = �V TT

Cp + ↵2V 2 , (6.22)

where we have used that

dG = �SdT + V dp� µdN ) @S

@p

����T

= �@V@T

����p

= �↵V .

We can then rewrite CV as

CV = T@S

@T

����V

= T@(S, V )

@(T, V )= T

@(S, V )

@(T, p)

@(T, p)

@(T, V )

= (�V TCp + T↵2V 2)

✓� 1

V T

◆= Cp � T

↵2V

T,

and obtain the important relation

Cp � CV = T↵2V

T� 0. (6.23)

92

Page 93: Introduction to Statistical Physics - uni-kassel.de · Introduction to Statistical Physics Prof. Dr. G. M. Pastor Institut für Theoretische Physik Fachbereich Mathematik und Naturwissenschaften

Combining (6.21) and (6.23) we have

  1 − C_V/C_p = 1 − κ_S/κ_T = T α²V/(κ_T C_p)  ⇒  κ_T − κ_S = T α²V/C_p ≥ 0.   (6.24)

These relations show that only three of the five quantities C_V, C_p, κ_T, κ_S and α are actually independent. This is just one example of the numerous relations between thermodynamic properties, which can be verified by comparing independent experimental measurements. As a byproduct we should mention that

  C_p − C_V ≥ 0   and   κ_T − κ_S ≥ 0,   (6.25)

which can be easily understood physically.
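As a quick sanity check, Eqs. (6.21), (6.23) and (6.24) can be evaluated for a monatomic ideal gas (an illustrative assumption, not part of this section), for which α = 1/T, κ_T = 1/p, C_V = (3/2)N k_B and C_p = (5/2)N k_B. A minimal numerical sketch in units with N k_B = 1:

```python
import math

# Monatomic ideal gas with N k_B = 1, so that p V = T.
# Then alpha = 1/T, kappa_T = 1/p, C_V = 3/2 and C_p = 5/2.
def check(T, p):
    V = T / p                      # equation of state
    alpha = 1.0 / T                # thermal expansion coefficient
    kappa_T = 1.0 / p              # isothermal compressibility
    C_V, C_p = 1.5, 2.5
    kappa_S = kappa_T * C_V / C_p  # Eq. (6.21): C_p/C_V = kappa_T/kappa_S

    # Eq. (6.23): C_p - C_V = T alpha^2 V / kappa_T  (= N k_B = 1 here)
    lhs1, rhs1 = C_p - C_V, T * alpha**2 * V / kappa_T
    # Eq. (6.24): kappa_T - kappa_S = T alpha^2 V / C_p
    lhs2, rhs2 = kappa_T - kappa_S, T * alpha**2 * V / C_p
    return lhs1, rhs1, lhs2, rhs2

for T, p in [(300.0, 1.0), (77.0, 2.5)]:
    a, b, c, d = check(T, p)
    assert math.isclose(a, b) and math.isclose(c, d)
```

For this gas both identities reduce to C_p − C_V = N k_B and κ_T − κ_S = 2/(5p), independent of T, as the assertions confirm.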

Exercise 6.12: Starting from the definition of κ_S, show that κ_S = (C_V/C_p) κ_T by using simple Jacobi transformations.

The number of thermodynamic relations that one may conceive is probably inexhaustible. We shall not pursue showing further examples here, but rather derive those needed in the context of the specific problems discussed below.

6.5 Measuring the absolute temperature scale

In the thermodynamic relations discussed in the previous section the absolute temperature scale T appears explicitly, either as the parameter which is varied (∂/∂T) or kept constant, or as a multiplicative factor. This means that the derived equations would not hold if T were replaced by an arbitrary empirical temperature θ. One may therefore use these relations in order to determine the absolute temperature T = T(θ) as a function of some empirical temperature θ (e.g., the voltage across a thermocouple, the length of a Hg filament, etc.).

In order to illustrate this remarkable possibility, we consider the following properties:

i) The Joule-Thomson coefficient

  µ_JT = ∂T/∂p|_H = (dT/dθ) ∂θ/∂p|_H = µ̃_JT dT/dθ,   (6.26)

which measures the temperature change following an isenthalpic pressure change. On the left-hand side of (6.26) we have the true Joule-Thomson coefficient, while on the right-hand side µ̃_JT = ∂θ/∂p|_H is the coefficient measured according to θ = θ(T).


ii) The heat capacity

  C_p = ∂H/∂T|_p = ∂H/∂θ|_p (dθ/dT) = C̃_p / (dT/dθ),   (6.27)

where C̃_p = ∂H/∂θ|_p is the measured heat capacity.

iii) The expansion coefficient

  α = (1/V) ∂V/∂T|_p = (1/V) ∂V/∂θ|_p (dθ/dT) = α̃ / (dT/dθ),   (6.28)

where α̃ = (1/V) ∂V/∂θ|_p is the measured expansion coefficient.

We would like to express µ_JT in terms of known quantities, including in particular C_p and α:

  µ_JT = ∂T/∂p|_H = ∂(T,H)/∂(p,H) = [∂(T,H)/∂(T,p)] [∂(T,p)/∂(p,H)] = − (∂H/∂p|_T) / C_p,   (6.29)

where we have used ∂(T,p)/∂(p,H) = −1/(∂H/∂T|_p) = −1/C_p.

From dH = T dS + V dp + µ dN we have

  ∂H/∂p|_{T,N} = T ∂S/∂p|_T + V,   (6.30)

and from dG = −S dT + V dp + µ dN we have

  −∂S/∂p|_T = ∂V/∂T|_p.   (6.31)

Thus,

  ∂H/∂p|_T = T ∂S/∂p|_T + V = V − T ∂V/∂T|_p = V(1 − Tα).

We therefore have

  µ_JT = (V/C_p)(Tα − 1).   (6.32)

Replacing (6.26), (6.27) and (6.28) in (6.32) we have

  µ̃_JT (dT/dθ) = (V/C̃_p) (dT/dθ) [T α̃/(dT/dθ) − 1]

  ⇒  (µ̃_JT + V/C̃_p) (dT/dθ) = V T α̃/C̃_p

  ⇒  d ln T/dθ = V α̃ / (µ̃_JT C̃_p + V).

The measurement of µ̃_JT, C̃_p and α̃ at a volume V, using an arbitrary temperature scale and an arbitrary substance, allows us to determine the absolute temperature except for an arbitrary multiplicative constant. This constant has to be fixed by convention. In this context it is important to notice that all thermodynamic relations are invariant with respect to a scaling transformation T′ = aT, where a is a constant. The convention is such that 1 K is the temperature of the triple point of water divided by 273.16 or, equivalently, that the triple point of water is 273.16 K. Since in the Celsius scale the triple-point temperature is 0.01 °C, the absolute zero corresponds to −273.15 °C.
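The last relation can be turned into a numerical recipe: integrate d ln T/dθ = V α̃/(µ̃_JT C̃_p + V) over the empirical scale θ. The sketch below is a hypothetical illustration, not taken from the text: the "measured" data are synthesized from an ideal gas observed on a deliberately nonlinear empirical scale T = T₀√θ (so that µ̃_JT = 0 and α̃ = 1/(2θ)), and the reconstruction must recover T(θ_b)/T(θ_a) = √(θ_b/θ_a):

```python
import math

# Hypothetical "measured" data on an arbitrary empirical scale theta,
# synthesized from an ideal gas whose true temperature is T = T0*sqrt(theta).
def alpha_meas(theta):   # measured expansion coefficient (1/V) dV/dtheta|_p
    return 1.0 / (2.0 * theta)

def mu_jt_meas(theta):   # measured Joule-Thomson coefficient (ideal gas: 0)
    return 0.0

def cp_meas(theta):      # measured heat capacity dH/dtheta|_p (value irrelevant here)
    return 2.5 / (2.0 * math.sqrt(theta))

def V(theta):            # measured volume (overall constants cancel below)
    return math.sqrt(theta)

# d ln T/d theta = V*alpha_meas/(mu_jt_meas*cp_meas + V), integrated with the
# trapezoidal rule between theta_a and theta_b; returns T(theta_b)/T(theta_a).
def temperature_ratio(theta_a, theta_b, n=20000):
    f = lambda t: V(t) * alpha_meas(t) / (mu_jt_meas(t) * cp_meas(t) + V(t))
    h = (theta_b - theta_a) / n
    s = sum(0.5 * (f(theta_a + i * h) + f(theta_a + (i + 1) * h)) * h
            for i in range(n))
    return math.exp(s)

assert math.isclose(temperature_ratio(1.0, 4.0), math.sqrt(4.0), rel_tol=1e-6)
```

The multiplicative constant T₀ drops out, as it must: only temperature ratios are fixed by the measurements, and the overall scale is set by the kelvin convention.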


7 Thermodynamic processes

Thermodynamic processes are present in our everyday life, for example in a number of processes in the atmosphere and in many technological applications (internal combustion engines, refrigerators, etc.). In many cases the process is cyclic: a working substance (typically a gas) undergoes the same cyclic process repeatedly. Very often the substance is replaced by a new initial substance at the beginning of each cycle (e.g., the vaporized water in a steam engine or the fuel in an internal combustion engine). Since the principles of cyclic processes are not affected by this, we shall ignore the replacement and assume that the particle number N is constant. Thus, two thermodynamic variables [e.g., (p, V) or (S, T)] are enough to characterize the equilibrium state of the substance. We shall further assume that the processes are quasi-static, so that equilibrium can be taken for granted. Irreversible processes are discussed in a forthcoming section.

It is useful to represent the process in both a (T, S) and a (p, V) diagram, since the areas T dS and p dV under the process curves have a clear physical meaning. The area under the T = T(S) curve represents the heat

  Q₁₂ = ∫₁² T dS   (7.1)

absorbed by the system. In a cyclic process the heat absorbed in a cycle is

  Q = ∮ T dS.   (7.2)

The area under the p = p(V) curve is the work done by the system, i.e., the work

  W₁₂ = ∫₁² p dV   (7.3)

delivered by the system to the environment. In a cycle we have

  W = ∮ p dV.   (7.4)

[GRAFIK]

Since the energy is a state function and dE = T dS − p dV, we have

  ∮ dE = 0   (7.5)

and

  Q − W = 0.   (7.6)

Thus, with the appropriate scale, the areas enclosed by the cycles in the (T, S) and (p, V) diagrams are the same; see Eqs. (7.2), (7.4) and (7.6). Moreover, since the signs of Q and W are the same, the cycles run in the same sense in both diagrams. Counterclockwise processes are also known as left processes and clockwise processes as right processes. Clockwise processes have Q = W > 0: a net amount of heat is absorbed and work is delivered. These are working machines. Counterclockwise processes have Q = W < 0: work is absorbed and heat is delivered. These are refrigerators, or heat pumps from lower to higher temperatures.

The cyclic processes can usually be subdivided into simpler subprocesses in which a given thermodynamic property is kept constant. One distinguishes in particular isothermal, isobaric, isochoric and isentropic processes; the latter are also known as adiabatic. These can be illustrated as follows:

[GRAFIK]

The different slopes of the isothermal and isentropic curves in the (p, V) diagram, and of the isochoric and isobaric curves in the (T, S) diagram, can be inferred from the relations κ_T − κ_S ≥ 0 and C_p − C_V ≥ 0 [see Eqs. (6.23) and (6.24)].

7.1 Carnot cycle

The most important cyclic process is Carnot's cycle, which is obtained by alternating isothermal and adiabatic subprocesses:

[GRAFIK]

All the heat exchanges occur at constant temperature. As a clockwise process the system absorbs the heat

  Q_2 = T_2 (S_2 − S_1) > 0   (7.7)

from the reservoir at the higher temperature T_2, and rejects the smaller amount of heat

  Q_1 = T_1 (S_2 − S_1) > 0   (7.8)

to the cooler reservoir at T_1. The total heat exchanged in a cycle is

  Q = Q_2 − Q_1 = W > 0.   (7.9)

The work efficiency or energy-conversion efficiency of a machine is defined by

  η_W = W/Q_2 = (work done)/(heat absorbed).   (7.10)

In a Carnot cycle this takes the value

  η_W^Carnot = (T_2 − T_1)/T_2 < 1,   (7.11)

where T_2 > T_1.

[GRAFIK]

It is easy to see, using ∮ δQ/T ≤ 0, that no machine involving heat baths in the temperature range [T_1, T_2] can exceed the work efficiency of the Carnot cycle. We consider an arbitrary cycle in which the system exchanges heat δQ with thermal baths at various temperatures T(S). Let T_2 be the largest value of T along the cycle and T_1 the smallest one. We intend to compare η_W for such an arbitrary cycle with the efficiency of the Carnot cycle running between the largest and the smallest temperatures. The total work and heat are

  W = Q = ∮ δQ = ∫_a^b δQ + ∫_b^a δQ = Q_2 − Q_1,   (7.12)

where ∫_a^b δQ = Q_2 > 0 collects the heat absorbed and ∫_b^a δQ = −Q_1 < 0 the heat rejected.

We consider the Clausius inequality

  ∮ δQ/T ≤ 0,   (7.13)

where T refers to the temperature of the thermal bath with which the heat exchange δQ occurs:

  0 ≥ ∮ δQ/T = ∫_a^b δQ/T + ∫_b^a δQ/T,   (7.14)

where δQ > 0 along the first integral and δQ < 0 along the second.

Since T_2 ≥ T ≥ T_1, we may replace T by T_2 wherever δQ > 0 and T by T_1 wherever δQ < 0. We may thus write

  0 ≥ ∫_{δQ>0} δQ/T + ∫_{δQ<0} δQ/T ≥ ∫ δQ/T_2 + ∫ δQ/T_1 = Q_2/T_2 − Q_1/T_1,

which implies

  Q_1/T_1 ≥ Q_2/T_2  ⇒  Q_1/Q_2 ≥ T_1/T_2.   (7.15)

The work efficiency η_W of the arbitrary cycle satisfies

  η_W = W/Q_2 = (Q_2 − Q_1)/Q_2 = 1 − Q_1/Q_2 ≤ 1 − T_1/T_2 = η_W^Carnot.   (7.16)

The equal sign holds only when the process is reversible (i.e., ∮ δQ/T = 0) and at the same time all the heat absorbed (δQ > 0) comes from the highest-temperature bath (T_2) and all the rejected heat (δQ < 0) goes to the coolest bath (T_1). In other words, it holds only for the Carnot cycle running between the temperatures T_1 and T_2.

As a counterclockwise process the Carnot cycle receives the work −W > 0 done by external forces, absorbs the heat Q_1 > 0 from the reservoir at the lower temperature T_1, and transfers the absorbed energy in the form of heat, Q_2 = W − Q_1 < 0, to the hottest reservoir at T_2. One defines the heat-transfer efficiency of an arbitrary heat pump as

  η_H = (rejected heat)/(absorbed work) = (−Q_2)/(−W) = 1/η_W.   (7.17)

In the Carnot cycle we have

  η_H^Carnot = T_2/(T_2 − T_1) = 1/η_W^Carnot > 1.   (7.18)


Notice that the efficiency of a Carnot heat pump is much better than the plain dissipation of work, as one would have in a Joule experiment, for which Q_2 = W and thus η_H^Joule = 1.

It is also interesting to consider the cooling efficiency of refrigerators, which is defined as

  η_C = (heat removed)/(absorbed work) = Q_1/(−W).   (7.19)

For the Carnot cycle we have

  η_C^Carnot = T_1/(T_2 − T_1),   (7.20)

which shows that the efficiency improves when ΔT = T_2 − T_1 is small. Therefore, cooling processes are usually split into a number of subprocesses with smaller ΔT's.

Before closing this section we would like to mention the Stirling cycle, which alternates isothermal and isochoric processes; the Ericsson cycle, which alternates isobaric and isothermal processes; and the Rankine cycle (steam engine), which alternates adiabatic and isobaric processes.

7.2 Joule-Thomson process

In the following sections we would like to discuss two important irreversible processes. The first one is the Joule-Thomson effect, which consists of an adiabatic (δQ = 0) expansion at constant pressure. More precisely, we consider a gas or liquid at a pressure p_1 in a volume V_1 which flows into a region of lower pressure p_2 through a small opening or porous material. During the process the pressures p_1 and p_2 are kept constant from the exterior, and the volume change is so slow that the system is constantly in thermodynamic equilibrium. Initially, before the valve is opened, we have V_1 = V_1^0 and V_2 = 0; in the final state we have V_1 = 0 and V_2 = V_2^0.

[GRAFIK]

It is easy to show that this is an isenthalpic process (dH = 0). Since the system is thermally isolated, we have δQ = 0 and therefore

  dE = δW = −p_1 dV_1 − p_2 dV_2.   (7.21)

The extensiveness of E implies E = E_1 + E_2 and thus

  dE_1 + dE_2 + p_1 dV_1 + p_2 dV_2 = 0   (7.22)

or

  dH_1 + dH_2 = dH = 0.   (7.23)

The enthalpy is conserved throughout the process. Note that the change in the number of particles of each subsystem, dN_1 = −dN_2 < 0, does not contribute to dE or dH, since the system is in equilibrium: µ_1 = µ_2 = µ and µ dN_1 + µ dN_2 = µ dN = 0. The total number of particles is conserved.


The change of temperature associated with a change of pressure at constant enthalpy H is given by the Joule-Thomson coefficient

  µ_JT = ∂T/∂p|_H = (V/C_p)(αT − 1).

The total temperature change is then

  T_2 − T_1 = ∫_{p_1}^{p_2} ∂T/∂p|_H dp.

The observed temperature change is known as the Joule-Thomson effect. In an ideal gas we have α = 1/T, so that µ_JT = 0: the enthalpy depends only on the temperature, and no temperature change can occur because ΔH = 0. In real gases the Joule-Thomson coefficient changes sign as a function of T, becoming positive only at low temperatures, below the so-called inversion point. For µ_JT > 0 a pressure drop implies a temperature drop. Therefore, in order for the Joule-Thomson expansion to be effective as a cooler, the gas must already be below the inversion temperature, where µ_JT = 0.
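The inversion point can be illustrated with a van der Waals gas (a model assumption; the van der Waals equation of state is not derived in this section). At low pressure the inversion temperature of this model is T_inv = 2a/(k_B b). The sketch below solves the equation of state numerically and checks the sign of αT − 1, which fixes the sign of µ_JT:

```python
# Sign of the Joule-Thomson coefficient for a van der Waals gas
# (illustrative parameters, units with k_B = 1, per-particle volume v).
def v_of_Tp(T, p, a=1.0, b=0.01):
    # gas-branch volume from T/(v - b) - a/v**2 = p, by bisection
    f = lambda v: T / (v - b) - a / v**2 - p
    lo, hi = 1.1 * b, 100.0 * T / p
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if f(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def alphaT_minus_1(T, p=0.01, dT=1e-3):
    # alpha*T - 1 via a centered finite difference of v(T, p) at constant p;
    # mu_JT = (V/C_p)(alpha*T - 1) has the same sign.
    v = v_of_Tp(T, p)
    dvdT = (v_of_Tp(T + dT, p) - v_of_Tp(T - dT, p)) / (2 * dT)
    return T * dvdT / v - 1.0

# Low-pressure inversion temperature is T_inv = 2a/b = 200 in these units:
assert alphaT_minus_1(100.0) > 0   # below inversion: expansion cools
assert alphaT_minus_1(300.0) < 0   # above inversion: expansion warms
```

To lowest order in the density, v ≈ T/p + b − a/T, so αT − 1 ≈ (2a/T − b)/v, which changes sign exactly at T = 2a/b.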

Since dH = 0 and dN = 0, we have

  dH = T dS + V dp = 0,   (7.24)

which implies

  ∂S/∂p|_{H,N} = −V/T < 0.   (7.25)

The process is obviously irreversible, since

  ΔS = −(V/T) Δp > 0  for  Δp = p_2 − p_1 < 0.   (7.26)

Finally, we observe that any pressure change occurring spontaneously must be negative, as expected.

7.3 Adiabatic expansion in vacuum

A further important irreversible process is the adiabatic (δQ = 0) expansion into an evacuated chamber at pressure p_2 = 0. In order that the process takes place quasi-statically, and that the system remains always in equilibrium, we can devise a setup like the one illustrated above, which involves a small valve or a porous medium. Before opening the valve we have V = V_1 and p = p_1; in the final state we have V = V_2. The external pressure is p_2 → 0. In this case we have both δQ = 0 and δW = 0, which implies dE = 0: the energy of the gas is conserved. The temperature change can be calculated from

  T_2 − T_1 = ∫_{V_1}^{V_2} ∂T/∂V|_E dV.


The rate of temperature change at constant energy can be expressed in terms of known properties. Our starting point is

  ∂T/∂V|_E = ∂(T,E)/∂(V,E) = [∂(T,E)/∂(V,T)] [∂(V,T)/∂(V,E)] = −(1/C_V) ∂E/∂V|_T.

Moreover,

  dE = T dS − p dV  ⇒  ∂E/∂V|_T = T ∂S/∂V|_T − p

and

  dF = −S dT − p dV  ⇒  ∂S/∂V|_T = ∂p/∂T|_V.

Combining these relations we obtain

  ∂E/∂V|_T = T ∂p/∂T|_V − p

and finally

  ∂T/∂V|_E = (1/C_V) [p − T ∂p/∂T|_V].

Again, for an ideal gas there is no effect, since at constant V the pressure p depends linearly on T and thus ∂T/∂V|_E = 0. In principle, one can find both signs for ∂T/∂V|_E. However, in dilute real gases one always observes a cooling. In fact, the van der Waals interactions between gas molecules are attractive at long distances. Consequently, increasing the volume implies an increase of the average interparticle distance. This raises the potential energy and, by energy conservation, lowers the kinetic energy. The equipartition theorem then implies that the temperature must decrease.
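For a van der Waals gas this cooling can be made explicit: the combination p − T ∂p/∂T|_V takes the simple form −aN²/V², so the expansion always lowers T. A short numerical check (illustrative parameters, units with k_B = 1) comparing the integral with its closed form ΔT = (aN²/C_V)(1/V_2 − 1/V_1) < 0:

```python
import math

# Free expansion of a van der Waals gas (illustrative parameters, k_B = 1).
# At constant energy, dT/dV|_E = (1/C_V)(p - T dp/dT|_V) = -a*N**2/(C_V*V**2).
a, N, C_V = 0.5, 2.0, 3.0
V1, V2 = 1.0, 4.0

def dTdV(V):
    # vdW: p = N*T/(V - N*b) - a*N**2/V**2; T*dp/dT|_V cancels the first term
    return -a * N**2 / (C_V * V**2)

# trapezoidal integration of dT/dV|_E from V1 to V2
n = 20000
h = (V2 - V1) / n
dT_num = sum(0.5 * (dTdV(V1 + i*h) + dTdV(V1 + (i+1)*h)) * h for i in range(n))
dT_exact = a * N**2 / C_V * (1.0/V2 - 1.0/V1)

assert dT_num < 0                    # the gas always cools upon free expansion
assert math.isclose(dT_num, dT_exact, rel_tol=1e-5)
```

Note that the attraction parameter a alone sets the size of the effect, in line with the potential-energy argument above; the excluded volume b drops out entirely.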

Before closing this section, we may calculate the entropy change following an energy-conserving expansion:

  dE = T dS − p dV = 0  ⇒  ∂S/∂V|_{E,N} = p/T > 0

and

  ΔS = (p/T) ΔV.

As expected, any volume change into vacuum occurring spontaneously must be positive. The process is irreversible, since ΔS > 0.


8 The Nernst theorem: the third law of thermodynamics

In 1905 Walther Nernst postulated that the entropy of any system at absolute zero temperature is a universal constant, which may be taken to be zero. This postulate, which he actually formulated in other terms, was inferred from experiments at temperatures of the order of 1 K. For example, one observed that the changes in enthalpy H and in free enthalpy G = H − TS in isothermal, isobaric processes become equal as T → 0. Indeed, since ΔG = ΔH − TΔS at constant T, ΔH = ΔG implies ΔS = 0. Moreover, since ∂G/∂T|_p = −S, the fact that ∂G/∂T|_p → 0 for T → 0 was used to infer that S → 0 for T → 0. This important fundamental property is general in two respects:

i) It holds for any system and any substance, and

ii) it states that S = 0 at T = 0, regardless of the values of any other additive constants of motion f_i = E, V or N, or intensive properties α_i = p or µ, on which S usually depends.

Therefore, it is easy to imagine that the third law of thermodynamics has many far-reaching consequences. In the following we shall demonstrate this important result. Furthermore, we discuss the conditions for its validity from the point of view of statistical mechanics, which provides an unambiguous microscopic definition of S.

Let us consider the canonical density operator

  ρ_c = e^{−βH} / Tr{e^{−βH}} = [ Σ_{n=0}^{∞} e^{−βE_n} |n⟩⟨n| ] / [ Σ_{n=0}^{∞} e^{−βE_n} ]
      = [ P_0 + Σ_{n: E_n > E_0} e^{−β(E_n − E_0)} |n⟩⟨n| ] / [ g + Σ_{n: E_n > E_0} e^{−β(E_n − E_0)} ]

in the limit T → 0 (i.e., β → ∞), where

  P_0 = Σ_{i=1}^{g} |i⟩⟨i|

is the projector operator on the g-dimensional subspace spanned by the states having the lowest energy E_0 (P_0² = P_0). The degeneracy of the ground state is denoted by g, i.e., E_i = E_0 for i = 1, …, g. For T → 0 we have

  ρ_c(T = 0) = (1/g) P_0 = (1/g) Σ_{i=1}^{g} |i⟩⟨i|,

which implies that the system is found in the ground-state subspace. The entropy is therefore given by

  S_0 = −k_B ⟨ln ρ⟩ = k_B ln g

for T → 0. In most cases the ground state is non-degenerate (i.e., g = 1). Thus, S = 0 irrespective of the other thermodynamic variables (e.g., V, N, p or µ). However,


there are situations with g > 1 which deserve some discussion. For example, if a system has an odd number of electrons, its ground state will have a non-vanishing half-integer total spin S ≥ 1/2. Even if combined with an orbital angular momentum to yield the total angular momentum J = S + L, the 2J + 1 degeneracy will be finite. Moreover, time-inversion symmetry requires that the ground state of a system with an odd number of electrons is at least twofold degenerate. We would then have S = k_B ln(2J + 1) or S = k_B ln 2, which is negligible on a macroscopic level. In the worst case we can expect g ≈ N, for example in a rotationally invariant ferromagnetic system with saturated magnetic order, having total spin S ∝ N and degeneracy 2S + 1 ∝ N. But even in these cases the entropy per particle S/N ∝ (1/N) ln N → 0 for T → 0. Therefore, we may say that the additive or extensive part of the entropy is always zero as T → 0, regardless of the other thermodynamic variables. It should be noted that spin-orbit interactions introduce magnetic anisotropy, which removes the 2S + 1 spin-rotational degeneracy of ferromagnetic materials. Also notice that the total orbital momentum is not conserved, since the solid or its lattice is fixed to the laboratory frame. Only the time-inversion symmetry and possibly some point-group symmetry are left, yielding g independent of N. We conclude that the entropy of every real system vanishes at T = 0, irrespective of the values of all other thermodynamic quantities (V, N, p and µ).

This important result has far-reaching consequences, which can be directly verified in experiment. Historically, it was this set of observations, which we shall now derive, that guided Nernst to infer and postulate his principle.

An immediate consequence of the third principle is that any heat capacity vanishes at T = 0. Consider some reversible path R connecting the T = 0 equilibrium state with the state A whose entropy is to be calculated. Since

  ∂S/∂T|_R = C_R/T,

we have

  S(A) = ∫_0^{T_A} (C_R/T) dT.

Since S(A) is well defined and finite, C_R/T needs to be integrable in [0, T_A]. This implies that

  C_R → 0 for T → 0.

In particular C_V → 0 and C_p → 0 for T → 0. This result has been verified by all experiments done so far.

A further consequence of the third principle is that the thermal expansion ∂V/∂T|_p = Vα vanishes for T → 0 for any substance. Since S(T = 0) = 0 we can write

  S(T) = ∫_0^T (C_p/T′) dT′,

where the integration path is isobaric (∂S/∂T|_p = C_p/T). Moreover, we know that by definition

  Vα = ∂V/∂T|_p = −∂S/∂p|_T = −∂/∂p ( ∫_0^T (C_p/T′) dT′ )|_T = −∫_0^T ∂C_p/∂p|_T dT′/T′.   (8.1)


Taking into account that

  C_p/T = ∂S/∂T|_p  ⇒  (1/T) ∂C_p/∂p|_T = ∂/∂T (∂S/∂p|_T)|_p = −∂/∂T (∂V/∂T|_p)|_p = −∂²V/∂T²|_p.

Substituting in Eq. (8.1), we finally obtain

  Vα = ∫_0^T ∂²V/∂T′²|_p dT′ = ∂V/∂T|_p (T) − ∂V/∂T|_p (0) → 0 for T → 0.

This proves the statement. Moreover, from the previous relations (6.6), (6.7) and (6.20) we have

  α/κ_T = ∂p/∂T|_V = ∂S/∂V|_{T,N} → 0 for T → 0.

In the above derivation it is irrelevant whether N or µ is kept constant. Thus, S is independent of V for T → 0. A similar argument involving the charge compressibility κ_c shows that ∂S/∂N|_T → 0 for T → 0 (see exercise).

Exercise 8.13: Show that ∂S/∂N|_T = −α/κ_c. Consider ∂µ/∂T|_{V,N} and work it out using the Jacobi-determinant method as in Eq. (6.20). Conclude that S is independent of N for T → 0.

On these thermodynamic grounds, we conclude that S(T, V, N) is independent of V and N at T = 0. This result alone already allows us to choose the value of this universal constant equal to zero, even without having any microscopic explanation.

A further important consequence of the third principle of thermodynamics is the impossibility of reaching the absolute zero of temperature in experiment. It is in fact easy to show that an infinite number of processes would be necessary in order to succeed: only in the limit of an infinite number of processes would one be able to approach the T = 0 state, without ever actually reaching it.

[GRAFIK]

The basic reasoning goes as follows. Since S = 0 independently of any other thermodynamic variable and of the details of the system, all curves in the (S, T) diagram corresponding to a constant value of any parameter P (e.g., pressure or volume) must merge at T = 0 in order to yield S = 0. Situations like the one illustrated in the following are not possible:

[GRAFIK]

One can imagine, for example, an isentropic cooling from p = p_1 to p = p_2 (vertical line), but then, in order to achieve further cooling, one must reduce the entropy. This can be done at best via an isothermal process, since (after some point) no thermal bath cooler than the system can be found (horizontal line). Since the values of P must be finite, the isothermal process leaves the system at some finite value of p = p_1. From there, a new isentropic cooling can be made. In order to reach T = 0 (or even to approach T = 0 arbitrarily closely) one would need an infinite number of constant-T and constant-S processes. Notice that this would not be necessary if the constant-P curves in the (S, T) diagram did not all converge to the same S as illustrated.


9 The principle of indistinguishability of identical particles

Before discussing the properties of macroscopic systems at low temperatures and assessing the limitations of classical statistical mechanics, it is absolutely necessary to recall the principle of indistinguishability of identical particles and its consequences for the symmetry of many-body wave functions. In classical mechanics the particles preserve their identity despite having the same physical properties. A numbering is possible without altering the mechanical behaviour in any significant respect. This allows us in principle to follow the trajectory of each particle along its path. This reasoning applies to any number of particles and in particular to particles that are identical in all respects. For example, from a classical perspective a state in which particle 1 has position and momentum (q, p) and particle 2 has the coordinates (q′, p′) is fundamentally distinct from the one in which particle 1 is in state (q′, p′) and particle 2 in state (q, p).

In quantum mechanics the situation is entirely different, since the notion of a deterministic path ceases to exist as a consequence of the Heisenberg uncertainty principle. Even if a labeling of the particles is attempted by measuring the position of each particle with arbitrary accuracy at some time t, it is not possible to track the positions of the particles at any future time t′ > t. In fact, the coordinates have no definite values even at times t′ arbitrarily close to t. Suppose that we localize (measure the position of) a particle at a given time t′ > t. It would then be impossible to say which particle among the initial N ones has arrived at this point. This is clearly illustrated by the following scattering experiment:

[GRAFIK]

In quantum mechanics identical particles entirely lose their individuality. They are completely indistinguishable. No experimental measurement can ever remove this indistinguishability. This fundamental principle of indistinguishability of identical particles has many far-reaching consequences.

Consider two observers O and O′ who attempt to defy the principle of indistinguishability by preparing two experiments, adopting different conventions for labeling the electronic coordinates. For example, the scattering between two electronic wave packets is prepared. According to O, electron 1, in the wave packet |a⟩, interacts with electron 2 in the wave packet |b⟩. The corresponding state considered by O is |ψ⟩, with coordinate wave function ⟨x_1, x_2|ψ⟩ = ψ(x_1, x_2). According to observer O′, electron 2 is the one in state |a⟩, while electron 1 is in state |b⟩. The corresponding two-particle state is |ψ′⟩, with wave function ⟨x_1, x_2|ψ′⟩ = ψ′(x_1, x_2). The principle of indistinguishability of identical particles states that |ψ⟩ and |ψ′⟩ are two equivalent representations of the same physical state. No measurement can ever make a difference between |ψ⟩ and |ψ′⟩. This means no more and no less than that for any state |φ⟩ the probability of finding |ψ⟩ in |φ⟩ is the same as the probability of finding |ψ′⟩ in |φ⟩:

  |⟨φ|ψ⟩|² = |⟨φ|ψ′⟩|²

for any |φ⟩. Setting |φ⟩ = |ψ⟩ we have

  |⟨ψ|ψ⟩|² = 1 = |⟨ψ|ψ′⟩|²,

and setting |φ⟩ = |ψ′⟩ we have

  |⟨ψ′|ψ′⟩|² = 1 = |⟨ψ′|ψ⟩|².

It is easy to see that two vectors having norm equal to 1 and an overlap with absolute value equal to 1 must be collinear. This implies

  |ψ′⟩ = e^{iα} |ψ⟩.

Consequently,

  ⟨x_1, x_2|ψ′⟩ = ψ′(x_1, x_2) = e^{iα} ψ(x_1, x_2).

Since ψ′ is obtained from ψ by interchanging the two particle labels, ψ′(x_1, x_2) = ψ(x_2, x_1). Repeating the interchange, we obtain

  ψ(x_1, x_2) = e^{iα} ψ(x_2, x_1) = e^{2iα} ψ(x_1, x_2)  ∀ x_1, x_2  ⇒  e^{2iα} = 1  ⇒  e^{iα} = ±1.

Consequently ψ(x_1, x_2) = ±ψ(x_2, x_1).

Obviously, the same sign must hold for any two particles of the same kind. Applying the previous argument to any two particles belonging to a larger N-particle system, we have

  ψ(x_1, …, x_i, …, x_j, …, x_N) = ±ψ(x_1, …, x_j, …, x_i, …, x_N)

for all i and j. The particles in nature are thus divided into two disjoint groups: the particles having fully symmetric wave functions (+ sign), which are called bosons, and the particles having fully antisymmetric wave functions (− sign), which are called fermions. The property of being a boson or a fermion is an intrinsic property of any particle. There is a one-to-one correspondence between the bosonic or fermionic character and the integer or half-integer nature of the particles' intrinsic spin: bosons have integer spin, while fermions have half-integer spin. Most elementary particles are fermions (e⁻, p⁺, n, e⁺). Photons and most elementary excitations in condensed matter are bosons (phonons, magnons, etc.).

Complex particles, for example atoms, molecules or nuclei, have fermionic character if the number of elementary fermions constituting them is odd; otherwise they show bosonic character. For instance, ³He atoms are fermions while ⁴He atoms are bosons. As we shall see, this has crucial consequences for the thermodynamic properties at low temperatures.

The exchange or transposition of the coordinates of two particles i and j is a particular case of the most general permutation of the N coordinates x_1, …, x_N. A permutation P: [1, N] → [1, N] is a bijective function within the natural interval [1, N]. We may denote it by

  ( 1     2    …   N    )
  ( P(1)  P(2) …  P(N) )

or simply by P = [P(1), P(2), …, P(N)]. For example, a transposition corresponds to

  ( 1 … i … j … N )
  ( 1 … j … i … N ).

It is very useful to define the order of a permutation P, which we denote by O(P) = p, as the number of transpositions required to bring the sequence [P(1), P(2), …, P(N)] to the normal order [1, 2, …, N]. For example,

  ( 1 2 3 )
  ( 1 3 2 )   has p = 1

and

  ( 1 2 3 )
  ( 3 1 2 )   has p = 2.

A simple transposition always has p = 1. One may easily convince oneself that p corresponds to the number of line crossings in diagrams of the form

  ( 1     2    …   N    )
  ( P(1)  P(2) …  P(N) ),   (9.1)

where the lines connect the numbers i on the upper row with P(i) on the lower row. Knowing that ψ(x_1, …, x_i, …, x_j, …, x_N) = ±ψ(x_1, …, x_j, …, x_i, …, x_N), we conclude that

  ψ(x_{P(1)}, x_{P(2)}, …, x_{P(N)}) = ψ(x_1, x_2, …, x_N)

for bosons and

  ψ(x_{P(1)}, …, x_{P(N)}) = (−1)^p ψ(x_1, …, x_N)

for fermions. Functions with these properties are said to be fully symmetric (bosons) or fully antisymmetric (fermions).
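The order p modulo 2 equals the number of inversions of the sequence [P(1), …, P(N)], which is exactly what the line-crossing picture counts. A minimal sketch:

```python
from itertools import permutations

# Sign (-1)**p of a permutation: p is the number of transpositions needed to
# sort it, equal mod 2 to the number of inversions (line crossings in the
# two-row diagram of Eq. (9.1)).
def parity(perm):
    inversions = sum(1 for i in range(len(perm))
                       for j in range(i + 1, len(perm))
                       if perm[i] > perm[j])
    return (-1) ** inversions

assert parity([1, 3, 2]) == -1        # one transposition: p = 1
assert parity([3, 1, 2]) == +1        # two transpositions: p = 2
# sanity check: half of all N! permutations are even, half are odd
signs = [parity(list(p)) for p in permutations(range(4))]
assert sum(signs) == 0
```

Only the parity of p is well defined (different sequences of transpositions can sort the same permutation), but the sign (−1)^p appearing in the fermionic transformation law is unambiguous.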

9.1 Many-particle wave functions

The superposition principle states that any linear combination of two quantum mechanical states |ψ_1⟩ and |ψ_2⟩ is a possible state. Therefore, all the wave functions of a system of N identical particles must have the same fermionic or bosonic character; otherwise a linear combination would have no defined symmetry. In particular, the basis states must be fully antisymmetric or symmetric. We seek a simple complete basis set for fermions and bosons, in terms of which we can compute the partition functions and statistical averages of systems of identical particles.


Consider an arbitrary complete set of orthonormal single-particle wave functions or spin-orbitals

  {φ_1(x), φ_2(x), …} = {φ_α with α = 1, 2, …},

where

  Σ_α φ_α*(x) φ_α(x′) = δ(x − x′)

and

  ∫ φ_α*(x) φ_β(x) dx = δ_αβ.

For simplicity, we introduce the notation x_i ≡ r⃗_i for the spatial coordinates of spinless particles, and x_i ≡ (r⃗_i, σ_i) for particles having a spin variable σ_i (e.g., electrons). Consequently, the delta functions and integrals are to be interpreted as

  δ(x − x′) = δ⁽³⁾(r⃗ − r⃗′) δ_{σσ′}

and

  ∫ … dx = Σ_σ ∫ … d³r.

Since the set {φ_α(x)} is complete for single-variable functions, any N-variable function can be expanded as a linear combination of simple products of the form

  φ_{k_1}(x_1) φ_{k_2}(x_2) ⋯ φ_{k_N}(x_N).

However, simple products are not appropriate N-particle basis wave functions, since they are neither symmetric nor antisymmetric. In other words, simple products are not physically acceptable states for identical particles. It is, however, simple to obtain a complete basis set appropriate for fermions or bosons by symmetrizing the products φ_{k_1}, …, φ_{k_N}.

Exercise 9.14: Show that $\Psi(x_1, \dots, x_N)$ as given by Eq. (9.2) fulfills the required antisymmetry properties of fermion wave functions.


9.2 Fermions

In the case of fermions, the simplest fully antisymmetric wave function that can be constructed from a single product reads

$$\Psi(x_1,\dots,x_N) = \frac{1}{\sqrt{N!}}\sum_P (-1)^p\,\varphi_{k_{P(1)}}(x_1)\,\varphi_{k_{P(2)}}(x_2)\dots\varphi_{k_{P(N)}}(x_N)$$
$$= \frac{1}{\sqrt{N!}}\sum_P (-1)^p\,\varphi_{k_1}(x_{P(1)})\,\varphi_{k_2}(x_{P(2)})\dots\varphi_{k_N}(x_{P(N)})$$
$$= \frac{1}{\sqrt{N!}}\begin{vmatrix}\varphi_{k_1}(x_1) & \dots & \varphi_{k_1}(x_N)\\ \vdots & & \vdots\\ \varphi_{k_N}(x_1) & \dots & \varphi_{k_N}(x_N)\end{vmatrix}. \qquad (9.2)$$

These functions are known as Slater determinants. Each Slater determinant is univocally defined by specifying the occupation numbers $n_\alpha$ of each single-particle spin-orbital $\varphi_\alpha$, i.e., by specifying whether $\alpha$ appears in the list $k_1, k_2, \dots, k_N$, in which case $n_\alpha = 1$, or not, in which case $n_\alpha = 0$. The phase ambiguity is removed by requiring $k_1 < k_2 < \dots < k_N$. Notice that $\Psi = 0$ if any of the single-particle orbitals is occupied twice. The Pauli exclusion principle thus imposes $n_\alpha = 0$ or $1$ for fermions. Let us recall that the single-particle states $\varphi_\alpha(x)$ concern both the spatial coordinates $\vec r_i$ and the spin variables $\sigma_i$. We may thus write

$$\Psi_{k_1,\dots,k_N}(x_1,\dots,x_N) = \langle x_1,\dots,x_N | n_1, n_2, \dots\rangle,$$

where $n_\alpha = 1$ for $\alpha = k_1, k_2, \dots, k_N$, and $n_\alpha = 0$ otherwise. From the occupation-number perspective, the following notation for the Slater-determinant wave function seems more appropriate:

$$\Psi_{\{n_\alpha\}}(x_1,\dots,x_N) = \langle x_1,\dots,x_N | n_1, n_2, \dots\rangle,$$

where $|n_1, n_2, \dots\rangle$ is the $N$-fermion ket corresponding to the occupation numbers $n_\alpha = 0, 1$ with $\sum_\alpha n_\alpha = N$. It is easy to verify that the Slater determinants are properly normalized and that two Slater determinants are orthogonal unless all occupation numbers coincide:

$$\langle n_1, n_2, \dots | n_1', n_2', \dots\rangle = \int dx_1\dots dx_N\;\Psi^*_{\{n_\alpha\}}(x_1,\dots,x_N)\,\Psi_{\{n_\alpha'\}}(x_1,\dots,x_N) = \delta_{n_1 n_1'}\,\delta_{n_2 n_2'}\cdots$$
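These properties are easy to verify numerically for a small example. The following sketch builds Eq. (9.2) as a determinant and checks antisymmetry and the Pauli principle; the plane-wave orbitals on a ring are an illustrative assumption, not part of the text:

```python
import math
import numpy as np

# Illustrative orthonormal spin-less orbitals: plane waves on a ring of length L.
L = 2.0 * np.pi
def phi(k, x):
    """phi_k(x) = exp(i k x) / sqrt(L), k integer."""
    return np.exp(1j * k * x) / np.sqrt(L)

def slater(ks, xs):
    """Psi(x_1..x_N) = det[phi_{k_i}(x_j)] / sqrt(N!), as in Eq. (9.2)."""
    N = len(ks)
    M = np.array([[phi(k, x) for x in xs] for k in ks])
    return np.linalg.det(M) / math.sqrt(math.factorial(N))

ks = [1, 2, 3]                  # occupied orbitals, k_1 < k_2 < k_3
xs = np.array([0.3, 1.1, 2.7])  # particle coordinates

psi = slater(ks, xs)
# Antisymmetry: exchanging x_1 and x_2 flips the sign of Psi.
assert np.isclose(slater(ks, xs[[1, 0, 2]]), -psi)
# Pauli principle: a doubly occupied orbital gives Psi = 0.
assert abs(slater([1, 1, 3], xs)) < 1e-8
```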

Once we have the orthonormal many-body basis $|n_1, n_2, \dots\rangle$ or $\Psi_{\{n_\alpha\}}(x_1,\dots,x_N)$, we can compute the trace of any operator $O$ in the Hilbert space of fermion states having an arbitrary number of particles:

$$\mathrm{Tr}\{O\} = \sum_{\{n_\alpha\}}\langle n_1, n_2, \dots|O|n_1, n_2, \dots\rangle = \sum_{n_1=0}^{1}\sum_{n_2=0}^{1}\cdots\sum_{n_\alpha=0}^{1}\cdots\;\langle n_1, n_2, \dots|O|n_1, n_2, \dots\rangle,$$

where

$$\langle n_1, n_2, \dots|O|n_1, n_2, \dots\rangle = \int dx_1\dots dx_N\;\Psi^*_{\{n_\alpha\}}(x_1,\dots,x_N)\, O\,\Psi_{\{n_\alpha\}}(x_1,\dots,x_N).$$

For example, the grand-canonical partition function is given by

$$Z_{gc} = \sum_{n_1=0}^{1}\sum_{n_2=0}^{1}\cdots\sum_{n_\alpha=0}^{1}\cdots\;\langle n_1, n_2, \dots|e^{-\beta(H-\mu N)}|n_1, n_2, \dots\rangle.$$

Of course, in the case of the canonical partition function the constraint $\sum_\alpha n_\alpha = N$ must be enforced.

An alternative way to compute the trace is to sum over all possible choices of the $N$ occupied orbitals (i.e., the orbitals appearing in the product $\varphi_{k_1}, \dots, \varphi_{k_N}$) and to correct for the multiple appearance of the same set of orbitals. In the case of fermions all non-allowed products containing the same orbital two or more times are automatically ignored, since the corresponding Slater determinant is zero. However, if one sums over each value of $k_i$ independently, one obtains $N!$ times the same set of orbitals. Therefore we have

$$\mathrm{Tr}\{O\} = \frac{1}{N!}\sum_{k_1=1}^{\infty}\cdots\sum_{k_N=1}^{\infty}\int dx_1\dots dx_N\;\Psi^*_{k_1,\dots,k_N}(x_1,\dots,x_N)\, O\,\Psi_{k_1,\dots,k_N}(x_1,\dots,x_N).$$

9.3 Bosons

In the case of bosons, the fully symmetric wave function obtained from a simple product of single-particle orbitals looks even simpler than the Slater determinant. It can be written as

$$\Phi^s_{k_1,\dots,k_N}(x_1,\dots,x_N) = \frac{1}{\sqrt{N!}}\sum_P \varphi_{k_{P(1)}}(x_1)\dots\varphi_{k_{P(N)}}(x_N) = \frac{1}{\sqrt{N!}}\sum_P \varphi_{k_1}(x_{P(1)})\dots\varphi_{k_N}(x_{P(N)}).$$

It is easy to see that $\Phi^s(x_{Q(1)},\dots,x_{Q(N)}) = \Phi^s(x_1,\dots,x_N)$ for any permutation $Q$. Moreover,

$$\int \left(\Phi^s_{\{n_\alpha\}}\right)^{\!*}\,\Phi^s_{\{n'_\alpha\}}\; dx_1\dots dx_N = 0$$

unless the occupation numbers $n_\alpha = n'_\alpha$ are the same $\forall\,\alpha$. Indeed, if this is not the case, one of the integrals $\int \varphi_i^*(x)\,\varphi_j(x)\, dx = 0$. The form of $\Phi^s$ is appealing, since it is the same as the Slater determinant except for the phase factor $(-1)^p$. However, it has the inconvenience of not being normalized to 1. In fact, in the sum $\sum_P$ there are $n_1!\, n_2! \cdots = \prod_\alpha n_\alpha!$ terms which are exactly the same. They correspond to the permutations of the indices $k_{P(i)}$ among the $k_i$ corresponding to the same single-particle orbital $\alpha$. The properly normalized fully symmetric states are therefore given by

$$\Phi^s_{\{n_\alpha\}}(x_1,\dots,x_N) = \frac{\Phi^s_{k_1,\dots,k_N}(x_1,\dots,x_N)}{\sqrt{\prod_\alpha n_\alpha!}}.$$

This can also be written as

$$\Phi^s_{\{n_\alpha\}}(x_1,\dots,x_N) = \sqrt{\frac{\prod_\alpha n_\alpha!}{N!}}\;\sum_{P\neq k} \varphi_{k_{P(1)}}(x_1)\dots\varphi_{k_{P(N)}}(x_N),$$

where $\sum_{P\neq k}$ is the sum over all permutations of the indices $i = 1,\dots,N$ such that $k_{P(i)} \neq k_i$ for some $i$. In other words, in $\sum_{P\neq k}$ only permutations which actually change the product are taken into account, each distinct product being counted once. There are $N!/\prod_\alpha n_\alpha!$ such permutations, since the number of permutations among indices corresponding to the same orbitals is $\prod_\alpha n_\alpha!$.

As in the case of fermions, indistinguishability implies that the many-particle state or wave function is univocally defined by the occupation numbers $\{n_\alpha\} = n_1, n_2, \dots$. We therefore introduce the boson kets $|n_1, n_2, \dots\rangle$ corresponding to the occupation numbers $\{n_\alpha\}$, which are defined by

$$\langle x_1,\dots,x_N|n_1, n_2, \dots\rangle = \Phi^s_{\{n_\alpha\}}(x_1,\dots,x_N).$$

It is easy to see that

$$\langle n_1, n_2, \dots|n_1', n_2', \dots\rangle = \delta_{n_1 n_1'}\,\delta_{n_2 n_2'}\dots\delta_{n_\alpha n_\alpha'}\dots$$

for all $\{n_\alpha\}$ and all $N = \sum_\alpha n_\alpha$, since $\Phi^s_{\{n_\alpha\}}$ is normalized to 1. Once we have the orthonormal many-body basis $|n_1, n_2, \dots\rangle$ or $\Phi^s_{\{n_\alpha\}}(x_1,\dots,x_N)$, we can compute the trace of any operator $O$ as

$$\mathrm{Tr}\{O\} = \sum_{\{n_\alpha\}}\langle n_1, n_2, \dots|O|n_1, n_2, \dots\rangle = \sum_{n_1=0}^{\infty}\cdots\sum_{n_\alpha=0}^{\infty}\cdots\;\langle n_1, n_2, \dots|O|n_1, n_2, \dots\rangle,$$

where

$$\langle n_1, n_2, \dots|O|n_1, n_2, \dots\rangle = \int dx_1\dots dx_N\;\Phi^{s\,*}_{\{n_\alpha\}}(x_1,\dots,x_N)\, O\,\Phi^s_{\{n_\alpha\}}(x_1,\dots,x_N).$$

Notice that each term in the sum corresponds to a distinct many-body state. It is therefore important that $|n_1, n_2, \dots\rangle$ and $\Phi^s_{\{n_\alpha\}}(x_1,\dots,x_N)$ are normalized to 1. For example, the grand-canonical partition function is given by

$$Z_{gc} = \sum_{n_1=0}^{\infty}\cdots\sum_{n_\alpha=0}^{\infty}\cdots\;\langle n_1, n_2, \dots|e^{-\beta(H-\mu N)}|n_1, n_2, \dots\rangle.$$

In this case there are no restrictions on the total number of bosons $N = \sum_\alpha n_\alpha$. Calculating the canonical partition function $Z_c$ would be more complicated, since one would need to impose the constraint $\sum_\alpha n_\alpha = N$ on the occupation numbers.

An alternative way of computing the trace is to sum over all orbital indices $k_1, k_2, \dots, k_N$ independently (not over occupation numbers!) and to correct for the multiple appearances. In the case of bosons the number of times that the same product of orbitals $\varphi_{k_1}, \dots, \varphi_{k_N}$ appears is given by $\frac{N!}{n_1!\, n_2! \cdots} = \frac{N!}{\prod_\alpha n_\alpha!}$, which is the number of permutations between the indices $k_i$ having different values, i.e., $k_i \neq k_{P(i)}$ for some $i$. We therefore have

$$\mathrm{Tr}\{O\} = \sum_{k_1=1}^{\infty}\cdots\sum_{k_N=1}^{\infty}\frac{\prod_\alpha n_\alpha!}{N!}\int dx_1\dots dx_N\;\Phi^{s\,*}_{\{n_\alpha\}}(x_1,\dots,x_N)\, O\,\Phi^s_{\{n_\alpha\}}(x_1,\dots,x_N).$$
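The counting factor $N!/\prod_\alpha n_\alpha!$ can be checked directly for a small example; the index list below is an arbitrary illustration:

```python
import itertools
import math
from collections import Counter

# A product phi_{k1}...phi_{kN} with repeated indices corresponds to
# N!/prod(n_alpha!) distinct permutations of the index list (k1, ..., kN).
ks = (1, 1, 2, 3, 3, 3)          # occupations n_1 = 2, n_2 = 1, n_3 = 3
N = len(ks)

distinct = len(set(itertools.permutations(ks)))
occ = Counter(ks)
expected = math.factorial(N) // math.prod(math.factorial(n) for n in occ.values())

assert distinct == expected       # 6! / (2! 1! 3!) = 60
```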

At first sight this seems quite inconvenient, since the factor $\prod_\alpha n_\alpha!$ depends on the values of $k_1, k_2, \dots, k_N$ in a complicated way. However, the expression simplifies remarkably if we recall that

$$\Phi^s_{k_1\dots k_N}(x_1,\dots,x_N) = \sqrt{\prod_\alpha n_\alpha!}\;\Phi^s_{\{n_\alpha\}}(x_1,\dots,x_N) \qquad (9.3)$$
$$= \frac{1}{\sqrt{N!}}\sum_P \varphi_{k_1}(x_{P(1)})\dots\varphi_{k_N}(x_{P(N)}). \qquad (9.4)$$

We finally conclude that

$$\mathrm{Tr}\{O\} = \frac{1}{N!}\sum_{k_1=1}^{\infty}\cdots\sum_{k_N=1}^{\infty}\int dx_1\dots dx_N\;\Phi^{s\,*}_{k_1\dots k_N}(x_1,\dots,x_N)\, O\,\Phi^s_{k_1\dots k_N}(x_1,\dots,x_N),$$

which has the same form as in the case of fermions; only the phase factor $(-1)^p$ is replaced by $(+1)^p = 1$ in the sum over all permutations $P$.

These two ways of computing $\mathrm{Tr}\{O\}$ should not be confused. In the first case, $\sum_{\{n_\alpha\}}$, each term of the sum corresponds to a different many-body state, and the kets or wave functions involved must be normalized to 1. In the second case, $\sum_{k_1}\cdots\sum_{k_N}$, the boson wave functions are not necessarily normalized to one and there are repetitions among the many-body states. Here, the prefactor $1/N!$ does not come from the normalization of the wave functions alone.

Exercise 9.15: Consider the Hamilton operator

$$H = \sum_{i=1}^{N} h(x_i)$$

of $N$ non-interacting identical particles. For example, $H = \sum_{i=1}^{N}\frac{p_i^2}{2m}$. Note that $h(x)$ is the same for all particles, since they are identical. Let $\varphi_k(x)$ with $k = 1, 2, \dots$ be the eigenfunctions of $h$: $h\,\varphi_k = \varepsilon_k\,\varphi_k$. Show that any simple product

$$\pi(x_1,\dots,x_N) = \varphi_{k_1}(x_1)\,\varphi_{k_2}(x_2)\dots\varphi_{k_N}(x_N)$$

is an eigenfunction of $H$ with eigenenergy $E = \sum_{i=1}^{N}\varepsilon_{k_i}$. Is $\pi(x_1,\dots,x_N)$ an appropriate wave function in some particular case? Show that the fully symmetrized and antisymmetrized functions $\Psi^{\pm}(x_1,\dots,x_N)$ obtained from $\pi$ are either zero or eigenfunctions of $H$ with the same energy $E$. Conclude that $H|n_1, n_2, \dots\rangle = \left(\sum_{\alpha=1}^{\infty} n_\alpha\,\varepsilon_\alpha\right)|n_1, n_2, \dots\rangle$, where $|n_1, n_2, \dots\rangle$ is the state with definite occupation numbers $n_\alpha$ of all orbitals $\varphi_\alpha$ [$\langle x_1, x_2, \dots, x_N|n_1, n_2, \dots\rangle = \Psi^{\pm}(x_1,\dots,x_N)$].

Exercise 9.16: Consider a degenerate energy level with degeneracy $g$, which is occupied by $n$ particles ($0 \le n \le g$). Calculate the number of possible quantum states $\Omega_g(n)$ for $n$ identical fermions, $n$ identical bosons and $n$ distinguishable particles. What would be the result for "correct Boltzmann counting"? Are there any cases in which $\Omega_g(n)$ is the same for all three situations? When? Interpret physically. Repeat the exercise by calculating the number of quantum states $\Omega(\{n_i\})$ for the case in which a number of different groups of levels $i$, each with degeneracy $g_i$, is occupied by $n_i$ particles ($\sum_i n_i = N$).
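As a numerical companion to this exercise, the standard combinatorial results (assumed here, since the exercise asks the reader to derive them) can be tabulated; note how all three counts approach $g^n/n!$ when $n \ll g$:

```python
from math import comb, factorial

def omega_fermions(g, n):
    """n identical fermions in g degenerate orbitals: choose the occupied ones."""
    return comb(g, n)

def omega_bosons(g, n):
    """n identical bosons in g orbitals: number of multisets of size n."""
    return comb(g + n - 1, n)

def omega_distinguishable(g, n):
    """Each of the n labelled particles independently picks one of g orbitals."""
    return g ** n

def omega_boltzmann(g, n):
    """'Correct Boltzmann counting': distinguishable result divided by n!."""
    return g ** n / factorial(n)

g, n = 100, 3
print(omega_fermions(g, n))     # 161700
print(omega_bosons(g, n))       # 171700
print(omega_boltzmann(g, n))    # 166666.66..., between the two quantum counts
```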

Exercise 9.17: Identity of particles:

i) Given a complete orthonormal single-variable basis set $\{\varphi_\alpha(x), \alpha = 1, 2, \dots\}$ satisfying

$$\sum_\alpha \varphi_\alpha^*(x)\,\varphi_\alpha(x') = \delta(x - x')$$

and

$$\int \varphi_\alpha^*(x)\,\varphi_\beta(x)\, dx = \delta_{\alpha\beta},$$

it is always possible to expand an $N$-variable function in the form

$$\Psi(x_1, \dots, x_N) = \sum_{k_1=1}^{\infty}\cdots\sum_{k_N=1}^{\infty} c(k_1,\dots,k_N)\,\varphi_{k_1}(x_1)\dots\varphi_{k_N}(x_N).$$

a) Consider now the case where $\Psi(x_1,\dots,x_N)$ is an $N$-particle fermionic wave function. Show that

$$c(k_1, \dots, k_i, \dots, k_j, \dots, k_N) = -c(k_1, \dots, k_j, \dots, k_i, \dots, k_N)$$

for all $i$ and $j$. Conclude that two fermions can never occupy the same single-particle state or spin-orbital (Pauli exclusion principle). Remember that the index $k_i$ (or $\alpha$) defines the single-particle state completely, i.e., it includes the spin variable.

b) Show that $c(k_{P(1)}, \dots, k_{P(N)}) = (-1)^p\, c(k_1, \dots, k_N)$ for an arbitrary permutation $P$, where $p$ is the parity of $P$ (the number of transpositions composing it). Conclude that for each choice of occupations $n_1, n_2, \dots$ with $n_\alpha = 0, 1$ and $\sum_\alpha n_\alpha = N$, there is only one independent coefficient, for instance,

$$c(k_1 < k_2 < \cdots < k_N) = c(n_1, \dots, n_\alpha, \dots).$$

c) Consider now the case where $\Psi(x_1,\dots,x_N)$ is an $N$-particle bosonic wave function. Show that

$$c(k_1, \dots, k_i, \dots, k_j, \dots, k_N) = c(k_1, \dots, k_j, \dots, k_i, \dots, k_N)$$

for all $i, j$. Generalize the statement to an arbitrary permutation of the indices:

$$c(k_{P(1)}, \dots, k_{P(N)}) = c(k_1, \dots, k_N) \quad \forall P.$$

Conclude that for a given choice of orbital occupations $n_1, n_2, \dots$ with $\sum_\alpha n_\alpha = N$, all the coefficients are the same. In other words, for each set of $n_\alpha$ there is only one independent coefficient $c(n_1, n_2, \dots, n_\alpha, \dots)$.

ii) Consider the Slater determinant or fully antisymmetrized single-product state

$$\Psi_{\{n_\alpha\}}(x_1,\dots,x_N) = \frac{1}{\sqrt{N!}}\sum_P (-1)^p\,\varphi_{k_{P(1)}}(x_1)\dots\varphi_{k_{P(N)}}(x_N),$$

where $k_1, \dots, k_N$ are the occupied orbitals (i.e., $n_\alpha = 1$ for $\alpha = k_1, \dots, k_N$) and the sum runs over all permutations $P$ with parity $p$. Show that $\Psi_{\{n_\alpha\}}(x_1,\dots,x_N)$ is normalized to 1 provided that the single-particle orbitals $\varphi_\alpha(x)$ are orthonormal and $k_i \neq k_j$ for all $i \neq j$. Show that $\Psi_{\{n_\alpha\}} = 0$ if $k_i = k_j$ for some $i$ and $j$.

iii) Consider the symmetrized bosonic single-product state

$$\Phi^s_{k_1,\dots,k_N} = \frac{1}{\sqrt{N!}}\sum_P \varphi_{k_{P(1)}}(x_1)\dots\varphi_{k_{P(N)}}(x_N)$$

for certain occupation numbers $n_\alpha$ satisfying $\sum_\alpha n_\alpha = N$.

10 The classical limit

In the previous section we have found closed expressions for the trace of any operator $O$ according to the laws of quantum mechanics and the principle of indistinguishability of identical particles. In principle every real system should be described in terms of the corresponding symmetrized or antisymmetrized many-body quantum states. However, in many cases of interest the quantum mechanical description is superfluous. We would like to derive the classical approach to statistical mechanics, in which, according to the classical picture, the state of the system is defined by specifying the positions $\vec r_1, \dots, \vec r_N$ and the momenta $\vec p_1, \dots, \vec p_N$ of each particle. This is formally achieved by starting from the quantum expression for the partition function $Z$ for fermions and bosons and taking the limit $\hbar \to 0$. An expansion in powers of $\hbar$ is thus obtained, in which the zeroth-order term gives Gibbs' classical expression for $Z$, and the higher-order terms quantify the quantum corrections. More importantly, the derivation allows us to understand how the classical description emerges from the quantum principles and to delimit the domain of applicability of classical statistics.

We will show that if the temperature $T$ is sufficiently high or if the density of particles $N/V$ is sufficiently low (large volume per particle $v = V/N$), we can approximate the canonical partition function by

$$Z_c(T, V, N) = \frac{1}{N!}\,\frac{1}{(2\pi\hbar)^{3N}}\int d^{3N}p\; d^{3N}q\; e^{-\beta H(p,q)}, \qquad (10.1)$$

where $H(p,q)$ is the classical Hamilton function. In Cartesian coordinates this is given by

$$H = \sum_{i=1}^{N}\frac{p_i^2}{2m} + W(\vec r_1, \dots, \vec r_N),$$

where $W(\vec r_1, \dots, \vec r_N) = \frac{1}{2}\sum_{i\neq j} w(|\vec r_i - \vec r_j|) + \sum_i v(\vec r_i)$ is the interaction energy between the particles plus the interaction with any external fields.

Before the derivation, it is interesting to discuss the range of validity and the consequences of Eq. (10.1). The classical limit is valid when the average inter-particle distance $r = (V/N)^{1/3}$ satisfies

$$\left(\frac{V}{N}\right)^{1/3} \gg \lambda,$$

where

$$\lambda = \frac{2\pi\hbar}{\sqrt{2\pi m k_B T}} = \sqrt{\frac{2\pi\hbar^2}{m k_B T}}$$

is the thermal de Broglie wavelength. It represents the quantum mechanical uncertainty in the position of a particle whose momentum uncertainty corresponds to the kinetic energy $p^2/2m = k_B T$. In fact, $\Delta p = \sqrt{\langle p^2\rangle} = \sqrt{2 m k_B T} \Rightarrow \Delta x \sim \hbar/\Delta p = \hbar/\sqrt{2 m k_B T} = \lambda/(2\sqrt{\pi})$. Thus $\lambda$, which is proportional to $\hbar$, can be regarded as a physically more appropriate expansion parameter than $\hbar$ itself; after all, $\hbar$ never actually goes to zero. Still, the classical limit is reached for sufficiently large $T$, since $\lambda \propto 1/\sqrt{T}$.

If $\lambda \ll v^{1/3} = (V/N)^{1/3}$ the particles can be regarded as occupying wave packets whose width is of the order of $\lambda$. In this limit the overlap between the different wave packets is nearly zero, and the differences between fermionic and bosonic statistics vanish. Indeed, if the occupied orbitals do not overlap at all [i.e., $\varphi_\alpha(\vec r)\,\varphi_\beta(\vec r) \cong 0$ $\forall\,\vec r \in \mathbb{R}^3$], the probability of finding a particle at point $\vec r$ and another particle at point $\vec r\,'$, usually given by

$$P(\vec r, \vec r\,') = \langle\delta(\vec r - \vec r_1)\,\delta(\vec r\,' - \vec r_2)\rangle = \int dr_1\dots dr_N\,|\Phi(\vec r_1, \dots, \vec r_N)|^2\,\delta(\vec r - \vec r_1)\,\delta(\vec r\,' - \vec r_2) = \int |\Phi(\vec r, \vec r\,', \vec r_3, \dots, \vec r_N)|^2\, dr_3\dots dr_N,$$

takes the simpler form

$$P(\vec r, \vec r\,') = \frac{1}{N(N-1)}\sum_{\alpha\neq\beta}|\varphi_\alpha(\vec r)|^2\,|\varphi_\beta(\vec r\,')|^2.$$

This holds independently of the symmetry of the many-body wave function, provided that $\varphi_\alpha(\vec r)\,\varphi_\beta(\vec r) = 0$ for all $\vec r$, $\alpha$ and $\beta$, as already discussed for the case of two particles.

In the opposite limit, $T \to 0$, $\lambda \propto 1/\sqrt{T}$ diverges, since the momenta of the particles become well defined ($\langle p^2\rangle \sim k_B T \to 0$). In this case the overlap between the occupied single-particle states is always important ($\lambda \gtrsim v^{1/3}$). Thus, the quantum mechanical symmetry correlations resulting from indistinguishability play a central role. One concludes that the classical approximation always breaks down for $T \to 0$, regardless of the accuracy with which the classical model may describe the microstates of the system.

In order to get a feeling for the order of magnitude of the temperature below which classical statistics fails, we may compute $v$ and $\lambda$ for electrons in solids and for a typical atomic gas. Using $2\pi\hbar = 6.6\times 10^{-34}\,$Js, $m_e = 9.1\times 10^{-31}\,$kg, and $k_B = 1.4\times 10^{-23}\,$J/K, we have

$$T[\mathrm K] = \frac{2\pi\hbar^2}{k_B}\,\frac{1}{\lambda^2 m} = \frac{5\times 10^{-45}}{\lambda^2[\mathrm m^2]\, m[\mathrm{kg}]} = \frac{5\times 10^{-38}}{\lambda^2[\mathrm{cm}^2]\, m[\mathrm g]} = \frac{5.44\times 10^{5}}{\lambda^2[\text{Å}^2]\, m[m_e]}. \qquad (10.2)$$

In solids we have typically one electron per cubic Å, i.e., $v = V/N \simeq 1\,$Å$^3$. If we set $\lambda = v^{1/3}$ in Eq. (10.2), we obtain that the temperature should be larger than $5.5\times 10^5\,$K in order for the classical approximation to start being applicable. Therefore, electrons in solids can never be treated classically: the symmetry correlations are always important in condensed matter. The situation is quite different in not too dense gases with, for example, $(V/N)^{1/3} = 10\,$Å and $m = A\, m_p$, where $A$ is the atomic weight and $m_p = 1.76\times 10^3\, m_e$ the proton mass. In this case, the symmetry of the wave function ceases to be important for temperatures above $T \simeq 3/A\,$K. Lighter atoms are more affected by quantum effects, since for the same average kinetic energy or temperature the uncertainty in momentum is smaller; for instance, $A = 4$ for He while $A = 131$ for Xe. If the density of the gas is lower, one may apply the classical approximation down to lower temperatures, but never for $T \to 0$, since in this limit $\lambda$ diverges. The important point, in order for the symmetry of the many-body wave function (fermionic or bosonic) to become irrelevant, is that the probability of finding two particles in the same state be extremely low. As we shall see in the exercises, the entropies of the Fermi, Bose and Boltzmann gases coincide if and only if all occupation numbers are extremely low. This is only possible for sufficiently high temperatures or sufficiently low densities.
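These estimates are easy to reproduce. The sketch below computes the crossover temperature at which $\lambda = (V/N)^{1/3}$; the SI constants are standard, and the densities are the illustrative values used in the text:

```python
import math

# Physical constants (SI).
hbar = 1.055e-34   # J s
kB   = 1.381e-23   # J/K
me   = 9.109e-31   # kg (electron mass)
mp   = 1.673e-27   # kg (proton mass)

def thermal_wavelength(m, T):
    """lambda = sqrt(2 pi hbar^2 / (m kB T))."""
    return math.sqrt(2 * math.pi * hbar**2 / (m * kB * T))

def crossover_T(m, v):
    """Temperature at which lambda equals the inter-particle distance v**(1/3)."""
    return 2 * math.pi * hbar**2 / (kB * m * v ** (2 / 3))

# Electrons in a solid: one electron per cubic angstrom.
T_el = crossover_T(me, (1e-10) ** 3)
# Helium gas (A = 4) with (V/N)^(1/3) = 10 angstrom.
T_He = crossover_T(4 * mp, (1e-9) ** 3)

print(f"electrons in a solid: T ~ {T_el:.2e} K")   # ~5.5e5 K
print(f"dilute He gas:        T ~ {T_He:.2f} K")   # ~0.75 K, i.e. ~3/A K
```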

The classical partition function (10.1) shows important signs of its quantum mechanical origin, despite the fact that the microstates are characterized by $(p, q)$. The first one is the factor $1/2\pi\hbar$ per degree of freedom, and the second one is the factor $1/N!$, where $N$ is the number of identical particles. Both are important consequences of the correct (quantum mechanical) counting of the number of (linearly independent) states having the same energy. The first factor tells us that in phase space only one state fits in a volume element $2\pi\hbar$ per degree of freedom. A factor proportional to $1/\hbar$ correcting each component of the phase-space volume element could have been expected, since we know that each state must satisfy $\Delta p_i\,\Delta q_i > \hbar/2$: it is not possible to find states occupying a smaller volume in phase space. Note, moreover, that this factor renders $Z$ dimensionless, as it should be.

The second factor tells us that $N$ identical particles having the set of momenta and coordinates $\{(p_1, q_1), (p_2, q_2), \dots, (p_N, q_N)\}$ correspond to one and the same state, irrespective of which particle has which $(p_i, q_i)$. Notice that all $N!$ permutations of the coordinates and momenta appear in the integral. Dividing by $N!$ simply ensures that these are counted only once. Furthermore, as we shall see, this factor is necessary in order that the thermodynamic potentials derived from $Z$ are extensive.

10.1 Boltzmann counting and configurational integrals

Eq. (10.1) allows us to introduce the so-called correct Boltzmann counting, which corresponds to the correct counting of states in the limit where the symmetry correlations for fermions or bosons play no role. According to classical mechanics identical particles are distinguishable. Therefore, from a classical perspective, the state $p_1, \dots, p_N, q_1, \dots, q_N$ and all permutations of it are distinguishable and should be counted separately when computing $Z$. Correct Boltzmann counting means that we consider the particles as distinguishable for the purpose of computing $Z$, or the number of accessible states $\Omega(E)$, and that we then divide the result blindly by $N!$ when the particles are identical. This does not correspond to bosons or fermions, and there is no real particle following this artificial statistics. Nevertheless, correct Boltzmann counting, or simply Boltzmann statistics, provides a shortcut to the high-temperature and low-density limits of quantum statistics and is useful for analyzing them.


10.2 Configurational integrals

The integration with respect to the momenta in Eq. (10.1) can be done straightforwardly. Let us recall that $\int_{-\infty}^{\infty} dx\, e^{-x^2/2\sigma^2} = \sqrt{2\pi\sigma^2}$. Using

$$H = \sum_i \frac{p_i^2}{2m} + W(\vec r_1, \dots, \vec r_N)$$

we have

$$Z_c = \frac{1}{N!}\int d^3r_1\dots d^3r_N\; e^{-\beta W(\vec r_1,\dots,\vec r_N)}\;\frac{1}{(2\pi\hbar)^{3N}}\left(\int_{-\infty}^{\infty} e^{-\beta p^2/2m}\, dp\right)^{3N},$$

where

$$\int_{-\infty}^{\infty} e^{-\beta p^2/2m}\, dp = \sqrt{2\pi m k_B T} = \frac{2\pi\hbar}{\lambda}.$$

Consequently,

$$Z_c(V, T, N) = \frac{1}{N!}\int \frac{d^3r_1\dots d^3r_N}{\lambda^{3N}}\; e^{-\beta W(\vec r_1,\dots,\vec r_N)}, \qquad (10.3)$$

where $\lambda = 2\pi\hbar/\sqrt{2\pi m k_B T}$. The classical partition function is therefore given by a configurational integral over all possible distributions of the particles, with $e^{-\beta W}$ as weighting factor. Performing such integrals is, however, non-trivial.

Using Eq. (10.3) one may show that

$$C_V = \frac{3}{2} N k_B + \frac{\langle(\Delta W)^2\rangle}{k_B T^2},$$

where $\langle(\Delta W)^2\rangle = \langle W^2\rangle - \langle W\rangle^2$ is the mean square deviation of the interaction energy. Notice that all types of interactions (attractive or repulsive) lead to an enhancement of $C_V$. Since the second term cannot be negative, it is clear that $C_V$ does not tend to zero for $T \to 0$. The reasons for the breakdown of the classical approximation in this limit have already been discussed.

10.3 Virial and equipartition theorems

Let $y$ be any coordinate or momentum on which $H$ depends. Then we have

$$-\frac{1}{\beta}\frac{\partial}{\partial y} e^{-\beta H} = \frac{\partial H}{\partial y}\, e^{-\beta H}.$$

This implies

$$\left\langle y\,\frac{\partial H}{\partial y}\right\rangle = \frac{\int y\,\frac{\partial H}{\partial y}\, e^{-\beta H}\, dp\, dx}{\int e^{-\beta H}\, dp\, dx} = -\frac{1}{\beta}\,\frac{\int y\,\frac{\partial}{\partial y}\, e^{-\beta H}\, dp\, dx}{\int e^{-\beta H}\, dp\, dx}.$$

Integration by parts gives

$$\int y\,\frac{\partial}{\partial y}\, e^{-\beta H}\, dp\, dx = \left. y\, e^{-\beta H}\right|_{y=-\infty}^{y=\infty} - \int e^{-\beta H}\, dp\, dx, \qquad (10.4)$$

which implies

$$\left\langle y\,\frac{\partial H}{\partial y}\right\rangle = \frac{1}{\beta} = k_B T, \qquad (10.5)$$

provided that $e^{-\beta H} \to 0$ more rapidly than $1/y$ for $y \to \pm\infty$ or at the boundaries of the volume. This is obviously the case for $y = p_i$. For $y = x_i$ we need to incorporate the finite-volume restriction in the form of a single-particle potential which constrains the motion of the particles to within $V$. This cannot be done in the case of free particles. Therefore Eq. (10.5) is not applicable in the absence of a potential for $y = x_i$.

We take $y = p_i$ and obtain $y\,\frac{\partial H}{\partial y} = p_i\,\frac{\partial H}{\partial p_i} = \frac{p_i^2}{m}$. Consequently,

$$\left\langle\frac{p_i^2}{2m}\right\rangle = \frac{1}{2} k_B T. \qquad (10.6)$$

This important relation is known as the equipartition theorem. It tells us that in classical systems the average kinetic energy associated with each momentum component is equal to $\frac{k_B T}{2}$, independently of the particle mass, the nature of the interparticle interactions, or any other parameters characterizing the equilibrium state, such as system size, pressure, volume or energy. Of course, the temperature and the kinetic energy per particle depend on all these parameters. In the classical limit, we may therefore associate the temperature with the average kinetic energy of the system.

Example: Consider a closed-shell nanoparticle or droplet in which an atom can be promoted from the closed surface shell to above the surface, creating an adatom and a vacancy. Which configuration is hot? Which one is cold?

Applying (10.5) to $y = x_i$ we obtain

$$\left\langle x_i\,\frac{\partial H}{\partial x_i}\right\rangle = -\langle x_i\,\dot p_i\rangle = k_B T,$$

provided that the boundary contribution to the partial integration can be neglected. Summing over all degrees of freedom we have

$$-\sum_{i=1}^{3N}\left\langle x_i\,\frac{\partial H}{\partial x_i}\right\rangle = \left\langle\sum_{i=1}^{3N} x_i\,\dot p_i\right\rangle = -3N k_B T = -2\left\langle\sum_i \frac{p_i^2}{2m}\right\rangle,$$

and using Eq. (10.6)

$$\left\langle\sum_{i=1}^{3N} x_i\,\dot p_i\right\rangle = -2\,\langle E_{\mathrm{kin}}\rangle.$$

This relation is known as the virial theorem.

Example: Consider a system of $N$ non-interacting particles in a 3D harmonic potential well:

$$W = \sum_{i=1}^{N}\frac{k r_i^2}{2} \;\Rightarrow\; \langle W\rangle = -\frac{1}{2}\sum_{i=1}^{3N}\langle x_i\,\dot p_i\rangle = \frac{3N}{2} k_B T = \langle E_{\mathrm{kin}}\rangle. \qquad (10.7)$$
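Both results can be illustrated by direct sampling of the canonical distribution, which is Gaussian in each $p_i$ and, for the harmonic well, in each $x_i$; all parameters below are arbitrary illustrative values:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative parameters (arbitrary units): mass, spring constant, kB*T.
m, k, kT = 1.5, 2.0, 0.7
n_samples = 400_000

# In the canonical ensemble each Cartesian p and x is Gaussian distributed:
# P(p) ~ exp(-p^2 / (2 m kT)),  P(x) ~ exp(-k x^2 / (2 kT)).
p = rng.normal(0.0, np.sqrt(m * kT), n_samples)
x = rng.normal(0.0, np.sqrt(kT / k), n_samples)

ekin = np.mean(p**2) / (2 * m)   # <p^2/2m> per degree of freedom
epot = np.mean(k * x**2 / 2)     # <k x^2/2> per degree of freedom

# Equipartition and Eq. (10.7): both averages equal kT/2 per degree of freedom.
assert abs(ekin - kT / 2) < 0.01
assert abs(epot - kT / 2) < 0.01
```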


10.4 The ideal classical gas

We consider a classical system in the absence of interactions. Setting $W = 0$ in Eq. (10.3), the canonical partition function is given by

$$Z_c(T, V, N) = \frac{1}{N!}\left(\frac{V}{\lambda^3}\right)^N = \frac{1}{N!}\left[Z_c(T, V, 1)\right]^N.$$

Using Stirling's formula, $\ln N! = N\ln\left(\frac{N}{e}\right) + O(\ln N)$, we obtain

$$\ln Z_c = N\ln\left(\frac{V}{\lambda^3}\right) - N\ln\left(\frac{N}{e}\right) = N\ln\left(\frac{V}{N}\,\frac{e}{\lambda^3}\right).$$

The free energy is thus given by

$$F(T, V, N) = -N k_B T\,\ln\left(\frac{e}{N}\,\frac{V}{\lambda^3}\right).$$

It is important to remark that the factor $1/N!$ in $Z_c$, which is a consequence of the indistinguishability of identical particles, is absolutely necessary in order that $F$ be extensive, i.e., $F(T, \alpha V, \alpha N) = \alpha\, F(T, V, N)$. Keeping in mind that $\lambda = 2\pi\hbar/\sqrt{2\pi m k_B T} \sim 1/\sqrt{T}$, one easily obtains all thermodynamic properties:

$$p = -\left.\frac{\partial F}{\partial V}\right|_T = \frac{N k_B T}{V},$$

$$S = -\frac{\partial F}{\partial T} = N k_B\left[\ln\left(\frac{eV}{N\lambda^3}\right) + \frac{3}{2}\right] = -\frac{F}{T} + \frac{3}{2} N k_B,$$

$$E = F + TS = \frac{3}{2} N k_B T,$$

$$H = E + pV = \frac{5}{2} N k_B T,$$

$$C_V = \left.\frac{\partial E}{\partial T}\right|_V = \frac{3}{2} N k_B,$$

$$C_p = \left.\frac{\partial H}{\partial T}\right|_p = \frac{5}{2} N k_B,$$

$$\alpha = \frac{1}{V}\left.\frac{\partial V}{\partial T}\right|_p = \frac{1}{T},$$

$$\kappa_T = -\frac{1}{V}\left.\frac{\partial V}{\partial p}\right|_T = \frac{1}{p}, \quad\text{and}$$

$$\kappa_S = \kappa_T\,\frac{C_V}{C_p} = \frac{3}{5p}.$$
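The first few of these relations can be checked by numerically differentiating $F$; the finite-difference sketch below uses arbitrary (non-SI) units, so it verifies the formulas rather than any physical numbers:

```python
import math

# Illustrative constants in arbitrary units (hbar = m = kB = 1), N particles.
kB, m, hbar, N = 1.0, 1.0, 1.0, 50.0

def lam(T):
    """Thermal de Broglie wavelength lambda = 2 pi hbar / sqrt(2 pi m kB T)."""
    return 2 * math.pi * hbar / math.sqrt(2 * math.pi * m * kB * T)

def F(T, V):
    """F = -N kB T ln(e V / (N lambda^3)) for the ideal classical gas."""
    return -N * kB * T * math.log(math.e * V / (N * lam(T) ** 3))

T0, V0, h = 2.0, 100.0, 1e-6

p = -(F(T0, V0 + h) - F(T0, V0 - h)) / (2 * h)   # p = -dF/dV
S = -(F(T0 + h, V0) - F(T0 - h, V0)) / (2 * h)   # S = -dF/dT
E = F(T0, V0) + T0 * S                            # E = F + T S

assert abs(p - N * kB * T0 / V0) < 1e-5           # p = N kB T / V
assert abs(E - 1.5 * N * kB * T0) < 1e-3          # E = (3/2) N kB T
```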


The grand-canonical partition function is given by

$$Z_{gc} = \sum_{N=0}^{\infty} e^{\beta\mu N}\, Z_c(N) = \sum_{N=0}^{\infty}\frac{z^N}{N!}\left(\frac{V}{\lambda^3}\right)^N = e^{zV/\lambda^3},$$

where we have introduced the fugacity

$$z = e^{\beta\mu},$$

which satisfies the following useful relations:

$$\frac{\partial z}{\partial\mu} = \beta z \quad\text{and}\quad \frac{\partial z}{\partial\beta} = \mu z.$$

The grand canonical potential reads

$$\Phi(T, V, z) = -k_B T\,\frac{z V}{\lambda^3}$$

or

$$\Phi(T, V, \mu) = -k_B T\, e^{\beta\mu}\,\frac{V}{\lambda^3}.$$

The average number of particles is given by

$$N = -\frac{\partial\Phi}{\partial\mu} = e^{\beta\mu}\,\frac{V}{\lambda^3} = \frac{z V}{\lambda^3}.$$

Using this expression for $N$ we can write

$$\Phi = -N k_B T. \qquad (10.8)$$

Moreover, from the definition of $\Phi$ (Laplace transformation) we know that $\Phi = -pV$ in general. In the present case Eq. (10.8) implies

$$pV = N k_B T.$$

Thus, we recover the known equation of state of the ideal classical gas.
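In fact the particle-number distribution $P(N) = z^N Z_c(N)/Z_{gc}$ is a Poisson distribution with mean $zV/\lambda^3$; a short numerical check confirms both the closed form of $Z_{gc}$ and the average $\langle N\rangle$ (the value of $x = zV/\lambda^3$ below is arbitrary):

```python
import math

# Illustrative value of x = z V / lambda^3 (arbitrary units).
x = 3.7

# Z_gc = sum_N z^N Zc(N) = sum_N x^N / N!  ->  e^x.
Zgc = sum(x**N / math.factorial(N) for N in range(100))
assert abs(Zgc - math.exp(x)) < 1e-9

# <N> = sum_N N P(N) with P(N) = x^N / (N! Zgc): Poisson, so <N> = x.
avg_N = sum(N * x**N / math.factorial(N) for N in range(100)) / Zgc
assert abs(avg_N - x) < 1e-9
```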

10.5 Derivation of the classical limit

The quantum expression for the canonical partition function is

$$Z_c(N) = \frac{1}{N!}\sum_{k_1=1}^{\infty}\cdots\sum_{k_N=1}^{\infty}\langle k_1\dots k_N|e^{-\beta H}|k_1\dots k_N\rangle,$$

where the symmetrized states $|k_1, \dots, k_N\rangle$ are given by

$$\langle x_1\dots x_N|k_1\dots k_N\rangle = \Phi^s(x_1\dots x_N) = \frac{1}{\sqrt{N!}}\sum_P (\pm 1)^p\,\varphi_{k_1}(x_{P(1)})\dots\varphi_{k_N}(x_{P(N)})$$

and the average values are given by

$$\langle k_1\dots k_N|e^{-\beta H}|k_1\dots k_N\rangle = \int d^3r_1\dots d^3r_N\;\Phi^s(x_1\dots x_N)^*\, e^{-\beta H}\,\Phi^s(x_1\dots x_N).$$

Notice that we only need to consider $Z_c(N)$, since $Z_{gc} = \sum_N z^N Z_c(N)$.


11 The ideal quantum gases

The simplest system of $N$ identical particles is found in the absence of interactions. In this case the Hamiltonian is given by

$$H = \sum_{i=1}^{N}\frac{p_i^2}{2m},$$

assuming that the particles have a rest mass $m$. The case of photons will be discussed separately. In order to compute the partition function we need to find the eigenstates of the system, or at least we need to classify them by finding the appropriate quantum numbers, the corresponding eigenvalues and their degeneracies. In the following we assume for simplicity that the particles are spinless.

Since there are no interactions among the particles, the Hamiltonian is simply the sum of the operators $p_i^2/2m$, each acting on a different variable $\vec r_i$. Under these circumstances the eigenfunctions of $H$ are products of single-particle wave functions, each one depending on one $\vec r_i$ and being an eigenfunction of $p_i^2/2m$. These simple products, however, need to be symmetrized (antisymmetrized) in order to yield appropriate wave functions of indistinguishable bosons (fermions). The general form of the $N$-particle eigenfunctions is

$$\Phi_{\{n_{\vec p}\}}(x_1,\dots,x_N) = \frac{1}{\sqrt{N!}}\sum_P (\pm 1)^p\, u_{\vec p_1}\!\left(x_{P(1)}\right)\dots u_{\vec p_N}\!\left(x_{P(N)}\right), \qquad (11.1)$$

where

$$u_{\vec p}(\vec r) = \frac{1}{\sqrt{V}}\, e^{\frac{i}{\hbar}\vec p\cdot\vec r}. \qquad (11.2)$$

The $+$ ($-$) sign corresponds to bosons (fermions). As already discussed, $\Phi_{\{n_{\vec p}\}}$ depends only on the occupation numbers $\{n_{\vec p}\}$, i.e., on the number of times that the single-particle state $u_{\vec p}(\vec r)$ appears on the right-hand side of (11.1). For bosons all positive or zero values of $n_{\vec p}$ are allowed ($n_{\vec p} = 0, 1, 2, \dots$), while for fermions we have only $n_{\vec p} = 0$ or $1$ (Pauli exclusion principle). In all cases the sum of the occupation numbers must be equal to the number of particles:

$$N = \sum_{\vec p} n_{\vec p}, \qquad (11.3)$$

where the sum runs over all possible values of the momentum of a single particle in the volume $V$. The fact that the eigenstates of $H$ depend only on $\{n_{\vec p}\}$ can be stressed by introducing the occupation-number representation. The many-body states or kets

$$|\{n_{\vec p}\}\rangle = |n_1, n_2, \dots\rangle$$

have definite occupation numbers of all single-particle eigenstates of the momentum operator $\vec p$. The normalized wave function corresponding to $|\{n_{\vec p}\}\rangle$ is given by

$$\langle x_1,\dots,x_N|\{n_{\vec p}\}\rangle = \frac{1}{\sqrt{\prod_\alpha n_\alpha!}}\,\Phi_{\{n_{\vec p}\}}(x_1,\dots,x_N). \qquad (11.4)$$

The states $|n_1, n_2, \dots\rangle$ constitute a complete orthonormal set satisfying

$$\langle n_1, n_2, \dots|n_1', n_2', \dots\rangle = \delta_{n_1 n_1'}\,\delta_{n_2 n_2'}\cdots$$

Since u~p(r) is an eigenfunction of ~p we have

p2

2mu~p 0 =

p0 2

2mu~p 0 = "~p 0 u~p 0 (11.5)

This implies that the N -particle wave function satisfies

H�{n~p

} = E{n~p

}�{n~p

} (11.6)

where the eigenenergieE{n

~p

} =X~p

n~p "~p (11.7)

is the sum of the single-particle eigenenergies of the occupied states u~p. In the occupationnumber representation we have

H|n1, n2, . . . n1i = X

i

ni "i

!|n1, n2, . . . n1i (11.8)

where i = 1, 2, 3, . . . , is an index that enumerates the different possibles values of themomentum ~p. We can also write

\[
H\, |\{n_{\vec p}\}\rangle = \Big( \sum_{\vec p} n_{\vec p}\, \varepsilon_{\vec p} \Big)\, |\{n_{\vec p}\}\rangle . \qquad (11.9)
\]

In order to determine the allowed values of $\vec p$, we consider a cubic volume $V = L^3$ with side $L$. A complete basis of eigenfunctions of the momentum operator $\hat{\vec p}$ is given by

\[
u_{\vec p} = \frac{1}{\sqrt{V}}\, e^{\frac{i}{\hbar} \vec p \cdot \vec r} ,
\]
where the values of $\vec p$ satisfy the periodic boundary condition
\[
\frac{p_\alpha L}{\hbar} = 2\pi n_\alpha ,
\]
with $\alpha = x, y, z$ and $n_\alpha \in \mathbb{Z}$. This constitutes a complete set of single-particle wave functions in the volume $V$. In a more compact form we may write
\[
\vec p = \frac{2\pi\hbar}{L}\, \vec n , \qquad (11.10)
\]

where $\vec n \in \mathbb{Z}^3$ is a vector with integer components. In the limit $V \to \infty$ the allowed values of $\vec p$ form nearly a continuum. We will then be able to replace
\[
\sum_{\vec p} \ldots \;\to\; \frac{V}{(2\pi\hbar)^3} \int \ldots\, d^3p , \qquad (11.11)
\]


provided that all the terms in the sum over $\vec p$ are finite. Later on we will see, in the context of Bose–Einstein condensation, that this condition is not always satisfied and that a diverging term ($\vec p = 0$) needs to be singled out. The factor $V/(2\pi\hbar)^3$ appears because it gives the number of allowed values of $\vec p$ per volume element $d^3p$. Formally, we may write
\[
\sum_{\vec p} \cdots = \sum_{\vec n} \ldots \;\to\; \int \ldots\, d^3n = \frac{V}{(2\pi\hbar)^3} \int \ldots\, d^3p ,
\]
since
\[
\vec n = \frac{L}{2\pi\hbar}\, \vec p \;\Rightarrow\; d^3n = dn_x\, dn_y\, dn_z = \frac{V}{(2\pi\hbar)^3}\, d^3p .
\]
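This counting rule can be illustrated numerically: the number of integer vectors $\vec n$ inside a sphere of radius $R$ approaches the continuum estimate $\frac{4}{3}\pi R^3$ (i.e., $V\, d^3p/(2\pi\hbar)^3$ for the corresponding momentum sphere) as $R$ grows. A minimal sketch; the radius is an arbitrary illustrative value:

```python
import math

def count_states(R):
    # number of integer vectors n = (nx, ny, nz) with |n| <= R, i.e. the
    # number of allowed momenta p = (2 pi hbar / L) n inside the sphere
    r = int(R)
    count = 0
    for nx in range(-r, r + 1):
        for ny in range(-r, r + 1):
            for nz in range(-r, r + 1):
                if nx * nx + ny * ny + nz * nz <= R * R:
                    count += 1
    return count

R = 20.0
continuum = 4.0 / 3.0 * math.pi * R**3   # continuum replacement of the sum
print(count_states(R), continuum)        # relative deviation shrinks as R grows
```

The deviation between the two counts is a surface effect, which becomes negligible in the limit $V \to \infty$.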

Computing the canonical partition function $Z_c(N)$ for a fixed number of particles is quite complicated, since we would need to impose the constraint
\[
N = \sum_{\vec p} n_{\vec p} .
\]

This restriction is absent in the grand-canonical ensemble, where all possible values of the occupation numbers and of the total number of particles $N$ are allowed. The grand partition function is given by
\[
Z_{gc} = \mathrm{Tr}\{ e^{-\beta(H - \mu N)} \} = \mathrm{Tr}\{ z^N e^{-\beta H} \} = \sum_{\{n_{\vec p}\}} e^{-\beta \sum_{\vec p} (\varepsilon_{\vec p} - \mu)\, n_{\vec p}} = \sum_{\{n_{\vec p}\}} z^{\sum_{\vec p} n_{\vec p}}\; e^{-\beta \sum_{\vec p} \varepsilon_{\vec p}\, n_{\vec p}} .
\]

The sum runs over all possible states $|\{n_{\vec p}\}\rangle$ with definite occupations $\{n_{\vec p}\} = \{n_1, n_2, \ldots, n_\infty\}$. The number of particles in $|\{n_{\vec p}\}\rangle$ is
\[
N = \sum_{\vec p} n_{\vec p} \qquad (11.12)
\]
and the eigenenergy of $|\{n_{\vec p}\}\rangle$ is
\[
E = \sum_{\vec p} \varepsilon_{\vec p}\, n_{\vec p} . \qquad (11.13)
\]

We must thus sum over all possible values of $n_{\vec p}$ for each $\vec p$. Notice that every set of occupation numbers $n_1, n_2, \ldots, n_\infty$ defines one and only one distinct state. Consequently,
\[
Z_{gc} = \sum_{n_1} \sum_{n_2} \cdots \sum_{n_\infty} e^{-\beta \sum_i (\varepsilon_i - \mu)\, n_i} = \Big( \sum_{n_1} e^{-\beta(\varepsilon_1 - \mu) n_1} \Big) \Big( \sum_{n_2} e^{-\beta(\varepsilon_2 - \mu) n_2} \Big) \cdots = \prod_{\vec p} \Big[ \sum_n e^{-\beta(\varepsilon_{\vec p} - \mu)\, n} \Big] = \prod_{\vec p} \Big[ \sum_n \big( e^{-\beta(\varepsilon_{\vec p} - \mu)} \big)^n \Big] .
\]


The sum over the occupation number $n$ is of course different for fermions ($n = 0, 1$) and bosons ($n = 0, 1, 2, \ldots$). For fermions we have simply
\[
Z_{gc} = \prod_{\vec p} \big( 1 + e^{-\beta(\varepsilon_{\vec p} - \mu)} \big) . \qquad (11.14)
\]
For bosons we obtain
\[
Z_{gc} = \prod_{\vec p} \frac{1}{1 - e^{-\beta(\varepsilon_{\vec p} - \mu)}} , \qquad (11.15)
\]

provided that $e^{-\beta(\varepsilon_{\vec p} - \mu)} < 1$ for all $\vec p$, which requires $\mu \le 0$. In order to facilitate some mathematical manipulations it is sometimes useful to replace the chemical potential by the fugacity
\[
z = e^{\beta\mu}
\]
as an independent (also intensive) variable of $Z_{gc}$. In terms of $z$ we have

\[
Z_{gc} =
\begin{cases}
\prod_{\vec p} \big( 1 + z\, e^{-\beta \varepsilon_{\vec p}} \big) & \text{fermions} \\[4pt]
\prod_{\vec p} \big( 1 - z\, e^{-\beta \varepsilon_{\vec p}} \big)^{-1} & \text{bosons}
\end{cases}
\qquad (11.16)
\]
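The factorization leading to (11.16) can be checked numerically for a small set of levels: summing $z^N e^{-\beta E}$ over all $2^M$ fermionic occupation configurations must reproduce the product formula. A minimal sketch; the level energies are arbitrary illustrative values:

```python
import itertools
import math

def zgc_fermi_product(eps, z, beta):
    # factorized form: Z_gc = prod_p (1 + z e^{-beta eps_p})  [fermions, Eq. (11.16)]
    result = 1.0
    for e in eps:
        result *= 1.0 + z * math.exp(-beta * e)
    return result

def zgc_fermi_sum(eps, z, beta):
    # direct sum over all occupation configurations n_p in {0, 1}
    total = 0.0
    for occ in itertools.product((0, 1), repeat=len(eps)):
        n_tot = sum(occ)
        energy = sum(n * e for n, e in zip(occ, eps))
        total += z**n_tot * math.exp(-beta * energy)
    return total

eps = [0.0, 0.3, 0.7, 1.2, 2.5]          # arbitrary single-particle energies
print(zgc_fermi_product(eps, 0.8, 1.5))
print(zgc_fermi_sum(eps, 0.8, 1.5))      # agrees with the product form
```

The bosonic case works the same way once each occupation sum is extended to all $n \ge 0$ (a geometric series).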

The grand-canonical potential is given by
\[
\Omega =
\begin{cases}
- k_B T \sum_{\vec p} \ln\!\big[ 1 + e^{-\beta(\varepsilon_{\vec p} - \mu)} \big] & \text{fermions} \\[4pt]
\phantom{-} k_B T \sum_{\vec p} \ln\!\big[ 1 - e^{-\beta(\varepsilon_{\vec p} - \mu)} \big] & \text{bosons}
\end{cases}
\qquad (11.17)
\]

Since $\Omega = -pV$ we obtain the pressure of the quantum gases from
\[
\frac{pV}{k_B T} =
\begin{cases}
\phantom{-} \sum_{\vec p} \ln\!\big[ 1 + e^{-\beta(\varepsilon_{\vec p} - \mu)} \big] & \text{fermions} \\[4pt]
- \sum_{\vec p} \ln\!\big[ 1 - e^{-\beta(\varepsilon_{\vec p} - \mu)} \big] & \text{bosons}
\end{cases}
\qquad (11.18)
\]

The chemical potential $\mu$ (or $z$) is fixed by the average number of particles:
\[
N = -\frac{\partial \Omega}{\partial \mu} = \sum_{\vec p} \frac{e^{-\beta(\varepsilon_{\vec p} - \mu)}}{1 + e^{-\beta(\varepsilon_{\vec p} - \mu)}} = \sum_{\vec p} \frac{1}{e^{\beta(\varepsilon_{\vec p} - \mu)} + 1} \qquad (11.19)
\]
for fermions, and
\[
N = \sum_{\vec p} \frac{e^{-\beta(\varepsilon_{\vec p} - \mu)}}{1 - e^{-\beta(\varepsilon_{\vec p} - \mu)}} = \sum_{\vec p} \frac{1}{e^{\beta(\varepsilon_{\vec p} - \mu)} - 1} \qquad (11.20)
\]

for bosons. These equations allow us to solve for $\mu(N)$, which can be replaced in (11.18) to obtain the equations of state.


The average occupation of each single-particle state $\vec p$ can be calculated from
\[
\langle n_{\vec p} \rangle = \frac{\mathrm{Tr}\big\{ n_{\vec p}\, e^{-\beta(H - \mu N)} \big\}}{Z_{gc}} = \frac{1}{Z_{gc}} \sum_{\{n_{\vec p\,'}\}} n_{\vec p}\; e^{-\beta \sum_{\vec p\,'} (\varepsilon_{\vec p\,'} - \mu)\, n_{\vec p\,'}} = -\frac{1}{\beta}\, \frac{\partial}{\partial \varepsilon_{\vec p}} \ln Z_{gc} = \frac{\partial \Omega}{\partial \varepsilon_{\vec p}} ,
\]

which yields
\[
\langle n_{\vec p} \rangle =
\begin{cases}
\dfrac{1}{e^{\beta(\varepsilon_{\vec p} - \mu)} + 1} & \text{fermions} \\[10pt]
\dfrac{1}{e^{\beta(\varepsilon_{\vec p} - \mu)} - 1} & \text{bosons}
\end{cases}
\]
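These occupation functions are elementary to evaluate; a minimal sketch in arbitrary energy units, checking two of their characteristic properties:

```python
import math

def n_fermi(eps, mu, beta):
    # <n_p> = 1 / (e^{beta(eps - mu)} + 1): bounded between 0 and 1
    return 1.0 / (math.exp(beta * (eps - mu)) + 1.0)

def n_bose(eps, mu, beta):
    # <n_p> = 1 / (e^{beta(eps - mu)} - 1): requires eps > mu, diverges as eps -> mu
    return 1.0 / (math.exp(beta * (eps - mu)) - 1.0)

# at eps = mu the Fermi occupation is exactly 1/2
assert abs(n_fermi(1.0, 1.0, 10.0) - 0.5) < 1e-12
# particle-hole symmetry: n(mu + x) + n(mu - x) = 1
assert abs(n_fermi(1.3, 1.0, 5.0) + n_fermi(0.7, 1.0, 5.0) - 1.0) < 1e-12
# the Bose occupation always exceeds the Fermi one at equal arguments
assert n_bose(1.2, 1.0, 5.0) > n_fermi(1.2, 1.0, 5.0)
```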

Comparison with Eqs. (11.19) and (11.20) shows, as expected, that $\sum_{\vec p} \langle n_{\vec p} \rangle = N$. We may now also confirm that $e^{-\beta(\varepsilon_{\vec p} - \mu)} = z\, e^{-\beta \varepsilon_{\vec p}} < 1$ for bosons, as required for the convergence of the geometric series leading to (11.15), since $n_{\vec p}$ has only non-negative eigenvalues. In other terms, $\langle n_{\vec p} \rangle \ge 0 \Rightarrow e^{\beta(\varepsilon_{\vec p} - \mu)} \ge 1 \Rightarrow \mu \le \varepsilon_{\vec p}\ \forall\, \vec p \Rightarrow \mu \le 0$. The chemical potential $\mu$ (or $z$) must always satisfy this condition. Moreover, since $N/V$ is necessarily finite, $\langle n_{\vec p} \rangle / V$ is finite for all $\vec p$. For fermions $\langle n_{\vec p} \rangle$ itself is always finite for all $\vec p$.

In order to pursue the discussion we take the limit $V \to \infty$ and replace
\[
\sum_{\vec p} \;\to\; \frac{V}{(2\pi\hbar)^3} \int d^3p
\]
whenever this is possible. In fact, such a replacement is valid if all the terms in the sum are finite. If this is not the case, we must single out the terms that risk diverging. In the following we work out the Fermi and Bose gases separately.


12 Fermi systems: The ideal Fermi gas

In order to analyze the properties of the ideal Fermi gas in more detail, it is convenient to replace $e^{\beta\mu} = z$ and to consider the fugacity $z$, $T$ and $V$ as the thermodynamic variables. Let us examine the equation of state
\[
\frac{pV}{k_B T} = \sum_{\vec p} \ln\!\big[ 1 + z\, e^{-\beta \varepsilon_{\vec p}} \big] . \qquad (12.1)
\]

Since $z > 0$ and $e^{-\beta \varepsilon_{\vec p}} > 0$, all the terms in the sum are finite. We may then write
\[
\frac{pV}{k_B T} = \frac{V}{(2\pi\hbar)^3} \int \ln\!\big[ 1 + z\, e^{-\beta p^2/2m} \big]\, d^3p \qquad (12.2)
\]
\[
= \frac{V}{(2\pi\hbar)^3}\, 4\pi \int_0^\infty p^2 \ln\!\big[ 1 + z\, e^{-\beta p^2/2m} \big]\, dp . \qquad (12.3)
\]

Introducing the dimensionless integration variable $x = p/\sqrt{2 m k_B T}$, so that $x^2 = \beta p^2 / 2m$, in Eq. (12.3) we have
\[
\frac{p}{k_B T} = \frac{4\pi}{(2\pi\hbar)^3} \big( \sqrt{2 m k_B T} \big)^3 \int_0^\infty x^2 \ln\!\big[ 1 + z\, e^{-x^2} \big]\, dx . \qquad (12.4)
\]

Recalling that $\lambda = \frac{2\pi\hbar}{\sqrt{2\pi m k_B T}}$ is the thermal wavelength, we obtain the equation of state
\[
\frac{p}{k_B T} = \frac{1}{\lambda^3}\, f_{5/2}(z) , \qquad (12.5)
\]

where
\[
f_{5/2}(z) = \frac{4}{\sqrt{\pi}} \int_0^\infty x^2 \ln\!\big[ 1 + z\, e^{-x^2} \big]\, dx = \sum_{l=1}^\infty \frac{(-1)^{l-1} z^l}{l^{5/2}} , \qquad (12.6)
\]
a series which at $z = 1$ is closely related to the Riemann $\zeta$-function. It is clear that $f_{5/2}(0) = 0$. Notice that the integral is well defined for all $z \ge 0$; the Taylor expansion, however, converges only for $|z| \le 1$. The fugacity $z$ is an intensive thermodynamic variable that defines the particle number $N$ at the given temperature $T$. Alternatively, we may regard $z$ as a function of $T$ and $N$.

In order to obtain the Taylor expansion of $f_{5/2}(z)$ for small $z$ we expand $\ln(1 + \alpha z)$ around $z = 0$. The first derivatives are
\[
\frac{d}{dz} \ln(1 + \alpha z) = \frac{\alpha}{1 + \alpha z} , \quad \frac{d^2}{dz^2} \ln(1 + \alpha z) = -\frac{\alpha^2}{(1 + \alpha z)^2} , \quad \frac{d^3}{dz^3} \ln(1 + \alpha z) = \frac{2\alpha^3}{(1 + \alpha z)^3} , \;\ldots \qquad (12.7)
\]
In general we have
\[
\frac{d^l}{dz^l} \ln(1 + \alpha z) = \frac{(-1)^{l-1}\, (l-1)!\; \alpha^l}{(1 + \alpha z)^l}
\]


and
\[
\frac{d^l}{dz^l} \ln(1 + \alpha z) \bigg|_{z=0} = (-1)^{l-1}\, (l-1)!\; \alpha^l .
\]

For the expansion of $f_{5/2}$ we have $\alpha = e^{-x^2}$. Therefore we need the integrals
\[
\frac{4}{\sqrt{\pi}} \int_0^\infty x^2 e^{-l x^2}\, dx = \frac{4}{\sqrt{\pi}}\, \frac{1}{l^{3/2}} \underbrace{\int_0^\infty y^2 e^{-y^2}\, dy}_{\sqrt{\pi}/4} = \frac{1}{l^{3/2}} ,
\]
where we have used $y = \sqrt{l}\, x$ and $dx = dy/\sqrt{l}$. Since $f_{5/2}(z) = \frac{4}{\sqrt{\pi}} \int_0^\infty dx\, x^2 \ln(1 + z e^{-x^2})$, we have
\[
\frac{d^l f_{5/2}(z)}{dz^l} \bigg|_{z=0} = (-1)^{l-1} (l-1)!\; \underbrace{\frac{4}{\sqrt{\pi}} \int_0^\infty x^2 e^{-l x^2}\, dx}_{l^{-3/2}} = (-1)^{l-1} (l-1)!\, \frac{1}{l^{3/2}} = (-1)^{l-1}\, \frac{l!}{l^{5/2}} .
\]

Recalling that $f_{5/2}(0) = 0$, we finally have
\[
f_{5/2}(z) = \sum_{l=1}^\infty \frac{(-1)^{l-1} z^l}{l^{5/2}} = z - \frac{z^2}{4\sqrt{2}} + \frac{z^3}{9\sqrt{3}} - \ldots
\]
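The series and the integral representation of $f_{5/2}$ can be compared numerically; a minimal sketch using a hand-rolled Simpson rule (the value of $z$, the truncation order and the integration cutoff are arbitrary choices):

```python
import math

def f52_series(z, terms=200):
    # f_{5/2}(z) = sum_{l>=1} (-1)^{l-1} z^l / l^{5/2}  (converges for |z| <= 1)
    return sum((-1) ** (l - 1) * z**l / l**2.5 for l in range(1, terms + 1))

def f52_integral(z, upper=10.0, n=4000):
    # f_{5/2}(z) = (4/sqrt(pi)) int_0^inf x^2 ln(1 + z e^{-x^2}) dx,
    # by Simpson's rule; the integrand is negligible beyond x = 10
    h = upper / n
    g = lambda x: x * x * math.log(1.0 + z * math.exp(-x * x))
    s = g(0.0) + g(upper)
    for i in range(1, n):
        s += (4 if i % 2 else 2) * g(i * h)
    return 4.0 / math.sqrt(math.pi) * s * h / 3.0

z = 0.5
print(f52_series(z), f52_integral(z))  # the two representations agree
```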

The equation for the number of particles reads
\[
N = \sum_{\vec p} \frac{1}{e^{\beta(\varepsilon_{\vec p} - \mu)} + 1} = \sum_{\vec p} \frac{1}{z^{-1} e^{\beta \varepsilon_{\vec p}} + 1} .
\]

Since $z = e^{\beta\mu} > 0$, all terms in the sum are finite. Thus, for $V \to \infty$ we may write
\[
N = \frac{V}{(2\pi\hbar)^3}\, 4\pi \int_0^\infty \frac{p^2}{z^{-1} e^{\beta \varepsilon_{\vec p}} + 1}\, dp
\]

and
\[
\frac{N}{V} = \frac{1}{v} = \frac{1}{\lambda^3}\, f_{3/2}(z) , \qquad (12.8)
\]
where
\[
f_{3/2}(z) = \frac{4}{\sqrt{\pi}} \int_0^\infty \frac{x^2}{z^{-1} e^{x^2} + 1}\, dx = \frac{4}{\sqrt{\pi}} \int_0^\infty \frac{z\, x^2}{e^{x^2} + z}\, dx . \qquad (12.9)
\]


Notice that
\[
z\, \frac{d f_{5/2}}{dz} = z\, \frac{4}{\sqrt{\pi}} \int_0^\infty x^2\, \frac{e^{-x^2}}{1 + z\, e^{-x^2}}\, dx = \frac{4}{\sqrt{\pi}} \int_0^\infty x^2\, \frac{1}{z^{-1} e^{x^2} + 1}\, dx = f_{3/2}(z) .
\]
Consequently,
\[
f_{3/2}(z) = z\, \frac{d f_{5/2}}{dz} = \sum_{l=1}^\infty \frac{(-1)^{l-1} z^l}{l^{3/2}} . \qquad (12.10)
\]

Both $f_{5/2}$ and $f_{3/2}$ have convergent Taylor expansions for $z \to 0$:
\[
f_{3/2}(z) \approx z - \frac{z^2}{2\sqrt{2}} + \ldots
\]
and
\[
f_{5/2}(z) \approx z - \frac{z^2}{4\sqrt{2}} + \ldots
\]
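The relation $f_{3/2}(z) = z\, df_{5/2}/dz$ is easy to cross-check numerically with a central difference; a minimal sketch (test point and step size are arbitrary choices):

```python
import math

def f_series(z, s, terms=300):
    # f_s(z) = sum_{l>=1} (-1)^{l-1} z^l / l^s  (|z| <= 1)
    return sum((-1) ** (l - 1) * z**l / l**s for l in range(1, terms + 1))

z, h = 0.6, 1e-6
deriv = (f_series(z + h, 2.5) - f_series(z - h, 2.5)) / (2.0 * h)
print(z * deriv, f_series(z, 1.5))  # z f_{5/2}'(z) coincides with f_{3/2}(z)
```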

12.1 Energy-pressure relation

An interesting relation may be derived for the average energy $\langle E \rangle$. We know that
\[
pV = -\Omega = k_B T \ln Z_{gc}
\]
and therefore
\[
\ln Z_{gc} = \frac{pV}{k_B T} = \frac{V}{\lambda^3}\, f_{5/2}(z) . \qquad (12.11)
\]

On the other side,
\[
Z_{gc} = \mathrm{Tr} \Big\{ \sum_N z^N e^{-\beta H} \Big\} ,
\]
which implies that
\[
- \frac{\partial \ln Z_{gc}}{\partial \beta} \bigg|_{z,V} = \frac{1}{Z_{gc}}\, \mathrm{Tr} \Big\{ \sum_N z^N e^{-\beta H}\, H \Big\} = \langle E \rangle . \qquad (12.12)
\]

Notice that $z$ is kept fixed, and not $\mu$. Combining (12.11) and (12.12), and recalling that $-\frac{\partial}{\partial \beta} \big( \frac{1}{\lambda^3} \big) = \frac{3}{2}\, \frac{1}{\beta}\, \frac{1}{\lambda^3}$, we have
\[
\langle E \rangle = -\frac{\partial \ln Z_{gc}}{\partial \beta} = V f_{5/2}(z)\; \frac{3}{2}\, \frac{1}{\beta}\, \frac{1}{\lambda^3} = \frac{3}{2}\, k_B T\, \frac{V}{\lambda^3}\, f_{5/2}(z) = \frac{3}{2}\, pV .
\]


Remarkably, the relation
\[
E = \frac{3}{2}\, pV \qquad (12.13)
\]
is the same as in the classical limit. However, $pV$ is not simply proportional to $T$! Using thermodynamic relations, a number of other properties can be derived from Eq. (12.13). See the following exercises.

Exercise 12.18: The energy of the quantum Fermi gas is given by $E = \frac{3}{2} pV$. Show that this implies that the adiabatic compressibility is also given by the classical expression $\kappa_S = \frac{3}{5p}$. Are there other classical-ideal-gas relations which survive in the quantum case?

Exercise 12.19: Calculate the entropy $S$ of the non-interacting Fermi gas. Express the result in terms of $\langle n_{\vec p} \rangle$ and interpret the result physically. Hint: for the interpretation, calculate the number of accessible states $\Omega$ for $N_i$ fermions occupying $g_i$ states, and express $\Omega$ in terms of the probability $n_i = N_i/g_i$ for an individual state to be occupied.

Summarizing so far, the functions
\[
f_{5/2}(z) = \frac{4}{\sqrt{\pi}} \int_0^\infty dx\, x^2 \ln(1 + z e^{-x^2})
\]
and
\[
f_{3/2}(z) = z f'_{5/2}(z) = \frac{4}{\sqrt{\pi}} \int_0^\infty dx\, \frac{x^2}{z^{-1} e^{x^2} + 1}
\]
define, on the one side, the grand canonical potential $\Omega$ per particle and the equation of state in terms of $\lambda^3/v$ and the fugacity $z$:
\[
\Omega = -k_B T N\, \frac{v}{\lambda^3}\, f_{5/2}(z) , \qquad \frac{p v}{k_B T} = \frac{v}{\lambda^3}\, f_{5/2}(z) .
\]
On the other side, they allow us to obtain $z$ as a function of $\lambda^3/v$ by solving
\[
\frac{\lambda^3}{v} = f_{3/2}(z) .
\]
They can be expressed in the form of integrals, power series or asymptotic expansions. Otherwise, they can only be evaluated numerically. In particular, there are no closed expressions, neither for $N$ as a function of $z$, nor for the much-needed $z = z(N)$ in the general case. There are, however, two most important limits, which deserve to be worked out analytically.


12.2 High temperatures and low densities

The limit of low densities and/or high temperatures corresponds to $\lambda^3/v \to 0$. Since $f_{3/2}(z)$ is a monotonously increasing function of $z$, this corresponds to $z \to 0$. This limit is easy to discuss, since
\[
f_{3/2}(z) = \sum_{l=1}^\infty \frac{(-1)^{l-1} z^l}{l^{3/2}}
\]
and
\[
f_{5/2}(z) = \sum_{l=1}^\infty \frac{(-1)^{l-1} z^l}{l^{5/2}}
\]

have convergent Taylor expansions at $z = 0$. We obtain $z$ from
\[
\frac{\lambda^3}{v} = z - \frac{z^2}{2\sqrt{2}} + O(z^3) ,
\]
which implies
\[
z = \frac{\lambda^3}{v} + \frac{1}{2\sqrt{2}}\, z^2 + O(z^3)
\]
and thus, iterating,
\[
z = \frac{\lambda^3}{v} + \frac{1}{2\sqrt{2}} \left( \frac{\lambda^3}{v} \right)^{2} + O\!\left[ \left( \frac{\lambda^3}{v} \right)^{3} \right] .
\]

Expanding $f_{5/2}$ we have
\[
\frac{p v}{k_B T} = \frac{v}{\lambda^3} \left[ z - \frac{z^2}{4\sqrt{2}} + O(z^3) \right] = 1 + \frac{1}{2\sqrt{2}}\, \frac{\lambda^3}{v} - \frac{1}{4\sqrt{2}}\, \frac{\lambda^3}{v} + O\!\left[ \left( \frac{\lambda^3}{v} \right)^{2} \right] = 1 + \frac{1}{4\sqrt{2}}\, \frac{\lambda^3}{v} + O\!\left[ \left( \frac{\lambda^3}{v} \right)^{2} \right] .
\]

We recover the classical limit and the first non-vanishing quantum correction, which is entirely due to the symmetry correlations (there are no interactions). Notice that the correction is proportional to $\hbar^3/T^{3/2}$. In the Fermi gas the pressure is always higher than in the classical Boltzmann gas. This is a consequence of the Pauli exclusion principle. As we shall see, the pressure of the Fermi gas is finite even at $T = 0$. The Pauli exclusion results in an effective repulsion among the particles which enhances $p$.

The grand canonical potential in the limit $\lambda^3/v \to 0$ reads
\[
\Omega = -k_B T N \left( 1 + \frac{1}{4\sqrt{2}}\, \frac{\lambda^3}{v} + O\!\left[ \left( \frac{\lambda^3}{v} \right)^{2} \right] \right)
\]
and
\[
E = \frac{3}{2}\, k_B T N \left( 1 + \frac{1}{4\sqrt{2}}\, \frac{\lambda^3}{v} + O\!\left[ \left( \frac{\lambda^3}{v} \right)^{2} \right] \right) ,
\]


from which the corrections to all other thermodynamic properties can be obtained ($C_V$, $C_p$, $\kappa_T$, $\alpha$, $S$, etc.).

The average occupation $\langle n_{\vec p} \rangle$ for $z \to 0$ (i.e., $\beta\mu \to -\infty$), given by
\[
\langle n_{\vec p} \rangle = \frac{1}{z^{-1} e^{\beta \varepsilon_{\vec p}} + 1} \simeq z\, e^{-\beta \varepsilon_{\vec p}} \simeq \frac{\lambda^3}{v}\, e^{-\beta \varepsilon_{\vec p}} + O\!\left[ \left( \frac{\lambda^3}{v} \right)^{2} \right] ,
\]
takes the Maxwell–Boltzmann form.

Exercise 12.20: Generalize the expressions for $\Omega$ and $N$ for Fermi and Bose gases to the case where the single-particle energy levels $\varepsilon_{\vec p \alpha}$ have a degeneracy $g$ (i.e., $\varepsilon_{\vec p, \alpha} = \varepsilon_{\vec p}$ for $\alpha = 1, \ldots, g$). In this way one can take into account the intrinsic angular momentum $S$ of the particles ($g = 2S + 1$).

12.3 Low temperatures and high densities

A far more interesting and important limit is found at low temperatures and/or high densities, i.e., $\lambda^3/v \to +\infty$. Since $f_{3/2}(z)$ is monotonously increasing, we also have $z = e^{\beta\mu} \to +\infty$. We therefore need an asymptotic expansion of the diverging functions $f_{5/2}$ and $f_{3/2}$. This is obtained by a method due to Arnold Sommerfeld (1868–1951).

We start from
\[
f_{5/2}(z) = \frac{4}{\sqrt{\pi}} \int_0^\infty x^2 \ln(1 + z e^{-x^2})\, dx
\]
and substitute $y = x^2$, $dy = 2x\, dx$, $dx = dy/2\sqrt{y}$, to obtain
\[
f_{5/2}(z) = \frac{4}{\sqrt{\pi}} \int_0^\infty \frac{1}{2\sqrt{y}}\, y \ln(1 + z e^{-y})\, dy = \frac{2}{\sqrt{\pi}} \int_0^\infty \underbrace{\sqrt{y}}_{u'}\, \underbrace{\ln(1 + z e^{-y})}_{v}\, dy .
\]
We integrate by parts as
\[
f_{5/2}(z) = \frac{2}{\sqrt{\pi}} \Bigg\{ \underbrace{\frac{2}{3}\, y^{3/2} \ln(1 + z e^{-y}) \Big|_0^\infty}_{0} + \int_0^\infty \frac{2}{3}\, y^{3/2}\, \frac{z e^{-y}}{1 + z e^{-y}}\, dy \Bigg\}
\]
\[
f_{5/2}(z) = \frac{4}{3\sqrt{\pi}} \int_0^\infty \frac{y^{3/2}}{z^{-1} e^{y} + 1}\, dy .
\]


Before further manipulations, let us apply the same change of variables $x^2 = y$ to
\[
f_{3/2}(z) = \frac{4}{\sqrt{\pi}} \int_0^\infty \frac{x^2}{z^{-1} e^{x^2} + 1}\, dx
\]
and obtain
\[
f_{3/2}(z) = \frac{2}{\sqrt{\pi}} \int_0^\infty \frac{y^{1/2}}{z^{-1} e^{y} + 1}\, dy .
\]

We see that both functions of interest may be cast in the same form. We thus need to expand
\[
\bar f_m(z) = \int_0^\infty \underbrace{y^{m-1}}_{u'}\; \underbrace{\frac{1}{z^{-1} e^{y} + 1}}_{v}\; dy
\]
for $m = 3/2$ and $5/2$. We integrate by parts and obtain
\[
\bar f_m(z) = \underbrace{\frac{1}{m}\, y^m\, \frac{1}{z^{-1} e^{y} + 1} \bigg|_0^\infty}_{0} + \int_0^\infty \frac{1}{m}\, y^m\, \frac{z^{-1} e^{y}}{(z^{-1} e^{y} + 1)^2}\, dy .
\]

At this point it is useful to introduce the variable
\[
\nu = \beta\mu \;\Rightarrow\; z = e^{\nu} \ \text{and} \ \nu = \ln z ,
\]
and write
\[
\bar f_m(z) = \frac{1}{m} \int_0^\infty \frac{y^m\, e^{y - \nu}}{(e^{y - \nu} + 1)^2}\, dy .
\]
Replacing $t = y - \nu$ we have
\[
\bar f_m(z) = \frac{1}{m} \int_{-\nu}^\infty \frac{(\nu + t)^m\, e^{t}}{(e^{t} + 1)^2}\, dt .
\]

As $z \to \infty$, $\nu \to \infty$, and the integral approaches $\int_{-\infty}^{\infty}$. The function
\[
h(t) = \frac{e^{t}}{(e^{t} + 1)^2} = \frac{e^{-t}}{(1 + e^{-t})^2} = h(-t)
\]
is even, has a bell-like form with a width of about $2$ and a maximum at $t = 0$, and decreases very fast for $|t| \to \infty$. It is normalized,
\[
\int_{-\infty}^{\infty} h(t)\, dt = 1 ,
\]
and can be written as
\[
h(t) = -\frac{d}{dt} \left( \frac{1}{e^{t} + 1} \right) , \qquad t = \beta(\varepsilon - \mu) .
\]


Since $h(t)$ has a peaked form with a finite width $\approx 4$ around $t = 0$, we can expand
\[
(\nu + t)^m = \nu^m + m\, \nu^{m-1}\, t + \frac{m(m-1)}{2}\, \nu^{m-2}\, t^2 + \ldots
\]
around $t = 0$ and obtain
\[
\bar f_m(z) = \frac{1}{m} \int_{-\nu}^\infty dt\, \frac{e^{t}}{(e^{t} + 1)^2} \left( \nu^m + m\, \nu^{m-1}\, t + \frac{m(m-1)}{2}\, \nu^{m-2}\, t^2 + \ldots \right) .
\]

We may now replace $\int_{-\nu}^{\infty} \ldots\, dt = \int_{-\infty}^{\infty} \ldots\, dt - \int_{-\infty}^{-\nu} \ldots\, dt$, knowing that
\[
\int_{-\infty}^{-\nu} \frac{t^l\, e^{t}}{(e^{t} + 1)^2}\, dt \approx \int_{-\infty}^{-\nu} t^l\, e^{t}\, dt = t^l e^{t} \Big|_{-\infty}^{-\nu} - l \int_{-\infty}^{-\nu} t^{l-1} e^{t}\, dt \propto e^{-\nu} .
\]
Indeed, writing $A_l = \int_{-\infty}^{-\nu} t^l e^{t}\, dt$, integration by parts gives the recursion
\[
A_l = (-\nu)^l\, e^{-\nu} - l\, A_{l-1} ,
\]
so that $A_l = P_l\, e^{-\nu}$ with $P_l = (-\nu)^l - l\, P_{l-1}$, a polynomial of degree $l$ in $\nu$. The integrals $\int_{-\infty}^{-\nu} t^l \frac{e^t}{(e^t + 1)^2}\, dt$ are therefore of the order of $\nu^l e^{-\nu}$ and can be neglected in the limit $\nu = \ln z \to +\infty$. We finally have

\[
\bar f_m(z) = \frac{1}{m} \left( I_0\, \nu^m + m\, I_1\, \nu^{m-1} + \frac{m(m-1)}{2}\, I_2\, \nu^{m-2} + \ldots \right) ,
\]
where
\[
I_l = \int_{-\infty}^{+\infty} \frac{t^l\, e^{t}}{(e^{t} + 1)^2}\, dt .
\]

Since $h(t) = \frac{e^t}{(e^t + 1)^2}$ is even, all odd-$l$ integrals vanish. For $l = 0$ we have
\[
I_0 = \int_{-\infty}^{+\infty} \frac{e^{t}}{(e^{t} + 1)^2}\, dt = 1 ,
\]
and for even $l > 0$
\[
I_l = -2 \left[ \frac{\partial}{\partial \beta} \int_0^\infty dt\, \frac{t^{l-1}}{e^{\beta t} + 1} \right]_{\beta = 1} = (l-1)!\; 2l\, \big( 1 - 2^{1-l} \big)\, \zeta(l) ,
\]


where $\zeta(l)$ is the Riemann zeta function:
\[
\zeta(2) = \frac{\pi^2}{6} , \qquad \zeta(4) = \frac{\pi^4}{90} , \ \ldots
\]

We can now use this information explicitly for $m = 3/2$ and $5/2$:
\[
\bar f_{3/2}(z) = \frac{2}{3} \left[ \nu^{3/2} + \frac{\pi^2}{3}\, \frac{3}{8}\, \nu^{-1/2} + O\!\left( \nu^{-5/2} \right) \right]
\]
and therefore

\[
f_{3/2}(z) = \frac{2}{\sqrt{\pi}}\, \bar f_{3/2}(z) = \frac{4}{3\sqrt{\pi}} \left( \nu^{3/2} + \frac{\pi^2}{8}\, \nu^{-1/2} + \ldots \right) = \frac{4}{3\sqrt{\pi}}\, \nu^{3/2} \left( 1 + \frac{\pi^2}{8}\, \nu^{-2} + \ldots \right) = \frac{4}{3\sqrt{\pi}} \left( (\ln z)^{3/2} + \frac{\pi^2}{8}\, (\ln z)^{-1/2} + O\!\left[ (\ln z)^{-5/2} \right] \right) .
\]

In the same way we have
\[
f_{5/2}(z) = \frac{4}{3\sqrt{\pi}}\, \bar f_{5/2}(z) , \qquad (12.14)
\]
where
\[
\bar f_{5/2}(z) = \frac{2}{5} \left( \nu^{5/2} + \frac{\pi^2}{3}\, \frac{15}{8}\, \nu^{1/2} + O(\nu^{-3/2}) \right) .
\]

Consequently,
\[
f_{5/2}(z) = \frac{8}{15\sqrt{\pi}} \left( \nu^{5/2} + \frac{5\pi^2}{8}\, \nu^{1/2} + \ldots \right) = \frac{8}{15\sqrt{\pi}} \left\{ (\ln z)^{5/2} + \frac{5\pi^2}{8}\, (\ln z)^{1/2} + O\!\left[ (\ln z)^{-3/2} \right] \right\} .
\]

It is probably clearer to write
\[
f_{3/2}(z) = \frac{4}{3\sqrt{\pi}}\, (\ln z)^{3/2} \left( 1 + \frac{\pi^2}{8}\, (\ln z)^{-2} + O\!\left[ (\ln z)^{-4} \right] \right) \qquad (12.15)
\]
and
\[
f_{5/2}(z) = \frac{8}{15\sqrt{\pi}}\, (\ln z)^{5/2} \left( 1 + \frac{5\pi^2}{8}\, (\ln z)^{-2} + O\!\left[ (\ln z)^{-4} \right] \right) . \qquad (12.16)
\]
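The quality of the Sommerfeld expansion can be checked against a direct numerical evaluation of the integral representation of $f_{3/2}$; a minimal sketch (the value $\ln z = 20$, the cutoff and the grid size are arbitrary choices):

```python
import math

def f32_numeric(nu, upper=8.0, n=4000):
    # f_{3/2}(e^nu) = (2/sqrt(pi)) int_0^inf sqrt(y) / (e^{y - nu} + 1) dy,
    # evaluated with y = u^2 and Simpson's rule
    h = upper / n
    g = lambda u: 2.0 * u * u / (math.exp(u * u - nu) + 1.0)
    s = g(0.0) + g(upper)
    for i in range(1, n):
        s += (4 if i % 2 else 2) * g(i * h)
    return 2.0 / math.sqrt(math.pi) * s * h / 3.0

nu = 20.0  # nu = ln z: strongly degenerate regime
exact = f32_numeric(nu)
sommerfeld = 4.0 / (3.0 * math.sqrt(math.pi)) * nu**1.5 * (1.0 + math.pi**2 / (8.0 * nu * nu))
print(exact, sommerfeld)  # already very close for ln z = 20
```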

These expressions will become a lot more transparent once we replace $\ln z = \beta\mu$. Before using our expansion and solving for $z$ as a function of the density $N/V$ or the volume per particle $v = V/N$, it is useful to consider the $T = 0$ limit explicitly. In this case the occupation numbers are
\[
\langle n_{\vec p} \rangle = \frac{1}{e^{\beta(\varepsilon_{\vec p} - \mu)} + 1} =
\begin{cases}
1 & \text{if } \varepsilon_{\vec p} < \mu = \varepsilon_F \\
0 & \text{if } \varepsilon_{\vec p} > \mu = \varepsilon_F
\end{cases}
\]


All states below the Fermi energy $\varepsilon_F$ are occupied and all states above $\varepsilon_F$ are empty, where
\[
\varepsilon_F = \frac{p_F^2}{2m}
\]
and
\[
N = \frac{V}{(2\pi\hbar)^3}\, 4\pi \int_0^{p_F} p^2\, dp = \frac{V}{(2\pi\hbar)^3}\, \frac{4}{3}\pi\, p_F^3 = \frac{V}{6\pi^2\hbar^3}\, p_F^3 .
\]

The Fermi momentum is thus given by
\[
p_F = \hbar \left( \frac{6\pi^2}{v} \right)^{1/3} = \hbar\, k_F ,
\]
with $k_F$ the Fermi wave vector; $p_F$ is proportional to $\hbar$ divided by the average distance $v^{1/3}$ between the fermions. The Fermi energy is given by

which is proportional to ~ divided by the average distance v1/3 between the Fermions.The Fermi energy is given by

"F =~22m

✓6⇡2

v

◆2/3

. (12.17)

$\varepsilon_F$ and $p_F$ are quantum properties. They do not depend on the size of the system, but only on the density of fermions $1/v$. As the density grows, both $\varepsilon_F$ and $p_F$ grow.

In order to obtain the temperature scale below which quantum effects dominate, we introduce the Fermi temperature
\[
T_F = \frac{\varepsilon_F}{k_B} .
\]
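For orientation, these scales can be evaluated numerically. A minimal sketch using Eq. (12.17) as written, i.e. without spin degeneracy ($g = 1$; see Exercise 12.20), at a density of the order of the conduction-electron density of copper — an assumed illustrative value, not taken from the text:

```python
import math

# CODATA constants (SI)
hbar = 1.054571817e-34    # J s
m_e = 9.1093837015e-31    # kg
k_B = 1.380649e-23        # J / K
eV = 1.602176634e-19      # J

n = 8.5e28                # particles per m^3 (illustrative; ~ conduction electrons in Cu)

eps_F = hbar**2 / (2.0 * m_e) * (6.0 * math.pi**2 * n) ** (2.0 / 3.0)  # Eq. (12.17), g = 1
T_F = eps_F / k_B
print(eps_F / eV, T_F)    # ~ 11 eV and ~ 1.3e5 K: far above room temperature
```

Since $T_F$ comes out orders of magnitude above room temperature, metals are deep in the quantum (degenerate) regime under ordinary conditions.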

The importance of $\varepsilon_F$ and $T_F$ becomes clear if we consider finite low temperatures. To calculate $\mu$ to the lowest non-vanishing order in $T$ we write
\[
\frac{\lambda^3}{v} = f_{3/2}(z) = \frac{4}{3\sqrt{\pi}}\, (\ln z)^{3/2} \left( 1 + \frac{\pi^2}{8}\, (\ln z)^{-2} + O\!\left[ (\ln z)^{-4} \right] \right) .
\]

Recalling that
\[
\lambda = \frac{2\pi\hbar}{\sqrt{2\pi m k_B T}} = \sqrt{\frac{2\pi\hbar^2}{m}}\; \beta^{1/2} \;\Rightarrow\; \lambda^3 = \left( \frac{2\pi\hbar^2}{m} \right)^{3/2} \beta^{3/2}
\]

and replacing $\ln z = \beta\mu$, we have
\[
\frac{1}{v} \left( \sqrt{\frac{2\pi\hbar^2}{m}} \right)^{\!3} \beta^{3/2} = \frac{4\, \mu^{3/2}}{3\sqrt{\pi}}\; \beta^{3/2} \left( 1 + \frac{\pi^2}{8}\, (\beta\mu)^{-2} + \ldots \right) .
\]

Rearranging the prefactors we have
\[
\frac{1}{v}\, 6\pi^2 \left( \frac{\hbar^2}{2m} \right)^{3/2} = \mu^{3/2} \left( 1 + \frac{\pi^2}{8}\, (\beta\mu)^{-2} + O\!\left[ (\beta\mu)^{-4} \right] \right) ,
\]


which yields
\[
\varepsilon_F = \frac{\hbar^2}{2m} \left( \frac{6\pi^2}{v} \right)^{2/3} = \mu \left( 1 + \frac{\pi^2}{8}\, (\beta\mu)^{-2} + \ldots \right)^{2/3} = \mu \left( 1 + \frac{\pi^2}{12}\, (\beta\mu)^{-2} + \ldots \right)
\]
and thus
\[
\mu = \varepsilon_F - \frac{\pi^2}{12} \left( \frac{k_B T}{\varepsilon_F} \right)^{2} \frac{\varepsilon_F^2}{\mu} + O\!\left[ \left( \frac{k_B T}{\mu} \right)^{4} \right] .
\]

Replacing $\mu$ on the right-hand side and keeping the terms up to order $(k_B T/\varepsilon_F)^2$ we have
\[
\frac{\mu}{\varepsilon_F} = 1 - \frac{\pi^2}{12} \left( \frac{k_B T}{\varepsilon_F} \right)^{2} \frac{\varepsilon_F}{\mu} + O\!\left[ \left( \frac{k_B T}{\mu} \right)^{4} \right]
\]
and finally
\[
\frac{\mu}{\varepsilon_F} = 1 - \frac{\pi^2}{12} \left( \frac{k_B T}{\varepsilon_F} \right)^{2} + O\!\left[ \left( \frac{k_B T}{\mu} \right)^{4} \right] .
\]

Notice that $\mu \approx \varepsilon_F$ up to temperatures of the order of $T_F$ ($\varepsilon_F \approx 5$ eV, i.e., $T_F \approx 5 \cdot 10^4$ K, in narrow-band metals).
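A quick numerical illustration of how small this shift is at room temperature, assuming an illustrative Fermi temperature $T_F = 5 \cdot 10^4$ K:

```python
import math

T_F = 5.0e4          # K, illustrative Fermi temperature of a metal
T = 300.0            # K, room temperature
mu_over_eF = 1.0 - math.pi**2 / 12.0 * (T / T_F) ** 2
print(mu_over_eF)    # mu deviates from eps_F only in the fifth decimal place
```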

The fact that $\mu$ decreases with increasing $T$ is a consequence of the form of the single-particle energy spectrum of free particles in three dimensions. We may compute the single-particle density of states (DOS)
\[
\rho(\varepsilon) = \sum_{\vec p} \delta(\varepsilon_{\vec p} - \varepsilon) = \frac{V}{(2\pi\hbar)^3} \int d^3p\; \delta(\varepsilon_{\vec p} - \varepsilon) ,
\]

which represents the number of single-particle states per unit energy at the energy $\varepsilon$. For non-interacting particles of mass $m$ we have $\varepsilon_{\vec p} = p^2/2m$ and
\[
\rho(\varepsilon) = \frac{V}{(2\pi\hbar)^3}\, 4\pi \int_0^\infty p^2\, \delta\!\left( \frac{p^2}{2m} - \varepsilon \right) dp .
\]

Introducing
\[
\varepsilon' = \frac{p^2}{2m} \;\Rightarrow\; d\varepsilon' = \frac{p}{m}\, dp \quad \text{and} \quad dp = \frac{m}{\sqrt{2m\varepsilon'}}\, d\varepsilon' ,
\]

we have
\[
\rho(\varepsilon) = \frac{V}{(2\pi\hbar)^3}\, 4\pi\, \sqrt{2}\, m^{3/2} \int_0^\infty \sqrt{\varepsilon'}\; \delta(\varepsilon' - \varepsilon)\, d\varepsilon' .
\]
At $T = 0$ all states up to $\varepsilon_F$ are occupied:
\[
T = 0: \qquad N = \int_{-\infty}^{\varepsilon_F} \rho(\varepsilon)\, d\varepsilon ,
\]


while at finite temperatures
\[
T > 0: \qquad N = \int_{-\infty}^{+\infty} \rho(\varepsilon)\, f(\varepsilon)\, d\varepsilon , \qquad f(\varepsilon) = \frac{1}{e^{\beta(\varepsilon - \mu)} + 1} .
\]
Explicitly,
\[
\rho(\varepsilon) = \frac{V (2m)^{3/2}}{4\pi^2\hbar^3}\, \sqrt{\varepsilon} = \frac{3}{2}\, \frac{N}{\varepsilon_F^{3/2}}\, \sqrt{\varepsilon}
\]
and
\[
N = \int_{-\infty}^{+\infty} \frac{\rho(\varepsilon)\, d\varepsilon}{e^{\beta(\varepsilon - \mu)} + 1} .
\]
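This particle-number equation can be solved numerically for $\mu(T)$ with $\rho(\varepsilon) \propto \sqrt{\varepsilon}$ and compared with the Sommerfeld result above; a minimal sketch in units $\varepsilon_F = 1$ (the integration cutoff, grid size and reduced temperature are arbitrary choices):

```python
import math

def simpson(g, a, b, n=4000):
    # composite Simpson rule on [a, b]
    h = (b - a) / n
    s = g(a) + g(b)
    for i in range(1, n):
        s += (4 if i % 2 else 2) * g(a + i * h)
    return s * h / 3.0

def particle_number(mu, t):
    # (3/2) int_0^inf sqrt(e) f(e) de in units eps_F = 1 (substitution e = u^2);
    # equals 1 at the correct chemical potential
    return 1.5 * simpson(lambda u: 2.0 * u * u /
                         (math.exp((u * u - mu) / t) + 1.0), 0.0, 3.0)

def mu_of_t(t):
    # bisection: particle_number is monotonically increasing in mu
    lo, hi = 0.5, 1.5
    for _ in range(80):
        mid = 0.5 * (lo + hi)
        if particle_number(mid, t) < 1.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

t = 0.05  # T / T_F
print(mu_of_t(t), 1.0 - math.pi**2 / 12.0 * t * t)  # numerics vs Sommerfeld
```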

For free fermions we thus recover
\[
\mu = \varepsilon_F \left[ 1 - \frac{\pi^2}{12} \left( \frac{k_B T}{\varepsilon_F} \right)^{2} + O\!\left( \frac{k_B T}{\mu} \right)^{4} \right] = \varepsilon_F \left[ 1 - \frac{\pi^2}{12} \left( \frac{T}{T_F} \right)^{2} + \ldots \right] .
\]

The equation of state is obtained from
\[
\frac{p v}{k_B T} = \frac{v}{\lambda^3}\, f_{5/2}(z) = \frac{f_{5/2}(z)}{f_{3/2}(z)} = \frac{2}{5}\, \ln z \left( \frac{1 + \frac{5\pi^2}{8}\, (\ln z)^{-2} + \ldots}{1 + \frac{\pi^2}{8}\, (\ln z)^{-2} + \ldots} \right)
\]
\[
= \frac{2}{5}\, \ln z \left( 1 + \frac{5\pi^2}{8}\, (\ln z)^{-2} - \frac{\pi^2}{8}\, (\ln z)^{-2} + \ldots \right) = \frac{2}{5}\, \ln z \left( 1 + \frac{\pi^2}{2}\, (\ln z)^{-2} + O\!\left[ (\ln z)^{-4} \right] \right)
\]
\[
= \frac{2}{5}\, \beta\mu \left( 1 + \frac{\pi^2}{2}\, (\beta\mu)^{-2} + \ldots \right) = \frac{2}{5}\, \frac{\varepsilon_F}{k_B T} \left[ 1 - \frac{\pi^2}{12} \left( \frac{k_B T}{\varepsilon_F} \right)^{2} + \frac{\pi^2}{2} \left( \frac{k_B T}{\varepsilon_F} \right)^{2} + \ldots \right]
\]
\[
= \frac{2}{5}\, \frac{\varepsilon_F}{k_B T} \left[ 1 + \frac{5\pi^2}{12} \left( \frac{k_B T}{\varepsilon_F} \right)^{2} + O\!\left( \frac{k_B T}{\varepsilon_F} \right)^{4} \right] .
\]

It finally reads
\[
p v = \frac{2}{5}\, \varepsilon_F \left[ 1 + \frac{5\pi^2}{12} \left( \frac{k_B T}{\varepsilon_F} \right)^{2} + \ldots \right] .
\]


Notice that the pressure does not vanish at $T = 0$. This is of course a consequence of Pauli's exclusion principle. For non-interacting Fermi gases the pressure gives us directly the internal energy through

\[
E = \frac{3}{2}\, pV = \frac{3}{5}\, N \varepsilon_F \left[ 1 + \frac{5\pi^2}{12} \left( \frac{k_B T}{\varepsilon_F} \right)^{2} + \ldots \right] ,
\]

from which we obtain the important heat capacity
\[
C_V = \frac{\partial E}{\partial T} \bigg|_{V,N} = \frac{3}{5}\, N \varepsilon_F\, \frac{5\pi^2}{6}\, \frac{k_B^2 T}{\varepsilon_F^2} = \underbrace{\frac{\pi^2}{2}\, \frac{k_B^2}{\varepsilon_F}\, N}_{\gamma}\; T + O\!\left( \frac{k_B T}{\varepsilon_F} \right)^{3} ,
\]

which is linear in $T$ and thus satisfies the third principle of thermodynamics. Knowing that $C_V \to \frac{3}{2} N k_B$ for $T \to +\infty$, we can draw $C_V(T)$ qualitatively. Notice that the ground-state energy, $E_0 = \frac{3}{5}\, N \varepsilon_F$, is simply proportional to $\varepsilon_F$. This result could also have been obtained from

\[
E_0 = \int_0^{\varepsilon_F} \rho(\varepsilon)\, \varepsilon\, d\varepsilon = \frac{3}{2}\, \frac{N}{\varepsilon_F^{3/2}} \int_0^{\varepsilon_F} \varepsilon^{3/2}\, d\varepsilon = \frac{3}{2}\, \frac{N}{\varepsilon_F^{3/2}}\, \frac{2}{5}\, \varepsilon_F^{5/2} = \frac{3}{5}\, N \varepsilon_F .
\]
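As a numerical cross-check of the low-temperature expansion of $E$, one can solve the particle-number equation for $\mu(T)$ and evaluate the energy integral $E = \int \rho(\varepsilon)\, \varepsilon\, f(\varepsilon)\, d\varepsilon$ directly; a minimal sketch in units $\varepsilon_F = 1$ (cutoff, grid size and reduced temperature are arbitrary choices):

```python
import math

def simpson(g, a, b, n=4000):
    # composite Simpson rule on [a, b]
    h = (b - a) / n
    s = g(a) + g(b)
    for i in range(1, n):
        s += (4 if i % 2 else 2) * g(a + i * h)
    return s * h / 3.0

def moment(p, mu, t):
    # int_0^inf e^p f(e) de in units eps_F = 1, with the substitution e = u^2
    return simpson(lambda u: 2.0 * u ** (2 * p + 1) /
                   (math.exp((u * u - mu) / t) + 1.0), 0.0, 3.0)

def mu_of_t(t):
    # fix N: (3/2) * moment(1/2) = 1, solved by bisection (monotonic in mu)
    lo, hi = 0.5, 1.5
    for _ in range(80):
        mid = 0.5 * (lo + hi)
        if 1.5 * moment(0.5, mid, t) < 1.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

t = 0.05                                  # T / T_F
E = 1.5 * moment(1.5, mu_of_t(t), t)      # E / (N eps_F), using rho = (3/2) sqrt(e)
expansion = 0.6 * (1.0 + 5.0 * math.pi**2 / 12.0 * t * t)
print(E, expansion)                       # close agreement at low t
```

At $t = 0$ the integral reduces to $E_0/N\varepsilon_F = 3/5$, reproducing the ground-state result above.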

A few important properties of the Fermi energy
\[
\varepsilon_F = \frac{\hbar^2}{2m} \left( \frac{6\pi^2}{v} \right)^{2/3}
\]

deserve to be underlined:

i) $\varepsilon_F$ (or the Fermi temperature $T_F$) defines the temperature scale below which quantum effects, and in particular symmetry correlations, dominate the behaviour of ideal Fermi gases.

ii) $\varepsilon_F$ is a quantum property: $\varepsilon_F \propto \hbar^2$.

iii) $\varepsilon_F$ increases when the mass of the particles decreases.

iv) $\varepsilon_F$ is an intensive property. It does not depend on the size of the system but on the volume per particle $v = V/N$, which defines the average interparticle distance $v^{1/3}$. Equivalently, we may express $\varepsilon_F$ in terms of the density $n = N/V$.

v) As the density increases, the Fermi energy $\varepsilon_F$ and the Fermi momentum $p_F$ increase. A larger density (e.g., due to compression) increases the size of the Fermi sphere. This behaviour is understandable taking into account the symmetry correlations, which preclude two fermions from occupying the same state, and the Heisenberg principle, which implies an increase of the kinetic energy of each single-particle state with decreasing volume. Notice that by scaling the system volume $V$ and the particle number $N$, keeping $v$ constant, only the density of independent values of the momentum $p_\alpha = 2\pi\hbar n_\alpha / L$ increases.


After all these calculations it is interesting to discuss how the properties change as a function of $T$ within the quantum (also known as degenerate) regime ($T \ll T_F$). As already mentioned, at $T = 0$ the Fermi sea is completely frozen — the "ice cube" or "ice block" picture. At finite $T$ the step in the Fermi function gives way to a smooth crossover, from $f(\varepsilon) \simeq 1$ for $\varepsilon$ well below the chemical potential [$(\varepsilon - \mu) \ll -k_B T$] to $f(\varepsilon) \simeq 0$ for $\varepsilon$ well above $\mu$ [$(\varepsilon - \mu) \gg k_B T$]. At the same time, as $f(\varepsilon)$ describes a smooth crossover, the chemical potential decreases slightly. This is understandable since the DOS $\rho(\varepsilon) \propto \sqrt{\varepsilon}$ increases with energy. Thus, increasing $T$ while keeping $\mu$ fixed would imply an increase in the number of particles $N$, since the number of states with energy $\varepsilon > \mu$ is larger than the number of states with $\varepsilon < \mu$. Notice that $f(\varepsilon)$ yields as many occupied states above $\mu$ as it produces holes below $\mu$:
\[
1 - \frac{1}{e^{x} + 1} = \frac{e^{x} + 1 - 1}{e^{x} + 1} = \frac{1}{e^{-x} + 1} .
\]
A decrease in occupation below $\mu$ ($x < 0$) implies an increase in occupation above $\mu$ ($-x > 0$). Thus, $f(\varepsilon)$ alone does not change the number of particles as $T$ increases. In other words, if $\rho(\varepsilon)$ were independent of $\varepsilon$, $N$ would not change with increasing $T$:

\[
\int_{-\infty}^{0} \left( 1 - \frac{1}{1 + e^{x}} \right) dx = \int_{-\infty}^{0} \frac{dx}{e^{-x} + 1} = -\int_{+\infty}^{0} \frac{dx}{e^{x} + 1} = \int_{0}^{+\infty} \frac{dx}{e^{x} + 1} .
\]

The number of unoccupied states below $\mu$ equals the number of occupied states above $\mu$. The energy range where $f(\varepsilon)$ differs from $0$ and $1$ (i.e., the crossover region) is of the order of $k_B T$. With increasing $T$, excitations are introduced. The number of excitations is of the order of $k_B T\, \rho(\varepsilon_F)$, where $\rho(\varepsilon_F)$ is the single-particle density of states at $\varepsilon_F$. The energy of these excitations is on average of the order of $k_B T$. This explains why the internal energy $E$ increases proportionally to $(k_B T)^2$, and why the specific heat $C_V = (\partial E / \partial T)_V$ is linear in $T$, with a coefficient $\gamma$ that is proportional to $\rho(\varepsilon_F)$:

\[
\rho(\varepsilon_F) = \frac{3}{2}\, \frac{N}{\varepsilon_F^{3/2}}\, \sqrt{\varepsilon_F} = \frac{3}{2}\, \frac{N}{\varepsilon_F} , \qquad C_V = \frac{\pi^2}{2}\, k_B^2\, \frac{N}{\varepsilon_F}\, T = \underbrace{\frac{\pi^2 k_B^2}{3}\, \rho(\varepsilon_F)}_{\gamma}\; T .
\]

The reduction of $C_V$ with decreasing $T$ reflects the reduction of the available degrees of freedom for the fermions, as a result of the symmetry correlations. In fact, the exclusion principle precludes all electrons, except those within an energy range $k_B T$ of the Fermi energy, from being excited. It is this reduction of accessible phase space that reduces $C_V$, so that it complies with the third principle.
