0511225v3

8/3/2019 0511225v3

1/12

arXiv:quant-ph/0511225v3

17Oct2006

The foundations of statistical mechanics from entanglement:

Individual states vs. averages

Sandu Popescu,1, 2 Anthony J. Short,1 and Andreas Winter3

1H.H.Wills Physics Laboratory, University of Bristol, Tyndall Avenue, Bristol BS8 1TL, U.K.2Hewlett-Packard Laboratories, Stoke Gifford, Bristol BS12 6QZ, U.K.

3Department of Mathematics, University of Bristol, University Walk, Bristol BS8 1TW, U.K.

We consider an alternative approach to the foundations of statistical mechanics, in which subjec-tive randomness, ensemble-averaging or time-averaging are not required. Instead, the universe (i.e.the system together with a sufficiently large environment) is in a quantum pure state subject to aglobal constraint, and thermalisation results from entanglement between system and environment.We formulate and prove a General Canonical Principle, which states that the system will be ther-malised for almost all pure states of the universe, and provide rigorous quantitative bounds usingLevys Lemma.

I. INTRODUCTION

Despite many years of research, the foundations of sta-tistical mechanics remain a controversial subject. Crucialquestions regarding the role of probabilities and entropy

(which are viewed both as measures of ignorance and ob-jective properties of the state) are not satisfactorily re-solved, and the relevance of time averages and ensembleaverages to individual physical systems is unclear.

Here we adopt a fundamentally new viewpoint sug-gested by Yakir Aharonov [1], which is uniquely quan-tum, and which does not rely on any ignorance proba-bilities in the description of the state. We consider theglobal state of a large isolated system, the universe, tobe a quantum pure state. Hence there is no lack of knowl-edge about the state of the universe, and the entropy ofthe universe is zero. However, when we consider onlypart of the universe (that we call the system), it is pos-

sible that its state will not be pure, due to quantum en-tanglement with the rest of the universe (that we callthe environment). Hence there is an objective lack ofknowledge about the state of the system, even though weknow everything about the state of the universe. In suchcases, the entropy of the system is non-zero, even thoughwe have introduced no randomness and the universe itselfhas zero entropy.

Furthermore, interactions between the system and en-vironment can objectively increase both the entropy ofthe system and that of the environment by increasingtheir entanglement. It is conceivable that this is themechanism behind the second law of thermodynamics.Indeed, as information about the system will tend to leakinto (and spread out in) the environment, we might wellexpect that their entanglement (and hence entropy) willincrease over time in accordance with the second law.

The above ideas provide a compelling vision of thefoundations of statistical mechanics. Such a viewpointhas been independently proposed recently by Gemmer etal. [2].

In this paper, we address one particular aspect of theabove programme. We show that thermalisation is ageneric property of pure states of the universe, in the

sense that for almost all of them, the reduced state of thesystem is the canonical mixed state. That is, not only isthe state of the system mixed (due to entanglement withthe rest of the universe), but it is in precisely the statewe would expect from standard statistical arguments.

In fact, we prove a stronger result. In the standardstatistical setting, energy constraints are imposed on thestate of the universe, which then determine a correspond-ing temperature and canonical state for the system. Herewe consider that states of the universe are subject toarbitrary constraints. We then show that almost everypure state of the universe subject to those constraints issuch that the system is in the corresponding generalisedcanonical state.

Our results are kinematic, rather than dynamical.That is, we do not consider any particular unitary evolu-tion of the global state, and we do not show that thermal-isation of the system occurs. However, because almost

all states of the universe are such that the system is in acanonical thermal state, we anticipate that most evolu-tions will quickly carry a state in which the system is notthermalised to one in which it is, and that the systemwill remain thermalised for most of its evolution.

A key ingredient in our analysis is Levys Lemma [3, 4],which plays a similar role to the law of large numbersand governs the properties of typical states in large-dimensional Hilbert spaces. Levys Lemma has alreadybeen used in quantum information theory to study en-tanglement and other correlation properties of randomstates in large bipartite systems [5]. It provides a verypowerful tool with which to evaluate functions of ran-

domly chosen quantum states.

The structure of this paper is as follows. In sectionII we present our main result in the form of a GeneralCanonical Principle. In section III we support this prin-ciple with precise mathematical theorems. In section IVwe introduce Levys Lemma, which is used in sectionsV and VI to provide proofs of our main theorems. Sec-tion VII illustrates these results with the simple exampleof spins in a magnetic field. Finally, in section VIII wepresent our conclusions.
http://arxiv.org/abs/quant-ph/0511225v3http://arxiv.org/abs/quant-ph/0511225v3http://arxiv.org/abs/quant-ph/0511225v3http://arxiv.org/abs/quant-ph/0511225v3http://arxiv.org/abs/quant-ph/0511225v3http://arxiv.org/abs/quant-ph/0511225v3http://arxiv.org/abs/quant-ph/0511225v3http://arxiv.org/abs/quant-ph/0511225v3http://arxiv.org/abs/quant-ph/0511225v3http://arxiv.org/abs/quant-ph/0511225v3http://arxiv.org/abs/quant-ph/0511225v3http://arxiv.org/abs/quant-ph/0511225v3http://arxiv.org/abs/quant-ph/0511225v3http://arxiv.org/abs/quant-ph/0511225v3http://arxiv.org/abs/quant-ph/0511225v3http://arxiv.org/abs/quant-ph/0511225v3http://arxiv.org/abs/quant-ph/0511225v3http://arxiv.org/abs/quant-ph/0511225v3http://arxiv.org/abs/quant-ph/0511225v3http://arxiv.org/abs/quant-ph/0511225v3http://arxiv.org/abs/quant-ph/0511225v3http://arxiv.org/abs/quant-ph/0511225v3http://arxiv.org/abs/quant-ph/0511225v3http://arxiv.org/abs/quant-ph/0511225v3http://arxiv.org/abs/quant-ph/0511225v3http://arxiv.org/abs/quant-ph/0511225v3http://arxiv.org/abs/quant-ph/0511225v3http://arxiv.org/abs/quant-ph/0511225v3http://arxiv.org/abs/quant-ph/0511225v3http://arxiv.org/abs/quant-ph/0511225v3http://arxiv.org/abs/quant-ph/0511225v3http://arxiv.org/abs/quant-ph/0511225v3http://arxiv.org/abs/quant-ph/0511225v3http://arxiv.org/abs/quant-ph/0511225v3

8/3/2019 0511225v3

2/12

2

II. GENERAL CANONICAL PRINCIPLE

Consider a large quantum mechanical system, the uni-verse, that we decompose into two parts, the system Sand the environment E. We will assume that the di-mension of the environment is much larger than that ofthe system. Consider now that the state of the universeobeys some global constraint R. We can represent this

quantum mechanically by restricting the allowed statesof the system and environment to a subspace HR of thetotal Hilbert space:

HR HS HE, (1)

where HS and HE are the Hilbert spaces of the systemand environment, with dimensions dS and dE respec-tively. In standard statistical mechanics R would typ-ically be a restriction on the total energy of the universe,but here we leave R completely general.

We define ER, the equiprobable state of the uni-verse corresponding to the restriction R, by

ER = 11RdR

, (2)

where 11R is the identity (projection) operator on HR,and dR is the dimension of HR. ER is the maximallymixed state in HR, in which each pure state has equalprobability. This corresponds to the standard intuitionof assigning equal a priori probabilities to all states ofthe universe consistent with the constraints.

We define S, the canonical state of the systemcorresponding to the restriction R, as the quantumstate of the system when the universe is in the equiprob-able state ER. The canonical state of the system S istherefore obtained by tracing out the environment in theequiprobable state of the universe:

S = TrE(ER). (3)

We now come to the main idea behind our paper.As described in the introduction, we now consider that

the universe is in a pure state , and not in the mixedstate ER (which represents a subjective lack of knowledgeabout its state). We prove that despite this, the stateof the system is very close to the canonical state S inalmost all cases. That is, for almost every pure state ofthe universe, the system behaves as if the universe wereactually in the equiprobable mixed state

ER.

We now state this basic qualitative result as a generalprinciple, that will subsequently be refined by quantita-tive theorems:

General Canonical Principle: Given a sufficientlysmall subsystem of the universe, almost every pure stateof the universe is such that the subsystem is approxi-mately in the canonical state S.

Recalling that the canonical state of the system S is,by definition, the state of the system when the universe is

in the equiprobable state ER we can interpret the aboveprinciple as follows:

Principle of Apparently Equal a priori Proba-bility: For almost every pure state of the universe, thestate of a sufficiently small subsystem is approximatelythe same as if the universe were in the equiprobable stateER. In other words, almost every pure state of the uni-verse is locally (i.e. on the system) indistinguishable fromER.

For an arbitrary pure state | of the universe, thestate of the system alone is given by

S = TrE(||). (4)Our principle states that for almost all states | HR,

S S. (5)Obviously, the above principle is stated qualitatively.

To express these results quantitatively, we need to care-fully define what we mean by a sufficiently small subsys-

tem, under what distance measure S S, and howgood this approximation is. This will be done in theremaining sections of the paper.

We emphasise that the above is a generalised principle,in the sense that the restriction R imposed on the statesof the universe is completely arbitrary (and is not neces-sarily the usual constraint on energy or other conservedquantities). Similarly, the canonical state S is not nec-essarily the usual thermal canonical state, but is definedrelative to the arbitrary restriction R by equation (3).

To connect the above principle to standard statisticalmechanics, all we have to do is to consider the restrictionR to be that the total energy of the universe is close toE, which then sets the temperature scale T. The totalHamiltonian of the universe HU is given by

HU = HS + HE + Hint, (6)

where HS and HE are the Hamiltonians of the systemand environment respectively, and Hint is the interactionHamiltonian between the system and environment. Inthe standard situation, in which Hint is small and theenergy spectrum of the environment is sufficiently dense

and uniform, the canonical state (E)S can be computed

using standard techniques, and shown to be

(E)S

exp

HS

kBT . (7)

This allows us to state the thermal canonical principlethat establishes the validity (at least kinematically) ofthe viewpoint expressed in the introduction.

Thermal Canonical Principle: Given that the totalenergy of the universe is approximately E, interactionsbetween the system and the rest of the universe are weak,and that the energy spectrum of the universe is suffi-ciently dense and uniform, almost every pure state of the

8/3/2019 0511225v3

3/12

3

universe is such that the state of the system alone is ap-

proximately equal to the thermal canonical state e

HSkBT ,

with temperature T (corresponding to the energy E)

We emphasise here that our contribution in this paperis to show that S S, and has nothing to do withshowing that S

e

HSkBT , which is a standard problem

in statistical mechanics.Finally, we note that the General Canonical Principle

applies also in the case where the interaction betweenthe system and environment is not small. In such sit-uations, the canonical state of the system is no longer

eHSkBT , since the behaviour of the system will depend

very strongly on Hint. Nevertheless, the general principleremains valid for the corresponding generalised canonicalstate S . Furthermore our principle will apply to arbi-trary restrictions R that have nothing to do with energy,which may lead to many interesting insights.

III. QUANTITATIVE SETUP

AND MAIN THEOREMS

We now formulate and prove precise mathematical the-orems correponding to the General Canonical Principlestated in the previous section.

As a measure of the distance between S and S , weuse the trace-norm S S1, where [6]

M1 = Tr |M| = Tr

MM , (8)

as this distance will be small if and only if it would be

hard for any measurement to tell S and S apart. In-deed, M1 = supO1 Tr(M O), where the maximisa-tion is over all operators (observables) O with operatornorm bounded by 1.

In our analysis, we also make use of the Hilbert-Schmidt norm M2 =

Tr(MM), which is easier to

manipulate than M1. However, we only use this forintermediate calculational purposes, as it does not havethe desirable physical properties of the trace-norm. Inparticular S S2 can be small even when the twostates are orthogonal for high-dimensional systems.

Throughout this paper we denote by the averageover states

|

HR according to the uniform distribu-

tion. For example, it is easy to see that S = S.We will prove the following theorems:

Theorem 1 For a randomly chosen state | HR HS HE and arbitrary > 0, the distance between thereduced density matrix of the system S = Tr(||) andthe canonical state S = Tr ER is given probabilisticallyby

Prob S S1 , (9)

where

= +

dS

d effE, (10)

= 2expCdR2 . (11)

In these expressions, C is a positive constant (given byC = (183)1), dS anddR are the dimensions of

HS and

HR respectively, andd effE is a measure of the effective sizeof the environment, given by

d effE =1

Tr 2E dR

dS, (12)

where E = TrS ER. Both and will be small quanti-ties, and thus the state will be close to the canonical statewith high probability, whenever d effE dS (i.e. the effec-tive dimension of the environment is much larger thanthat of the system) and dR2 1 . This latter condi-tion can be ensured whendR 1 (i.e. the total accessiblespace is large), by choosing = d

1/3R .

This theorem gives rigorous meaning to our statementsin section II about thermalisation being achieved for al-most all states: we have an exponentially small boundon the relative volume of the exceptional set, i.e. onthe probability of finding the system in a state that isfar from the canonical state. Interestingly, the exponentscales with the dimension of the space HR of the con-straint, while the deviation from the canonical state ischaracterised by the ratio between the system size andthe effective size of the environment, which makes intu-itive sense.

Theorem 1 provides a bound on the distance betweenS and S, but in many situations we can further im-prove it. Often the system does not really occupy all ofits Hilbert space HS, and also the estimate of the effec-tive environment dimension d effE may be too small, dueto exceptionally large eigenvalues of E = TrS ER. Bycutting out these non-typical components (similar to thewell-known method of projecting onto the typical sub-space), we can optimize the bound obtained, as we willshow in Theorem 2. The benefits of this optimizationwill be apparent in section VII, where we consider a par-ticular example.

Theorem 2 Assume that there exists some bounded pos-itive operator XR on HR satisfying 0 XR 11 suchthat, with ER = XRERXR,

Tr(ER) = TrERXR 1 . (13)

(I.e. the probability of obtaining the outcome correspond-ing to measurement operator XR in a generalised mea-surement (POVM) on ER is approximately one.)

Then, for a randomly chosen state | HR HS HE and > 0,

Prob S S1 , (14)

8/3/2019 0511225v3

4/12

4

where

= +

dS

d effE+ 4

, (15)

= 2expCdR2 . (16)

Here, C anddR are as in Theorem 1, dS is the dimensionof the support of XR in

HS , and d

effE is the effective size

of the environment after applying XR, given by

d effE =1

Tr 2E dR

dS, (17)

where E = TrS (ER). In many situations can be madevery small, while at the same time improving the rela-tion between system and effective environment dimen-sion. Note that the above is essentially the technique ofthe smooth (quantum) Renyi entropies [7, 8]: log dS is

related to S0(S) and log deffE to S

2(E).

In the process of proving these theorems, we also ob-tain the following subsidiary results:

1. The average distance between the systems reduceddensity matrix for a randomly chosen state and thecanonical state will be small whenever the effectiveenvironment size is larger than the system. Specif-ically,

S S1

dSd effE

d2SdR

, (18)

where the effective dimension of the environmentd effE is given by (12).

2. With high probability, the expectation value of a

bounded observable OS on the system for a ran-domly chosen state will be very similar to its ex-pectation value in the canonical state wheneverdR 1. Specifically,

Prob |Tr(OSS ) Tr(OS S)| d1/3R

2exp

Cd1/3R

OS2

,(19)

where C is a constant.

In our analysis we use two alternative methods, withthe hope that the different mathematical techniques em-ployed will aid in future exploration of the field.

IV. LEVYS LEMMA

A major component in the proofs of the followingsections is the mathematical theorem known as LevysLemma [3, 4], which states that when a point is se-lected at random from a hypersphere of high dimensionand f() does not vary too rapidly, then f() f withhigh probability:

Lemma 3 (Levys Lemma) Given a functionf : Sd R defined on the d-dimensional hypersphere Sd, and apoint Sd chosen uniformly at random,

Prob |f() f| 2exp2C(d + 1)2

2

(20)

where is the Lipschitz constant of f, given by =sup

|f|, andC is a positive constant (which can be taken

to be C = (183)1).

Due to normalisation, pure states in HR can be repre-sented by points on the surface of a (2dR1)-dimensionalhypersphere S2dR1, and hence we can apply LevysLemma to functions of the randomly selected quantumstate by setting d = 2dR 1. For such a randomly cho-sen state | HR, we wish to show that S S1 0with high probability.

V. METHOD I: APPLYING LEVYS

LEMMA TO S S1

In this section, we consider the consequences of ap-plying Levys Lemma directly to the distance betweenS = TrE(||) and S , by choosing

f() = S S1 . (21)in (20). As we prove in appendix A, the function f()has Lipschitz constant 2. Applying Levys Lemmato f() then gives:

Prob S S1 S S1

2eCdR2 .

(22)

To obtain Theorem 1, we rearrange this equation to getProb

S S1 (23)where

= + S S1 (24) = 2exp

CdR2 . (25)The focus of the following subsections is to obtain a

bound on S S1. In section V C we show that

S S1

dSd effE

(26)

where d effE is a measure of the effective size of the envi-ronment, given by (12). Inserting equation (26) in (24)we obtain Theorem 1.

Typically dR 1 (the total number of accessible statesis large) and hence by choosing = d

1/3R we can ensure

that both and are small quantities. When it is alsotrue that d effE dS (the environment is much larger thanthat of the system) both and will be small quantities,leading to S S1 0 with high probability.

8/3/2019 0511225v3

5/12

5

To obtain Theorem 2, we consider a generalised mea-surement which has an almost certain outcome for theequiprobable state ER HR, and apply the correspond-ing measurement operator before proceeding with ouranalysis. By an appropriate choice of measurement oper-ator, the ratio of the system and environments effectivedimensions can be significantly improved (as shown bythe example in section VII A).

A. Calculating

S S1

As mentioned in section III, although S S1 isa physically meaningful quantity, it is difficult to workwith directly, so we first relate it to the Hilbert-Schmidtnorm S S2. The two norms are related by

S S1

dS S S2 , (27)as proved in Appendix A.

Expanding S S2 we obtain

S S2 S S22 (28)

=

Tr(S S)2=

Tr 2S 2 Tr(S S) + Tr 2S=

Tr 2S Tr 2S, (29)and hence

S S1

dS (Tr 2S Tr 2S) (30)

B. Calculating Tr(2S)In this section we show the fundamental inequality

Tr 2S Tr S2 + TrE2 . (31)

The following calculations and estimates are closely re-lated to the arguments used in random quantum chan-nel coding [9] and random entanglement distillation (see[10]).

To calculate

Tr 2S

, it is helpful to introduce a secondcopy of the original Hilbert space, extending the problemfrom HR to HR HR where HR HS HE.

Note that

TrS 2S =

k

(kk)2

=

k,l,k,l

(kl)(kl) kk |ll ll|kk

= TrSS

(S S)FSS

, (32)

where FSS is the flip (or swap) operation S S:FSS =

S,S

|ss|S |ss|S, (33)

and hence

TrS 2S = TrRR

(||||)RR (FSS 11EE)

. (34)

So, our problem reduces to the calculation of

V || || =

|| || d. (35)

As V is invariant under operations of the form V (U U)V(U U) for any unitary U, representationtheory implies that

V = symRR + antiRR , (36)

where sym/antiRR are projectors onto the symmetric and

antisymmetric subspaces of HR HR respectively, and and are constants. As

(|| ||) 12

(|ab |ba) = 0 a,b,, (37)

it is clear that = 0, and as V is a normalised state,

=1

dim(RRsym)=

2

dR(dR + 1). (38)

Hence

|| || = 2dR(dR + 1)

symRR . (39)

and therefore

TrS

2S

= TrRR

2 symRR

dR(dR + 1)

(FSS 11EE)

.

(40)To proceed further we perform the substitution

symRR =1

2(11RR + FRR) , (41)

where FRR is the flip operator taking R R. Notingthat FRR = 11RR(FSS FEE), this gives

TrS 2S

= TrRR

11RR

dR(dR + 1)

(FSS 11EE)

+ TrRR

11RR

dR(dR + 1)

(11SS FEE)

TrRR

11RdR

11RdR

(FSS 11EE)

+ TrRR

11RdR

11RdR

(11SS FEE)

= TrSS

(S S)FSS

+ TrEE

(E E )FEE

. (42)

Hence from equation (32),TrS (

2S) TrS 2S + TrE 2E

= Tr S2 + TrE2(43)

8/3/2019 0511225v3

6/12

6

C. Bounding

S S1

Inserting the results of the last section in equation (30)we obtain

S S1

dS TrE 2E (44)

Intuitively, we can understand this equation by defin-

ing

d effE =1

TrE 2E, (45)

as the effective dimension of the environment in thecanonical state. If all of the non-zero eigenvalues of Ewere of equal weight this would simply correspond to thedimension of E s support, but more generally it willmeasure the dimension of the space in which the envi-ronment is most likely to be found. When there is noconstraint on the accessible states of the environment,such that HR = HS HE then d effE = dE

Denoting the eigenvalues of E by

k

E (with maximumeigenvalue maxE ), it is also interesting to note that

TrE 2E =

k

(kE )2

maxE

k

kE

= max|E

E | TrS

11RdR

|E

= max|E

s

s E |11RdR

|s E

dS

dR. (46)

Hence d effE dR/dS, and we obtain the final result that

S S1

dSd effE

d2SdR

. (47)

The average distance S S1 will therefore besmall whenever the effective size of the environment ismuch larger than that of the system (d effE dS ).

Inserting the results of equation (47) into equation (24)gives Theorem 1.

D. Improved bounds using restricted subspaces

As mentioned in section III, in many cases it is possi-ble to improve the bounds obtained from Theorem 1 byprojecting the states onto a typical subspace before pro-ceeding with the analysis. This can allow one to decreasethe effective dimension of the system dS (by eliminatingcomponents with negligible amplitude), and increase theeffective dimension of the environment d effE = (Tr

2E)

1

(by eliminating components of E with disproportion-ately high amplitudes), whilst leaving the equiprobablestate ER largely unchanged.

To allow for the most general possibility, we considera generalised measurement operator XR satisfying 0 XR 11 (of which a projector is a special case), whichhas high probability of being satisfied by ER, such that

TrR(ERXR) 1 . (48)We denote the dimension of the support of XR in HS

by dS, which will play the role of dS in the revised anal-ysis [11]. The bounds on S S1 will be optimizedby choosing XR such that dS is as small as possible, andd effE as large as possible.

We also define the sub-normalised states obtained aftermeasurement of XR:

| =

XR| (49)ER =

XR ER

XR =

XRdR

(50)

S = TrE (ER) (51)E = TrS (ER) (52)

Applying the same analysis as in the previous sectionsto these states, we find

TrS

2S

= TrRR

|| || (FSS 11EE)= TrRR

(XR XR)symRR

dR(dR + 1)(FSS 11EE)

TrRR XRdR

XRdR (FSS

11EE)+ TrRR

XRdR

XRdR

(11SS FEE)

= TrS

2S + TrE

2E, (53)

where in the second equality we have used the fact that[

XR

XR , symRR ] = 0. From the analogue of equa-

tion (30) we can then obtain

S S1

dS

d effE, (54)

where (using the analogue of (46))

d effE =1

TrE 2E dR

dS. (55)

To transform this bound onS S

1into a bound

on S S1, we note that

S S1 S S1 +S S

1+S S

1.

(56)

8/3/2019 0511225v3

7/12

7

We bound S S1 as follows:

S S1 || ||

1

2|| ||

2

=

2 Tr(|| ||)2

=

2(1 2|XR|2 + |XR|2)

2(1 |XR|2)

4(1 Tr(XR||)), (57)

where in the first inequality we have used the non-increase of the trace-norm under partial tracing, in thesecond inequality we have used Lemma 6 (Appendix A)

and the fact that | and | span a two-dimensional sub-space, and in the third inequality we have used the factthat XR

XR (because XR 11R).

It follows that

S S1 4(1 Tr(XR||))

4(1 Tr(XR||))=

4(1 Tr(XRER) 2

, (58)

where we have used the concavity of the square root func-tion and equation (48).

In addition, note that from the triangle inequality,

S S1

= S S1 S S1 2. (59)

Inserting these results into the average of equation (56)we get

S S1

dS

d effE+ 4

(60)

and inserting this in equation (24) we obtain Theorem 2.

VI. METHOD II: APPLYING LEVYS

LEMMA TO EXPECTATION VALUES

In this section, we describe an alternative method ofobtaining bounds on S S1 by considering the ex-pectation values of a complete set of observables. Thephysical intuition is that if the expectations of all ob-servables on two states are close to each other, then thestates themselves must be close.

We begin by showing that for an arbitrary (bounded)observable OS on S, the difference in expectation valuebetween a randomly chosen state S = TrE (||) and

the canonical state S is small with high probability. Wethen proceed to show that this holds for a full operatorbasis, and thereby prove that S S with high proba-bility when dR d2S.

In this method, Levys Lemma plays a far more cen-tral role. This approach may be more suitable in somesituations, and yields further insights into the underlyingstructure of the problem.

A. Similarity of expectation values for random and

canonical states

Consider Levys Lemma applied to the expectationvalue of an operator OS on HS , for which we take

f() = Tr(OS S ). (61)

in (20). Let OS have bounded operator norm OS(where OS is the modulus of the maximum eigenvalueof the operator). Then the Lipschitz constant of f()is also bounded, satisfying

2

OS

(as shown in ap-pendix A). We therefore obtain

Prob |Tr(OS S) Tr(OS S )| 2expCdR 2OS2

.

(62)However, note that

Tr(OSS ) = Tr(OSS) = Tr(OS S), (63)

and hence that

Prob |Tr(OS S) Tr(OS S )| 2expCdR2OS2

.

(64)

By choosing = d1/3R we obtain the result that

Prob |Tr(OS S )Tr(OS S )| d1/3R

2exp

Cd1/3R

OS2

.

(65)

For dR 1, the expectation value of any given boundedoperator for a randomly chosen state will therefore beclose to that of the canonical state S with high proba-bility [13].

B. Similarity of expectation values for a complete

operator basis

Here we consider a complete basis of operators for thesystem. Rather than Hermitian operators, we find it con-venient to consider a basis of unitary operators UxS . Weshow that with high probability all of these operatorswill have (complex) expectation values close to those ofthe canonical state.

It is always possible to define d2S unitary operators UxS

on the system, labelled by x {0, 1, . . . d2S 1}, such

8/3/2019 0511225v3

8/12

8

that these operators form a complete orthogonal operatorbasis for HS satisfying [14]

Tr(UxS UyS ) = dS xy, (66)

where xy is the Kronecker delta function. One possiblechoice of UxS is given by

UxS =

dS1

s=0

e2is(x(x mod dS))/d2S |(s+x)mod dSs|. (67)

Noting that UxS = 1 x (due to unitarity), we can thenapply equation (64) to OS = U

xS to obtain

Prob |Tr(UxS S) Tr(UxS S)| 2eCdR2 x.

(68)Furthermore, as there are only d2S possible values of x,this implies that

Prob

x : |Tr(UxS S) Tr(UxS S)|

2d2S eCdR2

(69)

If we take = d1/3R 1, note that as the right handside of (69) will be dominated by the exponential decay

eCd1/3R , it is very likely that all operators UxS will have

expectation values close to their canonical values.

C. Obtaining a probabilistic bound on S S1

As the UxS form a complete basis, we can expand anystate S as

S =1

dS

x

Cx(S )UxS (70)

where

Cx() = Tr(UxS S ) = Tr(U

xS S)

. (71)

Expressing equation (69) in these terms we obtain

Prob x : |Cx() Cx()| 2d2S eCdR 2 (72)

When |Cx(S )Cx(S )| for all x, an upper boundcan be obtained for the squared Hilbert-Schmidt norm[15] as follows:

A A22 = 1dS

x

(Cx(S) Cx(S)) UxS2

2

=1

d2STr

x

(Cx(S ) Cx(S)) UxS2

=1

dS

x

(Cx(S ) Cx(S))2

dS 2 (73)

Hence using the relation between the trace-norm andHilbert-Schmidt-norm (proved in appendix A),

S S1

dS S S2 dS. (74)

Incorporating this result into equation (72) yields

Prob S

S

1

dS

2d2SeCdR

2

. (75)

If we choose

=

dSdR

1/3(76)

we obtain the final result that

Prob S S1 1 2d2S eC . (77)

where

=dR

d2S1/3

(78)

Note that S S1 0 with high probability when-ever log2(dS ) 1, and hence when dR d2S . Thisresult is qualitatively similar to the result obtained usingthe previous method, although it can be shown that thebound obtained is actually slightly weaker in this case.

VII. EXAMPLE: SPIN CHAIN

WITH np EXCITATIONS

As a concrete example of the above formalism, con-

sider a chain of n spin-1/2 systems in an external mag-netic field in the +z direction, where the first k spinsform the system, and the remaining n k spins form theenvironment. We therefore consider a Hamiltonian of theform

H = n

i=1

B

2(i)z (79)

where B is a constant energy (proportional to the exter-

nal field strength), and (i)z is a Pauli spin operator for

the ith spin.Under these circumstances, the global energy eigen-

states can be divided into orthogonal subspaces depen-dent on the total number of spins aligned with the field.We consider a restriction to one of these degenerate sub-spaces HR HSHE in which np spins are in the excitedstate |1 (opposite to the field) and the remaining n(1p)spins are in the ground state |0 (aligned with the field).

With this setup, dS = 2k and

dR =

n

np

. (80)

8/3/2019 0511225v3

9/12

9

Approximating this binomial coefficient by an exponen-tial (as in Appendix C), gives

dR 2nH(p)

n + 1(81)

where H(p) = p log2(p)(1p)log2(1p) (the Shannonentropy of a single spin).

From Theorem 1,Prob

S S1 , (82)where

= +

dS

d effE, (83)

= 2expCdR2 . (84)

In addition,

dSdeffE

d2S

dR (n + 1) 2(nH(p)2k)/2. (85)

For an appropriate choice of (e.g. = d1/3R 1), we

will obtain S S1 0 with high probability when-ever

(n + 1) 2(nH(p)2k)/2 1 (86)For fixed p, this condition will be satisfied for all suffi-ciently large n k.

We emphasise that our results concern the distancebetween S and S . Computing the precise form of Sis a standard exercise in statistical mechanics, which wesketch here for completeness.

In the regime where n k2, the canonical state Swill take the approximate form

S =

s

(n k)!dS (np |s|)!(n(1 p) (k |s|))! |ss|

s

n!(np)|s|(n(1 p))k|s|dS nk(np)!(n(1 p))! |ss|

=

s

p|s|(1 p)k|s| |ss| (87)

= (p|11| + (1 p)|00|)k. (88)

and hence the canonical state of the system will approxi-mate that ofk uncorrelated spins, each with a probabilityp of being excited, as expected.

To connect our result to the standard statistical me-chanical formula,

S exp

HSkBT

(89)

we use Boltzmanns formula relating the entropy of theenvironment SE(|e|) to the number of states NE (|e|) of

the environment with a given number of excitations |e|to get

SE(|e|) = kB ln NE (|e|)= kB ln

n k|e|

kB(n k)ln(n k) |e| ln |e|(n k |e|)ln(n k |e|)

, (90)

where in the third line we have used Stirlings approxi-mation. Defining the temperature in the usual way, andnoting that the energy of the environment is given byE = |e|B (n k)B/2, we obtain

1

T=

dSE (E)

dE

E=E

=1

B

dSE (|e|)d|e|

|e|=(nk)p

k

BB ln

n k |e||e| |e|=(nk)p=

kBB

ln

1 p

p

(91)

This formula expresses how the probability p definesa temperature T of the environment. Rearranging equa-tion (87) to incorporate equation (91) gives the usualstatistical mechanical result

S (1 p)k

s

p

1 p|s|

|ss|

= (1 p)k s

exp|s| ln1 pp |ss|

= (1 p)k

s

exp

|s|B

kBT

|ss|

exp

HSkBT

. (92)

A. Projection on the typical subspace

We can obtain an improved bound on S S1 bynoting that the system state almost always lies in a typ-

ical subspace with approximately kp excitations. Wemake use of this observation by applying Theorem 2 witha measurement operator XR given by

XR = S 11E (93)

where S is a projector onto the typical subspace of thesystem, in which it contains a number of excitations |s|in the range

kp |s| kp + . (94)

8/3/2019 0511225v3

10/12

10

It is easy to show, using classical probabilistic argu-ments (see Appendix B), that

TrR(XRER) = TrS (S S ) 1 (95)where

= 2 exp2

4kp(1

p) (96)Furthermore, the dimension dS of the support of XR

on HS (which here is simply the dimension of the typicalsubspace) is shown in appendix C to be given by

dS =

kp+|s|=kp

k|s|

(2 + 1) 2kH(p)+G(p) (97)where

G(p) = dH(p)dp

= log2 p1 p . (98)

From Theorem 2 we obtain:

Prob S S1 , (99)

where, using d effE dR/dS, and inserting the results ofequations (81), (96) and (97),

= +

(n + 1)(2 + 1) 2(kn/2)H(p)+G(p)(100)

+

32 exp

2

8kp(1 p)

= 2exp CdR2 . (101)

Choosing = k2/3 and = d1/3R yields

= (n + 1)1/3 2nH(p)/3

+

(n + 1)(2k2/3 + 1) 2(n2k)H(p)/2+k2/3G(p)

+

32 exp

k

1/3

8p(1 p)

(102)

= 2exp

C2

nH(p)/3

(n + 1)1/3

. (103)

In the thermodynamic limit in which p is fixed (cor-responding to the temperature), the ratio of the systemand environment sizes r = k/(n k) is fixed at somevalue r < 1 (i.e. the system is smaller than the environ-ment), and n tends to infinity, 0 and 0, andhence S S .

For large (but finite) n the system will be thermalisedfor almost all states when the system is smaller than theenvironment (i.e. r < 1). Note that as depends expo-nentially on (n 2k), 1 can be achieved with onlysmall differences in the number of spins in the systemand environment.

VIII. CONCLUSIONS

Let us look back at what we have done. Concern-ing the problem of thermalisation of a system interactingwith an environment in statistical mechanics, there areseveral standard approaches. One way of looking at itis to say that the only thing we know about the state ofthe universe is a global constraint such as its total energy.

Thus the way to proceed is to take a Bayesian point ofview and consider all states consistent with this globalconstraint to be equally probable. The average over allthese states indeed leads to the state of any small subsys-tem being canonical. But the question then arises whatis the meaning of this average, when we deal with justone state. Also, these probabilities are subjective, andthis raises the problem of how to argue for an objectivemeaning of the entropy. A formal way out is that sug-gested by Gibbs, to consider an ensemble of universes,but of course this doesnt solve the puzzle, because thereis usually only one actual universe. Alternatively, it wassuggested that the state of the universe, as it evolves in

time, can reach any of the states that are consistent withthe global constraint. Thus if we look at time averages,they are the same as the average that results from consid-ering each state of the universe to be equally probable.To make sense of this image one needs assumptions ofergodicity, to ensure that the universe explores all theavailable space equally, and of course this doesnt solvethe problem of what the state of the subsystem is at agiven time.

What we showed here is that these averages are notnecessary. Rather, (almost) any individual state of theuniverse is such that any sufficiently small subsystem be-haves as if the universe were in the equiprobable aver-

age state. This is due to massive entanglement betweenthe subsystem and the rest of the universe, which is ageneric feature of the vast majority of states. To obtainthis result, we have have introduced measures of the ef-fective size of the system, dS , and its environment (i.e.the rest of the universe), d effE , and showed that the aver-age distance between the individual reduced states andthe canonical state is directly related to dS /d effE . LevysLemma is then invoked to conclude that all but an ex-ponentially small fraction of all states are close to thecanonical state.

In conclusion, the main message of our paper is thataverages are not needed in order to justify the canonicalstate of a system in contact with the rest of the universe

almost any individual state of the universe is enoughto lead to the canonical state. In effect, we propose toreplace the Postulate of Equal a priori Probabilities bythe Principle of Apparently Equal a priori Probabilities,which states that as far as the system is concerned everysingle state of the universe seems similar to the average.

We stress once more that we are concerned only withthe distance between the state of the system and thecanonical state, and not with the precise mathematicalform of this canonical state. Indeed, it is an advantage

8/3/2019 0511225v3

11/12

11

of our method that these two issues are completely sep-arated. For example, our result is independent of thecanonical state having Boltzmannian form, of degenera-cies of energy levels, of interaction strength, or of energy(of system, environment or the universe) at all.

In future work [17], we will go beyond the kinematicviewpoint presented here to address the dynamics of ther-malisation. In particular, we will investigate under what

conditions the state of the universe will evolve into (andspend almost all of its later time in) the large region ofits Hilbert space in which its subsystems are thermalised.

Acknowledgments

The authors would like to thank Yakir Aharonov andNoah Linden for illuminating discussions. SP, AJS andAW acknowledge support through the U.K. EPSRCsproject QIP IRC. In addition, SP also acknowledgessupport through EPSRC Engineering-Physics grantGR/527405/01, and AW acknowledges a University of

Bristol Research Fellowship.Note added: A very recent independent paper by Gold-stein et. al. [18] discusses similar issues to those ad-dressed here.

APPENDIX A: LIPSCHITZ CONSTANTS

AND NORM RELATION

Lemma 4 The Lipschitz constant of the functionf() = S S1, satisfies 2.Proof: Defining the reduced states 1 = TrE (|11|)and 2 = TrE (

|2

2

|), and using the result that partial

tracing cannot increase the trace-norm

|f(1) f(2)|2 = |1 1 2 1|2

1 221 |11| |22|21= 4

1 |1|2|2

4 ||1 |2|2 (A1)

Hence |f(1) f(2)| 2 ||1 |2|, and thus 2.Lemma 5 The Lipschitz constant of the functionf() = Tr(X||), where X is any operator on HRwith finite operator normX satisfies 2 X.Proof:

|f(1) f(2)| =1|X|1 2|X|2

=1

2

(1| + 2|)X(|1 |2)+(1| 2|)X(|1 + |2)

X |1 + |2|1 |2 2 X

|1 |2. (A2)

Lemma 6 For any n n matrix M, M1

n M2.Proof: If M has eigenvalues i,

M21 = n2

1

n

i

|i|2

n2

1

n

i |i

|2 = n

M

22 ,

by the convexity of the square function. Taking thesquare-root yields the desired result.

APPENDIX B: PROJECTION ONTO

THE TYPICAL SUBSPACE

Lemma 7 Given a system in the canonical state S, theprobability of it containing a number of excitations |s| inthe range kp |s| kp + is given by

Tr(S S)

1

(B1)

where

= 2 exp

2

4kp(1 p)

(B2)

Proof: S is essentially a classical probabilistic state,obtained by choosing k spins at random from a bagcontaining np excited spins and n(1p) un-excited spinswithout replacement. It is easy to see that this state willlie in the typical subspace with higher probability than ifthe spins were replaced in the bag after each selection, asthe former process is mean reverting, whereas the latteris not. We can bound the probability of lying outside

the typical subspace in the case with replacement usingChernoffs inequality [16] for the sum X =

i(si p),

where si {0, 1} is the value of the ith spin. This gives

ProbX > 2e 242 (B3)

where 2 = kp(1 p) is the variance of X. Hence

Prob|s| kp > 2e 24kp(1p) (B4)

and thus

Tr(S S) = 1

Prob|s| kp >

1 2e 2

4kp(1p) (B5)

APPENDIX C: EXPONENTIAL BOUNDS

ON COMBINATORIAL QUANTITIES

In this appendix we obtain bounds for the combina-toric quantities required to consider the example case ofa spin-chain.

8/3/2019 0511225v3

12/12

12

From standard probability theory we know that

nk=0

nk

pk(1 p)nk = 1, (C1)

with the maximal term in the sum being obtained whenk = np. Hence

n

np

pnp(1p)n(1p) 1

nk=0

n

np

pnp(1p)n(1p).

(C2)Noting that

pnp(1 p)n(1p) = 2nH(p) (C3)where H(p) = p log2(p) (1 p)log2(1 p), we canrearrange equation (C2) to get

2nH(p)log2(n+1)

nnp

2nH(p). (C4)

We also require an upper bound for the dimension ofthe typical subspace of system S, given by

dS =

kp+|s|=kp

k|s|

. (C5)

The maximal term in this sum occurs when |s| = k p

where

p =

p + /k : p < 12 /k12 :

p 12 /kp /k : p > 12 + /k,

(C6)

and as the sum consists of (2 + 1) terms,

dS (2 + 1) kk p . (C7)Bounding the binomial coefficient by an exponential asabove we obtain

dS (2 + 1)2kH(p). (C8)

As H(p) is a concave function of p, we also note that

kH(p) kH(p) dH(p)dp

(C9)Defining

G(p) =

dH(p)dp =

log2

p

1 p , (C10)

we therefore find that

dS (2 + 1)2kH(p)+ G(p). (C11)

[1] Y. Aharonov, personal communication, 1986.[2] J. Gemmer, M. Michel, and G. Mahler, Quantum Ther-

modynamics, LNP 657, Springer Verlag, Heidelberg,Berlin, 2004.

[3] V. D. Milman and G. Schechtman, Asymptotic Theoryof Finite-Dimensional Normed Spaces, LNM 1200, Ap-pendix IV, Springer Verlag, 1986.

[4] M. Ledoux, The concentration of measure phenomenon,AMS Mathematical Surveys and Monographs, vol. 89,American Mathematical Society, 2001.

[5] P. Hayden, D. W. Leung, and A. Winter, eprint quant-ph/0407049 (2004). To appear in Commun. Math. Phys.

[6] Note that S S1 is twice the usual trace-distancebetween the two states, taking a maximum value of 2 fororthogonal states.

[7] R. Renner and R. Konig, Proc. TCC 2005, LNCS 3378,Springer Verlag, 2005. R. Renner and S. Wolf, Proc. 2004

IEEE Intl. Symp. Inf. Theory, p. 233 (2004).[8] P. Hayden and A. Winter, Phys. Rev. A 67, 012326

(2003), eprint quant-ph/0204092 (2002).[9] S. Lloyd, Phys. Rev. A 55, 1613 (1997). P. W. Shor, un-

published MSRI lecture notes (Berkeley, 2002); availableonline at www.msri.org/publications/ln/msri/2002/quantumcrypto/shor/1/.

[10] M. Horodecki, J. Oppenheim, and A. Winter, Nature

436, 673 (2005). Long version in preparation.

[11] Mathematically, dS = minS Tr S, where the mini-

mum is over all projectors S HS such that [(S 11E),XR] = 0.[12] C. A. Fuchs and J. van de Graaf, IEEE Trans. Inf. Theory

45, 1216 (1999).[13] Note that the probability of obtaining a particular out-

come in a measurement is always representable as theexpectation value of a bounded observable (with OS 1), hence this is a very general result.

[14] In the special case in which log2dS is an integer, it is

also possible to make UxS Hermitian by constructing themfrom tensor products of 2-dimensional Pauli spin matri-ces and Identity matrices.

[15] Note that unlike in the previous method, it is possible toproceed directly with the trace-norm here, but the boundobtained is weaker.

[16] A. Dembo, O. Zeitouni, Large Deviations: Techniquesand Applications, 2nd ed., Springer Verlag, New York,1998.

[17] Y. Aharonov, N. Linden, S. Popescu, A. J. Short, and A.Winter. In preparation.

[18] S. Goldstein, J. L. Lebowitz, R. Tumulka, N.Zangh,eprint cond-mat/0511091 (2005).

0511225v3

Documents

Transcript of 0511225v3