
ELEMENTS OF QUANTUM MECHANICS

David J. Jeffery

2012 January 1

1. INTRODUCTION

The topic of Speakable Quantum Mechanics (SQM) is introductory non-relativistic quantum
mechanics, which we will usually just call quantum mechanics unless we need to be more
specific. Relativistic quantum mechanics is usually referred to as quantum field theory; it
is beyond our scope, except for occasional bits and pieces.

Non-relativistic quantum mechanics is necessarily an approximation to relativistic quantum
mechanics. It is valid in the regime where kinetic energies and potential energies (more
generally, field energies) are much less than the rest mass energies of particles. Following
the philosophy of Laughlin (2005), I prefer to think of non-relativistic quantum mechanics as
an emergent theory: one that is exactly true in the limit of low kinetic and potential energies.
Non-relativistic quantum mechanics is such a well verified theory (all modern electronics
testifies to it) that it is hard to believe there is not a limit in which it is exact. If there were
not such a limit (let’s call it the non-relativistic-quantum-mechanics limit, or NRQM limit),
then I think non-relativistic quantum mechanics would somehow be a crude theory
or model that is sometimes unreliable, which it is not.

In this chapter, we develop the elements of quantum mechanics. The approach taken is
a bit more abstract and formal than in most introductory quantum textbooks. But yours
truly thinks it best to get the full formalism first and then derive results for systems, rather
than derive results with only part of the formalism and then have to rethink the systems and
results in terms of the full formalism. I just think it is the easier way, as long as the readers
can bear formalism that looks a long way from physics and a lot like not very rigorous
math.

2. AXIOMS

Quantum mechanics can be developed in an axiomatic way, but there are many different
ways of formulating the axioms and explicating them. For example, Cohen-Tannoudji et al.
(1977, p. 211ff) give 6 axioms and an elaborate explication. The axiomatic path absolutely
has its value. But I think it is not a terribly memorable path in pure form. Also, in thinking
about quantum mechanics and in solving quantum mechanics problems, the mind swims in
the sea of axioms and results without placing them in a definite hierarchy at every moment.
Yours truly will compromise and present a shorthand set of 5 axioms here (mainly following
Zurek (2009)). The shorthand set allows for easy remembering and cogitation. The axioms
require elaborate explication. We give much of that in this chapter and more as we go
along in SQM.

The 5 axioms are:

1. State Vector Axiom: States are vectors in Hilbert spaces, and thus the state vector
contains all information about the state of the system. For further explication see § 3.

2. Schrodinger’s Equation Axiom: Evolutions of states are deterministic when determined
(in non-relativistic quantum mechanics) by Schrodinger’s equation:

$$ H|\Psi\rangle = i\hbar\frac{\partial}{\partial t}|\Psi\rangle , \tag{1} $$

where H is the Hamiltonian (which is the energy operator) and Ψ is the conventional state
label. The partial derivative symbol is actually conventional since there is no implicit
time dependence. The full time derivative (i.e., d/dt) could be used as well (e.g.,
Cohen-Tannoudji et al. 1977, p. 222). Schrodinger’s equation is referenced to inertial frames
unless non-inertial frame effects are accounted for or approximated as negligible. For
further explication see § 6.

3. Repetition Axiom: The immediate repetition of a measurement yields the same outcome
as the first performance. The meaning of the term measurement in this context is
discussed in § 9. Actually, testing this axiom directly is difficult in many cases, but it
seems essential to the overall validity of quantum mechanics (Zurek 2009).

4. Wave Function Collapse Axiom: The outcomes of measurements are eigenvalues of the
measured observable, and the measurement collapses the state to the corresponding
eigenstate of the observable. The axiom actually applies to states not describable by
wave functions too. However, the term wave function collapse is used generally for
the collapse event. The terms “observable” and “measurement” need further explication.
However, we note here that “measurement” is a conventional term and does not necessarily
imply human measurement. It refers to a strong interaction of the system with the
environment. For further explication see § 9.

5. Born’s Rule Axiom: The probability of a measurement outcome (i.e., collapse to an
eigenstate of an observable) is the magnitude squared of the state amplitude of the eigenstate
in the expansion of the state in the complete set of eigenstates of the observable. State
amplitude is conventionally called probability amplitude, but SQM prefers state
amplitude for reasons given in § 9. For the position representation, the state amplitude is
the wave function Ψ, and the probability density is its magnitude squared $|\Psi|^2$.

After reading this chapter, one hopes these axioms largely click into place in the minds of

readers.
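As a concrete illustration of Born’s rule (axiom 5), here is a minimal Python sketch for a two-dimensional Hilbert space. The particular state (0.6, 0.8), the choice of eigenbasis, and the helper names `inner` and `born_probabilities` are illustrative assumptions and are not part of SQM.

```python
import math

def inner(u, v):
    """Inner product <u|v>: complex-conjugate the components of the first vector."""
    return sum(complex(a).conjugate() * b for a, b in zip(u, v))

def born_probabilities(state, eigenbasis):
    """Born's rule: the probability of each outcome is |<e|state>|^2."""
    return [abs(inner(e, state)) ** 2 for e in eigenbasis]

# Hypothetical orthonormal eigenbasis of some observable: (1, +1)/sqrt(2), (1, -1)/sqrt(2)
s = 1.0 / math.sqrt(2.0)
eigenbasis = [(s, s), (s, -s)]

# A normalized state vector: 0.6**2 + 0.8**2 = 1
state = (0.6, 0.8)

probs = born_probabilities(state, eigenbasis)
print(probs)       # approximately [0.98, 0.02]
print(sum(probs))  # approximately 1.0, because the state is normalized
```

The probabilities sum to 1 only because the state vector has length 1, which is why the state vector must be normalized.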

Actually quantum mechanics has more axioms to deal with actual physical entities.
There is such a thing as spin (see the chapter Spin) whose existence is an axiom. The
symmetrization principle is an axiom needed to deal with identical particles (see the chapter
The Symmetrization Principle). Moreover, it is my view that what I call micro-axioms
frequently get added as quantum mechanics is developed without bothering to tabulate them:
the practitioner just assimilates them. For example, the eigenstates of the space operator for
the Hilbert space (§ 4) of space space (i.e., ordinary 3-dimensional space) are Dirac delta
functions even though Dirac delta functions are not true functions (§ 5). That we can
put the Dirac delta functions on the Procrustean bed of Hilbert space eigenstates seems to
me to be a micro-axiom.

We need to further remark that quantum mechanics is at the same time profoundly like

and unlike classical mechanics. We will just let those likenesses and unlikenesses emerge as we

go along in SQM. The likenesses certainly guided the development of quantum mechanics, but

new axioms were needed and quantum mechanics cannot be derived from classical mechanics.

It can, of course, be derived as an emergent theory from quantum field theory in the NRQM

limit—but we won’t do that.

3. STATE AS VECTOR

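From § 2, the State Vector Axiom reads: States are vectors in Hilbert spaces, and thus the
state vector contains all information about the state of the system.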

This axiom has been contested by some (notably Einstein) since the dawn

of quantum mechanics. The contesters believe that the state vector of quantum

mechanics theory has only statistical information about the state, and so is not

identical with the state. With our current knowledge, there is no practical difference

in the two views since quantum mechanics has never been falsified—to use the

jargon of falsifiability (e.g., Wikipedia: Falsifiability). The theoretical difference is

profound since the statistical view implies there is a more fundamental theory than

quantum mechanics. There is a notable proof by Pusey et al. (2011) that quantum

mechanics cannot be interpreted statistically—this argument supports the axiom

as stated. The validity of this proof is still under debate.

What does the state being a state vector mean? We cannot say
everything possible here. We can say it means that it takes many numbers
to describe the state in general. From a classical point of view, that is no surprise.
That it takes a continuum infinity (i.e., an uncountable infinity) of complex
numbers to describe the state of a single particle is a surprise.
This continuum infinity of numbers is called the wave function.

Also in classical physics, the numbers used in describing the state depend on the spatial
coordinate system or basis system. This is true even for kinetic energy, which is a scalar:
the velocity used to evaluate the kinetic energy depends on the motion of the axes. In
quantum mechanics, the bases of Hilbert space are not spatial coordinates, although spatial
coordinates are needed too. The Hilbert space bases include most importantly the position,
wavenumber (AKA momentum), and energy bases. We will elucidate these and other bases
later. Here, just as a key example, the wave function can be expanded in the position basis: this
means that there is a complex number (which is a basis vector coefficient) for every point
in space. If one changes to the wavenumber basis expansion for the wave function, there
is a complex number for every point in wavenumber space. The position and wavenumber
bases have a continuum infinity of basis vectors. The energy basis for the same system can
have a continuum infinity of basis vectors for unbound and therefore unquantized states, or a
discrete infinity (i.e., a countable infinity) of basis vectors for bound and therefore quantized
states. Of course, there are systems which have both unbound and bound states, and therefore
both a continuum infinity and a discrete infinity in the set of basis vectors. We discuss state
vectors further below in § 3.

So to make a start, the state of a system in quantum mechanics is (or is described by if

you prefer) an abstract state vector in a Hilbert space which represents the system. We will

explicate Hilbert spaces below in § 4. The conventional symbolization for the vector is

|Ψ〉 . (2)

The form |〉 is the ket vector form in Dirac’s bra-ket notation, and the Greek letter Psi Ψ is the
conventional state symbol. Of course, other state symbols are introduced as needed. State

vector, state, and wave function (because it turns out to be the most basic basis expansion

of the state vector) are effectively synonyms and we will use them as such on most occasions.

Note that by common physics convention, the word system can also mean the state of the

system. Context decides whether “system” means system or state of the system.

We need to remark that a vector, except a trivial one, is not specified by a single

number. It is specified by a set of coefficients for unit vectors for the space it is embedded

in. In quantum mechanics, those coefficients are called amplitudes—at least in SQM—and

we will get to discussing them in § ?? and the remainder of this chapter—so readers hold

your horses of anxiety.

In quantum mechanics as in classical mechanics, everything in the universe is connected

to everything else. So the exact system is the universe. Fortunately, as in classical mechanics,

it is possible to understand a part of the universe by idealizing that part as the system and

the rest of the universe as the environment. In the limit that the environment has no effect,

the system is the exactly correct system. Otherwise, the system is an approximation whose

limitations can be reduced by including more of the universe in the system. In quantum

mechanics, a more exact system is a bigger or more comprehensive Hilbert space. However,

there is an immense and subtle issue called decoherence theory (e.g., Zurek 2003) to be

introduced in regard to system and environment. The time evolution of systems involves
decoherence, and decoherence is caused by the environment. So in many quantum mechanics
calculations, the environment must be included. Decoherence theory has grown up since
about 1970 (e.g., Zurek 2003, p. 5) and is still a work in progress, but the process of
decoherence has had to be dealt with since the early days of quantum mechanics, without
being identified as decoherence. It was dealt with via Born’s rule and wave function collapse,
which in traditional quantum mechanics are extra axioms. We will take up the subjects of
Born’s rule, wave function collapse, and decoherence in § 9. In our first developments,
the system will be a single particle in a force field described by a potential energy. In
quantum mechanics jargon, this potential energy is called a potential, which is not to be
confused with the potential in classical electromagnetism, which is potential energy per unit charge.

The state vector is normalized (i.e., its length is 1), and its direction in the Hilbert
space gives all the information about the state that one can know. How that information is
extracted we elucidate below in § 4. The evolution of the direction in the Hilbert space is
the evolution of the system. Actually, one seldom if ever describes the evolution as an evolution
of direction, but it’s a valid perspective.

The time evolution of the state is determined by Schrodinger’s equation (AKA the Schrodinger
equation), which we introduce below in § 6. It is the general equation of motion of
non-relativistic quantum mechanics and is the quantum mechanical analog to Newton’s 2nd law
(AKA $\vec{F}_{\rm net} = m\vec{a}$). Like Newton’s 2nd law, Schrodinger’s equation can be loosely described
as an imbalance or disequilibrium between energy terms that determines time evolution. (I
am thinking of classical forces as field-energy-structure-derived entities.) A special case is
when there is no imbalance. Classically, “no imbalance” gives an equilibrium state that is
a static state in the rest frame of the system. In quantum mechanics, “no imbalance” gives an
elementary time dependence, and we call the state a stationary state.

The idea of representing the state of a system by a vector is not unknown in classical
mechanics. The three position variables and three momentum variables of a particle together
constitute a position or displacement vector in 6-dimensional phase space. But there is a
profound difference. The components of the phase space vector are definite characteristics of the
particle. In quantum mechanics, the components of the state vector with respect to a basis
(i.e., a set of unit vectors that span the Hilbert space) are the amplitudes for that basis.
We explicate the terms basis and amplitude below in § 4. The fact that the components are
amplitudes leads to the requirement that the state vector be normalized, as we will also see
in § 4.

4. HILBERT SPACES AND OBSERVABLES

5. THE WAVE FUNCTION

6. SCHROEDINGER’S EQUATION

Schrodinger’s equation is

$$ i\hbar\frac{d|\Psi\rangle}{dt} = H|\Psi\rangle . \tag{3} $$

Schrodinger’s equation is often written with a partial time derivative rather than a full
time derivative. The full time derivative seems to be the more correct form (e.g.,
Cohen-Tannoudji et al. 1977, p. 222), but there is no distinction in most cases since there usually
will be no implicit time dependence in the independent arguments of |Ψ〉.

6.1. A Natural Path to Schrodinger’s Equation

Schrodinger’s equation cannot be derived from classical physics. Counterfactually, if it
could, then we would be partially along the path to finding that quantum mechanics is an
emergent theory from classical physics. In fact, at the moment it seems the other way around.
But, of course, both may be emergent theories from some more general physics that is as yet
unknown.

It is possible to follow a natural path to Schrodinger’s equation that makes use of
classical and quantum mechanics concepts. We call it a natural path since it is one that could
have been used in the discovery of Schrodinger’s equation by Erwin Schrodinger (1887–1961)
(Wikipedia: Erwin Schrodinger). The actual historical path was probably more convoluted,
but we leave the historical path to the history of science. It is pedagogically useful to follow
the natural path since it helps to understand quantum mechanics and to understand how
theorizing works.

In the early 1920s, de Broglie hypothesized that massive particles, like photons (which
were already widely believed to exist), would obey the relations (which we now call the de
Broglie relations)

$$ \lambda = \frac{h}{p} \quad\text{and}\quad E = h\nu , \tag{4} $$

where λ is the particle’s wavelength, p is its momentum, E is its energy, ν is its frequency, and
h is Planck’s constant. The formula for λ is conventionally called the de Broglie wavelength.
De Broglie’s hypothesis included the idea that the wavelength and frequency were meaningful
for a massive particle. De Broglie’s relations are quantum mechanical concepts. We assume
that E is the mechanical energy: i.e., the sum of kinetic energy T and potential energy V,
which we will just call the potential hereafter following the convention of quantum mechanics.
Thus,

$$ E = T + V = \frac{p^2}{2m} + V , \tag{5} $$

where p is momentum again, m is the particle mass, and we have assumed the non-relativistic
limit.

We now invoke classical concepts. If there are matter waves, then we can imagine that
there must be a wave function that describes the oscillation of whatever is oscillating as a function
of space and time. There should also be a wave equation that is the equation of motion for
the matter waves: i.e., a differential equation that is the dynamical law that governs their
evolution. We will consider just a system with one spatial dimension for our wave function and
wave equation development.

For a wave function, we ansatz the traveling wave function

$$ \Psi = c\,e^{i(kx-\omega t)} , \tag{6} $$

where c is an amplitude we leave unspecified for the moment, k = 2π/λ is the wavenumber,
x is the spatial coordinate, ω = 2πν is the angular frequency, t is time, and i is, of course, the
imaginary unit. We note using equation (4) that

$$ p = \frac{h}{\lambda} = \frac{hk}{2\pi} = \hbar k \quad\text{and}\quad E = h\nu = \frac{h\omega}{2\pi} = \hbar\omega . \tag{7} $$

What is Ψ? For the moment, we will just say that it is a function that somehow tells us
where the particle is or how it is spread out in space. Our traveling wave function is, in fact,
completely delocalized. Aside from the periodic oscillation itself, it is uniform for all x. But
we will not let that stop us.

Now that we have a wave function, can we deduce the wave equation from which it
follows? Well, first let’s assume an energy conserving system (i.e., E constant) with a
constant potential V, and therefore constant kinetic energy T. We have the equation, so far
uninteresting,

$$ T + V = E . \tag{8} $$

From our wave function, we see that we can extract the T value and the E value using,
respectively, the operators

$$ T_{\rm op} = -\frac{\hbar^2}{2m}\frac{\partial^2}{\partial x^2} \quad\text{and}\quad t_{\rm op} = i\hbar\frac{\partial}{\partial t} , \tag{9} $$

where the first is the kinetic energy operator and the second is the time operator. Indeed,
acting on the traveling wave of equation (6), $T_{\rm op}\Psi = (\hbar^2k^2/2m)\Psi = T\Psi$ and
$t_{\rm op}\Psi = \hbar\omega\Psi = E\Psi$ by equation (7). Using these operators and equation (8), we obtain

$$ T_{\rm op}\Psi + V\Psi = t_{\rm op}\Psi \quad\text{or}\quad H\Psi = i\hbar\frac{\partial\Psi}{\partial t} , \tag{10} $$

where $H = T_{\rm op} + V$ is the quantum mechanical Hamiltonian that we have seen above.
Equation (10), which is satisfied by our wave function, is Schrodinger’s equation.
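The claim that the traveling wave satisfies equation (10) can be checked numerically. The following is a minimal sketch, assuming natural units (ħ = m = 1), a constant potential, and central finite differences for the derivatives; the function names and the sample values of x, t, k, and V are hypothetical choices made only for illustration.

```python
import cmath

HBAR = 1.0  # assumption: natural units with hbar = 1
MASS = 1.0  # assumption: particle mass m = 1

def plane_wave(x, t, k, omega, c=1.0):
    """Traveling wave of equation (6): c * exp(i(kx - omega t))."""
    return c * cmath.exp(1j * (k * x - omega * t))

def schrodinger_residual(x, t, k, potential, h=1e-3):
    """Return |H psi - i hbar dpsi/dt| at (x, t) using central finite differences."""
    # Dispersion relation from equations (5) and (7): hbar*omega = (hbar k)^2/(2m) + V
    omega = (HBAR * k) ** 2 / (2.0 * MASS * HBAR) + potential / HBAR

    def psi(xx, tt):
        return plane_wave(xx, tt, k, omega)

    d2psi_dx2 = (psi(x + h, t) - 2.0 * psi(x, t) + psi(x - h, t)) / h**2
    dpsi_dt = (psi(x, t + h) - psi(x, t - h)) / (2.0 * h)
    lhs = -HBAR**2 / (2.0 * MASS) * d2psi_dx2 + potential * psi(x, t)
    rhs = 1j * HBAR * dpsi_dt
    return abs(lhs - rhs)

residual = schrodinger_residual(x=0.3, t=0.7, k=2.0, potential=0.5)
print(residual)  # small; limited only by the finite-difference error
```

The residual vanishes (up to discretization error) only when ω is tied to k by the energy relation E = T + V, which is the content of the natural path.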

Following the natural path, we assume that equation (10) generalizes to all cases, which
include three dimensions, space and time dependent potentials, and multiple particles. Then
we ansatz the standard interpretation of the wave function as a state amplitude and
$|\Psi|^2$ as a state density, and ansatz Born’s rule. That completes the natural path.

Of course, all the history of quantum mechanics confirms that Schrodinger’s equation

is the fundamental equation of motion for non-relativistic quantum mechanics.

6.2. Continuity Conditions

6.3. Free Particle Case

7. THE UNCERTAINTY PRINCIPLE

The uncertainty principle is an immensely important result in quantum mechanics, both
as a conceptual aid to understanding quantum mechanics and as a calculation tool in making
estimates. The word “principle” is used for historical reasons. The uncertainty principle is, in
fact, a result of the vector formalism of quantum mechanics. The word “uncertainty” is
a bit of a misnomer. The modern uncertainty principle is not directly about measurement
uncertainty, but is a relationship between the widths (more precisely, standard deviations)
of the superposition distribution of a state for two different eigenstate bases.

The general version of the uncertainty principle is

$$ \sigma_A\sigma_B \ge \frac{1}{2}\left|\langle i[A,B]\rangle\right| , \tag{11} $$

where A and B are general observables, and $\sigma_A$ and $\sigma_B$ are the standard deviations for the
observables for a general state |α〉. There is actually a limitation on the generality of the operators
and states. The vectors A|α〉 and B|α〉 should, in general, still be ones for which A and B are
Hermitian operators. There are tricky cases where this is not so: we will consider two examples
in § 7.4. For these tricky cases, the uncertainty principle does not hold.

We should also note that i[A,B] is a Hermitian operator (when the uncertainty principle
holds), and 〈i[A,B]〉 is a real number. We should prove this. From § ??, we know that

$$ [A,B]^\dagger = -[A^\dagger,B^\dagger] \tag{12} $$

for general operators A and B. If the operators are, in fact, Hermitian, then

$$ (i[A,B])^\dagger = i^*(-[A,B]) = i[A,B] , \tag{13} $$

and that’s QED since the Hermitian conjugate of i[A,B] equals i[A,B]: i.e.,

$$ (i[A,B])^\dagger = i[A,B] . \tag{14} $$

Another comment to make is in regard to the time-energy uncertainty principle. It is
not a special case of the general uncertainty principle of equation (11). The general uncertainty
principle is evaluated all at one instant in time. There is no precise meaning to attach
to the idea of the state distributed into eigenstates of time. The time-energy uncertainty
principle provides information on the evolution of the system. To include time evolution in
the formalism, we need the equation of motion for the system: i.e., Schrodinger’s equation
(§ 6). Making use of Ehrenfest’s theorem (which is derived using Schrodinger’s equation:
§ 8) and the general uncertainty principle, the time-energy uncertainty principle is derived
(§ ??). The time-energy uncertainty principle is called a principle for historical reasons even
though it is simply a result.

Now for the main event: the proof of the uncertainty principle.

7.1. Proof

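First, we note the variance for A is

$$ \begin{aligned}
\sigma_A^2 &= \langle\alpha|(A-\langle A\rangle)^2|\alpha\rangle = \langle\alpha|(A-\langle A\rangle)(A-\langle A\rangle)|\alpha\rangle = \langle\alpha|(A-\langle A\rangle)^\dagger(A-\langle A\rangle)|\alpha\rangle \\
&= \langle(A-\langle A\rangle)\alpha|(A-\langle A\rangle)|\alpha\rangle ,
\end{aligned} \tag{15} $$

where we have used the facts that A is Hermitian for |α〉 and that the expectation value 〈A〉 is
a pure real c-number (since A is Hermitian for |α〉), and so is a trivial Hermitian operator
too. Similarly, the variance for B is

$$ \sigma_B^2 = \langle(B-\langle B\rangle)\alpha|(B-\langle B\rangle)|\alpha\rangle . \tag{16} $$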

Second, we note

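$$ \sigma_A^2\sigma_B^2 = \langle(A-\langle A\rangle)\alpha|(A-\langle A\rangle)|\alpha\rangle\,\langle(B-\langle B\rangle)\alpha|(B-\langle B\rangle)|\alpha\rangle \ge \left|\langle(A-\langle A\rangle)\alpha|(B-\langle B\rangle)|\alpha\rangle\right|^2 , \tag{17} $$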

where we have used the Schwarz inequality (§ ??). Now

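$$ \begin{aligned}
\left|\langle(A-\langle A\rangle)\alpha|(B-\langle B\rangle)|\alpha\rangle\right|^2
&= \left|\langle\alpha|(A-\langle A\rangle)^\dagger(B-\langle B\rangle)|\alpha\rangle\right|^2 \\
&= \left|\langle\alpha|(A^\dagger B - A^\dagger\langle B\rangle - \langle A\rangle B + \langle A\rangle\langle B\rangle)|\alpha\rangle\right|^2 \\
&= \left|\langle\alpha|(A^\dagger B - A\langle B\rangle - \langle A\rangle B + \langle A\rangle\langle B\rangle)|\alpha\rangle\right|^2 \\
&= \left|\langle A^\dagger B\rangle - \langle A\rangle\langle B\rangle\right|^2 \\
&= \left({\rm Re}[\langle A^\dagger B\rangle] - \langle A\rangle\langle B\rangle\right)^2 + {\rm Im}[\langle A^\dagger B\rangle]^2 \\
&= \left[\frac{1}{2}(\langle A^\dagger B\rangle + \langle B^\dagger A\rangle) - \langle A\rangle\langle B\rangle\right]^2 + \left[-\frac{i}{2}(\langle A^\dagger B\rangle - \langle B^\dagger A\rangle)\right]^2 ,
\end{aligned} \tag{18} $$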

where we have not assumed that A is Hermitian for B|α〉 nor that B is Hermitian for A|α〉

and where Re[] and Im[] are functions that evaluate, respectively, to the real and imaginary

parts of their arguments. Now we can write the rather general inequality

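$$ \sigma_A^2\sigma_B^2 \ge \left[\frac{1}{2}(\langle A^\dagger B\rangle + \langle B^\dagger A\rangle) - \langle A\rangle\langle B\rangle\right]^2 + \left[-\frac{i}{2}(\langle A^\dagger B\rangle - \langle B^\dagger A\rangle)\right]^2 . \tag{19} $$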

Third, we now do assume that A is Hermitian for B|α〉 and that B is Hermitian for

A|α〉 and our inequality reduces to

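$$ \sigma_A^2\sigma_B^2 \ge \left(\frac{1}{2}\langle\{A,B\}\rangle - \langle A\rangle\langle B\rangle\right)^2 + \left(\frac{1}{2}\langle i[A,B]\rangle\right)^2 , \tag{20} $$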

where we have made use of the commutator and the anticommutator (§ ??). The first term
on the right-hand side is the classical analog term. If A and B were commuting functions
and the squared amplitude was a set of probabilities or a probability density, then this term
would become the square of the covariance 〈AB〉 − 〈A〉〈B〉 of the functions A and B for the
probability distribution (e.g., Bevington 1969, p. 64). The second term is purely quantum
mechanical. It is non-zero in general if A and B do not commute (i.e., are incompatible
observables) and therefore have no common basis. It may be zero for particular states |α〉
for incompatible observables, if the expectation value 〈i[A,B]〉 happens to be zero.

Finally, dropping the classical analog term, we have the uncertainty principle (eq. 11)

as given above:

$$ \sigma_A\sigma_B \ge \frac{1}{2}\left|\langle i[A,B]\rangle\right| . \tag{21} $$

The uncertainty principle shows that neither $\sigma_A$ nor $\sigma_B$ can be zero if 〈i[A,B]〉 ≠ 0.

The uncertainty principle can be used to estimate one of $\sigma_A$ or $\sigma_B$ if the other can
be estimated and (1/2)|〈i[A,B]〉| can be estimated. For example, if some actual experimental
measurement confines (by wave function collapse) the distribution of the state among the A
eigenstates and (1/2)|〈i[A,B]〉| is estimated, then we can obtain a lower bound on the range
of B eigenvalues obtainable from a second measurement that collapses the wave function.
So the uncertainty principle does have use in making estimates of quantities that might be
very hard to calculate or measure exactly.
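The general uncertainty principle of equation (11) can be checked numerically in the smallest non-trivial Hilbert space. The sketch below is an illustration only: it assumes a spin-1/2 system with A and B taken to be the Pauli matrices σ_x and σ_y (observables not introduced until the chapter Spin), and all function names and the sample states are hypothetical.

```python
import cmath
import math

def matvec(m, v):
    """Apply matrix m to vector v."""
    return [sum(m[i][j] * v[j] for j in range(len(v))) for i in range(len(m))]

def matmul(a, b):
    """Product of two square matrices."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def inner(u, v):
    """Inner product <u|v> with conjugation on the first vector."""
    return sum(complex(x).conjugate() * y for x, y in zip(u, v))

def expect(m, v):
    """Expectation value <v|m|v> for a normalized state v."""
    return inner(v, matvec(m, v))

def std_dev(m, v):
    """Standard deviation of a Hermitian operator m for state v."""
    mean = expect(m, v).real
    mean_sq = expect(matmul(m, m), v).real
    return math.sqrt(max(mean_sq - mean**2, 0.0))  # clamp tiny negative roundoff

def uncertainty_sides(a, b, v):
    """Return (sigma_A * sigma_B, (1/2)|<i[A,B]>|), the two sides of equation (11)."""
    ab, ba = matmul(a, b), matmul(b, a)
    comm = [[ab[i][j] - ba[i][j] for j in range(len(a))] for i in range(len(a))]
    lhs = std_dev(a, v) * std_dev(b, v)
    rhs = 0.5 * abs(1j * expect(comm, v))
    return lhs, rhs

SIGMA_X = [[0, 1], [1, 0]]
SIGMA_Y = [[0, -1j], [1j, 0]]

def spin_state(theta, phi):
    """Normalized spin-1/2 state parameterized by polar angles."""
    return [math.cos(theta / 2), cmath.exp(1j * phi) * math.sin(theta / 2)]

for theta, phi in [(0.0, 0.0), (1.0, 0.3), (2.0, 1.5), (0.4, 2.2)]:
    lhs, rhs = uncertainty_sides(SIGMA_X, SIGMA_Y, spin_state(theta, phi))
    print(f"theta={theta:.1f} phi={phi:.1f}  sigma_A*sigma_B={lhs:.6f}  bound={rhs:.6f}")
```

For every state tried, the product of standard deviations is at least as large as the bound, and the bound itself depends on the state through the expectation value 〈i[A,B]〉.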

7.2. The Heisenberg Uncertainty Principle

The uncertainty principle’s most famous special case is for the position and wavenumber

observables. This special case is called the Heisenberg uncertainty principle.

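Consider the position observable $x_{{\rm op},i} = x_i$ and the wavenumber observable

$$ k_{{\rm op},j} = \frac{1}{i}\frac{\partial}{\partial x_j} , \tag{22} $$

where the subscripts i and j are general coordinate indices which could be any of 1, 2, 3 (§ 5),
and the i in the factor 1/i is the imaginary unit, not the index. Evaluating
the commutator of these gives

$$ [x_{{\rm op},i},k_{{\rm op},j}] = x_i k_{{\rm op},j} - k_{{\rm op},j}x_i = x_i k_{{\rm op},j} - \frac{1}{i}\delta_{ij} - x_i k_{{\rm op},j} = i\delta_{ij} . \tag{23} $$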

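Thus, we have the commutators

$$ [x_{{\rm op},i},k_{{\rm op},j}] = i\delta_{ij} \quad\text{and}\quad [x_{{\rm op},i},p_{{\rm op},j}] = i\hbar\delta_{ij} \tag{24} $$

and the special uncertainty principle cases

$$ \sigma_{x_{{\rm op},i}}\sigma_{k_{{\rm op},j}} \ge \frac{1}{2}\delta_{ij} \quad\text{and}\quad \sigma_{x_{{\rm op},i}}\sigma_{p_{{\rm op},j}} \ge \frac{\hbar}{2}\delta_{ij} . \tag{25} $$

Either of these can be called the Heisenberg uncertainty principle, but that term is often
reserved just for

$$ \sigma_{x_{{\rm op},i}}\sigma_{p_{{\rm op},i}} \ge \frac{\hbar}{2} . \tag{26} $$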

We emphasize again that the uncertainty principle is not directly about measurement
uncertainty, but is a relationship between the widths (more precisely, standard deviations)
of the superposition distribution of a state for two different eigenstate bases. But it can be
used to make estimates of the range of experimental results as discussed in § 7.

In the Heisenberg uncertainty principle, the lower limit on the value of
$\sigma_{x_{{\rm op},i}}\sigma_{k_{{\rm op},j}}$ and $\sigma_{x_{{\rm op},i}}\sigma_{p_{{\rm op},j}}$ is
independent of the particular state. This makes using the Heisenberg uncertainty principle
for estimates of the ranges of experimental results particularly easy. Since position and
wavenumber/momentum are very important dynamical variables, the Heisenberg uncertainty
principle has great utility.

For example, say that you have experimentally confined a particle to a range ∼ ∆x in
the x-direction. Then the range in particle velocities ∆v (for the x-direction) on a subsequent
measurement would satisfy

$$ \Delta v \gtrsim \frac{\hbar}{2m\,\Delta x} . \tag{27} $$

For example, say we have an electron ($m = 9.109\ldots\times10^{-31}$ kg) confined to about 1 Å =
$10^{-10}$ m. We find

$$ \Delta v \gtrsim \frac{1}{2}\times10^{6}\ {\rm m/s} , \tag{28} $$

where the 1/2 factor may be insignificant depending on the particular case. The velocity range
lower limit is high, but does not imply the velocity values themselves will be relativistic (in
which case our non-relativistic treatment might fail). The result just obtained
is only a lower limit on the range of velocities. In actual cases
where the expectation value for velocity is known to be zero, the lower limit on the range is
often a good estimate of the order of magnitude of the velocity.
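The arithmetic of the electron example can be reproduced in a few lines. This is a minimal sketch: the function name is hypothetical, and the constants are the CODATA 2018 values for the reduced Planck constant and the electron mass.

```python
HBAR = 1.054571817e-34            # reduced Planck constant in J s (CODATA 2018)
ELECTRON_MASS = 9.1093837015e-31  # electron mass in kg (CODATA 2018)

def velocity_range_lower_bound(mass, delta_x):
    """Lower bound on the velocity range: hbar / (2 m delta_x)."""
    return HBAR / (2.0 * mass * delta_x)

# Electron confined to about 1 angstrom = 1e-10 m
delta_v = velocity_range_lower_bound(ELECTRON_MASS, 1.0e-10)
print(f"{delta_v:.3e} m/s")  # about 5.8e5 m/s, i.e., roughly (1/2) x 10^6 m/s
```

The result is about 0.2 % of the speed of light, so the non-relativistic treatment is self-consistent for this confinement length.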

Actually, Heisenberg derived his eponymic uncertainty principle for actual measurements,
where its interpretation is different from that of the modern uncertainty principle.
The uncertainty quantities represented actual possible measurement uncertainties, whereas
in the modern formulation they are standard deviations of the state for observables.
Heisenberg’s interpretation is now believed to be not completely valid. A modern experimental
uncertainty relation has been derived by Ozawa (2012, and references therein), which has
been experimentally verified so far (Erhart et al. 2012). It is beyond our scope to go further
into the subject of the experimental uncertainty relation.

7.3. The Minimum Uncertainty State

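In proving the uncertainty principle, we had as a step equation (17):

$$ \sigma_A^2\sigma_B^2 = \langle(A-\langle A\rangle)\alpha|(A-\langle A\rangle)|\alpha\rangle\,\langle(B-\langle B\rangle)\alpha|(B-\langle B\rangle)|\alpha\rangle \ge \left|\langle(A-\langle A\rangle)\alpha|(B-\langle B\rangle)|\alpha\rangle\right|^2 . \tag{29} $$

We define a minimum uncertainty state to be one where the equality holds for this expression.
Thus, a minimum uncertainty state satisfies

$$ \sigma_A^2\sigma_B^2 = \langle(A-\langle A\rangle)\alpha|(A-\langle A\rangle)|\alpha\rangle\,\langle(B-\langle B\rangle)\alpha|(B-\langle B\rangle)|\alpha\rangle = \left|\langle(A-\langle A\rangle)\alpha|(B-\langle B\rangle)|\alpha\rangle\right|^2 . \tag{30} $$

Now from the Schwarz inequality (§ ??), we know that the last equality is equivalent to
having

$$ (A-\langle A\rangle)|\alpha\rangle = c(B-\langle B\rangle)|\alpha\rangle \quad\text{or}\quad (A-cB)|\alpha\rangle = (\langle A\rangle - c\langle B\rangle)|\alpha\rangle , \tag{31} $$

where c is some c-number. The last expression is an eigenvalue problem, but not necessarily
a Hermitian operator eigenvalue problem. The Hermitian conjugate of (A − cB) is

$$ (A-cB)^\dagger = A^\dagger - c^*B^\dagger = A - c^*B , \tag{32} $$

which equals A − cB, making A − cB Hermitian, only for the case of c being pure real.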

We are not completely free to choose c. The requirement that the minimum uncertainty state be
normalizable puts a constraint on c. Still, there can be some freedom in choosing c, and thus
in making $\sigma_A\sigma_B$ small. We might guess that choosing c to be pure imaginary would help minimize
$\sigma_A\sigma_B$. If c were pure imaginary,

$$ \langle(A-\langle A\rangle)\alpha|(B-\langle B\rangle)|\alpha\rangle = c^*\langle\alpha|(B-\langle B\rangle)^2|\alpha\rangle \tag{33} $$

would be pure imaginary since (B − 〈B〉)² is a Hermitian operator (so its expectation value is
real), and this would mean that the classical term in equation (20) would be zero. Thus, we find that

$$ \sigma_A\sigma_B = \frac{1}{2}\left|\langle i[A,B]\rangle\right| . \tag{34} $$

But it is not clear in general that c can be chosen to be pure imaginary, nor that doing
so necessarily leads to the smallest possible value of $\sigma_A\sigma_B$. In fact, choosing c to be pure
imaginary does lead to the smallest possible $\sigma_A\sigma_B$, but we only know this after we have
shown it.

We need to remark that the state obtained by solving equation (31) is just the state

at one instant in time in general. There is no guarantee that it is a stationary state. So in

general it will evolve and cease to be a minimum uncertainty state after that one instant.

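We cannot go any further in general, and so will specialize equation (31) to the case of
the Heisenberg uncertainty principle in the position basis for one dimension:

$$ \left(\frac{1}{i}\frac{\partial}{\partial x} - cx\right)\Psi = (\langle k\rangle - c\langle x\rangle)\Psi , \tag{35} $$

where we have chosen $A = k_{\rm op} = (1/i)\partial/\partial x$ and $B = x_{\rm op} = x$. The solution of this differential
equation is straightforward: one obtains

$$ \Psi = C\exp\left[i(\langle k\rangle - c\langle x\rangle)x + \frac{ic}{2}x^2\right] , \tag{36} $$

where C is a normalization constant. For this wave function to be normalizable, c must have
a non-zero positive imaginary part. Let us write c = a + ib where b > 0. Now we have

$$ \begin{aligned}
\Psi &= C\exp\left[i(\langle k\rangle - a\langle x\rangle)x + \frac{ia}{2}x^2 - \frac{b}{2}x^2 + b\langle x\rangle x\right] \\
&= D\exp\left[i\langle k\rangle x + \frac{ia}{2}(x-\langle x\rangle)^2 - \frac{b}{2}(x-\langle x\rangle)^2\right] \\
&= \frac{1}{\sqrt{\sigma_x\sqrt{2\pi}}}\exp\left[i\langle k\rangle x + \frac{ia}{2}(x-\langle x\rangle)^2 - \frac{1}{2(2\sigma_x^2)}(x-\langle x\rangle)^2\right] ,
\end{aligned} \tag{37} $$

where we have completed the square for the x quadratic and adjusted the normalization constant
accordingly (dropping a physically irrelevant constant phase), identified the wave function as a
Gaussian, and recognized that the standard deviation of the squared amplitude (which is also a
Gaussian) satisfies $b = 1/(2\sigma_x^2)$ (e.g.,
Bevington 1969, p. 53). Now we find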

〈(A− 〈A〉)α|(B − 〈B〉)|α〉 = c∗〈α|(B − 〈B〉)2|α〉 = c∗σ2x =

a− ib

2b(38)

Thus we find

σxσk =

√a2 + b2

2b

(39)

Obviously, we minimize σxσk by choosing a = 0. As we anticiplated, the minimum wave func-

tion is obtained for pure imaginary c. Since the b cancels out of the result, it is indeterminate.

Thus, our minimum uncertainty Heisenberg wavefunction is

Ψ ==1

σ√

2πexp[i〈k〉 − 1

2(2σ2x)

(x− 〈x〉)2] (40)

and this gives

σxσk =1

2and σxσp =

h−2. (41)

As we anticiplated, the minimum wave function is obtained for c pure real.
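As a numerical sanity check on equation (39), the following minimal sketch estimates σxσk on a grid for a wave function of the form exp[i〈k〉x + (ia − b)(x − 〈x〉)²/2]. The function name `sigma_product`, the grid parameters, and the default values of 〈k〉 and 〈x〉 are all illustrative assumptions, not part of the text; the derivative in kop is approximated by central differences, so agreement is only to finite-difference accuracy.

```python
import cmath
import math

def sigma_product(a, b, k0=1.5, x0=0.3, half_width=12.0, n=6001):
    """Estimate sigma_x * sigma_k for Psi ~ exp[i*k0*x + (i*a - b)*(x - x0)**2 / 2].

    k0 and x0 play the roles of <k> and <x>; b must be positive.
    The wave function is left unnormalized and the norm is divided out.
    """
    dx = 2.0 * half_width / (n - 1)
    xs = [x0 - half_width + j * dx for j in range(n)]
    psi = [cmath.exp(1j * k0 * x + 0.5 * (1j * a - b) * (x - x0) ** 2) for x in xs]
    dens = [abs(p) ** 2 for p in psi]
    norm = sum(dens) * dx
    mean_x = sum(x * d for x, d in zip(xs, dens)) * dx / norm
    var_x = sum((x - mean_x) ** 2 * d for x, d in zip(xs, dens)) * dx / norm
    # k_op = (1/i) d/dx, with d/dx approximated by central differences.
    dpsi = [(psi[j + 1] - psi[j - 1]) / (2.0 * dx) for j in range(1, n - 1)]
    mean_k = sum((psi[j].conjugate() * dpsi[j - 1] / 1j).real
                 for j in range(1, n - 1)) * dx / norm
    # <k^2> = integral of |dPsi/dx|^2 (integration by parts, no boundary terms).
    mean_k2 = sum(abs(d) ** 2 for d in dpsi) * dx / norm
    return math.sqrt(var_x * (mean_k2 - mean_k ** 2))

if __name__ == "__main__":
    for a, b in [(0.0, 1.0), (2.0, 1.0), (1.0, 0.5)]:
        print(a, b, sigma_product(a, b), math.sqrt(a * a + b * b) / (2.0 * b))
```

Running the script prints the numerical product next to the value of equation (39) for a few (a, b) pairs; only the a = 0 case should sit at the minimum of 1/2.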

The minimum Heisenberg uncertainty wave function can probably arise in many systems, but in general it exists only for an instant in time. However, the ground eigenstate (i.e., ground stationary state) of the 1-dimensional harmonic oscillator is a Gaussian wave packet (with a = 0), and so is a minimum Heisenberg uncertainty wave function that does not evolve in time. Another wave function that remains a Gaussian for more than an instant is the free particle wave packet that is initially a Gaussian (e.g., Griffiths 2005, p. 67). Its standard deviation increases with time, but it remains Gaussian. We consider both the 1-dimensional harmonic oscillator and the 1-dimensional wave packet in the chapter 1-Dimensional Systems.

7.4. Cases Where the Uncertainty Principle Does Not Apply

There is actually a limitation on the generality of the operators and states for the uncertainty principle, as mentioned in § 7. For the uncertainty principle to apply, the vectors A|α〉 and B|α〉 should, in general, still be ones for which A and B are Hermitian operators.

There are tricky cases (which do turn up) where the uncertainty principle does not apply. Discussing them actually will get us a bit ahead of the topics in this chapter, but so be it. Consider a 1-dimensional system of length L with zero potential (or constant potential chosen to have value zero) containing a particle of mass m. We impose periodic boundary conditions: i.e.,

$$ \psi(0) = \psi(L) . \qquad (42) $$

These boundary conditions cannot be exactly valid for any Euclidean 1-dimensional space, but there are cases where they are a valid approximation, as we will discuss in the chapter Solids: one can approximate a finite crystal using periodic boundary conditions. In this case, the normalized solutions of the time-independent Schrodinger equation are

$$ \psi_k(x) = \frac{e^{ikx}}{\sqrt{L}} , \qquad (43) $$

which are evidently both energy and wavenumber eigenstates just as for the free particle case briefly discussed in § 6.3. Quantization is imposed on the eigenstates by the need to meet the boundary conditions:

$$ kL = 2\pi n \quad\text{or}\quad k = \frac{2\pi}{L}\,n , \qquad (44) $$

where n can be any integer. The integer n is a quantum number in the jargon of quantum mechanics: a dimensionless number (not always an integer) that indexes the eigenvalues of an observable: the eigenvalues are functions of the quantum numbers. The eigen-energies are also quantized, of course:

$$ E = \frac{\hbar^2k^2}{2m} = \frac{\hbar^2}{2m}\left(\frac{2\pi}{L}\right)^2 n^2 . \qquad (45) $$

We can see the energy values have a double degeneracy: for n ≠ 0, two wavenumber states (those for n and −n) have each energy.

By the quantum mechanics axiom of § 4, the wavenumber states should form a basis for the system space [0, L]. In this case, the Sturm-Liouville theory of mathematics (e.g., Arfken 1970, p. 442ff) verifies that the wavenumber states are a complete set (i.e., are a basis) in a Sturm-Liouville theory sense. This means any piecewise continuous function in the interval [0, L] can be expanded in the basis with the squared discrepancy between function and expansion integrated over the interval vanishing. This means there are Hilbert space vectors that cannot be physical states. Recall we require wave functions and their 1st derivatives to be continuous (§ 6.2). This rules out vectors that are only piecewise continuous. We also rule out vectors that do not satisfy the boundary conditions: they are discontinuous at the boundary.

The wavenumber basis is, in fact, the exponential Fourier series scaled to the interval [0, L] (e.g., Arfken 1970, p. 643).

If the particle is in a wavenumber eigenstate, the standard deviation of the wavenumber observable is σk = 0. The standard deviation of the position observable xop = x is finite since the whole system only extends from x = 0 to x = L. So we have

$$ \sigma_x\sigma_k = 0 < \frac{1}{2} \qquad (46) $$

in apparent violation of the Heisenberg uncertainty principle (§ 7.2).

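The paradox of apparent violation is resolved by noting that

$$ \chi_k(x) \equiv x\psi_k(x) = \frac{x e^{ikx}}{\sqrt{L}} \qquad (47) $$

is not a vector for which kop is a Hermitian operator (the symbol χk is introduced here only as shorthand). Behold:

$$ \begin{aligned}
\langle\psi_{k'}|k_{\rm op}|\chi_k\rangle &= \int_0^L \frac{e^{-ik'x}}{\sqrt{L}}\,\frac{1}{i}\frac{\partial}{\partial x}\left(\frac{x e^{ikx}}{\sqrt{L}}\right) dx \\
&= \left.\frac{e^{-ik'x}}{\sqrt{L}}\,\frac{1}{i}\left(\frac{x e^{ikx}}{\sqrt{L}}\right)\right|_0^L - \int_0^L \frac{x e^{ikx}}{\sqrt{L}}\,\frac{1}{i}\frac{\partial}{\partial x}\left(\frac{e^{-ik'x}}{\sqrt{L}}\right) dx \\
&= -i + \langle\chi_k|k_{\rm op}|\psi_{k'}\rangle^* , \qquad (48)
\end{aligned} $$

where the boundary term has been evaluated using exp[i(k − k′)L] = 1, which follows from equation (44). The −i term in the last expression shows that kop is not Hermitian (i.e., kop ≠ kop†) for the vector of equation (47). For any vector matching the periodic boundary conditions, kop is Hermitian.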

In the derivation of the uncertainty principle, we relied on each observable acting as a Hermitian operator on the vector formed from the other observable and the general state of the derivation. Without this condition (which we do not have for xψk(x)), the uncertainty principle does not apply. If we look at the intermediate equation (18) in the proof of the uncertainty principle, before making use of the general Hermiticity of operators A and B, we have the inequality

$$ \sigma_A^2\sigma_B^2 \ge \left|\langle A^\dagger B\rangle - \langle A\rangle\langle B\rangle\right|^2 . \qquad (49) $$

Choosing A = xop (which is always Hermitian), B = kop, and ψk as our general state, we find

$$ \sigma_A^2\sigma_B^2 \ge \left|\frac{L}{2}k - \frac{L}{2}k\right|^2 = 0 , \qquad (50) $$

which is exactly correct for our periodic boundary condition case.
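The failure of Hermiticity can also be seen numerically. The sketch below, under the assumption of a simple midpoint-rule quadrature (the function name `hermiticity_gap` and the step count are illustrative choices, not from the text), evaluates the difference between kop acting to the right on xψk and kop acting to the left on ψk′. If kop were Hermitian on xψk the difference would vanish; the boundary term makes it −i instead.

```python
import cmath
import math

def hermiticity_gap(n, n_prime, L=1.0, steps=4000):
    """Return <psi_k'| k_op (x psi_k)> - <k_op psi_k'| x psi_k> on [0, L].

    k = 2*pi*n/L and k' = 2*pi*n_prime/L are periodic-boundary wavenumbers.
    Derivatives are taken analytically; the integrals use the midpoint rule.
    """
    k = 2.0 * math.pi * n / L
    kp = 2.0 * math.pi * n_prime / L
    h = L / steps
    acting_right = 0j
    acting_left = 0j
    for j in range(steps):
        x = (j + 0.5) * h
        psi_k = cmath.exp(1j * k * x) / math.sqrt(L)
        psi_kp = cmath.exp(1j * kp * x) / math.sqrt(L)
        # k_op (x psi_k) = (1/i) d/dx (x psi_k) = (-i + k x) psi_k
        acting_right += psi_kp.conjugate() * (-1j + k * x) * psi_k * h
        # k_op psi_k' = k' psi_k', then take the inner product with x psi_k
        acting_left += (kp * psi_kp).conjugate() * x * psi_k * h
    return acting_right - acting_left

if __name__ == "__main__":
    for n, n_prime in [(1, 1), (2, -1), (0, 3)]:
        print(n, n_prime, hermiticity_gap(n, n_prime))
```

For every pair of allowed wavenumbers tried, the gap should come out close to −i, in line with the boundary term of the integration by parts.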

As mentioned above, periodic boundary conditions cannot be exactly valid for any Euclidean 1-dimensional space. However, they are exact for the azimuthal angle space (the interval [0, 2π]) of spherical polar coordinates, which must have periodic boundary conditions. The position operator for this space is φop = φ (the azimuthal angle itself) and the z component angular momentum observable (which determines the z component angular momentum eigenstates) is

$$ L_z = \frac{\hbar}{i}\frac{\partial}{\partial\phi} \qquad (51) $$

(see the chapter Angular Momentum). This angular case is exactly like the periodic boundary case we have just analyzed, mutatis mutandis, in regard to the uncertainty principle. So the uncertainty principle does not apply to it, and the fact that φop and Lz do not commute does not require the product σφσLz to be greater than zero: it can be zero.

8. EHRENFEST’S THEOREM

Ehrenfest’s theorem (e.g., Wikipedia: Ehrenfest theorem; Griffiths 2005, p. 115) is one of the most basic results of quantum mechanics. The theorem is

$$ \frac{d\langle A\rangle}{dt} = \frac{1}{i\hbar}\langle\Psi|[A,H]|\Psi\rangle + \left\langle\frac{\partial A}{\partial t}\right\rangle , \qquad (52) $$

where A is a very general operator, H is the Hamiltonian, [A,H] is the commutator of A and H, Ψ is a general state, and ∂A/∂t is the partial time derivative of operator A, which is zero in many cases. We require of A only that 〈A〉 exist (i.e., not diverge). It does not have to be an observable. However, the physical interpretation of 〈A〉 and d〈A〉/dt if A is not an observable has to be elucidated. If A is an observable, then [A,H] is not an observable, but (1/i)[A,H] is (see Appendix ??).

Ehrenfest’s theorem gives the time derivative of the expectation value of A. If d〈A〉/dt is zero, A is called a constant of the motion (e.g., Cohen-Tannoudji et al. 1977, p. 247) and expectation value 〈A〉 is conserved. We see that if [A,H] = 0 (i.e., A and H commute) and ∂A/∂t = 0 (i.e., A has no explicit time dependence), then A is a constant of the motion. It seems odd to call an operator a constant of the motion, but that is the accepted jargon. One can probably call 〈A〉 a constant of the motion too if A is a constant of the motion.

An immediate basic result of Ehrenfest’s theorem is that if H is time-independent, then H is a constant of the motion and 〈H〉 is conserved. This is because all operators commute with themselves. Similarly, f(H) is a constant of the motion where f is any power series expansion in its argument.

Ehrenfest’s theorem is derived straight from Schrodinger’s equation. From it many other

basic and interesting results can be easily derived. Here we will derive Ehrenfest’s theorem

and some of these other basic results.

8.1. Derivation of Ehrenfest’s Theorem

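Consider a very general operator A. We require of A only that

$$ \langle A\rangle = \langle\Psi|A|\Psi\rangle \qquad (53) $$

exist (i.e., not diverge to infinity) for general state |Ψ〉.

Now

$$ \begin{aligned}
\frac{d\langle A\rangle}{dt} &= \frac{d\langle\Psi|A|\Psi\rangle}{dt} = \frac{d\langle\Psi|}{dt}A|\Psi\rangle + \langle\Psi|\frac{\partial A}{\partial t}|\Psi\rangle + \langle\Psi|A\frac{d|\Psi\rangle}{dt} \\
&= -\frac{1}{i\hbar}\langle\Psi|H^\dagger A|\Psi\rangle + \frac{1}{i\hbar}\langle\Psi|AH|\Psi\rangle + \langle\Psi|\frac{\partial A}{\partial t}|\Psi\rangle \\
&= \frac{1}{i\hbar}\langle\Psi|[A,H]|\Psi\rangle + \left\langle\frac{\partial A}{\partial t}\right\rangle , \qquad (54)
\end{aligned} $$

where we make use of the product rule, Schrodinger’s equation and its Hermitian conjugate,

$$ i\hbar\frac{d|\Psi\rangle}{dt} = H|\Psi\rangle \quad\text{and}\quad -i\hbar\frac{d\langle\Psi|}{dt} = \langle\Psi|H^\dagger = \langle\Psi|H , \qquad (55) $$

and results from Appendix ??. Note ∂A/∂t = dA/dt since we can assume operators have no implicit time dependence as a usual rule.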

Equation (54) is the complete proof.
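Ehrenfest’s theorem can be checked on a toy system. The sketch below assumes a two-level system with H = (ħω/2)σx and A = σz in units with ħ = 1; the system, the value of ω, and all function names are illustrative assumptions, not from the text. It compares a central-difference estimate of d〈A〉/dt with (1/iħ)〈[A,H]〉, using the exact solution of Schrodinger’s equation for the state that starts in the upper level. Since A has no explicit time dependence here, the ∂A/∂t term is zero.

```python
import math

HBAR = 1.0    # units with hbar = 1 (assumption of this sketch)
OMEGA = 0.7   # hypothetical angular frequency

SIGMA_X = [[0, 1], [1, 0]]
SIGMA_Z = [[1, 0], [0, -1]]
H = [[0.5 * HBAR * OMEGA * e for e in row] for row in SIGMA_X]
A = SIGMA_Z

def matmul(m1, m2):
    """Product of two 2x2 matrices."""
    return [[sum(m1[i][k] * m2[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def expval(op, psi):
    """Expectation value <psi|op|psi> for a normalized 2-component state."""
    op_psi = [sum(op[i][j] * psi[j] for j in range(2)) for i in range(2)]
    return sum(psi[i].conjugate() * op_psi[i] for i in range(2))

def state(t):
    """Exact solution of i*hbar d|Psi>/dt = H|Psi> with |Psi(0)> = (1, 0)."""
    return [complex(math.cos(0.5 * OMEGA * t)), -1j * math.sin(0.5 * OMEGA * t)]

def lhs(t, dt=1e-5):
    """d<A>/dt by central differences."""
    return (expval(A, state(t + dt)) - expval(A, state(t - dt))).real / (2.0 * dt)

def rhs(t):
    """(1/(i*hbar)) <Psi|[A,H]|Psi>."""
    ah, ha = matmul(A, H), matmul(H, A)
    comm = [[ah[i][j] - ha[i][j] for j in range(2)] for i in range(2)]
    return (expval(comm, state(t)) / (1j * HBAR)).real

if __name__ == "__main__":
    for t in (0.0, 0.4, 1.3, 2.9):
        print(t, lhs(t), rhs(t))
```

For this choice 〈σz〉 = cos(ωt), so both columns should track −ω sin(ωt) to within the finite-difference error.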

9. WAVE FUNCTION COLLAPSE AND BORN’S RULE

Wave function collapse is the most mysterious and controversial aspect of quantum mechanics and has been since the beginning of quantum mechanics in 1921–1926. It is sometimes referred to as the wave function reduction or fundamental perturbation (e.g., Cohen-Tannoudji et al. 1977, p. 220, 226). These terms seem like euphemisms to me. Older textbooks often seem not to mention wave function collapse at all or to refer to it only fleetingly. Let’s elucidate wave function collapse insofar as we can.

First, we note that Schrodinger’s equation is completely deterministic. Given the initial

conditions and the time evolution of any external potential, the future evolution and the

past evolution too are completely determined.

However, when an ideal “measurement” of a dynamical variable is made, the wave

function collapses and

For the position representation, the probability density is the magnitude squared of the wave function, |Ψ|².

Decoherence theory dispenses with this axiom by showing how wave function collapse is, in fact, a decoherence event or effective wave function collapse that follows from quantum mechanics based on the first three axioms. If decoherence theory is accepted, then the wave function collapse axiom becomes a faux axiom that is just a highly useful rule in predicting outcomes of measurements.

SQM is neutral about decoherence theory. It is a theory which has gained traction since its early formulation in the 1970s and 1980s (e.g., Zurek 2003). But it is not universally accepted (e.g., Schlosshauer et al. 2013). In fact, decoherence

with effective wave function collapse and

It is still a possibility that there are true wave function collapses. Appendix C.

Decoherence theory dispenses with this axiom too by proving how Born’s rule follows from quantum mechanics based on the first three axioms. SQM accepts decoherence theory, and so for us the axiom is a faux axiom. However, it is a traditional quantum mechanics axiom and it remains highly useful in practical applications of quantum mechanics. And so we state it here.

In Appendix B, we give a proof of Born’s rule in a special case for pedagogical

reasons.

10. ENTANGLEMENT

11. FOUNDATIONAL ISSUES

Quantum mechanics has been riven by foundational issues ever since its formulation in 1925–1926. We have alluded to the issues in earlier sections. These issues are not at all resolved despite decades of experiment, application, theorizing, and argument (e.g., Schlosshauer et al. 2013; Norsen & Nelson 2013).

Let’s summarize and discuss the main ones now insofar as yours truly understands them.

1. Is quantum mechanics ontic or epistemic, in the jargon that is currently in vogue (e.g., Schlosshauer et al. 2013; Norsen & Nelson 2013)?

“Ontic” means quantum mechanics deals with real objects, in particular that the wave function or state is a real thing. “Epistemic” on the other hand means that quantum mechanics is just informational like a probability distribution (e.g., for a coin toss).

If quantum mechanics is epistemic, then necessarily there is a deeper theory that

is ontic. On the other hand, if quantum mechanics is ontic, there may or may not be

a deeper theory.

ACKNOWLEDGMENTS

Support for this work has been provided by the Department of Physics of the University

of Evansville.

A. GRAM-SCHMIDT PROCESS

The Gram-Schmidt process (e.g., Wikipedia: Gram-Schmidt process) is used for creating a set of orthonormal vectors out of a set of non-orthonormal vectors: i.e., orthonormalizing a set of basis vectors in shorthand. The process is actually tedious, particularly for a space of infinite dimension. Fortunately, one seldom has to do it. Useful orthonormal sets are often obtained by other means: e.g., by solution to eigenvalue problems for Hermitian operators (4). Here we describe the Gram-Schmidt process for completeness and for the insight the description gives.

Say we have a non-orthonormal set of independent vectors {|φi〉} (where the curly

brackets mean set of vectors and the i is an index for the set). The indices run 1, 2, 3, etc.

We assume that the dimension of the space is finite or a countable infinity. A space with a dimension of uncountable infinity is, in general, beyond the knowledge of yours truly. How the uncountably infinite position and wavenumber spaces are treated in quantum mechanics is described in § 5.

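We will construct the orthonormal set in order of index i. We assert the ith orthonormalized vector is given by

$$ |\hat\phi_i\rangle = \frac{|\phi_i\rangle - \sum_{j=1}^{i-1}|\hat\phi_j\rangle\langle\hat\phi_j|\phi_i\rangle}{\text{normalization factor}} , \qquad (A1) $$

where we use $|\hat\phi_i\rangle$ (with a hat) for the normalized replacement for $|\phi_i\rangle$, recall that $\| |\alpha\rangle \| = \sqrt{\langle\alpha|\alpha\rangle}$ is the norm of |α〉, and for i = 1 the summation is assigned the value zero. This orthonormalized vector is orthogonal to all processed vectors of index less than i. The normalization is proven by inspection. The proof of orthogonality is by induction.

1. We first prove that the subset $\{|\hat\phi_j\rangle\}_2$ (the subset of processed vectors up to index 2) is orthonormal. First note that

$$ |\hat\phi_1\rangle = \frac{|\phi_1\rangle}{\| |\phi_1\rangle \|} , \qquad |\hat\phi_2\rangle = \frac{|\phi_2\rangle - |\hat\phi_1\rangle\langle\hat\phi_1|\phi_2\rangle}{\text{normalization factor}} . \qquad (A2) $$

Taking the inner product of these two vectors gives

$$ \langle\hat\phi_1|\hat\phi_2\rangle = \frac{\langle\hat\phi_1|\phi_2\rangle - \langle\hat\phi_1|\hat\phi_1\rangle\langle\hat\phi_1|\phi_2\rangle}{\text{normalization factor}} = \frac{\langle\hat\phi_1|\phi_2\rangle - \langle\hat\phi_1|\phi_2\rangle}{\text{normalization factor}} = 0 . \qquad (A3) $$

Thus, the subset is orthonormal.

2. We assume the subset $\{|\hat\phi_j\rangle\}_{i-1}$ (the subset of processed vectors up to index i − 1) is an orthonormal subset.

3. Now we prove that $|\hat\phi_i\rangle$ is orthogonal to all of subset $\{|\hat\phi_j\rangle\}_{i-1}$. For k < i, we find

$$ \begin{aligned}
\langle\hat\phi_k|\hat\phi_i\rangle &= \frac{\langle\hat\phi_k|\phi_i\rangle - \sum_{j=1}^{i-1}\langle\hat\phi_k|\hat\phi_j\rangle\langle\hat\phi_j|\phi_i\rangle}{\text{normalization factor}} = \frac{\langle\hat\phi_k|\phi_i\rangle - \sum_{j=1}^{i-1}\delta_{kj}\langle\hat\phi_j|\phi_i\rangle}{\text{normalization factor}} \\
&= \frac{\langle\hat\phi_k|\phi_i\rangle - \langle\hat\phi_k|\phi_i\rangle}{\text{normalization factor}} = 0 . \qquad (A4)
\end{aligned} $$

Thus, the subset $\{|\hat\phi_j\rangle\}_i$ is an orthonormal subset.

4. The proof by induction is complete, since if $\{|\hat\phi_j\rangle\}_2$ is orthonormal (and it is), so is $\{|\hat\phi_j\rangle\}_3$ and then so is $\{|\hat\phi_j\rangle\}_4$, and so on.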
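The construction of equation (A1) can be sketched in a few lines for finite-dimensional complex vectors. This is a minimal illustration under stated assumptions: vectors are plain Python lists, the names `inner` and `gram_schmidt` and the tolerance 1e-12 are choices made here, and the input set is assumed to be linearly independent.

```python
import math

def inner(u, v):
    """Inner product <u|v>, conjugate-linear in the first argument."""
    return sum(complex(a).conjugate() * b for a, b in zip(u, v))

def gram_schmidt(vectors):
    """Orthonormalize linearly independent vectors in the order given.

    Follows equation (A1): subtract from |phi_i> its components along the
    already processed orthonormal vectors, then divide by the norm.
    """
    basis = []
    for phi in vectors:
        w = [complex(c) for c in phi]
        for e in basis:
            proj = inner(e, phi)                     # <e_j|phi_i>
            w = [wc - proj * ec for wc, ec in zip(w, e)]
        norm = math.sqrt(inner(w, w).real)           # the normalization factor
        if norm < 1e-12:
            raise ValueError("vectors are not linearly independent")
        basis.append([wc / norm for wc in w])
    return basis

if __name__ == "__main__":
    example = [[1, 1j, 0], [1, 0, 1], [0, 1, 1]]
    for e in gram_schmidt(example):
        print(e)
```

Checking that `inner` of any two output vectors gives the Kronecker delta is a direct numerical counterpart of the induction proof above.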

There is a continuum infinity of sets of orthonormal vectors in general. This is proven

just by noting that the 3-dimensional Cartesian unit vectors of 3-dimensional Euclidean

space have a continuum infinity of possible orientations. So it is not surprising that the

Gram-Schmidt process does not give a unique orthonormal set. In fact, the orthonormal set

obtained depends on the order of the states taken in the process in general. For a partial proof,

imagine that the initial set {|φi〉} contains no orthogonal pairs of vectors at all. Whichever

vector you start with becomes the first member of the orthonormalized set (aside from being

normalized if it is not so already). All the other members of the orthonormalized set are not

members of the original set since they are orthogonal to the first member. Thus, whatever

vector you start with gives a different orthonormal set containing that original set member

and no other original vectors. Each orthonormal set is unique.
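The order dependence is easy to see in the smallest possible case. The sketch below uses two non-orthogonal real vectors in the plane; the vectors `V1` and `V2` and the function name are illustrative assumptions, and the routine is restricted to real vectors for brevity. Starting from one vector or the other gives two different orthonormal sets.

```python
import math

def gram_schmidt(vectors):
    """Orthonormalize real, linearly independent vectors in the order given."""
    basis = []
    for v in vectors:
        w = list(v)
        for e in basis:
            proj = sum(ec * vc for ec, vc in zip(e, v))   # <e|v>
            w = [wc - proj * ec for wc, ec in zip(w, e)]
        norm = math.sqrt(sum(wc * wc for wc in w))
        basis.append([wc / norm for wc in w])
    return basis

V1 = [1.0, 0.0]
V2 = [1.0, 1.0]                      # not orthogonal to V1

SET_12 = gram_schmidt([V1, V2])      # process V1 first
SET_21 = gram_schmidt([V2, V1])      # process V2 first

if __name__ == "__main__":
    print("starting with V1:", SET_12)
    print("starting with V2:", SET_21)
```

Each set keeps the direction of whichever vector was processed first, and its second member is forced to be orthogonal to that direction, so neither set contains the other starting vector.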

What about a general proof of order dependence for the orthonormal set obtained? It

is a bit tedious, but for reference and pedagogical reasons, we give it in § A.1 just below.

A.1. General Proof

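Here we give a general proof of the order dependence for the orthonormal set obtained from the Gram-Schmidt process. Say that you start selecting vectors for two Gram-Schmidt processes identically up to, but not including, the ith vector. You now make a different choice for the ith vector and afterward make general choices for both processes. The two ith vectors are

$$ |\hat\phi_i\rangle = \frac{|\phi_i\rangle - \sum_{j=1}^{i-1}|\hat\phi_j\rangle\langle\hat\phi_j|\phi_i\rangle}{\text{normalization factor}} \qquad (A5) $$

$$ |\hat\phi'_i\rangle = \frac{|\phi'_i\rangle - \sum_{j=1}^{i-1}|\hat\phi_j\rangle\langle\hat\phi_j|\phi'_i\rangle}{\text{normalization factor}} \qquad (A6) $$

where we use primes to label the second process vectors and hats to mark orthonormalized vectors. Note that the subsets $\{|\hat\phi_j\rangle\}_{i-1}$ and $\{|\hat\phi'_j\rangle\}_{i-1}$ are identical, and so we suppress the primes on the latter. Since $|\phi_i\rangle$ and $|\phi'_i\rangle$ are independent by hypothesis (i.e., not aligned), $|\hat\phi_i\rangle$ and $|\hat\phi'_i\rangle$ must be independent (i.e., not aligned). Let us take the inner product of $|\hat\phi_i\rangle$ and $|\hat\phi'_i\rangle$, but suppressing the denominators for clarity:

$$ \begin{aligned}
\langle\hat\phi_i|\hat\phi'_i\rangle &\propto \langle\phi_i|\phi'_i\rangle - 2\sum_{j=1}^{i-1}\langle\phi_i|\hat\phi_j\rangle\langle\hat\phi_j|\phi'_i\rangle + \sum_{j=1}^{i-1}\sum_{k=1}^{i-1}\langle\phi_i|\hat\phi_j\rangle\langle\hat\phi_j|\hat\phi_k\rangle\langle\hat\phi_k|\phi'_i\rangle \\
&\propto \langle\phi_i|\phi'_i\rangle - 2\sum_{j=1}^{i-1}\langle\phi_i|\hat\phi_j\rangle\langle\hat\phi_j|\phi'_i\rangle + \sum_{j=1}^{i-1}\sum_{k=1}^{i-1}\langle\phi_i|\hat\phi_j\rangle\,\delta_{jk}\,\langle\hat\phi_k|\phi'_i\rangle \\
&\propto \langle\phi_i|\phi'_i\rangle - 2\sum_{j=1}^{i-1}\langle\phi_i|\hat\phi_j\rangle\langle\hat\phi_j|\phi'_i\rangle + \sum_{j=1}^{i-1}\langle\phi_i|\hat\phi_j\rangle\langle\hat\phi_j|\phi'_i\rangle \\
&\propto \langle\phi_i|\phi'_i\rangle - \sum_{j=1}^{i-1}\langle\phi_i|\hat\phi_j\rangle\langle\hat\phi_j|\phi'_i\rangle \\
&\propto \sum_{j=i}^{N}\langle\phi_i|\hat\phi_j\rangle\langle\hat\phi_j|\phi'_i\rangle , \qquad (A7)
\end{aligned} $$

where N is the total number of independent vectors and for the last step we have used the fact that $\{|\hat\phi_j\rangle\}$ is a complete orthonormal set. By the hypothesis that $|\phi_i\rangle$ and $|\phi'_i\rangle$ are independent, we must have i < N.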

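In special cases, either or both of $|\phi_i\rangle$ and $|\phi'_i\rangle$ can be orthogonal to all vectors in the orthonormal subset $\{|\hat\phi_j\rangle\}_{i-1}$ and be non-orthogonal relative to each other. In these cases, the members of the set $\{|\hat\phi_j\rangle\}$ from index i on form a complete set as far as either or both of $|\phi_i\rangle$ and $|\phi'_i\rangle$ are concerned. Thus, in our special cases, we can have either or both of the expansions

$$ |\phi_i\rangle = \sum_{j=i}^{N}|\hat\phi_j\rangle\langle\hat\phi_j|\phi_i\rangle \qquad (A8) $$

$$ |\phi'_i\rangle = \sum_{j=i}^{N}|\hat\phi_j\rangle\langle\hat\phi_j|\phi'_i\rangle . \qquad (A9) $$

And thus, in our special cases,

$$ \sum_{j=i}^{N}\langle\phi_i|\hat\phi_j\rangle\langle\hat\phi_j|\phi'_i\rangle = \langle\phi_i|\phi'_i\rangle \neq 0 . \qquad (A10) $$

And finally thus, in our special cases,

$$ \langle\hat\phi_i|\hat\phi'_i\rangle \neq 0 , \qquad (A11) $$

or in other words $|\hat\phi_i\rangle$ and $|\hat\phi'_i\rangle$ are not orthogonal.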

Now if the two Gram-Schmidt processes yield identical orthonormal sets and $|\hat\phi_i\rangle$ and $|\hat\phi'_i\rangle$ are independent, they must be orthogonal. But we have just shown that there are special cases where they are not orthogonal. So after mucho labor, we have proven that in such cases the orthonormal set obtained depends on the order of selection, no matter how far one goes in two Gram-Schmidt processes selecting the same states for orthonormalization before branching into different selection paths.

B. ENVARIANCE AND BORN’S RULE

Born’s rule can be derived in decoherence theory using envariance (e.g., Zurek 2005). As stated in § 9, decoherence theory is widely favored, but not universally accepted. The derivation of Born’s rule is not, however, dependent on the full decoherence theory and its validity must be judged separately. Yours truly is not able to give an authoritative judgment and will just rely on the authority of Zurek (2005). However, yours truly thinks it important to give a special case derivation of Born’s rule (for an expansion in two eigenstates with equal amplitudes) to give the reader some insight into an important element of decoherence theory. The derivation follows those of Zurek (2005, 2009). It is somewhat belabored for pedagogical reasons.

Envariance stands for environment induced invariance. Envariance is not a new axiom. It is a special symmetry that leads to Born’s rule given the first 4 axioms of quantum mechanics (§ 1).

Consider the following normalized state:

$$ |\Psi\rangle = \frac{1}{\sqrt{2}}\left(|a\rangle_S|b\rangle_E + |c\rangle_S|d\rangle_E\right) , \qquad (B1) $$

where S stands for system, E stands for the environment the system is embedded in, |a〉S and |c〉S are distinct orthonormal eigenstates of the system for a system observable QS, |b〉E and |d〉E are distinct orthonormal eigenstates of the environment for some environment observable QE, and any phase factors have been absorbed into the system or environment eigenstates. For pedagogical reasons, we will verify that the overall state is normalized. Behold:

$$ \begin{aligned}
\langle\Psi|\Psi\rangle &= \frac{1}{2}\left(\langle a|a\rangle_S\langle b|b\rangle_E + \langle a|c\rangle_S\langle b|d\rangle_E + \langle c|a\rangle_S\langle d|b\rangle_E + \langle c|c\rangle_S\langle d|d\rangle_E\right) \\
&= \frac{1}{2}\left(1\times 1 + 0 + 0 + 1\times 1\right) = 1 , \qquad (B2)
\end{aligned} $$

where we have used the orthonormality of the eigenstates.

Now |a〉S|b〉E and |c〉S|d〉E are both eigenstates of the joint system-environment system. The overall state is a superposition of these joint eigenstates with equal amplitudes (i.e., equal coefficients). By the wave function collapse axiom or alternatively by decoherence theory, a decoherence event (which could be either an actual observer measurement or a natural decoherence event) for the system observable collapses the system into either |a〉S (with probability pa) or |c〉S (with probability pc) with no superposition. Similarly, a decoherence event for the environment observable collapses the environment into either |b〉E (with probability pb) or |d〉E (with probability pd) with no superposition.

Logically, pa + pc = 1 and pb + pd = 1. It seems reasonable to believe that pa = pc = 1/2 and pb = pd = 1/2. But that is precisely what we are going to prove. First, we assert that pa = pb and pc = pd. This product-state probability rule seems reasonable since the equalities certainly hold (with each side equal to 1 or 0) when the overall system is in either |a〉S|b〉E or |c〉S|d〉E. We leave the full justification to Zurek (2005, 2009).

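Second, we now posit a unitary swap transformation on the system:

$$ U_S = |c\rangle_S\langle a|_S + |a\rangle_S\langle c|_S . \qquad (B3) $$

We can prove this transformation is unitary. Note

$$ \begin{aligned}
U_SU_S &= |c\rangle_S\langle a|c\rangle_S\langle a|_S + |c\rangle_S\langle a|a\rangle_S\langle c|_S + |a\rangle_S\langle c|c\rangle_S\langle a|_S + |a\rangle_S\langle c|a\rangle_S\langle c|_S \\
&= |c\rangle_S\,\delta_{ac}\,\langle a|_S + |c\rangle_S\langle c|_S + |a\rangle_S\langle a|_S + |a\rangle_S\,\delta_{ac}\,\langle c|_S \\
&= |c\rangle_S\langle c|_S + |a\rangle_S\langle a|_S = 1_{\rm op} , \qquad (B4)
\end{aligned} $$

where we have used orthonormality for distinct states (so that δac = 0) and where the penultimate expression is clearly the unit operator 1op. Clearly $U_S = U_S^{-1}$: i.e., US is its own inverse. Now consider general states |α〉, |β〉, |i〉 and |j〉. We form the operator

$$ Q = |i\rangle\langle j| . \qquad (B5) $$

Now

$$ \langle\alpha|Q^\dagger|\beta\rangle = \langle\beta|Q|\alpha\rangle^* = (\langle\beta|i\rangle\langle j|\alpha\rangle)^* = \langle\beta|i\rangle^*\langle j|\alpha\rangle^* = \langle i|\beta\rangle\langle\alpha|j\rangle = \langle\alpha|j\rangle\langle i|\beta\rangle , \qquad (B6) $$

and thus

$$ Q^\dagger = |j\rangle\langle i| . \qquad (B7) $$

Now since Hermitian conjugation distributes over addition (4), $U_S^\dagger = U_S = U_S^{-1}$, and thus $U_S^\dagger = U_S^{-1}$, which means US is unitary. Since swap US is unitary, one can always imagine that there is some physical process that could actually bring the swap about.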

Third, applying US to |Ψ〉 gives

$$ U_S|\Psi\rangle = \frac{1}{\sqrt{2}}\left(|c\rangle_S|b\rangle_E + |a\rangle_S|d\rangle_E\right) . \qquad (B8) $$

This state is clearly different from |Ψ〉. A joint system-environment collapse of |Ψ〉 gives |a〉S|b〉E or |c〉S|d〉E, whereas a joint system-environment collapse of US|Ψ〉 gives |c〉S|b〉E or |a〉S|d〉E.

Fourth, we will use primes to indicate the probabilities of collapses to states after swaps. Now making use of the product-state probability rule, we find that

$$ p'_c = p_b = p_a \quad\text{and}\quad p'_a = p_d = p_c . \qquad (B9) $$

Fifth, we posit a unitary swap transformation on the environment:

$$ U_E = |d\rangle_E\langle b|_E + |b\rangle_E\langle d|_E . \qquad (B10) $$

The proof that swap UE is unitary is the same as for US, mutatis mutandis.

Sixth, applying UE to US|Ψ〉 gives

$$ U_EU_S|\Psi\rangle = \frac{1}{\sqrt{2}}\left(|a\rangle_S|b\rangle_E + |c\rangle_S|d\rangle_E\right) = |\Psi\rangle , \qquad (B11) $$

which is the original state. Now making use of the product-state probability rule again, we find that

$$ p'_b = p'_a = p_d = p_c \quad\text{and}\quad p'_d = p'_c = p_b = p_a . \qquad (B12) $$

But since the state is the original state, p′b = pb = pa and p′d = pd = pc. Now we have pa = pc. Making use of the product-state probability rule and the fact that the sum of probabilities must be 1, we finally obtain

$$ p_a = p_c = \frac{1}{2} \quad\text{and}\quad p_b = p_d = \frac{1}{2} . \qquad (B13) $$

This last result is just what Born’s rule predicts: the probability of collapse is equal to the squared magnitude of the amplitude coefficients, both of which are 1/√2. Zurek (2005) generalizes the proof for a general number of terms in the expansion and general amplitudes.
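The bookkeeping of the swaps can be mirrored in a few lines. In this minimal sketch the state of equation (B1) is stored as a 2 × 2 table of coefficients; the table layout and the function names are illustrative assumptions. The sketch only checks the algebra of equations (B4), (B8), and (B11) and does not by itself establish the probability argument.

```python
import math

# Coefficients PSI[s][e]: s indexes system states (0 = a, 1 = c),
# e indexes environment states (0 = b, 1 = d).
R = 1.0 / math.sqrt(2.0)
PSI = [[R, 0.0], [0.0, R]]           # (|a>|b> + |c>|d>)/sqrt(2), eq. (B1)

def swap_system(psi):
    """Action of U_S, eq. (B3): exchange |a>_S and |c>_S (the rows)."""
    return [psi[1][:], psi[0][:]]

def swap_environment(psi):
    """Action of U_E, eq. (B10): exchange |b>_E and |d>_E (the columns)."""
    return [[row[1], row[0]] for row in psi]

def norm_squared(psi):
    """<psi|psi> for a table of coefficients."""
    return sum(abs(c) ** 2 for row in psi for c in row)

AFTER_US = swap_system(PSI)                  # eq. (B8)
AFTER_UE_US = swap_environment(AFTER_US)     # eq. (B11)

if __name__ == "__main__":
    print("U_S changes the state:      ", AFTER_US != PSI)
    print("U_E U_S restores the state: ", AFTER_UE_US == PSI)
    print("U_S U_S is the identity:    ", swap_system(AFTER_US) == PSI)
```

The system swap alone pairs |c〉S with |b〉E and |a〉S with |d〉E, and the environment swap undoes that relabeling, which is the symmetry the argument exploits.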

It is certainly remarkable that Born’s rule becomes a result of quantum mechanics in decoherence theory instead of an axiom as in conventional quantum mechanics. It is hard to imagine that a theory of true wave function collapse that included Born’s rule as a result or an axiom could exist. Nature would have two mechanisms for producing Born’s rule, which seems to be an unlikely coincidence: it is extremely unparsimonious. Unless, of course, Born’s rule is some common emergent property itself.

C. DETERMINISM, PROBABILISM, AND FREE WILL

In this appendix, I will take up the subjects of determinism, probabilism, and free will—

all without much awareness of the enormous literature on these subjects. My opinions and

conjectures may not be worth much, but they are what I have at the present.

In pure decoherence theory, there is no wave function collapse in a system (e.g., Paz

& Zurek 2001; Zurek 2003). Einselection extremely rapidly damps out all but the pointer

states. The pointer states then evolve effectively independently of each other: they are the

initial conditions for the rest of the evolution of the system and the universe insofar as it

is affected by the system. The upshot is that the universe is constantly bifurcating into

independent or uncorrelated paths. One gets a quasi-infinity of parallel worlds: the many

worlds interpretation of quantum mechanics (e.g., Zurek 2003, p. 5) is actually correct.

Furthermore the evolution of the universe of parallel worlds is completely deterministic

just as quantum mechanics without the wave function collapse axiom predicts. This is as

it should be, since decoherence theory is an emergent theory based on quantum mechanics

without the wave function collapse axiom. In this case, it seems if one knew exactly all the

physical conditions of the universe at any one instant, the past and future would be totally

predictable just as Laplace thought should be the case for classical mechanics (e.g., Wikipedia:

Laplace’s demon). The universe obeys determinism.

However, there are many who cannot believe in a quasi-infinity of parallel worlds of which only one can be detected: the one we are in. It seems to them (maybe to us all) extremely

unparsimonious. So some maintain that true wave function collapse must still happen and

that only one eigenstate or, in the decoherence theory view, one pointer state in any system

eventually leads to the future. So there is only one macroscopic world. Given that decoherence

theory is valid—which is the assumption of SQM—true wave function collapse would follow,

compete, or precede and preclude decoherence. The simplest wave function collapse theory

would be one where the Born rule is fundamental: quantum mechanics is then intrinsically

probabilistic and the universe obeys probabilism. But it seems extremely unparsimonious to

have a true wave function collapse theory that aside from the true wave function collapse

mimics and/or complements decoherence theory (assuming its validity as we do). We are

faced with battling unparsimonies—Occam’s razor against itself.

Can the fundamental probabilism of a true wave function collapse theory affect the

macroscopic world? Well obviously. Many experimental results are predicted by Born’s rule,

and so from these at least a microscopic probabilism is amplified to the macroscopic world.

In fact, there must be many amplifications in nature not just in human experiments. It

may be that some of those occur in our brains and lead to intrinsically random thoughts—

Heisenberg, Martin Heisenberg, seems to think so (Heisenberg 2009). The butterfly effect

(e.g., Wikipedia: Butterfly effect) ensures that microscopic intrinsic randomness leaks out

everywhere in the macroscopic universe particularly in biospheres and probably affects even

very large scale evolutions like those of galaxies.

But the trouble with the wave function collapse hypothesis is that no adequate consensus

theory of it exists. As discussed in § 9, the point where one breaks off a quantum mechanical

calculation and uses Born’s rule to calculate the probabilities of experimental outcomes

seems to have been chosen empirically (e.g., Greene 2004, p. 119) or in more recent times

from decoherence theory where no wave function collapse happens and one is just choosing

a good time when the pointer states have come to dominate a system. However, one can say

that if instantaneous wave function collapse is required, it does not seem to be precluded

by quantum mechanics. The non-locality of Bell state interactions of entangled states allows

that. Of course, it seems to be unspeakable as to what frame instantaneous collapse happens

in. Greenstein & Zajonc (1997, p. 184) blandly inform us in a footnote that the question of frame has been shown not to be a problem. I would guess that the collapse is instantaneous according to cosmic time (e.g., Wikipedia: Cosmic time): the time scale of the expansion of the universe. What other time would be appropriate for Bell states that are spread over cosmological distances? We are mixing quantum mechanics and general relativity, but that

has been done before—e.g., for Hawking radiation. If two formally incompatible theories are

both on the right path, then a mixed approach may be valid.

In any case, the supporters of parallel worlds (the determinists) can retort to the sup-

porters of wave function collapse (the probabilists) that parallel worlds are not a theory,

they are a prediction of quantum mechanics—a highly verified theory—and the probabilists

have to invoke an ad hoc hypothesis—but a venerable one—to explain away parallel worlds.

I am neutral in this debate: determinism versus probabilism. I would like to see it

resolved, but I have no favorite. But it is interesting that there may be no way in principle

to tell between the two positions. If a testable theory of wave function collapse arises, that might tell. But otherwise we might be left in the lurch. If there is in principle no way to tell,

then the world as we know it empirically is indifferent to the two choices. And even if there

is a way, we could imagine a universe in which there is not. The upshot is that the decision

between determinism versus probabilism may have no implications for some human concerns

like ethics.

Either way what about free will? With determinism, every human decision is predes-

tined. With probabilism, every human decision is a mixture of eternal causes and random

events since forever. I will offer my opinion. A conscious being based on many factors makes

decisions. The result may be purely determined or the mixture, but as we suggested above

that may make no difference to the decision at all. Either way, the conscious being doesn’t

know the outcome of the decision until it is made, or maybe a bit later since there is evidence that humans are not aware of having made a decision until a bit later. Making a decision whose outcome is often unknown to everyone but omniscient beings (and not even to them in a probabilistic world) is free will as I call it, while acknowledging that others may have different views.

The process of conscious beings making decisions is very complex and creative. The

complexity is probably essential to the creativity. In particular random factors going into

the process from inside or outside probably vastly aid creativity. That creativity allows

scientific and artistic leaps. The universe would be very different without conscious-being

decision making (i.e., free will)—oh the galaxies would look much the same—but those parts

particularly interesting to us would be very different and very impoverished. The human

world would be very different.

But with “my” free will, what of ethics, or, as Vincent Price (1911-1993) (Wikipedia:

Vincent Price) once said, “Is there no morality left?” (Wikipedia: The Comedy of Terrors

(1964)). Well, consciousness likes consciousness, or else it would likely die out quickly or

never develop. Consciousness liking consciousness is an emergent principle (an iron law) of

consciousness, which is itself an emergent entity. I posit it as an axiom that the

consciousness-liking-consciousness principle is the seed from which all ethics flow. Specific ethical

systems and specific ethical decisions are extremely contingent on all kinds of other things, but

ultimately must include at their base, explicitly or implicitly, the consciousness-liking-consciousness

principle. But at this moment I am writing oracularly and not about to discuss whether “my”

axiom is falsifiable.

Now what is consciousness and does God exist?—well some more thought is needed.

REFERENCES

Arfken, G. 1970, Mathematical Methods for Physicists (New York: Academic Press)

Bevington, P. R. 1969, Data Reduction and Error Analysis for the Physical Sciences (New

York: McGraw-Hill Book Company)

Cohen-Tannoudji, C., Diu, B., & Laloe, F. 1977, Quantum Mechanics (New York: John Wiley

& Sons)

Erhart, J., Sponar, S., Sulyok, G., Badurek, G., Ozawa, M., & Hasegawa, Y. 2012, Nature

Physics, doi:10.1038/nphys2194

Ghirardi, G. C., Rimini, A., & Weber, T. 1986, Physical Review D, 34, 470

Greene, B. 2004, The Fabric of the Cosmos (New York: Vintage Books), (Gre)

Greenstein, G., & Zajonc, A. G. 1997, The Quantum Challenge: Modern Research on the

Foundations of Quantum Mechanics (Sudbury, Massachusetts: Jones and Bartlett

Publishers)

Griffiths, D. J. 2005, Introduction to Quantum Mechanics (Upper Saddle River, New Jersey:

Pearson/Prentice Hall)

Heisenberg, M. 2009, Nature, 459, 164

Laughlin, R. B. 2005, A Different Universe: Reinventing Physics from the Bottom Down

(New York: Basic Books)

Norsen, T., & Nelson, S. 2013, arXiv:1306.4646

Ozawa, M. 2012, arXiv:1201.5334v1

Paz, J. P., & Zurek, W. H. 2001, in Coherent Atomic Matter Waves: Session LXXII of

the Les Houches Ecole d’Ete de Physique Theorique, ed. R. Kaiser, C. Westbrook,

& F. David (Berlin: Springer), 533, arXiv:quant-ph/0010011v1

Pusey, M. F., Barrett, J., & Rudolph, T. 2011, arXiv:1111.3328v1

Schlosshauer, M., Kofler, J., & Zeilinger, A. 2013, arXiv:1301.1069

Wikipedia, http://en.wikipedia.org/wiki/Main_Page

Zurek, W. H. 2003, arXiv:quant-ph/0306072v1

Zurek, W. H. 2005, Phys. Rev. A 71, 052105, arXiv:quant-ph/0405161v2

Zurek, W. H. 2009, Nature Physics, 5, 181, arXiv:0903.5082v1

This preprint was prepared with the AAS LaTeX macros v5.2.