The symmetrical foundation of Measure, Probability and Quantum theories

John Skilling,1 Kevin H. Knuth,2∗

1 Maximum Entropy Data Consultants Ltd, Kenmare, Ireland

2 University at Albany (SUNY), Albany NY 12222, USA

∗To whom correspondence should be addressed; E-mail: [email protected].

Abstract

Quantification starts with sum and product rules that express combination and partition. These rules rest on elementary symmetries that have wide applicability, which explains why arithmetical adding up and splitting into proportions are ubiquitous. Specifically, measure theory formalises addition, and probability theory formalises inference in terms of proportions.

Quantum theory rests on the same simple symmetries, but is formalised in two dimensions, not just one, in order to track an object through its binary interactions with other objects. The symmetries still require sum and product rules (here known as the Feynman rules), but they apply to complex numbers instead of real scalars, with observable probabilities being modulus-squared (known as the Born rule). The standard quantum formalism follows. There is no mystery or weirdness, just ordinary probabilistic inference.

Contents:

1. Introduction

2. Measures

3. Probability

4. Quantum theory

4.1. Feynman rules

4.2. Born rule

4.3. Probability assignments

4.4. Hilbert space

4.5. Measurement

5. Commentary

1 Introduction

“Quantum mechanics will cease to look puzzling only when we will be able to derive the for-

malism of the theory from a set of simple assertions about the world.” — Carlo Rovelli [1]

Our job in science is to make sense of our observations. This is very general, and we seek

corresponding clarity and simplicity. General theory must apply to all cases, and our strat-

egy of eliminative induction [2] is to exclude theories which give “wrong” results in particular

cases, until there remains just a single candidate theory which we can then recommend with

confidence.

Symmetries are particularly valuable tools for eliminating “wrong” behaviour [3, 4, 5, 6, 7]. If A is supposed to equal B, then all the other possibilities with A ≠ B are immediately excluded. Symmetries are particularly powerful when applied to simple systems because “wrong” behaviour is seen most clearly there, where judgement is least subjective. Accordingly, we proceed by considering simple thought-experiments, whose behaviour should be uncontroversial.

In classical physics, objects can be detected passively, unchanged by interaction with a

probe. That’s a valid limit for when the object dominates the probe, and its quantification leads

to standard scalar measure theory (“stuff adds up”). More fundamentally, though, a probe is

a partner object, which happens to be built so that its perturbation may become recorded as a

measurement. But, if our object can perturb a partner object, then by symmetry the partner

object can also perturb our object. We could assign either role to either. Measurement involves

the interaction between at least two entities, and not just the presence of one. Objects are

detected actively.

This insight is the source of “quantum-ness”. It leads to the Feynman rules [8], which rep-

resent elementary interactions by pairs of numbers which intertwine according to the rules of

complex arithmetic, and to the Born rule [9] which relates those complex numbers to observa-

tion. With those rules in place, Hilbert space can be constructed [10] and the rest of quantum

theory follows.

In this paper, we use the same thought-experiments and the same symmetries for both clas-

sical and quantum situations. This gives a straightforward unified derivation of measure theory,

probability theory, and quantum theory. Two formal operations are uncovered – a sum rule and

a product rule. Measures formalise quantitation (kilograms, coulombs and so on). Probability

formalises our inferences. Quantum theory tracks objects through their interactions. It’s simple.

2 Measures

“I have tried, with little success, to get some of my friends to understand my amazement that

the abstraction of integers for counting is both possible and useful. . . . To me, this is one of

the strongest examples of the unreasonable effectiveness of mathematics. Indeed, I find it both

strange and inexplicable.” — Richard Hamming [11].

Starting at the beginning with minimalist foundation, we might, perhaps, think of a shopping

basket of fruit — apples, bananas and so on. Consider the operator ⊕ which combines disjoint

objects A and B into a composite object A⊕ B. We list some basic properties of ⊕ which are

commonly applicable.

Closure A combination of objects is an object in its own right.

(A⊕B) is an object, where A and B are disjoint (1)

Commutative The order of objects does not matter.

A⊕B = B ⊕ A (2)

Associative The order of combination does not matter.

(A⊕B)⊕ C = A⊕ (B ⊕ C) (3)

Commutativity and associativity together mean that objects can be arbitrarily shuffled.

Limitless Equivalent, but disjoint, objects can be combined without restriction.

A, 2A, 3A, . . . . . . are all different, where (n+1)A = (nA)⊕ A (4)

Here individual A’s are equivalent but disjoint objects A1, A2, A3, . . . . We intend no

infinite limit here. All we claim is the freedom to include more objects, limited only by

our resources and patience, but not somehow limited by an intellectual barrier. If there is

a cardinality restriction, we will never reach it. It would have no practical consequences

for us, so we are free to ignore it.

If these properties (closure, commutativity, associativity, limitless) are all accepted, then

A⊕B is represented by the component-wise sum a + b (5)

where a = (a1, a2, . . . , an)′ and b = (b1, b2, . . . , bn)′ are n-tuple representations of A and B.

This is the general sum rule. In accordance with practical feasibility, the rule is demonstrated by construction [7, appendix A]: new objects are incorporated successively, and a new dimension is introduced whenever a new object is not commensurable with the existing set. Two objects X and Y are commensurable when there exist non-zero integers m and n such that m of X can be deemed equivalent to n of Y.
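As a minimal sketch of the sum rule (5), assuming a hypothetical two-kind basket in which apples and bananas are mutually incommensurable, the representation and its symmetries can be checked in Python. The names `KINDS`, `represent` and `combine` are illustrative and do not come from the paper.

```python
from fractions import Fraction

# Hypothetical basket representation: one coordinate per incommensurable kind.
KINDS = ("apple", "banana")  # illustrative names, not from the paper


def represent(basket):
    """Map a basket (dict kind -> count) to its n-tuple of quantities."""
    return tuple(Fraction(basket.get(kind, 0)) for kind in KINDS)


def combine(a, b):
    """Represent the combination of A and B by the component-wise sum a + b."""
    return tuple(x + y for x, y in zip(a, b))


a = represent({"apple": 2})
b = represent({"banana": 3})
c = represent({"apple": 1, "banana": 1})

# Commutativity (2) and associativity (3) hold automatically for "+".
assert combine(a, b) == combine(b, a)
assert combine(combine(a, b), c) == combine(a, combine(b, c))

# Limitless (4): repeated combination with A gives distinct multiples nA.
multiples = [a]
for _ in range(4):
    multiples.append(combine(multiples[-1], a))
assert len(set(multiples)) == len(multiples)
```

Exact rationals are used so that equivalence is tested as equality of representation, in line with the text.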

Like all basic assignments, commensurability is decided by practical judgement. Thus π

(known to be irrational) and its accurate approximation 3.1415926535897932 would usually be

deemed equivalent for practical purposes, allowing circumferential lengths to be commensurate

with diameters. The mathematical distinction between rational and real numbers has zero prac-

tical impact, so can always be ignored for rational inference in practical science. Equivalence

requires equality of representation and non-equivalence requires inequality. The quoted sum

rule preserves all such correspondences.

The representation is only forced “up to isomorphism”, meaning that any 1:1 re-grading is

logically equivalent and has exactly the same analytical power. Conversely, a representation that

was not in 1:1 correspondence would break some of the equivalences and/or non-equivalences,

so would be rejected. The convenience of “+” is so great that we adopt it as a near-universal

convention. For one thing, its compact connected topology fits naturally with our intuitive

notion of locality: small changes have small effect.

Within the n-tuple formulation, the only freedom consistent with preservation of “+” and

the original commensurabilities is linear shear x′ = Tx by an arbitrary non-singular (affine)

n×n transformation matrix T.

Dimension If our objects are fully commensurable, so that there is only one relevant prop-

erty (as when a shopping basket holds apples but nothing else), the dimension shrinks to

1 and the sum rule takes scalar form.

A⊕B is represented by a+ b (6)

Such a quantity is known in physics as an extensive variable. The only freedom consis-

tent with preservation of “+” and the original commensurabilities is linear rescaling to

different units. The quantity may or may not be signed. Electric charge, for example,

is signed, but mass is not. A single-signed quantity, positive by convention, is known in

mathematics as a measure.

Familiarity makes addition seem obvious, indeed trivial. Here, though, we see why additiv-

ity is so ubiquitous [12]. It’s required by elementary symmetries that are commonly upheld. To

see whether additivity is required, all we have to do is check the boxes.

Closure? Commutative? Associative? Limitless? Dimension?

If they are satisfied, then the sum rule must apply, in the appropriate dimension. We do not need

bespoke derivations for each application. Neither do we need sophisticated formalism for such

simple requirements.

3 Probability

“Probability is expectation founded upon partial knowledge.” — George Boole [13].

The operation inverse to combination is partition, in which a composite object is progressively decomposed, if necessary all the way down to a notional substrate of a-priori-equivalent microstates.

O                    Object O
A B                  Partition O = A ⊕ B
C D B                Partition A = C ⊕ D
C E                  Combination D ⊕ B = E
O                    Combination C ⊕ E = O
1 2 3 4 5 6 7 8 9    Notional substrate

(7)

Quantification of the divisions obeys the symmetries of measure and hence we have the quan-

tification sum rule, for scalars when the dimension is 1.

q(X ⊕ Y ) = q(X) + q(Y ) (8)

We are also interested in source-to-destination steps u = (dest; source) and their quantifica-

tion p(u). Steps can be chained into composites by a “from” operator ◦, so that u ◦ v is also a

step, when u starts where v ends. This is formalised as closure of ◦.

Closure of ◦ u ◦ v is also a step, when u starts where v ends. (9)

Changing the provenance of a combination does not alter whatever commensurabilities de-

fined its components, a requirement formalised as right-distributivity.

Right-distributive ◦ (u⊕ v) ◦w = (u ◦w)⊕ (v ◦w) (10)

Right-distributivity implies that the quantification of a step is linear in the destination quantity,

with the only remaining freedom being a scale factor which may depend on the source:

p(X;Z) = q(X)f(Z) (11)

where f is some as-yet-unknown function. Likewise for other arguments:

p(X;Y ) = q(X)f(Y ) , p(Y ;Z) = q(Y )f(Z) . (12)

The representation of u ◦ v is to be constructed from the representations of u and v, so

p(u ◦ v) = φ(p(u), p(v)) (13)

for some function φ representing ◦. But u = (X;Y ) and v = (Y ;Z) chain to become the step

u ◦ v = (X;Z) with source Z and destination X so, using (11)–(13),

q(X)f(Z) = φ(q(X)f(Y), q(Y)f(Z)) (14)

The left side being linear in q(X) and in f(Z), the right side must be also. Hence φ is bilinear

φ(x, y) = γxy with some coefficient γ.

On setting γ = 1 by choice of scale factor, equation (13) reads

p(u ◦ v) = γp(u) p(v) with γ = 1 (15)

which is the product rule. Multiplication builds on addition, and (8) and (15) are the consistent rules of combination and partition. As a bonus, substituting φ(x, y) = γxy into (14) implies that f(Y) q(Y) = 1/γ, so (with γ = 1) f is the reciprocal of q and p is just proportion.

$$ p(\text{dest};\text{source}) = \frac{q(\text{dest})}{q(\text{source})} \tag{16} $$
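To make the proportion rule concrete, here is a small sketch in Python that counts microstates on a notional substrate of nine, as in diagram (7). The particular choice of which microstates belong to A and C is an assumption made for illustration only.

```python
from fractions import Fraction

# Notional substrate of 9 equivalent microstates, as in diagram (7).
# The particular split below is an illustrative assumption.
O = frozenset(range(1, 10))
A = frozenset({1, 2, 3, 4, 5})
B = O - A
C = frozenset({1, 2})


def q(x):
    """Scalar measure: count of microstates, additive over disjoint sets (8)."""
    return Fraction(len(x))


def p(dest, source):
    """Proportion (16): p(dest; source) = q(dest) / q(source)."""
    return q(dest) / q(source)


# Sum rule (8) for the disjoint partition of O into A and B.
assert q(A | B) == q(A) + q(B)

# Product rule (15) for the chain from O to A to C.
assert p(C, O) == p(C, A) * p(A, O)
```

With counting as the measure, the product rule reduces to cancellation of the intermediate quantity q(A), which is the content of f being the reciprocal of q.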

In this derivation, associativity p((u ◦ v) ◦w) = p(u ◦ (v ◦w)) is an emergent property of

the representation. However, associativity

Associativity of ◦ (u ◦ v) ◦w = u ◦ (v ◦w) (17)

of chained steps themselves is an intuitive requirement that could be taken as axiomatic [7].

Indeed, there is a nexus of interrelated “obvious” properties, where the selection of which is

axiomatic and which is emergent is to some extent a matter of choice.

Partitioning and combination can be filled out to a Boolean lattice defined by join ∨ (logical OR), which upgrades ⊕ to non-disjoint components, and meet ∧ (logical AND), which identifies overlaps [14, 15]. In this context, we recognise p(dest;source) as the transition probability from source to destination, traditionally written as Prob(dest | source). Symmetry of the product rule then gives Bayes’ theorem

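$$ \underbrace{\mathrm{Prob}(\xi)}_{\text{Prior}}\;\underbrace{\mathrm{Prob}(\text{data}\mid\xi)}_{\text{Likelihood}} = \underbrace{\mathrm{Prob}(\text{data})}_{\text{Evidence}}\;\underbrace{\mathrm{Prob}(\xi\mid\text{data})}_{\text{Posterior}} \tag{18} $$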

for the inference of parameters ξ from data, which is acknowledged as the foundation of

rational inference. Just as summation is generally taken as axiomatic for measures, so can

Laplace’s rule (successes/trials as in (16)) be taken as the definition of probability [16]. In fact,

both are inevitable. This is why probability is ubiquitous in inference, while any competing

framework is denied. Its calculus is required by elementary symmetries that apply to logical

operations on propositions.
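As an illustration of Bayes’ theorem (18) as a statement about proportions, the following sketch evaluates all four factors from a table of joint counts. The counts and the labels `xi1`, `xi2`, `d1`, `d2` are made up for the example.

```python
from fractions import Fraction

# Hypothetical joint counts over (xi, data); the numbers are made up.
counts = {
    ("xi1", "d1"): 3, ("xi1", "d2"): 1,
    ("xi2", "d1"): 2, ("xi2", "d2"): 4,
}


def everything(key):
    return True


def is_xi1(key):
    return key[0] == "xi1"


def is_d1(key):
    return key[1] == "d1"


def prob(pred, given=everything):
    """Prob(pred | given) as a proportion of counts, following (16)."""
    source = sum(n for key, n in counts.items() if given(key))
    dest = sum(n for key, n in counts.items() if given(key) and pred(key))
    return Fraction(dest, source)


prior = prob(is_xi1)
likelihood = prob(is_d1, given=is_xi1)
evidence = prob(is_d1)
posterior = prob(is_xi1, given=is_d1)

# Both sides equal the joint proportion Prob(xi1 AND d1).
assert prior * likelihood == evidence * posterior
```

Both products chain from the whole table down to the same cell, which is why the symmetry of the product rule yields the theorem.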

Awkwardly, the traditional but idiosyncratic solidus notation suggests that there’s something

special about Prob(· | ·). There isn’t. Probability just obeys the same simple laws of proportion

that apply widely elsewhere. It needs no bespoke derivation, which is not to impugn earlier

work [3, 17, 18, 19].

4 Quantum Theory

“This theoretical failure to find a plausible alternative to quantum mechanics, even more than

the precise experimental verification of linearity, suggests to me that quantum mechanics is the

way it is because any small change in quantum mechanics would lead to logical absurdities.”

— Steven Weinberg [20]

Production of objects consists of generating them as reproducibly as we can and supplying

them when available. We can then make the objects interact with our probes. Observation

happens when we interrogate a probe, at which optional later stage we extract information.

Our job is to make sense of our observations of objects. But observation requires interaction

with probes, so even when we know that an object exists (which is one real constraint), our

knowledge of it remains incomplete because of our ignorance of the probe. Probes themselves

are incompletely-known objects, possibly of different type, about which completion of knowl-

edge could only come from observation, and so on and so on in indefinite regress. Accordingly,

we adopt the pair postulate, that our knowledge of an object is mediated through interactions

that involve two parameters and not just one. Connection with scalar observation is then medi-

ated probabilistically through scalar quantities q(ψ) arising from the underlying pairs ψ. Detail

is lost, which we may try to recover by inference.

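$$ \text{Pair } \psi \;\underset{\text{Inference}}{\overset{\text{Function}}{\rightleftharpoons}}\; \text{Scalar quantity } q(\psi) \;\underset{\text{Inference}}{\overset{\text{Probability}}{\rightleftharpoons}}\; \text{Scalar observation} $$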

On investigating objects, we seek reproducible behaviour. A probe may reproducibly re-

spond this way or conversely that way. We can use that distinguishing behaviour to separate

“this objects” from “that objects”, so that we can prepare this-type objects by discarding that-

type, and vice versa. We delve into such distinctions by partitioning our description of the

original object.

To investigate the calculus, we upgrade our analysis of scalar measure and probability to

apply to pairs, representing the extraction of X as a partition of Y by the pair ψ(X|Y ). All the

quoted symmetries apply, but with 2-dimensional representation.

4.1 Feynman rules

As before, the symmetries of partition and combination lead to sum and product rules, the only

difference being that the dimension of pairs ψ is two, not just one. The sum rule for pairs is

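$$ u + v = \begin{pmatrix} u_1 + v_1 \\ u_2 + v_2 \end{pmatrix} \quad\text{representing}\quad \underbrace{\psi(X \oplus Y \mid Z)}_{u+v} = \underbrace{\psi(X \mid Z)}_{u} + \underbrace{\psi(Y \mid Z)}_{v} \tag{19} $$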

and the product rule, promoted to 2 dimensions from (15), becomes

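$$ (u \circ v)_i = \sum_{jk} \gamma_{ijk}\, u_j v_k \quad\text{representing}\quad \underbrace{\psi(X \mid Z)}_{u \circ v} = \underbrace{\psi(X \mid Y)}_{u} \circ \underbrace{\psi(Y \mid Z)}_{v} \tag{20} $$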

The dimension being 2, there are 8 (not just 1) constant coefficients γ_ijk. Their arbitrariness can be reduced by applying appropriate linear shear (no longer just a single scale) to the pairs. However, a 2×2 shear matrix has only 4 components, which are insufficient to reduce 8 γ’s to standard form. To resolve this, we adopt associativity of product (17) as an additional requirement. Chaining would be associative in any dimension but now, in 2, it’s needed.

We have shown [5] that associativity reduces the bilinear product rule (20) to three different classes, each of whose coordinate axes can be sheared into a standard form ([5], equation 20, with discriminant μ = −1, 0, 1 respectively).

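$$ u \circ v = \begin{pmatrix} u_1 v_1 - u_2 v_2 \\ u_1 v_2 + u_2 v_1 \end{pmatrix} \ \text{or}\ \begin{pmatrix} u_1 v_1 \\ u_1 v_2 + u_2 v_1 \end{pmatrix} \ \text{or}\ \begin{pmatrix} u_1 v_1 + u_2 v_2 \\ u_1 v_2 + u_2 v_1 \end{pmatrix} \tag{21a,b,c} $$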

The other two apparently-allowable classes ([5], equations 22 and 23)

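$$ u \circ v = \begin{pmatrix} u_1 v_1 \\ u_2 v_1 \end{pmatrix} \ \text{or}\ \begin{pmatrix} u_1 v_1 \\ u_1 v_2 \end{pmatrix} \tag{22a,b} $$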

are quickly rejected because one of the factors (either v or u) is only present with one compo-

nent, so it operates as a scalar, contrary to the pair postulate.
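A brief numerical check may help here. The sketch below encodes the three candidate products (21a,b,c) as Python functions (the names `prod_a`, `prod_b`, `prod_c` are ours), tests associativity on a handful of integer pairs, and confirms that (21a) coincides with ordinary complex multiplication. It spot-checks these rules and does not replace the classification argument of [5].

```python
import itertools


def prod_a(u, v):
    """Rule (21a): complex multiplication."""
    return (u[0] * v[0] - u[1] * v[1], u[0] * v[1] + u[1] * v[0])


def prod_b(u, v):
    """Rule (21b)."""
    return (u[0] * v[0], u[0] * v[1] + u[1] * v[0])


def prod_c(u, v):
    """Rule (21c)."""
    return (u[0] * v[0] + u[1] * v[1], u[0] * v[1] + u[1] * v[0])


def is_associative(prod, samples):
    """Check (u o v) o w == u o (v o w) over all triples of sample pairs."""
    return all(
        prod(prod(u, v), w) == prod(u, prod(v, w))
        for u, v, w in itertools.product(samples, repeat=3)
    )


# Integer pairs keep the arithmetic exact.
samples = [(1, 2), (3, -1), (0, 5), (-2, 4)]
results = {f.__name__: is_associative(f, samples) for f in (prod_a, prod_b, prod_c)}

# (21a) agrees with Python's built-in complex product.
z = complex(1, 2) * complex(3, -1)
assert prod_a((1, 2), (3, -1)) == (z.real, z.imag)
```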

4.2 Born rule

To discover the nature of the scalar function q(ψ) that is to mediate observation, define moduli

and phases for the three candidate product rules (21a,b,c) as, respectively,

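$$ \begin{aligned} |\psi| &= \sqrt{\psi_1^2 + \psi_2^2} \ \text{or}\ \psi_1 \ \text{or}\ \sqrt{\psi_1^2 - \psi_2^2} \\ \arg\psi &= \arctan(\psi_2/\psi_1) \ \text{or}\ \psi_2/\psi_1 \ \text{or}\ \operatorname{arctanh}(\psi_2/\psi_1) \end{aligned} \tag{23a,b,c} $$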

In conformity with the symmetries, these log-moduli and phases are each linearly additive dur-

ing chained multiplication.

$$ \begin{aligned} \log|\psi(X|Z)| &= \log|\psi(X|Y)| + \log|\psi(Y|Z)| \\ \arg\psi(X|Z) &= \arg\psi(X|Y) + \arg\psi(Y|Z) \end{aligned} \tag{24} $$
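The additivity in (24) can be verified numerically for each of the three product rules. The sketch below is illustrative only: the sample pairs are chosen (as an assumption) with ψ1 > |ψ2| > 0 so that every modulus and phase in (23) is defined, and `math.atan2` stands in for arctan(ψ2/ψ1), with which it agrees when ψ1 > 0.

```python
import math


def prod_a(u, v):
    return (u[0] * v[0] - u[1] * v[1], u[0] * v[1] + u[1] * v[0])


def prod_b(u, v):
    return (u[0] * v[0], u[0] * v[1] + u[1] * v[0])


def prod_c(u, v):
    return (u[0] * v[0] + u[1] * v[1], u[0] * v[1] + u[1] * v[0])


# (product rule, modulus, phase) for the definitions (23a,b,c).
CASES = {
    "a": (prod_a,
          lambda s: math.hypot(s[0], s[1]),
          lambda s: math.atan2(s[1], s[0])),
    "b": (prod_b,
          lambda s: s[0],
          lambda s: s[1] / s[0]),
    "c": (prod_c,
          lambda s: math.sqrt(s[0] ** 2 - s[1] ** 2),
          lambda s: math.atanh(s[1] / s[0])),
}

# Illustrative sample pairs with psi1 > |psi2| > 0.
u, v = (2.0, 0.5), (1.5, 0.25)


def additivity_errors(prod, modulus, phase):
    """Residuals of the two relations in (24) for one product rule."""
    w = prod(u, v)
    d_log = math.log(modulus(w)) - (math.log(modulus(u)) + math.log(modulus(v)))
    d_arg = phase(w) - (phase(u) + phase(v))
    return d_log, d_arg


errors = {name: additivity_errors(*case) for name, case in CASES.items()}
```

All six residuals in `errors` should vanish to rounding error. For (21a) the phase sum is additive only modulo 2π in general; the sample values here avoid wrap-around.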

The only freedom between conforming linear scales is a constant factor, so there is a common

linear scale on which by virtue of the same symmetries we may also place our scalar q, giving

log q = α log |ψ|+ β argψ (25)

for some constants α and β.

This implies that our knowledge of the object, as mediated through q, is invariant under

phase offset ∆(argψ) with compensating modulus change ∆(log |ψ|) = −(β/α)∆(argψ).

Accordingly, our prior probability assignment for phase is to be likewise invariant, hence uni-

form. Yet the range of phase is infinite (−∞ < argψ <∞) for the second and third alternatives

(23b,c). That would make the uniform distribution improper, making it impossible to assign the

foundational prior probability from which subsequent inference would follow. Only the first

alternative (23a) survives, for which phase is 2π-periodic (in effect 0 ≤ argψ < 2π).

Pairs are recognised as complex numbers, because theirs are the relevant addition and mul-

tiplication rules (19) and (21a). Meanwhile, periodicity makes β = 0 so that [21]

q(ψ) = |ψ|^α (26)

and phase is distributed uniformly as

Pr(arg ψ) = 1/(2π) (27)

Now, what we observe is the mean rate at which a source produces particular types of object

— more precisely, we observe the number of such objects that is produced in an ensemble of

N independent experiments. The ensemble can equally well be considered as a single (larger)

experiment embodying its own (correspondingly larger) production, given by the scalar sum

rule as Q = q1 + q2 + · · · + qN . This scales linearly with N . Likewise, the pair sum rule (19)

collects pairs — individually identifiable by arrival time — into Ψ = ψ1 +ψ2 + · · ·+ψN . (Loss

of identifiability, leading to bosons and fermions, would enter the analysis later as or when

stored particles were coerced into the same state.)

Regardless of its distribution in modulus, ψ is random in phase, so the expected mean re-

mains zero while the expected variance grows linearly with N . Hence quantity Q and variance

var(Ψ) scale together in line with N . Comparison with (26) shows that α must be 2 and we

have the Born rule.

q(ψ) = |ψ|² (28)
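The scaling argument can be illustrated by simulation. The following sketch assumes, for simplicity, pairs of unit modulus with phases drawn uniformly as in (27), and estimates the ensemble mean of |Ψ|² for two values of N. The seed, trial count and function names are arbitrary choices, not part of the derivation.

```python
import cmath
import math
import random

rng = random.Random(0)  # fixed seed so the sketch is repeatable


def summed_pair(n):
    """Psi = psi_1 + ... + psi_n with unit moduli and uniform random phases."""
    return sum(cmath.exp(1j * rng.uniform(0.0, 2.0 * math.pi)) for _ in range(n))


def mean_square_modulus(n, trials=4000):
    """Monte Carlo estimate of the ensemble mean of |Psi|^2."""
    return sum(abs(summed_pair(n)) ** 2 for _ in range(trials)) / trials


# If q = |psi|^alpha is to grow like N, it must track the variance: alpha = 2.
estimates = {n: mean_square_modulus(n) for n in (10, 40)}
for n, value in estimates.items():
    print(n, round(value / n, 2))  # each ratio should be close to 1
```

The estimated mean of |Ψ|² grows in proportion to N, whereas the mean of |Ψ| itself would grow only like √N, which is why no exponent other than 2 matches the linear scaling of Q.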

4.3 Probability assignments

Object arrivals being independent, their distribution is Poisson, radioactive decay being an ex-

ample. The probability distribution with a Poisson mean rate r is

$$ \Pr(q)\,dq = \exp\!\left(-\frac{q}{r}\right)\frac{dq}{r} \tag{29} $$

For consistency with the Born rule, the probability distribution for ψ is then Gaussian.

$$ \Pr(\psi)\,d^2\psi = \frac{1}{\pi}\exp\!\left(-\frac{|\psi|^2}{r}\right)\frac{d^2\psi}{r} \tag{30} $$

Sums of Gaussian variables are themselves Gaussian, so this form of distribution is preserved

under partition and combination. Indeed, the Gaussian form could have been independently

justified through the law of large numbers applied to microstates.
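The consistency between (29) and (30) can be checked by sampling. In this sketch, which uses an arbitrary illustrative rate r = 2.5, each real component of ψ is drawn with variance r/2 so that ψ follows (30). The resulting q = |ψ|² should then have mean r and exceed r with probability exp(−1), as (29) requires.

```python
import math
import random

rng = random.Random(1)  # fixed seed so the sketch is repeatable
r = 2.5                 # illustrative mean rate (an assumption)
trials = 20000


def sample_psi():
    """Draw psi from (30): each real component has variance r/2."""
    sigma = math.sqrt(r / 2.0)
    return complex(rng.gauss(0.0, sigma), rng.gauss(0.0, sigma))


q_samples = [abs(sample_psi()) ** 2 for _ in range(trials)]

mean_q = sum(q_samples) / trials
# Under (29), Pr(q > r) = exp(-1).
tail_fraction = sum(q > r for q in q_samples) / trials
```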

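$$ \text{Complex } \psi \;\underset{\text{Gauss}}{\overset{|\psi|^2}{\rightleftharpoons}}\; \text{Quantity } q \;\underset{\text{Poisson}}{\overset{\text{Average}}{\rightleftharpoons}}\; \text{Observe rate } r $$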

4.4 Hilbert space

We choose to introduce quantum calculus through the notional substrate of n a-priori-equivalent

microstates, all supplied at the same rate, unit for convenience. It is helpful to keep track

of combination and partition by inventing orthonormal base vectors |ek〉 for these microstates

k = 1, 2, . . . , n. A sample object ψ can then be expressed in “bra-ket” quantum physics notation

as a complex “amplitude vector” in Hilbert space.

$$ |\psi\rangle = \sum_{k=1}^{n} \psi_k |e_k\rangle \quad\text{with rate}\quad q = |\psi|^2 = \langle\psi|\psi\rangle \tag{31} $$

With nothing at first known about the component amplitudes other than their unit supply rate,

their prior distribution is, independently,

ψk = ComplexGauss (32)

When we see “one object” which we then know exists, we could specify |ψ|² = 1 which

confines ψ to the unit Hilbert sphere and identifies component quantities as probabilities. That

constraint would continue to be obeyed no matter how deeply we partition and recombine.

However, we recommend encoding the magnitude within ψ itself. After all, rates are additive

indefinitely, while probabilities are bounded by unity. The additive variance of ψ is interpreted

more naturally as a rate than as a probability. And rates are closer to laboratory practice.

Theorists tend to discuss states while experimentalists provide rates.

According to the pair sum rule (19), composite states X have amplitudes

$$ \psi_X = \sum_{k \in X} \psi_k \tag{33} $$

which are themselves complex Gaussian whose variance is the size of X . Size 1 (a single

microstate) is called a “pure” state and larger sizes are called “mixed”, maximal mixing with all

microstates included being the sample object itself. State X can be extracted from the sample

objects by applying the selection operator (in mathematics, a projection)

$$ P_X = \sum_{k \in X} |e_k\rangle\langle e_k| \quad\text{so that}\quad |\psi_X\rangle = P_X|\psi\rangle \tag{34} $$

Selection separates objects that exhibit different behaviour, and is implemented in such devices

as diffraction screens and Stern-Gerlach experiments.

It might be objected that this summation contradicts the sum rule for rates, which appears to require |ψX + ψY|² = |ψX|² + |ψY|², which is disobeyed by the samples. However, that hypothetical addition would refer to the quantitation q = |ψ|² of individual samples, which is not observable. It’s the ensemble average rate r = Mean(q) that needs to sum linearly, and it does.
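This distinction between individual samples and ensemble averages can be seen in a short simulation. The sketch assumes two disjoint states X and Y of illustrative sizes 3 and 2, with independent complex Gaussian amplitudes whose variances equal those sizes, as stated after (33). The helper `complex_gauss` is hypothetical.

```python
import math
import random

rng = random.Random(2)  # fixed seed so the sketch is repeatable
trials = 20000


def complex_gauss(variance):
    """Complex Gaussian sample whose mean square modulus equals `variance`."""
    sigma = math.sqrt(variance / 2.0)
    return complex(rng.gauss(0.0, sigma), rng.gauss(0.0, sigma))


# Illustrative sizes (assumptions): X holds 3 microstates, Y holds 2.
size_x, size_y = 3, 2
sum_qx = sum_qy = sum_qxy = 0.0
mismatches = 0
for _ in range(trials):
    psi_x, psi_y = complex_gauss(size_x), complex_gauss(size_y)
    qx, qy, qxy = abs(psi_x) ** 2, abs(psi_y) ** 2, abs(psi_x + psi_y) ** 2
    sum_qx += qx
    sum_qy += qy
    sum_qxy += qxy
    # Individual samples do not obey |psi_X + psi_Y|^2 = |psi_X|^2 + |psi_Y|^2.
    if abs(qxy - (qx + qy)) > 1e-9:
        mismatches += 1

rate_x, rate_y, rate_xy = sum_qx / trials, sum_qy / trials, sum_qxy / trials
```

Nearly every individual sample violates additivity because of the cross term, yet `rate_xy` matches `rate_x + rate_y` to within sampling error, since the cross term averages to zero under random relative phase.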

The distribution of ψ being spherically symmetric, orthonormal base vectors |e〉 can be

rotated arbitrarily, so that a suitable selection P and its complement Q (with P + Q = I, the

identity) can split the original Hilbert space into any desired subspace and its complement.

4.5 Measurement

After partitioning by some physical device, the physicist may then identify the partitions by

labels which will be interpreted as the physical property (energy, position, charge, whatever)

that was measured by that device. Commuting selections split the Hilbert space recursively into

multi-way decompositions represented by Hermitian matrices whose eigenvalues are the real-

valued partitioning labels. These matrices that partition and label the amplitudes in appropriate

coordinates are known as quantum observables.

O            Object
A B          Partition
A C D        Partition
2 5 5 7      Energy values

$$ \begin{pmatrix} 2 & 0 & 0 & 0 \\ 0 & 5 & 0 & 0 \\ 0 & 0 & 5 & 0 \\ 0 & 0 & 0 & 7 \end{pmatrix} \quad \text{Observable (Hamiltonian)} \tag{35} $$

In this example, partition C with energy 5 is selected by

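$$ P_C = |e_2\rangle\langle e_2| + |e_3\rangle\langle e_3| = \begin{pmatrix} 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix} \tag{36} $$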

and similarly for A and D.
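The example of (35) and (36) can be reproduced with plain Python lists. This is a sketch under the assumption that the four microstates carry the energy labels 2, 5, 5, 7 in that order. The helpers `matmul` and `selector` are illustrative names.

```python
# Energy labels of the four microstates in example (35).
energies = [2, 5, 5, 7]
n = len(energies)


def matmul(a, b):
    """Product of two square matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def selector(label):
    """Selection operator (34) for the microstates carrying this label."""
    return [[1 if i == j and energies[i] == label else 0 for j in range(n)]
            for i in range(n)]


P = {label: selector(label) for label in sorted(set(energies))}

# Observable: sum over labels of label * P_label, rebuilding diag(2, 5, 5, 7).
H = [[sum(label * P[label][i][j] for label in P) for j in range(n)]
     for i in range(n)]

# Projections are idempotent, and the selections together make the identity.
assert all(matmul(P[label], P[label]) == P[label] for label in P)
identity = [[sum(P[label][i][j] for label in P) for j in range(n)]
            for i in range(n)]
```

Here `P[5]` reproduces the matrix in (36), while `P[2]` and `P[7]` play the corresponding roles for A and D.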

Once a device has split an object into particular partitions, we assume that repeated subjection to that same device has no extra effect. Projections are idempotent, P² = P, and it would be difficult to make sense of reproducible phenomena without that assumption. However, later non-commuting selections may corrupt separations accomplished earlier: projections need not commute. Objects may thus shift their behaviour according to how they become selected, often (as shown by Bell [22] and by Kochen and Specker [23]) in a manner incompatible with classical particles.

Measurement becomes possible when an object is probed and the probe removed for in-

spection. Unless existence is damaged, the only effect of a probe can be to change the phase

of the selected subspace, without altering internal structure to which the probe is blind. Thus

|ψX〉 −→ e^{iθ}|ψX〉 where θ is some phase shift, effectively random if the probe is strongly

intrusive.

Even then, there is no “collapse of the wave function”. Probing just changes some phases

|ψ〉 = P|ψ〉 + Q|ψ〉 −→ e^{iθ}P|ψ〉 + Q|ψ〉 = |ψ′〉 (37)

and subsequent inspection of the probe to get a P-or-Q measurement is optional. Collapse

only occurs if one outcome or the other is physically blocked or mentally ignored because of

subsequent irrelevance. Ensemble membership is then reduced, but that’s standard in Bayesian

computations as data constrain the solutions and it carries no philosophical content.
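A small sketch may make the effect of (37) concrete. The amplitudes, the selected indices and the phase shift below are all made-up values. The check is that probing alters phases in the selected subspace while leaving the squared moduli of both the selected part and its complement unchanged.

```python
import cmath

# Illustrative amplitudes over four microstates (made-up numbers).
psi = [0.5 + 0.5j, 0.3 - 0.1j, -0.2 + 0.4j, 0.1 + 0.0j]
selected = {1, 2}   # indices picked out by P; the rest belong to Q
theta = 1.234       # arbitrary phase shift imparted by the probe


def probe(state, indices, angle):
    """Apply (37): multiply the P-selected components by exp(i*angle)."""
    shift = cmath.exp(1j * angle)
    return [shift * amp if k in indices else amp for k, amp in enumerate(state)]


def rate(state, indices):
    """Squared modulus of the part of the state lying on the given indices."""
    return sum(abs(state[k]) ** 2 for k in indices)


psi_after = probe(psi, selected, theta)
others = set(range(len(psi))) - selected
```

Comparing `rate(psi_after, selected)` with `rate(psi, selected)`, and likewise for `others`, shows that both outcomes remain present after probing; nothing has been discarded.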

5 Commentary

“Now the essential content of both statistical mechanics and communication theory, of course,

does not lie in the equations; it lies in the ideas that lead to those equations.” — Edwin T.

Jaynes [24, p.4]

We have presented a unified derivation of summation in measure theory, multiplication in

probability theory, and complex numbers in quantum theory. This minimal foundation is very

simple, and should be accessible to neophyte students as well as experienced researchers.

We make no assumption that cannot be checked in the lab. We recommend that as a good

strategic principle, because assumptions that cannot be checked are thereby divorced from prac-

tical impact, in which case they become a peculiar and questionable part of scientific inquiry. If

such assumption is truly needed, then it has practical impact after all because its denial would

alter experimental results, which is self-contradictory. If it’s not needed, then requiring it would

be regrettable. Specifically, we make no assumption involving infinity or the infinitesimal. Any

general theory must apply to special cases, including simple ones, and it happens that simple

examples are sufficient to eliminate all but the one calculus.

In the last couple of decades there has been an effort to reformulate and reconstruct the

quantum formalism based on probability theory [25, 26, 27, 28] and information theory [29,

30, 31, 32, 33, 34, 35, 36, 37]. Yet we find that the similarities between the quantum rules and

probability/information theory are not due to the fact that one derives from the other, but rather

that they both derive from common principles. We share much of the interpretive aspect of

quantum Bayesianism (QBism) [28]. For us, though, Bayes comes first.

Note that our derivation of the quantum formalism cannot be undermined by any alternative

interpretation or supposed generalisation of probability, or by some differing assumptions there

[38, 39], which might be thought to open the possibility of conflict. Symmetries are silent on

interpretation, and we need only the one common foundation to support the whole edifice.

Our symmetries are necessary and sufficient, but we do not exclude using similarly verifiable

assumptions as sufficient foundation [7]. But, as a matter of logic, any alteration to measure

(which has not been seriously proposed) or to probability (which has often been proposed)

must conflict with our symmetries, and thereby with implementations of our quantum thought-

experiments. For, it must be acknowledged, quantum theory works. So does probability. And

the two are entirely mutually consistent.

6 Acknowledgements

We thank those many colleagues who have guided the evolution of our thought over the past

quarter-century, particularly Ariel Caticha, Seth Chaiken, Keith Earle, Anton Garrett, Steve

Gull, and Oleg Lunin. We also thank Andrei Khrennikov, Julio Stern, and Federico Holik

for invitations to present our efforts leading up to this manuscript. K.H.K. also thanks the

Foundational Questions Institute (FQXi) and those who have worked to support the FQXi essay

contests, for providing an opportunity for researchers to explore their thoughts and ideas on

foundational topics [12]. Yet our greatest debt is to Edwin Jaynes who kept faith with rational

inference through many dark years, and to whose memory we respectfully dedicate this work.

The authors contributed equally to this work. The authors declare no competing financial

interests.

References

[1] Rovelli, C. Relative information at the foundation of physics. 2013, “It from Bit or Bit from

It?” FQXi 2013 Essay Contest (2nd prize).

Preprint at http://www.fqxi.org/community/forum/topic/1816.

[2] Caticha, A. Quantifying rational belief. Bayesian Inference And Maximum Entropy Methods In Science And Engineering, Oxford MS, USA 2009 (ed. Goggans, P. M. & Chan, C.-Y.), AIP Conf. Proc. 1193, 60–68 (2009).

[3] Cox, R. T. Probability, frequency, and reasonable expectation. Am. J. Phys. 14, 1–13 (1946).

[4] Caticha, A. Consistency, amplitudes, and probabilities in quantum theory. Phys. Rev. A 57,

1572–1582 (1998).

[5] Goyal, P., Knuth, K. H. & Skilling, J. Origin of complex quantum amplitudes and Feyn-

man’s rules. Phys. Rev. A 81, 022109 (2010), (arXiv:0907.0909 [quant-ph]).

[6] Goyal, P. & Knuth, K. H. Quantum theory and probability theory: their relationship and

origin in symmetry. Symmetry 3, 171–206 (2011).

[7] Knuth, K. H. & Skilling, J. Foundations of inference. Axioms 1, 38–73 (2012).

[8] Feynman, R. P. Space-time approach to non-relativistic quantum mechanics. Rev. Mod.

Phys. 20, 367–387 (1948).

[9] Born, M. Zur Quantenmechanik der Stoßvorgänge (quantum mechanics of collision processes). Zeit. für Phys. 38, 803 (1926).

[10] von Neumann, J. Mathematical Foundations Of Quantum Mechanics, 2, (12 ed., Princeton

Univ. Press, 1996).

[11] Hamming, R. W. The unreasonable effectiveness of mathematics. Amer. Math. Monthly

87, 81–90 (1980).

[12] Knuth, K. H. The deeper roles of mathematics in physical laws. Trick or Truth: the Mysterious Connection between Physics and Mathematics (ed. Aguirre, A., Foster, B. & Merali, Z.), Springer Frontiers Collection, Springer-Verlag, Heidelberg, 2016, FQXi 2015 Essay Contest (3rd prize), (arXiv:1504.06686 [math.HO]), pp. 77–90.

[13] Boole, G. An Investigation Of The Laws Of Thought. (Macmillan, London, 1854).

[14] Knuth, K. H. Measuring on lattices. Bayesian Inference And Maximum Entropy Methods In Science And Engineering, Oxford MS, USA 2009 (ed. Goggans, P. M. & Chan, C.-Y.), AIP Conf. Proc. 1193, 132–144 (2009), (arXiv:0909.3684v1 [math.GM]).

[15] Knuth, K. H. Lattices and their consistent quantification. (2017), Submitted to Annalen

der Physik.

[16] Laplace, P. S. Théorie Analytique Des Probabilités, 2, Ch. 1. (Courcier Imprimeur, Paris, 1812).

[17] Kolmogorov, A. N. Foundations Of The Theory Of Probability. (Chelsea, New York,

1950). English translation and reprinting of Kolmogorov, A. (1933), Grundbegriffe der

Wahrscheinlichkeitsrechnung (Springer, Berlin).

[18] de Finetti, B. Probabilism. Erkenntnis 31, 169–223 (1989), English translation and reprint-

ing of de Finetti, B. (1931), Probabilismo, Logos (Napoli), pp. 163-219.

https://doi.org/10.1007/BF01236563

[19] Jaynes, E. T. Probability Theory: The Logic Of Science, p. 4 (Cambridge Univ. Press,

Cambridge, 2003).

[20] Weinberg, S. Dreams Of A Final Theory (Vintage, 1992).

[21] Tikochinsky, Y. Feynman rules for probability amplitudes. Int. J. Theor. Phys. 27, 543–549

(1988).

[22] Bell, J. S. On the problem of hidden variables in quantum mechanics. Rev. Mod. Phys. 38,

447–452 (1966).

[23] Kochen, S. & Specker, E. P. The problem of hidden variables in quantum mechanics. J.

Math. Mech. 17, 59–87 (1967).

[24] Jaynes, E. T. Probability theory in science and engineering. Colloquium Lectures in

Pure and Applied Science, 4, p. 4, (Socony-Mobil Oil Company, Inc., Dallas, TX, 1959),

http://bayes.wustl.edu/etj/articles/mobil.pdf.

[25] Youssef, S. Quantum mechanics as Bayesian complex probability theory. Mod. Phys. Lett.

A9, 2571–2586 (1994).

[26] Caves, C. M., Fuchs, C. A. & Schack, R. Quantum probabilities as Bayesian probabilities.

Phys. Rev. A 65, 022305 (2002).

[27] Bub, J. Quantum probabilities as degrees of belief. Studies in History and Philosophy of

Science Part B: Studies in History and Philosophy of Modern Physics 38, 232–254 (2007).

[28] Fuchs, C. A., Mermin, N. D. & Schack, R. An introduction to QBism with an application

to the locality of quantum mechanics (2013), (arXiv:1311.5253 [quant-ph]).

[29] Timpson, C. G. Quantum Information Theory And The Foundations Of Quantum Mechan-

ics. (Oxford Univ. Press, Oxford, 2013).

[30] Rovelli, C. Relational quantum mechanics. Int. J. Theor. Phys. 35, 1637–1678 (1996).

[31] Reginatto, M. Derivation of the equations of nonrelativistic quantum mechanics using the

principle of minimum Fisher information. Phys. Rev. A 58, 1775 (1998).

[32] Zeilinger, A. A foundational principle for quantum mechanics. Foundations of Physics 29,

631–643 (1999).

[33] Fuchs, C. A. Quantum mechanics as quantum information (and only a little more). 2002,

(arXiv:quant-ph/0205039).

[34] Clifton, R., Bub J. & Halvorson, H. Characterizing quantum theory in terms of

information-theoretic constraints. Foundations of Physics 33, 1561–1591 (2003).

[35] Goyal, P. Information-geometric reconstruction of quantum theory. Phys. Rev. A 78,

052120 (2008).

[36] Brukner, Č. & Zeilinger, A. Information invariance and quantum probabilities. Foundations of Physics 39, 677–689 (2009).

[37] Wootters, W. K. Communicating through probabilities: does quantum theory optimize the

transfer of information? Entropy 15, 3130–3147 (2013).

[38] Dupré, M. J. & Tipler, F. J. New axioms for rigorous Bayesian probability. Bayesian Analysis 4, 599–606 (2009).

[39] Terenin, A. & Draper, D. Cox’s Theorem and the Jaynesian Interpretation of Probability.

(2017), (arXiv:1507.06597 [math.ST]).
