Post on 20-May-2020
The symmetrical foundation of Measure, Probability and Quantum theories
John Skilling,1 Kevin H. Knuth2∗
1 Maximum Entropy Data Consultants Ltd, Kenmare, Ireland
2 University at Albany (SUNY), Albany NY 12222, USA
∗To whom correspondence should be addressed; E-mail: kknuth@albany.edu.
Abstract
Quantification starts with sum and product rules that express combination and partition. These rules rest on elementary symmetries that have wide applicability, which explains why arithmetical adding up and splitting into proportions are ubiquitous. Specifically, measure theory formalises addition, and probability theory formalises inference in terms of proportions.
Quantum theory rests on the same simple symmetries, but is formalised in two dimensions, not just one, in order to track an object through its binary interactions with other objects. The symmetries still require sum and product rules (here known as the Feynman rules), but they apply to complex numbers instead of real scalars, with observable probabilities being modulus-squared (known as the Born rule). The standard quantum formalism follows. There is no mystery or weirdness, just ordinary probabilistic inference.
Contents:
1. Introduction
2. Measures
3. Probability
4. Quantum theory
4.1. Feynman rules
4.2. Born rule
4.3. Probability assignments
4.4. Hilbert space
4.5. Measurement
5. Commentary
1 Introduction
“Quantum mechanics will cease to look puzzling only when we will be able to derive the for-
malism of the theory from a set of simple assertions about the world.” — Carlo Rovelli [1]
Our job in science is to make sense of our observations. This is very general, and we seek
corresponding clarity and simplicity. General theory must apply to all cases, and our strat-
egy of eliminative induction [2] is to exclude theories which give “wrong” results in particular
cases, until there remains just a single candidate theory which we can then recommend with
confidence.
Symmetries are particularly valuable tools for eliminating “wrong” behaviour [3, 4, 5, 6, 7].
If A is supposed to equal B, then all the other possibilities with A ≠ B are immediately excluded. Symmetries are particularly powerful when applied to simple systems because “wrong”
behaviour is seen most clearly there, where judgement is least subjective. Accordingly, we pro-
ceed by considering simple thought-experiments, whose behaviour should be uncontroversial.
In classical physics, objects can be detected passively, unchanged by interaction with a
probe. That’s a valid limit for when the object dominates the probe, and its quantification leads
to standard scalar measure theory (“stuff adds up”). More fundamentally, though, a probe is
a partner object, which happens to be built so that its perturbation may become recorded as a
measurement. But, if our object can perturb a partner object, then by symmetry the partner
object can also perturb our object. We could assign either role to either. Measurement involves
the interaction between at least two entities, and not just the presence of one. Objects are
detected actively.
This insight is the source of “quantum-ness”. It leads to the Feynman rules [8], which rep-
resent elementary interactions by pairs of numbers which intertwine according to the rules of
complex arithmetic, and to the Born rule [9] which relates those complex numbers to observa-
tion. With those rules in place, Hilbert space can be constructed [10] and the rest of quantum
theory follows.
In this paper, we use the same thought-experiments and the same symmetries for both clas-
sical and quantum situations. This gives a straightforward unified derivation of measure theory,
probability theory, and quantum theory. Two formal operations are uncovered – a sum rule and
a product rule. Measures formalise quantitation (kilograms, coulombs and so on). Probability
formalises our inferences. Quantum theory tracks objects through their interactions. It’s simple.
2 Measures
“I have tried, with little success, to get some of my friends to understand my amazement that
the abstraction of integers for counting is both possible and useful. . . . To me, this is one of
the strongest examples of the unreasonable effectiveness of mathematics. Indeed, I find it both
strange and inexplicable.” — Richard Hamming [11].
Starting at the beginning with minimalist foundation, we might, perhaps, think of a shopping
basket of fruit — apples, bananas and so on. Consider the operator ⊕ which combines disjoint
objects A and B into a composite object A⊕ B. We list some basic properties of ⊕ which are
commonly applicable.
Closure A combination of objects is an object in its own right.
(A⊕B) is an object, where A and B are disjoint (1)
Commutative The order of objects does not matter.
A⊕B = B ⊕ A (2)
Associative The order of combination does not matter.
(A⊕B)⊕ C = A⊕ (B ⊕ C) (3)
Commutativity and associativity together mean that objects can be arbitrarily shuffled.
Limitless Equivalent, but disjoint, objects can be combined without restriction.
A, 2A, 3A, … are all different, where (n+1)A = (nA) ⊕ A        (4)
Here individual A’s are equivalent but disjoint objects A1, A2, A3, . . . . We intend no
infinite limit here. All we claim is the freedom to include more objects, limited only by
our resources and patience, but not somehow limited by an intellectual barrier. If there is
a cardinality restriction, we will never reach it. It would have no practical consequences
for us, so we are free to ignore it.
If these properties (closure, commutativity, associativity, limitless) are all accepted, then
A⊕B is represented by the component-wise sum a + b (5)
where a = (a1, a2, . . . , an)′ and b = (b1, b2, . . . , bn)′ are n-tuple representations of A and B.
This is the general sum rule. In accordance with practical feasibility, the rule is demonstrated
by construction ([7], appendix A), successively incorporating new objects and introducing a new
dimension whenever a new object is not commensurable with the existing set, commensurability
of X and Y being defined as the existence of non-zero integers m and n such that m of X can
be deemed equivalent to n of Y .
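As an illustrative sketch (ours; the fruit names and counts are invented), the n-tuple representation and the component-wise sum rule (5) take only a few lines:

```python
# Sketch of the n-tuple sum rule (5): disjoint objects are represented
# as tuples of commensurable quantities along fixed axes, and the
# combination operator ⊕ becomes component-wise addition.

def combine(a, b):
    """Represent A ⊕ B by the component-wise sum a + b."""
    return tuple(x + y for x, y in zip(a, b))

# Two shopping baskets over the axes (apples, bananas, cherries):
a = (3, 1, 0)
b = (2, 0, 4)

assert combine(a, b) == combine(b, a)                    # commutative
assert combine(combine(a, b), (1, 1, 1)) == \
       combine(a, combine(b, (1, 1, 1)))                 # associative
print(combine(a, b))  # (5, 1, 4)
```

Each new incommensurable object simply adds another axis to the tuple; fully commensurable objects collapse to the scalar case (6).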
Like all basic assignments, commensurability is decided by practical judgement. Thus π
(known to be irrational) and its accurate approximation 3.1415926535897932 would usually be
deemed equivalent for practical purposes, allowing circumferential lengths to be commensurate
with diameters. The mathematical distinction between rational and real numbers has zero prac-
tical impact, so can always be ignored for rational inference in practical science. Equivalence
requires equality of representation and non-equivalence requires inequality. The quoted sum
rule preserves all such correspondences.
The representation is only forced “up to isomorphism”, meaning that any 1:1 re-grading is
logically equivalent and has exactly the same analytical power. Conversely, a representation that
was not in 1:1 correspondence would break some of the equivalences and/or non-equivalences,
so would be rejected. The convenience of “+” is so great that we adopt it as a near-universal
convention. For one thing, its compact connected topology fits naturally with our intuitive
notion of locality: small changes have small effect.
Within the n-tuple formulation, the only freedom consistent with preservation of “+” and
the original commensurabilities is linear shear x′ = Tx by an arbitrary non-singular (affine)
n×n transformation matrix T.
Dimension If our objects are fully commensurable, so that there is only one relevant prop-
erty (as when a shopping basket holds apples but nothing else), the dimension shrinks to
1 and the sum rule takes scalar form.
A⊕B is represented by a+ b (6)
Such a quantity is known in physics as an extensive variable. The only freedom consis-
tent with preservation of “+” and the original commensurabilities is linear rescaling to
different units. The quantity may or may not be signed. Electric charge, for example,
is signed, but mass is not. A single-signed quantity, positive by convention, is known in
mathematics as a measure.
Familiarity makes addition seem obvious, indeed trivial. Here, though, we see why additiv-
ity is so ubiquitous [12]. It’s required by elementary symmetries that are commonly upheld. To
see whether additivity is required, all we have to do is check the boxes.
Closure? Commutative? Associative? Limitless? Dimension?
If they are satisfied, then the sum rule must apply, in the appropriate dimension. We do not need
bespoke derivations for each application. Neither do we need sophisticated formalism for such
simple requirements.
3 Probability
“Probability is expectation founded upon partial knowledge.” — George Boole [13].
The operation inverse to combination is partition, in which a composite object is progres-
sively decomposed, if necessary all the way down to a notional substrate of a-priori-equivalent
microstates.

    O                      Object O
    A  B                   Partition    O = A ⊕ B
    C  D  B                Partition    A = C ⊕ D
    C  E                   Combination  D ⊕ B = E
    O                      Combination  C ⊕ E = O
    1 2 3 4 5 6 7 8 9      Notional substrate        (7)
Quantification of the divisions obeys the symmetries of measure and hence we have the quan-
tification sum rule, for scalars when the dimension is 1.
q(X ⊕ Y ) = q(X) + q(Y ) (8)
We are also interested in source-to-destination steps u = (dest; source) and their quantifica-
tion p(u). Steps can be chained into composites by a “from” operator ◦, so that u ◦ v is also a
step, when u starts where v ends. This is formalised as closure of ◦.
Closure of ◦ u ◦ v is also a step, when u starts where v ends. (9)
Changing the provenance of a combination does not alter whatever commensurabilities de-
fined its components, a requirement formalised as right-distributivity.
Right-distributive ◦ (u⊕ v) ◦w = (u ◦w)⊕ (v ◦w) (10)
Right-distributivity implies that the quantification of a step is linear in the destination quantity,
with the only remaining freedom being a scale factor which may depend on the source:
p(X;Z) = q(X)f(Z) (11)
where f is some as-yet-unknown function. Likewise for other arguments:
p(X;Y ) = q(X)f(Y ) , p(Y ;Z) = q(Y )f(Z) . (12)
The representation of u ◦ v is to be constructed from the representations of u and v, so
p(u ◦ v) = φ(p(u), p(v))        (13)
for some function φ representing ◦. But u = (X;Y ) and v = (Y ;Z) chain to become the step
u ◦ v = (X;Z) with source Z and destination X so, using (11)–(13),
q(X)f(Z) = φ(q(X)f(Y), q(Y)f(Z))        (14)
The left side being linear in q(X) and in f(Z), the right side must be also. Hence φ is bilinear
φ(x, y) = γxy with some coefficient γ.
On setting γ = 1 by choice of scale factor, equation (13) reads
p(u ◦ v) = γp(u) p(v) with γ = 1 (15)
which is the product rule. Multiplication builds on addition, and (8) and (15) are the consistent
rules of combination and partition. As a bonus, (14) implies that f(Y ) q(Y ) = γ, so (with
γ = 1) f is the reciprocal of q and p is just proportion.
p(dest; source) = q(dest) / q(source)        (16)
In this derivation, associativity p((u ◦ v) ◦w) = p(u ◦ (v ◦w)) is an emergent property of
the representation. However, associativity
Associativity of ◦ (u ◦ v) ◦w = u ◦ (v ◦w) (17)
of chained steps themselves is an intuitive requirement that could be taken as axiomatic [7].
Indeed, there is a nexus of interrelated “obvious” properties, where the selection of which is
axiomatic and which is emergent is to some extent a matter of choice.
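The proportion assignment (16) and the product rule (15) can be checked numerically; this is an illustrative sketch of ours, with invented measure values q:

```python
# Check that the proportion assignment (16) obeys the product rule (15)
# for chained steps: p(X;Z) = p(X;Y) p(Y;Z) when (X;Y) ∘ (Y;Z) = (X;Z).

q = {"Z": 12.0, "Y": 6.0, "X": 2.0}   # invented measures of three states

def p(dest, source):
    """Quantification of a step as a proportion, equation (16)."""
    return q[dest] / q[source]

# Chaining (X;Y) from (Y;Z) must reproduce the direct step (X;Z):
assert abs(p("X", "Z") - p("X", "Y") * p("Y", "Z")) < 1e-12
```

The product rule holds for any positive values of q, because the intermediate q(Y) cancels in the proportion.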
Partitioning and combination can be filled out to a Boolean lattice defined by join ∨ (logical
OR) which upgrades⊕ to non-disjoint components, and meet ∧ (logical AND) which identifies
overlaps [14, 15]. In this context, we recognise p(dest;source) as the transition probability
from source to destination and traditionally written as Prob(dest | source). Symmetry of
the product rule then gives Bayes’ theorem
    Prob(ξ)  Prob(data | ξ)  =  Prob(data)  Prob(ξ | data)        (18)
     Prior     Likelihood         Evidence     Posterior
for the inference of parameters ξ from data, which is acknowledged as the foundation of
rational inference. Just as summation is generally taken as axiomatic for measures, so can
Laplace’s rule (successes/trials as in (16)) be taken as the definition of probability [16]. In fact,
both are inevitable. This is why probability is ubiquitous in inference, while any competing
framework is denied. Its calculus is required by elementary symmetries that apply to logical
operations on propositions.
Awkwardly, the traditional but idiosyncratic solidus notation suggests that there’s something
special about Prob(· | ·). There isn’t. Probability just obeys the same simple laws of proportion
that apply widely elsewhere. It needs no bespoke derivation, which is not to impugn earlier
work [3, 17, 18, 19].
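A small worked instance of Bayes' theorem (18) makes the bookkeeping concrete (the coin scenario and all numbers here are invented for illustration):

```python
# Bayes' theorem (18) for a parameter ξ ∈ {fair, biased} describing a
# coin, after observing data = "heads".

prior = {"fair": 0.8, "biased": 0.2}
likelihood = {"fair": 0.5, "biased": 0.9}     # Prob(heads | ξ)

evidence = sum(prior[x] * likelihood[x] for x in prior)   # Prob(data)
posterior = {x: prior[x] * likelihood[x] / evidence for x in prior}

# Prior × Likelihood = Evidence × Posterior, term by term, as in (18):
for x in prior:
    assert abs(prior[x] * likelihood[x] - evidence * posterior[x]) < 1e-12
```

The posterior is just the prior re-proportioned by the likelihood, exactly the partitioning of measure described in section 2.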
4 Quantum Theory
“This theoretical failure to find a plausible alternative to quantum mechanics, even more than
the precise experimental verification of linearity, suggests to me that quantum mechanics is the
way it is because any small change in quantum mechanics would lead to logical absurdities.”
— Steven Weinberg [20]
Production of objects consists of generating them as reproducibly as we can and supplying
them when available. We can then make the objects interact with our probes. Observation
happens when we interrogate a probe, at which optional later stage we extract information.
Our job is to make sense of our observations of objects. But observation requires interaction
with probes, so even when we know that an object exists (which is one real constraint), our
knowledge of it remains incomplete because of our ignorance of the probe. Probes themselves
are incompletely-known objects, possibly of different type, about which completion of knowl-
edge could only come from observation, and so on and so on in indefinite regress. Accordingly,
we adopt the pair postulate, that our knowledge of an object is mediated through interactions
that involve two parameters and not just one. Connection with scalar observation is then medi-
ated probabilistically through scalar quantities q(ψ) arising from the underlying pairs ψ. Detail
is lost, which we may try to recover by inference.
    Pair ψ   --Function-->    Scalar quantity q(ψ)   --Probability-->   Scalar observation
             <--Inference--                          <--Inference--
On investigating objects, we seek reproducible behaviour. A probe may reproducibly re-
spond this way or conversely that way. We can use that distinguishing behaviour to separate
“this objects” from “that objects”, so that we can prepare this-type objects by discarding that-
type, and vice versa. We delve into such distinctions by partitioning our description of the
original object.
To investigate the calculus, we upgrade our analysis of scalar measure and probability to
apply to pairs, representing the extraction of X as a partition of Y by the pair ψ(X|Y ). All the
quoted symmetries apply, but with 2-dimensional representation.
4.1 Feynman rules
As before, the symmetries of partition and combination lead to sum and product rules, the only
difference being that the dimension of pairs ψ is two, not just one. The sum rule for pairs is
    u + v = ( u1+v1 )      representing      ψ(X⊕Y | Z)  =  ψ(X|Z) + ψ(Y|Z)        (19)
            ( u2+v2 )                          [u+v]          [u]       [v]
and the product rule, promoted to 2 dimensions from (15), becomes
    (u ◦ v)i = Σjk γijk uj vk      representing      ψ(X|Z)  =  ψ(X|Y) ◦ ψ(Y|Z)        (20)
                                                      [u◦v]       [u]       [v]
The dimension being 2, there are 8 (not just 1) constant coefficients γijk. Their arbitrariness
can be reduced by applying appropriate linear shear (no longer just a single scale) to the pairs.
However, a 2× 2 shear matrix has only 4 components, which are insufficient to reduce 8 γ’s to
standard form. To resolve this, we adopt associativity of product (17) as an additional require-
ment. Chaining would be associative in any dimension but now, in 2, it’s needed.
We have shown [5] that associativity reduces the bilinear product rule (20) to three different
classes, each of whose coordinate axes can be sheared into a standard form ([5], equation 20), with discriminant μ = −1, 0, 1 respectively.
    u ◦ v = ( u1v1 − u2v2 )    or    ( u1v1 )            or    ( u1v1 + u2v2 )        (21a,b,c)
            ( u1v2 + u2v1 )          ( u1v2 + u2v1 )           ( u1v2 + u2v1 )
The other two apparently-allowable classes ([5], equations 22 and 23)
    u ◦ v = ( u1v1 )    or    ( u1v1 )        (22a,b)
            ( u2v1 )          ( u1v2 )
are quickly rejected because one of the factors (either v or u) is only present with one compo-
nent, so it operates as a scalar, contrary to the pair postulate.
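As an illustrative check (ours, with invented numerical values), the three standard forms (21a,b,c) are each associative, while a rejected form such as (22a) degenerates to scalar action:

```python
# The three associative standard-form pair products (21a,b,c), plus a
# check that the rejected form (22a) uses only one component of v.

def prod_a(u, v):   # discriminant -1: complex multiplication
    return (u[0]*v[0] - u[1]*v[1], u[0]*v[1] + u[1]*v[0])

def prod_b(u, v):   # discriminant 0: "dual number" multiplication
    return (u[0]*v[0], u[0]*v[1] + u[1]*v[0])

def prod_c(u, v):   # discriminant +1: split-complex multiplication
    return (u[0]*v[0] + u[1]*v[1], u[0]*v[1] + u[1]*v[0])

def close(p, q):
    return all(abs(x - y) < 1e-12 for x, y in zip(p, q))

u, v, w = (1.0, 2.0), (3.0, -1.0), (0.5, 4.0)
for prod in (prod_a, prod_b, prod_c):
    assert close(prod(prod(u, v), w), prod(u, prod(v, w)))  # associative

def prod_rejected(u, v):        # (22a): only v1 appears
    return (u[0]*v[0], u[1]*v[0])

# v's second component is invisible, so v acts as the scalar v1,
# contrary to the pair postulate:
assert prod_rejected(u, (5.0, -7.0)) == prod_rejected(u, (5.0, 99.0))
```

Class (21a) is ordinary complex multiplication, which is the one that survives the phase argument of section 4.2.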
4.2 Born rule
To discover the nature of the scalar function q(ψ) that is to mediate observation, define moduli
and phases for the three candidate product rules (21a,b,c) as, respectively,
    |ψ|   = √(ψ1² + ψ2²)      or    ψ1        or    √(ψ1² − ψ2²)
    arg ψ = arctan(ψ2/ψ1)     or    ψ2/ψ1    or    arctanh(ψ2/ψ1)        (23a,b,c)
In conformity with the symmetries, these log-moduli and phases are each linearly additive dur-
ing chained multiplication.
    log |ψ(X|Z)| = log |ψ(X|Y)| + log |ψ(Y|Z)|
    arg ψ(X|Z)   = arg ψ(X|Y) + arg ψ(Y|Z)        (24)
The only freedom between conforming linear scales is a constant factor, so there is a common
linear scale on which by virtue of the same symmetries we may also place our scalar q, giving
log q = α log |ψ|+ β argψ (25)
for some constants α and β.
This implies that our knowledge of the object, as mediated through q, is invariant under
phase offset ∆(argψ) with compensating modulus change ∆(log |ψ|) = −(β/α)∆(argψ).
Accordingly, our prior probability assignment for phase is to be likewise invariant, hence uni-
form. Yet the range of phase is infinite (−∞ < argψ <∞) for the second and third alternatives
(23b,c). That would make the uniform distribution improper, making it impossible to assign the
foundational prior probability from which subsequent inference would follow. Only the first
alternative (23a) survives, for which phase is 2π-periodic (in effect 0 ≤ argψ < 2π).
Pairs are recognised as complex numbers, because theirs are the relevant addition and mul-
tiplication rules (19) and (21a). Meanwhile, periodicity makes β = 0 so that [21]
q(ψ) = |ψ|^α        (26)
and phase is distributed uniformly as
Pr(argψ) = 1/2π (27)
Now, what we observe is the mean rate at which a source produces particular types of object
— more precisely, we observe the number of such objects that is produced in an ensemble of
N independent experiments. The ensemble can equally well be considered as a single (larger)
experiment embodying its own (correspondingly larger) production, given by the scalar sum
rule as Q = q1 + q2 + · · · + qN . This scales linearly with N . Likewise, the pair sum rule (19)
collects pairs — individually identifiable by arrival time — into Ψ = ψ1 +ψ2 + · · ·+ψN . (Loss
of identifiability, leading to bosons and fermions, would enter the analysis later as or when
stored particles were coerced into the same state.)
Regardless of its distribution in modulus, ψ is random in phase, so the expected mean re-
mains zero while the expected variance grows linearly with N . Hence quantity Q and variance
var(Ψ) scale together in line with N . Comparison with (26) shows that α must be 2 and we
have the Born rule.
q(ψ) = |ψ|²        (28)
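The scaling argument can be checked by simulation; this is an illustrative sketch of ours, with arbitrary ensemble sizes. Summing N unit-modulus pairs with uniformly random phases gives a resultant whose expected squared modulus grows linearly with N, which is the behaviour that fixes α = 2:

```python
# Monte-Carlo sketch of the Born-rule scaling argument: for N pairs of
# unit modulus and random phase, E[Ψ] ≈ 0 and E|Ψ|² ≈ N, so the ensemble
# quantity Q = N matches |ψ|² per object, i.e. α = 2 in (26).

import cmath
import random

random.seed(0)

def ensemble_sum(N):
    """Sum of N unit-modulus pairs with uniformly random phases."""
    return sum(cmath.exp(2j * cmath.pi * random.random()) for _ in range(N))

trials = 2000
for N in (10, 40):
    samples = [ensemble_sum(N) for _ in range(trials)]
    var = sum(abs(s) ** 2 for s in samples) / trials   # ≈ var(Ψ), mean ≈ 0
    assert abs(var / N - 1.0) < 0.2                    # grows linearly with N
```

Any other exponent α would make the single-object quantity and the ensemble variance scale differently with N.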
4.3 Probability assignments
Object arrivals being independent, their distribution is Poisson, radioactive decay being an ex-
ample. The probability distribution with a Poisson mean rate r is
Pr(q) dq = exp(−q/r) dq/r        (29)
(29)
For consistency with the Born rule, the probability distribution for ψ is then Gaussian.
Pr(ψ) d²ψ = (1/π) exp(−|ψ|²/r) d²ψ/r        (30)
Sums of Gaussian variables are themselves Gaussian, so this form of distribution is preserved
under partition and combination. Indeed, the Gaussian form could have been independently
justified through the law of large numbers applied to microstates.
    Complex ψ   --|ψ|²-->    Quantity q   --Average-->    Observe rate r
                <--Gauss--                <--Poisson--
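This chain can be simulated directly (an illustrative sketch of ours; the rate r is chosen arbitrarily): a complex-Gaussian ψ with E|ψ|² = r yields a quantity q = |ψ|² that is exponentially distributed with mean r, consistent with (29)–(30).

```python
# If ψ is complex Gaussian with E|ψ|² = r (real and imaginary parts each
# Normal(0, r/2)), then q = |ψ|² is exponential with mean r.

import random

random.seed(1)

r = 3.0
n = 200_000
sigma = (r / 2) ** 0.5
qs = [random.gauss(0, sigma) ** 2 + random.gauss(0, sigma) ** 2
      for _ in range(n)]

mean_q = sum(qs) / n
assert abs(mean_q - r) < 0.1               # E[q] = r

# Exponential check: P(q > r) should be e^{-1} ≈ 0.368
frac = sum(1 for q in qs if q > r) / n
assert abs(frac - 0.368) < 0.01
```

Sums of such Gaussians remain Gaussian, which is why this form survives arbitrary partition and recombination.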
4.4 Hilbert space
We choose to introduce quantum calculus through the notional substrate of n a-priori-equivalent
microstates, all supplied at the same rate, unit for convenience. It is helpful to keep track
of combination and partition by inventing orthonormal base vectors |ek〉 for these microstates
k = 1, 2, . . . , n. A sample object ψ can then be expressed in “bra-ket” quantum physics notation
as a complex “amplitude vector” in Hilbert space.
    |ψ〉 = Σ_{k=1}^{n} ψk |ek〉     with rate     q = |ψ|² = 〈ψ|ψ〉        (31)
With nothing at first known about the component amplitudes other than their unit supply rate,
their prior distribution is, independently,
ψk = ComplexGauss (32)
When we see “one object” which we then know exists, we could specify |ψ|² = 1 which
confines ψ to the unit Hilbert sphere and identifies component quantities as probabilities. That
constraint would continue to be obeyed no matter how deeply we partition and recombine.
However, we recommend encoding the magnitude within ψ itself. After all, rates are additive
indefinitely, while probabilities are bounded by unity. The additive variance of ψ is interpreted
more naturally as a rate than as a probability. And rates are closer to laboratory practice.
Theorists tend to discuss states while experimentalists provide rates.
According to the pair sum rule (19), composite states X have amplitudes
    ψX = Σ_{k∈X} ψk        (33)
which are themselves complex Gaussian whose variance is the size of X . Size 1 (a single
microstate) is called a “pure” state and larger sizes are called “mixed”, maximal mixing with all
microstates included being the sample object itself. State X can be extracted from the sample
objects by applying the selection operator (in mathematics, a projection)
    PX = Σ_{k∈X} |ek〉〈ek|     so that     |ψX〉 = PX |ψ〉        (34)
Selection separates objects that exhibit different behaviour, and is implemented in such devices
as diffraction screens and Stern-Gerlach experiments.
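A minimal numerical sketch of the selection operator (34), using numpy (the dimension and the subset X are arbitrary choices of ours):

```python
# Amplitudes over n microstates, a composite state X, and its selection
# operator P_X = Σ_{k∈X} |e_k><e_k| — a diagonal 0/1 projection matrix.

import numpy as np

rng = np.random.default_rng(0)

n = 6
# unit-rate complex-Gaussian pairs, as in (32)
psi = (rng.normal(size=n) + 1j * rng.normal(size=n)) / np.sqrt(2)

X = [1, 2, 4]                     # microstates making up state X
P_X = np.zeros((n, n))
for k in X:
    P_X[k, k] = 1.0               # Σ_{k∈X} |e_k><e_k|

psi_X = P_X @ psi                 # |ψ_X> = P_X |ψ>
assert np.allclose(psi_X[X], psi[X])          # components in X kept
assert np.allclose(np.delete(psi_X, X), 0)    # the rest discarded
assert np.allclose(P_X @ P_X, P_X)            # projections are idempotent
```

The amplitude ψ_X of the composite state is the sum of its component amplitudes, as the pair sum rule (33) requires.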
It might be objected that this summation contradicts the sum rule for rates, which appears
to require |ψX + ψY|² = |ψX|² + |ψY|², which is disobeyed by the samples. However, that
hypothetical addition would refer to the quantitation q = |ψ|² of individual samples, which is
not observable. It’s the ensemble average rate r = Mean(q) that needs to sum linearly, and it
does.
The distribution of ψ being spherically symmetric, orthonormal base vectors |e〉 can be
rotated arbitrarily, so that a suitable selection P and its complement Q (with P + Q = I, the
identity) can split the original Hilbert space into any desired subspace and its complement.
4.5 Measurement
After partitioning by some physical device, the physicist may then identify the partitions by
labels which will be interpreted as the physical property (energy, position, charge, whatever)
that was measured by that device. Commuting selections split the Hilbert space recursively into
multi-way decompositions represented by Hermitian matrices whose eigenvalues are the real-
valued partitioning labels. These matrices that partition and label the amplitudes in appropriate
coordinates are known as quantum observables.
    O             Object
    A  B          Partition
    A  C  D       Partition
    2  5  5  7    Energy values

    [ 2 0 0 0 ]
    [ 0 5 0 0 ]    Observable (Hamiltonian)        (35)
    [ 0 0 5 0 ]
    [ 0 0 0 7 ]
In this example, partition C with energy 5 is selected by
    PC = |e2〉〈e2| + |e3〉〈e3| =
         [ 0 0 0 0 ]
         [ 0 1 0 0 ]        (36)
         [ 0 0 1 0 ]
         [ 0 0 0 0 ]
and similarly for A and D.
Once a device has split an object into particular partitions, we assume that repeated subjection to that same device has no extra effect. Projections are idempotent, P² = P, and it would
be difficult to make sense of reproducible phenomena without that assumption. However, later
non-commuting selections may corrupt separations accomplished earlier: projections need not
commute. Objects may thus shift their behaviour according to how they become selected, of-
ten (as shown by Bell [22] and by Kochen and Specker [23]) in a manner incompatible with
classical particles.
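Idempotence and non-commutativity are easy to exhibit (an illustrative two-dimensional example of ours, not from the paper):

```python
# Two projections in the same 2-d space that do not commute: selection
# of |0> versus selection of the rotated state (|0>+|1>)/√2.

import numpy as np

P = np.array([[1.0, 0.0],
              [0.0, 0.0]])                  # select |0>
v = np.array([1.0, 1.0]) / np.sqrt(2)
Q = np.outer(v, v)                          # select (|0>+|1>)/√2

assert np.allclose(P @ P, P)                # each is idempotent, P² = P
assert np.allclose(Q @ Q, Q)
assert not np.allclose(P @ Q, Q @ P)        # but they do not commute
```

Applying P then Q gives a different result from Q then P, which is the formal core of order-dependent selection.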
Measurement becomes possible when an object is probed and the probe removed for in-
spection. Unless existence is damaged, the only effect of a probe can be to change the phase
of the selected subspace, without altering internal structure to which the probe is blind. Thus
|ψX〉 −→ e^{iθ}|ψX〉 where θ is some phase shift, effectively random if the probe is strongly
intrusive.
Even then, there is no “collapse of the wave function”. Probing just changes some phases
    |ψ〉 = P|ψ〉 + Q|ψ〉  −→  e^{iθ} P|ψ〉 + Q|ψ〉 = |ψ′〉        (37)
and subsequent inspection of the probe to get a P-or-Q measurement is optional. Collapse
only occurs if one outcome or the other is physically blocked or mentally ignored because of
subsequent irrelevance. Ensemble membership is then reduced, but that’s standard in Bayesian
computations as data constrain the solutions and it carries no philosophical content.
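Equation (37) can be checked directly (an illustrative sketch of ours, with arbitrary dimension and phase θ): the phase shift leaves both subspace rates, and the total, unchanged.

```python
# Probing as a phase change (37): |ψ'> = e^{iθ} P|ψ> + Q|ψ>.
# Both |Pψ|² and |Qψ|² are preserved — nothing collapses.

import numpy as np

rng = np.random.default_rng(2)

n = 4
psi = rng.normal(size=n) + 1j * rng.normal(size=n)
P = np.diag([1.0, 1.0, 0.0, 0.0])      # selected subspace
Q = np.eye(n) - P                      # its complement

theta = 0.7                            # arbitrary phase from the probe
psi_prime = np.exp(1j * theta) * (P @ psi) + Q @ psi

for proj in (P, Q):
    assert np.isclose(np.linalg.norm(proj @ psi_prime),
                      np.linalg.norm(proj @ psi))
```

Only the relative phase between the P and Q components changes, which is observable in interference but not in either outcome rate alone.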
5 Commentary
“Now the essential content of both statistical mechanics and communication theory, of course,
does not lie in the equations; it lies in the ideas that lead to those equations.” — Edwin T.
Jaynes [24, p.4]
We have presented a unified derivation of summation in measure theory, multiplication in
probability theory, and complex numbers in quantum theory. This minimal foundation is very
simple, and should be accessible to neophyte students as well as experienced researchers.
We make no assumption that cannot be checked in the lab. We recommend that as a good
strategic principle, because assumptions that cannot be checked are thereby divorced from prac-
tical impact, in which case they become a peculiar and questionable part of scientific inquiry. If
such an assumption is truly needed, then it has practical impact after all because its denial would
alter experimental results, which is self-contradictory. If it’s not needed, then requiring it would
be regrettable. Specifically, we make no assumption involving infinity or the infinitesimal. Any
general theory must apply to special cases, including simple ones, and it happens that simple
examples are sufficient to eliminate all but the one calculus.
In the last couple of decades there has been an effort to reformulate and reconstruct the
quantum formalism based on probability theory [25, 26, 27, 28] and information theory [29,
30, 31, 32, 33, 34, 35, 36, 37]. Yet we find that the similarities between the quantum rules and
probability/information theory are not due to the fact that one derives from the other, but rather
that they both derive from common principles. We share much of the interpretive aspect of
quantum Bayesianism (QBism) [28]. For us, though, Bayes comes first.
Note that our derivation of the quantum formalism cannot be undermined by any alternative
interpretation or supposed generalisation of probability, or by some differing assumptions there
[38, 39], which might be thought to open the possibility of conflict. Symmetries are silent on
interpretation, and we need only the one common foundation to support the whole edifice.
Our symmetries are necessary and sufficient, but we do not exclude using similarly verifiable
assumptions as sufficient foundation [7]. But, as a matter of logic, any alteration to measure
(which has not been seriously proposed) or to probability (which has often been proposed)
must conflict with our symmetries, and thereby with implementations of our quantum thought-
experiments. For, it must be acknowledged, quantum theory works. So does probability. And
the two are entirely mutually consistent.
6 Acknowledgements
We thank those many colleagues who have guided the evolution of our thought over the past
quarter-century, particularly Ariel Caticha, Seth Chaiken, Keith Earle, Anton Garrett, Steve
Gull, and Oleg Lunin. We also thank Andrei Khrennikov, Julio Stern, and Federico Holik
for invitations to present our efforts leading up to this manuscript. K.H.K. also thanks the
Foundational Questions Institute (FQXi) and those who have worked to support the FQXi essay
contests, for providing an opportunity for researchers to explore their thoughts and ideas on
foundational topics [12]. Yet our greatest debt is to Edwin Jaynes who kept faith with rational
inference through many dark years, and to whose memory we respectfully dedicate this work.
The authors contributed equally to this work. The authors declare no competing financial
interests.
References
[1] Rovelli, C. Relative information at the foundation of physics. 2013, “It from Bit or Bit from
It?” FQXi 2013 Essay Contest (2nd prize).
Preprint at http://www.fqxi.org/community/forum/topic/1816.
[2] Caticha, A. Quantifying rational belief. Bayesian Inference And Maximum Entropy Methods
In Science And Engineering, Oxford MS, USA 2009 (ed. Goggans, P. M. & Chan, C.-Y.), AIP Conf. Proc. 1193, 60–68 (2009).
[3] Cox, R. T. Probability, frequency, and reasonable expectation. Am. J. Phys. 14, 1–13 (1946).
[4] Caticha, A. Consistency, amplitudes, and probabilities in quantum theory. Phys. Rev. A 57,
1572–1582 (1998).
[5] Goyal, P., Knuth, K. H. & Skilling, J. Origin of complex quantum amplitudes and Feyn-
man’s rules. Phys. Rev. A 81, 022109 (2010), (arXiv:0907.0909 [quant-ph]).
[6] Goyal, P. & Knuth, K. H. Quantum theory and probability theory: their relationship and
origin in symmetry. Symmetry 3, 171–206 (2011).
[7] Knuth, K. H. & Skilling, J. Foundations of inference. Axioms 1, 38–73 (2012).
[8] Feynman, R. P. Space-time approach to non-relativistic quantum mechanics. Rev. Mod.
Phys. 20, 367–387 (1948).
[9] Born, M. Zur Quantenmechanik der Stoßvorgänge (quantum mechanics of collision processes). Zeit. für Phys. 38, 803 (1926).
[10] von Neumann, J. Mathematical Foundations Of Quantum Mechanics, 2, (12 ed., Princeton
Univ. Press, 1996).
[11] Hamming, R. W. The unreasonable effectiveness of mathematics. Amer. Math. Monthly
87, 81–90 (1980).
[12] Knuth, K. H. The deeper roles of mathematics in physical laws. Trick or Truth: the Mysterious Connection between Physics and Mathematics (ed. Aguirre, A., Foster, B. & Merali, Z.), Springer Frontiers Collection, Springer-Verlag, Heidelberg, 2016, FQXi 2015 Essay
Contest (3rd prize), (arXiv:1504.06686 [math.HO]), pp. 77–90.
[13] Boole, G. An Investigation Of The Laws Of Thought. (Macmillan, London, 1854).
[14] Knuth, K. H. Measuring on lattices. Bayesian Inference And Maximum Entropy Methods
In Science And Engineering, Oxford MS, USA 2009 (ed. Goggans, P. M. & Chan, C.-Y.), AIP Conf. Proc. 1193, 132–144 (2009), (arXiv:0909.3684v1 [math.GM]).
[15] Knuth, K. H. Lattices and their consistent quantification. (2017), Submitted to Annalen
der Physik.
[16] Laplace, P. S. Theorie Analytique Des Probabilites, 2, Ch. 1. (Courcier Imprimeur, Paris,
1812).
[17] Kolmogorov, A. N. Foundations Of The Theory Of Probability. (Chelsea, New York,
1950). English translation and reprinting of Kolmogorov, A. (1933), Grundbegriffe der
Wahrscheinlichkeitsrechnung (Springer, Berlin).
[18] de Finetti, B. Probabilism. Erkenntnis 31, 169–223 (1989), English translation and reprint-
ing of de Finetti, B. (1931), Probabilismo, Logos (Napoli), pp. 163-219.
https://doi.org/10.1007/BF01236563
[19] Jaynes, E. T. Probability Theory: The Logic Of Science, p. 4 (Cambridge Univ. Press,
Cambridge, 2003).
[20] Weinberg, S. Dreams Of A Final Theory (Vintage, 1992).
[21] Tikochinsky, Y. Feynman rules for probability amplitudes. Int. J. Theor. Phys. 27, 543–549
(1988).
[22] Bell, J. S. On the problem of hidden variables in quantum mechanics. Rev. Mod. Phys. 38,
447–452 (1966).
[23] Kochen, S. & Specker, E. P. The problem of hidden variables in quantum mechanics. J.
Math. Mech. 17, 59–87 (1967).
[24] Jaynes, E. T. Probability theory in science and engineering. Colloquium Lectures in
Pure and Applied Science, 4, p. 4, (Socony-Mobil Oil Company, Inc., Dallas, TX, 1959),
http://bayes.wustl.edu/etj/articles/mobil.pdf.
[25] Youssef, S. Quantum mechanics as Bayesian complex probability theory. Mod. Phys. Lett.
A9, 2571–2586 (1994).
[26] Caves, C. M., Fuchs, C. A. & Schack, R. Quantum probabilities as Bayesian probabilities.
Phys. Rev. A 65, 022305 (2002).
[27] Bub, J. Quantum probabilities as degrees of belief. Studies in History and Philosophy of
Science Part B: Studies in History and Philosophy of Modern Physics 38, 232–254 (2007).
[28] Fuchs, C. A., Mermin, N. D. & Schack, R. An introduction to QBism with an application
to the locality of quantum mechanics (2013), (arXiv:1311.5253 [quant-ph]).
[29] Timpson, C. G. Quantum Information Theory And The Foundations Of Quantum Mechan-
ics. (Oxford Univ. Press, Oxford, 2013).
[30] Rovelli, C. Relational quantum mechanics. Int. J. Theor. Phys. 35, 1637–1678 (1996).
[31] Reginatto, M. Derivation of the equations of nonrelativistic quantum mechanics using the
principle of minimum Fisher information. Phys. Rev. A 58, 1775 (1998).
[32] Zeilinger, A. A foundational principle for quantum mechanics. Foundations of Physics 29,
631–643 (1999).
[33] Fuchs, C. A. Quantum mechanics as quantum information (and only a little more). 2002,
(arXiv:quant-ph/0205039).
[34] Clifton, R., Bub J. & Halvorson, H. Characterizing quantum theory in terms of
information-theoretic constraints. Foundations of Physics 33, 1561–1591 (2003).
[35] Goyal, P. Information-geometric reconstruction of quantum theory. Phys. Rev. A 78,
052120 (2008).
[36] Brukner, C. & Zeilinger, A. Information invariance and quantum probabilities. Founda-
tions of Physics 39, 677–689 (2009).
[37] Wootters, W. K. Communicating through probabilities: does quantum theory optimize the
transfer of information? Entropy 15, 3130–3147 (2013).
[38] Dupre M. J. & Tipler, F. J. New axioms for rigorous Bayesian probability. Bayesian Anal-
ysis 4, 599–606 (2009).
[39] Terenin, A. & Draper, D. Cox’s Theorem and the Jaynesian Interpretation of Probability.
(2017), (arXiv:1507.06597 [math.ST]).