A Derivation of Quantum Theory From Physical Requirements Masanes Muller

arXiv:1004.1483v4 [quant-ph] 16 May 2011

A derivation of quantum theory from physical requirements

Lluís Masanes
ICFO-Institut de Ciencies Fotoniques, Mediterranean Technology Park, 08860 Castelldefels (Barcelona), Spain

Markus P. Müller
Institute of Mathematics, Technical University of Berlin, 10623 Berlin, Germany,
Institute of Physics and Astronomy, University of Potsdam, 14476 Potsdam, Germany and
Perimeter Institute for Theoretical Physics, 31 Caroline Street North, Waterloo, ON N2L 2Y5, Canada.

(Dated: October 5, 2010)

Quantum theory is usually formulated in terms of abstract mathematical postulates, involving Hilbert spaces, state vectors, and unitary operators. In this work, we show that the full formalism of quantum theory can instead be derived from five simple physical requirements, based on elementary assumptions about preparation, transformations and measurements. This is more similar to the usual formulation of special relativity, where two simple physical requirements (the principles of relativity and light speed invariance) are used to derive the mathematical structure of Minkowski space-time. Our derivation provides insights into the physical origin of the structure of quantum state spaces (including a group-theoretic explanation of the Bloch ball and its three-dimensionality), and it suggests several natural possibilities to construct consistent modifications of quantum theory.

I. INTRODUCTION

Quantum theory is usually formulated by postulating the mathematical structure and representation of states, transformations, and measurements. The general physical consequences that follow (like violation of Bell-type inequalities [1], the possibility of performing state tomography with local measurements, or factorization of integers in polynomial time [2]) come as theorems which use the postulates as premises. In this work, this procedure is reversed: we impose five simple physical requirements, and this suffices to single out quantum theory and derive its mathematical formalism uniquely. This is more similar to the usual formulation of special relativity, where two simple physical requirements (the principles of relativity and light speed invariance) are used to derive the mathematical structure of Minkowski space-time and its transformations.

The requirements can be schematically stated as:

1. In systems that carry one bit of information, each state is characterized by a finite set of outcome probabilities.

2. The state of a composite system is characterized by the statistics of measurements on the individual components.

3. All systems that effectively carry the same amount of information have equivalent state spaces.

4. Any pure state of a system can be reversibly transformed into any other.

5. In systems that carry one bit of information, all mathematically well-defined measurements are allowed by the theory.

These requirements are imposed on the framework of generalized probabilistic theories [3–9], which already assumes that some operational notions (preparation, mixture, measurement, and counting relative frequencies of measurement outcomes) make sense. Due to its conceptual simplicity, this framework leaves room for an infinitude of possible theories, allowing for weaker- or stronger-than-quantum non-locality [6, 10–14]. In this work, we show that quantum theory (QT) and classical probability theory (CPT) are very special among those theories: they are the only general probabilistic theories that satisfy the five requirements stated above.

The non-uniqueness of the solution is not a problem, since CPT is embedded in QT, thus QT is the most general theory satisfying the requirements. One can also proceed as Hardy in [4]: if Requirement 4 is strengthened by imposing continuity of the reversible transformations, then CPT is ruled out and QT is the only theory satisfying the requirements. This strengthening can be justified by the continuity of time evolution of physical systems.

It is conceivable that in the future, another theory may replace or generalize QT. Such a theory must violate at least one of our assumptions. The clear meaning of our requirements allows one to explore potential features of such a theory in a straightforward way. The relaxation of each of our requirements constitutes a different way to go beyond QT.

The search for alternative axiomatizations of quantum theory (QT) is an old topic that goes back to Birkhoff and von Neumann [8], and has been approached in many different ways: extending propositional logic [7, 8], using operational primitives [3–6, 9], searching for information-theoretic principles [5, 6, 10, 11, 19–21], building upon the phenomenon of quantum nonlocality [6, 10–13]. Alfsen and Shultz [22] have accomplished a complete characterization of the state spaces of QT from a geometric point of view, but the result does not seem to have an immediate physical meaning. In particular, the fact that the state space of a generalized bit is a three-dimensional ball is an assumption there, while here it is derived from



    physical requirements.

    This work is particularly close to [4, 19], from where it takes some material. More concretely, the multiplicativity of capacities and the Simplicity Axiom from [4] are replaced by Requirement 5. In comparison with [19], the fact that each state of a generalized bit is the mixture of two distinguishable ones, the maximality of the group of reversible transformations and its orthogonality, and the multiplicativity of capacities, are also replaced by Requirement 5.

    Summary of the paper. Section II contains an introduction to the framework of generalized probabilistic theories, where some elementary results are stated without proof. In Section III the five requirements and their significance are explained in full detail. Section IV is the core of this work. It contains the characterization of all theories compatible with the requirements, concluding that the only possibilities are CPT and QT. The Conclusion (Section V) recapitulates the results and adds some remarks. The Appendix contains all lemmas and their proofs.

    II. GENERALIZED PROBABILISTIC THEORIES

    In CPT there can always be a joint probability distribution for all random variables under consideration. The framework of generalized probabilistic theories (GPTs), also called convex operational framework, generalizes this by allowing the possibility of random variables that cannot have a joint probability distribution, or cannot be simultaneously measured (like noncommuting observables in QT).

    This framework assumes that at some level there is a classical reality, where it makes sense to talk about experimentalists performing basic operations such as: preparations, mixtures, measurements, and counting relative frequencies of outcomes. These are the primary concepts of this framework. It also provides a unified way for all GPTs to represent states, transformations and measurements. A particular GPT specifies which of these are allowed, but it does not tell their correspondence with actual experimental setups. On its own, a GPT can still make nontrivial predictions like: the maximal violation of a Bell inequality [1], the complexity-theoretic computational power [2, 18], and in general, all information-theoretic properties of the theory [6].

    The framework of GPTs can be stated in different ways, but all lead to the same formalism [3–9]. This formalism is presented in this section at a very basic level, providing some elementary results without proofs.

    A. States

    Definition of system. To a setup like FIG. 1 we associate a system if for each configuration of the preparation, transformation and measurement devices, the relative frequencies of the outcomes tend to a unique probability distribution (in the large sample limit).

    [Figure 1. Labels: release button; physical system; T; outcomes $x$ and $\bar{x}$.]

    FIG. 1: General experimental set up. From left to right there are the preparation, transformation and measurement devices. As soon as the release button is pressed, the preparation device outputs a physical system in the state specified by the knobs. The next device performs the transformation specified by its knobs (which in particular can be "do nothing"). The device on the right performs the measurement specified by its knobs, and the outcome ($x$ or $\bar{x}$) is indicated by the corresponding light.

    The probability of a measurement outcome $x$ is denoted by $p(x)$. This outcome can be associated to a binary measurement which tells whether $x$ happens or not (this second event $\bar{x}$ has probability $p(\bar{x}) = 1 - p(x)$). The above definition of system allows one to associate to each preparation procedure a list of probabilities for the outcomes of all measurements that can be performed on a system. As we show in Subsection IV C below, our requirements imply that all these probabilities $p(x)$ are determined by a finite set of them; the smallest such set is used to represent the state

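    $$
    \omega =
    \begin{bmatrix} 1 \\ p(x_1) \\ \vdots \\ p(x_d) \end{bmatrix}
    =
    \begin{bmatrix} \omega^0 \\ \omega^1 \\ \vdots \\ \omega^d \end{bmatrix}
    \in \mathcal{S} \subset \mathbb{R}^{d+1}. \tag{1}
    $$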

    The measurement outcomes that characterize the state, $x_1, \ldots, x_d$, are called fiducial, and in general, there is more than one set of them (for example, a spin-$\frac{1}{2}$ particle in QT is characterized by the spin in any 3 linearly-independent directions). Note that each of the fiducial outcomes can correspond to a different measurement. The redundant component $\omega^0 = 1$ is reminiscent of QT, where one of the diagonal entries of a density matrix is redundant, since they sum up to 1. In fact $\omega^0 \neq 1$ is sometimes used to represent unnormalized states, but not in this paper, where only normalized states are considered. The redundant component $\omega^0$ allows one to use the tensor-product formalism in composite systems (Subsection II D), which simplifies the notation.

    The set of all allowed states $\mathcal{S}$ is convex [23], because if $\omega_1, \omega_2 \in \mathcal{S}$ then one can prepare $\omega_1$ with probability $q$ and $\omega_2$ with probability $1-q$, effectively preparing the state $q\omega_1 + (1-q)\omega_2$. The number of fiducial probabilities $d$ is equal to the (affine) dimension of $\mathcal{S}$, otherwise one fiducial probability would be functionally related to the others, and hence redundant.

    Suppose there is an $\mathbb{R}^{d+1}$-vector $\omega \notin \mathcal{S}$ which is in the topological closure of $\mathcal{S}$, that is, $\omega$ can be approximated by states $\omega' \in \mathcal{S}$ to arbitrary accuracy. Since there is no observable physical difference between perfect preparation and arbitrarily good preparation, we will consider $\omega$ to be a valid state and add it to the state space. This does not change the physical predictions of the theory, but it has the mathematical consequence that state spaces become topologically closed. Since state vectors (1) are bounded, and we are in finite dimensions (shown in Subsection IV C), state spaces $\mathcal{S}$ are compact convex sets [23].

    The pure states of a state space $\mathcal{S}$ are the ones that cannot be written as mixtures: $\omega \neq q\omega_1 + (1-q)\omega_2$ with $\omega_1 \neq \omega_2$ and $0 < q < 1$. Since $\mathcal{S}$ is compact and convex, all states are mixtures of pure states [23].
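The representation (1) and the mixing argument can be illustrated with a minimal Python sketch. The helper names `state` and `mix`, and the choice of three fiducial outcomes (read here as "spin up" along x, y, z for a qubit), are illustrative assumptions and do not come from the paper.

```python
from fractions import Fraction as F

def state(*fiducial_probs):
    """Build a normalized state vector (1, p(x_1), ..., p(x_d)) as in Eq. (1)."""
    probs = [F(p) for p in fiducial_probs]
    assert all(0 <= p <= 1 for p in probs), "fiducial entries must be probabilities"
    return [F(1)] + probs

def mix(q, w1, w2):
    """Convex mixture q*w1 + (1-q)*w2, taken component by component."""
    q = F(q)
    return [q * a + (1 - q) * b for a, b in zip(w1, w2)]

# Hypothetical example with d = 3 fiducial outcomes.
up_z = state(F(1, 2), F(1, 2), 1)    # third fiducial outcome is certain
down_z = state(F(1, 2), F(1, 2), 0)  # third fiducial outcome never occurs

# Prepare up_z with probability 1/4 and down_z with probability 3/4.
mixed = mix(F(1, 4), up_z, down_z)
print(mixed)
```

The redundant component stays equal to 1 under mixing, which is what makes the mixture a normalized state again.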

    B. Measurements

    The probability of measurement outcome $x$ when the system is in state $\omega \in \mathcal{S}$ is given by a function $\Omega_x(\omega)$. Suppose the system is prepared in the mixture $q\omega_1 + (1-q)\omega_2$, then the relative frequency of outcome $x$ does not depend on whether the label of the actual preparation $\omega_k$ is ignored before or after the measurement, hence

    $$
    \Omega_x\bigl(q\omega_1 + (1-q)\omega_2\bigr) = q\,\Omega_x(\omega_1) + (1-q)\,\Omega_x(\omega_2).
    $$

    This means that the function $\Omega_x$ is affine on $\mathcal{S}$. The redundant component $\omega^0$ in (1) allows one to write this function as a linear map $\Omega_x : \mathbb{R}^{d+1} \to \mathbb{R}$ [3, 6].

    An effect is a linear map $\Omega : \mathbb{R}^{d+1} \to \mathbb{R}$ such that $\Omega(\omega) \in [0, 1]$ for all states $\omega \in \mathcal{S}$. Every function $\Omega_x$ associated to an outcome probability $p(x)$ is an effect. The converse is not necessarily true: the framework of GPTs allows one to construct theories where some effects do not represent possible measurement outcomes. These restrictions are analogous to superselection rules, where some (mathematically well-defined) states are not allowed by the physical theory. This is related to Requirement 5.

    A tight effect $\Omega$ is one for which there are two states $\omega_0, \omega_1 \in \mathcal{S}$ satisfying $\Omega(\omega_0) = 0$ and $\Omega(\omega_1) = 1$.

    An $n$-outcome measurement is specified by $n$ effects $\Omega_1, \ldots, \Omega_n$ such that $\Omega_1(\omega) + \cdots + \Omega_n(\omega) = 1$ for all $\omega \in \mathcal{S}$. The number $\Omega_a(\omega)$ is the probability of outcome $a$ when the measurement is performed on the state $\omega$. The states $\omega_1, \ldots, \omega_n$ are distinguishable if there is an $n$-outcome measurement such that $\Omega_a(\omega_b) = \delta_{a,b}$, where $\delta_{a,b} = 1$ if $a = b$, and $\delta_{a,b} = 0$ if $a \neq b$.

    The capacity of a state space $\mathcal{S}$ is the size of the largest family of distinguishable states, and is denoted by $c$. This is the amount of classical information that can be transmitted by the corresponding type of system, in a single-shot error-free procedure. (In QT the capacity of a system is the dimension of its corresponding Hilbert space, which must not be confused with the dimension $d = c^2 - 1$ of the state space, that is, of the set of $c \times c$ complex matrices that are positive and have unit trace.) A complete measurement on $\mathcal{S}$ is one capable of distinguishing $c$ states.
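As a rough illustration of these definitions, the sketch below stores effects as row vectors acting on states of the form (1). It then checks affinity, normalization of a two-outcome measurement, and distinguishability. The system with a single fiducial outcome and all helper names are assumptions made for this example only.

```python
from fractions import Fraction as F

def apply_effect(effect, state):
    """Evaluate a linear effect (row vector) on a state (column vector)."""
    return sum(e * s for e, s in zip(effect, state))

def is_measurement(effects, states):
    """Check that the effects sum to 1 on every given state."""
    return all(sum(apply_effect(e, w) for e in effects) == 1 for w in states)

def distinguishes(effects, states):
    """Check Omega_a(omega_b) = delta_{a,b} for all a, b."""
    return all(
        apply_effect(e, w) == (1 if a == b else 0)
        for a, e in enumerate(effects)
        for b, w in enumerate(states)
    )

# Hypothetical system with d = 1: states are (1, p(x_1)).
w1 = [F(1), F(1)]             # outcome x_1 is certain
w2 = [F(1), F(0)]             # outcome x_1 never occurs
effect_x = [F(0), F(1)]       # returns p(x_1)
effect_not_x = [F(1), F(-1)]  # returns 1 - p(x_1), using the redundant component

# Affinity: evaluating on a mixture equals mixing the evaluations.
q = F(1, 3)
mixture = [q * a + (1 - q) * b for a, b in zip(w1, w2)]
lhs = apply_effect(effect_x, mixture)
rhs = q * apply_effect(effect_x, w1) + (1 - q) * apply_effect(effect_x, w2)

print(lhs == rhs)
print(is_measurement([effect_x, effect_not_x], [w1, w2, mixture]))
print(distinguishes([effect_x, effect_not_x], [w1, w2]))
```

Since two states are distinguished by one measurement, the capacity of this toy state space is at least 2.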

    C. Transformations

    Each type of system has associated to it: a state space, a set of measurements, and a set of transformations. A transformation $T$ is a map $T : \mathcal{S} \to \mathcal{S}$. As for measurements, if a state is prepared as a mixture $q\omega_1 + (1-q)\omega_2$, it does not matter whether the label of the actual preparation $\omega_k$ is ignored before or after the transformation. Hence

    $$
    T\bigl(q\omega_1 + (1-q)\omega_2\bigr) = q\,T(\omega_1) + (1-q)\,T(\omega_2),
    $$

    which implies that $T$ is an affine map. The redundant component $\omega^0$ in (1) allows one to extend $T$ to a linear map $T : \mathbb{R}^{d+1} \to \mathbb{R}^{d+1}$ [3, 6].

    A transformation $T$ is reversible if its inverse $T^{-1}$ exists and belongs to the set of transformations allowed by the theory. The set of (allowed) reversible transformations of a particular state space $\mathcal{S}$ forms a group $\mathcal{G}$. For the same reason as for the state space itself, we will assume that the group of reversible transformations is topologically closed. Previously we have seen that a state space $\mathcal{S}$ is bounded, hence the corresponding group of transformations $\mathcal{G}$ is bounded, too. In summary, groups of transformations are compact [24].
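The role of the redundant component in turning an affine map into a linear one can be sketched as follows. The block-matrix construction is the standard homogeneous-coordinates trick, offered here as an illustration rather than as the paper's own construction. The function names and the bit-flip example are hypothetical.

```python
from fractions import Fraction as F

def extend_affine(A, b):
    """Return the (d+1)x(d+1) matrix [[1, 0], [b, A]].

    Acting on (1, p) it yields (1, A p + b), so the affine map p -> A p + b
    becomes linear on vectors whose first component is 1.
    """
    d = len(b)
    top = [1] + [0] * d
    return [top] + [[b[i]] + list(A[i]) for i in range(d)]

def matvec(M, v):
    """Plain matrix-vector product."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

# Hypothetical example: the bit flip p -> 1 - p on a system with d = 1.
T = extend_affine([[-1]], [1])
omega = [F(1), F(1, 4)]

flipped = matvec(T, omega)
restored = matvec(T, flipped)  # the flip is its own inverse, hence reversible
print(T, flipped, restored)
```

Here the inverse of `T` is `T` itself, so a theory that allows this flip counts it as a reversible transformation.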

    D. Composite systems

    Definition of composite system. Two systems $A$, $B$ constitute a composite system, denoted $AB$, if a measurement for $A$ together with a measurement for $B$ uniquely specifies a measurement for $AB$. This means that if $x$ and $y$ are measurement outcomes on $A$ and $B$ respectively, the pair $(x, y)$ specifies a unique measurement outcome on $AB$, whose probability distribution $p(x, y)$ does not depend on the temporal order in which the subsystems are measured.

    The fact that subsystems are themselves systems implies that each has a well-defined reduced state $\omega_A$, $\omega_B$ which does not depend on which transformations and measurements are performed on the other subsystem (see definition of system in Subsection II A). This is often referred to as no-signaling. Let $x_1, \ldots, x_{d_A}$ be the fiducial measurements of system $A$, and $y_1, \ldots, y_{d_B}$ the ones of $B$. The no-signaling constraints are

    $$
    \begin{aligned}
    p(x_i) &= p(x_i, y_j) + p(x_i, \bar{y}_j), \\
    p(y_j) &= p(x_i, y_j) + p(\bar{x}_i, y_j)
    \end{aligned}
    \tag{2}
    $$

    for all $i, j$.

    An assumption which is often postulated additionally in the GPT context is Requirement 2, which says that the state of a composite system is completely characterized by the statistics of measurements on the subsystems, that is, $p(x, y)$. This and no-signaling (2) imply that states in $AB$ can be represented on the tensor product vector space [3] as

    $$
    \omega_{AB} =
    \begin{bmatrix} 1 \\ \vdots \\ p(x_i) \\ \vdots \\ p(y_j) \\ \vdots \\ p(x_i, y_j) \\ \vdots \end{bmatrix}
    \in \mathcal{S}_{AB} \subset \mathbb{R}^{d_A+1} \otimes \mathbb{R}^{d_B+1}. \tag{3}
    $$

    The joint probability of two arbitrary local measurement outcomes $x$, $y$ is given by

    $$
    p(x, y) = (\Omega_x \otimes \Omega_y)(\omega_{AB}), \tag{4}
    $$

    where $\Omega_x$ is the effect representing $x$ in $A$, that is $p(x) = \Omega_x(\omega_A)$, and analogously for $\Omega_y$ [3]. (The term "local" is used when referring to subsystems, and has nothing to do with spatial locations.) In other words, if $\{\Omega^A_1, \ldots, \Omega^A_n\}$ is an $n$-outcome measurement on $A$, and $\{\Omega^B_1, \ldots, \Omega^B_m\}$ is an $m$-outcome measurement on $B$, then $\{\Omega^A_a \otimes \Omega^B_b \mid a = 1, \ldots, n;\ b = 1, \ldots, m\}$ defines a measurement on $AB$ with $nm$ outcomes. Local transformations act on the global state as

    $$
    \omega_{AB} \to (T_A \otimes T_B)(\omega_{AB}), \tag{5}
    $$

    where $T_A$ is the matrix that represents the transformation in $A$, and analogously for $T_B$ [3]. The reduced states

    $$
    \omega_A = \begin{bmatrix} 1 \\ \vdots \\ p(x_i) \\ \vdots \end{bmatrix},
    \qquad
    \omega_B = \begin{bmatrix} 1 \\ \vdots \\ p(y_j) \\ \vdots \end{bmatrix},
    \tag{6}
    $$

    are obtained from $\omega_{AB}$ by picking the right components in (3). Alternatively, reduced states can be defined by $\Omega_A(\omega_A) = (\Omega_A \otimes \mathbf{1})(\omega_{AB})$ for any effect $\Omega_A$ in $A$, where $\mathbf{1}(\omega_B) = \omega_B^0$ is the unit effect. The reduced state $\omega_A$ must belong to the state space of subsystem $A$, denoted $\mathcal{S}_A$, and any state in $\mathcal{S}_A$ must be the reduction of a state from $\mathcal{S}_{AB}$. (Analogously for subsystem $B$.) This implies that all product states

    $$
    \omega_{AB} = \omega_A \otimes \omega_B \tag{7}
    $$

    are contained in $\mathcal{S}_{AB}$ [3], and similarly, all tensor products of local measurements and transformations are allowed on $AB$.

    Given two fixed state spaces $\mathcal{S}_A$ and $\mathcal{S}_B$, the previous discussion imposes constraints on the state space of the composite system $\mathcal{S}_{AB}$. However, there are still many different possible joint state spaces $\mathcal{S}_{AB}$, and some of them allow for larger violations of Bell inequalities than QT. In fact, this has been extensively studied [5, 6, 10–14], and is one of the reasons for the popularity of generalized probabilistic theories.

    Nothing prevents Bob's system from being composite itself; hence one can recursively extend the definition of composite system and formulas (3), (4), (5), and (7) to more parties.
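For product states, formulas (4) and (7) and the no-signaling constraints (2) can be checked numerically. The sketch below assumes two hypothetical subsystems with a single fiducial outcome each, and it uses the Kronecker ordering of components, which differs from the ordering drawn in (3) only by a permutation.

```python
from fractions import Fraction as F

def kron(u, v):
    """Kronecker (tensor) product of two vectors, row-major ordering."""
    return [a * b for a in u for b in v]

def apply_effect(effect, state):
    """Evaluate a linear effect (row vector) on a state (column vector)."""
    return sum(e * s for e, s in zip(effect, state))

# Hypothetical subsystems with d_A = d_B = 1: states are (1, p).
omega_A = [F(1), F(1, 3)]  # p(x) = 1/3
omega_B = [F(1), F(3, 4)]  # p(y) = 3/4
omega_AB = kron(omega_A, omega_B)  # product state, Eq. (7)

unit = [F(1), F(0)]         # unit effect, returns the redundant component
eff = [F(0), F(1)]          # effect for the fiducial outcome
eff_not = [F(1), F(-1)]     # effect for the complementary outcome

p_xy = apply_effect(kron(eff, eff), omega_AB)         # p(x, y), Eq. (4)
p_x_noty = apply_effect(kron(eff, eff_not), omega_AB)  # p(x, not y)
p_x = apply_effect(kron(eff, unit), omega_AB)          # reduced-state probability

print(p_xy, p_x_noty, p_x)
```

The marginal `p_x` equals `p_xy + p_x_noty` and coincides with the value stored in `omega_A`, as the first line of (2) requires.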

    E. Equivalent state spaces

    Let $L : \mathcal{S} \to \mathcal{S}'$ be an invertible affine map. If all states are transformed as $\omega \to L(\omega)$, and all effects on $\mathcal{S}$ are transformed as $\Omega \to \Omega \circ L^{-1}$, then the outcome probabilities $\Omega(\omega)$ are kept unchanged. Analogously, if all transformations on $\mathcal{S}$ are mapped as $T \to L \circ T \circ L^{-1}$ then their action on the states is the same. The new state space $\mathcal{S}'$, together with the transformed effects and transformations, is then just a different representation of $\mathcal{S}$. In this case, we call $\mathcal{S}$ and $\mathcal{S}'$ equivalent. In the new representation, the entries of $\omega$ need not be probabilities as in (1), but it may have other advantages. In this work, several representations are used.

    In the standard formalism of QT, states are represented by density matrices, however they can also be represented as in (1).

    Changing the set of fiducial measurements is a particular type of $L$-transformation. For example, if the components of the Bloch vector (of a quantum spin-$\frac{1}{2}$ particle) correspond to spin measurements in non-orthogonal directions, then the Bloch sphere becomes an ellipsoid.
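The invariance claimed in this subsection follows in one line each from the definitions. The display below spells this out as a reading aid, using only the map $L$, an effect $\Omega$ and a transformation $T$ from the text.

```latex
\begin{align*}
(\Omega \circ L^{-1})\bigl(L(\omega)\bigr)
  &= \Omega\bigl(L^{-1}(L(\omega))\bigr) = \Omega(\omega), \\
(L \circ T \circ L^{-1})\bigl(L(\omega)\bigr)
  &= L\bigl(T(\omega)\bigr).
\end{align*}
```

The first line says that outcome probabilities are unchanged. The second says that transforming and then changing representation gives the same result as changing representation and then applying the transformed map.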

    F. Instances of generalized probabilistic theories

    QT is an instance of GPT, and can be specified as follows. The state space $\mathcal{S}_c$ with capacity $c$ is equivalent to the set of complex $c \times c$-matrices $\rho$ such that $\rho \geq 0$ and $\mathrm{tr}\,\rho = 1$. This set has dimension $d_c = c^2 - 1$, and its pure states are rank-one. The effects on $\mathcal{S}_c$ have the form $\Omega(\rho) = \mathrm{tr}(M\rho)$, where $M$ is a complex $c \times c$-matrix such that $0 \leq M \leq I$. The reversible transformations act as $\rho \to V \rho V^\dagger$ with $V \in \mathrm{SU}(c)$. The capacity of a composite system $AB$ is the product of the capacities for the subsystems, $c_{AB} = c_A c_B$.

    CPT is another instance of GPT, and can be specified as follows. The state space $\mathcal{S}_c$ with capacity $c$ is equivalent to the set of $c$-outcome probability distributions $[p(1), \ldots, p(c)]$, which has dimension $d_c = c - 1$ (in geometric terms, each $\mathcal{S}_c$ is a simplex). The pure states are the deterministic distributions $p(a) = \delta_{a,b}$ with $b = 1, \ldots, c$. The $c$-outcome measurement with effects $\Omega_a(\omega) = p(a)$ for $a = 1, \ldots, c$ distinguishes the $c$ pure states, hence it is complete. Any other measurement is a function of this one. The reversible transformations act by permuting the entries of the state $[p(1), \ldots, p(c)]$. The capacity of a composite system is also $c_{AB} = c_A c_B$. Note that CPT can be obtained by restricting the states of QT to diagonal matrices. In other words, CPT is embedded in QT.

    An instance of GPT that is not observed in nature is generalized no-signaling theory [6], colloquially called boxworld. By definition, state spaces contain all correlations (3) satisfying the no-signaling constraints (2). Such state spaces have finitely many pure states, and some of them violate Bell inequalities more strongly than any quantum state [12]. The effects in boxworld are all generated by products of local effects. The group of reversible transformations consists only of relabellings of local measurements and their outcomes, permutations of subsystems, and combinations thereof [14].
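A concrete boxworld correlation may help here. The sketch below uses the Popescu-Rohrlich (PR) box, a standard example that the text above does not name, so its appearance here is an assumption of this illustration. It checks that the box is no-signaling and that its CHSH value exceeds the quantum (Tsirelson) bound of 2√2.

```python
import math
from fractions import Fraction as F
from itertools import product

def pr_box(a, b, x, y):
    """p(a, b | x, y): outputs are uniformly random subject to a XOR b = x AND y."""
    return F(1, 2) if (a ^ b) == (x & y) else F(0)

def marginal_a(a, x, y):
    """Probability of Alice's output a, summed over Bob's output b."""
    return sum(pr_box(a, b, x, y) for b in (0, 1))

def marginal_b(b, x, y):
    """Probability of Bob's output b, summed over Alice's output a."""
    return sum(pr_box(a, b, x, y) for a in (0, 1))

# No-signaling: each party's marginal is independent of the other party's input.
no_signaling = all(
    marginal_a(a, x, 0) == marginal_a(a, x, 1) for a, x in product((0, 1), repeat=2)
) and all(
    marginal_b(b, 0, y) == marginal_b(b, 1, y) for b, y in product((0, 1), repeat=2)
)

def correlator(x, y):
    """Expectation of (-1)^(a XOR b) for inputs x, y."""
    return sum((-1) ** (a ^ b) * pr_box(a, b, x, y) for a, b in product((0, 1), repeat=2))

chsh = correlator(0, 0) + correlator(0, 1) + correlator(1, 0) - correlator(1, 1)
tsirelson = 2 * math.sqrt(2)  # maximal CHSH value attainable in quantum theory

print(no_signaling, chsh, chsh > tsirelson)
```

The value 4 is the algebraic maximum of the CHSH expression, which is why such correlations are described as stronger than quantum.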

    III. THE REQUIREMENTS

    This section contains the precise statement of the requirements, each followed by explanations about its significance.

    Requirement 1 (Finiteness). A state space with capacity $c = 2$ has finite dimension $d$.

    If this did not hold, the characterization of a state of a generalized bit would require infinitely many outcome probabilities, making state estimation impossible. It is shown below that this requirement, together with the others, implies that all state spaces with finite capacity $c$ have finite dimension.

    Requirement 2 (Local tomography). The state of a composite system $AB$ is completely characterized by the statistics of measurements on the subsystems $A$, $B$.

    In other words, state tomography [3] can be performed locally. This is equivalent to the constraint

    $$
    (d_{AB} + 1) = (d_A + 1)(d_B + 1) \tag{8}
    $$

    [3, 4]. This requirement can be recursively extended to more parties by letting subsystems $A$, $B$ be themselves composite.
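The dimension-counting constraint (8) is easy to test against the instances of Subsection II F. The sketch below does this for QT and CPT, using $c_{AB} = c_A c_B$ as stated there. As a contrast it also includes quantum theory over the real numbers, which is not discussed in the text. Its dimension formula (real symmetric unit-trace matrices) and the multiplicativity of its capacities are assumptions added for this illustration.

```python
def d_quantum(c):
    """Dimension of the quantum state space with capacity c (from the text)."""
    return c * c - 1

def d_classical(c):
    """Dimension of the classical simplex with capacity c (from the text)."""
    return c - 1

def d_real_quantum(c):
    """Assumed contrast case: real symmetric c x c matrices with unit trace."""
    return c * (c + 1) // 2 - 1

def locally_tomographic(dim, c_a, c_b):
    """Check Eq. (8), assuming capacities multiply: c_AB = c_A * c_B."""
    return dim(c_a * c_b) + 1 == (dim(c_a) + 1) * (dim(c_b) + 1)

for name, dim in [("QT", d_quantum), ("CPT", d_classical), ("real QT", d_real_quantum)]:
    print(name, locally_tomographic(dim, 2, 2), locally_tomographic(dim, 2, 3))
```

Under these assumptions QT and CPT satisfy (8) for every pair of capacities, while the real-number variant already fails for two generalized bits (10 parameters against 9).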

    Requirement 3 (Equivalence of subspaces). Let $\mathcal{S}_c$ and $\mathcal{S}_{c-1}$ be systems with capacities $c$ and $c-1$, respectively. If $\Omega_1, \ldots, \Omega_c$ is a complete measurement on $\mathcal{S}_c$, then the set of states $\omega \in \mathcal{S}_c$ with $\Omega_c(\omega) = 0$ is equivalent to $\mathcal{S}_{c-1}$.

    The notions of complete measurements and equivalent state spaces are defined in Subsections II B and II E. In particular, equivalence of $\mathcal{S}_{c-1}$ and

    $$
    \mathcal{S}'_{c-1} := \{\omega \in \mathcal{S}_c : \Omega_c(\omega) = 0\} \subset \mathcal{S}_c \tag{9}
    $$

    implies that all measurements and reversible transformations on one of them can be implemented on the other.

    This requirement, first introduced in [4], implies that all state spaces with the same capacity are equivalent: if $\mathcal{S}_{c-1}$ and $\tilde{\mathcal{S}}_{c-1}$ are state spaces with capacity $c-1$, then both are equivalent to (9), hence they are equivalent to each other. In other words, the only property that characterizes the type of system is the capacity for carrying information. If we start with $\mathcal{S}_c$ and apply Requirement 3 recursively, we get a more general formulation: consider any subset of outcomes $\{a_1, \ldots, a_{c'}\} \subseteq \{1, \ldots, c\}$ of the complete measurement $\Omega_1, \ldots, \Omega_c$, then the set of states $\omega \in \mathcal{S}_c$ with

    $$
    \Omega_{a_1}(\omega) + \cdots + \Omega_{a_{c'}}(\omega) = 1 \tag{10}
    $$

    is equivalent to the state space $\mathcal{S}_{c'}$ with capacity $c'$. This provides an onion-like structure for all state spaces, $\mathcal{S}_1 \subset \mathcal{S}_2 \subset \mathcal{S}_3 \subset \cdots$

    The particular structure of QT simplifies the task of assigning a state space to a physical system or experimental setup. It is not necessary to consider all possible states of the system, but instead, the relevant ones for the context being analyzed. For example, an atom is sometimes modeled with a state space having two distinguishable states ($c = 2$), even though its constituents have many more degrees of freedom. In particular, if we know that only two energy levels are populated with non-zero probability, we can ignore all others and effectively get a genuine quantum 2-level state space. In a theory where this is not true, the effective state space might depend on how many unpopulated energy levels are ignored, or on the detailed internal state of the electron, for example. In order to avoid pathologies like this, we postulate Requirement 3.

    Requirement 4 (Symmetry). For every pair of pure states $\omega_1, \omega_2 \in \mathcal{S}$ there is a reversible transformation $G$ mapping one onto the other: $G(\omega_1) = \omega_2$.

    The set of reversible transformations of a state space $\mathcal{S}_c$ forms a group, denoted $\mathcal{G}_c$. This group endows $\mathcal{S}_c$ with a symmetry, which makes all pure states equivalent. A group $\mathcal{G}_c$ is said to be continuous if it is topologically connected: any transformation is the composition of many infinitesimal ones [24]. Hardy invokes the continuity of time-evolution in physical systems to justify the continuity of reversible transformations [3, 4]; in this case, state spaces $\mathcal{S}_c$ must have infinitely-many pure states; this rules out CPT and singles out QT. However, all the analysis in this work is done without imposing continuity, since we find it very interesting that the only theory with state spaces having finitely-many pure states, and satisfying the requirements, is CPT.
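For CPT, Requirement 4 can be verified by brute force, because the reversible transformations are the permutations of the entries of a probability distribution (Subsection II F). The sketch below is a minimal check for small capacities. The function names are hypothetical, and states are written as distributions $[p(1), \ldots, p(c)]$ rather than in the form (1).

```python
from itertools import permutations

def pure_states(c):
    """Deterministic distributions p(a) = delta_{a,b}, one for each b."""
    return [tuple(1 if a == b else 0 for a in range(c)) for b in range(c)]

def act(perm, state):
    """Apply a permutation: entry i of the state is moved to position perm[i]."""
    out = [0] * len(state)
    for i, p in enumerate(state):
        out[perm[i]] = p
    return tuple(out)

def is_transitive(c):
    """Check that every pure state can be mapped onto every other one."""
    group = list(permutations(range(c)))
    states = pure_states(c)
    return all(any(act(g, s) == t for g in group) for s in states for t in states)

print([is_transitive(c) for c in (2, 3, 4)])
```

The group here is finite and the number of pure states is `c`, which matches the remark that CPT is the case with finitely many pure states.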

    Requirement 5 (All measurements allowed). All effects on $\mathcal{S}_2$ are outcome probabilities of possible measurements.

    It is shown below that, in combination with the other requirements, this implies that all effects on all state spaces (with arbitrary $c$) appear as outcome probabilities of measurements in the resulting theory. Note that Requirement 5 has non-trivial consequences in conjunction with the other requirements: adding effects as allowed measurements to a physical theory extends the applicability of Requirement 3.

    For completeness, we would like to mention that Requirement 5 can be replaced by the following postulate, which was first put forward in an interesting paper that appeared after completion of this work [34]. It calls a state completely mixed if it is in the relative interior of state space. See Lemma 9 in the appendix for how the proof of our main result has to be modified in this case.

    Requirement 5' [34]. If a state is not completely mixed, then there exists at least one state that can be perfectly distinguished from it.

    IV. CHARACTERIZATION OF ALL THEORIES SATISFYING THE REQUIREMENTS

    A. The maximally-mixed state

    We use the following notation: the system with capacity $c$ has state space $\mathcal{S}_c$ with dimension $d_c$ and group of reversible transformations $\mathcal{G}_c$. The group $\mathcal{G}_c$ is compact (Section II C), and hence, has a normalized invariant Haar measure [26]. This allows one to define the maximally-mixed state

    $$
    \mu_c = \int_{\mathcal{G}_c} G(\omega)\, dG \in \mathcal{S}_c, \tag{11}
    $$

    where $\omega \in \mathcal{S}_c$ is an arbitrary pure state. It follows from Requirement 4 that the resulting state $\mu_c$ does not depend on the choice of the pure state $\omega$. By construction, the maximally-mixed state is invariant:

    $$
    G(\mu_c) = \mu_c \quad \text{for all } G \in \mathcal{G}_c. \tag{12}
    $$

    Moreover, Lemma 1 shows that it is the only invariant state in $\mathcal{S}_c$ (this lemma and all others are stated and proven in the appendix).
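For a finite group the normalized Haar measure is the uniform average over group elements, so (11) can be evaluated exactly in the classical case. The sketch below does this for CPT, where the group consists of all permutations of the entries (Subsection II F). The function names are hypothetical, and states are written as distributions $[p(1), \ldots, p(c)]$.

```python
from fractions import Fraction as F
from itertools import permutations

def act(perm, state):
    """Apply a permutation: entry i of the state is moved to position perm[i]."""
    out = [F(0)] * len(state)
    for i, p in enumerate(state):
        out[perm[i]] = p
    return out

def maximally_mixed(c, pure_index=0):
    """Group average of one pure state, the finite-group analogue of Eq. (11)."""
    pure = [F(1) if a == pure_index else F(0) for a in range(c)]
    group = list(permutations(range(c)))
    total = [F(0)] * c
    for g in group:
        total = [t + x for t, x in zip(total, act(g, pure))]
    return [t / len(group) for t in total]

mu_3 = maximally_mixed(3)
invariant = all(act(g, mu_3) == mu_3 for g in permutations(range(3)))  # Eq. (12)
print(mu_3, invariant)
```

The result is the uniform distribution regardless of which pure state is averaged, in line with the remark that $\mu_c$ does not depend on the choice of pure state.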

    B. The generalized bit

    A generalized bit is a system with capacity two. For any state $\omega \in \mathcal{S}_2$ in the standard representation (1), its Bloch representation $\hat{\omega}$ is defined by

    $$
    \hat{\omega} = 2
    \begin{bmatrix} p(x_1) - \mu_2^1 \\ \vdots \\ p(x_{d_2}) - \mu_2^{d_2} \end{bmatrix}
    \in \hat{\mathcal{S}}_2 \subset \mathbb{R}^{d_2}. \tag{13}
    $$

    States in the Bloch representation do not have the redundant component $\omega^0$, so equations (4, 5, 7) become less simple. The invertible map $L : \mathcal{S}_2 \to \hat{\mathcal{S}}_2$ is affine but not linear; hence, effects in the Bloch representation ($\hat{\Omega} = \Omega \circ L^{-1}$) are affine but not necessarily linear. The same applies to transformations ($\hat{G} = L \circ G \circ L^{-1}$), however, the maximally-mixed state in the Bloch representation is the null vector $\hat{\mu}_2 = 0$, therefore (12) becomes $\hat{G}(0) = 0$, which implies that $\hat{G}$ acts linearly (as a matrix).

    FIG. 2: The left figure is a state space whose boundary consists of facets (like $\Omega = 1$). Each facet contains infinitely many states ($\Omega = 1$ contains $\omega_1$, $\omega_2$ and all $\omega_{\rm mix} = q\,\omega_1 + (1-q)\,\omega_2$). The right figure is a state space whose boundary has no facets. Any state space has supporting hyperplanes containing a unique state (like $\Omega_{\rm one} = 1$ in both figures).

    Theorem 1. A state in $\hat{\mathcal{S}}_2$ is pure if and only if it belongs to the boundary $\partial\hat{\mathcal{S}}_2$.

    Proof. In any convex set, pure states belong to the boundary [23]. Let us see the converse.

    It is shown in [27] that any compact convex set has a supporting hyperplane containing exactly one point of the set. Translated to our language: there is a tight effect $\Omega_{\rm one}$ on $\hat{\mathcal{S}}_2$ such that only one state $\omega_{\rm one} \in \hat{\mathcal{S}}_2$ satisfies $\Omega_{\rm one}(\omega_{\rm one}) = 1$; this is illustrated in FIG. 2. According to Requirement 5, the effect $\Omega_{\rm one}$ corresponds to a valid measurement outcome, and so does $\mathbf{1} - \Omega_{\rm one}$, where $\mathbf{1}(\omega) = 1$ for all $\omega \in \hat{\mathcal{S}}_2$. Thus, the two effects $\Omega_{\rm one}$ and $\mathbf{1} - \Omega_{\rm one}$ define a complete measurement on $\hat{\mathcal{S}}_2$. Imposing Requirement 3 on the single outcome $\Omega_{\rm one}$ constrains the state space with unit capacity $\mathcal{S}_1$ to contain only one state.

    Suppose there is a point in the boundary $\omega_{\rm mix} \in \partial\hat{\mathcal{S}}_2$ which is not pure: $\omega_{\rm mix} = q\,\omega_1 + (1-q)\,\omega_2$ with $\omega_1 \neq \omega_2$ and $0 < q < 1$. Every point in the boundary of a compact convex set has a supporting hyperplane which contains it [23]. In our language: there is a tight effect $\Omega$ on $\hat{\mathcal{S}}_2$ such that $\Omega(\omega_{\rm mix}) = 1$. The affine function $\Omega$ is bounded: $\Omega(\omega) \leq 1$ for any $\omega \in \hat{\mathcal{S}}_2$, which implies $\Omega(\omega_1) = \Omega(\omega_2) = 1$; this is illustrated in FIG. 2. Like $\Omega_{\rm one}$, the effect $\Omega$ defines a complete measurement, and Requirement 3 can be imposed on the single outcome $\Omega$, implying that $\mathcal{S}_1$ contains more than one state. This is in contradiction with the previous paragraph; hence, all points in the boundary are pure.

    For the case $d_2 = 1$, the state space $\hat{\mathcal{S}}_2$ is a segment (a 1-dimensional ball), hence the previous and next theorems are trivial. For $d_2 > 1$, the previous theorem implies that $\hat{\mathcal{S}}_2$ contains infinitely many pure states. The next theorem recovers the (quantum-like) Bloch sphere with a yet unknown dimension $d_2$.

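    Theorem 2. There is a set of fiducial measurements for which $\hat{\mathcal{S}}_2$ is a $d_2$-dimensional unit ball.

    Proof. Lemma 2 shows that there is an invertible real matrix $S$ such that for each $\hat{G} \in \hat{\mathcal{G}}_2$ the matrix $S\hat{G}S^{-1}$ is orthogonal. Let us redefine the set $\hat{\mathcal{S}}_2$ by transforming the states as $\hat{\omega} \to \hat{\omega}' = q\,S\hat{\omega}$, where the number $q > 0$ is chosen such that all pure states $\hat{\omega}'$ are unit vectors, $|\hat{\omega}'|^2 = \hat{\omega}'^{\mathsf{T}}\hat{\omega}' = 1$. This is possible because in the transformed state space, all pure states are related by orthogonal matrices ($S\hat{G}S^{-1}$), which preserve the norm.

    Since Theorem 1 also applies to the redefined set $\hat{\mathcal{S}}'_2$, it must be a unit ball. In what follows we define a new set of fiducial measurements $x'_i$ such that the Bloch representation (13) associated to the new fiducial probabilities $p(x'_i)$ coincides with the redefinition $\hat{\omega}'$.

    Requirement 5 tells us that in $\hat{\mathcal{S}}'_2$ all tight effects are allowed measurements. For each unit vector $\hat{\nu} \in \mathbb{R}^{d_2}$ the function $\hat{\Omega}_{\nu}(\hat{\omega}') = (1 + \hat{\nu}^{\mathsf{T}}\hat{\omega}')/2$ is a tight effect on the unit ball, and conversely, all tight effects on the unit ball are of this form. The new set of fiducial measurements $x'_i$ has effects $\hat{\Omega}_{x'_i} = \hat{\Omega}_{\nu_i}$, where

    $$\hat{\nu}_1 = \begin{bmatrix} 1 \\ 0 \\ \vdots \\ 0 \end{bmatrix}, \quad \hat{\nu}_2 = \begin{bmatrix} 0 \\ 1 \\ \vdots \\ 0 \end{bmatrix}, \quad \ldots, \quad \hat{\nu}_{d_2} = \begin{bmatrix} 0 \\ 0 \\ \vdots \\ 1 \end{bmatrix} \tag{14}$$

    is a fixed orthonormal basis for $\mathbb{R}^{d_2}$. For any state $\hat{\omega}'$ the new fiducial probabilities are $p(x'_i) = \hat{\Omega}_{x'_i}(\hat{\omega}') = (1 + \hat{\omega}'^{\,i})/2$, which implies $\hat{\omega}'^{\,i} = 2[p(x'_i) - 1/2]$. This is just (13) with the new fiducial measurements (note that $\hat{\mu}'_2 = \mathbf{0}$ and $\mu'^{\,i}_2 = \hat{\Omega}_{x'_i}(\mathbf{0}) = 1/2$).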

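    In the rest of the paper, we will use the representation derived in Theorem 2 above, where the generalized bit is represented by a unit ball. Moreover, we will drop the prime in $\hat{\mathcal{S}}'_2$, $x'_i$, $\hat{\omega}'$ used in the proof, and simply write $\hat{\mathcal{S}}_2$, $x_i$, $\hat{\omega}$.

    As argued above, for each pure state $\psi \in \mathcal{S}_2$ there is a binary measurement with associated effect

    $$\hat{\Omega}_{\psi}(\hat{\omega}) = (1 + \hat{\psi}^{\mathsf{T}}\hat{\omega})/2, \tag{15}$$

    such that $\hat{\Omega}_{\psi}(\hat{\psi}) = 1$ and $\hat{\Omega}_{\psi}(-\hat{\psi}) = 0$. In summary, there is a correspondence between tight effects and pure states in $\mathcal{S}_2$, and each pure state belongs to a distinguishable pair $\{\hat{\psi}, -\hat{\psi}\}$.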
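As a small numerical illustration of (13) and (15), the sketch below converts fiducial probabilities into a Bloch vector (taking the maximally-mixed probabilities to be 1/2, as in the unit-ball representation) and evaluates the effect associated with a pure state. The function names and the example vector, chosen with three components, are assumptions made for illustration only.

```python
def bloch_vector(fiducial_probs):
    """Eq. (13) with mu_2^i = 1/2: Bloch components are 2 * (p(x_i) - 1/2)."""
    return [2.0 * (p - 0.5) for p in fiducial_probs]


def effect(psi, omega):
    """Eq. (15): outcome probability (1 + psi . omega) / 2 for a unit vector psi."""
    return (1.0 + sum(a * b for a, b in zip(psi, omega))) / 2.0


psi = [0.6, 0.0, 0.8]            # hypothetical pure state: a unit vector
minus_psi = [-x for x in psi]    # the antipodal pure state

print(effect(psi, psi), effect(psi, minus_psi), effect(psi, [0.0, 0.0, 0.0]))
```

The pair of antipodal states is perfectly distinguished by the effect, while the maximally-mixed state (the null vector) yields probability one half.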

    C. Capacity and dimension

    Requirements 1, 2 and 3 imply that a state space with finite capacity $c$ has finite dimension $d_c$, which generalizes Requirement 1. To see this, consider a system composed of $m$ generalized bits, with state space denoted by $\mathcal{S}_{2^{\times m}}$. Since $d_2$ is finite, equation (8) implies that $\mathcal{S}_{2^{\times m}}$ has finite dimension. Because perfectly distinguishable states are linearly independent, its capacity, denoted $c_m$, must be finite too. Since systems with the same capacity are equivalent, we must have $c_m \neq c_n$ for $m \neq n$, and the sequence of integers $c_1, c_2, \ldots$ is unbounded. For any capacity $c$ there is a value of $m$ such that $c \leq c_m$, hence by Requirement 3 we have $\mathcal{S}_c \subseteq \mathcal{S}_{2^{\times m}}$, which implies that $\mathcal{S}_c$ is finite-dimensional.

    In QT, the maximally-mixed state (11) has two convenient properties. First property: if $\mu_A$ and $\mu_B$ are the maximally-mixed states of systems $A$ and $B$, then the maximally-mixed state of the composite system $AB$ is

    $$\mu_{AB} = \mu_A \otimes \mu_B. \tag{16}$$

    Second property: in the state space $\mathcal{S}_c$, there are $c$ pure distinguishable states $\psi_1, \ldots, \psi_c \in \mathcal{S}_c$ such that

    $$\mu_c = \frac{1}{c} \sum_{a=1}^{c} \psi_a. \tag{17}$$

    Lemmas 3 and 5 show that these two properties hold for every theory satisfying our requirements. The following theorem exploits these properties to show that the capacity is multiplicative (one of the axioms in [4]).

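    Theorem 3. If $c_A$ and $c_B$ are the capacities of systems $A$ and $B$, then the capacity of the composite system $AB$ is

    $$c_{AB} = c_A c_B. \tag{18}$$

    Proof. Equation (17) allows us to write the maximally-mixed states of systems $A$ and $B$ as

    $$\mu_A = \frac{1}{c_A} \sum_{a=1}^{c_A} \psi^A_a, \qquad \mu_B = \frac{1}{c_B} \sum_{b=1}^{c_B} \psi^B_b,$$

    where $\psi^A_1, \ldots, \psi^A_{c_A} \in \mathcal{S}_A$ are pure and distinguishable, and $\psi^B_1, \ldots, \psi^B_{c_B} \in \mathcal{S}_B$ are pure and distinguishable too. This and equation (16) imply

    $$\mu_{AB} = \mu_A \otimes \mu_B = \frac{1}{c_A c_B} \sum_{a=1}^{c_A} \sum_{b=1}^{c_B} \psi^A_a \otimes \psi^B_b. \tag{19}$$

    All states $\psi^A_a \otimes \psi^B_b \in \mathcal{S}_{AB}$ are distinguishable with the tensor-product measurement, therefore

    $$c_{AB} \geq c_A c_B. \tag{20}$$

    Let $(\Omega_1, \ldots, \Omega_{c_{AB}})$ be a complete measurement on $AB$ which distinguishes the states $\psi_1, \ldots, \psi_{c_{AB}} \in \mathcal{S}_{AB}$; that is, $\Omega_k(\psi_{k'}) = \delta_{k,k'}$. According to Lemma 4 these states can be chosen to be pure. Since $\sum_{k=1}^{c_{AB}} \Omega_k(\mu_{AB}) = 1$, there is at least one value of $k$, denoted $k_0$, such that

    $$\Omega_{k_0}(\mu_{AB}) \leq 1/c_{AB}. \tag{21}$$

    The product of pure states $\psi^A_1 \otimes \psi^B_1$ is pure [3], hence Requirement 4 tells us that there is a reversible transformation $G \in \mathcal{G}_{AB}$ such that $G(\psi_{k_0}) = \psi^A_1 \otimes \psi^B_1$. The measurement $(\Omega_1 \circ G^{-1}, \ldots, \Omega_{c_{AB}} \circ G^{-1})$ distinguishes the states $G(\psi_1), \ldots, G(\psi_{c_{AB}})$. Inequality (21), the invariance of $\mu_{AB}$, expansion (19), the positivity of probabilities, and $(\Omega_{k_0} \circ G^{-1})(\psi^A_1 \otimes \psi^B_1) = 1$ imply

    $$\frac{1}{c_{AB}} \geq (\Omega_{k_0} \circ G^{-1})(\mu_{AB}) = \frac{1}{c_A c_B} \sum_{a,b} (\Omega_{k_0} \circ G^{-1})(\psi^A_a \otimes \psi^B_b) \geq \frac{1}{c_A c_B},$$

    that is, $c_{AB} \leq c_A c_B$. This and (20) imply (18).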

    It is shown in [4] that the two multiplicativity formulas (8) and (18) imply the existence of a positive integer $r$ such that, for any $c$, the state space $\mathcal{S}_c$ has dimension

    $$d_c = c^r - 1. \tag{22}$$

    The integer $r$ is a constant of the theory, with values $r = 1$ for CPT and $r = 2$ for QT.
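The interplay between (18) and (22) can be tabulated directly. The sketch below is only an illustration: it assumes that the dimension multiplicativity formula (8) has the form $d_{AB} + 1 = (d_A + 1)(d_B + 1)$, and the function names are hypothetical.

```python
def dimension(c, r):
    """Eq. (22): dimension of the state space with capacity c, for theory constant r."""
    return c ** r - 1


def composite_is_consistent(c_a, c_b, r):
    """Check d_AB + 1 == (d_A + 1) * (d_B + 1), assuming c_AB = c_A * c_B as in Eq. (18)."""
    d_ab = dimension(c_a * c_b, r)
    return d_ab + 1 == (dimension(c_a, r) + 1) * (dimension(c_b, r) + 1)


for r in (1, 2, 3):
    # r = 1 reproduces the classical simplex dimensions, r = 2 the quantum ones.
    print(r, [dimension(c, r) for c in (2, 3, 4)])
```

For the generalized bit, `dimension(2, r)` is odd for every positive integer `r`, which is the parity fact used later in Subsection IV E.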

    D. Recovering classical probability theory

    Let us consider all theories with $d_2 = 1$. In this case, equation (22) becomes $d_c = c - 1$. In [4], it is shown that the only GPT with this relation between capacity and dimension is CPT, as described in Subsection II F. We reproduce the proof for completeness.

    Theorem 4. The only GPT with $d_2 = 1$ satisfying Requirements 1-5 is classical probability theory.

    Proof. Let $\mathcal{S}_c$ be a state space and $(\Omega_1, \ldots, \Omega_c)$ a complete measurement which distinguishes the states $\omega_1, \ldots, \omega_c \in \mathcal{S}_c$. The vectors $\omega_1, \ldots, \omega_c \in \mathbb{R}^c$ are linearly independent; otherwise $\omega_a = \sum_{b \neq a} t_b\, \omega_b$ and $1 = \Omega_a(\omega_a) = \sum_{b \neq a} t_b\, \Omega_a(\omega_b) = 0$ gives a contradiction. Therefore, any state $\omega \in \mathcal{S}_c \subset \mathbb{R}^c$ can be written in this basis as $\omega = \sum_a q_a\, \omega_a$, where $q_a = \Omega_a(\omega)$ turns out to be the probability of outcome $a$. The numbers $(q_1, \ldots, q_c)$ constitute a probability distribution; hence, there is a one-to-one correspondence between states in $\mathcal{S}_c$ and $c$-outcome probability distributions. This kind of set is called a $d_c$-simplex. A similar argument shows that the effects $\Omega_1, \ldots, \Omega_c$ are linearly independent. Hence, any effect $\Omega$ on $\mathcal{S}_c$ can be written as $\Omega = \sum_a h_a\, \Omega_a$, and the constraint $0 \leq \Omega(\omega_a) \leq 1$ implies $0 \leq h_a \leq 1$. In other words, every measurement on $\mathcal{S}_c$ is generated by the complete one.

    Every reversible transformation on $\mathcal{S}_c$ is a symmetry of the $d_c$-simplex, that is, a permutation of pure states. Due to Requirement 4, there is a reversible transformation on the bit $\mathcal{S}_2$ which exchanges the two pure states. Using Requirement 3 inductively: if there is a transformation on $\mathcal{S}_{c-1}$ which exchanges two pure states and leaves the rest invariant, this transposition can be implemented on $\mathcal{S}_c$, also leaving all other pure states invariant. Therefore, all transpositions can be implemented in $\mathcal{S}_c$, and those generate the full group of permutations.
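The last step of the proof relies on the standard fact that transpositions generate the full permutation group. The following sketch checks this by brute force for a small capacity; it is an illustration with hypothetical function names, not part of the derivation.

```python
from itertools import permutations


def compose(p, q):
    """Permutation 'p after q', both given as tuples mapping index -> image."""
    return tuple(p[q[i]] for i in range(len(p)))


def transposition(c, i, j):
    """The permutation of c pure states that exchanges states i and j."""
    perm = list(range(c))
    perm[i], perm[j] = perm[j], perm[i]
    return tuple(perm)


def generated_group(generators):
    """Closure of a set of permutations under composition (breadth-first search)."""
    identity = tuple(range(len(generators[0])))
    group = {identity}
    frontier = [identity]
    while frontier:
        g = frontier.pop()
        for s in generators:
            h = compose(s, g)
            if h not in group:
                group.add(h)
                frontier.append(h)
    return group


c = 4
adjacent = [transposition(c, i, i + 1) for i in range(c - 1)]
print(len(generated_group(adjacent)))
```

Even the adjacent transpositions alone suffice to reach every permutation of the pure states.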

    E. Reversible transformations for the generalized bit

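    In the rest of the paper, only theories with $d_2 > 1$ are considered. Theorem 2 shows that $\hat{\mathcal{S}}_2$ is a $d_2$-dimensional unit ball. Equation (22) for $c = 2$ implies that $d_2$ is odd.

    The pure states in $\hat{\mathcal{S}}_2$ are the unit vectors in $\mathbb{R}^{d_2}$. A reversible transformation $\hat{G} \in \hat{\mathcal{G}}_2$ maps pure states onto pure states, hence it preserves the norm and has to be an orthogonal matrix, $\hat{G}^{\mathsf{T}} = \hat{G}^{-1}$. Therefore $\hat{\mathcal{G}}_2$ is a subgroup of the orthogonal group $\mathrm{O}(d_2)$.

    Requirement 4 imposes that for any pair of unit vectors $\hat{\omega}, \hat{\omega}'$ there is $\hat{G} \in \hat{\mathcal{G}}_2$ such that $\hat{G}(\hat{\omega}) = \hat{\omega}'$. In other words, $\hat{\mathcal{G}}_2$ is transitive on the sphere [28, 29]. According to Lemma 6, if $\hat{\mathcal{G}}_2$ is transitive on the sphere, then the largest connected subgroup $\hat{\mathcal{C}}_2 \subseteq \hat{\mathcal{G}}_2$ is also transitive on the sphere. The matrix group $\hat{\mathcal{C}}_2$ is compact and connected, hence a Lie group (Theorem 7.31 in [24]). The classification of all connected compact Lie groups that are transitive on the sphere is done in [28, 29].

    For odd $d_2$, the only possibility is $\hat{\mathcal{C}}_2 = \mathrm{SO}(d_2)$, except for $d_2 = 7$ where there are additional possibilities: $\hat{\mathcal{C}}_2 = M \mathrm{G}_2 M^{\mathsf{T}} \subset \mathrm{SO}(7)$ for any $M \in \mathrm{O}(7)$, where $\mathrm{G}_2$ is the fundamental representation of the smallest exceptional Lie group [30]. For even $d_2$, there are many more possibilities [28, 29], but equation (22) implies that $d_2$ must be odd.

    The stabilizer of the vector $\hat{\nu}_1$ defined in (14) is the subgroup $\hat{\mathcal{H}}_2 = \{\hat{G} \in \hat{\mathcal{G}}_2 : \hat{G}(\hat{\nu}_1) = \hat{\nu}_1\}$. Each transformation $\hat{H} \in \hat{\mathcal{H}}_2$ has the form

    $$\hat{H} = \begin{pmatrix} 1 & \mathbf{0}^{\mathsf{T}} \\ \mathbf{0} & \tilde{H} \end{pmatrix},$$

    where $\tilde{H} \in \tilde{\mathcal{H}}_2$ is the nontrivial part. If $\hat{\mathcal{C}}_2 = \mathrm{SO}(d_2)$ then $\mathrm{SO}(d_2 - 1) \subseteq \tilde{\mathcal{H}}_2$. In the case $d_2 = 7$, if $\hat{\mathcal{C}}_2 = M \mathrm{G}_2 M^{\mathsf{T}}$ then $\tilde{\mathcal{H}}_2$ contains (up to the similarity $M$) the real 6-dimensional representation of $\mathrm{SU}(3)$ given by

    $$\tilde{H} = \begin{pmatrix} \mathrm{re}\,U & -\mathrm{im}\,U \\ \mathrm{im}\,U & \mathrm{re}\,U \end{pmatrix}, \tag{23}$$

    where $\mathrm{re}\,U$ and $\mathrm{im}\,U$ are the real and imaginary parts of $U \in \mathrm{SU}(3)$ (see exercise 22.27 in [30]).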

    F. Two generalized bits

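    The joint state space of two $\mathcal{S}_2$ systems is denoted by $\mathcal{S}_{2,2}$. The multiplicativity of the capacity (18) implies that $\mathcal{S}_{2,2}$ is equivalent to $\mathcal{S}_4$. However, we write $\mathcal{S}_{2,2}$ to emphasize the bipartite structure.

    In what follows, instead of using the standard representation for bipartite systems (3), we generalize the Bloch representation to two generalized bits. A state $\omega_{AB} \in \mathcal{S}_{2,2}$ has Bloch representation $\hat{\omega}_{AB} = [\alpha, \beta, C]$ with

    $$\begin{aligned} \alpha_i &= 2p(x_i) - 1, \\ \beta_j &= 2p(y_j) - 1, \\ C_{ij} &= 4p(x_i, y_j) - 2p(x_i) - 2p(y_j) + 1 \end{aligned} \tag{24}$$

    for $i, j = 1, \ldots, d_2$. Note that $\alpha = \hat{\omega}_A$ and $\beta = \hat{\omega}_B$ are the reduced states in the Bloch representation (13). The correlation matrix can also be written as $C_{ij} = p(x_i, y_j) - p(\bar{x}_i, y_j) - p(x_i, \bar{y}_j) + p(\bar{x}_i, \bar{y}_j)$, where a bar denotes the complementary outcome, and characterizes the correlations between subsystems. Product states have Bloch representation

    $$(\omega_A \otimes \omega_B)^{\wedge} = \left[\hat{\omega}_A,\ \hat{\omega}_B,\ \hat{\omega}_A \hat{\omega}_B^{\mathsf{T}}\right], \tag{25}$$

    with rank-one correlation matrix. In QT, where $d_2 = 3$, two-qubit density matrices are often represented by $[\alpha, \beta, C]$ through formula (41). Definition (24) implies

    $$-1 \leq \alpha_i, \beta_j, C_{ij} \leq 1. \tag{26}$$

    The invertible map $\mathcal{L}[\omega_{AB}] = \hat{\omega}_{AB}$ defined by (24) also determines the Bloch representation of effects, $\hat{\Omega} = \Omega \circ \mathcal{L}^{-1}$. In particular, the tensor product of two effects of the form (15) is

    $$(\Omega_{\psi_A} \otimes \Omega_{\psi_B})^{\wedge}[\alpha, \beta, C] = \left(1 + \hat{\psi}_A^{\mathsf{T}}\alpha + \hat{\psi}_B^{\mathsf{T}}\beta + \hat{\psi}_A^{\mathsf{T}} C\, \hat{\psi}_B\right)/4. \tag{27}$$

    The map $\mathcal{L}$ also determines the action of reversible transformations in the Bloch representation. Since $\mathcal{L}$ is affine but not linear, the action $\hat{G} = \mathcal{L} \circ G \circ \mathcal{L}^{-1}$ need not be linear. Identities (16, 25) and $\hat{\mu}_2 = \mathbf{0}$ imply that the maximally-mixed state in $\mathcal{S}_{2,2}$ is $\hat{\mu}_{2,2} = [\hat{\mu}_2, \hat{\mu}_2, \hat{\mu}_2\hat{\mu}_2^{\mathsf{T}}] = \mathbf{0}$. This and (12) imply that transformations in $\hat{\mathcal{G}}_{2,2}$ act on the generic vector $[\alpha, \beta, C]$ as matrices. In particular, local transformations $\hat{G}_A, \hat{G}_B \in \hat{\mathcal{G}}_2$ act as

    $$(G_A \otimes G_B)^{\wedge}[\alpha, \beta, C] = [\hat{G}_A\alpha,\ \hat{G}_B\beta,\ \hat{G}_A C\, \hat{G}_B^{\mathsf{T}}]. \tag{28}$$

    Subsection IV E concludes that $\hat{\mathcal{G}}_2$ consists of orthogonal matrices, and Lemma 8 shows that all transformations in $\hat{\mathcal{G}}_{2,2}$ are orthogonal too. Orthogonal matrices preserve the norm of vectors, therefore all pure states $\psi \in \mathcal{S}_{2,2}$ satisfy

    $$|\hat{\psi}|^2 = |\alpha|^2 + |\beta|^2 + \mathrm{tr}(C^{\mathsf{T}}C) = 3. \tag{29}$$

    The constant on the right-hand side can be obtained by letting $\hat{\psi} = [\alpha, \beta, \alpha\beta^{\mathsf{T}}]$ with $|\alpha| = |\beta| = 1$.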
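The bookkeeping of the two-bit Bloch representation can be made concrete with a short sketch. It builds a product state as in (25), applies local orthogonal transformations as in (28), and evaluates the left-hand side of (29). It assumes three-component Bloch vectors and rotations about the first axis purely for illustration; all function names are hypothetical.

```python
import math


def matvec(m, v):
    return [sum(m[i][k] * v[k] for k in range(len(v))) for i in range(len(m))]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(m):
    return [list(row) for row in zip(*m)]


def product_state(a, b):
    """Eq. (25): [alpha, beta, C] with rank-one correlation matrix C = a b^T."""
    return (list(a), list(b), [[x * y for y in b] for x in a])


def local_action(g_a, g_b, state):
    """Eq. (28): [G_A alpha, G_B beta, G_A C G_B^T]."""
    alpha, beta, corr = state
    return (matvec(g_a, alpha), matvec(g_b, beta),
            matmul(matmul(g_a, corr), transpose(g_b)))


def squared_norm(state):
    """Left-hand side of Eq. (29): |alpha|^2 + |beta|^2 + tr(C^T C)."""
    alpha, beta, corr = state
    return (sum(x * x for x in alpha) + sum(x * x for x in beta)
            + sum(x * x for row in corr for x in row))


def rotation_about_first_axis(t):
    c, s = math.cos(t), math.sin(t)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]


a = [0.6, 0.0, 0.8]   # hypothetical pure state of bit A (unit vector)
b = [0.0, 1.0, 0.0]   # hypothetical pure state of bit B (unit vector)
state = product_state(a, b)
g_a, g_b = rotation_about_first_axis(0.3), rotation_about_first_axis(1.1)
rotated = local_action(g_a, g_b, state)
print(squared_norm(state), squared_norm(rotated))
```

A pure product state has squared norm 3, local orthogonal transformations preserve it, and the transformed correlation matrix is again the outer product of the transformed local vectors.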

    G. Consistency in the subspaces of two generalized bits

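    In this subsection we use a trick introduced in [19]: to impose the equivalence between a particular subspace of $\mathcal{S}_{2,2}$ and $\mathcal{S}_2$ (Requirement 3).

    Consider the unit vector $\hat{\nu}_1$ from (14) and the two distinguishable pure states $\hat{\varphi}_0 = \hat{\nu}_1$ and $\hat{\varphi}_1 = -\hat{\nu}_1$ from $\hat{\mathcal{S}}_2$. The four pure states $\varphi_{a,b} = \varphi_a \otimes \varphi_b \in \mathcal{S}_{2,2}$ can be distinguished with the complete measurement $\Omega_{a,b} = \Omega_{\varphi_a} \otimes \Omega_{\varphi_b}$, where $a, b \in \{0, 1\}$. Formula (27) implies

    $$\hat{\Omega}_{0,0}[\alpha, \beta, C] = (1 + \alpha_1 + \beta_1 + C_{1,1})/4, \tag{30}$$

    $$\hat{\Omega}_{1,1}[\alpha, \beta, C] = (1 - \alpha_1 - \beta_1 + C_{1,1})/4. \tag{31}$$

    Requirement 3 implies that the subspace

    $$\mathcal{S}'_2 = \{\omega \in \mathcal{S}_{2,2} : (\Omega_{0,0} + \Omega_{1,1})(\omega) = 1\}$$

    is equivalent to $\mathcal{S}_2$. By adding (30) and (31), it becomes clear that a state $\hat{\omega} = [\alpha, \beta, C]$ belongs to $\mathcal{S}'_2$ if and only if $C_{1,1} = 1$. Moreover, if $\omega \in \mathcal{S}'_2$, then it follows from $\Omega_{0,1}(\omega) \geq 0$ and $\Omega_{1,0}(\omega) \geq 0$ that $\alpha_1 = \beta_1$.

    Theorem 5. The state space of a generalized bit has dimension three ($d_2 = 3$).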

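    Proof. Recall that the case under consideration is odd $d_2$ larger than one. The space $\mathcal{S}'_2 \subset \mathcal{S}_{2,2}$ is equivalent to $\mathcal{S}_2$, which is a $d_2$-dimensional unit ball. If $\varphi_{0,0}$ and $\varphi_{1,1}$ are considered the poles of this ball, then the equator is the set of states $\omega_{\rm eq}$ such that $\Omega_{0,0}(\omega_{\rm eq}) = \Omega_{1,1}(\omega_{\rm eq}) = 1/2$. Equations (30, 31) tell us that equator states have $\alpha_1 = \beta_1 = 0$, and then

    $$\hat{\omega}_{\rm eq} = \left[ \begin{pmatrix} 0 \\ \tilde{\alpha} \end{pmatrix},\ \begin{pmatrix} 0 \\ \tilde{\beta} \end{pmatrix},\ \begin{pmatrix} 1 & \gamma'^{\mathsf{T}} \\ \gamma & \tilde{C} \end{pmatrix} \right], \tag{32}$$

    where $\tilde{\alpha}, \tilde{\beta}, \gamma, \gamma' \in \mathbb{R}^{d_2 - 1}$ and $\tilde{C} \in \mathbb{R}^{(d_2-1) \times (d_2-1)}$. Consider the action of $G_A \otimes I$ for $G_A \in \mathcal{G}_2$ on an equator state $\omega_{\rm eq}$. Since $\hat{\mathcal{G}}_2$ is transitive on the unit sphere, if $\gamma \neq \mathbf{0}$ then there is some $\hat{G}_A \in \hat{\mathcal{G}}_2$ such that the correlation matrix transforms into

    $$\hat{G}_A \begin{pmatrix} 1 & \gamma'^{\mathsf{T}} \\ \gamma & \tilde{C} \end{pmatrix} = \begin{pmatrix} \sqrt{1 + |\gamma|^2} & \ast \\ \mathbf{0} & \ast \end{pmatrix},$$

    where $\ast$ marks unspecified entries. This is in contradiction with (26). Therefore $\gamma = \mathbf{0}$, and by a similar argument $\gamma' = \mathbf{0}$.

    The stabilizer of $\hat{\nu}_1$ is the largest subgroup $\hat{\mathcal{H}}_2 \subseteq \hat{\mathcal{G}}_2$ which leaves $\hat{\nu}_1$ invariant (Subsection IV E). For any pair $H_A, H_B \in \mathcal{H}_2$ the identity $\Omega_{a,b} \circ (H_A \otimes H_B) = \Omega_{a,b}$ holds, which implies that if $\omega_{\rm eq}$ belongs to the equator (32) then

    $$(H_A \otimes H_B)^{\wedge}(\hat{\omega}_{\rm eq}) = \left[ \begin{pmatrix} 0 \\ \tilde{H}_A\tilde{\alpha} \end{pmatrix},\ \begin{pmatrix} 0 \\ \tilde{H}_B\tilde{\beta} \end{pmatrix},\ \begin{pmatrix} 1 & \mathbf{0}^{\mathsf{T}} \\ \mathbf{0} & \tilde{H}_A \tilde{C} \tilde{H}_B^{\mathsf{T}} \end{pmatrix} \right]$$

    also belongs to the equator. The equator is a unit ball of dimension $d_2 - 1$. Since the set

    $$\{(H_A \otimes H_B)(\omega_{\rm eq}) : H_A, H_B \in \mathcal{H}_2\} \tag{33}$$

    is a subset of the equator, the dimension of its affine span is at most $d_2 - 1$.

    Consider the case $\tilde{C} = 0$. The normalization condition (29) implies $|\tilde{\alpha}| = |\tilde{\beta}| = 1$. The set $\{\tilde{H}_A\tilde{\alpha} : \tilde{H}_A \in \tilde{\mathcal{H}}_2\}$ has dimension $d_2 - 1$, and the same holds for $\{\tilde{H}_B\tilde{\beta} : \tilde{H}_B \in \tilde{\mathcal{H}}_2\}$. Therefore the set (33) has dimension at least $2(d_2 - 1)$, generating a contradiction.

    Consider the case $\tilde{C} \neq 0$. The group action on $\tilde{C}$ corresponds to the exterior tensor product $\tilde{\mathcal{H}}_2 \boxtimes \tilde{\mathcal{H}}_2 = \{\tilde{H}_A \otimes \tilde{H}_B : \tilde{H}_A, \tilde{H}_B \in \tilde{\mathcal{H}}_2\}$. If $d_2 > 3$ and $\mathrm{SO}(d_2 - 1) \subseteq \tilde{\mathcal{H}}_2$ then $\tilde{\mathcal{H}}_2$ is irreducible in $\mathbb{C}^{d_2 - 1}$, and a simple character-based argument shows that $\tilde{\mathcal{H}}_2 \boxtimes \tilde{\mathcal{H}}_2$ is irreducible in $(\mathbb{C}^{d_2-1})^{\otimes 2}$ (see page 427 in [30]). Hence the set

    $$\{\tilde{H}_A \tilde{C} \tilde{H}_B^{\mathsf{T}} : \tilde{H}_A, \tilde{H}_B \in \tilde{\mathcal{H}}_2\} \tag{34}$$

    has dimension $(d_2 - 1)^2$, which conflicts with the dimensionality requirements of (33). If $d_2 = 7$ and $\tilde{\mathcal{H}}_2$ contains the representation of $\mathrm{SU}(3)$ given in (23), then the subgroup

    $$\tilde{H} = \begin{pmatrix} U & 0 \\ 0 & U \end{pmatrix}$$

    with $U \in \mathrm{SO}(3) \subset \mathrm{SU}(3)$ has two invariant $\mathbb{C}^3$ subspaces. Therefore the invariant subspaces of $\tilde{\mathcal{H}}_2 \boxtimes \tilde{\mathcal{H}}_2$ have at least dimension 9, and independently of $\tilde{C}$, the set (34) has at least dimension 9, which conflicts with the dimensionality requirements of (33). So the only possibility is $d_2 = 3$.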

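    From now on, only the case $d_2 = 3$ is considered. Subsection IV E tells us that $\mathrm{SO}(3) \subseteq \hat{\mathcal{G}}_2 \subseteq \mathrm{O}(3)$, which implies that either $\hat{\mathcal{G}}_2 = \mathrm{O}(3)$ and $\tilde{\mathcal{H}}_2 = \mathrm{O}(2)$, or $\hat{\mathcal{G}}_2 = \mathrm{SO}(3)$ and $\tilde{\mathcal{H}}_2 = \mathrm{SO}(2)$.

    Let us see that the first case is impossible. The group $\tilde{\mathcal{H}}_2 = \mathrm{O}(2)$ is irreducible in $\mathbb{C}^2$, therefore $\tilde{\mathcal{H}}_2 \boxtimes \tilde{\mathcal{H}}_2$ is irreducible in $(\mathbb{C}^2)^{\otimes 2}$. Three paragraphs above it is shown that $\tilde{C} \neq 0$, hence the set (34) has dimension $(d_2 - 1)^2 = 4$, which is a lower bound for the dimension of (33), and this is larger than the allowed one ($d_2 - 1 = 2$).

    Let us address the second case. The group $\tilde{\mathcal{H}}_2 = \mathrm{SO}(2)$ is irreducible in $\mathbb{R}^2$ but reducible in $\mathbb{C}^2$, so the previous argument does not hold. The vector space of $2 \times 2$ real matrices decomposes into the subspace generated by rotations

    $$R^+ = \begin{pmatrix} \cos v & -\sin v \\ \sin v & \cos v \end{pmatrix}, \tag{35}$$

    and the one generated by reflections

    $$R^- = \begin{pmatrix} -\cos v & \sin v \\ \sin v & \cos v \end{pmatrix}, \tag{36}$$

    where $\det R^{\pm} = \pm 1$. For any pair $\tilde{H}_A, \tilde{H}_B \in \tilde{\mathcal{H}}_2$ the matrix $\tilde{H}_A R^+ \tilde{H}_B^{\mathsf{T}}$ is a rotation and the matrix $\tilde{H}_A R^- \tilde{H}_B^{\mathsf{T}}$ is a reflection; therefore the 2-dimensional subspaces (35) and (36) are invariant under $\tilde{\mathcal{H}}_2 \boxtimes \tilde{\mathcal{H}}_2$. Since the equator has dimension $d_2 - 1 = 2$, all matrices $\tilde{C} \neq 0$ must be fully contained in one of the two subspaces spanned by (35) or (36); otherwise the dimension of the set (33) would be too large again. For the same reason $\tilde{\alpha} = \tilde{\beta} = \mathbf{0}$.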
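The invariance of the rotation and reflection families under the action $\tilde{C} \mapsto \tilde{H}_A \tilde{C} \tilde{H}_B^{\mathsf{T}}$ with $\tilde{H}_A, \tilde{H}_B \in \mathrm{SO}(2)$ can be checked numerically. The sketch below assumes the sign convention in which the reflection family is parametrized as $[[-\cos v, \sin v], [\sin v, \cos v]]$; any other parametrization of the $\det = -1$ orthogonal matrices would serve equally well. Function names are hypothetical.

```python
import math


def rot(t):
    """A rotation matrix, as in Eq. (35)."""
    c, s = math.cos(t), math.sin(t)
    return [[c, -s], [s, c]]


def refl(t):
    """A reflection matrix (determinant -1), as in Eq. (36) under the assumed signs."""
    c, s = math.cos(t), math.sin(t)
    return [[-c, s], [s, c]]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]


def transpose(m):
    return [[m[0][0], m[1][0]], [m[0][1], m[1][1]]]


def det(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def sandwich(h_a, m, h_b):
    """The action H_A m H_B^T on the nontrivial block of the correlation matrix."""
    return matmul(matmul(h_a, m), transpose(h_b))


def is_rotation(m, tol=1e-12):
    return (abs(det(m) - 1) < tol and abs(m[0][0] - m[1][1]) < tol
            and abs(m[0][1] + m[1][0]) < tol)


def is_reflection(m, tol=1e-12):
    return (abs(det(m) + 1) < tol and abs(m[0][0] + m[1][1]) < tol
            and abs(m[0][1] - m[1][0]) < tol)


print(is_rotation(sandwich(rot(0.4), rot(1.0), rot(2.5))),
      is_reflection(sandwich(rot(0.4), refl(1.0), rot(2.5))))
```

Rotations stay rotations and reflections stay reflections, so each family spans a subspace that the stabilizer action cannot leave.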

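    Depending on whether $\tilde{C}$ is in the subspace generated by $R^+$ from (35) or by $R^-$ from (36), the states in the equator of $\mathcal{S}'_2$ are either $\psi^+_{\rm eq}$ or $\psi^-_{\rm eq}$, where

    $$\hat{\psi}^{\pm}_{\rm eq} = \left[ \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix},\ \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix},\ \begin{pmatrix} 1 & 0 & 0 \\ 0 & \pm\cos v & \mp\sin v \\ 0 & \sin v & \cos v \end{pmatrix} \right].$$

    The proportionality constants in $\tilde{C} \propto R^{\pm}$ are fixed by normalization (29). It turns out that both the symmetric case $\psi^-_{\rm eq}$ and the antisymmetric case $\psi^+_{\rm eq}$ correspond to different representations of the same physical theory; that is, the corresponding state spaces (together with measurements and transformations) are equivalent in the sense of Subsection II E. To see this, define the linear map $\tau : \hat{\mathcal{S}}_2 \to \hat{\mathcal{S}}_2$ as $\tau(\hat{\omega}^1, \hat{\omega}^2, \hat{\omega}^3)^{\mathsf{T}} := (\hat{\omega}^1, \hat{\omega}^2, -\hat{\omega}^3)^{\mathsf{T}}$; that is, a reflection in the Bloch ball. The equivalence transformation is defined as $L := I \otimes \tau$ (in quantum information terms, this is a partial transposition). This map respects the tensor product structure, leaves the set of product states invariant, and maps the states $\psi^+_{\rm eq}$ to the states $\psi^-_{\rm eq}$ [19]. In other words: we have reduced the discussion of the antisymmetric theory to that of the symmetric theory [35], which will be considered for the rest of the paper.

    The orthogonality of the matrices in $\hat{\mathcal{G}}_{2,2}$ implies that $\mathcal{S}'_2$ is a 3-dimensional ball, and not just affinely related to it. Hence all states on the surface of the ball $\mathcal{S}'_2 \subset \mathcal{S}_{2,2}$ can be parametrized in polar coordinates $u \in [0, \pi)$ and $v \in [0, 2\pi)$ as

    $$\hat{\psi}(u, v) = \left[ \begin{pmatrix} \cos u \\ 0 \\ 0 \end{pmatrix},\ \begin{pmatrix} \cos u \\ 0 \\ 0 \end{pmatrix},\ \begin{pmatrix} 1 & 0 & 0 \\ 0 & -\sin u \cos v & \sin u \sin v \\ 0 & \sin u \sin v & \sin u \cos v \end{pmatrix} \right]. \tag{37}$$

    These states cannot be written as proper mixtures of other states from $\mathcal{S}'_2$. It is easy to see that this implies that they are pure states in $\mathcal{S}_{2,2}$.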
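The defining properties of the family (37) can be verified with a few lines. The sketch below assumes the sign convention in which the lower-right block of the correlation matrix is $\sin u\,[[-\cos v, \sin v], [\sin v, \cos v]]$, and uses the effects (30) and (31); function names are hypothetical.

```python
import math


def psi(u, v):
    """Bloch representation [alpha, beta, C] of the surface states, Eq. (37) (assumed signs)."""
    su = math.sin(u)
    alpha = [math.cos(u), 0.0, 0.0]
    corr = [[1.0, 0.0, 0.0],
            [0.0, -su * math.cos(v), su * math.sin(v)],
            [0.0, su * math.sin(v), su * math.cos(v)]]
    return (alpha, list(alpha), corr)


def omega_00(state):
    """Eq. (30)."""
    alpha, beta, corr = state
    return (1 + alpha[0] + beta[0] + corr[0][0]) / 4


def omega_11(state):
    """Eq. (31)."""
    alpha, beta, corr = state
    return (1 - alpha[0] - beta[0] + corr[0][0]) / 4


def squared_norm(state):
    """Left-hand side of Eq. (29)."""
    alpha, beta, corr = state
    return (sum(x * x for x in alpha) + sum(x * x for x in beta)
            + sum(x * x for row in corr for x in row))


sample = psi(0.9, 2.3)
print(squared_norm(sample), omega_00(sample) + omega_11(sample))
```

Every member of the family satisfies the normalization (29) and lies in the subspace $\mathcal{S}'_2$, with $u = 0$ giving the pole $\varphi_{0,0}$ and $u = \pi/2$ the equator.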

    H. The Hermitian representation

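    In this subsection, a new (more familiar) representation is introduced, where states in $\mathcal{S}_2$ are represented by $2 \times 2$ Hermitian matrices. For any state $\omega \in \mathcal{S}_2$ in the standard representation (1), define the linear map

    $$\mathcal{L}[\omega] = \omega^0\, \frac{I - \sigma_1 - \sigma_2 - \sigma_3}{2} + \sum_{i=1}^{3} \omega^i \sigma_i. \tag{38}$$

    The Pauli matrices

    $$\sigma_1 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \quad \sigma_2 = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \quad \sigma_3 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix},$$

    together with the identity $I$ constitute an orthogonal basis for the real vector space of Hermitian matrices. In terms of the Bloch representation, the map (38) has the familiar form

    $$\mathcal{L}[\omega] = \frac{1}{2}\left(I + \sum_{i=1}^{3} \hat{\omega}^i \sigma_i\right).$$

    All positive unit-trace $2 \times 2$ Hermitian matrices can be written in this way with $\hat{\omega}$ in the unit ball. Since $\hat{\mathcal{S}}_2$ is a 3-dimensional unit ball, the set $\mathcal{L}[\mathcal{S}_2]$ is the set of quantum states. The extreme points of $\mathcal{L}[\mathcal{S}_2]$ are the rank-one projectors: each pure state $\psi \in \mathcal{S}_2$ satisfies $\mathcal{L}[\psi] = |\psi\rangle\langle\psi|$, where the vector $|\psi\rangle \in \mathbb{C}^2$ is defined up to a global phase. Effect (15) associated to the pure state $\psi \in \mathcal{S}_2$ is

    $$\Omega_{\psi}(\omega) = (\Omega_{\psi} \circ \mathcal{L}^{-1})(\mathcal{L}[\omega]) = \mathrm{tr}\left(|\psi\rangle\langle\psi|\, \mathcal{L}[\omega]\right). \tag{39}$$

    Note that the state $\psi$ and its associated effect $\Omega_{\psi}$ are both represented by $|\psi\rangle\langle\psi|$. The action of a reversible transformation $G \in \mathcal{G}_2 = \mathrm{SO}(3)$ in the Hermitian representation is

    $$\mathcal{L}[G(\omega)] = U \mathcal{L}[\omega] U^{\dagger},$$

    where $U \in \mathrm{SU}(2)$ is related to $G$ via

    $$\sum_{j=1}^{3} G_{ji}\, \sigma_j = U \sigma_i U^{\dagger}, \tag{40}$$

    and $G_{ji}$ are the matrix components (equation VII.5.12 in [26]). In summary, the generalized bit in all theories satisfying $d_2 > 1$ and the requirements is equivalent to the qubit in QT.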
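The agreement between the Bloch-ball effect (15) and the trace formula (39) can be checked numerically. The following is a minimal sketch using plain nested lists for $2 \times 2$ complex matrices; the function names and the example vectors are hypothetical.

```python
PAULI = (
    ((0, 1), (1, 0)),
    ((0, -1j), (1j, 0)),
    ((1, 0), (0, -1)),
)


def hermitian_rep(bloch):
    """L[omega] = (I + sum_i bloch_i sigma_i) / 2 for a three-component Bloch vector."""
    rho = [[0.5, 0.0], [0.0, 0.5]]
    for w, sigma in zip(bloch, PAULI):
        for i in range(2):
            for j in range(2):
                rho[i][j] += 0.5 * w * sigma[i][j]
    return rho


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]


def trace(m):
    return m[0][0] + m[1][1]


def effect_bloch(psi, omega):
    """Eq. (15): (1 + psi . omega) / 2."""
    return (1.0 + sum(a * b for a, b in zip(psi, omega))) / 2.0


def effect_hermitian(psi, omega):
    """Eq. (39): tr(L[psi] L[omega]), where L[psi] is the projector of a pure state."""
    return trace(matmul(hermitian_rep(psi), hermitian_rep(omega))).real


psi = [0.6, 0.0, 0.8]       # unit vector: a pure state
omega = [0.1, -0.2, 0.3]    # a mixed state inside the unit ball
print(effect_bloch(psi, omega), effect_hermitian(psi, omega))
```

For a unit vector the matrix returned by `hermitian_rep` squares to itself, so it is a rank-one projector, and both expressions for the outcome probability coincide.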

    I. Reconstructing quantum theory

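    In this subsection, the main result of this work is proved. But first, let us introduce some notation.

    In QT, the state space with capacity $c$ and the corresponding group of reversible transformations are

    $$\mathcal{S}^Q_c = \{\rho \in \mathbb{C}^{c \times c} : \rho \geq 0,\ \mathrm{tr}\,\rho = 1\}, \qquad \mathcal{G}^Q_c = \{U \cdot U^{\dagger} : U \in \mathrm{SU}(c)\}.$$

    The joint state space of $m$ generalized bits is denoted by $\mathcal{S}_{2^{\times m}}$, and the corresponding group of reversible transformations by $\mathcal{G}_{2^{\times m}}$. The Hermitian representation of a state $\omega \in \mathcal{S}_{2^{\times m}}$ is defined to be $\mathcal{L}^{\otimes m}[\omega]$, where $\mathcal{L}^{\otimes m} := \mathcal{L} \otimes \cdots \otimes \mathcal{L}$, and $\mathcal{L}$ is defined in (38). The map $\mathcal{L}^{\otimes m}$ acts independently on each tensor factor, hence it translates the tensor product structure from the standard representation (4, 5, 7) to the Hermitian one. For example, if $\psi \in \mathcal{S}_2$ is a pure state, then $\mathcal{L}^{\otimes m}[\psi^{\otimes m}] = |\psi\rangle\langle\psi|^{\otimes m}$. The notation

    $$\mathcal{S}^H_{2^{\times m}} = \mathcal{L}^{\otimes m}[\mathcal{S}_{2^{\times m}}], \qquad \mathcal{G}^H_{2^{\times m}} = \mathcal{L}^{\otimes m} \circ \mathcal{G}_{2^{\times m}} \circ (\mathcal{L}^{\otimes m})^{-1}$$

    will be useful. The Hermitian representation of a state $\hat{\omega}_{AB} = [\alpha, \beta, C] \in \mathcal{S}_{2,2}$ is

    $$\mathcal{L}^{\otimes 2}[\omega_{AB}] = \frac{1}{4}\left[ I \otimes I + \sum_{i=1}^{3} \alpha_i\, \sigma_i \otimes I + \sum_{j=1}^{3} \beta_j\, I \otimes \sigma_j + \sum_{i,j=1}^{3} C_{ij}\, \sigma_i \otimes \sigma_j \right]. \tag{41}$$

    The action of local transformations $G_A, G_B \in \mathcal{G}_2$ on $\omega_{AB} \in \mathcal{S}_{2,2}$ is

    $$\mathcal{L}^{\otimes 2}[(G_A \otimes G_B)(\omega_{AB})] = (U_A \otimes U_B)\, \rho_{AB}\, (U_A \otimes U_B)^{\dagger}, \tag{42}$$

    where $\rho_{AB} = \mathcal{L}^{\otimes 2}[\omega_{AB}]$ and $U_A, U_B \in \mathrm{SU}(2)$ are related to $G_A, G_B$ via (40). Now, we are ready to prove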

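    Theorem 6. The only GPT with $d_2 > 1$ satisfying Requirements 1-5 is quantum theory.

    Proof. We start by reproducing an argument from [19] which shows that $\mathcal{S}^Q_4 \subseteq \mathcal{S}^H_{2,2}$. A particular family of pure states in $\mathcal{S}_{2,2}$ is $\psi(u) = \psi(u, 0)$ defined in (37). The Hermitian representation of $\psi(u)$ is the projector $\mathcal{L}^{\otimes 2}[\psi(u)] = |\psi(u)\rangle\langle\psi(u)|$ onto the $\mathbb{C}^2 \otimes \mathbb{C}^2$-vector

    $$|\psi(u)\rangle = \cos\frac{u}{2}\, |+\rangle|+\rangle + \sin\frac{u}{2}\, |-\rangle|-\rangle,$$

    where $|+\rangle = (1, 1)^{\mathsf{T}}/\sqrt{2}$ and $|-\rangle = (1, -1)^{\mathsf{T}}/\sqrt{2}$. From the Schmidt decomposition, it follows that all rank-one projectors in $\mathbb{C}^{4 \times 4}$ can be written as $(U_A \otimes U_B)|\psi(u)\rangle\langle\psi(u)|(U_A \otimes U_B)^{\dagger}$ for some value of $u$ and some local unitaries $U_A, U_B \in \mathrm{SU}(2)$. Thus, all rank-one projectors are pure states in $\mathcal{S}^H_{2,2}$. Their mixtures generate all of $\mathcal{S}^Q_4$, therefore $\mathcal{S}^Q_4 \subseteq \mathcal{S}^H_{2,2}$.

    Direct calculation shows that

    $$\mathrm{tr}\left(\mathcal{L}^{\otimes 2}[\omega]\, \mathcal{L}^{\otimes 2}[\omega']\right) = \frac{1}{4} + \frac{1}{4}\left[\alpha^{\mathsf{T}}\alpha' + \beta^{\mathsf{T}}\beta' + \mathrm{tr}(C^{\mathsf{T}}C')\right] \tag{43}$$

    for any pair of states $\hat{\omega} = [\alpha, \beta, C]$ and $\hat{\omega}' = [\alpha', \beta', C']$ from $\mathcal{S}_{2,2}$. Lemma 8 shows that all $\hat{G} \in \hat{\mathcal{G}}_{2,2}$ are orthogonal matrices. Therefore, the Euclidean inner product between states, as in the right-hand side of (43), is preserved by the action of any $\hat{G} \in \hat{\mathcal{G}}_{2,2}$. Equality (43) maps this property to the Hermitian representation: any $H \in \mathcal{G}^H_{2,2}$ preserves the Hilbert-Schmidt inner product between states:

    $$\mathrm{tr}[H(\rho)H(\rho')] = \mathrm{tr}(\rho\rho'), \tag{44}$$

    for all $\rho, \rho' \in \mathcal{S}^H_{2,2}$.

    For any pure state $\psi \in \mathcal{S}_2$, the rank-one projector $|\psi\rangle\langle\psi| \otimes |\psi\rangle\langle\psi| = \mathcal{L}^{\otimes 2}[\psi \otimes \psi]$ is a pure state in $\mathcal{S}^H_{2,2}$, and $\mathrm{tr}(|\psi\rangle\langle\psi| \otimes |\psi\rangle\langle\psi|\, \rho) = (\Omega_{\psi} \otimes \Omega_{\psi}) \circ (\mathcal{L}^{\otimes 2})^{-1}(\rho)$ is a measurement on $\mathcal{S}^H_{2,2}$. Any rank-one projector $|\phi\rangle\langle\phi| \in \mathbb{C}^{4 \times 4}$ is a pure state in $\mathcal{S}^H_{2,2}$, hence there is $H \in \mathcal{G}^H_{2,2}$ such that $H(|\psi\rangle\langle\psi| \otimes |\psi\rangle\langle\psi|) = |\phi\rangle\langle\phi|$. Composing the transformation $H^{-1}$ with the effect $\Omega_{\psi} \otimes \Omega_{\psi}$ generates the effect $(\Omega_{\psi} \otimes \Omega_{\psi}) \circ (\mathcal{L}^{\otimes 2})^{-1} \circ H^{-1}$, which maps any $\rho \in \mathcal{S}^H_{2,2}$ to

    $$\mathrm{tr}\left[|\psi\rangle\langle\psi| \otimes |\psi\rangle\langle\psi|\, H^{-1}(\rho)\right] = \mathrm{tr}\left[|\phi\rangle\langle\phi|\, \rho\right], \tag{45}$$

    where (44) has been used. In summary, every rank-one projector $|\phi\rangle\langle\phi| \in \mathbb{C}^{4 \times 4}$ has an associated effect (45) which is an allowed measurement on $\mathcal{S}^H_{2,2}$, and these generate all quantum effects.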

    We have seen that all quantum states $\mathcal{S}^Q_4$ are contained in $\mathcal{S}^H_4$, but can there be other states? If so, the associated Hermitian matrices would have a negative eigenvalue (note that all states in the Hermitian representation (41) have unit trace). If $\rho$ has a negative eigenvalue and $|\phi\rangle$ is the corresponding eigenvector, then the associated measurement outcome (45) has negative probability. Hence, we conclude that $\mathcal{S}^Q_4 = \mathcal{S}^H_4$, and similarly for the measurements.

    All reversible transformations $H \in \mathcal{G}^H_4$ map pure states to pure states, that is, rank-one projectors to rank-one projectors. According to Wigner's Theorem [31], every map of this kind can be written as $H(|\psi\rangle\langle\psi|) = (U|\psi\rangle)(U|\psi\rangle)^{\dagger}$, where $U$ is either unitary or anti-unitary. If $U$ is anti-unitary, it follows from Wigner's normal form [32] that there is a two-dimensional $U$-invariant subspace spanned by two orthonormal vectors $|\phi_0\rangle, |\phi_1\rangle \in \mathbb{C}^4$ such that $U(t_0|\phi_0\rangle + t_1|\phi_1\rangle)$ equals either $\bar{t}_0|\phi_0\rangle + \bar{t}_1|\phi_1\rangle$ or $\bar{t}_1 e^{is}|\phi_0\rangle + \bar{t}_0 e^{-is}|\phi_1\rangle$ for some $s \in \mathbb{R}$. In both cases, $U$ acts as a reflection in the corresponding Bloch ball, which contradicts Requirement 3 because we know that $\mathcal{G}_2 = \mathrm{SO}(3)$. Therefore $\mathcal{G}^H_4 \subseteq \mathcal{G}^Q_4$.

    We know that $\mathcal{G}^H_{2,2}$ contains all local unitaries. Since this group is transitive on the pure states, it contains at least one unitary which maps a product state to an entangled state. It is well known [33] that this implies that the corresponding group of unitaries constitutes a universal gate set for quantum computation; that is, it generates every unitary operation on 2 qubits. This proves that $\mathcal{G}^H_4 = \mathcal{G}^Q_4$.

    Consider $m$ generalized bits as a composite system. From the previously discussed case of $\mathcal{S}_{2,2}$, we know that every unitary operation on every pair of generalized bits is an allowed transformation on $\mathcal{S}^H_{2^{\times m}}$. But two-qubit unitaries generate all unitary transformations [33], hence $\mathcal{G}^Q_{2^m} \subseteq \mathcal{G}^H_{2^{\times m}}$. By applying all these unitaries to $|\psi\rangle\langle\psi|^{\otimes m}$, all pure quantum states are generated, hence $\mathcal{S}^Q_{2^m} \subseteq \mathcal{S}^H_{2^{\times m}}$. Reasoning as in the $\mathcal{S}_{2,2}$ case, for every rank-one projector $|\phi\rangle\langle\phi|$ acting on $\mathbb{C}^{2^m}$, the associated effect which maps $\rho \in \mathcal{S}^H_{2^{\times m}}$ to $\mathrm{tr}(|\phi\rangle\langle\phi|\rho)$ is an allowed measurement outcome on $\mathcal{S}^H_{2^{\times m}}$. This implies that all matrices in $\mathcal{S}^H_{2^{\times m}}$ have non-negative eigenvalues, therefore $\mathcal{S}^H_{2^{\times m}} = \mathcal{S}^Q_{2^m}$ and $\mathcal{G}^H_{2^{\times m}} = \mathcal{G}^Q_{2^m}$.

    The remaining cases of capacities $c$ that are not powers of two are treated by applying Requirement 3, using that $\mathcal{S}_c \subseteq \mathcal{S}_{2^{\times m}}$ for large enough $m$.
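Two of the "direct calculation" steps used in the proof, the projector form of $\mathcal{L}^{\otimes 2}[\psi(u)]$ via (41) and the inner-product identity (43), can be checked numerically. The sketch below assumes the standard Pauli matrices and the sign convention $C = \mathrm{diag}(1, -\sin u, \sin u)$ for $\psi(u) = \psi(u, 0)$; function names are hypothetical.

```python
import math

I2 = ((1, 0), (0, 1))
PAULI = (((0, 1), (1, 0)), ((0, -1j), (1j, 0)), ((1, 0), (0, -1)))


def kron(a, b):
    """Kronecker product of two matrices given as nested sequences."""
    return [[a[i][j] * b[k][l] for j in range(len(a[0])) for l in range(len(b[0]))]
            for i in range(len(a)) for k in range(len(b))]


def hermitian_rep2(alpha, beta, corr):
    """Eq. (41): the 4x4 Hermitian matrix of a two-bit state [alpha, beta, C]."""
    rho = [[0j] * 4 for _ in range(4)]

    def add(matrix, weight):
        for r in range(4):
            for c in range(4):
                rho[r][c] += 0.25 * weight * matrix[r][c]

    add(kron(I2, I2), 1.0)
    for i in range(3):
        add(kron(PAULI[i], I2), alpha[i])
        add(kron(I2, PAULI[i]), beta[i])
        for j in range(3):
            add(kron(PAULI[i], PAULI[j]), corr[i][j])
    return rho


def trace_of_product(a, b):
    return sum(a[r][k] * b[k][r] for r in range(4) for k in range(4))


def psi_bloch(u):
    """psi(u) = psi(u, 0) from Eq. (37), under the assumed sign convention."""
    s = math.sin(u)
    alpha = [math.cos(u), 0.0, 0.0]
    return alpha, list(alpha), [[1.0, 0.0, 0.0], [0.0, -s, 0.0], [0.0, 0.0, s]]


def psi_projector(u):
    """Projector onto cos(u/2)|+>|+> + sin(u/2)|->|-> (a real vector, so no conjugation)."""
    h = 1 / math.sqrt(2)
    plus, minus = (h, h), (h, -h)
    vec = [math.cos(u / 2) * plus[i] * plus[k] + math.sin(u / 2) * minus[i] * minus[k]
           for i in range(2) for k in range(2)]
    return [[vec[r] * vec[c] for c in range(4)] for r in range(4)]


def max_difference(a, b):
    return max(abs(a[r][c] - b[r][c]) for r in range(4) for c in range(4))


print(max_difference(hermitian_rep2(*psi_bloch(0.8)), psi_projector(0.8)))
```

The identity (43) is purely algebraic, so it can be tested on arbitrary triples $[\alpha, \beta, C]$, not only on valid states.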

    V. CONCLUSION

    We have imposed five physical requirements on the framework of generalized probabilistic theories. These requirements are simple and have a clear physical meaning in terms of basic operational procedures. It is shown that the only theories compatible with them are CPT and QT. If Requirement 4 is strengthened by imposing the continuity of reversible transformations, then the only theory that survives is QT. Any other theory violates at least one of the requirements; hence, the relaxation of each one constitutes a different way to go beyond QT.

    The standard formulation of QT includes two postulates which do not follow from our requirements: (i) the update rule for the state after a measurement, and (ii) the Schrödinger equation. If desired, these can be incorporated in our derivation of QT by imposing the following two extra requirements: (i) if a system is measured twice "in rapid succession" with the same measurement, the same outcome is obtained both times [4], and (ii) closed systems evolve reversibly and continuously in time.

    This derivation of QT contains two steps which deserve a special mention. First, a direct consequence of Requirement 3 is that $\hat{\mathcal{S}}_2$ is fully surrounded by pure states, which together with Requirement 4 implies that $\hat{\mathcal{S}}_2$ is a ball. Second, this ball has dimension three, since $d = 3$ is the only value for which $\mathrm{SO}(d - 1)$ is reducible in $\mathbb{C}^{d-1}$.

    Modifications and generalizations of QT are of interest in themselves, and could be essential in order to construct a QT of gravity. Some well-known attempts [15, 16] have shown that straightforward modifications of QT's mathematical formalism quickly lead to inconsistencies, such as superluminal signaling [17]. This work provides an alternative way to proceed. We have shown that the Hilbert space formalism of QT follows from five simple physical requirements. This gives five different consistent ways to go beyond QT, each obtained by relaxing one of our requirements.

    Acknowledgments

    The authors are grateful to Anne Beyreuther, Jens Eisert, Volkher Scholz, Tony Short, and Christopher Witte for discussions. Special thanks to Lucien Hardy for pointing out the multiplicity of groups that are transitive on the sphere and, correspondingly, the need to address the 7-dimensional ball as a special case. Lluís Masanes is financially supported by Caixa Manresa, and benefits from the Spanish MEC project TOQATA (FIS2008-00784) and QOIT (Consolider Ingenio 2010), EU Integrated Project SCALA and STREP project NAMEQUAM. Markus Müller was supported by the EU (QESSENCE). Research at Perimeter Institute is supported by the Government of Canada through Industry Canada and by the Province of Ontario through the Ministry of Research and Innovation.


    [1] J. S. Bell, Physics 1, 195 (1964).[2] P. W. Shor, Polynomial-Time Algorithms for Prime Fac-

    torization and Discrete Logarithms on a Quantum Com-puter; SIAM J. Comput. 26(5), 1484-1509 (1997), quant-ph/9508027v2.

    [3] L. Hardy, Foliable Operational Structures for GeneralProbabilistic Theories; arXiv:0912.4740v1.

    [4] L. Hardy; Quantum Theory From Five Reasonable Ax-ioms, quant-ph/0101012v4.

    [5] H. Barnum, A. Wilce, Information processing in con-vex operational theories, DCM/QPL (Oxford University2008), arXiv:0908.2352v1.

    [6] J. Barrett, Information processing in generalized prob-abilistic theories, Phys. Rev. A 75, 032304 (2007),arXiv:quant-ph/0508211v3.

    [7] G. W. Mackey; The mathematical foundations of quan-tum mechanics, (W. A. Benjamin Inc, New York, 1963).

    [8] G. Birkhoff, J. von Neumann, The Logic of QuantumMechanics, Annals of Mathematics, 37, 823 (1936).

    [9] G. Chiribella, G. M. DAriano, P. Perinotti; Probabilis-tic theories with purification; Phys. Rev. A 81, 062348(2010), arXiv:0908.1583v5.

    [10] W. van Dam, Implausible Consequences of SuperstrongNonlocality, arXiv:quant-ph/0501159v1.

    [11] M. Pawlowski, T. Paterek, D. Kaszlikowski, V. Scarani,A. Winter, M. Zukowski, A new physical princi-ple: Information Causality, Nature 461, 1101 (2009),arXiv:0905.2292v3.

    [12] S. Popescu, D. Rohrlich, Causality and Nonlocalityas Axioms for Quantum Mechanics, Proceedings ofthe Symposium on Causality and Locality in ModernPhysics and Astronomy (York University, Toronto, 1997),arXiv:quant-ph/9709026v2.

    [13] M. Navascues, H. Wunderlich, A glance beyond the quan-tum model, Proc. Roy. Soc. Lond. A 466, 881-890 (2009),arXiv:0907.0372v1.

    [14] D. Gross, M. Muller, R. Colbeck, O. C. O. Dahlsten,All reversible dynamics in maximally non-local theo-ries are trivial, Phys. Rev. Lett. 104, 080402 (2010),arXiv:0910.1840v2.

    [15] A. Gleason, J. Math. Mech. 6, 885 (1957).[16] S. Weinberg, Ann. Phys. NY 194, 336 (1989).[17] N. Gisin, Weinbergs non-linear quantum mechanics and

    supraluminal communications, Phys. Lett. A 14312(1990).

    [18] S. Aaronson, Is Quantum Mechanics An Island In Theoryspace?, arXiv:quant-ph/0401062v2.

    [19] B. Dakić, Č. Brukner, Quantum Theory and Beyond: Is Entanglement Special?, arXiv:0911.0695v1.

    [20] C. A. Fuchs, Quantum Mechanics as Quantum Information (and only a little more), Quantum Theory: Reconstruction of Foundations, A. Khrennikov (ed.), Växjö University Press (2002), arXiv:quant-ph/0205039v1.

    [21] G. Brassard, Is information the key?, Nature Physics 1, 2 (2005).

    [22] E. M. Alfsen and F. W. Shultz, Geometry of state spaces of operator algebras, Birkhäuser, Boston (2003).

    [23] R. T. Rockafellar, Convex Analysis, Princeton University Press (1970).

    [24] A. Baker, Matrix Groups, An Introduction to Lie Group Theory, Springer-Verlag London Limited (2006).

    [25] A. J. Short, private communication.

    [26] B. Simon, Representations of Finite and Compact Groups, Graduate Studies in Mathematics, vol. 10, American Mathematical Society (1996).

    [27] S. Straszewicz, Über exponierte Punkte abgeschlossener Punktmengen, Fund. Math. 24, 139-143 (1935).

    [28] A. L. Onishchik and V. V. Gorbatsevich, Lie groups and Lie algebras I, Encyclopedia of Mathematical Sciences 20, Springer Verlag Berlin, Heidelberg (1993).

    [29] A. L. Onishchik, Transitive compact transformation groups, Mat. Sb. (N.S.) 60(102):4, 447-485 (1963); English translation: Amer. Math. Soc. Transl. (2) 55, 153-194 (1966).

    [30] W. Fulton, J. Harris, Representation Theory, Graduate Texts in Mathematics, Springer (2004).

    [31] V. Bargmann, Note on Wigner's Theorem on Symmetry Operations, J. Math. Phys. 5, 862-868 (1964).

    [32] E. P. Wigner, Normal Form of Antiunitary Operators, J. Math. Phys. 1, 409-413 (1960).

    [33] M. J. Bremner, C. M. Dawson, J. L. Dodd, A. Gilchrist, A. W. Harrow, D. Mortimer, M. A. Nielsen, T. J. Osborne, Practical scheme for quantum computation with any two-qubit entangling gate, Phys. Rev. Lett. 89, 247902 (2002), arXiv:quant-ph/0207072v1.

    [34] G. Chiribella, G. M. D'Ariano, and P. Perinotti, Informational derivation of Quantum Theory, arXiv:1011.6451v2.

    [35] As a physical interpretation of the antisymmetric case, consider two observers who have never met before, but who have independently built devices to measure spin-1/2 particles in three orthogonal directions. If they never had the chance to agree on a common "handedness" of spatial coordinate systems, and happen to have chosen two different orientations, they will measure antisymmetric correlation matrices on shared quantum states. The three-bit no-go result from [19] can be interpreted as follows: if there is a third observer, then it is impossible that every pair of parties measures antisymmetric correlation matrices.

    Appendix A: Lemmas

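    Lemma 1. In any state space $\mathcal{S}_c$, the only state $\omega \in \mathcal{S}_c$ which is invariant under all reversible transformations,

    \[ G(\omega) = \omega \quad \text{for all } G \in \mathcal{G}_c , \tag{A1} \]

    is the maximally-mixed state $\mu_c$, defined in (11).

    Proof. Suppose $\omega \in \mathcal{S}_c$ satisfies (A1). Any state can be written as a mixture of pure states: $\omega = \sum_k q_k \psi_k$. Normalization $\int_{\mathcal{G}_c} dG = 1$, condition (A1), the linearity of $G$, the purity of all $\psi_k$, the definition of $\mu_c$, and $\sum_k q_k = 1$ imply

    \[ \omega = \int_{\mathcal{G}_c} \omega \, dG = \int_{\mathcal{G}_c} G(\omega) \, dG = \sum_k q_k \int_{\mathcal{G}_c} G(\psi_k) \, dG = \sum_k q_k \, \mu_c = \mu_c , \]

    which proves the claim.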
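    The averaging argument of Lemma 1 can be illustrated in the classical special case, where states are probability vectors and the reversible transformations are the permutations of the outcomes. The following is a minimal sketch under that assumption; the helper names permute and group_average and the example vectors are hypothetical and chosen only for illustration.

```python
from fractions import Fraction
from itertools import permutations

def permute(perm, state):
    """Reversible classical transformation: move the weight of outcome i to perm[i]."""
    out = [Fraction(0)] * len(state)
    for i, p in enumerate(state):
        out[perm[i]] = p
    return out

def group_average(state):
    """Average G(state) over all permutations G with equal weight.

    For a finite group the normalized Haar measure is the uniform average.
    """
    c = len(state)
    group = list(permutations(range(c)))
    total = [Fraction(0)] * c
    for perm in group:
        image = permute(perm, state)
        total = [t + x for t, x in zip(total, image)]
    return [t / len(group) for t in total]

# Averaging a pure (deterministic) state gives the maximally-mixed state.
pure = [Fraction(1), Fraction(0), Fraction(0)]
mu = group_average(pure)

# Averaging any other state gives the same result, so a state that is
# invariant under the whole group (equal to its own average) must be mu.
mixed = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)]
same_average = group_average(mixed) == mu

print(mu)
print(same_average)
```

    Exact rational arithmetic is used so that the equality checks are not affected by rounding.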

    Lemma 2. If $\mathcal{G}$ is a compact real matrix group, then there is a real matrix $S > 0$ such that for each $G \in \mathcal{G}$ the matrix $S G S^{-1}$ is orthogonal.

    Proof. Since the group $\mathcal{G}$ is compact, there is an invariant Haar measure [26], which allows us to define

    \[ P = \int_{\mathcal{G}} G^{T} G \, dG . \]

    Since each $G$ is invertible, the matrix $G^{T} G$ is strictly positive, and so is $P$. Define $S = \sqrt{P} > 0$, where both $S$ and $S^{-1}$ are real and symmetric. For any $G \in \mathcal{G}$ we have $(S G S^{-1})^{T} (S G S^{-1}) = I$, which implies orthogonality.
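    The construction in the proof of Lemma 2 can be tried out on a small finite group, where the Haar integral reduces to a plain average. The sketch below is an illustration only: the group (quarter-turn rotations conjugated by a non-orthogonal matrix T), the restriction to 2x2 matrices, and all helper names are assumptions made for the example.

```python
import math

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def transpose(a):
    return [[a[j][i] for j in range(2)] for i in range(2)]

def inverse(a):
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [[a[1][1] / det, -a[0][1] / det], [-a[1][0] / det, a[0][0] / det]]

def sqrt_spd(m):
    """Positive square root of a symmetric positive-definite 2x2 matrix.

    Uses the Cayley-Hamilton identity sqrt(M) = (M + s I) / t with
    s = sqrt(det M) and t = sqrt(tr M + 2 s).
    """
    s = math.sqrt(m[0][0] * m[1][1] - m[0][1] * m[1][0])
    t = math.sqrt(m[0][0] + m[1][1] + 2 * s)
    return [[(m[0][0] + s) / t, m[0][1] / t], [m[1][0] / t, (m[1][1] + s) / t]]

identity = [[1.0, 0.0], [0.0, 1.0]]
R = [[0.0, -1.0], [1.0, 0.0]]   # rotation by 90 degrees
T = [[2.0, 1.0], [0.0, 1.0]]    # hypothetical non-orthogonal change of basis

# Compact (here finite) matrix group {T R^k T^-1 : k = 0, 1, 2, 3}.
powers = [identity]
for _ in range(3):
    powers.append(matmul(powers[-1], R))
group = [matmul(matmul(T, g), inverse(T)) for g in powers]

# P = average of G^T G over the group, then S = sqrt(P).
P = [[0.0, 0.0], [0.0, 0.0]]
for g in group:
    gtg = matmul(transpose(g), g)
    P = [[P[i][j] + gtg[i][j] / len(group) for j in range(2)] for i in range(2)]
S = sqrt_spd(P)
S_inv = inverse(S)

def defect(m):
    """Largest entry of |m^T m - I|; zero exactly when m is orthogonal."""
    mtm = matmul(transpose(m), m)
    return max(abs(mtm[i][j] - identity[i][j]) for i in range(2) for j in range(2))

raw_defect = max(defect(g) for g in group)
fixed_defect = max(defect(matmul(matmul(S, g), S_inv)) for g in group)
print(raw_defect, fixed_defect)
```

    The group elements themselves are not orthogonal, while their conjugates by S are orthogonal up to floating-point rounding.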

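    Lemma 3. If $\mu_A$ and $\mu_B$ are the maximally-mixed states of the state spaces $\mathcal{S}_A$ and $\mathcal{S}_B$, then the maximally-mixed state of the composite system $\mathcal{S}_{AB}$ is

    \[ \mu_{AB} = \mu_A \otimes \mu_B . \]

    Proof. The pure states $\psi^A$ in $\mathcal{S}_A$ linearly span $\mathbb{R}^{d_A+1}$, and the pure states $\psi^B$ in $\mathcal{S}_B$ linearly span $\mathbb{R}^{d_B+1}$. Therefore, pure product states $\psi^A \otimes \psi^B$ span $\mathbb{R}^{d_A+1} \otimes \mathbb{R}^{d_B+1}$. In particular, the maximally-mixed state (11) of $\mathcal{S}_{AB}$ can be written as

    \[ \mu_{AB} = \sum_{a,b} t_{a,b} \, \psi^A_a \otimes \psi^B_b , \tag{A2} \]

    where $t_{a,b} \in \mathbb{R}$ are not necessarily positive coefficients, and all $\psi^A_a$, $\psi^B_b$ are pure. From definition (1), the first component of the vector equality (A2) implies $\sum_{a,b} t_{a,b} = 1$. The maximally-mixed state is invariant under all reversible transformations, in particular the local ones:

    \[ \mu_{AB} = \int_{\mathcal{G}_A} dG_A \int_{\mathcal{G}_B} dG_B \, (G_A \otimes G_B)(\mu_{AB}) = \sum_{a,b} t_{a,b} \left[ \int_{\mathcal{G}_A} dG_A \, G_A(\psi^A_a) \right] \otimes \left[ \int_{\mathcal{G}_B} dG_B \, G_B(\psi^B_b) \right] = \sum_{a,b} t_{a,b} \, \mu_A \otimes \mu_B = \mu_A \otimes \mu_B , \]

    where the same tricks from Lemma 1 have been used.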

    Lemma 4. For every tight effect $\Omega$, there is a pure state $\psi$ such that $\Omega(\psi) = 1$. Also, if a measurement $\Omega_1, \ldots, \Omega_n$ distinguishes $n$ states $\omega_1, \ldots, \omega_n$, then these states can be chosen pure.

    Proof. By definition, for each tight effect $\Omega$ there is a (not necessarily pure) state $\omega$ such that $\Omega(\omega) = 1$. Every $\omega$ can be written as a mixture of pure states $\psi_k$, that is $\omega = \sum_k q_k \psi_k$ with $q_k > 0$ and $\sum_k q_k = 1$. Effects are linear functions such that $\Omega(\omega') \le 1$ for any state $\omega'$. Since $1 = \Omega(\omega) = \sum_k q_k \, \Omega(\psi_k)$ with every $q_k > 0$, all pure states $\psi_k$ in the above decomposition must satisfy $\Omega(\psi_k) = 1$.

    To prove the second part, let $\omega_1, \ldots, \omega_n$ be the states that are distinguished by the measurement, that is $\Omega_a(\omega_b) = \delta_{a,b}$. Every $\omega_b$ can be written as a convex combination of pure states, $\omega_b = \sum_k q_k \psi_{b,k}$. But effects are linear functions such that $0 \le \Omega(\omega') \le 1$ for any state $\omega'$. Hence $\Omega(\omega_b) = 0$ is only possible if $\Omega(\psi_{b,k}) = 0$ for all $k$, and similarly for the case $\Omega(\omega_b) = 1$. It follows that $\Omega_a(\psi_{b,1}) = \delta_{a,b}$.

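    Lemma 5. If $\mathcal{S}_c$ is a state space with capacity $c \ge 1$ and $\mu_c$ the corresponding maximally-mixed state, then there are $c$ pure distinguishable states $\psi_1, \ldots, \psi_c \in \mathcal{S}_c$ such that

    \[ \mu_c = \frac{1}{c} \sum_{a=1}^{c} \psi_a . \]

    Proof. Since $\mathcal{S}_1$ contains a single state, the claim is trivially true for $c = 1$. Since $\mathcal{S}_2$ is the $d_2$-dimensional unit ball, two antipodal points $\psi_1$ and $\psi_2 = -\psi_1$ are pure, distinguishable and satisfy

    \[ \mu_2 = \frac{1}{2} (\psi_1 + \psi_2) . \tag{A3} \]

    Now, consider the joint state space of $n$ generalized bits, denoted $\mathcal{S}_2^{\otimes n}$. Lemma 3 and (A3) imply that the maximally-mixed state of $\mathcal{S}_2^{\otimes n}$ is

    \[ \mu^{(n)} = (\mu_2)^{\otimes n} = \frac{1}{2^n} \sum_{a_i \in \{1,2\}} \psi_{a_1} \otimes \cdots \otimes \psi_{a_n} . \tag{A4} \]

    The states $\psi_{a_1} \otimes \cdots \otimes \psi_{a_n} \in \mathcal{S}_2^{\otimes n}$ for all $a_1, \ldots, a_n \in \{1, 2\}$ are perfectly distinguishable by the corresponding product measurement, hence the capacity of $\mathcal{S}_2^{\otimes n}$, denoted $c_n$, satisfies

    \[ c_n \ge 2^n . \tag{A5} \]

    Let $(\Omega_1, \ldots, \Omega_{c_n})$ be a complete measurement which distinguishes the states $\varphi_1, \ldots, \varphi_{c_n} \in \mathcal{S}_2^{\otimes n}$. According to Lemma 4 these states can be chosen to be pure. Since $\sum_{k=1}^{c_n} \Omega_k(\mu^{(n)}) = 1$, there is at least one value of $k$, denoted $k_0$, such that

    \[ \Omega_{k_0}(\mu^{(n)}) \le 1/c_n . \tag{A6} \]

    The state $\psi_1 \in \mathcal{S}_2$ from (A3) is pure, hence $\psi_1^{\otimes n} \in \mathcal{S}_2^{\otimes n}$ is pure too. Requirement 4 tells that there is a reversible transformation $G$ acting on $\mathcal{S}_2^{\otimes n}$ such that $G(\varphi_{k_0}) = \psi_1^{\otimes n}$. The measurement $(\Omega_1 \circ G^{-1}, \ldots, \Omega_{c_n} \circ G^{-1})$ distinguishes the states $G(\varphi_1), \ldots, G(\varphi_{c_n})$. Inequality (A6), the invariance of $\mu^{(n)}$ under $G$, expansion (A4), the positivity of probabilities, and $(\Omega_{k_0} \circ G^{-1})(\psi_1^{\otimes n}) = 1$ imply

    \[ \frac{1}{c_n} \ge (\Omega_{k_0} \circ G^{-1})(\mu^{(n)}) = \frac{1}{2^n} \sum_{a_i \in \{1,2\}} (\Omega_{k_0} \circ G^{-1})(\psi_{a_1} \otimes \cdots \otimes \psi_{a_n}) \ge \frac{1}{2^n} . \]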


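    This and (A5) imply $c_n = 2^n$. This together with (A4) shows the assertion of the lemma for state spaces whose capacity is a power of two. The remaining cases are shown by induction.

    Let us prove that if the claim of the lemma holds for a state space with capacity $c$, with $c > 1$, then it holds for a state space with capacity $c - 1$ too. The induction hypothesis tells that there is a complete measurement $(\Omega_1, \ldots, \Omega_c)$ which distinguishes the pure states $\psi_1, \ldots, \psi_c \in \mathcal{S}_c$, and $\mu_c = \frac{1}{c} \sum_{k=1}^{c} \psi_k \in \mathcal{S}_c$ is the corresponding maximally-mixed state. Requirement 3 tells that the state space $\mathcal{S}_{c-1}$ is equivalent to

    \[ \mathcal{S}'_{c-1} = \{ \omega \in \mathcal{S}_c \mid \Omega_1(\omega) + \cdots + \Omega_{c-1}(\omega) = 1 \} . \]

    According to Requirement 3, for each $G \in \mathcal{G}_{c-1}$ there is $G' \in \mathcal{G}_c$ which implements $G$ on $\mathcal{S}'_{c-1}$. Hence $G'(\psi_k) \in \mathcal{S}'_{c-1}$ for $k = 1, \ldots, c-1$, which implies $(\Omega_c \circ G')(\psi_k) = 0$ for those $k$, and

    \[ \frac{1}{c} = \Omega_c(\mu_c) = (\Omega_c \circ G')(\mu_c) = \frac{1}{c} \sum_{k=1}^{c} (\Omega_c \circ G')(\psi_k) = \frac{1}{c} (\Omega_c \circ G')(\psi_c) , \]

    and $(\Omega_c \circ G')(\psi_c) = 1$. Requirement 3 tells that the set

    \[ \mathcal{S}'_1 = \{ \omega \in \mathcal{S}_c \mid \Omega_c(\omega) = 1 \} \]

    is equivalent to $\mathcal{S}_1$, which contains a single state. This and $\Omega_c(\psi_c) = 1$ imply that $G'(\psi_c) = \psi_c$ and then $G'(\mu'_{c-1}) = \mu'_{c-1}$, where we define

    \[ \mu'_{c-1} = \frac{1}{c-1} \sum_{k=1}^{c-1} \psi_k = \frac{c}{c-1} \, \mu_c - \frac{1}{c-1} \, \psi_c \in \mathcal{S}'_{c-1} . \]

    For any $G \in \mathcal{G}_{c-1}$ the corresponding $G'$ satisfies $G'(\mu'_{c-1}) = \mu'_{c-1}$. Due to Lemma 1 the invariant state $\mu'_{c-1}$ must be the maximally-mixed state in $\mathcal{S}'_{c-1}$, which has the claimed form, and Requirement 3 extends this to $\mathcal{S}_{c-1}$.

    Lemma 6. Let $\mathcal{S}$ be a state space such that the subset of pure states $\mathcal{P}$ is a connected topological manifold. If the corresponding group of transformations $\mathcal{G}$ is transitive on $\mathcal{P}$, then the largest connected subgroup $\mathcal{C} \subseteq \mathcal{G}$ is transitive on $\mathcal{P}$, too.

    Proof. This proof involves basic notions of point-set topology.

    Since $\mathcal{G}$ is compact, it is the union of a finite number of (disjoint) connected components, $\mathcal{G} = \mathcal{C}_1 \cup \cdots \cup \mathcal{C}_n$. If $n = 1$ the lemma is trivial. Let $\mathcal{C}$ be the connected component $\mathcal{C}_i$ containing the identity matrix $I$, which is the largest connected subgroup of $\mathcal{G}$. Each connected component $\mathcal{C}_i$ is clopen (open and closed), compact and a coset of the group: $\mathcal{C}_i = G_i \circ \mathcal{C}$ for some $G_i \in \mathcal{G}$ [30].

    Pick $\psi \in \mathcal{P}$, and consider the continuous surjective map $f : \mathcal{G} \to \mathcal{P}$, defined by $f(G) = G(\psi) \in \mathcal{P}$. Since $\mathcal{C}$ is compact, $f(\mathcal{C}) \subseteq \mathcal{P}$ is compact too. Since the manifold $\mathcal{P}$ is in particular a Hausdorff space, $f(\mathcal{C})$ is closed. Consider the set $\mathcal{D} = f^{-1}(f(\mathcal{C}))$. If two group elements $G, H \in \mathcal{G}$ are in the same component, that is $G^{-1} H \in \mathcal{C}$, then $G \in \mathcal{D}$ implies $H \in \mathcal{D}$, using that $\mathcal{C}$ is a normal subgroup of $\mathcal{G}$. This implies that $\mathcal{D}$ is the union of some connected components $\mathcal{C}_i$, and so is $\mathcal{G} \setminus \mathcal{D}$. In particular, $\mathcal{G} \setminus \mathcal{D}$ is compact, thus $f(\mathcal{G} \setminus \mathcal{D})$ is compact, hence closed. Therefore, $f(\mathcal{C}) = \mathcal{P} \setminus f(\mathcal{G} \setminus \mathcal{D})$ is open. We have thus proven that $f(\mathcal{C}) \neq \emptyset$ is clopen. Since $\mathcal{P}$ is connected, it follows that $f(\mathcal{C}) = \mathcal{P}$.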

    The following lemma shows that there are transformations for two generalized bits which perform the "classical swap" and the "classical controlled-not" in a particular basis. Note that, unlike in QT, these transformations do not necessarily swap states that are not in the given basis. However, they implement a minimal amount of reversible computational power which exceeds, for example, that of boxworld, where no controlled-not operation is possible [14].

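    Lemma 7. For each pair of distinguishable states $\psi_0, \psi_1 \in \mathcal{S}_2$, there are transformations $G_{\rm swap}, G_{\rm cnot} \in \mathcal{G}_{2,2}$ such that

    \[ G_{\rm swap}(\psi_a \otimes \psi_b) = \psi_b \otimes \psi_a , \tag{A7} \]
    \[ G_{\rm cnot}(\psi_a \otimes \psi_b) = \psi_a \otimes \psi_{a \oplus b} , \tag{A8} \]

    for all $a, b \in \{0, 1\}$, where $\oplus$ is addition modulo 2.

    Proof. Let $(\Omega_0, \Omega_1)$ be the measurement which distinguishes $(\psi_0, \psi_1)$, that is $\Omega_a(\psi_b) = \delta_{a,b}$. Define $\psi_{a,b} = \psi_a \otimes \psi_b$ and $\Omega_{a,b} = \Omega_a \otimes \Omega_b$ for $a, b \in \{0, 1\}$. Define

    \[ \mathcal{S}'_3 = \{ \omega \in \mathcal{S}_{2,2} \mid (\Omega_{0,1} + \Omega_{1,0} + \Omega_{1,1})(\omega) = 1 \} , \]
    \[ \mathcal{S}'_2 = \{ \omega \in \mathcal{S}_{2,2} \mid (\Omega_{0,1} + \Omega_{1,0})(\omega) = 1 \} , \]

    and note that $\mathcal{S}'_2 \subset \mathcal{S}'_3 \subset \mathcal{S}_{2,2}$. At the end of Subsection IV B, it is shown that two distinguishable states $\psi_0, \psi_1 \in \mathcal{S}_2$ have Bloch representation satisfying $\psi_0 = -\psi_1$. According to Requirement 4 there is $G \in \mathcal{G}_2$ such that $G(\psi_0) = \psi_1$, and by linearity, $G(\psi_1) = G(-\psi_0) = -\psi_1 = \psi_0$. Requirement 3 implies that there is a transformation $G'_{\rm swap}$ for $\mathcal{S}'_3$ such that $G'_{\rm swap}(\psi_{0,1}) = \psi_{1,0}$ and $G'_{\rm swap}(\psi_{1,0}) = \psi_{0,1}$. According to Lemma 5, the maximally-mixed state in $\mathcal{S}'_3$ can be written as $\mu'_3 = (\psi_{0,1} + \psi_{1,0} + \psi_{1,1})/3$. Equalities $G'_{\rm swap}(\mu'_3) = \mu'_3$ and $G'_{\rm swap}(\psi_{0,1} + \psi_{1,0}) = \psi_{0,1} + \psi_{1,0}$ imply that $G'_{\rm swap}(\psi_{1,1}) = \psi_{1,1}$. Using Requirement 3 again, there is a reversible transformation $G_{\rm swap} \in \mathcal{G}_{2,2}$ which implements $G'_{\rm swap}$ in the subspace $\mathcal{S}'_3$. Repeating the argument with the maximally-mixed state (now in $\mathcal{S}_{2,2}$) we conclude that $G_{\rm swap}(\psi_{0,0}) = \psi_{0,0}$, hence $G_{\rm swap}$ satisfies (A7).

    The existence of $G_{\rm cnot}$ is shown similarly, by exchanging the roles of $\psi_{0,1}$ and $\psi_{1,1}$.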
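    On the level of basis labels, equations (A7) and (A8) describe two permutations of the four product states. The sketch below only illustrates this classical action on the labels (a, b); it says nothing about how the transformations act on other states, and the function names are hypothetical.

```python
from itertools import product

# Labels (a, b) of the four product states psi_a (x) psi_b.
basis = list(product((0, 1), repeat=2))

def swap(a, b):
    """Action (A7) on labels: (a, b) -> (b, a)."""
    return (b, a)

def cnot(a, b):
    """Action (A8) on labels: (a, b) -> (a, a + b mod 2)."""
    return (a, a ^ b)

def is_reversible(gate):
    """A gate on labels is reversible iff it permutes the four labels."""
    return sorted(gate(a, b) for a, b in basis) == sorted(basis)

def fixed_points(gate):
    """Labels that the gate leaves unchanged."""
    return [(a, b) for a, b in basis if gate(a, b) == (a, b)]

print(is_reversible(swap), is_reversible(cnot))
print(fixed_points(swap))
print(fixed_points(cnot))
```

    The swap exchanges the labels (0,1) and (1,0) and fixes the other two, while the controlled-not exchanges (1,0) and (1,1), which matches the exchange of roles used at the end of the proof.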


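    Lemma 8. Reversible transformations for two generalized bits in the Bloch representation (24) are orthogonal:

    \[ \mathcal{G}_{2,2} \subseteq O(d_4) . \]

    Proof. In Subsection IV F the Bloch representation for two generalized bits is defined, and it is argued that reversible transformations $\mathcal{G}_{2,2}$ act on $[\alpha, \beta, C]$ as matrices. In particular, local transformations (28) are

    \[ (G_A \otimes G_B) = \begin{pmatrix} G_A & 0 & 0 \\ 0 & G_B & 0 \\ 0 & 0 & G_A \otimes G_B \end{pmatrix} , \tag{A9} \]

    where each diagonal block acts on an entry of $[\alpha, \beta, C]$, and $G_A, G_B \in \mathcal{G}_2$. In Subsection IV E it is argued that $\mathcal{G}_2 \subseteq O(d_2)$, hence local transformations (A9) are orthogonal.

    Lemma 2 shows the existence of a real matrix $S > 0$ such that for any $G \in \mathcal{G}_{2,2}$ the matrix $S G S^{-1}$ is orthogonal. In particular

    \[ \left[ S (G_A \otimes G_B) S^{-1} \right]^{T} \left[ S (G_A \otimes G_B) S^{-1} \right] = I , \]

    which implies the commutation relation

    \[ S (G_A \otimes G_B) = (G_A \otimes G_B) S . \tag{A10} \]

    Indeed, since $G_A \otimes G_B$ is orthogonal, the identity above gives $S^2 (G_A \otimes G_B) = (G_A \otimes G_B) S^2$, and $S > 0$ is the positive square root of $S^2$, so it commutes with $G_A \otimes G_B$ as well.

    Subsection IV E concludes that $d_2$ is odd, and that $SO(d_2) \subseteq \mathcal{G}_2$ except when $d_2 = 7$, where $M \mathrm{G}_2 M^{-1} \subseteq \mathcal{G}_2$ and $\mathrm{G}_2$ is the fundamental representation of the smallest exceptional Lie group [30]. For $d_2 \ge 3$ these groups act irreducibly in $\mathbb{C}^{d_2}$, hence $\mathcal{G}_2$ acts irreducibly in $\mathbb{C}^{d_2}$ too [30]. The first two diagonal blocks in (A9) are irreducible. The exterior tensor product of two irreducible representations (in $\mathbb{C}^d$) is also an irreducible representation, hence the third diagonal block in (A9) is also irreducible. This together with (A10) implies that

    \[ S = \begin{pmatrix} aI & 0 & 0 \\ 0 & bI & 0 \\ 0 & 0 & sI \end{pmatrix} \]

    for some $a, b, s > 0$ (Schur's Lemma [30]).

    According to Lemma 7, for each unit vector $\alpha \in \mathcal{S}_2$ there is a transformation $G_{\rm swap} \in \mathcal{G}_{2,2}$ such that

    \[ G_{\rm swap}[\alpha, 0, 0] = G_{\rm swap} \left( [\alpha, \alpha, \alpha\alpha^{T}] + [\alpha, -\alpha, -\alpha\alpha^{T}] \right) / 2 = \left( [\alpha, \alpha, \alpha\alpha^{T}] + [-\alpha, \alpha, -\alpha\alpha^{T}] \right) / 2 = [0, \alpha, 0] . \]

    Since $S G_{\rm swap} S^{-1}$ is orthogonal, the vectors $[\alpha, 0, 0]$ and $[0, b a^{-1} \alpha, 0]$ have the same modulus, hence $a = b$. Also, there is a transformation $G_{\rm cnot} \in \mathcal{G}_{2,2}$ such that

    \[ G_{\rm cnot}[0, 0, \alpha\alpha^{T}] = G_{\rm cnot} \left( [\alpha, \alpha, \alpha\alpha^{T}] + [-\alpha, -\alpha, \alpha\alpha^{T}] \right) / 2 = \left( [\alpha, \alpha, \alpha\alpha^{T}] + [-\alpha, \alpha, -\alpha\alpha^{T}] \right) / 2 = [0, \alpha, 0] . \]

    Since $S G_{\rm cnot} S^{-1}$ is orthogonal, the vectors $[0, 0, \alpha\alpha^{T}]$ and $[0, b s^{-1} \alpha, 0]$ have the same modulus, hence $s = b$. Consequently $S = aI$, and the claim follows.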
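    The two decompositions into product states used in the proof of Lemma 8 can be checked with a few lines of arithmetic. The sketch below is an illustration only: the unit vector alpha, the choice $d_2 = 3$, and the helpers product_state, average and modulus are assumptions made for the example, and the images under swap and controlled-not are obtained by applying (A7) and (A8) to each product state with $\psi_0 = \alpha$ and $\psi_1 = -\alpha$.

```python
import math

def product_state(alpha, beta):
    """Bloch representation [alpha, beta, alpha beta^T] of a product state."""
    corr = [[x * y for y in beta] for x in alpha]
    return (list(alpha), list(beta), corr)

def average(s, t):
    """Equal-weight combination (s + t) / 2, taken entry by entry."""
    a = [(x + y) / 2 for x, y in zip(s[0], t[0])]
    b = [(x + y) / 2 for x, y in zip(s[1], t[1])]
    c = [[(x + y) / 2 for x, y in zip(r1, r2)] for r1, r2 in zip(s[2], t[2])]
    return (a, b, c)

def modulus(s):
    """Euclidean norm of [alpha, beta, C] viewed as one long vector."""
    entries = s[0] + s[1] + [x for row in s[2] for x in row]
    return math.sqrt(sum(x * x for x in entries))

alpha = [0.6, 0.0, 0.8]          # illustrative unit vector
minus = [-x for x in alpha]
zero_vec = [0.0] * 3
zero_mat = [[0.0] * 3 for _ in range(3)]

# [alpha, 0, 0] and its image under swap, which exchanges the two factors.
before_swap = average(product_state(alpha, alpha), product_state(alpha, minus))
after_swap = average(product_state(alpha, alpha), product_state(minus, alpha))

# [0, 0, alpha alpha^T] and its image under cnot: psi_0 (x) psi_0 is fixed,
# while psi_1 (x) psi_1 is sent to psi_1 (x) psi_0.
before_cnot = average(product_state(alpha, alpha), product_state(minus, minus))
after_cnot = average(product_state(alpha, alpha), product_state(minus, alpha))

print(before_swap[0], after_swap[1])
print(modulus(before_cnot), modulus(after_cnot))
```

    Both images equal $[0, \alpha, 0]$, and all four vectors have unit modulus, which is the fact used to conclude $a = b$ and $s = b$.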

    Lemma 9. The results of this paper also hold if Requirement 5 is replaced by Requirement 5'.

    Proof. Assume Requirement 5', but not Requirement 5. It follows that $\mathcal{S}_1$ contains a single state only: if it contained more than one state, there would exist some $\omega \in \mathcal{S}_1$ which is not completely mixed, which would then be distinguishable from some other state, contradicting that $\mathcal{S}_1$ has capacity 1. Requirement 5 is used in the proof of Theorem 1. This proof is easily modified to comply with Requirement 5' instead: adopting the notation from the proof, the state $\omega_{\rm mix}$ is not completely mixed, thus distinguishable from some other state. This proves existence of some $\Omega$ with the claimed properties, and the other arguments remain unchanged, proving that $\mathcal{S}_2$ can be represented as a unit ball. All pure states in $\mathcal{S}_2$ are not completely mixed, hence each has a corresponding tight effect which is physically allowed. But for every state on the surface of the ball, there exists exactly one tight effect which gives probability one for that state. Hence, all these effects must be allowed, and since they generate the set of all effects, this proves Requirement 5 for use in (14) and the rest of the paper.