Foundations of Probability and Physics: Proceedings of the Conference

PQ-QP: Quantum Probability and White Noise Analysis

Volume XIII

Proceedings of the Conference

Foundations of Probability and Physics

Edited by A. Khrennikov

World Scientific


Managing Editor: W. Freudenberg
Advisory Board Members: L. Accardi, T. Hida, R. Hudson and K. R. Parthasarathy

PQ-QP: Quantum Probability and White Noise Analysis

Vol. 13: Foundations of Probability and Physics ed. A. Khrennikov

QP-PQ

Vol. 10: Quantum Probability Communications eds. R. L. Hudson and J. M. Lindsay

Vol. 9: Quantum Probability and Related Topics ed. L. Accardi





Proceedings of the Conference

Foundations of Probability and Physics

Växjö, Sweden
25 November - 1 December 2000

Edited by
A. Khrennikov
University of Växjö, Sweden

World Scientific
New Jersey • London • Singapore • Hong Kong

Published by

World Scientific Publishing Co. Pte. Ltd.

P O Box 128, Farrer Road, Singapore 912805

USA office: Suite 1B, 1060 Main Street, River Edge, NJ 07661

UK office: 57 Shelton Street, Covent Garden, London WC2H 9HE

British Library Cataloguing-in-Publication Data A catalogue record for this book is available from the British Library.

FOUNDATIONS OF PROBABILITY AND PHYSICS PQ-QP: Quantum Probability and White Noise Analysis - Vol. 13

Copyright © 2001 by World Scientific Publishing Co. Pte. Ltd.

All rights reserved. This book, or parts thereof, may not be reproduced in any form or by any means, electronic or mechanical, including photocopying, recording or any information storage and retrieval system now known or to be invented, without written permission from the Publisher.

For photocopying of material in this volume, please pay a copying fee through the Copyright Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, USA. In this case permission to photocopy is not required from the publisher.

ISBN 981-02-4846-6

Printed in Singapore by World Scientific Printers (S) Pte Ltd


Foreword

With the present proceedings of a conference on "Foundations of Probability and Physics" we continue the QP series, the first volume of which appeared more than twenty years ago. The series had its origin in proceedings of conferences and workshops on quantum probability and related topics. The series was initially published by Springer-Verlag; World Scientific has now been the publisher for about ten years. Much has changed in the world of quantum probability in the last two decades. Quantum probabilistic methods have become a mature subject in mathematics and mathematical physics. The number of well-established scientists who have turned their scientific interest to the field of quantum probability is increasing impressively. Scientifically and numerically strong schools of quantum probability have evolved in the past years. Moreover, the highly interdisciplinary character of quantum probability has become more and more evident. In particular, the close connections to white noise analysis aroused the interest of classical and quantum probabilists and stimulated mutual exchange and cooperation fruitful for both parties.

Taking this development into account, during the previous QP conferences we discussed comprehensively and in detail the future profile and main goals of the series. Some changes in the alignment and the objectives of the series resulted from these discussions. First of all, the new title reflects the intention to unify white noise analysis and quantum probability. It is important and essential to bring together classical and quantum probabilists, and the success of the World Scientific journal "Infinite Dimensional Analysis, Quantum Probability and Related Topics" shows that such an alliance will benefit both parties. Furthermore, we should be open to a wide audience of scientists and to a broad spectrum of themes. The present volume represents such a field: it is not very closely connected to quantum probability and white noise analysis, but it is of general interest to the readership of the series.

Future volumes of the series will include proceedings of conferences or workshops, lecture notes of schools, and also monographs on topics in quantum probability and white noise analysis.

Finally, we would like to thank all former editors of the series for the excellent job they did. We especially appreciate the enthusiastic commitment of Luigi Accardi, who initiated the series and was the responsible editor for many years.

Wolfgang Freudenberg


Contents

Foreword

Preface

Locality and Bell's Inequality
L. Accardi and M. Regoli

Refutation of Bell's Theorem
G. Adenier

Probability Conservation and the State Determination Problem
S. Aerts

Extrinsic and Intrinsic Irreversibility in Probabilistic Dynamical Laws
H. Atmanspacher, R. C. Bishop and A. Amann

Interpretations of Probability and Quantum Theory
L. E. Ballentine

Forcing Discretization and Determination in Quantum History Theories
B. Coecke

Interpretations of Quantum Mechanics, and Interpretations of Violation of Bell's Inequality
W. M. De Muynck

Discrete Hessians in Study of Quantum Statistical Systems: Complex Ginibre Ensemble
M. M. Duras

Some Remarks on Hardy Functions Associated with Dirichlet Series
W. Ehm

Ensemble Probabilistic Equilibrium and Non-Equilibrium Thermodynamics without the Thermodynamic Limit
D. H. E. Gross

An Approach to Quantum Probability
S. Gudder

Innovation Approach to Stochastic Processes and Quantum Dynamics
T. Hida

Statistics and Ergodicity of Wave Functions in Chaotic Open Systems
H. Ishio

Origin of Quantum Probabilities
A. Khrennikov

Nonconventional Viewpoint to Elements of Physical Reality Based on Nonreal Asymptotics of Relative Frequencies
A. Khrennikov

"Complementarity" or Schizophrenia: Is Probability in Quantum Mechanics Information or Onta?
A. F. Kracklauer

A Probabilistic Inequality for the Kochen-Specker Paradox
J.-A. Larsson

Quantum Stochastics. The New Approach to the Description of Quantum Measurements
E. Loubenets

Abstract Models of Probability
V. M. Maximov

Quantum K-Systems and their Abelian Models
H. Narnhofer

Scattering in Quantum Tubes
B. Nilsson

Position Eigenstates and the Statistical Axiom of Quantum Mechanics
L. Polley

Is Random Event the Core Question? Some Remarks and a Proposal
P. Rocchi

Constructive Foundations of Randomness
V. I. Serdobolskii

Structure of Probabilistic Information and Quantum Laws
J. Summhammer

Quantum Cryptography in Space and Bell's Theorem
I. Volovich

Interacting Stochastic Process and Renormalization Theory
Y. Volovich


Preface

This volume constitutes the proceedings of the Conference "Foundations of Probability and Physics" held in Växjö (Småland, Sweden) from 25 November to 1 December, 2000.

The Organizing Committee of the Conference: L. Accardi (Rome, Italy), W. De Muynck (Eindhoven, the Netherlands), T. Hida (Meijo University, Japan), A. Khrennikov (Växjö University, Sweden) and U. V. Maximov (Belostok, Poland).

The purpose of the Conference (tentatively the first of a series) was to bring together scientists (physicists as well as mathematicians) who are interested in the probabilistic foundations of physics. Emphasis was placed on both theory and experiment, the underlying objective being to offer the physical and mathematical scientific communities a truly interdisciplinary Conference as a privileged place for scientific interaction among theoreticians and experimentalists. Given the increased role of probabilistic foundations in physical applications (Einstein-Podolsky-Rosen correlation experiments, Bell's inequality, quantum information, computing and teleportation), as well as the necessity to reconsider foundations at the beginning of the new millennium, the organizers of the Conference decided that it was just the right time to take the scientific risk of trying this.

Since the creation of Statistical Mechanics, probabilistic description has played an increasingly important role in physics. A new crucial step in the development of the statistical approach to physics was made in the process of the creation of quantum mechanics. The founders of quantum theory recognized that the quantum formalism could not provide a description of physical processes for individual elementary particles. The understanding of this surprising fact induced numerous debates on the possibilities of individual and probabilistic descriptions and the relations between them. These debates are characterized by a large diversity of opinions on the origin of quantum stochasticity.

One of the viewpoints is that 'quantum stochasticity' differs from 'classical stochasticity', so quantum (statistical) mechanics cannot be reduced to classical statistical mechanics. This viewpoint implies the conventional interpretation of quantum mechanics.

According to this interpretation, we cannot use objective realism in the quantum description of reality. Even the most fundamental physical quantities, such as the position and momentum of an elementary particle, cannot be considered as properties of the object, the elementary particle. The elementary particle can be in a state that is a superposition of alternatives. Only the act of measurement gives the possibility to choose between these alternatives.


We recall the historical roots of such a viewpoint, namely the idea of superposition.

In fact, the whole 'quantum building' was built on two experimental cornerstones: 1) the experiment on photoelectric emission, 2) the two-slit experiment. [a]

The first experiment definitely demonstrated that light has a corpuscular structure (a discrete structure of energy).

However, the second experiment demonstrated that photons (corpuscular objects) do not follow the standard CLASSICAL STATISTICS. The conventional rule for the addition of probabilistic alternatives,

$$P = P_1 + P_2,$$

is violated in the interference experiments. Instead, the probabilities observed in interference experiments follow the quantum rule for the addition of probabilistic alternatives:

$$P = P_1 + P_2 + 2\sqrt{P_1 P_2}\,\cos\theta.$$

Thus in general the classical rule is perturbed by the $\cos\theta$-factor.

The appearance of NEW STATISTICS induced a revolution in theoretical physics: a reconsideration of the role of all basic elements of the physical theory. The common opinion was (and is) that the quantum probabilistic rule could not be explained by a purely corpuscular model. To explain this rule, we must appeal to wave arguments (see, for example, Dirac's book [b] for a detailed analysis of the roots of the quantum mechanical formalism).
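As a small numerical illustration of the two addition rules, the following Python sketch evaluates the classical sum and the interference formula for a few values of the phase. The function names and the sample values P1 = P2 = 0.25 are assumptions introduced only for this illustration; they are not taken from any experiment discussed here.

```python
import math


def classical_sum(p1: float, p2: float) -> float:
    """Classical rule for the addition of probabilistic alternatives."""
    return p1 + p2


def interference_sum(p1: float, p2: float, theta: float) -> float:
    """Quantum rule: the classical sum perturbed by the cos(theta) term."""
    return p1 + p2 + 2.0 * math.sqrt(p1 * p2) * math.cos(theta)


# Hypothetical probabilities for the two alternatives (illustration only).
P1, P2 = 0.25, 0.25
for theta in (0.0, math.pi / 2, math.pi):
    print(theta, classical_sum(P1, P2), interference_sum(P1, P2, theta))
```

For a phase of pi/2 the perturbation vanishes and the two rules agree; for other phases the interference term can raise or lower the total probability.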

This implies the wave-particle dualism and Bohr's principle of complementarity. This was the crucial change of the whole picture of physical reality (at least at micro-level).

We underline again that all these revolutionary changes had a purely probabilistic root, namely the appearance of the new probabilistic rule. We also underline that the founders of quantum mechanics did not, in fact, provide a deep probabilistic analysis of the problem. Instead, they analysed other elements of the physical model. Such an analysis induced the new description of physical reality that we have already discussed, namely 'quantum reality'. We will never know the real reasons for such a development of the theoretical study of the results of experiments with elementary particles at the beginning of the last century.

[a] Of course, we must also mention that the necessity for a departure from classical mechanics was shown by experiments demonstrating the remarkable stability of atoms and molecules. The forces known in classical electrodynamics are inadequate for the explanation of this phenomenon. However, the quantum mechanical explanation of such stability is, in fact, based on the same arguments as the explanation of the photoelectric effect.

[b] P. A. M. Dirac, The Principles of Quantum Mechanics (Clarendon Press, Oxford, 1995).

It might be that one of the reasons was the absence of the mathematical theory of probability: A. N. Kolmogorov proposed the modern axiomatics of probability theory only in 1933.

During the round table at this conference, Prof. T. Hida and Prof. I. Volovich pointed out the fundamental role of direct contacts between physicists and mathematicians in the creation of new physical theories. It may be that the absence of direct collaboration between the quantum physical and probabilistic communities was the main root of the absence of a deep probabilistic analysis of quantum behaviour.

Debates on the foundations of quantum mechanics were continued with new excitement in connection with the Einstein-Podolsky-Rosen (EPR) paradox. Unfortunately, the probabilistic element played a minor role in the EPR considerations. The notion of probability one was used (in a rather formal way) in the formulation of the sufficient condition to be an element of physical reality. A new probabilistic impulse to debates on the foundations of quantum mechanics was given by Bell's inequality. However, we must recognize that Bell's probabilistic considerations were performed on a formal level that cannot be considered satisfactory (at least from the point of view of a mathematician). It may be that this absence of a deep probabilistic analysis of the EPR and Bell arguments was one of the main reasons for concentrating investigations in the direction of nonlocality and no-go theorems for hidden variables.

The main aim of the conference "Foundations of Probability and Physics" was to provide a probabilistic analysis of the foundations of physics, classical as well as quantum (in particular, the EPR and Bell arguments). The present volume contains the results of such analysis. It gives a general picture of the probabilistic foundations of modern physics. Foundations of probability were considered in close connection with foundations of physics. We demonstrated that probability plays a fundamental role in models of physical reality. It seems to be impossible to split probabilistic and physical problems. On the one hand, many important problems that look purely physical are, in fact, just probabilistic problems. On the other hand, the right meaning of probability can be found only on the basis of physical investigations. Such a meaning depends strongly on the physical model.

The conference and the present volume give a good example of fruitful collaboration between physicists and mathematicians, and stimulate research on the foundations of probability and physics, especially quantum physics.

We would like to thank the Swedish Natural Science Foundation, the Swedish Technical Science Foundation, Växjö University and Växjö Commune for the financial support that made the Conference possible. We would also like to thank Prof. Magnus Söderström, the Rector of Växjö University, for support of fundamental investigations and, in particular, this Conference.

Andrei Khrennikov
International Center for Mathematical Modelling in Physics and Cognitive Sciences
University of Växjö, Sweden
December, 2000.


LOCALITY AND BELL'S INEQUALITY

LUIGI ACCARDI, MASSIMO REGOLI
Centro Vito Volterra
Università di Roma "Tor Vergata", Roma, Italy
Email: accardi@volterra.mat.uniroma2.it

We prove that the locality condition is irrelevant to the Bell inequality. We check that the real origin of Bell's inequality is the assumption of applicability of classical (Kolmogorovian) probability theory to quantum mechanics. We describe the chameleon effect, which allows one to construct an experiment realizing a local, realistic, classical, deterministic and macroscopic violation of the Bell inequalities.

1 Inequalities among numbers

In this section we summarize some elementary inequalities among numbers, which correspond to different forms of the Bell inequality one meets in the literature. Since some confusion has arisen about the mutual relationships among these inequalities, in particular their (in)equivalence and the cases of equality, such a summary might not be totally useless.

Lemma (1) For any two numbers $a, c \in [-1,1]$ the following equivalent inequalities hold:

$$|a \pm c| \le 1 \pm ac \tag{1}$$

Moreover equality in (1) holds if and only if either $a = \pm 1$ or $c = \pm 1$.

Proof. The equivalence of the two inequalities (1) follows from the fact that one is obtained from the other by changing the sign of $c$, and $c$ is arbitrary in $[-1,1]$.

Since for any $a, c \in [-1,1]$, $1 \pm ac \ge 0$, (1) is equivalent to

$$|a \pm c|^2 = a^2 + c^2 \pm 2ac \le (1 \pm ac)^2 = 1 + a^2c^2 \pm 2ac$$

and this is equivalent to

$$a^2(1 - c^2) + c^2 \le 1$$

which is identically satisfied because $1 - c^2 \ge 0$ and therefore

$$a^2(1 - c^2) + c^2 \le 1 - c^2 + c^2 = 1 \tag{2}$$

Notice that in (2) equality holds if and only if $a^2 = 1$ or $c^2 = 1$, i.e. $a = \pm 1$ or $c = \pm 1$. This condition is symmetric in $a$ and $c$, as it must be, since exchanging $a$ and $c$ in (1) leaves the inequality unchanged. The thesis follows.
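The statement of Lemma (1), including its equality case, can be checked exactly on a finite grid. The following Python sketch is only a sanity check under stated assumptions: the grid of step 1/10, the use of exact rational arithmetic, and the names `slacks` and `check_lemma1` are choices made for this illustration and do not appear in the paper.

```python
from fractions import Fraction
from itertools import product

# Grid of rational points in [-1, 1] (step 1/10), chosen for illustration.
GRID = [Fraction(k, 10) for k in range(-10, 11)]


def slacks(a, c):
    """Return (1 + ac - |a + c|, 1 - ac - |a - c|), i.e. right minus left side of (1)."""
    return (1 + a * c - abs(a + c), 1 - a * c - abs(a - c))


def check_lemma1(grid=GRID):
    """True if (1) holds on the grid, with equality exactly when |a| = 1 or |c| = 1."""
    for a, c in product(grid, repeat=2):
        s_plus, s_minus = slacks(a, c)
        if s_plus < 0 or s_minus < 0:
            return False
        on_boundary = abs(a) == 1 or abs(c) == 1
        if (s_plus == 0) != on_boundary or (s_minus == 0) != on_boundary:
            return False
    return True


print(check_lemma1())
```

Exact fractions are used instead of floating point so that the equality case can be tested without a tolerance.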


Corollary (2) For any three numbers $a, b, c \in [-1,1]$ the following equivalent inequalities hold:

$$|ab \pm cb| \le 1 \pm ac \tag{3}$$

and equality holds if and only if $b = \pm 1$ and either $a = \pm 1$ or $c = \pm 1$.

Proof. For $b \in [-1,1]$,

$$|ab \pm cb| = |b| \cdot |a \pm c| \le |a \pm c| \tag{4}$$

so the thesis follows from Lemma (1). In (4) equality holds if and only if $b = \pm 1$, so also the second statement follows from Lemma (1).

Lemma (3). For any numbers $a, a', b, b', c \in [-1,1]$, one has

$$|ab - bc| + |ab' + b'c| \le 2 \tag{5}$$

$$|ab + ab' + a'b' - a'b| \le 2 \tag{6}$$

In (5) equality holds if and only if $b, b' = \pm 1$ and either $a$ or $c = \pm 1$.

Proof. Adding the two inequalities in (3) one finds (5). The left hand side of (6) is $\le$ than

$$|ab - ba'| + |ab' + b'a'| \tag{7}$$

and replacing $a'$ by $c$, (7) becomes the left hand side of (5). Therefore (6) holds. If $b, b' = \pm 1$ and either $a$ or $c = \pm 1$, equality holds in (3), hence in (5). Conversely, suppose that equality holds in (5) and suppose that either $|b| < 1$ or $|b'| < 1$. Then we arrive at the contradiction

$$2 = |b| \cdot |a - a'| + |b'| \cdot |a + a'| < |a - a'| + |a + a'| \le (1 - aa') + (1 + aa') = 2 \tag{8}$$

So, if equality holds in (5), we must have $|b| = |b'| = 1$. In this case (5) becomes

$$|a - a'| + |a + a'| = 2 \tag{9}$$

and we know from Lemma (1) that the identity (9) can take place if and only if either $a$ or $a' = \pm 1$.
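The bounds (5) and (6) can be probed numerically. The following Python sketch, offered only as an illustration, evaluates both left hand sides on a coarse rational grid and records the largest value found; the grid and the helper names `lhs5` and `lhs6` are assumptions of this sketch.

```python
from fractions import Fraction
from itertools import product

# Coarse grid in [-1, 1]: -1, -1/2, 0, 1/2, 1 (an arbitrary choice for this sketch).
GRID = [Fraction(k, 2) for k in range(-2, 3)]


def lhs5(a, b, b_prime, c):
    """Left hand side of (5)."""
    return abs(a * b - b * c) + abs(a * b_prime + b_prime * c)


def lhs6(a, a_prime, b, b_prime):
    """Left hand side of (6)."""
    return abs(a * b + a * b_prime + a_prime * b_prime - a_prime * b)


# Largest values over the grid; neither exceeds the bound 2.
max5 = max(lhs5(*t) for t in product(GRID, repeat=4))
max6 = max(lhs6(*t) for t in product(GRID, repeat=4))
print(max5, max6)
```

A grid search of this kind cannot replace the proof; it only confirms that no grid point exceeds the bound and that the bound is attained.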


Corollary (4). If $a, a', b, b', c \in \{-1, 1\}$, then the inequalities (3), (6) and (5) are equivalent and equality holds in all of them.

Proof. From Lemma (1) we know that the inequalities (1) and (2) are equivalent. From Lemma (3) we know that (3) implies (5). Choosing $b' = a$ in (5), since $a = \pm 1$, (5) becomes

$$|ab - cb| \le 1 - ac$$

which is (3). The left hand side of (6) is the absolute value of

$$a(b + b') + a'(b' - b) \tag{10}$$

In our assumptions either $(b + b')$ or $(b' - b)$ is zero, so the absolute value of (10) is either equal to

$$|a(b + b')| = |b + b'| = 2$$

or to

$$|a'(b' - b)| = |b - b'| = 2$$
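Since Corollary (4) concerns only the 32 sign assignments in $\{-1,1\}^5$, its equality statement can be verified by exhaustive enumeration. The Python sketch below does this; the function name `corollary4_holds` is an assumption of this illustration.

```python
from itertools import product


def corollary4_holds():
    """Check that (3), (5) and (6) hold with equality for all values in {-1, 1}."""
    signs = (-1, 1)
    for a, a_prime, b, b_prime, c in product(signs, repeat=5):
        # Both sign choices of inequality (3).
        eq3 = abs(a * b - c * b) == 1 - a * c and abs(a * b + c * b) == 1 + a * c
        # Inequality (5).
        eq5 = abs(a * b - b * c) + abs(a * b_prime + b_prime * c) == 2
        # Inequality (6).
        eq6 = abs(a * b + a * b_prime + a_prime * b_prime - a_prime * b) == 2
        if not (eq3 and eq5 and eq6):
            return False
    return True


print(corollary4_holds())
```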

Corollary (5). If $a, b', c \in (-1, 1)$, then the inequality (5), hence a fortiori (6), is strictly weaker than (3).

Proof. We have already proved that (3) implies (5), hence (6). On the other hand (5) is equivalent to

$$|ab - bc| \le (1 - ac) + (1 + ac - |ab' + b'c|) \tag{11}$$

By Lemma (1), $1 + ac - |ab' + b'c| \ge 0$ and equality holds if and only if $|b'| = 1$ and either $a$ or $c$ is $\pm 1$. From this the thesis follows.

2 The Bell inequality

Corollary (1) (Bell inequality) Let $A, B, C, D$ be random variables defined on the same probability space $(\Omega, \mathcal{F}, P)$ and with values in the interval $[-1,1]$. Then the following inequalities hold:

$$E(|AB - BC|) \le 1 - E(AC) \tag{1}$$

$$E(|AB + BC|) \le 1 + E(AC) \tag{2}$$

$$E(|AB - BC|) + E(|AD + DC|) \le 2 \tag{3}$$

where $E$ denotes the expectation value in the probability space of the four variables. Moreover (1) is equivalent to (2) and, if either $A$ or $C$ has values $\pm 1$, then the three inequalities are equivalent.

Proof. Lemma (1.1) implies the following inequalities (interpreted pointwise on $\Omega$):

$$|AB - BC| \le 1 - AC$$

$$|AB + BC| \le 1 + AC$$

$$|AB - BC| + |AD + DC| \le 2$$

from which (1), (2), (3) follow by taking expectation and using the fact that $|E(X)| \le E(|X|)$. The equivalence is established by the same arguments as in Lemma (1.1).
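The pointwise argument of the proof can be illustrated on a finite probability space. The Python sketch below draws random values in $[-1,1]$ for $A, B, C, D$ at each sample point, together with random weights, and evaluates the gap (right hand side minus left hand side) for each of the three inequalities. The helper names, the number of sample points, the random seed and the small offset added to the weights are all assumptions of this sketch.

```python
import random


def expect(f, samples, probs):
    """Expectation of f(A, B, C, D) on a finite probability space."""
    return sum(p * f(*w) for w, p in zip(samples, probs))


def bell_gaps(samples, probs):
    """Right hand side minus left hand side for the inequalities (1), (2), (3)."""
    e_ac = expect(lambda a, b, c, d: a * c, samples, probs)
    e_minus = expect(lambda a, b, c, d: abs(a * b - b * c), samples, probs)
    e_plus = expect(lambda a, b, c, d: abs(a * b + b * c), samples, probs)
    e_d = expect(lambda a, b, c, d: abs(a * d + d * c), samples, probs)
    return (1 - e_ac - e_minus, 1 + e_ac - e_plus, 2 - e_minus - e_d)


def random_model(rng, n_points=50):
    """Random values (A, B, C, D) in [-1, 1] at each point, with normalized random weights."""
    samples = [tuple(rng.uniform(-1.0, 1.0) for _ in range(4)) for _ in range(n_points)]
    weights = [rng.random() + 1e-9 for _ in range(n_points)]  # offset avoids a zero total
    total = sum(weights)
    return samples, [w / total for w in weights]


rng = random.Random(0)  # fixed seed so that the run is reproducible
worst = min(min(bell_gaps(*random_model(rng))) for _ in range(200))
print("smallest gap found:", worst)
```

Because the inequalities hold pointwise, no choice of weights can make a gap negative beyond floating point rounding; the random search merely makes this visible.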

Remark (2). Bell's original proof, as well as almost all of the available proofs of Bell's inequality, deal only with the case of random variables assuming only the values +1 and -1. The present generalization is not without interest because it dispenses with the assumption that the classical random variables, used to describe quantum observables, have the same set of values as the latter: a hidden variable theory is required to reproduce the results of quantum theory only when the hidden parameters are averaged over.

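Theorem (3). Let $S_a^{(1)}, S_c^{(1)}, S_b^{(2)}, S_d^{(2)}$ be random variables defined on a probability space $(\Omega, \mathcal{F}, P)$ and with values in the interval $[-1, +1]$. Then the following inequalities hold:

$$|E(S_a^{(1)} S_b^{(2)}) - E(S_b^{(2)} S_c^{(1)})| \le 1 - E(S_a^{(1)} S_c^{(1)}) \tag{4}$$

$$|E(S_a^{(1)} S_b^{(2)}) + E(S_b^{(2)} S_c^{(1)})| \le 1 + E(S_a^{(1)} S_c^{(1)}) \tag{5}$$

$$|E(S_a^{(1)} S_b^{(2)}) - E(S_b^{(2)} S_c^{(1)})| + |E(S_a^{(1)} S_d^{(2)}) + E(S_d^{(2)} S_c^{(1)})| \le 2 \tag{6}$$

Proof. This is a rephrasing of Corollary (1).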


3 Implications of Bell's inequalities for the singlet correlations

To apply Bell's inequalities to the singlet correlations, considered in the EPR paradox, it is enough to observe that they imply the following

Lemma (1) In the ordinary three-dimensional euclidean space there exist sets of three unit-length vectors $a, b, c$ such that it is not possible to find a probability space $(\Omega, \mathcal{F}, P)$ and six random variables $S_x^{(j)}$ ($x = a, b, c$, $j = 1, 2$) defined on $(\Omega, \mathcal{F}, P)$ and with values in the interval $[-1, +1]$, whose correlations are given by:

$$E(S_x^{(1)} \cdot S_y^{(2)}) = -x \cdot y ; \quad x, y = a, b, c \tag{1}$$

where, if $x = (x_1, x_2, x_3)$, $y = (y_1, y_2, y_3)$ are two three-dimensional vectors, $x \cdot y$ denotes their euclidean scalar product, i.e. the sum $x_1 y_1 + x_2 y_2 + x_3 y_3$.

Remark. In the usual EPR-type experiments, the random variables $S_a^{(j)}, S_b^{(j)}, S_c^{(j)}$ represent the spin (or polarization) of particle $j$ of a singlet pair along the three directions $a, b, c$ in space. The expression in the right-hand side of (1) is the singlet correlation of two spin or polarization observables, theoretically predicted by quantum theory and experimentally confirmed by the Aspect-type experiments.

Proof. Suppose that, for any choice of the unit vectors $x = a, b, c$ there exist random variables $S_x^{(j)}$ as in the statement of the Lemma. Then, using Bell's inequality in the form (2.5) with $A = S_a^{(1)}$, $B = S_b^{(2)}$, $C = S_c^{(1)}$, we obtain

$$|E(S_a^{(1)} S_b^{(2)}) + E(S_b^{(2)} S_c^{(1)})| \le 1 + E(S_a^{(1)} S_c^{(1)}) \tag{2}$$

Now notice that, if $x = y$ is chosen in (1), we obtain

$$E(S_x^{(1)} \cdot S_x^{(2)}) = -x \cdot x = -\|x\|^2 = -1 ; \quad x = a, b, c$$

and, since $|S_x^{(j)}| \le 1$, this is possible if and only if $S_x^{(1)} = -S_x^{(2)}$ ($x = a, b, c$) $P$-almost everywhere. Using this, (2) becomes equivalent to:

$$|E(S_a^{(1)} S_b^{(2)}) + E(S_b^{(2)} S_c^{(1)})| \le 1 - E(S_a^{(1)} S_c^{(2)})$$

or, again using (1), to:

$$|a \cdot b + b \cdot c| \le 1 + a \cdot c \tag{3}$$


If the three vectors $a, b, c$ are chosen to be in the same plane and such that $a$ is perpendicular to $c$ and $b$ lies between $a$ and $c$, forming an angle $\theta$ with $a$, then the inequality (3) becomes:

$$\cos\theta + \sin\theta \le 1 ; \quad 0 < \theta < \pi/2 \tag{4}$$

But the maximum of the function $\theta \mapsto \sin\theta + \cos\theta$ in the interval $[0, \pi/2]$ is $\sqrt{2}$ (obtained for $\theta = \pi/4$). Therefore, for $\theta$ close to $\pi/4$, the left-hand side of (4) will be close to $\sqrt{2}$, which is more than 1. In conclusion, for such a choice of the unit vectors $a, b, c$, random variables $S_a^{(1)}, S_b^{(2)}, S_c^{(1)}, S_c^{(2)}$ as in the statement of the Lemma cannot exist.
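The planar configuration used in the proof is easy to evaluate explicitly. The Python sketch below builds the three unit vectors for a given angle and returns the amount by which the left hand side of the inequality for the scalar products exceeds its right hand side. The coordinate choice (a along the first axis, c along the second) and the function names are assumptions of this sketch.

```python
import math


def dot(u, v):
    """Euclidean scalar product of two 3-vectors."""
    return sum(x * y for x, y in zip(u, v))


def planar_vectors(theta):
    """Unit vectors a, b, c in one plane: a perpendicular to c, b at angle theta from a."""
    a = (1.0, 0.0, 0.0)
    c = (0.0, 1.0, 0.0)
    b = (math.cos(theta), math.sin(theta), 0.0)
    return a, b, c


def bell_violation(theta):
    """|a.b + b.c| - (1 + a.c); a positive value means the inequality is violated."""
    a, b, c = planar_vectors(theta)
    return abs(dot(a, b) + dot(b, c)) - (1.0 + dot(a, c))


for theta in (math.pi / 6, math.pi / 4, math.pi / 3):
    print(theta, bell_violation(theta))
```

At the angle pi/4 the excess equals the square root of 2 minus 1, in agreement with the maximum computed in the text.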

Definition (2) A local realistic model for the EPR (singlet) correlations is defined by:

(1) a probability space $(\Omega, \mathcal{F}, P)$

(2) for every unit vector $x$ in the three-dimensional euclidean space, two random variables $S_x^{(1)}, S_x^{(2)}$ defined on $\Omega$ and with values in the interval $[-1, +1]$ whose correlations, for any $x, y$, are given by equation (1).

Corollary (3) If $a, b, c$ are chosen so as to violate (4), then a local realistic model for the EPR correlations, in the sense of Definition (2), does not exist.

Proof. Its existence would contradict Lemma (1).

Remark. In the literature one usually distinguishes two types of local realistic models: deterministic and stochastic ones. Both are included in Definition (2): the deterministic models are defined by random variables $S_x^{(j)}$ with values in the set $\{-1, +1\}$, while, in the stochastic models, the random variables take values in the interval $[-1, +1]$. The original paper [7] was devoted to the deterministic case. Starting from [9], several papers have been devoted to justifying the stochastic models. We prefer to distinguish the definition of the models from their justification.

4 Bell on the meaning of Bell's inequality

In the last section of [8] (submitted before [7], but published after) Bell briefly describes Bohm's hidden variable interpretation of quantum theory, underlining its non local character. He then raises the question: ... that there is no proof that any hidden variable account of quantum mechanics must have this extraordinary character ... and, in a footnote added during the proof corrections, he claims that: ... Since the completion of this paper such a proof has been found [7].

In the short Introduction to [7], Bell reaffirms the same ideas, namely that the result proven by him in this paper shows that: ... any such [hidden variable] theory which reproduces exactly the quantum mechanical predictions must have ... a grossly nonlocal structure.

The proof goes along the following scheme: Bell proves an inequality in which, according to what he says (cf. statement after formula (1) in [7]):

... The vital assumption [2] is that the result B for particle 2 does not depend on the setting a, of the magnet for particle 1, nor A on b.

The paper [2], mentioned in the above statement, is nothing but the Einstein, Podolsky, Rosen paper [11], and the locality issue is further emphasized by the fact that he reports Einstein's famous statement [12]: ... But on one supposition we should, in my opinion, absolutely hold fast: the real factual situation of the system $S_2$ is independent of what is done with the system $S_1$, which is spatially separated from the former.

Stated otherwise: according to Bell, Bell's inequality is a consequence of the locality assumption.

It follows that a theory which violates the above mentioned inequality also violates ... the vital assumption needed, according to Bell, for its deduction, i.e. locality.

Since the experiments prove the violation of this inequality, Bell concludes that quantum theory does not admit a local completion; in particular quantum mechanics is a nonlocal theory. To use again Bell's words: the statistical predictions of quantum mechanics are incompatible with separable predetermination ([7], p.199). Moreover this incompatibility has to be understood in the sense that: in a theory in which parameters are added to quantum mechanics to determine the results of individual measurements, without changing the statistical predictions, there must be a mechanism whereby the setting of one measuring device can influence the reading of another instrument, however remote. Moreover, the signal involved must propagate instantaneously,...

5 Critique of Bell's "vital assumption"

An assumption should be considered "vital" for a theorem if, without it, the theorem cannot be proved.


To favor Bell, let us require much less. Namely let us agree to consider his assumption vital if the theorem cannot be proved by taking as its hypothesis the negation of this assumption.

If even this minimal requirement is not satisfied, then we must conclude that the given assumption has nothing to do with the theorem.

Notice that Bell expresses his locality condition by the requirement that the result B for particle 2 should not depend on the setting a, of the magnet for particle 1 (cf. citation in the preceding section). Let us denote by $M_1$ ($M_2$) the space of all possible measurement settings on system 1 (2).

Theorem (1) For each unit vector $x$ in the three dimensional euclidean space ($x \in \mathbb{R}^3$, $|x| = 1$) let be given two random variables $S_x^{(1)}, S_x^{(2)}$ (spin of particle 1 (2) in direction $x$), defined on a space $\Omega$ with a probability $P$ and with values in the 2-point set $\{+1, -1\}$. Fix 3 of these unit vectors $a, b, c$ and suppose that the corresponding random variables satisfy the following non locality condition [violating Bell's vital assumption]: suppose that the probability space $\Omega$ has the following structure:

$$\Omega = \Lambda \times M_1 \times M_2 \tag{1}$$

so that, for some functions $F_a^{(1)}, F_a^{(2)} : \Lambda \times M_1 \times M_2 \to [-1,1]$,

$$S_a^{(1)}(\omega) = F_a^{(1)}(\lambda, m_1, m_2) \quad (S_a^{(1)} \text{ depends on } m_2) \tag{2}$$

$$S_a^{(2)}(\omega) = F_a^{(2)}(\lambda, m_1, m_2) \quad (S_a^{(2)} \text{ depends on } m_1) \tag{3}$$

with $m_1 \in M_1$, $m_2 \in M_2$ and similarly for $b$ and $c$. [Nothing changes in the proof if we add further dependences, for example $F_a^{(2)}$ may depend on all the $S_x^{(1)}(\omega)$ and $F_a^{(1)}$ on all the $S_x^{(2)}(\omega)$.]

Then the random variables $S_a^{(1)}, S_b^{(2)}, S_c^{(1)}$ satisfy the inequality

$$|\langle S_a^{(1)} S_b^{(2)} \rangle - \langle S_b^{(2)} S_c^{(1)} \rangle| \le 1 - \langle S_a^{(1)} S_c^{(1)} \rangle \tag{4}$$

If moreover the singlet condition

$$\langle S_x^{(1)} \cdot S_x^{(2)} \rangle = -1 ; \quad x = a, b, c \tag{5}$$

is also satisfied, then Bell's inequality holds in the form

$$|\langle S_a^{(1)} S_b^{(2)} \rangle - \langle S_b^{(2)} S_c^{(1)} \rangle| \le 1 + \langle S_a^{(1)} S_c^{(2)} \rangle \tag{6}$$


Proof. The random variables $S_a^{(1)}, S_b^{(2)}, S_c^{(1)}$ satisfy the assumptions of Corollary (2.3), therefore (4) holds. If also condition (5) is satisfied then, since the variables take values in the set $\{-1, +1\}$, with probability 1 one must have

$$S_x^{(1)} = -S_x^{(2)} \quad (x = a, b, c) \tag{7}$$

and therefore $\langle S_a^{(1)} S_c^{(1)} \rangle = -\langle S_a^{(1)} S_c^{(2)} \rangle$. Using this identity, (4) becomes (6).
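The content of Theorem (1) can be illustrated with a toy model in which every variable is allowed to depend on both measurement settings. In the Python sketch below the sample space is a finite product of a hidden-parameter set and two setting sets, each spin variable is an arbitrary random function with values in {-1, +1} of all three coordinates, and the gap in inequality (4) is evaluated. The sizes of the three sets, the random weights, the seed and the names `make_nonlocal_model`, `corr` and `bell_gap` are assumptions of this sketch, not part of the paper.

```python
import itertools
import random


def make_nonlocal_model(rng, n_lambda=4, n_m1=3, n_m2=3):
    """Finite space Lambda x M1 x M2 with random weights and setting-dependent variables."""
    omega = list(itertools.product(range(n_lambda), range(n_m1), range(n_m2)))
    weights = [rng.random() + 1e-9 for _ in omega]  # offset avoids a zero total
    total = sum(weights)
    prob = {w: x / total for w, x in zip(omega, weights)}
    # Each variable is an arbitrary +/-1 function of (lambda, m1, m2),
    # so it depends on the settings of BOTH observers.
    spins = {name: {w: rng.choice((-1, 1)) for w in omega} for name in ("a1", "b2", "c1")}
    return prob, spins


def corr(prob, x, y):
    """Correlation <XY> with respect to the probability on the product space."""
    return sum(p * x[w] * y[w] for w, p in prob.items())


def bell_gap(prob, spins):
    """Right hand side minus left hand side of inequality (4)."""
    lhs = abs(corr(prob, spins["a1"], spins["b2"]) - corr(prob, spins["b2"], spins["c1"]))
    rhs = 1 - corr(prob, spins["a1"], spins["c1"])
    return rhs - lhs


rng = random.Random(1)  # fixed seed so that the run is reproducible
smallest_gap = min(bell_gap(*make_nonlocal_model(rng)) for _ in range(500))
print("smallest gap found:", smallest_gap)
```

The gap never becomes negative (up to rounding), whatever dependence on the settings is chosen, because only the existence of a common probability space enters the inequality.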

Summing up: Theorem (1) proves that Bell's inequality is satisfied if one takes as hypothesis the negation of his "vital assumption". From this we conclude that Bell's "vital assumption" not only is not "vital" but in fact has nothing to do with Bell's inequality.

REMARK. Using Lemma (14.1) below, we can allow that the observables take values in [—1,1] also in Theorem (1).

REMARK. The above discussion is not a refutation of the Bell inequality: it is a refutation of Bell's claim that his formulation of locality is an essential assumption for its validity: since the locality assumption is irrelevant for the proof of Bell's inequality it follows that this inequality cannot discriminate between local and non local hidden variable theories, as claimed both in the introduction and the conclusions of Bell's paper.

In particular: Theorem (1) gives an example of situations in which:

(i) Bell's locality condition is violated while his inequality is satisfied.

In a recent experiment with M. Regoli [4] we have produced examples of situations in which:

(ii) Bell's locality condition is satisfied while his inequality is violated.

6 The role of the counterfactual argument in Bell's proof

Bell uses the counterfactual argument in an essential way in his proof because it is easy to check that formula (13) in [7] is the one which allows him to reduce, in the proof of his inequality, all considerations to the A-variables ($S_a^{(1)}$ in our notations, while Bell's B-variables are the $S_a^{(2)}$ in our notations). The pairs of chameleons (cf. section (10)), as well as the experiment of [4], provide a counterexample precisely to this formula.


7 Proofs of Bell's inequality based on counting arguments

There is a widespread illusion that the above-mentioned critiques can be exorcized by restricting one's considerations to results of measurements. The following considerations show why this is an illusion.

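The counting arguments usually used to prove the Bell inequality are all based on the following scheme. In the same notations used up to now, one considers $N$ simultaneous measurements of the singlet pairs of observables $(S_a^{(1)}, S_b^{(2)})$, $(S_c^{(1)}, S_b^{(2)})$, $(S_a^{(1)}, S_c^{(2)})$ and one denotes by $S_{x,\nu}^{(j)}$ the result of the $\nu$-th measurement of $S_x^{(j)}$ ($j = 1, 2$, $x = a, b, c$, $\nu = 1, \dots, N$). With these notations one can calculate the empirical correlations on the samples, that is

$$\langle S_a^{(1)} S_b^{(2)} \rangle = \frac{1}{N} \sum_{\nu=1}^{N} S_{a,\nu}^{(1)} S_{b,\nu}^{(2)} \tag{1}$$

(and similarly for the other ones). In the Bell inequality, 3 such correlations are involved:

$$\langle S_a^{(1)} S_b^{(2)} \rangle, \quad \langle S_c^{(1)} S_b^{(2)} \rangle, \quad \langle S_a^{(1)} S_c^{(2)} \rangle \tag{2}$$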

Thus in the three experiments observer 1 has to measure $S_a^{(1)}$ in the first and third experiment and $S_c^{(1)}$ in the second, while observer 2 has to measure $S_b^{(2)}$ in the first and second experiment and $S_c^{(2)}$ in the third. Therefore the directions $a$ and $b$ can be chosen arbitrarily by the two observers and it is not necessary that observer 1 is informed of the choice of observer 2 or conversely. However the direction $c$ has to be chosen by both observers and therefore at least on this direction there should be a preliminary agreement between the two observers. This preliminary information can be replaced by a procedure in which each observer chooses at will among the three directions and only those choices are considered for which it happens (by chance) that the second choice of observer 1 coincides with the third of observer 2 (cf. section (15) for further discussion of this point). Whichever procedure has been chosen, after the results of the experiments one can compute the 3 empirical correlations

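$$\langle S_a^{(1)} S_b^{(2)} \rangle = \frac{1}{N} \sum_{j=1}^{N} S_a^{(1)}(p_j^{(1)})\, S_b^{(2)}(p_j^{(1)}) \tag{3}$$

$$\langle S_c^{(1)} S_b^{(2)} \rangle = \frac{1}{N} \sum_{j=1}^{N} S_c^{(1)}(p_j^{(2)})\, S_b^{(2)}(p_j^{(2)}) \tag{4}$$

$$\langle S_a^{(1)} S_c^{(2)} \rangle = \frac{1}{N} \sum_{j=1}^{N} S_a^{(1)}(p_j^{(3)})\, S_c^{(2)}(p_j^{(3)}) \tag{5}$$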

where $p_j^{(3)}$ means the j-th point of the third experiment, etc. If we try to apply the Bell argument directly to the empirical data given by the right hand sides of (3), (4), (5), we meet the expression

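$$\frac{1}{N} \sum_{j=1}^{N} S_a^{(1)}(p_j^{(1)})\, S_b^{(2)}(p_j^{(1)}) - \frac{1}{N} \sum_{j=1}^{N} S_c^{(1)}(p_j^{(2)})\, S_b^{(2)}(p_j^{(2)}) \tag{6}$$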

from which we immediately see that, if we try to apply Bell's reasoning to the empirical data, we are stuck at the first step because we find a sum of terms of the type

$$S_a^{(1)}(p_j^{(1)})\, S_b^{(2)}(p_j^{(1)}) - S_c^{(1)}(p_j^{(2)})\, S_b^{(2)}(p_j^{(2)}) \tag{7}$$

to which the inequalities among numbers, of section (1), cannot be applied because in general

More explicitly: since the expression (x.) above is of the form

$$ab - b'c \tag{8}$$

with a, b, b', c ∈ {±1}, the only possible upper bound for it is 2 and not 1 - ac. Even supposing that we, in order to uphold Bell's thesis, can introduce a cleaning operation [3] (cf. [4]), which eliminates all the points in which (8) is not satisfied, we would arrive at the inequality

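$$\left| \frac{1}{N} \sum_{j=1}^{N} S_a^{(1)}(p_j^{(1)})\, S_b^{(2)}(p_j^{(1)}) - \frac{1}{N} \sum_{j=1}^{N} S_c^{(1)}(p_j^{(2)})\, S_b^{(2)}(p_j^{(2)}) \right| \le 1 - \frac{1}{N} \sum_{j=1}^{N} S_a^{(1)}(p_j^{(1)})\, S_c^{(1)}(p_j^{(2)}) \tag{9}$$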

and, in order to deduce from this something comparable with the experiments, we need to use the counterfactual argument, asserting that

$$S_c^{(1)}(p_j^{(2)}) = -S_c^{(2)}(p_j^{(2)}) \tag{10}$$


But in the second experiment $S_b^{(2)}$ and not $S_c^{(2)}$ has been measured. Thus to postulate the validity of (10) means to postulate that the value assumed by $S_b^{(2)}$ in the second experiment is the same that we would have found if $S_c^{(2)}$ and not $S_b^{(2)}$ had been measured. The chameleon effect provides a counterexample to this statement.

8 The quantum probabilistic analysis

Given the results of sections (5), (6), (7), it is then legitimate to ask: if Bell's vital assumption is irrelevant for the deduction of Bell's inequality, which is the really vital assumption which guarantees the validity of this inequality?

This natural question was first answered in [1], and this result motivated the birth of quantum probability as something more than a mere noncommutative generalization of probability theory: in fact, as a necessity motivated by experimental data.

Theorem (2.3) has only two assumptions:

(i) that the random variables take values in the interval [-1, +1]

(ii) that the random variables are defined on the same probability space

Since we are dealing with spin variables, assumption (i) is reasonable. Let us consider assumption (ii). This is equivalent to the claim that the three probability measures $P_{ab}, P_{ac}, P_{cb}$, representing the distributions of the pairs $(S_a^{(1)}, S_b^{(2)})$, $(S_a^{(1)}, S_c^{(2)})$, $(S_c^{(1)}, S_b^{(2)})$ respectively, can be obtained by restriction from a single probability measure P, representing the distribution of the quadruple $S_a^{(1)}, S_c^{(1)}, S_b^{(2)}, S_c^{(2)}$.

This is indeed a strong assumption because, due to the incompatibility of the spin variables along non parallel directions, the three correlations

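$$\langle S_a^{(1)} S_b^{(2)} \rangle ,\quad \langle S_c^{(1)} S_b^{(2)} \rangle ,\quad \langle S_a^{(1)} S_c^{(2)} \rangle \tag{1}$$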

can only be estimated in different, in fact mutually incompatible, series of experiments. If we label each series of experiments by the corresponding pair (i.e. (a, b), (b, c), (c, a)), then we cannot exclude the possibility that the probability measure in each series of experiments will also depend on the corresponding pair. In other words, each of the measures $P_{a,b}, P_{b,c}, P_{c,a}$ describes the joint statistics of a pair of commuting observables $(S_a^{(1)}, S_b^{(2)})$, $(S_c^{(1)}, S_b^{(2)})$, $(S_a^{(1)}, S_c^{(2)})$ and there is no a priori reason to postulate that all these joint distributions for pairs can be deduced from a single distribution for the quadruple $\{S_a^{(1)}, S_c^{(1)}, S_b^{(2)}, S_c^{(2)}\}$.

We have already proved in Theorem (2.3) that this strong assumption implies the validity of the Bell inequality. Now let us prove that it is the truly vital assumption for the validity of this inequality, i.e. that, if this assumption is dropped, so that no single distribution for quadruples exists, then it is an easy exercise to construct counterexamples violating Bell's inequality. To this goal one can use the following lemma:

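Lemma (1). Let $P_{ab}, P_{ac}, P_{cb}$ be three probability measures on a given (measurable) space $(\Omega, \mathcal{F})$ and let $S_a^{(1)}, S_c^{(1)}, S_b^{(2)}, S_c^{(2)}$ be functions, defined on $(\Omega, \mathcal{F})$ with values in the interval [-1, +1], and such that the probability measure $P_{ab}$ (resp. $P_{cb}, P_{ac}$) is the distribution of the pair $(S_a^{(1)}, S_b^{(2)})$ (resp. $(S_c^{(1)}, S_b^{(2)})$, $(S_a^{(1)}, S_c^{(2)})$). For each pair define the corresponding correlation

$$K_{ab} := \langle S_a^{(1)} S_b^{(2)} \rangle := \int S_a^{(1)} S_b^{(2)} \, dP_{ab}$$

and suppose that, for ε, ε' = ±, the joint probabilities for pairs

$$P_{xy}^{\varepsilon\varepsilon'} := P(S_x^{(1)} = \varepsilon \,;\; S_y^{(2)} = \varepsilon')$$

satisfy:

$$P_{xy}^{++} = P_{xy}^{--} \,;\quad P_{xy}^{+-} = P_{xy}^{-+} \tag{2}$$

$$P_x^{+} = P_x^{-} = 1/2 \tag{3}$$

then the Bell inequality

$$|K_{ab} - K_{cb}| \le 1 - K_{ac} \tag{4}$$

is equivalent to

$$|P_{ab}^{++} - P_{cb}^{++}| + P_{ac}^{++} \le \frac{1}{2} \tag{5}$$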

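Proof. The inequality (4) is equivalent to

$$|2P_{ab}^{++} - 2P_{ab}^{+-} - 2P_{cb}^{++} + 2P_{cb}^{+-}| \le 1 - 2P_{ac}^{++} + 2P_{ac}^{+-} \tag{6}$$

Using the identity (equivalent to (3))

$$P_{xy}^{+-} = \frac{1}{2} - P_{xy}^{++} \tag{7}$$

the left hand side of (4) becomes the modulus of

$$2(P_{ab}^{++} - P_{ab}^{+-}) - 2(P_{cb}^{++} - P_{cb}^{+-}) = 2\left(P_{ab}^{++} - \frac{1}{2} + P_{ab}^{++}\right) - 2\left(P_{cb}^{++} - \frac{1}{2} + P_{cb}^{++}\right) = 4(P_{ab}^{++} - P_{cb}^{++}) \tag{8}$$

and, again using (7), the right hand side of (6) is equal to

$$1 - 2\left(P_{ac}^{++} - \frac{1}{2} + P_{ac}^{++}\right) = 2 - 4P_{ac}^{++} \tag{9}$$

Summing up, (4) is equivalent to

$$|P_{ab}^{++} - P_{cb}^{++}| \le \frac{1}{2} - P_{ac}^{++} \tag{10}$$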

which is (5).

Corollary (2). There exist triples $P_{ab}, P_{ac}, P_{cb}$ of probability measures on the 4-point space {+1, -1} × {+1, -1} which satisfy conditions (1), (2) of Lemma (1) and are not compatible with any probability measure P on the 8-point space {+1, -1} × {+1, -1} × {+1, -1}.

Proof. Because of conditions (1), (3) the probability measures $P_{ab}, P_{ac}, P_{cb}$ are uniquely determined by the three numbers

$$P_{ab}^{++},\; P_{cb}^{++},\; P_{ac}^{++} \in [0, 1/2] \tag{11}$$

Thus, if we choose these three numbers so that the inequality (5) is not satisfied, the Bell inequality (4) cannot be satisfied because of Lemma (1).
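The equivalence stated in Lemma (1) can be checked on a small grid of values. The sketch below is only an illustration: the function names are hypothetical, and it assumes the symmetric joint law of conditions (2), (3), so that each correlation reduces to K = 4 P^{++} - 1. Exact rational arithmetic avoids rounding issues on the boundary of the inequalities.

```python
from fractions import Fraction
from itertools import product


def correlation(p_pp):
    """Correlation of a +/-1 pair with P++ = P-- = p_pp and P+- = P-+ = 1/2 - p_pp.

    K = (P++ + P--) - (P+- + P-+) = 2*p_pp - 2*(1/2 - p_pp) = 4*p_pp - 1.
    """
    return 4 * p_pp - 1


def bell_holds(p_ab, p_cb, p_ac):
    """Inequality (4): |K_ab - K_cb| <= 1 - K_ac."""
    return abs(correlation(p_ab) - correlation(p_cb)) <= 1 - correlation(p_ac)


def reduced_holds(p_ab, p_cb, p_ac):
    """Inequality (5): |P_ab++ - P_cb++| + P_ac++ <= 1/2."""
    return abs(p_ab - p_cb) + p_ac <= Fraction(1, 2)


# Grid of admissible values 0, 1/8, ..., 1/2 for each P++.
grid = [Fraction(k, 8) for k in range(5)]
equivalent_on_grid = all(
    bell_holds(*triple) == reduced_holds(*triple)
    for triple in product(grid, repeat=3)
)

# One choice of the three numbers for which (5), hence (4), fails.
violating = (Fraction(1, 2), Fraction(0), Fraction(1, 2))
```

With the choice `violating`, the three correlations are K_ab = 1, K_cb = -1, K_ac = 1, so the left hand side of (4) is 2 while the right hand side is 0.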

9 The realism of ballot boxes and the corresponding statistics

The fact that there is no a priori reason to postulate that the joint distributions of the pairs $(S_a^{(1)}, S_b^{(2)})$, $(S_c^{(1)}, S_b^{(2)})$, $(S_a^{(1)}, S_c^{(2)})$ can be deduced from a single distribution for the quadruple $S_a^{(1)}, S_c^{(1)}, S_b^{(2)}, S_c^{(2)}$ does not necessarily mean that such a common joint distribution does not exist.


On the contrary, in several physically meaningful situations, we have good reasons to expect that such a joint distribution should exist even if it might not be accessible to direct experimental verification.

This is a simple consequence of the so-called hypothesis of realism, which is justified whenever we are entitled to believe that the results of our measurements are predetermined. In the words of Bell: Since we can predict in advance the result of measuring any chosen component of $\sigma_2$, by previously measuring the same component of $\sigma_1$, it follows that the result of any such measurement must actually be predetermined.

Consider for example a box containing pairs of balls. Suppose that the experiments allow one to measure either the color or the weight or the material of which each ball is made, but the rules of the game are that on each ball only one measurement at a time can be performed. Suppose moreover that the experiments show that, for each property, only two values are realized and that, whenever a simultaneous measurement of the same property on the two elements of a pair is performed, the resulting answers are always discordant. Up to a change of convention and in appropriate units, we can always suppose that these two values are ±1 and we shall do so in the following.

Then the joint distributions of pairs (of properties relative to different balls) are accessible to experiment, but those of triples, or quadruples, are not.

Nevertheless, it is reasonable to postulate that, in the box, there is a well defined (although purely Platonic, in the sense of not being accessible to experiment) number of balls with each given color, weight and material. These numbers give the relative frequencies of triples of properties for each element of the pair hence, using the perfect anticorrelation, a family of joint probabilities for all the possible sextuples. More precisely, due to the perfect anticorrelation, the relative frequency of the triples of properties

$$[S_a^{(1)} = a_1] ,\; [S_b^{(1)} = b_1] ,\; [S_c^{(1)} = c_1]$$

where $a_1, b_1, c_1 = \pm 1$, is equal to the relative frequency of the sextuples of properties

$$[S_a^{(1)} = a_1] ,\; [S_b^{(1)} = b_1] ,\; [S_c^{(1)} = c_1] ,\; [S_a^{(2)} = -a_1] ,\; [S_b^{(2)} = -b_1] ,\; [S_c^{(2)} = -c_1]$$

and, since we are confining ourselves to the case of 3 properties and 2 particles, the above ones, when $a_1, b_1, c_1$ vary in all possible ways in the set {±1}, are all the possible configurations. In this situation the counterfactual argument is applicable, and in fact we have used it to deduce the joint distribution of sextuples from the joint distributions of triples.
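The ballot box picture can be sketched numerically. The following is a minimal illustration under stated assumptions: the names `ballot_box`, `correlations` and `satisfies_bell` are hypothetical, the composition of the box is drawn at random, and the inequality checked is the form for random variables on one probability space, |⟨S_a^{(1)} S_b^{(2)}⟩ - ⟨S_c^{(1)} S_b^{(2)}⟩| ≤ 1 - ⟨S_a^{(1)} S_c^{(1)}⟩. Because every pair carries a full predetermined triple, the inequality holds for any composition of the box.

```python
import random
from itertools import product


def ballot_box(n_pairs, seed):
    """Draw n_pairs predetermined triples (a1, b1, c1) for the first ball of each pair.

    The second ball carries the opposite values (perfect anticorrelation).
    The weights of the 8 possible triples are arbitrary (a 'Platonic' composition).
    """
    rng = random.Random(seed)
    triples = list(product((-1, 1), repeat=3))
    weights = [rng.random() for _ in triples]
    return rng.choices(triples, weights=weights, k=n_pairs)


def correlations(box):
    """Return <S_a(1) S_b(2)>, <S_c(1) S_b(2)>, <S_a(1) S_c(1)> on the same sample."""
    n = len(box)
    k_ab = sum(a1 * (-b1) for a1, b1, c1 in box) / n
    k_cb = sum(c1 * (-b1) for a1, b1, c1 in box) / n
    k_ac_same_ball = sum(a1 * c1 for a1, b1, c1 in box) / n
    return k_ab, k_cb, k_ac_same_ball


def satisfies_bell(box, tol=1e-12):
    """Check |K_ab - K_cb| <= 1 - <S_a(1) S_c(1)> up to rounding."""
    k_ab, k_cb, k_ac_same_ball = correlations(box)
    return abs(k_ab - k_cb) <= 1 - k_ac_same_ball + tol


all_boxes_ok = all(satisfies_bell(ballot_box(2000, seed)) for seed in range(20))
```

The check succeeds for every seed because the inequality among numbers |ab - cb| ≤ 1 - ac holds pair by pair when a, b, c refer to the same pair of balls.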


10 The realism of chameleons and the corresponding statistics

According to the quantum probabilistic interpretation, what Einstein, Podolsky, Rosen, Bell and several others who have discussed this topic call the hypothesis of realism should be called, more precisely, the hypothesis of ballot box realism, as opposed to the hypothesis of chameleon realism.

The point is that, according to the quantum probabilistic interpretation, the term predetermined should not be confused with the term realized a priori, which has been discussed in section (9): it might be conditionally decided according to the scheme: if such and such will happen, I will react so and so...

The chameleon provides a simple example of this distinction: a chameleon becomes deterministically green on a leaf and brown on a log. In this sense we can surely claim that its color on a leaf is predetermined. However this does not mean that the chameleon was green also before jumping on the leaf.

The chameleon metaphor describes a mechanism which is perfectly local, even deterministic and surely classical and macroscopic; moreover there is no doubt that the situation it describes is absolutely realistic. Yet this realism, being different from the ballot box realism, frees from metaphysics such statements of the orthodox interpretation as: the act of measurement creates the value of the measured observable. To many this looks metaphysical or magical; but look how natural it sounds when you think of the color of a chameleon.

Finally, and most importantly for its implications for the EPR argument, the chameleon realism provides a simple and natural counterexample: a situation in which the results are predetermined and yet the counterfactual argument is not applicable.

Imagine in fact a box in which there are many pairs of chameleons. In each pair there is exactly one healthy chameleon, which becomes green on a leaf and brown on a log, and one mutant, which becomes brown on a leaf and green on a log; moreover exactly one of the chameleons in each pair weighs 100 grams and exactly one 200 grams. A measurement consists in separating the members of each pair, each one in a smaller box, and in performing one and only one measurement on each member of each pair.

The color on the leaf, the color on the log, and the weight are 2-valued observables (because we do not know a priori if we are measuring the healthy or the mutant chameleon). Thus, with respect to the observables color on the leaf, color on the log and weight, the pairs of chameleons behave exactly as EPR pairs: whenever the same observable is measured on both elements of a pair, the results are opposite. However, suppose I measure the color on the leaf of one element of a pair and the weight of the other one, and suppose the answers I find are: green and 100 grams. Can I conclude that the second element of the pair is brown and weighs 100 grams? Clearly not, because there is no reason to believe that the second member of the pair, of which the weight was measured while in a box, was also on a leaf.

From this point of view the measurement interaction enters the very definition of an observable. However, also in this interpretation, which is more similar to the quantum mechanical situation, the counterfactual argument cannot be applied because it amounts to answering "brown" to the question: which is the color on the leaf, if I have measured the weight and if I know that the chameleon is the mutant one? (This is because the measurement of the other one gave green on the leaf.) But this answer is not correct, because it could well be that inside the box there is a leaf and the chameleon is interacting with it while I am measuring its weight, but it could also be that it is interacting with a log, also contained inside the box, in which case, being a mutant, it would be green.
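The chameleon box can be written as a small deterministic model. This is only a sketch of the metaphor: the dictionary layout and the names `color`, `make_pair` and `measure` are assumptions introduced for illustration. It shows that same-observable measurements on a pair are always opposite, while the color of a chameleon that was weighed next to a log cannot be inferred from its partner's color on a leaf.

```python
import random


def color(is_mutant, support):
    """Deterministic, local response of one chameleon to the support it sits on."""
    healthy = {"leaf": "green", "log": "brown"}
    mutant = {"leaf": "brown", "log": "green"}
    return (mutant if is_mutant else healthy)[support]


def make_pair(rng):
    """One healthy and one mutant chameleon; one weighs 100 g, the other 200 g."""
    first_is_mutant = rng.random() < 0.5
    first_is_light = rng.random() < 0.5
    first = {"mutant": first_is_mutant, "weight": 100 if first_is_light else 200}
    second = {"mutant": not first_is_mutant, "weight": 200 if first_is_light else 100}
    return first, second


def measure(chameleon, observable):
    """observable is 'weight', or 'leaf' / 'log' (color on that support)."""
    if observable == "weight":
        return chameleon["weight"]
    return color(chameleon["mutant"], observable)


rng = random.Random(0)
pairs = [make_pair(rng) for _ in range(1000)]

# EPR-like behaviour: the same observable on both members gives opposite results.
always_opposite = all(
    measure(c1, obs) != measure(c2, obs)
    for c1, c2 in pairs
    for obs in ("leaf", "log", "weight")
)

# Partner found green on a leaf, so this one is the mutant; its color still
# depends on what it is actually interacting with inside its box.
mutant_on_leaf = color(True, "leaf")
mutant_on_log = color(True, "log")
```

In this model every answer is predetermined by the state and the environment, yet "brown" is the correct color of the mutant only if it really sits on a leaf.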

Therefore if we can produce an example of a 2-particle system in which the Heisenberg evolution of each particle's observable satisfies Bell's locality condition, but the Schroedinger evolution of the state, i.e. the expectation value (•), depends on the pair (a,b) of measured observables, we can claim that this counterexample abides with the same definition of locality as Bell's theorem.

11 Bell's inequalities and the chameleon effect

Definition (1) Let S be a physical system and O a family of observable quantities relative to this system. We say that the chameleon effect is realized on S if, for any measurement M of an observable A ∈ O, the dynamical evolution of S depends on the observable A. If D denotes the state space of S, this means that the change of state from the beginning to the end of the experiment is described by a map (a one-parameter group or semigroup in the case of continuous time)

$$T_A : D \to D$$

Remark. The explicit form of the dependence of $T_A$ on A depends on both the system and the measurement, and many concrete examples can be constructed. An example in the quantum domain is discussed in [3] and the experiment of [4] realizes an example in the classical domain.

Remark If the system S is composed of two sub-systems $S_1$ and $S_2$, we can also consider the case in which the evolutions of the two subsystems are different in the sense that, for system 1, we have one form of functional dependence, $T_A^{(1)}$, of the evolution associated to the observable A and, for system 2, we have another form of functional dependence, $T_A^{(2)}$. In the experiment of [4], the state space is the unit disk D in the plane, the observables are parametrized by angles in [0, 2π) (or equivalently by unit vectors in the unit circle) and, for each observable $S_a^{(1)}$ of system 1

and, for each observable $S_a^{(2)}$ of system 2

where $R_\alpha$ denotes (counterclockwise) rotation of an angle α.

Let us consider Bell's inequalities by assuming that a chameleon effect is present. Denoting E the common initial state of the composite system (1,2) (e.g. singlet state), the state at the end of the measurement will be

Now replace $S_x^{(j)}$ by

$$\tilde S_x^{(j)} := S_x^{(j)} \circ T_x^{(j)}$$

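Since the $\tilde S_x^{(j)} = S_x^{(j)} \circ T_x^{(j)}$ take values ±1, we know from Theorem (2.3) that, if we postulate the existence of joint probabilities for the triple $\tilde S_a^{(1)}, \tilde S_b^{(2)}, \tilde S_c^{(1)}$, compatible with the two correlations $E(\tilde S_a^{(1)} \tilde S_b^{(2)})$, $E(\tilde S_c^{(1)} \tilde S_b^{(2)})$, then the inequality

$$|E(\tilde S_a^{(1)} \tilde S_b^{(2)}) - E(\tilde S_c^{(1)} \tilde S_b^{(2)})| \le 1 - E(\tilde S_a^{(1)} \tilde S_c^{(1)})$$

holds and, if we also have the singlet condition

$$E\big(S_x^{(1)}(T_x^{(1)} p)\, S_x^{(2)}(T_x^{(2)} p)\big) = -1 \tag{1}$$

then a.e.

$$\tilde S_x^{(1)} = -\tilde S_x^{(2)}$$

and we have Bell's inequality. Thus, if we postulate the same probability space, even the chameleon effect alone is not sufficient to guarantee a violation of Bell's inequality.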

Therefore the fact that the three experiments are done on different and incompatible samples must play a crucial role.


As far as the chameleon effect is concerned, let us notice that, in the above statement of the problem, the fact that we use a single initial probability measure E is equivalent to postulating that, at time t = 0, the three pairs of observables

$$(S_a^{(1)}, S_b^{(2)}) ,\quad (S_c^{(1)}, S_b^{(2)}) ,\quad (S_a^{(1)}, S_c^{(1)})$$

admit a common joint distribution, in fact E.

12 Physical implausibility of Bell's argument

In this section we show that, when the chameleon effect is combined with the fact that the three experiments refer to different samples, then even in very simple situations no cleaning conditions can lead to a proof of Bell's inequality.

If we try to apply Bell's reasoning to the empirical data, we have to start from the expression

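$$\left| \frac{1}{N} \sum_j S_a^{(1)}(T_a^{(1)} p_j^{I})\, S_b^{(2)}(T_b^{(2)} p_j^{I}) - \frac{1}{N} \sum_j S_c^{(1)}(T_c^{(1)} p_j^{II})\, S_b^{(2)}(T_b^{(2)} p_j^{II}) \right| \tag{1}$$

which we majorize by

$$\frac{1}{N} \sum_j \left| S_a^{(1)}(T_a^{(1)} p_j^{I})\, S_b^{(2)}(T_b^{(2)} p_j^{I}) - S_c^{(1)}(T_c^{(1)} p_j^{II})\, S_b^{(2)}(T_b^{(2)} p_j^{II}) \right| \tag{2}$$

But, if we try to apply the inequality among numbers to the expression

$$\left| S_a^{(1)}(T_a^{(1)} p_j^{I})\, S_b^{(2)}(T_b^{(2)} p_j^{I}) - S_c^{(1)}(T_c^{(1)} p_j^{II})\, S_b^{(2)}(T_b^{(2)} p_j^{II}) \right| \tag{3}$$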

we see that we are not dealing with the situation covered by Corollary (1.2), i.e.

$$|ab - cb| \le 1 - ac \tag{4}$$

because, since

$$S_b^{(2)}(T_b^{(2)} p_j^{I}) \ne S_b^{(2)}(T_b^{(2)} p_j^{II}) \tag{5}$$

the left hand side of (4) must be replaced by

$$|ab - cb'| \tag{6}$$

whose maximum, for a, b, c, b' ∈ [-1, +1], is 2 and not 1 - ac.
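The difference between the two expressions can be seen by enumerating the extreme values ±1, which is where a multilinear expression on [-1, +1] attains its maximum. The short sketch below is an illustration only; the variable names are not from the text.

```python
from itertools import product

SIGNS = (-1, 1)

# Same b in both terms: |ab - cb| <= 1 - ac holds for every choice of signs.
shared_b_always_ok = all(
    abs(a * b - c * b) <= 1 - a * c
    for a, b, c in product(SIGNS, repeat=3)
)

# Different b and b': the best general bound for |ab - cb'| is 2.
max_with_two_b = max(
    abs(a * b - c * bp)
    for a, b, c, bp in product(SIGNS, repeat=4)
)

# Sign choices for which |ab - cb'| exceeds 1 - ac.
violations = [
    (a, b, c, bp)
    for a, b, c, bp in product(SIGNS, repeat=4)
    if abs(a * b - c * bp) > 1 - a * c
]
```

For instance a = b = c = 1, b' = -1 gives |ab - cb'| = 2 while 1 - ac = 0, so the step from (2) to a Bell-type bound fails as soon as b and b' may differ.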


Bell's implicit assumption of the single probability space is equivalent to the postulate that, for each j = 1, ..., N

$$p_j^{I} = p_j^{II} \tag{7}$$

Physically this means that the hidden parameter in the first experiment is the same as the hidden parameter in the second experiment. This is surely a very implausible assumption.

Notice however that, without this assumption, Bell's argument cannot be carried through and we cannot deduce the inequality because we must stop at equation (2).

13 The role of the single probability space in CHSH's proof

Clauser, Horne, Shimony and Holt [9] introduced the variant (2.6) of the Bell inequality for quadruples (a,b), (a,b'), (a',b), (a',b'), which is based on the following inequality among numbers a, b, a', b' ∈ [-1, 1]:

$$|ab + ab' + a'b - a'b'| \le 2 \tag{1}$$

Section (1) already contains a proof of (1). A direct proof follows from

$$|b + b'| + |b - b'| \le 2 \tag{2}$$

because

$$|ab + ab' + a'b - a'b'| = |a(b + b') + a'(b - b')| \le |a| \cdot |b + b'| + |a'| \cdot |b - b'| \le |b + b'| + |b - b'| \le 2$$

The proof of (2) is obvious.

Remark (1) Notice that an inequality of the form

$$|a_1 b_1 + a_2 b_2' + a_3' b_3 - a_4' b_4'| \le 2 \tag{3}$$

would be obviously false. In fact, for example, the choice

$$a_1 = b_1 = a_2 = b_2' = a_3' = b_3 = b_4' = 1 \,;\quad a_4' = -1$$

would give $|a_1 b_1 + a_2 b_2' + a_3' b_3 - a_4' b_4'| = 4$.


That is: for the validity of (1) it is absolutely essential that the number a is the same in the first and the second term, and similarly for a' in the third and the fourth, b' in the second and the fourth, b in the first and the third.
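Both statements can be verified by brute force over the values ±1. The sketch below is illustrative; the function names are hypothetical, and `chsh_unshared` simply gives each of the four products its own independent pair of numbers, as in (3).

```python
from itertools import product

SIGNS = (-1, 1)


def chsh(a, b, a_prime, b_prime):
    """Left hand side of (1) without the modulus: the same four numbers throughout."""
    return a * b + a * b_prime + a_prime * b - a_prime * b_prime


def chsh_unshared(a1, b1, a2, bp2, ap3, b3, ap4, bp4):
    """Expression in (3): every product uses its own, unrelated numbers."""
    return a1 * b1 + a2 * bp2 + ap3 * b3 - ap4 * bp4


max_shared = max(abs(chsh(*v)) for v in product(SIGNS, repeat=4))
max_unshared = max(abs(chsh_unshared(*v)) for v in product(SIGNS, repeat=8))

# The explicit choice made in Remark (1).
remark_value = chsh_unshared(1, 1, 1, 1, 1, 1, -1, 1)
```

The bound 2 is reached only because a, a', b, b' are shared between terms; once they are decoupled the maximum is 4.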

This inequality among numbers can be extended to pairs of random variables by introducing the following postulates:

(P1) Instead of four numbers a, b, a', b' ∈ [-1, 1], one considers four functions

$$S_a^{(1)} ,\; S_b^{(2)} ,\; S_{a'}^{(1)} ,\; S_{b'}^{(2)}$$

all defined on the same space Λ (whose points are called hidden parameters) and with values in [-1, 1].

(P2) One postulates that there exists a probability measure P on Λ which defines the joint distribution of each of the following four pairs of functions

$$(S_a^{(1)}, S_b^{(2)}) ,\; (S_a^{(1)}, S_{b'}^{(2)}) ,\; (S_{a'}^{(1)}, S_b^{(2)}) ,\; (S_{a'}^{(1)}, S_{b'}^{(2)}) \tag{4}$$

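Remark (2) Notice that (P2) automatically implies that the joint distributions of the four pairs of functions can be deduced from a joint distribution of the whole quadruple, i.e. the existence of a single Kolmogorov model for these four pairs.

With these premises, for each λ ∈ Λ one can apply the inequality (1) to the four numbers

$$S_a^{(1)}(\lambda) ,\; S_b^{(2)}(\lambda) ,\; S_{a'}^{(1)}(\lambda) ,\; S_{b'}^{(2)}(\lambda)$$

and deduce that

$$|S_a^{(1)}(\lambda) S_b^{(2)}(\lambda) + S_a^{(1)}(\lambda) S_{b'}^{(2)}(\lambda) + S_{a'}^{(1)}(\lambda) S_b^{(2)}(\lambda) - S_{a'}^{(1)}(\lambda) S_{b'}^{(2)}(\lambda)| \le 2 \tag{5}$$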

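From this, taking P-averages, one obtains

$$\left| \langle S_a^{(1)} S_b^{(2)} \rangle + \langle S_a^{(1)} S_{b'}^{(2)} \rangle + \langle S_{a'}^{(1)} S_b^{(2)} \rangle - \langle S_{a'}^{(1)} S_{b'}^{(2)} \rangle \right| \tag{6}$$

$$= \left| \int \left( S_a^{(1)}(\lambda) S_b^{(2)}(\lambda) + S_a^{(1)}(\lambda) S_{b'}^{(2)}(\lambda) + S_{a'}^{(1)}(\lambda) S_b^{(2)}(\lambda) - S_{a'}^{(1)}(\lambda) S_{b'}^{(2)}(\lambda) \right) dP(\lambda) \right| \tag{7}$$

$$\le \int \left| S_a^{(1)}(\lambda) S_b^{(2)}(\lambda) + S_a^{(1)}(\lambda) S_{b'}^{(2)}(\lambda) + S_{a'}^{(1)}(\lambda) S_b^{(2)}(\lambda) - S_{a'}^{(1)}(\lambda) S_{b'}^{(2)}(\lambda) \right| dP(\lambda) \le 2 \tag{8}$$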

Remark (3) Notice that in the step from (6) to (7) we have used in an essential way the existence of a joint distribution for the whole quadruple, i.e. the fact that all these random variables can be realized in the same probability space.

In EPR type experiments we are interested in the case in which the four pairs (a, b), (a, b'), (a', b), (a', b') come from four mutually incompatible experiments. Let us assume that there is a hidden parameter determining the result of each of these experiments. This means that we interpret the number $S_a^{(1)}(\lambda)$ as the value of the spin of particle 1 in direction a, determined by the hidden parameter λ.

There is obviously no reason to postulate that the hidden parameter determining the result of the first experiment is exactly the same one which determines the result of the second experiment. However, when CHSH consider the quantity (5), they are implicitly making the much stronger assumption that the same hidden parameter λ determines the results of all four experiments. This assumption is quite unreasonable from the physical point of view and in any case it is a much stronger assumption than simply postulating the existence of hidden parameters. The latter assumption would allow CHSH only to consider the expression

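$$S_a^{(1)}(\lambda_1) S_b^{(2)}(\lambda_1) + S_a^{(1)}(\lambda_2) S_{b'}^{(2)}(\lambda_2) + S_{a'}^{(1)}(\lambda_3) S_b^{(2)}(\lambda_3) - S_{a'}^{(1)}(\lambda_4) S_{b'}^{(2)}(\lambda_4) \tag{9}$$

and, as shown in Remark (1) above, the maximum of this expression is not 2 but 4, and this does not allow one to deduce the Bell inequality.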

14 The role of the counterfactual argument in CHSH's proof

Contrary to Bell's original argument, the CHSH proof of the Bell inequality does not explicitly use the counterfactual argument. The fact that one can perform experiments also on quadruples, rather than on triples as originally proposed by Bell, has led some authors to claim that the counterfactual argument is not essential in the deduction of the Bell inequality. However we have just seen in section (7.) that the hidden assumption of Bell's proof, i.e. the realizability of all the random variables involved in the same probability space, is also present in the CHSH argument. The following lemma shows that, under the singlet assumption, the conclusion of the counterfactual argument follows from the hidden assumption of Bell and of CHSH.


Lemma (1) If f and g are random variables defined on a probability space (Λ, P) and with values in [-1, 1], then

$$\langle fg \rangle := \int_\Lambda fg \, dP = -1$$

if and only if

$$P(fg = -1) = 1$$

Proof. If P(fg > -1) > 0, then

$$\int_\Lambda fg \, dP = -P(fg = -1) + \int_{\{fg > -1\}} fg \, dP > -P(fg = -1) - P(fg > -1) = -1$$

Corollary (2) Suppose that all the random variables in (x.3) are realized in the same probability space. Then, if the singlet condition

$$\langle S_x^{(1)} S_x^{(2)} \rangle = -1 \tag{1}$$

is satisfied, the condition

$$S_x^{(1)} = -S_x^{(2)} \tag{2}$$

(i.e. formula (13) in Bell's '64 paper) is true almost everywhere.

Proof. Follows from Lemma (1) with the choice $f = S_x^{(1)}$, $g = S_x^{(2)}$.

Summing up: if you want to compare the predictions of a hidden variable theory with quantum theory in the EPR experiment (so that at least we admit the validity of the singlet law), then the hidden assumption of realizability of all the random variables in (3) in the same probability space (without which Bell's inequality cannot be proved) implies the same conclusion as the counterfactual argument. Stated otherwise: the counterfactual argument is implicit when you postulate the singlet condition and the realizability on a single probability space. It does not matter if you use triples or quadruples.

15 Physical difference between the CHSH's and the original Bell's inequalities

In the CHSH scheme:

(a,b), (a',b), (a,b'), (a',b')


the agreement required of the experimenters is the following:

- 1 will measure the same observable in experiments I and III, and the same observable in experiments II and IV;

- 2 will measure the same observable in experiments I and II, and the same observable in experiments III and IV.

Here there is no a priori restriction on the choice of the observables to be measured.

In the Bell scheme the experimentalists agree that:

- 1 measures the same observable in experiments I and III;

- 2 measures the same observable in experiments I and II;

- 1 and 2 choose a priori, i.e. before the experiment begins, a direction c and agree that 1 will measure spin in direction c in experiment II and 2 will measure spin in direction c in experiment III (strong agreement).

The strong agreement can be replaced by the following (weak agreement):

- 1 and 2 choose a priori, i.e. before the experiment begins, a finite set of directions $c_1, \dots, c_K$ and agree that 1 will measure spin in a direction chosen randomly among the directions $c_1, \dots, c_K$ in experiment II and 2 will do the same in experiment III.

In this scheme there is an a priori restriction on the choice of some of the observables to be measured.

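If K directions are fixed a priori in the plane, then the probability of a coincidence, corresponding to a totally random (equiprobable) choice, is

$$P(c_{II}^{(1)} = c_{III}^{(2)}) = \sum_{\alpha=1}^{K} P(c_{II}^{(1)} = c_\alpha \,;\; c_{III}^{(2)} = c_\alpha) = \sum_{\alpha=1}^{K} \frac{1}{K^2} = \frac{1}{K}$$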

This shows that, contrary to the CHSH scheme, the choice has to be restricted to a finite number of possibilities, otherwise the probability of coincidence will be zero.
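The coincidence probability under the weak agreement can be counted directly. The sketch below is illustrative only (the function name is hypothetical): it enumerates all equally likely pairs of independent choices among K directions and counts those in which the two observers picked the same one.

```python
from fractions import Fraction
from itertools import product


def coincidence_probability(k):
    """Probability that two independent, equiprobable choices among k directions agree."""
    outcomes = list(product(range(k), repeat=2))   # (choice of 1, choice of 2)
    hits = sum(1 for c1, c2 in outcomes if c1 == c2)
    return Fraction(hits, len(outcomes))


probabilities = {k: coincidence_probability(k) for k in range(1, 9)}
```

The value is 1/K for every K, so it tends to zero as the number of allowed directions grows.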

From this point of view we can claim that the Clauser, Horne, Shimony, Holt formulation of Bell's inequalities realizes a small improvement with respect to Bell's original formulation.

Reproduction of the EPR correlations by the chameleon effect

Consider a classical dynamical system composed of two particles (1,2). Let S denote the state space of each of the particles and suppose that at time $t = t_0$ (initial time) the state $\mu_1^0$ of particle 1 and the state $\mu_2^0$ of particle 2 coincide:

$$\mu_1^0 = \mu_2^0 \tag{1}$$


Starting from time $t_0$, the two particles begin to move in opposite directions and, after a time interval of length T, two independent and non communicating experimenters simultaneously perform a measurement on each particle.

Experimenter 1 (resp. 2) can choose among three different measurements, corresponding to the observables

$$S_a^{(1)}, S_b^{(1)}, S_c^{(1)} \quad (\text{resp. } S_a^{(2)}, S_b^{(2)}, S_c^{(2)}) \tag{2}$$

of particle 1 (resp. particle 2). We suppose that both particles satisfy the chameleon effect, described by the following:

DEFINITION (1). Let S be the state space of a dynamical system σ, let I be a set and, for each x ∈ I, let a function

$$S_x : S \to \mathbb{R} \,;\quad x \in I \tag{3}$$

representing an observable of the system, be given. The system σ is said to realize the chameleon effect with respect to the observables (3.3) if: whenever the observable $S_x$ is measured, the dynamical evolution of the system

$$T_x^t : S \to S \,;\quad t \in \mathbb{R} \tag{4}$$

depends on the measured observable $S_x$.

In our case we consider only two instants of time, the initial one and the one when the measurement takes place, and we omit time from our notations. Moreover, in our case we have two particles and each particle is far away from the other one, hence it can only feel the interaction with the measurement apparatus near to it. So, combining the locality principle with the chameleon effect, we conclude that, if experimenter 1 (resp. 2) chooses to measure the observable $S_x^{(1)}$ (resp. $S_y^{(2)}$), then particle 1 (resp. 2) will evolve according to the dynamics

$$T_{1,x} \quad (\text{resp. } T_{2,y}) \tag{5}$$

In our case the variables x, y can be any element of the set {a, b, c}.

Suppose that experimenter 1 chooses to measure $S_a^{(1)}$ and experimenter 2 chooses to measure $S_b^{(2)}$.

Let $\mu_1$ (resp. $\mu_2$) denote the final state, i.e. the state at the time when the measurement occurs, of particle 1 (resp. 2). Condition (3.1) is then equivalent to

$$T_{1,a}^{-1} \mu_1 = T_{2,b}^{-1} \mu_2 \tag{6}$$


The empirical correlations of the measurements will then be

$$\frac{1}{N} \sum S_a^{(1)}(\mu_1)\, S_b^{(2)}(\mu_2)\, \mathcal{F}(T_{1,a}^{-1} \mu_1 - T_{2,b}^{-1} \mu_2) \tag{7}$$

where $\mathcal{F}(\cdot)$ is a δ-like factor taking into account the fact that only the configurations satisfying condition (6) give a non zero contribution to the correlations.

Now suppose that the state space S is the real line R. Thus the empirical correlations (7) are

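$$\kappa_{a,b} = Z \int\!\!\int S_a^{(1)}(\mu_1)\, S_b^{(2)}(\mu_2)\, \delta(T_{1,a}^{-1} \mu_1 - T_{2,b}^{-1} \mu_2)\, d\mu_1 \, d\mu_2 \tag{8}$$

where Z is a normalization constant. With the change of variables

$$T_{1,a}^{-1} \mu_1 =: \lambda_1 \,;\quad T_{2,b}^{-1} \mu_2 =: \lambda_2 \tag{9}$$

(8) becomes

$$Z \int\!\!\int S_a^{(1)}(T_{1,a} \lambda_1)\, S_b^{(2)}(T_{2,b} \lambda_2)\, \delta(\lambda_1 - \lambda_2)\, dT_{1,a}(\lambda_1)\, dT_{2,b}(\lambda_2) \tag{10}$$

Now introduce the notations

$$S_x^{(j)}(T_{j,x} \lambda_j) =: \tilde S_x^{(j)}(\lambda_j) \,;\quad j = 1,2 \,;\quad x = a, b \tag{11}$$

With these notations, supposing, as is always possible, that $T'_{1,a}(\lambda_1), T'_{2,b}(\lambda_2) > 0$, (10) becomes

$$Z \int\!\!\int \tilde S_a^{(1)}(\lambda_1)\, \tilde S_b^{(2)}(\lambda_2)\, \delta(\lambda_1 - \lambda_2)\, T'_{1,a}(\lambda_1)\, T'_{2,b}(\lambda_2)\, d\lambda_1 \, d\lambda_2 = Z \int \tilde S_a^{(1)}(\lambda)\, \tilde S_b^{(2)}(\lambda)\, T'_{1,a}(\lambda)\, T'_{2,b}(\lambda)\, d\lambda$$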

Now let us make the following choices:

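$$\lambda \in [0, 2\pi] \;\Leftrightarrow\; \mathrm{supp}\, \tilde S_x^{(j)} \subseteq [0, 2\pi] \tag{12}$$

$$Z = (2\pi)^{-1} \tag{13}$$

$$T'_{2,b} = \sqrt{2\pi} \tag{14}$$

$$T'_{1,a}(\lambda) = \frac{\sqrt{2\pi}}{4} |\cos(\lambda - a)| \tag{15}$$

$$\tilde S_x^{(1)}(\lambda) = \mathrm{sgn}(\cos(\lambda - x)) \,;\quad \tilde S_x^{(2)} = -\tilde S_x^{(1)} \tag{16}$$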

With these choices, the correlations (8) become

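$$\langle S_a^{(1)} S_b^{(2)} \rangle = -\int_0^{2\pi} \mathrm{sgn}(\cos(\lambda - a))\, \mathrm{sgn}(\cos(\lambda - b))\, \frac{1}{4} |\cos(\lambda - a)| \, d\lambda \tag{17}$$

$$= -\frac{1}{4} \int_0^{2\pi} \mathrm{sgn}(\cos(\lambda - b)) \cos(\lambda - a) \, d\lambda = -\cos(b - a) = -a \cdot b$$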

which are the EPR correlations.
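The final integral can be checked numerically. The sketch below is a plain midpoint-rule quadrature written for illustration (the function names and the number of grid points are assumptions, not part of the text); it evaluates the right hand side of (17) and compares it with -cos(b - a).

```python
import math


def sgn(x):
    """Sign function with sgn(0) = 0."""
    return (x > 0) - (x < 0)


def chameleon_correlation(a, b, n=20000):
    """Midpoint rule for -(1/4) * integral over [0, 2*pi] of
    sgn(cos(l - a)) * sgn(cos(l - b)) * |cos(l - a)| dl."""
    h = 2 * math.pi / n
    total = 0.0
    for k in range(n):
        lam = (k + 0.5) * h
        total += (
            sgn(math.cos(lam - a))
            * sgn(math.cos(lam - b))
            * abs(math.cos(lam - a))
        )
    return -0.25 * total * h


angles = [(0.0, 0.0), (0.0, math.pi / 3), (0.4, 1.7), (1.0, 1.0 + math.pi / 2)]
errors = [abs(chameleon_correlation(a, b) + math.cos(b - a)) for a, b in angles]
```

For equal angles the result is -1, and for orthogonal directions it vanishes, as expected for the singlet correlations.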

References

1. L. Accardi, Phys. Rep. 77, 169-192 (1981).

2. L. Accardi, Urne e camaleonti. Dialogo sulla realtà, le leggi del caso e la teoria quantistica. (Il Saggiatore, 1997). Japanese translation, Maruzen (2000); Russian translation, ed. by Igor Volovich (PHASIS Publishing House, 2000); English translation by Daniele Tartaglia, to appear.

3. L. Accardi, On the EPR paradox and the Bell inequality, Volterra Preprint N. 350 (1998).

4. L. Accardi, M. Regoli, Quantum probability and the interpretation of quantum mechanics: a crucial experiment, Invited talk at the workshop: "The applications of mathematics to the sciences of nature: critical moments and aspects", Arcidosso June 28-July 1 (1999). To appear in the proceedings of the workshop, Preprint Volterra N. 399 (1999).

5. L. Accardi, M. Regoli, Local realistic violation of Bell's inequality: an experiment, Conference given by the first-named author at the Dipartimento di Fisica, Università di Pavia on 24-02-2000, Preprint Volterra N. 402.

6. L. Accardi, M. Regoli, Non-locality and quantum theory: new experimental evidence, Invited talk given by the first-named author at the Conference: "Quantum paradoxes", University of Nottingham, on 4-05-2000, Preprint Volterra N. 421.

7. J. S. Bell, Physics 1, 3, 195-200 (1964).

8. J. S. Bell, Rev. Mod. Phys. 38, 447-452 (1966).

9. J. F. Clauser, M. A. Horne, A. Shimony, R. A. Holt, Phys. Rev. Letters 49, 1804-1806 (1969); J. S. Bell, Speakable and unspeakable in quantum mechanics. (Cambridge Univ. Press, 1987).

10. J. F. Clauser, M. A. Horne, Phys. Rev. D 10, 2 (1974).

11. A. Einstein, B. Podolsky, N. Rosen, Phys. Rev. 47, 777-780 (1935).

12. A. Einstein in: Albert Einstein: Philosopher Scientist. Edited by P. A. Schilpp, Library of Living Philosophers (Evanston, Illinois, 1949).


Refutation of Bell's Theorem

Guillaume ADENIER Louis Pasteur University, Strasbourg, France.

E-mail: [email protected]

Bell's Theorem was developed on the basis of considerations involving a linear combination of spin correlation functions, each of which has a distinct pair of arguments. The simultaneous presence of these different pairs of arguments in the same equation can be understood in two radically different ways: either as 'strongly objective,' that is, all correlation functions pertain to the same set of particle pairs, or as 'weakly objective,' that is, each correlation function pertains to a different set of particle pairs. It is demonstrated that once this meaning is determined, no discrepancy appears between local realistic theories and quantum mechanics: the discrepancy in Bell's Theorem is due only to a meaningless comparison between a local realistic inequality written within the strongly objective interpretation (thus relevant to a single set of particle pairs) and a quantum mechanical prediction derived from a weakly objective interpretation (thus relevant to several different sets of particle pairs).

1 Introduction

Bell's Theorem [1] exhibits a peculiar discrepancy between any local realistic theory and Quantum Mechanics, which leads to empirically distinguishable alternatives. The quandary is that neither local realistic conceptions nor Quantum Mechanics are easy to abandon. Indeed, classical physics and common sense are usually based upon the former, while the latter is rightly presented as the most successful theory of all time. Several experiments have been done, and all but a few [2] show violations of Bell inequalities [3]. Yet, the ideas brought forth by Bell's Theorem are so disconcerting that there is still incredulity, not to mention antipathy, evoked by the verdict. The purpose of this article is to provide a refutation of this theorem, within a strictly quantum theoretical framework, without the use of outside assumptions.

2 The EPRB gedanken experiment

2.1 Spin observables and singlet state

Bell's theorem is usually based on a didactic reformulation of the EPR (Einstein, Podolsky and Rosen4) gedanken experiment, due to D. Bohm.5 In this EPRB gedanken experiment, a pair of spin-1/2 particles with total spin zero is produced such that the two particles move away from the source in opposite directions along the y-axis. Two Stern-Gerlach devices are placed at opposite points (left and right) on the y-axis, and are oriented respectively along the directions $\mathbf{u}$ and $\mathbf{v}$. The Hilbert space associated with the entire EPRB system is $\mathcal{H} = \mathcal{H}_L \otimes \mathcal{H}_R$, where $\mathcal{H}_L$ and $\mathcal{H}_R$ are the Hilbert spaces associated with each Stern-Gerlach device respectively. The spin observable has two counterparts in this new product space $\mathcal{H}$ as

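$$\boldsymbol{\sigma}_L \cdot \mathbf{u} = \boldsymbol{\sigma} \cdot \mathbf{u} \otimes I_R, \tag{1}$$

$$\boldsymbol{\sigma}_R \cdot \mathbf{v} = I_L \otimes \boldsymbol{\sigma} \cdot \mathbf{v}, \tag{2}$$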

where $I_L$ and $I_R$ are the identity operators of $\mathcal{H}_L$ and $\mathcal{H}_R$. Contrary to the observables $\boldsymbol{\sigma} \cdot \mathbf{u}$ and $\boldsymbol{\sigma} \cdot \mathbf{v}$, which are mutually non commuting when $\mathbf{u} \neq \mathbf{v}$, these new observables $\boldsymbol{\sigma}_L \cdot \mathbf{u}$ and $\boldsymbol{\sigma}_R \cdot \mathbf{v}$ do commute, reflecting the fact that the Stern-Gerlach devices are arbitrarily far from each other, and are thus measuring distinct subsystems. The product of these two observables is therefore also an observable and can be understood as a spin correlation observable corresponding to the joint spin measurement of both Stern-Gerlach devices. Its eigenvectors are $|\epsilon_L, \mathbf{u}\rangle \otimes |\epsilon_R, \mathbf{v}\rangle$, with corresponding eigenvalues $\epsilon_L \epsilon_R$, where each $\epsilon$ is either +1 or -1.

In an EPRB gedanken experiment, the source produces particle pairs with zero total spin, represented by the singlet state

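$$|\psi\rangle = \frac{1}{\sqrt{2}} \left[\, |+,\mathbf{n}\rangle \otimes |-,\mathbf{n}\rangle - |-,\mathbf{n}\rangle \otimes |+,\mathbf{n}\rangle \,\right], \tag{3}$$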

where $\mathbf{n}$ is an arbitrary unit vector, which can usually be omitted since the singlet state is invariant under rotation.6

2.2 Statistical properties and hidden-variables

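The expectation value of a spin observable for the singlet state $|\psi\rangle$ is zero:

$$\langle\psi|\, \boldsymbol{\sigma} \cdot \mathbf{u} \otimes I_R \,|\psi\rangle = 0, \qquad \langle\psi|\, I_L \otimes \boldsymbol{\sigma} \cdot \mathbf{v} \,|\psi\rangle = 0, \tag{4}$$

whatever $\mathbf{u}$ and $\mathbf{v}$, as follows from the rotational invariance of the singlet state. Likewise, the expectation value of the spin correlation observable6,7 is

$$E^{\psi}(\mathbf{u},\mathbf{v}) = \langle\psi|\, (\boldsymbol{\sigma}_L \cdot \mathbf{u})(\boldsymbol{\sigma}_R \cdot \mathbf{v}) \,|\psi\rangle \tag{5}$$

$$= -\,\mathbf{u} \cdot \mathbf{v}, \tag{6}$$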

which depends only on the relative angle between u and v.
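As a numerical cross-check of Eqs. (4) to (6), the following minimal sketch builds the singlet state and the observables of Eqs. (1) and (2) explicitly in the $z$ basis. The basis ordering, the helper names (`spin_along`, `kron`, `matmul`, `expectation`, `correlation`) and the use of plain Python lists are assumptions made for illustration; they are not part of the original argument.

```python
import math

IDENTITY = [[1, 0], [0, 1]]
# Singlet state in the assumed basis order |++>, |+->, |-+>, |-->.
SINGLET = [0.0, 1 / math.sqrt(2), -1 / math.sqrt(2), 0.0]


def spin_along(n):
    """Return the 2x2 matrix sigma . n for a unit vector n = (nx, ny, nz)."""
    nx, ny, nz = n
    return [[nz, nx - 1j * ny], [nx + 1j * ny, -nz]]


def kron(a, b):
    """Kronecker product of two matrices given as nested lists."""
    return [[x * y for x in row_a for y in row_b] for row_a in a for row_b in b]


def matmul(a, b):
    """Product of two square matrices of equal size."""
    size = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]


def expectation(op, state):
    """Return <state|op|state>."""
    size = len(state)
    return sum(state[r].conjugate() * op[r][c] * state[c]
               for r in range(size) for c in range(size))


def correlation(u, v):
    """Expectation value of (sigma_L . u)(sigma_R . v) in the singlet state."""
    left = kron(spin_along(u), IDENTITY)    # Eq. (1)
    right = kron(IDENTITY, spin_along(v))   # Eq. (2)
    return expectation(matmul(left, right), SINGLET).real


if __name__ == "__main__":
    u = (0.0, 0.0, 1.0)
    v = (math.sin(0.7), 0.0, math.cos(0.7))
    # The two printed numbers should agree: E(u, v) and -u.v
    print(correlation(u, v), -math.cos(0.7))
```

The single-sided expectation values of Eq. (4) can be checked the same way, for example with `expectation(kron(spin_along(u), IDENTITY), SINGLET)`.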


In a local realistic hidden-variables model, a single particle pair is supposed to be entirely characterised by means of a set of hidden-variables, which are symbolically represented by a parameter $\lambda$, so that the measurement result on the left along $\mathbf{u}$ can be written as $A(\mathbf{u},\lambda)$, and the result on the right along $\mathbf{v}$ as $B(\mathbf{v},\lambda)$. Although the hidden-variables model is supposed to be fully deterministic, it must also be capable of reproducing the stochastic nature of the EPRB gedanken experiment expressed in Eqs. (4) and (6). For that purpose, the complete state specification $\lambda_i$ of any particle pair with label $i$ must be a random variable:1,8 its complete state $\lambda_i$ is supposed to be drawn randomly according to a probability distribution $\rho$.

Consider a set of $N$ particle pairs $\{i = 1, \ldots, N\}$. The mean value of joint spin measurements for this set is

$$M^{\rho}(\mathbf{u},\mathbf{v}) = \frac{1}{N} \sum_{i=1}^{N} A(\mathbf{u},\lambda_i)\, B(\mathbf{v},\lambda_i). \tag{7}$$
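To make Eq. (7) concrete, here is a minimal sketch of one deterministic local hidden-variables model. The specific choice (a hidden angle $\lambda$ drawn uniformly on the circle, with outcomes given by the sign of a cosine) is an assumption made purely for illustration and is not taken from the text; any other functions $A$ and $B$ with values ±1 could be substituted.

```python
import math
import random


def outcome_left(u_angle, lam):
    """A(u, lambda): deterministic result +1 or -1 on the left."""
    return 1 if math.cos(u_angle - lam) >= 0 else -1


def outcome_right(v_angle, lam):
    """B(v, lambda): opposite to the left result whenever v = u."""
    return -outcome_left(v_angle, lam)


def draw_hidden_variables(n_pairs, seed=0):
    """Draw the complete states lambda_i of n_pairs particle pairs."""
    rng = random.Random(seed)
    return [rng.uniform(0.0, 2.0 * math.pi) for _ in range(n_pairs)]


def mean_correlation(u_angle, v_angle, lambdas):
    """Mean value of joint measurements over one set of pairs, Eq. (7)."""
    total = sum(outcome_left(u_angle, lam) * outcome_right(v_angle, lam)
                for lam in lambdas)
    return total / len(lambdas)


if __name__ == "__main__":
    lambdas = draw_hidden_variables(20000)
    for angle in (0.0, math.pi / 4, math.pi / 2):
        print(angle, mean_correlation(0.0, angle, lambdas))
```

In this toy model the individual outcomes are fixed by $\lambda_i$, while the averages are statistical because the $\lambda_i$ are drawn at random. It gives perfect anticorrelation for equal settings, but it is not claimed to reproduce Eq. (6) for all angles.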

3 The 'CHSH' function

In order to establish Bell's Theorem, a linear combination of correlation functions $c(\mathbf{a},\mathbf{b})$ with different arguments9 is considered, once when these correlation functions are expectation values $E^{\psi}(\mathbf{u},\mathbf{v})$ given by Quantum Mechanics, i.e., Eq. (6), and once when they are mean values $M^{\rho}(\mathbf{u},\mathbf{v})$ given by local hidden-variables theories, Eq. (7); then the results are to be compared. A well known choice of such a linear combination is the CHSH (Clauser, Horne, Shimony and Holt10) function, written with four pairs of arguments:

$$S = \left| c(\mathbf{a},\mathbf{b}) - c(\mathbf{a},\mathbf{b}') + c(\mathbf{a}',\mathbf{b}) + c(\mathbf{a}',\mathbf{b}') \right|. \tag{8}$$

The exact meaning of the simultaneous presence of these different arguments in a CHSH function must be clarified. Basically, there are two possible interpretations, the strongly objective interpretation and the weakly objective interpretation:11,12

Strongly Objective Interpretation implies that all correlation functions are relevant to the same set of $N$ particle pairs. As such, they cannot refer to actual experiments, but rather to the results that would have been obtained had the same set of $N$ particle pairs been measured along different directions.

Weakly Objective Interpretation implies that each correlation function is actually to be measured on distinct sets of N particle pairs, that is, for each pair only one joint spin measurement is to be executed.


The CHSH function was actually developed specifically for experimental convenience,10 and many experiments have been done (the most famous being Aspect's13), obviously invoking the natural interpretation, namely the weakly objective one. Nevertheless, the strongly objective interpretation must also be considered, since it remains a possible interpretation a priori, and since the choice between strong and weak objectivity is not made at all explicit in many papers, including Bell's.

It must be stressed that these interpretations are radically different, not only epistemologically, but also physically. Indeed, the strongly objective interpretation pertains to a single set of $N$ particle pairs characterised by the corresponding set of parameters $\{\lambda_i;\ i = 1, \ldots, N\}$, whereas the weakly objective interpretation pertains to no less than 4 sets of $N$ particle pairs. The fact is that a finite set of $N$ particle pairs characterised by $\{\lambda_i\}$ cannot be identically reproduced, either theoretically (for each complete state $\lambda_i$ of any particle pair $i$ is a random variable, as defined in Section 2.2), or empirically (for the experimenter has no control over the complete state of a particle pair in a singlet state). Hence, in the weakly objective interpretation, these four sets are necessarily four different sets of particle pairs7,14 respectively characterised by four different sets of hidden-variables parameters $\{\lambda_{1,i}\}$, $\{\lambda_{2,i}\}$, $\{\lambda_{3,i}\}$ and $\{\lambda_{4,i}\}$.

The difference between each interpretation can therefore be embodied in the number of degrees of freedom of the whole system. Let $f$ be the degrees of freedom of a single particle pair. In the strongly objective interpretation the degrees of freedom of the whole CHSH system is then $Nf$, whereas in the weakly objective interpretation it is 4 times as large, that is, $4Nf$. Thus, before initiating Bell's analysis, one has to choose explicitly one interpretation and stick to it.

4 Strongly objective interpretation

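4.1 Local realistic inequality within strongly objective interpretation

The local realistic formulation of the CHSH function within strong objectivity is written

$$S^{\rho}_{\text{strong}} = \left| M^{\rho}(\mathbf{a},\mathbf{b}) - M^{\rho}(\mathbf{a},\mathbf{b}') + M^{\rho}(\mathbf{a}',\mathbf{b}) + M^{\rho}(\mathbf{a}',\mathbf{b}') \right|, \tag{9}$$

which (using Eq. 7) becomes after factorisation a summation where each term can have two values2,7

$$A(\mathbf{a},\lambda_i)\left[ B(\mathbf{b},\lambda_i) - B(\mathbf{b}',\lambda_i) \right] + A(\mathbf{a}',\lambda_i)\left[ B(\mathbf{b},\lambda_i) + B(\mathbf{b}',\lambda_i) \right] = \pm 2, \tag{10}$$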


so that the most restrictive local realistic inequality within the strongly objective interpretation is :

$$S^{\rho}_{\text{strong}} \leq 2. \tag{11}$$
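The step from Eq. (10) to Eq. (11) rests on the fact that, for a single pair, the four outcomes are shared by all four products. A short enumeration makes this explicit. The function names below are illustrative assumptions.

```python
from itertools import product


def chsh_term(a, a_prime, b, b_prime):
    """Single-pair term of Eq. (10) for outcomes that each equal +1 or -1."""
    return a * (b - b_prime) + a_prime * (b + b_prime)


def possible_values():
    """Enumerate the term over all 16 assignments of the four outcomes."""
    return {chsh_term(*outcomes) for outcomes in product((1, -1), repeat=4)}


if __name__ == "__main__":
    print(sorted(possible_values()))  # [-2, 2]
```

Since every summand is +2 or -2, the mean over $N$ pairs lies between -2 and 2, which is the content of inequality (11).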

This is the well known generalised formulation of Bell's inequality due to CHSH.10 It must be stressed once more, however, that this inequality has been established only within the strongly objective interpretation, which means that each expectation value is relevant to the same set of N particle pairs. Hence, this result cannot be compared directly with results from real experimental tests, where in fact mean values from four distinct sets of N particle pairs are measured.

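4.2 Quantum mechanical prediction within strongly objective interpretation

The quantum prediction for the CHSH function within the strongly objective interpretation is written

$$S^{\psi}_{\text{strong}} = \left| E^{\psi}(\mathbf{a},\mathbf{b}) - E^{\psi}(\mathbf{a},\mathbf{b}') + E^{\psi}(\mathbf{a}',\mathbf{b}) + E^{\psi}(\mathbf{a}',\mathbf{b}') \right|. \tag{12}$$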

This equation is usually directly evaluated by replacing each expectation value by the scalar product result of Eq. (6). This, unfortunately, is all too hasty.

Indeed, in order to understand better the quantum mechanical meaning of equation (12), it is advantageous to take a step backward using equation (5):

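$$S^{\psi}_{\text{strong}} = \left| \langle\psi|(\boldsymbol{\sigma}_L \cdot \mathbf{a})(\boldsymbol{\sigma}_R \cdot \mathbf{b})|\psi\rangle - \langle\psi|(\boldsymbol{\sigma}_L \cdot \mathbf{a})(\boldsymbol{\sigma}_R \cdot \mathbf{b}')|\psi\rangle + \langle\psi|(\boldsymbol{\sigma}_L \cdot \mathbf{a}')(\boldsymbol{\sigma}_R \cdot \mathbf{b})|\psi\rangle + \langle\psi|(\boldsymbol{\sigma}_L \cdot \mathbf{a}')(\boldsymbol{\sigma}_R \cdot \mathbf{b}')|\psi\rangle \right|. \tag{13}$$

The four spin correlation observables in this equation are non commuting observables (this can be shown by calculating the commutator of $(\boldsymbol{\sigma}_L \cdot \mathbf{u})(\boldsymbol{\sigma}_R \cdot \mathbf{v})$ and $(\boldsymbol{\sigma}_L \cdot \mathbf{u})(\boldsymbol{\sigma}_R \cdot \mathbf{v}')$ with $\mathbf{v} \neq \mathbf{v}'$), so that the meaning of their combination must be questioned.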
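The commutator mentioned above can be evaluated numerically. The sketch below does so for one illustrative choice of directions ($\mathbf{u} = \mathbf{v} = \hat{z}$, $\mathbf{v}' = \hat{x}$); the directions, the matrix representation and all helper names are assumptions made for the example.

```python
I2 = [[1, 0], [0, 1]]


def spin_along(n):
    """Return the 2x2 matrix sigma . n for a unit vector n = (nx, ny, nz)."""
    nx, ny, nz = n
    return [[nz, nx - 1j * ny], [nx + 1j * ny, -nz]]


def kron(a, b):
    """Kronecker product of two matrices given as nested lists."""
    return [[x * y for x in row_a for y in row_b] for row_a in a for row_b in b]


def matmul(a, b):
    """Product of two square matrices of equal size."""
    size = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]


def correlation_observable(u, v):
    """The 4x4 matrix of (sigma_L . u)(sigma_R . v)."""
    return matmul(kron(spin_along(u), I2), kron(I2, spin_along(v)))


def commutator(a, b):
    """Return the matrix a b - b a."""
    ab, ba = matmul(a, b), matmul(b, a)
    return [[x - y for x, y in zip(row_ab, row_ba)]
            for row_ab, row_ba in zip(ab, ba)]


def largest_entry(matrix):
    """Largest absolute value among the matrix entries."""
    return max(abs(x) for row in matrix for x in row)


if __name__ == "__main__":
    z, x = (0.0, 0.0, 1.0), (1.0, 0.0, 0.0)
    different = commutator(correlation_observable(z, z),
                           correlation_observable(z, x))
    same = commutator(correlation_observable(z, z),
                      correlation_observable(z, z))
    print(largest_entry(different), largest_entry(same))
```

A non-zero largest entry for $\mathbf{v} \neq \mathbf{v}'$ signals that the two correlation observables do not commute, whereas the commutator vanishes when the two directions coincide.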

According to Von Neumann,15 any linear combination of expectation values of different observables R, S,... is meaningful in quantum mechanics:

$$\langle R + S + \ldots \rangle_{\psi} = \langle R \rangle_{\psi} + \langle S \rangle_{\psi} + \ldots \tag{14}$$

even if $R, S, \ldots$ are non commuting observables. However, as was stressed by d'Espagnat,11,16 quantum mechanics is only a weakly objective theory, and expectation values given by quantum mechanics are also weakly objective statements, that is to say, statements relevant to observations, so that when $R, S, \ldots$ are non commuting observables, the expectation values cannot be simultaneously relevant to the same set of $N$ systems: each expectation value is necessarily relevant to a distinct set of $N$ systems. Therefore, the only possible meaning of equation (13) is weakly objective, not strongly objective as desired. Of course, this does not imply that Quantum Mechanics cannot provide any meaning at all for the CHSH function; it implies only that this meaning cannot be strongly objective.

Since the local realistic inequality $S^{\rho}_{\text{strong}} \leq 2$ cannot be compared with any strongly objective prediction given by Quantum Mechanics, Bell's Theorem cannot be verified with a strongly objective interpretation given to the CHSH function. Hence, there is no choice but to rely on the weakly objective interpretation in order to compare hidden-variables theories and Quantum Mechanics.

5 Weakly objective interpretation

5.1 Quantum mechanical prediction within weakly objective interpretation

It was shown in Section 3 that strong objectivity and weak objectivity pertain to different physical systems. This difference should therefore appear in the relevant equations. Indeed, the correlation expressed in Eq. (6) is relevant to spin measurements performed on particles that once constituted a single parent particle. Yet two particles issued from two distinct parents have never interacted with each other, so that spin measurements performed on such particle pairs cannot be correlated. Hence, if left and right spin measurements are performed on two distinct sets of $N$ particle pairs, instead of the same set, there should be no correlation, and this property should appear in a generalised spin correlation function (i.e., generalised to the case of spin measurements performed on different sets of particle pairs).

This can be easily done within a quantum theoretical framework by means of a distinct EPRB space for each set of $N$ particle pairs. Let $\mathcal{H}_j$ be the EPRB Hilbert space associated with the $j$th set of particle pairs. In this Hilbert space, the EPRB gedanken experiment is represented by the singlet state $|\psi_j\rangle$ (see Section 2),

$$|\psi_j\rangle = \frac{1}{\sqrt{2}} \left[\, |+\rangle_j \otimes |-\rangle_j - |-\rangle_j \otimes |+\rangle_j \,\right]. \tag{15}$$

The whole CHSH experiment with the four sets of particle pairs can be expressed then in terms of a new tensor product space $\mathcal{H}_{1234} = \mathcal{H}_1 \otimes \mathcal{H}_2 \otimes \mathcal{H}_3 \otimes \mathcal{H}_4$ in which the state vector is

$$|\psi_{1234}\rangle = |\psi_1\rangle \otimes |\psi_2\rangle \otimes |\psi_3\rangle \otimes |\psi_4\rangle. \tag{16}$$


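The counterparts of observables in $\mathcal{H}_{1234}$ are obtained as in Section 2.1. For instance, the observable pertaining to the right Stern-Gerlach device for the 2nd set of particle pairs is

$$\boldsymbol{\sigma}_{2,R} \cdot \mathbf{u} = I_1 \otimes (\boldsymbol{\sigma}_R \cdot \mathbf{u}) \otimes I_3 \otimes I_4, \tag{17}$$

where $I_j$ is the identity operator of the EPRB space $\mathcal{H}_j$. Hence, the expectation value of the product of two spin observables, the first belonging to the $k$th set and the second to the $l$th set, is

$$E^{\psi}_{k,l}(\mathbf{u},\mathbf{v}) = \langle\psi_{1234}|\, (\boldsymbol{\sigma}_{k,L} \cdot \mathbf{u})(\boldsymbol{\sigma}_{l,R} \cdot \mathbf{v}) \,|\psi_{1234}\rangle, \tag{18}$$

and this is the generalised expectation value of spin correlation observables that was sought. The expectation value for measurements performed on the same set ($k = l$) of particle pairs is already known, Eq. (6), and $E^{\psi}_{k,k}(\mathbf{u},\mathbf{v})$ should provide the same result. Indeed, using Eqs. (16) and (17) leads to

$$E^{\psi}_{k,k}(\mathbf{u},\mathbf{v}) = \langle\psi_k|\, (\boldsymbol{\sigma}_L \cdot \mathbf{u})(\boldsymbol{\sigma}_R \cdot \mathbf{v}) \,|\psi_k\rangle = -\,\mathbf{u} \cdot \mathbf{v}, \tag{19}$$

but when $k \neq l$, the result is quite different:

$$E^{\psi}_{k,l}(\mathbf{u},\mathbf{v}) = \langle\psi_k|\, \boldsymbol{\sigma}_L \cdot \mathbf{u} \,|\psi_k\rangle \langle\psi_l|\, \boldsymbol{\sigma}_R \cdot \mathbf{v} \,|\psi_l\rangle = 0, \tag{20}$$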

in accord with Eq. (4). There are indeed no correlations between two sets of particle pairs, as stipulated in the beginning of this section.
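Eqs. (19) and (20) can be checked numerically on a reduced version of the construction. The sketch below uses only two sets of particle pairs instead of four (an assumption made to keep the matrices at 16 x 16), with qubits ordered as 1L, 1R, 2L, 2R; the helper names are likewise illustrative.

```python
import math

I2 = [[1, 0], [0, 1]]
SINGLET = [0.0, 1 / math.sqrt(2), -1 / math.sqrt(2), 0.0]
# Product state of two independent singlets, as in Eq. (16) restricted to two sets.
STATE = [x * y for x in SINGLET for y in SINGLET]


def spin_along(n):
    """Return the 2x2 matrix sigma . n for a unit vector n = (nx, ny, nz)."""
    nx, ny, nz = n
    return [[nz, nx - 1j * ny], [nx + 1j * ny, -nz]]


def kron(a, b):
    """Kronecker product of two matrices given as nested lists."""
    return [[x * y for x in row_a for y in row_b] for row_a in a for row_b in b]


def embed(op, site, n_sites=4):
    """Place a 2x2 operator at qubit `site`, with identities elsewhere."""
    result = [[1]]
    for position in range(n_sites):
        result = kron(result, op if position == site else I2)
    return result


def apply(op, state):
    """Matrix-vector product."""
    return [sum(o * s for o, s in zip(row, state)) for row in op]


def inner(bra, ket):
    """Hermitian inner product <bra|ket>."""
    return sum(x.conjugate() * y for x, y in zip(bra, ket))


def generalised_correlation(k, l, u, v):
    """Eq. (18) for sets k, l in {0, 1}: left device of set k, right of set l."""
    left = embed(spin_along(u), 2 * k)
    right = embed(spin_along(v), 2 * l + 1)
    # Both operators are Hermitian, so <psi|L R|psi> = <L psi|R psi>.
    return inner(apply(left, STATE), apply(right, STATE)).real


if __name__ == "__main__":
    u = (0.0, 0.0, 1.0)
    v = (math.sin(0.7), 0.0, math.cos(0.7))
    print(generalised_correlation(0, 0, u, v))  # same set: close to -u.v
    print(generalised_correlation(0, 1, u, v))  # different sets: close to 0
```

The cross-set value vanishes because the state is a product over the sets, so the expectation value factorises into two single-sided expectation values, each of which is zero by Eq. (4).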

Now, contrary to what was done in Section 4.2, it is possible to proceed here in full accord with the quantum mechanical postulates, because the spin correlation observables built from operators such as the one given in Eq. (17), each pertaining to a different set of particle pairs, are mutually commuting, so that a linear combination of these commuting observables is an observable as well. The CHSH experiment can therefore be described by a new observable

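$$\hat{S}_{\text{weak}} = (\boldsymbol{\sigma}_{1,L} \cdot \mathbf{a})(\boldsymbol{\sigma}_{1,R} \cdot \mathbf{b}) - (\boldsymbol{\sigma}_{2,L} \cdot \mathbf{a})(\boldsymbol{\sigma}_{2,R} \cdot \mathbf{b}') + (\boldsymbol{\sigma}_{3,L} \cdot \mathbf{a}')(\boldsymbol{\sigma}_{3,R} \cdot \mathbf{b}) + (\boldsymbol{\sigma}_{4,L} \cdot \mathbf{a}')(\boldsymbol{\sigma}_{4,R} \cdot \mathbf{b}'), \tag{21}$$

and the quantum prediction for the CHSH function within a weakly objective interpretation is therefore obtained by calculating the expectation value of the observable $\hat{S}_{\text{weak}}$ when the system is in the quantum state $|\psi_{1234}\rangle$:

$$S^{\psi}_{\text{weak}} = \left| \langle\psi_{1234}|\, \hat{S}_{\text{weak}} \,|\psi_{1234}\rangle \right|, \tag{22}$$

which using Eqs. (17), (18), and (19) is

$$S^{\psi}_{\text{weak}} = \left| E^{\psi}_{1,1}(\mathbf{a},\mathbf{b}) - E^{\psi}_{2,2}(\mathbf{a},\mathbf{b}') + E^{\psi}_{3,3}(\mathbf{a}',\mathbf{b}) + E^{\psi}_{4,4}(\mathbf{a}',\mathbf{b}') \right|. \tag{23}$$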


This equation is not ambiguous (as was Eq. 12): it is a linear combination of expectation values, each relevant to a distinct set of N particle pairs. This equation is therefore weakly objective, as requested.

Finally, using Eq. (19), yields

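$$S^{\psi}_{\text{weak}} = \left| \mathbf{a} \cdot \mathbf{b} - \mathbf{a} \cdot \mathbf{b}' + \mathbf{a}' \cdot \mathbf{b} + \mathbf{a}' \cdot \mathbf{b}' \right|, \tag{24}$$

with a well known maximum equal to

$$\max\left( S^{\psi}_{\text{weak}} \right) = 2\sqrt{2}. \tag{25}$$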

This numerical result is indeed the one given in the literature, the only difference here being the fact that the meaning of this result is unambiguously weakly objective. Quantum Mechanics, which is a weakly objective theory,11 provides a clear answer to the CHSH function understood as a weakly objective question.
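The value $2\sqrt{2}$ can be illustrated by evaluating the combination of scalar products in Eq. (24) for a specific choice of directions. The sketch below restricts the four unit vectors to a common plane and parametrises them by angles; this restriction, the coarse grid search, and the function names are assumptions made for the example.

```python
import math


def unit(angle):
    """Unit vector in the plane at the given angle."""
    return (math.cos(angle), math.sin(angle))


def dot(p, q):
    """Scalar product of two planar vectors."""
    return p[0] * q[0] + p[1] * q[1]


def chsh_quantum(a, a_prime, b, b_prime):
    """|a.b - a.b' + a'.b + a'.b'| for coplanar directions given as angles."""
    va, vap, vb, vbp = (unit(t) for t in (a, a_prime, b, b_prime))
    return abs(dot(va, vb) - dot(va, vbp) + dot(vap, vb) + dot(vap, vbp))


def grid_maximum(steps=16):
    """Coarse search over a', b, b' with a fixed at 0 (rotational invariance)."""
    angles = [2 * math.pi * i / steps for i in range(steps)]
    return max(chsh_quantum(0.0, ap, b, bp)
               for ap in angles for b in angles for bp in angles)


if __name__ == "__main__":
    best_known = chsh_quantum(0.0, math.pi / 2, math.pi / 4, 3 * math.pi / 4)
    print(best_known, grid_maximum(), 2 * math.sqrt(2))
```

With $\mathbf{a}$ and $\mathbf{a}'$ perpendicular and $\mathbf{b}$, $\mathbf{b}'$ at 45° and 135° from $\mathbf{a}$, each of the four terms contributes $\sqrt{2}/2$ with the same sign.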

5.2 Local realistic inequality within weakly objective interpretation

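The last step consists in comparing the quantum prediction $S^{\psi}_{\text{weak}}$ with its local realistic counterpart $S^{\rho}_{\text{weak}}$. As was stressed in Section 3, the $j$th set of particle pairs must be characterised by a distinct set of hidden-variables parameters $\{\lambda_{j,i};\ i = 1, \ldots, N\}$. Hence, to the generalised expectation value of the spin correlation observable Eq. (18) corresponds the generalised mean value of joint spin measurements:

$$M^{\rho}_{k,l}(\mathbf{u},\mathbf{v}) = \frac{1}{N} \sum_{i=1}^{N} A(\mathbf{u},\lambda_{k,i})\, B(\mathbf{v},\lambda_{l,i}), \tag{26}$$

which is a priori capable of reproducing not only the $k = l$ prediction, Eq. (19), but also the $k \neq l$ prediction, Eq. (20). The local realistic CHSH function with a weakly objective interpretation is therefore

$$S^{\rho}_{\text{weak}} = \left| M^{\rho}_{1,1}(\mathbf{a},\mathbf{b}) - M^{\rho}_{2,2}(\mathbf{a},\mathbf{b}') + M^{\rho}_{3,3}(\mathbf{a}',\mathbf{b}) + M^{\rho}_{4,4}(\mathbf{a}',\mathbf{b}') \right|, \tag{27}$$

and that is explicitly

$$S^{\rho}_{\text{weak}} = \left| \frac{1}{N} \sum_{i=1}^{N} \left[ A(\mathbf{a},\lambda_{1,i}) B(\mathbf{b},\lambda_{1,i}) - A(\mathbf{a},\lambda_{2,i}) B(\mathbf{b}',\lambda_{2,i}) + A(\mathbf{a}',\lambda_{3,i}) B(\mathbf{b},\lambda_{3,i}) + A(\mathbf{a}',\lambda_{4,i}) B(\mathbf{b}',\lambda_{4,i}) \right] \right|. \tag{28}$$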


This expression is to be compared with the one pertaining to the strongly objective interpretation (Section 4.1), which contained terms that could be factored. Here, since each term is different from the others, no factorisation is possible; i.e., there is no way to derive a Bell inequality.7 This is not the first time this fact has been noticed; unfortunately, no conclusion was drawn then. Yet this fact cannot be ignored, for it has been shown in Section 4 that Bell's Theorem cannot be demonstrated within a strongly objective interpretation.

Here, the only local realistic inequality that can be derived is obtained by considering—as was done with Eq. (10)—the possible numerical values of each term of the summation in Eq. (28), for which the extrema are +4 and -4, so that the narrowest local realistic inequality that can be derived from Eq. (28) is nothing but

$$S^{\rho}_{\text{weak}} \leq 4. \tag{29}$$
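The bound of 4 follows from the same kind of enumeration as in the strongly objective case, except that each of the four products in Eq. (28) now carries its own pair of outcomes. The sketch below treats the eight outcomes as independent values ±1; the function names are illustrative assumptions.

```python
from itertools import product


def weak_term(a1, b1, a2, b2p, a3p, b3, a4p, b4p):
    """Summand of Eq. (28): each product uses outcomes from a different set."""
    return a1 * b1 - a2 * b2p + a3p * b3 + a4p * b4p


def weak_term_values():
    """Enumerate the summand over all 256 assignments of the eight outcomes."""
    return {weak_term(*outcomes) for outcomes in product((1, -1), repeat=8)}


if __name__ == "__main__":
    print(sorted(weak_term_values()))  # [-4, -2, 0, 2, 4]
```

Because the four products no longer share outcomes, nothing forces a cancellation, and the summand reaches both +4 and -4.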

This most restrictive local realistic inequality (which can also be found in Accardi17) is not incompatible with the quantum mechanical prediction, as the maximum of $S^{\psi}_{\text{weak}}$ is $2\sqrt{2}$. This shows that experiments intended to test Bell's Theorem were unfortunately not testing the strongly objective inequality, Eq. (11), which is a Bell inequality, but the weakly objective one, Eq. (29), since all experimental tests are necessarily executed in a weakly objective way, due to the irreducible incompatibility between spin measurements. As was stressed by Sica18 and Accardi,17 a local realistic inequality is nothing but an arithmetic identity, and inequality (29) is definitely too lax to be violated by experimental tests.

6 Conclusion

It was shown that Bell's Theorem cannot be derived, either within a strongly objective interpretation of the CHSH function, because Quantum Mechanics gives no strongly objective results for the CHSH function (see Section 4.2), or within a weakly objective interpretation, because the only derivable local realistic inequality is never violated, either by Quantum Mechanics or by experiments (see Section 5.2). It was demonstrated that the discrepancy in Bell's Theorem is due only to a meaningless comparison between $S^{\rho}_{\text{strong}} \leq 2$ and $S^{\psi}_{\text{weak}} = 2\sqrt{2}$, where the former is relevant to a system with $Nf$ degrees of freedom, whereas the latter is relevant to one with $4Nf$ (see Section 3). The only meaningful comparison is between the weakly objective local realistic inequality $S^{\rho}_{\text{weak}} \leq 4$ and the weakly objective quantum prediction $S^{\psi}_{\text{weak}} = 2\sqrt{2}$, but these results are not incompatible. Bell's Theorem, therefore, is refuted.


References

1. J. S. Bell, Physics 1, 195 (1964).
2. F. Selleri, Le grand débat de la mécanique quantique (Champs Flammarion, Paris, 1986).
3. A. Aspect, Nature 398, 189 (1999).
4. A. Einstein, B. Podolsky, and N. Rosen, Phys. Rev. 47, 777 (1935).
5. D. Bohm, Phys. Rev. 85, 166 (1952).
6. D. Greenberger, M. Horne, A. Shimony and A. Zeilinger, Am. J. Phys. 58, 1131 (1990).
7. A. Bohm, Quantum Mechanics, Foundations and applications (Springer-Verlag, New York, 1979).
8. J. S. Bell, in Proceedings of the international School of physics 'Enrico Fermi', course IL: Foundations of quantum mechanics (Academic, New York, 1971), p. 171.
9. J. S. Bell, Epistemological Letters, p. 2 (July, 1975).
10. J. F. Clauser, M. A. Horne, A. Shimony and R. A. Holt, Phys. Rev. Lett. 23, 880 (1969).
11. B. d'Espagnat, Veiled Reality: An Analysis of Present Day Quantum Mechanical Concepts (Addison-Wesley, 1995).
12. B. d'Espagnat, http://arXiv/abs/quant-ph/9802046.
13. A. Aspect, J. Dalibard, and G. Roger, Phys. Rev. Lett. 49, 1804 (1982).
14. A. Khrennikov, http://arXiv/abs/quant-ph/0006017.
15. J. von Neumann, Mathematical Foundations of Quantum Mechanics (Princeton University Press, 1955).
16. B. d'Espagnat, Conceptual foundations of Quantum Mechanics (W.A. Benjamin, Massachusetts, 1976).
17. L. Accardi, http://arXiv/abs/quant-ph/0007005.
18. L. Sica, Opt. Commun. 170, 55 (1999).


PROBABILITY CONSERVATION AND THE STATE DETERMINATION PROBLEM

S. AERTS Free University of Brussels

Triomflaan 2, Brussels, Belgium E-mail: [email protected]

The problem of finding an operational definition for the wave vector is briefly examined from a historical point of view. Led by an old idea of Feenberg, we integrate the one dimensional probability conservation equation to obtain a closed formula that determines the state vector in the spinless case. The formula that determines the state does not depend on the (real) potential, external fields having their influence on the state only through the time derivative of the probability density function in position space. We apply the method to the simple case of a free Gaussian wave packet. Some problems regarding the operational status of the quantities involved are discussed.

1 Introduction

It is well known that Heisenberg constructed the matrix formulation of quantum mechanics by keeping in close accordance with what might be labelled the 'principle of operationality'. Roughly, one can describe this principle as a determination to introduce only measurable quantities. Schrödinger, more concerned with 'Anschaulichkeit' than operationality, introduced rather unscrupulously the concept of a wave function. He initially interpreted the wave function as a charge density in space, but this interpretation is difficult to extend to several-particle problems.$^{a}$ The interpretation that would stand the test of time, as attested by its being awarded the Nobel prize in 1954, was due to Born. In analogy with the theory of electromagnetic radiation, in which the intensity is the square of the amplitude, Born took the step of interpreting the intensity of an electromagnetic wave in a given region of space as proportional to the relative frequency of a photon detection in that region, and the probabilistic interpretation was born. However, this correspondence still does not make the wave function an operational quantity, as for every density $\rho(x,t)$ there are infinitely many $\varphi(x,t)$ such that, with $\psi(x,t) = \sqrt{\rho(x,t)}\, e^{i\varphi(x,t)}$, we get $\psi^*(x,t)\psi(x,t) = \rho(x,t)$. The problem is then to find suitable functions that we can approximate experimentally in a statistical way and that, in some well chosen combination, yield the same information as the complete wave function.

$^{a}$ For a rescue attempt of the original Schrödinger interpretation, see Dorling1.

In order to make the question mathematically more precise, Prugovecki2 introduced the notion of "informational completeness". A family $\mathcal{F} = \{O_i \mid i \in I\}$ of bounded operators on a Hilbert space $\mathcal{H}$ is called informationally complete iff for every two density operators $\rho$ and $\rho'$ the equality $\mathrm{Tr}(\rho O_i) = \mathrm{Tr}(\rho' O_i)$ for all $i \in I$ implies $\rho = \rho'$. This definition implies that the set of expectation values of an informationally complete set of operators allows only one state operator from which the expectation values could have been derived. What characterizes such a set? In a classical statistical framework, we can calculate all macroscopic quantities from a single density function $\rho(p,q)$ in phase space. Hence, by analogy, one is naturally led to the following interesting question, originally due to Pauli3: Is it sufficient to know the probability density functions of position and momentum to determine unambiguously the quantum mechanical state of the physical system? In the quantum mechanical case, it is sufficient to know the wave function in coordinate space $\psi(x,t)$, since the corresponding wave function for the same system in momentum space $\psi(p,t)$ is given by its Fourier transform. Hence we can phrase the problem in a more mathematical way: is it possible to determine a square integrable function uniquely from both its modulus and the modulus of its Fourier transform?

Possibly the first non-trivial counterexamples came from Bargmann,$^{b}$ who constructed explicit examples of wave functions $\psi_1$ and $\psi_2$ that give rise to the same probability distributions for position and momentum, but give a different probability distribution for a third operator that does not commute with the position or momentum operator. This leads to the remarkable conclusion that the wave function in its coordinate representation contains more information than the corresponding probability densities in position and momentum together. Due to Bargmann we know the answer to be negative in a physically relevant way,$^{c}$

and what is now commonly referred to as the Pauli problem is either the problem of determining the set of states that share the same modulus and the same modulus of their Fourier transform, or the problem of finding a set of observables that is informationally complete. The problems are related but not identical, and we prefer to refer to the first version of the problem as the Pauli problem, and to the second as simply the state determination problem. It seems much more work has been done on the state determination problem, which is not surprising given the fact that the Pauli problem is a special case of it. With the exception of the production of counterexamples such as Bargmann's, the first instructive results regarding the Pauli problem were obtained only in 1978 by Corbett and Hurst5. In their paper they construct physically important classes of functions that are uniquely determined by their position and momentum distributions. However, they also show that there exist dense subsets of states that are not uniquely determined by their position and momentum distributions and, as a consequence, any state can be approximated, in norm, by a non-unique state. Extensions, comments and counterexamples to their work can be found in Friedman6 and Pavicic7. Nevertheless, the complete characterization of the set of states that share modulus and the modulus of their Fourier transform is still open.

$^{b}$ Bargmann never seems to have published these results himself, and as a result, little reference is given to his work in the literature. However, the examples can be found in Reichenbach4.

$^{c}$ The problem re-appeared unaltered in the 1958 edition of Pauli's book, more than a decade after the first counterexamples.

As for the state determination problem, we can split the work into that of authors who were primarily concerned with establishing a set of observables that is informationally complete (or disproving that a certain set has this property), and that of authors who set out to characterize such sets. The first group includes Feenberg8 (1933), Moyal9 (1949), Gale, Guth, and Trammell10 (1968), Band and Park11,12,13 (1970-1971), and many more14,15,16. We will not go into the reconstruction of the state by placing the entity in different potentials, a method pioneered by Lamb17 and one that inspired many similar approaches such as those of Wiesbrock18 and Weigert19, nor will we mention the vast literature pertaining to the measurement of the Wigner distribution, known as phase-space tomography. However, concerning the characterization of informationally complete sets we cannot help but make the following elementary remarks. Suppose we have a non-trivial (i.e., not a multiple of the identity) self-adjoint operator $A$ that commutes with every member of a set of operators $S$ in a Hilbert space $\mathcal{H}$. It is well known that the one parameter family of unitary operators $\exp(itA)$ also commutes with every element of $S$. Now take any $\psi$ that is not an eigenvector of $A$.
For any observable in $S$, the state $\psi_t = \exp(itA)\psi$ gives the same expectation value, whatever numerical value $t$ has. But if $t \neq s$ it follows that $\psi_t \neq \psi_s$ (for the relation of this with superselection rules, see Wick, Wightman and Wigner (1952)20, Emch and Piron (1963)21 and Piron22). Hence $S$ is not an informationally complete set of observables. So a necessary condition for a set of observables to be informationally complete is maximality in the sense of Dirac, in other words, that there be no other non-trivial operator that commutes with every member of the set. However, this is far from sufficient. As Busch and Lahti23 have shown, it is easy to derive$^{d}$ from the considerations above that no commuting set of observables is informationally complete! Maximal commuting sets of observables serve as a means of state preparation, not state identification. This means that, at least for continuous variables, the Pauli set $\{P, Q\}$ is in a certain sense the minimal set that one could possibly hope to be informationally complete (although Bargmann has shown this in general not to be the case).

$^{d}$ One arrives at this result by allowing $A$ to be a member of $S$.
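The elementary remark above can be illustrated on the smallest possible example, a two-level system. In the sketch below the set $S$ contains the single observable $\sigma_z$, and $A = \sigma_z$ is taken to be a member of $S$; the choice of system, the initial state and the function names are assumptions made for illustration only.

```python
import cmath
import math

SIGMA_Z = [[1, 0], [0, -1]]
SIGMA_X = [[0, 1], [1, 0]]


def evolve(state, t):
    """Apply exp(i t sigma_z) to a two-component state."""
    return [cmath.exp(1j * t) * state[0], cmath.exp(-1j * t) * state[1]]


def expectation(op, state):
    """Return <state|op|state> for a 2x2 Hermitian operator."""
    return sum(state[r].conjugate() * op[r][c] * state[c]
               for r in range(2) for c in range(2)).real


if __name__ == "__main__":
    psi = [math.cos(0.3), math.sin(0.3)]  # not an eigenvector of sigma_z
    for t in (0.0, 0.4, 0.8):
        psi_t = evolve(psi, t)
        print(t, expectation(SIGMA_Z, psi_t), expectation(SIGMA_X, psi_t))
```

The expectation value of $\sigma_z$ is the same for every $t$, so measurements of $S$ alone cannot distinguish the states $\psi_t$, whereas the expectation value of the non-commuting observable $\sigma_x$ does depend on $t$.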

2 Conservation of Probability

What we will present in this article is an elaboration on the reasoning followed by Feenberg. Consider the time-dependent Schrödinger equation in $\psi$ with a real$^{e}$ potential $V$, using the shorthand $\psi$ for $\psi(\mathbf{r},t)$:

$$\frac{\partial\psi}{\partial t} = -\frac{\hbar}{2im}\nabla^2\psi + \frac{1}{i\hbar}V\psi$$

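Multiply by $\psi^*$ and add this to the complex conjugate of the above equation multiplied by $\psi$. After some elementary vector operator manipulation, we find what is commonly known as the conservation law of probability:

$$\frac{\partial(\psi^*\psi)}{\partial t} + \frac{\hbar}{2im}\nabla\cdot\left(\psi^*\nabla\psi - \psi\nabla\psi^*\right) = 0$$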

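Substitution of the 'polar representation' of the wave vector, $\psi(\mathbf{r},t) = \sqrt{\rho(\mathbf{r},t)}\, e^{i\varphi(\mathbf{r},t)}$ ($\varphi$ assumed real), into the former equation yields a second order partial differential equation, which is in fact a Fokker-Planck equation with zero diffusion coefficient and the phase serving as a potential:

$$\frac{\partial\rho}{\partial t} + \frac{\hbar}{m}\nabla\cdot\left(\rho\nabla\varphi\right) = 0$$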

Feenberg's argument is a uniqueness result based on this last equation. It amounts to showing that any two phase functions that satisfy this equation and some gentle boundary conditions differ by at most a constant. His 1933 thesis is hard to get hold of, but the argument was (erroneously10,15) extended by Kemble24 to three spatial dimensions in his much easier to find handbook on quantum mechanics. What we will do here is go back to the original one dimensional idea, but rather than trying to establish a uniqueness result, we will show that in this simple case a solution can be obtained by direct integration.

3 Determination of the phase function

So $\rho$ and $\varphi$ satisfy the conservation law as given by the last equation. Rewriting this equation in one dimension, evaluated at a specific time instant $t = t_0$, gives us:

$$\rho(x,t_0)\frac{\partial^2\varphi}{\partial x^2} + \frac{\partial\rho(x,t_0)}{\partial x}\frac{\partial\varphi}{\partial x} + \frac{m}{\hbar}\left(\frac{\partial\rho(x,t)}{\partial t}\right)_{t=t_0} = 0$$

$^{e}$ The imaginary part of a complex potential can be used to mimic creation and annihilation effects. Although this is sometimes a useful approximation, such results violate the continuity equation, and for a more reliable analysis, one should really use a second quantized theory.

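Assume for the time being that $\rho(x,t_0) \neq 0$, and divide the equation by $\rho(x,t_0)$:

$$\frac{\partial^2\varphi}{\partial x^2} + \frac{\partial\ln\rho(x,t_0)}{\partial x}\frac{\partial\varphi}{\partial x} + \frac{m}{\hbar}\left(\frac{\partial\ln\rho(x,t)}{\partial t}\right)_{t=t_0} = 0$$

Assuming $\rho(x,t_0)$ and its time derivative to be known functions, we can solve for the unknown phase $\varphi(x,t_0)$. Set

$$\phi = \frac{\partial\varphi}{\partial x}, \qquad f(x) = \frac{\partial\ln\rho(x)}{\partial x}, \qquad g(x) = -\frac{m}{\hbar}\,\partial_t\ln\rho(x)$$

As all quantities are evaluated at the same time instant $t = t_0$, we will not bother to give further notational reference to this fact. In what follows, we will also abbreviate (with abuse of language) $\left(\partial\ln\rho(x,t)/\partial t\right)_{t=t_0}$ as $\partial_t\ln\rho(x)$. Applying these transformations the equation becomes:

$$\frac{d\phi}{dx} + f(x)\,\phi = g(x)$$

So we have transformed the second order partial differential equation into an ordinary, first order, linear differential equation with a source $g(x)$ at a fixed time instant. The solution of the homogeneous equation is $\phi_h = \exp\left[-\int f(x')\,dx'\right] = \rho^{-1}(x)$. The general solution, with $c$ chosen to fit the boundary condition, is $\phi(x) = \phi_h(x)\left(c + \int^x g(s)\rho(s)\,ds\right)$. We have to integrate this result once more to get $\varphi(x)$:

$$\varphi(x) = \int^x \phi_h(r)\left(c + \int^r g(s)\rho(s)\,ds\right)dr = \int^x \frac{1}{\rho(r)}\left[c - \frac{m}{\hbar}\int^r \rho(s)\,\partial_t\ln\rho(s)\,ds\right]dr = \int^x \left(c - \frac{m}{\hbar}\int^r \partial_t\rho(s)\,ds\right)\frac{dr}{\rho(r)}$$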

4 Validity and range of applicability

The solution is seen to be a two parameter family of curves, one for every value of the constant $c$, and one for every lower limit, say $x_0$, of the $r$ integration.$^{f}$ The result of changing the lower integration limit is only the addition of an overall constant to $\varphi(x,t)$. Because we know the quantum mechanical expectation values and probabilities to be invariant under such an addition, we set this constant equal to zero. The value of the constant $c$ can potentially affect the phase in a more profound way. Depending on the particular $\rho(r,t)$ used, $\int dr/\rho(r,t)$ might diverge when $\rho(r,t)$ is zero for some value(s) of $r$ or, even worse, for some interval $\Delta r$. First of all, we assumed in our derivation that $\rho(r,t) \neq 0$, but this restriction can easily be removed. Indeed, suppose we have $n$ places $x_i$ where the density does equal zero. A solution $\psi_i$ is then obtained for each interval $]x_i, x_{i+1}[$ by means of our equation. The total solution $\psi$ is obtained by pasting all the $\psi_i$ together by requiring continuity of $\psi$ and $\nabla\psi$.$^{g}$ Now continuity of $\psi$ and $\nabla\psi$ implies continuity of their respective complex conjugates and hence of $\rho$ and $\nabla\rho$. If we are to infer the phase from actual data, it seems reasonable to require $\varphi$ also to be continuous. In fact, the conservation equation requires it to be twice differentiable. If any cutting and pasting is necessary to obtain the solution, we can easily see that the constant $c$ should be the same for any two pasted pieces. Hence, if the cut is applied at a pole, $c$ has to be zero$^{h}$ for $\varphi$ to be continuous. We arrive at the same conclusion when we use the same reasoning on a point adjacent to the support of $\rho$. Hence we arrive at the main result of our paper:

$^{f}$ The lower limit of the $s$ integration is absorbed in the constant $c$.

$$\psi(x,t_0) = \sqrt{\rho(x,t_0)}\,\exp\Big(i\,\frac{m}{\hbar}\int^x \frac{1}{\rho(r,t_0)}\int^r \partial_t\rho(s,t_0)\,ds\,dr\Big)$$

Note that the state does not contain reference to the potential. External fields will show up in the state indirectly, as a consequence of the time dependence of $\rho$. The assumptions that underlie the derivation of the equation are: a spinless particle in one dimension that moves in a real potential $V$ and is prepared in a pure state. In short, all that is required for a particle to obey the one dimensional dynamical Schrödinger equation. However restricted this class is, it does include many examples that can be found in standard textbooks on quantum mechanics.

^g This continuity demand is in fact a necessity because the validity of the equation of probability conservation (and a fortiori of the Schrödinger equation) requires $\psi$ and $\nabla\psi$ to be continuous. A notable but unproblematic exception is that of an infinite potential step.
^h The value of $c$ might be non-zero in applications where the continuity equation only expresses conservation of the probability flux in some intermediate region, the boundaries (possibly at infinity) containing 'sinks' or 'sources' of probability.

Comparing the result we have found to those in the literature, we find the closest match with a result obtained by Gale, Guth and Trammell [10]. They apply the definitions of $\rho(r)$ and $j(r)$ to show that knowing these is sufficient for the determination of the phase. They then discuss a gedanken experiment for establishing the probability current by measuring the expectation of the velocity and argue, by means of this experiment and an intuitive argument, that the current $j(r)$ equals $\rho(r)\langle v(r)\rangle$ for some $r$ inside a small space region that is supposed to contain the particle. Our result was obtained by a direct integration and, as a consequence, is exact. It is, however, difficult to extend to higher dimensions, for two reasons. The first is the fact that the expression for the probability current in the presence of a vector potential becomes $\mathbf{J}(\mathbf{x},t) = \mathrm{Re}\{\psi^*(\mathbf{x},t)[\mathbf{p}/m - (q/mc)\mathbf{A}]\psi(\mathbf{x},t)\}$ and, depending on the form of the vector potential, it is not obvious to what function of the phase this corresponds. If the vector potential corresponds to a uniform magnetic field, or in the absence of a vector potential (in which case one can transform the equation into a Poisson equation), one can solve the continuity equation by employing standard techniques. However, one then encounters a second problem. Providing an initial value for the phase (which is unproblematic as the phase is only determined within an additive constant) is no longer sufficient; instead we need an initial boundary function. Hence we have to resort to other principles to determine the phase on such a boundary in order to solve the problem. Of course, the principle of conservation may still serve the purpose of reducing the family of admissible functions for the phase of the amplitude. We will now illustrate the principle by applying it to a Gaussian wave packet. Later we will expound a few operational issues regarding the quantities involved in the solution given above.

5 Evolution of a Gaussian Wave Packet

The full, time dependent wave function for a free Gaussian wave packet is:

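$$\psi(x,t) = [2\pi(\Delta x)_0^2]^{-1/4}\Big[1 + \frac{i\hbar t}{2m(\Delta x)_0^2}\Big]^{-1/2}\exp\Big[\frac{-x^2/4(\Delta x)_0^2 + ik_0x - ik_0^2\hbar t/2m}{1 + i\hbar t/2m(\Delta x)_0^2}\Big]$$

From this we easily calculate $\rho(x,t)$:

$$\rho(x,t) = \psi(x,t)\,\psi^*(x,t) = [2\pi(\Delta x)_0^2]^{-1/2}\Big[1 + \frac{\hbar^2t^2}{4m^2(\Delta x)_0^4}\Big]^{-1/2}\exp\Big[\frac{-(x + k_0\hbar t/m)^2}{2(\Delta x)_0^2\,[1 + \hbar^2t^2/4m^2(\Delta x)_0^4]}\Big]$$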

Now assume we did not know the wave function, only the probability density and its time derivative at the time instant $t = 0$. In an abbreviated form (with easy identification of the coefficients) we can write the probability density as:

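$$\rho(x,t) = a\,(1 + bt^2)^{-1/2}\exp\Big[-\frac{(x + ct)^2}{d\,(1 + bt^2)}\Big]$$

At time $t = 0$ this gives us $\rho(x,0) = a\exp(-x^2/d)$. The derivative of $\rho$ with respect to the time parameter is:

$$\partial_t\rho(x,t)\big|_{t=0} = \partial_t\Big[a\,(1 + bt^2)^{-1/2}\exp\Big(-\frac{(x + ct)^2}{d\,(1 + bt^2)}\Big)\Big]_{t=0} = -2a\,\frac{cx}{d}\exp\Big(-\frac{x^2}{d}\Big)$$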

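So the phase becomes:

$$\begin{aligned}
\varphi(x,0) &= \frac{m}{\hbar}\int^x\!\!\int^r \partial_t\rho(s,0)\,ds\,\frac{dr}{\rho(r,0)} \\
&= -\frac{2c}{d}\,\frac{m}{\hbar}\int^x\!\!\int^r s\exp\Big(-\frac{s^2}{d}\Big)ds\,\exp\Big(\frac{r^2}{d}\Big)dr \\
&= c\,\frac{m}{\hbar}\int^x \exp\Big(-\frac{r^2}{d}\Big)\exp\Big(\frac{r^2}{d}\Big)dr \\
&= \frac{k_0\hbar}{m}\,\frac{m}{\hbar}\,x \\
&= k_0x
\end{aligned}$$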

which is precisely the desired phase of the wave function at $t = 0$.

6 Operational Issues

Expounding Feenberg's uniqueness result, Reichenbach points out that we can recover the phase by numerical computation if we know $\rho(x,t_0)$ and $\partial_t\rho(x,t)|_{t=t_0}$. In order to establish these quantities, Reichenbach outlines the following procedure [4]. We take an ensemble A of identically prepared systems such that the ensemble can be properly described by a pure state $\psi$. Now select at random two sub-ensembles from A, say B and C. For each system in B we measure at the time $t_0$ the value of $x$. As the results will vary, we obtain in this way a distribution $\rho(x,t_0)$. Likewise, for each system in C, we measure at the time $t_1$ the value of $x$, obtaining a distribution $\rho(x,t_1)$. The quotient

$$\frac{\rho(x,t_1) - \rho(x,t_0)}{t_1 - t_0}$$

is then supposed to approximate $\partial_t\rho(x,t)$ for $t \in [t_0,t_1]$ if the interval $[t_0,t_1]$ is chosen sufficiently small. The wave function can then be obtained through numerical approximation and represents the state of the systems that are left untouched in the original ensemble A.

There is a problem with Reichenbach's procedure for determining these quantities that is of equal concern to our method. Despite the fact that it is entirely possible to position the detector wherever one wants it to be, hence effectively controlling $x$ in $\rho(x,t)$, it is an annoying peculiarity of quanta that one cannot determine when a detection will take place. One places a detector and simply waits for a detection count to happen. The problem seems related to what Mielnik has called "the screen problem" in a provocative and enlightening paper by the same name [25]. As Mielnik points out, "experimentalists perform a lot of experiments, but none resembling an instantaneous check of particle position". Indeed, a measurement setup typically consists of a source whose emission undergoes a series of transformations (e.g., an optical bench or a potential) and is subsequently detected by a fixed detector, or a set of fixed detectors. If we are to describe operational means of measuring densities at some time instant, we will have to do so by such a typical setup. To produce anything remotely satisfactory, we will need a few assumptions. A first assumption is that if a particle is detected at some time instant $t_0$ in position $x$, the intricate mechanism between the measurement apparatus and the particle that is responsible for its detection does not depend on $t_0$ and, in this sense, has no effect on the value of $\rho(x,t)$. However unnatural the assumption might be from a physical point of view, it seems to underlie the statistical interpretation of $\int_\Omega |\psi(x,t)|^2\,dV$ as an instantaneous localization probability of the system in a state $\psi$ in a space region $\Omega$ and at a time instant $t$.
In so far as our analysis depends on this assumption, so does the standard interpretation of quantum mechanics. The next assumption is that we are able to control the release of the particle in a certain state within a sufficiently small time interval $\Delta t$ such that, within this small time interval, the density can reasonably be approximated by a linear function. This can be achieved by placing a shutter mechanism behind the source. Naturally, the shutter opening time has to be substantially less than the coherence time of the particle. A sufficiently short opening time can only be established by experiment, and one can never be quite sure if there would still be more oscillations on a much shorter time scale. A density function with a larger variation will be harder to approximate as it requires a shorter shutter opening time and hence will result in a lower detection rate. The wave packet then participates in the transformations we may have set up (optical bench, Stern-Gerlach, ...) and is detected. The time interval between the shutter release and the detection time is noted together with the position of the detector.

After many such recordings, we gather all the data to reconstruct $\rho(x,t)$. How many samples do we need? If the samples were taken at equidistant $\Delta t$ and $\Delta x$, we could do a Fourier synthesis and apply the Shannon-Whittaker sampling theorem. However, due to the non-equidistant spreading of the $t_n$ (at best following some statistical pattern), we need frame theory (Duffin and Schaeffer [26]) to reconstruct band limited signals $f$ from irregularly spaced samples $\{f(t_n)\}$. The derivative with respect to time can then be derived from the reconstructed signal and the phase derived by means of the proposed equation.
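The numerical route described in this section can be illustrated on the Gaussian packet of Section 5. The following is a minimal sketch, not a prescription: it samples the abbreviated density at two nearby instants, forms a forward difference quotient for the time derivative, and evaluates the double integral of the main result with the trapezoid rule. The units ($\hbar = m = 1$), the grid, the time step, and all function names are assumptions made for the illustration. The density and the sign of the phase formula are taken exactly as written in Section 5, and the lower limit of the inner integration is placed at the left edge of the grid, which corresponds to $c = 0$.

```python
import math

HBAR = 1.0   # assumption: units with hbar = m = 1
MASS = 1.0
K0 = 2.0     # wave number k0 of the packet (illustrative value)
DX0 = 1.0    # initial width (Delta x)_0 (illustrative value)

# Coefficients of the abbreviated density a (1 + b t^2)^(-1/2) exp[-(x + c t)^2 / (d (1 + b t^2))]
A = 1.0 / math.sqrt(2.0 * math.pi * DX0 ** 2)
B = HBAR ** 2 / (4.0 * MASS ** 2 * DX0 ** 4)
C = K0 * HBAR / MASS
D = 2.0 * DX0 ** 2


def density(x, t):
    """Probability density in the abbreviated form of Section 5."""
    spread = 1.0 + B * t * t
    return A / math.sqrt(spread) * math.exp(-((x + C * t) ** 2) / (D * spread))


def reconstruct_phase(rho0, rho1, dt, h, i0):
    """Phase on a uniform grid from two sampled densities (c = 0, phase zero at index i0)."""
    n = len(rho0)
    # Forward difference quotient approximating the time derivative of the density.
    drho = [(r1 - r0) / dt for r0, r1 in zip(rho0, rho1)]
    # Inner integral: cumulative trapezoid rule starting at the left grid edge.
    inner = [0.0] * n
    for i in range(1, n):
        inner[i] = inner[i - 1] + 0.5 * h * (drho[i - 1] + drho[i])
    # Integrand of the outer integral: (m / hbar) * inner / rho.
    g = [MASS / HBAR * inner[i] / rho0[i] for i in range(n)]
    # Outer integral: cumulative trapezoid rule, shifted so the phase vanishes at i0.
    phase = [0.0] * n
    for i in range(1, n):
        phase[i] = phase[i - 1] + 0.5 * h * (g[i - 1] + g[i])
    ref = phase[i0]
    return [p - ref for p in phase]


N = 3200      # number of grid intervals
H = 0.005     # grid spacing; the grid covers [-8, 8]
DT = 1e-5     # separation of the two measurement instants
xs = [(i - N // 2) * H for i in range(N + 1)]

rho_t0 = [density(x, 0.0) for x in xs]
rho_t1 = [density(x, DT) for x in xs]
phase = reconstruct_phase(rho_t0, rho_t1, DT, H, N // 2)

# Compare with the expected phase k0 * x in the central region of the packet.
max_err = max(abs(phase[i] - K0 * xs[i]) for i in range(N + 1) if abs(xs[i]) <= 2.0)
print("largest deviation from k0*x on |x| <= 2:", max_err)
```

The comparison is restricted to the central region because the quotient of two very small numbers in the far tails amplifies discretization error, which mirrors the remark in Section 4 about zeros of the density. With irregularly spaced or noisy samples, the difference quotient would have to be replaced by a derivative of a reconstructed signal, as discussed above.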

Acknowledgments

The author wishes to acknowledge a helpful discussion with John Corbett regarding the subject of this paper.

References

1. J. Dorling, Schrödinger, Centenary celebration of a polymath, ed. C.W. Kilmister (Cambridge, 1987).
2. E. Prugovecki, Int. J. Theor. Phys., 16, pp 321-331, (1977).
3. W. Pauli, Encyclopedia of Physics, Vol V, p. 17 (Springer-Verlag, Berlin, 1958).
4. H. Reichenbach, Philosophic Foundations of Quantum Mechanics, (University of California Press, 1948).
5. J.V. Corbett, C.A. Hurst, J. Austral. Math. Soc., B20, 182-201, (1978).
6. C.N. Friedman, J. Austral. Math. Soc., B30, 298, (1987).
7. M. Pavicic, Phys. Lett. A, 122, 280, (1987).
8. E. Feenberg, The Scattering of Slow Electrons in Neutral Atoms, Thesis, Harvard University, (1933).
9. J.E. Moyal, Proc. Cambridge Phil. Soc., 45, 99, (1949).
10. W. Gale, E. Guth, and G.T. Trammell, Phys. Rev. A, 165, 1434-1436, (1968).
11. W. Band, J. Park, Found. Phys., 1, No 2, pp 133-144, (1970).
12. J. Park, W. Band, Found. Phys., 1, No 4, pp 339-357, (1971).
13. W. Band, J. Park, Am. J. Phys., 47, pp 188-191, (1979).
14. A. Royer, Phys. Rev. Lett., 55, pp 2745, (1985).
15. A. Royer, Found. Phys., 19, 3, (1989).
16. W. Stulpe, M. Singer, Found. Phys. Lett., 3, 153, (1990).
17. W. E. Lamb, Phys. Today, 22(4), 23, (1969).
18. H.-W. Wiesbrock, Int. J. Theor. Phys., 26, pp 1175, (1987).
19. S. Weigert, Phys. Rev. A, 45, pp 7688-7696, (1992).
20. G.C. Wick, A.S. Wightman, E.P. Wigner, Phys. Rev., 88, pp 101-105, (1952).
21. E.C. Emch, C. Piron, J. Math. Phys., 4, pp 496-473, (1963).
22. C. Piron, Helv. Phys. Acta, 42, pp 330-338, (1969).
23. P. Bush, P.J. Lahti, Found. Phys., 19, pp 633, (1971).
24. E.C. Kemble, New York, McGraw-Hill, (1937).
25. B. Mielnik, Found. Phys., 24, 8, pp 1113-1129, (1994).
26. R.J. Duffin, A.C. Schaeffer, Trans. Amer. Math. Soc., 72, 341-366, (1952).

EXTRINSIC AND INTRINSIC IRREVERSIBILITY IN PROBABILISTIC DYNAMICAL LAWS

H. ATMANSPACHER
Institut für Grenzgebiete der Psychologie und Psychohygiene, Wilhelmstr. 3a, D-79098 Freiburg, Germany
E-mail: [email protected]
and
Max-Planck-Institut für extraterrestrische Physik, D-85740 Garching, Germany

R. C. BISHOP
Institut für Grenzgebiete der Psychologie und Psychohygiene, Wilhelmstr. 3a, D-79098 Freiburg, Germany
E-mail: [email protected]

A. AMANN
Universitätsklinik für Anästhesie, Leopold-Franzens-Universität, Anichstr. 35, A-6020 Innsbruck, Austria
E-mail: [email protected]
and
Institut für Allgemeine, Anorganische und Theoretische Chemie, Abteilung für theoretische Chemie, Leopold-Franzens-Universität, Innrain 52a, A-6020 Innsbruck, Austria

Two distinct conceptions for the relation between reversible, time-reversal invariant laws of nature and the irreversible behavior of physical systems are outlined. The standard, extrinsic concept of irreversibility is based on the notion of an open system interacting with its environment. An alternative, intrinsic concept of irreversibility does not explicitly refer to any environment at all. Basic aspects of the two concepts are presented and compared with each other. The significance of the terms extrinsic and intrinsic is discussed.

1 Introduction

The relation between reversible, time-reversal invariant laws of nature and the irreversible behavior of empirical systems has been a long-standing problem in physics. In most standard approaches, fundamental dynamical laws such as Newton's, Maxwell's, Einstein's or Schrödinger's equations describe the temporal evolution of isolated systems. Irreversible dynamical laws are typically regarded as emerging from the interaction between systems and their environment, i.e., from considering open systems.

In contrast to this "extrinsic" conception of irreversibility, there is a group of scientists who insist that some kinds of irreversibility are "intrinsic", i.e., some kinds of irreversible laws are fundamental. On this view, mainly advocated by Prigogine and colleagues in Brussels and Austin, the switch from extrinsic to intrinsic irreversibility goes along with a switch from particular kinds of deterministic descriptions to particular kinds of probabilistic descriptions.

In general, the two viewpoints are considered to be distinct, sometimes even entirely incompatible. It is the main goal of this contribution to show that there are both differences and similarities between them. As a consequence it does not make too much sense to prefer one of them at the expense of the other. It is much more interesting to explore whether particular aspects of each of the two views can be constructively related to each other in order to increase our insight into the issue of irreversibility.

In the following, both conceptions will be presented in some detail and compared. It is suggested that the distinction of ontic and epistemic categorial frameworks for some problems associated with irreversibility is particularly useful when focusing on a conceptual discussion. Such a distinction serves to clarify both common and distinct aspects of extrinsic and intrinsic irreversibility, and it helps to frame a number of open questions concerning them.

In Section 2, ontic and epistemic descriptions are briefly introduced. We use an algebraic framework for this introduction since this has proven fruitful in related problem areas. Section 3 outlines some basic issues with respect to the ontic states of closed quantum systems and their time-reversal invariant dynamical evolution. Subsequently, two ways to conceive of extrinsic irreversibility are described. In one of them epistemic states are represented by (reduced) density operators, in the other they are represented by probability distributions of pure states. Section 4 presents the intrinsic conception of irreversibility. One major line of research in this regard deals with transformations from invertible K-systems to non-invertible exact systems, the other uses the concept of rigged Hilbert spaces to extend the state of a system beyond Hilbert space. Section 5 summarizes the main points and indicates some open questions.

2 Ontic and epistemic descriptions

2.1 General issues

Can nature be observed and described as it is in itself, independent of those who observe and describe - that is to say, nature as it is "when nobody looks"? This question has been debated throughout the history of philosophy with no clear answer either way. Each perspective has strengths and weaknesses, and each has had its critics and proponents in every epoch. In contemporary terminology, the two perspectives can be distinguished as the topics of ontology and epistemology. Ontological questions refer to the structure and behavior of a system as such, whereas epistemological questions refer to knowledge (or information) about systems.

In philosophical discourse it is considered a serious fallacy to confuse these two types of questions. For instance, Fetzer and Almeder emphasize that "an ontic answer to an epistemic question (or vice versa) normally commits a category mistake" [1]. Nevertheless, such mistakes are frequently committed in many fields of research when addressing subjects where the distinction between ontological and epistemological arguments is important.

The ontic/epistemic distinction refers to states and properties of a system as such or in its relation to observers, hence it is an ontological distinction.^a

In physics, the rise of quantum theory with its interpretational problems was one of the first major challenges to the ontic/epistemic distinction. The Bohr-Einstein discussions in the 1920s and 1930s serve as a famous historical example. Einstein's arguments were generally ontically motivated; that is to say, he emphasized a viewpoint independent of observers or measurements. By contrast, Bohr's emphasis was generally epistemically motivated, focusing on what we could know and infer from observed quantum phenomena. Since Bohr and Einstein never made their basic viewpoints explicit, it is not surprising that they talked past each other in a number of respects [2].

Examples of approaches trying to avoid the confusions of the Bohr-Einstein discussions are Heisenberg's distinction of actuality and potentiality [3], Bohm's ideas on explicate and implicate orders [4], or d'Espagnat's scheme of an empirical, weakly objective reality and an objective (veiled) reality independent of observers and their minds [5]. Further terms fitting into the ontic side of these distinctions are latency [6], propensity [7], or disposition [8]. See also Jammer's discussion of these notions, including their criticism and additional references [9].

A first attempt to draw an explicit distinction between ontic and epistemic descriptions for quantum systems was introduced by Scheibe [10] who himself, however, strongly emphasized the epistemic realm. Later, Primas developed this distinction in the formal framework of algebraic quantum theory [11]. The basic structure of the ontic/epistemic distinction, which will be made more precise below, can be roughly characterized as follows (for more details, the reader is referred to [11,12]):

^a On the other hand, the distinction between ontological and epistemological problems can be considered as epistemological insofar as both areas represent fields of (philosophical) knowledge.

Ontic states describe all properties of a physical system exhaustively. ("Exhaustive" in this context means that an ontic state is "precisely the way it is", without any reference to epistemic knowledge or ignorance.) Ontic states are the referents of individual descriptions, the properties of the system are treated as intrinsic properties.^b As an important example, ontic states refer to closed systems; they are empirically inaccessible. Typically, their temporal evolution (dynamics) is reversible and follows fundamental, deterministic laws.

Epistemic states describe our (usually non-exhaustive) knowledge of the properties of a physical system, i.e. based on a finite partition of the relevant phase space. The referents of statistical descriptions are epistemic states, the properties of the system are treated as contextual properties. Epistemic states refer to open systems; they are, at least in principle, empirically accessible. Typically, their temporal evolution (dynamics) follows irreversible laws.

^b In a more technical terminology, one speaks of "observables" (mathematically represented by "operators") rather than properties of a system. Prima facie, the term "observable" has nothing to do with the actual observability of a corresponding property.

The combination of the ontic/epistemic distinction with the formalism of algebraic quantum theory provides a framework that is both formally and conceptually satisfying. Although the formalism of algebraic quantum theory is often hard to handle for specific physical applications, it offers significant clarifications concerning the basic structure and the philosophical implications of quantum theory. For instance, the modern achievements of algebraic quantum theory make clear in what sense pioneer quantum mechanics (which von Neumann implicitly formulated epistemically [13]) as well as classical and statistical mechanics can be considered as special cases of a more general theory. Compared to the framework of von Neumann's monograph [13], important extensions are obtained by giving up the irreducibility of the algebra of observables (not admitting observables which commute with every observable in the same algebra) and the restriction to locally compact phase spaces (admitting only finitely many degrees of freedom). As a consequence, modern quantum physics is able to deal with open systems in addition to isolated ones; it can involve infinitely many degrees of freedom such as the infinitely many modes of a radiation field; it can properly consider interactions with the environment of a system; superselection rules, classical observables, and phase transitions can be formulated, which would be impossible in an irreducible algebra of observables; there exist infinitely many representations inequivalent to the Fock representation; and non-automorphic, irreversible dynamical evolutions can be successfully incorporated and even derived.

In addition to this remarkable progress, the mathematical rigor of algebraic quantum theory in combination with the ontic/epistemic distinction allows us to address a number of unresolved conceptual and interpretational problems of pioneer quantum mechanics from a new perspective. First, the distinction between different concepts of states as well as observables provides a much better understanding of many confusing issues in earlier conceptions, including alleged paradoxes such as those of Einstein, Podolsky, and Rosen (EPR) [14]. Second, a clear-cut characterization of different concepts of states and observables is a necessary precondition to explore new approaches, beyond von Neumann's projection postulate, toward the central problem that pervades all quantum theory: the measurement problem. Third, a number of much-discussed interpretations of quantum theory and their variants can be appreciated more properly if they are considered from the perspective of an algebraic formulation.

One of the most striking differences between the concepts of ontic and epistemic states is their difference concerning operational access, i.e. observability and measurability. At first sight it might appear pointless to keep a level of description which is not related to what can be operationalized empirically. However, a most appealing feature at this ontic level is the existence of first principles and fundamental laws that cannot be obtained at the epistemic level. Furthermore, it is possible to rigorously deduce (e.g., to "GNS-construct"; cf. [12,15]) a proper epistemic description from an ontic description if enough details about the empirically given situation are known. These aspects show that the crucial point is not to decide whether ontic or epistemic levels of discussions are right or wrong in a mutually exclusive sense. There are always ontic and epistemic elements to be taken into account for a proper description of a system. This requires the definition of ontic and epistemic terms to be relativized with respect to some selected framework within a set of (hierarchical) descriptions (see [16] for details and examples). The problem is then to use the proper level of description for a given context, and to develop and explore well-defined relations between different levels.

These relations are not universally prescribed; they depend on contexts of various kinds. The concepts of reduction and emergence are of crucial significance here. In contrast to the majority of publications dealing with these topics, it is possible to precisely specify their meaning in mathematical terms. Contexts, or contingent conditions, can be formally incorporated as topologies in which particular asymptotic limits give rise to novel, emergent properties unavailable without those contexts (see [15] for more details). It should also be mentioned that the distinction between ontic and epistemic descriptions is neither identical with that of parts and wholes nor with that of micro- and macrostates as used in statistical mechanics or thermodynamics. The thermodynamic limit of an infinite number of degrees of freedom provides only one example of a contextual topology, others are the Born-Oppenheimer limit in molecular physics or the short-wavelength limit for geometrical optics.

These examples indicate that the usefulness or even inevitability of the ontic/epistemic distinction is not restricted to quantum systems. It plays a significant role in the description of classical systems as well. More specifically, it has been shown in detail that for systems exhibiting deterministic chaos the distinction of ontic and epistemic descriptions is necessary if category mistakes and corresponding interpretational fallacies are to be avoided [17].

3 Breaking Time-Reversal Symmetry: Extrinsic Irreversibility

3.1 Time-Reversal Symmetry in Closed Systems

Let us start with a closed quantum system which can be considered without any reference to an environment. The pure state $\phi$ of such a system is an extremal positive linear functional on a C*-algebra $\mathcal{A}$. The state $\phi \in \mathcal{A}^*$, where $\mathcal{A}^*$ is the dual of $\mathcal{A}$, is then called an ontic state of the closed system. If a Hilbert space representation of $\mathcal{A}$ is possible, $\phi$ can be represented as a state vector $\psi \in \mathcal{H}$, characterized by the expectation values $\langle\psi|A\psi\rangle$ of all observables $A \in \mathcal{A}$. Under particular conditions, the dynamics of $\phi$ is given by the time-reversal invariant Schrödinger equation.

In the traditional Hilbert space representation, the algebra $\mathcal{A}$ of observables is irreducible; there are no nontrivial observables that commute with every observable in the algebra. Due to the Stone-von Neumann theorem, every representation of the canonical commutation relations is then equivalent to the Schrödinger representation. In the more general setting of a Fock space (sum of tensor products of one-particle Hilbert spaces), the same holds for Fock representations.

A restriction of $\phi$ to a subsystem is not a pure state in general; hence it is in general illegitimate to consider a closed quantum system as consisting of closed subsystems. As a consequence, an ontic state $\phi$ characterizes an individual, undivided whole not consisting of subsystems with their own ontic states. This is the level of description to which the notions of quantum nonlocality or quantum holism apply. Since the concept of an environment does not make sense for ontic states of closed systems, it is illegitimate to speak about their entanglement or interaction with another state.
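The statement that the restriction of a pure state to a subsystem is in general not pure can be made concrete in the simplest finite-dimensional case. The sketch below is an illustration in ordinary Hilbert space terms, not in the algebraic setting used in the text: it takes two qubits (an assumption chosen for brevity), computes the reduced density matrix of the first qubit by a partial trace, and evaluates the purity $\mathrm{Tr}(D^2)$, which equals 1 only for pure states. The function names are hypothetical.

```python
import math


def reduced_density_matrix(psi):
    """Partial trace over the second qubit of a two-qubit pure state.

    psi lists the four amplitudes in the order |00>, |01>, |10>, |11>.
    """
    red = [[0j, 0j], [0j, 0j]]
    for i in range(2):
        for k in range(2):
            for j in range(2):
                red[i][k] += psi[2 * i + j] * psi[2 * k + j].conjugate()
    return red


def purity(rho):
    """Tr(rho^2) for a 2x2 density matrix; equals 1 only for a pure state."""
    return sum((rho[i][k] * rho[k][i]).real for i in range(2) for k in range(2))


s = 1.0 / math.sqrt(2.0)
bell = [s, 0.0, 0.0, s]          # entangled state (|00> + |11>)/sqrt(2)
product = [1.0, 0.0, 0.0, 0.0]   # product state |00>

purity_bell = purity(reduced_density_matrix(bell))
purity_product = purity(reduced_density_matrix(product))
print("purity of restriction, entangled state:", purity_bell)
print("purity of restriction, product state:  ", purity_product)
```

For the entangled state the restriction is maximally mixed, whereas the product state restricts to a pure state, which is the exceptional case referred to by "in general".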

If one introduces a distinction (Heisenberg cut) to create subsystems in a closed system, then these subsystems in general are open. For example, one can then consider an object entangled and/or interacting with its environment. The epistemic state $\eta$ of those subsystems can be represented in two conceptually different ways.

3.2 Density Operators as Non-Pure States

The first, more or less familiar representation of an epistemic state $\eta$ is given by a (reduced) density operator $D \in \mathcal{M}_*$, where $\mathcal{M}_*$ is the predual of a W*-algebra $\mathcal{M}$ of contextual observables. The expectation value of an observable $M \in \mathcal{M}$ in the state represented by $D$ is given by $\mathrm{Tr}(DM)$. The epistemic state $\eta$ represented by $D$ is a non-pure state. EPR-correlations between subsystem and environment are generic if the contextual algebra of observables is non-commutative.

The term "contextual observables" derives from the fact that their construction requires the selection of a context defined by a subset of "relevant" observables $B \in \mathcal{B} \subset \mathcal{A}$ and a reference state (e.g., vacuum state, KMS state) distinguished by some appropriate stability condition. This context induces the weak closure of $\mathcal{B}$ and gives rise to a contextual topology in $\mathcal{M}$. If the context is known well enough, then the GNS representation is a powerful constructive tool to implement a proper contextual topology (see, e.g., [15]).

The dynamics of $D$ is of Schrödinger type plus dissipative terms (e.g., a master equation), so that the time-reversal invariance of the Schrödinger equation can be broken [18,19].

3.3 Probability Distributions of Pure States

If the epistemic state $\eta$ of an open system is approximately pure by a clever dressing of object and environment ($b$ indicates bare objects and environments and $d$ indicates dressed objects and environments),

$$\mathcal{H}^b_{obj} \otimes \mathcal{H}^b_{env} = \mathcal{H}^d_{obj} \otimes \mathcal{H}^d_{env},$$

$\eta$ can be represented (estimated) by a probability distribution $\mu$ of pure states. (A dressing procedure is clever if it minimizes EPR-correlations between object and environment, or if it maximizes the "integrity" of both object and environment [20].) $\mathcal{H}^d_{obj}$ is the proper Hilbert space for an approximately pure epistemic state $\eta$. Although $\eta$ can be uniquely extended to a normal state on $\mathcal{M}$ (represented by a density operator), the pure states and their distribution $\mu$ themselves do not make sense on $\mathcal{M}$. The "relevant" observables are elements of a C*-subalgebra $\mathcal{B} \subset \mathcal{A}$.

The dynamics of $\rho$ is of Schrödinger type plus stochastic terms (e.g., an Itô/Stratonovich equation), so that the time-reversal invariance of the Schrödinger equation can be broken. The stochastic aspect of the time evolution (of approximately pure states of the object) originates from the fact that the (initial) state of the environment cannot be determined and therefore must be treated as a stochastic variable. Starting from an initial pure state $\rho_0$, one gets time-evolved states $\rho_{t,\omega}$, where $\omega$ is the stochastic variable. First steps of such an approach toward single open quantum systems, not based exclusively on decompositions of density-operator dynamics, were proposed in [21,22].

For a large class of stochastic dynamics of approximately pure states of objects, one ends up with one particular distribution $\mu_\infty$ of pure states in the limit $t \to \infty$ independently of the initial conditions (such dynamical objects are called ergodic). Splitting the underlying C*-algebra $\mathcal{B}$ into two subsystems with two C*-subalgebras $\mathcal{B}_1$ and $\mathcal{B}_2$, $\mathcal{B} = \mathcal{B}_1 \otimes \mathcal{B}_2$, is then admitted under particular conditions. In an ideal situation all those pure states onto which the probability measures $\mu_t$ extend are product states with respect to the tensor product $\mathcal{B} = \mathcal{B}_1 \otimes \mathcal{B}_2$. This situation never arises in practice, but "most" relevant pure states can be product states or almost product states, if the dressing tensorization is chosen appropriately [23].

3.4 Dynamics of Measurement: a Simple Example

Any dynamical description of measurement has to start from a proper decomposition of a system into a dressed object and its dressed environment. It is crucial to keep in mind that such a decomposition is a logical precondition for the dynamics of measurement insofar as the Hamiltonian of the composed system needs to be written as a sum

$$H = H_{obj} \otimes 1 + 1 \otimes H_{env} + H_{int}. \qquad (1)$$
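The tensor structure of Eq. (1) can be illustrated with small matrices. The following sketch is a toy stand-in only: a two-level "environment" replaces the boson field (which has infinitely many degrees of freedom), and all numerical values and helper names are assumptions chosen for the illustration. It shows how the object and environment terms act on separate tensor factors while the interaction term couples them.

```python
def kron(a, b):
    """Kronecker (tensor) product of two matrices given as nested lists."""
    rows_b, cols_b = len(b), len(b[0])
    return [
        [a[i // rows_b][j // cols_b] * b[i % rows_b][j % cols_b]
         for j in range(len(a[0]) * cols_b)]
        for i in range(len(a) * rows_b)
    ]


def add(*mats):
    """Entrywise sum of equally sized matrices."""
    return [[sum(m[i][j] for m in mats) for j in range(len(mats[0][0]))]
            for i in range(len(mats[0]))]


IDENTITY = [[1, 0], [0, 1]]
SIGMA_X = [[0, 1], [1, 0]]
SIGMA_Z = [[1, 0], [0, -1]]

# Toy ingredients (assumptions): values chosen only to make the structure visible.
h_obj = [[0.5 * v for v in row] for row in SIGMA_Z]                  # object term
h_env = [[0.0, 0.0], [0.0, 2.0]]                                     # two-level stand-in environment
h_int = [[0.1 * v for v in row] for row in kron(SIGMA_X, SIGMA_X)]   # coupling term

# H = H_obj (x) 1 + 1 (x) H_env + H_int, following the structure of Eq. (1)
h_total = add(kron(h_obj, IDENTITY), kron(IDENTITY, h_env), h_int)

for row in h_total:
    print(row)
```

Without the coupling term the total matrix is diagonal in the product basis, so product states stay product states; the off-diagonal entries contributed by the interaction are what correlate object and environment.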

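An illustrative heuristic example has been extensively discussed by Primas [24]. Consider the simple case of a two-level quantum object (spin 1/2 system) with the Hamiltonian

$$H_{obj} = -\frac{\hbar}{2}\sum_{\nu=1}^{3}\Omega_\nu\sigma_\nu, \qquad (2)$$

a sufficiently nontrivial boson field environment

$$H_{env} = \sum_{\nu=1}^{3}\sum_k \omega_k\,a^*_{k\nu}a_{k\nu}, \qquad (3)$$

and an interaction

$$H_{int} = \sum_{\nu=1}^{3}\sigma_\nu \otimes A_\nu, \qquad (4)$$

where

$$A_\nu = \sum_k \lambda_{k\nu}\,\alpha_{k\nu} + \mathrm{c.c.} \qquad (5)$$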

If such a decomposition has been properly carried out (cf. Sec. 3.3), then it is possible to derive the expectation values

$$M(t) = \langle\psi_t|\sigma|\psi_t\rangle \qquad (6)$$

$$a(t) = \langle\chi_t|A|\chi_t\rangle \qquad (7)$$

with respect to the (approximate) product state

$$\Psi_t = \psi_t^{obj} \otimes \chi_t^{env}. \qquad (8)$$

Corresponding to the product state $\Psi_t$, the C*-algebra of intrinsic observables in the composed system of dressed object and dressed environment is

$$\mathcal{A} = \mathcal{A}_{obj} \otimes \mathcal{A}_{env}. \qquad (9)$$

$\mathcal{A}_{obj}$ is the C*-algebra of $2 \times 2$ matrices and $\mathcal{A}_{env}$ is the C*-algebra of intrinsic observables of an environment with infinitely many degrees of freedom.

The equations of motion for the expectation values M(t) and a(t) are given by:

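$$\dot{M}(t) = M(t) \times \Omega + M(t) \times a(t), \qquad (10)$$

$$\dot{\alpha}_{k\nu}(t) = -i\omega_k\,\alpha_{k\nu}(t) + \frac{i}{2}\lambda^*_{k\nu}M_\nu(t). \qquad (11)$$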

They describe the feedback between object and environment. More precisely, they describe the polarization $M$ of the object under the influence of the environment and the motion of the environment observable $a$ (boson operator) under the polarizing influence of the object. The solution of the second equation, referring to the observables of the environment (or the measuring system, respectively) has a retarded and an advanced part:

(t > 0), (12)

(t < 0) , (13)

A bidirectionally deterministic system can be described in terms of a superposition of a backward deterministic (forward non-deterministic) and a forward deterministic (backward non-deterministic) process which are equally relevant a priori. Selecting one of these solutions and disregarding the other requires the time inversion symmetry of the compound system to be broken. For this purpose, one can apply the principle of causality (past-determinacy, error-free retrodiction, no anticipation) as a "heuristic" argument for the selection of the retarded solution.

It has been argued that the retarded, i.e., the backward deterministic, forward non-deterministic, solution is a K-flowc on a state space with infinitely many degrees of freedom24. In the simplest case, the relaxation time for this K-flow is the time constant r„ of an exponentially decaying correlation function (for details, see24)

Kv = ivexp(-\t\/Tv). (14)

At this point we are still at the level of description of intrinsic observables needed for the specification of initial conditions of the K-flow. Conceptually, this K-flow represents a stochastic process which corresponds to chaos in the sense of Wiener25 rather than chaos in the sense of Kolmogorov and Sinai (i.e., a dissipative dynamics). By introducing a context via a reference state, with respect to which stability in a particular sense (hopefully more general than thermal equilibrium) can be checked, one can proceed to (GNS-constructed) contextual observables.

3.5 General Features of Extrinsic Irreversibility

The breaking of time-reversal symmetry in the framework of extrinsic irreversibility corresponds to the conceptual transition from closed systems with cNote that K-flows or K-systems play an important role in one of the approaches of intrinsic irreversibility (see Sec. 4.1). It would be interesting, but exceeds the scope of this paper, to explore the question of whether the process of measurement as described here can be conceived as intrinsically irreversible. In this respect, see, e.g.,2 6 .

aTke; = exp(-iLjkt)akl/{0)

i r - 2Xk" exp(-iuk(t - s))Mv(s)ds

fj = exp(-iujkt)akv(t)

i f° + 9 ^ / exp(-wt(t-s))M„(s)ds

60

ontic states to open systems with epistemic states. Such a transition can be understood by dividing a closed system into open, more or less EPR-correlated subsystems (e.g., object and environment), and by selecting a subset of "relevant" observables. The proper state concepts are epistemic. There are then two different statistical representations for different epistemic state concepts. A /^-statistical representation expresses a probability distribution of pure states, whereas the usual /^-statistical representation focuses on reduced density operators.

The interaction of the open subsystems is described by dynamical laws different from the time-reversal invariant dynamics of a closed system. Breaking the time-reversal invariance of a unitary group evolution generates two semigroups, which can be endowed with two arrows of time opposite to each other. It should be pointed out that the forward arrow cannot be selected by physical reasons alone. Extra-physical arguments such as consistency with experience, causality, etc. must be invoked.
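The retarded and advanced solutions of Sec. 3.4 provide a concrete instance of two oppositely directed evolutions derived from one and the same equation of motion. The following sketch checks numerically that a linear driven equation of the form $\dot{a}(t) = -i\omega a(t) + c\,M(t)$ is solved by the variation-of-constants formula $a(t) = e^{-i\omega t} a(0) + c \int_0^t e^{-i\omega(t-s)} M(s)\, ds$ both for $t > 0$ and for $t < 0$. The single-mode restriction, the coupling constant `c`, the frequency `omega` and the driving term `M` are assumptions chosen for illustration and are not taken from [24].

```python
import cmath
import math

def rk4(rhs, a0, t_end, n):
    """Integrate da/dt = rhs(t, a) from t = 0 to t_end (t_end may be negative)."""
    h = t_end / n
    a, t = a0, 0.0
    for _ in range(n):
        k1 = rhs(t, a)
        k2 = rhs(t + h / 2, a + h / 2 * k1)
        k3 = rhs(t + h / 2, a + h / 2 * k2)
        k4 = rhs(t + h, a + h * k3)
        a += h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
        t += h
    return a

def closed_form(a0, omega, c, M, t_end, n):
    """Variation-of-constants solution; the integral uses Simpson's rule (n even)."""
    h = t_end / n
    def g(s):
        return cmath.exp(-1j * omega * (t_end - s)) * M(s)
    total = g(0.0) + g(t_end)
    for j in range(1, n):
        total += (4 if j % 2 else 2) * g(j * h)
    return cmath.exp(-1j * omega * t_end) * a0 + c * total * h / 3

omega = 1.3                        # hypothetical mode frequency
c = -0.5j * 0.2                    # hypothetical coupling prefactor
M = lambda s: math.cos(0.7 * s)    # hypothetical polarization component
a0 = 1.0 + 0.0j
rhs = lambda t, a: -1j * omega * a + c * M(t)

numeric_forward = rk4(rhs, a0, 5.0, 2000)       # t > 0 branch
analytic_forward = closed_form(a0, omega, c, M, 5.0, 2000)
numeric_backward = rk4(rhs, a0, -5.0, 2000)     # t < 0 branch
analytic_backward = closed_form(a0, omega, c, M, -5.0, 2000)
```

The same formula covers both branches: for $t < 0$ the integral from $0$ to $t$ is the negative of the integral from $t$ to $0$, which accounts for the opposite sign in front of the integral in the advanced solution. The equation itself does not prefer either branch; the choice between them is the additional selection discussed above.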

4 Breaking Time-Reversal Symmetry: Intrinsic Irreversibility

In contrast to the extrinsic concept of irreversibility, there is an alternative concept of intrinsic irreversibility, mainly advocated by Prigogine and collaborators (more recently also by Bohm). They propose describing states of any system generically with distributions $\rho$ (i.e., probability distributions or density operators). The claim is that the state $\rho$ of systems beyond a particular degree of complexity evolves irreversibly by itself, i.e., without any relationship to an environment. There are essentially two lines of research pursuing this proposal.

4.1 Λ-Transformation from K-Systems to Exact Systems

The notion of the Λ-transformation was developed by Misra, Courbage and Prigogine in the 1970s. It is essentially based on the theory of ergodic systems. In particular, the concept of Kolmogorov systems, briefly K-systems, is of central significance in this context.

Definition 1 [27]: Let $(X, \mathcal{A}, \mu)$ be a normalized measure space and let $S : X \to X$ be an invertible transformation such that $S$ and $S^{-1}$ are measurable and measure preserving. The transformation $S$ is called a K-automorphism if there exists a σ-algebra $\mathcal{A}_0$ such that the following three conditions are satisfied: (i) $S^{-1}(\mathcal{A}_0) \subset \mathcal{A}_0$; (ii) the σ-algebra $\bigcap_{n=0}^{\infty} S^{-n}(\mathcal{A}_0)$ is trivial (i.e., contains only sets of measure 1 or 0); (iii) the smallest σ-algebra containing $\bigcup_{n=0}^{\infty} S^{n}(\mathcal{A}_0)$ is identical to $\mathcal{A}$.

Another way to characterize (classical) K-systems is by way of the existence of positive Lyapunov exponents, equivalent to a strictly positive Kolmogorov-Sinai entropy. The properties of K-systems imply mixing and ergodicity. K-systems are invertible transformations, hence their deterministic dynamics, given by $\rho(t) = U_t \rho(0)$, is reversible ($U_t$ is a unitary evolution operator acting on $\rho$). A standard example is the (2-dimensional) baker transformation.
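The invertibility of the baker transformation can be made explicit in a few lines. The sketch below assumes the common convention $B(x,y) = (2x, y/2)$ for $x < 1/2$ and $B(x,y) = (2x - 1, (y+1)/2)$ otherwise, on the unit square; exact rational arithmetic is used so that no rounding error obscures the reversibility.

```python
from fractions import Fraction

def baker(x, y):
    """One step of the baker transformation on the unit square."""
    if x < Fraction(1, 2):
        return 2 * x, y / 2
    return 2 * x - 1, (y + 1) / 2

def baker_inverse(x, y):
    """Inverse step: the y coordinate remembers which branch was taken."""
    if y < Fraction(1, 2):
        return x / 2, 2 * y
    return (x + 1) / 2, 2 * y - 1

point = (Fraction(3, 7), Fraction(5, 11))

forward = point
for _ in range(10):
    forward = baker(*forward)

back = forward
for _ in range(10):
    back = baker_inverse(*back)
```

Ten forward steps followed by ten inverse steps return the initial point exactly. The information about which half of the square a point came from is stored in the contracting coordinate $y$, which is why no information is lost at the level of individual phase points.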

Another important class of mixing systems refers to so-called exact systems.

Definition 2 [27]: Let $(X, \mathcal{A}, \mu)$ be a normalized measure space and let $S : X \to X$ be a measure preserving transformation such that $S(A) \in \mathcal{A}$ for each $A \in \mathcal{A}$. If $\lim_{n \to \infty} \mu(S^n(A)) = 1$ for every $A \in \mathcal{A}$ with $\mu(A) > 0$, then $S$ is called exact.

Exact systems are represented by non-invertible transformations, hence their stochastic dynamics, given by $\tilde\rho(t) = W_t \tilde\rho(0)$, is irreversible. $W_t$ is a semigroup evolution operator acting on a distribution $\tilde\rho$ rather than $\rho$. For instance, an exact system obtained from the baker transformation is the dyadic transformation

S(x) = 2x (mod 1).
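The irreversibility of the dyadic transformation at the level of densities can be sketched with its Frobenius-Perron operator, $(Pf)(x) = \frac{1}{2}\left[f(x/2) + f((x+1)/2)\right]$. The example below is a minimal illustration, assuming densities that are piecewise constant on $2^m$ equal bins of $[0,1)$, in which case the operator acts exactly on the vector of bin values; the bin count and the initial densities are chosen for illustration only.

```python
from fractions import Fraction

def frobenius_perron_dyadic(density):
    """Frobenius-Perron step for S(x) = 2x (mod 1) on piecewise-constant densities.

    density[j] is the value of the density on the bin [j/n, (j+1)/n);
    n must be even. Bin j has the two preimage bins j//2 and n//2 + j//2.
    """
    n = len(density)
    half = n // 2
    return [(density[j // 2] + density[half + j // 2]) / 2 for j in range(n)]

def evolve(density, steps):
    for _ in range(steps):
        density = frobenius_perron_dyadic(density)
    return density

N = 8
# Two different normalized densities, concentrated on the first and last bin.
left_peak = [Fraction(N)] + [Fraction(0)] * (N - 1)
right_peak = [Fraction(0)] * (N - 1) + [Fraction(N)]

after_left = evolve(left_peak, 3)
after_right = evolve(right_peak, 3)
```

After three steps both initial densities have become the uniform density. Since distinct initial densities are mapped to the same final density, the evolution cannot be inverted, which is the semigroup property referred to above.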

A theorem by Rokhlin [28] says that every exact system is a factor of a K-system. This means that K-systems can be transformed into exact systems by their projections (or "factors", see [27]). More generally, a factor of a K-system can be obtained by restriction to dilating fibers or unstable manifolds. Hence it is intuitively clear that the invertibility of a K-system gets lost by its transformation into an exact system.
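For the baker and dyadic transformations the factor relation can be checked directly. The sketch below assumes the standard form of the baker transformation and takes the projection $\pi(x,y) = x$ onto the dilating coordinate; it verifies $\pi \circ B = S \circ \pi$ on a grid of rational points and exhibits the loss of invertibility under the projection.

```python
from fractions import Fraction

def baker(x, y):
    """Baker transformation on the unit square (invertible)."""
    if x < Fraction(1, 2):
        return 2 * x, y / 2
    return 2 * x - 1, (y + 1) / 2

def dyadic(x):
    """Dyadic transformation S(x) = 2x (mod 1) (non-invertible)."""
    return (2 * x) % 1

def project(point):
    """Projection onto the dilating (x) coordinate."""
    return point[0]

grid = [Fraction(i, 16) for i in range(16)]

# Factor relation: projecting after a baker step equals a dyadic step
# applied to the projection.
factor_holds = all(
    project(baker(x, y)) == dyadic(x) for x in grid for y in grid
)

# Under the dyadic map, the point 1/2 has two preimages in the grid.
preimages_of_half = [x for x in grid if dyadic(x) == Fraction(1, 2)]
```

The two preimages of $1/2$ are distinguished in the baker transformation by the $y$ coordinate of their images; once $y$ is projected out, this distinction is no longer available.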

According to Misra et al. [29,30], the relations between the two kinds of dynamics $U_t$ and $W_t$ and the two state concepts $\rho$ and $\tilde\rho$ are provided by a similarity transformation $\Lambda$ according to

$$W_t = \Lambda U_t \Lambda^{-1} ,$$

$$\tilde\rho = \Lambda \rho .$$

Wightman's question [31] as to the meaning of $\tilde\rho$ in his review of [30] gets an immediate answer if one applies Rokhlin's theorem to construct $\Lambda$ (cf. [32]). The transformed distribution $\tilde\rho$ is the projection of $\rho$ onto a dilating subspace. This can easily be seen for the examples of the baker transformation and the dyadic transformation. In the more complicated case of continuous-time nonlinear (hyperbolic) systems, the corresponding procedure would be a projection onto the unstable manifolds, i.e., those directions along which the Lyapunov exponents are positive and add up to the Kolmogorov-Sinai entropy (cf. [33,34]). As an important conceptual feature, such projections select a time direction.

A crucial formal feature associated with the irreversibility due to $W_t$ is that a properly constructed $\Lambda$ (and hence $\Lambda U_t \Lambda^{-1}$) preserves the positivity of the state distributions only for positive times. A conceptual discussion of this point can be found in [35]. For a more detailed, formal account of the role which positivity preservation plays in the transformation between irreversible semigroups and chaotic dynamics see [36] and references given there.

4.2 Rigged Hilbert Space Representation

Intrinsic irreversibility has also been implemented in an approach based on an extension of the usual Hilbert space representation of the state of a system. This approach makes use of the so-called rigged Hilbert space (RHS) construction first introduced by the Russian mathematician Gel'fand and his collaborators [37]. Roberts [38] and Bohm [39] independently showed how Dirac's formalism could be justified with complete mathematical rigor in an RHS. By the end of the 1970s, it turned out that some basic physical problems of Hilbert space quantum mechanics, notably in the context of decaying states or resonances, could be clarified in terms of RHS ([40] and references therein).

Very briefly, an RHS (Gel'fand triplet) can be understood as follows. Let $\Psi$ be an abstract linear scalar product space and complete $\Psi$ with respect to two topologies. The first topology is the standard norm topology yielding a separable Hilbert space $\mathcal{H}$. The second topology $\tau_\Phi$, yielding the space $\Phi$, is defined by a countable set of norms

$$\|\phi\|_n = \sqrt{(\phi, \phi)_n} , \qquad \phi \in \Phi, \quad n = 0, 1, 2, \ldots \tag{15}$$

where $\phi' \in \Phi$ and the scalar product is given by

$$(\phi, \phi')_n = (\phi, (\Delta + 1)^n \phi') , \qquad n = 0, 1, 2, \ldots \tag{16}$$

where $\Delta$ is the Nelson operator $\Delta = \sum_i X_i^2$ [41]. The $X_i$ are operators representing the observables for the system in question and are the generators for the Nelson operator. Furthermore the operator $\Delta + 1$ is a nuclear operator and ensures that $\Phi$ is a nuclear space (cf. [42,39]). An operator is nuclear if it is linear, essentially self-adjoint and its inverse is Hilbert-Schmidt. An operator $A^{-1}$ is Hilbert-Schmidt if $A^{-1} = \sum_i \lambda_i P_i$, where the $P_i$ are mutually orthogonal projection operators on a finite dimensional vector space and $\sum_i \lambda_i^2 < \infty$, with $\lambda_i$ denoting the eigenvalues of $A^{-1}$ [39]. We then have the Gel'fand triplet of spaces

$$\Phi \subset \mathcal{H} \subset \Phi^{\times} \tag{17}$$

where $\Phi^{\times}$ is the dual to the space $\Phi$.

The Nelson operator fully determines the choice of function space when it comes to choosing a realization of the space $\Phi$. However, there are many different inequivalent irreducible representations of an enveloping algebra of a Lie group used to generate a Nelson operator describing physical systems. Therefore further restrictions on the choice of function space for a realization of $\Phi$ are required. The particular characteristics of the physical context of the system being modeled provide some of these restrictions, analogous to the situation for GNS constructions in the transition from C*- to W*-algebras in algebraic quantum mechanics [23]. Additional restrictions may be required due to the convergence properties desired for test functions in $\Phi$ and $\Phi^{\times}$.

Bohm and colleagues applied the RHS approach to intrinsic irreversibility in the context of scattering and decay phenomena [40,43]. Antoniou and Prigogine [44] extended the approach to broader contexts. The core idea in both versions is that a unitary group operator $U_t = \exp(-iHt)$, $-\infty < t < \infty$, generated by a Hamiltonian $H$, under very general circumstances, may be extended from $\mathcal{H}$ to $\Phi^{\times}$ (restricted to $\Phi$). For scattering processes, $\Phi$ is the intersection of the Hardy class functions with the Schwartz class functions. Because of continuity and completeness requirements, $U_t : \Phi^{\times} \to \Phi^{\times}$ ($U_t : \Phi \to \Phi$) can be extended to the upper half plane $\Phi_+^{\times}$ (restricted to $\Phi_+$) for positive times and to the lower half plane $\Phi_-^{\times}$ ($\Phi_-$) for negative times [43]. The extension of $U_t$ to $\Phi^{\times}$ (restriction to $\Phi$) forms two semigroups because the extension (restriction) cannot be defined for replacement of $t$ with $-t$. Thus, semigroup evolution falls out of the analysis quite naturally in the RHS framework.

4.3 General Features of Intrinsic Irreversibility

In the intrinsic conception of irreversibility, states of a system are generically represented by distributions in a suitable state space, where pure states are δ functions. The trajectories of individual points either (1) are considered irrelevant because empirically inaccessible (as in the Λ-transformation approach) or (2) make minimal contributions to the collective behavior of the system when a sufficient number of Poincaré resonances are present (as in the RHS approach). For systems beyond a particular degree of complexity (K-systems, Poincaré resonances, etc.), the dynamics of the system is governed by irreversible evolution laws regardless of interactions with an environment.

While the Λ-transformation approach has only been applied to the baker map, the RHS approach has been applied to nonlinear maps, Friedrichs models, scattering experiments and other decay phenomena. In the latter approach, exact Golden Rules for decay and survival probabilities and their rates can be derived in agreement with experimental observations [43].

^d The dual space $\Phi^{\times}$ is the space of linear functionals acting on elements of $\Phi$; its topology is induced by the choice of $\tau_\Phi$, and it includes distributions among its elements.

In both approaches the transition from reversible to irreversible dynamical evolution laws is achieved by breaking the time-reversal symmetry in specific ways leading to two semigroups. The time direction of the semigroups, however, is not given by either the Λ-transformation or RHS approaches. Physical considerations alone are insufficient to select the forward arrow and one must appeal to consistency with experience, causality or other criteria.

5 Summary and Open Questions

There are two basic points at which extrinsic and intrinsic notions of irreversibility coincide. The first is that both notions explicitly break the time-reversal symmetry of reversible dynamical laws. This is clearly the case for the standard, external view, in which the transition from fundamental, reversible laws to contextual, irreversible laws corresponds to the transition from ontic states of closed systems to epistemic states of open systems. But even for the alternative, intrinsic view "irreversibility is an emergent feature" [45]. In the framework of the Λ-transformation, the time-reversal symmetry of K-systems is broken, leading to irreversible, exact systems. In the RHS representation, a similar symmetry breaking is achieved by the transition from Hilbert space to the rigging spaces $\Phi$ and $\Phi^{\times}$.

The breaking of time-reversal symmetry always produces two semigroups which can be endowed with opposite temporal directions. Selection criteria must be used to select one of these two directions for a preferred mode of description. In both extrinsic and intrinsic approaches, there is no such criterion available based on physical reasoning alone. The selection is based on extra-physical arguments such as causality, experience, and others. This second point of agreement between extrinsic and intrinsic irreversibility raises the interesting question of what conditions the "proper" direction of time has to satisfy. It could be argued that, up to the condition that it is the same for all physical systems, the selection is arbitrary.

There are two basic points at which extrinsic and intrinsic notions of irreversibility apparently differ. One of them concerns the role of the environment, the other has to do with the state concepts used in the two approaches. Briefly speaking, the role of the environment and the distinction of different state concepts is crucial in the standard framework of extrinsic irreversibility. The conceptual framework of the formalisms referring to intrinsic irreversibility neither (1) explicitly contains the concept of an environment nor (2) distinguishes between different state concepts. These observations do not necessarily imply that intrinsic irreversibility really can dispense with points (1) and (2). It is likely that the two points play crucial roles even though they do not explicitly appear in the formalism and its usual interpretation.

The projection (factorization) which is the crucial part of a Λ-transformation can be considered as the selection of an exact subsystem of the original K-system. Obviously, the Λ-transformation is not universal but context-dependent. Conceptually, the irreversible evolution of $\tilde\rho = \Lambda\rho$ due to $W_t$ could then be attributed to the restriction of the K-system to an exact subsystem. This might lead to interesting analogies with aspects of extrinsic irreversibility, if the subsystem cannot be described as a closed subsystem. Concrete empirical applications of the Λ-transformation are not yet available. They would be necessary to check the significance of a physical environment which is not explicit in the formalism.

Concerning the distinction between ontic and epistemic state concepts, it is clear that the approach of intrinsic irreversibility starts at the level of distributions rather than points. In the space of distributions, δ functions are special cases that could be related to points in a state space underlying the distribution space considered. In this way, a connection between distributions as epistemic states and points as ontic states is possible. The general claim in the Λ-transformation framework of intrinsic irreversibility, though, is that ontic states in the sense of phase points are meaningless or irrelevant since they are empirically inaccessible.

But is it justified to consider ontic states as generally irrelevant because they are empirically inaccessible? Reversible fundamental laws refer to ontic states, and it is not easy to formulate physics without them. The monographs by Ludwig [46], which consistently avoid any ontic elements, are an illustrative example. Moreover, special techniques to break symmetries often enable a unique derivation of irreversible contextual laws if the fundamental laws plus contexts are known. This also holds for the symmetry breaking used to derive intrinsic irreversibility from time-reversal invariant evolution in the Λ-transformation approach. The empirical inaccessibility of ontic states notwithstanding, one should therefore not dismiss their overall relevance too quickly.

In the RHS approach, there is no contradiction with the formal arguments in the case of extrinsic irreversibility insofar as the extension of $U_t$ from $\mathcal{H}$ into $\Phi^{\times}$ leads from reversibility to irreversibility. In this case, irreversibility is a feature arising during the transition from states in $\mathcal{H}$ to states whose state space is defined with respect to contexts. In the algebraic framework of Sec. 3, such contexts are reflected by a contextual topology on $\mathcal{M}$. As mentioned in Sec. 4.2, physical contexts may not be known sufficiently well to determine $\Phi^{\times}$ uniquely. The physical examples used to demonstrate the significance of the RHS formulation (e.g., decay) suggest that a physical environment is inevitable, although this is not explicit in the formalism.

The relationship between ontic and epistemic states in the RHS approach is more subtle than in the Λ-transformation approach. As Petrosky and Prigogine argue [47,48], the presence of a sufficient number of Poincaré resonances in so-called large Poincaré systems (LPS) rapidly converts the smooth, infinitely differentiable trajectories of the phase space points into random walks. Though the trajectories are not considered to be empirically inaccessible, their effects are limited to the formation of higher and higher orders of correlations as the dynamics evolves. The phase space points can represent ontic states, but the correlations also have an ontic status. Correlations very rapidly come to dominate the dynamics of all collective modes of behavior of LPS (e.g. the approach to equilibrium) as the correlations diffuse throughout the system. In this way the effects of individual points and trajectories become irrelevant to the dynamics of the whole and, thus, one can argue that the distribution description is an ontic description of the system's behavior.

In this way, the distinction between ontic and epistemic states might be a powerful conceptual tool even at the level of distributions alone. There is a conceptual difference between a probability distribution conceived as a distribution over an ensemble of individual pure states (as in the $\mu$-statistical representation) and a probability distribution conceived as an individual whole. The latter concept is sometimes indicated in the context of intrinsic irreversibility and can be considered as an ontic version of the former (cf. the notion of relative onticity [16]). For instance, continuum mechanics requires a formulation which needs ontically interpreted, "holistic" distributions from the very beginning, since its description in terms of an ensemble of points would violate basic physical laws.

Among the adherents of intrinsic irreversibility it is claimed that the "holistic" concept of a distribution as a whole entails predictions, e.g., related to the dynamics of correlations in large systems, which cannot be obtained with the concept of a probability distribution of individual pure states. This claim particularly refers to situations far from thermal equilibrium. Based on Gallavotti's approach, which describes systems far from equilibrium in terms of SRB-measures [49], i.e., in an ensemble description, this claim may become testable (see also [50] for a brief discussion).

After all, it is possible to view the intrinsic approach to irreversibility as emphasizing the relative importance of the advanced level of complexity of systems with nontrivial correlations over environmental effects. While extrinsic irreversibility addresses the importance of an environment, intrinsic irreversibility should not primarily be understood as focusing on the neglect of such an environment (e.g. the environment may be a necessary condition for the existence of the dynamics). Instead, it is perhaps more appropriate to understand intrinsic irreversibility as irreversibility intrinsic to the dynamics of a system given a particular degree of its complexity.

Acknowledgments

Helpful comments by L. Accardi, L. Ballentine, H. Narnhofer, and I. Volovich during the discussion of this contribution at the conference are much appreciated. We are grateful to H. Primas for remarks on an earlier version of this paper.

References

1. J.H. Fetzer and R.F. Almeder, Glossary of Epistemology/Philosophy of Science (Paragon House, New York, 1993), p. 100f.

2. D. Howard: Space-time and separability: problems of identity and individuation in fundamental physics. In Potentiality, Entanglement, and Passion-at-a-Distance, ed. by R.S. Cohen, M. Horne, and J. Stachel (Kluwer, Dordrecht, 1997), pp. 113-141.

3. W. Heisenberg: Physics and Philosophy (Harper and Row, New York, 1958).

4. D. Bohm: Wholeness and the Implicate Order (Routledge and Kegan Paul, London, 1980).

5. B. d'Espagnat: Veiled Reality (Addison-Wesley, Reading, 1995).

6. H. Margenau: Reality in quantum mechanics. Phil. Science 16, 287-302 (1949), here: p. 297.

7. K.R. Popper: The propensity interpretation of probability, and quantum mechanics. In Observation and Interpretation in the Philosophy of Physics - With special reference to Quantum Mechanics, ed. by S. Körner in collaboration with M.H.L. Pryce (Constable, London, 1957), pp. 65-70. [Reprinted by Dover, New York, 1962.]

8. R. Harré: Is there a basic ontology for the physical sciences? Dialectica 51, 17-34 (1997).

9. M. Jammer: The Philosophy of Quantum Mechanics (Wiley, New York, 1974), pp. 448-453, 504-507.

10. E. Scheibe: The Logical Analysis of Quantum Mechanics (Pergamon, Oxford, 1973), pp. 82-88.


11. H. Primas: Mathematical and philosophical questions in the theory of open and macroscopic quantum systems. In Sixty-Two Years of Uncertainty, ed. by A.I. Miller (Plenum, New York, 1990), pp. 233-257.

12. H. Primas: Endo- and exotheories of matter. In Inside Versus Outside, ed. by H. Atmanspacher and G.J. Dalenoort (Springer Berlin, 1994), pp. 163-193.

13. J. von Neumann: Mathematische Grundlagen der Quantenmechanik (Springer, Berlin, 1932). English translation: Mathematical Foundations of Quantum Mechanics (Princeton University Press, Princeton, 1955).

14. A. Einstein, B. Podolsky, and N. Rosen: Can quantum-mechanical description of physical reality be considered complete? Phys. Rev. 47, 777-780 (1935).

15. H. Primas: Emergence in exact natural sciences. Acta Polytechnica Scandinavica Ma 91, 83-98 (1998). See also Primas, Chemistry, Quantum Mechanics, and Reductionism (Springer, Berlin, 1983), Chap. 6.

16. H. Atmanspacher and F. Kronz: Relative onticity. In On Quanta, Mind, and Matter. Hans Primas in Context. Edited by H. Atmanspacher, A. Amann and U. Müller-Herold (Kluwer, Dordrecht, 1999), pp. 273-294.

17. H. Atmanspacher: Ontic and epistemic descriptions of chaotic systems. In Computing Anticipatory Systems: CASYS 99. Edited by D. Dubois (Springer, Berlin, 2000), pp. 465-478.

18. E. Fick and G. Sauermann: Quantenstatistik dynamischer Prozesse IIa: Antwort- und Relaxationstheorie (Harri Deutsch, Thun, 1986).

19. R. Kubo, M. Toda, and N. Hashitsume: Statistical Physics II (Springer, Berlin, 1985).

20. H. Primas: The Cartesian cut, the Heisenberg cut, and disentangled observers. In Symposia on the Foundations of Modern Physics. Wolfgang Pauli as a Philosopher, ed. by K.V. Laurikainen and C. Montonen (World Scientific, Singapore, 1993), pp. 245-269.

21. A. Amann: Structure, dynamics and spectroscopy of single molecules: a challenge to quantum mechanics. J. Math. Chem. 18, 247-308 (1995).

22. A. Amann and H. Atmanspacher: Fluctuations in the dynamics of single quantum systems. Stud. Hist. Phil. Mod. Phys. 29, 151-182 (1998).

23. A. Amann and H. Atmanspacher: C*- and W*-algebras of observables, their interpretation, and the problem of measurement. In On Quanta, Mind, and Matter. Hans Primas in Context. Edited by H. Atmanspacher, A. Amann and U. Müller-Herold (Kluwer, Dordrecht, 1999), pp. 57-79.

24. H. Primas: Induced nonlinear time evolution of open quantum systems. In Sixty-Two Years of Uncertainty, ed. by A.I. Miller (Plenum, New York, 1990), pp. 259-280.

25. N. Wiener: The homogeneous chaos. Am. J. Math. 60, 897-936 (1938).

26. C.M. Lockhart and B. Misra: Irreversibility and measurement in quantum mechanics. Physica A 136, 47-76 (1986). Cf. H. Primas, Math. Rev. 87k, 81006 (1987).

27. A. Lasota and M.C. Mackey: Chaos, Fractals, and Noise (Springer, Berlin, 1995).

28. V.A. Rokhlin: Exact endomorphisms of Lebesgue spaces. Izv. Akad. Nauk SSSR Ser. Mat. 25, 499-530 (1961); transl. in Am. Math. Soc. Transl. 39, 1-36 (1964).

29. B. Misra: Nonequilibrium entropy, Lyapounov variables, and ergodic properties of classical systems. Proc. Natl. Acad. Sci. USA 75, 1627-1631 (1978).

30. B. Misra, I. Prigogine, and M. Courbage: From deterministic dynamics to probabilistic descriptions. Physica A 98, 1-26 (1979).

31. A. Wightman: Review of Misra, Prigogine, and Courbage [30]. Math. Rev. 82e, 58066 (1982).

32. Z. Suchanecki: On lambda and internal time operators. Physica A 187, 249-266 (1992).

33. H. Atmanspacher and H. Scheingraber: A fundamental link between system theory and statistical mechanics. Found. Phys. 17, 939-963 (1987).

34. H. Atmanspacher: Dynamical entropy in dynamical systems. In Time, Temporality, Now, ed. by H. Atmanspacher and E. Ruhnau (Springer, Berlin, 1997), pp. 325-344.

35. R.W. Batterman: Randomness and probability in dynamical theories: on the proposals of the Prigogine school. Philosophy of Science 58, 241-263 (1991).

36. I. Antoniou, K. Gustafson, and Z. Suchanecki: On the inverse problem of statistical physics: from irreversible semigroups to chaotic dynamics. Physica A 252, 345-361 (1998).

37. I.M. Gel'fand and N.Ya. Vilenkin: Generalized Functions, Vol. 4 (Academic, New York, 1964). Russian original published 1961 in Moscow.

38. J.E. Roberts: The Dirac bra and ket formalism. Journal of Mathematical Physics 7, 1097-1104 (1966).

39. A. Bohm: Rigged Hilbert space and mathematical descriptions of physical systems. In Lectures in Theoretical Physics IX A: Mathematical methods of theoretical physics. Edited by W.E. Brittin, A.O. Barut and M. Guenin (Gordon and Breach, New York, 1967), pp. 255-317.


40. A. Bohm and M. Gadella: Dirac Kets, Gamow Vectors, and Gelfand Triplets. Lecture Notes in Physics, Vol. 348, ed. by A. Bohm and J.D. Dollard (Springer, Berlin, 1989).

41. E. Nelson: Analytic Vectors. Annals of Mathematics 70, 572-615 (1959).

42. F. Treves: Topological Vector Spaces, Distributions and Kernels (Academic Press, New York, 1967).

43. A. Bohm, S. Maxson, M. Loewe, and M. Gadella: Quantum mechanical irreversibility. Physica A 236, 485-549 (1997).

44. I. Antoniou and I. Prigogine: Intrinsic irreversibility and integrability of dynamics. Physica A 192, 443-464 (1993).

45. T. Petrosky and I. Prigogine: The Liouville space extension of quantum mechanics. Adv. Chem. Phys. XCIX, 1-120 (1997), here p. 71.

46. G. Ludwig: Foundations of Quantum Mechanics Vols. 1/2 (Springer, Berlin, 1983/1985).

47. T. Petrosky and I. Prigogine: Poincaré resonances and the extension of classical dynamics. Chaos, Solitons & Fractals 7, 441-497 (1996).

48. T. Petrosky and I. Prigogine: The Extension of Classical Dynamics for Unstable Hamiltonian Systems. Computers & Mathematics with Applications 34, 1-44 (1997).

49. G. Gallavotti: Chaotic dynamics, fluctuations, nonequilibrium ensembles. Chaos 8, 384-392 (1998).

50. D. Ruelle: Gaps and new ideas in our understanding of nonequilibrium. Physica A 263, 540-544 (1999).


INTERPRETATIONS OF PROBABILITY AND QUANTUM THEORY

L. E. BALLENTINE

Department of Physics, Simon Fraser University, Burnaby,

BC V5A 1S6, Canada

e-mail: [email protected]

There is a peculiar similarity between Probability Theory and Quantum Mechanics: both subjects are mature and successful, yet both remain subject to controversy about their foundations and interpretation. I first present a classification of the various interpretations of probability, arguing that they should not be thought of as rivals, but rather as applications of a general theory to different kinds of subject matter. An axiom system that makes conditional probability the fundamental concept is put forward as being superior to Kolmogorov's axioms. I then discuss the relevance to quantum theory of the various interpretations of probability, the applicability of classical probability theory within quantum mechanics, and the relations between the interpretation of probability and the interpretation of quantum mechanics.

1 Introduction

There are many connections between Probability Theory and Quantum Mechanics, the most notable being that Quantum Mechanics uses Probability Theory in its fundamental interpretation, not merely as a technique. But I wish to concentrate on a more peculiar similarity. Although both subjects are mature and successful, both remain subject to controversy about their foundations and interpretation. There may be even more interpretations of probability than there are of quantum theory. Can one bring some degree of order to this subject?

Probability Theory, being a branch of mathematics, is defined by a set of axioms. So it can legitimately be applied to any entity that satisfies those axioms. Most of the interpretations of probability can be viewed as applications of the formal theory to different subject matters. It is therefore misguided to argue over which is the correct interpretation. Most of them are correct within their appropriate domain of application. But it is still reasonable to ask whether there is a general, overarching form of Probability Theory, of which all the various interpretations can be seen as special cases, applied to special subject matters.

I shall propose such a classification of the various interpretations of probability. To do so, it is necessary to overlook small differences and to lump closely related interpretations into a few broad categories. I expect this classification to be controversial, but I believe that it is a step in the right direction. I shall consider only theories that are based on the same, or equivalent, sets of axioms. Hence generalizations such as negative probabilities are not included in this scheme, although I shall briefly refer to them later. After describing the major categories of interpretation of probability, I will discuss the relevance of each to quantum mechanics.

2 Interpretations of Probability

Many different interpretations of probability are examined in detail by T. L. Fine.1 I propose to overlook many of the fine differences, and hence classify them into a few major groups, shown in Figure 1. References to most of the authors named in Fig. 1, and critical analyses of their ideas, are given by Fine.1

2.1 The Theory of Inductive Inference

I propose that the Theory of Inductive Inference be taken as the master theory, and that all other interpretations be regarded as special cases, applicable in more restricted contexts. This point of view was expressed most completely by E. T. Jaynes in his book Probability Theory: The Logic of Science,2 which unfortunately was not completed during his lifetime.

Within this interpretation, probability is assigned to propositions. The notation $P(A|C)$ is to be read as the probability of A under the condition C. Probability is regarded as a logical relation among propositions that is weaker than entailment. Inductive logic reduces to deductive logic in the limit of probability values 0 and 1. Probability is an objective relation, and should not be confused with degrees of belief.

The propositions to which probability is assigned may have any particular content. If we specialize to propositions about repeated experiments we obtain the Ensemble-Frequency theory. If we specialize to propositions about personal belief we obtain Subjective probability. If we specialize to propositions about indeterministic or unpredictable events we obtain the Propensity theory.

Although $P(A|C)$ is a logical relation between proposition A and the conditioning information C, it is not merely a formal, syntactic relation. The content (meaning) of A and C must be invoked to evaluate $P(A|C)$. There is no magic formula to translate arbitrary information into probabilities. Jaynes has given solutions to this problem in some important special cases (symmetry groups, marginalization), but there is, as yet, no general solution.


The Logic of Inductive Inference

(E. T. Jaynes, R. T. Cox, H. Jeffreys)

$P(A|C)$ is the probability that proposition A is true, given the information C.

Ensemble and Frequency

(Kolmogorov, Bernoulli, von Mises)

Measure on a set; Limit frequency in an ordered sequence.

Propensity

(K. R. Popper)

$P(A|C)$ is the propensity for event A to occur under the condition C.

Subjective and Personal

(de Finetti, L. J. Savage, I. J. Good)

Incomplete knowledge; Degrees of reasonable belief.

Figure 1: Classification of the interpretations of Probability.

2.2 Ensemble and Frequency Theories

One of the most common interpretations of probability is as a limit frequency in an ordered sequence. The ratio of the number n of occurrences of a particular type in a sequence of N events, n/N, is identified with the probability. This interpretation is useful in analyzing repeated experiments, but it has the difficulty that in a random sequence the ratio n/N need not have a limit. The ensemble interpretation is a generalization of the frequency interpretation, in which probability is identified with a measure on a set that need not be ordered. It is closely associated with Kolmogorov's axiom system, which will be discussed later.

2.3 Subjective Probability

Subjectivism has its place, and subjective probability provides an excellent way to describe degrees of reasonable belief. But in science, subjectivism can be like a virus, and we must guard against its infection. In general, the probability $P(A|C)$ expresses an objective relation between A and C, determined by the totality of the information C, and not by anyone's personal opinions. Jaynes tried to ensure objectivity through the pedagogical device of introducing a robot that is programmed to reason consistently using only the information that is given to it. But even Jaynes sometimes slipped from objective to personal probabilities in his examples, without apparently being aware of doing so. Indeed, the contamination of Inductive Logic Probability by subjectivism may have been a major barrier to its acceptance.

2.4 Propensity

Propensity is a form of causality that is weaker than determinism.3,4 Generally speaking, probability expresses logical relations, rather than causal relations. (Recall the old saying: Correlation does not imply causality.) However, causality is a special kind of logical relation, and propensity theory deals with just that special case. The propensity interpretation of probability is natural in situations, such as those described by quantum mechanics, in which events cannot be predicted with certainty from their antecedents.

3 The Axioms of Probability

The axioms of probability theory can be given in several different forms; however, those given by R. T. Cox5,6 are particularly convenient.

Axiom 1. $0 \le P(A|B) \le 1$

Axiom 2. $P(A|A) = 1$

Axiom 3. $P(\neg A|B) = 1 - P(A|B)$

Axiom 4. $P(A \& B|C) = P(A|C)\,P(B|A \& C)$

Here the notation is as follows: $\neg A$ means "not A"; $A \& B$ means "A and B"; $A \lor B$ means "either A or B".


Axiom 2 states that the probability of a certainty (A, given A) is one. Axiom 1 states that no probabilities are greater than the probability of a certainty. Axiom 3 expresses the notion that the probability of non-occurrence of an event increases as the probability of its occurrence decreases. It also implies $P(\neg A|A) = 0$; an impossibility (not A, given A) has zero probability. Axiom 4 is the least intuitive. The probability of both A and B (under some condition C) is equal to the probability of A multiplied by the probability of B given A.

The probabilities of negation ($\neg A$) and conjunction ($A \& B$) each require an axiom. However, no further axioms are required to treat disjunction because $A \lor B = \neg(\neg A \& \neg B)$; in words, "A or B" is equivalent to the negation of "neither A nor B". This allows us to deduce a theorem:

$$P(A \lor B|C) = P(A|C) + P(B|C) - P(A \& B|C). \tag{1}$$
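For readers who want the intermediate steps, the following is a sketch of how Eq. (1) can be obtained from Axioms 3 and 4 together with the identity for disjunction quoted above. It assumes, beyond what is stated in the text, that conjunction is commutative and that none of the conditioning propositions is impossible.

```latex
\begin{align*}
P(A \lor B|C) &= 1 - P(\neg A \& \neg B|C) && \text{(Axiom 3)}\\
 &= 1 - P(\neg A|C)\,P(\neg B|\neg A \& C) && \text{(Axiom 4)}\\
 &= 1 - P(\neg A|C)\,[1 - P(B|\neg A \& C)] && \text{(Axiom 3)}\\
 &= P(A|C) + P(\neg A \& B|C) && \text{(Axioms 3 and 4)}\\
 &= P(A|C) + P(B|C)\,P(\neg A|B \& C) && \text{(Axiom 4, commutativity)}\\
 &= P(A|C) + P(B|C)\,[1 - P(A|B \& C)] && \text{(Axiom 3)}\\
 &= P(A|C) + P(B|C) - P(A \& B|C) && \text{(Axiom 4)}
\end{align*}
```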

If A and B are mutually exclusive then we obtain

$$P(A \lor B|C) = P(A|C) + P(B|C), \tag{2}$$

which is often taken to be an axiom, and may be used in place of Axiom 3.

Several remarks about these axioms are in order. First, the notion of randomness plays no fundamental role in the theory. Hence we need not enquire whether our variables and events are random as a prerequisite to applying probability theory.

Second, these axioms are not arbitrary. They are uniquely determined (apart from formal changes that do not affect the content) by conditions of plausibility and consistency (see Cox5 and Jaynes2):

(i) The probability of A on some given evidence determines also the probability of "not A" on the same evidence.

(ii) The probability on given evidence that both A and B are true is determined by their separate probabilities, one on the given evidence, and the other on that evidence plus the assumption that the first is true.

(iii) If a complex proposition can be composed in more than one way [ex.: $(A \& B) \& C$, or $A \& (B \& C)$] then all ways of computing its probability must lead to the same answer.

Notice that in (i) and (ii) only the existence of certain connections is assumed, but not their mathematical form. The consistency condition (iii) then leads to the mathematical forms of the axioms. Therefore, anyone who proposes an inequivalent alternative to Cox's axioms (such as allowing negative probabilities) has an obligation to explain how and why he departs from these conditions of plausibility and consistency.


Finally, a very important remark: All probabilities are conditional.

The use of the single-variable notation $P(A)$, instead of $P(A|C)$, is permissible only if the conditional information C is obvious from the context, and is unchanging throughout the problem. Many fallacies and paradoxes follow from ignoring this principle.

3.1 Kolmogorov's axioms

If the fundamental axioms that define Probability Theory are those given above, then what is the status of Kolmogorov's well-known axioms? According to Kolmogorov's axioms, probability is assigned to subsets of a universal set $\Omega$, with the following rules:

(1) $P(\Omega) = 1$.

(2) $P(f) \ge 0$ for any $f$ in $\Omega$.

(3) If $f_1, \ldots, f_n$ are disjoint then $P(f) = \sum_j P(f_j)$, where $f$ is the union of $f_1, \ldots, f_n$.

(4) If $f_i \to \emptyset$ (the empty set) then $P(f_i) \to 0$.

The answer, I believe, is that Kolmogorov's axioms provide a mathematical model of probability theory (defined by Cox's axioms) on the theory of measurable sets. A mathematical model is useful because it reduces the consistency of one theory to that of another. (A familiar example is the algebra of complex numbers, which can be modeled by the algebra of ordered pairs of reals.) Thus any doubts about the consistency of Probability Theory may be laid to rest because of the existence of Kolmogorov's model.
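The sense in which a measure on sets "models" the conditional-probability axioms can be illustrated on a small finite case. The sketch below is only an illustration under stated assumptions: the four-point universal set, the normalized counting measure, and the function names are all hypothetical choices, and conditional probability is modelled by the usual ratio of measures. It checks Axioms 1 to 4 exhaustively for that one model.

```python
from fractions import Fraction
from itertools import combinations

# Hypothetical universal set: four equally weighted points (illustrative only).
OMEGA = frozenset(range(4))


def measure(event):
    """Normalized counting measure on subsets of OMEGA."""
    return Fraction(len(event), len(OMEGA))


def cond_prob(a, c):
    """P(A|C) modelled as measure(A and C) / measure(C); C must be non-empty."""
    if not c:
        raise ValueError("conditioning on the empty set is undefined")
    return measure(a & c) / measure(c)


def all_events():
    """Every subset of OMEGA, as frozensets."""
    points = sorted(OMEGA)
    return [frozenset(combo)
            for size in range(len(points) + 1)
            for combo in combinations(points, size)]


def check_cox_axioms():
    """Return True if Axioms 1-4 hold for every admissible choice of events."""
    events = all_events()
    for c in events:
        if not c:
            continue
        assert cond_prob(c, c) == 1                        # Axiom 2
        for a in events:
            p_a = cond_prob(a, c)
            assert 0 <= p_a <= 1                           # Axiom 1
            assert cond_prob(OMEGA - a, c) == 1 - p_a      # Axiom 3
            if not (a & c):
                continue  # P(B|A&C) is undefined when A&C is impossible
            for b in events:
                # Axiom 4: P(A&B|C) = P(A|C) P(B|A&C)
                assert cond_prob(a & b, c) == p_a * cond_prob(b, a & c)
    return True


print(check_cox_axioms())
```

Exact rational arithmetic is used so that the equalities in Axioms 2 to 4 can be tested without floating-point tolerances.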

There are several objections to taking Kolmogorov's axioms as a foundation for Probability Theory, rather than merely as a model:

• The universal set $\Omega$ is often fictitious. The propositions to which probabilities are assigned are not subsets of a set.

• Conditional probability is relegated to secondary status, while the mathematical fiction of "absolute probability" is made primary.

• Probability theory and Measure theory are distinct subjects. The interesting problems of one are not closely related to the interesting problems of the other. For example, measure theory deals mostly with infinite sets, culminating with the construction of non-measurable sets, which have no probabilistic interpretation. But in probability theory one seldom needs to consider an infinite number of conjunctions and disjunctions. On the other hand, the important problem of translating qualitative information into probabilities has no measure-theoretic analog.


4 Probability in Quantum Mechanics

4.1 Relevant and Irrelevant Interpretations of Probability

Which of the interpretations of probability are relevant to quantum mechanics? The ensemble-frequency interpretation is obviously relevant, and widely used, in discussing the statistics of repeated experiments on similarly prepared states. Indeed, the standard description of an idealized experiment is: (1) prepare a state; (2) measure an observable of the system; (3) repeat the previous two steps until sufficient statistical data has been accumulated; (4) compare the relative frequencies of this data with the probabilities predicted by quantum theory.

The propensity interpretation is in accord with the ensemble-frequency interpretation whenever it is applied to repeated experiments, but it also allows one to make meaningful statements about individual events. The propensity interpretation is more natural when one considers time-dependent states, and hence time-dependent probabilities. Consider the following examples.

(i) A source produces s = 1/2 particles polarized at an angle $\phi$ relative to some coordinate axis. A Stern-Gerlach magnet has its field gradient axis oriented at an angle $\theta$. What is the probability that such a particle, incident on the apparatus, will emerge with spin "up"?

The formal answer is, of course, $p = \{\cos[(\theta - \phi)/2]\}^2$, but what does this mean?

According to the propensity interpretation it means: The propensity (chance) of the particle emerging with spin "up" is p.

According to the ensemble-frequency interpretation it means: In a long run of similar experiments the fraction of particles emerging with spin "up" will be (approximately) p.

(ii) Now let the magnet be re-oriented in some arbitrary manner before each particle is released, so that $\theta$ is different in each case.

According to the propensity interpretation we say nearly the same thing: The propensity (chance) has a different value, $p = p_\theta$, in each case.

But in the ensemble-frequency interpretation one must conceptually embed each event in an imaginary long run of experiments having the same value of $\theta$, in order to make a frequency statement.


(iii) Suppose next that the polarization direction $\phi$ of the particles is unknown. Can it be inferred from the data of (ii)?

In the ensemble-frequency interpretation the answer would appear to be: No. A long run of events for each value of $\theta$ would be necessary to estimate $p_\theta$ as a frequency, and hence to determine its dependence on $\theta$.

In the propensity interpretation the answer is: Yes. Bayesian inference (equivalent to maximum likelihood if the prior probability distribution for $\phi$ is uniform) can determine the most probable value of $\phi$, even if there is only one event for each value of $\theta$.
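The inference described in example (iii) can be sketched numerically. The code below is a minimal illustration, not a prescribed procedure: the true angle, the number of events, the random seed, the grid resolution, and all function names are hypothetical choices. Each simulated event uses a different magnet angle, and the most probable polarization angle is found by a grid search over the likelihood, which coincides with the posterior maximum under the uniform prior mentioned in the text.

```python
import math
import random


def p_up(theta, phi):
    """Propensity for spin "up": cos^2((theta - phi)/2)."""
    return math.cos((theta - phi) / 2.0) ** 2


def simulate_events(phi_true, n_events, rng):
    """One event per randomly chosen magnet angle theta, as in example (ii)."""
    events = []
    for _ in range(n_events):
        theta = rng.uniform(0.0, 2.0 * math.pi)
        up = rng.random() < p_up(theta, phi_true)
        events.append((theta, up))
    return events


def most_probable_phi(events, grid_size=720):
    """Grid search for the maximum of the posterior under a uniform prior."""
    best_phi, best_loglik = 0.0, -math.inf
    for k in range(grid_size):
        phi = 2.0 * math.pi * k / grid_size
        loglik = 0.0
        for theta, up in events:
            # Clamp the propensity away from 0 and 1 to avoid log(0).
            p = min(max(p_up(theta, phi), 1e-12), 1.0 - 1e-12)
            loglik += math.log(p if up else 1.0 - p)
        if loglik > best_loglik:
            best_phi, best_loglik = phi, loglik
    return best_phi


def angular_distance(a, b):
    """Smallest separation between two angles, in radians."""
    d = abs(a - b) % (2.0 * math.pi)
    return min(d, 2.0 * math.pi - d)


rng = random.Random(0)          # fixed seed so the sketch is repeatable
phi_true = 1.0                  # hypothetical polarization angle (radians)
events = simulate_events(phi_true, 2000, rng)
phi_hat = most_probable_phi(events)
print(f"true phi = {phi_true:.3f}, inferred phi = {phi_hat:.3f}")
```

No two events share the same magnet angle, so no frequency can be formed for any single value of the angle; the estimate comes entirely from combining the single-event likelihoods.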

I have never seen a coherent exposition of QM based on a subjective interpretation of quantum probabilities as representing knowledge*. This point (which has also been argued at length by Popper8) is worth emphasizing because the interpretation of probabilities as knowledge seems to be a tenet of the Copenhagen interpretation.

Two persons (with limited knowledge of QM) might have different "reasonable" beliefs about the position of the electron in the hydrogen atom, and those beliefs could be represented by subjective probabilities. But such "ignorance" probabilities have nothing to do with $|\psi(x)|^2$ from the Schroedinger equation. $|\psi(x)|^2$ is an objective propensity, not a subjective degree of belief.

The so-called Uncertainty principle, $\Delta x\,\Delta p \ge \hbar/2$, has nothing to do with subjective knowledge or ignorance. Its meaning is that in any physical preparation of a state, the values of x and p will not be reproducible, the widths of their distributions being related by the inequality. The widths $\Delta x$ and $\Delta p$ are objective, predictable, and measurable parameters, which should not be called "uncertainties". Indeed, the name "Indeterminacy principle" is preferable to "Uncertainty principle".a

Subjective probabilities can occur in the information games that are played in quantum communication theory. Consider a typical example.

Bob prepares some quantum state, but keeps it secret. He tells Alice only that it is one of four (usually nonorthogonal) possible states, and she must try to infer what the hidden state is from a measurement. Alice's incomplete knowledge of that hidden state can be expressed as a subjective probability. Suppose also that Bob tells Carol that the unknown state is one of three possibilities. Carol's knowledge is different from Alice's, and hence her subjective probability will be different. But both of these subjective knowledge probabilities are quite distinct from the objective quantum probabilities (propensities) that would be calculated by solving Schroedinger's equation for Bob's state preparation apparatus.

a When I once heard Heisenberg speak (about 1964), he used the term Indeterminacy principle. In his early writings he used the words Ungenauigkeit (inexactness), Unbestimmtheit (indeterminacy), and Unsicherheit (uncertainty) with various shades of meaning.

I suspect that the subjective "knowledge" interpretation of QM probabilities came about by accident; the founders of QM may have believed (erroneously) that probability can only be a measure of knowledge/ignorance. Max Born has written that Heisenberg did not know what a matrix was when he was inventing what later became known as matrix mechanics. It is therefore not very radical to suppose that the founders of quantum mechanics had an inadequate understanding of probability.

4.2 Fallacies in the use of Probability

Unsound arguments to the effect that "classical" probability theory does not apply to QM are woefully common. Before examining an actual argument to that effect, let us first consider a simple classical paradox.

The Bookie's Paradox A bookie needs to fix the odds on a star track runner, who has a 60% chance of winning any race that he enters. There is a race in Paris and a race in Tokyo scheduled on the same day, so he cannot enter both, and we do not know which he will enter. What is the probability that he will win at least one of these races?

Let A = (winning in Paris), and let B = (winning in Tokyo). Clearly A and B are mutually exclusive events, so $P(A \lor B) = P(A) + P(B)$. The probability of his winning at least one race is 0.6 + 0.6 = 1.2. But this is absurd, since 1.2 > 1.

The paradox is resolved by taking account of a principle that was noted in Sec. 3:

All probabilities are conditional. The notation $P(A)$, instead of $P(A|C)$, is permissible only if the conditional information C is obvious from the context, and unchanging throughout the problem.

Let us, therefore, be more precise about the conditions involved. Let $E_P$ = (entering in Paris), and let $E_T$ = (entering in Tokyo). Then clearly we have

$$P(A|E_P) = 0.6; \quad P(B|E_P) = 0$$

$$P(A|E_T) = 0; \quad P(B|E_T) = 0.6$$


Additivity, $P(A \lor B|C) = P(A|C) + P(B|C)$, holds for the same condition C in all terms. But $P(A|E_P)$ and $P(B|E_T)$ are not additive by any valid rule, so the absurd conclusion, reached above, followed only from an erroneous application of probability theory.
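A short calculation shows what a consistent treatment gives. The sketch below assumes something the text does not specify: that the bookie's information C includes a definite probability that the runner enters the Paris race (the parameter `p_enter_paris`, a hypothetical name). With that single condition held fixed in every term, additivity applies and the answer never exceeds one.

```python
from fractions import Fraction

P_WIN_IF_ENTERED = Fraction(6, 10)   # the runner's chance in any race he enters


def p_win_at_least_one(p_enter_paris):
    """P(A or B | C), where C fixes the (assumed) chance of entering Paris."""
    p_enter_tokyo = 1 - p_enter_paris
    # A and B are mutually exclusive, so additivity holds under the one condition C.
    p_a = p_enter_paris * P_WIN_IF_ENTERED   # P(A|C) = P(E_P|C) P(A|E_P)
    p_b = p_enter_tokyo * P_WIN_IF_ENTERED   # P(B|C) = P(E_T|C) P(B|E_T)
    return p_a + p_b


for q in (Fraction(0), Fraction(1, 4), Fraction(1, 2), Fraction(1)):
    print(q, p_win_at_least_one(q))
```

Under this assumption the result is 0.6 whatever the chance of entering Paris, because the runner enters exactly one race.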

Double-slit Fallacy A common fallacy about the two-slit experiment is of exactly the same form. The experiment consists of three parts:

(a) Open slit #1, close slit #2. The probability of a particle arriving at the point X on the screen is $P_1(X)$.

(b) Open slit #2, close slit #1. The probability of a particle arriving at X is now $P_2(X)$.

(c) Open both slits #1 and #2. The probability of a particle arriving at X is $P_{12}(X)$.

Now passage through slit #1 and through slit #2 are mutually exclusive, so we deduce

$$P_{12}(X) = P_1(X) + P_2(X),$$

which is empirically false. It is then concluded (fallaciously) that "classical" probability theory does not apply in quantum mechanics.

The above reasoning embodies essentially the same fallacy as does the Bookie's paradox, and it is resolved similarly by paying proper attention to the conditional nature of the probabilities.

Let condition $C_1$ = (slit #1 open, slit #2 closed). Let $C_2$ = (slit #2 open, slit #1 closed). Let $C_3$ = (both slits open).

We observe empirically that

$$P(X|C_1) + P(X|C_2) \ne P(X|C_3)$$

(due, of course, to interference). But this fact is fully compatible with classical probability theory, since no axiom requires probabilities conditioned on three different conditions to be additive.
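The inequality can be made concrete with a toy calculation. The sketch below is an illustration under simplifying assumptions that are not in the text: each open slit contributes a unit-modulus amplitude whose phase depends only on the path length, the geometry and wavelength are arbitrary hypothetical values, and the quantities returned are relative detection weights rather than normalized probabilities.

```python
import cmath
import math


def amplitude(path_length, wavelength):
    """Unit-modulus amplitude exp(2*pi*i*L/wavelength) for one open slit (toy model)."""
    return cmath.exp(2j * math.pi * path_length / wavelength)


def detection_weights(x, slit_separation=1.0, screen_distance=10.0, wavelength=0.5):
    """Relative (unnormalized) detection weights at screen position x under C1, C2, C3."""
    l1 = math.hypot(screen_distance, x - slit_separation / 2.0)
    l2 = math.hypot(screen_distance, x + slit_separation / 2.0)
    a1, a2 = amplitude(l1, wavelength), amplitude(l2, wavelength)
    w1 = abs(a1) ** 2            # condition C1: only slit 1 open
    w2 = abs(a2) ** 2            # condition C2: only slit 2 open
    w3 = abs(a1 + a2) ** 2       # condition C3: both slits open
    return w1, w2, w3


for x in (0.0, 1.0, 2.0, 3.0):
    w1, w2, w3 = detection_weights(x)
    print(f"x = {x:.1f}: w1 + w2 = {w1 + w2:.3f}, w3 = {w3:.3f}")
```

Each weight is computed under its own condition; nothing in the calculation adds probabilities that refer to different conditions.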

4.3 Quantum Probabilities

Quantum probabilities are not essentially different from classical probabilities, but like quantum theory itself, they do require some care in their interpretation. H. Jeffreys7 remarked that the probability statements of quantum mechanics are incomplete because, "a probability is always relative to a set of data, and the data are not specified." In our terminology, Jeffreys is saying that all probabilities are conditional, and the conditions need to be specified to make the probability statement meaningful. This can be accomplished through a propensity interpretation of quantum probabilities, with proper attention being given to the basic concepts of measurement and state preparation. When that is done, it can be demonstrated9,10 that quantum probabilities obey all of the axioms of "classical" probability theory. The demonstration is straightforward, but too lengthy to review here, so I shall only remark on some conceptual points.

(a) The standard formula: $P(A = a_n|\Psi) = |\langle a_n|\Psi\rangle|^2$, where $A|a_n\rangle = a_n|a_n\rangle$, should be read as:

The probability (propensity) for a measurement of the dynamical variable A to yield the value $a_n$, conditional on the preparation of the state $\Psi$, is $|\langle a_n|\Psi\rangle|^2$.

Note that the propensity is conditioned by the physical process of state preparation, and not by anyone's beliefs or opinions.

(b) One can also calculate the probability of a measurement result, conditioned by state preparation and the results of other measurements:

$$P(B = b_m|(A = a_n) \& \Psi).$$

However, it is necessary that the measurement processes be described dynamically as an interaction between the object and the apparatus. Simplistic application of the Projection Postulate is liable to give an incorrect answer.11

(c) No difficulties of principle arise if the probabilities are conditioned on actual events of state preparation and measurement. But assigning probabilities to hypothetical unmeasured values is not always possible. This problem is encountered if we try to introduce joint probability distributions for (unmeasured values of) non-commuting observables, and require the marginal distributions to agree with the quantum probabilities of the individual observables.

In the case of position and momentum, we would like to have a joint distribution P(x,p) that satisfies:

$$P(x,p) \ge 0, \tag{3}$$

$$\int P(x,p)\,dp = |\langle x|\Psi\rangle|^2, \tag{4}$$

$$\int P(x,p)\,dx = |\langle p|\Psi\rangle|^2. \tag{5}$$

There are infinitely many solutions to this problem,12 but there is no apparent physical reason for any one of them to be preferred.
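One simple solution, offered here only as an illustration that conditions (3) to (5) can be met, is the product of the two quantum marginals. The symbol $P_0$ is introduced for this example, and the state is assumed to be normalized.

```latex
\begin{align*}
P_0(x,p) &= |\langle x|\Psi\rangle|^2\,|\langle p|\Psi\rangle|^2 \;\ge\; 0,\\
\int P_0(x,p)\,dp &= |\langle x|\Psi\rangle|^2 \int |\langle p|\Psi\rangle|^2\,dp = |\langle x|\Psi\rangle|^2,\\
\int P_0(x,p)\,dx &= |\langle p|\Psi\rangle|^2 \int |\langle x|\Psi\rangle|^2\,dx = |\langle p|\Psi\rangle|^2.
\end{align*}
```

This product treats position and momentum as statistically independent, which is a mathematical possibility rather than a physically motivated choice.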

However, in the case of angular momentum, where we might seek a joint distribution $P(J_x, J_y, J_z)$ for the three angular momentum components, it is not difficult to show that no such function can yield the quantum probabilities of the three components as marginals. However, this has more to do with Kochen-Specker13 difficulties (the impossibility of assigning values to all quantum observables, consistent with all the relevant constraints) than with probability theory. There is no case in which a quantum probability is well defined but violates an axiom of classical probability theory.

5 Conclusions

In this paper I have suggested a scheme whereby all the major interpretations of probability are unified, with the separate interpretations now seen as applications of the general theory to particular subject matters. That such different ideas as ensemble-frequency theories, propensity theory, and subjective degrees of reasonable belief can all be encompassed within a single framework is both useful and surprising. Because they can all be described by the same mathematical axioms, it is easy to switch from one kind of probability to another, as may be appropriate in a particular problem. But on the other hand, one can ask why such different things as frequencies, propensities, and degrees of belief should necessarily obey the same axiom system. This question should stimulate further foundational research.

For the case of degrees of reasonable belief this work has already been completed by Cox,5,6 who showed that certain conditions of plausibility and consistency determine the axioms essentially uniquely. "Essentially unique" means subject only to formal transformations that do not alter the content of the theory. Therefore, any alternative inequivalent system of plausible reasoning could be shown to suffer from some degree of inconsistency.

Khrennikov14 has studied limit frequencies outside of any theory of probability, imposing only a condition of stabilization: that in a long sequence the frequencies should approach a limit. He has found many different cases to be possible, some of which lie outside of probability theory. It will be interesting to see whether these new logical possibilities are realized in nature. If not, then his stabilization condition will have to be supplemented by other conditions.

The greatest need for more foundational research is in the case of propensity. Although it clearly can be described by the axioms of probability theory, it is not yet clear why it must be so described.

Although I have dealt only with versions of probability theory that are derivable from the same axioms, I expect that the classification of interpretations (Fig. 1) may also be useful for generalized theories, such as those that admit negative probabilities.15 For such generalizations, we should ask which of the interpretations they support. Can such generalized probabilities be interpreted as frequencies? As propensities? As degrees of belief? Or must they be given some entirely new interpretation?

There are connections between the interpretations of probability and of quantum mechanics. This must be so because quantum mechanics does not predict events, but only the probabilities of events. If one adheres exclusively to a frequency interpretation of probability, then one is bound to assert that a quantum state describes only an ensemble of similarly prepared systems. If, on the other hand, one adopts a propensity interpretation of probability, then it becomes possible to make meaningful probability statements about an individual system. However, the empirically testable content of those statements can be realized only by measurements on an ensemble of similarly prepared systems. Thus the frequency interpretation is not made obsolete by the propensity interpretation, but merely broadened. The subjective interpretation of probability can be used in some situations, such as when the observer is not fully informed about the state preparation procedure. But it is never correct to interpret $|\psi|^2$ as representing knowledge (except, perhaps, in the trivial case in which the observer's knowledge is complete and in perfect accord with reality).

References

1. T.L. Fine, Theories of Probability, an Examination of Foundations (Academic Press, New York, 1973).

2. E.T. Jaynes, Probability Theory: The Logic of Science (Cambridge University Press, forthcoming); an incomplete version of this work is available electronically at http://bayes.wustl.edu/

3. K.R. Popper, in Observation and Interpretation, ed. S. Korner (Butterworths, London, 1957).

4. K.R. Popper, Realism and the Aim of Science (Hutchinson, London, 1983).

5. R.T. Cox, The Algebra of Probable Inference (Johns Hopkins University Press, Baltimore MD, 1961).

6. R.T. Cox, Am. J. Phys. 14, 1 (1946).

7. H. Jeffreys, Scientific Inference (Cambridge University Press, Cambridge, 1973), sec. 10.31.

8. K.R. Popper, Quantum Theory and the Schism in Physics (Hutchinson, London, 1982).

9. L.E. Ballentine, Quantum Mechanics - A Modern Development (World Scientific, Singapore, 1998), Ch. 1.5, 2.4, 9.6.

10. L.E. Ballentine, Am. J. Phys. 54, 883 (1986).

11. L.E. Ballentine, Found. Phys. 20, 1329 (1990).

12. L. Cohen, in Frontiers of Nonequilibrium Statistical Physics, ed. G.T. Moore and M.O. Scully (Plenum, New York, 1986), pp. 97-117.

13. S. Kochen and E.P. Specker, J. Math. Mech. 17, 59 (1967).

14. A. Khrennikov, Nonconventional approach to 'elements of physical reality' based on nonreal asymptotics of relative frequencies. Proc. Conf. Foundations of Probability and Physics, Vaxjo-2000 (WSP, Singapore, 2001).

15. A. Khrennikov, Interpretations of Probability (VSP, Utrecht, 1999).

FORCING DISCRETIZATION AND DETERMINATION IN QUANTUM HISTORY THEORIES

BOB COECKE Imperial College of Science, Technology & Medicine, Theoretical Physics Group,

The Blackett Laboratory, South Kensington, London SW7 2BZ; and

Free University of Brussels, Department of Mathematics, Pleinlaan 2, B-1050 Brussels;


We present a formally deterministic representation for quantum history theories where we obtain the probabilistic structure via a discrete contextual variable: no continuous probabilities are as such involved at the primal level.

1 Introduction

In this paper we propose and study a model for history theories in which the probability structure emerges from a finite number of contextual happenings, any next happening having a fixed chance to occur under the condition that the previous one happened. Although this model cannot have a canonical mathematical status, since it has been proved that this type of representation in general admits no essentially unique "smallest one",8,11 it provides insight into the emergence of logicality in the "History Projection Operator" setting,14 and it illustrates how deterministic behavior can be encoded beyond those interpretations of quantum history theories that are interpretationally restricted by so-called consistency or quasi-consistency (e.g., approximate decoherence). The particular motivation for this "paradigm case study" finds its origin in structural considerations towards a theory of quantum gravity.4,15,19 As argued in 16, although the relative frequency interpretation of probability justifies the continuous interval as the codomain for value assignment, "... in the quantum gravity regime standard ideas of space and time might break down in such a way that the idea of spatial or temporal 'ensembles' is inappropriate. For the other main interpretations of probability (subjective, logical, or propensity) there seems to be no compelling a priori reason why probabilities should be real numbers." Our model should be envisioned as a deconstructive step unraveling the probabilistic continuum as it appears in standard quantum theory, reducing it explicitly to a discrete temporal sequence of (contextual) events. The temporal sequence that emerges in this way is then easier to manipulate towards alternative encodings of contextual events, e.g., in propositional terms. It also enables a separate treatment of the internal (the system's) and external (the context's) time-encoding variables.

Although quantum history theories are currently most frequently envisioned in a context of so-called decoherence, we prefer to take the minimal perspective that a history theory is a theory that deals with sequential quantum measurements but remains essentially a dichotomic propositional theory. This is formally encoded in a rigid way in the History Projection Operator approach.14 We also mention recently studied sequential structures in the context of quantum logic, references for which can be found in 10, resulting in a dynamic disjunctive quantum logic, which provides an appropriate formal context to discuss the logicality of history theories.

A general theory on deterministic contextual models can be found in 8. Note here that what we consider as contextuality is that in a measurement there is an interaction between the system and its context, and that precisely this interaction may to some extent influence the outcome of a measurement. A lack of knowledge on the precise interaction then yields quantum-type uncertainties *. Besides this interpretational issue, classical representations are important since we think classically, so even without giving any conceptual significance to the representation, it provides a mode to think deterministically in terms of determined trajectories of the system's state, without having to reconcile with concrete non-canonical constructs like pilot-wave mechanics.

2 Outcome determination via contextual models

We will present the required results in full abstraction, such that the reader clearly sees which structural ingredient of quantum theory determines the existence of contextual models. For details and proofs we refer to 8. Let $\mathcal{B}(\mathbb{R}^n)$ denote the Borel subsets of $\mathbb{R}^n$.

Definition 1. A 'probabilistic measurement system' is given by: (i) A set of states $\Sigma$ and a set of measurements $\mathcal{E}$; (ii) For each $e \in \mathcal{E}$ an outcome set $O_e \in \mathcal{B}(\mathbb{R}^n)$, a $\sigma$-field $\mathcal{B}(O_e)$ of $O_e$-subsets and (Kolmogorovian) probability measures $P_{p,e} : \mathcal{B}(O_e) \to [0,1]$ for each $p \in \Sigma$.

The canonical example is that of quantum theory, with every Hilbert space ray $\psi$ representing a state, every self-adjoint operator $H$ representing a measurement with its spectrum $O_H \subseteq \mathbb{R}$ as outcome set, where the $\sigma$-structure $\mathcal{B}(O_H)$ is inherited from that of $\mathcal{B}(\mathbb{R})$, and with probability measures $P_{\psi,H}(E) := \langle \psi | P_E \psi \rangle$, where $P_E$ denotes the spectral projector for $E \in \mathcal{B}(O_H)$. For the sake of insight and also for notational convenience we will from now on assume that the measurements $e \in \mathcal{E}$ are represented in a one-to-one way by their outcome sets $O_e$. Note that whenever $\mathcal{E}$ can be represented by points of $\mathbb{R}^{n'}$ it suffices to consider $\mathbb{R}^n \times \mathbb{R}^{n'} = \mathbb{R}^{n+n'}$ instead of $\mathbb{R}^n$ to fulfill this assumption, taking $O_e \times \{e\}$ as the corresponding outcome set. We stress however that the results listed below also hold in the absence of this assumption.8,11

Definition 2. A 'pre-probabilistic hidden measurement system' is given by: (i) A set of states $\Sigma$ and a set of measurements $\mathcal{E}$; (ii) Sets $\mathcal{O} \subseteq \mathcal{B}(\mathbb{R}^n)$ and $\Lambda$ that parameterize $\mathcal{E}$, i.e., $\mathcal{E} = \{e_{\lambda,O} \mid \lambda \in \Lambda, O \in \mathcal{O}\}$, and each $e \in \mathcal{E}$ goes equipped with a map $\varphi_{\lambda,O} : \Sigma \to O$.

We can represent $\{\varphi_{\lambda,O} \mid \lambda \in \Lambda\}$ as $\varphi_O : \Sigma \times \Lambda \to O : (p, \lambda) \mapsto \varphi_{\lambda,O}(p)$, giving $\Lambda$ a similar formal status as the set of states $\Sigma$, or as $\Delta\Lambda_O : \Sigma \times \mathcal{B}(O) \to \mathcal{P}(\Lambda) : (p, E) \mapsto \{\lambda \mid \varphi_O(p, \lambda) \in E\}$, where $\mathcal{P}(\Lambda)$ denotes the set of subsets of $\Lambda$. The core of this definition is that given a state $p \in \Sigma$ and a value $\lambda \in \Lambda$ we have a completely determined outcome $\varphi_O(p, \lambda)$. These pre-probabilistic hidden measurement systems encode as such fully deterministic settings.

Definition 3. Whenever for a given pre-probabilistic hidden measurement system $(\Sigma, \mathcal{E}(\mathcal{O}, \Lambda), \{\varphi_O\}_{O \in \mathcal{O}})$ there exists a $\sigma$-field $\mathcal{B}(\Lambda)$ of $\Lambda$-subsets that satisfies $\bigcup_{O \in \mathcal{O}} \{\Delta\Lambda_O(p, E) \mid (p, E) \in \Sigma \times \mathcal{B}(O)\} \subseteq \mathcal{B}(\Lambda)$, it defines a 'probabilistic hidden measurement system' if a probability measure $\mu : \mathcal{B}(\Lambda) \to [0,1]$ is also specified.

The condition on $\Delta\Lambda$ requires that all $\Delta\Lambda_O(p,E)$ are $\mathcal{B}(\Lambda)$-measurable, such that to all triples $(p,O,E)$ we can assign a value $P_{p,O}(E) := \mu(\Delta\Lambda_O(p,E)) \in [0,1]$. As such, any probabilistic hidden measurement system defines a measurement system. The question then arises whether every probabilistic measurement system (MS) can be encoded as a probabilistic hidden measurement system (HMS). The answer to this question is yes [8, 4.2, Theorem 1; 3]: There always exists a canonical HMS-representation for $\Lambda = [0,1]$, $\mathcal{B}(\Lambda) = \mathcal{B}([0,1])$ (i.e., the Borel sets in $[0,1]$) and $\mu_u([0,a]) := a$, i.e., uniformly distributed; the proof goes via a construction using the Loomis-Sikorski Theorem [17,20] and Marczewski's Lemma [13]. It makes as such sense to investigate how the different possible HMS-representations for different non-isomorphic pairs $(\mathcal{B}(\Lambda),\mu)$ are structured; below it will become clear what we mean here by non-isomorphic.

First we will discuss an example that illustrates the above; it traces back to [1] and details and illustrations can be found in [2,8]. Consider the states of a spin-$\frac{1}{2}$ entity encoded as a point on the Poincaré sphere $\Sigma_0\ (\cong \mathbb{C}^2/\mathbb{C}) \subset \mathbb{E}^3$. Then any pair of antipodally located points of $\Sigma_0$ encodes mutually orthogonal states, as such encodes mutually orthogonal one-dimensional projectors and thus a (dichotomic) measurement. Let $p \in \Sigma_0$, let $(\alpha, \neg\alpha)$ be a pair of mutually orthogonal points of $\Sigma_0$ and let $\Lambda$ be the diagonal connecting $\alpha$ and $\neg\alpha$. Let $x_p \in \Lambda$ be the orthogonal projection of $p$ on the diagonal $\Lambda$. Then, for $\lambda \in [x_p, \neg\alpha]$, i.e., $x_p \in [\alpha,\lambda]$, we set $\varphi(p,\lambda) = \alpha$, and for $\lambda \in [\alpha, x_p[$, i.e., $x_p \in\, ]\lambda, \neg\alpha]$, we set $\varphi(p,\lambda) = \neg\alpha$. One then verifies that for $\mu_0 : \mathcal{B}([\alpha,\neg\alpha]) \to [0,1] : [\alpha, (1-x)\alpha + x\neg\alpha] \mapsto x$, i.e., uniformly distributed,


we obtain exactly the probability structure for spin-$\frac{1}{2}$ in quantum theory (footnote a). An interpretational proposal of this model could be the following [1,2,3]: Rather than decomposing states as in so-called hidden variable theories, here we decompose the measurements in deterministic ones; the probability measure $\mu$ should then be envisioned as encoding the lack of knowledge on the interaction of the measured system with its environment, including the measurement device.
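The sphere model can be checked numerically. The following is a minimal sketch, assuming a unit sphere, a state $p$ at angle `theta` from $\alpha$, and the diagonal rescaled so that $\alpha$ sits at 0 and $\neg\alpha$ at 1; the function names are illustrative and not taken from the cited works. It draws $\lambda$ uniformly on the diagonal, applies the deterministic rule "outcome $\alpha$ iff $\lambda \in [x_p, \neg\alpha]$", and compares the resulting frequency with the spin-$\frac{1}{2}$ value $\cos^2(\theta/2)$.

```python
import math
import random


def projection_coordinate(theta):
    """Position of x_p on the diagonal, rescaled so alpha = 0 and not-alpha = 1.

    theta is the angle between the state p and the point alpha on the unit sphere.
    """
    return (1.0 - math.cos(theta)) / 2.0


def outcome(theta, lam):
    """Deterministic outcome for a hidden variable lam in [0, 1].

    'alpha' is returned when lam lies in [x_p, not-alpha].
    """
    return "alpha" if lam >= projection_coordinate(theta) else "not_alpha"


def estimated_probability(theta, samples=100_000, seed=1):
    """Fraction of uniformly drawn lam values that yield 'alpha'."""
    rng = random.Random(seed)
    hits = sum(outcome(theta, rng.random()) == "alpha" for _ in range(samples))
    return hits / samples


def quantum_probability(theta):
    """Spin-1/2 probability cos^2(theta / 2) for outcome alpha."""
    return math.cos(theta / 2.0) ** 2


if __name__ == "__main__":
    for theta in (0.0, math.pi / 3, math.pi / 2, 2.0, math.pi):
        print(f"{theta:.3f}  {estimated_probability(theta):.4f}  "
              f"{quantum_probability(theta):.4f}")
```

The agreement follows from the geometry: the segment $[x_p, \neg\alpha]$ has length $1 + \cos\theta$ on a diagonal of length 2, so the uniform measure assigns it $(1+\cos\theta)/2 = \cos^2(\theta/2)$.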

We now introduce a notion of "relative size" of HMS-representations, justifying the use of "smaller". Given a $\sigma$-algebra $\mathcal{B}$ (footnote b) and a probability measure $\mu : \mathcal{B} \to [0,1]$, denote by $\mathcal{B}/\mu$ the $\sigma$-algebra of equivalence classes $[E]$ with respect to the relation

$E \sim E'$ iff $\mu(E' \cap E^c) = \mu(E \cap (E')^c) = 0$,

i.e., iff $E$ and $E'$ coincide up to a symmetric difference of measure zero. The ordering of $\mathcal{B}/\mu$ is inherited from $\mathcal{B}$. For notational convenience denote the induced measure $\mathcal{B}/\mu \to [0,1] : [E] \mapsto \mu(E)$ again by $\mu$. Given two pairs $(\mathcal{B},\mu)$ and $(\mathcal{B}',\mu')$ consisting of separable $\sigma$-algebras and probability measures on them set:

• $(\mathcal{B},\mu) \leq (\mathcal{B}',\mu') \iff \exists f : \mathcal{B}/\mu \to \mathcal{B}'/\mu'$, an injective $\sigma$-morphism.

We call $(\mathcal{B},\mu)$ and $(\mathcal{B}',\mu')$ equivalent, denoted $(\mathcal{B},\mu) \sim (\mathcal{B}',\mu')$, whenever in the above $f$ is a $\sigma$-isomorphism. Given two MS $(\Sigma,\mathcal{E})$ and $(\Sigma',\mathcal{E}')$ we set:

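$$(\Sigma,\mathcal{E}) \sim_{MS} (\Sigma',\mathcal{E}') \iff \begin{cases} \exists s : \Sigma \to \Sigma',\ \exists t : \mathcal{E} \to \mathcal{E}', \text{ both bijections} \\ \forall e \in \mathcal{E},\ \exists f_e : \mathcal{B}(O_e) \to \mathcal{B}(O_{t(e)}), \text{ a } \sigma\text{-isomorphism} \\ \forall p \in \Sigma,\ \forall e \in \mathcal{E} : P_{s(p),t(e)} \circ f_e = P_{p,e} \end{cases}$$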

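Via this equivalence relation we can define a relation $\leq_{MS}$ between classes of measurement systems $\mathcal{M}$ and $\mathcal{M}'$ as $\mathcal{M} \leq_{MS} \mathcal{M}'$ if for all $(\Sigma,\mathcal{E}) \in \mathcal{M}$ there exists $(\Sigma',\mathcal{E}') \in \mathcal{M}'$ such that $(\Sigma,\mathcal{E}) \sim_{MS} (\Sigma',\mathcal{E}')$, i.e., if $\mathcal{M}$ is included in $\mathcal{M}'$ up to MS-equivalence. We can then prove the following: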

(i) $(\mathcal{B},\mu) \sim (\mathcal{B}',\mu')$ if and only if $(\mathcal{B},\mu) \leq (\mathcal{B}',\mu')$ and $(\mathcal{B}',\mu') \leq (\mathcal{B},\mu)$ [8, 3, Lemma 1]; thus, the equivalence classes with respect to $\sim$ constitute a partially ordered set (poset) for the ordering induced by $\leq$; we will denote the set of these equivalence classes by $\mathbb{M}$, a class in it will be denoted via a member of it as $[\mathcal{B},\mu]$.

Footnote a: As shown in [6,9] this deterministic model for spin-$\frac{1}{2}$ in $\mathbb{R}^3$ can be generalized to $\mathbb{R}^3$-models for arbitrary spin-$N/2$. The states are then represented in the so-called Majorana representation [18,5], i.e., as $N$ copies of $\Sigma_0$. Correct probabilistic behavior is then obtained by introducing entanglement between the $N$ different "spin-$\frac{1}{2}$ systems".

Footnote b: I.e., a "pointless" $\sigma$-field. In particular, it follows from the Loomis-Sikorski theorem [17,20] that all separable $\sigma$-algebras (i.e., which contain a countable dense subset) can be represented as a $\sigma$-field; it as such also follows that assuming that $\mathcal{B}(\Lambda)$ is a $\sigma$-field and not an abstracted $\sigma$-algebra imposes no formal restriction.

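(ii) When setting $\mathbb{M}_{HMS} := \{\mathcal{M}[\mathcal{B}(\Lambda),\mu] \mid [\mathcal{B}(\Lambda),\mu] \in \mathbb{M}\}$ where $\mathcal{M}[\mathcal{B}(\Lambda),\mu]$ stands for all HMS with $\mathcal{B}(\Lambda')$ and $\mu'$ such that $(\mathcal{B}(\Lambda'),\mu') \in [\mathcal{B}(\Lambda),\mu]$, we have that $(\mathcal{B}(\Lambda),\mu) \leq (\mathcal{B}'(\Lambda'),\mu')$ and $\mathcal{M}[\mathcal{B}(\Lambda),\mu] \leq_{MS} \mathcal{M}[\mathcal{B}'(\Lambda'),\mu']$ are equivalent [8, 3, Theorem 2]. This then results in: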

Theorem 1. $(\mathbb{M}, \leq)$ and $(\mathbb{M}_{HMS}, \leq_{MS})$ are isomorphic posets.

One of the crucial ingredients in (ii) above and also in the proof for general existence with $\Lambda = [0,1]$ is the following: when setting $\Delta M(\Sigma,\mathcal{E}) := \{(\mathcal{B}(O_e), P_{p,e}) \mid p \in \Sigma, e \in \mathcal{E}\}$, we obtain that $\Sigma, \mathcal{E}$ admits a HMS-representation with $\mathcal{B}(\Lambda)$ and $\mu$ if and only if $\Delta M(\Sigma,\mathcal{E}) \leq (\mathcal{B}(\Lambda),\mu)$, where the order applies pointwise to the elements of $\Delta M(\Sigma,\mathcal{E})$ [8, 4.2, Theorem 1]. Using this and Theorem 1 above we can now translate properties of $\mathbb{M}$ to propositions on the existence of certain HMS-representations. We obtain the following:

(i) $(\mathbb{M}, \leq)$ is not a join-semilattice, thus: In general there exists no smallest HMS-representation. As such we will have to refine our study to particular settings where we are able to make statements whether there exists a smallest one, and if not, whether we can say at least something on the cardinality of $\Lambda$.

(ii) One can prove a number of criteria on $\Delta M(\Sigma,\mathcal{E})$ that force $(\mathcal{B}(\Lambda),\mu) \sim (\mathcal{B}([0,1]),\mu_u)$, as such assuring existence of a smallest representation. Among these the following. Let $\mathbb{M}_{finite} := \{(\mathcal{B}(X),\mu) \in \mathbb{M} \mid X \text{ is finite}\}$. If $\mathbb{M}_{finite} \subseteq \Delta M(\Sigma,\mathcal{E})$ then $\Lambda$ cannot be discrete. It then follows for example that quantum theory restricted to measurements with a finite number of outcomes still requires $\Lambda = [0,1]$.

(iii) Let $\mathbb{M}_N := \{(\mathcal{B}(X),\mu) \in \mathbb{M} \mid X \text{ has at most } N \text{ elements}\}$. If $\Delta M(\Sigma,\mathcal{E}) \subseteq \mathbb{M}_N$ then there exists a HMS-representation with $\Lambda = \mathbb{N}$. Thus, quantum theory restricted to those measurements with at most a fixed number $N$ of outcomes has a discrete HMS-representation.

(iv) If $\Delta M(\Sigma,\mathcal{E}) = \mathbb{M}_N$ then there exists no smallest HMS-representation. Neither does it exist when fixing the number of outcomes. So there is no essentially unique smallest HMS-representation for $N$-outcome quantum theory.

Although there exists no smallest and as such no canonical discrete HMS-representation we will give the construction of one solution for dichotomic (or propositional) quantum theory, i.e., $N = 2$, since this will constitute the core of the model presented in this paper. We will follow [8], to which we also refer for a construction for arbitrary $N$. Let us denote the quantum mechanical probability to obtain a positive outcome in a measurement of a proposition or question $\alpha$ on a system in state $p$ as $P_p(\alpha)$; the outcome set consists here of "we obtain a positive answer for the question $\alpha$", slightly abusively denoted


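as $\alpha$ itself, and "we obtain a negative answer for the question $\alpha$", denoted as $\neg\alpha$. Set inductively for $\lambda \in \mathbb{N}$ (footnote c):

$$\varphi_\alpha(p,\lambda) := \begin{cases} \alpha & \text{iff } P_p(\alpha) > \dfrac{1}{2^\lambda} + \sum_{i=1}^{\lambda-1} \dfrac{\delta(\varphi_\alpha(p,i),\alpha)}{2^i} \\ \neg\alpha & \text{otherwise} \end{cases}$$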

One verifies that for $\mu(\lambda) := \frac{1}{2^\lambda}$ we obtain the correct probabilities in the resulting HMS-model. This provides a discrete alternative for the above discussed $\mathbb{E}^3$-model for spin-$\frac{1}{2}$. The model, including the projection $x_p$, remains the same although we don't consider $[\alpha,\neg\alpha]$ as $\Lambda$ anymore. Let $\lambda \in \Lambda' := \mathbb{N}$. Set $x_n^\lambda := (1 - \frac{n}{2^\lambda})\alpha + (\frac{n}{2^\lambda})\neg\alpha$ for $n \in \mathbb{Z}_{2^\lambda-1}$. For $x_p \in [\alpha, x_1^\lambda[\ \cup\ [x_2^\lambda, x_3^\lambda[\ \cup\ [x_4^\lambda, x_5^\lambda[\ \cup \ldots \cup\ [x_{2^\lambda-2}^\lambda, x_{2^\lambda-1}^\lambda[$ we set $\varphi'_\alpha(p,\lambda) = \alpha$, and $\varphi'_\alpha(p,\lambda) = \neg\alpha$ otherwise. Then, for $\mu'_0 : \mathcal{B}(\mathbb{N}) \to [0,1] : \{\lambda\} \mapsto \frac{1}{2^\lambda}$ we obtain again quantum probability. Geometrically, this means the following: in the first model the values of $\lambda \in \Lambda$ represent points on the diagonal, i.e., a continuous interval, or, equivalently, decompositions of an interval in two intervals; we now consider decompositions of an interval in $2^\lambda$ equally long parts, of which there are only a discrete number of possibilities. We refer to [8] for details and illustrations.
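The inductive rule amounts to reading off binary digits of $P_p(\alpha)$. The sketch below is illustrative only and assumes the strict comparison $P_p(\alpha) > 2^{-\lambda} + \sum_{i<\lambda} \delta(\varphi_\alpha(p,i),\alpha)\,2^{-i}$, which matches the half-open intervals of the geometric version; the function names are hypothetical. It uses exact rationals so that boundary cases such as $P_p(\alpha) = \frac{1}{2}$ are not blurred by rounding.

```python
from fractions import Fraction


def outcomes(prob, depth):
    """Return [phi(p, 1), ..., phi(p, depth)] as booleans (True means alpha).

    prob plays the role of P_p(alpha); the hidden variable lam runs over 1..depth.
    """
    prob = Fraction(prob)
    result = []
    accumulated = Fraction(0)  # sum of 2^-i over earlier i whose outcome was alpha
    for lam in range(1, depth + 1):
        weight = Fraction(1, 2 ** lam)
        is_alpha = prob > weight + accumulated  # assumed strict comparison
        if is_alpha:
            accumulated += weight
        result.append(is_alpha)
    return result


def model_probability(prob, depth):
    """Probability of alpha under mu(lam) = 2^-lam, truncated at lam = depth."""
    digits = outcomes(prob, depth)
    return sum(Fraction(1, 2 ** lam)
               for lam, is_alpha in enumerate(digits, start=1) if is_alpha)


if __name__ == "__main__":
    print(outcomes(Fraction(1, 2), 4))
    print(float(model_probability(Fraction(1, 3), 30)))
```

Truncating at `depth` leaves an error of at most $2^{-\text{depth}}$, so the weights $\mu(\lambda) = 2^{-\lambda}$ reproduce $P_p(\alpha)$ in the limit. With the strict comparison, $P_p(\alpha) = \frac{1}{2}$ is encoded as $0.0111\ldots$ rather than $0.1000\ldots$, which is one instance of the non-uniqueness of binary decompositions.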

3 Unitary, ortho- and projective structure

In the above discussed $\mathbb{E}^3$ models, rotational symmetries were implicit in their spatial geometry. However, in general the decompositions of measurements over $\mu : \mathcal{B}(\Lambda) \to [0,1]$ go measurement by measurement, so additional structure, if there is any, has to be put in by hand. It is probably fair to say that these contextual models only become non-trivial and useful when encoding physical symmetries within the maps $\varphi_\alpha$ in an appropriate manner. For the sake of the argument we will distinguish between three types of symmetries that can be encoded, namely unitary, ortho- and projective ones.

i. Unitary symmetries: When considering quantum measurements with discrete non-degenerated spectrum we can represent the outcomes $\{o_i\}_i$ by the corresponding "eigenstates" $\{p_i\}_i$ via spectral decomposition, i.e., there exists an injective map $\mathcal{B}(O_e) \to \mathcal{P}(\Sigma)$ for each $e \in \mathcal{E}$. Then, specification of $\varphi_e = (U \circ \varphi \circ U^{-1}) : \Lambda \times \Sigma \to \{p_{e,i}\}_i$, where $U : \Sigma \to \Sigma$ is the unitary transformation that satisfies $U(p_i) = p_{e,i}$, and $\mu_e = \mu$. This is exactly the symmetry encoded in the above described $\mathbb{E}^3$-models. Note in particular that in this perspective the pairs $(\alpha, \neg\alpha)$ and $(\neg\alpha, \neg(\neg\alpha))$ should not be envisioned as merely a change of names of the outcomes, but truly as putting the measurement device (or at least its detecting part) upside down (footnote d). In this setting where we represent outcomes as states, the assignment of an outcome can now be envisioned as a true change of state $f_{e,\lambda} : \Sigma \to \Sigma\ (\supseteq O_e) : p \mapsto \varphi_e(p,\lambda)$, as such allowing us to describe the behavior of the system under concatenated measurements.

Footnote c: We agree on $\mathbb{N} := \{1, 2, \ldots\}$. Note here that already by non-uniqueness of binary decomposition, e.g., $\frac{1}{2} = \frac{1}{2^1} = \sum_{i \in \mathbb{N}} \frac{1}{2^{i+1}}$, it follows that the construction below is not canonical. Obviously, there are also less pathological differences between the different non-comparable discrete representations [8].

ii. Projective symmetries: For degenerated quantum measurements, the outcomes require representation by higher dimensional subspaces, so identification in terms of states now requires an injective map $\mathcal{B}(O_e) \to \mathcal{P}(\mathcal{P}(\Sigma))$. The behavior of states of the system under concatenated measurements then requires specification of a family of "projectors" $\{\pi_T : \Sigma \to T \mid T \in O_e\}$, e.g., the orthogonal projectors $\pi_A : \Sigma \to A : p \mapsto A \wedge (p \vee A^\perp)$ on the corresponding subspace $A$ in quantum theory. The above discussed non-degenerated case fits also in this picture by setting $O_e \subseteq \{\{p\} \mid p \in \Sigma\}$ where now each $\pi_{\{p\}} : \Sigma \to \{p\}$ is uniquely determined (having a singleton codomain).

iii. Orthosymmetries: The existence of an orthocomplementation on the lattice of closed subspaces of a Hilbert space provides a dichotomic representation for measurements which can be envisioned as a pair consisting of a (to be verified) proposition $\alpha$ and its negation $\neg\alpha$, in quantum theory yielding $\pi_{\neg A} : \Sigma \to A^\perp : p \mapsto A^\perp \wedge (p \vee A)$. In terms of linear operator calculus we have $\pi_{\neg A} = 1 - \pi_A$, both of them being orthogonal projectors.
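The operator identity $\pi_{\neg A} = 1 - \pi_A$ can be illustrated in a few lines. This is a sketch under simplifying assumptions: a real three-dimensional space stands in for the Hilbert space, $A$ is a single ray, and the helper names are made up for the illustration.

```python
def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def close(a, b, tol=1e-12):
    """Entrywise comparison of two matrices up to a tolerance."""
    return all(abs(x - y) <= tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))


def projector(v):
    """Orthogonal projector onto the ray spanned by the real vector v."""
    norm2 = sum(x * x for x in v)
    return [[x * y / norm2 for y in v] for x in v]


def complement(p):
    """The operator 1 - p, i.e. the projector on the orthocomplement."""
    n = len(p)
    return [[(1.0 if i == j else 0.0) - p[i][j] for j in range(n)] for i in range(n)]


def is_orthogonal_projector(p):
    """Idempotent and symmetric (self-adjoint in the real case)."""
    return close(matmul(p, p), p) and close(transpose(p), p)


if __name__ == "__main__":
    pi_A = projector([1.0, 2.0, 2.0])
    pi_not_A = complement(pi_A)
    print(is_orthogonal_projector(pi_A), is_orthogonal_projector(pi_not_A))
```

Both operators pass the projector test and their product vanishes, which is the operator form of $A \perp A^\perp$.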

4 Representing quantum history theory

Although quantum history theory involves sequential measurements, one of its goals is to remain an essentially dichotomic propositional theory. This is formally encoded in a rigid way in the "History Projection Operator"-approach [14]. The key idea here is that the form of logicality aimed at in [14] is represented faithfully in the Hilbert space tensor product (footnote e). Let $A := (\alpha_{t_i})_i$ be a (so-called homogeneous) quantum history proposition with temporal support $(t_1, t_2, \ldots, t_n)$. Then, rather than representing this as a sequence of subspaces $(A_i)_i$ or projectors $(\pi_i)_i$ we will either represent $A$ as a pure tensor $\otimes_i A_i$ in the lattice of closed subspaces of the tensor product of the corresponding Hilbert spaces or as the orthogonal projector $\otimes_i \pi_i$ on this subspace. The crucial property of this representation is then that $\neg A$ again encodes as a projector, namely $\mathrm{id} - \otimes_i \pi_i$ [14], clarifying the notations $\pi_A$ and $\pi_{\neg A}$. Moreover, if $\{A^i\}_i$ is a set of so-called disjoint history propositions, i.e., $\otimes_k A^i_k \perp \otimes_k A^j_k$ for $i \neq j$, then the history proposition that expresses the disjunction of $\{A^i\}_i$ sensu [14] is exactly encoded as the projector $\sum_i \otimes_k \pi^i_k$. We get as such a kind of logical setting that is still encoded in terms of projectors. Note that $\pi_{\neg A}$ is not of the form $\otimes_j \pi_j$ but of the form $\sum_i \otimes_k \pi^i_k$, breaking the structural symmetry between a proposition and its negation in ordinary quantum theory.

Footnote d: The attentive reader will note that it is at this point that we escape the so-called hidden variable no-go theorems. They arise when trying to impose contextual symmetries within the states of the system by requiring that values of observables are independent of the chosen context, e.g., the proof of the Kochen-Specker theorem. Our newly introduced variable $\lambda \in \Lambda$ follows contextual manipulations in an obvious manner.

Footnote e: At this point we mention that in the study of sequential phenomena in the axiomatic quantum theory perspective on quantum logic, sequentiality and compoundness both turn out to be specifications of a universal causal duality [10], as such providing a metaphysical perspective on the use of tensor products both for the description of compound physical systems and sequential processes.
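That the negation of a homogeneous history is again a projector, yet not a pure tensor of projectors, can be seen already for two qubits. The sketch below is an illustration only: it assumes a two-time history on real two-dimensional spaces, and all names are hypothetical. It uses the fact that the rank of a projector equals its trace and that the rank of $\pi_1 \otimes \pi_2$ is the product of the ranks of its factors.

```python
from itertools import product


def kron(a, b):
    """Kronecker (tensor) product of two matrices given as lists of rows."""
    return [[x * y for x in row_a for y in row_b] for row_a in a for row_b in b]


def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def subtract(a, b):
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def trace(a):
    return sum(a[i][i] for i in range(len(a)))


def close(a, b, tol=1e-12):
    return all(abs(x - y) <= tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))


P_UP = [[1.0, 0.0], [0.0, 0.0]]      # projector on the ray of (1, 0)
P_PLUS = [[0.5, 0.5], [0.5, 0.5]]    # projector on the ray of (1, 1)

pi_A = kron(P_UP, P_PLUS)                 # homogeneous history: a pure tensor
pi_not_A = subtract(identity(4), pi_A)    # its negation, id minus the tensor

rank_not_A = round(trace(pi_not_A))
# Ranks reachable by a tensor product of projectors on two 2-dimensional factors.
product_ranks = {r1 * r2 for r1, r2 in product(range(3), repeat=2)}

if __name__ == "__main__":
    print(rank_not_A, sorted(product_ranks))
```

Since the negation has rank 3 while product projectors on $\mathbb{R}^2 \otimes \mathbb{R}^2$ can only have rank 0, 1, 2 or 4, it cannot be written as a single $\pi_1 \otimes \pi_2$.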

We will now transcribe the observations in the two previous sections to this setting in order to provide a contextual deterministic model for quantum history theory with discretely originating probabilities. One could say that we will apply a split picture in terms of Schrödinger-Heisenberg, namely we assume that on the level of unitary evolution we apply the Heisenberg picture such that we can fix notation without reference to this evolution, but for changes of state due to measurement we will (obviously) express this in the state space. When encoding outcomes in terms of states we need to consider $n$ copies of $\Sigma$, encoding the trajectories due to the measurements. In view of the considerations made above it will be no surprise that we will consider these trajectories as of the form $\otimes_i p_i$ in the tensor product $\otimes_i \Sigma_i$. This will require the introduction of the following "pseudo-projector":

• $\pi_\otimes : \Sigma \to \otimes_i \Sigma_i : p \mapsto p_\otimes := p \otimes \pi_1(p) \otimes \ldots \otimes (\pi_{n-1} \circ \ldots \circ \pi_1)(p)$.

Setting $\Sigma_\otimes := \pi_\otimes[\Sigma] = \{p_\otimes \mid p \in \Sigma\}$ then $\pi_\otimes : \Sigma \to \Sigma_\otimes$ encodes a bijective representation of $\Sigma$. Noting that $P_p(A) := \langle p_\otimes | \pi_A\, p_\otimes \rangle$ is the probability given by quantum theory to obtain $A$, we then set inductively for fixed $\lambda \in \mathbb{N}$ that $\varphi_A(p,\lambda) = A$ if and only if:

• $P_p(A) > \dfrac{1}{2^\lambda} + \sum_{i=1}^{\lambda-1} \dfrac{\delta(\varphi_A(p,i),A)}{2^i}$

and $\varphi_A(p,\lambda) = \neg A$ otherwise. The outcome trajectories in case we obtain $A$ are then given in terms of initial states by $(\pi_A \circ \pi_\otimes) : \Sigma \to \otimes_i A_i$. The value $\lambda \in \mathbb{N}$ can be envisioned as follows. We assume it to be a number of contextual events, either real or virtual depending on one's taste, and we assume that, given that some events already happened, the chance of a next one happening is equal to the chance that it doesn't happen, so we actually consider a finite number of probabilistically balanced consecutive binary decisive processes where the result of the previous one determines whether we actually


will perform the next one. Unitary symmetries are induced in the obvious way as tensored unitary operators ®iUi. This model then produces the statistical behavior of quantum history theory.
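The probability of a homogeneous history can be computed factor by factor along the trajectory. The sketch below rests on two assumptions that go beyond the text: each $\pi_i(p)$ in the trajectory is read as the normalized projected state, and real vectors replace Hilbert space rays. Under these assumptions the tensor-product expression $\langle p_\otimes | (\otimes_i \pi_i)\, p_\otimes \rangle$ factorizes into one-step probabilities, and the code compares this product with the usual sequential value $\|\pi_n \cdots \pi_1 p\|^2$. All function names are illustrative.

```python
import math


def apply(matrix, vector):
    """Matrix times vector, with the matrix given as a list of rows."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def normalize(v):
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]


def projector(v):
    """Orthogonal projector onto the ray spanned by the real vector v."""
    norm2 = dot(v, v)
    return [[x * y / norm2 for y in v] for x in v]


def history_probability(p, projectors):
    """Product of one-step probabilities along the trajectory of p."""
    state = normalize(p)
    prob = 1.0
    for proj in projectors:
        image = apply(proj, state)
        weight = dot(state, image)  # equals the squared norm of proj(state)
        if weight == 0.0:
            return 0.0              # the trajectory stops: the history cannot occur
        prob *= weight
        state = normalize(image)
    return prob


def sequential_probability(p, projectors):
    """Squared norm of the chained projection pi_n ... pi_1 p."""
    vector = normalize(p)
    for proj in projectors:
        vector = apply(proj, vector)
    return dot(vector, vector)


if __name__ == "__main__":
    plane = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 0.0]]
    ray = projector([1.0, 1.0, 1.0])
    start = [1.0, 2.0, 3.0]
    print(history_probability(start, [plane, ray]),
          sequential_probability(start, [plane, ray]))
```

The value returned by `history_probability` is what would be fed, as $P_p(A)$, into the binary-digit rule for $\varphi_A(p,\lambda)$.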

The breaking of the structural symmetry between a proposition and its negation manifests itself in the most explicit way in the sense that when we have a determined outcome $\neg A$ we don't have a determined trajectory in our model. Obviously one could build a fully deterministic model that also determines this by concatenation of individual deterministic models (one for each element in the temporal support), but we feel that this would not be in accordance with the propositional flavor a history theory aims at. The negation $\neg A$ is indeed cognitive and not ontological with respect to the actual executed physical procedure or, in other words, the system's context, and one cannot expect an ontological model to encode this in terms of a formal duality. Explicitly, $\neg(A \otimes B)$ can be written both as $(\mathcal{H} \otimes \neg B) \oplus (\neg A \otimes B)$ and as $(\neg A \otimes \mathcal{H}) \oplus (A \otimes \neg B)$, which clearly define different procedures with respect to imposed change of state due to the measurement. Even more explicitly, setting $\mathcal{HPO}(\{\mathcal{H}_k\}_k) := \{\bigoplus_i \otimes_k A^i_k \mid A^i_k \in \mathcal{L}(\mathcal{H}_k),\ \otimes_k A^i_k \perp \otimes_k A^j_k \text{ for } i \neq j\}$ for $\mathcal{L}(\mathcal{H}_k)$ the lattice of closed subspaces of $\mathcal{H}_k$, the "ontologically faithful hull" of $\mathcal{HPO}(\{\mathcal{H}_k\}_k)$ consists then of all "ortho-ideals" $\mathcal{OI}(\mathcal{HPO}(\{\mathcal{H}_k\}_k)) :=$

• $\{\downarrow[\{\otimes_k A^i_k\}_i] \mid A^i_k \in \mathcal{L}(\mathcal{H}_k),\ \otimes_k A^i_k \perp \otimes_k A^j_k \text{ for } i \neq j\}$

where $\downarrow[-]$ assigns to a set of pure tensors all pure tensors in $\otimes_k \mathcal{H}_k$ that are smaller than at least one in the given set, this with respect to the ordering in $\mathcal{L}(\otimes_k \mathcal{H}_k)$; the downset $\downarrow[-]$ construction makes $\mathcal{OI}(\mathcal{HPO}(\{\mathcal{H}_k\}_k))$ inherit the $\mathcal{L}(\otimes_k \mathcal{H}_k)$-order as intersection. If a particular decomposition is specified as an element of $\mathcal{OI}(\mathcal{HPO}(\{\mathcal{H}_k\}_k))$, which means full specification of the physical procedure where summation over different sequences of pure tensors is now envisioned as choice of procedure, we can provide a deterministic contextual model, the choice of procedure itself becoming an additional variable. In conclusion, the HPO-setting "loses" part of the physical ontology that goes with an operational perspective on quantum theory (footnote f), and as such, if we want to provide a deterministic representation for general inhomogeneous history propositions sensu the one we obtained for the homogeneous ones, we formally need to restore this part of the physical ontology, e.g. as $\mathcal{OI}(\mathcal{HPO}(\{\mathcal{H}_k\}_k))$.
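The two decompositions of $\neg(A \otimes B)$ mentioned above can be compared concretely. The following sketch assumes real two-dimensional factor spaces and rank-one propositions $A$ and $B$ chosen only for illustration. It checks that both decompositions sum to the same projector $\mathrm{id} - \pi_A \otimes \pi_B$, that the summands within each are mutually orthogonal, and that the summands differ between the two, which is the sense in which they describe different procedures.

```python
def kron(a, b):
    """Kronecker (tensor) product of two matrices given as lists of rows."""
    return [[x * y for x in row_a for y in row_b] for row_a in a for row_b in b]


def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def subtract(a, b):
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def close(a, b, tol=1e-12):
    return all(abs(x - y) <= tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))


I2 = identity(2)
P_A = [[1.0, 0.0], [0.0, 0.0]]       # illustrative proposition A
P_B = [[0.5, 0.5], [0.5, 0.5]]       # illustrative proposition B
NOT_A = subtract(I2, P_A)
NOT_B = subtract(I2, P_B)

negation = subtract(identity(4), kron(P_A, P_B))

# (H tensor not-B) plus (not-A tensor B)
first = (kron(I2, NOT_B), kron(NOT_A, P_B))
# (not-A tensor H) plus (A tensor not-B)
second = (kron(NOT_A, I2), kron(P_A, NOT_B))

if __name__ == "__main__":
    print(close(add(*first), negation), close(add(*second), negation))
    print(close(first[0], second[0]))
```

Both sums reproduce the same operator, while the individual branches do not coincide; specifying which decomposition is meant is therefore extra information.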

5 Further discussion

In this paper we didn't provide an answer and we didn't even pose a question. We just provided a new way to think about things, slightly confronting the usual consistency or decoherence perspective for history theories. Even if one does not subscribe to the underlying deterministic nature of the model it still exhibits what a minimal representation of the indeterministic ingredients can be, as such representing it in a more tangible way. With respect to the nonexistence of a smallest representation, in view of other physical considerations it could be that one of the constructible discrete models presents itself as the truly canonical one, e.g., through equilibrium or other thermodynamical considerations, or metastatistical ones, emerging from additional modelization.

Footnote f: A choice that is motivated by the traditional consistent history setting and its interpretation as well as by a particular semantical perspective on quantum logic as a whole.

Acknowledgments

We thank Chris Isham for useful discussions on the content of this paper.

References

1. D. Aerts, J. Math. Phys. 27, 202 (1986).
2. D. Aerts, Int. J. Theor. Phys. 32, 2207 (1993).
3. D. Aerts, Found. Phys. 24, 1227 (1994).
4. G.K. Au, Interview with A. Ashtekar, C.J. Isham and E. Witten, The Quest for Quantum Gravity; arXiv: gr-qc/9506001 (1995).
5. H. Bacry, J. Math. Phys. 15, 1686 (1974).
6. B. Coecke, Helv. Phys. Acta 68, 396 (1995).
7. B. Coecke, Found. Phys. Lett. 8, 437 (1995).
8. B. Coecke, Helv. Phys. Acta 70, 442, 462 (1997); arXiv: quant-ph/0008061 & 0008062; Tatra Mt. Math. Publ. 10, 63.
9. B. Coecke, Found. Phys. 28, 1347 (1998).
10. B. Coecke et al., Found. Phys. Lett. 14 (2001); arXiv: quant-ph/0009100.
11. N. Gisin and C. Piron, Lett. Math. Phys. 5, 379 (1981).
12. S. Gudder, J. Math. Phys. 11, 431 (1970).
13. A. Horn and A. Tarski, Trans. AMS 64, 467 (1948).
14. C. J. Isham, J. Math. Phys. 23, 2157 (1994).
15. C. J. Isham, Structural Issues in Quantum Gravity, In: General Relativity and Gravitation: GR14, pp. 167 (World Scientific, Singapore, 1997).
16. C.J. Isham and J. Butterfield, Found. Phys. 30, 1707 (2000).
17. L. Loomis, Bull. AMS 53, 757 (1947).
18. E. Majorana, Nuovo Cimento 9, 43 (1932).
19. C. Rovelli, Strings, Loops and Others: A Critical Survey of the Present Approaches to Quantum Gravity, Plenary Lecture at GR15, Poona, India (1998); arXiv: gr-qc/9803024.
20. R. Sikorski, Fund. Math. 35, 247 (1948).


INTERPRETATIONS OF QUANTUM MECHANICS, AND INTERPRETATIONS OF VIOLATION OF BELL'S INEQUALITY

WILLEM M. DE MUYNCK
Theoretical Physics, Eindhoven University of Technology,
POB 513, 5600 MB Eindhoven, the Netherlands
E-mail: [email protected]

The discussion of the foundations of quantum mechanics is complicated by the fact that a number of different issues are closely entangled. Three of these issues are i) the interpretation of probability, ii) the choice between realist and empiricist interpretations of the mathematical formalism of quantum mechanics, iii) the distinction between measurement and preparation. It will be demonstrated that an interpretation of violation of Bell's inequality by quantum mechanics as evidence of non-locality of the quantum world is a consequence of a particular choice between these alternatives. Also a distinction must be drawn between two forms of realism, viz. a) realist interpretations of quantum mechanics, b) the possibility of hidden-variables (sub-quantum) theories.

1 Realist and empiricist interpretations of quantum mechanics

In realist interpretations of the mathematical formalism of quantum mechanics state vector and observable are thought to refer to the microscopic object in the usual way presented in most textbooks. Although, of course, preparing and measuring instruments are often present, these are not taken into account in the mathematical description (unless, as in the theory of measurement, the subject is the interaction between object and measuring instrument).

In an empiricist interpretation quantum mechanics is thought to describe relations between input and output of a measurement process. A state vector is just a label of a preparation procedure; an observable is a label of a measuring instrument. In an empiricist interpretation quantum mechanics is not thought to describe the microscopic object. This, of course, does not imply that this object would not exist; it only means that it is not described by quantum mechanics. Explanation of relations between input and output of a measurement process should be provided by another theory, e.g. a hidden-variables (sub-quantum) theory. This is analogous to the way the theory of rigid bodies describes the empirical behavior of a billiard ball, or to the description by thermodynamics of the thermodynamic properties of a volume of gas, explanations being relegated to theories describing the microscopic (atomic) properties of the systems.

Although a term like 'observable' (rather than 'physical quantity') is evidence of the empiricist origin of quantum mechanics (compare Heisenberg [1]), there has always existed a strong tendency toward a realist interpretation in which observables are considered as properties of the microscopic object, more or less analogous to classical ones. Likewise, many physicists are used to thinking about electrons as wave packets flying around in space, without bothering too much about the "Unanschaulichkeit" that for Schrödinger [2] was such a problematic feature of quantum theory. Without entering into a detailed discussion of the relative merits of either of these interpretations (e.g. de Muynck [3]) it is noted here that an empiricist interpretation is in agreement with the operational way theory and experiment are compared in the laboratory. Moreover, it is free of paradoxes, which have their origin in a realist interpretation. As will be seen in the next section, the difference between realist and empiricist interpretations is highly relevant when dealing with the EPR problem.

2 EPR experiments and Bell experiments

In figure 1 the experiment is depicted that was proposed by Einstein, Podolsky and Rosen [4] to study (in)completeness of quantum mechanics. A pair of particles (1 and 2) is prepared in an entangled state and allowed to separate. A measurement is performed on particle 1. It is essential to the EPR reasoning that particle 2 does not interact with any measuring instrument, thus allowing one to consider so-called 'elements of physical reality' of this particle, that can be considered as objective properties, being attributable to particle 2 independently of what happens to particle 1. By EPR this arrangement was presented as a way to perform a measurement on particle 2 without in any way disturbing this particle.

Figure 1: EPR experiment (measuring instrument for $Q_1$ or $P_1$).

The EPR experiment should be compared to correlation measurements of the type performed by Aspect et al. [5,6] to test Bell's inequality (cf. figure 2). In these latter experiments also particle 2 is interacting with a measuring instrument. In the literature these experiments are often referred to as EPR experiments, too, thus neglecting the fundamental difference between the two measurement arrangements of figures 1 and 2. This negligence has been responsible for quite a bit of confusion, and should preferably be avoided by referring to the latter experiments as Bell experiments rather than EPR ones. In EPR experiments particle 2 is not subject to a measurement, but to a (conditional) preparation (conditional on the measurement result obtained for particle 1). This is especially clear in an empiricist interpretation, because here measurement results cannot exist unless a measuring instrument is present, its pointer positions corresponding to the measurement results.

Figure 2: Bell experiment.

Unfortunately, the EPR experiment of figure 1 was presented by EPR as a measurement performed on particle 2, and accepted by Bohr as such. That this could happen is a consequence of the fact that both Einstein and Bohr entertained a realist interpretation of quantum mechanical observables (note that they differed with respect to the interpretation of the state vector), the only difference being that Einstein's realist interpretation was an objectivistic one (in which observables are considered as properties of the object, possessed independently of any measurement: the EPR 'elements of physical reality'), whereas Bohr's was a contextualistic realism (in which observables are only well-defined within the context of the measurement). Note that in Bell experiments the EPR reasoning would break down because, due to the interaction of particle 2 with its measuring instrument, there cannot exist 'elements of physical reality'.

Much confusion could have been avoided if Bohr had maintained his interactional view of measurement. However, by accepting the EPR experiment as a measurement of particle 2 he had to weaken his interpretation to a relational one (e.g. Popper [7], Jammer [8]), allowing the observable of particle 2 to be co-determined by the measurement context for particle 1. This introduced for the first time non-locality in the interpretation of quantum mechanics. But this could easily have been avoided if Bohr had required that for a measurement of particle 2 a measuring instrument should be actually interacting with this very particle, with the result that an observable of particle $n$ ($n = 1,2$) can be co-determined in a local way by the measurement context of that particle only. This, incidentally, would have made the EPR 'elements of physical reality' completely obsolete, and would have been quite a bit less confusing than the answer Bohr [9] actually gave (to the effect that the definition of the EPR 'element of physical reality' would be ambiguous because of the fact that it did not take into account the measurement arrangement for the other particle), thus promoting the non-locality idea.

Summarizing, the idea of EPR non-locality is a consequence of i) a neglect of the difference between EPR and Bell experiments (equating 'elements of physical reality' to measurement results), ii) a realist interpretation of quantum mechanics (considering measurement results as properties of the microscopic object, i.e. particle 2). In an empiricist interpretation there is no reason to assume any non-locality.

It is often asserted that non-locality is proven by the Aspect experiments, because these are violating Bell's inequality. The reason for such an assertion is that it is thought that non-locality is a necessary condition for a derivation of Bell's inequality. However, as will be demonstrated in the following, this cannot be correct since this inequality can be derived from quite different assumptions. Also, experiments like the Aspect ones, although violating Bell's inequality, do not exhibit any trace of non-locality, because their measurement results are completely consistent with the postulate of local commutativity, implying that relative frequencies of measurement results are independent of which measurements are performed in causally disconnected regions. Admittedly, this does not logically exclude a certain non-locality at the individual level, being unobservable at the statistical level of quantum mechanical probability distributions. However, from a physical point of view a peaceful coexistence between locality at the (physically relevant) statistical level and non-locality at the individual level is extremely implausible. Unobservability of the latter would require a kind of conspiracy not unlike the one making the 19th century world aether unobservable. For this reason the 'non-locality' explanation of the experimental violation of Bell's inequality does not seem to be very plausible, and it does seem wise to look for alternative explanations.

Since non-locality is never the only assumption in deriving Bell's inequality, such alternative explanations do exist. Thus, Einstein's assumption of the existence of 'elements of physical reality' is such an additional assumption. More generally, in Bell's derivation [10] the existence of hidden variables is one. Is it still possible to derive Bell's inequality if these assumptions are abolished? Moreover, even assuming the possibility of hidden-variables theories, are there in Bell's derivation no hidden assumptions, additional to the locality assumption?

Bell's inequality refers to a set of four quantum mechanical observables, $A_1$, $B_1$, $A_2$ and $B_2$, observables with different/identical indices being compatible/incompatible. In the Aspect experiments measurements of the four possible compatible pairs are performed (in these experiments $A_n$ and $B_n$ refer to polarization observables of photon $n$, $n = 1,2$, respectively). Bell's inequality can typically be derived for the stochastic quantities of a classical Kolmogorovian probability theory. Hence, violation of Bell's inequality is an indication that observables $A_1$, $B_1$, $A_2$ and $B_2$ are not stochastic quantities in the sense of Kolmogorov's probability theory. In particular, there cannot exist a quadrivariate joint probability distribution of these four observables. Such a non-existence is a consequence of the incompatibility of certain of the observables. Since incompatibility is a local affair, this is another reason to doubt the 'non-locality' explanation of the violation of Bell's inequality.

In the following derivations of Bell's inequality will be scrutinized to see whether the non-locality assumption is as crucial as was assumed by Bell. In doing so it is necessary to distinguish derivations in quantum mechanics from derivations in hidden-variables theories.

3 Bell's inequality in quantum mechanics

For dichotomic observables, having values ± 1 , Bell's inequality is given according to

$$|\langle A_1 A_2\rangle - \langle A_1 B_2\rangle| - \langle B_1 B_2\rangle - \langle B_1 A_2\rangle \leq 2. \qquad (1)$$

A more general inequality, being valid for arbitrary values of the observables, is the BCHS inequality

$$-1 \leq p(b_1, a_2) + p(b_1, b_2) + p(a_1, b_2) - p(a_1, a_2) - p(b_1) - p(b_2) \leq 0 \tag{2}$$

from which (1) can be derived for the dichotomic case. Because of its independence of the values of the observables, inequality (2) is by far preferable to inequality (1). Bell's inequality may be violated if some of the observables are incompatible: $[A_1, B_1]_- \neq O$, $[A_2, B_2]_- \neq O$.
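The possibility of such a violation can be illustrated numerically. The following is a minimal sketch, assuming the textbook quantum mechanical correlation $\cos 2(\alpha - \beta)$ for polarization measurements on a maximally entangled photon pair. This correlation function, the function names and the analyzer angles are assumptions introduced for the illustration only; they are not taken from the discussion above.

```python
import math

def correlation(alpha_deg, beta_deg):
    """Assumed correlation <AB> = cos 2(alpha - beta) for two polarization
    analyzers acting on a maximally entangled photon pair."""
    return math.cos(2.0 * math.radians(alpha_deg - beta_deg))

def bell_lhs(a1, b1, a2, b2):
    """|<A1A2> - <A1B2>| - <B1B2> - <B1A2> for analyzer angles in degrees."""
    return (abs(correlation(a1, a2) - correlation(a1, b2))
            - correlation(b1, b2) - correlation(b1, a2))

# Hypothetical analyzer settings (degrees) chosen to give a large violation.
value = bell_lhs(a1=0.0, b1=135.0, a2=22.5, b2=67.5)
print(value)  # exceeds the bound 2 of Bell's inequality
```

With these settings the left-hand side equals $2\sqrt{2}$, which exceeds the bound 2. The observables with identical index (angles 0 and 135 degrees, and 22.5 and 67.5 degrees) are the incompatible ones.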

I shall now discuss two derivations of Bell's inequality, which can be formulated within the quantum mechanical formalism, and which do not rely on the existence of hidden variables. The first one relies on a 'possessed values' principle, stating that

'possessed values' principle = values of quantum mechanical observables may be attributed to the object as objective properties, possessed by the object independent of observation.

The 'possessed values' principle can be seen as an expression of the objectivistic-realist interpretation of the quantum mechanical formalism preferred by Einstein (compare the EPR 'elements of physical reality'). The important point is that by this principle well-defined values are simultaneously attributed to incompatible observables. If $a_i^{(n)}, b_j^{(n)} = \pm 1$ are the values of $A_i$ and $B_j$ for the $n$th of a sequence of $N$ particle pairs, then we have

$$-2 \leq a_1^{(n)} a_2^{(n)} - a_1^{(n)} b_2^{(n)} - b_1^{(n)} b_2^{(n)} - b_1^{(n)} a_2^{(n)} \leq 2,$$

from which it directly follows that the quantities

$$\langle A_1 A_2\rangle = \frac{1}{N} \sum_{n=1}^{N} a_1^{(n)} a_2^{(n)}, \quad \text{etc.}$$

must satisfy Bell's inequality (1) (a similar derivation was first given by Stapp11, although starting from quite a different interpretation). The essential point in the derivation is the assumption of the existence of a quadruple of values $(a_1, b_1, a_2, b_2)$ for each of the particle pairs.
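The bound for an individual particle pair can be checked by brute force, since there are only sixteen possible quadruples of possessed values. The sketch below enumerates them and then evaluates the averaged expression for a randomly generated sequence of quadruples. The function names, the random sequence and its length are illustrative assumptions.

```python
import random
from itertools import product

def pair_combination(a1, b1, a2, b2):
    """a1*a2 - a1*b2 - b1*b2 - b1*a2 for one particle pair."""
    return a1 * a2 - a1 * b2 - b1 * b2 - b1 * a2

# All sixteen quadruples of possessed values +1/-1.
possible_values = {pair_combination(*q) for q in product((+1, -1), repeat=4)}
print(sorted(possible_values))

def bell_lhs(quadruples):
    """|<A1A2> - <A1B2>| - <B1B2> - <B1A2> as averages over the sequence."""
    n = len(quadruples)
    a1a2 = sum(a1 * a2 for a1, b1, a2, b2 in quadruples) / n
    a1b2 = sum(a1 * b2 for a1, b1, a2, b2 in quadruples) / n
    b1b2 = sum(b1 * b2 for a1, b1, a2, b2 in quadruples) / n
    b1a2 = sum(b1 * a2 for a1, b1, a2, b2 in quadruples) / n
    return abs(a1a2 - a1b2) - b1b2 - b1a2

# A hypothetical sequence of N = 1000 pairs, each carrying a full quadruple.
rng = random.Random(0)
sample = [tuple(rng.choice((+1, -1)) for _ in range(4)) for _ in range(1000)]
lhs = bell_lhs(sample)
print(lhs)
```

The per-pair combination only takes the values $-2$ and $+2$, so no sequence built from complete quadruples can violate the inequality, whatever the statistics of the quadruples.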

From the experimental violation of Bell's inequality it follows that an objectivistic-realist interpretation of the quantum mechanical formalism, encompassing the 'possessed values' principle, is impossible. Violation of Bell's inequality entails failure of the 'possessed values' principle (no quadruples available). In view of the important role measurement is playing in the interpretation of quantum mechanics this is hardly surprising. As is well-known, due to the incompatibility of some of the observables the existence of a quadruple of values can only be attained on the basis of doubtful counterfactual reasoning. If a realist interpretation is feasible at all, it seems to have to be a contextualistic one, in which the values of observables are co-determined by the measurement arrangement. In the case of Bell experiments non-locality does not seem to be involved.

As a second possibility to derive Bell's inequality within quantum mechanics we should consider derivations of the BCHS inequality (2) from the existence of a quadrivariate probability distribution $p(a_1, b_1, a_2, b_2)$ by Fine12 and Rastall13 (also de Muynck14). Hence, from violation of Bell's inequality the non-existence of a quadrivariate joint probability distribution follows. In view of the fact that incompatible observables are involved, this, once again, is hardly surprising.
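That the existence of a quadrivariate distribution is sufficient for the BCHS inequality can be made plausible numerically. The sketch below draws random quadrivariate distributions on $\{+1, -1\}^4$, computes the required marginals, and evaluates the middle term of the BCHS inequality for the outcomes $+1$. The random distributions, the helper names and the choice of the outcome $+1$ are assumptions made for the illustration; this is not the derivation given in the cited references.

```python
import random
from itertools import product

OUTCOMES = (+1, -1)
NAMES = ("a1", "b1", "a2", "b2")

def random_joint(rng):
    """Random quadrivariate distribution p(a1, b1, a2, b2) on {+1,-1}^4."""
    weights = {q: rng.random() for q in product(OUTCOMES, repeat=4)}
    total = sum(weights.values())
    return {q: w / total for q, w in weights.items()}

def marginal(joint, **fixed):
    """Probability that the named observables take the given values."""
    return sum(p for q, p in joint.items()
               if all(q[NAMES.index(name)] == v for name, v in fixed.items()))

def bchs_expression(joint):
    """p(b1,a2) + p(b1,b2) + p(a1,b2) - p(a1,a2) - p(b1) - p(b2), outcomes +1."""
    return (marginal(joint, b1=1, a2=1) + marginal(joint, b1=1, b2=1)
            + marginal(joint, a1=1, b2=1) - marginal(joint, a1=1, a2=1)
            - marginal(joint, b1=1) - marginal(joint, b2=1))

rng = random.Random(1)
results = [bchs_expression(random_joint(rng)) for _ in range(1000)]
print(min(results), max(results))  # all values lie between -1 and 0
```

Every distribution tested stays within the bounds, as expected: for each single quadruple the expression equals either $-1$ or $0$, and a joint distribution is a convex mixture of such quadruples.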

A priori there are two possible reasons for the non-existence of the quadrivariate joint probability distribution $p(a_1, b_1, a_2, b_2)$. First, it is possible that the limit $\lim_{N\to\infty} N(a_1, b_1, a_2, b_2)/N$ of the relative frequencies of quadruples of measurement results does not exist. Since, however, Bell's inequality already follows from the existence of relative frequencies $N(a_1, b_1, a_2, b_2)/N$ with finite $N$, and the limit $N \to \infty$ is never involved in any experimental implementation, this answer does not seem to be sufficient. Therefore the reason for the non-existence of the quadrivariate joint probability distribution $p(a_1, b_1, a_2, b_2)$ can only be the non-existence of relative frequencies $N(a_1, b_1, a_2, b_2)/N$. This seems to reduce the present case to the previous one: Bell's inequality can be violated because quadruples $(A_1 = a_1, B_1 = b_1, A_2 = a_2, B_2 = b_2)$ do not exist.

Could non-locality explain the non-existence of quadruples $(A_1 = a_1, B_1 = b_1, A_2 = a_2, B_2 = b_2)$? Indeed, it could. If the value of $A_1$, say, is co-determined by the measurement arrangement of particle 2, then non-locality could entail

$$a_1(A_2) \neq a_1(B_2), \tag{3}$$

thus preventing the existence of one single value of observable $A_1$ for the two Aspect experiments involving this observable. This, precisely, is the 'non-locality' explanation referred to above. This explanation is close to Bohr's 'ambiguity' answer to EPR, referred to in section 2, stating that the definition of an 'element of physical reality' of observable $A_1$ must depend on the measurement context of particle 2.

As will be demonstrated next, there is a more plausible local explanation, however, based on the inequality

$$a_1(A_1) \neq a_1(B_1), \tag{4}$$

expressing that the value of $A_1$, say, will depend on whether $A_1$ or $B_1$ is measured. Inequality (4) could be seen as an implementation of Heisenberg's disturbance theory of measurement, to the effect that observables incompatible with the actually measured one are disturbed by the measurement. That such an effect is really occurring in the Aspect experiments can be seen from the generalized Aspect experiment depicted in figure 3. This experiment should be compared with the Aspect switching experiment?, in which the switches have been replaced by two semi-transparent mirrors (transmissivities $\gamma_1$ and $\gamma_2$, respectively). The four Aspect experiments are special cases of the generalized one, having $\gamma_n = 0$ or $1$, $n = 1, 2$.

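Restricting for a moment to one side of the interferometer, it is possible to calculate the joint detection probabilities of the two detectors according to

$$\left(p_{\gamma_1}(a_{1i}, b_{1j})\right) = \begin{pmatrix} 0 & \gamma_1 \langle E^{(1)}_+\rangle \\ (1-\gamma_1)\langle F^{(1)}_+\rangle & 1 - \gamma_1 \langle E^{(1)}_+\rangle - (1-\gamma_1)\langle F^{(1)}_+\rangle \end{pmatrix}, \tag{5}$$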

in which $\{E^{(1)}_+, E^{(1)}_-\}$ and $\{F^{(1)}_+, F^{(1)}_-\}$ are the spectral representations of the two polarization observables ($A_1$ and $B_1$) in directions $\theta_1$ and $\theta_1'$, respectively. The values $a_{1i} = +/-$, $b_{1j} = +/-$ correspond to yes/no registration of a photon by the detector. $p_{\gamma_1}(+,+) = 0$ means that, as in the switching experiment, only one of the detectors can register photon 1. There is, however, a fundamental difference with the switching experiment, because in this latter experiment the photon wave packet is sent either toward one detector or the other, whereas in the present one it is split so as to interact coherently with both detectors. This makes it possible to interpret the right hand part of the generalized experiment of figure 3 as a joint non-ideal measurement of the incompatible polarization observables in directions $\theta_1$ and $\theta_1'$ (e.g. de Muynck et al.15), the joint probability distribution of the observables being given by (5).

Figure 3: Generalized Aspect experiment.

It is not possible to extensively discuss here the relevance of experiments of the generalized type for understanding Heisenberg's disturbance theory of measurement, and its relation to the Heisenberg uncertainty relations (see e.g. de Muynck16). The important point is that such experiments do not fit into the standard (Dirac-von Neumann) formalism in which a probability is an expectation value of a projection operator. Indeed, from (5) it follows that $p_{\gamma_1}(a_{1i}, b_{1j}) = \mathrm{Tr}\,\rho R^{(1)}_{ij}$ yields operators $R^{(1)}_{ij}$ according to

$$\left(R^{(1)}_{ij}\right) = \begin{pmatrix} O & \gamma_1 E^{(1)}_+ \\ (1-\gamma_1) F^{(1)}_+ & \gamma_1 E^{(1)}_- + (1-\gamma_1) F^{(1)}_- \end{pmatrix}. \tag{6}$$

The set of operators $\{R^{(1)}_{ij}\}$ constitutes a so-called positive operator-valued measure (POVM). Only generalized measurements corresponding to POVMs are able to describe joint non-ideal measurements of incompatible observables. By calculating the marginals of probability distribution $p_{\gamma_1}(a_{1i}, b_{1j})$ it is possible to see that for each value of $\gamma_1$ information is obtained on both polarization observables, be it that information on polarization in direction $\theta_1$ gets more non-ideal as $\gamma_1$ decreases, while information on polarization in direction $\theta_1'$ is getting more ideal. This is in perfect agreement with the idea of mutual disturbance in a joint measurement of incompatible observables. The explanation of the non-existence of a single measurement result for observable $A_1$, say, as implied by inequality (4), is corroborated by this analysis.
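The POVM properties and the behaviour of the marginals can be verified with a small calculation. The sketch below represents linear polarization states by real two-component vectors, so that the projectors are real 2x2 matrices, and builds the four operators of the one-sided generalized measurement for a transmissivity `gamma`. The matrix representation, the function names and the numerical settings are assumptions made for the illustration.

```python
import math

IDENTITY = [[1.0, 0.0], [0.0, 1.0]]
ZERO = [[0.0, 0.0], [0.0, 0.0]]

def projector(theta):
    """Projector onto linear polarization at angle theta (real 2x2 matrix)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c * c, c * s], [c * s, s * s]]

def combine(k1, m1, k2, m2):
    """Return the 2x2 matrix k1*m1 + k2*m2."""
    return [[k1 * m1[i][j] + k2 * m2[i][j] for j in range(2)] for i in range(2)]

def povm(gamma, theta, theta_prime):
    """Operators R_ij of the joint non-ideal measurement, indexed by (i, j)."""
    e_plus, f_plus = projector(theta), projector(theta_prime)
    e_minus = combine(1.0, IDENTITY, -1.0, e_plus)
    f_minus = combine(1.0, IDENTITY, -1.0, f_plus)
    return {
        ("+", "+"): ZERO,
        ("+", "-"): combine(gamma, e_plus, 0.0, ZERO),
        ("-", "+"): combine(1.0 - gamma, f_plus, 0.0, ZERO),
        ("-", "-"): combine(gamma, e_minus, 1.0 - gamma, f_minus),
    }

def is_positive(m, tol=1e-12):
    """A real symmetric 2x2 matrix is positive semidefinite if and only if
    its trace and determinant are non-negative."""
    trace = m[0][0] + m[1][1]
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return trace >= -tol and det >= -tol

def close(m1, m2, tol=1e-12):
    return all(abs(m1[i][j] - m2[i][j]) <= tol
               for i in range(2) for j in range(2))

# Hypothetical settings: transmissivity 0.3, analyzer directions 0 and pi/8.
gamma, theta, theta_prime = 0.3, 0.0, math.pi / 8
r = povm(gamma, theta, theta_prime)

row_plus = combine(1.0, r[("+", "+")], 1.0, r[("+", "-")])    # sum over j, i = +
row_minus = combine(1.0, r[("-", "+")], 1.0, r[("-", "-")])   # sum over j, i = -
total = combine(1.0, row_plus, 1.0, row_minus)                # should be identity
column_plus = combine(1.0, r[("+", "+")], 1.0, r[("-", "+")]) # sum over i, j = +
```

The marginal `row_plus` equals `gamma` times the projector for direction `theta`, and `column_plus` equals `1 - gamma` times the projector for direction `theta_prime`. Each marginal is thus a non-ideal version of one polarization observable, with the quality of one improving as that of the other deteriorates.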


The analysis can easily be extended to the joint detection probabilities of the whole experiment of figure 3. The joint detection probability distribution of all four detectors is given by the expectation value of a quadrivariate POVM $\{R_{ijkl}\}$ according to

$$p_{\gamma_1\gamma_2}(a_{1i}, b_{1j}, a_{2k}, b_{2l}) = \mathrm{Tr}\,\rho R_{ijkl}. \tag{7}$$

This POVM can be expressed in terms of the POVMs of the left and right interferometer arms according to

$$R_{ijkl} = R^{(1)}_{ij} R^{(2)}_{kl}. \tag{8}$$

It is important to note that the existence of the quadrivariate joint probability distribution (7), and the consequent satisfaction of Bell's inequality, is a consequence of the existence of quadruples of measurement results, available because it is possible to determine for each individual particle pair what is the result of each of the four detectors. Although, because of (8), also locality is assumed, this does not play an essential role. Under the condition that a quadruple of measurement results exists for each individual photon pair Bell's inequality would be satisfied also if, due to non-local interaction, $R_{ijkl}$ were not a product of operators of the two arms of the interferometer. The reason why the standard Aspect experiments do not satisfy Bell's inequality is the non-existence of a quadrivariate joint probability distribution yielding the bivariate probabilities of these experiments as marginals. Such a non-existence is strongly suggested by Heisenberg's idea of mutual disturbance in a joint measurement of incompatible observables. This is corroborated by the easily verifiable fact that the quadrivariate joint probability distributions of the standard Aspect experiments, obtained from (7) and (8) by taking $\gamma_n$ to be either 1 or 0, are all distinct. Moreover, in general the quadrivariate joint probability distribution (7) for one standard Aspect experiment does not yield the bivariate ones of the other experiments as marginals. Although it is not strictly excluded that a quadrivariate joint probability distribution might exist having the bivariate probabilities of the standard Aspect experiments as marginals (hence, different from the ones referred to above), the mathematical formalism of quantum mechanics does not give any reason to surmise its existence. As far as quantum mechanics is concerned, the standard Aspect experiments need not satisfy Bell's inequality.


4 Bell's inequality in stochastic and deterministic hidden-variables theories

In stochastic hidden-variables theories quantum mechanical probabilities are usually given as

$$p(a_1) = \int_\Lambda d\lambda\, p(\lambda)\, p(a_1|\lambda), \tag{1}$$

in which $\Lambda$ is the space of hidden variable $\lambda$ (to be compared with classical phase space), $p(a_1|\lambda)$ is the conditional probability of measurement result $A_1 = a_1$ if the value of the hidden variable was $\lambda$, and $p(\lambda)$ is the probability of $\lambda$. It should be noticed that expression (4.1) fits perfectly into an empiricist interpretation of the quantum mechanical formalism, in which measurement result $a_1$ is referring to a pointer position of a measuring instrument, the object being described by the hidden variable. Since $p(a_1|\lambda)$ may depend on the specific way the measurement is carried out, the stochastic hidden-variables model corresponds to a contextualistic interpretation of quantum mechanical observables. Deterministic hidden-variables theories are just special cases in which $p(a_1|\lambda)$ is either 1 or 0. In the deterministic case it is possible to associate in a unique way (although possibly dependent on the measurement procedure) the value $a_1$ to the phase space point $\lambda$ the object is prepared in. A disadvantage of a deterministic theory is that the physical interaction of object and measuring instrument is left out of consideration, thus suggesting measurement result $a_1$ to be a (possibly contextually determined) property of the object. In order to have maximal generality it is preferable to deal with the stochastic case.

For Bell experiments we have

$$p(a_1, a_2) = \int_\Lambda d\lambda\, p(\lambda)\, p(a_1, a_2|\lambda), \tag{2}$$

with a condition of conditional statistical independence,

$$p(a_1, a_2|\lambda) = p(a_1|\lambda)\, p(a_2|\lambda), \tag{3}$$

expressing that the measurement procedures of $A_1$ and $A_2$ do not influence each other (so-called locality condition).

As is well-known the locality condition was thought by Bell to be the crucial condition allowing a derivation of his inequality. This does not seem to be correct, however. As a matter of fact, Bell's inequality can be derived if a quadrivariate joint probability distribution exists12,13. In a stochastic hidden-variables theory such a distribution could be represented by

$$p(a_1, b_1, a_2, b_2) = \int_\Lambda d\lambda\, p(\lambda)\, p(a_1, b_1, a_2, b_2|\lambda), \tag{4}$$

without any necessity that the conditional probability be factorizable in order that Bell's inequality be satisfied (although for the generalized experiment discussed in section 3 it would be reasonable to require that $p(a_1, b_1, a_2, b_2|\lambda) = p(a_1, b_1|\lambda)\, p(a_2, b_2|\lambda)$). Analogous to the quantum mechanical case, it is sufficient that for each individual preparation (here parameterized by $\lambda$) a quadruple of measurement results exists. If Heisenberg measurement disturbance is a physically realistic effect in the experiments at issue, it should be described by the hidden-variables theory as well. Therefore the explanation of the non-existence of such quadruples is the same as in quantum mechanics.

However, with respect to the possibility of deriving Bell's inequality there is an important difference between quantum mechanics and the stochastic hidden-variables theories of the kind discussed here. Whereas quantum mechanics does not yield any indication as regards the existence of a quadrivariate joint probability distribution returning the bivariate probabilities of the Aspect experiments as marginals, local stochastic hidden-variables theory does. Indeed, using the single-observable conditional probabilities assumed to exist in the local theory (compare (3)), it is possible to construct a quadrivariate joint probability distribution according to

$$p(a_1, a_2, b_1, b_2) = \int_\Lambda d\lambda\, p(\lambda)\, p(a_1|\lambda)\, p(a_2|\lambda)\, p(b_1|\lambda)\, p(b_2|\lambda), \tag{5}$$

satisfying all requirements. It should be noted that (5) does not describe the results of any joint measurement of the four observables that are involved. Quadruples $(a_1, a_2, b_1, b_2)$ are obtained here by combining measurement results found in different experiments, assuming the same value of $\lambda$ in all experiments. For this reason the physical meaning of this probability distribution is not clear. However, this does not seem to be important. The existence of (5) as a purely mathematical constraint is sufficient to warrant that any stochastic hidden-variables theory in which (2) and (3) are satisfied must require that the standard Aspect experiments obey Bell's inequality. Admittedly, there is a possibility that (5) might not be a valid mathematical entity because it is based on multiplication of the probability distributions $p(a|\lambda)$, which might be distributions in the sense of Schwartz' distribution theory. However, the remark made with respect to the existence of probability distributions as infinite-$N$ limits of relative frequencies is valid also here: the reasoning does not depend on this limit, but is equally applicable to relative frequencies in finite sequences.
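The construction can be illustrated with a discrete toy model in which the integral over the hidden variable becomes a finite sum. The sketch below builds the quadrivariate distribution from single-observable conditional probabilities and verifies that it returns the bivariate probabilities of the four Aspect experiments as marginals. The three hidden-variable values, their weights and all response probabilities are hypothetical numbers chosen for the illustration.

```python
import math
from itertools import product

OUTCOMES = (+1, -1)
OBSERVABLES = ("a1", "a2", "b1", "b2")

# Hypothetical model: weights p(lambda) for three hidden-variable values and
# response probabilities p(outcome = +1 | lambda) for each observable.
P_LAMBDA = [0.5, 0.3, 0.2]
P_PLUS = {
    "a1": [0.9, 0.2, 0.5],
    "a2": [0.1, 0.7, 0.5],
    "b1": [0.6, 0.4, 0.8],
    "b2": [0.3, 0.9, 0.2],
}

def response(obs, outcome, lam):
    """Single-observable conditional probability p(outcome | lambda)."""
    p_plus = P_PLUS[obs][lam]
    return p_plus if outcome == +1 else 1.0 - p_plus

def pair_probability(obs_x, x, obs_y, y):
    """Bivariate probability of one Bell experiment under the locality condition."""
    return sum(w * response(obs_x, x, lam) * response(obs_y, y, lam)
               for lam, w in enumerate(P_LAMBDA))

def quadrivariate():
    """Discrete analogue of the constructed distribution p(a1, a2, b1, b2)."""
    return {
        q: sum(w * math.prod(response(obs, val, lam)
                             for obs, val in zip(OBSERVABLES, q))
               for lam, w in enumerate(P_LAMBDA))
        for q in product(OUTCOMES, repeat=4)
    }

def marginal(joint, obs_x, x, obs_y, y):
    ix, iy = OBSERVABLES.index(obs_x), OBSERVABLES.index(obs_y)
    return sum(p for q, p in joint.items() if q[ix] == x and q[iy] == y)

joint = quadrivariate()
aspect_pairs = [("a1", "a2"), ("a1", "b2"), ("b1", "a2"), ("b1", "b2")]
max_deviation = max(
    abs(marginal(joint, ox, x, oy, y) - pair_probability(ox, x, oy, y))
    for ox, oy in aspect_pairs for x in OUTCOMES for y in OUTCOMES)
print(max_deviation)  # zero up to rounding errors
```

The marginals agree because each conditional probability sums to one over its outcomes for every fixed value of the hidden variable. This works only because the same value of the hidden variable is used in all four experiments, which is the assumption examined in the remainder of the text.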

The question is whether this reasoning is sufficient to conclude that no local hidden-variables theory can reproduce quantum mechanics. Such a conclusion would only be justified if locality were the only assumption in deriving Bell's inequality. If there were any additional assumption in this derivation, then violation of Bell's inequality could possibly be blamed on the invalidity of this additional assumption rather than locality. Evidently, one such additional assumption is the existence of hidden variables. A belief in the completeness of the quantum mechanical formalism would, indeed, be a sufficient reason to reject this assumption, thus increasing pressure on the locality assumption. Since, however, an empiricist interpretation is hardly reconcilable with such a completeness belief, we have to take hidden-variables theories seriously, and look for the possibility of additional assumptions within such theories.

In expression (4.1) one such assumption is evident, viz. the existence of the conditional probability $p(a_1|\lambda)$. The assumption of the applicability of this quantity in a quantum mechanical measurement is far less innocuous than appears at first sight. If quantum mechanical measurements really can be modeled by equality (4.1), this implies that a quantum mechanical measurement result is determined, either in a stochastic or in a deterministic sense, by an instantaneous value $\lambda$ of the hidden variable, prepared independently of the measurement to be performed later. It is questionable whether this is a realistic assumption, in particular, if hidden variables would have the character of rapidly fluctuating stochastic variables. As a matter of fact, every individual quantum mechanical measurement takes a certain amount of time, and it will in general be virtually impossible to determine the precise instant to be taken as the initial time of the measurement, as well as the precise value of the stochastic variable at that moment. Hence, hidden-variables theories of the kind considered here may be too specific.

Because of the assumption of a non-contextual preparation of the hidden variable, such theories were called quasi-objectivistic stochastic hidden-variables theories in de Muynck and van Stekelenborg17 (dependence of the conditional probabilities $p(a_1|\lambda)$ on the measurement procedure preventing complete objectivity of the theory). In the past attention has mainly been restricted to quasi-objectivistic hidden-variables theories. It is questionable, however, whether the assumption of quasi-objectivity is a possible one for hidden-variables theories purporting to reproduce quantum mechanical measurement results. The existence of the quadrivariate probability distribution (5) only excludes quasi-objectivistic local hidden-variables theories (either stochastic or deterministic) from the possibility of reproducing quantum mechanics. As will be seen in the next section, it is far more reasonable to blame quasi-objectivity than locality for this, thus leaving the possibility of local hidden-variables theories that are not quasi-objectivistic.


5 Analogy between thermodynamics and quantum mechanics

The essential feature of expression (4.1) is the possibility to attribute, either in a stochastic or in a deterministic way, measurement result $a_1$ to an instantaneous value of hidden variable $\lambda$. The question is whether this is a reasonable assumption within the domain of quantum mechanical measurement. Are the conditional probabilities $p(a_1|\lambda)$ experimentally relevant within this domain? In order to give a tentative answer to this question, we shall exploit the analogy between thermodynamics and quantum mechanics, considered already a long time ago by many authors (e.g. de Broglie18, Bohm et al.19,20, Nelson21,22).

$$\begin{array}{ccc} \text{Quantum mechanics } (A_1, A_2, B_1, B_2) & \longrightarrow & \text{Hidden variables theory } \lambda \\ \uparrow & & \uparrow \\ \text{Thermodynamics } (p, T, S) & \longrightarrow & \text{Classical statistical mechanics } \{q_n, p_n\} \end{array}$$

In this analogy thermodynamics and quantum mechanics are considered as phenomenological theories, to be reduced to more fundamental "microscopic" theories. The reduction of thermodynamics to classical statistical mechanics is thought to be analogous to a possible reduction of quantum mechanics to stochastic hidden-variables theory. Due to certain restrictions imposed on preparations and measurements within the domains of the phenomenological theories, their domains of application are thought to be contained in, but smaller than, the domains of the "microscopic" theories.

In order to assess the nature and the importance of such restrictions let us first look at thermodynamics. As is well-known (e.g. Hollinger and Zenzen23) thermodynamics is valid only under a condition of molecular chaos, assuring the existence of local equilibrium [a] necessary for the ergodic hypothesis to be satisfied. Thermodynamics only describes measurements of quantities (like pressure, temperature, and entropy) being defined for such equilibrium states. From an operational point of view this implies that measurements within the domain of thermodynamics do not yield information on the object system valid for one particular instant of time, but time-averaged information, time averaging being replaced, under the ergodic hypothesis, by ensemble averaging. In the Gibbs theory this ensemble is represented by the canonical density function $Z^{-1} e^{-H(\{q_n, p_n\})/kT}$ on phase space. This state is called a macrostate, to be distinguished from the microstate $\{q_n, p_n\}$, representing the point in phase space the classical object is in at a certain instant of time.

[a] In "equilibrium thermodynamics" equilibrium is assumed to be even global.

The restricted validity of thermodynamics is manifest in a two-fold way: i) through the restriction of all possible density functions on phase space to the canonical ones; ii) through the restriction of thermodynamical quantities (observables) to functionals on the space of thermodynamic states. Physically this can be interpreted as a restriction of the domain of application of thermodynamics to those measurement procedures probing only properties of the macrostates. This implies that such measurements only yield information that is averaged over times exceeding the relaxation time needed to reach a state of (local) equilibrium. Thus, it is important to note that thermodynamic quantities are quite different from the physical quantities of classical statistical mechanics, the latter ones being represented by functions of the microstate $\{q_n, p_n\}$ and, hence, referring to a particular instant of time [b]. Only if it were possible to perform measurements faster than the relaxation time would it be necessary to consider such non-thermodynamic quantities. Such measurements, then, are outside the domain of application of thermodynamics. Thus, if we have a cubic container containing a volume of gas in a microstate initially concentrated at its center, and if we could measure at a single instant of time either the total kinetic energy or the force exerted on the boundary of the container, then these results would not be equal to thermodynamic temperature and pressure [c], respectively, because this microstate is not an equilibrium state. Only after the gas has reached equilibrium within the volume defined by the container does (equilibrium) thermodynamics become applicable.

Within the domain of application of thermodynamics the microstate of the system may change appreciably without the macrostate being affected. Indeed, a macrostate is equivalent to an (ergodic) trajectory $\{q_n(t), p_n(t)\}_{\rm ergodic}$. We might exploit the difference between micro- and macrostates as follows for characterizing objectivity of a physical theory. Whereas the microstate is thought to yield an objective description of the (microscopic) object, the macrostate just describes certain phenomena to be attributed to the object system only while being observed under conditions valid within the domain of application of the theory. In this sense classical mechanics is an objective theory, all quantities being instantaneous properties of the microstate. Thermodynamic quantities, only being attributable to the macrostate (i.e., to an ergodic trajectory), cannot be seen, however, as properties belonging to the object at a certain instant of time. Of course, we might attribute the thermodynamic quantity to the event in space-time represented by the trajectory, but it should be realized that this event is not determined solely by the preparation of the microstate, but is determined as well by the macroscopic arrangement serving to define the macrostate.

[b] Note that a "definition" of an instantaneous temperature by means of the equality $\frac{3}{2} n k T = \sum_i p_i^2 / 2 m_i$ does not make sense, as can easily be seen by applying this "definition" to an ideal gas in a container freely falling in a gravitational field.

[c] Thermodynamic pressure is defined for the canonical ensemble by $p = kT\, \partial/\partial V \log Z$.

In order to illustrate this, consider two identical cubic containers differing only in their orientations (cf. figure 4). In principle, the same microstate may be prepared in the two containers. Because of the different orientations, however, the macrostates, evolving from this microstate during the time the gas is reaching equilibrium with the container, are different (for different orientations of the container we have $H_1 \neq H_2$, and, hence, $e^{-H_1/kT}/Z_1 \neq e^{-H_2/kT}/Z_2$, since $H = T + V$, and $V_1 \neq V_2$ because potential energy is infinite outside a container). This implies that thermodynamic macrostates may be different even though starting from the same microstate. Macrostates in thermodynamics have a contextual meaning. It is important to note that, since the container is part of the preparing apparatus, this contextuality is connected here to preparation rather than to measurement. Consequently, whereas classical quantities $f(\{q_n, p_n\})$ can be interpreted as objective properties, thermodynamic quantities are non-objective, the non-objectivity being of a contextual nature.

Figure 4: Incompatible thermodynamic arrangements.

Let us now suppose that quantum mechanics is related to hidden-variables theory analogously to the way thermodynamics is related to classical mechanics, the analogy maybe being even closer for non-equilibrium thermodynamics (only local equilibrium being assumed) than for the thermodynamics of global equilibrium processes. Support for this idea was found in de Muynck and van Stekelenborg17, where it was demonstrated that in the Husimi representation of quantum mechanics by means of non-negative probability distribution functions on phase space an analogous restriction to a "canonical" set of distributions obtains as in thermodynamics. In particular, it was demonstrated that the dispersion-free states $\rho(q, p) = \delta(q - q_0)\,\delta(p - p_0)$ are not "canonical" in this sense. This implies that within the domain of quantum mechanics it does not make sense to consider the preparation of the object in a "microstate" with a well-defined value of the hidden variables $(q, p)$.

In the analogy quantum mechanical observables like $A_1, A_2, B_1, B_2$ should be compared to thermodynamic quantities like pressure, temperature, and entropy. The central issue in the analogy is the fact that thermodynamic quantities like pressure and temperature cannot be conditioned on the instantaneous phase space variable $\{q_n, p_n\}$ (microstate). Expressions like $p(\{q_n, p_n\})$ and $T(\{q_n, p_n\})$ are meaningless within thermodynamics. Thermodynamic quantities are conditioned on macrostates, corresponding to ergodic paths in phase space. Analogously, a quantum mechanical observable might not correspond to an instantaneous property of the object, but might have to be associated with an (ergodic) path in hidden-variables space $\Lambda$ (macrostate) rather than with an instantaneous value $\lambda$ (microstate).

On the basis of the analogy between thermodynamics and quantum mechanics it is possible to state the following conjectures:

• Quantum mechanical measurements (analogous to thermodynamic measurements) do not probe microstates but macrostates.

• Quantum mechanical quantities (analogous to thermodynamic quantities) should be conditioned on macrostates.

A hidden-variables macrostate will be symbolically indicated by $\lambda^t$. For quantum mechanical measurements the conditional probabilities $p(a_1|\lambda)$ of (4.1) should then be replaced by $p(a_1|\lambda^t)$. Concomitantly, quantum mechanical probabilities should be represented in the hidden-variables theory by a functional integral,

$$p(a_1) = \int d\lambda^t\, p(\lambda^t)\, p(a_1|\lambda^t), \tag{1}$$

in which the integration is over all possible macrostates consistent with the preparation procedure.

By itself conditioning of quantum mechanical observables on macrostates rather than microstates is not sufficient to prevent derivation of Bell's inequality. As a matter of fact, on the basis of expression (1) a quadrivariate joint probability distribution can be defined, analogous to expression (5) of section 4, according to

$$p(a_1, a_2, b_1, b_2) = \int d\lambda^t\, p(\lambda^t)\, p(a_1|\lambda^t)\, p(a_2|\lambda^t)\, p(b_1|\lambda^t)\, p(b_2|\lambda^t), \tag{2}$$

from which Bell's inequality can be derived just as well. There is, however, one important aspect that up till now has not sufficiently been taken into account, viz. contextuality. In the construction of (2) it is assumed that the macrostate $\lambda^t$ is applicable in each of the measurement arrangements of observables $A_1, A_2, B_1$, and $B_2$. Because of the incompatibility of some of these observables this is an implausible assumption. On the basis of the thermodynamic analogy it is to be expected that macrostates $\lambda^t$ will depend on the measurement context of a specific observable. Since $[A_1, B_1]_- \neq O$, we will have

$$\lambda^t_{A_1} \neq \lambda^t_{B_1}, \tag{3}$$

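and analogously for $A_2$ and $B_2$. Then, for the Bell experiments measuring the pairs $(A_1, A_2)$ and $(A_1, B_2)$, respectively, we have

$$p(a_1, a_2) = \int d\lambda^t_{A_1 A_2}\, p(\lambda^t_{A_1 A_2})\, p(a_1|\lambda^t_{A_1 A_2})\, p(a_2|\lambda^t_{A_1 A_2}), \tag{4}$$

$$p(a_1, b_2) = \int d\lambda^t_{A_1 B_2}\, p(\lambda^t_{A_1 B_2})\, p(a_1|\lambda^t_{A_1 B_2})\, p(b_2|\lambda^t_{A_1 B_2}). \tag{5}$$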

Now, the contextuality expressed by inequality (3) prevents the construction of a quadrivariate joint probability distribution analogous to (2). Hence, as in the quantum mechanical approach, also in the local non-objectivistic hidden-variables theory a derivation of Bell's inequality is prevented due to the local contextuality involved in the interaction of the particle and the measuring instrument it is directly interacting with.

6 Conclusions

Our conclusion is that if quantum mechanical measurements do probe macrostates $\lambda^t$ rather than microstates $\lambda$, then Bell's inequality cannot be derived for quantum mechanical measurements. Both in quantum mechanics and in hidden-variables theories Bell's inequality is a consequence of the assumption that the theory is yielding an objective description of reality in the sense that the preparation of the microscopic object, as far as relevant to the realization of the measurement result, can be thought to be independent of the measurement arrangement. The important point to be noticed is that, although in Bell experiments the preparation of the particle pair at the source (i.e. the microstate) can be considered to be independent of the measurement procedures to be carried out later (and, hence, one and the same microstate can be assumed in different Bell experiments), the measurement result is only determined by the macrostate, which is co-determined by the interaction with the measuring instruments. It really seems that the Copenhagen maxim of the impossibility of attributing quantum mechanical measurement results to the object as objective properties, possessed independently of the measurement, should be taken very seriously, and implemented also in hidden-variables theories purporting to reproduce the quantum mechanical results. The quantum mechanical dice is only cast after the object has been interacting with the measuring instrument, even though its result can be deterministically determined by the (sub-quantum mechanical) microstate.

The thermodynamic analogy suggests which experiments could be done in order to transcend the boundaries of the domain of application of quantum mechanics. If it were possible to perform experiments that probe the microstate $\lambda$ rather than the macrostate $\lambda^t$, then we would be in the domain of (quasi-)objectivistic hidden-variables theories. Because of expression (5) of section 4 it is then to be expected that Bell's inequality should be satisfied for such experiments. In such experiments preparation and measurement must be completed well within the relaxation time of the microstates. Such times have been estimated by Bohm24 "for the sake of illustration" as the time light needs to cover a distance of the order of the size of an atom ($10^{-18}$ s, say). If this is correct, then all present-day experimentation is well within the range of quantum mechanics, thus explaining the seemingly universal applicability of this latter theory. In hindsight, this would explain why Aspect's switching experiment? is corroborating quantum mechanics: the applied switching frequency (50 MHz), although sufficient to warrant locality, has been far too low to beat the local relaxation processes in each of the measuring instruments separately.
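The orders of magnitude involved can be made explicit with a short calculation. The sketch below compares the period of a 50 MHz switch with the illustrative relaxation time quoted above. The atomic size of $10^{-10}$ m is an assumed order of magnitude introduced here and is not a value taken from the text.

```python
SPEED_OF_LIGHT = 299_792_458.0   # m/s
ATOM_SIZE = 1e-10                # m, assumed order of magnitude of an atom
RELAXATION_TIME = 1e-18          # s, illustrative value quoted in the text
SWITCHING_FREQUENCY = 50e6       # Hz, switching frequency quoted in the text

# Time light needs to cross a distance of atomic size: about 3.3e-19 s.
light_crossing_time = ATOM_SIZE / SPEED_OF_LIGHT

# Duration of one switching cycle: 2e-8 s.
switching_period = 1.0 / SWITCHING_FREQUENCY

# Number of relaxation times that fit into one switching cycle: about 2e10.
ratio = switching_period / RELAXATION_TIME

print(light_crossing_time, switching_period, ratio)
```

On these assumptions one switching cycle lasts some ten orders of magnitude longer than the estimated relaxation time, which is the sense in which the switching is far too slow to probe the microstate.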

It has often been felt that the most surprising feature of Bell experiments is the possibility (in certain states) of a strict correlation between the measurement results of the two measured observables, without being able to attribute this to a previous preparation of the object (no 'elements of physical reality '). For many physicists the existence of such strict correlations has been reason enough to doubt Bohr's Copenhagen solution to renounce causal explanation of measurement results, and to replace 'determinism' by 'complementarity'. It seems that the urge for causal reasoning has been so strong that even within the Copenhagen interpretation a certain causality has been accepted, even a non-local one, in an EPR experiment (cf. figure 1) determining a measurement result for particle 2 by the measurement of particle 1. This, however, should rather be seen as an internal inconsistency of this interpretation, caused by a tendency to make the Copenhagen interpretation as realist as possible. In a consistent application of the Copenhagen interpretation to Bell experiments such experiments could be interpreted as measurements of bivariate correlation observables. The certainty of obtaining a certain (bivariate) eigenvalue of such an observable would not be more surprising than the certainty of obtaining a certain eigenvalue of a univariate one if the state vector is the corresponding eigenvector.

It is important to note that this latter interpretation of Bell experiments takes seriously the Copenhagen idea that quantum mechanics need not explain the specific measurement result found in an individual measurement. Indeed, in order to compare theory and experiment it would be sufficient that quantum mechanics just describe the relative frequencies found in such measurements. In this view quantum mechanics is just a phenomenological theory, describing (not explaining) observations in a way analogous to thermodynamics in its own domain of application. Explanations should be provided by "more fundamental" theories, describing the mechanisms behind the observable phenomena. Hence, the Copenhagen 'completeness' thesis should be rejected (although this need not imply a return to determinism).

This approach has important consequences. One consequence is that the non-existence, within quantum mechanics, of 'elements of physical reality' does not imply that 'elements of physical reality' do not exist at all. They could be elements of the "more fundamental" theories. In section 5 it was discussed how an analogy between quantum mechanics and thermodynamics could be exploited to spell this out. 'Elements of physical reality' could correspond to hidden-variables microstates A. The determinism necessary to explain the strict correlations, referred to above, would be explained if, within a given measurement context, a microstate would define a unique macrostate A . This demonstrates how it could be possible that quantum mechanical measurement results cannot be attributed to the object as properties possessed prior to measurement, and there, yet, is sufficient determinism to yield a local explanation of strict correlations of quantum mechanical measurement results in certain Bell experiments.

Another important aspect of a dissociation of phenomenological and fundamental aspects of measurement is the possibility of an empiricist interpretation of quantum mechanics. As demonstrated by the generalized Aspect experiment discussed in section 3, an empiricist approach needs a generalization of the mathematical formalism of quantum mechanics, in which an observable is represented by a POVM rather than by a projection-valued measure corresponding to a self-adjoint operator of the standard formalism. Such a generalization has been very important in assessing the meaning of Bell's inequality. In the major part of the literature of the past this subject has been dealt with on the basis of the (restricted) standard formalism. However, some conclusions drawn from the restricted formalism are not cogent when viewed in the generalized one (for instance, because von Neumann's projection postulate is not applicable in general). For this reason we must be very careful when accepting conclusions drawn from the standard formalism. This, in particular, holds true for the issue of non-locality.


References

1. W. Heisenberg, Zeitschr. f. Phys. 33, 879 (1925).
2. E. Schrödinger, Naturwissenschaften 23, 807, 823, 844 (1935) (English translation in Quantum Theory and Measurement, eds. J.A. Wheeler and W.H. Zurek (Princeton Univ. Press, 1983), p. 152).
3. W.M. de Muynck, Synthese 102, 293 (1995).
4. A. Einstein, B. Podolsky, and N. Rosen, Phys. Rev. 47, 777 (1935).
5. A. Aspect, P. Grangier, and G. Roger, Phys. Rev. Lett. 47, 460 (1981).
6. A. Aspect, J. Dalibard, and G. Roger, Phys. Rev. Lett. 49, 1804 (1982).
7. K.R. Popper, Quantum theory and the schism in physics (Rowman and Littlefield, Totowa, 1982).
8. M. Jammer, The philosophy of quantum mechanics (Wiley, New York, 1974).
9. N. Bohr, Phys. Rev. 48, 696 (1935).
10. J.S. Bell, Physics 1, 195 (1964).
11. H.P. Stapp, Phys. Rev. D 3, 1303 (1971); Il Nuovo Cim. 29B, 270 (1975).
12. A. Fine, Journ. Math. Phys. 23, 1306 (1982); Phys. Rev. Lett. 48, 291 (1982).
13. P. Rastall, Found. of Phys. 13, 555 (1983).
14. W.M. de Muynck, Phys. Lett. A 114, 65 (1986).
15. W.M. de Muynck, W. De Baere, and H. Martens, Found. of Phys. 24, 1589 (1994).
16. W.M. de Muynck, Found. of Phys. 30, 205 (2000).
17. W.M. de Muynck and J.T. van Stekelenborg, Ann. der Phys., 7. Folge, 45, 222 (1988).
18. L. de Broglie, La thermodynamique de la particule isolée (Gauthier-Villars, 1964); L. de Broglie, Diverses questions de mécanique et de thermodynamique classiques et relativistes (Springer-Verlag, 1995).
19. D. Bohm, Phys. Rev. 89, 458 (1953).
20. D. Bohm and J.-P. Vigier, Phys. Rev. 96, 208 (1954).
21. E. Nelson, Dynamical theories of Brownian motion (Princeton University Press, 1967).
22. E. Nelson, Quantum fluctuations (Princeton University Press, 1985).
23. H.B. Hollinger and M.J. Zenzen, The Nature of Irreversibility (D. Reidel Publishing Company, Dordrecht, 1985), sect. 4.4.
24. D. Bohm, Phys. Rev. 85, 166, 180 (1952).


DISCRETE HESSIANS IN STUDY OF QUANTUM STATISTICAL SYSTEMS: COMPLEX GINIBRE ENSEMBLE

M. M. DURAS

Institute of Physics, Cracow University of Technology, ulica Podchorazych 1, PL-30084 Cracow, Poland


The Ginibre ensemble of nonhermitean random Hamiltonian matrices $K$ is considered. Each quantum system described by $K$ is a dissipative system and the eigenenergies $Z_i$ of the Hamiltonian are complex-valued random variables. The second difference of complex eigenenergies is viewed as a discrete analog of the Hessian with respect to the labelling index. The results are considered in view of Wigner and Dyson's electrostatic analogy. An extension of the space of dynamics of random magnitudes is performed by introduction of a discrete space of labelling indices.

1 Introduction

Random Matrix Theory (RMT) studies quantum Hamiltonian operators $H$ which are random matrix variables. Their matrix elements $H_{ij}$ are independent random scalar variables [1,2,3,4,5,6,7,8]. Among others, the following Gaussian Random Matrix ensembles (GRME) have been studied: orthogonal GOE, unitary GUE, symplectic GSE, as well as circular ensembles: orthogonal COE, unitary CUE, and symplectic CSE. The choice of ensemble is based on quantum symmetries ascribed to the Hamiltonian $H$. The Hamiltonian $H$ acts on the quantum space $V$ of eigenfunctions. It is assumed that $V$ is an $N$-dimensional Hilbert space $V = F^N$, where the real, complex, or quaternion field $F = \mathbb{R}, \mathbb{C}, \mathbb{H}$ corresponds to GOE, GUE, or GSE, respectively. If the Hamiltonian matrix $H$ is hermitean, $H = H^{\dagger}$, then the probability density function of $H$ reads:

$$f_H(H) = C_{H\beta} \exp\left[-\beta \cdot \tfrac{1}{2} \cdot \mathrm{Tr}(H^2)\right], \qquad (1)$$

$$C_{H\beta} = \left(\frac{\beta}{2\pi}\right)^{\mathcal{N}_{H\beta}/2},$$

$$\mathcal{N}_{H\beta} = N + \tfrac{1}{2} N(N-1)\beta,$$

$$\int f_H(H)\, dH = 1,$$

$$dH = \prod_{i=1}^{N} \prod_{j \ge i}^{N} \prod_{\gamma=0}^{D-1} dH_{ij}^{(\gamma)},$$

$$H_{ij} = (H_{ij}^{(0)}, \ldots, H_{ij}^{(D-1)}) \in F,$$

where the parameter $\beta$ assumes the values $\beta = 1, 2, 4$ for GOE($N$), GUE($N$), GSE($N$), respectively, and $\mathcal{N}_{H\beta}$ is the number of independent matrix elements of the hermitean Hamiltonian $H$. The Hamiltonian $H$ belongs to the Lie group of hermitean $N \times N$ matrices, and the matrix Haar's measure $dH$ is invariant under transformations from the unitary group U($N$, $F$). The eigenenergies $E_i$, $i = 1, \ldots, N$, of $H$ are real-valued random variables, $E_i = E_i^{*}$. It was Eugene Wigner who first dealt with the eigenenergy level repulsion phenomenon when studying nuclear spectra [1,2,3]. RMT is now applicable in many branches of physics: nuclear physics (slow neutron resonances, highly excited complex nuclei), condensed phase physics (fine metallic particles, random Ising model [spin glasses]), quantum chaos (quantum billiards, quantum dots), disordered mesoscopic systems (transport phenomena), quantum chromodynamics, quantum gravity, field theory.

2 The Ginibre ensembles

Jean Ginibre considered another example of GRME, dropping the assumption of hermiticity of Hamiltonians and thus defining a generic $F$-valued Hamiltonian $K$ [1,2,9,10]. Hence, $K$ belongs to the general linear Lie group GL($N$, $F$), and the matrix Haar's measure $dK$ is invariant under transformations from that group. The distribution of $K$ is given by:

$$f_K(K) = C_{K\beta} \exp\left[-\beta \cdot \tfrac{1}{2} \cdot \mathrm{Tr}(K^{\dagger} K)\right], \qquad (2)$$

$$\mathcal{N}_{K\beta} = N^2 \beta,$$

$$\int f_K(K)\, dK = 1,$$

$$dK = \prod_{i=1}^{N} \prod_{j=1}^{N} \prod_{\gamma=0}^{D-1} dK_{ij}^{(\gamma)},$$

where $\beta = 1, 2, 4$ stands for the real, complex, and quaternion Ginibre ensembles, respectively. Therefore, the eigenenergies $Z_i$ of a quantum system ascribed to the Ginibre ensemble are complex-valued random variables. The eigenenergies $Z_i$, $i = 1, \ldots, N$, of the nonhermitean Hamiltonian $K$ are not real-valued random variables, $Z_i \neq Z_i^{*}$. Jean Ginibre postulated the following joint probability density function of the random vector of complex eigenvalues $Z_1, \ldots, Z_N$ for $N \times N$ Hamiltonian matrices $K$ for $\beta = 2$ [1,2,9,10]:

$$P(z_1, \ldots, z_N) = \prod_{j=1}^{N} \frac{1}{\pi \cdot j!} \cdot \prod_{i<j}^{N} |z_i - z_j|^2 \cdot \exp\left(-\sum_{j=1}^{N} |z_j|^2\right), \qquad (3)$$

where $z_i$ are complex-valued sample points ($z_i \in \mathbb{C}$).

We emphasize here Wigner and Dyson's electrostatic analogy. A Coulomb gas of $N$ unit charges moving on the complex plane (Gauss's plane) $\mathbb{C}$ is considered. The vectors of positions of charges are $z_i$ and the potential energy of the system is:

$$U(z_1, \ldots, z_N) = -\sum_{i<j} \ln|z_i - z_j| + \frac{1}{2} \sum_{i} |z_i|^2. \qquad (4)$$

If the gas is in thermodynamical equilibrium at temperature $T = \frac{1}{2 k_B}$ ($\beta = \frac{1}{k_B T} = 2$, $k_B$ is Boltzmann's constant), then the probability density function of the vectors of positions is $P(z_1, \ldots, z_N)$, Eq. (3). Therefore, complex eigenenergies $Z_i$ of the quantum system are analogous to vectors of positions of charges of the Coulomb gas. Moreover, complex-valued spacings $\Delta^1 Z_i$ of complex eigenenergies of the quantum system:

$$\Delta^1 Z_i = Z_{i+1} - Z_i, \quad i = 1, \ldots, (N-1), \qquad (5)$$

are analogous to vectors of relative positions of electric charges. Finally, complex-valued second differences $\Delta^2 Z_i$ of complex eigenenergies:

$$\Delta^2 Z_i = Z_{i+2} - 2 Z_{i+1} + Z_i, \quad i = 1, \ldots, (N-2), \qquad (6)$$

are analogous to vectors of relative positions of vectors of relative positions of electric charges.
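The electrostatic analogy can be checked numerically: with $\beta = 2$, the Boltzmann factor $\exp(-\beta U)$ built from the potential energy (4) should differ from the joint density (3) only by the normalization constant. The sketch below is a minimal illustration using the Python standard library; the function names and the randomly drawn test configurations are assumptions made here and are not part of the paper.

```python
import math
import random


def coulomb_energy(z):
    """Potential energy U of Eq. (4) for a list of complex positions z."""
    n = len(z)
    pair_term = sum(
        math.log(abs(z[i] - z[j])) for i in range(n) for j in range(i + 1, n)
    )
    confinement = 0.5 * sum(abs(zi) ** 2 for zi in z)
    return -pair_term + confinement


def log_ginibre_density(z):
    """Logarithm of the joint density P of Eq. (3)."""
    n = len(z)
    # log of prod_{j=1}^{N} 1 / (pi * j!), using lgamma(j + 1) = log(j!)
    log_norm = -sum(math.log(math.pi) + math.lgamma(j + 1) for j in range(1, n + 1))
    pair_term = sum(
        2.0 * math.log(abs(z[i] - z[j])) for i in range(n) for j in range(i + 1, n)
    )
    return log_norm + pair_term - sum(abs(zi) ** 2 for zi in z)


rng = random.Random(0)  # fixed seed so the illustration is reproducible
beta = 2.0
offsets = []
for _ in range(3):
    # Arbitrary test configuration of 5 charges (hypothetical data).
    z = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(5)]
    # log P + beta * U should not depend on the configuration.
    offsets.append(log_ginibre_density(z) + beta * coulomb_energy(z))
```

If the analogy holds, all entries of `offsets` coincide up to rounding; their common value is the logarithm of the normalization constant in (3).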

The eigenenergies $Z_i = Z(i)$ can be treated as values of a function $Z$ of the discrete parameter $i = 1, \ldots, N$. The "Jacobian" of $Z_i$ reads:

$$\mathrm{Jac}\, Z_i = \frac{\partial Z_i}{\partial i} \simeq \frac{\Delta^1 Z_i}{\Delta^1 i} = \Delta^1 Z_i. \qquad (7)$$

We readily have that the spacing is a discrete analog of the Jacobian, since the indexing parameter $i$ belongs to the discrete space of indices, $i \in I = \{1, \ldots, N\}$. Therefore, the first derivative with respect to $i$ reduces to the first differential quotient. The Hessian is a Jacobian applied to the Jacobian. We immediately have the formula for the discrete "Hessian" for the eigenenergies $Z_i$:

$$\mathrm{Hess}\, Z_i = \frac{\partial^2 Z_i}{\partial i^2} \simeq \frac{\Delta^2 Z_i}{(\Delta^1 i)^2} = \Delta^2 Z_i. \qquad (8)$$

Thus, the second difference of $Z$ is a discrete analog of the Hessian of $Z$. One emphasizes that both "Jacobian" and "Hessian" work on the discrete index space $I$ of indices $i$. The finite differences of order higher than two are discrete analogs of compositions of "Jacobians" with "Hessians" of $Z$.
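The statement that the discrete "Hessian" is the "Jacobian" applied to the "Jacobian" can be made concrete in a few lines. The following sketch uses only the Python standard library; the function names and the list `levels` of complex values are hypothetical and serve only as illustration. The paper does not say how complex eigenenergies are to be labelled, so the order of the list is taken as given.

```python
def first_difference(values):
    """Delta^1 Z_i = Z_{i+1} - Z_i, as in Eq. (5)."""
    return [values[i + 1] - values[i] for i in range(len(values) - 1)]


def second_difference(values):
    """Delta^2 Z_i = Z_{i+2} - 2 Z_{i+1} + Z_i, as in Eq. (6)."""
    return [
        values[i + 2] - 2 * values[i + 1] + values[i]
        for i in range(len(values) - 2)
    ]


# Hypothetical complex "eigenenergies", labelled in the order listed.
levels = [0.3 + 0.1j, -0.5 + 0.7j, 1.1 - 0.2j, 0.4 + 0.9j, -0.8 - 0.6j]

d1 = first_difference(levels)       # discrete "Jacobian", N - 1 values
d2 = second_difference(levels)      # discrete "Hessian", N - 2 values
d2_composed = first_difference(d1)  # "Jacobian" applied to the "Jacobian"

# For Z(i) = i^2 the second difference equals the second derivative, 2.
quadratic = [complex(i * i) for i in range(1, 8)]
hess_quadratic = second_difference(quadratic)
```

Here `d2` and `d2_composed` agree term by term, and `hess_quadratic` is constant, mirroring the continuous case.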

The eigenenergies $E_i$, $i \in I$, of the hermitean Hamiltonian $H$ are real-valued random variables ordered increasingly. They are values of the discrete function $E_i = E(i)$. The first differences of adjacent eigenenergies:

$$\Delta^1 E_i = E_{i+1} - E_i, \quad i = 1, \ldots, (N-1), \qquad (9)$$

are analogous to vectors of relative positions of electric charges of a one-dimensional Coulomb gas. Each is simply the spacing of two adjacent energies. Real-valued second differences $\Delta^2 E_i$ of eigenenergies:

$$\Delta^2 E_i = E_{i+2} - 2 E_{i+1} + E_i, \quad i = 1, \ldots, (N-2), \qquad (10)$$

are analogous to vectors of relative positions of vectors of relative positions of charges of a one-dimensional Coulomb gas. The $\Delta^2 Z_i$ have their real parts $\mathrm{Re}\, \Delta^2 Z_i$ and imaginary parts $\mathrm{Im}\, \Delta^2 Z_i$, as well as radii (moduli) $|\Delta^2 Z_i|$ and main arguments (angles) $\mathrm{Arg}\, \Delta^2 Z_i$. The $\Delta^2 Z_i$ are extensions of the real-valued second differences:

$$\Delta^2 E_i = E_{i+2} - 2 E_{i+1} + E_i, \quad i = 1, \ldots, (N-2), \qquad (11)$$

of adjacent, increasingly ordered, real-valued eigenenergies $E_i$ of the Hamiltonian $H$ defined for GOE, GUE, GSE, and the Poisson ensemble PE (where the Poisson ensemble is composed of uncorrelated randomly distributed eigenenergies) [11,12,13,14,15]. The Jacobian and Hessian operators of the energy function $E(i) = E_i$ for these ensembles read:

$$\mathrm{Jac}\, E_i = \frac{\partial E_i}{\partial i} \simeq \frac{\Delta^1 E_i}{\Delta^1 i} = \Delta^1 E_i,$$

and

$$\mathrm{Hess}\, E_i = \frac{\partial^2 E_i}{\partial i^2} \simeq \frac{\Delta^2 E_i}{(\Delta^1 i)^2} = \Delta^2 E_i.$$

The treatment of first and second differences of eigenenergies as discrete analogs of Jacobians and Hessians allows one to consider these eigenenergies as magnitudes with statistical properties studied in the discrete space of indices. The labelling index $i$ of the eigenenergies is an additional variable of "motion", hence the space of indices $I$ augments the space of dynamics of random magnitudes.

Acknowledgements

It is my pleasure to most deeply thank Professor Antoni Ostoja-Gajewski for continuous help. I also thank Professor Wlodzimierz Wojcik for his giving me access to computer facilities.

References

1. F. Haake, Quantum Signatures of Chaos (Springer-Verlag, Berlin Heidelberg New York 1990), Chapters 1, 3, 4, 8, pp 1-11, 33-77, 202-213.

2. T. Guhr, A. Müller-Groeling and H. A. Weidenmüller, Phys. Rept. 299, 189-425 (1998).

3. M. L. Mehta, Random matrices (Academic Press, Boston 1990), Chapters 1, 2, 9, pp 1-54, 182-193.

4. L. E. Reichl, The Transition to Chaos In Conservative Classical Systems: Quantum Manifestations (Springer-Verlag, New York, 1992), Chapter 6, p. 248.

5. O. Bohigas, in Proceedings of the Les Houches Summer School on Chaos and Quantum Physics (North-Holland, Amsterdam, 1991), p. 89.

6. C. E. Porter, Statistical Theories of Spectra: Fluctuations (Academic Press, New York, 1965).

7. T. A. Brody, J. Flores, J. B. French, P. A. Mello, A. Pandey and S. S. M. Wong, Rev. Mod. Phys. 53, 385 (1981).

8. C. W. J. Beenakker, Rev. Mod. Phys. 69, 731 (1997).

9. J. Ginibre, J. Math. Phys. 6, 440 (1965).

10. M. L. Mehta, Random matrices (Academic Press, Boston 1990), Chapter 15, pp 294-310.

11. M. M. Duras and K. Sokalski, Phys. Rev. E 54, 3142 (1996).

12. M. M. Duras, Finite difference and finite element distributions in statistical theory of energy levels in quantum systems (PhD thesis, Jagellonian University, Cracow 1996).

13. M. M. Duras and K. Sokalski, Physica D125, 260 (1999).

14. M. M. Duras, Description of Quantum Systems by Random Matrix Ensembles of Large Dimensions, in Proceedings of the Sixth International Conference on Squeezed States and Uncertainty Relations, 24 May-29 May 1999, Naples, Italy (NASA, Greenbelt, Maryland, at press 2000).

15. M. M. Duras, J. Opt. B: Quantum Semiclass. Opt. 2, 287 (2000).


SOME REMARKS ON HARDY FUNCTIONS ASSOCIATED WITH DIRICHLET SERIES

W. EHM

Institut für Grenzgebiete der Psychologie und Psychohygiene, Wilhelmstrasse 3a, 79098 Freiburg, Germany
E-mail: [email protected]

A simple method of associating a Hardy function with a Dirichlet series is described and applied to some examples connected with the Riemann zeta function. The theory of Hardy functions then is used to derive "integral tests" of the Riemann hypothesis generalizing a recent result of Balazard, Saias and Yor [1].

1 Introduction

The most famous example of a Dirichlet series $f(z) = \sum_{n=1}^{\infty} a_n n^{-z}$ converging absolutely in the half plane $\Re z > 1$ is the Riemann zeta function $\zeta(z)$, which has all coefficients $a_n = 1$. It has a simple pole at $z = 1$ and can be extended as a meromorphic function with no other singularities to the whole complex plane [6].

A simple method of associating a Hardy function with a Dirichlet series of that kind consists in multiplying $f(z)$ by $(z-1)/z^2$: the factor $(z-1)/z$ removes the pole at $z = 1$, and the division by $z$ achieves square integrability along vertical lines. Moreover, the zeros of $f(z)$ remain unchanged by this modification. The motivation for passing from $f(z)$ to $f(z)(z-1)/z^2$ is to utilize the theory of Hardy functions, especially factorization of Hardy functions, for the study of the zeta function.

In section 2 of this note we give conditions under which the function $f(z)(z-1)/z^2$ has an analytic continuation, as a Hardy function, beyond the abscissa of convergence of the Dirichlet series $f(z)$. The criterion is tested on three examples, all related to the Riemann zeta function. Factorization of the Hardy function $\zeta(z)(z-1)/z^2$, which is briefly discussed in section 3, is used in section 4 to derive some "integral tests" of the Riemann hypothesis. The content of the Riemann hypothesis, hereafter abbreviated "RH", is Riemann's yet unproven conjecture that all non-real zeros of the $\zeta$ function lie on the line $\Re z = 1/2$ in the complex plane. It has received increasing interest among physicists since the discovery of striking similarities in the distribution of the zeros of the zeta function and the spectrum of large random matrices [2].

The idea to utilize Hardy functions in connection with the zeta function, including "integral tests" of the Riemann hypothesis, is not new. See the recent article of Balazard, Saias and Yor [1], who initially work with Hardy functions in the disc, then pass to the half plane $\Re z > 1/2$ by conformal mapping. In our approach based on the function $\zeta(z)(z-1)/z^2$, which also appears in recent work of Burnol [4], we deal with half plane Hardy functions from the beginning. This leads to somewhat more general results in a natural fashion.

2 "Hardyfication" of Dirichlet series

The basic result of this section is the following.

Theorem. Given a Dirichlet series $f(z) = \sum_{n=1}^{\infty} a_n n^{-z}$ with a finite abscissa of convergence, let functions $A$ and $\phi$ be defined by

$$A(x) = \sum_{1 \le n \le x} a_n, \qquad \phi(x) = \sum_{1 \le n \le e^x} a_n (1 - x + \log n) \qquad (x \in \mathbb{R}). \qquad (1)$$

Suppose that $A(x) = O(x)$ as $x \to \infty$, and let

$$\lambda = \limsup_{N \to \infty} \frac{\log^{+} |D_N|}{\log N}, \quad \text{where} \quad D_N = A(N) - \sum_{n=1}^{N-1} \frac{A(n)}{n}. \qquad (2)$$

Then the function $f(z)(z-1)/z^2$ can be represented as the Laplace transform of $\phi(x)$ in the half plane $\Re z > \lambda$,

$$f(z)(z-1)/z^2 = \int_0^{\infty} e^{-zx} \phi(x)\, dx \qquad (\Re z > \lambda). \qquad (3)$$

Proof. Fix an integer $N \ge 1$ and let $\log N \le x < \log(N+1)$. Then

$$|\phi(x) - \phi(\log N)| = |(x - \log N) A(N)| \le |A(N)| \log\frac{N+1}{N} = O(1)$$

as $N \to \infty$, by the assumed growth behavior of $A(x)$. Combining this with

$$\phi(\log(n+1)) - \phi(\log n) = a_{n+1} - A(n) \log\frac{n+1}{n} = a_{n+1} - A(n)/n + O(n^{-1}),$$

we get for $N = [e^x] \to \infty$

$$\phi(x) = a_1 + \sum_{n=1}^{N-1} \left[\phi(\log(n+1)) - \phi(\log n)\right] + O(1) = a_1 + \sum_{n=1}^{N-1} \left[a_{n+1} - A(n)/n + O(n^{-1})\right] + O(1) = D_N + O(\log N),$$

and thus for every $\epsilon > 0$, $\phi(x) = O(e^{x(\lambda + \epsilon)})$, $x \uparrow \infty$, by the definition of $\lambda$. Since $\phi$ vanishes on the left half line it follows that the integral on the right-hand side of (3) converges absolutely in the half plane $\Re z > \lambda$. It remains to show that this Laplace transform coincides with $f(z)(z-1)/z^2$ in the half plane $\Re z > \sigma_a$, where $\sigma_a$ denotes the abscissa of absolute convergence of $f(z)$.

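To that end let us write $\eta(z) = f(z)(z-1)/z^2$ and introduce truncated versions

$$f_N(z) = \sum_{n=1}^{N} a_n n^{-z}, \qquad \eta_N(z) = f_N(z)(z-1)/z^2,$$

$$\phi_N(x) = \sum_{1 \le n \le \min(N, e^x)} a_n (1 - x + \log n),$$

$N \ge 1$, and set $h_{\sigma,N}(x) = e^{-\sigma x} \phi_N(x)$. Using

$$\frac{1}{2\pi} \int_{-\infty}^{\infty} \frac{e^{itx}}{(\sigma + it)^q}\, dt = \begin{cases} \dfrac{x^{q-1} e^{-\sigma x}}{(q-1)!} & \text{if } x > 0 \\ 0 & \text{if } x < 0 \end{cases}$$

(for every integer $q \ge 1$, $\sigma > 0$) we get for fixed $\sigma > \sigma_a$

$$\frac{1}{2\pi} \int_{-\infty}^{\infty} e^{itx} \eta_N(\sigma + it)\, dt \qquad (4)$$

$$= \frac{1}{2\pi} \int_{-\infty}^{\infty} e^{itx} \sum_{n=1}^{N} a_n n^{-\sigma - it}\, \frac{\sigma + it - 1}{(\sigma + it)^2}\, dt$$

$$= \frac{1}{2\pi} \int_{-\infty}^{\infty} \sum_{n=1}^{N} a_n n^{-\sigma} e^{it(x - \log n)} \left[\frac{1}{\sigma + it} - \frac{1}{(\sigma + it)^2}\right] dt$$

$$= \sum_{1 \le n \le \min(N, e^x)} a_n n^{-\sigma} e^{-\sigma(x - \log n)} \left(1 - (x - \log n)\right) = h_{\sigma,N}(x)$$

almost everywhere in $x \in \mathbb{R}$, the Fourier integrals being understood in the $L^2$ sense. Note that $\eta(z)$ is square integrable along every line $\Re z = \sigma$ with $\sigma > \sigma_a$. Clearly $\eta_N(\sigma + it)$ converges to $\eta(\sigma + it)$ in $L^2(dt)$, so $h_{\sigma,N}$ is a Cauchy sequence in $L^2(dx)$, by Parseval's formula. The pointwise limit $h_\sigma(x)$ of $h_{\sigma,N}(x)$ then also is the $L^2(dx)$ limit, so that by (4) $h_\sigma(x)$ and $\eta(\sigma + it)$ represent a Fourier transform pair for every $\sigma > \sigma_a$. Therefore,

$$\eta(\sigma + it) = \hat{h}_\sigma(t) = \int_0^{\infty} h_\sigma(x) e^{-ixt}\, dx = \int_0^{\infty} e^{-(\sigma + it)x} \phi(x)\, dx \qquad (5)$$

holds almost everywhere in $t$ ($\sigma > \sigma_a$), hence everywhere in $\Re z > \sigma_a$, by continuity. This shows that the Laplace transform of $\phi$ represents the analytic continuation of $\eta$ to the region $\Re z > \lambda$, completing the proof.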
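The representation (3) can be checked numerically in the simplest case $a_n = 1$, $f = \zeta$, at the real point $z = 2$, where $\zeta(2)(2-1)/2^2 = \pi^2/24$. The sketch below is an illustration only: it uses the fact, implicit in the proof, that $\phi$ is linear on each interval $[\log N, \log(N+1))$, integrates $e^{-zx}\phi(x)$ exactly piece by piece, and truncates the integral at $\log(n_{\max}+1)$. The function name and the cutoff `n_max` are choices made here, not part of the paper.

```python
import math


def laplace_of_phi(z, n_max=100_000):
    """Integral of exp(-z x) * phi(x) over [0, log(n_max + 1)] for a_n = 1, real z > 0.

    On [log N, log(N + 1)) one has phi(x) = c0 + c1 * x with
    c1 = -A(N) = -N and c0 = A(N) + sum_{n <= N} log n.
    """
    pieces = []
    log_factorial = 0.0  # running value of sum_{n <= N} log n
    for big_n in range(1, n_max + 1):
        log_factorial += math.log(big_n)
        c0 = big_n + log_factorial
        c1 = -float(big_n)
        lo, hi = math.log(big_n), math.log(big_n + 1)

        def antiderivative(x):
            # Its derivative with respect to x is exp(-z x) * (c0 + c1 x).
            return -math.exp(-z * x) * ((c0 + c1 * x) / z + c1 / z ** 2)

        pieces.append(antiderivative(hi) - antiderivative(lo))
    return math.fsum(pieces)


approx = laplace_of_phi(2.0)
exact = math.pi ** 2 / 24  # zeta(2) * (2 - 1) / 2**2
```

The neglected tail beyond $\log(n_{\max}+1)$ is tiny at $z = 2$ because $\phi$ grows only linearly in this example, so `approx` and `exact` should agree to many digits.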

Let $\mathcal{H}^2_\sigma$ denote the Hardy space consisting of all functions $g(z)$ which are analytic for $\Re z > \sigma$ and such that $\sup_{\sigma' > \sigma} \int_{-\infty}^{\infty} |g(\sigma' + it)|^2\, dt < \infty$. The growth behavior of $\phi(x)$ established in the proof implies $h_\sigma \in L^2$ for every $\sigma > \lambda$, so that by (5) and Parseval's formula we obtain the following.

Corollary. Under the conditions of the theorem, the function $f(z)(z-1)/z^2$ belongs to every Hardy space $\mathcal{H}^2_\sigma$, $\sigma > \lambda$.

Example 1. Let $a_n = 1$ for all $n$, that is $f(z) = \zeta(z)$. Then $D_N = 1$, $N \ge 1$, so that $\lambda = 0$. A more careful analysis shows that $\phi(x)$ is nonnegative and grows linearly as $x$ tends to infinity. Consequently, $\zeta(z)(z-1)/z^2$ is a member of every Hardy space $\mathcal{H}^2_\sigma$, $\sigma > 0$, but not of $\mathcal{H}^2_0$. The nonnegativity allows one to associate with $\phi$ an exponential family $\mathcal{P} = \{p_\sigma, \sigma > 0\}$ of probability densities with support $[0, \infty)$ by setting

$$p_\sigma(x) = h_\sigma(x)/\eta(\sigma) = \phi(x) e^{-\sigma x}/\eta(\sigma) \qquad (x \in \mathbb{R},\ \sigma > 0). \qquad (6)$$

The function $\zeta(z)(z-1)/z^2$ was also considered by Burnol [4], in connection with a closure problem in function space known as the "Nyman - Beurling real variable form of the Riemann hypothesis".

It may be interesting to note here that although $h_\sigma$ is square integrable for every $\sigma > 0$ it is not true that $h_{\sigma,N} \to h_\sigma$ in $L^2$ if $\sigma \le 1$. In fact we have

$$\liminf_{N \to \infty} \|h_{\sigma,N} - h_\sigma\|_2 > 0, \qquad 0 < \sigma \le 1. \qquad (7)$$

Proof. Note first that for $x > \log N \to \infty$

$$\phi(x) - \phi_N(x) = \sum_{N < n \le e^x} (1 - x + \log n) = (1 - x)([e^x] - N) + \log([e^x]!) - \log N! \qquad (8)$$

$$= (1 - x)([e^x] - N) + ([e^x] + \tfrac{1}{2}) \log [e^x] - [e^x] - (N + \tfrac{1}{2}) \log N + N + O(1)$$

$$= (N + \tfrac{1}{2})(\log [e^x] - \log N) + ([e^x] - N)(\log [e^x] - x) + O(1)$$

$$= (N + \tfrac{1}{2})(x - \log N) + O(1),$$

on using Stirling's formula and the inequalities $0 \le x - \log [e^x] \le 2 e^{-x}$ ($x > 0$). The estimate (8) shows that there exists a finite constant $B > 0$ such that $\phi(x) - \phi_N(x) \ge N(x - \log N)$ for all large $N$ and $x \ge B + \log N$. Therefore

$$\|h_{\sigma,N} - h_\sigma\|_2^2 \ge \int_{B + \log N}^{\infty} (\phi(x) - \phi_N(x))^2 e^{-2\sigma x}\, dx \ge N^2 \int_{B + \log N}^{\infty} (x - \log N)^2 e^{-2\sigma x}\, dx = N^{2 - 2\sigma} \int_B^{\infty} y^2 e^{-2\sigma y}\, dy$$

for all large $N$, and assertion (7) follows.

Example 2. Let $f(z) = \sum_p p^{-z} \log p$, where the sum extends over all prime numbers. This example is related to the logarithmic derivative of the zeta function, as may be seen from the product representation $\zeta(z) = \prod_p (1 - p^{-z})^{-1}$. For $\Re z > 1$,

$$-\frac{\zeta'(z)}{\zeta(z)} = \sum_p \frac{\log p}{p^z - 1} = f(z) + \sum_p \frac{\log p}{p^z (p^z - 1)},$$

and since the last series converges for $\Re z > 1/2$, it suffices to consider $f(z)$ as far as the analytic continuation of $\zeta'(z)/\zeta(z)$ is concerned.

The series $f(z)$ would have convergence abscissa 1/2, implying the RH, if the associated sequence $D_N$ satisfied condition (2) with $\lambda = 1/2$. For a numerical check we computed $D_N$ for $N$ up to 5 million. A plot of $\log^{+} |D_N| / \log N$ versus $\log N$ (thinned out to every 200th data point; the general picture is not affected thereby) is shown in Figure 1 (a). Within the considered range, the observed behavior is well in accordance with a possible value of $\lambda = 1/2$. Notice the obvious connection with the classical criterion saying that the RH is equivalent to the error estimate $\sum_{p \le x} \log p - x = O(x^{1/2 + \epsilon})$ ($\forall\, \epsilon > 0$) in the prime number theorem (Edwards [6], Sect. 5.5). Incidentally, $\phi(x)$ seems to be nonnegative in this case, too, as a plot of $\phi(x)$ for small $x$-values indicates.

Example 3. Let $f(z) = 1/\zeta(z) = \sum_{n=1}^{\infty} \mu(n) n^{-z}$, with $\mu$ the Möbius function. It is well-known that the RH is equivalent to the condition $A(N) = \sum_{n \le N} \mu(n) = O(N^{1/2 + \epsilon})$ (for every $\epsilon > 0$), that is to $\lambda = 1/2$. The analogous plot for this case is shown in Figure 1 (b), with similar findings.
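The numerical check described in Example 2 can be sketched as follows. This is not the author's program: it is a minimal standard-library illustration with hypothetical function names, run only up to $N = 10^4$ rather than 5 million, and it takes $D_N = A(N) - \sum_{n=1}^{N-1} A(n)/n$ as in condition (2) of the theorem.

```python
import math


def d_sequence(coefficients):
    """Return [D_1, ..., D_N] with D_N = A(N) - sum_{n=1}^{N-1} A(n)/n."""
    result = []
    partial_sum = 0.0   # A(n) = a_1 + ... + a_n
    weighted_sum = 0.0  # sum_{m < n} A(m) / m
    for n, a_n in enumerate(coefficients, start=1):
        partial_sum += a_n
        result.append(partial_sum - weighted_sum)
        weighted_sum += partial_sum / n
    return result


def prime_log_coefficients(n_max):
    """Coefficients of Example 2: a_n = log n if n is prime, else 0 (n_max >= 2)."""
    is_prime = [True] * (n_max + 1)
    is_prime[0] = is_prime[1] = False
    for p in range(2, math.isqrt(n_max) + 1):
        if is_prime[p]:
            for multiple in range(p * p, n_max + 1, p):
                is_prime[multiple] = False
    return [math.log(n) if is_prime[n] else 0.0 for n in range(1, n_max + 1)]


def criterion(d_n, n):
    """log^+|D_N| / log N, with log^+ t = max(log t, 0); requires n >= 2."""
    if d_n == 0:
        return 0.0
    return max(math.log(abs(d_n)), 0.0) / math.log(n)


# Example 1 as a sanity check: a_n = 1 gives D_N = 1 for every N.
d_zeta = d_sequence([1.0] * 1000)

# Example 2 on a small range.
n_max = 10_000
d_primes = d_sequence(prime_log_coefficients(n_max))
value_at_n_max = criterion(d_primes[-1], n_max)
```

Plotting `criterion(d_primes[N - 1], N)` against `math.log(N)` over a much larger range would give the kind of picture shown in Figure 1 (a); replacing the coefficients by values of the Möbius function would give the analogue for Example 3.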

3 Factorization of $\eta$

From now on we shall restrict attention to the case $f = \zeta$. For brevity we write $\eta(z) = \zeta(z)(z-1)/z^2$ throughout the sequel. Recall from the previous section that $\eta$ belongs to every Hardy space $\mathcal{H}^2_\sigma$, $\sigma > 0$. Being a Hardy function, $\eta$ admits a useful factorization, some applications of which will be discussed in the next section. The zeros of $\eta$ in the right half plane $\Re z > 0$, which coincide with the non-trivial zeros of the zeta function, are generically denoted by $\rho$. The $\rho$'s are known to lie symmetrically with respect to both the real axis and the critical line $\Re z = 1/2$. That is, whenever $\rho$ is a zero then so are the mirror images $\bar{\rho}$, $1 - \rho$, and $1 - \bar{\rho}$.

Figure 1: Convergence abscissa of Laplace transform equal to 1/2? Plot of criterion $\log^{+} |D_N| / \log N$ versus $\log N$, for (a): Example 2; (b): Example 3.

Let $\sigma > 0$ be fixed. According to the factorization theorem for Hardy functions (see e.g. Dym and McKean [5] (ch. 2.7) or Hoffman [8] (p. 132, 133)) $\eta$ can be represented as the product of an outer and an inner function on the half plane $\Re z > \sigma$. More precisely,

$$\eta(z) = H_\sigma(z) B_\sigma(z) \qquad (\Re z > \sigma), \qquad (9)$$

where the outer function is given by

$$H_\sigma(z) = \exp\left\{ \frac{1}{\pi} \int_{-\infty}^{\infty} \log |\eta(\sigma + it)|\, \frac{t(z - \sigma) + i}{t + i(z - \sigma)}\, \frac{dt}{1 + t^2} \right\} \qquad (10)$$

and the inner function reduces in the present case to a Blaschke product $B_\sigma$ which is composed of the zeros $\rho$ of $\eta$ with $\Re \rho > \sigma$ and their mirror images after reflection at the line $\Re z = \sigma$, $2\sigma - \bar{\rho}$. Explicitly,

$$B_\sigma(z) = \prod_{\Re \rho > \sigma} \frac{z - \rho}{z - (2\sigma - \bar{\rho})} \cdot \frac{|1 - (\rho - \sigma)^2|}{1 - (\rho - \sigma)^2}. \qquad (11)$$

These formulae are easily obtained from the familiar ones for the half plane $\Re z > 0$ [8] by shifting both the complex variable and the zeros by $\sigma$. The inner factor simplifies to a Blaschke product for the following reasons: (i) $\eta$ has an analytic continuation across the line $\Re z = \sigma$ to the entire right half plane, so that there is no singular factor; (ii) the constant $c$ appearing in the general factorization formula reduces to unity because $B_\sigma(\sigma) = 1$ and $H_\sigma(\sigma) = \eta(\sigma)$, as is readily verified. For real arguments $z = s$, taking first logarithms, then real parts on both sides of (9) one obtains for $s > \sigma > 0$

$$\log \eta(s) = \frac{1}{\pi} \int_{-\infty}^{\infty} \log |\eta(\sigma + it)|\, \frac{s - \sigma}{(s - \sigma)^2 + t^2}\, dt + \sum_{\Re \rho > \sigma} \log \left| \frac{s - \rho}{s - (2\sigma - \bar{\rho})} \right|. \qquad (12)$$

Note that $\eta(s)$ is positive for $s > 0$, being the Laplace transform of a nonnegative function.

4 Applications

The factorization of $\eta$ gives rise to various "tests" of the RH. A first example is obtained by setting $\sigma = 1/2$ in (12). The sum on the right-hand side of (12) vanishes if and only if $\zeta(z)$ has no zero within the region $\Re z > 1/2$. Therefore, the RH is true if and only if for some (and then for all) $s > 1/2$

$$\frac{1}{\pi} \int_{-\infty}^{\infty} \log |\eta(\tfrac{1}{2} + it)|\, \frac{s - \tfrac{1}{2}}{(s - \tfrac{1}{2})^2 + t^2}\, dt = \log \eta(s). \qquad (13)$$

This criterion is equivalent to the condition that $\eta(z)$ be an outer function for the half plane $\Re z > 1/2$; cf. Dym and McKean [5], Sect. 2.7. For $s = 1$ it assumes a particularly neat form. The right-hand side vanishes, and the left-hand side can be simplified, and one gets the following criterion for the truth of the RH due to Balazard, Saias and Yor [1],

$$\int_{-\infty}^{\infty} \frac{\log |\zeta(\tfrac{1}{2} + it)|}{\tfrac{1}{4} + t^2}\, dt = 0. \qquad (14)$$

Another example results from the formula

$$(\log \eta)'(\sigma) = \frac{1}{\pi} \int_{-\infty}^{\infty} \log\left[ |\eta(\sigma + it)| / \eta(\sigma) \right] \frac{dt}{t^2} - 2 \sum_{\Re \rho > \sigma} \Re\, (\rho - \sigma)^{-1} \qquad (15)$$

($\sigma > 0$), which can be derived from (12) by subtracting $\log \eta(\sigma)$ on both sides, dividing by $s - \sigma$, and then taking the limit $s \downarrow \sigma$. The interchange of limits and integration (or summation) can be justified by dominated convergence.

Putting $\sigma = 1/2$ in (15) one obtains the following differential version of the "integral tests" (13), (14). The RH is true if and only if

$$\frac{1}{\pi} \int_{-\infty}^{\infty} \log\left[ |\eta(\tfrac{1}{2} + it)| / \eta(\tfrac{1}{2}) \right] \frac{dt}{t^2} = (\log \eta)'(\tfrac{1}{2}). \qquad (16)$$

This statement can be amplified in various ways. First, it is possible to evaluate $(\log \eta)'(\tfrac{1}{2})$ explicitly, $(\log \eta)'(\tfrac{1}{2}) = \frac{\gamma}{2} + \frac{1}{2} \log(8\pi) + \frac{\pi}{4} - 6$, and for $\sigma = 1/2$ the sum in (15) can be written in a more symmetric form. One thus obtains the relation

$$\int_{-\infty}^{\infty} \log \left| \frac{\eta(\tfrac{1}{2} + it)}{\eta(\tfrac{1}{2})} \right| \frac{dt}{\pi t^2} - \left( \frac{\gamma}{2} + \frac{1}{2} \log(8\pi) + \frac{\pi}{4} - 6 \right) = \sum_{\rho} \frac{|\Re \rho - \tfrac{1}{2}|}{|\rho - \tfrac{1}{2}|^2} \qquad (17)$$

in which the sum extends over all zeros in the critical strip. Note that (17) quantifies the difference between the two sides of (16) as a weighted sum of the absolute deviations of the real parts of the zeros from 1/2.

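Secondly, there is a connection with logarithmic Hilbert transforms, also called logarithmic dispersion relations [3]. Suppose we had $\eta(z) \neq 0$ for $\Re z > 1/2$. Then $\eta$ itself would be an outer function,

$$\eta(z) = H_{1/2}(z) = \exp\left\{ \frac{1}{\pi} \int_{-\infty}^{\infty} \log |\eta(\tfrac{1}{2} + it)|\, \frac{t(z - \tfrac{1}{2}) + i}{t + i(z - \tfrac{1}{2})}\, \frac{dt}{1 + t^2} \right\}.$$

Taking imaginary parts in this equation one can show with a little algebra that for $z - 1/2 = a + ib$, $a > 0$, one then has

$$\Im \log \eta(z) = \frac{1}{\pi} \int_{-\infty}^{\infty} \left( \log |\eta(\tfrac{1}{2} + it)| - \log |\eta(\tfrac{1}{2} + ib)| \right) \frac{dt}{t - b} - \frac{1}{\pi} \int_{-\infty}^{\infty} \frac{\log |\eta(\tfrac{1}{2} + it)| - \log |\eta(\tfrac{1}{2} + ib)|}{t - b}\, \frac{a^2}{a^2 + (t - b)^2}\, dt. \qquad (18)$$

Fix any $b > 0$ such that $\eta(\tfrac{1}{2} + ib) \neq 0$. Then the last term in (18) converges to zero as $a \downarrow 0$. Therefore, using the fact that $|\eta(\tfrac{1}{2} + it)|$ is an even function of $t$, one obtains in the limit the logarithmic dispersion relation

$$\Im \log \eta(\tfrac{1}{2} + ib) = \frac{2b}{\pi} \int_0^{\infty} \frac{\log |\eta(\tfrac{1}{2} + it)| - \log |\eta(\tfrac{1}{2} + ib)|}{t^2 - b^2}\, dt, \qquad (19)$$

which expresses the phase of $\eta$ on the boundary $\Re z = 1/2$ as an integral of its log modulus along that line. Recall that this relation is a consequence of the assumed outer function character of $\eta$, that is, of the RH. In fact, the validity of (19) for every $b > 0$ such that $\eta(\tfrac{1}{2} + ib) \neq 0$ is also sufficient for the RH. To see this divide both sides of (19) by $b$ and let $b \downarrow 0$. Then the left side tends to $(\log \eta)'(\tfrac{1}{2})$, the right side to $\frac{2}{\pi} \int_0^{\infty} \log\left[ |\eta(\tfrac{1}{2} + it)| / \eta(\tfrac{1}{2}) \right] \frac{dt}{t^2}$, so in the limit we get the condition (16) shown above to be equivalent to the RH.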

Finally we note that $-(\log \eta)'(\sigma)$ equals the first moment of the probability density $p_\sigma$, cp. (6). In view of (16) and (15), this raises the question whether the integral term in these relations admits of a probabilistic interpretation, too. Relevant to this question is the observation going back to Khintchine that for every $\sigma > 1$ the function $f_\sigma(t) = \zeta(\sigma + it)/\zeta(\sigma)$ is the characteristic function of an infinitely divisible distribution; cf. Example 6, p. 75, in Gnedenko and Kolmogorov [7]. This can be verified by rewriting the product representation of the zeta function (for $\sigma > 1$) in the form

$$\frac{\zeta(\sigma + it)}{\zeta(\sigma)} = \prod_p \frac{1 - p^{-\sigma}}{1 - p^{-\sigma - it}} = \exp\left\{ \sum_p \sum_{n=1}^{\infty} \frac{1}{n p^{n\sigma}} \left( e^{-itn \log p} - 1 \right) \right\} \qquad (20)$$

and noting that $f_\sigma(t)$ is thus represented as a product of terms of the form $\exp(a(e^{ibt} - 1))$, each of which is the characteristic function of a Poisson random variable with intensity $a$ and values in the lattice $kb$, $k = 0, 1, 2, \ldots$.
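The identity behind (20) holds prime by prime, which makes it easy to test numerically. The sketch below compares, for a few primes, the Euler factor ratio with the exponential of the corresponding truncated Poisson-type sum. It is an illustration with hypothetical function names; the truncation at `n_terms` and the test values $\sigma = 1.5$, $t = 2$ are arbitrary choices.

```python
import cmath
import math


def euler_factor_ratio(p, sigma, t):
    """(1 - p^(-sigma)) / (1 - p^(-sigma - i t)) for a single prime p."""
    return (1 - math.exp(-sigma * math.log(p))) / (
        1 - cmath.exp(-(sigma + 1j * t) * math.log(p))
    )


def compound_poisson_factor(p, sigma, t, n_terms=60):
    """exp( sum_{n=1}^{n_terms} (e^{-i t n log p} - 1) / (n p^{n sigma}) )."""
    log_p = math.log(p)
    exponent = sum(
        (cmath.exp(-1j * t * n * log_p) - 1) * math.exp(-n * sigma * log_p) / n
        for n in range(1, n_terms + 1)
    )
    return cmath.exp(exponent)


sigma, t = 1.5, 2.0
discrepancies = [
    abs(euler_factor_ratio(p, sigma, t) - compound_poisson_factor(p, sigma, t))
    for p in (2, 3, 5, 7)
]
```

Each term of the inner sum has the form $a(e^{ibt} - 1)$ with $a = (n p^{n\sigma})^{-1}$ and $b = -n \log p$, which is exactly the Poisson structure referred to in the text; the entries of `discrepancies` should vanish up to rounding.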

In order to connect this fact with the above question it is convenient to introduce the Levy measure $F_\sigma$, which puts mass $(n p^{n\sigma})^{-1}$ at each of the points $-\log p^n$, $n \ge 1$, $p$ prime. Then (20) becomes $\log \frac{\zeta(\sigma + it)}{\zeta(\sigma)} = \int (e^{itx} - 1)\, F_\sigma(dx)$, so taking real parts in this equation and using $\int_{-\infty}^{\infty} (1 - \cos tx)/t^2\, dt = \pi |x|$ ($x \in \mathbb{R}$) one obtains

$$\frac{1}{\pi} \int_{-\infty}^{\infty} \log\left[ |\zeta(\sigma + it)| / \zeta(\sigma) \right] \frac{dt}{t^2} = \frac{1}{\pi} \int_{-\infty}^{\infty} \int (\cos tx - 1)\, F_\sigma(dx)\, \frac{dt}{t^2}$$

$$= \int \int_{-\infty}^{\infty} (\cos tx - 1)\, \frac{dt}{\pi t^2}\, F_\sigma(dx) = -\int |x|\, F_\sigma(dx) = \int x\, F_\sigma(dx).$$

Thus we find that the essential part of the integral in question equals the first moment of the Levy measure $F_\sigma$. The other part stemming from the factor $(z-1)/z^2$ can be incorporated by introducing a signed, absolutely continuous measure $G_\sigma$ with density $x^{-1} \left[ 2 e^{\sigma x} - e^{(\sigma - 1)x} \right]$ on $(-\infty, 0)$ (zero on $[0, \infty)$). One then has

$$\log \frac{\eta(\sigma + it)}{\eta(\sigma)} = \int (e^{itx} - 1)\, (F_\sigma - G_\sigma)(dx),$$

and hence

$$\frac{1}{\pi} \int_{-\infty}^{\infty} \log\left[ |\eta(\sigma + it)| / \eta(\sigma) \right] \frac{dt}{t^2} = \int x\, (F_\sigma - G_\sigma)(dx) \qquad (\sigma > 1).$$

These calculations give a more detailed picture of how the factor $(z-1)/z^2$ regularizes the zeta function as $\sigma \downarrow 1$: it compensates the flow of mass of $F_\sigma$ towards $-\infty$ by the subtraction of measures $G_\sigma$ such that the first moment of $F_\sigma - G_\sigma$ remains bounded. Evidently, other ways of renormalizing the Levy measure as $\sigma \downarrow 1$ are also conceivable, and may be interesting to explore.

References

1. M. Balazard, E. Saias, and M. Yor, Adv. Math. 143, 284 (1999).

2. M.V. Berry and J.P. Keating, SIAM Review 41, 236 (1999).

3. R.E. Burge, M.A. Fiddy, A.H. Greenaway, and G. Ross, Proc. R. Soc. London A 350, 191 (1976).

4. J.-F. Burnol, <http://arXiv.org/abs/math/0001013> (2000).

5. H. Dym and H.P. McKean, Gaussian Processes, Function Theory, and the Inverse Spectral Problem (Academic Press, New York, 1976).

6. H.M. Edwards, The Theory of the Riemann Zeta Function (Academic Press, New York, 1974).

7. B.V. Gnedenko and A.N. Kolmogorov, Limit Distributions for Sums of Independent Random Variables (Addison-Wesley, Cambridge, 1954).

8. K. Hoffman, Banach Spaces of Analytic Functions (Dover, New York, 1988).


ENSEMBLE PROBABILISTIC EQUILIBRIUM AND NON-EQUILIBRIUM THERMODYNAMICS WITHOUT THE THERMODYNAMICAL LIMIT

D. H. E. GROSS

Hahn-Meitner-Institut Berlin, Bereich Theoretische Physik, Glienickerstr. 100, 14109 Berlin, Germany and Freie Universität Berlin, Fachbereich Physik; Email: [email protected]

Boltzmann's principle S = k ln W allows one to extend equilibrium thermo-statistics to "Small" systems without invoking the thermodynamic limit1,2,3. As the limit hides more than it clarifies about the origin of phase transitions, a deeper and more transparent understanding is thus possible. The main clue is to base statistical probability on ensemble averaging and not on time averaging. It is argued that, due to the incomplete information obtained by macroscopic measurements, thermodynamics handles ensembles or finite-sized sub-manifolds in phase space and not single time-dependent trajectories. Therefore, ensemble averages are the natural objects of statistical probabilities. This is the physical origin of coarse-graining, which is no longer a mathematical ad hoc assumption. The probabilities P(M) of macroscopic measurements M are given by the ratio P(M) = W(M)/W of the volume of the sub-manifold $\mathcal{M}$ of the microcanonical ensemble with the constraint M to the volume without it. From this concept all equilibrium thermodynamics can be deduced quite naturally, including the most sophisticated phenomena of phase transitions for "Small" systems.

Boltzmann's principle is generalized to non-equilibrium Hamiltonian systems with possibly fractal distributions $\mathcal{M}$ in 6N-dim. phase space by replacing the conventional Riemann integral for the volume in phase space by its corresponding box-counting volume. This is equal to the volume of the closure $\overline{\mathcal{M}}$. With this extension the Second Law is derived without invoking the thermodynamic limit. The irreversibility in this approach is due to the replacement of the phase-space volume of the fractal sub-manifold $\mathcal{M}$ by the volume of its closure $\overline{\mathcal{M}}$. The physical reason for this replacement is that macroscopic measurements cannot distinguish $\mathcal{M}$ from $\overline{\mathcal{M}}$. Whereas the volume of the former does not change in time, by Liouville's theorem, the volume of the closure can be larger. In contrast to conventional coarse graining, the box-counting volume is defined in the limit of infinite resolution, i.e. there is no artificial loss of information.

1 Introduction

Recently the interest in the thermo-statistical behavior of non-extensive many-body systems, like atomic nuclei, atomic clusters, soft matter, biological systems, and also self-gravitating astrophysical systems, has led to considering thermo-statistics without using the thermodynamic limit. This is most safely done by going back to Boltzmann. Einstein refers to Boltzmann's definition of entropy, as e.g. written on his famous epitaph

$$S = k \cdot \ln W \qquad (1)$$

as Boltzmann's principle4, from which Boltzmann was able to deduce thermodynamics. Here W is the number of micro-states at given energy E of the N-body system in the spatial volume V:

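$$W(E,N,V) = \mathrm{tr}[\epsilon_0\,\delta(E - H_N)] \qquad (2)$$

$$\mathrm{tr}[\delta(E - H_N)] = \int \frac{1}{N!}\left(\frac{d^3q\,d^3p}{(2\pi\hbar)^3}\right)^{N} \delta(E - H_N). \qquad (3)$$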

$\epsilon_0$ is a suitable energy constant to make W dimensionless, $H_N$ is the N-particle Hamilton function, and the N positions q are restricted to the volume V, whereas the momenta p are unrestricted. In what follows, we remain on the level of classical mechanics. The only reminders of the underlying quantum mechanics are the measure of the phase space in units of $2\pi\hbar$ and the factor 1/N!, which respects the indistinguishability of the particles (Gibbs paradox). In contrast to Boltzmann5,6, who used the principle only for dilute gases, and to Schrödinger7, who thought equation (1) is useless otherwise, I take the principle as the fundamental, generic definition of entropy. In the following sections I will demonstrate that this definition of thermo-statistics works well, especially also at higher densities and at phase transitions, without invoking the thermodynamic limit.

2 There is a lot to add to classical equilibrium statistics from our experience with "Small" systems:

Following Lieb8, extensivity (a) and the existence of the thermodynamic limit $N \to \infty|_{N/V=\mathrm{const}}$ are essential conditions for conventional (canonical) thermodynamics to apply. Certainly, this implies also the homogeneity of the system. Phase transitions are somehow foreign to this: The essence of first order transitions is that the systems become inhomogeneous and split into different phases separated by interfaces. In the conventional Yang-Lee theory phase transitions are represented by the positive zeros of the grand-canonical partition sum where the grand-canonical formalism breaks down (Yang-Lee singularities). In the following we show that the micro-canonical ensemble gives much more detailed and more natural insight which corresponds to the experimental identification of phase transitions.

(a) Dividing extensive systems into larger pieces, the total energy and entropy are equal to the sum of those of the pieces.

There is a whole group of physical many-body systems called "Small" in the following which cannot be addressed by conventional thermo-statistics:

• nuclei,

• atomic clusters

• polymers

• soft matter (biological) systems

• astrophysical systems

• first order transitions are distinguished from continuous transitions by the appearance of phase-separations and interfaces with surface tension. If the range of the force or the thickness of the surface layers is such that the number of surface particles is not negligible compared to the total number of particles, these systems are non-extensive.

For such systems the thermodynamic limit does not exist or makes no sense. Either the range of the forces (Coulomb, gravitation) is of the order of the linear dimensions of these systems, and/or they are strongly inhomogeneous e.g. at phase-separation.

Boltzmann's principle does not invoke the thermodynamic limit, nor additivity, nor extensivity, nor concavity of the entropy S(E,N) (downwards bending). This was largely forgotten for a hundred years. We have to go back to pre-Gibbsian times. It is a purely geometrical definition of the entropy and applies as well to "Small" systems. Moreover, the entropy S(E,N) as defined above is everywhere single-valued and multiply differentiable. There are no singularities in it. This is the simplest access to equilibrium statistics9. We will explore its consequences in this contribution. Moreover, we will see that in this way we get simultaneously the complete information about the three crucial parameters characterizing a phase transition of first order: transition temperature $T_{tr}$, latent heat per atom $q_{lat}$, and surface tension $\sigma_{surf}$. Boltzmann's famous epitaph above (eq. 1) contains everything that can be said about equilibrium thermodynamics in its most condensed form. W is the volume of the sub-manifold at sharp energy in the 6N-dim. phase space.


3 Relation of the topology of S(E,N) to the Yang-Lee zeros of Z(T,μ,V)

In conventional thermo-statistics phase transitions are indicated by zeros of the grand-canonical partition function Z(T,μ,V), where V is the volume. See refs.1,2,3,10 for more details.

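$$Z(T,\mu,V) = \int\!\!\int_0^{\infty} \frac{dE}{\epsilon_0}\,dN\; e^{-[E - \mu N - T S(E,N)]/T}$$

$$= \frac{V^2}{\epsilon_0} \int\!\!\int_0^{\infty} de\,dn\; e^{-V[e - \mu n - T s(e,n)]/T}$$

$$\approx e^{\,\mathrm{const.}+\mathrm{lin.}+\mathrm{quadr.}} \qquad (4)$$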

in the thermodynamic limit $V \to \infty|_{N/V=\mathrm{const}}$. The double Laplace integral (4) can be evaluated asymptotically for large V by expanding the exponent, as indicated in the last line, to second order in $\Delta e$, $\Delta n$ around the "stationary point" $e_s, n_s$ where the linear term vanishes:

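$$\frac{1}{T} = \left.\frac{\partial S}{\partial E}\right|_s, \qquad \frac{\mu}{T} = -\left.\frac{\partial S}{\partial N}\right|_s, \qquad \frac{P}{T} = \left.\frac{\partial S}{\partial V}\right|_s \qquad (5)$$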

The only term remaining to be integrated is the quadratic one. If the two eigen-curvatures satisfy $\lambda_1 < 0$, $\lambda_2 < 0$, this is a Gaussian integral and yields:

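$$Z(T,\mu,V) = \frac{V^2}{\epsilon_0}\, e^{-V[e_s - \mu n_s - T s(e_s,n_s)]/T} \int\!\!\int_{-\infty}^{\infty} dv_1\,dv_2\; e^{V[\lambda_1 v_1^2 + \lambda_2 v_2^2]/2} \qquad (6)$$

$$Z(T,\mu,V) = e^{-F(T,\mu,V)/T} \qquad (7)$$

$$\frac{F(T,\mu,V)}{V} \to e_s - \mu n_s - T s_s + \frac{T \ln\big(\sqrt{\det(e_s,n_s)}\big)}{V} + o\!\left(\frac{\ln V}{V}\right). \qquad (8)$$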

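Here $\det(e_s,n_s)$ is the determinant of the curvatures of s(e,n), and $v_1, v_2$ are the eigenvectors of d.

$$\det(e,n) = \begin{Vmatrix} \dfrac{\partial^2 s}{\partial e^2} & \dfrac{\partial^2 s}{\partial n\,\partial e} \\[2mm] \dfrac{\partial^2 s}{\partial e\,\partial n} & \dfrac{\partial^2 s}{\partial n^2} \end{Vmatrix} = \begin{Vmatrix} s_{ee} & s_{en} \\ s_{ne} & s_{nn} \end{Vmatrix} = \lambda_1 \lambda_2, \qquad \lambda_1 \ge \lambda_2 \qquad (9)$$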
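As a worked step, the Gaussian integration that leads from eq.(6) to eq.(8) can be sketched as follows. This assumes both eigen-curvatures are negative, so that the integral converges and $\det(e_s,n_s) = \lambda_1\lambda_2 > 0$, and it takes the prefactor $V^2/\epsilon_0$ from eq.(4); the intermediate lines are a reconstruction, not the author's own derivation.

```latex
\begin{align*}
\int\!\!\int_{-\infty}^{\infty} dv_1\,dv_2\; e^{V[\lambda_1 v_1^2 + \lambda_2 v_2^2]/2}
  &= \sqrt{\frac{2\pi}{V|\lambda_1|}}\,\sqrt{\frac{2\pi}{V|\lambda_2|}}
   = \frac{2\pi}{V\sqrt{\lambda_1\lambda_2}}, \\
Z(T,\mu,V) &\approx \frac{2\pi V}{\epsilon_0\sqrt{\det(e_s,n_s)}}\;
   e^{-V[e_s - \mu n_s - T s_s]/T}, \\
-\frac{T\ln Z}{V} &\approx e_s - \mu n_s - T s_s
   + \frac{T\ln\sqrt{\det(e_s,n_s)}}{V}
   - \frac{T}{V}\ln\frac{2\pi V}{\epsilon_0}.
\end{align*}
```

The last term is of order ln V / V, so both corrections to $e_s - \mu n_s - T s_s$ vanish as V grows, provided the determinant stays positive.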


[Figure 1 plot: sheared entropy s(e) - 25 - e * 11.5 versus e (axis from 0.3 to 1.3) for Na, N_0 = 1000, P = 1 atm, with the energies e_1, e_2, e_3, the intruder depth Δs_surf, and the latent heat q_lat marked.]

Figure 1: MMMC simulation of the entropy s(e) per atom (e in eV per atom) of a system of $N_0 = 1000$ sodium atoms with realistic interaction at an external pressure of 1 atm. At the energy per atom $e_1$ the system is in the pure liquid phase and at $e_3$ in the pure gas phase, of course with fluctuations. The latent heat per atom is $q_{lat} = e_3 - e_1$.

Attention: the curve s(e) is artificially sheared by subtracting a linear function 25 + e * 11.5 in order to make the convex intruder visible; s(e) is always a steeply monotonically rising function. We clearly see the global concave (downwards bending) nature of s(e) and its convex intruder. Its depth is the entropy loss due to the additional correlations by the interfaces. From this one can calculate the surface tension per surface atom $\sigma_{surf}/T_{tr} = \Delta s_{surf} * N_0/N_{surf}$. The double tangent is the concave hull of s(e). Its derivative gives the Maxwell line in the caloric curve T(e) at $T_{tr}$. In the thermodynamic limit the intruder would disappear and s(e) would approach the double tangent (Maxwell line) from below.
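The construction described in this caption (double tangent as concave hull, its slope as 1/T_tr, its width as the latent heat, and the intruder depth) can be illustrated on a toy curve. The sketch below is not the MMMC data: `s_toy`, `BETA0`, and the energy grid are hypothetical choices made so that the answer is known analytically (double tangent between e = -1 and e = 1, slope 10, depth 1).

```python
def upper_hull(points):
    """Upper (concave) hull of points sorted by x, via the monotone-chain scan."""
    hull = []
    for p in points:
        while len(hull) >= 2:
            (ox, oy), (ax, ay) = hull[-2], hull[-1]
            cross = (ax - ox) * (p[1] - oy) - (ay - oy) * (p[0] - ox)
            if cross >= 0:      # hull[-1] lies on or below the chord: drop it
                hull.pop()
            else:
                break
        hull.append(p)
    return hull

BETA0 = 10.0  # hypothetical shear that keeps the toy s(e) monotonically rising

def s_toy(e):
    """Toy entropy with a convex intruder for |e| < 1/sqrt(3) (illustration only)."""
    return BETA0 * e - (e * e - 1.0) ** 2

energies = [i / 100 - 1.5 for i in range(301)]
points = [(e, s_toy(e)) for e in energies]
hull = upper_hull(points)

# The double tangent is the longest hull edge: it bridges the convex intruder.
(e1, s1), (e3, s3) = max(zip(hull, hull[1:]), key=lambda edge: edge[1][0] - edge[0][0])
inv_T_tr = (s3 - s1) / (e3 - e1)   # slope of the double tangent = 1/T_tr
q_lat = e3 - e1                    # latent heat = width of the bridged interval
# Intruder depth: largest gap between the double tangent and s(e).
depth = max(s1 + inv_T_tr * (e - e1) - s for e, s in points if e1 <= e <= e3)
print(1.0 / inv_T_tr, q_lat, depth)
```

For a real s(e) the same three numbers would be read off the sampled curve; the accuracy is then limited by the energy grid and by statistical noise in the simulation.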

In the cases studied here $\lambda_2 < 0$, but $\lambda_1$ can be positive or negative. If $\det(e_s,n_s)$ is positive ($\lambda_1 < 0$), the last two terms in eq.(8) go to 0, and we obtain the familiar result $f(T,\mu,V \to \infty) = e_s - \mu n_s - T s_s$. I.e. the curvature $\lambda_1$ of the entropy surface s(e,n,V) decides whether the grand-canonical ensemble agrees with the fundamental micro-canonical ensemble in the thermodynamic limit. If this is the case, $\ln[Z(T,\mu)]$ or $f(T,\mu)$ is analytic in $e^{\beta\mu}$, and due to Yang and Lee we have a single, stable phase. Otherwise, the Yang-Lee zeros reflect anomalous points/regions of $\lambda_1 > 0$ ($\det(e,n) < 0$). This is crucial. As $\det(e_s,n_s)$ can be studied for finite or even small systems as well, this is the only proper extension of phase transitions to "Small" systems.

4 The regions of positive curvature $\lambda_1$ of $s(e_s,n_s)$ correspond to phase transitions of first order

We will now discuss the physical origin of convex (upwards bending) intruders in the entropy surface in two examples.

In table (1) we compare the "liquid-gas" phase transition in sodium clusters of a few hundred atoms with that of the bulk at 1 atm, cf. also fig.(1).

Figure (2) shows how for a small system (Potts q = 3 lattice gas with 50 * 50 points) all phenomena of phase transitions can be studied from the topology of the determinant of curvatures (9) in the micro-canonical ensemble.

Table 1: Parameters of the liquid-gas transition of small sodium clusters (MMMC-calculation1) in comparison with the bulk for rising number N_0 of atoms; N_surf is the average number of surface atoms of all clusters together.

Na
N_0            200      1000     3000     bulk
T_tr [K]       940      990      1095     1156
q_lat [eV]     0.82     0.91     0.94     0.923
s_boil         10.1     10.7     9.9      9.267
Δs_surf        0.55     0.56     0.44
N_surf         39.94    98.53    186.6    ∞
σ/T_tr         2.75     5.68     7.07     7.41
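The entries of Table 1 can be cross-checked against the relation $\sigma_{surf}/T_{tr} = \Delta s_{surf} * N_0/N_{surf}$ given in the caption of figure 1. The sketch below also recomputes s_boil under the assumption (not stated explicitly in the text) that it is the entropy gain per atom $q_{lat}/(k_B T_{tr})$ in units of $k_B$; the dictionary layout is purely illustrative.

```python
K_B = 8.617333e-5  # Boltzmann constant in eV/K

# Finite-cluster columns of Table 1:
# N0 -> (T_tr [K], q_lat [eV], s_boil, ds_surf, N_surf, sigma/T_tr)
table = {
    200:  (940.0, 0.82, 10.1, 0.55, 39.94, 2.75),
    1000: (990.0, 0.91, 10.7, 0.56, 98.53, 5.68),
    3000: (1095.0, 0.94, 9.9, 0.44, 186.6, 7.07),
}

recomputed = {}
for n0, row in table.items():
    t_tr, q_lat, _, ds_surf, n_surf, _ = row
    s_boil_calc = q_lat / (K_B * t_tr)        # assumed definition of s_boil
    sigma_calc = ds_surf * n0 / n_surf        # relation from the figure 1 caption
    recomputed[n0] = (s_boil_calc, sigma_calc)
    print(n0, round(s_boil_calc, 2), round(sigma_calc, 2))
```

Under these assumptions the recomputed values agree with the tabulated ones to within the rounding of the table.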

5 Boltzmann's principle and non-equilibrium thermodynamics

Before we proceed we must comment on Einstein's attitude to the principle11: Originally, Boltzmann called W the "Wahrscheinlichkeit" (probability), i.e. the relative time a system spends (along a time-dependent path) in a given region of 6N-dim. phase space. Our interpretation of W as the number of "complexions" (Boltzmann's second interpretation) or quantum states (trace) with the same energy was criticized by Einstein4 as artificial. It is exactly this criticized interpretation of W which I use here and which works so excellently1. In section 7 I will come back to this fundamental point.

After succeeding in deducing equilibrium statistics, including all phenomena of phase transitions, from Boltzmann's principle even for "Small" systems, i.e. non-extensive many-body systems, it is challenging to explore how far this "most conservative and restrictive way to thermodynamics"9 is able to describe also the approach of (possibly "Small") systems to equilibrium and the Second Law of Thermodynamics.

Thermodynamics describes the development of macroscopic features of many-body systems without specifying them microscopically in all details. Before we address the Second Law, we have to clarify what we mean by the label "macroscopic observable".

6 Macroscopic observables imply the "EPS-probability"

[Figure 2 plot: contour plot with horizontal axis e from -2 to 0 and vertical axis from 0 to 1.]

Figure 2: Contour plot of the curvature determinant of the Potts-3 lattice gas. Dark grey line: d = 0, boundary of the region of phase coexistence, the triangle $AP_mB$. Light grey line: minimum of d(e,n) in the direction of the largest curvature, second order transition. In the triangle $AP_mC$: ordered (solid) phase. Above and right of the line $CP_mB$: disordered (gas) phase. The crossing $P_m$ of the boundary lines is a multi-critical point. The light grey region around the multi-critical point $P_m$ corresponds to a flat region of $d(e,n) \sim 0$.

A single point $\{q_i(t), p_i(t)\}_{i=1 \ldots N}$ in the N-body phase space corresponds to a detailed specification of the system with all degrees of freedom (d.o.f.) completely fixed at time t (microscopic determination). Fixing only the total energy E of an N-body system leaves the other (6N - 1) degrees of freedom unspecified. A second system with the same energy is most likely not in the same microscopic state as the first; it will be at another point in phase space, and the other d.o.f. will be different. I.e. the measurement of the total energy $H_N$, or any other macroscopic observable M, determines a (6N - 1)-dimensional sub-manifold $\mathcal{E}$ or $\mathcal{M}$ in phase space. All points in N-body phase space consistent with the given value of E and volume V, i.e. all points in the (6N - 1)-dimensional sub-manifold $\mathcal{E}(N,V)$ of phase space, are equally consistent with this measurement. $\mathcal{E}(N,V)$ is the microcanonical ensemble. This example tells us that any macroscopic measurement is incomplete and defines a sub-manifold of points in phase space, not a single point. An additional measurement of another macroscopic quantity B{q,p} reduces $\mathcal{E}$ further to the cross-section $\mathcal{E} \cap \mathcal{B}$, a (6N - 2)-dimensional subset of points in $\mathcal{E}$ with the volume:

$$W(B,E,N,V) = \frac{1}{N!}\int \left(\frac{d^3q\,d^3p}{(2\pi\hbar)^3}\right)^{N} \epsilon_0\,\delta(E - H_N\{q,p\})\,\delta(B - B\{q,p\}) \qquad (10)$$


If $H_N\{q,p\}$ and also B{q,p} are continuously differentiable functions of their arguments, which we assume in the following, $\mathcal{E} \cap \mathcal{B}$ is closed. In the following we use W for the Riemann or Liouville volume of a manifold.

Microcanonical thermostatics gives the probability P(B,E,N,V) to find the N-body system in the sub-manifold $\mathcal{E} \cap \mathcal{B}(E,N,V)$:

$$P(B,E,N,V) = \frac{W(B,E,N,V)}{W(E,N,V)} = e^{\ln[W(B,E,N,V)] - S(E,N,V)} \qquad (11)$$

This is what Krylov seems to have had in mind12 and what I will call the "ensemble probabilistic formulation of statistical mechanics (EPS) ".
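A discrete toy analogue may make the ratio in eq.(11) concrete. The sketch below is an illustration under simplifying assumptions, not the continuous phase-space integral of the text: N distinguishable particles each sit in the left or right half of a box, all $2^N$ configurations are taken as equally weighted micro-states, and the macroscopic observable is the number of particles on the left. The function name is hypothetical.

```python
from math import comb

def eps_probability(n_left, n_total):
    """P(M) = W(M) / W for the macroscopic constraint 'n_left particles in the left half'."""
    w_constrained = comb(n_total, n_left)   # micro-states compatible with the constraint
    w_total = 2 ** n_total                  # all micro-states of the ensemble
    return w_constrained / w_total

N = 20
distribution = [eps_probability(n, N) for n in range(N + 1)]
print(distribution[N // 2])   # probability of an even split
```

The whole distribution over the observable is obtained from counting alone; no time average along a trajectory enters.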

Similarly, thermodynamics describes the development in time of some macroscopic observable $B\{q_t,p_t\}$ of a system which was specified at an earlier time $t_0$ by another macroscopic measurement $A\{q_0,p_0\}$. It is related to the volume of the sub-manifold $\mathcal{M}(t) = \mathcal{A}(t_0) \cap \mathcal{B}(t) \cap \mathcal{E}$:

$$W(A,B,E,t) = \frac{1}{N!}\int \left(\frac{d^3q_t\,d^3p_t}{(2\pi\hbar)^3}\right)^{N} \delta(B - B\{q_t,p_t\})\,\delta(A - A\{q_0,p_0\})\,\epsilon_0\,\delta(E - H\{q_t,p_t\}), \qquad (12)$$

where $\{q_t\{q_0,p_0\}, p_t\{q_0,p_0\}\}$ is the set of trajectories solving Hamilton's equations of motion

$$\dot q_i = \frac{\partial H}{\partial p_i}, \qquad \dot p_i = -\frac{\partial H}{\partial q_i}, \qquad i = 1 \cdots N \qquad (13)$$

with the initial conditions $\{q(t = t_0) = q_0;\ p(t = t_0) = p_0\}$. For a very large system with $N \sim 10^{23}$ the probability P(B(t)) to find a given value B(t) is usually sharply peaked as a function of B. Ordinary thermodynamics treats systems in the thermodynamic limit $N \to \infty$ and gives only <B(t)>. Here, however, we are interested in formulating the Second Law for "Small" systems, i.e. we are interested in the whole distribution P(B(t)), not only in its mean value <B(t)>. Thermodynamics does not describe the temporal development of a single system (a single point in the 6N-dim. phase space).

There is an important property of macroscopic measurements: Whereas the macroscopic constraint $A\{q_0,p_0\}$ determines (usually) a compact region $\mathcal{A}(t_0)$ in $\{q_0,p_0\}$, this does not need to be the case at later times $t \gg t_0$: $\mathcal{A}(t)$, defined by $A\{q_0\{q_t,p_t\}, p_0\{q_t,p_t\}\}$, might become a fractal, i.e. "spaghetti-like", manifold (cf. fig. 3) as a function of $\{q_t,p_t\}$ in $\mathcal{E}$ at $t \to \infty$ and lose compactness.

This can be expressed in mathematical terms: There exist sequences of points $\{a_n\} \in \mathcal{A}(\infty)$ which converge to a point $a_{n \to \infty}$ which is not in $\mathcal{A}(\infty)$. E.g. such points may have intruded from the phase space complementary to $\mathcal{A}(t_0)$. An illustrative example for this evolution of an initially compact sub-manifold into a fractal set is the baker transformation discussed in this context by refs.13,14. Then no macroscopic (incomplete) measurement at time $t = \infty$ can resolve $a_\infty$ from its immediate neighbors $a_n$ in phase space with distances $|a_n - a_\infty|$ less than any arbitrarily small $\delta$. In other words, at the time $t \gg t_0$ no macroscopic measurement with its incomplete information about $\{q_t,p_t\}$ can decide whether $\{q_0\{q_t,p_t\}, p_0\{q_t,p_t\}\} \in \mathcal{A}(t_0)$ or not. I.e. any macroscopic theory like thermodynamics can only deal with the closure of $\mathcal{A}(t)$. If necessary, the sub-manifold $\mathcal{A}(t)$ must be artificially closed to $\overline{\mathcal{A}(t)}$, as developed further in section 8. Clearly, in this approach this is the physical origin of irreversibility.

7 On Einstein's objections against the EPS-probability

According to Abraham Pais, "Subtle is the Lord"11, Einstein was critical with regard to the definition of relative probabilities by eq.(11), Boltzmann's counting of "complexions". He considered it as artificial and not corresponding to the immediate picture of probability used in the actual problem: "The word probability is used in a sense that does not conform to its definition as given in the theory of probability. In particular, cases of equal probability are often hypothetically defined in instances where the theoretical pictures used are sufficiently definite to give a deduction rather than a hypothetical assertion"4. He preferred to define probability by the relative time a system (a trajectory of a single point moving with time in the N-body phase space) spends in a subset of the phase space. However, is this really the immediate picture of probability used in statistical mechanics? This definition demands the ergodicity of the trajectory in phase space. As we discussed above, thermodynamics, like any other macroscopic theory, handles incomplete, macroscopic information about the N-body system. It handles, consequently, the temporal evolution of finite-sized sub-manifolds (ensembles), not single points in phase space. The typical outcomes of macroscopic measurements are calculated. Nobody waits in a macroscopic measurement, e.g. of the temperature, long enough that an atom can cross the whole system.

In this respect, I think the EPS version of statistical mechanics is closer to the experimental situation than the duration-time of a single trajectory. Moreover, in an experiment on a small system like a nucleus, the excited nucleus, which then may fragment statistically later on, is produced by a multiple repetition of scattering events and statistical averages are taken. No ergodic covering of the whole phase space by a single trajectory in time is demanded.


At the high excitations of the nuclei in the fragmentation region their life-time would be too short for that. This is analogous to the statistics of a falling ball on a Galton's nail-board where also a single trajectory is not touching all nails but is random. Only after many repetitions the smooth binomial distribution is established. As I am discussing here the Second Law in finite systems, this is the correct scenario, not the time average over a single ergodic trajectory.
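The Galton-board picture can be simulated directly. In the minimal sketch below each ball makes an independent left/right choice at every row of nails; the number of rows, the number of repetitions, and the seed are arbitrary illustrative values.

```python
import random
from math import comb

def galton_counts(rows, repetitions, seed=0):
    """Drop `repetitions` balls through `rows` rows of nails and histogram the final bins."""
    rng = random.Random(seed)
    counts = [0] * (rows + 1)
    for _ in range(repetitions):
        # Each nail deflects the ball to the right with probability 1/2.
        bin_index = sum(rng.random() < 0.5 for _ in range(rows))
        counts[bin_index] += 1
    return counts

ROWS, REPETITIONS = 10, 20000
counts = galton_counts(ROWS, REPETITIONS)
binomial = [comb(ROWS, k) / 2 ** ROWS for k in range(ROWS + 1)]
# Largest deviation of the observed frequencies from the binomial distribution.
max_dev = max(abs(c / REPETITIONS - b) for c, b in zip(counts, binomial))
print(counts, max_dev)
```

A single ball touches only one nail per row, yet the ensemble of repetitions reproduces the binomial distribution, which is the sense in which ensemble statistics replace the time average here.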

8 Fractal distributions in phase space, Second Law

Let us examine the following Gedanken experiment: Suppose the probability to find our system at points $\{q_t,p_t\}_1^N$ in phase space is uniformly distributed for times $t < t_0$ over the sub-manifold $\mathcal{E}(N,V_1)$ of the N-body phase space at energy E and spatial volume $V_1$. At time $t > t_0$ we allow the system to spread over the larger volume $V_2 > V_1$ without changing its energy. If the system is dynamically mixing, the majority of trajectories $\{q_t,p_t\}_1^N$ in phase space starting from points $\{q_0,p_0\}$ with $q_0 \in V_1$ at $t_0$ will now spread over the larger volume $V_2$. Of course the Liouvillean measure of the distribution $\mathcal{M}\{q_t,p_t\}$ in phase space at $t > t_0$ will remain the same ($= \mathrm{tr}[\mathcal{E}(N,V_1)]$)15. (The label $\{q_0 \in V_1\}$ of the integral means that the positions $\{q_0\}_1^N$ are restricted to the volume $V_1$; the momenta $\{p_0\}_1^N$ are unrestricted.)

$$\mathrm{tr}[\mathcal{M}\{q_t\{q_0,p_0\}, p_t\{q_0,p_0\}\}]\big|_{\{q_0 \in V_1\}} = \int_{\{q_0\{q_t,p_t\} \in V_1\}} \frac{1}{N!}\left(\frac{d^3q_t\,d^3p_t}{(2\pi\hbar)^3}\right)^{N} \epsilon_0\,\delta(E - H_N)$$

$$= \int_{\{q_0 \in V_1\}} \frac{1}{N!}\left(\frac{d^3q_0\,d^3p_0}{(2\pi\hbar)^3}\right)^{N} \epsilon_0\,\delta(E - H_N), \qquad (14)$$

$$\text{because of:}\quad \frac{\partial\{q_t,p_t\}}{\partial\{q_0,p_0\}} = 1. \qquad (15)$$

But as already argued by Gibbs, the distribution $\mathcal{M}\{q_t,p_t\}$ will be filamented like ink in water and will approach any point of $\mathcal{E}(N,V_2)$ arbitrarily closely. $\mathcal{M}\{q_t,p_t\}$ becomes dense in the new, larger $\mathcal{E}(N,V_2)$ for times sufficiently larger than $t_0$ (strictly in the limit $t \to \infty$). The closure $\overline{\mathcal{M}}$ becomes equal to $\mathcal{E}(N,V_2)$. This is clearly expressed by Lebowitz16,17.

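In order to express this fact mathematically, we have to redefine Boltzmann's definition of entropy eq.(1) and introduce the following fractal "measure" for integrals like (3) or (10):

$$W(E,N,t \gg t_0) = \frac{1}{N!}\int_{\{q_0\{q_t,p_t\} \in V_1\}} \left(\frac{d^3q_t\,d^3p_t}{(2\pi\hbar)^3}\right)^{N} \epsilon_0\,\delta(E - H_N\{q_t,p_t\}) \qquad (16)$$

With the transformation:

$$\int (d^3q_t\,d^3p_t)^{N} \cdots = \int d\sigma_1 \cdots d\sigma_{6N} \cdots \qquad (17)$$

$$d\sigma_{6N} := \frac{1}{\|\nabla H\|}\sum_i \left(\frac{\partial H}{\partial q_i}\,dq_i + \frac{\partial H}{\partial p_i}\,dp_i\right) = \frac{1}{\|\nabla H\|}\,dE \qquad (18)$$

$$\|\nabla H\| = \sqrt{\sum_i \left(\frac{\partial H}{\partial q_i}\right)^2 + \sum_i \left(\frac{\partial H}{\partial p_i}\right)^2} \qquad (19)$$

$$W(E,N,t \gg t_0) = \frac{1}{N!\,(2\pi\hbar)^{3N}} \int_{\{q_0\{q_t,p_t\} \in V_1\}} d\sigma_1 \cdots d\sigma_{6N-1}\, \frac{\epsilon_0}{\|\nabla H\|}, \qquad (20)$$

we replace $\mathcal{M}$ by its closure $\overline{\mathcal{M}}$ and define now:

$$W(E,N,t \gg t_0) \to M(E,N,t \gg t_0) := \langle G(\mathcal{E}(N,V_2)) \rangle * \mathrm{vol}_{box}[\mathcal{M}(E,N,t \gg t_0)], \qquad (21)$$

where $\langle G(\mathcal{E}(N,V_2)) \rangle$ is the average of $\frac{\epsilon_0}{N!\,(2\pi\hbar)^{3N}\,\|\nabla H\|}$ over the (larger) manifold $\mathcal{E}(N,V_2)$, and $\mathrm{vol}_{box}[\mathcal{M}(E,N,t \gg t_0)]$ is the box-counting volume of $\mathcal{M}(E,N,t \gg t_0)$, which is the same as the volume of $\overline{\mathcal{M}}$, see below.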

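To obtain $\mathrm{vol}_{box}[\mathcal{M}(E,N,t \gg t_0)]$ we cover the d-dim. sub-manifold $\mathcal{M}(t)$, here with d = (6N - 1), of the phase space by a grid with spacing $\delta$ and count the number $N_\delta \propto \delta^{-d}$ of boxes of size $\delta^{6N}$ which contain points of $\mathcal{M}$. Then we determine

$$\mathrm{vol}_{box}[\mathcal{M}(E,N,t \gg t_0)] := \underline{\lim}_{\delta \to 0}\, \delta^d N_\delta[\mathcal{M}(E,N,t \gg t_0)] \qquad (22)$$

with $\underline{\lim}\, * = \inf[\lim *]$, or symbolically:

$$M(E,N,t \gg t_0) =: \mathrm{B}_d\!\!\int_{\{q_0\{q_t,p_t\} \in V_1\}} \frac{1}{N!}\left(\frac{d^3q_t\,d^3p_t}{(2\pi\hbar)^3}\right)^{N} \epsilon_0\,\delta(E - H_N) \qquad (23)$$

$$\to \frac{1}{N!}\int_{\{q_t \in V_2\}} \left(\frac{d^3q_t\,d^3p_t}{(2\pi\hbar)^3}\right)^{N} \epsilon_0\,\delta(E - H_N) = W(E,N,V_2) \ge W(E,N,V_1), \qquad (24)$$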

[Figure 3 plot: two panels labeled t < t_0 and t > t_0, with volume labels V_a, V_b and V_a + V_b.]

Figure 3: The compact set $\mathcal{M}(t_0)$, left side, develops into an increasingly folded "spaghetti"-like distribution in phase space with rising time t. This figure shows only the early form of the distribution. At much larger times it will become more and more fractal. The grid illustrates the boxes of the box-counting method. All boxes which overlap with $\mathcal{M}(t)$ are counted in $N_\delta$ in eq.(22).

where $\mathrm{B}_d\!\int$ means that this integral should be evaluated via the box-counting volume (22), here with d = 6N - 1. This is illustrated by figure 3. With this extension of eq.(3), Boltzmann's entropy (1) is at time $t \to \infty$ equal to the logarithm of the larger phase space $W(E,N,V_2)$. This is the Second Law of Thermodynamics. The box-counting is also used in the definition of the Kolmogorov entropy, the average rate of entropy gain18,19. Of course still at $t_0$, $\overline{\mathcal{M}(t_0)} = \mathcal{M}(t_0) = \mathcal{E}(N,V_1)$:

$$M(E,N,t_0) =: \mathrm{B}_d\!\!\int_{\{q_0 \in V_1\}} \frac{1}{N!}\left(\frac{d^3q_0\,d^3p_0}{(2\pi\hbar)^3}\right)^{N} \epsilon_0\,\delta(E - H_N) \qquad (25)$$

$$= \int_{\{q_0 \in V_1\}} \frac{1}{N!}\left(\frac{d^3q_0\,d^3p_0}{(2\pi\hbar)^3}\right)^{N} \epsilon_0\,\delta(E - H_N) = W(E,N,V_1). \qquad (26)$$
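The growth of the box-counted volume under an area-preserving, mixing map can be illustrated with the baker transformation mentioned in section 6. The sketch below is a two-dimensional toy under stated assumptions, not the 6N-dimensional construction of the text: the initial ensemble fills the left half of the unit square (the analogue of $V_1$), the grid resolution is fixed at 16 boxes per side, and the sample size and seed are arbitrary. The paper's definition takes $\delta \to 0$ after $t \to \infty$; here $\delta$ stays finite, so the jump occurs once the filaments become thinner than a box.

```python
import random

def baker(x, y):
    """One step of the area-preserving baker map on the unit square."""
    if x < 0.5:
        return 2.0 * x, 0.5 * y
    return 2.0 * x - 1.0, 0.5 * (y + 1.0)

def occupied_fraction(points, boxes_per_side):
    """delta^2 * N_delta: total area of the grid boxes containing at least one point."""
    n = boxes_per_side
    occupied = {(min(int(x * n), n - 1), min(int(y * n), n - 1)) for x, y in points}
    return len(occupied) / (n * n)

rng = random.Random(0)
# Initial ensemble: uniform on the left half of the square.
points = [(0.5 * rng.random(), rng.random()) for _ in range(20000)]

fractions = []
for step in range(9):
    fractions.append(occupied_fraction(points, 16))
    points = [baker(x, y) for x, y in points]
print(fractions)
```

The Liouville area of the ensemble stays 1/2 at every step, while the box-counted area rises to that of the whole square; this is the toy counterpart of $W(E,N,V_2) \ge W(E,N,V_1)$.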

The box-counting volume is analogous to the standard method to determine the fractal dimension of a set of points18 by the box-counting dimension:

$$\dim_{box}[\mathcal{M}(E,N,t \gg t_0)] := \underline{\lim}_{\delta \to 0} \frac{\ln N_\delta[\mathcal{M}(E,N,t \gg t_0)]}{-\ln \delta} \qquad (27)$$

Like the box-counting dimension, $\mathrm{vol}_{box}$ has the peculiarity that it is equal to the volume of the smallest closed covering set. E.g.: The box-counting volume of the set of rational numbers {Q} between 0 and 1 is $\mathrm{vol}_{box}\{Q\} = 1$, and thus equal to the measure of the real numbers, cf. Falconer18 section 3.1. This is the reason why $\mathrm{vol}_{box}$ is not a measure in its mathematical definition, because then we should have

$$\mathrm{vol}_{box}\Big[\sum_{i \in \{Q\}} (\mathcal{M}_i)\Big] = \sum_{i \in \{Q\}} \mathrm{vol}_{box}[\mathcal{M}_i] = 0, \qquad (28)$$

therefore the quotation marks for the box-counting "measure".

Coming back to the end of section (6), the volume W(A,B,...,t) of the relevant ensemble, the closure $\overline{\mathcal{M}(t)}$, must be "measured" by something like the box-counting "measure" (22,23) with the box-counting integral $\mathrm{B}_d\!\int$, which must replace the integral in eq.(3). Due to the fact that the box-counting volume is equal to the volume of the smallest closed covering set, the new, extended definition of the phase-space integral eq.(23) is for compact sets like the equilibrium distribution $\mathcal{E}$ identical to the old one eq.(3). Therefore, one can simply replace the old Boltzmann definition of the number of complexions, and with it of the entropy, by the new one (23).
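The statement about the rational numbers can be made concrete with exact integer arithmetic. The sketch below counts, for a grid of `boxes` equal intervals on [0, 1), how many intervals contain a rational p/q with q up to `max_denominator`; both parameter names are hypothetical, and the finite truncation of the rationals is an assumption that has to be refined along with the grid.

```python
from fractions import Fraction

def box_volume_of_rationals(max_denominator, boxes):
    """delta * N_delta for the rationals p/q in [0, 1) with q <= max_denominator."""
    occupied = {
        (p * boxes) // q          # index of the interval of width 1/boxes containing p/q
        for q in range(1, max_denominator + 1)
        for p in range(q)
    }
    return Fraction(len(occupied), boxes)

coarse_set = box_volume_of_rationals(4, 32)    # too few rationals for this grid
dense_set = box_volume_of_rationals(64, 32)    # every interval is hit
print(coarse_set, dense_set)
```

As soon as the denominators reach the grid resolution, every box is occupied and the box-counted volume is 1, although the set is countable and has Lebesgue measure 0.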

9 Conclusion

Macroscopic measurements M determine only a very few of all 6N d.o.f. Any macroscopic theory like thermodynamics deals with the volumes M of the corresponding closed sub-manifolds $\overline{\mathcal{M}}$ in the 6N-dim. phase space, not with single points. The averaging over ensembles or finite sub-manifolds in phase space becomes especially important for the micro-canonical ensemble of a finite system.

Because of this necessarily coarse information, macroscopic measurements, and with them also macroscopic theories, are unable to distinguish fractal sets $\mathcal{M}$ from their closures $\overline{\mathcal{M}}$. Therefore, I make the conjecture: the proper manifolds determined by a macroscopic theory like thermodynamics are the closed $\overline{\mathcal{M}}$. However, an initially closed subset of points at time $t_0$ does not necessarily evolve again into a closed subset at $t \gg t_0$. I.e. the closure operation and the $t \to \infty$ limit do not commute, and the macroscopic dynamics becomes irreversible. The $\lim_{t \to \infty}$ and $\lim_{\delta \to 0}$ may be linked as e.g. $\delta > \mathrm{const.}/t$, with the $\delta \to 0$ limit taken after the $t \to \infty$ limit.

Here is the origin of the misunderstanding behind the famous reversibility paradoxes which were invented by Loschmidt20 and Zermelo21,22 and which bothered Boltzmann so much23,24. These paradoxes address trajectories of single points in the N-body phase space, which must return after Poincaré's recurrence time or which must run backwards if all momenta are exactly reversed. Therefore, Loschmidt and Zermelo concluded that the entropy should decrease as well as it was increasing before. The specification of a single point demands of course a microscopically exact specification of all 6N degrees of freedom, not a determination of a few macroscopic degrees of freedom only. No entropy is defined for a single point.

By our formulation of thermo-statistics various non-trivial limiting processes can be avoided. Neither does one invoke the thermodynamic limit of a homogeneous system with infinitely many particles, nor does one rely on the ergodic hypothesis of the equivalence of (very long) time averages and ensemble averages. The use of ensemble averages is justified directly by the very nature of macroscopic (incomplete) measurements. Coarse-graining appears as a natural consequence of this. The box-counting method mirrors the averaging over the overwhelming number of non-determined degrees of freedom. Of course, a fully consistent theory must use this averaging explicitly. Then one would not depend on the order of the limits $\lim_{\delta \to 0} \lim_{t \to \infty}$ as was tacitly assumed here. Presumably, the rise of the entropy can then already be seen at finite times, when the fractality of the distribution in phase space is not yet fully developed. The coarse-graining is no longer a mathematical ad hoc assumption. Moreover, the Second Law is in the EPS-formulation of statistical mechanics not linked to the thermodynamic limit, as was thought up to now16,17.

Appendix

In the mathematical theory of fractals18 one usually uses the Hausdorff measure or the Hausdorff dimension of the fractal19. This, however, would be wrong in Statistical Mechanics. Here I want to point out the difference between the box-counting "measure" and the proper Hausdorff measure of a manifold of points in phase space. Without going into too much mathematical detail, we can make this clear again with the same example as above: The Hausdorff measure of the rational numbers $\in [0,1]$ is 0, whereas the Hausdorff measure of the real numbers $\in [0,1]$ is 1. Therefore, the Hausdorff measure of a set is a proper measure. The Hausdorff measure of the fractal distribution in phase space $\mathcal{M}(t \to \infty)$ is the same as that of $\mathcal{M}(t_0)$, $W(E,N,V_1)$. Measured by the Hausdorff measure, the phase space volume of the fractal distribution $\mathcal{M}(t \to \infty)$ is conserved and Liouville's theorem applies. This would demand that thermodynamics could distinguish any point inside the fractal from any point outside of it, independently of how close it is. This, however, is impossible for any macroscopic theory that can only address macroscopic information where all unobserved degrees of freedom are averaged over. That is the deep reason why the box-counting "measure" must be taken and where irreversibility comes from.

Acknowledgement

I thank E.G.D. Cohen and Pierre Gaspard for detailed discussions.

References

1. D. H. E. Gross, Microcanonical thermodynamics: Phase transitions in "Small" systems. Lecture Notes in Physics (World Scientific, Singapore, 2000).

2. D. H. E. Gross and E. Votyakov, Phase transitions in "small" systems. Eur.Phys.J.B, 15, 115-126, (2000); http://arXiv.org/abs/cond-mat/9911257.

3. D. H. E. Gross. Micro-canonical statistical mechanics of some non-extensive systems. http://arXiv.org/abs/astro-ph/cond-mat/0004268 (2000).

4. A. Einstein, Über einen die Erzeugung und Verwandlung des Lichtes betreffenden heuristischen Gesichtspunkt. Annalen der Physik, 17, 132 (1905).

5. L. Boltzmann, Über die Beziehung eines allgemeinen mechanischen Satzes zum Hauptsatz der Wärmelehre. Sitzungsbericht der Akademie der Wissenschaften, Wien, 2, 67-73 (1877).

6. L. Boltzmann, Über die Begründung einer kinetischen Gastheorie auf anziehende Kräfte allein. Wiener Berichte, 89, 714 (1884).

7. E. Schrödinger, Statistical Thermodynamics, a Course of Seminar Lectures, delivered in January-March 1944 at the School of Theoretical Physics (Cambridge University Press, London, 1946).

8. Elliott H. Lieb and J. Yngvason, The physics and mathematics of the second law of thermodynamics. Physics Report,cond-mat/9708200, 310, 1-96 (1999).

9. J. Bricmont, Science of chaos or chaos in science? Physicalia Magazine, Proceedings of the New York Academy of Science, to appear, 1-50 (2000).

10. D.H.E. Gross, Phase transitions in "small" systems - a challenge for thermodynamics. http://arXiv.org/abs/cond-mat/0006087, page 8 (2000).

11. A. Pais, Subtle is the Lord, chapter 4, pages 60 - 78. (Oxford University Press, Oxford, 1982).

12. N. S. Krylov, Works on the Foundation of Statistical Physics. (Princeton University Press, Princeton, 1979).

13. R. F. Fox, Entropy evolution for the baker map. Chaos, 8, 462-465 (1998).

14. T. Gilbert, J. R. Dorfman, and P. Gaspard, Entropy production, fractals, and relaxation to equilibrium. Phys.Rev.Lett., 85, 1606,nlin.CD/000301 (2000).

15. H. Goldstein, Classical Mechanics (Addison-Wesley, Reading, Mass, 1959).

16. J. L. Lebowitz, Microscopic origins of irreversible macroscopic behavior. Physica A, 263, 516-527 (1999).

17. J. L. Lebowitz, Statistical mechanics: A selective review of two central issues. Rev.Mod.Phys., 71, S346-S357 (1999).

18. K. Falconer, Fractal Geometry - Mathematical Foundations and Applications (John Wiley & Sons, Chichester, New York, Brisbane, Toronto, Singapore, 1990).

19. E. W. Weisstein, Concise Encyclopedia of Mathematics (CRC Press, London, New York, Washington D.C., 1999; CD-ROM, edition 1, 20.5.99).

20. J. Loschmidt, Wienerberichte, 73, 128 (1876).

21. E. Zermelo, Wied.Ann., 57, 778-784 (1896).

22. E. Zermelo, Über die mechanische Erklärung irreversiblen Vorgänge. Wied.Ann., 60, 392-398 (1897).

23. E. G. D. Cohen, Boltzmann and statistical mechanics. In Boltzmann's Legacy, 150 Years after his Birth, http://xxx.lanl.gov/abs/cond-mat/9608054 (Atti dell Accademia dei Lincei, Rome, 1997).

24. E. G. D. Cohen, Boltzmann and Statistical Mechanics, volume 371 of Dynamics: Models and Kinetic Methods for Nonequilibrium Many Body Systems, J. Karkheck editor, 223-238 (Kluwer, Dordrecht, The Netherlands, 2000).


AN APPROACH TO QUANTUM PROBABILITY

STAN GUDDER

Department of Mathematics
University of Denver
Denver, Colorado 80208

sgudder@cs.du.edu

We present an approach to quantum probability that is motivated by the Feynman formalism. This approach shows that there is a realistic description of quantum mechanics and that nonrelativistic quantum theory can be derived from simple postulates of quantum probability. The basic concepts in this framework are measurements and actions. The measurements are similar to the dynamic variables of classical mechanics and the random variables of classical probability theory. The actions correspond to quantum mechanical states. An influence between configurations of a physical system is defined in terms of an action. The fundamental postulate of this approach is that the probability density at a measurement outcome x is the sum (or integral) of the influences between each pair of configurations that result in x upon executing the measurement.

1 Introduction

We shall discuss a new approach to quantum probability that combines a reformulation of the mathematical foundations of quantum mechanics and the basic tenets of probability theory. This approach is motivated by the Feynman formalism [1] and it answers various puzzling questions about traditional quantum mechanics. Some of these questions are the following.

1. Where does the quantum mechanical Hilbert space H come from?

2. Why are states represented by unit vectors in H and observables by self-adjoint operators on H?

3. Why does the probability have its postulated form?

4. Why do the position and momentum operators have their particular forms?

5. Why does a physical theory that must give real-valued results involve complex amplitudes or states?

6. Is there a realistic description of quantum mechanics?

Our philosophy is that quantum probability theory need not be the same as classical probability theory. That is, the probability need not be given by a measure. However, the predictions of quantum probability theory should agree with experimental long run relative frequencies. We shall show that there is a realistic description of quantum mechanics. In other words, a quantum system has properties independent of observation. We also show that nonrelativistic quantum mechanics can be derived from simple postulates of this approach. Our presentation is a modified version of the discussion in Gudder [2].

2 Formulation

We denote the set of possible configurations of a physical system $\mathcal{S}$ by $\Omega$ and call $\Omega$ a sample space. If $X$ is a measurement on $\mathcal{S}$, then executing $X$ results in a unique outcome depending on the configuration $\omega$ of $\mathcal{S}$. To be precise, we define a measurement to be a map $X$ from $\Omega$ onto its range $R(X) \subseteq \mathbb{R}$ satisfying:

(M1) $R(X)$ is the base space of a measure space $(R(X), \Sigma_X, \mu_X)$.

(M2) $X^{-1}(x)$ is the base space of a measurable space $(X^{-1}(x), \Sigma_X^x)$ for every $x \in R(X)$.

We call the elements of $R(X)$ $X$-outcomes and the sets in $\Sigma_X$ $X$-events. Note that $X^{-1}(x)$ corresponds to the set of configurations resulting in outcome $x$ when $X$ is executed and we call $X^{-1}(x)$ the $X$-fiber over $x$. The measure $\mu_X$ represents an a priori weight due to our knowledge of the system (for example, we may know the energy of $\mathcal{S}$ or we might assume the energy has a certain value). In the case of total ignorance, the weight is taken to be counting measure in the discrete case and uniform measure in the continuous case. This framework gives a realistic theory because a configuration $\omega$ determines the properties of $\mathcal{S}$ independent of any particular observation. That is, $\omega$ determines the outcomes of all measurements simultaneously. Notice that measurements are similar to the dynamical variables of classical mechanics and the random variables of classical probability theory. The sample space $\Omega$ gives an underlying level of reality upon which traditional quantum mechanics can be constructed.

If $X$ is a measurement, an $X$-action is a pair

$$\left(S, \{\mu_X^x : x \in R(X)\}\right)$$

where $S\colon \Omega \to \mathbb{R}$ and $\mu_X^x$ is a measure on $(X^{-1}(x), \Sigma_X^x)$. As we shall see, actions correspond to quantum states. For simplicity, we frequently denote an action by $S$ and we remark that $S$ depends on our model of $\mathcal{S}$ and also on our knowledge of $\mathcal{S}$. We define the influence between $\omega, \omega' \in \Omega$ relative to $S$ by

$$F_S(\omega,\omega') = N_S^2 \cos[S(\omega) - S(\omega')] \tag{1}$$

where $N_S > 0$ is a normalization constant. The appearance of the cosine in (1) is not arbitrary, but it can be derived from the regularity conditions of continuity and causality [2,5].

We now make a fundamental reformulation of the probability concept [2,5].

We postulate that the probability density $P_{X,S}(x)$ of an $X$-outcome $x$ is the sum (or integral) of the influences between each pair of configurations that result in $x$ upon executing $X$. Precisely, we postulate that $F_S(\omega,\omega')$ is integrable and that

$$P_{X,S}(x) = \int_{X^{-1}(x)} \int_{X^{-1}(x)} F_S(\omega,\omega')\,\mu_X^x(d\omega)\,\mu_X^x(d\omega') \tag{2}$$

Also, to ensure that $P_{X,S}(x)$ is indeed a probability density, we assume that $P_{X,S}(x)$ is measurable with respect to $\Sigma_X$ and that

$$\int_{R(X)} P_{X,S}(x)\,\mu_X(dx) = 1 \tag{3}$$

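Equation (3) can be employed to find $N_S$. To show that $P_{X,S}(x) \ge 0$ we have

$$\begin{aligned}
P_{X,S}(x) &= N_S^2 \int_{X^{-1}(x)} \int_{X^{-1}(x)} \left[\cos S(\omega)\cos S(\omega') + \sin S(\omega)\sin S(\omega')\right] \mu_X^x(d\omega)\,\mu_X^x(d\omega') \\
&= N_S^2 \left\{ \left[\int_{X^{-1}(x)} \cos S(\omega)\,\mu_X^x(d\omega)\right]^2 + \left[\int_{X^{-1}(x)} \sin S(\omega)\,\mu_X^x(d\omega)\right]^2 \right\} \ge 0
\end{aligned} \tag{4}$$

We conclude that $P_{X,S}(x)$ is a probability density on $(R(X), \Sigma_X, \mu_X)$.

If $B \in \Sigma_X$ is an $X$-event, we define the $(X,S)$-probability of $B$ by

$$P_{X,S}(B) = \int_B P_{X,S}(x)\,\mu_X(dx) \tag{5}$$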

Then $P_{X,S}\colon \Sigma_X \to [0,1]$ is a probability measure on $(R(X), \Sigma_X)$ that we call the $S$-distribution of $X$. If $h\colon R(X) \to \mathbb{R}$ is $P_{X,S}$-integrable, then the $S$-expectation of $h(X)$ is defined by

$$E_S(h(X)) = \int_{R(X)} h(x)\,P_{X,S}(dx) = \int_{R(X)} h(x)\,P_{X,S}(x)\,\mu_X(dx) \tag{6}$$

In particular if $h$ is the identity function, the $S$-expectation of $X$ becomes

$$E_S(X) = \int_{R(X)} x\,P_{X,S}(x)\,\mu_X(dx) \tag{7}$$
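The definitions above can be tried out on a finite sample space. The following is a minimal sketch, not taken from the paper: the six configurations, their action values and the counting-measure weights are hypothetical choices made only for illustration. It evaluates the double sum (2), checks it against the sum-of-squares form (4), fixes $N_S^2$ from (3), and computes the expectation (7).

```python
import math

# Hypothetical toy model (not from the paper): six configurations, two outcomes.
# Each fiber X^{-1}(x) lists pairs (action value S(w), fiber weight mu_X^x({w})).
fibers = {
    0: [(0.0, 1.0), (0.7, 1.0), (2.1, 1.0)],
    1: [(0.3, 1.0), (1.9, 1.0), (4.0, 1.0)],
}
mu_X = {0: 1.0, 1: 1.0}  # a priori weight on outcomes (counting measure)


def unnormalized_density(fiber):
    """Double sum of cos[S(w) - S(w')] over the fiber, i.e. (2) with N_S = 1."""
    return sum(
        math.cos(s1 - s2) * m1 * m2
        for s1, m1 in fiber
        for s2, m2 in fiber
    )


def sum_of_squares(fiber):
    """Second line of (4) with N_S = 1: (sum cos S)^2 + (sum sin S)^2."""
    c = sum(m * math.cos(s) for s, m in fiber)
    t = sum(m * math.sin(s) for s, m in fiber)
    return c * c + t * t


raw = {x: unnormalized_density(f) for x, f in fibers.items()}
# Equation (3) fixes N_S^2: the density must integrate to 1 against mu_X.
ns_squared = 1.0 / sum(raw[x] * mu_X[x] for x in raw)
density = {x: ns_squared * raw[x] for x in raw}
expectation = sum(x * density[x] * mu_X[x] for x in density)  # equation (7)

for x in sorted(density):
    print(f"P(x={x}) = {density[x]:.4f}")
print(f"E_S(X) = {expectation:.4f}")
```

Only real arithmetic is used here, which is the point made later in the text: complex amplitudes are a convenience, not a necessity, for computing $P_{X,S}$.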

Influence is a strictly quantum phenomenon that is not present in classical physics. In the classical limit, $F_S(\omega,\omega')$ approaches a delta function $\delta_\omega(\omega')$. In this limit $F_S(\omega,\omega') = 0$ for $\omega \ne \omega'$ and there is no influence between distinct configurations. We then have $P_{X,S}(x) = \mu_X^x(X^{-1}(x))$ which gives a classical probability framework.

We can extend this theory to include expectations of other functions on $\Omega$. Let $g\colon \Omega \to \mathbb{R}$ be a function that is integrable along $X$-fibers. We define the $(X,S)$-expectation of $g$ at $x$ by

$$E_{X,S}(g)(x) = \int_{X^{-1}(x)} \int_{X^{-1}(x)} g(\omega)\,F_S(\omega,\omega')\,\mu_X^x(d\omega)\,\mu_X^x(d\omega') \tag{8}$$

This is the natural generalization of (2) from a probability density to an expectation density. If $E_{X,S}(g)$ is integrable, then the $(X,S)$-expectation of $g$ is given by

$$E_{X,S}(g) = \int_{R(X)} E_{X,S}(g)(x)\,\mu_X(dx) \tag{9}$$

In particular, if $g(\omega) = h(X(\omega))$ then

$$E_{X,S}(g)(x) = h(x)\,P_{X,S}(x)$$

and

$$E_{X,S}(g) = \int_{R(X)} h(x)\,P_{X,S}(x)\,\mu_X(dx) = E_S(h(X))$$

This shows that (9) is an extension of (6). We can also use this formalism to compute probabilities of events in $\Omega$. Let $A \subseteq \Omega$, and denote the characteristic function of $A$ by $\chi_A$. If $\chi_A$ is integrable along $X$-fibers we define, analogously as in classical probability theory, the $(X,S)$-pseudoprobability of $A$ by

$$\mathcal{P}_{X,S}(A) = E_{X,S}(\chi_A)$$

It follows from (3) and (9) that $\mathcal{P}_{X,S}(\Omega) = 1$ and $\mathcal{P}_{X,S}$ is countably additive. However, $\mathcal{P}_{X,S}$ may have negative values, which is why it is called a pseudoprobability. Nevertheless, there are $\sigma$-algebras of subsets of $\Omega$ on which $\mathcal{P}_{X,S}$ is a probability measure. For example, if $A = X^{-1}(B)$ for $B \in \Sigma_X$, then it can be shown that $\mathcal{P}_{X,S}(A) = P_{X,S}(B)$ [2]. Therefore, in this case $\mathcal{P}_{X,S}$ reduces to the distribution $P_{X,S}$. We shall consider some less trivial examples later.
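A small finite example shows how a pseudoprobability can be negative while still summing to one. This is a hypothetical sketch, not an example from the paper: a single outcome whose fiber holds three configurations with counting-measure weights and action values $0, \pi, \pi$. With these values the normalization (3) gives $N_S = 1$.

```python
import math

# Hypothetical example: one outcome x whose fiber holds three configurations,
# each with fiber weight 1 (counting measure). Keys are labels, values S(w).
action = {"w1": 0.0, "w2": math.pi, "w3": math.pi}


def influence(a, b, ns_squared=1.0):
    """Influence (1) between the configurations labelled a and b."""
    return ns_squared * math.cos(action[a] - action[b])


def pseudoprobability(event, ns_squared=1.0):
    """(X, S)-expectation of the indicator of `event`, following (8) and (9)."""
    return sum(
        influence(a, b, ns_squared)
        for a in action if a in event
        for b in action
    )


total = pseudoprobability(set(action))        # equals 1, so N_S = 1 satisfies (3)
p_a = pseudoprobability({"w1"})               # negative value
p_complement = pseudoprobability({"w2", "w3"})
print(total, p_a, p_complement)
```

The event `{"w1"}` receives the value $-1$ and its complement the value $2$, so additivity and normalization hold even though the set function is not a probability measure on all subsets.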

3 Wave Functions and Hilbert Space

This section employs the formalism of Section 2 to derive the wave functions and Hilbert space of traditional quantum mechanics. It is not necessary to do this because the needed probability formulas have been presented in Section 2. However, as we shall see, the Hilbert space formulation gives more convenient and concise notations.

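Applying (4) we obtain

$$P_{X,S}(x) = \left| \int_{X^{-1}(x)} N_S\, e^{iS(\omega)}\,\mu_X^x(d\omega) \right|^2 \tag{10}$$

We call the function

$$f_S(\omega) = N_S\, e^{iS(\omega)} \tag{11}$$

the $S$-amplitude function and define the $(X,S)$-wave function by

$$f_{X,S}(x) = \int_{X^{-1}(x)} f_S(\omega)\,\mu_X^x(d\omega) \tag{12}$$

From (10) and (12) we obtain

$$P_{X,S}(x) = |f_{X,S}(x)|^2 \tag{13}$$

We also have

$$F_S(\omega,\omega') = N_S^2\,\mathrm{Re}\, e^{iS(\omega)} e^{-iS(\omega')} = \mathrm{Re}\, f_S(\omega) f_S(\omega')^* \tag{14}$$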

Equation (10) shows how the complex numbers arise in quantum mechanics. The complex numbers are not needed for the computation of $P_{X,S}$ because we can always write $F_S(\omega,\omega')$ in the form (1). They are merely a convenience that gives a simple and concise formula. Equation (11) gives the Feynman amplitude function which we have now derived from deeper principles and (12) is Feynman's prescription that the amplitude of an outcome $x$ is the sum (or integral) of the amplitudes of the configurations (or alternatives) that result in $x$ when $X$ is executed.
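The equivalence of the real route (1), (2) and the complex route (11) to (13) is easy to confirm numerically. The sketch below assumes a hypothetical fiber of 50 configurations with random action values, unit fiber weights and an arbitrary value of $N_S$; none of these numbers come from the paper.

```python
import cmath
import math
import random

random.seed(0)  # fixed seed so that the run is reproducible

# Hypothetical fiber: 50 configurations with random action values S(w),
# unit fiber weights, and an arbitrary normalization constant N_S.
actions = [random.uniform(0.0, 2.0 * math.pi) for _ in range(50)]
n_s = 0.1

# Real route, equations (1) and (2): sum the influences over all pairs.
density_real = sum(
    n_s**2 * math.cos(s1 - s2) for s1 in actions for s2 in actions
)

# Complex route, equations (11) to (13): sum the amplitudes, then square.
wave_function = sum(n_s * cmath.exp(1j * s) for s in actions)
density_complex = abs(wave_function) ** 2

print(density_real, density_complex)
```

The real route costs a number of cosine evaluations quadratic in the fiber size, while the complex route is linear, which is one practical sense in which the complex form is "a convenience".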

If $B \in \Sigma_X$, applying (5) and (13) gives

$$P_{X,S}(B) = \int_B |f_{X,S}(x)|^2\,\mu_X(dx) \tag{15}$$

and this is the usual probabilistic formula of traditional quantum mechanics. It follows from (3) that $f_{X,S}$ is a unit vector in the Hilbert space $L^2(R(X), \Sigma_X, \mu_X)$ and this derives the quantum Hilbert space and the vector form for a state. If $\mathcal{A}_X$ is a set of $X$-actions, then the Hilbert space $H_X \subseteq L^2(R(X), \Sigma_X, \mu_X)$ generated by the set of wave functions $\{f_{X,S} : S \in \mathcal{A}_X\}$ is called an $X$-Hilbert space. Some $X$-actions may not be relevant for physical reasons so we may want $\mathcal{A}_X$ to be a proper subset of the set of all $X$-actions.

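If $g\colon \Omega \to \mathbb{R}$ is integrable along $X$-fibers and $S \in \mathcal{A}_X$, we define the $(X,S)$-amplitude average of $g$ at $x$ by

$$f_{X,S}(g)(x) = \int_{X^{-1}(x)} g(\omega) f_S(\omega)\,\mu_X^x(d\omega) = N_S \int_{X^{-1}(x)} g(\omega)\, e^{iS(\omega)}\,\mu_X^x(d\omega) \tag{16}$$

Applying (8) and (14), we obtain

$$\begin{aligned}
E_{X,S}(g)(x) &= \mathrm{Re} \int_{X^{-1}(x)} g(\omega) f_S(\omega)\,\mu_X^x(d\omega) \int_{X^{-1}(x)} f_S(\omega')^*\,\mu_X^x(d\omega') \\
&= \mathrm{Re}\, f_{X,S}(g)(x)\, f_{X,S}(x)^*
\end{aligned}$$

It follows from (9) that

$$E_{X,S}(g) = \mathrm{Re}\,\langle f_{X,S}(g), f_{X,S} \rangle \tag{17}$$

Define the linear operator $\hat{g}$ on $H_X$ by $\hat{g} f_{X,S}(x) = f_{X,S}(g)(x)$ and extend by linearity. If the operator $\hat{g}$ is self-adjoint on $H_X$ we call $g$ an $X$-observable and we have

$$E_{X,S}(g) = \langle \hat{g} f_{X,S}, f_{X,S} \rangle \tag{18}$$

for all $S \in \mathcal{A}_X$. We then say that $g$ is represented by the self-adjoint operator $\hat{g}$ on $H_X$. This derives the representation of observables by self-adjoint operators.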

For a simple example of a representation, let $g\colon \Omega \to \mathbb{R}$ be a constant function $g(\omega) = c$. Then (16) gives

$$f_{X,S}(g)(x) = c \int_{X^{-1}(x)} f_S(\omega)\,\mu_X^x(d\omega) = c\, f_{X,S}(x)$$

Hence, $g$ is an $X$-observable and is represented by the self-adjoint operator $cI$. As another example, letting $g = X$ we have by (16) that

$$f_{X,S}(X)(x) = x\, f_{X,S}(x)$$

It follows that $X$ is represented by the self-adjoint operator $\hat{X}$ on $H_X$ given by $\hat{X}u(x) = x\,u(x)$. We conclude that $H_X$ is a Hilbert space in which $X$ is "diagonal." More generally, since

$$f_{X,S}(h(X))(x) = h(x)\, f_{X,S}(x) \tag{19}$$

we see that $h(X)$ is represented by the self-adjoint operator $h(X)^\wedge u(x) = h(x)\,u(x)$. Moreover, the spectral measure $P^{\hat{X}}$ is given by $P^{\hat{X}}(B)u(x) = \chi_B(x)\,u(x)$ and applying (15) gives

$$P_{X,S}(B) = \left\| P^{\hat{X}}(B) f_{X,S} \right\|^2$$

which is again a standard probabilistic formula. Finally, for $A \subseteq \Omega$ the $(X,S)$-pseudoprobability becomes by (17)

$$\mathcal{P}_{X,S}(A) = \mathrm{Re}\,\langle f_{X,S}(\chi_A), f_{X,S} \rangle \tag{20}$$

where by (16) we have

$$f_{X,S}(\chi_A)(x) = \int_{X^{-1}(x) \cap A} f_S(\omega)\,\mu_X^x(d\omega) = N_S \int_{X^{-1}(x) \cap A} e^{iS(\omega)}\,\mu_X^x(d\omega) \tag{21}$$

4 Spin

We now illustrate the framework presented in the last two sections by presenting a model for spin 1/2 measurements. Fix a direction corresponding to the $z$ axis and assume that the spin $j_z$ in the $z$ direction is known (either $1/2$ or $-1/2$). Let $\omega \in [0,\pi]$ denote a direction whose angle to the $z$ axis is $\omega$. By symmetry, the spin distribution should depend only on $\omega$. Let $\Omega = [0,\pi]$, $\theta \in \Omega$ and let $X\colon \Omega \to \{-1/2, 1/2\}$ be the function

$$X(\omega) = -1/2 \text{ for } \omega \in [0,\theta] \quad\text{and}\quad X(\omega) = 1/2 \text{ for } \omega \in (\theta,\pi].$$

We make $X$ into a measurement by defining

$$\mu_X(\{-1/2\}) = \mu_X(\{1/2\}) = 1$$

and endowing $X^{-1}(-1/2) = [0,\theta]$ and $X^{-1}(1/2) = (\theta,\pi]$ with the usual Borel structure. The function $X$ corresponds to a spin 1/2 measurement in the $\theta$ direction. Letting $\theta$ vary, we obtain an infinite number of spin measurements each applied in a different direction. Observe that a sample point $\omega \in \Omega$ determines the spin in every direction simultaneously.

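For $j_z = 1/2$ we define the $X$-action $\left(S, \{\mu_X^{-1/2}, \mu_X^{1/2}\}\right)$ given by $S(\omega) = \omega$ and $\mu_X^{-1/2}$, $\mu_X^{1/2}$ are $\mu/2$ where $\mu$ is Lebesgue measure restricted to $X^{-1}(-1/2)$, $X^{-1}(1/2)$, respectively. We then have

$$F_S(\omega,\omega') = \cos(\omega - \omega')$$

(we shall see that $N_S = 1$). The probabilities become

$$\begin{aligned}
P_{X,S}(-1/2) &= \tfrac14 \int_0^\theta \int_0^\theta \cos(\omega - \omega')\,d\omega\,d\omega' \\
&= \tfrac14 \left[\int_0^\theta \cos\omega\,d\omega\right]^2 + \tfrac14 \left[\int_0^\theta \sin\omega\,d\omega\right]^2 \\
&= \tfrac14 \sin^2\theta + \tfrac14 (1 - \cos\theta)^2 = \sin^2\tfrac{\theta}{2}
\end{aligned} \tag{22}$$

$$\begin{aligned}
P_{X,S}(1/2) &= \tfrac14 \int_\theta^\pi \int_\theta^\pi \cos(\omega - \omega')\,d\omega\,d\omega' \\
&= \tfrac14 \left[\int_\theta^\pi \cos\omega\,d\omega\right]^2 + \tfrac14 \left[\int_\theta^\pi \sin\omega\,d\omega\right]^2 \\
&= \tfrac14 \sin^2\theta + \tfrac14 (1 + \cos\theta)^2 = \cos^2\tfrac{\theta}{2}
\end{aligned} \tag{23}$$

Since $P_{X,S}(-1/2) + P_{X,S}(1/2) = 1$ we see that $N_S = 1$. Notice that (22) and (23) are the usual probability distribution for spin in the $\theta$ direction when $j_z = 1/2$.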
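The closed forms in (22) and (23) can be checked by brute-force quadrature of the influence over each fiber. The following sketch uses a plain midpoint rule; the grid size and the sample values of $\theta$ are arbitrary choices made for illustration.

```python
import math


def fiber_probability(lower, upper, n=200):
    """Midpoint-rule value of (1/4) * double integral of cos(w - w') over
    [lower, upper]^2, as in (22) and (23) with N_S = 1."""
    h = (upper - lower) / n
    points = [lower + h * (k + 0.5) for k in range(n)]
    total = sum(math.cos(a - b) for a in points for b in points)
    return 0.25 * total * h * h


results = []
for theta in (0.3, 1.0, 2.5):
    p_minus = fiber_probability(0.0, theta)       # fiber [0, theta]
    p_plus = fiber_probability(theta, math.pi)    # fiber (theta, pi]
    results.append((theta, p_minus, p_plus))
    print(theta, p_minus, math.sin(theta / 2) ** 2,
          p_plus, math.cos(theta / 2) ** 2)
```

Each printed row compares the numerical double integral with $\sin^2(\theta/2)$ or $\cos^2(\theta/2)$; the two numerical values in a row also sum to 1 up to quadrature error, which is the statement $N_S = 1$.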

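For $j_z = -1/2$ we define the $X$-action $\left(S', \{\nu_X^{-1/2}, \nu_X^{1/2}\}\right)$ given by $S'(\omega) = \omega$ for $\omega \in (0,\pi)$ and $S'(\omega) = -\pi/2$ for $\omega \in \{0,\pi\}$, and $\nu_X^{-1/2} = \delta_0 + \mu/2$, $\nu_X^{1/2} = \delta_\pi + \mu/2$ where $\delta_0$, $\delta_\pi$ are the Dirac point measures at $0$, $\pi$, respectively. A similar, but more tedious calculation gives

$$P_{X,S'}(-1/2) = \cos^2\tfrac{\theta}{2}$$

$$P_{X,S'}(1/2) = \sin^2\tfrac{\theta}{2}$$

which is the usual distribution for spin in the $\theta$ direction when $j_z = -1/2$.

We now examine the wave functions and Hilbert space corresponding to this model. The $S$-amplitude function becomes $f_S(\omega) = e^{i\omega}$ and the $(X,S)$-wave function $f_{X,S}$ is given by

$$f_{X,S}(-1/2) = \tfrac12 \int_0^\theta e^{i\omega}\,d\omega = \tfrac{i}{2}\left(1 - e^{i\theta}\right)$$

$$f_{X,S}(1/2) = \tfrac12 \int_\theta^\pi e^{i\omega}\,d\omega = \tfrac{i}{2}\left(1 + e^{i\theta}\right)$$

The $S'$-amplitude function becomes $f_{S'}(\omega) = e^{i\omega}$ for $\omega \in (0,\pi)$ and $f_{S'}(\omega) = -i$ for $\omega \in \{0,\pi\}$, and the $(X,S')$-wave function $f_{X,S'}$ is given by

$$f_{X,S'}(-1/2) = \int_{[0,\theta]} f_{S'}(\omega)\,\nu_X^{-1/2}(d\omega) = -i + \tfrac12 \int_0^\theta e^{i\omega}\,d\omega = -\tfrac{i}{2}\left(1 + e^{i\theta}\right)$$

$$f_{X,S'}(1/2) = \int_{(\theta,\pi]} f_{S'}(\omega)\,\nu_X^{1/2}(d\omega) = -i + \tfrac12 \int_\theta^\pi e^{i\omega}\,d\omega = -\tfrac{i}{2}\left(1 - e^{i\theta}\right)$$

The $X$-Hilbert space is clearly $\mathbb{C}^2$ and we can represent $f_{X,S}$ and $f_{X,S'}$ in $\mathbb{C}^2$ (up to the phase factors $i$ and $-i$) by the unit vectors

$$v_S = \tfrac12\left(1 - e^{i\theta},\, 1 + e^{i\theta}\right)$$

$$v_{S'} = \tfrac12\left(1 + e^{i\theta},\, 1 - e^{i\theta}\right)$$

Notice that $v_S \perp v_{S'}$. Also, when $\theta = 0$, $v_S = (0,1)$ and $v_{S'} = (1,0)$ which are the usual eigenvectors for the spin 1/2 operator in the $z$ direction. We can treat this as a measurement and the general $X$ as an observable. It can be shown that the matrix for $X$ in the standard basis $(1,0)$, $(0,1)$ becomes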

$$\hat{X} = \frac12 \begin{pmatrix} \cos\theta & i\sin\theta \\ -i\sin\theta & -\cos\theta \end{pmatrix} = \frac12 \cos\theta \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} + \frac12 \sin\theta \begin{pmatrix} 0 & i \\ -i & 0 \end{pmatrix}$$

which is the usual form for a spin 1/2 matrix in the direction $\theta$.

We can extend this analysis to higher order spins [3]. Moreover, this framework gives a realistic model for the Bohm version of the EPR problem [4]. The reason that Bell's theorem is not contradicted is because Bell's inequalities are derived using classical probability theory and we have employed quantum probability theory.
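The wave functions of this spin model can be verified directly, including the "more tedious" case $j_z = -1/2$ with its point masses. The sketch below is an illustration under the conventions stated in the text (Lebesgue measure halved on each fiber, unit Dirac masses at $0$ and $\pi$, action value $-\pi/2$ at those two points); the helper names and the sample angle are hypothetical.

```python
import cmath
import math


def integral_exp(a, b):
    """Exact value of the integral of exp(i*w) over [a, b]."""
    return (cmath.exp(1j * b) - cmath.exp(1j * a)) / 1j


def wave_functions(theta):
    """Return (f_{X,S}, f_{X,S'}) as pairs ordered (outcome -1/2, outcome 1/2)."""
    # j_z = 1/2: S(w) = w, fiber measures mu/2.
    f_s = (0.5 * integral_exp(0.0, theta), 0.5 * integral_exp(theta, math.pi))
    # j_z = -1/2: unit point masses at 0 and pi contribute exp(-i*pi/2) = -i.
    f_sp = (-1j + 0.5 * integral_exp(0.0, theta),
            -1j + 0.5 * integral_exp(theta, math.pi))
    return f_s, f_sp


theta = 1.2
f_s, f_sp = wave_functions(theta)
probs_s = [abs(z) ** 2 for z in f_s]      # sin^2(theta/2), cos^2(theta/2)
probs_sp = [abs(z) ** 2 for z in f_sp]    # cos^2(theta/2), sin^2(theta/2)
overlap = sum(a * b.conjugate() for a, b in zip(f_s, f_sp))
print(probs_s, probs_sp, abs(overlap))
```

The vanishing overlap is the orthogonality of the two state vectors, and the squared moduli reproduce the two spin distributions via (13).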


5 Traditional Quantum Mechanics

We now show that this formalism contains traditional nonrelativistic quantum mechanics. For simplicity, we consider a single spinless particle in one dimension although this work easily generalizes to three dimensions. We take our sample space to be the phase space

$$\Omega = \mathbb{R}^2 = \{(q,p) : q, p \in \mathbb{R}\}$$

The two most important measurements are the position and momentum given by $Q(q,p) = q$, $P(q,p) = p$, respectively. However, as is frequently done in quantum mechanics, we shall investigate the $q$-representation of the system. In this case, $Q$ is considered a measurement and $P\colon \Omega \to \mathbb{R}$ is viewed as a function on $\Omega$ which, as we shall show, is a $Q$-observable.

Each $Q$-fiber $Q^{-1}(q) = \{(q,p) : p \in \mathbb{R}\}$ can be identified with $\mathbb{R}$. We make $Q$ a measurement by endowing its range $R(Q) = \mathbb{R}$ with Lebesgue measure and its fibers with the usual Borel structure of $\mathbb{R}$. Only certain $Q$-actions $\left(S, \{\mu_Q^q : q \in \mathbb{R}\}\right)$ correspond to traditional quantum states and these can be derived from natural postulates. We assume that $\mu_Q^q$ is absolutely continuous relative to Lebesgue measure on $\mathbb{R}$ and that $\mu_Q^q$ is independent of $q$. This is because sets of Lebesgue measure zero are too small to have any effect on the outcomes of position measurements and there is no a priori reason to distinguish between $Q$-fibers. It follows from the Radon-Nikodym theorem that there exists a nonnegative Lebesgue measurable function $\xi\colon \mathbb{R} \to \mathbb{R}$ such that

$$\mu_Q^q(dp) = (2\pi\hbar)^{-1/2}\,\xi(p)\,dp \tag{24}$$

We take $S\colon \Omega \to \mathbb{R}$ to have the form

$$S(q,p) = \frac{qp}{\hbar} + \eta(p) \tag{25}$$

This form is natural because $qp$ is the classical action and adding a function of momentum gives a quantum fluctuation. We could also add a function of $q$ but it is easy to see that this would just multiply the wave function by a constant phase which would not alter the probabilistic formulas. Denote by $\mathcal{A}_Q$ the set of $Q$-actions that have the form (24), (25).

Applying (12) for $S \in \mathcal{A}_Q$, we find that the $(Q,S)$-wave function becomes

$$f_{Q,S}(q) = (2\pi\hbar)^{-1/2} \int \xi(p)\, e^{i\eta(p)}\, e^{iqp/\hbar}\,dp$$

Defining

$$\phi(p) = \xi(p)\, e^{i\eta(p)} \tag{26}$$

and denoting the inverse Fourier transform by $^\vee$ we have

$$f_{Q,S}(q) = (2\pi\hbar)^{-1/2} \int \phi(p)\, e^{iqp/\hbar}\,dp = \phi^\vee(q) \tag{27}$$

In order for (3) to be satisfied, $f_{Q,S}$ must be a unit vector in $L^2(\mathbb{R}, dq)$ or equivalently, $\phi(p)$ must be a unit vector in $L^2(\mathbb{R}, dp)$. However, every vector in $L^2(\mathbb{R}, dp)$ has the form (26) for some functions $\xi\colon \mathbb{R} \to \mathbb{R}^+$, $\eta\colon \mathbb{R} \to \mathbb{R}$. It follows that the $Q$-Hilbert space becomes the traditional Hilbert space $H_Q = L^2(\mathbb{R}, dq)$ and $f_{Q,S}$ is the usual wave function (or state).

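Let $\left(S, \{\mu_Q^q : q \in \mathbb{R}\}\right)$ be a fixed $Q$-action in $\mathcal{A}_Q$ of the form (24), (25) and let $\psi(q) = f_{Q,S}(q)$, $\phi(p) = \xi(p)\, e^{i\eta(p)}$. Applying (16) and (27) we have

$$\begin{aligned}
f_{Q,S}(P)(q) &= (2\pi\hbar)^{-1/2} \int p\,\phi(p)\, e^{iqp/\hbar}\,dp \\
&= -i\hbar \frac{d}{dq}\, (2\pi\hbar)^{-1/2} \int \phi(p)\, e^{iqp/\hbar}\,dp = -i\hbar \frac{d\psi}{dq}(q)
\end{aligned}$$

More generally, if $n$ is a positive integer, we obtain

$$f_{Q,S}(P^n)(q) = \left(-i\hbar \frac{d}{dq}\right)^n \psi(q) \tag{28}$$

Moreover, applying (18) we have

$$E_{Q,S}(P^n) = \int \left[\left(-i\hbar \frac{d}{dq}\right)^n \psi(q)\right] \psi^*(q)\,dq$$

which is the usual quantum expectation formula. We conclude from (28) that $P^n$ is a $Q$-observable and is represented by the operator $(-i\hbar\, d/dq)^n$. Moreover, if $V\colon \mathbb{R} \to \mathbb{R}$ is measurable, we see from (19) that $V(Q)$ is a $Q$-observable and is represented by the operator $V(Q)^\wedge u(q) = V(q)\,u(q)$. This together with our observation concerning $P^n$, gives a derivation of the Bohr correspondence principle.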
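The identity behind the momentum operator, that the amplitude average of $P$ equals $-i\hbar\, d\psi/dq$, can be checked numerically. The sketch below assumes units with $\hbar = 1$ and a hypothetical Gaussian amplitude $\xi$ with a linear phase $\eta$; these choices, the grid sizes and the evaluation point are illustrative only.

```python
import cmath
import math

HBAR = 1.0  # units with hbar = 1 (an assumption made for the illustration)


def phi(p):
    """Hypothetical momentum amplitude xi(p) * exp(i * eta(p)), as in (26)."""
    xi = math.pi ** -0.25 * math.exp(-0.5 * (p - 1.0) ** 2)
    eta = 0.3 * p
    return xi * cmath.exp(1j * eta)


def amplitude_average(g, q, p_max=10.0, n=2000):
    """Midpoint-rule value of (2*pi*hbar)^(-1/2) * integral of
    g(p) * phi(p) * exp(i*q*p/hbar) dp, i.e. (16) for a function g of p."""
    h = 2.0 * p_max / n
    total = 0j
    for k in range(n):
        p = -p_max + h * (k + 0.5)
        total += g(p) * phi(p) * cmath.exp(1j * q * p / HBAR)
    return total * h / math.sqrt(2.0 * math.pi * HBAR)


def psi(q):
    """Wave function (27): the amplitude average of the constant function 1."""
    return amplitude_average(lambda p: 1.0, q)


q, dq = 0.4, 1e-4
lhs = amplitude_average(lambda p: p, q)                     # f_{Q,S}(P)(q)
rhs = -1j * HBAR * (psi(q + dq) - psi(q - dq)) / (2 * dq)   # -i*hbar*dpsi/dq
print(lhs, rhs)
```

The derivative is approximated by a central difference, so agreement is expected only up to the finite-difference error, not to machine precision.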

We now consider probability distributions. We have already seen in (15) that

$$P_{Q,S}(B) = \int_B |\psi(q)|^2\,dq$$

which is the usual distribution of $Q$. It is more interesting to compute the probability of $A = P^{-1}(B)$ for the momentum function $P$. We have from (21) that

$$f_{Q,S}(\chi_A)(q) = (2\pi\hbar)^{-1/2} \int_B \phi(p)\, e^{iqp/\hbar}\,dp = (\chi_B \phi)^\vee(q)$$

Hence, by (20) and the Plancherel formula we obtain

$$\begin{aligned}
\mathcal{P}_{Q,S}\left[P^{-1}(B)\right] &= \mathrm{Re} \int (\chi_B \phi)^\vee(q)\, \psi^*(q)\,dq \\
&= \mathrm{Re} \int (\chi_B \phi)(p)\, \phi^*(p)\,dp \\
&= \int_B |\phi(p)|^2\,dp
\end{aligned}$$

Again, this is the usual momentum distribution. This gives an example in which $\mathcal{P}_{Q,S}$ is an actual probability measure on a $\sigma$-algebra of subsets of $\Omega$.
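The Plancherel step can be illustrated numerically: the pseudoprobability of a momentum event, computed in the position representation via (20) and (21), should agree with the direct integral of $|\phi|^2$ over $B$. The sketch below again assumes $\hbar = 1$, a hypothetical Gaussian $\phi$, an arbitrary interval $B$, and midpoint quadrature on truncated grids.

```python
import cmath
import math

HBAR = 1.0  # units with hbar = 1 (assumption for the illustration)


def phi(p):
    """Hypothetical unit vector in L^2(R, dp) of the form (26)."""
    return math.pi ** -0.25 * math.exp(-0.5 * p * p) * cmath.exp(0.3j * p)


def midpoints(a, b, n):
    """Midpoints and cell width of a uniform partition of [a, b]."""
    h = (b - a) / n
    return [a + h * (k + 0.5) for k in range(n)], h


def inverse_transform(a, b, q, n=200):
    """(2*pi*hbar)^(-1/2) * integral over [a, b] of phi(p)*exp(i*q*p/hbar) dp."""
    ps, h = midpoints(a, b, n)
    total = sum(phi(p) * cmath.exp(1j * q * p / HBAR) for p in ps)
    return total * h / math.sqrt(2.0 * math.pi * HBAR)


B = (0.5, 2.0)  # momentum event B; A = P^{-1}(B)
qs, hq = midpoints(-12.0, 12.0, 300)

# Right-hand side of (20): Re <f_{Q,S}(chi_A), f_{Q,S}> in L^2(R, dq).
pseudo = sum(
    (inverse_transform(B[0], B[1], q)
     * inverse_transform(-10.0, 10.0, q).conjugate()).real
    for q in qs
) * hq

# Usual momentum distribution: integral over B of |phi(p)|^2 dp.
ps, hp = midpoints(B[0], B[1], 200)
direct = sum(abs(phi(p)) ** 2 for p in ps) * hp
print(pseudo, direct)
```

For this Gaussian the direct value can also be compared with the closed form $(\operatorname{erf}(2) - \operatorname{erf}(0.5))/2$.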

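Until now we have treated time as fixed. We now briefly consider dynamics. Let $\psi(q,t)$ be a smooth function. Our previous formulas hold with $\psi(q)$ replaced by $\psi(q,t)$ and $\mu_Q^q$ replaced by $\mu_{Q,t}^q$. We now derive Schrödinger's equation from Hamilton's equation of classical mechanics $dp/dt = -\partial H/\partial q$. Suppose the energy function has the form

$$H(q,p) = \frac{p^2}{2m} + V(q)$$

We assume that Hamilton's equation holds in the amplitude average. Applying (16) we have

$$\frac{\partial}{\partial t} \int p\, f_S(q,p,t)\,\mu_{Q,t}^q(dp) = -\frac{\partial}{\partial q} \int H(q,p)\, f_S(q,p,t)\,\mu_{Q,t}^q(dp)$$

Hence

$$\frac{\partial}{\partial t} \int p\,\phi(p,t)\, e^{iqp/\hbar}\,dp = -\frac{\partial}{\partial q} \int H(q,p)\,\phi(p,t)\, e^{iqp/\hbar}\,dp$$

Applying (28) and (19) gives

$$\frac{\partial}{\partial t}\left(-i\hbar \frac{\partial \psi}{\partial q}\right) = -\frac{\partial}{\partial q}\left[-\frac{\hbar^2}{2m} \frac{\partial^2 \psi}{\partial q^2} + V(q)\,\psi\right]$$

Interchanging the order of differentiation on the left side of this equation and integrating with respect to $q$ gives Schrödinger's equation.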
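The last step can be spelled out as follows. This is a sketch of the omitted algebra: the function $c(t)$ is the "constant" of the $q$-integration, and dropping it (for instance because $\psi$ and its derivatives vanish as $|q| \to \infty$) is an assumption that the text does not state explicitly.

```latex
\frac{\partial}{\partial q}\left[ i\hbar\,\frac{\partial\psi}{\partial t}
  + \frac{\hbar^{2}}{2m}\,\frac{\partial^{2}\psi}{\partial q^{2}}
  - V(q)\,\psi \right] = 0
\quad\Longrightarrow\quad
i\hbar\,\frac{\partial\psi}{\partial t}
  = -\frac{\hbar^{2}}{2m}\,\frac{\partial^{2}\psi}{\partial q^{2}}
  + V(q)\,\psi + c(t),
\qquad c(t) = 0 .
```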

6 Concluding Remarks

In this paper we have presented a realistic, contextual, nonlocal approach to quantum probability theory. The formalism is realistic because each sample point $\omega \in \Omega$ uniquely determines a value $X(\omega)$ for any measurement $X$. In this way, a physical system $\mathcal{S}$ possesses all of its attributes independent of whether they are measured. Although the sample space $\Omega$ exists and we can discuss its properties, $\Omega$ is not physically accessible in general. This is because the sample points may not correspond to physical states which can be prepared in the laboratory or at least exist in nature. We may think of $\Omega$ as a hidden variable completion of quantum mechanics. This approach is contextual because it is necessary to specify a particular basic measurement $X$. Once $X$ is specified, a Hilbert space $H_X$ can be constructed and $H_X$ provides an $X$-representation for $\mathcal{S}$. Of course, one may choose a different basic measurement $Y$ and then the $Y$-representation will give a different picture of $\mathcal{S}$. For example, in traditional quantum mechanics we usually choose the position representation or the momentum representation to describe $\mathcal{S}$. For a given basic measurement $X$ and an action $S$ we have given a method for constructing the probability distribution $P_{X,S}$ of $X$. We have shown that $P_{X,S}$ may be found in terms of a state vector $f_{X,S} \in H_X$ and these correspond to physically accessible states. In $H_X$ the measurement $X$ and functions of $X$ are "diagonal" and hence represented by "random variables." Other measurements which we call observables to distinguish them from $X$ are represented by self-adjoint operators on $H_X$ and their usual distributions follow in a natural way. The theory is nonlocal because the distribution $P_{X,S}$ is specified by an influence function $F_S(\omega,\omega')$. This function provides an influence between pairs of sample points which in a spacetime model may be spacelike separated.

There is considerable controversy concerning various interpretations and approaches to probability theory. I believe that three types of probabilities are necessary for a description of quantum mechanics. The probabilities and distributions of measurement results in the laboratory are usually computed using long run relative frequencies. Even though a measurement $X$ may involve a microscopic system $\mathcal{S}$ (for example, the position of an electron), $\mathcal{S}$ must interact with a macroscopic apparatus in order to obtain an observable outcome. The theoretician's task is to find the distribution $P_X$ of $X$. This theoretical distribution should agree with the long run relative frequencies found in the laboratory or give predictions that can eventually be tested experimentally. Since there are serious well-known difficulties in dealing with abstract theories of relative frequencies, it is convenient and perhaps even necessary to use the standard Kolmogorovian probability theory for describing $P_X$. Now $P_X$ is a probability measure that satisfies the axioms of standard probability theory. However, the method for computing $P_X$ is characteristic of quantum mechanics and is not found in any classical theory. Richard Feynman, whose work has motivated the present paper, once said that nobody really understands quantum mechanics. I think that what he meant is that nobody understands why nature has chosen to compute probabilities in this unusual way. As presented here, the probability density for $P_X$ is found by employing an influence function. The advantage of this method is that it is physically motivated and avoids complex numbers. An equivalent method, which is usually employed in quantum mechanics, is to take the absolute value squared of the wave function.

The quantum probability approach that we have presented contains standard probability theory as a special case. Thus, we only need two types of probabilities to describe quantum mechanics. Standard probability theory as developed by Kolmogorov is a distillation of hundreds of years of experience with empirical and theoretical studies of chance phenomena. The founders of the subject were concerned with games of chance, statistics and the behavior of macroscopic objects. They were not aware of microscopic objects and quantum mechanics and had no reason to design a probability theory for describing such situations. It is therefore not surprising that a new theory called quantum probability theory had to be developed to serve these purposes.

References

1. R. Feynman and A. Hibbs, Quantum Mechanics and Path Integrals (McGraw-Hill, New York, 1965).

2. S. Gudder, Int. J. Theor. Phys. 32, 1747 (1993).

3. S. Gudder, Int. J. Theor. Phys. 32, 824 (1993).

4. S. Gudder, Quantum probability and the EPR argument, Ann. Fond. Louis de Broglie, 20, 167 (1994).

5. G. Hemion, Int. J. Theor. Phys. 29, 1335 (1990).

INNOVATION APPROACH TO STOCHASTIC PROCESSES AND QUANTUM DYNAMICS

TAKEYUKI HIDA

Department of Mathematics
Meijo University
Tenpaku, Nagoya 468-8502
and
Nagoya University (Professor Emeritus)

The theory of stochastic processes developed extensively in the twentieth century, and a beautiful connection with quantum dynamics has been established. It seems to be a good time to revisit the foundations of stochastic processes and quantum mechanics, in the hope that the attempt will suggest further directions for these two intimately related disciplines. For this purpose, we review some topics in white noise analysis and observe the motivations from physics and how they have actually been realized.

1 Introduction

We shall discuss the analysis of random complex systems and its connection with quantum dynamics. In particular, we analyse some stochastic processes X(t) and random fields X(C) by using the innovation, and revisit quantum dynamics in connection with stochastic analysis. Actually, our aim is to study random complex systems, including quantum fields, by using white noise analysis.

The basic idea of our analysis is that we first discuss stochastic processes by taking a basic and standard system of random variables, and then express the given process as a function of that system. The system of variables from which we start is called a system of idealized elemental random variables (abbr. i.e.r.v.). The idea of taking such a system is in line with

Reductionism.

One might think that this idea is similar to reductionism in physics. Before we come to this point, it is interesting to refer to the lecture given by P.W. Anderson at the University of Tokyo in 1999. His title included emergence together with reductionism, and he gave a good interpretation.

Following reductionism, the next step is to form a function of the i.e.r.v.'s so that the function represents the given random complex system. It is nothing but

Synthesis.

Then the analysis of the functions formed in our setup naturally follows. Thus the goal is the analysis of the function (which may be called a functional) in order to identify the random complex system in question.

The first step, taking a suitable system of i.e.r.v.'s, has been influenced by the way the notion of a stochastic process is understood. We therefore quickly review the definition of a stochastic process, starting from the ideas of J. Bernoulli (Ars Conjectandi, 1713), S. Bernstein (1933) and P. Lévy (1947), which suggest that we consider the innovation of a stochastic process. It is viewed as a system of i.e.r.v.'s, which will be specified to be a white noise.

The analysis of white noise functionals has many significant characteristics that are suited to the investigation of quantum mechanical phenomena. Thus, we shall be able to show examples to which white noise theory is efficiently applied.

Thanks to great contributions by many authors, the theory developed along our line has reached its present state:

AMS 2000 Mathematics Subject Classification 60H40 White Noise Theory

2 Review of defining a stochastic process and white noise analysis

There is a traditional, and in fact original, way of defining a stochastic process. Let us refer to Lévy's definition of a stochastic process given in his book [3], Chapt. II: "une fonction aléatoire X(t) du temps t dans lequel le hasard intervient à chaque instant" (a random function X(t) of time t in which chance intervenes at every instant). The chance (hasard) is expressed as an infinitesimal random variable Y(t) which is independent of the observed values of X(s), s ≤ t, in the past. The random variable Y(t) is nothing but the innovation of the process X(t).

Formally speaking, Y(t), which is usually an infinitesimal random variable, contains the information that was gained by X(t) during the time interval [t, t + dt). To express this idea P. Lévy proposed a formula called an infinitesimal equation for the variation $\delta X(t)$:

$$\delta X(t) = \Phi(X(s), s \le t, Y(t), t, dt),$$

where $\Phi$ is a non-random functional. Although this equation has only a formal significance, it is still very suggestive.

It would be desirable if the given process could be expressed as a functional of Y(t) in the following manner:

$$X(t) = \Psi(Y(s), s \le t, t),$$

where $\Psi$ is a sure (non-random) function.

Such a trick may be called the Reduction and Synthesis method. The above expression is causal in the sense that X(t) is expressed as a function of Y(s), s ≤ t, and never uses Y(s) with s > t.

Note that this method of defining a stochastic process is more important than defining it through a distribution on a function space.

The collection {Y(s)} is a system of i.e.r.v.'s so that the above expression is a realization of the synthesis. We are particularly interested in the case where the system of i.e.r.v.'s is taken to be a white noise, and we are thus ready to discuss white noise analysis.
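A discrete-time analogue makes the reduction and synthesis steps concrete. The sketch below is only an illustration and is not taken from the paper: it assumes a first-order autoregressive model with a hypothetical coefficient `A`, where the innovation is an i.i.d. Gaussian sequence playing the role of white noise.

```python
import random

random.seed(1)  # reproducible illustration

A = 0.8          # hypothetical AR(1) coefficient, chosen for the example
N = 1000

# Synthesis: build X causally from the innovation Y (discrete white noise).
innovation = [random.gauss(0.0, 1.0) for _ in range(N)]
x = []
previous = 0.0
for y in innovation:
    previous = A * previous + y      # X(n) = A * X(n-1) + Y(n)
    x.append(previous)

# Reduction: recover the innovation from the observed values of X.
recovered = [x[0]] + [x[n] - A * x[n - 1] for n in range(1, N)]

# Causal representation X(n) = sum_{k <= n} A^(n-k) * Y(k); no future Y is used.
n = 10
explicit = sum(A ** (n - k) * innovation[k] for k in range(n + 1))
print(max(abs(r - y) for r, y in zip(recovered, innovation)), abs(explicit - x[n]))
```

Here the map from `innovation` to `x` plays the role of $\Psi$, and the one-step difference plays the role of the infinitesimal equation: each new value of the process depends on its past and on one fresh, independent random variable.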

So far we have discussed the theory only for a stochastic process. It is in fact quite natural to extend the theory to a random field X(C) indexed by an ovaloid, say a contour or closed surface. A generalization of the infinitesimal equation is

$$\delta X(C) = \Phi(X(C'), C' < C, Y(s), s \in C, C, \delta C).$$

The system $\{Y(s), s \in C\}$ is the innovation. Such a generalization can be done because of the use of the innovation.

We note that white noise analysis has many advantages, which are briefly listed below.

1) It is an infinite dimensional analysis. Actually, our stochastic analysis can be systematically done by taking a white noise as a sytem of i.e.r.v.'s to express the given random complex systems. Indeed, the analysis is essentially infinite dimensional as will be seen in what follows.

2) Infinite dimensional harmonic analysis The white noise measure supported by the space E* of generalized func

tions on the parameter space Rd is invariant under the rotations of E*. Hence a harmonic analysis arising from the group will naturally be discussed. The group contains significant subgroups which describes essentially infinite dimensional characters.

3) Generalizations to random fields X(C) are discussed in the similar manner to X(t) so far as innovation is concerned. Needless to say, X(C) enjoys more profound characteristic properties.

164

4) Connection with the classical functional analysis. The so-called S-transform applied to white noise functionals provides a bridge connecting white noise functionals and classical functionals of ordinary functions. We can therefor appeal to the classical theory of functionals established in the first half of the twentieth century.

5) Good connection with quantum dynamics as will be seen in the next section.

For the differential and integral calculus of white noise functionals using the annihilation operator $\partial_t$ and the creation operator $\partial_t^*$, the class of generalized functionals, harmonic analysis including Fourier analysis, the Lévy Laplacian $\Delta_L$, complexification and other theories, we refer to the monograph [12] and other literature.

3 Relations to Quantum Dynamics

We now explain briefly some topics in quantum dynamics to which white noise theory can be applied. The topics presented here may seem unrelated to each other, but behind each description there is always a white noise.

1) Representation of the canonical commutation relations for a Boson field. This topic is well known.

Let $\dot B(t)$ be a white noise and let $\partial_t$ denote the $\dot B(t)$-derivative. Then $\partial_t$ is an annihilation operator and its dual operator $\partial_t^*$ stands for the creation operator. They satisfy the commutation relations

$$[\partial_t, \partial_s] = [\partial_t^*, \partial_s^*] = 0,$$

$$[\partial_t, \partial_s^*] = \delta(t-s).$$

From these, a representation of the canonical commutation relations is given for Bosonic particles.

It is noted that the following assertion holds.

Proposition. There are continuously many irreducible representations of the canonical commutation relations.

White noises with different variances are inequivalent to each other, which proves the assertion.

2) Reflection positivity (T-positivity).


A stationary multiple Markov (say N-ple Markov) Gaussian process has a spectral density function f(λ) of a particular type. Namely,

On the other hand, it is proved that

Proposition. The covariance function γ(h) of a stationary T-positive Gaussian process is expressed in the form

$$\gamma(h) = \int_0^\infty \exp[-|h|x]\, d\nu(x),$$

where ν is a positive finite measure.

By applying this assertion to the N-ple Markov Gaussian process we claim that T-positivity requires $c_k > 0$ for every k.

Note that in the strictly N-ple Markov case this condition is not satisfied.
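The role of the sign condition can be illustrated numerically. The sketch below is not taken from the paper: it assumes a discrete measure ν with atoms at hypothetical rates a_k and masses c_k, so that γ(h) = Σ c_k exp(-a_k|h|), and it assumes that T-positivity means the usual reflection-positivity condition Σ ξ_i ξ_j γ(t_i + t_j) ≥ 0 for t_i ≥ 0. Only the simplest two-point consequence is tested, which is necessary but not sufficient.

```python
import math

def covariance(h, weights, rates):
    """gamma(h) = sum_k c_k * exp(-a_k * |h|) for a discrete measure (assumed form)."""
    return sum(c * math.exp(-a * abs(h)) for c, a in zip(weights, rates))

def reflection_gram_det(t, weights, rates):
    """Determinant of [[g(0), g(t)], [g(t), g(2t)]].

    Under the assumed reflection-positivity condition this 2x2 matrix is
    positive semidefinite, so the determinant must be non-negative.
    """
    g0 = covariance(0.0, weights, rates)
    g1 = covariance(t, weights, rates)
    g2 = covariance(2.0 * t, weights, rates)
    return g0 * g2 - g1 * g1

# Hypothetical example data: same rates, different signs of the weights.
positive_case = reflection_gram_det(1.0, weights=(1.0, 0.5), rates=(1.0, 2.0))
mixed_sign_case = reflection_gram_det(1.0, weights=(2.0, -1.0), rates=(1.0, 2.0))
```

With weights (2, -1) and rates (1, 2) the function γ is still a legitimate covariance, since its spectral density is proportional to 1/((1 + λ²)(4 + λ²)), yet the determinant is negative, so the two-point condition already fails.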

It is our hope that this result can be generalized to general stochastic processes with multiple Markov properties.

3) A path integral formulation.

One of the realizations of Dirac-Feynman's idea of the path integral may be given by the following method using generalized white noise functionals. First we establish a class of possible trajectories when a Lagrangian $L(x, \dot x)$ is given. Let x be the classical trajectory determined by the Lagrangian. As soon as we come to quantum dynamics we have to consider fluctuating paths y. We propose that they are given by

$$y(s) = x(s) + \sqrt{\frac{\hbar}{m}}\, B(s).$$

The average over the paths is replaced with the expectation with respect to the probability measure for which the Brownian motion B(t) is defined. Thus, the propagator $G(y_1, y_2, t)$ is given by

$$E\Bigl\{ N \exp\Bigl[\frac{i}{\hbar}\int_0^t L(y, \dot y)\, ds + \frac{1}{2}\int_0^t \dot B(s)^2\, ds\Bigr] \cdot \delta\bigl(y(t) - y_2\bigr) \Bigr\}.$$

With this setup actual computations have been done to get exact formulae of the propagators. (L. Streit et al.)
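As a small illustration of the class of trajectories used here, the following sketch samples one fluctuating path on a time grid, reading the proposal as y(s) = x(s) + sqrt(ħ/m) B(s) with B a standard Brownian motion started at 0. The function name, the grid, and the free-particle classical path are hypothetical choices made for the example; the sketch does not attempt to evaluate the propagator itself.

```python
import math
import random

def fluctuating_path(classical, hbar_over_m, dt, rng):
    """Return y(s_j) = x(s_j) + sqrt(hbar/m) * B(s_j) on a uniform grid.

    classical   -- values x(s_j) of the classical trajectory at s_j = j * dt
    hbar_over_m -- the ratio hbar/m (non-negative)
    dt          -- grid spacing
    rng         -- a random.Random instance supplying Gaussian increments
    """
    scale = math.sqrt(hbar_over_m)
    brownian = 0.0  # B(0) = 0
    path = []
    for j, x in enumerate(classical):
        if j > 0:
            # Independent increment B(s_j) - B(s_{j-1}) with variance dt.
            brownian += rng.gauss(0.0, math.sqrt(dt))
        path.append(x + scale * brownian)
    return path

# Hypothetical usage: a straight-line classical path on [0, 1].
dt = 0.01
x_classical = [j * dt for j in range(101)]
y_sample = fluctuating_path(x_classical, hbar_over_m=1.0, dt=dt, rng=random.Random(0))
```

Setting `hbar_over_m` to zero returns the classical path unchanged, which mirrors the statement that the fluctuation is a purely quantum correction.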


4) Dirichlet forms in infinite dimensions. With the help of positive generalized white noise functionals we prove criteria for closability of energy forms. See [3].

5) Random fields X(C).

A random field X(C) depending on a parameter C, which is taken to be a certain smooth and closed manifold in a Euclidean space, naturally enjoys a more complex probabilistic structure than a stochastic process X(t) depending on the time t. It therefore has good connections with quantum fields in physics.

We are particularly interested in the case where X(C) has a causal representation in terms of white noise. Some typical examples are listed below.

5.1) Markov property and multiple Markov properties. Dirac's paper [1] suggests how to define a Markov property. For the Gaussian case a reasonable definition has been given (see [15]) by using the canonical representation in terms of white noise, where the canonical property of a representation can be introduced as a generalization of that for a Gaussian process. Some attempts have been made for some non-Gaussian fields (see [17]). For the Gaussian case, multiple Markov properties have been defined. It is now an interesting question to find conditions under which a Gaussian random field satisfies a multiple Markov property.

5.2) Stochastic variational equations of Langevin type. Let C run through a class $\mathbf{C}$ of concentric circles. The problem is to solve the following stochastic variational equation of Langevin type:

$$\delta X(C) = -\lambda X(C) \int_C \delta n(s)\, ds + X_0 \int_C v(s)\, \partial_s^*\, \delta n(s)\, ds.$$

The explicit solution is given by using the S-transform and the classical theory of functionals.

5.3) We have made an attempt to define a random field X(C), $C \in \mathbf{C}$, which satisfies conformal invariance. Reversibility can also be discussed.

Example. Linear parameter case. A Brownian bridge. For $t \in [0,1]$ it is defined by

$$X(t) = (1-t) \int_0^t \frac{1}{1-u}\, \dot B(u)\, du.$$


Reversibility can be guaranteed not only by the time reflection but also by whiskers (one-parameter subgroups defined by deformation of the parameter) in the conformal group that leaves the unit time interval invariant.
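The Brownian bridge of the example can be checked by simulation. The sketch below reads the representation as X(t) = (1 - t) ∫_0^t dB(u)/(1 - u) and approximates the stochastic integral by a left-point sum; the step count, sample size and seed are arbitrary choices made for illustration. For this representation the variance is (1 - t)² ∫_0^t du/(1 - u)² = t(1 - t), which is symmetric under the time reflection t → 1 - t.

```python
import math
import random

def bridge_value(t, steps, rng):
    """One sample of X(t) = (1 - t) * int_0^t dB(u) / (1 - u), for 0 <= t < 1.

    The integral is approximated by a left-point (Ito) sum with `steps`
    independent Gaussian increments of variance t / steps.
    """
    du = t / steps
    total = 0.0
    for j in range(steps):
        u = j * du
        total += rng.gauss(0.0, math.sqrt(du)) / (1.0 - u)
    return (1.0 - t) * total

def sample_variance_at(t, n_paths=4000, steps=100, seed=0):
    """Monte Carlo estimate of Var X(t); the mean is zero by construction."""
    rng = random.Random(seed)
    values = [bridge_value(t, steps, rng) for _ in range(n_paths)]
    return sum(v * v for v in values) / n_paths

variance_mid = sample_variance_at(0.5)  # should be close to 0.5 * (1 - 0.5) = 0.25
```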

We now come to the case of a random field. Let $\mathbf{C}$ be the class of concentric circles. Assume $0 < r_0 < r < r_1$. Denote by $C_r$ the circle with radius r. Then we define

'(ft) - yfi^^bw w^w*^

This is a canonical representation. To show reversibility, we apply the inversion with respect to the circle with radius $\sqrt{r_0 r_1}$:

We claim that it is possible to have a generalization to the case where C is taken to be a class of curves obtained by a conformal mapping of concentric circles.

Remark 1. It is noted that the white noise x(t) is regarded as a representation of the parameter t, so that the propagation of randomness (fluctuation) is expressed in terms of x(t) instead of the time t itself. Namely, the way in which random complex phenomena develop, in particular reversibility, has an explicit description in terms of white noise, as is seen in the above example.

Remark 2. See the papers [1] by Dirac and [13] by Polyakov to have suggestions on a generalization of the path integral.

4 Addenda to foundations of the theories. Concluding remarks

Before giving the concluding remarks, we should like to add some facts as an addendum to §1, regarding the foundations of probability theory.

From the brief history mentioned in §1, we understand the reason why a white noise, that is, a system of i.e.r.v.'s, is introduced. It is a generalized stochastic process, so that we need some additional consideration when reasonable functionals, in general nonlinear functionals, of white noise are introduced. In physics we meet interesting cases where such nonlinear functionals of white noise are required: canonical commutation relations for quantum fields, where the degree of freedom is continuously infinite; Feynman's path integrals, as discussed in 3) of the last section; and variational equations for a random field. On the other hand, we were lucky when a class of generalized white noise functionals was introduced in 1975, since the theory of generalized functions had been established and some attempts had been made to apply it to the theory of generalized stochastic processes. To obtain further fruitful results, we have been given a powerful method to study random fields indexed by a manifold. It is the so-called innovation approach, where our reductionism is not affected by the higher dimensionality of the parameter space. With these in mind we can come to the concluding remarks.

As concluding remarks, we list some proposed future directions.

1. One is concerned with good applications of the Lévy Laplacian. Its significance is that it is an operator that is essentially infinite dimensional.

2. A two-dimensional Brownian path is considered to have some optimality in occupying the territory. This property should be reflected in models of physical phenomena.

3. A systematic approach to the invariance of random fields under transformation groups will be discussed.

4. Stochastic Variational Calculus for random fields.

With the classical results on variational calculus we can develop white noise analysis further.

Acknowledgements. The author is grateful to Professor A. Khrennikov, who invited him to give a talk at this conference. Thanks are due to the Academic Frontier Project at Meijo University for the support of this work.

References

1. P.A.M. Dirac, The Lagrangian in quantum mechanics. Phys. Z. Soviet Union, 3, 64-72 (1933).

2. S. Tomonaga, On a relativistically invariant formulation of the quantum theory of wave fields. Prog. Theor. Phys., 1, 27-42 (1946).

3. P. Lévy, Processus stochastiques et mouvement brownien (Gauthier-Villars, 1948; 2 ed. 1965).

4. P. Lévy, Nouvelle notice sur les travaux scientifique de M. Paul Lévy, Janvier 1964. Part III. Processus stochastiques (unpublished manuscript).

5. T. Hida, Canonical representations of Gaussian processes and their applications. Mem. College of Science, Univ. of Kyoto, A, 33, 109-155 (1960).

6. T. Hida, Stationary stochastic processes (Princeton Univ. Press, 1970).

7. T. Hida, Brownian motion (Iwanami Pub. Co., 1975; English ed. Springer-Verlag, 1980).

8. T. Hida, Analysis of Brownian functionals. Carleton Math. Lecture Notes, 13 (1975).

9. T. Hida, Innovation approach to random complex systems. Pub. Volterra Center, 433 (2000).

10. T. Hida and L. Streit, On quantum theory in terms of white noise. Nagoya Math. J., 68, 21-34 (1977).

11. T. Hida, J. Potthoff and L. Streit, Dirichlet forms and white noise analysis. Commun. Math. Phys., 116, 235-245 (1988).

12. T. Hida, H.-H. Kuo, J. Potthoff and L. Streit, White noise, an infinite dimensional calculus (Kluwer Academic Pub., 1993).

13. A.M. Polyakov, Quantum geometry of Bosonic strings. Phys. Lett., 103B, 207-210 (1981).

14. J. Schwinger, Brownian motion of a quantum oscillator. J. of Math. Phys., 2, 407-432 (1961).

15. Si Si, Gaussian processes and Gaussian random fields. Quantum Information II (World Scientific Pub. Co., 2000).

16. L. Streit and T. Hida, Generalized Brownian functionals and the Feynman integral. Stoch. Processes Appl., 16, 55-69 (1983).

17. L. Accardi and Si Si, Innovation approach to multiple Markov properties of some non Gaussian random fields, to appear.


STATISTICS AND ERGODICITY OF WAVE FUNCTIONS IN CHAOTIC OPEN SYSTEMS

H. ISHIO

Department of Physics and Measurement Technology, Linköping University, S-581 83 Linköping, Sweden
E-mail: [email protected]

and

Division of Natural Science, Osaka Kyoiku University, Kashiwara, Osaka 582-8582, Japan
E-mail: [email protected]

In general, quantum chaotic systems are considered to be described in the context of the random matrix theory, i.e., by random Gaussian variables (real or complex) in an appropriate universality class. In reality, however, quantum states inside a chaotic open system are not given by a statistically homogeneous random state. We show numerical evidence of such statistical inhomogeneity for ballistic transport through two-dimensional chaotic open billiards, and discuss its relation to the corresponding classical dynamics.

1 Introduction

The quantum-mechanical signature of classical chaos is called quantum chaos. A rigorous definition of chaotic systems in quantum theory has been given very recently for Kolmogorov (K-) and Anosov (C-) systems by analogy with the corresponding classical properties.1 In such systems, quantum ergodicity is naturally expected: eigenfunctions are equidistributed in their representation space, and all expectation values of quantum observables coincide with mean values of the corresponding classical observables. It was first noted that a sufficient condition for quantum ergodicity to hold is the ergodicity of the corresponding classical dynamics.2 More recently, the statement was proved in the case of quantum billiards.3,4 Nowadays, quantum ergodicity is one of the few results in the field of quantum chaos for which mathematical proofs exist.

Quantum ergodicity, however, can be reached only in the semiclassical limit (ħ → 0). In experiments or numerical simulations for chaotic systems, we often see nonuniversal quantum features far from ergodicity even in a high (but finite) energy region. In the present work, we show numerical evidence of such statistical inhomogeneity for chaotic open systems. In Sec. 2, we introduce a model of ballistic transport through a chaotic open billiard, and show some evidence of nonergodicity in the classical dynamics. We briefly discuss in Sec. 3 the general wave-statistical description of chaotic open systems by the random matrix theory (RMT). In Sec. 4, we show numerical results of fully quantum calculations of the open billiard model, and find that the idealized description by RMT does not apply in some cases even in a high energy region. There, we focus on the relation between the statistical deviations and wave localization corresponding to classical short paths. Section 5 consists of conclusions.

Figure 1: Typical single trajectory in the open stadium billiard.

2 Classical Nonergodicity and Short-Path Dynamics

We consider a two-dimensional (2D) billiard where the motion of noninteracting particles confined by Dirichlet boundaries is ballistic. The shape of the boundaries directly determines the nonlinearity of the particle dynamics inside the billiard. One of the prototypes of conservative chaotic systems is the Bunimovich stadium billiard. In the case of a closed stadium billiard, it is proved that the system has the K-property.5 In the case of an open stadium billiard coupled to two narrow leads (see Fig. 1), nonintegrability is still expected; e.g., we can observe a fractal structure in the spectrum of dwell times inside the cavity region.6 However, a Monte Carlo simulation of the classical path-length (∝ dwell time) distribution shows that the distribution function is not a simple exponential decay, as would be a signature of ergodicity, but a highly structured function owing to short-path dynamics.7

Another example showing nonergodicity of the classical dynamics in the case of the open stadium billiard is the transmission-reflection diagram of particles shown in Fig. 2. There, y is the initial transversal position of each particle incoming from lead 1 (see Fig. 1) at the entrance of the stadium cavity, and d denotes the common width of the attached leads. We apply a semiclassical quantization condition to the momentum of the incoming particles in the lead: the angle of incidence is quantized as $\theta_i = \pm \sin^{-1}[(n\pi)/(kd)]$ (n = 1, 2, ...), where we choose the positive and negative $\theta_i$ for the upper and lower directions of particle motion in Fig. 1, respectively. Here k is the Fermi wave number of the semiclassical particles. Over the whole range of the diagram, we fix the quantized mode number as n = 1. Because of the semiclassical quantization condition, $|\theta_i|$ decreases monotonically as a function of k. The distributed black and white points correspond to transmission and reflection events, respectively. The relative measure of the black (white) portion for each k is equal to the classical transmission (reflection) probability $T_{cl}(k)$ ($R_{cl}(k)$). In Fig. 2, we see a number of black and white "windows" in the chaotic sea. Each of them is associated with a family of short paths connecting lead 1 to lead 2 (for the black) or back to lead 1 (for the white). Such paths are stable in the event of transmission and reflection, and are expected to make an important contribution as a family to the corresponding quantum transport.
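The quantization condition for the angle of incidence is easy to tabulate. The following sketch evaluates |θ_i| = arcsin(nπ/(kd)) as a function of the dimensionless ratio kd/π; the function name and the sample values of kd/π are illustrative choices, not taken from the calculation behind Fig. 2.

```python
import math

def incidence_angle(kd_over_pi, n=1):
    """Magnitude of the quantized angle of incidence, in radians.

    Implements |theta_i| = asin(n * pi / (k * d)) with the argument given
    as the ratio k*d/pi. Mode n propagates only if n <= k*d/pi.
    """
    ratio = n / kd_over_pi
    if not 0.0 < ratio <= 1.0:
        raise ValueError("mode n does not propagate: need 0 < n <= k*d/pi")
    return math.asin(ratio)

# |theta_i| for n = 1 at a few illustrative values of k*d/pi.
ratios = [1.5, 2.0, 3.0, 4.0, 5.0]
angles = [incidence_angle(x) for x in ratios]
```

For fixed n the argument of the arcsine falls as k grows, so the list `angles` is strictly decreasing, in line with the monotonic decrease of |θ_i| stated in the text.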

3 Universal Description of Wave Function Statistics

We write the scaled local density as $\rho(\mathbf{r}) = V|\psi(\mathbf{r})|^2$, where V is the volume of the system, in which a single-particle wave function $\psi(\mathbf{r})$ is normalized in terms of the position $\mathbf{r}$. It is well known that the probability distribution of the local densities of a chaotic eigenfunction of a closed system is the Porter-Thomas (P-T) distribution,8

$$P(\rho) = \bigl(1/\sqrt{2\pi\rho}\bigr) \exp(-\rho/2), \qquad (1)$$

described by a Gaussian orthogonal ensemble (GOE) of random matrices, when time-reversal symmetry (TRS) is present, i.e., $\psi \in \mathbb{R}$. On the other hand, the distribution is an exponential,8,9

$$P(\rho) = \exp(-\rho), \qquad (2)$$

described by a Gaussian unitary ensemble (GUE) of random matrices, when TRS is broken in the closed system, i.e., $\psi \in \mathbb{C}$. The space-averaged spatial correlation of the local densities of a 2D chaotic wave function with wave number k is also given by9,10,11

$$P_2(kr) = \langle \rho(\mathbf{r}_1)\rho(\mathbf{r}_2) \rangle = 1 + c J_0^2(kr), \qquad (3)$$

where $r = |\mathbf{r}_1 - \mathbf{r}_2|$ and $J_0(x)$ is the Bessel function of zeroth order. The parameter c is chosen as c = 2 for GOE (TRS) and c = 1 for GUE (broken TRS) eigenfunctions.
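The two limiting distributions can be reproduced from Gaussian random amplitudes. The sketch below assumes the standard random-wave model, in which ψ at a point is a real Gaussian variable in the GOE case and a complex Gaussian variable with independent real and imaginary parts in the GUE case, scaled so that the mean density is 1. It then compares the fraction of samples with ρ < 1 against the values implied by the Porter-Thomas and exponential laws, erf(1/√2) and 1 - e⁻¹. Sample size and seed are arbitrary.

```python
import math
import random

def sample_densities(n_samples, complex_wave, rng):
    """Scaled local densities rho with <rho> = 1 from Gaussian amplitudes.

    complex_wave=False: psi real, rho = u^2          (TRS present)
    complex_wave=True:  psi = (u + i v)/sqrt(2), rho = (u^2 + v^2)/2  (TRS broken)
    """
    densities = []
    for _ in range(n_samples):
        u = rng.gauss(0.0, 1.0)
        if complex_wave:
            v = rng.gauss(0.0, 1.0)
            densities.append(0.5 * (u * u + v * v))
        else:
            densities.append(u * u)
    return densities

def fraction_below(samples, threshold):
    """Empirical probability that a sample is smaller than threshold."""
    return sum(1 for s in samples if s < threshold) / len(samples)

rng = random.Random(1)
goe = sample_densities(100_000, complex_wave=False, rng=rng)
gue = sample_densities(100_000, complex_wave=True, rng=rng)

goe_expected = math.erf(math.sqrt(0.5))  # integral of the P-T density over [0, 1]
gue_expected = 1.0 - math.exp(-1.0)      # integral of exp(-rho) over [0, 1]
```

The Porter-Thomas law puts more weight near ρ = 0 than the exponential law, which is why the GOE fraction below 1 is the larger of the two.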

Investigations of the continuous transition of the wave function statistics between GOE and GUE symmetries have also been carried out. Introducing a transition parameter $b \in (1,2]$, we have the probability distribution:12,13,14,15,16

$$P_b(\rho) = \frac{b}{2\sqrt{b-1}} \exp\Bigl(-\frac{b^2}{4(b-1)}\rho\Bigr) I_0\Bigl(\frac{b(2-b)}{4(b-1)}\rho\Bigr), \qquad (4)$$

where $I_0(x)$ is the modified Bessel function of zeroth order, and the spatial correlation:17

$$P_2^b(kr) = 1 + \Bigl(1 + \Bigl(\frac{2-b}{b}\Bigr)^2\Bigr) J_0^2(kr). \qquad (5)$$

For b → 1 and b → 2, both equations tend to the GOE and GUE cases, respectively.
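A quick numerical check of the transition distribution is sketched below. It assumes that Eq. (4) has the form P_b(ρ) = [b / (2√(b-1))] exp(-b²ρ / (4(b-1))) I_0(b(2-b)ρ / (4(b-1))); this reading is an assumption, chosen because it is normalized, has unit mean, and reduces to exp(-ρ) at b = 2. The Bessel function is summed from its power series and the integrals use a simple midpoint rule, both of which are illustrative choices.

```python
import math

def bessel_i0(x, terms=80):
    """Modified Bessel function I_0(x) from its power series sum_k (x/2)^(2k) / (k!)^2."""
    total, term = 0.0, 1.0
    for k in range(terms):
        total += term
        term *= (x / 2.0) ** 2 / ((k + 1) ** 2)
    return total

def p_b(rho, b):
    """Assumed form of the GOE-GUE transition density, valid for 1 < b <= 2."""
    a = b * b / (4.0 * (b - 1.0))
    c = b * (2.0 - b) / (4.0 * (b - 1.0))
    norm = b / (2.0 * math.sqrt(b - 1.0))
    return norm * math.exp(-a * rho) * bessel_i0(c * rho)

def moment(b, order, upper=40.0, steps=4000):
    """Midpoint-rule estimate of the integral of rho^order * P_b(rho) over [0, upper]."""
    h = upper / steps
    return h * sum(((i + 0.5) * h) ** order * p_b((i + 0.5) * h, b) for i in range(steps))

norm_check = moment(1.5, 0)  # should be close to 1
mean_check = moment(1.5, 1)  # should be close to 1
```

The series for I_0 is adequate here because the argument stays moderate for b = 1.5; values of b very close to 1 would need an asymptotic form instead.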

On the other hand, the systematic statistical investigations of scattering wave functions in open chaotic systems have been carried out quite recently.16

It is essential that the space reciprocity in conservative closed systems, which means that each plane wave ties up with its counterpart with the same amplitude and running in the opposite direction in phase, is lost in open systems. As a result, the wave function statistics in a chaotic open system is expected to be the GUE if the system is completely open.16

4 Numerical Analyses and Discussions

We show in this section numerical evidence of wave statistical inhomogeneity for ballistic transport through the 2D open stadium billiard. Assuming steady current flow through the leads, we solve the time-independent Schrödinger equation for a single particle under Dirichlet boundary conditions based on the plane-wave-expansion method,6 giving reflection and transmission amplitudes as well as local wave functions for each energy. In the calculation of the statistics, a sample space A (= V) is taken in the cavity region corresponding to the closed stadium, and more than one million sample points are used to obtain reliable statistics. We show the numerical results for the wave probability density in Fig. 3 and for the probability distribution P(ρ) and spatial correlation $P_2(kr)$ in Fig. 4.


In Fig. 3(a), we find the so-called bouncing-ball mode in the central region of the stadium cavity, where we see a number of vertical nodes associated with marginally stable classical orbits bouncing vertically between the straight edges. Bouncing-ball states are nonstatistical states since the amplitude of ψ is strongly localized in the middle region of the stadium (the space reciprocity holds locally) and is very small in the endcaps (the space reciprocity does not necessarily hold). As a result, both P(ρ) and $P_2(kr)$ for such states do not follow their universal expressions (see Fig. 4(a)). In addition to the bouncing-ball mode, we also see another wave localization strongly coupled to both the initial and the (open) transmission channels corresponding to the direct transmission path (see the white line depicted in Fig. 3(a)). Along such localization, plane waves may propagate with nonzero probability current, partially contributing to the anomaly of the wave statistics.16

In the higher energy region, where the ratio of the system size $\sqrt{A}$ to the wavelength λ is $\sqrt{A}/\lambda \approx 25$ (i.e., in the case of Fig. 3(b)), we may expect the GUE statistics. However, we see in Fig. 4(b) that both P(ρ) and $P_2(kr)$ follow closely the GOE.

The reason is a localization effect reminiscent of the phenomenon known as "scar",18 describing an anomalous localization of quantum probability density along unstable periodic orbits in classically chaotic systems. In order to characterize a localization, we usually introduce a moment defined by $I_q = V^{-1}\int_V |\psi(\mathbf{r})|^{2q}\, d\mathbf{r}$ of the eigenfunction local density $|\psi(\mathbf{r})|^2$, with V being the system volume.19,20 The second moment, $I_2$, is known as the inverse participation ratio (IPR). Assuming a normalization condition $\langle|\psi|^2\rangle\ (= I_1) = 1$, we have $I_2 = 1$ for completely ergodic (random and uniform) eigenfunctions while $I_2 = \infty$ for completely localized eigenfunctions like $|\psi(\mathbf{r})|^2 \sim V\delta(\mathbf{r})$. The localization effect on wave-function density statistics has been examined analytically in relation to $I_q$ for closed systems21,22,23 and also numerically using a time-dependent approach, i.e., in terms of recurrences of a test Gaussian wave packet, for closed and weakly (imperfectly) open systems.24,25,26 In the latter work, it was shown that the tail of the wave-function intensity distribution in phase space is dominated by scarring, departing from the RMT predictions.
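On a discrete grid the moments reduce to simple averages. The sketch below estimates I_q from a list of local densities sampled at equally weighted grid points, rescaling them so that the mean density is 1 as in the normalization condition above. The function name and the two test profiles are hypothetical; the localized profile, with all density on one of N cells, is a grid stand-in for the delta-function case and gives I_2 = N, which grows without bound as the grid is refined.

```python
def moment_iq(densities, q):
    """Grid estimate of I_q = V^{-1} * integral of |psi|^(2q) over V.

    densities -- values of |psi|^2 at equally weighted grid points
    q         -- order of the moment (q = 2 gives the IPR)
    The densities are first rescaled so that their mean, I_1, equals 1.
    """
    n = len(densities)
    mean = sum(densities) / n
    if mean <= 0.0:
        raise ValueError("densities must have a positive mean")
    return sum((d / mean) ** q for d in densities) / n

# Hypothetical profiles on a grid of 100 cells.
uniform_profile = [1.0] * 100
localized_profile = [100.0] + [0.0] * 99

ipr_uniform = moment_iq(uniform_profile, 2)      # 1 for a uniform density
ipr_localized = moment_iq(localized_profile, 2)  # equals the number of cells
```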

In contrast, the most prominent effect of the localization of wave probability density in open billiards is the local space reciprocity holding along the classical orbits corresponding to the localization not strongly coupled to any (open) transmission channel (see, e.g., the white lines depicted in Fig. 3(b)): along such orbits, there is no net current owing to the coherent overlap of time-reversed waves, so that both P(ρ) and $P_2(kr)$ are close to the GOE predictions.16 For quantitative discussion, the value of the GOE-GUE transition parameter b is calculated numerically from the wave function $\psi(\mathbf{r}) = u(\mathbf{r}) + iv(\mathbf{r})$ by the formula16

$$b = \frac{2\langle|\psi|^2\rangle}{\langle|\psi|^2\rangle + \sqrt{\langle|\psi|^2\rangle^2 - 4\bigl(\langle u^2\rangle\langle v^2\rangle - \langle uv\rangle^2\bigr)}}, \qquad (6)$$

where ⟨···⟩ denotes a space average on A. The obtained value for Fig. 3(b) is b = 1.03, which corresponds to a case very close to the GOE.
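The transition parameter can be estimated from sampled values of the real and imaginary parts of the wave function. The sketch below reads Eq. (6) as b = 2⟨|ψ|²⟩ / (⟨|ψ|²⟩ + √(⟨|ψ|²⟩² - 4(⟨u²⟩⟨v²⟩ - ⟨uv⟩²))), which is an assumption about the intended formula; it gives b = 1 for a purely real wave and b = 2 when u and v are uncorrelated with equal variance. The synthetic Gaussian samples stand in for grid values of ψ and are not data from the billiard calculation.

```python
import math
import random

def transition_parameter(u, v):
    """Estimate b from samples of u = Re(psi) and v = Im(psi), using the assumed form of Eq. (6)."""
    n = len(u)
    uu = sum(x * x for x in u) / n
    vv = sum(y * y for y in v) / n
    uv = sum(x * y for x, y in zip(u, v)) / n
    total = uu + vv  # <|psi|^2>
    if total <= 0.0:
        raise ValueError("wave function samples are identically zero")
    # Equals (uu - vv)^2 + 4*uv^2, so it is non-negative up to rounding.
    disc = total * total - 4.0 * (uu * vv - uv * uv)
    return 2.0 * total / (total + math.sqrt(max(disc, 0.0)))

rng = random.Random(2)
u_samples = [rng.gauss(0.0, 1.0) for _ in range(50_000)]
v_samples = [rng.gauss(0.0, 1.0) for _ in range(50_000)]

b_real = transition_parameter(u_samples, [0.0] * len(u_samples))  # GOE-like limit
b_complex = transition_parameter(u_samples, v_samples)            # GUE-like limit
```

The expression is invariant under a common rescaling of u and v, so the samples do not need to be normalized first.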

In the case of open systems, the IPR may again play an important role as a measure of localization.27 In the definition $I_2 = V^{-1}\int_V |\psi(\mathbf{r})|^4\, d\mathbf{r}$, $|\psi(\mathbf{r})|^2\ (= \rho(\mathbf{r}))$ is the scattering-wave local density and V is the area (A) of the stadium cavity in our case. For chaotic wave functions normalized as $\langle|\psi|^2\rangle = 1$, we obtain from Eq. (4) the IPR $I_2^b$ for the transition between the GOE and GUE statistics as

$$I_2^b = \int_0^\infty \rho^2 P_b(\rho)\, d\rho = \frac{1}{\pi}\Bigl(\frac{2\sqrt{b-1}}{b}\Bigr)^5 \int_0^{2\pi} \frac{d\theta}{\bigl[1 + \bigl(\frac{2}{b}-1\bigr)\cos\theta\bigr]^3} = \frac{3b^2 - 4b + 4}{b^2}. \qquad (7)$$

In the GOE and GUE limits, $I_2^{b=1} = 3$ and $I_2^{b=2} = 2$, respectively. For Fig. 3(b), the numerically obtained IPR is $I_2 = 2.89$, which is exactly equal to $I_2^{b=1.03}$. This means that the enhancement of the IPR by the amplitude of the localized wave is not strong in the case of Fig. 3(b), and that the effect of the localization appears mainly in the value of b, which also determines the IPR.
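The quoted numbers follow directly from the closed form at the end of Eq. (7), (3b² - 4b + 4)/b². A minimal sketch that evaluates it at the two limits and at the fitted value b = 1.03 is given below; the function name is an illustrative choice.

```python
def ipr_transition(b):
    """Closed form of the IPR in the GOE-GUE crossover: (3 b^2 - 4 b + 4) / b^2."""
    if not 1.0 <= b <= 2.0:
        raise ValueError("transition parameter b must lie in [1, 2]")
    return (3.0 * b * b - 4.0 * b + 4.0) / (b * b)

ipr_goe = ipr_transition(1.0)     # GOE limit
ipr_gue = ipr_transition(2.0)     # GUE limit
ipr_fitted = ipr_transition(1.03)  # value of b reported for Fig. 3(b)
```

Rounded to two decimals, `ipr_fitted` reproduces the value 2.89 obtained numerically for Fig. 3(b).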

From our investigations together with more extended studies,16 the complete GUE statistics is conjectured to be obtained only in the high-energy (semiclassical) limit. Until the energy reaches such limit, the localization of wave functions within the chaotic open systems strongly affects the wave statistical properties, leading to deviations from the RMT predictions based on the ergodicity or uniform randomness of wave functions.

Finally, we note that the classical-path families associated with the localization found in Fig. 3(a) and (b) can be identified as the windows indicated with α and β in Fig. 2, respectively. (In Fig. 3(b), only the path family for the localization touching the entrance can be identified in Fig. 2.) We notice that the angle of incidence $\theta_i$ for a given k is unrelated to that of the path corresponding to the observed localizations directly connected to the entrance.

5 Conclusions

In conclusion, our numerical analyses show that chaotic-scattering wave functions in open systems exhibit remarkably different features from the idealized GUE predictions. The statistical deviations from the GUE can be understood in terms of wave localization corresponding to classical short-path dynamics.


Acknowledgments

The author is obliged to K.-F. Berggren, A. I. Saichev and A. F. Sadreev for fruitful collaboration leading to the work in Sec. 4. Support from the Swedish Board for Industrial and Technological Development (NUTEK) under Project No. P12144-1 is also acknowledged. Part of the calculations of the wave function statistics were carried out using resources of the National Supercomputer Center (NSC) at Linköping.

References

1. H. Narnhofer (to be published).
2. A. I. Shnirelman, Usp. Mat. Nauk 29, 181 (1974).
3. P. Gerard and E. Leichtnam, Duke Math. J. 71, 559 (1993).
4. S. Zelditch and M. Zworski, Comm. Math. Phys. 175, 673 (1996).
5. L. A. Bunimovich, Funct. Anal. Appl. 8, 254 (1974).
6. K. Nakamura and H. Ishio, J. Phys. Soc. Jpn. 61, 3939 (1992).
7. H. Ishio and J. Burgdörfer, Phys. Rev. B 51, 2013 (1995).
8. C. Porter and R. Thomas, Phys. Rev. 104, 483 (1956).
9. V. N. Prigodin, Phys. Rev. Lett. 74, 1566 (1995).
10. V. N. Prigodin et al., Phys. Rev. Lett. 72, 546 (1994).
11. M. V. Berry, in Chaos and Quantum Physics, ed. M. J. Giannoni, A. Voros, and J. Zinn-Justin (Elsevier, Amsterdam, 1990), p. 251.
12. K. Zyczkowski and G. Lenz, Z. Phys. B 82, 299 (1991).
13. G. Lenz and K. Zyczkowski, J. Phys. A 25, 5539 (1992).
14. E. Kanzieper and V. Freilikher, Phys. Rev. B 54, 8737 (1996).
15. R. Pnini and B. Shapiro, Phys. Rev. E 54, R1032 (1996).
16. H. Ishio et al. (unpublished).
17. S.-H. Chung et al., Phys. Rev. Lett. 85, 2482 (2000).
18. E. J. Heller, Phys. Rev. Lett. 53, 1515 (1984).
19. F. Wegner, Z. Phys. B 36, 209 (1980).
20. C. Castellani and L. Peliti, J. Phys. A 19, L429 (1986).
21. Y. V. Fyodorov and A. D. Mirlin, Phys. Rev. B 51, 13403 (1995).
22. K. Müller et al., Phys. Rev. Lett. 78, 215 (1997).
23. V. N. Prigodin and B. L. Altshuler, Phys. Rev. Lett. 80, 1944 (1998).
24. L. Kaplan, Nonlinearity 12, R1 (1999).
25. L. Kaplan, Phys. Rev. Lett. 80, 2582 (1998).
26. L. Kaplan and E. J. Heller, Ann. Phys. 264, 171 (1998).
27. H. Ishio and L. Kaplan (private communication).


Figure 2: Transmission-reflection diagram of classical particles as a function of the initial position y at the entrance of the stadium cavity and the Fermi wave number k corresponding to the angle of incidence $\theta_i$ calculated by the semiclassical quantization condition (n = 1 over the whole range) in the lead. The two panels show y for $-\theta_i$ and $+\theta_i$, each ranging from −d/2 to d/2. Black and white points correspond to transmission and reflection events, respectively. Two families of short paths are identified with an arrow beside the diagram (see the text).


Figure 3: Contour plot of wave probability density in the open stadium billiard for the condition (a) kd/π = 1.8785 (n = 1) and (b) kd/π = 4.6553 (n = 1). The initial wave comes through the left lead into the cavity. The transmission probability is (a) $T_{qm}$ = 0.55 and (b) $T_{qm}$ = 0.36. The contours show about 97.5% of the largest wave probability density. Thin white lines show some of the short classical orbits corresponding to the localization of the wave probability density. Taken from the work by the authors in Ref. [12] (unpublished).


Figure 4: Probability distribution (steps) and spatial correlation (thick line in the inset) of local densities in the open stadium billiard for the condition (a) kd/π = 1.8785 (n = 1) and (b) kd/π = 4.6553 (n = 1). Two thin lines show GOE (i.e., Eq. (1)) and GUE (i.e., Eq. (2)) cases (Eq. (3) for the inset). Taken from the work by the authors in Ref. [12] (unpublished).


ORIGIN OF QUANTUM PROBABILITIES

ANDREI KHRENNIKOV

International Center for Mathematical Modeling in Physics and Cognitive Sciences, MSI, University of Växjö, S-35195, Sweden

Email: [email protected]

We demonstrate that the origin of the quantum probabilistic rule (which differs from the conventional Bayes' formula by the presence of a cos θ-factor) might be explained by perturbation effects of preparation and measurement procedures. The main consequence of our investigation is that interference could be produced by purely corpuscular objects. In particular, the quantum rule for probabilities (with nontrivial cos θ-factor) could be simulated for macroscopic physical systems via preparation procedures producing statistical deviations of a special form. We discuss preparation and measurement procedures which may produce probabilistic rules which are neither classical nor quantum; in particular, hyperbolic 'quantum theory.'

1 Introduction

It is well known that the conventional probabilistic rule, the formula of total probability (which is based on Bayes' formula for conditional probabilities), cannot be applied to quantum experiments; see, for example, [1]-[12] for extended discussions. It seems that the special features of quantum probabilistic behaviour are just consequences of violations of the conventional probabilistic rule.

In this paper we restrict our investigations to the two dimensional case. Here the formula for the total probability has the form (i = 1,2) :

$$p(A = a_i) = p(B = b_1)\,p(A = a_i/B = b_1) + p(B = b_2)\,p(A = a_i/B = b_2), \qquad (1)$$

where A and B are physical variables which take, respectively, the values $a_1, a_2$ and $b_1, b_2$. The symbols $p(A = a_i/B = b_j)$ denote conditional probabilities. It is one of the most important rules used in applied probability theory. In fact, it is the prediction rule: if we know the probabilities for B and the conditional probabilities, then we can find the probabilities for A. However, this rule cannot be used for the prediction of probabilities observed in experiments with elementary particles. The violation of the conventional probabilistic rule and the necessity to use a new prediction rule were found in interference experiments with elementary particles. This astonishing fact was one of the main reasons to build the quantum formalism on the basis of the wave-particle duality.


Let φ be a quantum state. Let $\{|b_i\rangle\}_{i=1}^2$ be the basis consisting of eigenvectors of the operator B corresponding to the physical observable B. The quantum probabilistic rule has the form (i = 1, 2):

$$p_i = q_1 p_{1i} + q_2 p_{2i} \pm 2\sqrt{q_1 p_{1i} q_2 p_{2i}}\, \cos\theta, \qquad (2)$$

where $p_i = p_\phi(A = a_i)$, $q_j = p_\phi(B = b_j)$, $p_{ij} = p_{|b_i\rangle}(A = a_j)$, i, j = 1, 2. Here the probabilities have indexes corresponding to quantum states.

By denoting $P = p_i$ and $P_1 = q_1 p_{1i}$, $P_2 = q_2 p_{2i}$ we get the standard quantum probabilistic rule for interference of alternatives:

$$P = P_1 + P_2 + 2\sqrt{P_1 P_2}\, \cos\theta.$$

There is a large diversity of opinions on the origin of violations of the conventional probabilistic rule (1) in quantum mechanics, see [1]-[12]. The common opinion is that violations of (1) are induced by special properties of quantum systems (for example, Dirac, Feynman, Schrödinger). Thus the quantum probabilistic rule must be considered as a peculiarity of nature.
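The difference between the two prediction rules can be made concrete with a few lines of code. The sketch below evaluates the classical formula of total probability, q_1 p_{1i} + q_2 p_{2i}, and the quantum rule with the additional interference term 2√(q_1 p_{1i} q_2 p_{2i}) cos θ, taking the plus sign. The function names and the numerical inputs are hypothetical illustrations, not values from any experiment.

```python
import math

def classical_total_probability(q1, p1i, q2, p2i):
    """Conventional rule (1): p_i = q1 * p_1i + q2 * p_2i."""
    return q1 * p1i + q2 * p2i

def quantum_rule(q1, p1i, q2, p2i, theta):
    """Quantum rule (2) with the plus sign: adds 2 * sqrt(P1 * P2) * cos(theta)."""
    part1 = q1 * p1i
    part2 = q2 * p2i
    return part1 + part2 + 2.0 * math.sqrt(part1 * part2) * math.cos(theta)

# Hypothetical symmetric two-alternative setting.
q1 = q2 = 0.5
p1i = p2i = 0.5

classical = classical_total_probability(q1, p1i, q2, p2i)
constructive = quantum_rule(q1, p1i, q2, p2i, theta=0.0)
destructive = quantum_rule(q1, p1i, q2, p2i, theta=math.pi)
no_interference = quantum_rule(q1, p1i, q2, p2i, theta=math.pi / 2.0)
```

At θ = π/2 the interference term vanishes and the quantum rule coincides with the classical one; at θ = 0 and θ = π the symmetric example gives the extreme values 1 and 0.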

An interesting investigation of this problem is contained in the paper of J. Summhammer [12]. In contrast to Dirac, Feynman, Schrödinger, ... he claimed that the quantum probabilistic rule (2) is not a peculiarity of nature, but just a consequence of one special method of the probabilistic description of nature, the so-called method of maximum predictive power.

In this paper we provide a probabilistic analysis of the quantum rule (2). In our analysis 'probability' has the meaning of the frequency probability, namely the limit of frequencies in a long sequence of trials (or for a large statistical ensemble). Hence, in fact, we follow R. von Mises' approach to probability [13]. It seems that it would be impossible to find the roots of the quantum rule (2) in the measure-theoretical framework of A. N. Kolmogorov, 1933, [14]. In the measure-theoretical framework probabilities are defined as sets of real numbers having some special mathematical properties. The conventional rule (1) is merely a consequence of the definition of conditional probabilities. In the Kolmogorov framework, to analyse the transition from (1) to (2) is to analyse the transition from one definition to another. In the frequency framework we can analyse the behaviour of trials which induce one or another property of probability. Our analysis shows that the quantum probabilistic rule (2) can be, in principle, a consequence of perturbation effects of preparation and measurement procedures. Thus trigonometric fluctuations of quantum probabilities can be explained without using wave arguments.

In fact, our investigation is strongly based on Dirac's famous analysis of the foundations of quantum mechanics, see [1]. In particular, P. Dirac pointed out that one of the main differences between the classical and quantum theories is that in the quantum case perturbation effects of preparation and measurement procedures play the crucial role. However, P. Dirac could not explain the origin of interference for quantum particles in the purely corpuscular model. He had to appeal to wave arguments: 'If the two components are now made to interfere, we should require a photon in one component to be able to interfere with one in the other', [1].

In this paper we discuss perturbation effects of preparation and measurement procedures. We remark that we do not follow W. Heisenberg [15]; we do not study perturbation effects for individual measurements. We discuss statistical (ensemble) deviations induced by perturbations.^a

We underline again that our probabilistic analysis was possible only due to the rejection of Kolmogorov's measure-theoretical model of probability theory. Of course, each particular experiment (measurement) can be described by Kolmogorov's model: there are no 'quantum probabilities'. Moreover, it seems that there is nothing more than the binomial probability distribution (see the paper of J. Summhammer in the present volume). The most important feature of QUANTUM STATISTICS is not related to a single experiment. We have to consider at least three different experiments (preparation procedures) to observe 'quantum probabilistic behaviour', namely interference of alternatives. Kolmogorov's model is not adequate to such a situation. In this model all random variables are defined on the same probability space. This is impossible in the case of a few experiments that produce interference of alternatives (at least the author does not see any way to do this). In our analysis probability is 'classical', relative frequency, but it is not Kolmogorovian (compare with Accardi [3]).

An unexpected consequence of our analysis is that quantum probability rule (2) is just one of the possible perturbations (by ensemble fluctuations) of conventional probability rule (1). In principle, there might exist experiments which would produce perturbations of conventional probabilistic rule (1) which differ from quantum probabilistic rule (2).

Moreover, if we use the same normalization of the interference term, namely $2\sqrt{P_1 P_2}$, then we can classify all possible probabilistic rules that we have in nature:

1) trigonometric; 2) hyperbolic; 3) hyper-trigonometric.

The hyperbolic probabilistic transformation has a linear space representation that is similar to the standard quantum formalism in the complex Hilbert space. Instead of complex numbers, we use so-called hyperbolic numbers, see, for example, [18], p. 21. The development of hyperbolic quantum mechanics can be interesting for comparative analysis with standard quantum mechanics. In particular, we clarify the role of complex numbers in quantum theory. Complex (as well as hyperbolic) numbers were used to linearize a nonlinear probabilistic rule (that in general could not be linearized over real numbers). Another interesting feature of hyperbolic quantum mechanics is the violation of the principle of superposition. Here we have only some restricted variant of this principle.

^a Such an approach implies the statistical viewpoint on the Heisenberg uncertainty relation: the statistical dispersion principle, see L. Ballentine [16], [17] for the details.

2 Quantum formalism and perturbation effects

1. Frequency probability theory. The frequency definition of probability is more or less standard in quantum theory; especially in the approach based on preparation and measurement procedures, [5], [10], [16], [11].

Let us consider a sequence of physical systems $\pi = (\pi_1, \pi_2, \ldots, \pi_N, \ldots)$. Suppose that elements of $\pi$ have some property, for example, position or spin, and this property can be described by natural numbers: $L = \{1, 2, \ldots, m\}$, the set of labels. Thus, for each $\pi_j \in \pi$, we have a number $x_j \in L$. So $\pi$ induces a sequence

$$x = (x_1, x_2, \ldots, x_N, \ldots), \quad x_j \in L. \qquad (3)$$

For each fixed $\alpha \in L$, we have the relative frequency $\nu_N(\alpha) = n_N(\alpha)/N$ of the appearance of $\alpha$ in $(x_1, x_2, \ldots, x_N)$. Here $n_N(\alpha)$ is the number of elements in $(x_1, x_2, \ldots, x_N)$ with $x_j = \alpha$. R. von Mises [13] said that $x$ satisfies the principle of the statistical stabilization of relative frequencies if, for each fixed $\alpha \in L$, there exists the limit

$$p(\alpha) = \lim_{N \to \infty} \nu_N(\alpha). \qquad (4)$$

This limit is said to be a probability of $\alpha$. Thus the probability is defined as the limit of relative frequencies. In fact, this definition of probability is used in all experimental investigations. In Kolmogorov's approach [14] probability is defined as a measure. The principle of the statistical stabilization is obtained as a mathematical theorem, the law of large numbers.
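The stabilization of relative frequencies can be illustrated numerically. The following is only a minimal sketch: the function names, the label probabilities and the pseudo-random generator are assumptions made for the illustration, and a pseudo-random sequence merely stands in for a sequence of physical systems.

```python
import random

def relative_frequencies(x, labels):
    """Return nu_N(alpha) = n_N(alpha) / N for every label alpha in `labels`."""
    n = len(x)
    return {alpha: sum(1 for xj in x if xj == alpha) / n for alpha in labels}

def sample_sequence(probabilities, n, seed=0):
    """Draw x_1, ..., x_N from the labels 1..m with the given (assumed) weights."""
    rng = random.Random(seed)
    labels = list(range(1, len(probabilities) + 1))
    return rng.choices(labels, weights=probabilities, k=n)

if __name__ == "__main__":
    p = [0.2, 0.3, 0.5]  # hypothetical limiting probabilities p(alpha)
    for n in (100, 10_000, 100_000):
        x = sample_sequence(p, n)
        print(n, relative_frequencies(x, [1, 2, 3]))
```

For growing N the printed frequencies should settle near the chosen weights, which is the behaviour that definition (4) postulates for real sequences of trials.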

2. Preparation and measurement procedures and quantum formalism. We consider a statistical ensemble $S$ of quantum particles described by a quantum state $\phi$. This ensemble is produced by some preparation procedure $\mathcal{E}$, see, for example, [4], [5], [16], [10], [11] for details, see also P. Dirac [1]: 'In practice the conditions could be imposed by a suitable preparation of the system, consisting perhaps in passing it through various kinds of sorting apparatus, such as slits and polarimeters, the system being left undisturbed after the preparation.'

There are two discrete physical observables $B = b_1, b_2$ and $A = a_1, a_2$.

The total number of particles in $S$ is equal to $N$. Suppose that there are $n_i^b$, $i = 1, 2$, particles in $S$ with $B = b_i$ and $n_i^a$, $i = 1, 2$, particles in $S$ with $A = a_i$.

Suppose that, among those particles with $B = b_i$, there are $n_{ij}$, $i, j = 1, 2$, particles with $A = a_j$ (see (R) below to specify the meaning of 'with'). So

$$n_i^b = n_{i1} + n_{i2}, \quad n_j^a = n_{1j} + n_{2j}, \quad i, j = 1, 2.$$

(R) We follow Einstein and use the objective realist model in which both $B$ and $A$ are objective properties of a quantum particle, see [5], [4], [10] for the details. In particular, here each elementary particle has simultaneously defined position and momentum. In such a model we can consider in the ensemble $S$ sub-ensembles $S_j(B)$ and $S_j(A)$, $j = 1, 2$, of particles having properties $B = b_j$ and $A = a_j$, respectively. Set

$$S_{ij}(A, B) = S_i(B) \cap S_j(A).$$

Then $n_{ij}$ is the number of elements in the ensemble $S_{ij}(A, B)$. We remark that the 'existence' of the objective property ($B = b_i$ and $A = a_j$) need not imply the possibility to measure this property. For example, such a measurement is impossible in the case of incompatible observables. In general the property ($B = b_i$ and $A = a_j$) is a kind of hidden objective property.^b

The physical experience says that the following frequency probabilities are well defined for all observables $B$, $A$:

$$q_i = p_\phi(B = b_i) = \lim_{N \to \infty} q_i^{(N)}, \quad q_i^{(N)} = \frac{n_i^b}{N}; \qquad (5)$$

$$p_i = p_\phi(A = a_i) = \lim_{N \to \infty} p_i^{(N)}, \quad p_i^{(N)} = \frac{n_i^a}{N}. \qquad (6)$$

Let quantum states $|b_i\rangle$ be eigenstates of the operator $B$. Let us consider statistical ensembles $T_i$, $i = 1, 2$, of quantum particles described by the quantum states $|b_i\rangle$. These ensembles are produced by some preparation procedures $\mathcal{E}_i$. For instance, we can suppose that particles produced by a preparation procedure $\mathcal{E}$ (for the quantum state $\phi$) pass through additional filters $F_i$, $i = 1, 2$. In the quantum formalism we have

$$\phi = \sqrt{q_1}\,|b_1\rangle + \sqrt{q_2}\,e^{i\theta}\,|b_2\rangle. \qquad (7)$$

^b Attempts to use objective realism in quantum theory were strongly criticized, especially in connection with the EPR-Bell considerations. Moreover, many authors (for example, P. Dirac [1] and R. Feynman [2]) claimed that the contradiction between objective realism and quantum theory can be observed just by comparing the conventional and quantum probabilistic rules (see d'Espagnat [4] for the extended discussion). However, in this paper we demonstrate that there is no direct contradiction between objective realism and the quantum probabilistic rule.

In the objective realist model (R) this representation may induce the illusion that ensembles $T_i$, $i = 1, 2$, for states $|b_i\rangle$ must be identified with sub-ensembles $S_i(B)$ of the ensemble $S$ for the state $\phi$. However, there are no physical reasons for such an identification:

The additional filter $F_i$ ($i = 1, 2$) changes the $A$-property of quantum particles. In general the probability distribution of the property $A$ for the ensemble $S_i(B) = \{\pi \in S : B(\pi) = b_i\}$ differs from the corresponding probability distribution for the ensemble $T_i$.

Suppose that there are $m_{ij}$ particles in the ensemble $T_i$ with $A = a_j$ ($j = 1, 2$).^c

The following frequency probabilities are well defined: $p_{ij} = p_{|b_i\rangle}(A = a_j) = \lim_{N \to \infty} p_{ij}^{(N)}$, where the relative frequency $p_{ij}^{(N)} = m_{ij}/n_i^b$ (by measuring values of the variable $A$ for the statistical ensemble $T_i$ we always observe the stabilization of the relative frequencies $p_{ij}^{(N)}$ to some constant probability $p_{ij}$).

Here it is assumed that the ensemble $T_i$ consists of $n_i^b$ particles, $i = 1, 2$. This assumption is natural if we consider the preparation procedure $\mathcal{E}_i = F_i$, a filter with respect to the value $B = b_i$. Only particles with $B = b_i$ pass this filter. Hence the number of elements in the ensemble $T_i$ (represented by the state $|b_i\rangle$) coincides with the number of elements with $B = b_i$ in the ensemble $S$ (represented by the state $\phi$).

It is also assumed that $n_i^b = n_i^b(N) \to \infty$, $N \to \infty$. In fact, the latter assumption holds true if both probabilities $q_i$, $i = 1, 2$, are nonzero.

We remark that probabilities $p_{ij} = p_{|b_i\rangle}(A = a_j)$ cannot be (in general) identified with conditional probabilities $p_\phi(A = a_j / B = b_i)$. As we have remarked, these probabilities are related to statistical ensembles prepared by different preparation procedures, namely by $\mathcal{E}_i$, $i = 1, 2$, and $\mathcal{E}$. Probabilities $p_{|b_i\rangle}(A = a_j)$ can be found by measuring the $A$-variable for particles belonging to the ensemble $T_i$. Probabilities $p_\phi(A = a_j / B = b_i)$ in general could not be found; these are hidden probabilities with respect to the ensemble $S$.

3. Derivation of quantum probabilistic rule. Here we present the standard Hilbert space calculations.

^c We can use the objective realist model, (R). Then $m_{ij}$ is just the number of particles in the ensemble $T_i$ having the objective property $A = a_j$. We can also use the contextualist model, (C). Then $m_{ij}$ is the number of particles in the ensemble $T_i$ which in the process of an interaction with a measurement device for the physical observable $A$ would give the result $A = a_j$.

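$$\phi = \sqrt{q_1}\,|b_1\rangle + \sqrt{q_2}\,e^{i\theta}\,|b_2\rangle.$$

Let $\{|a_j\rangle\}$ be the orthonormal basis consisting of eigenvectors of the operator $A$. We can restrict our considerations to the case:

$$|b_1\rangle = \sqrt{p_{11}}\,|a_1\rangle + e^{i\gamma_1}\sqrt{p_{12}}\,|a_2\rangle, \quad |b_2\rangle = \sqrt{p_{21}}\,|a_1\rangle + e^{i\gamma_2}\sqrt{p_{22}}\,|a_2\rangle. \qquad (8)$$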

We note that $p_{11} + p_{12} = 1$, $p_{21} + p_{22} = 1$. The first sum is the probability to observe one of the values of the variable $A$ for the statistical ensemble $T_1$; the second sum is the probability to observe one of the values of the variable $A$ for the statistical ensemble $T_2$.

As $\langle b_1 | b_2 \rangle = 0$, we obtain: $\sqrt{p_{11}p_{21}} + e^{i(\gamma_1 - \gamma_2)}\sqrt{p_{12}p_{22}} = 0$. We suppose that all probabilities $p_{ij} > 0$. This is equivalent to saying that $A$ and $B$ are incompatible observables or that operators $A$ and $B$ do not commute.

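Hence, $\sin(\gamma_1 - \gamma_2) = 0$ and $\gamma_2 = \gamma_1 + \pi k$. We also have $\sqrt{p_{11}p_{21}} + \cos(\gamma_1 - \gamma_2)\sqrt{p_{12}p_{22}} = 0$. This implies that $k = 2l + 1$ and $\sqrt{p_{11}p_{21}} = \sqrt{p_{12}p_{22}}$. As $p_{12} = 1 - p_{11}$ and $p_{21} = 1 - p_{22}$, we obtain that

$$p_{11} = p_{22}, \quad p_{12} = p_{21}. \qquad (9)$$

These equalities are equivalent to the condition: $p_{11} + p_{21} = 1$, $p_{12} + p_{22} = 1$. Hence, the matrix of probabilities $(p_{ij})$ is a double stochastic matrix, see, for example, [5] for general considerations. Thus, in fact,

$$|b_1\rangle = \sqrt{p_{11}}\,|a_1\rangle + e^{i\gamma_1}\sqrt{p_{12}}\,|a_2\rangle, \quad |b_2\rangle = \sqrt{p_{21}}\,|a_1\rangle - e^{i\gamma_1}\sqrt{p_{22}}\,|a_2\rangle. \qquad (10)$$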

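So $\phi = d_1|a_1\rangle + d_2|a_2\rangle$, where $d_1 = \sqrt{q_1p_{11}} + e^{i\theta}\sqrt{q_2p_{21}}$, $d_2 = e^{i\gamma_1}\sqrt{q_1p_{12}} - e^{i(\gamma_1 + \theta)}\sqrt{q_2p_{22}}$. Thus

$$p_1 = p_\phi(A = a_1) = |d_1|^2 = q_1p_{11} + q_2p_{21} + 2\sqrt{q_1p_{11}q_2p_{21}}\,\cos\theta; \qquad (11)$$

$$p_2 = p_\phi(A = a_2) = |d_2|^2 = q_1p_{12} + q_2p_{22} - 2\sqrt{q_1p_{12}q_2p_{22}}\,\cos\theta. \qquad (12)$$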
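The Hilbert space calculation leading to (11) and (12) can be checked numerically. The sketch below is only an illustration under stated assumptions: the function names are hypothetical, the double stochasticity condition (9) is imposed by hand, and the phase $\gamma_1$ is a free parameter that should not affect the result.

```python
import cmath
import math

def born_probabilities(q1, p11, theta, gamma1=0.0):
    """Compute |d_1|^2 and |d_2|^2 from the amplitudes d_1, d_2 of the text."""
    q2, p12 = 1.0 - q1, 1.0 - p11
    p21, p22 = p12, p11  # double stochasticity, relation (9)
    d1 = math.sqrt(q1 * p11) + cmath.exp(1j * theta) * math.sqrt(q2 * p21)
    d2 = (cmath.exp(1j * gamma1) * math.sqrt(q1 * p12)
          - cmath.exp(1j * (gamma1 + theta)) * math.sqrt(q2 * p22))
    return abs(d1) ** 2, abs(d2) ** 2

def interference_rule(q1, p11, theta):
    """Evaluate the right-hand sides of (11) and (12)."""
    q2, p12 = 1.0 - q1, 1.0 - p11
    p21, p22 = p12, p11
    p1 = q1 * p11 + q2 * p21 + 2.0 * math.sqrt(q1 * p11 * q2 * p21) * math.cos(theta)
    p2 = q1 * p12 + q2 * p22 - 2.0 * math.sqrt(q1 * p12 * q2 * p22) * math.cos(theta)
    return p1, p2

if __name__ == "__main__":
    print(born_probabilities(0.3, 0.6, 1.1, gamma1=0.4))
    print(interference_rule(0.3, 0.6, 1.1))
```

Both calls should print the same pair of numbers up to rounding, and the two probabilities should sum to one, which reflects the orthogonality of $|b_1\rangle$ and $|b_2\rangle$.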


3. Probability transformations connecting preparation procedures. Let us forget for the moment about the quantum theory. Let $B\,(= b_1, b_2)$ and $A\,(= a_1, a_2)$ be physical variables. We consider an arbitrary preparation procedure $\mathcal{E}$ for microsystems or macrosystems. Suppose that $\mathcal{E}$ produced an ensemble $S$ of physical systems. Let $\mathcal{E}_1$ and $\mathcal{E}_2$ be preparation procedures which are based on filters $F_1$ and $F_2$ corresponding, respectively, to values $b_1$ and $b_2$ of $B$. Denote statistical ensembles produced by these preparation procedures by symbols $T_1$ and $T_2$, respectively. Symbols $n_i^b$, $n_i^a$, $n_{ij}$, $m_{ij}$ have the same meaning as in the previous considerations. Probabilities $q_i$, $p_{ij}$, $p_i$ are defined in the same way as in the previous considerations. The only difference is that, instead of indexes corresponding to quantum states, we use indexes corresponding to statistical ensembles:

$$q_i = p_S(B = b_i), \quad p_i = p_S(A = a_i), \quad p_{ij} = p_{T_i}(A = a_j).$$

We shall restrict our considerations to the case of strictly positive probabilities.

The following simple frequency considerations are basic in our investigation. We would like to represent the frequency $p_i^{(N)}$ (for $A = a_i$ in the ensemble $S$) as the sum of the conventional (Bayes) part,

$$q_1^{(N)}p_{1i}^{(N)} + q_2^{(N)}p_{2i}^{(N)},$$

and some perturbation term. Such a perturbation term appears because frequencies $q_j^{(N)}$ and $p_{ji}^{(N)}$ are calculated with respect to different ensembles. The magnitude of this perturbation term will play the crucial role in our further analysis. We have:

$$p_i^{(N)} = \frac{n_i^a}{N} = \frac{n_{1i}}{N} + \frac{n_{2i}}{N} = \frac{m_{1i}}{N} + \frac{m_{2i}}{N} + \frac{(n_{1i} - m_{1i})}{N} + \frac{(n_{2i} - m_{2i})}{N}.$$

But, for $i = 1, 2$, we have

$$\frac{m_{1i}}{N} = \frac{m_{1i}}{n_1^b} \cdot \frac{n_1^b}{N} = p_{1i}^{(N)}q_1^{(N)}, \quad \frac{m_{2i}}{N} = \frac{m_{2i}}{n_2^b} \cdot \frac{n_2^b}{N} = p_{2i}^{(N)}q_2^{(N)}.$$

Hence

$$p_i^{(N)} = q_1^{(N)}p_{1i}^{(N)} + q_2^{(N)}p_{2i}^{(N)} + \delta_i^{(N)}, \qquad (13)$$

where

$$\delta_i^{(N)} = \frac{1}{N}\left[(n_{1i} - m_{1i}) + (n_{2i} - m_{2i})\right], \quad i = 1, 2.$$
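The decomposition (13) is an identity between counts, so it can be checked on any table of numbers. The following minimal sketch uses hypothetical counts and a hypothetical function name; it assumes, as in the text, that the ensemble $T_j$ contains exactly $n_j^b$ particles. Indices are 0-based in the code.

```python
def bayes_decomposition(n, m, i):
    """Return (p_i, Bayes part, delta_i) for the value a_i.

    n[j][i]: number of particles in S with B = b_j and A = a_i
    m[j][i]: number of particles in T_j with A = a_i
    Assumes sum(m[j]) == sum(n[j]), i.e. T_j has n_j^b elements.
    """
    big_n = sum(sum(row) for row in n)
    n_b = [sum(row) for row in n]                   # n_j^b = n_j1 + n_j2
    p_i = (n[0][i] + n[1][i]) / big_n               # p_i^(N) = n_i^a / N
    q = [n_b[j] / big_n for j in range(2)]          # q_j^(N)
    p_ji = [m[j][i] / n_b[j] for j in range(2)]     # p_ji^(N)
    bayes = q[0] * p_ji[0] + q[1] * p_ji[1]
    delta = ((n[0][i] - m[0][i]) + (n[1][i] - m[1][i])) / big_n
    return p_i, bayes, delta

if __name__ == "__main__":
    n = [[300, 200], [100, 400]]   # hypothetical counts in S
    m = [[250, 250], [180, 320]]   # hypothetical counts in T_1, T_2
    for i in range(2):
        p_i, bayes, delta = bayes_decomposition(n, m, i)
        print(i + 1, p_i, bayes, delta, bayes + delta)
```

In each printed row the last number should reproduce the first one up to rounding, and the two deviations should have opposite signs because $p_1^{(N)} + p_2^{(N)} = 1$.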

In fact, this rest term depends on the statistical ensembles $S$, $T_1$, $T_2$: $\delta_i^{(N)} = \delta_i^{(N)}(S, T_1, T_2)$.

4. Behaviour of fluctuations. First we remark that $\lim_{N \to \infty} \delta_i^{(N)}$ exists for all physical measurements. We always observe that

$$p_i^{(N)} \to p_i, \quad q_i^{(N)} \to q_i, \quad p_{ij}^{(N)} \to p_{ij}, \quad N \to \infty.$$

Thus there exist limits $\delta_i = \lim_{N \to \infty} \delta_i^{(N)} = p_i - q_1p_{1i} - q_2p_{2i}$. This coefficient $\delta_i$ is the statistical deviation produced by the perturbation effect of the preparation procedures $\mathcal{E}_1$, $\mathcal{E}_2$ (quantities $\delta_i^{(N)}$ are experimental statistical deviations).

Suppose that preparation procedures $\mathcal{E}_i$, $i = 1, 2$, (typically filters $F_i$) produce negligibly small (with respect to the size $N$ of the statistical ensemble) changes in properties of particles. Then

$$\delta_i^{(N)} \to 0, \quad N \to \infty. \qquad (14)$$

This asymptotic implies conventional probabilistic rule (1). In particular, this rule can be used in all experiments of classical physics. Hence, preparation and measurement procedures of classical physics produce experimental statistical deviations with asymptotic (14). We also have such a behaviour in the case of compatible observables in quantum physics.

Moreover, we can obtain the same conventional probabilistic rule for incompatible observables $B$ and $A$ if the phase factor $\theta = \frac{\pi}{2} + \pi k$. Therefore conventional probabilistic rule (1) is not directly related to commutativity of the corresponding operators in quantum theory. It is a consequence of asymptotic (14).

Despite the same asymptotic, (14), there is a crucial difference between classical observations (and compatible observations) and decoherence, $\theta = \frac{\pi}{2} + \pi k$, for incompatible observations. In the first case $\delta_i^{(N)} \approx 0$, $N \to \infty$, because both

$$\delta_{i,1}^{(N)} = \frac{1}{N}(n_{1i} - m_{1i}) \approx 0, \quad \delta_{i,2}^{(N)} = \frac{1}{N}(n_{2i} - m_{2i}) \approx 0, \quad N \to \infty.$$

In an ideal classical experiment we have

$$n_{1i} = m_{1i} \quad \text{and} \quad n_{2i} = m_{2i}.$$

Here preparation procedures $\mathcal{E}_i$ (filters with respect to the values $b_i$ of the variable $B$) do not change values of the $A$-variable at all.

In the case of decoherence of incompatible observables the statistical deviations $\delta_{i,1}^{(N)}$ and $\delta_{i,2}^{(N)}$ are not negligibly small. So perturbations can be sufficiently strong. However, we still observe (14), as a consequence of the compensation effect of perturbations:

$$\delta_{i,1}^{(N)} \approx -\delta_{i,2}^{(N)}.$$

Suppose now that filters $F_i$, $i = 1, 2$, produce changes in properties of particles that are not negligibly small (from the statistical viewpoint). Then the statistical deviations

$$\lim_{N \to \infty} \delta_i^{(N)} = \delta_i \neq 0. \qquad (15)$$

Here we obtain probabilistic rules which differ from the conventional one, (1). In particular, this implies that behaviour (15) cannot be produced in experiments of classical physics (or for compatible observables in quantum physics).

A rather special class of statistical deviations (15) is produced in experiments of quantum physics. However, behaviour of form (15) is not the specific feature of quantum measurements (see further considerations).

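To study carefully the behaviour of fluctuations $\delta_i^{(N)}$, we represent them as:

$$\delta_i^{(N)} = 2\sqrt{q_1^{(N)}p_{1i}^{(N)}q_2^{(N)}p_{2i}^{(N)}}\;\lambda_i^{(N)},$$

where

$$\lambda_i^{(N)} = \frac{1}{2\sqrt{m_{1i}m_{2i}}}\left[(n_{1i} - m_{1i}) + (n_{2i} - m_{2i})\right].$$

These are normalized (experimental) statistical deviations. We have used the fact:

$$q_1^{(N)}p_{1i}^{(N)}q_2^{(N)}p_{2i}^{(N)} = \frac{n_1^b}{N} \cdot \frac{m_{1i}}{n_1^b} \cdot \frac{n_2^b}{N} \cdot \frac{m_{2i}}{n_2^b} = \frac{m_{1i}m_{2i}}{N^2}.$$

In the limit $N \to \infty$, we get:

$$\delta_i = 2\sqrt{q_1p_{1i}q_2p_{2i}}\;\lambda_i,$$

where the coefficients $\lambda_i = \lim_{N \to \infty} \lambda_i^{(N)}$, $i = 1, 2$. Thus we found the general probabilistic transformation (for three preparation procedures) that can be obtained as a perturbation of the conventional probabilistic rule ($i = 1, 2$):

$$p_i = q_1p_{1i} + q_2p_{2i} + 2\sqrt{q_1q_2p_{1i}p_{2i}}\;\lambda_i. \qquad (16)$$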

Of course, we are free in the choice of a normalization constant in the perturbation term. We use $2\sqrt{q_1q_2p_{1i}p_{2i}}$ by analogy with the quantum formalism. In fact, such a normalization was found in the quantum formalism to get the representation of probabilities with the aid of complex numbers. Complex numbers were introduced in the quantum formalism to linearize the nonlinear probabilistic transformation $q_1p_{1i} + q_2p_{2i} + 2\sqrt{q_1q_2p_{1i}p_{2i}}\cos\theta$. To do this, we use the formula ($c, d > 0$):

$$c + d + 2\sqrt{cd}\,\cos\theta = \left|\sqrt{c} + \sqrt{d}\,e^{i\theta}\right|^2. \qquad (17)$$

The 'square root' $\sqrt{c} + \sqrt{d}\,e^{i\theta}$ gives the possibility to use linear transformations. Thus we do not see anything mystical in the appearance of complex numbers in quantum theory. This is a consequence of the impossibility of real linearization of the nonlinear probabilistic transformation.

In classical physics the coefficients $\lambda_i = 0$. We have the same situation in quantum physics for all compatible observables as well as for measurements of incompatible observables for some states. In the general case in quantum physics we can only say that the normalized statistical deviations satisfy

$$|\lambda_i| \le 1. \qquad (18)$$

Hence, for quantum experiments, we always have:

$$\left|\frac{(n_{1i} - m_{1i}) + (n_{2i} - m_{2i})}{2\sqrt{m_{1i}m_{2i}}}\right| \le 1, \quad N \to \infty. \qquad (19)$$

Thus quantum perturbations induce relatively small (but not negligibly small!) statistical variations of properties. We underline again that quantum perturbations give just the proper class of perturbations satisfying condition (19).

Let us consider arbitrary preparation procedures that induce perturbations satisfying (18). We can set

$$\lambda_i = \cos\theta_i, \quad i = 1, 2,$$

where $\theta_i$ are some 'phases.' Here we can represent the perturbation to the conventional probabilistic rule in the form:

$$\delta_i = 2\sqrt{q_1p_{1i}q_2p_{2i}}\,\cos\theta_i, \quad i = 1, 2. \qquad (20)$$

In this case the probabilistic rule has the form ($i = 1, 2$):

$$p_i = q_1p_{1i} + q_2p_{2i} + 2\sqrt{q_1q_2p_{1i}p_{2i}}\,\cos\theta_i. \qquad (21)$$

This is the general form of a trigonometric probabilistic transformation. The usual probabilistic calculations give us

$$1 = p_1 + p_2 = q_1p_{11} + q_2p_{21} + q_1p_{12} + q_2p_{22} + 2\sqrt{q_1q_2p_{11}p_{21}}\,\cos\theta_1 + 2\sqrt{q_1q_2p_{12}p_{22}}\,\cos\theta_2$$

$$= 1 + 2\sqrt{q_1q_2}\left[\sqrt{p_{11}p_{21}}\,\cos\theta_1 + \sqrt{p_{12}p_{22}}\,\cos\theta_2\right].$$

Thus we obtain the relation:

$$\sqrt{p_{11}p_{21}}\,\cos\theta_1 + \sqrt{p_{12}p_{22}}\,\cos\theta_2 = 0. \qquad (22)$$

Suppose now that the matrix of probabilities is a double stochastic matrix. We get

$$\cos\theta_1 = -\cos\theta_2. \qquad (23)$$

We obtain quantum probabilistic transformation (2). We demonstrate that this rule could be derived even in the realist framework. Condition (19) has the evident interpretation. To explain the mystery of quantum probabilistic rule, we must give some physical interpretation to the condition of double stochasticity, see section 4 for such an attempt.

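We can simulate the quantum probabilistic transformation by using random variables $n_{ij}(\omega)$, $m_{ij}(\omega)$ such that the deviations are

$$\epsilon_{1i}^{(N)} = n_{1i} - m_{1i} = 2\xi_{1i}^{(N)}\sqrt{m_{1i}m_{2i}}, \qquad (24)$$

$$\epsilon_{2i}^{(N)} = n_{2i} - m_{2i} = 2\xi_{2i}^{(N)}\sqrt{m_{1i}m_{2i}}, \qquad (25)$$

where the coefficients $\xi_{ji}^{(N)}$ satisfy the inequality

$$\left|\xi_{1i}^{(N)} + \xi_{2i}^{(N)}\right| \le 1, \quad N \to \infty. \qquad (26)$$

Suppose that $\lambda_i^{(N)} = \xi_{1i}^{(N)} + \xi_{2i}^{(N)} \to \lambda_i$, $N \to \infty$, where $|\lambda_i| \le 1$. We can represent $\lambda_i^{(N)} = \cos\theta_i^{(N)}$. Then $\theta_i^{(N)} \to \theta_i \pmod{2\pi}$ when $N \to \infty$. Thus $\lambda_i = \cos\theta_i$.

We remark that the conventional probabilistic rule (which is induced by ensemble fluctuations with $\lambda_i^{(N)} \to 0$) can be observed for fluctuations having relatively large absolute magnitudes. For instance, let

$$\epsilon_{1i}^{(N)} = 2\xi_{1i}^{(N)}\sqrt{m_{1i}}, \quad \epsilon_{2i}^{(N)} = 2\xi_{2i}^{(N)}\sqrt{m_{2i}}, \quad i = 1, 2, \qquad (27)$$

where the sequences of coefficients $\{\xi_{1i}^{(N)}\}$ and $\{\xi_{2i}^{(N)}\}$ are bounded ($N \to \infty$). Here

$$\lambda_i^{(N)} = \frac{\xi_{1i}^{(N)}}{\sqrt{m_{2i}}} + \frac{\xi_{2i}^{(N)}}{\sqrt{m_{1i}}} \to 0, \quad N \to \infty$$

(as usual, we assume that $p_{ij} > 0$).

Example 2.1. Let $N \approx 10^6$, $n_1^b \approx n_2^b \approx 5 \cdot 10^5$, $m_{11} \approx m_{12} \approx m_{21} \approx m_{22} \approx 25 \cdot 10^4$. So $q_1 = q_2 = 1/2$; $p_{11} = p_{12} = p_{21} = p_{22} = 1/2$ (symmetric state). Suppose we have fluctuations (27) with $\xi_{1i}^{(N)} \approx \xi_{2i}^{(N)} \approx 1/2$. Then $\epsilon_{1i}^{(N)} \approx \epsilon_{2i}^{(N)} \approx 500$. So $n_{ij} = 25 \cdot 10^4 \pm 500$. Hence, the relative deviation $\epsilon_{ji}^{(N)}/m_{ji} = 500/(25 \cdot 10^4) \approx 0.002$. Thus fluctuations of the relative magnitude $\approx 0.002$ produce the conventional probabilistic rule.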

It is evident that fluctuations of essentially larger magnitude

$$\epsilon_{1i}^{(N)} = 2\xi_{1i}^{(N)}(m_{1i})^{1/2}(m_{2i})^{1/\alpha}, \quad \epsilon_{2i}^{(N)} = 2\xi_{2i}^{(N)}(m_{2i})^{1/2}(m_{1i})^{1/\beta}, \quad \alpha, \beta > 2, \qquad (28)$$

where $\{\xi_{1i}^{(N)}\}$ and $\{\xi_{2i}^{(N)}\}$ are bounded sequences ($N \to \infty$), also produce (for $p_{ij} \neq 0$) the conventional probabilistic rule.

Example 2.2. Let all numbers $N, \ldots, m_{ij}$ be the same as in Example 2.1 and let deviations have behaviour (28) with $\alpha = \beta = 4$. Here the relative deviation $\epsilon_{ji}^{(N)}/m_{ji} \approx 0.045$.

Remark 2.1. The magnitude of fluctuations can be found experimentally. Let $A$ and $B$ be two physical observables. We prepare three statistical ensembles $S$, $T_1$, $T_2$ corresponding to states $\phi$, $|b_1\rangle$, $|b_2\rangle$. By measurements of $B$ and $A$ for $\pi \in S$ we obtain frequencies $q_1^{(N)}$, $q_2^{(N)}$, $p_1^{(N)}$, $p_2^{(N)}$; by measurements of $A$ for $\pi \in T_1$ and for $\pi \in T_2$ we obtain frequencies $p_{ij}^{(N)}$. We have

$$f_i(N) = \lambda_i^{(N)} = \frac{p_i^{(N)} - q_1^{(N)}p_{1i}^{(N)} - q_2^{(N)}p_{2i}^{(N)}}{2\sqrt{q_1^{(N)}p_{1i}^{(N)}q_2^{(N)}p_{2i}^{(N)}}}.$$

It would be interesting to obtain graphs of the functions $f_i(N)$ for different pairs of physical observables. Of course, we know that $\lim_{N \to \infty} f_i(N) = \pm\cos\theta$. However, it may be that such graphs can present a finer structure of quantum states.
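The normalized statistical deviation of Remark 2.1 and the arithmetic of the two examples of this section can be reproduced in a few lines. This is a sketch under stated assumptions: all function names are hypothetical, the labels returned by `classify` and its tolerance are choices made here, and the boundary case $|\lambda| = 1$ is assigned to the trigonometric class by convention.

```python
import math

def lambda_from_counts(n_1i, m_1i, n_2i, m_2i):
    """lambda_i^(N) = [(n_1i - m_1i) + (n_2i - m_2i)] / (2 sqrt(m_1i m_2i))."""
    return ((n_1i - m_1i) + (n_2i - m_2i)) / (2.0 * math.sqrt(m_1i * m_2i))

def lambda_from_frequencies(p_i, q1, q2, p_1i, p_2i):
    """The same quantity from measured frequencies, as in Remark 2.1."""
    delta = p_i - q1 * p_1i - q2 * p_2i
    return delta / (2.0 * math.sqrt(q1 * p_1i * q2 * p_2i))

def classify(lam, tol=1e-12):
    """Name the type of rule suggested by a limiting deviation lambda."""
    if abs(lam) <= tol:
        return "conventional"
    return "trigonometric" if abs(lam) <= 1.0 else "hyperbolic"

if __name__ == "__main__":
    m = 25 * 10**4
    eps_27 = 2 * 0.5 * math.sqrt(m)               # form (27), xi = 1/2
    eps_28 = 2 * 0.5 * math.sqrt(m) * m ** 0.25   # form (28), alpha = beta = 4
    print(eps_27, eps_27 / m)                     # 500.0 0.002
    print(eps_28 / m)                             # close to 0.045
    print(lambda_from_counts(m + eps_27, m, m + eps_27, m))
```

Plotting `lambda_from_frequencies` against N for recorded data would give the graphs $f_i(N)$ mentioned in the remark; the code above does not generate such data.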

3 Hyperbolic and hyper-trigonometric probabilistic transformations

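Let $\mathcal{E}_1$, $\mathcal{E}_2$ be preparation procedures that produce perturbations such that the normalized (experimental) statistical deviations

$$\left|\lambda_i^{(N)}\right| > 1, \quad N \to \infty. \qquad (29)$$

Thus $|\lambda_i| > 1$, $i = 1, 2$. Here the coefficients $\lambda_i$ can be represented in the form $\lambda_i = \pm\cosh\theta_i$, $i = 1, 2$. The corresponding probability rule has the following form:

$$p_i = q_1p_{1i} + q_2p_{2i} \pm 2\sqrt{q_1q_2p_{1i}p_{2i}}\,\cosh\theta_i, \quad i = 1, 2.$$

The normalization $p_1 + p_2 = 1$ gives the orthogonality relation:

$$\sqrt{p_{11}p_{21}}\,\cosh\theta_1 \pm \sqrt{p_{12}p_{22}}\,\cosh\theta_2 = 0. \qquad (30)$$

Thus $\cosh\theta_2 = \cosh\theta_1\,\sqrt{p_{11}p_{21}}/\sqrt{p_{12}p_{22}}$ and $\operatorname{sign}\lambda_1\lambda_2 = -1$.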

This probabilistic transformation can be called a hyperbolic rule. It describes a part of nonconventional probabilistic behaviours which is not described by the 'trigonometric formalism'. Experiments (and preparation procedures $\mathcal{E}$, $\mathcal{E}_1$, $\mathcal{E}_2$) which produce hyperbolic probabilistic behaviour could be simulated on a computer. On the other hand, at the moment we have no 'natural' physical phenomena which are described by the hyperbolic probabilistic formalism. Trigonometric probabilistic behaviour corresponds to essentially better control of properties in the process of preparation than hyperbolic probabilistic behaviour. Of course, the aim of any experimenter is to approach trigonometric behaviour. However, in principle there might exist natural phenomena for which trigonometric quantum behaviour could not be achieved.

Example 3.1. Let $q_1 = \alpha$, $q_2 = 1 - \alpha$, $p_{11} = \ldots = p_{22} = 1/2$. Then $p_1 = \frac{1}{2} + \sqrt{\alpha(1 - \alpha)}\,\lambda_1$, $p_2 = \frac{1}{2} - \sqrt{\alpha(1 - \alpha)}\,\lambda_1$. If $\alpha$ is sufficiently small, then $\lambda_1$ can be, in principle, larger than 1. We can find a 'phase' $\theta$ such that the normalized statistical deviation $\lambda_1 = \cosh\theta$.

Let us consider experiments that produce the hyperbolic probabilistic rule and let the corresponding matrix of probabilities be double stochastic. In this case orthogonality relation (30) has the form:

$$\cosh\theta_1 = \cosh\theta_2 = \cosh\theta.$$

We get the probabilistic transformation:

$$p_1 = q_1p_{11} + q_2p_{21} \pm 2\sqrt{q_1q_2p_{11}p_{21}}\,\cosh\theta;$$

$$p_2 = q_1p_{12} + q_2p_{22} \mp 2\sqrt{q_1q_2p_{12}p_{22}}\,\cosh\theta.$$
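Example 3.1 and the double stochastic hyperbolic transformation can be explored numerically. The sketch below is an illustration only: the function names are hypothetical, and the bound on $\lambda_1$ is derived here from the requirement $0 \le p_1, p_2 \le 1$, which is an added observation and not a statement of the text.

```python
import math

def lambda_bound(alpha):
    """Largest |lambda_1| keeping p_1, p_2 in [0, 1] when all p_ij = 1/2."""
    return 1.0 / (2.0 * math.sqrt(alpha * (1.0 - alpha)))

def example_3_1(alpha, lam):
    """p_1, p_2 of Example 3.1 for q_1 = alpha and deviation lam."""
    s = math.sqrt(alpha * (1.0 - alpha))
    return 0.5 + s * lam, 0.5 - s * lam

def hyperbolic_rule(q1, p11, theta, sign=1):
    """Double stochastic hyperbolic rule; sign = +1 or -1 selects the upper or lower sign."""
    q2, p12 = 1.0 - q1, 1.0 - p11
    p21, p22 = p12, p11
    c = math.cosh(theta)
    p1 = q1 * p11 + q2 * p21 + sign * 2.0 * math.sqrt(q1 * q2 * p11 * p21) * c
    p2 = q1 * p12 + q2 * p22 - sign * 2.0 * math.sqrt(q1 * q2 * p12 * p22) * c
    return p1, p2

if __name__ == "__main__":
    alpha, lam = 0.1, 1.5
    print(lambda_bound(alpha))            # exceeds 1 for small alpha
    print(example_3_1(alpha, lam))
    theta = math.acosh(lam)               # 'phase' with cosh(theta) = lam
    print(hyperbolic_rule(alpha, 0.5, theta))
```

For $\alpha = 1/2$ the bound equals 1, so in this symmetric example a hyperbolic deviation is only possible when the ensemble $S$ is unbalanced with respect to $B$.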

This probabilistic transformation looks similar to the quantum probabilistic transformation. The only difference is the presence of hyperbolic factors instead of trigonometric. This similarity gives the possibility to construct a linear space representation of the hyperbolic probabilistic calculus, see section 7.

The reader can easily consider the last possibility independently: one normalized statistical deviation $|\lambda|$ is larger than 1 and the other is less than 1; this is the hyper-trigonometric probabilistic transformation.

Remark 3.1. The real experimental situation is more complicated. In fact, the phase parameter $\theta$ is connected with the experimental arrangement. In particular, in the standard interference experiments the phase is related to the space-time structure of an experiment. It may be that in some experiments the dependence of the normalized statistical deviation $\lambda$ on $\theta$ is neither trigonometric nor hyperbolic:

$$P = P_1 + P_2 + 2\sqrt{P_1P_2}\;\lambda(\theta).$$

However, if the function satisfies $|\lambda(\theta)| \le 1$, then we can obtain the trigonometric transformation by just the reparametrization: $\theta' = \arccos\lambda(\theta)$.

4 Double stochasticity and correlations between preparation procedures

In this section we study the frequency meaning of the fact that in the quantum formalism the matrix of probabilities is double stochastic. We remark that this is a consequence of orthogonality of quantum states $|b_1\rangle$ and $|b_2\rangle$ corresponding to distinct values of a physical observable $B$. We have

$$\frac{p_{11}}{p_{12}} = \frac{p_{22}}{p_{21}}. \qquad (31)$$

Suppose that all quantum features are induced by the impossibility of creating new ensembles $T_1$ and $T_2$ without changing properties of quantum particles. Suppose that, for example, the preparation procedure $\mathcal{E}_1$ practically destroys the property $A = a_1$ (transforms this property into the property $A = a_2$). So $p_{11} \approx 0$. As a consequence, $\mathcal{E}_1$ makes the property $A = a_2$ dominating. So $p_{12} \approx 1$. Then the preparation procedure $\mathcal{E}_2$ must practically destroy the property $A = a_2$ (transform this property into the property $A = a_1$). So $p_{22} \approx 0$. As a consequence, $\mathcal{E}_2$ makes the property $A = a_1$ dominating. So $p_{21} \approx 1$.

We remark that

We recall that the number of elements in the ensemble $T_i$ is equal to $n_i^b$. Thus

n n -run _ ,n22 - m 2 2 , ^ nil _ "22 ,„„.

This is nothing other than the relation between fluctuations of the property $A$ under the transition from the ensemble $S$ to the ensembles $T_1$, $T_2$ and the distribution of this property in the ensemble $S$.

5 Hyperbolic quantum formalism

The mathematical formalism presented in this section can have different 'physical interpretations.' In particular, quantum state can be interpreted from the orthodox Copenhagen as well as statistical viewpoints.

A hyperbolic algebra $G$, see [18], p. 21, is a two dimensional real algebra with basis $e_0 = 1$ and $e_1 = j$, where $j^2 = 1$. Elements of $G$ have the form $z = x + jy$, $x, y \in \mathbb{R}$. We have $z_1 + z_2 = (x_1 + x_2) + j(y_1 + y_2)$ and $z_1z_2 = (x_1x_2 + y_1y_2) + j(x_1y_2 + x_2y_1)$. This algebra is commutative. We introduce the involution in $G$ by setting $\bar{z} = x - jy$. We set $|z|^2 = z\bar{z} = x^2 - y^2$. We remark that $|z| = \sqrt{x^2 - y^2}$ is not well defined for an arbitrary $z \in G$. We set $G_+ = \{z \in G : |z|^2 \ge 0\}$. We remark that $G_+$ is a multiplicative semigroup: $z_1, z_2 \in G_+ \Rightarrow z = z_1z_2 \in G_+$. It is a consequence of the equality

$$|z_1z_2|^2 = |z_1|^2|z_2|^2.$$

Thus, for $z_1, z_2 \in G_+$, we have $|z_1z_2| = |z_1||z_2|$. We introduce

$$e^{j\theta} = \cosh\theta + j\sinh\theta, \quad \theta \in \mathbb{R}.$$

We remark that

$$e^{j\theta_1}e^{j\theta_2} = e^{j(\theta_1 + \theta_2)}, \quad \overline{e^{j\theta}} = e^{-j\theta}, \quad |e^{j\theta}|^2 = \cosh^2\theta - \sinh^2\theta = 1.$$

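Hence, $z = \pm e^{j\theta}$ always belongs to $G_+$. We also have $\cosh\theta = \frac{e^{j\theta} + e^{-j\theta}}{2}$, $\sinh\theta = \frac{e^{j\theta} - e^{-j\theta}}{2j}$. We set $G_+^* = \{z \in G_+ : |z|^2 > 0\}$. Let $z \in G_+^*$. We have

$$z = |z|\left(\frac{x}{|z|} + j\frac{y}{|z|}\right) = \operatorname{sign}x\;|z|\left(\frac{x\operatorname{sign}x}{|z|} + j\frac{y\operatorname{sign}x}{|z|}\right).$$

As $\frac{x^2}{|z|^2} - \frac{y^2}{|z|^2} = 1$, we can represent $\frac{x\operatorname{sign}x}{|z|} = \cosh\theta$ and $\frac{y\operatorname{sign}x}{|z|} = \sinh\theta$, where the phase $\theta$ is uniquely defined. We can represent each $z \in G_+^*$ as

$$z = \operatorname{sign}x\;|z|\;e^{j\theta}.$$

By using this representation we can easily prove that $G_+^*$ is a multiplicative group. Here $\frac{1}{z} = \frac{\operatorname{sign}x}{|z|}e^{-j\theta}$. The unit circle in $G$ is defined as $S_1 = \{z \in G : |z|^2 = 1\} = \{z = \pm e^{j\theta}, \theta \in (-\infty, +\infty)\}$. It is a multiplicative subgroup of $G_+^*$.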
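The arithmetic of hyperbolic numbers is easy to experiment with. The class below is a minimal sketch written for this purpose; the class name, method names and the choice of a frozen dataclass are assumptions of the illustration and are not taken from [18].

```python
import math
from dataclasses import dataclass

@dataclass(frozen=True)
class Hyperbolic:
    """Hyperbolic number z = x + j*y with j*j = 1."""
    x: float
    y: float

    def __add__(self, other):
        return Hyperbolic(self.x + other.x, self.y + other.y)

    def __mul__(self, other):
        # (x1 + j y1)(x2 + j y2) = (x1 x2 + y1 y2) + j (x1 y2 + x2 y1)
        return Hyperbolic(self.x * other.x + self.y * other.y,
                          self.x * other.y + self.y * other.x)

    def conjugate(self):
        return Hyperbolic(self.x, -self.y)

    def norm2(self):
        """|z|^2 = x^2 - y^2, which may be negative."""
        return self.x ** 2 - self.y ** 2

    def polar(self):
        """Return (sign x, |z|, theta) with z = sign x * |z| * e^{j theta}."""
        n2 = self.norm2()
        if n2 <= 0:
            raise ValueError("polar form requires |z|^2 > 0")
        r = math.sqrt(n2)
        s = 1.0 if self.x > 0 else -1.0
        return s, r, math.asinh(s * self.y / r)

def hexp(theta):
    """e^{j theta} = cosh(theta) + j sinh(theta)."""
    return Hyperbolic(math.cosh(theta), math.sinh(theta))

if __name__ == "__main__":
    j = Hyperbolic(0.0, 1.0)
    print(j * j)                      # Hyperbolic(x=1.0, y=0.0)
    print(Hyperbolic(-3.0, 1.0).polar())
```

Note that `norm2` can vanish for nonzero elements such as $1 + j$, which is why the polar form is restricted to elements with $|z|^2 > 0$.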

A hyperbolic Hilbert space is a $G$-linear space (module), see [18], $E$ with a $G$-linear product: a map $(\cdot, \cdot) : E \times E \to G$ that is

1) linear with respect to the first argument: $(az + bw, u) = a(z, u) + b(w, u)$, $a, b \in G$, $z, w, u \in E$;
2) symmetric: $(z, u) = \overline{(u, z)}$;
3) nondegenerate: $(z, u) = 0$ for all $u \in E$ iff $z = 0$.

If we consider $E$ as just an $\mathbb{R}$-linear space, then $(\cdot, \cdot)$ is a bilinear form which is not positive definite. In particular, in the two dimensional case we have the signature: $(+, -, +, -)$.

As in the ordinary quantum formalism, we represent physical states by normalized vectors of the hyperbolic Hilbert space: $\varphi \in E$ and $(\varphi, \varphi) = 1$. We shall consider only dichotomic physical variables and quantum states belonging to the two dimensional Hilbert space. So everywhere below $E$ denotes the two dimensional space. Let $A = a_1, a_2$ and $B = b_1, b_2$ be two dichotomic physical variables. We represent them by $G$-linear operators: $|a_1\rangle\langle a_1| + |a_2\rangle\langle a_2|$ and $|b_1\rangle\langle b_1| + |b_2\rangle\langle b_2|$, where $\{|a_i\rangle\}_{i=1,2}$ and $\{|b_i\rangle\}_{i=1,2}$ are two orthonormal bases in $E$.

Let $\varphi$ be a state (normalized vector belonging to $E$). We can perform the following operation (which is well defined from the mathematical point of view). We expand the vector $\varphi$ with respect to the basis $\{|b_i\rangle\}_{i=1,2}$:

$$\varphi = \beta_1|b_1\rangle + \beta_2|b_2\rangle, \qquad (34)$$

where the coefficients (coordinates) $\beta_i$ belong to $G$. As the basis $\{|b_i\rangle\}_{i=1,2}$ is orthonormal, we get (as in the complex case) that:

$$|\beta_1|^2 + |\beta_2|^2 = 1. \qquad (35)$$

However, we could not automatically use Born's probabilistic interpretation for normalized vectors in the hyperbolic Hilbert space: it may be that $\beta_i \notin G_+$ (in fact, in the complex case we have $\mathbb{C} = \mathbb{C}_+$). We say that a state $\varphi$ is decomposable with respect to the system of states $\{|b_i\rangle\}_{i=1,2}$ ($B$-decomposable) if

$$\beta_i \in G_+. \qquad (36)$$

In such a case we can use Born's probabilistic interpretation of vectors in a hyperbolic Hilbert space:

Numbers $q_i = |\beta_i|^2$, $i = 1, 2$, are interpreted as probabilities for values $B = b_i$ for the $G$-quantum state $\varphi$.

We now repeat these considerations for each state $|b_i\rangle$ by using the basis $\{|a_k\rangle\}_{k=1,2}$. We suppose that each $|b_i\rangle$ is $A$-decomposable. We have:

$$|b_1\rangle = \beta_{11}|a_1\rangle + \beta_{12}|a_2\rangle, \quad |b_2\rangle = \beta_{21}|a_1\rangle + \beta_{22}|a_2\rangle, \qquad (37)$$

where the coefficients $\beta_{ik}$ belong to $G_+$. We have automatically:

$$|\beta_{11}|^2 + |\beta_{12}|^2 = 1, \quad |\beta_{21}|^2 + |\beta_{22}|^2 = 1. \qquad (38)$$

We can use the probabilistic interpretation of numbers $p_{11} = |\beta_{11}|^2$, $p_{12} = |\beta_{12}|^2$ and $p_{21} = |\beta_{21}|^2$, $p_{22} = |\beta_{22}|^2$: $p_{ik}$ is the probability for $a = a_k$ in the state $|b_i\rangle$.

Let us consider matrices $\mathbf{B} = (\beta_{ik})$ and $P = (p_{ik})$. As in the complex case, the matrix $\mathbf{B}$ is unitary: vectors $u_1 = (\beta_{11}, \beta_{12})$ and $u_2 = (\beta_{21}, \beta_{22})$ are orthonormal. The matrix $P$ is double stochastic.

By using the $G$-linear space calculation (the change of the basis) we get $\varphi = \alpha_1|a_1\rangle + \alpha_2|a_2\rangle$, where $\alpha_1 = \beta_1\beta_{11} + \beta_2\beta_{21}$ and $\alpha_2 = \beta_1\beta_{12} + \beta_2\beta_{22}$.

We remark that decomposability is not transitive. In principle $\varphi$ may be not $A$-decomposable, despite $B$-decomposability of $\varphi$ and $A$-decomposability of the $B$-system.

Suppose that $\varphi$ is $A$-decomposable. Therefore coefficients $p_k = |\alpha_k|^2$ can be interpreted as probabilities for $a = a_k$ for the $G$-quantum state $\varphi$.

Let us consider states such that coefficients $\beta_i$, $\beta_{ik}$ belong to $G_+^*$. We can uniquely represent them as

$$\beta_i = \pm\sqrt{q_i}\,e^{j\xi_i}, \quad \beta_{ik} = \pm\sqrt{p_{ik}}\,e^{j\gamma_{ik}}, \quad i, k = 1, 2.$$

We find that

$$p_1 = q_1p_{11} + q_2p_{21} + 2\epsilon_1\sqrt{q_1p_{11}q_2p_{21}}\,\cosh\theta_1, \qquad (39)$$

$$p_2 = q_1p_{12} + q_2p_{22} + 2\epsilon_2\sqrt{q_1p_{12}q_2p_{22}}\,\cosh\theta_2, \qquad (40)$$

where $\theta_i = \eta + \gamma_i$ and $\eta = \xi_1 - \xi_2$, $\gamma_1 = \gamma_{11} - \gamma_{21}$, $\gamma_2 = \gamma_{12} - \gamma_{22}$ and $\epsilon_i = \pm$. To find the right relation between signs of the last terms in equations (39), (40), we use the normalization condition

$$|\alpha_1|^2 + |\alpha_2|^2 = 1 \qquad (41)$$

(which is a consequence of the normalization of $\varphi$ and orthonormality of the system $\{|a_i\rangle\}_{i=1,2}$). It is equivalent to the equation (condition of orthogonality in the hyperbolic case, see section 8):

$$\sqrt{p_{12}p_{22}}\,\cosh\theta_2 \pm \sqrt{p_{11}p_{21}}\,\cosh\theta_1 = 0.$$

Thus we have to choose opposite signs in equations (39), (40). Unitarity of $\mathbf{B}$ also implies that $\theta_1 - \theta_2 = 0$, so $\gamma_1 = \gamma_2$. We recall that in ordinary quantum mechanics we have similar conditions, but trigonometric functions are used instead of hyperbolic ones and the phases $\gamma_1$ and $\gamma_2$ are such that $\gamma_1 - \gamma_2 = \pi$.

Finally, we get that (unitary) linear transformations in the $G$-Hilbert space (in the domain of decomposable states) represent the hyperbolic transformation of probabilities (see section 8):

$$p_1 = q_1p_{11} + q_2p_{21} \pm 2\sqrt{q_1p_{11}q_2p_{21}}\,\cosh\theta, \quad p_2 = q_1p_{12} + q_2p_{22} \mp 2\sqrt{q_1p_{12}q_2p_{22}}\,\cosh\theta.$$

This is a kind of hyperbolic interference.

There can be some connection with quantization in Hilbert spaces with

indefinite metric as well as the theory of relativity. However, at the moment we cannot say anything definite. It seems that by using Lorentz-'rotations' we can produce hyperbolic interference in a similar way as we produce the standard trigonometric interference by using ordinary rotations.
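The hyperbolic transformation of probabilities can be illustrated numerically. The following is a minimal sketch, not part of the original derivation: the function name and the parametrization of the doubly stochastic matrix by the single entry $p_{11}$ are assumptions made for illustration. It only checks that the choice of opposite signs keeps $p_1 + p_2 = 1$; it does not check that the results stay in $[0, 1]$, which in the text is guaranteed by decomposability.

```python
import math

def hyperbolic_interference(q1, p11, theta, sign=+1):
    """Return (p1, p2) from the hyperbolic transformation of probabilities.

    Illustrative assumptions: q2 = 1 - q1, and the 2x2 matrix P is doubly
    stochastic, so p22 = p11 and p12 = p21 = 1 - p11. `sign` (+1 or -1) is
    the sign of the interference term of p1; p2 gets the opposite sign.
    """
    q2 = 1.0 - q1
    p12 = p21 = 1.0 - p11
    p22 = p11
    term1 = 2.0 * math.sqrt(q1 * p11 * q2 * p21) * math.cosh(theta)
    term2 = 2.0 * math.sqrt(q1 * p12 * q2 * p22) * math.cosh(theta)
    p1 = q1 * p11 + q2 * p21 + sign * term1
    p2 = q1 * p12 + q2 * p22 - sign * term2
    return p1, p2

p1, p2 = hyperbolic_interference(q1=0.5, p11=0.9, theta=0.3)
print(p1, p2, p1 + p2)  # the two interference terms cancel in the sum
```

Because $\cosh\theta \ge 1$ is unbounded, large values of $\theta$ push $p_1$ or $p_2$ outside $[0, 1]$; such parameter values fall outside the domain of decomposable states.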

6 Physical consequences

The wave-particle dualism was created to explain the interference phenomenon for massive elementary particles. In particular, the orthodox Copenhagen interpretation was proposed to find a compromise between corpuscular and wave features of elementary particles. The idea of superposition of distinct 'properties' is, in fact, based on these interference experiments. It is well known that the orthodox Copenhagen interpretation is not free of difficulties (in particular, collapse of the wave function) and even paradoxes (see, for example, Schrodinger [19]). Problems in the orthodox Copenhagen interpretation have even induced attempts to exclude corpuscular objects from quantum theory altogether, see, for example, [20] for Schrodinger's critique of the classical concept of a particle. At the moment there is only one alternative to the orthodox Copenhagen interpretation, namely Einstein's statistical interpretation. By this interpretation the wave function describes distinct statistical features of an ensemble of elementary particles, see L. Ballentine [17] for the details (see also [16], [5], [10], [11]).

However, we must recognize that Einstein's statistical approach could not solve the fundamental problem of quantum theory: it could not explain the appearance of NEW STATISTICS in the purely corpuscular model. We did this in the present paper. On the one hand, this is a strong argument in favour of the statistical interpretation of quantum mechanics. On the other hand, one of the main motivations to use the wave-particle duality has disappeared.

Nevertheless, our investigation could not be considered as the crucial argument against the wave-particle duality. It is clear that by using purely mathematical analysis we cannot prove or disprove some physical theory. The only thing that we proved is that corpuscular objects (that have no wave features) can exhibit NEW STATISTICS.

In fact, we obtained essentially more than planned: these NEW STATISTICS do not reduce to QUANTUM STATISTICS. In principle, we can propose experiments that induce TRIGONOMETRIC, HYPERBOLIC and HYPER-TRIGONOMETRIC STATISTICS.

We remark that the quantum probabilistic transformation $P = P_1 + P_2 + 2\sqrt{P_1 P_2}\cos\theta$ gives the possibility to predict the probability $P$ if we know the probabilities $P_1$ and $P_2$. In principle, there might be created theories based on arbitrary transformations: $P = F(P_1, P_2)$. It may be that some rules have linear space representations over 'exotic number systems', for example, p-adic numbers [20].

A preliminary analysis of the probabilistic foundations of quantum mechanics (that induced the present investigation) was performed in the books [11] and [21] (chapter 2); part of the results of this paper was presented in the preprints [22]-[24].

Acknowledgements

I would like to thank S. Albeverio, L. Accardi, L. Ballentine, V. Belavkin, E. Beltrametti, W. De Muynck, S. Gudder, T. Hida, A. Holevo, P. Lahti, A. Peres, J. Summhammer, I. Volovich for (sometimes critical) discussions on probabilistic foundations of quantum mechanics.

References

1. P. A. M. Dirac, The Principles of Quantum Mechanics (Clarendon Press, Oxford, 1995).

2. R. Feynman and A. Hibbs, Quantum Mechanics and Path Integrals (McGraw-Hill, New York, 1965).

3. L. Accardi, The probabilistic roots of the quantum mechanical paradoxes. The wave-particle dualism. A tribute to Louis de Broglie on his 90th Birthday, ed. S. Diner, D. Fargue, G. Lochak and F. Selleri (D. Reidel Publ. Company, Dordrecht, 297-330, 1984).

4. B. d'Espagnat, Veiled Reality. An analysis of present-day quantum mechanical concepts (Addison-Wesley, 1995).

5. A. Peres, Quantum Theory: Concepts and Methods (Kluwer Academic Publishers, 1994).

6. J. von Neumann, Mathematical foundations of quantum mechanics (Princeton Univ. Press, Princeton, N.J., 1955).

7. E. Schrodinger, Philosophy and the Birth of Quantum Mechanics. Edited by M. Bitbol, O. Darrigol (Editions Frontieres, 1992).

8. J. M. Jauch, Foundations of Quantum Mechanics (Addison-Wesley, Reading, Mass., 1968).

9. P. Busch, M. Grabowski, P. Lahti, Operational Quantum Physics (Springer Verlag, 1995).

10. W. De Muynck, W. De Baere, H. Martens, Found. Phys. 24, 1589-1663 (1994).

11. A. Yu. Khrennikov, Interpretations of probability (VSP Int. Publ., Utrecht, 1999).

12. J. Summhammer, Int. J. Theor. Phys. 33, 171-178 (1994).

13. R. von Mises, The mathematical theory of probability and statistics (Academic, London, 1964).

14. A. N. Kolmogoroff, Grundbegriffe der Wahrscheinlichkeitsrechnung (Springer Verlag, Berlin, 1933); reprinted: Foundations of the Probability Theory (Chelsea Publ. Comp., New York, 1956).

15. W. Heisenberg, Z. Physik, 43, 172 (1927).

16. L. E. Ballentine, Quantum mechanics (Englewood Cliffs, New Jersey, 1989).

17. L. E. Ballentine, Rev. Mod. Phys., 42, 358-381 (1970).

18. A. Yu. Khrennikov, Superanalysis (Kluwer Academic Publishers, Dordrecht, 1999).

19. E. Schrodinger, Die Naturwiss, 23, 807-812, 824-828, 844-849 (1935).

20. E. Schrodinger, What is an elementary particle? in Gesammelte Abhandlungen (Vieweg and Son, Wien, 1984).

21. A. Yu. Khrennikov, p-adic valued distributions in mathematical physics (Kluwer Academic Publishers, Dordrecht, 1994).

22. A. Yu. Khrennikov, Ensemble fluctuations and the origin of quantum probabilistic rule. Rep. MSI, Vaxjo Univ., 90, October (2000).

23. A. Yu. Khrennikov, Classification of transformations of probabilities for preparation procedures: trigonometric and hyperbolic behaviours. Preprint quant-ph/0012141, 24 Dec (2000).

24. A. Yu. Khrennikov, Hyperbolic quantum mechanics. Preprint quant-ph/0101002, 31 Dec (2000).

NONCONVENTIONAL VIEWPOINT TO ELEMENTS OF PHYSICAL REALITY BASED ON NONREAL ASYMPTOTICS OF RELATIVE FREQUENCIES

ANDREI KHRENNIKOV

International Center for Mathematical Modeling in Physics and Cognitive Sciences, MSI, University of Vaxjo, S-35195, Sweden

Email:[email protected]

We study the connection between stabilization of relative frequencies and elements of physical reality. We observe that, besides the standard stabilization with respect to the real metric, other statistical stabilizations can be considered (in particular, with respect to the so-called p-adic metric on the set of rational numbers). Nonconventional statistical stabilizations might be connected with new (nonconventional) elements of reality. We present a few natural examples of statistical phenomena in which relative frequencies of observed events stabilize in the p-adic metric, but fluctuate in the standard real metric.

1 Introduction

The present methodology of physical measurements is based on the principle of the statistical stabilization of relative frequencies in a long run of trials. In the mathematical model this principle is represented by the law of large numbers. This approach to measurements is induced by the human representation of physical reality as a reality of stable repetitive phenomena. In the process of evolution we created cognitive structures that correspond to elements of this 'repetitive physical reality'. All modern physical investigations are oriented to the creation of new elements of such a reality.^a

^a We ask the reader not to connect our vague ('common sense') use of the notion of an element of physical reality with the EPR sufficient condition to be an element of reality, [1].

It must be remarked that the notion of stabilization (of relative frequencies) plays the fundamental role in the creation of this reality. I would like to point out that the conventional meaning of stabilization is based on real numbers. When we say stabilization, we mean the stabilization with respect to the standard real metric $\rho_R(x,y) = |x - y|$ (the distance between points x and y on the real line R). Of course, such a choice of the metric that determines statistically elements of physical reality was not just a consequence of the development of one special mathematical theory, real analysis.^b It seems that the notion of $\rho_R$-stabilization was induced by human practice in which quantities $n \ll N$ were not important. We created 'real physical reality', because we used smallness based on the standard order on the set of natural numbers.

^b Nevertheless, we must not forget that the human factor played a large role in the expansion of the (presently dominating) model of physical reality based on real numbers. At the beginning Newton's analysis was propagated as a kind of religion. There were (in particular in France) divine services devoted to Newton's analysis.

It must be underlined that in modern physics the real physical reality (i.e., reality based on the $\rho_R$-stability) is, in fact, identified with the whole physical reality.

On the other hand, modern mathematics is no longer just real analysis. In particular, the development of general topology [2], [3] induced a large spectrum of new nearness (in particular, metric) structures. In principle, we need no longer identify any stabilization with the $\rho_R$-stabilization. There appears a huge set of new possibilities to introduce new forms of stability in physical experiments. Moreover, new stable structures can be considered as new elements of physical reality that, in general, need not belong to the standard real reality.

This idea was presented for the first time in the author's investigations [4], [5] on so-called p-adic physics [6]-[10]. Later we tried to find the place of p-adic probabilities in quantum physics [11], [12] (in particular, to justify with mathematical rigour the use of negative and complex probabilities as well as to create models with hidden variables that do not produce Bell's inequality). In this paper we give a brief introduction to these probabilistic models and present a few rather natural examples in which relative frequencies of events stabilize with respect to the so-called p-adic metric, but fluctuate with respect to $\rho_R$. There is no corresponding element of the real reality. But there is an element of the p-adic reality. The objects considered in the examples could be created on the 'hard' level. In particular, to create a plantation in which the colour of a flower (red or white) is an element of p-adic reality, I need just a tractor and a (sufficiently large) piece of land. Nevertheless, I must agree that such p-adic elements of reality have never been observed in 'naturally created' physical objects.

The reader may be interested in the reasons why we concentrate on statistical stabilization with respect to the p-adic numbers, that is, on p-adic frequency probability theory. The main reason is that p-adic numbers are, in fact, the unique alternative to real numbers: there is no other possibility to complete the field of rational numbers with respect to an absolute value and obtain a new number field (Ostrovskii's theorem, see, for example, [13], [14]).

Our probabilistic foundations are based on a generalization of R. von Mises' frequency theory of probability [15], [16]. At the beginning of the twentieth century, when the foundations of modern probability theory were being laid, the frequency definition of probability proposed by von Mises played an important role. In particular, it was this definition of probability that Kolmogorov used to motivate his axioms of probability theory (see [17]). We also begin the construction of the new theory of probability with a frequency definition of probability.

Von Mises defined the probability of an event as the limit of the relative frequencies of the occurrence of the event when the volume of the statistical sample tends to infinity. This definition is the foundation of mathematical statistics (see, for example, Cramer [18]), in which von Mises's definition is formulated as the principle of statistical stabilization of relative frequencies.

In this paper, we propose a general principle of statistical stabilization of relative frequencies. By virtue of this principle, statistical stabilization of relative frequencies $\{\nu_N = n/N\}$ can be considered not only in the real topology on Q (and all relative frequencies are rational numbers), but also in any other topology on Q. Then the probabilities of events belong to the corresponding completion of the field of rational numbers. As special cases, we obtain the ordinary real probability theory (von Mises's definition) and p-adic probability theories, $p = 2, 3, 5, \ldots$.

How should one choose the topology of statistical stabilization for a given statistical sample? The topology is determined by the properties of the studied probability model. In essence, we propose this principle: for each probability model there is a corresponding topology (or topologies) of statistical stabilization.

For example, in a random sample there need not be any statistical stabilization of the relative frequencies in the real metric. Thus, from the point of view of real probability theory this is not a probabilistic object. However, in this random sample one may observe p-adic statistical stabilization of the relative frequencies.

In essence, I am asserting that the foundation of probability theory is provided by rational numbers (relative frequencies) and not real numbers. Real probabilities of events merely represent one of many possibilities that arise in the statistical analysis of a random sample. Such an approach to probability theory agrees well with Volovich's proposition that rational numbers are the foundation of theoretical physics [19]. In accordance with this proposition, everything physical is rational, and number fields that are different from the field of rational numbers arise as an idealization needed for the theoretical description of physical results.

All necessary information on p-adic (and more general m-adic) numbers can be found in Appendix 1 of this paper. However, in the first two sections they are hardly used at all, and we may restrict ourselves to the remark that in addition to the completion of the field of rational numbers Q with respect to the real metric there also exist completions with respect to other metrics, and among these completions there are the fields of p-adic numbers $\mathbf{Q}_p$, $p = 2, 3, 5, \ldots$.

2 Analysis of the foundation of probability theory

2.1. Frequency Definition of Probability. As is well known, the frequency definition of probability proposed by von Mises [15] in 1919 played an important role in the construction of the foundations of modern probability theory. This definition exerted a strong influence on the theory of probability measures, the foundations of which were laid by Borel [20], Kolmogorov [17], and Frechet [21]. There is no point in giving here Kolmogorov's axioms (which can be found in any textbook on probability theory) but it is probably necessary to recall in its general features the main propositions of von Mises's theory of probability. The theory is based on infinite sequences $x = (x_1, x_2, \ldots, x_n, \ldots)$ of samplings or observations. If an experiment having $S$ outcomes is made, then $x_j$ can take values $1, 2, \ldots, S$ (possible outcomes). For the standard experiment of coin tossing, we have $S = 2$ and $x_j = 1, 2$. In what follows possible outcomes of an experiment will be called labels.

However, not every such sequence is regarded as an object of probability theory. The fundamental principle of the frequency theory of probability is the principle of statistical stabilization of the relative frequencies of occurrence of a particular label and only sequences of samplings that satisfy this principle are regarded as objects of probability theory. Such sequences of samplings are called collectives.

"A collective is a bulk phenomenon or a repeated process, in brief a series of individual observations for which one is justified in assuming that the relative frequency of occurrence of each individual observable label tends to a definite limiting value" [16].

The probability of an event E is defined as the limit of the sequence of frequencies $\nu_N = n/N$, where n is the number of cases in which the event E is detected in the first N tests.

For the subsequent considerations, it is important to note that in the statistical analysis of the results of an experiment only rational numbers (relative frequencies) are obtained.
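The remark that only rational numbers appear can be made concrete with exact arithmetic. The sketch below is an illustration under stated assumptions: the function name and the short record of coin tosses are hypothetical, and a finite record can of course only show the first few terms of the sequence of relative frequencies, not its limit.

```python
from fractions import Fraction

def relative_frequencies(sequence, label):
    """Return the exact rational frequencies n/N of `label` for N = 1, 2, ..."""
    freqs = []
    count = 0
    for n_trials, outcome in enumerate(sequence, start=1):
        if outcome == label:
            count += 1
        freqs.append(Fraction(count, n_trials))
    return freqs

# Hypothetical record of coin tosses with labels 1 and 2.
tosses = [1, 2, 2, 1, 1, 2, 1, 2]
freqs = relative_frequencies(tosses, label=1)
print(freqs)  # every entry is an exact rational number n/N
```

Whether such a sequence stabilizes, and to what limit, depends on the topology chosen on the rational numbers.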

The principle of statistical stabilization of the relative frequencies is used practically unchanged in mathematical statistics.

"Observations of the frequency $\nu_N$ of a fixed event E for increasing values of N reveals that this frequency has, generally speaking, a tendency to take a more or less constant value at large N" (see Cramer [18]).

In defining a collective, von Mises used a further principle - the principle of irregularity of a sequence of tests, i.e., invariance of the limit of the relative frequencies with respect to the selection, made using a definite law, from a given sequence of tests $x = (x_1, x_2, \ldots, x_n, \ldots)$ of some subsequence. It is important that the law of this selection should not be based on the difference of the elements of the sequence with respect to the considered label.

"Second, this limiting value must remain unchanged if from the complete sequence we choose arbitrarily any part and consider in what follows only this part" [16].

This principle, like the principle of statistical stabilization of the relative frequencies, is fully in accord with our intuitive ideas of randomness. However, there are here some logical difficulties associated with the "arbitrariness" of the choice. A detailed analysis of these logical problems was made by Khinchin [22], see also [12] for the details. It appears that one must agree with Khinchin's critical comments and consider the frequency theory of probability that is based only on von Mises's first principle - the principle of statistical stabilization of the relative frequencies.

As is noted in [22], the frequency theory of probability based solely on von Mises's first principle is axiomatized and is as rigorous a mathematical theory as Kolmogorov's theory of probability. Here, we do not intend to consider von Mises's theory of probability in the framework of an axiomatic approach. Our task is to analyze the principle of stabilization of the frequencies of occurrence of a particular event in a collective.

2.2. Von Mises Frequency Theory of Probabilities as Objective Foundation of Kolmogorov's Axiomatics.

As motivation of his axioms, Kolmogorov used the properties of limits of relative frequencies, see [17]. We shall be interested in the manner in which Kolmogorov's axiom 2 arose; in accordance with this axiom, the probability $P(E)$ of any event E is a nonnegative real number $\le 1$. In [17], Kolmogorov considers von Mises's definition [16] of probability as the limit of the relative frequencies of occurrence of the event E. Further, since the relative frequencies $\nu^{(E)} = n/N$ are rational numbers that lie between zero and unity, their limits in the real topology are real numbers between zero and unity. Cramer proceeded similarly in the construction of his theory of probability distributions [18].

Khinchin, discussing the advantages of Kolmogorov's axioms over von Mises's frequency theory of probability, noted that "...from the formal aspect, the mutual relationship between the axiomatic and frequency theories is characterized in the first place by a higher degree of abstraction of the former."

This higher degree of abstraction was the foundation of the successful development of the theory of probability measures. However, this degree of abstraction is too high, and some properties of the world of real frequencies are lost in it. Essentially, the rational numbers were lost in Kolmogorov's theory of probability. Whereas in von Mises's theory the rational numbers arise as primary objects, and real probabilities are obtained as a result of a limiting process for rational frequencies, in Kolmogorov's theory rational frequencies are secondary objects associated with real probabilities (which are here primary) by means of the law of large numbers.

3 General principle of statistical stabilization of relative frequencies

First, we emphasize that the probabilities P in von Mises's frequency theory are ideal objects (symbols to denote the sequences of relative frequencies that are stabilized in the field of real numbers). Therefore, real numbers arise here as ideal objects associated with rational sequences of frequencies (see also Borel [20] and Poincare [23]).

A basis for a broader view of probability theory is provided by the following principle of statistical stabilization of frequencies:

Statistical stabilization (the limiting process) can be considered not only in the real topology on the field of rational numbers Q but also in any other topology on Q. The probabilities of events are defined as the limits of the sequences of relative frequencies in the corresponding completions of the field of rational numbers.

For each considered probability model, there is a corresponding topology on the field of rational numbers. The metrizable topologies on Q given by absolute values are the most interesting. By virtue of Ostrovskii's theorem, there are very few such topologies; indeed, besides the usual real topology, for which $\rho(x,y) = |x - y|$, there exist only the p-adic topologies, $p = 2, 3, \ldots$, where $\rho(x,y) = |x - y|_p$. Thus, if we consider only topologies given by absolute values, then, besides the usual probability theory over R, we obtain only the probability theories over $\mathbf{Q}_p$.
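For readers who want to experiment with the p-adic metric on rational numbers, a minimal sketch follows. It uses the standard definition $|x|_p = p^{-\mathrm{ord}_p(x)}$, where $\mathrm{ord}_p(x)$ is the exponent of $p$ in the factorization of $x$; the function names `padic_abs` and `padic_dist` are chosen here for illustration only.

```python
from fractions import Fraction

def padic_abs(x, p):
    """p-adic absolute value |x|_p of a rational x, returned as a Fraction."""
    x = Fraction(x)
    if x == 0:
        return Fraction(0)
    num, den = x.numerator, x.denominator
    order = 0  # exponent of p in x = p**order * (a/b), p dividing neither a nor b
    while num % p == 0:
        num //= p
        order += 1
    while den % p == 0:
        den //= p
        order -= 1
    return Fraction(1, p) ** order

def padic_dist(x, y, p):
    """p-adic distance |x - y|_p between two rationals."""
    return padic_abs(Fraction(x) - Fraction(y), p)

# 1 and 1 + 2**10 are far apart on the real line but 2-adically close.
print(abs(1 - (1 + 2**10)), padic_dist(1, 1 + 2**10, 2))
```

Numbers that differ by a high power of $p$ are close in this metric, which is why a sequence of rationals can stabilize p-adically while fluctuating in the real metric.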

It is here necessary to introduce a natural restriction on the topology of statistical stabilization.

The completion $\mathbf{Q}_t$ of the field of rational numbers Q with respect to the statistical stabilization topology $t$ is a topological field.

We have deliberately not introduced this restriction into the general principle of statistical stabilization. One can also consider statistical stabilization topologies that are not consistent with the algebraic structure on Q. However, probability theory based on such topologies loses many familiar properties. For example, it turns out that the continuity of the addition operation is equivalent to additivity of probabilities, and continuity of the division operation is equivalent to the existence of conditional probabilities.

Let $x = (x_1, x_2, \ldots, x_n, \ldots)$ be some collective. We denote the set of all labels for this collective (possible outcomes of an experiment producing this collective) by the symbol $\Pi$. We denote by $\Omega$ the event consisting in the realization of at least one of the labels $\pi \in \Pi$.

Proposition 3.1. The probability of the event $\Omega$ is equal to unity.

To prove this, it is sufficient to use the fact that all the relative frequencies are equal to unity.

Let $\nu^{(j)}$, $j = 1, 2$, be the relative frequencies of realization of certain labels $\pi_1$ and $\pi_2$, and $P_j = \lim \nu^{(j)}$ be the corresponding probabilities. Let event $A$ be the realization of the label $\pi_1$ or $\pi_2$: $A = \pi_1 \vee \pi_2$. Using the continuity of the addition operation, we obtain

$P(A) = \lim \nu^{(A)} = \lim(\nu^{(1)} + \nu^{(2)}) = \lim \nu^{(1)} + \lim \nu^{(2)} = P_1 + P_2$. (1)

This rule can be generalized to any number of mutually exclusive events.

Proposition 3.2. Let $A_j$, $j = 1, \ldots, k$, be mutually exclusive events (i.e., the sets of labels that define these events are disjoint). Then

$P(A_1 \vee \ldots \vee A_k) = \sum_{j=1}^{k} P(A_j)$. (2)

Using the continuity of the subtraction operation, we obtain the following proposition.

Proposition 3.3. For any two events $A$ and $B$, the equation $P(A \vee B) = P(A) + P(B) - P(A \wedge B)$ holds.

In the language of collectives, the rule of addition of probabilities is formulated as follows, see [16]: "Beginning with an original collective possessing more than two labels, an appreciable number of new collectives can be constructed by 'uniting' labels; the elements of the new collective are the same as in the original one, but their labels are unifications of the labels of the original collective...." To the unification of labels there corresponds the addition of frequencies.

We consider the set of rational numbers $U = \{x \in \mathbf{Q} : 0 \le x \le 1\}$. We denote by the symbol $U_t$ the closure of the set $U$ in the field $\mathbf{Q}_t$ (if $t$ is the ordinary real topology, then $U_t = [0,1]$). An obvious consequence of the definition of probabilities is the following proposition.

Proposition 3.4. The probability $P(E)$ of any event $E$ belongs to the set $U_t$.

208

Conditional probabilities are then introduced into the frequency theory in the same way as in [16]. Suppose there is some initial collective $x = (x_1, x_2, \ldots, x_n, \ldots)$ with probabilities $P_\pi$ of the labels, $\pi \in \Pi$. Using the unification rule, we define the probabilities of all groups of labels:

$P(A) = \sum_{\pi \in A} P_\pi$. (3)

We fix some group of labels $B = \pi_{i_1} \vee \ldots \vee \pi_{i_k}$. We are interested in the conditional probability $P(\pi/B)$, $\pi \in B$, of the label $\pi$ given the condition $B$. We form a new collective $x' = (x'_1, x'_2, \ldots, x'_n, \ldots)$, which is obtained from the original one by choosing only the elements with the labels $\pi' \in B$. The probability of the label $\pi$ in this new collective is then called the conditional probability of the label $\pi$ under the condition $B$: $P(\pi/B) = \lim \nu^{(\pi/B)}$, where $\nu^{(\pi/B)}$ are the relative frequencies of the label $\pi$ in the new collective. Noting that $\nu^{(\pi/B)} = \nu^{(\pi)}/\nu^{(B)}$, where $\nu^{(\pi)}$ is the relative frequency of the label $\pi$ in the collective $x$, and $\nu^{(B)}$ is the relative frequency of the event $B$ in the collective $x$, we obtain (using the continuity of the division operation)

$P(\pi/B) = \lim \frac{\nu^{(\pi)}}{\nu^{(B)}} = \frac{\lim \nu^{(\pi)}}{\lim \nu^{(B)}} = \frac{P(\pi)}{P(B)}$, $P(B) \ne 0$. (4)
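The identity between the frequency in the sub-collective and the ratio of frequencies in the original collective holds exactly at every finite stage, before any limit is taken. A small sketch with a hypothetical finite record (the data and the helper name `frequency` are assumptions for illustration) shows this:

```python
from fractions import Fraction

def frequency(sequence, labels):
    """Relative frequency of the event 'outcome belongs to `labels`'."""
    hits = sum(1 for outcome in sequence if outcome in labels)
    return Fraction(hits, len(sequence))

# Hypothetical finite record of a collective with labels 1, 2, 3.
x = [1, 2, 3, 1, 1, 2, 3, 3, 1, 2]
B = {1, 2}

# Sub-collective: keep only the elements whose label belongs to B.
x_prime = [outcome for outcome in x if outcome in B]

nu_conditional = frequency(x_prime, {1})            # frequency of label 1 in x'
nu_ratio = frequency(x, {1}) / frequency(x, B)      # ratio of frequencies in x
print(nu_conditional, nu_ratio)
```

Passing to the limit on both sides then needs only the continuity of division in the chosen completion, which is where the restriction to topological fields enters.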

The general formula can be proved similarly.

Proposition 3.5. $P(A/B) = P(A \wedge B)/P(B)$, $P(B) \ne 0$.

We now introduce the concept of independence of events. Analyzing arguments in the book [16], one notes that the rule of multiplication of probabilities for independent events is equivalent to the continuity of the multiplication operation.

An important property that makes it possible to use p-adic probabilities when considering standard problems of probability theory is the p-adic interpretation of the probabilities zero and one (which are probabilities in the sense of ordinary probability theory).

Indeed, the equation $P(E) = 0$ in ordinary probability theory does not mean that the event $E$ is impossible. It merely means that in a long series of experiments the event $E$ occurs in a very small fraction of cases. However, in a large number of experiments this fraction can be relatively large. Moreover, the equation $P(E) = 0$ "lumps together" a huge class of events that intuitively appear to have different probabilities. For example, suppose we consider two events, $E_1$ and $E_2$, and in the first

$N = N_k = \left(\sum_{j=0}^{k} 2^j\right)^2$ (5)

trials the event $E_1$ is realized $n^{(1)} = 2^k$ times and the event $E_2$ is realized

$n^{(2)} = \sum_{j=0}^{k} 2^j$ (6)

times. It is intuitively clear that the probabilities of these events must be different. However, in real probability theory

$P_1 = \lim n^{(1)}/N = P_2 = \lim n^{(2)}/N = 0$. (7)

It is different in 2-adic probability theory. Stabilization in the 2-adic topology gives

$P_1 = 0 \ne P_2 = -1$,

since in $\mathbf{Q}_2$ we have $2^k \to 0$, $k \to \infty$, and for $-1$ we have the representation $-1 = 1 + 2 + 2^2 + \ldots + 2^n + \ldots$ We here encounter for the first time negative numbers for probabilities of events (compare to Wigner [24], Dirac [25], Feynman [26], see also [27], [28], [12]). Of course, these probabilities are forbidden by Kolmogorov's second axiom in ordinary probability theory (in von Mises's approach, they are forbidden by the choice of the topology of statistical stabilization). However, from the point of view of the frequency theory of probability $P = -1$ is only an ideal object, the symbol that denotes the limit of a sequence of relative frequencies. This symbol is in no way better and in no way worse than the symbol $P = 1/\pi$ in ordinary probability theory.
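The two stabilizations in this example can be compared numerically. The sketch below is illustrative only: the helper names `abs2` and `example` are introduced here, and the trial counts follow the example with $N_k = (1 + 2 + \ldots + 2^k)^2$. It evaluates the two relative frequencies for a given $k$ and measures their distance to the claimed limits in both metrics.

```python
from fractions import Fraction

def abs2(x):
    """2-adic absolute value of a rational x, returned as a Fraction."""
    x = Fraction(x)
    if x == 0:
        return Fraction(0)
    num, den, order = x.numerator, x.denominator, 0
    while num % 2 == 0:
        num //= 2
        order += 1
    while den % 2 == 0:
        den //= 2
        order -= 1
    return Fraction(1, 2) ** order

def example(k):
    """Return (nu1, nu2) for N_k = (1 + 2 + ... + 2**k)**2 trials."""
    s = sum(2**j for j in range(k + 1))
    n_trials = s**2
    nu1 = Fraction(2**k, n_trials)   # event E1 realized 2**k times
    nu2 = Fraction(s, n_trials)      # event E2 realized 1 + 2 + ... + 2**k times
    return nu1, nu2

for k in (2, 5, 10):
    nu1, nu2 = example(k)
    # real metric: both frequencies shrink towards 0
    # 2-adic metric: nu1 approaches 0, nu2 approaches -1
    print(k, float(nu1), float(nu2), abs2(nu1), abs2(nu2 + 1))
```

In the real metric the two columns of frequencies are indistinguishable in the limit, while the 2-adic distances $|\nu^{(1)}|_2$ and $|\nu^{(2)} + 1|_2$ both tend to zero, separating the two events.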

In this example negative p-adic probabilities were used to split the zero conventional (real) probability. So p-adic negative probabilities can be interpreted as infinitely small conventional probabilities. It may be that all negative probabilities that appear in quantum physics can be interpreted in such a way. If a conventional (real) probability is equal to zero, there is no conventional (real) element of reality. However, there is a nonconventional (p-adic) element of reality that is realized with negative probability. Real and p-adic probabilities correspond to different classes of measurement procedures. An element of reality that would be impossible to observe by using a 'real measurement procedure' might be observed by using a 'p-adic measurement procedure.'

One can treat similarly the case of a probability (in the sense of the ordinary theory) equal to unity. For example, suppose

$N = N_k = \left(\sum_{j=0}^{k} 2^j\right)^2$, $n^{(1)} = \left(\sum_{j=0}^{k} 2^j\right)^2 - 2^k$, $n^{(2)} = \left(\sum_{j=0}^{k} 2^j\right)^2 - \sum_{j=0}^{k} 2^j$. (8)

In 2-adic probability theory, we find that

$P_1 = 1 \ne P_2 = 1 - \left(1 \Big/ \sum_{j=0}^{\infty} 2^j\right) = 2$. (9)

We see here that natural numbers not equal to unity also belong to the set $U_p$.
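The limits in (9) can be checked by a short computation. The abbreviation $S_k = \sum_{j=0}^{k} 2^j = 2^{k+1} - 1$ is introduced here for convenience; since $S_k$ is odd, $|S_k|_2 = 1$. Writing $\nu^{(i)} = n^{(i)}/N_k$ for the relative frequencies, one obtains:

```latex
\nu^{(1)} - 1 = -\frac{2^k}{S_k^2},
\qquad
\left|\nu^{(1)} - 1\right|_2 = 2^{-k} \to 0,
\\[4pt]
\nu^{(2)} - 2 = -1 - \frac{1}{S_k} = -\frac{S_k + 1}{S_k} = -\frac{2^{k+1}}{S_k},
\qquad
\left|\nu^{(2)} - 2\right|_2 = 2^{-(k+1)} \to 0 .
```

In the real metric both $\nu^{(1)}$ and $\nu^{(2)}$ tend to 1, so the real statistics cannot distinguish the two events.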

In this example p-adic (integer) probabilities which are larger than 1 were used to split the conventional (real) probability one. So under the p-adic consideration a conventional element of reality can be split into a few p-adic elements of reality.

In the framework of p-adic statistical stabilizations there is also "nothing seditious" about complex probabilities. For example, let $p \equiv 1 \pmod 4$. Then $i = (-1)^{1/2} \in \mathbf{Q}_p$. Let

$i = i_0 + i_1 p + i_2 p^2 + \cdots$, $i_r = 0, 1, \ldots, p - 1$, (10)

be the canonical decomposition of the imaginary unit in powers of $p$. Note also that for any $p$

$-1 = (p-1) + (p-1)p + (p-1)p^2 + \ldots$ (11)

Then for rational relative frequencies, we have

$\nu = \frac{i_0 + i_1 p + \ldots + i_k p^k}{(p-1) + (p-1)p + \ldots + (p-1)p^k} \to -i$ (12)

in the p-adic topology.

Geometrically, one may suppose that the new probability theory is a transition from one-dimensional probabilities on the interval [0,1] to multidimensional probabilities.
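The digits $i_r$ of the imaginary unit can be computed by Hensel lifting, and the resulting rational frequencies can be tested against the defining property of $\pm i$, namely that $\nu^2 + 1$ becomes p-adically small. The following sketch is an illustration, not a construction from the paper: the function names are hypothetical, the choice of the smaller root modulo $p$ is arbitrary, and `pow(a, -1, m)` requires Python 3.8 or later.

```python
from fractions import Fraction

def sqrt_minus_one(p, digits):
    """Return x in [0, p**digits) with x*x = -1 (mod p**digits).

    Requires a prime p with p % 4 == 1. Starts from the smallest root
    modulo p and lifts it one power of p at a time (Hensel lifting).
    """
    if p % 4 != 1:
        raise ValueError("need p = 1 (mod 4)")
    x = next(r for r in range(1, p) if (r * r + 1) % p == 0)
    modulus = p
    for _ in range(digits - 1):
        modulus *= p
        # Newton step for f(x) = x^2 + 1; f'(x) = 2x is invertible mod p
        x = (x - (x * x + 1) * pow(2 * x, -1, modulus)) % modulus
    return x

def frequency(p, k):
    """Rational frequency n/N built from the digits i_0, ..., i_k of i."""
    n = sqrt_minus_one(p, k + 1)     # i_0 + i_1 p + ... + i_k p^k
    big_n = p ** (k + 1) - 1         # (p-1) + (p-1)p + ... + (p-1)p^k
    return Fraction(n, big_n)

nu = frequency(5, 6)
# nu**2 + 1 has a numerator divisible by 5**7, so it is 5-adically small
print(nu, (nu.numerator**2 + nu.denominator**2) % 5**7)
```

Each such $\nu$ is an ordinary rational number between 0 and 1, so it is a legitimate relative frequency; only its p-adic limit is 'complex'.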

4 Probability distribution of a collective

Let $x = (x_1, \ldots, x_k, \ldots)$ be some collective, and $\Pi$ be the set of labels of this collective. We consider the simplest case when the set $\Pi$ is finite, $\Pi = (1, \ldots, S)$. We denote by $\nu^{(j)}$ the relative frequency of the $j$-th label and by $P_j = \lim \nu^{(j)}$ the corresponding probability. In the frequency theory, the set of probabilities $P_x = (P_1, \ldots, P_S)$ is called the probability distribution of the collective $x$.

The general principle of statistical stabilization makes it possible to consider not only real distributions but also distributions for other number fields. For one and the same collective $x$, there can exist distributions over different number fields. Thus, in the proposed approach a collective has, in general, an entire spectrum of distributions, $P_{x,t} = (P_{1,t}, \ldots, P_{S,t})$, where $t$ are the topologies of statistical stabilization for the given collective. Therefore, one here studies a more subtle structure of the collective. The relative frequencies are investigated not only for real stabilization but for a complete spectrum of stabilizations.

In connection with the existence of an entire spectrum of probability distributions of a collective, it is necessary to make some comments.

First, this agrees well with von Mises's principle that "the collective comes first and the probabilities after." Indeed, a probability distribution is an object derived from a collective, and to one and the same collective there corresponds an entire spectrum of probability distributions, these reflecting different properties of the collective.

Second, each statistical stabilization determines some physical property of the investigated object. For example, if in a statistical experiment involving the tossing of a coin the probability of heads is $P_1$ and tails is $P_2$, then these probabilities are physical characteristics of the coin like its mass or volume. This question is discussed in detail in the books of Poincaré [23] and von Mises [16].

If we consider from this point of view the new principle of statistical stabilization, we obtain new physical characteristics of the investigated objects. For example, if in the real topology statistical stabilization is absent, then it is not possible to obtain any physical constants in the language of ordinary probability theory. But these constants could exist and be, for example, p-adic numbers. If a collective has not only a real probability distribution but an entire spectrum of other distributions, then, besides real constants corresponding to physical properties of the investigated object, we obtain an entire spectrum of new constants corresponding to physical properties that were hidden from the real statistics. Note that these new constants can also be ordinary rational numbers.

5 Model examples of p-adic statistics

5.1 Plantation with Red and White Flowers.

As one of the first examples of a collective, von Mises considered [16] a plantation sown with flowers of different colors, and he studied the statistical stabilization of the relative frequencies of each of the colors. We shall construct an analogous collective for which p-adic stabilization always occurs but real stabilization is in general absent.

Suppose there are flowers of two types: red (R) and white (W). The plantation (or, rather, infinite bed) is sown in a random order with red and white flowers, the flowers being sown in series formed by blocks of p flowers, the length of the series (the power of p) being also determined in accordance with a random rule.

Namely, suppose there are two generators of random numbers: 1) $j = 0, 1$; 2) $i = 1, 2$ (each value with probability 0.5). If $j = 0$, then a series of red flowers is sown; if $j = 1$, then a series of white ones. The length of each series is defined as follows: the length of the first series is some power $p^{l_1}$ (it can also be determined in accordance with a random rule); if the length of the previous series was $p^{l_m}$, then the length of the next series is $p^{l_{m+1}}$, $l_{m+1} = l_m + i_m$.

We introduce the relative frequencies of the red and white flowers in the first $m$ series: $\nu_m^{(R)} = n_m^{(R)}/N_m$, $\nu_m^{(W)} = n_m^{(W)}/N_m$.

Proposition 5.1. For all generators of the random numbers $j$ and $i$, there is statistical stabilization of the relative frequencies $\nu^{(R)}$ and $\nu^{(W)}$ in the p-adic topology.

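Thus, we have defined p-adic probabilities $P_R = \lim \nu^{(R)}$ and $P_W = \lim \nu^{(W)}$, and

$$P_R = \frac{\sum_{n=1}^{\infty} (1 - j_n)\, p^{l_n}}{\sum_{n=1}^{\infty} p^{l_n}}, \qquad P_W = \frac{\sum_{n=1}^{\infty} j_n\, p^{l_n}}{\sum_{n=1}^{\infty} p^{l_n}}. \qquad (13)$$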

Note that in general there is no real statistical stabilization for such a random plantation. If the generator of the random numbers $j$ gives series 0 or 1, then $\nu^{(R)}$ and $\nu^{(W)}$ in the real topology can oscillate from zero to unity.

Thus, a real observer (an investigator who carries out statistical analysis of the sample in the field of real numbers) cannot obtain any statistically regular law.

He will obtain only a random variation of the series of real relative frequencies. In contrast, the p-adic observer (the investigator who makes a statistical analysis of the sample in the field of p-adic numbers) will obtain a well-defined law, consisting of the stabilization of the outcomes in the p-adic decomposition of the relative frequencies.

It is evident that in the example of probability theory we observe a new fundamental approach to the investigation of natural phenomena. In accordance with this approach, experimental results must be analyzed not only in the field of real numbers but also in p-adic fields.

Naturally, our example is purely illustrative, but it does appear to reflect many very important properties of p-adic statistics.
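The plantation model can be simulated directly. The sketch below is illustrative only: the helper names (`padic_digits`, `plantation`, `common_prefix`), the seed, and the starting exponent `l1` are assumptions, not part of the original model description. It prints, after each new series, the real value of the red-flower frequency and the number of leading p-adic digits that did not change.

```python
from fractions import Fraction
import random


def padic_digits(x, p, k):
    """First k digits (lowest power of p first) of a rational p-adic integer x."""
    x = Fraction(x)
    digits = []
    for _ in range(k):
        if x.denominator % p == 0:
            raise ValueError("x is not a p-adic integer")
        # Digit = x mod p, computed as numerator * denominator^(-1) mod p.
        d = (x.numerator * pow(x.denominator, -1, p)) % p
        digits.append(d)
        x = (x - d) / p
    return digits


def plantation(p, m, l1=1, seed=0):
    """Return [(l_n, nu_n^(R))] after each of the first m series."""
    rng = random.Random(seed)
    l, red, total, history = l1, 0, 0, []
    for _ in range(m):
        j = rng.randint(0, 1)      # 0 -> red series, 1 -> white series
        red += (1 - j) * p ** l
        total += p ** l
        history.append((l, Fraction(red, total)))
        l += rng.randint(1, 2)     # exponent increment i = 1 or 2
    return history


def common_prefix(a, b):
    """Number of leading positions in which two digit lists agree."""
    n = 0
    for u, v in zip(a, b):
        if u != v:
            break
        n += 1
    return n


if __name__ == "__main__":
    p, K = 2, 40
    hist = plantation(p, 12)
    for (_, nu_prev), (_, nu_next) in zip(hist, hist[1:]):
        stable = common_prefix(padic_digits(nu_prev, p, K),
                               padic_digits(nu_next, p, K))
        print(f"real value {float(nu_next):.4f}   unchanged p-adic digits: {stable}")
```

The difference of two consecutive frequencies is divisible by $p^{l_{m+1} - l_1}$, so at least that many leading digits are unchanged at each step, whatever the random generators produce.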


Remark 5.1. Intuitively, one supposes that in a real plantation it is possible to find a white flower next to almost every red flower; in contrast, large groups (clusters) of red and white flowers are distributed randomly over a p-adic plantation (one can sow not only a bed but also distribute series of red and white flowers over a plane in accordance with a random rule). A real random plane is obtained if one throws at random red and white points onto the plane; in contrast, a p-adic random plane is obtained if one throws patches of $p^l$ points at a time of red and white color onto the plane.

In Appendix 2, we give the results of statistical analysis of the results of a random modeling on a computer of the proposed probability model. There is very rapid p-adic stabilization of the relative frequencies and no stabilization in the sense of ordinary real probability theory.

Remark 5.2. Evidently, the structure of series formed by powers of p need not necessarily be directly observed in a statistical sample. This structure is introduced by rounding the number of results to powers of p. In very large statistical samples, one can take into account only the orders of the numbers, and one thereby introduces into the sample a 10-adic structure.

5.2. Random Choice of the Digit of a p-Adic Number.

Suppose there are two labels: 1 and 2; $j$ is a generator of random numbers corresponding to the choice of one of the labels. Each random label is produced in series, the length of the series being determined by random choice of the next p-adic digit, i.e., there is a generator of random numbers $\alpha$ that take the values $\alpha = 0, 1, \ldots, p - 1$, and the length of the next series is $\alpha_n p^{n-1}$, $n = 1, 2, \ldots$. We introduce the relative frequencies $\nu^{(1)}$ and $\nu^{(2)}$.

Proposition 5.2. For all generators of the random numbers $j$ and $\alpha$ there is statistical stabilization of the relative frequencies $\nu^{(1)}$ and $\nu^{(2)}$ in the p-adic topology.

Thus, the following p-adic probabilities are defined:

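$$P_1 = \frac{\sum_{n=1}^{\infty} (1 - j_n)\, \alpha_n p^{n-1}}{\sum_{n=1}^{\infty} \alpha_n p^{n-1}}, \qquad P_2 = \frac{\sum_{n=1}^{\infty} j_n\, \alpha_n p^{n-1}}{\sum_{n=1}^{\infty} \alpha_n p^{n-1}}.$$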

In the real topology, there is, in general, no statistical stabilization.

Appendix 1

Every rational number $x \ne 0$ can be represented in the form

$$x = p^r \frac{m}{n},$$

where p does not divide m and n. Here p is a fixed prime. The p-adic absolute value (norm) for the rational number x is defined by the equations $|x|_p = p^{-r}$ for $x \ne 0$, and $|0|_p = 0$. This absolute value has the usual properties: 1) $|x|_p \ge 0$, $|x|_p = 0 \leftrightarrow x = 0$; 2) $|xy|_p = |x|_p |y|_p$, and satisfies a strong triangle inequality: 3) $|x + y|_p \le \max(|x|_p, |y|_p)$.

The completion of the field of rational numbers with respect to the metric $\rho(x, y) = |x - y|_p$ is called the field of p-adic numbers and denoted by the symbol $Q_p$. It is a locally compact field. Numbers in the unit ball $Z_p = \{x \in Q_p : |x|_p \le 1\}$ of the field $Q_p$ are called integer p-adic numbers. From the strong triangle inequality, we obtain a theorem which states that a series in the field $Q_p$ converges if and only if its general term tends to zero. Any p-adic number can be represented in a unique manner in the form of a (convergent) series in powers of p:

$$x = \sum_{j=k}^{\infty} a_j p^j, \quad a_j = 0, 1, \ldots, p - 1; \quad k = 0, \pm 1, \ldots, \qquad (15)$$

with $|x|_p = p^{-k}$ (for $a_k \ne 0$).

One can define similarly m-adic numbers, where m is any natural number, $m \ge 2$. In the general case, property 2) is replaced by the weaker property $|xy|_m \le |x|_m |y|_m$, i.e., $|x|_m$ is a pseudonorm. The completion of the field Q in the metric $\rho(x, y) = |x - y|_m$ will not be a field (for m that are not prime). It is only a ring. Here, we already encounter some deviations from the ordinary probability rules (which can be extended without any changes to p-adic probabilities). For example, one can have a situation of the following kind: A and B are independent events, $P(A) \ne 0$ and $P(B) \ne 0$, but $P(A \wedge B) = 0$. In particular, the conditional probability P(A/B) is in general not defined for an event B having nonvanishing probability.
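The properties listed in this appendix are easy to check numerically on rational numbers. The following sketch uses hypothetical helper names (`valuation`, `padic_norm`, `madic_pseudonorm`), and the m-adic pseudonorm is taken, as an assumption, to be $m^{-k}$ with $k$ the largest power of $m$ dividing a nonzero integer.

```python
from fractions import Fraction


def valuation(n, m):
    """Largest k such that m**k divides the nonzero integer n."""
    k = 0
    while n % m == 0:
        n //= m
        k += 1
    return k


def padic_norm(x, p):
    """p-adic absolute value |x|_p of a rational number x, for a prime p."""
    x = Fraction(x)
    if x == 0:
        return Fraction(0)
    r = valuation(x.numerator, p) - valuation(x.denominator, p)
    return Fraction(1, p) ** r


def madic_pseudonorm(n, m):
    """Assumed m-adic pseudonorm of a nonzero integer n (m need not be prime)."""
    return Fraction(1, m) ** valuation(n, m)


x, y = Fraction(50, 3), Fraction(7, 25)
print(padic_norm(x, 5), padic_norm(y, 5))                       # 1/25 and 25
print(padic_norm(x * y, 5) == padic_norm(x, 5) * padic_norm(y, 5))
print(padic_norm(x + y, 5) <= max(padic_norm(x, 5), padic_norm(y, 5)))

# For m = 10 multiplicativity fails: |2|_10 = |5|_10 = 1 but |10|_10 = 1/10.
print(madic_pseudonorm(10, 10), madic_pseudonorm(2, 10) * madic_pseudonorm(5, 10))
```

The last line illustrates why only the inequality $|xy|_m \le |x|_m |y|_m$ survives for composite m.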

Appendix 2

We give here the results of a random experiment (modeled on a computer) for a 2-adic plantation. The results of this experiment give a good illustration of a situation in which there is no statistical stabilization in the real topology, but there is statistical stabilization in the 2-adic topology. In the following tables, m is the number of a random experiment in which two random numbers are modeled, one corresponding to the choice of a flower and the other to the length of the series of this flower; d is the number of elements in the sample. Because of the exponential growth of the number of elements in the series, d increases very rapidly.

The table of relative frequencies in the field of real numbers is:

  m     d        $\nu^{(W)}$   $\nu^{(R)}$
  4     10       0.1304        0.8696
  5     10^2     0.6364        0.3636
  6     10^3     0.1913        0.8087
  7     10^3     0.0504        0.9496
  12    10^5     0.0006        0.9994
  13    10^5     0.5335        0.4665
  14    10^6     0.1703        0.8297
  22    10^9     0.0022        0.9978
  23    10^10    0.7453        0.2547

Thus, for the relative frequencies in the field of real numbers there is no stabilization of even the first digit after the decimal point. We examined large sequences of experiments on the computer in which the oscillations continued. The calculations in the field $Q_2$ give the results

N = 10

$\nu^{(W)}$ = 101011111011000000110100010111011000110011011110110001011
$\nu^{(R)}$ = 001100000100111111001011101000100111001100100001001110100

N = 20

$\nu^{(W)}$ = 10101111101100111011001100101111110000011100111000000001
$\nu^{(R)}$ = 00110000010011000100110011010000001111100011000111111110

N = 30

$\nu^{(W)}$ = 101011111011001110110011001111111100000000100110110000011
$\nu^{(R)}$ = 001100000100110001001100110000000011111111011001001111100

N = 40

$\nu^{(W)}$ = 101011111011001110110011001111111100000000010111001110100
$\nu^{(R)}$ = 001100000100110001001100110000000011111111101000110001011


Thus, after ten random experiments 14 digits are stabilized in the 2-adic decomposition for the relative frequency of occurrence of a red flower and 14 digits for a white flower; after 20 experiments, the numbers of digits that are stabilized are 27 for both colors; after 30 experiments, 42 digits are stabilized for each, and so forth.
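The digit counts quoted above can be recovered from the printed expansions themselves. As a small illustration (the helper name `common_prefix_length` is ours, and only the first of each pair of printed expansions is used), the number of stabilized digits between two stages is the length of the common prefix of the corresponding digit strings:

```python
def common_prefix_length(a, b):
    """Number of leading characters shared by the strings a and b."""
    n = 0
    for u, v in zip(a, b):
        if u != v:
            break
        n += 1
    return n


# First printed 2-adic expansion at each stage N, copied from the results above.
nu_first = {
    10: "101011111011000000110100010111011000110011011110110001011",
    20: "10101111101100111011001100101111110000011100111000000001",
    30: "101011111011001110110011001111111100000000100110110000011",
    40: "101011111011001110110011001111111100000000010111001110100",
}

stable = {n: common_prefix_length(nu_first[n], nu_first[n + 10]) for n in (10, 20, 30)}
print(stable)  # {10: 14, 20: 27, 30: 42}
```

These values agree with the 14, 27 and 42 stabilized digits reported in the text.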

Appendix 3

We give the results of analysis of a statistical sample in a field of 5-adic numbers. Here, N is the number of random experiments, M is the number of elements of the sample, $M_1$ is the number of elements of the first label, and $M_2$ is the number of elements of the second label:

N: 2; M1: 002; M2: 00002; M: 00202
M1/M: 1044004400440044004400440044004400440044004400440044
M2/M: 0010440044004400440044004400440044004400440044004400

N: 3; M1: 002; M2: 000023; M: 002023
M1/M: 1040303403420004404141041024440040303403420004404141
M2/M: 0014141041024440040303403420004404141041024440040303

N: 4; M1: 00200002; M2: 000023; M: 00202302
M1/M: 1040303004000130020234341334320032124414032304024031
M2/M: 0014141440444314424210103110124412320030412140420413

N: 5; M1: 00200002; M2: 000023004; M: 002023024
M1/M: 1040301040132010043322212441423102032221232032034142
M2/M: 0014143404312434401122232003021342412223212412410302

N: 6; M1: 00200002; M2: 00002300403; M: 00202302403
M1/M: 1040301003131014113132222240403413222311230303113140
M2/M: 0014143441313430331312222204041031222133214141331304

N: 7; M1: 00200002; M2: 0000230040303; M: 0020230240303
M1/M: 1040301003202004101343032004014023441101104433243020
M2/M: 0014143441242440343101412440430421003343340011201424

Thus, in the analysis of the sample in the field of 5-adic numbers there is rapid stabilization of the digits in the 5-adic decomposition of the relative frequencies. For example, after 55 experiments 78 digits in the 5-adic decomposition of the relative frequencies are stabilized.
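The first row of the listing (N = 2) can be reproduced by hand or by a few lines of code. The sketch below assumes that each printed string lists 5-adic digits starting from the lowest power of 5 (an interpretation that is consistent with M = M1 + M2 in the listing); the helper names `from_digits` and `padic_digit_string` are ours.

```python
from fractions import Fraction


def from_digits(s, p):
    """Integer whose base-p digits, lowest power first, are the characters of s."""
    return sum(int(c) * p ** n for n, c in enumerate(s))


def padic_digit_string(x, p, k):
    """First k p-adic digits (lowest power first) of a rational p-adic integer x."""
    x = Fraction(x)
    out = []
    for _ in range(k):
        d = (x.numerator * pow(x.denominator, -1, p)) % p
        out.append(str(d))
        x = (x - d) / p
    return "".join(out)


m1 = from_digits("002", 5)      # 2 * 5^2 = 50
m2 = from_digits("00002", 5)    # 2 * 5^4 = 1250
m = from_digits("00202", 5)     # 1300
print(m1 + m2 == m)
print(padic_digit_string(Fraction(m1, m), 5, 20))   # 10440044004400440044
print(padic_digit_string(Fraction(m2, m), 5, 20))   # 00104400440044004400
```

Under this reading the computed expansions of M1/M = 1/26 and M2/M = 25/26 match the first twenty printed digits of the N = 2 row.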

When the sample is analyzed in the field of real numbers, there is again no statistical stabilization.

Acknowledgements

I would like to thank L. Ballentine and J. Summhammer for discussions on p-adic probabilities and elements of physical reality.

References

1. A. Einstein, B. Podolsky, N. Rosen, Phys. Rev., 47, 777-780 (1935).
2. P.S. Alexandrov, Introduction to general theory of sets and functions (Gostehizdat, Moscow, 1948).
3. R. Engelking, General Topology (PWN, Warszawa, 1977).
4. A.Yu. Khrennikov, Dokl. Akad. Nauk, 322, 1075-1079 (1992).
5. A.Yu. Khrennikov, J. of Math. Phys., 32, 932-937 (1991).
6. V.S. Vladimirov, I.V. Volovich, and E.I. Zelenov, p-adic analysis and mathematical physics (World Scientific Publ., Singapore, 1994).
7. Yu. Manin, Springer Lecture Notes in Math., 1111, 59-101 (1985).
8. P.G.O. Freund and E. Witten, Phys. Lett. B, 199, 191-195 (1987).
9. A.Yu. Khrennikov, Non-Archimedean Analysis: Quantum Paradoxes, Dynamical Systems and Biological Models (Kluwer Academic Publ., Dordrecht, 1997).
10. S. Albeverio, A.Yu. Khrennikov and R. Cianci, J. Phys. A, Math. and Gen., 30, 881-889 (1997).
11. A.Yu. Khrennikov, J. of Math. Physics, 39, 1388-1402 (1998).
12. A.Yu. Khrennikov, Interpretations of probability (VSP Int. Publ., Utrecht, 1999).
13. Z.I. Borevich and I.R. Shafarevich, Number Theory (Academic Press, New York, 1966).
14. W. Schikhof, Ultrametric calculus (Cambridge Univ. Press, Cambridge, 1984).
15. R. von Mises, Math. Z., 5, 52-99 (1919).
16. R. von Mises, Probability, Statistics and Truth (Macmillan, London, 1957).
17. A.N. Kolmogorov, Foundations of the Probability Theory (Chelsea Publ. Comp., New York, 1956).
18. H. Cramér, Mathematical theory of statistics (Univ. Press, Princeton, 1949).
19. I.V. Volovich, Number Theory as the Ultimate Physical Theory, Preprint, CERN, Geneva, TH. 4781/87 (1987).
20. E. Borel, Rend. Circ. Mat. Palermo, 27, 247 (1909).
21. M. Fréchet, Recherches théoriques modernes sur la théorie des probabilités (Univ. Press., Paris, 1937-1938).
22. A.Ya. Khinchin, Voprosi Filosofii, No 1, 92; No 2, 77 (1961) (in Russian).
23. H. Poincaré, About Science. Collection of works (Nauka, Moscow, 1983).
24. E. Wigner, Quantum-mechanical distribution functions revisited, in: Perspectives in quantum theory, W. Yourgrau and A. van der Merwe, editors (MIT Press, Cambridge MA, 1971).
25. P.A.M. Dirac, Proc. Roy. Soc. London, A 180, 1-39 (1942).
26. R.P. Feynman, Negative probability, in: Quantum Implications, Essays in Honour of David Bohm, 235-246, B.J. Hiley and F.D. Peat, editors (Routledge and Kegan Paul, London, 1987).
27. W. Mückenheim, Phys. Reports, 133, 338-401 (1986).
28. A.Yu. Khrennikov, Int. J. Theor. Phys., 34, 2423-2434 (1995).

"COMPLEMENTARITY" OR SCHIZOPHRENIA: IS PROBABILITY IN QUANTUM MECHANICS INFORMATION OR ONTA?

A. F. KRACKLAUER

Of the various "complementarities" or "dualities" evident in Quantum Mechanics (QM), among the most vexing is that afflicting the character of a 'wave function,' which at once is to be something ontological because it diffracts at material boundaries, and something epistemological because it carries only probabilistic information. Herein a description of a paradigm, a conceptual model of physical effects, will be presented that perhaps can provide an understanding of this schizophrenic nature of wave functions. It is based on Stochastic Electrodynamics (SED), a candidate theory to elucidate the mysteries of QM. The fundamental assumption underlying SED is the supposed existence of a certain sort of random, electromagnetic background, the nature of which, it is hoped, will ultimately account for the behavior of atomic scale entities as described usually by QM. In addition, the interplay of this paradigm with Bell's 'no-go' theorem for local, realistic extensions of QM will be analyzed.

1 Introduction

Of the various "complementarities" or "dualities" evident in Quantum Mechanics (QM), among the most vexing is that afflicting the character of a 'wave function,' which at once is to be something ontological because it diffracts at material boundaries, and something epistemological because it carries only probabilistic information. All other diffractable waves, it may be said, carry {momentum, energy}, not conceptual, abstract information, "ideas." All other probabilities are calculational aids, and like abstractions generally, are utterly unaffected by material boundaries. The literature is replete with resolutions of QM-conundrums selectively ignoring one or the other of these characteristics; in the end, they all fail.

Herein a description of a paradigm, a conceptual model of physical effects, will be presented that perhaps can provide an understanding of this schizophrenic nature of wave functions. It is based on Stochastic Electrodynamics (SED), a candidate theory to elucidate the mysteries of QM [1]. The fundamental concept underlying SED is the supposed existence of a certain sort of random, electromagnetic background, the nature of which, it is hoped, will ultimately account for the behavior of atomic scale entities as described usually by QM [2]. Among the successes of SED, one is a local realistic explanation of the diffraction of particle beams [3]. The core of this explanation is the notion that relative motion through the SED background effectively engenders de Broglie's pilot wave. Given such a pilot wave associated with a particle's motion, the statistical distribution of momentum in a density over phase space can be decomposed, in the sense of Fourier analysis, such that the resulting form of Liouville's Equation, under some conditions, is Schrödinger's Equation.

From this viewpoint, the 'schizophrenic' character of wave functions can be discussed and understood free of preternatural attributes. These concepts have broad implications, ranging from serious philosophical questions such as the "mind-body" dichotomy to popular science fiction effects such as teleportation. In addition, the peculiar nature of probability in QM is clarified.

Although much remains to be done to comprehensively interpret all of QM in terms of SED, many of the by now hoary 'paradoxes' can be rationally deconstructed.

A secondary (but intimately related) issue is that of determining the import of Bell's Theorem for the use of the SED paradigm to reconcile fully the interpretation of QM. Arguments will be presented showing that in his proof, Bell (essentially by misconstruing the use of conditional probabilities) called on inappropriate hypothetical presumptions, just as Hermann, de Broglie, Bohm and others found that Von Neumann did before him [4,5].

2 De Broglie waves as an SED effect

The foundation of the model or conceptual paradigm for the mechanism of particle diffraction proposed herein is Stochastic Electrodynamics (SED). Most of SED, for which there exists a substantial literature, is not crucial for the issue at hand [1]. The crux of SED can be characterized as the logical inversion of QM in the following sense. If QM is taken as a valid theory, then ultimately one concludes that there exists a finite ground state for the free electromagnetic field with energy per mode given by

$$E = \hbar\omega/2. \qquad (1)$$

SED, on the other hand, inverts this logic and axiomatically posits the existence of a random electromagnetic background field with this same spectral energy distribution, and then endeavors to show that ultimately, a consequence of the existence of such a background is that physical systems exhibit the behavior otherwise codified by QM. The motivation for SED proponents is to find an intuitive local realistic interpretation for QM, hopefully to resolve the well known philosophical and lexical problems as well as to inspire new attacks on other problems.


The question of the origin of this electromagnetic background is, of course, fundamental. In the historical development of SED, its existence has been posited as an operational hypothesis whose justification rests a posteriori on results. Nevertheless, lurking on the fringes from the beginning has been the idea that this background is the result of self-consistent interaction; i.e., the background arises out of interactions from all other electromagnetic charges in the universe [6].

For present purposes, all that is needed is the hypothesis that particles, as systems with charge structure (not necessarily with a net charge), are in equilibrium with electromagnetic signals in the background. Consider, for example, as a prototype system, a dipole with characteristic frequency $\omega$. Equilibrium for such a system in its rest frame can be expressed as

$$m_0 c^2 = \hbar\omega_0. \qquad (2)$$

This statement is actually tautological, as it just defines $\omega_0$, for which an exact numerical value will turn out to be practically immaterial.

This equilibrium in each degree of freedom is achieved in the particle's rest frame by interaction with counter propagating electromagnetic background signals in both polarization modes separately, which on the average, add to give a standing wave with antinode at the particle's position:

$$2\cos(k_0 x)\sin(\omega_0 t). \qquad (3)$$

Again, this is essentially a tautological statement as a particle doesn't 'see' signals with nodes at its location, thereby leaving only the others. Of course, everything is to be understood in an on-the-average, statistical sense.

Now consider Eq. (3) in a translating frame, in particular the rest frame of a slit through which the particle as a member of a beam ensemble passes. In such a frame the component signals under a Lorentz transform are Doppler shifted and then add together to give what appears as modulated waves:

$$2\cos(k_0\gamma(x - c\beta t))\,\sin(\omega_0\gamma(t - c^{-1}\beta x)), \qquad (4)$$

for which the second factor, the modulation, has wavelength $\lambda = (\gamma\beta k_0)^{-1}$. From the Lorentz transform of Eq. (2), $P = \hbar\gamma\beta k_0$, the factor $\gamma\beta k_0$ can be identified as the de Broglie wave vector from QM as expressed in the slit frame.
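The identification can be spelled out in a few lines. The steps below are a sketch that assumes the free-space relation $\omega_0 = c k_0$ for the background signals and writes $v = \beta c$ for the particle velocity in the slit frame; neither is stated explicitly in the text.

```latex
\begin{align*}
m_0 c^2 = \hbar\omega_0 = \hbar c k_0 \quad &\Rightarrow \quad m_0 c = \hbar k_0, \\
P = \gamma m_0 v = \gamma\beta\, m_0 c \quad &= \quad \hbar\gamma\beta k_0, \\
\lambda = (\gamma\beta k_0)^{-1} \quad &= \quad \hbar / P .
\end{align*}
```

Here $\lambda$ is the reduced wavelength; multiplying by $2\pi$ gives the familiar de Broglie relation $h/P$.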

In short, it is seen that a particle's de Broglie wave is modulation on what the orthodox theory designates Zitterbewegung. The modulation-wave effectively functions as a pilot wave. Unlike de Broglie's original conception in which the pilot wave emanates from the kernel, here this pilot wave is a kinematic effect of the particle interacting with the SED Background. Because this SED Background is classical electromagnetic radiation, it will diffract according to the usual laws of optics and thereafter modify the trajectory of the particle with which it is in equilibrium [3]. (See Ref. [1], Section 12.3, for a didactical elaboration of these concepts.)

The detailed mechanism for pilot wave steerage is based on observing that the energy pattern of the actual signal that pilot waves are modulating, and to which a particle tunes, comprises a fence or rake-like structure with prongs of varying average heights specified by the pilot wave modulation. These prongs, in turn, can be considered as forming the boundaries of energy wells in which particles are trapped; a series of micro-Paul-traps, as it were. Intuitively, it is clear that where such traps are deepest, particles will tend to be captured and dwell the longest. The exact mechanism moving and restraining particles is radiation pressure, but not as given by the modulation, rather by the carrier signal itself. Of course, because these signals are stochastic, well boundaries are bobbing up and down somewhat so that any given particle with whatever energy it has will tend to migrate back and forth into neighboring cells as boundary fluctuations permit. Where the wells are very shallow, however, particles are laterally (in a diffraction setup, say) unconstrained; they tend to vacate such regions, and therefore have a low probability of being found there.

The observable consequence of the constraints imposed on the motion of particles is a microscopic effect which can be made manifest only in the observation of many similar systems. For illustration, consider an ensemble of similar particles comprising a beam passing through a slit. Let us assume that these particles are very close to equilibrium with the background, that is, that any effects due to the slit can be considered as slight perturbations on the systematic motion of the beam members.

Given this assumption, each member of the ensemble with index, n say, will with a certain probability have a given amount of kinetic energy, $E_n$, associated with each degree of freedom. Of special interest here is the direction perpendicular to both the beam and the slit, in which, by virtue of the assumed state of near equilibrium with the background, we can take the distribution, with respect to energy, of the members of the ensemble to be given in the usual way by the Boltzmann factor $e^{-\beta E_n}$, where $\beta$ is the reciprocal of the product of the Boltzmann constant $k$ and the temperature $T$ in kelvin. The temperature in this case is that of the electromagnetic background serving as a thermal bath for the beam particles with which it is in near equilibrium.

Now, the relative probability of finding any given particle, i.e., with energy $E_{n,j}$ or $E_{n,k}$ or ..., trapped in a particular well will be, according to elementary probability, proportional to the sum of the probabilities of finding particles with energy less than the well depth,

$$\sum_{\{j \,|\, E_{n,j} \le d\}} e^{-\beta E_{n,j}} \approx \int_0^d e^{-\beta E}\, dE = \frac{1}{\beta}\left(1 - e^{-\beta d}\right), \qquad (5)$$

where approximating the sum with an integral is tantamount to the recognition that the number of energy levels, if not a priori continuous, is large with respect to the well depth.

If now d in Eq. (5) is expressed as a function of position, we get the probability density as a function of position. For example, for a diffraction pattern from a single slit of width $a$ at distance $D$, the intensity (essentially the energy density) as a function of lateral position is $E_0 \sin^2(\theta)/\theta^2$, where $\theta = k_{\text{pilot wave}}(a/D)\,y$, and the probability of occurrence, $P(\theta(y))$, as a function of position, would be

$$P(y) \propto \left(1 - e^{-\beta E_0 \sin^2(\theta)/\theta^2}\right). \qquad (6)$$

Whenever the exponent in Eq. (6) is significantly less than one, its r.h.s. is very accurately approximated by the exponent itself; so that one obtains the standard and verified result that the probability of occurrence, $P(y) = \psi^*\psi$ in conventional QM, is proportional to the intensity of a particle's de Broglie (pilot) wave.
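The quality of this approximation is easy to probe numerically. The sketch below is illustrative: the function names and the sample values of the dimensionless product $\beta E_0$ are assumptions, not values from the paper. It compares the right-hand side of Eq. (6) with its linearization, the single-slit intensity profile.

```python
import math


def sinc_squared(theta):
    """Single-slit intensity profile sin^2(theta)/theta^2, equal to 1 at theta = 0."""
    return 1.0 if theta == 0.0 else (math.sin(theta) / theta) ** 2


def trapping_probability(theta, beta_e0):
    """Right-hand side of Eq. (6), up to normalization: 1 - exp(-beta*E0*sinc^2)."""
    return -math.expm1(-beta_e0 * sinc_squared(theta))


def linearized(theta, beta_e0):
    """The exponent itself, i.e. the pure wave intensity times beta*E0."""
    return beta_e0 * sinc_squared(theta)


def max_relative_deviation(beta_e0, thetas):
    """Largest relative difference between Eq. (6) and its linearization."""
    worst = 0.0
    for th in thetas:
        lin = linearized(th, beta_e0)
        if lin > 0.0:
            worst = max(worst, abs(trapping_probability(th, beta_e0) - lin) / lin)
    return worst


thetas = [0.1 * k for k in range(60)]
for beta_e0 in (0.01, 0.1, 2.0):
    print(beta_e0, max_relative_deviation(beta_e0, thetas))
```

For small $\beta E_0$ the relative deviation is about $\beta E_0/2$ at the central maximum and smaller elsewhere; for $\beta E_0$ of order one the two profiles differ visibly near the peak.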

3 Schrodinger Equation

A consequence of the attachment of a De Broglie pilot wave to each particle is that there exists a Fourier kernel of the following form:

$$e^{i 2 \mathbf{p}\cdot\mathbf{x}'/\hbar}, \qquad (7)$$

which can be used to decompose the density function of an ensemble of similar particles. Consider an ensemble governed by the Liouville Equation:

$$\frac{\partial \rho}{\partial t} = -\nabla\rho\cdot\frac{\mathbf{p}}{m} + (\nabla_{\mathbf{p}}\rho)\cdot\mathbf{F}. \qquad (8)$$

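Now, decompose $\rho(\mathbf{x}, \mathbf{p})$ with respect to $\mathbf{p}$ using the De Broglie-Fourier Kernel:

$$\hat\rho(\mathbf{x}, \mathbf{x}', t) = \int e^{i 2 \mathbf{p}\cdot\mathbf{x}'/\hbar}\, \rho(\mathbf{x}, \mathbf{p}, t)\, d\mathbf{p}, \qquad (9)$$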

[Figure 1 plot, titled "Neutron Diffraction": relative intensity versus lateral displacement in radians, 'theta'; legend: Particle Beam, Radiation, Chi(y)-squared (x50).]

Figure 1: A simulated single slit neutron diffraction pattern showing the closeness of the fit of Eq. (6) to the pure wave diffraction pattern. See Ref. [3] for details.

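to transform the Liouville Equation into:

$$\frac{\partial\hat\rho}{\partial t} = -\frac{\hbar}{i2m}\,\nabla'\cdot\nabla\hat\rho - \frac{i2}{\hbar}\,(\mathbf{x}'\cdot\mathbf{F})\,\hat\rho. \qquad (10)$$

To solve, separate variables using:

$$\mathbf{r} = \mathbf{x} + \mathbf{x}', \quad \mathbf{r}' = \mathbf{x} - \mathbf{x}', \qquad (11)$$

to get

$$\frac{\partial\hat\rho}{\partial t} = \frac{i\hbar}{2m}\nabla_{\mathbf{r}}^2\hat\rho - \frac{i\hbar}{2m}\nabla_{\mathbf{r}'}^2\hat\rho - \frac{i}{\hbar}\,\big((\mathbf{r} - \mathbf{r}')\cdot\mathbf{F}\big)\,\hat\rho, \qquad (12)$$

which can (sometimes) be separated by writing:

$$\hat\rho(\mathbf{r}, \mathbf{r}') = \psi^*(\mathbf{r}')\,\psi(\mathbf{r}), \qquad (13)$$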


to get Schrödinger's Equation:

$$i\hbar\frac{\partial\psi}{\partial t} = -\frac{\hbar^2}{2m}\nabla^2\psi + V\psi. \qquad (14)$$

4 Conclusions

Within this paradigm, Quantum Mechanics is incomplete as surmised by Einstein, Podolsky and Rosen [4]. It is built on the basis of the Liouville Equation while taking a particular stochastic background into account. The conceptual function of Probability in QM is just as in Statistical Mechanics. Measurement reduces ignorance; it does not precipitate "reality." Of course, measurement also disturbs the measured system, but this presents no more fundamental problems than it does in classical physics. 'Heisenberg uncertainty,' on the other hand, is seen to be caused simply by the incessant dynamical perturbation from background signals. In so far as the source of background signals cannot be isolated, this source of uncertainty is intrinsic, but not fundamentally novel. For these reasons, "duality" is superfluous. Particles have the same ontological status as in classical physics. Individual particles in a beam pass through one or the other slit in a Young double slit experiment, for example, while their De Broglie piloting waves pass through both slits. Beyond the slit, the particles are induced stochastically to track the nodes of their pilot waves so that a diffraction pattern is built up mimicking the intensity of the pilot wave.

From within this paradigm, the now infamously paradoxical situations illustrating various problems with the interpretation of QM never arise or are resolved with elementary reasoning. In particular, wave functions are not vested with an ambiguous nature.

The SED Paradigm also clarifies the appearance of interference among "probabilities." Numerous analysts from various viewpoints have discovered the fact that Probability Theory admits structure (used by QM) that goes unexploited in traditional applications. (E.g., see Gudder, Summhammer, this volume.) While each of these approaches provides deep and surprising insights, none really offers any explanation of why and how nature exploits this structure. Just as a certain second order hyperbolic partial differential equation becomes the "wave equation" as a physics statement only with the introduction of, e.g., Hooke's Law, so this extra probability structure can be made into physics only with an analogue to Hooke's Law.

SED provides that analogue for particle behavior with its model of pilot wave guidance. In this model, radiation pressure is responsible for particle guidance [3]. Radiation pressure is proportional to the square of EM fields; i.e., the intensity (in this case of the background field as modified by objects in the environment), which is not additive. Rather, the field amplitudes are additive and interference arises in the way well understood in classical EM. In other words, QM interference is a manifestation of EM interference. The relevant Hooke's Law analogue is the phenomenon of radiation pressure. For radiation, this is all intimately related, of course, to classical coherence theory as applied to "square law" photoelectron detectors, which, when properly applied, resolves many QM conundrums, including those instigated by Bell's Theorem surrounding EPR correlations.

Appendix: Bell's Theorem

The interpretation or paradigm described herein conflicts with the conclusions of Bell's "no-go" theorem, according to which a local, realistic extension of QM should conform with certain restraints that have been shown empirically to be false. To be sure, this paradigm does not deliver the hidden variables for exploitation in calculations, but it does indicate to which features in the universe they pertain, namely all other charges. The character of these hidden variables is dictated by the fact that they are distinguished only in that they pertain to particles distant from the system of particular interest; thus, internal consistency requires that they be local and realistic [8].

The basic proof

Bell's Theorem purports to establish certain limitations on coincidence probabilities of spin or polarization measurements as calculated using QM if they are to have an underlying deterministic but still local and realistic basis describable by extra, as yet 'hidden,' variables $\lambda$, distributed with a density $\rho(\lambda)$. These limitations take the form of inequalities which measurable coincidences must respect. The extraction of one of these inequalities, where the input assumptions are enumerated as Bell made them, proceeds as follows:

Bell's fundamental Ansatz consists of the following equation:

$$P(a,b) = \int d\lambda\,\rho(\lambda)\,A(a,\lambda)\,B(b,\lambda), \tag{15}$$

where, per explicit assumption: A is not a function of b; nor B of a. This he motivated on the grounds that a measurement at station A, if it respects 'locality,' can not depend on remote conditions, such as the settings of a distant measuring device, i.e., b. In addition, each, by definition, satisfies

$$|A| \le 1; \quad |B| \le 1. \tag{16}$$


Eq. (15) expresses the fact that when the hidden variables are integrated out, the usual results from QM are recovered.

The extraction proceeds by considering the difference of two such coincidence probabilities where the parameters of one measuring station differ:

$$P(a,b) - P(a,b') = \int d\lambda\,\rho(\lambda)\,[A(a,\lambda)B(b,\lambda) - A(a,\lambda)B(b',\lambda)], \tag{17}$$

to which zero in the form

$$A(a,\lambda)B(b,\lambda)A(a',\lambda)B(b',\lambda) - A(a,\lambda)B(b',\lambda)A(a',\lambda)B(b,\lambda), \tag{18}$$

is added to get:

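$$P(a,b) - P(a,b') = \int d\lambda\,\rho(\lambda)\,A(a,\lambda)B(b,\lambda)\,\bigl(1 \pm A(a',\lambda)B(b',\lambda)\bigr) - \int d\lambda\,\rho(\lambda)\,A(a,\lambda)B(b',\lambda)\,\bigl(1 \pm A(a',\lambda)B(b,\lambda)\bigr), \tag{19}$$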

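which, upon taking absolute values, Bell wrote as:

$$|P(a,b) - P(a,b')| \le \int d\lambda\,\rho(\lambda)\,\bigl(1 \pm A(a',\lambda)B(b',\lambda)\bigr) + \int d\lambda\,\rho(\lambda)\,\bigl(1 \pm A(a',\lambda)B(b,\lambda)\bigr). \tag{20}$$

Then, using Eq. (15), the "Ansatz," and the normalization $\int d\lambda\,\rho(\lambda) = 1$, one gets

$$|P(a,b) - P(a,b')| + |P(a',b') + P(a',b)| \le 2, \tag{21}$$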

a Bell inequality.9

Now if the QM result for these coincidences, namely $P(a,b) = -\cos(2\theta)$, is put in Eq. (21), it will be found that for $\theta = \pi/8$, the l.h.s. of Eq. (21) becomes $2\sqrt{2}$. Experiments verify this result.10 Why the discrepancy? According to Bell: it must have been induced by demanding "locality," as all else he took to be harmless.
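As a quick numerical check of the figure just quoted, the following sketch evaluates the left-hand side of Eq. (21) with $P(a,b) = -\cos(2(a-b))$. The particular analyzer settings (neighbouring directions offset by $\pi/8$) and the function names are illustrative assumptions, not taken from the text.

```python
import math


def qm_correlation(a: float, b: float) -> float:
    """QM coincidence correlation for analyzer settings a and b (radians)."""
    return -math.cos(2.0 * (a - b))


def bell_lhs(a: float, a_prime: float, b: float, b_prime: float) -> float:
    """Left-hand side of Eq. (21): |P(a,b) - P(a,b')| + |P(a',b') + P(a',b)|."""
    return (abs(qm_correlation(a, b) - qm_correlation(a, b_prime))
            + abs(qm_correlation(a_prime, b_prime) + qm_correlation(a_prime, b)))


# Assumed settings: neighbouring analyzer directions differ by pi/8.
a, b, a_prime, b_prime = 0.0, math.pi / 8, math.pi / 4, 3 * math.pi / 8
chsh_value = bell_lhs(a, a_prime, b, b_prime)
print(chsh_value, 2 * math.sqrt(2))
```

With these settings the two printed numbers agree, so the QM expression exceeds the bound of 2 in Eq. (21).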


Critiques

Although Bell's analysis is denoted a 'theorem,' in fact there can be no such thing in Physics; the axiomatic base on which to base a theorem consists of those fundamental theories which the whole enterprise is endeavoring to reveal. Moreover, buried in all mathematics pertaining to the physical world are numerous unarticulated assumptions, some of which are exposed below.

The analytical character of dichotomic functions

In motivating his discussion of the extraction of inequalities, Bell considered the measurement of spin using Stern-Gerlach magnets or polarization measurements of 'photons.' In both cases, single measurements can be seen as individual terms in a symmetric dichotomic series; i.e., having the values ±1. It is therefore natural to ask if the correlation computed using QM, $P(a,b) = -\cos(2\theta)$, and verified empirically, can be the correlation of dichotomic functions. It is easy to show that it can not be; consider:

$$-\cos(2\theta) = k \int P(x - \theta)\,P(x)\,dx, \tag{22}$$

where $\rho(\lambda)$ is $k/2\pi$ and where the P's are dichotomic functions. Now, take the derivative w.r.t. $\theta$, to get:

$$2\sin(2\theta) = \int \delta(x - \theta_j)\,P(x)\,dx = \sum_j P(\theta_j) = k, \tag{23}$$

and again

$$4\cos(2\theta) = 0, \tag{24}$$

which is false. QED

Some authors (see, e.g., Aerts, this volume) employ a parameterized dichotomic function to represent measurements. Such a function can be dichotomic in the argument but continuous in the parameter, e.g., of the form $P(\sin(t) - x)$, for which then the correlation is taken to be of the form

$$\mathrm{Corr}(t) = \int_{-\pi}^{\pi} D(x - \sin(2t))\,D(x)\,dx. \tag{25}$$

However, this approach seems misguided. First, it assumes that the argument of Corr, t, can be identical to the parameter of the dichotomic function $P_t(x)$ rather than the 'off-set' in the argument, here x, as befitting a correlation. Moreover, the same sort of consistency test applied above also results in contradictions; therefore, such parameterized functions do not constitute counterexamples invalidating the claim that discontinuous functions can not have an harmonic correlation. At best, this tactic implicitly results in the correlation of the measurement functions w.r.t. the continuous parameter, t, which is interpreted as the "weight" or frequency of the dichotomic value. This tactic, however, does not conform with Bell's analysis, in which the dichotomic values are to be correlated; rather, it corresponds with the type of model proposed below, without, however, recognizing Malus' Law as the source of the 'weights.'

Conclusion: There is a fundamental error in Bell's analysis; the QM result is at irreconcilable odds with the conventional understanding of his arguments.11

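This can be revealed alternately, following Sica, by considering four dichotomic sequences (with values ±1 and length N) a, a', b and b' and the following two quantities: $a_i b_i + a_i b'_i = a_i(b_i + b'_i)$ and $a'_i b_i - a'_i b'_i = a'_i(b_i - b'_i)$. Sum these expressions over i, divide by N, and take absolute values before adding together to get

$$\left|\frac{1}{N}\sum_i^N a_i b_i + \frac{1}{N}\sum_i^N a_i b'_i\right| + \left|\frac{1}{N}\sum_i^N a'_i b_i - \frac{1}{N}\sum_i^N a'_i b'_i\right| \le \frac{1}{N}\sum_i^N |a_i|\,|b_i + b'_i| + \frac{1}{N}\sum_i^N |a'_i|\,|b_i - b'_i|. \tag{26}$$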

The r.h.s. equals 2; so this is a Bell Inequality. Conclusion: this Bell Inequality is an arithmetic identity for dichotomic sequences; there is no need to postulate "locality" in order to extract it.12
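The arithmetic character of Eq. (26) can be illustrated numerically. The sketch below draws random ±1 sequences and evaluates both sides; the sequence length, the number of trials, the seed and the function name are arbitrary choices made here for illustration.

```python
import random


def bell_sums(a, a_prime, b, b_prime):
    """Return (lhs, rhs) of Eq. (26) for four equal-length sequences of +/-1."""
    n = len(a)
    lhs = (abs(sum(x * y for x, y in zip(a, b)) / n
               + sum(x * y for x, y in zip(a, b_prime)) / n)
           + abs(sum(x * y for x, y in zip(a_prime, b)) / n
                 - sum(x * y for x, y in zip(a_prime, b_prime)) / n))
    rhs = (sum(abs(x) * abs(y + z) for x, y, z in zip(a, b, b_prime)) / n
           + sum(abs(x) * abs(y - z) for x, y, z in zip(a_prime, b, b_prime)) / n)
    return lhs, rhs


rng = random.Random(0)  # fixed seed, chosen arbitrarily for reproducibility
results = []
for _trial in range(200):
    sequences = [[rng.choice((-1, 1)) for _ in range(50)] for _ in range(4)]
    results.append(bell_sums(*sequences))

print(max(lhs for lhs, _ in results))  # never exceeds 2
```

No assumption about how the sequences were generated enters the check: for every index one of $|b_i + b'_i|$ and $|b_i - b'_i|$ is 2 and the other is 0.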

Discrete vice continuous variables

By implication Bell considered discrete variables for which the correlation would be

$$\mathrm{Cor}(a,b) := \frac{1}{N}\sum_i^N A_i(a)\,B_i(b). \tag{27}$$

But experiments measure the number of hits per unit time given a, b, and then compute the correlation; each event is a density, not a single pair. The data taken in experiments corresponds to the read-out for Malus' Law, not the generation of dichotomic sequences for which each term represents an event consisting of a pair of photons with anticorrelated polarization or a particle pair with anticorrelated spins. This discrepancy is ignored in the standard renditions of Bell's analysis. It is, however, serious and suggests a different tack.

Consider, following Barut, a model for which the spin axes of pairs of particles have random, but totally anticorrelated, instantaneous orientations: $S_1 = -S_2$.13 Each particle then is directed through a Stern-Gerlach magnetic field with orientation a and b. The observable in each case then would be $A := S_1 \cdot a$ and $B := S_2 \cdot b$. Now by standard theory,

$$\mathrm{Cor}(A,B) = \frac{\langle |AB| \rangle - \langle A \rangle \langle B \rangle}{\sqrt{\langle A^2 \rangle \langle B^2 \rangle}}, \tag{28}$$

where the angle brackets indicate averages over the range of the variables. This becomes

$$\mathrm{Cor}(A,B) = -\frac{\int d\gamma\,\sin(\gamma)\cos(\gamma - \theta)\cos(\gamma)}{\sqrt{\left(\int d\gamma\,\sin(\gamma)\cos^2(\gamma)\right)^2}}, \tag{29}$$

which evaluates to $-\cos(\theta)$; i.e., the QM result for spin state correlation. Conclusion: this model, essentially a counter example to Bell's analysis, shows that continuous functions (vice dichotomic) work. It is more than just natural to ask where the 'gremlins' reside in Bell's analysis. There are at least two.
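The evaluation of the integral ratio can be checked numerically. This is a minimal sketch under stated assumptions: the integration range $\gamma \in [0, \pi]$ (a polar angle weighted by $\sin\gamma$) and the overall minus sign from $S_1 = -S_2$ are assumed here, and the function names are illustrative.

```python
import math


def midpoint_integral(f, lo, hi, steps=2000):
    """Composite midpoint rule for the integral of f over [lo, hi]."""
    h = (hi - lo) / steps
    return h * sum(f(lo + (k + 0.5) * h) for k in range(steps))


def barut_correlation(theta):
    """Ratio of the two integrals, with gamma assumed to run over [0, pi]."""
    numerator = midpoint_integral(
        lambda g: math.sin(g) * math.cos(g - theta) * math.cos(g), 0.0, math.pi)
    denominator = midpoint_integral(
        lambda g: math.sin(g) * math.cos(g) ** 2, 0.0, math.pi)
    # The minus sign reflects the anticorrelation S1 = -S2.
    return -numerator / denominator


for theta in (0.0, math.pi / 6, math.pi / 3, math.pi / 2):
    print(round(barut_correlation(theta), 6), round(-math.cos(theta), 6))
```

Each printed pair should agree to the displayed precision, since the $\sin\theta$ part of $\cos(\gamma - \theta)$ integrates to zero over the assumed range.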

One has to do with the following covert hypothesis: Bell's 'proof' seems to pertain to continuous variables in that the demand is only that $|A|$ ($|B|$) $\le 1$. This argument, however, silently also assumes that the averages $\langle A \rangle = \langle B \rangle = 0$. It enters in the derivation of a Bell inequality where the second term above is ignored as if it is always zero. When it is not zero, Bell inequalities become, e.g.,

$$|P(a,b) - P(a,b')| + |P(a',b') + P(a',b)| \le 2 + \frac{2\langle A \rangle \langle B \rangle}{\sqrt{\langle A^2 \rangle \langle B^2 \rangle}}, \tag{30}$$

which opens up a broader category of non quantum models. A second covert gremlin having broader significance is discussed below.

Are 'nonlocal' correlations essential?

The demand that, in spite of the introduction of hidden variables, $\lambda$, a probability, P(a, b), averaged over these extra variables reduce to currently used QM expressions, implies that:

$$P(a,b) = \int P(a,b,\lambda)\,d\lambda. \tag{31}$$

By basic probability theory, the integrand in this equation is to be decomposed in terms of individual detections in each arm according to Bayes' formula

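$$P(a,b,\lambda) = P(\lambda)\,P(a \mid \lambda)\,P(b \mid a,\lambda), \tag{32}$$

where $P(a \mid \lambda)$ is a conditional probability. In turn, the integrand above can be converted to the integrand of Bell's Ansatz:

$$P(a,b) = \int A(a,\lambda)\,B(b,\lambda)\,\rho(\lambda)\,d\lambda, \quad \text{iff} \quad P(b \mid a,\lambda) = P(b \mid \lambda), \ \forall a. \tag{33}$$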
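The step from the decomposition to Bell's Ansatz can be written out explicitly. The identifications $\rho(\lambda) = P(\lambda)$, $A(a,\lambda) = P(a \mid \lambda)$ and $B(b,\lambda) = P(b \mid \lambda)$ used in the last line are assumptions made here to match Bell's notation; the text does not spell them out at this point.

```latex
\begin{align*}
P(a,b) &= \int P(\lambda)\,P(a\mid\lambda)\,P(b\mid a,\lambda)\,d\lambda
        && \text{(Eqs. (31) and (32))} \\
       &= \int P(\lambda)\,P(a\mid\lambda)\,P(b\mid\lambda)\,d\lambda
        && \text{(if } P(b\mid a,\lambda) = P(b\mid\lambda) \text{ for all } a\text{)} \\
       &= \int \rho(\lambda)\,A(a,\lambda)\,B(b,\lambda)\,d\lambda .
\end{align*}
```

Only the second line uses the condition in Eq. (33); without it the factor $P(b \mid a,\lambda)$ retains its dependence on a and the product form is lost.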

This equation admits, it seems, two interpretations:

(i) When this equation is true, the ratio of occurrence of outcomes at station B must be statistically independent of the outcomes at A. Therefore, as the hidden variables $\lambda$ are 'extra' and do not duplicate a and b, even if the correlation is considered to be encoded by a $\lambda$, it will not be available to an observer. But the correlation by hypothesis does exist and is to be detectable via the a's and b's; therefore, this equation can not hold. Thus, within this interpretation, Bell's Ansatz is not internally consistent.

(ii) Alternately, if the a on the l.h.s. is superfluous, so is b; so that $P = P(\lambda) = 0$ except at one value of $\lambda$, where it equals 1, or is a Dirac delta function. That is, the correlation is totally encoded by the hidden variables, as follows if a sufficient number of new variables are introduced to render everything deterministic, as often assumed. Consequently, individual products of probabilities at the separate stations, i.e., AB's, in Bell's notation, become Dirac delta functions of the $\lambda$. If everything is deterministic, then there can be no overlap of the non-zero values of pairs of probabilities for a given value of $\lambda$, and therefore, in the extraction of a Bell inequality, all quadruple products of P's with pair-wise different values of $\lambda$ in Eq. (19) are identically zero so that the final form of a Bell inequality is the trivial identity:

$$|P(a,b) - P(a,b')| \le 2. \tag{34}$$


In either case, "locality" is not to be so employed as to exclude correlations generated at the conception of the spin-particles or photon pairs, i.e., "common causes." The non-existence of instantaneous communication can not impose a restraint here; it must bear no relationship to the validity of Eq. (33).

In addition, Eq. (34) reconciles Barut's continuous variable model with Bell's analysis.

Bell-Kochen-Specker 'Theorem'

Besides Bell's original theorem there is another set of no-go theorems ostensibly prohibiting a local realistic extension for QM. In contrast to the theorem analyzed above, they do not make explicit use of 'locality,' rather they use certain properties (falsely, it turns out) of angular momentum (spin). In general, the 'proof' of these theorems proceeds as follows: The system of interest is described as being in a 'state' $|\psi\rangle$ specified by observables A, B, C, ... A hidden variable theory is then taken to be a mapping v of observables to numerical values: v(A), v(B), v(C), ... Use is then made of the fact that if a set of operators all commute, then any function of these operators f(A, B, C, ...) = 0 will also be satisfied by their eigenvalues: f(v(A), v(B), v(C), ...) = 0.

The proof of a Kochen-Specker Theorem proceeds by displaying a contradiction; consider, e.g., two 'spin-1/2' particles for which the nine separate mutually commuting operators can be arranged in the following 3 by 3 matrix:

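$$\begin{pmatrix} \sigma_x^1 & \sigma_x^2 & \sigma_x^1\sigma_x^2 \\ \sigma_y^2 & \sigma_y^1 & \sigma_y^1\sigma_y^2 \\ \sigma_x^1\sigma_y^2 & \sigma_x^2\sigma_y^1 & \sigma_z^1\sigma_z^2 \end{pmatrix} \tag{35}$$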

It is then a little exercise in bookkeeping to verify that any assignment of plus and minus ones for each of the factors in each element of this matrix results in a contradiction, namely, the product of all these operators formed row-wise is plus one and the same product formed column-wise is minus one.14
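The bookkeeping can also be done at the operator level. The sketch below assumes the standard arrangement of the nine two-particle operators given in Ref. 14 (rows: $\sigma_x^1, \sigma_x^2, \sigma_x^1\sigma_x^2$; $\sigma_y^2, \sigma_y^1, \sigma_y^1\sigma_y^2$; $\sigma_x^1\sigma_y^2, \sigma_x^2\sigma_y^1, \sigma_z^1\sigma_z^2$) and uses plain nested lists for the 4×4 matrices; all helper names are illustrative.

```python
from functools import reduce

I2 = [[1, 0], [0, 1]]
SX = [[0, 1], [1, 0]]
SY = [[0, -1j], [1j, 0]]
SZ = [[1, 0], [0, -1]]


def kron(p, q):
    """Kronecker product of two 2x2 matrices (particle 1 factor first)."""
    return [[p[i // 2][j // 2] * q[i % 2][j % 2] for j in range(4)] for i in range(4)]


def matmul(p, q):
    """Product of two 4x4 matrices."""
    return [[sum(p[i][k] * q[k][j] for k in range(4)) for j in range(4)] for i in range(4)]


def identity_sign(m):
    """Return +1 if m equals +I, -1 if m equals -I; raise otherwise."""
    for sign in (1, -1):
        if all(abs(m[i][j] - (sign if i == j else 0)) < 1e-12
               for i in range(4) for j in range(4)):
            return sign
    raise ValueError("product is not plus or minus the identity")


# Assumed arrangement of the nine operators (see lead-in).
square = [
    [kron(SX, I2), kron(I2, SX), kron(SX, SX)],
    [kron(I2, SY), kron(SY, I2), kron(SY, SY)],
    [kron(SX, SY), kron(SY, SX), kron(SZ, SZ)],
]

row_signs = [identity_sign(reduce(matmul, row)) for row in square]
col_signs = [identity_sign(reduce(matmul, col)) for col in zip(*square)]
print(row_signs, col_signs)
```

Every row multiplies to $+I$, while the columns multiply to $+I$, $+I$ and $-I$, so the overall row-wise and column-wise products differ in sign, which no assignment of fixed ±1 values can reproduce.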

Now, recall that given a uniform static magnetic field B in the z-direction, the Hamiltonian is $H \propto B\sigma_z$, for which the time-dependent solution of the Schrödinger equation is

$$\psi(t) = \frac{1}{\sqrt{2}} \begin{pmatrix} e^{-i\omega t} \\ e^{+i\omega t} \end{pmatrix},$$

and this in turn gives time-dependent expectation values for spin values in the x, y directions:15

$$\langle \sigma_x \rangle \propto \cos(\omega t); \quad \langle \sigma_y \rangle \propto \sin(\omega t), \tag{36}$$

where $\omega = eB/mc$.


Proof of a Bell-Kochen-Specker theorem depends on simultaneously assigning the eigenvalues ±1 to $\sigma_x$, $\sigma_y$ and $\sigma_z$ as measurables for each particle. (With some effort, for all other proofs of this theorem one can find an equivalent assumption.) However, as Barut13 observed and as can be seen in Eq. (36), if the eigenvalues ±1 are realizable measurement results in the "B-field" direction, then in the other two directions the expectation values oscillate out of phase and, therefore, can not be simultaneously equal to ±1. Thus, this variation of a Bell theorem also is defective physics.

A local model for EPR (polarization) Correlations

The following model incorporates the features of polarization correlations without preternatural aspects or the concept of 'photon.' The basic assumption is that the source emits oppositely directed, anticorrelated classical electromagnetic signals:

$$E_A = \hat{x}\cos(\nu) + \hat{y}\sin(\nu); \quad E_B = -\hat{x}\sin(\nu + \theta) + \hat{y}\cos(\nu + \theta), \tag{37}$$

where factors of the form $\exp(i(\omega t + k \cdot x + \xi(t)))$, where $\xi(t)$ is a random variable, are dropped, as they are suppressed by averaging.16 Now, the random variables with physical significance, emerging in the detectors per Malus' Law, are $E_{A,B}$. It is the detectors that digitize the data and create the illusion of 'photons.' But, because Maxwell's Equations are not linear in intensities, rather in the fields, a fourth order field correlation is required to calculate the cross correlation of the intensity:

$$P(a,b) = \kappa \langle (A \cdot B)(B \cdot A) \rangle, \tag{38}$$

where brackets indicate averages over space-time. (This appears to be the source of "entanglement" in QM, which is seen to have no basis beyond that found in classical physics.) Here, Eq. (38) turns out to be:

$$P(+,+) \propto \kappa \int_0^{2\pi} \bigl(\cos(\nu)\sin(\nu + \theta) - \sin(\nu)\cos(\nu + \theta)\bigr)^2 d\nu, \tag{39}$$

which gives $P(+,+) = P(-,-) \propto \kappa\sin^2(\theta)$ and $P(-,+) = P(+,-) \propto \kappa\cos^2(\theta)$. The constant, $\kappa$, can be eliminated by computing the ratio of particular events to the total sample space, which here includes coincident detections in all four combinations of detectors averaged over all possible displacement angles $\theta$; thus, the denominator is:

$$\frac{\kappa}{\pi} \int_0^{2\pi} \bigl(\sin^2(\theta) + \cos^2(\theta)\bigr)\,d\theta = 2\kappa, \tag{40}$$

so that the ratio becomes:

$$P(+,+) = \tfrac{1}{2}\sin^2(\theta), \tag{41}$$

the QM result. This in turn yields the correlation

$$\mathrm{Cor}(a,b) := \frac{P(+,+) + P(-,-) - P(+,-) - P(-,+)}{P(+,+) + P(-,-) + P(+,-) + P(-,+)},$$

$$\mathrm{Cor}(a,b) = -\cos(2\theta). \tag{42}$$
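The chain from the field expressions to the correlation can be traced numerically. In this sketch the $(+,-)$ integrand is an assumption made here (the text only states its result, $\cos^2\theta$), and the normalization sums the four detector combinations at fixed $\theta$, which is a simplification of the averaging described in the text; all names are illustrative.

```python
import math


def coincidence_probabilities(theta, samples=3600):
    """Average the squared field terms over the source angle nu in [0, 2*pi)."""
    same = 0.0      # unnormalised P(+,+), taken equal to P(-,-)
    opposite = 0.0  # unnormalised P(+,-), taken equal to P(-,+) (assumed integrand)
    for k in range(samples):
        nu = 2.0 * math.pi * k / samples
        same += (math.cos(nu) * math.sin(nu + theta)
                 - math.sin(nu) * math.cos(nu + theta)) ** 2
        opposite += (math.cos(nu) * math.cos(nu + theta)
                     + math.sin(nu) * math.sin(nu + theta)) ** 2
    total = 2.0 * (same + opposite)  # all four detector combinations
    return {"++": same / total, "--": same / total,
            "+-": opposite / total, "-+": opposite / total}


def correlation(theta):
    """Correlation built from the four coincidence probabilities."""
    p = coincidence_probabilities(theta)
    return (p["++"] + p["--"] - p["+-"] - p["-+"]) / sum(p.values())


theta = math.pi / 5  # arbitrary test angle
print(coincidence_probabilities(theta)["++"], 0.5 * math.sin(theta) ** 2)
print(correlation(theta), -math.cos(2 * theta))
```

Because the integrands reduce to $\sin^2\theta$ and $\cos^2\theta$ for every $\nu$, the average over the source angle only fixes the normalization.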

If the fundamental assumptions involved in this local, realistic model are valid, then there would be observable consequences. For example, if radiation on the "other side" of a photodetector is continuous and not comprised of "photons," then photoelectrons are evoked independently in each detector by continuous but (anti)correlated radiation. Thus, the density of photoelectron pairs should be linearly proportional (barring effects caused by limited coherence) to the coincidence window width. On the other hand, if photons are in fact generated in matched pairs at the source, then at very low intensities, the detection rate should be relatively insensitive to the coincidence window width once it is wide enough to capture both electrons.

1. L. de la Peña and A. M. Cetto, The Quantum Dice (Kluwer, Dordrecht, 1996).
2. A. F. Kracklauer, An Intuitive Paradigm for Quantum Mechanics, Physics Essays 5 (2) 226 (1992).
3. A. F. Kracklauer, Found. Phys. Lett. 12 (5) 441 (1999).
4. G. Hermann, Die Naturphilosophischen Grundlagen der Quantenmechanik, Abhandlungen der Fries'schen Schule 6, 75-152 (1935).
5. D. Bohm, Causality and Chance in Modern Physics (Routledge & Kegan Paul Ltd., London, 1957).
6. H. Puthoff, Phys. Rev. A 40, 4857 (1989); 44, 3385 (1991).
7. A. Einstein, B. Podolsky and N. Rosen, Phys. Rev. 47, 777 (1935).
8. J. S. Bell, Speakable and unspeakable in quantum mechanics (Cambridge University Press, Cambridge, 1987).
9. J. S. Bell in Foundations of Quantum Mechanics, Proceedings of the International School of Physics 'Enrico Fermi,' course IL (Academic, New York, 1971), p. 171-181; reprinted in Ref. [8].
10. A. Afriat and F. Selleri, The Einstein, Podolsky and Rosen Paradox (Plenum, New York, 1999) review theory and experiments from a current perspective.
11. A. F. Kracklauer in New Developments on Fundamental Problems in Quantum Mechanics, M. Ferrero and A. van der Merwe (eds.) (Kluwer, Dordrecht, 1997), p. 185.
12. L. Sica, Opt. Commun. 170, 55-60 & 61-66 (1999).
13. A. O. Barut, Found. Phys. 22 (1) 137 (1992).
14. N. D. Mermin, Rev. Mod. Phys. 65 (3) 803 (1993).
15. R. H. Dicke and J. P. Wittke, Introduction to Quantum Mechanics (Addison-Wesley, Reading, 1960) p. 195.
16. A. F. Kracklauer, in Instantaneous Action-at-a-Distance in Modern Physics, A. E. Chubykalo, V. Pope and R. Smirnov-Rueda (eds.) (Nova Science, Commack NY, 1999) p. 379; http://arXiv:quant-ph/0007101; Ann. Fond. L. deBroglie 20 (2) 193 (2000).


A PROBABILISTIC INEQUALITY FOR THE KOCHEN-SPECKER PARADOX

JAN-ÅKE LARSSON
Matematiska Institutionen, Linköpings Universitet
SE-581 83 Linköping, Sweden
E-mail: [email protected]

A probabilistic version of the Kochen-Specker paradox is presented. The paradox is restated in the form of an inequality relating probabilities from a non-contextual hidden-variable model, by formulating the concept of "probabilistic contextuality." This enables an experimental test for contextuality at low experimental error rates. Using the assumption of independent errors, an explicit error bound of 0.71% is derived, below which a Kochen-Specker contradiction occurs.

1 Introduction

The description of quantum-mechanical (QM) processes by hidden variables is a subject being actively researched at present. The interest can be traced to topics where recent improvements in technology have made testing and using QM processes possible. Research in this field is usually intended to provide insight into whether, how, and why QM processes are different from classical processes. Here, the presentation will be restricted to the question whether there is a possibility of describing a certain QM system using a non-contextual hidden-variable model or not. A non-contextual hidden-variable model would be a model where the result of a specific measurement does not depend on the context, i.e., on which other measurements are simultaneously performed on the system. It is already known that for perfect measurements (perfect alignment, no measurement errors), no non-contextual model exists. These results originate in the work of Gleason,1 but a conceptually simpler proof was given by Kochen and Specker2 (KS).

The KS theorem concerns measurements on a QM system consisting of a spin-1 particle. In the QM description of this system, the operators associated with measurement of the spin components along orthogonal directions do not commute, i.e.,

$\hat{s}_x$, $\hat{s}_y$, and $\hat{s}_z$ do not commute. (1)

However, the operators that are associated with measurement of the square of the spin components do commute, i.e.,

$\hat{s}_x^2$, $\hat{s}_y^2$, and $\hat{s}_z^2$ commute. (2)


The latter operators (the squared ones) have the eigenvalues 0 and 1, and

$$\hat{s}_x^2 + \hat{s}_y^2 + \hat{s}_z^2 = 2 \cdot \mathbf{1}. \tag{3}$$

Thus, it is possible to simultaneously measure the square of the spin components along three orthogonal vectors, and two of the results will be 1 while the third will be 0. Only this QM property of the system will be used in what follows.
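These operator properties can be verified directly with the standard spin-1 matrices (in units where $\hbar = 1$). The matrix representation and helper names below are assumptions of this sketch; the text itself does not fix a representation.

```python
import math

H = 1.0 / math.sqrt(2.0)
SX = [[0, H, 0], [H, 0, H], [0, H, 0]]
SY = [[0, -1j * H, 0], [1j * H, 0, -1j * H], [0, 1j * H, 0]]
SZ = [[1, 0, 0], [0, 0, 0], [0, 0, -1]]


def matmul(p, q):
    """Product of two 3x3 matrices."""
    return [[sum(p[i][k] * q[k][j] for k in range(3)) for j in range(3)] for i in range(3)]


def add(p, q):
    """Sum of two 3x3 matrices."""
    return [[p[i][j] + q[i][j] for j in range(3)] for i in range(3)]


def close(p, q, tol=1e-12):
    """Entry-wise comparison of two 3x3 matrices."""
    return all(abs(p[i][j] - q[i][j]) < tol for i in range(3) for j in range(3))


SX2, SY2, SZ2 = (matmul(m, m) for m in (SX, SY, SZ))
TWO_IDENTITY = [[2 if i == j else 0 for j in range(3)] for i in range(3)]

components_commute = close(matmul(SX, SY), matmul(SY, SX))
squares_commute = all(close(matmul(p, q), matmul(q, p))
                      for p, q in ((SX2, SY2), (SY2, SZ2), (SZ2, SX2)))
sum_is_twice_identity = close(add(add(SX2, SY2), SZ2), TWO_IDENTITY)
# A Hermitian matrix with P*P == P is a projector, so its eigenvalues are 0 or 1.
squares_are_projectors = all(close(matmul(p, p), p) for p in (SX2, SY2, SZ2))

print(components_commute, squares_commute, sum_is_twice_identity, squares_are_projectors)
```

The script prints `False True True True`: the components do not commute, the squares do, they sum to twice the identity, and each square has only the eigenvalues 0 and 1.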

The notation used from now on is intended to avoid confusion with QM notation, since the notions used will be those of (Kolmogorovian) probability theory, not QM. A hidden-variable model will be taken to be a probabilistic model, i.e., the hidden variable $\lambda$ is represented as a point in a probabilistic space $\Lambda$, and sets in this space ("events") have a probability given by the probability measure P. The measurement results are described by random variables (RVs) $X_i(\lambda)$, which take their values in the value space {0, 1}.

These mappings will depend not only on the hidden variable $\lambda$, but also on the specific directions in which we choose to measure the squared spin components, so that we would have

$$\begin{aligned} X_1(x,y,z,\lambda) &: \Lambda \to \{0,1\} \\ X_2(x,y,z,\lambda) &: \Lambda \to \{0,1\} \\ X_3(x,y,z,\lambda) &: \Lambda \to \{0,1\}. \end{aligned} \tag{4}$$

Here, $X_1$ is the result of the measurement along the first direction (x), $X_2$ along the second (y), and $X_3$ along the third (z). To be able to model the spin-1 system described above, these RVs would need to sum to two, i.e.,

$$\sum_{i=1}^{3} X_i(x,y,z,\lambda) = 2. \tag{5}$$

This is in itself no guarantee that the model will be accurate, but it is the least one would expect from a hidden-variable model yielding the QM behaviour.

In simple experimental setups, there is usually only one direction specified (the direction along which the spin component squared is measured). Thus, we would expect that $X_1$ only depends on x (and $\lambda$). This is referred to as non-contextuality, and more formally this can be written as

$$\begin{aligned} X_1(x,y,z,\lambda) &= X_1(x,y',z',\lambda) \\ X_2(x,y,z,\lambda) &= X_2(x',y,z',\lambda) \\ X_3(x,y,z,\lambda) &= X_3(x',y',z,\lambda). \end{aligned} \tag{6}$$

These two prerequisites are all that is needed to arrive at the Kochen-Specker paradox.


2 The Kochen-Specker theorem

A more appropriate name for this section is perhaps "A Kochen-Specker theorem," since there are several variants; the example presented here is from Peres (1993).3 All variants aim for the same thing: to show a contradiction by assigning values to measurement results coming from a non-contextual hidden-variable model. In this particular one,3 a set of 33 three-dimensional vectors are used, depicted in Fig. 1.

Figure 1: The 33 vectors used in the Kochen-Specker theorem. The vectors are from the center of the cube onto one of the spots on the cube's surface (normalized, if desired).

The proof is as follows; assume that we have a non-contextual hidden-variable model. Then, for any $\lambda$ (except perhaps for a null set), this model satisfies equations (5) and (6), in particular for the directions in Fig. 1. Now, look at Fig. 2(a). The measurement result along one of the coordinate axes must be 0, and along the other axes it must be 1. Let us assume that the 0 is obtained from the measurement along the z axis (the white spot on the cube) and the other two measurements yield 1 (black spots; these were green and red in Peres3). Measurements along other directions in the xy-plane must also yield 1, as indicated in Fig. 2(a). In Fig. 2(b-d), three more similar choices are made, and having made these assignments, a white spot must be added at the position indicated in Fig. 2(e), because of the two black spots at orthogonal positions, and by this another black spot must be added, being orthogonal to the white one. This procedure continues in Fig. 2(f-j) until all the spots are painted either white or black as necessitated by the previously painted spots. Finally, in Fig. 2(k), we have three black orthogonal spots, violating equation (5), the condition of QM results. A similar contradiction will occur whatever choices we make in our assignments in Fig. 2(a-d), and we have a proof of the KS theorem.

Figure 2: A proof of the Kochen-Specker paradox. Panels: (a)-(d) arbitrary choice; (e)-(j) orthogonality; (k) contradiction.

We have

Theorem 1: (Kochen-Specker) The following three prerequisites cannot hold simultaneously for any $\lambda$

(i) Realism. Measurement results can be described by probability theory, using three (families of) RV's

$$X_i(x,y,z) : \Lambda \to \{0,1\}, \quad i = 1,2,3.$$

(ii) Non-contextuality. The result along a vector is not changed by rotation around that vector. For example,

$$X_1(x,y,z,\lambda) = X_1(x,y',z',\lambda).$$

(iii) Quantum-mechanical results. For any triad, the sum of the results is two, i.e.,

$$\sum_i X_i(x,y,z,\lambda) = 2.$$

Note that there is a certain structure to the proof: assignment of measurement results on a finite number of orthogonal triads according to the QM rule, and rotations connecting the measurement results on different triads by non-contextuality. This structure can be made explicit in the statement of the theorem, by introducing the set $E_{KS}$ (a "KS set of triads"):

$$E_{KS} = \left\{ \begin{pmatrix} x_1 \\ y_1 \\ z_1 \end{pmatrix}, \begin{pmatrix} x_2 \\ y_2 \\ z_2 \end{pmatrix}, \begin{pmatrix} x_3 \\ y_3 \\ z_3 \end{pmatrix}, \ldots, \begin{pmatrix} x_N \\ y_N \\ z_N \end{pmatrix} \right\} \tag{7}$$

In this set there are n vectors forming N distinct orthogonal triads where some vectors are present in more than one triad, establishing in total M connections by rotation around a vector. Using this notation, (a restricted version of) the KS theorem is

Theorem 1': (Kochen-Specker) Given a KS set of vector triads $E_{KS}$, the following three prerequisites cannot hold simultaneously for any $\lambda$

(i) Realism. For any triad in $E_{KS}$, the measurement results can be described by probability theory, using three (families of) RV's

$$X_i(x,y,z) : \Lambda \to \{0,1\}, \quad i = 1,2,3.$$

(ii) Non-contextuality. For any pair of triads in $E_{KS}$ related by a rotation around a vector, the result along that vector is not changed by the rotation. For example,

$$X_1(x,y,z,\lambda) = X_1(x,y',z',\lambda).$$

(iii) Quantum-mechanical results. For any triad in $E_{KS}$, the sum of the results is two, i.e.,

$$\sum_i X_i(x,y,z,\lambda) = 2.$$

This version of the KS theorem will be useful when formulating a probabilistic version of the theorem.
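The contradiction of Section 2 can also be confirmed by exhaustive search rather than by following Fig. 2. The sketch below assumes that the 33 directions are the permutations and sign changes of (0,0,1), (0,1,1), (0,1,√2) and (1,1,√2), one vector per ray, and it applies the two rules used in the proof: every orthogonal triad in the set carries exactly one 0, and two orthogonal directions never both carry 0. Function names are illustrative.

```python
import itertools
import math


def parallel(u, v):
    """True if u and v describe the same ray (equal or opposite vectors)."""
    same = all(abs(a - b) < 1e-9 for a, b in zip(u, v))
    opposite = all(abs(a + b) < 1e-9 for a, b in zip(u, v))
    return same or opposite


def orthogonal(u, v):
    return abs(sum(a * b for a, b in zip(u, v))) < 1e-9


def peres_rays():
    """Assumed construction of the 33 directions (see lead-in)."""
    root2 = math.sqrt(2.0)
    rays = []
    for base in [(0, 0, 1), (0, 1, 1), (0, 1, root2), (1, 1, root2)]:
        for perm in sorted(set(itertools.permutations(base))):
            for signs in itertools.product((1, -1), repeat=3):
                v = tuple(c * s for c, s in zip(perm, signs))
                if not any(parallel(v, u) for u in rays):
                    rays.append(v)
    return rays


def constraint_tables(rays):
    """For each ray, list earlier orthogonal rays and the triads it completes."""
    n = len(rays)
    earlier = [[j for j in range(i) if orthogonal(rays[i], rays[j])] for i in range(n)]
    triads = [[(j, k) for j, k in itertools.combinations(earlier[i], 2)
               if orthogonal(rays[j], rays[k])] for i in range(n)]
    return earlier, triads


def has_noncontextual_assignment(rays):
    """Backtracking search for values in {0, 1} obeying both rules."""
    earlier, triads = constraint_tables(rays)
    n = len(rays)
    value = [None] * n

    def consistent(i):
        if value[i] == 0 and any(value[j] == 0 for j in earlier[i]):
            return False  # two orthogonal directions both 0
        return all(value[i] + value[j] + value[k] == 2 for j, k in triads[i])

    def assign(i):
        if i == n:
            return True
        for v in (0, 1):
            value[i] = v
            if consistent(i) and assign(i + 1):
                return True
        value[i] = None
        return False

    return assign(0)


rays = peres_rays()
earlier, triads = constraint_tables(rays)
print(len(rays), sum(map(len, earlier)), sum(map(len, triads)))
print(has_noncontextual_assignment(rays))
```

The first line of output gives the number of rays, orthogonal pairs and complete triads; the second reports whether any assignment survives both rules.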

3 The Kochen-Specker inequality

The above discussion is valid in an ideal situation where no measurement errors are present. Introducing measurement errors, these occur as (i) missing detections, (ii) changes in the results along the axis vector when rotating, or (iii) deviations from the sum 2. Since the prerequisites of Theorem 1 are no longer valid, neither is the theorem. However, using probabilistic notions the theorem can be restated as follows.

Theorem 2: (Kochen-Specker inequality) Given a KS set $E_{KS}$ of N vector triads with M interconnections by rotation, if we have

(i) Realism. For any triad in $E_{KS}$, the measurement results can be described by probability theory, using three (families of) RV's

$$X_i(x,y,z) : \Lambda_{X_i} \to \{0,1\}, \quad i = 1,2,3,$$

where $\Lambda_{X_i}$ is a (possibly proper) subset of $\Lambda$.

(ii) "Rotation" error bound. For any pair of triads in $E_{KS}$ related by a rotation around a vector, the set of $\lambda$s where the result along that vector is not changed by the rotation is probabilistically large (has probability greater than $1 - \delta$). For example,

$$P\bigl(\{\lambda : X_1(x,y,z,\lambda) = X_1(x,y',z',\lambda)\}\bigr) > 1 - \delta.$$

(iii) "Sum" error bound. For any triad in $E_{KS}$, the set of $\lambda$s where the sum of the results is two is probabilistically large (has probability greater than $1 - \varepsilon$), i.e.,

$$P\Bigl(\Bigl\{\lambda : \sum_i X_i(x,y,z,\lambda) = 2\Bigr\}\Bigr) > 1 - \varepsilon.$$

Then

$$M\delta + N\varepsilon > 1.$$

To shorten the proof, the following symmetry of the measurement results is assumed to hold (the proof goes through without the symmetry, but grows notably in size):

$$X_1(x,y,z,\lambda) = X_2(z,x,y,\lambda) = X_3(y,z,x,\lambda). \tag{8}$$

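Proof: By Theorem 1, we have

$$\Bigl(\bigcap_{M} \{\lambda : X_1(x,y,z,\lambda) = X_1(x,y',z',\lambda)\}\Bigr) \cap \Bigl(\bigcap_{N} \Bigl\{\lambda : \sum_i X_i(x,y,z,\lambda) = 2\Bigr\}\Bigr) = \emptyset.$$

Then, the complement has probability one, and

$$\begin{aligned} 1 &= P\Bigl(\Bigl(\bigcup_{M} \{\lambda : X_1(x,y,z,\lambda) = X_1(x,y',z',\lambda)\}^c\Bigr) \cup \Bigl(\bigcup_{N} \Bigl\{\lambda : \sum_i X_i(x,y,z,\lambda) = 2\Bigr\}^c\Bigr)\Bigr) \\ &\le \sum_{M} P\bigl(\{\lambda : X_1(x,y,z,\lambda) = X_1(x,y',z',\lambda)\}^c\bigr) + \sum_{N} P\Bigl(\Bigl\{\lambda : \sum_i X_i(x,y,z,\lambda) = 2\Bigr\}^c\Bigr) \\ &< M\delta + N\varepsilon. \end{aligned} \tag{9}$$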

Here, the probability in (iii) is to be read as "the probability of obtaining results for all three $X_i$ and that the sum is two." In other words, it is possible to avoid using the no-enhancement assumption in Theorem 2, but unfortunately inefficient detector devices would contribute no-detection events to both the error rates $\delta$ and $\varepsilon$, which puts a rather high demand on experimental equipment. While the no-enhancement assumption can be used in inefficient setups, this may weaken the statement (cf. a similar argument for the GHZ paradox4).

The error rate $\varepsilon$ is the probability of getting an error in the sum (both non-detections and the wrong sum are errors here), not the probability of getting an error in an individual result. This makes it easy to extract $\varepsilon$ from experimental data, but unfortunately, the errors that arise in rotation are not available in the experimental data so it is not possible to estimate the size of $\delta$ (note that it is not even meaningful to discuss $\delta$ in QM). It is possible to use $\varepsilon$ to obtain a bound for $\delta$:

Corollary 3 (Kochen-Specker inequality) Given a KS set of N vector triads $E_{KS}$ with M interconnections by rotation, if Theorem 2 (i-iii) hold, then

$$\delta > \frac{1 - N\varepsilon}{M}.$$

Obviously, a small $E_{KS}$ set (small N and M) is better, yielding a higher bound for $\delta$ for a given $\varepsilon$ (for a few different KS sets, see Refs. 2, 3, 5).

In an inexact experiment yielding a large $\varepsilon$ one expects the error rate $\delta$ to be large as well, whereas the bound in Corollary 3 will be low because of the large $\varepsilon$. A model for this inexact experiment may then be said to be "probabilistically non-contextual"; the measurement error rate is large enough to allow the changes arising in rotation to be explained as natural errors in the inexact measurement device, rather than being fundamentally contextual. For a good experiment yielding a low $\varepsilon$ one expects $\delta$ to be low, but here the bound in Corollary 3 is higher. In a hidden-variable model of this experiment, the changes arising in rotation occur at an unexpectedly high rate which cannot be explained as due to measurement errors, and a model of this type may be said to be "probabilistically contextual". Note that this "probabilistic" non-contextuality is a weaker notion than the one used in Theorem 1 (ii).

4 Independence

To enable a general statement, the proof of Theorem 2 does not make any assumptions on independence of the errors, but it is possible to give a more quantitative bound for the error rate by introducing independence (for simplicity, at 100% detector efficiency).

Corollary 4 (KS inequality for independent errors): Assuming that the errors are independent at the rate r and that Theorem 2 (i-iii) hold, then both $\delta$ and $\varepsilon$ are given by r, and

$$M(2r - 2r^2) + N(3r - 5r^2 + 3r^3) > 1.$$

Proof: In the case of independent errors at the rate r, the expressions for the probabilities in Theorem 2 (ii) and (iii) are

$$\begin{aligned} P\bigl(\{\lambda : X_1(x,y,z,\lambda) = X_1(x,y',z',\lambda)\}\bigr) &= P(\text{no errors}) + P(\text{flip on both } X_1\text{'s}) \\ &= (1-r)^2 + r^2 = 1 - (2r - 2r^2), \end{aligned} \tag{10}$$

$$\begin{aligned} P\Bigl(\Bigl\{\lambda : \sum_i X_i(x,y,z,\lambda) = 2\Bigr\}\Bigr) &= P(\text{no errors}) + P(\text{flip of the 0 and one 1}) \\ &= (1-r)^3 + 2(1-r)r^2 = 1 - (3r - 5r^2 + 3r^3). \end{aligned} \tag{11}$$

The probabilities of these sets are not independent, so from this point on we cannot use independence. The inequality above then follows easily from Theorem 2.
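The two polynomials in the proof can be checked by enumerating every pattern of independent errors. In this sketch an "error" is taken to flip a result between 0 and 1, the error-free triad is taken to read (1, 1, 0), and the function names are illustrative; these are assumptions of the example rather than statements from the text.

```python
import itertools


def pattern_probability(pattern, r):
    """Probability of a tuple of independent error flags (True = error) at rate r."""
    p = 1.0
    for flipped in pattern:
        p *= r if flipped else 1.0 - r
    return p


def prob_rotation_unchanged(r):
    """Two readings of the same X_1 agree if neither or both are flipped."""
    return sum(pattern_probability(e, r)
               for e in itertools.product((False, True), repeat=2)
               if e[0] == e[1])


def prob_sum_is_two(r):
    """Probability that a triad with error-free results (1, 1, 0) still sums to 2."""
    ideal = (1, 1, 0)
    total = 0.0
    for e in itertools.product((False, True), repeat=3):
        observed = [x ^ 1 if flipped else x for x, flipped in zip(ideal, e)]
        if sum(observed) == 2:
            total += pattern_probability(e, r)
    return total


r = 0.01  # arbitrary illustrative error rate
print(prob_rotation_unchanged(r), 1 - (2 * r - 2 * r ** 2))
print(prob_sum_is_two(r), 1 - (3 * r - 5 * r ** 2 + 3 * r ** 3))
```

The enumeration makes the counting explicit: the sum stays at two either with no errors or when the 0 and exactly one of the 1's are flipped, which gives the factor 2 in $2(1-r)r^2$.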

An expression of the form r > f(N, M) can now be derived from Corollary 4, but this complicated expression is not central to the present paper. One important observation is that again, to obtain a contradiction for high error rates (r), a small $E_{KS}$ set is needed (small N and M). Unfortunately, the error rate needs to be very low, e.g., in the $E_{KS}$ in the present example (see footnote b), only an error rate r below 0.71% yields a contradiction in Corollary 4. Please note that there is no experimental check whether the assumption of independent errors holds or not. While the errors in the sum may be possible to check, it is not possible to extract what errors are present in the rotations or check for independence of those errors (further discussion of independence is necessary but cannot be fit into this limited space).

Footnote b: The set contains 33 vectors forming 16 distinct orthonormal bases,3 but some rotations used are not between two of these 16 bases; in some cases a rotation goes from one of the 16 bases to a pair of vectors in the set (where the third needed to form a basis is not in the set), and a subsequent rotation returns us to another of the 16 bases. Thus, in the notation adopted here, a few extra vectors are needed to form $E_{KS}$, yielding n = 41, N = 24, and M = 31. Note that these additional vectors are not needed to yield the KS contradiction, but are only needed in the proof of the inequality in this paper. A more detailed analysis for the initial set of 33 vectors is possible, probably yielding a contradiction at a somewhat higher r than the one obtained from this general analysis, but this is lengthy and will not be done here.
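Rather than writing out f(N, M) in closed form, the boundary of the inequality in Corollary 4 can be located numerically. The sketch below uses bisection on the interval [0, 0.5], where the left-hand side is increasing in r; the function names and the values of N and M in the usage lines are hypothetical and chosen only for illustration.

```python
def ks_bound_lhs(r, n_triads, m_connections):
    """Left-hand side M(2r - 2r^2) + N(3r - 5r^2 + 3r^3) of the inequality."""
    return (m_connections * (2 * r - 2 * r ** 2)
            + n_triads * (3 * r - 5 * r ** 2 + 3 * r ** 3))


def critical_error_rate(n_triads, m_connections, tol=1e-12):
    """Error rate in [0, 0.5] at which the left-hand side equals 1 (bisection)."""
    lo, hi = 0.0, 0.5
    if ks_bound_lhs(hi, n_triads, m_connections) < 1.0:
        raise ValueError("left-hand side stays below 1 on [0, 0.5]")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if ks_bound_lhs(mid, n_triads, m_connections) < 1.0:
            lo = mid
        else:
            hi = mid
    return hi


# Hypothetical set sizes, for illustration only.
small_set = critical_error_rate(n_triads=10, m_connections=12)
large_set = critical_error_rate(n_triads=20, m_connections=24)
print(small_set, large_set)
```

Error rates below the returned value violate the inequality and therefore give a contradiction; the comparison of the two hypothetical sets shows that the smaller set tolerates the larger error rate.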


5 Conclusions

To conclude, for any hidden-variable model we have a bound on the changes arising in rotation:

Here, iV is the number of triads in EKS and M is the number of connections within EKS- A proof using few triads with few connections is not only easier to understand but is also essential to yield a bound usable in real experiments. At a large error rate e probabilistically non-contextual models cannot be ruled out, since the changes of the results arising in rotation can be attributed to measurement errors. However, a small error rate e will force any hidden-variable description of the physical system to be probabilistically contextual.

If the assumption of independent errors is used, an explicit bound can be determined for the error rate r:

$$M(2r - 2r^2) + N(3r - 5r^2 + 3r^3) > 1, \tag{13}$$

which can be written in the form r > f(N, M). Below the bound, we have a KS contradiction. Again, a small KS set is better than a large one, yielding a higher bound. For example, for the KS set used here [3], an r below 0.71% yields a contradiction.
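The way a bound of the form r > f(N, M) follows from inequality (13) can be sketched numerically. The following is a minimal illustration, not the author's derivation: the function names `lhs` and `threshold` and the bisection interval (0, 0.5] are assumptions of this sketch, and it solves (13) exactly as printed above without claiming to reproduce the figure quoted from Corollary 4.

```python
def lhs(r, n_triads, m_connections):
    """Left-hand side of inequality (13) for error rate r."""
    return (m_connections * (2 * r - 2 * r**2)
            + n_triads * (3 * r - 5 * r**2 + 3 * r**3))


def threshold(n_triads, m_connections, tol=1e-12):
    """Error rate at which the left-hand side of (13) reaches 1.

    The left-hand side is increasing on [0, 0.5], so plain bisection
    on that interval is enough (interval choice is an assumption).
    """
    lo, hi = 0.0, 0.5
    if lhs(hi, n_triads, m_connections) <= 1:
        raise ValueError("left-hand side never exceeds 1 on (0, 0.5]")
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if lhs(mid, n_triads, m_connections) > 1:
            hi = mid
        else:
            lo = mid
    return hi


# N = 24 triads and M = 31 connections are the values given in the footnote.
r_star = threshold(24, 31)
print(f"(13) holds for r above about {100 * r_star:.3f}%")
```

Calling `threshold` with smaller N and M returns a larger value, which is the sense in which a small KS set tolerates a higher error rate.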

While writing this paper, the author learned from C. Simon that a similar approach was in preparation by him, C. Brukner, and A. Zeilinger [6].

The author would like to thank A. Kent for discussions. This work was partially supported by the Quantum Information Theory Programme at the European Science Foundation.

1. A. M. Gleason, J. Math. Mech. 6, 885 (1957).

2. S. Kochen and E. P. Specker, J. Math. Mech. 17, 59 (1967).

3. A. Peres, Quantum Theory: Concepts and Methods, Ch. 7 (Kluwer, Dordrecht, 1993).

4. D. M. Greenberger, M. Horne, A. Shimony, and A. Zeilinger, Am. J. Phys. 58, 1131 (1990); N. D. Mermin, Phys. Rev. Lett. 65, 1838 (1990); J.-A. Larsson, Phys. Rev. A 57, R3145 (1998); J.-A. Larsson, Phys. Rev. A 59, 4801 (1999).

5. A. Peres, J. Phys. A 24, L175 (1991); J. Zimba and R. Penrose, Stud. Hist. Philos. Sci. 24, 697 (1993).

6. C. Simon, C. Brukner, and A. Zeilinger, quant-ph/0006043.


QUANTUM STOCHASTICS. THE NEW APPROACH TO THE DESCRIPTION OF QUANTUM MEASUREMENTS

ELENA LOUBENETS Moscow State Institute of Electronics and Mathematics

Abstract

We propose a new general approach to the description of an arbitrary generalized direct quantum measurement with outcomes in a measurable space. This approach is based on the introduction of the physically important mathematical notion of a family of quantum stochastic evolution operators, describing in a Hilbert space the conditional evolution of a quantum system under a direct measurement.

In the frame of the proposed approach, which we call quantum stochastic, all possible schemes of measurements upon a quantum system can be considered.

The quantum stochastic approach (QSA) gives not only the complete statistical description of any quantum measurement (a POV measure and a family of posterior states) but also the complete stochastic description of the random behaviour of a quantum system in a Hilbert space, in the sense of specifying the probabilistic transition law governing the change from the initial state of a quantum system to a final one under a single measurement. When a quantum system is isolated, the family of quantum stochastic evolution operators consists of only one element, which is a unitary operator.

In the case of measurements continuous in time, the QSA allows one to define, in the most general case, the notion of the family of posterior pure state trajectories (quantum trajectories) in the Hilbert space of a quantum system and to give their probabilistic treatment.

1 Introduction

The evolution of an isolated quantum system is quantum deterministic since its behaviour in a complex separable Hilbert space $\mathcal{H}$ is described by a unitary operator $U(t): \mathcal{H} \to \mathcal{H}$, satisfying the Schrödinger equation, whose solutions are reversible in time.

Under a measurement the behaviour of a quantum system becomes irreversible in time and stochastic: not only is the outcome of a measurement random being defined with some probability distribution but the state of a quantum system becomes random as well.

Consider the general scheme of description of any quantum measurement with outcomes of the most general nature possible under a quantum measurement. Such a measurement is usually called generalized.

Let $\Omega$ be a set of outcomes and $\mathcal{F}$ be a $\sigma$-algebra of subsets of $\Omega$. Let $\rho_0$ be a state of a quantum system at the instant before a measurement.

The complete statistical description of any generalized quantum measurement implies that for any initial state $\rho_0$ of a quantum system we can present:

• the probability distribution of different outcomes of a measurement;

• the statistical description of a state change $\rho_0 \to \rho_{out}$ of the quantum system under a measurement.

We shall also speak of the complete stochastic description of the random behaviour of a quantum system under a measurement, in the sense of specifying the probabilistic transition law governing the change from the initial state of a quantum system to a final one under a single measurement.

We introduce some notation.

Let $\mu(E,\rho_0) = \mathrm{Prob}\{\omega \in E; \rho_0\}$, $\forall E \in \mathcal{F}$, be the probability that under a measurement (upon a quantum system initially in a state $\rho_0$) the observed outcome $\omega$ belongs to a subset $E$.

Let $\mathrm{Ex}\{Z|E\}$ be the conditional expectation of any von Neumann observable $Z \in \mathcal{L}(\mathcal{H})$, $Z = Z^+$, at the instant immediately after the measurement, provided the observed outcome $\omega \in E$. Here $\mathcal{L}(\mathcal{H})$ denotes the linear space of all linear bounded operators on $\mathcal{H}$.

The statistical (density) operator $\rho_{out}(E,\rho_0)$ is called a posterior state of a quantum system conditioned by the observed outcome $\omega \in E$ if for any $Z$ the following relation is valid:

$$\mathrm{Ex}\{Z|E\} = \mathrm{tr}[\rho_{out}(E,\rho_0) Z]. \tag{1}$$

The unconditional (a priori) state $\rho_{out}(\Omega,\rho_0)$ of a quantum system defines the quantum mean value

$$\mathrm{tr}[\rho_{out}(\Omega,\rho_0) Z] = \mathrm{Ex}\{Z|\Omega\} = \langle Z \rangle_{\rho_{out}(\Omega,\rho_0)} \tag{2}$$

of any von Neumann observable Z at the instant immediately after the measurement if the results of a measurement are ignored.

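Any conditional state change $\rho_{out}(E,\rho_0)$ of a quantum system under a measurement can be completely described by a family of statistical operators $\{\rho_{out}(\omega,\rho_0), \omega \in \Omega\}$, defined $\mu$-almost everywhere on $\Omega$ and called a family of posterior states.

Specifically, for $\forall E \in \mathcal{F}$, $\mu(E,\rho_0) \neq 0$,

$$\rho_{out}(E,\rho_0) = \frac{\int_{\omega \in E} \rho_{out}(\omega,\rho_0)\, \mu(d\omega,\rho_0)}{\mu(E,\rho_0)} \tag{3}$$

and, consequently, due to (1), for any von Neumann observable $Z$ the conditional expectation can be presented as

$$\mathrm{Ex}\{Z|E\} = \frac{\int_{\omega \in E} \mathrm{tr}[\rho_{out}(\omega,\rho_0) Z]\, \mu(d\omega,\rho_0)}{\mu(E,\rho_0)}. \tag{4}$$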

Every posterior state $\rho_{out}(\omega,\rho_0)$ describes the state of a quantum system conditioned by the "sharp" outcome $\omega$. In general, however, when the outcomes of a measurement are not of discrete character or the observation is not "sharp", then, provided the outcome $\omega \in E$, we can only say that after a measurement the quantum system is in a state $\rho_{out}(\omega,\rho_0)$ with probability

$$\chi_E(\omega)\, \frac{\mu(d\omega,\rho_0)}{\mu(E,\rho_0)}, \tag{5}$$

where $\chi_E(\omega)$ is the indicator function of a subset $E$.

The a priori state $\rho_{out}(\Omega,\rho_0)$ and the quantum mean value of any von Neumann observable $Z$ at the instant immediately after the measurement are represented through the family of posterior states as

$$\rho_{out}(\Omega,\rho_0) = \int_\Omega \rho_{out}(\omega,\rho_0)\, \mu(d\omega,\rho_0), \tag{6}$$

$$\langle Z \rangle_{\rho_{out}(\Omega,\rho_0)} = \int_\Omega \mathrm{tr}[\rho_{out}(\omega,\rho_0) Z]\, \mu(d\omega,\rho_0), \tag{7}$$

respectively.

The relation (6) can be considered as the usual statistical average over posterior states $\rho_{out}(\omega,\rho_0)$, given with the probability distribution $\mu(d\omega,\rho_0)$.

From (7) it also follows that in any possible measurement upon an observable $Z$, which could be done immediately at the instant after the first measurement, the probability distribution $\mathrm{Prob}\{z \in A; \rho_{out}(\Omega,\rho_0)\}$ of possible outcomes is given by

$$\mathrm{Prob}\{z \in A; \rho_{out}(\Omega,\rho_0)\} = \int_\Omega \mathrm{Prob}\{z \in A; \rho_{out}(\omega,\rho_0)\}\, \mu(d\omega,\rho_0). \tag{8}$$

This formula can be considered as the quantum analog of Bayes' formula in classical probability theory.

In quantum theory there are two major approaches to the specification of the above-mentioned elements of the description of a quantum measurement.


• The von Neumann approach [1] considers only direct measurements with outcomes in $\mathbb{R}$. According to this approach only self-adjoint operators on $\mathcal{H}$ are allowed to represent real-valued variables of a quantum system which can be measured (observables). The probability distribution $\mu(E,\rho_0)$ of any measurement is defined as

$$\mu(E,\rho_0) = \mathrm{tr}[\rho_0 P(E)] \tag{9}$$

through the projection-valued measure $P(\cdot)$ on $(\mathbb{R}, \mathcal{B}(\mathbb{R}))$, corresponding, due to the spectral theorem, to the self-adjoint operator representing this observable.

Under the von Neumann approach the posterior state of a quantum system is defined only in the case of a discrete spectrum of the measured quantum variable and is given by the well-known "jump" of a quantum system under a measurement, prescribed by the von Neumann reduction postulate.

In the case of a continuous spectrum of a quantum observable, the description of a state change of a quantum system under a measurement is not formalized.

The simultaneous measurement of n quantum observables is allowed if and only if the corresponding self-adjoint operators and, consequently, their spectral projection-valued measures commute.

• The operational approach [2-8] gives the complete statistical description of any generalized quantum measurement. In the frame of the operational approach the mathematical notion of a quantum instrument plays the central role. In the physical literature a quantum instrument is usually called a "superoperator".

Specifically, a mapping $T(\cdot)[\cdot]: \mathcal{F} \times \mathcal{L}(\mathcal{H}) \to \mathcal{L}(\mathcal{H})$ is called a quantum instrument if $T(\cdot)$ is a measure on $(\Omega, \mathcal{F})$ with values $T(E)$, $\forall E \in \mathcal{F}$, being linear bounded normal completely positive maps on $\mathcal{L}(\mathcal{H})$, such that the following normalization relation is valid: $T(\Omega)[I] = I$.

Let $T(\cdot)[\cdot]$ be an instrument of a generalized quantum measurement.

Then the conditional expectation of any von Neumann observable $Z$ at the instant after a measurement is defined to be

$$\mathrm{Ex}\{Z|E\} = \frac{\mathrm{tr}[\rho_0 T(E)[Z]]}{\mu(E,\rho_0)}, \quad \forall E \in \mathcal{F}. \tag{10}$$

In the case $Z = I$, from (10) it follows that in the frame of the operational approach the probability distribution $\mu(E,\rho_0)$ of outcomes under a measurement is given by

$$\mu(E,\rho_0) = \mathrm{tr}[\rho_0 T(E)[I]], \quad \forall E \in \mathcal{F}. \tag{11}$$


The positive operator-valued measure $M(E) = T(E)[I]$, satisfying the condition $M(\Omega) = I$, is called a probability operator-valued measure or a POV measure, for short.

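From (1) and (10) it also follows that, for any initial state $\rho_0$, the posterior state $\rho_{out}(E,\rho_0)$ conditioned by the outcome $\omega \in E$ can be represented as

$$\rho_{out}(E,\rho_0) = \frac{T^*(E)[\rho_0]}{\mu(E,\rho_0)}, \tag{12}$$

where $T^*(E)[\cdot]$ denotes the dual map, acting on the linear space $\mathcal{T}(\mathcal{H})$ of trace class operators on $\mathcal{H}$ and defined by

$$\mathrm{tr}[S\, T(E)[Z]] = \mathrm{tr}[T^*(E)[S]\, Z], \quad \forall Z \in \mathcal{L}(\mathcal{H}),\ \forall S \in \mathcal{T}(\mathcal{H}). \tag{13}$$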

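For any initial state $\rho_0$ of a quantum system the family of posterior states $\{\rho_{out}(\omega,\rho_0), \omega \in \Omega\}$ always exists and is defined uniquely, $\mu$-almost everywhere, by the relation:

$$\int_{\omega \in E} \mathrm{tr}[\rho_{out}(\omega,\rho_0) Z]\, \mu(d\omega,\rho_0) = \mathrm{tr}[\rho_0 T(E)[Z]], \quad \forall Z \in \mathcal{L}(\mathcal{H}),\ \forall E \in \mathcal{F}. \tag{14}$$

Due to (13), (14) we have

$$T^*(E)[\rho_0] = \int_{\omega \in E} \rho_{out}(\omega,\rho_0)\, \mu(d\omega,\rho_0), \tag{15}$$

and, consequently, the posterior state $\rho_{out}(\omega,\rho_0)$ is a density of the measure $T^*(\cdot)[\rho_0]$ with respect to the probability scalar measure $\mu(\cdot,\rho_0)$.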
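For a finite set of outcomes, relations (11)-(13) and (15) can be checked by direct computation. The sketch below assumes a hypothetical two-outcome instrument on a qubit of the particular form $T(\{\omega\})[Z] = V_\omega^+ Z V_\omega$; the operators `V`, the helper names and the numerical values are illustrative choices, not taken from the paper, and a general instrument need not have this form.

```python
import math


def matmul(a, b):
    """Product of two matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def dagger(a):
    """Conjugate transpose."""
    return [[complex(a[j][i]).conjugate() for j in range(len(a))]
            for i in range(len(a[0]))]


def trace(a):
    return sum(a[i][i] for i in range(len(a)))


# Hypothetical operators with V[0]^+ V[0] + V[1]^+ V[1] = I,
# so that T(Omega)[I] = I holds.
V = {
    0: [[math.sqrt(0.8), 0], [0, math.sqrt(0.3)]],
    1: [[math.sqrt(0.2), 0], [0, math.sqrt(0.7)]],
}


def T(w, Z):
    """Instrument on observables: T({w})[Z] = V^+ Z V."""
    return matmul(dagger(V[w]), matmul(Z, V[w]))


def T_dual(w, S):
    """Dual map on states: T*({w})[S] = V S V^+."""
    return matmul(V[w], matmul(S, dagger(V[w])))


I2 = [[1, 0], [0, 1]]
rho0 = [[0.5, 0.5], [0.5, 0.5]]   # pure state (|0> + |1>)/sqrt(2)
Z = [[1, 0], [0, -1]]

# Outcome probabilities, relation (11).
mu = {w: trace(matmul(rho0, T(w, I2))).real for w in V}

# Posterior states, relation (12) for single-point sets E = {w}.
rho_out = {w: [[x / mu[w] for x in row] for row in T_dual(w, rho0)]
           for w in V}

# Both sides of the duality relation (13) for S = rho0.
left = trace(matmul(rho0, T(0, Z)))
right = trace(matmul(T_dual(0, rho0), Z))
```

In this discrete setting the integral in (15) reduces to the sum of `rho_out[w]` weighted by `mu[w]`, which returns `T_dual(w, rho0)` summed over the outcomes.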

The operational approach is very important for the formalization of the complete statistical description of an arbitrary generalized quantum measurement.

However, the operational approach does not specify the description of a generalized direct quantum measurement, that is, the situation where we have to describe a direct interaction between a measuring device and an observed quantum system, resulting in some observed outcome $\omega$ in a classical world and the change of a quantum system state conditioned by this outcome.

We would like to emphasize that, in principle, the description of a direct measurement cannot be simply reduced to the quantum theoretical description of a measuring process. We can neither specify definitely the interaction or the quantum state of the environment of a measuring device, nor describe a measuring device in quantum theory terms only. In fact, under such a scheme the description of a direct quantum measurement is simply postponed to the description of a direct measurement of some observable of the environment of a measuring device.

The operational approach also does not, in general, make it possible to include into consideration the complete stochastic description of the random behaviour of a quantum system under a measurement.

We recall that for the case of discrete outcomes the von Neumann approach gives both the complete statistical description of a direct quantum measurement and the complete stochastic description in a Hilbert space of the random behaviour of a quantum system under a single measurement. In particular, if the initial state $\rho_0$ of a quantum system is pure, that is, $\rho_0 = |\psi_0\rangle\langle\psi_0|$, and if under a single measurement the outcome $\lambda_j$ is observed, then in the frame of the von Neumann approach the quantum system "jumps" with certainty to the posterior pure state

$$\frac{P_j \psi_0}{\|P_j \psi_0\|}, \tag{16}$$

where $P_j$ is the projection corresponding to the observed eigenvalue $\lambda_j$. The probability $\mu_j$ of the outcome $\lambda_j$ is given by

$$\mu_j = \|P_j \psi_0\|^2. \tag{17}$$
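Formulae (16) and (17) can be illustrated for a two-level system. The following is a minimal sketch under stated assumptions: the qubit state, the two projections and the function name `von_neumann_jump` are illustrative choices and do not come from the paper.

```python
import math


def apply(P, psi):
    """Apply a matrix (nested list) to a vector (list)."""
    return [sum(P[i][j] * psi[j] for j in range(len(psi)))
            for i in range(len(P))]


def norm(psi):
    return math.sqrt(sum(abs(c) ** 2 for c in psi))


def von_neumann_jump(P, psi0):
    """Return (probability, posterior state) for projection P.

    The probability follows (17); the posterior pure state follows (16).
    If the outcome has zero probability, no posterior state is returned.
    """
    phi = apply(P, psi0)
    n = norm(phi)
    if n == 0:
        return 0.0, None
    return n ** 2, [c / n for c in phi]


# Assumed example: qubit in the state (|0> + |1>)/sqrt(2),
# measured with the projections onto |0> and |1>.
psi0 = [1 / math.sqrt(2), 1 / math.sqrt(2)]
P0 = [[1, 0], [0, 0]]
P1 = [[0, 0], [0, 1]]

mu0, post0 = von_neumann_jump(P0, psi0)
mu1, post1 = von_neumann_jump(P1, psi0)
```

The two probabilities add up to one because the projections resolve the identity, and each posterior state is again a unit vector.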

We would also like to underline that the description of the stochastic, irreversible in time behaviour of the quantum system under a direct measurement is very important, in particular, in the case of direct measurements continuous in time, where the evolution of a continuously observed quantum system cannot be described by reversible in time solutions of the Schrödinger equation.

In quantum theory any physically based problem must be formulated in unitarily equivalent terms and the results of its consideration must depend neither on the choice of a special representation picture (Schrödinger, Heisenberg or interaction) nor on the choice of a basis in the Hilbert space. That is why in [9] we introduce the notion of a class of unitarily equivalent measuring processes and analyse the invariants of this class.

We show [9] that the description of any generalized direct quantum measurement with outcomes in a standard Borel space $(\Omega, \mathcal{F}_B)$ can be considered in the frame of a new general approach, which we call quantum stochastic, based on the notion of a family of quantum stochastic evolution operators satisfying the orthonormality relation. In the case when a quantum system is isolated, the family of quantum stochastic evolution operators consists of only one element, which is a unitary operator.

The quantum stochastic approach (QSA), which we present in the next section, can be considered as the quantum stochastic generalization of the description of von Neumann measurements for the case of any measurable space of outcomes, an input probability scalar measure of any type on the space of outcomes and any type of a quantum state reduction. Due to the orthonormality relation, the QSA allows one to interpret the posterior pure states, defined by quantum stochastic evolution operators, as posterior pure state outcomes in a Hilbert space corresponding to different random measurement channels.

Even for the special case of discrete outcomes, the QSA differs, due to the orthogonality relation for posterior pure state outcomes, from the somewhat similar-looking approaches considered in the physical literature [10,11], where the so-called "measurement" or Kraus operators are used for the description of both the statistics of a measurement (a POV measure) and the conditional state change of a quantum system.

The QSA gives not only the complete statistical description of any generalized direct quantum measurement but also the complete stochastic description of the random behaviour of the quantum system under a measurement.

2 Quantum stochastic approach

In this section we introduce the quantum stochastic approach (QSA) to the description of a generalized direct quantum measurement, developed in [9].

Specifically, it was shown in [9] that for any generalized direct quantum measurement with outcomes in a standard Borel space $(\Omega, \mathcal{F}_B)$ upon a quantum system, being at the instant before the measurement in a state $\rho_0$, there exist:

• the unique family of complex scalar measures, absolutely continuous with respect to a finite positive scalar measure $\nu(\cdot)$ and satisfying the orthonormality relation:

$$A = \Big\{\pi_{ji}(\omega)\nu(d\omega) : \omega \in \Omega;\ i,j = 1,\dots,N_0;\ N_0 < \infty;\ \int_\Omega \pi_{ji}(\omega)\nu(d\omega) = \delta_{ji}\Big\}; \tag{18}$$

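• the unique (up to phase equivalence) family of $\nu$-measurable operator-valued functions $V_i(\cdot)$ on $\Omega$, satisfying the orthonormality relation, with values being linear operators on $\mathcal{H}$ defined for any $\psi \in \mathcal{H}$ $\nu$-almost everywhere on $\Omega$:

$$\mathcal{V} = \Big\{V_i(\omega) : \omega \in \Omega;\ i = 1,\dots,N_0;\ \int_\Omega V_j^+(\omega) V_i(\omega) \pi_{ji}(\omega)\nu(d\omega) = \delta_{ji} I\Big\}, \tag{19}$$

and such that for any index $i = 1,\dots,N_0$ and for $\forall E \in \mathcal{F}_B$

$$\int_{\omega \in E} V_i(\omega) \pi_{ii}(\omega)\nu(d\omega) \tag{20}$$

is a bounded operator on $\mathcal{H}$. The relation

$$(W_i \psi)(\omega) = V_i(\omega)\psi, \quad \forall \psi \in \mathcal{H}, \tag{21}$$

holding $\nu_i$-almost everywhere on $\Omega$, defines the bounded linear operator $W_i : \mathcal{H} \to \mathcal{L}_2(\Omega, \nu_i; \mathcal{H})$ with the norm $\|W_i\| = 1$. Here $\nu_i(d\omega) = \pi_{ii}(\omega)\nu(d\omega)$;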

• the unique sequence of positive numbers $\alpha = (\alpha_1, \alpha_2, \dots, \alpha_{N_0})$, satisfying the relation

$$\sum_{i=1}^{N_0} \alpha_i = 1; \tag{22}$$

such that the complete statistical description (a POV measure and a family of posterior states) of a measurement and the complete stochastic description of the random behaviour of a quantum system under a single measurement (a family of posterior pure state outcomes and their probability distribution) are given by:

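• The POV measure

$$M(E) = \sum_{i=1}^{N_0} \alpha_i M_i(E), \quad \forall E \in \mathcal{F}_B \tag{23}$$

with

$$M_i(E) = \int_{\omega \in E} V_i^+(\omega) V_i(\omega)\, \nu_i(d\omega); \tag{24}$$

• The family of posterior states

$$\rho_{out}(\omega, \rho_0) = \sum_{i=1}^{N_0} \xi_i(\omega)\, \tau_{out}^{(i)}(\omega, \rho_0) \tag{25}$$

with

$$\tau_{out}^{(i)}(\omega, \rho_0) = V_i(\omega) \rho_0 V_i^+(\omega) \tag{26}$$

and

$$\xi_i(\omega) = \frac{\alpha_i \pi_{ii}(\omega)}{\sum_j \alpha_j \pi_{jj}(\omega)\, \mathrm{tr}[\tau_{out}^{(j)}(\omega, \rho_0)]}; \tag{27}$$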


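• The probability scalar measure of the measurement, given by the expression

$$\mu(d\omega, \rho_0) = \sum_i \alpha_i\, \mu^{(i)}(d\omega, \rho_0) \tag{28}$$

through the probability scalar measures

$$\mu^{(i)}(d\omega, \rho_0) = \mathrm{tr}[\tau_{out}^{(i)}(\omega, \rho_0)]\, \nu_i(d\omega). \tag{29}$$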

• The family of random operators (19), describing the stochastic behaviour of the quantum system under a single measurement. Every operator $V_i(\omega)$ defines in the Hilbert space $\mathcal{H}$ a posterior pure state outcome, conditioned by the observed result $\omega$ and corresponding to the i-th random channel of a measurement.

For any $\psi_0 \in \mathcal{H}$ the following orthonormality relation for a family $\{V_i(\omega)\psi_0,\ \omega \in \Omega;\ i = 1,\dots,N_0\}$ of unnormalized posterior pure state outcomes is valid:

$$\int_\Omega \langle V_j(\omega)\psi_0, V_i(\omega)\psi_0 \rangle\, \pi_{ji}(\omega)\nu(d\omega) = \delta_{ji} \|\psi_0\|^2. \tag{30}$$

For the definite observed outcome $\omega$ the probability of the posterior pure state outcome $V_i(\cdot)\psi_0$ in the Hilbert space $\mathcal{H}$ is given by

$$Q^{(i)}(\omega, \psi_0) = \frac{\alpha_i \pi_{ii}(\omega) \|V_i(\omega)\psi_0\|^2}{\sum_j \alpha_j \pi_{jj}(\omega) \|V_j(\omega)\psi_0\|^2}. \tag{31}$$

We call $V_i(\omega)$ quantum stochastic evolution operators and the probability scalar measures $\nu_i(\cdot)$, together with their mixture $\sum_i \alpha_i \nu_i(\cdot)$, and $\mu^{(i)}(\cdot,\rho_0)$, $\mu(\cdot,\rho_0) = \sum_i \alpha_i \mu^{(i)}(\cdot,\rho_0)$, input and output probability measures, respectively.

Due to the decompositions (23), (25), and (28), $M_i(E)$, $\tau_{out}^{(i)}(\omega,\rho_0)$, $\nu_i(\cdot)$ and $\mu^{(i)}(\cdot,\rho_0)$ are interpreted as the POV measure, the unnormalized posterior state, and the input and output probability distributions of outcomes in the i-th random channel of the measurement, respectively. The statistical weights of the different random channels are given by the numbers $\alpha_i$, $i = 1,\dots,N_0$.

The a priori state

$$\rho_{out}(\Omega; \rho_0) = \sum_i \alpha_i \int_\Omega \tau_{out}^{(i)}(\omega, \rho_0)\, \nu_i(d\omega) \tag{32}$$

is the usual statistical average over unnormalized posterior states $\tau_{out}^{(i)}(\omega,\rho_0)$ with respect to the input probability distribution of outcomes $\nu_i(\cdot)$ in every channel and with respect to different random channels of the measurement.
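The decompositions (23)-(29) can be made concrete for a discrete set of outcomes. The sketch below is an illustration under strong assumptions, not the construction of [9]: it takes two outcomes, two random channels, a counting measure $\nu$, and sets the off-diagonal densities $\pi_{ji}$, $j \neq i$, to zero so that the cross-channel part of the orthonormality relation holds trivially. All operators, weights and helper names are hypothetical.

```python
import math


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def dagger(a):
    return [[complex(a[j][i]).conjugate() for j in range(len(a))]
            for i in range(len(a[0]))]


def trace(a):
    return sum(a[i][i] for i in range(len(a)))


s = math.sqrt(2)
alpha = [0.7, 0.3]                           # statistical weights, relation (22)
nu = [{0: 0.5, 1: 0.5}, {0: 0.5, 1: 0.5}]    # input measures nu_i({w})
V = [
    {0: [[s, 0], [0, 0]], 1: [[0, 0], [0, s]]},   # channel 1: rescaled projections
    {0: [[1, 0], [0, 1]], 1: [[0, 1], [1, 0]]},   # channel 2: unitaries
]
# For each channel, sum_w V_i(w)^+ V_i(w) nu_i({w}) = I, as in (19) with j = i.


def tau(i, w, rho0):
    """Unnormalized posterior state of channel i, relation (26)."""
    return matmul(V[i][w], matmul(rho0, dagger(V[i][w])))


def mu(w, rho0):
    """Output probability of outcome w, relations (28)-(29)."""
    return sum(alpha[i] * trace(tau(i, w, rho0)).real * nu[i][w]
               for i in range(len(alpha)))


def rho_out(w, rho0):
    """Posterior state, relations (25) and (27)."""
    total = mu(w, rho0)
    out = [[0, 0], [0, 0]]
    for i in range(len(alpha)):
        xi = alpha[i] * nu[i][w] / total
        t = tau(i, w, rho0)
        out = [[out[r][c] + xi * t[r][c] for c in range(2)] for r in range(2)]
    return out


def pov(w):
    """POV measure of the single-point set {w}, relations (23)-(24)."""
    out = [[0, 0], [0, 0]]
    for i in range(len(alpha)):
        m = matmul(dagger(V[i][w]), V[i][w])
        out = [[out[r][c] + alpha[i] * nu[i][w] * m[r][c] for c in range(2)]
               for r in range(2)]
    return out


rho0 = [[0.5, 0.5], [0.5, 0.5]]   # pure state (|0> + |1>)/sqrt(2)
```

With these choices `mu(w, rho0)` coincides with the trace of `rho0` times `pov(w)`, and each `rho_out(w, rho0)` has unit trace, as the statistical description requires.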


Physically, the introduced notion of different random channels of a measurement corresponds, under the same observed outcome, to different random quantum transitions of the environment of a measuring device, which we cannot, however, specify with certainty.

The triple $\gamma = \{A, \mathcal{V}, \alpha\}$ is called a quantum stochastic representation of a generalized direct measurement.

We call direct measurements, presented by different quantum stochastic representations, stochastic representation equivalent if the statistical and stochastic description of these direct measurements is identical.

In the frame of the QSA, von Neumann (projective) measurements constitute the stochastic representation equivalence class of direct measurements on $(\mathbb{R}, \mathcal{B}(\mathbb{R}))$ for which the complete statistical and the complete stochastic description is given by the von Neumann measurement postulates [1], presented by the formulae (16), (17).

3 Concluding remarks

We present a new general approach to the description of a generalized direct quantum measurement. The proposed approach allows one:

• to give the complete statistical description (a POV measure and a family of posterior states) of any quantum measurement;

• to give the complete description in a Hilbert space of the stochastic behaviour of a quantum system under a measurement (in the sense of specifying the probabilistic transition law governing the change from the initial state of a quantum system to a final one under a single measurement);

• to formalize the consideration of all possible cases of quantum measurements, including measurements continuous in time;

• to give the semiclassical interpretation of the description of a generalized direct quantum measurement.

4 Acknowledgments

This investigation was supported by the grant of Swedish Royal Academy of Sciences on the collaboration with states of the former Soviet Union and the Profile Mathematical Modeling of Vaxjo University. I would like to thank A. Khrennikov for the warm hospitality and fruitful discussions.

References

1. J. von Neumann, Mathematical Foundations of Quantum Mechanics (Princeton University Press, Princeton, NJ, 1955).

2. E. B. Davies, J. T. Lewis, An operational approach to quantum probability. Commun. Math. Phys. 17, 239-260 (1970).

3. E. B. Davies, Quantum Theory of Open Systems (Academic Press, London 1976).

4. A. S. Holevo, Probabilistic and statistical aspects of quantum theory (Moscow, Nauka, 1980; North Holland, Amsterdam, 1982, English translation).

5. K. Kraus, States, Effects and Operations: Fundamental Notions of Quantum Theory (Springer-Verlag, Berlin, 1983).

6. M. Ozawa, Quantum measuring processes of continuous observables. J. Math. Phys. 25, 79-87 (1984).

7. M. Ozawa, Conditional probability and a posteriori states in quantum mechanics. Publ. RIMS, Kyoto Univ. 21, 279-295 (1985).

8. A. Barchielli, V. P. Belavkin, Measurements continuous in time and a posteriori states in quantum mechanics. J. Phys. A: Math. Gen. 24, 1495-1514 (1991).

9. E.R. Loubenets, Quantum stochastic approach to the description of quantum measurements. Research Report N 39, MaPhySto, University of Aarhus, Denmark (2000).

10. A. Peres, Classical intervention in quantum systems. I. The measuring process. Phys. Rev. A. 61, 022116 (1-9) (2000).

11. H. Wiseman, Adaptive quantum measurements. Proceedings of the Workshop on Stochastics and Quantum Physics. Miscellanea N 16, 89-93, MaPhySto, University of Aarhus, Denmark (1999).


ABSTRACT MODELS OF PROBABILITY

V. M. MAXIMOV

Institute of Computer Science, Bialystok University,

PL15887 Bialystok, ul. Sosnowa 64, POLAND

Probability theory presents a mathematical formalization of the intuitive ideas of independent events and of probability as a measure of randomness. It is based on axioms 1-5 of A. N. Kolmogorov [1] and their generalizations [2]. Different formalized refinements were proposed for such notions as events, independence, random value etc. [2, 3], whereas the measure of randomness, i.e. numbers from [0,1], remained unchanged. To be precise, we mention some attempts to generalize probability theory with negative probabilities [4]. On the other hand, physicists tried to use negative and even complex values of probability to explain some paradoxes in quantum mechanics [5, 6, 7]. Only recently, the necessity of formalizing quantum mechanics and its foundations [8] led to the construction of p-adic probabilities [9, 10, 11], which essentially extended our concept of probability and randomness. Therefore, a natural question arises: how to describe the algebraic structures whose elements can be used as a measure of randomness. As a consequence, it becomes necessary to define the types of randomness corresponding to every such algebraic structure. Possibly, this leads to another concept of randomness whose nature differs from the combinatorial-metric conception of Kolmogorov. Apparently, a discrepancy between the real type of randomness of some experimental data and the model of randomness used for data processing leads to paradoxes [12]. An algebraic structure whose elements can be used to estimate some randomness will be called a probability set $\Phi$. Naturally, the elements of $\Phi$ are the probabilities.

1 What probability sets $\Phi$ are possible?

For practical conclusions of probability theory, two kinds of events, the so-called certain and uncertain ones, are of importance. Therefore, the probability set $\Phi$ must have two types of elements corresponding to certainty and uncertainty. Their main role is that they couple all elements of $\Phi$. We interpret them as the possibility of determining any probability $p \in \Phi$ of a random event by an infinite sequence of random independent variables defined by the probability set $\Phi$. In this connection we do not require a formal physical interpretation for certainty.

We would like to preserve, for an abstract probability set $\Phi$, all fundamental properties of probability on [0,1] that correspond to the intuitive idea of the probability of an event.

An analogous situation occurs in logic. A construction which preserves the main properties of Boolean algebra and possesses some new properties led to the appearance of the logical Lukasiewicz-Tarski system [13, 14].


Definition 1 A set $\Phi$ is called a probability set if it has the following properties:

(i) In $\Phi$ a binary operation "·" can be defined as multiplication of probabilities, which is not necessarily commutative. With respect to that operation the set $\Phi$ is a semigroup. In addition, $\Phi$ consists of three non-intersecting semigroups $\mathbf{O}$, $\mathbf{e}$ and $\mathbf{P}$, such that $\Phi = \mathbf{O} \cup \mathbf{P} \cup \mathbf{e}$. The elements of the semigroup $\mathbf{O}$ play the role of zeros, i.e. $\mathbf{O}$ is a semigroup of zeros. The elements of $\mathbf{e}$ play the role of units, i.e. $\mathbf{e}$ is a semigroup of units. $\mathbf{P}$ is a semigroup of probabilities. Besides, for all $p \in \mathbf{P}$, $\theta \in \mathbf{O}$ we have $\theta \cdot p,\ p \cdot \theta \in \mathbf{O}$, and for all $p \in \mathbf{P}$, $e \in \mathbf{e}$ we have $e \cdot p,\ p \cdot e \in \mathbf{P}$.

It is clear that zero elements correspond to uncertain events, and the unit elements correspond to certain events.

(ii) For some elements of $\Phi$ a commutative and associative operation "+" of addition is defined. The operations of addition and multiplication are distributive. This means that if for $p, q, r \in \Phi$ the operations $p + q$, $(p + q) + r$ are defined, then the operations $q + r$, $p + (q + r)$ are also defined and the equality $(p + q) + r = p + (q + r)$ holds. In addition, for all $u, v, r$ the operations $u \cdot p + v \cdot q$, $p \cdot u + q \cdot v$ are defined and the equalities $r \cdot (p + q) = r \cdot p + r \cdot q$, $(p + q) \cdot r = p \cdot r + q \cdot r$ hold.

(iii) For all $p \in \mathbf{P}$ there exist a complementary element $\bar{p} \in \mathbf{P}$ and $e \in \mathbf{e}$ such that $p + \bar{p} = e$.

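(iv) The operation "+" is defined for all elements of $\mathbf{O}$ and is not defined for elements of $\mathbf{e}$. Besides, for all $p \in \mathbf{P}$, $\theta \in \mathbf{O}$ the sum $p + \theta$ is defined and $p + \theta \notin \mathbf{O}$, $p + \theta \notin \mathbf{e}$. Also, for $e \in \mathbf{e}$ the inclusion $\theta + e \in \mathbf{e}$ holds, but $p + e$ is not defined.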

(v) In $\Phi$ some topology is introduced such that with respect to that topology the operations "·" and "+" are continuous. For an arbitrary neighbourhood $V(\mathbf{O})$ of zeros there exists $p \in \Phi$ such that $p^n \in V(\mathbf{O})$ for $n > n_0(V, p)$.

(vi) If $p, q \in \Phi$ and $p + q \in \mathbf{O}$, then it follows that $p, q \in \mathbf{O}$ (the property of indecomposability of zero). That property is not necessary. For example, in the complex and p-adic probability it may fail.

(vii) The equation $p^2 = p$ always has solutions in $\mathbf{O}$ and $\mathbf{e}$. If the equation $p^2 = p$ has solutions only in $\mathbf{O}$ and in $\mathbf{e}$, then we say that Kolmogorov's condition is valid for the probability set $\Phi$.

The properties (3.1)-(3.5) provide the main identity of the calculus of independent probabilities, i.e. if $p_1 + \dots + p_n = e \in \mathbf{e}$, $p_i \in \mathbf{P}$, then we have

$$(p_1 + \dots + p_n)^k = \sum p_{i_1} \cdots p_{i_k} = e^k \in \mathbf{e}.$$
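For the classical probability set [0,1], where the only unit is 1, the identity above says that the products of probabilities over all sequences of k independent trials add up to 1. A minimal check, assuming an illustrative distribution (1/2, 1/3, 1/6) and a hypothetical helper name `sum_of_products`, can be sketched as:

```python
from fractions import Fraction
from itertools import product
from math import prod


def sum_of_products(probs, k):
    """Sum of p_{i1} * ... * p_{ik} over all index sequences of length k."""
    return sum(prod(word) for word in product(probs, repeat=k))


# Assumed example: three elementary probabilities adding up to the unit 1.
probs = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)]

for k in range(1, 5):
    # Exact rational arithmetic, so the comparison with 1 is exact.
    assert sum_of_products(probs, k) == sum(probs) ** k == 1
```

Exact fractions are used instead of floating-point numbers so that the equality with the unit element holds without rounding tolerance.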

Unfortunately, the operations of a direct sum and of a tensor product of [0,1] do not produce a new probability set different from [0,1].

For example, in the case of a direct sum $[0,1] \oplus [0,1]$ with the coordinate-wise multiplication we have $(p, q)$, $p, q \in [0,1]$, as probabilities. Consequently, $(p_1, q_1) + (p_2, q_2) = (p_1 + p_2, q_1 + q_2)$ and $(p_1, q_1)(p_2, q_2) = (p_1 p_2, q_1 q_2)$. Obviously, the element (0,0) must be a zero. But then $(p, 0)(0, q) = (0, 0)$. It follows by the zero semigroup properties that $(p, 0) \in \mathbf{O}$ or $(0, q) \in \mathbf{O}$. Assume that $(p, 0) \in \mathbf{O}$, $p \neq 0$. Then, by virtue of the other axioms, we obtain $(\frac{m}{n} p, 0) \in \mathbf{O}$, $0 < \frac{m}{n} < 1$, and, therefore, by the continuity property the set $\{(p, 0)\}$, $p \in [0,1]$, constitutes $\mathbf{O}$. Formally, the probability set differs from [0,1]. But the factorization with respect to the set $\mathbf{O}$ yields [0,1] once again with the usual addition and multiplication (see section 2). However, there exists a probability set $\Phi$ satisfying all axioms in the algebra consisting of pairs $(x, y)$, $x, y \in \mathbb{R}$, with the operations of coordinate-wise addition and multiplication.

Indeed, consider the set $\Phi$ in Fig. 1 (a parallelogram) bounded by the vertices 0, h, 1, -h where h < |. Then we can easily verify that if $(x_1, y_1), (x_2, y_2) \in \Phi$, then $(x_1 x_2, y_1 y_2) \in \Phi$. The zero set $\mathbf{O}$ consists of the single element 0, and the set $\mathbf{e}$ consists of the single element 1. The topology of $\Phi$ is induced from $\mathbb{R}^2$. The remaining properties of $\Phi$ can be examined easily. Note that the first coordinate x runs over the segment [0,1].

Since $\mathbb{R}^2$ with the coordinate-wise addition and multiplication is the simplest non-trivial topological semi-field [15], we can consider $\Phi$ as an example of a probability set included in a topological semi-field.

In [16] the foundation of classical probability theory is presented in terms of semi-fields. Thus the construction of probability sets in abstract topological semi-fields can be of interest for applications. In section 3 we consider multidimensional examples of probability sets which may even be non-commutative. These examples go beyond the frame of topological semi-fields.

The zero-indecomposability property may or may not be included among the properties of $\Phi$, depending on the problem. For example, if we consider the whole field of p-adic numbers as a probability set, then the indecomposability property does not hold. Nevertheless, this does not prevent the existence of an analogue of the Bernoulli theorem in p-adic probabilities [10].

Fig. 1

However, we can find sets satisfying all axioms in the field of p-adic numbers. For this purpose, we take a p-adic number $q$, $\|q\|_p < 1$, that is not a root of any algebraic equation with integer coefficients. Then the set of p-adic numbers of the form

$$n_k q^k + n_{k+1} q^{k+1} + \dots + n_r q^r,$$

where $n_k \in \mathbb{N}$ and the rest of the $n_i$ belong to $\mathbb{Z}$, $k, r = 1, 2, 3, \dots$, and of the form

$$1 - m_s q^s + m_{s+1} q^{s+1} + \dots + m_t q^t,$$

where $m_s \in \mathbb{N}$ and the rest of the $m_j$ belong to $\mathbb{Z}$, $s, t = 1, 2, 3, \dots$, together with 0 and 1, is a probability set with the operations of addition and multiplication of p-adic numbers.

The semigroups $\mathbf{O}$ and $\mathbf{e}$ consist of 0 and 1, respectively.

Essentially different examples of probability sets will be considered in sections 3 and 4.

2 Uniqueness of semigroups of zeros and units.

(i) Proposition 1 In the probability set $\Phi$ defined by the operations "·" and "+", the semigroups $\mathbf{O}$ and $\mathbf{e}$ satisfying properties (3.1)-(3.4) are unique.

Proof. It is important to note that the semigroups $\mathbf{O}$ and $\mathbf{e}$ possess the maximality property, i.e. they cannot be extended to semigroups $\mathbf{O}'$, $\mathbf{O} \subset \mathbf{O}'$, and $\mathbf{e}'$, $\mathbf{e} \subset \mathbf{e}'$, or $\mathbf{e} \subset \mathbf{e}'$, $\mathbf{O} \subset \mathbf{O}'$, preserving the properties (3.1)-(3.4). Indeed, if there is an extension $\mathbf{O}'$, then there is an element $p \notin \mathbf{O}$ such that $p \in \mathbf{O}'$. But this contradicts conditions (3.3)-(3.4), since, on the one hand, the operation $p + e$, $e \in \mathbf{e}$, is not defined for $p \notin \mathbf{O}$, and, on the other hand, the operation $p + e$ is defined for all $e$, $e \in \Phi$, since $p \in \mathbf{O}'$.

Now let $\mathbf{O}' = \mathbf{O}$ and $\mathbf{e} \subset \mathbf{e}'$. Then there exists an element $p \in \mathbf{e}'$ with $p \notin \mathbf{e}$. By (3.3) there exists $\bar{p} \notin \mathbf{O}$ such that $p + \bar{p} \in \mathbf{e} \subset \mathbf{e}'$. On the other hand, the operation $p + q$ is not defined for $q \notin \mathbf{O}' = \mathbf{O}$ and $p \in \mathbf{e}'$. Thus any pair of semigroups $\mathbf{O}$ and $\mathbf{e}$ satisfying (3.1)-(3.4) is maximal.

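For the same reason, in $\Phi$ there exists no other pair of a semigroup $\mathbf{O}_1$ and a semigroup $\mathbf{e}_1$ different from $\mathbf{O}$ and $\mathbf{e}$. Indeed, assume that such semigroups exist. Let $\mathbf{O}_1 \neq \mathbf{O}$, $\mathbf{O}_1 \not\subset \mathbf{O}$, $\mathbf{O} \not\subset \mathbf{O}_1$. Then $\exists p \in \mathbf{O}$, $p \notin \mathbf{O}_1$. If $\mathbf{e} \cap \mathbf{e}_1 \neq \emptyset$, then the operation $p + e$ is defined for $e \in \mathbf{e} \cap \mathbf{e}_1$, since $p \in \mathbf{O}$. On the other hand, the operation $p + e$ is not defined for $e \in \mathbf{e}_1$, since $p \notin \mathbf{O}_1$. If $\mathbf{e} \cap \mathbf{e}_1 = \emptyset$, we consider an element $p$ such that $p \notin \mathbf{O}$ but $p \in \mathbf{O}_1$. Then by (3.4) the sum $p + q$ is defined $\forall q \in \Phi$. On the other hand, the sum $p + e$ is not defined for $e \in \mathbf{e}$, since $p \notin \mathbf{O}$.

It remains to consider the case when $\mathbf{O} = \mathbf{O}_1$, but $\mathbf{e} \not\supseteq \mathbf{e}_1$. This case does not coincide with the case $\mathbf{O} = \mathbf{O}_1$ and $\mathbf{e} \subset \mathbf{e}_1$ studied above, but the proof remains the same. Namely, there exists $p \in \mathbf{e}_1$ with $p \notin \mathbf{e}$. By virtue of (3.3) there exists an element $\bar{p} \notin \mathbf{O}$ such that $p + \bar{p} \in \mathbf{e}$. At the same time, the operation $p + \bar{p}$ is not defined, since $p \in \mathbf{e}_1$ and $\bar{p} \notin \mathbf{O}_1 = \mathbf{O}$.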

(ii) A homomorphism of the probability set $\Phi_1$ into the probability set $\Phi_2$ can be defined as usual, but with the following natural complement.

Definition 2 A mapping $\varphi$ of a probability set $\Phi_1$ into a probability set $\Phi_2$ is called a homomorphism if:

(a) $\varphi$ is a semigroup homomorphism with respect to multiplication.

(b) If a sum $p + q$ is defined in $\Phi_1$, then the sum $\varphi(p) + \varphi(q)$ is also defined in $\Phi_2$ and $\varphi(p + q) = \varphi(p) + \varphi(q)$.

(c) If a sum $\varphi(p) + \varphi(q)$ is defined in $\Phi_2$, then the sum $p + q$ is defined in $\Phi_1$ and, consequently, by (iib), we have $\varphi(p + q) = \varphi(p) + \varphi(q)$.

Proposition 2 Let the probability set $\Phi_2$ be a $\varphi$-homomorphic image of a probability set $\Phi_1$. Let $\Phi_1 = O_1 \cup P_1 \cup \mathbf{e}_1$ and $\Phi_2 = O_2 \cup P_2 \cup \mathbf{e}_2$, where $O_i$, $\mathbf{e}_i$ are the semigroups of zeros and units, respectively. Then $\varphi(O_1) = O_2$, $\varphi(P_1) = P_2$ and $\varphi(\mathbf{e}_1) = \mathbf{e}_2$. Also we have $\varphi(\bar{p}) = \overline{\varphi(p)}$ for all $p \in P_1$.

Proof. Consider the sets $O_1' = \varphi^{-1}(O_2)$, $P_1' = \varphi^{-1}(P_2)$, $\mathbf{e}_1' = \varphi^{-1}(\mathbf{e}_2)$. Since the sets $O_2$, $P_2$ and $\mathbf{e}_2$ are pairwise disjoint, the sets $O_1'$, $P_1'$ and $\mathbf{e}_1'$ are also pairwise disjoint and $\Phi_1 = O_1' \cup P_1' \cup \mathbf{e}_1'$. Since $O_2$, $P_2$, $\mathbf{e}_2$ are semigroups, the semigroup properties of $\varphi$ imply that the sets $O_1'$, $P_1'$, $\mathbf{e}_1'$ are semigroups in $\Phi_1$. Further, using properties (iia) and (iib) one can easily verify that the sets $O_1'$ and $\mathbf{e}_1'$ satisfy conditions (3.1)-(3.4) of Definition 1 and thus are semigroups of zeros and units. In view of Proposition 1, we have $O_1' = O_1$ and $\mathbf{e}_1' = \mathbf{e}_1$. It follows that $P_1' = P_1$. Then, if $p \in P_1$, there exists an element $\bar{p} \in P_1$ such that $p + \bar{p} \in \mathbf{e}_1$. Therefore $\varphi(p + \bar{p}) = \varphi(p) + \varphi(\bar{p}) \in \mathbf{e}_2$ and we can set $\overline{\varphi(p)} = \varphi(\bar{p})$.

(iii) Let $\Phi$ be an arbitrary probability set with a semigroup of zeros $O$. Propositions 1 and 2 allow us to consider, instead of the probability set $\Phi$, a homomorphic probability set $\Phi_0$ (by Proposition 3 below) whose semigroup of zeros consists of a single element. Denote it by $\mathbf{0}$. Then $\mathbf{0}$ possesses all properties of the usual zero, i.e. $p + \mathbf{0} = p$, $\mathbf{0} \cdot p = p \cdot \mathbf{0} = \mathbf{0}$, $\forall p \in \Phi_0$.

Definition 3 The equivalence class $K_q$ of an element $q \in \Phi$ is the set of all elements $p \in \Phi$ for which $p + \theta_1 = q + \theta_2$ for some $\theta_1, \theta_2 \in O$. Set

$\Phi/O = \{K_q\}, \quad q \in \Phi.$

From Definition 3 it is clear that $K_\theta = O$ for all $\theta \in O$. Indeed, let $x \in K_\theta$; then by Definition 3 we have $x + \theta_1 = \theta + \theta_2$ for some $\theta_1, \theta_2 \in O$. By (6) it follows that $x \in O$. Further, since $p + \theta = \theta + p$ $\forall \theta \in O$, we have $p \in K_p$.

The following two lemmas are similar to those for conjugate classes in rings, but the proofs are different.

Lemma 1 If $z \in K_p$, then $K_z = K_p$.

Proof. If $z \in K_p$, then by Definition 3 we have $z + \theta_1 = p + \theta_2$ for some $\theta_1, \theta_2 \in O$. Let $x$ be an arbitrary element of $K_z$. Then by Definition 3 we have $x + \theta_3 = z + \theta_4$ for some $\theta_3, \theta_4 \in O$. Adding $\theta_1$ to this equality and using the addition properties in $\Phi$ and the relation $z + \theta_1 = p + \theta_2$, we obtain

$(x + \theta_3) + \theta_1 = x + (\theta_3 + \theta_1) = (z + \theta_4) + \theta_1 = (z + \theta_1) + \theta_4 = (p + \theta_2) + \theta_4 = p + (\theta_2 + \theta_4).$

Since $\theta_3 + \theta_1$ and $\theta_2 + \theta_4$ belong to $O$, it follows from Definition 3 that $x \in K_p$, i.e. $K_z \subset K_p$.

Also, from the relation $p + \theta_2 = z + \theta_1$ it follows that $p \in K_z$. Consequently $K_p \subset K_z$ and we have $K_z = K_p$.


Lemma 2 The classes $K_p$ and $K_q$ either coincide or do not intersect.

Proof. Indeed, let $K_p \cap K_q \neq \emptyset$. If $z \in K_p \cap K_q$, then by Lemma 1 we have $K_z = K_p$ and $K_z = K_q$, i.e. $K_p = K_q$.

Proposition 3 In the set $\Phi/O$ one can introduce operations of multiplication and addition, naturally induced by the operations in $\Phi$, that turn $\Phi/O$ into a probability set (we denote it by $\Phi_0$). Moreover, the semigroup of zeros of the probability set $\Phi_0$ consists of a single element $K_\theta = O$, $\forall \theta \in O$, which possesses the properties of the usual zero.

Proof. Define the set $K_p + K_q$ by term-by-term addition of elements. The definition of $K_p + K_q$ is correct if $p + q$ is defined. Indeed, consider $x \in K_p$, $y \in K_q$. Then by Definition 3 we have $x + \theta_1 = p + \theta_2$, $y + \theta_3 = q + \theta_4$ for some $\theta_i \in O$. Since $p + q$ is defined, properties (3.2) and (3.4) imply

$(p + \theta_2) + (q + \theta_4) = (p + q) + (\theta_2 + \theta_4) = (x + y) + (\theta_1 + \theta_3).$

Consequently, $x + y \in K_{p+q}$ and it follows that $K_p + K_q \subset K_{p+q}$.

Similarly, we can define the set $K_p \cdot K_q$ by term-by-term multiplication. If $x \in K_p$, $y \in K_q$ we have $x + \theta_1 = p + \theta_2$ and $y + \theta_3 = q + \theta_4$, $\theta_j \in O$. Multiplying the left-hand and right-hand sides of these equalities and applying the properties of $O$ we obtain

$(x + \theta_1)(y + \theta_3) = x \cdot y + \theta = (p + \theta_2)(q + \theta_4) = p \cdot q + \theta',$

where $\theta, \theta' \in O$. Consequently, $x \cdot y \in K_{p \cdot q}$ and therefore $K_p K_q \subset K_{p \cdot q}$.

These inclusions, Lemma 2 and properties (3.3), (3.4) allow us to introduce correctly the operations of multiplication and addition on the classes of $\Phi/O$ by

$K_p \odot K_q = K_{pq}, \qquad K_p \boxplus K_q = K_{p+q}.$ (1)

These operations turn the set $\Phi/O$ into a probability set $\Phi_0$. The zero semigroup of $\Phi_0$ consists of the single class $O = K_\theta$, $\theta \in O$, and the semigroup of units $\mathbf{e}/O$ consists of the classes $\{K_e\}$, $e \in \mathbf{e}$. The properties (3.1)-(6) of Definition 1 can be easily verified. The class $K_\theta = O$, $\forall \theta \in O$, possesses all properties of the usual zero, since $K_q \cdot K_\theta = K_{q\theta} = K_\theta = O$ and $K_q + K_\theta = K_{q+\theta} = K_q$.

We define $\varphi$ on $\Phi$ by $\varphi(p) = K_p$. Obviously, the mapping $\varphi$ satisfies the conditions of Definition 2 and, therefore, is a homomorphism of $\Phi$ into $\Phi_0 = \Phi/O$.

3 Probabilities with hidden parameters.

(i) The idea of hidden variables is very popular in quantum mechanics [17]. With the help of hidden variables many investigators try to overcome some difficulties of quantum mechanics. For example, in [18] a p-adic theory of distributions for hidden variables was proposed to resolve the Bell inequality paradox.

We propose, on the other hand, to consider the hidden variables as hidden parameters of usual probabilities, so that the latter become abstract probabilities satisfying the conditions of Definition 1.

First, we consider one model of hidden parameters for abstract probabilities.

Definition 4 We say that a set of abstract probabilities $\Phi$ allows hidden parameters $A$ (or $\Phi$ has hidden parameters $A$), where $A$ is a certain topological space, if to each $\alpha \in A$ there corresponds a subset $P_\alpha \subset \Phi$ such that $\bigcup_\alpha P_\alpha = \Phi$, and continuous mappings $\varphi$ and $\psi$ from $A \times A \times \Phi \times \Phi$ into $A$ are defined and possess the following properties. The operations

$(p, \alpha) + (q, \beta) = (p + q, \varphi(\alpha, \beta; p, q))$ (2)

$(p, \alpha) \cdot (q, \beta) = (p \cdot q, \psi(\alpha, \beta; p, q))$ (3)

where $p \in P_\alpha$, $q \in P_\beta$, $p + q \in P_{\varphi(\alpha,\beta;p,q)}$, $p \cdot q \in P_{\psi(\alpha,\beta;p,q)}$, define on the set of pairs $(p, \alpha)$, $\alpha \in A$, $p \in P_\alpha$, a probability set, denoted by $\Phi(A)$, $\Phi(A) \subset \Phi \times A$.

Since the first components in (2) and (3) are given by the operations in the probability set $\Phi$, the hidden parameters can describe additional properties of probabilities, possibly with some physical meaning. The principal problem concerning probabilities with hidden parameters is as follows: can we distinguish statistically the sequences $\xi_1(\omega), \ldots, \xi_n(\omega), \ldots$ and $\eta_1(\omega), \ldots, \eta_n(\omega), \ldots$, where the $\xi_k(\omega)$ are independent random variables with identical distributions with respect to usual probabilities from $[0,1]$, and the $\eta_k(\omega)$ are independent random variables with the same values as $\xi_k(\omega)$ but with distributions from the probability set $[0,1] \times A$, satisfying the condition: if $P\{\xi_k(\omega) \in B\} = p$, then $P\{\eta_k(\omega) \in B\} = (p, \alpha)$ for some $\alpha \in A$?


(ii) Now we consider the principal construction for different examples of usual probability on $[0,1]$ with hidden parameters.

Proposition 4 Let $\Phi = [0,1]$ and let $A$ be some convex semigroup in an arbitrary Banach algebra over $\mathbb{R}$. Then the set $\Phi \times A = \{(p, \alpha),\ \alpha \in A\}$ forms a probability set with respect to the operations:

$(p, \alpha) + (q, \beta) = \left(p + q,\ \frac{p}{p+q}\,\alpha + \frac{q}{p+q}\,\beta\right), \quad p + q \le 1$ (4)

$(p, \alpha) \cdot (q, \beta) = (p \cdot q,\ \alpha \cdot \beta)$ (5)

Proof. As the zero set $O$ we take the set $\{(0, \alpha)\}$, $\alpha \in A$, and as $\mathbf{e}$ we take the set $\{(1, \alpha)\}$, $\alpha \in A$. Then all properties of Definition 1 can be easily verified. By Proposition 3, all elements of the form $(0, \alpha)$, $\alpha \in A$, can be identified with one zero.

A simple and interesting example of this kind can be obtained by considering the set of pairs $(p, q)$, $p, q \in [0,1]$, with the operations:

$(p_1, q_1) + (p_2, q_2) = \left(p_1 + p_2,\ \frac{p_1}{p_1 + p_2}\,q_1 + \frac{p_2}{p_1 + p_2}\,q_2\right), \quad 0 < p_1 + p_2 \le 1$ (6)

$(p_1, q_1) \cdot (p_2, q_2) = (p_1 \cdot p_2,\ q_1 \cdot q_2)$ (7)

Obviously, instead of $q \in [0,1]$ we can take the elements of the Banach algebra of sequences of numbers from $[0,1]$ with coordinate-wise multiplication. We can interpret probabilities $(p, q)$ with hidden parameters $q = (q_1, q_2, \ldots)$, $0 \le q_i \le 1$, as follows: if an event $S$ occurs with probability $p$, then $q_1, q_2, \ldots$ can be considered as probabilities of some independent events $S_1, S_2, \ldots$ which can occur when $S$ occurs.
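The operations (6) and (7) are easy to experiment with. The following Python sketch is an illustration only, not part of the original argument: the function names `add`, `mul` and `check_laws` and the finite test grid are our own assumptions. It uses exact rational arithmetic to check associativity of the addition (6) and distributivity of the multiplication (7) over it, on all triples for which the sums are defined.

```python
from fractions import Fraction as F
from itertools import product


def add(a, b):
    """Addition (6): defined only when 0 < p1 + p2 <= 1."""
    (p1, q1), (p2, q2) = a, b
    s = p1 + p2
    if not (0 < s <= 1):
        raise ValueError("sum is not defined")
    # The hidden parameter is the p-weighted average of q1 and q2.
    return (s, (p1 * q1 + p2 * q2) / s)


def mul(a, b):
    """Multiplication (7): componentwise product."""
    (p1, q1), (p2, q2) = a, b
    return (p1 * p2, q1 * q2)


# Hypothetical finite test grid: p in {1/4, 1/2, 3/4, 1}, q in {0, 1/4, ..., 1}.
grid = [F(k, 4) for k in range(5)]
pairs = [(p, q) for p in grid[1:] for q in grid]


def check_laws():
    """Check associativity of (6) and distributivity of (7) over (6)."""
    for a, b, c in product(pairs, repeat=3):
        if a[0] + b[0] + c[0] <= 1:
            assert add(add(a, b), c) == add(a, add(b, c))
        if b[0] + c[0] <= 1:
            assert mul(a, add(b, c)) == add(mul(a, b), mul(a, c))
    return True
```

For instance, `add((F(1, 2), F(1, 4)), (F(1, 4), F(1)))` evaluates to `(F(3, 4), F(1, 2))`: the probabilities add, while the hidden parameters are averaged with weights 2/3 and 1/3.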

Another example of hidden parameters, interesting from a probabilistic point of view, is obtained when $q = \|q_{ij}\|$ runs over stochastic matrices. Now we can consider a random index $i$, $i = 1, 2, \ldots$, with distribution $(p_i, \|q_{km}\|)$. Thus, if the event $i$ occurs with probability $p_i$, then $q_{ij}$ is the probability of some event $S_j$. This duplicates the previous situation, except that the matrix multiplication admits more interpretations.

The problem of a general description of all mappings $\varphi$ and $\psi$ of the set $[0,1] \times A$ into $[0,1]$, or the full description of probabilities $[0,1]$ with hidden parameters from $[0,1]$, remains open.


(iii) As a prototype of a general construction of a probability set $\Phi$ with hidden parameters, we can consider the set of positive measures min(G) on some semigroup $G$ with the natural operations of addition and composition of measures.

Indeed, let $G$ be an arbitrary locally compact semigroup. Consider the set min(G) of all positive measures on $G$ with the weak topology. We can naturally define the operation of convolution (composition) "$*$" on min(G) as follows: for $\mu, \nu \in$ min(G) we set [3]

$\mu * \nu(B) = \mu \times \nu\{(x, y) : x \cdot y \in B,\ x, y \in G\},$ (8)

where $\mu \times \nu$ denotes the direct product of the measures $\mu$ and $\nu$ on $G$. Then min(G) is a semigroup with respect to convolution. Besides, the addition $(\mu + \nu)(B) = \mu(B) + \nu(B)$ and the multiplication by a positive number $\lambda$, $(\lambda\mu)(B) = \lambda\mu(B)$, are defined on min(G). Obviously, convolution is distributive over addition. Thus, the linear set min(G) is a convex semigroup with respect to convolution.
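For a finite semigroup the definition (8) reduces to a finite sum, which makes the stated properties easy to check by hand or by machine. The sketch below is illustrative only: the toy semigroup (the integers 0, 1, 2, 3 under multiplication modulo 4), the names `op`, `conv`, `add`, `mass` and the three sample measures are our assumptions, not taken from the text.

```python
from fractions import Fraction as F
from itertools import product

N = 4  # assumed toy semigroup: G = {0, 1, 2, 3} under multiplication mod 4


def op(x, y):
    """Semigroup operation on G."""
    return (x * y) % N


def conv(mu, nu):
    """Convolution (8) for measures given by their weights on points of G."""
    out = [F(0)] * N
    for x, y in product(range(N), repeat=2):
        out[op(x, y)] += mu[x] * nu[y]
    return tuple(out)


def add(mu, nu):
    """Pointwise addition of measures."""
    return tuple(a + b for a, b in zip(mu, nu))


def mass(mu):
    """Total mass mu(G)."""
    return sum(mu)


# Three sample positive measures (hypothetical values).
mu = (F(1, 8), F(1, 4), F(0), F(1, 8))
nu = (F(1, 2), F(0), F(1, 4), F(1, 4))
rho = (F(0), F(1, 3), F(1, 3), F(0))
```

With these definitions one can confirm that `conv` is associative, that it distributes over `add`, and that `mass(conv(mu, nu)) == mass(mu) * mass(nu)`; the last identity is what makes the probability measures closed under convolution.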

The set min(G) possesses almost all properties of a probability set with respect to these operations except one: there is no semigroup of units in min(G). But if we restrict min(G), we can obtain a convex semigroup possessing all properties of a probability set. To this end we consider the subset $\mathrm{min}_1(G)$ of min(G) consisting of all probability measures, i.e. the positive measures $\mu$ for which $\mu(G) = 1$. From (8) it follows that $\mathrm{min}_1(G)$ is a semigroup. Consider the convex closed semigroup $\mathrm{min}_{[0,1]}(G)$ consisting of all non-negative measures $\mu$ for which $0 \le \mu(G) \le 1$. It can be readily seen that the set $\mathrm{min}_{[0,1]}(G)$ with the operations of addition and composition satisfies all properties (3.1)-(6) of a probability set with the semigroup of units $\mathbf{e} = \mathrm{min}_1(G)$.

Each element $\mu$ of $\mathrm{min}_{[0,1]}(G)$ can obviously be represented in the form $p \cdot (\frac{1}{p}\mu)$, where $\mu(G) = p \in [0,1]$, $p \neq 0$, $\frac{1}{p}\mu \in \mathrm{min}_1(G)$. If $\mu$ and $\nu$ belong to $\mathrm{min}_{[0,1]}(G)$, with $\mu(G) = p$ and $\nu(G) = q$, then we have:

$\mu + \nu = p\left(\frac{1}{p}\mu\right) + q\left(\frac{1}{q}\nu\right) = (p + q)\left\{\frac{p}{p+q}\left(\frac{1}{p}\mu\right) + \frac{q}{p+q}\left(\frac{1}{q}\nu\right)\right\},$ (9)

$\mu * \nu = p\left(\frac{1}{p}\mu\right) * q\left(\frac{1}{q}\nu\right) = pq\left\{\left(\frac{1}{p}\mu\right) * \left(\frac{1}{q}\nu\right)\right\}.$ (10)

From (9) and (10) we obtain the following proposition.


Proposition 5 The convex semigroup $\mathrm{min}_{[0,1]}(G)$ and the set $\Phi\{\mathrm{min}_1(G)\}$ of elements $(p, \alpha)$, $p \in [0,1]$, $\alpha \in \mathrm{min}_1(G)$, with the operations (4), (5) are isomorphic.
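The correspondence behind Proposition 5 sends a measure $\mu$ with $\mu(G) = p > 0$ to the pair $(p, \frac{1}{p}\mu)$. As a hedged illustration, the following Python sketch checks on one example that this map carries addition and convolution of measures to the operations (4) and (5). The choice $G = \mathbb{Z}_3$ (written additively), the names `to_pair`, `pair_add`, `pair_mul` and the sample measures are our own assumptions.

```python
from fractions import Fraction as F

S = 3  # assumed example: G = Z_3, the cyclic group of order 3 (written additively)


def conv(mu, nu):
    """Convolution of measures on Z_3."""
    out = [F(0)] * S
    for x in range(S):
        for y in range(S):
            out[(x + y) % S] += mu[x] * nu[y]
    return tuple(out)


def to_pair(mu):
    """Map a measure with mu(G) = p > 0 to the pair (p, mu / p)."""
    p = sum(mu)
    if p == 0:
        raise ValueError("the decomposition needs mu(G) > 0")
    return (p, tuple(m / p for m in mu))


def pair_add(a, b):
    """Operation (4) on pairs (p, alpha)."""
    (p, alpha), (q, beta) = a, b
    s = p + q
    return (s, tuple((p * x + q * y) / s for x, y in zip(alpha, beta)))


def pair_mul(a, b):
    """Operation (5) on pairs, with alpha * beta taken to be convolution."""
    (p, alpha), (q, beta) = a, b
    return (p * q, conv(alpha, beta))


mu = (F(1, 4), F(1, 8), F(1, 8))  # total mass 1/2
nu = (F(0), F(1, 3), F(1, 6))     # total mass 1/2
mu_plus_nu = tuple(x + y for x, y in zip(mu, nu))
```

Here `to_pair(mu_plus_nu)` agrees with `pair_add(to_pair(mu), to_pair(nu))`, and `to_pair(conv(mu, nu))` agrees with `pair_mul(to_pair(mu), to_pair(nu))`, as (9) and (10) predict. Measures of zero mass are excluded from `to_pair`, in line with the condition $p \neq 0$.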

The probabilities $(p, \mu)$ can be interpreted similarly to item (ii) above. However, the multiplicative structure of the semigroup is rather more complicated. Consider an algebra of some events $\Gamma$. Suppose that each such event has a state, which can be represented by an element of a group $G$. Let the probabilities $(p_i, \mu_i)$, $\sum_i (p_i, \mu_i) = (1, \mu)$, assign a distribution on events $\Gamma_i \subset \Gamma$, $\Gamma_i \cap \Gamma_j = \emptyset$. Then the probability $(p_i, \mu_i)$ means the choice of an event $\Gamma_i$ with probability $p_i$ and the choice of a state $g \in G$ with distribution $\mu_i$.

Clearly, the addition and multiplication of these probabilities must be determined by a physical model obtained from experiment or from theory.

4 Probability sets with a single unit.

If a semigroup $G$ is finite, then $\mathrm{min}_{[0,1]}(G)$ is a convex set in Euclidean space. We will show that this convex set contains probability subsets with a single unit. A special two-dimensional case of such a probability set was presented in section 1.

(i) Let $G$ be a finite group (commutative or non-commutative) with elements $e_1, e_2, \ldots, e_s$, $s \ge 2$. Consider the group algebra $G(\mathbb{R})$, i.e. the linear space of linear forms $x_1 e_1 + \cdots + x_s e_s$, $x_j \in \mathbb{R}$, with the group multiplication of the basis elements $\{e_j\}$. Assume that the basis $\{e_j\}$ is orthonormal. Let $\mathrm{min}_1(G)$ be the simplex formed by the vertices $e_1, e_2, \ldots, e_s$, and let $\mathrm{min}_{[0,1]}(G)$ be the simplex formed by the vertices $0, e_1, e_2, \ldots, e_s$, see Fig. 2. Then a measure $\mu \in \mathrm{min}_{[0,1]}(G)$ can be written as $\mu = p_1 e_1 + \cdots + p_s e_s$, where $0 \le p_i \le 1$ and $\sum_i p_i \le 1$. The geometric center of $\mathrm{min}_1(G)$ is the invariant measure $n_G = \frac{1}{s} e_1 + \cdots + \frac{1}{s} e_s$. For any measure $\mu \in \mathrm{min}_{[0,1]}(G)$ we have:

$\mu n_G = n_G \mu = \mu(G)\, n_G.$ (11)

In the special case $\mu \in \mathrm{min}_1(G)$ we have $\mu n_G = n_G \mu = n_G$ and $n_G^2 = n_G$. Denote the line passing through the points $0$ and $n_G$ by $l$. Then, as can be seen from Fig. 2, $\mathrm{min}_1(G)$ is a part of the hyperplane orthogonal to the line $l$ and passing through the point $n_G$, and $\mathrm{min}_{[0,1]}(G)$ is the part of the positive orthant cut off by $\mathrm{min}_1(G)$.

Fig. 2

Fig. 2 corresponds to the case $s = 3$, when $G$ is the cyclic group of three elements. This case is of special interest, because the algebra $G(\mathbb{R})$ is isomorphic to the direct sum of the field of real numbers and the field of complex numbers [19]. Consider the cube $Q$ as shown in Fig. 2. The cube $Q$ consists of all measures $\mu = \sum_i p_i e_i$ for which $0 \le p_i \le \frac{1}{s}$.

Proposition 6 The set $Q$, considered as a subset of the convex semigroup $\mathrm{min}_{[0,1]}(G)$, is a probability set with a single zero $0$ and a single unit $n_G$.

Proof. Let us establish that the set $Q$ is a semigroup with respect to multiplication. Indeed, if $\mu = \sum_i p_i e_i$ and $\nu = \sum_j q_j e_j$ belong to $Q$, then $0 \le p_i \le \frac{1}{s}$, $0 \le q_j \le \frac{1}{s}$, and therefore we have

$\mu * \nu = \sum_{i,j} p_i q_j\, e_i e_j = \sum_k \Big( \sum_i p_i q_{i_k} \Big) e_k,$

where the indices $i_k = 1, 2, \ldots, s$ are defined uniquely for each $i$ and $k$ by the condition $e_i \cdot e_{i_k} = e_k$, $i, k = 1, 2, \ldots, s$. Since $G$ is a group, for any fixed $k$, $k = 1, 2, \ldots, s$, the indices $i_k$ run over $1, 2, \ldots, s$ when $i$ runs over $1, 2, \ldots, s$. Therefore, we have

$\sum_i p_i q_{i_k} \le \frac{1}{s} \sum_i q_{i_k} \le \frac{1}{s}.$


Now let us show that a complementary element $\bar{\mu}$ exists for each $\mu = p_1 e_1 + \cdots + p_s e_s \in Q$. By Definition 1 we must have $\mu + \bar{\mu} \in \mathbf{e}$. In our case we set $\mathbf{e} = n_G$. Then $\mu + \bar{\mu} = n_G$ and therefore $\bar{\mu} = n_G - \mu = (\frac{1}{s} - p_1) e_1 + \cdots + (\frac{1}{s} - p_s) e_s \in Q$, since $0 \le p_i \le \frac{1}{s}$, $i = 1, 2, \ldots, s$. Finally, let us check property (3.4). If $\mu \in Q$, $\mu \neq n_G$, then $\mu(G) = \lambda < 1$. Thus, by virtue of (11) we have $\mu n_G = n_G \mu = \mu(G)\, n_G = \lambda n_G$.

The remaining properties of Definition 1 for the set $Q$ follow straightforwardly from the properties of the probability set $\mathrm{min}_{[0,1]}(G)$.

Note that the Kolmogorov condition (7) holds in $Q$.
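The three facts used in the proof of Proposition 6 can be checked numerically in the case $s = 3$. The sketch below is an illustration under our own assumptions (the finite grid of points of $Q$ and the names `in_cube`, `complement`, `check_cube` are hypothetical): it verifies relation (11), the existence of complements inside $Q$, and the closure of $Q$ under convolution on a grid of 27 points of the cube.

```python
from fractions import Fraction as F
from itertools import product

S = 3                   # G = Z_3, the case shown in Fig. 2
N_G = (F(1, S),) * S    # the invariant measure n_G


def conv(mu, nu):
    """Convolution of measures on Z_3."""
    out = [F(0)] * S
    for x in range(S):
        for y in range(S):
            out[(x + y) % S] += mu[x] * nu[y]
    return tuple(out)


def in_cube(mu):
    """Membership in Q: every weight lies in [0, 1/s]."""
    return all(0 <= m <= F(1, S) for m in mu)


def complement(mu):
    """The complementary element n_G - mu."""
    return tuple(F(1, S) - m for m in mu)


# Hypothetical test grid inside Q: each weight is 0, 1/6 or 1/3.
levels = [F(k, 6) for k in range(3)]
cube = list(product(levels, repeat=S))


def check_cube():
    """Check (11), complements, and closure under convolution on the grid."""
    for mu in cube:
        scaled = tuple(sum(mu) * n for n in N_G)
        assert conv(mu, N_G) == conv(N_G, mu) == scaled
        assert in_cube(complement(mu))
        assert tuple(a + b for a, b in zip(mu, complement(mu))) == N_G
        for nu in cube:
            assert in_cube(conv(mu, nu))
    return True
```

A grid check of this kind is of course no substitute for the proof; it only confirms that the bound $\sum_i p_i q_{i_k} \le \frac{1}{s}$ behaves as claimed on sample points, including the vertices of the cube.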

(ii) It is possible to construct even more general probability sets with a single unit as subsets of $\mathrm{min}_{[0,1]}(G)$. For this purpose, we consider an arbitrary convex semigroup $S(G)$ in $\mathrm{min}_1(G)$ and the convex set $S_0(G)$ formed by zero (0) and the elements of the set $S(G)$. One can readily see that $S_0(G)$ also satisfies the properties of a probability set in which $S(G)$ is the set of units.

Now we consider the set $Q(S, G)$, which is the intersection of the set $S_0(G)$ and all half-spaces containing zero and bounded by hyperplanes parallel to the faces of $S_0(G)$ and passing through the point $n_G$.

Proposition 7 Let $S$ be an arbitrary convex semigroup in $\mathrm{min}_1(G)$, centrally symmetric with respect to the point $n_G$. Then $Q(S, G)$ is a probability set with a single zero and a single unit.

Proof. We shall show that $Q(S, G)$ is a semigroup with respect to convolution, and hence $Q(S, G)$, as a subset of $\mathrm{min}_{[0,1]}(G)$, is a probability set with a single unit $n_G$. First, note that, in view of the central symmetry of $S$ with respect to $n_G$, the intersection of any face of $S_0(G)$ with any hyperplane passing through the element $n_G$ and parallel to another face lies in the intersection of the faces of $S_0(G)$ and the hyperplane $h$ passing through $\frac{1}{2} n_G$ and perpendicular to the line $l$.

Fig. 3 shows a plane $\pi$ passing through the point $\mu_0 \in S_0(G)$ and the line $l$. The rhombus $0 A n_G B$ is the intersection of $Q(S, G)$ with this plane. Each element $\mu$ of this rhombus can be represented as $\mu = n_G - \lambda_1 \mu_1$, where $\mu_1 \in S(G)$, $0 \le \lambda_1 \le 1$. Similarly, for each other element $\nu$ of $Q(S, G)$ we also have $\nu = n_G - \lambda_2 \nu_1$, where $\nu_1 \in S(G)$, $0 \le \lambda_2 \le 1$.

Fig. 3

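Therefore the product $\mu\nu$ equals

$(n_G - \lambda_1 \mu_1)(n_G - \lambda_2 \nu_1) = n_G - \lambda_2 n_G \nu_1 - \lambda_1 \mu_1 n_G + \lambda_1 \lambda_2 \mu_1 \nu_1 = (1 - \lambda_1 - \lambda_2)\, n_G + \lambda_1 \lambda_2\, \mu_1 \nu_1.$ (12)

Let us show that the element (12) belongs to $Q(S, G)$. Consider first the case when either $\lambda_1$ or $\lambda_2$ is greater than $\frac{1}{2}$. Let, for example, $\lambda_1 > \frac{1}{2}$.

Then the point $\mu$ lies in the left-hand side of the rhombus and thus can be represented as $t\tilde{\mu}$, $\tilde{\mu} \in S(G)$, $t \le \frac{1}{2}$. On the other hand, we have $\nu = \tau \cdot \tilde{\nu}$ for $\nu \in Q(S, G)$, where $\tilde{\nu} \in S(G)$, $0 \le \tau \le 1$. Therefore, the product $\mu\nu$ is equal to $t\tau \cdot \tilde{\mu}\tilde{\nu}$, where $\tilde{\mu} \cdot \tilde{\nu} \in S(G)$ and $0 \le t\tau \le \frac{1}{2}$. Consequently, by the construction of $Q(S, G)$ the measure $\mu\nu$ lies to the left of the hyperplane $h$ (Fig. 3), and hence $\mu\nu \in Q(S, G)$.

Now consider the case $\lambda_1 \le \frac{1}{2}$, $\lambda_2 \le \frac{1}{2}$. Then $p = 1 - \lambda_1 - \lambda_2 \ge 0$ and $q = \lambda_1 \lambda_2 \ge 0$. We show that the inequality $p + 2q \le 1$ holds, which is equivalent to the inequality $\lambda_1 + \lambda_2 \ge 2\lambda_1\lambda_2$. Indeed, $(\lambda_1 - \lambda_2)^2 = \lambda_1^2 + \lambda_2^2 - 2\lambda_1\lambda_2 \ge 0$. Since $0 \le \lambda_1 \le 1$, $0 \le \lambda_2 \le 1$, we have $\lambda_1 + \lambda_2 - 2\lambda_1\lambda_2 \ge \lambda_1^2 + \lambda_2^2 - 2\lambda_1\lambda_2 \ge 0$. Whence $p + 2q \le 1$.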
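The inequality $p + 2q \le 1$ can also be read off from the identity $1 - (p + 2q) = \lambda_1(1 - \lambda_2) + \lambda_2(1 - \lambda_1)$, whose right-hand side is non-negative for $\lambda_1, \lambda_2 \in [0,1]$. The short Python sketch below checks this identity and the sign conditions on a rational grid; the function names and the grid are our own illustrative choices.

```python
from fractions import Fraction as F


def coefficients(l1, l2):
    """Coefficients p and q of n_G and mu_1 nu_1 in (12)."""
    p = 1 - l1 - l2
    q = l1 * l2
    return p, q


# Hypothetical test grid: lambda_1, lambda_2 in {0, 1/20, ..., 1/2}.
grid = [F(k, 20) for k in range(11)]


def check_inequality():
    """Check p >= 0, q >= 0 and p + 2q <= 1 on the grid, with the exact slack."""
    for l1 in grid:
        for l2 in grid:
            p, q = coefficients(l1, l2)
            assert p >= 0 and q >= 0
            assert p + 2 * q <= 1
            # Slack of the inequality, written as a sum of non-negative terms.
            assert 1 - (p + 2 * q) == l1 * (1 - l2) + l2 * (1 - l1)
    return True
```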

Thus, from (12) we have $\mu\nu = p\, n_G + q\, \mu_1 \nu_1$, $\mu_1, \nu_1 \in S(G)$, $p, q \ge 0$, $p + 2q \le 1$. We now show that the measure $m = p\, n_G + q\omega$ belongs to $Q(S, G)$ for any measure $\omega \in S(G)$.

Fig. 4 shows the plane passing through the points $0$, $\omega$ and $n_G$. The point $m = p\, n_G + q\omega$ lies on the line $l'$ parallel to $0\omega$ and passing through $p\, n_G$.

Now, to prove that $m$ belongs to $Q(S, G)$ it suffices to demonstrate that $|q\omega| \le |\Delta|$. By similarity of the triangles $0\, \omega\, n_G$ and $p n_G\, B\, n_G$ we have

$\frac{|2\Delta|}{|\omega|} = \frac{(1 - p)\,|n_G|}{|n_G|} = 1 - p.$

That is, $|\Delta| = \frac{1}{2}(1 - p)|\omega|$. Then

$\frac{|\Delta|}{|q\omega|} = \frac{\frac{1}{2}(1 - p)|\omega|}{q|\omega|} = \frac{1 - p}{2q} \ge 1$

follows from the inequality $p + 2q \le 1$.

Hypothesis: For arbitrary $S(G) \subset \mathrm{min}_1(G)$, the set $Q(S, G)$ as a subset of the convex semigroup $\mathrm{min}_{[0,1]}(G)$ is a probability set with a single zero $0$ and a single unit $n_G$.


We would like to note, in connection with the examples of section 1, that a general description of probability sets in topological semi-fields and in the field of p-adic numbers is of great interest for applications.

We hope that problems of the experimental determination of abstract probabilities will be considered in the continuation of this work.

5 Acknowledgments

In conclusion, I want to express my gratitude to A. Yu. Khrennikov (Vaxjo Univ., Sweden), Yu. V. Prokhorov, O. V. Viskov, I. V. Volovich (all of Steklov Mathematical Institute, Russia), V. Ja. Kozlov (Academy of Cryptography, Russia), V. I. Serdobolskii (Moscow Univ. of Electronics and Math., Russia), and A. K. Kwasniewski (Bialystok Univ., Institute of Computer Science, Poland) for discussions and for their advice on the foundations of probability theory and quantum mechanics. This investigation was supported by the grant of the Swedish Royal Academy of Sciences on collaboration with states of the former Soviet Union and by the Profile Mathematical Modeling of Vaxjo University.

References

1. A. N. Kolmogorov, Foundation of the probability theory (Chelsea Publ. Comp., New York, 1956).

2. T. L. Fine, Theories of probabilities, an examination of foundations (Academic Press, New York, 1973).

3. H. Heyer, Probability measures on locally compact groups (Springer-Verlag, Berlin-Heidelberg-New York, 1977).

4. Y. P. Studnev, TV and its applications 12, 727 (1967).

5. R. P. Feynman, Negative probability. Quantum implications. Essays in Honour of David Bohm, B. J. Hiley and F. D. Peat (Routledge and Kegan Paul, London, 1987).

6. P. Dirac, Rev. Mod. Phys. 17, 195 (1945).

7. O. G. Smolyanov and A. Y. Khrennikov, Dokl. Akademii Nauk USSR 281, 279 (1985).

8. V. S. Vladimirov, I. V. Volovich and E. I. Zelenov, p-adic analysis and mathematical physics (World Scientific Publ., Singapore, 1993).

9. A. Y. Khrennikov, Theor. and Math. Phys. 97, 348 (1993).

10. A. Y. Khrennikov, Doklady Mathematics 55, 402 (1997).

11. A. Y. Khrennikov, Mathematical and physical arguments for the change of Kolmogorov's axiomatics. Trends in Contemporary Inf. Dim. Analysis and Quantum Probability, N.1, 215-249 (2000).

12. L. Accardi, The probabilistic roots of the quantum mechanical paradoxes. The wave-particle dualism (D. Reidel Publ. Company, Dordrecht, 1958).

13. C. C. Chang, Transactions of the Amer. Math. Soc. 86, 467 (1958).

14. R. S. Grigolia, Algebraic analysis of Lukasiewicz-Tarski's n-valued logical systems. Selected papers on Lukasiewicz sentential calculi (PAN, Ossolineum, Poland, 1977).

15. T. A. Sarymsakov, Topological semi-fields and its applications (FAN, Tashkent, 1989).

16. T. A. Sarymsakov, Topological semi-fields and probability theory (FAN, Tashkent, 1969).

17. J. S. Bell, Rev. Mod. Phys. 38, 447 (1966).

18. A. Y. Khrennikov, Physics Letters A 200, 219 (1995).

19. B. L. van der Waerden, Algebra I, Achte Auflage der Modernen Algebra (Springer-Verlag, Berlin-Heidelberg-New York, 1977).


QUANTUM K-SYSTEMS AND THEIR ABELIAN MODELS

H. NARNHOFER
Institut für Theoretische Physik, Universität Wien
Boltzmanngasse 5, A-1090 Wien
E-mail: [email protected]

In this review the concept of quantum K-systems is studied, on one hand, based on a set of increasing algebras, on the other hand, with respect to entropy properties. We consider in examples how far it is possible to find abelian models.

1 Introduction

Classical ergodic theory is a powerful discipline both in mathematics and physics to analyze mixing properties of a given dynamics. Since in physics the mixing properties take place on the microscopic level that is controlled by quantum theory it is natural to try to translate the concepts of classical ergodic theory also into the quantum framework and to study how far these concepts can find their quantum counterpart and whether new features appear.

One possibility is the following: we start with a classical dynamical system, e.g. a free particle on a hyperbolic manifold with finite measure, and quantize the dynamics, i.e. study the properties of the Laplace-Beltrami operator on this manifold. Since the manifold has finite measure, the Laplace-Beltrami operator necessarily has discrete spectrum [1], and the classical mixing properties can only leave their footprints in the distribution of the eigenvalues at high energy [2,3]. Many deep results have been found on the basis of this approach. But in this review we will follow another line of thought.

We start with the classical dynamical system with optimal mixing properties, the Kolmogorov system [4,5,6]. It can be characterized either by its algebraic structure or by properties of its dynamical entropy. Both concepts find their counterpart in quantum systems [7], but they are no longer equivalent.

First we will give the definition of an algebraic K-system and some definitions of dynamical entropies. One of them relates the quantum system to classical K-systems that can be considered as models of the quantum system. Then we will give examples of algebraic quantum K-systems and will discuss how far they can be represented by classical models. Finally we will give examples of quantum K-systems for which no classical model exists and, on the other hand, a quantum dynamical model that allows the construction of a classical model, but for which the algebraic K-property so far cannot be controlled.


2 Classical K-System

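Let us repeat the characteristics of a classical dynamical system $(\mathcal{A}, \sigma, \mu)$, where we take $\mathcal{A}$ to be the abelian algebra built by the characteristic functions over a measure space with measure $\mu$, and $\sigma$ an automorphism over $\mathcal{A}$ with $\mu \circ \sigma = \mu$ [4,5,6].

Definition 2.1 We call $(\mathcal{A}, \mathcal{A}_0, \sigma, \mu)$ a K(olmogorov) system if:

$\mathcal{A}_0 \subset \mathcal{A}, \quad \sigma\mathcal{A}_0 \supset \mathcal{A}_0, \quad \bigvee_n \sigma^n \mathcal{A}_0 = \mathcal{A}, \quad \bigcap_n \sigma^{-n} \mathcal{A}_0 = \lambda 1.$ (2.1)

For a given classical dynamical system $(\mathcal{A}, \sigma, \mu)$ we can decide in several ways whether some $\mathcal{A}_0$ (which is not unique) exists such that $(\mathcal{A}, \mathcal{A}_0, \sigma, \mu)$ forms a K-system [5,6].

A) Choose some finite subalgebra $\mathcal{B} \subset \mathcal{A}$ (i.e. some finite partition of the measure space) and construct its past algebra $\mathcal{A}_0 = \bigvee_{n \in \mathbb{N}} \sigma^{-n} \mathcal{B}$. If $\mathcal{A}_0$ is a proper subalgebra of $\mathcal{A}$, it will increase in time. Check whether $\bigvee_n \sigma^n \mathcal{A}_0 = \mathcal{A}$; if not, $\mathcal{B}$ has to be increased. If $\mathcal{B}$ is large enough, check whether $\bigcap_n \sigma^{-n} \mathcal{A}_0 = \lambda 1$.

B) Consider the conditional entropy $H(\mathcal{B}|\mathcal{A}_0)$. If this expression is strictly positive $\forall \mathcal{B}$, then $(\mathcal{A}, \sigma, \mu)$ is a K-system.

C) If

$\lim_{n \to \infty} H(\sigma^n \mathcal{B}|\mathcal{A}_0) = H(\mathcal{B}) \quad \forall \mathcal{B},$ (2.2)

then $(\mathcal{A}, \sigma, \mu)$ is a K-system.

The classical K-system can also be characterized by its clustering properties: Let $(\mathcal{A}, \mathcal{A}_0, \sigma, \mu)$ be a K-system. Then to every $B \in \mathcal{A}$, $\varepsilon > 0$, $\exists\, n_0$ such that

$|\mu(B \sigma^{-n} A) - \mu(B)\mu(A)| \le \varepsilon \mu(A) \quad \forall A \in \mathcal{A}_0,\ n \ge n_0.$ (2.3)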

The prototype of a K-system is the Bernoulli shift (including the baker transformation): We regard the Bernoulli shift as an infinite tensor product $\mathcal{A} = \bigotimes_{l \in \mathbb{Z}} \mathcal{B}_l$, where $\mathcal{B}_l$ is isomorphic to a finite abelian algebra, $\mathcal{B}_l \approx \mathcal{B}_0 = \{P_1, \ldots, P_k\}$, with projections $P_i$ with expectation values $\mu_i$. The dynamics is given by the shift $\sigma$ over the tensor product. The state $\mu$ has to be translation invariant. It can be the tensor product of the local state, but we also allow spatial correlations. The dynamical entropy is given by

s u p t f l Q c / S I | J arB\ (2.4) " \t=0 r<-l+n J

= s u p i f f M J < r * B j (2.5)

and coincides with $H(\mathcal{B})$ if the state $\mu$ factorizes.

3 Algebraic Quantum K-Systems

It is obvious that one can adopt Definition 2.1 directly to define an algebraic quantum K-system. It is also obvious that the definition is not empty, because we can construct the quantum analogue of a Bernoulli shift by taking for $\mathcal{B}$ a nonabelian algebra, e.g. a full matrix algebra $M_{k \times k}$. In the following we will first discuss physical applications of this quantum Bernoulli shift and then turn to generalizations.

A. A model for Quantum Measurement

We start with a finite-dimensional algebra $\mathcal{B}$ and a state $\omega$ over $\mathcal{B}$. In order to determine $\omega$ we have to make many copies of $\omega$ and repeat a variety of measurements. The classical Bernoulli shift consists of projections, and every measurement gives as outcome 0 or 1 on these projections with probability corresponding to the state $\mu$. By repeated measurements we can determine $\mu$ with exponentially increasing security.

In the quantal situation a measurement corresponds to picking some abelian subalgebra $\mathcal{B}_0$ of $\mathcal{B}$, maximal abelian if the measurement is sharp, and again the outcome of the measurement will be 0 or 1 on the projections in $\mathcal{B}_0$. To determine the state $\omega$ we have to vary the measurements, respectively the algebras $\mathcal{B}_0$. Since the state space over $\mathcal{B}$ is compact, it suffices to vary over finitely many $\mathcal{B}_0$. Let $\omega(P_j) = p_j$ for $P_j \in \mathcal{B}_0$. To get security on the density distribution with respect to $\mathcal{B}_0$, the number of experiments has to be of the order $p_j(1 - p_j)/\varepsilon^2$. For the algebra $\mathcal{B}_0$ that commutes with the density matrix $\rho$ corresponding to $\omega$, the entropy $S(\rho|_{\mathcal{B}_0})$ is minimal and approximate security on the density distribution is reached with the smallest number of measurements. For other abelian subalgebras $\bar{\mathcal{B}}_0$ we are satisfied with less security,


we have just to be sure that p\e0 is more mixed than p\-go. With pj — UJ(PJ)

for Pj £ Bo and Jj- = u(Pj) for ~Fj e B0- The probability that the outcome of N measurements gives a probability qj > pj + e is

Nipj-pj-e)2

exp — . (3.1a P i ( l - P j )

This has to be compared with the security given by N measurements on B0

~Ne2

exp-^-p - r . (3.1b)

Therefore the number of experiments N necessary to control p\s0 is small compared to the number N that fixes p\g and at the same time p. If we interpret the entropy as a measure on the reliability of a sequence of measurements we see that it is not changed compared to the classical expression, i.e. the same order of experiments is necessary and therefore

$S(\rho) = S(\rho|_{\mathcal{B}_0}) = -\mathrm{Tr}\, \rho \ln \rho.$ (3.2)

Remark: In [8] it was questioned whether the Shannon information, respectively the von Neumann entropy (3.2), is the appropriate quantity. But in these considerations it was not taken into account that measurements on different abelian subalgebras are correlated. We have incorporated these correlations by taking into account the varying necessary accuracy, and in this way obtained the desired result.
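The statement that the entropy of the restricted state is minimal for the abelian subalgebra commuting with $\rho$, and then equals (3.2), can be illustrated in the simplest case of a real symmetric $2 \times 2$ density matrix. The following Python sketch is our own illustration, not part of the original text: the parametrization of $\rho$ by its entries `a` and `c`, the rotation angle `theta` labelling the measurement basis, and all function names are assumptions.

```python
import math


def shannon(probs):
    """Shannon entropy -sum p ln p of a finite probability vector."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def von_neumann_entropy(a, c):
    """S(rho) for rho = [[a, c], [c, 1 - a]] via its eigenvalues 1/2 +- r."""
    r = math.hypot(a - 0.5, c)
    return shannon([0.5 + r, 0.5 - r])


def restricted_entropy(a, c, theta):
    """Entropy of rho restricted to the abelian algebra of the rotated basis.

    The basis vectors are (cos t, sin t) and (-sin t, cos t) with t = theta.
    """
    co, si = math.cos(theta), math.sin(theta)
    p = a * co * co + 2 * c * si * co + (1 - a) * si * si
    return shannon([p, 1 - p])


def eigenbasis_angle(a, c):
    """Rotation angle of the eigenbasis of rho (the commuting subalgebra)."""
    return 0.5 * math.atan2(2 * c, 2 * a - 1)
```

For example, with `a = 0.8` and `c = 0.2` the value `restricted_entropy(a, c, theta)` stays above `von_neumann_entropy(a, c)` for every angle, and the two agree (up to rounding) at `theta = eigenbasis_angle(a, c)`.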

B. Lattice Systems

Again we choose a matrix algebra $\mathcal{B}$ and define $\mathcal{A} = \bigotimes_{n \in \mathbb{Z}} \mathcal{B}_n$ as before. But now the algebra describes particles on a lattice (one-dimensional for $n \in \mathbb{Z}$), the shift corresponds to space translation, and the translation invariant state describes the system in e.g. the ground state or an equilibrium state with respect to some Hamiltonian, e.g. the Heisenberg ferromagnet. Therefore in general the state will not factorize but will be obtained as [9]

$\omega(A) = \lim_{\Lambda \to \mathbb{Z}} \frac{\mathrm{Tr}\, e^{-\beta H_\Lambda} A}{\mathrm{Tr}\, e^{-\beta H_\Lambda}}.$ (3.3)

We assume that the sequence of local Hamiltonians $H_\Lambda$ determines a time automorphism on the algebra that commutes with space translation. We can assume that $\omega(A)$ is space translation invariant. In order to have an algebraic K-system on the von Neumann level (in the weak topology), it is necessary that the state is extremal space translation invariant. This can be achieved, if necessary, by a unique decomposition as in the classical situation [9].


C. Fermi Systems

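We consider the CAR algebra $\mathcal{A}\{a(f), a^\dagger(g)\}$ either over $\ell^2(\mathbb{Z})$ or over $L^2(\mathbb{R})$. The shift defines an automorphism over $\mathcal{A}$ and the K-property is satisfied with $\mathcal{A}_0 = \{a(f), a^\dagger(f);\ \mathrm{supp}\, f \in \mathbb{Z}^- \text{ or } \mathbb{R}^-\}$. This is not a Bernoulli K-system, because creation and annihilation operators anticommute.

D. Quantum Stationary Markov Processes

Another example [10] of a K-system is provided by stationary Markov chains. Many variations of the definition of such a Markov chain exist. We give an explicit example that again cannot be imbedded into a Bernoulli system.

Let $\mathcal{A}_0$ be a $2 \times 2$ matrix algebra and $\mathcal{C} = \bigotimes_{n \in \mathbb{Z}} \mathcal{C}_n$ a Bernoulli system, $\mathcal{C}_n$ again a $2 \times 2$ matrix algebra. Define the map $T_1 : \mathcal{A}_0 \otimes 1 \to \mathcal{A}_0 \otimes \mathcal{C}_1$ by

$T_1(\sigma_x \otimes 1) = \sigma_x \otimes \sigma_x$

$T_1(\sigma_y \otimes 1) = \sigma_x \otimes \sigma_y$ (3.4)

$T_1(\sigma_z \otimes 1) = 1 \otimes \sigma_z.$

On $\mathcal{C}$ we consider the shift $\tau$ and a $\tau$-invariant state $\omega$. Therefore we can define

$T = (T_1 \otimes \mathrm{id}_{\mathcal{C}_1}) \circ (\mathrm{id}_{\mathcal{A}} \otimes \tau).$ (3.5)

Then $\mathcal{A}_{[m,n]} = \bigvee_{m \le k \le n} T^k(\mathcal{A}_0)$ and $(\mathcal{A}_{[-\infty,\infty]}, \mathcal{A}_{[-\infty,0]}, T, \varphi \otimes \omega)$ define a K-system for arbitrary states $\varphi$ over $\mathcal{A}_0$.

It can easily be seen that, though $\mathcal{A}_{[-\infty,\infty]}$ can be imbedded in $\mathcal{A} \otimes \mathcal{C}$, the automorphism $T$ is not asymptotically abelian:

$[T^n(\sigma_x \otimes 1), \sigma_z \otimes 1] = i\sigma_y \otimes \sigma_x \cdots \sigma_x.$ (3.6)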

E. Price-Powers Shift

Another illustrative example of a quantum K-system is the Price-Powers shift [11].

Let $e_i$ be a unitary satisfying $e_i^2 = 1$. Let

$e_i e_k = (-1)^{g(i-k)} e_k e_i$ with $g(i - k) \in \{0, 1\}$. (3.7)

Let $\sigma e_k = e_{k+1}$. Then

$(\mathcal{A}_{g,0} = \{e_i,\ i \le 0\},\ \mathcal{A}_g = \{e_i,\ i \in \mathbb{Z}\},\ \sigma,\ \tau)$

form an algebraic K-system, where $\tau$ is the tracial state

$\tau(e_I) = \delta_{I,\emptyset}$ with $e_I = \prod_{i_1, \ldots, i_k \in I} e_{i_1} \cdots e_{i_k}.$ (3.8)

Special examples are

a) $g(1) = 1$, $g(k) = 0$ otherwise: Then the algebra coincides with $\bigotimes M_{2 \times 2}$, where

$e_{2k} = \sigma_z \otimes \sigma_z \in M_{k+1} \otimes M_k$

$e_{2k+1} = 1 \otimes \sigma_x \in M_k.$

b) $g(i) = 1$ $\forall i$. Then the algebra coincides with the CAR algebra on $\mathbb{Z}$:

$e_i = a_i + a_i^\dagger.$
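The relations (3.7) determine the product of any two ordered words $e_I$ up to a sign, and this sign can be computed mechanically. The following Python sketch is an illustration under our own assumptions: words are stored as a sign together with an increasing tuple of indices, the function $g$ is applied to $|i - k|$, and the names `g_nearest`, `g_car`, `multiply`, `trace` and `check_associativity` are hypothetical. It implements the product for the two special cases a) and b) and checks associativity on all words in four generators.

```python
from itertools import combinations, product


def g_nearest(k):
    """Case a): only neighbouring generators anticommute."""
    return 1 if k == 1 else 0


def g_car(k):
    """Case b): all distinct generators anticommute."""
    return 1


def times_generator(sign, word, j, g):
    """Right-multiply sign * e_word by e_j and restore increasing order."""
    word = list(word)
    pos = len(word)
    # Move e_j to the left past every generator with a larger index.
    while pos > 0 and word[pos - 1] > j:
        if g(abs(word[pos - 1] - j)) == 1:
            sign = -sign
        pos -= 1
    if pos > 0 and word[pos - 1] == j:
        del word[pos - 1]  # e_j * e_j = 1
    else:
        word.insert(pos, j)
    return sign, tuple(word)


def multiply(a, b, g):
    """Product of two signed ordered words under the relations (3.7)."""
    sign, word = a
    sign *= b[0]
    for j in b[1]:
        sign, word = times_generator(sign, word, j, g)
    return sign, word


def trace(a):
    """Tracial state (3.8): nonzero only on multiples of the identity."""
    sign, word = a
    return sign if word == () else 0


def check_associativity(g, indices=range(4)):
    """Check (ab)c == a(bc) for all ordered words in the given generators."""
    words = [(1, w) for r in range(len(indices) + 1)
             for w in combinations(indices, r)]
    for a, b, c in product(words, repeat=3):
        assert multiply(multiply(a, b, g), c, g) == multiply(a, multiply(b, c, g), g)
    return True
```

In case b), for instance, `multiply((1, (2,)), (1, (1,)), g_car)` returns `(-1, (1, 2))`, reflecting $e_2 e_1 = -e_1 e_2$, whereas in case a) the generators $e_1$ and $e_3$ commute.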

Other explicit examples can be found in [12].

In all these examples (A - E) we inherit from the classical theory the following

Theorem: Let (A, Ao, cr, u) be a K-system and u an extremal translation-ally invariant state. (That is equivalent that f)(j~nAo = Al in the strong topology.) Then to every A, e 3 no such that

\oj(Aa-nB) - U(A)OJ(B)\ < e\\B\\ \/n>n0, B e A0. (3.9)

Therefore we have the same clustering properties as in (2.3).

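Proof: If $\omega$ is the tracial state, $\tau(AB) = \tau(BA)$, then in the GNS representation

$\omega(B) = \langle\Omega|\pi(B)|\Omega\rangle.$

$\pi(\mathcal{A}_0)$ defines a projection operator $P_0$, $P_0\mathcal{H} = \pi(\mathcal{A}_0)\Omega$, that is increasing, respectively decreasing, under $\sigma^n$. We have

$\omega(A\sigma^{-n}B) = \omega(A\,\sigma^{-n}P_0\,\sigma^{-n}B)$

and

$\text{st-}\lim_{n\to\infty}\sigma^nP_0 = 1,\qquad\text{st-}\lim_{n\to\infty}\sigma^{-n}P_0 = |\Omega\rangle\langle\Omega|.$ (3.10)

If $\omega$ is not the tracial state but a KMS state, it cannot be excluded that $\Omega$ is cyclic not only for $\pi(\mathcal{A})''$ but also for $\pi(\mathcal{A}_0)''$. But in this case the modular operator $\Delta_0$ corresponding to $\pi(\mathcal{A}_0)''$ can replace $P_0$ for controlling the cluster properties and satisfies [13]

$\text{st-}\lim_{n\to\infty}\sigma^{-n}\frac{1}{\Delta_0^{1/2}+1} = \frac12|\Omega\rangle\langle\Omega|.$ (3.11)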

4 Dynamical Entropy

The dynamical entropy of classical ergodic theory can be interpreted in two different ways:

If we use the definition

$h(\sigma) = \sup_{\mathcal{B}}H(\sigma,\mathcal{B}) = \sup_{\mathcal{B}}H\bigl(\mathcal{B}\,\big|\,\bigcup_{n\ge 1}\sigma^{-n}\mathcal{B}\bigr),$ (4.1)

then it measures how the algebraic K-system increases and how, in the course of time, our information on the complete system increases.

If we concentrate on the fact that

$\lim_{k\to\infty}H\bigl(\sigma^k\mathcal{B}\,\big|\,\bigcup_{n\ge 1}\sigma^{-n}\mathcal{B}\bigr) = H(\mathcal{B}),$ (4.2)

it describes that the remote past becomes more and more irrelevant for the present. Both properties can inspire us to look for an appropriate definition of a dynamical entropy for a quantum dynamical system.

a) For an algebraic K-system we can just copy the definition of a classical K-system.

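Definition: Let $\mathcal{A},\mathcal{B}\subset\mathcal{M}$ be two subalgebras and $\omega$ a state over $\mathcal{M}$. With $S(\omega|\varphi)$ the relative entropy, we define the conditional entropy $H(\mathcal{A}|\mathcal{B})$ by

$H_\omega(\mathcal{A}|\mathcal{B}) = \sup_{\omega=\sum_i\omega_i}\sum_i\bigl(S(\omega|\omega_i)_{\mathcal{A}} - S(\omega|\omega_i)_{\mathcal{B}}\bigr).$ (4.3)

Evidently $H(\mathcal{A}|\mathcal{B})\ge 0$. By monotonicity of the relative entropy, $H(\mathcal{A}|\mathcal{B}) = 0$ if $\mathcal{A}\subset\mathcal{B}$.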

Let $(\mathcal{A},\mathcal{A}_0,\sigma,\omega)$ be an algebraic K-system. Then $H_\omega(\sigma\mathcal{A}_0|\mathcal{A}_0)$ measures how fast $\mathcal{A}_0$ is increasing. The above expression has not been much investigated. The main reason is that for a given quantum dynamical system, in contrast to the classical situation, no strategy is known to decide whether an $\mathcal{A}_0$ with the desired properties exists. If it exists, there is no reason to assume that it is unique. In the classical situation the dynamical entropy does not depend on the special choice of $\mathcal{A}_0$. In a quantum system, due to the lack of a constructive approach to $\mathcal{A}_0$, we also have no chance to compare $H(\sigma\mathcal{A}_0|\mathcal{A}_0)$ for different past algebras $\mathcal{A}_0$.

There exists also another characterization for the amount of increase:

For $\mathcal{A}\supset\mathcal{A}_0$, both type II$_1$ algebras, define $P_0$ as the projector onto $\mathcal{A}_0\Omega$ in the GNS representation of the tracial state over $\mathcal{A}$, $P_0\in\pi(\mathcal{A}_0)'$. Then [14]

$[\mathcal{A}:\mathcal{A}_0] = \tau(P_0)^{-1},$ (4.4)

with $\tau$ the trace over $\pi(\mathcal{A}_0)'$.

This definition has been generalized to type III algebras by [15]. Note that it is not state dependent. As a typical example it can be evaluated for the Price-Powers shift: both (4.3) and (4.4) are independent of the sequence $\{g\}$ and give $\ln 2$ resp. $2$. But it should be noted that in general there exists only an order relation [16]

$H(\sigma\mathcal{A}_0|\mathcal{A}_0)\le 2\log[\sigma\mathcal{A}_0:\mathcal{A}_0].$

b) The main obstacle to using (4.3) or (4.4) as a definition of the dynamical entropy is that, for noncommutative algebras, $\bigcup_{n\ge 1}\sigma^{-n}\mathcal{B}$ will in general increase in a way that can hardly be controlled.

An illustrative example is given by the following observation [17].

Take $\mathcal{A} = \{a(f), a^\dagger(g);\ f,g\in L^2(\mathbb{R}),\ \sigma\}$ with $\sigma$ the space translation. We know already that it corresponds to a K-system with $\mathcal{A}_0 = \{a(f), a^\dagger(g);\ f,g\in L^2(\mathbb{R}^-)\}$. But if we pick $a(e^{-x^2})$ and construct the algebra $\mathcal{A}^0 = \{a(e^{-(x-\alpha)^2}),\ \alpha>0\}$, then $\mathcal{A}^0$ coincides with $\mathcal{A}$: if it did not, we could find some $f$ with $\langle f|e^{-(x-\alpha)^2}\rangle = 0\ \forall\alpha>0$, and this is impossible due to the analyticity properties of the Gauss function.

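Due to this fact, Voiculescu [18] proposed the following definition of a dynamical entropy.

Definition: Let $\mathcal{M}$ be a hyperfinite von Neumann algebra with a faithful normal trace $\tau$. Let $Pf(\mathcal{M})$ be the family of finite subsets of $\mathcal{M}$. Let $\mathcal{X}\subset\mathcal{M}$. We write

$\omega\subset_\delta\mathcal{X}$

if for every $x\in\omega$ there exists a $y\in\mathcal{X}$ such that

$\tau((x-y)(x-y)^*) < \delta.$ (4.5)

Let $\mathcal{F}(\mathcal{M})$ be the family of finite dimensional C* subalgebras of $\mathcal{M}$. Then

$r_\tau(\omega,\delta) = \inf\{\operatorname{rank}\mathcal{A} : \mathcal{A}\in\mathcal{F}(\mathcal{M}),\ \omega\subset_\delta\mathcal{A}\}$ (4.6)

$ha_\tau(\sigma,\omega,\delta) = \limsup_{n\to\infty}\frac1n\log r_\tau\Bigl(\bigcup_{j=0}^{n-1}\sigma^j(\omega),\delta\Bigr)$

$ha_\tau(\sigma,\omega) = \sup_{\delta>0}ha_\tau(\sigma,\omega,\delta)$

$ha_\tau(\sigma) = \sup\{ha_\tau(\sigma,\omega) : \omega\in Pf(\mathcal{M})\}.$ (4.7)

The notation stands for approximation entropy of $\sigma$.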

The above definition allows many variations. For instance, the lim sup can be replaced by a lim inf, and we can hope, but it is not proven, that this does not change the definition.

New information can be gained if we change the approximation conditions (4.5).

The topological entropy uses the approximation in norm. But to keep generality we cannot assume that the full matrix algebra belongs to $\mathcal{A}$. Concentrating on nuclear C* algebras we have to approximate via completely positive maps $(\varphi,\psi,\mathcal{B})$, with $\mathcal{B}$ a finite dimensional algebra, $\varphi : \mathcal{M}\to\mathcal{B}$ and $\psi : \mathcal{B}\to\mathcal{M}$, such that

$\|\psi\circ\varphi(a) - a\| < \delta\quad\forall a\in\omega.$ (4.8)

$ha_t(\sigma)$ is defined as $ha_\tau$, only under the new approximation condition. If $\mathcal{M}$ is an AF-algebra and therefore possesses a tracial state, then the topological entropy dominates the approximation entropy

$ha_\tau(\sigma)\le ha_t(\sigma).$ (4.9)

As another possibility we can approximate $\psi\circ\varphi(a) - a$ in the strong topology in a given representation corresponding to a state $\phi$ and replace the rank of the best algebra $\mathcal{A}$ by the entropy [19]

$S(\phi\circ\psi).$

All these definitions satisfy the requirement that they coincide with the usual definitions (state dependent dynamical entropy or topological entropy) if we apply them to commutative algebras.

Let us finally remark that, applied to the Price-Powers shift, again independent of $\{g\}$ (3.7),

$ha_\tau(\sigma) = ha_t(\sigma) = ht_\phi(\sigma) = \frac12H(\mathcal{A}_0|\sigma^{-1}\mathcal{A}_0).$ (4.10)

For further studies we refer to (Størmer, Choda, Dykema) [20, 21, 22].

c) An approach that differs very much from the mathematically motivated definition of Voiculescu is offered by Alicki and Fannes [23]. It is motivated by the concrete method by which we are able to determine the state of a system experimentally: we perform a measurement and repeat the measurement in the course of time. Here we use the idea of the history of a system as discussed e.g. in [24, 25].

A single measurement corresponds to a partition of unity

$\sum_{i=0}^{k-1}x_i^*x_i = 1.$ (4.11)

In fact, we may think of the $x_i$ as commuting selfadjoint projection operators. But by time evolution this commutativity is destroyed anyhow, and also for the necessary estimates it is preferable to consider this generalized partition of unity without further restrictions on the $x_i$. Repetition of the measurement corresponds to a composed partition

$X = (x_0,\dots,x_{k-1})$

$\sigma X = (\sigma x_0,\dots,\sigma x_{k-1})$

$\sigma X\circ X = (\dots,\sigma x_i\cdot x_j,\dots),$

i.e. a partition of size $k^2$.

$(M_X)_{ij} = \langle\Omega|x_i^*x_j|\Omega\rangle$

defines a density matrix of dimension $k$ with entropy

$H(X) = S(M_X).$ (4.12)
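The construction of (4.11) and (4.12) can be tried out in a matrix algebra. The sketch below is an illustration under stated assumptions: the state is given by a density matrix `rho` (so that $\omega(a) = \mathrm{Tr}\,\rho a$), the dynamics `sigma` is any map on matrices supplied by the caller, and all function names are hypothetical. It builds the matrix with entries $\omega(x_i^*x_j)$, its entropy, and the composed partition $\sigma X\circ X$.

```python
import numpy as np


def is_partition_of_unity(xs, tol=1e-12):
    """Check sum_i x_i^* x_i = 1 as in (4.11)."""
    dim = xs[0].shape[0]
    total = sum(x.conj().T @ x for x in xs)
    return np.allclose(total, np.eye(dim), atol=tol)


def refine(xs, sigma=lambda a: a):
    """Composed partition (sigma x_i . x_j) of size k^2."""
    return [sigma(xi) @ xj for xi in xs for xj in xs]


def partition_matrix(xs, rho):
    """Matrix with entries omega(x_i^* x_j), where omega(a) = Tr(rho a)."""
    k = len(xs)
    m = np.empty((k, k), dtype=complex)
    for i, xi in enumerate(xs):
        for j, xj in enumerate(xs):
            m[i, j] = np.trace(rho @ xi.conj().T @ xj)
    return m


def entropy(m):
    """Von Neumann entropy -Tr m ln m of a (numerically) Hermitian matrix."""
    ev = np.linalg.eigvalsh((m + m.conj().T) / 2)
    ev = ev[ev > 1e-12]
    return float(-np.sum(ev * np.log(ev)))
```

For a qubit in the tracial state with the two diagonal projections as partition, the entropy is $\ln 2$; composing once with conjugation by a Hadamard matrix as `sigma` yields a partition of size four whose matrix is a multiple of the identity, so the entropy doubles.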

As dynamical entropy $h(X)$ we define

$h(X) = \limsup_m\frac1m H(\sigma^{m-1}X\circ\cdots\circ\sigma X\circ X)$

$\phantom{h(X)} = \limsup_m\frac1m S(M_{\sigma^{m-1}X\circ\cdots\circ\sigma X\circ X})$

$h(\sigma) = \sup_{X\subset\mathcal{B}}h(X).$ (4.13)

But here a problem arises: if we do not restrict the partitions to a suitable subalgebra $\mathcal{B}$ of the algebra $\mathcal{A}$, we lose control of the dynamical entropy. For instance, if we take as C*-algebra the Cuntz algebra [9] with $U_i^*U_j = \delta_{ij}$ and $U_jU_j^* = P_j$ and use the $\{U_i\}$ for $x_i$, then the identity map has infinite dynamical entropy. If, for instance, we consider the shift on the lattice system B), then we can choose as natural subalgebra $\mathcal{B}$, dense in $\mathcal{A}$, the algebra of strictly local operators. Some weakening of this restriction is possible, and of course necessary if we want to apply the theory to time evolution with interaction, where local operators immediately delocalize. But this delocalization decreases exponentially fast in space [26]; therefore $\mathcal{B}$ consisting of exponentially localized operators should be sufficient to define a dynamical entropy for time evolution in the sense of Alicki and Fannes. As an example we consider the shift on the lattice. Then

$h_{AF}(\sigma)_\omega = s(\omega) + \ln d,$ (4.14)

where $s(\omega)$ is the entropy density corresponding to the state $\omega$ and $d$ is the dimension of the full matrix algebra at each lattice point.

d) As the last proposal for the definition of a dynamical entropy we describe the one which, in fact, has the longest history: First it was proposed by Connes and Størmer for type II algebras [27] and then generalized in [28] and [29] to general situations. We present the definition given by Sauvageot and Thouvenot [30], which they showed to be equivalent to the ones in [27] and [29] for hyperfinite algebras. In their definition it is most evident that this dynamical entropy measures how far the quantum system is related to a classical K-system. In addition, concepts developed in this framework also find their application in quantum information theory.

Definition: The entropy defect of an abelian model. Let $(\mathcal{A},\omega)$ be a nonabelian algebra with state $\omega$. Let $(\mathcal{B},\mu)$ be an abelian algebra with state $\mu$ that is coupled to $\mathcal{A}$ by a state $\lambda$ over $\mathcal{A}\otimes\mathcal{B}$, satisfying $\lambda|_{\mathcal{A}} = \omega$, $\lambda|_{\mathcal{B}} = \mu$. Its entropy defect is defined as

$H_\lambda(\mathcal{B}|\mathcal{A}) = H_\mu(\mathcal{B}) - S(\omega\otimes\mu|\lambda)_{\mathcal{A}\otimes\mathcal{B}}.$ (4.15)

Theorem: The entropy of the state $\omega$ is given as

$S_{\mathcal{A}}(\omega) = \sup\,[H_\mu(\mathcal{B}) - H_\lambda(\mathcal{B}|\mathcal{A})].$ (4.16)

In fact, there exist many abelian models that optimize the above expression: every decomposition of $\omega$ into pure states, $\omega = \sum_{i=1}^n\mu_i\omega_i$, can be interpreted as an abelian model with $\mathcal{B} = \{P_1,\dots,P_n\}$ and $\mu(P_i) = \mu_i$, $\lambda(P_i\otimes A) = \mu_i\omega_i(A)$.
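The statement that every pure-state decomposition is optimal can be checked numerically for a matrix algebra. The sketch below assumes that, for the coupling $\lambda(P_i\otimes A) = \mu_i\omega_i(A)$, the relative entropy in (4.15) reduces to $\sum_i\mu_i\,\mathrm{Tr}\,\omega_i(\ln\omega_i - \ln\omega)$; this is an assumption about the convention, and for pure $\omega_i$ it equals $-\sum_i\mu_i\langle\psi_i|\ln\omega|\psi_i\rangle$. The density matrix must be invertible for the matrix logarithm used here. All names are illustrative.

```python
import numpy as np


def von_neumann_entropy(rho):
    """S(rho) = -Tr rho ln rho."""
    ev = np.linalg.eigvalsh(rho)
    ev = ev[ev > 1e-12]
    return float(-np.sum(ev * np.log(ev)))


def shannon_entropy(mu):
    """H(mu) = -sum_i mu_i ln mu_i, the entropy of the abelian algebra B."""
    mu = np.asarray(mu, dtype=float)
    mu = mu[mu > 0]
    return float(-np.sum(mu * np.log(mu)))


def mixture(mu, kets):
    """Density matrix sum_i mu_i |psi_i><psi_i|."""
    return sum(m * np.outer(k, k.conj()) for m, k in zip(mu, kets))


def relative_entropy_term(mu, kets, rho):
    """sum_i mu_i Tr w_i (ln w_i - ln rho) for pure w_i (assumed convention)."""
    ev, vec = np.linalg.eigh(rho)
    ln_rho = (vec * np.log(ev)) @ vec.conj().T
    return float(sum(-m * np.vdot(k, ln_rho @ k).real for m, k in zip(mu, kets)))


def entropy_defect(mu, kets):
    """H_mu(B) minus the relative entropy term, in the spirit of (4.15)."""
    rho = mixture(mu, kets)
    return shannon_entropy(mu) - relative_entropy_term(mu, kets, rho)
```

For non-orthogonal pure states the entropy defect is strictly positive, yet the difference $H_\mu(\mathcal{B}) - H_\lambda(\mathcal{B}|\mathcal{A})$ still reproduces the von Neumann entropy, in line with the remark above.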

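Due to quantum effects the entropy is not monotonically increasing if we consider an increasing sequence $\mathcal{A}_n\subset\mathcal{A}_m$, $n<m$. But monotonicity can be regained if we change the definition to

Definition: Let $\mathcal{A}\subset\mathcal{C}$ and $(\mathcal{B},\mu,\lambda)$ be an abelian model for $(\mathcal{C},\omega)$. Then

$H_{\omega,\mathcal{C}}(\mathcal{A}) = \sup_{(\mathcal{B},\mu,\lambda)}[H_\mu(\mathcal{B}) - H_\lambda(\mathcal{B}|\mathcal{A})].$ (4.17)

This suggests the definition of a dynamical entropy:

Definition: Let $(\mathcal{A},\sigma,\omega)$ be a quantum dynamical system. The dynamical entropy is given by

$h_\omega(\sigma) = \sup\,[h_\mu(P|P_-) - H(P|P_-\otimes\mathcal{A})]$ (4.18)

where the supremum is taken over all dynamical abelian models $(\mathcal{B},\mu,\Theta)$ with $\mu\circ\Theta = \mu$ and coupling $\lambda\circ(\Theta\otimes\sigma) = \lambda$, $\lambda|_{\mathcal{A}} = \omega$, $\lambda|_{\mathcal{B}} = \mu$. Here $P_- = \bigcup_{n=1}^{\infty}\Theta^{-n}P$ is the past algebra of the partition $P$.

Remark: There holds equality between $h_\omega(\sigma)$ and

$\sup\,[h_\mu(P|P_-) - H(P|\mathcal{A})].$ (4.19)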

This is based on considering

$H(P|P_-) = \lim_{n\to\infty}\frac1n H\Bigl(\bigvee_{k=0}^{n-1}\Theta^kP\Bigr)$

$H(P|P_-\otimes\mathcal{A}) = \lim_{n\to\infty}\frac1n H\Bigl(\bigvee_{k=0}^{n-1}\Theta^kP\,\Big|\,\mathcal{A}\Bigr)$

and taking $\bigvee\Theta^kP$ as a new abelian model.

It is evident that one can also define the dynamical entropy with respect to a subalgebra $\mathcal{C}\subset\mathcal{A}$,

$h_\omega(\sigma,\mathcal{C}) = \sup\,[h_\mu(P|P_-) - H(P|P_-\otimes\mathcal{C})],$ (4.20)

an expression that we need if we want to discuss 2.C) in the framework of quantum systems. Notice that (4.19) cannot be replaced in general by an expression like (4.18).

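The main task now is to find abelian models. This can be done very similarly to calculating the entropy of a state.

Theorem: Assume a state $\omega$ is decomposed as

$\omega = \sum_{i_1,\dots,i_n}\mu_{i_1,\dots,i_n}\omega_{i_1,\dots,i_n}.$ (4.21)

Define

$\mu^k_{i_k} = \sum_{i_\ell,\ \ell\ne k}\mu_{i_1,\dots,i_n},\qquad\omega^k_{i_k} = \frac{1}{\mu^k_{i_k}}\sum_{i_\ell,\ \ell\ne k}\mu_{i_1,\dots,i_n}\omega_{i_1,\dots,i_n}.$

Consider

$H(\mathcal{C},\sigma\mathcal{C},\dots,\sigma^{n-1}\mathcal{C}) =: S(\mu) - \sum_kS(\mu^k) + \sum_k\sum_{i_k}\mu^k_{i_k}S(\omega|\omega^k_{i_k})_{\sigma^{k-1}\mathcal{C}}.$ (4.22)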

Consider now the decomposition

w = ^ p y 51 E ' 1 " - i " W i ' - - i « ^ ' = S/*i„...,i.w<i....,<-- (4.23) r = - *

In the limit $\lim_{r\to\infty}\lim_{n\to\infty}$ (i.e. we have to start with a sufficiently large decomposition) the $\{\mu_{i_k}\}$ converge to an abelian model, and all abelian models can be obtained in this way. The detailed proof of this statement can be found in [30].

This theorem enables us to find lower bounds for the dynamical entropy. Together with the fact that

1 H(C,aC,...,ak-lC) < \ SU(C) + 0(8), (4.24)

if C C C in the sense of (4.5) or (4.8), we also have the upper bound29

$h(\sigma)\le\sup_{\mathcal{C}}\lim_k\frac1k H(\mathcal{C},\dots,\sigma^{k-1}\mathcal{C})$ (4.25)

so that in some cases we can really evaluate the dynamical entropy.

5 Some General Considerations on Abelian Models

As we already mentioned, the entropy of a state over a quantum system can be calculated via an abelian model. For a matrix algebra this viewpoint may look superficial, but it has found an important application in the theory of entangled states, where subalgebras $\mathcal{A}\otimes\mathcal{B}\subset\mathcal{C}$ are considered and the entanglement describes that a pure state over $\mathcal{C}$ will not be pure as a state over $\mathcal{A}$ resp. $\mathcal{B}$. This entanglement can be used for quantum communication, and the amount of this applicability is expressed as the entanglement of formation [31] (compare (4.17))

$E(\omega,\mathcal{A}) = S(\omega)_{\mathcal{A}} - H_\omega(\mathcal{A}) = \min\sum_i\mu_iS(\omega_i)_{\mathcal{A}}.$ (5.1)

Expressed in terms of an abelian model we can also write

$H_\omega(\mathcal{A}) = \sup_{\lambda,\mathcal{B}_0}S(\omega\otimes\mu|\lambda)_{\mathcal{A}\otimes\mathcal{B}_0},$ (5.2)

where $\lambda$ is a state over $\mathcal{B}_0\otimes\mathcal{C}$.

We have the following inequality: Let $\omega$ as state over $\mathcal{C}$ be written in the GNS representation

$\omega(C) = \langle\Omega|\pi(C)|\Omega\rangle$

and let $\mathcal{C}'$ be the commutant in this representation. Then

$S(\omega\otimes\omega|\omega)_{\mathcal{A}\otimes\mathcal{C}'_0}\le H_\omega(\mathcal{A})\le S(\omega\otimes\omega|\omega)_{\mathcal{A}\otimes\mathcal{C}'}$ (5.3)

with $\mathcal{C}'_0$ any abelian subalgebra of $\mathcal{C}'$. A maximal abelian subalgebra of $\mathcal{C}'$ gives a lower bound to the entropy, and in some cases it even is the best abelian model (compare [32] and the explicit results in [33] for estimates on $E$, i.e. without dynamics), but in other examples [32], see also the forthcoming 6.E, it is evidently too small. If, in addition, the abelian model has to carry a dynamics, the question arises when the abelian model can be imbedded into the commutant (or whether by the natural isomorphism the algebra itself contains a sufficiently large time invariant abelian subalgebra).

Here we have the following results:

Theorem [34]: Assume that $(\mathcal{A},\sigma,\omega)$ is a dynamical system and $\omega$ a tracial state. Assume that the analogue of 1.c) ("entropic K-system") is satisfied, i.e.

$\lim_{n\to\infty}H(\sigma^n,\mathcal{B}) = H(\mathcal{B})\quad\forall$ finite dimensional $\mathcal{B}\subset\mathcal{A}$.

Then

$\text{st-}\lim_{n\to\infty}[A,\sigma^nA] = 0\quad\forall A.$ (5.4)

Proof: It suffices to choose $\mathcal{B} = \{P\}$ for all projection operators $P$ in $\mathcal{A}$. Then $\{P\}$ is its own best abelian model in the calculation of $H(\mathcal{B})$. Refinements of the models $\{P\},\dots,\{\sigma^nP\}$ have to be used to calculate $H(\sigma^n,\mathcal{B})$ (compare theorem (4.23)). But they are only possible if $P$ and $\sigma^nP$ nearly commute.

The theorem was generalized to other states [34], but with the restriction that we had to be able to keep control over sufficiently many optimal abelian models. We believe that these restrictions can be removed by a more careful analysis.

Another result on footprints of commutativity is the following.

Theorem [35]: Assume that in the calculation of the dynamical entropy there exists an optimal abelian model, i.e.

$h(\sigma) = \sup_{\lambda,\mathcal{B},\Theta}(4.19) = \max_{\lambda,\mathcal{B},\Theta}(4.19),$ (5.5)

then the algebra $\mathcal{A}$ contains an abelian subalgebra $\mathcal{A}_0$ on which $\sigma$ acts as an automorphism. Notice that this does not imply that this abelian subalgebra already is the optimal abelian model.

6 Abelian Models for Algebraic K-Systems

In the following we will discuss the examples of algebraic K-systems given in Sect. 3 and how far they allow us to find good abelian models.

A) In this model of a quantized Bernoulli system that completely factorizes the obvious choice of the abelian model that gives the correct result is

-4o = (g)4n )

n€Z

where $\mathcal{B}_0$ is the abelian algebra that commutes with $\rho$ and describes the measurements with maximal certainty.

B) For the lattice system, for which the state no longer factorizes, it does not suffice to pick a suitable abelian subalgebra at every lattice point. This provides an abelian model, but not an optimal one. According to the observations (4.25) it is clear that an upper bound for the dynamical entropy is given by the entropy density [29], and it seems very plausible that it should not be less. To our knowledge no general proof is available, but for the states that are of physical interest equality has been shown.

Already in [29] equality was shown under some compatibility relation between space translation and modular automorphism. In practice, however, it is difficult to check whether this compatibility relation holds. For quasifree states this is possible and was done in [36]. Here an abelian subalgebra was selected for increasing size of the tensor product. This subalgebra delocalizes, but only to such an extent that the convergence of these subalgebras to an abelian model that gives the desired result can be controlled.

In [37] equilibrium states over lattice systems as in [9] were considered and a decomposition was offered that in the limit gave the desired result. The author of [38] applied the affinity of the dynamical entropy to control these limits and to allow exchanging them. His ideas are generalized in [39], giving the following result:

If you assume that the shift $\sigma$ is asymptotically abelian (i.e. we consider not only lattice algebras but some generalization in the framework of AF-algebras) and you consider a dynamics given by a sequence of local Hamiltonians, then:

The thermodynamic limit of the equilibrium states exists and they satisfy the KMS property with respect to the dynamics.

For these states the entropy density and the dynamical entropy of the shift coincide. The dynamical entropy of the shift can be used in a thermodynamic variation principle. This variation principle is satisfied exactly by states that are KMS with respect to the time evolution.

The maximal dynamical entropy is achieved by the tracial state and coincides in this state with the Voiculescu dynamical entropy $ha_t$ (4.9). In all these examples the abelian model is constructed by considering the sequence $\rho_\Lambda = e^{-H_\Lambda}$ and the corresponding minimal projectors in (4.21-23).

There exists another possibility to construct space translation invariant states on the lattice, namely the method of correlated states:

We start again with our chain $\mathcal{A} = \bigotimes_n\mathcal{B}_n$. In addition, we choose an algebra $\mathcal{C}$ (we restrict to finite dimensional ones) and consider some completely positive map $F : \mathcal{C}\otimes\mathcal{B}\to\mathcal{C}$, that we can write as $f_b(c)$, and we demand $f_1(c) = c$. Let $\rho$ be a state over $\mathcal{C}$ satisfying $\rho\circ f_1 = \rho$. Then we define

$\omega(b_1\otimes\cdots\otimes b_k) = \rho(f_{b_1}\circ f_{b_2}\circ\cdots\circ f_{b_k}(1))$

where $b_i$ is an operator at the lattice point $i$ (many of them can be 1).

It can be checked that in this way we obtain a translation invariant state. If, e.g., $f_b(1) = \omega(b)\cdot 1$, then we obtain a state that is clustering. If we want to have nontrivial correlations between nearest neighbours, we have to choose another $f$, but this enforces that there must also be correlations to other neighbours. Space clustering is encoded in the convergence properties of $f^{(n)}$ [40].

Now the construction of an abelian model is offered by a decomposition of $F$ into finer completely positive maps. Convergence properties in the construction of abelian models, as is necessary in (4.23), are now controlled by convergence properties of $F$ (which acts over finite dimensional algebras) instead of convergence properties of space correlations. Again we have to choose $\mathcal{B}_n$ sufficiently large, i.e. combine sufficiently many lattice points. With appropriate estimates it was shown [41] that for all finitely correlated states ($\mathcal{C}$ of finite dimension) the dynamical entropy and the entropy density of the states constructed in this way coincide.

C) The Fermi Algebra

If we concentrate on the even subalgebra $\mathcal{A}_e$ of the CAR algebra, i.e. the algebra consisting of even polynomials in creation and annihilation operators, this is just a special AF-algebra that is asymptotically abelian, and therefore the results in [39] guarantee that for equilibrium states the dynamical entropy of space translation and the entropy density coincide.

If, in addition, we apply the theorem [29]

$h(\sigma^n) = |n|\,h(\sigma),$

then obviously

$h_{\mathcal{A}_e}(\sigma^n)\le h_{\mathcal{A}}(\sigma^n)$

$\simeq h_\mu(P|P_-^{(n)}) - H(P|P_-\otimes\mathcal{A})$

$\le h_\mu(P|P_-^{(n)}) - H(P|P_-\otimes\mathcal{A}_e) + \ln 2$ (6.1)

shows that $h_{\mathcal{A}_e}(\sigma) = h_{\mathcal{A}}(\sigma)$.

Nevertheless the noncommutativity of the algebra has consequences:

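Theorem: If $\omega = \omega\circ\sigma$, then $\omega(A_0) = 0$ for all odd elements $A_0$ in $\mathcal{A}$.

Proof:

$|\omega(A_0)|^2 = \Bigl|\frac1N\sum_{n=0}^{N-1}\omega(\sigma^nA_0)\Bigr|^2\le\frac{1}{2N^2}\sum_{k,\ell=0}^{N-1}\omega\bigl([\sigma^kA_0^*,\sigma^\ell A_0]_+\bigr).$ (6.2)

The anticommutator vanishes for strictly local odd operators except for $|\ell-k| = O(1)$. Therefore

$|\omega(A_0)|^2\le O(1/N)\quad\forall N.$

We notice that noncommutativity reduces the possibility for invariant states.

Concerning the question of entropic K-systems (2.2), for all even subalgebras

$\lim_{n\to\infty}H(\sigma^n,\mathcal{B}_e) = H(\mathcal{B}_e),$

but for a typical odd subalgebra $\mathcal{A}_0 = \{a_0 + a_0^\dagger\}''$ we have $h(\sigma,\mathcal{A}_0) = 0$.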

D) For the stationary quantum Markov chain again an abelian model can be constructed that gives the optimal result, i.e. the entropy density [10]. The main idea in the proof is that, apart from the algebra $\mathcal{A}$, we can concentrate on the algebra $\mathcal{C}$, and inside this algebra we construct an optimal decomposition. Therefore in the limit of these decompositions we find an abelian model with vanishing entropy defect $H(P|P_-\otimes\mathcal{A})$.

As we already mentioned, the automorphism $T$ (as in our special example) will not be asymptotically abelian in general, and therefore the system fails to be an entropic K-system. Similarly to the Fermi system, we can introduce the gauge automorphism

$\gamma\sigma_x = -\sigma_x$

$\gamma\sigma_y = -\sigma_y$

$\gamma\sigma_z = \sigma_z.$

The elements invariant under this gauge automorphism are asymptotically abelian under space translation, because they become localized in $1\otimes\mathcal{C}$. Therefore again the result corresponds to the results in [39], though the states are constructed in different ways.

E) The last example we want to discuss in this framework is the Price-Powers shift. We have already considered the special case $g(i) = 1$, the Fermi algebra (3Eb). For $g(1) = 1$, $g(k) = 0$ otherwise, the representation (3Ea) already indicates how to construct an abelian model: For $\sigma^2$ we are dealing with a quantum Bernoulli shift that is factorizing, with the obvious choice for an abelian model. Therefore it is easy to construct the abelian model for $\sigma$:

We can consider $\mathcal{B}_{\sigma^2}$ as subalgebra of $\mathcal{A}$; therefore $\sigma\mathcal{B}_{\sigma^2}$ is again an abelian subalgebra, and for the shift $\sigma$ we consider the abelian model

$\sigma\mathcal{B}_{\sigma^2}$

with the obvious coupling. Notice that now we have presented an example where the entropy defect of the abelian model does not vanish, i.e. the abelian model is not a subalgebra of the system. For arbitrary $g$ we will in general fail to find an abelian model. We have only to vary the proof (6.2): If $g$ is sufficiently irregular, so that for all $w_I\in\mathcal{A}$, where $w_I$ are monomials in $e_i$, $i\in I$,

$[w_I,\sigma^kw_I]_+ = 0$

for infinitely many $k$, so that

|w(w/)|2 = J2 TT\ UJ^(jkwi)

= jjjl E w([«'/,"*"/«'/]+) = o (j-ijJ , (6.3)

then $\omega(w_I)$ has to vanish.

In fact, it was shown in [42] that it is possible to construct a sequence $\{g\}$ so that (6.3) holds for all $w_I$, and therefore the only invariant state is the tracial state. In [43] we proved that with probability one on the set of possible $\{g\}$ (6.3) holds, and again we have a unique invariant state. But this argument can be generalized to every coupling to abelian models; therefore every coupling has to be trivial, and the dynamical entropy in the sense of [29], resp. [30], vanishes.

The Price-Powers shift was also studied in the context of Voiculescu's dynamical entropy and in the context of the Alicki-Fannes entropy [23, 44]. Here the increasing property is the dominant feature. We obtain

$ha_t(\sigma) = \frac12\ln 2,\qquad h_{AF}(\sigma) = \ln 2,$ (6.4)

independently of the special sequence $\{g\}$.

If we return to our remark that, for classical dynamical systems, the dynamical entropy describes how information increases but at the same time becomes more and more irrelevant, we notice that the Voiculescu and the Alicki-Fannes entropies concentrate on the fact that information increases, whereas the entropy of [29] is sensitive to the extent to which information becomes irrelevant.

7 Continuous K-Systems

So far we concentrated on discrete dynamics. But obviously the discrete group of translations $\mathbb{Z}$ can be replaced by $\mathbb{R}$ without varying much of the definitions. Especially due to the linearity of the dynamical entropy (which is proven for [18] and [29]),

$h(\sigma^n) = |n|\,h(\sigma),$ (7.1)

also for the continuous group $\mathbb{R}$ we can choose the subgroup $\alpha\mathbb{Z}$ and can calculate the dynamical entropy (for all possible definitions) for this subgroup. It can be shown that the result will be independent of the scaling parameter $\alpha$.

The definition of an algebraic quantum K-system is also applicable to a continuous group. Only in this case the amount of increase cannot be described by $[\mathcal{A}_t:\mathcal{A}_0]$: it is either zero or infinity, because $[\mathcal{A}_t:\mathcal{A}_0] = n[\mathcal{A}_{t/n}:\mathcal{A}_0]$ and $[\mathcal{A}_1:\mathcal{A}_0]$ is either $0$ or $\ge 2$ [14].

This remark shows that a continuous quasifree evolution over a Fermi lattice system ($\sigma_\alpha a(f) = a(e^{i\alpha p}f)$, $\alpha\in\mathbb{R}$) can give positive dynamical entropy but cannot correspond to a continuous algebraic K-system:

$[\mathcal{A}_t:\mathcal{A}_0] = ha_t(\sigma_t)$

and $ha_t(\sigma_t) = h_\tau(\sigma_t)$

in the tracial state (compare [39]). This leads to a contradiction if $h_\tau(\sigma_t)$ is bounded.

A prototype of a continuous K-system is given in relativistic quantum field theory:

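The Wedge Algebra [45]. Consider the algebra $\mathcal{A}_W = \{\phi(x),\ x_1>0\}$ as subalgebra of a quantum field theory $\mathcal{A}$. This algebra is mapped into itself by the following automorphisms:

a) $\sigma^{(1)}_x$, the shift in the $x_1$-direction. Therefore $\{\mathcal{A},\mathcal{A}_W,\sigma^{(1)},\langle\Omega|\cdot|\Omega\rangle\}$ is a K-system in an irreducible state. The unitary operator implementing $\sigma^{(1)}_x$ is $e^{iP^1x}$ with $\operatorname{spec}(P^1) = \mathbb{R}$.

b) $\ell^{(1)}$, the shift in the light direction $x^1 + x^0$. Again $\{\mathcal{A},\mathcal{A}_W,\ell^{(1)},\langle\Omega|\cdot|\Omega\rangle\}$ is a K-system. Now $\ell^{(1)} = \operatorname{ad}e^{iL^{(1)}}$ with $\operatorname{spec}(L^{(1)}) = \mathbb{R}^+$.

c) $|\Omega\rangle$ is cyclic and separating for $\mathcal{A}_W$. Therefore it defines a KMS-automorphism, and this KMS-automorphism coincides with the geometric action of the boost $b^{(1)}$. With $\{\mathcal{A}_W,\ell^{(1)}\mathcal{A}_W,b^{(1)},\langle\Omega|\cdot|\Omega\rangle\}$ we obtain a new K-system, where the K-automorphism is now the modular automorphism $\operatorname{ad}b^{(1)} = \operatorname{ad}e^{iB^{(1)}}$ and $\ell^{(1)}$ acts as endomorphism on $\mathcal{A}_W$. The generators satisfy

$[B^{(1)},L^{(1)}] = iL^{(1)}.$ (7.2)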

These relations can be generalized to the following theorem:

Theorem: Let $\{\mathcal{A},\mathcal{A}_0,\tau_t,\omega\}$ be a modular K-system, i.e. $\tau_t$ the modular automorphism of $\mathcal{A}$ and

$\tau_t\mathcal{A}_0\supset\mathcal{A}_0.$

a) Then the GNS vector $|\Omega\rangle$ implementing $\omega$ is cyclic and separating both for $\mathcal{A}$ and $\mathcal{A}_0$.

b) Let $\tau_t$ be implemented by $e^{iHt}$, $e^{iHt}|\Omega\rangle = |\Omega\rangle$. Let $\tau_t^0$ be the modular automorphism of $\mathcal{A}_0$, implemented by $e^{iH^0t}$ with $e^{iH^0t}|\Omega\rangle = |\Omega\rangle$. Then

$G =: H^0 - H$ is well defined, $G\ge 0$,

$e^{iGs}$, $s\ge 0$, implements an endomorphism on $\mathcal{A}$ with $e^{iG}\mathcal{A}e^{-iG} = \mathcal{A}_0$,

$[H,G] = iG.$ (7.3)

The proof is based on the analyticity properties of the modular operator, taking appropriate care of domain properties [46, 47].

We notice that for quantum modular K-systems endomorphisms arise in a natural way that satisfy the Anosov commutation relations and therefore yield, via Lyapunov exponents, the clustering properties of the automorphism:

Theorem: Let $\{\mathcal{A},\tau(t),\sigma(s),\omega\}$ be an Anosov system with $\tau$ the K-automorphism and $\sigma$ the Anosov endomorphism.

Take $\chi_a$ to be the characteristic function of $(a,\infty)$ for some $a>0$. Choose $A$ and $B\in\mathcal{A}$ such that

i) $A\Omega\in\mathcal{D}(G^r)$ for some $r>0$,

ii) $\chi_a'(G)B\Omega = 0$. As a consequence $\langle\Omega|B|\Omega\rangle = 0$.

Then

$|\omega(A\tau_tB)|\le e^{-tr}a^{-r}\|B\Omega\|\,\|G^rA\Omega\|.$ (7.4)

We refer to [1] and [48].

As for discrete quantum K-systems, we wonder whether the dynamical entropy is positive and whether there exist nontrivial models: Again no general result is available. On the basis of quasifree evolution [49] we can construct models for fermions and bosons that are modular K-systems with positive dynamical entropy. But there also exists a q-deformed quasifree modular system [50]. Here the past algebra has trivial relative commutant, and therefore the algebra does not contain any subalgebra on which the dynamics acts asymptotically abelian, which according to [34] seems to be a requirement for the construction of abelian models.

8 Mixing Properties Without Algebraic K-Property

As already mentioned, no strategy is available up to now to construct, for a given quantum dynamical system, a subalgebra that satisfies the K-property. A model for which it is still undecided whether we are dealing with an algebraic K-system is the rotation algebra [51].

Definition: The rotation algebra $\mathcal{A}_\alpha$ is built by unitary operators $U$, $V$ with

$U\cdot V = e^{i\alpha}V\cdot U$ (8.1)

for some $\alpha\in[0,2\pi)$.

This algebra arises in a natural way in a physically motivated example: Consider a free particle in a constant magnetic field, confined to two dimensions. Then the particle describes Larmor orbits. In the thermodynamic limit these Larmor orbits can be occupied up to a precise filling factor [52]. This thermodynamic limit can most easily be achieved by confining the particles in an additional harmonic potential whose strength goes to zero [53]. Another method, more tailor-made for studying electric currents, is to impose periodic boundary conditions. Therefore the algebra is built by $e^{i\alpha v_x}$, $e^{i\beta v_y}$, $e^{inx}$, $e^{imy}$ with

$e^{i\alpha v_x}e^{inx} = e^{in(x+\alpha)}e^{i\alpha v_x}$

$e^{i\beta v_y}e^{imy} = e^{im(y+\beta)}e^{i\beta v_y}$

$e^{i\alpha v_x}e^{i\beta v_y} = e^{i\alpha\beta B}e^{i\beta v_y}e^{i\alpha v_x}$ (8.2)

with $B$ the magnetic field orthogonal to the plane. All other commutators vanish.

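If we introduce

$\exp[in\bar x] = \exp\bigl[in\bigl(x - \tfrac1Bv_y\bigr)\bigr]$

$\exp[im\bar y] = \exp\bigl[im\bigl(y + \tfrac1Bv_x\bigr)\bigr]$

then the algebra splits into

$\{e^{i\alpha v_x}, e^{i\beta v_y}\}\otimes\{e^{in\bar x}, e^{im\bar y}\}$

$e^{in\bar x}e^{im\bar y} = e^{-inm/B}e^{im\bar y}e^{in\bar x}.$

Therefore the rotation algebra with $\alpha = 1/B$ describes the algebra of the center of the Larmor precession.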
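For rational values of the rotation parameter the defining relation (8.1) has a concrete finite dimensional realization. The sketch below assumes $\alpha = 2\pi p/q$ with integers $p$, $q$ and uses the familiar clock and shift matrices; the function name is illustrative and does not appear in the text. For irrational $\alpha/2\pi$ no such finite dimensional representation exists, so the sketch covers the rational case only.

```python
import numpy as np


def clock_shift(p, q):
    """Return q x q unitaries (U, V) with U V = exp(2 pi i p / q) V U."""
    phase = np.exp(2j * np.pi * p / q)
    u = np.diag(phase ** np.arange(q))   # clock: U e_k = phase^k e_k
    v = np.roll(np.eye(q), 1, axis=0)    # shift: V e_k = e_{k+1 mod q}
    return u, v


def satisfies_rotation_relation(u, v, alpha):
    """Check the relation (8.1) for the given matrices and angle alpha."""
    return np.allclose(u @ v, np.exp(1j * alpha) * (v @ u))
```

In this representation $U^q$ and $V^q$ are multiples of the identity, which reflects the large center of the rotation algebra at rational parameter.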

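For $\mathcal{A}_\alpha$ there exists a representation on $L^2(T^2)$

$\pi(U_\alpha) = \exp\bigl[i\bigl(x + \tfrac{\alpha}{2}p_y\bigr)\bigr]$

$\pi(V_\alpha) = \exp\bigl[i\bigl(y - \tfrac{\alpha}{2}p_x\bigr)\bigr],$ (8.3)

where $p_x$, $p_y$ are the momentum operators $\frac1i\frac{\partial}{\partial x}$, $\frac1i\frac{\partial}{\partial y}$ with periodic boundary conditions on the torus. For $|\Omega\rangle = |1\rangle$, the constant function on the torus,

$\pi(U_\alpha)|\Omega\rangle = e^{ix}$

$\pi(V_\alpha)|\Omega\rangle = e^{iy}$ (8.4)

independent of the rotation parameter [54].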

independent of the rotation parameterM . On Aa we have the following automorphism

4(^C) = J^usv?

with

n m

= T n m - ( : ! ) •

$ad - bc = 1.$

$\sigma^{(1)}$, $\sigma^{(2)}$ describe currents and are therefore of physical relevance. $\Theta_T$ describes dilation in $\mathbb{R}^2$ space and reduces to a map on the torus $T^2$ only for discrete values and discrete directions of the dilation. A physical description for $\Theta_T$ can be given if it describes a sudden periodic push to the particle. Whereas $\sigma^{(1)}$ and $\sigma^{(2)}$ have no good mixing behaviour, $\Theta_T$ inherits all mixing properties from the classical torus due to (8.4):

$\langle\Omega|\pi(W_\alpha(z))\,\Theta_T^n\,\pi(W_\alpha(z))|\Omega\rangle = \langle\Omega|\pi(W_0(z))\,\Theta_T^n\,\pi(W_0(z))|\Omega\rangle.$ (8.5)

But with respect to dynamical entropy the noncommutativity plays an essential role: Let $\lambda$ be the eigenvalue $>1$ of $T$. Then

hat(&T) = \ In A for a irrational18

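$\phantom{ha_t(\Theta_T)} = \ln\lambda$ for $\alpha$ rational

$h_{AF}(\Theta_T) = \ln\lambda$ for all $\alpha$ [55]

$h_{CNT}(\Theta_T) = \ln\lambda$ for $\alpha$ rational

$\phantom{h_{CNT}(\Theta_T)} > 0$ for $\alpha$ depending rationally on $\lambda$ [57]

$\phantom{h_{CNT}(\Theta_T)} = 0$ in general [56].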
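The quantity $\ln\lambda$ entering these values is easily evaluated for a concrete $T$. The following sketch assumes an integer $2\times 2$ matrix with determinant one and an eigenvalue of modulus larger than one; the helper name is illustrative only, and the example matrix in the usage note is a choice made here, not one taken from the text.

```python
import numpy as np


def log_expanding_eigenvalue(t):
    """Return ln(lambda), lambda the eigenvalue of modulus > 1 of a 2x2 matrix T with det T = 1."""
    t = np.asarray(t, dtype=float)
    if t.shape != (2, 2) or not np.isclose(np.linalg.det(t), 1.0):
        raise ValueError("expected a 2x2 matrix with determinant 1")
    lam = max(abs(ev) for ev in np.linalg.eigvals(t))
    if lam <= 1.0 + 1e-12:
        raise ValueError("matrix has no eigenvalue of modulus > 1")
    return float(np.log(lam))
```

For $T = \begin{pmatrix}2&1\\1&1\end{pmatrix}$ the expanding eigenvalue is $(3+\sqrt5)/2$, so the function returns $2\ln\frac{1+\sqrt5}{2}$.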

In addition, it was possible to construct for rational $\alpha$ a subalgebra $\mathcal{A}_0$ so that $(\mathcal{A},\mathcal{A}_0,\Theta_T,\omega)$ became a K-system [54]. This was possible because $\mathcal{A}$ can be looked at as a crossed product of the classical algebra on $T^2$ with a discrete translation group, and by rather general considerations crossed product algebras inherit under some conditions the K-structure of the underlying algebra [56]. Obviously this construction does not give a hint for irrational $\alpha$.

The strong dependence of the CNT dynamical entropy on $\alpha$ is based on the strong dependence of the asymptotic commutation behaviour. Only if $\alpha$ and $\lambda$ are rationally dependent is the system asymptotically abelian, with the commutator converging rapidly to zero. This rapid convergence made it possible to construct an abelian model [57], using the fact that the algebra $\mathcal{A}_\alpha$ can be imbedded in, but is not, an AF-algebra. Therefore, different from the approaches for lattice systems, the abelian model cannot be identified, up to convergence problems, with an abelian subalgebra of $\mathcal{A}_\alpha$.

9 Time Evolution

As we have seen, in a quantum system there are many possibilities for some kind of mixing behaviour that are not equivalent as in the classical situation. Up to now we concentrated on dynamics that were constructed in such a way that they should give us information on possible ergodic structures.

When dynamics is given to us by a sequence of local Hamiltonians we have, up to now, hardly control on the asymptotic behaviour, apart from quasifree evolution.

We mention just one result: The x-y model [58] allows a transformation to a quasifree evolution. Therefore we know that it is weakly, but not strongly, asymptotically abelian. Its dynamical entropy is positive and all definitions give the same result (with the dimensional correction term for $h_{AF}$). We do not know whether it is an algebraic K-system for a discrete subset in time. For sure it is not a continuous algebraic K-system.

References

1. G.G. Emch, H. Narnhofer, G.L. Sewell, W. Thirring, Anosov Actions on Non-Commutative Algebras, J. Math. Phys. 35/11, 5582-5599 (1994).

2. M.C. Gutzwiller, Chaos in classical and quantum mechanics (Springer, New York, 1990).

3. E. Bogomolny, F. Leyvraz, C. Schmit, Statistical Properties of Eigenvalues for the Modular Group, in XIth International Congress of Mathematical Physics, Daniel Iagolnitzer ed. (International Press, Boston, 306-323, 1995).

4. A.N. Kolmogorov, A new metric invariant of transitive systems and automorphisms of Lebesgue spaces, Dokl. Akad. Nauk 119, 861-864 (1958).

5. P. Walters, An Introduction to Ergodic Theory (Springer, New York, 1982).

6. I.P. Cornfeld, S.V. Fomin, Ya.G. Sinai, Ergodic Theory (Springer, New York, 1982).

7. H. Narnhofer, W. Thirring, Quantum K-Systems, Commun. Math. Phys. 125 565-577 (1989).

8. C. Brukner, A. Zeilinger, Conceptual Inadequacy of the Shannon Information in Quantum Measurements, quant-ph/0006087

9. O. Bratteli, D.W. Robinson, Operator Algebras and Quantum Statistical Mechanics I, II (Springer, Berlin, Heidelberg, New York, 1993).

10. B. Kümmerer, Examples of Markov dilation over 2 x 2 matrices, in L. Accardi, A. Frigerio, V. Gorini eds., Quantum Probability and Applications to the Quantum Theory of Irreversible Processes, Springer, Berlin, 1984, 228-244, and private communications.

11. R.T. Powers, An index theory for semigroups of *-endomorphisms of B(H) and type II_1 factors, Canad. J. Math. 40 86-114 (1988); G.L. Price, Shifts of II_1 factors, Canad. J. Math. 39 492-511 (1987).

12. H. Narnhofer, W. Thirring, Chaotic Properties of the Noncommutative 2-Shift, in From Phase Transition to Chaos, G. Gyorgyi, I. Kondor, S. Sasvari, T. Tel eds., World Scientific 1992, 530-546

13. H. Narnhofer, W. Thirring, Clustering for Algebraic K-Systems, Lett. Math. Phys. 30 307-316 (1994).

14. V.F.R. Jones, Index for subfactors, Invent. Math. 72 1-25 (1983).

15. R. Longo, Simple Injective Subfactors, Adv. Math. 63 152-171 (1987); Index of Subfactors and Statistics of Quantum Fields, Commun. Math. Phys. 130 285-309 (1990).

16. M. Choda, Entropy of canonical shifts, Trans. Amer. Math. Soc. 334 827-849 (1992).


17. H. Narnhofer, A. Pflug, W. Thirring, Mixing and Entropy Increase in Quantum Systems, in Symmetry in Nature in honour of Luigi A. Radicati di Brozolo, Scuola Normale Superiore, Pisa , 597-626 (1989).

18. D.V. Voiculescu, Dynamical Approximation Entropies and Topological Entropy in Operator Algebras, Commun. Math. Phys. 170 249-282 (1995).

19. M. Choda, A C* Dynamical Entropy and Applications to Canonical Endomorphisms, J. Funct. Anal. 173 453-480 (2000).

20. E. Stormer, A Survey of noncommutative dynamical entropy, Oslo preprint No. 18, Dep. of Mathematics, MSC-class 46L40 (2000)

21. M. Choda, Entropy on crossed products and entropy on free products, preprint (1999)

22. K. Dykema, Topological entropy of some automorphisms of reduced amalgamated free product C* algebras, preprint (1999)

23. R. Alicki, M. Fannes, Defining Quantum Dynamical Entropy, Lett. Math. Phys. 32 75-82 (1994).

24. R.B. Griffiths, Consistent histories and the interpretation of quantum mechanics, J. Stat. Phys. 36 219-279 (1984).

25. M. Gell-Mann, J. Hartle, Alternative decohering histories in quantum mechanics, in Proc. of the 25th Int. Conf. on High Energy Physics, Vol. 2, ed. by K.K. Phua and Y. Yamaguchi, World Scientific, Singapore, 1303-1310 (1991).

26. E.H. Lieb, D.W. Robinson, The finite group velocity of quantum spin systems, Commun. Math. Phys. 28 251-257 (1972).

27. A. Connes, E. Stormer, Entropy of II_1 von Neumann algebras, Acta Math. 134, 289-306 (1972).

28. A. Connes, Acad. Sci. Paris 301 I, 1-4 (1985).

29. A. Connes, H. Narnhofer, W. Thirring, Dynamical Entropy of C*-Algebras and von Neumann Algebras, Commun. Math. Phys. 112 691-719 (1987).

30. J.L. Sauvageot, J.P. Thouvenot, Une nouvelle définition de l'entropie dynamique des systèmes non commutatifs, Commun. Math. Phys. 145, 411-423 (1992).

31. C.H. Bennett, D.P. DiVincenzo, J.A. Smolin, W.K. Wootters, Mixed state entanglement and quantum error corrections, Phys. Rev. A 54, 3824-3851 (1996).

32. F. Benatti, H. Narnhofer, A. Uhlmann, Decomposition of quantum states with respect to entropy, Rep. Math. Phys. 38, 123-141 (1996).

33. W.K. Wootters, Entanglement of formation of an arbitrary state of two qubits, q-ph/970929,

34. F. Benatti, H. Narnhofer, Strong asymptotic abelianess for entropic K-systems, Commun. Math. Phys. 136 231-250 (1991); Strong Clustering in Type III Entropic K-Systems, Mh. Math. 124, 287-307 (1996).

35. H. Narnhofer, An Ergodic Abelian Skeleton for Quantum Systems, Lett. Math. Phys. 28, 85-95 (1993).

36. H. Narnhofer, W. Thirring, Dynamical Theory of Quantum Systems and Their Abelian Counterpart, in On Klauder's Path, eds. G.G. Emch, G.C. Hegerfeldt, L. Streit, World Scientific, 127-145 (1994).

37. H. Narnhofer, Free energy and the dynamical entropy of space translation, Rep. Math. Phys. 25, 345-356 (1988).

38. H. Moriya, Variational principle and the dynamical entropy of space translation, Rev. Math. Phys. 11, 1315-1328 (1999).

39. S. Neshveyev, E. Stormer, The variational principle for a class of asymptotically abelian C* algebras, MSC-class 46L55 (2000)

40. M. Fannes, B. Nachtergaele, R.F. Werner, Finitely correlated states of quantum spin systems, Commun. Math. Phys. 144 443-490 (1992).

41. R.F. Werner, private communication.

42. H. Narnhofer, E. Stormer, W. Thirring, C* dynamical systems for which the tensor product formula for entropy fails, Ergod. Th. & Dynam. Sys. 15, 961-968 (1995).

43. H. Narnhofer, W. Thirring, C* dynamical systems that are highly anti-commutative, Lett. Math. Phys. 35 145-154 (1995).

44. R. Alicki, H. Narnhofer, Comparison of Dynamical Entropies for the Noncommutative Shifts, Lett. Math. Phys. 33, 241-247 (1995).

45. H.J. Borchers, On the Revolutionization of Quantum Field Theory by Tomita's Modular Theory, ESI preprint, 160 pages, 148 references

46. H.J. Borchers, On Modular Inclusion and Spectrum Condition, Lett. Math. Phys. 27, 311-324 (1993).

47. H.W. Wiesbrock, Halfsided Modular Inclusions of von Neumann Algebras, Commun. Math. Phys. 157, 83-92 (1993), Commun. Math. Phys. 184, 683-685 (1997).

48. H. Narnhofer, Kolmogorov Systems and Anosov Systems in Quantum Theory, review, to be publ. in IDAQP.

49. H. Narnhofer, W. Thirring, Realization of Two-Sided Quantum K-Systems, Rep. Math. Phys. 45, 239-256 (2000).

50. D. Shlyakhtenko, Free quasifree states, Pac. Journ. of Math. 177 329-368 (1997).

51. M.A. Rieffel, Pac. J. Math. 93, 415 (1981).

52. R.B. Laughlin, Quantized Hall Conductivity in Two Dimensions, Phys. Rev. B 23/10, 5632-5633 (1981).

53. N. Ilieva, W. Thirring, Second quantization picture of the edge currents in the fractional quantum Hall effect, math-ph/0010038

54. F. Benatti, H. Narnhofer, G.L. Sewell, A Non Commutative Version of the Arnold Cat Map, Lett. Math. Phys. 21, 157-172 (1991).

55. R. Alicki, J. Andries, M. Fannes, P. Tuyls, Lett. Math. Phys. 35, 375-383 (1995).

56. H. Narnhofer, Ergodic Properties of Automorphisms on the Rotation Algebra, Rep. Math. Phys. 39, 387-406 (1997).

57. S.V. Neshveyev, On the K property of quantized Arnold cat maps, J. Math. Phys. 41 1961-1965 (2000).

58. H. Araki, T. Matsui, Commun. Math. Phys. 101 213-246 (1985).

SCATTERING IN QUANTUM TUBES

BÖRJE NILSSON

School of Mathematics and Systems Engineering, Växjö University, SE-351 95 VÄXJÖ, Sweden


It is possible to fabricate mesoscopic structures where at least one of the dimensions is of the order of the de Broglie wavelength for cold electrons. By using semiconductors composed of more than one material, combined with a metal split-gate, two-dimensional quantum tubes may be built. We present a method for predicting the transmission of low-temperature electrons in such a tube. This problem is mathematically related to the transmission of acoustic or electromagnetic waves in a two-dimensional duct. The tube is asymptotically straight with a constant cross-section. Propagation properties for complicated tubes can be synthesised from corresponding results for simpler tubes by the so-called Building Block Method. Conformal mapping techniques are then applied to transform the simple tube with curvature and varying cross-section to a straight tube with constant cross-section and variable refractive index. Stable formulations for the scattering operators in terms of ordinary differential equations are derived by wave splitting using an invariant imbedding technique. The mathematical framework is also generalised to handle tubes with edges, which are of large technical interest. The numerical method consists of using a standard MATLAB ordinary differential equation solver for the truncated reflection and transmission matrices in a Fourier sine basis. It is proved that the numerical scheme converges with increasing truncation.

1 Introduction

In the search for faster computers critical parts are becoming smaller. Today, it is possible to build mesoscopic structures where some dimensions are of the order of the de Broglie wavelength for cold electrons. Often the electron motion is confined to two dimensions. Consequently, it may be necessary, at least for some computer parts, to include quantum effects in the design process.

A large number of studies devoted to such quantum effects have been carried out in recent years, and a review is given by Londergan et al. 1 Many investigations aim at understanding the physical properties of a particular quantum tube rather than developing reliable mathematical and numerical methods that can be used in a more general context. The research has given valuable knowledge on the physical behaviour but also reports on the limitations of the methods used. For instance, Lin & Jaffe2 report that a straightforward matching at the boundary of a circular bend does not converge, demonstrating the numerical problems with such a method. An illposedness is present in quantum tube scattering and some type of regularisation is therefore required to avoid large errors. Often, the tubes have sharp corners to facilitate manufacturing but also to enhance quantum effects. The presence of corners with attached singularities requires special treatment.

Scattering of electrons in quantum tubes, see figure 1, is theorywise related to the scattering of acoustic and electromagnetic waves in ducts. Nilsson 3 treats a general method for the acoustic transmission in curved ducts with varying cross-sections. Wellposedness, i.e. stability, is achieved in an asymptotic sense. The mathematical framework guarantees consistent results and allows for sharp corners and a proof for numerical convergence is given. We set out to present a quantum version of the results of Nilsson3. In this way the problems reported on convergence2 and on inconsistent mathematical results would be resolved.

The paper is organised as follows. An introduction to scattering in quantum tubes is given in section 2 and a mathematical model is formulated in section 3. The Building Block Method, which is a systematic method to analyse complicated tubes in terms of results for simple tubes, is also briefly described. Then in section 4 the scattering problem for the curved tube with varying cross-section and constant potential is reformulated as a scattering problem for a straight tube with a varying refractive index. The solution to this problem is presented in section 5 and a discussion on numerical methods is also given.

2 Tubes in quantum heterostructures

A schematic view of a quantum heterostructure is shown in figure 2, following Wu et al. 4 Electrons are emitted from the n-type doped AlGaAs layer, migrate into the GaAs layer and stay close to the boundary to the AlGaAs layer. In this way a very narrow layer of electrons, which are free to move in a plane, is formed. Nearly all the electrons in this two-dimensional gas are in the same quantum state. By applying a negative potential on the metal electrodes on top of the heterostructure in figure 2, the electrons are expelled from the region below the electrodes. For relatively low voltages, the effective potential in the tube for one electron is close to the square-well potential 1. As a consequence the electrons in the two-dimensional gas are further restricted to a tube that in form is a mirror picture of the gap between the two electrodes. This quantum tube links the electrons of the two two-dimensional gases on either side of the strip formed by the electrodes.

3 Mathematical model

Consider a two-dimensional tube with interior $\Omega'$ according to figure 1. The boundary $\Gamma'$ consists of two continuous curves, $\Gamma'_+$ and $\Gamma'_-$, which are piecewise $C^2$. The upper boundary $\Gamma'_+$ can be continuously deformed to $\Gamma'_-$ within $\Omega'$. Outside a bounded region the duct is straight with constant widths $a$ and $b$, respectively. These terminating ducts are called the left and the right terminating duct, or L and R for short. We use stationary scattering theory for one electron in an effective potential, with time dependence $\exp(-iEt/\hbar)$, assuming that the wave function $\psi$ satisfies the time-independent Schrödinger equation $\Delta\psi + k^2\psi = 0$ in $\Omega'$, where $k^2 = 2m^*E/\hbar^2$ and $m^*$ is the effective mass5. Usually $k^2$ is called energy. The effective potential is assumed to be a square well, meaning that $\psi|_{\Gamma'} = 0$.

In a tube with constant cross-section the harmonic wavefunction $\psi$ can be uniquely decomposed into leftgoing and rightgoing parts by $\psi = \psi^+ + \psi^-$. Superscripts "+" and "-" indicate rightgoing (plus) and leftgoing (minus) waves, respectively. Let $\psi^+_{in}$ and $\psi^-_{in}$ be known incoming waves in the terminating ducts; $\psi^+_{in}$ is present in the left and $\psi^-_{in}$ in the right one. Let us write

$$\begin{cases} \psi = \psi^+_{in} + R^+\psi^+_{in} + T^-\psi^-_{in} & \text{in } L, \\ \psi = \psi^-_{in} + R^-\psi^-_{in} + T^+\psi^+_{in} & \text{in } R, \end{cases} \tag{3.1}$$

where for example the last two terms in (3.1a) are minus waves and the equation defines the left reflection mapping $R^+$ that maps the incoming wave to an outgoing one in L. The scattering problem consists of finding the mappings $R^+$, $T^-$, $R^-$ and $T^+$ as functions of energy for a given duct. In summary we have

$$\begin{cases} \Delta\psi + k^2\psi = 0 & \text{in } \Omega', \\ \psi^+ = \psi^+_{in} & \text{in } L, \\ \psi^- = \psi^-_{in} & \text{in } R. \end{cases} \tag{3.2}$$

There is always a solution to (3.2), and except for a discrete number of eigenenergies $k^2 = k_i^2$, $i = 1, 2, 3, \ldots$, the solution is unique 6. When $k^2 = k_i^2$, an eigenenergy, there exists a solution without incoming but with outgoing waves.

The use of the Building Block Method 7 or transfer matrix formalism 8 is very efficient for the solution of scattering problems. In this method a tube with a complicated geometry is divided into two parts, usually where the tube is straight. These two parts are converted to the type shown in figure 1 by extending the terminating tubes to infinity. A sub tube for the tube shown in figure 1 originates from the left part and is depicted in figure 3. The Building Block Method gives a procedure for calculating the mappings $R^+$, $T^-$, $R^-$ and $T^+$ for the entire tube in terms of the corresponding scattering properties for the sub tubes. This procedure can be repeated to get several sub tubes.
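How the scattering properties of two sub tubes combine can be illustrated with a minimal single-mode sketch. It uses the standard series composition of two scattering blocks (summing the multiple reflections between them) and is not taken from ref. 7. The function name `combine_blocks`, the tuple layout and the assumption that both blocks are referred to a common junction plane are hypothetical choices made for illustration. In the multi-mode case the products become operator products and the division becomes an inverse.

```python
import math


def combine_blocks(left, right):
    """Combine two sub tubes joined in series (single-mode sketch).

    Each block is a tuple (r_plus, t_plus, r_minus, t_minus):
      r_plus  : reflection of a wave incident from the left
      t_plus  : transmission of a wave incident from the left
      r_minus : reflection of a wave incident from the right
      t_minus : transmission of a wave incident from the right
    """
    rp_a, tp_a, rm_a, tm_a = left
    rp_b, tp_b, rm_b, tm_b = right

    # Sum of all multiple reflections between the two blocks:
    # 1 + x + x^2 + ... = 1 / (1 - x) with x = rm_a * rp_b.
    resonance = 1.0 / (1.0 - rm_a * rp_b)

    r_plus = rp_a + tm_a * rp_b * resonance * tp_a
    t_plus = tp_b * resonance * tp_a
    r_minus = rm_b + tp_b * rm_a * resonance * tm_b
    t_minus = tm_a * resonance * tm_b
    return r_plus, t_plus, r_minus, t_minus


# Illustrative lossless, symmetric scatterer: r = i sin(theta), t = cos(theta).
theta = 0.3
scatterer = (1j * math.sin(theta), math.cos(theta),
             1j * math.sin(theta), math.cos(theta))
r, t, _, _ = combine_blocks(scatterer, scatterer)
print(abs(r) ** 2 + abs(t) ** 2)  # energy balance of the combined tube
```

Repeated application of such a composition step is what allows a complicated tube to be assembled from several simple sub tubes.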


Rather than using a general numerical package for conformal mappings, we have for the calculations in this paper employed the Schwarz-Christoffel mapping for a duct with corners and rounded the corners using the methods of Henrici 9. Required analytic integrations are performed in MATHEMATICA.

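We recall the standard duct theory6 in a form that illustrates the illposedness of the problem and we have

$$\psi = \psi^+ + \psi^- = \sum_{n=1}^{\infty} A_n^+ e^{i\alpha_n x}\varphi_n(y) + \sum_{n=1}^{\infty} A_n^- e^{-i\alpha_n x}\varphi_n(y), \tag{3.3}$$

with $\varphi_n(y) = \sin(n\pi y/a)$ and $\alpha_n = \sqrt{k^2 - n^2\pi^2/a^2}$, $\operatorname{Im}\alpha_n \ge 0$. It is convenient to define the operator $B_0$ by

$$\begin{cases} B_0 f = \sum_{n=1}^{\infty} \alpha_n f_n \varphi_n, \\ f(y) = \sum_{n=1}^{\infty} f_n \varphi_n(y). \end{cases} \tag{3.4}$$

We find that $B_0^2 = \partial_y^2 + k^2$ and $\partial_x\psi^\pm = \pm iB_0\psi^\pm$. The initial value problem,

$$\begin{cases} \partial_x\psi^+(x) = iB_0\psi^+(x), \\ \psi^+(0) = \psi_0^+, \end{cases} \tag{3.5}$$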

is illposed for $x < 0$, but not for $x > 0$. If an attenuated plus wave is marched to the left an exponential growth is found. To avoid the illposedness, $\psi$ is decomposed and the plus waves are calculated by marching to the right and minus waves in the opposite direction.
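The growth that makes marching a plus wave to the left illposed can be made concrete with a small numerical sketch. It evaluates the magnitude of the modal factor $e^{i\alpha_n x}$ for propagating and evanescent modes. The function names and the values $k = 4$, $a = 1$ are illustrative assumptions, not taken from the paper.

```python
import cmath
import math


def axial_wavenumber(n, k, a):
    """alpha_n = sqrt(k^2 - n^2 pi^2 / a^2) on the branch Im(alpha_n) >= 0."""
    alpha = cmath.sqrt(k * k - (n * math.pi / a) ** 2)
    return alpha if alpha.imag >= 0 else -alpha


def plus_wave_gain(n, k, a, x):
    """|exp(i alpha_n x)|: gain of plus-wave mode n when marched from 0 to x."""
    return abs(cmath.exp(1j * axial_wavenumber(n, k, a) * x))


k, a = 4.0, 1.0  # assumed values: only the mode n = 1 propagates
for n in (1, 2, 5, 10):
    to_right = plus_wave_gain(n, k, a, +1.0)
    to_left = plus_wave_gain(n, k, a, -1.0)
    print(n, to_right, to_left)
```

For the propagating mode the gain is one in both directions, while for evanescent modes the gain to the left grows rapidly with the mode number, so any truncation or rounding error in the high modes is amplified.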

4 Reformulated scattering problem

To be able to use powerful spectral methods it is advantageous to transform the tube to one with flat boundaries. It is enough, according to the Building Block Method, to consider the scattering in the sub tubes and we restrict ourselves to the first part as shown in figure 3. One way of transforming the tube is to use a conformal mapping $w(\zeta)$ transforming the interior $\Omega'$ of the tube with variable cross-section in the $\zeta = x + iy$ plane (figure 3) to the interior $\Omega$ of a straight tube with constant cross-section in the $w = u + iv$ plane. The straight tube is described by $-\infty < u < \infty$, $0 < v < a$. With $\phi(u,v) = \psi(x,y)$ we get

$$\begin{cases} \partial_u^2\phi + B^2(u)\phi = 0 & \text{in } \Omega, \\ \phi(u,0) = \phi(u,a) = 0, & u \in \mathbb{R}, \end{cases} \tag{4.1}$$

with $B^2(u) = \partial_v^2 + k^2\mu(u,v)$ and $\mu = |d\zeta/dw|^2$. $\mu(u,v)^{1/2}$ can be denoted as a refractive index for the straight tube. In figure 4, $\mu$ related to the simple tube in figure 3 is depicted. The factor $\mu(u,v)$ is asymptotically constant at both ends of the tube, or more precisely $\mu(u,v) = \mu_\pm + O(e^{-c|u|})$, $u \to \pm\infty$, with $\mu_- = 1$ and $\mu_+ = (b/a)^2$.

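We use a first order description and rewrite (4.1a) as

$$\partial_u \begin{pmatrix} \phi \\ \partial_u\phi \end{pmatrix} = \begin{pmatrix} 0 & I \\ -B^2 & 0 \end{pmatrix} \begin{pmatrix} \phi \\ \partial_u\phi \end{pmatrix}. \tag{4.2}$$

To avoid illposedness the decomposition $\phi = \phi^+ + \phi^-$ is introduced, which must be identical to the corresponding decomposition (3.3) in regions where $\mu$ is a constant. The new state variables $(\phi^+, \phi^-)$ are introduced via the linear relation

$$\begin{pmatrix} \phi \\ \partial_u\phi \end{pmatrix} = \begin{pmatrix} I & I \\ iC & -iC \end{pmatrix} \begin{pmatrix} \phi^+ \\ \phi^- \end{pmatrix}. \tag{4.3}$$

Solving (4.3) for $\phi^+$ and $\phi^-$, taking the $u$-derivative and using (4.2) we find that

$$\partial_u \begin{pmatrix} \phi^+ \\ \phi^- \end{pmatrix} = \begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix} \begin{pmatrix} \phi^+ \\ \phi^- \end{pmatrix}, \tag{4.4}$$

where

$$\begin{aligned} \alpha &= \tfrac{1}{2}\left[(\partial_u C^{-1})C + iC^{-1}B^2 + iC\right], \\ \beta &= \tfrac{1}{2}\left[-(\partial_u C^{-1})C + iC^{-1}B^2 - iC\right], \\ \gamma &= \tfrac{1}{2}\left[-(\partial_u C^{-1})C - iC^{-1}B^2 + iC\right], \\ \delta &= \tfrac{1}{2}\left[(\partial_u C^{-1})C - iC^{-1}B^2 - iC\right]. \end{aligned} \tag{4.5}$$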

To generalize the concept of transmission operators we make them $u$-dependent, using a similar notation as Fishman10:

$$\begin{pmatrix} \phi^+(u_2) \\ \phi^-(u_1) \end{pmatrix} = \begin{pmatrix} T^+(u_2,u_1) & R^-(u_1,u_2) \\ R^+(u_2,u_1) & T^-(u_1,u_2) \end{pmatrix} \begin{pmatrix} \phi^+(u_1) \\ \phi^-(u_2) \end{pmatrix}, \tag{4.6}$$

assuming that $u_1 < u_2$, and suppressing the explicit $v$-dependence. It is assumed for (4.6) that the scattering problem has a unique solution or that homogeneous solutions are removed. A homogeneous solution is usually called a bound state.

Next we find a differential equation for the scattering operators $T^+(u_2,u_1)$, $R^-(u_1,u_2)$, $R^+(u_2,u_1)$ and $T^-(u_1,u_2)$ in (4.6) using the invariant imbedding technique11,10. It is required that the incoming wave from the right, $\phi^-(u_2)$, is vanishing. Then put $u_1 = u$, find $\partial_u\phi^-(u)$ from (4.6), use (4.6) once more to obtain

$$\partial_u R^+(u_2,u) = \gamma + \delta R^+(u_2,u) - R^+(u_2,u)\alpha - R^+(u_2,u)\beta R^+(u_2,u). \tag{4.7}$$

In a similar manner we get

$$\partial_u T^+(u_2,u) = -T^+(u_2,u)\alpha - T^+(u_2,u)\beta R^+(u_2,u). \tag{4.8}$$
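The step from (4.6) to (4.7) and (4.8) can be spelled out as follows. This is a sketch of the usual invariant imbedding argument, reconstructed here from (4.4) and (4.6) under the stated condition $\phi^-(u_2) = 0$; it is not quoted from the references.

```latex
% With \phi^-(u_2) = 0, the two rows of (4.6) taken at u_1 = u read
\phi^-(u) = R^+(u_2,u)\,\phi^+(u), \qquad
\phi^+(u_2) = T^+(u_2,u)\,\phi^+(u).
% Differentiate the first relation with respect to u and insert (4.4):
\gamma\phi^+ + \delta R^+\phi^+
  = (\partial_u R^+)\,\phi^+ + R^+\,(\alpha\phi^+ + \beta R^+\phi^+).
% Since \phi^+(u) is arbitrary, this gives (4.7):
\partial_u R^+ = \gamma + \delta R^+ - R^+\alpha - R^+\beta R^+ .
% The left-hand side of the second relation does not depend on u, so
0 = (\partial_u T^+)\,\phi^+ + T^+\,(\alpha\phi^+ + \beta R^+\phi^+),
% which gives (4.8):
\partial_u T^+ = -T^+\alpha - T^+\beta R^+ .
```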

The stability properties of (4.7) and (4.8) are of central importance. In the flat regions where $B = B_+$ or $B_-$ we have $C = B$ and $\partial_u C^{-1} = 0$, implying that $\beta = \gamma = 0$ and $\alpha = -\delta = iB$. Similarly (4.7) and (4.8) reduce to $\partial_u X^+ = -iBX^+$, $X^+ = R^+$ or $T^+$, equations which are well-posed for marching to the left. The initial values to accompany (4.7) and (4.8) are $R^+(u_2,u_2) = 0$ and $T^+(u_2,u_2) = I$, where $I$ is the identity operator.

We choose $C = B_- + f(u)(B_+ - B_-)$, which is independent of $v$. Here $f$ is increasing and smooth with $\lim_{u\to-\infty} f(u) = 0$ and $\lim_{u\to\infty} f(u) = 1$.
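The marching procedure for (4.7) and (4.8) can be sketched for a single mode, where all operators reduce to complex numbers. The sketch below makes several simplifying assumptions that are not in the paper: $\mu$ is taken to depend on $u$ only, the switch function is $f(u) = \tfrac{1}{2}(1 + \tanh u)$, and a hand-written classical Runge-Kutta step stands in for the MATLAB solver mentioned in the abstract. All names and numerical values are illustrative.

```python
import math

K = 6.0          # wavenumber k (assumed value)
A = 1.0          # tube width a
MU_PLUS = 0.36   # mu_+ = (b/a)^2 with b/a = 0.6; mu_- = 1
B_MINUS = math.sqrt(K ** 2 * 1.0 - (math.pi / A) ** 2)
B_PLUS = math.sqrt(K ** 2 * MU_PLUS - (math.pi / A) ** 2)


def f(u):
    """Smooth increasing switch, f(-inf) = 0 and f(+inf) = 1 (assumed form)."""
    return 0.5 * (1.0 + math.tanh(u))


def coefficients(u):
    """Scalar alpha, beta, gamma, delta of (4.5) for the lowest mode."""
    mu = 1.0 + (MU_PLUS - 1.0) * f(u)        # assumed v-independent mu
    b2 = K ** 2 * mu - (math.pi / A) ** 2    # B^2(u) for n = 1
    c = B_MINUS + f(u) * (B_PLUS - B_MINUS)  # C(u) as chosen in the text
    dc = 0.5 * (1.0 - math.tanh(u) ** 2) * (B_PLUS - B_MINUS)
    w = -dc / c                              # (d/du C^{-1}) C for scalars
    alpha = 0.5 * (w + 1j * b2 / c + 1j * c)
    beta = 0.5 * (-w + 1j * b2 / c - 1j * c)
    gamma = 0.5 * (-w - 1j * b2 / c + 1j * c)
    delta = 0.5 * (w - 1j * b2 / c - 1j * c)
    return alpha, beta, gamma, delta


def rhs(u, r, t):
    """Right-hand sides of the Riccati equation (4.7) and of (4.8)."""
    alpha, beta, gamma, delta = coefficients(u)
    dr = gamma + delta * r - r * alpha - r * beta * r
    dt = -t * alpha - t * beta * r
    return dr, dt


def march_left(u2, u1, steps):
    """Classical RK4 from u2 down to u1 with R(u2) = 0 and T(u2) = 1."""
    h = (u1 - u2) / steps  # negative step: marching to the left
    u, r, t = u2, 0j, 1 + 0j
    for _ in range(steps):
        k1r, k1t = rhs(u, r, t)
        k2r, k2t = rhs(u + h / 2, r + h / 2 * k1r, t + h / 2 * k1t)
        k3r, k3t = rhs(u + h / 2, r + h / 2 * k2r, t + h / 2 * k2t)
        k4r, k4t = rhs(u + h, r + h * k3r, t + h * k3t)
        r += h / 6 * (k1r + 2 * k2r + 2 * k3r + k4r)
        t += h / 6 * (k1t + 2 * k2t + 2 * k3t + k4t)
        u += h
    return r, t


R, T = march_left(10.0, -10.0, 8000)
flux_in = B_MINUS * (1.0 - abs(R) ** 2)
flux_out = B_PLUS * abs(T) ** 2
print(abs(R), abs(T), flux_in - flux_out)
```

The printed flux difference serves as a consistency check: for real $B^2$ and real $C$ the net flux $C(|\phi^+|^2 - |\phi^-|^2)$ is independent of $u$, so the two values should agree up to the discretisation error.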

5 Solution of the scattering problem

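For the numerical solution of the scattering operator we expand $\phi$ in a Fourier sine series and $\mu$ in a Fourier cosine series:

$$\begin{cases} \phi(u,v) = \sum_{n=1}^{\infty} \phi_n(u)\varphi_n(v), \\ \mu(u,v) = \sum_{n=0}^{\infty} \mu_n(u)\xi_n(v), \end{cases} \tag{5.1}$$

where $\xi_n(v) = \cos(n\pi v/a)$. Using the notation $\phi = (\phi_1, \phi_2, \ldots)^T$ we find that

$$\partial_u^2\phi(u) + B^2(u)\phi(u) = 0. \tag{5.2}$$

The matrix elements of $B^2(u)$ are given by

$$B^2(u)_{nm} = \frac{k^2}{2}\left[-\mu_{m+n}(u) + \mu_{m-n}(u) + \mu_{n-m}(u)\right] - \frac{n^2\pi^2}{a^2}\delta_{nm} \tag{5.3}$$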

and it is understood in (5.3) that $\mu_l(u) = 0$ for negative $l$.

For the tube in the physical $\zeta$-plane we require that locally both the potential and the kinetic part of the energy are finite, that is both $\int_X |\psi|^2\,dx\,dy < \infty$ and $\int_X |\nabla\psi|^2\,dx\,dy < \infty$ for all finite regions $X$ inside the tube. We say that $\psi$ belongs to the Sobolev space $H^1_{loc}$, meaning that $\psi$ and its first derivatives are locally square integrable. Transformed to the straight duct the local finite energy requirement means $\int_U |\phi|^2\mu\,du\,dv < \infty$ and $\int_U |\nabla\phi|^2\,du\,dv < \infty$ for all finite regions $U$ inside the tube. For a smooth boundary $\phi$ is more regular: it follows from the theory of Grisvard12 that also the second derivatives of $\phi$ are square integrable, that is $\phi \in H^2_{loc}$. According to a trace theorem13, $\phi \in H^2_{loc}$ implies that $\phi(u,\cdot) \in H^{3/2}(0,a)$, meaning that up to 3/2 derivatives are square integrable. To interpret this regularity with fractional derivatives we define, following Taylor13, the function space

$$D_s = \Big\{ f \in L^2(0,a);\ \sum_{n=1}^{\infty} |f_n|^2 (1+n^2)^s < \infty \Big\}, \quad s \ge 0, \tag{5.4}$$

with $f = \sum_{n=1}^{\infty} f_n\varphi_n$ and $f_n = (f,\varphi_n)/(\varphi_n,\varphi_n)$. $D_s$ is a Hilbert space with the norm

$$\|f\|^2_{D_s} = (f,f) = \sum_{n=1}^{\infty} |f_n|^2 (1+n^2)^s. \tag{5.5}$$

Taylor13 shows that $D_0 = L^2(0,a)$, $D_1 = H^1_0(0,a)$, $D_2 = H^2(0,a) \cap H^1_0(0,a)$ and that $\partial_v D_s = D_{s-1}$, $s \ge 1$. In this terminology we have that for a smooth boundary $\phi(u,\cdot) \in D_{3/2}$.

The operator $\partial_v^2$ is self-adjoint on $D_{3/2}$. Thus, we may define $B_\pm$ by

$$B_\pm f = \sum_{n=1}^{\infty} \sqrt{k^2\mu_\pm - n^2\pi^2/a^2}\; f_n\varphi_n, \tag{5.6}$$

assuming that the branch $\operatorname{Im} \ge 0$ of the square root is taken. It is clear that $T^+$, $R^-$, $R^+$ and $T^-$ are mappings $D_{3/2} \to D_{3/2}$ and $B_\pm: D_s \to D_{s-1}$, $s \ge 1$.

For tubes with edges in the $\zeta$-duct things are a little more complicated. With no restriction on the sharpness of the edges we cannot improve on $\phi \in H^1_{loc}$, implying $\phi(u,\cdot) \in D_{1/2}$. Then, as an intermediate step in our calculations, $B_\pm\phi$ should be in the space $D_{-1/2}$. Such a derivative must of course be interpreted as a distribution. However, the end result, i.e. the scattered wave function, belongs to $D_{1/2}$. To generalise we define by duality for positive $s$

$$D_{-s} = \Big\{ g;\ \int_0^a f(v)g(v)\,dv < \infty \text{ for all } f \in D_s \Big\}.$$

Multiplication by $\mu$ is an operator $D_{1/2} \to D_{-1/2}$, and if $s \ge 1/2$ we have the following mapping properties: $B_\pm: D_s \to D_{s-1}$, $\partial_v: D_s \to D_{s-1}$, and $T^+$, $R^-$, $R^+$ and $T^-$ are mappings $D_s \to D_s$.


The equations (4.7)-(4.8) can only in very special cases be solved in closed form. Therefore some type of numerical scheme is used. Generally a numerical method cannot give uniform convergence for the entire space $D_s$. In a practical application it is usually sufficient to know the effect of the scattering matrices on the lowest eigenfunctions, the first $N_0$, say. A practical method is therefore to truncate the matrix representation of (4.7)-(4.8) to $N \gg N_0$ and solve the finite-dimensional ordinary differential equation with a standard numerical routine. Nilsson3 proves that such a procedure converges when $N \to \infty$.
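The truncation step can be illustrated by assembling the $N \times N$ matrix of $B^2(u)$ at one fixed $u$ from the cosine coefficients of $\mu$. The sketch assumes that (5.3) reads $B^2_{nm} = \tfrac{k^2}{2}[-\mu_{m+n} + \mu_{m-n} + \mu_{n-m}] - \tfrac{n^2\pi^2}{a^2}\delta_{nm}$ with $\mu_l = 0$ for negative $l$, which is what projecting $\mu\,\varphi_m$ onto $\varphi_n$ gives. The function name `b2_matrix` and the coefficient values are hypothetical.

```python
import math


def b2_matrix(mu_coeffs, k, a, size):
    """Truncated matrix of B^2(u) in the sine basis at one fixed u.

    mu_coeffs[l] is the cosine coefficient mu_l(u). Coefficients with
    negative index, or beyond the end of the list, count as zero.
    """
    def mu(l):
        return mu_coeffs[l] if 0 <= l < len(mu_coeffs) else 0.0

    matrix = []
    for n in range(1, size + 1):
        row = []
        for m in range(1, size + 1):
            entry = 0.5 * k * k * (-mu(m + n) + mu(m - n) + mu(n - m))
            if n == m:
                entry -= (n * math.pi / a) ** 2
            row.append(entry)
        matrix.append(row)
    return matrix


# In a flat region with mu = 1 the matrix is diagonal: k^2 - n^2 pi^2 / a^2.
for row in b2_matrix([1.0], 4.0, 1.0, 3):
    print(row)
```

With such a matrix at each $u$, the operators in (4.5) become $N \times N$ matrices and (4.7)-(4.8) become a finite system of ordinary differential equations.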

Presently, numerical results are not available for the quantum tube scattering. However, Nilsson 3 presents results for the acoustic case where the Neumann rather than the Dirichlet boundary condition applies. He reports that for the lowest order reflection coefficient N = 1, i.e. a scalar solution, is accurate up to ka = 1.5, N = 2 gives a good and N = 5 gives a perfect description up to ka = 6. Energy conservation holds for all N.

References

1. J. T. Londergan, J. P. Carini, D. P. Murdock, Binding and scattering in two-dimensional systems - Applications to quantum wires, waveguides and photonic crystals. Lecture notes in physics (Berlin, Springer, 1999).

2. K. Lin, R. L. Jaffe, Bound states and threshold resonances in quantum wires with circular bends. Phys. Rev. B54, 5750-5762 (1996).

3. B. Nilsson, Acoustic transmission in curved ducts with varying cross-sections. Article submitted to Proc. Roy. Soc. A.

4. J. C. Wu, M. N. Wybourne, W. Yindeepol, A. Weisshaar, S. M. Goodnick, Interference phenomena due to a double bend in a quantum wire. Appl. Phys. Lett. 59, 102-104 (1991).

5. J. Davies, The Physics of low-dimensional semiconductors (Cambridge, Cambridge University press, 1998).

6. M. Cessenat, Mathematical methods in electromagnetism (Singapore, World Scientific Publishing Co., 1996).

7. B. Nilsson, O. Brander, The propagation of sound in cylindrical ducts with mean flow and bulk reacting lining - IV. Several interacting discontinuities. IMA J. Appl. Math 27, 263-289 (1981).

8. H. Wu, D. W. L. Sprung, J. Martorell, Periodic quantum wires and their quasi-one-dimensional nature. J. Phys. D: Appl. Phys. 26, 798-803 (1993).

9. P. Henrici, Applied and computational complex analysis. Volume I (New York, John Wiley & Sons, 1988).

10. L. Fishman, One-way propagation methods in direct and inverse scalar wave propagation modeling. Radio Science 28(5), 865-876 (1993).

11. R. Bellman, G. M. Wing, An introduction to invariant imbedding. Classics in Applied Mathematics, 8. Society for Industrial and Applied Mathematics (SIAM), Philadelphia, 1992.

12. P. Grisvard, Elliptic problems in nonsmooth domains. Monographs and studies in mathematics, 24 (Boston, Pitman, 1985).

13. M. Taylor, Partial differential equations I. Basic theory. Applied mathematics sciences, 115 (New York: Springer, 1996).


Figure 1: Two-dimensional quantum tube

Layer labels in figure 2: doped AlGaAs, undoped AlGaAs, undoped GaAs, semi-insulating GaAs.

Figure 2: Schematic picture of heterostructure and split-gate structure.

Figure 3: Sub-tube with interior $\Omega'$, upper boundary $\Gamma'_+$ and lower boundary $\Gamma'_-$. $b/a = 0.6$.

Figure 4: $\mu(u,v)$ in the straight duct. Parameters as in figure 3. $\mu^{1/2}$ is the refractive index.

POSITION EIGENSTATES AND THE STATISTICAL AXIOM OF QUANTUM MECHANICS

L. POLLEY Physics Dept, Oldenburg University, 26111 Oldenburg, Germany

E-mail: polley@uni-oldenburg.de

Quantum mechanics postulates the existence of states determined by a particle position at a single time. This very concept, in conjunction with superposition, induces much of the quantum-mechanical structure. In particular, it implies the time evolution to obey the Schrödinger equation, and it can be used to complete a truly basic derivation of the statistical axiom as recently proposed by Deutsch.

1 Quantum probabilities according to Deutsch

A basic argument to see why quantum-mechanical probabilities must be squares of amplitudes (statistical axiom) was given by Deutsch1,2. It is independent of the many-worlds interpretation. Deutsch considers a superposition of the form

$$\sqrt{\frac{m}{m+n}}\,|A\rangle + \sqrt{\frac{n}{m+n}}\,|B\rangle \tag{1.1}$$

He introduces an auxiliary degree of freedom, $i = 1, \ldots, m+n$, and replaces $|A\rangle$ and $|B\rangle$ by normalized superpositions,

$$|A\rangle \to \frac{1}{\sqrt{m}} \sum_{i=1}^{m} |A\rangle|i\rangle, \qquad |B\rangle \to \frac{1}{\sqrt{n}} \sum_{i=m+1}^{m+n} |B\rangle|i\rangle \tag{1.2}$$

All amplitudes in the grand superposition are equal to $1/\sqrt{m+n}$ and should result in equal probabilities for the detection of the states. This immediately implies the ratio $m : n$ for the probabilities of property A or B.
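The counting in this argument can be checked with a short sketch. It assumes that the initial superposition carries the amplitudes $\sqrt{m/(m+n)}$ and $\sqrt{n/(m+n)}$, which is what makes all amplitudes of the grand superposition equal. The function names are hypothetical.

```python
import math


def grand_superposition(m, n):
    """Amplitudes after replacing |A> and |B> by normalized superpositions.

    Start from sqrt(m/(m+n)) |A> + sqrt(n/(m+n)) |B>, then spread |A> over
    m auxiliary states and |B> over n auxiliary states.
    """
    amp_a = math.sqrt(m / (m + n)) / math.sqrt(m)  # each term |A>|i>
    amp_b = math.sqrt(n / (m + n)) / math.sqrt(n)  # each term |B>|i>
    return [("A", amp_a)] * m + [("B", amp_b)] * n


def branch_counts(m, n):
    """Count the equal-amplitude branches carrying property A and property B."""
    terms = grand_superposition(m, n)
    count_a = sum(1 for label, _ in terms if label == "A")
    return count_a, len(terms) - count_a


terms = grand_superposition(3, 5)
print([round(amp, 6) for _, amp in terms])
print(branch_counts(3, 5))
```

If equal amplitudes are granted equal probabilities, the branch counts give the ratio $m : n$ directly, without reference to squared amplitudes.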

The argument has clear advantages over previous derivations of the statistical axiom. Gleason's theorem3,4, for example, is mathematically non-trivial and not well received by many physicists, while von Neumann's assumption $\langle O_1 + O_2\rangle = \langle O_1\rangle + \langle O_2\rangle$ about expectations of observables 5,6 is difficult to interpret physicswise if $O_1$ and $O_2$ are non-commuting4,5.

However, Deutsch's argument relies in an essential way on the unitarity of the replacement, or the normalization of any physical state vector. Why should a state vector be "normalized" in the usual sense of summing the squares of amplitudes? It would seem desirable to provide justification for this beyond its being "natural" 2. In fact, the reasoning would appear circular without an extra argument about unitarity or normalization. I have proposed 7 to realize the "replacement" (1.2) physically by the time evolution of a suitable device. Then, what can be said about quantum-mechanical evolution without anticipating the unitarity?

2 Schrödinger's equation for a free particle as a consequence of position eigenstates

For free particles, a well-known and elegant way to obtain the Schrödinger equation is via unitary representations of space-time symmetries. Interactions can be introduced via the principle of local gauge invariance. However, this approach to the equation anticipates unitarity.

As I pointed out recently8, the Schrödinger equation for a free scalar particle is also a consequence of the very concept of a position eigenstate^a in discretized space. To an extent, this just means to regard "hopping amplitudes", as they are familiar from solid state theory, as a priori quantum-dynamical entities. The point is to show, however, that a hopping-parameter scenario without unitarity would lead to consequences sufficiently absurd to imply that unitarity must be a property of the physical system. As will be seen below, the absurdity is that a wave-function that makes perfect sense at t = 0 would cease to exist anywhere in space at an earlier or later time.

Consider a spinless particle "hopping" on a 1-dimensional chain of positions x = na where n is integer and a is the lattice spacing.

[Diagram: chain of lattice sites ..., n-1, n, n+1, ... with lattice spacing a]

Assume the particle is in an eigenstate $|n,t\rangle$ of position number $n$ at time $t$ (using the Heisenberg picture), and it has a possibility to change its position. The information given by a "position at one time" does not determine which direction the particle should go. Thus the eigenstate $|n,t\rangle$ necessarily is a superposition when expressed in terms of eigenstates relating to another time $t'$. Moreover, because of the same lack of information, positions to the left and right will have to occur symmetrically. If $t' \to t$, only nearest neighbours will be involved. Thus we expect a "hopping equation" of the form

$$|n,t\rangle = \alpha\,|n,t'\rangle + \beta\,|n+1,t'\rangle + \beta\,|n-1,t'\rangle$$

This can be rewritten as a differential equation in $t$,

$$-i\frac{d}{dt}|n,t\rangle = V\,|n,t\rangle + \kappa\,|n+1,t\rangle + \kappa\,|n-1,t\rangle \qquad \kappa, V \text{ complex (so far)}$$

^a Which relies on linear algebra, hence includes the concept of "superposition".

Parameters $\alpha, \beta$ and $\kappa, V$ are in an algebraic relation8 which need not concern us here. To obtain an equation for a wave-function we consider a general state $|\psi\rangle$ composed of simultaneous position eigenstates,

$$|\psi\rangle = \sum_n \psi(n,t)\,|n,t\rangle \qquad \text{(Heisenberg picture)}$$

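This defines the coefficients $\psi(n,t)$ for all $t$. Now take the time derivative on both sides, identify $\psi(n,t)$ with a function $\psi(x,t)$ where $x = na$, and Taylor-expand the shifted values $\psi(x \pm a, t)$. This results in a differential equation for $\psi(x,t)$ in which the shifted values contribute $\kappa[\psi(x+a,t) + \psi(x-a,t)] = 2\kappa\,\psi + \kappa a^2\,\partial_x^2\psi + O(a^4)$.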

Finally, take $a \to 0$ on the relevant physical scale. The spatial spreading of the wave-function is then given by the $a^2$ term, and the solution of the equation is

$$\psi(x,t) = e^{-i(V+2\kappa)t} \int \tilde\psi(p)\, e^{ipx}\, e^{-ia^2\kappa p^2 t}\, dp$$

This time evolution would be unitary if $\kappa$ and $V$ were real. Hence, consider the consequences of a non-real $\kappa$. The integrand would then contain an evolution factor increasing towards positive or negative times like

$$\exp\left(\pm a^2\, \operatorname{Im}\kappa\; p^2 t\right)$$

This would lead to physically absurd conclusions about certain "harmless" wave-functions, like the Lorentz-shape function $\psi(x) = 1/(1 + x^2)$:

• For $\operatorname{Im}\kappa > 0$, the "harmless" function $\tilde\psi(p) \propto \exp(-|p|)$ would not exist anywhere in space after a short while.

• For $\operatorname{Im}\kappa < 0$, the "harmless" function could not be prepared for an experiment to be carried out on it after a short while.
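Why the Lorentz-shape function is destroyed can be seen from the exponent of the Fourier integrand alone. The sketch below takes $\tilde\psi(p) = \exp(-|p|)$ and the evolution factor with the plus sign, $\exp(a^2\,\operatorname{Im}\kappa\,p^2 t)$, so that $\operatorname{Im}\kappa > 0$ with $t > 0$ is the diverging case of the first bullet. Function names and numerical values are illustrative assumptions.

```python
def log_weight(p, t, a, im_kappa):
    """log of |exp(-|p|) * exp(a^2 Im(kappa) p^2 t)| at momentum p."""
    return -abs(p) + a * a * im_kappa * p * p * t


def integral_converges(t, a, im_kappa):
    """The Fourier integral exists only if the p^2 coefficient is not positive."""
    growth = a * a * im_kappa * t  # coefficient of p^2 in the exponent
    return growth <= 0.0


a, im_kappa = 1.0, 0.5
for t in (-0.1, 0.0, 0.1):
    print(t, integral_converges(t, a, im_kappa),
          log_weight(10.0, t, a, im_kappa), log_weight(100.0, t, a, im_kappa))
```

For any positive coefficient of $p^2$, however small, the quadratic term eventually dominates the linear decay $-|p|$, which is why the wave-function ceases to exist immediately rather than gradually.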

In a mathematical sense, of course, it still remains a postulate that the value of K be real. But physicswise, it does seem that unitarity of quantum mechanics is unavoidable once the superposition principle and the concept of position eigenstate are taken for granted.

As for parameter $V$, the factor $e^{-iVt}$ would be raised to the $n$th power in an $n$-particle state, and would lead to an absurdity similar to the above with certain superpositions of $n$-particle states unless $V$ is real, too.

3 Driven particle: Weyl equation in general space-time

As an example of a particle interacting with external fields we may consider a massless spin 1/2 particle with inhomogeneous hopping conditions8. Here the starting point is common eigenstates of spin and position, where "position" refers to a site on a cubic spatial lattice. A particle in such a state at time t will be in a superposition of neighbouring positions and flipped spins at a time t' « t. In 3 dimensions, and immediately in terms of a wave-function, the corresponding differential equation is

$$-i\,\frac{\partial}{\partial t}\,\psi_s(x,t) = \sum_{n,\,s'} H_{nss'}\,\psi_{s'}(x - an,\,t)$$

with $n$ running over the lattice directions,

where $H_{nss'}$ are any complex amplitudes. On-site hopping (time-like direction) is included as $n = 0$. To begin with, a free particle is defined by translational and rotational symmetry. In this case, the hopping amplitudes reduce to two independent parameters [8], $\epsilon$ and $\kappa$, both of them complex so far. By Taylor-expanding the wave-function and taking $a \to 0$ we find

$$\partial_t \psi_s(x,t) = \epsilon\,\psi_s(x,t) - a\kappa\,\sigma^n_{ss'}\,\partial_n \psi_{s'}(x,t)$$

If $\kappa$ had an imaginary part, it would lead to physical absurdities with the time-evolution of certain "harmless" wave-functions similarly to the previous section. For real $\kappa$, we recover the non-interacting Weyl equation.

If we now allow for "slight" (order of $a$) anisotropies and inhomogeneities in the hopping amplitudes, by adding some $a\,\gamma_{nss'}(x,t)$ to the hopping constants above, we recover a general-relativistic version of the equation [9] with the $\gamma_{nss'}(x,t)$ acting as spin connection coefficients. Unitarity in this context means that the probability current density

$$j^\alpha(x,t) = \psi_s^*(x,t)\,\sigma^\alpha_{ss'}\,\psi_{s'}(x,t)$$

is covariantly conserved:

$$\partial_\alpha j^\alpha + \Gamma^\alpha_{\beta\alpha}\, j^\beta = 0$$

This is found to hold automatically if the vector connection coefficients are identified as usual [9] through the matrix equation

Imposing no constraints on the spin connection coefficients, we are dealing with a metric-affine space-time here, which can have torsion and whose metric may be covariantly non-constant. The study of space-times of this general structure has been motivated by problems of quantum gravity [9]. It may be interesting to note that nothing but propagation by superposing next-neighbour states needs to be assumed here. In particular, scalar products of state vectors are not needed.

Figure 1: An array of eight cavities of equal shape. The initial state is located in the central cavity. When each channel is opened for an appropriate time, the state evolves to an equal-amplitude superposition of the peripheral cavity-states.

4 Realizing Deutsch's "substitution" as a time evolution

Having demonstrated "automatic" unitarity on two rather general examples we can now turn with some confidence to the original issue of completing Deutsch's derivation of the statistical axiom.

To realize the particular substitution (1.2) for state vector (1.1), let us consider a particle with internal eigenstate $|A\rangle$ or $|B\rangle$, such as the polarisations of a photon. Let this particle be placed in a system of cavities (b) connected by channels (Fig. 1) which can be opened selectively for internal state $|A\rangle$ or $|B\rangle$. It will be essential in the following that all cavities are of the same shape, because this will enable us to exploit symmetries to a large extent. The location of the particle in a cavity will serve as the auxiliary degree of freedom as in (1.2), except that $|A\rangle$ and $|B\rangle$ before the substitution will be identified with $|A\rangle|0\rangle$ and $|B\rangle|0\rangle$ where $|0\rangle$ corresponds to the central cavity.

(b) Or Paul traps, or any other sort of potential well; these are to enable us to store away parts of the wave function so that there is no influence on them by the other parts.

Now let only one of the channels be open at a time. We are then dealing with the wave-function dynamics of a two-cavity subsystem, while the rest of the wave-function is standing by. What law of evolution could we expect? A particle with a well-defined (observed) position 0 at time $t$ will no longer have a well-defined position at time $t'$ if we allow it to pass through a channel, without observing it. Thus a state $|0,t\rangle$ defined by position 0 at time $t$ (using the Heisenberg picture) will be a superposition when expressed in terms of position states relating to a different time $t'$. In particular, if channel $0 \leftrightarrow 1$ is the open one,

$$|0,t\rangle = \alpha\,|0,t'\rangle + \beta\,|1,t'\rangle$$

Likewise, by symmetry of arrangement,

$$|1,t\rangle = \alpha\,|1,t'\rangle + \beta\,|0,t'\rangle$$

It follows that $|0,t\rangle \pm |1,t\rangle$ are stationary states whose dependence on time consists in prefactors

$(\alpha \pm \beta)^k$ after $k$ time steps. (4.1)

If the particle is initially in the rest of the cavities, whose channels are shut, we would expect this state not to change with time:

$$|\mathrm{rest},t\rangle = |\mathrm{rest},t'\rangle$$

Now, if (4.1) were not mere phase factors, we could easily construct a superposition of $|0\rangle$, $|1\rangle$, and $|\mathrm{rest}\rangle$ so that, relative to the disconnected cavities, the part of the state vector in the connected cavities would grow indefinitely or vanish in the long run. As there is no physical reason for such an imbalance between the connected and the disconnected cavities, we conclude that

$$\alpha + \beta = e^{i\varphi}, \qquad \alpha - \beta = e^{i\varphi'}$$
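That these two phase conditions make the one-channel step unitary can be checked numerically. The following is a minimal sketch, assuming arbitrary illustrative phases; the function names are hypothetical and it tracks only the two connected cavities.

```python
import cmath
import math

def hopping_amplitudes(phi, phi_prime):
    """Solve alpha + beta = e^{i phi}, alpha - beta = e^{i phi'} for (alpha, beta)."""
    plus = cmath.exp(1j * phi)
    minus = cmath.exp(1j * phi_prime)
    return (plus + minus) / 2, (plus - minus) / 2

def transfer_probability(phi, phi_prime, k):
    """Probability of finding the particle in cavity 1 after k steps, starting in cavity 0."""
    # The combinations |0> +/- |1> pick up (alpha +/- beta)^k, so the k-step
    # amplitude for 0 -> 1 is ((alpha + beta)^k - (alpha - beta)^k) / 2.
    alpha, beta = hopping_amplitudes(phi, phi_prime)
    amp = ((alpha + beta) ** k - (alpha - beta) ** k) / 2
    return abs(amp) ** 2

phi, phi_prime = 0.3, -0.2   # assumed illustrative phases
alpha, beta = hopping_amplitudes(phi, phi_prime)

# Unitarity of the 2x2 step matrix [[alpha, beta], [beta, alpha]]:
norm = abs(alpha) ** 2 + abs(beta) ** 2                               # rows have unit length
overlap = (alpha * beta.conjugate() + beta * alpha.conjugate()).real  # rows are orthogonal

closed_form = math.sin(3 * (phi - phi_prime) / 2) ** 2
print(norm, overlap, transfer_probability(phi, phi_prime, 3), closed_form)
```

In this parametrisation the transferred fraction after $k$ steps is $\sin^2\big(k(\varphi - \varphi')/2\big)$, so the opening time controls how much of the amplitude moves into the neighbouring cavity.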

Having shown evolution through one open channel to be unitary, we can identify an opening time interval [7], $\tau_m$, to realize the following step of the replacement (1.2):

$$\sqrt{m}\,|A\rangle|0\rangle + |\mathrm{rest}\rangle \;\to\; \sqrt{m-1}\,|A\rangle|0\rangle + |A\rangle|1\rangle + |\mathrm{rest}\rangle$$


Here $|\mathrm{rest}\rangle$ stands for state vectors that are decoupled, such as all $|B\rangle|i\rangle$, and all $|A\rangle|i\rangle$ with $i \neq 0,1$. Opening other channels analogously, each one for the appropriate $\tau_m$ and internal state, we produce an equal-amplitude superposition

$$\sum_{i=1}^{m} |A\rangle|i\rangle + \sum_{i=m+1}^{m+n} |B\rangle|i\rangle$$

The probability of finding the particle in a particular cavity is now $1/(m+n)$ as a matter of symmetry. As the internal state is correlated with a cavity by the conduction of the process, the probabilities for A and B immediately follow. These must also be the probabilities for finding A or B in the original state, because properties A and B have remained unchanged during the time evolution.
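The counting step can be spelled out in a few lines. This is a sketch only, with a hypothetical function name and illustrative values of m and n: each of the m + n cavities is equally likely by symmetry, m of them carry internal state A and n carry B.

```python
from fractions import Fraction

def outcome_probabilities(m, n):
    """Probabilities of A and B when m cavities carry |A> and n carry |B>,
    each of the m + n cavities being equally likely by symmetry."""
    per_cavity = Fraction(1, m + n)
    return m * per_cavity, n * per_cavity

p_a, p_b = outcome_probabilities(2, 3)   # m = 2, n = 3 are illustrative values
print(p_a, p_b)   # 2/5 3/5
```

Exact fractions are used so that the two probabilities visibly sum to one without any normalization step.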

5 Can normalization be replaced by symmetry?

An interesting side effect of the above realization of Deutsch's argument is that state vectors need no longer be normalized at all. Permutational symmetry of a superposition suffices to show that all possible outcomes of an experiment must occur with equal frequency. Then the numerical values of the probabilities are fully determined. This feature of quantum probabilities may be relevant to problems of normalization in quantum gravity [10], such as the non-locality of summing $|\psi|^2$ over all of space, or the non-normalizability of the solutions of the Wheeler-DeWitt equation.

References

1. D. Deutsch, Proc. Roy. Soc. Lond. A 455, 3129 (1999); Oxford preprint (1989).

2. B. DeWitt, Int. J. Mod. Phys. 13, 1881 (1998).

3. A. M. Gleason, J. Math. Mech. 6, 885 (1957).

4. A. Peres, Quantum Theory (Kluwer Academic Publishers, Dordrecht, 1995).

5. J. von Neumann, Mathematische Grundlagen der Quantenmechanik (Springer, Berlin-New-York, 1932).

6. A. Bohr, O. Ulfbeck, Rev. Mod. Phys. 67, 1 (1995).

7. L. Polley, quant-ph/9906124.

8. L. Polley, quant-ph/0005051.

9. F. W. Hehl et al., Rev. Mod. Phys. 48, 393 (1976); Phys. Rep. 258, 1 (1995).

10. A. Ashtekar (ed.), Conceptual problems of quantum gravity (Birkhäuser, 1991).


IS RANDOM EVENT THE CORE QUESTION? SOME REMARKS AND A PROPOSAL

P. ROCCHI

IBM, via Shangai 53, 00144 Roma, Italy E-mail: paolorocchi@it.ibm.com

This work addresses the Probability Calculus foundations. We begin with considering the relations of the event models today in use with the physical reality. Then we propose the structural model of the event and a definition of probability that harmonizes the interpretations sustained by different probabilistic schools.

1 Preface

The origin of the Probability Calculus is credited to Pascal, who applied rigorous methods to a subject that until then had been grasped only by gamblers and unreliable individuals. He intended to lay the foundations of a new Geometry, and the random event should be a "point" in this hypothetical abstract science. Throughout the centuries several scientists have shared Pascal's conjecture, which has been accepted without discussion. In our opinion, instead, an exhaustive and systematic approach to probability requires us to investigate the random event before examining probability itself. The probability theories do not diverge in their final results and do not provide different formulas for the total probability and the conditional probability; they are in contrast on the foundations, namely on the initial concepts, and this circumstance seems to us a substantial reason to study the random event.

In brief we may say that the probability theories use two main models of the random event: the linguistic model and the set model. We shall examine them in the ensuing sections. However, we do not restrict our work to mere criticism; we also trace a theoretical proposal. It provides a new mathematical model of the random event and a definition of probability which seems capable of harmonizing the various authors who appear today in contrast: Kolmogorov and the frequentists, the subjectivist and objectivist schools, etc. In this article we present a few elements taken from the complete theoretical framework [11].

2 Linguistic Model

In general different sentences can describe the same random event. Let the propositions p, q, ... regard one event and verify the equivalence relationship


p ⇔ q (1)

They form the equivalence class X

X={p,q,...} (2)

that constitutes the model of the random event so that we have

P = P(X) (3)

We share the opinion that random events are extremely complex and the linguistic model (2) is consistent with this feature. Disciplines which investigate complicated phenomena such as psychology and sociology, business management and medicine, adopt the linguistic representation and consider other schemes to be too simple and reductive. The proposition seems an adequate model except for the following perplexity. Each primitive is a simple idea and can be left to intuition only because of this fundamental property. For example a number, a point, an entity are elementary concepts. Can we declare that the random event is complex and at the same time assume it is a primary concept? The acknowledgement of the complexity opposes the primitive assumption. This contrast would at least require an in-depth justification, which is lacking as far as we know.

The inconsistency is confirmed in everyday practice, and we examine the linguistic model in relation to the facts.

2.1) - Some subjectivists declare that each particular of the event should be described in order to make evident its uniqueness whereas in usual calculations we accept a sentence such as

"The coin comes down heads " (4)

Note that only two items are reported: the coin and the result. The precise date, time, place and all the particulars that make the event unique and unrepeatable, remain implicit. In fact the parts of a probabilistic event are not easy to distinguish and to relate in a sentence. In conclusion a gap exists between the theoretical assertions and the practical applications of (2).

2.2) - In the Logic of Predicates every phrase has a precise meaning and is liable to be calculated. Programmers using Prolog and Lisp, develop inferences. Logical programs can deduce the thesis from the hypothesis using precise clauses. However this linguistic precision constitutes an exception and normally the natural language is approximate to the extent that a word must be interpreted. The natural language usually represents a random event in generic terms whereas the linguistic model (2) should be liable to the probability calculation (3).


3 Ensemble Model

The axiomatic theory [8] assumes that the sample space Ω includes all the possible elementary events. Kolmogorov defines the random event X as a set of particular events Ex

X = {Ex} (5)

when X is a subset of Ω

X ⊂ Ω (6)

and the probability is the measure of X

P = P(X) (7)

The practical application of the theory is immediately clarified by Kolmogorov who defines X as the "result" of the event.

3.1) - This conception causes some perplexities in the light of modern systemic studies. Applied and theoretical works on systems [7] assume the event as the dynamic producing the result from the antecedent item

input → EVENT → output (8)

The result is a part and the event is the whole. The properties of the event are evidently quite different from the properties of the output. We encounter heavy difficulties when we call {Ex} a "set of events" and at the same time conceive it as a "set of results". We cannot merge them without a logical justification: but do we have any?

3.2) - Some probabilistic outcomes cannot be properly modeled as sets and subsets. The spectrum of interference in the "two slit experiment" is a well-known case emerging in Quantum Physics [6].


4 Structural Model

We searched for a solution to the difficulties described above and we designed a theoretical framework based on the structural model of the random event.

Ludwig von Bertalanffy, father of the General Systems Theory, conceives a system, and consequently an event, as an intricate set of items which affect one another [2]. Interacting and connecting is the essential character and the inner nature of events, and we take this idea as the basis of our theoretical proposal. We make the following assumption

Axiom 4.1) - The idea of relating, of connecting, of linking is a primitive.

This idea suggests two elements specialized in relating and in being related that we call entity and relationship. We define them as follows.

Definition 4.2) - The relationship R connects the entities and we say R has the property of connecting.

Definition 4.3) - The entity E is connected by R and we say E has the property of being connected.

Intuitively, we may say R is the "active" element and E is the "passive" one. They are symmetric, complementary and complete since they exhaust the applications of Axiom 4.1). Relationships and entities are already known in Algebra as operations and elements; as arrows and objects; as edges and vertices. The main difference is that all of them are given as primitive, while R and E derive from the axiomatic concept 4.1). In other words, the properties of the relationship and the entity are openly given in 4.2) and 4.3), while they are implicit in other theories. We underline that Axiom 4.1) is not a theoretical refinement and will provide the necessary basis to the ensuing inferences.

From Definitions 4.2) and 4.3) it follows that the relationship R links the entity E and they give the set

S = (E;R) (9)

which is an algebraic structure [4]. In this article we discuss theoretical models with respect to the physical reality, thus we immediately examine how E, R and S provide proper models for events. The parts of an event are entities and relationships. As an example an entity is a die, a spade, heads, tails, a product. The relationship that connects two or more entities is, for instance, a device, a force, a physical interaction [3]. In the physical reality an event is a dynamic phenomenon linking Ein to Eout, and from (9) we can deduce this general structure


S = (Ein, Eout; R) (10)

Using a graph we get

Ein --R--> Eout (11)

R is the pivotal element in (10) and (11), and the structural model represents accurately the facts. In addition we get the following advantages.

1. The result Eout is distinct from the event S. The parts and the whole are logically separate and they give a precise answer to objection 3.1).

2. Relations and entities constitute finite and also infinite sets so that R and E match with both discrete and continuous mathematical formalism.

3. When Eout is an ensemble

Eout = {Ex} (12)

Eout ⊆ Ω (13)

the structure accomplishes the set model in (5) and (6).

4. The result Eout may also be a rational or an irrational number, a real or an imaginary value. It can be calculated by a wave function or by another function etc. and we can offer a formal solution to point 3.2).

5. The structure S can include the comprehensive context of the probabilistic event. E.g. the atomic experiment depends upon the observer Eo and we have this exhaustive structure

S = (Ein, Eout, Eo; R) (14)

We believe that the structural model can give a contribution to Quantum Probability.

6. A simple sentence includes nouns that are entities and a verb representing a dynamical evolution. E.g. (4) expresses the following entities and relationship

"The coin | comes down | heads"
   Ein         R         Eout     (15)

In short the algebraic structure accomplishes the linguistic model. However a sentence can be equivocal whereas the structure S is a rigorous formalism and answers point 2.2).

Note that the set (9) has the associative/dissociative property, namely the event is a unicum S; then it is defined in terms of the details E and R. If this analysis is insufficient we reveal the entities (E1, E2, ..., Em) and the relations (R1, R2, ..., Rp); these are exploded at a greater level, and so forth. The structure of levels is the complete and rigorous model of any event

S =
= (E; R) =
= (E1, E2, ..., Em; R1, R2, ..., Rp) =
= (E11, E12, ..., Em1, Em2, ..., Emk; R11, R12, ..., Rp1, Rp2, ..., Rph) (16)

The structure can also be written as

level 0   S
level 1   E; R
level 2   E1, E2, ..., Em; R1, R2, ..., Rp
level 3   E11, E12, ..., Em1, Em2, ..., Emk; R11, R12, ..., Rp1, Rp2, ..., Rph   (17)
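The structure of levels lends itself to a small tree-shaped data structure. The sketch below is one possible encoding, not part of the original framework: the class names `Node` and `Structure` and the method `levels` are hypothetical, and entities and relationships are decomposed independently of each other.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Node:
    """An entity or a relationship, optionally decomposed into sub-parts."""
    name: str
    parts: List["Node"] = field(default_factory=list)

@dataclass
class Structure:
    """S = (E; R): the top-level entity and relationship of an event."""
    entity: Node
    relationship: Node

    def levels(self):
        """Return (entities, relationships) rows for level 1, 2, ... as in (17)."""
        rows = []
        ents, rels = [self.entity], [self.relationship]
        while ents or rels:
            rows.append(([e.name for e in ents], [r.name for r in rels]))
            # Explode every element of the current level into its sub-parts.
            ents = [p for e in ents for p in e.parts]
            rels = [p for r in rels for p in r.parts]
        return rows

s = Structure(
    entity=Node("E", [Node("E1"), Node("E2")]),
    relationship=Node("R", [Node("R1"), Node("R2")]),
)
print("level 0  S")
for depth, (ents, rels) in enumerate(s.levels(), start=1):
    print(f"level {depth}  {', '.join(ents)}; {', '.join(rels)}")
```

Level 0 is the event S taken as a whole, so the rows returned by `levels` start at level 1.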

The multiple level decomposition is known also as "hierarchical property" in literature [13]. It is applied by professionals in software analysis methodologies [14],[10], it is basic in modern ontology [12] and in various other sectors [1]. The progressive explosion of the event is already known in the Probability Calculus where we use trees connecting the parts and the subparts of a random event. For example an urn contains x red balls, y green balls and z white balls. What is the probability of getting a white and two green balls through three draws?

We consider the drawing Rw of a white ball w and Rg of a green ball g. The winning combinations wgg, gwg, ggw are generated by R1, R2 and R3. Intuitively we write this tree connecting three levels

        R3
      / | \
   Rg, Rg, Rw   (18)

The structure of levels (17) is rigorous and complete. It includes the relations of the event as well as the entities

level 0   S
level 1   g, w; R1+R2+R3
level 2   wgg, gwg, ggw; (Rw,Rg,Rg)+(Rg,Rw,Rg)+(Rg,Rg,Rw)   (19)

Thanks to this completeness the structural model provides some insight into what is involved. In particular if Rx at level k includes the subrelationships of level (k + 1), then Rx connects the entities through these subrelationships. E.g. the structure of levels (19) illustrates the dynamic R1 carried out by (Rw,Rg,Rg) that physically determine the results. The structure (16) proves that any event is composed of precise macromechanisms and micromechanisms. Any event appears like an industrial apparatus, a mechanical clock or an electronic device including various working parts. This operational analysis, which is based on Axiom 4.1), will be fundamental in the next section.
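The urn example can be worked through by enumerating the three winning sequences, each of which corresponds to one of R1, R2, R3 and is carried out by an ordered triple of single draws. The text does not say whether balls are replaced, so this sketch assumes draws without replacement; the ball counts and the function name are illustrative assumptions.

```python
from fractions import Fraction
from itertools import permutations

def draw_sequence_probability(seq, counts):
    """Probability of drawing the colours in seq, in that order, without replacement."""
    remaining = dict(counts)
    total = sum(remaining.values())
    prob = Fraction(1)
    for colour in seq:
        prob *= Fraction(remaining[colour], total)
        remaining[colour] -= 1
        total -= 1
    return prob

counts = {"r": 3, "g": 4, "w": 2}            # assumed x = 3, y = 4, z = 2
winning = sorted(set(permutations("wgg")))   # the sequences ggw, gwg, wgg
p = sum(draw_sequence_probability(s, counts) for s in winning)
print(["".join(s) for s in winning], p)
```

With these counts each winning sequence has the same probability, and their sum agrees with the direct count C(2,1)·C(4,2)/C(9,3).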

5 Certain and Uncertain Structures

Probability is the answer to questions of this kind: Who will win the next football match? Who will be voted in the regional elections? Shall I pass the examinations? Where is the photon now?

These questions prove that probability concerns the particulars of an event that is already known in the whole. We see the overall random phenomenon but, however, we ignore the details that will produce the result. When we ask "who will win the next match ?", we are familiar with the match, we already know the teams which will play, where the match will be held, etc. We master the event, however we do not have the details that will set out the result. Why do we not have details ?

The cognitive difficulties related to the particulars of a random event have several origins. For example memory is generic, the reports are not detailed, the particulars are missing because they are disseminated over a vast area, we meet obstacles in the use of instruments etc.

Ignorance of the microscopic details is sometimes a voluntary choice. Every detail could be observed and yet we decline to know them. For example a company has collected analytical data but the executive managers ignore them and evaluate their average values in taking important decisions. Macroscopic knowledge and unawareness of microscopic items provide a precise method. Statisticians adopt this method, which is absolutely scientific.

Let us translate these concepts into the formalism just introduced. Let the event S have level 1, level 2, up to level q; two cases arise now.


5.1 Certain Structures

The event is entirely described by the relations and the entities of level q. The elements at level (q + 1) do not exist on paper and in the physical reality. This structure, which is wholly defined and complete, is certain. As an example we take a body falling

level 0   S
level 1   Eb, ET; Rf   (20)

The structure includes the body Eb, the Earth ET and the force of gravity Rf at level 1. The elements exhaustively model the event and other elements do not exist in the physical world.

5.2 Uncertain Structures

The event is not entirely described by the relations and the entities of level q. The microelements pertaining to level (q + 1) exist in the physical reality and influence the final results in a decisive way; however, the structure does not include them. We call uncertain (or random) such a structure, which is partial. As an example we take the flipping of a coin. The structure includes the coin Em and the launching/falling dynamics Rm. The entities Et (heads) and Ec (tails) and the relations, which are alternative and produce them, appear at the next level

level 0   S
level 1   Em; Rm
level 2   Et, Ec; Rt+Rc
level 3                      (21)

The subrelationships of Rt and of Rc produce any specific outcome. They are essential since they would enable the calculation of any result and should be listed at level 3 in (21). However they do not appear and the structure (21) is uncertain.

6 Probability

A certain event is entirely explained through the structure of levels. The structure clearly indicates "how" the event runs through q levels which are exhaustive by definition. On the contrary the uncertain structure is incomplete and cannot describe "how" the event runs in the physical reality. Since it is impossible to describe "how" the event functions, the level (q + 1) being unknown, we inquire "when" the event behaves, that is "when" the random event exists in the physical reality. This inquiry reveals a typically physical approach. The problem eludes whoever develops an abstract study. For the pure theoretician the event S, once defined on paper, exists by definition. The practitioner instead knows the great difference between the definition of a model and its experimental observation.

The structure of levels (16) proves that the event S works through R, therefore we measure the ability to connect of the relationship.

Definition 5.1) - When R links the input to the output in the physical reality, the event S is certain and the measure P(R) equals one

P(R)= 1 (22)

When R does not run in the physical reality, S is impossible in the facts and the measure P(R) is zero

P(R) = 0 (23)

If R occasionally runs, P(R) assumes a decimal value. The connection is neither sure nor impossible and R has a value between zero and one

0 < P(R) < 1 (24)

We call probability the measure P(R) of the operation R which extensively indicates the occurrence of S. We can add the ensuing remarks.

1. The relationship R is the precise argument of probability while S is generic.

2. Definition 5.1) is coherent with the common sense on probability as P(R) gauges the possibility or the impossibility of the random event.

3. In some special events we can define the operation using its outcome. Formally we state a univocal relation between Eout and R

Eout => R (25)

and we calculate the probability of the outcome

P(Eout) = P(R) (26)

E.g. The result heads Et appears whenever Rt works and we forecast the chances of a gamble from the possible outputs

P(Et) = P(Rt) = 0.5 (27)


In conclusion if (12), (13) and (26) are true, Definition 5.1) is consistent with Kolmogorov's theory.

4. Certain structures include only certain elements; impossible elements have no sense and are omitted. The unitary value of probability merely confirms what is already related in the levels. For example P(Rf) is one and substantiates the structure of levels (20). Conversely the uncertain structure lacks the lowest elements that are essential, and (24) unveils them. The decimal values of probabilities clarify the intervention of the elements at level (q + 1). For example we ignore the parts of Rt producing the result Et in (21), yet the probability (27) is capable of explaining how they work. Exactly half of the S occurrences is due to the subrelationships of Rt and the other half is activated by the components of Rc. The explicative and predictive values of probability in (24) appear absolutely relevant.

7 Experimental Verification

Our inferences are strictly inspired by experience and Definition 5.1) must be confirmed in the facts. In order to simplify the discussion of practical verification, let the event include either the relationship Ri or NOT Ri at level 2, and level 3 is ignored

level 0   S
level 1   E; R
level 2   Ei, NOT Ei; (Ri + NOT Ri)
level 3                               (28)

The probability P(Ri) expresses the runs of Ri by definition, thus the occurrences gs(Ri) in the sample s verify the theoretical value P(Ri). The more Ri connects, the larger gs(Ri) is; vice versa, the less Ri runs, the smaller gs(Ri) is. However the absolute frequency gs(Ri) exceeds the range [0,1], and we select the relative frequency Fs(Ri) which verifies

0 ≤ Fs(Ri) ≤ 1 (29)

According to this theory the relative frequency must coincide with the probability calculated theoretically; instead, Fs(Ri) does not coincide with P(Ri). Why? Is there perhaps a systematic error in the experiment?

The relationship Ri at level q works by means of its subrelationships at level (q + 1), however we do not know in detail how these behave. In particular a subrelationship at level (q + 1) occurs at random and a finite number of tests does not allow the subrelationships of Ri to maintain their dynamical contribution to Ri. Symmetrically the subrelationships of NOT Ri are not proportional to P(NOT Ri). Every finite sample of tests unbalances Ri and NOT Ri. The occurrences of one group are lower than they ought to be and the occurrences of the other are greater, since the subrelationships are casual. The relative frequencies appear in favour of one group of subrelationships and to the detriment of another. Fs(Ri) and Fs(NOT Ri) are necessarily unreliable and disagree with P(Ri) and P(NOT Ri). We conclude that the correct trial of probability must be extended over the universe where the subrelationships of Ri and of NOT Ri do not undergo limitations. The ideal experimentation of P(Ri), which excludes any deforming influence and provides the unaltered value of Fs(Ri), requires the number Gs of tests to be infinite

Gs = ∞ (30)

In this situation the theoretical value P(Ri) and the experimental one coincide

|Fs(Ri) - P(Ri)| = 0 (31)

The ideal experiment (30) is unattainable, therefore we can only approach it. We define this approximation using the limit

lim |Fs(Ri) - P(Ri)| = 0   for Gs → ∞ (32)

The limit affirms that, given the high number N, there is a value Gs

Gs > N (33)

such that

|Fs(Ri) - P(Ri)| < 1/Gs (34)

In other words we repeat the tests a "sufficiently high" number of times and the difference between the frequency and the probability will be less than the "small" number 1/Gs. The limit (32) ensures a result as fine as desired. It proves that the probability defined by (22), (23), (24) is verifiable in the facts and confirms that the present theory has substance.
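The approach of the relative frequency to the probability can be watched in a simulation. This is a minimal sketch, assuming independent tests and using a seeded pseudo-random generator as a stand-in for the unknown subrelationships at level (q + 1); the function name and the sample sizes are illustrative assumptions.

```python
import random

def relative_frequency(p, trials, rng):
    """Relative frequency Fs(Ri) of an outcome with probability p in `trials` tests."""
    hits = sum(1 for _ in range(trials) if rng.random() < p)
    return hits / trials

rng = random.Random(0)   # fixed seed so that the run is reproducible
p = 0.5                  # P(Ri) for a fair coin, as in (27)
for gs in (1, 100, 10_000, 100_000):
    fs = relative_frequency(p, gs, rng)
    print(gs, fs, abs(fs - p))
```

With a single test the frequency is necessarily 0 or 1, while for large samples the deviation becomes small. In simulations of this kind the deviation typically shrinks on the order of 1/sqrt(Gs), so the specific bound 1/Gs in (34) is met only by suitably chosen values of Gs rather than by every large sample.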

The limit (32), known as the empirical law of chance or the law of large numbers, does not define probability but explains its experimental verification only. It is less meaningful with respect to the law sustained by frequentists [9] and does not give rise to the same conceptual difficulties. The limit (32) does not use probability to describe the approximation of Fs(Ri) to P(Ri) and avoids a certain conceptual tautology.

8 Objective and Subjective Probability

The limit (32) states that the higher the number of tests the more frequency moves near to probability. Vice versa the smaller the sample, the less reliable is the experimental control of probability. The maximum deviation emerges in a single test and the structural model provides the explanation.

One subrelationship of the level (q + 1) fires the single experiment and this subrelationship pertains to Ri or otherwise pertains to NOT Ri. In both cases the frequency deviates completely from the probability which should be decimal.

Gs   1       > N           ∞
Fs   wrong   approximate   right   (35)

The spectrum (35) is valid in relation to frequency and also in relation to probability: What does this mean?

Any scientific measure takes its meaning under the precise conditions in which it is defined. Therefore a parameter does not have a value for ever but does only in the practical conditions under which it must be tested. And this rule also concerns probability. A fairly simple case can clarify the matter.

We define the force f as the factor causing the acceleration a of the mass m

f = m · a (36)

Mechanics defines the force (36) under conditions which pertain exclusively to the inertial system. This is characterized by the property of being stationary or moving straight and steadily. In the inertial system the mass m undergoes the force and accelerates in accordance with (36). Conversely, the body can move without any mechanical solicitation in the non-inertial reference. The force cannot be tested and definition (36) is meaningless when the system is not inertial.

In general a scientific measure takes on a significance only under the experimental conditions pertaining to it and out of this context it objectively has no meaning. The same criterion applies to probability with additional difficulties due to the experimental conditions that are expressed by the limit (32) and are somewhat complex. We do not have two alternative and mutually exclusive reference systems, inertial and non-inertial; instead we have the continuous spectrum (35). Probability is correctly tested and thus takes on a right and objective significance when

Gs = ∞ (37)

This is unattainable and we use a large sample

Gs >N (38)

The higher the number of tests, the more objective the probability verification. Probability loses significance the more Gs decreases. The test is absolutely meaningless when

Gs = 1 (39)

Probability is very useful (see point 3 in section 6) and we calculate P(R) even if (39) is true. In the single event however "the probability does not exist" as De Finetti paradoxically states [5]. Probability can only orientate the personal expectation, namely probability takes on a subjective significance.

Gs   1            > N           ∞
Fs   wrong        approximate   right
P    subjective                 objective   (40)

Note that the subjectivist schools focus their attention on the single event while the general event is a repetition of single events. These remarks bring to light, once again, that incongruences between various authors take their roots in the modeling of the random event.

In substance Fs(Ri) and P(Ri) have a correct and objective meaning when they refer to the entire inductive base. As the number of experiments decreases, so the precision of Fs(Ri) decreases and the objectivity of P(Ri) decreases progressively to the point (39) in which the numerical value of Fs(Ri) is systematically wrong and the value of P(Ri) is subjective.


9 Conclusions

Our theoretical proposal arose from a critical approach to the probabilistic event, in particular we started with examining the relation between theoretical models today in use and the physical reality. We believe the algebraic structure meets the needs better than the linguistic and the set models. Besides the theoretical appreciations that we listed in the previous pages, we highlight that structures of levels are already applied in several fields and in Probability Calculus too.

The definition of probability that derives from the structural model is consistent with the common sense and with the probabilistic schools. The different interpretations of probability, which today are conflicting, are unified within our framework. We judge this to be a significant feature which may provide a stimulus to the scientific debate.

The reader may find some parts of this paper sketchy and insufficiently explained; we regret the conciseness. Other considerations and further calculations have been developed in [11] but exhaustive discussions cannot be included here.

References

1. Ahl V., Allen T.F.H., Hierarchy theory: a vision, vocabulary and epistemology (Columbia Univ. Press, N.Y., 1996).
2. von Bertalanffy L., General system theory (Braziller, N.Y., 1968).
3. Chen P.S., The entity-relationship model: toward a unified view of data, ACM Transactions on Database Systems, vol. 1, n. 1 (1976).
4. Corry L., Modern algebra and the rise of mathematical structures (Verlag, N.Y., 1996).
5. de Finetti B., Theory of probability (Wiley & Sons, N.Y., 1975).
6. Feynman R., The concept of probability in quantum mechanics, Proceedings Symp. on Math. and Prob., California University Press (1951).
7. Kalman R.E., Falb P.L., Arbib M.A., Topics in mathematical system theory (McGraw, N.Y., 1969).
8. Kolmogorov A.N., Foundations of the theory of probability (Chelsea, N.Y., 1956).
9. von Mises R., The mathematical theory of probability and statistics (Academic Press, London, 1964).
10. Rocchi P., Technology + culture = software (IOS Press, Amsterdam, 2000).
11. Rocchi P., La probabilità è oggettiva o soggettiva? (Pitagora, Bologna, 1998).
12. Uschold M., Building ontologies: toward a unified methodology, Proc. Expert Systems, Cambridge (1996).
13. Takahara Y., Mesarovic M.D., Macko D., Theory of hierarchical, multilevel systems (Academic Press, N.Y., 1970).
14. Yourdon E., Modern structured analysis (Englewood Cliffs, N.Y., 1989).


CONSTRUCTIVE FOUNDATIONS OF RANDOMNESS

V. I. SERDOBOLSKII Moscow 109028, B.Trekhsviatitelskii 3/12, MGIEM E-mail: [email protected]

The ideas of the complexity and randomness are developed in a successively constructive theory. The Kolmogorov complexity is reconsidered as a minimization process. Basic theorems are proved for the processes. A new notion of the complexity based on sequential prefix coding algorithms (S-algorithms) is proposed. It is proved that a constructive infinite binary sequence is algorithmically stationary iff it is an S-encoded random sequence.

1 Introduction

In 1963 A. N. Kolmogorov [1] suggested an algorithmic approach to the foundations of probability. His new definition of probability was based on the notion of complexity, defined as the length of the minimal description: for a binary word x, the complexity function is defined as

$$K(x) = \min_{A(p) = x} |p|, \tag{1}$$

where p are (shorter) binary words and the minimum is evaluated over all possible algorithms A. A remarkable property of this approach was that the randomness thus defined algorithmically was proved to display all the traditional laws of probability. However, the function K(x) defined by (1) in the traditional intuitive approach cannot be effectively calculated since it is not a partially recursive function. In fact, this function is computable only for finitely many words x [2]. In [3] it was shown that K(x) is not partially recursive for any universal algorithm. In [4] the definition (1) was called "a heuristic basis for various approximation". In [5], the authors write that the non-constructive form of the definition (1) leads to some difficulties so that "many important relations hold only to within an error term measured by the logarithm of the complexity". To offer a constructive definition of randomness, it would be desirable to call an infinite sequence random if all initial segments (prefixes) in it are incompressible. However, it was proved [6] that such sequences do not exist. Kolmogorov proposed a definition of randomness (K-randomness), but he wrote that it was to be improved.

In this paper we reconsider fundamental relations of the Kolmogorov complexity theory and develop a successively constructive formalism. The main idea is that, as far as we deal with algorithms, we must explicitly take into account the current time of their performance. Thus, a static notion of minimal description must be replaced by the process of minimization. Here we suggest a rigorous formalism in which it is possible to replace the somewhat obscure intuitive reasoning of the existing complexity theory by a formal investigation of strings of symbols. We present a survey of basic results of the Kolmogorov complexity theory in terms of processes of step-by-step performance of algorithms. We also introduce a new form of complexity based on a restriction to algorithms coding sequentially from left to right (S-algorithms). A constructive infinite binary sequence can be called stationary if the frequencies of all finite blocks of digits in it converge. We prove that a sequence is stationary iff it is the transformation of an incompressible (up to a logarithmic term) sequence by a sequential left-to-right encoding algorithm.

Let us define the objects of the investigation and fix notation. We study binary words x that are finite chains of binary digits and, at the same time, binary numbers. These words are transformed by algorithmic procedures A, which can be represented by Turing algorithms (Turing machines) or, equivalently, by partially computable (partially recursive) functions. We also study infinite sequences x^∞ of binary digits, which can be considered at the same time as infinite sequences of words x of increasing length n, i.e., initial segments of x^∞. In the constructive approach, these sequences must be generated by some finite algorithms (generating functions). We write A(x) = y if A halts at some finite step and yields y. If A(x) does not halt we write A(x) = ?. We will often need to perform algorithms step by step. Let A_t(x) denote the result of the performance of A(x) for t steps: A_t(x) = y if A(x) halts at a step t' ≤ t and yields y. We write A_t(x) = ? if A(x) does not halt or halts only at a moment t' > t. Let |x| denote the length of the binary word x.

2 Kolmogorov Complexity

According to Kolmogorov, the complexity of a binary word is the length of a minimal program generating this word. To make this definition completely constructive, we first must explicitly describe the minimization procedure. To minimize a partially computable function f(x), we combine the search over x with counting the number of steps of an algorithm that evaluates f(x). Let us use a uniform increasing numeration N = 1, 2, ... of n-tuples of arguments; for example, let N = 1, 2, 3, 4, 5, ... represent the pairs (1,1), (1,2), (2,1), (2,2), (1,3), ...

Define the standard minimization process for A(x) as follows:

$$\min_x A(x) = \{A(x, N),\ N = 1, 2, \ldots\},$$

where N = (x, t), A(x, 0) = ?, and A(x, N) = min(A(x, N − 1), A_t(x)) for N ≥ 1. In the minimization process, the sign "?" can be treated as infinity. If A(x) halts for a computable number of steps t, then the minimization process ends and min_x A(x) is a computable function. If no such t exists, we can then say that the function A(x) has no "bottom".

Consider the universal Turing machine U: by definition, U(A, p) = A(p) in the domain, where (and in the following) the same letter A also denotes the text of the algorithm. Let |A| denote the length of the text A.

Theorem 1. There exist computable functions such that the mass problem of their minimization process halting is algorithmically unsolvable.

Proof. Consider the indicator function ind(x, t) = 0 if U_t(x) with x = (A, p) halts exactly at the step t, so that U_t(x) = A(p); otherwise ind(x, t) = 1. Denote

$$\varphi(x, t) = \prod_{\tau \le t} \operatorname{ind}(x, \tau).$$

The minimization process {φ(x, 1), φ(x, 2), ...} is finite iff U(x) halts. But the halting problem for the universal Turing machine U is algorithmically unsolvable.
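The standard minimization process of this section can be illustrated by a small Python sketch. It is only an illustration under stated assumptions: the names `pairs`, `run_for`, `minimization_process` and `toy` are hypothetical, a "step" is modelled by one resumption of a Python generator, and the ordering of pairs merely agrees with the first five pairs listed in the text.

```python
from itertools import count, islice

def pairs():
    """Enumerate all pairs of positive integers, one per index N = 1, 2, ...

    Starts (1,1), (1,2), (2,1), (2,2), (1,3) as in the text; the order
    of the later pairs is an assumption of this sketch.
    """
    for s in count(1):
        for i in range(1, s):
            yield (i, s)
            yield (s, i)
        yield (s, s)

def run_for(algorithm, x, t):
    """A_t(x): the result if algorithm(x) halts within t steps, else None ("?")."""
    gen = algorithm(x)
    for _ in range(t):
        try:
            next(gen)               # perform one step
        except StopIteration as stop:
            return stop.value       # the algorithm halted at this step
    return None

def minimization_process(algorithm, n_max):
    """Return [A(x, N) for N = 1..n_max]; None plays the role of infinity."""
    best, history = None, []
    for x, t in islice(pairs(), n_max):
        value = run_for(algorithm, x, t)
        if value is not None and (best is None or value < best):
            best = value
        history.append(best)
    return history

def toy(x):
    """Hypothetical algorithm: halts at step x + 1 and returns (x - 3)^2."""
    for _ in range(x):
        yield
    return (x - 3) ** 2

history = minimization_process(toy, 14)
```

For this toy function the process starts at "?" and then decreases through 4 and 1 to the bottom value 0, which is reached at N = 14.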

Now we can define the complexity as follows.

Definition 1. Given a binary word x and an algorithm (partially computable function) A, the complexity of x with respect to A is K(x, A) = {K(x, A, N), N = 1, 2, ...}, where

$$K(x, A, N) = \min_{(p,t) \le N:\ A_t(p) = x} |p|.$$

In this definition, A(p) is called a generating algorithm and p is called a program or a code for x.

So the complexity is defined as a process and not as a function. If A(x) halts for some x, then the sequence K(x, A) = {K(x, A, N), N = 1, 2, ...} converges to a constant for some computable N = N_0, and we can say that the complexity function K(x) is defined. Otherwise, no such constructive function exists.
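To see the complexity of Definition 1 behave as a process rather than a function, consider the following Python sketch. Everything in it is an assumption made for illustration: the decoder `toy_decoder` is a hypothetical generating algorithm (a fast literal copy and a deliberately slow "run of zeros" mode), the numbering `word` of binary words is one possible choice, and the pair ordering is the one assumed above for the text's example.

```python
from itertools import count, islice

def pairs():
    """Pairs (k, t) of positive integers: (1,1), (1,2), (2,1), (2,2), (1,3), ..."""
    for s in count(1):
        for i in range(1, s):
            yield (i, s)
            yield (s, i)
        yield (s, s)

def word(k):
    """k-th binary word: 1 -> '', 2 -> '0', 3 -> '1', 4 -> '00', ..."""
    return bin(k)[3:]

def run_for(algorithm, p, t):
    """A_t(p): the output if algorithm(p) halts within t steps, else None."""
    gen = algorithm(p)
    for _ in range(t):
        try:
            next(gen)
        except StopIteration as stop:
            return stop.value
    return None

def toy_decoder(p):
    """Hypothetical generating algorithm A(p).

    '1' + w  -> w                  (literal copy, one step per symbol)
    '0' + b  -> '0' * int(b, 2)    (slow mode, 20 steps per output symbol)
    anything else never halts.
    """
    if p.startswith("1"):
        for _ in p[1:]:
            yield
        return p[1:]
    if p.startswith("0") and len(p) > 1:
        n = int(p[1:], 2)
        for _ in range(20 * n):
            yield
        return "0" * n
    while True:
        yield

def complexity_process(x, algorithm, n_max):
    """Return [K(x, A, N) for N = 1..n_max]; None stands for "?"."""
    best, history = None, []
    for k, t in islice(pairs(), n_max):
        p = word(k)
        if run_for(algorithm, p, t) == x and (best is None or len(p) < best):
            best = len(p)
        history.append(best)
    return history

process = complexity_process("0000", toy_decoder, 6500)
```

In this run the process is undefined at first, drops to 5 once the literal program `10000` has been tried for enough steps, and reaches 4 only much later, when the shorter but slower program `0100` has been given 81 steps. The current time of performance is thus part of the answer.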

To compare minimization processes we need a special technique.

Definition 2. Given two minimization processes

$$\min_x A(x) = \{A(x, N),\ N = 1, 2, \ldots\}, \qquad \min_x B(x) = \{B(x, M),\ M = 1, 2, \ldots\},$$

we write A(x) ≲ B(x) if for each M there exists an N_0 such that for all N > N_0 the inequality A(x, N) ≤ B(x, M) holds.

If both processes halt, we can write simply A(x) ≤ B(x). If A(x) ≲ B(x) and A(x) ≳ B(x), we say that the strong equivalence holds and write A(x) ~ B(x). Define also a weak equivalence: A(x) ≈ B(x) if A(x) ≲ B(x) + c along with B(x) ≲ A(x) + c.

The algorithmic theory of complexity was started with the discovery of universal descriptions and universal complexity. This basic discovery was made simultaneously and independently by Kolmogorov and R.Solomonoff in 1960-1964 (see in [7]).

This theory is developed to study minimal descriptions of arbitrarily long words x with finite algorithms. It means that |A| ≤ c. All basic results are obtained with accuracy up to constants c, which are supposed to be independent of x.

Definition 3. The complexity of the word x with respect to an algorithm A is the process K(x, A) = {K(x, A, N), N = 1, 2, ...}, where

$$K(x, A, N) = \min_{(p,t) \le N:\ A_t(p) = x} |p|.$$

We use two methods of the complexity theory: upper estimates of the complexity are derived by the construction of explicit generating procedures; lower estimates are obtained by counting the variety of words and their programs.

Theorem 2. For any algorithm A we have

$$K(x, U) \lesssim K(x, A) + c_A,$$

where c_A depends only on A but not on x.

Proof. Count steps of A(x) by steps of the universal Turing machine performing A. For each N we can find a number M such that

$$
\begin{aligned}
K(x, U, N) &= \min_{(z,t) \le N:\ U_t(z) = x} |z| \\
&\lesssim \min_{B:\ |B| \le c}\ \min_{(p,t) \le N:\ U_t(B,p) = x} |(B, p)| \\
&\lesssim \min_{(p,t) \le N:\ U_t(A,p) = x} (c_A + |p|) \\
&\lesssim c_A + \min_{(p,t) \le M:\ A_t(p) = x} |p| = c_A + K(x, A),
\end{aligned}
$$

where c_A is a constant depending only on A. This is the proof.

This statement is called the Invariance Theorem. Its significance is that it introduces a universal measure of complexity which is calculated by trying different algorithms with different input words. Let us fix a particular universal Turing machine U as a reference machine and set K(x) = K(x, U).


Let us call the difference |x| — K(x) the number of regularities.

Remark 1. Given n = |x|, the fraction of words x with the number of regularities more than m is no more than 2^{-m}.

This follows from the fact that there are only 2^{n-m} programs p of length n − m. So almost all words are incompressible up to a slowly increasing function of n.

Remark 2. K(x) ≲ |x| + c. This is obvious since we can use, as a generating algorithm, the identity algorithm A(x) = x.

Note that the minimization process in Theorem 2 can be made more efficient if we restrict p to |p| ≤ |x| + c.

The complexity of finite words depends strongly on the additive constant c. Therefore, the main object of study will be the complexity of words x of arbitrarily great lengths n.

Theorem 3. If f(x) is a partially computable function, then K(f(x)) ≲ K(x) + c.

Proof. Suppose the algorithm evaluating f(x) halts. Given an arbitrary algorithm A we construct the composition B = fA. By Definition 3 and Theorem 2, for each N we can find M and a constant c' independent of x such that

$$
\begin{aligned}
K(f(x), U, N) &= \min_{(z,t) \le N:\ U_t(z) = f(x)} |z| \\
&\lesssim \min_{B:\ |B| \le c}\ \min_{(p,t) \le M:\ B_t(p) = f(x)} |p| + c \\
&\lesssim \min_{(p,t) \le M:\ f(A_t(p)) = f(x)} |p| + c \\
&\lesssim \min_{(p,t) \le M:\ A_t(p) = x} |p| + c = K(x, A) + c \lesssim K(x) + c'.
\end{aligned}
$$

The theorem is proved.

Example. Let x = 0^n (n zeros). Then K(x) ≲ K(n) + c ≲ log n + c. If n = 2^m, then K(x) ≲ log log n + c. Clearly, K(x_n) is not monotone in n.

By definition, it is impossible to present a conceivable example of a high-complexity word.

To separate a number n in a chain, we define a special self-delimiting code n̄ for an integer n as follows: n̄ = 0^m 1 n, where m = log n, with the length |n̄| = 2 log n + 1; or a more refined code n̄ = 0^{log m} 1 m n of length |n̄| ≤ log n + 2 + 2 log log n. Here (and in the following) log x for x > 0 denotes the integer nearest from above to the standard logarithmic function log x, and only positive arguments of log x are considered (if x ≤ 0 then the expressions containing log x are supposed to equal 0).

Note that the set of codes n̄ is a prefix-free set. More economical self-delimiting codes can be obtained by further iterations. Denote their length by log* n = log n + log log n + log log log n + ... (the iterated logarithm).
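The two self-delimiting codes can be sketched in Python as follows. This is an illustration only: the function names are hypothetical, and m is taken to be the number of binary digits of n, which is an assumption that may differ by one from the rounded logarithm used in the text.

```python
def simple_code(n):
    """0^m 1 n, with m the number of binary digits of n; length 2m + 1."""
    digits = bin(n)[2:]
    return "0" * len(digits) + "1" + digits

def refined_code(n):
    """0^(len m) 1 m n, where m (written in binary) is the digit count of n."""
    digits = bin(n)[2:]
    m = bin(len(digits))[2:]
    return "0" * len(m) + "1" + m + digits

def decode_simple(stream):
    """Read one simple code from the front of stream; return (n, rest)."""
    m = stream.index("1")                  # number of leading zeros
    end = 2 * m + 1
    return int(stream[m + 1:end], 2), stream[end:]

def decode_refined(stream):
    """Read one refined code from the front of stream; return (n, rest)."""
    k = stream.index("1")                  # number of digits of m
    m = int(stream[k + 1:2 * k + 1], 2)    # number of digits of n
    end = 2 * k + 1 + m
    return int(stream[2 * k + 1:end], 2), stream[end:]

def is_prefix_free(words):
    """True if no word of the (distinct) list is a prefix of another one."""
    ordered = sorted(words)
    return not any(b.startswith(a) for a, b in zip(ordered, ordered[1:]))
```

Because each code word announces its own length, a decoder can cut it off the front of a longer chain without any end marker, which is exactly what the prefix-free property expresses.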

Theorem 4. K(x, y) ≲ K(x) + K(y) + 2 log |x| + 1.

Proof. It suffices to use programs for (x, y) of the form p = 0^m 1 p_1 p_2, where m = log p_1, A(p_1) = x, B(p_2) = y, and 0^m serves to separate p_1 from p_2.

3 Incompressibility

Now we consider algorithmically generated infinite sequences of digits x^∞ that are treated as sequences of words {x : |x| = n = 1, 2, ...}.

We cite (in a simplified form) two theorems by Martin-Löf [6].

Theorem 5. Any constructive x^∞ contains infinitely many words x of length n with K(x) ≲ n − log n + c.

Theorem 6. For almost all sequences x^∞, for any ε > 0 and for all words x of length n > n_0 with some computable n_0, we have K(x) ≳ n − (1 + ε) log n.

Thus the complexity of a typical constructive binary sequence fluctuates between the lower bound n − (1 + ε) log n and n.

The idea to define randomness as algorithmic incompressibility was put forward by Kolmogorov [2] and G. J. Chaitin [8]. There exist no sequences in which all words are c-incompressible.

Definition 4 (Kolmogorov). An infinite binary sequence is called K-random if it contains infinitely many words x with K(x) ≳ |x| − c.

Remark 3. Almost all sequences x^∞ are K-random.

This follows from the fact that there is only a portion 2^{-c} of words x for which K(x) ≲ |x| − c.

Definition 5. An infinite binary sequence x^∞ = {x} is called L-random if for some c we have K(x) ≳ n − c log n for all words, n = |x|.

Theorem 6 states that almost all binary sequences are L-random.

Stepping aside from the incompressibility idea, Martin-Löf [6] suggested another notion of randomness based on the idea of universal tests. The Martin-Löf randomness (ML-randomness) follows from the Kolmogorov randomness. If x^∞ is Martin-Löf random, then for any ε > 0 we have K(x) ≳ n − (1 + ε) log n from some n onwards.

These properties suggest three notions of randomness, each implying the next: K → ML → L.

Now let us restrict classes of algorithms.


4 Reversible Complexity

Let us restrict ourselves to reversible algorithms.

Definition 6. An algorithm A(p) is called reversible (R-algorithm) if one can find another algorithm B = A^{-1} such that A(p) = x implies B(x) = p and vice versa.

These algorithms establish a one-to-one correspondence between inputs and outputs. We can say that B(x) is an encoding algorithm and A(p) is a decoding algorithm.

Definition 7. The R-complexity of a word x is defined as the process K_R(x) = {K_R(x, N), N = 1, 2, ...}, where

$$K_R(x, N) = \min_{A:\ |A| \le c}\ \min_{(p,t) \le N:\ U_t(A,p) = x} |p|,$$

where A are R-algorithms and the minimization process is shortened by discovering the first root of the equation A(p) = x.

Since the class of R-algorithms includes the identity algorithm, we have K_R(x) ≲ |x| + c.

Definition 8. A function (an algorithm) A(x) is called unidomain if there are no pairs x_1 ≠ x_2 such that A(x_1) = A(x_2).

Proposition 1. A function A(x) is unidomain iff it is reversible.

Proof. First, let A be unidomain. Using A let us construct an algorithm B(y) as follows:

    for (p, t) = 1, 2, ... do
        if A_t(p) = y then B(y) := p; halt
    endfor

If A(x) = y then this algorithm provides the first root of this equation and halts. If A(x) = ? then we have B(y) = ?. Conversely, if A is a reversible algorithm, then there exists an algorithm B(y) such that A(x) = y implies B(y) = x and the argument of A is recovered uniquely.
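The search used in the first half of this proof can be written out as a short Python sketch. It is not the author's code: `pairs`, `word`, `run_for`, `inverse` and the toy algorithm `reverse` are hypothetical names, and a step is again modelled by one resumption of a generator.

```python
from itertools import count

def pairs():
    """Pairs (k, t) of positive integers in one fixed computable order."""
    for s in count(1):
        for i in range(1, s):
            yield (i, s)
            yield (s, i)
        yield (s, s)

def word(k):
    """k-th binary word: 1 -> '', 2 -> '0', 3 -> '1', 4 -> '00', ..."""
    return bin(k)[3:]

def run_for(algorithm, p, t):
    """A_t(p): the output if algorithm(p) halts within t steps, else None."""
    gen = algorithm(p)
    for _ in range(t):
        try:
            next(gen)
        except StopIteration as stop:
            return stop.value
    return None

def inverse(algorithm):
    """Build B = A^{-1} by searching over all (program, step bound) pairs."""
    def B(y):
        for k, t in pairs():
            p = word(k)
            if run_for(algorithm, p, t) == y:
                return p          # first root of A(p) = y
        # not reached: the loop runs forever if y has no preimage (B(y) = ?)
    return B

def reverse(p):
    """Hypothetical unidomain algorithm: p written backwards, one step per symbol."""
    out = ""
    for symbol in p:
        out = symbol + out
        yield
    return out

decode = inverse(reverse)
```

Since `reverse` is unidomain, the first root found by the search is the only one, so `decode` recovers the argument uniquely. For an argument outside the range of the algorithm the search does not terminate, which corresponds to B(y) = ?.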

Theorem 7. There exists no algorithm W such that for any algorithm A we have W(A) = 1 if A can be a reversible algorithm, and W(A) = 0 if not.

Proof. To prove this assertion, it suffices to prove it for some special class of A. Let N be a nullifying algorithm such that for any x we have N(x) = 0, and let B be an arbitrary algorithm. Choose A so that A(0) = 0, A(1) = N(B(1)), and A(n) = n for n > 1. This algorithm is not unidomain iff B(1) halts. However, the mass problem of algorithm halting is algorithmically unsolvable. This proves the theorem.


Theorem 8. The complexity K_R(x) ≈ K(x).

Proof. The relation K(x) ≲ K_R(x) + c follows from the definitions. Let us prove the converse relation. Let K(x) be given by a sequence of functions

$$K(x, N) = \min_{A:\ |A| \le c}\ \min_{(A,p,t) \le N:\ A_t(p) = x} |p|,$$

where A are arbitrary algorithms. Given A, the minimization here is carried out over all roots of the equation A_t(p) = x. We replace the evaluation of all roots for a single algorithm A_t by evaluating roots of a number of equations. Let us numerate the roots of the equation A(p) = x in the process (p, t) = 1, 2, ... Construct the algorithm B(ν, p) as follows.

    k := 0
    for (q, τ) = 1, 2, ... do
        if A_τ(q) = x then k := k + 1
        if k = ν and p = q then B := x; halt
    endfor

The function B(ν, p) = x iff p is the root number ν; otherwise B(ν, p) = ?. By construction, for fixed ν the function B(ν, p) is unidomain. The theorem statement follows.

Knowing the complexity of a word x, we can constructively evaluate its minimal codes. Minimizing descriptions of physical events x can be considered as a process of "cognition" of x by searching for the regularities producing the phenomenon x. It is known that all elementary physical processes are time-reversible. The reversible generating algorithms, generally speaking, can be less efficient in producing long words. The equivalence K(x) ≈ K_R(x) stated by Theorem 8 can be interpreted as the absence of phenomena that can be produced but not cognized within the framework of the algorithmic theory.

5 Complexity and Information

Kolmogorov discovered [2], [9] that information theory can be developed from the algorithmic definition of complexity.

The conditional complexity of a binary word x with respect to the word y is defined as the minimal length of a program that generates x from y:

$$K(x|y, A) = \min_{(p,t):\ A_t(p,y) = x} |p|.$$

Theorem 9. There exists an optimal algorithm V such that for any algorithm A we have

$$K(x|y) \stackrel{\mathrm{def}}{=} K(x|y, V) \lesssim K(x|y, A) + c.$$


Example. We have K(0^n | n) ≲ c, where the constant c is the length of the algorithm generating 0^n from n.

We show the connection between the notion of complexity and optimal coding in the Shannon information theory. Suppose the words x of length n are partitioned from left to right into sequences of k blocks {b_s} of binary digits of identical length l, with m = 2^l different blocks in total and n = kl. Denote by f_s the empirical frequency of the occurrence of b_s in x. The Shannon entropy per block is defined as

$$H(f) = -\sum_s f_s \log f_s.$$

Theorem 10. Let a word x be partitioned into k blocks of length l. Then k^{-1} K(x) ≲ H(f) + c log k / k, where c depends on l but not on x.

Proof. Use a special code not depending on the source of information (universal code). To specify x we can fix the numbers k_s = k f_s of occurrences of each block b_s, for all blocks s of length l, and the index of x among the words with these counts, which is at most

$$\frac{k!}{k_1!\, k_2! \cdots k_m!},$$

m = 2^l, where k_1 + ... + k_m = k. Applying the Stirling formula we find that the length of this code is no more than m log k + kH(f) + c log k. The theorem statement follows.
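The length of the universal code used in this proof can be checked numerically. The Python sketch below is an illustration under assumptions: the function names are hypothetical, logarithms are binary, each count is stored in a fixed field of ceil(log2(k + 1)) bits, and the index is stored in ceil(log2 of the multinomial coefficient) bits.

```python
from collections import Counter
from math import ceil, factorial, log2

def block_counts(x, l):
    """Counts k_s of each block b_s when x is cut into blocks of length l."""
    if len(x) % l:
        raise ValueError("len(x) must be a multiple of l")
    return Counter(x[i:i + l] for i in range(0, len(x), l))

def entropy(counts):
    """Empirical Shannon entropy per block, H(f) = -sum_s f_s log2 f_s."""
    k = sum(counts.values())
    return sum(-(c / k) * log2(c / k) for c in counts.values())

def code_length(x, l):
    """Bits of the two-part code: (table of counts, index of x)."""
    counts = block_counts(x, l)
    k = sum(counts.values())
    arrangements = factorial(k)
    for c in counts.values():
        arrangements //= factorial(c)           # k! / (k_1! ... k_m!)
    table_bits = (2 ** l) * ceil(log2(k + 1))   # each count lies in 0..k
    index_bits = ceil(log2(arrangements))
    return table_bits, index_bits

x = "00" * 30 + "01" * 10 + "11" * 24           # k = 64 blocks of length 2
table_bits, index_bits = code_length(x, 2)
k_times_h = 64 * entropy(block_counts(x, 2))
```

The index part never exceeds kH(f) by more than one bit, because the multinomial coefficient is bounded by 2^{kH(f)}. The table of counts contributes only the m log k term of the proof.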

Thus, K(x) can be considered as the entropy and K(y|x) as the conditional entropy. The information in x about y is I(x|y) = K(y) − K(y|x).

Remark 4. For arbitrary words x and y,

$$K(y|x) \lesssim K(y) + c \quad\text{and}\quad K(x, y) = K(x) + K(y|x) + c \log |x|.$$

Indeed, consider a special code for (x, y) of the form p_1 p_2, where p_1 is a self-delimiting code for x and p_2 is a code for y. We have

$$K(x, y) \lesssim \min_{A, B:\ |A| \le c,\ |B| \le c}\ \min_{(p_1, p_2, t):\ A_t(p_1) = x,\ B_t(p_2) = y} (|p_1| + |p_2|).$$

This is the required statement.

Note that the measure of the information I(x|y) is non-negative only asymptotically for long x and y. The logarithmic correction term can be ascribed to the individual description of x, in contrast to the traditional description in terms of distributions.


6 Frequency Rates

The stability of frequency rates, which is assumed a priori in the conventional concept of probability, can be deduced in the algorithmic theory.

Denote the empirical rate of occurrences of 1 in x by f(x, 1). The stability of frequency rates can be stated as follows.

Theorem 11. Given an L-random x^∞, c > 0, for each word x in it

$$|f(x, 1) - 1/2|^2 \lesssim c \log n / n,$$

where c does not depend on n.

Proof. Use a special code p for x as follows. Let k = n f(x, 1) and p = (k, j), where j = 1, ..., C_n^k numerates all words x of length n with k units. Use the prefix codes for (k, j) of the form k̄ j with |k̄| = log* |k| ≲ 2 log n. Thus

$$K(x) \lesssim |(\bar{k}, j)| \lesssim 2 \log n + \log C_n^k.$$

Using the Stirling formula we find that log C_n^k ≤ n H(k/n) + c log n, where the entropy H(f) = −f log f − (1 − f) log(1 − f), f = k/n. It satisfies the inequality H(f) ≤ 1 − 2(f − 1/2)^2. Combining these formulas we obtain the desired result.
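The code p = (k, j) of this proof is an enumerative code, and it can be sketched in Python. The sketch rests on assumptions that are not in the text: words with k units are numbered in lexicographic order starting from j = 0, logarithms are binary, and the names `rank`, `unrank` and `binary_entropy` are hypothetical.

```python
from math import comb, log2

def rank(x):
    """Index j (from 0) of x among words of its length and number of units."""
    k = x.count("1")
    j = 0
    for i, bit in enumerate(x):
        if bit == "1":
            # all words with the same prefix and '0' at position i come first
            j += comb(len(x) - i - 1, k)
            k -= 1
    return j

def unrank(n, k, j):
    """Inverse of rank: the word of length n with k units and index j."""
    bits = []
    for i in range(n):
        zero_first = comb(n - i - 1, k)   # words that continue with '0'
        if j < zero_first:
            bits.append("0")
        else:
            bits.append("1")
            j -= zero_first
            k -= 1
    return "".join(bits)

def binary_entropy(f):
    """H(f) = -f log2 f - (1 - f) log2 (1 - f), with H(0) = H(1) = 0."""
    if f in (0.0, 1.0):
        return 0.0
    return -f * log2(f) - (1 - f) * log2(1 - f)
```

The pair (k, j) determines x, and j needs at most log C_n^k bits, which is why a frequency far from 1/2 makes the word compressible. The two inequalities of the proof, log C_n^k ≤ n H(k/n) and H(f) ≤ 1 − 2(f − 1/2)^2, can be checked numerically with `comb` and `binary_entropy`.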

Remark 5. If |f(x, 1) − 1/2|^2 > c/n, then K(x) ≲ n − 1/2 log n + c. This inequality shows the effect of a regularity when the number of units is too close to n/2.

The refinement is natural. We consider a partition of x^∞ = {x} into blocks of digits b of identical length |b| = l. Denote by f(x, b_i) the number of blocks b = b_i in the partition of a word x of length n = kl. Denote π = 2^{-l}.

Theorem 12. Given an L-random sequence x^∞ = {x} and a block of digits b of length l, for all words x of length n we have

$$|f(x, b) - 2^{-l}|^2 \lesssim c(b) \log n / n.$$

A number of other specifically probabilistic laws, deduced previously by intuitive reasoning, can be proved similarly.

7 Prefix Complexity

In 1974-1975 another approach to complexity was developed, starting from the concept of prefix complexity (by L. A. Levin, P. Gacs, G. J. Chaitin [10-12]).


Definition 9. A set of words is called prefix-free if there are no pairs of different words such that one is the beginning of the other.

Lemma 1. (1) If {p_i} is a prefix set, n_i = |p_i|, i = 1, 2, ..., then the Kraft inequality

$$\sum_{i = 1, 2, \ldots} 2^{-n_i} \le 1$$

holds;

(2) if numbers n_1, n_2, ... satisfy the Kraft inequality, then one can find binary words p_1, p_2, ... of length n_1, n_2, ... such that the set {p_i} is prefix-free.

These words can be constructed by the well-known Fano-Shannon procedure.
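Part (2) of the lemma can be made concrete for a finite list of lengths. The Python sketch below uses the canonical construction (sort the lengths and assign consecutive binary values), which is one standard way of building such words and is not necessarily the procedure the text refers to. The function names are hypothetical.

```python
from fractions import Fraction

def kraft_sum(lengths):
    """Exact value of the sum of 2^(-n_i) over the given lengths."""
    return sum(Fraction(1, 2 ** n) for n in lengths)

def prefix_code(lengths):
    """Binary words p_i with |p_i| = lengths[i] forming a prefix-free set.

    Requires positive lengths that satisfy the Kraft inequality.
    """
    if any(n < 1 for n in lengths):
        raise ValueError("lengths must be positive")
    if kraft_sum(lengths) > 1:
        raise ValueError("Kraft inequality is violated")
    order = sorted(range(len(lengths)), key=lambda i: lengths[i])
    words = [None] * len(lengths)
    value, previous = 0, 0
    for i in order:
        n = lengths[i]
        value <<= n - previous            # extend the counter to n digits
        words[i] = format(value, "0{}b".format(n))
        value += 1                        # next free word of this length
        previous = n
    return words

def is_prefix_free(words):
    """True if no word of the (distinct) list is a prefix of another one."""
    ordered = sorted(words)
    return not any(b.startswith(a) for a, b in zip(ordered, ordered[1:]))

code = prefix_code([2, 1, 3, 3])
```

For the lengths 2, 1, 3, 3 the Kraft sum equals exactly 1, and the construction returns the words 10, 0, 110, 111.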

Definition 10. An algorithm is called a prefix algorithm if its domain is a prefix-free set. The prefix complexity of a word x with respect to a prefix algorithm A is defined as the process K_P(x, A) = {K_P(x, A, N), N = 1, 2, ...}, where

$$K_P(x, A, N) = \min_{(p,t) \le N:\ A_t(p) = x} |p|.$$

The set of prefix algorithms is an enumerable set.

Theorem 13. There exists a universal prefix algorithm V such that for any prefix algorithm A we have

$$K_P(x) \stackrel{\mathrm{def}}{=} K_P(x, V) \lesssim K_P(x, A) + c_A.$$

To deal with prefix algorithms, we notice that we can recover the word x = 0^n (n zeros) from n, but we cannot encode numbers n as simple integers since they are not prefix-free. Using self-delimiting codes we obtain prefix-free codes of length n + log* n.

Remark 6. K(x) ≲ K_P(x) ≲ K(x) + log*(x).

Remark 7. K_P(x, y) ≲ K_P(x) + K_P(y) + c. In contrast to K(x), here we do not need an end marker for the word x since x is recognized as a prefix.

Theorem 14 [12]. For any fixed length n of words x we have max_x K_P(x) ≳ n + log* n − c.

Theorem 15 [13]. An infinite sequence x^∞ is Martin-Löf random iff K_P(x) ≳ |x| − c for all words x.

For most x^∞ we have K_P(x) ≳ |x| − c for all x. Thus, the prefix complexity of almost all sequences fluctuates within the bounds |x| and |x| + log* |x| (with accuracy up to c).

8 Universal Probability

The idea of a universal a priori probability was put forward by Solomonoff in [4]. For a binary word x, he introduced the probability P(x) = 2^{-|p(x)|}, where p(x) is a minimal description of x. However,

$$\sum_x 2^{-K(x)} = \infty.$$

To obtain normalizable algorithmic probabilities, the Kraft inequality for a prefix-free set was proposed and this led to the development of a theory of the prefix complexity [10-12]. Let us reformulate the basic results of it in a successively constructive form.

Definition 11. The algorithmic probability of x is defined by the process

$$P(x) = \{2^{-K_P(x, N)},\ N = 1, 2, \ldots\}.$$

Example. If x = 0^n, then K_P(x) ≲ log n + 2 log log n + c. Hence P(x) ≳ c/(n log^2 n).

Definition 12. The universal a priori probability is defined by Q(x) = {Q(x, U, N), N = (p, t) = 1, 2, ...}, where U is the universal prefix algorithm and

$$Q(x, U, N) = Q(x, U, N - 1) + \operatorname{ind}(U_t(p) = x)\, 2^{-|p|},$$

where the indicator function equals 1 iff U_t(p) halts exactly at the step number t, and 0 otherwise.

Since the mass problem of the universal machine halting is algorithmically unsolvable, the sequence Q(x) has no "ceiling".
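The accumulation in Definition 12 can be followed step by step on a toy example. In the Python sketch below the universal prefix algorithm is replaced by a small finite table, which is a strong simplification made only for illustration. The table `TOY_MACHINE`, the numbering `word` and the pair ordering are all assumptions of the sketch.

```python
from itertools import count, islice

# Hypothetical prefix machine: program -> (output, step at which it halts).
# Its domain {"0", "10", "110"} is prefix-free.
TOY_MACHINE = {"0": ("a", 3), "10": ("b", 1), "110": ("a", 5)}

def pairs():
    """Pairs (k, t) of positive integers in one fixed computable order."""
    for s in count(1):
        for i in range(1, s):
            yield (i, s)
            yield (s, i)
        yield (s, s)

def word(k):
    """k-th binary word: 1 -> '', 2 -> '0', 3 -> '1', 4 -> '00', ..."""
    return bin(k)[3:]

def q_process(x, n_max):
    """Return [Q(x, N) for N = 1..n_max] for the toy machine."""
    q, history = 0.0, []
    for k, t in islice(pairs(), n_max):
        p = word(k)
        # indicator: program p halts exactly at step t and yields x
        if TOY_MACHINE.get(p) == (x, t):
            q += 2.0 ** -len(p)
        history.append(q)
    return history

q_a = q_process("a", 200)
```

Each halting program contributes exactly once, at the pair (p, t) with t equal to its halting step, so the sequence is monotone. Here it rises from 0 to 1/2 and then to 5/8. For a genuinely universal machine one can never be sure that no further contribution will arrive, which is the sense in which the sequence has no "ceiling".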

The following Coding Theorem shows that these two formulations define processes differing by no more than a constant.

Theorem 16. For each x we have K_P(x) ≈ −log Q(x).

In [14] a non-constructive infinite binary fraction was considered:

$$\Omega = \sum_x Q(x) \le 1.$$

The real number Ω was called the universal algorithm halting probability. It can be interpreted as a process {Ω(N), N = 1, 2, ...} with

$$\Omega(N) = \sum_{(x,p,t) \le N} \left[\Omega(N - 1) + \operatorname{ind}(U_t(p) = x)\right],$$

where the indicator function equals 1 iff U_t(p) halts exactly at the moment t yielding x, and 0 otherwise.

The monotone increasing sequence Ω(N) is bounded from above and has no "ceiling". Knowing the first digits of Ω(N), N = 1, 2, ..., we can accumulate in Ω solutions of all constructive problems of bounded complexity. C. Bennet and M. Gardner would call Ω "the number of Wisdom" [15].

9 Sequentially Coding Algorithms

We suggest the following extension of the complexity theory produced by a restriction with algorithms coding sequentially from left to right.

A set P of code words is called complete-code if any half-infinite sequence can be represented as a concatenation of codes from P.

Definition 13. A one-to-one constructive function T : X ↔ Y is called a coding table if it is defined on complete-code prefix-free sets X and Y.

Definition 14. An algorithm A evaluating a coding table T : X ↔ Y is called a sequential coder or an S-algorithm if

(1) for any concatenation x = x_1 x_2 ... x_k of words x_i from X, we have A(x) = A(x_1)A(x_2)...A(x_k);

(2) for any concatenation y = A(x_1)A(x_2)...A(x_k) we also have A(x_1 x_2 ... x_k) = y.

The set of S-algorithms is recursively enumerable.
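A sequential coder for a finite coding table can be sketched in a few lines of Python. The table below is a hypothetical example chosen so that both X = {0, 10, 11} and Y = {00, 01, 1} are prefix-free and complete. The names `make_coder`, `encode` and `decode` are likewise assumptions of this sketch.

```python
def make_coder(table):
    """Return the left-to-right coder A for a coding table X -> Y.

    The input is cut greedily into code words of X (unambiguous because X
    is prefix-free) and each word is replaced by its image in Y.
    """
    def coder(x):
        out, current = [], ""
        for symbol in x:
            current += symbol
            if current in table:          # a complete code word of X was read
                out.append(table[current])
                current = ""
        return "".join(out), current      # current: unfinished tail, "" if none
    return coder

TABLE = {"0": "00", "10": "01", "11": "1"}
encode = make_coder(TABLE)
decode = make_coder({y: x for x, y in TABLE.items()})

coded, tail = encode("01011")             # parsed as 0 | 10 | 11
```

Because the table is one-to-one and both sides are prefix-free, the inverse table yields a coder of the same kind, and coding a concatenation gives the concatenation of the codes, as required by conditions (1) and (2).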

Definition 15. The S-complexity of a word x with respect to an S-algorithm A is a process K_S(x, A) = {K_S(x, A, N), N = 1, 2, ...}, where

$$K_S(x, A, N) \stackrel{\mathrm{def}}{=} \min_{(p,t) \le N:\ A_t(p) = x} |p|.$$

Theorem 17. There exists a (universal) S-algorithm V such that for any S-algorithm A we have

$$K_S(x) = K_S(x, V) \lesssim K_S(x, A) + c_A,$$

where c_A does not depend on x.


Since the class of S-algorithms contains the identity algorithm (with A(0) = 0, A(1) = 1), we have K_S(x) ≲ |x| + c. If f(x) is a partially computable function evaluated by some S-algorithm, then K_S(f(x)) ≲ K_S(x) + c.

Obviously, K(x) ≲ K_S(x) ≲ K_P(x). But we only have K_S(x, y) ≲ K_P(x) + K_S(y), since the sequentially coding algorithm can separate the leftmost prefix from the remaining ones.

For words x = 0^n, we have K_S(x) ≲ log* n. For almost all sequences x^∞, for all sufficiently long words x in it, and for any c > 1, we have K_S(x) ≳ K(x) ≳ |x| − c log |x|.

Definition 16. A binary sequence is called S-random if for all words x, K_S(x) ≳ |x| − c log |x|, where c does not depend on x.

Definition 17. A binary sequence x^∞ = {x} is algorithmically stationary if for any block b of digits in it there exists the limit lim_{x→∞} f(b, x).

Any L-random sequence is algorithmically stationary.

Lemma 2. If a binary sequence y^∞ = {y} is produced from an algorithmically stationary sequence x^∞ = {x} by an S-algorithm A so that y = A(x), then the sequence y^∞ is also algorithmically stationary.

Proof. Suppose y^∞ is produced from x^∞ by y = A(x), where A is an S-algorithm. The algorithm A defines a prefix-free domain X and a code-complete range of values Y. Choose a block of digits b. Using the completeness of Y, we have b = y_1 y_2 ... y_k, where y_i ∈ Y, i = 1, 2, ..., k. By the sequential property we can find a program a = x_1 x_2 ... x_k with all x_i ∈ X such that A(a) = b. The frequencies satisfy f(a, x) = f(b, y). This proves the lemma.

Lemma 3. K_S(K_S(x)) ≲ |K_S(x)| + c.

Proof. Note that S-algorithms are such that the composition AB of two S-algorithms A and B is again an S-algorithm. For a fixed N we find

$$K_S(x, N) = \min_{A:\ |A| \le c}\ \min_{(p,t) \le N:\ A_t(p) = x} |p|,$$

and for the minimizing value p = p_0,

$$K_S(p_0, M) = \min_{B:\ |B| \le c}\ \min_{(y,t) \le M:\ B_t(y) = p_0} |y|.$$

Let y = y_0 be the minimizing value of a code for p_0. Since for some t, A_t B_t(y) = x (if both algorithms halt), it is clear that K_S(x) ≲ |y| + c. We obtain K(x) ≲ K_S(p) ≈ K_S(K_S(x)).

Theorem 18. An infinite binary sequence x°° is algorithmically stationary iff it is an S-algorithm transformation of some S-random sequence.


Proof. First, assume that y = A(x) for all x ∈ x^∞ and K_S(x) ≳ |x| − c log |x|. We have K(x) ≳ K_S(x) − log* |x|. So K(x) ≳ |x| − c' log |x|, c' > c + 1. By Theorem 12 the sequence x^∞ is stationary.

To prove the converse, assume that x^∞ = {x} is stationary. We find min K_S(x, N) for (p, t) ≤ N; let p be a minimum code for x, A_t(p) = x for some t if A_t(p) halts. Here A : P → X has the domain P and the range X, both prefix-free and code-complete. Since X is code-complete, we can express x as x_1 x_2 ... x_k with x_i ∈ X, and A(p_i) = x_i with p_i ∈ P, i = 1, ..., k. By Lemma 3 we have K_S(p) ≳ |p| − c. It follows that p = p_1 p_2 ... p_k is log-incompressible. The proof is complete.

The comparison of different notions of complexity and randomness shows that their difference is no more than a logarithmic term. Taking the stationarity theorems into account, it seems plausible to suggest a common definition of randomness of infinite sequences x^∞ = {x} as incompressibility up to the term c log |x|, where c does not depend on x.

In conclusion, it is my pleasure to express my sincere gratitude to Prof. V. M. Maximov for encouraging discussions.

References

1. A. N. Kolmogorov, Grundlagen der Wahrscheinlichkeitsrechnung (Springer Verlag, 1933; in English: Chelsea, New York, 1956).
2. A. N. Kolmogorov, Problems of Information Transfer, 1, 1, 1-7 (1965).
3. L. Löfgren, Computer and Information Sciences, 2, 165-175 (1967).
4. R. J. Solomonoff, Progress of Symposia in Applied Math., AMS, 43 (1962); IEEE Trans. on Inform. Theory, 4, 5, 662-664 (1968).
5. Li Ming, P. Vitanyi, An Introduction to Kolmogorov Complexity (Springer, Berlin-Heidelberg-New York, 1993).
6. P. Martin-Löf, Information and Control, 9, 602-619 (1966); Zeits. Wahrsch. Verw. Geb., 19, 225-230 (1971).
7. A. N. Shiryaev, The Annals of Probability, 17, 3, 866-944 (1989).
8. G. J. Chaitin, J. ACM, 16, 145-159 (1969).
9. A. N. Kolmogorov, Russian Math. Survey, 38, 4, 27-36 (1983).
10. L. A. Levin, Problems of Information Transmission, 10, 3, 206-210 (1974).
11. P. Gacs, Soviet Math. Doklady, 15, 1477-1480 (1974).
12. G. J. Chaitin, J. ACM, 22, 329-340 (1975).
13. V. V. Vjugin, Semiotika i Informatika (in Russian), 16, 14-43 (1981); V. A. Uspenskii, SIAM J. Theory Probab. Appl., 32, 387-412 (1987).
14. R. J. Solomonoff, Information and Control, 7, 1-22 (1964).
15. C. H. Bennet, M. Gardner, Sci. American, 241, 11, 20-34 (1979).


STRUCTURE OF PROBABILISTIC INFORMATION AND QUANTUM LAWS

JOHANN SUMMHAMMER Atominstitut der Österreichischen Universitäten

Stadionallee 2, A-1020 Vienna, Austria E-mail: [email protected]

The acquisition and representation of basic experimental information under the probabilistic paradigm is analysed. The multinomial probability distribution is identified as governing all scientific data collection, at least in principle. For this distribution there exist unique random variables, whose standard deviation becomes asymptotically invariant of physical conditions. Representing all information by means of such random variables gives the quantum mechanical probability amplitude and a real alternative. For predictions, the linear evolution law (Schrödinger or Dirac equation) turns out to be the only way to extend the invariance property of the standard deviation to the predicted quantities. This indicates that quantum theory originates in the structure of gaining pure, probabilistic information, without any mechanical underpinning.

1 Introduction

The probabilistic paradigm proposed by Born is well accepted for comparing experimental results to quantum theoretical predictions [1]. It states that only the probabilities of the outcomes of an observation are determined by the experimental conditions. In this paper we wish to place this paradigm first. We shall investigate its consequences without assuming quantum theory or any other physical theory. We look at this paradigm as defining the method of the investigation of nature. This consists in the collection of information in probabilistic experiments performed under well controlled conditions, and in the efficient representation of this information. Realising that the empirical information is necessarily finite permits us to put limits on what can at best be extracted from this information and therefore also on what can at best be said about the outcomes of future experiments. At first, this has nothing to do with laws of nature. But it tells us what optimal laws look like under probability. Interestingly, the quantum mechanical probability calculus is found as almost the best possibility. It meets with difficulties only when it must make predictions from a low amount of input information. We find that the quantum mechanical way of prediction does nothing but take the initial uncertainty volume of the representation space of the finite input information and move this volume about, without compressing or expanding it. However, we emphasize that any mechanistic imagery of particles, waves, fields, even space, must be seen as what it is: the human brain's way of portraying sensory impressions, mere images in our minds. Taking them as corresponding to anything in nature, while going a long way in the design of experiments, can become very counterproductive to science's task of finding laws. Here, the correct path seems to be the search for invariant structures in the empirical information, without any models. Once embarked on this road, the old question of how nature really is no longer seeks an answer in the muscular domain of mass, force, torque, and the like, which classical physics took as such unshakeable primary notions (not surprisingly, considering our ape origin, I cannot help commenting). Rather, one asks: Which of the structures principally detectable in probabilistic information are actually realized?

In the following sections we shall analyse the process of scientific investigation of nature under the probabilistic paradigm. We shall first look at how we gain information, then how we should best capture this information into numbers, and finally, what the ideal laws for making predictions should look like. The last step will bring the quantum mechanical time evolution, but will also indicate a problem due to finite information.

2 Gaining experimental information

Under the probabilistic paradigm basic physical observation is not very different from tossing a coin or blindly picking balls from an urn. One sets up specific conditions and checks what happens. And then one repeats this many times to gather statistically significant amounts of information. The difference from classical probabilistic experiments is that in quantum experiments one must carefully monitor the conditions and ensure they are the same for each trial. Any noticeable change constitutes a different experimental situation and must be avoided. [a]

Formally, one has a probabilistic experiment in which a single trial can give $K$ different outcomes, one of which happens. The probabilities of these outcomes, $p_1,\ldots,p_K$ ($\sum_j p_j = 1$), are determined by the conditions. But they are unknown. In order to find their values, and thereby the values of physical quantities functionally related to them, one does $N$ trials. Let us assume the outcomes $j = 1,\ldots,K$ happen $L_1,\ldots,L_K$ times, respectively ($\sum_j L_j = N$). The $L_j$ are random variables, subject to the multinomial probability distribution. Listing $L_1,\ldots,L_K$ represents the complete information gained in the $N$ trials. The customary way of representing the information is however by other random variables, the so-called relative frequencies $\nu_j = L_j/N$. Clearly, they also obey the multinomial probability distribution.

[a] Strictly speaking, identical trials are impossible. A deeper analysis of why one can neglect remote conditions might lead to an understanding of the notion of spatial distance, about which relativity says nothing, and which is badly missing in today's physics.

Examples:

* A trial in a spin-1/2 Stern-Gerlach experiment has two possible outcomes. This experiment is therefore governed by the binomial probability distribution.
* A trial in a GHZ experiment has eight possible outcomes, because each of the three particles can end up in one of two detectors [2]. Here, the relative frequencies follow the multinomial distribution of order eight.
* Measuring an intensity in a detector, which can only fire or not fire, is in fact an experiment where one repeatedly checks whether a firing occurs in a sufficiently small time interval. Thus one has a binomial experiment. If the rate of firing is small, the binomial distribution can be approximated by the Poisson distribution.

We must emphasize that the multinomial probability distribution is of utmost importance to physics under the probabilistic paradigm. This can be seen as follows: The conditions of a probabilistic experiment must be verified by auxiliary measurements. These are usually coarse classical measurements, but should actually also be probabilistic experiments of the most exacting standards. The probabilistic experiment of interest must therefore be done by ensuring that for each of its trials the probabilities of the outcomes of the auxiliary probabilistic experiments are the same. Consequently, empirical science is characterized by a succession of data-takings of multinomial probability distributions of various orders. The laws of physics are contained in the relations between the random variables from these different experiments. Since the statistical verification of these laws is again ruled by the properties of the multinomial probability distribution, we should expect that the inner structure of the multinomial probability distribution will appear in one form or another in the fundamental laws of physics. In fact, we might be led to the bold conjecture that, under the probabilistic paradigm, basic physical law is no more than the structures implicit in the multinomial probability distribution. There is no escape from this distribution. Whichever way we turn, we stumble across it as the unavoidable tool for connecting empirical data to physical ideas.

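The multinomial probability distribution of order $K$ is obtained when calculating the probability that, in $N$ trials, the outcomes $1,\ldots,K$ occur $L_1,\ldots,L_K$ times, respectively:

$$\mathrm{Prob}(L_1,\ldots,L_K \mid N, p_1,\ldots,p_K) = \frac{N!}{L_1!\cdots L_K!}\, p_1^{L_1}\cdots p_K^{L_K}. \qquad (2.1)$$

The expectation values of the relative frequencies are

$$\bar{\nu}_j = p_j \qquad (2.2)$$

and their standard deviations are

$$\Delta\nu_j = \sqrt{\frac{p_j(1-p_j)}{N}}. \qquad (2.3)$$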
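The moments of the relative frequencies can be checked numerically. The following is a minimal sketch, not part of the original analysis: it enumerates all outcomes of a small multinomial experiment and compares the exact mean and standard deviation of one relative frequency with the standard multinomial results $p_j$ and $\sqrt{p_j(1-p_j)/N}$. The function names and the example values (three outcomes, twelve trials) are illustrative assumptions.

```python
from itertools import product
from math import factorial, sqrt


def multinomial_prob(counts, probs):
    """Probability that the outcomes occur `counts` times in sum(counts) trials."""
    coeff = factorial(sum(counts))
    for c in counts:
        coeff //= factorial(c)  # every partial quotient is an integer
    prob = float(coeff)
    for c, p in zip(counts, probs):
        prob *= p ** c
    return prob


def frequency_moments(n_trials, probs, j):
    """Exact mean and standard deviation of nu_j = L_j / N by brute-force enumeration."""
    mean = 0.0
    second = 0.0
    for counts in product(range(n_trials + 1), repeat=len(probs)):
        if sum(counts) != n_trials:
            continue
        weight = multinomial_prob(counts, probs)
        nu = counts[j] / n_trials
        mean += weight * nu
        second += weight * nu * nu
    return mean, sqrt(second - mean * mean)


# Hypothetical example values, chosen only for illustration.
probs = [0.2, 0.3, 0.5]
n_trials = 12
mean, std = frequency_moments(n_trials, probs, j=0)
expected_std = sqrt(probs[0] * (1.0 - probs[0]) / n_trials)
print(f"mean = {mean:.6f}, std = {std:.6f}, sqrt(p(1-p)/N) = {expected_std:.6f}")
```

The enumeration is only feasible for small $N$ and $K$; it is used here because it avoids any sampling noise in the comparison.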

3 Efficient representation of probabilistic information

The reason why probabilistic information is most often represented by the relative frequencies $\nu_j$ seems to be history: Probability theory has originated as a method of estimating fractions of countable sets, when inspecting all elements was not possible (good versus bad apples in a large plantation, desirable versus undesirable outcomes in games of chance, etc.). The relative frequencies and their limits were the obvious entities to work with. But the information can be represented equally well by other random variables $\chi_j$, as long as these are one-to-one mappings $\chi_j(\nu_j)$, so that no information is lost. The question is whether there exists a most efficient representation.

To answer this, let us see what we know about the limits $p_1,\ldots,p_K$ before the experiment, but having decided to do $N$ trials. Our analysis is equivalent for all $K$ outcomes, so that we can pick out one and drop the subscript. We can use Chebyshev's inequality [4] to estimate the width of the interval to which the probability $p$ of the chosen outcome is pinned down. [b]

[b] Chebyshev's inequality states: For any random variable whose standard deviation exists, the probability that the value of the random variable deviates by more than $k$ standard deviations from its expectation value is less than, or equal to, $k^{-2}$. Here, $k$ is a free confidence parameter greater than 1.

If $N$ is not too small, we get

$$w_p = 2k\sqrt{\frac{\nu(1-\nu)}{N}}, \qquad (3.1)$$

where $k$ is a free confidence parameter. (Eq. (3.1) is not valid at $\nu = 0$ or 1.) Before the experiment we do not know $\nu$, so we can only give the upper limit,

$$w_p \le \frac{k}{\sqrt{N}}. \qquad (3.2)$$

But we can be much more specific about the limit $\bar{\chi}$ of the random variable $\chi(\nu)$, for which we require that, at least for large $N$, the standard deviation $\Delta\chi$ shall be independent of $p$ (or of $\bar{\chi}$ for that matter, since there will exist a function $p(\bar{\chi})$),

$$\Delta\chi = \frac{C}{\sqrt{N}}, \qquad (3.3)$$

where $C$ is an arbitrary real constant. For the derivation of the function $\chi(\nu)$ it is easiest to make use of the illustration in Fig. 1. Although it already shows the solution, the argument is general enough, so that the particular form of the discussed function does not matter. First we note that $\chi(\nu)$ shall be smooth, differentiable and strictly monotonic. For sufficiently large $N$ the probability distribution of $\nu$ can be approximated by a normal distribution centered at $\bar{\nu}$ and with standard deviation $\Delta\nu$. In other words, it will approach the gaussian form

$$\mathrm{Prob}(\nu \mid N, p) \approx r \exp\left[-\frac{(\nu-\bar{\nu})^2}{2(\Delta\nu)^2}\right], \qquad (3.4)$$

where $r$ is the normalization factor. But clearly, the corresponding probability distribution of $\chi$ will also tend to the gaussian form, with standard deviation $\Delta\chi$. (For instance, take the probability distributions of $\nu$ and $\chi$ for $p = 0.5$. These are the ones in the middle, as shown in Fig. 1.) And if $N$ is large, both $\Delta\nu$ and $\Delta\chi$ will be small, so that in the range of $\chi$ and $\nu$ where the probability is significantly different from zero, the curve $\chi(\nu)$ can be approximated by its tangent

$$\chi \approx \chi(\bar{\nu}) + \left(\frac{\partial\chi}{\partial\nu}\right)_{\nu=\bar{\nu}} (\nu-\bar{\nu}). \qquad (3.5)$$

Then it follows that the characteristic width of the probability distribution of $\chi$, which is $\Delta\chi$, will be proportional to the characteristic width of the probability distribution of $\nu$, which is $\Delta\nu$. The proportionality constant will be $|\partial\chi/\partial\nu|$, because this is by how much the distribution for $\nu$ gets 'squeezed' or 'stretched' to become the one for $\chi$. So we have, for large $N$,

$$\frac{\Delta\chi}{\Delta\nu} = \left|\frac{\partial\chi}{\partial\nu}\right|. \qquad (3.6)$$

Inserting (2.3) and (3.3) into (3.6) gives $\partial\chi/\partial\nu = C/\sqrt{\nu(1-\nu)}$, and integration yields

$$\chi = C \arcsin(2\nu - 1) + \theta, \qquad (3.7)$$

where $\theta$ is an arbitrary real constant [5]. For comparison with $\nu$ we confine $\chi$ to $[0,1]$ and thus set $C = \pi^{-1}$ and $\theta = 0.5$, as was already done in Fig. 1. Then we have $\Delta\chi = 1/(\pi\sqrt{N})$, and upon application of Chebyshev's inequality we get the interval $w_\chi$ to which we can pin down the unknown limit $\bar{\chi}$ as

$$w_\chi = \frac{2k}{\pi\sqrt{N}}. \qquad (3.8)$$

Clearly, this is narrower than the upper limit for $w_p$ in eq. (3.2). Having done no experiment at all, we have better knowledge on the value of $\bar{\chi}$ than on the value of $p$, although both can only be in the interval $[0,1]$. And note that the actual experimental data will add nothing to the accuracy with which we know $\bar{\chi}$, but they may add to the accuracy with which we know $p$. Nevertheless, even with data, $w_p$ may still be larger than $w_\chi$, especially when $p$ is around 0.5.
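The invariance of the standard deviation can be illustrated numerically. The sketch below assumes the transformation $\chi = \pi^{-1}\arcsin(2\nu - 1) + 0.5$ and a binomial experiment with $N = 100$ trials, the setting of Fig. 1. It computes the exact standard deviations of $\nu$ and $\chi$ from the binomial distribution for the five values of $p$ used in the figure. The helper names and the confidence parameter $k = 2$ are illustrative assumptions.

```python
from math import asin, comb, pi, sqrt


def chi(nu):
    """Real efficient random variable, assuming C = 1/pi and theta = 0.5."""
    return asin(2.0 * nu - 1.0) / pi + 0.5


def exact_std(transform, n_trials, p):
    """Exact standard deviation of transform(L/N) for binomially distributed L."""
    mean = 0.0
    second = 0.0
    for count in range(n_trials + 1):
        weight = comb(n_trials, count) * p ** count * (1.0 - p) ** (n_trials - count)
        value = transform(count / n_trials)
        mean += weight * value
        second += weight * value * value
    return sqrt(max(second - mean * mean, 0.0))


n_trials = 100  # same N as in Fig. 1
k = 2.0         # illustrative confidence parameter (an assumption)
target = 1.0 / (pi * sqrt(n_trials))

std_nu = {}
std_chi = {}
for p in (0.07, 0.25, 0.50, 0.75, 0.93):
    std_nu[p] = exact_std(lambda nu: nu, n_trials, p)
    std_chi[p] = exact_std(chi, n_trials, p)
    print(f"p={p:.2f}  std(nu)={std_nu[p]:.4f}  std(chi)={std_chi[p]:.4f}")

print(f"1/(pi*sqrt(N)) = {target:.4f}")
print(f"upper limit for w_p = {k / sqrt(n_trials):.4f},  w_chi = {2.0 * k * target:.4f}")
```

For finite $N$ the standard deviation of $\chi$ is only approximately constant, so small deviations from $1/(\pi\sqrt{N})$ near $p = 0$ and $p = 1$ are to be expected.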

For the representation of information the random variable $\chi$ is the proper choice, because it disentangles the two aspects of empirical information: the number of trials $N$, which is determined by the experimenter, not by nature, and the actual data, which are only determined by nature. The experimenter controls the accuracy $w_\chi$ by deciding $N$, nature supplies the data $\chi$, and thereby the whereabouts of the limit $\bar{\chi}$. In the real domain the only other random variables with this property are the linear transformations afforded by $C$ and $\theta$. From the physical point of view $\chi$ is of interest, because its standard deviation is an invariant of the physical conditions as contained in $p$ or $\bar{\chi}$. The random variable $\chi$ expresses empirical information with a certain efficiency, eliminating a numerical distortion that is due to the structure of the multinomial distribution, and which is apparent in all other random variables. We shall call $\chi$ an efficient random variable (ER). More generally, we shall call any random variable an ER whose standard deviation is asymptotically invariant of the limit the random variable tends to, eq. (3.3).

Another graphical depiction of the relation between $\nu$ and $\chi$ can be given by drawing a semicircle of diameter 1 along which we plot $\nu$ (Fig. 2a). By orthogonal projection onto the semicircle we get the random variable $\zeta = [\pi + 2\arcsin(2\nu - 1)]/4$ and thereby $\chi$, when we choose different constants. The drawing also suggests a simple way to obtain a complex ER. We scale the semicircle by an arbitrary real factor $a$, tilt it by an arbitrary angle $\varphi$, and place it into the complex plane as shown in Fig. 2b. This gives the random variable

$$\beta = a\left(\sqrt{\nu(1-\nu)} + i\nu\right) e^{i\varphi} + b \qquad (3.9)$$

where $b$ is an arbitrary complex constant. We get a very familiar special case by setting $a = 1$ and $b = 0$:

$$\psi = \left(\sqrt{\nu(1-\nu)} + i\nu\right) e^{i\varphi}. \qquad (3.10)$$

Figure 1: Functional relation between random variables $\nu$ and $\chi$, and their respective probability distributions as expected for $N = 100$ trials, plotted for five different values of $p$: .07, .25, .50, .75 and .93. The bar above each probability distribution indicates twice its standard deviation. Notice that the standard deviations of $\nu$ differ considerably for different $p$, while those of $\chi$ are all the same, as required in eq. (3.3).

Figure 2: (a) Graphical construction of the efficient random variable $\zeta$ (and thereby of $\chi$) from the observed relative frequency $\nu$. $\zeta$ is measured along the arc. (b) Similar construction of the efficient random variable $\beta$. It is given by its coordinates in the complex plane. The quantum mechanical probability amplitude $\psi$ is the normalized case of $\beta$, obtained by setting $a = 1$ and $b = 0$.

For large $N$ the probability distribution of $\nu$ becomes gaussian, but also that of any smooth function of $\nu$, as we have already seen in Fig. 1. Therefore the standard deviation of $\psi$ is obtained as

$$\Delta\psi = \left|\frac{\partial\psi}{\partial\nu}\right| \Delta\nu = \frac{1}{2\sqrt{N}}. \qquad (3.11)$$

Obviously, the random variable $\psi$ is an ER. It fulfills $|\psi|^2 = \nu$, and we recognize it as the probability amplitude of quantum theory, which we would infer from the observed relative frequency $\nu$. Note, however, that the intuitive way of getting the quantum mechanical probability amplitude, namely, by simply taking $\sqrt{\nu}\exp(i\alpha)$, where $\alpha$ is an arbitrary phase, does not give us an ER.
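Both statements can be checked with a few lines of code. The sketch below assumes the complex ER has the form $\psi = (\sqrt{\nu(1-\nu)} + i\nu)\,e^{i\varphi}$ and propagates the standard deviation of $\nu$, taken as $\sqrt{\nu(1-\nu)/N}$, through a numerical derivative. It does the same for the intuitive amplitude $\sqrt{\nu}\,e^{i\alpha}$. Function names, the step size and the sample values of $\nu$ are illustrative assumptions.

```python
import cmath
from math import sqrt


def psi(nu, phase=0.0):
    """Complex efficient random variable, assumed form (sqrt(nu(1-nu)) + i nu) e^{i phase}."""
    return (sqrt(nu * (1.0 - nu)) + 1j * nu) * cmath.exp(1j * phase)


def naive_amplitude(nu, phase=0.0):
    """Intuitive amplitude sqrt(nu) e^{i phase}, for comparison."""
    return sqrt(nu) * cmath.exp(1j * phase)


def propagated_std(amplitude, nu, n_trials, h=1e-6):
    """Large-N standard deviation |d amplitude / d nu| * sqrt(nu (1 - nu) / N)."""
    derivative = (amplitude(nu + h) - amplitude(nu - h)) / (2.0 * h)
    return abs(derivative) * sqrt(nu * (1.0 - nu) / n_trials)


n_trials = 100
for nu in (0.1, 0.5, 0.9):
    print(
        f"nu={nu:.1f}  |psi|^2={abs(psi(nu)) ** 2:.3f}  "
        f"std(psi)={propagated_std(psi, nu, n_trials):.4f}  "
        f"std(naive)={propagated_std(naive_amplitude, nu, n_trials):.4f}"
    )
```

In this approximation the propagated width of $\psi$ is the same for every $\nu$, whereas that of the intuitive amplitude changes with $\nu$.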

We have now two ways of representing the obtained information by ERs, either the real valued $\chi$ or the complex valued $\beta$. Since the relative frequency of each of the $K$ outcomes of a general probabilistic experiment can be converted to its respective efficient random variable, the information is efficiently represented by the vector $(\chi_1,\ldots,\chi_K)$, or by the vector $(\beta_1,\ldots,\beta_K)$. The latter is equivalent to the quantum mechanical state vector, if we normalize it: $(\psi_1,\ldots,\psi_K)$.

At this point it is not clear whether fundamental science could be built solely on the real ERs $\chi_j$ or whether it must rely on the complex ERs $\beta_j$, and for practical reasons on the normalized case $\psi_j$, as suggested by current formulations of quantum theory. We cannot address this problem here, but mention that working with the $\beta_j$ or $\psi_j$ can lead to nonsensical predictions, while working with the $\chi_j$ never does, so that the former are more sensitive to inconsistencies in the input data [6]. Therefore we use only the $\psi_j$ in the next section, but will not read them as if we were doing quantum theory.

4 Predictions

Let us now see whether the representation of probabilistic information by ERs suggests specific laws for predictions. A prediction is a statement on the expected values of the probabilities of the different outcomes of a probabilistic experiment, which has not yet been done, or whose data we just do not yet know, on the basis of auxiliary probabilistic experiments, which have been done, and whose data we do know. We intend to make a prediction for a probabilistic experiment with $Z$ outcomes, and wish to calculate the quantities $\phi_s$ ($s = 1,\ldots,Z$), which shall be related to the predicted probabilities $P_s$ as $P_s = |\phi_s|^2$. We do not presuppose that the $\phi_s$ are ERs.

We assume we have done $M$ different auxiliary probabilistic experiments of various multinomial order $K_m$, $m = 1,\ldots,M$, and we think that they provided all the input information needed to predict the $\phi_s$, and therefore the $P_s$. With (3.10) the obtained information is represented by the ERs $\psi_j^m$, where $m$ denotes the experiment and $j$ labels a possible outcome in it ($j = 1,\ldots,K_m$). Then the predictions are

$$\phi_s = \phi_s\left(\psi_1^1,\ldots,\psi_{K_1}^1,\ldots,\psi_1^M,\ldots,\psi_{K_M}^M\right), \qquad (4.1)$$

and their standard deviations are, by the usual convolution of gaussians as approximations of the multinomial distributions,

$$\Delta\phi_s = \sqrt{\sum_{m=1}^{M} \frac{1}{4N_m} \sum_{j=1}^{K_m} \left|\frac{\partial\phi_s}{\partial\psi_j^m}\right|^2}, \qquad (4.2)$$

where $N_m$ is the number of trials of the $m$th auxiliary experiment. If we wish the $\phi_s$ to be ERs, we must demand that the $\Delta\phi_s$ depend only on the $N_m$. (A technical requirement is that in each of the $M$ auxiliary experiments one of the phases of the ERs $\psi_j^m$ cannot be chosen freely, otherwise the second summations in (4.2) could not go to $K_m$, but only to $K_m - 1$.) Then the derivatives in (4.2) must be constants, implying that the $\phi_s$ are linear in the $\psi_j^m$. However, we cannot simply assume such linearity, because (4.1) contains the laws of physics, which cannot be known a priori. But we want to point out that a linear relation for (4.1) has very exceptional properties, so that it would be nice if we found it realized in nature. To be specific, if the $N_m$ are sufficiently large, linearity would afford predictive power which no other functional relation could achieve: It would be sufficient to know the number of trials of each auxiliary probabilistic experiment in order to specify the accuracy of the predicted $\phi_s$. No data would be needed, only a decision how many trials each auxiliary experiment will be given! Moreover, even the slightest increase of the amount of input information, by only doing one more trial in any of the auxiliary experiments, would lead to better accuracy of the predicted $\phi_s$, by bringing a definite decrease of the $\Delta\phi_s$. This latter property is absent in virtually all other functional relations conceivable for (4.1). In fact, most nonlinear relations would allow more input information to result in less accurate predictions. This would undermine the very idea of empirical science, namely that by observation our knowledge about nature can only increase, never just stay the same, let alone decrease. For this reason we assume linearity and apply it to a concrete example.
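The role of linearity can be made concrete with a small numerical sketch. It assumes that the standard deviation of a predicted quantity is $\Delta\phi_s = \big[\sum_m (4N_m)^{-1} \sum_j |\partial\phi_s/\partial\psi_j^m|^2\big]^{1/2}$. For a linear prediction the partial derivatives are fixed coefficients, so the result depends only on the trial numbers; for a nonlinear prediction (here the hypothetical example $\phi = \psi_1\psi_2$) the derivatives depend on the data. All names and numbers below are illustrative assumptions.

```python
from math import sqrt


def predicted_std(partials, trials):
    """Standard deviation of a prediction.

    partials[m][j] is the derivative of the prediction with respect to the
    j-th ER of auxiliary experiment m; trials[m] is its number of trials N_m.
    """
    total = 0.0
    for derivatives, n_m in zip(partials, trials):
        total += sum(abs(d) ** 2 for d in derivatives) / (4.0 * n_m)
    return sqrt(total)


# Linear prediction phi = 0.6 * psi_1 + 0.8j * psi_2 from one auxiliary experiment:
# the derivatives are the constant coefficients, whatever the data are.
coefficients = [[0.6, 0.8j]]
linear_100 = predicted_std(coefficients, [100])
linear_101 = predicted_std(coefficients, [101])


def nonlinear_std(psi_1, psi_2, n_trials):
    """Hypothetical nonlinear prediction phi = psi_1 * psi_2: derivatives are (psi_2, psi_1)."""
    return predicted_std([[psi_2, psi_1]], [n_trials])


print(f"linear, N=100: {linear_100:.5f}   linear, N=101: {linear_101:.5f}")
print(f"nonlinear, data A: {nonlinear_std(0.3, 0.9, 100):.5f}")
print(f"nonlinear, data B: {nonlinear_std(0.7, 0.7, 100):.5f}")
```

With the linear relation one more trial always lowers the predicted standard deviation, and no data enter the calculation.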

We take a particle in a one dimensional box of width $w$. Alice repeatedly prepares the particle in a state only she knows. At time $t$ after the preparation Bob measures the position by subdividing the box into $K$ bins of width $w/K$ and checking in which he finds the particle. In $N$ trials Bob obtains the relative frequencies $\nu_1,\ldots,\nu_K$, giving a good idea of the particle's position probability distribution at time $t$. He represents this information by the ERs $\psi_j$ of (3.10) and wants to use it to predict the position probability distribution at time $T$ ($T > t$).

First he predicts for $t + dt$. With (4.1) the predicted $\phi_s$ must be linear in the $\psi_j$ if they are to be ERs,

$$\phi_s(t + dt) = \sum_{j=1}^{K} a_{sj}\psi_j. \qquad (4.3)$$

Clearly, when $dt \to 0$ we must have $a_{sj} = 1$ for $s = j$ and $a_{sj} = 0$ otherwise, so we can write

$$a_{sj}(t) = \delta_{sj} + g_{sj}(t)\,dt, \qquad (4.4)$$

where $g_{sj}(t)$ are the complex elements of a matrix $G$ and we included the possibility that they depend on $t$. Using matrix notation and writing the $\phi_s$ and $\psi_j$ as column vectors we have

$$\vec{\phi}(t + dt) = [1 + G(t)\,dt]\,\vec{\psi}. \qquad (4.5)$$

For a prediction for time $t + 2dt$ we must apply another such linear transformation to the prediction we had for $t + dt$,

$$\vec{\phi}(t + 2dt) = [1 + G(t + dt)\,dt]\,\vec{\phi}(t + dt). \qquad (4.6)$$

Replacing $t + dt$ by $t$, and using $\vec{\phi}(t + dt) = \vec{\phi}(t) + \frac{\partial\vec{\phi}(t)}{\partial t}\,dt$, we have

$$\frac{\partial\vec{\phi}(t)}{\partial t} = G(t)\,\vec{\phi}(t). \qquad (4.7)$$

With (3.10) the input vector was normalized, $|\vec{\psi}|^2 = 1$. We also demand this from the vector $\vec{\phi}$. This results in the constraint that the diagonal elements $g_{ss}$ must be imaginary and the off-diagonal elements must fulfill $g_{sj} = -g_{js}^*$. And then we obviously have an evolution equation just as we know it from quantum theory.
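The effect of the constraint on $G$ can be seen by iterating the infinitesimal step numerically. The sketch below repeatedly applies $[1 + G\,dt]$ to a normalized two-component vector, once with a matrix whose diagonal is imaginary and whose off-diagonal elements satisfy $g_{sj} = -g_{js}^*$, and once with a matrix that violates the diagonal condition. The matrices, the initial vector and the step count are illustrative assumptions; the finite step size makes the conservation of the norm approximate.

```python
def step(phi, g, dt):
    """Apply [1 + G dt] once to the vector phi."""
    size = len(phi)
    return [phi[s] + dt * sum(g[s][j] * phi[j] for j in range(size)) for s in range(size)]


def norm_squared(phi):
    """Sum of |phi_s|^2 over all components."""
    return sum(abs(component) ** 2 for component in phi)


def evolve(phi, g, total_time, steps):
    """Chain `steps` infinitesimal transformations with a time-independent G."""
    dt = total_time / steps
    for _ in range(steps):
        phi = step(phi, g, dt)
    return phi


# Illustrative matrices (assumptions): imaginary diagonal and g_sj = -conj(g_js) ...
g_constrained = [[0.5j, 1.0 + 0.3j], [-1.0 + 0.3j, -0.2j]]
# ... versus a real entry on the diagonal, which violates the constraint.
g_unconstrained = [[0.5, 1.0 + 0.3j], [-1.0 + 0.3j, -0.2j]]

phi_start = [0.6, 0.8j]  # normalized: 0.36 + 0.64 = 1
norm_constrained = norm_squared(evolve(phi_start, g_constrained, 1.0, 20000))
norm_unconstrained = norm_squared(evolve(phi_start, g_unconstrained, 1.0, 20000))
print(f"constrained G:   |phi|^2 = {norm_constrained:.6f}")
print(f"unconstrained G: |phi|^2 = {norm_unconstrained:.6f}")
```

Only the constrained matrix keeps the predicted probabilities summing to one, up to an error that shrinks with the step size.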

For a quantitative prediction we need to know $G(t)$ and the phases $\varphi_j$ of the initial $\psi_j$. We had assumed the $\varphi_j$ to be arbitrary. But now we see that they influence the prediction, and therefore they attain physical significance. $G(t)$ is an anti-Hermitian complex $K \times K$ matrix, the generator of a unitary evolution. For fixed conditions it is independent of time, and with the properties found above, it is given by $K^2 - 1$ real numbers. The initial vector $\vec{\psi}$ has $K$ complex components. It is normalized and one phase is free, so that it is fixed by $2K - 2$ real numbers. Altogether $K^2 + 2K - 3 = (K + 3)(K - 1)$ numbers are needed to enable prediction. Since one probabilistic experiment yields $K - 1$ numbers, Bob must do $K + 3$ probabilistic experiments with different delay times between Alice's preparation and his measurement to obtain sufficient input information. But neither Planck's constant nor the particle's mass are needed. It should be noted that this analysis remains unaltered if the initial vector $\vec{\psi}$ is obtained from measurement of joint probability distributions of several particles. Therefore, (4.7) also contains entanglement between particles.
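The counting argument is simple arithmetic and can be tabulated. The following sketch uses the counts stated in the text ($K^2 - 1$ real numbers for the matrix, $2K - 2$ for the initial vector, $K - 1$ numbers per experiment); the function name and the chosen range of $K$ are illustrative assumptions.

```python
def required_experiments(k):
    """Return (total real parameters, number of experiments) for K = k outcomes."""
    generator_parameters = k * k - 1      # real numbers fixing the matrix G
    initial_vector_parameters = 2 * k - 2  # normalized vector with one free phase
    total = generator_parameters + initial_vector_parameters
    numbers_per_experiment = k - 1         # independent relative frequencies
    return total, total // numbers_per_experiment


for k in range(2, 7):
    total, experiments = required_experiments(k)
    print(f"K={k}: {total} real numbers, {experiments} experiments")
```

For example, $K = 2$ bins require 5 real numbers and hence 5 experiments with different delay times.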

5 Discussion

This paper was based on the insight that under the probabilistic paradigm data from observations are subject to the multinomial probability distribution. For the representation of the empirical information we searched for random variables which are stripped of numerical artefacts. They should therefore have an invariance property. We found as unique random variables a real and a complex class of efficient random variables (ERs). They capture the obtained information more efficiently than others, because their standard deviation is an asymptotic invariant of the physical conditions. The quantum mechanical probability amplitude is the normalized case of the complex class. It is natural that fundamental probabilistic science should use such random variables rather than any others as the representors of the observed information, and therefore as the carriers of meaning.

Using the ERs for prediction has given us an evolution prescription which is equivalent to the quantum theoretical way of applying a sequence of infinitesimal rotations to the state vector in Hilbert space [7]. It seems that simply analysing how we gain empirical information, what we can say from it about expected future information, and not succumbing to the lure of the question what is behind this information, can give us a basis for doing physics. This confirms the operational approach to science. And it is in support of Wheeler's It-from-Bit hypothesis [8], Weizsäcker's ur-theory [9], Eddington's idea that information increase itself defines the rest [10], Anandan's conjecture of absence of dynamical laws [11], Bohr and Ulfbeck's hypothesis of mere symmetry [12] or the recent 1 Bit - 1 Constituent hypothesis of Brukner and Zeilinger [13].

In view of the analysis presented here the quantum theoretical probability calculus is an almost trivial consequence of probability theory, but not as applied to 'objects' or anything 'physical', but as applied to the naked data of probabilistic experiments. If we continue this idea we encounter a deeper problem, namely whether the space which we consider physical, this 3- or higher dimensional manifold in which we normally assume the world to unfurl [14], cannot also be understood as a peculiar way of representing data. Kant conjectured this, in somewhat different words, over 200 years ago [15]. And indeed it is clearly so, if we imagine the human observer as a robot who must find a compact memory representation of the gigantic data stream it receives through its senses [16]. That is why our earlier example of the particle in a box should only be seen as illustration by means of familiar terms. It should not imply that we accept the naive conception of space or things, like particles, 'in' it, although this view works well in everyday life and in the laboratory, as long as we are not doing quantum experiments. We think that a full acceptance of the probabilistic paradigm as the basis of empirical science will eventually require an attack on the notions of spatial distance and spatial dimension from the point of view of optimal representation of probabilistic information.

Finally, we want to remark on a difference between our analysis and quantum theory. We have emphasized that the standard deviations of the ERs $\chi$ and $\psi$ become independent of the limits of these ERs only when we have infinitely many trials. But there is a departure for finitely many trials, especially for values of $p$ close to 0 and close to 1. With some imagination this can be noticed in Fig. 1 in the top and bottom probability distributions of $\chi$, which are a little bit wider than those in the middle. But as we always have only finitely many trials, there should exist random variables which fulfill our requirement for an ER even better than $\chi$ and $\psi$. This implies that predictions based on these unknown random variables should also be more precise! Whether we should see this as a fluke of statistics, or as a need to amend quantum theory, is a debatable question. But it should be testable. We need to have a number of different probabilistic experiments, all of which are done with only very few trials. From this we want to predict the outcomes of another probabilistic experiment, which is then also done with only few trials. Presumably, the optimal procedure of prediction will not be the one we have presented here (and therefore not quantum theory). The difficulty with such tests is however that, in the usual interpretation of data, statistical theory and quantum theory are treated as separate, while one message of this paper may also be that under the probabilistic paradigm the bottom level of physical theory should be equivalent to optimal representation of probabilistic information, and this theory should not be in need of additional purely statistical theories to connect it to actual data. We are discussing this problem in a future paper [17].

Acknowledgments

This paper is a result of pondering what I am doing in the lab, how it can be that in the evening I know more than I knew in the morning, and discussing this with G. Krenn, K. Svozil, C. Brukner, M. Zukovski and a number of other people.

References

1. M. Born, Zeitschrift f. Physik 37, 863 (1926); Brit. J. Philos. Science 4, 95 (1953).

2. D. Bouwmeester et al., Phys. Rev. Lett. 82, 1345 (1999) and references therein.

3. W. Feller, An Introduction to Probability Theory and its Applications, (John Wiley and Sons, New York, 3rd edition, 1968), Vol.1, p.168.

4. ibid., p.233.

5. The connection of this relation to quantum physics was first stressed by W. K. Wootters, Phys. Rev. D 23, 357 (1981).

6. We give the example in quant-ph/0008098.

7. Several authors have noted that probability theory itself suggests quantum theory: A. Lande, Am. J. Phys. 42, 459 (1974); A. Peres, Quantum Theory: Concepts and Methods, (Kluwer Academic Publishers, Dordrecht, 1998); D. I. Fivel, Phys. Rev. A 50, 2108 (1994).

8. J. A. Wheeler in Quantum Theory and Measurement, eds. J. A. Wheeler and W. H. Zurek (Princeton University Press, Princeton, 1983) 182.

9. C. F. von Weizsacker, Aufbau der Physik (Hanser, Munich, 1985). Holger Lyre, Int. J. Theor. Phys., 34, 1541 (1995). Also quant-ph/9703028.

10. C. W. Kilmister, Eddington's Search for a Fundamental Theory (Cambridge University Press, 1994).

11. J. Anandan, Found. Phys. 29, 1647 (1999).

12. A. Bohr and O. Ulfbeck, Rev. Mod. Phys. 67, 1 (1995).

13. C. Brukner and A. Zeilinger, Phys. Rev. Lett. 83, 3354 (1999).

14. A penetrating analysis of the view of space implied by quantum theory is given by U. Mohrhoff, Am. J. Phys. 68 (8), 728 (2000).

15. Immanuel Kant, Critik der reinen Vernunft (Critique of Pure Reason), Riga (1781). There should be many English translations.

16. E.T. Jaynes introduced the 'reasoning robot' in his book Probability Theory: The Logic of Science in order to eliminate the problem of subjectivism that has been plaguing probability theory and quantum theory alike. The book is freely available at http://bayes.wustl.edu/etj/prob.html

17. J. Summhammer (to be published).

QUANTUM CRYPTOGRAPHY IN SPACE AND BELL'S THEOREM

IGOR VOLOVICH

Steklov Mathematical Institute, Gubkin St. 8, GSP-1, 117966, Moscow, Russia


Bell's theorem states that some quantum correlations can not be represented by classical correlations of separated random variables. It has been interpreted as incompatibility of the requirement of locality with quantum mechanics. We point out that in fact the space part of the wave function was neglected in the proof of Bell's theorem. However this space part is crucial for considerations of property of locality of quantum system. Actually the space part leads to an extra factor in quantum correlations and as a result the ordinary proof of Bell's theorem fails in this case. Bell's theorem constitutes an important part in quantum cryptography. The promise of secure cryptographic quantum key distribution schemes is based on the use of Bell's theorem in the spin space. In many current quantum cryptography protocols the space part of the wave function is neglected. As a result such schemes can be secure against eavesdropping attacks in the abstract spin space but they could be insecure in the real three-dimensional space. We discuss an approach to the security of quantum key distribution in space by using a special preparation of the space part of the wave function.

1 Introduction

Bell's theorem [1] states that there are quantum correlation functions that can not be represented as classical correlation functions of separated random variables. It has been interpreted as incompatibility of the requirement of locality with the statistical predictions of quantum mechanics [1]. For a recent discussion of Bell's theorem see, for example, [2-17] and references therein. It is now widely accepted, as a result of Bell's theorem and related experiments, that "local realism" must be rejected.

Evidently, the very formulation of the problem of locality in quantum mechanics is based on ascribing a special role to the position in ordinary three-dimensional space. It is rather surprising therefore that the space dependence of the wave function is neglected in discussions of the problem of locality in relation to Bell's inequalities. Actually it is the space part of the wave function which is relevant to the consideration of the problem of locality.

In this note we point out that the space part of the wave function leads to an extra factor in quantum correlation and as a result the ordinary proof of Bell's theorem fails in this case. We present a criterion of locality (or nonlocality) of quantum theory in a realist model of hidden variables. We argue that predictions of quantum mechanics can be consistent with Bell's inequalities for Gaussian wave functions and hence Einstein's local realism is restored in this case.

Bell's theorem constitutes an important part in quantum cryptography [19]. It is now generally accepted that techniques of quantum cryptography can allow secure communications between distant parties [18-25]. The promise of secure cryptographic quantum key distribution schemes is based on the use of quantum entanglement in the spin space and on the quantum no-cloning theorem. An important contribution of quantum cryptography is a mechanism for detecting eavesdropping.

However in many current quantum cryptography protocols the space part of the wave function is neglected. But exactly the space part of the wave function describes the behaviour of particles in ordinary real three-dimensional space. As a result such schemes can be secure against eavesdropping attacks in the abstract spin space but could be insecure in the real three-dimensional space.

It follows that proofs of the security of quantum cryptography schemes which neglect the space part of the wave function could fail against attacks in the real three-dimensional space. We will discuss how one can try to improve the security of quantum cryptography schemes in space by using a special preparation of the space part of the wave function.

2 Bell's Inequality

In the presentation of Bell's theorem we will follow [17], where one can also find more references. The mathematical formulation of Bell's theorem reads:

$$\cos(\alpha - \beta) \neq E\,\xi_\alpha \eta_\beta \tag{2.1}$$

where $\xi_\alpha$ and $\eta_\beta$ are two random processes such that $|\xi_\alpha| \le 1$, $|\eta_\beta| \le 1$ and $E$ is the expectation. Let us discuss in more detail the physical interpretation of this result. Consider a pair of spin one-half particles formed in the singlet spin state and moving freely towards two detectors (Alice and Bob). If one neglects the space part of the wave function then the quantum mechanical correlation of two spins in the singlet state $\psi_{spin}$ is

$$D_{spin}(a, b) = \langle \psi_{spin} | \sigma \cdot a \otimes \sigma \cdot b | \psi_{spin} \rangle = -a \cdot b \tag{2.2}$$

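Here $a$ and $b$ are two unit vectors in three-dimensional space, $\sigma = (\sigma_1, \sigma_2, \sigma_3)$ are the Pauli matrices and

$$\psi_{spin} = \frac{1}{\sqrt{2}} \left( |{\uparrow}\rangle \otimes |{\downarrow}\rangle - |{\downarrow}\rangle \otimes |{\uparrow}\rangle \right)$$

is the singlet state.

Bell's theorem states that the function $D_{spin}(a, b)$ of Eq. (2.2) cannot be represented in the form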

$$P(a, b) = \int \xi(a, \lambda)\, \eta(b, \lambda)\, d\rho(\lambda) \tag{2.3}$$

i.e.

$$D_{spin}(a, b) \neq P(a, b) \tag{2.4}$$

Here $\xi(a, \lambda)$ and $\eta(b, \lambda)$ are random fields on the sphere, $|\xi(a, \lambda)| \le 1$, $|\eta(b, \lambda)| \le 1$, and $d\rho(\lambda)$ is a positive probability measure, $\int d\rho(\lambda) = 1$. The parameters $\lambda$ are interpreted as hidden variables in a realist theory. It is clear that Eq. (2.4) can be reduced to Eq. (2.1).

One has the following Bell-Clauser-Horne-Shimony-Holt (CHSH) inequality

$$|P(a, b) - P(a, b') + P(a', b) + P(a', b')| \le 2 \tag{2.5}$$

On the other hand, there are vectors ($a \cdot b = a' \cdot b = a' \cdot b' = -a \cdot b' = \sqrt{2}/2$) for which one has

$$|D_{spin}(a, b) - D_{spin}(a, b') + D_{spin}(a', b) + D_{spin}(a', b')| = 2\sqrt{2} \tag{2.6}$$

Therefore if one supposes that $D_{spin}(a, b) = P(a, b)$ then one gets a contradiction.
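The contradiction between (2.5) and (2.6) can be checked numerically. The following is a minimal sketch, assuming the four unit vectors lie in one plane so that each is described by a single angle; the function names and the particular angles are illustrative choices, not taken from the text.

```python
import math


def d_spin(theta_a, theta_b):
    """Singlet correlation -a.b for coplanar unit vectors given by their angles."""
    return -math.cos(theta_a - theta_b)


def chsh(corr, a, a_prime, b, b_prime):
    """CHSH combination |C(a,b) - C(a,b') + C(a',b) + C(a',b')| as in (2.5)."""
    return abs(corr(a, b) - corr(a, b_prime) + corr(a_prime, b) + corr(a_prime, b_prime))


# Illustrative angles satisfying a.b = a'.b = a'.b' = -a.b' = sqrt(2)/2.
A, A_PRIME = 0.0, math.pi / 2
B, B_PRIME = math.pi / 4, 3 * math.pi / 4

chsh_value = chsh(d_spin, A, A_PRIME, B, B_PRIME)
print(chsh_value, 2 * math.sqrt(2))
```

With these angles the combination exceeds the classical bound 2 of (2.5), which is the contradiction used in the proof.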

It will be shown below that if one takes into account the space part of the wave function then the quantum correlation in the simplest case will take the form $g \cos(\alpha - \beta)$ instead of just $\cos(\alpha - \beta)$, where the parameter $g$ describes the location of the system in space and time. In this case one can get the representation

$$g \cos(\alpha - \beta) = E\,\xi_\alpha \eta_\beta \tag{2.7}$$

if g is small enough (see below). The factor g gives a contribution to visibility or efficiency of detectors that are used in the phenomenological description of detectors.

3 Localized Detectors

In the previous section the space part of the wave function of the particles was neglected. However, it is precisely the space part that is relevant to the discussion of locality. The complete wave function is $\psi = (\psi_{\alpha\beta}(r_1, r_2))$ where $\alpha$ and $\beta$ are spinor indices and $r_1$ and $r_2$ are vectors in three-dimensional space.

We suppose that Alice and Bob have detectors which are located within the two localized regions $O_A$ and $O_B$ respectively, well separated from one another.

The quantum correlation describing the measurements of spins by Alice and Bob at their localized detectors is

$$G(a, O_A, b, O_B) = \langle \psi | \sigma \cdot a\, P_{O_A} \otimes \sigma \cdot b\, P_{O_B} | \psi \rangle \tag{3.1}$$

Here $P_O$ is the projection operator onto the region $O$. Let us consider the case when the wave function has the form of the product of the spin function and the space function, $\psi = \psi_{spin}\, \phi(r_1, r_2)$. Then one has

$$G(a, O_A, b, O_B) = g(O_A, O_B)\, D_{spin}(a, b) \tag{3.2}$$

where the function

$$g(O_A, O_B) = \int_{O_A \times O_B} |\phi(r_1, r_2)|^2\, dr_1\, dr_2 \tag{3.3}$$

describes the correlation of particles in space. It is the probability to find one particle in the region $O_A$ and another particle in the region $O_B$. One has

$$0 \le g(O_A, O_B) \le 1 \tag{3.4}$$

Remark. In relativistic quantum field theory there is no nonzero strictly localized projection operator that annihilates the vacuum. It is a consequence of the Reeh-Schlieder theorem. Therefore, apparently, the function $g(O_A, O_B)$ should always be strictly smaller than 1. I am grateful to W. Luecke for this remark.

Now one inquires whether one can write the representation

$$g(O_A, O_B)\, D_{spin}(a, b) = \int \xi(a, O_A, \lambda)\, \eta(b, O_B, \lambda)\, d\rho(\lambda) \tag{3.5}$$

Note that if we are interested in the conditional probability of finding the projection of spin along the vector $a$ for particle 1 in the region $O_A$ and the projection of spin along the vector $b$ for particle 2 in the region $O_B$, then we have to divide both sides of Eq. (3.5) by $g(O_A, O_B)$.

The factor $g$ is important. In particular one can write the following representation [15] for $0 \le g \le 1/2$:

$$g \cos(\alpha - \beta) = \int_0^{2\pi} \sqrt{2g} \cos(\alpha - \lambda)\, \sqrt{2g} \cos(\beta - \lambda)\, \frac{d\lambda}{2\pi} \tag{3.6}$$
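The representation (3.6) can be verified numerically. The sketch below assumes the reading of (3.6) in which the hidden variable $\lambda$ is uniform on $[0, 2\pi]$ and the two random variables are $\sqrt{2g}\cos(\alpha - \lambda)$ and $\sqrt{2g}\cos(\beta - \lambda)$; the function name and the midpoint quadrature are illustrative choices.

```python
import math


def hidden_variable_correlation(g, alpha, beta, steps=1000):
    """Average of xi * eta over lambda uniform on [0, 2*pi], midpoint rule.

    xi = sqrt(2g) cos(alpha - lambda), eta = sqrt(2g) cos(beta - lambda).
    The bound |xi|, |eta| <= 1 requires 0 <= g <= 1/2.
    """
    if not 0.0 <= g <= 0.5:
        raise ValueError("the representation needs 0 <= g <= 1/2")
    amplitude = math.sqrt(2.0 * g)
    step = 2.0 * math.pi / steps
    total = 0.0
    for k in range(steps):
        lam = (k + 0.5) * step
        xi = amplitude * math.cos(alpha - lam)
        eta = amplitude * math.cos(beta - lam)
        total += xi * eta
    return total * step / (2.0 * math.pi)


g, alpha, beta = 0.3, 0.7, -1.1
print(hidden_variable_correlation(g, alpha, beta), g * math.cos(alpha - beta))
```

The restriction $g \le 1/2$ is what keeps both random variables within the interval $[-1, 1]$, as required in (2.1).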

Let us now apply these considerations to quantum cryptography.

4 Quantum Key Distribution

Ekert [19] showed that one can use the EPR correlations to establish a secret random key between two parties ("Alice" and "Bob"). Bell's inequalities are used to check the presence of an intermediate eavesdropper ("Eve"). There are two stages to the Ekert protocol, the first stage over a quantum channel, the second over a public channel.

The quantum channel consists of a source that emits pairs of spin one-half particles in a singlet state. The particles fly apart towards Alice and Bob, who, after the particles have separated, perform measurements on spin components along one of three directions, given by unit vectors $a$ and $b$. In the second stage Alice and Bob communicate over a public channel. They announce in public the orientations of the detectors they have chosen for particular measurements. Then they divide the measurement results into two separate groups: a first group for which they used different orientations of the detectors, and a second group for which they used the same orientation of the detectors. Now Alice and Bob can reveal publicly the results they obtained, but within the first group of measurements only. This allows them, by using Bell's inequality, to establish the presence of an eavesdropper (Eve). The results of the second group of measurements can be converted into a secret key. One supposes that Eve has a detector which is located within the region $O_E$ and that she is described by hidden variables $\lambda$.
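The sorting of results into the two groups can be sketched with a small simulation. This is an illustration only, under stated assumptions: the space part of the wave function is ignored, there is no eavesdropper, the analyser angles are the coplanar sets of Ekert's original scheme (not given in this text), and the singlet statistics are sampled directly from the correlation $-\cos(\theta_a - \theta_b)$. All names are hypothetical.

```python
import math
import random

# Assumed analyser angles (Ekert's original choice, not stated in this text).
ALICE_ANGLES = (0.0, math.pi / 4, math.pi / 2)
BOB_ANGLES = (math.pi / 4, math.pi / 2, 3 * math.pi / 4)


def measure_singlet(theta_a, theta_b, rng):
    """Sample outcomes (+1 or -1) with correlation -cos(theta_a - theta_b)."""
    a = rng.choice((-1, 1))
    p_opposite = math.cos((theta_a - theta_b) / 2.0) ** 2
    b = -a if rng.random() < p_opposite else a
    return a, b


def run_protocol(rounds, seed=0):
    """Return (first group for the Bell test, Alice's key, Bob's key)."""
    rng = random.Random(seed)
    test_group, key_alice, key_bob = [], [], []
    for _ in range(rounds):
        theta_a = rng.choice(ALICE_ANGLES)
        theta_b = rng.choice(BOB_ANGLES)
        a, b = measure_singlet(theta_a, theta_b, rng)
        if theta_a == theta_b:
            # Second group: same orientation, results are anticorrelated,
            # so Bob flips his result to obtain the shared key bit.
            key_alice.append(a)
            key_bob.append(-b)
        else:
            # First group: different orientations, revealed publicly.
            test_group.append((theta_a, theta_b, a, b))
    return test_group, key_alice, key_bob


test_group, key_alice, key_bob = run_protocol(2000)
print(len(test_group), len(key_alice), key_alice == key_bob)
```

In this idealized setting the two keys agree exactly; the question raised in the text is how the Bell test on the first group changes once the spatial factor is taken into account.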

We will interpret Eve as a hidden variable in a realist theory and will study whether the quantum correlation Eq. (3.2) can be represented in the form Eq. (2.3). From (2.5), (2.6) and (3.5) one can see that if the following inequality

$$g(O_A, O_B) \le 1/\sqrt{2} \tag{4.1}$$

is valid for regions $O_A$ and $O_B$ which are well separated from one another, then there is no violation of the CHSH inequalities (2.5) and therefore Alice and Bob cannot detect the presence of an eavesdropper. On the other hand, if for a pair of well separated regions $O_A$ and $O_B$ one has

$$g(O_A, O_B) > 1/\sqrt{2} \tag{4.2}$$

then there could be a violation of the realist locality in these regions for a given state. Then, in principle, one can hope to detect an eavesdropper in these circumstances.
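The criterion can be made concrete with a short calculation. The sketch below assumes the factorized correlation $g\,D_{spin}$ of (3.2), so that the largest CHSH value from (2.6) is simply scaled by $g$; the function and constant names are hypothetical.

```python
import math

# Largest CHSH value of the pure spin correlation, Eq. (2.6).
MAX_SPIN_CHSH = 2.0 * math.sqrt(2.0)


def max_chsh(g):
    """Largest CHSH value when the correlation is g * D_spin, with 0 <= g <= 1."""
    if not 0.0 <= g <= 1.0:
        raise ValueError("g is a probability and must lie in [0, 1]")
    return g * MAX_SPIN_CHSH


def violation_possible(g):
    """True if the scaled correlation can exceed the classical bound 2 of (2.5)."""
    return max_chsh(g) > 2.0


for g in (0.25, 0.5, 0.70, 0.71, 1.0):
    print(g, round(max_chsh(g), 4), violation_possible(g))
```

Values of $g$ below $1/\sqrt{2} \approx 0.707$ leave the CHSH combination within the classical bound, which is the situation in which the Bell test cannot reveal an eavesdropper.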

Note that if we set $g(O_A, O_B) = 1$ in (3.5), as was done in the original proof of Bell's theorem, then it means we did a special preparation of the states of particles to be completely localized inside the detectors. There exist such well localized states (see however the previous Remark) but there exist also other states, with wave functions which are not very well localized inside the detectors, and still particles in such states are also observed in detectors. The fact that a particle is observed inside the detector does not mean, of course, that its wave function is strictly localized inside the detector before the measurement. Actually one has to perform a thorough investigation of the preparation and the evolution of our entangled states in space and time if one needs to estimate the function $g(O_A, O_B)$.

5 Gaussian Wave Functions

Now let us consider the criterion of locality for Gaussian wave functions. We will show that with a reasonable accuracy there is no violation of locality in this case. Let us take the wave function $\phi$ of the form $\phi = \psi_1(r_1)\psi_2(r_2)$ where the individual wave functions have the moduli

$$|\psi_1(r)|^2 = \left(\frac{m^2}{2\pi}\right)^{3/2} e^{-m^2 r^2/2}, \qquad |\psi_2(r)|^2 = \left(\frac{m^2}{2\pi}\right)^{3/2} e^{-m^2 (r - l)^2/2} \tag{5.1}$$

We suppose that the length of the vector $l$ is much larger than $1/m$. We can make measurements of $P_{O_A}$ and $P_{O_B}$ for any well separated regions $O_A$ and $O_B$. Let us suppose a rather unfavorable case for the criterion of locality, when the wave functions of the particles are almost localized inside the regions $O_A$ and $O_B$ respectively. In such a case the function $g(O_A, O_B)$ can take values near its maximum. We suppose that the region $O_A$ is given by $|r_i| < 1/m$, $r = (r_1, r_2, r_3)$, and the region $O_B$ is obtained from $O_A$ by translation by $l$. Hence $\psi_1(r_1)$ is a Gaussian function with modulus appreciably different from zero only in $O_A$, and similarly $\psi_2(r_2)$ is localized in the region $O_B$. Then we have

$$g(O_A, O_B) = \left( \frac{1}{\sqrt{2\pi}} \int_{-1}^{1} e^{-x^2/2}\, dx \right)^6 \tag{5.2}$$

One can estimate (5.2) as

$$g(O_A, O_B) < \left( \frac{2}{\pi} \right)^3 \tag{5.3}$$

which is smaller than 1/2. Therefore the locality criterion (4.1) is satisfied in this case.
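The size of $g$ in this example can be evaluated directly. The sketch below assumes that, for the product of two Gaussians and cubic regions with half-width $1/m$ in each coordinate, $g$ is the sixth power of a one-dimensional standard normal probability (three coordinates for each of the two particles); the function names are illustrative.

```python
import math


def one_coordinate_probability(half_width=1.0):
    """(1/sqrt(2 pi)) * integral of exp(-x^2/2) over [-half_width, half_width]."""
    return math.erf(half_width / math.sqrt(2.0))


def g_gaussian(half_width=1.0):
    """Probability that both particles are found in their cubic regions.

    half_width is the half-side of the cube in units of 1/m.
    """
    return one_coordinate_probability(half_width) ** 6


g_value = g_gaussian()
crude_bound = (2.0 / math.pi) ** 3
print(g_value, crude_bound, 1.0 / math.sqrt(2.0))
```

Under these assumptions $g$ is roughly 0.1, well below both the crude estimate $(2/\pi)^3$ and the threshold $1/\sqrt{2}$ of the locality criterion.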

Let us recall that there is a well known effect of expansion of wave packets due to the free time evolution. If $\epsilon$ is the characteristic length of the Gaussian wave packet describing a particle of mass $M$ at time $t = 0$ then at time $t$ the characteristic length $\epsilon_t$ will be

$$\epsilon_t = \epsilon \sqrt{1 + \frac{\hbar^2 t^2}{M^2 \epsilon^4}}$$

It tends to $(\hbar/M\epsilon)t$ as $t \to \infty$. Therefore the locality criterion is always satisfied for nonrelativistic particles if regions $O_A$ and $O_B$ are far enough from each other. The case of relativistic particles will be considered in a separate publication.
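To get a feeling for the scale of the spreading, one can evaluate it for an electron. The sketch below assumes the standard free-spreading law $\epsilon_t = \epsilon\sqrt{1 + (\hbar t / M\epsilon^2)^2}$, chosen because it has the asymptote $(\hbar/M\epsilon)t$ quoted in the text; the initial width and flight time are illustrative values only.

```python
import math

HBAR = 1.054571817e-34           # reduced Planck constant, J s
ELECTRON_MASS = 9.1093837015e-31  # kg


def packet_width(eps0, mass, t):
    """Characteristic length of a free Gaussian packet at time t (assumed law)."""
    return eps0 * math.sqrt(1.0 + (HBAR * t / (mass * eps0 ** 2)) ** 2)


def asymptotic_width(eps0, mass, t):
    """Large-time behaviour (hbar / (M * eps0)) * t quoted in the text."""
    return HBAR * t / (mass * eps0)


eps0 = 1e-6   # initial width of one micrometre (illustrative)
t = 1.0       # one second of free flight (illustrative)
print(packet_width(eps0, ELECTRON_MASS, t), asymptotic_width(eps0, ELECTRON_MASS, t))
```

For these values the packet grows from a micrometre to roughly a hundred metres, so a detector of fixed size captures only a small fraction of the probability and $g$ becomes small.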

6 Conclusions

It is shown in this note that if we do not neglect the space part of the wave function of two particles then the prediction of quantum mechanics can be consistent with Bell's inequalities. One can say that Einstein's local realism is restored in this case.

It would be interesting to investigate whether one can prepare a reasonable wave function for which the condition of nonlocality (4.2) is satisfied for a pair of well separated regions. In principle the function $g(O_A, O_B)$ can approach its maximal value 1 if the wave functions of the particles are very well localized within the detector regions $O_A$ and $O_B$ respectively. However, perhaps to establish such a localization one has to destroy the original entanglement because it was created far away from the detectors.

It is shown that the presence of the space part in the wave function of two particles in the entangled state leads to a problem in the proof of the security of quantum key distribution. To detect the eavesdropper's presence by using Bell's inequality we have to estimate the function $g(O_A, O_B)$. Only a special quantum key distribution protocol has been discussed here but it seems there are similar problems in other quantum cryptographic schemes as well.

We don't claim in this note that it is in principle impossible to increase the detectability of the eavesdropper. However it is not clear to the present author how to do it without a thorough investigation of the process of preparation of the entangled state and then its evolution in space and time towards Alice and Bob.

In the previous section Eve was interpreted as an abstract hidden variable. However one can assume that more information about Eve is available. In particular one can assume that she is located somewhere in space in a region $O_E$. It seems one has to study a generalization of the function $g(O_A, O_B)$, which depends not only on the Alice and Bob locations $O_A$ and $O_B$ but also on the Eve location $O_E$, and try to find a strategy which leads to an optimal value of this function.

7 Acknowledgments

This investigation was supported by the grant of Swedish Royal Academy of Sciences on the collaboration with states of the former Soviet Union and the Profile Mathematical Modeling of Vaxjo University. I would like to thank A. Khrennikov for the warm hospitality and fruitful discussions. This work is supported in part also by RFFI 99-01-00105 and INTAS 99-0590.

References

1. J.S. Bell, Physics 1, 195 (1964)

2. A. Peres, Quantum Theory: Concepts and Methods, Kluwer, Dordrecht, 1993

3. L.E. Ballentine, Quantum Mechanics, Prentice-Hall, 1990

4. W.M. de Muynck, W. De Baere and H. Martens, Found. of Physics, (1994), 1589

5. D.M. Greenberger, M.A. Horne, A. Shimony, and A. Zeilinger, Am. J. Phys. 58, 1131 (1990)

6. S.L. Braunstein, A. Mann, and M. Revzen, Phys. Rev. Lett. 68, 3259 (1992)

7. N.D. Mermin, Am. J. Phys. 62, 880 (1994)

8. G. M. D'Ariano, L. Maccone, M. F. Sacchi and A. Garuccio, Tomographic test of Bell's inequality, quant-ph/9907091

9. Luigi Accardi and Massimo Regoli, Locality and Bell's inequality, quant-ph/0007005

10. Andrei Khrennikov, Non-Kolmogorov probability models and modified Bell's inequality, quant-ph/0003017

11. Almut Beige, William J. Munro and Peter L. Knight, A Bell's Inequality Test with Entangled Atoms, quant-ph/0006054

12. F. Benatti and R. Floreanini, On Bell's locality tests with neutral kaons, hep-ph/9812353

13. A. Khrennikov, Statistical measure of ensemble nonreproducibility and correction to Bell's inequality, Nuovo Cimento, 115B (2000) 179

14. W. A. Hofer, Information transfer via the phase: A local model of Einstein-Podolsky-Rosen experiments, quant-ph/0006005

15. Igor Volovich, Yaroslav Volovich, Bell's Theorem and Random Variables, quant-ph/0009058

16. N. Gisin, V. Scarani, W. Tittel, H. Zbinden, Optical tests of quantum nonlocality: from EPR-Bell tests towards experiments with moving observers, quant-ph/0009055

17. Igor V. Volovich, Bell's Theorem and Locality in Space, quant-ph/0012010

18. C.H. Bennett and G. Brassard, in Proc. of the IEEE Int. Conf. on Computers, Systems, and Signal Processing, Bangalore, India (IEEE, New York, 1984) p. 175

19. A.K. Ekert, Phys. Rev. Lett. 67 (1991) 661

20. D. S. Naik, C. G. Peterson, A. G. White, A. J. Berglund, P. G. Kwiat, Entangled state quantum cryptography: Eavesdropping on the Ekert protocol, quant-ph/9912105

21. Gilles Brassard, Norbert Lutkenhaus, Tal Mor, Barry C. Sanders, Security Aspects of Practical Quantum Cryptography, quant-ph/9911054

22. Kei Inoue, Takashi Matsuoka, Masanori Ohya, New approach to Epsilon-entropy and Its comparison with Kolmogorov's Epsilon-entropy, quant-ph/9806027

23. Hoi-Kwong Lo, Will Quantum Cryptography ever become a successful technology in the marketplace?, quant-ph/9912011

24. Akihisa Tomita, Osamu Hirota, Security of classical noise-based cryptography, quant-ph/0002044

25. Yong-Sheng Zhang, Chuan-Feng Li, Guang-Can Guo, Quantum key distribution via quantum encryption, quant-ph/0011034

INTERACTING STOCHASTIC PROCESS AND RENORMALIZATION THEORY

YAROSLAV VOLOVICH

Physics Department, Moscow State University,

Vorobievi Gori, 119899 Moscow, Russia


A stochastic process with self-interaction as a model of quantum field theory is studied. We consider an Ornstein-Uhlenbeck stochastic process $x(t)$ with interaction of the form $x^{(\alpha)}(t)^4$, where $\alpha$ indicates the fractional derivative. Using Bogoliubov's R-operation we investigate ultraviolet divergencies for the various parameters $\alpha$. Ultraviolet properties of this one-dimensional model in the case $\alpha = 3/4$ are similar to those in the $\varphi^4_4$ theory but there are extra counterterms. It is shown that the model is two-loops renormalizable. For $5/8 \le \alpha < 3/4$ the model has a finite number of divergent Feynman diagrams. In the case $\alpha = 2/3$ the model is similar to the $\varphi^4_3$ theory. If $0 \le \alpha < 5/8$ then the model does not have ultraviolet divergencies at all. Finally if $\alpha > 3/4$ then the model is nonrenormalizable.

1 Introduction

There is a very fruitful interrelation between probability theory and quantum field theory [1-6]. In this note we consider a stochastic process that shows the same divergencies as quantum electrodynamics or $\varphi^4$ theory in the 4-dimensional spacetime. This stochastic process corresponds to one-dimensional Euclidean quantum field theory with the quartic interaction that contains fractional derivatives. This one-dimensional model can be used for studying the fundamental problem of non-perturbative investigation of renormalized quantum field theory [1,3]. It can also find applications in the theory of phase transitions [5,6].

The Interacting Stochastic Process. Let $x(t) = x(t, \omega)$ be an Ornstein-Uhlenbeck stochastic process with the correlation function

$$E(x(t)x(\tau)) = \frac{1}{2\pi} \int_{-\infty}^{\infty} \frac{e^{ip(t-\tau)}}{p^2 + m^2}\, dp = \frac{e^{-m|t-\tau|}}{2m} \tag{1.1}$$

where $m > 0$. There exists a spectral representation of the Ornstein-Uhlenbeck stochastic process [8]

$$x(t, \omega) = \int e^{ikt}\, \zeta(dk, \omega)$$

where $\zeta(dk, \omega)$ is a stochastic measure. We define the fractional derivative (see footnote a) as

$$x^{(\alpha)}(t, \omega) = \int |k|^{\alpha} e^{ikt}\, \zeta(dk, \omega) \tag{1.2}$$

If $0 \le \alpha < 1/2$ then $x^{(\alpha)}(t)$ is a stochastic process. If $\alpha \ge 1/2$ then one needs a regularization described below. We will use distribution notations and write

$$\zeta(dk, \omega) = \tilde{x}(k, \omega)\, dk, \qquad \tilde{x}(k, \omega) = \frac{1}{2\pi} \int_{-\infty}^{\infty} x(t, \omega)\, e^{-ikt}\, dt$$

We want to give a meaning to the following correlation functions

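$$K(t_1, \ldots, t_N) = E\left(x(t_1) \cdots x(t_N)\, e^{-\lambda U}\right) / E\left(e^{-\lambda U}\right) \tag{1.3}$$

for all $N = 1, 2, \ldots$ Here

$$U = \int_{-\infty}^{\infty} :x^{(\alpha)}(\tau)^4:\, g(\tau)\, d\tau \tag{1.4}$$

where $g(\tau)$ is a nonnegative test function with compact support (the volume cut-off), $x^{(\alpha)}(t)$ denotes the fractional derivative (1.2), $\lambda > 0$ and $:x^{(\alpha)}(\tau)^4:$ is the Wick normal product. We will denote the expectation value as $E(A) = \langle A \rangle$. In this notation $\langle x(t)x(\tau) \rangle = \frac{1}{2\pi} \int_{-\infty}^{\infty} \frac{e^{ip(t-\tau)}}{p^2 + m^2}\, dp$.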

For the correlation function (1.3) one has the perturbative expansion

$$\langle x(t_1) \cdots x(t_N)\, e^{-\lambda U} \rangle = \sum_{n=0}^{\infty} \frac{(-\lambda)^n}{n!} \langle x(t_1) \cdots x(t_N)\, U^n \rangle \tag{1.5}$$

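If $\alpha \ge 5/8$ then the expectation value in (1.5) has no meaning because there are ultraviolet divergencies. We have to introduce a cutoff stochastic process $x_\kappa(t)$ [3]

$$x_\kappa(t, \omega) = \int_{-\kappa}^{\kappa} e^{ikt}\, \zeta(dk, \omega)$$

Instead of $U$ in (1.3) we put

$$U_\kappa = \int :x_\kappa^{(\alpha)}(\tau)^4:\, g(\tau)\, d\tau$$

where

$$x_\kappa^{(\alpha)}(t, \omega) = \int_{-\kappa}^{\kappa} |k|^{\alpha} e^{ikt}\, \zeta(dk, \omega)$$

Footnote a: Stochastic differential equations with fractional derivatives [7] are also considered on p-adic number fields.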

The problem is to prove that after the renormalization there exists a limit of the correlation functions

$$\langle x(t_1) \cdots x(t_N)\, e^{-\lambda U_\kappa} \rangle_{ren}$$

as $\kappa \to \infty$ in each order of the perturbation expansion. We will consider this problem below by using the Bogoliubov-Parasiuk R-operation and the standard language of Feynman diagrams.

In the momentum representation we obtain the expression of the form

$$\langle \tilde{x}(p_1) \cdots \tilde{x}(p_N)\, e^{-\lambda U} \rangle = \sum_{\Gamma} G_\Gamma(p_1, \ldots, p_N)$$

Here the sum runs over all Feynman diagrams $\Gamma$ with $N$ external legs that can be built up using 4-vertices corresponding to the $x^4$ term. Contributions from the connected diagrams with $n$ 4-vertices and $L$ internal lines have the form

$$G_\Gamma = \int \prod_{j=1}^{l} dk_j \prod_{i=1}^{L} \frac{|q_i|^{2\alpha}}{q_i^2 + m^2} \tag{1.6}$$

where $l = L - (n - 1)$, and $q_i$ are linear combinations of the internal momenta $k_1, \ldots, k_l$ and external momenta $p_1, \ldots, p_N$.

The canonical degree $D(\Gamma)$ of a proper diagram is defined by the dimension of the corresponding Feynman integral with respect to the integration variables. Using (1.6) we have

$$D = D(\Gamma) = (2\alpha - 2)L + l = (2\alpha - 1)L - n + 1 \tag{1.7}$$

If for a given diagram $D < 0$ then this diagram is superficially finite, otherwise it is divergent. Let us consider a proper diagram with $n$ vertices, $L$ internal lines, and $E$ legs. We have the following relation

$$4n = 2L + E \tag{1.8}$$

Note that for any nontrivial connected diagram

$$2n \ge L \ge n \ge 2 \tag{1.9}$$

$$E \le 2n \tag{1.10}$$
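The power counting above is easy to mechanize. The following sketch computes the canonical degree in the two equivalent forms of (1.7) and recovers the number of internal lines from (1.8); exact rational arithmetic is used so that the borderline case $D = 0$ is not blurred by rounding. The function names are hypothetical.

```python
from fractions import Fraction


def internal_lines(n, legs):
    """Number of internal lines L from the relation 4n = 2L + E of (1.8)."""
    twice_lines = 4 * n - legs
    if twice_lines < 0 or twice_lines % 2:
        raise ValueError("no diagram with this number of vertices and legs")
    return twice_lines // 2


def degree(alpha, n, lines):
    """Canonical degree D = (2 alpha - 1) L - n + 1, second form of (1.7)."""
    return (2 * Fraction(alpha) - 1) * lines - n + 1


def degree_from_loops(alpha, n, lines):
    """Canonical degree D = (2 alpha - 2) L + l with l = L - (n - 1)."""
    loops = lines - (n - 1)
    return (2 * Fraction(alpha) - 2) * lines + loops


alpha = Fraction(3, 4)
n, legs = 2, 4
lines = internal_lines(n, legs)
print(lines, degree(alpha, n, lines), degree_from_loops(alpha, n, lines))
```

For $\alpha = 3/4$ the two-vertex diagram with four legs has $D = 0$, the familiar logarithmic divergence of a quartic vertex correction.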

Theorem. If $\alpha < 5/8$ then all Feynman diagrams of the interacting stochastic process are superficially finite. If $5/8 \le \alpha < 3/4$ then there exists a finite number of divergent diagrams; moreover all divergent diagrams have only 0 or 2 legs. If $\alpha = 3/4$ then the model is renormalizable and all divergent diagrams have only 0, 2 or 4 external lines. Finally, if $\alpha > 3/4$ then the model is nonrenormalizable.

Proof. Let us prove the first statement of the theorem, i.e. if $\alpha < 5/8$ then $D < 0$ for any $n \ge 2$. Using (1.7) and (1.9) we have

$$D\big|_{\alpha < 5/8} < \left(2 \cdot \frac{5}{8} - 1\right) L - n + 1 = \frac{L - 4n + 4}{4} \le \frac{2n - 4n + 4}{4} = \frac{2 - n}{2} \le 0 \tag{1.11}$$

From (1.11) it follows that $D < 0$ for any $\alpha < 5/8$. Let us consider $\alpha = 5/8$. Similarly to (1.11), from (1.7) we have

$$D\big|_{\alpha = 5/8} = \frac{L - 4n + 4}{4} \le \frac{2 - n}{2} \le 0 \tag{1.12}$$

Therefore only a diagram with two vertices ($n = 2$) could be divergent (in this case $D = 0$). Using (1.8) to rewrite (1.12) in the form

$$D\big|_{\alpha = 5/8} = \frac{4 - (E + L)}{4} \tag{1.13}$$

we see that only the diagram with $E = 0$, $L = 4$, $n = 2$ is divergent. In the case when $5/8 < \alpha < 3/4$ we can write

$$\alpha = \frac{3}{4} - \epsilon \tag{1.14}$$

where $0 < \epsilon < 1/8$. Substituting (1.14) into (1.7) and using (1.9) we have

$$D\big|_{\alpha = 3/4 - \epsilon} = \frac{L}{2} - 2L\epsilon - n + 1 \le \frac{2n}{2} - 2n\epsilon - n + 1 = 1 - 2n\epsilon \tag{1.15}$$

Thus for any given $\epsilon > 0$ (and therefore any $\alpha < 3/4$) there exists a number $N$ such that for any $n > N$ the canonical dimension $D < 0$. Hence there exists only a finite number of divergent diagrams. Rewriting (1.15) with the help of (1.8) in the form

$$D\big|_{\alpha = 3/4 - \epsilon} = -2L\epsilon + \frac{4 - E}{4}$$

we see that $D \ge 0$ only if $E < 4$, i.e. $E = 0$ or $E = 2$, and the model is super-renormalizable.

Let us consider the case when $\alpha = 3/4$. Using (1.8) and (1.7) we have

$$D\big|_{\alpha = 3/4} = 1 - \frac{E}{4} \tag{1.16}$$

The equality (1.16) means that all divergent diagrams have only 0, 2, or 4 legs and the model is renormalizable.

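Finally, if $\alpha > 3/4$ we write $\alpha = 3/4 + \epsilon$ with $\epsilon > 0$, and from (1.7) and (1.8) we have

$$D\big|_{\alpha = 3/4 + \epsilon} = \frac{L}{2} + 2L\epsilon - n + 1 = 2L\epsilon + \frac{4 - E}{4} \tag{1.17}$$

Therefore if $\alpha > 3/4$ then for any number of legs $E$ the degree $D$ becomes positive once $L$ is large enough, so divergent proper diagrams occur with arbitrarily many legs and the model is nonrenormalizable. • Examples of applications of this theorem can be found in [9].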
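The four cases of the theorem can be checked by brute force over diagram types. The sketch below enumerates all triples $(n, L, E)$ allowed by (1.8) and (1.9) up to a chosen number of vertices and lists those with $D \ge 0$; it counts diagram types by these three numbers rather than individual diagrams, and the function name and sample values of $\alpha$ are illustrative.

```python
from fractions import Fraction


def divergent_types(alpha, max_vertices):
    """Diagram types (n, L, E, D) with D >= 0, using (1.7), (1.8) and (1.9)."""
    alpha = Fraction(alpha)
    found = []
    for n in range(2, max_vertices + 1):
        for lines in range(n, 2 * n + 1):   # n <= L <= 2n, Eq. (1.9)
            legs = 4 * n - 2 * lines        # 4n = 2L + E, Eq. (1.8)
            d = (2 * alpha - 1) * lines - n + 1
            if d >= 0:
                found.append((n, lines, legs, d))
    return found


for alpha in (Fraction(1, 2), Fraction(5, 8), Fraction(7, 10), Fraction(3, 4), Fraction(4, 5)):
    types = divergent_types(alpha, 12)
    legs_seen = sorted({legs for _, _, legs, _ in types})
    print(alpha, len(types), legs_seen)
```

Below $5/8$ the list is empty, at $5/8$ it contains only the vacuum type $(n, L, E) = (2, 4, 0)$, between $5/8$ and $3/4$ it stops at a finite number of vertices, at $3/4$ it continues to all orders with at most four legs, and above $3/4$ types with more than four legs appear.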

2 Acknowledgments

This investigation was supported by the grant of Swedish Royal Academy of Sciences on the collaboration with states of the former Soviet Union and the Profile Mathematical Modeling of Vaxjo University. I would like to thank A. Khrennikov for the warm hospitality and fruitful discussions.

References

1. N.N. Bogoliubov and D.V. Shirkov, Introduction to the theory of quantum fields, Nauka, Moscow, 1973

2. T. Hida, Brownian Motion, Springer-Verlag, 1980

3. J. Glimm and A. Jaffe, Quantum Physics. A Functional Integral Point of View, Springer-Verlag, 1987

4. T. Hida, H.-H. Kuo, J. Potthoff and L. Streit, White noise: An Infinite Dimensional Calculus, Kluwer Academic, 1993

5. J. Kogut, K. Wilson, Phys. Reports, 12C, p. 75, 1974

6. A.Z. Patashinski and V.L. Pokrovski, The fluctuational theory of phase transitions, Nauka, Moscow, 1975

7. V.S. Vladimirov, Generalized functions over the field of p-adic numbers, Russian Math. Surveys 43:5 (1988)

8. I.I. Gihman and A.V. Skorohod, Introduction to Theory of Random Processes, Nauka, Moscow, 1977

9. Ya.I. Volovich, Interacting stochastic process and renormalization theory, quant-ph/0008063

ISBN 981-02-4846-6
