Biologically plausible learning rules for neural networks and quantum computing



*Corresponding author. Tel.: 001-412-268-2670; fax: 001-412-268-3757. E-mail address: bmlv@andrew.cmu.edu (B. Morel).

Neurocomputing 32–33 (2000) 921–926


Benoit Morel*

Department of Engineering and Public Policy, Carnegie Mellon University, Baker Hall 129, 5000 Forbes Avenue, Pittsburgh, PA 15213-3890, USA

Accepted 13 January 2000

Abstract

Hebb's rule is assumed to be closely associated with biological learning. It has not so far been a source of powerful learning algorithms for artificial neural networks. We point to the fact that Hebb's rule, implemented in a quantum algorithm, leads to a much faster converging learning algorithm. The origin of the difference is "quantum entanglement". Quantum entanglement may have a neuronal equivalent, which may be the reason why biological learning uses rules which are difficult to implement on a computer. © 2000 Elsevier Science B.V. All rights reserved.

Keywords: Quantum computing; Quantum entanglement; Hebbian learning; Neural networks

0. Introduction

Artificial neural networks (ANNs) are extreme simplifications of the architecture of the brain. They are also powerful processors. They have been behind important advances in processing in general, and parallel processing in particular. The fact that ANNs are simplified reproductions of the brain's parallel architecture may have something to do with their performance as fast and powerful processors, despite their simplicity. If biological evolution acts as an optimizer, one is led to assume that the parallel architecture of the brain expresses a form of optimality.

ANNs are not only used as processors. They are also a powerful instrument to study the brain. There seems to be a natural relation between brain functions and ANNs, which have helped provide important insights into our understanding of many cognitive processes. But learning, an important cognitive process, is a notable exception.

A variety of learning algorithms have been developed for ANNs. A well-known example is back-propagation, which has proven to be a very efficient learning algorithm. But learning rules like back-propagation are not "biologically plausible" [1]. It is generally recognized that synaptic plasticity in the form of Hebb's rule plays a central role in biological learning. Attempts to develop learning algorithms for ANNs based on Hebb's rule have failed to deliver algorithms as efficient as back-propagation, or which converge as fast as what is observed in vivo [7].

If one takes seriously the kind of insights that ANNs provide about cognition, finding biologically plausible learning rules for ANNs is obviously important. One major reason for using ANNs to study cognitive processes is the fact that they seem to naturally capture important features of brain activity. But as long as learning algorithms based on Hebb's rule fail to work as efficiently in ANNs as they do in the brain, ANNs lack something fundamental: something which has to do with what may be the most important of the activities of the brain, learning.

The same logic which makes one presume that the parallel architecture of the brain corresponds to some form of optimization leads one to infer that biological learning rules, and their implementation in the brain, may also represent a kind of optimization. Our understanding of the extent to which Hebb's rule applies is limited. As long as the (im)possibility of a powerful learning algorithm based solely on Hebb's rule is not demonstrated, one is not in a position to infer that the brain learns by association only. This has implications for the debate about the limits of the connectionist model of the brain [8].

In this short paper, we do not answer those questions. We demonstrate instead what we hope will be perceived as a suggestive result, with potentially serious implications. We observe that a Hebbian learning algorithm framed as a quantum algorithm converges much faster than it does as a non-quantum, or classical, algorithm. Seen from the perspective of quantum computing, this is not a surprising result [2]. Seen from the perspective of neuroscience, this may seem irrelevant, as to the best of our present knowledge the brain is not a quantum system. But the brain does not need to be a quantum system for this result to be relevant.

Quantum computing owes its efficiency to "quantum entanglement". A neuronal equivalent of quantum entanglement may very well exist and be at work in the brain. Quantum entanglement has non-quantum equivalents. In the case of the nervous system, this could mean that the processing of a nervous signal involves a coherent interactive propagation among many neurons in parallel. In itself, this idea is not really revolutionary.

What the detour through quantum computing accomplishes is to provide a plausible argument to explain why Hebb's rule may be so efficient as a basis for learning. It also suggests that entanglement plays an important role in the brain's information processing.


1. Hebbian rule-based learning rules for quantum learning

Taking an ANN approach, we model a processor as a mapping between an $n$-vector input $|s\rangle_{j=1,\dots,n}$ and an $m$-vector output $|q\rangle_{i=1,\dots,m}$:

$$ |q_i\rangle = \sum_{j=1}^{n} w_i^j \, |s_j\rangle. \qquad (1) $$

The $m \times n$ matrix $\{w_i^j\}$ is built from the weights. Learning rules are algorithms for updating the elements of the matrix $\{w_i^j\}$ until the input vector $|s\rangle_{j=1,\dots,n}$ is mapped onto the desired output vector $|p\rangle_{i=1,\dots,m}$. Hebbian-based learning rules have the general form

$$ \delta w_i^j \approx r(\|p - q\|)\,(p_i - q_i)\, s_j, \qquad (2) $$

where $r(\|p - q\|)$ is a reward function which depends on the distance, as vectors, between the present output $q$ and the desired output $p$. Every weight-update algorithm proposed so far based on that kind of rule has turned out to be at best slowly converging. It is safe to state that Hebb's rule has not so far been a natural source of powerful learning algorithms for ANNs.
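To make Eq. (2) concrete, here is a minimal numerical sketch of this kind of Hebbian update. The specific reward function r and the learning rate are illustrative assumptions of ours; the paper only requires r to depend on the output error:

```python
# Minimal sketch of the Hebbian-style update of Eq. (2).
# The reward function r and the learning rate eta are illustrative
# assumptions; the paper leaves both unspecified.
import numpy as np

rng = np.random.default_rng(0)

n, m = 8, 4                      # input and output dimensions
w = rng.normal(size=(m, n))      # weight matrix {w_i^j}
s = rng.normal(size=n)           # input vector |s>
p = rng.normal(size=m)           # desired output vector |p>
eta = 0.05                       # learning rate (illustrative)

def r(dist):
    """Illustrative reward function r(||p - q||)."""
    return np.exp(-dist)

for step in range(100_000):
    q = w @ s                    # present output, Eq. (1)
    dist = float(np.linalg.norm(p - q))
    if dist < 1e-6:
        break
    # Eq. (2): delta w_i^j ~ r(||p - q||) (p_i - q_i) s_j
    w += eta * r(dist) * np.outer(p - q, s)

print(f"converged after {step} steps, ||p - q|| = {dist:.2e}")
```

Consistent with the remark above, updates of this type do converge, but slowly: the reward-modulated correction shrinks the output error by only a small factor per step.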

1.1. Quantum approach

Assume that $|s\rangle_{j=1,\dots,n}$, $|p\rangle_{i=1,\dots,m}$ and $|q\rangle_{i=1,\dots,m}$ are quantum states, i.e. vectors in Hilbert spaces. The problem is to find the set of weights $\{\tilde w_i^j\}$ such that $|p_i\rangle = \sum_{j=1}^{n} \tilde w_i^j\,|s_j\rangle$. The set of weights $\{\tilde w_i^j\}$ becomes a quantum operator.

The learning algorithm in that case is a quantum process which makes the original output state $|q\rangle_{i=1,\dots,m}$ evolve into the desired output state $|p\rangle_{i=1,\dots,m}$. Quantum processes correspond to the effect of unitary operators derived from a Hamiltonian. The unitarity of the operator guarantees that there is conservation of probability. As Hamiltonian, we take

$$ H = H_p + H_q = E\left\{ \sum_{j=1}^{m} |p_j\rangle\langle p_j| + \sum_{j=1}^{m} |q_j\rangle\langle q_j| \right\}. \qquad (3) $$

By definition, $|p_i\rangle = (U_{H_p+H_q})_i^k\,|q_k\rangle$ (summation over $k$ implied), where $U_{H_p+H_q}$ is the unitary operator of the quantum evolution associated with the Hamiltonian. $E$ is a constant, for the time being arbitrary. It is the eigenvalue of the Hamiltonian, i.e. it is in general the energy of the system. The quantum process can also be interpreted as an operation which sends the weight matrix $\{w_i^j\}$ onto $\{\tilde w_i^j\}$:

$$ \tilde w_i^j = \sum_{k=1}^{m} (U_{H_p+H_q})_i^k\, w_k^j. \qquad (4) $$
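As an illustration, the following sketch (our addition) builds the Hamiltonian of Eq. (3), read as the projectors onto $|p\rangle$ and $|q\rangle$ as in Farhi and Gutmann [4], and applies the resulting unitary both to the output state and to the weight matrix, using the outer-product form of the weights introduced in the next paragraph. Dimensions, the energy $E$ and the evolution time are arbitrary choices:

```python
# Sketch of Eqs. (3)-(4): Hamiltonian as projectors onto |p> and |q>,
# unitary evolution U = exp(-iHt), and the induced map w -> w~.
# Dimensions, the energy E and the time t are illustrative assumptions.
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(1)
n, m = 3, 4                                    # input / output dimensions

def unit(v):
    return v / np.linalg.norm(v)

p = unit(rng.normal(size=m))                   # desired output state |p>
q = unit(rng.normal(size=m))                   # present output state |q>
s = unit(rng.normal(size=n))                   # input state          |s>

E, t = 1.0, 0.3                                # arbitrary energy and time
H = E * (np.outer(p, p) + np.outer(q, q))      # Eq. (3)
U = expm(-1j * H * t)                          # unitary evolution operator

w = np.outer(q, s)                             # w_i^j = |q_i><s^j|
w_tilde = U @ w                                # Eq. (4): w~ = U w

print("overlap <p|q> before:", p @ q)
print("overlap <p|Uq> after:", p @ (U @ q))
```

Because $U$ acts only on the output index, updating the weights and evolving the output state are the same operation, which is the content of Eq. (4).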

The learning rules are buried in Eq. (4). In order to make them more explicit, we assume that the components of the vectors $|p\rangle_{i=1,\dots,m}$ and $|q\rangle_{i=1,\dots,m}$ are expressed in an orthonormal basis and that $\langle p|q\rangle = x$ ($x$ can be a complex number, but we can, without loss of generality, assume that it is real). The effect of the quantum propagation is to align $|p\rangle$ and $|q\rangle$, i.e. to have $x \to 1$. The weight matrix in these notations can be written as $w_i^j = |q_i\rangle\langle s^j|$ and $\tilde w_i^j = |p_i\rangle\langle s^j|$.

The infinitesimal version of $\tilde w_i^j = \sum_{k=1}^{m} (U_{H_p+H_q})_i^k\, w_k^j$ has the following form (we used $\langle q|q\rangle = 1$):

$$ \delta w_i^j = -i \sum_k (H_p + H_q)_i^k\, |q_k\rangle\langle s^j| = -iE\,\bigl(|p_i\rangle\langle p|q\rangle + |q_i\rangle\bigr)\langle s^j|. \qquad (5) $$

Using $\langle p|q\rangle = x$, this takes a form in which one can recognize a realization of Hebb's rule:

$$ \delta w_i^j = -iE\,\bigl(|p_i\rangle\, x + |q_i\rangle\bigr)\langle s^j|. \qquad (6) $$

This quantum system is basically identical to the quantum problem studied by Farhi and Gutmann [4] to analyze why the search algorithm of Grover [6] converges so fast. It is straightforward to show (details can be found in the paper of Farhi and Gutmann [4]) that if $x$ is the initial value of $\langle p|q\rangle$, the amplitude evolves (up to an overall phase) as $\langle p|q\rangle = x\cos(Ext) - i\sin(Ext)$, which implies that the probability of the state is $|\langle p|q\rangle|^2 = \sin^2(Ext) + x^2\cos^2(Ext)$. The expected time at which $|\langle p|q\rangle|^2 = 1$, i.e. at which the vectors $|p\rangle$ and $|q\rangle$ are aligned, is therefore $t = \pi/(2Ex)$.
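For completeness, here is a short verification of this amplitude (our addition, following Farhi and Gutmann [4]): on the span of $|p\rangle$ and $|q\rangle$, the Hamiltonian of Eq. (3) diagonalizes explicitly:

```latex
% Eigen-decomposition of H = E(|p><p| + |q><q|) on span{|p>,|q>}, x = <p|q> real:
H\,(|p\rangle \pm |q\rangle) = E(1 \pm x)\,(|p\rangle \pm |q\rangle).
% Expanding |q> in the normalized eigenvectors
% |\pm> = (|p> \pm |q>)/sqrt(2(1 \pm x)) gives
\langle p|\,e^{-iHt}\,|q\rangle
  = \tfrac{1+x}{2}\,e^{-iE(1+x)t} - \tfrac{1-x}{2}\,e^{-iE(1-x)t}
  = e^{-iEt}\,\bigl(x\cos(Ext) - i\sin(Ext)\bigr),
% so, up to the irrelevant global phase e^{-iEt}, this is the amplitude quoted
% above, and |<p|q(t)>|^2 = sin^2(Ext) + x^2 cos^2(Ext) first reaches 1
% at t = pi/(2Ex).
```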

This has been shown to correspond to a much shorter convergence time than that of the corresponding classical (i.e. non-quantum) algorithm.
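A quick numerical check (our addition; the dimension and energy are illustrative) confirms the alignment time: evolving $|q\rangle$ under $U(t) = e^{-iHt}$ and evaluating the overlap at $t = \pi/(2Ex)$ gives probability 1 up to machine precision:

```python
# Numerical check of the convergence time t* = pi/(2Ex): evolve |q>
# under U(t) = exp(-iHt) and verify |<p|q(t*)>|^2 ~= 1.
# The dimension m and the energy E are illustrative assumptions.
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(2)
m = 16

def unit(v):
    return v / np.linalg.norm(v)

p = unit(rng.normal(size=m))
q = unit(rng.normal(size=m))
x = float(p @ q)
if x < 0:                        # choose the sign of |q> so that x > 0
    q, x = -q, -x

E = 1.0
H = E * (np.outer(p, p) + np.outer(q, q))    # Eq. (3), as projectors
t_star = np.pi / (2 * E * x)

amp = p @ (expm(-1j * H * t_star) @ q)       # <p| U(t*) |q>
print(f"x = {x:.3f},  |<p|q(t*)>|^2 = {abs(amp)**2:.6f}")   # -> 1.000000
```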

2. Implications

In the previous section, we showed that a quantum algorithm already known to converge faster than its non-quantum equivalent can be construed as a realization of a Hebb's-rule-based learning algorithm (cf. Eq. (6)). The implications of that result for ANNs are limited. But its implications for neuroscience are potentially far more important.

One of the last of the many contributions of Feynman [5] is the observation that computers are not ideally suited to numerically solving quantum mechanical problems. Quantum physics differs from classical physics in some subtle but fundamental ways, which complicate its implementation on computers. Using computers to solve quantum physics problems consumes a lot of processing power.

On the other hand, it was discovered that framing some NP-complete problems as quantum problems led to much faster algorithms to solve them. But in order to take advantage of that observation, quantum computers are needed, i.e. computers dealing with quantum states (qubits of information) instead of classical computers manipulating bits of information. Quantum computers may exist in the future [3].

If existing computers are awkward instruments for quantum physics, it is doubtful that quantum algorithms will ever turn out to be more efficient than existing proven techniques like back-propagation as a basis for learning rules for ANNs.


On the other hand, if the laws of quantum physics provide a naturally more efficient way to process information, one may wonder whether this does not indicate that there may be alternative ways of processing information, found by the brain and still unknown to us. What is behind the efficiency of quantum computing may provide a clue.

Quantum entanglement is where the most important difference with classical physics lies, and where the efficiency of the algorithm comes from. A quantum process can be represented as the evolution of a vector in a Hilbert space. The evolution of the different components of the vector is influenced by the state of the other components. Their evolution is entangled. But we cannot access the information which is exchanged between the components. A measurement amounts to looking at one component. All the information which was being exchanged collapses. We only have indirect evidence of it, by studying what the observable depends on.

A quantum learning algorithm for ANNs would correspond to an algorithm where the changes in the weights would be "entangled". In the same way, neuronal entanglement would mean that what takes place in one neuron cannot be meaningfully isolated from what takes place in the neurons interacting with it.

That there may be some form of neuronal entanglement in brain activity is in a sense a common assumption. That it may be part of the explanation of the performance of the brain as an information processor is not necessarily a very revolutionary suggestion. But the implication of those suggestions is that monitoring what takes place in one neuron, in order to understand how a specific piece of information propagates in and is processed by the brain, has serious limitations. The fact that it is difficult, if not impossible, to look at many neurons at the same time does not change that other fact. As long as neuronal entanglement is not understood much better, we are very limited in our ability to make sense of the processing power of the brain.

Even if ANNs may not benefit from quantum-based learning algorithms, it may still be useful to develop such algorithms. They could help study the relation between information entanglement and Hebbian learning.

References

[1] F.H.C. Crick, The recent excitement about neural networks, Nature 337 (1989) 129–132.
[2] A. Ekert, R. Jozsa, Quantum computation and Shor's factoring algorithm, Rev. Modern Phys. 68 (1996) 733–753.
[3] A. Ekert, R. Jozsa, R. Penrose (Eds.), Quantum Computation: Theory and Experiment (special issue), Philos. Trans. Roy. Soc. London Ser. A: Mathematical, Physical and Engineering Sciences, Vol. 356, 1998, pp. 1173–1948.
[4] E. Farhi, S. Gutmann, Analog analogue of a digital quantum computation, Phys. Rev. A 57 (1998) 2403–2407.
[5] R.P. Feynman, Quantum mechanical computers, Found. Phys. 16 (1986) 507–531.
[6] L. Grover, Quantum mechanics helps in searching for a needle in a haystack, Phys. Rev. Lett. 79 (1997) 325.
[7] P. Mazzoni, R.A. Andersen, M.I. Jordan, A more biologically plausible learning rule for neural networks, Proc. Natl. Acad. Sci. USA 88 (1991) 4433–4437.
[8] S. Pinker, How the Mind Works, Norton, New York, 1997.


Benoit Morel received a Ph.D. in theoretical physics from the University of Geneva. His post-doctoral career in physics took him to Harvard, CERN and Caltech. He then moved to Stanford to start working on issues related to international security. He joined the faculty of Carnegie Mellon in 1987. After a few years at CMU, he began to develop an active interest in mathematical models of biological processes.
