The neural basis of predicate-argument...

22
1. Introduction This article argues for the following thesis: Neural evidence exists for predicate-argument structure as the core of phy- logenetically and ontogenetically primitive (prelinguistic) mental representations. The structures of modern natural languages can be mapped onto these primitive representa- tions. The idea that language is built onto preexisting repre- sentations is common enough, being found in various forms in works such as Bickerton (1998), Kirby (1999; 2000), Hur- ford (2000b), and Bennett (1976). Conjunctions of ele- mentary propositions of the form PREDICATE(x) have been used by Batali as representations of conceptual struc- ture preexisting language in his impressive computer sim- ulations of the emergence of syntactic structure in a popu- lation of interacting agents (Batali 2002). Justifying such preexisting representations in terms of neural structure and processes is relatively new. This paper starts from a very simple component of the Fregean logical scheme, PREDICATE(x), and proposes a neural interpretation for it. This is, to my knowledge, the first proposal of a “wormhole” between the hitherto mutu- ally isolated universes of formal logic and empirical neuro- science. The fact that it is possible to show a correlation be- tween neural processes and logicians’ conclusions about logical form is a step in the unification of science. The dis- coveries in neuroscience confirm that the logicians have been on the right track, that the two disciplines have some- thing to say to each other despite their radically different methods, and that further unification may be sought. As the brain has a complexity far in excess of any representation scheme dreamt up by a logician, it is to be expected that the basic PREDICATE(x) formalism is to some extent an ideal- ization of what actually happens in the brain. But, conced- ing that the neural facts are messier than could be captured with absolute fidelity by any formula as simple as PREDI- CATE(x), I hope to show that the central ideas embodied in the logical formula map satisfyingly neatly onto certain specific neural processes. The claim that some feature of language structure maps onto a feature of primitive mental representations needs (1) a plausible bridge between such representation and the structure of language, and (2) a characterization of BEHAVIORAL AND BRAIN SCIENCES (2003) 26, 261–316 Printed in the United States of America © 2003 Cambridge University Press 0140-525X/03 $12.50 261 The neural basis of predicate-argument structure James R. Hurford Language Evolution and Computation Research Unit, School of Philosophy, Psychology, and Language Sciences, University of Edinburgh, Edinburgh EH8 9LL, Scotland, United Kingdom. [email protected] Abstract: Neural correlates exist for a basic component of logical formulae, PREDICATE(x). Vision and audition research in primates and humans shows two independent neural pathways; one locates objects in body-centered space, the other attributes properties, such as colour, to objects. In vision these are the dorsal and ventral pathways. In audition, similarly separable “where” and “what” pathways exist. PREDICATE(x) is a schematic representation of the brain’s integration of the two processes of delivery by the senses of the loca- tion of an arbitrary referent object, mapped in parietal cortex, and analysis of the properties of the referent by perceptual subsystems. The brain computes actions using a few “deictic” variables pointing to objects. Parallels exist between such nonlinguistic variables and linguistic deictic devices. Indexicality and reference have linguistic and nonlinguistic (e.g., visual) versions, sharing the concept of at- tention. The individual variables of logical formulae are interpreted as corresponding to these mental variables. In computing action, the deictic variables are linked with “semantic” information about the objects, corresponding to logical predicates. Mental scene descriptions are necessary for practical tasks of primates, and preexist language phylogenetically. The type of scene de- scriptions used by nonhuman primates would be reused for more complex cognitive, ultimately linguistic, purposes. The provision by the brain’s sensory/perceptual systems of about four variables for temporary assignment to objects, and the separate processes of per- ceptual categorization of the objects so identified, constitute a pre-adaptive platform on which an early system for the linguistic de- scription of scenes developed. Keywords: argument; attention; deictic; dorsal; logic; neural; object; predicate; reference; ventral James R. Hurford is Professor of General Linguistics at the University of Edinburgh. He has a broad interest in reconciling various traditions in linguistics which have tended to conflict. In particular, he has worked on articulating a framework in which formal representation of grammars in individual minds interacts with proper- ties of language as used in communities. The framework emphasizes the interaction of evolution, learning and communication. He is perhaps best known for his com- puter simulations of various aspects of the evolution of language.

Transcript of The neural basis of predicate-argument...

Page 1: The neural basis of predicate-argument structureinst.eecs.berkeley.edu/~cs182/sp07/readings/Hurford.pdfLogical predicates are expressed in natural language by words of various parts

1. Introduction

This article argues for the following thesis: Neural evidenceexists for predicate-argument structure as the core of phy-logenetically and ontogenetically primitive (prelinguistic)mental representations. The structures of modern naturallanguages can be mapped onto these primitive representa-tions.

The idea that language is built onto preexisting repre-sentations is common enough, being found in various formsin works such as Bickerton (1998), Kirby (1999; 2000), Hur-ford (2000b), and Bennett (1976). Conjunctions of ele-mentary propositions of the form PREDICATE(x) havebeen used by Batali as representations of conceptual struc-ture preexisting language in his impressive computer sim-ulations of the emergence of syntactic structure in a popu-lation of interacting agents (Batali 2002). Justifying suchpreexisting representations in terms of neural structure andprocesses is relatively new.

This paper starts from a very simple component of theFregean logical scheme, PREDICATE(x), and proposes aneural interpretation for it. This is, to my knowledge, thefirst proposal of a “wormhole” between the hitherto mutu-ally isolated universes of formal logic and empirical neuro-science. The fact that it is possible to show a correlation be-tween neural processes and logicians’ conclusions aboutlogical form is a step in the unification of science. The dis-coveries in neuroscience confirm that the logicians havebeen on the right track, that the two disciplines have some-

thing to say to each other despite their radically differentmethods, and that further unification may be sought. As thebrain has a complexity far in excess of any representationscheme dreamt up by a logician, it is to be expected that thebasic PREDICATE(x) formalism is to some extent an ideal-ization of what actually happens in the brain. But, conced-ing that the neural facts are messier than could be capturedwith absolute fidelity by any formula as simple as PREDI-CATE(x), I hope to show that the central ideas embodiedin the logical formula map satisfyingly neatly onto certainspecific neural processes.

The claim that some feature of language structure mapsonto a feature of primitive mental representations needs(1) a plausible bridge between such representation andthe structure of language, and (2) a characterization of

BEHAVIORAL AND BRAIN SCIENCES (2003) 26, 261–316Printed in the United States of America

© 2003 Cambridge University Press 0140-525X/03 $12.50 261

The neural basis ofpredicate-argument structure

James R. HurfordLanguage Evolution and Computation Research Unit, School of Philosophy,Psychology, and Language Sciences, University of Edinburgh, EdinburghEH8 9LL, Scotland, United [email protected]

Abstract: Neural correlates exist for a basic component of logical formulae, PREDICATE(x). Vision and audition research in primatesand humans shows two independent neural pathways; one locates objects in body-centered space, the other attributes properties, suchas colour, to objects. In vision these are the dorsal and ventral pathways. In audition, similarly separable “where” and “what” pathwaysexist. PREDICATE(x) is a schematic representation of the brain’s integration of the two processes of delivery by the senses of the loca-tion of an arbitrary referent object, mapped in parietal cortex, and analysis of the properties of the referent by perceptual subsystems.

The brain computes actions using a few “deictic” variables pointing to objects. Parallels exist between such nonlinguistic variables andlinguistic deictic devices. Indexicality and reference have linguistic and nonlinguistic (e.g., visual) versions, sharing the concept of at-tention. The individual variables of logical formulae are interpreted as corresponding to these mental variables. In computing action, thedeictic variables are linked with “semantic” information about the objects, corresponding to logical predicates.

Mental scene descriptions are necessary for practical tasks of primates, and preexist language phylogenetically. The type of scene de-scriptions used by nonhuman primates would be reused for more complex cognitive, ultimately linguistic, purposes. The provision bythe brain’s sensory/perceptual systems of about four variables for temporary assignment to objects, and the separate processes of per-ceptual categorization of the objects so identified, constitute a pre-adaptive platform on which an early system for the linguistic de-scription of scenes developed.

Keywords: argument; attention; deictic; dorsal; logic; neural; object; predicate; reference; ventral

James R. Hurford is Professor of General Linguisticsat the University of Edinburgh. He has a broad interestin reconciling various traditions in linguistics whichhave tended to conflict. In particular, he has worked onarticulating a framework in which formal representationof grammars in individual minds interacts with proper-ties of language as used in communities. The frameworkemphasizes the interaction of evolution, learning andcommunication. He is perhaps best known for his com-puter simulations of various aspects of the evolution oflanguage.

Page 2: The neural basis of predicate-argument structureinst.eecs.berkeley.edu/~cs182/sp07/readings/Hurford.pdfLogical predicates are expressed in natural language by words of various parts

“primitive mental representation” independent of lan-guage itself, to avoid circularity. The means of satisfyingthe first, the “bridge to language” condition, will be dis-cussed in the next subsection. Fulfilling the second con-dition, the bridge to brain structure and processing, by establishing the language-independent validity of PRED-ICATE(x) as representing fundamental mental processesin both humans and nonhuman primates, will occupy themeat of this article (sects. 2 and 3). The article is originalonly in bringing together the fruits of others’ labours.Neuroscientists andpsychologists will be familiar withmuch of the empirical research cited here, but I hope theywill be interested in my claims for its wider significance.Linguists, philosophers, and logicians might be excited todiscover a new light cast on their subjects by recent neu-rological research.

1.1. The bridge from logic to language

The relationship between language and thought is, ofcourse, a vast topic, and there is only space here to sketchmy premises about this relationship.

Descriptions of the structure of languages are couchedin symbolic terms. Although it is certain that a human’sknowledge of his/her language is implemented in neu-rons, and at an even more basic level of analysis, in atoms,symbolic representations are clearly well suited for thestudy of language structure. Neuroscientists don’t needlogical formulae to represent the structures and processesthat they find. Ordinary language, supplemented by dia-grams, mathematical formulae, and neologized technicalnouns, verbs, and adjectives, is adequate for the expres-sion of neuroscientists’ amazingly impressive discoveries.Where exotic technical notations are invented, it is forcompactness and convenience, and their empirical con-tent can always be translated into more cumbersome or-dinary language (with the technical nouns, adjectives,etc.).

Logical notations, on the other hand, were developed byscholars theorizing in the neurological dark about the struc-ture of language and thought. Languages are systems forthe expression of thought. The sounds and written charac-ters, and even the syntax and phonology of languages canalso be described in concrete ordinary language, aug-mented with diagrams and technical vocabulary. Here too,invented exotic notations are for compactness and conve-nience; which syntax lecturer has not paraphrased S r NPVP into ordinary English for the benefit of a first-year class?But the other end of the language problem, the domain ofthoughts or meanings, has remained elusive to non-tauto-logical ordinary language description. Of course, it is possi-ble to use ordinary language to express thoughts – we do itall the time. But to say that “Snow is white” describes thethought expressed by “Snow is white” is either simplywrong (because description of a thought process and ex-pression of a thought are not equivalent) or at best unin-formative. To arrive at an informative characterization ofthe relation between thought and language (assuming therelation to be other than identity), you need some charac-terization of thought which does not merely mirror lan-guage. So logicians have developed special notations for de-scribing thought (not that they have always admitted orbeen aware that that is what they were doing). But, up tothe present, the only route that one could trace from the

logical notations to any empirically given facts was backthrough the ordinary language expressions which moti-vated them in the first place. A neuroscientist can show you(using suitable instruments which you implicitly trust) thesynapses, spikes, and neural pathways that he investigates.But the logician cannot illuminatingly bring to your atten-tion the logical form of a particular natural sentence, with-out using the sentence itself, or a paraphrase of it, as an in-strument in his demonstration. The mental adjustment thata beginning student of logic is forced to make, in trainingherself to have the “logician’s mindset,” is absolutely differ-ent in kind from the mental adjustment that a beginningstudent of a typical empirical science has to make. Onemight, prematurely, conclude that logic and the empiricalsciences occupy different universes, and that no wormholeconnects them.

Despite its apparently unempirical character, logical for-malism is not mere arbitrary stipulation, as some physicalscientists may be tempted to believe. One logical notationcan be more explanatorily powerful than another, as Frege’sadvances show. Frege’s introduction of quantifiers bindingindividual variables which could be used in argumentplaces was a great leap forward from the straitjacket of sub-ject-predicate structure originally proposed by Aristotleand not revised for over two millennia. Frege’s new nota-tion (but not its strictly graphological form, which was aw-fully cumbersome) allowed one to explain thoughts and in-ferences involving a far greater range of natural sentences.Logical representations, systematically mapped to the cor-responding sentences of natural languages, clarify enor-mously the system underlying much human reasoning,which, without the translation to logical notation, would ap-pear utterly chaotic and baffling.

It is necessary to note a common divergence of usage,between philosophers and linguists, in the term “subject.”For some philosophers (e.g., Strawson 1959; 1974), apredicate in a simple proposition, as expressed by Johnloves Mary, for example, can have more than one “sub-ject;” in the example given, the predicate corresponds toloves and its “subjects” to John and Mary. In this usage, theterm “subject” is equivalent to “argument.” Linguists, onthe other hand, distinguish between grammatical subjectsand grammatical objects, and further between direct andindirect objects. Thus, in Russia sold Alaska to America,the last two nouns are not subjects, but direct and indirectobject respectively. The traditional grammatical division ofa sentence into Subject 1 Predicate is especially prob-lematic where the “Predicate” contains several NPs, se-mantically interpreted as arguments of the predicate ex-pressed by the verb. Which argument of a predicate, if any,is privileged to be expressed as the grammatical subject ofa sentence (thus in English typically occurring before theverb, and determining number and person agreement inthe verb) is not relevant to the truth-conditional analysisof the sentence. Thus, a variety of sentences such as Alaskawas sold to America by Russia and It was America that wassold Alaska by Russia all describe the same state of affairsas the earlier example. The difference between the sen-tences is a matter of rhetoric, or appropriate presentationof information in various contextual circumstances, in-volving what may have been salient in the mind of thehearer or reader before encountering the sentence, or howthe speaker or writer wishes to direct the subsequent dis-course.

Hurford: The neural basis of predicate-argument structure

262 BEHAVIORAL AND BRAIN SCIENCES (2003) 26:3

Page 3: The neural basis of predicate-argument structureinst.eecs.berkeley.edu/~cs182/sp07/readings/Hurford.pdfLogical predicates are expressed in natural language by words of various parts

Logical predicates are expressed in natural language bywords of various parts of speech, including verbs, adjec-tives, and common nouns. In particular, there is no specialconnection between grammatical verbs and logical predi-cates. The typical correspondences between the main En-glish syntactic categories and basic logical terms are dia-grammed below.

Pronouns J – – – ArgumentsProper Names

Nouns — HCommon Nouns

Verbs

Adjectives 6 – – – 1, 2, . . . n–place Predicates

Prepositions

Adverbs

Common nouns, used after a copula, as man in He is a manplainly correspond to predicates. In other positions, al-though they are embedded in grammatical noun phrases, asin A man arrived, they nonetheless correspond to predi-cates.

The development of formal logical languages, of whichfirst order predicate logic is the foremost example andhardiest survivor, heralds a realization of the essential dis-tance between ordinary language and purely truth-condi-tional representations of “objective” situations in the world.Indeed, early generations of modern logicians, includingFrege, Russell, and Tarski, believed the gap between ordi-nary language and logical, purely truth-conditional repre-sentations to be unbridgable. Times have changed, andsince Montague there have been substantial efforts to de-scribe a systematic mapping between truth conditions andordinary language. Ordinary language serves several pur-poses in addition to representation of states of affairs. Myargument in this article concerns mental representations ofsituations in the world, as these representations existed be-fore language, and even before communication. Thus, mat-ters involving how information is presented in externalizedutterances are not our concern here. The exclusive concernhere with pre-communication mental representations ab-solves us from responsibility to account for further cogni-tive properties assumed by more or less elaborate signals incommunication systems, such as natural languages. For thisreason also, the claims to be made here about the neuralcorrelates of PREDICATE(x) do not relate at all directly tomatters of linguistic processing (e.g., sentence parsing), asopposed to the prelinguistic representation of events andsituations.

Bertrand Russell was, of course, very far from conceivingof the logical enterprise as relating to how non-linguisticcreatures represent the world. But it might be helpful tonote that Russell’s kind of flat logical representations, as in∃ x [KoF(x) & wise(x)] for The king of France is wise,1 areessentially like those assumed by Batali (2002) and focussedon in this article. Russell’s famous controversy with Straw-son (Russell 1905; 1957; Strawson 1950) centered on the ef-

fect of embedding an expression for a predicate in a nounphrase determined by the definite article. Questions of def-initeness only arise in communicative situations, with whichStrawson was more concerned. A particular object in theworld is inherently neither definite nor indefinite; onlywhen we talk about an object do our referring noun phrasesbegin to have markers of definiteness, essentially conveying“You are already aware of this thing.”

The thesis proposed here is that there were, and still are,pre-communication mental representations which embodythe fundamental distinction between predicates and argu-ments, and in which the foundational primitive relationshipis that captured in logic by formulae of the kind PREDI-CATE(x). The novel contribution here is that the centralityof predicate-argument structure has a neural basis, adaptedto a sentient organism’s traffic with the world, rather thanhaving to be postulated as “logically true” or even Platoni-cally given. Neuroscience can, I claim, offer some informa-tive answers to the question of where elements of logicalform came from.

The strategy here is to assume that a basic element of firstorder predicate logic notation, PREDICATE(x), suitablyembedded, can be systematically related to natural lan-guage structures, in the ways pursued by recent generationsof formal semanticists of natural language, for example,Montague (1970; 1973), Parsons (1990), Kamp and Reyle(1993). The hypothesis here is not that all linguistic struc-ture derives from prelinguistic mental representations. I ar-gue elsewhere (Hurford 2002) that in fact very little of therich structure of modern languages directly mirrors anymental structure in pre-existing language.

In generative linguistics, such terms as “deep structure”and “surface structure,” “logical form” and “phonetic form”have specialized theory-internal meanings, but the basic in-sight inherent in such terminology is that linguistic struc-ture is a mapping between two distinct levels of represen-tation. In fact, most of the complexity in language structurebelongs to this mapping, rather than to the forms of the an-choring representations themselves. In particular, the syn-tax of logical form is very simple. All of the complexities ofphonological structure belong to the mapping betweenmeaning and form, rather than to either meaning or formper se. A very great proportion of morphosyntactic struc-ture clearly also belongs to this mapping – componentssuch as word-ordering, agreement phenomena, anaphoricmarking, most syntactic category distinctions (e.g., noun,verb, auxiliary, determiner), which have no counterparts inlogic, and focussing and topicalization devices. In this re-spect, the view taken here differs significantly from Bicker-ton’s (in Calvin & Bickerton 2000) that modern grammar inall its glory can be derived, with only a few auxiliary as-sumptions, from the kind of mental representations suit-able for cheater detection that our prelinguistic ancestorswould have been equipped with; see Hurford (2002) for afuller argument.

Therefore, to argue, as I will in this article, that a basiccomponent of the representation of meaning preexists lan-guage and can be found in apes, monkeys, and possiblyother mammals, leaves most of the structure of language(the complex mappings of meanings to phonetic signals)still unexplained in evolutionary terms. To argue that apeshave representations of the form PREDICATE(x), does notmake them out to be language-capable humans. Possessionof the PREDICATE(x) form of representation is evidently

Hurford: The neural basis of predicate-argument structure

BEHAVIORAL AND BRAIN SCIENCES (2003) 26:3 263

Page 4: The neural basis of predicate-argument structureinst.eecs.berkeley.edu/~cs182/sp07/readings/Hurford.pdfLogical predicates are expressed in natural language by words of various parts

not sufficient to propel a species into full-blown syntacticlanguage. There is much more to human language thanpredicate-argument structure, but predicate-argumentstructure is the semantic foundation on which all the rest isbuilt.

The view developed here is similar in its overall directionto that taken by Bickerton (1990). Bickerton argues for a“primary representation system (PRS)” existing in variouslydeveloped forms in all higher animals. “In all probability,language served in the first instance merely to label proto-concepts derived from prelinguistic experience” (p. 91).This is entirely consistent with the view proposed here, as-suming that what I call “prelinguistic mental predicates” areBickerton’s “protoconcepts.” Bickerton also believes, as Ido, that the representation systems of prelinguistic crea-tures have predicate-argument structure. Bickerton furthersuggests that, even before the emergence of language, it ispossible to distinguish subclasses of mental predicatesalong lines that will eventually give rise to linguistic dis-tinctions such as Noun/Verb. He argues that “[conceptscorresponding to] verbs are much more abstract than[those corresponding to] nouns” (p. 98). I also believe thata certain basic functional classification of predicates can beargued to give rise to the universal linguistic categories ofNoun and Verb. But that subdivision of the class of predi-cates is not my concern here. Here the focus is on the morefundamental issue of the distinction between predicatesand their arguments. So this paper is not about the emer-gence of Noun/Verb structure (which is a story that mustwait for another day). (Batali’s [2002] impressive computersimulations of the emergence of some aspects of naturallanguage syntax start from conjunctions of elementary for-mulae in PREDICATE(x) form, but it is notable that theydo not arrive at anything corresponding to a Noun/Verb dis-tinction.)

On top of predicate-argument structure, a number ofother factors need to come together for language to evolve.Only the sketchiest mention will be given of such factorshere, but they include (a) the transition from private men-tal representations to public signals; (b) the transition frominvoluntary to voluntary control; (c) the transition from epi-genetically determined to learned and culturally transmit-ted systems; (d) the convergence on a common code by acommunity; (e) the evolution of control of complex hierar-chically organized signalling behaviour (syntax); (f) the de-velopment of deictic here-and-now talk into definite refer-ence and proper naming capable of evoking events andthings distant in time and space. It is surely a move forwardin explaining the evolution of language to be able to dissectout the separate steps that must be involved, even if theseturn out to be more dauntingly numerous than was previ-ously thought. (In parallel fashion, the discovery of thestructure of DNA immediately posed problems of previ-ously unimagined complexity to the next generation of bi-ologists.)

1.2. Prelinguistic predicates

In the view adopted here, a predicate corresponds, to a firstapproximation, to a judgement that a creature can makeabout an object. Some predicates are relatively simple. Fora simple predicate, the senses provide the brain with inputallowing a decision with relatively little computation. On ascale of complexity, basic colour predicates are near the

simple end, while predicates paraphrasable as sycamore orweasel are much more complex. Mentally computing theapplicability of complex predicates often involves simplerpredicates, hence relatively more computation.

Some ordinary languages predicates, such as big, dependfor their interpretation on the prior application of otherpredicates. Generically speaking, a big flea is not big; this isno contradiction, once it is admitted that the sentence im-plicitly establishes two separate contexts for the applicationof the adjective big. There is “big, generically speaking,”that is, in the context of consideration of all kinds of objectsand of no one kind of object in particular; and there is “bigfor a flea.” This is semantic modulation. Such modulation isnot a solely linguistic phenomenon. Many of our higher-level perceptual judgements are modulated in a similar way.An object or substance characterized by its whitish colour(like chalk) reflects bright light in direct sunlight, but a lightof lower intensity in the shade at dusk. Nevertheless, thebrain, in both circumstances, is able to categorize thiscolour as whitish, even though the lower intensity of light isreflected by a greyish object or substance (like slate) in di-rect sunlight. In recognizing a substance as whitish or grey-ish, the brain adjusts to the ambient lighting environment.Viewing chalk in poor light, the visual system returns thejudgement “Whitish, for poor light”; in response to light ofthe same intensity, as when viewing slate in direct sunlight,the visual system returns the judgement “Greyish, for broaddaylight.” A similar example can be given from speech per-ception. In a language such as Yoruba, with three level lex-ical tones, high, mid, and low, a single word spoken by anunknown speaker cannot reliably be recognized as on a hightone spoken by a man or a low or mid tone spoken by awoman or child. But as soon as a few words are spoken, thehearer recognizes the appropriate tones in the context ofthe overall pitch range of the speaker’s voice. Thus, theranges of external stimuli which trigger a mental predicatemay vary, systematically, as a function of other stimuli pre-sent.

This article will be mainly concerned with 1-place pred-icates, arguing that they correspond to perceived proper-ties. There is no space here to present a fully elaborated ex-tension of the theory to predicates of degree greater thanone, but a few suggestive remarks may convince a readerthat in principle the theory may be extendable to n-placepredicates (n . 1).

Prototypical events or situations involving 2-place predi-cates are described by John kicked Fido (an event) or Thecat is on the mat (a situation). Here I will take it as giventhat observers perceive events or situations as unifiedwholes; there is some psychological reality to the conceptof an atomic event or situation. In a 2-place predication(barring predicates used reflexively), the two participantentities involved in the event or situation also have proper-ties. In formal logic, it is possible to write a formula such as∃ x ∃ y [kick(x, y)], paraphrasable as Something kicks some-thing. But I claim that it is never possible for an observer toperceive an event of this sort without also being able tomake some different 1-place judgements about the partic-ipants. Perhaps the most plausible potential counterexam-ple to this claim would be reported as I feel something. Nowthis could be intended to express a 1-place state, as in I amhungry; but if it is genuinely intended as a report of an ex-perience involving an entity other than the experiencer, Iclaim that there will always be some (1-place) property of

Hurford: The neural basis of predicate-argument structure

264 BEHAVIORAL AND BRAIN SCIENCES (2003) 26:3

Page 5: The neural basis of predicate-argument structureinst.eecs.berkeley.edu/~cs182/sp07/readings/Hurford.pdfLogical predicates are expressed in natural language by words of various parts

this entity present to the mind of the reporter. That is, the“something” which is felt will always be felt as having someproperty, such as sharpness, coldness or furriness. Ex-pressed in terms of a psychologically realistic logical lan-guage enhanced by meaning postulates, this amounts to theclaim that every 2-place predicate occurs in the implicansof some meaning postulate whose implicatum includes 1-place predicates applicable to its arguments. The selec-tional restrictions expressed in some generative grammarsprovide good examples; the subject of drink must be ani-mate, the object of drink must be a liquid.

In the case of asymmetric predicates, the asymmetry canalways be expressed in terms of one participant in the eventor situation having some property which the other lacks.And, I suggest, this treatment is psychologically plausible.In cases of asymmetric actions, as described by such verbsas hit and eat, the actor has the metaproperty of being theactor, cashed out in more basic properties such as move-ment, animacy, and appearance of volition. Likewise, theother, passive, participant is typically characterized byproperties such as lack of movement, change of state, inan-imacy, and so forth (see Cruse 1973 and Dowty 1991 for rel-evant discussion). Cases of asymmetric situations, such asare involved in spatial relations as described by prepositionssuch as on, in, and under, are perhaps less obviously treat-able in this way. Here, I suggest that properties involvingsome kind of perceptual salience in the given situation areinvolved. In English, while both sentences are grammati-cal, The pen is on the table is commonplace, but The tableis under the pen is studiously odd. I would suggest that anobject described by the grammatical subject of on has aproperty of being taken in as a whole object comfortably bythe eye, whereas the other object involved lacks this prop-erty and is perceived (on the occasion concerned) rather asa surface than as a whole object.

In the case of symmetric predicates, as described by fighteach other or as tall as, the arguments are not necessarilydistinguished by any properties perceived by an observer.

I assume a version of event theory (Davidson 1980; Par-sons 1990), in which the basic ontological elements arewhole events or situations, annotated as e, and the partici-pants of these events, typically no more than about three,annotated as x, y, and z. For example, the event describedby A man bites a dog could be represented as ∃ e, x, y,bite(e), man(x), dog(y), agent(x), patient(y). In clumsy En-glish, this corresponds to “There is a biting event involvinga man and a dog, in which the man is the active volitionalparticipant, and the dog is the passive participant.” The lessnewsworthy event would be represented as ∃ e, x, y, bite(e),man(x), dog(y), agent(y), patient(x). The situation de-scribed by The pen is on the table could be represented as∃ e, x, y, on(e), pen(x), table(y), small_object(x), surface(y).

In this enterprise it is important to realize the great am-biguity of many ordinary language words. The relations ex-pressed by English on in An elephant sat on a tack and in Abook lay on a table are perceptually quite different (thoughthey also have something in common). Thus, there are atleast several mental predicates corresponding to ordinarylanguage words. When, in the histories of natural lan-guages, words change their meanings, the overt linguisticforms become associated with different mental predicates.The predicates which I am concerned with here are prelin-guistic mental predicates, and are not to be simply identi-fied with words.

Summarizing these notes, it is suggested that it may bepossible to sustain the claim that n-place predicates (n . 1)are, at least in perceptual terms, constructible from 1-placepredicates. The core of my argument in this article con-cerns formulae of the form PREDICATE(x), that is, 1-placepredications. My core argument in this article does notstand or fall depending on the correctness of these sugges-tions about n . 1-place predicates. If the suggestions aboutn . 1-place predicates are wrong, then the core claim islimited to 1-place predications, and some further argumentwill need to be made concerning the neural basis of n . 1-place predications. A unified theory relating all logicalpredicates to the brain is methodologically preferable, sothere is some incentive to pursue the topic of n . 1-placepredicates.

1.3. Individual variables as prelinguistic arguments

Here are two formulae of first order predicate logic(FOPL), with their English translations.

CAME(john) (Translation: “John came”)∃ x[TALL(x) & MAN(x) & CAME(x)] (Translation: “A tall man

came”)

The canonical fillers of the argument slots in predicatelogic formulae are constants denoting individuals, corre-sponding roughly to natural language proper names. In themore traditional schemes of semantics, no distinction be-tween extension and intension is made for proper names.On many accounts, proper names have only extensions(namely the actual individuals they name), and do not haveintensions (or “senses”). “What is probably the most widelyaccepted philosophical view nowadays is that they [propernames] may have reference, but not sense” (Lyons 1977,p. 219). “Dictionaries do not tell us what [proper] namesmean – for the simple reason that they do not mean any-thing” (Ryle 1957). In this sense, the traditional view hasbeen that proper names are semantically simpler than pred-icates. More recent theorizing has questioned that view.

In a formula such as CAME(john), the individual con-stant argument term is interpreted as denoting a particularindividual, the very same person on all occasions of use ofthe formula. FOPL stipulates by fiat this absolutely fixed re-lationship between an individual constant and a particularindividual entity. Note that the denotation of the term is athing in the world, outside the mind of any user of the log-ical language. It is argued at length by Hurford (2001) thatthe mental representations of protohumans could not haveincluded terms with this property. Protothought had noequivalent of proper names. Control of a proper name inthe logical sense requires Godlike omniscience. Creaturesonly have their sense organs to rely on when attempting toidentify, and to re-identify, particular objects in the world.Where several distinct objects, identical to the senses, ex-ist, a creature cannot reliably tell which is which, and there-fore cannot guarantee control of the fixed relation betweenan object and its proper name that FOPL stipulates. It’s nouse applying the same name to each of them, because thatviolates the requirement that logical languages be unam-biguous. More detailed arguments along these lines aregiven in Hurford (1999; 2001), but it is worth repeatinghere the counterargument to the most common objectionto this idea. It is commonly asserted that animals can rec-ognize other animals in their groups.

The following quotation demonstrates the prima facie at-

Hurford: The neural basis of predicate-argument structure

BEHAVIORAL AND BRAIN SCIENCES (2003) 26:3 265

Page 6: The neural basis of predicate-argument structureinst.eecs.berkeley.edu/~cs182/sp07/readings/Hurford.pdfLogical predicates are expressed in natural language by words of various parts

traction of the impression that animals distinguish such in-dividuals, but simultaneously gives the game away.

The speed with which recognition of individual parents can beacquired is illustrated by the “His Master’s Voice” experimentsperformed by Stevenson et al. (1970) on young terns: these re-sponded immediately to tape-recordings of their own parents(by cheeping a greeting, and walking towards the loudspeaker)but ignored other tern calls, even those recorded from otheradult members of their own colony. (Walker 1983, p. 215)

Obviously, the tern chicks in the experiment were not rec-ognizing their individual parents – they were being fooledinto treating a loudspeaker as a parent tern. For the ternchick, anything that behaved sufficiently like its parentwas “recognized” as its parent, even if it wasn’t. The ternchicks were responding to very finely-grained propertiesof the auditory signal, and apparently neglecting even themost obvious of visual properties discernible in the situa-tion. In tern life, there usually aren’t human experi-menters playing tricks with loudspeakers, and so ternshave evolved to discriminate between auditory cues just tothe extent that they can identify their own parents with ahigh degree of reliability. Even terns presumably some-times get it wrong.

Animals respond in mechanical robot-like fashion to key stim-uli. They can usually be “tricked” into responding to crudedummies that resemble the true, natural stimulus situation onlypartially, or in superficial respects. (Krebs & Dawkins 1984,p. 384; quoted in Hurford 2001)

The logical notion of an individual constant permits nodegree of tolerance over the assignment of these logicalconstants to individuals; this is why they are called “con-stants.” It is an a priori fiat of the design of the logical lan-guage that individual constants pick out particular individ-uals with absolute consistency. In this sense, the logicallanguage is practically unrealistic, requiring, as previouslymentioned, Godlike omniscience on the part of its users,the kind of omniscience reflected in the biblical line “Buteven the very hairs of your head are all numbered” (Mat-thew, Ch.10).

Interestingly, several modern developments in theoriz-ing about predicates and their arguments complicate thetraditional picture of proper names, the canonical argu-ment terms. The dominant analysis in the modern formalsemantics of natural languages (e.g., Montague 1970; 1973)does not treat proper names in languages (e.g., John) likethe individual constants of FOPL. For reasons having to dowith the overall generality of the rules governing the com-positional interpretation of all sentences, modern logicaltreatments make the extensions of natural language propernames actually more complex than, for example, the exten-sions of common nouns, which are 1-place predicates. Insuch accounts, the extension of a proper name is not simplya particular entity, but the set of classes containing that en-tity, while the extension of a 1-place predicate is a class.Concretely, the extension of cat is the class of cats, while theextension of John is the set of all classes containing John.

Further, it is obvious that in natural languages, there aremany kinds of expressions other than proper names whichcan fill the NP slots in clauses.

Semantically, then, PNs are an incredibly special case ofNP; almost nothing that a randomly selected full NP candenote is also a possible proper noun denotation. This issurprising, as philosophers and linguists have often treatedPNs as representative of the entire class of NPs. Somewhat

more exactly, perhaps, they have treated the class of fullNPs as representable . . . by what we may call individual de-noting NPs (Keenan 1987, p. 464).

This fact evokes one of two responses in logical accounts.The old-fashioned way was to deny that there is any straight-forward correspondence between natural language clauseswith nonproper name subjects or objects and their transla-tions in predicate logic (as Russell [1905] did). The modernway is to complicate the logical account of what grammati-cal subjects (and objects), including proper names, actuallydenote (as Montague did).

In sum, logical formulae of the type CAME(john), con-taining individual constants, cannot be plausibly claimed ascorresponding to primitive mental representations pre-existing human language. The required fixing of the desig-nations of the individual constants (“baptism” in Kripke’s[1980] terms) could not be practically relied upon. Modernsemantic analysis suggests that natural language propernames are in fact more complex than longer noun phraseslike the man, in the way they fit into the overall composi-tional systems of modern languages. And while propernames provide the shortest examples of (nonpronominal)noun phrases, and hence are convenient for brief expositoryexamples, they are in fact somewhat peripheral in their se-mantic and syntactic properties.

Such considerations suggest that, far from being primi-tive, proper names are more likely to be relatively late de-velopments in the evolution of language. In the historicalevolution of individual languages, proper names are fre-quently, and perhaps always, derived from definite descrip-tions, as is still obvious from many, such as, Baker, Wheeler,Newcastle. It is very rare for languages to lack propernames, but such languages do exist. Machiguenga (or Mat-sigenka), an Arawakan language, is one, as several primarysources (Johnson 2003; Snell 1964) testify.

A most unusual feature of Matsigenka culture is the nearabsence of personal names (Snell 1964, pp. 17–25). Sincepersonal names are widely regarded by anthropologists as ahuman universal (e.g., Murdock 1960, p. 132), this startlingassertion is likely to be received with skepticism. When Ifirst read Snell’s discussion of the phenomenon, before Ihad gone into the field myself, I suspected that he hadmissed something (perhaps the existence of secret ceremo-nial names) despite his compelling presentation of evidenceand his conclusion:

I have said that the names of individual Machiguenga, whenforthcoming, are either of Spanish origin and given to them bythe white man, or nicknames. We have known Machiguenga In-dians who reached adulthood and died without ever having re-ceived a name or any other designation outside of the kinshipsystem. . . . Living in small isolated groups there is no impera-tive need for them to designate each other in any other way thanby kinship terminology. Although there may be only a few tribeswho do not employ names. I conclude that the Machiguenga isone of those few. (Snell 1964, p. 25)

Experience has taught me that Snell was right. Althoughthe Matsigenka of Shimaa did learn the Spanish namesgiven them, and used them in instances where it was nec-essary to refer to someone not of their family group, theyrarely used them otherwise and frequently forgot orchanged them. (Johnson 2003)

Joseph Henrich, another researcher on Machiguengatells me “This is a well established fact among Machiguengaresearchers” (personal communication).

Hurford: The neural basis of predicate-argument structure

266 BEHAVIORAL AND BRAIN SCIENCES (2003) 26:3

Page 7: The neural basis of predicate-argument structureinst.eecs.berkeley.edu/~cs182/sp07/readings/Hurford.pdfLogical predicates are expressed in natural language by words of various parts

In this society there is very little cooperation, exchange or shar-ing beyond the family unit. This insularity is reflected in the factthat until recently they didn’t even have personal names, refer-ring to each other simply as “father,” “patrilineal same-sexcousin,” or whatever. (Douglas 2001, p. 41)

The social arrangements of our prelinguistic ancestorsprobably involved no cooperation, exchange, or sharing be-yond the family unit, and the mental representations whichthey associated with individuals could well have been kin-ship predicates or other descriptive predicates.

In Australian languages, people are usually referred to bydescriptive predicates.

Each member of a tribe will also have a number of personalnames, of different types. They may be generally known by anickname, describing some incident in which they were in-volved or some personal habit or characteristic, e.g., “[she who]knocked the hut over,” “[he who] sticks out his elbows whenwalking,” “[she who] runs away when a boomerang is thrown,”“[he who] has a damaged foot.” But each individual will alsohave a sacred name, generally given soon after birth. (Dixon1980, p. 27)

The extensive anthropological literature on names testi-fies to the very special status, in a wide range of cultures, ofsuch sacred or “baptismal” proper names, both for peopleand places. It is common for proper names to be used withgreat reluctance, for fear of giving offense or somehow in-truding on a person’s mystical selfhood. A person’s propername is sometimes even a secret.

The personal names by which a man is known are somethingmore than names. Native statements suggest that names arethought to partake of the personality which they designate. Thename seems to bear much the same relation to the personalityas the shadow or image does to the sentient body. (Stanner1937, quoted in Dixon 1980, p. 28)

It is hard to see how such mystical beliefs can have becomeestablished in the minds of creatures without language.More probably, it was only early forms of language itselfthat made possible such elaborate responses to propernames.

Hence, it is unlikely that any primitive mental represen-tation contained any equivalent of a proper name, that is,an individual constant. We thus eliminate formulae of thetype of CAME(john) as candidates for primitive mentalrepresentations.

This leaves us with quantified formulae, as in ∃ x [MAN(x)& TALL(x)]. Surely we can discount the universal quanti-fier ∀ as a term in primitive mental representations. Whatremains is one quantifier, which we can take to be implic-itly present and to bind the variable arguments of predi-cates. I propose that formulae of the type PREDICATE(x)are evolutionarily primitive mental representations, forwhich we can find evidence outside language.

2. Neural correlates of PREDICA TE(x)

It is high time to mention the brain. In terms of neuralstructures and processes, what justification is there forpositing representations of the form PREDICATE(x) insidehuman heads? I first set out some ground rules for corre-lating logical formulae, defined denotationally and syntac-tically, with events in the brain.

Representations of the form PREDICATE(x) are here in-terpreted psychologistically; specifically, they are taken tostand for the mental events involved when a human attends

to an object in the world and classifies it perceptually as sat-isfying the predicate in question. In this psychologistic view,it seems reasonable to correlate denotation with stimulus.Denotations belong in the world outside the organism;stimuli come from the world outside a subject’s head. Awhole object, such as a bird, can be a stimulus. Likewise,the properties of an object, such as its colour or shape, canbe stimuli. The two types of term in the PREDICATE(x)formula differ in their denotations. An individual variabledoes not have a constant denotation, but is assigned differ-ent denotations on different occasions of use; and the deno-tation assigned to such a variable is some object in theworld, such as a particular bird, or a particular stone or aparticular tree. A predicate denotes a constant property ob-servable in the world, such as greenness, roundness, or thecomplex property of being a certain kind of bird. The ques-tion to be posed to neuroscience is whether we can find sep-arate neural processes corresponding to (1) the shifting, adhoc assignment of a “mental variable” to different stimulusobjects in the world, not necessarily involving all, or evenmany, of the objects’ properties, and (2) the categorizationof objects, once they instantiate mental object variables, interms of their properties, including more immediate per-ceptual properties, such as colour, texture, and motion, andmore complex properties largely derived from combina-tions of these.

The syntactic structure of the PREDICATE(x) formulacombines the two types of term into a unified whole capa-ble of receiving a single interpretation which is a functionof the denotations of the parts; this whole is typically takento be an event or a state of affairs in the world. The brack-eting in the PREDICATE(x) formula is not arbitrary: it rep-resents an asymmetric relationship between the two typesof information represented by the variable and the predi-cate terms. Specifically, the predicate term is understood insome sense to operate on, or apply to, the variable, whosevalue is provided beforehand. The bracketing in the PRED-ICATE(x) formula is the first, lowest-level, step in the con-struction of complex hierarchical semantic structures, asprovided, for example, in more complex formulae of FOPL.The innermost brackets in a FOPL formula are alwaysthose separating a predicate from its arguments. If we canfind separate neural correlates of individual variables andpredicate constants, then the question to be put to neuro-science about the validity of the whole formula is whetherthe brain actually at any stage applies the predicate (prop-erty) system to the outputs of the object variable system, ina way that can be seen as the bottom level of complex, hi-erarchically organized brain activity.

2.1. Separate locating and identifying components invision and hearing

The evidence cited here is mainly from vision. Human vi-sion is the most complex of all sensory systems. About aquarter of human cerebral cortex is devoted to visual analy-sis and perception. There is more research on vision rele-vant to our theme, but some work on hearing has followedthe recent example of vision research and arrived at similarconclusions.

2.1.1. Dorsal and ventral visual streams. Research on theneurology of vision over the past two decades has reachedtwo important broad conclusions. One important message

Hurford: The neural basis of predicate-argument structure

BEHAVIORAL AND BRAIN SCIENCES (2003) 26:3 267

Page 8: The neural basis of predicate-argument structureinst.eecs.berkeley.edu/~cs182/sp07/readings/Hurford.pdfLogical predicates are expressed in natural language by words of various parts

from the research is that vision is not a single unified sys-tem: Perceiving an object as having certain properties is acomplex process involving clearly distinguishable pathways,and hence processes, in the brain (seminal works areGoodale & Milner 1992; Trevarthen 1968; Ungerleider &Mishkin 1982).

The second important message from this literature, as ar-gued, for instance, by Milner and Goodale (1995), is thatmuch of the visual processing in any organism is inextrica-bly linked with motor systems. If we are to carve nature ather joints, the separation of vision from motor systems is inmany instances untenable. For many cases, it is more rea-sonable to speak of a number of visuomotor systems. Thus,frogs have distinct visuomotor systems for orienting to andsnapping at prey, and for avoiding obstacles when jumping(Ingle 1973; 1980; 1982). Distinct neural pathways from thefrog’s retina to different parts of its brain control these re-flex actions.

Distinct visuomotor systems can similarly be identified inmammals, as Milner and Goodale (1995) report:

In summary, the modular organization of visuomotor behaviourin representative species of at least one mammalian order, therodents, appears to resemble that of much simpler vertebratessuch as the frog and toad. In both groups of animals, visuallyelicited orienting movements, visually elicited escape, and vi-sually guided locomotion around barriers are mediated by quiteseparate pathways from the retina right through to motor nu-clei in the brainstem and spinal cord. This striking homology inneural architecture suggests that modularity in visuomotor con-trol is an ancient (and presumably efficient) characteristic ofvertebrate brains. (Milner & Goodale 1995, pp. 18–19)

Coming closer to our species, a clear consensus hasemerged in primate (including human) vision research thatone must speak of (at least) two separate neural pathwaysinvolved in the vision-mediated perception of an object.The literature is centred around discussion of two relateddistinctions: the distinction between magno and parvochannels from the retina to the primary visual cortex (V1)(Livingstone & Hubel 1988), and the distinction betweendorsal and ventral pathways leading from V1 to further vi-sual cortical areas (Mishkin et al. 1983; Ungerleider &Mishkin 1982). These channels and pathways functionlargely independently, although there is some crosstalk be-tween them (Merigan et al. 1991; Van Essen et al. 1992),and in matters of detail there is, naturally, complication(e.g., Hendry & Yoshioka 1994; Johnsrude et al. 1999;Marois et al. 2000) and some disagreement (e.g., Franz etal. 2000; Merigan & Maunsell 1993; Zeki 1993). See Milnerand Goodale (1995, pp. 33–39, 134–36) for discussion ofthe magno/parvo-dorsal/ventral relationship. (One has tobe careful what one understands by “modular” when quot-ing Milner & Goodale [1995]. In real brains, modules areneural entities that modulate, compete, and cooperate,rather than being encapsulated processors for one “faculty”[Arbib 1987b].) It will suffice here to collapse under the la-bel “dorsal stream” two separate pathways from the retinato posterior parietal cortex; one route passes via the lateralgeniculate nucleus and V1, and the other bypasses V1 en-tirely, passing through the superior colliculus and pulvinar(see Milner & Goodale 1995, p. 68).

While it is not obvious that both divergences pertain tothe same functional role, the proposals made here are notso detailed or subtle as to suggest any relevant discrimina-tion between these two branches of the route from retina

to parietal cortex. The dorsal stream has been characterizedas the “where” stream, and the ventral stream as the “what”stream. The popular “where” label can be misleading, sug-gesting a single system for computing all kinds of spatial lo-cation; as we shall see, a distinction must be made betweenthe computing of egocentric (viewer-centred) locational in-formation and allocentric (other-centred) locational infor-mation. Bridgeman et al. (1979) use the preferable terms“cognitive” (for “what” information) and “motor-oriented”(for “where” information). Another suitable mnemonicmight be the “looking” stream (dorsal) and the “seeing”stream (ventral). Looking is a visuomotor activity, involvinga subset of the information from the retina controlling cer-tain motor responses such as eye-movement, head andbody orientation, and manual grasping or pointing. Seeingis a perceptual process, allowing the subject to deploy otherinformation from the retina to ascribe certain properties,such as colour and motion, to the object to which the dor-sal visuomotor looking system has already directed atten-tion.

Appreciation of an object’s qualities and of its spatial locationdepends on the processing of different kinds of visual informa-tion in the inferior temporal and posterior parietal cortex, re-spectively. (Ungerleider & Mishkin 1982, p. 578)

Both cortical streams process information about the intrin-sic properties of objects and their spatial locations, but thetransformations they carry out reflect the different pur-poses for which the two streams have evolved. The trans-formations carried out in the ventral stream permit the for-mation of perceptual and cognitive representations whichembody the enduring characteristics of objects and theirsignificance; those carried out in the dorsal stream, whichneed to capture instead the instantaneous and egocentricfeatures of objects, mediate the control of goal-directed ac-tions (Milner & Goodale 1995, pp. 65–66).

Figure 1 shows the separation of dorsal and ventral path-ways in schematic form.

Experimental and pathological data support the distinc-tion between visuo-perceptual and visuomotor abilities.

Patients with cortical blindness, caused by a lesion to thevisual cortex in the occipital lobe, sometimes exhibit “blind-sight.” Sometimes the lesion is unilateral, affecting just onehemifield, sometimes bilateral, affecting both; presentationof stimuli can be controlled experimentally, so that conclu-sions can be drawn equally for partially and fully blind pa-tients. In fact, paradoxically, patients with the blindsightcondition are never strictly “fully” blind, even if both hemi-fields are fully affected. Such patients verbally disclaimability to see presented stimuli, and yet they are able tocarry out precisely guided actions such as eye-movement,manual grasping and “posting” (into slots). (See Goodale etal. 1994; Marcel 1998; Milner & Goodale 1995; Sanders etal. 1974; Weiskrantz 1986; 1997. See also Ramachandran &Blakeslee [1998] for a popular account.)

These cited works on blindsight conclude that the sparedunconscious abilities in blindsight patients are those iden-tifying relatively low-level features of a “blindly seen” ob-ject, such as its size and distance from the observer, whileaccess to relatively higher-level features such as colour andsome aspects of motion is impaired.2 Classic blindsightcases arise with humans, who can report verbally on theirinability to see stimuli, but parallel phenomena can betested and observed in nonhumans. Moore et al. (1998)

Hurford: The neural basis of predicate-argument structure

268 BEHAVIORAL AND BRAIN SCIENCES (2003) 26:3

Page 9: The neural basis of predicate-argument structureinst.eecs.berkeley.edu/~cs182/sp07/readings/Hurford.pdfLogical predicates are expressed in natural language by words of various parts

summarize parallels between residual vision in monkeysand humans with damage to V1.

A converse to the blindsight condition has also been ob-served, indicating a double dissociation between visually-directed grasping and visual discrimination of objects.Goodale et al.’s patient R.V. could discriminate one objectfrom another, but was unable to use visual information tograsp odd-shaped objects accurately (Goodale et al. 1994).Experiments with normal subjects also demonstrate a mis-match between verbally reported visual impressions of thecomparative size of objects and visually-guided grasping ac-tions. In these experiments, subjects were presented with astandard size-illusion-generating display, and asserted (in-correctly) that two objects differed in size; yet when askedto grasp the objects, they spontaneously placed their fingersexactly the same distance apart for both objects (Aglioti etal. 1995). Aglioti et al.’s conclusions have recently beencalled into question by Franz et al. (2000); see the discus-sion by Westwood et al. (2000) for a brief up-to-date surveyof nine other studies on this topic.

Advances in brain-imaging technology have made it pos-sible to confirm in nonpathological subjects the distinct lo-calizations of processing for object recognition and objectlocation (e.g., Aguirre & D’Esposito [1997] and other stud-ies cited in this paragraph). Haxby et al. (1991), while not-ing the homology between humans and nonhuman pri-mates in the organization of cortical visual systems into“what” and “where” processing streams, also note some dis-placement, in humans, in the location of these systems dueto development of phylogenetically newer cortical areas.They speculate that this may have ramifications for “func-tions that humans do not share with nonhuman primates,such as language.” Similar homology among humans andnonhuman primates, with some displacement of areas spe-cialized for spatial working memory in humans, is noted byUngerleider et al. (1998), who also speculate that this dis-

placement is related to the emergence of distinctively hu-man cognitive abilities.

The broad separation of visual pathways into ventral anddorsal has been tested against performance on a range ofspatial tasks in normal individuals (Chen et al. 2000). Sevenspatial tasks were administered, of which three “were con-structed so as to rely primarily on known ventral streamfunctions and four were constructed so as to rely primarilyon known dorsal stream functions” (p. 380). For example, atask where subjects had to make a same/different judge-ment on pairs of random irregular shapes was classified asa task depending largely on the ventral stream; and a taskin which “participants had to decide whether two buildingsin the top view were in the same locations as two buildingsin the side view” (p. 383) was classified as depending largelyon the dorsal stream. These classifications, though subtle,seem consistent with the general tenor of the research re-viewed here, namely, that recognition of the properties ofobjects is carried out via the ventral stream and the spatiallocation of objects is carried out via the dorsal stream. Af-ter statistical analysis of the performance of forty-eight sub-jects on all these tasks, Chen et al. conclude

The specialization for related functions seen within the ventralstream and within the dorsal stream have direct behavioralmanifestations in normal individuals. . . . at least two brain-based ability factors, corresponding to the functions of the twoprocessing streams, underlie individual differences in visu-ospatial information processing. (Chen et al. 2000, p. 386)

Chen et al. speculate that the individual differences inventral and dorsal abilities have a genetic basis, mentioninginteresting links with Williams syndrome (Bellugi et al.1988; Frangiskakis et al. 1996).

Milner (1998) gives a brief but comprehensive overviewof the evidence, up to 1998, for separate dorsal and ventralstreams in vision. For my purposes, Pylyshyn (2000) sumsit up best:

Hurford: The neural basis of predicate-argument structure

BEHAVIORAL AND BRAIN SCIENCES (2003) 26:3 269

Retina

SC

LGNd

Pulv

PrimaryVisualCortex

Infero- temporalCortex

PosteriorParietalCortex

Ventral Stream

Dorsal S

tream

Figure 1. Schematic diagram showing major routes whereby retinal input reaches dorsal and ventral streams. The inset [brain draw-ing] shows the cortical projections on the right hemisphere of a macaque brain. LGNd, lateral geniculate nucleus, pars dorsalis; Pulv,pulvinar nucleus; SC, superior colliculus (From Milner & Goodale 1995).

Page 10: The neural basis of predicate-argument structureinst.eecs.berkeley.edu/~cs182/sp07/readings/Hurford.pdfLogical predicates are expressed in natural language by words of various parts

The most primitive contact that the visual system makes withthe world (the contact that precedes the encoding of any sen-sory properties) is a contact with what have been termed visualobjects or proto-objects . . . As a result of the deployment of fo-cal attention, it becomes possible to encode the various prop-erties of the visual objects, including their location, color, shapeand so on. (Pylyshyn 2000, p. 206)

2.1.2. Auditory location and recognition. Less researchhas been done on auditory systems than on vision. Thereare recent indications that a dissociation exists between thespatial location of the source of sounds and recognition ofsounds, and that these different functions are served byseparate neural pathways.

Rauschecker (1997), Korte and Rauschecker (1993), andTian and Rauschecker (1998) investigated the responses ofsingle neurons in cats to various auditory stimuli. Raus-checker concludes

The proportion of spatially tuned neurons in the AE [5 ante-rior ectosylvian] and their sharpness of tuning depends on thesensory experience of the animal. This and the high incidenceof spatially tuned neurons in AE suggests that the anterior ar-eas could be part of a “where” system in audition, which signalsthe location of sound. By contrast, the posterior areas of cat au-ditory cortex could be part of a “what” system, which analyseswhat kind of sound is present. (Rauschecker 1997, p. 35)

Rauschecker suggests that there could be a similar func-tional separation in monkey auditory cortex.

Romanski et al. (1999) have considerably extended theseresults in a study on macaques using anatomical tracing ofpathways combined with microelectrode recording. Theirstudy reveals a complex network of connections in the au-ditory system (conveniently summarized in a diagram byKaas & Hackett 1999). Within this complex network it ispossible to discern two broad pathways, with much crosstalk between them but nevertheless somewhat specializedfor separate sound localization and higher auditory pro-cessing, respectively. The sound localization pathway in-volves some of the same areas that are centrally involved invisual localization of stimuli, namely, dorsolateral prefrontalcortex and posterior parietal cortex. Kaas and Hackett(1999), in their commentary, emphasize the similarities be-tween visual, auditory, and somatosensory systems, each di-viding along “what” versus “where” lines.3 Graziano et al.(1999) have shown that certain neurons in macaques havespatial receptive fields limited to about 30 cm around thehead of the animal, thus contributing to a specializedsound-location system.

Coming to human audition, Clarke et al. (2000) tested arange of abilities in four patients with known lesions, con-cluding

Our observation of a double dissociation between auditoryrecognition and localisation is compatible with the existence oftwo anatomically distinct processing pathways for non-verbalauditory information. We propose that one pathway is involvedin auditory recognition and comprises lateral auditory areas andthe temporal convexity. The other pathway is involved in audi-tory-spatial analysis and comprises posterior auditory areas, theinsula and the parietal convexity. (Clarke et al. 2000, p. 805)

Evidence from audition is less central to my argumentthan evidence from vision. My main claim is that in predi-cate-argument structure, the predicate represents somejudgement about the argument, which is canonically an at-tended-to object. There is a key difference between visionand hearing. What is seen is an object, typically enduring;

what is heard is an event, typically fleeting. If language isany guide (which it surely is, at least approximately) mentalsound predicates can be broadly subdivided into thosewhich simply classify the sound itself (rendered in Englishwith such words as bang, rumble, rush), and those whichalso classify the event or agent which caused the sound (ex-pressed in English by such words as scrape, grind, whisper,moan, knock, tap). (Perhaps this broad dichotomy is moreof a continuum.) When one hears a sound of the first type,such as a bang, there is no object, in the ordinary sense of“object,” which “is the bang.” A bang is an ephemeral event.One cannot attend to an isolated bang in the way in whichone directs one’s visual attention to an enduring object. Theonly way one can simulate attention to an isolated bang isby trying to hold it in memory for as long as possible. Thisis quite different from maintained visual attention, whichgives time for the ventral stream to do heavy work catego-rizing the visual stimuli in terms of complex properties. Notall sounds are instantaneous, like bangs. One can notice acontinuous rushing sound. But again, a rushing sound is notan object. Logically, it seems appropriate to treat bangs andrushing sounds either with zero-place predicates, that is, aspredicates without arguments, or as predicates taking eventvariables as arguments. (The exploration of event-basedlogics is a relatively recent development.) English descrip-tions such as There was a bang or There was a rushing tendto confirm this.

Sounds of the second type, classified in part by what(probably) caused them, allow the hearer to postulate theexistence of an object to which some predicate applies. If,for example, you hear a miaow, you mentally classify thissound as a miaow. This, as with the bang or the rushingsound, is the evocation of a zero-place predicate (or alter-natively a predicate taking an event variable as argument).Certainly, hearing a miaow justifies you in inferring thatthere is an object nearby satisfying certain predicates, inparticular CAT(x). But it is vital to note that the Englishword miaow is two-ways ambiguous. Compare That soundwas a miaow with A cat miaowed, and note that you can’tsay *That sound miaowed or *That cat was a miaow. Wherethe subject of miaow describes some animate agent, theverb actually means “cause a miaow sound.”

It is certainly interesting that the auditory system alsoseparates “where” and “what” streams. But the facts of au-dition do not fit so closely with the intuitions, canonicallyinvolving categorizable enduring objects, which I believegave rise to the invention by logicians of predicate-argu-ment notation. The idea of zero-place predicates has gen-erally been sidelined in logic (despite their obvious applic-ability to weather phenomena); and the extension ofpredicate-argument notation to include event variables isrelatively recent. (A few visual predicates, like that ex-pressed by English flash, are more like sounds, but theseare highly atypical of visual predicates.)

We have now considered both visual and auditory per-ception, and related them to object-location motor re-sponses involving eye movement, head movement, bodymovement, and manual grasping. Given that when the headmoves, the eyes move too, and when the body moves, thehands, head, and eyes also move, we should perhaps not besurprised to learn that the brain has ways of controlling theinteractions of these body parts and integrating signals fromthem into single coherent overall responses to the locationof objects. Given a stimulus somewhere far round to one

Hurford: The neural basis of predicate-argument structure

270 BEHAVIORAL AND BRAIN SCIENCES (2003) 26:3

Page 11: The neural basis of predicate-argument structureinst.eecs.berkeley.edu/~cs182/sp07/readings/Hurford.pdfLogical predicates are expressed in natural language by words of various parts

side, we instinctively turn our whole body toward it; if thestimulus comes from not very far around, we may only turnour head; and if the stimulus comes from quite close to ourfront, we may only move our eyes. All this happens regard-less of whether the stimulus was a heard sound or some-thing glimpsed with the eye. Furthermore, as we turn ourhead or our eyes, light from the same object falls on a trackacross the retina, yet we do not perceive this as movementof the object. Research is beginning to close in on the areasof the brain that are responsible for this integrated locationability. Duhamel et al. (1992) found that the receptive fieldsof neurons in lateral intraparietal cortex are adjusted tocompensate for saccades.

One important form of spatial recoding would be to modulatethe retinal information as a function of eye position with respectto the head, thus allowing the computation of location in head-based rather than retina-based coordinates. . . . By the time vi-sual information about spatial location reaches premotor areasin the frontal lobe, it has been considerably recalibrated by in-formation derived from eye position and other non-retinalsources. (Milner & Goodale 1995, p. 90)

The evidence that Milner and Goodale (1995) cite isfrom Galletti and Battaglini (1989), Andersen et al. (1985;1990), and Gentilucci et al. (1983). Brotchie et al. (1995)present evidence that in monkeys

the visual and saccadic activities of parietal neurons are stronglyaffected by head position. The eye and head position effects areequivalent for individual neurons, indicating that the modula-tion is a function of gaze direction, regardless of whether theeyes or head are used to direct gaze. These data are consistentwith the idea that the posterior parietal cortex contains a dis-tributed representation of space in body-centred coordinates.(Brotchie et al. 1995, p. 232)

Gaymard et al. (2000, p. 819) report on a pathological hu-man case which “supports the hypothesis of a commonunique gaze motor command in which eye and head move-ments would be rapidly exchangeable.” Nakamura (1999)gives a brief review of this idea of integrated spatial repre-sentations distributed over parietal cortex. Parietal cortex isthe endpoint of the dorsal stream, and neurons in this areaboth respond to visual stimuli and provide motor control ofgrasping movements (Jeannerod et al. 1995). In a study ofvision-guided manual reaching, Carrozzo et al. (1999) havelocated a gradual transformation from viewer-centered tobody-centered and arm-centered coordinates in superiorand inferior parietal cortex. Graziano et al. (1997) discov-ered “arm1visual” neurons in macaques, which are sensi-tive to both visual and tactile stimuli, and in which the vi-sual receptive field is adjusted according to the position ofthe arm. Stricanne et al. (1996, p. 2071) investigated howlateral intraparietal (LIP) neurons respond when a monkeymakes saccades to the remembered location of soundsources in the absence of visual stimulation; they proposethat “area LIP is either at the origin of, or participates in,the transformation of auditory signals for oculomotor pur-poses.” Most recently, Kikuchi-Yorioka and Sawaguchi(2000) have found neurons which are active both in thebrief remembering of the location of a sound and in thebrief remembering of the location of a light stimulus. A fur-ther interesting connection between visual and auditory lo-calization comes from Weeks et al. (2000), who find thatboth sighted and congenitally blind subjects use posteriorparietal areas in localizing the source of sounds, but theblind subjects also use right occipital association areas orig-

inally intended for dorsal-stream visual processing. Egly etal. (1994) found a difference between left-parietal-lesionedand right-parietal-lesioned patients in an attention-shiftingtask.

The broad generalization holds that the dorsal streamprovides very little of all the information about an objectthat the brain eventually gets, but just about enough to di-rect attention to its location and enable some motor re-sponses to it. The ventral stream fills out the picture withfurther detailed information, enough to enable a judge-ment by the animal about exactly what kind of object it isdealing with (e.g., flea, hair, piece of grit, small leaf, shadow,nipple; or, in another kind of situation, brother, sister, fa-ther, enemy, leopard, human). A PET scan study (Martin etal. 1996) confirms that the recognition of an object (say, asa gorilla or a pair of scissors) involves activation of a ventraloccipitotemporal stream. The particular properties that ananimal identifies will depend on its ecological niche andlifestyle. It probably has no need of a taxonomy of pieces ofgrit, but it does need taxonomies of fruit and prey animals,and will accordingly have somewhat finely detailed mentalcategories for different types of fruit and prey. I identifysuch mental categories, along with non-constant properties,such as colour, texture, and movement, which the ventralstream also delivers, with predicates.

2.2. “Dumb” attentional mechanisms and theobject /property distinction

Some information about an object, for example, enoughabout its shape and size to grasp it, can be accessed via thedorsal stream, in a preattentive process. The evidence citedabove from optical size illusions in normal subjects showsthat information about size as delivered by the dorsalstream can be at odds with information about size as deliv-ered by the ventral stream. Thus, we cannot say that the twostreams have access to exactly the same property, “size”;presumably the same is true for shape. Much processing forshape occurs in the ventral stream, after its divergence fromthe dorsal stream in V1 (Gross 1992); at the early V1 stage,full shapes are not represented, but rather basic informa-tion about lines and oriented edges, as Hubel and Wiesel(1968) first argued, or possibly about certain 3D aspects ofshape (Lehky & Sejnowski 1988). Something about the ap-pearance of an object in peripheral vision draws attentionto it. Once the object is focally attended to, we can try to re-port the “something” about it that drew our attention. Butthe informational encapsulation (in the sense of Fodor1983) of the attention-directing reflex means that the moredeliberative process of contemplating an object cannot beguaranteed to report accurately on this “something.” Andstimuli impinging on the retinal periphery trigger differentprocesses from stimuli impinging on the fovea. Thus, it isnot clear whether the dorsal stream can be said to deliverany properties, or mental predicates, at all. It may not beappropriate to speak of the dorsal stream delivering repre-sentations (accessible to report) of the nature of objects.Nevertheless, in a clear sense, the dorsal stream does de-liver objects, in a minimal sense of “object” to be discussedbelow. What the dorsal stream delivers, very fast, is infor-mation about the egocentric location of an object, whichtriggers motor responses resulting in the orientation of fo-cal attention to the object. (At a broad-brush level, the dif-ferences between preattentive processes and focal atten-

Hurford: The neural basis of predicate-argument structure

BEHAVIORAL AND BRAIN SCIENCES (2003) 26:3 271

Page 12: The neural basis of predicate-argument structureinst.eecs.berkeley.edu/~cs182/sp07/readings/Hurford.pdfLogical predicates are expressed in natural language by words of various parts

tion have been known for some time, and are concisely andelegantly set out in Ch. 5 of Neisser 1967.)

In a functioning high-level organism, the informationprovided by the dorsal and ventral streams can be expectedto be well coordinated (except in the unusual circumstanceswhich generate illusions). Thus, although predicates/prop-erties are delivered by the ventral stream, it would not besurprising if a few of the mental predicates available to a hu-man being did not also correspond at least roughly to in-formation of the type used by the dorsal stream. But hu-mans have an enormous wealth of other predicates as well,undoubtedly accessed exclusively via the ventral stream,and bearing only indirect relationships to salient attention-drawing traits of objects. Humans classify and name objects(and substances) on the basis of properties at all levels ofconcreteness and salience. Landau et al. (1988; 1998a;1998b) and Smith et al. (1996) report a number of experi-ments on adults’ and children’s dispositions to name famil-iar and unfamiliar objects. There are clear differences be-tween children and adults, and between children’sresponses to objects that they in some sense understandand to those that are strange to them. Those subjects withleast conceptual knowledge of the objects presented, thatis, the youngest children, presented with strange objects,tended to name objects on the basis of their shape. Smithet al. (1996) relate this disposition to the attention-drawingtraits of objects:

Given that an adult is attending to a concrete object and pro-ducing a novel name, children may interpret the novel name asreferring to “whatever it is about the object that most demandsattention.” An attentional device that produces this result maywork well enough to start a child’s learning of a specific objectname. (Smith et al. 1996, p. 169)

This is not unexpected. Higher-level features and cate-gories are learned, and once learned, can be applied in ex-tending names to things. The youngest humans, havinglearned few or no higher-level categories, have only themost basic features to appeal to, those corresponding to in-formation gleaned by the dorsal stream. See Bloom (2000)for a recent commentary on this literature, emphasizing adifferent theme, but consistent with the hypothesis thatchildren’s earliest naming tendencies capitalize strongly onattention-drawing traits of objects.

But doesn’t talk of “attention-drawing traits of objects”undermine my central argument, by locating some “traits”(alias properties) within the class of information deliveredby the dorsal stream? A position diametrically opposed tomine would be that ultimately there is no distinction at allto be made between objects and properties. A philosophi-cal argument for such a position might appeal to Englishterms such as “objecthood,” meaning the property of beingan object. Advanced logical systems can play havoc with ba-sic ontological categories, such as object and property, byvarious devices such as type-raising. Such devices may beappropriate in the analysis of elaborated human languagesand the systems of thought that they make available. Yes,humans can treat properties as objects, by reification, andobjects as properties (by “Pegasizing Pegasus,” as Quine putit). But I would claim that an ape’s mental traffic with theworld is in terms of two broadly noninterconvertible onto-logical categories, object and property.

A more psychologically plausible argument against myposition might claim that any property of an object that onecould give a name to could in principle be an attention-

drawing trait. This would potentially attribute to the dorsalstream any information conveyed by a predicate, thus de-stroying the hypothesis that it is the ventral stream that de-livers predicates. I emphasize that such issues should be ad-dressed with empirical (neuro-) psychological evidence,rather than purely philosophical argumentation. Some rel-evant evidence exists, pointed out by O’Brien and Opie(1999), in connection with blindsight, as follows.

Consider the comments made by Weiskrantz’ subject D.B., af-ter performing well above chance in a test that involved distin-guishing between Xs and Os presented in his scotoma. WhileD.B. maintained that he performed the task merely by guess-ing:

If pressed, he might say that he perhaps had a “feeling” thatthe stimulus was either pointing this or that way, or was“smooth” (the O) or “jagged” (the X). On one occasion inwhich “blanks” were randomly inserted in a series of stim-uli . . . he afterwards spontaneously commented he had afeeling that maybe there was no stimulus present on sometrials. But always he was at a loss for words to describe anyconscious perception, and repeatedly stressed that he sawnothing at all in the sense of “seeing,” and that he was merelyguessing. (Weiskrantz et al. 1974, p. 721)

Throughout D.B.’s verbal commentaries there are similar re-marks. Although he steadfastly denies “seeing” in the usual waywhen presented with visual stimuli, he frequently describessome kind of concurrent awareness. He talks of things “poppingout a couple of inches” and of “moving waves,” in response tosingle point stimuli (Weiskrantz 1986, p. 45). He also refers to“kinds of pulsation” and of “feeling some movement” in re-sponse to moving line stimuli. (Weiskrantz 1986, p. 67)

Consequently, while blindsight subjects clearly do not havenormal visual experience in the “blind” regions of their visualfields, this is not to say that they don’t have any phenomenal ex-perience whatsoever associated with stimuli presented in theseregions. What is more, it is not unreasonable to suggest thatwhat little experience they do have in this regard explains theirresidual discriminative abilities. D.B., for example, does not seeXs or Os (in the conventional sense). But in order to performthis task he doesn’t need to. All he requires is some way of dis-criminating between the two stimulus conditions q some broadphenomenal criterion to distinguish “Xness” from “Oness.” Andas we’ve seen, he does possess such a criterion: one stimuluscondition feels “jagged” while the other feels “smooth.” Thus,it is natural to suppose that he is able to perform as well as hedoes (above chance) because of the (limited) amount of infor-mation that is consciously available to him. (O’Brien & Opie1999, p. 131)

Unlike O’Brien and Opie, I am not mainly concernedwith consciousness. I am content to concede that O’Brienand Opie have a point, and to fall back on the reservationthat a formula as simple as PREDICATE(x) cannot be ex-pected to mirror exactly all the processes of such a complexorgan as the brain. The stark contrast between the blind-sight patient’s experience and his performance is evidencethat the brain separates sub- or semiconscious awareness ofthe bare presence of an object from the vast array of judge-ments that can be made by a normal person about the prop-erties of an object. Perhaps training can boost the set ofproperties which can act as attention-drawing traits. But Iwould predict that only a tiny subset of properties are nat-ural attention-drawing properties, and that any propertiesadded to this set by practice or training are likely to swinginto action significantly more slowly than the primal atten-tion-drawing properties.

This prediction conflicts with a prediction of Milner andGoodale’s in their final chapter addressing further research

Hurford: The neural basis of predicate-argument structure

272 BEHAVIORAL AND BRAIN SCIENCES (2003) 26:3

Page 13: The neural basis of predicate-argument structureinst.eecs.berkeley.edu/~cs182/sp07/readings/Hurford.pdfLogical predicates are expressed in natural language by words of various parts

questions prompted by the dorsal/ventral distinction. Theywrite “It is unlikely that the dorsal stream plays the majorrole in mediating this initial [attention] selection process,since object recognition and ‘semantic’ knowledge mayhave to be taken into account.” (Milner & Goodale 1995,p. 202) With due deference to Milner and Goodale, I sug-gest that their implicit premise that all “semantic” recogni-tion takes place in the ventral stream may be too strong, andthat a very limited set of primal properties can be accessedby the dorsal stream. I would further claim that access tothese primal attention-drawing properties is highly encap-sulated, unlike access to properties delivered by the ventralstream. It is an intuition of this difference that gives rise tothe logician’s postulate that the fundamental logical struc-ture is an asymmetric relation between two distinct logicaltypes, predicate and argument.

As an interim summary, the formula PREDICATE(x) is asimplifying schematic representation of the integration bythe brain of two broadly separable processes. One processis the rapid delivery by the senses (visual and/or auditory)of information about the egocentric spatial location of a ref-erent object relative to the body, represented in parietalcortex. The eyes, often the head and body, and sometimesalso the hands, are oriented to the referent object, whichbecomes the instantiation of a mental variable. The otherprocess is the somewhat slower analysis of the delivered ref-erent object by the perceptual (visual or auditory) recogni-tion subsystems in terms of its properties. The asymmetricrelationship between the predicate and the variable, inher-ent in the bracketing of the formula, also holds of the twoneural processes:

From the genetical and functional perspectives, the two modesof processing are asymmetrically related: while egocentricevaluation of “where” need not take into account the identity ofobjects, the perception of “what” usually proceeds through anintermediate stage in which objects are dynamically localized.(Bridgeman et al. 1994)

There is an interesting parallel (more than merely coin-cidental) in the uses of the term “binding” in logic and neu-roscience. The existence of a blue dot can be representedin FOPL as ∃ x [BLUE(x) & DOT(x)]. (The ordering of theconjuncts is immaterial.) Here the existential quantifier issaid to “bind” the variable x immediately after it, and, alsoimportantly, all further instances of this variable must fallwithin the scope, indicated by brackets, of the quantifier.The variable and its binding quantifier thus serve to unitethe various predicates in the formula, indicating that theydenote properties of the same object. Logical binding is nota relationship between a predicate and its argument, but arelationship between all predicates in the scope of a partic-ular quantifier which take the bound variable as argument.In neuroscience, “binding is the problem of representingconjunctions of properties. . . . For example, to visually de-tect a vertical red line among vertical blue lines and diago-nal red lines, one must visually bind each line’s color to itsorientation” (Hummel 1999). Detection of properties isgenerally achieved via the ventral stream. The dorsalstream directs attention to an object. Once attention is fo-cussed on a particular object, the ventral stream can delivera multitude of different judgements about it, which can berepresented logically by a conjunction of 1-place predica-tions. The bare drawing of attention to an object, with nocategory judgements (yet) made about it, corresponds tothe “∃ x” part of the logical formula.

Evidently, the brain does solve the binding problem, al-though we are not yet certain exactly how it does it. Theclaim advanced here for a connection between predicate-argument structure and the ventral/dorsal separation doesnot depend on what, in detail, the brain’s solution to thebinding problem turns out to be.

2.3. Related proposals

2.3.1. Landau and Jackendoff: Nouns and prepositions.Jackendoff and Landau (1992) and Landau and Jackendoff(1993)4 noticed the early neurological literature on ventraland dorsal streams, and proposed a connection between the“where”/“what” dichotomy and the linguistic distinctionbetween prepositions and common nouns. They correlatecommon nouns denoting classes of physical objects with in-formation provided by the ventral stream, and prepositionswith information provided by the dorsal stream. Landauand Jackendoff emphasize the tentative and suggestive na-ture of their conclusions, but it will be useful to explainbriefly why I believe their proposed correlations are incor-rect, and to contrast their suggestions with mine.

Let us start with the proposed noun/ventral correlation.Nouns, as they correctly state, encode complex properties,such as being a dog. And categorization of objects, as whenone recognizes a particular object as a dog, involves the ven-tral stream. This much is right. Landau and Jackendoff em-phasize the striking contrast between the enormous num-ber of nouns in a language and the very restricted numberof prepositions. It is this stark quantitative contrast whichstands in need of explanation, and for which they invoke theneurological “what”/“where” distinction. Their reasoning isthat the dorsal stream provides a bare minimum of infor-mation about the location of an object (no more than is en-coded by the small inventory of prepositions in a language),while the ventral stream does all the rest of the work thatmay be necessary in categorizing it. This characterization ofthe relative amounts of linguistically expressible informa-tion provided by the respective streams certainly goes in theright direction (but is in fact, I will argue, an understate-ment).

However, a correlation of populous syntactic categories(such as noun) with the ventral stream, and a complemen-tary correlation of sparsely populated categories (such aspreposition) with the dorsal stream will not work. Consideradjectives. Adjectives are never as numerous in a languageas nouns, many languages have only about a dozen adjec-tives, and some languages have none at all (Dixon 1982).Taking the numbers of nouns, adjectives, and prepositions(or postpositions) across languages as a whole, one wouldbe more likely to group adjectives with prepositions as rel-atively sparsely populated syntactic categories. But many ofthe properties typically expressed by adjectives, such ascolour, are detected within the ventral stream. Landau andJackendoff might respond with the revised suggestion thatthe ventral stream processes both noun meanings and ad-jective meanings, leaving the difference in typical numbersof nouns and adjectives still unexplained, and this is fairenough, but it gets closer to the correlation proposed in thepresent paper between predicates generally and the ventralstream. Indeed, when one considers all syntactic categories,rather than restricting discussion to just nouns and prepo-sitions, it is clear that judgements corresponding to themeanings of many verbs (e.g., move and its hyponyms), and

Hurford: The neural basis of predicate-argument structure

BEHAVIORAL AND BRAIN SCIENCES (2003) 26:3 273

Page 14: The neural basis of predicate-argument structureinst.eecs.berkeley.edu/~cs182/sp07/readings/Hurford.pdfLogical predicates are expressed in natural language by words of various parts

many adverbs (e.g., fast and similar words) are made in theventral stream. Verbs are pretty numerous in languages,though not as numerous as nouns, while adverbs are muchless numerous, and some languages don’t have adverbs atall. The relative population-size of syntactic categories doesnot correlate with the ventral/dorsal distinction.

Now consider Landau and Jackendoff ’s proposed dorsal/preposition correlation. Prepositions express predicates,many of which give spatial information, both egocentric andallocentric. Their article naturally depended on the litera-ture available at the time it was written, especially the clas-sic Ungerleider and Mishkin (1982), which gave the im-pression of a distinction between “object vision” and asingle system of “spatial vision.” In a later very detailed cri-tique of this work, Milner and Goodale (1995) devote sev-eral chapters to accumulating evidence that an egocentricsystem of “visual guidance of gaze, hand, arm or whole bodymovement” (p. 118) is located in the posterior parietal re-gion, while many other kinds of visual judgement, includ-ing computation of allocentric spatial information, aremade using occipito-temporal and infero-temporal regionsof cortex. “Perhaps the most basic distinction that needs tobe made in thinking about spatial vision is between the lo-cational coordinates of some object within the visual fieldand the relationship between the loci of more than one ob-ject” (Milner & Goodale 1995, p. 89). Prepositions do notrespect this distinction, being used indiscriminately forboth egocentric (e.g., behind me) and allocentric (e.g., be-hind the house) information. Only information of the ego-centric kind is computed in the dorsal stream.

Of course, as Bryant (1993, p. 242) points out, there mustbe interaction between the systems for egocentric locationand the building of allocentric spatial maps. Galati et al.(2000) is a recent fMRI study which begins to relate ego-centric and allocentric functions to specific regions of cor-tex.

Both nouns and prepositions express predicates. I haveargued that the categorical judgements of properties andrelations involved in the application of all predicates to at-tended-to objects are mediated by the ventral stream. Thekey logical distinction is between predicates and individualvariables, not between different syntactic subclasses ofwords which express predicates. Thus, the logical distinc-tion correlated here with the neurological dorsal/ventraldistinction is considerably more fundamental, and hencelikely to be evolutionarily more primitive, than the distinc-tion on which Landau and Jackendoff focus. This idea isclose to what I believe Bridgeman (1993), in his commen-tary on them, states: “cognitive and [motor-oriented] spa-tial systems can be distinguished on a lower level than thatof Landau and Jackendoff, a level that differentiates lin-guistic from nonlinguistic coding” (p. 240). Predicates arecoded linguistically; the vast majority of words in a languagecorrespond to predicates. In languages generally, only a tinyinventory of words, the indefinite pronouns, such as some-thing and anything could be said to correlate directly withthe individual variables x, y, z of simple formulae such as ∃ x[LION(x)], loosely translatable as Something is a lion. Inmore complex examples, a case can be made that the logi-cal variables correspond to anaphoric pronouns, as in Therewas a lion and it yawned. The deictic nature of the variableswhose instantiations are delivered to posterior parietal cor-tex by the sensory “where” systems will be the subject ofsection 4.

2.3.2. Givon: Lexical concepts and propositions. Givon(1995, pp. 408–10), in a brief but pioneering discussion, re-lates the dorsal and ventral visual pathways to linguistic in-formation in a way which is partly similar to my proposedcorrelation. In particular, Givon correlates information ac-cessed via the ventral stream with lexical concepts. This isvery close to my correlation of this information with prelin-guistic predicates. Prelinguistic predicates are concepts (orwhat Bickerton calls “protoconcepts”), and they can be-come lexical concepts by association with phonologicalforms, once language gets established. My proposal differsfrom Givon’s in the information that we correlate with thedorsal stream, which he correlates with “spatial relation/motion – propositional information about states or events”(p. 409). Givon, writing before 1995, relied on several of thesame sources as Landau and Jackendoff, and, like them, as-sumes that “the dorsal (upper) visual processing stream an-alyzed the spatial relations between specific objects andspatial motion of specific objects. This processing track isthus responsible for analyzing specific visual states andevents” (p. 409, emphasis in original). As mentioned above,Milner and Goodale (1995) subsequently presented evi-dence that such allocentric spatial information is notprocessed in the dorsal stream. Elsewhere in Givon’s ac-count, there is an acknowledgement of the role of thestream to the temporal lobe in accessing information aboutspatial motion:

Further, even in non-human primates, the object recognition(ventral) stream analyzes more than visually perceived objectsand their attributes. Thus Perrett et al. (1989) in their study ofsingle-cell activation in monkeys have been able to differentiatebetween single cortical cells that respond to objects (nouns),and those that are activated by actions (verbs). Such differ-entiation occurs within the object recognition stream itself, inthe superior temporal sulcus of the left-temporal lobe. Andwhile the verbs involved – e.g., moving an object by hand to-wards mouth – are concrete and spatio-visual, they involvemore abstract computations of purpose and causation. (Givon1995, p. 410, emphasis in original)This attribution undermines Givon’s earlier identifica-

tion of the dorsal stream as the stream providing informa-tion about spatial motion. Note that Givon begins to corre-late neural structure with the specifically linguisticcategories of noun and verb, a move which I avoid. I corre-late information accessed by the ventral stream with pred-icates, regardless of whether these eventually get expressedas nouns, verbs, adjectives, or any other lexical category.The present proposed correlation of distinct neural path-ways with logical predicates and individual variables differsfrom both Landau and Jackendoff ’s and Givon’s proposalsin claiming completely prelinguistic correlates for the ven-tral and dorsal pathways. The correlation that I propose forinformation delivered by the dorsal stream is developed inmore detail in the next section.

2.3.3. Rizzolatti and Arbib: A prelinguistic “grammar” ofaction. Rizzolatti and Arbib’s paper (1998) contains a sec-tion entitled “A pre-linguistic ‘grammar’ of action in themonkey brain.” Like me, they are concerned with a neuralprecursor to language, found in monkey brains. There aresuperficial similarities between our proposals and differ-ences, which are important to state.

Rizzolatti and Arbib use a kind of logical notation to con-vey an idea about the activity of “canonical” macaque F5neurons in grasping small objects.

Hurford: The neural basis of predicate-argument structure

274 BEHAVIORAL AND BRAIN SCIENCES (2003) 26:3

Page 15: The neural basis of predicate-argument structureinst.eecs.berkeley.edu/~cs182/sp07/readings/Hurford.pdfLogical predicates are expressed in natural language by words of various parts

We view the activity of “canonical” F5 neurons as part of thecode for an imperative case structure, for example, Command:grasp-A(raisin), as an instance of grasp-A(object), where grasp-A is a specific kind of grasp, to be applied to the raisin. Notethat this case structure is an “action description,” not a linguis-tic representation. “Raisin” denotes the specific object towardswhich the grasp is directed, whereas grasp-A is a specific com-mand directed towards an object with well specified physicalproperties. (Rizzolatti & Arbib 1998, p. 192)The formula used here by Rizzolatti and Arbib is best

taken as a shorthand for a sequence of separate processes;the compression into a single formula gives rise to severalpotentially misleading infelicities. Logically, a term like“raisin” is a predicate, and therefore (in FOPL) should notbe used as an argument. This is not a merely pernicketypoint. Key to my own proposal is the idea that a predicateis the logical expression of a judgement about the categoryto which some attended-to object belongs. The process ofperceiving something to be a raisin is, I claim, well repre-sented by the formula RAISIN(x). Allowing, for the mo-ment, “GRASP-A” as a predicate, the sequence of events inthe monkey’s brain with which R&A are here concernedwould be better expressed as

RAISIN(x) Equation 1GRASP-A(x) Equation 2

That is, the judgement that the attended-to object is araisin precedes the motor instruction to grasp it in a certainway, if the animal is acting with any deliberation. If the ani-mal does not make a deliberate categorical judgement, butsimply reflexively grabs the object (with activation essen-tially limited to the dorsal stream), then, according to thecorrelation I propose, there is no question of the predicateRAISIN, or any other predicate, being involved. I have lessto say about the use of predicate notation to cover motor in-structions. Classical logic was devised as a way of objectivelyrepresenting (inter alia) observable events and states of af-fairs, and the present proposal is to link logic to the neuralprocesses involved in perception of stimuli from outside theanimal, and not to the mechanisms involved in purposefulaction by the animal. Rizzolatti and Arbib’s discussion, whileappealing to a notation which is logic-like in that it appar-ently has predicate-argument structure, does not in fact de-construct this formula and attribute the separate parts to dif-ferent neural processes, as is proposed in the present article.

3. Attention to locations, features, or objects?

Thus far, I have correlated logical predicates with perceivedfeatures, such as colour or shape, or more complex combi-nations of features, which make up a particular face; and Ihave correlated the instantiations of individual variable ar-guments of predicates with whole objects attended to, suchas a particular bird, stone or tree. But, one might ask, isn’tan object nothing more than a bundle of features?5 The no-tion of an object, as opposed to its features, is important forthe central claim of this article, that modern neurosciencehas revealed close correlates of the elements of the logicalPREDICATE(x) formula. In FOPL, individual variables areinstantiated by whole objects, not by properties. Substan-tial evidence now exists that the primary targets of attentiveprocesses are indeed whole objects, and not properties orfeatures.

Beside the object/feature distinction, the object/locationdistinction must also be mentioned. Preattentive processes,

operating largely through the dorsal stream, direct atten-tion to a location represented in a mental spatial map de-fined in terms of parts of the body. So, in a sense, attentionis directed to a place, rather than to an object. But, exceptin cases of illusion or stimuli that vanish as soon as they arenoticed, what the mind finds at the location to which at-tention is directed is an object. So what is held in attention,the object, or the location? Evidence has accumulated inrecent years that what is held in attention are objects, andnot locations.

A paper by Duncan (1984), while by no means the firston this topic, is a good place to start a survey of recent re-search. Duncan distinguishes between object-based, dis-crimination-based, and space-based theories of visual at-tention.

Object-based theories propose a limit on the number of sepa-rate objects that can be perceived simultaneously. Discrimina-tion-based theories propose a limit on the number of separatediscriminations that can be made. Space-based theories pro-pose a limit on the spatial area from which information can betaken up. (p. 501)

Space-based theories have been called “mental spotlight”theories, as they emphasize the “illumination” of a small cir-cle in space. Duncan experimented with brief exposures tonarrow displays, subtending less than one degree at the eye,consisting of two overlapping objects, an upright box (smallor large) with a line (dotted or dashed) passing downthrough it. The box always had a small gap in one side, toleft or right, and the line always slanted slightly to the rightor the left. Subjects had to report judgements on two di-mensions at a time, from the four possible dimensionsbox(size), box(gap), line(tilt), and line(texture).

It was found that two judgments that concern the same objectcan be made simultaneously without loss of accuracy, whereastwo judgments that concern different objects cannot. Neitherthe similarity nor the difficulty of required discriminations, northe spatial distribution of information, could account for the re-sults. The experiments support a view in which parallel, preat-tentive processes serve to segment the field into separate ob-jects, followed by a process of focal attention that deals withonly one object at a time. (p. 501)

And,The present data confirm that focal attention acts on packagesof information defined preattentively and that these packagesseem to correspond, at least to a first approximation, to our in-tuitions concerning discrete objects. (Duncan 1984, p. 514)

Duncan notes that object-based, discrimination-based,and space-based theories are not mutually exclusive. Thisidea is repeated by some later writers (e.g., Egly et al. 1994;Vecera & Farah 1994), who discuss the possibilities of dis-tinct systems of attention operating at different stages orlevels (e.g., early versus late) or in response to differenttasks (e.g., expectancy tasks versus selection tasks). The ex-perimental evidence for space-based attention provided bythese authors involves a different task from the task thatDuncan set his subjects (although the experimental mate-rials were very similar). Duncan asked his subjects forjudgements about the objects attended to. The experimentssuggesting space-based attention involved subjects beinggiven a “precue” (mostly valid, sometimes not) leadingthem to expect a stimulus to appear in a certain area, or ona certain object, and their task was simply to press a buttonwhen the stimulus appeared. Reaction times were mea-sured and compared. Vecera and Farah (1994) suggest: “In-

Hurford: The neural basis of predicate-argument structure

BEHAVIORAL AND BRAIN SCIENCES (2003) 26:3 275

Page 16: The neural basis of predicate-argument structureinst.eecs.berkeley.edu/~cs182/sp07/readings/Hurford.pdfLogical predicates are expressed in natural language by words of various parts

stead of attention being a single limitation or a single sys-tem, there may be different types of limitations or differenttypes of attention that depend on the representations usedin different tasks” (p. 153). This way of expressing it seemsto me to depart from the useful distinction between preat-tentive processes and focal attention. Duncan’s subjectsgave judgements about what was in their focal attention. Inthe precued experiments, the reaction times measured thesubjects’ preattentive processes. As Egly et al. (1994) note,“previous findings revealed evidence for both space-basedand object-based components to visual attention. However,we note that these two components have been identified invery different paradigms” (p. 173). I will continue on the as-sumption that the cued reaction-time paradigm in fact testspreattentive processes. My question here is whether focalattention operates on objects, locations, or features.6

A series of papers (Baylis 1994; Baylis & Driver 1993;Gibson 1994) takes up Duncan’s theme of whether focal at-tention is applied to objects or locations. As with Duncan’sexperiments, subjects were required to make judgementsabout what they saw, but in this case reaction times weremeasured. In most of the experiments, the displays shownto subjects could be interpreted as either a convex white ob-ject against a black ground, or two partly concave black ob-jects with a white space between them. Subjects had tojudge which of two apices in the display was the lower. Theapices could be seen as belonging to the same (middle) ob-ject, or to two different (flanking) objects.

Position judgments about parts of one object were more rapidthan equivalent judgments about two objects even though thepositions to be compared were the same for one- and two-ob-ject displays. This two-object cost was found in each of five ex-periments. Moreover, this effect was even found when the one-and two-object displays were physically identical in every re-spect but parsed as one or two objects according to the subjects’perceptual set. . . . We propose that spatial information is rou-tinely represented in two different ways in the visual system.First, a scene-based description of space represents the loca-tion of objects within a scene. Second, an object-based de-scription is produced to describe the relative positions of partsof each object. Such a hierarchical representation of space mayparallel the division of the primate visual system into a scene-based dorsal stream and an object-based ventral stream.”(Baylis & Driver 1993, pp. 466–67)

Gibson (1994) suggested that these results could havebeen caused by a confound between the number of objectsperceived and the concavity or convexity of the objects.Baylis (1994) replied to this objection with further experi-ments controlling against this possible confound, reinforc-ing the original conclusion that making a judgement abouttwo objects is more costly than making a judgement abouta single object, even when the displays are in fact physicallyidentical.

Luck and Vogel (1997) presented subjects with visual ar-rays, with a slight delay between them, and asked them toreport differences between the arrays. They summarizetheir conclusion as follows:

It is possible to retain information about only four colours ororientations in visual working memory at one time. However, itis also possible to retain both the colour and the orientation offour objects, indicating that visual working memory stores in-tegrated objects rather than individual features. Indeed, ob-jects defined by a conjunction of four features can be retainedin working memory just as well as single-feature objects, allow-ing sixteen individual features to be retained when distributed

across four objects. Thus, the capacity of visual working mem-ory must be understood in terms of integrated objects ratherthan individual features. (p. 279)

Valdes-Sosa et al. (1998) studied transparent motion “de-fined by two sets of differently colored dots that were in-terspersed in the same region of space, and matched in spa-tial and spatial frequency properties” (p. B13).

Each set moved in a distinct and randomly chosen direction.We found that simultaneous judgments of speed and directionwere more accurate when they concerned only one set thanwhen they concerned different sets. Furthermore, appraisal ofthe directions taken by two sets of dots is more difficult thanjudging direction for only one set, a difficulty that increases forbriefer motion. We conclude that perceptual grouping by com-mon fate exerted a more powerful constraint than spatial prox-imity, a result consistent with object-based attention. (p. B13)

The most recent and most ingenious experiment com-paring object-based, feature-based, and location-based the-ories of attention is Blaser et al. (2000). In this experiment,subjects were presented with a display consisting of twopatterned patches (“Gabors”), completely spatially super-imposed. The trick of getting two objects to seem to occupythe same space at the same time was accomplished by pre-senting the patches in alternate video frames. The patcheschanged gradually, and with a certain inertia, along thethree dimensions of colour, thickness of stripes, and orien-tation of stripes. Subjects had to indicate judgements aboutthe movements of these patches through “feature space.”In one experiment it was shown that observers are “capableof tracking a single object in spite of a spatially superim-posed distractor.” In a second experiment, “observers hadboth an instruction and a task that encouraged them to at-tend and track two objects simultaneously. It is clear thatobservers did much worse in these conditions than in thewithin-object conditions, where they only had to attend andtrack a single object.”

The story so far, then, is that the brain interprets rela-tively abrupt discontinuities – such as change of orientationof a line, change of colour, change of brightness – togetheras constructing holistic visual objects which are expected toshare a “common fate.” It is these whole objects that areheld in attention. A shift of attention from one object to an-other is costly, whereas a shift of attention from one featureof an object to another feature of the same object is lesscostly. This is consistent with the view underlying FOPLthat the entities to which predicates apply are objects, andneither properties nor locations. In accepting this correla-tion between logic and neuropsychology we have, paradox-ically, to abandon an “objective” view of objects. No per-ceptible physical object is ever the same from one momentof its existence to the next. Every thing changes. Objects aremerely slow events. What we perceive as objects is entirelydependent on the speed our brains work at. An object isanything that naturally attracts and holds our attention. Butobjects are what classical logicians have had in mind as thebasic entities populating their postulated universes. Thetradition goes back at least to Aristotle, with his “primarysubstances” (5 individual physical objects).

4. Computing deictic variables in vision, action,and language

The previous section concerned the holding in attention ofsingle whole objects. We can deal with several different ob-

Hurford: The neural basis of predicate-argument structure

276 BEHAVIORAL AND BRAIN SCIENCES (2003) 26:3

Page 17: The neural basis of predicate-argument structureinst.eecs.berkeley.edu/~cs182/sp07/readings/Hurford.pdfLogical predicates are expressed in natural language by words of various parts

jects in a single task, and take in scenes containing morethan one object. How do we do this, and what are the lim-its on the number of different objects we can manage to“keep in mind” at any one time?

The idea of objects of attention as the temporary instan-tiations of mental computational variables has been devel-oped by Kahneman and Treisman (1992), Ballard et al.(1995; 1997), and Pylyshyn (2000), drawing on earlier workincluding Kahneman and Treisman (1984), Ullman (1984),Agre and Chapman (1987) and Pylyshyn (1989). The ideabehind this work is that the mind, as a computational de-vice for managing an organism’s interactions with the world,has available for use at any time a small number of “deictic”or “indexical” variables. Pylyshyn (1989) calls such variables“FINSTs,” a mnemonic for “INSTantiation FINger.”

A FINST is, in fact, a reference (or index) to a particular fea-ture or feature cluster on the retina. However, a FINST has thefollowing additional important property: because of the wayclusters are primitively computed, a FINST keeps pointing tothe “same” feature cluster as the cluster moves across theretina. . . . The FINST itself does not encode any properties ofthe feature in question, it merely makes it possible to locate thefeature in order to examine it further if needed. (Pylyshyn 1989,pp. 69–70)

This is precisely what the FINST hypothesis claims: it says thatthere is a primitive referencing mechanism for pointing to cer-tain kinds of features, thereby maintaining their distinctiveidentity without either recognizing them (in the sense of cate-gorizing them), or explicitly encoding their locations. (Pylyshyn1989, p. 82, emphasis in original)

All practical tasks involve analysis of the scene of the taskin terms of the principal objects concerned. The simplescene-descriptions of predicate logic, such as ∃ x, y [MAN(x)& DOG(y) & BEHIND(y,x)] (translated as A dog is behinda man) have direct counterparts in examples used by visionresearchers of what happens in the brain when analyzing avisual scene. An early example from Ullman is:

Suppose, for example, that a scene contains several objects,such as a man at one location, and a dog at another, and that fol-lowing the visual analysis of the man figure we shift our gazeand processing focus to the dog. The visual analysis of the manfigure has been summarized in the incremental representation,and this information is still available at least in part as the gazeis shifted to the dog. In addition to this information we keep aspatial map, a set of spatial pointers, which tell us that the dogis at one direction, and the man at another. Although we nolonger see the man clearly, we have a clear notion of what ex-ists where. The “what” is supplied by the incremental repre-sentations, and the “where” by the marking map. (Ullman 1984,p. 150)Since this passage was written, in the early 1980s, vision

research has substantially developed the idea of separate“where” and “what” neural pathways, dorsal and ventral re-spectively, as surveyed above.

The everyday tasks of primates are plausibly envisaged insuch terms. Activities such as fishing for termites with astick and eating them, or building a sleeping nest in a tree,or collaborating with others in a hunt, all involve attentionto different objects while performing the task. During thetask, immediate attention is shifted from one thing to an-other, but the small number of principal things involved inthe task are not put out of mind. Crucial information aboutthem is stored as the contents of variables, or computationalpointers. The termite-fishing chimpanzee at one momentattends to the termites caught on its stick, and guides them

to its mouth. Meanwhile, it still holds, as part of the ongo-ing larger task, information about the hole in the termitemound, though it is not visually attending to it while puttingthe termites in its mouth. After eating the termites, visualattention is switched back to the hole in the termite mound,and the stick is manually guided into the hole. The chim-panzee need not rediscover the properties of the hole (e.g.,its size and orientation), because these properties havebeen stored as the contents of a computational variable.

(Managing scenes with several objects necessitates con-trol of sameness and difference. The ape doing some prac-tical task with several objects does not need to be able todistinguish these objects in principle from all other objectsin the world, but certainly does need to distinguish amongthe objects themselves. This is the simple seed from whichthe more advanced concept of a unique-in-the-world indi-vidual may grow.)

An idea very similar to Pylyshyn’s FINSTs, but slightlydifferent in detail, is proposed by Kahneman and Treisman(1984; 1992). These authors hypothesize that the mind setsup temporary “object files” in which information about ob-jects in a scene is stored. The object files can be updated,as the viewer tracks changes in an object’s features or loca-tion. It is emphasized that the information stored in tem-porary object files is not the same as that which may bestored in long-term memory. But the information in objectfiles can be matched with properties associated with objectsin long-term memory, for such purposes as object recogni-tion. When (or shortly after) objects disappear from thecurrent scene, their object files are discarded. A file full ofinformation is not a variable. In discussing the relationshipbetween object files and Pylyshyn’s FINSTs, Kahnemanand Treisman (1992) suggest that “a FINST might be theinitial phase of a simple object file before any features havebeen attached to it” (p. 217). This correspondence workswell, apart from a reservation, which Kahneman and Treis-man (1992) note, involving the possibility of there being ob-jects with parts that are also objects. This is a detail that Iwill not go into here. An “empty” object file, available forinformation to be put into it, is computationally an unin-stantiated variable, provided that it can be identified anddistinguished from other such files that are also availableand that may get different information put into them. Thefact that object files can be updated, are temporary, and canbe discarded for re-use with completely new values, un-derlines their status as computational variables used by themind for the short-term grasping of scenes.

Kahneman and Treisman (1992) “assume that there issome limit to the number of object files that can be main-tained at once” (p. 178). Ballard et al. (1997) stress thatcomputational efficiency is optimized if the number of suchvariables is small. Luck and Vogel (1997) demonstrate alimit of four objects in visual working memory (and proposean interesting explanation in terms of the “oscillatory ortemporally correlated firing patterns among the neuronsthat code the features of an object,” p. 280). Pylyshyn as-sumes “a pool of four or five available indexes” (Pylyshyn2000, p. 201). It is perhaps at first helpful to concretizethese ideas by identifying the available variables in the sameway as logicians do, by the letters w, x, y, and z. Neither lo-gicians nor vision researchers wish to be tied to the claimthat the mind can handle a maximum of only four variables,but hardly any examples given by them ever involve morethan four separate variables. So it would seem for many

Hurford: The neural basis of predicate-argument structure

BEHAVIORAL AND BRAIN SCIENCES (2003) 26:3 277

Page 18: The neural basis of predicate-argument structureinst.eecs.berkeley.edu/~cs182/sp07/readings/Hurford.pdfLogical predicates are expressed in natural language by words of various parts

practical purposes that about four variables are enough. Inperforming an everyday task, then, a creature such as a pri-mate mentally juggles a parsimonious inventory of vari-ables, w, x, y, z, . . . . Cowan (2001) provides a very thor-ough and extensive survey of studies of short term memory,concluding that there is a remarkable degree of similarityin the capacity limit in working memory observed with awide range of procedures. A restricted set of conditions isnecessary to observe this limit. It can be observed only withprocedures that allow assumptions about what the inde-pendent chunks are, and the limit the recursive use of thelimited-capacity store . . . The preponderance of evidencefrom procedures fitting these conditions strongly suggestsa mean memory capacity in adults of 3 to 5 chunks, whereasindividual scores appear to range more widely from about2 up to about 6 chunks. The evidence for this pure capac-ity limit is considerably more extensive than that for thesomewhat higher limit of 7 1 2 stimuli (Cowan 2001).

This small inventory of variables can explain other knownsize-limitations in humans and non-human primates. Theupper limit of subitizing in humans is around 4; given aquick glance at a group of objects, a human can guess ac-curately how many there are, without explicit counting, upto a limit of about 4 or 5 (see Antell & Keating 1983; Gel-man & Gallistel 1978; Mandler & Shebo 1982; Russac 1983;Schaeffer et al. 1974; Starkey & Cooper 1980 for some rel-evant studies). Both Ullman (1984, p. 151) and Pylyshyn(2000, pp. 201–202) make the connection between subitiz-ing (which Ullman calls “visual counting”) and the markingor indexing of locations in a scene. Trick and Pylyshyn(1993; 1994) explain the natural limit of subitizing in termsof the number of objects that can be involved in “pre-at-tentive” processing in vision. Dehaene (1997), in work onthe numerical competences of many species, finds a naturaldifference between low numerosities up to about 3 or 4,and higher ones. For details of how this natural discontinu-ity at around 4 in the number sequence is reflected in thenumerals, adjectives, and nouns of many human languages,see Hurford (1987; 2000a).

The simple clauses of human languages are constrainedto a maximum of about four or five core arguments; indeed,most clauses have fewer than this. Presumably this reflectsthe structure of the underlying mental propositions. Con-ceivably, one could analyze the content of a complex sen-tence, such as The cat chased the mouse that stole the cheesethat lay in the house that Jack built as having a single pred-icate CHASE-STEAL-LIE-BUILD and five arguments (thecat, the mouse, the cheese, the house, and Jack). But it ismore reasonable to suppose that the grammatical structureof such embedded natural language clauses reflects a men-tal structure involving a nesting of separate propositions,each with its own simple predicate expressing a relation be-tween just two arguments (which may be shared with otherpredicates).7

Ballard et al. (1997) give grounds why the number ofvariables juggled in computing practical tasks must be small(typically no more than three). Of course, most sentencesin human languages are not direct representations of anypractical task on the part of the speaker, like “Put the stickin the hole.” Humans exchange declarative informationabout the world for use at later times, for example, “Yourmother’s coming on Tuesday.” But mental scene descrip-tions are necessary for carrying out practical tasks of thekind that primates are capable of, and therefore pre-exist

language phylogenetically. It is plausible that the type ofscene descriptions used by nonhuman primates would bereused for more complex cognitive, and ultimately linguis-tic, purposes. I suggest that the limitation of elementarypropositions to no more than about three arguments, andthe typical use of even fewer arguments, derives from theconsiderations of computational efficiency advanced byBallard et al. (1997).8

The marking, or indexing, of spatial locations in a visuallyanalyzed scene, as described by Ullman and Pylyshyn, hasa direct analog in human signed languages. Where spokenlanguages establish the existence of discourse referentswith noun phrases, and subsequently use definite pronounsand descriptions to re-identify these referents, signed lan-guages can use a directly visuo-spatial method of keepingtrack of discourse referents. A user of British Sign Lan-guage, for instance, on telling a story involving three par-ticipants, will, on introducing them into the discourse, as-sign them a position in the signing space around him. Onreferring back to these individuals, he will point to the ap-propriate spatial position (equivalent to saying “this one” or“that one”).

[In many sign languages] Anaphoric pronouns can only occurfollowing the localization of the referent noun in the locationassigned to the pronoun. Nouns articulated in the space in frontof the body are, for example, moved to third person space;nouns located on a body part would be followed by an indexingof third person space. This assignment of location to a refer-ent . . . then continues through the discourse until it is changed.To indicate anaphoric reference, the signer indexes the locationpreviously assigned to that referent. . . .

The operation of anaphora . . . can be seen in the followingBSL example “The woman keeps hitting the man.” In this, thesign MAN is articulated with the left hand, followed by the ‘per-son’ classifier, located to fourth person space. The left hand re-mains in the “person” classifier handshape and fourth personlocation, while the remainder of the sentence is signed. Thesign WOMAN is articulated with the right hand, followed bythe “person” classifier, located to third person space. The verbHIT, an agreement verb, is then articulated, moving on a trackfrom the subject (third person) to object (fourth person).9 (Woll& Kyle 1994, p. 3905)

See also Liddell (1990), McDonald (1994), and Padden(1990). For the sign language recipient, the experience ofdecoding a signed scene-describing utterance closely par-allels the visual act of analyzing the scene itself; in bothcases, the objects referred to are assigned to different loca-tions in space, which the recipient/observer marks.

There is a further parallel between linguistic deicticterms and the deictic variables invoked by vision re-searchers. As we have seen, Pylyshyn postulates “a pool offour or five available indexes,” and Ballard et al. (1997) em-phasize that most ordinary visually guided tasks can be ac-complished with no more than three deictic variables. Thedeictic terms of natural languages are organized into inter-nally contrastive subsystems: English examples are here/there, now/then, yesterday/today/tomorrow, Past-tense/non-Past-tense, this/that, these/those. Some languages areslightly richer in their deictic systems than English. Japa-nese, for instance, distinguishes between three demonstra-tives, kono (close to the speaker), sono (close to the listener,or previously referred to), and ano (reasonably distant fromboth speaker and listener); this three-way distinction indemonstrative adjectives is paralleled by three-way distinc-tions in kore/sore/are (demonstrative pronouns) and koko/

Hurford: The neural basis of predicate-argument structure

278 BEHAVIORAL AND BRAIN SCIENCES (2003) 26:3

Page 19: The neural basis of predicate-argument structureinst.eecs.berkeley.edu/~cs182/sp07/readings/Hurford.pdfLogical predicates are expressed in natural language by words of various parts

soko/asoko and kochira/sochira/achira (adverbs of placeand direction respectively). Spanish likewise makes a three-way distinction in demonstratives, este/ese/aquel, withslightly different meanings from the Japanese. There are afew languages with four-way contrasts. Tlingit is one suchlanguage. In Tlingit,

yáa “this (one) right here” is clearly “close to Sp”; héi “this (one)nearby” is characterized by a moderate distance from Sp with-out reference to the Adr; wée “that (one) over there” is againnot identified by the location of the Adr; and yóo “that (one) faroff (in space or time),” the fourth term, is simply remote fromthe speech situation. (Anderson & Keenan 1985, p. 286)

Anderson and Keenan mention two other languages, Sreand Quileute, as also having four-way deictic contrasts.They mention one language, CiBemba, with a five-way sys-tem, and one, Malagasy, with a seven-way system; frankly, Iam skeptical of the claim for seven degrees of contrast alonga single dimension in Malagasy. “Systems with more thanfive terms along the basic deictic dimension are exceedinglyrare” (Anderson & Keenan 1985, p. 288).

The extreme rarity of languages providing more than fivecontrasting deictic terms in any subsystem correspondsnicely to the “pool of four or five available indexes,” or vi-sual deictic variables, postulated by Pylyshyn. In an utter-ance entirely concerning objects in the vicinity of thespeech situation, none of which are identified by any pred-icate/property, there is a limit to how many separate thingsa speaker or hearer can keep track of, with expressionsequivalent to “this one near me,” “that one near you,” “thatone yonder,” and so on. Pylyshyn (1989) explicitly relates hisFINST devices to the indexical pronouns here and there,and suggests that FINSTs provide a semantics for such ex-pressions. It is important to note the highly elastic size ofthe domains appealed to in deixis. Within deictic systems,“near” and “far” are typically relative, not absolute. Hence,within a domain which is all in some sense near the speaker,there nevertheless will still be a distinction between “near”and “far.”

The provision by the brain’s sensory/perceptual systemsof a pool of about four or five variables for ad hoc deictic as-signment to objects in the accessible environment, and theseparate processes of perceptual categorization of the ob-jects so identified, constitutes an early system for the rep-resentation of scenes. This system was based on multiple in-stances of (or conjunctions of) propositions of the formPREDICATE(x), involving up to about four different vari-ables. An example of such a scene-description might be

APE(x) & STICK(y) & MOUND(z) & HOLE(w) & IN(w,z)& PUT(x,y,w)

translating to An ape puts a stick into a hole in a mound.This translation is given here just for convenience.10 So far,we have made no move to suggest how such nonlinguisticmental representations came to be externalized in theshared communication system of a community. If we aretalking about language at all, it is, so far, only private lan-guage. Nevertheless, given the genetic homogeneity ofcommunities of primates, it is highly likely that what hap-pens in the brain of one animal on seeing a scene is repre-sented very similarly in the brains of its fellow troop mem-bers. The simply structured internal representationsprovide a preadaptive platform on which a simple publiclanguage could develop.11

I have suggested certain parallels between the prelin-

guistic representation of events (restriction to three to fiveparticipants, location of the participants in egocentricspace) and features of modern human languages (clausesize, limits of deictic systems, anaphora in sign languages).I believe that these features of language can ultimately betraced back to evolutionary precursors in the prelinguisticrepresentations. But it also seems very likely that in the evo-lution of the language capacity, the human brain has liber-ated itself from certain of the most concrete associations ofthe prelinguistic representations. Thus, when a modern hu-man processes a sentence describing some abstract rela-tion, such as Ambition is more forgivable than greed, it isunlikely that any specifically egocentric space-processing(parietal) areas are activated. The relation between ancientegocentric visuo-spatial maps and modern features of lan-guage is, I would claim, rather like the relationship betweenancient thermoregulation panels and wings, a relationshipof homology or exaptation. If the ancient structures hadnever existed, the modern descendants would not have theparticular features that they do, but the modern descen-dants are just that, descendants, with the kind of modifica-tions one expects from evolution.

5. Common ground of neuroscience, linguistics,and philosophy

I have made the connection between neural processing ofvisual scenes and mental representations of propositions asexpressed by simple natural language clauses. The sameconnection is everywhere heavily implicit, though not ex-plicitly defended, in the writing of the vision researcherscited here. In particular, the four terms, “deictic,” “indexi-cal,” “refer,” and “semantic,” borrowed from linguistics andthe philosophy of language, have slipped with remarkableease and naturalness into the discussion of visual process-ing. “Deictic” as a grammatical term has a history goingback to the Greek grammarians (who used “deiktikos”; seeLyons 1977, p. 636, for a sketch of this history), indicatinga “pointing” relationship between words and things. “Deic-tic” and “indexical” are equivalent terms. Agre and Chap-man (1987) apply the term “indexical” to computational en-tities invoked by a program designed for fast, efficient,planning-free interaction with a model world. These enti-ties “are not logical categories because they are indexical:their extension depends on the circumstances. In this way,indexical-functional entities are intermediate between log-ical individuals and categories” (Agre & Chapman 1987,p. 270).12 The parallels between efficient computing forfast local action and the efficient fast analysis of visualscenes, using deictic or indexical entities, are later taken upby a small but growing number of writers (e.g., Ballard etal. 1995; 1997; Pylyshyn 2000) arguing the advantages ofreorientating perceptual and cognitive research along “sit-uated” or “embodied” lines.

Similarly, the term “refer” is typically used in ordinarylanguage, and consistently in the more technical discourseof linguists and philosophers, with a linguistic entity, suchas a word, as one of its arguments, and a thing in the worldas another argument, as in “Fido refers to my dog.” Straw-son’s classic article “On Referring” (Strawson 1950) is allabout statements and sentences of ordinary languages; forSearle (1979) and other speech act theorists, referring is aspeech act. Linguists prefer to include a third argument,

Hurford: The neural basis of predicate-argument structure

BEHAVIORAL AND BRAIN SCIENCES (2003) 26:3 279

Page 20: The neural basis of predicate-argument structureinst.eecs.berkeley.edu/~cs182/sp07/readings/Hurford.pdfLogical predicates are expressed in natural language by words of various parts

the speaker, as in “He referred to me as Jimmy.” Manuallypointing to an object, without speaking, might be consid-ered by some linguists and philosophers to be at best a mar-ginal case of referring, especially where the intention is todraw attention of another to the object. But notice how eas-ily this and other originally linguistic terms (“demonstra-tive,” “indexical”) are interpreted when applied to a visual,entirely non-linguistic process:

The visual system . . . needs a special kind of direct referencemechanism to refer to objects without having to encode theirproperties. . . . This kind of direct reference is provided by whatis referred to as a demonstrative, or more generally, an indexi-cal.13 (Pylyshyn 2000, p. 205)

The central idea involved in linguistic and vision-ori-ented and activity-oriented uses of the terms “deictic,” “in-dexical” and “refer” is attention. In all cases, be it a monkeyswivelling its eyes toward a target, an ape grasping for anobject, or a human referring to an object with a demon-strative pronoun, the organism is attending to an object.This is the archetypal sense of “refer-”; the linguist’s pre-ferred usage of “refer-,” involving a speaker, is closer to thearchetypal sense than the twentieth century logician’s, forwhom reference is a relation between words and things,without mediation by any agent’s mind. But the linguist’sand the philosopher’s restriction of “referring” to a neces-sarily linguistic act misses what I claim is the phylogenetic,prelinguistic origin of referring.

Classically, semantics is said to involve a relation betweena representation and the world, without involvement of anyuser of that representation (e.g., a speaker) (Carnap 1942;Morris 1938; 1946). Thus, the relation of denotation be-tween a proper name and its referent, or between a predi-cate and a set of objects, is traditionally the concern of se-mantics. Vision researchers use the term “semantic” withno sense of a relation involving linguistic entities. Jeannerodet al. (1995) identify events in the dorsal stream with prag-matics (though perhaps “praxics” might have been a betterterm) and events in the ventral stream with semantics:

In humans, neuropsychological studies of patients with lesionsto the parietal lobule confirm that primitive shape characteris-tics of an object for grasping are analyzed in the parietal lobe,and also demonstrate that this “pragmatic” analysis of objects isseparated from the “semantic” analysis performed in the tem-poral lobe. (Jeannerod et al. 1995, p. 314)

Likewise Milner and Goodale (1995, p. 88) write of the“content or semantics” of nonverbal interactions with theworld, such as putting an object in a particular place. Fur-ther, “even after objects have been individuated and iden-tified, additional semantic content can be gleaned fromknowing something about the relative location of the ob-jects in the visual world” (Milner & Goodale 1995, p. 88).The central idea linking linguists’, philosophers’, and visionresearchers’ use of “semantic” is the idea of information orcontent. For us modern humans, especially the literate va-riety, language so dominates our lives that we tend to be-lieve that language has a monopoly of information and con-tent. But of course there is, potentially, information ineverything. And since the beginning of the electronic age,we now understand how information can be transmitted,transformed, and stored with wires, waves, and neurons. In-formation about the relative location of the objects in a vi-sual scene, or about the properties of those objects, repre-sented in a perceiver’s brain, has the same essential qualityof “aboutness,” a relation with an external world, that lin-guists and philosophers identify with the semantics of sen-tences. Those philosophers and linguists who have insistedthat semantics is a relation between a language and theworld, without mediation by a representing mind, haveeliminated the essential middleman between language andthe world. The vision researchers have got it more right, inspeaking of the “semantics” of neural representations, re-gardless of whether any linguistic utterance is involved. Itis on the platform of such neural representations that lan-guage can be built.

An evolutionary history of reference can be envisaged, inwhich reference as a relation between the mind and theworld is the original. This history is sketched in Figure 2.

Hurford: The neural basis of predicate-argument structure

280 BEHAVIORAL AND BRAIN SCIENCES (2003) 26:3

WORLD

(presents

objects

and

properties)

1 individual’sprivate mental processes.Dyadic relation (WORLD, MIND).

1 individual’sprivate mental processes,associated public behaviour.Triadic relation (WORLD, MIND, BEHAVIOUR).

1st individual’sprivate mental processes,associated public behaviour.2nd individual’sprivate mental processes.Tetradic relation (WORLD, MIND1, BEHAVIOUR, MIND2).

STAGE 1

STAGE 2

STAGE 3

Figure 2. The evolution of reference. The relationship between mental processes and the world is the original and enduring factor.The last stage is successful reference as understood by linguists, and as manifested by people speaking natural languages. The stages mayoverlap, in that further evolution of one stage may continue to complexify after evolution of a later stage has commenced.

Page 21: The neural basis of predicate-argument structureinst.eecs.berkeley.edu/~cs182/sp07/readings/Hurford.pdfLogical predicates are expressed in natural language by words of various parts

At present, the dual use of such terms as “deictic” and“refer” for both linguistic and visual processes is possibly nomore than a metaphor. The mere intuitive plausibility of theparallels between the visual and the linguistic processes isnot as good as empirical evidence that the brain in some waytreats linguistic deictic variables and visual deictic variablesin related ways. Possibly the right kind of evidence could beforthcoming from imaging studies, but the picture is sureto be quite complicated.

6. Wrapping up

6.1. It could have been otherwise

It could conceivably have been otherwise, both from a log-ical and a biological point of view. Consider, first, alterna-tive biologies. We can conceive of a world in which organ-isms sense the ambient temperature of their surroundingsby a single sensory organ which doesn’t distinguish anysource of radiant heat. Further, such a creature might havea keen sense of smell, and be able to discriminate betweenthousands of categorically different smells assailing its smellorgan. And the creature might have arrays of light detectorsevenly spaced all over its body, all feeding into a single in-ternal organ activated by an unweighted average of the in-puts. Such a creature would have no internal representationof objects, but only a set of “zero-place predicates.” It couldsense “The world outside is in such-and-such a state.” Cer-tainly, the higher animals on planet Earth are not like this,but I would be surprised if some lower animals were notsomewhat like it. It just happens to be the case that the lawsof physics, chemistry, and biology conspire to produce aworld containing discrete categorizable objects, and so, notsurprisingly, but not logically necessarily, advanced crea-tures have evolved ways of dealing with them.

An alternative logic is also easily conceivable, in whichthere is no predicate-argument structure. It already existsin the form of the propositional calculus, typically intro-duced in logic textbooks as a simple step towards the more“advanced” predicate calculus. A propositional calculus,with no predicate-argument structure, would be all that isneeded by the creature described in the previous para-graph.

Here is a final thought experiment. A “Turing robot” isentirely conceivable as a working automaton, capable ofnavigating and surviving in a complex world. Instead ofreading a character on a tape, the Turing robot “reads” apatch of the world in front of it, matching the input to somemonadic symbol occurring in the quadruples of its instruc-tion set. Instead of shifting the tape to right or left, it shiftsitself to an adjacent patch of world, and it can act, onemonadic action at a time, on the patch of world it is lookingat. Given a complex enough instruction set, such a robotcould replicate any of the complex computations carriedout by an advanced real live creature successfully negotiat-ing the world. The Turing robot’s hardware, and the indi-vidual elements of its software instruction set, the basicquadruples, contain nothing corresponding to predicate-ar-gument structure, though it is probable that we could in-terpret some higher-level pattern or subroutine in thewhole instruction set as somehow corresponding to predi-cate-argument structure. The dorsal/ventral separation inhigher mammals is, I argue, an evolved hardware imple-mentation of predicate-argument structure.

6.2. Falsifiability

This article is an instance of reductionism. It takes two pre-viously unrelated fields, logic and neuroscience, and arguesthat what logicians are really dealing with, whenever theyappeal to predicate-argument structure, has a basis inneural processing. This in no way minimizes the validity ofstudies in logic; rather it enhances their validity. Biologistsworking with Mendelian genes without knowledge of DNAwere doing valid work. “Abstract” work on the structure ofhuman thought, and its relationship to language, must con-tinue. But as long as we recognize that the object of study,both in logic and in linguistics, has a psychological basis, oneof us should also work on bridging the gap between theo-retical studies couched in logico/linguistic terminology andempirical studies in psychology and neuroscience. Onlythose who view logical and linguistic structure as Platonic,in some way existing independently of human minds, canignore psychology and neuroscience.

Can a reductionist argument be falsified? Yes. Some pro-posed reductions are just plain wrong, some are well justi-fied, and some are partly right. What justifies a reduction-ist argument is the goodness of fit between the twoindependently established theories. The present argumentwould be invalidated if it could be shown that any of the fol-lowing apply:

The canonical arguments of predicates in logic do not denoteindividual objects.

Canonical predicates in logic do not denote properties.The dorsal stream processes properties at least as much as

the ventral stream.The ventral stream plays a large role in drawing attention to

objects.

I concede that an extreme version of my reductionist pro-posal is falsified in many ways, because, on the logical side,for example, formal semanticists often use nonobject-de-noting terms as arguments of predicates, and on the neu-rological side, for example, some detection of properties isachieved by the dorsal stream. So the fit between the prac-tices of logicians and formal semanticists with predicate-ar-gument structure and the neural facts is not quite perfect.But, I claim, there is enough of a clear parallelism betweenthe two domains to indicate that neuroscience has revealedfacts which significantly inform the domain that logiciansand formal semanticists traditionally deal with. Here againI mention that the brain is vastly more complex than eventhe most baroque of logical formalisms, and that one shouldexpect complexities arising from brain studies that logicalstudies simply do not relate to. A logical formalism relatesto the brain in the same way as a road map relates to a realplace.

6.3. Then, now , and next

The neural correlates of PREDICATE(x) can be found notonly in humans but also in primates and probably manyother higher mammals. Thus, as far as human evolution isconcerned, this form of mental representation is quite“primitive,” an early development not unique to ourspecies. It can be seen as building on an earlier stage (evi-dent, for example, in frogs) in which the only response toan attention-drawing stimulus was some immediate action.A fundamental development in higher mammals was toaugment, and eventually to supplant, the immediate motor

Hurford: The neural basis of predicate-argument structure

BEHAVIORAL AND BRAIN SCIENCES (2003) 26:3 281

Page 22: The neural basis of predicate-argument structureinst.eecs.berkeley.edu/~cs182/sp07/readings/Hurford.pdfLogical predicates are expressed in natural language by words of various parts

responses of a sensorimotor system with internalized, judg-mental responses which could be a basis for complex infer-ential processes working on material stored in long termmemory. Rather than “If it moves, grab it,” we begin to have“If it catches your attention, inspect it carefully and figureout what do to with it,” and later still “If you notice it, re-member what is important about it for later use.”

Simple early communicative utterances could be reportsof a PREDICATE(x) experience. For example, the vervetchutter could signify that the animal is having a SNAKE(x)experience, that is, has had its attention drawn to an objectwhich it recognizes as a snake. Primitive internal represen-tations, I have claimed, contain two elements, a deictic vari-able and a categorizing predicate. Nowhere in natural non-human communication do we find any two-term signals inwhich one term conveys the deictic element and the otherconveys the mental predicate. But some simple sentencesin some human languages have just these elements and noother. Russian and Arabic provide clear examples.

eto celovekDEICTIC MAN “This is a man.” (Russian)di sahlDEICTIC EASY “That is easy.” (Egyptian Arabic)

Even if the internal representations of animals are struc-tured in the PREDICATE(x) form, there would be no evo-lutionary pressure to structure the corresponding signalsinto two parts until the number of possible mental combi-nations of predicates and variables exceeded the total num-ber of predicates and variables, counted separately (Nowaket al. 2000). If the category of things that are pointed to ina given direction is always the same, there is no pressure forthe signal to differentiate the direction from the category.

I have argued that PREDICATE(x) is a reasonableschematic way of representing what happens in an act ofperception. It is another step, not taken here, to show thata similar kind of logical form is also appropriate for repre-senting stored episodic memories. A form in which only in-dividual variables can be the arguments of predicates mightbe too restrictive. Here, let me, finally, mention the “Aris-totle problem.” Aristotle and his followers for the next twomillennia took the basic semantic representation to be Sub-ject 1 Predicate, where the same kind of term could fillboth the Subject slot and the Predicate slot. Thus, for ex-ample, a term such as man could be the subject of The mandied and the predicate of Plato is a man. Kant’s characteri-zation of analytic judgements relies on subject terms beingof the same type as predicate terms. “Analytical judgmentsexpress nothing in the predicate but what has been alreadyactually thought in the concept of the subject, though notso distinctly or with the same (full) consciousness”. (Kant1905, translation of Kant 1783).14 FOPL is more distancedfrom the surface forms of natural languages, and the sameterms cannot be both arguments (e.g., subjects) and predi-cates. It remains to provide an explanation for the typicalstructure of modern languages, organized around theNoun/Verb dichotomy. I suspect that an explanation can beprovided in terms of a distinction between predicates whichdenote invariant properties of objects, such as being a dog,and more ephemeral properties, such as barking. But thatis another story.

NOTES1. The logical formula is simplified for convenience here.2. A complication to this picture arises from work on the recog-

nition of facial expressions by blindsight patients (de Gelder et al.1999; 2000; Heywood & Kentridge 2000; Morris et al. 1999). Fa-cial expressions are complex and are generally thought to requireconsiderable higher-level analysis. Yet detection of facial expres-sions (e.g., sad, happy, fearful, angry) is possible in some blindsightpatients, suggesting that some aspects of this task also are per-formed via a pathway that, like at least one dorsal pathway, by-passes primary visual cortex.

3. Belin and Zatorre (2000) suggest that the dorsal auditorypathway is involved in extracting the verbal message contained ina spoken sentence. This seems highly unlikely, as parsing a sen-tence appeals to higher-level lexical and grammatical information.The evidence they cite would only be relevant to the early pres-sure-sequence-to-spectrogram stages of spoken sentence pro-cessing.

4. Landau and Jackendoff (1993) is a more detailed version ofJackendoff and Landau (1992); I will refer here to the later paper,Landau and Jackendoff (1993).

5. Bertrand Russell at times espoused the view that particularsare in reality nothing but bundles of properties (Russell 1940;1948; 1959). See also Armstrong (1978). There is also a phenom-enalist view that “so-called material things, physical objects, arenothing but congeries of sensations” (Copi 1958).

6. Egly et al. (1994) state:

We found evidence for both space-based and object-based components tocovert visual orienting in normal observers. Invalid cues produced a costwhen attention had to be shifted from the cue to another location withinthe same object, demonstrating a space-based component to attention.However, the costs of invalid cues were significantly larger when attentionhad to be shifted an equivalent distance and direction to part of anotherobject, demonstrating an object-based component as well. (p. 173)

This again conflates attention-shifting, a preattentive (and post-attentive) process, with attention itself. These experiments relateonly to attention-shifting, as the title of Egly et al.’s (1994) articleimplies. (Further, it would be interesting to know whether the dis-tribution of RTs for the invalidly cued “within-object” attentionshifts was in fact bimodal. If so, this could suggest that subjectswere sometimes interpreting the end of a rectangle as a differentobject from the rectangle itself, and sometimes not. In this case,the responses taken to indicate a space-based process could in facthave been object-based.)

7. There is presumably a complex ecological balance betweenthe information carried by a mental predicate and its frequency ofuse in the mental life of the creature concerned. Complex rela-tions, if occurring frequently enough, might be somehow com-pressed into unitary mental predicates. An analogous case in lan-guage would be the common compressing of CAUSE(a,PRED(b) ) into a form with a single causative verb.

8. The claim in the text is not about memory limitations in-volved in parsing linguistic strings; it is about how many argumentsthe elementary propositions in the mind of a prelinguistic creaturecould have.

9. Both third and fourth person space in BSL are like availablepronouns for entities being signed about, other than the speakeror hearer. It is not that BSL has four grammatical persons in thesense that English has three (1st – speaker, 2nd – hearer, 3rd – allother entities).

10. This formula, like any FOPL formula, conveys no tempo-ral transitions. Tense logic is more complicated than FOPL.

11. See Batali (2002) for a computer simulation of the emer-gence of public language from representations of exactly thisform.

12. Agre and Chapman (1987) do not, as stated by Ballard etal. (1995), use the term “deictic.”

13. Indeed, this quoted sentence contains the stem “refer-”four times, three times alluding to a visual process and once to alinguistic convention; probably few readers remark on the coinci-dence as in any way disturbing.

Hurford: The neural basis of predicate-argument structure

282 BEHAVIORAL AND BRAIN SCIENCES (2003) 26:3