
Language Sciences 34 (2012) 604–618

Contents lists available at SciVerse ScienceDirect

Language Sciences

journal homepage: www.elsevier.com/locate/langsci

The conjectured role of Polani et al.’s relevant information, behavioral variation and recursive cognition in selection for a human language faculty

James Goodman
Metropolitan Remand and Reception Centre, NSW Dept. of Corrective Services, Australia

Article info

Article history:
Received 25 September 2011
Received in revised form 15 January 2012
Accepted 17 February 2012
Available online 19 March 2012

Keywords:
Language
Information theory
Evolutionary psychology
Game theory

Abstract

The speculative argument presented in this review is based on the assumption that Polani et al.’s formalization will limit communication to the minimal amount of information needed to employ adaptive behavior. Selection for some of the distinct features of human language is argued to generally depend on relevant information, behavioral variation, and recursive aspects of cognition. Behavioral variation is argued to cause the perception of relevant objects to vary between individuals, thereby favoring selection for phenotypes with a greater referential signaling capacity in cooperative contexts. If the memory-dependent aspects of recursive cognition reflect the perception of relevant fitness problems over wide space–time intervals, then individual difference in behavior will also unevenly distribute the perception of relevant objects between agents in a similar manner, even if similar relevant objects are perceived. Where individual fitness is highly dependent on the local coordination of behavior between agents in the social structure of an interaction network, an unevenly distributed perception of different relevant objects in space–time will then increase interaction uncertainty and the behavioral error potential in the ongoing local coordination of interdependent behavior. The extent to which a discrete message capacity can evenly distribute information in communicative interactions is argued to depend on the recursive capacity of language to referentially pinpoint the coordinates of discrete referential objects in continuous intervals. Given asymmetric interaction valuation between individuals in a diversified social ecology, an evenly distributed perception of relevant objects is argued to be limited by the referential coding efficiency problem of differences in the attention paid to communication effort, thereby indicating the stable role of recursive cognitive inference.

© 2012 Elsevier Ltd. All rights reserved.

0388-0001/$ - see front matter © 2012 Elsevier Ltd. All rights reserved.
http://dx.doi.org/10.1016/j.langsci.2012.02.002

E-mail address: [email protected]

1. Introduction

Communication appears to be very important to most forms of life, even plants and bacteria. For instance, airborne chemical signals are employed between plants as adaptations to herbivory (Heil and Karban, 2009), and micro-organisms employ long-range chemical signaling and chemotactic signaling to enable a cooperative self-organized response to adverse conditions (Ben-Jacob et al., 2000). At the other extreme, human communication has evolved the complex use of a wide variety of transmission devices such as written grammar, gestural signing (Corballis, 2009a,b,c), sound production structures, and body language.

In particular, some of the unique features of human language include morphological and phonological complexity, the continuous number of signals that can be produced by syntactic recursion (Hauser et al., 2002), as well as the large number of discrete references that recursion is argued to depend on (Jackendoff and Pinker, 2005). Despite a relatively greater complexity, human language does share a common ground with all other forms of communication between life forms, which is what it acts upon, and why it is employed. Regardless of simplicity or complexity, the information that is encoded and decoded in communication can be argued to be information that operates on behavior, in terms that are of adaptive significance to the individual.

Information and communication can significantly influence the reproductive fitness of organisms (Polani, 2009; Scott-Phillips, 2007). Although organisms such as plants or bacteria do not require a central nervous system to act on information, vertebrates show a relatively greater size and complexity of their central nervous systems (Dawkins, 2009; Oxford Dictionary of Biology, 2008), and it would follow that vertebrate fitness has a greater behavioral sensitivity to information quantity, complexity, diversity and variation. Communication deals specifically with interactive behavior, so it is not unreasonable to assume that the complexity of human language depends on human interactive behavior to exist. But exactly how are the unique features of human communication related to interactive behavior, and why is communication as rich and diverse in information as human language not selected as a trait in other animals? For that matter, why communicate at all in the first place?

The study of human language has recently seen an increasing emergence of research associated with these types of questions, with a trend towards an interdisciplinary approach. Jones (2010) provides comparative evidence that variation in the ranking of underlying constraints that operate on the grammatical structure of kinship terminology between English and Seneca is also constant in other domains, and argues that grammar is a uniquely human adaptation for sophisticated coordination games. Empirical work by Ryabko and Reznikova (2009) has employed information theory to successfully predict relations between language, cognitive skills and interactive behavior in ants. In linguistics and social ecology, statistical analysis by Lupyan and Dale (the Language Niche Hypothesis, 2010) on over 2000 languages and demographics has identified a number of dependent relations between the structure of language and its social environment. Human language, it seems, may very well be determined by behavior.

In seeking to justify the view that the uniqueness of human language serves an adaptive function, this review attempts to partially outline the nature of an evolutionary pressure that could possibly lead to selection for a human language faculty, by exploring a number of consistent inferences that can be drawn between the theoretical and empirical output of research in this very diverse field. By drawing attention to these apparent consistencies, this speculative discussion aims to provide a multi-disciplinary argument that may warrant further investigation. Although this discussion is constrained within a very limited number of subjects, it is hoped that aspects of the argument are robust enough to identify common ground that may be of interdisciplinary relevance to other fields of language research.

2. Communication and behavior in relation to game theory

A game requires the interdependence of strategies (Hirschey, 2009), where an individual’s payoff for selecting a behavior depends on the behavior selected by each individual, with strategies generally classified as either cooperative or competitive. Individuals interact by each selecting a strategy, and the interaction results in a payoff to each agent: in evolutionary games, an individual’s payoff outcome for selecting a competitive or cooperative action is proportionate to reproductive fitness (Nowak et al., 2010). In signaling games, Jager summarizes the common theme of the models so far (as of 2008): there is a sender and receiver, the sender has private information about an event that the receiver does not have, events have a probability distribution, the sender transmits a signal to the receiver, the receiver selects an action that may depend on the observed signal, and “the utilities of sender and receiver may depend on the event, the signal and the receivers action” (Jager, 2008), implying some measure of cooperative circumstance. Although the evolutionary conditions for communication are argued to generally require stable cooperative interaction (Scott-Phillips, 2010; for a comprehensive review of information theory and non-zero-sum logic in animal communication that is independent of this review, see Penteriani, 2010), human language is also observed in a very wide range of competitive and exploitative contexts, and is therefore difficult to conceptually limit to cooperation. An examination of communication in competitive interaction may therefore serve to narrow down the role of cooperation somewhat.
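The signaling-game ingredients summarized above (events with a probability distribution, a sender map from events to signals, a receiver map from signals to actions, and a utility that depends on the event and the action) can be illustrated with a minimal sketch. The events, prior, signal names and the shared 0/1 payoff are all hypothetical choices made for illustration; they are not taken from Jager (2008) or from this review.

```python
from itertools import product

# Hypothetical two-event, two-signal, two-action game (illustrative values only).
EVENTS = ["predator", "food"]
PRIOR = {"predator": 0.3, "food": 0.7}   # assumed event probabilities
SIGNALS = ["a", "b"]
ACTIONS = ["flee", "approach"]


def utility(event, action):
    """Assumed shared payoff: 1 if the receiver's action suits the event, else 0."""
    suitable = {"predator": "flee", "food": "approach"}
    return 1.0 if suitable[event] == action else 0.0


def expected_payoff(sender, receiver):
    """Expected payoff for a sender map (event -> signal) and a receiver map (signal -> action)."""
    return sum(PRIOR[e] * utility(e, receiver[sender[e]]) for e in EVENTS)


def all_maps(domain, codomain):
    """Enumerate every deterministic map from domain to codomain."""
    return [dict(zip(domain, image)) for image in product(codomain, repeat=len(domain))]


# Exhaustive search for a sender/receiver pair that maximizes the shared payoff.
best = max(
    ((s, r) for s in all_maps(EVENTS, SIGNALS) for r in all_maps(SIGNALS, ACTIONS)),
    key=lambda pair: expected_payoff(*pair),
)
```

In this toy setting a sender that uses a different signal for each event allows the receiver to act correctly every time, whereas a sender that pools both events under one signal leaves the receiver no better off than acting on the prior alone.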

For instance, human language has the code-like exclusionary property of being mutually incomprehensible between some 6000 human groups, thereby reinforcing within-group cooperation while at the same time repelling outsiders (Corballis, personal communication, 2011). Communication also occurs in direct competition, where qualities such as aggression are signaled during contests, and can occasionally be observed to result in a competitor conceding the contested resource without a fight; signaling in competitive contexts would therefore seem to be clearly adaptive. However, differences in the employment of aggressiveness signaling during a competitive interaction appear to be less related to fighting ability, and more related to asymmetries in long-term resource valuation (Hurd, 2006). Why then would aggressiveness signaling be adaptive if it makes little difference to the ability to win an all-out fight?

McNamara and Leimar (2010) argue that aggressiveness communication can be viewed as a form of cooperation in cost-benefit appraisal, where competitors are interested in minimizing the potential costs of an all-out fight by informing each other of their respective formidability. This contrasts with Hurd’s formalization that aggressiveness signaling is a “non-binding probabilistic promise of future action” that informs each competitor of their respective valuation of the contested resource. A fight may result in serious physical and social costs to participants, regardless of whether they win or not (Campbell, 2005). Males in general will encounter more rivals over their lifetime, but have a greater number of reproductive opportunities, so the potentially serious cost of a fight for each encounter is solved by signaling the willingness to fight (Elias et al., 2010, p. 873) for the resource.

Communication can also improve cheating performance against others in cooperative relationships, and this is evidenced by our communicative range for deception and manipulation (Buss, 1996). Despite this capability, stable cheating is strongly argued to depend on the condition that the benefits of cheating do not crowd out cooperation, producing a Red Queen arms race that selects for counter-adaptations to cheating, and cheating adaptations to counter counter-cheating adaptations, and so on (Ridley, 1993; Buss and Duntley, 2008, p. 57). It may then be possible that cheating and deception can increase the sophistication of language; however, that is beyond the scope of this paper.

In predation problems, anti-predator competition between gazelles can select for communication to predators. When a cheetah monitors a herd of gazelles, members of the herd will jump in the air and signal to the cheetah the relevance of physical condition to possible running speed (Miller, 2003). The absolute height of the jump is not important; rather, it is whether or not the gazelle can jump higher than others in the herd, thereby increasing its odds of survival by improving the cheetah’s ability to target lower-cost prey (Miller, 2003).

Some degree of cooperative interdependency can therefore be argued to be responsible for communication in competitive contexts, and can imply that a decrease in cooperation will decrease communication. If stable cooperative dependence between organisms is indeed a general prerequisite, then situational cues that indicate an increased competitive relationship with another social agent may presumably cause relatively less communication to be employed over time, holding the level of cooperative interdependence constant. The influential strength of cooperation on signaling could also be indicated somewhat by communication in uncertain conditions; a human community will contain a complex assortment of cooperative and competitive relationships, where any two community members cooperate in some relationships, and compete against each other in others, i.e. “frenemies” (Weir, 2011). If an individual faces a large number of different relationships that can change in proportion to the change in social dynamics, then a degree of uncertainty can be associated with the nature of a relationship with another agent, hence the presence of risk in social dynamics.

In this context, a communication bias can be predicted by error management theory (EMT; Haselton and Buss, 2000), which holds that decision-making in uncertain conditions indicating a trade-off between behavior errors will select for a bias to commit the least costly error. Haselton and Buss provide empirical evidence supporting this theory in optimistic male over-perception of female sexual interest, and pessimistic female under-perception of male commitment intent. This type of bias is being increasingly found in a very wide range of contexts, from overreaction to looming sounds (Haselton et al., 2009) and future discounting (Daly and Wilson, 2005; Haselton and Nettle, 2006) to anxiety and allergies (Nesse, 1991), all of which incorporate a degree of uncertainty.

If competitive interdependence generally selects against communication, then error management will predict a communication bias when an agent is uncertain of a cooperative relationship, on the condition of a cost trade-off between competition and cooperation. For instance, it may be costly to act on a possibly deceptive message transmitted by someone who may have competitive intentions, and signaling to a competitor can be costly if the message allows the competitor to act better. However, if a relevant message is not signaled to a possible cooperator, or if a relevant message from a cooperator is not acted on, then an interaction problem may not get solved, and reduced cooperation or punishment may be incurred over future interactions with others (Jensen, 2010).

Therefore, stable cooperation as a prerequisite for communication should be indicated to some extent by error management theory. EMT would predict an under-communication bias given situational cues that indicate an uncertain cooperative relationship and a higher cost for cooperating with a potential competitor; this includes a bias not to act on messages transmitted by a possible competitor, and a bias to transmit fewer messages that are relevant to the potential competitor. On the other hand, an over-communication bias would be predicted by cues indicating an uncertain cooperative relationship with a higher cost potential for not cooperating. This discussion will now proceed under the assumption that stable signaling will generally require some degree of cooperative interdependence between interacting organisms.
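The error management prediction above can be sketched as a comparison of expected error costs. The function names, the probability of facing a cooperator, and the two cost parameters are hypothetical and serve only to make the asymmetry explicit; this is a minimal illustration, not a model drawn from Haselton and Buss (2000).

```python
def expected_cost(p_cooperator, cost_if_cooperator, cost_if_competitor):
    """Expected cost of a policy when the partner's type is uncertain."""
    return p_cooperator * cost_if_cooperator + (1 - p_cooperator) * cost_if_competitor


def preferred_bias(p_cooperator, cost_missed_cooperation, cost_exploited):
    """Return the bias with the lower expected error cost.

    Assumptions (illustrative only):
    - communicating is an error only if the partner is a competitor (cost_exploited);
    - withholding is an error only if the partner is a cooperator (cost_missed_cooperation).
    """
    communicate = expected_cost(p_cooperator, 0.0, cost_exploited)
    withhold = expected_cost(p_cooperator, cost_missed_cooperation, 0.0)
    return "over-communicate" if communicate < withhold else "under-communicate"
```

With the partner's type maximally uncertain (a probability of 0.5), the predicted bias in this sketch depends only on which error is more costly, which is the asymmetry the argument relies on.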

3. A language faculty in relation to several information theoretic concepts

As encoded information, the written structure and phonological structure of language are communicated in discrete sequence from a finite set of symbols, governed by a set of probabilities (Pinker and Jackendoff, 2005; Shannon, 1948). The outcome of this stochastic process is the ability to transmit a signal from an information source through an encoding transmitter, channel, and decoding receiver, to a destination. In order for this communicative component of a human language faculty to function, the message selected at the information source and reproduced at the destination must be the same signal from a finite set of possible signals at each point (Shannon, 1948).

The semantic information referred to by an encoded message is entirely distinct from the information communicated between the message capacities of the information source and destination (Shannon, 1948); the referential object(s) are not physically communicated (this would require mental telepathy). Rather, the referential information in the cognitive structures of the central nervous system will need to be matched to the selected/reproduced signal, which will demand additional information processing. Therefore, the information-processing capacity of the physical structures and neurological operations associated with a human language faculty would need to encompass the components of the general communications model, and the matching of referential objects in the CNS to the message capacity of the faculty.
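The separation described above, between reproducing a signal and matching it to a referent, can be sketched as follows. The signal set, the two-bit code, and the sender and receiver referent tables are hypothetical; only the source, transmitter, channel, receiver and destination structure follows the general communications model.

```python
import math

# Hypothetical finite signal set shared by source and destination.
SIGNALS = ["hawk", "snake", "food", "water"]
ENCODE = {s: format(i, "02b") for i, s in enumerate(SIGNALS)}   # transmitter code
DECODE = {bits: s for s, bits in ENCODE.items()}                # receiver code


def entropy_bits(probabilities):
    """Shannon entropy (in bits) of a discrete probability distribution."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)


def transmit(signal, channel=lambda bits: bits):
    """Source -> transmitter -> channel -> receiver -> destination (noiseless by default)."""
    return DECODE[channel(ENCODE[signal])]


# Referents are never transmitted: each end holds its own signal-to-referent table
# (assumed here), and matching a signal to a referent is a separate processing step.
SENDER_REFERENT = {"hawk": "raptor overhead", "snake": "snake in grass",
                   "food": "fruiting tree", "water": "stream"}
RECEIVER_REFERENT = dict(SENDER_REFERENT)


def interpret(signal):
    """Receiver-side lookup of the referent matched to a reproduced signal."""
    return RECEIVER_REFERENT[transmit(signal)]
```

Four equally likely signals carry two bits per message in this sketch, regardless of what the signals refer to; the referent lookup is additional work performed at each end.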


In that highly restrictive context, an increased language faculty is approached throughout this discussion as the information-processing capacity to communicate any message from a larger number of possible messages, where an increased signaling capacity also depends on the capacity to refer to a proportionately large number of perceivable objects that inform behavior. At this point, a conceptual distinction should be drawn between the use of the term language and the term message capacity. The number of words in the English dictionary is approximately six hundred thousand (Prokopenko et al., 2011), whereas the message capacity of an average literate person is estimated to be around fifty thousand words and concepts (Pinker and Jackendoff, 2005; Corballis, 2009a,b,c). In applying game theory and information theory to language evolution, Nowak et al. (2002) assume that “from a biological perspective, language is not the property of an individual, but the extended phenotype of the population”. Although the language of a population may indeed evolve in the manner of Dawkins’ extended phenotype concept, from a biological perspective it would be a very big leap to assume that language cannot be the output of biological adaptation. With that in mind, this discussion argues selection for language in terms of the individual’s capacity to employ language.

4. Relevant information and behavior

The formalization and empirical testing of “relevant information” (Polani et al., 2001, 2006; Polani, 2009; Salge and Polani, 2009a,b, 2011; Van Dijk et al., 2010) demonstrates the critical significance of information to adaptive behavior. The environment of any organism will contain problems that will affect the fitness of the organism, selecting for an information-processing capability to perceive these problems and produce an output of problem-solving adaptive behavior. This circular process between the agent and its environment is referred to as the perception–action loop, where an “agent interacts with its environment by perceiving a sensor input from a set of sensor input states, and selecting an action from a set of output actions” (Salge and Polani, 2011). Although an increase to sensory information-processing will increase the performance and utility of behavior with respect to the valuation of a problem, the increasing fitness cost of information-processing will be traded off against the corresponding utility gain of the problem-solving behavioral output, where the optimal trade-off is balanced between information-processing cost and behavior utility (Polani, 2009). As a result, the information that is processed will be limited to the minimal information that an organism needs to employ the adaptive strategy (Polani et al., 2001). Therefore, selection will evolve organisms to only process enough information needed to solve problems in their environment with behavior, holding phenotype variation at the optimal trade-off point constant.
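The trade-off between information-processing cost and behavioral utility can be illustrated with a toy calculation. This is not Polani et al.’s formal definition of relevant information; the saturating utility curve, the linear cost per bit, and all parameter values are assumptions chosen only to show that net benefit peaks at a finite amount of processed information.

```python
import math


def utility(bits, u_max=10.0, rate=0.8):
    """Assumed saturating utility of behavior as more sensory bits are processed."""
    return u_max * (1 - math.exp(-rate * bits))


def net_benefit(bits, cost_per_bit=1.0):
    """Utility of behavior minus an assumed linear fitness cost of processing."""
    return utility(bits) - cost_per_bit * bits


# Coarse grid search over candidate amounts of processed information (0.0 to 10.0 bits).
candidates = [b / 10 for b in range(0, 101)]
optimal_bits = max(candidates, key=net_benefit)
```

Under these assumed curves, processing more than the optimum still raises utility slightly but lowers net benefit, which is the sense in which processing beyond the minimal information needed would be selected against.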

When tested by the evolution of artificial agents in a multi-agent foraging scenario, Salge and Polani (2009a,b, 2011) provide evidence that the “digested” relevant information encoded in the adaptive behavior of others will provide a rudimentary motivation for individuals to socially interact and observe each other, and at least decode the information present in others’ actions. Digested information is produced by behavior, without any intention to communicate (Salge, personal communication, 2011); for instance, the performance of bee foraging behavior increases once a hive of bees takes up residence within the vicinity of another hive (Danchin et al., 2004). By interacting with the environment, the actions of an agent will encode a higher concentration of relevant information than their environment; as the performance of the organism increases, the density of relevant information in the organism’s actions also increases (Salge and Polani, 2011).

This example serves to illustrate the value of information to behavior: as a means to improve the utility of a behavior, an organism will seek to maximize the information processed by its sensors. However, the utility of any increased performance of behavior will sharply diminish as the cost of sensory information-processing passes the optimal trade-off with gains to behavioral utility; phenotype variation at the optimal trade-off point provides the means for a species to evolve either way, either to realize an evolutionarily successful behavior or to diminish a less successful one (Polani, 2009). This fundamental parsimony principle of perception–action, which is proposed by Polani et al. to be universal, holds that organisms will evolve to balance the trade-off between the utility of behavior and the fitness cost of maintaining the sensory information channel. Therefore, information processing is costly, and the cost is traded off against behavior; as a result, an organism will only perceive what it can regulate with behavior.

By way of analogy, Dorner (1996) points out that “nowhere in nature does a creature run around on three legs but drag along a fourth, perfectly functional but unused leg.” For instance, Dukas (2004) argues that the attention paid to a foraging task is limited in part by the relative amounts of conspicuous and cryptic resources in the environment. Focused attention appears to require more information-processing to catch cryptic prey, and divided attention appears to require less information-processing to catch conspicuous prey. Less costly limited attention can depend in part on a relatively greater quantity of a conspicuous resource, and likewise selection for more costly focused attention can depend on a relatively greater quantity of a cryptic resource (Dukas, 2004). The extra neural matter required to catch more cryptic prey will be selected against if wandering around and consuming encountered prey provides relatively better returns, partially indicating why carnivores tend to have larger brains than herbivores.

This information-processing cost of attention can be contextually related to least effort communications in a similar manner. In communicative interactions, speaker effort is lower if message ambiguity is higher, where a greater number of referential objects per message gives a lower probability of one encoded message referring to one particular object. Encoding effort will decrease as messages become more ambiguous and referentially useless, given by an increased number of referable objects per message, i.e. approaching referential uselessness (Prokopenko et al., 2011, p. 5). On the other hand, listener effort is lower if message ambiguity is lower, where a smaller number of referential objects per message gives a higher probability of one decoded message referring to one particular object. Decoding effort will decrease as messages become less ambiguous and less referentially useless, given by a decreased number of referable objects per message, i.e. approaching an indexical reference of one object per message (p. 5).
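The opposing speaker and listener efforts described above can be made concrete with an entropy-based sketch. It assumes one common formalization from the least-effort literature, in which speaker effort is the entropy of signal use and listener effort is the conditional entropy of objects given a signal; that choice, the uniform distribution over objects, and the example mappings are assumptions made here for illustration rather than details taken from this review.

```python
import math
from collections import Counter


def efforts(object_to_signal):
    """Return (speaker_effort, listener_effort) in bits.

    Assumes each object is equally likely and is named by exactly one signal.
    speaker effort  = H(S): entropy of the signals the speaker must choose among;
    listener effort = H(R|S): remaining uncertainty about the object given the signal.
    """
    n = len(object_to_signal)
    counts = Counter(object_to_signal.values())
    speaker = -sum((c / n) * math.log2(c / n) for c in counts.values())
    listener = sum((c / n) * math.log2(c) for c in counts.values())
    return speaker, listener


def combined_cost(object_to_signal, weight):
    """Weighted trade-off: weight on listener effort, (1 - weight) on speaker effort."""
    speaker, listener = efforts(object_to_signal)
    return weight * listener + (1 - weight) * speaker


# Hypothetical extremes: one signal for everything vs. one signal per object.
fully_ambiguous = {"o1": "s1", "o2": "s1", "o3": "s1", "o4": "s1"}
indexical = {"o1": "s1", "o2": "s2", "o3": "s3", "o4": "s4"}
```

For four objects, the fully ambiguous mapping costs the speaker nothing and the listener two bits, while the indexical mapping reverses the two costs, matching the direction of the trade-off described in the text.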

Therefore, the cost of the effort required to encode a more cryptic (more ambiguous) message will be less than the cost of encoding a more conspicuous (less ambiguous) message. On the other hand, the effort required to decode a more cryptic message will have a higher cost than decoding a more conspicuous message. If signals are presumed to be referentially limited to perceivable objects that are directly or indirectly relevant to interdependence, then information parsimony would imply that the attention paid to speaking and listening effort can reflect the perceived value of the interdependent problem that is subject to referential signaling in the communicative interaction.

If this is presumed for social interaction, then an individual who perceives a greater valuation in an interdependent problem would invest more coding effort into referentially matching perceivable objects to actuated signals and perceived signals, and an individual who perceives a lower valuation would invest relatively less coding effort into matching objects to signals. An individual with more at stake would therefore seem to decode and encode with more effort than another agent with less to gain or lose. The trade-off in coding effort will then be more pronounced when the fitness of one organism is more dependent on an interdependent problem than that of another organism, i.e. if the interaction valuation is asymmetric. In terms of referential ambiguity, a greater perceived valuation would then imply dependence on any object from a smaller number of possible objects per message, and a lower perceived valuation would imply dependence on any object from a larger number of possible objects per message.

The coding effort applied to signal actuation and signal perception may then be inferred to reflect the degree of sensitivity to the particular interdependent behaviors that may possibly be employed with respect to a message, where increased ambiguity gives a lower probability that a signal will refer to a particular object, and where each perceivable object can inform behavior. For instance, an agent who invests less effort into referentially encoding or decoding a message would depend on any behavior from a larger number of possible interdependent behaviors per message, and would thereby be less concerned with the particular strategies that may be actuated by themselves or other interdependent agents with respect to the signal. Conversely, the individual with the relatively greater valuation would depend on any behavior from a smaller number of interdependent behaviors per signal, and would thereby be relatively more concerned with the particular strategies that can possibly be employed per signal.

If this line of reasoning holds, then an interdependent problem that differs in value between communicating organisms can reflect differences in the attention paid to coding effort; a greater subjective fitness value would therefore seem to demand a greater referential specification of interdependent behavior per signal. In certain contexts, the trade-off in communication effort will manifest in phenomena such as Zipf’s power law, where the frequency of a word is inversely proportional to its rank in frequency (Yang, 2010). In addition to being observed in the study of other animals, this coding efficiency principle has also been found in the vocal communication of the Formosan macaque, indicating that similar selective pressures can operate on communication in different primate species (Semple et al., 2010). The coding effort trade-off behind Zipf’s law is formalized by Obst et al. (2009) to be the origin of scaling in genetic coding, as an optimal solution to maximize the referential power of genetic codes under the constraint imposed by the cost trade-off in translation effort between codon usage and amino acid translation, where referential accuracy is constrained between indexical reference systems and referentially useless systems (Obst et al., 2009).
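The rank–frequency relation stated above, where frequency is inversely proportional to rank, can be written out as a short sketch. The function name, the vocabulary size and the exponent parameter are illustrative assumptions; an exponent of 1 corresponds to the inverse proportionality described in the text.

```python
def zipf_frequencies(vocabulary_size, exponent=1.0):
    """Normalized frequencies for ranks 1..vocabulary_size under f(rank) ~ 1 / rank**exponent."""
    weights = [1 / rank ** exponent for rank in range(1, vocabulary_size + 1)]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical five-word vocabulary, ordered from most to least frequent.
frequencies = zipf_frequencies(5)
```

Under this relation the most frequent word occurs twice as often as the second-ranked word and three times as often as the third, whatever the vocabulary size.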

By approaching least effort communications as a continuous optimization problem, Prokopenko et al. (2011) demonstrate by simulation that Zipf’s power law will occur in the vicinity of the phase transition of the cost trade-off between speaker effort and listener effort. Prokopenko et al. also determine that Zipf’s law is not a lone coding efficiency law, as an inverse-factorial law is found to be more dominant and widespread, occurring more regularly at the phase transition of referential accuracy instead of in the vicinity of it. It can then be said that cost-minimization mechanisms do operate on the coded communication of referential objects, and the utility of referential accuracy can seem to be balanced against the information-processing cost of speaking and listening effort.

In summary, relevant information holds that sensory information-processing is strictly limited to problems solvable by behavior, and an organism will only evolve the capacity to perceive objects that are relevant to adaptive behavior. Signaling between organisms deals specifically with the actuation and perception of sensory information, and is thereby directly related to adaptive behavior. Information-processing is costly, so the biological capacity to referentially encode and decode perceivable objects to and from any signal from a set of possible signals would also be costly, in which case it is certainly possible that signaling between humans can be limited in reference to problems that are solvable by behavior between humans. If human communication is then presumed to be limited to relevant information and interdependent behavior, then the referential message capacity of a language faculty can be limited to the minimal selection of messages needed to solve interaction problems with others. Although incorporating this type of assumption may appear to be extremely reductive, especially so given the broad scope of language, this discussion will now attempt to point out that it may not necessarily contradict the complexity of language.

5. Relating a message capacity to the number of discrete behaviors

There are presumed to be as many behavioral adaptations as there are problems solvable by behavior (Buss and Greiling, 1999; Tooby and Cosmides, 2005). If the variety of discrete behaviors reflects the variety of problems that are solvable by behavior, and the referential objects of signals are directly or indirectly limited to interdependent behavior, then the variety of an organism’s selection of messages can be constrained by the variety of behaviors needed to solve interdependent problems, where the cost of sensory information-processing is limited by the utility of interactive behavior. Essentially, a large number of discrete messages would then imply a large number of discrete interactive behaviors. The underlying principle here is representative of the law of requisite variety (Ashby, 1956).

Requisite variety holds that a regulator’s (agent’s) ability to reduce the variety in its environment cannot exceed the agent’s variety of possible actions, where the term variety corresponds to the term entropy in Shannon’s work (Ashby, 1956). In other words, “only variety (entropy) in actions can reduce the variety (entropy) in the environment” (Klyubin et al., 2007). This law generally represents the perception–action loop of a controller with its environment, through the sensors and actuators (behaviors) of the organism. Actuator variety alone is insufficient for ongoing regulation; an organism’s ability to regulate its environment will also depend on the available sensory information (Polani, 2009). Consequently, an organism will only be able to regulate the variety that it can perceive; in order to “reduce the entropy (variety) in the environment by a certain amount, an organism has to first acquire that amount of information from the environment” (Polani, 2009). The capacity to solve a given variety of interactions with other social agents therefore implies a perception–action loop between an organism and its interdependent environment. Similarly, a referential actuation capacity in signaling would then seem to depend on a referential perception capacity, holding phenotype variation constant, in which case a communication faculty can evolve in terms of a signal actuation–perception loop between agents, where the optimal signal capacity can at least depend on the variety of interdependent behavior.
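
The two constraints discussed above can be stated compactly. The notation is assumed here for illustration only: in logarithmic measure, $V_D$ is the variety of disturbances in the environment, $V_R$ the variety of the regulator's actions, and $V_O$ the residual variety of outcomes; $\Delta H$ is the entropy reduction achieved in the environment and $I$ the information acquired through the sensors. The first inequality is the commonly quoted logarithmic form of Ashby's law, and the second is a restatement of the sentence quoted from Polani (2009).

```latex
V_O \;\ge\; V_D - V_R ,
\qquad
\Delta H \;\le\; I
```

Read together, the inequalities say that regulation is bounded both by what the agent can do and by what it can perceive.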

Signal variety constrained by interaction variety can be argued in terms of information parsimony. If an interdependent problem has been presented between agents, and the agents are memoryless (they have no recollection of the behavior of other agents in similar circumstances), and a variety of interdependent behaviors can be applied to the problem, then the advantage of a message capacity with a similar variety becomes apparent. It would follow that an interdependent problem and a large variety of possible behaviors may then require the referential coding capacity for a similarly large variety of possible messages. If the message variety of a communication faculty is smaller than the variety of possible interdependent actions, then phenotype variation would seem to favor the reproduction of individuals with a larger message capacity. On the other hand, if the message variety is greater than the variety of interdependent behavior, then no further interactive regulation can be gained, the information-processing cost of the excess message capacity will be selected against, and individual variation will favor phenotypes with less costly message capacities; this analogy describes the parsimony principle of relevant information.
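
The parsimony argument can be sketched as a toy calculation. The linear benefit per informed behavior and the linear cost per message are assumptions made here purely for illustration, as are the names `net_payoff` and `best_capacity`; nothing in the cited literature fixes these functional forms.

```python
def net_payoff(n_messages, n_behaviors, benefit=1.0, cost_per_message=0.2):
    """Toy payoff: benefit from informed behaviors minus the cost of message capacity."""
    # Messages beyond the number of behaviors cannot inform any further behavior.
    informed = min(n_messages, n_behaviors)
    return informed * benefit - n_messages * cost_per_message

def best_capacity(n_behaviors, max_messages=50, **kwargs):
    """Return the message capacity in 0..max_messages with the highest toy payoff."""
    return max(range(max_messages + 1),
               key=lambda m: net_payoff(m, n_behaviors, **kwargs))
```

Under these assumptions, and provided the cost per message is positive but smaller than the benefit, the best capacity equals the number of interdependent behaviors: a smaller capacity forgoes benefit, and a larger one only adds cost.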

Furthermore, the research output of evolutionary psychology strongly indicates a very large human distribution of species-typical behavioral adaptations. Non-social environmental problems appear to only account for a small proportion; the remaining behaviors seem to represent a large variety of interdependent relationships between individuals in a human community (Buss, 1996). If our variety of interdependent behavior is species-typical, then our capacity to perceive the corresponding variety of relevant interdependent objects will also be species-typical. If a referential message capacity reflects the ability to signal the variety of perceivable objects needed to inform interdependent behavior, then the human capacity for language can imply a large number of interdependent behaviors. In this parsimonious contextualization, the information-processing cost of a referential message capacity would seem to depend on the number of discrete messages needed to referentially inform a proportionate number of interdependent behaviors through signal actuation and signal perception.

However, Hypothesis 3 of Hauser et al. (2002) states that “only the faculty of language in the narrow sense (FLN) is uniquely human”, where “FLN comprises only the core computational mechanisms of recursion as they appear in narrow syntax”. Recursive syntax can generate an unlimited number of phrase variations (“discrete infinity”), where recursion in syntax is “largely based on long-distance dependencies that bracket phrases embedded into larger phrases” (Aboitiz et al., 2010), which seemingly indicates no relation between FLN and discrete behavior. Aboitiz et al. provide strong evidence that recursive syntax, and language in general, was made possible by a powerful short-term memory circuit associated with linguistic processing, a uniquely human neural innovation referred to as the phonological loop.

This elaborate auditory-vocal short-term memory circuit is derived from the “auditory prefrontal networks that preexist in the non-human primate”, and “overlaps and possibly coevolved with a more ancient circuit involved in hand manipulation and gesture coding in the mirror neuron network” (Aboitiz et al., 2010). The auditory-vocal function of the neural circuit supports Pinker and Jackendoff’s argument (2005) that phonology is the unique hallmark of human language; however, they note that phonological structure is not technically recursive (Pinker and Jackendoff, 2005). On the other hand, the distinct short-term memory capacity of the loop seems agreeable to Hauser et al.’s view that recursion is a unique feature of human language (though not their hypothesis that it is the only unique feature), leaving recursive phonology as an apparently contradictory explanation.

Hypothesis 3 also suggests that “FLN as an adaptation is open to question”. The elaborate design, the short-term memory capacity, and the information-processing cost of the phonological loop would strongly suggest that the recursive capacity of language is an adaptation. If this is presumed, then what is the nature of the problem that recursive language would solve, and how would recursive language then inform discrete behavior with the capacity to generate a continuous number of dependent signals?

Although natural languages can use syntactic recursion to create an unbounded number of signals, the cost of the complexity involved can render increasingly expensive signals less useful in evolutionary stable signaling games (Jager, 2008, p. 138), where the interests of the sender and receiver coincide, the signal selection size is limited, signaling costs vary between the signals in the set of possible signals, the action selection size is also limited, and the number of actions equals the number of events. If the number of signals exceeds the number of events, the most expensive signals are never used (Jager, 2008); recursion can create very large signal dependencies, but that does not necessarily mean that it is designed for very large signals, or that very large signals will be employed. In a novel representation of recursion by visual groupings, Jackendoff and Pinker (2005, p. 218) demonstrate that the recursive elements within a signal depend on the discrete parent element of the signal; recursive language can therefore be inferred to enable continuous variation within the headed hierarchies of discrete signals. For instance, recursion could continuously bracket this paragraph with embedded phrases until it equals the number of words used in the entire discussion; however, the parent referential objects would still be similar, and the information-processing costs would render the signal increasingly expensive.
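
The outcome described for costly signaling games can be illustrated with a short sketch. It reproduces only the end state (the cheapest signals are the ones in use), not the evolutionary dynamics analyzed by Jager (2008); the function name `split_signals` and the example costs are hypothetical.

```python
def split_signals(signal_costs, n_events):
    """Split signals into those used and those left unused when events are fewer than signals.

    signal_costs maps each signal name to its (assumed) production cost.
    """
    # Rank the signals from cheapest to most expensive.
    ranked = sorted(signal_costs, key=lambda s: signal_costs[s])
    # One signal per event suffices, so only the cheapest n_events are employed.
    return ranked[:n_events], ranked[n_events:]
```

For example, with assumed costs `{"a": 1, "b": 2, "c": 5, "d": 9}` and two events, the signals `a` and `b` are used while the more expensive `c` and `d` are never employed.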

So, if communication informs the interaction of strategies, how would recursive language inform discrete interactive behavior, if at all? Jackendoff and Pinker (2005) argue that the headed hierarchy aspect of syntactic recursion evolved to represent recursive structures in cognition, such as headed hierarchies in conceptual structure (Jackendoff, 1987). Corballis (2007) argues that recursive thought is a uniquely human adaptation to “the complex calculus of social affairs”, reflected in the capacity for theory of mind, inferential mind reading by mirror neuron networks, mental time travel through recursive self-reference to episodic memory of prior experience, and in particular, the referential distribution of episodic memory between individuals (Suddendorf and Corballis, 2007; Suddendorf and Corballis, 2010; Corballis, 2009a,b,c). Corballis proposes that “grammatical language evolved primarily to communicate episodes, thus greatly enlarging the vocabulary of real-world entities for the construction of personal futures”. The demands of mental time travel on memory are significant, where cognitive storage requirements need to accommodate the perception of objects that are not immediately present (Corballis, 2009a,b,c).

Empirical work has further established the utility of memory to future-orientated behavior. In a formal approach tested by artificial agent simulation, Klyubin et al. (2007) demonstrate that controllers seeking to maximize information through a constrained sensor capacity and limited memory can evolve self-organized compressed memory structures, in which memory structures evolve to reflect the initial position (space) and future position (time) of interaction in the perception–action feedback loop of the agent with the environment. Frederic Peters hypothesizes that the recursive quality of consciousness in subjective self-awareness is primarily an energy efficient process for the spatiotemporal updating of ongoing interaction between the controlling agent and the environment, derived from a completely recursive working memory loop between a controller’s current state and the next expected current state, as a moment-by-moment event, where the adaptive value of consciousness is “derived from the capacity to remain cognitively alert while physically inert, exchanging expensive physical activity for low-cost pre-physical orientative processing” (Peters, 2010).

The memory-dependent aspects of recursive cognition can therefore be argued to optimize behavior with respect to the perception of objects that are subject to greater variation in space–time intervals. For instance, the number of behavioral adaptations to social problems can be assumed to be discrete and finite within a species; however, the space–time coordinates of each discrete social problem may vary over wide continuous intervals between agents. In this context, memory in recursive cognition may imply that an interdependent coordination of discrete behavior is more dependent on what behavior is employed, when it is employed, why it is employed, and where it is employed. Consequently, the perception of discrete referential objects in communication can be subject to increased variation in time and space between agents. The short-term phonological memory capacity for recursive syntax (Aboitiz et al., 2010) may seem to solve this problem, through the ability to actuate and perceive continuous phrase variations to long-distance dependencies within the headed hierarchies of discrete messages, thereby referentially pinpointing with greater accuracy the contextual space–time coordinates of discrete interdependent objects subject to greater variation in continuous space–time intervals. Although recursive reference and the spatiotemporal aspects of recursive cognition could then seem to be related in this context, the reviewed research suggests that the role of memory is functionally different between the two operations, where Aboitiz et al. argue that short-term phonological memory facilitates long-distance message dependence.

As demonstrated by Aboitiz et al. and Jager, recursive syntax will be costly in terms of information processing; recursive syntax may therefore represent an adaptation of some kind. Jackendoff and Pinker also demonstrate that syntactic recursion will be constrained within the headed hierarchies of discrete signals. Klyubin et al. formally and empirically demonstrate the role of compressed memory in optimizing future-orientated behavior under constraints imposed on sensory information-processing and storage. Corballis, Suddendorf and Peters provide evidence that memory-dependent aspects of recursive cognition appear to be designed to optimize behavior with respect to fitness problems that are subject to increased space–time variation. Where discrete interdependent behavior is coordinated by the ability to refer to any object from a finite selection of interdependent objects, this section has argued that a recursive capacity to actuate and perceive continuous phrase variation to long-distance signal dependencies can possibly depend on the degree of variation to the perceived space–time coordinates of discrete interdependent objects in continuous intervals between agents, where the discrete referential objects of communication are relevant in some way to the coordination of behavior.

Therefore, given recursive syntax, it is possible that the discrete message capacity of a human language faculty may be limited by the variety of behaviors needed to solve interdependent problems, where the capacity for recursive reference is assumed to reflect variation to the perceived space–time coordinates of discrete interdependent objects between agents. This, however, does not address any ecological conditions that can introduce spatiotemporal variance into social interaction, as so far this discussion has only examined a language selection pressure from one side of the information parsimony ledger, being the possibility that information parsimony may limit the signal capacity of a human language faculty to the minimal selection of discrete messages needed to solve interaction with others in a community. The other side of the relevant information ledger deals with the utility of behavior against the sensory information needed to employ it; so, what interactive fitness costs could be incurred if a referential message capacity is not large enough?

6. Number of individual differences and the potential for error

Few species can be argued to demonstrate the complexity of cooperative self-organization in human communities, which may lead to speculation about the extent to which the performance of a human community depends on language to solve the behavioral variability of human local interactions. If the message capacity of a highly social insect such as an ant or bee is held constant, one may wonder how it would perform if it faced increased variability in behavior between members of its community, and an increased number of species-typical behaviors in interdependent contexts.

Even small amounts of behavioral variation can both stabilize and disrupt cooperation in animal populations (Burgmuller and Taborsky, 2010), and create markets where individuals can preferentially choose between cooperative partners (McNamara and Leimar, 2010). Individual difference in behavior is argued to be an adaptation to occupy a strategic social niche in a multi-niche social environment such as a hierarchy (Buss and Greiling, 1999), through competitive pressure on organisms to diversify into community social roles as a means to reduce the individual cost of niche-exploitation conflict in a social ecology (Burgmuller and Taborsky, 2010). Evolutionary personality psychology argues that behavioral phenotype variation in humans is designed for social role specialization in a hierarchy, evidenced by a large range of correlated variations in species-typical domains that are also correlated with variation in social ranking, such as differences in hierarchy tactics (Lund et al., 2007; Zuroff et al., 2010), assortative mating (Botwin et al., 1997), and more generally in our ability to specialize in productive roles. When related to the self-organized exchange of goods and services in markets, the comparative advantage of individual difference is self-evident in terms of economic output and prosperity (Ridley, 2010). So, if it were not for a large set of possible messages, how would an individual’s ongoing fitness payoff for cooperative performance in local-level interaction be affected by a large number of species-typical behavioral variations that on the whole seem designed to prevent “niche overlap”? This section will start to approach the reductive question by first examining several game-theoretic interpretations of interaction in social structure.

In evolutionary graph theory, the interaction structure of agents in a social community will generally be informed at the local level, which is clearly conceptualized on the linear-scaled network topology of regular graphs. A local interaction refers to “the setting where players only interact with a small subset of the population, such as close friends, neighbors, or colleagues, rather than with the overall population” (Weidenholzer, 2010), represented in graph models by the social interactions (links, edges) of an individual (vertex, node) with nearest neighbors (vertices, nodes) in the social interaction network structures of agents in a population. In the closely-related area of evolutionary set theory, individuals have membership in a finite number of various groups (sets), interact with others who are in the same set, and interact several times with individuals they share several sets with (Nowak et al., 2010). In the evolutionary process of these models, agents update their interaction strategies based on the payoff outcomes of ongoing interaction with local neighbors, where strategy updating in evolutionary games refers to the selection of action based on agent-intrinsic strategy update rules. When update rules operate in graph models, an “update rule may not only concern a strategy change alone, but a reorganization of the agents local network structure” (Szabo and Fath, 2007); for instance, Nowak et al. (2010) argue that the interaction of an individual with his/her local interactive networks and set memberships has a significant influence on the self-organization of the structure of human society.

Update rules are simplifications of cognitive processes that “describe how agents perceive their surrounding environment, what information they acquire, what beliefs and expectations they form from inner experience, and how all this translates into strategy updates during the game” (Szabo and Fath, 2007). If this is contextualized in the biological setting of interdependent perception–action, an organism would perceive its local neighborhood sensor input from a set of sensor input states, and based on a cognitive information-processing mechanism (update rule), select an action from a number of possible actions, which in sum is representative of the perception–action bandwidth of a set of interdependent behavioral adaptations, where the number of species-typical update rules equals the variety of species-typical behavioral adaptations. In terms of the network topology of a graph, this analogy can be further employed to illustrate the interdependent perception–action loop of a regulating agent in ongoing interaction with its social agency environment, through social interaction (local interaction channel, edges/links on a graph) with local neighbors in the social network structure of agents (who occupy the vertices/nodes on the graph), where variety in the regulator’s agency environment refers to the variety of interdependent behavioral adaptations (interaction perception–action bandwidth) that can be employed within the local interaction network channel of agents in a graph society. If this analogy is extended to human society, the arguably large, species-typical distribution size of interdependent human behavior can be inferred to represent the perception–action bandwidth of the local interaction channel between individuals in the social network hierarchies of a human community. Although this may appear to be something of an over-extension of graph theory, several works have identified the real-world relevance of graphs to the complexity of social systems.
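
To make the notion of a local update rule concrete, the following sketch places agents on a ring (a regular graph in which each agent has two neighbors) and applies an imitate-the-best rule. The prisoner's dilemma payoff values, the ring topology, and the choice of imitation as the update rule are all assumptions made for illustration; they are not the specific models of the works cited above.

```python
# Payoff to the first player in a prisoner's dilemma (illustrative values only).
PAYOFF = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1}

def ring_neighbors(i, n):
    """Return the two nearest neighbors of agent i on a ring of n agents (n >= 3)."""
    return [(i - 1) % n, (i + 1) % n]

def local_payoff(i, strategies):
    """Sum agent i's payoffs from playing against each of its local neighbors."""
    n = len(strategies)
    return sum(PAYOFF[(strategies[i], strategies[j])] for j in ring_neighbors(i, n))

def imitate_best_update(strategies):
    """Synchronous update: each agent copies the best-scoring agent it can perceive."""
    n = len(strategies)
    payoffs = [local_payoff(i, strategies) for i in range(n)]
    updated = []
    for i in range(n):
        # The agent perceives only itself and its neighbors (its sensor input).
        candidates = [i] + ring_neighbors(i, n)
        # On ties the agent keeps the earliest candidate, starting with itself.
        best = max(candidates, key=lambda j: payoffs[j])
        updated.append(strategies[best])
    return updated
```

Starting from `["C", "C", "D", "C", "C", "C"]`, one update yields `["C", "D", "D", "D", "C", "C"]`: the defector's immediate neighbors imitate it, while agents further away, who cannot perceive it, remain unchanged.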

The Barabasi–Albert “scale-free” graph model holds that preferential interaction between social agents leads to a power-law growth of inhomogeneous social network connectivity that is biased towards more visible vertices (Barabasi and Albert, 1999; Szabo and Fath, 2007), leading to socio-economic disparities and diverse network structures as the “inevitable consequence of self-organization due to the local decisions made by the individual vertices” (Barabasi and Albert, 1999). Realistically, social structures “fall somewhere between regular and scale-free graphs” (Santos et al., 2008). When graph populations in evolutionary prisoner’s dilemma games are rendered diverse in social status and wealth by fitness payoffs that are mapped to individuals by scale-free, extrinsic differences in social network connectivity, cooperation becomes a robust strategy that mostly relies on the diversity in wealth and social status (Perc and Szolnoki, 2008; Santos et al., 2008). This imposed scale-free condition leads to high-ranking players in cooperative clusters with larger connectivity dominating the game and prevailing against defectors (Perc and Szolnoki, 2008), where defectors survive in a minor role as low-fitness social parasites by exploiting low-fitness cooperators (Santos et al., 2008). Santos et al. stipulate that combining “social diversity” with reputation and punishment in evolutionary games “will provide instrumental clues on the self-organization of social communities and their economical implications” (2008).
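
The preferential attachment mechanism behind the scale-free model can be sketched in a few lines. This is a simplified variant in which each new vertex adds a single link; the function names, the seed, and the one-link simplification are assumptions made here for brevity.

```python
import random

def preferential_attachment(n_nodes, seed=0):
    """Grow a graph in which each new node links to one existing node,
    chosen with probability proportional to that node's current degree.
    Requires n_nodes >= 2."""
    rng = random.Random(seed)
    edges = [(0, 1)]
    # Each node appears in this list once per edge end, so a uniform draw
    # from it selects a node with probability proportional to its degree.
    endpoints = [0, 1]
    for new in range(2, n_nodes):
        target = rng.choice(endpoints)
        edges.append((new, target))
        endpoints.extend([new, target])
    return edges

def degrees(edges):
    """Count the number of links attached to each node."""
    deg = {}
    for a, b in edges:
        deg[a] = deg.get(a, 0) + 1
        deg[b] = deg.get(b, 0) + 1
    return deg
```

Because well-connected vertices are more likely to receive each new link, repeated growth tends to concentrate connectivity in a few early, highly visible vertices, which is the inhomogeneity the text describes.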

Interaction network heterogeneity is therefore shown to be a powerful cooperative mechanism; so, holding private information constant, why would a signaling capacity even be needed to inform the level of cooperative interaction for a given social complexity? Although the scale-free models do reveal a great deal about the cooperative structure associated with complex social interaction, the agent-extrinsic “social diversity” in Perc et al. and Santos et al. is caused by experimentally imposing differences in network connectivity to isolate the effect of this on agent cooperation. The agents themselves are intrinsically invariant, in that a singular homogeneous update rule is innate to all agents in the population, implying identical personalities whose sole behavioral adaptation only varies in response to the mapping of fitness payoffs by a stationary scale-free network. This highlights an issue in the game-theoretic literature that McNamara et al. identify: “in applying game theory to problems in biology, differences between individuals are often ignored ... variation in a behavioral trait is likely to be due to both genetic variation and variation that is environmentally induced” (2010). Perc et al. and Santos et al. therefore address an environmental component of behavioral variation in cooperative social roles; however, what these models do not address is the role of heritable differences in generating complex social structure through cooperative partner markets for preferential interaction (McNamara et al., 2010), niche specialization as avoidance of niche overlap conflict (Buss and Greiling, 1999; Burgmuller and Taborsky, 2010), as well as in the exploitation of comparative advantage driving the self-organized complexity of modern human economies (Ridley, 2010). Therefore, the question can be asked: if it were not for the ability to employ language, how would the individual perform in human society, and how would human society perform if it were not for language being employed between different individuals?

Given that the scale-free models in Perc et al. and Santos et al. are not signaling games, agent interaction would be inferred to approximate a perfectly informed strategy selection, so agents would be almost completely certain of strategies that have been selected or will be selected. For instance, Perc et al. report that uncertainty was incorporated into the strategy adoption process of agents, being the minimum amount needed (“0.1”) to smooth out any “trapped conditions” in experiment evolution, so the ongoing interactive behavior of local agents in the graph population would be 90% certain between agents. If the strategy adoption of others is 100% certain, then interdependent information is perfectly distributed along interaction networks, private information is zero, all agents are unable to perceive any more interdependent information than they can already perceive, and signaling will not be needed to inform the updating of interdependent strategy. Therefore, a perfectly informed individual would be absolutely certain of local agent behavior, and strategy selection is perfectly informed without the need to select or perceive signals. In contrast, the private information theme of signaling games so far (Jager, 2008) is at odds with the condition of perfect information. McNamara et al. argue that behavioral variation will introduce uncertainty, risk and error into strategy selection, where “errors in the execution of strategy are inevitable and should therefore realistically be included into game theory modeling” (2010).

Generally, risk and uncertainty in an evolutionary game refers to the probability of erroneously selecting an action that leads to an undesired fitness cost (i.e. a mistake). Uncertainty is given by a probability distribution of possible actions that may be taken by another agent: if the probability distribution of interdependent behavior is perceived to be perfectly uniform, then all possible behaviors will be equally probable. If on the other hand possible interactive behaviors are perceived not to be uniform, then some behaviors will be more certain or less certain than others. When weighted against the payoff distribution of the interaction problem, the relevance of the uncertainty can be inferred to reflect the probability-weighted fitness payoffs for a selection of actions (for an extensive game-theoretic review of the trade-off between risk dominance and payoff dominance in local interaction, see Weidenholzer, 2010).
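
The two quantities described above, the uncertainty of a distribution over another agent's possible behaviors and the probability-weighted payoff, can be sketched as follows. Measuring uncertainty by Shannon entropy in bits is an assumption consistent with the information-theoretic framing of this review, and the function names are hypothetical.

```python
import math

def entropy_bits(probabilities):
    """Shannon entropy (in bits) of a distribution over another agent's possible behaviors."""
    # Behaviors with zero probability contribute nothing to the uncertainty.
    return -sum(p * math.log2(p) for p in probabilities if p > 0)

def expected_payoff(probabilities, payoffs):
    """Probability-weighted fitness payoff of an action against each possible behavior."""
    return sum(p * v for p, v in zip(probabilities, payoffs))
```

A uniform distribution over four behaviors gives the maximum uncertainty of 2 bits, whereas a skewed distribution such as `[0.7, 0.1, 0.1, 0.1]` gives less, reflecting that some behaviors are then more certain than others.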

In this generalized context of uncertainty and error cost in cooperation games, a private discussion by this author (2011) entertained the possibility that an increased number of possible behaviors (without deliberation on behavioral variation) can increase the probability of error in cooperative contexts, where information is asymmetric between the sender and receiver. For instance, a fixed number of adaptive outcomes within the payoff distribution of an interaction problem and a larger variety of possible behaviors will imply a smaller probability that any random behavior will solve the problem, indicating selection for the capacity to employ any message from an increased number of possible messages. Salge (personal communication, 2011) summarizes the main lines of the argument: if information is unevenly distributed, and allows an agent to act better if present, and if the main interaction is of a cooperative nature, then agents are interested in distributing the information as evenly as possible, if a bigger selection of possible behaviors leads to a higher error cost. Given that a large number of assumptions hold, an increased selection of possible messages can give an agent the ability to distribute information more evenly in a local interaction, and as a result error cost is reduced, so overall agent performance is increased, thereby their own performance in cooperative scenarios is also increased (Salge, personal communication, 2011).
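
The error argument above reduces to simple arithmetic when behaviors are chosen uniformly at random, which is an assumption made here for illustration; the function name `error_probability` is likewise hypothetical.

```python
from fractions import Fraction

def error_probability(n_adaptive, n_behaviors):
    """Probability that a uniformly random behavior fails to solve the problem."""
    # Only n_adaptive of the n_behaviors possible behaviors solve the problem.
    return 1 - Fraction(n_adaptive, n_behaviors)
```

With two adaptive behaviors, the error probability rises from 1/2 when four behaviors are possible to 9/10 when twenty are possible, which is the sense in which a larger variety of behaviors increases the potential for error in the absence of further information.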


The problem here is that the interdependent behaviors that can be employed within a local neighborhood will locally inform each agent: “since behavior is also used to communicate, a bigger number of possible actions also leads to a higher communication bandwidth (with similar error rates in the channel), and could therefore balance this effect” (Salge, personal communication, 2011). An increased number of possible behaviors can thereby locally inform an interaction just as well as a larger signaling capacity. Even if information is asymmetrically distributed in a local neighborhood by an event that occurs in the next local set of agents, or even by an event that occurs a number of sets away, the strategies employed by agents immediate to the event will locally inform behavior along the interaction network, thereby precluding uncertainty and error with respect to the relevance of the event, regardless of the number of possible behaviors. Additionally, agents update their strategies independently of each other (random sequential or asynchronous update) in the majority of real social systems (Szabo and Fath, 2007); sequential games such as chess lead to perfect information between players (Miller, 2003).

So, if agents on a graph population are individually different instead of behaviorally invariant, would an individual face greater interactive uncertainty? Not necessarily, since he/she only interacts with nearest neighbors. Given that humans have memory and an apparent tendency to form localized relationships with preferred cooperative partners, then given individual differences, memory of past behavior should be sufficient to solve ongoing interaction without the need to signal. However, what cannot be observed are the ongoing behaviors and characteristics of the individuals who we do not locally interact with, but who interact within the local neighborhoods of those we interact with. If the members of our local neighborhood optimize their future-oriented niche occupation behavior with respect to the ongoing behavior of individuals in their neighborhood and so on, then this leaves an impression that an ongoing chain of uncertainty can operate throughout the interaction network of a multi-niche social ecology such as a hierarchy.

In terms of perception–action, individuals would then face the ongoing optimization of future-oriented niche-regulation in a local interaction channel whose bandwidth has an error rate proportionate to the variety of behavioral variations and the degree of species-typical variance associated with each behavior. Therefore, our interaction network can be argued to be somewhat characterized by stable uncertainty in our interaction channel. If individual fitness is substantially dependent on updating ongoing future-oriented cooperative performance in social status and resource allocation, and if a larger variety of behavioral variations can lead to a higher error cost, then a conservative lower bound can be argued for the utility of a signaling capacity.

A conservative lower bound in error can also be conceptualized in terms of the subjective valuation of an interdependent problem between individually different agents. If the perceived valuation of an interdependent problem reflects its fitness value to life-strategy, the utility of a given behavior with respect to the problem may be perceived to be more adaptive for one agent than the other. A variety of different agents may then employ a variety of different behaviors to solve an interaction problem that differs in value between them. An individual can then encounter an increased variety of behaviors that may possibly be employed by cooperative partners over n-interactions in time and in social space between interdependent agents across the individual’s interaction network, hence uncertainty in ongoing interaction between intrinsically heterogeneous agents.

An asymmetric interaction valuation can also indirectly imply error potential through differences in the attention paid to referential coding effort, where a relatively lower valuation has been argued to involve less attention paid to referentially specifying any perceivable object from a larger number of possible objects per signal, and less concern about the particular interdependent actions that may be employed with respect to the message. On the other hand, the individual with more at stake would depend on referentially specifying any perceivable object from a smaller number of possible objects per signal, and would thereby be more concerned about the behaviors that may be employed. This may appear to indicate that the probability of error can be weighted on the individual with more to gain or lose in the interdependent problem being subject to signaling in the communicative interaction.

The previous section has argued that the memory-dependent aspects of recursive cognition can reflect considerable space–time variation to the coordination of cooperative behavior, while refraining from giving consideration to the ecological conditions that could induce such variation. The reviewed literature implies the possibility that spatiotemporally extended coordination in humans generally refers to strategic cooperation in the complex system dynamics of a social structure, where spatiotemporal variation is introduced by behavioral variation in a diversified social ecology. This section has also raised the question of an individual’s performance in society without language. Without contextual coordinates, and as a sole means to improve individual performance in cooperative interactions, a discrete message capacity would perform very poorly in coordinating strategic interaction in a socially complex community over large continuous intervals in time and social space. Where an individual faces such interactive complexity, the recursive capacity to actuate and perceive continuous variation to long-distance message dependence can be associated with minimizing spatiotemporal error, be it a series of small errors or a less frequent but significant error, which incidentally would seem to indicate the utility of remembering the personality of others.

Therefore, if signaling updates the ongoing future-oriented optimization of behavior with respect to interdependent problems that vary in time and social space between agents, then the perception of situational cues that indicate immediate or spatiotemporally extended interactive uncertainty can be argued to cause an agent to initiate referential updating through signaling in communicative interaction. Given a set of discrete messages, the utility of a communication faculty would then depend in no small part on the ability to employ recursive language.

The literature reviewed so far can now imply an important basic point in the form of a hypothesis: assuming that diversified social niches can reflect community-wide asymmetries in the valuation of interdependent problems, and if the main interaction is of a cooperative nature, then interactive uncertainty regarding the potential for behavioral error can lead to error management predicting selection for a stable over-communication bias. For example, even if an individual has a very low valuation of an interaction, individual difference implies a degree of uncertainty as to another individual’s valuation of an interaction. Selfishly failing to solve a low-value cooperation problem may possibly impose a greater cost on the other agent, which can then lead to substantial costs associated with reputation damage, decreased cooperation, or punishment from that agent over future interactions (McNamara et al., 2010). Given the leading considerations and hypotheses discussed so far, it is possible that a stable over-communication bias could be expected, proportionate to the strength of cooperative interdependence, and the degree of behavioral phenotype variance in a population. As pointed out by an anonymous reviewer of this article, “over-communication” essentially refers to the capacity for recursive reference, a central issue in the study of language evolution: reflection on this insight can lead to the conclusion that stable over-communication by language will clearly imply an increased demand on the capacity for recursive reference.
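The error management reasoning behind an over-communication bias can be made concrete with a simple expected-cost comparison. The following is a minimal sketch, assuming that an agent pays a fixed cost to signal and otherwise risks a coordination error with some probability and some cost. The function names and the numerical values are hypothetical illustrations, not quantities drawn from the reviewed literature.

```python
def signaling_threshold(signal_cost: float, error_cost: float) -> float:
    """Smallest error probability at which signaling becomes worthwhile.

    Assumption (illustrative only): signaling always costs `signal_cost`
    and fully prevents the error, while staying silent risks a
    coordination error that costs `error_cost`.
    """
    if signal_cost < 0 or error_cost <= 0:
        raise ValueError("signal_cost must be >= 0 and error_cost > 0")
    return min(1.0, signal_cost / error_cost)


def should_signal(p_error: float, signal_cost: float, error_cost: float) -> bool:
    """Signal when the expected cost of silence exceeds the cost of signaling."""
    return p_error * error_cost > signal_cost


if __name__ == "__main__":
    # A cheap signal set against a costly error (e.g. reputation damage):
    # even a small chance of a coordination error justifies communicating.
    print(signaling_threshold(signal_cost=1.0, error_cost=20.0))
    print(should_signal(p_error=0.1, signal_cost=1.0, error_cost=20.0))
```

In this toy model, the more asymmetric the two costs, the lower the threshold, so an agent who is uncertain about a partner's valuation would signal even when an error is unlikely. This is one way of reading the proposed bias, and it holds only under the stated assumptions.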

In summary, a conservative view could infer that the message capacity of a communication faculty would at least reflect the error rate of potential cooperative mistakes in the perception–action bandwidth of local interaction in a social network. This section highlights a need to focus on the spatiotemporal qualities of referential objects in human language: the signals that are actuated and perceived within a communicative interaction may not necessarily be relevant to circumstances surrounding the immediate interaction. Rather, it can be argued that the language employed within a communicative interaction may directly or indirectly refer to interaction problems that are subject to spatiotemporal variation in the social structure dynamics of the speaker and listener. Therefore, in a game-theoretic context, human language can be argued to perform an updating function that referentially updates immediate interaction strategies, and in particular, referentially updates the future-oriented optimization of life-strategy interaction in a diversified social structure. With that in mind, this discussion will now consider the plausibility of the argument that behavioral variation and the memory-dependent aspects of recursive cognition can unevenly distribute the perception of relevant objects between agents, and unevenly distribute the perceived coordinates of discrete relevant objects in space–time between agents.

7. Individual difference and asymmetric relevant information

Buss et al. argue that a major part of the human adaptive landscape is defined by individual differences in behavior, and behavioral variation requires learning about the behavior of others in order to solve interactions (McNamara and Leimar, 2010). Species-typical behavioral variation also implies information-processing differences, where information parsimony holds that an organism will only process enough sensor information needed to perform behavior at an adaptive level of utility. Although sensory information-processing is held to be costly, an increase will improve the efficiency and performance of a behavior in solving a problem in the organism’s environment. On the other hand, the cost of processing sensory input into behavioral output will become more of a fitness burden after information-processing passes the optimal trade-off with utility gains from an increase to behavior performance (Polani, 2009). The fitness value of increased information-processing will then depend on an increase to the valuation of the fitness problem (i.e. fitness cost value or fitness benefit value; avoid injury, win mate, etc.). If the valuation is perceived to be greater, then relevant information holds that sensory information-processing will reflect the perceived utility of the behavior (Polani, 2009). In this respect, even if the members of a species are assumed to have the same suite of behavioral adaptations, a species-typical variation in each behavior would seem to indicate that different agents will perceive different valuations for similar problems.
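The trade-off between processing cost and utility described above can be sketched with a toy model. The following is a minimal sketch, assuming that behavioral utility saturates as more bits of sensory information are processed and that processing cost grows linearly with the number of bits. The functional forms and the names `net_fitness` and `optimal_bits` are assumptions made for illustration, and they are not Polani's formalization of relevant information.

```python
import math


def net_fitness(bits: float, valuation: float, cost_per_bit: float) -> float:
    """Toy net payoff of processing `bits` of sensory information.

    Assumption (not from the source): utility approaches `valuation`
    with diminishing returns, as valuation * (1 - 2**-bits), and the
    processing cost is linear in the number of bits.
    """
    utility = valuation * (1.0 - 2.0 ** (-bits))
    return utility - cost_per_bit * bits


def optimal_bits(valuation: float, cost_per_bit: float) -> float:
    """Bits that maximize net_fitness under the toy model above.

    Setting the derivative to zero gives
    bits = log2(valuation * ln(2) / cost_per_bit), floored at zero
    when processing is never worth its cost.
    """
    if valuation <= 0 or cost_per_bit <= 0:
        raise ValueError("valuation and cost_per_bit must be positive")
    ratio = valuation * math.log(2) / cost_per_bit
    return max(0.0, math.log2(ratio))


if __name__ == "__main__":
    # Two agents face the same problem but value it differently.
    for valuation in (2.0, 10.0, 50.0):
        print(valuation, round(optimal_bits(valuation, cost_per_bit=1.0), 3))
```

Within this toy model, agents who place a higher valuation on the same problem process more information at their optimum, which is one way to picture how differing valuations could yield differing amounts of relevant information between individuals.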

Therefore, heritable, innate behavioral variation directly implies intrinsic differences in sensory information-processing between individuals, hence an uneven distribution of relevant information. Furthermore, if individual difference in behavior is also extrinsically influenced by ongoing strategic interaction with network neighbors in a “socially diverse” multi-niche society, then unevenly distributed relevant information between agents can be indirectly implied by the heterogeneous social coordinates of network vertices occupied by individuals in a social hierarchy. If heterogeneous niche diversification in social status and wealth indicates the variety of social roles needed to balance niche-exploitation conflict through cooperative behavior in a social ecology, then differences in the perceived relevance of problems should be partially determined by differences in the ongoing cooperative interaction associated with the occupation of different social roles. In sum, the seemingly axiomatic relation between relevant information and behavioral variation can be argued to strongly imply variation in the perception of relevant objects, hence an unevenly distributed perception of discrete relevant objects between cooperating individuals.

The extent to which the perception of relevant objects is unevenly distributed between organisms would be inferred to depend on the number of species-typical behaviors, and the species-typical variance of each behavioral adaptation. Assuming that the spatiotemporal aspects of recursive cognition reflect the perception of fitness problems over large intervals in time and social space, then given behavioral variation, the perceived coordinates of discrete interdependent objects can also be unevenly distributed between individuals over large continuous space–time intervals. Therefore, the conditions of relevant information, cognitive recursion and behavioral variation can be argued to unevenly distribute the perception of discrete interdependent objects between individuals, and unevenly distribute the perception of interdependent objects in space–time, hence the private information theme of signaling games so far, as well as uncertainty and error in ongoing cooperative interaction.

However, if individuals preferentially choose between cooperative partners (McNamara and Leimar, 2010), then the organization of an individual’s local network would reflect preference (Barabasi and Albert, 1999) for cooperative alignment with local relationships (Burgmuller and Taborsky, 2010). So, in the context of preferred local partners, exactly how unevenly distributed can the perception of relevant objects be? Burgmuller et al. hold that behaving in a consistently different way avoids competition for the same niche in a social ecology, and generally, individually different agents will be interested in forming cooperative relations with those with whom they are strategically aligned, but not so similar as to result in niche-overlap competition for the same niche, holding the resource scarcity of the niche constant. In this respect, asymmetric relevant information in a heterogeneous network can appear to reflect individual differences at the local interaction level, as opposed to the population level. If information parsimony limits the processing cost of referential signaling to the minimum information needed to coordinate ongoing cooperative interaction, then it is possible that the signal actuation–perception channel for a human language faculty may depend on the extent to which the perception of relevant interdependent objects can be unevenly distributed between agents in the local interaction channel of the social network structure of a human community.

If this is presumed, and if an interaction problem is accurately represented by the general features of the standard game table, then the referential objects of social communication would be generally limited to the problems, behaviors, and payoffs relevant to the coordination of ongoing interaction between strategically aligned individuals in a local network. For instance, possible referential constraints can be casually observed in human sex differences in the relevance of referential objects in male and female social communication. Male fitness generally depends on social status in a dominance hierarchy, resource ownership, and access to females (Schmitt, 2005). Female fitness generally depends on reproductive health and the cooperative strength of relationships with coalition members such as family and friends (Schmitt, 2005). If males and females generally perceive their relevant social environment in those terms, then it is possible that the referential objects of male social communication can be generally constrained to behaviors, problems and payoffs that are relevant to social status, resources, access to females, and valued female qualities. Likewise, the referential objects of female social communication would be inferred to be limited to behaviors, problems and payoffs that are relevant to reproductive health, social relationships, and valued male qualities. This would imply that the objects being referred to in a communicative interaction are of adaptive relevance to the coordination problems faced by the speaker and listener; therefore, referential sex differences in language can be expected.

Personality research by Lund et al. (2007) on hierarchy tactics in the workplace has established sex differences in hierarchy-specific competition and cooperation tactics in males and females, where male work colleagues were found to be more likely to employ deception/manipulation behaviors. Although no significant sex differences were found in social display/networking behaviors, females were more likely to “help others, cultivate friendships, display positive social characteristics, and enhance appearance” (Lund et al., 2007). In both Lund et al. (2007) and Zuroff et al. (2010), individual differences in hierarchy tactics were correlated with individual differences in the five-factor personality domains. Burgmuller et al. (2010) propose that correlated personality traits will correspond to preference for correlated ecological conditions in a multi-niche society, being preferred social density and preferred niche profile in hierarchy structure.

Therefore, if individuals are interested in optimizing the utility of ongoing future-oriented cooperation with local neighbors who are relevant to the individual’s strategic niche, and if behavioral variation unevenly distributes relevant information, and if the particular subject matter in a social communication is directly relevant to the ongoing cooperative optimization of the relationship, then with respect to interactive uncertainty and error, individuals should be interested in distributing the information as evenly as possible.

The problem here is that behavioral variation and information parsimony have been argued to imply that the perceived valuation of an interdependent problem will vary between individuals, in which case the attention paid to speaking and listening effort can be weighted on the agent with more at stake, reflecting a greater dependence on referential accuracy, a greater sensitivity to any behavior from a smaller number of possible behaviors, and a relatively lower probability that any random behavior from a set of possible behaviors will solve the interaction. In this circumstance, a more dependent individual could recruit increased decoding and encoding effort from a less dependent individual through the manipulative over-estimation of a payoff value for a problem. For instance, an exaggeration of a fitness payoff will engage a greater amount of attention from the agent with less at stake, increasing behavioral responsiveness to particular referential objects in a manner that is relatively more adaptive to the manipulating agent. However, this represents a cheating problem, which this discussion has argued to depend on stable cooperation.

Where the efficient referential communication of a message is in negative relation to the difference in attention paid to communication effort between individually different agents, behavioral variation would therefore appear to unevenly distribute relevant information, increase the need to communicate information, and inhibit the extent to which information can be evenly distributed in communicative interactions. Unlike the communication of other animals, human language is inherently ambiguous (Sperber and Wilson, 2002; Corballis, personal communication, 2011); the approach employed by this review seems to agree with that observation, and has suggested the possibility that individual difference may cause stable variation to the perceived valuation of interdependent problems being subject to referential coding in signaling, thereby causing consistent variation in the attention paid to communication effort, hence stable ambiguity. This would seem to indicate the stable role of spatially recursive cognitive functions in solving ambiguity, such as mirror neuron networks (Corballis, 2010) and inferential mind-reading (Sperber and Wilson, 2002) in the capacity to employ theory of mind.

This apparent mutual dependence between recursive cognitive inference and language in the literature can possibly be related to variation between social environments. For instance, in the Definitive Book of Body Language (2003), Barbara and Allan Pease casually observe a negative relationship between verbal and non-verbal skills for individuals of different socioeconomic backgrounds. Individuals with little or no formal education appear to perform poorly in verbal language, and seem generally more adept at employing and “reading” non-verbal communication, whereas more literate individuals appear relatively less adept at employing and reading non-verbal communication. If this is taken at face value, socioeconomic differences in verbal and non-verbal language skill could be argued to be attributed to learning demands in different developmental environments, where the developmental acquisition of recursive cognitive inference is traded off in favor of a greater interactive utility by improved access to verbal language acquisition, hence the ability to communicate with greater referential accuracy through more grammatically complex language. This of course assumes that interactive communication in general performs the same type of adaptive function, which this discussion has argued to be the utility of informing behavioral solutions to interdependent problems. However, there is a fundamental problem in assuming that the surface complexity of human language is entirely due to biological adaptation, and this problem is referred to as the logical problem of language evolution (Christiansen and Chater, 2008).

8. Comment regarding Christiansen and Chater’s “language as shaped by the brain”

Christiansen and Chater (2008) identify the logical problem of language evolution: the probability that the human capability for language can be produced by purely non-evolutionary means is astronomically small; however, the evolutionary speed of grammatical structure over thousands of generations of language learners and users presents a moving target to adaptation by a biological faculty to universal grammar. This represents a paradox between the intricate human-specific complexity of language acquisition mechanisms, and a grammatical structure that in itself evolves at a faster rate than a biological adaptation for universal grammatical structure can evolve; therefore, a gene coding for universal grammar cannot be fixed throughout the diverse world-wide population of humans (Christiansen and Chater, 2008). Christiansen and Chater provide the hypothesis that the grammatical composition of language is an evolutionary system in its own right: it is not the human brain that has evolved the capacity for language; rather, it is language that evolves to be learnable by the brain over successive generations of grammatical transformations through learning and use, under economizing constraints imposed by complex cognitive mechanisms that are not in themselves designed to be language-specific.

Therefore, in the strong form of the hypothesis, the complexity of human grammar could be viewed as a cognitive by-product that manifests in the basic adaptive functioning of a communication faculty, where Hypothesis 1 of Hauser et al. (2002) holds that “the faculty of language in the broad sense (FLB) is strictly homologous to animal communication.” However, the strong form assumes that the logical problem of language evolution will strictly hold, which rests on the moving-target assumption that “universal, arbitrary constraints on the structure of language cannot emerge from biological adaptation to a varied pattern of linguistic environments” (Christiansen and Chater, 2008). In statistical analysis on over 2000 languages and social demographics, Lupyan and Dale (2010) determine that the complexity of inflectional morphology will be greater if the population is smaller and more socially cohesive, where “languages with more speakers (larger groups) are less morphologically specified than languages with fewer speakers (smaller groups)”. In particular, “lexical strategies are more likely to be used in place of inflectional morphology to encode evidentiality, negation, aspect and possession” in larger, less socially cohesive populations (Lupyan and Dale, 2010). The inflectional morpheme redundancy of grammatical structure is therefore related to variation between the social environments of language learners and users.

Through the application of optimality theory, Jones (2010) determines that variation to kin terminology between English and Seneca appears to be constrained by variation in ranking to a finite, species-typical set of grammatical and conceptual constraints. Jones provides the hypothesis that an innate grammar faculty is part of a uniquely human set of adaptations for coordination games, such as resource allocation amongst kin, “adapted to facilitate the construction of locally shared codes of communication and interaction”, where “the grammar faculty interacts with, but is distinct from, domain-specific adaptations in conceptual structure, phonology and syntax” (Jones, 2010, p. 379). In each of these domains, the grammar faculty uses the ranking of constraints to “generate grammatical output, match the language learner’s rank of constraints with community rankings, and possibly, discovers constraints” (p. 379). In kinship terminology at least, grammar would seem specifically designed to facilitate the coordination of behavior. It is then possible that grammatical transformations can be imposed by variation to the ranking of universal grammatical constraints.

These theoretically and empirically sound examples would seem to add credibility to the impression that human language may be the output of an evolved, sophisticated communication faculty that facilitates a greater optimization to the utility of an individual’s behavior in relation to the interdependent environment. However, this would not necessarily preclude the semi-strong form of Christiansen and Chater’s hypothesis, as the surface complexity of language does indeed vary between populations, and does continuously evolve over generations of language users and learners. In this context, the evolution of language may be further narrowed down: to what extent can the complexity of human language be shaped by self-organized evolution to the human brain over continuous generational acquisition and use, and to what extent can the complexity of human language be attributed to a species-typical signaling adaptation that may be designed to solve interdependent problems?

9. Conclusion

This review has employed a multi-disciplinary approach to examine a number of apparent relations between factors that may possibly be associated with selection for a human language faculty, with the emphasis on an information theoretic context. The conceptual tools that were touched on included, but were not limited to, recursive aspects of cognition and linguistics, individual difference theory, the law of requisite variety, evolutionary games involving graphs and signals, error management theory, referential coding efficiency, and the parsimony principle of relevant information in perception–action. The argument suggests that a selection pressure for human language may possibly be outlined through further examination by researchers in regards to apparent relations between the principle of relevant information, recursive cognition, and interdependent behavioral variation.

In general summary of this speculative discussion, information parsimony was argued to limit communication to the minimal amount of information needed to employ adaptive interdependent behavior, where the information-processing cost of a language faculty is in optimal trade-off with the utility of interactive behavior, and where the capacity to actuate and perceive an increased selection of possible messages implies reference to a similar selection of perceivable objects that inform behavior. By varying the information processed by sensors into behavior between agents, relevant information and behavioral phenotype variation were argued to unevenly distribute the perception of discrete objects between interdependent agents. Behavioral variation and the memory-dependent aspects of recursive spatiotemporal cognition were argued to unevenly distribute the perception of discrete interdependent objects in space–time, implying the utility of short-term phonological memory in the capacity to actuate and perceive continuous variation to long-distance message dependencies by recursive reference. As recursion in human language is arguably a central issue of significance to research in the evolution of language, a large proportion of this review has been allocated to investigate the matter.

To conclude, a conservative lower bound has been implied in selection for the information-processing capacity of a human language faculty, being the interactive behavior error potential of a large number of individual differences. In relevance to that position, the material reviewed by this article suggests the basic outline of a hypothesis regarding the stable conditions for recursive reference (Section 6). However, the argument has not attempted to make any form of a specific grammatical prediction: the discussion has simply reviewed a number of consistencies in the literature that appear to indicate the partial outline of conditions that can possibly evolve an increased referential signal actuation–perception capacity, and in doing so, this discussion hopes to have demonstrated the robust utility and interdisciplinary relevance of information theory in the very diverse field of language research. This article has also clearly identified evolutionary game theory and the psychological sciences as critical in the advancement of theory for the evolution of human language. As an interdisciplinary problem, the most conspicuous discipline in a coordinated approach to this highly demanding task is language-specific expertise: the reviewed works of Lupyan, Dale and Jones serve to illustrate the relevance of this point.

Acknowledgments

The author thanks Christoph Salge and Mikhail Prokopenko for advice and direction on the information-theoretic scope of this paper, Michael Corballis for critical feedback on language and cognition, the participants of the 11th March 2011 Machine Learning seminar at the CSIRO ICT Centre, Marsfield, and five anonymous reviewers for critical comment and insight.

References

Aboitiz, F., Aboitiz, S., Garcia, R., 2010. The phonological loop. Current Anthropology 51 (Suppl. 1), 55–65.Ashby, W., 1956. An Introduction to Cybernetics, second ed. Chapman and Hall Ltd.Barabasi, A., Albert, R., 1999. Emergence of scaling in random networks. Science 286, 509–512.Ben-Jacob, E., Cohen, I., Levine, H., 2000. Cooperative self-organization of micro-organisms. Advances in Physics 49 (4), 395–554.Botwin, M., Buss, D., Shackelford, T., 1997. Personality and mate preferences: five factors in mate selection and marital satisfaction. Journal of Personality

and Research 65, 1.Burgmuller, R., Schurch, R., Hamilton, I., 2010. Evolutionary causes and consequences of consistent individual variation in cooperative behavior.

Philosophical Transactions of the Royal Society B 365, 2751–2764.Burgmuller, R., Taborsky, M., 2010. Animal personality due to niche specialization. Trends in Ecology and Evolution 25, 504–511.Buss, D., 1996. Social adaptation and five major factors of personality. In: Wiggins, J.S. (Ed.), The Five Factor Model of Personality: Theoretical Perspectives.

Guilford, New York, pp. 180–207.
Buss, D., Duntley, J., 2008. Adaptations for exploitation. Group Dynamics 12, 53–62.
Buss, D., Greiling, H., 1999. Adaptive individual differences. Journal of Personality 67, 2.
Campbell, A., 2005. Aggression. The Handbook of Evolutionary Psychology. John Wiley & Sons (Chapter 21).
Christiansen, M., Chater, N., 2008. Language as shaped by the brain. Behavioral and Brain Sciences 31, 489–558.
Corballis, M., 2007. The Uniqueness of Human Recursive Thinking. American Scientist, May–June, pp. 240–248.
Corballis, M., 2009a. The evolution of language. The year in cognitive neuroscience 2009. Annals of the New York Academy of Sciences 1156, 19–43.
Corballis, M., 2009b. Language as gesture. Human Movement Science 28, 556–565.
Corballis, M., 2009c. Mental time travel and the shaping of language. Experimental Brain Research 192, 553–560.
Corballis, M., 2010. Mirror neurons and the evolution of language. Brain and Language 112, 25–35.
Daly, M., Wilson, M., 2005. Carpe diem: Adaptation and devaluing the future. The Quarterly Review of Biology 80 (1), 55–60.
Danchin, E., Giraldeau, L., Valone, T., Wagner, R., 2004. Public information: From nosy neighbors to cultural evolution. Science 305, 487–491.
Dawkins, R., 2009. The Greatest Show on Earth. Bantam Press, Transworld Publishers.
Dorner, D., 1996. The Logic of Failure. Basic Books. Perseus Books Group, New York.
Dukas, R., 2004. Causes and consequences of limited attention. Brain, Behavior, and Evolution 63, 197–210.
Elias, D., Botero, C., Andrade, M., Mason, A., Kasumovic, M., 2010. High resource valuation fuels desperado fighting tactics in female jumping spiders. Behavioral Ecology (June), 868–875.
Haselton, M., Buss, D., 2000. Error management theory: A new perspective on biases in cross-sex mind reading. Journal of Personality and Social Psychology 78, 81–91.
Haselton, M., Nettle, D., 2006. The paranoid optimist: an integrative evolutionary model of cognitive biases. Personality and Social Psychology Review 10 (1), 47–66.

Haselton, M., Bryant, G., Wilke, A., Frederick, D., Galperin, A., Frankenhuis, W., Moore, T., 2009. Adaptive rationality: An evolutionary perspective on cognitive bias. Social Cognition 27 (5), 733–763.

Hauser, M., Chomsky, N., Fitch, W., 2002. The faculty of language: what is it, who has it, and how did it evolve? Science 298, 1569–1579.
Heil, M., Karban, R., 2009. Explaining evolution of plant communication by airborne signals. Trends in Ecology and Evolution 25 (3), 137–144.
Hirschey, M., 2009. Managerial Economics, 12th ed. South-Western CENGAGE Learning.
Hurd, P., 2006. Resource holding potential, subjective resource value, and game-theoretic models of aggressiveness signaling. Journal of Theoretical Biology 241, 639–648.
Jackendoff, R., 1987. Consciousness and the Computational Mind. MIT Press, Cambridge, MA.
Jackendoff, R., Pinker, S., 2005. The nature of the language faculty and its implications for evolution of language. Cognition 97, 211–225.
Jager, G., 2008. Evolutionary stability conditions for signaling games with costly signals. Journal of Theoretical Biology 253, 131–141.
Jensen, K., 2010. Punishment and spite, the dark side of cooperation. Philosophical Transactions of the Royal Society B 365, 2635–2650.
Jones, D., 2010. Human kinship, from conceptual structure to grammar. Behavioral and Brain Sciences 33, 267–416.
Klyubin, A., Polani, D., Nehaniv, C., 2007. Representations of space and time in the maximization of information flow in the perception-action loop. Neural Computation 19, 2387–2432.
Lund, O., Tamnes, C., Moestue, C., Buss, D., Vollrath, M., 2007. Tactics of hierarchy negotiation. Journal of Research in Personality 41, 25–44.
Lupyan, G., Dale, R., 2010. Language structure is partly determined by social structure. PloS One 5 (1), e8559.
McNamara, J., Leimar, O., 2010. Variation and the response to variation as a basis for successful cooperation. Philosophical Transactions of the Royal Society B 365, 2627–2633.
Miller, J., 2003. Game Theory at Work. McGraw-Hill, The McGraw-Hill Companies.
Nesse, R., 1991. What Good is Feeling Bad? The Sciences, November/December, pp. 30–37.
Nowak, M., Komarova, N., Niyogi, P., 2002. Computational and evolutionary aspects of language. Nature 417, 611–617.
Nowak, M., Tarnita, C., Antal, T., 2010. Evolutionary dynamics in structured populations. Philosophical Transactions of the Royal Society B 365, 19–30.
Obst, O., Polani, D., Prokopenko, M., 2009. Origins of scaling in genetic code. In: Proc. European Conference on Artificial Life, Budapest. Springer.
Oxford Dictionary of Biology, 2008, sixth ed. Oxford University Press.
Pease, A., Pease, B., 2003. The Definitive Book of Body Language. Pease International.
Penteriani, V., 2010. Arguments for the integration of the non-zero-sum logic of complex animal communication with information theory. Entropy 12, 127–135.
Perc, M., Szolnoki, A., 2008. Social diversity and promotion of cooperation in the spatial prisoner’s dilemma game. Physical Review E 77 (1–5), 011904.
Peters, F., 2010. Consciousness as recursive, spatiotemporal self location. Psychological Research Psychologische Forschung 74, 407–421.
Pinker, S., Jackendoff, R., 2005. The faculty of language: what’s special about it? Cognition 95, 201–236.
Polani, D., Martinetz, T., Kim, J., 2001. An information-theoretic approach for the quantification of relevance. In: Proceedings of the 6th European Conference on Advances in Artificial Life. Springer-Verlag, London, pp. 704–713.
Polani, D., Nehaniv, C., Martinetz, T., Kim, J., 2006. Relevant information in optimized persistence vs. progeny strategies. In: Rocha, L., Bedau, M., Floreano, D., Goldstone, R., Vespignani, A., Yaeger, L. (Eds.), Proc. Artificial Life X. MIT Press, Cambridge, MA, pp. 337–343.
Polani, D., 2009. Information: currency of life? HFSP Journal 3 (5), 307–316.
Prokopenko, M., Ay, N., Obst, O., Polani, D., 2011. Phase transitions in least-effort communications. Journal of Statistical Mechanics: Theory and Experiment P11025, 1–30.
Ridley, M., 1993. The Red Queen. Harper Perennial, Penguin Putnam.
Ridley, M., 2010. The Rational Optimist. Harper Collins Publishers, Fourth Estate.
Ryabko, B., Reznikova, Z., 2009. The use of ideas of information theory for studying “language” and intelligence in ants. Entropy 11, 836–853.
Salge, C., Polani, D., 2009. Information Theoretic Incentives for Social Interaction. Technical Report 495, Department of Computer Science, University of Hertfordshire, UK.
Salge, C., Polani, D., 2009b. Information-driven organization of visual receptive fields. Advances in Complex Systems 12 (3), 311–326.
Salge, C., Polani, D., 2011. Digested information as an information-theoretic motivation for social interaction. Journal of Artificial Societies and Social Simulation 14 (1), 5.
Santos, F., Santos, M., Pacheco, J., 2008. Social diversity promotes the emergence of cooperation in public goods games. Nature 454, 213–216.
Schmitt, D., 2005. Fundamentals of Human Mating Strategies. The Handbook of Evolutionary Psychology. John Wiley & Sons (Chapter 9).
Scott-Phillips, T., 2007. The social evolution of language, and the language of social evolution. Evolutionary Psychology 5 (4), 740–753.
Scott-Phillips, T., 2010. The evolution of relevance. Cognitive Science 34, 583–601.
Semple, S., Hsu, M., Agoramoorthy, G., 2010. Efficiency of coding in macaque vocal communication. Biology Letters. http://dx.doi.org/10.1098/rsbl.2009.1062.
Shannon, C., 1948. A mathematical theory of communication. The Bell System Technical Journal 27, 379–423.
Sperber, D., Wilson, D., 2002. Pragmatics, modularity and mind-reading. Mind & Language 17 (1 and 2), 3–23.
Suddendorf, T., Corballis, M., 2007. The evolution of foresight: what is mental time travel, and is it unique to humans? Behavioral and Brain Sciences 30, 299–351.
Suddendorf, T., Corballis, M., 2010. Behavioral evidence for mental time travel in non-human animals. Behavioral Brain Research 215, 292–298.
Szabo, G., Fath, G., 2007. Evolutionary games on graphs. Physics Reports 446, 97–216.
Tooby, J., Cosmides, L., 2005. Conceptual Foundations of Evolutionary Psychology. The Handbook of Evolutionary Psychology. John Wiley & Sons (Chapter 1).
Van Dijk, S., Polani, D., Nehaniv, C., 2010. What do You Want to do Today? Relevant-Information Bookkeeping in Goal-Oriented Behavior. Department of Computer Science, University of Hertfordshire, UK.
Weidenholzer, S., 2010. Coordination games and local interactions: a survey of the game theoretic literature. Games 1, 551–585.
Weir, K., 2011. Fickle Friends. Scientific American, Mind, May/June, p. 14.
Yang, C., 2010. Who is Afraid of George Kingsley Zipf? Department of Linguistics and Computer Science, University of Pennsylvania.
Zuroff, D., Fournier, M., Patall, E., Leybman, M., 2010. Steps toward an evolutionary personality psychology: individual differences in the social rank domain. Canadian Psychology 51 (1), 58–66.