A. Zilhao-Evolution, Rationality and Cognition_ a Cognitive Science for the Twenty-First Century...

Evolution, Rationality and Cognition

  • Evolution, Rationality and Cognition

    Evolutionary thinking has expanded in the latter decades, spreading from its traditional stronghold, the explanation of speciation and adaptation in biology, to new domains including the human sciences. The essays in this collection attest to the illuminating power of evolutionary thinking when applied to the understanding of the human mind.

    The contributors to Evolution, Rationality and Cognition use an evolutionary standpoint to approach the nature of the human mind, including both cognitive and behavioural functions. Cognitive science is by its nature an interdisciplinary subject and the essays use a variety of disciplines including the philosophy of science, the philosophy of mind, game theory, robotics and computational neuroanatomy to investigate the workings of the mind. The topics covered by the essays range from general methodological issues to long-standing philosophical problems such as how rational human beings actually are.

    This book will be of interest across a number of fields, including philosophy, evolutionary theory and cognitive science.

    António Zilhão is Associate Professor in Philosophy at the University of Lisbon.

  • Routledge studies in the philosophy of science

    1 Cognition, Evolution and Rationality
    A cognitive science for the twenty-first century
    Edited by António Zilhão

  • Evolution, Rationality and Cognition

    A cognitive science for the twenty-first century

    Edited by António Zilhão

  • First published 2005
    by Routledge
    2 Park Square, Milton Park, Abingdon, Oxon OX14 4RN

    Simultaneously published in the USA and Canada
    by Routledge
    270 Madison Ave, New York, NY 10016

    Routledge is an imprint of the Taylor & Francis Group

    © 2005 António Zilhão editorial matter and selection; the contributors their contributions

    All rights reserved. No part of this book may be reprinted or reproduced or utilized in any form or by any electronic, mechanical, or other means, now known or hereafter invented, including photocopying and recording, or in any information storage or retrieval system, without permission in writing from the publishers.

    British Library Cataloguing in Publication Data
    A catalogue record for this book is available from the British Library

    Library of Congress Cataloging in Publication Data
    A catalog record for this book has been requested

    This edition published in the Taylor & Francis e-Library, 2006.

    To purchase your own copy of this or any of Taylor & Francis or Routledge's collection of thousands of eBooks please go to www.eBookstore.tandf.co.uk.

    ISBN 0-203-01291-7 Master e-book ISBN

    ISBN 0-415-36260-1 (Print Edition)

  • Contents

    List of illustrations
    List of contributors
    Preface

    Editor's introduction
    ANTÓNIO ZILHÃO

    PART I
    Evolution

    1 Intelligent design is untestable: what about natural selection?
    ELLIOTT SOBER

    2 Social learning and the Baldwin effect
    DAVID PAPINEAU

    3 Signals, evolution, and the explanatory power of transient information
    BRIAN SKYRMS

    PART II
    Rationality

    4 Untangling the evolution of mental representation
    PETER GODFREY-SMITH

    5 Innateness and brain-wiring optimization: non-genomic nativism
    CHRISTOPHER CHERNIAK

    6 Evolution and the origins of the rational
    INMAN HARVEY

    PART III
    Cognition

    7 How to get around by mind and body: spatial thought, spatial action
    BARBARA TVERSKY

    8 Simulation and the evolution of mindreading
    CHANDRA SEKHAR SRIPADA AND ALVIN I. GOLDMAN

    9 Enhancing and augmenting human reasoning
    TIM VAN GELDER

    Index

  • Illustrations

    Figures

    1.1 In the process of SPD, the population begins at t0 with a sharp value

    1.2 In the process of PD, the population begins at t0 with a sharp value

    1.3 Which hypothesis, SPD or PD, confers the higher probability on the observed present phenotype?

    1.4 Given the observed fur length of present day polar bears and their close relatives, what is the best estimate of the trait values of the ancestors A1, A2, . . ., A5?

    1.5 Two problems in which one has to estimate the character of an ancestor, based on the observed value of one or more descendants

    3.1 Aumann's stag hunt

    3.2 Kreps' stag hunt

    3.3 Evolution of information

    3.4 Evolution of correlation I

    3.5 Evolution of correlation II

    3.6 Evolution of bargaining behaviors

    5.1 Complex biological structure arising directly from basic physics

    5.2 Runscreen for Tensarama, a force-directed placement algorithm for optimizing layout of ganglia of the nematode Caenorhabditis elegans

    5.3 Turing machine program that has been the contender for title of five-state busy-beaver maximally productive TM program without challenge for over a decade

  • Tables

    1.1 Which hypothesis, Design or Chance, confers the greater probability on the observation that the watch is made of metal and glass?

    1.2 Which hypothesis, Design or Chance, confers the greater probability on the observation that vertebrates have a camera eye?

    1.3 If polar bears now have fur that is 10 centimeters long, does the hypothesis of SPD or the hypothesis of PD render that outcome more probable?

    1.4 When a population evolves from its initial state I to its present state P, how will that trajectory be related to the putative optimal phenotype O specified by the hypothesis of SPD?

  • Contributors

    Christopher Cherniak is Professor of Philosophy at the Department of Philosophy, Committee on History and Philosophy of Science of the University of Maryland at College Park.

    Peter Godfrey-Smith is Associate Professor of Philosophy at Harvard University.

    Alvin I. Goldman is Professor of Philosophy and Research Scientist in Cognitive Science at Rutgers University.

    Inman Harvey is Senior Lecturer at the School of Cognitive and Computing Sciences of the University of Sussex and Senior Researcher at the Centre for Computational Neuroscience and Robotics at the same University.

    David Papineau is Professor of Philosophy of Science at King's College London.

    Brian Skyrms is Professor of Logic and Philosophy of Science at the School of Social Sciences of the University of California at Irvine.

    Elliott Sober is Hans Reichenbach Professor of Philosophy and Henry Villas Research Professor at the University of Wisconsin, Visiting Professor at the London School of Economics and Political Science, Fellow of the American Academy of Arts and Sciences and President of the Philosophy of Science Association.

    Chandra Sekhar Sripada completed an M.D. and an internship in psychiatry; he currently studies the philosophies of cognitive science and biology at Rutgers University.

    Barbara Tversky is Professor of Psychology at Stanford University.

    Tim van Gelder is Associate Professor (Principal Fellow) at the Department of Philosophy of the University of Melbourne and Director of the Australian Thinking Skills Institute.

    António Zilhão is Associate Professor at the Department of Philosophy of the University of Lisbon.

  • Preface

    The set of nine essays collected in this volume constitutes the proceedings of the Second International Cognitive Science Conference, jointly organized in the city of Oporto, Portugal, by the Portuguese Philosophical Society and the Abel Salazar Association, between 27 September and 29 September 2002. All of them were invited as original contributions. The essay by Brian Skyrms was in the meantime published in the journal Philosophy of Science (69(3): 407–28, 2002). I would like to thank both the author and the publisher, The University of Chicago Press, for their permission to reprint it in this volume.

    Acknowledgements are also due to a number of other people and institutions. First, I would like to thank Luísa Garcia Fernandes and the crew she gathered around the Abel Salazar Association, in Oporto, for their wonderful job in making sure that all went well with the logistics of the event. André Abath also provided invaluable help at different stages of the organization of the conference. I would also like to express my gratitude to Professor Emeritus M.D. Nuno Grande, a distinguished member of the Oporto Medicine Faculty, for his own support and the support of the University of Oporto and the City Council he was able to mobilize. And of course for the support of the association he leads, an association that bears the name of Abel Salazar, the Portuguese medical scientist, polymath and enthusiastic supporter of scientific philosophy under whose aegis the conference was placed. The Calouste Gulbenkian Foundation (FCG), The Portuguese-American Foundation for Development (FLAD) and the Portuguese Foundation for the Support of Science and Technology (FCT) all contributed with funding without which the conference could not have taken place.

    Special acknowledgements are due to David Papineau for his early support of the idea and for his advice and suggestions, and to Tony Bruce and Terry Clague for having welcomed the proposal of publishing this book as a volume in the series Routledge Studies in the Philosophy of Science. Finally, I am most pleased to thank all the speakers and contributors of essays for their coming to Oporto in 2002, and for their scientific effort, personal cooperation and remarkable patience.

    The Portuguese Philosophical Society's First International Cognitive Science Conference took place in Lisbon in May 1998; its proceedings were published by the Oxford University Press in 2001 under the title The Foundations of Cognitive Science. Both the First and the Second International Cognitive Science Conferences were generally held by those who attended them to have been major scientific events. They brought to Portugal an impressive array of prestigious cognitive scientists. This succession created the beginnings of a tradition. I trust this tradition in the making will be honoured by the current Direction of the Portuguese Philosophical Society with the organization of the Third International Cognitive Science Conference in 2006.

    António Zilhão
    Lisbon, Portugal

  • Editor's introduction

    António Zilhão

    The essays collected in this volume constitute the proceedings of the Second International Cognitive Science Conference, jointly organized in the city of Oporto, Portugal, by the Portuguese Philosophical Society and the Abel Salazar Association. All the papers read at this conference, held in September 2002, were invited contributions. The contributors are among the top world researchers in evolutionary thinking and cognitive science. The theme of the conference was 'Evolution, Rationality and Cognition: A cognitive science for the twenty-first century', which is also the title of this collection.

    The collection contains nine original essays. They cover a wide range of issues belonging to different provinces of knowledge. The issues covered vary from the evolutionary mechanisms that underlie the emergence of complex adaptive behaviours to the systematic errors in spatial memory and judgement that have been found in recent psychological research; from the optimization of the wiring layout of nervous systems to the status of folk psychology. The provinces of knowledge touched upon include philosophy of science, philosophy of biology, philosophy of mind, game theory, cognitive psychology, computational neuroanatomy, computer science and robotics.

    These essays constitute no random collection. Although the domains of enquiry these researchers work on differ widely, their thinking is united by a theoretical standpoint that shapes their essays essentially, namely, the evolutionary standpoint. This is the standpoint according to which the idea of evolution, besides explaining speciation and adaptation in biology, as has been traditionally acknowledged, also has a tremendously illuminating power in the human and behavioural sciences. This power is appropriately expressed in the motto Brian Skyrms included in the conclusion of his game-theoretical essay below: 'Evolution matters!' This community of approach thus ensures a unity that is much deeper than the apparent diversity brought about by the use of vocabularies and conceptual apparatuses belonging to scientific and philosophical disciplines as disparate as those mentioned above.

    The collection is broken up into three major parts, each comprising three essays. Part I deals with general questions of evolutionary theory. Part II focuses on the issue of rationality. Part III tackles some particular cognitive problems.

    The collection begins with a broad methodological essay, namely, Elliott Sober's 'Intelligent design is untestable: what about natural selection?' This is an essay that combines general topics in epistemology and philosophy of science with more specific topics in the philosophy of biology. The issue Sober addresses in his essay is: What are the criteria in terms of which it is possible to distinguish between what are adaptive hypotheses with real scientific value and what is mere adaptive storytelling? This is an issue adaptive thinking has to deal with right from the start. It is therefore a good way of starting an approach to the theme 'Evolution, Rationality and Cognition'.

    Elliott Sober starts his essay with a set of methodological claims. First, he claims that in order to evaluate any empirical hypothesis one has to determine its likelihood value. Second, he claims that the likelihood value of a given hypothesis is to be cashed out as the probability of the available evidence given the hypothesis. Third, he claims that testing a hypothesis essentially requires testing it against competitors. The corollary of these three methodological claims is the further claim that a tested hypothesis will prevail if its likelihood value is greater than the likelihood value of the rival hypotheses. Sober then tells us that these sound methodological principles were usually ignored by intelligent design theorists. These tended to use the 'What else could it be?' type of rhetorical question in order to drive their point home. But not all did so. Sober points out that enlightened intelligent design theorists such as Arbuthnot (1667–1735) and Paley (1743–1805) did realize the methodological fault contained in the 'What else could it be?' type of argument. These British creationists took chance to be the only competitor hypothesis imaginable; they therefore claimed to have proven the soundness of the design hypothesis by claiming that it had a higher likelihood value than chance.
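    The comparative procedure described above can be illustrated with a minimal sketch. It is not taken from Sober's essay: the evidence (60 successes in 100 trials), the binomial model and the two hypothesis names are assumptions chosen purely for illustration. The sketch computes the probability of the evidence under each rival hypothesis and reports which one confers the higher probability on it.

```python
from math import comb


def binomial_likelihood(successes: int, trials: int, p: float) -> float:
    """Probability of the evidence (successes out of trials) given success chance p."""
    return comb(trials, successes) * p**successes * (1 - p) ** (trials - successes)


def best_supported(likelihoods: dict[str, float]) -> str:
    """Return the hypothesis that confers the highest probability on the evidence."""
    return max(likelihoods, key=likelihoods.get)


# Hypothetical evidence: 60 successes in 100 trials (illustrative numbers only).
evidence = (60, 100)

# Two rival hypotheses; each must assign a definite probability to the evidence.
likelihoods = {
    "H_fair": binomial_likelihood(*evidence, p=0.5),
    "H_biased": binomial_likelihood(*evidence, p=0.6),
}

# A ratio above 1 means the evidence favours H_biased over H_fair.
likelihood_ratio = likelihoods["H_biased"] / likelihoods["H_fair"]
winner = best_supported(likelihoods)
print(winner, likelihood_ratio)
```

    The comparison only works because both hypotheses in the sketch assign a definite probability to the evidence; a hypothesis for which no such number can be computed cannot be entered into the comparison at all.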

    Although it is undoubtedly true that the likelihood value of chance producing complex adaptive design is very low, this proof fails because, according to Sober, it is simply not possible to ascribe any value to the likelihood of the design hypothesis in the absence of independent evidence concerning the characteristics of the designer. And if it is not possible to ascribe any likelihood value to the design hypothesis, it is not possible to claim that such a value is higher than the likelihood value of chance either, no matter how small the latter might be. Thus, contrary to the claims of Arbuthnot and Paley, the design argument, as they formulated it, is simply untestable.

    This fact notwithstanding, Sober claims that the methodological lesson of Arbuthnot and Paley should not be forgotten by evolution theorists. However, it is not uncommon for evolutionists to use the 'What else could it be?' rhetorical question when arguing for selectionist explanations of adaptive complexity. Sober thinks this is unfortunate. He then points out that there is a modern equivalent to the hypothesis of chance within the evolutionary framework, namely, the hypothesis of random genetic drift. The central contention of Sober's essay is then the following: evolution theorists should make sure that the likelihood value of their explanations of traits by natural selection is actually greater than the likelihood value of the alternative explanation according to which the trait to be explained happens to be the outcome of a process of pure random genetic drift.

    In the remainder of his essay, Sober illustrates by means of particular examples how an analysis of the comparative likelihood of a selectionist and of a pure drift hypothesis purporting to explain the presence of a particular trait in a species could be done. In the course of this analysis, he stresses two crucial points. First, the range of the concept of complexity should not be implicitly assumed to be congruent with the range of the concept of optimality, as frequently happens. Sober argues that no matter how complex a trait is, there may be independent evidence that it is not an optimal adaptation; and, if this is the case, its presence in the organism may confer a greater likelihood on the pure drift hypothesis than on the hypothesis of natural selection. Thus, complexity by itself is no sure evidence for natural selection. Second, frequently the relevant auxiliary information needed to carry out a likelihood analysis of a selectionist explanation of a trait against its competitors will simply not be available. Sober then concludes his essay by advising evolution theorists to learn to live with this possibility and to strive for more modest goals when this information is indeed not available.

    After the discussion of broad methodological issues in evolutionary theory, we turn to matters more specifically related to the study of complex adaptive behaviour. In 'Social learning and the Baldwin effect', David Papineau deals with a particularly difficult problem evolution theorists have to face when they try to understand the display of some particular succession of complex behaviours by an animal species. This problem is: How could such a succession ever have come about if each of the behaviours by itself would do no good to the animals and if it is impossible to imagine that the whole succession of behaviours came into being simultaneously? Papineau tries to find an answer to this puzzle by appealing to the so-called 'Baldwin effect'.

    The Baldwin effect was proposed more than one hundred years ago by the American psychologist James Mark Baldwin as a Darwinian mechanism that, under some conditions, might seem to corroborate the Lamarckian hypothesis that acquired characteristics could be inherited.

    How does the Baldwin effect work? The idea is the following. Imagine a population of animals well adapted to a particular environment. Suppose that, for some reason, the environment changes. Because of this change, some of the animals' typical behavioural strategies cease to be adaptive. Suppose now that some members of the population are able to learn during their lifetime new behaviours that fit their new environment. These individuals will then have a much better chance to survive and reproduce than those that were not able to learn the new behavioural strategies. Moreover, if the offspring of these individuals is able to learn the new tricks from their parents, then they will also have a much better chance to survive and reproduce than the offspring of those who have not learned the new tricks, and so on and so forth. Baldwin's idea is then that, under such circumstances, the population will have the chance to undergo genetic mutations that will allow the animals to display the new behavioural strategies without learning.
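    The sequence just described can be made concrete with a toy model. The following is only a sketch under loudly stated assumptions, not Baldwin's or Papineau's own formulation: three hypothetical types (non-learners, learners, and mutants that display the new behaviour innately), deterministic selection on relative fitness, a small cost of learning, and a fixed per-generation mutation rate from learner to innate. All parameter values are invented for illustration.

```python
def step(freqs, fitness, mutation_rate):
    """One generation: selection on all three types, then learner -> innate mutation."""
    mean_fitness = sum(f * w for f, w in zip(freqs, fitness))
    non_learner, learner, innate = (f * w / mean_fitness for f, w in zip(freqs, fitness))
    mutants = mutation_rate * learner  # assumed: only learners' offspring mutate to innate
    return [non_learner, learner - mutants, innate + mutants]


def simulate(generations=300, benefit=0.5, learning_cost=0.1, mutation_rate=1e-3):
    """Track frequencies of [non-learner, learner, innate] after the environment changes."""
    # Assumed fitnesses: the new behaviour pays `benefit`; learning it costs `learning_cost`.
    fitness = [1.0, 1.0 + benefit - learning_cost, 1.0 + benefit]
    freqs = [0.9, 0.1, 0.0]  # hypothetical starting population, no innate type yet
    history = [freqs]
    for _ in range(generations):
        freqs = step(freqs, fitness, mutation_rate)
        history.append(freqs)
    return history


history = simulate()
for generation in (0, 30, 300):
    print(generation, [round(x, 4) for x in history[generation]])
```

    In this toy setting the learners spread first and the innate type takes over later, but only because a cost of learning was assumed; with `learning_cost=0` the innate type would have no selective advantage over the learners, which is one way of stating the difficulty with the hypothesis.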

    There are two problems involved with Baldwin's hypothesis. The first is that it is not at all clear why the new successful behavioural strategies should become innate. If the population is able to learn them and to transmit this acquired knowledge to the next generation, what advantage could it gain from getting them genetically fixed? Losing flexibility is not supposed to be a good thing. The second problem: Even assuming that there is some advantage in getting the new behavioural strategies genetically fixed, why would the mutations allowing this genetic fixation to occur be more likely to happen in the individuals having learned the new strategies than in any others?

    Baldwin seems to have never provided a convincing answer to the first question. As to the second question, Baldwin's answer is implicit in the above description of the effect bearing his name. According to him, the animals capable of learning would be more likely to undergo the right mutations than the others simply because the animals unable to learn would be driven to extinction before they had any chance to undergo any mutations. The learning of the new behaviours would thus create, according to an expression coined by Godfrey-Smith, a 'breathing space' that would provide enough time for the right mutations to occur and disseminate across the population of learners. Such an answer, however, seems to rely on a view of natural selection as a process that works by killing off whole legions of maladapted organisms. However, the appropriate view of natural selection is as a process that affects the reproductive rates of populations. Although phenomena of mass extinction are indeed possible, they seem to be the exception rather than the rule. Be this as it may, Baldwin provides us with no intrinsic reason why we should expect that the acquisition of the new behaviours by learning would in any way contribute to the selection of the genes that would render them innate (besides, of course, by keeping the organisms alive and thus keeping all options open).

    In his essay, Papineau argues that there are indeed mechanisms, those of genetic assimilation and niche construction, in terms of which it is possible to find a convincing answer to this latter question. He claims further that there are cases of social learning in which these mechanisms of genetic assimilation and niche construction can be seen to operate. He then proceeds to analyse particular cases of social learning in some animal species and argues that these cases also provide us with an answer to the first question above: What advantage is there in genetically fixing a behavioural trait that can be learned?

    Thus, according to Papineau, the consideration of these cases allows us to understand how 'Baldwin effect' phenomena might account for at least some of the more mind-boggling evolutionary processes: those by means of which successions of innate complex adaptive behaviours can arise by natural selection.

    The last of the three essays included in Part I of this volume is Brian Skyrms's 'Signals, evolution, and the explanatory power of transient information'. This is an essay in evolutionary game theory. It is a contribution to an account of how communication systems might evolve in populations of differential replicators.

    In his famous 1969 essay Convention, David Lewis was able to show how a simple communication system can be modelled as a game-theoretical equilibrium and how such an equilibrium can remain stable in a population if all of its members share a common and identical interest in communicating the right information and if both common knowledge of the structure of the game and of rationality is assumed. The original selection of the signalling equilibrium embodying the communication system was, in turn, accounted for in terms of saliency. Criticisms of Lewis's model pointed out that, on the one hand, his assumptions of common knowledge of the structure of the game and of rationality were too strong to be empirically credible and that, on the other hand, some convincing story needed to be told about how any particular signalling system became salient in the first place. In his previous work, Skyrms showed that these criticisms can be met if the game-theoretical approach to signalling systems is conceived of in evolutionary terms rather than in terms of rational choice. Within the evolutionary framework, neither the strong assumptions of common knowledge of the structure of the game and of rationality nor salience are needed. An equilibrium may be simultaneously reached and selected among many other possible equilibria by the sheer dynamics of the process of differential reproduction.
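    How differential reproduction alone can select a signalling equilibrium may be sketched as follows. This is a minimal illustration and not Skyrms's own simulation code: it assumes the simplest common-interest sender-receiver game (two equally likely states, two signals, two acts), a discrete-time two-population replicator dynamics, and hypothetical starting frequencies that give one signalling system a slight head start. No common knowledge, rationality or salience enters the update rule.

```python
# Pure strategies for a 2-state, 2-signal, 2-act sender-receiver game (assumed setup).
SENDERS = [(0, 1), (1, 0), (0, 0), (1, 1)]    # signal sent in state 0, in state 1
RECEIVERS = [(0, 1), (1, 0), (0, 0), (1, 1)]  # act chosen on signal 0, on signal 1


def payoff(sender, receiver):
    """Chance that the act matches the state when both states are equally likely."""
    return sum(receiver[sender[state]] == state for state in (0, 1)) / 2


PAYOFF = [[payoff(s, r) for r in RECEIVERS] for s in SENDERS]


def replicate(freqs, fitness):
    """Discrete replicator step: each strategy grows in proportion to its fitness."""
    mean = sum(x * f for x, f in zip(freqs, fitness))
    return [x * f / mean for x, f in zip(freqs, fitness)]


def mean_payoff(senders, receivers):
    """Average communicative success in the two populations."""
    return sum(PAYOFF[i][j] * senders[i] * receivers[j]
               for i in range(4) for j in range(4))


def evolve(senders, receivers, generations=200):
    """Update sender and receiver populations simultaneously for some generations."""
    for _ in range(generations):
        sender_fit = [sum(PAYOFF[i][j] * receivers[j] for j in range(4)) for i in range(4)]
        receiver_fit = [sum(PAYOFF[i][j] * senders[i] for i in range(4)) for j in range(4)]
        senders = replicate(senders, sender_fit)
        receivers = replicate(receivers, receiver_fit)
    return senders, receivers


# Hypothetical starting mix, slightly biased towards the first signalling system.
senders, receivers = evolve([0.4, 0.2, 0.2, 0.2], [0.4, 0.2, 0.2, 0.2])
print([round(x, 3) for x in senders], round(mean_payoff(senders, receivers), 3))
```

    Which of the two signalling systems wins depends only on the starting frequencies in this sketch; swapping the bias towards the second sender and receiver strategies would select the other convention.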

    One of Lewis's assumptions remained undisputed though, namely, the assumption that all members of the relevant population share a common and identical interest in the occurrence of successful communication. But this assumption admits also being challenged as unrealistic. The Israeli evolutionary biologist Zahavi addressed this challenge. He concentrated his attention on the study of costly signals and pointed out that informative signalling is also bound to evolve under circumstances of unequal interests if we take the meaning of the signals to be the showing off that the sender is able to pay the cost of sending them. In his contribution to this volume, Skyrms goes one step further and challenges the idea that costliness is required for the emergence of meaningfulness under circumstances of unequal interests. He runs computer simulations of the evolutionary dynamics of different Stag Hunt and bargaining games to which costless pre-play signalling devoid of any pre-existent meaning was added. According to rational choice theory, such signals should never get any informative content at all and should thus remain completely ineffective.

    The results obtained in Skyrms's simulations contradict the expectations brought about by rational choice theory. Equilibria that would otherwise emerge are destabilized by the introduction of costless signalling and surprising new equilibria are created. Moreover, the relative magnitude of the original basins of attraction is also considerably shifted. Unless some alternative explanation is presented that is able to account for these effects, the results Skyrms obtained in his simulations seem to vindicate the thesis that costless signalling may become informative under conditions of unequal interests. If an evolutionary understanding of the emergence of human languages is to be achieved, this is an extremely important result.

    The second part of the volume begins with Peter Godfrey-Smith's 'Untangling the evolution of mental representation'. What is at stake in his essay is the ontogenetic onset of rationality. Godfrey-Smith begins tackling this issue by discussing the status of folk psychology and the nature of semantic properties. He tries to clarify this much debated problem by introducing an alternative understanding of the so-called 'theory-theory' approach and by suggesting a new way of regarding the relation that obtains between folk psychology and our inner cognitive mechanisms.

    The debate on this topic traditionally revolves around two issues. First, the issue of knowing what is the right way to account for our folk-psychological practices of interpreting actions as intentional; second, the issue of knowing what is the extent to which these practices accurately reflect the details of our inner cognitive mechanisms. Two views dominate this debate: the nativist view and the so-called interpretation stance view. According to the former view, folk psychology reflects a competence for the understanding of our conspecifics as intentional creatures we are innately endowed with; moreover, this competence is supposed to tell us something substantive about the underlying mechanisms subserving intentional action. According to the latter view, folk-psychological practices of action-interpretation are just a behaviour-dependent way of rationalizing our actions and they tell us nothing substantive about the cognitive mechanisms in question. This view is held by, e.g., Daniel Dennett. The former view admits being divided in turn into two main sub-views: the theory-theory approach and the simulationist approach. The theory-theory approach, held by, e.g., Jerry Fodor, claims that folk psychology is a descriptive theory innately realized in a module of our minds which is basically true of the inner cognitive mechanisms subserving our actions; the simulationist approach, held by, e.g., Alvin Goldman, claims that the folk-psychological interpretive competences we display result from an innate simulation ability by means of the exercise of which we end up understanding the mental lives of others by assuming that they undergo the same mental processes we do when we place ourselves in the situations they find themselves in.

    Godfrey-Smith's own alternative to the theory-theory approach consists in considering folk psychology to be a model, in the science-philosophical sense of the term, rather than a theory. As a model, folk psychology should be understood as an abstract structure, definable in terms of a characteristic set of elements and interrelations between them. Thus, by the age they begin to reason in intentional terms, children would not be displaying the command of a sophisticated theory of rationality; they would rather be acquiring a competence to reason according to such a loosely defined structure. Seen as a model, folk psychology is also not supposed to determine its own interpretation. Godfrey-Smith therefore thinks that the folk-psychological model is in fact compatible with almost all interpretations of it which have been put forth in the philosophical literature.

    The other most contentious issue in the debate regarding the status of folk psychology and the nature of semantic properties is the determination of the relation it has with the underlying cognitive mechanisms of the human mind. In this respect, Godfrey-Smith makes two distinct suggestions. The first is that if we assume, as all parties in the debate seem to do, that folk-psychological explanations have been around in our interpretive practices for a long time, then it will probably be the case that they have exerted some impact upon our cognitive mechanisms (and vice versa). The justification for this conclusion is simple: the cognitive mechanisms in question are meant to guide us in our social interactions; the environment in which these social interactions have been consistently taking place is an environment in which the expectations of others towards us and their explanations of our behaviour play a pre-eminent role; therefore, these cognitive mechanisms were exposed to natural selection in an environment shaped by folk-psychological practices. Thus, some sort of co-evolution of the two traits is to be expected, and it is highly unlikely that none of them somehow reflects the other.

    The second suggestion is about the precise nature of this reflection and is bound to be highly controversial. Contrary to standard theory-theorists, who claim that folk psychology results from an innate module of our mind that gets triggered in the course of the maturational process when children are around four years of age, Godfrey-Smith puts forth a sort of neo-Whorfian view according to which folk psychology exists primarily as a social and linguistic practice. However, as a consequence of the evolutionary interaction mentioned above, children rewire substantially the structure of their social thinking along folk-psychological lines by the age of four. It is such a rewiring that makes folk psychology true of them from then on. That is, the onset of rationality takes place at this stage as the consequence of a process of internalization. There is a sense in which Godfrey-Smith's proposal might be seen as reminiscent of Dennett's view of human consciousness. As a matter of fact, according to the latter, consciousness is the result of a massive reprogramming of the child's brain. This reprogramming is, in turn, induced by the child's submission to socially produced linguistic inputs.

    We might thus say that, according to Godfrey-Smith, the explanatory model and the inner reality end up matching each other, not because the explanatory model describes accurately a pre-existent reality it was meant to cognize, but rather because the inner reality transforms itself in order to adapt to a social reality shaped in agreement with the explanatory model.

    The scepticism towards the standard theory-theory approach to individual rationality, apparent in Godfrey-Smith's essay, is further developed in Christopher Cherniak's essay. Having previously produced an extensive body of work in computational neuroanatomy, Christopher Cherniak is concerned with the following question: What is the right level of structural complexity at which talk of optimization in a nervous system makes evolutionary sense? In his essay titled 'Innateness and brain-wiring optimization: non-genomic nativism' he uses formal tools developed in the area of Computer Science called component placement optimization in order to conclude that such talk is best suited to the hardware domain of wiring layout rather than to the software domain of abstract cognitive structure.

    The optimization observed in the wiring layout of different organisms' nervous systems cries out for an explanation. How is it that it might have come about? At this stage, Cherniak presents us with a curious analogy between computationalist views in the philosophy of mind and a fundamental assumption of modern genetics. The idea that the mind is best viewed as an abstract software structure, typical of functionalism, correlates well with the idea that the genome is a program that codes for the construction of a whole organism. However, just like the former, the latter idea needs to face some hard questions. One of them is: How much information is it really possible to compress in a genetic code? This question becomes even more poignant if we restrict our attention to particular organs. For instance, how much specific information for brain building can actually be coded in a genetic code? Cherniak's answer to this question is: probably not that much. He estimates that the amount of brain-specific DNA available might amount to as little information as is contained in a desk dictionary (about 50,000 entries), i.e., 100Mb total. Note that he is talking about the human brain here; arguably, the most complex physical structure known in the universe.

    How is this possible? Cherniak advances a bold thesis to answer this question. According to him, a significant part of an organism's anatomical structure is accounted for by optimization processes that are generated directly from underlying physical processes with no genomic intermediation. He speaks of a division of labour existing between the genome and these more basic physical processes. Nativists typically insist that the mind is no tabula rasa; Cherniak takes the underlying intuition a few steps further. According to him, the information contained in a genome does not fluctuate in some sort of ethereal information space either; rather it is inscribed in a particular type of matter already containing significant structural information; otherwise, it would have no chance of being effective.

    The extension of Cherniak's thesis to the rationality debate in cognitive science leads him to stress how profoundly hardware and software engineering differ. Moreover, this difference is, according to him, responsible for the abyss that separates the performances of the two domains over the last fifty years. As he puts it, quite crudely: 'If hardware had developed as has AI, we would still be using abacuses and sliderules; computers would merely be exotic laboratory confections.'

    He concludes his essay with a prediction that will not be particularly welcomed by the supporters of traditional cognitive science, i.e., that what the future, dominated by hardware engineering, has to offer us is probably the production of intelligent behaviour from opaque dynamical processes in which no states of the mechanisms can be neatly identified as representations or logical rules for processing them.

    Harvey's essay complements Cherniak's well. As a matter of fact, in 'Evolution and the origins of the rational', Harvey contends that mainstream philosophy of mind and cognitive science got trapped in a conceptual dead end because of careless use of intentional language in empirical research. He quotes Wittgenstein's famous metaphor of the fly trapped in the fly-bottle to describe the situation. And, in tune with Wittgenstein's therapeutic recommendations, he equally contends that the way to get rid of the insoluble mind-philosophical mysteries that, according to him, plague these views is by eschewing mentalist language altogether in the course of applied research.

    His contribution to this volume is an essay in evolutionary robotics. In it he tries to show how we can examine our assumptions regarding our everyday and philosophical uses of the language of intentionality and rationality by creating artificial life forms through evolution and observing their behaviour and interaction with each other.

    On the one hand, we might say that Harvey views his own work, developed within the dynamical systems approach, as being carried out from the perspective Cherniak refers to as the perspective typical of hardware engineering as opposed to that typical of software engineering. On the other hand, however, he explicitly describes it as being an approach to cognition that admits being seen as a sort of 'philosophy of mind with a screwdriver'. Such an approach, he contends, is more challenging than its armchair counterpart because its assumptions are tested in the construction of real physical devices. These devices are also subject to processes of artificial selection that mimic the Darwinian mechanisms existent in nature. The upshot of such processes is the evolution of animated creatures that are capable of simple goal-directed behaviours such as avoiding obstacles or approaching and fleeing targets. And, following in the wake of Cherniak's prediction, he explicitly contends that the inner architecture of the mechanisms subserving adaptive behaviour in evolved creatures is simply too opaque to be usefully described in our usual intentionality-laden cognitive vocabulary.

    The production of these artificial animated creatures plays a double theoretical role then. On the one hand, it is used to challenge the reasonableness of certain theoretical assumptions and preconceptions of the mainstream view by showing that their physical world implementation is simply either not feasible or just not appropriate. On the other hand, it can be used to prove that intelligent adaptive behaviour can be elicited from physically implemented cognitive architectures in which the attempt to isolate some discrete states as being the representations, beliefs or desires of the evolved device is simply hopeless. Moreover, given that these architectures were evolved out of completely random genotypes of artificial DNA by the pressures of artificial selection alone, the implication is that real-world cognitive capacities evolved in the earth's biosphere by a process of natural selection should be subserved by cognitive architectures of the same type. That is, this work can be used for mind-philosophical purposes as an argument by analogy.

    Of course, such a program is limited, for the moment at least, to the analysis, understanding and reproduction of relatively simple adaptive behaviours, such as those one is bound to find in bacteria, insects and other inferior animals. But Harvey claims that this is the sensible approach to take: only by beginning small and simple can one hope to achieve a proper understanding of the large and complex. The attempt to sidestep this stage of research by mainstream cognitive science and to try to model highly complicated patterns of human intelligent behaviour right from the start is, according to Harvey, another major factor in what he considers to be the quasi-paralysis afflicting research in traditional computationalist AI these days.

    The third and last part of the volume comprises three essays dealing with specific cognitive questions. These are space cognition, emotion recognition and the psychology of reasoning. Part III begins with Barbara Tversky's essay 'How to get around by mind and body: spatial thought, spatial action', an essay in the psychology of space cognition. This essay deals with the systematic errors that have been found in spatial memory and judgement and sketches a way of accounting for them.

    According to Barbara Tversky, the space of navigation has been studied by two research communities, each of them approaching the subject from a rather different angle. She calls one of these communities the mind community and the other the body community. According to her, psychologists belonging to the mind community have been concerned mainly with gathering knowledge from the analysis of spatial judgements made explicitly by human subjects. Still according to her, psychologists belonging to the body community have been concerned mainly with gathering knowledge from the analysis of animal spatial behaviour.

    Startlingly, these two communities seem to have been arriving at contradictory conclusions. As a matter of fact, whereas the mind community produces study after study in which more and more systematic errors in the spatial judgements of agents are unveiled, the body community constantly emphasizes the extreme accuracy and fine-tuning of animal spatial behaviour. This is a striking contrast, crying out for analysis and explanation. Barbara Tversky's essay is an attempt at providing us with one such explanation.


    Barbara Tversky claims that people think about space in a non-geometrical and non-isotropic way. As a matter of fact, according to her, one of the main aspects of people's spatial thinking is its hierarchical organization. This form of organization of space has a rationale: it helps keep track of correlations in memory, it helps retrieve them from there, and it also facilitates spatial inference. However, there is a trade-off here. Comprehensiveness and complete accuracy are sacrificed for manageability. A consequence of this trade-off is that once the structure of a particular organization of spatial thought is understood, it is possible to frame spatial questions in such a way that, in order to answer them rightly, the subject must violate the organizing hierarchy structuring his own spatial thinking. Not surprisingly, these questions are more often than not answered wrongly and subjects are led to fall into contradiction. Barbara Tversky claims that this is precisely the phenomenon the mind community has been unveiling. Hierarchical organization is just one of the several non-geometric and non-isotropic aspects of people's spatial perception and thinking though. There are others, leading to different kinds of systematic error in people's spatial judgements. What should be crucial here, however, is, according to Barbara Tversky, the realization that these errors are not inform; rather, they stem from characteristic patterns.

    The presence of spatial thinking in humans is not to be evolutionarily accounted for in terms of a need to answer correctly tricky questionnaires imagined by clever psychologists. Rather, it is there in order to allow us to get back home, to find the way to places where food is or to trace back escape routes from predators or enemies. As the body community has been consistently showing, animals, human or otherwise, are extremely good at doing this. Barbara Tversky tells us that members of this community tend to explain this performance in terms of both local sensory-motor couplings and an important reliance on local environmental cues that help correct them.

    In view of this, she claims that spatial thought cannot be adequately understood independently of spatial action. She then goes on to assert that, once such a coupled understanding is achieved, the theoretician must realize that the accuracy that is sacrificed by the sort of structure that underlies the organization of general spatial thinking is promoted contextually by the interaction of the agent with the environment. However, the fine-tuning brought about by the interaction with the environment does not affect the general mechanisms that are mobilized in order to produce idle armchair spatial judgements. This, in turn, explains why systematic errors in these judgements persist despite the existence of selective pressures that promote accuracy of navigation.

    Sripada's and Goldman's joint essay titled 'Simulation and the evolution of mindreading' deals with an already mentioned cognitive ability humans display: the ability to understand intentionally the behaviour of their conspecifics. They call this ability mindreading. More specifically, Sripada and Goldman are interested in two particular questions associated with mindreading: 'How do people do it?' and 'What might be the evolutionary background for the development of one such ability in us?'

    As previously mentioned, there is an ongoing mind-philosophical debate between supporters of different approaches to mindreading. The proper way of answering the first question above is obviously the focus of the dispute. Sripada and Goldman's essay is meant to be a contribution to this debate in that they provide an argument in favour of one of these approaches: the so-called simulation theory. However, they do not address here the topic that tends to be most hotly discussed in connection with this debate, namely, the topic of propositional attitude ascription. Neither is their argument a general one. Rather, it is restricted to the analysis of only one of the different mindreading tasks, namely, face-based emotion recognition. This is the reason why their essay is included in Part III of this collection instead of being presented in association with Godfrey-Smith's, Cherniak's and Harvey's essays in Part II.

    Face-based emotion recognition is the ability to ascribe the experiencing of particular emotions to other humans when one is confronted with their facial expressions. Sripada and Goldman's simulationist claim is then the following. Face-based emotion recognition is best accounted for in terms of a simulation process by means of which the reproduction of an emotional state gets triggered in the mind of the human observer when he is confronted with the emotionally laden facial expression of another human. The observer then ascribes to the target of his observation the experiencing of the emotional state he actually enacted in his own mind.

    Sripada and Goldman's claim is a purely empirical one. They therefore present experimental evidence in order to support it. The evidence in question is of two different kinds. First, the analysis of clinical stories of brain-damaged patients who became unable to feel a certain number of basic emotions (namely, fear, anger or disgust) as a consequence of their lesions. According to Sripada and Goldman, these clinical stories show more than impairment in experiencing the appropriate emotions under the appropriate circumstances. They also show that these patients became unable to detect these emotions in other people's faces. Second, an fMRI study of the experiencing of a particular emotion (disgust) by normal subjects. According to Sripada and Goldman, the neuroimaging produced in this study shows that the same areas of the brain are activated both when the subjects are experiencing disgust and when they are observing facial expressions of other people undergoing the experience of disgust.

    To what extent can it be said that this evidence supports the authors' claim? As Sober put it in the opening essay of this collection, testing an empirical hypothesis is testing it against its competitors in order to determine which of them has a greater likelihood value. The competitor hypothesis against which Sripada and Goldman are measuring their own hypothesis is the already mentioned theory-theory hypothesis. The question should then be rephrased thus: To what extent can it be said that the authors' claim confers a greater probability on the observations than the claim of the supporters of the theory-theory?

    Sripada and Goldman argue that the empirical findings they collected are precisely those that should be expected to be found under the assumption of the truth of their hypothesis. That is, they claim that it is a consequence of their hypothesis that the same mechanisms should be mobilized in order both to undergo a mental state and to detect it in a conspecific. They thus argue that the probability of the combined impairment given their hypothesis is extremely high. Similar considerations apply to the appraisal of the neuroimaging evidence. On the other hand, Sripada and Goldman claim that assuming the truth of the rival hypothesis does not probabilify the evidence to any relevant degree, given the distinction drawn by the theory-theory supporters between the information-based nature of the cognitive procedures assumed to be at work in the process of mindreading and the non-information-based nature of the processes that are assumed to underlie the experiencing of basic emotions.

    Besides resorting to clinical evidence, Sripada and Goldman also put forth an evolutionary argument in order both to back up their claim and to try to answer the second question above. This argument is twofold. On the one hand they claim that, contrary to theory-building, a simulation routine admits being understood as a fast and frugal heuristic, in the sense of the term coined by Gigerenzer and Todd. As such, it works by taking advantage of a stable property of the usual human environment (namely, the fact that other humans, endowed with similar cognitive apparatuses, are part of it), in order to deliver a reliable cognitive behaviour at low computational costs. On the other hand, they claim that there is a plausible route for the evolution of a simulation routine in humans, namely, an exaptation for the purpose of mindreading of the processes underlying the well-studied phenomenon of emotion contagion. Again, according to the authors, no such plausible evolutionary route can be foreseen for a theory-based approach to face-based emotion recognition.

    Finally, we get to the last essay in this collection. It is Tim van Gelder's 'Enhancing and augmenting human reasoning'. In this essay, van Gelder emphasizes not so much how the cognitive sciences help us reach an understanding of the human mind but rather how they may help us improve its capabilities. In particular, van Gelder is interested in the improvement of actual human reasoning. According to him, this is a desideratum that may best be achieved by a proper use of computer-supported argument mapping.

    Computer-supported argument mapping is a software package for producing and manipulating graphical presentations of reasoning structures on a computer screen. According to van Gelder, this tool helps improve people's reasoning skills in two ways. First, it helps enhance whatever reasoning skills people may already have, namely, those skills they unconsciously display in their everyday arguments and inferences. Second, it allows people to augment their reasoning abilities by helping them to perform more accurately and more extensively in this domain.


    Whether or not computer-supported argument mapping has the beneficial effects on human inferential performance van Gelder claims it to have is an empirical question that can only be decided by amassing large bodies of experimental data. Van Gelder refers to some supportive psychological studies performed at the University of Melbourne in which the impact of the introduction of this tool in the teaching of critical thinking was actually measured. However, as he himself acknowledges, more extensive research is still needed.

    Given that his conviction is that the conclusions of such future extensive research will only strengthen the partial results already observed at the University of Melbourne, van Gelder proceeds by trying to find an explanation for the beneficial effects the pedagogical use of this software package is supposed to have on actual reasoning skills. The explanation he comes up with is that these effects are a consequence of the more embodied character reasoning acquires when it is represented by means of computer-supported argument mapping techniques. This explanation is in turn to be understood against a theoretical background according to which our reasoning abilities were developed not ab ovo but by the requisitioning of the more basic sensory-motor abilities of our minds for this new job. The use of colours, lines, shapes, spatial distributions and other graphical devices in order to represent argument structure makes us feel more at home in reasoning tasks precisely because these are the aspects of the world our minds were primarily designed to attend to. Conversely, most people face difficulties when trying to reason properly in abstract terms because, in the absence of such an embodiment, traditional intellective procedures are felt as foreign by their minds.

    Van Gelder claims that radical changes in the equipment we use to help us reason may have not only the effect of advancing our capabilities but also the effect of transforming our minds. Thus he claims that, to this extent, these changes are bound to acquire a role in the evolution of human nature. Now, one such radical change has indeed happened in the past, namely, the introduction of writing. Van Gelder concludes his essay by suggesting that the regular use of computer-supported argument mapping will have as dramatic an effect on our future intellectual lives as the regular use of alphabetic writing had on the intellectual lives of our ancestors over three thousand years ago.

    I do not know how accurate this contention might be. But its boldness certainly closes this volume nicely.


  • Part I

    Evolution

    1 Intelligent design is untestable
    What about natural selection?

    Elliott Sober1

    The argument from design is best understood as a likelihood inference. Its Achilles heel is our lack of knowledge concerning the aims and abilities that the putative designer would have; in consequence, it is impossible to determine whether the observations are more probable under the design hypothesis than they are under the hypothesis of chance. Hypotheses about the role played by natural selection in the history of life also can be evaluated within a likelihood framework, and here too there are auxiliary assumptions that need to be in place if the likelihoods of selection and chance are to be compared. I describe some problems that arise in connection with the project of obtaining independent evidence concerning those auxiliary assumptions.

    1 What else could it be?

    Defenders of the design argument sometimes ask 'What else could it be?' when they observe a complex adaptive feature. The question is rhetorical; the point of asking it is to assert that intelligent design is the only mechanism that could possibly bring about the adaptations we observe. Contemporary evolutionists sometimes ask the same question, but with a different rhetorical point. Whereas intelligent design seems to some to be the only game in town, natural selection seems to others to be the only possible scientific explanation of adaptive complexity.

    I propose to argue that intelligent design theorists and evolutionists are both wrong when they argue in this way. Whenever a hypothesis confers a probability on the observations without deductively entailing them, evaluating how well supported the hypothesis is requires that one consider alternatives. Testing the hypothesis requires testing it against competitors. Developing this point leads to a recognition of the crucial mistake that undermines the design argument. The question then arises as to whether evolutionary hypotheses about the process of natural selection fall prey to the same error. Although I'll begin by emphasizing the parallelism between intelligent design and natural selection, I emphatically do not think that they are on a par. The relevant point of difference is that intelligent design, as a claim about the adaptive features of organisms, is, at least as it has been developed so far, an untestable hypothesis. Hypotheses describing the role of natural selection, on the other hand, can be tested. But how they are to be tested is an interesting question, as we shall see.

    2 Likelihood and intelligent design

    As mentioned, 'What else could it be?' is a rhetorical question, whose point is to assert that some favored mechanism is the only one that could possibly produce what we observe. This line of reasoning has a familiar deductive pattern, namely modus tollens:

    (MT)  If H were false, O could not be true.
          O is true.
          -------------------------------------
          H is true.

    Despite the allure of this line of reasoning, many defenders of the design argument have recognized that it is misguided. One of my favorite versions of the argument is due to John Arbuthnot (1710), who was clear about this point. Arbuthnot tabulated birth records in London over 82 years and noticed that in each year, slightly more sons than daughters were born. Realizing that boys die in greater numbers than girls, he saw that the slight bias in the sex ratio at birth gradually subsides until there are equal numbers of males and females at the age of marriage. Arbuthnot took this to be evidence of intelligent design; God, in his benevolence, wanted each man to have a wife and each woman to have a husband. To draw this conclusion, Arbuthnot considered what he took to be the relevant competing hypothesis: that the sex ratio at birth is determined by a chance process. Arbuthnot had something very specific in mind when he spoke of chance; he meant that each birth has a probability of 1/2 of being a boy and a probability of 1/2 of being a girl. Under the chance hypothesis, a preponderance of boys in a given year has the same probability as a preponderance of girls; there is, in addition, a third possibility that has a very small probability (e), namely, that there should be exactly as many boys as girls in a given year:

    Pr(more boys than girls are born in a given year | Chance)
        = Pr(more girls than boys are born in a given year | Chance)
        ≫ Pr(equal numbers of boys and girls are born in a given year | Chance) = e

    Thus, the probability that more boys than girls will be born in a given year, according to the Chance hypothesis, is a little less than 1/2. The Chance hypothesis therefore entails that the probability of there being more boys than girls in each of the 82 years is less than (1/2)^82 (Stigler 1986: 225-226).
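    The Chance calculation above can be checked numerically. The following is a minimal sketch, not part of Arbuthnot's or Sober's text: it assumes births are independent with probability 1/2 of a boy, and the yearly number of births (10,000) is a hypothetical figure chosen only for illustration. The function name prob_more_boys is likewise an assumption.

```python
from math import comb


def prob_more_boys(n_births: int) -> float:
    """Pr(more boys than girls | Chance) for n_births independent births,
    each a boy with probability 1/2 (the Chance hypothesis)."""
    if n_births % 2 == 0:
        # A tie is possible only when the number of births is even.
        tie = comb(n_births, n_births // 2) / 2 ** n_births
    else:
        tie = 0.0
    # By symmetry, the non-tie probability splits evenly between
    # "more boys" and "more girls".
    return (1.0 - tie) / 2.0


# Upper bound used in the text: (1/2)^82.
bound = 0.5 ** 82

# Hypothetical yearly number of births (an assumption, not a historical figure).
yearly = prob_more_boys(10_000)

# Probability of a male surplus in each of 82 independent years under Chance.
all_years = yearly ** 82

print(f"Pr(more boys in one year | Chance)  = {yearly:.6f}")
print(f"Pr(more boys in all 82 years)       = {all_years:.3e}")
print(f"Bound (1/2)^82                      = {bound:.3e}")
```

    Whenever a tie is possible, the yearly probability falls slightly below 1/2, so the 82-year probability falls below (1/2)^82, as the text states.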

    Arbuthnot did not use modus tollens to defend intelligent design; rather, he constructed a likelihood inference:


    (L)   Pr(Data | Intelligent Design) is very high.
          Pr(Data | Chance) < (1/2)^82.
          -------------------------------------
          The Data strongly favor Intelligent Design over Chance.

    Arbuthnot used a principle that later came to be called The Law of Likelihood (Hacking 1965; Edwards 1972; Royall 1997): the data lend more support to the hypothesis that confers on them the greater probability. Here and in what follows, I use the terms 'likelihood' and 'likely' in the technical sense introduced by R.A. Fisher (1925). The likelihood of a hypothesis is not the probability it has in the light of the evidence; rather, it is the probability that the evidence has, given the hypothesis. Don't confuse Pr(Data | H) with Pr(H | Data); the former is H's likelihood, while the latter is H's posterior probability. Understood in this way, Arbuthnot's argument does not purport to show that the sex ratio data he assembled was probably due to intelligent design. To obtain that result, he'd need further assumptions concerning the prior probabilities of the two hypotheses.2 I omit these in my reconstruction of the design argument because I don't see how they can be understood as objective quantities.
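    The difference between a likelihood and a posterior probability can be made concrete with a short sketch. Everything numerical here is an assumption made for illustration: the value 0.9 stands in for 'very high', the two priors are arbitrary, and the posterior is computed as if Design and Chance were the only two hypotheses, a simplification the text itself does not endorse. The function names are hypothetical.

```python
def likelihood_ratio(pr_data_given_h1: float, pr_data_given_h2: float) -> float:
    """Law of Likelihood: a ratio above 1 means the data favor H1 over H2."""
    return pr_data_given_h1 / pr_data_given_h2


def posterior_h1(prior_h1: float, pr_data_given_h1: float,
                 pr_data_given_h2: float) -> float:
    """Pr(H1 | Data) by Bayes' theorem, ASSUMING H1 and H2 are the only
    candidates (a simplifying assumption made for this sketch)."""
    prior_h2 = 1.0 - prior_h1
    numerator = prior_h1 * pr_data_given_h1
    return numerator / (numerator + prior_h2 * pr_data_given_h2)


# Illustrative likelihoods: 0.9 is a stand-in for "very high";
# (1/2)^82 is the upper bound on Pr(Data | Chance) from the text.
pr_data_design = 0.9
pr_data_chance = 0.5 ** 82

ratio = likelihood_ratio(pr_data_design, pr_data_chance)
print(f"Likelihood ratio (Design : Chance) = {ratio:.3e}")

# The same likelihoods yield very different posteriors under different priors.
for prior in (0.5, 1e-30):
    post = posterior_h1(prior, pr_data_design, pr_data_chance)
    print(f"prior = {prior:g}  ->  Pr(Design | Data) = {post:.6g}")
```

    The likelihood ratio is fixed by the two likelihoods alone, whereas the posterior moves with the prior. This is why the likelihood version of the argument stops short of saying which hypothesis is probably true.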

    The likelihood version of the design argument is modest. As just noted, it declines to draw conclusions about the probabilities of hypotheses. But it is modest in a second respect: it does not claim to evaluate all possible hypotheses. Arbuthnot considered Design and Chance, but could not have addressed the question of how Darwinian theory might explain the sex ratio. This question puzzled Darwin (1872) and was successfully analyzed by R.A. Fisher (1930) and then by W.D. Hamilton (1967). Thus, even if Arbuthnot is right that Design beats Chance, it remains open that some third hypothesis might trump Design. There is no way to survey all possible explanations; we can do no more than consider the hypotheses that are available. The idea that there is a form of argument that sweeps all possible explanations from the field, save one, is an illusion.3

    I conclude that the first premise in the modus tollens version of the design argument is false. It is false that Intelligent Design is the only process that could possibly produce the adaptations we observe. Long before Darwin, Chance was on the table as a possible candidate, and after 1859 the hypothesis of evolution by natural selection provided a third possibility. In saying this, I am not commenting on which of these three explanations is best. I am merely making a logical point. What we observe is possible according to all three hypotheses. We can't use modus tollens in this instance. Rather, we need to employ a comparative principle; the Law of Likelihood seems eminently suited to that task.


    3 What's wrong with the design argument?

    To explain what is wrong with the design argument as an explanation of the complex adaptive features that we observe in organisms, it is useful to consider an application of this style of reasoning that works just fine. Here I have in mind William Paley's (1802) famous example of the watch found on the heath. Construed as a likelihood inference, Paley's argument aims to establish two claims: that the watch's characteristics would be highly probable if the watch were built by an intelligent designer and that the characteristics would be very improbable if the watch were the product of chance. The latter claim I concede. But why are we so sure that the watch would probably have the features we observe if it were built by an intelligent designer?

    To clarify this question, let's examine Table 1.1, which illustrates a set of possibilities concerning the abilities and desires that the putative designer of the watch might have had. The cell entries represent which hypothesis (intelligent design or chance) confers the higher probability on the watch's being made of metal and glass.4 Which hypothesis wins this likelihood competition depends on which row and column is correct.5 The observation that the watch is made of metal and glass would be highly probable if the designer wanted to make a watch out of metal and glass and had the know-how to do so, but not otherwise. If we have no knowledge of what these goals and abilities would be, we will not be able to compare the likelihoods of the two hypotheses.

    The question we are now considering did not stop Paley in his tracks, nor should it have done. It is not an unfathomable mystery what goals and abilities the putative designer would have if the designer is a human designer. When Paley imagined walking across the heath and finding a watch, he already knew that his fellow Englishmen are able to build artifacts out of metal and glass and are rather inclined to do so. This is why he was entitled to assert that the probability of the observations, given the hypothesis of intelligent design, is reasonably high.


    Table 1.1 Which hypothesis, Design or Chance, confers the greater probability on the observation that the watch is made of metal and glass? That depends on the abilities and desires that the putative designer would have if he existed

                                            Desires: what does the putative designer
                                            want the watch to be made of?

    Abilities: what materials does the
    putative designer know how to use?      Metal and glass     Not metal and glass

    Metal and glass                         Design              Chance
    Not metal and glass                     Chance              Chance

  • The situation with respect to the eye that vertebrates have is radically different. If an intelligent designer made this object, what is the probability that it would have the various features we observe? The probability would be extremely low if the designer in question were an eighteenth-century Englishman. But we all know that Paley had in mind a very different kind of designer. The problem is that this designer's radical otherness put Paley in a corner from which he was unable to escape. He was in no position to say what this designer's goals and abilities and raw materials would be, and so he was unable to assess the likelihood of the design hypothesis in this case.

    The problem that Paley faced in his discussion of the eye is depicted in Table 1.2. If the putative designer were able to make the eye that vertebrates have (a camera eye) and wanted to do so, then Design would have a higher likelihood than Chance. But if the designer were unable to do this, or if he were able to do whatever he pleased but preferred giving vertebrates the compound eye now found in many insects, Chance would beat Design. Paley had no independent information about which row and which column is true (nor even about which are more probable and which are less).

    Thus, Paley's analogy between the watch and the eye is deeply misleading. In the case of the watch, we have independent knowledge of the characteristics the watch's designer would have if the watch were, in fact, made by an intelligent designer. This is precisely what we lack in the case of the eye. It does no good simply to invent assumptions about raw materials and desires and abilities; what is needed is independent evidence about them. Paley emphasizes in Natural Theology that he intends the design argument to establish no more than the existence of an intelligent designer, and that it is a separate question what characteristics that designer actually has. His argument runs into trouble because these two issues are not as separate as Paley would have liked.6

    The criticism I have just described of the design argument does not require us to consider Darwinian theory as an alternative explanation. We do

    Intelligent design is untestable 21

    Table 1.2 Which hypothesis, Design or Chance, confers the greater probability on the observation that vertebrates have a camera eye? That depends on the abilities and desires that the putative designer would have if he existed

                                          Desires: what kind of eye does the putative
                                          designer want vertebrates to have?

    Abilities: what kind of eye is the
    putative designer able to give to
    vertebrates?                          A camera eye    A compound eye

    A camera eye                          Design          Chance
    Only a compound eye                   Chance          Chance

  • not need an alternative explanation of the adaptive contrivances of organisms to see that the intelligent design hypothesis, at least as it was developed by Arbuthnot and Paley, and as it is put forward by present-day intelligent design theorists, is untestable.7

    4 The parallel challenge for selectionist explanations

    Just as hypotheses that postulate an intelligent designer cannot be justified by saying that no other process could possibly give rise to the adaptive features we observe, the same is true of hypotheses that appeal to the process of natural selection. In this case as well, we need to compare the likelihood of the hypothesis of natural selection with the likelihood of one or more alternative explanations. One obvious alternative is the idea of chance, which in modern evolutionary theory takes the form of the hypothesis of random genetic drift. The drift hypothesis says (roughly) that the alternative traits present in a lineage have nearly identical fitnesses and that the frequencies of traits in the population change by random walk. Here we may repeat what Arbuthnot said about chance in connection with sex ratio: it is very improbable (though not impossible) that the vertebrate eye should have the features we observe, if it arose by random genetic drift. We now need to consider what the probability of the eye's features is, if the eye was produced by natural selection. That turns out to depend on further assumptions. Of course, these further assumptions do not concern the raw materials, goals, and abilities that a putative designer might have. To make it easier to explain what these further assumptions are, I'm going to change examples for a while, from the much beloved vertebrate eye to the fact that polar bears have fur that is, let us say, 10 centimeters long. I'll return to the eye later on and describe how lessons drawn from thinking about bear fur apply to it.

    First, I need to clarify the two hypotheses I want to compare. I will assume that evolution takes place in a finite population. This means that there is an element of drift in the evolutionary process, regardless of what else is going on. The question is whether selection also played a role. So we have two hypotheses: pure drift (PD) and selection plus drift (SPD). Were the alternative traits identical in fitness or were there fitness differences (and hence natural selection) among them? I will understand the idea of drift in a way that is somewhat nonstandard. The usual formulation is in terms of random genetic drift; however, the problem I want to address concerns a phenotype: the evolution of fur that is 10 centimeters long. To decide how random genetic drift would influence the evolution of this phenotype, we'd have to know how genes influence phenotypes. How many loci influence this phenotype? Are the different loci additive in their effects on fur length? I am going to bypass these genetic details by using a purely phenotypic notion of drift: under the hypothesis of pure drift, a population's probability of increasing its average fur length by a small amount is the same as its


  • probability of reducing fur length by that amount.8 I'll similarly bypass the genetic details in formulating the hypothesis of selection-plus-drift; I'll assume that the SPD hypothesis identifies some phenotype (O) as the optimal phenotype and says that an organism's fitness decreases monotonically as it deviates from that optimal value. This means, for example, that if 12 centimeters is the optimal fur length, then 11 is fitter than 10, 13 is fitter than 14, etc.9 Given this singly-peaked fitness function whose optimum is O, the SPD hypothesis says that a population's probability of moving a little closer to O exceeds its probability of moving a little farther away. The SPD hypothesis says that O is an attractor in the lineage's evolution.10 For evolution to occur, either by pure drift or by selection plus drift, there must be variation. I'll assume that mutation always provides a cloud of variation around the population's average trait value; this assumption is amply justified by observations of many traits in many natural populations.

    We now need to assess the likelihoods of the two hypotheses. Given that present day polar bears have fur that is 10 centimeters long, what is the probability of this observation under the two hypotheses? The answer depends on the fur length that the ancestors of present day polar bears possessed and also on the optimal fur length toward which natural selection, if it occurred, would be pushing the lineage. Some of the options are described in Table 1.3. The lineage leading to the present population might begin with fur that is 2 or 8 or 10 centimeters long. And the optimal fur length might be 2 or 8 or 10 or 12 or 18 centimeters.

    Suppose that the population's present value of 10 centimeters also happens to be the optimal value; this is the situation represented by the third column of Table 1.3. In this case, the initial state of the lineage does not matter. Regardless of which row we consider, the hypothesis of selection-plus-drift has a higher likelihood than the pure drift hypothesis:


    Table 1.3 If polar bears now have fur that is 10 centimeters long, does the hypothesis of SPD or the hypothesis of PD render that outcome more probable? The answer depends on the lineage's initial state and on the fur length that would be optimal if selection were in operation. Cells that have SPD or PD in them describe which hypothesis is more likely. The answers for cells with O or U in them will be described presently; in these cases, the population either overshoots or undershoots the optimum postulated by the SPD hypothesis.

                               Possible optimal fur lengths
    Possible initial states    2      8      10     12     18

    2                          PD     O      SPD    U      U
    8                          PD     PD     SPD    U      U
    10                         PD     PD     SPD    U      U

  • polar bears have a higher probability of exhibiting a trait value of 10 if selection is pushing them in that direction than they would have if fur length were the result of pure drift. In contrast, suppose that 2 is the optimal fur thickness. If the lineage starts evolving with a trait value of 8, then selection would work against its increasing to a value of 10. Reaching a value of 10 would then be less probable under the selection-plus-drift hypothesis than it would be if the trait were subject to pure drift. This is why pure drift beats selection-plus-drift in the first column.

    The cells in Table 1.3 with O or U in them are harder to evaluate. Notice that these cells are of two types. If the population began with an initial state of 2 and the optimal value is 8, then the population has to overshoot this optimum if it is to exhibit a final state of 10. The second kind of case arises if the population begins with 2 and has 12 as its optimum; in this case, the population has to undershoot the optimum if it is to end up with a trait value of 10. These two harder cases, as well as the two easier cases already analyzed, exhaust the possibilities when selection is understood in terms of a monotonic fitness function. We can analyze all four cases at once by further investigating the implications of the two hypotheses.

    The dynamics of SPD are illustrated in Figure 1.1, adapted from Lande (1976). At the beginning of the process, at t0, the average phenotype in the

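    [Figure 1.1 graphic omitted. Vertical axis: probability. Horizontal axis: average phenotype in the population. The probability curves are labelled by time, from t0 through t3 and later, and the w-bar curve peaks at the optimum (opt.).]

    Figure 1.1 In the process of SPD, the population begins at t0 with a sharp value. As time passes, the mean of the distribution moves toward the optimum and the variance of the distribution increases.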

  • population has a sharp value. The state of the population at various later times is represented by different probability distributions. Notice that as the process unfolds, the mean value of the distribution moves in the direction of the optimum. The distribution also grows wider, reflecting the fact that the population's average phenotype becomes more uncertain as more time elapses. The speed at which the population moves toward the distribution centered at the optimum depends on the trait's heritability and on the strength of selection, which is represented in Figure 1.1 by the peakedness of the w-bar curve. The width of the different distributions depends on the effective population size and on the strength of selection; the larger the product of these two, the narrower the bell curve. In summary, SPD can be described as the shifting and squashing of a bell curve.11

    In contrast, the process of PD involves just the squashing of a bell curve; evolution in this case leaves the mean value of the distribution unchanged, although uncertainty about the trait's future state increases. In the limit of infinite time, the probability distribution is flat, indicating that all average phenotypes are equiprobable.12 I assume that the quantitative character cannot drop below 0 and that there is some upper bound on its value, e.g., that the lineage leading to present day polar bears cannot evolve fur that is more than, say, 100 centimeters.

    [Figure 1.2 graphic omitted. Vertical axis: probability. Horizontal axis: average phenotype in the population, from 0 to 100. The probability curves are labelled by time, from t0 through t3 and later.]

    Figure 1.2 In the process of PD, the population begins at t0 with a sharp value. As time passes, the mean of the distribution remains the same and the variance of the distribution increases.
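    The two processes described above can be sketched as a discrete random walk on the population's average fur length. This is a minimal illustration under stated assumptions, not the model of Lande (1976): the unit step size, the `bias` value of 0.2, and the choice to leave the value unchanged when a step would cross a bound are all hypothetical. Only the bounds of 0 and 100 come from the text.

```python
def evolve(initial, steps, optimum=None, bias=0.2, lo=0, hi=100):
    """Exact probability distribution of the average trait value after `steps` steps.

    optimum=None models pure drift (PD): up and down steps are equally probable.
    Otherwise (SPD) a step toward the optimum has probability 0.5 + bias.
    """
    dist = [0.0] * (hi - lo + 1)
    dist[initial - lo] = 1.0                  # sharp value at t0
    for _ in range(steps):
        nxt = [0.0] * len(dist)
        for i, p in enumerate(dist):
            if p == 0.0:
                continue
            x = lo + i
            if optimum is None or x == optimum:
                p_up = 0.5                    # no directional push
            elif x < optimum:
                p_up = 0.5 + bias             # selection pushes upward
            else:
                p_up = 0.5 - bias             # selection pushes downward
            up = min(i + 1, len(dist) - 1)    # a step past a bound leaves x unchanged
            down = max(i - 1, 0)
            nxt[up] += p * p_up
            nxt[down] += p * (1.0 - p_up)
        dist = nxt
    return dist


def mean_and_variance(dist, lo=0):
    """Mean and variance of a distribution over lo, lo + 1, ..."""
    mean = sum((lo + i) * p for i, p in enumerate(dist))
    var = sum(((lo + i) - mean) ** 2 * p for i, p in enumerate(dist))
    return mean, var


pd_mean, pd_var = mean_and_variance(evolve(50, 20))
spd_mean, spd_var = mean_and_variance(evolve(50, 20, optimum=60))
print(pd_mean, pd_var)     # mean stays at the initial value; variance has grown
print(spd_mean, spd_var)   # mean has shifted toward 60; variance is smaller than under PD
```

    Under these assumptions the PD distribution only spreads, while the SPD distribution both shifts toward the optimum and spreads less.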

  • We now are in a position to analyze when SPD will be more likely than PD. Figure 1.3a depicts the relevant distributions when there has been finite time since the lineage started evolving from its initial state (I). Notice that the PD distribution stays centered at I, whereas the SPD curve has moved in the direction of the putative optimum. Notice further that the PD curve has become more flattened than the SPD curve has; selection impedes spreading out. Figure 1.3b depicts the two distributions when there has been infinite time. The SPD curve is centered at the optimum while the PD curve is entirely flat. Whether finite or infinite time has elapsed, the likelihood analysis is the same: the SPD hypothesis is more likely than the PD hypothesis precisely when the population's actual value is close to the optimum. Of course, what "close" means depends on how much time there has been between the lineage's initial state and the present, on the intensity of selection (as measured by how peaked the w-bar function is in Figure 1.1), on the trait's heritability, and on the effective population size. Time, intensity of selection, and heritability are relevant to predicting how much the mean of the SPD curve will be shifted in the direction of the optimum; effective population size is relevant to predicting how much variance there will be around that mean value. For example, if infinite time has elapsed (Figure 1.3b), the SPD curve will be more tightly centered on the optimum, the larger the population is. If 10 is the observed value of our polar bears, but 11 is the optimum, SPD will be more likely if the population is small, but the reverse will be true if the population is large.
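    The closing example (observed value 10, optimum 11) can be worked through numerically for the infinite-time case. The sketch below is an illustration under assumptions that are not in the text: the SPD curve is approximated by a normal density centred on the optimum, its `spread` stands in for the effect of effective population size (a larger population means a smaller spread), and the two spread values are hypothetical. The flat PD curve over 0 to 100 follows the bounds given earlier.

```python
from math import exp, pi, sqrt

LOWER, UPPER = 0.0, 100.0   # bounds on fur length mentioned in the text


def pd_density(x):
    """Infinite-time PD: every admissible average phenotype is equiprobable."""
    return 1.0 / (UPPER - LOWER) if LOWER <= x <= UPPER else 0.0


def spd_density(x, optimum, spread):
    """Infinite-time SPD, approximated by a normal curve centred on the optimum."""
    z = (x - optimum) / spread
    return exp(-0.5 * z * z) / (spread * sqrt(2.0 * pi))


def more_likely(observed, optimum, spread):
    """Name the hypothesis conferring the higher probability density on `observed`."""
    if spd_density(observed, optimum, spread) > pd_density(observed):
        return "SPD"
    return "PD"


print(more_likely(10, 11, spread=2.0))   # SPD: small population, wide SPD curve
print(more_likely(10, 11, spread=0.2))   # PD: large population, narrow SPD curve
```

    With a wide SPD curve, an observation one centimeter from the optimum still counts as close; with a narrow one it falls in the tail, where the flat PD curve is higher.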

    [Figure 1.3 graphic omitted. Two panels, (a) Finite time and (b) Infinite time. Vertical axis: Pr. Horizontal axis: average phenotype in present population, from 0 to 100. Each panel shows an SPD curve and a PD curve, with I and O marked on the horizontal axis.]

    Figure 1.3 Which hypothesis, SPD or PD, confers the higher probability on the observed present phenotype? Whether the time between the initial state (I) of the population and the present observation is (a) finite or (b) infinite, SPD has the higher likelihood precisely when the present trait value is close to the optimum (O).

  • In summary, if we want to test SPD against PD as possible explanations for why polar bears now have fur that is 10 centimeters long, we need to know what the optimal phenotype would be if the selection hypothesis were true. If the optimum (O) turns out to be 10, we're done: SPD has the higher likelihood. However, if the optimum differs from 10, even a little, we need more information. If we can discover what the lineage's initial state (I) was, and if this implies that the population evolved away from the optimum, we're done: PD has the higher likelihood. But if our estimates of the values of I and O entail that there has been undershooting or overshooting, we need more information if we are to say which hypothesis is more likely. These four possibilities are summarized in Table 1.4. One surprise that emerges from this analysis is that undershooting is ambiguous: even if the population has evolved in the direction of the optimum, this, by itself, does not entail that SPD is more likely than PD.

    Earlier in this essay I chided the intelligent design theorist for simply assuming that the observed traits of organisms must be what the putative intelligent designer intended. The same epistemological point applies to the evolutionist. It does no good simply to assume that the observed fur length must be optimal because natural selection must have been the cause of the trait's evolution. This assertion is question-begging. What one needs is independent evidence about this and the other auxiliary assumptions that are needed for the two hypotheses to generate testable predictions. I argued before that the assumptions that the design hypothesis needs are not independently attested. Is the evolutionist in a better situation than the creationist in this respect?


    Table 1.4 When a population evolves from its initial state I to its present state P, how will that trajectory be related to the putative optimal phenotype O specified by the hypothesis of SPD? There are four possibilities to consider. In two of them, the relationship of I, P, and O determines which hypothesis confers the greater probability on the observed present state P; in the other two (when the population overshoots or undershoots the putative optimum) more information is needed to say which hypothesis is more likely.

    Case                                                     Order on the trait axis    Which hypothesis is more likely?

    (a) Present state coincides with the putative optimum    I, then P = O              selection-plus-drift
    (b) Population evolves away from the putative optimum    P, I, O                    pure drift
    (c) Population overshoots the putative optimum           I, O, P                    ?
    (d) Population undershoots the putative optimum          I, P, O                    ?
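    The four cases of Table 1.4 can be expressed as a small classification function. This is a sketch, and the function name and return labels are hypothetical. It treats an initial state that already sits at the optimum as a case of evolving away, which agrees with the corresponding cells of Table 1.3, and it rejects the situation in which the population shows no net change, since Table 1.4 lists no such ordering.

```python
def classify(initial, present, optimum):
    """Return (case, verdict) for the orderings of I, P and O listed in Table 1.4."""
    if present == optimum:
        return ("a", "selection-plus-drift")
    if present == initial:
        raise ValueError("no net change: not one of the four orderings in Table 1.4")
    moved = present - initial       # direction and size of the observed change
    needed = optimum - initial      # direction and size of the change selection favours
    if moved * needed <= 0:
        return ("b", "pure drift")  # evolved away from (or started at) the optimum
    if abs(moved) > abs(needed):
        return ("c", "more information needed")   # overshoot
    return ("d", "more information needed")       # undershoot


# Rows of Table 1.3 with initial states 2 and 8; the present state is 10.
for start in (2, 8):
    print(start, [classify(start, 10, opt)[0] for opt in (2, 8, 10, 12, 18)])
# 2 ['b', 'c', 'a', 'd', 'd']
# 8 ['b', 'b', 'a', 'd', 'd']
```

    Reading b as PD, a as SPD, c as O and d as U, the two printed rows reproduce the first two rows of Table 1.3.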

  • 5 Independent evidence about the population's earlier trait value and about the shape of the fitness function

    If present day polar bears all have fur that is 10 centimeters long, how are we to discover what the fitness consequences would be of having fur that is longer or shorter? The first and most obvious way to address this question is by doing an experiment. Let us dispatch a band of intrepid ecologists to the Arctic who will attach parkas to some polar bears, shave others, and leave others with their fur lengths unchanged. We then can monitor the survival and reproduction of these experimental subjects. This permits us to infer which fitness values attach to different fur lengths.

    There is a second approach to the problem of identifying the fitness function, one that is less direct and more theoretical. Suppose there is an energetic cost associated with growing fur. We know that the heat loss an organism experiences depends on the ratio of its surface area and its body weight. We also know that there is seasonal variation in temperature. Although it is bad to be too cold in winter, it is also bad to be too warm in summer. We also know something about the abundance of food. These and other considerations might allow us to construct a model that describes what the optimal fur length is; this model would not assume that the bears' actual trait value is optimal or close to optimal. This type of engineering analysis has been developed for other traits in other organisms;13 there is no reason in principle why it can't be carried out for the case at hand.

    Unfortunately, these two approaches face a problem. It is more obvious in connection with the experimental approach, but it attaches to both. The experiment, in the first instance, tells us about the fitness function that would be in place if there were variation in fur length among polar bears now. How is this relevant to our historical question concerning the processes that were at work as polar bears evolved? The same question attaches to the engineering approach, in that it uses assumptions about the other traits that polar bears have. For example, we probably will need information about the range of temperatures that exist in the bears' environment and about the bears' body mass and surface area. If we use data from current bears and their current environment, we need to consider whether these values provide good estimates of the values that were in place ancestrally.

    This leads to our second problem: how independent evidence about the lineage's ancestral fur length might be obtained. Of course, we can't jump in a time machine and go back and observe the characteristics present in the ancestors of polar bears. Does this mean that the lineage's initial state is beyond the reach of evidence? We know that polar bears and other bears share common ancestors and we know this independently of our question about why polar bears have fur that is 10 centimeters long. We can use other characteristics (for example, ones that have no adaptive significance for the organisms that have them) to infer the genealogical relationships that connect polar bears to other bears; this allows us to specify a phylogenetic


  • tree like the one depicted in Figure 1.4 in which polar bears and their relatives are tip species. We can then write down the fur lengths of polar bears and their near relatives on the tips of that tree. The character states we observe in these tip species provide evidence about the character states of the ancestors, represented by interior nodes. How might this inference from present to past be drawn?

    Before addressing that question, I want to describe why Figure 1.4 shows that our question about SPD versus PD needs to be spelled out in more detail. It is obvious that present day polar bears have multiple ancestors, each with their own trait values. If these were all known, the problem of explaining why polar bears now have fur that is 10 centimeters long would decompose into a number of sub-problems: why the fur length present at A5 evolved to the length present at A4, why A4's fur length evolved to the value found at A3, etc. SPD may be a better answer than PD for some of these transitions, but the reverse might be true for others. Furthermore, it is perfectly possible that SPD is better supported than PD as an answer to the question "Why do polar bears have fur that is 10 centimeters long, given that their ancestor Ai had a fur length of f1?" but that the reverse is true for the question "Why do polar bears have fur that is 10 centimeters long, given that their ancestor Aj had a fur length of f2?"

    Now back to the problem of inferring the character states of ancestors. One standard method that biologists use is parsimony: we are to prefer the


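    [Figure 1.4 graphic omitted: a phylogenetic tree whose tip species are polar bears (fur length 10) and five relatives (fur length 6 each); the interior nodes are the ancestors A1, A2, A3, A4 and A5.]

    Figure 1.4 Given the observed fur length of present day polar bears and their close relatives, what is the best estimate of the trait values of the ancestors A1, A2, . . ., A5? The most parsimonious hypothesis is that all of them had a trait value of 6.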

  • assignment of states to ancestors that minimizes the total amount of evolution that must have occurred to produce the trait values we observe in tip species. This is why assigning the ancestors in Figure 1.4 a value of 6 is said to have greater credibility than assigning them a value of 10. But why should we use parsimony to draw this inference? Does the Law of Likelihood justify the Principle of Parsimony? If not, does the principle have some other justification? Or is it merely an unjustifiable prejudice that leads us to prefer hypotheses that are more parsimonious? These are large questions, which I won't attempt to address in any detail here. However, a few points may be useful. First, it turns out that if drift is the process at work in a phylogenetic tree, then the most parsimonious assignment of trait values to ancestors (where parsimony means minimizing the squared amount of change) is also the hypothesis of maximum likelihood (Maddison 1991). On the other hand, if there is a directional selection process at work, parsimony and likelihood can fail to coincide (Sober 2002c). This point can be grasped by considering the problem depicted in Figure 1.5a. Two descendants have trait values of 10 and 6; our task is to infer the character state of their most recent common ancestor A. Notice that if there is very strong directional selection for increased fur length toward an optimum of, say, 20 in both lineages, then the setting of the ancestor that maximizes the probability that the descendants will obtain values of 10 and 6 will be something less than 6. The problem can be simplified even further, by considering just the single descendant and single ancestor depicted in Figure 1.5b. If the descendant has a trait value of 10, the most parsimonious assignment of character state to the ancestor is 10. But if the lineage has been undergoing strong selection for increasing its trait value, then the most likely assignment will be something


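    [Figure 1.5 graphic omitted: (a) an ancestor A? with two descendants whose trait values are 10 and 6; (b) an ancestor A? with a single descendant whose trait value is 10.]

    Figure 1.5 Two problems in which one has to estimate the character of an ancestor, based on the observed value of one or more descendants. In (a), the most parsimonious hypothesis is that A = 8; in (b), the most parsimonious hypothesis is that A = 10.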

  • less than 10. Imagine you have to swim across a river that has a very strong current. The way to maximize your probability of reaching a target on the other side is not to start directly across from it; rather, you should start a bit upstream.
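    The divergence between parsimony and likelihood under directional selection can be checked with a small calculation for the two problems of Figure 1.5. The sketch below rests on assumptions that are not in the text: each lineage takes a fixed number of unit steps (10 here), steps up occur with a constant probability `p_up`, lineages evolve independently given the ancestor, and bounds and the slowing of selection near the optimum are ignored. A `p_up` of 0.5 stands for drift and 0.8 for strong selection toward a distant optimum.

```python
from math import comb


def path_probability(start, end, steps, p_up):
    """Probability that a walk of `steps` unit steps leads from `start` to `end`."""
    net = end - start
    if abs(net) > steps or (steps + net) % 2:
        return 0.0                      # unreachable in exactly this many steps
    ups = (steps + net) // 2
    return comb(steps, ups) * p_up ** ups * (1.0 - p_up) ** (steps - ups)


def most_likely_ancestor(descendants, steps, p_up, candidates=range(0, 21)):
    """Ancestral value that maximizes the probability of the observed descendants."""
    def likelihood(ancestor):
        prob = 1.0
        for value in descendants:
            prob *= path_probability(ancestor, value, steps, p_up)
        return prob
    return max(candidates, key=likelihood)


print(most_likely_ancestor([10], 10, p_up=0.5))     # 10: drift, agrees with parsimony
print(most_likely_ancestor([10], 10, p_up=0.8))     # 4: selection, well below 10
print(most_likely_ancestor([10, 6], 10, p_up=0.5))  # 8: drift, agrees with parsimony
print(most_likely_ancestor([10, 6], 10, p_up=0.8))  # 2: selection, below both descendants
```

    Under drift the maximum-likelihood ancestor matches the most parsimonious one in both problems. Under the assumed directional push it lies below every observed descendant, which is the swimmer starting upstream.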

    It follows that parsimony does not provide evidence about ancestral character states that is independent of the hypotheses of chance and selection that we wish to test.14 This problem concerning how ancestral fur length is