Anomalous Behavior in a Traveller's Dilemma

download Anomalous Behavior in a Traveller's Dilemma

of 14

Transcript of Anomalous Behavior in a Traveller's Dilemma

  • 8/2/2019 Anomalous Behavior in a Traveller's Dilemma

    1/14

    American Economic Association

    Anomalous Behavior in a Traveler's Dilemma?Author(s): C. Monica Capra, Jacob K. Goeree, Rosario Gomez, Charles A. HoltReviewed work(s):Source: The American Economic Review, Vol. 89, No. 3 (Jun., 1999), pp. 678-690Published by: American Economic AssociationStable URL: http://www.jstor.org/stable/117040 .

    Accessed: 25/02/2012 04:40

    Your use of the JSTOR archive indicates your acceptance of the Terms & Conditions of Use, available at .http://www.jstor.org/page/info/about/policies/terms.jsp

    JSTOR is a not-for-profit service that helps scholars, researchers, and students discover, use, and build upon a wide range ofcontent in a trusted digital archive. We use information technology and tools to increase productivity and facilitate new forms

    of scholarship. For more information about JSTOR, please contact [email protected].

    American Economic Association is collaborating with JSTOR to digitize, preserve and extend access to The

    American Economic Review.

    http://www.jstor.org

    http://www.jstor.org/action/showPublisher?publisherCode=aeahttp://www.jstor.org/stable/117040?origin=JSTOR-pdfhttp://www.jstor.org/page/info/about/policies/terms.jsphttp://www.jstor.org/page/info/about/policies/terms.jsphttp://www.jstor.org/stable/117040?origin=JSTOR-pdfhttp://www.jstor.org/action/showPublisher?publisherCode=aea
  • 8/2/2019 Anomalous Behavior in a Traveller's Dilemma

    2/14

    Anomalous Behavior in a Traveler's Dilemma?By C. MONICACAPRA, JACOBK. GOEREE,RoSARIo GOMEZ,ANDCHARLESA. HOLT*

    The notion of a Nash equilibriumhas joinedsupply and demandas one of the two or threetechniquesthat economists instinctively try touse first in the analysis of economic interac-tions. Moreover, the Nash equilibrium andclosely relatedgame-theoreticconcepts arebe-ing widely appliedin othersocial sciences andeven in biology, where evolutionary stabilityoften selects a subset of the Nash equilibria.Manypeople areuneasy aboutthe starkpredic-tions of the Nash equilibrium n some contextswhere the extremerationalityassumptions eemimplausible.KaushikBasu's (1994) "traveler'sdilemma" s a particularly onvincing exampleof a case where the unrelenting ogic of gametheory is at odds with intuitive notions abouthumanbehavior.The story associatedwith thedilemma s thattwo travelerspurchase denticalantiques while on a tropical vacation. Theirluggage is lost on thereturn rip,and the airlineasks themto makeindependent laimsfor com-pensation.In anticipationof excessive claims,the airlinerepresentativeannounces:

    We know that the bags have identicalcontents,andwe will entertainany claimbetween $2 and $100, but you will eachbe reimbursedat an amount that equalsthe minimum f the two claims submitted.If the two claims differ, we will also paya reward of $2 to the personmaking thesmallerclaim and we will deduct a pen-alty of $2 from the reimbursemento thepersonmakingthe largerclaim.Notice that, ifrespectiveof the actual value ofthe lost luggage,there s a unilateralncentiveto"undercut"he other's claim. It follows from

    this logic thatthe only Nash equilibrium s forboth to make the minimum claim of $2. AsBasu (1994) notes, this is also the uniquestrictequilibrium,and the only rationalizable quilib-rium when claims arediscrete.Whenone of usrecentlydescribed this dilemmato an audienceof physicists, someone asked incredulously:"Isthis whateconomiststhinkthe equilibriums? Ifso, then we should shut down all economicsdepartments."The implausibilityof the Nash equilibriumprediction s based on doubtsthata small pen-alty and/orrewardcan drive claims all the wayto an outcome that minimizes the sum of theplayers' payoffs. Indeed, the Nash equilibriumin a traveler's dilemma is independentof thesize of the penalty or reward.Economic intu-ition suggests thatbehaviorconformsclosely tothe Nash equilibriumwhen the penalty or re-ward is high, but that claims rise to the maxi-mum level as the penalty/rewardparameterapproaches$0.Thispaperuses laboratoryxperimentsoeval-uatewhetheraverageclaimsare affectedby (the-oretically irrelevant)changes in the penalty/rewardparameter.The laboratory rocedures redescribedn SectionI. The secondandthirdsec-tionscontainanalysesof aggregate nd ndividualdata.SectionIV presentsa learing model that sused to obtainmaximum ikelihoodestimatesofthe learing and decisionerrorparamneters.hefifthsectionconsidersbehaviornthefinalperiodsaftermost learninghas occurred, .e., when aver-age claims stabilizeand behaviorconvergesto atypeof noisyequilibriumwhich combinesa stan-dard ogit probabilistic hoice rule with a Nash-like consistency-of-actions-and-beliefsondition.The leaming/adjustmentndequilibriummodelsarecomplementary,ndtogether heyarecapableof explaining omekey featuresof the data.Sec-tionVI concludes.

    L. ProceduresThe datawere collectedfromgroupsof 9-12subjects, with each group participatingin a

    * Capra:Department f Economics,Washington ndLeeUniversity,Lexington,VA 24450; Goeree and Holt: Depart-mentof Economics,Universityof Virginia,114 Rouss Hall,Charlottesville,VA 22901; Gomez:Department f Econom-ics, Universityof Malaga,29013 Malaga,Spain.Thisprojectwas funded n partby theNationalScienceFoundationGrantNos. SBR-9617784and SBR-9818683).We wish to thankPeter Coughlan, Rob Gilles, Susana Cabrera-Yeto, reneComeig,NadegeMarchand, ndtwo anonymous eferees orsuggestions.678

  • 8/2/2019 Anomalous Behavior in a Traveller's Dilemma

    3/14

    VOL 89 NO. 3 CAPRAET AL: ANOMALOUSBEHAVIORN A TRAVELER'S ILEMMA? 679series of traveler's dilemma games during asessionthat asted aboutan hourand a half.Thistype of experimenthad not been done before,and we felt that more would be learnedfromletting the penalty/rewardparameter,R, varyover a wide rangeof values. Therefore,we usedtwo high values of R ($0.50 and $0.80), twointermediate alues of R ($0.20 and$0.25), andtwo low values of R ($0.05 and $0.10). Thepenalty/rewardparameter alternatedbetweenhigh and low values in parts A and B of theexperiment.For example, session 1 began withR = $0.80 in part A, which lasted for 10periods.ThenR was lowered to $0.10 in partB.Subjects were recruited from economicsclasses at the University of Virginia, with thepromisethatthey would be paid a $6 participa-tion fee plus all additionalmoneyearnedduringthe experiment. Individual earnings rangedfrom about $24.00 to $44.00 for a session. Webegan by reading the instructionsfor part A(these instructionsare available from the au-thors on request).Althoughdecisions were re-ferred to as "claims," he earningscalculationswere explainedwithoutreference o the context,i.e., without mentioning luggage, etc. In eachperiod, subjects would record their claim ontheirdecision sheets,which were collected andrandomly matched (with draws of numberedping-pong balls) to determine the "other'sclaim"and"yourearnings,"andthe sheets werethen returned.Claims were required o be anynumberof cents between and including80 and200, with decimalsbeing used to indicatefrac-tions of cents. Subjects only saw the claimdecision made by the person with whom theywere matched n a given period.They were toldthatpartA would be followed by "anotherde-cision-makingexperiment"but were not givenadditional nformationabout partB. The pen-alty/rewardarameter as changed n partB, andrandompairwisematchingsweremade as before.PartB lasted or 10periods, xcept n thefirst wosessionswhere t lastedfor 5 periods.

    II. DataThe part A data are summarized in Fig-ure 1. Each line connects the period-by-periodaverages of the 9-12 subjects in each group.There is a different penalty/rewardparameter

    for eachcohort,as indicatedby the labelson the

    right. The data plots are boundedby horizontaldashedlines that show the maximum and min-imum claims of 200 and 80. The Nash equilib-rium prediction s 80 for all treatments.The twohighestlines in Figure1 plot the averageclaimsfor low reward/penaltyparametersof 5 and 10(cents). The first-periodaverages are close to180, and they stay high in all subsequentperi-ods, well away from the Nash equilibrium.Thetwo lowest lines representthe average claimsfor the higher penalty/reward arametersof 50and 80. Note that with these parameters, heaverage claims quickly fall toward the Nashequilibrium. For intermediate reward/penaltyparametersof 20 and 25, the average claimslevel off at about120 and 145 respectively.Theaverages in the last five periods are clearlyinversely related to the magnitudeof the penal-ty/rewardparameter, nd the n-ullhypothesisofno relation can be rejected at the 1-percentlevel. 1For some sessions, the switch in treatmentsbetween parts A and B caused a dramaticchange in behavior. In the two sessions usingR = 80 and R = 10, for example, thebehavior is ayproximatelyreversed, as shownin Figure 2. There is some evidence of asequence effect, since the average claimswere higher for R = 10 when this treatmentcame first than when it followed the R = 80treatmentthat "locked" onto a Nash equilib-rium. In fact, the sequence effect was sostrong in one session, with a treatmentswitchfrom R = 50 to R = 20, that the data did notrise in part B after converging to the Nashoutcome in part A. In all other sessions, thehigh-R treatment resulted in lower averageclaims, as shown in Table 1.Consideragainthe nullhypothesisof no treat-ment effect, underwhich higheraverageclaimsareequally ikelyin bothtreatments. he alterna-tive hypothesis s that averageclaims arehigher

    ' Of the 720 (=6!) ways that the 6 session averagescould have been ranked, here are only 6 possibleoutcomesthat are as extreme as the one observed (i.e., with zero orone reversals between adjacentR values). Under the nullhypothesis the probability of obtaining a rankingthis ex-treme is: 6/720, so the null hypothesis can be rejected(one-tailedtest) at the 1-percent evel.2 Obviously,the partB data have not settled down yetafter5 periods,and thereforewe decided to extendpartB to

    10 periods in subsequentsessions.

  • 8/2/2019 Anomalous Behavior in a Traveller's Dilemma

    4/14

    680 THEAMERICAN CONOMICREVIEW JUNE 1999

    cents maximum claim200 ------------R =190180 R = 10170160150 R = 25140130 -120 - R 20110100

    80 t RBo ------------------------------------ R 8 070- Nash equilibrium R8060 r50 1 2 3 4 5 6 7 8 9 10 period

    FIGURE 1. DATA FOR PART A FOR VARIOUS VALUES OF THE REWARD/PENALTY PARAMETER

    forthe treatmentswith a low value of R. The nullhypothesiscan be rejectedat a 3-percent ignifi-cance level using a standardWilcoxon (signed-rank)nonparametricest.Thusthetreatmentffectis significant, ven though t does not affect theNashequilibrium.Basically, the Nash equilibrium providesgood predictions for high incentives (R = 80and R = 50), but behavior is quite differentfrom the Nash predictionunder the treatmentswith low and intermediatevalues of R. In par-ticular,as shown in Figure 1, the data for thelow-R treatments s concentrated t theoppositeend of the range of feasible decisions. Basu's(1994) presentationof the traveler's dilemmainvolved low incentives relative to the rangeofchoices so, in this sense, the intuitionbehindthedilemma s confirmed.3Tosummarize, heNash

    equilibriumpredictionof 80 for all treatmentsfails to accountfor the most salientfeature ofthe data, the intuitive inverse relationship be-tween average claims and the parameter thatdetermines herelativecost of having thehigherclaim.Since the Nash equilibrium works well insome contexts, what is needed is not a radi-cally different alternative, but rather, a gen-eralization that conforms to Nash predictionsin some situations (e.g., with high-R values)and not in others. In addition, it would beinteresting to consider dynamic theories toexplain the patterns of adjustment in initialperiods when the data have not yet stabilized.Many adjustmenttheories in the literature arebased on the idea of movement toward a bestresponse to previously observed decisions.The next section evaluates some of these ad-justment theories and shows that they explaina high proportionof the directions of changes3Basu (1994) does not claim to offer a resolutionof theparadox, but he suggests several directions of attack.Loosely speaking, these approaches nvolve restricting n-dividualdecisions to sets, T, and T2 for players 1 and 2respectively,where each set contains all best responsestoclaimsin the other person'sset. Such sets may exist if opensets are allowed, or if some notion of "ill-definedcatego-

    ries" is introduced.Without further refinementthese ap-proaches do not predict the effects of the penalty/rewardparameteron claim levels.

  • 8/2/2019 Anomalous Behavior in a Traveller's Dilemma

    5/14

    VOL. 89 NO. 3 C2APRAT AL: ANOMALOUSBEHAVIORN A TRAVELER'S ILEMMA? 681

    cents200 -----------------------------------------------------190 Session1

    170 - SessioSesion2960 -.R =801 5 0 - ...........

    13 2 01 21 41120 Session 1 SespeiodFIUE21VRG LIS O AT10FSESO DR IE N SESO 2 DSHDLIE

    in individual claims, but not the strong effectof the penalty/rewardparameteron the levelsof average claims. Then in Sections IV and Vwe presentboth dynamic and equilibriumthe-ories thatare sensitive to the magnitudeof thepenalty/reward parameter.III. Patternsof IndividualAdjustment

    One approachto data analysis is based onthe perspective that people react to previousexperience via what is called reinforcementlearning in the psychology literature. In thisspirit, Reinhard Selten and Joachim Buchta(1994) consider a model of directional adjust-ment in response to immediate past experi-ence. The prediction is that changes are morelikely to be made in the direction of whatwould have been a best response to others'decisions in the previous period. The predic-tions of this "learning direction theory" are,therefore, qualitative and probabilistic. Thetheory is useful in that it provides the naturalhypothesis thatchanges in the "wrong"direc-tion are just as likely as changes in the "right"direction. This null hypothesis is decisivelyrejected for data from auctions (Selten andBuchta, 1994).To evaluate eamningirection heory,we cat-

    egorize all individual claims after the first pe-nod as eitherbeing consistentwith the theory,"+", or inconsistent,"-". Excluded from con-sideration are cases of no change, irrespectiveof whether or not these are Nash equilibriumdecisions. These cases of no change are classi-fied as "na" fornot applicable).Table 2 showsthe data classificationcounts by treatment.The(79, 21, 80) entry under the R = 5 columnheading, for example, means that therewere 79"+" classifications,21 "-" classifications, and80 "na" classifications. The percentage givenjustbelow thisentry ndicatesthat79 percentofthe "+" and "-" changes were actually "+".The percentages exclude the "na" cases fromthe denominator.The "percentageof +" rowindicatesthatsignificantlymore than half of theclassificationswere consistent with the learningdirectiontheory. Therefore,the null hypothesisof no difference can be rejected in all treat-ments, as indicatedby the "p-value"row.Note, however, that at least part of thesuccess of learning direction theory may bedue to a statistical artifact if subjects' deci-sions are draws from a randomdistribution,asdescribed for instance by the equilibriummodel in Section V. With random draws, theperson who has the lower claim in a givenperiod is more likely to be near the bottom of

  • 8/2/2019 Anomalous Behavior in a Traveller's Dilemma

    6/14

    682 THEAMERICANECONOMICREVIEW JUNE 1999TABLE I-AVERAGE CLAIMS IN TH{ELAST FIVE PERIODS FOR ALL SESSIONS

    Session 1 2 3 4 5 6High-Rtreatment 82 99 92 82 146 170Low-R treatment 162 186 86 116 171 196

    the distribution, and hence to draw a higherclaim in the next period. Similarly, the personwith the higher claim is more likely to draw alower claim in the following period. In fact, itcan be shown that if claims are drawn fromany stationary distribution, the probability is2/3 that changes are in the direction predictedby learning direction theory.4 But even thenull hypothesis that the fraction of predictedchanges is 2/3 rejected by our data at lowlevels of significance.One feature of learning direction theory inthis context is that noncritical changes in thepenalty/rewardparameterR do not change thedirectional predictions of the theory. This isbecause R affects the magnitudeof the incen-tive to change one's claim, but not the direc-tion of the best response. This feature isshared by several other directional best-response models of evolutionary adjustmentthat have been proposed recently, admittedlyin differentcontexts. For example, Vincent P.Crawford (1995) considered an evolutionaryadjustment mechanism for coordinationgames that was operationalized by assumingthat individuals switch to a weighted averageof their previous decision and the best re-sponse to all players' decisions in the previ-ous period. This adaptive learning model,which explains some key elements of adjust-ments in coordination game experiments, issimilar to directional learning with the extentof directional movements determined bythe relative weights placed on the previousdecision and on the best response in the ad-justment function. Another evolutionary for-

    mulation that is independent of the magnitudeof R is that of imitation models in whichindividuals are assumed to copy the decisionof the person who made the highest payoff.With two-person matchings in a traveler'sdilemma, the high-payoff person is always theperson with the lower claim, regardlessof theR parameter, so that imitation (with a littleexogenous randomness) will result in deci-sions that are driven to near-Nash levels.5 Toconclude, individual changes tend to be in thedirection of a best response to the other'saction in the previous period, but the strongeffect of the penalty/reward parameter onthe average claims cannot be explained bydirectional learning, adaptive learning (par-tial adjustmentto a best response), and imi-tation-based learning models.IV. A DynamicLearningModel with LogitDecisionErrorIn this section we presenta dynamicmodel inwhich players use a simple counting rule toupdatetheir(initially diffuse)beliefs about oth-ers' claims. The modeling of beliefs is impor-tant because people will wish to make highclaims if they come to expectthat otherswill dothe same. Although the only set of internallyconsistent beliefs andperfectlyrationalactionsis at the unique Nash equilibriumclaim of 80for all values of R, the costs of increasingclaims above 80 depend on the size of thepenalty/reward arameter. orsmallvaluesof Rsuchdeviationsarerelativelycostless and somenoise in decision-makingmay result in claimsthat are well above 80. As subjects enr:ounterhigher claims, the (noisy) best responsesto ex-pectedclaims may become even higher.In thismanner,a relativelysmallamountof noise may4 Supposethat a player'sdraw,x,, is less thanthe otherplayer's draw,y. Then the probabilityhat a next draw, x2,is higher than x, is given by: P[x2 > xIIy > xI] =P[x2 > xI, y > x,]IP[y > xI]. The numerator s equal tothe probability hatx, is the lowest of three draws,which is'A,and the denominator s equal to the probability hatxI isthe lowest of 2 draws,which is ?/2.So the relevant proba-

    bility is 2/3.

    5Paul Rhode andMark Stegeman (1995) and FernandoVega-Redondo (1997) have shown that this type of imita-tion dynamicwill driveoutputs n a Cournotmodel up to theWalrasian evels.

  • 8/2/2019 Anomalous Behavior in a Traveller's Dilemma

    7/14

    VOL.89 NO. 3 CAPRAET AL: ANOMALOUS EHAVIORNA TRAVELER'S ILEMMA? 683TABLE 2-CONSISTENCY OF CLAIM CHANGES WITH LEARNING DIRECTION THEORY

    AllR = 5 R= 10 R =20 R = 25 R =50 R-80 treatmentsNumbersof +,-, na 79, 21, 80 62,15,43 50,7,123 94, 23, 63 65,12,103 50, 3, 87 400, 81, 499Percentage

    of +a 79 81 88 80 84 94 83p-valueb

  • 8/2/2019 Anomalous Behavior in a Traveller's Dilemma

    8/14

    684 THEAMERICANECONOMICREVIEW JUNE 1999otherweightsremainunchanged.Theseweightstranslate nto belief probabilities,Pi(j, t), bydividing theweight of eachcategoryby the sumof all weights. The model is one of "fictitiousplay" in which a new observation s weightedby a learningparameter,p, that determinestheimportanceof a new observationrelativeto theinitial prior.A low value of p indicates "con-servative"behavior n that new informationhaslittle impact on a player's beliefs, which aremainly determinedby the initial prior.Once beliefs are formed,they can be used todetermine the expected payoffs of all the op-tions available.Since in our model each playerchooses among n possible categories, the ex-pectedpayoffs are given by the sum

    n(1) ,e(j,t)= r(j, k)Pi(k, t), j -1, n..nk=1

    where wi(j, k) is playeri's payoff from choos-ing a claim equal to j when the other playerclaims k.In a standardmodel of best-replydynamics,aplayer simply chooses the category thatmaxi-mizes the expected payoff in (1). However, aswe discussed above, the adjustmentsn such amodel will be independentof the magnitudeofthe key incentive parameter,R. We will there-fore allow players to make nonoptimal deci-sions, or "mistakes,"with the probabilityof amistake being inversely relatedto its severity.The specificparameterizationhat we use is thelogit rule, for which player i's decision proba-bilities, Di(j, t), are proportional o an expo-nential functionof expectedpayoffs:

    (2) Di(j, t) = exp(n(j, t)/p)exp(,a (k, t)l,ut)k= 1The denominator nsuresthatthe choice prob-abilities addupto 1, and,uis an errorparameterthat determines he effect of payoff differenceson choice probabilities.When , is small, thedecisionwith thehighest payoff is very likely tobe selected, whereas all decisions becomeequally likely (i.e., behavior becomes purelyrandom) n the limit as ,u tends to infinity.To

    summarize he key ingredientsof our dynamicmodel: (i) playersstartwith a uniformprioranduse a simple counting rule to updatetheir be-liefs; (ii) these beliefs determineexpectedpay-offs by (1); and (iii) the expected payoffs in turndetermineplayers'choice probabilitiesby (2)?9This "logit learning model"can be used toestimate he errorparameter,, andthe learningparameter,p. Recall that the probabilitythatplayer i chooses a claim in thejdi categoryinperiodt is given by Di(j, t), and the likelihoodfunctionis simply the productof the decisionprobabilities f the actualdecisionsmade for allsubjectsand all 10 periods.The maximumikeli-hood estimatesof the errorand learningparame-ters of the dynamic earningmodel are: = 10.9(0.6) and p = 0.75 (0.12), with standard rrorsshownin parentheses. he errorparameters sig-nificantlydifferentromthevalueof zero impliedby perfectrationality,which is not surprisingnlightof the cleardeviations romthe Nash predic-tions.10 f the learningparameterwere equal to1.0,each observation f another erson'sdecisionwouldbe as informative sprior nformation,o avalue of 0.7 meansthat the prior nformations

    9An alternativeapproachwould specify that the proba-bility of a given decision is an increasing functionof pay-offs that have been earnedwhen thatdecisionwas madeinthe past. Thus high-payoff outcomes are "reinforced." eeAlvin E. Roth and Ido Erev (1995) and Erev and Roth(1998) for a simulation-basedanalysis of reinforcementmodels in other contexts.10We also estimated he learningmodel for each sessionseparately,and in all cases the error parameterestimateswere significantlydifferentfrom zero, except for the R =20 session where the programdidnot converge.Recallthatthis treatmentwas the only one with an average claim thatwas out of the order that cofrespondsto the magnitudeofthe R parameter.The errorparameter stimates (with stan-dardefrors)for R = 5, 10, 25, 50, and 80 were 6.3 (1.0),4.0 (1.0), 16.7 (5.1), 6.8 (0.9), and 9.5 (0.7) respectively.These estimatesareof approximately he same magnitude,but some of the differences are statisticallysignificantatnormal evels, whichindicatesthat the learningmodel doesnot accountfor all of the "cohorteffects."These estimatesare, however, of roughlythe same magnitudeas those wehave obtained n othercontexts. Capraet al. (1998) estimatean errorparameterof 8.1 in an experimentalstudy of im-perfect price competition. The estimates for the LisaAndersonandHolt (1997) information ascadeexperimentsimply an errorparameterof about 12.5 (when payoffs aremeasured n cents as in this paper).RichardD. McKelveyand Thomas R. Palfrey (1998) use the Brandts and Holt(1996) signalinggamedata to estimate u= 10 (theyreportl/,u = 0.1).

  • 8/2/2019 Anomalous Behavior in a Traveller's Dilemma

    9/14

    VOL.89 NO. 3 CAPRAET AL.: ANOMALOUS EHAVIORNA TRAVELER'S ILEMMA? 685TABLE 3-PREDICTED AND ACTUAL AVERAGE CLAIMS

    R = 5 R= 10 R = 20 R = 25 R -= 50 R =80Average claim in period1 180 177 131 150 155 120Averageclaim for periods8-10 195 186 119 138 85 81Average simulatedclaim forperiod 1 171 166 156 150 118 97Averagesimulatedclaims forperiods8-10 178 170 155 148 107 85Logit equilibriumprediction 183 174 149 133 95 88Nash equilibriumprediction 80 80 80 80 80 80

    slightlystrongerhan he informationonveyed na single observation.Table 3 shows the relationshipbetween av-erage claims and simulation-basedpredictionsof the logit learningmodel.The first row showsaverage claims observedin the first period ofthe experiment,where claims are highest forR = 5 and R = 10, and lowest for R = 80.The second row shows the average observedclaims for the final threeperiods;we see thatclaimsriseslightlyfor the two low-R treatmentsandfall for thehigh-Rtreatments.The thirdrowshows the first-periodpredictions of the dy-namic model, based on the estimatederrorrateand the assumptionof uniform initial priors.These predictionsare also inversely relatedtothe level of R. The predictionsof the dynamicmodel for periods 8-10, shown in the fourthrow,areobtainedby lettinga computerprogramkeep trackof 10 cohortsof 10 simulatedsub-jects which begin with flat priors,make error-prone decisions, "see" the other's decision, andupdate beliefs before being rematched ran-domly withanother imulatedsubject.The sim-ulatedclaimsalso showa tendency or claimstoincrease for low-R values and decrease forhigh-Rvalues, but the treatment ffect is a littletoo flat relative to the actual data. To summa-rize,theparameterestimates or the logit learn-ing model can be used in simulations toreproducethe qualitative eatures of observedadjustmentpatterns and the inverse relation-ship between thepenalty/reward arameterandaverage claims.

    V. A Logit EquilibriumAnalysisAs playersgain experienceduringthe exper-

    iment, the prior nformationbecomes consider-

    ably less important. With more experience,there are fewer surpriseson average, and thisraises the issue of what happens if decisionsstabilize, as indicated by the relatively flattrends in the final periods of part A for eachtreatmentn Figure 1. An equilibrium s a statein which the beliefs reach a point where thedecision distributionsmatchthe belief distribu-tions, which is the topic of this section. Recallthat in the previous section's logit learningmodel, player i's belief probabilities,Pi(j, t)for thejth categoryin periodt, are used in theprobabilistic hoice function (2) to calculatethecorrespondingchoice probabilities,Di(j, t). Asymmetric ogit equilibrium s a situationwhereall players' beliefs have stabilizedat the samedistributions,so that we can drop the i and targumentsand simply equatethe correspondingdecision and belief probabilities:Di(j, t) =Pi(j, t) = P(j) for decision category j. Insuch an equilibrium, he equations n (2) deter-mine the equilibriumprobabilities (McKelveyand Palfrey, 1995, 1998).11The probabilitiesthat solve these equationswill, of course, de-pendon the penalty/reward arameter,whichisdesirablegiven the fact thatthis parameterhassuch a strong effect on the levels at whichclaims stabilize in the experiments.The equi-librium probabilitieswill also depend on theerrorparametern (2), which can be estimated

    1l The logit equilibriumhas been used to explain devi-ations from Nash behavior in some matrix games (RobertW. Rosenthal, 1989; McKelvey and Palfrey, 1995; JackOchs, 1995), in other games with a continuumof decisions,e.g., the "all-pay" auction (Simon P. Anderson et al.,1998a), public-goodsgames (Anderson et al., 1998b), andprice-choice games (Gladys Lopez, 1995; Capra et al.,1998).

  • 8/2/2019 Anomalous Behavior in a Traveller's Dilemma

    10/14

    686 THEAMERICAN CONOMICREVIEW JUNE 1999as before by maximizing the likelihood func-tion. Instead of being determined by learningexperience,beliefs are now determinedby equi-librium consistency conditions.'2

    It is clear from the data patternsin Figure1 that the process is not in equilibrium n theearly periods, as average claims fall for sometreatments and rise for others. Therefore, itwould be inappropriate o estimate the logitequilibriummodel with all dataas was done forthe logit learningmodel. We used the last threeperiods of data to estimate ,u - 8.3 with astandard rrorof 0.5. This errorparameter sti-mate for the equilibriummodel is somewhatlower than the estimate for the logit learningmodel (10.9). This differencemay be due to thefact that the learningmodel was estimatedwithdata from all periods,includingthe initial peri-ods where decisions show greater variability.Despite the difference in the treatmentof be-liefs, the logit learning and equilibriummodelshave similar structures,and are complementaryin the sense that the equilibriumcorresponds othe case where learning would stop havingmucheffect, i.e., wheredecision andbelief dis-tributionsare identical.Once the errorand penalty/rewardparame-ters are specified, the logit equilibrium equa-tions in (2) can be solved using Mathematica.Figure3 shows the equilibriumprobabilitydis-tributionsfor all treatments with A = 8.3.13

    These plots reveal a clear inverse relationshipbetweenpredictedclaims and the magnitudeofthe penalty/rewardparameter,as observed inthe experiment. In particular,notice that thenoise introduced n the logit equilibriummodeldoes more than spread the predictions awayfroma central endencyat the Nash equilibrium.In fact, for low values of R, the claim distribu-tions are centered well away from the Nashprediction,at the opposite end of the range offeasible choices.We can use the logit equilibriummodel (for= 8.3) to obtainpredictions or the last threeperiods. These predictions,calculated from theequilibriumprobabilitydistributions n Figure3, are found in the fifth row in Table 3. Thecloseness of the logit equilibriumpredictionsand the actual averages (row 2) for the finalthree periods is remarkable.In all cases, thepredictionsare much better than those of theNashequilibrium,whichis 80 for all treatments(row 6 in Table 3).14 To summarize,the esti-matederrorparameterof the logit equilibriummodel can be used to derive predictedaverageclaims that trackthe salient treatmenteffect onclaim data in the final threeperiods, an effectthat is not explainedby the Nash equilibrium.The logit-equilibriumapproachin this sec-tiondoes not explainall aspectsof the data. Forexample, the claims in part B are generallylower when precededby very low claims in acompetitivepartA treatment,as can be seen inFigure2. This cross-gamelearning,which hasbeen observedin otherexperiments, s difficultto model, and is not surprising.After all, theoptimaldecision dependson beliefs about oth-ers' behavior, and low claims in a previoustreatment an affect thesebeliefs. Beliefs wouldalso be influencedby knowingthe trueprice ofthe item that was lost in the traveler's dilemmagame.This truevaluemightbe a focal pointforclaimsmadein early periods.Anotheraspectofthe datathatis not explained by the logit equi-libriummodel is the tendencyfor a significant

    12 A theoreticalanalysis of the effectof R on equilibriumclaims can be based on a continuous formulation n whichprobabilitiesare replaced by a continuous density,f(x),with a distributioniunction, F(x). In equilibrium, theserepresentplayers' beliefs about others' claims, which de-terminethe expected payoff from choosing a claim of x,denoted by re(x). The expected payoffs deternine theclaim densityvia a continuous ogitchoice function: (x)=k exp(7Te(x)/tL),wherek is a constant of integration.Thisis not a closed-form solution for f(x), since the claimdistribution ffects the expected payoff function.Neverthe-less, it is possible to derivea numberof theoreticalproper-ties of the equilibriumclaim distribution.Anderson et al.(1998c) consider a class of auction-likegames that ncludesthe traveler'sdilemmaas a special case. For this class, thelogit equilibrium xists,and is uniqueand symmetricacrossplayers. Moreover, it is shown that, for any ,u > 0, anincrease n theR parameter esultsin a stochasticreductionin claims, in the sense of first-degree tochastic dominance.13The density for R = 80 seems to lie below that forR 50. What the figuredoes not show is thatthe densityfor R = 80 puts most of its mass at claims that are veryclose to 80 and has a much higher vertical intercept.

    14 We have no formal proofthat the belief distributionsin the logit learningmodel will convergeto the equilibriumdistributions, ut noticethat the simulatedaverageclaimsinrow 4 end up being reasonablyclose to the predictedequi-libriumclaims inrow 5, even afteras few as 8-10 simulatedmatchings,and the differenceis largely due to the highererrorparameter stimate for the dynamicmodel.

  • 8/2/2019 Anomalous Behavior in a Traveller's Dilemma

    11/14

    VOL.89 NO. 3 CAPRAETAL.:ANOMALOUSBEHAVIOR N A TRAVELER'S ILEMMA? 687

    D e n s i t y0.06

    0.05 R = 8 0

    0.04 R=50.03 R = 5 0 R = 1 00.02-R 210.01-

    080 90 100 110 120 130 140 150 160 170 180 190 200Nash Claim Claim Amount

    FIGURE 3. LOGIT EQUILIBRIUM CLAIM DENSITIES (WITH , = 8.3)

    fractionof the subjects o use the samneecisionas in thepreviousperiod.Othersubjectschangetheir decisions frequently,even when averagedecisions have stabilized. There seems to besome inertia n decision-making hatis not cap-turedby the logit model. Finally, separateesti-mates of the logit error parameterfor eachtreatmentreveal some differences. However,the estimates are, with one exception, of thesame order of magnitude and are similar toestimates thatwe have found for other games.

    VI. ConclusionBasu'straveler's ilemma s of interestbecausethe starkpredictions f the uniqueNash equilib-rium are at odds with most economists' ntuitionabouthow peoplewould behavein such a situa-tion. This conflictbetweentheoryand intuitionsespecially sharpfor low values of the penalty/

    rewardparameter,ince upwarddeviations rom

    the low Nash equilibrium laims are relativelycostless.Theexperiment eported ere s designedto exploit the invarianceof the Nash predictionwith respect o changes n the penalty/rewarda-rameter.The behavior of financiallymotivatedsubjectsconfirmedour expectation hat the Nashpredictionwould fail on two counts:claims werewell above the Nash prediction or some treat-ments,andaverageclaimswereinverselyrelatedto the value of the penalty/reward arameter.Moreover,these results cannot be explainedbyany theory, static or dynamic, that is based on(perfectly ational)best responses o a previouslyobserved laim,since the bestresponse o a givenclaim s independentf the penalty/rewardaram-eter n the traveler'sdilemmagame.Inparticular,the strongtreatment ffects are not predictedbylearning direction theory, imitation theories,orevolutionarymodels that specify partial adjust-ments to best responsesto the most recent out-come.

  • 8/2/2019 Anomalous Behavior in a Traveller's Dilemma

    12/14

    688 THEAMERICANECONOMICREVIEW JUNE 1999

    AverageClaim200 -- i -- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -1s8180X160X

    140120 - L o g i t Prediction1001 _80 ------------ ----------------------------Nash Prediction 060 5 10 20 25 50 80Penalty/RewardParameter

    FIGURE 4. PREDICTED AND ACTUAL AVERAGE CLAIMS FOR THE FINAL THREE PERIODSNote: Dots representaverageclaims for each of the treatments.

    The Nash equilibrium s the central organiz-ing concept in game theory, and has been forover 25 years. This approach should not bediscarded; t has workedwell in many contexts,and here it works well for high values of thepenalty/reward parameter. Rather, what isneeded is a generalization that includes theNashequilibriumas a specialcase, and thatcanexplain why it predicts well in some contextsand not others. One alternativeapproach s tomodel the formation of beliefs about others'decisions, and we implementthis by estimatinga dynamic learning model in which playersmake noisy best responses to beliefs thatevolve, using a standard logit probabilistic

    choice rule. In an equilibrium where beliefsstabilize, the belief and decision distributionsare identical, although the probabilisticchoicefunction will keep injecting some noise into thesystem.The logit equilibrium model uses thelogit probabilistic choice function to deter-mine decisions, while keeping a Nash-likeconsistency-of-actions-and-beliefs condition.This model performs particularly well in thetraveler' dilemma game, where the Nash pre-dictions are at odds with both data and intu-ition about average claims and incentiveeffects. Consider the results for each treat-ment, as shown by the dark dots in Figure

  • 8/2/2019 Anomalous Behavior in a Traveller's Dilemma

    13/14

    VOL.89 NO. 3 CAPRAET AL: ANOMALOUSBEHAVIORN A TRAVELER'S ILEMMA? 6894 that represent average claims for the finalthreeperiods plotted above the correspondingpenalty/reward parameter on the horizontalaxis. If one were to draw a freehand linethrough these dots, it would look approxi-mately like the dark curved line, which is infact the graphof the logit equilibriumpredic-tion as a function of the R parameter(calcu-lated on basis of the estimated value of thelogit error parameter for the equilibriummodel).15 Even the treatment eversal betweenR valuesof 20 and 25 seems unsurprising iventhe closeness of these two treatmentson thehorizontalaxis and the flatness of the densitiesfor these treatmentsn Figure3. Recall thattheNash prediction s 80 (the horizontalaxis) forall treatments.Thus the data are concentrated tthe oppositeend of the rangeof feasible claimsfor low values of the penalty/reward arameter,which cannot be explained by adding errorsaround he Nash prediction.These data patternsare well explainedby the logit equilibriumandlearningmodels.

    RE1FERENCESAnderson, Lisa and Holt, Charles A. "Informa-tion Cascades in the Laboratory."AmericanEconomic Review, December 1997, 87(5),pp. 847-62.Anderson, Simon P.; Goeree, Jacob K. and Holt,CharlesA. "Rent Seeking with BoundedRa-tionality: An Analysis of the All-Pay Auc-tion." Journal of Political Economy, August1998a, 106(4), pp. 828-53.. "A Theoretical Analysis of Altruismand Decision Error n Public Goods Games."Journal of Public Economics, November1998b, 70, pp. 297-323.

    . "Logit Equilibria for Auction-LikeGames." Working paper, University of Vir-ginia, 1998c.Basu, Kaushik."TheTraveler'sDilemma: Para-doxes of Rationalityn Game Theory."Amer-ican Economic Review, May 1994 (Papers

    and Proceedings), 84(2), pp. 391-95.Brandts, Jordi and Holt, Charles A. "NaiveBayesian LeamingandAdjustments o Equi-librium n Signaling Games."Workingpaper,Universityof Virginia, 1996.Camerer,ColinandHo, Teck-Hua."ExperienceWeighted Attraction Learning in Normal-Form Games." Econometrica, 1999 (forth-coming).Capra,C. Monica;Goeree,JacobK.; Gomez,Ro-sario and Holt, Charles A. "Leaming andNoisy EquilibriumBehavior in an Experi-mental Study of ImperfectPrice Competi-tion."Workingpaper,Universityof Virginia,1998.Chen,Yan andTang,Fang-Fang."Learning ndIncentive CompatibleMechanisms or PublicGoods Provision: An ExperimentalStudy."Journal of Political Economy, June 1998,106(3), pp. 633-62.Cooper, David J.; Garvin, Susan and Kagel,John H. "Signalling and Adaptive Learningin an Entry Limit Pricing Game." RandJournal of Economics, Winter 1997, 28(4),pp. 662-83.Crawford,Vincent P. "Adaptive Dynamics inCoordinationGames." Econometrica,Janu-

    ary 1995, 63(1), pp. 103-44.Erev, Ido and Roth, Alvin E. "Predicting HowPeople Play Games: Reinforcement Learn-ing in Experimental Games with Unique,Mixed Strategy Equilibria."AmericanEco-nomic Review, September 1998, 88(4), pp.848-81.Fudenberg,Drew and Levine,DavidK. Learningin games. Cambridge,Massachusetts: MITPress, 1998.Lopez,Gladys."QuantalResponse Equilibria orModels of Price Competition."Ph.D. disser-tation, Universityof Virginia, 1995.Luce, R. Duncan.Individual choice behavior.New York:Wiley, 1959.McKelvey,Richard D. and Palfrey, Thomas R."Quantal Response Equilibria for NormalFormgames."Games and Economic Behav-ior, July 1995, 10(1), pp. 6-38.

    . "QuantalResponse Equilibria or Ex-tensive Form Games." ExperimentalEco-nomics, 1998, 1(1), pp. 9-41.Mookherjee,Dilip and Sopher, Barry. "Learn-ing and Decision Costs in ExperimentalConstantSum Games."Games and Economic

    '" Since we estimated a virtually identical error ratefor a different experiment in Capra et al. (1998), thepredictions in Figure 4 could have been derived from adifferent data set.

  • 8/2/2019 Anomalous Behavior in a Traveller's Dilemma

    14/14

    690 THEAMERICAN CONOMICREVIEW JUNE 1999Behavior,April1997, 19(1), pp. 97-132.Ochs, Jack. "Games with Unique Mixed Strat-egy Equilibria: An Experimental Study."Games and Economic Behavior, July 1995,10(1), pp. 202-17.Rhode, Paul and Stegeman, Mark. "Non-Nash Equilibria of Darwinian Dynamics(with Applications to Duopoly)." Workingpaper, Virginia Polytechnical Institute,1995.Rosenthal,RobertW. "A Bounded-RationalityApproach to the Study of NoncooperativeGames." nternationalJournalof Game The-ory, 1989, 18(3), pp. 273-91.Roth, Alvin E. and Erev, Ido. "Learningn Ex-

    tensive-FormGames:ExperimentalData andSimple Dynamic Models in the IntermediateTerm."Gamesand EconomicBehavior, Jan-uary 1995, 8(1), pp. 164-212.Selten,Reinhardand Buchta,Joachim."Experi-mental Sealed Bid First Price Auctions withDirectly Observed Bid Functions."Workingpaper,Universityof Bonn, 1994.Smith,VernonL. and Walker,James M. "Mon-etary Rewardsand Decision Cost in Experi-mentalEconomics:An Extension,"Workingpaper,Universityof Arizona, 1997.Vega-Redondo,Fernando. "The Evolution ofWalrasianBehavior."Econometrica,March1997, 65(2), pp. 375-84.