reed.edureed.edu/psychology/hackenberg/pubs/hackenberg_token_review_2009.pdf · This paper is...

30
TOKEN REINFORCEMENT: A REVIEW AND ANALYSIS TIMOTHY D. HACKENBERG UNIVERSITY OF FLORIDA Token reinforcement procedures and concepts are reviewed and discussed in relation to general principles of behavior. The paper is divided into four main parts. Part I reviews and discusses previous research on token systems in relation to common behavioral functions—reinforcement, temporal organization, antecedent stimulus functions, and aversive control—emphasizing both the continuities with other contingencies and the distinctive features of token systems. Part II describes the role of token procedures in the symmetrical law of effect, the view that reinforcers (gains) and punishers (losses) can be measured in conceptually analogous terms. Part III considers the utility of token reinforcement procedures in cross-species analysis of behavior more generally, showing how token procedures can be used to bridge the methodological gulf separating research with humans from that with other animals. Part IV discusses the relevance of token systems to the field of behavioral economics. Token systems have the potential to significantly advance research and theory in behavioral economics, permitting both a more refined analysis of the costs and benefits underlying standard economic models, and a common currency more akin to human monetary systems. Some implications for applied research and for broader theoretical integration across disciplines will also be considered. Key words: token reinforcement, conditioned reinforcement, symmetrical law of effect, cross-species analysis, behavioral economics _______________________________________________________________________________ A token is an object or symbol that is exchanged for goods or services. Tokens, in the form of clay coins, first appeared in human history in the transition from nomadic hunter- gatherer societies to agricultural societies, and the expansion from simple barter economies to more complex economies (Schmandt-Bes- serat, 1992). Since that time, token systems, in one form or another, have provided the basic economic framework for all monetary transac- tions. From the supermarket to the stock market, any economic system of exchange involves some form of token reinforcement. Token systems have been successfully em- ployed as behavior-management and motiva- tional tools in educational and rehabilitative settings since at least the early 1800s (see Kazdin, 1978). More recently, token reinforce- ment systems played an important role in the emergence of applied behavior analysis in the 1960–70s (Ayllon & Azrin, 1968; Kazdin, 1977), where they stand as among the most successful behaviorally-based applications in the history of psychology. Laboratory research on token systems dates back to pioneering studies by Wolfe (1936) and Cowles (1937) with chimpanzees as experimental subjects. Conducted at the Yerkes Primate Research Laboratories, and published as monographs in the Journal of Comparative Psychology, these papers describe an interconnected series of experiments ad- dressing a wide range of topics, including discrimination of tokens with and without exchange value, a comparison of tokens and food reinforcers in the acquisition and main- tenance of behavior, an assessment of re- sponse persistence under conditions of imme- diate and delayed reinforcement, preference between tokens associated with different rein- forcer magnitudes and with qualitatively dif- ferent reinforcers (e.g., food vs. water, play vs. escape), and social behavior engendered by token reinforcement procedures, to name just a few. The focus of these early investigations was on conditioned reinforcement—the condi- tions giving rise to and maintaining the acquired effectiveness of the tokens as rein- forcing stimuli. A central question concerned the degree to which tokens, as conditioned reinforcers, were functionally equivalent to This paper is dedicated to the memory of E. F. Malagodi for his enduring contributions to the study of token reinforcement. Manuscript preparation was supported by NSF Grant IBN 0420747. I am indebted to Edmund Fantino and an anonymous reviewer for comments on an earlier version of the paper, and to Rachelle Yankelevitz and Bethany Raiff for assistance with figure preparation. Correspondence may be sent to the author at the Department of Psychology, University of Florida, Gaines- ville, FL 32611-2250 (e-mail: [email protected]). doi: 10.1901.jeab.2009.91-257 JOURNAL OF THE EXPERIMENTAL ANALYSIS OF BEHAVIOR 2009, 91, 257–286 NUMBER 2(MARCH) 257

Transcript of reed.edureed.edu/psychology/hackenberg/pubs/hackenberg_token_review_2009.pdf · This paper is...

Page 1: reed.edureed.edu/psychology/hackenberg/pubs/hackenberg_token_review_2009.pdf · This paper is dedicated to the memory of E. F. Malagodi for his enduring contributions to the study

TOKEN REINFORCEMENT: A REVIEW AND ANALYSIS

TIMOTHY D. HACKENBERG

UNIVERSITY OF FLORIDA

Token reinforcement procedures and concepts are reviewed and discussed in relation to generalprinciples of behavior. The paper is divided into four main parts. Part I reviews and discusses previousresearch on token systems in relation to common behavioral functions—reinforcement, temporalorganization, antecedent stimulus functions, and aversive control—emphasizing both the continuitieswith other contingencies and the distinctive features of token systems. Part II describes the role of tokenprocedures in the symmetrical law of effect, the view that reinforcers (gains) and punishers (losses) canbe measured in conceptually analogous terms. Part III considers the utility of token reinforcementprocedures in cross-species analysis of behavior more generally, showing how token procedures can beused to bridge the methodological gulf separating research with humans from that with other animals.Part IV discusses the relevance of token systems to the field of behavioral economics. Token systemshave the potential to significantly advance research and theory in behavioral economics, permittingboth a more refined analysis of the costs and benefits underlying standard economic models, and acommon currency more akin to human monetary systems. Some implications for applied research andfor broader theoretical integration across disciplines will also be considered.

Key words: token reinforcement, conditioned reinforcement, symmetrical law of effect, cross-speciesanalysis, behavioral economics

_______________________________________________________________________________

A token is an object or symbol that isexchanged for goods or services. Tokens, inthe form of clay coins, first appeared in humanhistory in the transition from nomadic hunter-gatherer societies to agricultural societies, andthe expansion from simple barter economiesto more complex economies (Schmandt-Bes-serat, 1992). Since that time, token systems, inone form or another, have provided the basiceconomic framework for all monetary transac-tions. From the supermarket to the stockmarket, any economic system of exchangeinvolves some form of token reinforcement.

Token systems have been successfully em-ployed as behavior-management and motiva-tional tools in educational and rehabilitativesettings since at least the early 1800s (seeKazdin, 1978). More recently, token reinforce-ment systems played an important role in theemergence of applied behavior analysis in the1960–70s (Ayllon & Azrin, 1968; Kazdin,

1977), where they stand as among the mostsuccessful behaviorally-based applications inthe history of psychology.

Laboratory research on token systems datesback to pioneering studies by Wolfe (1936)and Cowles (1937) with chimpanzees asexperimental subjects. Conducted at theYerkes Primate Research Laboratories, andpublished as monographs in the Journal ofComparative Psychology, these papers describean interconnected series of experiments ad-dressing a wide range of topics, includingdiscrimination of tokens with and withoutexchange value, a comparison of tokens andfood reinforcers in the acquisition and main-tenance of behavior, an assessment of re-sponse persistence under conditions of imme-diate and delayed reinforcement, preferencebetween tokens associated with different rein-forcer magnitudes and with qualitatively dif-ferent reinforcers (e.g., food vs. water, play vs.escape), and social behavior engendered bytoken reinforcement procedures, to name justa few.

The focus of these early investigations wason conditioned reinforcement—the condi-tions giving rise to and maintaining theacquired effectiveness of the tokens as rein-forcing stimuli. A central question concernedthe degree to which tokens, as conditionedreinforcers, were functionally equivalent to

This paper is dedicated to the memory of E. F. Malagodifor his enduring contributions to the study of tokenreinforcement. Manuscript preparation was supported byNSF Grant IBN 0420747. I am indebted to EdmundFantino and an anonymous reviewer for comments on anearlier version of the paper, and to Rachelle Yankelevitzand Bethany Raiff for assistance with figure preparation.

Correspondence may be sent to the author at theDepartment of Psychology, University of Florida, Gaines-ville, FL 32611-2250 (e-mail: [email protected]).

doi: 10.1901.jeab.2009.91-257

JOURNAL OF THE EXPERIMENTAL ANALYSIS OF BEHAVIOR 2009, 91, 257–286 NUMBER 2 (MARCH)

257

Page 2: reed.edureed.edu/psychology/hackenberg/pubs/hackenberg_token_review_2009.pdf · This paper is dedicated to the memory of E. F. Malagodi for his enduring contributions to the study

unconditioned reinforcers. This emphasis onthe conditioned reinforcement value of tokenswas maintained in later work by Kelleher(1956), also with chimpanzees and also con-ducted at the Yerkes Laboratories (see Dews-bury, 2003, 2006, for an interesting historicalaccount of this period at Yerkes). Kelleher’swork was notable in that it was the first to placetoken reinforcement in the context of rein-forcement schedules. The emphasis thenshifted from response acquisition to mainte-nance, and from discrete-trial to free-operantmethods. It also opened the study of tokenreinforcement to more refined techniques forassessing conditioned reinforcement value(e.g., chained and second-order schedules).Prior to Kelleher’s work, conditioned rein-forcement was generally studied in acquisitionor extinction, and the effects were weak andtransient. With the emerging technology ofreinforcement schedules, however, condi-tioned reinforcement in general, and tokenreinforcement in particular, was studied insituations that permitted continued pairing ofstimuli with food, producing more robusteffects.

Kelleher’s work suggested another impor-tant role of tokens—that of organizing coor-dinated sequences of behavior over extendedtime periods. An emphasis on conditionedreinforcement and temporal organization hasbeen carried through in later investigations oftoken reinforcement with rats (Malagodi,Webbe, & Waddell, 1975) and pigeons (Foster,Hackenberg, & Vaidya, 2001). Token rein-forcement procedures have also been extend-ed to the study of choice and self-control( Jackson & Hackenberg, 1996), and to aver-sive control (Pietras & Hackenberg, 2005),where they have proven useful in cross-speciescomparisons and to behavioral-economic for-mulations more generally (Hackenberg,2005).

Despite periods of productive researchactivity, the literature on token reinforcementhas developed sporadically, with little integra-tion across research programs. And to date, nopublished reviews of laboratory research ontoken systems exist. The purpose of thepresent paper is to review what is known abouttoken reinforcement under laboratory condi-tions and in relation to general principles ofbehavior. The extensive literature on tokensystems in applied settings is beyond the scope

of the present paper, although it will considersome implications for applied research of abetter understanding of token-reinforcementprinciples.

Two chief aims of the paper are to identifygaps in the research literature, and to high-light promising research directions in the useof token schedules under laboratory condi-tions. The paper is organized into four mainsections. In Section I, past research on tokensystems will be reviewed and discussed inrelation to common behavioral functions—reinforcement, temporal organization, ante-cedent stimulus functions, and aversive con-trol. This kind of functional analysis providesan effective way to organize the literature,revealing continuities between token rein-forcement and other types of contingencies.

Although resembling other contingencies,token-based procedures have distinctive char-acteristics that make them effective tools in theanalysis of behavior more generally. Thepresent paper considers three areas in whichtoken systems have proven especially useful asmethodological tools. Section II discusses theuse of token systems in addressing the sym-metrical law of effect—the idea that reinforc-ers (gains) and punishers (losses) can beviewed in conceptually analogous terms, assymmetrical changes along a common dimen-sion. Such a view is foundational in majortheoretical accounts of behavior, yet support-ing laboratory evidence is surprisingly sparse.Token systems provide a methodologicalcontext in which to explore such questionsacross species and outcome types.

This common methodological contextmakes token systems useful in a comparativeanalysis more generally, where minimizingprocedural differences is paramount. This isthe main focus of Section III, a review anddiscussion of procedures and concepts in thearea of choice and self-control, showing howtoken systems can be used to bridge keyprocedural barriers separating research withhumans from that with other animals.

Section IV explores the implications oftoken systems for behavioral economics. Lab-oratory research in the experimental analysisof behavior has contributed to behavioraleconomics for over three decades (Green &Rachlin, 1975; Hursh, 1980, Lea, 1978),though with few exceptions (Kagel, 1972;Kagel & Winkler, 1972; Winkler, 1970), this

258 TIMOTHY D. HACKENBERG

Page 3: reed.edureed.edu/psychology/hackenberg/pubs/hackenberg_token_review_2009.pdf · This paper is dedicated to the memory of E. F. Malagodi for his enduring contributions to the study

work has not made direct contact with tokenreinforcement methods or concepts. This isunfortunate, because token systems have greatpotential for exploring a wide range ofrelationships between behavior and economiccontingencies, and for uniting the differentbranches of behavioral economics.

I. FUNCTIONAL ANALYSIS OFTOKEN REINFORCEMENT

A token reinforcement system is an inter-connected set of contingencies that specifiesthe relations between token production, accu-mulation, and exchange. Tokens are oftenmanipulable objects (e.g., poker chips, coins,marbles), but can be nonmanipulable as well(e.g., stimulus lamps, points on a counter,checks on a list). A token, as the name implies,is an object of no intrinsic value; whateverfunction(s) a token has is established throughrelations to other reinforcers, both uncondi-tioned (e.g., food or water) or conditioned(e.g., money or credit). As such, a token mayserve multiple functions—reinforcing, punish-ing, discriminative, eliciting—depending onits relationship to these other events. Thepresent review will be organized aroundknown behavioral functions: reinforcement,temporal organization, antecedent stimulusfunctions (discriminative and eliciting), andpunishment.

Reinforcing Functions

Tokens are normally conceptualized asconditioned reinforcers, established as suchthrough their relationship to other reinforc-ers. In the studies by Wolfe (1936) and Cowles(1937), the chimpanzees were trained todeposit poker chips into a vending-machineapparatus (as shown in the top panel of Figure1) for a variety of unconditioned reinforcers(e.g., grapes, water). Initially, depositing wasmodeled by the experimenter; exchange op-portunities were continuously available andeach token deposit was reinforced immediate-ly with food, ensuring contiguous token–foodpairings. The importance of such pairings wasdemonstrated in subsequent conditions, inwhich depositing was brought under stimuluscontrol: The chimps were trained to responddifferentially with respect to tokens withexchange value (white tokens, exchangeablefor grapes) and without exchange value (brass

tokens, exchangeable for nothing). Suchdifferential behavior was evidenced by (a)exclusive preference for white over brasstokens, and (b) extinction of depositing brasstokens. Periodic tests throughout the studyverified the selective reinforcing effectivenessof the white chips: When white and brasstokens were scattered on the floor, thechimpanzees picked up and deposited onlythe white (food-paired) tokens.

Acquisition and maintenance of new beha-vior. Once token depositing had been estab-lished, the chimpanzees were then trained tooperate a weight-lifting apparatus for access totokens. The animals lifted a bar from ahorizontal to a vertical position, the force ofwhich could be adjusted by adding weight tothe bar, a stem of which extended outside thework chamber (see bottom of Figure 1). Themain question was whether tokens couldestablish and maintain behavior as effectivelyas food reinforcers. Several of Wolfe’s (1936)experiments addressed this question, system-atically comparing token-maintained behaviorwith food-maintained behavior. In one exper-iment, for example, sessions in which tokenswere earned alternated with sessions in whichfood was earned. Tokens were earned after 10responses and exchanged immediately forfood. For 3 of 4 subjects, there was littledifference in the time required to complete 10trials in the token versus food sessions. For theother subject, the time was considerablyshorter in the food sessions. Similar resultswere obtained using a different procedure, inwhich the weight requirement increased infixed increments with each reinforcer (anearly type of escalating, or progressive, sched-ule). Again, for all but one subject, there wasessentially no difference in the terminalweights achieved under token and foodconditions. For the remaining animal, theweights were slightly but consistently higherunder food than under tokens, indicatinggreater efficacy of food over token reinforcers.

Using an apparatus similar to Wolfe (1936),as well as some of the same subjects, Cowles(1937) compared the effects of tokens (pokerchips) and food (raisins and seeds) under avariety of experimental arrangements. In oneexperiment, for example, chimpanzees weretrained on a two-position discrimination task,in which the position of the alternativeassociated with tokens was alternated repeat-

TOKEN REINFORCEMENT 259

Page 4: reed.edureed.edu/psychology/hackenberg/pubs/hackenberg_token_review_2009.pdf · This paper is dedicated to the memory of E. F. Malagodi for his enduring contributions to the study

Fig. 1. Subject and apparatus used in Wolfe (1936). Top: Chimpanzee depositing a poker-chip token into tokenreceptacle. Bottom: Outside view of experimental apparatus. See text for details. From Yerkes (1943). Property of theYerkes National Primate Research Center, Emory University. Reprinted with permission.

260 TIMOTHY D. HACKENBERG

Page 5: reed.edureed.edu/psychology/hackenberg/pubs/hackenberg_token_review_2009.pdf · This paper is dedicated to the memory of E. F. Malagodi for his enduring contributions to the study

edly until it was selected exclusively. Tokenscould be exchanged for food at the end of thesession, which lasted either 10 or 20 trials. Thediscrimination was rapidly learned by 2 of 3subjects on this procedure, and improved withrepeated testing (from 1st to 2nd session withina day) and across repeated reversals. Inanother experiment, the same 2 chimpanzeeswere trained on a 5-position discrimination-learning task, in which correct responsesproduced either tokens later exchangeablefor food, or food alone. The discriminationwas learned rapidly for both subjects, andappeared to improve over time. One subjectlearned better (higher accuracy and fewertrials to meet criterion) with food than withtokens, while the other learned equally wellwith both reinforcers. However, even for thelatter, token sessions were characterized bymore ‘‘off-task’’ behavior, especially early inthe exchange ratio. Under conditions in whicha limited number of trials were provided,learning was more rapid under food thanunder token-food conditions for both subjects.Thus, despite little difference in asymptoticperformance under both reinforcers, learningproceeded somewhat more rapidly with foodthan with tokens.

Subsequent experiments by Cowles (1937)showed that the tokens derived their reinforc-ing capacity through their relation to theterminal (food) reinforcers. In one experimentalong these lines, Cowles compared the relativeefficacy of food-paired and nonpaired tokensto that of food on the accuracy of a kind ofdelayed match-to-sample discrimination underconditions of immediate and delayed ex-change. In delayed exchange, the tokensaccumulated in a tray across a block of trialsand were deposited in a room adjacent to thatin which the tokens were earned. Accuracy wassomewhat higher under food–token thannonfood–token conditions. In immediate ex-change, the chimpanzees earned and ex-changed tokens in the same room, anddeposits were required to initiate the next trial.Accuracy increased under both food-pairedand nonfood-paired token conditions, mark-edly so under paired conditions. Under condi-tions in which tokens were compared to fooddirectly, accuracy was slightly but consistentlybetter under food than food-paired tokenconditions, and significantly better under foodthan nonfood-paired token conditions.

In sum, the results of these experimentsdemonstrate that tokens function as effectivereinforcers, albeit somewhat less effective thanunconditioned reinforcers. This differentialeffectiveness of tokens versus food reinforcersis to be expected, if the tokens are functioningas conditioned reinforcers. The research alsoshowed that tokens derive their reinforcingfunctions through relations to food. Whilethis, too, may hardly be surprising, few studieshave demonstrated the necessary and suffi-cient conditions for tokens to function asconditioned reinforcers. Indeed, little isknown about the training histories needed toestablish reinforcing functions of tokens. Isexplicit token-food pairing necessary? Mustconditional stimulus (CS) functions of tokensbe established before they will function aseffective reinforcers? In short, what are theoptimal conditions for establishing and main-taining tokens as reinforcers? These arefundamental but unresolved questions requir-ing additional research.

Generalized reinforcing functions. When to-kens are paired with multiple terminal rein-forcers, they are said to be generalized reinforcers(Skinner, 1953). It is common in tokensystems with humans, for example, to includea store in which tokens can be exchanged foran array of preferred items and activities. Suchgeneralizability should greatly enhance thedurability of tokens, making them less depen-dent on specific motivational conditions,although there is surprisingly little empiricalsupport for this. Perhaps the closest was anearly experiment by Wolfe (1936). Threechimpanzees were given choices between blacktokens (exchangeable for peanuts) and yellowtokens (exchangeable for water) under condi-tions of 16-hr deprivation from food or water(alternating every two sessions). All 3 subjectsgenerally preferred the tokens appropriate totheir current deprivation conditions, thoughthe preference was not exclusive. Two addi-tional subjects were therefore run under 24-hrdeprivation conditions and were permittedfree access to the alternate reinforcer for1 hr prior to each session. Eliminating themotivation for one reinforcer was thought toheighten the motivation for the other. Thesesubjects generally preferred the deprivation-specific reinforcer, and did so more stronglythan the 3 subjects studied under moremodest deprivation conditions.

TOKEN REINFORCEMENT 261

Page 6: reed.edureed.edu/psychology/hackenberg/pubs/hackenberg_token_review_2009.pdf · This paper is dedicated to the memory of E. F. Malagodi for his enduring contributions to the study

Had a third token type been paired withboth food and water, generalized reinforcingfunctions of the tokens could then have beenevaluated. To date, however, generalizedfunctions of tokens have yet to be explored.This remains a key topic of research on tokenreinforcement, especially in light of theimmense theoretical importance attached tothe concept of generalized reinforcement(Skinner, 1953). What types of experiencesare necessary in establishing generalized func-tions? Once established, how do generalizedreinforcers compare to more specific reinforc-ers? Are generalized reinforcers more durablethan specific reinforcers? Are generalizedreinforcers more resistant to extinction, orother disruptions, than more conventionalreinforcers? Will generalized reinforcers bepreferred to specific reinforcers? Will general-ized reinforcers generate more work to obtainthem? Are generalized reinforcers more aver-sive to lose than more specific reinforcers? Inshort, under what conditions will generalizedreinforcers prove more valuable than specificreinforcers? These are just a few of the manyfoundational questions brought within thescope of an experimental analysis of general-ized reinforcement.

Bridging reinforcement delay. Another func-tion commonly ascribed to conditioned rein-forcers is that of bridging temporal gapsbetween behavior and delayed reinforcers(Kelleher, 1966a; Skinner, 1953; Williams,1994a, b). In an early experiment along theselines, Wolfe (1936) showed that tokens couldmaintain behavior in the face of extensivedelays to food reinforcement. Four conditionswere studied: (1) tokens were earned immedi-ately, but could not be exchanged until after adelay (immediate token, delayed exchange/food); (2) food was earned for each responsebut delivered after a delay (no token, delayedfood); (3) same as 2, except that brass chipswere available during the delays (nonpairedtoken, delayed food); and (4) tokens wereearned and could be deposited immediately,but food was delivered only after a delay(immediate token/exchange, delayed food).This latter condition differed from (1) in thatno tokens were present during the delayinterval.

In all four conditions, the delays increasedprogressively with each food delivery, until abreakpoint was reached—a point beyond

which responding ceased (5 min without aresponse). Responding was clearly strongest inCondition 1 (immediate token, delayed ex-change/food) for all 4 chimpanzees studied,reaching delays of 20 min and 1 hr for 2subjects. For the other 2 subjects, breakpointsunder Condition 1 were undefined, both inexcess of 1 hr. The breakpoints under Condi-tions 2–4 were generally short (less than3 min) for all subjects and did not differsystematically across conditions. Interestingly,there was evidence of pausing under the latterconditions, but not under Condition 1, indi-cating that the immediate delivery of a tokenhelped sustain otherwise weak behavior tem-porally remote from food.

Performances in Conditions 1 and 4 arerelevant to accounts of conditioned reinforce-ment based on concepts of marking—thefacilitative effects of response-produced stimu-lus changes. According to this view, the rate-enhancing effects of stimuli paired with foodarise not from the temporal pairing ofstimulus with food (conditioned reinforce-ment), but from the response-produced stim-ulus change, which is said to mark, or makemore salient, the response that produces thereinforcer (Lieberman, Davidson, & Thomas,1985; Williams, 1991). If the tokens functionmainly to mark events in time, then nodifferences would be expected under Condi-tions 1 and 4, as both involve similar response-produced stimulus changes: immediate tokenpresentation. Conversely, if tokens function asconditioned reinforcers via temporal pairingswith food, then greater persistence would beexpected under Condition 1 than Condition 4.The higher breakpoints obtained under Con-dition 1 versus Condition 4 are clearly moreconsistent with the latter view, suggesting thatthe tokens were indeed serving as conditionedreinforcers—that their function depended onexplicit pairing with food—rather than merelyas temporal markers.

Similar results were reported in a nontokencontext by Williams (1994b, Experiment 1),with response acquisition under delayed rein-forcement. Rats’ lever presses produced foodafter a 30-s delay. In marking conditions, eachlever press produced a 5-s tone at the outset ofthe 30-s delay interval. In conditioned-rein-forcement conditions, each lever press pro-duced a 5-s tone at the beginning and end ofthe delay interval. Responding was acquired

262 TIMOTHY D. HACKENBERG

Page 7: reed.edureed.edu/psychology/hackenberg/pubs/hackenberg_token_review_2009.pdf · This paper is dedicated to the memory of E. F. Malagodi for his enduring contributions to the study

more rapidly under the latter conditions,showing the importance of the temporalpairing of the stimulus with food. In a secondexperiment, Williams compared acquisitionunder these latter conditions to that underconditions in which the tone remained onduring the entire 30-s delay interval, designedto assess the bridging functions of the stimu-lus. Although both conditions included tem-poral pairing of tone and food, the bridgingconditions proved less effective than condi-tions in which the tone was presented only atthe beginning and end of the interval.Williams attributed these differences to thehigher rate of food delivery (per unit time) inthe presence of the briefer signals.

Some conditions in Wolfe’s (1936) experi-ment also bear on potential bridging functionsof the tokens. For the 2 subjects whoseresponding was maintained at 1-hr delaysunder Condition 1, delays exceeding 1 hrwere examined. As before, the delay increasedsystematically across sessions. Because it wasimpractical to retain the subjects in the work-space under such long delays, the chimpan-zees were permitted to leave the experimentalspace (and the token) 5 min after making aresponse and were returned 5 min prior to theend of the delay interval. Both subjectscontinued to respond at the maximum delaystested: 5 hrs for one subject, 24 hrs for theother. This suggests that the continued pres-ence of the token was not essential to thereinforcing value of the tokens, as long astemporal pairing of tokens and food wasintact. This conclusion must be tempered bythe fact that these conditions were conductedfollowing extended exposure to conditions inwhich the tokens filled the temporal gap, andmay therefore have depended in part on thathistory. Additional research is needed toisolate the necessary and sufficient conditionsunder which response-produced stimuli en-hance tolerance for delayed reinforcers. Itshould be apparent from this brief review,however, that token reinforcement proceduresare well suited to the task.

Temporal Organization: Schedules of TokenReinforcement

Despite the conceptual importance of thepathbreaking work by Wolfe (1936) and Cow-les (1937), research on token reinforcementlay dormant for approximately two decades.

Kelleher revived research on token reinforce-ment in the 1950s, combining it with theemerging technology of reinforcement sched-ules. A fundamental insight of Kelleher’s (and,later, Malagodi’s) research was the conceptu-alization of a token reinforcement contingen-cy as a series of three interconnected schedulecomponents: (a) the token-production schedule,the schedule by which responses producetokens, (b) the exchange-production schedule,the schedule by which exchange opportunitiesare made available, and (c) the token-exchangeschedule, the schedule by which tokens areexchanged for other reinforcers. Research hasshown that behavior is systematically relatedboth to the separate and combined effects ofthese schedule components. This research willbe reviewed below, along with a discussion ofthe relevance of such work to other schedulesused to study conditioned reinforcement.

Token-production schedules. The first demon-stration of schedule-typical behavior undertoken reinforcement schedules was reportedby Kelleher (1956). Two chimpanzees werefirst trained to press a lever on fixed-ratio (FR)and fixed-interval (FI) schedules of foodreinforcement, in which responses producedfood after a specified number of responses(FR) or the first response after a specifiedperiod of time (FI). Once lever pressing wasestablished, the subjects were then trained todeposit poker-chip tokens in a receptacle forfood. Unlike the earlier studies by Wolfe andCowles, in which exchange responses wereenabled or prevented by opening or closing ashutter in front of the token receptacle,Kelleher brought exchange responding underdiscriminative control. Deposits in the pres-ence of an illuminated window produced food,whereas deposits when the window was darkdid not. When depositing was under stimuluscontrol, the animals were taught to press alever to obtain poker chips. Under both simpleand multiple schedules, Kelleher found thatFR and FI schedules of token productiongenerated response patterning characteristicof FR and FI schedules of food production.

In a subsequent study, Kelleher (1958)performed a more extensive analysis of FRschedules of token reinforcement. Responsesin the presence of a white light producedtokens (poker chips) that could be exchangedfor food during an exchange period (signaledby a red light) at the end of the session. With

TOKEN REINFORCEMENT 263

Page 8: reed.edureed.edu/psychology/hackenberg/pubs/hackenberg_token_review_2009.pdf · This paper is dedicated to the memory of E. F. Malagodi for his enduring contributions to the study

the exchange schedule held constant at FR 60(i.e., 60 tokens were required to produce theexchange period) and the token-exchangeschedule held constant at FR 1 (i.e., once theexchange period was reached, each of the 60token deposits produced a food pellet), thetoken-production ratio was varied from FR 60to FR 125 gradually over the course of 10sessions. Responding was characterized by atwo-state pattern: alternating bouts of respond-ing and pausing, similar to that seen under FRschedules of food reinforcement. The break-run pattern persisted in extinction with tokenschedules discontinued, an effect also seenwith FR schedules of food reinforcement(Ferster & Skinner, 1957).

Malagodi (1967a, b, c) extended Kelleher’sresearch on token-reinforced behavior, usingrats rather than chimpanzees as subjects andmarbles rather than poker chips as tokens.The basic apparatus is depicted in Figure 2,which shows a photograph of the front wall ofthe chamber and a diagram of the tokenreceptacle. Lever presses produced marbles,dispensed into the token hopper. Duringscheduled exchange periods (denoted by aclicker and local illumination of the tokenreceptacle), depositing the tokens in the tokenreceptacle produced food, dispensed into thefood hopper.

In several experiments, Malagodi (1967 a, b,c) examined response patterning under vari-ous token-production schedules. Illustrativecumulative records from Malagodi (1967a)are shown in Figure 3, with VI and FR token-production responding on the left and rightpanels, respectively. The exchange-productionand token-exchange (deposit) schedules wereheld constant at FR 1. Consistent with Kelle-her’s findings, FR token-production patternsshowed a high steady rate with occasionalpreratio pausing, characteristic of FR pattern-ing with food reinforcers. Similarly, VI token-production responding was characterized by alower but steady rate, resembling responsepatterning on simple VI schedules. The token-exchange (deposit) records show a relativelyconstant rate, appropriate to the FR-1 ex-change schedule. Similar schedule-typical re-sponse patterns under FR and VI token-production schedules were reported by Mala-godi (1967 b, c).

More recently, Bullock and Hackenberg(2006) expanded the analysis of token-pro-duction effects, parametrically examining be-havior under both FR token-production andexchange-production schedules under steady-state conditions. The subjects were pigeonsand the tokens consisted of stimulus lamps(LEDs) mounted in a horizontal array above

Fig. 2. Photograph of intelligence panel and diagram (side view) of token receptacle used in Malagodi’s researchwith rats as subjects and marbles as tokens. See text for details. From Malagodi (1967a). Reprinted with permission.

264 TIMOTHY D. HACKENBERG

Page 9: reed.edureed.edu/psychology/hackenberg/pubs/hackenberg_token_review_2009.pdf · This paper is dedicated to the memory of E. F. Malagodi for his enduring contributions to the study

the response keys in an otherwise standardexperimental chamber. With the token-ex-change schedule held constant at FR 1, thetoken-production ratio was varied systematical-ly across conditions from FR 25 to FR 100 ateach of three different exchange-productionFR values (2, 4, and 8). Figure 4 showsresponse rates on the token-production sched-ule as a function of the token-production andexchange-production ratios. In general, re-sponse rates declined with increases in thetoken-production FR (left panels), especiallyat the higher exchange-production ratios (asdenoted by symbols), an effect that corre-sponds well to that reported with simple FRschedules (Mazur, 1983). (Exchange-produc-tion effects [right panels] will be taken up inthe next section.) There was only suggestiveevidence of such an effect in Kelleher (1958),as the token-production FR was increasedincrementally over approximately 10 sessions,with data shown only for the lowest andhighest ratio.

Taken together, the results of Kelleher(1958) with chimpanzees and poker-chiptokens, of Malagodi (1967a, b, c) with ratsand marble tokens, and of Bullock andHackenberg (2006) with pigeons and stimu-lus-lamp tokens, show that performance underschedules of token production resemble, inboth patterning and rate, performance under

Fig. 3. Representative cumulative records for ratsexposed to token-reinforcement schedules. Left panels:VI 1-min token-production schedules. Right panels: FR 20token-production schedules. Pips indicate token deliveries.Exchange-production and token-exchange schedules (la-beled deposit in the figure) were both FR 1. From Malagodi(1967a). Reprinted with permission.

Fig. 4. Token-production and exchange-productionresponse rates as a function of FR token-production ratio(left panels) and exchange-production ratio (right panels)of 4 pigeons across the final five sessions of each condition.Note that the graphs in left and right panels are derivedfrom the same data, plotted differently to highlight theimpact of token-production and exchange-productionratios. Different symbols represent different ratio sizesfor the exchange-production ratios (left panels) andtoken-production ratios (right panels). Unconnectedpoints represent data from replicated conditions; errorbars represent standard deviations. From Bullock &Hackenberg (2006). Copyright 2006, by the Society forthe Experimental Analysis of Behavior, Inc. Reprintedwith permission.

TOKEN REINFORCEMENT 265

Page 10: reed.edureed.edu/psychology/hackenberg/pubs/hackenberg_token_review_2009.pdf · This paper is dedicated to the memory of E. F. Malagodi for his enduring contributions to the study

schedules of food reinforcement (Mazur,1983), suggesting that tokens do indeed serveas conditioned reinforcers. Token-reinforcedbehavior is also modulated, however, by theexchange-production schedule that deter-mines how and when exchange opportunitiesare made available. In the Bullock and Hack-enberg experiment, for example, responserates were generally low in the early segmentsof the exchange-production cycle, but in-creased throughout the cycle, as additionaltokens were earned. Response rates were thusa joint function of the token-productionschedule and exchange-production schedule.

Exchange-production schedules. More directevidence of exchange-production effectscomes from studies in which the exchange-production schedule value or type is directlymanipulated. For example, Webbe and Mala-godi (1978) compared rats’ response ratesunder VR and FR exchange-production sched-ules while holding constant the token-produc-tion schedule (FR 20) and token-exchangeschedule (FR 1) requirements. In other words,a fixed number of lever presses were requiredto produce tokens (marbles) and to exchangetokens for food, but groups of tokens wererequired to produce an exchange period. Insome conditions, the exchange-productionschedule was FR 6 (six tokens required toproduce exchange periods), and in others VR6 (an average of six tokens required toproduce exchange periods). Responding wasmaintained under both schedule types, but athigher rates under the VR schedules. Most ofthe rate differences were due to differentialpostexchange pausing produced by the sched-ules. The FR exchange-production schedulegenerated extended pausing early in the cyclegiving way to higher rates later in the cycle—the ‘‘break-run’’ pattern characteristic ofsimple FR schedules. This bivalued patternwas greatly attenuated under the VR exchange-production schedule.

Foster et al. (2001) reported similar effectswith FR and VR exchange ratios varied over amuch wider range of values. Pigeons’ keypecksproduced tokens (stimulus lamps) accordingto an FR 50 schedule and exchange opportu-nities according to FR and VR schedules.Exchange schedule type (FR or VR) was variedwithin a session in a multiple-schedule ar-rangement, whereas the schedule value wassystematically varied from 1 to 8 across

conditions (requiring between 50 and 400responses per exchange period). Responserates were consistently higher, and pausingconsistently shorter, under VR than undercomparable FR exchange-production sched-ules. Moreover, rates were more sensitive toexchange-production ratio size under FR thanunder VR schedules. These effects are por-trayed in Figure 5, which shows response ratesas a function of exchange-production ratio sizeand schedule type. Here and in Bullock andHackenberg (2006), right panels of Figure 4,response rates decreased with increases in thesize of the FR exchange-production schedule.These effects on response rates, as well asthose on preratio pausing (not shown), resem-ble effects typically obtained with simple FRand VR schedules, suggesting that extended

Fig. 5. Token-production response rates of 3 pigeonsas a function of exchange-ratio size and schedule type: FR(open circles) and VR (closed circles) over the final fivesessions of each condition. Unconnected points denotedata from replicated conditions. Error bars indicate therange of values contributing to the means. From Foster,Hackenberg, & Vaidya (2001). Copyright, 2001 by theSociety for the Experimental Analysis of Behavior, Inc.Reprinted with permission.

266 TIMOTHY D. HACKENBERG

Page 11: reed.edureed.edu/psychology/hackenberg/pubs/hackenberg_token_review_2009.pdf · This paper is dedicated to the memory of E. F. Malagodi for his enduring contributions to the study

segments of behavior may be organized byexchange schedules in much the same way thatlocal response patterns are organized bysimple schedules.

Viewed in this way, token schedules closelyresemble second-order schedules of reinforce-ment (Kelleher, 1966a), in which a patternof behavior generated under one schedule(the first-order, or unit schedule) is treated asa unitary response reinforced according toanother schedule (the second-order sched-ule). In a token arrangement, first-order(token-production) units are reinforced ac-cording to a second-order (exchange-produc-tion) schedule. If the exchange-productionschedules are acting as higher-order sched-ules, then the rate and patterning of token-production units should be schedule-appro-priate. In the Foster et al. (2001) study, forexample, the rate of FR 50 token-productionunits was higher under VR than under FRexchange, owing mainly to extended pausingin the early segments under the FR exchange-production ratio. That the functions relatingpausing and response rates to schedule valueand type also holds at the level of second-orderschedules, suggests that token-productionunits might profitably be viewed as individualresponses themselves reinforced according toa second schedule, supporting a view of tokenschedules as a kind of second-order schedule.

Further evidence of exchange-schedulemodulation over token-reinforced behaviorcomes from studies utilizing time-based ex-change-production schedules. In a study byWaddell, Leander, Webbe, and Malagodi(1972), rats’ lever presses produced tokens(marbles) under FR 20 schedules in thecontext of FI exchange schedules, whichvaried parametrically from 1.5 to 9 min acrossconditions. The first token produced after theFI exchange interval timed out produced anexchange period (light and clicker) in thepresence of which each token could beexchanged for a food pellet.

Token-production patterns were character-ized by bivalued (break-run) patterns, suchthat responding either occurred at high ratesor not at all, similar to that seen under simpleFR schedules. These patterns are illustrated inFigure 6, which shows cumulative records ofresponding across the exchange-production FIcycle (within each panel) and as a function ofexchange-schedule FI durations (across pan-

els). The temporal pattern of token-produc-tion units within an exchange cycle waspositively accelerated, similar to that seenunder simple FI schedules, with longer pausesnear the beginning of a cycle giving way tohigher rates as the exchange period grewnearer. Overall rates decreased and quarter-life values increased with increases in the FIexchange schedule, similar to that seen underFI schedules of food presentation. The overallpattern of results is consistent with that seenunder second-order schedules of brief stimu-lus presentation (Kelleher, 1966b), furthersupporting the view that token-productionschedules are units conditionable with respectto second-order schedule requirements.

Token-exchange schedules. In the studies re-viewed thus far, the token-production and/orexchange-production schedules have beenmanipulated in the context of a constanttoken-exchange schedule. That is, once theexchange period is produced, each token-exchange response removes a token andproduces food. In the only published studyin which the token-exchange schedule wasmanipulated (Malagodi, Webbe, & Waddell,1975), rats’ lever presses produced marbletokens according to an FR 20 token-produc-tion schedule, and exchange periods accord-ing to either FR 6 (2 rats) or FI t -s schedules (2rats). The token-exchange ratio was systemat-ically manipulated across conditions, such that

Fig. 6. Representative cumulative records for one ratexposed to FR 20 token-production schedules as afunction of exchange-production schedules: FI 1.5 min(top), FI 4.5 min (middle) and FI 9 min (bottom). Tokendeliveries are indicated by pips, exchange periods by resetsof the response pen. From Waddell, Leander, Webbe, &Malagodi (1972). Reprinted with permission.

TOKEN REINFORCEMENT 267

Page 12: reed.edureed.edu/psychology/hackenberg/pubs/hackenberg_token_review_2009.pdf · This paper is dedicated to the memory of E. F. Malagodi for his enduring contributions to the study

multiple tokens were required to produceeach food pellet.

Figure 7 shows initial-link response ratesacross successive token-production FRs as afunction of the terminal-link token schedulefor the pair of FR subjects (divided by phasefor Rat T-22). Response rates increased inproximity to the terminal-link exchange sched-ule, and were ordered inversely with the token-exchange FR. Similar effects were observed inrats exposed to the FI token-exchange sched-ule (not shown). Together, these results areconsistent with those obtained with chainedschedules more generally (Ferster & Skinner,1957; Hanson & Witoslawski, 1959). Moreover,the token-production FR units resembledsimple FR units, consistent with findings inthe literature on second-order schedules (Kel-leher, 1966a), suggesting that token-produc-tion schedules create integrated responsesequences reinforced according to second-order exchange-production and token-ex-change schedules.

In sum, behavior under token reinforce-ment schedules is a joint function of thecontingencies whereby tokens are producedand exchanged for other reinforcers. Otherthings being equal, contingencies in the laterlinks of the chain exert disproportionatecontrol over behavior. Token schedules arepart of a family of sequence schedules thatincludes second-order and extended chainedschedules. Like these other schedules, tokenschedules can be used to create and synthesizebehavioral units that participate in largerfunctional units under the control of othercontingencies. Token schedules differ, howev-er, from these other sequence schedules inseveral important ways.

First, unlike conventional chained andsecond-order schedules, the added stimuli intoken schedules are arrayed continuously, andeach token is paired with food duringexchange periods. These token–food relationsmay render token schedules more susceptibleto stimulus–stimulus intrusions than othersequence schedules, as discussed below inrelation to antecedent stimulus functions.Second, unlike other sequence schedules,token schedules provide a common currency(tokens) in relation to which reinforcer valuecan be scaled. This makes the proceduresuseful in exploring symmetries between thebehavioral effects of losses and gains, as

discussed in Section II. Third, the interdepen-dent components of a token system (produc-tion and exchange for other reinforcers)closely resemble the point or money reinforce-ment systems commonly used in laboratoryresearch with humans. This makes tokenprocedures especially useful in narrowing themethodological gulf separating research withhumans from that with other animals, asdiscussed in Section III. Finally, added stimuli

Fig. 7. Response rates across successive FR token-production segments as a function of token-exchange FRfor 2 rats. Overall mean response rates are shown on theright-hand side of each plot. From Malagodi, Webbe, &Waddell (1975). Copyright 1975, by the Society for theExperimental Analysis of Behavior, Inc. Reprintedwith permission.

268 TIMOTHY D. HACKENBERG

Page 13: reed.edureed.edu/psychology/hackenberg/pubs/hackenberg_token_review_2009.pdf · This paper is dedicated to the memory of E. F. Malagodi for his enduring contributions to the study

(number of tokens) in token schedules arecorrelated not only with temporal proximity tofood (as in other sequence schedules) but withamount of reinforcement available in ex-change. This makes token schedules especiallyuseful tools in studying economic concepts,such as unit price, as discussed in Section IV.

Token Accumulation

Another distinctive feature of token rein-forcement schedules is the accumulation oftokens prior to exchange. In an early experi-ment by Cowles (1937), chimpanzees werestudied under three conditions: (1) tokenswere earned and accumulated but could notbe exchanged until the end of the interval, (2)food was earned and accumulated but couldnot be consumed until the end of the interval,and (3) food was earned but did not accumu-late and could not be consumed until the endof the interval. Subjects were trained underthree different FI exchange values (1, 3, and5 min) before the final value of 10 min wasreached. For one subject, responding wasmaintained under (1) and (2) but not under(3). For the other subject, responding wasweakly but consistently maintained under (1)but not under (2) or (3). For both subjects,the number of tokens earned decreasedthroughout the interval (counter to the typicalFI pattern), an effect attributed to ‘‘tokensatiation.’’ In support of this notion, whentokens (5, 15, or 30) were provided at thebeginning of the session, token-maintainedbehavior decreased, such that an approximate-ly equal number of tokens (free + earned)were obtained. To complicate matters, howev-er, the total number of tokens or grapesdecreased across sessions, suggesting perhapsadditional motivational factors.

More recently, Sousa and Matsuzawa (2001)reported accumulation of tokens (coins) bychimpanzees in a match-to-sample procedure.Correct responses produced a token thatcould be exchanged at any time by insertinginto a slot. Once inserted, a single exchangeresponse would deliver food reinforcement.Performance initially established via foodreinforcement showed little or no deteriora-tion when token reinforcers were substitutedfor food. The token reinforcement contingen-cy was also sufficient to generate accurateresponding in new discriminations. Thus,consistent with the older studies by Cowles

(1937) and Wolfe (1936), token reinforce-ment contingencies appear to be effective inestablishing and maintaining discriminativeperformance in a variety of tasks. The authorsalso noted ‘‘spontaneous’’ savings of tokens inone of the chimps—spontaneous in the sensethat it was not strictly required by thecontingencies. The authors reasoned that suchbehavior was adaptive, requiring less effort inthe long run. That is, accumulating groups oftokens before exchange means that theexchange response (in this case, moving fromone part of the chamber to another) need onlybe executed once rather than repeatedly,increasing the overall reinforcer density (num-ber of reinforcers per unit time). Althoughplausible, their procedures do not permit aclear test of this notion.

Yankelevitz, Bullock and Hackenberg (2008)examined patterns of token accumulationunder a variety of ratio-based token-produc-tion and exchange schedules, permittingclearer specification of the relative costs ofaccumulating and exchanging tokens. Cyclesbegan with the illumination of a token-production key. Satisfying an FR on this keyproduced a token and simultaneously illumi-nated a second key, the exchange-productionkey. Satisfying the exchange-production ratioproduced the exchange period, during whichall of the tokens earned that cycle could beexchanged for food. Unlike the extended-chained schedules reviewed above, in whichtoken-production and exchange-productionschedules were arranged sequentially inchain-like fashion, the token-production andexchange-production ratios here were ar-ranged concurrently after the first token hadbeen earned, providing subjects with a choicebetween accumulating additional tokens andmoving directly to the exchange period.

The token-production and exchange-pro-duction ratios were systematically varied acrossconditions, with the exchange-production ra-tio manipulated at three different token-production ratios. Summary data are shownin Figure 8. The top panel shows percentageof cycles in which accumulation occurred (twoor more tokens prior to exchange) as afunction of FR exchange-production size,averaged across 3 subjects. The bottom panelshows the average number of tokens accumu-lated per exchange, also expressed as afunction of FR exchange-production size.

TOKEN REINFORCEMENT 269

Page 14: reed.edureed.edu/psychology/hackenberg/pubs/hackenberg_token_review_2009.pdf · This paper is dedicated to the memory of E. F. Malagodi for his enduring contributions to the study

The different FR token-production values aredisplayed separately (denoted by differentsymbols and data paths). In general, accumu-lation varied directly with the exchange-pro-duction ratio and inversely with the token-production ratio. In other words, as the costsassociated with producing the exchange peri-od increased, more tokens were accumulatedprior to exchange. Conversely, as the costs ofproducing each token increased, fewer tokenswere accumulated.

Viewing the tokens as conditioned reinforc-ers, the results are consistent with priorresearch on reinforcer accumulation (Cole,1990; Killeen, 1974; McFarland & Lattal,2001), but over a wider parametric range ofconditions. Moreover, as Cole and others havenoted, accumulation procedures involve trade-offs between short-term and longer-term rein-forcement variables. Going immediately toexchange minimizes the upcoming reinforcerdelay, but at the expense of increasing theoverall response:reinforcer ratio. Conversely,accumulation increases the delay to theupcoming reinforcer, but reduces the numberof responses per unit of food (unit price).Token accumulation procedures thus bear oncurrent debates over the proper time scaleover which behavior is related to reinforce-ment variables. This topic will be discussed inmore detail in Section IV.

Antecedent Stimulus Functions of Tokens

The evidence reviewed above strongly sug-gests conditioned reinforcing functions of thetokens. As stimuli temporally related to otherreinforcers, however, tokens acquire anteced-ent functions as well, discriminative andeliciting. Research bearing on these functionswill be addressed in this section.

Discriminative functions. In the Kelleher(1958) study reviewed above, low responserates in the early token-production segmentsgave way to high and roughly invariant rates inthe later segments (in closer proximity to theexchange period and food). Weak respondingin the early links was attenuated by providingfree tokens at the outset of the session,suggesting that the tokens were serving adiscriminative function, signaling temporalproximity to the exchange periods and food.Thus, despite no change in the schedulerequirements (the same number of responseswere required to produce tokens and ex-

change periods), responding was strength-ened by arranging stimulus conditions morelike those prevailing later in the cycle. Thediscriminative functions of stimuli early in thesequence parallels effects seen on extended-chained schedules with ratio components( Jwaideh, 1973; see review by Kelleher &Gollub, 1962).

Similar discriminative functions were likelyat work in the so-called token satiation effectreported by Cowles (1937), in which providingfree tokens at the beginning of a sessionresulted in fewer tokens earned. Recall that

Fig. 8. Mean token accumulation as a function of FRexchange-production ratio. Top panel: Percentage ofcycles in which multiple tokens accumulated prior toexchange. Bottom panel: Mean tokens accumulated perexchange period. The different symbols represent differ-ent token-production FR sizes. Data are based on the finalfive sessions per condition, averaged across 3 subjects.(Adapted from Yankelevitz, Bullock, & Hackenberg, 2008).

270 TIMOTHY D. HACKENBERG

Page 15: reed.edureed.edu/psychology/hackenberg/pubs/hackenberg_token_review_2009.pdf · This paper is dedicated to the memory of E. F. Malagodi for his enduring contributions to the study

responding decreased throughout the interval,such that the stimulus conditions late in theinterval controlled low rates of behavior. Moredirect evidence of discriminative effects oftokens on accumulation comes from theYankelevitz et al. (2008) study, described inthe preceding section. When the tokens wereremoved but all other contingencies heldconstant, accumulation decreased markedly,suggesting that the tokens were discriminativefor further token earning.

Eliciting functions. In addition to reinforc-ing and discriminative functions, pairing to-kens and food may also establish CS functionsof the tokens. In one of Cowles’ (1937)discrimination experiments, food-paired to-kens were provided for correct responses andnonpaired tokens for incorrect responses. Inaddition to serving as a more effective rein-forcer, the paired but not the unpaired tokensevoked strong token-directed consummatoryresponses. The chimpanzees came to respondto the paired tokens as if they were food,putting them into their mouths, and so on.Such token-directed consummatory behaviorhas also been reported in other studies withmanipulable tokens: poker chips (Kelleher,1958), marbles (Malagodi, 1967d), and ballbearings (Boakes, Poli, Lockwood, & Goodall,1978; Midgley, Lea, & Kirby, 1989).

In the Midgley et al., (1989) study, forexample, rats were trained to deposit ballbearings in a floor receptacle by reinforcingprecursor responses (e.g., handling, rollingtokens toward the receptacle) with food. Intwo experiments, depositing was successfullytrained in 13 of 15 rats. In addition to thereinforced response, however, the authorsnoted frequent instances of ‘‘misbehavior,’’defined as an excess frequency of precursorresponses. That such behavior continued tooccur despite postponing food deliveriessuggests the involvement of evocative oreliciting functions of the tokens. A proceduralanalysis of token procedures lends plausibilityto this account. Tokens are closely coupledwith exchange periods and food throughoutmost token-reinforcement contingencies, en-suring repeated food–token pairings. In addi-tion to the close temporal pairing of tokensand food during exchange periods, the pre-sentation and accumulation of tokens on manytoken procedures is temporally correlated withfood, differentially signaling delays to food.

Thus, apart from the discriminative functionsnoted above, tokens may also evoke or elicitbehavior based on their temporal relations tofood.

Such token-evoked behavior is probablymore likely with manipulable tokens such asmarbles and poker chips than with nonmani-pulable tokens such as lights. Manipulabletokens are handled in various ways throughoutthe sequence of earning and exchanging themfor food, and this physical contact with thetokens then becomes part of the behaviorengendered by the procedures. With nonma-nipulable tokens, on the other hand, suchphysical contact with the tokens is permittedbut not required by the contingencies. Al-though pigeons have been known to orient toand occasionally peck at token lights, suchtoken-directed behavior occurs much lessoften than with manipulable tokens.

Behavior evoked or elicited by tokens is aninteresting and important topic in its ownright, and may play an important role in acomprehensive analysis of token systems.Because such behavior is often incompatiblewith behavior reinforced by tokens, however, itmay be desirable when examining primarilyoperant functions of tokens to use nonmani-pulable tokens, less susceptible to intrusionsfrom embedded stimulus–food relations. Spe-cific methodological choices therefore dependon the types of questions asked.

Punishing Functions

If tokens function as conditioned reinforc-ers, then the loss of tokens should function asconditioned punishers. Such token-loss con-tingencies, also known as response-cost con-tingencies, are common in application and ineveryday life, but until recently have not beenan explicit topic of laboratory investigationwith nonhumans. The main obstacle appearsto be methodological. With manipulable to-kens, such as marbles or poker chips, thetokens are in a subject’s possession until theyare exchanged for food; there is no obviousway to remove a token once it is has beenearned. With nonmanipulable tokens, such aslights, however, tokens are earned and accu-mulated but are not physically in a subject’spossession. Removing a token is just asstraightforward as presenting one, by extin-guishing rather than illuminating a tokenlight. This makes such procedures especially

TOKEN REINFORCEMENT 271

Page 16: reed.edureed.edu/psychology/hackenberg/pubs/hackenberg_token_review_2009.pdf · This paper is dedicated to the memory of E. F. Malagodi for his enduring contributions to the study

well-suited to the study of conditioned pun-ishment via token loss.

In the first published study along these lines,Pietras and Hackenberg (2005) examinedtoken-loss contingencies in a multiple-sched-ule arrangement with pigeons. Tokens consist-ed of stimulus lights, mounted in a horizontalarray above the response keys (as describedabove). Identical VR 4 (RI 30 s) schedules oftoken reinforcement operated in both com-ponents, such that tokens were produced onaverage every 30 s, and exchanges were pro-duced on average following every four tokens.A conjoint FR schedule of token loss wasarranged in one component (punishmentcomponent), such that every n responses(either 10 or 2, in different conditions)removed a token from the token array.

Response rates were differentially sup-pressed in the punishment component, dem-onstrating a clear punishment function oftoken loss. Because both the exchange-pro-duction and token-loss schedules were ratiobased, exchanges and food were infrequent inthe punishment component. Despite extreme-ly low rates of food reinforcement, respondingwas not completely suppressed, reaching ap-proximately 30–40% of baseline (unpunished)levels, even under the most stringent (FR 2)token-loss schedule. Only when the tokenswere no longer produced under extinctionconditions was responding eliminated. Thissuggests that the mere production of tokens asconditioned reinforcers maintained behavioreven when they were only rarely exchanged forfood. To test this, Pietras and Hackenberg(2005) conducted an additional condition,termed exchange extinction, in which the to-ken-loss contingency was suspended—permit-ting production and accumulation of tokens—but tokens could not be exchanged for food.Exchange responses turned off tokens but didnot produce food. Response rates under theseconditions stabilized well above rates in theextinction condition and at approximately thesame levels as in the punishment conditions,suggesting that responding was maintained bythe production and accumulation of tokens asconditioned reinforcers even in the signaledabsence of food. These effects parallel thoseseen with other conditioned reinforcers (Hor-ney & Fantino, 1984).

Overall, the results are consistent with priorfindings with human subjects: Response-con-

tingent token loss suppresses behavior onwhich it is contingent (Munson & Crosbie,1998; O’Donnell, Crosbie, Williams, & Saun-ders, 2000; Weiner, 1962, 1963, 1964). Con-tingent token loss also produces indirecteffects on behavior, such as contrast (Crosbie,Williams, Lattal, Anderson, & Brown, 1997),comparable to that seen with electric-shockpunishment (Brethower & Reynolds, 1962).Together with the Pietras and Hackenberg(2005) findings with pigeons, the results withhuman subjects appear to support an inter-pretation based on negative punishment (i.e.,punishment via token loss). In negative-pun-ishment studies, however, suppression due topunishment is confounded with changes inreinforcement density that normally accompa-ny response suppression. That is, becausenegative punishment, by definition, involvesreinforcer loss, reductions in responding maybe attributed not to a direct punishmenteffect, but to concomitant changes in rein-forcer density.

To overcome this interpretive difficulty,Raiff, Bullock, and Hackenberg (2008) com-pared response suppression under token-losscontingencies to that under yoked scheduleswith comparable rates of food density. Like thePietras and Hackenberg (2005) study, a multi-ple-schedule format was used with pigeons assubjects and lights as tokens. Identical sched-ules of token gain, VR 4 (RI 30 s), operated inboth components; a conjoint FR 2 token-lossschedule was superimposed on the tokenschedule in one component. To separate thedirect effects of punishment from indirecteffects of changes in the density of food, ayoked-control condition was included, in whichthe punishment contingency was suspendedbut the density of food reinforcement wasyoked to that from the token-loss condition.

Figure 9 shows summary data, averagedacross the final five sessions under baseline(hatched bars) and loss (filled bars) condi-tions for each pigeon. The filled bars labeledTL (token loss) represent data from punish-ment conditions, those labeled YF (yokedfood) from yoked control conditions, ex-pressed as a proportion of the response ratesin the no-loss components of the samesessions. Response rates were somewhat re-duced in the YF condition, indicating somefood-related decrements, but remained consis-tently higher than under TL conditions.

272 TIMOTHY D. HACKENBERG

Page 17: reed.edureed.edu/psychology/hackenberg/pubs/hackenberg_token_review_2009.pdf · This paper is dedicated to the memory of E. F. Malagodi for his enduring contributions to the study

Because the reinforcer densities were equiva-lent under the two loss conditions, the greaterreductions under token-loss conditions sup-port a punishment interpretation.

Although the research on token loss to dateis consistent with a punishment interpretation,much remains to be done. For example, littleis known about how punishment effects relateto (a) punishment variables (e.g., rate, magni-tude, and schedule of token loss), (b) rein-forcement variables (e.g., rate, magnitude, andschedule of reinforcement maintaining thepunished behavior), (c) motivational variables(e.g., deprivation, generalizability of the to-kens), or (d) the availability of unpunishedalternatives, to name just a few. The use oftoken systems may help revitalize the empiricalstudy of aversive control, a field that has beenin decline over the past few decades (Critch-field & Rasmussen, 2007).

II. TOKEN SYSTEMS AND THESYMMETRICAL LAW OF EFFECT

One implication of results of token-losspunishment studies is that reinforcers andpunishers may be viewed in conceptuallyanalogous terms, as symmetrical changesalong a common dimension. This commondimension has long been assumed in majortheories of behavior, though empirical sup-port is surprisingly scarce. A major obstacle ismethodological: reinforcers are typically appe-titive events (e.g., food and water) whereaspunishers are typically aversive events (e.g.,electric shock). This poses serious scalingissues. For example, how does food compareto shock? Can they be measured in compara-ble units? One approach to the problem hasbeen to scale reinforcers and punishers em-pirically, using post hoc descriptive modelingto fit the parameters (deVilliers, 1980; Farley,1980; Farley & Fantino, 1978).

An alternative approach is to use tokenprocedures in which gains and losses can bemeasured in common currency units. This wasthe strategy adopted by Critchfield, Paletz,MacAleese, and Newland (2003) in a studywith human subjects, designed to test two

Fig. 9. Response rates and standard deviations inbaseline (hatched bars) and loss (filled bars) conditions,expressed as a percentage of rates in the no-losscomponent within the same session, for 4 pigeons acrossthe final five sessions of each condition. The labels TL andYF represent data from Token-Loss and Yoked-Food

r

conditions, respectively. See text for additional details.Adapted from Raiff, Bullock, & Hackenberg (2008).

TOKEN REINFORCEMENT 273

Page 18: reed.edureed.edu/psychology/hackenberg/pubs/hackenberg_token_review_2009.pdf · This paper is dedicated to the memory of E. F. Malagodi for his enduring contributions to the study

different quantitative models of punishment—a direct suppression model and a competitivesuppression model. The former assumes pun-ishment is a primary process and symmetricaleffects of losses and gains; the latter assumespunishment is a secondary process and asym-metrical effects of losses and gains. Humansubjects were exposed to a variety of concur-rent schedules in which choices both occa-sionally produced and lost points (later ex-changeable for money). The schedules ofpoint gain were always unequal: One schedulepaid off at a higher rate. The loss scheduleswere arranged in such a way that the predic-tions of the two models diverged. In 17 of 20cases across three experiments when themodels made different predictions, the directsuppression model provided a better accountof the data. These results are broadly consis-tent with the few studies conducted withnonhuman animals and electric-shock punish-ment (deVilliers, 1980; Farley, 1980), but theuse of point gains and losses obviates the needfor post hoc functional scaling of differentconsequence types.

More recently, Rasmussen and Newland(2008) studied concurrent-schedule perfor-mance in humans with schedules of point gainand loss, using the generalized matching lawto quantify bias and sensitivity. Under no-punishment conditions, there was slight un-dermatching and little or no bias. Whencontingent point loss was superimposed onone of the two schedules, however, sensitivitywas sharply reduced, with strong bias towardthe unpunished alternative. The bias was morepronounced than would be expected from aconsideration of relative reinforcement ratesalone, suggesting asymmetrical effects of gainsand losses. More specifically, the authorsestimated that in this experiment a loss wasworth approximately three times more than again.

Using a similar analytic strategy with humansubjects, Magoon and Critchfield (2008) ex-amined responding under similarly structuredconcurrent schedules of positive (point gain)and negative (point-loss avoidance) reinforce-ment. Two matching functions were obtainedper subject—one under homogenous condi-tions (all positive reinforcement) and oneunder heterogenous conditions (positive andnegative reinforcement, arranged concurrent-ly). Because both gains and losses were of

similar magnitude, a bias would reflect thedifferential impact of one consequence typeover the other. No such bias was found,indicating symmetrical effects of losses andgains.

Extending these lines of research to token-based procedures with nonhumans wouldshed light on the cross-species generality ofthe symmetrical law of effect, putting a sharperquantitative point on the token-loss findingsreviewed above. It would also provide a freshperspective on a range of decision-makingphenomena studied extensively in psychologyand economics, such as risky choice. To takejust one example, when choosing betweencertain and uncertain outcomes, human sub-jects are more likely to select the probabilistic(risky) option in a loss context than in a gaincontext (Kahnemen & Tversky, 1979, 2000). Inother words, humans tend to be risk-prone forlosses and risk-averse for gains. So pervasive isthis finding that it is deemed axiomatic in themost influential theories of human decision-making. Kahneman and Tversky’s (1979)Prospect Theory, for example, holds that thesubjective value function is convex and steeperfor losses than for gains: Losses loom larger—carry greater subjective value—than corre-sponding gains.

Gain-Loss Symmetry in Comparative Perspective

The gain-loss asymmetry seen in many riskychoice studies with humans is part of a largerpattern of biases known as framing effects—thetendency for decisions to be influenced by thecontext or format in which they are presented(Kahneman & Tversky, 2000). Frames arenormally verbal constructions, such as hypo-thetical scenarios that describe different cours-es of action. This raises questions about thegenerality of framing-induced behavioral bias-es. Are these effects limited to language-proficient humans, or are they more funda-mental? To answer this question, recentresearch has explored framing effects innonhuman primates.

In a clever experimental adaptation ofmethods used with human subjects, Chen,Lakshminarayanan, and Santos (2006) testedloss aversion in captive capuchin monkeys.The monkeys were trained to exchange tokens(metal coins) with experimenters for pre-ferred food items. The monkeys then receiveda budget of 12 tokens that could be spent on

274 TIMOTHY D. HACKENBERG

Page 19: reed.edureed.edu/psychology/hackenberg/pubs/hackenberg_token_review_2009.pdf · This paper is dedicated to the memory of E. F. Malagodi for his enduring contributions to the study

either of two options, and were exposed toseveral conditions designed to assess framing.In one, the monkeys were given choicesbetween two variable options—one framed asa gain and the other as a loss. The gain optionbegan each trial with one food item (displayedby the experimenter) to which a second itemwas added with 50% probability. The lossoption began each trial with two food itemsfrom which one was lost with 50% probability.Both options thus provided the same overallreturn (1.5 food items per trial), but differedin how they were framed. The monkeysgenerally preferred the gain option—spendingmore of their token budget on it—a result notexplainable on the basis of return rate alone.Instead, the authors argued that the monkeyshave an inherent aversion to loss, similar tothat commonly observed in humans (but seeSilberberg, Roma, Huntsberry, Warren-Boul-ton, Sakagami, Ruggiero, et al., 2008, for analternative interpretation).

In a similar vein, Brosnan, Jones, Lamberth,Mareno, Richardson, and Schapiro (2007)reported suggestive evidence in chimpanzeesof an endowment (or status-quo) effect—theseemingly paradoxical tendency to dispropor-tionately weight items already in one’s posses-sion. The results of these and other recentstudies with primates have important implica-tions for the cross-species generality of behav-ioral biases, and their presumed evolutionarybasis. It has been argued on the basis of theseresults that behavioral biases are not unique tohumans, but rather, are shared by otherprimates, and in this sense may be a funda-mental characteristic of primate cognition.Before this or any other explanation can beproperly assessed, however, we need to know agreat deal more about behavioral biases innonhuman animals—the conditions that cre-ate them and the range of conditions (andspecies) under which they are observed.However such research turns out, it promisesto reveal important information on the gener-ality of effects deemed foundational to majortheories of behavior.

III. TOKEN SYSTEMS ANDCROSS-SPECIES COMPARISONS

A theme running through the studiesreviewed in the preceding section is that ofcross-species generality—the degree to which

principles of behavior are applicable acrossspecies. In providing a common methodolog-ical context, token procedures are well suitedto cross-species comparative studies. One areain which token systems have already provenuseful in cross-species comparisons is that ofchoice and self-control. This section willinclude a review and discussion of such work.

In a series of important papers, Logue andcolleagues (see review by Logue, 1988) exam-ined humans’ choices between sooner–smallerreinforcers (SSRs) and larger–later reinforcers(LLRs) with token reinforcers (points laterexchangeable for money). In a study by Logue,Pena-Correal, Rodriguez, and Kabela (1986),for example, subjects selected between a smallnumber of points available immediately and alarger number of points available after a delay.Postreinforcer delays followed the SSR, suchthat overall rate of reinforcement was equal forboth options. Under a wide range of condi-tions, humans generally preferred the LLR,the option providing the greatest net rein-forcement.

This basic result has been reported inseveral other studies of self-control withhuman subjects and token reinforcers (Logue,King, Chavarro, & Volpe, 1990; Flora & Pavlik,1992). Although such performances have beentaken as evidence of self-control, the delays totokens (points) are only one of several delaysin the procedure. Other and perhaps morecritical are the delays to the exchange period(the periods during which points are ex-changeable for money) and the delays to theterminal reinforcers. In the Logue et al.(1986) study, as in most laboratory studies,the exchange delays were held constant, at theend of a subject’s participation in the exper-iment. If these later reinforcers in the chain—the delays to exchange and money—areviewed as the operative reinforcers, then theself-control seen in most laboratory studieswith human subjects may instead be viewed assensitivity to monetary reinforcer amount withmonetary reinforcer delay held constant. Suchsensitivity to reinforcer amount is an interest-ing and important finding in its own right, asthere are few such data on reinforcer magni-tude effects in humans. By itself, however, thefinding may hold little relevance to the issue ofself-control. To constitute self-control, a sub-ject must face contrasting delays to theterminal reinforcers.

TOKEN REINFORCEMENT 275

Page 20: reed.edureed.edu/psychology/hackenberg/pubs/hackenberg_token_review_2009.pdf · This paper is dedicated to the memory of E. F. Malagodi for his enduring contributions to the study

Reconceptualizing the traditional proce-dures as a token-reinforcement system suggeststhat self-control observed in prior results maybe limited to those cases in which self-controland impulsivity are defined with respect totoken delays and the exchange delays are heldconstant. In support of this, Hyten, Madden,and Field (1994) found evidence of anexchange-delay effect with humans in a self-control procedure: Subjects were more likelyto exhibit self-control (i.e., select the largernumber of points each trial) when theexchange delays were equal (e.g., a week afterthe session) than when they were unequal andshorter for the small-reinforcer option (seealso Roll, Reilly, & Johanson, 2000, foranalogous effects in choices between moneyand cigarettes).

Jackson and Hackenberg (1996) reportedsimilar findings in an analogous study withpigeons and token reinforcers. The tokenswere stimulus lights, as described above,designed to mimic the display of points inanalogous experiments with human subjects.Like points, the light tokens served as curren-cy; each token was worth 2-s food, in that itcould be exchanged for that amount of foodduring scheduled exchange periods. In thebasic procedure, pigeons were given repeatedchoices between SSR (1 token deliveredimmediately) and LLR (3 tokens deliveredafter a delay), with the overall rate of (tokenand food) reinforcement held constant. Insome conditions (Experiment 4), the delays tothe exchange periods (when any earnedtokens could be exchanged for food) wereequal irrespective of which choice had beenmade. These equal-exchange delay (ED) con-ditions were designed to mimic the typicalhuman experiment in which the delays to theexchange period and money are held con-stant. In other conditions, the exchangeperiods were scheduled just after token deliv-ery, and were therefore unequal and shorterfor the SSR. These unequal-exchange delayconditions (UD) are more like the conditionsunder which self-control is typically studiedwith nonhuman subjects. If choices are gov-erned not by the token delays, but by theexchange delays and food, then one wouldpredict preference for the LLR under EDconditions and the SSR under UD conditions.

Summary results are shown in Figure 10,which displays the number of choices (in a 10-

trial session) of the LLR across exchange-delayconditions: UD for unequal exchange delayand ED for equal exchange delay. All 3pigeons strongly preferred the small-reinforceroption under UD conditions and the large-reinforcer option under ED conditions. Thus,consistent with the Hyten et al. (1994) resultswith humans, pigeons’ choices were governednot by token delays but by the delays to periodsduring which those tokens are exchanged forother reinforcers.

Hackenberg and Vaidya (2003) replicatedand extended these results, separately manip-ulating not only the exchange delays but theterminal-reinforcer (food) delays as well.Together, the results of these studies demon-strate disproportionate control by reinforcerslater in the chain (exchange and terminal-reinforcer delays) relative to token reinforcers.That delays to exchange and terminal rein-forcers exert strong control over behavior isnot surprising given the exchange-scheduleeffects reviewed above, and the extensiveliterature on extended-chained (Gollub,1977) and concurrent-chained schedules (Fan-tino, 1977). Indeed, such effects follow directlyfrom a conceptualization of a token scheduleas a kind of hybrid schedule with extendedand concurrent-chained components. In sup-port of this, in conditions in the Jackson andHackenberg (1996) study in which exchangeperiods followed blocks of choice trials (Ex-periments 1 and 2), and in which tokenstherefore accumulated prior to exchange,choice latencies were long in the early trialsof a block, and short and undifferentiated inthe later trials of a block.

Taken as a whole, the results with humans(Hyten et al., 1994) and pigeons (Jackson &Hackenberg, 1996; Hackenberg & Vaidya,2003) converge on a similar point: In token-based self-control procedures, choice patternsare governed not by token delays but by delaysto the periods during which tokens areexchangeable for other reinforcers. Whenthese delays are equal, preferences for theLLR usually prevail; when the delays areunequal, preferences for the SSR usuallyprevail. In other words, under conditionsmore typical of research with humans, more‘‘human-like’’ performance is seen (i.e., self-control), whereas under conditions moretypical of research with nonhuman animals,more ‘‘animal-like’’ performance is seen (i.e.,

276 TIMOTHY D. HACKENBERG

Page 21: reed.edureed.edu/psychology/hackenberg/pubs/hackenberg_token_review_2009.pdf · This paper is dedicated to the memory of E. F. Malagodi for his enduring contributions to the study

impulsivity). This is not to imply that alldifferences between human and nonhumanself-control are due to procedural variables.Other factors, such as human verbal and rule-governed histories, may interact with proce-dural variables. It is not possible to isolate theimpact of these or any other factors, how-ever, unless procedural differences are takeninto proper account. A failure to do so mayresult in misleading or premature conclusionsin cross-species comparisons (Hackenberg,2005).

IV. TOKEN SYSTEMS ANDBEHAVIORAL ECONOMICS

Token reinforcement procedures are highlyrelevant to behavioral economics. From aneconomic perspective, tokens can be concep-tualized as a type of currency, earned andexchanged for other commodities. Tokenproduction is analogous to a wage, exchangeopportunities to procurement costs, accumu-lation to savings, terminal reinforcers toexpenditures, and so on. To date, however,there has been little substantive contactbetween behavioral economics and tokenreinforcement. This is unfortunate, becausetoken systems provide a controlled and analyt-ically tractable environment in which toexplore interactions between behavior andeconomic variables.

In this section, the role of token systems inbehavioral economics will be discussed. First,relations between tokens and consumer-de-mand concepts will be explored, showing howtoken systems permit more refined analyses ofprice than traditional formulations. Second,the utility of establishing common currency inassessing demand for qualitatively differentcommodities (including generalized commod-ities) will be discussed. Third, some appliedimplications of token economic systems will beconsidered. Finally, the role of token systemsin forging a cross-disciplinary approach tobehavioral economics will be discussed.

Demand Elasticity and Unit Price

The Bullock and Hackenberg (2006) study,reviewed above, showed that response ratesdeclined systematically with increases in token-production ratio and exchange-productionratio (see Figure 4). These results also followfrom an economic analysis: Response output

Fig. 10. Mean number of large-reinforcer choices per10-trial session for 3 pigeons in a token-based self-controlprocedure as a function of exchange-delay conditions.Delays to the exchange periods were either unequal andshorter for the small-reinforcer option (UD) or equal forboth options (ED). Data are from the final 10 sessions percondition; error bars reflect the range of values contrib-uting to the mean. From Jackson & Hackenberg (1996).Copyright 1996, by the Society for the ExperimentalAnalysis of Behavior, Inc. Reprinted with permission.

TOKEN REINFORCEMENT 277

Page 22: reed.edureed.edu/psychology/hackenberg/pubs/hackenberg_token_review_2009.pdf · This paper is dedicated to the memory of E. F. Malagodi for his enduring contributions to the study

depended both on the costs of producing atoken (wages) and the costs of producingexchange opportunities (procurement costs).In everyday terms, one might think of these asthe costs associated with earning money (e.g.,on a job) and the costs associated withspending that money (e.g., driving to thestore, making an internet purchase). Thatresponding decreased across both sets ofvariables suggests that costs can be meaning-fully segregated into token production (wages)and exchange production (procurement).

A third way to vary price in token schedulesis illustrated by the Malagodi et al. (1975)study, reviewed above. Rather than altering theprice via number of responses required toproduce a token (wage), or number of tokensrequired to produce an exchange (procure-ment), Malagodi et al. manipulated thenumber of tokens required to produce eachfood delivery—the exchange price. Outsidethe lab, these costs are analogous to the sellingprice of a commodity—how much currency isspent on each unit of the commodity. Aswould be expected from an economic per-spective, response rates in the initial links werea decreasing function of exchange rate (num-ber of tokens required per unit of food).

The declining response rates with increasesin price (more stringent token-production,exchange-production, and token-exchange re-quirements) resemble the downward-slopingdemand curves of behavioral economic analy-sis (Hursh & Silberberg, 2008), suggesting thatdemand is sensitive to all three types of price.It is also possible on these procedures tocompute price differently, as a unit price, acomposite cost–benefit measure based on thecombined effects of the reinforcer size andprice (DeGrandpre, Bickel, Hughes, Layng,& Badger, 1993). In laboratory experiments,unit price is typically defined as the numberof responses per unit of reinforcer. Tokenschedules are especially well suited to a unit-price analysis because the number of tokens iscorrelated not only with proximity to theterminal reinforcer but with the magnitudeof that reinforcer. With ratio-based tokenschedules, the unit price equals the cumulativenumber of responses required to produce afood reinforcer divided by the number of foodreinforcers.

In the Bullock and Hackenberg (2006)study, for example, a token-production ratio

of 25 and an exchange-production ratio of 4yields a unit price of 25 (100 responses for fourreinforcers). Holding constant the exchange-production ratio at 4, but increasing the token-production ratio to 100 yields a four-foldincrease in the unit price (400 responses perfour reinforcers). Consistent with these in-creases in unit price, response rates decreasedwith increases in the size of the token-production ratio (left side of Figure 4). Con-versely, holding the token-production sched-ule constant at FR 25, but changing theexchange-production schedule from 2 to 8yields equivalent unit prices—50 responses fortwo reinforcers (unit price of 25) vs. 400responses for eight reinforcers (unit price of25). Based on the equivalent unit prices, onemight expect equivalent response rates acrossvariations in exchange-production schedules.Contrary to this prediction, however, responserates decreased with increases in the exchange-production ratio, despite equivalent unit pric-es (right side of Figure 4). Responding ap-peared to be governed not by the unit price(number of responses per unit of reinforcer),but rather by the number of responses (ordelay) per exchange period, without regard tothe number of reinforcers available per ex-change.

To further explore unit price in a tokenreinforcement context, Foster and Hacken-berg (2004) gave pigeons repeated choicesbetween FR token-reinforcement scheduleswith equal and unequal unit prices. Consistentwith a straightforward application of the unitprice model, when unit prices were unequal,pigeons strongly preferred the alternative withthe lower unit price. When unit prices wereequal, however, pigeons preferred alternativeswith smaller FRs and less food to those withlarger FRs and more food. Although inconsis-tent with the standard unit price equation, thedata were broadly consistent with a modifiedversion incorporating temporal discounting(Madden, Bickel, & Jacobs, 2000). And ifviewed in relation to delays rather thanresponses, the results are also consistent withdelay-reduction theory (Fantino, 1977). Likethe data with single alternatives, time and/orresponses to the exchange period appear to becritical determinants of choice behavior.

In a similar vein, the results of the Yankele-vitz et al. (2008) study, reviewed above, are alsoconsistent with a unit-price formulation. In

278 TIMOTHY D. HACKENBERG

Page 23: reed.edureed.edu/psychology/hackenberg/pubs/hackenberg_token_review_2009.pdf · This paper is dedicated to the memory of E. F. Malagodi for his enduring contributions to the study

that study, different patterns of accumulationyielded different unit prices. Because theexchange-production ratio had only to becompleted once to exchange all of the tokensthat cycle, accumulating the maximum of 12tokens was the optimal pattern, the patternthat always yielded the lowest unit price in thelong run. Conversely, accumulating no to-kens—moving to exchange at the first oppor-tunity—yielded the highest overall unit price.Intermediate levels of accumulation yieldedintermediate unit prices. To the extent thatthe token-production and exchange-produc-tion ratios were concurrently available, tokenaccumulation can be conceptualized as a seriesof choices between accumulating one or moreadditional tokens versus exchanging tokensalready earned that cycle—a tradeoff betweendifferent unit prices, computed across differ-ent time periods. In the Yankelevitz et al.study, accumulation varied directly with theexchange-production ratio, with a modifiedunit-price model providing a reasonably accu-rate description of the data.

In sum, these results illustrate how theeconomic concept of unit price may bebrought to bear on token-reinforcement pro-cedures. Unlike traditional conceptions of unitprice in which costs are defined solely in termsof response output, however, token systemspermit a more refined analysis of cost—thecosts of producing tokens (wages), the costs ofproducing exchange opportunities (procure-ment), and the costs of exchanging tokens forother commodities (selling price). This may beuseful, depending on the particular context inwhich price is assessed, as there is clearevidence that token-reinforced behavior ismultiply determined by the token-productionschedule, the exchange-production schedule,and the token-exchange schedule. An eco-nomic analysis needs to consider how suchschedule effects enter into broader cost–benefit formulations.

Common Currency Units

Another major benefit of token systems isthe possibility of developing a common cur-rency with which to assess with greaterquantitative precision the relationships be-tween qualitatively different commodities. Thisis a central aim of behavioral economics, butto this point the currency has been definedalmost exclusively in relation to response or

time allocation; in other words, price of areinforcer is paid in the currency of responsesor time. Simple response allocation and timeallocation measures can sometimes be prob-lematic, however, especially with comparisonsbetween low-frequency activities of high valueand high-frequency activities of low value. Thisis in part what motivated research on concur-rent-chains schedules (Autor, 1969; Fantino,1977), in which preference for a reinforcer(behavior in the initial link) can be separatedfrom behavior in the presence of signalspaired with that reinforcer (behavior in theterminal link). In this way, preference ismeasured in the same currency units (relativeinitial-link responses).

Token systems provide an alternate ap-proach to the problem of common currency.Rather than defining currency in responses ortime, the currency is defined in commontoken units, and preference as the number oftokens a subject is willing to spend on anactivity or commodity. Tokens are a kind ofcommodity economists term fungible, or mutu-ally replaceable for another of like kind. Tothe extent that tokens are interchangeable,one for another, they can be used to scalequalitatively different commodities.

Such scaling procedures would be especiallyuseful in assessing demand for generalizedcommodities. Generalized commodities arepervasive in everyday human economic con-texts, money being the prime example. Moneyis such powerful currency because it is associ-ated with a wide variety of other goods andservices—its functions are generalized ratherthan specific to a particular currency andeconomic context. In principle, such general-ized commodities should be more durablethan specific reinforcers; or, in economicterms, less sensitive to price changes (lesselastic). One might also expect generalizedcommodities to hold their value across a widerrange of motivational conditions; or, in eco-nomic terms, to function as substitutes for lessgeneralized (specific) commodities. Althoughboth of these assertions are plausible, there islittle empirical evidence to support them.Indeed, convincing demonstrations of gener-alized reinforcement under laboratory condi-tions are virtually nonexistent. Most attemptshave either failed (Myers & Trapold, 1966) orproduced weak and transient effects (Lubinski& Thompson, 1987).

TOKEN REINFORCEMENT 279

Page 24: reed.edureed.edu/psychology/hackenberg/pubs/hackenberg_token_review_2009.pdf · This paper is dedicated to the memory of E. F. Malagodi for his enduring contributions to the study

Token systems are well suited to addressgeneralized reinforcement functions. Estab-lishing and evaluating the efficacy of tokensas generalized reinforcers would provide animportant bridge to monetary-based econo-mies in human affairs, a crucial step in theextrapolation of laboratory-based experimen-tal economics to that of everyday humaneconomic behavior. This would help bringwithin methodological reach a host of scien-tific problems heretofore limited to researchwith humans. Economic issues aside, establish-ing and maintaining generalized reinforce-ment under laboratory conditions would itselfbe a groundbreaking development in theexperimental analysis of behavior, openingnew research vistas.

Applied Implications

A better understanding of token systemsalso has important practical implications.Token economies are among the oldest andmost successful programs in all of appliedpsychology. Despite their well documentedtherapeutic and educational benefits (Glynn,1990; Kazdin, 1982; Williams, Williams, &McLaughlin, 1989), research on token econo-mies in the applied realm has developed withlittle or no recognition of laboratory researchon token systems, nor has it benefited from aneconomic perspective. Laboratory research ontoken systems and experimental economicssuggest a myriad of techniques for maximizingthe efficacy of token systems as motivationaland behavior-management tools in appliedsettings, including ways to establish newbehavior, to maintain behavior under longdelays to reinforcement, to encourage savingsand sensitivity to long-term outcomes, to namejust a few. In short, laboratory research ontoken systems, especially research that takes anexplicitly economic focus, has important butunderappreciated implications for clinical andeducational programs.

At the same time, reconceptualizing appliedtoken programs in behavioral-economicterms—as applied laboratories for investigat-ing economic behavior—has the potential tosignificantly advance research and theory inbehavioral economics. In an interesting seriesof papers in the 1970s, Winkler and colleaguesencouraged a view of token systems as exper-imental economies, arguing that token systemswere ripe for an economic analysis. Using this

type of analysis, several studies of tokensystems in state psychiatric hospitals foundbroad support for economic demand theory(Battalio, Kagel, Winkler, Fisher, Bassman, &Krasner, 1974; Fisher, Winkler, Krasner, Kagel,Battalio, & Basmann, 1978; Winkler, 1970,1973, 1980), suggesting the general utility ofan economic perspective on token systems.

This applied behavioral-economic workfailed to gain a foothold at the time, but thereare reasons to suspect a more supportiveintellectual climate today. A number of ap-plied researchers have successfully appliedbehavioral-economic concepts and analyses toan ever-widening range of topics, includingreinforcer assessment (DeLeon, Iwata, Goh, &Worsdell, 1997), substitution effects (Hanley,Iwata, Lindberg, & Connors, 2003), andremediation of problem behavior (Roane,Call, & Falcomata, 2005; Tustin, 1994), toname just a few. Combining this emergingeconomic perspective with token economies inapplied research has the potential to trans-form applied settings (classrooms, grouphomes, work centers, and the like) intolaboratories for examining economic behav-ior.

The field of substance abuse liability andtreatment has drawn extensively from behav-ioral-economic analyses (Bickel, DeGrandpre,& Higgins, 1995; Hursh, 1991), especially thesubfield utilizing contingency management, orvoucher reinforcement (Silverman, 2004).Vouchers are tokens exchangeable for moneyor goods, made contingent on drug absti-nence. This form of treatment has been shownto be a highly effective approach to drugabstinence with various drugs (see Higgins,Heil, & Lussier, 2004, for a review), although arelatively narrow range of contingencies hasbeen explored to date. In the typical program,an escalating magnitude of reinforcementschedule is used, whereby successive urine-freesamples earn increasing monetary amounts.The schedule is often coupled with a punish-ment schedule, whereby a positive sampleresets the earnings to a lower value. Althoughthis has proven effective in a majority of drug-dependent research participants, little isknown about these kinds of schedules andthe degree to which they are optimal forproducing abstinence. Combining this line ofresearch with a laboratory-based analysis oftoken reinforcement contingencies may fur-

280 TIMOTHY D. HACKENBERG

Page 25: reed.edureed.edu/psychology/hackenberg/pubs/hackenberg_token_review_2009.pdf · This paper is dedicated to the memory of E. F. Malagodi for his enduring contributions to the study

ther enhance the clinical efficacy of thesealready successful programs. At the same time,this would expand the study of token sched-ules, bringing an interesting new range ofcontingencies within the scope of an experi-mental analysis.

Laboratory studies of drug self-administra-tion may also benefit from the use of tokenschedules. In a typical self-administrationexperiment, subjects (usually nonhuman pri-mates) are trained to respond for access todrug infusions. Using both simple and con-current schedules, such procedures have prov-en useful in assessing the relative reinforcingefficacy of a variety of drugs. A potentialdifficulty with such procedures, however, isthat the pharmacological effects of a drug mayinteract with its reinforcing efficacy. Forexample, large doses of drugs produce phar-macological and motoric effects that interferewith behavior maintained by access to thedrug. Second-order schedules of brief stimuluspresentation have been used to help circum-vent this problem by providing a context inwhich drug-paired stimuli maintain extendedsequences of behavior in the absence of drug(Kelleher & Goldberg, 1977). Expanding thesemethods to second-order schedules of tokenreinforcement would combine an economicperspective with the considerable methodo-logical advantages of second-order schedules.For example, one can envision a token systemin which tokens earned in an extended workcomponent could be used to purchase druginfusions at various prices, either in isolationor in combination with other drug andnondrug reinforcers. Such work would consid-erably advance the study of drugs as reinforc-ers, permitting an analysis of demand inrelation to other reinforcers in a context thatapproximates human economic systems.

Toward a Cross-Disciplinary Behavioral Economics

Behavioral economics is not a single unifiedtheory, but rather, a family of models andinterpretations sharing some common as-sumptions. There are two main branches ofbehavioral economics, arising from somewhatdifferent traditions of research and interpre-tation. One comes mainly from consumerdemand theory—how the value of a commod-ity is influenced by economic variables (e.g.,price, income, availability of substitutablegoods). This is the main behavioral-economic

approach familiar to readers of this journal,and the one emphasized in the precedingsections.

The other behavioral-economic approacharises from more cognitively-oriented studiesof human judgment and decision-making,emphasizing anomalies or biases in judgment(Camerer, Loewenstein, & Rabin, 2003).Spawned by Kahneman and Tversky’s highlyinfluential research on context and framingeffects (Kahneman & Tversky, 1979, 2000;Tversky & Kahneman, 1974, 1992), other linesof investigation have revealed systematic de-partures from classical economic theory andits implicit assumption of utility maximi-zation, including studies of behavioral finance(Thaler, 1993), time discounting (Frederick,Loewenstein, & O’Donoghue, 2002), behav-ioral game theory (Camerer, 2003), riskassessment (Fischhoff, 1995), and the expand-ing field of neuroeconomics (Camerer, Loe-wenstein, & Prelec, 2005; McClure, Laibson,Loewenstein, & Cohen, 2004), to name just afew.

Although both branches of behavioral eco-nomics are concerned broadly with behaviorin an economic context—and both adopt theterm behavioral economics—they have developedlargely in parallel, with little cross-fertilizationof ideas and concepts. This is unfortunate,because these two traditions of research andinterpretation yield complementary perspec-tives, and both are necessary parts of acomprehensive behavioral-economic concep-tion of the subject matter. A major obstacle isthe lack of a standard methodology. Behavior-al economics from the consumer-demandtradition has typically used nonhuman animalsas subjects and commodities with immediateconsummatory value (e.g., food, water, drug).Behavioral economics from the judgment anddecision making realm, on the other hand, hastypically used humans as subjects and mone-tary-based commodities (or hypothetical sce-narios with monetary-based outcomes). Untilthese different traditions are brought intogreater methodological alignment, the possi-bilities for a broader, more integrative ap-proach, are remote.

Token systems provide a methodologicaland analytic context in which behavioraleconomics of both persuasions can developin more coherent and mutually supportiveways. A token system combines the best

TOKEN REINFORCEMENT 281

Page 26: reed.edureed.edu/psychology/hackenberg/pubs/hackenberg_token_review_2009.pdf · This paper is dedicated to the memory of E. F. Malagodi for his enduring contributions to the study

features of each approach. Like the approachemanating from the consumer-demand realm,laboratory-based token systems maximize ex-perimental control, permitting parametricmanipulation of variables useful in an eco-nomic analysis. Like the approach growing outof the judgment and decision-making litera-ture, token systems are modeled after mone-tary-based economic systems, in which tokensare earned, accumulated, and exchanged forother commodities. And when combined withgeneralized tokens, exchangeable for multiplecommodities, laboratory-based token systemsbring within reach an exciting range ofpossibilities for an experimental analysis ofeconomic behavior.

V. SUMMARY AND CONCLUSIONS

Token systems are ubiquitous in humanculture, providing the basic framework foreconomic transactions with the world. Yetdespite laboratory research dating back tothe 1930s and the highly successful education-al and therapeutic applications of tokenprocedures since the 1960s, relatively little isknown about token reinforcement per se—about its basic principles of operation. Twochief aims of the present paper were to reviewwhat is known about token reinforcementunder laboratory conditions, and to highlightpromising research directions.

Research reviewed in Part I explored thevarious behavioral functions of tokens, includ-ing the conditions that give rise to andmaintain the reinforcing functions of tokens,the generalizability of token reinforcers, therole of tokens in temporally organized behav-ior, the antecedent (discriminative and signal-ing) functions of tokens, and the conditionedaversive functions of token loss. When viewedas an interrelated set of contingencies, tokenschedules share important properties withother sequentially-arranged contingencies(e.g., chained and second-order schedules).Like these other schedules, token schedulescan be used both to establish and maintainbehavior over extended time periods, and tocreate and synthesize behavioral units thatparticipate in larger functional units. And, likethese other schedules, the added stimuli intoken-reinforcement schedules can serve mul-tiple functions—reinforcing, discriminative,and eliciting—depending on the particular

temporal-correlative relations with otherevents. Token schedules, in particular amongsequence schedules, fall squarely in the middleof important but unresolved issues in the fieldof conditioned reinforcement.

In addition to the sequential arrangementof contingencies they share with other sched-ules, token schedules possess unique charac-teristics that make them useful as tools ininvestigating a wide range of other scientificproblems. And it is in these extensions to newareas of research and theory where thepotential impact of token reinforcementlooms largest. This was the focus of researchreviewed in the later sections of the paper.

Section II explored the use of tokenschedules in addressing the symmetrical lawof effect, the view that gains and losses exertequivalent but opposite effects on behavior.This issue is of fundamental importance in theanalysis of behavior, but past efforts have beenhampered by qualitative differences in thenature of the reinforcer (gain) and punisher(loss) stimuli. With token schedules, it ispossible to define gains and losses in concep-tually analogous terms, as changes in signalong a common dimension of value. Research(mostly with human subjects) has begun toreveal the relationships between gains andlosses, but this work is still quite preliminary.Using token procedures to explore thesequestions with nonhuman subjects wouldprovide important information on the cross-species generality of the phenomena, bringingnew investigative realms within the scope of alaboratory analysis.

Research reviewed in Section III consideredthe use of token schedules in cross-speciesanalysis of behavior more generally. Tokenschedules are well suited to such comparativeanalyses, as they mimic common proceduralfeatures of laboratory research with humans,and thus provide a common methodologicalcontext in which to compare human perfor-mance to that of other animals. Tokenschedules have been used to minimize proce-dural disparities across species, bringing hu-man and nonhuman performances into betteraccord. While much remains to be done in thisrealm, the use of token schedules holds greatpromise as a methodological tool in compar-ative analyses.

The research reviewed in Section IV illus-trated some of the ways in which token systems

282 TIMOTHY D. HACKENBERG

Page 27: reed.edureed.edu/psychology/hackenberg/pubs/hackenberg_token_review_2009.pdf · This paper is dedicated to the memory of E. F. Malagodi for his enduring contributions to the study

inform and extend an economic analysis ofbehavior. Enabling detailed measurement ofeconomic behavior (e.g., labor, consumption,preference, savings) on multiple levels ofanalysis and in relation to a wide range ofeconomic variables (e.g., prices, wages, inter-est, inflation), token systems provide anexperimental test bed for fundamental con-cepts in behavioral economics. At the sametime, economic methods and concepts (de-mand curves, unit price, fungible currency)may yield insights into the complex interde-pendencies between various components of atoken system, putting a sharper quantitativepoint on functional relationships betweenbehavior and reinforcement variables.

The relationship between analysis and exten-sion goes both ways. As more is learned aboutits basic principles of operation, the moreeffective token systems become as methodolog-ical tools, and the more likely they are to yieldconceptual advances. In this way, the extensionsof token reinforcement to new domains ofresearch and theory may begin to repay themethodological debt in conceptual break-throughs including, perhaps, even new ways toapproach the old problem of conditionedreinforcement with which this research began.

REFERENCES

Autor, S. M. (1969). The strength of conditionedreinforcers as a function of frequency and probabilityof reinforcement. In D. P. Hendry (Ed.), Conditionedreinforcement (pp. 127–162). Homewood, IL: DorseyPress.

Ayllon, T., & Azrin, N. H. (1968). The tokeneconomy: A motivational system for therapy and rehabilita-tion. New York: Appleton-Century-Crofts.

Battalio, R. C., Kagel, J. H., Winkler, R. C., Fisher, E. B., Jr.,Bassman, R. L., & Krasner, L. (1974). An experimen-tal investigation of consumer behavior in a controlledenvironment. Journal of Consumer Research, 1, 52–60.

Bickel, W. K., DeGrandpre, R. J., & Higgins, S. T. (1995).The behavioral economics of concurrent drug rein-forcers: A review and reanalysis of drug self-adminis-tration research. Psychopharmacology, 118, 250–259.

Boakes, R. A., Poli, M., Lockwood, M. J., & Goodall, G.(1978). A study of misbehavior: Token reinforcementin the rat. Journal of the Experimental Analysis of Behavior,29, 115–134.

Brethower, D. M., & Reynolds, G. S. (1962). A facilitativeeffect of punishment on unpunished behavior. Journalof the Experimental Analysis of Behavior, 5, 191–199.

Brosnan, S. F., Jones, O. D., Lamberth, S. P., Mareno, C.,Richardson, A. S., & Schapiro, S. J. (2007). Endow-ment effect in chimpanzees. Current Biology, 17,1704–1707.

Bullock, C. E., & Hackenberg, T. D. (2006). Second-orderschedules of token reinforcement with pigeons:Implications for unit price. Journal of the ExperimentalAnalysis of Behavior, 85, 95–106.

Camerer, C. F. (2003). Behavioral game theory: Experiments instrategic interaction. Princeton, NJ: Princeton UniversityPress.

Camerer, C. F., Loewenstein, G. F., & Prelec, D. (2005).Neuroeconomics: How neuroscience can informeconomics. Journal of Economic Literature, 43, 9–64.

Camerer, C. F., Loewenstein, G., & Rabin, M. (Eds.).(2003). Advances in behavioral economics. Princeton, NJ:Princeton University Press.

Chen, M. K., Lakshminarayanan, V., & Santos, L. R.(2006). How basic are behavioral biases? Evidencefrom Capuchin monkey trading behavior. Journal ofPolitical Economy, 114, 517–537.

Cole, M. R. (1990). Operant hoarding: A new paradigm forthe study of self-control. Journal of the ExperimentalAnalysis of Behavior, 53, 247–261.

Cowles, J. T. (1937). Food-tokens as incentives for learningby chimpanzees. Comparative Psychological Monographs,12, 1–96.

Critchfield, T. S., Paletz, E. M., Macaleese, K. R., &Newland, M. C. (2003). Punishment in human choice:Direct or competitive suppression? Journal of theExperimental Analysis of Behavior, 80, 1–27.

Critchfield, T. S., & Rasmussen, E. R. (2007). It’s aversiveto have an incomplete science of behavior. MexicanJournal of Behavior Analysis, 33, 1–6.

Crosbie, J., Williams, A. M., Lattal, K. A., Anderson, M. M.,& Brown, S. M. (1997). Schedule interactions involv-ing punishment with pigeons and humans. Journal ofthe Experimental Analysis of Behavior, 68, 161–175.

DeGrandpre, R. J., Bickel, W. K., Hughes, J. R., Layng, M.P., & Badger, G. (1993). Unit price as a useful metricin analyzing effects of reinforcer magnitude. Journal ofthe Experimental Analysis of Behavior, 60, 641–666.

DeLeon, I. G., Iwata, B. A., Goh, H., & Worsdell, A. S.(1997). Emergence of reinforcer preference as afunction of schedule requirements and stimulussimilarity. Journal of Applied Behavior Analysis, 30,439–449.

de Villiers, P. A. (1980). Toward a quantitative theory ofpunishment. Journal of the Experimental Analysis ofBehavior, 33, 15–25.

Dewsbury, D. A. (2003). Conflicting approaches: Operantpsychology arrives at a primate laboratory. TheBehavior Analyst, 26, 253–265.

Dewsbury, D. A. (2006). Monkey Farm: A history of the YerkesLaboratories of Primate Biology, Orange Park, Florida,1930–1965. Lewisburg, PA: Bucknell University Press.

Fantino, E. (1977). Conditioned reinforcement: Choiceand information. In: W. K. Honig, & J. E. R. Staddon(Eds.), Handbook of operant behavior (pp. 313–339).Englewood Cliffs, NJ: Prentice-Hall.

Farley, J. (1980). Reinforcement and punishment effects inconcurrent schedules: A test of two models. Journal ofthe Experimental Analysis of Behavior, 33, 311–326.

Farley, J., & Fantino, E. (1978). The symmetrical law ofeffect and the matching relation in choice behavior.Journal of the Experimental Analysis of Behavior, 29,37–60.

Ferster, C. B., & Skinner, B. F. (1957). Schedules ofreinforcement. New York: Appleton-Century-Crofts.

TOKEN REINFORCEMENT 283

Page 28: reed.edureed.edu/psychology/hackenberg/pubs/hackenberg_token_review_2009.pdf · This paper is dedicated to the memory of E. F. Malagodi for his enduring contributions to the study

Fischhoff, B. (1995). Ranking risks. Risk: Health, Safety andEnvironment, 6, 189–200.

Fisher, E. B., Jr., Winkler, R. C., Krasner, L., Kagel, J.,Battalio, R. C., & Basmann, R. L. (1978). Economicperspectives in behavior therapy: Complex interde-pendencies in token economies. Behavior Therapy, 9,391–403.

Flora, S. R., & Pavlik, W. B. (1992). Human self-control andthe density of reinforcement. Journal of the ExperimentalAnalysis of Behavior, 57, 201–208.

Foster, T. A., & Hackenberg, T. D. (2004). Choice and unitprice in a token reinforcement context. Journal of theExperimental Analysis of Behavior, 81, 5–25.

Foster, T. A., Hackenberg, T. D., & Vaidya, M. (2001).Second-order schedules of token reinforcement withpigeons: Effects of fixed- and variable-ratio exchangeschedules. Journal of the Experimental Analysis ofBehavior, 76, 159–178.

Frederick, S., Loewenstein, G., & O’Donoghue, T. (2002).Time discounting and time preference: A criticalreview. Journal of Economic Literature, 40, 351–401.

Glynn, S. M. (1990). Token economy approaches forpsychiatric patients: progress and pitfalls over 25years. Behavior Modification, 14, 383–407.

Gollub, L. (1977). Conditioned reinforcement: Scheduleeffects. In: W. K. Honig, & J. E. R. Staddon (Eds.),Handbook of operant behavior (pp. 288–312). EnglewoodCliffs, NJ: Prentice-Hall.

Green, L., & Rachlin, H. (1975). Economic and biologicalinfluences on a pigeon’s key peck. Journal of theExperimental Analysis of Behavior, 23, 55–62.

Hackenberg, T. D. (2005). Of pigeons and people: Someobservations on species differences in choice and self-control. Brazilian Journal of Behavior Analysis, 1,135–147.

Hackenberg, T. D., & Vaidya, M. (2001). Second-orderschedules of token reinforcement with pigeons:Effects of fixed and variable-ratio exchange schedules.Journal of the Experimental Analysis of Behavior, 76,159–178.

Hackenberg, T. D., & Vaidya, M. (2003). Determinants ofpigeons’ choices in token-based self-control proce-dures. Journal of the Experimental Analysis of Behavior, 79,207–218.

Hanley, G. P., Iwata, B. A., Lindberg, J. S., & Conners, J.(2003). Response-restriction analysis: I. Assessment ofactivity preferences. Journal of Applied Behavior Analysis,36, 47–58.

Hanson, H. M., & Witoslawski, J. J. (1959). Interactionbetween the components of a chained schedule.Journal of the Experimental Analysis of Behavior, 2,171–177.

Higgins, S. T., Heil, S. H., & Lussier, J. P. (2004). Clinicalimplications of reinforcement as a determinant ofsubstance abuse disorders. Annual Review of Psychology,55, 431–461.

Horney, J., & Fantino, E. (1984). Choice for conditionedreinforcers in the signaled absence of primaryreinforcement. Journal of the Experimental Analysis ofBehavior, 41, 193–201.

Hursh, S. R. (1980). Economic concepts for the analysis ofbehavior. Journal of the Experimental Analysis of Behavior,34, 219–238.

Hursh, S. R. (1991). Behavioral economics of drug self-administration and drug abuse policy. Journal of theExperimental Analysis of Behavior, 56, 377–393.

Hursh, S. R., & Silberberg, A. (2008). Economic demandand essential value. Psychological Review, 115, 186–198.

Hyten, C., Madden, G. J., & Field, D. P. (1994). Exchangedelays and impulsive choice in adult humans. Journalof the Experimental Analysis of Behavior, 62, 225–233.

Jackson, K., & Hackenberg, T. D. (1996). Token reinforce-ment, choice, and self-control in pigeons. Journal of theExperimental Analysis of Behavior, 66, 29–49.

Jwaideh, A. R. (1973). Responding under chained andtandem fixed-ratio schedules. Journal of the Experimen-tal Analysis of Behavior, 19, 259–267.

Kagel, J. H. (1972). Token economies and experi-mental economics. Journal of Political Economy, 80,779–785.

Kagel, J. H., & Winkler, R. C. (1972). Behavioraleconomics: Areas of cooperative research betweeneconomics and applied behavioral analysis. Journal ofApplied Behavior Analysis, 5, 335–342.

Kahneman, D., & Tversky, A. (1979). Prospect theory: Ananalysis of decisions under risk. Econometrica, 47,263–291.

Kahneman, D., & Tversky, A. (Eds.). (2000). Choices, values,and frames. New York: Cambridge University Press.

Kazdin, A. E. (1977). The token economy: A review andevaluation. New York: Plenum Press.

Kazdin, A. E. (1978). History of behavior modification:Experimental foundations of contemporary research. Balti-more, MD: University Park Press.

Kazdin, A. E. (1982). The token economy: A decadelater. Journal of Applied Behavior Analysis, 15, 431–445.

Kelleher, R. T. (1956). Intermittent conditioned reinforce-ment in chimpanzees. Science, 124, 679–680.

Kelleher, R. T. (1958). Fixed-ratio schedules of condi-tioned reinforcement with chimpanzees. Journal of theExperimental Analysis of Behavior, 1, 281–289.

Kelleher, R. T. (1966a). Chaining and conditionedreinforcement. In: W. K. Honig (Ed.), Operantbehavior: Areas of research and application (pp. 160–212). New York: Appleton-Century-Crofts.

Kelleher, R. T. (1966b). Conditioned reinforcement insecond-order schedules. Journal of the ExperimentalAnalysis of Behavior, 9, 475–485.

Kelleher, R. T., & Goldberg, S. R. (1977). Fixed-intervalresponding under second-order schedules of foodpresentation or cocaine injection. Journal of theExperimental Analysis of Behavior, 28, 221–231.

Kelleher, R. T., & Gollub, L. R. (1962). A review of positiveconditioned reinforcement. Journal of the ExperimentalAnalysis of Behavior, 5, 543–597.

Killeen, P. (1974). Psychophysical distance functions forhooded rats. The Psychological Record, 24, 229–235.

Lea, S. E. G. (1978). The psychology and economics ofdemand. Psychological Bulletin, 85, 441–466.

Lieberman, D. A., Davidson, F. H., & Thomas, G. V.(1985). Marking in pigeons: The role of memory indelayed reinforcement. Journal of Experimental Psychol-ogy: Animal Behavior Processes, 11, 611–624.

Logue, A. W. (1988). Research on self-control: Anintegrating framework. Behavioral and Brain Sciences,11, 665–709 (includes commentary).

Logue, A. W., King, G. R., Chavarro, A., & Volpe, A. S.(1990). Matching and maximizing in a self-controlparadigm using human subjects. Learning and Motiva-tion, 21, 340–368.

284 TIMOTHY D. HACKENBERG

Page 29: reed.edureed.edu/psychology/hackenberg/pubs/hackenberg_token_review_2009.pdf · This paper is dedicated to the memory of E. F. Malagodi for his enduring contributions to the study

Logue, A. W., Pena-Correal, T. E., Rodriguez, M. L., &Kabela, E. (1986). Self-control in adult humans:Variation in positive reinforcer amount and delay.Journal of the Experimental Analysis of Behavior, 46,159–173.

Lubinski, D., & Thompson, T. (1987). An animal model ofthe interpersonal communication of interoceptive(private) states. Journal of the Experimental Analysis ofBehavior, 48, 1–15.

Madden, G. J., Bickel, W. K., & Jacobs, E. A. (2000). Threepredictions of the economic concept of unit price in achoice context. Journal of the Experimental Analysis ofBehavior, 73, 45–64.

Magoon, M. A., & Critchfield, T. S. (2008). Concurrentschedules of positive and negative reinforcement:Differential-impact and differential-outcomes hypoth-eses. Journal of the Experimental Analysis of Behavior, 90,1–22.

Malagodi, E. F. (1967a). Acquisition of the token-rewardhabit in the rat. Psychological Reports, 20, 1335–1342.

Malagodi, E. F. (1967b). Fixed-ratio schedules of tokenreinforcement. Psychonomic Science, 8, 469–470.

Malagodi, E. F. (1967c). Variable-interval schedules oftoken reinforcement. Psychonomic Science, 8, 471–472.

Malagodi, E. F. (1967d). Second-order chained andtandem schedules of token-reinforcement in the rat.Unpublished doctoral dissertation, University ofMiami.

Malagodi, E. F., Webbe, F. M., & Waddell, T. R. (1975).Second-order schedules of token reinforcement:Effects of varying the schedule of food presentation.Journal of the Experimental Analysis of Behavior, 24,173–181.

Mazur, J. E. (1983). Steady-state performance on fixed-,mixed-, and random-ratio schedules. Journal of theExperimental Analysis of Behavior, 39, 293–307.

McClure, S. M., Laibson, D. I., Loewenstein, G., &Cohen, J. D. (2004). Separate reward systems valueimmediate and delayed monetary rewards. Science,306, 503–507.

McFarland, J. M., & Lattal, K. A. (2001). Determinants ofreinforcer accumulation during an operant task.Journal of the Experimental Analysis of Behavior, 76,321–338.

Midgley, M., Lea, S. E. G., & Kirby, R. M. (1989).Algorithmic shaping and misbehavior in the acquisi-tion of token deposit by rats. Journal of the ExperimentalAnalysis of Behavior, 52, 27–40.

Munson, K. J., & Crosbie, J. (1998). Effects of response costin computerized programmed instruction. Psychologi-cal Record, 48, 233–251.

Myers, W. A., & Trapold, M. A. (1966). Two failures todemonstrate superiority of a generalized secondaryreinforcer. Psychonomic Science, 5, 321–322.

O’Donnell, J., Crosbie, J., Williams, D. C., & Saunders, K. J.(2000). Stimulus control and generalization of point-loss punishment with humans. Journal of the Experi-mental Analysis of Behavior, 73, 261–274.

Pietras, C. J., & Hackenberg, T. D. (2005). Response-costpunishment via token-loss with pigeons. BehaviouralProcesses, 69, 343–356.

Raiff, B. R., Bullock, C. E., & Hackenberg, T. D. (2008).Response-cost punishment with pigeons: Furtherevidence of response suppression via token loss.Learning & Behavior, 36, 29–41.

Rasmussen, E. B., & Newland, M. C. (2008). Asymmetry ofreinforcement and punishment in human choice.Journal of the Experimental Analysis of Behavior, 89,157–167.

Roane, H. S., Call, N. A., & Falcomata, T. S. (2005). Apreliminary analysis of adaptive responding underopen and closed economies. Journal of Applied BehaviorAnalysis, 38, 335–348.

Roll, J. M., Reilly, M. P., & Johanson, C. (2000). Theinfluence of exchange delays on cigarette versusmoney choice: A laboratory analog of voucher-basedreinforcement therapy. Experimental and Clinical Psy-chopharmacology, 8, 366–370.

Schmandt-Besserant, D. (1992). Before writing: Volume 1:From counting to cuneiform. Austin, TX: University ofTexas Press.

Silberberg, A., Roma, P. G., Huntsberry, M. E., Warren-Boulton, F. R., Sakagami, T., Ruggiero, A. M., et al.(2008). On loss aversion in capuchin monkeys.Journal of the Experimental Analysis of Behavior, 89,145–155.

Silverman, K. (2004). Exploring the limits and utility ofoperant conditioning in the treatment of drugaddiction. The Behavior Analyst, 27, 209–230.

Skinner, B. F. (1953). Science and human behavior. New York:Macmillan.

Sousa, C., & Matsuzawa, T. (2001). The use of tokens asrewards and tools by chimpanzees (Pan troglodytes).Animal Cognition, 4, 213–221.

Thaler, R. H. (Ed). (1993). Advances in Behavioral FinanceRussell Sage Foundation.

Tustin, R. D. (1994). Preference for reinforcers undervarying schedule arrangements: A behavioral econom-ic analysis. Journal of Applied Behavior Analysis, 27,597–606.

Tversky, A., & Kahneman, D. (1974). Judgment underuncertainty: Heuristics and biases. Science, 185,1124–1131.

Tversky, A., & Kahneman, D. (1992). Advances in prospecttheory: Cumulative representation of uncertainty.Journal of Risk and Uncertainty, 5, 297–323.

Waddell, T. R., Leander, J. D., Webbe, F. M., & Malagodi,E. F. (1972). Schedule interactions in second-orderfixed-interval (fixed-ratio) schedules of token rein-forcement. Learning and Motivation, 3, 91–100.

Webbe, F. M., & Malagodi, E. F. (1978). Second-orderschedules of token reinforcement: Comparisons ofperformance under fixed-ratio and variable-ratioexchange schedules. Journal of the Experimental Analysisof Behavior, 30, 219–224.

Weiner, H. (1962). Some effects of response cost uponhuman operant behavior. Journal of the ExperimentalAnalysis of Behavior, 5, 201–208.

Weiner, H. (1963). Some effects of response cost andaversive control of human operant behavior. Journal ofthe Experimental Analysis of Behavior, 6, 415–421.

Weiner, H. (1964). Response cost and fixed-ratio perfor-mance. Journal of the Experimental Analysis of Behavior, 7,79–81.

Williams, B. A. (1991). Marking and bridging versusconditioned reinforcement. Animal Learning & Behav-ior, 19, 264–269.

Williams, B. A. (1994a). Conditioned reinforcement:Experimental and theoretical issues. The BehaviorAnalyst, 17, 261–285.

TOKEN REINFORCEMENT 285

Page 30: reed.edureed.edu/psychology/hackenberg/pubs/hackenberg_token_review_2009.pdf · This paper is dedicated to the memory of E. F. Malagodi for his enduring contributions to the study

Williams, B. A. (1994b). Conditioned reinforcement:Neglected or outmoded explanatory construct? Psy-chonomic Bulletin & Review, 1, 457–475.

Williams, B. F., Williams, R. L., & McLaughlin, T. F.(1989). The use of token economies with indi-viduals who have developmental disabilities. In: E.Cipani (Ed.), The treatment of severe behavior dis-orders (pp. 3–18). Washington, DC: AAMR Publica-tions.

Winkler, R. C. (1970). Management of chronic psychiatricpatients by a token reinforcement system. Journal ofApplied Behavior Analysis, 3, 47–55.

Winkler, R. C. (1973). An experimental analysis ofeconomic balance, savings, and wages in a tokeneconomy. Behavior Therapy, 4, 22–40.

Winkler, R. C. (1980). Behavioral economics, token econo-mies, and applied behavior analysis. In: J. E. R. Staddon(Ed.), Limits to action (pp. 269–297). New York: AcademicPress.

Wolfe, J. B. (1936). Effectiveness of token rewards for chim-panzees. Comparative Psychological Monographs, 12, 1–72.

Yankelevitz, R. L., Bullock, C. E., & Hackenberg, T. D.(2008). Reinforcer accumulation in a token-reinforce-ment context. Journal of the Experimental Analysis ofBehavior, 90, 283–299.

Yerkes, R. M. (1943). Chimpanzees: A Laboratory colony. NewHaven, CT: Yake University Press.

Received: August 25, 2008Final acceptance: October 22, 2008

286 TIMOTHY D. HACKENBERG