Timeout from a high-force requirement as a reinforcer: An effective procedure for human operant research

Behavioural Processes 99 (2013) 1–6. Contents lists available at ScienceDirect: Behavioural Processes. Journal homepage: www.elsevier.com/locate/behavproc

Jérôme Alessandri, Vinca Rivière
University of Lille Nord de France, France

Article history: Received 12 January 2013; received in revised form 20 May 2013; accepted 30 May 2013.

Keywords: humans; self-control; timeout from a high-force requirement; negative reinforcement.

Abstract: A procedure to study human operant conditioning is described using a timeout from a high-force requirement as the reinforcer when a high-force response was required. Experiment 1 provided evidence that a timeout from a high-force requirement acted as a reinforcer, and a second experiment demonstrated sensitivity to the delay to escape from the force requirement as a parameter of choice in a self-control paradigm. The results of the two experiments indicate a functional similarity between unconditioned reinforcers (e.g., food) used with nonhuman subjects and the present reinforcer, demonstrating that the present procedure is well suited to the study of operant conditioning in humans.

© 2013 Elsevier B.V. All rights reserved.

1. Introduction

The most prevalent reinforcement method for studying operant conditioning in humans is the use of points exchangeable for money (Kangas and Hackenberg, 2009). The main advantage of this method is that money is a powerful generalized reinforcer that is effective in a large variety of situations, as it can be exchanged for a large number of goods and services (Skinner, 1953). For this reason, money can be used as a reinforcer for most adult humans despite individual differences in preference and reinforcer efficacy (Kangas and Hackenberg, 2009).
However, the financial cost is a serious obstacle for many researchers who use human participants (Hake, 1982; Kangas and Hackenberg, 2009). One limitation of using different reinforcers across species (humans vs. nonhumans) is that different results may be obtained depending on whether an unconditioned reinforcer or a generalized reinforcer is used. For example, under a self-control procedure (i.e., a choice between an immediate, shorter reinforcer and a delayed, larger reinforcer), humans generally prefer the self-control option when monetary reinforcement is used (for a review, see Logue, 1988), whereas they mostly prefer the impulsive alternative when unconditioned reinforcement is used, as do pigeons (Kirk and Logue, 1997). These conflicting results reveal a relative lack of sensitivity to the delay to conditioned reinforcement as a parameter of choice, which may be due to the fact that points, unlike food, are not immediately consumable (Hyten et al., 1994). This may pose a serious problem when assessing the interspecies generality of behavioral principles. Because operant research with nonhuman organisms began earlier and has been more predominant than research involving humans, animal data often are used as the benchmark (Perone et al., 1988) to which human data are expected to conform. Thus, comparable reinforcement procedures seem necessary in order to produce similar performances in human and nonhuman subjects. As a consequence, it is of interest to develop an alternative procedure for studying operant conditioning in humans using a reinforcing stimulus whose reinforcing properties are similar to those of food.

(Corresponding author at: Department of Psychology, University of Lille Nord de France, F-59000 Lille, France. Tel.: +33 621145149. E-mail address: [email protected] (J. Alessandri).)
Thus, the present report explored an alternative reinforcement procedure to study human operant conditioning, using a timeout from a high-force requirement (hereafter called a "break") while the participant was required to press a force cell with the maximum force possible (for an earlier development, see Azrin, 1960). It was assumed that pressing with high force is aversive (Alessandri et al., 2008), the termination of which could serve as a reinforcer. In Experiment 1, the reinforcing function of a break was tested under a multiple variable-interval (VI) yoked variable-time (VT) schedule in which reinforcement frequency and distribution were equated across the two components. It was expected that responding would be maintained only in the VI component. In Experiment 2, participants were trained on a self-control procedure to assess sensitivity to delay as a parameter of choice. A preference for the immediate, shorter option was expected.

0376-6357/$ – see front matter © 2013 Elsevier B.V. All rights reserved. http://dx.doi.org/10.1016/j.beproc.2013.05.014


2. Experiment 1

2.1. Materials and methods

2.1.1. Participants
The participants were 7 undergraduate students at the University of Lille III, all of whom were volunteers. No course credit or money was given for participation. One participant was removed from the dataset because he took free breaks while he was required to press the force cell, leaving 6 participants in the analyses reported here (2 male and 4 female). All of the participants were right-handed. Each participant gave informed consent.

2.1.2. Apparatus
Each participant sat alone in a quiet room, at a table with a response console containing a Novatech Mini40 ATi force cell (Tatem Industrial Automation Ltd., Derby, U.K.; diameter of 40 mm and height of 12 mm) that served to measure force, a computer monitor, a computer mouse, and a computer keyboard (inactive in Experiment 1). The force cell was mounted on the table at the left side and the computer mouse was placed at the right side. All participants were trained and tested with a program created with LabVIEW 8.6 (National Instruments Corporation, Austin, TX), presented on a computer monitor.

2.1.3. Procedure
2.1.3.1. Force criterion assessment. The force criterion was defined separately for each participant at the beginning of the first session. We asked each participant to press the force cell with the maximum force possible with the thumb of his or her left hand during three 10-s intervals separated from each other by a 3-s break. They were given the following instructions:

When the message "press" appears on the screen, please press the force cell with the maximum force possible and continuously with your left thumb until the message "break" appears on the screen. Please don't use other digits to press the force cell.

For each participant, the force criterion was defined as at least 75% of this maximum force. The actual high force used varied considerably, from a low of 30 N to a high of 220 N. The force criterion assessment was intended to equate the reinforcing value of breaks across participants. During the multiple-schedule training, a vertical gauge was displayed at the top center of the screen and indicated the proportion of the force criterion exerted by the participant (updated every 0.1-s). Participants were encouraged to maintain the indicator at the top of the gauge (i.e., the force criterion), and they were given the following instructions:

Please try to reach and maintain the indicator at the top of the gauge as much as possible. At no time are you allowed to stop pressing except when the message "break" appears on the screen. Please follow these recommendations because it is highly important for the sake of the experiment.

A vertical gauge was used because a pilot study showed that the force level exhibited was much higher when the gauge was in place than when it was not.

Before the first session of the day, participants were required to press with the maximum force possible during 120-s (with the gauge displayed). This condition was designed to serve as an establishing operation (EO) for escape behavior (i.e., the response that produced the break) during the multiple-schedule condition (Michael, 1982).

2.1.3.2. Response. The reinforced response was clicking the left button of the computer mouse while the mouse cursor (a 0.5-cm solid white arrow) was on a circle (diameter of 6 cm). At the beginning of the first session, participants were only told that they could click on the circle (they were not told that responding produced breaks).
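As a rough illustration, the criterion computation and gauge reading described above can be sketched as follows (the function names and sampling details are ours; the original program was written in LabVIEW):

```python
# Sketch of the force-criterion assessment and gauge display of
# Experiment 1. Names and structure are illustrative only.

def force_criterion(peak_forces, proportion=0.75):
    """Return the force criterion: 75% of the maximum force recorded
    across the three 10-s maximum-force presses."""
    return proportion * max(peak_forces)

def gauge_level(current_force, criterion):
    """Proportion of the criterion currently exerted, capped at 1.0
    (the top of the vertical gauge); updated every 0.1 s."""
    return min(current_force / criterion, 1.0)

# Example: peak forces of 180 N, 200 N and 190 N across the three
# presses give a criterion of 150 N.
crit = force_criterion([180.0, 200.0, 190.0])
print(crit)                      # 150.0
print(gauge_level(75.0, crit))   # 0.5 -> indicator at mid-gauge
print(gauge_level(160.0, crit))  # 1.0 -> indicator at the top
```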

2.1.3.3. Reinforcement. A 10-s break served as the reinforcer. It was signaled by the message "break" and was terminated by the message "press." Participants were instructed as follows:

Please stop pressing while the message "break" appears on the screen, but immediately press the force cell after seeing the message "press."

2.1.3.4. Schedule. Participants were trained on a multiple VI yoked-VT schedule of reinforcement. The mean VI value was 10-s, and the schedule comprised 12 intervals from the Fleshler and Hoffman (1962) progression that were sampled at random without replacement until all 12 had been used. Reinforcement frequency and distribution were equated across the two components (i.e., the VI component was run first, and the number and temporal distribution of the reinforcers delivered in it were then reproduced in the following yoked VT component). Each component was signaled by a distinct color of a circle displayed at the center of the screen (white or black for the VI and VT components, counterbalanced across participants; the background screen color was gray). Each component lasted 100-s, and the session terminated after three VI components and three yoked VT components.
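The schedule mechanics just described can be sketched in Python. The Fleshler and Hoffman (1962) constant-probability progression is standard; the sampling and yoking code around it is our own illustration, not the authors' LabVIEW implementation:

```python
import math
import random

def fleshler_hoffman(mean_s, n):
    """Return the n Fleshler & Hoffman (1962) intervals whose
    arithmetic mean is mean_s, approximating a constant probability
    of reinforcement in time."""
    def xlogx(k):
        return k * math.log(k) if k > 0 else 0.0  # convention: 0 * ln 0 = 0
    return [mean_s * (1 + math.log(n) + xlogx(n - i) - xlogx(n - i + 1))
            for i in range(1, n + 1)]

# Twelve intervals with a 10-s mean, sampled at random without
# replacement until all 12 have been used, as in the VI component.
intervals = fleshler_hoffman(10.0, 12)
order = random.sample(intervals, k=len(intervals))

# Yoking: the break deliveries earned in the VI component are replayed
# response-independently, with the same timing, in the VT component.
vt_component = list(order)

print(round(sum(intervals) / len(intervals), 6))  # 10.0
```

Because the progression telescopes, the obtained mean equals the programmed VI value exactly, which is why the 12 intervals can simply be exhausted without replacement.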

Three sessions typically were conducted per day, separated from each other by a 5-min break. The total number of sessions completed ranged from 12 to 15.

2.2. Results and discussion

Fig. 1 shows cumulative responses across sessions for each participant. Participants developed responding in the VI component that remained stable across sessions. In contrast, responding in the VT component occurred during the first sessions but soon declined and disappeared for all of the participants except P5 (whose response rate nevertheless was slightly higher in the VI component). These results demonstrate that the break acted as a reinforcer for clicking on the circle, as responding was maintained only when this event was response-produced (cf. Catania, 1992; Skinner, 1953; for similar results using primary reinforcers, see Lattal and Maxey, 1971).

3. Experiment 2

The results of Experiment 1 show that timeout from a high-force requirement can be a potent reinforcer when made available to humans through interval schedules. Experiment 2 assessed sensitivity to the delay to reinforcement using a self-control procedure. With food reinforcement (for a review, see Logue, 1988) or negative reinforcement such as escape from noise (e.g., Navarick, 1982), it has been shown that subjects may prefer an immediate reinforcer to a delayed but larger reinforcer, demonstrating the primacy of the delay variable in determining preference. A similar outcome here would increase the relevance and utility of this procedure.

3.1. Materials and methods

3.1.1. Participants and apparatus
The participants were 13 undergraduate students at the University of Lille III. All were volunteers. No course credit or money was given for participation. Two participants were removed from the experiment because of noncompliance in pressing with high force (under 50% of the force criteria) during the first session. An additional participant was removed for medical reasons. This left 10 participants in the analyses reported here (3 males and 7 females). Eight of the participants were right-handed; 2 were left-handed but were used to using the computer mouse with the right hand. Each participant gave informed consent. The same apparatus was used as in Experiment 1.

Fig. 1. The cumulative number of responses across sessions for each participant in Experiment 1 as a function of the VI component and the VT component of the multiple schedule.

3.1.2. Procedure
3.1.2.1. Force criteria assessment. Two criteria were determined at the beginning of the first session according to the same procedure as in Experiment 1. First, the peak force criterion consisted in measuring the maximum force, as in Experiment 1. Second, the average force criterion consisted in measuring the highest of the average forces from each of the three intervals. As in Experiment 1, a gauge indicating the force exerted by the participant was displayed in the same way.

3.1.2.2. Response and reinforcement. The keyboard, and not the mouse, was used in Experiment 2. The choice response consisted of pressing either the left or the right arrow key, and the same reinforcer, a period of break from a required force exertion, was used as in Experiment 1.

3.1.2.3. Series 1: Self-control procedure with the same overall reinforcement duration. Before each session, participants were asked to remove their watches and any timing devices so that they could not use them to time the delays in the choice trials. Participants were trained on the discrete-trials procedure diagrammed in Fig. 2, with four forced-choice trials (two for each alternative) followed by choice trials. The session terminated after 420-s had elapsed during the choice trials. Prior to the choice phase, participants were required to press the force cell during a period varying from 7-s (8-s for P3 in Series 1 and P4 in Series 2, both in the second condition) to 25-s (36-s for P3 in Series 1 and P4 in Series 2, both in the second condition), depending on their performance. That served as an establishing operation for the choice behavior that produced the break (Michael, 1982), and it consisted of three steps, in all of which the participants were required to press with the maximum force possible to attain and maintain the top of the gauge. In Step 1, at the end of a 5-s period, the average force exerted during this period had to equal or exceed the average force criterion. If the average force criterion was met, the participant proceeded to the next step; if not, a message appeared ("You have not pressed with enough force, try again.") and a 5-s period was re-presented only once, after which the participant proceeded whether or not the average force criterion was met. Next, in Step 2, the participant had to maintain the peak force criterion (i.e., the top of the gauge) for 2-s (3-s for P3 in Series 1 and P4 in Series 2, both in the second condition) to proceed to the final step, or proceeded after 10-s (20-s for P3 in Series 1 and P4 in Series 2, both in the second condition) had elapsed whatever force level was produced. Finally, in Step 3, a message was displayed instructing the participant to "choose between the left arrow key and the right arrow key, while you have to continue pressing with the maximum force possible". A choice response was reinforced either if the participant maintained the peak force criterion for 1-s, or after 5-s had elapsed whatever his or her force level. This step was introduced to make sure that participants continued to press while they were choosing. The average force and peak force criteria were adjusted trial by trial depending on whether or not the participant had attained them. If the participant had attained the average force criterion, then this criterion was increased by 20% in the following trial; if he had failed in the first interval but had succeeded in the second interval, then the average force criterion did not change; and if he had failed in both intervals, then the average force criterion was decreased by 10%. In the same way, if the peak force criterion had been attained in Steps 2 and 3, then this criterion was increased by 20%; if it had been attained in only one of the two steps, then it did not change; and if it had been attained in neither, then it was decreased by 10%.
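The trial-by-trial titration of the force criteria just described can be summarized as a simple rule (a sketch: the 20% increment and 10% decrement are taken from the text, but the function name and framing are ours):

```python
def adjust_criterion(criterion, met_first, met_second):
    """Titrate a force criterion between trials: raise it 20% when the
    first opportunity is met, leave it unchanged when only the repeated
    opportunity is met, and lower it 10% when both are failed."""
    if met_first:
        return criterion * 1.20
    if met_second:
        return criterion
    return criterion * 0.90

# Hypothetical run for an average force criterion starting at 100 N.
c = 100.0
c = adjust_criterion(c, True, True)    # met at once
c = adjust_criterion(c, False, True)   # met only on the retry
c = adjust_criterion(c, False, False)  # failed both opportunities
print(round(c, 6))  # 108.0
```

This kind of titration keeps the requirement near the most each participant can sustain, which is how the procedure equates effort across participants of very different strength.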

Fig. 2. Schematic diagram of the procedure in Experiment 2. First, participants were required to meet three stages of force requirement. AFC stands for average force criterion and PFC stands for peak force criterion (see text for further details). When the term "force" appears alone, it means that no force criterion was required to achieve the next stage, but participants were asked to press with the maximum force possible. The left portion of the figure shows the impulsive option and indicates the sequence of events when it was in effect following a response on the corresponding arrow key (following the meeting of the PFC 2-s criterion or after 5-s in the choice phase; see text for further details). The right portion shows the self-control option and indicates the sequence of events when it was in effect following a reinforced response on the corresponding arrow key. Please note the variations in ITI durations (total time from trial onset to next trial onset) and force durations in the impulsive option between Series 1 and Series 2.

The impulsive choice resulted in the immediate presentation of a 5-s break, whereas the self-control choice resulted in a 10-s force period followed by a 10-s break. Each choice response was signaled by a corresponding arrow (length of 4 cm) displayed at the corresponding side (e.g., at the left if a left choice response was made) throughout the outcome phase. The self-control alternative was alternated between the left and right sides across sessions. Each reinforcer was followed by a 15-s interval that provided a supplementary phase of force recovery, in which the participant was asked to "press the force cell with low force". Note that although the reinforcement duration was twice as long in the self-control choice as in the impulsive choice, the overall reinforcement duration was roughly the same between the two options because the intertrial interval was not equated. A control condition was implemented in which the only change relative to the previous condition was that the delay to the longer reinforcer was removed. This condition assessed sensitivity to duration of reinforcement in the absence of differential delay. As in Experiment 1, three sessions typically were conducted per day and were separated by 5-min breaks.

3.1.2.4. Series 2: Self-control procedure with different overall reinforcement durations. The procedure used in Series 2 was similar to that of Series 1 except that the impulsive choice provided a 15-s force period following reinforcement to equate the intertrial intervals. Thus, the self-control choice provided a longer overall reinforcement duration than the impulsive choice, as typically is the case in the literature (for a review, see Logue, 1988). In the control condition, the interval after reinforcement also was adjusted (15-s following the shorter reinforcer and 10-s following the larger reinforcer) to equate the intertrial intervals between the two alternatives.
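Under our reading of Fig. 2, the two outcome sequences can be tallied as follows (the event lists, including the 15-s low-force recovery interval, are a reconstruction from the figure and are an assumption rather than the authors' code):

```python
def trial_time(events):
    """Total time (s) from a choice response to the next trial onset."""
    return sum(duration for _, duration in events)

# Series 1 outcome sequences (durations in seconds).
impulsive_s1 = [("break", 5), ("low-force recovery", 15)]
self_control = [("force", 10), ("break", 10), ("low-force recovery", 15)]

# Series 2 appends a 15-s force period to the impulsive option,
# equating the intertrial intervals of the two options.
impulsive_s2 = impulsive_s1 + [("force", 15)]

print(trial_time(impulsive_s1), trial_time(self_control))    # 20 35
print(trial_time(impulsive_s2) == trial_time(self_control))  # True
```

These totals match the ITI values noted in the Fig. 2 caption (20-s vs. 35-s in Series 1, 35-s for both options in Series 2), which is why only Series 2 gives the self-control option a genuinely larger overall reinforcement amount per unit of time.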

3.2. Results and discussion

Fig. 3 shows the choice proportion for the longer break for each participant in Series 1. All of the participants but one strongly preferred the immediate, shorter break. This preference was reversed when the longer reinforcer was presented immediately, and preference for the shorter reinforcer returned when the delay was reinstated, indicating sensitivity to reinforcement duration. For the only participant who showed a preference for the self-control choice (P3), an increase in the force duration prior to choice was investigated to increase the potency of an immediate break. A decrease in preference for the self-control choice was found, but no preference reversal to the impulsive choice was observed.

Fig. 3. The choice proportions for each participant in Experiment 2, Series 1, on the larger reinforcer when it was delayed (D) or immediate (ND). The asterisk marks an increase in the pre-force duration (see text for further details).

Fig. 4 shows the choice proportion for the longer break for each participant in Series 2. As in Series 1, all of the participants but one largely preferred the immediate, shorter break, a finding similar to that typically found with unconditioned reinforcement (for a review of related research with pigeons, see Logue, 1988; and with humans, see Kirk and Logue, 1997). Once again, a preference reversal was observed when the longer reinforcer was presented immediately, and preference for the shorter reinforcer returned when the longer reinforcer was delayed, except for one participant (P5). The results for this participant were highly variable but seemed to depend on the order of sessions across the day: preference for the self-control choice declined more and more as time passed. Furthermore, for another participant (P2), an error in setting the peak force criterion (a decrease of about 20% relative to the usual criterion) for a single session was followed by an exceptional shift in preference toward the self-control choice. This reveals the high sensitivity of self-control choice to the force exhibited before choice.

Fig. 4. The choice proportions for each participant in Experiment 2, Series 2, on the larger reinforcer when it was delayed (D) or immediate (ND). A single asterisk indicates an increase in the pre-force duration (see text for further details) and a double asterisk indicates an error in setting the peak force criterion (a decrease of about 20% relative to the usual criterion) for this single session.

Only one participant (P4) did not show a preference for the impulsive choice in the first condition. For this participant, as in Series 1, an increase in the force duration prior to choice was investigated in an attempt to increase the potency of an immediate break, but no systematic decrease in preference for the self-control choice was found despite a transient decrease in the first session.

4. General discussion

The present report tested the reliability of an effective procedure to study operant conditioning in humans using a timeout from a high-force requirement as reinforcement, a high-force requirement as a motivational operation, and a click on a circle or a keyboard press as the response. Experiment 1 demonstrated the adequacy of a break as a reinforcer. Indeed, it established behavior rapidly, maintained it at a stable level over an extended number of sessions (Perone and Galizio, 1987), and brought it under schedule control by response-dependent and response-independent reinforcement. Responding was maintained by negative reinforcement (i.e., suspension of the force requirement) (Skinner, 1953), but this consequence was scheduled in essentially the same way as positive reinforcers such as food presentation (Perone and Galizio, 1987).

Experiment 2 showed that the sensitivity of choice to the delay of break reinforcement resembled the sensitivity of choice to delay shown in prior studies. Indeed, as with food reinforcement in humans and nonhumans (for a review of related research with pigeons, see Logue, 1988; and in humans, see Kirk and Logue, 1997), participants preferred a shorter but immediate reinforcer over a longer but delayed reinforcer, showing the primacy of reinforcement delay over reinforcement duration as a determinant of choice. Such strong impulsive preference also has been observed with adult humans when another negative reinforcer (i.e., noise offset) was used (e.g., Navarick, 1982).

As was stated in the introduction, it is of interest to provide evidence of the applicability in humans of the behavioral principles discovered in animals. For replicating animal data, the use of the break as reinforcement in humans presents several advantages relative to monetary reinforcement (note that it does not preclude the use of monetary reinforcement, or of other generalized or conditioned reinforcement, for the study of human behavior under more naturalistic contingencies). First, it is relatively inexpensive compared to monetary reinforcement (a force cell can cost less than $100 and has a relatively long lifespan). Second, it shows more functional similarity to primary reinforcement than points exchangeable for money, as the results of Experiment 2 have shown, even though it should be categorized as a negative reinforcer (but the necessity of the distinction between positive and negative reinforcement has been cast into doubt in the literature; see, for example, Michael, 1975). Finally, the effectiveness of the break as a reinforcer seems not to depend on pairings with other events occurring outside the laboratory, as is the case with points exchangeable for money (i.e., spending money), but rather depends on the required duration and level of force. Also, the use of the break as reinforcement may be preferable to the use of other primary reinforcers such as food or liquid, because it is easier to control the deprivation level for breaks than for food or liquid.

There were limitations of the current study that might be addressed in future research. First, the response rates in Experiment 1 were relatively low (from about 2 to 8 responses per minute) compared to those typically observed in the literature. Informal observations revealed long pauses after reinforcement, indicating that a break may not be reinforcing during this time period. Perhaps the use of shorter reinforcement intervals would increase the response rate and decrease the postreinforcement pauses. Second, several participants were removed because they were suspected of not complying with the requirements to attain the force criterion or to press with the maximum force possible. For the remaining participants, it was virtually impossible to assess whether they pressed with the maximum force possible at all times, even though they attained the force criteria more than about 60% of the time. This is not surprising given that pressing with high force is an aversive activity for humans (Alessandri et al., 2008); escape or avoidance from this situation by pressing with lower force may therefore be tempting for every participant. Perhaps the use of external consequences, such as course credit contingent on the achievement of a given force criterion, would increase participants' compliance.

Acknowledgments

I thank Romain Mazzoleni and Mike Perfillon for their assistance with data collection. I especially thank Andy Lattal for his comments and his help in proofreading the revised version.

References

Alessandri, J., Darcheville, J.-C., Delevoye-Turrell, Y., Zentall, T.R., 2008. Preference for rewards that follow greater effort and greater delay. Learn. Behav. 36, 352–358.

Azrin, N.H., 1960. Use of rests as reinforcers. Psychol. Rep. 7, 240.

Catania, A.C., 1992. Learning, 3rd ed. Prentice-Hall, Englewood Cliffs, NJ.

Fleshler, M., Hoffman, H.S., 1962. A progression for generating variable-interval schedules. J. Exp. Anal. Behav. 5, 529–530.

Hake, D.F., 1982. The basic-applied continuum and the possible evolution of human operant social and verbal research. Behav. Anal. 5, 21–28.

Hyten, C., Madden, G.J., Field, D.P., 1994. Exchange delays and impulsive choice in adult humans. J. Exp. Anal. Behav. 62, 225–233.

Kangas, B.D., Hackenberg, T.D., 2009. On reinforcing human behavior in the laboratory: a brief review and some recommendations. Exp. Anal. Human Behav. Bull. 27, 21–26.

Kirk, J.M., Logue, A.W., 1997. Effects of deprivation level on humans' self-control for food reinforcers. Appetite 28, 215–226.

Lattal, K.A., Maxey, G.C., 1971. Some effects of response-independent reinforcers in multiple schedules. J. Exp. Anal. Behav. 16, 225–231.

Logue, A.W., 1988. Research on self-control: an integrating framework. Behav. Brain Sci. 11, 665–679.

Michael, J., 1975. Positive and negative reinforcement, a distinction that is no longer necessary; or a better way to talk about bad things. Behaviorism 3, 33–44.

Michael, J., 1982. Distinguishing between discriminative and motivational functions of stimuli. J. Exp. Anal. Behav. 37, 149–155.

Navarick, D.J., 1982. Negative reinforcement and choice in humans. Learn. Motiv. 13, 361–377.

Perone, M., Galizio, M., 1987. Variable-interval schedules of timeout from avoidance. J. Exp. Anal. Behav. 47, 97–113.

Perone, M., Galizio, M., Baron, A., 1988. The relevance of animal-based principles in the laboratory study of human operant conditioning. In: Davey, G., Cullen, C. (Eds.), Human Operant Conditioning and Behavior Modification. Wiley, New York, pp. 59–85.

Skinner, B.F., 1953. Science and Human Behavior. The Macmillan Company, New York.