The Matching Law (Richard J. Herrnstein)
Reinforcement schedule
Fixed-Ratio (FR) : the first response made after a given number of responses is reinforced. (ex. piecework system)
Variable-Ratio (VR) : similar to FR except that the number of responses required varies between reinforcements. (ex. Gambling)
Reinforcement schedule
Fixed-Interval (FI) : the first response made after a given time interval is reinforced. (ex. Stipend)
Variable-Interval (VI) : similar to FI except the interval requirements vary between reinforcers around some specified average value. (ex. Fishing)
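The schedule definitions above can be sketched as simple predicates that decide whether a given response earns reinforcement. This is a minimal illustration of the rules as stated, not code from the original slides; the VR implementation and parameter choices are my own assumptions.

```python
import random

def fixed_ratio(n):
    """FR(n): reinforce the first response after every n responses."""
    count = 0
    def respond():
        nonlocal count
        count += 1
        if count >= n:
            count = 0
            return True
        return False
    return respond

def variable_ratio(mean_n):
    """VR(mean_n): like FR, but the required count varies around mean_n
    between reinforcements (gambling-like payoff)."""
    count = 0
    target = random.randint(1, 2 * mean_n - 1)
    def respond():
        nonlocal count, target
        count += 1
        if count >= target:
            count = 0
            target = random.randint(1, 2 * mean_n - 1)
            return True
        return False
    return respond

def fixed_interval(interval):
    """FI(interval): reinforce the first response made after `interval`
    time units have elapsed since the last reinforcement.
    (A VI schedule is the same idea with the interval randomized
    around an average value.)"""
    last = 0.0
    def respond(t):
        nonlocal last
        if t - last >= interval:
            last = t
            return True
        return False
    return respond

# FR(3): every third response is reinforced.
fr3 = fixed_ratio(3)
print([fr3() for _ in range(6)])  # [False, False, True, False, False, True]
```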
Ch1 – Experimental design
In this experiment, reinforcement was given to pigeons when they pecked on either of two keys.
The reinforcement for one key was delivered on a variable-interval schedule that was independent of the schedule for the other key.
Ch1 – Experimental design
The mean time interval between reinforcements on each key was the primary independent variable.
These intervals were chosen so that the mean interval of reinforcement for the two keys was held constant at 1.5 minutes. For example, VI(3) VI(3); VI(2.25) VI(4.5); VI(1.5) VI(∞);
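The constraint that each pair of VI values yields an overall mean interval of 1.5 minutes can be verified by summing the reinforcement rates of the two keys. The schedule pairs and the 1.5-minute constant come from the slide; the arithmetic check itself is added here for illustration.

```python
# Each pair of VI values, in minutes; VI(inf) contributes a rate of 0.
pairs = [(3.0, 3.0), (2.25, 4.5), (1.5, float("inf"))]

for a, b in pairs:
    combined_rate = 1 / a + 1 / b      # reinforcements per minute across both keys
    mean_interval = 1 / combined_rate  # overall mean interval between reinforcements
    print(f"VI({a}) VI({b}): overall mean interval = {mean_interval:.4f} min")
# Each pair prints an overall mean interval of 1.5 minutes.
```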
Ch1 – Result
The relative frequency of responding to Key A is exactly equal to the relative frequency of reinforcement.
Ch1 – Result
The absolute rate of responding on each of the keys is approximately a linear function of the reinforcement rate, passing through the origin.
Ch1 – Discussion
Rate of responding is a linear measure of response strength.
The relative frequency of responding on a given key closely approximated the relative frequency of reinforcement on that key.
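The matching relation described above can be stated as a one-line prediction: the relative frequency of responding on a key equals the relative frequency of reinforcement obtained on it, B1/(B1+B2) = R1/(R1+R2). The reinforcement rates below are hypothetical numbers for illustration, not data from the experiment.

```python
def matching_prediction(r1, r2):
    """Relative frequency of responding on key A predicted by matching:
    B1 / (B1 + B2) = R1 / (R1 + R2)."""
    return r1 / (r1 + r2)

# Hypothetical obtained reinforcement rates (per hour) on keys A and B.
print(matching_prediction(40, 20))  # matching predicts 2/3 of responses on key A
```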
Ch4 – Maximizing vs Matching
From the maximizer's viewpoint, equilibrium is reached when no redistribution of choices can improve the obtained outcomes.
Matching requires the ratio of the frequencies of any two behaviors, B1 and B2, to match the ratio of their obtained reinforcements, R1 and R2: B1/B2 = R1/R2.
Ch4 – Melioration
How does an organism come to match its distribution of choices to the obtained reinforcements? By shifting behavior toward higher local rates of reinforcement.
If RD, the difference between the local rates of reinforcement of the two alternatives, is zero, equilibrium is achieved. When RD > 0, time allocation shifts toward t1; when RD < 0, it shifts toward t2.
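The melioration process can be sketched numerically. The sketch below assumes a simplified concurrent VI VI model in which each alternative's obtained reinforcement rate roughly equals its programmed rate r_i, so the local rate is r_i divided by the time allocated to it; the specific rates and step size are illustrative assumptions. At equilibrium (RD = 0), r1/t1 = r2/(1-t1), i.e. t1 = r1/(r1+r2), which is exactly matching.

```python
# Melioration sketch: repeatedly shift time allocation toward the
# alternative with the higher local rate of reinforcement.
r1, r2 = 2.0, 1.0   # assumed obtained reinforcement rates (per minute)
t1 = 0.5            # initial proportion of time on alternative 1
step = 0.01         # illustrative adjustment size

for _ in range(2000):
    local1 = r1 / t1        # local reinforcement rate on alternative 1
    local2 = r2 / (1 - t1)  # local reinforcement rate on alternative 2
    rd = local1 - local2    # RD > 0 shifts allocation toward t1
    if rd > 0:
        t1 = min(0.99, t1 + step)
    elif rd < 0:
        t1 = max(0.01, t1 - step)

print(round(t1, 2))  # settles near r1/(r1+r2) = 2/3, the matching allocation
```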
Ch4 – Comparison between matching and maximization
Melioration implies that behavior maximizes total reinforcement, RT, under two and only two conditions, as follows:
Ch4 – Concurrent VR VR
When alternative 1 reinforces with a higher probability than alternative 2:
Melioration predicts preference for alternative 1, since RD>0 at all allocations.
Maximization predicts likewise, because RT is at its maximum at t1=1.
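On concurrent VR VR the two accounts agree, and a toy calculation shows why: because VR pays off per response, the local rate on each alternative is constant, so total reinforcement RT grows monotonically with the time given to the richer key. The probabilities and response rate below are illustrative assumptions, not values from the slides.

```python
p1, p2 = 0.10, 0.05  # assumed reinforcement probabilities per response (p1 > p2)
rate = 60            # assumed responses per minute, same on both alternatives

def total_reinforcement(t1):
    """RT: overall reinforcements per minute at time allocation t1."""
    return rate * (t1 * p1 + (1 - t1) * p2)

allocations = [0.0, 0.25, 0.5, 0.75, 1.0]
print([round(total_reinforcement(t), 2) for t in allocations])
# RT rises monotonically with t1; the maximum is at exclusive preference, t1 = 1.
```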
Ch4 – Concurrent VI VR
Local rate of reinforcement for the VI, where:
V : scheduled average interreinforcement time
d1 : average interresponse time during responding on the VI
I : a measure of interchangeover times between the two schedules
t1 : proportion of time spent on the VI
(1-t1) : proportion of time spent on the VR
Local rate of reinforcement for the VR
Ch4 – Concurrent VI VR
The optimal strategy for conc VI VR would seem to call for spending most of the time on the VR, with occasional forays to the VI to collect any reinforcement that has come due. Nevertheless, no subject displayed any such bias toward the VR.
Solid : best-fitting line
Dashed : prediction of matching
Dot-dashed : prediction of maximization
Ch4 – Concurrent VI VR: Divergence between the two theories
The value of RT when RD=0 is about 15 percent lower than when RT is maximized; this is the reinforcement cost of matching.
Ch4 – Mazur’s Experiment
A VI 45-second schedule randomly and equally often assigned each dark period to one key or the other.
During the 3-second dark periods, a small ration of food was delivered with the following probability:
Ch4 – Mazur’s Experiment
For a maximizer, the pigeons should always sample each alternative frequently and equally.
For a matcher, the pigeons should shift preference along with the proportions of food yielded by each key.
Ch4 – Vaughan’s Experiment
Modified conc VI VI schedule: schedule values were updated every 4 minutes of responding.
In condition a, the left schedule reinforces at a higher rate than the right schedule; in condition b, vice versa.
Ch4 – Vaughan’s Experiment: Maximization picture
In either condition, a subject earned the maximum, 180 reinforcements per hour, by spending 0.125-0.25 of its time responding to the right alternative.
Ch4 – Vaughan’s Experiment: Melioration picture
Melioration should have held choice within the interval 0.125-0.25 during condition a and within the interval 0.75-0.875 during condition b.
Limits of Melioration
Comparative: most of the data showing the differing predictions of maximizing and melioration came from pigeons.
Psychophysical: ambiguity in the meaning of “local” rates of reinforcement.
Motivational: food was used as the only reinforcer.
Procedural: melioration can be generalized from concurrent to single-response procedures and multiple schedules, but there is no fully satisfactory formula for multiple-schedule responding yet.
Inherent Limits
Is the class of equally reinforced movements also the class of maximally reinforced movements?