Instrumental Conditioning
Also called Operant Conditioning
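Instrumental Conditioning Procedures
• Positive reinforcement: positive contingency (R produces O), appetitive outcome; response increases
• Punishment: positive contingency (R produces O), aversive outcome; response decreases
• Omission training: negative contingency (R prevents O), appetitive outcome; response decreases
• Negative reinforcement: negative contingency (R prevents O), aversive outcome; response increases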
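The four procedures form a 2 x 2 classification (type of contingency by type of outcome). A minimal Python sketch of that lookup follows; the function name and the string labels are illustrative choices, not terms from the slides.

```python
# Illustrative lookup for the four instrumental conditioning procedures.
# Keys are (contingency, outcome); the names used here are hypothetical labels.
PROCEDURES = {
    ("positive", "appetitive"): ("positive reinforcement", "increases"),
    ("positive", "aversive"): ("punishment", "decreases"),
    ("negative", "appetitive"): ("omission training", "decreases"),
    ("negative", "aversive"): ("negative reinforcement", "increases"),
}


def classify_procedure(contingency, outcome):
    """Return (procedure name, effect on responding).

    contingency: "positive" (R produces O) or "negative" (R prevents O)
    outcome: "appetitive" or "aversive"
    """
    return PROCEDURES[(contingency, outcome)]


if __name__ == "__main__":
    for key in PROCEDURES:
        name, effect = classify_procedure(*key)
        print(f"{key[0]} contingency, {key[1]} outcome: {name}; response {effect}")
```

Note that both reinforcement procedures increase responding, while punishment and omission training decrease it.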
Instrumental Conditioning involves three key elements:
• a response
• an outcome (the reinforcer)
• a relation, or contingency, between the R and O
The Instrumental Response
• usually an arbitrary motor response
• for example, bar-pressing has nothing to do with eating food
• there are limits on the types of responses that can be modified by instrumental conditioning
• relevance, or belongingness, is an issue in instrumental conditioning as well as in Pavlovian conditioning
Relevance, or Belongingness, in Instrumental Conditioning
Certain responses naturally ‘belong with’ the reinforcer because of the animal’s evolutionary history
Just as not all CSs are equally associable with all USs, not all responses are equally conditionable with all reinforcers
Shettleworth tried to condition various behaviors with food reward in hamsters
• used a number of different behaviors
• e.g., digging and face-washing
[Figure: mean time spent in behavior across trials, for digging and face-washing]
• some responses are more relevant to food reward than others
• behaviors such as digging increase the chances of coming in contact with food
• face-washing won’t increase the chances of coming in contact with food; may even interfere with food-related behaviors
Instinctive Drift
• The Brelands trained many different species to perform tricks for ads, movies, etc. (e.g., pigs putting coins in a piggy bank).
Often they found that once the response was trained, it would deteriorate; other “instinctive” behaviors (e.g., rooting the coins) would “drift” in and interfere with performance of the operant response.
The pigs treated the coins as if they were food, and these food-related behaviors interfered with the response the Brelands were trying to condition.
The Instrumental Reinforcer
Increases in the quantity or quality of the reinforcer can increase the rate of responding
Experiment by Hutt (1954) – described in the text
In runway experiments, animals will run faster for bigger reward
Responding to a particular reward also depends on an animal’s past experience with other reinforcers
Experiment by Mellgren (1972) – described in the text
Experiment by Crespi (1942)
3 groups of rats were given 20 trials to run down an alleyway for food
Group 1: large reward – 64 pellets
Group 2: medium reward – 16 pellets
Group 3: small reward – 4 pellets
[Figure: mean running speed at baseline for Groups 1, 2, and 3]
Crespi (1942)
In phase 2, the reward level was switched for 2 groups
Group 1: 64 pellets to 16 pellets; negative contrast
Group 2: 16 pellets to 16 pellets (no change)
Group 3: 4 pellets to 16 pellets; positive contrast
Crespi compared groups who were switched to 16 pellets from a large or small reward to a group consistently given 16 pellets
[Figure: mean running speed across baseline and shift (test) trials for the 4-16, 16-16, and 64-16 groups]
Negative contrast (64-16 pellets): ran slower than the 16-16 group
Positive contrast (4-16 pellets): ran faster than the 16-16 group
Positive and negative contrast indicate that behavior is not just affected by current conditions
Performance is also affected by previous reward conditions
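The logic of the Crespi shift design can be sketched in a few lines of Python. This only labels each group by the direction of its reward shift; it does not model running speed, and the function and variable names are illustrative assumptions rather than anything from the original study.

```python
# Label the expected contrast for a reward shift (pellet counts from the slides).
def contrast_type(pre_pellets, post_pellets):
    """Classify a shift in reward magnitude between phase 1 and phase 2."""
    if pre_pellets > post_pellets:
        return "negative contrast"  # downshift: runs slower than unshifted group
    if pre_pellets < post_pellets:
        return "positive contrast"  # upshift: runs faster than unshifted group
    return "no shift"  # comparison group


# (phase 1 pellets, phase 2 pellets) for each group
groups = {"Group 1": (64, 16), "Group 2": (16, 16), "Group 3": (4, 16)}

if __name__ == "__main__":
    for name, (pre, post) in groups.items():
        print(f"{name}: {pre} to {post} pellets, {contrast_type(pre, post)}")
```

All three groups receive the same 16 pellets in phase 2, so any difference in speed during the shift phase reflects prior reward history rather than the current reward.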
The Response – Reinforcer Relation
Two types of relationships exist between a response and a reinforcer:
• temporal relationship: temporal contiguity refers to the delivery of the reinforcer immediately after the response
• causal relationship: response-reinforcer contingency refers to the extent to which the response is necessary and sufficient for the occurrence of the reinforcer
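One common way to put a number on response-reinforcer contingency is the difference between the probability of the outcome given a response and the probability of the outcome given no response (often written ΔP). The slides do not specify a formula, so the sketch below is an assumption offered for illustration, and the trial counts in the usage lines are hypothetical.

```python
# Contingency as delta-P = P(O | R) - P(O | no R).
# This measure is an illustrative assumption; the slides give no formula.
def delta_p(r_and_o, r_no_o, no_r_and_o, no_r_no_o):
    """Compute delta-P from four counts of observation periods.

    r_and_o:    response occurred, outcome occurred
    r_no_o:     response occurred, no outcome
    no_r_and_o: no response, outcome occurred
    no_r_no_o:  no response, no outcome
    """
    with_r = r_and_o + r_no_o
    without_r = no_r_and_o + no_r_no_o
    if with_r == 0 or without_r == 0:
        raise ValueError("need at least one period with and one without a response")
    return r_and_o / with_r - no_r_and_o / without_r


if __name__ == "__main__":
    # Hypothetical counts, not data from any experiment.
    print(delta_p(10, 0, 0, 10))  # response necessary and sufficient
    print(delta_p(5, 5, 5, 5))    # outcome independent of the response
    print(delta_p(0, 10, 10, 0))  # response prevents the outcome
```

Under this measure, a value of 1 corresponds to a response that is both necessary and sufficient for the reinforcer, 0 to no contingency, and negative values to a negative contingency (R prevents O).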
Effects of temporal contiguity
Instrumental learning is disrupted by delaying the reinforcer after the response
Dickinson et al. (1992)
• rats were reinforced for lever-pressing
• varied the delay between occurrence of the response and delivery of the reinforcer
[Figure: lever presses per minute as a function of reinforcer delay (s)]
Why is instrumental conditioning so sensitive to a delay of reinforcement?
Delay makes it difficult to figure out which response is being reinforced
There are ways to overcome the problem:
1. Provide a secondary, or conditioned, reinforcer immediately after the response, even if the primary reinforcer does not occur until later
A secondary or conditioned reinforcer is a conditioned stimulus that was previously associated with the reinforcer
Conditioned reinforcers can serve to ‘bridge’ a delay between the response and the primary reinforcer
2. Another technique that facilitates learning with delayed reinforcement is to mark the target response to distinguish it from other responses
The marking procedure was demonstrated by Lieberman et al. (1979)
They tested whether rats could learn a correct turn or choice in a maze despite a long delay of reward
[Figure: maze layout with start box, choice box (white and black alleyways), delay box, and goal box]
Subjects were placed in the start box and allowed to choose one of two alleyways (white was correct)
Three groups:
Group 1: Light – after they made a choice, rats in this group received a 2 s light (regardless of choice) and were allowed to go to the delay box
Group 2: Noise – treated the same, except with a 2 s noise
Group 3: Control – no stimulus; went directly to the delay box after the choice
All rats were confined to the delay box for 2 min, then allowed to go to the goal box. Food was given, but only if they had chosen white.
Results:
[Figure: mean percent correct (50 to 100) across trials for the Control, Noise, and Light groups]
Control group stayed at approximately 50% correct
Light and Noise groups learned the discrimination (i.e., learned to choose white over black)
So why did the light and noise improve discrimination learning?
• the cues helped to mark the choice response in memory
• after making a choice and receiving the light or noise, subjects more effectively rehearsed the choice they had just made
• when reward was given later on, after the 2 min delay, the memory for the previous choice was stronger
• these effects of marking cannot be explained in terms of secondary or conditioned reinforcement because the marking stimulus was presented after both correct and incorrect choices
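The sequence of events on a single trial in the marking procedure can be laid out as a short Python sketch. It encodes only what the slides state (2 s marker, 2 min delay, food only after a white choice); the function name, group labels, and event strings are illustrative assumptions.

```python
# Sketch of one trial in the marking procedure described above.
MARKER_S = 2    # duration of the light or noise, in seconds
DELAY_S = 120   # time confined to the delay box (2 min), in seconds

# Marker presented to each group after the choice (None for the control group).
MARKERS = {"light": "light", "noise": "noise", "control": None}


def trial_events(group, choice):
    """Return the ordered list of events for one trial.

    group: "light", "noise", or "control"
    choice: "white" (correct) or "black" (incorrect)
    """
    events = [f"choose {choice}"]
    marker = MARKERS[group]
    if marker is not None:
        # The marker follows every choice, correct or not.
        events.append(f"{marker} for {MARKER_S} s")
    events.append(f"delay box for {DELAY_S} s")
    events.append("goal box: food" if choice == "white" else "goal box: no food")
    return events


if __name__ == "__main__":
    for choice in ("white", "black"):
        print(trial_events("light", choice))
```

Because the marker event is identical after white and black choices, it carries no information about whether food will follow, which is why a conditioned-reinforcement account does not fit these results.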