Instrumental Learning
![Page 1: Instrumental Learning](https://reader034.fdocuments.us/reader034/viewer/2022042703/5681350f550346895d9c63e6/html5/thumbnails/1.jpg)
Instrumental Learning
All learning in which an animal operates on its environment to obtain a reinforcement.
Operant (Skinnerian) conditioning
![Page 2: Instrumental Learning](https://reader034.fdocuments.us/reader034/viewer/2022042703/5681350f550346895d9c63e6/html5/thumbnails/2.jpg)
1. Thorndike and the law of effect
• The animal becomes more likely to repeat the behavior it emitted just before the reward.
• As Thorndike would say, “the memory becomes stamped in.”
![Page 3: Instrumental Learning](https://reader034.fdocuments.us/reader034/viewer/2022042703/5681350f550346895d9c63e6/html5/thumbnails/3.jpg)
Many different types of learning apparatus were tried between 1900 and 1945:
• Escape learning for cats (Thorndike, Guthrie).
• Rat Jumping Stand (Myers)
• Complex Maze Learning (Tolman, Lashley)
• T-maze (Hull, Spence)
• Morris water-maze (modern day development)
![Page 4: Instrumental Learning](https://reader034.fdocuments.us/reader034/viewer/2022042703/5681350f550346895d9c63e6/html5/thumbnails/4.jpg)
The Common element
• All used a fixed trial presentation method.
• Either every subject received the same fixed experience and the amount learned varied from animal to animal,
• or subjects received a variable amount of experience until each reached a fixed learning criterion.
![Page 5: Instrumental Learning](https://reader034.fdocuments.us/reader034/viewer/2022042703/5681350f550346895d9c63e6/html5/thumbnails/5.jpg)
• The measure was either the time taken to reach the goal (mazes) or the number of trials needed to reach the criterion.
![Page 6: Instrumental Learning](https://reader034.fdocuments.us/reader034/viewer/2022042703/5681350f550346895d9c63e6/html5/thumbnails/6.jpg)
Skinner and Operant Behavior
The unique feature of operant training is that the experimenter waits until the animal makes the specified response before giving a reward. This is called free operant behavior.
![Page 7: Instrumental Learning](https://reader034.fdocuments.us/reader034/viewer/2022042703/5681350f550346895d9c63e6/html5/thumbnails/7.jpg)
Reward vs. Reinforcement
• A reward is a global state of affairs given to the whole animal (food, electric shock).
• A reinforcement is given for the specific, discrete response the animal makes to obtain the reward.
![Page 8: Instrumental Learning](https://reader034.fdocuments.us/reader034/viewer/2022042703/5681350f550346895d9c63e6/html5/thumbnails/8.jpg)
Primary reinforcers
• Eating, drinking, and sex
• Addictive drugs
• Animals can be given the equivalent of money for humans, e.g., poker chips or marbles.
• When used in this way, these are called conditioned reinforcers.
![Page 9: Instrumental Learning](https://reader034.fdocuments.us/reader034/viewer/2022042703/5681350f550346895d9c63e6/html5/thumbnails/9.jpg)
Positive and Negative Reinforcement
• Both positive and aversive stimuli can be used to guide behavior. Both are used to increase a desired response. The reinforcement is delivered shortly after the desired response is emitted.
![Page 10: Instrumental Learning](https://reader034.fdocuments.us/reader034/viewer/2022042703/5681350f550346895d9c63e6/html5/thumbnails/10.jpg)
Reinforcement & Punishment
• Concept – Positive Reinforcement
![Page 11: Instrumental Learning](https://reader034.fdocuments.us/reader034/viewer/2022042703/5681350f550346895d9c63e6/html5/thumbnails/11.jpg)
Description
• Increasing the frequency of a behavior by following it with the presentation of a positive reinforcer – a pleasant, positive stimulus or experience.
![Page 12: Instrumental Learning](https://reader034.fdocuments.us/reader034/viewer/2022042703/5681350f550346895d9c63e6/html5/thumbnails/12.jpg)
Example
• Saying “Good job” after someone works hard to perform a task.
![Page 13: Instrumental Learning](https://reader034.fdocuments.us/reader034/viewer/2022042703/5681350f550346895d9c63e6/html5/thumbnails/13.jpg)
Types of reinforcers
• Appetitive – usually food
• Negative – shock, air puff; stimuli that deliver pain or discomfort.
![Page 14: Instrumental Learning](https://reader034.fdocuments.us/reader034/viewer/2022042703/5681350f550346895d9c63e6/html5/thumbnails/14.jpg)
Positive Reinforcement
![Page 15: Instrumental Learning](https://reader034.fdocuments.us/reader034/viewer/2022042703/5681350f550346895d9c63e6/html5/thumbnails/15.jpg)
Concept:
• Negative reinforcer
![Page 16: Instrumental Learning](https://reader034.fdocuments.us/reader034/viewer/2022042703/5681350f550346895d9c63e6/html5/thumbnails/16.jpg)
Negative Reinforcement
![Page 17: Instrumental Learning](https://reader034.fdocuments.us/reader034/viewer/2022042703/5681350f550346895d9c63e6/html5/thumbnails/17.jpg)
Note the following
• The removal of a negative stimulus is positively reinforcing – the animal will tend to perform the behavior that removes it from the cues associated with the aversive state of affairs.
![Page 18: Instrumental Learning](https://reader034.fdocuments.us/reader034/viewer/2022042703/5681350f550346895d9c63e6/html5/thumbnails/18.jpg)
Reinforcement/Punishment
![Page 19: Instrumental Learning](https://reader034.fdocuments.us/reader034/viewer/2022042703/5681350f550346895d9c63e6/html5/thumbnails/19.jpg)
Shaping
• Shaping is the method by which one gets the animal to accomplish the desired response in the first place.
![Page 20: Instrumental Learning](https://reader034.fdocuments.us/reader034/viewer/2022042703/5681350f550346895d9c63e6/html5/thumbnails/20.jpg)
• The final behavior desired is broken down into small steps or increments. The accomplishment of the first step leads directly to the next step in the chain.
![Page 21: Instrumental Learning](https://reader034.fdocuments.us/reader034/viewer/2022042703/5681350f550346895d9c63e6/html5/thumbnails/21.jpg)
How to train a monkey to hit a key.
![Page 22: Instrumental Learning](https://reader034.fdocuments.us/reader034/viewer/2022042703/5681350f550346895d9c63e6/html5/thumbnails/22.jpg)
Continuous reinforcement
• A reinforcement is given for every desired response. When the reinforcement stops, the animal stops responding.
![Page 23: Instrumental Learning](https://reader034.fdocuments.us/reader034/viewer/2022042703/5681350f550346895d9c63e6/html5/thumbnails/23.jpg)
Intermittent reinforcement
• Intermittent reinforcement is more resistant to extinction than continuous reinforcement.
![Page 24: Instrumental Learning](https://reader034.fdocuments.us/reader034/viewer/2022042703/5681350f550346895d9c63e6/html5/thumbnails/24.jpg)
Appetitive Schedules of reinforcement
• Schedules of reinforcement are based on one of two criteria: the number of responses or the passage of time.
![Page 25: Instrumental Learning](https://reader034.fdocuments.us/reader034/viewer/2022042703/5681350f550346895d9c63e6/html5/thumbnails/25.jpg)
Fixed Ratio (FR)
• A fixed ratio schedule delivers a reinforcement after a given number of responses has been performed.
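A minimal sketch of the fixed ratio rule, to make the counting explicit. The class name `FixedRatio` and its `respond` method are hypothetical names chosen for illustration; the only thing taken from the slide is that every N-th response is reinforced.

```python
class FixedRatio:
    """Hypothetical FR schedule: reinforce every `ratio`-th response."""

    def __init__(self, ratio: int) -> None:
        if ratio < 1:
            raise ValueError("ratio must be at least 1")
        self.ratio = ratio
        self.count = 0  # responses made since the last reinforcement

    def respond(self) -> bool:
        """Register one response; return True if it is reinforced."""
        self.count += 1
        if self.count == self.ratio:
            self.count = 0  # start counting toward the next reinforcement
            return True
        return False


fr5 = FixedRatio(5)
outcomes = [fr5.respond() for _ in range(10)]
# Only the 5th and 10th responses are reinforced.
```

With `ratio=1` the same sketch reduces to continuous reinforcement, where every response is reinforced.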
![Page 26: Instrumental Learning](https://reader034.fdocuments.us/reader034/viewer/2022042703/5681350f550346895d9c63e6/html5/thumbnails/26.jpg)
![Page 27: Instrumental Learning](https://reader034.fdocuments.us/reader034/viewer/2022042703/5681350f550346895d9c63e6/html5/thumbnails/27.jpg)
![Page 28: Instrumental Learning](https://reader034.fdocuments.us/reader034/viewer/2022042703/5681350f550346895d9c63e6/html5/thumbnails/28.jpg)
Variable Ratio (VR)
• Here the number of responses required for each reinforcement varies about a mean value.
• The slope is not quite as steep as in fixed ratio.
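A sketch of one way a variable ratio schedule could be simulated. The slide does not say how the requirement is distributed around its mean; drawing it uniformly from 1 to 2 × mean − 1 is an assumption made here so that the average requirement equals the nominal mean. `VariableRatio` is a hypothetical name.

```python
import random


class VariableRatio:
    """Hypothetical VR schedule: the response requirement varies about `mean`."""

    def __init__(self, mean: int, rng: random.Random) -> None:
        if mean < 1:
            raise ValueError("mean must be at least 1")
        self.mean = mean
        self.rng = rng
        self.count = 0  # responses made since the last reinforcement
        self.required = self._draw()

    def _draw(self) -> int:
        # Assumption: uniform on 1 .. 2*mean - 1, whose expected value is `mean`.
        return self.rng.randint(1, 2 * self.mean - 1)

    def respond(self) -> bool:
        """Register one response; return True if it is reinforced."""
        self.count += 1
        if self.count >= self.required:
            self.count = 0
            self.required = self._draw()  # new requirement for the next reinforcement
            return True
        return False


vr10 = VariableRatio(10, random.Random(0))
reinforced = sum(vr10.respond() for _ in range(1000))
# `reinforced` should be in the neighborhood of 1000 / 10.
```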
![Page 29: Instrumental Learning](https://reader034.fdocuments.us/reader034/viewer/2022042703/5681350f550346895d9c63e6/html5/thumbnails/29.jpg)
![Page 30: Instrumental Learning](https://reader034.fdocuments.us/reader034/viewer/2022042703/5681350f550346895d9c63e6/html5/thumbnails/30.jpg)
![Page 31: Instrumental Learning](https://reader034.fdocuments.us/reader034/viewer/2022042703/5681350f550346895d9c63e6/html5/thumbnails/31.jpg)
Fixed interval (FI)
• Here, a reinforcement is delivered for the first response made after a fixed amount of time has passed.
• Note the scalloping of the cumulative record.
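The fixed interval rule can be sketched as follows. `FixedInterval` is a hypothetical name, and timing each new interval from the moment of the previous reinforcement is an assumption; the slide only states that the first response after the interval is reinforced.

```python
class FixedInterval:
    """Hypothetical FI schedule: reinforce the first response after `interval`."""

    def __init__(self, interval: float) -> None:
        if interval <= 0:
            raise ValueError("interval must be positive")
        self.interval = interval
        self.available_at = interval  # time at which reinforcement becomes available

    def respond(self, t: float) -> bool:
        """Register a response at time `t`; return True if it is reinforced."""
        if t >= self.available_at:
            # Assumption: the next interval is timed from this reinforcement.
            self.available_at = t + self.interval
            return True
        return False


fi10 = FixedInterval(10)
response_times = [2, 9, 10, 12, 19, 21, 35]
fi_outcomes = [fi10.respond(t) for t in response_times]
# Responses at 10, 21 and 35 are reinforced; the others come too early.
```

Responses made before the interval has elapsed earn nothing, which is consistent with the low response rate seen early in each scallop.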
![Page 32: Instrumental Learning](https://reader034.fdocuments.us/reader034/viewer/2022042703/5681350f550346895d9c63e6/html5/thumbnails/32.jpg)
![Page 33: Instrumental Learning](https://reader034.fdocuments.us/reader034/viewer/2022042703/5681350f550346895d9c63e6/html5/thumbnails/33.jpg)
Variable Interval (VI)
• Variable interval is similar to the FI schedule, except that the time lapse between the availability of successive reinforcements is varied, for example, 1, 3, 2, etc. The schedule is named after the mean amount of time passed. Again, the reinforcement is delivered for the first response made after the interval has passed.
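A sketch of the variable interval rule using the slide's example intervals 1, 3, 2. `VariableInterval` is a hypothetical name; cycling through a fixed list of intervals and timing each interval from the previous reinforcement are both assumptions made to keep the example deterministic.

```python
from itertools import cycle
from typing import Sequence


class VariableInterval:
    """Hypothetical VI schedule that cycles through a list of intervals."""

    def __init__(self, intervals: Sequence[float]) -> None:
        if not intervals:
            raise ValueError("at least one interval is required")
        self.intervals = list(intervals)
        self._upcoming = cycle(self.intervals)
        self.available_at = next(self._upcoming)

    @property
    def mean_interval(self) -> float:
        """The mean interval, after which the schedule is named."""
        return sum(self.intervals) / len(self.intervals)

    def respond(self, t: float) -> bool:
        """Register a response at time `t`; return True if it is reinforced."""
        if t >= self.available_at:
            # Assumption: the next interval is timed from this reinforcement.
            self.available_at = t + next(self._upcoming)
            return True
        return False


vi = VariableInterval([1, 3, 2])
vi_outcomes = [vi.respond(t) for t in [0.5, 1.0, 2.0, 4.5, 6.0, 6.5, 7.5]]
```

Here `vi.mean_interval` is 2.0, so under the naming rule above this would be a VI 2 schedule in whatever time unit the intervals use.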
![Page 34: Instrumental Learning](https://reader034.fdocuments.us/reader034/viewer/2022042703/5681350f550346895d9c63e6/html5/thumbnails/34.jpg)
![Page 35: Instrumental Learning](https://reader034.fdocuments.us/reader034/viewer/2022042703/5681350f550346895d9c63e6/html5/thumbnails/35.jpg)
VI
• Note that in variable interval schedules one does not see the scalloping one sees in FI schedules. The slope is not as steep as in VR or FR schedules.
![Page 36: Instrumental Learning](https://reader034.fdocuments.us/reader034/viewer/2022042703/5681350f550346895d9c63e6/html5/thumbnails/36.jpg)
Differential Reinforcement for Rate
• In ratio schedules there is a contingency between the rate of responding and the rate of reinforcement. That is, the faster the animal responds, the sooner it gets a reinforcement. The contingency is weaker for interval schedules but still present.
![Page 37: Instrumental Learning](https://reader034.fdocuments.us/reader034/viewer/2022042703/5681350f550346895d9c63e6/html5/thumbnails/37.jpg)
Setting up a Differential Rate
• Reinforcement is made contingent on the number of responses within a given time interval. The key is to control the rate of response per unit time, i.e., to control the inter-response time (IRT).
![Page 38: Instrumental Learning](https://reader034.fdocuments.us/reader034/viewer/2022042703/5681350f550346895d9c63e6/html5/thumbnails/38.jpg)
Differential Reinforcement for High Rates (DRH)
• Here the animal must, for example, respond 10 times in 5 seconds. Each time this criterion is met, the animal is reinforced after the last response.
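A sketch of the DRH criterion from the example of 10 responses in 5 seconds. `DRH` is a hypothetical name. Two assumptions are made: "within 5 seconds" is read as the time from the first to the last of the counted responses, and the count restarts after each reinforcement.

```python
from collections import deque


class DRH:
    """Hypothetical DRH schedule: `n_responses` must fall within `window` seconds."""

    def __init__(self, n_responses: int = 10, window: float = 5.0) -> None:
        self.n_responses = n_responses
        self.window = window
        # Keeps only the most recent `n_responses` response times.
        self.times: deque = deque(maxlen=n_responses)

    def respond(self, t: float) -> bool:
        """Register a response at time `t`; return True if it is reinforced."""
        self.times.append(t)
        fast_enough = (
            len(self.times) == self.n_responses
            and t - self.times[0] <= self.window
        )
        if fast_enough:
            self.times.clear()  # assumption: counting restarts after reinforcement
        return fast_enough


fast = DRH()
fast_outcomes = [fast.respond(0.5 * i) for i in range(10)]  # one response every 0.5 s
slow = DRH()
slow_outcomes = [slow.respond(1.0 * i) for i in range(30)]  # one response every 1 s
```

In this sketch the fast responder is reinforced on its 10th response, while the slow responder never meets the criterion.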
![Page 39: Instrumental Learning](https://reader034.fdocuments.us/reader034/viewer/2022042703/5681350f550346895d9c63e6/html5/thumbnails/39.jpg)
Differential Reinforcement for Low Rates of Responding (DRL)
• Here the animal must withhold responding for a criterion period of, say, 10 sec. If the animal responds before the 10 sec have elapsed, the clock is reset and the wait period starts over.
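The reset rule of DRL can be sketched in a few lines. `DRL` is a hypothetical name, and restarting the clock after a reinforced response as well as after a premature one is an assumption; the slide only describes the reset for early responses.

```python
class DRL:
    """Hypothetical DRL schedule: a response is reinforced only after `wait` seconds."""

    def __init__(self, wait: float = 10.0, start: float = 0.0) -> None:
        self.wait = wait
        self.clock_started = start  # time at which the current wait period began

    def respond(self, t: float) -> bool:
        """Register a response at time `t`; return True if it is reinforced."""
        reinforced = (t - self.clock_started) >= self.wait
        # Every response restarts the clock, whether or not it was reinforced.
        self.clock_started = t
        return reinforced


drl = DRL(wait=10.0)
drl_outcomes = [drl.respond(t) for t in [4, 12, 22, 25, 36]]
# The responses at 4, 12 and 25 are premature and reset the clock.
```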
![Page 40: Instrumental Learning](https://reader034.fdocuments.us/reader034/viewer/2022042703/5681350f550346895d9c63e6/html5/thumbnails/40.jpg)
Current theory postulates two underlying processes
• The animal forms a temporal discrimination.
• The animal actively inhibits responding (it uses ancillary responses, not directed at the requisite key or bar, to pass the time).
![Page 41: Instrumental Learning](https://reader034.fdocuments.us/reader034/viewer/2022042703/5681350f550346895d9c63e6/html5/thumbnails/41.jpg)
DRH/DRL
• The animal must respond within a window of time: after a specified minimum time has passed, but before an upper time limit is exceeded.
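A sketch of the combined window criterion, expressed on the inter-response time (IRT). `IRTWindow` and the limits 5 and 8 seconds are hypothetical; treating a too-late response as simply unreinforced, with the clock restarting from it, is also an assumption.

```python
class IRTWindow:
    """Hypothetical DRH/DRL window: reinforce IRTs between `lower` and `upper`."""

    def __init__(self, lower: float, upper: float, start: float = 0.0) -> None:
        if lower > upper:
            raise ValueError("lower must not exceed upper")
        self.lower = lower
        self.upper = upper
        self.last = start  # time of the previous response (or session start)

    def respond(self, t: float) -> bool:
        """Register a response at time `t`; return True if it is reinforced."""
        irt = t - self.last
        self.last = t  # the next IRT is measured from this response
        return self.lower <= irt <= self.upper


window = IRTWindow(lower=5.0, upper=8.0)
window_outcomes = [window.respond(t) for t in [3, 9, 20, 27]]
# IRTs are 3 (too early), 6 (in window), 11 (too late), 7 (in window).
```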
![Page 42: Instrumental Learning](https://reader034.fdocuments.us/reader034/viewer/2022042703/5681350f550346895d9c63e6/html5/thumbnails/42.jpg)
• Wyler/Prim study using single neuron
![Page 43: Instrumental Learning](https://reader034.fdocuments.us/reader034/viewer/2022042703/5681350f550346895d9c63e6/html5/thumbnails/43.jpg)
Negative Control of Behavior
• Behavior emitted that removes an aversive state of affairs.
![Page 44: Instrumental Learning](https://reader034.fdocuments.us/reader034/viewer/2022042703/5681350f550346895d9c63e6/html5/thumbnails/44.jpg)
Negative reinforcer
Description: Increasing the frequency of a behavior by following it with the removal of an unpleasant stimulus or experience.
![Page 45: Instrumental Learning](https://reader034.fdocuments.us/reader034/viewer/2022042703/5681350f550346895d9c63e6/html5/thumbnails/45.jpg)
Concept
• Avoidance conditioning
![Page 46: Instrumental Learning](https://reader034.fdocuments.us/reader034/viewer/2022042703/5681350f550346895d9c63e6/html5/thumbnails/46.jpg)
Avoidance conditioning
• Description: Learning to make a response that avoids an unpleasant stimulus.
![Page 47: Instrumental Learning](https://reader034.fdocuments.us/reader034/viewer/2022042703/5681350f550346895d9c63e6/html5/thumbnails/47.jpg)
Example
• You slow your car to the speed limit when you spot a police car, thus avoiding being stopped and reducing the fear of a fine. Avoidance responses are very resistant to extinction.
![Page 48: Instrumental Learning](https://reader034.fdocuments.us/reader034/viewer/2022042703/5681350f550346895d9c63e6/html5/thumbnails/48.jpg)
1. Escape and Avoidance
![Page 49: Instrumental Learning](https://reader034.fdocuments.us/reader034/viewer/2022042703/5681350f550346895d9c63e6/html5/thumbnails/49.jpg)
The control of Intrinsic behavior
• Avoidance tasks involve removing oneself from an environment that has previously been associated with a negative reinforcer.
![Page 50: Instrumental Learning](https://reader034.fdocuments.us/reader034/viewer/2022042703/5681350f550346895d9c63e6/html5/thumbnails/50.jpg)
Sidman Avoidance
• Shock-Shock interval (shock every 5 sec)
![Page 51: Instrumental Learning](https://reader034.fdocuments.us/reader034/viewer/2022042703/5681350f550346895d9c63e6/html5/thumbnails/51.jpg)
S. A. (cont.)
• Response-shock interval (the delay of the shock following a bar push)
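A sketch of how the two Sidman avoidance intervals could interact. The 5-second shock-shock interval comes from the slide; the 20-second response-shock interval, the function name `sidman_shocks`, and the rule that a response made exactly when a shock is due cancels that shock are all assumptions for illustration.

```python
from typing import Iterable, List


def sidman_shocks(
    response_times: Iterable[float],
    session_end: float,
    ss: float = 5.0,   # shock-shock interval (from the slide)
    rs: float = 20.0,  # response-shock interval (hypothetical value)
) -> List[float]:
    """Return the times at which shocks are delivered during the session."""
    shocks: List[float] = []
    next_shock = ss
    # A sentinel at infinity flushes the shocks that follow the last response.
    for r in sorted(response_times) + [float("inf")]:
        # Deliver every shock scheduled before this response.
        while next_shock < r and next_shock <= session_end:
            shocks.append(next_shock)
            next_shock += ss
        if r > session_end:
            break
        next_shock = r + rs  # each response postpones the next shock
    return shocks


no_response = sidman_shocks([], session_end=20)
two_responses = sidman_shocks([3, 10], session_end=40)
steady = sidman_shocks([4, 19, 34], session_end=50)
```

In this sketch an animal that never responds is shocked every 5 seconds, while one that responds at least once per response-shock interval receives no shocks at all.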
![Page 52: Instrumental Learning](https://reader034.fdocuments.us/reader034/viewer/2022042703/5681350f550346895d9c63e6/html5/thumbnails/52.jpg)
S. A. (cont.)
• Very, very hard to extinguish.
• VAN - chimp
![Page 53: Instrumental Learning](https://reader034.fdocuments.us/reader034/viewer/2022042703/5681350f550346895d9c63e6/html5/thumbnails/53.jpg)
VIII. Punishment – different types
![Page 54: Instrumental Learning](https://reader034.fdocuments.us/reader034/viewer/2022042703/5681350f550346895d9c63e6/html5/thumbnails/54.jpg)
Punishment 2 (Penalty)
![Page 55: Instrumental Learning](https://reader034.fdocuments.us/reader034/viewer/2022042703/5681350f550346895d9c63e6/html5/thumbnails/55.jpg)
Example
• You learn to use the mute button on the TV remote control to remove the sound of an obnoxious commercial.
![Page 56: Instrumental Learning](https://reader034.fdocuments.us/reader034/viewer/2022042703/5681350f550346895d9c63e6/html5/thumbnails/56.jpg)
Concept
• Escape Conditioning
![Page 57: Instrumental Learning](https://reader034.fdocuments.us/reader034/viewer/2022042703/5681350f550346895d9c63e6/html5/thumbnails/57.jpg)
Escape Conditioning
• Description: Learning to make a response that removes an unpleasant stimulus
![Page 58: Instrumental Learning](https://reader034.fdocuments.us/reader034/viewer/2022042703/5681350f550346895d9c63e6/html5/thumbnails/58.jpg)
Example
• A little boy learns that crying will cut short the time that he must stay in his room.
![Page 59: Instrumental Learning](https://reader034.fdocuments.us/reader034/viewer/2022042703/5681350f550346895d9c63e6/html5/thumbnails/59.jpg)
Concept
• Punishment
![Page 60: Instrumental Learning](https://reader034.fdocuments.us/reader034/viewer/2022042703/5681350f550346895d9c63e6/html5/thumbnails/60.jpg)
Punishment
• Description: Decreasing the frequency of a behavior by either presenting an unpleasant stimulus (punishment 1) or removing a pleasant one (punishment 2, or penalty).
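The four concepts defined in these slides (positive reinforcement, negative reinforcement, punishment 1, and punishment 2) differ only in whether a stimulus is presented or removed and whether it is pleasant or unpleasant. A small lookup makes that grid explicit; the function name `classify` and the string labels for its arguments are hypothetical.

```python
def classify(change: str, stimulus: str) -> str:
    """Name the procedure given what happens to which kind of stimulus.

    `change` is "present" or "remove"; `stimulus` is "pleasant" or "unpleasant".
    """
    table = {
        ("present", "pleasant"): "positive reinforcement",    # behavior increases
        ("remove", "unpleasant"): "negative reinforcement",   # behavior increases
        ("present", "unpleasant"): "punishment 1",            # behavior decreases
        ("remove", "pleasant"): "punishment 2 (penalty)",     # behavior decreases
    }
    return table[(change, stimulus)]


swat_the_dog = classify("present", "unpleasant")
take_toy_away = classify("remove", "pleasant")
```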
![Page 61: Instrumental Learning](https://reader034.fdocuments.us/reader034/viewer/2022042703/5681350f550346895d9c63e6/html5/thumbnails/61.jpg)
Example
• You swat the dog after it steals food from the table, or you take a favorite toy away from a child who misbehaves. A number of cautions should be kept in mind when using punishment (see below for an example).
![Page 62: Instrumental Learning](https://reader034.fdocuments.us/reader034/viewer/2022042703/5681350f550346895d9c63e6/html5/thumbnails/62.jpg)
Learned helplessness
• Continued punishment until the animal refuses to respond even when there is no aversive state of affairs.
![Page 63: Instrumental Learning](https://reader034.fdocuments.us/reader034/viewer/2022042703/5681350f550346895d9c63e6/html5/thumbnails/63.jpg)
Combined Operant and Classical Conditioning