Burrhus Frederic Skinner - Henderson State University (fac.hsu.edu/ahmada/3 Courses/2 Learning/Learning...)

1

Burrhus Frederic Skinner(1904 - 1990)

Chapter 5

2

Burrhus Frederic Skinner

1. Born Mar. 20, 1904, in Susquehanna, Pennsylvania.

2. Received his PhD from Harvard (1931).

3. Wanted to become a writer but was disappointed to learn that he had nothing to write about; instead he became a great psychologist.

3


4. Wrote The Behavior of Organisms (1938) and Walden Two (1948), the latter titled after Thoreau’s Walden.

5. Taught at the University of Minnesota (1936-45).

6. Chair at Indiana University (1945-48).

7. Came back to Harvard (1948-90).


4


8. Beyond Freedom and Dignity (1971).

9. About Behaviorism (1974).

10. Upon Further Reflection (1987).

11. Continued to publish to the end of his life, e.g., Recent Issues in the Analysis of Behavior (1989).


5


12. Great contributions to learning and education.

13. Contributions to child development.

14. Worked on Project ORCON (ORganic CONtrol), a wartime project to guide missiles with trained pigeons.

15. Died in 1990.

[Figure: Project ORCON]

6

Comparison

Operant Conditioning | Respondent Conditioning
Skinnerian or operant conditioning | Classical, Pavlovian, or respondent conditioning
Type R conditioning: reinforcing stimulus is contingent upon a response | Type S conditioning: reinforcing stimulus is contingent upon a stimulus
S → R → S (Food) | S → S (Food) → R

7

Comparison Continued

Operant Conditioning | Respondent Conditioning
Responses are emitted to a known reinforcer. | Responses are elicited to a known stimulus.
Conditioning strength = Rate of response | Conditioning strength = Response magnitude

8

Theoretical Differences

Functionalists | Associationists
Edward Thorndike, Burrhus Skinner | Ivan Pavlov, Edwin Guthrie
Concentrated on responses as they brought about consequences. | Concentrated on stimuli as they brought about responses.
S → R → S | S → S → R

9

Radical Behaviorism

1. Behavior cannot be explained on the basis of drive, motivation and purpose. All of these take psychology back to its mentalistic nature.

2. Behavior has to be explained on the basis of consequences (reinforcements, punishments) and environmental factors. This, Skinner proposed, was the backbone of all scientific psychology.

10

Principles of Operant Learning

1. We need to know what is reinforcing for the organism. How can we find a reinforcer? It is a process of selection, and the right reinforcer can be difficult to determine. Reinforcers related to bodily conditions, like food and water, are easy to determine.

2. Knowing the reinforcer allows us to predict the response.

3. Reinforcement increases rate of responding.

11

Operant Chambers

Skinner devised operant chambers for rats and pigeons to study behavior in a controlled environment. Operant chambers provide opportunities to control reinforcements and other stimuli.

12

Magazine Training

1. At the beginning of this training the rat is deprived of food for 23 hours (a deprivation procedure) and placed in the operant chamber.

2. The experimenter presses a hand-held switch, which makes a clicking sound (secondary reinforcer), and a food pellet (primary reinforcer) drops into the food magazine.

3. The rat learns to associate the clicking sound with the food pellet.

13

Magazine Training

4. To train the rat to come to the food magazine and eat, the experimenter presses the switch when the rat is near the food magazine. After a few trials the rat associates the clicking sound with the arrival of food and stays close to the magazine to eat.

[Figure: operant chamber showing the lever, food magazine, and food pellet.]

14

Shaping

1. To train the rat to press the lever and get food, the experimenter shapes the rat’s behavior. Shaping involves reinforcing the rat (with the secondary reinforcer) for behaviors that approximate the target behavior, i.e., coming closer and closer to the lever and finally pressing it. This procedure is called successive approximation.

2. To shape lever-pressing behavior, differential reinforcement can also be used. In this procedure only lever-pressing behaviors are reinforced, not others.

15

Cumulative Recording

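[Figure: cumulative record. x-axis: Time (paper movement); y-axis: Cumulative Responses. The trace starts at the operant level and steps up with each response (one response, second response).]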

16

Responding Rate

[Figure: cumulative records. x-axis: Time; y-axis: Cumulative Responses. A slow rate of responding gives a shallow trace; a rapid rate of responding gives a steep trace.]
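The link between response rate and the slope of a cumulative record can be illustrated with a short sketch. The function names and the response times below are hypothetical, chosen only for illustration; they are not data from Skinner's experiments.

```python
from bisect import bisect_right

def cumulative_record(response_times, sample_times):
    """Cumulative number of responses made by each sample time."""
    ordered = sorted(response_times)
    return [bisect_right(ordered, t) for t in sample_times]

def response_rate(response_times, start, end):
    """Responses per second in (start, end]: the slope of the record."""
    ordered = sorted(response_times)
    count = bisect_right(ordered, end) - bisect_right(ordered, start)
    return count / (end - start)

# Hypothetical session: slow responding for 10 s, then rapid responding.
times = [2, 6, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20]
record = cumulative_record(times, [0, 10, 20])
slow = response_rate(times, 0, 10)   # shallow part of the trace
fast = response_rate(times, 10, 20)  # steep part of the trace
```

The record never decreases; only its slope changes, which is why a steeper trace indicates faster responding.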

17

Cumulative Responses: Sniffy

[Figure: Sniffy cumulative record. y-axis: Cumulative Responses; three segments of the trace are each labeled 75 responses.]

18

Extinction

Remove the reinforcement (food) and the lever-pressing behavior is extinguished.

S (Lever) → R (Lever-pressing response) → S (Food)

19

Extinction

[Figure: cumulative record. x-axis: Time; y-axis: Cumulative Responses. With no food, responding falls to the operant level (extinction).]

[Figure: Behavior (Cumulative Responses, 0-60) plotted against Trials (5-30), showing extinction and rest followed by spontaneous recovery.]

20

Spontaneous Recovery

Just as we have spontaneous recovery in classical conditioning, a rest period after extinction reinstates the lever-pressing response in the animal.

21

Discrimination Learning

The organism can be conditioned to discriminate between two or more stimuli. A discriminative operant is a response that is emitted specifically to one stimulus (SD) but not to the other (SΔ).

Discriminative Stimulus | Response | Reinforcement
Light ‘ON’ (SD) | Press lever | Food
Light ‘OFF’ (SΔ) | Lever not pressed | No Food

22

Secondary Reinforcement

“Any neutral stimulus paired with a primary reinforcer (e.g., food or water) takes on reinforcing properties of its own” (Hergenhahn and Olson, 2001) and is called a secondary reinforcer. Thus, all discriminative stimuli are secondary reinforcers.

23

Generalized Reinforcers

1. A secondary reinforcer can become a generalized reinforcer when paired with a number of primary reinforcers. Money then is a generalized reinforcer, for it is associated with primary reinforcers like food, drink and mates.

2. Secondary reinforcement is similar to Allport’s (1961) idea of functional autonomy. At first an activity is performed for reinforcement, but then the activity itself becomes reinforcing, e.g., a person joins the merchant navy for money but comes to enjoy sailing for its own sake.

24

Chaining

A discriminative stimulus (SD) initiates a response (R). The response produces a stimulus that acts both as a secondary reinforcer (SR) for that response and as a discriminative stimulus (SD) for the next response, and so on until the final response (R) is followed by primary reinforcement.

SD → R → SD/SR → R → SD/SR → R → SR
Many stimuli → Orients → Sight of lever → Approaches → Contact lever → Presses bar → Food pellet

Similar to Guthrie’s movement-produced stimuli.

25

Reinforcement & Punishment

If a response is followed by a reinforcer, the response increases. However, if it is followed by a punisher, the response decreases.

26

Reinforcement

Reinforcer | Contingency | Example | Behavior
Primary | Positive | Doing work, getting food | Work increases
Secondary | Positive | Studying books, getting good grades | Studying increases
Primary | Negative | Heater proximity avoids cold | Heater proximity increases
Secondary | Negative | Waking early, avoiding traffic | Waking early increases

27

Punishment

Punisher | Contingency | Example | Behavior
Primary | Positive | Work with electricity, get shock | Work with electricity decreases
Secondary | Positive | Insult boss, get reprimanded | Insulting boss decreases
Primary | Negative | Quarrelsome behavior, lose food | Quarrelsome behavior decreases
Secondary | Negative | Coming home late, no going out | Coming late decreases

28

Consequences & Contingencies

Consequence | Positive contingency | Negative contingency
Reinforcement | Behavior increases | Behavior increases
Punishment | Behavior decreases | Behavior decreases

Like Thorndike, Skinner believed that positive reinforcement strengthened behavior but that punishment did not weaken behavior in the long run.

29

Estes’s Punishment Experiment

[Figure: Cumulative Responses (0-500) across Extinction Sessions 1-3 for two conditions: no reinforcement + punishment, and no reinforcement.]

30

Punishment

1. Produces unwanted emotional byproducts (generalized fears).

2. Conveys no information to the organism.

3. Justifies inflicting pain on others.

4. Unwanted behaviors reappear when punishment is absent.

5. Provokes aggression towards the punishing agent.

6. One unwanted behavior appears in place of another.

31

Punishment

Why punishment?

It reinforces the punisher!

Alternatives to Punishment

1. Do not reinforce the unwanted behavior.

2. Let the individual engage in the undesirable behavior until he or she is sick of it.

3. Wait for the unwanted behavior to dissolve over development.

32

Schedule of Reinforcement

A. When a response is always followed by reinforcement, the schedule is called continuous reinforcement. A response learned this way is easy to extinguish.

B. When the occurrence of reinforcement is probabilistic, the schedule is termed partial reinforcement, and the response is difficult to extinguish. During partial reinforcement superstitious behaviors arise: an animal behaves peculiarly to get reinforcement when it is not being received.

33

Ratio Schedules

1. Reinforcement that occurs after every nth response is called a fixed ratio schedule. For example, when the rat must press the bar 5 times to get food, it is on an FR5 schedule.

2. Reinforcement that occurs after an average of n responses is known as a variable ratio schedule. Sometimes the reinforcement is delivered after 3 bar presses and at other times after 8 bar presses; however, the average number of bar presses equals 5. Abbreviated as VR5.
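The two ratio schedules can be sketched as small counters. This is a minimal illustration under stated assumptions: the function names are hypothetical, and the variable ratio schedule cycles through a fixed list of requirements (3, 8, 4, averaging 5) so that the example is repeatable, whereas a laboratory schedule would normally vary the requirement unpredictably.

```python
from itertools import cycle

def fixed_ratio(n):
    """FRn: every nth response is reinforced."""
    count = 0
    def respond():
        nonlocal count
        count += 1
        if count == n:
            count = 0
            return True   # reinforcer delivered
        return False
    return respond

def variable_ratio(requirements):
    """VR: the required number of responses changes after each
    reinforcer; the mean of `requirements` is the schedule value."""
    upcoming = cycle(requirements)
    required = next(upcoming)
    count = 0
    def respond():
        nonlocal count, required
        count += 1
        if count == required:
            count = 0
            required = next(upcoming)
            return True
        return False
    return respond

fr5 = fixed_ratio(5)
fr_hits = [i + 1 for i in range(10) if fr5()]   # presses that earn food
vr5 = variable_ratio([3, 8, 4])                 # mean requirement is 5
vr_hits = [i + 1 for i in range(15) if vr5()]
```

On FR5 the reinforced presses are evenly spaced, while on VR5 they fall at irregular positions even though the long-run payoff per press is the same.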


34

Interval Schedules

3. When reinforcement is delivered for the first response after a specified interval of time, the schedule is called a fixed interval schedule. For example, the animal gets food for the first response made after 5 seconds. Abbreviated as FI5.

4. When reinforcement is delivered after an interval of time that varies around an average, the schedule is called a variable interval schedule. Sometimes the rat gets the food pellet after 3 seconds and sometimes after 8 seconds; however, the average time interval equals 5 seconds (VI5).
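Interval schedules can be sketched the same way, with time rather than response count deciding when food becomes available. The sketch assumes the usual laboratory convention that the first response after the interval has elapsed is the one reinforced; the function name, the press times, and the fixed cycle of intervals (3, 8, 4 seconds) are hypothetical choices made to keep the example repeatable.

```python
from itertools import cycle

def interval_schedule(intervals):
    """Reinforce the first response made at least `required` seconds
    after the previous reinforcer. One interval gives FI; several
    intervals, used in turn, approximate VI with their mean as value."""
    upcoming = cycle(intervals)
    required = next(upcoming)
    last_reinforced = 0.0
    def respond(t):
        nonlocal required, last_reinforced
        if t - last_reinforced >= required:
            last_reinforced = t
            required = next(upcoming)
            return True   # reinforcer delivered
        return False
    return respond

presses = [1, 3, 5, 6, 9, 10, 12, 16]      # hypothetical press times (s)
fi5 = interval_schedule([5])               # FI5
vi5 = interval_schedule([3, 8, 4])         # VI5: intervals average 5 s
fi_reinforced = [t for t in presses if fi5(t)]
vi_reinforced = [t for t in presses if vi5(t)]
```

Responses made before the interval has elapsed earn nothing, so responding faster does not bring food sooner on an interval schedule.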

35

Schedules of Reinforcement

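Domain | Fixed sequence | Variable sequence
Ratio | Fixed ratio (FR) | Variable ratio (VR)
Interval | Fixed interval (FI) | Variable interval (VI)

Different learning curves emerge with different reinforcement schedules. The curves for ratio schedules are steeper than those for interval schedules.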

36

Concurrent Schedules

5. Concurrent schedules provide two simultaneous schedules of reinforcement (e.g., VI5 and VI10). Organisms (pigeons) distribute their responses according to these schedules (Skinner, 1950).

[Figure: Behavior (Cumulative Responses, 0-5) over Time (1-30 minutes) for the VI 5 and VI 10 schedules.]

37

Herrnstein Matching Law

Herrnstein (1970; 1974) showed with a mathematical equation that relative reinforcement equals relative response (behavior).

$$\frac{B_1}{B_1 + B_2} = \frac{R_1}{R_1 + R_2}$$

Here $B_1$ and $B_2$ are the responses (behavior) on the two alternatives (red key and green key), and $R_1$ and $R_2$ are the reinforcements obtained on them.

[Figure: Relative Behavior on the Red Key (0.0-1.0) plotted against Relative Reinforcement on the Red Key (0.0-1.0).]
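As a worked illustration of the matching equation, the sketch below splits a total number of responses between two keys in proportion to the reinforcement each key delivers. The function name and the numbers (1000 responses, 30 and 10 reinforcers) are hypothetical and are not taken from Herrnstein's data.

```python
def matching_prediction(total_responses, r1, r2):
    """Predict responses on each key from the matching law:
    B1 / (B1 + B2) = R1 / (R1 + R2)."""
    share = r1 / (r1 + r2)          # relative reinforcement on key 1
    b1 = total_responses * share    # responses predicted on key 1
    return b1, total_responses - b1

# Hypothetical session: red key pays 30 reinforcers, green key pays 10.
b_red, b_green = matching_prediction(1000, 30, 10)
relative_behavior = b_red / (b_red + b_green)
relative_reinforcement = 30 / (30 + 10)
```

With these numbers the red key receives three quarters of the reinforcers, so the law predicts three quarters of the responses on the red key.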

38

Simple Choice behavior

[Figure: two choices. Study → Delayed Reward → Gratification with a good grade. Going to the movies → Immediate Reward → Gratification by seeing a movie.]

Gratification from rewards can be immediate or delayed. Our simple choice behaviors are dictated by these reinforcements accordingly.

39

Concurrent Chain Schedule

[Figure: Light → Delay → Reinforcement. Delays: 2 sec and 6 seconds (4 seconds difference). Reinforcers: 2 sec of grain and 6 sec of grain.]

6a. Concurrent chain schedules produce complex choice behaviors. Under one condition, pigeons preferred the small, sooner reinforcer (Rachlin & Green, 1972).

40

Concurrent Chain Schedule

[Figure: Light → Delay → Reinforcement. Delays: 20 seconds and 24 seconds (4 seconds difference). Reinforcers: 2 sec of grain and 6 sec of grain.]

6b. In the other condition, pigeons preferred the large, delayed reinforcer (Rachlin & Green, 1972).

41

Complex Choice Behavior

Thus organisms (human and animal) behave differently to different rewards. Selection of rewards in a complex choice situation is based on a combination of reward magnitude (how large or small they are) and reward delay (the length of time needed to reach them).

42

Progressive Ratio Schedule

7a. A progressive ratio schedule provides a tool to measure the efficacy of a reinforcer. To determine whether one reinforcer is more effective than another, the progressive ratio schedule requires the organism to indicate in behavioral terms the maximum it will “pay” for a particular reinforcer.

43


7b. The organism is trained on a fixed ratio schedule, say FR2, and receives, say, 5 pellets of food. The schedule is increased to FR4, so the animal now makes 4 responses before it gets 5 pellets of food. The schedule is then increased to FR8, and so on. Eventually a schedule is reached (e.g., FR64) at which the animal is no longer willing to respond for the reinforcement.

44


7c. We can compare two reinforcers (food and water) and determine the schedule at which the animal's responding breaks down for each, thus comparing their efficacy. Responding for food breaks down before responding for water.

[Figure: Mean Log Reinforcement Rate (0-16) plotted against Log FR Schedule (1-512) for Reinforcement A (Food) and Reinforcement B (Water).]
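The progressive ratio procedure can be sketched as a loop that doubles the fixed ratio requirement until the animal stops responding. This is a simplified sketch: the function name is hypothetical, and `max_responses` is an assumed stand-in for the most responses the animal will emit for one delivery of a reinforcer, with illustrative values rather than measured ones.

```python
def progressive_ratio(max_responses, start=2, factor=2):
    """Raise the FR requirement step by step (FR2, FR4, FR8, ...).

    Returns (last ratio completed, first ratio refused). The animal is
    assumed to complete a ratio only if it is within `max_responses`.
    """
    ratio = start
    last_completed = None
    while ratio <= max_responses:
        last_completed = ratio
        ratio *= factor
    return last_completed, ratio

# Hypothetical tolerances for two reinforcers.
food = progressive_ratio(max_responses=40)
water = progressive_ratio(max_responses=100)
water_more_effective = water[1] > food[1]
```

The reinforcer whose responding breaks down at the higher ratio is the one the animal will "pay" more for, which is how the schedule ranks reinforcers.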

45

Verbal Behavior

Language (verbal behavior) is a behavior like any other and largely consists of speaking, listening, writing, and reading. These behaviors are governed by antecedent conditions (stimuli) and consequences (reinforcements).

46

Types of Verbal Behavior

1. Mand (from demand or command): A listening or talking behavior. The individual (child) behaves appropriately to the command given by another (adult) and is reinforced. The child may also request (demand) something to relieve a need.

The adult says, “Look (mand), I have a toy for you.” The child looks (behavior) and is reinforced with the toy (reinforcement).

47


2. Echoic Behavior: A talking behavior. A word or a sentence is repeated verbatim, either aloud or silently as in reading. The adult says “cookies” (stimulus), the child echoes the word (behavior) and gets a smile (reinforcement).

[Figure: “Cookies” echoed audibly and silently.]

48


3. Tact: A talking behavior. A verbal behavior in which an individual correctly names or identifies (tacts) objects (stimuli) and other individuals reinforce the correct match.

[Figure: “Flowers” (tact) followed by “Good” (reinforcement).]

49


4. Autoclitic Behavior: A talking behavior. This behavior (autoclitic) occurs when a question (stimulus) is posed. The answer to the question is followed by reinforcement (praise). Also called intraverbal behavior.

Question: “Which mammal lives in the sea?” Answer: “A whale!”

50

ABC of Verbal Behavior

Type | Antecedent (A) | Behavior (B) | Consequence (C)
Mand | State of deprivation or aversive stimulation | Verbal utterance | Reinforcer that reduces state of deprivation
Echoic | Verbal utterance from another individual | Repetition of what the speaker says | Conditioned reinforcement (praise) from the other person
Tact | Stimulus (usually object) in the environment | Verbal utterance naming or referring to the object | Conditioned reinforcement from the other person
Autoclitic | Verbal utterance (often a question) from another person | Verbal response (answer to a question) | Verbal feedback or reinforcement

Based on Skinner (1957).

51

Programmed Learning

Skinner was interested in applying the theory of learning to education and therefore introduced teaching machines: electromechanical devices that promoted teaching and learning.

52

Programmed Learning

1. Teaching machines provide sustained activity.

2. They ensure a point is understood before moving on (small steps).

3. They present the learner with material he or she is ready for.

4. They help the learner find the right answer.

5. They provide immediate feedback.
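The small-steps and immediate-feedback features listed above can be sketched as a loop over frames that advances only after a correct answer. This is a modern software analogy, not a description of how Skinner's electromechanical machines were built; the function name, the frames, and the scripted learner are all hypothetical.

```python
def run_program(frames, answer_for, max_attempts=5):
    """Present frames in order, moving on only after a correct answer.

    frames: list of (prompt, correct_answer) pairs, ordered in small steps.
    answer_for: callable(prompt, attempt_number) giving the learner's answer.
    Returns a list of (prompt, attempts_needed); attempts_needed is None
    if the learner never answers correctly, and the program stops there.
    """
    log = []
    for prompt, correct in frames:
        for attempt in range(1, max_attempts + 1):
            if answer_for(prompt, attempt) == correct:  # immediate feedback
                log.append((prompt, attempt))
                break
        else:
            log.append((prompt, None))
            return log
    return log

# Hypothetical two-frame program and a learner who errs once on frame two.
frames = [("2 + 2 = ?", "4"), ("4 + 4 = ?", "8")]
scripted = {"2 + 2 = ?": ["4"], "4 + 4 = ?": ["6", "8"]}
log = run_program(frames, lambda prompt, n: scripted[prompt][n - 1])
```

The learner is not shown the second frame until the first is answered correctly, which mirrors the requirement that a point be understood before moving on.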

53

Learning Theory & Behavior Technology

1. Skinner did not believe in formulating a theory of learning, the way Hull did.

2. Behavior should be explained in terms of stimuli, not physiology.

3. Functional analysis of stimuli and behaviors, not the “why” of behaviors, should be the goal of psychology.

4. We need behavior technology to resolve human problems, but our culture, government, and religion erode reinforcements for problem-free behaviors.

54

David Premack

1. Born October 26, 1925, in Aberdeen, South Dakota.

2. Started working at the Yerkes Primate Biology Laboratory (1954).

3. Wrote Intelligence in Ape and Man (1976), The Mind of an Ape (1983), and Original Intelligence: The Architecture of the Human Mind (2002).

1925-Present

55

David Premack

4. Emeritus professor of psychology at the University of Pennsylvania.

5. William James Fellow Award (2005).


56

Premack Principle

Responses (behaviors) that occur at a high frequency can be used as reinforcers for responses that occur at a low frequency. In other words, high-probability behavior (HPB) can be used to reinforce low-probability behavior (LPB).

In order to increase grooming behavior (LPB), eating behavior (HPB) was used as a reinforcer. Each time the animal groomed, it was given the opportunity to eat. Its grooming behavior increased.

[Figure: proportion of behavior in the animal. Before: Eating (HPB) and Grooming (LPB). After: Grooming (HPB).]
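The Premack principle amounts to a simple selection rule: any activity with a higher baseline probability than the target behavior is a candidate reinforcer for it. The sketch below shows that rule; the function name and the baseline proportions are hypothetical values chosen for illustration.

```python
def premack_reinforcers(baseline, target):
    """Activities more probable than `target` at baseline, most probable
    first. By the Premack principle each can reinforce the target."""
    threshold = baseline[target]
    candidates = [a for a in baseline if baseline[a] > threshold]
    return sorted(candidates, key=lambda a: baseline[a], reverse=True)

# Hypothetical baseline proportions of time spent on each activity.
baseline = {"eating": 0.6, "wheel running": 0.3, "grooming": 0.1}
for_grooming = premack_reinforcers(baseline, "grooming")
for_eating = premack_reinforcers(baseline, "eating")
```

Because the rule is relative, the most probable activity has no candidate reinforcers in this list, while the least probable one can be reinforced by every other activity.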

57

Relativity of Reinforcement

To test his theory in humans, Premack took 31 first graders and gave them a gumball machine and a pinball machine to play with. Based on their activity he was able to classify them into eaters and manipulators.

[Figure: Phase I. Gumball machine and pinball machine.]

58

Relativity of Reinforcement

Phase II

If the child was an eater, he or she was allowed to eat only after playing the pinball machine. Playing behavior increased!

If the child was a manipulator, he or she was allowed to play only after eating from the gumball machine. Eating behavior increased!

59

Transituational Nature of Reinforcement

A high-probability behavior like eating becomes a low-probability behavior once the animal has eaten. Not only does the probability of the behavior change, but the very nature of the reinforcement changes with time.

[Figure: nature of reinforcement (food) over time, from rewarding to neutral to punishing (Kimble, 1993).]

60

Disequilibrium Hypothesis

Timberlake (1980) suggests that any activity can become a reinforcer if access to the activity is blocked in some way. If drinking is blocked, a state of disequilibrium is produced in the animal, and drinking can now be used as a reinforcer.

[Figure: percentages of eating, drinking, and activity-wheel behavior (values shown: 30%, 10%, 10%, 20%), illustrating a state of disequilibrium.]

61

Marian Breland Bailey

1. Born Dec. 2, 1920, in Minneapolis, Minnesota.

2. Became the second PhD student under Skinner, then moved to Hot Springs and relocated Animal Behavior Enterprises (ABE).

3. Studied functional analysis of behavior and taught at Henderson State University.

4. Died Sep. 25, 2001.

1920-2001

62

Instinctive Drift

When instinctive behavior comes into conflict with conditioned operant behavior, animals show a tendency to drift in the direction of instinctive behavior.

Marian Breland and Keller Breland trained raccoons to put wooden coins in a box (for a savings bank commercial), but the raccoons had trouble putting the coins in the box, especially when there were two coins to deposit. The Brelands argued that the raccoons’ instinctive behavior of washing (rubbing) food before eating came into conflict with the learned behavior.

63

Questions

17. Would you use the same reinforcers to manipulate the behavior of both children and adults? If not, what would make the difference?

18. What is the partial reinforcement effect? Briefly describe the ratio and interval reinforcement schedules studied by Skinner.

19. Explain the difference between Premack’s and Timberlake’s views of reinforcers.