Reinforcement & Punishment: What is an SR? Lesson 11

What is an SR?

Thorndike’s Law of Effect: satisfiers & annoyers

Skinner: determined by how behavior (B) changes. Reinforcer: B increases. Punisher: B decreases.

Primary reinforcers & punishers: biologically important stimuli ~

What is an SR? (continued)

Secondary reinforcers & punishers: e.g., money, praise

How do they become an SR? Classical conditioning; higher-order learning ~

Drive Reduction View (50s & 60s)

Similar to the Law of Readiness: a relative state of deprivation is required for a basic drive

Thought to always be true

Drive → motivation → B → reduction of drive state (SR) ~

Sensory reinforcement

Sensory stimulus unrelated to a biological drive

Monkeys learn a response: the reward is watching a toy train

Rats learn to bar press: the reward is turning a light on or off ~

Premack Principle

Commonly used in educational settings, where it is impractical or unethical to use food

Thought of reinforcers as responses: press bar → eating response; wider application of I/O conditioning

Differential probability principle: high-probability responses reinforce low-probability responses ~
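
The differential probability principle can be illustrated with a minimal Python sketch. The function name `can_reinforce` and the baseline proportions are hypothetical, chosen only for illustration; they are not data from any study.

```python
def can_reinforce(baseline, reinforcer, target):
    """Return True if `reinforcer` is a higher-probability response than `target`.

    `baseline` maps each response to its (hypothetical) share of free time.
    """
    return baseline[reinforcer] > baseline[target]


# Hypothetical baseline proportions of free time, for illustration only.
baseline = {"eating": 0.70, "bar_pressing": 0.05}

print(can_reinforce(baseline, "eating", "bar_pressing"))  # True
print(can_reinforce(baseline, "bar_pressing", "eating"))  # False
```

On this view, the eating response could reinforce bar pressing, but not the other way around.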

Premack Principle

Homme et al. (1963): unruly 3-year-olds

High-probability behaviors: ignored teacher, screaming, pushing furniture

Low-probability behavior: sitting quietly ~

Premack Principle: Homme et al

Rewarded sitting quietly with... 3 min of running around screaming

Results: sitting quietly increased

Particular behaviors observed differed across kids: different responses were effective reinforcers for different kids ~

Premack Principle

Charlop, Kurtz, & Casey (1990): autistic children

High-probability behaviors: echolalia, perseveration

Low-probability behaviors: adding up coins; judging objects as same or different ~

Premack Principle: Charlop et al

[Figure: % correct responses (axis ticks 40 to 100) by # of sessions, comparing food RFT with echolalia RFT]

Premack Principle: Problems

Fluctuation of response probabilities: e.g., sometimes a kid would rather play outside than play video games

Solution: token economies

Reinforcer value is not absolute: individuals differ; can change with context ~

Behavioral Regulation Approach

Response deprivation: limit access to a response; does not require high vs. low probability

Behavioral homeostasis: preferred distribution of activities; operant conditioning imposes limits

Behavioral bliss point: e.g., time spent studying vs. video games ~

Behavioral Regulation Approach

A behavior is limited below its bliss point: a disturbance of behavioral homeostasis, analogous to an increased biological drive

Contingency set during I/O procedure: establishes a relationship between responses → B moves toward bliss point (baseline) ~

Behavioral Regulation Approach

Low-probability behaviors as reinforcers: observe baseline rate of behavior; limit activity below baseline

Require a response to engage in the deprived behavior (contingency)

Increase toward bliss point: costs vs. benefits determine how much ~
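
The response deprivation idea above can be sketched in Python as follows. This is a simplified illustration under stated assumptions: the function name `is_deprived`, the activity names, and all minutes-per-day figures are hypothetical. The sketch only checks whether an activity is held below its baseline (bliss point) level; it does not model how costs and benefits determine the final amount of responding.

```python
def is_deprived(bliss_point, allowed, activity):
    """Return True if `activity` is restricted below its baseline (bliss point) level."""
    return allowed[activity] < bliss_point[activity]


# Hypothetical free-choice baseline (minutes per day), for illustration only.
bliss_point = {"video_games": 120, "studying": 30, "reading": 10}

# Hypothetical limits imposed before the contingency is introduced.
allowed = {"video_games": 120, "studying": 30, "reading": 0}

# Reading is a low-probability behavior, yet it is the only one held below
# baseline, so on this view access to it could serve as a reinforcer.
for activity in bliss_point:
    print(activity, is_deprived(bliss_point, allowed, activity))
```

Unlike the differential probability principle, this check never compares the baseline rates of two activities; only the restriction below baseline matters.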

What Becomes Connected?

Skinner? Refused to consider associations

Thorndike: S-R view (SD-B); association between stimulus context and response, NOT the outcome (SR); no representation of reinforcer ~

S-R-O (SD-B-SR) view: Tinklepaugh (1928)

Goal-oriented responding: respond with the idea of getting a reward

The monkey and the hidden banana: 2 cups, banana put under 1; task: choose the cup with the banana

Secretly substituted rotten lettuce: monkey became agitated; it expected the banana reward (outcome) ~

S-R vs. S-R-O

Adams & Dickinson (1981): taste aversion paradigm

Associate sucrose (sweetener) with lithium chloride (LiCl) illness

Will rats press a bar to get something that makes them sick? ~

S-R vs. S-R-O

Phase 1: Trained rats to bar press for sucrose

Phase 2: Associate sucrose with illness

Phase 3: Will rats press the bar now? No sucrose delivered ~

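S-R vs. S-R-O: Results

Predictions? If S-R-O: rats should not press the bar (the expected outcome is now associated with illness). If S-R: rats should still press the bar (the outcome is not part of the association).

Results: rats did not press the bar. Supports S-R-O ~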

S-R vs. S-R-O

Use different levels of training

Phase 1: same procedure, but some rats get 100 RFTs and some get 500 RFTs ~

Results & Conclusions

Less training → low response rate: with little training the outcome is important (S-R-O)

Extensive training → high response rate: the outcome is less important and the response is well established (S-R) ~

Parallel learning in humans

Learning a skill, e.g., to drive a car

Early trials: consider consequences; must think about what you are doing

After extensive experience: becomes automatic after many trials ~

Extrinsic Reward vs Intrinsic Motivation

Early trials: expectation of reinforcer; extrinsic reward; CER = positive affect

Well-established behavior: no expectation of reward; intrinsic motivation; CER = positive affect ~