Aversive Control Negative Reinforcement Avoidance Learning Escape Learning.


Aversive Control

• Negative Reinforcement

• Avoidance Learning

• Escape Learning

Negative Reinforcement

S: e.g., operant chamber

R: e.g., bar press

Removes/Prevents

SAversive: e.g., shock

Strengthens

Note: if R removes SAversive = Escape

Negative Contingency

p(SAversive | R) < p(SAversive | no R)

if R prevents SAversive = Avoidance
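The contingency comparison above can be illustrated with a small sketch (the data and function names here are hypothetical, invented for illustration, not from the lecture): estimate p(SAversive | R) and p(SAversive | no R) from a log of trials and compare them.

```python
def contingency(trials):
    """Return p(S_aversive | R) - p(S_aversive | no R).

    trials: list of (responded, shocked) boolean pairs.
    A negative value indicates a negative contingency (escape/avoidance);
    a positive value would indicate a positive contingency (punishment).
    """
    shocked_r = sum(1 for r, s in trials if r and s)
    total_r = sum(1 for r, _ in trials if r)
    shocked_nor = sum(1 for r, s in trials if not r and s)
    total_nor = sum(1 for r, _ in trials if not r)
    p_given_r = shocked_r / total_r if total_r else 0.0
    p_given_nor = shocked_nor / total_nor if total_nor else 0.0
    return p_given_r - p_given_nor

# Avoidance-like (invented) data: responding almost always prevents shock.
trials = ([(True, False)] * 18 + [(True, True)] * 2
          + [(False, True)] * 9 + [(False, False)] * 1)
print(contingency(trials))  # negative, since p(S | R) = 0.1 < p(S | no R) = 0.9
```

A value below zero corresponds to the negative contingency that defines escape/avoidance; the same comparison with the inequality reversed describes punishment.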

Discriminated or Signalled Avoidance

A warning stimulus signals a forthcoming SAversive

If the required response is made during the warning stimulus, before the SAversive occurs, the subject avoids the shock.

If a response is not made during the warning stimulus, the SAversive occurs and terminates when the required response is made (escape).

e.g., one-way avoidance

two-way avoidance
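The signalled-avoidance procedure above can be sketched as a toy simulation (the response probabilities and names are invented for illustration, not taken from the lecture): a trial counts as avoidance if the response occurs during the warning stimulus, and otherwise as escape.

```python
import random

def signalled_avoidance_trial(p_respond_during_warning, rng):
    """Classify one trial: 'avoidance' if the response occurs during the
    warning stimulus (shock prevented), otherwise 'escape' (shock occurs
    and is terminated when the response is finally made)."""
    if rng.random() < p_respond_during_warning:
        return "avoidance"
    return "escape"

rng = random.Random(42)
# Early in training the response is rare (mostly escape trials);
# late in training it is likely (mostly avoidance trials).
early = [signalled_avoidance_trial(0.1, rng) for _ in range(100)]
late = [signalled_avoidance_trial(0.9, rng) for _ in range(100)]
print(early.count("avoidance"), late.count("avoidance"))
```

The shift from mostly escape trials to mostly avoidance trials mirrors the usual course of discriminated-avoidance training.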

The Two-Process Theory of Avoidance

Explains avoidance learning in terms of two necessary processes:

1. The subject learns to associate the warning stimulus with the SAversive

2. The subject's response can then be negatively reinforced by terminating the warning stimulus and the fear it elicits

Thus the two-process theory reduces avoidance learning to escape learning; the organism learns to escape from the CS and the fear that it elicits.

Acquired-drive experiments support the Two-Process Theory of Avoidance since animals do learn to escape from the CS

Problems for the Two-Process Theory of Avoidance

1. Level of fear is not always positively correlated with avoidance.

2. Avoidance behavior should cycle at asymptote, but it typically does not.

3. Avoidance behavior should not be learned if the response does not terminate the CS, but it is.

4. Animals can learn Free-Operant (or Sidman) avoidance, in which there is no explicit warning stimulus.

Alternative Theoretical Accounts of Avoidance Behavior

1. Positive reinforcement through Conditioned Inhibition of fear

Conditioned Safety Signals:

• performance of the avoidance response results in distinctive feedback stimuli (i.e., spatial cues, tactile cues, etc.)

• the avoidance response produces a period of safety, allowing the feedback cues to become signals for the absence of shock (i.e., safety signals)

• since a shock-free period is desirable, a conditioned inhibitory stimulus for shock could serve as a positive reinforcer

Alternative Theoretical Accounts of Avoidance Behavior

2. Reinforcement of avoidance through reduction of shock frequency

• with Two-Process theory, reduction in shock frequency was a by-product of avoidance responses

• reduction in shock frequency is itself important: rats will press a lever to reduce the frequency of shocks from 6/min to 3/min

Alternative Theoretical Accounts of Avoidance Behavior

3. Avoidance and Species-Specific Defense Reactions (SSDRs)

• more concerned with the actual response and what determines the animal's response early in training

• aversive stimuli elicit strong innate responses (i.e., freezing, flight, fighting, seeking out dark areas)

• species-typical responses are readily learned as avoidance responses (e.g., jump out of a box in two trials versus press a lever to avoid shock in 1000s of trials)

• punishment is responsible for the selection of the avoidance response

Punishment

S: e.g., operant chamber

R: e.g., bar press

Produces

SAversive: e.g., shock

Weakens

Positive Contingency:

p(SAversive | R) > p(SAversive | no R)

Usually a response that must be punished is maintained by a positive reinforcer; thus, experimentally, the SAversive is usually made contingent on a response that has been, or is being, positively reinforced.
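As a hypothetical numeric check (the counts below are invented for illustration), the punishment contingency is simply the mirror image of the avoidance case: the aversive stimulus is more likely after the response than in its absence.

```python
# Invented illustrative counts: shocks observed after responses vs. without them.
shocks_after_R, response_count = 8, 10
shocks_without_R, no_response_count = 1, 10

p_shock_given_R = shocks_after_R / response_count          # 0.8
p_shock_given_noR = shocks_without_R / no_response_count   # 0.1

# Positive contingency: p(SAversive | R) > p(SAversive | no R) -> punishment.
print(p_shock_given_R > p_shock_given_noR)
```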

Both Skinner and Thorndike claimed punishment was not very effective in suppressing behavior


Skinner’s Experiment on Punishment

Phase 1:

Rats were reinforced with food on a VI schedule

Phase 2:

Extinction for 120 minutes on two successive days

During only the first 10 min of extinction on day 1, one group of rats was punished for each bar press (paw slap); the other (control) group was not punished.

Results of Skinner’s Experiment on Punishment

Punishment suppressed responding while it was being administered, but when punishment stopped, the punished rats ended up making as many responses overall in extinction as the unpunished controls.

Skinner concluded that punishment was not an effective way to control behavior.

Consideration of the administration of the punishing stimulus: punishment is effective if it:

is intense/prolonged from the start

is response contingent rather than response independent (fig 10.13)

occurs immediately after response rather than delayed

is on a continuous rather than partial reinforcement schedule (fig 10.14)

Consideration of the response to be punished: punishment is effective if:

the punished response is not being reinforced, or motivation for the reinforcer is reduced

there is an alternative response to the punished one for acquiring the reinforcer (fig 10.15)

the punished response is not a species-specific defence reaction

Consideration of the situation in which punishment is to be administered: punishment is effective if:

subject cannot discriminate when punishment will be administered and when not

punishment does not signal SAppetitive

Problems that may arise with the use of punishment to eliminate behavior:

undesirable CERs to the situation and/or person associated with punishment

general suppression of responding

difficulties in applying punishment systematically so that it is effective (e.g., discriminative cues; punishing every instance of the behavior)

imitation of the aggressive behavior involved in punishment

escape/avoidance or aggressive responses in punishing situation

Practice Exams

Midterm

1. According to the Rescorla-Wagner model, learning will only occur if an animal has experienced a(n) ___.

A. CS
B. US
C. UR
D. CR

2. Suppose a 5-second tone is presented, then a 5-second gap, then food. Now 60 seconds pass, and the tone-gap-food sequence is again presented. In this example, the CS-US interval is ___ seconds, and the intertrial interval is ___ seconds.

A. 10; 60
B. 10; 70
C. 5; 60
D. 5; 70

3. You have discovered a new species of creatures, the zorks. They eat birds and identify those birds that are edible by their call. Zorks are bothered by stinging insects that mark their territory with a sour fluid: the zorks use this taste to avoid the insects' territory. Suppose you perform the "bright, noisy, and tasty" water study, in which light/tone/saccharin CSs are paired with USs of shock or poison. If you assume the nonequivalence of associability, then the zorks who got ___ USs should stop drinking only during a ___ cue.

A. Shock; light/tone
B. Shock; saccharin
C. Poison; saccharin
D. Poison; light

4. You have established a tone as a conditioned inhibitor (a CS-), using a shock US. Which of the following procedures would be most likely to cause the tone to lose its inhibitory power?

A. Present the tone alone for 20 trials
B. Present the tone alone for 200 trials
C. Present 100 trials where the shock is followed by the tone
D. Present 100 trials where shocks and tones are given randomly

5. Suppose animals in Group 1 are exposed to a number of electric shocks, while animals in Group 2 are not. Next, all animals are given tone-shock pairings. What is the typical result?

A. Both groups acquire a CR at the same rate
B. Group 1 acquires a CR more quickly than Group 2
C. Group 1 acquires a CR more slowly than Group 2
D. Group 1 develops an inhibitory CR

6. Which of the following produces the strongest conditioning?

A. Simultaneous conditioning
B. Backward conditioning
C. Trace conditioning
D. Delayed conditioning

7. Which of the following is an example of a CR?

A. Salivating when lemon juice is put on your tongue
B. A pigeon pecking grain
C. Feeling nauseated when seeing moldy food
D. Flinching when a tree limb falls near you

8. Which of the following is an example of an unconditioned response?

A. You run when someone yells, "Fire!"
B. Your mouth waters when you think about chocolate cake.
C. You jump when a balloon pops behind your head.
D. Your dog wags its tail when you open a can of dog food.

9. When did Pavlov present the sound of a metronome and food powder?

A. Whenever the dog was hungry
B. When the dog was quiet and not reacting to other stimuli
C. As soon as the dog salivated
D. Independent of the dog's behavior

10. According to contingency theory, inhibitory conditioning occurs to a CS only when:

A. CSs and USs never occur together.
B. Unsignaled USs are more likely than signaled USs.
C. Signaled USs are more likely than unsignaled USs.
D. The likelihood of getting a signaled US is the same as the likelihood of getting an unsignaled US.

1 (4%). “Extinction entails the elimination of an association.” Evaluate this quote, and provide evidence supporting your conclusions.

2 (3%). “Pavlovian conditioning is merely a way of teaching conditioned reflexes.” Evaluate this quote.

3 (2%). Describe an experiment that supports the S-R theory of second-order conditioning.

4 (4%). What advantage does the Pearce-Hall model have over the Rescorla-Wagner model in explaining data from unblocking experiments?

5 (2%). Describe two common procedures for measuring conditioned inhibition.

Practice Exams

Final

1. Which of the following procedures results in a decrease in an instrumental response?

A. Avoidance conditioning
B. Escape conditioning
C. Omission training
D. Reward training

2. Suppose a study is conducted with rats. In phase 1, Group T-F is given tone-food pairings, while Group F is only presented with food. In phase 2, a lever is inserted, and each lever press is followed by the tone. What would convince you that the tone is a conditioned reinforcer?

A. Either group pressed the lever in phase 2
B. Group F failed to press the lever
C. Group T-F pressed the lever in phase 2
D. Group T-F pressed the lever more than Group F

3. Punishment in operant conditioning is analogous to excitatory aversive conditioning in Pavlovian conditioning. What, in the punishment procedure, is analogous to the CS, in Pavlovian conditioning?

A. The aversive stimulus
B. The operant response
C. The suppression of the response
D. The reward

4. Your friend is attempting, unsuccessfully, to teach her dog to shake hands using an operant conditioning procedure. You are concerned with contiguity, so you advise your friend:

A. To give the treat immediately after the dog responds
B. Not to wait too long between saying "shake" and giving the treat
C. To first reinforce any movement of the right paw
D. To swat the dog's nose if it fails to quickly offer its paw

5. Whenever a light comes on, a rat’s lever press is followed by a food pellet. To begin the extinction phase, the researcher should:

A. Stop turning on the light
B. Stop delivering food after lever presses
C. Remove the lever
D. Both A and B

6. The type of schedule that typically produces a pause and then an accelerating rate of responding is the ___ schedule of reinforcement.

A. Fixed interval
B. Variable interval
C. Fixed ratio
D. Variable ratio

7. You have discovered a new species, which eats mosquitos, and you have observed a number of its behaviors in the wild, including jumping (to catch mosquitos) and digging (before going to sleep). You are now testing whether instrumental conditioning occurs for the species and try each of the following:

1. Jumping is followed by presenting two mosquitos;
2. Digging is followed by presenting two mosquitos; and
3. Pressing a lever is followed by presenting two mosquitos.

From most to least, what do you predict about how much each behavior will increase with the above contingencies?

A. Jumping, lever pressing, digging
B. Lever pressing, jumping, digging
C. Digging, lever pressing, jumping
D. All behaviors should increase the same amount

8. If Jason sets the table, he does not have to wash the dishes. Which procedure does this exemplify?

A. Avoidance conditioning
B. Escape conditioning
C. Omission training
D. Punishment

9. If an animal experiences ___ independent of its behavior, it later has trouble learning a response in a(n) ___ situation.

A. shocks; avoidance training
B. food; reward training
C. Neither A nor B
D. Both A and B

10. Two schedules which selectively reinforce long IRTs are:

A. VR and VI
B. VI and DRH
C. VI and DRL
D. VR and DRL

1 (3%). Explain why the term “reinforcement” is not defined in terms of specific pleasurable stimuli such as food. Explain how you would know for sure that a given stimulus is indeed a reinforcer.

2 (4%). Describe the conditions under which matching behavior does and does not occur. Describe the two principal explanations for matching.

3 (4%). In an operant conditioning experiment, associations can develop between the discriminative stimulus and the response (S-R association) and/or between the discriminative stimulus and the outcome (S-O association). Describe an experiment that demonstrates each of these associations (i.e. one experiment for each association).

4 (4%). Describe an experiment demonstrating the basic learned helplessness effect. Does the learned helplessness learning deficit result from lack of control over outcomes? (Be sure to support your answer with experimental evidence).

1 (3%). In any experimental situation, it is necessary to determine whether the response you see is due to conditioning or is a by-product of some other variable (i.e., pseudo-conditioning). Describe three control procedures used in Pavlovian conditioning.

2 (4%). How does the Two-Factor (Two-Process) theory explain avoidance conditioning? Describe two problems for this theory.

3 (4%). A general process view of learning suggests that any stimulus/response may be conditioned using Pavlovian/operant techniques. Discuss the extent to which such a claim is justified, providing experimental evidence to support your argument.

4 (4%). While both contiguity and contingency are important in conditioning, it is thought that contingency is more important. Describe one operant and one Pavlovian experiment that demonstrate the importance of contingency.