Stochastic Optimization and Simulated Annealing Psychology 85-419/719 January 25, 2001.


Page 1:

Stochastic Optimization and Simulated Annealing

Psychology 85-419/719, January 25, 2001

Page 2:

In Previous Lecture...

• Discussed constraint satisfaction networks, having:
  – Units, weights, and a “goodness” function
• Updating states involves computing input from other units
  – Guaranteed to locally increase goodness
  – Not guaranteed to globally increase goodness

Page 3:

The General Problem: Local Optima

[Figure: goodness (y-axis) as a function of activation state (x-axis), with a local optimum and the true optimum marked.]

Page 4:

How To Solve the Problem of Local Optima?

• Exhaustive search?
  – Nah. Takes too long. n units have 2 to the nth power possible states (if binary)
• Random re-starts?
  – Seems wasteful.
• How about something that generally goes in the right direction, with some randomness?
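The state-count arithmetic behind the first bullet is easy to check directly (a minimal sketch, just illustrating the exponential growth the slide mentions):

```python
# Number of possible states for n binary units is 2**n,
# which is why exhaustive search is hopeless for large n.
for n in (10, 20, 30, 40):
    print(f"{n} units -> {2 ** n} states")
```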

Page 5:

Sometimes It Isn’t Best To Always Go Straight Towards The Goal

• Rubik’s Cube: Undo some moves in order to make progress

• Baseball: sacrifice fly

• Navigation: move away from goal, to get around obstacles

Page 6:

Randomness Can Help Us Escape Bad Solutions

[Figure: goodness landscape over activation state, illustrating how randomness can knock the network out of a local optimum.]

Page 7:

So, How Random Do WeWant to Be?

• We can take a cue from physical systems
• In metallurgy, metals can reach a very strong (stable) state by:
  – Melting it; scrambles molecular structure
  – Gradually cooling it
  – Resulting molecular structure very stable
• New terminology: reduce energy (which is kind of like the negative of goodness)

Page 8:

Simulated Annealing

p[a_i = 1] = 1 / (1 + e^(-net_i / T))

Odds that a unit is on is a function of:

The input to the unit, net

The temperature, T

Page 9:

Picking it Apart...

• As net increases, probability that output is 1 increases
  – e is raised to the negative of net/T; so as net gets big, e^(-net/T) goes to zero. So probability goes to 1/1 = 1.

p[a_i = 1] = 1 / (1 + e^(-net_i / T))

Page 10:

The Temperature Term

• When T is big, the exponent for e goes to zero.

• e (or anything) to the zero power is 1

• So, odds output is 1 goes to 1/(1+1)=0.5

p[a_i = 1] = 1 / (1 + e^(-net_i / T))

Page 11:

The Temperature Term (2)

p[a_i = 1] = 1 / (1 + e^(-net_i / T))

• When T gets small, exponent gets big.

• Effect of net becomes amplified.
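Both temperature effects can be illustrated numerically: with the same net input, a high temperature pushes the probability toward 0.5, while a low temperature pushes it toward 0 or 1 (a sketch; the particular T values are illustrative, not from the lecture):

```python
import math

def p_on(net, T):
    # p[a_i = 1] = 1 / (1 + e^(-net/T))
    return 1.0 / (1.0 + math.exp(-net / T))

net = 1.0
for T in (100.0, 1.0, 0.1):
    print(f"T={T:>5}: p(on) = {p_on(net, T):.3f}")
# High T: near 0.5 (nearly random); low T: near 1 (net input dominates).
```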

Page 12:

Different Temperatures...

[Figure: probability that output is 1 (y-axis, 0 to 1, with 0.5 marked) as a function of net input (x-axis), at high, medium, and low temperature.]

p[a_i = 1] = 1 / (1 + e^(-net_i / T))

Page 13:

Ok, So At What Rate Do We Reduce Temperature?

In general, must decrease it very slowly to guarantee convergence to global optimum:

T(t) = c / log(1 + t)

[Figure: T(t) plotted for t from 0 to 100, decreasing slowly.]

In practice, we can get away with a more aggressive annealing schedule.
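The slow theoretical schedule and a more aggressive practical one might be sketched like this (the constants c, T0, and alpha are illustrative assumptions, not values from the lecture):

```python
import math

def logarithmic_T(t, c=10.0):
    """Slow schedule T(t) = c / log(1 + t); guarantees convergence in theory (t >= 1)."""
    return c / math.log(1.0 + t)

def geometric_T(t, T0=10.0, alpha=0.9):
    """More aggressive schedule often used in practice: T shrinks by a constant factor."""
    return T0 * alpha ** t

# After 100 steps the logarithmic schedule is still quite warm,
# while the geometric schedule is effectively frozen.
print(logarithmic_T(100), geometric_T(100))
```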

Page 14:

Putting it Together...

• We can represent facts, etc. as units

• Knowledge about these facts encoded as weights

• Network processing fills in gaps, makes inferences, forms interpretations

• Stable Attractors form; the weights and input sculpt these attractors.

• Stability (and goodness) enhanced with randomness in updating process.
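Putting these pieces together, a tiny network can be settled with stochastic updates under an annealing schedule (a minimal sketch: the weights, biases, and schedule here are illustrative assumptions, not from the lecture):

```python
import math
import random

def anneal(weights, bias, schedule, steps=500, seed=0):
    """Settle a small binary network by stochastic unit updates.

    Each step: pick a unit, compute its net input, and turn it on with
    probability 1 / (1 + e^(-net/T)), where T follows the annealing schedule.
    """
    rng = random.Random(seed)
    n = len(weights)
    a = [rng.randint(0, 1) for _ in range(n)]
    for t in range(1, steps + 1):
        T = schedule(t)
        i = rng.randrange(n)
        net = bias[i] + sum(weights[i][j] * a[j] for j in range(n))
        p = 1.0 / (1.0 + math.exp(-net / T))
        a[i] = 1 if rng.random() < p else 0
    return a

# Two mutually supporting units and a third that inhibits both:
W = [[0, 2, -2],
     [2, 0, -2],
     [-2, -2, 0]]
b = [0.5, 0.5, -0.5]
state = anneal(W, b, schedule=lambda t: 10.0 / t)
print(state)
```

With a low final temperature the network tends to settle into a high-goodness state (here, the two supporting units on and the inhibitor off), though any single run is stochastic.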

Page 15:

Stable Attractors Can Be Thought Of As Memories

• How many stable patterns can be remembered by a network with N units?

• There are 2 to the N possible patterns…
• … but only about 0.15*N will be stable
• To remember 100 things, need 100/0.15 ≈ 667 units!
• (then again, the brain has about 10 to the 12th power neurons…)
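The capacity arithmetic can be checked directly (0.15·N is the slide's estimate of how many stable patterns N units can hold):

```python
import math

def units_needed(n_memories, capacity_per_unit=0.15):
    # Invert the capacity estimate: about 0.15 * N patterns are stable,
    # so storing M memories takes roughly M / 0.15 units.
    return math.ceil(n_memories / capacity_per_unit)

print(units_needed(100))  # roughly 667 units to hold 100 memories
```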

Page 16:

Human Performance, When Damaged (some examples)

• Category coordinate errors
  – Naming a CAT as a DOG
• Superordinate errors
  – Naming a CAT as an ANIMAL
• Visual errors (deep dyslexics)
  – Naming SYMPATHY as SYMPHONY
  – or, naming SYMPATHY as ORCHESTRA

Page 17:

The Attractors We’ve Talked About Can Be Useful In Understanding This

[Figure: two attractor-basin diagrams involving the patterns CAT and COT and the response “CAT” — left: Normal Performance; right: A Visual Error.]

(see Plaut, Hinton, & Shallice)

Page 18:

Properties of Human Memory

• Details tend to go first, more general things next. Not all-or-nothing forgetting.

• Things tend to be forgotten, based on
  – Salience
  – Recency
  – Complexity
  – Age of acquisition?

Page 19:

Do These Networks Have These Properties?

• Sort of.

• Graceful degradation. Features vanish as a function of strength of input to them.

• Complexity: more complex / arbitrary patterns can be more difficult to retain

• Salience, recency, age of acquisition?– Depends on learning rule. Stay tuned

Page 20:

Next Time: Psychological Implications: The IAC Model of Word Perception

• Optional reading: McClelland and Rumelhart ‘81 (handout)

• Rest of this class: Lab session. Help installing software, help with homework.