Seminar talk, 2008

Learning to Forage: Rules, rules, everywhere a rule. Steven Hamblin - Dept. of Biology, UQÀM

A seminar talk I gave in 2008 at the University of Toronto graduate student seminar.

Transcript of Seminar talk, 2008

Page 1: Seminar talk, 2008

Learning to Forage: Rules, rules, everywhere a rule.

Steven Hamblin - Dept. of Biology, UQÀM

Page 2: Seminar talk, 2008

The road ahead...

Page 3: Seminar talk, 2008

Some background:

Components of the problem: learning, foraging, optima.

Producer-Scrounger game.

Learning rules.

The road ahead...

Page 4: Seminar talk, 2008

Our approach:

Simulations and genetic algorithms.

Results.

Next steps.

The road ahead...

Page 5: Seminar talk, 2008

Learning

Page 6: Seminar talk, 2008

Learning

Page 7: Seminar talk, 2008

Learning

Page 8: Seminar talk, 2008
Page 9: Seminar talk, 2008
Page 10: Seminar talk, 2008
Page 11: Seminar talk, 2008
Page 12: Seminar talk, 2008
Page 13: Seminar talk, 2008
Page 14: Seminar talk, 2008

ESS: A strategy which, if adopted by a population, cannot be invaded by a rare mutant strategy.

Page 15: Seminar talk, 2008

Social foraging

Equilibrium behaviour

Learning

Page 16: Seminar talk, 2008

Evolution of Learning Rules

Page 17: Seminar talk, 2008

Producer

Producer-Scrounger Game

Page 18: Seminar talk, 2008

Producers

Scrounger

Producer-Scrounger Game

Page 19: Seminar talk, 2008

Producer-Scrounger Game

Page 20: Seminar talk, 2008

544 ANIMAL BEHAVIOUR, 29, 2

Where the two pay-off curves intersect, both types fare equally well: to one side of the intersection producers do better, to the other, scroungers do better. We can call this the ESS point in accordance with the principle of evolutionarily stable strategies (Maynard Smith 1974; Dawkins 1976). The ESS point represents the stable mixture of producers and scroungers in selective terms to which groups which contain the two types should converge (Fig. 1b).

However, the situation is unlikely to be as straightforward as that. Because both frequency-dependent and density-dependent factors are likely to operate with changes in group size, pay-offs to producers and scroungers are more accurately represented as pay-off surfaces (Fig. 1c). The same principles apply to the surfaces as to the curves in Fig. 1a, except now the intersection between the surfaces for producers and scroungers produces a line rather than a single point. The line of intersection can be mapped as an ESS line on to the two-dimensional surface between the producer/scrounger axes (Fig. 1d), and groups should now 'track' the line rather than converge to a single point. A new and important implication arising from the idea of an ESS line is that the ratio of producers to scroungers at equilibrium is likely to depend on group size. Depending on the shape of the two intersecting surfaces, the ESS line in the horizontal plane can describe a wide variety of curves all of which, except for straight lines through the origin, show a group size effect. The precise shapes of the surfaces may vary depending on the nature of the producer/scrounger relationship. In guarder/'sneak' relationships during mating, for example, the pay-off to guarders (producers) might decrease mono-

[Fig. 1, panels (a) to (d). Axes: pay-off to scroungers and producers, No. scroungers, No. producers. Labels: "Here producers do better", "Here scroungers do better", "So group composition should adjust", "ESS", "ESS-line".]

Fig. 1. (a) Pay-off to individual producers and scroungers as a function of the producer:scrounger ratio in the group (here arbitrarily set at six individuals). The intersection of the two curves is a point representing equal pay-offs to producers and scroungers; when strategies are conditional it is the point at which it would not pay any individual to change strategy. (b) The ESS corresponding to the pay-offs shown in (a). (c) The pay-off to individual producers and scroungers as a function of the number of scroungers at a site yields two surfaces. The intersection of the surfaces is a line giving the ESS for each group size. (d) The projection of these ESS's onto the horizontal plane, giving the ESS line as a function of the number of producers and the number of scroungers.

General note: For simplicity the ESS line has been drawn as if non-integer numbers of producers and scroungers were possible. Restriction to integers gives a line to the right of that shown, usually as close as possible. The integer ESS for a given flock size gives a ratio of scroungers to producers such that if any one changed strategy he would do worse.

Barnard & Sibly, 1981.
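The integer ESS described in the general note can be illustrated with a small sketch. The pay-off functions below are hypothetical (a simple finder's-share model with made-up values for `food` and `finders_share`); they are not taken from the talk or from the excerpt, and serve only to show the "no individual gains by switching" check for a group of six.

```python
def producer_payoff(n_scroungers, food=10.0, finders_share=2.0):
    """Hypothetical pay-off: the finder keeps a share, then splits the rest
    with every scrounger that joins the patch."""
    return finders_share + (food - finders_share) / (n_scroungers + 1)


def scrounger_payoff(n_scroungers, group_size=6, food=10.0, finders_share=2.0):
    """Hypothetical pay-off: a scrounger takes an equal cut of the shared
    part of every producer's patch."""
    n_producers = group_size - n_scroungers
    return n_producers * (food - finders_share) / (n_scroungers + 1)


def integer_ess(group_size=6):
    """Return every number of scroungers at which a single individual that
    changed strategy would do strictly worse."""
    stable = []
    for s in range(group_size + 1):
        n_producers = group_size - s
        # A producer that switches would be one of s + 1 scroungers.
        producer_stays = (n_producers == 0 or
                          scrounger_payoff(s + 1, group_size) < producer_payoff(s))
        # A scrounger that switches would leave s - 1 scroungers behind.
        scrounger_stays = (s == 0 or
                           producer_payoff(s - 1) < scrounger_payoff(s, group_size))
        if producer_stays and scrounger_stays:
            stable.append(s)
    return stable


print(integer_ess(6))  # [3] under these assumed pay-offs
```

With these assumed numbers the two pay-off curves cross between three and four scroungers, and the integer check settles on three scroungers and three producers.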

Page 21: Seminar talk, 2008

50% producer, 50% scrounger.

100% producer, 0% scrounger.

0% producer, 100% scrounger.

Page 22: Seminar talk, 2008
Page 23: Seminar talk, 2008
Page 24: Seminar talk, 2008
Page 29: Seminar talk, 2008

Do they learn?

Yes:

Mottley & Giraldeau, 2000.

Katsnelson et al., 2008.

ISBE, 2008.

Page 30: Seminar talk, 2008
Page 31: Seminar talk, 2008
Page 32: Seminar talk, 2008
Page 33: Seminar talk, 2008
Page 34: Seminar talk, 2008

Individual-based model (a.k.a. agent-based model).

Rules tested in isolation; stability test was questionable.

Page 38: Seminar talk, 2008

Rules

Relative Payoff Sum

Perfect Memory

Linear Operator

Page 39: Seminar talk, 2008

Relative Payoff Sum

$S_i(t) = x S_i(t-1) + (1 - x) r_i + P_i(t)$

where $0 < x < 1$ is a memory factor,

$r_i > 0$ is the residual value associated with alternative $i$,

$P_i(t)$ is the payoff to alternative $i$ at time $t$, and

$S_i(t)$ is the value that the animal places on the behavioural alternative $i$ at time $t$.
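A minimal sketch of one Relative Payoff Sum update, written directly from the formula on this slide. The function and argument names are illustrative assumptions, and the numbers in the usage loop are arbitrary.

```python
def relative_payoff_sum(prev_value, payoff, residual, memory):
    """One update: S_i(t) = x * S_i(t-1) + (1 - x) * r_i + P_i(t)."""
    if not 0.0 < memory < 1.0:
        raise ValueError("memory factor x must satisfy 0 < x < 1")
    if residual <= 0.0:
        raise ValueError("residual r_i must be positive")
    return memory * prev_value + (1.0 - memory) * residual + payoff


# With no payoff at all, the value decays toward the residual r_i,
# because S = x*S + (1 - x)*r has the fixed point S = r.
value = 5.0
for _ in range(200):
    value = relative_payoff_sum(value, payoff=0.0, residual=1.5, memory=0.8)
print(round(value, 6))  # 1.5
```

The loop shows why the residual acts as a floor: an alternative that stops paying is never valued below its residual, so it can still be sampled later.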

Page 44: Seminar talk, 2008

Perfect Memory

$S_i(t) = \alpha + R_i(t) / (\beta + N_i(t))$

where $R_i(t)$ is the cumulative payoffs from alternative $i$ to time $t$,

$N_i(t)$ is the number of time periods from the beginning in which the option was selected, and

$\alpha$ and $\beta$ are parameters.
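A minimal sketch of the Perfect Memory value, taking the formula exactly as it is transcribed on this slide (one parameter added to the ratio, the other added to the count in the denominator). The names `alpha` and `beta` for the rule's two parameters are assumptions, as are the function name and the example payoffs.

```python
def perfect_memory(cumulative_payoff, times_selected, alpha, beta):
    """S_i(t) = alpha + R_i(t) / (beta + N_i(t)), read as transcribed."""
    denominator = beta + times_selected
    if denominator <= 0:
        raise ValueError("beta + N_i(t) must be positive")
    return alpha + cumulative_payoff / denominator


# Track R_i(t) and N_i(t) for an alternative chosen in three periods.
payoffs = [2.0, 0.0, 4.0]
cumulative = 0.0
value = None
for n_selected, payoff in enumerate(payoffs, start=1):
    cumulative += payoff
    value = perfect_memory(cumulative, n_selected, alpha=0.5, beta=1.0)
print(value)  # 0.5 + 6.0 / (1.0 + 3) = 2.0
```

Unlike the other two rules, nothing is discounted here: every past payoff counts equally, however old it is.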

Page 48: Seminar talk, 2008

Linear Operator

$S_i(t) = x S_i(t-1) + (1 - x) P_i(t)$

where $0 < x < 1$ is a memory factor,

$P_i(t)$ is the payoff to alternative $i$ at time $t$, and

$S_i(t)$ is the value that the animal places on the behavioural alternative $i$ at time $t$.
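A minimal sketch of the Linear Operator update from this slide. The function name and the example payoff stream are illustrative assumptions.

```python
def linear_operator(prev_value, payoff, memory):
    """One update: S_i(t) = x * S_i(t-1) + (1 - x) * P_i(t)."""
    if not 0.0 < memory < 1.0:
        raise ValueError("memory factor x must satisfy 0 < x < 1")
    return memory * prev_value + (1.0 - memory) * payoff


# A constant payoff of 2.0 pulls the value toward 2.0, halving the gap
# at every step when the memory factor is 0.5.
value = 0.0
trace = []
for _ in range(3):
    value = linear_operator(value, payoff=2.0, memory=0.5)
    trace.append(value)
print(trace)  # [1.0, 1.5, 1.75]
```

The value is an exponentially weighted average of past payoffs, so there is no residual term: an alternative that stops paying drifts toward zero.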

Page 51: Seminar talk, 2008

$S_i(t) = x S_i(t-1) + (1 - x) r_i + P_i(t)$

$S_i(t) = \alpha + R_i(t) / (\beta + N_i(t))$

$S_i(t) = x S_i(t-1) + (1 - x) P_i(t)$

Relative Payoff Sum?

Perfect Memory?

Linear Operator?

Multiple stable rules with multiple parameters?

Page 52: Seminar talk, 2008

Relative Payoff Sum?

Perfect Memory?

Linear Operator?

Page 54: Seminar talk, 2008

[Flowchart of agent behaviour, beginning at "Agent Start". Decision nodes: "At a patch with food?", "Produce or scrounge?", "Any conspecifics feeding?", "Closest still feeding?", "There yet?", "Still food in patch?". Actions: "Feed", "Move randomly", "Move to closest".]

Simulation notes...

Foraging grid is a variable-sized square grid with movement in the 4 cardinal directions.

Number of patches and number of agents kept to 20% and 10% of the number of grid cells.

Thus: a 40x40 grid would have 320 patches and 160 agents.
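The sizing arithmetic in the simulation notes can be checked with a short sketch. The function name and the integer-percentage parameters are illustrative assumptions; only the 20% and 10% figures and the 40x40 example come from the slide.

```python
def grid_population(side, patch_percent=20, agent_percent=10):
    """Return (cells, patches, agents) for a square grid with `side` cells per edge."""
    cells = side * side
    # Integer arithmetic avoids floating-point rounding on the percentages.
    patches = cells * patch_percent // 100
    agents = cells * agent_percent // 100
    return cells, patches, agents


print(grid_population(40))  # (1600, 320, 160)
```

A 40x40 grid has 1600 cells, so 20% gives 320 patches and 10% gives 160 agents, matching the figures on the slide.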

Page 55: Seminar talk, 2008

Genetic Algorithms

Algorithms that simulate evolution to solve optimization problems.

Page 56: Seminar talk, 2008

Initial population

Measure fitness

Select for reproduction

Mutation

Exit (> n generations)
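The loop on this slide can be sketched as a very small genetic algorithm. Everything specific here is an assumption made for illustration: binary tournament selection, Gaussian mutation clipped to [0, 1], the population settings, and the toy fitness function. In the talk, fitness would come from the foraging simulation instead.

```python
import random


def tournament(population, scores, rng):
    """Pick the fitter of two randomly chosen individuals (assumed scheme)."""
    a = rng.randrange(len(population))
    b = rng.randrange(len(population))
    return population[a] if scores[a] >= scores[b] else population[b]


def evolve(fitness, pop_size=30, genome_len=3, generations=50,
           mutation_rate=0.1, mutation_sd=0.1, seed=0):
    rng = random.Random(seed)
    # Initial population: random genomes with every gene in [0, 1].
    population = [[rng.random() for _ in range(genome_len)]
                  for _ in range(pop_size)]
    # Exit after a fixed number of generations.
    for _ in range(generations):
        # Measure fitness.
        scores = [fitness(genome) for genome in population]
        offspring = []
        for _ in range(pop_size):
            # Select for reproduction.
            child = list(tournament(population, scores, rng))
            # Mutation: perturb each gene with a small probability.
            for k in range(genome_len):
                if rng.random() < mutation_rate:
                    mutated = child[k] + rng.gauss(0.0, mutation_sd)
                    child[k] = min(1.0, max(0.0, mutated))
            offspring.append(child)
        population = offspring
    return population


def toy_fitness(genome):
    """Hypothetical stand-in for the simulation: best when all genes equal 0.5."""
    return -sum((gene - 0.5) ** 2 for gene in genome)


final_population = evolve(toy_fitness)
print(len(final_population), len(final_population[0]))  # 30 3
```

Here a genome is just a list of parameter values, which is one plausible way to encode a learning rule's parameters for this kind of search.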

Page 57: Seminar talk, 2008

One final wrinkle.

Environmental vs. frequency-dependent variance in payoff.

Page 58: Seminar talk, 2008
Page 59: Seminar talk, 2008
Page 60: Seminar talk, 2008
Page 61: Seminar talk, 2008
Page 62: Seminar talk, 2008
Page 63: Seminar talk, 2008

Environmental variation.

Manipulating patch density.

N changes, with greater N meaning greater variation.

Page 66: Seminar talk, 2008

Foraging / Learning rule simulation.

Genetic algorithm to optimize parameters and simulate population dynamics.

Sources of variation

Page 67: Seminar talk, 2008

Problem: Rules tested in isolation.
Solution: Simulation population randomly generated, using all rule types.

Problem: Parameter values arbitrarily chosen; few values tested.
Solution: Genetic algorithm to optimize across the whole parameter space.

Problem: Will rules converge on an ESS? Are they ES learning rules?
Solution: Genetic algorithm to simulate population dynamics.

Page 68: Seminar talk, 2008

Results to date

Page 69: Seminar talk, 2008

[Plot comparing the Relative Payoff Sum, Perfect Memory and Linear Operator rules; x-axis from 0 to 500, y-axis from 0 to 10.]

Page 70: Seminar talk, 2008

[Plot comparing the Relative Payoff Sum, Perfect Memory and Linear Operator rules; x-axis from 0 to 500, y-axis from 0 to 800.]

Page 71: Seminar talk, 2008

[Plot comparing the Relative Payoff Sum, Perfect Memory and Linear Operator rules; x-axis from 0 to 500, y-axis from 0 to 350.]

Page 75: Seminar talk, 2008

[Plot: parameter values (y-axis, 0 to 5) against group size (x-axis: 10, 40, 90, 160, 360, 1000) for the scrounger residual, producer residual and memory factor.]

Page 76: Seminar talk, 2008

Relative Payoff Sum

$S_i(t) = x S_i(t-1) + (1 - x) r_i + P_i(t)$

$r_p \gg r_s$ for large population sizes.

[Plot: value assigned to behaviour (y-axis) against time without payoff to behaviour (x-axis), with the producer residual and scrounger residual marked.]

Page 77: Seminar talk, 2008

[Plot: proportion of specialists (y-axis, 0.0 to 1.0) against group size (x-axis: 10, 40, 90, 160, 360, 1000). Annotations: mean=0.981 and mean=0.008.]

Page 78: Seminar talk, 2008

[Plot: mean proportion of scrounging (y-axis, 0.245 to 0.260) against periods of environmental variability (x-axis, 2 to 10).]

Page 79: Seminar talk, 2008

[Plot: mean proportion of specialists (y-axis, 0.52 to 0.58) against periods of environmental variability (x-axis, 2 to 10).]

Page 80: Seminar talk, 2008

What does that mean?

Page 81: Seminar talk, 2008
Page 85: Seminar talk, 2008

Under the assumptions of this model, the Relative Payoff Sum rule is optimal.

Differences in residuals give a prediction for empirical tests.

Small but consistent effect of environmental variability.

Learning is selected against.

Page 86: Seminar talk, 2008

Next steps?

Page 87: Seminar talk, 2008

Questions?

Thanks to:

The Giraldeau Lab.

Guy Beauchamp.

Maria Modanu and Steve Walker, for the invitation.

Page 88: Seminar talk, 2008

Evolution of learning rule form.

Page 89: Seminar talk, 2008

$S_i(t) = x S_i(t-1) + (1 - x) r_i + P_i(t)$

$S_i(t) = \alpha + R_i(t) / (\beta + N_i(t))$

$S_i(t) = x S_i(t-1) + (1 - x) P_i(t)$

Relative Payoff Sum?

Perfect Memory?

Linear Operator?

Page 90: Seminar talk, 2008
Page 91: Seminar talk, 2008
Page 92: Seminar talk, 2008
Page 93: Seminar talk, 2008

Page 94: Seminar talk, 2008
Page 95: Seminar talk, 2008
Page 97: Seminar talk, 2008

Foraging / Learning rule simulation.

Genetic algorithm to optimize parameters and simulate population dynamics.

Genetic programming to optimize rule structure.

Page 98: Seminar talk, 2008
Page 99: Seminar talk, 2008
Page 100: Seminar talk, 2008
Page 101: Seminar talk, 2008
Page 102: Seminar talk, 2008
Page 103: Seminar talk, 2008
Page 104: Seminar talk, 2008
Page 105: Seminar talk, 2008
Page 106: Seminar talk, 2008
Page 107: Seminar talk, 2008

Learning

Page 108: Seminar talk, 2008

Learning

Page 109: Seminar talk, 2008

Learning

Page 110: Seminar talk, 2008

Learning

Page 111: Seminar talk, 2008

Learning

Page 112: Seminar talk, 2008

housed in flocks of six in common cages (59×32 and 46 cm high) made of galvanized wire mesh and kept on a 12:12 h light:dark cycle at 27°C (±2°). They were fed ad libitum on a mixture of white and red millet seeds and offered ad libitum water. Each bird was marked with a unique combination of two coloured leg bands. In addition, the tail and neck feathers of each individual were coloured with acrylic paint to allow individual identification from a distance.

Apparatus

The purpose of the experimental apparatus was to constrain subjects to act as either producers or scroungers in order to manipulate the frequency of each tactic within a flock. The apparatus consisted of an indoor cage (273×102 cm and 104 cm high) with a producer and a scrounger compartment divided by a series of 22 patches, of which every second one contained seeds (Fig. 2a). An opaque barrier placed length-wise from ceiling to floor prevented birds from moving between the producer and scrounger compartments (Fig. 2a).

Each patch consisted of a seed container and a string that prevented the seeds from falling out. Pulling the string caused the seeds to fall into a 2×2 cm collecting dish located directly below the seed container. Once in the collecting dish the seeds were available to the individual that pulled the string from the producer compartment and all individuals within the scrounger

[Figure 2, panels (a) and (b). Labels: barrier, scrounger side, producer side, seed container, division, collecting dish, string, perch.]

Figure 2. Top view of the experimental apparatus (a) and foraging patch (b). Individuals could search for seed-containing patches by pulling the string associated with each patch. Strings were available only in the producer compartment. Birds in the scrounger compartment searched for individuals feeding from produced patches. When the top portion of an opaque barrier was in place, the birds in one compartment could not move into the other compartment. A close-up view of the patch (b) shows that producers had to sit on a perch directly in front of a patch to pull the string associated with that patch, and if seeds were present, they were released into the collecting dish. From the perch, a producer could reach the collecting dish by stretching its neck through a small hole in the division placed between compartments. The arrow indicates the direction in which the string had to be pulled to release the seeds.

343 MOTTLEY & GIRALDEAU: CONVERGING ON PS EQUILIBRIA