Academic Course: 10 On-line adaptation, learning, evolution

Designed by Gusz Eiben & Mark Hoogendoorn

On-line adaptation, learning, evolution

http://www.cs.vu.nl/~gusz/

http://www.few.vu.nl/~mhoogen/


Outline

• Population-based Adaptive Systems

• Types of adaptation: evolution, individual (lifetime) learning, social learning

• Machine learning

• Reinforcement learning

• Off-line vs. on-line adaptation




Population-based Adaptive Systems

A PAS has two essential features:

• It consists of a group of basic units (agents) that can perform actions, e.g., computation, communication, interaction.

• It can adapt at the

– individual level (modify an agent) and/or

– group level (add/remove agents).




Types of adaptation

• Evolutionary learning (EL): Changes at population level (assumed non-Lamarckian)

• Lifetime learning (LL): Changes at agent level

– Individual learning (IL): the agent adapts autonomously through a purely internal procedure

– Social learning (SL): the agent adapts through interaction/communication with other agents




Taxonomy of adaptation

Adaptation

• Evolutionary learning

• Lifetime learning

– Individual learning

– Social learning




Taxonomy of adaptation 2

Adaptation

• Evolutionary learning

• Lifetime learning

– Individual learning

– Social learning

[Figure: the same tree, with two additional labels: Learning, Evolution]




Adaptation ≠ operation

• Operation: controller is being used

– Sensory inputs → outputs (motor, comm. device)

– Robot behavior changes, not the controller

• Adaptation: controller is being changed

– Present controller → new controller

– Uses utility/reward/fitness info

– It may require

• One single robot – learning

• More robots – evolution, social learning

• Adaptation + operation = generate + test

• Off-line (initial controller design, before start) vs. on-line (after start)
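The generate + test view above can be sketched in a few lines. This is only an illustration under stated assumptions, not the course's own method: the controller is assumed to be a plain weight vector, the utility is a toy squared-error score, and the names `operate`, `utility`, and `adapt` are hypothetical.

```python
import random

def operate(controller, inputs):
    """Operation: the controller is used, unchanged, to map inputs to an output."""
    return sum(w * x for w, x in zip(controller, inputs))

def utility(controller, samples):
    """Test: score a controller (toy utility = negative squared error)."""
    return -sum((operate(controller, x) - y) ** 2 for x, y in samples)

def adapt(controller, samples, rng, step=0.1):
    """Adaptation: generate a new controller, keep it only if it tests no worse."""
    candidate = [w + rng.gauss(0.0, step) for w in controller]
    if utility(candidate, samples) >= utility(controller, samples):
        return candidate
    return controller

rng = random.Random(0)
# Hypothetical (inputs, desired output) pairs standing in for reward information.
samples = [((1.0, 0.0), 2.0), ((0.0, 1.0), -1.0), ((1.0, 1.0), 1.0)]
controller = [0.0, 0.0]
initial = utility(controller, samples)
for _ in range(200):
    controller = adapt(controller, samples, rng)
final = utility(controller, samples)
```

Only `adapt` ever replaces the controller; `operate` merely uses it. With a single robot this loop corresponds to learning; running it before deployment would be off-line adaptation, running it during operation would be on-line.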




[Figure: diagram with the components Genotype; Developmental engine (decoder); Genetic operators: mutation & crossover; Learning operators; Robot behavior; State of the environment; Phenotype = controller; Reward; Fitness; Selection operators]




[Figure: diagram with the components Genotype; Developmental engine (decoder); Learning operators; Robot behavior; Reward; Phenotype (controller, shape)]




[Figure: diagram with the components Phenotype (controller, shape); Genotype; Developmental engine (decoder); Learning operators; Robot behavior; Reward]




Evolutionary loop

[Figure: diagram with the components Genotype; Developmental engine; Genetic operators: mutation & crossover; Learning operator(s); Robot behavior; Changes in environment; Controller = phenotype; Reward; Fitness; Selection operator(s)]
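The components named in the evolutionary loop can be put together in a toy sketch. Everything concrete here is an assumption made for illustration: the bit-string genotype, the trivial decoder, the count-based fitness (a stand-in for measured robot behavior), truncation selection, and all function names.

```python
import random

def decode(genotype):
    """Developmental engine (decoder): genotype -> phenotype (controller weights)."""
    return [1 if bit else -1 for bit in genotype]

def fitness(phenotype):
    """Toy fitness: number of +1 weights in the controller."""
    return sum(1 for w in phenotype if w == 1)

def mutate(genotype, rng, rate=0.1):
    """Genetic operator: flip each bit with probability `rate`."""
    return [1 - bit if rng.random() < rate else bit for bit in genotype]

def crossover(a, b, rng):
    """Genetic operator: one-point crossover of two parent genotypes."""
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:]

def evolve(pop_size=10, length=8, generations=30, seed=0):
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(length)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        # Selection operator: keep the fitter half as parents (truncation).
        ranked = sorted(population, key=lambda g: fitness(decode(g)), reverse=True)
        parents = ranked[: pop_size // 2]
        # Variation acts on genotypes only; phenotypes are re-decoded each time.
        children = [
            mutate(crossover(rng.choice(parents), rng.choice(parents), rng), rng)
            for _ in range(pop_size - len(parents))
        ]
        population = parents + children
    return max(population, key=lambda g: fitness(decode(g)))

best = evolve()
```

Because mutation and crossover touch only the genotype, nothing a controller might learn during its lifetime is written back, which matches the non-Lamarckian assumption stated earlier.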




Learning loop

[Figure: diagram with the components Genotype; Developmental engine; Genetic operators: mutation & crossover; Learning operator(s); Robot behavior; Changes in environment; Controller = phenotype; Reward; Fitness; Selection operator(s)]




[Figure: agent-environment diagram with the labels AGENT; ENVIRONMENT; State s(t); Reward r(t); Action a(t)]




Reinforcement learning

Agent in situation/state s_t chooses action a_t

World changes to situation/state s_{t+1}

Agent perceives situation s_{t+1} and gets reward r_{t+1}

What tells the agent what to do is its

POLICY π_t(s, a) = Pr{a_t = a | s_t = s}

Given that the situation at time t is s, the policy gives the probability that the agent’s action will be a.

For example: π_t(s, goforward) = 0.5, π_t(s, gobackward) = 0.5.

Reinforcement learning ⇒ Get/find/learn the policy
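A stochastic policy of this kind can be represented as a probability table and sampled from. The sketch below is only an illustration: the table layout and the helper names `action_probability` and `choose_action` are assumptions, and the values are the ones from the example above.

```python
import random

# policy[state][action] = probability of taking `action` in `state`.
policy = {
    "s": {"goforward": 0.5, "gobackward": 0.5},
}

def action_probability(policy, state, action):
    """Return pi(s, a); actions not listed in the table have probability 0."""
    return policy[state].get(action, 0.0)

def choose_action(policy, state, rng):
    """Sample an action according to the probabilities the policy assigns in `state`."""
    actions = list(policy[state])
    weights = [policy[state][a] for a in actions]
    return rng.choices(actions, weights=weights, k=1)[0]

rng = random.Random(0)
action = choose_action(policy, "s", rng)
```

Learning the policy then amounts to adjusting the numbers in this table from the rewards the agent receives; how that is done depends on the specific reinforcement learning algorithm.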




Further reading

• Evert Haasdijk, A.E. Eiben and Alan F.T. Winfield, Individual, Social and Evolutionary Adaptation in Collective Systems, in Serge Kernbach (ed.), Handbook of Collective Robotics, Pan Stanford, 2011


