Robotica Lezione 14. Lecture Outline Neural networks Classical conditioning AHC with NNs Genetic...

35
Robotica Lezione 14

Transcript of Robotica Lezione 14. Lecture Outline Neural networks Classical conditioning AHC with NNs Genetic...

Page 1: Robotica Lezione 14. Lecture Outline  Neural networks  Classical conditioning  AHC with NNs  Genetic Algorithms  Classifier Systems  Fuzzy learning.

Robotica

Lezione 14

Page 2: Robotica Lezione 14. Lecture Outline  Neural networks  Classical conditioning  AHC with NNs  Genetic Algorithms  Classifier Systems  Fuzzy learning.

Lecture Outline Neural networks Classical conditioning AHC with NNs Genetic Algorithms Classifier Systems Fuzzy learning Case-based learning Memory-based learning Explanation-based learning

Page 3: Robotica Lezione 14. Lecture Outline  Neural networks  Classical conditioning  AHC with NNs  Genetic Algorithms  Classifier Systems  Fuzzy learning.

Q Learning Algorithm

Q(x,a) Q(x,a) + b*(r + *E(y) - Q(x,a)) x is state, a is action b is learning rate r is reward is discount factor (0,1) E(y) is the utility of the state y, computed as

E(y) = max(Q(y,a)) for all actions a

Guaranteed to converge to optimal, given infinite trials

Page 4: Robotica Lezione 14. Lecture Outline  Neural networks  Classical conditioning  AHC with NNs  Genetic Algorithms  Classifier Systems  Fuzzy learning.

Supervised Learning

Supervised learning requires the user to give the exact solution to the robot in the form of the error direction and magnitude.

The user must know the exact desired behavior for each situation.

Supervised learning involves training, which can be very slow; the user must supervise the system with numerous examples.

Page 5: Robotica Lezione 14. Lecture Outline  Neural networks  Classical conditioning  AHC with NNs  Genetic Algorithms  Classifier Systems  Fuzzy learning.

Neural Networks NNs are the most widely used example of

supervised learning There are numerous types of NNs (feed-

forward, recurrent) and NN learning algorithms

Hebbian learning: increase synaptic strength along pathways associated with stimulus and correct response

Perceptron learning: delta rule or back-propagation

Page 6: Robotica Lezione 14. Lecture Outline  Neural networks  Classical conditioning  AHC with NNs  Genetic Algorithms  Classifier Systems  Fuzzy learning.

Perceptron Learning

Algorithm:

Page 7: Robotica Lezione 14. Lecture Outline  Neural networks  Classical conditioning  AHC with NNs  Genetic Algorithms  Classifier Systems  Fuzzy learning.

NNs are RL In all NNs, the goal is to minimize the error

between the network output and the desired output

This is achieved by adjusting the weights on the network connections

Note: NNs are a form of reinforcement learning

NNs perform supervised RL with immediate error feedback

Page 8: Robotica Lezione 14. Lecture Outline  Neural networks  Classical conditioning  AHC with NNs  Genetic Algorithms  Classifier Systems  Fuzzy learning.

Classical Conditioning

Classical conditioning comes from psychology (Pavlov 1927)

Assumes that unconditioned stimuli (e.g., food) cause unconditioned responses (e.g., salivation); US => UR

A conditioned stimulus is, over time, associated with an unconditioned response (CS=>UR)

E.g., CS (bell ringing) => UR (salivation)

Page 9: Robotica Lezione 14. Lecture Outline  Neural networks  Classical conditioning  AHC with NNs  Genetic Algorithms  Classifier Systems  Fuzzy learning.

Classical Conditioning

Instead of encoding SR rules, conditioning can be used to from the associations automatically

Can be encoded in NNs

Page 10: Robotica Lezione 14. Lecture Outline  Neural networks  Classical conditioning  AHC with NNs  Genetic Algorithms  Classifier Systems  Fuzzy learning.

Connectionist AHC Robuter learned a set of gain multipliers for

exploration (Gachet et al)

Page 11: Robotica Lezione 14. Lecture Outline  Neural networks  Classical conditioning  AHC with NNs  Genetic Algorithms  Classifier Systems  Fuzzy learning.

Associative Learning

Learning new behaviors by associating sensors and actions into rules

E.g.,: 6-legged walking (Edinburgh U.) Whisker sensors first, IR and light later 3 actions: left,right, ahead User provided feedback (shaping) Learned avoidance, pushing, wall following,

light seeking

Page 12: Robotica Lezione 14. Lecture Outline  Neural networks  Classical conditioning  AHC with NNs  Genetic Algorithms  Classifier Systems  Fuzzy learning.

Two-Layer Perceptron Arch.

Page 13: Robotica Lezione 14. Lecture Outline  Neural networks  Classical conditioning  AHC with NNs  Genetic Algorithms  Classifier Systems  Fuzzy learning.

More NN Examples

Some domains and tasks lend themselves very well to supervised NN learning

The best example is robot motion planning for articulation/manipulation

The answer to any given situation is well known, and can be trained

E.g., NNs are widely used for learning inverse kinematics

Page 14: Robotica Lezione 14. Lecture Outline  Neural networks  Classical conditioning  AHC with NNs  Genetic Algorithms  Classifier Systems  Fuzzy learning.

Evolutionary Methods

Genetic/evolutionary approaches are based on the evolutionary search metaphor

The states/situations and actions/behaviors are represented as "genes”

Different combinations are tried by various "individuals" in "populations"

Individuals with the highest "fitness" perform the best, are kept as survivors

Page 15: Robotica Lezione 14. Lecture Outline  Neural networks  Classical conditioning  AHC with NNs  Genetic Algorithms  Classifier Systems  Fuzzy learning.

Evolutionary Methods

The others are discarded; this is the selection process.

The survivors' "genes" are mutated, crossed-over, and new individuals are so formed, which are then tested and scored.

In effect, the evolutionary process is searching through the space of solutions to find the one with the highest fitness.

Page 16: Robotica Lezione 14. Lecture Outline  Neural networks  Classical conditioning  AHC with NNs  Genetic Algorithms  Classifier Systems  Fuzzy learning.

Summary of Evolution

Evolutionary methods solve search and optimization problems using a fitness function operators

They represent the solution as a genetic encoding (string)

They select ‘best’ individuals for reproduction and applyCross over, mutation

They operate on populations

Page 17: Robotica Lezione 14. Lecture Outline  Neural networks  Classical conditioning  AHC with NNs  Genetic Algorithms  Classifier Systems  Fuzzy learning.

Genetic Algorithms

Page 18: Robotica Lezione 14. Lecture Outline  Neural networks  Classical conditioning  AHC with NNs  Genetic Algorithms  Classifier Systems  Fuzzy learning.

Levels of Application

Genetic methods can be applied at different levels: 1) for tuning parameters (such as gains in a

control system) 2) for developing controllers (policies) for

individual robots 3) for developing group strategies for multi-robot

systems (by testing groups as populations)

Page 19: Robotica Lezione 14. Lecture Outline  Neural networks  Classical conditioning  AHC with NNs  Genetic Algorithms  Classifier Systems  Fuzzy learning.

GAs v. GPs

When applied to strings of genes, the approaches are classified as genetic algorithms (GA)

When applied to pieces of executable programs, they approaches are classified as genetic programming (GP)

GP operates at a higher level of abstraction than GA

Page 20: Robotica Lezione 14. Lecture Outline  Neural networks  Classical conditioning  AHC with NNs  Genetic Algorithms  Classifier Systems  Fuzzy learning.

Classifier Systems Use GAs to learn rule sets ALECSYS - Autono-mouse Learn behaviors and coordination

Page 21: Robotica Lezione 14. Lecture Outline  Neural networks  Classical conditioning  AHC with NNs  Genetic Algorithms  Classifier Systems  Fuzzy learning.

Evolving Structure & Behavior

Evolving morphology & behavior (Sims 1994) using a physics-based model

Fitness in simulation through competition (e.g., walking and swimming)

Page 22: Robotica Lezione 14. Lecture Outline  Neural networks  Classical conditioning  AHC with NNs  Genetic Algorithms  Classifier Systems  Fuzzy learning.

Co-evolving Brains and Bodies

Extension to the physical world (Pollack)

Page 23: Robotica Lezione 14. Lecture Outline  Neural networks  Classical conditioning  AHC with NNs  Genetic Algorithms  Classifier Systems  Fuzzy learning.

Fuzzy Logic

Fuzzy control produces actions using a set of fuzzy rules based on fuzzy logic

This involves: fuzzifying: mapping sensor readings into a set of

fuzzy inputs fuzzy rule base: a set of IF-THEN rules fuzzy inference: maps fuzzy sets onto other fuzzy

sets using membership fncts. defuzzifying: mapping a set of fuzzy outputs onto

a set of crisp output commands

Page 24: Robotica Lezione 14. Lecture Outline  Neural networks  Classical conditioning  AHC with NNs  Genetic Algorithms  Classifier Systems  Fuzzy learning.

Fuzzy Control Fuzzy logic allows for specifying behaviors

as fuzzy rules Such behaviors can be smoothly blended

together (e.g., Flakey, DAMN) Fuzzy rules can be learned

Page 25: Robotica Lezione 14. Lecture Outline  Neural networks  Classical conditioning  AHC with NNs  Genetic Algorithms  Classifier Systems  Fuzzy learning.

Memory-Based Learning

Memory-based learning (MBL) involves storing the exact parameters of a situation/action.

This type of learning is important for systems with a great deal of parameters, where blind search through parameter space would take prohibitively long.

MBL takes a lot of space for data storage.

Page 26: Robotica Lezione 14. Lecture Outline  Neural networks  Classical conditioning  AHC with NNs  Genetic Algorithms  Classifier Systems  Fuzzy learning.

Memory-Based Learning

This is a trade-off against computation time; MBL assumes that it is best to use the space to remember than time to compute.

Content-addressable memory is used to approximate output for a new input

This is a form of interpolation The approach uses regression (basic

intuition is weighted averaging)

Page 27: Robotica Lezione 14. Lecture Outline  Neural networks  Classical conditioning  AHC with NNs  Genetic Algorithms  Classifier Systems  Fuzzy learning.

Symbolic Learning Learning new rules given a set of predefined

rules by inference by other logic techniques

Example: induction on decision trees

Page 28: Robotica Lezione 14. Lecture Outline  Neural networks  Classical conditioning  AHC with NNs  Genetic Algorithms  Classifier Systems  Fuzzy learning.

Example: Robo-SOAR

Logical reasoning system Production system based on implications or

rules Chunking Conflict resolution mechanism when

different rules apply Continuous conversion of deliberative info to

efficient representations

Page 29: Robotica Lezione 14. Lecture Outline  Neural networks  Classical conditioning  AHC with NNs  Genetic Algorithms  Classifier Systems  Fuzzy learning.

Multistrategy Learning

Learning methods can be combined Combining NNs and RL is common

use NNs to generalize states then apply RL in a reduced state space

Combining NNs and MBL use NNs to interpolate between instances examples: ping pong, juggling, pole balance

Page 30: Robotica Lezione 14. Lecture Outline  Neural networks  Classical conditioning  AHC with NNs  Genetic Algorithms  Classifier Systems  Fuzzy learning.

Multistrategy Learning

Combining NNs and genetic algorithms is also useful example: learning to walk (next)

Neuro-fuzzy is a popular approach example: USC AFV

Page 31: Robotica Lezione 14. Lecture Outline  Neural networks  Classical conditioning  AHC with NNs  Genetic Algorithms  Classifier Systems  Fuzzy learning.

Example: 6-legged walking

Each leg modeled as an oscillatorPhasing of legs controlled by a NNWeights of NN learnt using a GA

Page 32: Robotica Lezione 14. Lecture Outline  Neural networks  Classical conditioning  AHC with NNs  Genetic Algorithms  Classifier Systems  Fuzzy learning.

Example: AVATAR USC Robotics Labs

Page 33: Robotica Lezione 14. Lecture Outline  Neural networks  Classical conditioning  AHC with NNs  Genetic Algorithms  Classifier Systems  Fuzzy learning.

State of the Art There are many learning techniques

Use dependant on application

Robotics is a particularly challenging domain for learning very few successful examples active area of research

Page 34: Robotica Lezione 14. Lecture Outline  Neural networks  Classical conditioning  AHC with NNs  Genetic Algorithms  Classifier Systems  Fuzzy learning.

Implemented Systems

Supervised learning has been used to learn controllers for complex, many DOF arms, as well as for navigating specific paths and performing articulated skills such as juggling and pole-balancing.

Memory-based learning has been used to learn controllers for very similar manipulator tasks as above.

Page 35: Robotica Lezione 14. Lecture Outline  Neural networks  Classical conditioning  AHC with NNs  Genetic Algorithms  Classifier Systems  Fuzzy learning.

Implemented Systems

Reinforcement learning has been used to learn controllers for individual navigating robots, groups of foraging robots, map-learning and maze-learning robots.

Evolutionary learning has been used to learn controllers for individual navigating robots, maze-solving robots, herding robots, and foraging robots.