Robot skill synthesis through human sensorimotor learning...Obtaining robot skill via human...

32
Robot skill synthesis through human sensorimotor learning Jan Babič Jožef Stefan Institute, Slovenia

Transcript of Robot skill synthesis through human sensorimotor learning...Obtaining robot skill via human...

Page 1: Robot skill synthesis through human sensorimotor learning...Obtaining robot skill via human sensorimotor learning is a viable approach Since the paradigm reverse engineers the control

Robot skill synthesis through human sensorimotor learning

Jan Babič Jožef Stefan Institute, Slovenia

Page 2: Robot skill synthesis through human sensorimotor learning...Obtaining robot skill via human sensorimotor learning is a viable approach Since the paradigm reverse engineers the control

Grasping vs. full body motion

11/6/2012 1

Page 3: Robot skill synthesis through human sensorimotor learning...Obtaining robot skill via human sensorimotor learning is a viable approach Since the paradigm reverse engineers the control

Outline

• Ball swapping • Grasping • Reactive postural control • Concluding remarks

2

Human sensorimotor learning

Robot skill synthesis

Page 4: Robot skill synthesis through human sensorimotor learning...Obtaining robot skill via human sensorimotor learning is a viable approach Since the paradigm reverse engineers the control

Robots in everyday life…

• Robots in daily life require new methods for synthesis of skillful behavour

• Classical approach requires experts, and lot of expert work hours. How could non-experts teach robots is an active research topic in

robotics: Teaching by demonstration Robotic imitation … • To make the task as natural and easy for the human teacher • The human provides an initial demonstration but is NOT part of the

motor control loop

3

Page 5: Robot skill synthesis through human sensorimotor learning...Obtaining robot skill via human sensorimotor learning is a viable approach Since the paradigm reverse engineers the control

The paradigm • Use human sensorimotor learning ability to obtain robot behaviors • Include the human in the control loop • May ask human to do extensive training • Utilize the human brain as the adaptive controller

Motor command (u) Human Motion (m)

Robot state (s) Feedback to human sensory system (f)

Human ~Adaptive Controller

Feedforward Interface

Feedback Interface

4

Page 6: Robot skill synthesis through human sensorimotor learning...Obtaining robot skill via human sensorimotor learning is a viable approach Since the paradigm reverse engineers the control

Sensorimotor learning • Sensorimotor learning is fundamental for adaptive and

intelligent behavior • Driving a car • Using a pair of chopsticks • Using a computer mouse • …

5

Motor command (u) Human Motion (m)

Robot state (s) Feedback to human sensory system (f)

Feedforward Interface

Feedback Interface

Page 7: Robot skill synthesis through human sensorimotor learning...Obtaining robot skill via human sensorimotor learning is a viable approach Since the paradigm reverse engineers the control

Skill synthesis for autonomy For autonomous operation, the key issue is transferring the control policy

learnt by human to the robot

6

Motor command (u) Human Motion (m)

Robot state (s) Feedback to human sensory system (f)

Human ~Adaptive Controller

Feedforward Interface

Feedback Interface

Robot Learning: Learn π: s → u

Page 8: Robot skill synthesis through human sensorimotor learning...Obtaining robot skill via human sensorimotor learning is a viable approach Since the paradigm reverse engineers the control

Why should this paradigm work?

• The ability of the brain to learn novel control tasks by forming internal models. The robot can be considered as a tool (e.g. as driving a car, playing an instrument, using chopsticks)

• The flexibility of the body schema; extensive human training modifies the body schema so that the robot can be naturally controlled

7

Page 9: Robot skill synthesis through human sensorimotor learning...Obtaining robot skill via human sensorimotor learning is a viable approach Since the paradigm reverse engineers the control

Ball swapping is a suitable task for testing the proposal since it is complex and not straightforward to manually program on a robotic hand

Ball swapping

work of Erhan Oztop

8

Page 10: Robot skill synthesis through human sensorimotor learning...Obtaining robot skill via human sensorimotor learning is a viable approach Since the paradigm reverse engineers the control

Ball swapping interface

Motor command (u) Human Motion (m)

Robot state (s) Feedback to human sensory system (f)

Robot Learning: Learn π: s → u Human ~

Adaptive Controller

Feedforward Interface

Feedback Interface

Feedback to human: DIRECT VISION

9

Page 11: Robot skill synthesis through human sensorimotor learning...Obtaining robot skill via human sensorimotor learning is a viable approach Since the paradigm reverse engineers the control

VisualEyez Output

30Hz

Gifu Hand Controller

+10Hz Input Driven

Inverse Kinematics

Input Driven

Central Controller

30HzUser Interface

Marker Positions

Marker Positions

Gifu Hand Joint Angles

Gifu Hand Joint Angles

Hand Status

commands

System info

VizualEyez data

Finger tip positions

Finger tip positions For Gifu Hand

Raw joint angles

Desired joint angles

Gifu Hand actuation

Human hand movement

Data Capture

Build hand Reference frame

Calibration

Inverse Kinematics

Filtering

PD Control

Human control of the robot

10

Page 12: Robot skill synthesis through human sensorimotor learning...Obtaining robot skill via human sensorimotor learning is a viable approach Since the paradigm reverse engineers the control

Human learning…

11

Page 13: Robot skill synthesis through human sensorimotor learning...Obtaining robot skill via human sensorimotor learning is a viable approach Since the paradigm reverse engineers the control

Finally human learns to swap balls

12

Page 14: Robot skill synthesis through human sensorimotor learning...Obtaining robot skill via human sensorimotor learning is a viable approach Since the paradigm reverse engineers the control

B. Smoothing & Linear interpolation A. Original finger joint trajectories

C. Kicks superimposed on to (B) D. Speed-up, then apply (B) and (C)

Index finger

Ring finger Little finger

Middle finger

Offline analysis & improving performance

13

Page 15: Robot skill synthesis through human sensorimotor learning...Obtaining robot skill via human sensorimotor learning is a viable approach Since the paradigm reverse engineers the control

Ball swapping at x2 speed up

Open loop control u=π(time)

14

Page 16: Robot skill synthesis through human sensorimotor learning...Obtaining robot skill via human sensorimotor learning is a viable approach Since the paradigm reverse engineers the control

Color tracking

policy u = f(s,v) u: desired joint angles s: finger joint angles v: position of the balls Learning Technique: Unsupervised Kernel Regression (UKR)

Color tracker developed in house by Ales Ude

In collaboration with Jan Steffen from Bielefeld University

Ball swapping with visual feedback

15

Page 17: Robot skill synthesis through human sensorimotor learning...Obtaining robot skill via human sensorimotor learning is a viable approach Since the paradigm reverse engineers the control

Ball swapping: feedback vs. open loop

Open loop control u=π(time)

Closed loop (feedback) control u=π(angles, ball positions)

16

Page 18: Robot skill synthesis through human sensorimotor learning...Obtaining robot skill via human sensorimotor learning is a viable approach Since the paradigm reverse engineers the control

Extending the paradigm to visual grasping

in collaboration with Brian Moore 17

Page 19: Robot skill synthesis through human sensorimotor learning...Obtaining robot skill via human sensorimotor learning is a viable approach Since the paradigm reverse engineers the control

Visual grasping

in collaboration with Emre Ugur 18

Page 20: Robot skill synthesis through human sensorimotor learning...Obtaining robot skill via human sensorimotor learning is a viable approach Since the paradigm reverse engineers the control

Analysis and preliminary findings

Pre-grasp Pose

Grasp pose

Moore B, Oztop E (2012) Robotic grasping and manipulation through human visuomotor learning. Robotics and Autonomous Systems 60: 441-451 19

Page 21: Robot skill synthesis through human sensorimotor learning...Obtaining robot skill via human sensorimotor learning is a viable approach Since the paradigm reverse engineers the control

Limited success in simulation to robot transfer

The skill obtained in the simulator was satisfactory. Transfer to the real robot had limited success General observation: For efficient grasp skill generation there should be low level tactile controller at the fingers (work in planning

20

Page 22: Robot skill synthesis through human sensorimotor learning...Obtaining robot skill via human sensorimotor learning is a viable approach Since the paradigm reverse engineers the control

Reactive postural control

21

Motor command (u) Human Motion (m)

Robot state (s) Feedback to human sensory system (f)

Feedforward Interface

Feedback Interface

• teach the humanoid robot to counteract external postural perturbations • choices of feedback interface:

o abstract visual feedback o motion of the support polygon o force impulses at human‘s COM

Page 23: Robot skill synthesis through human sensorimotor learning...Obtaining robot skill via human sensorimotor learning is a viable approach Since the paradigm reverse engineers the control

• The key factor for muscle activation during postural control is COM information [Lockhart et al, 2007]

• An interface between robot COM and human COM

„COM force“ iterface

22

Page 24: Robot skill synthesis through human sensorimotor learning...Obtaining robot skill via human sensorimotor learning is a viable approach Since the paradigm reverse engineers the control

„COM force“ iterface

23

Page 25: Robot skill synthesis through human sensorimotor learning...Obtaining robot skill via human sensorimotor learning is a viable approach Since the paradigm reverse engineers the control

24

Reactive postural control • Goals:

o teach the robot to counteract external postural perturbations o on-line learning o gradually transfer control responsibility from human to autonomous robot

controller

submitted for ICRA 2013

Page 26: Robot skill synthesis through human sensorimotor learning...Obtaining robot skill via human sensorimotor learning is a viable approach Since the paradigm reverse engineers the control

robot joint positions

Autonomous Controller

Feedforward interface

Feedback Interface

human joint positions

sensory stimulation

sensory information

Influence Weighting

training data training data

25

Principles

Page 27: Robot skill synthesis through human sensorimotor learning...Obtaining robot skill via human sensorimotor learning is a viable approach Since the paradigm reverse engineers the control

• To teach the robot the demonstrated task we used Locally Weighted Projection Regression [Vijayakumar et al. 2005]

• LWPR offers incremental learning and is among the fastest regression techniques [Nguyen et al. 2011]

• As oppossed to global regression (GPR) which uses entire training set, LWPR partitions the training set into more sections

• Each section is described by a local model:

• Influence of each model is determined by the weights characterized by Gaussian kernel:

• The output prediction for an input x is a sum of contributions from all models weighted by wk:

Tk k ky x θ=

1exp( ( ) ( ))2

Tk k k kw x c D x c= − − −

1

1

( )ˆ

Nk kk

Nkk

w y xy

w=

=

= ∑∑

26

Machine learning

Page 28: Robot skill synthesis through human sensorimotor learning...Obtaining robot skill via human sensorimotor learning is a viable approach Since the paradigm reverse engineers the control

• The influence weighting algorithm calculates the mean square error (MSE) between the human reaction and predicted reaction over a period T during the demonstration.

• The maximum MSE is set as a reference for the weighting criterion:

• The criterion is used to weight the human influence and the influence of the autonomous controller.

• The output that is controlling the robot is calculated by:

• If the MSE fails to improve over N periods the algorithm disconnects the human from the control loop.

• At that point the robot is considered trained.

max

totalMSEC

MSE=

(1 )human predictedy Cy C y= + −

27

Responsibility transfer

Page 29: Robot skill synthesis through human sensorimotor learning...Obtaining robot skill via human sensorimotor learning is a viable approach Since the paradigm reverse engineers the control

28

Responsibility transfer

Page 30: Robot skill synthesis through human sensorimotor learning...Obtaining robot skill via human sensorimotor learning is a viable approach Since the paradigm reverse engineers the control

29

Stability algorithm + manipulation task

• Combine learned skill with an additional arbitrary task

• Stability algorithm can influence the manipulation task but only if necessary.

• Manipulation task must not influence the postural stability of the robot.

• Null space exploration

in collaboration with Leon Žlajpah

Page 31: Robot skill synthesis through human sensorimotor learning...Obtaining robot skill via human sensorimotor learning is a viable approach Since the paradigm reverse engineers the control

Our work so far indicates that

► Obtaining robot skill via human sensorimotor learning is a viable approach ► Since the paradigm reverse engineers the control policy obtained by the

brain, the behaviors obtained should be natural and human-like ► Help built smart prosthetics that can be controlled intuitively via high level

signals or brain machine interface (BMI) ► Shed light on mechanisms of internal models, agency and body image

• Help ameliorate impairments related to these brain mechanisms • Offer new design principles for robot self exploration and learning

Concluding remarks…

30

Page 32: Robot skill synthesis through human sensorimotor learning...Obtaining robot skill via human sensorimotor learning is a viable approach Since the paradigm reverse engineers the control

THANK YOU FOR YOUR ATTENTION!

Collaborators: Erhan Oztop, Ozyegin University, Turkey Luka Peternel, Jožef Stefan Institute, Slovenia Joshua Hale, Cyberdyne, Japan Mitsuo Kawato, ATR, Japan