Robot control with Deep Reinforcement Learning · Ө 1 Ө 2 Ө 3 Ө n X Y Z Ө X Ө Y Ө Z Forward...

Robot controlwith Deep Reinforcement Learning

Hadi Beik-Mohammadi

INTELLIGENT ROBOTICS SEMINAR TALK, DECEMBER 2017

• Inverse and Forward Kinematic

• How to Learn a behavior

• Methods

Inverse Recurrent Model

Deep Deterministic Policy Gradient

• Conclusion

INTELLIGENT ROBOTICS SEMINAR TALK, DECEMBER 2017 1

End Effector

Joint 1

Joint 0

Joint 0Join

0 300X (CM)

Joint SpaceCartesian Space

Target

End Effector

Target

TargetTarget

Target

Forward Kinematic

Forward and Inverse Kinematic

.Inverse Kinematic

Joint Space Cartesian Space

How to build agents that learn behaviors in a dynamic world?

The brain evolved, not to think or feel, but to control movement

Daniel Wolpert, nice TED talk

Learning a behavior:

Learning to map sequences of observations to actions, for a particular goal

INTELLIGENT ROBOTICS SEMINAR TALK, DECEMBER 2017 [3] 4

What supervision does an agent need to learn purposeful

behaviors in dynamic environments?

• Rewards: sparse feedback from the environment whether the

desired behavior is achieved

• Demonstrations

• Specifications/Attributes of good behavior

Inverse Recurrent Model (IRM)[1]• Control Snake-Like Many Joint Robot Arms

• BPTT on recurrent forward models

• Recurrent Neural Networks LSTM

• Offline

Deep Deterministic Policy Gradient (DDPG)[2]• Deep Reinforcement Learning Method

• Actor Critic Network

• Continuous Action Domain

• Model Free

• Online

Inverse Recurrent Model (IRM)

Deep Deterministic Policy Gradient (DDPG)

CRITIC NETWORK

ACTOR NETWORK

ENVIRONMENT

ACTION

State-Value Function

Policy Function

Deep Deterministic Policy Gradient (DDPG)

https://youtu.be/tJBIqkC1wWM

Rewarding

End Effector

Joint 1

Joint 0

Dist (t)

Reward 1 = Gaussian(Dist(t))

https://www.sfu.ca/sonic-studio/handbook/Graphics/Gaussian.gif

Reward 2 = Dist(t-1) – Dist(t)

2 DOF Manipulator Actor Critic maps during Learning

• Operate over continuous action spaces

• Algorithm can learn policies end-to-end

• Model-Free

• No Proof for learning

• No Guarantee for results

• Requires a large number of training episodes to find solutions

References

• [1] Sebastian Otte , Adrian Zwiener , and Martin V. Butz, Inherently Constraint-Aware Control of Many-Joint Robot Arms with Inverse Recurrent Models

• [2] Continuous control with deep reinforcement learning, Timothy P. Lillicrap, Jonathan J. Hunt, Alexander Pritzel, Nicolas Heess, Tom Erez, Yuval Tassa, David Silver, Daan Wierstra

• [3] Deep Reinforcement Learning and Control, Spring 2017, CMU 10703

INTELLIGENT ROBOTICS SEMINAR TALK, DECEMBER 2017INTELLIGENT ROBOTICS SEMINAR TALK, DECEMBER 2017 14

Robot control with Deep Reinforcement Learning · Ө 1 Ө 2 Ө 3 Ө n X Y Z Ө X Ө Y Ө Z Forward...

Documents

Transcript of Robot control with Deep Reinforcement Learning · Ө 1 Ө 2 Ө 3 Ө n X Y Z Ө X Ө Y Ө Z Forward...

I Fuzzy Inverse Kinematic Mapping: Rule and …€¦ · i Fuzzy Inverse Kinematic Mapping: Rule Generation, Efficiency, and Implementation Yangsheng Xu Michael C. Nechgba CMU-RI-TR-93-02

Higher satisfaction after total knee arthroplasty using ... · Higher satisfaction after total knee arthroplasty using restricted inverse kinematic alignment compared to adjusted

MA ATLLABB sCCo oddees Mfforr fKKiinneemmaattiicc aanndd ...shodhganga.inflibnet.ac.in/bitstream/10603/8593/23... · inverse kinematic model is presented, thirdly MATLAB code for

Kinematic KKiinneemmaattiicc Kinematic Modeling ...

Kinematic Singularities Peter Donelan of Robot Manipulatorscdn.intechweb.org/pdfs/11035.pdf · Kinematic singularities of robot manipulators are ... and the analysis of inverse kinematics

Closed-Form Inverse Kinematic Joint Solution for Humanoid ...edge.rit.edu/edge/P13203/public/WorkingDocuments... · Closed-Form Inverse Kinematic Joint Solution for Humanoid Robots

Kinematic Modeling of Wheeled Mobile Robots · Mobile Robot, Kinematics, Wheeled Robot, Forward Solution, Inverse Solution, Mobility Charscteristics, ... wheeled mobile robots, we

Inverse kinematic of European Robotic Arm based on a new … · 2021. 1. 14. · like industrial robots. The inverse kinematic problem of a 6-axis robot manipulator, consist of two

17 i5 ijaet0511556 intelligent inverse kinematic copyright ijaet

Synthetic Echocardiographic Image Sequences for Cardiac … · 2011. 6. 15. · Synthetic Echocardiographic Image Sequences for Cardiac Inverse Electro-Kinematic Learning Adityo Prakosa

ACCESSOIRES MOTORISATION CIRCUIT DE … · Solo Adventure Ө Corsair Black Devil Ө Top 80 Prix Euro TTC 197,40 197,40 197,40 197,40 Moteur de démarreur universel convenant pour

Inverse Kinematic Modelling of a 6-DOF(6-CRS) …Inverse Kinematic Modelling of a 6-DOF(6-CRS) Parallel Spatial Manipulator 666-2 planes making it more complex to control and bulky

Solving Inverse Kinematic 2d

A Novel Inverse Kinematic Approach for Delta Parallel Robotdelta parallel robot, inverse kinematics analysis, experimental study, results, discussion of the results, future work and

“Furry friends” Пушистые друзья «Sounds» [Ө] [ә][ә] [t] [ v] [][]

What is the Inverse Kinematics Kinematic analysis is one of the first steps in the design of most industrial robots. Kinematic analysis allows the designer.

Inverse Kinematics 2 - University of California, San Diego · Forward & Inverse Kinematics The forward kinematic function computes the world space end effector DOFs from the joint

k u c e h h ө gү g ^ө Кыргыз ... · - Ч. Айтматов атындагы Кыргыз Республикасынын мамлекеттик улуттук орус драма

Inverse Kinematic Control of Humanoids Under Joint Constraintskoasas.kaist.ac.kr/bitstream/10203/174910/1/000316461300002.pdf · designed to have structural similarities to humans,

Modelling and Control of Inverse Dynamics for a 5-DOF Parallel Kinematic Polishing Machine