SE3-Pose-Nets: Structured Deep Dynamics Models for ... · 2020-05-14  · SE3-Pose-Nets Kai-EnLin...

31
SE3-Pose-Nets: Structured Deep Dynamics Models for Visuomotor Planning and Control Presenter: Kai-En Lin 5/14/2020

Transcript of SE3-Pose-Nets: Structured Deep Dynamics Models for ... · 2020-05-14  · SE3-Pose-Nets Kai-EnLin...

Page 1: SE3-Pose-Nets: Structured Deep Dynamics Models for ... · 2020-05-14  · SE3-Pose-Nets Kai-EnLin •We can decompose the problem of modeling scene dynamics into: 1. Modeling scene

SE3-Pose-Nets: Structured Deep Dynamics Models for Visuomotor

Planning and Control

Presenter: Kai-En Lin5/14/2020

Page 2: SE3-Pose-Nets: Structured Deep Dynamics Models for ... · 2020-05-14  · SE3-Pose-Nets Kai-EnLin •We can decompose the problem of modeling scene dynamics into: 1. Modeling scene

SE3-Pose-Nets Kai-En Lin

• Introduction• Related work• SE3-Nets

• Algorithm• Experiments• Conclusion• Future Work

Outline

Page 3: SE3-Pose-Nets: Structured Deep Dynamics Models for ... · 2020-05-14  · SE3-Pose-Nets Kai-EnLin •We can decompose the problem of modeling scene dynamics into: 1. Modeling scene

SE3-Pose-Nets Kai-En Lin

• Problem statement• Observe a scene with a camera• Control the robot to reach a target

Introduction

Camera Scene

Page 4: SE3-Pose-Nets: Structured Deep Dynamics Models for ... · 2020-05-14  · SE3-Pose-Nets Kai-EnLin •We can decompose the problem of modeling scene dynamics into: 1. Modeling scene

SE3-Pose-Nets Kai-En Lin

• Traditional approach• Data-associate the observed scene to target• E.g. tracking different parts of the robot

• Model the effect of applied actions to changes to the scene• E.g. knowing what happens after the action

• Deep Learning approach• Tries to learn similar models• Lacks the ability to associate objects/parts across

scenes

Introduction

Page 5: SE3-Pose-Nets: Structured Deep Dynamics Models for ... · 2020-05-14  · SE3-Pose-Nets Kai-EnLin •We can decompose the problem of modeling scene dynamics into: 1. Modeling scene

SE3-Pose-Nets Kai-En Lin

• Goal:• Devise a learning-based algorithm that allows:•Data-association•Modeling of the object dynamics•Correct prediction and control from the model

Introduction

Page 6: SE3-Pose-Nets: Structured Deep Dynamics Models for ... · 2020-05-14  · SE3-Pose-Nets Kai-EnLin •We can decompose the problem of modeling scene dynamics into: 1. Modeling scene

SE3-Pose-Nets Kai-En Lin

• SE3-Nets• Segment object parts• Predict SE(3) transformation for each part to target• No explicit modeling of data association

Related Work

Page 7: SE3-Pose-Nets: Structured Deep Dynamics Models for ... · 2020-05-14  · SE3-Pose-Nets Kai-EnLin •We can decompose the problem of modeling scene dynamics into: 1. Modeling scene

SE3-Pose-Nets Kai-En Lin

• Given• an observation x" of the scene (depth map / point

cloud)• applied actions u"

• Predict• the transformed output point cloud x"$%

Algorithm

Page 8: SE3-Pose-Nets: Structured Deep Dynamics Models for ... · 2020-05-14  · SE3-Pose-Nets Kai-EnLin •We can decompose the problem of modeling scene dynamics into: 1. Modeling scene

SE3-Pose-Nets Kai-En Lin

• We can decompose the problem of modeling scene dynamics into:1. Modeling scene structure2. Modeling the dynamics of individual parts3. Combining local pose changes to model the dynamics

of the entire scene

• With deep learning:1. An encoder to distinguish individual parts and predict

a 6D pose for each of them2. A pose transition network to model the dynamics in

the pose space. Takes source pose and action to predict the change in poses

3. A transform layer to apply SE(3) transforms to input point cloud using predicted pose deltas

Algorithm

Page 9: SE3-Pose-Nets: Structured Deep Dynamics Models for ... · 2020-05-14  · SE3-Pose-Nets Kai-EnLin •We can decompose the problem of modeling scene dynamics into: 1. Modeling scene

SE3-Pose-Nets Kai-En Lin

• We can decompose the problem of modeling scene dynamics into:1. Modeling scene structure2. Modeling the dynamics of individual parts3. Combining local pose changes to model the dynamics

of the entire scene

• With deep learning:1. An encoder to distinguish individual parts and predict

a 6D pose for each of them2. A pose transition network to model the dynamics in

the pose space. Takes source pose and action to predict the change in poses

3. A transform layer to apply SE(3) transforms to input point cloud using predicted pose deltas

Algorithm

Page 10: SE3-Pose-Nets: Structured Deep Dynamics Models for ... · 2020-05-14  · SE3-Pose-Nets Kai-EnLin •We can decompose the problem of modeling scene dynamics into: 1. Modeling scene

SE3-Pose-Nets Kai-En Lin

• An encoder takes the input 3D point cloud x" and generates the following:• Masks for the moving parts (m")• 6D pose per segmented part (p")• 3D position•Orientation as 3-parameter axis-angle vector

Modeling Scene Structure

Page 11: SE3-Pose-Nets: Structured Deep Dynamics Models for ... · 2020-05-14  · SE3-Pose-Nets Kai-EnLin •We can decompose the problem of modeling scene dynamics into: 1. Modeling scene

SE3-Pose-Nets Kai-En Lin

Modeling Scene Structure

Depth Map

Segmentation Map

Estimated Poses

Page 12: SE3-Pose-Nets: Structured Deep Dynamics Models for ... · 2020-05-14  · SE3-Pose-Nets Kai-EnLin •We can decompose the problem of modeling scene dynamics into: 1. Modeling scene

SE3-Pose-Nets Kai-En Lin

• We can decompose the problem of modeling scene dynamics into:1. Modeling scene structure2. Modeling the dynamics of individual parts3. Combining local pose changes to model the dynamics

of the entire scene

• With deep learning:1. An encoder to distinguish individual parts and predict

a 6D pose for each of them2. A pose transition network to model the dynamics in

the pose space. Takes source pose and action to predict the change in poses

3. A transform layer to apply SE(3) transforms to input point cloud using predicted pose deltas

Algorithm

Page 13: SE3-Pose-Nets: Structured Deep Dynamics Models for ... · 2020-05-14  · SE3-Pose-Nets Kai-EnLin •We can decompose the problem of modeling scene dynamics into: 1. Modeling scene

SE3-Pose-Nets Kai-En Lin

• A fully-connected pose transition network takes the predicted poses from the encoder (p") and applied actions (u") as input and predicts:• The change in pose (∆p") for all 𝐾 segmented parts

(6D vector)

Modeling Part Dynamics

Page 14: SE3-Pose-Nets: Structured Deep Dynamics Models for ... · 2020-05-14  · SE3-Pose-Nets Kai-EnLin •We can decompose the problem of modeling scene dynamics into: 1. Modeling scene

SE3-Pose-Nets Kai-En Lin

Modeling Part Dynamics

Page 15: SE3-Pose-Nets: Structured Deep Dynamics Models for ... · 2020-05-14  · SE3-Pose-Nets Kai-EnLin •We can decompose the problem of modeling scene dynamics into: 1. Modeling scene

SE3-Pose-Nets Kai-En Lin

• We can decompose the problem of modeling scene dynamics into:1. Modeling scene structure2. Modeling the dynamics of individual parts3. Combining local pose changes to model the dynamics

of the entire scene

• With deep learning:1. An encoder to distinguish individual parts and predict

a 6D pose for each of them2. A pose transition network to model the dynamics in

the pose space. Takes source pose and action to predict the change in poses

3. A transform layer to apply SE(3) transforms to input point cloud using predicted pose deltas

Algorithm

Page 16: SE3-Pose-Nets: Structured Deep Dynamics Models for ... · 2020-05-14  · SE3-Pose-Nets Kai-EnLin •We can decompose the problem of modeling scene dynamics into: 1. Modeling scene

SE3-Pose-Nets Kai-En Lin

• No trainable parameters• Given point cloud (x"), the predicted scene

segmentation (m") and the change in poses (∆p"), calculates the point cloud in the next frame (x"$%):

𝑥+,$%- = /𝑚,

1- 𝑅,1𝑥,- + 𝑇,1 ,

6

17%where 𝑅,1 is rotation, 𝑇,1 translation

Predicting Scene Dynamics

Page 17: SE3-Pose-Nets: Structured Deep Dynamics Models for ... · 2020-05-14  · SE3-Pose-Nets Kai-EnLin •We can decompose the problem of modeling scene dynamics into: 1. Modeling scene

SE3-Pose-Nets Kai-En Lin

Algorithm1.

2.

3.

Page 18: SE3-Pose-Nets: Structured Deep Dynamics Models for ... · 2020-05-14  · SE3-Pose-Nets Kai-EnLin •We can decompose the problem of modeling scene dynamics into: 1. Modeling scene

SE3-Pose-Nets Kai-En Lin

• Supervision• Point-wise data associations across a pair of point

clouds (x,, x,$%)• Related by an action (u,)

Training

Page 19: SE3-Pose-Nets: Structured Deep Dynamics Models for ... · 2020-05-14  · SE3-Pose-Nets Kai-EnLin •We can decompose the problem of modeling scene dynamics into: 1. Modeling scene

SE3-Pose-Nets Kai-En Lin

• Total loss 𝐿 = 𝐿9 + 𝛾𝐿;• 3D Loss 𝐿9• Pose consistency loss 𝐿;• 𝛾 = 10

Training

Page 20: SE3-Pose-Nets: Structured Deep Dynamics Models for ... · 2020-05-14  · SE3-Pose-Nets Kai-EnLin •We can decompose the problem of modeling scene dynamics into: 1. Modeling scene

SE3-Pose-Nets Kai-En Lin

Training

Page 21: SE3-Pose-Nets: Structured Deep Dynamics Models for ... · 2020-05-14  · SE3-Pose-Nets Kai-EnLin •We can decompose the problem of modeling scene dynamics into: 1. Modeling scene

SE3-Pose-Nets Kai-En Lin

• 3D Loss 𝐿9

𝐿9 =1𝑁/

𝑥+,$%? − 𝑥A,$%? B

𝛼𝑓E? + 𝛽,

GH

?7%where 𝐻𝑊 is the number of points, 𝛼 = 0.5, 𝛽 = 1𝑒 − 3,𝑓E? = 𝑥A,$%? − 𝑥,?, scaling factor to make the loss scale-invariant

Training

Page 22: SE3-Pose-Nets: Structured Deep Dynamics Models for ... · 2020-05-14  · SE3-Pose-Nets Kai-EnLin •We can decompose the problem of modeling scene dynamics into: 1. Modeling scene

SE3-Pose-Nets Kai-En Lin

• Pose consistency loss 𝐿;

𝐿; =1𝐼/ �̂�,$%? − 𝑝,$%? BR

?7%

,

where p+,$% = p, ⊕ ∆p, , the expected pose at time t+1

Training

Page 23: SE3-Pose-Nets: Structured Deep Dynamics Models for ... · 2020-05-14  · SE3-Pose-Nets Kai-EnLin •We can decompose the problem of modeling scene dynamics into: 1. Modeling scene

SE3-Pose-Nets Kai-En Lin

• Visual servoing• Given the current image and the target image,

generate controls to reach the target• SE3-Pose-Nets solve this by using the latent pose

space to data-associate the observations and minimizing the error between initial pose pT and the final pose pU.

Closed-Loop Visuomotor Control Using SE3-Pose-Nets

Page 24: SE3-Pose-Nets: Structured Deep Dynamics Models for ... · 2020-05-14  · SE3-Pose-Nets Kai-EnLin •We can decompose the problem of modeling scene dynamics into: 1. Modeling scene

SE3-Pose-Nets Kai-En Lin

Closed-Loop Visuomotor Control Using SE3-Pose-Nets

Page 25: SE3-Pose-Nets: Structured Deep Dynamics Models for ... · 2020-05-14  · SE3-Pose-Nets Kai-EnLin •We can decompose the problem of modeling scene dynamics into: 1. Modeling scene

SE3-Pose-Nets Kai-En Lin

Closed-Loop Visuomotor Control Using SE3-Pose-Nets

Page 26: SE3-Pose-Nets: Structured Deep Dynamics Models for ... · 2020-05-14  · SE3-Pose-Nets Kai-EnLin •We can decompose the problem of modeling scene dynamics into: 1. Modeling scene

SE3-Pose-Nets Kai-En Lin

• SE3-Pose-Nets performs worse than other models when predicting scene dynamics• The pose space might make the training problem

harder• Constraint on pose consistency is different from the

prediction problem

Experiments

Page 27: SE3-Pose-Nets: Structured Deep Dynamics Models for ... · 2020-05-14  · SE3-Pose-Nets Kai-EnLin •We can decompose the problem of modeling scene dynamics into: 1. Modeling scene

SE3-Pose-Nets Kai-En Lin

Experiments

Page 28: SE3-Pose-Nets: Structured Deep Dynamics Models for ... · 2020-05-14  · SE3-Pose-Nets Kai-EnLin •We can decompose the problem of modeling scene dynamics into: 1. Modeling scene

SE3-Pose-Nets Kai-En Lin

Experiments

Page 29: SE3-Pose-Nets: Structured Deep Dynamics Models for ... · 2020-05-14  · SE3-Pose-Nets Kai-EnLin •We can decompose the problem of modeling scene dynamics into: 1. Modeling scene

SE3-Pose-Nets Kai-En Lin

Experiments

Page 30: SE3-Pose-Nets: Structured Deep Dynamics Models for ... · 2020-05-14  · SE3-Pose-Nets Kai-EnLin •We can decompose the problem of modeling scene dynamics into: 1. Modeling scene

SE3-Pose-Nets Kai-En Lin

• SE3-Pose-Nets is an end-to-end framework for learning predictive models that enable control of objects in a scene

• It learns a consistent pose space for each individual part

• Does not require external data association• The network enables computation of controls in the

low dimensional pose space

Conclusion

Page 31: SE3-Pose-Nets: Structured Deep Dynamics Models for ... · 2020-05-14  · SE3-Pose-Nets Kai-EnLin •We can decompose the problem of modeling scene dynamics into: 1. Modeling scene

SE3-Pose-Nets Kai-En Lin

• SE3-Pose-Nets has difficulties handling joints further down the kinematic chain

• Extending the system to interact with and manipulate external objects

• Long-term planning to utilize the latent pose space

Future Work