Occlusion-Aware Multi-View Reconstruction of Articulated Objects for Manipulation
Xiaoxia Huang
Committee members:
Dr. Stanley Birchfield (Advisor)
Dr. Ian Walker
Dr. John Gowdy
Dr. Damon Woodard
Motivation
Domestic robots in many applications require manipulation of articulated objects:
- Tools: scissors, shears, pliers, stapler
- Furniture: cabinets, drawers, doors, windows, fridge
- Devices: laptop, cell phone
- Toys: truck, puppet, train, tricycle
Important problem: Learning kinematic models
Approach
Part 1: Reconstruct 3D articulated model using multiple perspective views
Part 2: Manipulate articulated objects – even occluded parts
Part 3: Apply RGBD sensor to improve performance
Related Work
[ Yan et al., PAMI 2008 ]
[ Katz et al., ISER 2010 ]
[ Ross et al., IJCV 2010 ]
[ Sturm et al., IJCAI 2009, IROS 2010 ]
[ Sturm et al., ICRA 2010 ]
Our Approach
Recovers kinematic structure from images
Features:
- Uses a single camera
- Produces dense 3D models
- Recovers both prismatic and revolute joints
- Handles multiple joints
- Provides occlusion awareness
Approach – Part 1: Reconstruct 3D articulated model using multiple perspective views
Procrustes-Lo-RANSAC (PLR)

[Pipeline diagram: image sets {I1}, {I2} → 3D reconstruction → point clouds P, Q → Alignment / Segmentation (R, t) → Axis direction estimation (u, θ) and Axis point estimation (w) → 3D joint estimation; a parallel 2D joint estimation path also yields w.]
Camera Calibration

Camera model: an object point X projects to an image point x' = K [R | t] X
- intrinsic parameters K
- extrinsic parameters R, t
Input: images {I} → SIFT features; Output: K, R, t
Bundler: http://phototour.cs.washington.edu/bundler/
SIFT Features

Input images → SIFT features [ Lowe, IJCV 2004 ] → matched SIFT features
Example: 658 and 651 keypoints yield 24 matches
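Raw SIFT matching keeps a correspondence only when its nearest descriptor is much closer than the second nearest (Lowe's ratio test), which is why far fewer matches survive than keypoints exist. A minimal pure-Python sketch (toy 2D descriptors; the 0.8 threshold follows Lowe, IJCV 2004):

```python
def ratio_test_matches(desc1, desc2, ratio=0.8):
    """Match each descriptor in desc1 to desc2 using Lowe's ratio test."""
    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    matches = []
    for i, d1 in enumerate(desc1):
        dists = sorted((dist2(d1, d2), j) for j, d2 in enumerate(desc2))
        (best, j), (second, _) = dists[0], dists[1]
        # Accept only if the best match is clearly better than the runner-up
        if best < (ratio ** 2) * second:
            matches.append((i, j))
    return matches

desc1 = [[0.0, 1.0], [5.0, 5.0]]
desc2 = [[0.1, 1.0], [9.0, 9.0], [5.0, 5.1]]
print(ratio_test_matches(desc1, desc2))  # -> [(0, 0), (1, 2)]
```

Ambiguous keypoints (two near-identical candidates) are rejected by the same test, which is what keeps the match set clean enough for calibration.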
Camera Calibration

Camera model: x' = K [R | t] X
- intrinsic parameters K (initialized from the EXIF tag or a default value)
- extrinsic parameters R, t
Structure from motion: minimize the reprojection error to recover K, R, t
Input: images {I} → SIFT features; Output: K, R, t
Bundler: http://phototour.cs.washington.edu/bundler/
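The pinhole projection the calibration recovers can be sketched numerically; a minimal pure-Python example (the K, R, t values below are made up for illustration):

```python
def project(K, R, t, X):
    """Project a 3D point X into the image: x' = K (R X + t)."""
    # Transform into the camera frame
    Xc = [sum(R[i][k] * X[k] for k in range(3)) + t[i] for i in range(3)]
    # Apply intrinsics, then divide by depth
    x = [sum(K[i][k] * Xc[k] for k in range(3)) for i in range(3)]
    return (x[0] / x[2], x[1] / x[2])

K = [[500.0, 0.0, 320.0],   # fx, skew, cx
     [0.0, 500.0, 240.0],   # fy, cy
     [0.0, 0.0, 1.0]]
R = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]  # identity rotation
t = [0.0, 0.0, 0.0]
print(project(K, R, t, [0.0, 0.0, 2.0]))  # optical-axis point -> (320.0, 240.0)
```

Bundle adjustment then minimizes the squared distance between such projections and the observed SIFT locations over all cameras and points.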
3D Model Reconstruction

Patch-based multi-view stereo:
- Expand each patch to neighboring empty image cells
- Do not expand across a depth discontinuity
[Figure: image projection of a patch of the object into views I1 and I2]
PMVS: http://grail.cs.washington.edu/software/pmvs/
Alignment / Segmentation

[Flowchart: extract ASIFT features F1, F2 from image sets {I1}, {I2}; match features; project matches into the images to obtain 3D correspondences between point clouds P and Q; estimate R, t, σ with Procrustes + Lo-RANSAC; find closest points and update the estimate; if good, segment the clouds into S1 and S2; otherwise repeat.]
Procrustes Analysis

Procrustes analysis aligns two point sets with a shape-preserving similarity transformation (rotation, translation, and uniform scale).
[Illustration: the Greek myth of Procrustes; frames {A}, {B}]
http://www.mythweb.com/today/today07.html
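In 2D, the least-squares similarity transform between corresponding point sets has a closed form after centering; a pure-Python sketch of Procrustes alignment over exact correspondences (the full method additionally wraps this estimate in Lo-RANSAC to reject outliers):

```python
import math

def procrustes_2d(P, Q):
    """Find s, theta, (tx, ty) minimizing ||s R p + t - q|| over 2D pairs."""
    n = len(P)
    pcx = sum(p[0] for p in P) / n; pcy = sum(p[1] for p in P) / n
    qcx = sum(q[0] for q in Q) / n; qcy = sum(q[1] for q in Q) / n
    # Cross-terms of the centered sets give the optimal rotation angle
    a = b = pp = 0.0
    for (px, py), (qx, qy) in zip(P, Q):
        px -= pcx; py -= pcy; qx -= qcx; qy -= qcy
        a += px * qx + py * qy          # cosine component
        b += px * qy - py * qx          # sine component
        pp += px * px + py * py
    theta = math.atan2(b, a)
    s = math.hypot(a, b) / pp           # optimal scale
    c, si = math.cos(theta), math.sin(theta)
    tx = qcx - s * (c * pcx - si * pcy)
    ty = qcy - s * (si * pcx + c * pcy)
    return s, theta, (tx, ty)

P = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
Q = [(1.0, 1.0), (1.0, 3.0), (-1.0, 1.0)]  # P scaled by 2, rotated 90 deg, shifted (1, 1)
print(procrustes_2d(P, Q))
```

The 3D case follows the same structure with the rotation obtained from an SVD instead of a single angle.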
Object Model in 2D

[Figure: Link 0 and Link 1 with frames {A}, {B}, shown in Configuration 1 and Configuration 2]
Transformation of Link 0 between the two configurations: align Link 0.

Align Link 0

[Figure: with Link 0 aligned (frame {A}), Link 1 still differs between Configuration 1 and Configuration 2]
Transformation of Link 1 between the two configurations: after aligning Link 0, the residual motion of Link 1 reveals the joint.
3D Joint

The joint is classified using R.
- Prismatic joint: axis direction u = t / |t|; axis point w = mean({pi})
- Revolute joint: axis direction u; axis point w
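The classification step can be sketched as follows: recover the rotation angle of R from its trace; if it is near zero, the relative motion is essentially a pure translation, so the joint is prismatic with axis u = t / |t|. A pure-Python sketch (the 5° threshold is an assumption for illustration):

```python
import math

def classify_joint(R, t, angle_thresh_deg=5.0):
    """Classify a joint as prismatic or revolute from the relative motion (R, t)."""
    trace = R[0][0] + R[1][1] + R[2][2]
    # Rotation angle from the trace; clamp for numerical safety
    theta = math.acos(max(-1.0, min(1.0, (trace - 1.0) / 2.0)))
    if math.degrees(theta) < angle_thresh_deg:
        norm = math.sqrt(sum(x * x for x in t))
        u = [x / norm for x in t]       # prismatic axis = translation direction
        return "prismatic", u
    return "revolute", None

I3 = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
print(classify_joint(I3, [0.0, 0.0, 3.0]))   # pure translation -> prismatic
```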
Revolute Joint Direction

Axis-angle representation (u, θ) of the rotation R
(Singularity: θ = 0° or θ = 180°)
Two methods:
- Direct computation
- Eigenvalues / eigenvectors
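The direct computation extracts the axis from the skew-symmetric part of R, valid away from the θ = 0°/180° singularities; the eigenvector of R with eigenvalue 1 gives the same axis. A pure-Python sketch of the direct form:

```python
import math

def rotation_axis_angle(R):
    """Axis-angle (u, theta) of a rotation matrix R, away from 0/180 degrees."""
    theta = math.acos(max(-1.0, min(1.0, (R[0][0] + R[1][1] + R[2][2] - 1.0) / 2.0)))
    s = 2.0 * math.sin(theta)
    # Axis from the skew-symmetric part of R
    u = [(R[2][1] - R[1][2]) / s,
         (R[0][2] - R[2][0]) / s,
         (R[1][0] - R[0][1]) / s]
    return u, theta

Rz90 = [[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]
print(rotation_axis_angle(Rz90))  # axis (0, 0, 1), angle pi/2
```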
Approach – Part 2: Manipulate articulated objects – even occluded parts
Part 2: Object Manipulation

Robotic arm + eye-in-hand (camera on the end effector)
3D articulated model + scale estimation (σ) → object registration → manipulation
[Figure: 3D articulated model; robotic arm manipulating the object]
Hand-eye Calibration

- Calibration object: chessboard
- The robotic arm with a camera moves from P to Q
- A is the motion of the camera, B is the corresponding motion of the robot hand

Let X be the fixed transformation from the robot hand to the camera.
Then AX = XB.
Since A and B are measured, the only unknown is X.
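In the plane the AX = XB constraint can be solved in closed form, which makes its structure easy to see. A 2D sketch, not the full 3D solution: as in the classic Tsai–Lenz decoupling, assume the rotation of X is recovered first, after which the translation follows from the linear part (R_A − I) t_X = R_X t_B − t_A:

```python
import math

def solve_handeye_translation(theta_a, t_a, theta_x, t_b):
    """Given camera motion A = (theta_a, t_a), hand motion translation t_b,
    and the already-known rotation theta_x of the hand-eye transform X,
    recover X's translation from (R_A - I) t_x = R_X t_b - t_a (2D sketch)."""
    ca, sa = math.cos(theta_a), math.sin(theta_a)
    cx, sx = math.cos(theta_x), math.sin(theta_x)
    # Right-hand side r = R_X t_b - t_a
    r = [cx * t_b[0] - sx * t_b[1] - t_a[0],
         sx * t_b[0] + cx * t_b[1] - t_a[1]]
    # Solve the 2x2 system (R_A - I) t_x = r by Cramer's rule
    m = [[ca - 1.0, -sa], [sa, ca - 1.0]]
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [(r[0] * m[1][1] - m[0][1] * r[1]) / det,
            (m[0][0] * r[1] - r[0] * m[1][0]) / det]

# Motions constructed from a ground-truth X with rotation 90 deg, translation (1, 2)
print(solve_handeye_translation(math.pi / 2, (3.0, 2.0), math.pi / 2, (1.0, 0.0)))
```

Note the system is only well-conditioned when the rotation of A is far from zero, which is why hand-eye calibration needs motions with substantial rotation.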
Object Pose Estimation

- Place the object in the camera field of view
- Take an image of the object at some viewpoint
- Detect 2D–3D correspondences
- Estimate the object pose with the POSIT algorithm

POSIT:
- does not require the correspondence points to be coplanar
- iteratively approximates the object pose using POS (Pose from Orthography and Scaling)
- POS approximates the perspective projection by a scaled orthographic projection
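The scaled-orthographic approximation that POS relies on replaces division by each point's own depth with division by a common reference depth; a toy comparison (the focal length, point, and reference depth are made-up values):

```python
def perspective(f, X):
    """Exact perspective projection: u = f * x / z."""
    return (f * X[0] / X[2], f * X[1] / X[2])

def scaled_orthographic(f, X, z0):
    """Scaled orthographic projection: divide by a common reference depth z0."""
    return (f * X[0] / z0, f * X[1] / z0)

f, z0 = 500.0, 10.0
X = (1.0, 0.5, 10.2)                  # depth close to the reference plane
print(perspective(f, X))              # exact projection
print(scaled_orthographic(f, X, z0))  # close when depth variation << z0
```

POSIT's iterations refine this approximation until it converges to the perspective pose.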
Experimental Results (1)

Camera calibration
20 different views of a chessboard (7×9 squares of 10×10 mm)
Experimental Results (1)

Corner extraction
Legend: + marks an extracted corner (accurate to 0.1 pixel); the box shows the corner finder window (5×5 mm)
Experimental Results (1)

Intrinsic parameters
Calibration results for a Logitech QuickCam Pro 5000
Experimental Results (1)

Reprojected corners
Legend: + marks an extracted corner; each corresponding reprojected corner is shown with its reprojection error and direction
Experimental Results (2)

Hand-eye calibration
Position (X, Y, Z) (mm) and rotation angles (Rx, Ry, Rz) (rad) from the robot system
Average reprojection error: 1.1 pixels
Experimental Results (3)
Object pose estimation
Ten images taken at arbitrary viewpoints
The corresponding images with the highest number of matched features
Experimental Results (4)

Object manipulation
Frames of a video sequence of the PUMA manipulating a toy truck
Once the hand-eye and object-pose transformations are obtained, the hand pose is provided by the robot kinematic system. Given any particular point on the object, the robot can move its end effector to that position by composing these transformations.
Approach – Part 3: Apply RGBD sensor to improve performance
RGBD Sensor

Motion sensing systems: Microsoft Kinect, Asus Xtion
Microsoft Kinect (Nov. 2010), built for the Xbox 360 video game console:
- CMOS color camera (8-bit VGA resolution, 30 Hz)
- Infrared (IR) laser projector (structured lighting)
- CMOS IR camera (11-bit VGA resolution)
- Multi-array microphone
[Figure: Microsoft Kinect sensor; projected IR light pattern]
KinectFusion – Depth map

Standard structured-lighting model: the projector L casts a light spot P on the object surface, and the camera O observes the spot. Given the baseline b and the angles α and β, solve the triangle (O, P, L) for the depth d.
Output: depth image (VGA, 11-bit)
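From the triangle (O, P, L) with baseline b and base angles α (at O) and β (at L), the law of sines gives the depth as the perpendicular distance from the baseline to P; a pure-Python sketch:

```python
import math

def depth_from_triangulation(b, alpha, beta):
    """Perpendicular distance from the baseline to the light spot P,
    given baseline b and the base angles alpha (at O) and beta (at L)."""
    # Law of sines gives side OP = b * sin(beta) / sin(alpha + beta);
    # its component perpendicular to the baseline is the depth.
    return b * math.sin(alpha) * math.sin(beta) / math.sin(alpha + beta)

# Equilateral sanity check: 60-degree base angles, unit baseline
print(depth_from_triangulation(1.0, math.radians(60), math.radians(60)))
```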
KinectFusion – Vertex and normal map

- The vertex map is a 3D point cloud: each 2D depth pixel is back-projected to a 3D vertex using its depth and the intrinsic matrix of the IR camera
- The normal vector indicates the direction of the surface at a vertex: the cross product of vectors to neighboring points
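The back-projection and the neighbor cross product can be sketched as follows (the intrinsics fx, fy, cx, cy are made-up values standing in for the IR camera's calibration):

```python
def backproject(u, v, d, fx, fy, cx, cy):
    """Depth pixel (u, v) with depth d -> 3D vertex in the camera frame."""
    return ((u - cx) * d / fx, (v - cy) * d / fy, d)

def normal(v0, vu, vv):
    """Surface normal at vertex v0 from neighbors in the +u and +v directions."""
    a = [vu[i] - v0[i] for i in range(3)]
    b = [vv[i] - v0[i] for i in range(3)]
    n = [a[1] * b[2] - a[2] * b[1],       # cross product a x b
         a[2] * b[0] - a[0] * b[2],
         a[0] * b[1] - a[1] * b[0]]
    nrm = sum(x * x for x in n) ** 0.5
    return [x / nrm for x in n]

fx = fy = 500.0; cx, cy = 320.0, 240.0
v0 = backproject(320, 240, 2.0, fx, fy, cx, cy)   # flat fronto-parallel patch
vu = backproject(321, 240, 2.0, fx, fy, cx, cy)
vv = backproject(320, 241, 2.0, fx, fy, cx, cy)
print(normal(v0, vu, vv))   # normal along the optical axis
```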
KinectFusion – Camera tracking

- Small motion between consecutive positions of the Kinect
- Find correspondences using projective data association
- Estimate the camera pose Ti by applying the ICP algorithm to the vertex and normal maps
[Figure: tracking the camera pose]
KinectFusion – Volumetric integration

Volumetric representation (3×3×3 m, 512 voxels per axis)
Truncated signed distance function:
- tsdf(g) ∈ (0, 1] : outside the surface
- tsdf(g) = 0 : on the surface
- tsdf(g) ∈ [-1, 0) : inside the surface
[Figure: TSDF volume grid; the surface lies at the zero-crossing between outside and inside]
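Each depth frame updates a voxel's truncated signed distance with a weighted running average; a minimal sketch of one voxel update (the truncation distance mu is an assumed value):

```python
def tsdf_update(tsdf, weight, voxel_depth, measured_depth, mu=0.03):
    """Fuse one depth measurement into a voxel.
    voxel_depth: the voxel's depth along the camera ray;
    measured_depth: the surface depth observed at the corresponding pixel."""
    sdf = measured_depth - voxel_depth          # + outside, - inside
    if sdf < -mu:
        return tsdf, weight                     # far behind the surface: skip
    d = min(1.0, sdf / mu)                      # truncate to [-1, 1]
    new_w = weight + 1.0
    return (tsdf * weight + d) / new_w, new_w   # weighted running average

t, w = tsdf_update(0.0, 0.0, voxel_depth=1.00, measured_depth=1.015)
print(t, w)   # voxel 1.5 cm outside the surface -> tsdf = 0.5, weight = 1
```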
KinectFusion – Surface rendering

Ray casting:
- Cast a ray through the focal point for each pixel
- Traverse the voxels along the ray
- Find the first surface by observing the sign change of tsdf(g)
- Compute the intersection point using points around the surface boundary
[Figure: ray casting through the TSDF volume grid; surface at the zero-crossing from outside to inside]
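The zero-crossing search along a ray reduces to finding a sign change between consecutive tsdf samples and interpolating between them; a 1D sketch:

```python
def find_surface(samples, step):
    """Walk tsdf samples along a ray (spacing `step`); return the ray
    distance of the first +/- zero-crossing via linear interpolation."""
    for i in range(len(samples) - 1):
        a, b = samples[i], samples[i + 1]
        if a > 0.0 and b <= 0.0:                # outside -> inside transition
            frac = a / (a - b)                  # linear interpolation
            return (i + frac) * step
    return None                                 # ray missed the surface

print(find_surface([1.0, 0.5, -0.5, -1.0], step=0.01))  # crossing at 0.015
```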
Experimental Results (1)
Build a 3D model of a microwave
color images
corresponding depth images
3D model
Experimental Results (2)
Build another 3D model of the microwave
color images
corresponding depth images
3D model
Experimental Results (3)
Interactively change the configuration of the microwave
color images
corresponding depth images
Conclusion

Reconstruct kinematic structure of articulated objects:
- Yields occlusion-aware multi-view models
- Does not require prior knowledge of the object
- Makes no assumptions regarding planarity of the object
- Supports both revolute and prismatic joints
- Automatically classifies joint type
- Works for objects with multiple joints
- Requires only two different configurations of the object
- Effective in a range of environmental conditions with various types of objects
- Useful in domestic robotics
Conclusion

Manipulate articulated objects:
- Given a particular point on the object, the robot can move its end effector to that position, even if the point is not visible in the current view
- Given a particular grasp point, the robot can grab the object at that point and move so as to exercise the articulated joint

Apply Kinect to the system:
- Does not require artificial markers attached to objects
- Yields much denser 3D models
- Reduces computation time