Learning Models of Shape from 3D Range Data
Dragomir Anguelov
Artificial Intelligence Lab
Stanford University
Shape Models for Animation
Shape Models for Motion Estimation
[Gollum - Time Warner]
Animation
[Figure: force and moment axes Fx, Fy, Fz, Mx, My, Mz]
Biomechanics
Shape Models for Scene Understanding
Goal: Understand sensor input in terms of objects and relations
“puppet holding stick”
Machine Learning for Model Construction
Example-based models
[Allen et al. ‘02] [Allen et al. ‘03]
Simulation-based models
[Wilhelms, Van Gelder ‘97] [Aubel ‘02]
Artist-designed models
[Dreamworks] [Poser – Curious Labs] [Lucasfilm]
Machine Learning for Motion Estimation
Marker motion capture
[Polar Express]
Markerless motion capture
[Bregler et al. ‘98]
[Cheung et al. ’03]
Physical measurement
[Braune, Fischer 1892]
Shape Models from 3D Scans
Pose variation
Body-shape variation
Object models:
• Discover object parts
• Model pose variation in terms of parts
Class models:
• Model shape variation within class
3D Range Scans
Cyberware Scans
• 4 views, ~125k polygons
• ~65k points each
Problems
• Missing surface
• Drastic shape changes
Standard Modeling Pipeline
[Allen, Curless, Popovic 2002]
1. Articulated Template
2. Fit Template to Scans
3. Interpolation
A lot of human intervention
Pose or body shape deformations modeled, but not both
Similar to: [Lewis et al. ‘00] [Sloan et al. ’01] [Mohr, Gleicher ’03], …
Contributions
3D Scan segmentation and object detection (not in thesis)
3D Scans
Registration
Recover skeleton
Learn model of deformations
Unsupervised non-rigid registration
Automatic articulated model recovery
Modeling pose and body shape deformations
Talk outline
• Automating the data processing pipeline
  • Unsupervised non-rigid registration
  • Recovering an articulated model
• Modeling the space of human deformations
• Scene understanding (not in thesis)
Registration
Task: Establish correspondences between two surfaces
Generative Model
Model mesh X → (transformation) → transformed mesh X’ → (correspondences C) → scan mesh Z
Goal: Given model mesh X and scan mesh Z, recover the transformation and the correspondences C
Correspondence ck specifies which point x’i generated point zk
Deformation Model
Treat each link as a spring, resisting stretching and twisting
[Figure: model X with a link between points x1 and x2 (offsets d12, d21); after the transformation, points x’1 and x’2 (offsets d’12, d’21)]
Gaussian noise model
Non-rigid Iterative Closest Point
Algorithm
• Assume initial alignment known
• Compute correspondences C (given the transformation): for each point zk, find its nearest neighbor x’i
• Solve for the transformation (given C) which
  • Brings matching pairs together
  • Minimizes deformation
[Shelton ’00], [Chui & Rangarajan ’02], [Allen et al. ’03], [Hähnel et al.’03]
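The two alternating steps above can be sketched in a few lines. This is a minimal illustration, not the formulation of any of the cited papers: it assumes the transformation is a free displacement per model point and that the spring term is a quadratic penalty on displacement differences across links. The names `nearest_neighbors`, `solve_displacements`, `nonrigid_icp` and the weight `lam` are hypothetical.

```python
import numpy as np

def nearest_neighbors(Z, Xp):
    """For each scan point z_k, index of its nearest transformed model point x'_i."""
    d2 = ((Z[:, None, :] - Xp[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)

def solve_displacements(X, Z, C, links, lam=1.0):
    """Least-squares displacement T[i] per model point (assumed deformation model):
    pulls matched pairs together and penalizes links whose endpoints move differently."""
    n = len(X)
    A = np.zeros((n, n))
    B = np.zeros((n, X.shape[1]))
    for k, i in enumerate(C):          # data term: x_i + t_i should reach z_k
        A[i, i] += 1.0
        B[i] += Z[k] - X[i]
    for i, j in links:                 # spring term: lam * ||t_i - t_j||^2
        A[i, i] += lam
        A[j, j] += lam
        A[i, j] -= lam
        A[j, i] -= lam
    A += 1e-9 * np.eye(n)              # keeps the system solvable for unmatched regions
    return np.linalg.solve(A, B)

def nonrigid_icp(X, Z, links, lam=1.0, iters=10):
    """Alternate nearest-neighbor correspondences and the displacement solve."""
    T = np.zeros_like(X)
    for _ in range(iters):
        C = nearest_neighbors(Z, X + T)
        T = solve_displacements(X, Z, C, links, lam)
    return X + T, C
```

Because each correspondence is picked independently by nearest neighbor, a poor initial alignment gives poor matches, which is the failure mode the next slides discuss.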
Nonrigid ICP Experiment
[Figure: model X and scan Z with correspondences c1, c2]
Correspondences for different points computed independently
Poor correspondences lead to poor transformations
Correlated Correspondence Algorithm
Input: Pair of scans
Output: Correspondences
Correlated Correspondence Algorithm
Computes an embedding of mesh Z into mesh X
The embedding enforces:
• Minimal surface deformation
• Similar local surface appearance
• Preservation of geodesic distance
[Anguelov, Srinivasan, Pang, Koller, Thrun, Davis ‘04]
Markov Network
[Figure: Markov network over C1, C2, C3 with single potentials on (C1), (C2), (C3) and pairwise potentials on (C1, C2), (C2, C3), (C1, C3)]
[Plot: single potential over the values 1, 2, 3]
[Plot: pairwise potential over the value pairs 1,1 through 3,3]
Joint probability distribution
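As an illustration of the joint distribution such a network defines, the sketch below multiplies all single and pairwise potentials and normalizes by brute-force enumeration. The potential values and the helper name `joint_distribution` are made up for the example; enumeration is only feasible for tiny networks like this three-variable one.

```python
import itertools
import numpy as np

def joint_distribution(single, pairwise):
    """single: dict var -> (K,) potentials; pairwise: dict (u, v) -> (K, K) potentials.
    Returns a dict mapping each joint assignment (tuple of values) to its probability."""
    names = sorted(single)
    K = len(single[names[0]])
    table = {}
    for values in itertools.product(range(K), repeat=len(names)):
        assign = dict(zip(names, values))
        score = 1.0
        for v in names:
            score *= single[v][assign[v]]
        for (u, v), psi in pairwise.items():
            score *= psi[assign[u], assign[v]]
        table[values] = score
    Z = sum(table.values())                      # partition function
    return {a: s / Z for a, s in table.items()}

# Illustrative potentials: each variable prefers value index 2, and pairs prefer to agree.
single = {c: np.array([1.0, 2.0, 3.0]) for c in ("C1", "C2", "C3")}
agree = np.ones((3, 3)) + 2.0 * np.eye(3)
pairwise = {("C1", "C2"): agree, ("C2", "C3"): agree, ("C1", "C3"): agree}
P = joint_distribution(single, pairwise)
best = max(P, key=P.get)                         # most probable joint assignment
```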
Correlated Correspondence Model
[Figure: scan points zi and zj, joined by a link, each with its local appearance; model points x1 and x2, joined by a link, each with its local appearance; correspondence variables Ci, Cj take values 1, 2, …, N]
Appearance potentials: (Ci), (Cj)
Deformation potential: (Ci, Cj)
CC Potentials
[Figure: Markov network over C1, C2, C3 with single and pairwise potentials]
Local appearance potentials (Ci=k)
• Use spin images [Johnson+Hebert ’97] to quantify the surface similarity around two matching points
Deformation potentials D(Ci=k,Cj=l) = P(e’ij| ekl)
Want a good consistent assignment for all correspondences C!
Markov Network Inference
Inference in Markov nets is generally intractable: exponential search space
Loopy Belief Propagation (LBP) [Pearl ’88] is an efficient approximate algorithm for search in exponential spaces
When it converges, it reaches a local minimum (of the Bethe free energy)
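A compact sketch of sum-product loopy belief propagation on a pairwise Markov network follows. It assumes synchronous message updates and a fixed iteration count rather than a convergence test, and the data layout (`unary`, `pairwise`, `edges`) is a hypothetical choice for this example. On a tree the resulting beliefs are exact marginals; on a loopy graph such as the triangle in the figure they are approximations.

```python
import numpy as np

def loopy_bp(unary, pairwise, edges, iters=50):
    """unary: dict node -> (K,) positive potentials.
    pairwise: dict (i, j) -> array indexed [x_i, x_j], one entry per edge.
    Returns a dict node -> approximate marginal (belief)."""
    msgs = {}
    for i, j in edges:                       # one message per direction, uniform start
        msgs[(i, j)] = np.ones(len(unary[j])) / len(unary[j])
        msgs[(j, i)] = np.ones(len(unary[i])) / len(unary[i])
    for _ in range(iters):
        new = {}
        for (i, j) in msgs:
            psi = pairwise[(i, j)] if (i, j) in pairwise else pairwise[(j, i)].T
            h = unary[i].copy()
            for (a, b), m in msgs.items():   # incoming messages to i, except from j
                if b == i and a != j:
                    h = h * m
            m_new = psi.T @ h                # sum over x_i of h(x_i) * psi(x_i, x_j)
            new[(i, j)] = m_new / m_new.sum()
        msgs = new
    beliefs = {}
    for i in unary:
        b = unary[i].copy()
        for (a, t), m in msgs.items():
            if t == i:
                b = b * m
        beliefs[i] = b / b.sum()
    return beliefs

# Triangle network as on the slide, with illustrative potentials.
agree = np.ones((3, 3)) + 2.0 * np.eye(3)
tri_unary = {n: np.array([1.0, 2.0, 3.0]) for n in range(3)}
tri_edges = [(0, 1), (1, 2), (0, 2)]
tri_beliefs = loopy_bp(tri_unary, {e: agree for e in tri_edges}, tri_edges)
```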
Geodesic Potentials: near -> near
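Nearby points in Z must be nearby in X
Constraint between each pair of adjacent points zi, zj:
\[ \psi_G(C_i = k, C_j = l) = \begin{cases} 0 & \text{if } \mathrm{Dist}_G(x_k, x_l) > d \\ 1 & \text{otherwise} \end{cases} \]
[Figure: scan Z and model X]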
Geodesic Potentials: far -> far
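Distant points in Z must be distant in X
Constraint between each pair of distant points zk, zl (farther than 5r):
\[ \psi_G(C_i = k, C_j = l) = \begin{cases} 0 & \text{if } \mathrm{Dist}_G(x_k, x_l) < 2r \\ 1 & \text{otherwise} \end{cases} \]
[Figure: scan Z and model X]
r: resolution of mesh X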
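Both geodesic potentials are 0/1 tables indexed by pairs of model points, so they can be precomputed from a geodesic distance matrix on X. The sketch below approximates geodesics by shortest paths along mesh edges (an assumption, computed here with Floyd-Warshall, which only suits small meshes); the direction of the inequalities and the thresholds `d` and `2r` follow the reading of the slide equations, and all function names are hypothetical.

```python
import numpy as np

def graph_geodesics(n, edges, lengths):
    """All-pairs shortest-path distances along mesh edges (Floyd-Warshall)."""
    D = np.full((n, n), np.inf)
    np.fill_diagonal(D, 0.0)
    for (i, j), w in zip(edges, lengths):
        D[i, j] = D[j, i] = min(D[i, j], w)
    for m in range(n):
        D = np.minimum(D, D[:, [m]] + D[[m], :])
    return D

def near_potential(geo_X, d):
    """Table for adjacent scan points: 0 where the model points are farther than d apart."""
    return (geo_X <= d).astype(float)

def far_potential(geo_X, r):
    """Table for distant scan points: 0 where the model points are closer than 2r."""
    return (geo_X >= 2.0 * r).astype(float)

# Toy model: five points on a chain with unit edge lengths.
chain_edges = [(i, i + 1) for i in range(4)]
geo = graph_geodesics(5, chain_edges, [1.0] * 4)
psi_near = near_potential(geo, d=1.5)
psi_far = far_potential(geo, r=1.0)
```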
Results: Pose Deformation
No markers used
Results: Body Shape Deformation
No markers used
Application: Scan Completion
Model
• 4 markers were placed manually on each of these scans
Cyberware scans
Registrations
Applications: Animation
Linear interpolation in local link deformation space
Talk outline
• Automating the data processing pipeline
  • Unsupervised non-rigid registration
  • Recovering an articulated model
• Modeling the space of human deformations
• Scene understanding (not in thesis)
Recovering articulated models
Input: models, correspondences
Output: rigid parts, skeleton
[Anguelov, Koller, Pang, Srinivasan, Thrun ‘04]
Recovering Articulation: State of the art
Algorithm assigns points to parts independently, ignoring the correlations between the assignments
Prone to local minima
Each joint is estimated from a separate sequence
Skeleton: 9 parts
combine
[Cheung et al., ‘03]
Recovering articulation [Anguelov et al. ’04]
Stages of the process
1. Register meshes using Correlated Correspondences algorithm
2. Cluster surface into rigid parts
3. Estimate joints
Probabilistic Model
[Figure: graphical model. Model: points x1 … xN with part labels a1 … aN (values 1, 2, …, P); transformations T1 … TP; transformed model points y1 … yN. Instance: points z1 … zK with point correspondences c1 … cK]
Contiguity Prior
• Parts are preferably contiguous regions
• Adjacent points on the surface should have similar labels
• Enforce this with a Markov network over the labels (a1, a2, a3, …)
• Penalizes large number of parts
Clustering algorithm
Algorithm
• Given transformations T, perform min-cut* inference to get part labels a
• Given labels a, solve for rigid transformations T
*[Greig et al. 89], [Kolmogorov & Zabih 02]
If a part doesn’t contribute to the likelihood, it will be automatically dropped
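The alternation above can be illustrated with a simplified sketch. It swaps the min-cut labeling step for an independent per-point assignment (so the contiguity prior is not modeled), and solves each rigid transformation with the standard SVD-based least-squares fit. `fit_rigid` and `cluster_rigid_parts` are hypothetical names, and the rule that parts with fewer than three points are dropped is an assumption of this example.

```python
import numpy as np

def fit_rigid(P, Q):
    """Least-squares rotation R and translation t such that R @ p + t ~ q."""
    cp, cq = P.mean(axis=0), Q.mean(axis=0)
    H = (P - cp).T @ (Q - cq)
    U, _, Vt = np.linalg.svd(H)
    # Flip the last axis if needed so that R is a proper rotation (det = +1).
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])
    R = Vt.T @ D @ U.T
    return R, cq - R @ cp

def cluster_rigid_parts(X, Y, labels, iters=10):
    """X, Y: corresponding points in two poses; labels: initial part label per point.
    Alternates (1) one rigid fit per part and (2) relabeling each point by the
    transformation that best explains its motion."""
    labels = labels.copy()
    for _ in range(iters):
        parts = [p for p in np.unique(labels) if (labels == p).sum() >= 3]
        T = [fit_rigid(X[labels == p], Y[labels == p]) for p in parts]
        err = np.stack([np.linalg.norm(X @ R.T + t - Y, axis=1) for R, t in T], axis=1)
        labels = np.array(parts)[err.argmin(axis=1)]
    return labels
```

A part whose points are all better explained by other transformations ends up with no members and disappears from later iterations, loosely mirroring the remark that unneeded parts are dropped.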
Clustering Movie
Results: 70 Human Scans
Tree-shaped skeleton found
Rigid parts found
Results: Puppet
Results: Arm
Application: Tracking
[Anguelov, Mündermann, Corazza ‘05]
Application: Tracking
[Mündermann, Corazza, Anguelov ‘05]
Talk outline
• Automating the data processing pipeline
  • Unsupervised non-rigid registration
  • Recovering an articulated skeleton
• Modeling the space of human shapes
  • Pose and body shape deformations
  • Application: shape completion
• Scene understanding (not in thesis)
  • Discriminative Markov networks for scan segmentation
  • Articulated object detection
Deformation Transfer Problem
How do you combine two displacement-based models?
- displacements cannot be multiplied
- adding displacements ignores notion of object scale
Pose deformation = point displacements from articulated template
[Allen et al. ‘02]
[Lewis et al. ‘00]
[Mohr & Gleicher ‘03] [Wang & Phillips ‘02]
[Sand et al. ‘03]
Body shape deformation = point displacements from average shape
[Allen et al. ‘03] [Seo & Thalmann ‘03] [Sloan et al. ‘01]
Predicting Human Deformation
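[Equation: the deformed polygon is obtained from the template polygon by applying the pose deformation, the body shape deformation, and the rigid part rotation]
• Pose deformation: predict from nearby joint angles
• Body shape deformation: linear subspace (PCA)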
[Anguelov, Srinivasan, Koller, Thrun, Rodgers, Davis ‘05]
Reconstructing the Shape
[Equation: deformed polygon from template polygon via pose deformation, body shape deformation, and rigid part rotation]
To reconstruct the entire mesh Y, solve:
Related work: [Sumner & Popovic ’04]
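The objective itself did not survive in this transcript. A common way to pose this step, and the assumption made in the sketch below, is a linear least-squares problem: choose vertex positions Y so that every triangle's edge vectors match the template edges after applying that triangle's 3x3 deformation. `reconstruct_mesh`, the per-triangle matrices `M` and the pinned `anchor` vertex are hypothetical names for this illustration.

```python
import numpy as np

def reconstruct_mesh(V_template, triangles, M, anchor=0):
    """V_template: (n, 3) template vertices; triangles: list of (a, b, c) vertex indices;
    M: (n_triangles, 3, 3) per-triangle transforms (e.g. rotation * shape * pose).
    Returns (n, 3) vertices whose edges best match the transformed template edges."""
    n = len(V_template)
    rows, rhs = [], []
    for k, (a, b, c) in enumerate(triangles):
        for j in (b, c):                       # two edges per triangle, from vertex a
            row = np.zeros(n)
            row[j], row[a] = 1.0, -1.0
            rows.append(row)
            rhs.append(M[k] @ (V_template[j] - V_template[a]))
    # Pin one vertex: edge constraints alone leave the global translation free.
    row = np.zeros(n)
    row[anchor] = 1.0
    rows.append(row)
    rhs.append(V_template[anchor])
    Y, *_ = np.linalg.lstsq(np.array(rows), np.array(rhs), rcond=None)
    return Y
```

When the per-triangle predictions disagree along shared edges, the least-squares solve spreads the inconsistency over the mesh instead of tearing it apart.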
We have:
Pose Deformation
• Input: joint angles
• Output: deformations
• Regression function: linear regression from two nearest joints
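A minimal sketch of such a regression for a single polygon: each of the nine entries of its 3x3 deformation matrix is fit as a linear function (plus bias) of features of the nearby joints. The feature layout (for instance two joints times three rotation parameters) and the names `fit_pose_regression` and `predict_pose_deformation` are assumptions of this example.

```python
import numpy as np

def fit_pose_regression(joint_feats, Q):
    """joint_feats: (n_poses, d) joint features for one polygon.
    Q: (n_poses, 3, 3) observed deformation matrices for that polygon.
    Returns W of shape (d + 1, 9): one linear model (with bias) per matrix entry."""
    A = np.hstack([joint_feats, np.ones((len(joint_feats), 1))])
    W, *_ = np.linalg.lstsq(A, Q.reshape(len(Q), 9), rcond=None)
    return W

def predict_pose_deformation(W, feats):
    """Predict the 3x3 deformation matrix for a new set of joint features."""
    return (np.append(feats, 1.0) @ W).reshape(3, 3)
```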
Pose Deformation Space
Body Shape Deformation
input
output
Low-dimensional subspace (PCA)
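For illustration, a low-dimensional subspace can be fit with PCA via the SVD. The sketch assumes each row of `S` is one person's body-shape deformation flattened into a vector; `fit_pca`, `project` and `reconstruct` are hypothetical helper names.

```python
import numpy as np

def fit_pca(S, n_components):
    """S: (n_subjects, dim) flattened body-shape deformations, one row per subject.
    Returns the mean, the top principal directions, and all component variances."""
    mean = S.mean(axis=0)
    U, s, Vt = np.linalg.svd(S - mean, full_matrices=False)
    return mean, Vt[:n_components], s ** 2 / (len(S) - 1)

def project(x, mean, basis):
    """Coefficients of x in the PCA subspace."""
    return basis @ (x - mean)

def reconstruct(coeffs, mean, basis):
    """Map subspace coefficients back to a full deformation vector."""
    return mean + coeffs @ basis
```

New body shapes can then be generated by picking coefficients and calling `reconstruct`.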
Body Shape Deformation Space
Combining Pose and Body Shape Spaces
Talk outline
Automating the data processing pipeline Unsupervised non-rigid registration Recovering an articulated skeleton
• Modeling the space of human deformations
  • Pose and body shape deformations
  • Application: shape completion
• Scene understanding (not in thesis)
  • Markov networks for scan segmentation
  • Articulated object detection
Shape Completion
Sparse surface markers → find most probable surface w.r.t. model → completed surface
Model parameters: joint angles R, body shape in PCA space
Partial View Completion
Motion Capture Animation
Talk outline
• Automating the data processing pipeline
  • Unsupervised non-rigid registration
  • Recovering an articulated skeleton
• Modeling the space of human deformations
  • Pose and body shape deformations
  • Application: shape completion
• Model-based object detection (not in thesis)
  • Markov networks for scan segmentation
  • Articulated object detection
Model-Based Object Detection
Task: Detect articulated object pose in 3D scan
Challenge: large search space (15 parts, 6 (constrained) DOF each)
[Anguelov, Rodgers, Koller]
Our Framework
• Detectors: obtain initial part location hypotheses
  • Scan segmentation
  • Spin images
• Scoring: define energy preferring “good” embeddings of the model in the scene
  • Area and edge matching
  • Joint match
  • Occlusion
  • Parts intersection
• Inference: find a consistent and high-scoring set of location hypotheses for all model parts
[Anguelov, Rodgers, Koller]
Model-Based Object Identification Result
[Anguelov, Rodgers, Koller, ongoing]
Object-Based Segmentation
• Train model to assign scan points to parts
  • Discriminative training using pre-segmented scans
• Collective classification
  • Neighboring points more likely assigned to same part
• Use associative Markov network, with min-cut for inference
[Anguelov, Taskar, Chatalbashev, Gupta, Koller, Heitz, Ng, 2005]
[Anguelov, Taskar, Chatalbashev, Gupta, Koller, Heitz, Ng, 2005]
Segmentation Results
Results
Comparison
AMN SVM
Contributions
Unsupervised non-rigid scan registration
Automatic recovery of articulated models from 3D scans
• Tracking in shape-from-silhouette data
Learning human pose and body shape deformations
• Shape completion applications
Additional work:
• Discriminative learning of Markov networks for 3D scan segmentation
• Detection of articulated models in 3D range scans
• Object-based 2D mapping
Future Work
• Extending the human deformation model
  • Nonlinear prediction of pose deformation (ongoing)
  • Acquire and learn from the entire pose / body-shape matrix
  • Prior on likely joint angles, e.g. [Popovic et al. ’04]
  • Enforce temporal consistency in tracking applications
• Markerless motion capture
  • Use learned human shape prior for tracking in 3D data streams (shape-from-silhouette)
• Scene understanding
  • Generative / discriminative learning of part-based object models for detection
Publications
D. Anguelov, P. Srinivasan, D. Koller, S. Thrun, J. Rodgers, J. Davis. SCAPE: Shape Completion and Animation of People. [SIGGRAPH 2005]
D. Anguelov, B. Taskar, V. Chatalbashev, D. Koller, D. Gupta, G. Heitz, A. Ng. Discriminative Learning of Markov Random Fields for Segmentation of 3D Range Data. [CVPR 2005]
D. Anguelov, P. Srinivasan, D. Koller, S. Thrun, H. Pang, J. Davis. The Correlated Correspondence Algorithm for Unsupervised Registration of Nonrigid Surfaces. [NIPS 2004]
D. Anguelov, D. Koller, H. Pang, P. Srinivasan, S. Thrun. Recovering Articulated Object Models from 3D Range Data. [UAI 2004]
D. Anguelov, L. Mündermann, S. Corazza. An Iterative Closest Point Algorithm for Tracking Articulated Models in 3D Range Scans. [ASME/SBC 2005]
D. Anguelov, D. Koller, E. Parker, S. Thrun. Detecting and Modeling Doors with Mobile Robots. [ICRA 2004]
D. Anguelov, R. Biswas, D. Koller, B. Limketkai, S. Sanner, and S. Thrun. Learning hierarchical object maps of non-stationary environments with mobile robots. [UAI 2002]
Thank you
Stanford AI Lab
Collaborators: Daphne Koller, Sebastian Thrun, Praveen Srinivasan, Ben Taskar, Hoi-Cheung Pang, Jim Rodgers, Geremy Heitz, Vassil Chatalbashev, Rahul Biswas, Evan Parker, Dinkar Gupta, Uri Lerner, Andrew Ng
Stanford Biomechanics Lab: Lars Mündermann, Stefano Corazza, Thomas Andriacchi
UC Santa Cruz: James Davis
DAGS+: Carlos Guestrin, Lise Getoor, Eran Segal, Christian Shelton
Dags-extended
And last, but not least: Olya
Representation of pose deformation
Pose deformation
Rigid articulated deformation
Given estimates of R, Q, synthesizing the shape is straightforward:
Twists and exponential maps
Twist
From twist to rotation matrix
Joint angles
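The formulas on this slide were lost in extraction. As a reminder of the standard construction, the sketch below maps a rotation-only twist (axis times angle, a 3-vector) to a rotation matrix with Rodrigues' formula, and back. The function names are hypothetical, and the inverse shown is not reliable for angles near pi.

```python
import numpy as np

def twist_to_rotation(t):
    """Exponential map: 3-vector t (unit axis times angle) -> 3x3 rotation matrix."""
    t = np.asarray(t, dtype=float)
    theta = np.linalg.norm(t)
    if theta < 1e-12:
        return np.eye(3)
    k = t / theta
    K = np.array([[0.0, -k[2], k[1]],
                  [k[2], 0.0, -k[0]],
                  [-k[1], k[0], 0.0]])          # skew-symmetric cross-product matrix
    return np.eye(3) + np.sin(theta) * K + (1.0 - np.cos(theta)) * (K @ K)

def rotation_to_twist(R):
    """Inverse map for rotation angles strictly between 0 and pi."""
    theta = np.arccos(np.clip((np.trace(R) - 1.0) / 2.0, -1.0, 1.0))
    if theta < 1e-12:
        return np.zeros(3)
    axis = np.array([R[2, 1] - R[1, 2], R[0, 2] - R[2, 0], R[1, 0] - R[0, 1]])
    return theta * axis / (2.0 * np.sin(theta))
```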
Kernel SVM Regression
Learning pose deformation
For each polygon, predict entries of Q from rotations of nearest 2 joints (represented as twists).
Linear regression parameters:
Obtaining values of Q in the first place:
Learning body-shape deformation
Also include the change in shape due to different people:
Do PCA over body-shape matrices:
Getting estimates of the body-shape matrices:
Shape completion
Find surface Y from our space which matches a set of markers Z
• Y[Z]: completed mesh, deforms out of the space spanned by the body-shape (PCA) parameters and R to match Z
• Y’[Z]: predicted mesh, constrained to be in the space spanned by the body-shape (PCA) parameters and R
Target optimized by iteratively solving for the body-shape parameters, R, or Y while holding the others fixed
Articulated ICP
Partial view completion
Process:
• Add a few markers (~6-8)
• Run CC algorithm to get >100 markers
• Optimize to find completion surface
Shape completion from motion capture data
Our pipeline
Local surface signatures
Use spin-images [Johnson ’97]
• 2D histogram of distances from an oriented reference point
• Rotationally invariant / Robust under clutter and occlusion / Compressible (PCA)
Potential (ck = i) encodes how well the signature of point zk matches the signature of point xi in the model:
\[ \psi(c_k = i) = \mathcal{N}(S_k;\, S_i, \Sigma_S) \]
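A small sketch of how such appearance potentials could be tabulated, given precomputed signature vectors (for example PCA-compressed spin images). It assumes a diagonal covariance given by `var` and drops the Gaussian normalizing constant, since potentials only matter up to scale; `appearance_potentials` is a hypothetical name.

```python
import numpy as np

def appearance_potentials(S_scan, S_model, var):
    """S_scan: (K, dim) signatures of scan points z_k.
    S_model: (N, dim) signatures of model points x_i.
    var: scalar or (dim,) variances (assumed diagonal covariance).
    Returns a (K, N) table; row k holds the potential of c_k = i for every i,
    scaled so that the best match in each row has value 1."""
    diff = S_scan[:, None, :] - S_model[None, :, :]
    log_psi = -0.5 * (diff ** 2 / var).sum(axis=2)
    log_psi -= log_psi.max(axis=1, keepdims=True)   # numerical stability
    return np.exp(log_psi)
```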
Modeling Human Deformation
template
Predict independently for each triangle
Reconstruct complete shape
Related work: [Sumner+Popovic 2004]
Shape Models for Scene Understanding
Constellation model [Fergus, Perona ‘04]
Pictorial structures [Huttenlocher ‘00]
Computer Vision
Effect of Link Prior
[Figure panels: no links vs. Markov net, two examples]
• Gets the boundaries better
• Robust to poor initialization
Conclusions
• Presented a data-driven method of modeling human deformations induced by
  • Pose
  • Body shape
• Extending the model
  • Nonlinear prediction of pose deformation (ongoing)
  • Shape complete original scans using current model
  • Acquire and learn from the entire pose / body-shape matrix
  • Prior on likely joint angles, e.g. [Popovic et al. ’04]
  • Enforce temporal consistency in tracking applications
• Extending the possible applications
  • Markerless motion capture (ongoing)
  • Shape completion in shape-from-silhouette data
  • Model beasts other than humans