Learning Models of Shape from 3D Range Data

Dragomir Anguelov

Artificial Intelligence Lab

Stanford University

Shape Models for Animation

Shape Models for Motion Estimation

[Gollum - Time Warner]

Animation

[Figure: forces Fx, Fy, Fz and moments Mx, My, Mz]

Biomechanics

Shape Models for Scene Understanding

Goal: Understand sensor input in terms of objects and relations

“puppet holding stick”

Machine Learning for Model Construction

Example-based models

[Allen et al. ‘02] [Allen et al. ‘03]

Simulation-based models

[Wilhelms, Van Gelder ‘97] [Aubel ‘02]

Artist-designed models

[Dreamworks] [Poser – Curious Labs] [Lucasfilm]

Machine Learning for Motion Estimation

Marker motion capture

[Polar Express]

Markerless motion capture

[Bregler et al. ‘98]

[Cheung et al ’03]

Physical measurement

[Braune, Fischer 1892]

Shape Models from 3D Scans

Pose variation

Body-shape variation

Object models:
• Discover object parts
• Model pose variation in terms of parts

Class models:
• Model shape variation within class

3D Range Scans

Cyberware Scans
• 4 views, ~125k polygons
• ~65k points each

Problems
• Missing surface
• Drastic shape changes

Standard Modeling Pipeline

[Allen, Curless, Popovic 2002]

1. Articulated Template
2. Fit Template to Scans
3. Interpolation

A lot of human intervention

Pose or body shape deformations modeled, but not both

Similar to: [Lewis et al. ‘00] [Sloan et al. ’01] [Mohr, Gleicher ’03], …

Contributions

3D Scan segmentation and object detection (not in thesis)

3D Scans

Registration

Recover skeleton

Learn model of deformations

Unsupervised non-rigid registration

Automatic articulated model recovery

Modeling pose and body shape deformations

Talk outline

Automating the data processing pipeline
• Unsupervised non-rigid registration
• Recovering an articulated model

Modeling the space of human deformations

Scene understanding (not in thesis)

Registration

Task: Establish correspondences between two surfaces

Generative Model

Model mesh X, transformed mesh X’, scan mesh Z

Goal: Given model mesh X and scan mesh Z, recover the transformation and the correspondences C

Correspondence ck specifies which point x’i generated point zk

[Figure: the transformation maps each model point xi to x’i; the correspondences C link each scan point zk to a transformed point x’i]

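Deformation Model

Treat each link as a spring, resisting stretching and twisting

[Figure: a link between model points x1 and x2 with link vectors d12 and d21; after the transformation, points x’1 and x’2 with link vectors d’12 and d’21]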

Gaussian noise model

Model X

Non-rigid Iterative Closest Point

Algorithm
• Assume initial alignment known
• Compute correspondences C (given the transformation): for each point zk, find its nearest neighbor x’i
• Solve for the transformation (given C) which brings matching pairs together and minimizes deformation

[Shelton ’00], [Chui & Rangarajan ’02], [Allen et al. ’03], [Hähnel et al.’03]
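The alternation described on this slide can be sketched in a few lines of Python. This is a minimal illustration under simplifying assumptions, not the implementation from the cited papers: the transformation is represented directly by the new point positions, each link is penalized for changing its offset vector (a simple spring term), and the names `nearest_neighbors`, `solve_transformation`, `nonrigid_icp` and the weight `lam` are hypothetical.

```python
import numpy as np

def nearest_neighbors(Z, Xp):
    """For each scan point z_k, return the index of the closest transformed model point."""
    d2 = ((Z[:, None, :] - Xp[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)

def solve_transformation(X, links, Z, corr, lam=1.0):
    """Least-squares positions X' that pull matched pairs together (data term)
    while keeping every link offset close to its original value (spring term)."""
    n = X.shape[0]
    rows, rhs = [], []
    for k, i in enumerate(corr):           # data term: x'_i should be near z_k
        r = np.zeros(n)
        r[i] = 1.0
        rows.append(r)
        rhs.append(Z[k])
    w = np.sqrt(lam)
    for i, j in links:                     # spring term: x'_i - x'_j near x_i - x_j
        r = np.zeros(n)
        r[i], r[j] = w, -w
        rows.append(r)
        rhs.append(w * (X[i] - X[j]))
    A, B = np.vstack(rows), np.vstack(rhs)
    Xp, *_ = np.linalg.lstsq(A, B, rcond=None)
    return Xp

def nonrigid_icp(X, links, Z, iters=10, lam=1.0):
    """Alternate nearest-neighbor correspondences and the deformation solve."""
    Xp = X.copy()                          # assumes the initial alignment is known
    for _ in range(iters):
        corr = nearest_neighbors(Z, Xp)
        Xp = solve_transformation(X, links, Z, corr, lam)
    return Xp
```

Because every correspondence is chosen independently by nearest neighbor, a sketch like this only behaves well when the starting alignment is already close.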

Nonrigid ICP Experiment

[Figure: model X and scan Z with correspondences c1 and c2]

Correspondences for different points computed independently

Poor correspondences lead to poor transformations

Correlated Correspondence Algorithm

Input: Pair of scans

Output: Correspondences

Correlated Correspondence Algorithm

Computes an embedding of mesh Z into mesh X

The embedding enforces:
• Minimal surface deformation
• Similar local surface appearance
• Preservation of geodesic distance

[Anguelov, Srinivasan, Pang, Koller, Thrun, Davis ‘04]

Markov Network

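[Figure: Markov network over variables C1, C2, C3, with a single potential on each variable and a pairwise potential on each of the pairs (C1, C2), (C2, C3), (C1, C3)]

[Figure: bar plot of a single potential over the values 1, 2, 3, and of a pairwise potential over the value pairs (1,1) through (3,3)]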

Joint probability distribution
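As a small worked example of how single and pairwise potentials define a joint distribution, the sketch below multiplies all potentials for every assignment and normalizes. It enumerates assignments by brute force, so it is only meant for tiny networks; the function name and the dictionary layout are assumptions made for illustration.

```python
import itertools

def joint_distribution(single, pairwise):
    """single: {node: [potential per value]}
    pairwise: {(i, j): table with table[value_i][value_j]}
    Returns {assignment tuple (in sorted node order): probability}."""
    nodes = sorted(single)
    ranges = [range(len(single[n])) for n in nodes]
    scores = {}
    for values in itertools.product(*ranges):
        a = dict(zip(nodes, values))
        score = 1.0
        for n in nodes:                      # product of single potentials
            score *= single[n][a[n]]
        for (i, j), table in pairwise.items():   # product of pairwise potentials
            score *= table[a[i]][a[j]]
        scores[values] = score
    total = sum(scores.values())             # normalization constant
    return {v: s / total for v, s in scores.items()}
```

The number of assignments grows exponentially with the number of variables, which is why exact enumeration is not an option for real scans.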

Correlated Correspondence Model

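[Figure: scan points zi and zj, joined by a link, each with its local appearance and a correspondence variable (Ci, Cj) ranging over the model points 1, 2, …, N; model points x1 and x2, joined by a link, each with its local appearance. An appearance potential is attached to each of Ci and Cj, and a deformation potential to the pair (Ci, Cj).]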

CC Potentials

[Figure: Markov network over C1, C2, C3 with single and pairwise potentials]

Local appearance potentials ψ(Ci=k): use spin images [Johnson+Hebert ’97] to quantify the surface similarity around two matching points

Deformation potentials ψD(Ci=k, Cj=l) = P(e’ij | ekl)

Want a good consistent assignment for all correspondences C!

Markov Network Inference

Inference in Markov networks is generally intractable: exponential search space

Loopy Belief Propagation (LBP) [Pearl ’88] is an efficient approximate algorithm for search in exponential spaces

When it converges, it reaches a stationary point (typically a local minimum) of the Bethe free energy
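To make the message-passing idea concrete, here is a minimal sum-product sketch of loopy belief propagation on a pairwise network. It is an illustration under stated assumptions rather than the inference code used in this work: messages are updated synchronously for a fixed number of iterations, potentials are assumed strictly positive, and the function name and data layout are hypothetical.

```python
import numpy as np

def loopy_bp(unary, pairwise, iters=50):
    """unary: {node: array of single potentials}
    pairwise: {(i, j): array psi with psi[value_i, value_j]}
    Returns approximate marginals (beliefs) per node."""
    msgs = {}
    for (i, j) in pairwise:                  # one message per edge direction
        msgs[(i, j)] = np.ones(len(unary[j])) / len(unary[j])
        msgs[(j, i)] = np.ones(len(unary[i])) / len(unary[i])

    def incoming(node, exclude=None):
        """Single potential times all messages into `node`, except from `exclude`."""
        prod = np.asarray(unary[node], dtype=float).copy()
        for (src, dst), m in msgs.items():
            if dst == node and src != exclude:
                prod = prod * m
        return prod

    for _ in range(iters):
        new = {}
        for (i, j), psi in pairwise.items():
            m_ij = psi.T @ incoming(i, exclude=j)   # sum over the values of i
            m_ji = psi @ incoming(j, exclude=i)     # sum over the values of j
            new[(i, j)] = m_ij / m_ij.sum()
            new[(j, i)] = m_ji / m_ji.sum()
        msgs = new                                  # synchronous update

    beliefs = {}
    for n in unary:
        b = incoming(n)
        beliefs[n] = b / b.sum()
    return beliefs
```

On a network without loops the beliefs match the exact marginals; on loopy networks such as the triangle over C1, C2, C3 they are approximations, and convergence is not guaranteed.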

[Figure: Markov network over C1, C2, C3 with single and pairwise potentials]

Geodesic Potentials: near -> near

Nearby points in Z must be nearby in X

Constraint between each pair of adjacent points zi, zj:

\psi_G(C_i = k, C_j = l) = \begin{cases} 0 & \text{if } \mathrm{Dist}_G(x_k, x_l) > d \\ 1 & \text{otherwise} \end{cases}

[Figure: scan Z and model X]

Geodesic Potentials: far -> far

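Distant points in Z must be distant in X

Constraint between each pair of distant points zi, zj (farther than 5r):

\psi_G(C_i = k, C_j = l) = \begin{cases} 0 & \text{if } \mathrm{Dist}_G(x_k, x_l) < 2r \\ 1 & \text{otherwise} \end{cases}

r: resolution of mesh X

[Figure: scan Z and model X]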
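Both geodesic potentials are simple thresholds on the geodesic distance between the two candidate model points. The sketch below approximates geodesic distance by shortest paths along mesh edges (Dijkstra), which is an assumption made for illustration; the adjacency layout, the function names, and the near threshold `d` passed as a parameter are likewise hypothetical.

```python
import heapq

def geodesic_distances(adjacency, source):
    """Shortest-path distances along mesh edges from `source`.
    adjacency: {vertex: [(neighbor, edge_length), ...]}"""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue                      # stale heap entry
        for v, w in adjacency[u]:
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist

def near_potential(dist_kl, d):
    """Adjacent scan points: forbid model points farther apart than d."""
    return 0.0 if dist_kl > d else 1.0

def far_potential(dist_kl, r):
    """Distant scan points: forbid model points closer than 2r (r = mesh resolution)."""
    return 0.0 if dist_kl < 2.0 * r else 1.0
```

Since both potentials are 0/1 valued, they act as hard constraints that rule out assignments rather than soft preferences.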

Results: Pose Deformation

No markers used

Results: Body Shape Deformation

No markers used

Application: Scan Completion

Model

• 4 markers were placed manually on each of these scans

Cyberware scans

Registrations

Applications: Animation

Linear interpolation in local link deformation space

Talk outline

Automating the data processing pipeline
• Unsupervised non-rigid registration
• Recovering an articulated model

Modeling the space of human deformations

Scene understanding (not in thesis)

Recovering articulated models

Input: models, correspondences

Output: rigid parts, skeleton

[Anguelov, Koller, Pang, Srinivasan, Thrun ‘04]

Recovering Articulation: State of the art

Algorithm assigns points to parts independently, ignoring the correlations between the assignments

Prone to local minima

Each joint is estimated from a separate sequence

Skeleton: 9 parts

combine

[Cheung et al., ‘03]

Recovering articulation [Anguelov et al. ’04]

Stages of the process
1. Register meshes using the Correlated Correspondence algorithm
2. Cluster surface into rigid parts
3. Estimate joints

Probabilistic Model

[Figure: graphical model. Model: points x1 … xN with part labels a1 … aN. Transformations T1 … TP, one for each of the parts 1, 2, …, P. Transformed model: points y1 … yN. Instance: points z1 … zK with point correspondences c1 … cK.]

Contiguity Prior

Parts are preferably contiguous regions

Adjacent points on the surface should have similar labels

Enforce this with a Markov network:

[Figure: Markov network over the part labels a1, a2, a3]

Penalizes large number of parts

Clustering algorithm

Algorithm
• Given the transformations, perform min-cut* inference to get the part labels
• Given the labels, solve for the rigid transformations

*[Greig et al. 89], [Kolmogorov & Zabih 02]

If a part doesn’t contribute to the likelihood, it will be automatically dropped
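The two alternating steps can be sketched as follows. This is a simplified stand-in, not the algorithm from the talk: the labeling step assigns each point to the part whose transformation explains it best, so it ignores the contiguity prior and does not use min-cut, and the rigid fit uses the standard SVD-based (Kabsch) least-squares solution. The function names and the rule for dropping parts with too few points are assumptions.

```python
import numpy as np

def fit_rigid(P, Q):
    """Least-squares rotation R and translation t such that R @ p + t ~ q."""
    cp, cq = P.mean(axis=0), Q.mean(axis=0)
    H = (P - cp).T @ (Q - cq)
    U, _, Vt = np.linalg.svd(H)
    D = np.eye(P.shape[1])
    D[-1, -1] = np.sign(np.linalg.det(Vt.T @ U.T))   # avoid reflections
    R = Vt.T @ D @ U.T
    return R, cq - R @ cp

def cluster_rigid_parts(X, Y, labels, n_parts, iters=10):
    """X: model points (n, d); Y: corresponding instance points (n, d);
    labels: initial part label per point. Returns (labels, transforms)."""
    labels = np.asarray(labels).copy()
    transforms = {}
    for _ in range(iters):
        # Given labels, solve for one rigid transformation per part.
        transforms = {}
        for p in range(n_parts):
            idx = np.flatnonzero(labels == p)
            if len(idx) >= X.shape[1]:       # parts with too few points are dropped
                transforms[p] = fit_rigid(X[idx], Y[idx])
        # Given transformations, relabel each point by smallest residual.
        parts = sorted(transforms)
        residuals = np.stack([
            np.linalg.norm(X @ transforms[p][0].T + transforms[p][1] - Y, axis=1)
            for p in parts
        ])
        new_labels = np.array(parts)[residuals.argmin(axis=0)]
        if np.array_equal(new_labels, labels):
            break
        labels = new_labels
    return labels, transforms
```

Replacing the per-point assignment with min-cut inference over a Markov network on the labels is what adds the preference for contiguous parts.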

Clustering Movie

Results: 70 Human Scans

Tree-shaped skeleton found

Rigid parts found

Results: Puppet

Results: Arm

Application: Tracking

[Anguelov, Mündermann, Corazza ‘05]

Application: Tracking

[Mündermann, Corazza, Anguelov ‘05]

Talk outline

Automating the data processing pipeline
• Unsupervised non-rigid registration
• Recovering an articulated skeleton

Modeling the space of human shapes
• Pose and body shape deformations
• Application: shape completion

Scene understanding (not in thesis)
• Discriminative Markov networks for scan segmentation
• Articulated object detection

Deformation Transfer Problem

How do you combine two displacement-based models?
- displacements cannot be multiplied
- adding displacements ignores notion of object scale

Pose deformation = point displacements from articulated template

[Allen et al. ‘02] [Lewis et al. ‘00] [Mohr & Gleicher ‘03] [Wang & Phillips ‘02] [Sand et al. ‘03]

Body shape deformation = point displacements from average shape

[Allen et al. ‘03] [Seo & Thalmann ‘03] [Sloan et al. ‘01]

Predicting Human Deformation

Deformed polygon = template polygon transformed by:
• Pose deformation: predict from nearby joint angles
• Body shape deformation: linear subspace (PCA)
• Rigid part rotation

[Anguelov, Srinivasan, Koller, Thrun, Rodgers, Davis ‘05]

Reconstructing the Shape

[Figure: deformed polygon obtained from the template polygon by pose deformation, body shape deformation and rigid part rotation]

To reconstruct the entire mesh Y, solve:

Related work: [Sumner & Popovic ’04]

We have:

Pose Deformation

[Figure: regression function with joint angles as input and deformations as output]

Linear regression from two nearest joints
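A linear regression of this kind can be fitted with ordinary least squares. The sketch below assumes, for illustration, that each polygon's pose deformation is a 3x3 matrix and that the input is the concatenation of two 3-vector joint twists plus a constant term; the shapes, names and the use of `numpy.linalg.lstsq` are assumptions, not details stated on the slide.

```python
import numpy as np

def fit_pose_regression(twists, deformations):
    """twists: (n_examples, 6) rotations of the two nearest joints.
    deformations: (n_examples, 9) flattened 3x3 deformation matrices.
    Returns weights W of shape (7, 9), including a bias row."""
    A = np.hstack([twists, np.ones((len(twists), 1))])   # append constant term
    W, *_ = np.linalg.lstsq(A, deformations, rcond=None)
    return W

def predict_deformation(W, twist):
    """Predict the 3x3 deformation matrix of one polygon from a 6-vector of twists."""
    return (np.append(twist, 1.0) @ W).reshape(3, 3)
```

One such regressor would be fitted per polygon, each using only the joints nearest to that polygon.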

Pose Deformation Space

Body Shape Deformation

input

output

Low-dimensional subspace (PCA)
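The low-dimensional subspace can be obtained with PCA computed from a singular value decomposition. The sketch below assumes each person's body-shape deformation has been flattened into one row of a data matrix; the function names and this layout are hypothetical.

```python
import numpy as np

def fit_pca(S, n_components):
    """S: (n_people, n_features) flattened body-shape deformations.
    Returns the mean and the top principal directions (n_components, n_features)."""
    mean = S.mean(axis=0)
    _, _, Vt = np.linalg.svd(S - mean, full_matrices=False)
    return mean, Vt[:n_components]

def project(mean, basis, s):
    """Coefficients of shape(s) `s` in the PCA subspace."""
    return (s - mean) @ basis.T

def reconstruct(mean, basis, coeffs):
    """Shape(s) corresponding to the given subspace coefficients."""
    return mean + coeffs @ basis
```

Varying the coefficients and calling `reconstruct` gives new body shapes that stay within the span of the training examples.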

Body Shape Deformation Space

Combining Pose and Body Shape Spaces

Talk outline


Modeling the space of human deformations
• Pose and body shape deformations
• Application: shape completion

Scene understanding (not in thesis)
• Markov networks for scan segmentation
• Articulated object detection

Shape Completion

Sparse surface markers

Find most probable surface w.r.t. model
• Joint angles R
• Body shape in PCA space

Completed surface

Partial View Completion

Motion Capture Animation

Talk outline


Modeling the space of human deformations
• Pose and body shape deformations
• Application: shape completion

Model-based object detection (not in thesis)
• Markov networks for scan segmentation
• Articulated object detection

Model-Based Object Detection

Task: Detect articulated object pose in 3D scan

Challenge: large search space
• 15 parts, 6 (constrained) DOF each

[Anguelov, Rodgers, Koller]

Our Framework

Detectors: obtain initial part location hypotheses
• Scan segmentation
• Spin images

Scoring: define energy preferring “good” embeddings of the model in the scene
• Area and edge matching
• Joint match
• Occlusion
• Parts intersection

Inference: find a consistent and high-scoring set of location hypotheses for all model parts

[Anguelov, Rodgers, Koller]

Model-Based Object Identification Result

[Anguelov, Rodgers, Koller, ongoing]

Object-Based Segmentation

Train model to assign scan points to parts
• Discriminative training using pre-segmented scans

Collective classification
• Neighboring points more likely assigned to same part

Use associative Markov network, with min-cut for inference

[Anguelov, Taskar, Chatalbashev, Gupta, Koller, Heitz, Ng, 2005]

Segmentation Results

Results

Comparison

AMN SVM

Contributions

Unsupervised non-rigid scan registration

Automatic recovery of articulated models from 3D scans
• Tracking in shape-from-silhouette data

Learning human pose and body shape deformations
• Shape completion applications

Additional work:
• Discriminative learning of Markov networks for 3D scan segmentation
• Detection of articulated models in 3D range scans
• Object-based 2D mapping

Future Work

Extending the human deformation model
• Nonlinear prediction of pose deformation (ongoing)
• Acquire and learn from the entire pose / body-shape matrix
• Prior on likely joint angles, e.g. [Popovic et al. ’04]
• Enforce temporal consistency in tracking applications

Markerless motion capture
• Use learned human shape prior for tracking in 3D data streams (shape-from-silhouette)

Scene understanding
• Generative / discriminative learning of part-based object models for detection

Publications

D. Anguelov, P. Srinivasan, D. Koller, S. Thrun, J. Rodgers, J. Davis. SCAPE: Shape Completion and Animation of People. [SIGGRAPH 2005]

D. Anguelov, B. Taskar, V. Chatalbashev, D. Koller, D. Gupta, G. Heitz, A. Ng. Discriminative Learning of Markov Random Fields for Segmentation of 3D Range Data. [CVPR 2005]

D. Anguelov, P. Srinivasan, D. Koller, S. Thrun, H. Pang, J. Davis. The Correlated Correspondence Algorithm for Unsupervised Registration of Nonrigid Surfaces. [NIPS 2004]

D. Anguelov, D. Koller, H. Pang, P. Srinivasan, S. Thrun. Recovering Articulated Object Models from 3D Range Data. [UAI 2004]

D. Anguelov, L. Mündermann, S. Corazza. An Iterative Closest Point Algorithm for Tracking Articulated Models in 3D Range Scans. [ASME/SBC 2005]

D. Anguelov, D. Koller, E. Parker, S. Thrun. Detecting and Modeling Doors with Mobile Robots. [ICRA 2004]

D. Anguelov, R. Biswas, D. Koller, B. Limketkai, S. Sanner, S. Thrun. Learning hierarchical object maps of non-stationary environments with mobile robots. [UAI 2002]

Thank you

Stanford AI Lab

Collaborators: Daphne Koller, Sebastian Thrun, Praveen Srinivasan, Ben Taskar, Hoi-Cheung Pang, Jim Rodgers, Geremy Heitz, Vassil Chatalbashev, Rahul Biswas, Evan Parker, Dinkar Gupta, Uri Lerner, Andrew Ng

Stanford Biomechanics Lab: Lars Mündermann, Stefano Corazza, Thomas Andriacchi

UC Santa Cruz: James Davis

DAGS+: Carlos Guestrin, Lise Getoor, Eran Segal, Christian Shelton

Dags-extended

And last, but not least

Olya

Representation of pose deformation

Pose deformation

Rigid articulated deformation

Given estimates of R, Q, synthesizing the shape is straightforward:

Twists and exponential maps

Twist

From twist to rotation matrix
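The map from a twist to a rotation matrix can be written with Rodrigues' formula. The sketch below assumes the twist is a 3-vector whose direction is the rotation axis and whose length is the rotation angle in radians; the function name is hypothetical.

```python
import numpy as np

def twist_to_rotation(t):
    """Exponential map: axis-angle 3-vector -> 3x3 rotation matrix (Rodrigues' formula)."""
    t = np.asarray(t, dtype=float)
    theta = np.linalg.norm(t)
    if theta < 1e-12:
        return np.eye(3)                       # zero twist is the identity rotation
    k = t / theta                              # unit rotation axis
    K = np.array([[0.0, -k[2], k[1]],
                  [k[2], 0.0, -k[0]],
                  [-k[1], k[0], 0.0]])         # cross-product matrix of k
    return np.eye(3) + np.sin(theta) * K + (1.0 - np.cos(theta)) * (K @ K)
```

A practical reason to use this representation for regression is that it needs only three numbers per joint and has no orthogonality constraints to maintain.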

Joint angles

Kernel SVM Regression

Learning pose deformation

For each polygon, predict the entries of its pose deformation matrix from the rotations of the nearest 2 joints (represented as twists).

Linear regression parameters:

Obtaining values of the pose deformation matrices in the first place:

Learning body-shape deformation

Include also change in shape due to different people:

Do PCA over body-shape matrices:

Getting estimates of the body-shape matrices:

Shape completion

Find surface Y from our space which matches a set of markers Z

Y[Z]: completed mesh, deforms out of the space spanned by the body-shape coefficients and R to match Z

Y’[Z]: predicted mesh, constrained to be in the space spanned by the body-shape coefficients and R

Target optimized by iteratively solving for the body-shape coefficients, R, or Y while holding the others fixed

Articulated ICP

Partial view completion

Process:
• Add a few markers (~6-8)
• Run CC algorithm to get > 100 markers
• Optimize to find completion surface

Shape completion from motion capture data

Our pipeline

Local surface signatures

Use spin-images [Johnson ’97]
• 2D histogram of distances from an oriented reference point
• Rotationally-invariant / Robust under clutter and occlusion / Compressible (PCA)

Potential ψ(ck = i) encodes how well the signature of point zk matches the signature of point xi in the model:

\psi(c_k = i) = \mathcal{N}(S_k; S_i, \Sigma_S)
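A spin-image style signature and the Gaussian comparison between two signatures can be sketched as below. This is an illustrative simplification: the bin count and support size are arbitrary, no PCA compression is applied, and the covariance is assumed to be spherical with a single variance `var`, which is narrower than the general covariance in the formula. All names are hypothetical.

```python
import numpy as np

def spin_image(points, p, n, bins=8, size=1.0):
    """2D histogram of surface points around the oriented point (p, n).
    alpha: distance from the line through p along the normal n.
    beta: signed height above the tangent plane at p."""
    n = np.asarray(n, dtype=float)
    n = n / np.linalg.norm(n)
    rel = np.asarray(points, dtype=float) - np.asarray(p, dtype=float)
    beta = rel @ n
    alpha = np.sqrt(np.maximum((rel ** 2).sum(axis=1) - beta ** 2, 0.0))
    hist, _, _ = np.histogram2d(alpha, beta, bins=bins,
                                range=[[0.0, size], [-size, size]])
    return hist

def log_appearance_potential(sig_k, sig_i, var=1.0):
    """Log of an unnormalized Gaussian potential comparing two signatures
    (spherical covariance assumed)."""
    diff = (np.asarray(sig_k) - np.asarray(sig_i)).ravel()
    return -0.5 * float(diff @ diff) / var
```

Because `alpha` and `beta` depend only on distances to the normal line and the tangent plane, rotating the surface about the normal leaves the histogram unchanged.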

Modeling Human Deformation

template

Predict independently for each triangle

Reconstruct complete shape

Related work: [Sumner+Popovic 2004]

Shape Models for Scene Understanding

Constellation model [Fergus, Perona ‘04]

Pictorial structures [Huttenlocher ‘00]

Computer Vision

Effect of Link Prior

[Figure: two pairs of panels comparing results with no links and with the Markov net]

With the Markov net:
• Gets the boundaries better
• Robust to poor initialization

Conclusions

Presented a data-driven method of modeling human deformations induced by
• Pose
• Body shape

Extending the model
• Nonlinear prediction of pose deformation (ongoing)
• Shape complete original scans using current model
• Acquire and learn from the entire pose / body-shape matrix
• Prior on likely joint angles, e.g. [Popovic et al. ’04]
• Enforce temporal consistency in tracking applications

Extending the possible applications
• Markerless motion capture (ongoing): shape completion in shape-from-silhouette data
• Model beasts other than humans
