Learning to estimate human pose with data driven belief propagation Gang Hua, Ming-Hsuan Yang, Ying...

Post on 17-Dec-2015


Learning to estimate human pose with data driven belief propagation

Gang Hua, Ming-Hsuan Yang, Ying Wu

CVPR 05

Project Goal

• Learning human motion
• Need to know the human body configurations
• Detect human body parts from a single image

Where are head, arms, legs, torsos?

Estimate human body configurations

What are the size, location and orientation?

First step (i.e., initialization) for full human body tracking

Statistical Inference

Input / Output

Location/size/orientation of:
• head
• arms
• legs

Challenges

• Large variation in pose
• Occlusion: some parts are not visible
• Lighting variation: affects appearance
• Cluttered background: noisy visual cues
• High dimensional state variables

Main Idea

• Analysis by synthesis (i.e., hypothesize and test)
• Statistical inference
• Locate body parts using cues
• Importance sampling

Learn the shapes of human body parts

Intelligently guess some possible answers, i.e., assembly of body parts

Match each guessed answer with image observation using shape prior and geometry constraints

[Figure: samples of body parts (head, torso, upper leg, lower arm) with the question "Which observed assembly looks most likely to be a human?". Stages shown: image, potential body parts, assembly of body parts, best assembly. Annotations: visual cues & importance sampling; local observation & belief propagation.]

In Plain English

• Learning shape: collect prior knowledge of body parts
• Importance sampling: intelligent guess of the answer
• Observation: what is seen in the image, such as appearance, color, and edges
• Belief: local evidence
• Belief propagation: inference using all relevant local evidence
• Potential functions: encode constraints


Markov Network

• Xi: pose state of each limb
• Zi: image observation of each limb
• Ψij(Xi, Xj): each undirected link represents a potential function
• Φi(Zi|Xi): each directed link represents an observation likelihood
• Goal: infer P(Xi|Z), i.e., P(state variables | image observations)

Learning Body Shapes

• For each body part, normalize the labeled shape and learn a low-dimensional representation, psi, using probabilistic Principal Component Analysis (PCA)
• Pose parameters: Xi = {psi, sx, sy, θ, tx, ty}, i.e., the shape representation plus scale, rotation, and translation

[Figure: normalizing the labeled shape. (1) Normalized shape, (2) originally labeled shape, and (3) reconstructed shape]

Face Detection for Head Pose

• AdaBoost-based face detector
• Detection results are good but not precise
• Two-class k-means algorithm to cluster skin color pixels
• The head pose hypothesis Ixh is obtained by re-centering the face rectangle to the centroid of the skin color cluster and then projecting to the head PCA space
• Gaussian importance function
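The re-centering step can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the function names (`two_means`, `recenter`), the darkest/brightest initialisation, and the rule for deciding which cluster is skin are hypothetical and are not taken from the slides. The projection to the head PCA space is not shown.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two colour tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def two_means(pixels, iters=20):
    """Cluster colour tuples into 2 groups with plain k-means (Lloyd's algorithm)."""
    # Deterministic initialisation (an assumption): darkest and brightest pixel.
    centers = [min(pixels, key=sum), max(pixels, key=sum)]
    labels = [0] * len(pixels)
    for _ in range(iters):
        # Assignment step: each pixel goes to its nearest centre.
        labels = [min((0, 1), key=lambda k: sq_dist(p, centers[k])) for p in pixels]
        # Update step: each centre becomes the mean of its members.
        for k in (0, 1):
            members = [p for p, lab in zip(pixels, labels) if lab == k]
            if members:
                centers[k] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels, centers


def recenter(rect, coords, labels, skin_label):
    """Move rect = (x, y, w, h) so its centre sits on the skin-cluster centroid."""
    x, y, w, h = rect
    pts = [c for c, lab in zip(coords, labels) if lab == skin_label]
    cx = sum(p[0] for p in pts) / len(pts)
    cy = sum(p[1] for p in pts) / len(pts)
    return (cx - w / 2, cy - h / 2, w, h)


# Toy data: two skin-like pixels and two dark background pixels.
pixels = [(200, 150, 130), (210, 160, 140), (20, 20, 20), (30, 25, 20)]
coords = [(10, 10), (12, 10), (0, 0), (1, 0)]
labels, centers = two_means(pixels)
# Assumption for this toy example: the brighter cluster (label 1) is skin.
new_rect = recenter((0, 0, 4, 4), coords, labels, skin_label=1)
```

In practice the pixels would come from inside the detected face rectangle, and choosing the skin cluster would need its own rule.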

Arm/Leg Importance Functions

• Image specific skin color segmentation
• Least square rectangle fitting for lower-arm & upper-leg hypothesis
• Upper-arm & lower-leg hypothesis from constrained local search
• Gaussian mixture importance function

[Figure: skin color segmentation, rectangle fitting, upper-arm & lower-leg search]
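To make the role of a Gaussian mixture importance function concrete, here is a minimal sketch of importance sampling in one dimension. It assumes each detected hypothesis contributes one mixture component; the 1-D state, the component values, and the `target` density are illustrative placeholders, not values from the slides.

```python
import math
import random


def gauss_pdf(x, mean, std):
    """Density of a 1-D Gaussian."""
    return math.exp(-0.5 * ((x - mean) / std) ** 2) / (std * math.sqrt(2 * math.pi))


def mixture_pdf(x, components):
    """components: list of (weight, mean, std) with weights summing to 1."""
    return sum(w * gauss_pdf(x, m, s) for w, m, s in components)


def sample_mixture(components, rng):
    """Pick a component by weight, then draw from that Gaussian."""
    weights = [w for w, _, _ in components]
    _, mean, std = rng.choices(components, weights=weights, k=1)[0]
    return rng.gauss(mean, std)


def importance_samples(components, target, n, rng):
    """Draw n samples from the mixture and weight them by target / proposal."""
    xs = [sample_mixture(components, rng) for _ in range(n)]
    ws = [target(x) / mixture_pdf(x, components) for x in xs]
    total = sum(ws)
    return xs, [w / total for w in ws]


rng = random.Random(0)
# Two hypothetical hypotheses, e.g. two candidate positions for one limb.
components = [(0.5, 10.0, 1.0), (0.5, 30.0, 2.0)]
# Placeholder target density that only supports the first hypothesis.
xs, ws = importance_samples(components, lambda x: gauss_pdf(x, 10.0, 1.0), 200, rng)
weighted_mean = sum(x * w for x, w in zip(xs, ws))
```

Samples drawn near the unsupported hypothesis receive weights close to zero, so the weighted estimate concentrates on the hypothesis that the target density agrees with.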

Torso Pose Importance Function

• Probabilistic Hough transform to detect line segments
• Lines are assembled to quad-shapes and are pruned
• Canny edge masked likelihoods are evaluated for each good hypothesis Ixt(n)
• Gaussian mixture importance function

[Figure: results from Hough transform; torso hypothesis]

Potential Constraint

• Encode physical constraints of human body parts
• Link points are defined between two adjacent body parts
• The potential function is defined by a Gaussian radial basis function

[Figure: defined link points]
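A Gaussian radial basis function over link points can be sketched as follows. The assumption here is that the potential depends only on the Euclidean distance between the two link points of adjacent parts; the bandwidth `sigma` and the function name are hypothetical.

```python
import math


def link_potential(link_i, link_j, sigma=5.0):
    """Gaussian RBF on the distance between two 2-D link points.

    Returns 1.0 when the link points coincide and decays towards 0
    as the adjacent parts drift apart. sigma is an assumed bandwidth.
    """
    dx = link_i[0] - link_j[0]
    dy = link_i[1] - link_j[1]
    return math.exp(-(dx * dx + dy * dy) / (2.0 * sigma ** 2))


touching = link_potential((40.0, 60.0), (40.0, 60.0))
apart = link_potential((0.0, 0.0), (3.0, 4.0), sigma=5.0)  # distance 5
```

A small `sigma` enforces a tight joint; a larger value tolerates labeling noise at the cost of looser assemblies.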

Likelihood Model

Average normalized steered edge response in R, G, B bands

Likelihood is the maximum of the three
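As a small illustration of the rule above, the sketch below takes per-band edge responses that are assumed to be already steered and normalized to [0, 1], averages them within each band, and keeps the largest average. How the responses are computed along the shape contour is not shown, and the function name is hypothetical.

```python
def band_likelihood(responses_by_band):
    """Maximum over colour bands of the average edge response.

    responses_by_band maps a band name ("R", "G", "B") to a list of
    normalized edge responses sampled along the hypothesized shape.
    """
    return max(sum(r) / len(r) for r in responses_by_band.values())


likelihood = band_likelihood({
    "R": [0.2, 0.4],
    "G": [0.5, 0.7],
    "B": [0.1, 0.1],
})
```

Taking the maximum lets a part score well when its boundary is visible in only one color band.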

Experiment: Likelihood Model

[Figure: translation of the left-lower-leg, and the curve of the likelihood value]

Joint Posterior Distribution

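The joint posterior distribution of the Markov network is

P(X | Z) \propto \prod_{(i,j) \in E} \psi_{ij}(X_i, X_j) \prod_{i=1}^{9} \phi_i(Z_i | X_i)

where X = {X1, X2, …, X9} and E denotes the set of undirected links between adjacent body parts.

The goal is to infer the marginal posterior P(Xi|Z), i.e., P(configuration of body part i | image observation)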
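The product form of the joint posterior can be evaluated up to its normalizing constant by summing log terms. The sketch below assumes caller-supplied `log_phi` and `log_psi` functions; the two-node graph, the 1-D states, and the quadratic log-terms are toy placeholders, not the model from the slides.

```python
def log_joint(states, observations, edges, log_phi, log_psi):
    """Unnormalized log posterior of a pairwise Markov network.

    states, observations: dicts keyed by node name.
    edges: list of (i, j) pairs, one per undirected link.
    """
    # Sum of log observation likelihoods, one per node.
    total = sum(log_phi(observations[i], states[i]) for i in states)
    # Sum of log potentials, one per link.
    total += sum(log_psi(states[i], states[j]) for i, j in edges)
    return total


# Toy example with scalar states (hypothetical values).
states = {"head": 0.0, "torso": 1.0}
observations = {"head": 0.0, "torso": 1.0}
edges = [("head", "torso")]
score = log_joint(
    states,
    observations,
    edges,
    log_phi=lambda z, x: -(z - x) ** 2,
    log_psi=lambda a, b: -(a - b) ** 2,
)
```

Working in log space avoids underflow when many small likelihoods and potentials are multiplied.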

Belief Propagation

Message passing

• Non-Gaussian distribution makes closed form implementation intractable
• Belief propagation Monte Carlo

[Equation annotations: evidence from neighboring nodes; combine with local evidence from observation]
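For reference, the standard belief propagation updates on a pairwise Markov network, written with the symbols defined earlier, are given below. N(i) denotes the neighbors of node i and m_{ij} the message from node i to node j; both names are notation assumed here, not taken from the slides.

```latex
% Message from node i to node j
m_{ij}(X_j) \leftarrow \int \phi_i(Z_i \mid X_i)\, \psi_{ij}(X_i, X_j)
    \prod_{k \in N(i) \setminus j} m_{ki}(X_i)\, dX_i

% Belief (marginal posterior) at node i
P(X_i \mid Z) \propto \phi_i(Z_i \mid X_i) \prod_{k \in N(i)} m_{ki}(X_i)
```

In the belief equation, the product of incoming messages is the evidence from neighboring nodes, and the factor \phi_i is the local evidence from the observation.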

Belief Propagation Monte Carlo
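The following is a simplified sketch, not the algorithm from the slides: it runs sum-product message passing over a fixed, finite set of candidate states per node, which replaces the integral in the message update with a sum over samples. A Monte Carlo variant with importance sampling would additionally draw those candidates from importance functions and weight them accordingly. All names and the toy model are hypothetical.

```python
def sum_product(candidates, edges, phi, psi, iters=5):
    """Sum-product belief propagation over fixed candidate states.

    candidates: {node: [state, ...]}
    edges: list of undirected (i, j) pairs
    phi(node, state): local evidence; psi(a, b): symmetric pairwise potential
    """
    neighbors = {i: [] for i in candidates}
    for i, j in edges:
        neighbors[i].append(j)
        neighbors[j].append(i)
    # msgs[(i, j)][b] is the message from i to j evaluated at candidates[j][b].
    msgs = {(i, j): [1.0] * len(candidates[j])
            for i in candidates for j in neighbors[i]}
    for _ in range(iters):
        new = {}
        for (i, j) in msgs:
            out = []
            for xj in candidates[j]:
                total = 0.0
                for a, xi in enumerate(candidates[i]):
                    # Evidence reaching i from all neighbors except j.
                    incoming = 1.0
                    for k in neighbors[i]:
                        if k != j:
                            incoming *= msgs[(k, i)][a]
                    total += phi(i, xi) * psi(xi, xj) * incoming
                out.append(total)
            norm = sum(out)
            new[(i, j)] = [v / norm for v in out]
        msgs = new
    beliefs = {}
    for i, states in candidates.items():
        # Combine local evidence with all incoming messages.
        b = [phi(i, x) for x in states]
        for k in neighbors[i]:
            b = [v * m for v, m in zip(b, msgs[(k, i)])]
        z = sum(b)
        beliefs[i] = [v / z for v in b]
    return beliefs


# Toy model: node A has strong evidence for state 0, node B has none,
# and the potential forces the two nodes to agree.
beliefs = sum_product(
    candidates={"A": [0, 1], "B": [0, 1]},
    edges=[("A", "B")],
    phi=lambda node, x: (0.9 if x == 0 else 0.1) if node == "A" else 0.5,
    psi=lambda a, b: 1.0 if a == b else 0.0,
)
```

Node B ends up with the same belief as node A even though it has no informative local evidence, which is the effect the message passing is meant to achieve.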

Experimental Results

Comparison of the proposed method with the state of the art (USC, Brown)

Proposed method
• Set up: single frame
• Algorithm: Data Driven Belief Propagation Monte Carlo (DDBPMC)
• Characteristics: efficient (+); well posed problem (+); more robust to lighting change (+); can be applied to ASIMO directly (++); extended to full body tracker easily (+); numerous experiments (++); overall: ++
• Speed: 2 to 3 minutes per frame

State of the art: USC
• Set up: single frame
• Algorithm: Markov Chain Monte Carlo (MCMC)
• Characteristics: ad-hoc (-); not a well posed problem (-); may be sensitive to lighting change (-); not applicable to ASIMO directly (-); may not be extended to full body tracker (-); few results are available (--); overall: -
• Speed: 5+ minutes per frame

State of the art: Brown
• Set up: multi-view and video
• Algorithm: Belief Propagation (BP) and PAMPAS
• Characteristics: systematic (+); works for specific environment (-); may be sensitive to lighting change (-); requires multiple cameras (--); may be extended to full body tracker (+); few results are available (--); overall: +
• Speed: unknown, but should be more than 3 minutes per frame

Limitations of Current Work

• Some skin color regions
• Face in frontal pose
• Reasonable contrast (visible edges)
• Low degree of occlusions

Concluding Remarks

• A novel algorithm for pose estimation
• Principled statistical formulation for recovering human pose in 2-D
• A working prototype
• Work towards full human body tracking