Post on 17-Dec-2015
Learning to estimate human pose with data driven belief propagation
Gang Hua, Ming-Hsuan Yang, Ying Wu
CVPR 05
Project Goal
Learning human motion
Need to know the human body configurations
Detect human body parts from a single image
Where are head, arms, legs, torsos?
Estimate human body configurations
What are the size, location and orientation?
First step (i.e., initialization) for full human body tracking
Statistical Inference
Input: image
Output: location/size/orientation of
• head
• arms
• legs
Challenges
Large variation in pose
Occlusion: some parts are not visible
Lighting variation: affects appearance
Cluttered background: noisy visual cues
High dimensional state variables
Main Idea
Analysis by synthesis (i.e., hypothesize and test)
Statistical inference
Locate body parts using cues
Importance sampling
Learn the shapes of human body parts
Intelligently guess some possible answers, i.e., assembly of body parts
Match each guessed answer with image observation using shape prior and geometry constraints
[Figure: pipeline from image, to potential body parts (visual cues & importance sampling), to assembly of body parts (local observation & belief propagation), to best assembly. Samples shown: head, torso, upper leg, lower arm. Which observed assembly looks most likely to be a human?]
In Plain English
Learning shape: collect prior knowledge of body parts
Importance sampling: intelligent guess of answer
Observation: what is seen in the image, such as appearance, color, and edges
Belief: local evidence
Belief propagation: inference using all relevant local evidence
Potential functions: encode constraints
Markov Network
Xi: pose state of each limb
Zi: image observation of each limb
Ψij(Xi, Xj): each undirected link represents a potential function
Φi(Zi|Xi): each directed link represents an observation likelihood
Goal: infer P(Xi|Z), i.e., P(state variables | image observations)
Learning Body Shapes
For each body part, normalize the labeled shape and learn a low-dimensional representation, psi, using probabilistic Principal Component Analysis (PCA)
Pose parameters: Xi = {psi, sx, sy, θ, tx, ty} (shape coefficients, scale, rotation, translation)
[Figure: (1) normalized shape, (2) originally labeled shape, and (3) reconstructed shape]
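As a rough illustration of the shape-learning step, the sketch below fits a linear subspace to normalized shapes and then projects and reconstructs one of them. It uses ordinary PCA via SVD, which is a simplification: the slides name probabilistic PCA, which additionally models isotropic noise. The function names, the landmark count, and the synthetic data are assumptions made for illustration only.

```python
import numpy as np

def fit_shape_pca(shapes, n_components):
    """Fit a PCA subspace to flattened, already-normalized shapes (one per row)."""
    mean = shapes.mean(axis=0)
    centered = shapes - mean
    # Rows of vt are principal directions, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return mean, vt[:n_components]

def project(shape, mean, basis):
    """Low-dimensional coefficients of one shape."""
    return basis @ (shape - mean)

def reconstruct(coeffs, mean, basis):
    """Map low-dimensional coefficients back to a full shape."""
    return mean + basis.T @ coeffs

# Synthetic stand-in for labeled data: 50 shapes with 8 landmarks (16 coordinates),
# generated from 2 latent factors plus a little noise.
rng = np.random.default_rng(0)
latent = rng.normal(size=(50, 2))
mixing = rng.normal(size=(2, 16))
shapes = latent @ mixing + 0.01 * rng.normal(size=(50, 16))

mean, basis = fit_shape_pca(shapes, n_components=2)
ps = project(shapes[0], mean, basis)      # low-dimensional shape representation
recon = reconstruct(ps, mean, basis)      # reconstructed shape
recon_error = np.linalg.norm(recon - shapes[0])
```

The coefficient vector plays the role of the low-dimensional shape term in the pose parameters; the remaining scale, rotation, and translation terms are not modeled here.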
Face Detection for Head Pose
AdaBoost-based face detector
Detection results are good but not precise
2-class k-means algorithm to cluster skin color pixels
The head pose hypothesis Ixh is obtained by re-centering the face rectangle to the centroid of the skin color cluster and then projecting to the head PCA space
Gaussian importance function
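The re-centering step can be sketched as follows, assuming a face rectangle is already available from some detector. The sketch runs a plain 2-class k-means on pixel colors and moves the rectangle onto the centroid of the skin cluster. The helper names, the initialization, and the rule for deciding which cluster is skin (largest red channel) are hypothetical choices, and the final projection to the head PCA space is omitted.

```python
import numpy as np

def two_means(pixels, n_iter=20):
    """Plain 2-class k-means on an (N, 3) array of colors; returns labels and centers."""
    # Initialize with the darkest and brightest pixels (an arbitrary choice).
    brightness = pixels.sum(axis=1)
    centers = np.stack([pixels[brightness.argmin()],
                        pixels[brightness.argmax()]]).astype(float)
    for _ in range(n_iter):
        dists = np.linalg.norm(pixels[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for k in range(2):
            if np.any(labels == k):
                centers[k] = pixels[labels == k].mean(axis=0)
    return labels, centers

def recenter(rect, coords, labels, skin_label):
    """Move rect (x, y, w, h) so that its center sits on the skin cluster centroid."""
    x, y, w, h = rect
    cx, cy = coords[labels == skin_label].mean(axis=0)
    return (cx - w / 2.0, cy - h / 2.0, w, h)

# Synthetic 20x20 patch: dark background with a skin-toned block.
patch = np.full((20, 20, 3), 20, dtype=np.uint8)
patch[5:15, 10:20] = (200, 150, 120)
ys, xs = np.mgrid[0:20, 0:20]
coords = np.stack([xs.ravel(), ys.ravel()], axis=1).astype(float)
pixels = patch.reshape(-1, 3).astype(float)

labels, centers = two_means(pixels)
skin_label = int(centers[:, 0].argmax())   # assumption: skin cluster is the redder one
new_rect = recenter((0.0, 0.0, 10.0, 10.0), coords, labels, skin_label)
```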
Arm/Leg Importance Functions
Image-specific skin color segmentation
Least-squares rectangle fitting for lower-arm & upper-leg hypotheses
Upper-arm & lower-leg hypotheses from constrained local search
Gaussian mixture importance function
[Figure: skin color segmentation; rectangle fitting; upper-arm & lower-leg search]
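To show how a Gaussian mixture importance function is typically used, here is a minimal one-dimensional sketch. Mixture components are centered on hypotheses (standing in for the fitted rectangles), samples are drawn from the mixture, and each sample is weighted by target density over proposal density. The target density, the component parameters, and all function names are assumptions; the real pose state has more dimensions.

```python
import numpy as np

def gmm_pdf(x, weights, means, stds):
    """Density of a 1-D Gaussian mixture evaluated at each x."""
    x = np.asarray(x, dtype=float)[:, None]
    comp = np.exp(-0.5 * ((x - means) / stds) ** 2) / (stds * np.sqrt(2.0 * np.pi))
    return comp @ weights

def gmm_sample(rng, n, weights, means, stds):
    """Draw n samples: pick a component, then sample from its Gaussian."""
    idx = rng.choice(len(weights), size=n, p=weights)
    return rng.normal(means[idx], stds[idx])

def importance_estimate(rng, target_unnorm, f, n, weights, means, stds):
    """Self-normalized importance sampling estimate of E_target[f(x)]."""
    x = gmm_sample(rng, n, weights, means, stds)
    w = target_unnorm(x) / gmm_pdf(x, weights, means, stds)
    w /= w.sum()
    return float(np.sum(w * f(x)))

rng = np.random.default_rng(0)
weights = np.array([0.5, 0.5])   # one component per hypothesis
means = np.array([1.5, 3.0])     # hypothetical hypothesis locations
stds = np.array([1.0, 1.0])

# Hypothetical unnormalized target: a Gaussian centered at 2.0 with std 0.5.
target = lambda x: np.exp(-0.5 * ((x - 2.0) / 0.5) ** 2)
est = importance_estimate(rng, target, lambda x: x, 20000, weights, means, stds)
```

The estimate of the target mean comes out close to 2.0 even though neither component is centered there, which is the point of weighting the samples.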
Torso Pose Importance Function
Probabilistic Hough transform to detect line segments
Lines are assembled to quad-shapes and are pruned
Canny edge masked likelihood t(n) are evaluated for each good hypothesis Ixt(n)
Gaussian mixture importance function
[Figure: results from Hough transform; torso hypothesis]
Potential Constraint
Encode physical constraints of human body parts
Link points are defined between two adjacent body parts
The potential function is defined by a Gaussian radial basis function
[Figure: defined link points]
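A Gaussian radial basis potential on link points could look like the sketch below. Each part carries a link point in its own coordinates, the point is mapped into the image by the part's pose, and the potential decays with the squared distance between the two mapped points. The pose layout (sx, sy, theta, tx, ty), the scale-then-rotate order, the offsets, and the value of sigma are assumptions, not values from the slides.

```python
import numpy as np

def link_point(pose, local_offset):
    """Map a link point from part coordinates into the image.

    pose = (sx, sy, theta, tx, ty); scale is applied before rotation (an assumption).
    """
    sx, sy, theta, tx, ty = pose
    scaled = np.array([sx * local_offset[0], sy * local_offset[1]])
    c, s = np.cos(theta), np.sin(theta)
    rot = np.array([[c, -s], [s, c]])
    return rot @ scaled + np.array([tx, ty])

def potential(pose_i, offset_i, pose_j, offset_j, sigma=5.0):
    """Gaussian RBF on the distance between the two parts' link points."""
    d = link_point(pose_i, offset_i) - link_point(pose_j, offset_j)
    return float(np.exp(-np.dot(d, d) / (2.0 * sigma ** 2)))

torso = (1.0, 1.0, 0.0, 100.0, 100.0)
leg_aligned = (1.0, 1.0, 0.0, 100.0, 170.0)
leg_shifted = (1.0, 1.0, 0.0, 110.0, 170.0)
hip_on_torso = (0.0, 40.0)    # hypothetical link point, torso coordinates
hip_on_leg = (0.0, -30.0)     # hypothetical link point, leg coordinates

p_aligned = potential(torso, hip_on_torso, leg_aligned, hip_on_leg)
p_shifted = potential(torso, hip_on_torso, leg_shifted, hip_on_leg)
```

When the link points coincide the potential is 1, and it falls off smoothly as the parts drift apart, so physically implausible assemblies are penalized rather than forbidden.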
Likelihood Model
Average normalized steered edge response in R, G, B bands
Likelihood is the maximum of the three
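One possible reading of this likelihood is sketched below. For each color band, the image gradient is projected onto the boundary normal at each sampled boundary point, normalized, and averaged, and the likelihood is the maximum over the three bands. How the slides normalize the response is not stated, so dividing by the band's largest gradient magnitude is an assumption, as are the function name and the test image.

```python
import numpy as np

def edge_likelihood(image, points, normals, eps=1e-8):
    """Max over R, G, B of the average normalized steered edge response.

    image   : (H, W, 3) float array
    points  : (N, 2) integer array of (x, y) boundary samples
    normals : (N, 2) unit normals of the hypothesized boundary at those samples
    """
    xs, ys = points[:, 0], points[:, 1]
    scores = []
    for band in range(3):
        gy, gx = np.gradient(image[:, :, band])
        mag_max = np.sqrt(gx ** 2 + gy ** 2).max() + eps
        # Gradient component along the boundary normal ("steered" response).
        steered = np.abs(gx[ys, xs] * normals[:, 0] + gy[ys, xs] * normals[:, 1])
        scores.append(float(np.mean(steered / mag_max)))
    return max(scores)

# Test image: a vertical step edge at x = 16 in the red band only.
image = np.zeros((32, 32, 3))
image[:, 16:, 0] = 1.0
ys = np.arange(5, 26)
on_edge_pts = np.stack([np.full_like(ys, 16), ys], axis=1)
off_edge_pts = np.stack([np.full_like(ys, 5), ys], axis=1)
normals = np.tile([1.0, 0.0], (len(ys), 1))

on_edge = edge_likelihood(image, on_edge_pts, normals)
off_edge = edge_likelihood(image, off_edge_pts, normals)
```

Taking the maximum over bands means a boundary that is visible in only one color channel still scores well.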
Experiment: Likelihood Model
[Figure: translation of the left-lower-leg; curve for the likelihood value]
Joint Posterior Distribution
The joint posterior distribution of the Markov network is
where X = {X1, X2, …, X9}
The goal is to infer the marginal posterior P(Xi|Z), i.e.,
P(configuration of body part i | image observation)
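Assuming the usual factorization of a pairwise Markov network, with the potentials Ψij and the observation likelihoods Φi defined for the network, the joint posterior and the marginal of interest take the following form. The edge set E (the undirected links between adjacent body parts) is notation introduced here for illustration.

```latex
P(X \mid Z) \;\propto\; \prod_{(i,j) \in E} \Psi_{ij}(X_i, X_j) \; \prod_{i=1}^{9} \Phi_i(Z_i \mid X_i),
\qquad
P(X_i \mid Z) \;=\; \int P(X \mid Z) \, \prod_{k \neq i} dX_k
```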
Belief Propagation
Message passing
Non-Gaussian distribution makes closed-form implementation intractable
Belief propagation Monte Carlo
[Equation annotations: evidence from neighboring nodes; combine with local evidence from observation]
Belief Propagation Monte Carlo
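The message-passing idea can be illustrated on a small network with discrete states, where the sums are exact. This is a deliberate simplification and not the Monte Carlo algorithm of the talk, which represents messages and beliefs with weighted samples because the pose states are continuous and non-Gaussian. The graph, the number of states, and the random potentials are made up for illustration.

```python
import itertools
import numpy as np

def belief_propagation(phi, psi, edges, n_iter=10):
    """Sum-product BP on a pairwise Markov network with discrete states.

    phi[i]      : (K,) local evidence for node i
    psi[(i, j)] : (K, K) potential, indexed [x_i, x_j]
    edges       : list of undirected (i, j) pairs
    """
    n, k = len(phi), len(phi[0])
    neighbors = {i: [] for i in range(n)}
    for i, j in edges:
        neighbors[i].append(j)
        neighbors[j].append(i)

    def pairwise(i, j):
        return psi[(i, j)] if (i, j) in psi else psi[(j, i)].T

    msgs = {(i, j): np.ones(k) / k for i in neighbors for j in neighbors[i]}
    for _ in range(n_iter):
        new = {}
        for (i, j) in msgs:
            # Local evidence at i times messages from all neighbors except j.
            incoming = phi[i].copy()
            for m in neighbors[i]:
                if m != j:
                    incoming = incoming * msgs[(m, i)]
            out = pairwise(i, j).T @ incoming   # sum over x_i
            new[(i, j)] = out / out.sum()
        msgs = new

    beliefs = []
    for i in range(n):
        b = phi[i].copy()
        for m in neighbors[i]:
            b = b * msgs[(m, i)]
        beliefs.append(b / b.sum())
    return beliefs

def brute_force_marginals(phi, psi, edges):
    """Exact marginals by enumerating every joint state (for checking only)."""
    n, k = len(phi), len(phi[0])
    marg = np.zeros((n, k))
    for states in itertools.product(range(k), repeat=n):
        p = np.prod([phi[i][s] for i, s in enumerate(states)])
        for i, j in edges:
            p *= psi[(i, j)][states[i], states[j]]
        for i, s in enumerate(states):
            marg[i, s] += p
    return marg / marg.sum(axis=1, keepdims=True)

rng = np.random.default_rng(0)
edges = [(0, 1), (0, 2), (2, 3)]   # a small tree, e.g. a torso with attached limbs
phi = [rng.random(3) + 0.1 for _ in range(4)]
psi = {e: rng.random((3, 3)) + 0.1 for e in edges}

beliefs = belief_propagation(phi, psi, edges)
exact = brute_force_marginals(phi, psi, edges)
```

On a tree the beliefs match the exact marginals once messages have traveled across the graph. On a graph with loops the same updates give only an approximation.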
Experimental Results
Comparison of the proposed method with the state of the art (USC, Brown)

Criterion | Proposed method | USC | Brown
Set up | Single frame | Single frame | Multi-view and video
Algorithm | Data Driven Belief Propagation Monte Carlo (DDBPMC) | Markov Chain Monte Carlo (MCMC) | Belief Propagation (BP) and PAMPAS
Characteristics | efficient, + | ad hoc, - | systematic, +
 | well-posed problem, + | not a well-posed problem, - | works for a specific environment, -
 | more robust to lighting change, + | may be sensitive to lighting change, - | may be sensitive to lighting change, -
 | can be applied to ASIMO directly, ++ | not applicable to ASIMO directly, - | requires multiple cameras, --
 | extended to full body tracker easily, + | may not be extended to full body tracker, - | may be extended to full body tracker, +
 | numerous experiments, ++ | few results are available, -- | few results are available, --
 | overall: ++ | overall: - | overall: +
Speed | 2 to 3 minutes per frame | 5+ minutes per frame | Unknown, but should be more than 3 minutes
Limitations of Current Work
Some skin color regions
Face in frontal pose
Reasonable contrast (visible edges)
Low degree of occlusions
Concluding Remarks
A novel algorithm for pose estimation
Principled statistical formulation in recovering human pose in 2-D
A working prototype
Work towards full human body tracking