3D Human Body Pose Estimation from Monocular Video Moin Nabi Computer Vision Group Institute for...

3D Human Body Pose Estimation from Monocular Video

Moin Nabi

Computer Vision Group

Institute for Research in Fundamental Sciences (IPM)

Introduction to Human Pose Estimation

Articulated pose estimation from single-view monocular image(s)

Applications of Human Pose Estimation

■ Entertainment: Animation, Games

■ Security: Surveillance

■ Understanding: Gesture/Activity recognition

Difficulties of Human Pose estimation

■ Appearance/size/shape of people can vary dramatically

■ The bones and joints are only indirectly observable (obscured by clothing)

■ Occlusions

■ High dimensionality of the state space

■ Loss of depth information in 2D image projections

Difficulties of Human Pose estimation

■ Challenging Human Motion

Problem Backgrounds

■ Break up a very hard problem into smaller manageable pieces

Goal: Reliable 3D Human Pose Estimation from single-camera input

Graphical model (definition)

Nodes: random variables Xi

Edges: conditional probabilities P(Xj | Xi)

Graphical model (Examples)

Graphical model (Inference)

discrete

continuous

Belief propagation
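For the discrete case, belief propagation can be illustrated on a tiny example. The sketch below is not taken from the slides: it assumes a hypothetical 3-node chain with two states per node and made-up potentials (`phi`, `psi`), passes messages left-to-right and right-to-left, and compares the resulting marginals with brute-force enumeration.

```python
from itertools import product

# Hypothetical 3-node chain X1 - X2 - X3, each variable with K = 2 states.
# phi[i][a] : unary potential of node i in state a (made-up numbers)
# psi[a][b] : pairwise potential, left node in state a, right node in state b
phi = [[0.7, 0.3], [0.5, 0.5], [0.2, 0.8]]
psi = [[0.9, 0.1], [0.1, 0.9]]
K = 2


def normalize(v):
    s = sum(v)
    return [x / s for x in v]


def chain_marginals(phi, psi):
    n = len(phi)
    # fwd[i]: message reaching node i from the left; bwd[i]: from the right
    fwd = [[1.0] * K for _ in range(n)]
    bwd = [[1.0] * K for _ in range(n)]
    for i in range(1, n):
        fwd[i] = normalize([
            sum(phi[i - 1][a] * fwd[i - 1][a] * psi[a][b] for a in range(K))
            for b in range(K)
        ])
    for i in range(n - 2, -1, -1):
        bwd[i] = normalize([
            sum(phi[i + 1][b] * bwd[i + 1][b] * psi[a][b] for b in range(K))
            for a in range(K)
        ])
    # Belief at a node = its unary potential times all incoming messages
    return [normalize([phi[i][a] * fwd[i][a] * bwd[i][a] for a in range(K)])
            for i in range(n)]


def brute_force_marginals(phi, psi):
    n = len(phi)
    marg = [[0.0] * K for _ in range(n)]
    for states in product(range(K), repeat=n):
        w = 1.0
        for i, s in enumerate(states):
            w *= phi[i][s]
        for i in range(n - 1):
            w *= psi[states[i]][states[i + 1]]
        for i, s in enumerate(states):
            marg[i][s] += w
    return [normalize(m) for m in marg]


bp = chain_marginals(phi, psi)
exact = brute_force_marginals(phi, psi)
```

On a chain (or any tree) the two results agree; on graphs with loops, belief propagation is only approximate.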

(a) monocular input image, with bottom-up limb proposals overlaid in (b); (c) distribution over 2D limb poses computed using non-parametric belief propagation; (d) sample of a 3D body pose generated from the 2D pose; (e) illustration of tracking.

Hierarchical Inference Framework

Hierarchical Inference Framework

Inferring 2D pose

2D Loose-Limbed Body Model

Graphical Modeling the Person

X = {X1,X2, ...,XP}

Each part is described by its 2D position, rotation, scale and foreshortening: Xi ∈ R^5

Modeling the constraints

Modeling the constraints

■ Kinematic Constraints

■ Occlusion Constraints …

Joint probability
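The formula on this slide did not survive extraction. As an assumption only, a common way to write the joint for a pairwise graphical model over the parts X = {X1, ..., XP} given an image I is shown below. Here E (the set of edges), ψ_ij (a pairwise constraint potential) and φ_i (a local image likelihood) are illustrative names, not necessarily the author's notation.

```latex
p(X \mid I) \;\propto\; \prod_{(i,j) \in E} \psi_{ij}(X_i, X_j) \; \prod_{i=1}^{P} \phi_i(I \mid X_i)
```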

Limb proposal

Discretize the state space of each part into:

■ 5 scales

■ 5 foreshortenings

■ 20 vertical positions

■ 20 horizontal positions

■ 8 rotations

5 × 5 × 20 × 20 × 8 = 80,000 discrete states to evaluate

Evaluate the likelihood function at each state

Choose the 100 most likely states for each part
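The discretize-then-prune step can be sketched in a few lines. This is a minimal illustration, not the presented system: `toy_likelihood` is a hypothetical stand-in for the real image likelihood, and states are plain grid indices rather than image coordinates.

```python
import heapq
from itertools import product

# Grid resolution per dimension, as listed on the slide.
N_SCALE, N_FORESHORTEN, N_Y, N_X, N_ROT = 5, 5, 20, 20, 8
TOP_K = 100  # proposals kept per part


def toy_likelihood(state):
    """Hypothetical score for one part state; a real system scores image features."""
    s, f, y, x, r = state
    # Peaks at an arbitrary grid cell chosen only for illustration.
    return -((s - 2) ** 2 + (f - 1) ** 2 + (y - 10) ** 2
             + (x - 7) ** 2 + (r - 3) ** 2)


# Enumerate every discrete state of a single part.
states = list(product(range(N_SCALE), range(N_FORESHORTEN),
                      range(N_Y), range(N_X), range(N_ROT)))

# Keep only the highest-scoring states as limb proposals.
proposals = heapq.nlargest(TOP_K, states, key=toy_likelihood)
```

Running this once per body part yields a small proposal set per part, which is far cheaper to reason over than the full 80,000-state grid.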

Image likelihood

In defining the image likelihood, we combine edge, silhouette and color features.

Approximate the global likelihood with a product of local terms
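A product of local terms is usually computed as a sum of logs for numerical stability. The sketch below assumes, purely for illustration, that the three features of each part are combined as if independent (the slides do not say how they are combined), and the feature values are made up.

```python
import math

# Hypothetical per-part feature likelihoods for P = 3 parts (made-up values).
part_features = [
    {"edge": 0.8, "silhouette": 0.9, "color": 0.7},
    {"edge": 0.6, "silhouette": 0.5, "color": 0.9},
    {"edge": 0.4, "silhouette": 0.8, "color": 0.6},
]


def local_log_likelihood(feats):
    # Assumption: features are combined as a product (independence).
    return sum(math.log(v) for v in feats.values())


def global_log_likelihood(parts):
    # A product of local terms equals a sum of local log-likelihoods.
    return sum(local_log_likelihood(f) for f in parts)


log_lik = global_log_likelihood(part_features)
```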

Non-parametric Belief Propagation

Use an iterative message-passing method to find better poses
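For reference, the standard belief propagation message update for continuous variables is given below. The notation is assumed rather than taken from the slides: ψ_ij is a pairwise potential, φ_i a local likelihood, and A(i) the set of neighbours of node i. Non-parametric variants typically approximate the integral with a set of weighted samples instead of evaluating it in closed form.

```latex
m_{ij}(X_j) \;=\; \int \psi_{ij}(X_i, X_j)\, \phi_i(X_i) \prod_{k \in A(i) \setminus j} m_{ki}(X_i)\; dX_i
```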

2D Loose-Limbed Body Model (summary)

Result

Inferring 3D pose from 2D


Mixture of Experts (MoE)


Problem: p(Y|X) is a non-linear mapping, and is not one-to-one

Solution: p(Y|X) may be approximated by a set of locally linear mappings (experts)

MoE Formally

Training of the MoE is done using the EM procedure (similar to learning a Mixture of Gaussians)
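To make the idea of gated, locally linear experts concrete, here is a minimal prediction-only sketch for scalar X and Y. It is an illustration under stated assumptions, not the model from the slides: the Gaussian gate form, the two hand-set experts and all parameter values are hypothetical, and EM training is not shown.

```python
import math


def gauss(x, mean, var):
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2 * math.pi * var)


# Two hypothetical experts. Each has a prior weight "pi", a Gaussian gate
# over x ("mu", "var"), and a linear map y = a * x + b.
experts = [
    {"pi": 0.5, "mu": 0.0, "var": 1.0, "a": 1.0, "b": 2.0},
    {"pi": 0.5, "mu": 0.0, "var": 1.0, "a": -1.0, "b": -2.0},
]


def gates(x):
    # Responsibility of each expert for input x, normalized to sum to 1.
    w = [e["pi"] * gauss(x, e["mu"], e["var"]) for e in experts]
    total = sum(w)
    return [v / total for v in w]


def predict(x):
    """Return (gate weight, expert mean) pairs: one mode of p(y|x) per expert."""
    return [(g, e["a"] * x + e["b"]) for g, e in zip(gates(x), experts)]


modes = predict(1.0)
```

With these parameters the same input produces two equally weighted predictions, which is how a mixture can represent a mapping that is not one-to-one (for example, several 3D poses consistent with one 2D pose).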

Illustration of 3D pose inference



Mixture of Experts (MoE)

Hidden Markov Model (HMM)

Result

Thank You
