2011-6-3 Monocular 3D Pose Estimation and Tracking by Detection
Transcript of 2011-6-3 Monocular 3D Pose Estimation and Tracking by Detection
-
8/3/2019 2011-6-3 Monocular 3D Pose Estimation and Tracking by Detection
1/15
Monocular 3D Pose Estimation
and Tracking by Detection
Mykhaylo Andriluka
Stefan Roth
Bernt Schiele
-
3D Pose Estimation
Estimate positions and angles of individual body parts in 3D space
Monocular refers to a single-camera system
Very reliable in controlled situations, as used in motion tracking
Currently poor performance in realistic scenes
Frequently relies on edge detection / background subtraction
Potential problems: loose clothing, occlusions, ego motion, background clutter
-
Why is it interesting for us?
Accurate body pose estimation makes action recognition practically trivial
-
This paper
Performs 3D pose estimation of multiple
people simultaneously with a single camera in
a realistic street scene
-
Pictorial Structures Model
2D part-based model: each part i at frame m is
represented by l_i^m = (x_i^m, y_i^m, θ_i^m, s_i^m),
i.e. its image position, orientation, and scale
L^m: overall part configuration at frame m
D^m: visual evidence at frame m
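A minimal sketch of this per-part state in Python; the names and the dict-of-parts representation are illustrative, not the authors' code:

```python
from dataclasses import dataclass

@dataclass
class PartState:
    """State l_i^m of one body part i at frame m (illustrative names)."""
    x: float      # image x-position
    y: float      # image y-position
    theta: float  # in-plane orientation
    scale: float  # part scale

# L^m is then the configuration of all parts at frame m, e.g. a dict:
L_m = {"torso": PartState(x=120.0, y=80.0, theta=0.1, scale=1.0)}
```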
-
Pictorial Structures Model
Body represented as left/right lower and
upper legs, torso, head and left/right upper
and lower arms
Each body part is detected individually by part
detectors
The posterior probability p(L^m | D^m) is maximized to
detect the body
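As a toy illustration of what maximizing this posterior means, the sketch below scores a 1-D stand-in configuration with unary appearance terms (agreement with detector peaks) and pairwise kinematic terms over the body tree; all numbers, names, and variances are invented:

```python
def log_posterior(parts, peaks, edges, sigma_app=5.0, sigma_kin=10.0):
    """log p(L|D) up to a constant: per-part appearance terms plus
    kinematic terms over the tree edges of the body model (toy 1-D)."""
    unary = sum(-((parts[i] - peaks[i]) ** 2) / (2 * sigma_app ** 2)
                for i in parts)
    pair = sum(-((parts[i] - parts[j]) ** 2) / (2 * sigma_kin ** 2)
               for i, j in edges)
    return unary + pair

# 1-D toy: detector peaks for two parts, one kinematic edge between them.
peaks = {"torso": 100.0, "head": 60.0}
edges = [("torso", "head")]
good = log_posterior({"torso": 100.0, "head": 60.0}, peaks, edges)
bad = log_posterior({"torso": 100.0, "head": 300.0}, peaks, edges)
```

A configuration that agrees with the evidence and the kinematic prior scores higher, which is exactly what the maximization exploits.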
-
Viewpoint Estimation
This method only detects people, and only
from a single viewpoint
This paper trains 10 such detectors on a
multiview dataset; each detector assumes a
different viewpoint
This gives us viewpoint estimation: find the
detector with the strongest response to the
scene
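The selection step amounts to an argmax over detector responses. A hedged sketch with invented scores; the 36° spacing of the 10 viewpoints is an assumption for illustration, not taken from the paper:

```python
# Invented responses for 10 viewpoint-specific detectors, keyed by the
# viewpoint angle each detector assumes (36° spacing is an assumption).
responses = {angle: score for angle, score in zip(
    range(0, 360, 36),
    [0.10, 0.30, 0.90, 0.40, 0.20, 0.10, 0.05, 0.10, 0.20, 0.30])}

def estimate_viewpoint(responses):
    """Viewpoint estimation: the detector with the strongest response wins."""
    return max(responses, key=responses.get)
```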
-
Tracklet Extraction
Want to extract a track of each person: relating temporal states gives us more information for body pose estimation, and even more robustness against occlusion
Use the pictorial structures model as a detector, to get bounding boxes and the likely viewpoint at each frame, for each person
Treat bounding boxes and viewpoint probabilities as emissions, and hypotheses as states, in a Hidden Markov Model
Use Viterbi decoding to extract the most likely sequence of states/viewpoints
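Viterbi decoding itself is standard dynamic programming; the sketch below runs it on a toy two-viewpoint HMM with invented probabilities, not the paper's actual states or emissions:

```python
import math

def viterbi(states, start_p, trans_p, emit_p, observations):
    """Most likely hidden-state sequence under an HMM (log-space)."""
    V = [{s: math.log(start_p[s] * emit_p[s][observations[0]]) for s in states}]
    back = []
    for obs in observations[1:]:
        col, ptr = {}, {}
        for s in states:
            prev = max(states, key=lambda p: V[-1][p] + math.log(trans_p[p][s]))
            col[s] = V[-1][prev] + math.log(trans_p[prev][s] * emit_p[s][obs])
            ptr[s] = prev
        V.append(col)
        back.append(ptr)
    # Trace back from the best final state.
    path = [max(V[-1], key=V[-1].get)]
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    return path[::-1]

# Toy example: two viewpoint states, sticky transitions, noisy detections.
states = ["front", "side"]
start_p = {"front": 0.6, "side": 0.4}
trans_p = {"front": {"front": 0.8, "side": 0.2},
           "side": {"front": 0.2, "side": 0.8}}
emit_p = {"front": {"f": 0.9, "s": 0.1},
          "side": {"f": 0.2, "s": 0.8}}
decoded = viterbi(states, start_p, trans_p, emit_p, ["f", "f", "s"])
```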
-
Tracklet Extraction
Transition probabilities between states:
For viewpoints: high transition probabilities
between similar viewpoints, to reflect that people
turn slowly
For bounding boxes: transition probability is
proportional to the similarity of the RGB colour
histograms within each bounding box
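One plausible way to turn histogram agreement into transition probabilities; histogram intersection and the normalization are assumptions here, the paper's exact distance may differ:

```python
def hist_similarity(h1, h2):
    """Histogram intersection: 1.0 for identical normalized histograms."""
    return sum(min(a, b) for a, b in zip(h1, h2))

def transition_probs(current_hist, candidate_hists):
    """Transition probabilities from one bounding-box hypothesis to each
    candidate in the next frame, proportional to appearance similarity."""
    sims = [hist_similarity(current_hist, h) for h in candidate_hists]
    total = sum(sims)
    return [s / total for s in sims]

# Toy 2-bin histograms: the first candidate matches the current box exactly.
probs = transition_probs([0.5, 0.5], [[0.5, 0.5], [1.0, 0.0]])
```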
-
3D Pose Estimation
Use 2D-to-3D exemplars to pick the most likely 3D
pose in the tracklet, for each frame. This gives us
M body pose hypotheses for each frame,
where M is the length of the tracklet
3D body pose at frame m: Q^m = {q^m, φ^m, h^m}
q: joint configuration
φ: body rotation in the 3D world
h: position and scale of the body
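A hedged sketch of exemplar-based 2D-to-3D lookup as a nearest-neighbour search; the exemplar data structure and distance are invented for illustration:

```python
def nearest_exemplar(pose_2d, exemplars):
    """Return the 3D pose whose stored 2D projection is closest
    (sum of squared differences) to the estimated 2D pose."""
    def ssd(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(exemplars, key=lambda e: ssd(e["proj_2d"], pose_2d))["pose_3d"]

# Two toy exemplars; the estimate [0.9, 2.1] is closer to the first.
exemplars = [{"proj_2d": [1.0, 2.0], "pose_3d": "pose_A"},
             {"proj_2d": [5.0, 5.0], "pose_3d": "pose_B"}]
match = nearest_exemplar([0.9, 2.1], exemplars)
```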
-
Representation of Pose
-
3D Pose Estimation
Single-frame likelihood (equation shown on slide)
Breakdown of its terms (shown on slide)
-
Position of Body Parts
To reduce the computational complexity
of 3D body part estimation, we find the J most
likely locations for each body part n in frame
m, and then fit a Gaussian distribution to that body part's locations
This allows the posterior probability to be
modelled as: (equation shown on slide)
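A sketch of the top-J Gaussian approximation; the score weighting and diagonal covariance are simplifying assumptions for illustration:

```python
def part_location_gaussian(scored_locations, J=5):
    """Keep the J most likely (score, x, y) detections of one body part and
    fit a score-weighted Gaussian (diagonal covariance) to their positions."""
    top = sorted(scored_locations, reverse=True)[:J]
    w = sum(s for s, _, _ in top)
    mu_x = sum(s * x for s, x, _ in top) / w
    mu_y = sum(s * y for s, _, y in top) / w
    var_x = sum(s * (x - mu_x) ** 2 for s, x, _ in top) / w
    var_y = sum(s * (y - mu_y) ** 2 for s, _, y in top) / w
    return (mu_x, mu_y), (var_x, var_y)

# Toy detections for one part: two strong hits agree, one weak outlier,
# and J=2 drops the outlier before fitting.
mu, var = part_location_gaussian(
    [(1.0, 10.0, 20.0), (0.5, 10.0, 20.0), (0.1, 50.0, 50.0)], J=2)
```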
-
hGPLVM
Given the above information, with a prior of the form
p(Q^{1:m}) = p(q^{1:m}) p(h^{1:m}), we can
estimate the posterior probability of the
frames using a hierarchical Gaussian Process Latent
Variable Model (hGPLVM)
This models the sequence of poses as a
Gaussian process, and solves using MAP
estimation
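MAP estimation over pose sequences can be sketched with a toy smoothness prior standing in for the hGPLVM; everything below (1-D "poses", the likelihood, the prior) is illustrative only:

```python
def map_estimate(candidates, loglik, log_prior):
    """MAP over candidate pose sequences: argmax of log-likelihood + log-prior.
    (A toy smoothness prior stands in for the paper's hGPLVM prior.)"""
    return max(candidates,
               key=lambda seq: sum(loglik(q) for q in seq) + log_prior(seq))

# Toy 1-D "poses": likelihood prefers values near 1, prior prefers smoothness.
def loglik(q):
    return -(q - 1.0) ** 2

def log_prior(seq):
    return -sum((a - b) ** 2 for a, b in zip(seq, seq[1:]))

smooth = [1.0, 1.0, 1.0]
jumpy = [1.0, 5.0, 1.0]
best = map_estimate([smooth, jumpy], loglik, log_prior)
```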
-
Pose Estimation Examples