Proxemics Recognition - GitHub Pages · 2019. 9. 15. · Proxemics Recognition Yi Yang 1, Simon...
Transcript of Proxemics Recognition - GitHub Pages · 2019. 9. 15. · Proxemics Recognition Yi Yang 1, Simon...
Proxemics Recognition
Yi Yang1, Simon Baker, Anitha Kannan, Deva Ramanan1
1Department of Computer Science, UC Irvine
Proxemics
Proxemics: the study of spatial arrangement of people as they interact
- anthropologist Edward T. Hall in 1963
Brother and Sister Holding Hands
Friends WalkingSide by Side
Husband Hugging and Holding Wife’s HandMom Holding Baby
Touch Code• Hand Touch Hand• Hand Touch Shoulder
• Shoulder Touch Shoulder• Arm Touch Torso
Applications
• Personal Photo Search:– Find a specific interesting photo
• Analysis of TV shows and movies• Kinect• Web Search• Auto-Movie/Auto-Slideshow• Locate interesting scenes
Proxemics DataSet• 200 training images• 150 testing images
• Collected From – Simon, Bing, Google, Gettyimage, Flickr
• No video data• No Kinect 3d depth data
Proxemics DataSet
Number of People
Com
ple
xity
A Naïve ApproachInput Image
50 100 150 200 250 300 350 400 450 500
50
100
150
200
250
300
A Naïve ApproachInput Image
50 100 150 200 250 300 350 400 450 500
50
100
150
200
250
300
Image Feature
A Naïve ApproachInput Image
50 100 150 200 250 300 350 400 450 500
50
100
150
200
250
300
Pose Estimation i.e. Find skeletons
50 100 150 200 250 300 350 400 450 500
50
100
150
200
250
300
50 100 150 200 250 300 350 400 450 500
50
100
150
200
250
300
Image Feature
A Naïve ApproachInput Image
Interaction Recognition
50 100 150 200 250 300 350 400 450 500
50
100
150
200
250
300
Hand touch Hand
Pose Estimation i.e. Find skeletons
50 100 150 200 250 300 350 400 450 500
50
100
150
200
250
300
50 100 150 200 250 300 350 400 450 500
50
100
150
200
250
300
Image Feature
Naïve Approach Results
Hand touch Hand Hand touch Shoulder
0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 10
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
1
recall
prec
isio
n
0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 10
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
1
recall
prec
isio
n
0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 10
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
1
recall
prec
isio
n
From pose estimationRandom guess
Human Pose Estimation
• Not bad when no real interaction between people
- Y. Yang & D. Ramanan, CVPR 2011
Interactions Hurt Pose Estimation
Occlusion + Ambiguous Parts
Our ApproachDirect Proxemics Recognition
Input Image
50 100 150 200 250 300 350 400 450 500
50
100
150
200
250
300
Image Feature
Our ApproachDirect Proxemics Recognition
Input Image
Interaction Recognition
50 100 150 200 250 300 350 400 450 500
50
100
150
200
250
300
Image Feature
50 100 150 200 250 300 350 400 450 500
50
100
150
200
250
300
Hand touch Hand
Pictorial Structure Model
• !: Image• "#: Location of part $
- P. Felzenszwalb etc., PAMI 2009
Pictorial Structure Model
• !" : Unary template for part #• $(&, ("): Local image features at location ("
Pictorial Structure Model
• !(#$, #&): Spatial features between #$ and #&• ($&: Pairwise springs between part ) and part *
“Two Head Monster” Modeli.e. “Hand-Touching-Hand”
“Two Head Monster” Modeli.e. “Hand-Touching-Hand”
50 100 150 200 250 300 350 400 450 500
50
100
150
200
250
300
The modelsHand touch Hand Hand touch Shoulder
Shoulder touch Shoulder Arm touch Torso
50 100 150 200 250 300 350 400 450 500
50
100
150
200
250
300
Match Model to Image
• Inference:max$ %(', ))
– Efficient algorithm: – Dynamic programming
• Learning:– Structural SVM Solver
Refinements + Extensions
• Sub-categoriesBecause of symmetry, 4 models for hand-hand etc
R. Hand L. Hand
L. Hand L. Hand
L. Hand R. Hand
R. Hand R. Hand
Refinements + Extensions
• Sub-categoriesBecause of symmetry, 4 models for hand-hand etc
• Co-occurrence of proxemics:Reduce redundancy, map Multi-Label -> Multi-Class
R. Hand L. Hand
L. Hand L. Hand
L. Hand R. Hand
R. Hand R. Hand
Naïve Approach Results
Hand touch Hand Hand touch Shoulder
0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 10
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
1
recall
prec
isio
n
0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 10
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
1
recall
prec
isio
n
0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 10
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
1
recall
prec
isio
n
From pose estimationRandom guess
Direct Approach Results
0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 10
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
1
recall
prec
isio
n
From pose estimationRandom guessOur model
Hand touch Hand Hand touch Shoulder
0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 10
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
1
recall
prec
isio
n
0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 10
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
1
recall
prec
isio
n
Quantitative Results
[1] Y. Yang & D. Ramanan, CVPR 2011
Improves Pose EstimationY & D CVPR 2011
Our Model
Improves Pose EstimationY & D CVPR 2011
Our Model
Improves Pose EstimationY & D CVPR 2011
Our Model
Conclusion• Proxemics and touch codes
for human interaction
Conclusion• Proxemics and touch codes
for human interaction
• Directly recognizing proxemics significantly outperforms
Conclusion• Proxemics and touch codes
for human interaction
• Directly recognizing proxemics significantly outperforms
• Recognizing proxemics helps pose estimation
50 100 150 200 250 300 350 400 450 500
50
100
150
200
250
300
Acknowledgements
Thank Simon and MSR for internship
Acknowledgements
Thank Anitha for a lot of suggestions
Acknowledgements
Thank Anarb for gettyimages
Acknowledgements
Thank Eletee for her beautiful smiling
Acknowledgements
Thank everybody for not falling asleep
Thank you
Articulated Pose Estimation
- Yi Yang & Deva Ramanan, CVPR 2011
max$,& '(), *,+)
For a tree graph (V,E): dynamic programming
min/12 2
s. t. ∀7 ∈ pos 2 ; < )=, >= ≥ 1∀7 ∈ neg, ∀> 2 ; < )=, > ≤ −1
Given labeled positive {)=, *=,+=} and negative {)=}, write >= = (*=,+=), and ' ), > = 2 ; < ), >
Inference & Learning
Learning
Inference