Kinect Case S tudy
description
Transcript of Kinect Case S tudy
![Page 2: Kinect Case S tudy](https://reader035.fdocuments.us/reader035/viewer/2022081422/56816385550346895dd46cce/html5/thumbnails/2.jpg)
![Page 3: Kinect Case S tudy](https://reader035.fdocuments.us/reader035/viewer/2022081422/56816385550346895dd46cce/html5/thumbnails/3.jpg)
Motorized base
![Page 4: Kinect Case S tudy](https://reader035.fdocuments.us/reader035/viewer/2022081422/56816385550346895dd46cce/html5/thumbnails/4.jpg)
http://www.youtube.com/watch?v=dTKlNGSH9Po&feature=related
![Page 5: Kinect Case S tudy](https://reader035.fdocuments.us/reader035/viewer/2022081422/56816385550346895dd46cce/html5/thumbnails/5.jpg)
![Page 6: Kinect Case S tudy](https://reader035.fdocuments.us/reader035/viewer/2022081422/56816385550346895dd46cce/html5/thumbnails/6.jpg)
Depthhttp://www.youtube.com/watch?v=inim0xWiR0o
http://www.youtube.com/watch?v=7TGF30-5KuQ&feature=related
![Page 7: Kinect Case S tudy](https://reader035.fdocuments.us/reader035/viewer/2022081422/56816385550346895dd46cce/html5/thumbnails/7.jpg)
Questions
• Why a dot pattern?• Why a laser?• Why only one IR camera?• Is the dot pattern random?• Why is heat a problem?• How is it calibrated?• Why isn’t depth computed everywhere?• Would it work outside?
![Page 8: Kinect Case S tudy](https://reader035.fdocuments.us/reader035/viewer/2022081422/56816385550346895dd46cce/html5/thumbnails/8.jpg)
Pose recognition
• Research in pose recognition has been on going for 20+ years.
• Many assumptions: multiple cameras, manual initialization, controlled/simple backgrounds
![Page 9: Kinect Case S tudy](https://reader035.fdocuments.us/reader035/viewer/2022081422/56816385550346895dd46cce/html5/thumbnails/9.jpg)
Model-Based Estimation of 3D Human Motion, Ioannis Kakadiaris and Dimitris Metaxas, PAMI 2000
![Page 10: Kinect Case S tudy](https://reader035.fdocuments.us/reader035/viewer/2022081422/56816385550346895dd46cce/html5/thumbnails/10.jpg)
Tracking People by Learning Their Appearance, Deva Ramanan, David A. Forsyth, and Andrew Zisserman, PAMI 2007
![Page 11: Kinect Case S tudy](https://reader035.fdocuments.us/reader035/viewer/2022081422/56816385550346895dd46cce/html5/thumbnails/11.jpg)
Kinect• Why does depth help?
![Page 12: Kinect Case S tudy](https://reader035.fdocuments.us/reader035/viewer/2022081422/56816385550346895dd46cce/html5/thumbnails/12.jpg)
Shotton et al. proposed two main steps:
1. Find body parts2. Compute joint positions.
Real-Time Human Pose Recognition in Parts from Single Depth ImagesJamie Shotton Andrew Fitzgibbon Mat Cook Toby Sharp Mark FinocchioRichard Moore Alex Kipman Andrew Blake, CVPR 2011
Algorithm design
![Page 13: Kinect Case S tudy](https://reader035.fdocuments.us/reader035/viewer/2022081422/56816385550346895dd46cce/html5/thumbnails/13.jpg)
Finding body parts
• What should we use for a feature?
• What should we use for a classifier?
![Page 14: Kinect Case S tudy](https://reader035.fdocuments.us/reader035/viewer/2022081422/56816385550346895dd46cce/html5/thumbnails/14.jpg)
Finding body parts
• What should we use for a feature?– Difference in depth
• What should we use for a classifier?– Random Decision Forests
![Page 15: Kinect Case S tudy](https://reader035.fdocuments.us/reader035/viewer/2022081422/56816385550346895dd46cce/html5/thumbnails/15.jpg)
Features
![Page 16: Kinect Case S tudy](https://reader035.fdocuments.us/reader035/viewer/2022081422/56816385550346895dd46cce/html5/thumbnails/16.jpg)
Classification
Learning:
1. Randomly choose a set of thresholds and features for splits.2. Pick the threshold and feature that provide the largest information gain.3. Recurse until a certain accuracy is reached or depth is obtained.
![Page 17: Kinect Case S tudy](https://reader035.fdocuments.us/reader035/viewer/2022081422/56816385550346895dd46cce/html5/thumbnails/17.jpg)
Implementation details
• 3 trees (depth 20) (why so few?)
• 300k unique training images per tree.• 2000 candidate features, and 50 thresholds• One day on 1000 core cluster.• Why RDF and not AdaBoost, SVMs, etc.?
![Page 18: Kinect Case S tudy](https://reader035.fdocuments.us/reader035/viewer/2022081422/56816385550346895dd46cce/html5/thumbnails/18.jpg)
Synthetic data
![Page 19: Kinect Case S tudy](https://reader035.fdocuments.us/reader035/viewer/2022081422/56816385550346895dd46cce/html5/thumbnails/19.jpg)
Synthetic training/testing
![Page 20: Kinect Case S tudy](https://reader035.fdocuments.us/reader035/viewer/2022081422/56816385550346895dd46cce/html5/thumbnails/20.jpg)
Real test
![Page 21: Kinect Case S tudy](https://reader035.fdocuments.us/reader035/viewer/2022081422/56816385550346895dd46cce/html5/thumbnails/21.jpg)
Results
![Page 22: Kinect Case S tudy](https://reader035.fdocuments.us/reader035/viewer/2022081422/56816385550346895dd46cce/html5/thumbnails/22.jpg)
Joint estimation
• Apply mean-shift clustering to the labeled pixels. (why mean shift?)
• “Push back” each mode to lie at the center of the part.
![Page 23: Kinect Case S tudy](https://reader035.fdocuments.us/reader035/viewer/2022081422/56816385550346895dd46cce/html5/thumbnails/23.jpg)
Results
![Page 24: Kinect Case S tudy](https://reader035.fdocuments.us/reader035/viewer/2022081422/56816385550346895dd46cce/html5/thumbnails/24.jpg)
Failures
• Why would the system fail?
![Page 25: Kinect Case S tudy](https://reader035.fdocuments.us/reader035/viewer/2022081422/56816385550346895dd46cce/html5/thumbnails/25.jpg)
Video
• http://research.microsoft.com/pubs/145347/CVPR%202011%20-%20Final%20Video.mp4
Story about the making of Kinect:
http://www.wired.co.uk/magazine/archive/2010/11/features/the-game-changer