Segmenting Video Into Classes of Algorithm-Suitability
![Page 1: Segmenting Video Into Classes of Algorithm-Suitability](https://reader031.fdocuments.us/reader031/viewer/2022020308/568168b4550346895ddf8868/html5/thumbnails/1.jpg)
Segmenting Video Into Classes of Algorithm-Suitability
Oisin Mac Aodha (UCL)
Gabriel Brostow (UCL)
Marc Pollefeys (ETH)
![Page 2: Segmenting Video Into Classes of Algorithm-Suitability](https://reader031.fdocuments.us/reader031/viewer/2022020308/568168b4550346895ddf8868/html5/thumbnails/2.jpg)
Which algorithm should I (use / download / implement) to track things in this video?
Video from Dorothy Kuipers
![Page 3: Segmenting Video Into Classes of Algorithm-Suitability](https://reader031.fdocuments.us/reader031/viewer/2022020308/568168b4550346895ddf8868/html5/thumbnails/3.jpg)
The Optical Flow Problem
#2 all-time Computer Vision problem (disputable)
“Where did each pixel go?”
![Page 4: Segmenting Video Into Classes of Algorithm-Suitability](https://reader031.fdocuments.us/reader031/viewer/2022020308/568168b4550346895ddf8868/html5/thumbnails/4.jpg)
Optical Flow Solutions
Compared against each other on the “blind” Middlebury test set
![Page 5: Segmenting Video Into Classes of Algorithm-Suitability](https://reader031.fdocuments.us/reader031/viewer/2022020308/568168b4550346895ddf8868/html5/thumbnails/5.jpg)
Anonymous. Secrets of optical flow estimation and their principles. CVPR 2010 submission 477
![Page 6: Segmenting Video Into Classes of Algorithm-Suitability](https://reader031.fdocuments.us/reader031/viewer/2022020308/568168b4550346895ddf8868/html5/thumbnails/6.jpg)
![Page 7: Segmenting Video Into Classes of Algorithm-Suitability](https://reader031.fdocuments.us/reader031/viewer/2022020308/568168b4550346895ddf8868/html5/thumbnails/7.jpg)
1st Best Algorithm (7th overall as of 17-12-2009)
DPOF [18]: C. Lei and Y.-H. Yang. Optical flow estimation on coarse-to-fine region-trees using discrete optimization. ICCV 2009
![Page 8: Segmenting Video Into Classes of Algorithm-Suitability](https://reader031.fdocuments.us/reader031/viewer/2022020308/568168b4550346895ddf8868/html5/thumbnails/8.jpg)
2nd Best Algorithm (3rd overall as of 17-12-2009)
Classic+Area [31]: Anonymous. Secrets of optical flow estimation and their principles. CVPR 2010 submission 477
![Page 9: Segmenting Video Into Classes of Algorithm-Suitability](https://reader031.fdocuments.us/reader031/viewer/2022020308/568168b4550346895ddf8868/html5/thumbnails/9.jpg)
Should use algorithm A
Should use algorithm B
![Page 10: Segmenting Video Into Classes of Algorithm-Suitability](https://reader031.fdocuments.us/reader031/viewer/2022020308/568168b4550346895ddf8868/html5/thumbnails/10.jpg)
Semantic class labels: Road, Sky, Building, Car, Sidewalk, Fence, SignSymbol, Tree, Pedestrian, ColumnPole
![Page 11: Segmenting Video Into Classes of Algorithm-Suitability](https://reader031.fdocuments.us/reader031/viewer/2022020308/568168b4550346895ddf8868/html5/thumbnails/11.jpg)
(Artistic version; object boundaries don’t interest us)
Region labels: Algorithm A, Algorithm B, Algorithm C, Algorithm D
![Page 12: Segmenting Video Into Classes of Algorithm-Suitability](https://reader031.fdocuments.us/reader031/viewer/2022020308/568168b4550346895ddf8868/html5/thumbnails/12.jpg)
Hypothesis:
that the most suitable algorithm can be chosen for each video automatically, through supervised training of a classifier
![Page 13: Segmenting Video Into Classes of Algorithm-Suitability](https://reader031.fdocuments.us/reader031/viewer/2022020308/568168b4550346895ddf8868/html5/thumbnails/13.jpg)
Hypothesis:
that the most suitable algorithm can be chosen for each video automatically, through supervised training of a classifier
that one can predict the space-time segments of the video that are best served by each available algorithm (Can always come back to choose a per-frame or per-video algorithm)
![Page 14: Segmenting Video Into Classes of Algorithm-Suitability](https://reader031.fdocuments.us/reader031/viewer/2022020308/568168b4550346895ddf8868/html5/thumbnails/14.jpg)
Experimental Framework
Pipeline blocks: Image data, Feature extraction, Groundtruth labels, Learning algorithm, Trained classifier, Estimated class labels
![Page 15: Segmenting Video Into Classes of Algorithm-Suitability](https://reader031.fdocuments.us/reader031/viewer/2022020308/568168b4550346895ddf8868/html5/thumbnails/15.jpg)
Experimental Framework
Pipeline blocks: Image data, Feature extraction, Groundtruth labels, Learning algorithm, Trained classifier, New test data, Estimated class labels
![Page 16: Segmenting Video Into Classes of Algorithm-Suitability](https://reader031.fdocuments.us/reader031/viewer/2022020308/568168b4550346895ddf8868/html5/thumbnails/16.jpg)
Experimental Framework
Pipeline blocks: Image data, Feature extraction, Groundtruth labels, Learning algorithm, Trained classifier, New test data, Estimated class labels
Random Forests (Breiman, 2001)
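To make the role of the classifier concrete, here is a minimal sketch of a forest-style ensemble that outputs class probabilities as vote fractions. It is a deliberate simplification, not Breiman's full algorithm and not the authors' implementation: each "tree" is a single-split stump trained on a bootstrap sample using one randomly chosen feature. The function names, the toy two-feature data, and the class names "A" and "B" are all assumptions made for illustration.

```python
import random
from collections import Counter

def train_stump(samples, labels, rng):
    """Fit a one-split tree on a single randomly chosen feature."""
    feature = rng.randrange(len(samples[0]))
    best = None
    for threshold in sorted({s[feature] for s in samples}):
        left = [c for s, c in zip(samples, labels) if s[feature] <= threshold]
        right = [c for s, c in zip(samples, labels) if s[feature] > threshold]
        if not right:
            continue  # a split must send something to each side
        left_vote = Counter(left).most_common(1)[0][0]
        right_vote = Counter(right).most_common(1)[0][0]
        errors = sum(c != left_vote for c in left) + sum(c != right_vote for c in right)
        if best is None or errors < best[0]:
            best = (errors, threshold, left_vote, right_vote)
    if best is None:  # all feature values identical: predict the majority class
        vote = Counter(labels).most_common(1)[0][0]
        return (feature, 0.0, vote, vote)
    return (feature,) + best[1:]

def train_forest(samples, labels, n_trees=25, seed=0):
    """Train n_trees stumps, each on its own bootstrap resample of the data."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(samples)) for _ in samples]
        forest.append(train_stump([samples[i] for i in idx],
                                  [labels[i] for i in idx], rng))
    return forest

def predict_proba(forest, x):
    """Return {class: fraction of trees voting for it} for feature vector x."""
    votes = Counter(
        left if x[feature] <= threshold else right
        for feature, threshold, left, right in forest
    )
    return {c: n / len(forest) for c, n in votes.items()}

# Toy data: two features per pixel, label = hypothetical best algorithm.
samples = [(0.1, 0.2), (0.2, 0.1), (0.3, 0.3), (0.0, 0.25),
           (0.8, 0.9), (0.9, 0.7), (0.7, 0.8), (1.0, 1.0)]
labels = ["A"] * 4 + ["B"] * 4
forest = train_forest(samples, labels)
pr = predict_proba(forest, (0.0, 0.1))
```

The vote fractions in `pr` play the role of a per-pixel decision confidence; a real forest would grow deeper trees and choose among several candidate features at every split.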
![Page 17: Segmenting Video Into Classes of Algorithm-Suitability](https://reader031.fdocuments.us/reader031/viewer/2022020308/568168b4550346895ddf8868/html5/thumbnails/17.jpg)
![Page 18: Segmenting Video Into Classes of Algorithm-Suitability](https://reader031.fdocuments.us/reader031/viewer/2022020308/568168b4550346895ddf8868/html5/thumbnails/18.jpg)
“Making” more data
![Page 19: Segmenting Video Into Classes of Algorithm-Suitability](https://reader031.fdocuments.us/reader031/viewer/2022020308/568168b4550346895ddf8868/html5/thumbnails/19.jpg)
Formulation
Training data D consists of feature vectors x and class labels c (i.e. best-algorithm per pixel)
Feature vector x is multi-scale, and includes:
Spatial Gradient
Distance Transform
Temporal Gradient
Residual Error (after bicubic reconstruction)
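The class labels c can be illustrated with a small sketch. It assumes that "best algorithm per pixel" means the algorithm with the lowest end-point error against ground-truth flow; the function names and the toy flow fields are hypothetical, and flow fields are represented as plain nested lists of (u, v) tuples to keep the example dependency-free.

```python
import math

def endpoint_error(flow, gt):
    """Per-pixel end-point error between an estimated and a ground-truth flow field.

    Both fields are H x W grids of (u, v) tuples.
    """
    return [
        [math.hypot(u - gu, v - gv) for (u, v), (gu, gv) in zip(row, gt_row)]
        for row, gt_row in zip(flow, gt)
    ]

def best_algorithm_labels(flows, gt):
    """Label each pixel with the index of the algorithm whose error is lowest."""
    errors = [endpoint_error(f, gt) for f in flows]
    h, w = len(gt), len(gt[0])
    return [
        [min(range(len(flows)), key=lambda k: errors[k][y][x]) for x in range(w)]
        for y in range(h)
    ]

# Toy 1 x 3 example: ground truth plus two hypothetical algorithm outputs.
gt = [[(1.0, 0.0), (0.0, 2.0), (3.0, 3.0)]]
flow_a = [[(1.0, 0.0), (0.0, 0.0), (3.0, 2.0)]]  # errors: 0, 2, 1
flow_b = [[(0.0, 0.0), (0.0, 2.0), (3.0, 3.5)]]  # errors: 1, 0, 0.5
labels = best_algorithm_labels([flow_a, flow_b], gt)
print(labels)  # [[0, 1, 1]]
```

Ties go to the lower algorithm index here, which is an arbitrary choice of this sketch.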
![Page 20: Segmenting Video Into Classes of Algorithm-Suitability](https://reader031.fdocuments.us/reader031/viewer/2022020308/568168b4550346895ddf8868/html5/thumbnails/20.jpg)
![Page 21: Segmenting Video Into Classes of Algorithm-Suitability](https://reader031.fdocuments.us/reader031/viewer/2022020308/568168b4550346895ddf8868/html5/thumbnails/21.jpg)
![Page 22: Segmenting Video Into Classes of Algorithm-Suitability](https://reader031.fdocuments.us/reader031/viewer/2022020308/568168b4550346895ddf8868/html5/thumbnails/22.jpg)
Formulation Details
Temporal Gradient
Residual Error
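The slide's own definitions did not survive extraction, so the following is only a plausible sketch of the two features. It assumes the temporal gradient is the absolute intensity change between consecutive frames, and that the residual error is the difference between a frame and its reconstruction after downsampling and upsampling. Pixel replication stands in for the bicubic reconstruction named in the talk, purely to keep the example short; all function names are hypothetical.

```python
def temporal_gradient(frame_t, frame_t1):
    """Absolute per-pixel intensity change between two consecutive frames."""
    return [
        [abs(b - a) for a, b in zip(row_t, row_t1)]
        for row_t, row_t1 in zip(frame_t, frame_t1)
    ]

def downsample2(img):
    """Halve resolution by averaging each 2 x 2 block (even dimensions assumed)."""
    return [
        [
            (img[y][x] + img[y][x + 1] + img[y + 1][x] + img[y + 1][x + 1]) / 4.0
            for x in range(0, len(img[0]), 2)
        ]
        for y in range(0, len(img), 2)
    ]

def upsample2(img):
    """Double resolution by pixel replication (stand-in for bicubic interpolation)."""
    out = []
    for row in img:
        wide = [v for v in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out

def residual_error(img):
    """Absolute difference between an image and its down/up-sampled reconstruction."""
    recon = upsample2(downsample2(img))
    return [
        [abs(a - b) for a, b in zip(row, rrow)]
        for row, rrow in zip(img, recon)
    ]

# Two tiny 2 x 4 grayscale frames.
frame_t = [[0, 0, 10, 10], [0, 0, 10, 10]]
frame_t1 = [[0, 10, 10, 0], [0, 10, 10, 0]]
tg = temporal_gradient(frame_t, frame_t1)   # [[0, 10, 0, 10], [0, 10, 0, 10]]
res = residual_error(frame_t1)              # 5.0 at every pixel
```

In `frame_t` every 2 x 2 block is flat, so its residual is zero everywhere; in `frame_t1` each block straddles an edge, so detail is lost in the reconstruction and the residual is high.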
![Page 23: Segmenting Video Into Classes of Algorithm-Suitability](https://reader031.fdocuments.us/reader031/viewer/2022020308/568168b4550346895ddf8868/html5/thumbnails/23.jpg)
Application I: Optical Flow
![Page 24: Segmenting Video Into Classes of Algorithm-Suitability](https://reader031.fdocuments.us/reader031/viewer/2022020308/568168b4550346895ddf8868/html5/thumbnails/24.jpg)
![Page 25: Segmenting Video Into Classes of Algorithm-Suitability](https://reader031.fdocuments.us/reader031/viewer/2022020308/568168b4550346895ddf8868/html5/thumbnails/25.jpg)
FlowLib Decision Confidence
Axis labels: False Positive Rate (for EPE < 1.0), True Positive Rate, Ave EPE, Number of pixels (×10^5)
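A curve of true positive rate against false positive rate can be traced from per-pixel confidences roughly as follows. This sketch assumes a pixel counts as a positive when its end-point error (EPE) is below 1.0, and that it is predicted positive when the classifier's confidence reaches a cut-off; the function name and the toy numbers are hypothetical.

```python
def roc_points(confidence, epe, epe_threshold=1.0):
    """Return (false_positive_rate, true_positive_rate) pairs, one per cut-off.

    A pixel is truly positive when its end-point error is below epe_threshold.
    It is predicted positive when its confidence is at or above the cut-off.
    Assumes the data contains at least one positive and one negative pixel.
    """
    is_good = [e < epe_threshold for e in epe]
    n_pos = sum(is_good)
    n_neg = len(is_good) - n_pos
    points = []
    for cut in sorted(set(confidence), reverse=True):
        tp = sum(1 for c, g in zip(confidence, is_good) if c >= cut and g)
        fp = sum(1 for c, g in zip(confidence, is_good) if c >= cut and not g)
        points.append((fp / n_neg, tp / n_pos))
    return points

# Four pixels: classifier confidence and the flow error actually observed.
confidence = [0.9, 0.8, 0.6, 0.3]
epe = [0.2, 0.5, 3.0, 0.7]
points = roc_points(confidence, epe)
```

Lowering the cut-off admits more pixels, so both rates can only grow; the final point is always (1.0, 1.0).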
![Page 26: Segmenting Video Into Classes of Algorithm-Suitability](https://reader031.fdocuments.us/reader031/viewer/2022020308/568168b4550346895ddf8868/html5/thumbnails/26.jpg)
Application II: Feature Matching
![Page 27: Segmenting Video Into Classes of Algorithm-Suitability](https://reader031.fdocuments.us/reader031/viewer/2022020308/568168b4550346895ddf8868/html5/thumbnails/27.jpg)
Comparing 2 Descriptions
What is a match? Details are important…
Nearest neighbor (see also FLANN)
Distance Ratio
PCA
Evaluation: density, # correct matches, tolerance
“192 correct matches (yellow) and 208 false matches (blue)”
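As a sketch of why the matching details matter, the following combines brute-force nearest-neighbor search with a distance ratio test. The 0.8 ratio, the function name, and the two-dimensional toy descriptors are assumptions for illustration; real SIFT descriptors have 128 dimensions, and a library such as FLANN would replace the exhaustive search.

```python
import math

def match_descriptors(desc_a, desc_b, max_ratio=0.8):
    """Match each descriptor in desc_a to its nearest neighbor in desc_b.

    A match is kept only if the nearest distance is below max_ratio times the
    second-nearest distance (the distance ratio test). desc_b needs at least
    two entries. Returns a list of (index_in_a, index_in_b) pairs.
    """
    matches = []
    for i, a in enumerate(desc_a):
        ranked = sorted((math.dist(a, b), j) for j, b in enumerate(desc_b))
        (d1, j1), (d2, _) = ranked[0], ranked[1]
        if d1 < max_ratio * d2:
            matches.append((i, j1))
    return matches

desc_a = [(0.0, 0.0), (5.0, 5.0)]
desc_b = [(0.1, 0.0), (4.0, 4.0), (6.0, 6.0)]
matches = match_descriptors(desc_a, desc_b)
print(matches)  # [(0, 0)]
```

The second descriptor is rejected because its two nearest candidates are equally far away, which is exactly the ambiguity the ratio test is meant to filter out.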
![Page 28: Segmenting Video Into Classes of Algorithm-Suitability](https://reader031.fdocuments.us/reader031/viewer/2022020308/568168b4550346895ddf8868/html5/thumbnails/28.jpg)
SIFT Decision Confidence
![Page 29: Segmenting Video Into Classes of Algorithm-Suitability](https://reader031.fdocuments.us/reader031/viewer/2022020308/568168b4550346895ddf8868/html5/thumbnails/29.jpg)
![Page 30: Segmenting Video Into Classes of Algorithm-Suitability](https://reader031.fdocuments.us/reader031/viewer/2022020308/568168b4550346895ddf8868/html5/thumbnails/30.jpg)
![Page 31: Segmenting Video Into Classes of Algorithm-Suitability](https://reader031.fdocuments.us/reader031/viewer/2022020308/568168b4550346895ddf8868/html5/thumbnails/31.jpg)
Hindsight / Future Work
Current results don’t quite live up to the theory:
Flaws of best-algorithm are the upper bound (ok)
Training data does not fit in memory (fixable)
“Winning” the race is more than rank (problem!)
![Page 32: Segmenting Video Into Classes of Algorithm-Suitability](https://reader031.fdocuments.us/reader031/viewer/2022020308/568168b4550346895ddf8868/html5/thumbnails/32.jpg)
Summary
Overall, predictions are correlated with the best algorithm for each segment (expressed as Pr!)
Training data where one class dominates is dangerous – needs improvement
Other features could help make better predictions
Results don’t yet do the idea justice
One size does NOT fit all
At least in terms of algorithm suitability
Could use “bad” algorithms!
![Page 33: Segmenting Video Into Classes of Algorithm-Suitability](https://reader031.fdocuments.us/reader031/viewer/2022020308/568168b4550346895ddf8868/html5/thumbnails/33.jpg)
![Page 34: Segmenting Video Into Classes of Algorithm-Suitability](https://reader031.fdocuments.us/reader031/viewer/2022020308/568168b4550346895ddf8868/html5/thumbnails/34.jpg)
Panels: Ground Truth Best; Based on Prediction
![Page 35: Segmenting Video Into Classes of Algorithm-Suitability](https://reader031.fdocuments.us/reader031/viewer/2022020308/568168b4550346895ddf8868/html5/thumbnails/35.jpg)
White = 30 pixel end point error
Panels: FlowLib; Based on Prediction
![Page 36: Segmenting Video Into Classes of Algorithm-Suitability](https://reader031.fdocuments.us/reader031/viewer/2022020308/568168b4550346895ddf8868/html5/thumbnails/36.jpg)
Panels (contrast enhanced): FlowLib; Based on Prediction