Scene Understanding Tutorial: Surfaces and 3D Modelsdfouhey/ECCV2014Tutorial/surfacesAnd...Lines...
![Page 1: Scene Understanding Tutorial: Surfaces and 3D Modelsdfouhey/ECCV2014Tutorial/surfacesAnd...Lines (perspective) Position 3D Normal (w/ depth) 3D Planarity (w/ depth) 3. Classify Boost](https://reader036.fdocuments.us/reader036/viewer/2022071509/6129ef67efa644383f40ccf6/html5/thumbnails/1.jpg)
Scene Understanding Tutorial: Surfaces and 3D Models
Derek Hoiem
University of Illinois
![Page 2: Scene Understanding Tutorial: Surfaces and 3D Modelsdfouhey/ECCV2014Tutorial/surfacesAnd...Lines (perspective) Position 3D Normal (w/ depth) 3D Planarity (w/ depth) 3. Classify Boost](https://reader036.fdocuments.us/reader036/viewer/2022071509/6129ef67efa644383f40ccf6/html5/thumbnails/2.jpg)
Outline
1) How do we model 3D scenes?
2) How do we recover individual geometric properties?
– Surface orientations / materials / depth
– Occlusion boundaries
– Viewpoint
3) How do we infer complete 3D scenes?
– Probabilistic model
– Structured SVM
– Sequential prediction
![Page 3: Scene Understanding Tutorial: Surfaces and 3D Modelsdfouhey/ECCV2014Tutorial/surfacesAnd...Lines (perspective) Position 3D Normal (w/ depth) 3D Planarity (w/ depth) 3. Classify Boost](https://reader036.fdocuments.us/reader036/viewer/2022071509/6129ef67efa644383f40ccf6/html5/thumbnails/3.jpg)
How can we model 3D scenes?
![Page 4: Scene Understanding Tutorial: Surfaces and 3D Modelsdfouhey/ECCV2014Tutorial/surfacesAnd...Lines (perspective) Position 3D Normal (w/ depth) 3D Planarity (w/ depth) 3. Classify Boost](https://reader036.fdocuments.us/reader036/viewer/2022071509/6129ef67efa644383f40ccf6/html5/thumbnails/4.jpg)
Scene-Level Geometric Description
Gist, Spatial Envelope (Oliva Torralba 2001, 2006)
Stages (Nedovic et al. 2007)
![Page 5: Scene Understanding Tutorial: Surfaces and 3D Modelsdfouhey/ECCV2014Tutorial/surfacesAnd...Lines (perspective) Position 3D Normal (w/ depth) 3D Planarity (w/ depth) 3. Classify Boost](https://reader036.fdocuments.us/reader036/viewer/2022071509/6129ef67efa644383f40ccf6/html5/thumbnails/5.jpg)
Pixel Map Geometric Description
Geometric Context (Hoiem et al. 2005, 2007)
Depth Map (Saxena et al. 2005, 2007)
![Page 6: Scene Understanding Tutorial: Surfaces and 3D Modelsdfouhey/ECCV2014Tutorial/surfacesAnd...Lines (perspective) Position 3D Normal (w/ depth) 3D Planarity (w/ depth) 3. Classify Boost](https://reader036.fdocuments.us/reader036/viewer/2022071509/6129ef67efa644383f40ccf6/html5/thumbnails/6.jpg)
Loosely structured 3D model
Point Cloud (Agarwal et al. 2009)
Voxels (Kim et al. 2013)
![Page 7: Scene Understanding Tutorial: Surfaces and 3D Modelsdfouhey/ECCV2014Tutorial/surfacesAnd...Lines (perspective) Position 3D Normal (w/ depth) 3D Planarity (w/ depth) 3. Classify Boost](https://reader036.fdocuments.us/reader036/viewer/2022071509/6129ef67efa644383f40ccf6/html5/thumbnails/7.jpg)
Structured models: supporting planes
Ground Plane (Hoiem et al. 2006, 2008)
Multiple Support Planes (Bao et al. 2010)
![Page 8: Scene Understanding Tutorial: Surfaces and 3D Modelsdfouhey/ECCV2014Tutorial/surfacesAnd...Lines (perspective) Position 3D Normal (w/ depth) 3D Planarity (w/ depth) 3. Classify Boost](https://reader036.fdocuments.us/reader036/viewer/2022071509/6129ef67efa644383f40ccf6/html5/thumbnails/8.jpg)
Structured models: coarse 3D
Ground Plane with Billboards (Hoiem et al. 2008)
![Page 9: Scene Understanding Tutorial: Surfaces and 3D Modelsdfouhey/ECCV2014Tutorial/surfacesAnd...Lines (perspective) Position 3D Normal (w/ depth) 3D Planarity (w/ depth) 3. Classify Boost](https://reader036.fdocuments.us/reader036/viewer/2022071509/6129ef67efa644383f40ccf6/html5/thumbnails/9.jpg)
Structured models: coarse 3D
Ground Plane with Walls (Lee et al. 2010)
3D Box Model (Hedau et al. 2009/10)
![Page 10: Scene Understanding Tutorial: Surfaces and 3D Modelsdfouhey/ECCV2014Tutorial/surfacesAnd...Lines (perspective) Position 3D Normal (w/ depth) 3D Planarity (w/ depth) 3. Classify Boost](https://reader036.fdocuments.us/reader036/viewer/2022071509/6129ef67efa644383f40ccf6/html5/thumbnails/10.jpg)
Structured models: detailed 3D
Guo Hoiem (unpublished)
![Page 11: Scene Understanding Tutorial: Surfaces and 3D Modelsdfouhey/ECCV2014Tutorial/surfacesAnd...Lines (perspective) Position 3D Normal (w/ depth) 3D Planarity (w/ depth) 3. Classify Boost](https://reader036.fdocuments.us/reader036/viewer/2022071509/6129ef67efa644383f40ccf6/html5/thumbnails/11.jpg)
Representational Trade-Offs
Two axes: Simple vs. Detailed, and Literal vs. Interpreted
![Page 12: Scene Understanding Tutorial: Surfaces and 3D Modelsdfouhey/ECCV2014Tutorial/surfacesAnd...Lines (perspective) Position 3D Normal (w/ depth) 3D Planarity (w/ depth) 3. Classify Boost](https://reader036.fdocuments.us/reader036/viewer/2022071509/6129ef67efa644383f40ccf6/html5/thumbnails/12.jpg)
Representational Trade-Offs
Literal
• Lends itself to solutions derived from geometry and physics
• Easily quantifiable results
• Not directly useful for most interaction/understanding tasks
Interpreted
• Requires hand-crafted representations and annotations
• More difficult to measure error
• More directly useful for high-level tasks
![Page 13: Scene Understanding Tutorial: Surfaces and 3D Modelsdfouhey/ECCV2014Tutorial/surfacesAnd...Lines (perspective) Position 3D Normal (w/ depth) 3D Planarity (w/ depth) 3. Classify Boost](https://reader036.fdocuments.us/reader036/viewer/2022071509/6129ef67efa644383f40ccf6/html5/thumbnails/13.jpg)
Representational Trade-Offs
Simple
• Robust inference from limited cues
• Incomplete scene information
• Useful for general guidance and priors
Complex
• Requires more sensor data for similar accuracy; important to represent uncertainty
• Complete models enable high-level priors and constraints
• Useful for moving, grasping, understanding
![Page 14: Scene Understanding Tutorial: Surfaces and 3D Modelsdfouhey/ECCV2014Tutorial/surfacesAnd...Lines (perspective) Position 3D Normal (w/ depth) 3D Planarity (w/ depth) 3. Classify Boost](https://reader036.fdocuments.us/reader036/viewer/2022071509/6129ef67efa644383f40ccf6/html5/thumbnails/14.jpg)
How can we recover geometric properties?
• Surface orientations, materials, depth
• Occlusion boundaries
• Viewpoint
![Page 15: Scene Understanding Tutorial: Surfaces and 3D Modelsdfouhey/ECCV2014Tutorial/surfacesAnd...Lines (perspective) Position 3D Normal (w/ depth) 3D Planarity (w/ depth) 3. Classify Boost](https://reader036.fdocuments.us/reader036/viewer/2022071509/6129ef67efa644383f40ccf6/html5/thumbnails/15.jpg)
The challenges
• Ambiguity from 2D projection (loss of depth info)
• Ambiguity from occlusion (loss of 3D info)
• Ambiguity in connectedness (requires direct manipulation)
Image credit (3d surface): Newcombe Davis 2010
![Page 16: Scene Understanding Tutorial: Surfaces and 3D Modelsdfouhey/ECCV2014Tutorial/surfacesAnd...Lines (perspective) Position 3D Normal (w/ depth) 3D Planarity (w/ depth) 3. Classify Boost](https://reader036.fdocuments.us/reader036/viewer/2022071509/6129ef67efa644383f40ccf6/html5/thumbnails/16.jpg)
Geometric properties can be inferred only because our world is structured
Abstract World Our World
Image Credit (left): F. Cunin
and M.J. Sailor, UCSD
![Page 17: Scene Understanding Tutorial: Surfaces and 3D Modelsdfouhey/ECCV2014Tutorial/surfacesAnd...Lines (perspective) Position 3D Normal (w/ depth) 3D Planarity (w/ depth) 3. Classify Boost](https://reader036.fdocuments.us/reader036/viewer/2022071509/6129ef67efa644383f40ccf6/html5/thumbnails/17.jpg)
Recovering surface properties: two main approaches
1. Train classifier/regressor
2. Transfer from patch/image matcher
– Discussed in more detail later in tutorial
![Page 18: Scene Understanding Tutorial: Surfaces and 3D Modelsdfouhey/ECCV2014Tutorial/surfacesAnd...Lines (perspective) Position 3D Normal (w/ depth) 3D Planarity (w/ depth) 3. Classify Boost](https://reader036.fdocuments.us/reader036/viewer/2022071509/6129ef67efa644383f40ccf6/html5/thumbnails/18.jpg)
Example: describe 3D surfaces with geometric classes
• Sky
• Vertical
– Planar (Left/Center/Right)
– Non-Planar Porous
– Non-Planar Solid
• Support
Hoiem Efros Hebert, ICCV 2005, IJCV 2007
![Page 19: Scene Understanding Tutorial: Surfaces and 3D Modelsdfouhey/ECCV2014Tutorial/surfacesAnd...Lines (perspective) Position 3D Normal (w/ depth) 3D Planarity (w/ depth) 3. Classify Boost](https://reader036.fdocuments.us/reader036/viewer/2022071509/6129ef67efa644383f40ccf6/html5/thumbnails/19.jpg)
Geometry estimation as recognition
[Diagram: region → features (color, texture, perspective, position) → surface geometry classifier, learned from training data → label (e.g., Vertical, Planar)]
![Page 20: Scene Understanding Tutorial: Surfaces and 3D Modelsdfouhey/ECCV2014Tutorial/surfacesAnd...Lines (perspective) Position 3D Normal (w/ depth) 3D Planarity (w/ depth) 3. Classify Boost](https://reader036.fdocuments.us/reader036/viewer/2022071509/6129ef67efa644383f40ccf6/html5/thumbnails/20.jpg)
Use a variety of image cues
Vanishing points, lines
Color, texture, image location
Texture gradient
![Page 21: Scene Understanding Tutorial: Surfaces and 3D Modelsdfouhey/ECCV2014Tutorial/surfacesAnd...Lines (perspective) Position 3D Normal (w/ depth) 3D Planarity (w/ depth) 3. Classify Boost](https://reader036.fdocuments.us/reader036/viewer/2022071509/6129ef67efa644383f40ccf6/html5/thumbnails/21.jpg)
Surface Layout Algorithm
Hoiem Efros Hebert (2007)
[Diagram: input image → segmentation → features (perspective, color, texture, position) → trained region classifier (learned from training data) → surface labels]
![Page 22: Scene Understanding Tutorial: Surfaces and 3D Modelsdfouhey/ECCV2014Tutorial/surfacesAnd...Lines (perspective) Position 3D Normal (w/ depth) 3D Planarity (w/ depth) 3. Classify Boost](https://reader036.fdocuments.us/reader036/viewer/2022071509/6129ef67efa644383f40ccf6/html5/thumbnails/22.jpg)
Surface Layout Algorithm
Hoiem Efros Hebert (2007)
[Diagram: input image → multiple segmentations → features (perspective, color, texture, position) → trained region classifier (learned from training data) → confidence-weighted predictions → final surface labels]
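The step from multiple segmentations to confidence-weighted predictions can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes each segment carries a homogeneity score (how likely the segment covers a single surface) and a label distribution from the region classifier, and it combines them per pixel by a weighted sum. The data layout and the function name `combine_segmentations` are hypothetical.

```python
def combine_segmentations(segmentations, n_pixels, labels):
    """Combine region predictions from several segmentations into per-pixel confidences.

    Each segmentation is a list of (pixel_ids, homogeneity, label_probs) tuples.
    A pixel's score for a label is the sum, over the segments containing it,
    of homogeneity * P(label | segment), normalized to sum to 1.
    """
    totals = [{lab: 0.0 for lab in labels} for _ in range(n_pixels)]
    for segments in segmentations:
        for pixel_ids, homogeneity, label_probs in segments:
            for p in pixel_ids:
                for lab in labels:
                    totals[p][lab] += homogeneity * label_probs[lab]
    combined = []
    for scores in totals:
        z = sum(scores.values())
        combined.append({lab: (s / z if z > 0 else 1.0 / len(labels))
                         for lab, s in scores.items()})
    return combined


# Toy example (all numbers are made up): 4 pixels, 2 segmentations.
LABELS = ("support", "vertical")
seg_a = [([0, 1], 0.9, {"support": 0.8, "vertical": 0.2}),
         ([2, 3], 0.5, {"support": 0.4, "vertical": 0.6})]
seg_b = [([0], 0.6, {"support": 0.9, "vertical": 0.1}),
         ([1, 2, 3], 0.8, {"support": 0.1, "vertical": 0.9})]
confidences = combine_segmentations([seg_a, seg_b], 4, LABELS)
final_labels = [max(LABELS, key=lambda lab: c[lab]) for c in confidences]
```

In this toy case pixel 1 is labeled "support" by one segmentation and "vertical" by the other; the weighted sum keeps both opinions and lets the more confident one decide, rather than committing to a single segmentation early.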
![Page 23: Scene Understanding Tutorial: Surfaces and 3D Modelsdfouhey/ECCV2014Tutorial/surfacesAnd...Lines (perspective) Position 3D Normal (w/ depth) 3D Planarity (w/ depth) 3. Classify Boost](https://reader036.fdocuments.us/reader036/viewer/2022071509/6129ef67efa644383f40ccf6/html5/thumbnails/23.jpg)
Surface Description Result
![Page 24: Scene Understanding Tutorial: Surfaces and 3D Modelsdfouhey/ECCV2014Tutorial/surfacesAnd...Lines (perspective) Position 3D Normal (w/ depth) 3D Planarity (w/ depth) 3. Classify Boost](https://reader036.fdocuments.us/reader036/viewer/2022071509/6129ef67efa644383f40ccf6/html5/thumbnails/24.jpg)
Results
Input Image Ground Truth Our Result
![Page 25: Scene Understanding Tutorial: Surfaces and 3D Modelsdfouhey/ECCV2014Tutorial/surfacesAnd...Lines (perspective) Position 3D Normal (w/ depth) 3D Planarity (w/ depth) 3. Classify Boost](https://reader036.fdocuments.us/reader036/viewer/2022071509/6129ef67efa644383f40ccf6/html5/thumbnails/25.jpg)
Failures: Reflections, Rare Viewpoint
Input Image Ground Truth Our Result
![Page 26: Scene Understanding Tutorial: Surfaces and 3D Modelsdfouhey/ECCV2014Tutorial/surfacesAnd...Lines (perspective) Position 3D Normal (w/ depth) 3D Planarity (w/ depth) 3. Classify Boost](https://reader036.fdocuments.us/reader036/viewer/2022071509/6129ef67efa644383f40ccf6/html5/thumbnails/26.jpg)
General framework for geometric pixel labeling
0. Input
– RGB image
– Multiple images
– Video
– RGBD image
1. Split into regions
– Pixels
– Square patches
– Segmentation
– Multiple segmentation
2. Extract features
– Color
– Texture
– Lines (perspective)
– Position
– 3D normal (w/ depth)
– 3D planarity (w/ depth)
3. Classify
– Boosted decision trees
– SVM
– KNN
– Random forest
4. Regularize solution
– Average predictions
– MRF
– Fit model
5. Pixel map of labels/values
– Geometric classes
– Surface normals
– Depth
– Occlusion boundaries
– Materials
– Indoor surfaces
– Object categories
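One path through this framework can be sketched in a few lines. The sketch below picks one option per step (square patches, intensity plus vertical position as features, a KNN classifier, and averaging of neighboring predictions as the regularizer); these choices, the toy training set, and all function names are assumptions made for illustration only.

```python
import math


def patch_features(image, size):
    """Steps 1-2: split a grayscale image into square patches and compute
    (mean intensity, normalized vertical position) for each patch."""
    rows, cols = len(image), len(image[0])
    feats = {}
    for r in range(0, rows, size):
        for c in range(0, cols, size):
            vals = [image[i][j]
                    for i in range(r, min(r + size, rows))
                    for j in range(c, min(c + size, cols))]
            feats[(r // size, c // size)] = (sum(vals) / len(vals),
                                             (r + size / 2) / rows)
    return feats


def knn_probs(x, train, k, labels):
    """Step 3: label frequencies among the k nearest training examples."""
    nearest = sorted(train, key=lambda t: math.dist(x, t[0]))[:k]
    return {lab: sum(1 for _, l in nearest if l == lab) / k for lab in labels}


def smooth(probs, labels):
    """Step 4: average each patch's prediction with its 4-connected neighbors."""
    out = {}
    for (r, c), p in probs.items():
        nbrs = ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1))
        group = [p] + [probs[n] for n in nbrs if n in probs]
        out[(r, c)] = {lab: sum(g[lab] for g in group) / len(group)
                       for lab in labels}
    return out


def label_image(image, train, size=2, k=1, labels=("sky", "support")):
    """Step 5: return a patch-level map of the most confident label."""
    feats = patch_features(image, size)
    probs = {pos: knn_probs(x, train, k, labels) for pos, x in feats.items()}
    probs = smooth(probs, labels)
    return {pos: max(labels, key=lambda lab: p[lab]) for pos, p in probs.items()}


# Toy 4x4 image: bright top half, dark bottom half (values are made up).
image = [[0.9] * 4, [0.9] * 4, [0.2] * 4, [0.2] * 4]
train = [((0.9, 0.2), "sky"), ((0.2, 0.8), "support")]
label_map = label_image(image, train)
```

Swapping any single step (for example, segments instead of patches, or an MRF instead of averaging) leaves the rest of the pipeline unchanged, which is the point of the framework.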
![Page 27: Scene Understanding Tutorial: Surfaces and 3D Modelsdfouhey/ECCV2014Tutorial/surfacesAnd...Lines (perspective) Position 3D Normal (w/ depth) 3D Planarity (w/ depth) 3. Classify Boost](https://reader036.fdocuments.us/reader036/viewer/2022071509/6129ef67efa644383f40ccf6/html5/thumbnails/27.jpg)
3D reconstruction from geometric context
1. Labeled image
2. Fit ground-vertical boundary with line segments
3. Form segments into polylines
4. Cut and fold
5. Final pop-up model
[Hoiem Efros Hebert 2005]
![Page 28: Scene Understanding Tutorial: Surfaces and 3D Modelsdfouhey/ECCV2014Tutorial/surfacesAnd...Lines (perspective) Position 3D Normal (w/ depth) 3D Planarity (w/ depth) 3. Classify Boost](https://reader036.fdocuments.us/reader036/viewer/2022071509/6129ef67efa644383f40ccf6/html5/thumbnails/28.jpg)
Need object/occlusion boundaries for more complex scenes
Surface Layout 3D Model
![Page 29: Scene Understanding Tutorial: Surfaces and 3D Modelsdfouhey/ECCV2014Tutorial/surfacesAnd...Lines (perspective) Position 3D Normal (w/ depth) 3D Planarity (w/ depth) 3. Classify Boost](https://reader036.fdocuments.us/reader036/viewer/2022071509/6129ef67efa644383f40ccf6/html5/thumbnails/29.jpg)
Recovering major occlusions
Left side of arrow occludes
Hoiem et al., IJCV 2011
![Page 30: Scene Understanding Tutorial: Surfaces and 3D Modelsdfouhey/ECCV2014Tutorial/surfacesAnd...Lines (perspective) Position 3D Normal (w/ depth) 3D Planarity (w/ depth) 3. Classify Boost](https://reader036.fdocuments.us/reader036/viewer/2022071509/6129ef67efa644383f40ccf6/html5/thumbnails/30.jpg)
Occlusion Cues
• Region: color, position, shape
• Boundary: strength, length, continuity (Pb: Martin Fowlkes Malik ’02)
• Surface layout
• Depth
• Objects
• Gestalt cues: continuity, closure, valid junctions
![Page 31: Scene Understanding Tutorial: Surfaces and 3D Modelsdfouhey/ECCV2014Tutorial/surfacesAnd...Lines (perspective) Position 3D Normal (w/ depth) 3D Planarity (w/ depth) 3. Classify Boost](https://reader036.fdocuments.us/reader036/viewer/2022071509/6129ef67efa644383f40ccf6/html5/thumbnails/31.jpg)
Occlusion Algorithm
[Diagram: input image → oversegmentation → occlusion cues → learned models → CRF inference → P(occlusion) → remove weak edges → next segmentation (repeat)]
![Page 32: Scene Understanding Tutorial: Surfaces and 3D Modelsdfouhey/ECCV2014Tutorial/surfacesAnd...Lines (perspective) Position 3D Normal (w/ depth) 3D Planarity (w/ depth) 3. Classify Boost](https://reader036.fdocuments.us/reader036/viewer/2022071509/6129ef67efa644383f40ccf6/html5/thumbnails/32.jpg)
![Page 33: Scene Understanding Tutorial: Surfaces and 3D Modelsdfouhey/ECCV2014Tutorial/surfacesAnd...Lines (perspective) Position 3D Normal (w/ depth) 3D Planarity (w/ depth) 3. Classify Boost](https://reader036.fdocuments.us/reader036/viewer/2022071509/6129ef67efa644383f40ccf6/html5/thumbnails/33.jpg)
Occlusion Result
Panels: Boundaries, Foreground/Background, Contact; Depth (Max); Depth (Min)
![Page 34: Scene Understanding Tutorial: Surfaces and 3D Modelsdfouhey/ECCV2014Tutorial/surfacesAnd...Lines (perspective) Position 3D Normal (w/ depth) 3D Planarity (w/ depth) 3. Classify Boost](https://reader036.fdocuments.us/reader036/viewer/2022071509/6129ef67efa644383f40ccf6/html5/thumbnails/34.jpg)
3D Model with Occlusions
Panels: 3D Model without Occlusion Reasoning; 3D Model with Occlusion Reasoning
![Page 35: Scene Understanding Tutorial: Surfaces and 3D Modelsdfouhey/ECCV2014Tutorial/surfacesAnd...Lines (perspective) Position 3D Normal (w/ depth) 3D Planarity (w/ depth) 3. Classify Boost](https://reader036.fdocuments.us/reader036/viewer/2022071509/6129ef67efa644383f40ccf6/html5/thumbnails/35.jpg)
Occlusion boundary map
Input Image P(occlusion) boundary map
Hoiem et al., IJCV 2011
![Page 36: Scene Understanding Tutorial: Surfaces and 3D Modelsdfouhey/ECCV2014Tutorial/surfacesAnd...Lines (perspective) Position 3D Normal (w/ depth) 3D Planarity (w/ depth) 3. Classify Boost](https://reader036.fdocuments.us/reader036/viewer/2022071509/6129ef67efa644383f40ccf6/html5/thumbnails/36.jpg)
Viewpoint
[Figure labels: Yo, vt, Yc, vb, vh, Y=0, bh, bt, c, o, vv, Y]
Ground plane model: horizon + height
3D box model: vanishing points + extent
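For the ground plane model (horizon + height), a commonly used approximation relates an object's real-world height to its image extent: Y_o ≈ Y_c (v_t − v_b) / (v_h − v_b), where Y_c is the camera height, v_h the horizon row, and v_t, v_b the image rows of the object's top and bottom. The sketch below assumes this relation is the one the slide's figure depicts, and that the camera is roughly level and the object stands upright on the ground. The function name is hypothetical.

```python
def object_height(camera_height, v_horizon, v_top, v_bottom):
    """Approximate real-world height of an upright object standing on the ground.

    Assumes a roughly level camera (horizontal horizon line, small tilt).
    Image rows may increase upward or downward, as long as all three
    row values use the same convention.
    """
    denom = v_horizon - v_bottom
    if denom == 0:
        raise ValueError("object bottom lies on the horizon; height is undefined")
    return camera_height * (v_top - v_bottom) / denom


# Hypothetical numbers: camera 1.6 m above the ground, horizon at row 200,
# object spanning rows 150 (top) to 400 (bottom), rows increasing downward.
height = object_height(1.6, v_horizon=200, v_top=150, v_bottom=400)
```

A useful sanity check is that an object whose top touches the horizon is exactly as tall as the camera is high.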
![Page 37: Scene Understanding Tutorial: Surfaces and 3D Modelsdfouhey/ECCV2014Tutorial/surfacesAnd...Lines (perspective) Position 3D Normal (w/ depth) 3D Planarity (w/ depth) 3. Classify Boost](https://reader036.fdocuments.us/reader036/viewer/2022071509/6129ef67efa644383f40ccf6/html5/thumbnails/37.jpg)
Viewpoint cues
• Vanishing points
• Image texture / transfer
• Detected objects
![Page 38: Scene Understanding Tutorial: Surfaces and 3D Modelsdfouhey/ECCV2014Tutorial/surfacesAnd...Lines (perspective) Position 3D Normal (w/ depth) 3D Planarity (w/ depth) 3. Classify Boost](https://reader036.fdocuments.us/reader036/viewer/2022071509/6129ef67efa644383f40ccf6/html5/thumbnails/38.jpg)
Summary of key points
• Geometric properties can be recovered by modeling (or learning) the structure of the world
• Surface, boundary, and viewpoint properties can be inferred from multiple cues
• Retain uncertain estimates, avoid early decisions
![Page 39: Scene Understanding Tutorial: Surfaces and 3D Modelsdfouhey/ECCV2014Tutorial/surfacesAnd...Lines (perspective) Position 3D Normal (w/ depth) 3D Planarity (w/ depth) 3. Classify Boost](https://reader036.fdocuments.us/reader036/viewer/2022071509/6129ef67efa644383f40ccf6/html5/thumbnails/39.jpg)
How to interpret scenes as a whole?
![Page 40: Scene Understanding Tutorial: Surfaces and 3D Modelsdfouhey/ECCV2014Tutorial/surfacesAnd...Lines (perspective) Position 3D Normal (w/ depth) 3D Planarity (w/ depth) 3. Classify Boost](https://reader036.fdocuments.us/reader036/viewer/2022071509/6129ef67efa644383f40ccf6/html5/thumbnails/40.jpg)
Strategies for structured prediction
• Probabilistic models
• Structured SVM
• Sequential structured prediction
![Page 41: Scene Understanding Tutorial: Surfaces and 3D Modelsdfouhey/ECCV2014Tutorial/surfacesAnd...Lines (perspective) Position 3D Normal (w/ depth) 3D Planarity (w/ depth) 3. Classify Boost](https://reader036.fdocuments.us/reader036/viewer/2022071509/6129ef67efa644383f40ccf6/html5/thumbnails/41.jpg)
Probabilistic models: example “objects in perspective”
• Use when dependencies are sparse
• When dependencies form a tree, learning and inference are easy and fast
• Most likely and marginal solutions possible (depending on model)
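To illustrate why tree-structured dependencies make inference easy, here is a minimal max-sum (Viterbi-style) sketch on a chain, the simplest tree. It is a generic example, not the model from "objects in perspective"; the scores and the function name `chain_map` are made up for illustration.

```python
def chain_map(unary, pairwise):
    """Exact most-likely assignment on a chain by dynamic programming.

    unary[i][s]    : score of state s at node i
    pairwise[a][b] : score of adjacent nodes taking states (a, b)
    Runs in O(n * k^2) time for n nodes and k states.
    """
    n, k = len(unary), len(unary[0])
    best = list(unary[0])          # best score of any prefix ending in each state
    back = []                      # back-pointers for recovering the argmax
    for i in range(1, n):
        new_best, ptr = [], []
        for b in range(k):
            a_star = max(range(k), key=lambda a: best[a] + pairwise[a][b])
            ptr.append(a_star)
            new_best.append(best[a_star] + pairwise[a_star][b] + unary[i][b])
        best = new_best
        back.append(ptr)
    state = max(range(k), key=lambda s: best[s])
    score = best[state]
    path = [state]
    for ptr in reversed(back):
        state = ptr[state]
        path.append(state)
    path.reverse()
    return path, score


# The middle node weakly prefers state 1, but agreement with its neighbors wins.
unary = [[2, 0], [0, 1], [2, 0]]
pairwise = [[1, -1], [-1, 1]]
path, score = chain_map(unary, pairwise)   # ([0, 0, 0], 6)
```

Picking each node's best state independently would give [0, 1, 0] with total score 3; the joint solution scores 6. Replacing max with sum (in the probability domain) gives marginals by the same recursion.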
![Page 42: Scene Understanding Tutorial: Surfaces and 3D Modelsdfouhey/ECCV2014Tutorial/surfacesAnd...Lines (perspective) Position 3D Normal (w/ depth) 3D Planarity (w/ depth) 3. Classify Boost](https://reader036.fdocuments.us/reader036/viewer/2022071509/6129ef67efa644383f40ccf6/html5/thumbnails/42.jpg)
Structured SVM
Example: Fitting a box to a room
(Schwing Urtasun 2012)
Geometric Context (Hoiem et al. 2007)
Orientation Maps (Lee et al. 2009)
Predicted box layout
![Page 43: Scene Understanding Tutorial: Surfaces and 3D Modelsdfouhey/ECCV2014Tutorial/surfacesAnd...Lines (perspective) Position 3D Normal (w/ depth) 3D Planarity (w/ depth) 3. Classify Boost](https://reader036.fdocuments.us/reader036/viewer/2022071509/6129ef67efa644383f40ccf6/html5/thumbnails/43.jpg)
Box Layout Model
• Room is an oriented 3D box
– Three vanishing points specify orientation
– Two pairs of sampled rays specify position/size
Hedau Hoiem Forsyth, ICCV 2009
![Page 44: Scene Understanding Tutorial: Surfaces and 3D Modelsdfouhey/ECCV2014Tutorial/surfacesAnd...Lines (perspective) Position 3D Normal (w/ depth) 3D Planarity (w/ depth) 3. Classify Boost](https://reader036.fdocuments.us/reader036/viewer/2022071509/6129ef67efa644383f40ccf6/html5/thumbnails/44.jpg)
SSVM example: fitting a box
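Train weights $\mathbf{w}$ to minimize

$$\min_{\mathbf{w}} \; \|\mathbf{w}\|^2 + C \sum_{n=1}^{l} \max_{y \in \mathcal{Y}} \Big( \underbrace{\Delta(y_n, y)}_{\text{loss}} + \underbrace{\mathbf{w}^\top \big( \psi(x_n, y) - \psi(x_n, y_n) \big)}_{\text{margin}} \Big)$$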
Features are sums of predictions in wall/floor/ceiling regions from geometric context and orientation maps
![Page 45: Scene Understanding Tutorial: Surfaces and 3D Modelsdfouhey/ECCV2014Tutorial/surfacesAnd...Lines (perspective) Position 3D Normal (w/ depth) 3D Planarity (w/ depth) 3. Classify Boost](https://reader036.fdocuments.us/reader036/viewer/2022071509/6129ef67efa644383f40ccf6/html5/thumbnails/45.jpg)
SSVM example: fitting a box
• Main idea: the correct solution should score higher than every other solution by a margin equal to that solution’s loss
• Cutting plane training algorithm requires iteratively solving for “most violating constraint”, so inference must be fast
• Area-sum features computed quickly with integral geometry
• Inference computed quickly (~10ms) with branch and bound
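The "most violating constraint" and the training objective can be sketched on a toy problem. This is an illustration under stated assumptions: it enumerates a small candidate set by brute force (the slide's method uses branch and bound instead), and the 1D "boundary position" task, the feature map `psi`, and the loss are hypothetical stand-ins for box layouts and area-sum features.

```python
def dot(w, f):
    return sum(wi * fi for wi, fi in zip(w, f))


def most_violated(w, psi, loss, x, y_true, candidates):
    """Loss-augmented inference: the candidate maximizing
    loss(y_true, y) + w . (psi(x, y) - psi(x, y_true))."""
    def violation(y):
        return loss(y_true, y) + dot(w, psi(x, y)) - dot(w, psi(x, y_true))
    y_hat = max(candidates, key=violation)
    return y_hat, violation(y_hat)


def ssvm_objective(w, psi, loss, data, candidates, C):
    """||w||^2 + C * sum over examples of the largest violation."""
    slack = sum(most_violated(w, psi, loss, x, y, candidates)[1]
                for x, y in data)
    return dot(w, w) + C * slack


# Toy task: x holds per-column boundary evidence, y is the boundary column.
def psi(x, y):
    return [x[y]]


def loss(y_true, y):
    return abs(y_true - y)


x = [0.1, 0.2, 0.9, 0.3]
candidates = [0, 1, 2, 3]
data = [(x, 2)]
y_hat, violation = most_violated([1.0], psi, loss, x, 2, candidates)
small_w = ssvm_objective([1.0], psi, loss, data, candidates, C=1.0)
large_w = ssvm_objective([5.0], psi, loss, data, candidates, C=1.0)
```

With a small weight the margin constraint for the far-away candidate is violated; a larger weight satisfies every constraint but pays for it in the regularizer, which is the trade-off that C controls. A cutting plane trainer would call `most_violated` repeatedly, which is why that step must be fast.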
![Page 46: Scene Understanding Tutorial: Surfaces and 3D Modelsdfouhey/ECCV2014Tutorial/surfacesAnd...Lines (perspective) Position 3D Normal (w/ depth) 3D Planarity (w/ depth) 3. Classify Boost](https://reader036.fdocuments.us/reader036/viewer/2022071509/6129ef67efa644383f40ccf6/html5/thumbnails/46.jpg)
Structured SVM: comments
• Most often used when predicted variables have the same type (so that a single loss makes sense)
• Learning can be difficult when the loss is complex (when loss-augmented inference is intractable)
• Often used when a single solution is desired (though there are some n-best approaches; cf. Batra et al.)
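Train weights $\mathbf{w}$ to minimize

$$\min_{\mathbf{w}} \; \|\mathbf{w}\|^2 + C \sum_{n=1}^{l} \max_{y \in \mathcal{Y}} \Big( \underbrace{\Delta(y_n, y)}_{\text{loss}} + \underbrace{\mathbf{w}^\top \big( \psi(x_n, y) - \psi(x_n, y_n) \big)}_{\text{margin}} \Big)$$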
![Page 47: Scene Understanding Tutorial: Surfaces and 3D Modelsdfouhey/ECCV2014Tutorial/surfacesAnd...Lines (perspective) Position 3D Normal (w/ depth) 3D Planarity (w/ depth) 3. Classify Boost](https://reader036.fdocuments.us/reader036/viewer/2022071509/6129ef67efa644383f40ccf6/html5/thumbnails/47.jpg)
Sequential structured prediction
Iteratively predict each variable based on features and confidences of other variables
Barrow and Tenenbaum ’78
[Figure labels: illumination, orientation, distance, reflectance, intensity]
![Page 48: Scene Understanding Tutorial: Surfaces and 3D Modelsdfouhey/ECCV2014Tutorial/surfacesAnd...Lines (perspective) Position 3D Normal (w/ depth) 3D Planarity (w/ depth) 3. Classify Boost](https://reader036.fdocuments.us/reader036/viewer/2022071509/6129ef67efa644383f40ccf6/html5/thumbnails/48.jpg)
Sequential structured prediction
[Diagram: the input image yields image features (Features 1 … Features N) for the individual predictors (Predictor 1 … Predictor N); each predictor's predictions are passed to the other predictors as contextual features]
Each predictor can be any classifier, regressor, or even a full graphical model
![Page 49: Scene Understanding Tutorial: Surfaces and 3D Modelsdfouhey/ECCV2014Tutorial/surfacesAnd...Lines (perspective) Position 3D Normal (w/ depth) 3D Planarity (w/ depth) 3. Classify Boost](https://reader036.fdocuments.us/reader036/viewer/2022071509/6129ef67efa644383f40ccf6/html5/thumbnails/49.jpg)
Example: reasoning about surfaces, occlusion, objects, and viewpoint
Hoiem Efros Hebert 2008
![Page 50: Scene Understanding Tutorial: Surfaces and 3D Modelsdfouhey/ECCV2014Tutorial/surfacesAnd...Lines (perspective) Position 3D Normal (w/ depth) 3D Planarity (w/ depth) 3. Classify Boost](https://reader036.fdocuments.us/reader036/viewer/2022071509/6129ef67efa644383f40ccf6/html5/thumbnails/50.jpg)
![Page 51: Scene Understanding Tutorial: Surfaces and 3D Modelsdfouhey/ECCV2014Tutorial/surfacesAnd...Lines (perspective) Position 3D Normal (w/ depth) 3D Planarity (w/ depth) 3. Classify Boost](https://reader036.fdocuments.us/reader036/viewer/2022071509/6129ef67efa644383f40ccf6/html5/thumbnails/51.jpg)
Sequential prediction as belief propagation
• Sequential prediction updates each prediction based on marginals of related predictions
– New update function learned for each iteration (typically)
– Classifier encodes complex functions of many variables
– Each iteration improves likelihood of training predictions
– Can provide guarantees on prediction loss
Tu Bai 2010: Auto-context
Ross Munoz Hebert Bagnell 2011: Message-Passing Inference Machines
“unrolled BP”
(Ross et al. 2011)
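A minimal sketch of sequential prediction in this spirit: at each stage a new predictor is trained on the local feature plus a contextual feature computed from the neighbors' confidences at the previous stage. The nearest-centroid predictor, the 1D chain of sites, and the toy data are assumptions chosen to keep the example short; the cited methods use stronger classifiers and typically train later stages on held-out predictions, which this sketch omits.

```python
import math


def fit_centroids(feats, labels):
    """Stage predictor: mean feature vector of each class (0 and 1)."""
    cents = {}
    for c in (0, 1):
        rows = [f for f, l in zip(feats, labels) if l == c]
        cents[c] = [sum(col) / len(rows) for col in zip(*rows)]
    return cents


def confidence(cents, f):
    """Soft score for class 1 in [0, 1] from distances to the two centroids."""
    d0, d1 = math.dist(f, cents[0]), math.dist(f, cents[1])
    return 0.5 if d0 + d1 == 0 else d0 / (d0 + d1)


def context(conf, i):
    """Contextual feature: mean confidence of the chain neighbors of site i."""
    nbrs = [conf[j] for j in (i - 1, i + 1) if 0 <= j < len(conf)]
    return sum(nbrs) / len(nbrs)


def run_stage(cents, x, conf):
    feats = [[xi, context(conf, i)] for i, xi in enumerate(x)]
    return [confidence(cents, f) for f in feats]


def train_sequential(x, y, n_stages):
    """Learn one predictor per stage, feeding confidences forward."""
    stages, conf = [], [0.5] * len(x)
    for _ in range(n_stages):
        feats = [[xi, context(conf, i)] for i, xi in enumerate(x)]
        cents = fit_centroids(feats, y)
        stages.append(cents)
        conf = run_stage(cents, x, conf)
    return stages


def predict_sequential(stages, x):
    conf = [0.5] * len(x)
    for cents in stages:
        conf = run_stage(cents, x, conf)
    return conf


# Toy chain: sites 2 and 6 have misleading local features (values are made up).
x = [0.1, 0.2, 0.6, 0.15, 0.8, 0.9, 0.45, 0.85]
y = [0, 0, 0, 0, 1, 1, 1, 1]
one_stage = predict_sequential(train_sequential(x, y, 1), x)
two_stage = predict_sequential(train_sequential(x, y, 2), x)
```

On this toy chain the first stage, which sees only local features, gets the two ambiguous sites wrong; the second stage also sees its neighbors' confidences and labels every site correctly. This is evaluated on the training data only, so it shows the mechanism, not generalization.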
![Page 52: Scene Understanding Tutorial: Surfaces and 3D Modelsdfouhey/ECCV2014Tutorial/surfacesAnd...Lines (perspective) Position 3D Normal (w/ depth) 3D Planarity (w/ depth) 3. Classify Boost](https://reader036.fdocuments.us/reader036/viewer/2022071509/6129ef67efa644383f40ccf6/html5/thumbnails/52.jpg)
What strategy to use?
• Graphical model (probabilistic or energy/SVM)
– Dependencies are sparse and easy to model
– Single loss or probability function makes sense
– Want an explicit global objective function
• Sequential prediction
– Dependencies are dense and/or complex
– Need to make multiple predictions with different loss functions
![Page 53: Scene Understanding Tutorial: Surfaces and 3D Modelsdfouhey/ECCV2014Tutorial/surfacesAnd...Lines (perspective) Position 3D Normal (w/ depth) 3D Planarity (w/ depth) 3. Classify Boost](https://reader036.fdocuments.us/reader036/viewer/2022071509/6129ef67efa644383f40ccf6/html5/thumbnails/53.jpg)
Big open problems
• Modeling uncertainty in complex scene representations
• Developing approaches that easily adapt to different input sensors
• Cumulative scene understanding over long observations
![Page 54: Scene Understanding Tutorial: Surfaces and 3D Modelsdfouhey/ECCV2014Tutorial/surfacesAnd...Lines (perspective) Position 3D Normal (w/ depth) 3D Planarity (w/ depth) 3. Classify Boost](https://reader036.fdocuments.us/reader036/viewer/2022071509/6129ef67efa644383f40ccf6/html5/thumbnails/54.jpg)
Questions?
• Next up
– Abhinav Gupta on “Volumetric and Functional Constraints”
– David Fouhey on “Non-parametric approaches to 3D scene understanding”