EECS 442 – Computer vision 3D Object Recognition and Scene Understanding

Transcript of 3D Object Recognition and Scene...

Page 1: 3D Object Recognition and Scene Understanding, eecs.umich.edu/vision/teaching/EECS442_2012/lectures/... 3D Object Categorization: Mixture of 2D single view models • Weber et al. '00

EECS 442 – Computer vision

3D Object Recognition and Scene Understanding

Page 2:

Object: Building 8-10 meters away

Object: Car, ¾ view 2-3 meters away

Interpreting the visual world

Object: Traffic light

Page 3:

How can we achieve all of this?

• 3D modeling – no semantic • Semantic reasoning – no 3D geometry • Joint 3D modeling and semantic reasoning

… Chen & Medioni, 92 Debevec et al 96 Pollefeys et al 02 Nister 04 Hartley & Zisserman, 00 Levoy et al., 00 Brown & Lowe, 04 Schindler et al 08 Snavely et al 08 Agarwal et al 09 Etc…

Page 4:

How can we achieve all of this?

… Weber et al. 00 Felzenszwalb & Huttenlocher, 00 Leibe & Schiele, 04 Kumar & Hebert ’04 Fei-Fei & Perona, ‘05 Sivic et al. ’05 Shotton et al ‘05 Grauman et al. ‘05

Ullman et al. 02 Fergus et al. ’03 Torralba et al. ‘03

Lazebnik et al, 06 Maji & Malik, 07

Vedaldi & Soatto ’08 Zhu et al 08 Etc…

• 3D modeling – no semantic • Semantic – no 3D geometry

Page 5:

How can we achieve all of this?

• 3D modeling – no semantic • Semantic – no 3D geometry

• Semantic from range data – disjoint 3D modeling and recognition

• … • Huber 01 • Rusu et al. 08 • Brostow et al. 08 • Son & Kim 10 • Tang et al. 10 • Adan et al. 11 • etc …

Courtesy of Adan et al., 2011

Page 6:

How can we achieve all of this?

• 3D modeling – no semantic • Semantic – no 3D geometry

• Joint 3D modeling and semantic reasoning

• Hoiem et al. 06-10 • Gould et al. 09 • Hedau et al. 09

• Gupta et al, 10 • Ladický et al, 10 • Bao, Sun, Savarese 10 • Sun, Bao, Savarese 10 • Bao & Savarese, 11

• Semantic from range data – disjoint 3D modeling and recognition

Page 7:

Joint 3D modeling and recognition

• Given the scene layout, objects can be detected more robustly

• Objects and their geometrical attributes provide constraints for estimating the scene layout

Page 8:

• 3D Object detectors

– Robust to viewpoint transformation

– Allow estimation of pose, scale and 3D shape

• Methods for coherent object detection and scene layout estimation – single view

– multi-view

– videos

In this lecture….

Page 9:

Viewing sphere

• Detect objects under generic viewpoints • Estimate object pose • General: works for any object category

Azimuth, Zenith

3D Object Detectors
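A viewpoint on the viewing sphere is described by two angles, azimuth and zenith. As a minimal sketch, the function below turns such a pair into a camera position around an object at the origin. The axis convention (zenith measured from +z, azimuth from +x) and the function name are assumptions made for illustration; the slide does not state a convention.

```python
import math

def viewing_sphere_position(azimuth_deg, zenith_deg, radius=1.0):
    """Camera centre on a viewing sphere around an object at the origin.

    Assumed convention (not from the slides): zenith is measured from the
    +z axis, azimuth counter-clockwise from the +x axis in the x-y plane.
    """
    az = math.radians(azimuth_deg)
    ze = math.radians(zenith_deg)
    x = radius * math.sin(ze) * math.cos(az)
    y = radius * math.sin(ze) * math.sin(az)
    z = radius * math.cos(ze)
    return (x, y, z)
```

With this convention, a zenith of 90 degrees places the camera on the equator of the sphere, and sweeping the azimuth moves it around the object.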

Page 10:

3D Object Detectors

• Detect objects under generic viewpoints • Estimate object pose • General: works for any object category

Page 11:

3D Object Categorization

Page 12:

•Felzenszwalb & Huttenlocher ‘03 •Fei-Fei et al. ‘04

•Leibe et al. ‘04

•Sudderth et al ‘05 •Torralba et al. ‘05 •Lazebnik et al. ‘06 •Todorovic et al. ’06 •Bosch et al ‘07 •Vedaldi & Soatto ‘08

•Kumar & Hebert ‘04 •Sivic et al. ’05 •Shotton et al ‘05

•Grauman et al. ‘05

•Leung et al ‘99 •Weber et al. ‘00 •Ullman et al. 02 •Fergus et al. ’03 •Torralba et al. ‘03

Single view object categorization

Page 13:

•Zhang et al ’95 •Schmid & Mohr, ‘96 •Schiele & Crowley, ’96 •Lowe, ‘99 •Jacobs & Basri, ‘99 •Rothganger et al., ‘04

•Edelman et al. ’91 •Ullman & Basri, ’91 • Rothwell ‘92 •Lindeberg, ’94 •Murase & Nayar ‘94

•Ferrari et al, ’05 •Brown & Lowe ’05 •Snavely et al ’06 •Yin & Collins, ‘07

•Ballard, ‘81 •Grimson & Lozano-Pérez, ‘87 •Lowe, ’87

Single 3D object recognition

Page 14:

3D Object Categorization

Page 15:

3D models - Explicit 3d models - Implicit 3d models

• Chiu et al. ‘07 • Hoiem, et al., ’07 • Yan, et al. ’07

3D Object Categorization

Mixture of 2D single view models

• Weber et al. ‘00 • Schneiderman et al. ’01 • Bart et al. ’04 • Gu & Ren, ‘10

•Thomas et al. ‘06 • Kushal, et al., ’07 • Savarese et al, 07, 08

Page 16:

Single view model

Single view model

Mixture of 2D models • Weber et al. ’00 • Schneiderman et al. ’01 • Ullman et al. 02 • Fergus et al. ’03 • Torralba et al. ’03

• Felzenszwalb & Huttenlocher ‘03 • Leibe et al. ’04 • Shotton et al. ‘05 • Grauman et al. ’05

• Savarese et al, ‘06 •Todorovic et al. ’06 • Vedaldi & Soatto ’08 • Zhu et al 08 • Gu & Ren, ‘10

3D Category model

CONS:
• Single view models are independent
• Not scalable to a large number of categories/viewpoints
• Just bounding boxes
• Cannot estimate 3D pose or 3D layout

Page 17:

3D models - Implicit 3d models - Explicit 3d models

• Chiu et al. ‘07 • Hoiem, et al., ’07 • Yan, et al. ’07 …. • Xiang & Savarese ‘12

3D Object Categorization

Mixture of 2D single view models

• Weber et al. ‘00 • Schneiderman et al. ’01 • Bart et al. ’04 • Gu & Ren, ‘10

•Thomas et al. ‘06 • Kushal, et al., ’07 • Savarese et al, 07, 08 • Sun et al. ’09 …

Page 18:

Implicit 3D models

Sparse set of interest points or parts of the objects are linked across views by implicit 3D transformations (H, F)

… 3D Category

model

Page 19:

x’ x

Linking features or parts across views: Perspective or affine transformation constraints

x’ = H x
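The relation x' = H x holds in homogeneous coordinates, so a feature is transferred to the other view by multiplying with the 3x3 matrix H and dividing by the last coordinate. A minimal sketch with NumPy follows; the function name and the example matrices used with it are illustrative, not taken from the lecture.

```python
import numpy as np

def transfer_point(H, x):
    """Map a 2D point x = (u, v) into the other view using x' ~ H x."""
    xh = np.array([x[0], x[1], 1.0])   # lift to homogeneous coordinates
    xp = H @ xh                        # apply the 3x3 transformation
    return xp[:2] / xp[2]              # divide out the scale factor
```

When the last row of H is (0, 0, 1) the map is affine and the division has no effect; a general last row gives the perspective case.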

Page 20:

l’ = F^T x

(figure labels: x, x’, l’)

Linking features or parts across views: Epipolar Transformation Constraints
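The epipolar constraint links a point in one view to a line in the other. The sketch below reads the slide's relation as l' = F^T x, which corresponds to the convention x^T F x' = 0 (an assumption; other texts define F the other way round and write l' = F x). The function names are illustrative.

```python
import numpy as np

def epipolar_line(F, x):
    """Epipolar line l' = F^T x in the second view for point x = (u, v).

    Assumes the convention x^T F x' = 0 (hypothetical choice, see lead-in).
    """
    xh = np.array([x[0], x[1], 1.0])
    return F.T @ xh

def point_line_distance(line, x):
    """Distance in pixels from point x = (u, v) to line (a, b, c)."""
    a, b, c = line
    return abs(a * x[0] + b * x[1] + c) / np.hypot(a, b)
```

A candidate match x' for x is geometrically plausible when its distance to the epipolar line is small, which is how such a constraint can prune feature or part correspondences across views.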

Page 21:

Sparse set of interest points or parts of the objects are linked across views.

… Multi-view

model

• Thomas et al. ’06 • Leibe et al. ‘04

Implicit 3D models by ISM representations

Page 22:

Courtesy of Thomas et al. 06

Set of region-tracks connecting model views. Each track is composed of the image regions of a single physical surface patch along the model views in which it is visible.

[Ferrari et al. ’04, ‘06]

Region tracks

Implicit 3D models by ISM representations

Page 23:

Results

Implicit 3D models by ISM representations

Page 24:

• Canonical parts capture view-invariant diagnostic appearance information

Savarese, Fei-Fei, ICCV 07 Savarese, Fei-Fei, ECCV 08 Sun, et al, CVPR 2009, ICCV 09

• Parts and their relationships are modeled in a probabilistic fashion • Parameters are learnt so as to maximize detection accuracy

• 2½D structure linking parts via weak geometry

Implicit 3D models by graph-based representations

Page 25:

Parameterization on view-sphere

T

• Model the object as a collection of parts for any T and S on the viewing sphere

S

Page 26:

Multi-view generative part-based model

T, S

Image

Yn=Codeword Xn=Location

Yn=Codeword Xn=Location

Page 27:

A

Image

V

K

= Part Prop. Prior

= Part Appearance

= Part Location/shape

Yn=Codeword Xn=Location

T, S

Multi-view generative part-based model

Page 28:

A

Image

V

K

= Part Prop. Prior

= Part Appearance

= Part Location/shape

Yn=Codeword Xn=Location

T, S

Multi-view generative part-based model • Learning: estimate the latent variables and relevant parameters, given the observations

• Variational EM can be used [Blei, ICML 2004].

Page 29:

T

Within triangle constraints: [equation in the symbols m_i, m_j, M_ij]

Encoded as a penalty term in variational EM

Incorporating geometrical constraints

Page 30:

T

Encoded as a penalty term in variational EM

Incorporating geometrical constraints

View morphing constraints:

= Shape

= Center

Seitz & Dyer, SIGGRAPH 96; Xiao & Shah, CVIU ‘04
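View morphing (Seitz & Dyer) rests on the observation that, for parallel (rectified) views, linearly interpolating corresponding image points yields a geometrically valid in-between view. The sketch below shows only that interpolation step for a part's center or shape points. How the lecture's model turns it into a penalty term is not given on the slide, and the function name and the parameter s are assumptions.

```python
import numpy as np

def morph_points(x_a, x_b, s):
    """Interpolate corresponding points between view A (s=0) and view B (s=1).

    x_a and x_b hold matching 2D points (a part center, or the points
    outlining its shape) in two rectified views.
    """
    x_a = np.asarray(x_a, dtype=float)
    x_b = np.asarray(x_b, dtype=float)
    return (1.0 - s) * x_a + s * x_b
```

For example, a part centered at (0, 0) in one view and (10, 4) in the other is predicted at (2.5, 1.0) a quarter of the way between them.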

Page 31:

S. M. Seitz and C. R. Dyer, Proc. SIGGRAPH 96, 1996, 21-30

Incorporating geometrical constraints

Page 32:

I^h: P_1^h, P_2^h, P_3^h, O^h

I^k: P_1^k, P_2^k, P_3^k, O^k

Sequential RANSAC, J-linkage [Toldo et al. 07]

•Defining initial parts and part correspondences

Initializing the model
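Sequential RANSAC finds several model instances by fitting one model, removing its inliers, and repeating on what is left. The sketch below uses 2D line fitting as a stand-in model, since the slide does not say which transformation is fitted when defining initial parts. All function names, the tolerance and the iteration count are illustrative assumptions, and J-linkage is not shown.

```python
import numpy as np

def fit_line(p, q):
    """Line through p and q as (a, b, c), a*x + b*y + c = 0, unit normal."""
    d = q - p
    n = np.array([-d[1], d[0]])
    n = n / np.linalg.norm(n)
    return np.array([n[0], n[1], -n @ p])

def ransac_line(points, rng, n_iters=200, tol=0.05):
    """Return the inlier mask of the best line found by plain RANSAC."""
    best = np.zeros(len(points), dtype=bool)
    for _ in range(n_iters):
        i, j = rng.choice(len(points), size=2, replace=False)
        if np.allclose(points[i], points[j]):
            continue  # degenerate sample, no line defined
        line = fit_line(points[i], points[j])
        inliers = np.abs(points @ line[:2] + line[2]) < tol
        if inliers.sum() > best.sum():
            best = inliers
    return best

def sequential_ransac(points, n_models, seed=0, min_inliers=5):
    """Label each point with the index of the model it supports (-1 = none)."""
    rng = np.random.default_rng(seed)
    labels = -np.ones(len(points), dtype=int)
    remaining = np.arange(len(points))
    for k in range(n_models):
        if len(remaining) < 2:
            break
        inliers = ransac_line(points[remaining], rng)
        if inliers.sum() < min_inliers:
            break
        labels[remaining[inliers]] = k       # assign inliers to model k
        remaining = remaining[~inliers]      # continue on the leftovers
    return labels
```

Each resulting group of points plays the role of one initial part; in practice the line model would be replaced by whatever transformation relates the features of a part.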

Page 33:

Semi-supervised

• Class label • Object bounding box

• No need to observe same object instance from multiple views

• No pose labels [unlike Sun CVPR 09]

[unlike Savarese & Fei-Fei, 07, 08]

• No part labels

Page 34:

Incremental learning

• Enables unorganized, on-line collection of training images • Increases learning efficiency (no need for large storage space)

T

Page 35:

Incremental learning

• Evidence of training image is used to update model parameters

T

• Assign new training image to a triangle of the view sphere

• Re-estimate sufficient statistics in an iterative fashion
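The steps above can be illustrated with running sufficient statistics: each new training image updates a few accumulated sums for the triangle it is assigned to, so earlier images never need to be stored. The sketch keeps the statistics of a one-dimensional Gaussian per triangle. The class, the `triangle_id` key and the scalar feature are hypothetical simplifications, not the model's actual parameters.

```python
from collections import defaultdict

class RunningGaussian:
    """Sufficient statistics (count, sum, sum of squares) of a 1D Gaussian."""

    def __init__(self):
        self.n = 0
        self.s1 = 0.0
        self.s2 = 0.0

    def update(self, x):
        """Fold one new observation into the statistics."""
        self.n += 1
        self.s1 += x
        self.s2 += x * x

    @property
    def mean(self):
        return self.s1 / self.n

    @property
    def variance(self):
        return self.s2 / self.n - self.mean ** 2

# One set of statistics per view-sphere triangle (keys are hypothetical ids).
stats = defaultdict(RunningGaussian)

def add_training_observation(triangle_id, feature_value):
    """Assign the observation to a triangle and update that triangle only."""
    stats[triangle_id].update(feature_value)
```

Because only the sums are kept, storage does not grow with the number of training images, which matches the efficiency argument made for incremental learning.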

Page 36:

Evolution of learnt parts

Page 37:

Car

Examples of learnt part-based models

Page 38:

Travel iron

Examples of learnt part-based models

Page 39:

Experimental results

• Object detection from any viewing angles • Accurate estimation of the object pose

• PASCAL 2006 dataset • 3D Object Dataset

Page 40:

Car

Page 41:

Travel Iron

Page 42:

Our model

Detection

Car Bicycle

Savarese & Fei-Fei ICCV ’07

Sun et al, CVPR 09

- 3D Object Dataset

ROC ROC

Page 43:

[Plot: classification accuracy for viewpoints V1 to V8, angle ticks 45º to 315º; legend: Our model, Savarese ICCV ’07]

Viewpoint Classification 3D object dataset
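Viewpoint classification over eight views amounts to quantizing the azimuth into 45º bins and counting how often the predicted bin matches the true one. A minimal sketch follows; the bin layout (bin k centered at k times 45º) and both function names are assumptions, since the slide only shows the labels V1 to V8.

```python
def viewpoint_bin(azimuth_deg, n_bins=8):
    """Index of the viewpoint bin whose center is closest to the azimuth."""
    width = 360.0 / n_bins
    return int(((azimuth_deg % 360.0) + width / 2.0) // width) % n_bins

def viewpoint_accuracy(predicted_deg, true_deg, n_bins=8):
    """Fraction of samples whose predicted and true bins agree."""
    hits = sum(
        viewpoint_bin(p, n_bins) == viewpoint_bin(t, n_bins)
        for p, t in zip(predicted_deg, true_deg)
    )
    return hits / len(true_deg)
```

The modulo in `viewpoint_bin` makes the bins wrap around, so 350º and 5º fall into the same frontal bin.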

Page 44:

Predicting object appearance from novel views

Viewing sphere

?

Page 45:

[For natural scenes, see Hoiem et al 07; Saxena et al 07]

Thomas et al 08 Cremer et al 09

Predicting object appearance from novel views

Page 46:
Page 47:

Affine transformation

Our model

Predicting object appearance from novel views

Page 48:

3D models - Explicit 3D models - Implicit 3D models

• Chiu et al. ‘07 • Hoiem, et al., ’07 • Yan, et al. ’07

3D Object Categorization

Mixture of 2D single view models

• Weber et al. ‘00 • Schneiderman et al. ’01 • Bart et al. ’04 • Gu & Ren, ‘10

•Thomas et al. ‘06 • Kushal, et al., ’07 • Savarese et al, 07, 08

Page 49:

• Chiu et al. ‘07 • Hoiem, et al., ’07 • Yan, et al. ’07 • Xiang & Savarese, 12

Explicit 3D Models

• Part configuration is modeled as a conditional random field with max-margin parameter estimation

• Enable 6DOF object pose estimation • 3D layout estimation of object parts

3D Category model

Hij

Page 50:

3D models - Explicit 3D models - Implicit 3D models

• Chiu et al. ‘07 • Hoiem, et al., ’07 • Yan, et al. ’07

Mixture of 2D single view models

• Weber et al. ‘00 • Schneiderman et al. ’01 • Bart et al. ’04 • Gu & Ren, ‘10

•Thomas et al. ‘06 • Kushal, et al., ’07 • Savarese et al, 07, 08

[3D object dataset, 07]

• Xiang & Savarese, CVPR 12

Explicit 3D Models

Page 51:

• 3D Object detectors

– Robust to viewpoint transformation

– Allow estimation of pose, scale and 3D shape

• Methods for coherent object detection and scene layout estimation – single view

– multi-view

– videos

In this lecture….

Page 52:

• Coherent probabilistic model captures the relationship between objects and supporting planes
• No assumptions on cameras
• Works both indoors and outdoors

3D scene understanding from a single image Bao, Sun, Savarese, CVPR 2010; BMVC 2010; IJCV 2012

• Hoiem et al. 06-10 • Gould et al. 09 • Hedau et al. 09 •Lee et al. ‘09, 10 • Gupta et al, 10, 11 • Tsai et al. ‘11

Page 53:

• Coherent probabilistic model captures the relationship between objects and supporting planes
• No assumptions on cameras
• Works both indoors and outdoors

3D scene understanding from a single image

• Hoiem et al. 06-10 • Gould et al. 09 • Hedau et al. 09 •Gupta et al, 10, 11

Bao, Sun, Savarese, CVPR 2010; BMVC 2010; IJCV 2012

Page 54:

• 3D Object detectors

– Robust to viewpoint transformation

– Allow estimation of pose, scale and 3D shape

• Methods for coherent object detection and scene layout estimation – single view

– multi-view

– videos

In this lecture….

Page 55:

• Measurements I:
– Points (x, y, scale)
– Objects (x, y, scale, pose)
– Regions (x, y, pose)

• Model parameters:
– Q = 3D points
– O = 3D objects
– B = 3D regions
– C = camera parameters K, R, T

Bao & S. Savarese, CVPR 2011 Bao, Bagra, Savarese . CORP – ICCV 2011 Bao, Bagra, Chao, Savarese, CVPR 2012

Bao, Xiang, Savarese, ECCV 2012

3D scene understanding from multiple images Semantic Structure from Motion (SSFM)

• Huber 01 • Rusu et al. 08 • Brostow et al. 08

•Son & Kim 10 • Tang et al. 10 • Adan et al. 11 • etc …

Page 56:

Semantic Structure from Motion (SSFM)

Page 57:

Semantic Structure from Motion (SSFM)

Factor graph with potentials Ψ_CO, Ψ_CB, Ψ_CQ

Page 58:

SSFM: point-level compatibility

Ψ_CQ

Page 59:

• Tomasi & Kanade ‘92 • Triggs et al ’99 • Soatto & Perona 99 • Hartley & Zisserman 00 • Dellaert et al. 00

Point re-projection error


SSFM: point-level compatibility

projection

observation

• Pollefeys & V. Gool 02 • Nister 04 • Brown & Lowe 07 • Snavely et al. 08
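The point re-projection error compares where a 3D point projects under the current camera estimate with where it was actually observed. A minimal sketch, assuming a pinhole camera x ~ K (R Q + T) with the K, R, T named on the slide, follows; the function names and the choice of the mean Euclidean pixel distance as the summary are illustrative.

```python
import numpy as np

def project(K, R, T, Q):
    """Project 3D points Q (N x 3) into the image with x ~ K (R Q + T)."""
    cam = Q @ R.T + T                  # world frame to camera frame
    uvw = cam @ K.T                    # apply the intrinsics
    return uvw[:, :2] / uvw[:, 2:3]    # perspective division

def reprojection_error(K, R, T, Q, observed):
    """Mean pixel distance between projections and observations."""
    diff = project(K, R, T, Q) - observed
    return np.sqrt((diff ** 2).sum(axis=1)).mean()
```

Structure-from-motion methods minimize this kind of quantity over cameras and points; here it is only evaluated for given values.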

Page 60:

SSFM: Object-level compatibility

Ψ_CO

Object “re-projection” error

Page 61:

Camera 1 Camera 2

• Agreement with measurements is computed using position, pose and scale

SSFM: Object-level compatibility

Page 62:

Class = “car” scale=1 pose=“back“

SSFM: Object-level compatibility

• Savarese, Fei-Fei, ICCV 07 • Savarese, Fei-Fei, ECCV 08

• Su et al, ICCV 2009 • Sun, et al, CVPR 2009 • Sun et al, ECCV 2010

• Yu & Savarese, CVPR 2012

• A 3D object detector returns the confidence value (probability) that an object class c with scale s and pose p is found at x,y

Page 63:

Class = “car” scale=3 pose=“3/4“

• Savarese, Fei-Fei, ICCV 07 • Savarese, Fei-Fei, ECCV 08

• Su et al, ICCV 2009 • Sun, et al, CVPR 2009 • Sun et al, ECCV 2010

• Yu & Savarese, CVPR 2012

SSFM: Object-level compatibility

• A 3D object detector returns the confidence value (probability) that an object class c with scale s and pose p is found at x,y

Page 64:

Camera 1 Camera 2

SSFM: Object-level compatibility

Class = “car” scale=1 pose=“back“

Class = “car” scale=1 pose=“3/4“

• Efficiently implemented using a parallel computing architecture

Page 65:

SSFM: Region-level compatibility

Ψ_CB

Region “re-projection” error

Page 66:

SSFM with interactions


Potentials: Ψ_OB, Ψ_QB, Ψ_QO, Ψ_CO, Ψ_CB, Ψ_CQ

Bao, Bagra, Chao, Savarese CVPR 2012

Page 67:

SSFM with interactions


Object-Point Interactions:


Bao, Bagra, Chao, Savarese CVPR 2012

Page 68:

SSFM with interactions


Point-Region Interactions:


Page 69:

SSFM with interactions


Object-Region Interactions:

Page 70:

SSFM with interactions


Object-Region Interactions:

Page 71:

Solving the SSFM problem

• Modified Markov Chain Monte Carlo (MCMC) sampling algorithm

• Initialization of the cameras, objects, and points is critical for the sampling

• Initialize configuration of cameras using: • SFM • consistency of object/region properties across views

F. Dellaert, S. Seitz, S. Thrun, and C. Thorpe. Feature correspondence: A markov chain monte carlo approach. In NIPS, 2000
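The slide names a modified MCMC sampler without giving its details. As background only, the sketch below is a plain random-walk Metropolis sampler on a one-dimensional toy target; it is not the SSFM sampler, and the function name, step size and target are assumptions. It also shows why initialization matters: the chain starts at `x0` and only moves by local proposals.

```python
import math
import random

def metropolis(log_prob, x0, n_samples, step=0.5, seed=0):
    """Random-walk Metropolis sampling from exp(log_prob), starting at x0."""
    rng = random.Random(seed)
    x, lp = x0, log_prob(x0)
    samples = []
    for _ in range(n_samples):
        candidate = x + rng.gauss(0.0, step)        # local proposal
        lp_candidate = log_prob(candidate)
        # Accept with probability min(1, p(candidate) / p(x)).
        if rng.random() < math.exp(min(0.0, lp_candidate - lp)):
            x, lp = candidate, lp_candidate
        samples.append(x)
    return samples
```

For instance, `metropolis(lambda x: -0.5 * x * x, x0=0.0, n_samples=20000, step=1.0)` draws approximately from a standard normal; a start far from the mode would spend many early iterations just reaching it.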

Page 72:

Public Ford Campus Vision and LiDAR Dataset

• Object categories: Cars • Ground truth depth provided by LiDAR

[Pandey et al, International Journal of Robotics Research, 2011]

In-house Office dataset

• Object categories: mugs, mice, keyboards • Ground truth depth provided by Kinect

In-house Street dataset

• Object categories: humans • No ground truth depth available

Results

Page 73:

Observations Joint reconstruction & recognition

[Figure labels: Detections, Segmentation; View 1 … View N]

Results

Page 74:
Page 75:

Observations Joint reconstruction & recognition

[Figure labels: Detections, Segmentation; View 1 … View N]

Results

Page 76:
Page 77:

Observations Joint reconstruction & recognition

[Figure labels: Detections; View 1 … View N]

SSFM Source code available! http://www.eecs.umich.edu/vision/research.html

Results

Page 78:

Object detection results

Average precision in detecting objects (cars) in the 2D image, FORD CAMPUS (cars):

DPM [1]                     54.5%
SSFM (2011) with 2 views    61.3%
SSFM (2012) with 2 views    62.8%
SSFM (2012) with 4 views    66.5%

Accuracy in localizing objects in the 3D space (AP):

                                     Hoiem [2]   SSFM [2011]   SSFM [2012]
FORD CAMPUS – cars                   21.4%       32.7%         43.1%
OFFICE – keyboards, mice, monitors   15.5%       20.2%         21.6%

[1] P. Felzenszwalb, R. Girshick, D. McAllester, and D. Ramanan. Object detection with discriminatively trained part based models. TPAMI, 2009

[2] D. Hoiem, A. Efros, and M. Hebert. Putting objects in perspective. IJCV, 2008.
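The detection numbers above are average precision (AP) values. The slides do not say which AP variant was used, so the sketch below implements one common definition: rank detections by score and average the precision measured at each true positive, divided by the number of ground-truth objects. The function name and argument layout are assumptions.

```python
def average_precision(scores, is_true_positive, n_ground_truth):
    """Non-interpolated AP for one class.

    scores: detector confidence per detection.
    is_true_positive: whether each detection matches a ground-truth object.
    n_ground_truth: number of ground-truth objects (missed ones count as 0).
    """
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    true_positives = 0
    total = 0.0
    for rank, i in enumerate(order, start=1):
        if is_true_positive[i]:
            true_positives += 1
            total += true_positives / rank   # precision at this recall level
    return total / n_ground_truth
```

Benchmarks such as PASCAL VOC use interpolated variants of this measure, so absolute values from this function are not directly comparable to published tables.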

Page 79:

Camera estimation results

Camera translation error

              SFM [1]   SSFM (2011)   SSFM (2012)
FORD CAMPUS   26.5      19.9          12.1
OFFICE        8.5       4.7           4.2
STREET        27.1      17.6          11.4

Camera rotation error

              SFM [1]   SSFM (2011)   SSFM (2012)
FORD CAMPUS   <1        <1            <1
OFFICE        9.6       4.2           3.5
STREET        21.1      3.1           3.0

[1] N. Snavely, S. M. Seitz, and R. S. Szeliski. Modeling the world from internet photo collections. IJCV, (2), Nov. 2008

Camera parameter reconstruction errors
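
The slide does not state how the translation and rotation errors are measured, or in which units. As a hedged illustration, the sketch below uses two common choices, both assumptions here: the angle of the relative rotation between the estimated and ground-truth rotation matrices, and the angle between the estimated and ground-truth translation directions (image-only reconstruction recovers translation only up to scale). The function names are hypothetical.

```python
import math
from typing import Sequence

Matrix3 = Sequence[Sequence[float]]
Vector3 = Sequence[float]


def _clamp(x: float) -> float:
    """Keep a cosine inside [-1, 1] despite floating-point round-off."""
    return max(-1.0, min(1.0, x))


def rotation_error_deg(r_est: Matrix3, r_gt: Matrix3) -> float:
    """Angle (degrees) of the relative rotation r_est^T * r_gt."""
    # trace(A^T B) equals the sum of element-wise products of A and B.
    trace = sum(r_est[i][j] * r_gt[i][j] for i in range(3) for j in range(3))
    # For a rotation by angle theta, trace = 1 + 2 * cos(theta).
    return math.degrees(math.acos(_clamp((trace - 1.0) / 2.0)))


def translation_direction_error_deg(t_est: Vector3, t_gt: Vector3) -> float:
    """Angle (degrees) between two translation directions, ignoring scale."""
    dot = sum(a * b for a, b in zip(t_est, t_gt))
    norm = math.sqrt(sum(a * a for a in t_est)) * math.sqrt(sum(b * b for b in t_gt))
    if norm == 0.0:
        raise ValueError("translation vectors must be non-zero")
    return math.degrees(math.acos(_clamp(dot / norm)))
```

Both functions compare a single camera estimate with its ground truth; a per-dataset figure such as those in the table would then be some aggregate over cameras, which the slide also leaves unspecified.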


Results


Results


Source code available!

• http://www.eecs.umich.edu/vision/research.html


In this lecture…

• 3D object detectors

– Robust to viewpoint transformation

– Allow estimation of pose, scale and 3D shape

• Methods for coherent object detection and scene layout estimation

– Single view

– Multi-view

– Videos


Joint 3D modeling and recognition from videos

• Choi & Shahid & Savarese, WMC 2010

• Choi & Savarese, ECCV 2010

• Wu et al. 07 • Breitenstein et al. 09 • Zhao et al. 04 • Ess et al. 09

• Monocular cameras • Un-calibrated cameras • Arbitrary motion • Highly cluttered scenes

• Occlusion • Background clutter • Moving targets


Joint tracking and camera estimation

Interest Points in 3D

Tracked Interest Points

Camera Parameters

Pedestrian Detections

Target Location in 3D

Ω : set of state variables

Χ : set of observations

• Easily add additional evidence: 3D depth, IMU, etc.
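
One way to read the claim that additional evidence is easy to add is as a posterior over the state variables Ω given the observations Χ, with one likelihood term per evidence source. The factorization below is an assumption made for illustration, not taken from the slide: it treats the evidence sources X_1, …, X_K (for example tracked interest points, pedestrian detections, and optionally depth or IMU readings) as conditionally independent given Ω.

```latex
% Hedged sketch: the conditional-independence factorization is an assumption.
% \Omega : state variables (3D interest points, camera parameters, 3D target locations)
% X = \{X_1, \dots, X_K\} : observations, one X_k per evidence source
\[
  P(\Omega \mid X) \;\propto\; P(\Omega) \prod_{k=1}^{K} P(X_k \mid \Omega)
\]
```

Under this reading, a new sensor contributes one more factor to the product and leaves the existing terms unchanged.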

• 5 frames/second! • Code available online soon!


Safe Driving Applications


Autonomous navigation


Conclusions

• Intelligent vision requires joint reconstruction and recognition

• Geometry provides critical contextual cues for robust recognition

• High-level semantics help establish robust geometrical constraints for reconstruction

– Within a single view

– Across views

• High-level semantics help scalability in reconstruction problems

– Fewer images are needed, with wider baselines


EECS 442 – Computer vision

• Hope you have enjoyed this class!

• Good luck with your projects & presentations!
