Li-Jia Li, Richard Socher, Li Fei-Fei

Post on 22-Feb-2016


Towards Total Scene Understanding: Classification, Annotation and Segmentation in an Automatic Framework

Li-Jia Li, Richard Socher, Li Fei-Fei


City Travel

Pagoda

Sunrise Sunshine

Sun


Classification: Weber et al 00, Fergus et al 03, Felzenszwalb et al 04, Fei-Fei et al 05, Sivic et al 05, Bosch et al 06, Oliva et al 01, Lazebnik et al 06

Segmentation: Shi et al 00, Felzenszwalb et al 04, Sali et al 99, Winn et al 05, Kumar et al 05, Cao et al 07, Russell et al 06, Todorovic et al 06

Annotation: Duygulu et al 02, Barnard et al 03, Blei et al 03, Gupta et al 08, Alipr (Li et al 03), Sudderth et al 05

Remark: Approaches in yellow will be compared with our model in the later experiments.


Total Scene Understanding

Application


Classification Annotation Segmentation

Mutually beneficial!

[Figure: a polo image understood at three levels. Classification: class: Polo. Annotation: Athlete, Horse, Grass, Trees, Sky, Saddle. Segmentation: regions labeled Horse, Athlete, Sky, Tree, Grass.]

Related Work:

Tu et al 03: Annotation, Segmentation

Li & Fei-Fei 07: Annotation, Classification

Heitz et al 08: Classification, Segmentation

[Figure: the polo example repeated for each approach, with region labels Horse, Athlete, Sky, Tree, Grass and "Class: Polo" where the approach provides them.]

Outline
• Model
• Learning
• Recognition & Experiment

Classification, Annotation, Segmentation

[Figure: graphical model of the framework. Nodes: C, O, R, X, Z, S, T. Plate labels: D, Nr, NF, Ar, Nt.]

Athlete, Horse, Grass, Trees, Sky, Saddle

The model is a joint distribution over the random variables of a Visual Component and a Text Component.

C: class: Polo

Athlete, Horse, Grass, Trees, Sky, Saddle

O: object

R, NF: Color, Location, Texture, Shape

X, Ar

Z (Nr, Nt): “Connector variable”

Athlete, Horse, Grass, Trees, Sky, Saddle

S: “Switch variable”, marking each tag (Athlete, Horse, Grass, Trees, Sky, Saddle) as Visible or Not visible

T: tag (e.g. Horse)

[Figure: image regions labeled Horse and Athlete.]


Learning

Exact inference is intractable!

Relationship of the random variables

[Figure: graphical model with Visual and Text components. Nodes: C, O, R, X, Z, S, T. Plate labels: Nr, NF, Ar, Nt.]

Top-down force

Bottom-up force from visual information

Bottom-up force from text information

Collapsed Gibbs Sampling

(R. Neal, 2000)
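The slide names collapsed Gibbs sampling as the inference tool, but the transcript does not preserve the update equations. As a rough illustration only, the sketch below runs collapsed Gibbs sampling on a plain LDA-style topic model, not on the authors' model: the multinomial parameters are integrated out, so each step resamples one latent assignment using nothing but count tables. The function name, the hyperparameters `alpha` and `beta`, and the toy data are assumptions made for this example.

```python
import random


def collapsed_gibbs_lda(docs, n_topics, vocab_size, alpha=0.5, beta=0.1,
                        n_iters=100, seed=0):
    """Toy collapsed Gibbs sampler for an LDA-style model (illustrative only).

    docs: list of documents, each a list of word indices in [0, vocab_size).
    Returns the topic assignments and the two count tables.
    """
    rng = random.Random(seed)
    doc_topic = [[0] * n_topics for _ in docs]                 # tokens per (doc, topic)
    topic_word = [[0] * vocab_size for _ in range(n_topics)]   # tokens per (topic, word)
    topic_total = [0] * n_topics                               # tokens per topic
    z = []

    # Random initialization of every latent assignment.
    for d, doc in enumerate(docs):
        z.append([])
        for w in doc:
            k = rng.randrange(n_topics)
            z[d].append(k)
            doc_topic[d][k] += 1
            topic_word[k][w] += 1
            topic_total[k] += 1

    topics = list(range(n_topics))
    for _ in range(n_iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the current token from the counts.
                k = z[d][i]
                doc_topic[d][k] -= 1
                topic_word[k][w] -= 1
                topic_total[k] -= 1

                # Conditional of z[d][i] given all other assignments,
                # with the multinomial parameters integrated out.
                weights = [
                    (doc_topic[d][j] + alpha)
                    * (topic_word[j][w] + beta)
                    / (topic_total[j] + vocab_size * beta)
                    for j in topics
                ]
                k = rng.choices(topics, weights=weights)[0]

                # Add the token back under its new assignment.
                z[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][w] += 1
                topic_total[k] += 1

    return z, doc_topic, topic_word


# Hypothetical toy data: three "documents" over a 6-word vocabulary.
toy_docs = [[0, 1, 2, 0], [3, 4, 5, 3], [0, 2, 1]]
assignments, doc_topic, topic_word = collapsed_gibbs_lda(
    toy_docs, n_topics=2, vocab_size=6, n_iters=20)
```

In the talk's terms, the count-based factors play the role of the top-down and bottom-up forces: each resampling step combines what the rest of the data says about a variable from both directions.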

Scene/Event images from the Internet

There is no object-text correspondence…

Athlete, Horse, Grass, Tree, Saddle

Our model builds the correspondence…

[Figure: scene/event images from the Internet with the tags Athlete, Horse, Grass, Tree, Saddle, linked through the graphical model.]

However, a big obstacle is that many objects always co-occur.

[Figure: scene/event images from the Internet tagged “Athlete, Horse, Grass, Trees, Sky, Saddle” and “Athlete, Horse, Grass, Ball”, with question marks.]

One solution: a good initialization of O

[Figure: a scene/event image from the Internet tagged Athlete, Horse, Grass, Trees, Sky, Saddle, with regions initialized as Grass, Athlete, Horse.]

Initializing O: obtain Internet images for each O

[Figure: object images; scene/event images from the Internet.]

Initializing O: train an object detector for each O

[Figure: object images and event/scene images; any object detection & segmentation algorithm can be used.]

Initialize O in the scene image by the trained object detectors

[Figure: black-box object detection & segmentation (any object detection & segmentation algorithm), trained on object images and applied to the scene/event images.]

Black-box object detection & segmentation: Cao & Fei-Fei, 2007

[Figure: the Cao & Fei-Fei, 2007 model (nodes θ, C, X, R, O; plate labels Nr, Ar) shown beside Our Model; its output on object images and event/scene images initializes O.]

Auto-semi-supervised learning: small # of automatically initialized images + large # of uninitialized images

[Figure: Our Model is learned from a small # of initialized scene/event images and a large # of uninitialized ones, with tags such as “Athlete, Horse, Grass, Tree, Saddle, Wind”, “Athlete, Rock, Grass, Tree, Sky, Rope”, and “Athlete, Snow, Tree, Sky, Snowboard”.]
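The split between initialized and uninitialized images can be pictured as a difference in starting state only. The sketch below is a minimal illustration under stated assumptions, not the authors' procedure: images that have detector output start from those object labels, and all other images start from random labels (the random start is an assumption; the slides do not say how uninitialized images begin). The function name and both input dictionaries are hypothetical.

```python
import random


def initialize_object_labels(region_counts, detector_labels, n_objects, seed=0):
    """Build a starting object label for every region of every image.

    region_counts:   {image_id: number of regions}           (hypothetical input)
    detector_labels: {image_id: [object index per region]}   (hypothetical detector output)
    Returns (labels, initialized_ids).
    """
    rng = random.Random(seed)
    labels = {}
    initialized_ids = set()
    for image_id, n_regions in region_counts.items():
        seeded = detector_labels.get(image_id)
        if seeded is not None:
            if len(seeded) != n_regions:
                raise ValueError(f"label count mismatch for {image_id}")
            # Small set: start from the detector's output.
            labels[image_id] = list(seeded)
            initialized_ids.add(image_id)
        else:
            # Large set: no detector output, so start from random labels (assumption).
            labels[image_id] = [rng.randrange(n_objects) for _ in range(n_regions)]
    return labels, initialized_ids


# Hypothetical example: one initialized image, two uninitialized ones.
region_counts = {"polo_1": 3, "polo_2": 2, "sailing_1": 4}
detector_labels = {"polo_1": [0, 1, 2]}
labels, initialized_ids = initialize_object_labels(
    region_counts, detector_labels, n_objects=5)
```

Both groups would then be updated by the same sampler; the initialized images only give it a better place to start.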

Recognition & Experiment
• Dataset
• Learned Model
• Results

8 Event/Scene Classes: Badminton, Bocce, Croquet, Polo, Rock climbing, Rowing, Sailing, Snowboarding

Remark: Tags are not used during testing.

Learned model: O

Athlete, Grass, Horse

Learned model: R

Learned model: S

Classification Annotation Segmentation

8-way classification: 54%

Compared approaches: Alipr (Li et al 03), Corr LDA (Blei et al 03)

Effect of top-down class context

[Figure: the model w/o top-down class (nodes O, R, X, Z, S, T) compared with the Full Model (nodes C, O, R, X, Z, S, T); example label: Horse.]

Class: Rock climbing. Region labels: Sky, Athlete, Tree, Mountain, Rock. Tags: Athlete, Mountain, Tree, Rock, Sky, Ascent.

Class: Sailing. Region labels: Sky, Athlete, Water, Tree, Sailboat. Tags: Athlete, Sailboat, Tree, Water, Sky, Wind.

Class: Snowboarding. Region labels: Tree, Athlete, Snowboard, Snow. Tags: Athlete, Snowboard, Tree, Snow, Sky, Powder.

Thanks to Prof. Silvio Savarese, Juan Carlos Niebles, Chong Wang, Barry Chai, Min Sun, Bangpeng Yao, Hao Su, Jia Deng, and the anonymous reviewers

And You