Visual Object Analysis using Regions and Local Features

Visual Object Analysis using Regions and Local Features Carles Ventura Royo Co-advisors Xavier Giró i Nieto Verónica Vilaplana Besler Tutor Ferran Marqués Acosta


Page 1: Visual Object Analysis using Regions and Local Features

Visual Object Analysis using Regions and Local Features

Carles Ventura Royo

Co-advisors: Xavier Giró i Nieto, Verónica Vilaplana Besler

Tutor: Ferran Marqués Acosta

Page 2: Visual Object Analysis using Regions and Local Features

2

Outline
• Introduction
• Part I: Context Analysis in semantic segmentation
• Part II: Multiresolution co-clustering for uncalibrated multiview segmentation
• Conclusions

Page 3: Visual Object Analysis using Regions and Local Features

3

Outline
• Introduction
• Part I: Context Analysis in semantic segmentation
  • Introduction
  • Related Work
  • Contributions
  • Experiments
  • Conclusions
• Part II: Multiresolution co-clustering for uncalibrated multiview segmentation
• Conclusions

Page 4: Visual Object Analysis using Regions and Local Features

4

Outline
• Introduction
• Part I: Context Analysis in semantic segmentation
• Part II: Multiresolution co-clustering for uncalibrated multiview segmentation
  • Introduction
  • Related Work
  • Contributions
  • Experiments
  • Conclusions
• Conclusions

Page 5: Visual Object Analysis using Regions and Local Features

5

Introduction: Semantic segmentation

Instance segmentation

Class segmentation

boat

Page 6: Visual Object Analysis using Regions and Local Features

6

Introduction: Semantic segmentation

Part I: Single view Part II: Multiview

STATE OF THE ART

OUR RESULTS

Page 7: Visual Object Analysis using Regions and Local Features

7

Introduction: Visual Object Analysis

vs

Objects Scene

Page 8: Visual Object Analysis using Regions and Local Features

8

Introduction: Regions

Page 9: Visual Object Analysis using Regions and Local Features

9

Introduction: Regions

[Figure: an image partition with numbered regions and the corresponding BINARY PARTITION TREE]

Page 10: Visual Object Analysis using Regions and Local Features

10

Introduction: Regions

[Figure: an image partition with numbered regions and the corresponding REGION ADJACENCY GRAPH]
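A region adjacency graph can be illustrated with a few lines of Python. This is a minimal sketch, not the implementation used in the thesis: the function name `region_adjacency_graph`, the toy partition and the choice of 4-connectivity are all assumptions made for illustration.

```python
from collections import defaultdict


def region_adjacency_graph(labels):
    """Build a RAG from a 2D grid of region labels (4-connectivity assumed)."""
    rag = defaultdict(set)
    rows, cols = len(labels), len(labels[0])
    for r in range(rows):
        for c in range(cols):
            a = labels[r][c]
            rag[a]  # touch the key so isolated regions still appear as nodes
            # Look only right and down: every neighbouring pixel pair is visited once.
            for dr, dc in ((0, 1), (1, 0)):
                rr, cc = r + dr, c + dc
                if rr < rows and cc < cols:
                    b = labels[rr][cc]
                    if a != b:
                        rag[a].add(b)
                        rag[b].add(a)
    return {region: sorted(neighbours) for region, neighbours in rag.items()}


# Hypothetical 3x3 partition with four regions.
toy_partition = [
    [1, 1, 2],
    [1, 3, 2],
    [4, 3, 2],
]
rag = region_adjacency_graph(toy_partition)
```

Each node of the graph is a region and each edge links two regions that share a boundary, which is the information a merging algorithm needs to decide which pairs of regions are candidates for fusion.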

Page 11: Visual Object Analysis using Regions and Local Features

11

Introduction: Local Features

Local Features Global Features

Page 12: Visual Object Analysis using Regions and Local Features

12

Introduction: Local Features Aggregation
• Bag of Features (BoF) [1]

local features → vector quantization (codebook) → Bag of Features

[1] G Csurka et al, Visual Categorization with Bags of Keypoints. ECCV’04
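The vector quantization step behind a Bag of Features can be sketched as follows. This is an illustrative toy, assuming a Euclidean nearest-codeword assignment and an L1-normalized histogram; the function name, the two-word codebook and the descriptors are hypothetical.

```python
import math


def bag_of_features(descriptors, codebook):
    """Quantize each local descriptor to its nearest codeword and count occurrences."""
    histogram = [0] * len(codebook)
    for d in descriptors:
        nearest = min(range(len(codebook)), key=lambda k: math.dist(d, codebook[k]))
        histogram[nearest] += 1
    total = sum(histogram)
    # Normalize so that regions with different numbers of descriptors are comparable.
    return [h / total for h in histogram] if total else histogram


# Hypothetical 2D descriptors and a two-word codebook.
codebook = [(0.0, 0.0), (10.0, 10.0)]
descriptors = [(1.0, 0.0), (0.0, 2.0), (1.0, 1.0), (9.0, 9.0)]
bof = bag_of_features(descriptors, codebook)
```

In practice the codebook is learned beforehand (for example by clustering training descriptors), which is the cost that the pooling alternatives on the next slide avoid.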

Page 13: Visual Object Analysis using Regions and Local Features

13

Introduction: Local Features Aggregation
• Pooling

First Order Average Pooling (O1P) [1]: $\frac{1}{N}\sum_{i=1}^{N} x_i$

Second Order Average Pooling (O2P) [2]: $\frac{1}{N}\sum_{i=1}^{N} x_i x_i^T$

$x_i$: local features

No need of codebook
High dimensionality

[1] Y Boureau et al, A Theoretical Analysis of Feature Pooling in Visual Recognition. ICML’10
[2] J Carreira et al, Semantic segmentation with second-order pooling. ECCV’12
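The two pooling formulas can be checked on a toy example. The sketch below uses plain Python lists; the function names and the two 2D features are assumptions for illustration, and it covers only the averaging step shown on the slide.

```python
def first_order_pooling(features):
    """O1P: mean of the local features, (1/N) * sum_i x_i."""
    n, dim = len(features), len(features[0])
    return [sum(x[k] for x in features) / n for k in range(dim)]


def second_order_pooling(features):
    """O2P: mean of the outer products, (1/N) * sum_i x_i x_i^T."""
    n, dim = len(features), len(features[0])
    return [
        [sum(x[a] * x[b] for x in features) / n for b in range(dim)]
        for a in range(dim)
    ]


# Two hypothetical 2D local features.
features = [[1.0, 2.0], [3.0, 4.0]]
o1p = first_order_pooling(features)   # vector of length d
o2p = second_order_pooling(features)  # symmetric d x d matrix
```

For d-dimensional features, O1P yields d values while O2P yields a symmetric d × d matrix, which is where the "high dimensionality" remark comes from.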

Page 14: Visual Object Analysis using Regions and Local Features

Part I: Context analysis in semantic segmentation

Page 16: Visual Object Analysis using Regions and Local Features

16

Introduction: Context

Semantic context [1,2] Spatial context

[1] M Bar, Visual Objects in Context. Nature Reviews Neuroscience 2004
[2] A Rabinovich et al, Objects in Context. ICCV’07

GOAL: Analyze the influence of the spatial context in object recognition

Page 18: Visual Object Analysis using Regions and Local Features

18

Related Work: Ideal scenario

Ground-truth object location

Conclusion: Aggregating the local features over three region pools (interior, border and surround) increases the performance [1]

[1] J.R.R. Uijlings et al., The Visual Extent of an Object. IJCV’12

Page 19: Visual Object Analysis using Regions and Local Features

19

Related Work: Realistic scenario
• Pipeline [1]

Input image → Generate object candidates → Rank object candidates → Predict class scores → Aggregate high-rank candidates → Semantic partition

[1] J Carreira et al, Object Recognition as Ranking Holistic Figure-Ground Hypotheses. CVPR’10

Page 20: Visual Object Analysis using Regions and Local Features

20

Related Work: Realistic scenario
• How is each class predictor trained? [1]

TRAINING DATA: object candidates with overlap scores 0.8179, 0.6861, 0.9013, 0.7381, 0.7105, 0.6462

An SVR is used to learn the function that predicts the overlap score (os) for each class:

SVR: os = f([O2PF O2PG])

[O2PF_1 O2PG_1] → os_1
[O2PF_2 O2PG_2] → os_2
…
[O2PF_N O2PG_N] → os_N

GOAL: CHANGE SPATIAL CODIFICATION

[1] J Carreira et al, Semantic segmentation with second-order pooling. ECCV’12
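How one training pair for the regressor might be assembled can be sketched as below. This is a hedged illustration, not the authors' code: it assumes the overlap score is the intersection over union between a candidate mask and the ground-truth mask, represents masks as sets of pixel coordinates, and uses hypothetical short descriptors in place of real O2P vectors. The regressor itself is left out.

```python
def overlap_score(candidate, ground_truth):
    """Intersection over union between two pixel sets (assumed overlap measure)."""
    union = candidate | ground_truth
    return len(candidate & ground_truth) / len(union) if union else 0.0


def training_pair(o2p_figure, o2p_ground, candidate, ground_truth):
    """Concatenate the pooled descriptors [O2PF O2PG] and attach the target os."""
    return o2p_figure + o2p_ground, overlap_score(candidate, ground_truth)


# Hypothetical masks: a 2x2 ground-truth object and a candidate covering 3 of its pixels.
ground_truth = {(0, 0), (0, 1), (1, 0), (1, 1)}
candidate = {(0, 0), (0, 1), (1, 0)}

# Hypothetical (very short) pooled descriptors for the figure and ground regions.
x, os_target = training_pair([0.1, 0.2], [0.3], candidate, ground_truth)
```

Collecting such pairs over all candidates of a class gives the (descriptor, os) training set that the regressor f is fitted on; changing the spatial codification only changes how the descriptor `x` is built.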

Page 22: Visual Object Analysis using Regions and Local Features

22

Contributions
• Figure-Border-Ground spatial pooling in the realistic scenario

SVR: os = f([O2PF O2PB O2PG])

[O2PF_1 O2PB_1 O2PG_1] → os_1
[O2PF_2 O2PB_2 O2PG_2] → os_2
…
[O2PF_N O2PB_N O2PG_N] → os_N

Page 23: Visual Object Analysis using Regions and Local Features

23

Contributions
• Contour-based spatial pyramid [1]: crown-based

SVR: os = f([O2PF O2PSR1 O2PSR2 O2PSR3 O2PSR4])

[O2PF_1 O2PSR1_1 O2PSR2_1 O2PSR3_1 O2PSR4_1] → os_1
[O2PF_2 O2PSR1_2 O2PSR2_2 O2PSR3_2 O2PSR4_2] → os_2
…
[O2PF_N O2PSR1_N O2PSR2_N O2PSR3_N O2PSR4_N] → os_N

[1] S Lazebnik et al, Beyond bags of features: Spatial pyramid matching for recognizing natural scene categories. CVPR’06

Page 24: Visual Object Analysis using Regions and Local Features

24

Contributions
• Contour-based spatial pyramid [1]: Cartesian-based

SVR: os = f([O2PF O2PSR1 O2PSR2 O2PSR3 O2PSR4])

[O2PF_1 O2PSR1_1 O2PSR2_1 O2PSR3_1 O2PSR4_1] → os_1
[O2PF_2 O2PSR1_2 O2PSR2_2 O2PSR3_2 O2PSR4_2] → os_2
…
[O2PF_N O2PSR1_N O2PSR2_N O2PSR3_N O2PSR4_N] → os_N

[1] S Lazebnik et al, Beyond bags of features: Spatial pyramid matching for recognizing natural scene categories. CVPR’06

Page 26: Visual Object Analysis using Regions and Local Features

26

Experiments
• Pascal VOC segmentation challenge 2011 & 2012 [1]
• Train, validation and test subsets
  • Train: 1,112 (2011) / 1,464 (2012)
  • Validation: 1,111 (2011) / 1,449 (2012)
  • Test: 1,111 (2011) / 1,456 (2012)
• 20 semantic classes
  • aeroplane, bicycle, bird, boat, bottle, bus, car, cat, chair, cow, diningtable, dog, horse, motorbike, person, pottedplant, sheep, sofa, train, tvmonitor
• Evaluation measure: Average Accuracy Classification (AAC)

[1] M Everingham et al, The PASCAL Visual Object Classes (VOC) Challenge. IJCV’10

Page 27: Visual Object Analysis using Regions and Local Features

27

Experiments: Local Features Aggregation
• Pooling

First Order Average Pooling (O1P) [1]: $\frac{1}{N}\sum_{i=1}^{N} x_i$

Second Order Average Pooling (O2P) [2]: $\frac{1}{N}\sum_{i=1}^{N} x_i x_i^T$

$x_i$: local features

No need of codebook
High dimensionality

[1] Y Boureau et al, A Theoretical Analysis of Feature Pooling in Visual Recognition. ICML’10
[2] J Carreira et al, Semantic segmentation with second-order pooling. ECCV’12

Page 28: Visual Object Analysis using Regions and Local Features

28

Experiments
• Ideal scenario
• Train set: train11
• Test set: val11

             F [1]   F-B    F-G [1]   F-B-G
eSIFT [1]    63.9    66.2   66.4      68.6
eMSIFT [1]   64.8    68.9   67.7      70.8

[1] J Carreira et al, Semantic segmentation with second-order pooling. ECCV’12

Page 29: Visual Object Analysis using Regions and Local Features

29

Experiments
• Ideal scenario
• Train set: train11
• Test set: val11

                     F [1]   F-B    F-B-G
Non SP               64.8    68.9   70.8
Crown-based SP       68.7    71.1   71.7
Cartesian-based SP   67.7    71.6   72.7

[1] J Carreira et al, Semantic segmentation with second-order pooling. ECCV’12

Page 30: Visual Object Analysis using Regions and Local Features

30

Experiments
• Ideal scenario
• Train set: train11
• Test set: val11

Figure              SP (Figure)   Border         Ground         AAC
eSIFT+eMSIFT+eLBP   -             -              eSIFT          72.98 [1]
eSIFT+eMSIFT        -             eSIFT+eMSIFT   eSIFT+eMSIFT   73.84
eSIFT+eMSIFT+eLBP   eMSIFT        eSIFT+eMSIFT   eSIFT+eMSIFT   75.86

[1] J Carreira et al, Semantic segmentation with second-order pooling. ECCV’12

Page 31: Visual Object Analysis using Regions and Local Features

31

Experiments
• Realistic scenario (CPMC [1])
• Train set: train11
• Test set: val11

Figure              SP (Figure)   Border   Ground   AAC
eSIFT               -             -        eSIFT    28.6 [2]
eSIFT               -             eSIFT    eSIFT    34.8
eSIFT+eMSIFT+eLBP   -             -        eSIFT    37.2 [2]
eSIFT               eSIFT         eSIFT    eSIFT    37.4
eSIFT+eMSIFT+eLBP   eSIFT         eSIFT    eSIFT    39.6

[1] J Carreira et al, Constrained parametric min-cuts for automatic object segmentation. CVPR’10
[2] J Carreira et al, Semantic segmentation with second-order pooling. ECCV’12

Page 32: Visual Object Analysis using Regions and Local Features

32

Experiments
• Realistic scenario (CPMC [1])
• Train set: trainval11/12
• Test set: test11/12

        F-G [2]   F-B-G   SP(F)-B-G
VOC11   38.8      43.8    40.3
VOC12   39.9      42.2    40.8

[1] J Carreira et al, Constrained parametric min-cuts for automatic object segmentation. CVPR’10
[2] J Carreira et al, Semantic segmentation with second-order pooling. ECCV’12

Page 33: Visual Object Analysis using Regions and Local Features

33

Experiments
• Realistic scenario (MCG [1])
• Train set: train11
• Test set: val11

       F-G [2]   F-B-G   SP(F)-B-G
CPMC   37.2      38.9    39.6
MCG    30.9      34.1    36.1

[1] P Arbeláez et al, Multiscale combinatorial grouping. CVPR’14
[2] J Carreira et al, Semantic segmentation with second-order pooling. ECCV’12

Page 34: Visual Object Analysis using Regions and Local Features

34

Experiments: Qualitative evaluation

[Figure: F-G versus F-B-G segmentation results shown side by side, with predicted class labels including aeroplane, bicycle, bird, boat, bottle, bus, car, cat, chair, cow, horse and motorbike]

Page 35: Visual Object Analysis using Regions and Local Features

35

Experiments: Qualitative evaluation

[Figure: F-G versus F-B-G segmentation results shown side by side, with predicted class labels including bottle, bus, cat, chair, cow, diningtable, dog, horse, motorbike, person, pottedplant, sheep, sofa, train and tvmonitor]

Page 37: Visual Object Analysis using Regions and Local Features

37

Conclusions
• Figure-Border-Ground spatial pooling improves the original Figure-Ground pooling in both ideal and realistic scenarios
• The Border region pool carries the richest contextual information
• The Cartesian-based spatial pyramid outperforms the crown-based spatial pyramid, but both of them may result in overfitting
• Both Figure-Border-Ground pooling and Cartesian-based spatial pyramid have been validated with MCG object candidates
• Published in ICIP’15

Page 38: Visual Object Analysis using Regions and Local Features

Part II: Multiresolution co-clustering for uncalibrated multiview segmentation

Page 40: Visual Object Analysis using Regions and Local Features

40

Introduction

STATE OF THE ART

OUR RESULTS

Page 41: Visual Object Analysis using Regions and Local Features

41

Introduction
• First goal: improving generic segmentation
  • Motion-based region adjacency graph
  • New resolution parameterization
  • Relaxing hierarchical constraints with a two-step architecture
  • Practical framework for a global optimization
• Second goal: improving semantic segmentation
  • Semantic-based generic segmentation
  • Automatic resolution selection technique
  • Generic segmentation based semantic segmentation

Page 42: Visual Object Analysis using Regions and Local Features

42

Introduction
• Co-segmentation

• Video segmentation

• Co-clustering

Page 44: Visual Object Analysis using Regions and Local Features

44

Related Work: Co-clustering framework [1,2]
• Objective: Find the clusters that define the coherent regions across the different views at multiple resolutions

[Figure: INPUT IMAGES, HIERARCHIES, LEAVES PARTITIONS and CO-CLUSTERED PARTITIONS]

[1] D Glasner et al, Contour-based joint clustering of multiple segmentations. CVPR’11
[2] D Varas et al, Multiresolution hierarchy co-clustering for semantic segmentation in sequences with small variations. ICCV’15

Page 45: Visual Object Analysis using Regions and Local Features

45

Related Work: Co-clustering framework [1,2]
• Objective: Find the clusters that define the coherent regions across the different views

[Figure: LEAVES PARTITIONS and CO-CLUSTERED PARTITIONS of view 1 and view 2]

[1] D Glasner et al, Contour-based joint clustering of multiple segmentations. CVPR’11
[2] D Varas et al, Multiresolution hierarchy co-clustering for semantic segmentation in sequences with small variations. ICCV’15

Page 46: Visual Object Analysis using Regions and Local Features

46

Related Work: Co-clustering framework
• Representation with boundary variables
  • Intra-image boundary variables: D1,2, D1,3, D2,3, D4,5, D5,6
  • Inter-image boundary variables: D1,4, D1,5, D2,4, D2,5, D3,6

[Figure: LEAVES PARTITIONS and CO-CLUSTERED PARTITIONS of view 1 and view 2]

Intra-image: D1,2 = 0, D1,3 = 1, D2,3 = 1, D4,5 = 0, D5,6 = 1
Inter-image: D1,4 = 0, D1,5 = 0, D2,4 = 0, D2,5 = 0, D3,6 = 0
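The boundary-variable representation can be made concrete with a short sketch. It assumes that a boundary variable is 0 when its two regions fall in the same cluster and 1 otherwise, and that the co-clustering shown on the slide groups regions {1, 2, 4, 5} and {3, 6}; that grouping is inferred from the listed values rather than stated on the slide, and the names used here are hypothetical.

```python
from itertools import combinations


def boundary_variables(cluster_of):
    """D[i, j] = 0 if regions i and j share a cluster, 1 if a boundary separates them."""
    return {
        (i, j): int(cluster_of[i] != cluster_of[j])
        for i, j in combinations(sorted(cluster_of), 2)
    }


# Assumed co-clustering: regions 1-3 belong to view 1 and regions 4-6 to view 2.
cluster_of = {1: "A", 2: "A", 4: "A", 5: "A", 3: "B", 6: "B"}
D = boundary_variables(cluster_of)

intra = {pair: D[pair] for pair in [(1, 2), (1, 3), (2, 3), (4, 5), (5, 6)]}
inter = {pair: D[pair] for pair in [(1, 4), (1, 5), (2, 4), (2, 5), (3, 6)]}
```

Under these assumptions the intra-image and inter-image values reproduce the ones listed on the slide, which shows that a single labelling of the regions determines every boundary variable at once.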

Page 47: Visual Object Analysis using Regions and Local Features

47

Related Work: Co-clustering framework
• How are the values of the boundary variables chosen?

[Figure: LEAVES PARTITIONS of view 1 and view 2]

INTRA INTERACTIONS: Q1,2, Q1,3, Q2,3, Q4,5, Q5,6
INTER INTERACTIONS: Q1,4, Q1,5, Q2,4, Q2,5, Q3,6

Page 48: Visual Object Analysis using Regions and Local Features

48

Related Work: Co-clustering framework
• Hierarchical constraint

[Figure: hierarchies of view 1 (leaves 1, 2, 3; merged nodes 7, 8) and view 2 (leaves 4, 5, 6; merged nodes 9, 10)]

Co-clustered partitions cannot violate the hierarchical structures

Page 49: Visual Object Analysis using Regions and Local Features

49

Related Work: Co-clustering framework
• Hierarchical constraint

[Figure: hierarchies of view 1 (leaves 1, 3, 2; merged nodes 7, 8) and view 2 (leaves 4, 5, 6; merged nodes 9, 10)]

Co-clustered partitions cannot violate the hierarchical structures

Page 50: Visual Object Analysis using Regions and Local Features

50

Related Work: Co-clustering framework
• Multiresolution parameterization

view 1 view 2

LEAVES PARTITIONS

Page 51: Visual Object Analysis using Regions and Local Features

51

Related Work: Co-clustering framework
• Iterative approach

Page 53: Visual Object Analysis using Regions and Local Features

53

Contribution I: Motion-based adjacency

View #i View #i-1

Page 54: Visual Object Analysis using Regions and Local Features

54

Contribution I: Motion-based adjacency
• Similarity computation
• RAG definition

View #i View #i-1

Page 55: Visual Object Analysis using Regions and Local Features

55

Contribution II: Resolution parameterization

view 1 view 2

LEAVES PARTITIONS…

Original parameterization

Proposed parameterization

= ???

= 2

R2

Page 56: Visual Object Analysis using Regions and Local Features

56

Contribution III: Two-step iterative architecture
• Hierarchical constraints are not imposed in the second step

Page 57: Visual Object Analysis using Regions and Local Features

57

Contribution III: Two-step iterative architecture

First step Second step

Page 58: Visual Object Analysis using Regions and Local Features

58

Contribution III: Two-step iterative architecture

Page 59: Visual Object Analysis using Regions and Local Features

59

Contribution IV: Generic global co-clustering

• All co-clustered partitions resulting from the iterative architecture are fed into a global optimization

• The reduction in the number of regions makes the global optimization feasible

Page 60: Visual Object Analysis using Regions and Local Features

60

Contribution V: Semantic global co-clustering

• Semantic information is introduced in the global optimization

Page 61: Visual Object Analysis using Regions and Local Features

61

Contribution V: Semantic global co-clustering

GENERIC CO-CLUSTERING

SEMANTIC SEGMENTATIONS

SEMANTIC CO-CLUSTERING

Page 62: Visual Object Analysis using Regions and Local Features

62

Contribution VI: Automatic resolution selection

view 1 view 2

LEAVES PARTITIONS…

MULTIRESOLUTION CO-CLUSTERING

• We propose a method that automatically selects the resolution that best fits the semantic information

SEMANTIC PARTITIONS

SINGLE RESOLUTION CO-CLUSTERING

Page 63: Visual Object Analysis using Regions and Local Features

63

Contribution VII: Coherent semantic partitions

view 1 view 2

LEAVES PARTITIONS

SEMANTIC PARTITIONS

SINGLE RESOLUTION CO-CLUSTERING

COHERENT SEMANTIC PARTITIONS

Page 64: Visual Object Analysis using Regions and Local Features

64

Contribution VII: Coherent semantic partitions

STATE OF THE ART [1]

OUR RESULTS

[1] S Zheng et al, Conditional Random Fields as Recurrent Neural Networks. ICCV’15

Page 66: Visual Object Analysis using Regions and Local Features

66

Experiments: Dataset
• Multiview dataset [1]

[1] A Kowdle et al, Multiple view object cosegmentation using appearance and stereo cues. ECCV’12

Page 67: Visual Object Analysis using Regions and Local Features

67

Experiments: Generic co-clustering

Co-segmentation techniques

Video segmentation techniques

Co-clustering techniques
• I-1S: Motion-compensated one-step iterative (baseline)
• I-2S: Two-step iterative
• UCM+I-1S: First step is replaced by a cut from a hierarchical segmentation algorithm
• I-2S+GG: Two-step iterative followed by generic global optimization

Page 68: Visual Object Analysis using Regions and Local Features

68

Experiments: Generic co-clustering

              CO-CLUSTERING                | CO-SEGMENTATION  | VIDEO SEGMENTATION          | BASELINES
              I-2S   UCM+I-1S   I-2S+GG    | [KX12]  [JBP12]  | [XXC12]  [GKHE10]  [GCS13]  | UCM+Pr  I-1S
BMW           0.72   0.68       0.70       | 0.42    0.56     | 0.70     0.65      0.63     | 0.62    0.67
Chair         0.79   0.77       0.76       | 0.53    0.78     | 0.80     0.76      0.47     | 0.59    0.78
Couch         0.93   0.95       0.94       | 0.78    0.90     | 0.85     0.88      0.73     | 0.89    0.90
GardenChair   0.84   0.63       0.87       | 0.31    0.52     | 0.70     0.68      0.63     | 0.84    0.80
Motorbike     0.76   0.77       0.77       | 0.39    0.39     | 0.71     0.73      0.46     | 0.54    0.70
Teddy         0.92   0.92       0.92       | 0.69    0.87     | 0.88     0.84      0.85     | 0.82    0.90
Average       0.83   0.79       0.83       | 0.52    0.67     | 0.77     0.76      0.63     | 0.72    0.79

• Two-step iterative co-clustering techniques (I-2S and I-2S+GG) outperform other state-of-the-art techniques
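The Average row can be recomputed from the per-sequence scores as a sanity check. The sketch below copies the ten numeric columns in the order they appear on the slide and deliberately avoids naming them, since the mapping of columns to techniques is not needed for the arithmetic; the variable names are hypothetical.

```python
# Per-sequence scores, ten columns per row, in the order shown on the slide.
rows = {
    "BMW":         [0.72, 0.68, 0.70, 0.42, 0.56, 0.70, 0.65, 0.63, 0.62, 0.67],
    "Chair":       [0.79, 0.77, 0.76, 0.53, 0.78, 0.80, 0.76, 0.47, 0.59, 0.78],
    "Couch":       [0.93, 0.95, 0.94, 0.78, 0.90, 0.85, 0.88, 0.73, 0.89, 0.90],
    "GardenChair": [0.84, 0.63, 0.87, 0.31, 0.52, 0.70, 0.68, 0.63, 0.84, 0.80],
    "Motorbike":   [0.76, 0.77, 0.77, 0.39, 0.39, 0.71, 0.73, 0.46, 0.54, 0.70],
    "Teddy":       [0.92, 0.92, 0.92, 0.69, 0.87, 0.88, 0.84, 0.85, 0.82, 0.90],
}
reported_average = [0.83, 0.79, 0.83, 0.52, 0.67, 0.77, 0.76, 0.63, 0.72, 0.79]

# Transpose the table and average each column over the six sequences.
columns = list(zip(*rows.values()))
recomputed_average = [round(sum(col) / len(col), 2) for col in columns]
```

The recomputed values agree with the reported Average row to two decimals, so the row is a plain mean over the six sequences.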

Page 69: Visual Object Analysis using Regions and Local Features

69

Experiments: Semantic co-clustering

Co-clustering techniques
• I-2S+GG(MR): Multiresolution global generic co-clustering
• I-2S+SG(MR): Multiresolution global semantic co-clustering
• I-2S+GG(SR): Single resolution global generic co-clustering
• I-2S+SG(SR): Single resolution global semantic co-clustering

Semantic segmentation techniques
• SCSS: Semantic co-clustering based semantic segmentation
• GCSS: Generic co-clustering based semantic segmentation
• [ZJRP+15]: state-of-the-art

[ZJRP+15] S Zheng et al, Conditional Random Fields as Recurrent Neural Networks. ICCV’15

Page 70: Visual Object Analysis using Regions and Local Features

70

Experiments: Qualitative assessment

Page 71: Visual Object Analysis using Regions and Local Features

71

Experiments: Qualitative assessment

Page 72: Visual Object Analysis using Regions and Local Features

72

Experiments: Qualitative assessment

leaves partition

I-2S I-2S+GG I-2S+SG SCSS [ZJRP+15]

[ZJRP+15] S Zheng et al, Conditional Random Fields as Recurrent Neural Networks. ICCV’15

Page 73: Visual Object Analysis using Regions and Local Features

73

Experiments: Qualitative assessment

leaves partition

I-2S I-2S+GG I-2S+SG SCSS [ZJRP+15]

[ZJRP+15] S Zheng et al, Conditional Random Fields as Recurrent Neural Networks. ICCV’15

Page 74: Visual Object Analysis using Regions and Local Features

74

Experiments: Qualitative assessment

Occlusion/Object Boundary Detection Dataset [GVB11] Ballet and Breakdancers datasets [ZKU+04]

Page 76: Visual Object Analysis using Regions and Local Features

76

Conclusions
• The use of motion cues significantly improved the performance
• The new resolution parameterization allowed us to have a more uniform distribution of resolutions
• The two-step architecture improved the performance of the original one-step architecture
• Although global optimization is now feasible, there is no clear gain for generic co-clustering. However, it is useful for semantic co-clustering.
• Applying the resolution selection technique causes a small decrease in performance
• Submitted to ECCV’16 (awaiting decision)

Page 77: Visual Object Analysis using Regions and Local Features

77

Future Work
• Extending experiments to video datasets
  • VSB100 (Video Segmentation Benchmark) [1]
  • Cityscapes [2]
• Extending experiments to calibrated scenarios
• Training end-to-end CNNs for multiview semantic segmentation

[1] F Galasso et al, A Unified Video Segmentation Benchmark: Annotation, Metrics and Analysis. ICCV’13
[2] M Cordts et al, The cityscapes dataset for semantic urban scene understanding. CVPR’16

Page 79: Visual Object Analysis using Regions and Local Features

79

Conclusions
• Results achieved in the first part by considering new spatial configurations are now obsolete after the outstanding results achieved by deep learning techniques.
• Results from deep learning techniques were used in the second part.
• The proposed multiresolution co-clustering has improved state-of-the-art results, but we should consider an end-to-end deep learning approach to achieve a more significant improvement.
• Semantic segmentation techniques evolve really fast, making this field very competitive and challenging.

Page 80: Visual Object Analysis using Regions and Local Features

80

Publications
• Related to the thesis

• C. Ventura, D. Varas, X. Giro-i-Nieto, V. Vilaplana, F. Marques. Semantically driven multiresolution co-clustering for uncalibrated multiview segmentation. Submitted to the European Conference on Computer Vision (ECCV) 2016. Under review.

• C. Ventura, X. Giro-i-Nieto, V. Vilaplana, K. McGuinness, F. Marques, Noel E O'Connor. Improving spatial codification in semantic segmentation. International Conference on Image Processing (ICIP) 2015.

• C. Ventura. Visual object analysis using regions and interest points. ACM international conference on Multimedia 2013.

Page 81: Visual Object Analysis using Regions and Local Features

81

Publications
• Other publications:

• K. McGuinness, E. Mohedano, Z. Zhang, F. Hu, R. Albatal, Cathal Gurrin, N.E O'Connor, A. F. Smeaton, A. Salvador, X. Giro-i-Nieto, C. Ventura. Insight Centre for Data Analytics (DCU) at TRECVid 2014: instance search and semantic indexing tasks. TRECVID Workshop 2014.

• C. Ventura, V. Vilaplana, X. Giro-i-Nieto, F. Marques. Improving retrieval accuracy of Hierarchical Cellular Trees for generic metric spaces. Multimedia Tools and Applications, 2014.

• C. Ventura, X. Giro-i-Nieto, V. Vilaplana, D. Giribet, E. Carasusan. Automatic keyframe selection based on mutual reinforcement algorithm. International Workshop on Content-Based Multimedia Indexing (CBMI) 2013.

• C. Ventura, M. Tella-Amo, X. Giro-i-Nieto. UPC at MediaEval 2013 Hyperlinking Task. MediaEval 2013.

• C. Ventura, M. Martos, X. Giro-i-Nieto, V. Vilaplana, F. Marques. Hierarchical navigation and visual search for video keyframe retrieval. International Conference on Multimedia Modeling 2012.

Page 83: Visual Object Analysis using Regions and Local Features

83

Introduction: Context

Source: A. Oliva and A. Torralba, The role of context in object recognition

Page 84: Visual Object Analysis using Regions and Local Features

84

Introduction: Context

Source: A. Oliva and A. Torralba, The role of context in object recognition

Page 85: Visual Object Analysis using Regions and Local Features

85

Introduction: Context

Source: T. Malisiewicz and A. A. Efros, Improving spatial support for objects via multiple segmentations.

Page 86: Visual Object Analysis using Regions and Local Features

86

Related Work: Realistic scenario

Source: J. Carreira et al., Semantic segmentation with second-order pooling

Input image

Object segment hypotheses

Ranked object segment hypotheses (class independent)

object plausibility

score

Page 87: Visual Object Analysis using Regions and Local Features

87

Related Work: Realistic scenario

Source: J. Carreira et al., Semantic segmentation with second-order pooling

Predict overlap estimate of each segment to each object class and sort segments by maximal score

Aggregate high-rank segments

Page 88: Visual Object Analysis using Regions and Local Features

88

Related Work: Realistic scenario

TRAINING DATA: 0.8179, 0.6861, 0.9013, 0.7381, 0.7105, 0.6462

TEST DATA: ? 0.4905

[1] J Carreira et al, Semantic segmentation with second-order pooling. ECCV’12

Page 89: Visual Object Analysis using Regions and Local Features

89

Related Work: Co-clustering framework
• What are the contour elements?

[Figure: LEAVES PARTITIONS of view 1 and view 2]

Which contour elements are considered to compute Q1,4?
• Contour elements of R1
• Contour elements of R4

Page 90: Visual Object Analysis using Regions and Local Features

90

Related Work: Co-clustering framework

INTRA INTERACTIONS INTER INTERACTIONS

Page 91: Visual Object Analysis using Regions and Local Features

91

Related Work: Co-clustering framework

Page 92: Visual Object Analysis using Regions and Local Features

92

Related Work: Co-clustering framework

LINEAR PROGRAMMING RELAXATION

Page 93: Visual Object Analysis using Regions and Local Features

93

Related Work: Co-clustering framework

[Figure: regions 1 and 2 in one view, regions 3, 4 and 5 in the other]

Intra: Q1,2 = -0.81, Q3,4 = -0.81, Q3,5 = -0.81, Q4,5 = -0.49
Inter: Q1,3 = 2.81e+03, Q1,4 = -1.36e+03, Q1,5 = -1.45e+03, Q2,3 = -2.81e+03, Q2,4 = 1.36e+03, Q2,5 = 1.45e+03

Q4,5 = -0.49 → D4,5 = 1 ??

$D_{4,5} \le D_{4,2} + D_{2,5}$

D4,2 = 0, D2,5 = 0 → D4,5 = 0
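The effect of the triangle inequality on D4,5 can be reproduced with a brute-force toy. This sketch rests on an assumption the slide does not state: that the boundary variables are chosen to minimize the sum of Q_{i,j} · D_{i,j}, so that a negative Q favors a boundary and a positive Q favors a merge. Enumerating set partitions of the five regions satisfies the triangle inequalities by construction; the real framework uses a linear programming relaxation instead, and all names below are hypothetical.

```python
from itertools import combinations

# Q values copied from the slide.
Q = {
    (1, 2): -0.81, (3, 4): -0.81, (3, 5): -0.81, (4, 5): -0.49,  # intra
    (1, 3): 2.81e3, (1, 4): -1.36e3, (1, 5): -1.45e3,            # inter
    (2, 3): -2.81e3, (2, 4): 1.36e3, (2, 5): 1.45e3,
}
REGIONS = [1, 2, 3, 4, 5]


def set_partitions(items):
    """Yield every partition of `items` as a list of blocks."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for partition in set_partitions(rest):
        # Put `first` into each existing block, or into a new block of its own.
        for k in range(len(partition)):
            yield partition[:k] + [[first] + partition[k]] + partition[k + 1:]
        yield [[first]] + partition


def cost(partition):
    """Assumed objective: sum of Q over the pairs separated by a boundary (D = 1)."""
    label = {r: k for k, block in enumerate(partition) for r in block}
    return sum(q for (i, j), q in Q.items() if label[i] != label[j])


best = min(set_partitions(REGIONS), key=cost)
label = {r: k for k, block in enumerate(best) for r in block}
D = {(i, j): int(label[i] != label[j]) for i, j in combinations(REGIONS, 2)}
clusters = sorted(sorted(block) for block in best)
```

Under this assumed objective the best clustering groups region 2 with regions 4 and 5, so D4,5 is forced to 0 even though Q4,5 on its own would favor a boundary, which matches the outcome shown on the slide.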

Page 94: Visual Object Analysis using Regions and Local Features

94

Related Work: Co-clustering framework

LEAVES PARTITIONS CO-CLUSTERED PARTITIONS

Page 95: Visual Object Analysis using Regions and Local Features

95

Related Work: Co-clustering framework
• Hierarchical constraint

PARENT NODE 11

Inter-sibling boundaries:

Intra-sibling boundaries:

Page 96: Visual Object Analysis using Regions and Local Features

96

Related Work: Co-clustering framework
• Multiresolution parameterization

• Number of active contours to encode leave contours: = 9
• Maximum fraction to describe the r-th coarse level: = 0.5
• Maximum difference between consecutive levels: = 0.1

4.5, 3.6

Page 97: Visual Object Analysis using Regions and Local Features

97

Related Work: Co-clustering framework
• Iterative approach

Page 98: Visual Object Analysis using Regions and Local Features

98

Contribution II: Resolution parameterization

Selected inter-sibling boundaries:

Page 99: Visual Object Analysis using Regions and Local Features

99

Contributions
• Semantic global co-clustering

1. Class assignment to regions

2. Similarity penalizations
• Regions from same partition with different classes

3. Optimization constraints
• Regions from same partition with same class
• Regions from different partitions with different class

Page 100: Visual Object Analysis using Regions and Local Features

100

Contribution VI: Automatic resolution selection
• Some applications require a single resolution

[Figure: resolution levels l1 and l2 with clusters C1, C2 and C3; for C2, l1 or l2? l1]

Page 101: Visual Object Analysis using Regions and Local Features

101

Experiments: Semantic co-clustering

Page 102: Visual Object Analysis using Regions and Local Features

102

Conclusions
• Multiresolution co-clustering framework for uncalibrated multiview sequences
  • Two-step architecture
  • Global optimization
  • Semantic-based co-clustering with resolution selection
• Submitted to ECCV’16 (awaiting decision)

Page 103: Visual Object Analysis using Regions and Local Features

103

Conclusions
• Part I: Improving spatial codification in semantic segmentation
  • Figure-Border-Ground in realistic scenario
  • Contour-based spatial pyramid
• Part II: Multiresolution co-clustering for uncalibrated multiview segmentation
  • Results from Part I are replaced by SoA deep learning techniques
  • Generic co-clustering for multiview sequences
  • Semantic co-clustering for multiview sequences