Post on 08-Aug-2020
NAGARAJA et al.: SEARCHING FOR OBJECTS USING STRUCTURE IN INDOOR SCENES 1
Searching for Objects using Structure inIndoor Scenes - Supplementary MaterialVarun K. Nagarajavarun@umiacs.umd.edu
Vlad I. Morariumorariu@umiacs.umd.edu
Larry S. Davislsd@umiacs.umd.edu
University of MarylandCollege Park, MD. USA.
1 Experiments and ResultsTable 1 shows the operating point of the region classifier. The thresholds for the detectorsare set based on the best F1 point on the validation curves. We work with 18 categoriesand do not include the box category as its performance values are very low with an averageprecision of 1.4%.
meanAP (%) 22.9 52.9 27.0 16.7 29.1 3.8 13.0 14.6 16.2 9.7 10.4 11.6 25.2 20.2 33.3 11.5 18.5 29.4 20.3Prec (%) 87.5 79.8 52.7 65.1 69.8 16.7 37.8 41.7 56.0 65.7 75.0 48.0 57.5 78.6 73.6 44.1 27.8 90.9 59.3Rec (%) 23.3 54.5 34.8 20.7 35.3 11.7 19.5 17.5 18.7 11.7 12.4 15.2 32.5 21.2 37.6 18.0 28.9 29.4 24.6F1 (%) 36.8 64.8 41.9 31.4 46.9 13.8 25.7 24.7 28.0 19.8 21.2 23.1 41.5 33.3 49.7 25.5 28.3 44.4 34.8
Table 1: Operating characteristics of RCNN-depth. The table shows the performance ofthe region classification module RCNN-depth on the testing set of NYU depth v2 dataset.We use the top 100 region proposals and the threshold for the detectors is chosen such that weobtain the best F1 score on the validation set. RCNN-depth is called by our search strategyafter exploring a region.
c© 2015. The copyright of this document resides with its authors.It may be distributed unchanged freely in print or electronic forms.
2 NAGARAJA et al.: SEARCHING FOR OBJECTS USING STRUCTURE IN INDOOR SCENES
10 20 30 40 50 60 70 80 90 100
0.05
0.1
0.15
0.2
0.25
bathtub
Number of Image Regions
Avera
ge P
recis
ion
Proposal Rank
Scene+Objects Context (RandBgndSubset)
Scene+Objects Context (OEDBgndSubset)
10 20 30 40 50 60 70 80 90 100
0.3
0.35
0.4
0.45
0.5
0.55
bed
Number of Image Regions
Avera
ge P
recis
ion
Proposal Rank
Scene+Objects Context (RandBgndSubset)
Scene+Objects Context (OEDBgndSubset)
10 20 30 40 50 60 70 80 90 100
0.1
0.15
0.2
0.25
0.3
bookshelf
Number of Image Regions
Avera
ge P
recis
ion
Proposal Rank
Scene+Objects Context (RandBgndSubset)
Scene+Objects Context (OEDBgndSubset)
10 20 30 40 50 60 70 80 90 100
0.05
0.1
0.15
0.2
chair
Number of Image Regions
Avera
ge P
recis
ion
Proposal Rank
Scene+Objects Context (RandBgndSubset)
Scene+Objects Context (OEDBgndSubset)
10 20 30 40 50 60 70 80 90 100
0.1
0.15
0.2
0.25
0.3
counter
Number of Image Regions
Avera
ge P
recis
ion
Proposal Rank
Scene+Objects Context (RandBgndSubset)
Scene+Objects Context (OEDBgndSubset)
10 20 30 40 50 60 70 80 90 100
−0.04
−0.02
0
0.02
0.04
0.06
0.08
desk
Number of Image Regions
Ave
rag
e P
recis
ion
Proposal Rank
Scene+Objects Context (RandBgndSubset)
Scene+Objects Context (OEDBgndSubset)
10 20 30 40 50 60 70 80 90 1000.04
0.06
0.08
0.1
0.12
0.14
0.16
0.18
door
Number of Image Regions
Avera
ge P
recis
ion
Proposal Rank
Scene+Objects Context (RandBgndSubset)
Scene+Objects Context (OEDBgndSubset)
10 20 30 40 50 60 70 80 90 1000.04
0.06
0.08
0.1
0.12
0.14
0.16
0.18
dresser
Number of Image Regions
Avera
ge P
recis
ion
Proposal Rank
Scene+Objects Context (RandBgndSubset)
Scene+Objects Context (OEDBgndSubset)
10 20 30 40 50 60 70 80 90 100
0.05
0.1
0.15
0.2
garbage−bin
Number of Image Regions
Avera
ge P
recis
ion
Proposal Rank
Scene+Objects Context (RandBgndSubset)
Scene+Objects Context (OEDBgndSubset)
10 20 30 40 50 60 70 80 90 100−0.05
0
0.05
0.1
0.15lamp
Number of Image Regions
Ave
rag
e P
recis
ion
Proposal Rank
Scene+Objects Context (RandBgndSubset)
Scene+Objects Context (OEDBgndSubset)
10 20 30 40 50 60 70 80 90 100
0.02
0.04
0.06
0.08
0.1
0.12
0.14
0.16
monitor
Number of Image Regions
Avera
ge P
recis
ion
Proposal Rank
Scene+Objects Context (RandBgndSubset)
Scene+Objects Context (OEDBgndSubset)
10 20 30 40 50 60 70 80 90 100
0
0.02
0.04
0.06
0.08
0.1
0.12
0.14
0.16
night−stand
Number of Image Regions
Avera
ge P
recis
ion
Proposal Rank
Scene+Objects Context (RandBgndSubset)
Scene+Objects Context (OEDBgndSubset)
10 20 30 40 50 60 70 80 90 100
0.05
0.1
0.15
0.2
0.25
0.3
pillow
Number of Image Regions
Avera
ge P
recis
ion
Proposal Rank
Scene+Objects Context (RandBgndSubset)
Scene+Objects Context (OEDBgndSubset)
10 20 30 40 50 60 70 80 90 100
0.05
0.1
0.15
0.2
0.25
sink
Number of Image Regions
Avera
ge P
recis
ion
Proposal Rank
Scene+Objects Context (RandBgndSubset)
Scene+Objects Context (OEDBgndSubset)
10 20 30 40 50 60 70 80 90 100
0.15
0.2
0.25
0.3
0.35
sofa
Number of Image Regions
Avera
ge P
recis
ion
Proposal Rank
Scene+Objects Context (RandBgndSubset)
Scene+Objects Context (OEDBgndSubset)
10 20 30 40 50 60 70 80 90 1000
0.02
0.04
0.06
0.08
0.1
0.12
0.14
0.16
table
Number of Image Regions
Avera
ge P
recis
ion
Proposal Rank
Scene+Objects Context (RandBgndSubset)
Scene+Objects Context (OEDBgndSubset)
10 20 30 40 50 60 70 80 90 1000.08
0.1
0.12
0.14
0.16
0.18
0.2
0.22
0.24
television
Number of Image Regions
Avera
ge P
recis
ion
Proposal Rank
Scene+Objects Context (RandBgndSubset)
Scene+Objects Context (OEDBgndSubset)
10 20 30 40 50 60 70 80 90 100
0
0.05
0.1
0.15
0.2
0.25
0.3
toilet
Number of Image Regions
Avera
ge P
recis
ion
Proposal Rank
Scene+Objects Context (RandBgndSubset)
Scene+Objects Context (OEDBgndSubset)
Figure 1: Scene+Objects Context: Comparison of background selection techniques. Thesearch strategy trained with a background subset selected using determinant maximizationperforms better or equally well as the strategy trained with a background subset selectedrandomly. The main advantage of the determinant maximization based subset selection isthe repeatability of experiments unlike the random subset selection.
NAGARAJA et al.: SEARCHING FOR OBJECTS USING STRUCTURE IN INDOOR SCENES 3
10 20 30 40 50 60 70 80 90 100
0.05
0.1
0.15
0.2
0.25
bathtub
Number of Image Regions
Avera
ge P
recis
ion
Proposal Rank
Scene Context (RandBgndSubset)
Scene Context (OEDBgndSubset)
10 20 30 40 50 60 70 80 90 100
0.3
0.35
0.4
0.45
0.5
0.55
bed
Number of Image Regions
Avera
ge P
recis
ion
Proposal Rank
Scene Context (RandBgndSubset)
Scene Context (OEDBgndSubset)
10 20 30 40 50 60 70 80 90 100
0.1
0.15
0.2
0.25
0.3
bookshelf
Number of Image Regions
Avera
ge P
recis
ion
Proposal Rank
Scene Context (RandBgndSubset)
Scene Context (OEDBgndSubset)
10 20 30 40 50 60 70 80 90 100
0.05
0.1
0.15
0.2
chair
Number of Image Regions
Avera
ge P
recis
ion
Proposal Rank
Scene Context (RandBgndSubset)
Scene Context (OEDBgndSubset)
10 20 30 40 50 60 70 80 90 100
0.1
0.15
0.2
0.25
0.3
counter
Number of Image Regions
Avera
ge P
recis
ion
Proposal Rank
Scene Context (RandBgndSubset)
Scene Context (OEDBgndSubset)
10 20 30 40 50 60 70 80 90 100
−0.04
−0.02
0
0.02
0.04
0.06
0.08
desk
Number of Image Regions
Ave
rag
e P
recis
ion
Proposal Rank
Scene Context (RandBgndSubset)
Scene Context (OEDBgndSubset)
10 20 30 40 50 60 70 80 90 1000.04
0.06
0.08
0.1
0.12
0.14
0.16
0.18
door
Number of Image Regions
Avera
ge P
recis
ion
Proposal Rank
Scene Context (RandBgndSubset)
Scene Context (OEDBgndSubset)
10 20 30 40 50 60 70 80 90 1000.04
0.06
0.08
0.1
0.12
0.14
0.16
0.18
dresser
Number of Image Regions
Avera
ge P
recis
ion
Proposal Rank
Scene Context (RandBgndSubset)
Scene Context (OEDBgndSubset)
10 20 30 40 50 60 70 80 90 100
0.05
0.1
0.15
0.2
garbage−bin
Number of Image Regions
Avera
ge P
recis
ion
Proposal Rank
Scene Context (RandBgndSubset)
Scene Context (OEDBgndSubset)
10 20 30 40 50 60 70 80 90 100−0.05
0
0.05
0.1
0.15lamp
Number of Image Regions
Ave
rag
e P
recis
ion
Proposal Rank
Scene Context (RandBgndSubset)
Scene Context (OEDBgndSubset)
10 20 30 40 50 60 70 80 90 100
0.02
0.04
0.06
0.08
0.1
0.12
0.14
0.16
monitor
Number of Image Regions
Avera
ge P
recis
ion
Proposal Rank
Scene Context (RandBgndSubset)
Scene Context (OEDBgndSubset)
10 20 30 40 50 60 70 80 90 100
0
0.02
0.04
0.06
0.08
0.1
0.12
0.14
0.16
night−stand
Number of Image Regions
Avera
ge P
recis
ion
Proposal Rank
Scene Context (RandBgndSubset)
Scene Context (OEDBgndSubset)
10 20 30 40 50 60 70 80 90 100
0.05
0.1
0.15
0.2
0.25
0.3
pillow
Number of Image Regions
Avera
ge P
recis
ion
Proposal Rank
Scene Context (RandBgndSubset)
Scene Context (OEDBgndSubset)
10 20 30 40 50 60 70 80 90 100
0.05
0.1
0.15
0.2
0.25
sink
Number of Image Regions
Avera
ge P
recis
ion
Proposal Rank
Scene Context (RandBgndSubset)
Scene Context (OEDBgndSubset)
10 20 30 40 50 60 70 80 90 100
0.15
0.2
0.25
0.3
0.35
sofa
Number of Image Regions
Avera
ge P
recis
ion
Proposal Rank
Scene Context (RandBgndSubset)
Scene Context (OEDBgndSubset)
10 20 30 40 50 60 70 80 90 1000
0.02
0.04
0.06
0.08
0.1
0.12
0.14
0.16
table
Number of Image Regions
Avera
ge P
recis
ion
Proposal Rank
Scene Context (RandBgndSubset)
Scene Context (OEDBgndSubset)
10 20 30 40 50 60 70 80 90 1000.08
0.1
0.12
0.14
0.16
0.18
0.2
0.22
0.24
television
Number of Image Regions
Avera
ge P
recis
ion
Proposal Rank
Scene Context (RandBgndSubset)
Scene Context (OEDBgndSubset)
10 20 30 40 50 60 70 80 90 100
0
0.05
0.1
0.15
0.2
0.25
0.3
toilet
Number of Image Regions
Avera
ge P
recis
ion
Proposal Rank
Scene Context (RandBgndSubset)
Scene Context (OEDBgndSubset)
Figure 2: Scene Context: Comparison of background selection techniques. The perfor-mance of the classifier trained with a background subset selected using determinant maxi-mization is comparable to that of the classifier trained with a random background subset. Themain advantage of the determinant maximization based subset selection is the repeatabilityof experiments unlike the random subset selection.
4 NAGARAJA et al.: SEARCHING FOR OBJECTS USING STRUCTURE IN INDOOR SCENES
Groundtruth Proposal Rank Scene Context Scene+Objects Context
lamp
night−stand
chair
night−stand night−stand
lamp
(a) Searching for lamp. Number of regions processed = 35
table table tablechair
table
chair
(b) Searching for table. Number of regions processed = 15
chair
chair
chair
chair
chair
table
chair
chair
table
sofa
chair
table
sofa
chair
chair
chair
(c) Searching for chair. Number of regions processed = 45
counter
sink
counter
(d) Searching for counter. Number of regions processed = 20Figure 3: Search results for different queries. We compare three strategies - ranked se-quence obtained from the region proposal technique (unaware of query class), ranked se-quence obtained from a classifier trained for a query class using scene context features aloneand sequence produced by a search strategy trained for a query class using both scene contextand object-object context features. Red boxes indicate regions labeled as query class, yellowboxes indicate regions other than the query class and blue boxes indicate regions labeled asbackground. The images show a state in the search sequence of different methods at a certainnumber of regions processed. Our strategy which uses both scene context and object-objectcontext can locate an object of the query class earlier than the other methods.