A Database of Human Segmented Natural Images and Two Applications David Martin, Charless Fowlkes,...

A Database of Human Segmented Natural Images and Two Applications

David Martin, Charless Fowlkes, Doron Tal, Jitendra Malik

UC Berkeley
{dmartin,fowlkes,doron,malik}@eecs.berkeley.edu

David Martin - UC Berkeley - ICCV 2001

Motivation

• Berkeley Segmentation Dataset: groundtruth for image segmentation of natural images

• App#1: A segmentation benchmark

• App#2: Ecological statistics


Benchmark Example for Recognition

MNIST handwritten digit dataset [LeCun, AT&T]
http://www.research.att.com/~yann/exdb/mnist/index.html

METHOD                                        ERROR (%)
Boosted LeNet-4, [distortions]                0.7
Virtual SVM deg 9 poly [distortions]          0.8
LeNet-5, [distortions]                        0.8
LeNet-5, [huge distortions]                   0.85
LeNet-5, [no distortions]                     0.95
Reduced Set SVM deg 5 polynomial              1
K-NN, Tangent Distance, 16x16                 1.1
SVM deg 4 polynomial                          1.1
LeNet-4                                       1.1
LeNet-4 with K-NN instead of last layer       1.1
LeNet-4 with local learning instead of ll     1.1
2-layer NN, 300 HU, [deskewing]               1.6
LeNet-1 [with 16x16 input]                    1.7
K-nearest-neighbors, Euclidean, deskewed      2.4

Training set, test set, evaluation methodology, algorithm ranking


The Image Dataset

• 1000 Corel images
  – Photographs of outdoor scenes
  – Texture is common
  – Large variety of subject matter
  – 481 x 321 x 24b


Establishing Groundtruth

• Def: Segmentation = Partition of image pixels into exclusive sets

• Manual segmentation by human subjects
  – Custom Java tool to facilitate task

• Currently: 1000 images, 5500 segmentations, 20 subjects

• Naïve subjects (UCB undergrads) given simple, non-technical instructions


Directions to Image Segmentors

• You will be presented a photographic image

• Divide the image into some number of segments, where the segments represent “things” or “parts of things” in the scene

• The number of segments is up to you, as it depends on the image. Something between 2 and 30 is likely to be appropriate.

• It is important that all of the segments have approximately equal importance.


• The segmentations are not identical.

• But are they consistent??


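Perceptual organization forms a hierarchy

[Figure: segmentation hierarchy drawn as a tree. Root: image. Second level: background, left bird, right bird. Remaining node labels: grass, bush, far, and head, eye, beak, body for each bird.]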

Each subject picks a slice through this hierarchy.


Quantifying inconsistency

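[Figure: two example segmentations, labeled S1 and S2]

How much is S1 a refinement of S2 at pixel $p_i$?

$$\mathrm{LRE}(S_1, S_2, p_i) = \frac{|R(S_1, p_i) \setminus R(S_2, p_i)|}{|R(S_1, p_i)|}$$

where $R(S, p_i)$ is the segment of segmentation $S$ that contains pixel $p_i$.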


Segmentation Error Measure

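• One-way Local Refinement Error:

$$\mathrm{LRE}(S_1, S_2, p_i) = \frac{|R(S_1, p_i) \setminus R(S_2, p_i)|}{|R(S_1, p_i)|}$$

• Segmentation Error allows refinement in either direction at each pixel:

$$\mathrm{SE}(S_1, S_2) = \frac{1}{n} \sum_i \min\{\mathrm{LRE}(S_1, S_2, p_i),\ \mathrm{LRE}(S_2, S_1, p_i)\}$$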
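The two measures on this slide can be sketched in a few lines of Python. This is a minimal sketch, not the authors' implementation: it assumes each segmentation is given as a flat list of integer segment labels, one per pixel, in the same pixel order. The function names `local_refinement_error` and `segmentation_error` are hypothetical. The per-pixel function follows the set definition directly; the SE function groups pixels by label pair so that it only needs segment sizes and overlaps.

```python
from collections import Counter


def local_refinement_error(s1, s2, i):
    """One-way LRE at pixel i: fraction of pixel i's segment in s1 that
    falls outside pixel i's segment in s2. (Hypothetical helper.)"""
    r1 = {j for j, label in enumerate(s1) if label == s1[i]}
    r2 = {j for j, label in enumerate(s2) if label == s2[i]}
    return len(r1 - r2) / len(r1)


def segmentation_error(s1, s2):
    """SE: mean over pixels of min(LRE(s1, s2, p), LRE(s2, s1, p))."""
    if len(s1) != len(s2) or not s1:
        raise ValueError("segmentations must be non-empty and equal length")
    size1 = Counter(s1)              # segment sizes in s1
    size2 = Counter(s2)              # segment sizes in s2
    overlap = Counter(zip(s1, s2))   # intersection size per label pair
    total = 0.0
    for (a, b), count in overlap.items():
        lre12 = (size1[a] - count) / size1[a]
        lre21 = (size2[b] - count) / size2[b]
        # Every pixel with label pair (a, b) contributes the same value.
        total += count * min(lre12, lre21)
    return total / len(s1)
```

For example, `segmentation_error([0, 0, 1, 1], [0, 0, 0, 0])` is 0 because the first segmentation is a pure refinement of the second, which is exactly the case the measure is designed to tolerate.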


Human segmentations are consistent

[Figure: "SE (Color Human Segmentations)". Plot over Segmentation Error (SE) with two series: Same Image and Different Images.]

Distribution of segmentation error over the dataset.


[Figures: image panels labeled Color, Gray, and InvNeg]


Gray vs. Color vs. InvNeg Segmentations

SE (gray, gray) = 0.047
SE (gray, color) = 0.047

Color may affect attention, but doesn’t seem to affect perceptual organization

SE (gray, gray) = 0.047
SE (gray, invneg) = 0.059

InvNeg interferes with high-level cues

(2500 gray, 2500 color, 200 invneg segmentations)


Benchmark Methodology

• Separate training and test datasets with no images in common

• Generate computer segmentation(s) of each image in test set
  – Determine error of each computer segmentation using SE measure
  – Algorithm scored by mean SE

• Example:
  – SE (human, human) = 0.05
  – SE (NCuts, human) = 0.22
  – SE (different images) = 0.30
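The scoring loop described on this slide might look like the sketch below. Several details are assumptions, since the slide does not spell them out: the error against an image's several human segmentations is averaged, the per-image errors are then averaged again, and the names `check_split`, `benchmark_score`, `segment`, and `error` are hypothetical. The `error` argument stands for any implementation of the SE measure.

```python
from statistics import mean


def check_split(train_ids, test_ids):
    """Raise if the training and test sets share any image."""
    shared = set(train_ids) & set(test_ids)
    if shared:
        raise ValueError(f"images in both train and test: {sorted(shared)}")


def benchmark_score(test_set, segment, error):
    """Mean error of an algorithm over a test set.

    test_set: dict mapping image id -> (image, list of human segmentations)
    segment:  callable, image -> computer segmentation
    error:    callable, (computer seg, human seg) -> float, e.g. SE
    Assumption: errors are averaged over humans, then over images.
    """
    per_image = []
    for image_id, (image, humans) in test_set.items():
        machine = segment(image)
        per_image.append(mean(error(machine, h) for h in humans))
    return mean(per_image)
```

Averaging per image first keeps images with many human segmentations from dominating the score; a pooled average over all (image, human) pairs is an equally plausible reading of "mean SE".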


Ecological Statistics of Image Segmentations

• Validating and quantifying Gestalt grouping factors [Brunswik 1953]

• Priors on region properties

• Recent work on natural image statistics:
  – Filter outputs [Ruderman 1994, Olshausen & Field 1996, Yuille et al. 1999]
  – Object sizes [Alvarez, Gousseau, Morel 1999]
  – Shape [Zhu 1999]
  – Contours [August & Zucker 2000, Geisler et al. 2001]


Relative power of cues

• Pairwise grouping cues
  – Proximity
  – Luminance similarity
  – Color similarity
  – Intervening contour
  – Texture similarity


P (Same Segment | Proximity)
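One way to estimate a curve like this from groundtruth is to count, for each pixel distance, how often two pixels carry the same segment label. The sketch below is an illustration under stated assumptions, not the procedure used for the talk: it works on a single segmentation given as a 2D list of labels, bins Euclidean distance with a hypothetical `bin_width` parameter, and enumerates every pixel pair, which is only practical for small images (a real run would sample pairs).

```python
import math
from collections import defaultdict
from itertools import combinations


def p_same_segment_by_distance(labels, bin_width=1.0):
    """Empirical P(same segment | distance bin) for one label map.

    labels: 2D list of segment labels (rows of equal length).
    Returns a dict: distance bin index -> fraction of pixel pairs in
    that bin whose two pixels share a label.
    """
    pixels = [(r, c) for r, row in enumerate(labels) for c in range(len(row))]
    same = defaultdict(int)
    total = defaultdict(int)
    for (r1, c1), (r2, c2) in combinations(pixels, 2):
        b = int(math.hypot(r1 - r2, c1 - c2) // bin_width)
        total[b] += 1
        same[b] += labels[r1][c1] == labels[r2][c2]
    return {b: same[b] / total[b] for b in sorted(total)}
```

The same counting scheme applies to the other pairwise cues by replacing the distance with, say, a binned luminance difference.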


P (Same Segment | Luminance)


Bayes Risk for Proximity Cue
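For a binary decision (same segment or not) based on a binned cue value x, the Bayes risk is the expected value of min(P(same | x), 1 − P(same | x)). The sketch below computes the plug-in estimate of that quantity from labeled pixel pairs. It assumes the cue has already been discretized into bins, and the function name `bayes_risk` and the sample format are hypothetical. Because it is evaluated on the same samples used to estimate the probabilities, the estimate is optimistic.

```python
from collections import Counter


def bayes_risk(samples):
    """Plug-in Bayes risk of predicting 'same segment' from a binned cue.

    samples: iterable of (cue_bin, same_segment) pairs, where cue_bin is
    any hashable bin id and same_segment is truthy if the two pixels of
    the pair lie in the same segment.
    """
    same = Counter()
    total = Counter()
    for cue_bin, is_same in samples:
        total[cue_bin] += 1
        same[cue_bin] += bool(is_same)
    n = sum(total.values())
    if n == 0:
        raise ValueError("no samples")
    # In each bin the best decision errs on the minority class, so
    # P(x) * min(p, 1 - p) reduces to min(same, total - same) / n.
    return sum(min(same[b], total[b] - same[b]) for b in total) / n
```

A cue that carries no information about grouping gives a risk equal to the smaller class prior; a perfectly informative cue gives zero.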


Bayes Risk for Various Cues Conditioned on Proximity


Mutual Information for Various Cues Conditioned on Proximity
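Mutual information between a discretized cue and the same-segment indicator can be estimated from joint counts, and "conditioned on proximity" can be read as computing it separately within each distance bin. That reading is an assumption, as are the names `mutual_information` and `mutual_information_by_distance` and the sample format; the slide does not describe the estimator actually used.

```python
import math
from collections import Counter, defaultdict


def mutual_information(pairs):
    """Plug-in estimate of I(X; Y) in bits from (x, y) samples."""
    pairs = list(pairs)
    n = len(pairs)
    if n == 0:
        raise ValueError("no samples")
    joint = Counter(pairs)
    px = Counter(x for x, _ in pairs)
    py = Counter(y for _, y in pairs)
    # p(x,y) / (p(x) p(y)) simplifies to count * n / (count_x * count_y).
    return sum(
        (c / n) * math.log2(c * n / (px[x] * py[y]))
        for (x, y), c in joint.items()
    )


def mutual_information_by_distance(triples):
    """I(cue; same segment) computed separately per distance bin.

    triples: iterable of (distance_bin, cue_bin, same_segment).
    """
    groups = defaultdict(list)
    for distance_bin, cue_bin, is_same in triples:
        groups[distance_bin].append((cue_bin, bool(is_same)))
    return {d: mutual_information(g) for d, g in sorted(groups.items())}
```

Plug-in estimates of this kind are biased upward when bins are sparsely populated, so bin counts should be large relative to the number of bins.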


Priors on Region Properties

• Area

• Convexity


Empirical Distribution of Region Area

$y = K x^{-\alpha}$, with $\alpha = 0.913$

Compare with Alvarez, Gousseau, Morel 1999.
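A power law of the form y = K x^(−α), like the one reported on this slide with an exponent of 0.913, becomes a straight line in log-log space, so one simple way to fit it is ordinary least squares on (log x, log y). The sketch below does that with the standard library only. The function name `fit_power_law` is hypothetical, and the slide does not say which fitting method was actually used.

```python
import math


def fit_power_law(xs, ys):
    """Fit y = K * x**(-alpha) by least squares on log-transformed data.

    xs, ys: equal-length sequences of positive numbers (for example,
    region areas and their empirical frequencies).
    Returns (K, alpha).
    """
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need at least two (x, y) points")
    log_x = [math.log(x) for x in xs]
    log_y = [math.log(y) for y in ys]
    n = len(log_x)
    mean_x = sum(log_x) / n
    mean_y = sum(log_y) / n
    sxx = sum((a - mean_x) ** 2 for a in log_x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(log_x, log_y))
    slope = sxy / sxx                      # slope of the log-log line
    k = math.exp(mean_y - slope * mean_x)  # intercept back in linear space
    return k, -slope
```

Least squares in log space weights all scales equally, which can distort the exponent when the tail bins are noisy; a maximum-likelihood fit is a common alternative.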


Empirical Distribution of Region Convexity


Conclusion

• Large new database of segmentations of natural images by humans

• A segmentation benchmark

• Ecological statistics
  – Relative power of grouping cues
  – Priors on region properties

http://www.cs.berkeley.edu/~dmartin/segbench
