
NIPS Vancouver 2002, UC Berkeley Vision Group, http://www.cs.berkeley.edu/projects/vision

Learning to Detect Natural Image Boundaries Using Local Brightness, Color, and Texture Cues

David Martin, Charless Fowlkes, Jitendra Malik

{dmartin,fowlkes,malik}@eecs.berkeley.edu

UC Berkeley Vision Group

http://www.cs.berkeley.edu/projects/vision


Multiple Cues for Grouping

• Many cues for perceptual grouping:

– Low-Level: brightness, color, texture, depth, motion

– Mid-Level: continuity, closure, convexity, symmetry, …

– High-Level: familiar objects and configurations

This talk: Learn local cue combination rule from data


[Figure: example non-boundary and boundary patches, with panels labeled I, T, B, C]


Goal and Outline

• Goal: Model the posterior probability of a boundary Pb(x,y,θ) at each pixel and orientation using local cues.

• Method: Supervised learning using dataset of 12,000 segmentations of 1,000 images by 30 subjects.

• Outline of Talk:

1. 3 cues: brightness, color, texture

2. Cue calibration

3. Cue combination

4. Compare with other approaches

– Canny 1986, Konishi/Yuille/Coughlan/Zhu 1999

5. Pb images


Brightness and Color Features

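• 1976 CIE L*a*b* colorspace

• Brightness Gradient BG(x,y,r,θ)

– χ² difference in L* distribution

• Color Gradient CG(x,y,r,θ)

– χ² difference in a* and b* distributions

$$\chi^2(g, h) = \frac{1}{2} \sum_i \frac{(g_i - h_i)^2}{g_i + h_i}$$

[Figure: disc of radius r centered at (x,y), divided at orientation θ]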
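The brightness and color gradients compare the value distributions in the two halves of a disc using a chi-squared histogram difference. A minimal sketch of that idea follows. It assumes a single channel rescaled to [0, 1], plain binned histograms rather than the kernel density estimates mentioned under cue calibration, and hypothetical function names and defaults (`half_disc_gradient`, `n_bins=32`); it is an illustration, not the authors' implementation.

```python
import numpy as np

def chi2_distance(g, h):
    """Chi-squared difference between two normalized histograms."""
    g = np.asarray(g, dtype=float)
    h = np.asarray(h, dtype=float)
    denom = g + h
    mask = denom > 0  # skip bins that are empty in both histograms
    return 0.5 * np.sum((g[mask] - h[mask]) ** 2 / denom[mask])

def half_disc_gradient(channel, x, y, r, theta, n_bins=32, eps=1e-9):
    """Chi-squared difference between histograms of the two halves of a disc.

    channel: 2-D array with values in [0, 1] (assumed, e.g. a rescaled L* channel).
    (x, y): disc center in pixels; r: radius; theta: orientation of the diameter.
    """
    height, width = channel.shape
    ys, xs = np.mgrid[0:height, 0:width]
    dx, dy = xs - x, ys - y
    in_disc = dx ** 2 + dy ** 2 <= r ** 2
    # Signed distance from the diameter whose direction is (cos theta, sin theta).
    side = -dx * np.sin(theta) + dy * np.cos(theta)
    halves = []
    # Pixels lying on the dividing diameter belong to neither half.
    for half in (in_disc & (side > eps), in_disc & (side < -eps)):
        hist, _ = np.histogram(channel[half], bins=n_bins, range=(0.0, 1.0))
        halves.append(hist / max(hist.sum(), 1))
    return chi2_distance(halves[0], halves[1])
```

With normalized histograms the value lies in [0, 1]: it reaches 1 when the two halves share no bins and 0 when their histograms are identical.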


Texture Feature

• Texture Gradient TG(x,y,r,θ)

– χ² difference of texton histograms

– Textons are vector-quantized filter outputs

[Figure: texton map]
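Vector quantization of filter outputs can be sketched with plain k-means. The example below assumes the filter responses are already available as an (H, W, F) array, because the filter bank itself is not specified on the slide. The function names and the farthest-point initialization are illustrative choices, not taken from the talk.

```python
import numpy as np

def kmeans(points, k, n_iter=20):
    """Lloyd's k-means with deterministic farthest-point initialization."""
    centers = [points[0]]
    for _ in range(1, k):
        # Distance from each point to its nearest already-chosen center.
        d2 = np.min([((points - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(points[d2.argmax()])
    centers = np.array(centers, dtype=float)
    for _ in range(n_iter):
        d2 = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        for j in range(k):
            members = points[labels == j]
            if len(members) > 0:  # keep the old center if a cluster empties
                centers[j] = members.mean(axis=0)
    d2 = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return centers, d2.argmin(axis=1)

def texton_map(responses, k):
    """responses: (H, W, F) filter outputs -> (H, W) integer texton labels."""
    h, w, f = responses.shape
    _, labels = kmeans(responses.reshape(-1, f), k)
    return labels.reshape(h, w)

def texton_histogram(tmap, mask, k):
    """Normalized histogram of texton labels inside a boolean mask."""
    counts = np.bincount(tmap[mask], minlength=k).astype(float)
    return counts / max(counts.sum(), 1.0)
```

A texture gradient would then compare `texton_histogram` outputs for the two halves of a disc, in the same way the brightness and color gradients compare their histograms.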


Cue Calibration

All free parameters optimized on training data

• Brightness Gradient

– Scale, bin/kernel sizes for KDE

• Color Gradient

– Scale, bin/kernel sizes for KDE, joint vs. marginals

• Texture Gradient

– Filter bank: scale, multiscale?

– Histogram comparison: L1, L2, L∞, χ², EMD

– Number of textons

– Image-specific vs. universal textons

• Localization parameters for each cue (see paper)


Classifiers for Cue Combination

• Classification Trees

– Top-down splits to maximize information gain, error bounded

• Density Estimation

– Adaptive bins using k-means

• Logistic Regression, 3 variants

– Linear and quadratic terms

– Confidence-rated generalization of AdaBoost (Schapire & Singer)

• Hierarchical Mixtures of Experts (Jordan & Jacobs)

– Up to 8 experts, initialized top-down, fit with EM

• Support Vector Machines (libsvm, Chang&Lin)

Range over bias/variance, parametric/non-parametric, simple/complex
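As an illustration of the simplest classifier on this list, the sketch below fits a linear logistic model that maps per-pixel cue values to a boundary probability. It assumes one row of cue values per sample and uses plain gradient descent. The function names, learning rate, and iteration count are hypothetical and are not the settings used in the talk.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_logistic(features, labels, lr=0.5, n_iter=2000):
    """Fit weights (bias first) by gradient descent on the mean log loss.

    features: (N, D) cue values; labels: (N,) with 1 = boundary, 0 = non-boundary.
    """
    X = np.hstack([np.ones((len(features), 1)), features])  # prepend bias column
    w = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = sigmoid(X @ w)
        w -= lr * X.T @ (p - labels) / len(labels)
    return w

def predict_pb(w, features):
    """Boundary probability for each row of cue values."""
    X = np.hstack([np.ones((len(features), 1)), features])
    return sigmoid(X @ w)
```

The fitted weights `w[1:]` give the relative contribution of each cue, so a model of this form makes the cue weighting directly inspectable.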


Classifier Comparison


Cue Combinations


Alternate Approaches

• Canny Detector

– Canny 1986

– MATLAB implementation

– With and without hysteresis

• Second Moment Matrix

– Nitzberg/Mumford/Shiota 1993

– cf. Förstner and Harris corner detectors

– Used by Konishi et al. 1999 in learning framework

– Logistic model trained on eigenspectrum
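The eigenspectrum of the second moment matrix can be sketched as follows. The example builds the matrix from image gradients averaged over a square window and returns both eigenvalues per pixel. The box window and the `radius` default are assumptions made for brevity (a Gaussian window is a common alternative), and the function names are hypothetical.

```python
import numpy as np

def box_smooth(a, radius):
    """Average over a (2*radius+1)^2 window, using edge padding."""
    k = 2 * radius + 1
    padded = np.pad(a, radius, mode="edge")
    out = np.zeros_like(a, dtype=float)
    for dy in range(k):
        for dx in range(k):
            out += padded[dy:dy + a.shape[0], dx:dx + a.shape[1]]
    return out / (k * k)

def second_moment_eigenvalues(image, radius=2):
    """Per-pixel eigenvalues (larger, smaller) of the smoothed gradient outer product."""
    gy, gx = np.gradient(image.astype(float))
    jxx = box_smooth(gx * gx, radius)
    jxy = box_smooth(gx * gy, radius)
    jyy = box_smooth(gy * gy, radius)
    # Closed-form eigenvalues of the symmetric 2x2 matrix [[jxx, jxy], [jxy, jyy]].
    half_trace = 0.5 * (jxx + jyy)
    root = np.sqrt(0.25 * (jxx - jyy) ** 2 + jxy ** 2)
    return half_trace + root, half_trace - root
```

On a straight edge the larger eigenvalue is positive and the smaller one is near zero. Both become large at corners, which is the link to the Förstner and Harris detectors. The two eigenvalues per pixel could serve as the input features of a logistic model.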


Two Decades of Boundary Detection


Pb Images I

[Figure columns: Image, Canny, 2MM, Us, Human]


Pb Images II

[Figure columns: Image, Canny, 2MM, Us, Human]


Pb Images III

[Figure columns: Image, Canny, 2MM, Us, Human]


Summary and Conclusion

1. A simple linear model is sufficient for cue combination

– All cues weighted approximately equally in logistic

2. Proper texture edge model is not optional for complex natural images

– Texture suppression is not sufficient!

3. Significant improvement over state-of-the-art in boundary detection

– Pb useful for higher-level processing

4. Empirical approach critical for both cue calibration and cue combination

Segmentation data and Pb images on the web: http://www.cs.berkeley.edu/projects/vision
