Abstract

Computer Vision Group, University of Bonn; Vision Laboratory, Stanford University

This paper empirically compares nine image dissimilarity measures that are based on distributions of color and texture features, summarizing over 1,000 CPU hours of computational experiments. Ground truth is collected via a novel random sampling scheme for color, and via an image partitioning method for texture. Quantitative performance evaluations are given for classification, image retrieval, and segmentation tasks, and for a wide variety of dissimilarity measures. It is demonstrated how the selection of a measure, based on large scale evaluation, substantially improves the quality of classification, retrieval, and unsupervised segmentation of color and texture images.


Goals

• compare distribution-based image dissimilarity measures

• evaluate dependency on parameter settings

• develop generic and statistically sound benchmarking methodology

• examine influence in different applications: classification, retrieval, annotation and unsupervised segmentation


Image Representation

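Distributions:

– adaptive binning (multivariate histograms), with prototypes $c_j$: $f(i; I) = \left|\{x : i = \arg\min_j \|I(x) - c_j\|\}\right|$

– marginal histograms: $f^r(i; I) = \left|\{x : t^r_{i-1} < I^r(x) \le t^r_i\}\right|$

– cumulative marginal histograms: $F^r(i; I) = \left|\{x : I^r(x) \le t^r_i\}\right|$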
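The three distribution types can be illustrated with a minimal Python sketch. It assumes pixels are given as tuples of feature values and that bin thresholds are sorted in ascending order; the function names are hypothetical and not taken from the presentation.

```python
from bisect import bisect_left


def adaptive_histogram(pixels, prototypes):
    """f(i; I): number of pixels whose nearest prototype is c_i."""
    counts = [0] * len(prototypes)
    for x in pixels:
        # Squared Euclidean distance gives the same argmin as the norm.
        dists = [sum((a - b) ** 2 for a, b in zip(x, c)) for c in prototypes]
        counts[dists.index(min(dists))] += 1
    return counts


def marginal_histogram(values, thresholds):
    """f^r(i; I): number of values v with t_{i-1} < v <= t_i.

    The first bin has no lower bound; values above the last
    threshold are ignored (an assumption of this sketch).
    """
    counts = [0] * len(thresholds)
    for v in values:
        i = bisect_left(thresholds, v)  # first index with thresholds[i] >= v
        if i < len(thresholds):
            counts[i] += 1
    return counts


def cumulative_histogram(values, thresholds):
    """F^r(i; I): number of values v with v <= t_i."""
    running, out = 0, []
    for c in marginal_histogram(values, thresholds):
        running += c
        out.append(running)
    return out
```

For example, `marginal_histogram([0.1, 0.5, 0.5, 0.9], [0.25, 0.5, 0.75, 1.0])` counts one value in the first bin, two in the second, and one in the last.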


Image Representation

dissimilarity measures:

– for multivariate (full) distributions: $D(I, J)$

– for marginal distributions: $D^r(I, J)$

• Color: CIELab color space

• Texture: Gabor filter responses
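As an illustration of the color feature space, the following sketch converts a single pixel to CIELab. The presentation does not say how the images were converted; the sketch assumes sRGB input in [0, 1] and a D65 white point, and uses the standard sRGB-to-XYZ matrix.

```python
def srgb_to_lab(r, g, b):
    """Convert one sRGB pixel (components in [0, 1]) to CIELab (D65 assumed)."""

    def linearize(u):
        # Undo the sRGB transfer curve.
        return u / 12.92 if u <= 0.04045 else ((u + 0.055) / 1.055) ** 2.4

    rl, gl, bl = linearize(r), linearize(g), linearize(b)

    # Linear RGB to CIE XYZ.
    x = 0.4124564 * rl + 0.3575761 * gl + 0.1804375 * bl
    y = 0.2126729 * rl + 0.7151522 * gl + 0.0721750 * bl
    z = 0.0193339 * rl + 0.1191920 * gl + 0.9503041 * bl

    def f(t):
        delta = 6.0 / 29.0
        if t > delta ** 3:
            return t ** (1.0 / 3.0)
        return t / (3.0 * delta ** 2) + 4.0 / 29.0

    # Normalize by the D65 reference white.
    fx, fy, fz = f(x / 0.95047), f(y / 1.0), f(z / 1.08883)
    return 116.0 * fy - 16.0, 500.0 * (fx - fy), 200.0 * (fy - fz)
```

Under these assumptions white maps to roughly L = 100 with a and b near zero, and black maps to L = 0.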


Heuristic Dissimilarity Measures

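• Minkowski-distance $L_p$, e.g. p = 1 [see 8] (Histogram Intersection), [see 9]

$D_{L_p}(I, J) = \left( \sum_i |f(i; I) - f(i; J)|^p \right)^{1/p}$

• Weighted-Mean-Variance (WMV) [see 4]

$D^r(I, J) = \frac{|\mu_r(I) - \mu_r(J)|}{|\sigma(\mu_r)|} + \frac{|\sigma_r(I) - \sigma_r(J)|}{|\sigma(\sigma_r)|}$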
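Both heuristic measures can be sketched in a few lines of Python. The Minkowski distance works on histograms, whereas WMV compares the mean and standard deviation of the raw feature values of one channel. The normalizers `sigma_mu` and `sigma_sigma` (standard deviations of the channel means and channel standard deviations over the whole database) are assumed to be precomputed, and all names are hypothetical.

```python
from statistics import mean, pstdev


def minkowski(f_i, f_j, p=1.0):
    """L_p distance between two histograms of equal length."""
    return sum(abs(a - b) ** p for a, b in zip(f_i, f_j)) ** (1.0 / p)


def wmv(values_i, values_j, sigma_mu, sigma_sigma):
    """Weighted-Mean-Variance for one feature channel r.

    values_i, values_j: raw feature values of the two images.
    sigma_mu, sigma_sigma: database-wide normalizers (assumed given).
    """
    mean_term = abs(mean(values_i) - mean(values_j)) / abs(sigma_mu)
    std_term = abs(pstdev(values_i) - pstdev(values_j)) / abs(sigma_sigma)
    return mean_term + std_term
```

With p = 1 and histograms normalized to unit mass, half the L1 distance equals one minus the histogram intersection, which is why the two are listed together.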


Statistical Dissimilarity Measures

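• Kolmogorov-Smirnov distance (KS) [see 2]

$D^r(I, J) = \max_i |F^r(i; I) - F^r(i; J)|$

• Cramér/von Mises (CvM)

$D^r(I, J) = \sum_i \left( F^r(i; I) - F^r(i; J) \right)^2$

• $\chi^2$-statistic [see 6]

$D(I, J) = \sum_i \frac{\left( f(i; I) - \hat{f}(i) \right)^2}{\hat{f}(i)}, \quad \hat{f}(i) = [f(i; I) + f(i; J)] / 2$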
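A minimal sketch of the three statistical measures follows. KS and CvM take cumulative marginal histograms, while the χ² statistic takes ordinary histograms. Skipping bins where both histograms are empty is an assumption of this sketch, made to avoid division by zero.

```python
def ks_distance(F_i, F_j):
    """Kolmogorov-Smirnov: largest gap between two cumulative histograms."""
    return max(abs(a - b) for a, b in zip(F_i, F_j))


def cvm_distance(F_i, F_j):
    """Cramer/von Mises: sum of squared gaps between cumulative histograms."""
    return sum((a - b) ** 2 for a, b in zip(F_i, F_j))


def chi2_distance(f_i, f_j):
    """Chi-square statistic of f_i against the mean histogram of f_i and f_j."""
    total = 0.0
    for a, b in zip(f_i, f_j):
        m = (a + b) / 2.0
        if m > 0:  # bins empty in both histograms contribute nothing
            total += (a - m) ** 2 / m
    return total
```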


Information-Theoretic Measures

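• Kullback-Leibler divergence (KL) [see 5]

$D(I, J) = \sum_i f(i; I) \log \frac{f(i; I)}{f(i; J)}$

• Jeffrey divergence (JD) [see 6]

$D(I, J) = \sum_i \left[ f(i; I) \log \frac{f(i; I)}{\hat{f}(i)} + f(i; J) \log \frac{f(i; J)}{\hat{f}(i)} \right], \quad \hat{f}(i) = [f(i; I) + f(i; J)] / 2$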
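The two information-theoretic measures can be sketched as below, assuming histograms normalized to unit mass. KL is undefined when a bin is empty in J but not in I; clamping with a small `eps` is one common workaround and is an assumption here, not something the presentation specifies.

```python
from math import log


def kl_divergence(f_i, f_j, eps=1e-12):
    """Kullback-Leibler divergence of f_i from f_j (not symmetric)."""
    return sum(a * log(a / max(b, eps)) for a, b in zip(f_i, f_j) if a > 0)


def jeffrey_divergence(f_i, f_j):
    """Jeffrey divergence: both histograms compared against their mean."""
    total = 0.0
    for a, b in zip(f_i, f_j):
        m = (a + b) / 2.0
        if a > 0:
            total += a * log(a / m)
        if b > 0:
            total += b * log(b / m)
    return total
```

Because every term is taken against the mean histogram, the Jeffrey divergence stays finite even when one of the two bins is empty.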


Ground Distance Measures

• Quadratic Form (QF) via similarity matrix A to incorporate similarities between bins [see 3]

$D(I, J) = \sqrt{(f_I - f_J)^T A (f_I - f_J)}$

• Earth Mover's Distance (EMD) by solving the transportation problem for the optimal admissible flow $g_{ij}$ between the two distributions; $d_{ij}$ is the dissimilarity between bins [see 7]

$D(I, J) = \frac{\sum_{i,j} g_{ij} d_{ij}}{\sum_{i,j} g_{ij}}$
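The sketch below illustrates both ground-distance measures under simplifying assumptions. The quadratic form takes the similarity matrix A as a list of rows. For EMD, only a special case is shown: one-dimensional histograms of equal total mass with ground distance d_ij = |i - j|, where the optimal transport cost reduces to the L1 distance between running sums. The general case needs a transportation-problem solver and is not covered here.

```python
from math import sqrt


def quadratic_form(f_i, f_j, A):
    """sqrt((f_i - f_j)^T A (f_i - f_j)) for a similarity matrix A."""
    d = [a - b for a, b in zip(f_i, f_j)]
    n = len(d)
    q = sum(d[r] * A[r][c] * d[c] for r in range(n) for c in range(n))
    return sqrt(max(q, 0.0))  # guard against tiny negative rounding errors


def emd_1d(f_i, f_j):
    """EMD for 1-D histograms with equal mass and d_ij = |i - j| (special case)."""
    mass_i, mass_j = sum(f_i), sum(f_j)
    if abs(mass_i - mass_j) > 1e-9:
        raise ValueError("this sketch requires equal total mass")
    carried, work = 0.0, 0.0
    for a, b in zip(f_i, f_j):
        carried += a - b      # surplus that must move on to the next bin
        work += abs(carried)  # moving it one bin costs |carried|
    return work / mass_i      # normalize by the total flow
```

With A set to the identity matrix the quadratic form reduces to the L2 distance, which makes a convenient sanity check.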


Properties

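| Property | Lp | WMV | KS/CvM | χ² | KL | JD | QF | EMD |
|---|---|---|---|---|---|---|---|---|
| Symmetrical | + | + | + | + | - | + | + | + |
| Triangle inequality | + | + | + | - | - | - | + | + |
| Computation | Medium | Low | Medium | Medium | Medium | Medium | High | High |
| Ground Distance | - | - | + | - | - | - | + | + |
| Multivariate | + | - | - | + | + | + | + | + |
| Individual binning | - | + | - | - | - | - | - | + |
| Partial Matches | - | - | - | - | - | - | - | + |
| Non-Parametric | + | - | + | + | + | + | + | + |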
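Two of the negative entries in the properties listed above can be checked numerically on toy histograms. The sketch below shows that KL gives different values when its arguments are swapped, and that the χ² statistic can violate the triangle inequality. The histograms are made-up examples, not data from the study.

```python
from math import log


def kl(p, q):
    """Kullback-Leibler divergence for strictly positive q."""
    return sum(a * log(a / b) for a, b in zip(p, q) if a > 0)


def chi2(p, q):
    """Chi-square statistic against the mean of p and q."""
    return sum((a - (a + b) / 2) ** 2 / ((a + b) / 2)
               for a, b in zip(p, q) if a + b > 0)


# KL is not symmetric: swapping the arguments changes the value.
p, q = [0.5, 0.5], [0.9, 0.1]
kl_pq, kl_qp = kl(p, q), kl(q, p)

# Chi-square can violate the triangle inequality:
# going from a to c directly costs more than the detour through b.
a, b, c = [1.0, 0.0], [0.5, 0.5], [0.0, 1.0]
direct = chi2(a, c)
detour = chi2(a, b) + chi2(b, c)

print(kl_pq, kl_qp)
print(direct, detour)
```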


Methodology

• quality measure: separating into different tasks (classification, retrieval, segmentation)

• parameters: select best possible for every measure by exhaustive evaluation

• evaluate processing steps separately: such as representation, dissimilarity measures, application

• ground truth: collected by sampling given images
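Selecting the best parameters per measure by exhaustive evaluation amounts to a grid search. The sketch below is a minimal version: `exhaustive_search` and `toy_error` are hypothetical names, and `toy_error` is a stand-in for a real evaluation run, so its numbers are not results from the study. The grid values are the ones the presentation lists under Parameter Settings for color.

```python
from itertools import product


def exhaustive_search(evaluate, grid):
    """Try every parameter combination; return (best_params, best_error)."""
    names = list(grid)
    best_params, best_error = None, float("inf")
    for values in product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        error = evaluate(**params)
        if error < best_error:
            best_params, best_error = params, error
    return best_params, best_error


grid = {
    "k": [1, 3, 5, 7],
    "sample_size": [4, 8, 16, 32, 64],
    "bins": [4, 8, 16, 32, 64, 128, 256],
}


def toy_error(k, sample_size, bins):
    # Placeholder objective with a known minimum, for illustration only.
    return abs(k - 3) + abs(sample_size - 16) / 16 + abs(bins - 32) / 32


best_params, best_error = exhaustive_search(toy_error, grid)
print(best_params, best_error)
```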


Parameter Settings

• exhaustive search over parameter values:

– K nearest neighbors (k = 1, 3, 5, 7)

– sample size (color: 4, 8, 16, 32, 64 pixels; texture: 8², 16², 32², 64², 128², 256² pixels)

– number of bins (4, 8, 16, 32, 64, 128, 256; for EMD only 4, 8, 16, 32)

– number of Gabor filters (12, 24, 40)

• quality measures:

– classification: K-NN classifier with leave-one-out

– image retrieval: precision vs. number of retrieved images

– unsupervised segmentation: pixel-wise error
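The classification quality measure can be sketched as a leave-one-out K-NN error: each sample is classified by a majority vote among its k nearest neighbors, with the sample itself excluded. The L1 default and the function names are illustrative choices, not the presentation's implementation.

```python
from collections import Counter


def l1(f_i, f_j):
    """L1 distance between two histograms."""
    return sum(abs(a - b) for a, b in zip(f_i, f_j))


def loo_knn_error(histograms, labels, k=1, dissimilarity=l1):
    """Leave-one-out K-NN classification error in percent."""
    errors = 0
    for q, (h_q, y_q) in enumerate(zip(histograms, labels)):
        # Rank all other samples by dissimilarity to the held-out one.
        ranked = sorted(
            [(dissimilarity(h_q, h), y)
             for idx, (h, y) in enumerate(zip(histograms, labels))
             if idx != q]
        )
        votes = Counter(y for _, y in ranked[:k])
        if votes.most_common(1)[0][0] != y_q:
            errors += 1
    return 100.0 * errors / len(labels)
```

Any of the dissimilarity measures can be passed as `dissimilarity`, which is how one measure would be compared against another on the same samples.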


Results: Texture Segmentation

Unsupervised grouping by normalized pairwise clustering [6].

[Figure: processing pipeline (original image, image representation, feature extraction, grouping by optimization) and segmentations of the original image for L1 - adaptive, JD - adaptive, χ² - marginal, L1 - marginal, JD - marginal, and KS.]
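A pixel-wise segmentation error can be sketched as follows. Because an unsupervised method assigns arbitrary segment labels, this sketch first searches for the label assignment that agrees best with the ground truth; that matching step is an assumption, since the presentation only names the measure. The brute-force search over permutations is practical only for a handful of segments.

```python
from itertools import permutations


def pixelwise_error(predicted, truth):
    """Percentage of mislabeled pixels under the best label matching."""
    pred_ids = sorted(set(predicted))
    true_ids = sorted(set(truth))
    if len(pred_ids) > len(true_ids):
        raise ValueError("more predicted segments than ground-truth segments")
    best_wrong = len(truth)
    # Try every injective mapping from predicted labels to true labels.
    for perm in permutations(true_ids, len(pred_ids)):
        mapping = dict(zip(pred_ids, perm))
        wrong = sum(1 for p, t in zip(predicted, truth) if mapping[p] != t)
        best_wrong = min(best_wrong, wrong)
    return 100.0 * best_wrong / len(truth)
```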


Results: Color Classification

94 images from Corel Database, 16 samples from each image.

Full distributions:

[Figure: two plots, "Classification Results for Color, Full Histograms, 8 Bins" and "Classification Results for Color, Full Histograms, 256 Bins". Vertical axis: classification error [%] (0 to 100); horizontal axis ticks: 4, 8, 16, 32, 64. Curves include χ², L1, L2, QF, and EMD.]


Results: Color Classification

Marginal distributions:

[Figure: two plots, "Classification Results for Color, Marginal Histograms, 8 Bins/dim" and "Classification Results for Color, Marginal Histograms, 256 Bins/dim". Vertical axis: classification error [%] (0 to 100); horizontal axis: sample size (4, 8, 16, 32, 64). Curves include χ², L1, L2, CvM, KS, WMV, and EMD.]


Results: Texture Classification

94 images from Brodatz Album, 16 samples from each image.

Full distributions:

[Figure: two plots, "Classification Results for Texture, Multivariate Histograms, 8 Bins" and "Classification Results for Texture, Multivariate Histograms, 256 Bins". Vertical axis: classification error [%] (0 to 60); horizontal axis: sample size (8, 16, 32, 64, 128, 256). Curves include JD, χ², KL, L1, L2, QF, and EMD.]


Results: Texture Classification

Marginal distributions:

[Figure: two plots, "Classification Results for Texture, Marginal Histograms, 8 Bins" and "Classification Results for Texture, Marginal Histograms, 256 Bins". Vertical axis: classification error [%] (0 to 60); horizontal axis: sample size (8, 16, 32, 64, 128, 256). Curves include JD, χ², KL, L1, L2, CvM, KS, and WMV.]


Results: Color Retrieval

[Figure: "Retrieval Results for Color Images, Sample Size 16". Vertical axis: precision (ticks 20 to 100); horizontal axis: number of retrieved images (0 to 100). Curves: χ² Marginals, CvM, WMV, χ² Full, EMD Full, QF Full.]
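The retrieval quality measure, precision as a function of the number of retrieved images, can be sketched as below. The database is assumed to be a list of (histogram, label) pairs, a retrieved image counts as relevant when it shares the query's label, and precision is reported in percent; all three are assumptions of this sketch.

```python
def l1(f_i, f_j):
    """L1 distance between two histograms."""
    return sum(abs(a - b) for a, b in zip(f_i, f_j))


def precision_curve(query_hist, query_label, database, cutoffs, dissimilarity=l1):
    """Precision [%] among the n most similar images, for each n in cutoffs."""
    ranked = sorted(database, key=lambda item: dissimilarity(query_hist, item[0]))
    curve = []
    for n in cutoffs:
        top = ranked[:n]
        hits = sum(1 for _, label in top if label == query_label)
        curve.append(100.0 * hits / len(top))
    return curve


database = [
    ([0.9, 0.1], "a"),
    ([0.8, 0.2], "a"),
    ([0.1, 0.9], "b"),
    ([0.2, 0.8], "b"),
]
curve = precision_curve([1.0, 0.0], "a", database, cutoffs=[1, 2, 4])
print(curve)
```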


Results: Texture Retrieval

[Figure: "Retrieval Result for Textured Images, Sample Size 8x8". Vertical axis: precision (0 to 60); horizontal axis ticks: 0 to 100. Curves include χ² Full, χ² Marginal, EMD Full, QF Full, WMV, and CvM.]


Conclusion

• no overall best measure, but different tools for different tasks

• marginal histograms and aggregate measures good for large feature spaces and small samples

• multivariate histograms effective on large sample sizes and/or well-adapted binning

• EMD attractive for moderate similarities


Literature

[1] M. Flickner et al. Query by image and video content: The QBIC system. IEEE Computer 1995.

[2] D. Geman et al. Boundary detection by constrained optimization. PAMI 1990.

[3] J. Hafner et al. Efficient color histogram indexing for quadratic form distance functions. PAMI 1995.

[4] B. Manjunath and W. Ma. Texture features for browsing and retrieval of image data. PAMI 1996.

[5] T. Ojala et al. A comparative study of texture measures with classification based on feature distributions. Pattern Recognition 1996.

[6] J. Puzicha et al. Non-parametric similarity measures for unsupervised texture segmentation and image retrieval. CVPR 1997.

[7] Y. Rubner et al. A metric for distributions with applications to image databases. ICCV 1998.

[8] M. Swain and D. Ballard. Color indexing. IJCV 1991.

[9] H. Voorhees and T. Poggio. Computing texture boundaries from images. Nature 1988.


[Figure: two plots, "Classification Results for Color" and "Classification Results for Texture". Vertical axis: classification error [%]; horizontal axis: K (1, 3, 5, 7). Curves: L1 Marginals, L1 Full, EMD, χ² Marginals, χ² Full.]


[Figure: "Classification Results for Texture". Vertical axis: classification error [%] (6 to 22); horizontal axis: number of filters (12, 24, 40). Curves: L1 Marginals, EMD, χ² Marginals, χ² Full.]


[Figure: two plots, "Classification Results for Color, Sample Size 16" and "Classification Results for Texture, Sample Size 256²". Vertical axis: classification error [%]; horizontal axis: number of bins (4, 8, 16, 32, 64, 128, 256). Curves: L∞ Marginals, L∞ Full, L1 Full, EMD, χ² Marginals, χ² Full.]


Results: Texture Segmentation

Evaluated over database of 100 Brodatz images.

[Figure: two bar charts, "Median Error [%]" (axis 2 to 12) and "20% quantile [%]" (axis 5 to 25), comparing χ², JD, L1, KS, and CvM for full and marginal distributions.]