
How to Evaluate Foreground Maps?

CVPR2014 Poster

Outline

Introduction

Limitation of Current Measures

Solution

Experiments

Conclusions

Introduction

The comparison of a foreground map against a binary ground-truth is common in various computer-vision problems:

salient object detection

object segmentation

foreground extraction

Several measures have been suggested to evaluate the accuracy of these foreground maps:

AUC measure

AP measure

F-measure

PASCAL

Introduction

But the most commonly used measures for evaluating both non-binary and binary maps do not always provide a reliable evaluation.

Introduction

Our contributions:

1. Identifying three flawed assumptions in commonly used measures.

2. Amending each of these flaws and suggesting a novel measure that evaluates foreground maps with increased accuracy.

3. Proposing four meta-measures to analyze the performance of evaluation measures.

Introduction

Two appealing properties of our measure are:

1. being a generalization of the F_β-measure

2. providing a unified evaluation of both binary and non-binary maps.

Limitation of Current Measures

Three flawed assumptions:

1. Interpolation flaw

2. Dependency flaw

3. Equal-importance flaw


Current Evaluation Measures

Evaluation of binary maps:

Four basic quantities:

1. TP (true-positive)

2. TN (true-negative)

3. FP (false-positive)

4. FN (false-negative)
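
The four quantities above can be counted directly from two binary maps. The following is a minimal sketch, assuming both maps are NumPy-compatible arrays of the same shape; the function name `confusion_counts` and the example arrays are illustrative and not part of the original slides.

```python
import numpy as np

def confusion_counts(gt, pred):
    """Count TP, TN, FP, FN for two binary maps of equal shape."""
    gt = np.asarray(gt, dtype=bool).ravel()
    pred = np.asarray(pred, dtype=bool).ravel()
    tp = int(np.sum(gt & pred))    # foreground detected as foreground
    tn = int(np.sum(~gt & ~pred))  # background detected as background
    fp = int(np.sum(~gt & pred))   # background detected as foreground
    fn = int(np.sum(gt & ~pred))   # foreground detected as background
    return tp, tn, fp, fn

# Toy 2x3 maps (hypothetical data).
gt = np.array([[1, 1, 0], [0, 0, 0]])
pred = np.array([[1, 0, 1], [0, 0, 0]])
counts = confusion_counts(gt, pred)
```

The four counts always sum to the number of pixels, which is a convenient sanity check.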


Current Evaluation Measures

Evaluation of binary maps:

Common scores:

TPR = TP / (TP + FN)

FPR = FP / (FP + TN)
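
As a small illustration, the true-positive rate and false-positive rate follow from the four basic quantities using their standard definitions. The helper name `rates` and the zero-denominator fallback are assumptions made for this sketch.

```python
def rates(tp, tn, fp, fn):
    """Return (TPR, FPR) using the standard definitions."""
    # TPR: fraction of ground-truth foreground that was detected.
    tpr = tp / (tp + fn) if (tp + fn) > 0 else 0.0
    # FPR: fraction of ground-truth background marked as foreground.
    fpr = fp / (fp + tn) if (fp + tn) > 0 else 0.0
    return tpr, fpr

# Hypothetical counts for a six-pixel map.
tpr, fpr = rates(tp=1, tn=3, fp=1, fn=1)
```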


Current Evaluation Measures

Evaluation of non-binary maps:

AUC (Area-Under-the-Curve)

AP (Average-Precision)

Image Source: http://zh.wikipedia.org/wiki/File:Curvas.png

Interpolation flaw

The source of the interpolation flaw is the thresholding of the non-binary maps.
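
To make the thresholding step concrete, here is a rough sketch of how an AUC-style score is typically obtained: the non-binary map is binarized at several thresholds, each threshold yields one (FPR, TPR) point, and the curve between the points is interpolated. The threshold values, function names and toy data are assumptions for illustration only, and the sketch assumes the ground-truth contains both foreground and background pixels.

```python
import numpy as np

def roc_points(gt, score_map, thresholds):
    """Binarize score_map at each threshold; return sorted (FPR, TPR) pairs."""
    gt = np.asarray(gt, dtype=bool).ravel()
    scores = np.asarray(score_map, dtype=float).ravel()
    points = []
    for t in thresholds:
        pred = scores >= t
        tp = np.sum(gt & pred)
        fn = np.sum(gt & ~pred)
        fp = np.sum(~gt & pred)
        tn = np.sum(~gt & ~pred)
        points.append((float(fp / (fp + tn)), float(tp / (tp + fn))))
    return sorted(points)

def area_under_curve(points):
    """Trapezoidal rule: linear interpolation between the sampled points."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area

# Toy four-pixel example (hypothetical data).
gt = [1, 1, 0, 0]
score_map = [0.9, 0.4, 0.6, 0.1]
points = roc_points(gt, score_map, thresholds=[0.0, 0.25, 0.5, 0.75, 1.01])
auc = area_under_curve(points)
```

The score depends only on how the pixels rank against the thresholds, with the segments between sampled points filled in by interpolation.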

Dependency flaw

Current measures ignore the dependency between false-negatives.

Equal-importance flaw

Current measures ignore the location of the false-positives.

Solution

Resolving the Interpolation Flaw

Resolving the Dependency Flaw & the Equal-Importance Flaw

The New Measure: F^w_β-measure

Resolving the Interpolation Flaw

The key idea is to extend the four basic quantities, TP, TN, FP and FN, to deal with non-binary values.

G (1×N): the column-stack representation of the binary ground-truth, where N is the number of pixels in the image.

D (1×N): the non-binary map to be evaluated against the ground-truth.

Resolving the Interpolation Flaw

For a binary map, pixel i is:

correct if G[i] = D[i]

incorrect if G[i] ≠ D[i]

For a non-binary map, a pixel may be only partially correct.
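
The formulas for the non-binary case are not preserved in this transcript. One way to realize the idea, sketched here under the assumption that D takes values in [0, 1] and that the per-pixel error is the absolute difference |G − D|, is to let each pixel contribute fractionally to the four quantities. The function name `soft_quantities` and the toy values are illustrative.

```python
import numpy as np

def soft_quantities(g, d):
    """Fractional TP, TN, FP, FN for a binary g and a map d in [0, 1]."""
    g = np.asarray(g, dtype=float).ravel()
    d = np.asarray(d, dtype=float).ravel()
    e = np.abs(g - d)               # assumed per-pixel error in [0, 1]
    tp = np.sum((1 - e) * g)        # correct part of foreground pixels
    tn = np.sum((1 - e) * (1 - g))  # correct part of background pixels
    fp = np.sum(e * (1 - g))        # erroneous part of background pixels
    fn = np.sum(e * g)              # erroneous part of foreground pixels
    return tp, tn, fp, fn

# Toy four-pixel example (hypothetical data).
g = [1, 1, 0, 0]
d = [0.8, 0.5, 0.25, 0.0]
tp, tn, fp, fn = soft_quantities(g, d)
```

With a binary D the error is 0 or 1 at every pixel, so these fractional quantities reduce to the usual integer counts, and no thresholding is required.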

Resolving the Dependency Flaw & the Equal-Importance Flaw

Both assumptions deal with detection errors. Our key idea is to attribute different importance to different errors.

Reformulate the basic quantities:


We suggest applying a weighting function to the errors.

A (N×N): captures the dependency between pixels

B (N×1): represents the varying importance of the pixels


Reformulate the basic quantities with weight:
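
The weighted formulas themselves did not survive in this transcript. The sketch below shows one plausible form, assuming the per-pixel error E = |G − D| is first smoothed by A (taking the element-wise minimum with the raw error, so that dependency can only reduce an error) and then scaled element-wise by B. The matrices `A` and `B` here are hand-picked toy values, not the construction used by the authors.

```python
import numpy as np

def weighted_quantities(g, d, A, B):
    """Weighted TP, FP, FN from an assumed weighted error min(E, E A) * B."""
    g = np.asarray(g, dtype=float).ravel()
    d = np.asarray(d, dtype=float).ravel()
    e = np.abs(g - d)                                 # raw per-pixel error
    e_w = np.minimum(e, e @ A) * np.asarray(B, dtype=float).ravel()
    tp_w = np.sum((1 - e_w) * g)
    fp_w = np.sum(e_w * (1 - g))
    fn_w = np.sum(e_w * g)
    return tp_w, fp_w, fn_w

# Toy three-pixel example (hypothetical data).
g = [1, 1, 0]
d = [1, 0, 1]
# Pixels 0 and 1 (both foreground) are treated as dependent on each other.
A = np.array([[0.5, 0.5, 0.0],
              [0.5, 0.5, 0.0],
              [0.0, 0.0, 1.0]])
# The background pixel is given a higher importance than the others.
B = np.array([1.0, 1.0, 1.5])
tp_w, fp_w, fn_w = weighted_quantities(g, d, A, B)
```

In this toy case the missed foreground pixel is penalized less because its neighbor was detected, while the false-positive is penalized more because of its larger weight.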

The New Measure: F^w_β-measure

Having dealt with all three flaws, we proceed to construct our evaluation measure.
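
Since the measure is described as a generalization of the F_β-measure, a natural sketch is to plug (possibly fractional or weighted) TP, FP and FN values into the standard F_β formula, F_β = (1 + β²) · Precision · Recall / (β² · Precision + Recall). The default β = 1 and the handling of zero denominators are assumptions made for this example.

```python
def f_beta(tp, fp, fn, beta=1.0):
    """Standard F-beta score from (possibly non-integer) TP, FP, FN."""
    precision = tp / (tp + fp) if (tp + fp) > 0 else 0.0
    recall = tp / (tp + fn) if (tp + fn) > 0 else 0.0
    denom = beta ** 2 * precision + recall
    if denom == 0:
        return 0.0
    return (1 + beta ** 2) * precision * recall / denom

# Hypothetical weighted quantities.
score = f_beta(tp=1.5, fp=1.5, fn=0.5)
# Plain integer counts work the same way.
score_binary = f_beta(tp=1, fp=1, fn=1)
```

Because the same function accepts integer counts and fractional values, binary and non-binary maps can be scored through a single code path.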

Experiments

Meta-measures:

1. The ranking of an evaluation measure should agree with the preferences of an application that uses the map as input.

2. A measure should prefer a good result by an algorithm that considers the content of the image, over an arbitrary map.

Experiments

Meta-measures:

3. The score of a map should decrease when using a wrong ground-truth map.

4. The ranking of an evaluation measure should not be sensitive to inaccuracies in the manually marked boundaries in the ground-truth maps.
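
As an illustration of how a meta-measure can be turned into a check, the sketch below tests the third one: a map should score higher against its own ground-truth than against a wrong one. A plain F1 score stands in for the measure under test; the function names and the toy maps are assumptions, not the experimental setup of the original work.

```python
import numpy as np

def f1_score(gt, pred):
    """Plain F1 on binary maps, used here only as a stand-in measure."""
    gt = np.asarray(gt, dtype=bool).ravel()
    pred = np.asarray(pred, dtype=bool).ravel()
    tp = np.sum(gt & pred)
    fp = np.sum(~gt & pred)
    fn = np.sum(gt & ~pred)
    denom = 2 * tp + fp + fn
    return float(2 * tp / denom) if denom > 0 else 0.0

def passes_gt_switch(measure, pred, true_gt, wrong_gt):
    """True if the score drops when the ground-truth is swapped for a wrong one."""
    return measure(true_gt, pred) > measure(wrong_gt, pred)

# Toy four-pixel maps (hypothetical data).
pred = [1, 1, 0, 0]
true_gt = [1, 1, 0, 0]
wrong_gt = [0, 0, 1, 1]
ok = passes_gt_switch(f1_score, pred, true_gt, wrong_gt)
```

Any other scoring function with the same `(gt, pred)` signature could be passed in place of `f1_score`.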

Experiments: Meta-measure (1)

Application Ranking


State-of-the-art vs. Generic


Ground-truth Switch


Annotation errors

Conclusions

We analyzed the currently used evaluation measures, which suffer from three flawed assumptions: interpolation, dependency and equal-importance.

We suggested an evaluation measure that amends these assumptions and offers a unified solution to the evaluation of non-binary and binary maps.

The advantages of our measure were shown via four different meta-measures, both qualitatively and quantitatively.