How to Evaluate Foreground Maps?
CVPR 2014 Poster
Outline
1. Introduction
2. Limitation of Current Measures
3. Solution
4. Experiments
5. Conclusions
Introduction
The comparison of a foreground map against a binary ground truth is common in various computer-vision problems: salient object detection, object segmentation, and foreground extraction.
Several measures have been suggested to evaluate the accuracy of these foreground maps: the AUC measure, the AP measure, the F-measure, and the PASCAL measure.
Introduction
But the most commonly used measures for evaluating both non-binary and binary maps do not always provide a reliable evaluation.
Introduction
Our contributions:
1. Identifying three flawed assumptions in commonly used measures.
2. Amending each of these flaws and suggesting a novel measure that evaluates foreground maps with increased accuracy.
3. Proposing four meta-measures to analyze the performance of evaluation measures.
Introduction
Two appealing properties of our measure are:
1. It is a generalization of the $F_\beta$-measure.
2. It provides a unified evaluation of both binary and non-binary maps.
Limitation of Current Measures
Three flawed assumptions:
1. Interpolation flaw
2. Dependency flaw
3. Equal-importance flaw
Limitation of Current Measures
Current Evaluation Measures
Evaluation of binary maps:
Four basic quantities:
1. TP (true-positive)
2. TN (true-negative)
3. FP (false-positive)
4. FN (false-negative)
Limitation of Current Measures
Current Evaluation Measures
Evaluation of binary maps:
Common scores:
$\mathrm{TPR} = \frac{TP}{TP + FN}$
$\mathrm{FPR} = \frac{FP}{FP + TN}$
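The four basic quantities and the two rates can be sketched in a few lines. This is only an illustration of the standard definitions; the function names `confusion_counts` and `rates` and the toy arrays are assumptions made for this example and do not come from the slides.

```python
import numpy as np

def confusion_counts(gt, pred):
    """Count TP, TN, FP, FN for two binary maps of equal size."""
    gt = np.asarray(gt, dtype=bool).ravel()
    pred = np.asarray(pred, dtype=bool).ravel()
    tp = int(np.sum(gt & pred))      # foreground detected as foreground
    tn = int(np.sum(~gt & ~pred))    # background detected as background
    fp = int(np.sum(~gt & pred))     # background detected as foreground
    fn = int(np.sum(gt & ~pred))     # foreground detected as background
    return tp, tn, fp, fn

def rates(gt, pred):
    """Return (TPR, FPR); a rate is 0.0 when its denominator is empty."""
    tp, tn, fp, fn = confusion_counts(gt, pred)
    tpr = tp / (tp + fn) if (tp + fn) else 0.0
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    return tpr, fpr

# Toy example (hypothetical data): 3 foreground pixels, 5 background pixels.
gt = np.array([1, 1, 1, 0, 0, 0, 0, 0])
pred = np.array([1, 1, 0, 1, 0, 0, 0, 0])
tpr, fpr = rates(gt, pred)
```

Note that both rates only count pixels; neither looks at where in the image an error occurred.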
Limitation of Current Measures
Current Evaluation Measures
Evaluation of non-binary maps:
AUC (Area Under the Curve)
AP (Average Precision)
Image Source: http://zh.wikipedia.org/wiki/File:Curvas.png
Interpolation flaw
The source of the interpolation flaw is the thresholding of the non-binary maps.
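To make the role of thresholding concrete, here is a minimal sketch of how an ROC-style AUC is commonly obtained from a non-binary map: the map is binarized at several thresholds, each binary map yields one (FPR, TPR) point, and the curve between the points is interpolated. The helper names, the threshold grid, and the trapezoidal interpolation are assumptions for this example, not the exact procedure of any particular benchmark.

```python
import numpy as np

def roc_points(gt, soft_map, thresholds):
    """Binarize soft_map at each threshold and collect (FPR, TPR) points.

    Assumes gt contains both foreground and background pixels.
    """
    gt = np.asarray(gt, dtype=bool).ravel()
    soft = np.asarray(soft_map, dtype=float).ravel()
    points = []
    for t in thresholds:
        pred = soft >= t                      # one binary map per threshold
        tp = np.sum(gt & pred)
        fn = np.sum(gt & ~pred)
        fp = np.sum(~gt & pred)
        tn = np.sum(~gt & ~pred)
        points.append((float(fp) / float(fp + tn), float(tp) / float(tp + fn)))
    return sorted(points)

def auc_trapezoid(points):
    """Area under the piecewise-linear curve through the sorted points."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area

# Toy example (hypothetical data): a map that separates the classes perfectly.
gt = np.array([1, 1, 0, 0])
soft = np.array([0.95, 0.65, 0.35, 0.05])
auc = auc_trapezoid(roc_points(gt, soft, np.linspace(0.0, 1.0, 11)))
```

The score depends only on the binary maps produced at the sampled thresholds; the original non-binary values enter the result only through that sampling and the interpolation between samples.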
Dependency flaw
Current measures ignore the dependency between false negatives.
Equal-importance flaw
Current measures assign equal importance to all errors, ignoring the location of the false positives.
Solution
1. Resolving the Interpolation Flaw
2. Resolving the Dependency Flaw & the Equal-Importance Flaw
3. The New Measure: the $F_\beta^w$-measure
Resolving the Interpolation Flaw
The key idea is to extend the four basic quantities: TP, TN, FP and FN , to deal with non-binary values.
$G_{1 \times N}$: the column-stack representation of the binary ground truth, where N is the number of pixels in the image.
$D_{1 \times N}$: the non-binary map to be evaluated against the ground truth.
Resolving the Interpolation Flaw
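For a binary map, pixel i is:
correct if G[i] = D[i]
incorrect if G[i] ≠ D[i]
For a non-binary map, D[i] can lie between 0 and 1, so correctness is measured by degree rather than as a yes/no decision.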
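The slide's formula for the non-binary case was not preserved. One natural way to extend the four quantities, assuming map values in [0, 1] and taking the per-pixel error to be the absolute difference |G[i] - D[i]|, is sketched below. The function name `soft_quantities` and the toy arrays are hypothetical.

```python
import numpy as np

def soft_quantities(gt, soft_map):
    """Extend TP, TN, FP, FN to non-binary maps.

    Assumption: the error of pixel i is e[i] = |G[i] - D[i]|, so (1 - e[i])
    is the degree to which the pixel is correct.
    """
    g = np.asarray(gt, dtype=float).ravel()
    d = np.asarray(soft_map, dtype=float).ravel()
    e = np.abs(g - d)
    tp = float(np.dot(1.0 - e, g))        # correctness on foreground pixels
    tn = float(np.dot(1.0 - e, 1.0 - g))  # correctness on background pixels
    fp = float(np.dot(e, 1.0 - g))        # error on background pixels
    fn = float(np.dot(e, g))              # error on foreground pixels
    return tp, tn, fp, fn

# Toy example (hypothetical data).
g = np.array([1, 1, 0, 0])
d = np.array([0.8, 0.5, 0.2, 0.0])
tp, tn, fp, fn = soft_quantities(g, d)
```

With a binary `d`, every error is either 0 or 1 and the four values reduce to the usual pixel counts, so no thresholding step is needed for non-binary maps.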
Resolving the Dependency Flaw & the Equal-Importance Flaw
Both flawed assumptions deal with detection errors.
Our key idea is to attribute different importance to different errors.
Reformulate the basic quantities:
Resolving the Dependency Flaw & the Equal-Importance Flaw
We suggest applying a weighting function to the errors.
$A_{N \times N}$: captures the dependency between pixels
$B_{N \times 1}$: represents the varying importance of the pixels
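A minimal sketch of how such a weighting could be applied to a per-pixel error vector is given below. It assumes the error is E = |G - D| and that the weighted error takes the form min(E, E·A) multiplied element-wise by B; the exact construction of A and B is not given on the slides, so the identity matrix and the importance vector used here are placeholders chosen only for illustration.

```python
import numpy as np

def weighted_error(gt, soft_map, A, B):
    """Apply a dependency matrix A (N x N) and importance vector B (N) to the error.

    Assumed form: E_w = min(E, E @ A) * B, with E = |G - D|.
    """
    g = np.asarray(gt, dtype=float).ravel()
    d = np.asarray(soft_map, dtype=float).ravel()
    e = np.abs(g - d)
    spread = e @ np.asarray(A, dtype=float)   # error of each pixel in light of related pixels
    return np.minimum(e, spread) * np.asarray(B, dtype=float).ravel()

# Toy example (hypothetical data and weights).
g = np.array([1, 1, 0, 0])
d = np.array([0.8, 0.5, 0.2, 0.0])
A = np.eye(4)                          # placeholder: no dependency between pixels
B = np.array([1.0, 1.0, 2.0, 2.0])     # placeholder: background errors count double
e_w = weighted_error(g, d, A, B)
```

With A set to the identity and B to all ones, the weighted error equals the plain error, so the unweighted quantities are recovered as a special case.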
Resolving the Dependency Flaw & the Equal-Importance Flaw
Reformulate the basic quantities with weights:
The New Measure: the $F_\beta^w$-measure
Having dealt with all three flaws, we proceed to construct our evaluation measure.
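The formula on this slide was not preserved. Because the measure is described as a generalization of the $F_\beta$-measure, a plausible sketch is the standard $F_\beta$ score computed from quantities that may be weighted and non-integer. The function name `f_beta_from_quantities` and the input values are hypothetical.

```python
def f_beta_from_quantities(tp, fp, fn, beta=1.0):
    """Standard F_beta from (possibly weighted, non-integer) TP, FP, FN.

    precision = TP / (TP + FP), recall = TP / (TP + FN),
    F_beta = (1 + beta^2) * P * R / (beta^2 * P + R).
    Returns 0.0 when the score is undefined.
    """
    precision = tp / (tp + fp) if (tp + fp) > 0 else 0.0
    recall = tp / (tp + fn) if (tp + fn) > 0 else 0.0
    denom = beta ** 2 * precision + recall
    if denom == 0:
        return 0.0
    return (1 + beta ** 2) * precision * recall / denom

# Toy example (hypothetical weighted quantities).
score = f_beta_from_quantities(tp=1.3, fp=0.4, fn=0.7, beta=1.0)
```

For binary maps with unit weights the inputs are plain counts and the result is the ordinary $F_\beta$ score, which is consistent with the claim of a unified evaluation for binary and non-binary maps.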
Experiments
Meta-measures:
1. The ranking of an evaluation measure should agree with the preferences of an application that uses the map as input.
2. A measure should prefer a good result by an algorithm that considers the content of the image, over an arbitrary map.
Experiments
Meta-measures:
3. The score of a map should decrease when using a wrong ground-truth map.
4. The ranking of an evaluation measure should not be sensitive to inaccuracies in the manually marked boundaries in the ground-truth maps.
Experiments: Meta-measure (1)
Application Ranking
Experiments: Meta-measure (2)
State-of-the-art vs. Generic
Experiments: Meta-measure (3)
Ground-truth Switch
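The ground-truth switch test can be sketched as a simple check: a map scored against its own ground truth should score higher than the same map scored against a wrong ground truth. The measure used below (a soft F1 built from the absolute per-pixel error) is only a stand-in chosen for this example, and all names and data are hypothetical.

```python
import numpy as np

def soft_f1(gt, soft_map):
    """Stand-in evaluation measure: F1 from non-binary TP, FP, FN."""
    g = np.asarray(gt, dtype=float).ravel()
    d = np.asarray(soft_map, dtype=float).ravel()
    e = np.abs(g - d)
    tp = np.dot(1.0 - e, g)
    fp = np.dot(e, 1.0 - g)
    fn = np.dot(e, g)
    denom = 2.0 * tp + fp + fn
    return float(2.0 * tp / denom) if denom > 0 else 0.0

def passes_gt_switch(measure, soft_map, true_gt, wrong_gt):
    """True if the score drops when the ground truth is replaced by a wrong one."""
    return measure(true_gt, soft_map) > measure(wrong_gt, soft_map)

# Toy example (hypothetical data): the wrong ground truth comes from another image.
soft = np.array([0.9, 0.8, 0.1, 0.0])
true_gt = np.array([1, 1, 0, 0])
wrong_gt = np.array([0, 0, 1, 1])
ok = passes_gt_switch(soft_f1, soft, true_gt, wrong_gt)
```

Running this check over many maps and counting how often it fails gives one number per evaluation measure, which is the kind of comparison a meta-measure is meant to support.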
Experiments: Meta-measure (4)
Annotation errors
Conclusions
We analyzed the currently used evaluation measures and showed that they suffer from three flawed assumptions: interpolation, dependency and equal-importance.
We suggested an evaluation measure that amends these assumptions and offers a unified solution to the evaluation of non-binary and binary maps.
The advantages of our measure were shown via four different meta-measures, both qualitatively and quantitatively.