Pattern recognition (3) - University of Victoria


Transcript of Pattern recognition (3) - University of Victoria

Page 1: Pattern recognition (3)

Page 2: Things we have discussed until now

- Statistical pattern recognition
- Building simple classifiers
- Supervised classification
  - Minimum distance classifier
  - Bayesian classifier (1-D and multi-dimensional)
  - Building discriminant functions
- Unsupervised classification
  - K-means algorithm

Page 3: Equivalence between classifiers

- Pattern recognition using multivariate normal class densities with a shared covariance matrix and equal priors is simply a minimum Mahalanobis distance classifier.
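A one-line sketch of why this equivalence holds, assuming (as the slide leaves implicit) that all classes share a single covariance matrix $\Sigma$:

\[
g_i(\mathbf{x}) = \ln p(\mathbf{x} \mid \omega_i) + \ln P(\omega_i)
= -\tfrac{1}{2}(\mathbf{x}-\boldsymbol{\mu}_i)^\top \Sigma^{-1} (\mathbf{x}-\boldsymbol{\mu}_i)
- \tfrac{1}{2}\ln\lvert\Sigma\rvert - \tfrac{d}{2}\ln 2\pi + \ln P(\omega_i)
\]

With equal priors and a shared $\Sigma$, every term except the quadratic one is identical for all classes, so maximizing $g_i(\mathbf{x})$ amounts to minimizing the squared Mahalanobis distance $(\mathbf{x}-\boldsymbol{\mu}_i)^\top \Sigma^{-1} (\mathbf{x}-\boldsymbol{\mu}_i)$.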

Page 4: Today

- Performance evaluation of classification systems
- Reading: slides and blackboard derivations only

Page 5: Performance evaluation paradigms

- Against ground truth (manually generated segmentation/classification)
  - The method of preference in medical image segmentation
- Benchmarking: for mature/maturing subfields in computer vision
  - Example 1: “The gait identification challenge problem: datasets and baseline algorithm”, in International Conference on Pattern Recognition, 2002
  - Example 2: “Benchmark Studies on Face Recognition”, in International Workshop on Automatic Face- and Gesture-Recognition, 1995

Page 6: Evaluation of classifiers

- ROC analysis
- Precision and recall
- Confusion matrices

Page 7: ROC analysis

- ROC stands for receiver operating characteristic; the technique was initially used to analyze and compare the performance of human radar operators.
- An ROC curve is a plot of the true positive rate against the false positive rate as some parameter (typically a decision threshold) is varied.
- Around 1970, ROC curves came into use in medical studies, where they are useful in bringing out the sensitivity (true positive rate) versus the specificity (true negative rate, i.e., one minus the false positive rate) of diagnostic trials.
- Computer vision performs ROC analysis on algorithms.
- We can also compare different algorithms that are designed for the same task.
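As a concrete illustration of "some parameter is varied", here is a minimal Python sketch that traces an ROC curve by sweeping a decision threshold over classifier scores; the scores and labels below are hypothetical, for illustration only.

# Trace ROC operating points by sweeping a threshold over scores.
def roc_points(scores, labels):
    """Return (FPR, TPR) pairs, one per distinct threshold."""
    pos = sum(labels)                # number of positive examples
    neg = len(labels) - pos          # number of negative examples
    points = []
    for t in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 0)
        points.append((fp / neg, tp / pos))
    return points

scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4, 0.3, 0.2]   # hypothetical classifier outputs
labels = [1,   1,   0,   1,   0,    1,   0,   0]     # hypothetical ground truth
print(roc_points(scores, labels))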

Page 8: ROC terminology

- Four possible outcomes when the classifier answers "yes" or "no" (two of them errors):
  - TP: we say "yes" and are right (True Positive), a "hit"
  - TN: we say "no" and are right (True Negative), a "correct rejection"
  - FP: we say "yes" and are wrong (False Positive), a "false alarm"
  - FN: we say "no" and are wrong (False Negative), a "miss"
- We don't actually need all four rates, because the false negative rate is 1 − TPR and the true negative rate is 1 − FPR.
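A small numeric sketch of these relationships, using hypothetical counts:

# Only TPR and FPR are independent: FNR = 1 - TPR, TNR = 1 - FPR.
tp, fn = 40, 10        # hypothetical counts for the positive class
tn, fp = 45, 5         # hypothetical counts for the negative class

tpr = tp / (tp + fn)   # sensitivity ("hit" rate)
fpr = fp / (fp + tn)   # false-alarm rate
print(tpr, 1 - tpr)    # TPR = 0.8, FNR = 0.2
print(fpr, 1 - fpr)    # FPR = 0.1, TNR = 0.9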

Page 9: False positives, false negatives

Page 10: ROC curves

- There is a trade-off between the true positive rate and the false positive rate: an increase in the true positive rate is accompanied by an increase in the false positive rate.
- The area under each curve (AUC) gives a measure of accuracy.
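A minimal sketch of that area computation using the trapezoidal rule; the operating points are hypothetical and assumed sorted by increasing false positive rate.

# Approximate the area under an ROC curve from discrete (FPR, TPR) points.
def auc(points):
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0   # trapezoid between neighbours
    return area

# Endpoints (0,0) and (1,1) plus two hypothetical operating points.
print(auc([(0.0, 0.0), (0.1, 0.6), (0.4, 0.9), (1.0, 1.0)]))   # 0.825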

Page 11: ROC curve

- The closer the curve approaches the top left-hand corner of the plot, the more accurate the classifier.
- The closer the curve lies to the 45° diagonal (chance performance), the worse the classifier.

Page 12: Where are ROC curves helpful?

- Detection-type problems:
  - Face detection in images/video data
  - Event detection in video data
  - Lesion detection in medical images
  - Etc.

Page 13: Precision and recall

- Also used mostly for detection-type problems
- In a multi-class case, precision and recall can be measured for each class
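The formulas are not written out on this slide; the standard definitions, which the next slide's example relies on, are:

\[
\text{precision} = \frac{TP}{TP + FP}, \qquad
\text{recall} = \frac{TP}{TP + FN}
\]

Precision asks "of everything we returned, how much was relevant?", while recall asks "of everything relevant, how much did we return?".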

Page 14: Trade-off between precision and recall

Example: content-based image retrieval.

- Suppose we aim to detect all sunset images in an image database.
- The database contains 200 sunset images.
- The classifier retrieves 150 of the 200 relevant images, along with 100 images of no interest to the user.
- Precision = 150/250 = 60%
- Recall = 150/200 = 75%
- The system could obtain 100% recall if it returned every image in the database, but its precision would be terrible.
- If we aim at a low false-alarm rate instead, precision would be high but recall would be low.
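A quick Python check of the numbers above:

# Sunset-retrieval example: 150 relevant and 100 irrelevant images returned,
# out of 200 relevant images in the database.
relevant_retrieved = 150
irrelevant_retrieved = 100
relevant_total = 200

precision = relevant_retrieved / (relevant_retrieved + irrelevant_retrieved)
recall = relevant_retrieved / relevant_total
print(precision, recall)   # 0.6 and 0.75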

Page 15: Confusion matrix

- Used for visualizing and reporting the results of a classification system
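A minimal sketch of how such a matrix is built; rows index the true class, columns the predicted class, and the labels are hypothetical.

# Build a multi-class confusion matrix from true and predicted labels.
def confusion_matrix(true, pred, classes):
    index = {c: i for i, c in enumerate(classes)}
    m = [[0] * len(classes) for _ in classes]
    for t, p in zip(true, pred):
        m[index[t]][index[p]] += 1      # row = true class, column = prediction
    return m

true = ['a', 'a', 'b', 'b', 'c', 'a']   # hypothetical test labels
pred = ['a', 'b', 'b', 'b', 'c', 'a']   # hypothetical classifier output
for row in confusion_matrix(true, pred, ['a', 'b', 'c']):
    print(row)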

Page 16: The binary confusion matrix

- We can construct a binary confusion matrix for any one class by treating it as the positive class and pooling all other classes as the negative class.
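The matrix itself does not survive in this transcript; its standard binary layout is:

\[
\begin{array}{l|cc}
 & \text{predicted positive} & \text{predicted negative} \\ \hline
\text{actually positive} & TP & FN \\
\text{actually negative} & FP & TN \\
\end{array}
\]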

Page 17: Calculating precision and recall from the confusion matrix

Example: consider the confusion matrix of an OCR system that produces the following output over a test document set.

Calculate the precision and recall for class a.
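The OCR confusion matrix referenced here does not survive in this transcript, so the Python sketch below substitutes a hypothetical three-class matrix; the recipe is the point: recall for class a is its diagonal entry divided by the row sum (all true a's), and precision is the diagonal entry divided by the column sum (everything predicted as a).

# Per-class precision and recall from a confusion matrix
# (rows = true class, columns = predicted class). The counts are
# hypothetical; the slide's actual OCR matrix is not reproduced here.
m = [[90,  5,  5],    # true 'a'
     [10, 80, 10],    # true 'b'
     [ 0, 15, 85]]    # true 'c'

k = 0  # index of class 'a'
recall = m[k][k] / sum(m[k])                      # diagonal / row sum
precision = m[k][k] / sum(row[k] for row in m)    # diagonal / column sum
print(precision, recall)                          # 0.9 0.9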