ROC curve estimation. Index Introduction to ROC ROC curve Area under ROC curve Visualization using...

ROC curve estimation

ROC originally stands for Receiver Operating Characteristic. ROC curves are widely used in biomedical applications such as radiology and imaging. An important use here is assessing classifiers in machine learning.

Index
• Introduction to ROC
• ROC curve
• Area under ROC curve
• Visualization using ROC curve

ROC curve
• ROC originally stands for Receiver Operating Characteristic.
• ROC curves are widely used in biomedical applications such as radiology and imaging.
• An important use here is assessing classifiers in machine learning.

Example situation
• Consider a diagnostic test for a disease.
• The test has 2 possible outcomes: positive or negative.
• Based on this example, the following slides explain the notation used in ROC curves.

Data distribution available

[Figure: distributions of Test Result for patients with the disease and patients without the disease.]

[Figure: a Threshold on the Test Result axis splits the patients into those called “negative” and those called “positive”.]

Some definitions ...

[Figure, repeated over four slides: Test Result distributions for patients without the disease and with the disease, split by the threshold into patients called “negative” and patients called “positive”. Each slide highlights one region.]

• True Positives: patients with the disease who are called “positive”
• False Positives: patients without the disease who are called “positive”
• True negatives: patients without the disease who are called “negative”
• False negatives: patients with the disease who are called “negative”
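Once a threshold is fixed, the four regions can be counted directly. The following is a minimal sketch, assuming each patient has a numeric test result and that a result at or above the threshold is called "positive" (the slides do not state the direction, so this is an assumption). The function name `count_outcomes` and the sample data are hypothetical.

```python
def count_outcomes(results, has_disease, threshold):
    """Count (TP, FP, TN, FN) for one threshold.

    Assumption: a test result >= threshold is called "positive".
    """
    tp = fp = tn = fn = 0
    for value, diseased in zip(results, has_disease):
        called_positive = value >= threshold
        if called_positive and diseased:
            tp += 1          # with the disease, called "positive"
        elif called_positive and not diseased:
            fp += 1          # without the disease, called "positive"
        elif not called_positive and not diseased:
            tn += 1          # without the disease, called "negative"
        else:
            fn += 1          # with the disease, called "negative"
    return tp, fp, tn, fn


# Hypothetical test results for 8 patients (True = has the disease).
results = [0.1, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]
has_disease = [False, False, True, False, True, False, True, True]

counts = count_outcomes(results, has_disease, threshold=0.55)
print(counts)  # (3, 1, 3, 1)
```

Moving the threshold changes all four counts at once, which is what the ROC curve later summarizes.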

Confusion Matrix
• For a two-class problem, the confusion matrix is a matrix with two rows and two columns.
• If the confusion matrix is called CMat, its entries are laid out as follows.
• CMat[1][1] = True Positives and CMat[1][2] = False Positives.
• Similarly, CMat[2][1] = False Negatives and CMat[2][2] = True Negatives.
• In this layout the rows correspond to the predicted class and the columns to the true class.
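As a small illustration of this layout, here is a sketch that fills CMat from predicted and true labels. Python lists are 0-indexed, so the slide's CMat[1][1] becomes `cmat[0][0]`. The function name `confusion_matrix_2x2` and the sample labels are hypothetical.

```python
def confusion_matrix_2x2(predicted, actual):
    """Return [[TP, FP], [FN, TN]], matching the CMat layout on the slide.

    Row index: predicted class (0 = positive, 1 = negative).
    Column index: true class (0 = positive, 1 = negative).
    """
    cmat = [[0, 0], [0, 0]]
    for pred, true in zip(predicted, actual):
        row = 0 if pred else 1
        col = 0 if true else 1
        cmat[row][col] += 1
    return cmat


# Hypothetical labels: True = positive, False = negative.
predicted = [True, True, True, True, False, False, False, False]
actual = [True, True, False, False, True, False, False, False]

cmat = confusion_matrix_2x2(predicted, actual)
print(cmat)  # [[2, 2], [1, 3]]
```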

2-class Confusion Matrix

• Reduce the 4 numbers to two rates:
  true positive rate = TP = (#TP)/(#P)
  false positive rate = FP = (#FP)/(#N)
• Rates are independent of class ratio*

               Predicted class
True class     positive    negative
positive (#P)  #TP         #P - #TP
negative (#N)  #FP         #N - #FP
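To make the two rates concrete, here is a minimal sketch that computes them from the four counts and checks the class-ratio claim on hypothetical numbers: multiplying the whole negative row by 10 changes the class ratio but leaves both rates unchanged. The function name `rates` and all counts are assumptions for illustration.

```python
def rates(tp, fn, fp, tn):
    """Return (true positive rate, false positive rate).

    #P = tp + fn (all truly positive cases), #N = fp + tn (all truly negative cases).
    """
    p = tp + fn
    n = fp + tn
    return tp / p, fp / n


# Hypothetical counts: 50 positives and 50 negatives.
balanced = rates(tp=45, fn=5, fp=10, tn=40)

# Same per-class behaviour, but 10 times as many negatives.
skewed = rates(tp=45, fn=5, fp=100, tn=400)

print(balanced, skewed)  # (0.9, 0.2) (0.9, 0.2)
```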

Comparing classifiers using Confusion Matrix

Classifier 1: TP = 0.4, FP = 0.3

True    Predicted
        pos   neg
pos     40    60
neg     30    70

Classifier 2: TP = 0.7, FP = 0.5

True    Predicted
        pos   neg
pos     70    30
neg     50    50

Classifier 3: TP = 0.6, FP = 0.2

True    Predicted
        pos   neg
pos     60    40
neg     20    80
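The rates quoted for the three classifiers can be re-derived from the tables. The sketch below pairs each table with the classifier whose quoted rates it reproduces, assuming rows are the true class and columns the predicted class, as on the previous slide. The dictionary layout and variable names are illustrative only.

```python
# Each table is ((TP count, FN count), (FP count, TN count)):
# rows = true class (pos, neg), columns = predicted class (pos, neg).
tables = {
    "Classifier 1": ((40, 60), (30, 70)),
    "Classifier 2": ((70, 30), (50, 50)),
    "Classifier 3": ((60, 40), (20, 80)),
}

computed = {}
for name, ((tp, fn), (fp, tn)) in tables.items():
    tp_rate = tp / (tp + fn)   # (#TP)/(#P)
    fp_rate = fp / (fp + tn)   # (#FP)/(#N)
    computed[name] = (tp_rate, fp_rate)
    print(f"{name}: TP = {tp_rate:.1f}, FP = {fp_rate:.1f}")
```

Classifier 3 has both a higher TP rate and a lower FP rate than Classifier 1, while Classifier 2 reaches the highest TP rate at the cost of the highest FP rate.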

Interpretations from the Confusion matrix

• The following metrics can be calculated from the confusion matrix and used to evaluate a classifier (here TP, FP, TN and FN are counts).
• Accuracy = (TP+TN)/(TP+TN+FP+FN)
• Precision = TP/(TP+FP)
• Recall = TP/(TP+FN)
• F-Score = 2*recall*precision/(recall + precision)
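A minimal sketch of these four metrics follows. Accuracy is computed as the fraction of all cases classified correctly, (TP+TN)/(TP+TN+FP+FN). The function name `classifier_metrics` is hypothetical, and the counts are taken from the 60/40/20/80 table on the comparison slide.

```python
def classifier_metrics(tp, fp, fn, tn):
    """Compute accuracy, precision, recall and F-score from the four counts.

    No guard against zero denominators: this sketch assumes at least one
    predicted positive and at least one true positive.
    """
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f_score = 2 * recall * precision / (recall + precision)
    return accuracy, precision, recall, f_score


accuracy, precision, recall, f_score = classifier_metrics(tp=60, fp=20, fn=40, tn=80)
print(accuracy, precision, recall, round(f_score, 3))  # 0.7 0.75 0.6 0.667
```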

ROC curve

[Figure: ROC curve. Vertical axis: True Positive Rate (sensitivity), 0% to 100%. Horizontal axis: False Positive Rate (1-specificity), 0% to 100%.]
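The curve is traced by sweeping the threshold and recording one (False Positive Rate, True Positive Rate) point per threshold. Here is a minimal sketch, assuming a score at or above the threshold is called "positive" and using every distinct score as a threshold. The function name `roc_points` and the sample scores are hypothetical.

```python
def roc_points(scores, labels):
    """Return (FPR, TPR) points, one per distinct threshold, from (0, 0) to (1, 1).

    Assumption: a score >= threshold is called "positive".
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    points = [(0.0, 0.0)]  # threshold above every score: nobody is called positive
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y)
        fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and not y)
        points.append((fp / n_neg, tp / n_pos))
    return points


# Hypothetical scores; label 1 = has the disease, 0 = does not.
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.1]
labels = [1, 1, 0, 1, 0, 1, 0, 0]

points = roc_points(scores, labels)
for fpr, tpr in points:
    print(f"FPR = {fpr:.2f}, TPR = {tpr:.2f}")
```

This version recounts the data for every threshold, which is simple to read but quadratic in the number of samples; a single pass over the sorted scores would be the usual optimization.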

ROC curve comparison

[Figure: two ROC plots, each showing True Positive Rate (0% to 100%) against False Positive Rate (0% to 100%). One panel is titled "A good test:", the other "A poor test:".]

Area under ROC curve (AUC)
• Overall measure of test performance
• Comparisons between two tests are based on differences between their (estimated) AUCs
• For continuous data, AUC is equivalent to the Mann-Whitney U-statistic (nonparametric test of difference in location between two populations), normalized by the number of positive-negative pairs
• In machine learning, it summarizes how well a classifier separates the two classes across all thresholds.
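Both views of the AUC can be checked numerically on a small example. The sketch below computes the area with the trapezoid rule over ROC points and, separately, the fraction of (positive, negative) pairs in which the positive case scores higher, which is the Mann-Whitney U statistic divided by (#P × #N). The function names, scores and ROC points are hypothetical; the points are those obtained by using each distinct score as a threshold, with score >= threshold called "positive".

```python
def auc_trapezoid(points):
    """Area under a ROC curve given (FPR, TPR) points sorted by FPR."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


def auc_mann_whitney(scores, labels):
    """Fraction of (positive, negative) pairs ranked correctly; ties count 1/2.

    This equals the Mann-Whitney U statistic divided by (#P * #N).
    """
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical scores (label 1 = positive) and the matching ROC points.
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.1]
labels = [1, 1, 0, 1, 0, 1, 0, 0]
points = [(0.0, 0.0), (0.0, 0.25), (0.0, 0.5), (0.25, 0.5), (0.25, 0.75),
          (0.5, 0.75), (0.5, 1.0), (0.75, 1.0), (1.0, 1.0)]

auc_area = auc_trapezoid(points)
auc_rank = auc_mann_whitney(scores, labels)
print(auc_area, auc_rank)  # 0.8125 0.8125
```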

AUC for ROC curves

[Figure: four ROC plots, each showing True Positive Rate (0% to 100%) against False Positive Rate (0% to 100%), with AUC = 100%, AUC = 90%, AUC = 65% and AUC = 50%.]

Further Evaluation methods

• ROC curve based visualization
• Visualizing the ROC curve is a very good way to evaluate a classifier.
• Tools like Matlab, Weka and Orange provide facilities to support visualization of the ROC curve.
• ROCR (an R package) is one such tool that provides effective visualization.