Performance measurement


Page 1: Performance measurement

Page 2: Performance measurement

• We must be careful about which performance metric we use

• For example, say we have an NN classifier with one output unit, and we code ‘1 = YES’ and ‘0 = NO’

• Should we threshold at 0.5, saying that anything > 0.5 is a 1 and anything ≤ 0.5 is a 0?

Page 3: Performance measurement

• Only if the costs of correct classification/misclassification are the same for each of the two classes

• An output threshold of 0.5 is not set in stone

• What is performance if we use a decision threshold of 0.6, or 0.4? (See the sketch below)
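Not from the original slides: a minimal Python sketch, using made-up network outputs, showing how moving the decision threshold changes the labels assigned to the same outputs.

```python
# Minimal sketch (example outputs are made up): the same raw network
# outputs get different YES/NO labels as the decision threshold moves.

outputs = [0.30, 0.45, 0.55, 0.70, 0.90]   # hypothetical network outputs

for threshold in (0.4, 0.5, 0.6):
    # Anything strictly above the threshold is classed as YES (1)
    labels = [1 if y > threshold else 0 for y in outputs]
    print(f"threshold = {threshold}: {labels}")
```

At threshold 0.4 the output 0.45 counts as a YES; at 0.6 both 0.45 and 0.55 become NO. Raising the threshold trades false positives for false negatives.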

Page 4: Performance measurement

• For example, in predicting consumer creditworthiness:

• Are the costs of lending money to someone who then defaults the same as:

• The costs of not lending money to someone who would in fact have repaid the loan?

Page 5: Performance measurement

Confusion matrix/crosstabs

• Calculate four quantities (counted in the sketch below):

• True Positives (TP): answer = YES, network said YES

• True Negatives (TN): answer = NO, network said NO

• False Positives (FP): answer = NO, network said YES

• False Negatives (FN): answer = YES, network said NO
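Not from the original slides: a minimal Python sketch counting the four quantities for a batch of made-up ground-truth labels and thresholded network answers.

```python
# Minimal sketch: counting the four confusion-matrix quantities for a
# binary classifier at a fixed threshold. Data here is made up.

actual    = [1, 1, 0, 0, 1, 0]   # ground truth, YES = 1 / NO = 0
predicted = [1, 0, 0, 1, 1, 0]   # network's thresholded answers

TP = sum(a == 1 and p == 1 for a, p in zip(actual, predicted))
TN = sum(a == 0 and p == 0 for a, p in zip(actual, predicted))
FP = sum(a == 0 and p == 1 for a, p in zip(actual, predicted))
FN = sum(a == 1 and p == 0 for a, p in zip(actual, predicted))

print(TP, TN, FP, FN)   # -> 2 2 1 1
```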

Page 6: Performance measurement

Confusion matrix

              Network YES    Network NO
Actual YES        TP             FN
Actual NO         FP             TN

Page 7: Performance measurement

Confusion matrix

• Calculate a confusion matrix for many different output thresholds (e.g., 0.1, 0.2, …, 0.9)

• From these matrices, calculate the following probabilities:

• hit rate = true positive ratio = sensitivity = TP/(TP+FN)

• false alarm rate = false positive ratio = FP/(FP+TN)

• Plot the Receiver Operating Characteristic (ROC) curve (a sketch of computing its points follows below)
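Not from the original slides: a minimal sketch, with made-up data, that sweeps thresholds 0.1–0.9 and collects the (false alarm rate, hit rate) points that make up the ROC curve.

```python
# Minimal sketch: sweep decision thresholds, compute (false alarm rate,
# hit rate) at each one, and collect the ROC points. Data is made up.

actual  = [1, 1, 1, 0, 0, 1, 0, 0]                    # ground truth
outputs = [0.9, 0.8, 0.6, 0.55, 0.5, 0.4, 0.3, 0.1]   # network outputs

roc = []
for t in [i / 10 for i in range(1, 10)]:              # thresholds 0.1 .. 0.9
    pred = [1 if y > t else 0 for y in outputs]
    TP = sum(a and p for a, p in zip(actual, pred))
    FN = sum(a and not p for a, p in zip(actual, pred))
    FP = sum((not a) and p for a, p in zip(actual, pred))
    TN = sum((not a) and (not p) for a, p in zip(actual, pred))
    hit_rate = TP / (TP + FN)                         # sensitivity
    fa_rate  = FP / (FP + TN)                         # false alarm rate
    roc.append((fa_rate, hit_rate))

for point in roc:
    print(point)
```

Plotting hit rate (y-axis) against false alarm rate (x-axis) for these points traces the ROC curve.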

Page 8: Performance measurement

ROC Curve

[Figure: ROC curve of a poor classifier (no discrimination); y-axis: hit rate, x-axis: false alarm (FA) rate]

Page 9: Performance measurement

ROC curve

[Figure: ROC curve of a good classifier; y-axis: hit rate, x-axis: false alarm (FA) rate]

Page 10: Performance measurement

ROC curves

• The area under the curve (AUC) gives an idea of how good the classifier is: 0.5 = no better than chance, approaching 1 = excellent (see the sketch below)

• Can then build the profits/costs of the different correct answers/mistakes into the confusion matrices to build a gains chart. Again, look at the area on this chart

• The classifier with the highest area on the gains chart is the most profitable
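Not from the original slides: a minimal sketch approximating the AUC with the trapezoidal rule; the ROC points are made-up output of a threshold sweep like the one sketched earlier.

```python
# Minimal sketch: approximate the area under an ROC curve with the
# trapezoidal rule. The (false alarm rate, hit rate) points below are
# made up, sorted by false alarm rate, with the (0,0)/(1,1) corners.

roc = [(0.0, 0.0), (0.0, 0.25), (0.25, 0.5), (0.25, 0.75),
       (0.5, 0.75), (0.75, 1.0), (1.0, 1.0)]

auc = 0.0
for (x0, y0), (x1, y1) in zip(roc, roc[1:]):
    auc += (x1 - x0) * (y0 + y1) / 2    # trapezoid between adjacent points

print(f"AUC = {auc:.3f}")   # -> 0.750; 0.5 = chance, near 1 = excellent
```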

Page 11: Performance measurement

Values for gains chart

              Network YES      Network NO
Actual YES    TP × cost1       FN × cost2
Actual NO     FP × cost3       TN × cost4
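Not from the original slides: a minimal sketch (all payoff values are made up) that weights the confusion-matrix counts by per-outcome profits/costs as in the table above, scoring a classifier at one threshold; comparing such scores across classifiers/thresholds identifies the most profitable one.

```python
# Minimal sketch: weight confusion-matrix counts by made-up per-outcome
# profits/costs (cost1..cost4 in the table above) to score a classifier.

def gains(TP, FN, FP, TN, cost1, cost2, cost3, cost4):
    """Total value of the classifier under the given payoff scheme."""
    return TP * cost1 + FN * cost2 + FP * cost3 + TN * cost4

# e.g., a correct YES earns 10, a missed YES costs 5, a false alarm
# costs 2, and a correct NO is worth nothing:
print(gains(TP=40, FN=10, FP=8, TN=42,
            cost1=10, cost2=-5, cost3=-2, cost4=0))   # -> 334
```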

Page 12: Performance measurement

Performance of regression networks

• Mean square error?

• Goodness of fit, R-squared values? (both computed in the sketch below)

• Again, are the costs/benefits of errors the same for all different values?

• Useful to ‘eyeball’ the data: see if there are some areas where the network is good and some where it is not as good, and look at the relative costs
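Not from the original slides: a minimal sketch, with made-up values, computing the two metrics named above for a regression network's predictions.

```python
# Minimal sketch (made-up values): mean square error and R-squared for
# a regression network's predictions.

actual    = [3.0, 4.5, 5.0, 6.5, 8.0]
predicted = [2.8, 4.7, 5.4, 6.0, 8.1]

n = len(actual)
ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
mse = ss_res / n                        # mean square error

mean_a = sum(actual) / n
ss_tot = sum((a - mean_a) ** 2 for a in actual)
r_squared = 1 - ss_res / ss_tot         # goodness of fit

print(f"MSE = {mse:.3f}, R^2 = {r_squared:.3f}")
```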

Page 13: Performance measurement

Summary

• When you measure performance, be careful what you are measuring!