Civitas Learning: Understanding ROC Curves

Introduction to ROC Curves

Data Science Basics Series

May 14, 2014

CIVITAS LEARNING, INC. – CONFIDENTIAL INFORMATION

What is ROC? Receiver Operating Characteristic

Systematically trade off detection against false alarm Using

You woke me up at 3 am!!!

Wake up, you’re late for class!!!


A Brief History of ROC Curves

• Developed by electrical engineers and radar operators during WWII to detect enemy airplanes vs. geese.
• Illustrates the performance of binary classifiers - elements in a set divided into two groups
• Compares trade-offs between detection and false alarm rate
• Now used in many fields
  • Psychology
  • Medicine and biometrics
  • More recently in machine learning and data mining


Detection vs. False Alarm

• Detection/sensitivity/true positive rate measures the fraction of actual positive cases that are correctly detected
• False alarm/false positive rate (1 − specificity) measures the fraction of actual negative cases that are incorrectly flagged
• Tradeoff: Usually can optimize for one but not both
• Example: Disease detection
  • Accept more false alarms in exchange for higher detection if the cost of a missed detection is alarmingly high
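As a minimal sketch of the two rates defined above, the following computes them from confusion-matrix counts. The function name and the screening-test numbers are hypothetical and chosen only for illustration; they do not come from the slides.

```python
def detection_and_false_alarm(tp, fn, fp, tn):
    """Return (detection rate, false alarm rate) from confusion-matrix counts."""
    detection_rate = tp / (tp + fn)      # true positive rate, a.k.a. sensitivity
    false_alarm_rate = fp / (fp + tn)    # false positive rate = 1 - specificity
    return detection_rate, false_alarm_rate


# Hypothetical screening test: 100 sick patients and 900 healthy ones.
tpr, fpr = detection_and_false_alarm(tp=90, fn=10, fp=180, tn=720)
print(f"detection rate = {tpr:.0%}, false alarm rate = {fpr:.0%}")
```

In this made-up example the test catches 90% of the sick patients at the price of flagging 20% of the healthy ones, which is the kind of trade a high cost of missed detection can justify.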


How is ROC Generated?

Features → Scores → PDF → ROC

[Figure: input features (GPA, activities, courses, financial aid, SAT/ACT, high school) feed a model that produces a predicted risk score; the ROC curve plots probability of detection against probability of false alarm.]

Optimal point on the ROC curve depends on reach capacity and ROI




How is ROC Generated?

Features → Scores → PDF → ROC

[Figure: model output with a cutoff threshold marked; vertical axis: probability of detection.]
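The last step of the pipeline (scores to ROC) can be sketched as a sweep of the cutoff threshold: each cutoff flags every student at or above it, and the resulting false alarm rate and detection rate give one point on the curve. This is an illustrative sketch, not the presenters' implementation; the function name, the scores, and the labels are hypothetical.

```python
def roc_points(scores, labels):
    """Sweep a cutoff over the scores; return (false alarm rate, detection rate) pairs.

    labels: 1 = did not continue (the event to detect), 0 = continued.
    A student is flagged when score >= cutoff.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    points = [(0.0, 0.0)]  # cutoff above every score: nobody is flagged
    for cutoff in sorted(set(scores), reverse=True):
        flagged = [y for s, y in zip(scores, labels) if s >= cutoff]
        tp = sum(flagged)            # flagged students who did not continue
        fp = len(flagged) - tp       # flagged students who continued
        points.append((fp / negatives, tp / positives))
    return points


# Hypothetical risk scores for eight students (not data from the slides).
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2]
labels = [1, 1, 0, 1, 0, 0, 0, 0]
for far, dr in roc_points(scores, labels):
    print(f"false alarm rate = {far:.2f}, detection rate = {dr:.2f}")
```

Lowering the cutoff moves the point up and to the right: more at-risk students are detected, and more false alarms are raised.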


Model Performance

The overlap between the score distributions of the two outcome groups measures the model’s ability to separate success from failure.

With a strong model, there is little overlap, so you can confidently assign a particular score to an outcome category.

With a weaker model, there is a large amount of overlap, so a particular score could correspond to a good or a bad outcome with nearly equal probability.
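To make the idea of overlap concrete, here is a small sketch that assumes, purely for illustration, that the risk scores of each outcome group are normally distributed. The means and standard deviations are hypothetical; `NormalDist.overlap` from the Python standard library returns the shared area of two normal densities as a value between 0 and 1.

```python
from statistics import NormalDist

# Hypothetical score distributions (normality is an assumption for illustration).
continued = NormalDist(mu=0.30, sigma=0.10)       # students who continued

strong_model = NormalDist(mu=0.80, sigma=0.10)    # non-continuing students, strong model
weak_model = NormalDist(mu=0.35, sigma=0.10)      # non-continuing students, weak model

strong_overlap = continued.overlap(strong_model)
weak_overlap = continued.overlap(weak_model)
print(f"strong model overlap: {strong_overlap:.3f}")
print(f"weak model overlap:   {weak_overlap:.3f}")
```

An overlap near 0 means almost any score points clearly to one outcome group; an overlap near 1 means a score says very little about the outcome.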

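[Figure: predicted risk score distributions for a STRONG MODEL and a WEAK MODEL, with the corresponding ROC curves (detection rate vs. false alarm rate).]

Parts of a ROC Curve

[Figure: ROC curve of the Civitas Model compared with Random Ordering; axes: false alarm rate (horizontal), detection rate (vertical).]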


Total Population:
• 10,000 students
• 9,000 continued
• 1,000 did not continue

ROC Information:
• Correct identification rate of non-continuing students = 125/1,250 = 10%

Point on Line:
• 1,250 students
• 1,125 continued
• 125 did not continue
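The arithmetic behind a point on the random-ordering line can be sketched as follows, using the population figures from the case study. The function name and dictionary keys are hypothetical. Picking students at random reaches non-continuing and continuing students in proportion to their share of the population, so the detection rate and the false alarm rate are equal and the correct identification rate stays at the 10% base rate.

```python
TOTAL_STUDENTS = 10_000
NON_CONTINUING = 1_000
CONTINUING = TOTAL_STUDENTS - NON_CONTINUING


def random_ordering_point(n_contacted):
    """Expected ROC point when n_contacted students are chosen at random."""
    # Expected number of non-continuing students among those contacted.
    hits = n_contacted * NON_CONTINUING / TOTAL_STUDENTS
    # Expected number of continuing students contacted unnecessarily.
    false_alarms = n_contacted - hits
    return {
        "detection_rate": hits / NON_CONTINUING,
        "false_alarm_rate": false_alarms / CONTINUING,
        "correct_identification_rate": hits / n_contacted,
    }


for n in (1_250, 7_500):
    print(n, random_ordering_point(n))
```

For 1,250 students this gives 125 expected hits; for 7,500 students it gives 750. Both cases have the same 10% correct identification rate.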


[Figure: ROC curve; axes: false alarm rate, detection rate.]



ROC Information:
• Correct identification rate of non-continuing students = 750/7,500 = 10%

Point on Line:
• 7,500 students
• 6,750 continued
• 750 did not continue


[Figure: ROC curve; axes: false alarm rate, detection rate.]

Tradeoffs: Without the model, more advisors are needed to reach more students who will not persist.

As you go up and to the right, you reach more at-risk students (higher detection rate), but more interventions require more advising time and resources because the correct identification rate of non-continuing students stays at 10%.


Model Performance: With the model, the same number of advisors can reach out to about 5X as many students who will not persist.


Point on Line:
• 1,250 students
• 1,125 continued
• 125 did not continue
• Correct = 125/1,250 = 10.0%

ROC Information:
• 1,250 students
• 650 continued
• 600 did not continue
• Correct identification rate of non-continuing students = 600/1,250 = 48.0%
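The comparison above can be worked through in a few lines. This sketch takes the counts from the slide (125 vs. 600 non-continuing students among 1,250 contacted, out of 1,000 non-continuing and 9,000 continuing students overall); the function and variable names are hypothetical.

```python
NON_CONTINUING = 1_000   # from the case study population
CONTINUING = 9_000
CONTACTED = 1_250


def summarize(hits):
    """ROC coordinates and correct identification rate for one outreach list."""
    false_alarms = CONTACTED - hits
    return {
        "detection_rate": hits / NON_CONTINUING,
        "false_alarm_rate": false_alarms / CONTINUING,
        "correct_identification_rate": hits / CONTACTED,
    }


random_point = summarize(hits=125)   # students picked at random
model_point = summarize(hits=600)    # students ordered by the model's risk score
lift = (model_point["correct_identification_rate"]
        / random_point["correct_identification_rate"])

print(random_point)
print(model_point)
print(f"lift = {lift:.1f}x")
```

The ratio 48% / 10% = 4.8 is the source of the "~5X" figure: the model point sits higher (60% detection) and further left (about 7% false alarms) than the random point for the same outreach effort.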

[Figure: ROC curves for the Civitas Model and Random Ordering, annotated with a ~5X gap; axes: false alarm rate, detection rate.]


Model Evaluation

With a stronger predictive model
• Detection rate improves
• False alarm rate decreases
• Correctness increases at every student threshold

[Figure: ROC curves for the Civitas Model and Random Ordering; axes: false alarm rate, detection rate.]

ACCURACY VS. ROC CURVES

Why is accuracy an incomplete and likely misleading measure of a predictive model?


Accuracy vs. ROC Curves

Case: You use an algorithm to identify students who are at risk of not continuing to the next term. As in the case study, 10% of students do not persist. You test your predictive model on the data and find that it made correct predictions 92% of the time.

A crackpot scientist tells you, “I could’ve gotten 90% accuracy just by predicting everyone will persist. After all the math, you gained only 2%?!”

Don’t give up yet! Your predictive model is still helpful.
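A short sketch shows why the 90% figure is less impressive than it sounds. It scores the "predict everyone will persist" rule on the case study population (10,000 students, 1,000 of whom do not persist); the function name is hypothetical.

```python
TOTAL = 10_000
NON_PERSISTING = 1_000


def accuracy_and_detection(tp, fp, fn, tn):
    """Return (accuracy, detection rate) from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    detection = tp / (tp + fn)
    return accuracy, detection


# "Predict everyone will persist": no student is ever flagged as at risk,
# so every non-persisting student is a missed detection.
acc, det = accuracy_and_detection(
    tp=0, fp=0, fn=NON_PERSISTING, tn=TOTAL - NON_PERSISTING
)
print(f"accuracy = {acc:.0%}, detection rate = {det:.0%}")
```

The trivial rule is right 90% of the time yet identifies none of the students who need help, which is why accuracy alone says little when one outcome is rare.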

Accuracy vs. ROC Curves

You have a team of advisors, and they have time to reach out to 1,250 students to suggest ways they can increase their likelihood of persisting.

Accuracy vs. ROC Curves

[Figure: pictogram of the student population; each icon = 100 students.]


Accuracy vs. ROC Curves

Without the predictive model, you have to pick 1,250 students at random to assist. If 10% of them are expected to not persist, only 125 students would be likely to benefit from the intervention.


Accuracy vs. ROC Curves

With the predictive model, you can rank students by predicted risk score and choose the 1,250 with the highest scores. The test case reveals that 600 of these students are at risk and would be most likely to benefit from the right intervention at the right time.
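Choosing the outreach list by risk score amounts to a top-k selection. The sketch below assumes scores are held in a dictionary keyed by student ID; the IDs, scores, and function name are hypothetical, and a capacity of 2 stands in for the advisors' 1,250.

```python
import heapq


def pick_outreach_list(risk_scores, capacity):
    """Return the IDs of the `capacity` students with the highest predicted risk."""
    return heapq.nlargest(capacity, risk_scores, key=risk_scores.get)


# Hypothetical student IDs and risk scores.
risk_scores = {"s01": 0.12, "s02": 0.91, "s03": 0.45, "s04": 0.78, "s05": 0.05}
outreach = pick_outreach_list(risk_scores, capacity=2)
print(outreach)
```

The capacity plays the role of the cutoff threshold: a larger team moves the operating point up and to the right along the ROC curve.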

The ROC Curve Tradeoff

[Figure: students most likely to benefit from an intervention, WITHOUT PREDICTIVE MODEL vs. WITH PREDICTIVE MODEL; ~5x improvement.]

THANK YOU

VIEW this webinar on-demand on our LinkedIn Page: linkedin.com/company/Civitas-Learning

FOLLOW @CivitasLearning to continue the conversation on Twitter: twitter.com/CivitasLearning

SHARE comments and ideas for future webinars on the Civitas Learning Space: civitaslearningspace.com
