Wrangle 2016: Driving Healthcare Operations with Small Data

Driving Healthcare Operations with Data Science

Transcript of Wrangle 2016: Driving Healthcare Operations with Small Data

Page 1: Wrangle 2016: Driving Healthcare Operations with Small Data

Driving Healthcare Operations with Data Science

Page 2: Wrangle 2016: Driving Healthcare Operations with Small Data

"literally a health insurance company"

Page 3: Wrangle 2016: Driving Healthcare Operations with Small Data

"Operations"

Clinical Operations

● Close member "gaps in care"

○ Not taking their meds

○ Not seeing their doctors

○ Not getting tested

● Document conditions

Insurance Operations

● Approve / deny claims

● Approve / deny authorizations

● Catch fraud

Page 5: Wrangle 2016: Driving Healthcare Operations with Small Data

E.g.

Page 6: Wrangle 2016: Driving Healthcare Operations with Small Data

Call members and tell them scary stories?

Knock on the doors of the most non-adherent members?

Ask members politely?

Use different messages for rich and poor members?

Page 7: Wrangle 2016: Driving Healthcare Operations with Small Data

Enter Data Science

Page 8: Wrangle 2016: Driving Healthcare Operations with Small Data

Enter Data Science

What should we do?

For whom?

Did it work?

Page 9: Wrangle 2016: Driving Healthcare Operations with Small Data

Case Study: Whom to Call for Home Visits?

Page 10: Wrangle 2016: Driving Healthcare Operations with Small Data
Page 11: Wrangle 2016: Driving Healthcare Operations with Small Data

Can we predict which of our diabetic members will have complications in the next 6 months?

Page 12: Wrangle 2016: Driving Healthcare Operations with Small Data

Time axis: Observation Interval, then Prediction Interval

Page 13: Wrangle 2016: Driving Healthcare Operations with Small Data

Time axis: Observation Interval, then Prediction Interval

Observation Interval: Demographic info, lab tests, medications, other diagnoses

Prediction Interval: Diagnosed with diabetes complications?

Page 14: Wrangle 2016: Driving Healthcare Operations with Small Data

Features: Member, Age, Hypertension, HbA1c

Label: Diagnosed with Complication in 6-Month Interval

Member  Age  Hypertension  HbA1c  Diagnosed with Complication in 6-Month Interval
CP001   65   Yes           6.5    Yes
CP002   77   No            8.3    No
CP002   84   Yes           7.4    Yes
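A minimal sketch of how one feature/label row could be built from the two intervals, so that features only see the observation interval and the label only sees the prediction interval. The event records, the `build_example` helper, the member IDs, and the 183-day horizon are illustrative assumptions, not the speaker's pipeline.

```python
from datetime import date, timedelta

# Hypothetical event records: (member_id, event_date, kind, value).
EVENTS = [
    ("M1", date(2015, 3, 1), "hba1c", 6.5),
    ("M1", date(2015, 9, 15), "complication", None),
    ("M2", date(2015, 2, 10), "hba1c", 8.3),
    ("M2", date(2016, 4, 1), "complication", None),  # falls after the prediction interval
]


def build_example(member_id, events, cutoff, horizon_days=183):
    """Return (features, label) for one member.

    Features use only events before `cutoff` (the observation interval).
    The label looks only at [cutoff, cutoff + horizon) (the prediction interval).
    """
    end = cutoff + timedelta(days=horizon_days)
    mine = [e for e in events if e[0] == member_id]

    # Most recent HbA1c measurement taken before the cutoff, if any.
    labs = [(d, v) for _, d, kind, v in mine if kind == "hba1c" and d < cutoff]
    latest_hba1c = max(labs, key=lambda pair: pair[0])[1] if labs else None

    # Label: was a complication diagnosed inside the prediction interval?
    label = any(kind == "complication" and cutoff <= d < end for _, d, kind, _ in mine)
    return {"hba1c": latest_hba1c}, label


CUTOFF = date(2015, 7, 1)
example_m1 = build_example("M1", EVENTS, CUTOFF)
example_m2 = build_example("M2", EVENTS, CUTOFF)
```

Keeping the cutoff strict on both sides is the point of the sketch: nothing from the prediction interval can leak into the features.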

Page 15: Wrangle 2016: Driving Healthcare Operations with Small Data

Challenge: High Class Imbalance

● Historically, only 8% of diabetic members have been diagnosed with complications over a 6-month period.

● Easy to get "high" accuracy, but hard to get a decent precision/recall tradeoff.
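A small sketch of why accuracy misleads at an 8% positive rate: a classifier that always says "no complication" scores 92% accuracy while finding nobody. The `metrics` helper and the 100-member toy population are assumptions for illustration only.

```python
def metrics(y_true, y_pred):
    """Return (accuracy, precision, recall) for binary 0/1 labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if tp + fp else 0.0  # undefined case reported as 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return accuracy, precision, recall


# Toy population matching the 8% historical rate: 8 positives, 92 negatives.
y_true = [1] * 8 + [0] * 92
always_no = [0] * 100

baseline = metrics(y_true, always_no)
```

Here `baseline` comes out as 92% accuracy with zero precision and zero recall, which is why accuracy alone is not a useful target for this problem.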

Page 16: Wrangle 2016: Driving Healthcare Operations with Small Data

Approach: High Class Imbalance

● Evaluate using area under ROC curve.

● Empirically, tree ensemble models appear to handle the imbalance better than logistic regression.
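For reference, area under the ROC curve can be read as the probability that a randomly chosen positive member gets a higher score than a randomly chosen negative one. The sketch below computes it directly from that definition; the `auroc` function and the toy scores are illustrative assumptions, and a library routine would normally be used on real data.

```python
def auroc(labels, scores):
    """AUROC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair. Quadratic in the number of members,
    which is fine for a small illustration.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy, imbalanced example: 2 positives among 10 members.
labels = [0, 0, 0, 0, 1, 0, 0, 0, 0, 1]
scores = [0.1, 0.2, 0.3, 0.4, 0.9, 0.5, 0.15, 0.25, 0.35, 0.45]
toy_auc = auroc(labels, scores)
```

Because the metric depends only on how positives rank against negatives, it does not reward a model for simply predicting the majority class.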

Page 17: Wrangle 2016: Driving Healthcare Operations with Small Data

Challenge: Missing Data

● Glycated hemoglobin (HbA1c) is clearly an important feature… but we only have measurements for ~60% of members.

● Whether we have a measurement correlates with both:

○ Diabetes complications.

○ How well a model trained without the lab measurement performs.

Page 18: Wrangle 2016: Driving Healthcare Operations with Small Data

Approach: Missing Data

● Simply hardcode all missing values to something outside the measurement range.

○ In our case, 0.0.

● This way, tree models can split on "have a measurement" vs. "don't have a measurement".
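A minimal sketch of the sentinel idea: once missing values are set to 0.0, a single tree split with a threshold between 0.0 and the lowest real measurement separates "have a measurement" from "don't have a measurement". The `impute`, `gini`, and `best_split` helpers and the toy values and labels are assumptions for illustration; the direction of the toy labels is arbitrary.

```python
MISSING_SENTINEL = 0.0  # the slide's choice: outside the measurement range


def impute(values, sentinel=MISSING_SENTINEL):
    """Replace missing (None) measurements with the sentinel."""
    return [sentinel if v is None else v for v in values]


def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)


def best_split(xs, ys):
    """Return (threshold, weighted_impurity) of the best single split on one feature."""
    best = None
    values = sorted(set(xs))
    for lo, hi in zip(values, values[1:]):
        t = (lo + hi) / 2  # candidate threshold halfway between neighbours
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(ys)
        if best is None or score < best[1]:
            best = (t, score)
    return best


# Toy data in which missingness itself carries signal (hypothetical labels).
raw_hba1c = [None, None, None, 6.1, 6.5, 7.0, 8.3]
toy_labels = [0, 0, 0, 1, 1, 1, 1]

xs = impute(raw_hba1c)
threshold, impurity = best_split(xs, toy_labels)
```

In this toy case the chosen threshold lands between the sentinel and the smallest real value, so the split is exactly the measured-versus-unmeasured indicator. A linear model would instead treat 0.0 as a very low reading, which is one reason this trick suits tree models specifically.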

Page 19: Wrangle 2016: Driving Healthcare Operations with Small Data

Final Model: Gradient Boosting Tree Ensemble

Page 20: Wrangle 2016: Driving Healthcare Operations with Small Data

Evaluation

AUROC: 0.8

Precision: 24%

Recall: 66%
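A back-of-the-envelope sketch of what these numbers imply for a call list, assuming the 8% historical complication rate also holds in the evaluation set and that precision and recall were measured at the same threshold. The `call_list_summary` helper and the population of 1,000 are illustrative assumptions.

```python
def call_list_summary(prevalence, precision, recall, population=1000):
    """Translate precision/recall into call-list sizes for a population."""
    positives = prevalence * population      # members who will have complications
    true_positives = recall * positives      # of those, how many the model flags
    flagged = true_positives / precision     # total members flagged for a call
    lift = precision / prevalence            # hit rate versus calling at random
    return {
        "positives": positives,
        "true_positives": true_positives,
        "flagged": flagged,
        "lift": lift,
    }


summary = call_list_summary(prevalence=0.08, precision=0.24, recall=0.66)
```

Under those assumptions, per 1,000 diabetic members the model would flag about 220 for calls and reach about 53 of the roughly 80 who go on to have complications, a hit rate about three times that of random calling.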

Page 21: Wrangle 2016: Driving Healthcare Operations with Small Data

Most Predictive Features

Glycated Hemoglobin

Age

Hypertension

Takes Insulin

Page 22: Wrangle 2016: Driving Healthcare Operations with Small Data
Page 23: Wrangle 2016: Driving Healthcare Operations with Small Data

Did it work?

Page 24: Wrangle 2016: Driving Healthcare Operations with Small Data

Do we catch more complications if we make calls using the model?

Page 25: Wrangle 2016: Driving Healthcare Operations with Small Data

Control Group vs. Treatment Group

Page 26: Wrangle 2016: Driving Healthcare Operations with Small Data

Control Group: Call Group (Chosen at Random)

Treatment Group: Call Group (Chosen by Model)
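One way to read this design, sketched in code: members are split at random into two arms, then the control arm's call list is drawn at random while the treatment arm's call list is the highest-scoring members. The `assign_and_select` function, its parameters, and the toy scores are assumptions; the slides do not specify how the groups were actually formed.

```python
import random


def assign_and_select(member_scores, calls_per_arm, seed=0):
    """Split members into two arms and pick a call list in each.

    member_scores: dict of member_id -> model risk score (hypothetical input).
    """
    rng = random.Random(seed)
    members = sorted(member_scores)
    rng.shuffle(members)  # random assignment to arms
    half = len(members) // 2
    control, treatment = members[:half], members[half:]
    if calls_per_arm > min(len(control), len(treatment)):
        raise ValueError("calls_per_arm exceeds arm size")

    # Control arm: whom to call is chosen at random.
    control_calls = rng.sample(control, calls_per_arm)
    # Treatment arm: call the members the model ranks as highest risk.
    treatment_calls = sorted(treatment, key=member_scores.get, reverse=True)[:calls_per_arm]
    return {
        "control": control,
        "treatment": treatment,
        "control_calls": control_calls,
        "treatment_calls": treatment_calls,
    }


# Toy scores for 100 hypothetical members.
toy_scores = {f"m{i:03d}": i / 100 for i in range(100)}
design = assign_and_select(toy_scores, calls_per_arm=10)
```

Holding the number of calls equal in both arms means any difference in complications found can be attributed to how members were chosen, not to how many were called.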

Page 27: Wrangle 2016: Driving Healthcare Operations with Small Data

                 Found Complications  Didn't Find Complications
Control Group    8                    92
Treatment Group  24                   76

FAKE RESULTS

Page 28: Wrangle 2016: Driving Healthcare Operations with Small Data

                 Found Complications  Didn't Find Complications
Control Group    8                    92
Treatment Group  24                   76

FAKE RESULTS

Chi-Squared Test
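A minimal sketch of the chi-squared test applied to the 2x2 table above, which the slide itself labels as fake results. It uses Pearson's statistic without a continuity correction; the `chi_squared_2x2` function is an illustrative helper, and a statistics library would normally be used instead.

```python
import math


def chi_squared_2x2(table):
    """Pearson chi-squared test of independence for a 2x2 table.

    Returns (statistic, p_value). No Yates continuity correction is applied.
    """
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)

    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected

    # With 1 degree of freedom, P(chi2 > x) equals erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


# Rows: control, treatment. Columns: found, didn't find. (The slide's FAKE RESULTS.)
fake_results = [[8, 92], [24, 76]]
statistic, p_value = chi_squared_2x2(fake_results)
```

On these fake numbers the statistic is about 9.52 with a p-value of roughly 0.002, so a gap of this size between the two groups would be unlikely to arise by chance alone.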