Big data chicago v2 5 14 14
Big Data in Health Care: What Marketers Need to Know
Tim Gilchrist, May 2014, @timgilchrist
Session Goals
• Cover
– Big Data
– Artificial Intelligence / Machine Learning
– How to be an Informed Consumer
– Applications for Marketers
– Questions
Big Data
The term for a collection of data sets so large and complex that they become difficult to process
• Many data sources with different formats
• Data with missing values
• Text / Social Media
• Things that don’t fit in Excel
Artificial Intelligence
“Pay no attention to the man behind the curtain”
Machine Learning
The construction and study of systems that can learn from data
Bayes
Thomas Bayes (1701 – 7 April 1761) was an English mathematician and Presbyterian minister, known for formulating the theorem that bears his name: Bayes' theorem
Bayes' theorem uses prior probabilities, combined with new observations, to calculate the probability of a hypothesis being true or false
Bayes is a natural fit for health care due to the presence of hypotheses (diagnoses) and events (tests / observations)
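For reference, the theorem is usually written as below. The symbols are our own notation, not the slide's: H is the hypothesis (the diagnosis), E is the event (a test result), P(H) is the prior probability, P(E | H) is the chance of seeing the result when the hypothesis is true, and P(E | ¬H) is the false positive rate.

```latex
P(H \mid E) = \frac{P(E \mid H)\, P(H)}{P(E \mid H)\, P(H) + P(E \mid \neg H)\, P(\neg H)}
```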
Example
You are all doctors who have administered a critical test to 50 patients
You know the test is:
– 75% Accurate
– 10% False Positives
– 10 of your patients tested positive
– How many are actually sick?
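The slide leaves the answer open. One way to work it out is sketched below, under assumptions that are ours rather than the speaker's: "75% accurate" is read as the chance that a sick patient tests positive, "10% false positives" as the chance that a healthy patient tests positive, and the 10 observed positives are treated as the expected count. The function name is hypothetical.

```python
def expected_true_positives(n_patients, n_positive, sensitivity, false_positive_rate):
    """Estimate how many positive tests belong to patients who are actually sick.

    Assumes the observed number of positives equals its expected value:
        n_positive = sensitivity * sick + false_positive_rate * (n_patients - sick)
    """
    # Solve the equation above for the number of sick patients.
    sick = (n_positive - false_positive_rate * n_patients) / (
        sensitivity - false_positive_rate
    )
    # Of the sick patients, only a `sensitivity` share test positive.
    true_positives = sensitivity * sick
    return sick, true_positives


sick, true_pos = expected_true_positives(
    n_patients=50, n_positive=10, sensitivity=0.75, false_positive_rate=0.10
)
print(f"Estimated sick patients: {sick:.1f}")
print(f"Positives who are actually sick: {true_pos:.1f} of 10")
```

Under this reading, roughly 6 of the 10 positive patients are actually sick, and about 2 sick patients were missed by the test. A different reading of "accurate" would give a different number.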
Bayes’ Theorem – Mammogram Example
We can present the data as a decision tree representing the probabilities confronting doctors and patients
If we were to take population-level down to the individual level, much more accurate probabilities would be possible
[Figure: decision tree. Axes: Observation, Probability. Stages: #1 Mammogram, #2 Biopsy, #3 Time. Branch labels: Test + 8%, Test - 92%; Cancer 10%, No Cancer 90%; Cancer .03%, No Cancer 99%.]
What Does this Mean To Marketers?
• Big data is about discovering relationships
• Then using data-driven insights to inform strategy
“Can’t beat a man with some insurance. I need that health plan baaaby!”
Big Data Health Landscape
[Figure: Big Data Health Landscape. Row labels: Care Path, Data Types, Outputs.]
Care Path: Member, PCP, Specialist (Sees PCP, Gets X-Ray, Sees Specialist); Ambulatory, Outpatient
Data Types: HIE, EMR, Admission, Discharge, Prescription, Claims, Plan Data, Portal, Direct Connection, Wearable, Telemetry, Social, Purchased, Location, Cell
Analysis / Transformation
Outputs: What is Happening, What Will Happen
Example
Text Mining for Sales
Listening & Collecting
Noise Signal
Training / Processing
1. Tweets extracted from the Twitter Firehose with key words “Health” and “Plan”: 1MM per day
@CapoeiraBatuque
“What's the best plan thru affordable health care. Blue cross? Blue shield? Health net? #confused #healthcare”
@bluecalgal
“Obama says don't listen to Fox, why? Obama lied about keep your dr, health plan, cheaper than cell phone (bs) keep your dr”
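The keyword extraction step can be sketched in a few lines. This is a minimal illustration, not the speaker's actual collection code: it assumes tweets have already arrived as plain strings, it matches whole words case-insensitively (the slides do not say how matching was done), and the function name and the third tweet are hypothetical.

```python
import re

KEYWORDS = ("health", "plan")  # the two key words named on the slide


def mentions_all_keywords(tweet, keywords=KEYWORDS):
    """Return True if every keyword appears as a whole word (case-insensitive)."""
    words = set(re.findall(r"[a-z']+", tweet.lower()))
    return all(keyword in words for keyword in keywords)


tweets = [
    "What's the best plan thru affordable health care. Blue cross? Blue shield? "
    "Health net? #confused #healthcare",
    "Obama says don't listen to Fox, why? Obama lied about keep your dr, "
    "health plan, cheaper than cell phone (bs) keep your dr",
    "Planning a healthy dinner tonight",  # hypothetical tweet that should not match
]

matches = [t for t in tweets if mentions_all_keywords(t)]
print(f"{len(matches)} of {len(tweets)} tweets mention both key words")
```

Both slide tweets pass the filter even though only one reads like a shopper, which is why the later classification step is needed to separate signal from noise.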
2. Create training databases for any classification desired. + and – outcomes used here
3. Weka turns Tweets into numerical code that can be analyzed by computer (“String to Word Vector”), then uses a Naïve Bayes classifier
[Figure: the tweet “What's the best plan thru affordable health care. Blue cross? Blue shield? Health net? #confused #healthcare” converted to a word vector. Values shown: .225, .357, .155, .999]
Text Analysis (Stemming / Tokenization): ACA, Access, Health, HealthPlan, Now, Pressure, Confused, Dumb
Demographic Analysis: @ Handle, # Followers, # Tweets, Profile, Retweets, Location, Date/Time
4. Once classifications are established, rules can be applied to new Tweets with high accuracy (~90+%)
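The slides do this work in Weka. As a rough stand-in, the same idea (turn each tweet into word counts, then train a Naïve Bayes classifier on + and – examples) can be sketched with only the Python standard library. This is not Weka's implementation: there is no stemming, the class name is hypothetical, the training tweets are shortened or invented for illustration, and the + / – labels are our assumption. A four-tweet training set says nothing about the ~90% accuracy quoted on the slide.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into word tokens (a crude bag of words)."""
    return re.findall(r"[a-z']+", text.lower())


class NaiveBayesTextClassifier:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocabulary = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        vocab_size = len(self.vocabulary)
        best_label, best_score = None, -math.inf
        for label, doc_count in self.class_counts.items():
            # Start from the log prior, then add the log likelihood of each known word.
            score = math.log(doc_count / total_docs)
            total_words = sum(self.word_counts[label].values())
            for word in tokenize(text):
                if word in self.vocabulary:
                    count = self.word_counts[label][word]
                    score += math.log((count + 1) / (total_words + vocab_size))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical training set: "+" = shopping for a plan, "-" = everything else.
training = [
    ("What's the best plan thru affordable health care. Blue cross? Blue shield?", "+"),
    ("Need a new health plan, which one should I pick? #confused", "+"),
    ("Obama lied about keep your dr, health plan, cheaper than cell phone", "-"),
    ("Politicians keep arguing about the health plan law on Fox", "-"),
]
texts, labels = zip(*training)
model = NaiveBayesTextClassifier().fit(texts, labels)
print(model.predict("Which health plan is best for me? So confused"))
```

Words such as "health" and "plan" appear in both classes and carry little weight; the classifier leans on words like "best" and "confused" that appear only in the + examples.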
Result
Other Uses
• Model who will be your most valuable:
– Customer
– Facebook follower
• (Really) Determining Sentiment
• Marketing Mix Simulations
• Consumer Facing Predictive Technology
• Product development (HIX)
Questions?
Appendix
Discovering Relationships Between Data
We can use machine learning to form relationships between sets of data that are seemingly unrelated (Causal Relationships):
• Making your bed in the morning and job satisfaction
• Artificial Christmas trees and family “brag letters”
• What you buy and why
A Causal Diagram Based on Established Relationships for Estimating the Incidence of Coronary Heart Disease (CHD).
(Source: Comparative quantification of health risks: Conceptual framework and methodological issues)
© Tim Gilchrist 2013
Bayes’ Theorem – Mammogram Example
Problem: Estimates of breast cancer overdiagnosis range from 25% to 52%. Physicians often misinterpret their own lab results.
Prior Probability
Chance that a woman will develop breast cancer in her 40s: X = 1.4%
New Event: Mammography
Ability of mammogram to detect cancer when present: Y = 75%
False positives: Z = 10%
Posterior Probability
Revised probability, given new event (positive mammogram): XY / (XY + Z(1 - X)) = 9.6%
A positive mammogram still leaves a 90.4% chance that the test showed something other than cancer. When biopsies are performed on this age group, 75% are negative. Physicians rarely consider the other 90.4%
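The 9.6% figure can be checked directly from the three inputs on the slide. The short sketch below uses the slide's X, Y and Z values; the function and variable names are our own.

```python
def posterior(prior, detection_rate, false_positive_rate):
    """Bayes' theorem: P(cancer | positive test) = xy / (xy + z(1 - x))."""
    true_positive_mass = prior * detection_rate
    false_positive_mass = false_positive_rate * (1 - prior)
    return true_positive_mass / (true_positive_mass + false_positive_mass)


x = 0.014  # prior: chance of developing breast cancer in her 40s
y = 0.75   # chance the mammogram detects cancer when present
z = 0.10   # false positive rate

p = posterior(x, y, z)
print(f"P(cancer | positive mammogram) = {p:.1%}")         # 9.6%
print(f"P(no cancer | positive mammogram) = {1 - p:.1%}")  # 90.4%
```

The posterior stays low because the prior is small: among all women tested, the false positives from the healthy 98.6% far outnumber the true positives from the 1.4% who have cancer.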
Other Reading
Why Most Marketers Will Fail In The Era Of Big Data
8 Marketers Doing Big Data Right
The big-data revolution in US health care: Accelerating value and innovation
HITLAB Speaks with Tim Gilchrist, Director of eBusiness Strategy for WellPoint
The State of Big Data