Smart Data in Health – How we will exploit personal, clinical, and social “Big Health Data” for better outcomes
Webinar given to Brain Health Alliance, June 30, 2015
Amit Sheth
Kno.e.sis – Ohio Center of Excellence in Knowledge-enabled Computing
Wright State University, Dayton, Ohio
http://knoesis.org http://knoesis.org/amit/hcls
Special Thanks: Sujan Perera
Smart Data
Makes Sense
Actionable or help decision support/making
Contextual
Information
Personalized
Smart Data
Smart data makes sense out of big data.
It provides value by harnessing the challenges posed by the volume, velocity, variety, and veracity of big data, in turn providing actionable information and improving decision making.
Healthcare Data Usage - Examples
• Support Research - Genomics and Beyond: The key to enabling personalized medicine and discoveries
• Transform Data to Information: Mine data for meaning and patterns/predictive analytics
• Support Self-Care: Mobile apps to keep track of your health status
• Support Providers - Improve Patient Care: Analyzing social and clinical data streams to create behavioral health records
• Increase Awareness: Inform about epidemics, identify counterfeit drugs, inform about environmental issues
A Few Success Stories
• IBM- Ontario’s Institute of Technology : predict the onset of nosocomial infections 24 hours before symptoms appeared.
• University of Michigan Health System : reducing the need for blood transfusions by 31 per cent and expenses by $200,000 a month.
• Kaiser Permanente : discovery of adverse drug effects and subsequent withdrawal of the drug Vioxx from the market.
• Harvard Medical School : computer algorithms to analyze EHR data to detect and categorize patients with diabetes for public health surveillance.
• Seton Healthcare – IBM : bulging jugular vein is a strong—and easily observed—predictor that a patient admitted for congestive heart failure is likely to wind up back in the hospital.
Kno.e.sis Harness the Value
kHealth analyzes both active and passive observations of patients to generate alarms that help improve the health, fitness, and wellbeing of the patient. It uses Semantic Sensor Web technology, Semantic Perception, and Intelligence at the Edge to enable sophisticated analysis of personal health observations.
kHealth
Data Sources
kHealth Wiki
Kno.e.sis Harness the Value
The overall aim of PREDOSE is to develop techniques to facilitate prescription drug abuse epidemiology, related to the illicit use of pharmaceutical opioids. PREDOSE is designed to capture the knowledge, attitudes and behaviors of prescription drug abusers through the automatic extraction of semantic information from social media.
PREDOSE
Data Sources
PREDOSE Wiki
Kno.e.sis Harness the Value
eDrugTrends is a social media data analytics platform for monitoring cannabis and synthetic cannabinoid use. It uses Twitter and Web forum data to: 1) identify and compare trends in knowledge, attitudes, and behaviors related to cannabis and synthetic cannabinoids, and 2) identify key influencers in cannabis and synthetic cannabinoid-related discussions on Twitter.
eDrugTrends
Data Sources
eDrugTrends Wiki
Kno.e.sis Harness the Value
This project seeks to understand and satisfy users’ need to keep track of new information in healthcare and well-being. The project harvests collective intelligence to identify high-quality, reliable, and informative healthcare content shared over social media, based on the following analyses: text analysis, semantic analysis, reliability analysis, and popularity analysis.
Social Health Signals
Data Sources
Social Health Signals Wiki
Technology Stack
Data sources: EMR, Sensor
• Explicit/Implicit Entity Recognition
• Understanding Language Nuances
• Noise Filtering
• Entity Disambiguation
• Sentiment Extraction
• Spatial Information Extraction
• Temporal Information Extraction
• Knowledge Extraction
• Semantic Perception
• Time Series Analysis
• Predictive Analytics
• Semantic Analysis
• Social Network Analysis
• Data Integration
Kno.e.sis Strength
Research at Kno.e.sis is founded on the belief that knowledge about the world and the problem domain has a critical role to play in solving complex real-world problems. Hence, our technologies always exploit the available background knowledge to overcome the unique challenges posed by the problem at hand.
Computationally, we seek to combine bottom-brain and top-brain inspired computing.
Sujan Perera, Cory Henson, Krishnaprasad Thirunarayan, Amit Sheth, Suhas Nair, 'Semantics Driven Approach for Knowledge Acquisition from EMRs', Special Issue on Data Mining in Bioinformatics, Biomedicine and Healthcare Informatics, Journal of Biomedical and Health Informatics (To Appear)
Intuition: Knowledge is built by abstracting real world facts, once built it should be able to explain the real world
Public Knowledge is not always Sufficient
Semantics Driven Approach for Knowledge Acquisition from EMRs
[Figure: knowledge acquisition pipeline. Labels: Patient Notes; UMLS; Explanation Module; Explained? (Yes/No); Hypothesis Generation; Hypothesis Filtering; Hypothesis with High Confidence]
Knowledge Acquisition
1. Annotate the EMR documents with the given knowledge base
2. Find unexplained symptoms
3. Generate hypotheses for unexplained symptoms
   1. All disorders in the document become candidates
4. Filter out the candidate disorders with high confidence
   1. Get the disorders which have a relationship with the unexplained symptom in the given knowledge base
   2. Collect the “neighborhood” of those disorders
   3. Get the intersection of the “neighborhood” and the candidate disorders
Knowledge Acquisition - Algorithm
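The algorithm steps can be sketched in a few lines of Python. This is a minimal illustration and not the authors' implementation: the knowledge base is reduced to two hypothetical dictionaries (`kb` for symptom-to-disorder links and `neighborhood` for related disorders), and all entries are made-up toy data, not clinical facts or UMLS content.

```python
def find_unexplained(symptoms, disorders, kb):
    """Step 2: symptoms that no disorder found in the document explains."""
    return {s for s in symptoms if not (kb.get(s, set()) & disorders)}


def hypotheses(symptom, disorders, kb, neighborhood):
    """Steps 3-4: propose document disorders that may explain the symptom."""
    candidates = set(disorders)                  # step 3.1: all disorders in the document
    related = kb.get(symptom, set())             # step 4.1: disorders already linked to the symptom
    nearby = set()
    for disorder in related:                     # step 4.2: collect their "neighborhood"
        nearby |= neighborhood.get(disorder, set())
    return nearby & candidates                   # step 4.3: intersect with the candidates


# Toy data (assumptions for illustration only).
kb = {"edema": {"heart failure"}, "fatigue": {"anemia"}}
neighborhood = {"anemia": {"iron deficiency", "chronic kidney disease"}}
doc_symptoms = {"edema", "fatigue"}
doc_disorders = {"heart failure", "chronic kidney disease", "hypertension"}

unexplained = find_unexplained(doc_symptoms, doc_disorders, kb)
proposed = {s: hypotheses(s, doc_disorders, kb, neighborhood) for s in unexplained}
```

Here "edema" is already explained by a disorder in the document, so only "fatigue" reaches the hypothesis step.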
Implicit Entity Recognition
Bob Smith is a 61-year-old man referred by Dr. Davis for outpatient cardiac catheterization because of a positive exercise tolerance test. Recently, he started to have left shoulder twinges and tingling in his hands. A stress test done on 2013-06-02 revealed that the patient exercised for 6 1/2 minutes, stopped due to fatigue. However, Mr. Smith is comfortably breathing in room air. He also showed accumulation of fluid in his extremities. He does not have any chest pain.
Annotations: Person; Person; UMLS: C0018795; UMLS: C0008031; UMLS: C0015672
• Named Entity Recognition (gives type)
• Co-reference Resolution
• Negation Detection
• Entity Linking
• Temporal Information Extraction
Implicit Entity Recognition
Shortness of breath - negated
Edema
Shortness of breath: uncomfortable sensation of difficulty in breathing
Edema: excessive accumulation of fluid
Implicit Entity Recognition
Implicit Entity Recognition (IER) is the task of determining whether a sentence, which does not contain the proper name of an entity, nevertheless refers to the entity.
Sujan Perera, Pablo Mendes, Amit Sheth, Krishnaprasad Thirunarayan, Adarsh Alex, Christopher Heid, Greg Mott, 'Implicit Entity Recognition in Clinical Documents', In proceedings of The Fourth Joint Conference on Lexical and Computational Semantics (*SEM), 2015
Implicit Entity Recognition
Sentence | Entity
Her breathing is still uncomfortable. | Shortness of breath
It is important to prevent shortness of breath and lower extremity swelling from fluid accumulation. | Edema
She says she did not have any warning prior to losing consciousness and remembers everything. | Syncope
His tip of the appendix was inflamed. | Appendicitis
There is a 1.3 cm gallstone within the gallbladder neck which is not obstructing. | Cholecystitis
Levels of Abstraction (less useful … more useful, from level 0 up to level 3)
3 Interpreted data (abductive) [in OWL], e.g., diagnosis: Hyperthyroidism
2 Interpreted data (deductive) [in OWL], e.g., threshold: Elevated Blood Pressure
1 Annotated data [in RDF], e.g., label: Systolic blood pressure of 150 mmHg
0 Raw data [in TEXT], e.g., number: “150”
Tools shown: SSN Ontology, Intellego
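The first three levels of abstraction can be sketched as a small pipeline. This is an illustrative sketch only: the slides use RDF and OWL, whereas this code uses plain Python objects, and the 140 mmHg cut-off is an assumed threshold chosen for the example, not a value taken from the slides.

```python
from dataclasses import dataclass


@dataclass
class AnnotatedObservation:
    """Level 1: a raw number labeled with its property and unit."""
    prop: str
    value: float
    unit: str


# Assumed threshold for illustration only; not stated in the slides.
ELEVATED_SYSTOLIC_MMHG = 140.0


def annotate(raw: str) -> AnnotatedObservation:
    """Level 0 -> 1: attach a label to the raw text value."""
    return AnnotatedObservation("systolic blood pressure", float(raw), "mmHg")


def interpret(obs: AnnotatedObservation) -> str:
    """Level 1 -> 2: deductive interpretation through a threshold."""
    if obs.value >= ELEVATED_SYSTOLIC_MMHG:
        return "elevated blood pressure"
    return "blood pressure not elevated"


level2 = interpret(annotate("150"))
```

The abductive step from level 2 to level 3 (a diagnosis) needs background knowledge linking properties to conditions, which is what the explanation step of the perception cycle provides.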
Perception Cycle*
* based on Neisser’s cognitive model of perception
Observe Property / Perceive Feature
1. Explanation: Translating low-level signals into high-level knowledge
2. Discrimination: Focusing attention on those aspects of the environment that provide useful information
Prior Knowledge
Observe Property / Perceive Feature
1. Explanation: Translating low-level signals into high-level knowledge
Explanation is the act of choosing the objects or events that best account for a set of observations; often referred to as hypothesis building
Explanation
Inference to the best explanation
• In general, explanation is an abductive problem, and hard to compute
Finding the sweet spot between abduction and OWL
• Single-feature assumption* enables use of an OWL-DL deductive reasoner
* An explanation must be a single feature which accounts for all observed properties
Explanation
Observed Property: elevated blood pressure, clammy skin, palpitations
Explanatory Feature: Hypertension, Hyperthyroidism, Pulmonary Edema
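Under the single-feature assumption, explanation reduces to a subset check: a feature qualifies if the properties it accounts for cover everything observed. The sketch below uses plain Python sets in place of an OWL-DL reasoner, and the links in `KB` are hypothetical assumptions made for illustration, not clinical facts from the slides.

```python
# Hypothetical knowledge base: explanatory feature -> properties it accounts for.
KB = {
    "Hypertension": {"elevated blood pressure"},
    "Hyperthyroidism": {"elevated blood pressure", "clammy skin", "palpitations"},
    "Pulmonary Edema": {"elevated blood pressure", "clammy skin"},
}


def explain(observed, kb):
    """Return every single feature that accounts for all observed properties."""
    return {feature for feature, props in kb.items() if observed <= props}


explanations = explain({"elevated blood pressure", "clammy skin"}, KB)
```

With these assumed links, two features remain as explanations after observing elevated blood pressure and clammy skin, which is the situation discrimination is meant to resolve.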
Explanation
Discrimination is the act of finding those properties that, if observed, would help distinguish between multiple explanatory features
Observe Property / Perceive Feature
2. Discrimination: Focusing attention on those aspects of the environment that provide useful information
Discrimination
Discriminating Property: elevated blood pressure, clammy skin, palpitations
Explanatory Feature: Hypertension, Hyperthyroidism, Pulmonary Edema
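Discrimination can be sketched in the same set-based style: among properties not yet observed, keep those that some, but not all, of the remaining explanatory features account for. As before, the links in `KB` are hypothetical assumptions for illustration, and this is not the OWL-based formulation used in the original work.

```python
# Hypothetical knowledge base: explanatory feature -> properties it accounts for.
KB = {
    "Hypertension": {"elevated blood pressure"},
    "Hyperthyroidism": {"elevated blood pressure", "clammy skin", "palpitations"},
    "Pulmonary Edema": {"elevated blood pressure", "clammy skin"},
}


def discriminating_properties(observed, candidates, kb):
    """Unobserved properties that would separate the candidate features."""
    expected = set().union(*(kb[f] for f in candidates)) - observed
    return {
        prop
        for prop in expected
        if 0 < sum(prop in kb[f] for f in candidates) < len(candidates)
    }


to_check = discriminating_properties(
    {"elevated blood pressure", "clammy skin"},
    {"Hyperthyroidism", "Pulmonary Edema"},
    KB,
)
```

A property shared by every candidate is never returned, since observing it would not narrow the set of explanations.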
Discrimination
Through physical monitoring and analysis, our cellphones could act as an early warning system to detect serious health conditions, and provide actionable information
canary in a coal mine
Empowering individuals for their own health
kHealth
What?
• kHealth is a knowledge-based approach/application for patient-centric healthcare that exploits:
  (a) Web-based tools and social media
  (b) Mobile phone technology and wireless sensors
  to synthesize personalized actions from heterogeneous health data
  (i) for disease prevention and treatment
  (ii) for health, fitness, and well-being
kHealth
kHealth – Applications & Impact
Condition | Number of patients | Total cost per year
Asthma | 25 million | $50 billion
ADHF | 5 million | $34 billion
Parkinson’s disease | 1 million | $25 billion
Sensors
• Sensordrone (carbon monoxide, temperature, humidity)
• Node Sensor (exhaled nitric oxide)
• Android device (w/ kHealth app)
Total cost: ~$500
* Along with the two sensors in the kit, the application uses a variety of population-level signals from the web: pollen level, air quality, temperature & humidity
kHealth – Asthma Patient Kit
Data Acquisition & Aggregation: personal-level signals, public-level signals, population-level signals, events from social streams
Analysis: domain knowledge, risk model
Personalized Actionable Information, e.g.:
• Take medication before going to work
• Avoid going out in the evening due to high pollen levels
• Contact doctor
Health Signal Processing Architecture
Risk assessment model
Semantic Perception
Personal level Signals
Public level Signals
Domain Knowledge
Population level Signals
GREEN: well controlled; YELLOW: not well controlled; RED: poorly controlled
How controlled is my asthma?
Patient Health Score (Diagnostic)
Risk assessment model
Semantic Perception
Personal level Signals
Public level Signals
Domain Knowledge
Population level Signals
Patient health Score
How vulnerable* is my control level today?
*considering changing environmental conditions and current control level
Patient Vulnerability Score (Prognostic)
Observations (personal and population level)
• Sensor and personal observations, e.g., Wheeze – Yes; Do you have tightness of chest? – Yes
• Acceleration readings from on-phone sensors
• Tweets reporting pollution level and asthma attacks
• Outdoor pollen and pollution (public health)
• Signals from personal, personal spaces, and community spaces

Health Signal Extraction (Physical-Cyber-Social System): Qualify, Quantify, Enrich
<Wheezing=Yes, time, location>
<ChestTightness=Yes, time, location>
<PollenLevel=Medium, time, location>
<Pollution=Yes, time, location>
<Activity=High, time, location>

Health Signal Understanding (background knowledge, expert knowledge)
<PollenLevel, ChestTightness, Pollution, Activity, Wheezing, RiskCategory>
<2, 1, 1, 3, 1, RiskCategory>
...

Risk Category assigned by doctors
• Well Controlled – continue
• Not Well Controlled – contact nurse
• Poorly Controlled – contact doctor
Health Signal Extraction to Understanding
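The step from extracted signals to a numeric feature vector can be sketched as a simple lookup. The numeric codes below are an assumption: they were chosen so that the slide's example observations reproduce the slide's example vector <2, 1, 1, 3, 1>, and the slides do not state the actual encoding. How the risk category is learned from such vectors is not shown here.

```python
# Feature order follows the vector shown on the slide.
FEATURES = ["PollenLevel", "ChestTightness", "Pollution", "Activity", "Wheezing"]

# Assumed encoding of qualitative values (not given in the slides).
CODES = {"No": 0, "Yes": 1, "Low": 1, "Medium": 2, "High": 3}

# Actions attached to each doctor-assigned risk category.
ACTIONS = {
    "Well Controlled": "continue",
    "Not Well Controlled": "contact nurse",
    "Poorly Controlled": "contact doctor",
}


def to_vector(observations):
    """Turn qualitative observations into the ordered numeric feature vector."""
    return [CODES[observations[name]] for name in FEATURES]


def recommend(risk_category):
    """Look up the action for a risk category."""
    return ACTIONS[risk_category]


vector = to_vector({
    "Wheezing": "Yes",
    "ChestTightness": "Yes",
    "PollenLevel": "Medium",
    "Pollution": "Yes",
    "Activity": "High",
})
```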
kHealth (Asthma) Demo
D. Cameron, G. A. Smith, R. Daniulaityte, A. P. Sheth, D. Dave, L. Chen, G. Anand, R. Carlson, K. Z. Watkins, R. Falck. PREDOSE: A Semantic Web Platform for Drug Abuse Epidemiology using Social Media. Journal of Biomedical Informatics. July 2013 (in press)
Kno.e.sis - Ohio Center of Excellence in Knowledge-enabled Computing
CITAR - Center for Interventions Treatment and Addictions Research
http://wiki.knoesis.org/index.php/PREDOSE
Bridging the gap between researchers and policy makers
Early identification of emerging patterns and trends in abuse
PREDOSE
In 2008, there were 14,800 prescription painkiller deaths*
*http://www.cdc.gov/homeandrecreationalsafety/rxbrief/
• Drug Overdose Problem in US
• 100 people die every day from drug overdoses
• 36,000 drug overdose deaths in 2008
• Close to half were due to prescription drugs
Gil Kerlikowske, Director, ONDCP
Launched May 2011
PREDOSE
PREDOSE
Early Identification and Detection of Trends
Access hard-to-reach Populations
Large Data Sample Sizes
Group Therapy: http://www.thefix.com/content/treatment-options-prison90683
Interviews
Online Surveys
Automatic Data Collection
Not Scalable
Manual Effort
Sample Biases
Epidemiologist
Qualitative Coding
Problems
Computer Scientist
Automate Information Extraction & Content Analysis
[Figure: PREDOSE architecture]
Stage 1. Data Collection: Web Forums, Web Crawler, Informal Text Database, Data Cleaning
Stage 2. Automatic Coding: Information Extraction Module (Entity Identification, Relationship Extraction, Triple Extraction, Sentiment Extraction); Drug Abuse Ontology (Schema) with classes such as Opioid, Cannabinoid, Side Effect, Feeling; Semantic Web Database (Triples/RDF Database)
[Buprenorphine has_slang_term bupe]
[Suboxone subClassOf Buprenorphine]
[Suboxone_Injection CAUSES Nausea]
Stage 3. Data Analysis and Interpretation: PREDOSE Web Application; Temporal Analysis for Trend Detection; Qualitative and Quantitative Analysis of Drug User Knowledge, Attitudes and Behaviors
I was sent home with 5 x 2 mg Suboxones. I also got a bunch of phenobarbital (I took all 180 mg and it didn't do shit except make me a walking zombie for 2 days). I waited 24 hours after my last 2 mg dose of Suboxone and tried injecting 4 mg of the bupe. It gave me a bad headache, for hours, and I almost vomited. I could feel the bupe working but overall the experience sucked.
Of course, junkie that I am, I decided to repeat the experiment. Today, after waiting 48 hours after my last bunk 4 mg injection, I injected 2 mg. There wasn't really any rush to speak of, but after 5 minutes I started to feel pretty damn good. So I injected another 1 mg. That was about half an hour ago. I feel great now.
Codes | Triples (subject-predicate-object)
Suboxone used by injection, negative experience | Suboxone injection-causes-Cephalalgia
Suboxone used by injection, amount | Suboxone injection-dosage amount-2mg
Suboxone used by injection, positive experience | Suboxone injection-has_side_effect-Euphoria
Diverse data types: entities, triples, relationships, sentiments, dosage, interval, pronoun, route of administration
Entity Identification
Buprenorphine has_slang_term bupe
Buprenorphine has_slang_term bupey
Suboxone subClassOf Buprenorphine
Subutex subClassOf Buprenorphine
Drug Abuse Ontology (DAO): 83 Classes, 37 Properties
33:1 Buprenorphine, 24:1 Loperamide
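Entity identification with ontology-backed slang terms can be sketched as a lexicon lookup. This is a deliberately naive illustration, not the PREDOSE implementation: the triples are the ones shown on the slides, while the tokenizer, the plural handling, and the function names are assumptions made for the example.

```python
import re

# Triples taken from the slides (subject, predicate, object).
TRIPLES = [
    ("Buprenorphine", "has_slang_term", "bupe"),
    ("Buprenorphine", "has_slang_term", "bupey"),
    ("Suboxone", "subClassOf", "Buprenorphine"),
    ("Subutex", "subClassOf", "Buprenorphine"),
]


def build_lexicon(triples):
    """Map lower-cased surface forms (names and slang) to ontology concepts."""
    lexicon = {}
    for subject, predicate, obj in triples:
        lexicon.setdefault(subject.lower(), subject)
        if predicate == "has_slang_term":
            lexicon[obj.lower()] = subject
        elif predicate == "subClassOf":
            lexicon.setdefault(obj.lower(), obj)
    return lexicon


def identify_entities(text, lexicon):
    """Return (token, concept) pairs; tries the token and a crude singular form."""
    found = []
    for token in re.findall(r"[a-z]+", text.lower()):
        for candidate in (token, token.rstrip("s")):
            if candidate in lexicon:
                found.append((token, lexicon[candidate]))
                break
    return found


lexicon = build_lexicon(TRIPLES)
entities = identify_entities(
    "I waited 24 hours after my last 2 mg dose of Suboxone "
    "and tried injecting 4 mg of the bupe.",
    lexicon,
)
```

Mapping "bupe" to Buprenorphine is what lets informal forum language be counted against the same ontology concept as the formal drug name.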
Sentiment Extraction
+ve: feel pretty damn good; feel great
-ve: experience sucked; didn’t do shit; bad headache
Ontology, Lexicon, Lexico-ontology, Rule-based Grammar
ENTITIES: Suboxone, Kratom, Heroin
TRIPLES: Suboxone-CAUSE-Cephalalgia
EMOTION: disgusted, amazed, irritated
INTENSITY: more than, a, few of
PRONOUN: I, me, mine, my
SENTIMENT: Im glad, turn out bad, weird
DRUG-FORM: ointment, tablet, pill, film
ROUTE OF ADM: smoke, inject, snort, sniff
SIDE EFFECT: itching, blisters, flushing, shaking hands, difficulty breathing
DOSAGE: <AMT><UNIT> (e.g. 5mg, 2-3 tabs)
FREQUENCY: <AMT><FREQ_IND><PERIOD> (e.g. 5 times a week)
INTERVAL: <PERIOD_IND><PERIOD> (e.g. several years)
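The three rule-based grammar patterns can be approximated with regular expressions. This is a rough sketch under stated assumptions: the unit, frequency, and period vocabularies below are small illustrative lists, not the grammar actually used in PREDOSE.

```python
import re

# Assumed vocabulary for illustration only.
UNIT = r"(?:mg|g|ml|tabs?|pills?)"

# DOSAGE: <AMT><UNIT>, e.g. "5mg", "2-3 tabs"
DOSAGE = re.compile(rf"\b(\d+(?:\.\d+)?(?:-\d+)?)\s?({UNIT})\b", re.I)

# FREQUENCY: <AMT><FREQ_IND><PERIOD>, e.g. "5 times a week"
FREQUENCY = re.compile(r"\b(\d+)\s+times\s+(?:a|per)\s+(day|week|month|year)\b", re.I)

# INTERVAL: <PERIOD_IND><PERIOD>, e.g. "several years"
INTERVAL = re.compile(
    r"\b(several|many|few|\d+)\s+(hours?|days?|weeks?|months?|years?)\b", re.I
)


def extract(text):
    """Return all dosage, frequency, and interval matches found in the text."""
    return {
        "dosage": DOSAGE.findall(text),
        "frequency": FREQUENCY.findall(text),
        "interval": INTERVAL.findall(text),
    }


result = extract("5mg, 2-3 tabs, 5 times a week, several years")
```

Each match is returned as a tuple of its captured parts, so the amount and the unit or period can be stored separately.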
PREDOSE: Smarter Data through Shared Context and Data Integration
Loperamide is used to self-medicate for opioid withdrawal symptoms
PREDOSE: Loperamide-Withdrawal Discovery
Social Health Signals
http://www.internetlivestats.com/internet-users/
Internet users worldwide: around 3 billion (40% of the world population)
Internet users in the US: around 300 million (87% of the US population)
• Online health resources
  – Are easily accessible
  – Help to obtain medical information quickly and conveniently
  – Can help non-experts make more informed decisions
  – Play a vital role in improving health literacy
Social Health Signals
• With the growing availability of online health resources, consumers are increasingly using the Internet to seek health related information
• Most queries are initiated in search engines
According to a 2013 Pew Survey*, one in three American adults has gone online to find information about a specific medical condition.
*Fox S, Duggan M. Pew Internet & American Life Project. 2013. Health online 2013
Social Health Signals
Social Health Signals - Motivation
• Analyzing health search logs
  – Helps to understand population-level health information needs
  – Shows how users formulate search queries (“expression of information need”)
  – Offers potentially larger cohorts of real users and their behaviors, e.g., querying behaviors
• Such knowledge can be applied
  – to improve the health search experience
  – to develop next-generation knowledge and content delivery systems
Social Health Signals - Studies
• Online information seeking: personal computers vs. smart devices
• What information about cardiovascular disease do people search for?
Social Health Signals - Studies
• Comparative analysis of online health information seeking for chronic diseases: cardiovascular diseases, arthritis, cancer, diabetes
• Analyzing temporal patterns of online health information seeking
Social Health Signals - Studies
• Analyzing online information seeking for “Food and Diet” in the context of health
• Identification of users’ intent in health information seeking
• Using background knowledge to develop a rule-based classification approach
  – Using UMLS MetaMap, based on UMLS concepts and semantic types
  – To categorize CVD search queries into 14 “consumer oriented” health categories
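The rule-based classification idea can be sketched as two lookups: terms to semantic types, then semantic types to categories. Everything concrete here is an assumption made for illustration: the toy term dictionary stands in for MetaMap (which is not called), the category names are hypothetical and are not the 14 categories from the study, and matching is naive substring search.

```python
# Toy stand-in for a concept annotator: term -> UMLS semantic type abbreviation.
TERM_TO_SEMTYPE = {
    "chest pain": "sosy",    # Sign or Symptom
    "aspirin": "phsu",       # Pharmacologic Substance
    "heart attack": "dsyn",  # Disease or Syndrome
    "salt": "food",          # Food
}

# Hypothetical rules, applied in order: semantic type -> consumer-oriented category.
RULES = [
    ("phsu", "Medications"),
    ("sosy", "Symptoms"),
    ("food", "Food and Diet"),
    ("dsyn", "Diseases and Conditions"),
]


def classify(query):
    """Return the categories whose semantic type appears in the query."""
    text = query.lower()
    semtypes = {st for term, st in TERM_TO_SEMTYPE.items() if term in text}
    return [category for semtype, category in RULES if semtype in semtypes]


labels = classify("Is aspirin good after a heart attack")
```

A query can receive more than one category, which reflects that a single search often mixes several kinds of information need.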
Research Problem
eDrugTrends
• eDrugTrends is a software platform developed to semi-automate the processing and visualization of thematic, sentiment, spatio-temporal, and social network dimensions on cannabis and synthetic cannabinoid use.
• It is built on top of our existing analytics platforms, Twitris and PREDOSE.
eDrugTrends - Significance
• eDrugTrends advances the field’s technological and methodological capabilities to harness social media for drug abuse surveillance research.
• eDrugTrends informs the field about new trends in cannabis and synthetic cannabinoid use.
eDrugTrends – Preliminary Study
• We studied the differences in the volume of hash oil (a form of cannabis) related tweets across varying cannabis legalization policies.
• We studied attitudes about the use of hash oil products.
eDrugTrends – Data Set
• ~18,000 tweets in early October from ~14,500 users.
• 20% contain identifiable state-level geolocations.
• Examples:
If you smoke spice and you live in a Weed Legal state.... You are trash
Tried my first dab Tuesday night. Best sleep I've had in a while. Too bad dabs are too expensive for me.
I used to smoke k2 all the time when my bestfriend was on papers, then I almost died n never touched it again.
eDrugTrends – Early Findings
• The volume of tweets related to hash oil is highest in the states that have legalized medical and recreational use of cannabis.
• Users in such states have a highly positive attitude towards cannabis use.
• These findings will help to develop intervention and policy responses.
Thank You
Visit Us @ www.knoesis.org
with additional background at http://knoesis.org/amit/hcls
Ohio Center of Excellence in Knowledge-enabled Computing - An Ohio Center of Excellence in BioHealth Innovation
Wright State University
Amit Sheth’s PhD students
Ashutosh Jadhav
Hemant Purohit
Vinh Nguyen
Lu Chen
Pavan Kapanipathi
Pramod Anantharam
Sujan Perera
Alan Smith
Pramod Koneru
Maryam Panahiazar
Sarasi Lalithsena
Cory Henson
Kalpa Gunaratna
Delroy Cameron
Sanjaya Wijeratne
Wenbo Wang
Kno.e.sis in 2012 = ~100 researchers (15 faculty, ~50 PhD students)