Healthcare innovations at Kno.e.sis

Healthcare Innovations at Kno.e.sis
Presentation to the Boonshoft School of Medicine Executive Committee, July 10, 2014
Amit Sheth
Ohio Center of Excellence in Knowledge-enabled Computing (Kno.e.sis) Wright State University, USA

2
• Among the top universities in the world in World Wide Web research (cf. 10-year impact, Microsoft Academic Search: among the top 10 in June 2014)
• Largest academic group in the US in Semantic Web + Social/Sensor Webs, Mobile/Cloud/Cognitive Computing, Big Data, IoT, Health/Clinical & Biomedicine Applications
• Exceptional student success: internships and jobs at top salary (IBM Watson/Research, MSR, Amazon, CISCO, Oracle, Yahoo!, Samsung, research universities, NLM, startups )
• 100 researchers, including 15 world-class faculty (>3K citations/faculty) and ~45 PhD students, practically all funded
• Extensive research for largely multidisciplinary projects; world class resources; industry sponsorships/collaborations (Google, IBM, …)

3
Amit Sheth’s PhD students
Ashutosh Jadhav
Hemant Purohit
Vinh Nguyen
Lu Chen
Pavan Kapanipathi
Pramod Anantharam
Sujan Perera
Alan Smith
Swapnil Soni
Maryam Panahiazar
Sarasi Lalithsena
Shreyansh Bhatt
Kalpa Gunaratna
Delroy Cameron
Sanjaya Wijeratne
Wenbo Wang
Kno.e.sis in 2014 = ~100 researchers (15 faculty, ~50 PhD students)
Special thanks: This presentation covers some of the work of these researchers.

4
• 80% of doctors will eventually become obsolete: Vinod Khosla, VC and co-founder of Sun Microsystems
• “The Doctor is (Always) In: Reinventing the Doctor-Patient Relationship for the 21st Century” [Dr. J. Shlain]. More data is generated under patient control and outside the clinical system. Patient empowerment, reimbursement changes and AHA.
• #dHealth and #IoT are the two hottest hashtags at CES and SXSW
Healthcare is changing way too fast

5
The Patient of the Future
MIT Technology Review, 2012
http://www.technologyreview.com/featuredstory/426968/the-patient-of-the-future/

6
Collaborators

7
Healthcare Innovation at Kno.e.sis
(with subset of applications)

8
kHealth: Knowledge empowered personalized digital mHealth
With applications to: ADHF, GI, Asthma, [Geriatrics]
Contact: Prof. Amit Sheth

10
Providing actionable information in a timely manner is crucial to avoid information overload or fatigue
Sleep data
Community data
Personal schedule
Activity data
Personal health records
Data Overload for Patients/health aficionados

11
Weather Application
Detection of events, such as wheezing sound, indoor
temperature, humidity, dust, and CO2 level
Asthma Healthcare Application
Action in the Physical World
Close the window at home during day to avoid CO2 inflow,
to avoid asthma attacks at night
Public Health
Personal
Population Level
‘FOR human’: Improving Human Experience

12
Making sense of sensor data with

13
Through physical monitoring and analysis, our cellphones could act as an early warning system to detect serious health conditions, and provide actionable information
canary in a coal mine
knowledge-enabled healthcare
kHealth

14
kHealth to Manage ADHF (Acute Decompensated Heart Failure)

15
25 million: people in the U.S. diagnosed with asthma (7 million are children) [1]
300 million: people suffering from asthma worldwide [2]
$50 billion: spent on asthma alone in a year [2]
155,000: hospital admissions in 2006 [3]
593,000: emergency department visits in 2006 [3]
[1] http://www.nhlbi.nih.gov/health/health-topics/topics/asthma/
[2] http://www.lung.org/lung-disease/asthma/resources/facts-and-figures/asthma-in-adults.html
[3] Akinbami et al. (2009). Status of childhood asthma in the United States, 1980–2007. Pediatrics, 123(Supplement 3), S131-S145.
Asthma

16
Asthma is a multifactorial disease with health signals spanning personal, public health, and population levels.
Real-time health signals from personal level (e.g., Wheezometer, NO in breath, accelerometer, microphone), public health (e.g., CDC, Hospital EMR), and population level (e.g., pollen level, CO2) arriving continuously in fine grained samples potentially with missing information and uneven sampling frequencies.
Variety, Volume, Veracity, Velocity
Value
• Can we detect the asthma severity level?
• Can we characterize the asthma control level?
• What risk factors influence asthma control?
• What is the contribution of each risk factor?
Semantics: understanding relationships between health signals and asthma attacks for providing actionable information
Why Big Data to Smart Data? Healthcare example

17
ICS = inhaled corticosteroid, LABA = inhaled long-acting beta2-agonist, SABA = inhaled short-acting beta2-agonist;
*consider referral to specialist
Asthma Control and Actionable Information
Sensors and their observations for understanding asthma
Personal, Public Health, and Population Level Signals for Monitoring Asthma

18
Columns: Health Score at Discharge; Non-compliance; Poor economic status; No living assistance; Vulnerability Score
Health Score at Discharge | Vulnerability Score
Well Controlled | Low
Well Controlled | Very Low
Not Well Controlled | High
Not Well Controlled | Medium
Poor Controlled | Very High
Poor Controlled | High
Estimation of readmission vulnerability based on the personal health score
Personal Health Score and Vulnerability Score

19
Population Level
Personal
Wheeze – Yes
Do you have tightness of chest? – Yes
Observations Physical-Cyber-Social System Health Signal Extraction Health Signal Understanding
<Wheezing=Yes, time, location>
<ChestTightness=Yes, time, location>
<PollenLevel=Medium, time, location>
<Pollution=Yes, time, location>
<Activity=High, time, location>
Wheezing
ChestTightness
PollenLevel
Pollution
Activity
Wheezing
ChestTightness
PollenLevel
Pollution
Activity
RiskCategory
<PollenLevel, ChestTightness, Pollution, Activity, Wheezing, RiskCategory>
<2, 1, 1, 3, 1, RiskCategory>
<2, 1, 1, 3, 1, RiskCategory>
<2, 1, 1, 3, 1, RiskCategory>
<2, 1, 1, 3, 1, RiskCategory>
...
Expert Knowledge
Background Knowledge
tweet reporting pollution level and asthma attacks
Acceleration readings from on-phone sensors
Sensor and personal observations
Signals from personal, personal spaces, and community spaces
Risk Category assigned by doctors
Qualify
Quantify
Enrich
Outdoor pollen and pollution
Public Health
Well Controlled – continue
Not Well Controlled – contact nurse
Poor Controlled – contact doctor
Health Signal Extraction to Understanding
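The step from qualified observations to a numeric feature vector, and from a risk category to an action, could be sketched as follows. This is a minimal Python sketch, not the kHealth implementation: the level codes in `LEVELS` and the helper names are assumptions, chosen so that the example reproduces the <2, 1, 1, 3, 1> vector shown on the slide. The risk category itself is assigned by doctors, so it is treated here as an input.

```python
# Hypothetical ordinal codes; the slide shows only the example vector
# <2, 1, 1, 3, 1>, not the full coding scheme.
LEVELS = {"No": 0, "Yes": 1, "Low": 1, "Medium": 2, "High": 3}

# Feature order as listed on the slide.
FEATURES = ["PollenLevel", "ChestTightness", "Pollution", "Activity", "Wheezing"]

# Action per risk category, as listed on the slide.
ACTIONS = {
    "Well Controlled": "continue",
    "Not Well Controlled": "contact nurse",
    "Poor Controlled": "contact doctor",
}


def to_vector(observations):
    """Turn {signal name: qualitative level} into a numeric feature vector."""
    return [LEVELS[observations[name]] for name in FEATURES]


def action_for(risk_category):
    """Look up the recommended action for a doctor-assigned risk category."""
    return ACTIONS[risk_category]


observations = {
    "Wheezing": "Yes",
    "ChestTightness": "Yes",
    "PollenLevel": "Medium",
    "Pollution": "Yes",
    "Activity": "High",
}
vector = to_vector(observations)
print(vector, action_for("Not Well Controlled"))
```

Vectors of this form, paired with doctor-assigned risk categories, are the kind of labeled rows a classifier could later be trained on.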

20
Social streams have been used to extract many near-real-time events
Twitter provides access to rich signals, but tweets are noisy, informal, inconsistently capitalized, redundant, and lacking in context
We formalize the event extraction from tweets as a sequence labeling problem
How do we know the event phrases and who creates the training set? (manual creation is ruled out)
Now you know why you’re miserable! Very High Alert for B-ALLERGEN Ragweed I-ALLERGEN pollen. B-FACILITY Oklahoma I-FACILITY Allergy I-FACILITY Clinic says it’s an extreme exposure situation
Idea: Background knowledge used to create the training set e.g., typing information becomes the label for a concept
Health Signal Extraction Challenges
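The idea of using background knowledge to create the training set can be illustrated with a small sketch. It assumes a hypothetical gazetteer (`GAZETTEER`) that maps known phrases to their types, and it tags tokens with B-/I-/O labels by longest match. The gazetteer entries and function name are illustrative, not part of the original system.

```python
def bio_label(tokens, gazetteer):
    """Tag tokens with B-/I-/O labels using longest-match gazetteer lookup.

    gazetteer maps a tuple of lowercase words to a type name.
    """
    max_len = max((len(phrase) for phrase in gazetteer), default=0)
    tags = ["O"] * len(tokens)
    i = 0
    while i < len(tokens):
        matched = False
        # Try the longest candidate phrase first.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            key = tuple(t.lower() for t in tokens[i:i + n])
            if key in gazetteer:
                label = gazetteer[key]
                tags[i] = "B-" + label
                for j in range(i + 1, i + n):
                    tags[j] = "I-" + label
                i += n
                matched = True
                break
        if not matched:
            i += 1
    return list(zip(tokens, tags))


# Hypothetical background knowledge: phrase -> type.
GAZETTEER = {
    ("ragweed", "pollen"): "ALLERGEN",
    ("oklahoma", "allergy", "clinic"): "FACILITY",
}

tokens = "Very High Alert for Ragweed pollen . Oklahoma Allergy Clinic says".split()
labeled = bio_label(tokens, GAZETTEER)
for token, tag in labeled:
    print(token, tag)
```

Sequences labeled this way could then serve as automatically generated training data for a sequence labeler, avoiding manual annotation.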

21
intelligence at the edge
Approach 1: Send all sensor observations to the cloud for processing
Approach 2: downscale semantic processing so that each device is capable of machine perception
Henson et al. “An Efficient Bit Vector Approach to Semantics-based Machine Perception in Resource-Constrained Devices,” ISWC 2012.

22
Use bit vector encodings and their operations to encode prior knowledge and execute semantic reasoning
0101100011010011110010101100011011011010110001101001111001010110001101011000110100111
Efficient execution of machine perception
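To make the bit vector idea concrete, here is a toy sketch, assuming a small hypothetical knowledge base. It is not the encoding from Henson et al.; it only shows how prior knowledge can be stored as integers and how an explanation check reduces to a bitwise AND. All property and condition names are illustrative.

```python
# Hypothetical toy vocabulary: each observable property gets one bit position.
PROPERTIES = ["wheezing", "chest_tightness", "cough", "fever"]
BIT = {name: 1 << i for i, name in enumerate(PROPERTIES)}


def encode(names):
    """Encode a collection of property names as an integer bit vector."""
    vec = 0
    for name in names:
        vec |= BIT[name]
    return vec


# Prior knowledge: each candidate condition is stored as the bit vector
# of the properties it can explain (illustrative values only).
KNOWLEDGE = {
    "asthma": encode(["wheezing", "chest_tightness", "cough"]),
    "flu": encode(["cough", "fever"]),
    "bronchitis": encode(["wheezing", "cough", "fever"]),
}


def explanations(observed):
    """Return the conditions whose vector covers every observed property."""
    obs = encode(observed)
    return sorted(c for c, vec in KNOWLEDGE.items() if vec & obs == obs)


candidates = explanations(["wheezing", "cough"])
print(candidates)
```

Each check is a single AND and comparison per condition, which is the kind of operation that stays cheap on a resource-constrained device.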

23
O(n^3) < x < O(n^4) reduced to O(n)
Efficiency Improvement
• Problem size increased from 10’s to 1000’s of nodes
• Time reduced from minutes to milliseconds
• Complexity growth reduced from polynomial to linear
Evaluation on a mobile device

24
1 Translate low-level data to high-level knowledge: Machine perception can be used to convert low-level sensory signals into high-level knowledge useful for decision making
2 Prior knowledge is the key to perception: Using SW technologies, machine perception can be formalized and integrated with prior knowledge on the Web
3 Intelligence at the edge: By downscaling semantic inference, machine perception can execute efficiently on resource-constrained devices
Semantic Perception for smarter analysis: 3 ideas to take away

25
PREDOSE: Social media analysis driven epidemiology
Application: Prescription drug abuse and beyond
Contact: Delroy Cameron

26
D. Cameron, G. A. Smith, R. Daniulaityte, A. P. Sheth, D. Dave, L. Chen, G. Anand, R. Carlson, K. Z. Watkins, R. Falck. PREDOSE: A Semantic Web Platform for Drug Abuse Epidemiology using Social Media. Journal of Biomedical Informatics. July 2013 (in press)
Kno.e.sis - Ohio Center of Excellence in Knowledge-enabled Computing
CITAR - Center for Interventions Treatment and Addictions Research
http://wiki.knoesis.org/index.php/PREDOSE
Bridging the gap between researcher and policy makers
Early identification of emerging patterns and trends in abuse
PREDOSE: Prescription Drug abuse Online Surveillance and Epidemiology

27
In 2008, there were 14,800 prescription painkiller deaths*
*http://www.cdc.gov/homeandrecreationalsafety/rxbrief/
• Drug Overdose Problem in US
– 100 people die every day from drug overdoses
– 36,000 drug overdose deaths in 2008
– Close to half were due to prescription drugs
Gil Kerlikowske, Director, ONDCP
Launched May 2011
PREDOSE: Prescription Drug abuse Online Surveillance and Epidemiology

28
Early Identification and Detection of Trends
Access hard-to-reach Populations
Large Data Sample Sizes
Group Therapy: http://www.thefix.com/content/treatment-options-prison90683
Interviews
Online Surveys
Automatic Data Collection
Not Scalable
Manual Effort
Sample Biases
Epidemiologist
Qualitative Coding
Problems
Computer Scientist
Automate Information Extraction & Content Analysis
PREDOSE: Bringing Epidemiologists and Computer Scientists together


I was sent home with 5 x 2 mg Suboxones. I also got a bunch of phenobarbital (I took all 180 mg and it didn't do shit except make me a walking zombie for 2 days). I waited 24 hours after my last 2 mg dose of Suboxone and tried injecting 4 mg of the bupe. It gave me a bad headache, for hours, and I almost vomited. I could feel the bupe working but overall the experience sucked.
Of course, junkie that I am, I decided to repeat the experiment. Today, after waiting 48 hours after my last bunk 4 mg injection, I injected 2 mg. There wasn't really any rush to speak of, but after 5 minutes I started to feel pretty damn good. So I injected another 1 mg. That was about half an hour ago. I feel great now.
Codes | Triples (subject-predicate-object)
Suboxone used by injection, negative experience | Suboxone injection-causes-Cephalalgia
Suboxone used by injection, amount | Suboxone injection-dosage amount-2mg
Suboxone used by injection, positive experience | Suboxone injection-has_side_effect-Euphoria
experience sucked
feel pretty damn good
didn’t do shit
feel great
Sentiment Extraction
bad headache
+ve
-ve
Triples
DOSAGE PRONOUN
INTERVAL Route of Admin.
RELATIONSHIPS SENTIMENTS
DIVERSE DATA TYPES
ENTITIES
Buprenorphine
subClassOf
bupe
Entity Identification
has_slang_term
Suboxone
Subutex
subClassOf
bupey
has_slang_term
Drug Abuse Ontology (DAO)
83 Classes
37 Properties
33:1 Buprenorphine
24:1 Loperamide

31
Ontology Lexicon Lexico-ontology Rule-based Grammar
ENTITIES: Suboxone, Kratom, Herion
TRIPLES: Suboxone-CAUSE-Cephalalgia
EMOTION: disgusted, amazed, irritated
INTENSITY: more than, a, few of
PRONOUN: I, me, mine, my
SENTIMENT: Im glad, turn out bad, weird
DRUG-FORM: ointment, tablet, pill, film
ROUTE OF ADM: smoke, inject, snort, sniff
SIDE EFFECT: Itching, blisters, flushing, shaking hands, difficulty breathing
DOSAGE, FREQUENCY, INTERVAL:
DOSAGE: <AMT><UNIT> (e.g. 5mg, 2-3 tabs)
FREQ: <AMT><FREQ_IND><PERIOD> (e.g. 5 times a week)
INTERVAL: <PERIOD_IND><PERIOD> (e.g. several years)
PREDOSE: Smarter Data through Shared Context and Data Integration
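The rule-based grammar for DOSAGE, FREQ and INTERVAL on this slide can be approximated with regular expressions. The sketch below is an assumption-laden illustration rather than the PREDOSE grammar itself: the unit list, period words and quantifier words are hypothetical and far from complete.

```python
import re

NUM = r"\d+(?:\.\d+)?"

# Hypothetical approximations of the slide's grammar rules.
PATTERNS = {
    # DOSAGE: <AMT><UNIT>, e.g. "5mg", "2-3 tabs"
    "DOSAGE": re.compile(
        rf"\b{NUM}(?:\s*-\s*{NUM})?\s*(?:mg|g|ml|tabs?|pills?)\b", re.I
    ),
    # FREQ: <AMT><FREQ_IND><PERIOD>, e.g. "5 times a week"
    "FREQ": re.compile(
        rf"\b{NUM}\s+times?\s+(?:a|per)\s+(?:day|week|month|year)\b", re.I
    ),
    # INTERVAL: <PERIOD_IND><PERIOD>, e.g. "several years"
    "INTERVAL": re.compile(
        r"\b(?:several|few|many|\d+)\s+(?:hours?|days?|weeks?|months?|years?)\b",
        re.I,
    ),
}


def extract(text):
    """Return every match of each pattern, keyed by data type."""
    return {
        name: [m.group(0) for m in pattern.finditer(text)]
        for name, pattern in PATTERNS.items()
    }


sample = "I took 5mg, sometimes 2-3 tabs, about 5 times a week for several years."
found = extract(sample)
print(found)
```

Regular expressions suit these data types because dosage, frequency and interval follow fairly regular surface patterns, unlike slang entity names, which the slide assigns to the ontology and lexicon instead.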

32
Data Type | Semantic Web Technique | Limitations of Other Approaches: ML/NLP | Limitations of Other Approaches: IR
Entity | Ontology-driven Identification & Normalization | Requires labeled data | Unpredictable term frequencies
Triple | Schema-driven | Difficult to develop language model | Requires entity disambiguation
Sentiment | Ontology-assisted Target Entity Resolution | Inconsistent data for parse trees or rules | Diverse simple & complex slang terms & phrases
PREDOSE: Role of Semantic Web and Ontologies

33
Loperamide is used to self-medicate for opioid withdrawal symptoms
Loperamide-Withdrawal Discovery

34
EMR and clinical text analysis: Intelligence from clinical data
Contact: Sujan Perera

35
• Active Semantic EMR: high quality, low error, faster completion of patient records
• Predicting patient outcomes and advising discharge decisions based on both structured (billing) data and clinical text (unstructured data)
• Deep understanding of clinical text for Computer Assisted Coding for ICD9 and ICD10 and Computerized Document Improvement (commercial products from ezDI)

[Figure: knowledge acquisition loop. Components: Patient Notes, UMLS, Explanation Module, Explained? (Yes/No), Hypothesis Generation, Hypothesis Filtering, Hypothesis with High Confidence]
Semantic Driven Approach for Knowledge Acquisition from EMRs

37
Deep clinical text analysis using semantics enhanced NLP has enabled our industry partner ezDI to develop exciting commercial products: ezCDI (Computerized Document Improvement) and ezCAC (Computer Assisted ICD9/ICD10 Coding)
See: http://ezdi.us
Semantics enhanced NLP

38
• Typical NLP algorithms misclassify linguistic nuances
• Document 1:
– Coronary artery disease listed in the current diagnosis list
– “Send for carotid duplex to rule out carotid artery stenosis given his risk factors and underlying coronary artery disease.” (NLP output says patient does not have coronary artery disease)
• Document 2:
– “Extremities: Warm and dry. No clubbing or cyanosis. No lower extremity edema.”
– “I have advised the patient on the side effect of potential lower extremity edema.” (NLP output says patient has lower extremity edema)
• Document 3:
– “He is not having any symptoms of chest pain or exertional syncope or dizziness.”
– “I advised him that if he experiences chest pain, shortness of breath with exertion or dizziness or syncopal episodes to let us know and we can do appropriate workup.” (NLP output says patient has chest pain, shortness of breath, dizziness, syncopal episodes)
Green: correctly identified entities; Red: misclassified entities
Semantics enhanced NLP

39
Semantics enhanced NLP
• Domain knowledge can be used to resolve misclassifications
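Syncope (Symptom) Is_symptom_of Atrial Fibrillation
Warfarin (Medication) Is_medication_for Atrial Fibrillation
Atenolol (Medication) Is_medication_for Atrial Fibrillation
Aspirin (Medication) Is_medication_for Atrial Fibrillation
• There is strong evidence to suggest that the patient has Atrial Fibrillation.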
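How domain knowledge might override a doubtful NLP assertion can be sketched as a simple evidence count. The relation tables below contain only the relations named on this slide; the `threshold` parameter and the function names are assumptions made for illustration, not the method used in the actual system.

```python
# Fragment of domain knowledge, limited to the relations on this slide.
IS_SYMPTOM_OF = {"Syncope": {"Atrial Fibrillation"}}
IS_MEDICATION_FOR = {
    "Warfarin": {"Atrial Fibrillation"},
    "Atenolol": {"Atrial Fibrillation"},
    "Aspirin": {"Atrial Fibrillation"},
}


def supporting_evidence(disorder, symptoms, medications):
    """List documented symptoms and medications that point to the disorder."""
    evidence = [s for s in symptoms if disorder in IS_SYMPTOM_OF.get(s, ())]
    evidence += [m for m in medications if disorder in IS_MEDICATION_FOR.get(m, ())]
    return evidence


def resolve(disorder, nlp_says_present, symptoms, medications, threshold=2):
    """Accept the disorder if NLP found it, or if enough evidence supports it.

    threshold is a hypothetical cut-off on the number of supporting items.
    """
    if nlp_says_present:
        return True
    return len(supporting_evidence(disorder, symptoms, medications)) >= threshold


evidence = supporting_evidence(
    "Atrial Fibrillation", ["Syncope"], ["Warfarin", "Atenolol", "Aspirin"]
)
decision = resolve(
    "Atrial Fibrillation", False, ["Syncope"], ["Warfarin", "Atenolol", "Aspirin"]
)
print(evidence, decision)
```

A real system would weigh evidence rather than count it, since a drug such as aspirin is prescribed for many conditions.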

40
Raw Text to Knowledge
He is off both Diovan and Lotrel. I am unsure if it is due to underlying renal insufficiency. He has actually been on atenolol alone for his hypertension.
Raw Text
Concepts
Knowledge
Inference
diovan lotrel renal insufficiency atenolol hypertension
diovan
valtuna
valsartan
antihypertensive agent
atenolol
tenomin
atenix
kidney failure
renal insufficiency
kidney disease
disorder
blood pressure disorder
hypertension
systolic hypertension
pulmonary hypertension
Patient taking atenolol for hypertension
Patient has kidney disease
Patient is on antihypertensive drugs
is used to treat
is a
drug
disorder

41
cTAKES
ezNLP
ezKB
<problem value="Asthma" cui="C0004096"/>
<med value="Losartan" code="52175:RXNORM" />
<med value="Spiriva" code="274535:RXNORM" />
<procedure value="EKG" cui="C1623258" />
ezFIND, ezMeasure, ezCDI, ezCAC
www.ezdi.us
ezHealth Platform

42
Online Health Information Seeking
Contact: Ashutosh Jadhav

43
Internet Users in the World
http://www.internetlivestats.com/internet-users/
Around 3 billion people (40% of the world population)
Around 300 million people (87% of the US population)

44
• Online health resources
– Easily accessible
– Help to obtain medical information quickly and conveniently
– Can help non-experts to make more informed decisions
– Play a vital role in improving health literacy
Online Health Information Seeking

45
• With the growing availability of online health resources, consumers are increasingly using the Internet to seek health related information
According to a 2013 Pew Survey*, one in three American adults has gone online to find information about a specific medical condition.
*Fox S, Duggan M. Pew Internet & American Life Project. 2013. Health online 2013
Online Health Information Seeking

46
• One of the most common ways to seek online health information is via Web search engines such as Google, Yahoo! and Bing
According to the Pew Survey, approximately 8 in 10 online health inquiries start at a search engine.
Fox S, Duggan M. Pew Internet & American Life Project. 2013. Health online 2013
Online Health Information Seeking

47
• Analyzing health search logs
– Helps to understand population-level health information needs
– Shows how users formulate search queries (“expression of information need”)
– Offers potentially larger cohorts of real users and their behaviors, e.g. querying behaviors
• Such knowledge can be applied
– to improve the health search experience
– to develop next-generation knowledge and content delivery systems
Motivation

Online Health Information Seeking
Smart Devices
Personal Computers
vs.
Jadhav A et al. “Comparative Analysis of Online Health Queries Originating From Personal Computers and Smart Devices on a Consumer Health Information Portal” Journal of Medical Internet Research 2014;16(7):e160 (Impact factor 3.8)

Desktop
Mobile
Mobile usage takes over
Motivation

• With the recent exponential increase in usage of smart devices, the percentage of people using smart devices to search for health information is also growing rapidly
Motivation

• Experience of online information searching varies depending on the device used
– Smart devices (SDs): mobile, tablets
– Personal computers (PCs): desktop, laptop
• PCs and SDs have distinct characteristics
– Readability, user experience, accessibility, etc.
Motivation

• In order to improve the health information searching process and to be prepared for the technology shift, it is necessary
– to understand how device choice influences online health information seeking
Study Objective

• Data:
– Health search queries
– launched from PCs and SDs
– submitted from Web search engines
– and directed users to Mayo Clinic’s consumer health information portal (MayoClinic.com)
• Data timeframe:
– June 2011 to May 2013
• Data collection tool:
– IBM NetInsight On Demand (Web Analytics tool)
• Dataset size:
– More than 100 million health search queries for both PCs and SDs
Dataset Creation

• For PCs and SDs, we analyzed and compared– Frequently searched health categories
– Types of search queries (keyword-based, Wh-questions, Yes/No questions)
– Structural properties of the queries
• Length of the search queries
• Usage of the search query operators
• Usage of special characters
– Misspellings in the health search queries
– Linguistic characteristics of the queries
Comparative Data Analysis

• The most-searched health categories are ‘Symptoms’ (1 in 3 search queries), ‘Causes’ and ‘Treatments & Drugs’
• One of the least-searched health categories is ‘Prevention’
• The distribution of search queries across health categories differs with the device used for search
• Search queries from both PCs and SDs follow a similar pattern for the distribution of queries between health categories
Intent Mining for Health Information Seeking

Health queries are predominately formulated using keywords (~85%); followed by Wh and Yes/No questions
Users ask more health questions from SDs compared to those from PCs
In the health search queries, users ask more:
– “what” and “how” questions => descriptive information need
– “can”, “is” and “does” questions => factual information need
Intent Expression: Search Query Type
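The three query types compared in this study (keyword-based, Wh-questions, Yes/No questions) could be separated with a first-word heuristic like the one below. This is only an illustrative sketch: the word lists and the `query_type` function are assumptions, and the study's actual classification rules are not given on the slides.

```python
# Hypothetical word lists; the slides name "what", "how", "can", "is", "does".
WH_WORDS = {"what", "how", "why", "when", "where", "which", "who"}
YES_NO_STARTERS = {"can", "is", "does", "do", "are", "should", "will"}


def query_type(query):
    """Classify a search query by its first word."""
    words = query.lower().split()
    if not words:
        return "keyword"
    first = words[0]
    if first in WH_WORDS:
        return "wh-question"
    if first in YES_NO_STARTERS:
        return "yes/no-question"
    return "keyword"


queries = [
    "what causes high blood pressure",
    "can tylenol raise blood pressure",
    "heart palpitations with headache",
]
types = [query_type(q) for q in queries]
print(types)
```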

Average length of the queries from SDs (3.29 words and 18.86 characters) is a bit longer than that of PCs (2.9 words and 17.61 characters)
Health queries tend to be longer than general search queries, indicating users’ interest in more specific information
Intent Expression: Search Query Length

Online Health Information Seeking for Cardiovascular Diseases
Jadhav A et al. “What Information about Cardiovascular Diseases do People Search Online?”, 25th European Medical Informatics Conference (MIE 2014), Istanbul, Turkey, August 31 - Sept 3, 2014.
Jadhav A et al. "Online Information Searching for Cardiovascular Diseases: An Analysis of Mayo Clinic Search Query Logs” AMIA 2014 Annual Symposium, Washington DC, Nov 15-19, 2014

59
• According to CDC, in the United States
– CVD is one of the most common chronic diseases
– CVD is the leading cause of death (1 in every 4 deaths)
• CVD is common across all socioeconomic groups and demographics
• Most of the CVDs require lifelong care and the patient is in charge of managing the disease through self-care
• Online health resources are “significant information supplement” for the patients with chronic conditions
Motivation

60
• Although chronic diseases affect a large population, very few prior studies have investigated online health information searching exclusively for chronic diseases, and especially for CVD.
• In this study, we address this knowledge gap in the community
– by performing population-level intent mining for online health information seeking
Motivation

61
• Data:
– CVD-related search queries
– submitted from Web search engines
– and directed users to Mayo Clinic’s consumer health information portal (MayoClinic.com)
• Data timeframe:
– September 2011 to August 2013
• Data collection tool:
– IBM NetInsight On Demand (Web Analytics tool)
• Dataset size:
– 10 million CVD-related search queries, which is a significantly large dataset for a single class of diseases.
Dataset Creation

62
• Identification of users intent for health information seeking
• For example:
Search Query | Health Category
Heart palpitations with headache | Symptoms
Tylenol raise blood pressure | Medication, Vital sign
Pump for pulmonary hypertension | Medical device, Disease
Red wine heart disease | Food, Disease
Bypass surgery | Treatment
Research Problem

63
• Using background knowledge to develop a rule-based classification approach
– Using UMLS MetaMap and based on UMLS concepts and semantic types
– To categorize CVD search queries into 14 “consumer oriented” health categories
– Precision: 88.42% , Recall: 86.07% and F-Score: 0.8723
Intent Mining for Online Health Information Seeking
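A rule-based categorization of this kind can be sketched as a lookup from UMLS semantic types to consumer-oriented categories. The sketch below does not call MetaMap: the semantic types per query are hand-written stand-ins for its output, and the mapping table is a hypothetical subset, not the study's 14-category rule set. The last function simply recomputes the F-score from the precision and recall reported above.

```python
# Hypothetical subset of rules: UMLS semantic type abbreviation -> category.
SEMTYPE_TO_CATEGORY = {
    "sosy": "Symptoms",        # Sign or Symptom
    "dsyn": "Disease",         # Disease or Syndrome
    "phsu": "Medication",      # Pharmacologic Substance
    "medd": "Medical device",  # Medical Device
    "food": "Food",            # Food
    "topp": "Treatment",       # Therapeutic or Preventive Procedure
}


def categorize(semantic_types):
    """Map a query's semantic types to zero, one or more health categories."""
    return sorted(
        {SEMTYPE_TO_CATEGORY[t] for t in semantic_types if t in SEMTYPE_TO_CATEGORY}
    )


def f_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Hand-annotated stand-ins for MetaMap output (assumed, not real tool output).
annotated = {
    "red wine heart disease": ["food", "dsyn"],
    "bypass surgery": ["topp"],
}
categories = {query: categorize(types) for query, types in annotated.items()}
print(categories)
print(round(f_score(0.8842, 0.8607), 4))
```

Because a query may carry several semantic types, the function returns a list, which matches the observation that a query can fall into zero, one or more categories.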

64
Methods Overview

Intent Mining for Health Information Seeking:
Association Rules for Categorization

• One in every two searches is related to either ‘Diseases and Conditions’ or ‘Vital signs’.
• Other popular health categories that users search for include ‘Symptoms’, ‘Living with’, ‘Treatments’, ‘Food and Diet’ and ‘Causes’.
• Although CVD can be prevented with some lifestyle and diet changes, very few online health information seekers (OHISs) search for CVD ‘Prevention’.
Intent Mining for Health Information Seeking:
Categorization Results

• A search query can be categorized into zero, one or more health categories
• Using our categorization approach, we categorized 92% of the 10 million CVD related queries into at least one health category
• Most of the queries (around 88%) are categorized into either one or two categories
• Very few CVD queries (4.28%) are categorized into 3 or more categories.
Intent Mining for Health Information Seeking:
Categorization Results

• Most of the top search queries are related to major CVD diseases and conditions.
• At the same time, queries about blood pressure (high/low) and heart rate are also searched frequently
Top CVD Search Queries

• Average search query length for CVD is 3.88 words and 22.22 characters
• Around 80% of the CVD search queries have 3 or more words.
• The analysis implies that CVD search queries are longer than previously reported non-medical as well as medical queries
• Longer search queries also denote users’ interest in more specific information about the disease; subsequently users use more words to narrow down to a particular health topic.
Intent Expression: Search Query Length

• Users predominantly formulate search queries using keywords (80%), though queries with Wh-Questions are also significant
• Few queries (2.5%) are formulated as Yes/No type questions
• In Wh-questions, OHISs mostly use “How” and “What” in the search queries and both of them generally signify that more descriptive information is needed
• Yes/No questions are usually used to check some factual information. In Yes/No Questions, OHISs more often start the search queries with “does” “can” and “is”
Intent Expression: Search Query Types

Comparative Analysis of Online Health Information Seeking for
Chronic Diseases
Cardiovascular Diseases
Arthritis
Cancer Diabetes

Analyzing Temporal Patterns in Online Health Information Seeking

Analyzing online information seeking for “Food and Diet” in the
context of “Health”

75
• Every day, millions of health-related tweets are shared
• Most of these tweets are highly personal and contextual
• Only around 12% of posts are informative*
• Keyword-based search doesn't help
• User has to manually identify informative tweets
How to automate the identification of informative content?
Problem: Identifying Signals from Noise

76
Present high-quality, reliable and informative health-related information shared over social media by understanding:
Who | who shared the information? (social network user) | People Analysis
share what | what content is shared? (social media post) | Content Analysis
when | when was the post generated? | Temporal Analysis
in what context | what is the topic of the message? | Semantic Analysis
on which channel | which website does the social media post point to? | Reliability Analysis
with what social effect | how many retweets, Facebook likes/shares, and comments does the post have? | Popularity Analysis
Social Health Signals

77
Search and Explore
Top health news
Faceted search (by health topics)
Social Health Signals

78
Ongoing projects

79
• Stress, obesity/lifestyle disease, chronic diseases
• Food and diet in the health context
• Keeping elderly at home as long as possible
• Clinical research – developing blood test for esophageal cancer detection
On the drawing board

80
• Kno.e.sis is a truly multidisciplinary, pan-University Center of Excellence where world-class technology/computing expertise comes together with clinical research and applications in health, fitness & wellbeing
• Major theme: personalized digital health, patient empowerment, informed patients, epidemiology
• More is covered in my talk on Semantic Data enabling Personalized Digital Health
Take Away

81
http://knoesis.org
http://knoesis.org/vision
http://knoesis.org/amit/hcls
Kno.e.sis – Ohio Center of Excellence in Knowledge-enabled ComputingWright State University, Dayton, Ohio, USA
thank you, and please visit us at

82
1. Henson C, Thirunarayan K, Sheth A. An Efficient Bit Vector Approach to Semantics-based Machine Perception in Resource-Constrained Devices 11th International Semantic Web Conference (ISWC 2012), Boston, Massachusetts, USA, November 11-15, 2012
2. Henson C, Sheth A, Thirunarayan K. Semantic Perception: Converting Sensory Observations to Abstractions IEEE Internet Computing, vol. 16, no. 2, pp. 26-34, Mar./Apr. 2012, doi:10.1109/MIC.2012.20
3. Henson C, Thirunarayan K, Sheth A. An Ontological Approach to Focusing Attention and Enhancing Machine Perception on the Web. Applied Ontology, vol. 6(4), pp.345-376, 2011.
4. Perera S, Sheth A, Thirunarayan K, Nair S and Shah N. Challenges in Understanding Clinical Notes: Why NLP Engines Fall Short and Where Background Knowledge Can Help. International Workshop on Data management & Analytics for healthcaRE (DARE) at ACM Conference of Information and Knowledge Management (CIKM), pp. 21-26, Burlingame, USA, Nov 1, 2013.
5. Perera S, Henson C, Thirunarayan K, Sheth A, Nair S. Semantics Driven Approach for Knowledge Acquisition From EMRs. IEEE Journal of Biomedical and Health Informatics, vol.18, no.2, pp.515-524, March 2014, doi: 10.1109/JBHI.2013.2282125, PMID: 24058038
Selected References

83
6. Cameron D, Smith GA, Daniulaityte R, Sheth A et al. PREDOSE: A Semantic Web Platform for Drug Abuse Epidemiology using Social Media. Journal of Biomedical Informatics. 46(6): 985-997, 2013. PMID: 23892295
7. Cameron D, Bodenreider O, Yalamanchili H, Danh T et al. A Graph-Based Recovery and Decomposition of Swanson's Hypothesis using Semantic Predications. Journal of Biomedical Informatics 46(2): 238-251, 2013.
8. Jadhav A, Sheth A, Pathak J. Analysis of Online Information Searching for Cardiovascular Diseases on a Consumer Health Information Portal. American Medical Informatics Association (AMIA) Annual Symposium 2014, Washington DC, November 15-19, 2014
9. Jadhav A, Andrews D, Fiksdal A, Kumbamu A, McCormick JB, et al. Comparative Analysis of Online Health Queries Originating From Personal Computers and Smart Devices on a Consumer Health Information Portal. J Med Internet Res 2014;16(7):e160, PMID: 25000537
10. Fiksdal A, Kumbamu A, Jadhav A, Nelsen L, Pathak J, McCormick JB. Evaluating the Process of Online Health Information Searching: A Qualitative Approach to Exploring Consumer Perspectives. in press at J Med Internet Res 2014
11. Jadhav A, Wu S, Sheth A, Pathak J. Online Information Seeking for Cardiovascular Diseases: A Case Study from Mayo Clinic. 25th European Medical Informatics Conference (MIE 2014), Istanbul, Turkey, August 31 - Sept 3, 2014
Selected References