Mining primary care EMRs
Understanding patient experiences from mining primary care data
Centre for Health Informatics
Filippo Galgani
Adam Dunn
Margaret Williamson
Malcolm Gillies
Guy Tsafnat
General Practice EMRs
• Aim: measure quality of care for a range of conditions in a diverse
population using GP EMR data.
• Dataset: longitudinal data (2.5 million Australian patients) including
prescriptions, diagnoses, pathologies, referrals
• Patients’ journey: grouping patients by experience to detect relevant
patterns in data over time.
Big Data Problems
• Data collected to keep patient history:
– Dealing with missing information
– Inconsistency
– Combination of short text fields (not coded) and numerical
values
• Doctors’ time constraints make data entry inaccurate
• Progress notes not available (privacy issue)
• Patients may visit other practices (thus missing information)
• Events happen irregularly
Continuity of care
Reasons for Prescription
[Chart: Some Reason Given vs. Reason Missing; values 123571 and 162357]
1974 different reasons for PPI prescriptions
GORD (Gastro-oesophageal Reflux Disease) 50842
Reflux - gastro-oesophageal 13596
Reflux oesophagitis 6285
GOR (Gastro-oesophageal Reflux) 6047
Gastritis 5755
Gastro-oesophageal Reflux 4356
… …
Textual inconsistency: Natural Language Processing
• Normalization of case and punctuation: gord, GORD, gord; → gord
• Stopword Filtering: Gastro-oesophageal Reflux Disease → Gastro-oesophageal Reflux
• Spelling Correction: oesophygitis → oesophagitis
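The three steps above can be sketched in a few lines of Python. This is only an illustration, not the authors' pipeline: the stopword list and vocabulary are hypothetical stand-ins for curated clinical resources, and spelling correction is approximated with the standard library's `difflib.get_close_matches`.

```python
import re
from difflib import get_close_matches

# Hypothetical resources; a real pipeline would use curated clinical lists.
STOPWORDS = {"disease"}
VOCABULARY = ["gord", "gastro-oesophageal", "reflux", "oesophagitis", "gastritis"]

def normalize(reason: str) -> str:
    """Normalize case and punctuation, drop stopwords, correct spelling."""
    text = reason.lower()
    # Replace everything except letters, digits, hyphens and spaces.
    text = re.sub(r"[^a-z0-9\- ]+", " ", text)
    # Strip stray hyphens ("reflux - gastro...") and drop empty tokens.
    tokens = [t.strip("-") for t in text.split()]
    tokens = [t for t in tokens if t and t not in STOPWORDS]
    corrected = []
    for token in tokens:
        # Snap a token to the closest vocabulary word if it is similar enough.
        match = get_close_matches(token, VOCABULARY, n=1, cutoff=0.8)
        corrected.append(match[0] if match else token)
    return " ".join(corrected)

print(normalize("gord;"))
print(normalize("Gastro-oesophageal Reflux Disease"))
print(normalize("oesophygitis"))
```

The similarity cutoff of 0.8 is an arbitrary choice for this sketch; too low a value would merge unrelated terms.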
Textual inconsistency: Natural Language Processing
• Lemmatization: Oesophagitis ulcerative, Oesophagitis ulcerating → Oesophagitis ulcer
• Acronym Expansion: GORD, GORD (Gastro-oesophageal Reflux Disease) = Gastro-oesophageal Reflux Disease
• Synonyms: Reflux oesophagitis = Gastro-oesophageal Reflux
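A dictionary-based sketch of lemmatization, acronym expansion and synonym mapping follows. The lookup tables are hypothetical and contain only the slide examples; the slides do not say which lexical resources were actually used.

```python
import re

# Hypothetical lookup tables built from the slide examples only.
LEMMAS = {"ulcerative": "ulcer", "ulcerating": "ulcer"}
ACRONYMS = {
    "gord": "gastro-oesophageal reflux disease",
    "gor": "gastro-oesophageal reflux",
}
SYNONYMS = {"reflux oesophagitis": "gastro-oesophageal reflux"}

def canonicalize(reason: str) -> str:
    """Map a free-text reason to one canonical lowercase form."""
    text = reason.strip().lower()
    # "gord (gastro-oesophageal reflux disease)" -> keep the spelled-out form.
    m = re.fullmatch(r"\w+\s*\((.+)\)", text)
    if m:
        text = m.group(1)
    # Expand a bare acronym such as "gord".
    text = ACRONYMS.get(text, text)
    # Reduce inflected tokens to a shared lemma.
    text = " ".join(LEMMAS.get(tok, tok) for tok in text.split())
    # Replace a known synonym by its preferred term.
    return SYNONYMS.get(text, text)

print(canonicalize("GORD"))
print(canonicalize("Oesophagitis ulcerating"))
print(canonicalize("Reflux oesophagitis"))
```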
Reasons for Prescription (after the NLP pipeline)
GORD (Gastro-oesophageal Reflux Disease) 87217
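How variant strings collapse into one reason can be illustrated by summing the counts listed on the slides. The `CANONICAL` mapping below is an assumption made for this sketch, and only the six listed variants are included, so the total is lower than the 87217 shown on the slide, which presumably also covers the variants elided by "…".

```python
from collections import Counter

# Counts exactly as listed on the slide; further variants are elided there.
raw_counts = {
    "GORD (Gastro-oesophageal Reflux Disease)": 50842,
    "Reflux - gastro-oesophageal": 13596,
    "Reflux oesophagitis": 6285,
    "GOR (Gastro-oesophageal Reflux)": 6047,
    "Gastritis": 5755,
    "Gastro-oesophageal Reflux": 4356,
}

# Hypothetical output of the normalization steps: raw string -> canonical reason.
GORD = "GORD (Gastro-oesophageal Reflux Disease)"
CANONICAL = {
    "Reflux - gastro-oesophageal": GORD,
    "Reflux oesophagitis": GORD,
    "GOR (Gastro-oesophageal Reflux)": GORD,
    "Gastro-oesophageal Reflux": GORD,
}

def merge_counts(counts, canonical):
    """Sum the counts of all raw strings that share a canonical reason."""
    merged = Counter()
    for reason, n in counts.items():
        merged[canonical.get(reason, reason)] += n
    return merged

merged = merge_counts(raw_counts, CANONICAL)
print(merged[GORD])         # 81126 for the six listed variants
print(merged["Gastritis"])  # 5755, left unmerged
```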
Reasons for Prescription
[Chart repeated: Some Reason Given vs. Reason Missing; the Reason Missing share is marked "?"]
Missing Information: Machine Learning Approach
• A random set of PPI patients was annotated by experts with respect to GORD
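The slides do not name the learning algorithm or the features used. As one possible reading, the sketch below trains a small Bernoulli naive Bayes classifier on expert-annotated patients and applies it to a patient whose prescription reason is missing. The feature names and the four training records are invented toy values for illustration only, not data from the study.

```python
import math
from collections import defaultdict

def train(examples):
    """examples: list of (set_of_features, label) pairs annotated by experts."""
    label_counts = defaultdict(int)
    feature_counts = defaultdict(lambda: defaultdict(int))
    vocabulary = set()
    for features, label in examples:
        label_counts[label] += 1
        vocabulary |= features
        for f in features:
            feature_counts[label][f] += 1
    return label_counts, feature_counts, vocabulary

def predict(model, features):
    """Return the most probable label under Bernoulli naive Bayes."""
    label_counts, feature_counts, vocabulary = model
    total = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, n in label_counts.items():
        score = math.log(n / total)  # log prior
        for f in vocabulary:
            # Laplace-smoothed probability that feature f is present.
            p = (feature_counts[label][f] + 1) / (n + 2)
            score += math.log(p if f in features else 1 - p)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Toy, made-up annotations: True means "PPI prescribed for GORD".
annotated = [
    ({"endoscopy_referral", "antacid_history"}, True),
    ({"antacid_history"}, True),
    ({"nsaid_coprescription"}, False),
    ({"nsaid_coprescription", "anticoagulant"}, False),
]
model = train(annotated)
print(predict(model, {"antacid_history"}))       # True
print(predict(model, {"nsaid_coprescription"}))  # False
```

In practice the features would be derived from the EMR fields listed earlier (prescriptions, diagnoses, pathologies, referrals), and the model would be validated against held-out expert annotations.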
Grouping Patients by Journey
Conclusion
• Data mining on GP EMRs is challenging due to the
noisy, messy and sparse nature of the data
• Analyzing journeys is possible, but it requires:
– Temporal reasoning (infer missing events)
– Natural Language Processing (solve textual
inconsistencies)
– Machine Learning (predict missing information)
– Domain knowledge (for modeling)
Acknowledgment
• This research was funded by the Australian Department of Health
and Ageing through the NPS MedicineWise as part of the
MedicineInsight Program.
• I wish to express my gratitude to:
Malcolm Gillies and Margaret Williamson from NPS
Adam Dunn and Guy Tsafnat from UNSW
• Thank you for your attention