Pragmatic text mining: From literature to electronic health records

Transcript of Pragmatic text mining: From literature to electronic health records

Page 1: Pragmatic text mining: From literature to electronic health records

Lars Juhl Jensen

Pragmatic text mining: From literature to electronic health records

Page 2: Pragmatic text mining: From literature to electronic health records

why text mining?

Page 3: Pragmatic text mining: From literature to electronic health records

data mining

Page 4: Pragmatic text mining: From literature to electronic health records

guilt by association

Page 5: Pragmatic text mining: From literature to electronic health records
Page 6: Pragmatic text mining: From literature to electronic health records

structured data

Page 7: Pragmatic text mining: From literature to electronic health records

unstructured text

Page 8: Pragmatic text mining: From literature to electronic health records

biomedical literature

Page 9: Pragmatic text mining: From literature to electronic health records

>10 km

Page 10: Pragmatic text mining: From literature to electronic health records

too much to read

Page 11: Pragmatic text mining: From literature to electronic health records

computer

Page 12: Pragmatic text mining: From literature to electronic health records

as smart as a dog

Page 13: Pragmatic text mining: From literature to electronic health records

teach it specific tricks

Page 14: Pragmatic text mining: From literature to electronic health records
Page 15: Pragmatic text mining: From literature to electronic health records
Page 16: Pragmatic text mining: From literature to electronic health records

named entity recognition

Page 17: Pragmatic text mining: From literature to electronic health records

dictionary-based approach

Page 18: Pragmatic text mining: From literature to electronic health records

identification required

Page 19: Pragmatic text mining: From literature to electronic health records

dictionary

Page 20: Pragmatic text mining: From literature to electronic health records

cyclin dependent kinase 1

Page 21: Pragmatic text mining: From literature to electronic health records

CDC2

Page 22: Pragmatic text mining: From literature to electronic health records

expansion rules

Page 23: Pragmatic text mining: From literature to electronic health records

CDC2

Page 24: Pragmatic text mining: From literature to electronic health records

hCdc2

Page 25: Pragmatic text mining: From literature to electronic health records

flexible matching

Page 26: Pragmatic text mining: From literature to electronic health records

hyphens and spaces

Page 27: Pragmatic text mining: From literature to electronic health records

“black list”

Page 28: Pragmatic text mining: From literature to electronic health records

SDS

Page 29: Pragmatic text mining: From literature to electronic health records

efficient tagger

Page 30: Pragmatic text mining: From literature to electronic health records

Pafilis et al., PLOS ONE, 2013
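
The dictionary-based steps on the preceding slides (dictionary, expansion rules, flexible matching, black list) can be sketched roughly as follows. This is a minimal illustration, not the tagger described by Pafilis et al.: the identifiers `PROTEIN_A` and `PROTEIN_B`, the single "h" prefix rule, and all function names are assumptions made for the example.

```python
import re

# Toy dictionary mapping names to identifiers (identifiers are placeholders).
DICTIONARY = {
    "cyclin dependent kinase 1": "PROTEIN_A",
    "CDC2": "PROTEIN_A",
    "SDS": "PROTEIN_B",
}
# Names considered too ambiguous to tag (assumed example).
BLACK_LIST = {"SDS"}

def expand(name):
    """Expansion rule (assumed): single-word names may carry an 'h' prefix."""
    variants = {name}
    if " " not in name:
        variants.add("h" + name)
    return variants

def normalize(text):
    """Flexible matching: ignore case, hyphens and spaces."""
    return re.sub(r"[-\s]+", "", text.lower())

def build_index(dictionary, black_list):
    """Map every normalized, non-blacklisted name variant to its identifier."""
    blocked = {normalize(name) for name in black_list}
    index = {}
    for name, identifier in dictionary.items():
        if normalize(name) in blocked:
            continue
        for variant in expand(name):
            index[normalize(variant)] = identifier
    return index

def tag(text, index, max_tokens=5):
    """Return (matched text, identifier) pairs, preferring the longest match."""
    tokens = [m.span() for m in re.finditer(r"[A-Za-z0-9]+", text)]
    matches, i = [], 0
    while i < len(tokens):
        step = 1
        for n in range(min(max_tokens, len(tokens) - i), 0, -1):
            start, end = tokens[i][0], tokens[i + n - 1][1]
            identifier = index.get(normalize(text[start:end]))
            if identifier is not None:
                matches.append((text[start:end], identifier))
                step = n  # skip the tokens consumed by this match
                break
        i += step
    return matches

index = build_index(DICTIONARY, BLACK_LIST)
print(tag("hCdc2 and cyclin-dependent kinase 1 were run on SDS gels", index))
```

Normalizing both the dictionary and the text is what lets "cyclin-dependent kinase 1" match the dictionary entry written with spaces, while the black list keeps "SDS" from being tagged at all.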

Page 31: Pragmatic text mining: From literature to electronic health records

the formal way

Page 32: Pragmatic text mining: From literature to electronic health records

benchmark

Page 33: Pragmatic text mining: From literature to electronic health records

manually annotated corpus

Page 34: Pragmatic text mining: From literature to electronic health records

automatic tagging

Page 35: Pragmatic text mining: From literature to electronic health records
Page 36: Pragmatic text mining: From literature to electronic health records

precision

Page 37: Pragmatic text mining: From literature to electronic health records

recall
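
For the formal benchmark, precision and recall compare the automatic tags against the manually annotated corpus. A small sketch, assuming each annotation is represented as a (document, name) pair; the example annotations are invented for illustration:

```python
def precision_recall(gold, predicted):
    """Precision = TP / predicted, recall = TP / gold, computed on sets."""
    gold, predicted = set(gold), set(predicted)
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall

# Invented example: manual annotations versus tagger output.
gold = {("doc1", "CDC2"), ("doc1", "HAP1"), ("doc2", "CYC1"), ("doc2", "CYC7")}
predicted = {("doc1", "CDC2"), ("doc2", "CYC1"), ("doc2", "SDS")}

precision, recall = precision_recall(gold, predicted)
print(f"precision={precision:.2f} recall={recall:.2f}")
```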

Page 38: Pragmatic text mining: From literature to electronic health records

natural language processing

Page 39: Pragmatic text mining: From literature to electronic health records

Gene and protein names
Cue words for entity recognition
Verbs for relation extraction

[nxexpr The expression of [nxgene the cytochrome genes [nxpg CYC1 and CYC7]]] is controlled by [nxpg HAP1]

Page 40: Pragmatic text mining: From literature to electronic health records

hard work

Page 41: Pragmatic text mining: From literature to electronic health records

the pragmatic way

Page 42: Pragmatic text mining: From literature to electronic health records

“benchmark light”

Page 43: Pragmatic text mining: From literature to electronic health records

requires fewer calories

Page 44: Pragmatic text mining: From literature to electronic health records

non-annotated corpus

Page 45: Pragmatic text mining: From literature to electronic health records

automatic tagging

Page 46: Pragmatic text mining: From literature to electronic health records

random inspection

Page 47: Pragmatic text mining: From literature to electronic health records
Page 48: Pragmatic text mining: From literature to electronic health records

precision

Page 49: Pragmatic text mining: From literature to electronic health records

no recall

Page 50: Pragmatic text mining: From literature to electronic health records

relative recall
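
The "benchmark light" idea can be sketched as follows: draw a random sample of automatic tags, judge each by hand to estimate precision, and, since there is no annotated corpus to give true recall, compare methods by the share of the pooled correct hits each one finds. The function names and the pooling definition of relative recall are assumptions for this illustration.

```python
import random

def sample_for_inspection(tags, k, seed=0):
    """Draw up to k tags at random for manual inspection."""
    tags = list(tags)
    return random.Random(seed).sample(tags, min(k, len(tags)))

def estimated_precision(judgements):
    """Fraction of inspected tags judged correct (True)."""
    return sum(judgements) / len(judgements)

def relative_recall(correct_by_method):
    """Share of the pooled correct hits that each method recovers."""
    pooled = set().union(*correct_by_method.values())
    return {
        method: len(set(hits)) / len(pooled)
        for method, hits in correct_by_method.items()
    }

# Invented example: four inspected tags, three judged correct.
print(estimated_precision([True, True, False, True]))
# Invented example: correct hits found by two hypothetical methods.
print(relative_recall({"a": {1, 2, 3}, "b": {3, 4}}))
```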

Page 51: Pragmatic text mining: From literature to electronic health records

co-mentioning

Page 52: Pragmatic text mining: From literature to electronic health records

within documents

Page 53: Pragmatic text mining: From literature to electronic health records

within paragraphs

Page 54: Pragmatic text mining: From literature to electronic health records

within sentences

Page 55: Pragmatic text mining: From literature to electronic health records

weighted score
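
One way to turn co-mentioning at the three levels into a weighted score is sketched below. The weights are arbitrary placeholders and the scheme (each pair counted once per document at each level it reaches) is an assumption for illustration; the slides do not give the actual scoring function.

```python
from collections import defaultdict
from itertools import combinations

# Placeholder weights; a sentence-level co-mention also counts at the
# paragraph and document levels, so its contributions add up.
WEIGHTS = {"document": 1.0, "paragraph": 2.0, "sentence": 4.0}

def pairs(entities):
    """All unordered entity pairs, in a canonical (sorted) order."""
    return {tuple(sorted(pair)) for pair in combinations(entities, 2)}

def comention_scores(corpus, weights=WEIGHTS):
    """corpus: list of documents; document: list of paragraphs;
    paragraph: list of sentences; sentence: set of tagged entity names."""
    scores = defaultdict(float)
    for document in corpus:
        sentence_pairs, paragraph_pairs = set(), set()
        document_entities = set()
        for paragraph in document:
            paragraph_entities = set()
            for sentence in paragraph:
                sentence_pairs |= pairs(sentence)
                paragraph_entities |= sentence
            paragraph_pairs |= pairs(paragraph_entities)
            document_entities |= paragraph_entities
        for pair in pairs(document_entities):
            scores[pair] += weights["document"]
        for pair in paragraph_pairs:
            scores[pair] += weights["paragraph"]
        for pair in sentence_pairs:
            scores[pair] += weights["sentence"]
    return dict(scores)

# Invented one-document corpus with two paragraphs.
corpus = [[[{"CDC2", "CYC1"}, {"HAP1"}], [{"CYC7"}]]]
scores = comention_scores(corpus)
print(scores[("CDC2", "CYC1")], scores[("CDC2", "HAP1")], scores[("CDC2", "CYC7")])
```

Pairs mentioned in the same sentence end up with the highest score, pairs that only share a document with the lowest.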

Page 56: Pragmatic text mining: From literature to electronic health records

unifying text & data

Page 57: Pragmatic text mining: From literature to electronic health records

web resources

Page 58: Pragmatic text mining: From literature to electronic health records

text mining

Page 59: Pragmatic text mining: From literature to electronic health records

curated knowledge

Page 60: Pragmatic text mining: From literature to electronic health records

Letunic & Bork, Trends in Biochemical Sciences, 2008

Page 61: Pragmatic text mining: From literature to electronic health records

experimental data

Page 62: Pragmatic text mining: From literature to electronic health records

von Mering et al., Nucleic Acids Research, 2005

Page 63: Pragmatic text mining: From literature to electronic health records

computational predictions

Page 64: Pragmatic text mining: From literature to electronic health records

common identifiers

Page 65: Pragmatic text mining: From literature to electronic health records

quality scores

Page 66: Pragmatic text mining: From literature to electronic health records

proteins

Page 67: Pragmatic text mining: From literature to electronic health records

Szklarczyk, Franceschini et al., Nucleic Acids Research, 2011

Page 68: Pragmatic text mining: From literature to electronic health records

small molecules

Page 69: Pragmatic text mining: From literature to electronic health records

Kuhn et al., Nucleic Acids Research, 2012

Page 70: Pragmatic text mining: From literature to electronic health records

compartments

Page 71: Pragmatic text mining: From literature to electronic health records

compartments.jensenlab.org

Page 72: Pragmatic text mining: From literature to electronic health records

tissues

Page 73: Pragmatic text mining: From literature to electronic health records

tissues.jensenlab.org

Page 74: Pragmatic text mining: From literature to electronic health records

diseases

Page 75: Pragmatic text mining: From literature to electronic health records
Page 76: Pragmatic text mining: From literature to electronic health records

electronic health records

Page 77: Pragmatic text mining: From literature to electronic health records

Jensen et al., Nature Reviews Genetics, 2012

Page 78: Pragmatic text mining: From literature to electronic health records

structured data

Page 79: Pragmatic text mining: From literature to electronic health records

Jensen et al., Nature Reviews Genetics, 2012

Page 80: Pragmatic text mining: From literature to electronic health records

unstructured data

Page 81: Pragmatic text mining: From literature to electronic health records

clinical narrative

Page 82: Pragmatic text mining: From literature to electronic health records
Page 83: Pragmatic text mining: From literature to electronic health records

Danish

Page 84: Pragmatic text mining: From literature to electronic health records

busy doctors

Page 85: Pragmatic text mining: From literature to electronic health records

psychiatric patients

Page 86: Pragmatic text mining: From literature to electronic health records

pharmacovigilance

Page 87: Pragmatic text mining: From literature to electronic health records

structured data

Page 88: Pragmatic text mining: From literature to electronic health records

medication

Page 89: Pragmatic text mining: From literature to electronic health records

text mining

Page 90: Pragmatic text mining: From literature to electronic health records

drug indications

Page 91: Pragmatic text mining: From literature to electronic health records

adverse drug events

Page 92: Pragmatic text mining: From literature to electronic health records

temporal correlation

Page 93: Pragmatic text mining: From literature to electronic health records

complex filtering

Page 94: Pragmatic text mining: From literature to electronic health records

Eriksson et al., submitted, 2013

Page 95: Pragmatic text mining: From literature to electronic health records

Eriksson et al., submitted, 2013

Drug substance        ADE                   p-value
Chlordiazepoxide      Nystagmus             4.0e-8
Simvastatin           Personality changes   8.4e-8
Dipyridamole          Visual impairment     4.4e-4
Citalopram            Psychosis             8.8e-4
Bendroflumethiazide   Apoplexy              8.5e-3

Page 96: Pragmatic text mining: From literature to electronic health records

Acknowledgments

Protein networks
Christian von Mering
Damian Szklarczyk
Michael Kuhn
Manuel Stark
Jean Muller
Tobias Doerks
Alexander Roth
Milan Simonovic
Berend Snel
Martijn Huynen
Peer Bork

Localization and disease
Sune Frankild
Alberto Santos
Kalliopi Tsafou
Janos Binder
Reinhard Schneider
Sean O’Donoghue

Electronic health records
Robert Eriksson
Thomas Werge
Søren Brunak