Knowledge Engineering from Big Data in Oncology
Andre Dekker, PhD
Medical Physicist
MAASTRO Clinic
Knowledge Engineering from Big Data in Oncology
© MAASTRO 2015
Disclosures
Research collaborations incl. funding / honoraria etc.
– Varian (VATE, chinaCAT, euroCAT), Siemens (euroCAT), Sohard (SeDI, CloudAtlas), Mirada Medical (CloudAtlas), Philips (EURECA, TraIT, SWIFT-RT), Xerox (EURECA), De Praktijkindex (DLRA)
Public research funding
– Radiomics (USA-NIH/U01CA143062), euroCAT (EU-Interreg), duCAT (NL-STW), EURECA (EU-FP7), SeDI & CloudAtlas (EU-EUREKA), TraIT (NL-CTMM), DLRA (NL-NVRO)
Spin-offs and commercial ventures
– MAASTRO Innovations B.V. (CSO)
– Various patents on medical machine learning
Do we know which tulip will be pink or yellow?
http://www.amystewart.com
Do we know which tulip will be pink or yellow?
AUC: 1.00, 0.72, 0.50
Do we know which patient is likely to survive?
AUC 1.00
AUC 0.50
AUC 0.72
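The AUC values on these slides run from 0.50 (no better than chance) to 1.00 (perfect ranking). As a minimal sketch of how such a number can be computed, the snippet below uses the pairwise definition: the fraction of (survivor, non-survivor) pairs in which the survivor received the higher predicted score. The function name, scores, and labels are hypothetical and are not taken from the talk.

```python
def auc(scores, labels):
    """Pairwise AUC: probability that a randomly chosen positive case
    scores higher than a randomly chosen negative case (ties count 0.5)."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical outcomes: 1 = survived two years, 0 = did not.
labels = [1, 1, 1, 0, 0, 0]

# A model that ranks every survivor above every non-survivor.
perfect = auc([0.9, 0.8, 0.7, 0.3, 0.2, 0.1], labels)

# A model that gives everyone the same score carries no information.
uninformative = auc([0.5] * 6, labels)

print(perfect, uninformative)
```

Anything between these two extremes, such as the 0.72 shown above, means the model ranks some but not all pairs correctly.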
Testing predictions by MDs
• Lung cancer
• 2 year survival
• 158 patients
• 5 MDs
• Prospective
• AUC: 0.56
Oberije et al.
Kruger et al. 1999. Unskilled and unaware of it: How difficulties in recognizing one’s own incompetence lead to inflated self-assessments. J Pers Soc Psych
The doctor is drowning
• Explosion of data
• Explosion of decisions
• Explosion of ‘evidence’*
• 3% in trials, bias
• Sharp knife
*2010: 1574 & 1354 articles on lung cancer & radiotherapy = 7.5 per day
Half-life of knowledge estimated at 7 years (in young students)
Source: J Clin Oncol 2010;28:4268
Source: JMI 2012 Friedman, Rigby
Main Opportunity of Big Data Driven Medicine: Rapid Learning Health Care / Precision Medicine / Predict outcome in an individual
In [..] rapid-learning [..] data routinely generated through patient care and clinical research feed into an ever-growing [..] set of coordinated databases. J Clin Oncol 2010;28:4268
[..] rapid learning [..] where we can learn from each patient to guide practice, is [..] crucial to guide rational health policy and to contain costs [..]. Lancet Oncol 2011;12:933
Examples: Radiotherapy CAT (www.eurocat.info), ASCO’s CancerLinQ
Source: J Clin Oncol 2010;28:4268
Why would we want to predict outcome in an individual patient?
If you can’t predict outcomes
Doctor/Patient perspective
• you can’t inform and involve your patient properly
• you might not make the right decision of treatment A over treatment B
Quality perspective
• you can’t know if your treatments are giving the predicted outcome
Innovation perspective
• you can’t determine which patient (group) we need to innovate in
Source: www.predictcancer.org (MAASTRO)
Source: www.lifemath.net (MGH)
Big data in Oncology
Oncology, 2005-2015
• 140M patients
• 100k hospitals
• 1-10 GB per patient
• 140-1400 PB
• 80% unstructured
Source: Cancer Research UK
Source: Institute for Health Technology Transformation
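The 140-1400 PB range on this slide follows directly from the patient count and the per-patient data volume. The short sketch below reproduces that arithmetic; it assumes decimal units (1 PB = 1,000,000 GB), which the slide does not state, and all names are illustrative.

```python
# Figures taken from the slide.
PATIENTS = 140_000_000          # 140M patients, 2005-2015
GB_PER_PATIENT_LOW = 1          # lower estimate per patient
GB_PER_PATIENT_HIGH = 10        # upper estimate per patient

# Assumption: decimal units, 1 PB = 1,000,000 GB.
GB_PER_PB = 1_000_000


def total_petabytes(n_patients, gb_per_patient):
    """Total data volume in petabytes for a given per-patient size."""
    return n_patients * gb_per_patient / GB_PER_PB


low_pb = total_petabytes(PATIENTS, GB_PER_PATIENT_LOW)
high_pb = total_petabytes(PATIENTS, GB_PER_PATIENT_HIGH)

print(f"{low_pb:.0f}-{high_pb:.0f} PB")
```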
Main challenge of using Big Data and Outcomes Research in Oncology
• You need to learn from other patients to predict the outcome of a new patient
• These data are spread out over 100k hospitals
• So we need to share…, challenges:
• Administrative (I don’t have the time)
• Political (I don’t want to)
• Ethical (I am not allowed)
• Technical (I can’t)
[..] the problem is not really technical […]. Rather, the problems are ethical, political, and administrative. Lancet Oncol 2011;12:933
The ‘standard’ approach
• Sharing standardized, highly curated data from clinical research programs
• Very useful, but only 3% of patients (if that)
• Worries about privacy, loss of control, limited amount of features, limited reusability, a lot of work
A different approach
If sharing is the problem: Don’t share the data
If you can’t bring the data to the learning application
You have to bring the learning application to the data
Consequences
• The learning application has to be distributed
• The data has to be readable by an application (i.e. not a human)
• Solution: Sharing standardized highly curated research data
• Solution: Not-sharing non-standardized non-curated clinical data
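One way to picture "bringing the learning application to the data" is a function that runs inside each hospital and returns only aggregate numbers, which a coordinator then combines. The sketch below is an illustration under that assumption, not the euroCAT implementation; the site names, records, and function names are all hypothetical.

```python
# Hypothetical local records per hospital: True if the patient survived
# two years. In a real deployment these lists would never leave the site.
site_records = {
    "site_A": [True, False, True, True],
    "site_B": [False, False, True],
    "site_C": [True, True, False, True, False],
}


def local_summary(records):
    """Runs inside a hospital and returns aggregate counts only."""
    return {"n": len(records), "survivors": sum(records)}


def combine(summaries):
    """Runs at the coordinator, which sees counts but no patient rows."""
    total_n = sum(s["n"] for s in summaries)
    total_survivors = sum(s["survivors"] for s in summaries)
    return total_survivors / total_n


summaries = [local_summary(records) for records in site_records.values()]
pooled_rate = combine(summaries)

# Reference value that would require pooling all patient rows centrally.
all_rows = [r for records in site_records.values() for r in records]
centralized_rate = sum(all_rows) / len(all_rows)

print(pooled_rate, centralized_rate)
```

The coordinator arrives at the same two-year survival rate as a central database would, while each hospital only discloses two integers.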
Distributed Learning
See youtube: https://www.youtube.com/watch?v=ZDJFOxpwqEA
euroCAT, duCAT, chinaCAT, ozCAT, VATE, ukCAT, dkCAT, worldCAT, BIONIC Network
[Map of the CAT network. Legend: Industry Partners; Clinical / Academic Partners; Active or funded CAT partners (19); Prospective centers. Map from cgadvertising.com]
Does it work ? euroCAT’s example
• Distributed = Centralized (ADMM method, Boyd-Stanford)
• Distributed learning better than learning on single center data
• 550 iterations, two hours (centralized < 1 min)

Learn in                  Validate in      AUC
Aachen (n=7)              Liège (n=186)    0.61
Eindhoven (n=32)          Liège (n=186)    0.72
Hasselt (n=45)            Liège (n=186)    0.68
Maastricht (n=52)         Liège (n=186)    0.75
All 4 together (n=136)    Liège (n=186)    0.77
All 5 together (n=322)    World (n=inf)    ?
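The claim "Distributed = Centralized" can be illustrated with consensus ADMM on a toy problem. The sketch below fits a one-parameter least-squares model y ≈ w·x across three sites and compares the result with a fit on pooled data. The model, the data, the penalty `rho`, and the iteration count are assumptions chosen for illustration; the slide does not say which model euroCAT trained.

```python
# Hypothetical (x, y) pairs held by three centers.
sites = [
    [(1.0, 2.1), (2.0, 3.9)],
    [(1.5, 3.2), (3.0, 5.8), (0.5, 1.1)],
    [(2.5, 5.1)],
]


def centralized_fit(sites):
    """Least-squares slope on pooled data (needs all rows in one place)."""
    sxy = sum(x * y for site in sites for x, y in site)
    sxx = sum(x * x for site in sites for x, y in site)
    return sxy / sxx


def admm_fit(sites, rho=5.0, iterations=200):
    """Consensus ADMM (scaled form): sites exchange only w_i and u_i."""
    # Local statistics; each pair is computed and kept at its own site.
    stats = [(sum(x * y for x, y in s), sum(x * x for x, y in s)) for s in sites]
    u = [0.0] * len(sites)   # scaled dual variable per site
    z = 0.0                  # consensus value held by the coordinator
    for _ in range(iterations):
        # Local step: minimise local loss + (rho/2) * (w - z + u_i)^2.
        w = [(sxy + rho * (z - u_i)) / (sxx + rho)
             for (sxy, sxx), u_i in zip(stats, u)]
        # Coordinator step: average the site proposals.
        z = sum(w_i + u_i for w_i, u_i in zip(w, u)) / len(sites)
        # Dual step: each site records its disagreement with the consensus.
        u = [u_i + w_i - z for u_i, w_i in zip(u, w)]
    return z


w_central = centralized_fit(sites)
w_distributed = admm_fit(sites)
print(w_central, w_distributed)
```

The two estimates agree to within numerical tolerance, but the distributed version needs many rounds of communication, which is consistent with the iteration count and run time reported on the slide.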
Summary
Knowledge Engineering from Big Data in Oncology
• The challenge of Big Data in oncology
  – Is not the size but the distribution
  – Is imaging and not genomics (for now)
• The aim of Knowledge Engineering is
  – To predict outcomes better via prediction models
  – To update these models continuously in rapid learning
Acknowledgements
• Varian, Palo Alto, CA, USA
• Siemens, Malvern, PA, USA
• RTOG, Philadelphia, PA, USA
• MAASTRO, Maastricht, Netherlands
• Policlinico Gemelli, Roma, Italy
• UH Ghent, Belgium
• Catherina Zkh Eindhoven, Netherlands
• UZ Leuven, Belgium
• Radboud, Nijmegen, Netherlands
• University of Sydney, Australia
• Liverpool and Macarthur CC, Australia
• CHU Liege, Belgium
• Uniklinikum Aachen, Germany
• LOC Genk/Hasselt, Belgium
• Princess Margaret Hospital, Canada
• The Christie, Manchester, UK
• UH Leuven, Belgium
• State Hospital, Rovigo, Italy
• Illawarra Shoalhaven CC, Australia
• Fudan Cancer Center, Shanghai, China
More info on: www.predictcancer.org, www.cancerdata.org, www.eurocat.info, www.mistir.info
Andre Dekker, PhD
Medical Physicist
MAASTRO Clinic
Thank you for your attention
More info on: www.eurocat.info, www.predictcancer.org, www.cancerdata.org, www.mistir.info, www.maastro.nl