Turning Big Data into Precision Medicine
Turning Big Data into Precision Medicine: Real-life Experiences
Dr. Matthieu-P. Schapranow
Festival of Genomics, Boston, MA
June 24, 2015
Important things first: Where do you find additional information?
■ Online: Visit we.analyzegenomes.com for latest research results, tools, and news
■ Offline: Read more about it, e.g. High-Performance In-Memory Genome Data Analysis: How In-Memory Database Technology Accelerates Personalized Medicine, In-Memory Data Management Research, Springer, ISBN: 978-3-319-03034-0, 2014
■ In Person: Join us for “Big Data in Medicine” July 1-2, 2015 in Potsdam, Germany
Schapranow, Festival of Genomics, Boston, MA, June 24, 2015
Turning Big Data into Precision Medicine
What is the Hasso Plattner Institute, Potsdam, Germany?
Who are you dealing with?
■ Since 2009 Program Manager E-Health & Life Sciences
■ 2006-2014 Strategic Projects SAP HANA
■ Visiting Scientist at Charité, Berlin and V.A., Boston, MA
■ Software Engineer by training (PhD, M.Sc., B.Sc.)
The Setting: Actors in Oncology
■ Patients
□ Individual anamnesis, family history, and background
□ Require fast access to individualized therapy
■ Clinicians
□ Identify root and extent of disease using laboratory tests
□ Evaluate therapy alternatives, adapt existing therapy
■ Researchers
□ Conduct laboratory work, e.g. analyze patient samples
□ Create new research findings and come up with treatment alternatives
IT Challenges: Distributed Heterogeneous Data Sources
■ Human genome/biological data: 600 GB per full genome; 15 PB+ in databases of leading institutes
■ Prescription data: 1.5B records from 10,000 doctors and 10M patients (100 GB)
■ Clinical trials: currently more than 30k recruiting on ClinicalTrials.gov
■ Human proteome: 160M data points (2.4 GB) per sample; >3 TB raw proteome data in ProteomicsDB
■ PubMed database: >23M articles
■ Hospital information systems: often more than 50 GB
■ Medical sensor data: scan of a single organ in 1 s creates 10 GB of raw data
■ Cancer patient records: >160k records at NCT
Our Approach: Analyze Genomes: Real-time Analysis of Big Medical Data
[Figure: architecture diagram; labels listed below]
■ In-Memory Database: Extensions for Life Sciences; Data Exchange, App Store; Access Control, Data Protection; Fair Use; Statistical Tools; Real-time Analysis; App-spanning User Profiles
■ Combined and Linked Data: Genome Data; Cellular Pathways; Genome Metadata; Research Publications; Pipeline and Analysis Models; Drugs and Interactions
■ Drug Response Analysis; Pathway Topology Analysis; Medical Knowledge Cockpit; Oncolyzer; Clinical Trial Assessment; Cohort Analysis; ...
Case Vignette I
■ Patient: 48 years, female, non-smoker, smoke-free environment
■ Diagnosis: Non-Small Cell Lung Cancer (NSCLC), stage IV
■ Markers: KRAS, EGFR, BRAF, NRAS, (ERBB2)
■ Initial treatment: Surgery
■ Therapy: Palliative chemotherapy
Medical Knowledge Cockpit for Patients and Clinicians: Linking Patient Specifics with International Knowledge
■ Query-oriented search interface
■ Seamless integration of patient specifics, e.g. from EMR
■ Parallel search in international knowledge bases, e.g. for biomarkers, literature, cellular pathways, and clinical trials
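A parallel search across several knowledge bases could look roughly like the sketch below. It is only an illustration, not the Medical Knowledge Cockpit implementation: the in-memory dictionaries stand in for remote knowledge bases, and the names `KNOWLEDGE_BASES`, `search_one`, and `search_all` as well as all entries are hypothetical placeholders.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-ins for knowledge bases; a real system would query
# remote services or database tables instead of dictionaries.
KNOWLEDGE_BASES = {
    "biomarkers": {"EGFR": ["placeholder biomarker entry"]},
    "literature": {"EGFR": ["placeholder article A", "placeholder article B"]},
    "clinical_trials": {"KRAS": ["placeholder trial entry"]},
}


def search_one(source_name, gene):
    """Look up one gene in one source and return (source, hits)."""
    return source_name, KNOWLEDGE_BASES[source_name].get(gene, [])


def search_all(gene):
    """Query every source concurrently and collect the hits per source."""
    with ThreadPoolExecutor(max_workers=len(KNOWLEDGE_BASES)) as pool:
        futures = [pool.submit(search_one, name, gene) for name in KNOWLEDGE_BASES]
        return dict(future.result() for future in futures)


results = search_all("EGFR")
for source, hits in results.items():
    print(source, len(hits))
```

Threads suit this pattern because each lookup mostly waits on I/O; the overall latency is then close to that of the slowest source rather than the sum of all sources.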
Medical Knowledge Cockpit for Patients and Clinicians
■ Search for affected genes in distributed and heterogeneous data sources
■ Immediate exploration of relevant information, such as
□ Gene descriptions,
□ Molecular impact and related pathways,
□ Scientific publications, and
□ Suitable clinical trials.
■ No manual searching for hours or days: In-memory technology translates searching into interactive finding!
■ Automatic clinical trial matching built on text analysis features
■ Unified access to structured and unstructured data sources
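Text-based trial matching can be illustrated with a toy example. The sketch below assumes trial descriptions are available as free text and simply checks which of the patient's markers each text mentions; the trial identifiers and texts are invented placeholders, and real text analysis would go well beyond keyword matching.

```python
import re

# Hypothetical free-text trial descriptions (placeholders, not real trials).
TRIALS = {
    "TRIAL-A": "Patients with NSCLC harboring EGFR or KRAS mutations.",
    "TRIAL-B": "Advanced melanoma with BRAF V600E mutation.",
    "TRIAL-C": "Healthy volunteers, pharmacokinetics study.",
}


def match_trials(markers, trials):
    """Rank trials by how many patient markers their text mentions."""
    ranked = []
    for trial_id, text in trials.items():
        # Whole-word, case-insensitive match to avoid substring hits.
        found = [
            marker
            for marker in markers
            if re.search(rf"\b{re.escape(marker)}\b", text, re.IGNORECASE)
        ]
        if found:
            ranked.append((trial_id, found))
    # Most matched markers first; ties broken by trial identifier.
    ranked.sort(key=lambda item: (-len(item[1]), item[0]))
    return ranked


# Markers taken from Case Vignette I.
patient_markers = ["KRAS", "EGFR", "BRAF", "NRAS", "ERBB2"]
matches = match_trials(patient_markers, TRIALS)
print(matches)
```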
Medical Knowledge Cockpit for Patients and Clinicians: Pathway Topology Analysis
■ Today, search in pathways is limited to checking whether a certain element is contained
■ Integrated >1.5k pathways from international sources, e.g. KEGG, HumanCyc, and WikiPathways, into HANA
■ Implemented graph-based topology exploration and ranking based on patient specifics
■ Enables interactive identification of possible dysfunctions affecting the course of a therapy before its start
■ Unified access to multiple formerly disjoint data sources
■ Pathway analysis of genetic variants with graph engine
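Graph-based topology exploration goes beyond a containment check by following the interactions downstream of an affected gene. The following is a minimal sketch, assuming each pathway is a directed graph stored as an adjacency list; the toy pathways, their edges, and the ranking by number of reachable elements are illustrative assumptions, not the ranking used in the platform.

```python
from collections import deque

# Toy pathways as adjacency lists (edges are illustrative, not curated data).
PATHWAYS = {
    "toy_pathway_1": {"EGFR": ["KRAS"], "KRAS": ["BRAF"], "BRAF": ["MEK"], "MEK": []},
    "toy_pathway_2": {"EGFR": ["PI3K"], "PI3K": [], "KRAS": []},
    "toy_pathway_3": {"TP53": ["MDM2"], "MDM2": []},
}


def downstream(graph, start):
    """Breadth-first search: all elements reachable from `start`."""
    if start not in graph:
        return set()
    seen, queue = set(), deque([start])
    while queue:
        node = queue.popleft()
        for nxt in graph.get(node, []):
            if nxt not in seen and nxt != start:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def rank_pathways(pathways, affected_genes):
    """Rank pathways by how many elements lie downstream of affected genes."""
    scores = {}
    for name, graph in pathways.items():
        reached = set()
        for gene in affected_genes:
            reached |= downstream(graph, gene)
        if reached:
            scores[name] = len(reached)
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


ranking = rank_pathways(PATHWAYS, ["EGFR", "KRAS"])
print(ranking)
```

A plain containment search would report both `toy_pathway_1` and `toy_pathway_2` equally; the traversal additionally shows how far an alteration could propagate in each.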
Case Vignette II
■ Patient: 67 years, male, smoker, frequently consumes alcohol
■ Diagnosis: Squamous cell carcinoma of the oropharynx, T2N2bM0, stage IVa
■ Initial treatment: Surgery
■ After one year: Relapse with multiple metastatic nodules in the lung
■ Therapy: Palliative chemotherapy
Drug Response Analysis: Real-time Data Analysis and Interactive Exploration
Drug Response Analysis: Data Sources
■ Patient-specific data: smoking status, tumor classification, and age (1 MB - 100 MB)
■ Tumor-specific data: raw DNA data and genetic variants (100 MB - 1 TB)
■ Compound interaction data: medication efficiency and wet lab results (10 MB - 1 GB)
Showcase
Calculating Drug Response… Predict Drug Response
Cetuximab might be more beneficial for the current case
Our Methodology: Design Thinking
Desirability
■ Portfolio of integrated services for clinicians, researchers, and patients
■ Include latest treatment options, e.g. most effective therapies
Viability
■ Enable precision medicine also in far-off regions and developing countries
■ Involve worldwide experts (cost-saving)
■ Combine latest international data (publications, annotations, genome data)
Feasibility
■ HiSeq 2500 enables high-coverage whole genome sequencing in 20h
■ In-memory database (IMDB) enables allele frequency determination across 12B records in <1 s
■ Cloud-based data processing services reduce TCO
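Allele frequency determination is essentially an aggregation over one column of variant calls. The sketch below assumes a columnar layout with one list per attribute and diploid genotypes coded as the number of alternate alleles (0, 1, or 2); the column names and sample values are invented for illustration and say nothing about the actual schema or the 12B-record measurement.

```python
from collections import defaultdict

# Columnar layout: one list per attribute, aligned by row index.
variant_id = ["rs_a", "rs_a", "rs_a", "rs_b", "rs_b", "rs_a"]
alt_allele_count = [0, 1, 2, 1, 1, 0]  # alternate alleles per diploid sample


def allele_frequencies(ids, alt_counts, ploidy=2):
    """Return the alternate-allele frequency per variant."""
    alt_total = defaultdict(int)
    samples = defaultdict(int)
    for vid, alt in zip(ids, alt_counts):
        alt_total[vid] += alt
        samples[vid] += 1
    # Frequency = alternate alleles observed / total alleles observed.
    return {vid: alt_total[vid] / (ploidy * samples[vid]) for vid in samples}


freqs = allele_frequencies(variant_id, alt_allele_count)
print(freqs)
```

Because only two columns are scanned, a column store can run this kind of aggregation without touching the remaining attributes of each record.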
Our Technology: In-Memory Database Technology
■ Combined column and row store
■ Map/Reduce
■ Single and multi-tenancy
■ Lightweight compression
■ Insert only for time travel
■ Real-time replication
■ Working on integers
■ SQL interface on columns and rows
■ Active/passive data store
■ Minimal projections
■ Group key
■ Reduction of software layers
■ Dynamic multi-threading
■ Bulk load of data
■ Object-relational mapping
■ Text retrieval and extraction engine
■ No aggregate tables
■ Data partitioning
■ Any attribute as index
■ No disk
■ On-the-fly extensibility
■ Analytics on historical data
■ Multi-core/parallelization
In-Memory Database Technology: Hardware Characteristics at HPI FSOC Lab
■ 1,000-core cluster at Hasso Plattner Institute with 25 TB main memory
■ 25 nodes, each consisting of:
□ 40 cores
□ 1 TB main memory
□ Intel® Xeon® E7-4870
□ 2.40 GHz
□ 30 MB cache
Lightweight Compression
■ Main memory access is the new bottleneck
■ Lightweight compression can reduce this bottleneck:
□ Lossless
□ Improves usage of data bus capacity
□ Allows working directly on compressed data
Table (RecId: row values; further columns and rows elided)
1: C18.0, Colon, 091487
2: C32.0, Larynx, 357982
3: C00.9, Lip, 123489
4: C18.0, Colon, 998711
5: C20.0, Rectum, 215678
6: C20.0, Rectum, 647912
7: C50.9, Mamma, 167898
8: C18.0, Colon, 646470
…
Attribute Vector (RecId: ValueId)
1: C18.0
2: C32.0
3: C00.9
4: C18.0
5: C20.0
6: C20.0
7: C50.9
8: C18.0
Data Dictionary (ValueId: Value)
1: Larynx
2: Lip
3: Rectum
4: Colon
5: Mamma
Inverted Index (ValueId: RecIdList)
1: 2
2: 3
3: 5, 6
4: 1, 4, 8
5: 7
• Typical compression factor of 10:1 for enterprise software
• In financial applications up to 50:1
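The structures in the figure above (data dictionary, attribute vector, inverted index) can be sketched in a few lines. This is an illustrative assumption of how such structures might be built, not the database's implementation; it assigns value IDs in sorted order, so the IDs differ from those on the slide, while the record order of the diagnosis column follows the slide.

```python
def dictionary_encode(column):
    """Build dictionary, attribute vector, and inverted index for a column."""
    dictionary = sorted(set(column))  # list position + 1 is the value ID
    value_id = {value: i + 1 for i, value in enumerate(dictionary)}
    attribute_vector = [value_id[value] for value in column]  # one int per record
    inverted_index = {}
    for rec_id, vid in enumerate(attribute_vector, start=1):
        inverted_index.setdefault(vid, []).append(rec_id)
    return dictionary, attribute_vector, inverted_index


# Diagnosis column in record order (RecId 1 to 8).
column = ["Colon", "Larynx", "Lip", "Colon", "Rectum", "Rectum", "Mamma", "Colon"]
dictionary, vector, index = dictionary_encode(column)

# Work directly on compressed data: the lookup compares integers only.
colon_id = dictionary.index("Colon") + 1
colon_records = index[colon_id]

# Lossless: decoding restores the original column.
decoded = [dictionary[vid - 1] for vid in vector]
print(dictionary, vector, colon_records)
```

Each distinct string is stored once in the dictionary and every record holds only a small integer, which is where the compression comes from when a column has few distinct values.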
What to Take Home? Test it Yourself: AnalyzeGenomes.com
■ For patients
□ Identify relevant clinical trials and medical experts
□ Become an informed patient
■ For clinicians
□ Identify pharmacokinetic correlations
□ Scan for similar patient cases, e.g. to evaluate therapy efficiency
■ For researchers
□ Enable real-time analysis of medical data, e.g. assess pathways to identify impact of detected variants
□ Combined mining in structured and unstructured data, e.g. publications, diagnosis, and EMR data
Keep in contact with us!
Dr. Matthieu-P. Schapranow, Program Manager E-Health
Hasso Plattner Institute, Enterprise Platform & Integration Concepts (EPIC)
August-Bebel-Str. 88, 14482 Potsdam, Germany
[email protected]
http://we.analyzegenomes.com/