Role of Biomedical Informatics in Translational Cancer Research
Design Templates: Translational Biomedical Research and the ...
Transcript of Design Templates: Translational Biomedical Research and the ...
Integrative Analysis of Pathology, Radiology
and High Throughput Molecular Data
Joel Saltz MD, PhD
Director Center for Comprehensive
Informatics
Objectives
• Reproducible anatomic/functional characterization at
gross level (Radiology) and fine level (Pathology)
• Integration of anatomic/functional characterization with
multiple types of “omic” information
• Create categories of jointly classified data to describe
pathophysiology, predict prognosis, response to
treatment
• Data modeling standards, semantics
• Data research issues
In Silico Program Objectives (from NCI)
• In silico is an expression used to mean "performed on computer or via
computer simulation.“ (Wikipedia)
• In silico science centers: support investigator-initiated, hypothesis-
driven research in the etiology, treatment, and prevention of cancer
using in silico methods
– Generating and publishing novel cancer research findings leveraging
caBIG tools and infrastructure
– Identifying novel bioinformatics processes and tools to exploit existing
data resources
• Encouraging the development of additional data resources and caBIG
analytic services
• Assessing the capabilities of current caBIG tools
• Emory, Columbia, Georgetown, Fred Hutchinson Cancer ,
Translational Genomics Research Institute
In Silico Center for
Brain Tumor ResearchSpecific Aims:
1. Influence of necrosis/
hypoxia on gene expression and
genetic classification.
2. Molecular correlates of high
resolution nuclear morphometry.
3. Gene expression profiles
that predict glioma progression.
4. Molecular correlates of MRI
enhancement patterns.
Integrative Analysis: Tumor Microenvironment
• Structural and functional
differentiation within tumor
• Molecular pathways are time
and space dependent
• “Field effects” – gradient of
genetic, epigenetic changes
• Radiology, microscopy, high
throughput genetic, genomic,
epigenetic studies, flow
cytometry, microCT,
nanotechologies …
• Create biomarkers to
understand disease
progression, response to
treatment
Tumors are organs consisting of
many interdependent cell types
• From John E. Niederhuber, M.D. Director National Cancer Institute, NIH presented at Integrating and Leveraging the Physical Sciences to Open a New Frontier in Oncology, Feb 2008
Informatics Requirements
•Parallel initiatives
Pathology, Radiology,
“omics”
•Exploit synergies
between all initiatives
to improve ability to
forecast survival &
response.
Radiology
Imaging
Patient Outco
me
Pathologic Features
“Omic”
Data
In Silico Center for
Brain Tumor Research
Key Data Sets
REMBRANDT: Gene expression and genomics data set of all glioma subtypes
The Cancer Genome Atlas (TCGA): Rich “omics” set of GBM, digitized Pathology and Radiology
Vasari Feature Set: Standardized annotation of gliomas of all subtypes
TCGA Research Network
Digital Pathology
Neuroimaging
Distinguishing Among the Gliomas
“There are also many cells which appear to be transitions between gigantic oligodendroglia and astrocytes. It is impossible to classify them as belonging in either group”
Bailey P, Bucy PC. Oligodendrogliomas of the brain.
J Pathol Bacteriol 1929: 32:735
Oligodendroglioma Astrocytoma
Nuclear Qualities
Progression to GBM
Anaplastic Astrocytoma
(WHO grade III)
Glioblastoma
(WHO grade IV)
TCGA Neuropathology Attributes
120 TCGA specimens; 3 Reviewers
Presence and Degree of:
Microvascular hyperplasia
Complex/glomeruloid
Endothelial hyperplasia
Necrosis
Pseudopalisading pattern
Zonal necrosis
Inflammation
Macrophages/histiocytes
Lymphocytes
Neutrophils
Differentiation:
Small cell component
Gemistocytes
Oligodendroglial
Multi-nucleated/giant cells
Epithelial metaplasia
Mesenchymal metaplasia
Other Features
Perineuronal/perivascular
satellitosis
Entrapped gray or white matter
Micro-mineralization
Feature Extraction
TCGA Whole Slide Images
Jun Kong
Astrocytoma vs Oligodendroglima
Overlap in genetics, gene expression, histology
Astrocytoma vs Oligodendroglima
• Assess nuclear size (area and
perimeter), shape (eccentricity,
circularity major axis, minor axis, Fourier
shape descriptor and extent ratio),
intensity (average, maximum, minimum,
standard error) and texture (entropy,
energy, skewness and kurtosis).
Nuclear Qualities
Which features carry most prognostic significance?
Which features correlate with genetic alterations?
Whole slide scans from 14 TCGA GBMS (69 slides)
7 purely astrocytic in morphology; 7 with 2+ oligo component
399,233 nuclei analyzed for astro/oligo features
Cases were categorized based on ratio of oligo/astro cells
Machine-based Classification of TCGA GBMs (J Kong)
TCGA Gene
Expression Query:
c-Met overexpression
Nuclear Feature Analysis: TCGA
• Using the parallel computation infrastructure of Sun Grid Engine, we analyzed image tiles of 4096x4096 of 213 whole-slide TCGA images of permanent tissue sections.
• Approximately 90 million nuclei segmented.
• 79 patients: 57 are diagnosed as GBM (‘oligo 0’)
17 are classified as GBM with ‘oligo 1’,
5 as GBM with ‘oligo 2+’.
• With each data file including all nuclear features from one patient, all nuclei were classified with color blue, green, and red representing nuclei scored as 1~3, 4~6, and 7~10, respectively.
• The ratio of nuclei classified ≤ 5 to >5 was computed for each 79 patients.
• Ratios associated with ‘oligo 0’ and ‘oligo 2+’ patient populations were compared with two-sample t-test (p=0.0145)
Nuclear Feature Analysis: TCGA
Discriminating Features (Grade 1 vs. Grade 7-10)
Discriminating Features (Grade 1 vs. Grade 7-10)
Discriminating Features (Grade 1 vs. Grade 7-10)
Determine the influence of tumor
microenvironment on gene expression
profiling and genetic classification using
TCGA data
Necrosis =
Severe Hypoxia
Microvascular
HyperplasiaGBM
3 Gene Families are Altered in GBM:
RTK, p53 and RB
GBM: necrosis, hypoxia, angiogenesis
and gene expression
• Does the presence or degree of necrosis within
digitized frozen section slides correlate with specific
gene expression patterns or determine algorithm-
based unsupervised clustering of GBMs gene
expression categories?
• Does the presence or degree of necrosis influence
the type of angiogenesis or pro-angiogenic gene
expression patterns within human gliomas?
GBM:
% Necrosis
TCGA: GBM Frozen Sections
• 179 cases were assessed for % necrosis on frozen
section slides for TCGA quality assurance.
• Cox-based regression analysis for % necrosis vs. gene
expression (795 probe sets; 647 distinct genes)
Network Analysis based on % Necrosis
Carlos
Moreno
David
Gutman
Aim 4. Identify correlates of MRI enhancement patterns in astrocytic neoplasms with underlying vascular changes and gene expression profiles.
No enhancement
Normal Vessels
Stable lesion
?Rim-enhancement
Vascular Changes
Rapid progression
Angiogenesis Segmentation
H&E
Image
Color
Deconvolution
Hematoxylin
Image
Eosin
Image
Eosin
Image
Spatial
Norm.
Density
Image
Density
Calculation
Boundary
Smoothing
Density
Image
Object
ID
Segmented
Vessels
Eosin intensity image
Angiogenic Segmentation
States of AngiogenesisEndothelial Hypertrophy
Complex Microvascular
Hyperplasia
Endothelial Hyperplasia
Lee Cooper
Sharath Cholleti
Vessel Characterization
• Bifurcation detection
Vasari Imaging Criteria(Adam Flanders, TJU; Dan Rubin, Stanford, Lori Dodd,
NCI)• Require standardized validated feature sets to
describe de novo disease.
• Fundamental obstacle to new imaging criteria as treatment biomarkers is lack of standard terminology:
– To define a comprehensive set of imaging features of cancer
– For reporting imaging results
– To provide a more quantitative, reproducible basis for assessing baseline disease and treatment response
Defining Rich Set of Qualitative and Quantitative Image Biomarkers
• Community-driven ontology development project; collaboration with ASNR
• Imaging features (5 categories)
– Location of lesion
– Morphology of lesion margin (definition, thickness, enhancement, diffusion)
– Morphology of lesion substance (enhancement, PS characteristics, focality/multicentricity, necrosis, cysts, midline invasion, cortical involvement, T1/FLAIR ratio)
– Alterations in vicinity of lesion (edema, edema crossing midline, hemorrhage, pial invasion, ependymal invasion, satellites, deep WM invasion, calvarial remodeling)
– Resection features (extent of nCE tissue, CE tissue, resectedcomponents)
Results: Reader Agreement• High inter-observer agreement among
the three readers
– (kappa = 0.68, p<0.001)
• Percentage agreement was also high for most features
individually
– 22 of 30 features (73%) had agreement greater than
50%
– Twelve features (40%) had >80% agreement
– No feature had less than 20% agreement
• Feature agreement rose substantially when used with
tolerance (+/- 1).
Preliminary Relationships of Features to Survival
• Cox proportional hazards models were fit to each of the thirty features related to overall survival.
• Features associated with lower survival included (p<.0001):– Proportion of enhancing tissue at baseline.
– Thick or nodular enhancement characteristics.
– Contralateral hemisphere invasion.
• Proportion of non contrast enhancing tumor (nCET) had positive correlation with survival.
• Tumor size at baseline had no relationship to survival.
Coupling silico methodology with a clinical trial:
Will Treatment work and if not, why not?
Example: Avastin and Glioblastoma as
in RTOG-0825 (plus institutional
accrual)
Treatment: Radiation therapy and
Avastin (anti angiogenesis)
Predict and Explain: Genetic, gene
expression, microRNA, Pathology,
Imaging
Imaging/RT reproducibility, Integration
with EMR, PACS, RT systems
D
A
T
A
I
N
T
E
G
R
A
T
I
O
N
Information Warehouse User AccessAcquisition Transfer
CPOE
OR system
Patient Mgmt
Dictated reports
Pathology reports
Daily
ADT
Lab
Respiratory
Blood
Endoscopy
Cardiology
Siemens Img
Real
time
Patient Billing
Practice Plans
Pt Satisfaction
Weekly
Monthly
Cancer Genetics
Wound
Images
Tissue
Pulmonary
Genomic Data
Web
Multi-Dimensional
Analysis & Data
Mining Ad-hoc
Query
Error Report Benchmarking
De-
Identification
Honest Broker
Wound
Center
Research
Web Scorecards
& Dashboards
Image Analysis
Text Mining, NLP
Business Clinical
Research External
Meta Data
Overv
iew
Crucial to Leverage Institutional
Data
Ohio State Information Warehouse Infrastructure
Annotations and Imaging Markup (AIM)
Annotations and Imaging Markup Developer (AIM) provides a standard for medical image annotation and markup
for images used in the research space, and in particular, the image based cancer clinical trial. It is notable that
there is no existing standard for radiology annotation and markup. The caBIG® program is working with almost
every standards body such as DICOM to elicit consensus regarding use of AIM as the accepted standard for
radiology annotation and markup, and is positioned to extend AIM to digital pathology.
The pixel at the tip of the
arrow [coordinates (x,y)]
in this image
[DICOM: 1.2.814.234543.23243]
represents the
Ascending Thoracic Aorta
[SNOMED:A3310657]
Aim Data Service Emory (AIME) is a caGrid data
service that manages AIM documents in XML
databases. AIME supports query,
enumerationQuery, queryByTransfer, submit
and submitByTransfer methods
MicroAIM
• Provide a semantically enabled data model for pathology
and microscopy image markups and observations
• Goal: interoperable data exchange and knowledge
sharing
• Ongoing collaborative efforts to standardize via caBIG
and Association for Pathology Informatics
• microAIM data service provides comprehensive query
support, and can be efficiently implemented either
through an XML-based or a relational based approach.
Pathology AIM; Semantic Annotation and Spatial
Reasoning
Observation on a single or multiple objects
Support multiple objects and associate
different observations to each object
Calculation on a single or multiple objects
Class of type to represent mask or field
value
Represent provenance information for
computed markup and annotation
Identify all instances of a set of cell types
Complex spatial queries: Quantify areas of
glomeruloid microvascular proliferation
within X microns of pseudopalisading
necrosis
Use of caBIG Tools in Emory in Silico Brain
Tumor Research Center
Osirix
XIP/AVT
Clear Canvas
NBIA
AIME
TCGA Portal
Modified Workstation Extended AIME Service
Enterprise Architecture 2.0
caB2B
caIntegrator 2
CBIIT Research
Team
Carl Schaefer
Jinghui ZhangTBPT
ICR
VCDE/
ARCH
Science
CIP:
Research &
Clinical Trials
TCGA Portal
In vivo
Imaging
Data Science Research Challenges Driven by
In Silico Discovery Research
• Data integration that targets multiple data sources with
conflicting metadata and conflicting data
• Efficient methods for semantic query that targets
questions involving complex multi-scale features
associated with petascale and exascale ensembles of
highly annotated images
• Computer assisted annotation and markup for very large
datasets
• Systems to support combinations of structured and
irregular accesses to exascale datasets
Data Science Research Challenges
• Structural and semantic metadata management: how to manage tradeoff between flexibility and curation
• Data and semantic modeling infrastructures and policies able to scale to handle distributed systems with an aggregate of 10*9 or more data models/concepts
• Three dimensional (time dependent) reconstruction, feature detection and annotation of 3-D microscopy imagery
• Workflow infrastructure for large scale data intensive
computations
Final Data Science Challenge:
Large Dataset Size
– Basic small mouse is 10 cm3
– 1 μ resolution – very roughly 1013 bytes/mouse
– Molecular data (spatial location) multiply by 102
– Vary genetic composition, environmental manipulation, systematic mechanisms for varying genetic expression; multiply by 103
Total: 1018 bytes per big science
animal experiment
Thanks to:
• In silico center team: Dan Brat (Science PI), Tahsin Kurc, Ashish
Sharma, Tony Pan, David Gutman, Jun Kong, Sharath Cholleti, Carlos
Moreno, Chad Holder, Erwin Van Meir, Daniel Rubin, Tom Mikkelsen,
Adam Flanders, Joel Saltz (Director)
• caGrid Knowledge Center: Joel Saltz, Mike Caliguiri, Steve Langella
co-Directors; Tahsin Kurc, Himanshu Rathod Emory leads
• caBIG In vivo imaging team: Eliot Siegel, Paul Mulhern, Adam
Flanders, David Channon, Daniel Rubin, Fred Prior, Larry Tarbox and
many others
• In vivo imaging Emory team: Tony Pan, Ashish Sharma, Joel Saltz
• Emory ATC Supplement team: Tim Fox, Ashish Sharma, Tony Pan, Edi
Schreibmann, Paul Pantalone
• Digital Pathology R01: Foran and Saltz; Jun Kong, Sharath Cholleti,
Fusheng Wang, Tony Pan, Tahsin Kurc, Ashish Sharma, David
Gutman (Emory), Wenjin Chen, Vicky Chu, Jun Hu, Lin Yang, David J.
Foran (Rutgers)
Thanks!