Challenges and opportunities in personal omics profiling
Challenges and Opportunities in Personal OMICS Profiling
Suresh kumar
The broad idea behind the topic
• The functional state of a cell can be explained by an integrated set of different OMICS data, called a molecular signature or biomarker.
• The same fact can be exploited to find differences between diseased and normal states.
• For the diagnosis of disease in the future, personal OMICS profiling (POP) is indispensable.
• POP further makes it possible to produce personalized drugs based on an individual's profile.
Small clarification about components of this topic
• OMICS
– The term “omic” is derived from the Latin suffix “ome”, meaning mass or many. Thus, OMICS involves a mass (large number) of measurements per endpoint (Jackson et al., 2006).
• Integration of OMICS data
– Efficient integration of data from different OMICS can greatly facilitate the discovery of the true causes and states of disease; it is mostly done by software (Andrew et al., 2006).
• Biomarker development or molecular signatures
– A set of biomolecular features (snapshots of OMICS integration) used to predict a phenotype of clinical interest (e.g. diseased) on a previously unseen patient sample (Sung et al., 2012).
• Personalized OMICS profiling
– The minimal required OMICS data for every person
• Personalized medicine
– Drug formulations prepared on the basis of the POP (Chan and Ginsburg, 2011)
What is ‘omics’?
• In a biological context, the suffix –omics refers to the study of large sets of biological molecules (Smith et al., 2005)
• The realization that DNA alone does not regulate complex biological processes (a result of the HGP, 2001) triggered the rapid development of several fields in molecular biology that together are described by the term OMICS.
• The OMICS field includes
– Genomics (focused on the genome)
– Proteomics (focused on large sets of proteins, the proteome)
– Metabolomics (focused on large sets of small molecules, the metabolome) (Jelle et al., 2010)
Genomics
• The field of genomics has been divided into 3 major categories.
– Genotyping (focused on the genome sequence)
• The physiological function of genes and the elucidation of the role of specific genes in disease susceptibility (Syvanen, 2001)
– Transcriptomics (focused on genomic expression)
• The abundance of specific mRNA transcripts in a biological sample is a reflection of the expression levels of the corresponding genes (Manning et al., 2007)
– Epigenomics (focused on epigenetic regulation of genome expression)
• Study of epigenetic processes (regulation of expression that does not involve changes in DNA sequence) on a large (ultimately genome-wide) scale (Feinberg, 2007)
Genotyping
• Goal
– Identification of the physiological function of genes
– Role of specific genes in disease susceptibility (Syvanen et al., 2001)
• Common parameter used
– Among the different kinds of variation (insertions, deletions, SNPs, etc.), single nucleotide polymorphisms (SNPs) are the most commonly investigated (Sachidanandam et al., 2001) and can be used as markers for diseases.
– Tag SNPs (an informative subset of SNPs) and fine mapping are further used to identify the true cause of a phenotype (Patil et al., 2001).
• Application
– Identification of genes associated with disease
• Recent improvement in genotyping
– Array-based genotyping techniques allow the simultaneous assessment of up to 1 million SNPs per assay, which makes it possible to genotype across the entire genome in genome-wide association studies (GWAS) (Jelle et al., 2010)
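To make the idea of an SNP acting as a disease marker concrete, here is a minimal sketch of a single-SNP case-control comparison. It is an illustration only, not a method taken from the cited papers: the function names and the allele counts are hypothetical, and a real GWAS would add quality control, multiple-testing correction and population-structure adjustment.

```python
def allelic_chi_square(case_alt, case_ref, ctrl_alt, ctrl_ref):
    """Pearson chi-square statistic (1 degree of freedom) for a 2x2 allele-count table."""
    table = [[case_alt, case_ref], [ctrl_alt, ctrl_ref]]
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            # Expected count under the hypothesis of no association.
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (table[i][j] - expected) ** 2 / expected
    return chi2


def odds_ratio(case_alt, case_ref, ctrl_alt, ctrl_ref):
    """Odds of carrying the alternative allele in cases relative to controls."""
    return (case_alt * ctrl_ref) / (case_ref * ctrl_alt)


if __name__ == "__main__":
    # Hypothetical allele counts for one SNP (made-up numbers, not real data).
    counts = dict(case_alt=30, case_ref=70, ctrl_alt=10, ctrl_ref=90)
    print("chi-square:", allelic_chi_square(**counts))
    print("odds ratio:", odds_ratio(**counts))
```

A large chi-square value together with an odds ratio far from 1 would flag the SNP as a candidate marker; in an array-based study the same test is simply repeated for every SNP on the array.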
Transcriptomics
• Gene expression profiling
– The identification and characterization of the mixture of mRNA that is present in a specific sample.
• Principle
– The abundance of specific mRNA transcripts in a biological sample is a reflection of the expression levels of the corresponding genes (Manning et al., 2007).
• Application
– To associate differences in mRNA mixtures originating from different groups of individuals with phenotypic differences between the groups (Nachtomy et al., 2007).
• Challenge
– In contrast to the genome, the transcriptome is highly variable: it changes over time, between cell types and in response to environmental changes (Celis et al., 2000).
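The application described above, relating differences in mRNA abundance between groups to phenotypic differences, can be sketched as a log2 fold-change comparison. This is a simplified illustration under stated assumptions: the gene names, expression values, pseudocount and threshold are all hypothetical, and real analyses would also model variance and correct for multiple testing.

```python
import math
from statistics import mean


def log2_fold_change(case_values, control_values, pseudocount=1.0):
    """log2 ratio of mean expression (case / control); the pseudocount avoids division by zero."""
    return math.log2((mean(case_values) + pseudocount) /
                     (mean(control_values) + pseudocount))


def differential_genes(expression, threshold=1.0):
    """Return gene names whose absolute log2 fold change is at least the threshold."""
    hits = []
    for gene, groups in expression.items():
        lfc = log2_fold_change(groups["case"], groups["control"])
        if abs(lfc) >= threshold:
            hits.append(gene)
    return sorted(hits)


# Hypothetical expression values for three genes in two groups of samples.
EXAMPLE_EXPRESSION = {
    "GENE_A": {"case": [7.0, 7.0, 7.0], "control": [1.0, 1.0, 1.0]},
    "GENE_B": {"case": [3.0, 3.0], "control": [3.0, 3.0]},
    "GENE_C": {"case": [0.0, 0.0], "control": [3.0, 3.0]},
}

if __name__ == "__main__":
    print(differential_genes(EXAMPLE_EXPRESSION))
```

Because the transcriptome varies over time and between cell types, such a comparison is only meaningful when the samples in both groups are collected and processed in a comparable way.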
Epigenomics
• Epigenetic processes
– Mechanisms other than changes in DNA sequence that affect gene transcription and gene silencing (30-32).
– There are a number of epigenetic mechanisms, but epigenomics is mainly based on two: DNA methylation and histone modification (28, 33-39).
– Recently RNAi has attracted considerable attention (31, 40, 41).
• Goal
– The focus of epigenomics is to study epigenetic processes on a large (ultimately genome-wide) scale to assess their effect on disease (28, 29).
• Association with disease
– Hypermethylation of CpG islands located in promoter regions of genes is related to gene silencing (28, 36). Altered gene silencing plays a causal role in human disease (31, 34, 37, 38, 42).
– Histone proteins are involved in the structural packaging of DNA in the chromatin complex. Post-translational histone modifications such as acetylation and methylation are believed to regulate chromatin structure and therefore gene expression (34, 37).
Proteomics
• Proteomics provides insights into the role of proteins in biological systems. The proteome consists of all proteins present in a specific cell type or tissue. It is highly variable over time and between cell types, and it changes in response to changes in its environment, which is a major challenge (Fliser et al., 2007).
• The overall function of cells can be described by the proteins (intra- and inter-cellular) and the abundance of these proteins (Sellers et al., 2003)
• Although all proteins are directly related to mRNA (the transcriptome), post-translational modifications (PTMs) and environmental interactions make it difficult to predict protein abundance and function from gene expression analysis alone (Hanash et al., 2008)
• Tools for proteomics
– There are mainly two different approaches, based on detection by
• mass spectrometry (MS) and
• protein microarrays using capturing agents such as antibodies.
• Major focuses
– First, the identification of proteins and of proteins interacting in protein complexes
– Then, the quantification of protein abundance. The abundance of a specific protein is related to its role in cell function (Fliser et al., 2007)
Metabolomics
• The metabolome consists of small molecules (e.g. lipids or vitamins) that are also known as metabolites (Claudino et al., 2007).
• Metabolites are involved in energy transmission in cells (metabolism) by interacting with other biological molecules along metabolic pathways.
• Metabolic phenotypes are the by-products of interactions between genetic, environmental, lifestyle and other factors (Holmes et al., 2008).
• The metabolome is highly variable and time dependent, and it consists of a wide range of chemical structures.
• An important challenge of metabolomics is to acquire qualitative and quantitative information in the face of environmental perturbation (Jelle et al., 2010)
Application of different omics
Joyce et al., 2006
Overview of the different OMICS technologies
Technology | Molecules of interest | Definition | Temporal variance | Disease influence
Genotyping | DNA | Assessment of variability in DNA sequence in the genome | None | No
Epigenomics | Epigenetic modifications of DNA | Assessment of factors that regulate gene expression without changing DNA sequence of the genome | Low / Moderate | Probable
Gene expression profiling | RNA | Assessment of variability in composition and abundance of the transcriptome | High | Yes
Proteomics | Proteins | Assessment of variability in composition and abundance of the proteome | High | Yes
Metabolomics | Small molecules | Assessment of variability in composition and abundance of the metabolome | High | Yes
(Jelle et al., 2010)
(Jiannis, 2009)
(Carmen and Matthias, 2004)
Genomic techniques
Proteomic techniques
Biological sample
Metabolic Profiling Techniques
• No single technology can detect all compounds found in a biological system.
• Metabolic analytical techniques
– gas chromatography (GC),
– liquid chromatography (LC),
– capillary electrophoresis (CE)-MS, and
– NMR
(Kazuki S and Fumio M, 2010)
'OMICS' data repositories
(Joyce et al. 2006)
Why do we integrate the OMICS data?
• The functional state of a biological system can be seen as a set of snapshots of OMICS data
• To make better and faster decisions about therapeutic targets.
• To differentiate the diseased phenotype from the normal one
• Thus data integration is a perennial issue in OMICS.
(Akula et al., 2009)
Integrating OMICS data
• The computational tools for integrating 'omics' data generally tackle three specific tasks
– Identifying the network scaffold by delineating the connections that exist between cellular components
– Decomposing the network scaffold into its constituent parts in an attempt to understand the overall network structure
– Developing cellular or system models to simulate and predict the network behaviour that gives rise to particular cellular phenotypes.
(Akula et al., 2009)
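The first two tasks listed above can be sketched in a few lines: building a network scaffold from pairwise connections and decomposing it into its constituent parts. This is a minimal illustration, assuming the simplest possible decomposition (connected components found by breadth-first search); the component names and interactions are hypothetical, and the third task, dynamic modelling, is not covered here.

```python
from collections import defaultdict, deque


def build_scaffold(interactions):
    """Build an undirected adjacency map from (component, component) pairs."""
    graph = defaultdict(set)
    for a, b in interactions:
        graph[a].add(b)
        graph[b].add(a)
    return graph


def connected_modules(graph):
    """Decompose the scaffold into connected components using breadth-first search."""
    seen = set()
    modules = []
    for start in sorted(graph):
        if start in seen:
            continue
        seen.add(start)
        queue = deque([start])
        module = []
        while queue:
            node = queue.popleft()
            module.append(node)
            for neighbor in sorted(graph[node]):
                if neighbor not in seen:
                    seen.add(neighbor)
                    queue.append(neighbor)
        modules.append(sorted(module))
    return modules


# Hypothetical connections between cellular components from different omics layers.
EXAMPLE_INTERACTIONS = [
    ("geneA", "proteinA"),
    ("proteinA", "metaboliteX"),
    ("geneB", "proteinB"),
]

if __name__ == "__main__":
    scaffold = build_scaffold(EXAMPLE_INTERACTIONS)
    for module in connected_modules(scaffold):
        print(module)
```

Connected components are only the coarsest decomposition; dedicated tools typically look for finer structure such as densely connected modules or recurring motifs.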
OMICS integration techniques
(Joyce et al., 2006)
Software for omics data integration
(Joyce et al., 2006)
What is omics based medicine?
• To date, the application of comprehensive molecular information to medicine has been referred to as “genomic medicine” (Guttmacher and Collins, 2002)
• Post-genomic advances, collectively called omics, are giving rise to new possibilities in medicine. A rapidly progressing branch of informatics, called “clinical bioinformatics” (Knaup et al., 2004) or, more recently, “translational informatics” (Gaughan, 2006), plays an indispensable role by deriving clinically meaningful information from the vast amount of omics data, making medicine more predictive and preventive than conventional genomic medicine.
• This new stage of molecular medicine needs a new term to distinguish itself from genomic medicine. We may call it simply “omics-based medicine” (Tanaka, 2010)
Developmental stages of omics medicine
• Data-driven analysis of omics data
– Data mining or exploratory statistics applied to the gene expression profiles of diseased cells yields efficient sets of genes, called “signatures”, that can be used, for example, to predict the recurrence of cancers (Alizadeh et al., 2002).
• Model-driven analysis of omics data
– Diseases are better understood as phenotypes caused by a “systems distortion of the molecular network” due to the interrelated malfunction of genes and proteins; such diseases are termed pathway diseases (Grubb et al., 2009)
• System-based analysis of omics data
– All omics data from a single biological system are analysed together for disease as “systems pathology”, in the sense that it is a proper application of systems biology to diseases (Tanaka, 2009).
Three generations of omics based medicine
• The first generation of omics based medicine
– Base: the inborn individual differences of the genome, using genetic polymorphism
– Analytical method: simple statistical parameters
• The second generation of omics based medicine
– Base: the vast amount of post-genomic disease omics data containing comprehensive molecular information on diseased somatic cells
– Analytical method: data-driven analysis
• The third generation of omics based medicine
– Base: knowledge about the cellular molecular network and a system-level understanding of the disease, called systems pathology
– Analytical method: model-driven analysis
(Tanaka, 2009)
Some commercial signatures
What is personalized medicine?
• Personalized medicine (PM) is
– A broad and rapidly advancing field of health care that uses each person's unique clinical, genetic, genomic, and environmental information.
– An integrated, coordinated, and evidence-based approach for individualizing patient care.
– A field that uses our molecular understanding of disease to enhance preventive health care strategies.
• The overarching goal of personalized medicine is to optimize medical care and outcomes for each individual, resulting in an unprecedented customization of patient care.
• The components of personalized medicine are
– Family Health History (FHH)
– Health Risk Assessment (HRA)
– Integration of omics datasets
– Clinical Decision Support (CDS)
(Isaac and Ginsburg, 2010)
Family Health History (FHH)
• FHH is an invaluable tool for the delivery of personal health risk information, reflecting the complex combination of shared genetic, environmental, and lifestyle factors.
• The assessment and integration of FHH information have not been embraced by the health care community (79)
• The challenge of incorporating FHH into the public's health involves three essential components: (a) accessible, standard collection methods; (b) health care provider access; and (c) clinical guidance for interpretation and use (175).
Health Risk Assessment (HRA)
• A fundamental component of personalized medicine is a standard health risk assessment (HRA) to evaluate an individual's likelihood of developing the most common chronic diseases (or disease events).
• Examples
– The Framingham coronary heart disease model, developed from the Framingham Heart Study begun in 1948 (111).
– The Gail model for breast-cancer risk assessment and its modified versions are also widely accepted tools (58).
• Wider use is limited by the lack of standards for the clinical data required or the algorithms used, and by the lack of integration into health information technology systems (133)
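As a rough illustration of how a health risk assessment turns patient factors into a likelihood, the sketch below uses a generic logistic model. It is emphatically not the Framingham or Gail model: the risk factors, coefficients and intercept are hypothetical values chosen only to make the arithmetic easy to follow.

```python
import math

# Hypothetical coefficients for illustration only; NOT taken from the
# Framingham or Gail models or from any published risk score.
COEFFICIENTS = {"age_decades": 0.5, "smoker": 0.7, "family_history": 0.6}
INTERCEPT = -5.0


def risk_probability(factors):
    """Map risk-factor values to a probability with a logistic function."""
    score = INTERCEPT + sum(COEFFICIENTS[name] * value
                            for name, value in factors.items())
    return 1.0 / (1.0 + math.exp(-score))


if __name__ == "__main__":
    non_smoker = {"age_decades": 5, "smoker": 0, "family_history": 1}
    smoker = {"age_decades": 5, "smoker": 1, "family_history": 1}
    print("non-smoker:", round(risk_probability(non_smoker), 3))
    print("smoker:", round(risk_probability(smoker), 3))
```

The lack of standards mentioned above shows up directly in such a sketch: two systems that disagree on the input names, units or coefficients will produce risk estimates that cannot be compared.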
Clinical Decision Support (CDS)
• To optimize the use of FHH and HRAs, clinical decision support (CDS) systems are used.
• Computerized CDS systems, which integrate all patient-specific information to help manage diagnosis and treatment, are increasingly being used.
• CDS systems have been shown to improve prescribing practices, enhance preventive care, and improve compliance with evidence-based standards of care (12, 195, 224)
• CDS requires efficient algorithms and a standard input format for the different kinds of patient-specific information.
Clinical importance of omics
“-omics” approach | Generated information | Applications | Notable examples
Human genome sequence (genomics) | Whole-genome sequence, SNPs, and CNVs (10–15 million) | Disease mechanisms; disease diagnosis; pharmacogenomics | Age-related macular degeneration (120), HCV virologic response (1), AML (32), warfarin dosing (6)
Gene expression profiles (transcriptomics) | Microarrays and RNA sequencing (25,000 transcripts) | Disease mechanisms; disease diagnosis; disease prognosis; pharmacogenomics | AML (71), ALL (94), ACS (20), breast cancer (161)
Proteome (proteomics) | Protein profiles of specific protein products | Disease diagnosis | ACS (143)
Metabolome (metabolomics) | Metabolic profiles (1,000–10,000 metabolites) | Disease mechanisms; pharmacogenomics | ACS (182), drug toxicity (44), cancer profiling (76), CAD (193)
Abbreviations: ACS, acute coronary syndromes; ALL, acute lymphoblastic leukemia; AML, acute myeloid leukemia; CAD, coronary artery disease; CNV, copy number variation; HCV, hepatitis C virus; SNP, single-nucleotide polymorphism. Table adapted from Reference 66.
Molecular diagnostics of disease
Opportunities
• There are two important sources of opportunity for personal omics profiling
– The opportunities arising from advances in the biologic sciences
– The opportunities arising from advances in healthcare IT
Increased level in testing
*NIH Report on Genetics and Health **BNP = B-type Natriuretic Peptide
Predictive model development
Overall opportunities of PM
Advancement in health care IT
Challenge I
• OMICS data are currently spread worldwide in a wide variety of formats.
• These formats can be unified and migrated across platforms through suitable techniques
• Possible solution
– The use of XML techniques to store data.
– XML provides a document markup language that is easy to learn and makes data easy to retrieve, store and transmit. It is semantically richer than HTML.
(Akula, 2009)
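The XML-based solution can be sketched with Python's standard `xml.etree.ElementTree` module. The element and attribute names used here (`omicsRecord`, `sampleId`, `measurement`) are hypothetical and do not follow any published omics schema; the point is only that one self-describing record can be written and read back on any platform.

```python
import xml.etree.ElementTree as ET


def record_to_xml(sample_id, omics_type, measurements):
    """Serialize one sample's measurements to an XML string (hypothetical schema)."""
    root = ET.Element("omicsRecord",
                      attrib={"sampleId": sample_id, "type": omics_type})
    for name, value in measurements.items():
        element = ET.SubElement(root, "measurement", attrib={"name": name})
        element.text = str(value)
    return ET.tostring(root, encoding="unicode")


def xml_to_record(xml_text):
    """Parse the XML string back into (sample id, omics type, measurements)."""
    root = ET.fromstring(xml_text)
    measurements = {m.get("name"): float(m.text)
                    for m in root.findall("measurement")}
    return root.get("sampleId"), root.get("type"), measurements


if __name__ == "__main__":
    xml_text = record_to_xml("S1", "transcriptomics", {"GENE_A": 7.5, "GENE_B": 0.2})
    print(xml_text)
    print(xml_to_record(xml_text))
```

In practice the benefit comes from agreeing on a shared schema; the round trip above works only because the writer and the reader use the same element names.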
Challenge II
• Integrating fragmented knowledge from several sources of heterogeneous information into a coherent entity (Goble et al., 2008)
• It is widely recognized that successful data integration is one of the keys to improving the productivity of stored data.
• Possible solutions
– Bio-warehousing (tool: SQL)
• integrates its component databases into a common representational framework within a single database management system (Lee, 2006)
– Database federation (CORBA and J2EE)
• A federated database is a logical association of independent databases that provides a single, integrated, coherent view of all resources in the federation.
– Controlled vocabularies
• a form of data integration that enforces naming conventions for the data elements that ultimately appear in omics databases (Avraham et al., 2008)
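The controlled-vocabulary approach can be illustrated with a small sketch: names from different sources are mapped to one canonical term before the records are merged. The vocabulary, source names and values are hypothetical, and a real system would rely on a curated ontology or nomenclature rather than a hand-written dictionary.

```python
# Hypothetical controlled vocabulary: lower-cased synonym -> canonical name.
CONTROLLED_VOCABULARY = {
    "tp53": "TP53",
    "p53": "TP53",
    "brca1": "BRCA1",
}


def normalize(name):
    """Map a raw identifier to its canonical name; unknown names pass through unchanged."""
    cleaned = name.strip()
    return CONTROLLED_VOCABULARY.get(cleaned.lower(), cleaned)


def integrate(*sources):
    """Merge (source name, records) pairs into one view keyed by canonical name."""
    merged = {}
    for source_name, records in sources:
        for raw_name, value in records.items():
            merged.setdefault(normalize(raw_name), {})[source_name] = value
    return merged


if __name__ == "__main__":
    expression = ("expression", {"p53": 2.1})
    proteomics = ("proteomics", {"TP53": 0.8, "BRCA1": 1.2})
    print(integrate(expression, proteomics))
```

Without the naming convention, the two sources above would describe the same gene under two different keys and the integrated view would silently split it into two entries.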
Overall challenges
Making relevant information available
Why was iCOD developed?
– To provide a repository of molecular information and detailed clinical information
– Relating the genome to the pathological findings may yield good future medicine.
iCOD
• Data stored (140 patient cases of hepatocellular carcinoma)
– disease information of the patients
– CGH (Comparative Genomic Hybridization)
– gene expression profiles
– comprehensive clinical information
• clinical manifestations,
• medical images (CT, X-ray, ultrasounds, etc),
• laboratory tests,
• drug histories,
• pathological findings and
• life-style environmental information.
• Online address
– http://omics.tmd.ac.jp/icod_pub_eng
Omics data integration tool
• Aim
– To put omics data into an exchangeable format, organize the data in an integrative way and link it with applications for data interpretation and analysis
• Description
– DIPSBC is a data integration platform for medium-scale collaboration projects.
– Because of its modular design and the incorporation of XML data formats it is highly flexible and easy to use.
– DIPSBC uses XML for data representation
• URL
– http://dipsbc.molgen.mpg.de.
Advanced personalized medicine
Overview of the work
• Idea behind the work
– Personalized medicine may enter a new realm by combining genomic information with regular, periodic monitoring of physiological states by multiple high-throughput methods.
• Methodology
– The authors presented an integrative personal omics profile (iPOP), an analysis that combines genomic, transcriptomic, proteomic, metabolomic, and autoantibody profiles from a single individual over a 14-month period.
• Outcomes
– The iPOP analysis revealed various medical risks, including type 2 diabetes.
– It also uncovered extensive, dynamic changes in diverse molecular components and biological pathways across healthy and diseased conditions.
– Extremely high-coverage genomic and transcriptomic data, which provide the basis of the iPOP, revealed extensive heteroallelic changes during healthy and diseased states and an unexpected RNA editing mechanism.
– The study demonstrates that longitudinal iPOP can be used to interpret healthy and diseased states by connecting genomic information with additional dynamic omics activity.
Conclusion
• Advances in molecular biology and computational informatics are powering personalized medicine
• Personalized medicine presents real opportunities and real challenges to the existing model of care provision
• Personalized medicine includes genomics, but is more than genomics
• Healthcare IT will be vital to the realization of personalized medicine
Thank you