Translational Bioinformatics 2009: The Year in Review
Russ B. Altman, MD, PhD, Stanford University
Wednesday, March 18, 2009
Thanks!
• Casey Overby
• Gil Omenn
• Iddo Friedberg
• Howard Bilofsky
• Brad Malin
• Phil Bourne
• Ted Shortliffe
• Ramon Felciano
• Atul Butte
• Bernie Daigle
• Chirag Patel
• Sarah Aerni
• David Chen
• Joel Dudley
• Alex Morgan
• Yves Lussier
• Andrea Califano
Goals
• Provide an overview of the major scientific events, trends and publications in translational bioinformatics
• Create a “snapshot” of what seems to be important in March, 2009 for the amusement of future generations.
• Marvel at the progress made and the opportunities ahead.
Process
1. Think about what has had early impact
2. Think about sources to trust
3. Solicit advice from colleagues
4. Surf online resources
5. Select papers to highlight in ~2 slides and some to highlight in < 1 slide.
Caveats
• Strictly considered 3/1/08 to 3/16/09 (one exception)
• Focused on human biology and clinical implications (except really important model systems): molecules, clinical data, informatics.
• Considered both data sources and informatics methods (and combination)
• Tried to avoid simply following crowd mentality.
Final list
• 70 semi-finalist papers
• 19 presented here briefly
• 11 others mentioned
• This talk and bibliography will be made available on the conference website.
Final categories
• Literature analysis
• Genetic Privacy
• Genes x Environment
• Genes + drugs/small molecules
• Schizophrenia
• Networks for understanding disease
• Stem cell biology
• Potpourri
But first, a lesson in how to make impact...
“A recipe for high impact” (Cokol et al, Genome Biology, 2007)
• Use MeSH heading usage to define temperature and novelty
• Temperature = high for using popular concepts
• Novelty = high for using new MeSH headings
• Separately applied to methodology and topic
• Conclusion: High impact factor = high topic temperature and medium-low method temperature
Method T vs. Topic T
“A recent advance in the automatic indexing of the biomedical literature”
(Neveol et al, J. Biomed. Info.)
• NLM team uses natural language processing to assign main heading/subheading pair recommendations
• 48% precision, 30% recall compared to human indexers
• Deployed and used currently.
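The precision and recall figures above compare recommended heading/subheading pairs against the pairs chosen by human indexers. A minimal sketch of that comparison for a single article, assuming recommendations and human indexing are both available as sets of pairs (the function name and the example pairs are hypothetical, not taken from the paper):

```python
def precision_recall(recommended, human):
    """Precision and recall of recommended index terms against human-assigned terms."""
    recommended, human = set(recommended), set(human)
    true_pos = len(recommended & human)
    precision = true_pos / len(recommended) if recommended else 0.0
    recall = true_pos / len(human) if human else 0.0
    return precision, recall

# Hypothetical (main heading, subheading) pairs for one article.
recommended = {("Neoplasms", "genetics"), ("Warfarin", "administration & dosage")}
human = {("Neoplasms", "genetics"), ("Neoplasms", "therapy"), ("Mice", None)}

p, r = precision_recall(recommended, human)
print(f"precision = {p:.2f}, recall = {r:.2f}")
```

Precision is the fraction of recommendations an indexer also chose; recall is the fraction of the indexer's choices that were recommended.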
Genetic Privacy
“Resolving individuals contributing trace amounts of DNA to highly complex mixtures using high-density SNP genotyping microarrays”
(Homer et al, PLoS Genetics)
• Previously: assumption that reporting pooled data (e.g. distribution of SNP alleles for 1000 individuals) would be safe
• This paper demonstrated that, given population allele frequencies and mixture allele frequencies, one can infer with high confidence whether a given person contributed DNA to the mixture
Key idea: 500K noisy measurements = certainty.
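The key idea can be illustrated with a small simulation. The sketch below follows the general form of the per-SNP distance statistic, D = |Y - Pop| - |Y - M|, summarized with a one-sample t statistic across SNPs. Everything else is an assumption for illustration: the data are simulated, the true population frequencies are used as the reference, and the pool size and SNP count are far smaller than in the paper.

```python
import math
import random

def draw_genotype(p, rng):
    """Allele fraction (0, 0.5 or 1) of one diploid genotype at allele frequency p."""
    return ((rng.random() < p) + (rng.random() < p)) / 2

def membership_t(person, pop_freq, mix_freq):
    """One-sample t statistic over SNPs of D = |Y - Pop| - |Y - M|."""
    d = [abs(y - p) - abs(y - m) for y, p, m in zip(person, pop_freq, mix_freq)]
    n = len(d)
    mean = sum(d) / n
    var = sum((x - mean) ** 2 for x in d) / (n - 1)
    return mean / math.sqrt(var / n)

rng = random.Random(0)
n_snps, n_pool = 20000, 20  # hypothetical sizes, chosen to keep the run short

# Simulated reference allele frequencies and a pooled mixture of n_pool people.
pop_freq = [rng.uniform(0.1, 0.9) for _ in range(n_snps)]
pool = [[draw_genotype(p, rng) for p in pop_freq] for _ in range(n_pool)]
mix_freq = [sum(column) / n_pool for column in zip(*pool)]

# One person drawn from the same population who is not in the pool.
outsider = [draw_genotype(p, rng) for p in pop_freq]

t_in = membership_t(pool[0], pop_freq, mix_freq)
t_out = membership_t(outsider, pop_freq, mix_freq)
print(f"contributor t = {t_in:.1f}, non-contributor t = {t_out:.1f}")
```

Each SNP on its own says almost nothing, but a contributor's genotype sits slightly closer to the mixture than to the population at most SNPs, and that small bias accumulates across thousands of markers.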
“On Jim Watson’s APOE status: genetic information is hard to hide”
(Nyholt et al, Eur. J. of Human Genetics)
• Watson’s genome published, but ApoE sequence (associated with Alzheimer’s) redacted (E4 allele increases risk, E2 decreases)
• Investigators showed that SNPs in LD (correlated) with the APOE markers make imputation trivial (but did not report allele)
• JWGB removed additional 2MB of sequence
“Human genomes as email attachments”
(Christley et al, Bioinformatics)
• 99% of genome is identical (uses reference genome and reference SNP db)
• Encode position, offset, Huffman code for sequence, dbSNP ref when possible
• Watson genome = 3.3 million SNPs, 220K indels
• 3.2 GB to 84 MB (ref genome) to 4.1 MB (full compression)
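The core trick is to store only differences from the reference, and to store positions as small offsets from the previous variant. A minimal sketch of that idea, assuming SNPs on one chromosome sorted by position. A simple variable-length integer code stands in here for the Huffman coding and dbSNP lookups described above, and the positions are hypothetical:

```python
def encode_varint(n):
    """Encode a non-negative integer in 7-bit groups, low-order group first."""
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)  # high bit set: more bytes follow
        else:
            out.append(byte)
            return bytes(out)

def encode_snps(snps):
    """Pack sorted (position, base) pairs as position deltas plus a base code."""
    out = bytearray()
    prev = 0
    for pos, base in snps:
        out += encode_varint(pos - prev)
        out.append("ACGT".index(base))
        prev = pos
    return bytes(out)

def decode_snps(data):
    """Invert encode_snps."""
    snps, pos, i = [], 0, 0
    while i < len(data):
        delta, shift = 0, 0
        while True:
            byte = data[i]
            i += 1
            delta |= (byte & 0x7F) << shift
            shift += 7
            if not byte & 0x80:
                break
        pos += delta
        snps.append((pos, "ACGT"[data[i]]))
        i += 1
    return snps

# Hypothetical SNP calls relative to a reference genome.
snps = [(10583, "A"), (10611, "G"), (13302, "T"), (752566, "A")]
packed = encode_snps(snps)
print(len(packed), "bytes for", len(snps), "SNPs")
```

Because neighbouring variants are usually close together, most offsets fit in one or two bytes, which is why delta encoding pays off before any entropy coding is applied.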
Gene x Environment
“Prevalence in the United States of selected candidate gene variants: Third National Health and Nutrition Examination Survey, 1991-1994”
(Chang et al, American J. of Epidemiology)
• NHANES, ongoing population study, incredible documentation of health
• 7159 participants
• Allele frequencies for 90 variants, 50 genes from 6 pathways: nutrient metabolism, immunity & inflammation, xenobiotic metabolism, DNA repair, blood pressure, oxidative stress
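An allele frequency is estimated from genotype counts by counting alleles. A minimal sketch, assuming a biallelic variant and simple unweighted counts (a survey such as NHANES would normally apply sampling weights, which are omitted here; the counts are hypothetical):

```python
def allele_frequency(n_hom_ref, n_het, n_hom_alt):
    """Frequency of the alternate allele from diploid genotype counts."""
    total_alleles = 2 * (n_hom_ref + n_het + n_hom_alt)
    if total_alleles == 0:
        raise ValueError("no genotypes supplied")
    # Each heterozygote carries one alternate allele, each alt homozygote two.
    return (n_het + 2 * n_hom_alt) / total_alleles

# Hypothetical genotype counts for one variant.
freq = allele_frequency(n_hom_ref=400, n_het=450, n_hom_alt=150)
print(f"alternate allele frequency = {freq:.3f}")
```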
C in VDR rs731236
“The ‘etiome’: identification and clustering of human disease etiological factors”
(Liu et al, BMC Bioinformatics & AMIA SUMMIT)
• Defined 3342 etiological MeSH headings associated with 3159 diseases
• Defined etiology based on UMLS Semantic network
• Defined 1100 genes associated with 1034 diseases
• Created joint clustering of diseases, genes, etiologies as basis for understanding environmental influences on genetic diseases
The “ACE/Hypertension/Diet” cluster
The “p53/cancer/toxin” cluster
Genes + Drugs/Small Molecules
“Drug target identification using side-effect similarity”
(Campillos et al, Science)
• Used similarity in side effect phenotypes to infer if two drugs share a target
• Applied to 746 drugs, to create drug-drug network of 1018 relationships
• 261 relationships in chemically dissimilar molecules (20 tested, 13 validated by binding, 11 with reasonable affinity)
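The underlying idea is that drugs with overlapping side-effect profiles may share a target. A minimal sketch using plain Jaccard similarity over side-effect sets, which is a deliberate simplification and not the weighted scoring used in the paper. Drug names, side-effect terms and the threshold are all hypothetical:

```python
from itertools import combinations

def jaccard(a, b):
    """Jaccard similarity of two sets: |intersection| / |union|."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0

# Hypothetical side-effect annotations.
side_effects = {
    "drug_A": {"nausea", "headache", "dizziness", "insomnia"},
    "drug_B": {"headache", "dizziness", "insomnia", "tremor"},
    "drug_C": {"rash", "cough"},
}

threshold = 0.5  # hypothetical cut-off for proposing a shared target
edges = [
    (x, y, jaccard(side_effects[x], side_effects[y]))
    for x, y in combinations(sorted(side_effects), 2)
]
edges = [edge for edge in edges if edge[2] >= threshold]
print(edges)
```

Each surviving edge is a candidate drug-drug relationship; in the paper such candidates were then checked experimentally by binding assays.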
Drug-drug network
Rabeprazole (PPI) and neurological effects
“Estimation of the warfarin dose with clinical and pharmacogenetic data”
(International Warfarin Pharmacogenetics Consortium, New Eng. J. Med.)
• 5000+ patients pooled from 21 sites, 9 countries
• Clinical variables + genotypes for CYP2C9 and VKORC1
• Pharmacogenetic dosing equation outperformed clinical algorithm
• High- and low-dose extremes benefit the most
“Metabolomics analysis reveals large effects of gut microflora on mammalian blood metabolites”
(Wikoff et al, PNAS)
• MS study of metabolites in plasma in colonized and uncolonized clonal mice.
• Metabolite levels markedly affected, particularly amino acids
• Phase II metabolizing enzyme response to microflora observed
• Implies that host genotype only one consideration in predicting the presence and metabolism of xenobiotics
Two systems for drug metabolism
“Metabolic profiles delineate potential role for sarcosine in prostate cancer progression”
(Sreekumar et al, Nature)
• Combined HPLC & Mass Spec to profile 1126 metabolites across 262 clinical samples (tissue, urine, plasma)
• Able to distinguish normal, BPH, local cancer and metastatic cancer from signatures
• Sarcosine, a derivative of glycine (GLY), increased in invasive cancer
• Injection of sarcosine or knock-down led to invasive phenotype from benign cells!
Metabolite levels
Sarcosine levels measured
Schizophrenic results on...Schizophrenia
“No significant association of 14 candidate genes with schizophrenia in a large European ancestry sample: implications for psychiatric genetics”
(Sanders et al, Am. J. Psychiatry)
• 1870 schizophrenics, 2002 controls
• 789 SNPs in 14 genes with reported associations
• No genome-wide significance for SNPs or haplotypes, all compatible with chance.
Hits precisely as expected by chance
“Rare structural variants disrupt multiple genes in neurodevelopmental pathways in schizophrenia”
(Walsh et al, Science)
• Novel deletions and duplications in 20% vs. 5%
“A genome-wide investigation of SNPs and CNVs in schizophrenia”
(Need et al, PLoS Genetics)
• 8 cases with deletions > 2 MB, 0 controls
• No evidence of preferential disruption of schizophrenia pathways
Networks that tell us about disease...
“Network-based global inference of human disease genes”
(Wu et al, Molecular Systems Biology)
• CIPHER tool to predict and rank potential disease genes
• Based on intuition that similar phenotypes are linked to functionally related genes
• Reports ranked candidates for 5000 phenotypes over human genome
• Builds on previous work to create a new classification of disease based on molecular data.
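The intuition above (similar phenotypes, functionally related genes) suggests scoring a candidate gene by how well two profiles agree: the similarity of the query phenotype to other phenotypes, and the network closeness of the candidate to genes already known for those phenotypes. A minimal sketch of such a concordance ranking using Pearson correlation. This is an illustration of the intuition only, not the CIPHER implementation, and all names and numbers are hypothetical:

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0

# Hypothetical similarity of the query phenotype to four reference phenotypes.
phenotype_similarity = [0.9, 0.7, 0.2, 0.1]

# Hypothetical network closeness of each candidate gene to the genes
# already known for those same four reference phenotypes.
gene_closeness = {
    "gene_X": [0.8, 0.6, 0.1, 0.2],
    "gene_Y": [0.1, 0.2, 0.7, 0.9],
}

scores = {g: pearson(phenotype_similarity, c) for g, c in gene_closeness.items()}
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked)
```

A gene whose closeness profile tracks the phenotype similarity profile ranks highest, which is the sense in which related phenotypes point to related genes.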
Relating phenotypes to genes
“Genetic-linkage mapping of complex hereditary disorders to a whole-genome molecular-interaction network”
(Iossifov et al, Genome Res.)
• Problem: multifactorial, heterogeneous disorders
• Novel framework combines linkage formalism with molecular interaction networks (text mining)
• Better statistics through grouping of genes
• Apply to autism, bipolar disorder & schizophrenia
• Find shared gene targets
“A systems biology approach to prediction of oncogenes and molecular perturbation targets in B-cell lymphomas”
(Mani et al, Mol. Sys. Bio.)
• Method to detect cancer-causing genetic lesions
• Look for molecular interactions that become dysregulated in specific tumors
• Algorithm for aberrant genes using mutual information measuring gain- or loss-of-correlation
• Applied to B-cell interactome, with strong predictive performance
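Gain or loss of correlation between two genes can be measured as a change in mutual information between sample groups. A minimal sketch using a plug-in estimate on discretized expression values, assuming two states per gene. The paper's estimator and data handling may well differ, and the expression vectors below are hypothetical:

```python
import math
from collections import Counter

def mutual_information(x, y):
    """Plug-in mutual information (in bits) of two equal-length discrete sequences."""
    n = len(x)
    px, py, pxy = Counter(x), Counter(y), Counter(zip(x, y))
    return sum(
        (c / n) * math.log2((c / n) / ((px[a] / n) * (py[b] / n)))
        for (a, b), c in pxy.items()
    )

# Hypothetical discretized expression (0 = low, 1 = high) for one gene pair.
normal_a = [0, 0, 1, 1, 0, 1, 0, 1]
normal_b = [0, 0, 1, 1, 0, 1, 0, 1]  # tightly coupled in normal samples
tumor_a = [0, 0, 1, 1, 0, 1, 0, 1]
tumor_b = [0, 1, 0, 1, 0, 1, 1, 0]   # coupling lost in tumor samples

delta_mi = mutual_information(tumor_a, tumor_b) - mutual_information(normal_a, normal_b)
print(f"change in mutual information = {delta_mi:+.2f} bits")
```

A strongly negative change flags an interaction that has become dysregulated in the tumor group; a positive change would flag a gained dependency.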
“Systems biology approach predicts immunogenicity of the yellow fever vaccine in humans”
(Querec et al, Nat. Immunology)
• Yellow fever vaccine old and effective, but totally empirical
• Goal to understand biology using microarray expression and cytokine activities
• Combined data to predict T-cell response correctly in 90% of independent sample
• Predicted neutralizing antibody formation with 100% accuracy
Strong up-regulated network
Separating CD8 response
“Analysis of Drosophila segmentation network identifies a JNK pathway factor overexpressed in kidney cancer”
(Liu et al, Science)
• Initial goal: build network of fly segmentation genes, whose human homologs associated with cancer
• Combined gene expression, chromatin IP, literature mining, and yeast 2-hybrid results to build network of genes involved in segmentation control
• Found a major hub in resulting network = D-SPOP modulates JNK pathway--implicated in cancers
• H-SPOP tested and found as biomarker in 99% of clear cell renal cell carcinomas
D-SPOP is a hub in JNK network
H-SPOP stains clear cell carcinoma specifically
Stem cell biology
“Integration of external signaling pathways with core transcriptional network in embryonic stem cells”
(Chen et al, Cell)
• The stem cell miracle continues...4 transcription factors can be introduced into somatic cells to transform them into pluripotent stem cells.
• Investigators mapped targets of 13 transcription factors implicated in stem cell transformation (ChIP-seq)
• Specific genomic regions targeted by different TFs.
• Understanding regulation of ES-cells is coming...
TF binding is co-localized
Binding motifs defined
The mother of TF networks
Great work skipped...
• Biosurveillance of emerging biothreats using scalable genotype clustering
• The infinite sites model of genome evolution
• Predicting unobserved phenotypes for complex traits from whole-genome SNP data
• Cost effective strategies for completing the interactome
• Relating protein pharmacology by ligand chemistry
• Better bioinformatics through usability analysis
• Dynamic modularity in protein interaction networks predicts breast cancer outcome
• A burst of segmental duplications in the genome of the African great ape ancestor
• DNA sequencing of a cytogenetically normal acute myeloid leukaemia genome
• A universal mechanism ties genotype to phenotype in trinucleotide diseases
• Global diversity in the human salivary microbiome
“The Human Protein Atlas--a tool for pathology”
(Ponten et al, J. Pathology)
• 6122 antibodies (representing 5011 proteins) exposed to 708 tissues. Data available.
2008 Crystal ball...
• Sequencing makes a comeback (watch out microarrays....)
• Translational science projects will create astounding data sets (hopefully available) to catalyze research
• GWAS will continue to proliferate
• Consumer-oriented genetics will create demand for online resources for interpretation
• Difficult decisions about when/how to bring new molecular diagnostics to practice.
2009 Crystal ball...
• Focus on mechanism in interpreting genetic associations
• More sophisticated mechanisms to find signal in GWAS, including data integration
• Cellular dynamics of expression, metabolites, proteins
• Multiple human & cancer genome sequences
• Consumer sequencing (vs. genotyping)