Identification of epigenetic patterns in birth cohorts
Allan Just PhD
Harvard School of Public Health
PPTOX IV: Boston 2014
Identifying epigenetic patterns in birth cohorts
Contact: [email protected]
New directions for epigenomics in epidemiology
1. Why epigenetics excites the PPTOX crowd
2. How high-throughput technologies are changing the biomarker game
3. Strategies when features vastly outnumber samples
Allan Just – PPTOX2014 2
Epigenetics in perinatal programming
Boekelheide et al. Environ Health Persp. 2012
Beyond candidate gene approaches: Epigenome-wide
flickr.com/photos/jgr/2771104231/
High throughput + quantitative epi
Joubert et al. EHP 2012
Prenatal smoking has reproducible associations with the fetal methylome measured in blood
• n=1062
• Effects are small differences
• Sign differs within AHRR
Same top site seen in former vs never smoking adults
X-axis: Years since quitting (showing 432 former smoking males from the Normative Aging Study)
line adds median from 187 never smokers
Steps toward a successful epigenome-wide association study
Michels et al. Nature Methods 2013
Information/cost tradeoff
• Pyrosequencing: few sites; ~$5/sample
• Sequenom MassArray
• Illumina 450k BeadChip: 485K sites; ~$300/sample
• RRBS: 1M sites
• CpGiant: 5M sites
• Whole Genome Bisulfite Seq: 28M sites; $$$
High throughput technologies
Microarrays
Standardized assays
Measure the same thing
Less expensive
Less flexible
Less future development
Next Gen Sequencing
More sites per sample
Costs rapidly decreasing
Innovation in new assays
Bioinformatic challenges
Information per site varies
Measuring DNA methylation with the Illumina 450K microarray
• bisulfite treated DNA hybridizes to targeted probes → fluorescence
• % methylation = methylated fluorescence / (unmethylated + methylated fluorescence)
• 485,512 methylation sites
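As a minimal sketch of the ratio described on this slide, the per-site methylation fraction can be computed from the two fluorescence intensities. The function name, the optional `offset` argument, and the example intensities are illustrative assumptions and are not taken from the talk or from any vendor software.

```python
def methylation_fraction(methylated, unmethylated, offset=0.0):
    """Fraction methylated at one site from two fluorescence intensities.

    offset is a hypothetical stabilizing constant added to the denominator;
    offset=0 reproduces the ratio as written on the slide.
    """
    total = methylated + unmethylated + offset
    if total <= 0:
        raise ValueError("total intensity must be positive")
    return methylated / total


# Made-up (methylated, unmethylated) intensities for three sites.
signals = [(3000.0, 1000.0), (200.0, 1800.0), (0.0, 2500.0)]
fractions = [methylation_fraction(m, u) for m, u in signals]
percent_methylation = [100 * f for f in fractions]
```

Multiplying the fraction by 100 gives the "% methylation" scale used on the slide.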
Image from illumina.com
The benefit of standards: widespread adoption of the 450k
• Shared methods – software for analysis
• Public Repositories
marmal-aid.org (Lowe and Rakyan 2013)
contains 14,586 samples as of 10/27/2014
Opportunities for cohort epigenomics
Standardization of laboratory assays
measurement becomes a service
But, overwhelming amounts of data
How do you process, learn from, and communicate with bigger data?
Big(ger) data: scaling our approach
• Having more data doesn’t relieve assumptions (e.g. confounding, linearity, etc.)
flickr.com/photos/terryfreedman/15425678852/
Tradeoffs and consequences of the new epigenomics
Multiple Comparisons:
On the 450k, Bonferroni significance is ~1e-7 (0.05 / 485,512 sites)
What happens when n << p?
Can you look at everything and find something?
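The multiple-comparisons arithmetic can be sketched as follows. The 485,512 site count comes from the 450k slide; the 0.05 family-wise alpha, the random seed, and the simulated null p-values are assumptions for illustration only, not data from any cohort.

```python
import random

N_SITES = 485_512   # methylation sites on the 450k array (from the talk)
ALPHA = 0.05        # conventional family-wise error rate (assumed)

# Bonferroni: divide alpha by the number of tests.
bonferroni_threshold = ALPHA / N_SITES

# With no true associations, p-values are uniform on [0, 1], so an
# unadjusted 0.05 cutoff still flags about 5% of all sites.
expected_nominal_hits = N_SITES * ALPHA

# Simulate one null epigenome-wide scan (pure noise).
random.seed(0)
null_pvalues = [random.random() for _ in range(N_SITES)]
nominal_hits = sum(p < ALPHA for p in null_pvalues)
bonferroni_hits = sum(p < bonferroni_threshold for p in null_pvalues)

print(f"Bonferroni threshold: {bonferroni_threshold:.2e}")
print(f"Expected nominal hits under the null: {expected_nominal_hits:.0f}")
print(f"Simulated nominal hits: {nominal_hits}")
print(f"Simulated Bonferroni hits: {bonferroni_hits}")
```

Looking at everything at an unadjusted cutoff will find "something" at tens of thousands of sites even when nothing is there, which is why the stricter threshold matters.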
Wilhelm-Benartzi et al. Br J Cancer 2013
Advancing Epigenomics Theme 1: improved measurements
Advancing Epigenomics Theme 2: improved analytic approaches
Restrict the search:
• Subset to biologically relevant features
• Prioritize best measured features
Borrow information:
• Pathway analyses
• Bump hunting or adjacency clustering (Aclust)
Borrow from your neighbor: Aclust
• Cluster adjacent sites based on correlation (within nearby region, ~1kb)
• Model methylation in the cluster as a multivariate outcome
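The first bullet can be illustrated with a deliberately simplified sketch. This is not the Aclust implementation from Sofer et al.; it only merges each site with its immediate neighbor when the two are close and correlated. The function names, the 0.8 correlation cutoff, and the toy data are assumptions; the 1000 bp gap mirrors the ~1kb region on the slide. The second bullet (modeling each cluster as a multivariate outcome) is not shown.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def cluster_adjacent(positions, methylation, max_gap=1000, min_corr=0.8):
    """Group neighboring sites into clusters.

    positions: genomic coordinates, sorted ascending.
    methylation: methylation[i] holds site i's values across samples.
    A site joins the current cluster if it lies within max_gap bp of the
    previous site and correlates with it at or above min_corr.
    """
    if not positions:
        return []
    clusters = [[0]]
    for i in range(1, len(positions)):
        prev = clusters[-1][-1]
        close = positions[i] - positions[prev] <= max_gap
        if close and pearson(methylation[prev], methylation[i]) >= min_corr:
            clusters[-1].append(i)
        else:
            clusters.append([i])
    return clusters


# Toy data: 5 sites measured in 4 samples (values are made up).
positions = [100, 350, 600, 5000, 5200]
methylation = [
    [0.10, 0.20, 0.30, 0.40],
    [0.12, 0.22, 0.31, 0.45],
    [0.80, 0.10, 0.60, 0.20],
    [0.50, 0.55, 0.60, 0.70],
    [0.52, 0.56, 0.63, 0.71],
]
clusters = cluster_adjacent(positions, methylation)
```

In the toy data, the third site is nearby but uncorrelated with its neighbor, and the fourth site is correlated with nothing within 1 kb upstream, so each starts a new cluster.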
Sofer et al. Bioinformatics 2013
Advancing Epigenomics Theme 3: larger sample sizes
Meta-analysis:
Prenatal And Childhood Epigenetics (PACE) Consortium
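One common way to pool a single site's results across cohorts is a fixed-effect, inverse-variance weighted meta-analysis. The sketch below shows that general technique; it is not a description of how the PACE Consortium runs its analyses, and the function name and cohort estimates are hypothetical.

```python
from math import sqrt


def fixed_effect_meta(estimates):
    """Inverse-variance weighted pooling of (effect, standard_error) pairs.

    Returns the pooled effect and its standard error.
    """
    weights = [1.0 / se ** 2 for _, se in estimates]
    total_weight = sum(weights)
    pooled = sum(w * b for w, (b, _) in zip(weights, estimates)) / total_weight
    pooled_se = sqrt(1.0 / total_weight)
    return pooled, pooled_se


# Hypothetical per-cohort estimates at one site: (effect, standard error).
cohort_estimates = [(-0.02, 0.01), (-0.03, 0.02)]
pooled_effect, pooled_se = fixed_effect_meta(cohort_estimates)
```

The pooled standard error is smaller than either cohort's own, which is the sense in which combining cohorts acts like a larger sample size.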
Summary: New directions for epigenomics in epidemiology
1. Exciting prospects to discover broad links with early exposures and later outcomes
2. Measurements are standardized; analyses are complicated
3. Need care when n << p; borrow information using biology and statistics
Collaborators
• Andrea Baccarelli
• Robert Wright
• Roz Wright
• Joel Schwartz
• Xihong Lin
Computational Epigenomics working group:
• Amar Mehta
• Elena Colicino
• Golareh Agha
• Richard Barfield
Grant Support from the NIEHS: K99 ES023450
Contact: [email protected]