Computational Approaches in Epigenomics
Guo-Cheng Yuan
Department of Biostatistics and Computational Biology
Dana-Farber Cancer Institute
Harvard School of Public Health
BIO506, Jan 11th, 2010
Definition
• Epigenetics refers to changes in phenotype (appearance) or gene expression caused by mechanisms other than changes in the underlying DNA sequence.
Wikipedia
Epigenetic mechanisms
• Nucleosome positions
• Histone modification
• DNA methylation
Chromatin
• DNA is packaged into chromatin.
• The nucleosome is the fundamental unit of chromatin. It wraps about 146 bp of DNA.
• The chromatin structure is hierarchical.
Felsenfeld and Groudine 2003
Nucleosome and histone modification
The first layer of chromatin structure looks like “beads on a string”.
A nucleosome is made of core histone proteins.
Amino acids in the N-terminal tails of histones can be covalently modified.
Felsenfeld and Groudine 2003
DNA methylation
Alberts et al. Molecular Biology of the Cell
DNA methylation normally occurs only at CpG dinucleotides and can be inherited through cell division.
Why do we care?
• Epigenetics is an extra layer of transcriptional control.
• Epigenetics plays an important role in development.
• Epigenetic mechanisms can cause cancer and other diseases.
• Epigenetic patterns are reversible and can be influenced by the environment.
Our goals
Data: epigenomic data, microarray, DNA sequence, …
Computational model
• Characterize cell-type specific epigenetic states
• Elucidate epigenetic targeting mechanism
• Understand epigenetic regulation in cell differentiation
• Epigenetic signature of diseases
TF binding
Chromatin domains
Intrachromosomal interactions
large-scale histone modification patterns
chromatin loops
A hidden Markov model for prediction of multi-gene chromatin domains
Jessica Larson
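A minimal sketch of how a hidden Markov model can segment a series of genes into chromatin domains. It assumes a two-state model (state 0 = background, state 1 = domain) with Gaussian emissions on a per-gene histone modification signal, decoded with the standard Viterbi algorithm. The states, parameters, and signal values are hypothetical and are not taken from the talk.

```python
import math

def gaussian_logpdf(x, mean, sd):
    """Log density of a normal distribution."""
    return -0.5 * math.log(2 * math.pi * sd * sd) - (x - mean) ** 2 / (2 * sd * sd)

def viterbi(signal, log_start, log_trans, emission_params):
    """Return the most probable hidden-state path for an observed signal."""
    n_states = len(log_start)
    # score[s] = best log-probability of any path that ends in state s
    score = [log_start[s] + gaussian_logpdf(signal[0], *emission_params[s])
             for s in range(n_states)]
    back = []  # back[t][s] = best predecessor of state s at step t + 1
    for x in signal[1:]:
        new_score, pointers = [], []
        for s in range(n_states):
            prev = max(range(n_states), key=lambda p: score[p] + log_trans[p][s])
            pointers.append(prev)
            new_score.append(score[prev] + log_trans[prev][s]
                             + gaussian_logpdf(x, *emission_params[s]))
        score = new_score
        back.append(pointers)
    # Trace back from the best final state.
    state = max(range(n_states), key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path

# Hypothetical per-gene signal along a chromosome (for illustration only).
signal = [0.1, -0.2, 0.0, 2.1, 1.8, 2.3, 1.9, 0.2, -0.1]
log_start = [math.log(0.5), math.log(0.5)]
log_trans = [[math.log(0.9), math.log(0.1)],
             [math.log(0.1), math.log(0.9)]]
emission_params = [(0.0, 0.5), (2.0, 0.5)]  # (mean, sd) for each state

domains = viterbi(signal, log_start, log_trans, emission_params)
print(domains)
```

The high self-transition probabilities are what favor contiguous multi-gene domains over isolated single-gene calls.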
Prediction results
Targeting mechanism for epigenetic factors
Nucleosome positions
Histone modification pattern
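[Figure: a dinucleotide frequency signal is decomposed on a wavelet basis; the decomposition is summarized by the wavelet energies E1, E2, E3, ...]
$$\log \frac{P(\text{nucleosome})}{P(\text{linker})} = \beta_1 E_1 + \cdots + \beta_k E_k$$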
An N-score model to predict nucleosome positions
Yuan and Liu
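A minimal sketch of the idea behind a wavelet-energy score: turn a DNA sequence into a dinucleotide indicator signal, decompose it with a wavelet transform, and combine the per-level energies linearly into a log-odds score of nucleosome versus linker. The Haar wavelet, the choice of dinucleotides, the example sequence, and the coefficients `betas` are all assumptions made for illustration; they are not the fitted model of Yuan and Liu.

```python
import math

def dinucleotide_signal(seq, dinucs=("AA", "TT", "AT", "TA")):
    """1.0 at each position where one of the chosen dinucleotides starts, else 0.0."""
    seq = seq.upper()
    return [1.0 if seq[i:i + 2] in dinucs else 0.0 for i in range(len(seq) - 1)]

def haar_energies(signal, levels):
    """Energy (sum of squared detail coefficients) at each Haar wavelet level."""
    approx = list(signal)
    energies = []
    for _ in range(levels):
        if len(approx) % 2:  # drop the trailing sample so pairs line up
            approx = approx[:-1]
        pairs = list(zip(approx[0::2], approx[1::2]))
        detail = [(a - b) / math.sqrt(2) for a, b in pairs]
        approx = [(a + b) / math.sqrt(2) for a, b in pairs]
        energies.append(sum(d * d for d in detail))
    return energies

def n_score(energies, betas):
    """Log-odds of nucleosome vs. linker as a linear combination of energies."""
    return sum(b * e for b, e in zip(betas, energies))

# Hypothetical sequence and coefficients (for illustration only).
seq = "AATTGCAATTGCAATT"
energies = haar_energies(dinucleotide_signal(seq), levels=3)
score = n_score(energies, betas=[0.5, -0.2, 0.1])
print(energies, score)
```

In practice the coefficients would be estimated from sequences with known nucleosome and linker labels, for example by logistic regression.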
N-score prediction in two yeast species
Lanterman et al.
Polycomb targets developmental genes in ES cells
Boyer et al. 2006
Polycomb
Oct4, Nanog, Sox2
expressed
repressed
Kim et al. 2008
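[Figure: a regression tree on motif scores. The root node tests the Motif A score S_A against a threshold c_A; its NO and YES branches lead to nodes that test the Motif B score S_B against c_B and the Motif C score S_C against c_C.]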
A computational model: BART
BART (Bayesian Additive Regression Trees) is a Bayesian ensemble of regression trees.
Chipman et al. 2007
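A minimal sketch of how a sum-of-trees model produces a prediction: each tree routes a gene through threshold tests on motif scores, and the leaf values of all trees are added. This shows only the prediction step for a fixed set of trees. The Bayesian part of BART, which samples many such tree ensembles and averages their predictions, is not shown. The motif names, thresholds, and leaf values below are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

@dataclass
class Node:
    """Internal node if `feature` is set, otherwise a leaf holding `value`."""
    feature: Optional[str] = None
    threshold: float = 0.0
    no: Optional["Node"] = None   # branch taken when score <= threshold
    yes: Optional["Node"] = None  # branch taken when score > threshold
    value: float = 0.0

def predict_tree(node: Node, scores: Dict[str, float]) -> float:
    """Follow threshold tests from the root down to a leaf."""
    while node.feature is not None:
        node = node.yes if scores[node.feature] > node.threshold else node.no
    return node.value

def predict_sum_of_trees(trees: List[Node], scores: Dict[str, float]) -> float:
    """Add the leaf values reached in every tree."""
    return sum(predict_tree(tree, scores) for tree in trees)

# Hypothetical trees over three motif scores (for illustration only).
tree_1 = Node(
    feature="motif_A", threshold=0.5,
    no=Node(feature="motif_B", threshold=0.3,
            no=Node(value=-0.4), yes=Node(value=0.1)),
    yes=Node(feature="motif_C", threshold=0.7,
             no=Node(value=0.2), yes=Node(value=0.6)),
)
tree_2 = Node(feature="motif_C", threshold=0.2,
              no=Node(value=-0.1), yes=Node(value=0.3))

gene_scores = {"motif_A": 0.8, "motif_B": 0.1, "motif_C": 0.9}
prediction = predict_sum_of_trees([tree_1, tree_2], gene_scores)
print(prediction)
```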
Overall prediction accuracy
AUC = 0.82
[Figure: ROC curves on testing data for all factors, 5 factors, CpG, and random.]
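A minimal sketch of how an AUC value can be computed from predicted scores and true labels. It uses the rank interpretation of the area under the ROC curve: the probability that a randomly chosen positive example scores higher than a randomly chosen negative one, with ties counted as one half. The labels and scores below are hypothetical and unrelated to the results reported in the talk.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve from binary labels (1/0) and predicted scores."""
    pos = [s for label, s in zip(labels, scores) if label == 1]
    neg = [s for label, s in zip(labels, scores) if label == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))

# Hypothetical test-set labels (1 = targeted gene) and model scores.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
auc = roc_auc(labels, scores)
print(auc)
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, which is the scale on which a value such as 0.82 is read.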
[Figure: propensity score versus the number of cell types in which the gene is targeted.]
Spring Liu; Zhen Shao
[Figure: TF network, Polycomb, Dnmt1, and Hox, shown for cell-type A and cell-type B.]
An integrated network
Jess Mar
Future directions
• How do genetic and epigenetic factors work together to regulate cell-type specific gene expression?
• How does the integrated regulatory network change across cell-types?
• Are there epigenetic signatures associated with common diseases and, if so, what role do they play?
Acknowledgment
• Jessica Larson
• Yingchun (Spring) Liu
• Zhen Shao
• John Quackenbush Lab
– Jess Mar
• Stuart Orkin Lab
– Xiaohua Shen
– Jongwan Kim
• Steve Altschuler
• Ollie Rando
• Jun Liu
• Claudia Adams Barr Program