ChIP Seq Presentation

  • 8/6/2019 ChIP Seq Presentation

    PRACTICAL: CHIP-SEQ DATA ANALYSIS

    Andre Faure & Petra Schwalie

    Paul Flicek Lab, Vertebrate Genomics, EMBL-EBI

    9 March 2010

    Wednesday, 9 March 2011

    RESOURCES

    http://www.bioconductor.org (R packages & workflows; help)

    http://seqanswers.com (software overview; forum)

    data repositories: ArrayExpress & GEO, ENA

    ENCODE, modENCODE (collaborative efforts, ChIP-seq)

    Reviews + Benchmarks (see last slide)

    WORKFLOW

    Raw data (ENCODE CTCF, H3K36me3, Input in K562 & HepG2)

    Quality check & align (not discussed here)

    (1) Peak-calling

    (2) Genomic context

    Read profile plots

    (3) Motif analysis (de novo & scanning)

    (4) Differential enrichment

    (1) PEAK-CALLING

    chipseq, GenomicRanges (Bioconductor)

    estimating fragment length

    extending reads

    islands of enrichment

    modeling the background (e.g. Poisson, negative binomial)

    calling peaks (manual, MACS, SWEMBL)

    genomic overlaps: comparison of peak-calling results
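The peak-calling steps listed above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the practical's Bioconductor code and not how MACS or SWEMBL work internally: the fragment length is taken as given rather than estimated, the background is a single genome-wide Poisson rate instead of an input control, and all function names and read coordinates are made up for the example.

```python
import math

def extend_read(start, strand, read_len, frag_len):
    """Extend a read to the assumed fragment length; returns a half-open (start, end)."""
    if strand == "+":
        return start, start + frag_len
    end = start + read_len              # right-most base of a minus-strand read
    return max(0, end - frag_len), end

def coverage(intervals, chrom_len):
    """Per-base coverage of half-open intervals, built from a difference array."""
    diff = [0] * (chrom_len + 1)
    for s, e in intervals:
        diff[min(s, chrom_len)] += 1
        diff[min(e, chrom_len)] -= 1
    cov, running = [], 0
    for d in diff[:chrom_len]:
        running += d
        cov.append(running)
    return cov

def poisson_sf(k, lam):
    """P(X >= k) for X ~ Poisson(lam)."""
    cdf = sum(math.exp(-lam) * lam ** i / math.factorial(i) for i in range(k))
    return 1.0 - cdf

def min_depth(lam, alpha):
    """Smallest depth whose Poisson tail probability is at most alpha."""
    k = 1
    while poisson_sf(k, lam) > alpha:
        k += 1
    return k

def islands(cov, depth):
    """Maximal runs of positions with coverage >= depth, as half-open intervals."""
    out, start = [], None
    for i, c in enumerate(cov):
        if c >= depth and start is None:
            start = i
        elif c < depth and start is not None:
            out.append((start, i))
            start = None
    if start is not None:
        out.append((start, len(cov)))
    return out

# Hypothetical toy data: (read start, strand) on a 100 bp "chromosome".
reads = [(10, "+"), (12, "+"), (14, "+"), (25, "-"), (27, "-"), (70, "+")]
fragments = [extend_read(s, strand, read_len=5, frag_len=20) for s, strand in reads]
cov = coverage(fragments, chrom_len=100)
lam = sum(cov) / len(cov)               # crude background: mean coverage
depth = min_depth(lam, alpha=0.01)
peaks = islands(cov, depth)
print(depth, peaks)
```

Using the mean coverage as the Poisson rate is the simplest possible background; a control (Input) track or a local rate would normally be preferable, which is part of what dedicated peak callers add.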

    (2) GENOMIC CONTEXT

    biomaRt, GenomicRanges (Bioconductor)

    obtaining annotation (Ensembl)

    overlaps with annotation (e.g. promoters)

    enrichment of peaks in genomic areas (e.g. promoters) (not discussed here)

    functional term enrichment (e.g. GREAT, McLean et al., Nat Biotechnol 2010; not discussed here)

    average profile plots on genomic feature/peak summit
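As a rough illustration of the overlap and profile steps above, the sketch below defines strand-aware promoter windows around transcription start sites, reports which peaks overlap them, and averages coverage in windows centred on peak summits. The window sizes (1000 bp upstream, 200 bp downstream), gene names, and coordinates are assumptions chosen for the example; in the practical these operations would be done with GenomicRanges on Ensembl annotation.

```python
def promoter(tss, strand, upstream=1000, downstream=200):
    """Half-open promoter window around a TSS; upstream depends on strand."""
    if strand == "+":
        return max(0, tss - upstream), tss + downstream
    return max(0, tss - downstream), tss + upstream

def overlaps(a, b):
    """True if two half-open intervals share at least one base."""
    return a[0] < b[1] and b[0] < a[1]

def peaks_in_promoters(peaks, genes):
    """Map each peak to the names of genes whose promoter it overlaps."""
    result = {}
    for chrom, start, end in peaks:
        hits = [
            name
            for name, (g_chrom, tss, strand) in genes.items()
            if g_chrom == chrom and overlaps((start, end), promoter(tss, strand))
        ]
        result[(chrom, start, end)] = hits
    return result

def average_profile(cov, centers, flank):
    """Mean coverage at each offset in [-flank, +flank] around the given centers."""
    windows = [
        cov[c - flank : c + flank + 1]
        for c in centers
        if c - flank >= 0 and c + flank < len(cov)
    ]
    return [sum(col) / len(windows) for col in zip(*windows)]

# Hypothetical annotation and peaks.
genes = {"geneA": ("chr1", 5000, "+"), "geneB": ("chr1", 20000, "-")}
peaks = [("chr1", 4200, 4400), ("chr1", 9000, 9300), ("chr1", 20500, 20800)]
annotated = peaks_in_promoters(peaks, genes)
fraction = sum(1 for hits in annotated.values() if hits) / len(peaks)

# Hypothetical coverage vector with two summits at positions 3 and 10.
cov = [0, 1, 2, 3, 2, 1, 0, 0, 1, 2, 3, 2, 1, 0]
profile = average_profile(cov, centers=[3, 10], flank=2)
print(annotated, fraction, profile)
```

The all-against-all loop is fine for a toy example; for genome-scale peak sets an interval tree or sorted sweep would be the usual choice.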

    (3) MOTIF ANALYSIS

    BSgenome, seqLogo, GenomicRanges (Bioconductor)

    MEME (de novo motif discovery)

    obtaining the peak sequences

    de novo motif discovery

    motif scanning: how many motifs per peak?

    motif enrichment vs. background (not discussed here)

    refining the PWM for a given factor

    motif profile plot (distribution of motif around peak summit)
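The scanning and profile steps above can be illustrated with a small position weight matrix (PWM) scanner. This is a sketch under assumptions: the aligned sites are invented (they are not the CTCF matrix from the practical), the background is uniform, and the pseudocount and score threshold are arbitrary. MEME and the Bioconductor packages named on the slide are not used here.

```python
import math

BASES = "ACGT"
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def build_pwm(sites, pseudocount=1.0, background=0.25):
    """Log2-odds PWM from equal-length aligned sites, against a uniform background."""
    total = len(sites) + 4 * pseudocount
    pwm = []
    for i in range(len(sites[0])):
        column = [site[i] for site in sites]
        pwm.append({
            b: math.log2(((column.count(b) + pseudocount) / total) / background)
            for b in BASES
        })
    return pwm

def score(pwm, kmer):
    """Sum of per-position log-odds for a k-mer of the PWM's width."""
    return sum(col[base] for col, base in zip(pwm, kmer))

def revcomp(seq):
    return seq.translate(COMPLEMENT)[::-1]

def scan(pwm, seq, threshold):
    """Return (position, strand, score) for every window scoring >= threshold."""
    width = len(pwm)
    hits = []
    for i in range(len(seq) - width + 1):
        kmer = seq[i : i + width]
        for strand, candidate in (("+", kmer), ("-", revcomp(kmer))):
            s = score(pwm, candidate)
            if s >= threshold:
                hits.append((i, strand, s))
    return hits

# Hypothetical aligned binding sites and peak sequence.
sites = ["CCCTC", "CCCTC", "CCATC", "CCCTC"]
pwm = build_pwm(sites)
peak_seq = "TTTTCCCTCAAAAGAGGGTTT"
summit = 10                              # assumed summit position within peak_seq

hits = scan(pwm, peak_seq, threshold=5.0)
motifs_per_peak = len(hits)
# Offset of each motif centre from the summit: the raw data for a motif profile plot.
offsets = [pos + len(pwm) // 2 - summit for pos, _, _ in hits]
print(hits, motifs_per_peak, offsets)
```

Collecting the offsets over all peaks and plotting their histogram gives the distribution of the motif around the summit; a sharp mode near zero is what one would hope to see for a directly bound factor.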

    (4) DIFFERENTIAL ENRICHMENT

    DESeq, GenomicRanges (Bioconductor)

    defining regions of interest (ROI)

    obtaining counts per ROI (replicates & conditions)

    estimating library sizes

    estimating variation of counts per ROI

    calling differentially modified regions (negative binomial distribution)

    overview of significantly modified regions
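To make the library-size and variance steps above concrete, here is a minimal sketch of median-of-ratios size factors, which is, to the best of my understanding, the normalisation idea behind DESeq, together with the negative binomial mean-variance relation. It does not reproduce DESeq's dispersion fitting or its test; the count matrix and the dispersion value are invented for illustration.

```python
import math
from statistics import median

def size_factors(counts):
    """Median-of-ratios size factors.

    counts: list of regions, each a list of per-sample read counts.
    Regions with a zero in any sample are skipped, since their
    geometric mean is zero.
    """
    usable = [row for row in counts if all(c > 0 for c in row)]
    n_samples = len(counts[0])
    log_geo_means = [sum(math.log(c) for c in row) / n_samples for row in usable]
    return [
        math.exp(median(math.log(row[j]) - g for row, g in zip(usable, log_geo_means)))
        for j in range(n_samples)
    ]

def nb_variance(mu, dispersion):
    """Negative binomial variance: Poisson noise plus a quadratic overdispersion term."""
    return mu + dispersion * mu ** 2

# Hypothetical counts per ROI for two samples; sample 2 is sequenced twice as deeply.
counts = [[10, 20], [100, 200], [50, 100], [0, 4]]
factors = size_factors(counts)
normalized = [[c / f for c, f in zip(row, factors)] for row in counts]
print(factors, normalized[0], nb_variance(10, 0.1))
```

After dividing by the size factors, the first three regions have equal normalised counts in both samples, so a depth difference alone is not mistaken for differential enrichment. The median makes the estimate robust to a minority of regions that truly differ.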

    http://www.ebi.ac.uk/~schwalie/chipseqprac_0311/chipseq_practical.pdf

    (1) PEAK-CALLING

    PEAK ANALYSIS

    (3) MOTIF ANALYSIS

    Figure panels: motif discovery; motif profile; motifs/peaks. Legend: MACS, SWEMBL

    (4) DIFFERENTIAL HISTONE MODIFICATION

    CHIP-SEQ REVIEWS + BENCHMARKS

    ChIP-seq: advantages and challenges of a maturing technology (Park, Nat Rev Genet 2009)

    Computation for ChIP-seq and RNA-seq studies (Pepke et al., Nat Methods 2009)

    Design and analysis of ChIP-seq experiments for DNA-binding proteins (Kharchenko et al., Nat Biotechnol 2008)

    Q&A: ChIP-seq technologies and the study of gene regulation (Liu et al., BMC Biol 2010)

    Evaluation of algorithm performance in ChIP-seq peak detection (Wilbanks & Facciotti, PLoS ONE 2010)

    A practical comparison of methods for detecting transcription factor binding sites in ChIP-seq experiments (Laajala et al., BMC Bioinformatics 2009)