Analysis of ChIP-seq data in Galaxy
November, 2012
Joint project between BaRC and IT
Local copy: https://galaxy.wi.mit.edu/
Main site: http://main.g2.bx.psu.edu/
Hot Topics: Galaxy
Font Conventions
• Bold and blue refers to tools in the left-hand window
• Bold and green refers to tabs and menus at the top (Analyze data, Shared Data, etc.)
• Slides with red headers describe the hands-on exercises
• Red and italic refers to menus and history names used in the hands-on exercises
ChIP-seq
Park, P. J. (2009) ChIP-seq: advantages and challenges of a maturing technology. Nat. Rev. Genet. 10(10):669-80
General workflow for ChIP-seq analysis
Fastq files from the sequencing facility
Check the quality of your reads (NGS: QC and manipulation -> FastQC)
Step 1: Map the reads to the genome (BOWTIE)
Step 2: Identify peaks (MACS)
Step 3: Post processing: Annotate peaks, i.e. find genes overlapping or close to the peaks
Park, P. J., ChIP-seq: advantages and challenges of a maturing technology, Nat Rev Genet. Oct;10(10):669-80 (2009)
Hands-On Exercises
• Data upload (get files needed for analysis)
– Raw data: fastq files (ChIP and WCE)
– Intermediate files: output files of the first analysis steps
– Annotation files: genes and upstream regions; we will use them to get a set of genes that overlap or are close to the peaks
• ChIP-seq analysis
– Map with bowtie
– Identify bound regions (peaks) with MACS
– Find genes that overlap or are close to the peaks
The Galaxy Interface
A web-based platform for analysis of large genomic datasets
• No programming experience needed.
• Integrates many bioinformatics tools within one interface.
• Keeps track of all the steps performed in an analysis. Even if you delete the datasets, the history keeps the tools used.
LOCAL COPY
• Faster
• Customizable
• 250 GB of storage
• Data is private
• Jobs are sent to the cluster
Type “https://galaxy.wi.mit.edu/” in your browser's address bar. You will be prompted for your user name and password (the same ones that you use for your email).
Galaxy Interface: Analyze Data
Tools window
Data display and tool’s dialog window
History window: All analysis steps are saved. Data is not overwritten. You can create a workflow to repeat an analysis.
Data analysis
Processed data:
• Green: job is finished
• Yellow: job is running
• Gray: job is in queue
• Red: there is a problem
How to find your previous histories
History menu
Getting Data: Upload File
Upload File
Execute
Upload or paste file
File Format
Genome Assembly
Getting Data: Uploading Large Files
Step 1: Copy your file to /nfs/galaxy/uploads/[email protected] using an SFTP client (e.g. CyberDuck)
Path: /nfs/galaxy/uploads/[email protected]
Port: 22
Getting Data: Uploading Large Files
Step 2: Select and upload the file within Galaxy
Upload File
Genome Assembly
Execute
Hands-on: Data Upload
This is a schematic of the data we need to upload for each step.
Step 1. Map reads
History: mapWithBowtie
Input files: WCE.fastq, Nanog.fastq
Step 2. Call peaks
History: InputForMACS_mm9
Input files: Filter SAM on data 3_WCE, Filter SAM on data 4_Nanog
Step 3. Post processing
History: InputFor_annotatePeaks
Input files: Peaks, Refseq Genes, 3Kb Upstream of Refseq Genes
Hands-on: Data Upload
• Create a new history and name it “mapWithBowtie”
1) On the history MENU select Create New
2) On the history MENU select Saved Histories
3) Once you see your histories in the middle window, click on the “Unnamed history” drop-down menu and select Rename
Hands-on: Data Upload
Stay in the “mapWithBowtie” history
Upload the files that I have copied for you to your uploads directory
Hands-on: Data Upload
• Click on the Shared Data tab and select Published Histories
• Select InputForMACS_mm9
• Click on Import history
Hands-on: Data Upload
• Click on the Shared Data tab and select Published Histories
• Select InputFor_annotatePeaks
• Click on Import history
Getting Data from UCSC (local copy)
UCSC Main
Get Output
Getting Data from UCSC (local copy)
Send to Galaxy
Hands-on: Data Upload (Optional)
Stay in the “imported: InputFor_annotatePeaks” history
Use the link to the UCSC Main table browser
1. Get all mouse refseq genes (mm9, chr1)
2. Get all 3Kb upstream regions from mouse refseq genes (mm9, chr1)
Now you have all the data we need for the hands-on exercises
Important Icons
Display data
Edit attributes
Delete
Clicking on the name of the dataset displays it below
Display data in local UCSC browser
Download
View Details
Run this job again
View or Report this error (reporting an error will create a ticket)
History
Good Practices
• Make a new history for each analysis that you perform.
• Rename the outputs of your jobs.
• Permanently delete data that you don’t need (or you will reach your quota of 250 GB).
History is not removed when datasets are removed
Other useful commands on the History menu
• Transfer data between histories
• Share your history with other users
• Access histories that other users shared with you
General workflow for ChIP-seq analysis
Fastq files from the sequencing facility
Check the quality of your reads (NGS: QC and manipulation -> FastQC)
Step 1: Map the reads to the genome (BOWTIE)
Step 2: Identify peaks (MACS)
Step 3: Post processing: Annotate peaks, i.e. find genes overlapping or close to the peaks
Park, P. J., ChIP-seq: advantages and challenges of a maturing technology, Nat Rev Genet. Oct;10(10):669-80 (2009)
Analysis of ChIP-seq data using Galaxy
1. History: mapWithBowtie
   1. Run FASTQ Groomer to convert the fastq file to fastq Sanger format
   2. Map with bowtie
   3. Filter out unmapped reads
2. History: imported: InputForMACS_mm9
   1. Call bound peaks using MACS
   2. Select the peaks that are on chr1
3. History: imported: InputFor_annotatePeaks. Annotate peaks
   1. Annotate peaks using the Operate on Genomic Intervals tools
   2. Annotate peaks using the Integrative Analysis -> peak2gene tool
Illumina data format
• Fastq format:
@ILLUMINA-F6C19_0048_FC:5:1:12440:1460#0/1
GTAGAACTGGTACGGACAAGGGGAATCTGACTGTAG
+ILLUMINA-F6C19_0048_FC:5:1:12440:1460#0/1
hhhhhhhhhhhghhhhhhhehhhedhhhhfhhhhhh
• Line 1: @ followed by the sequence identifier (/1 or /2 marks the reads of a paired-end run)
• Line 2: the sequence
• Line 3: + followed by any description
• Line 4: the sequence quality values
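The four-line layout described above can be read with a few lines of Python. This is a minimal sketch, not part of the Galaxy tools: the names `FastqRecord` and `read_fastq` are made up for illustration, and it assumes every read occupies exactly four lines (no wrapped sequences).

```python
import io
from typing import Iterator, NamedTuple, TextIO


class FastqRecord(NamedTuple):
    identifier: str   # text after the leading '@'
    sequence: str
    description: str  # text after the leading '+', may be empty
    quality: str      # one character per base


def read_fastq(handle: TextIO) -> Iterator[FastqRecord]:
    """Yield records from a FASTQ stream that uses exactly four lines per read."""
    while True:
        header = handle.readline()
        if not header:
            return  # end of file
        header = header.rstrip("\n")
        sequence = handle.readline().rstrip("\n")
        plus = handle.readline().rstrip("\n")
        quality = handle.readline().rstrip("\n")
        if not header.startswith("@") or not plus.startswith("+"):
            raise ValueError(f"malformed FASTQ record near {header!r}")
        yield FastqRecord(header[1:], sequence, plus[1:], quality)


# The record shown on the slide.
example = (
    "@ILLUMINA-F6C19_0048_FC:5:1:12440:1460#0/1\n"
    "GTAGAACTGGTACGGACAAGGGGAATCTGACTGTAG\n"
    "+ILLUMINA-F6C19_0048_FC:5:1:12440:1460#0/1\n"
    "hhhhhhhhhhhghhhhhhhehhhedhhhhfhhhhhh\n"
)
records = list(read_fastq(io.StringIO(example)))
print(records[0].identifier, len(records[0].sequence))
```

The `/1` at the end of the identifier is what marks this read as the first of a pair.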
Sequence quality values on different FASTQ formats
http://en.wikipedia.org/wiki/FASTQ_format
SSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSS...............................
..........................XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
...............................IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII
.................................JJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJ
!"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefgh
|                         |    |        |                              |
33                        59   64       73                            104
S - Sanger: Phred+33, raw reads typically (0, 40)
X - Solexa: Solexa+64, raw reads typically (-5, 40)
I - Illumina 1.3+: Phred+64, raw reads typically (0, 40)
J - Illumina 1.5+: Phred+64, raw reads typically (3, 40)
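To discriminate between Solexa and Illumina 1.3+, check whether your sequences' quality scores contain any of the characters ;<=>? (ASCII 59-63). These fall below the Illumina 1.3+ range, so a file that contains them cannot be Illumina 1.3+.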
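The same check can be automated by looking at the lowest character code in the quality strings. The sketch below is only a heuristic built from the ranges in the chart above; the function name `guess_quality_encoding` is made up for illustration. A Sanger file whose reads are all high quality never uses characters below 59, so it can be mislabeled; inspect many reads before trusting the answer.

```python
def guess_quality_encoding(quality_strings):
    """Guess the FASTQ quality encoding from the range of observed characters."""
    codes = [ord(ch) for quality in quality_strings for ch in quality]
    if not codes:
        raise ValueError("no quality characters supplied")
    lowest, highest = min(codes), max(codes)
    if lowest < 33 or highest > 104:
        raise ValueError("characters outside the ranges shown in the chart")
    if lowest < 59:
        # Only the Sanger range reaches below ';' (ASCII 59).
        return "Sanger (Phred+33)"
    if lowest < 64:
        # ';<=>?' (ASCII 59-63) are below the Illumina 1.3+ range.
        return "Solexa (Solexa+64)"
    if lowest < 66:
        return "Illumina 1.3+ (Phred+64)"
    return "Illumina 1.5+ (Phred+64), or too few low-quality bases to tell"


# The quality line from the example record uses only d-h (ASCII 100-104).
print(guess_quality_encoding(["hhhhhhhhhhhghhhhhhhehhhedhhhhfhhhhhh"]))
```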
FASTQ formats and FASTQ Groomer
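To give a feel for what converting to Sanger format involves, the sketch below re-encodes a Phred+64 quality string as Phred+33. It is an illustration of the offset change only, not the FASTQ Groomer implementation; the function name `illumina64_to_sanger` is made up, and Solexa-scaled scores are deliberately not handled because they need a non-linear conversion.

```python
def illumina64_to_sanger(quality: str) -> str:
    """Re-encode a Phred+64 quality string as Phred+33 (Sanger)."""
    converted = []
    for ch in quality:
        phred = ord(ch) - 64  # recover the Phred score
        if phred < 0:
            raise ValueError(f"{ch!r} is below the Phred+64 range")
        converted.append(chr(phred + 33))  # shift to the Sanger offset
    return "".join(converted)


# 'h' (ASCII 104) is Phred 40, which Sanger writes as 'I' (ASCII 73).
print(illumina64_to_sanger("hhhg"))
```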
Mapping Reads with Bowtie
Bowtie
Set it to the read length used in your experiment; for today's session leave it as the default “28”
NGS: SAM Tools -> Filter SAM
Hands-on: Analysis of ChIP-seq data 1
History: mapWithBowtie
• Run NGS: QC and manipulation -> FASTQ Groomer on the 2 fastq files. The input files are already in Sanger format, but you still have to run FASTQ Groomer
• Map each of the fastq files with bowtie: NGS: Mapping -> Map with Bowtie for Illumina
– Genome to map to: mm9 canonical
– Other parameters: use best
• Take the output from bowtie and filter out the unmapped reads using NGS: SAM Tools -> Filter SAM
Tip: You don’t have to wait for FASTQ Groomer or bowtie to finish before you send the next job
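Filtering out unmapped reads comes down to testing one bit of the SAM FLAG field: column 2 of each alignment line, where bit 0x4 means "read unmapped". The following sketch shows the idea outside Galaxy; the function name `keep_mapped` and the two alignment lines are made up for illustration.

```python
UNMAPPED_FLAG = 0x4  # SAM FLAG bit set when the read did not map


def keep_mapped(sam_lines):
    """Yield header lines and the alignment lines of mapped reads only."""
    for line in sam_lines:
        if line.startswith("@"):
            yield line  # header lines are passed through unchanged
            continue
        fields = line.rstrip("\n").split("\t")
        flag = int(fields[1])
        if not flag & UNMAPPED_FLAG:
            yield line


# Made-up records: read_a is mapped (flag 0), read_b is unmapped (flag 4).
sam = [
    "@SQ\tSN:chr1\tLN:197195432\n",
    "read_a\t0\tchr1\t3000100\t255\t36M\t*\t0\t0\tGTAG\thhhh\n",
    "read_b\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\thhhh\n",
]
mapped = list(keep_mapped(sam))
print(len(mapped))
```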
Analysis of ChIP-seq data: MACS
MACS
Make this your read length (leave it as is for the hands-on)
mm9
Filter and Sort: Filter data on any column
Hands-on: Analysis of ChIP-seq data 2
History: InputForMACS_mm9
• Take the filtered mapped reads (uploaded by me in this history) and run MACS: NGS: Peak Calling -> MACS Model-based Analysis for ChIP-Seq
• Using the file that MACS generates, “MACS peaks on Filter SAM on data 4”, select only the peaks on chr1: Filter and Sort -> Filter data on any column using simple expressions
• Other filters you may want to use when you are running your analysis:
– Get the top 2000 peaks
– Get peaks with FC > cut-off value
– Get peaks with -log P > cut-off value
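In the Galaxy filter tool, columns are referred to as c1, c2, and so on, so selecting chr1 is an expression along the lines of c1=='chr1'. The sketch below reproduces the three kinds of filter in plain Python. The column positions (`SCORE_COL`, `FOLD_CHANGE_COL`), the helper names, and the sample rows are assumptions made for illustration; check the columns of your own MACS output before reusing them.

```python
# Assumed 0-based column layout of a tab-separated peaks file (hypothetical).
CHROM_COL = 0
SCORE_COL = 4        # e.g. a -log P based score
FOLD_CHANGE_COL = 5  # fold change (FC) over the control


def parse_rows(text):
    """Split tab-separated text into rows, skipping blank and comment lines."""
    return [line.split("\t") for line in text.splitlines()
            if line and not line.startswith("#")]


def on_chromosome(rows, chrom):
    return [row for row in rows if row[CHROM_COL] == chrom]


def above_cutoff(rows, column, cutoff):
    return [row for row in rows if float(row[column]) > cutoff]


def top_n(rows, column, n):
    return sorted(rows, key=lambda row: float(row[column]), reverse=True)[:n]


# Made-up peaks, for illustration only.
SAMPLE_PEAKS = (
    "chr1\t1000\t1400\tpeak_1\t120.5\t8.2\n"
    "chr2\t5000\t5300\tpeak_2\t300.1\t15.0\n"
    "chr1\t9000\t9600\tpeak_3\t75.0\t4.1\n"
)
rows = parse_rows(SAMPLE_PEAKS)
chr1_peaks = on_chromosome(rows, "chr1")
strong_peaks = above_cutoff(rows, FOLD_CHANGE_COL, 5.0)
best_two = top_n(rows, SCORE_COL, 2)
print(len(chr1_peaks), len(strong_peaks), len(best_two))
```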
Post processing: Text Manipulation
Operate on Genomic Intervals: Intersect the intervals of two datasets
Hands-on: Analysis of ChIP-seq data 3
History: InputFor_annotatePeaks
Annotate peaks.
1. Combine the mm9 refseq genes file and the 3Kb upstream of refseq genes file: Text Manipulation -> Concatenate datasets tail-to-head. Then find the genes or upstream regions that overlap with peaks: Operate on Genomic Intervals -> Intersect the intervals of two datasets
2. Find genes located at 3 Kb or less from the peak center using Integrative Analysis -> peak2gene
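Both annotation strategies can be sketched in a few lines of Python. This is an illustration of the logic, not the code behind the Galaxy tools: the `Interval` type, the function names, and the coordinates are made up, intervals are treated as half-open, and the distance is measured from the peak center to the nearest edge of the feature, which may differ from how peak2gene measures it.

```python
from typing import List, NamedTuple


class Interval(NamedTuple):
    chrom: str
    start: int  # 0-based, inclusive
    end: int    # exclusive
    name: str


def overlaps(a: Interval, b: Interval) -> bool:
    """True when two intervals on the same chromosome share at least one base."""
    return a.chrom == b.chrom and a.start < b.end and b.start < a.end


def features_overlapping(peaks: List[Interval], features: List[Interval]):
    return [f for f in features if any(overlaps(f, p) for p in peaks)]


def distance_to_center(peak: Interval, feature: Interval) -> int:
    """Distance from the peak center to the feature (0 if the center is inside)."""
    center = (peak.start + peak.end) // 2
    return max(feature.start - center, center - feature.end, 0)


def features_near_center(peaks, features, max_distance=3000):
    return [f for f in features
            if any(f.chrom == p.chrom and distance_to_center(p, f) <= max_distance
                   for p in peaks)]


# Made-up coordinates, for illustration only.
genes = [Interval("chr1", 1100, 1500, "gene_a"),
         Interval("chr1", 10000, 12000, "gene_c")]
upstream = [Interval("chr1", 3000, 5000, "gene_b_upstream"),
            Interval("chr2", 1000, 1200, "gene_d_upstream")]
features = genes + upstream  # the "Concatenate datasets" step
peaks = [Interval("chr1", 1000, 1200, "peak_1")]

overlapping = features_overlapping(peaks, features)
nearby = features_near_center(peaks, features)
print([f.name for f in overlapping], [f.name for f in nearby])
```

Note that the second approach also reports features that do not overlap a peak but lie within the distance limit, which is why the two result lists can differ.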
Tutorials and References
• Galaxy tutorials: http://galaxy.psu.edu/screencasts.html
• Previous Hot Topics: http://jura.wi.mit.edu/bio/education/hot_topics
• References
Giardine et al. (2005) Galaxy: a platform for interactive large-scale genome analysis. Genome Research 15:1451-5
Taylor et al. (2007) Using Galaxy to perform large-scale interactive data analyses. Current Protocols in Bioinformatics Chapter 10, unit 10.
Blankenberg et al. (2010) Manipulation of FASTQ data with Galaxy. Bioinformatics 26(14):1783-5
Park, P. J. (2009) ChIP-seq: advantages and challenges of a maturing technology. Nat. Rev. Genet. 10(10):669-80
Pepke et al. (2009) Computation for ChIP-seq and RNA-seq studies. Nature Methods 6:S22-S32
Wilbanks et al. (2010) Evaluation of Algorithm Performance in ChIP-Seq Peak Detection. PLoS ONE 5(7)
Szalkowski et al. (2010) Rapid innovation in ChIP-seq peak-calling algorithms is outdistancing benchmarking efforts. Brief Bioinform doi: 10.1093/bib/bbq068
Rye et al. (2011) A manually curated ChIP-seq benchmark demonstrates room for improvement in current peak-finder programs. Nucleic Acids Res. 39(4):e25
Zhang et al. (2008) Model-based Analysis of ChIP-Seq (MACS). Genome Biol. 9(9):R137