Exploring genome features with the UCSC Genome Browser
Bingbing Yuan
BaRC Hot Topics – April 14, 2015
Bioinformatics and Research Computing
Whitehead Institute
http://barc.wi.mit.edu/hot_topics/
Why the UCSC Genome Browser?
• Visualize many datasets hosted at UCSC
• Visualize your genome-mapped datasets
• Download genome features with the Table Browser
Today’s outline
• Introduction to the genome view and track display
• Search and display ENCODE data
• View your list of regions in the browser
• Use the Table Browser to download and annotate genome features
• Convert coordinates/features between genomes
• Use Public Hubs to display tracks hosted on non-UCSC servers
• Save your session
Throughout the talk: mining the genome content
UCSC Genome Browser http://genome.ucsc.edu
Local: http://membrane.wi.mit.edu/
Global View
Genome Viewer
Tracks (group of data)
Navigation
Click: item description
Click: track description
Right click: track configuration
Drag track to new position
Zoom in with Drag-and-select
Track description
Item description
configure
Display mode of an individual annotation track:
• Hide: the track is not displayed at all.
• Dense: the track is displayed with all features collapsed into a single line.
• Squish: the track is displayed with each annotation feature shown separately, but at 50% of the height of full mode. Features are unlabeled.
• Pack: the track is displayed with each annotation feature shown separately and labeled, but not necessarily on a separate line.
• Full: the track is displayed with each annotation feature on a separate line.
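The display modes listed above can also be set from a browser link, because the hgTracks URL accepts a `trackName=mode` parameter for each track. The following is a minimal sketch of building such a link. The position and the choice of tracks (`refGene`, `knownGene`) are only illustrative assumptions, and the helper name `browser_url` is hypothetical.

```python
from urllib.parse import urlencode

# Public UCSC Genome Browser entry point.
HGTRACKS = "https://genome.ucsc.edu/cgi-bin/hgTracks"

# The five display modes described on the slide.
VALID_MODES = {"hide", "dense", "squish", "pack", "full"}


def browser_url(db, position, track_modes):
    """Build a browser link that sets the display mode of selected tracks.

    track_modes maps a track name to one of VALID_MODES.
    """
    for track, mode in track_modes.items():
        if mode not in VALID_MODES:
            raise ValueError(f"unknown display mode for {track}: {mode}")
    params = {"db": db, "position": position}
    params.update(track_modes)
    return HGTRACKS + "?" + urlencode(params)


# Illustrative region and tracks (assumptions, not taken from the slides).
url = browser_url(
    "hg19",
    "chr22:20000000-20100000",
    {"refGene": "pack", "knownGene": "dense"},
)
print(url)
```

Opening the printed link should show the chosen region with the RefSeq track in pack mode and the UCSC Genes track in dense mode, provided those track names exist in the selected assembly.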
Gene structure
Figure labels: 5’ UTR, coding exon, intron, 3’ UTR
Commonly used gene models: GENCODE, Ensembl, UCSC, RefSeq
View genes/exons only
multi-region
Browser Tracks
Click links for detailed information
The current genome build may have fewer tracks than the previous one. For example, hg38 has considerably fewer tracks than hg19.
Demo and Exercise 1
• Search for your favorite gene in the genome browser.
• How many exons does your gene have? What’s the strand orientation relative to the genome?
• How many isoforms does this gene have?
• Does the RefSeq gene catalog contain the correct number of isoforms of your favorite human gene?
• Zoom in to look for the start codon. You may need to change the track setting.
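The exercise questions above (exon count, strand, number of isoforms) can also be answered from a downloaded gene table rather than by eye. The sketch below assumes tab-separated rows in the layout of the UCSC `refGene` table (bin, name, chrom, strand, txStart, txEnd, cdsStart, cdsEnd, exonCount, exonStarts, exonEnds, score, name2, ...). The two rows and the names `NM_HYPO1`, `NM_HYPO2`, and `GENE1` are invented for illustration.

```python
from collections import Counter


def parse_refgene_row(line):
    """Parse one tab-separated row laid out like the UCSC refGene table."""
    f = line.rstrip("\n").split("\t")
    # exonStarts / exonEnds are comma-separated lists with a trailing comma.
    starts = [int(x) for x in f[9].rstrip(",").split(",")]
    ends = [int(x) for x in f[10].rstrip(",").split(",")]
    return {
        "name": f[1],          # transcript accession
        "chrom": f[2],
        "strand": f[3],        # "+" or "-" relative to the genome
        "txStart": int(f[4]),
        "txEnd": int(f[5]),
        "cdsStart": int(f[6]),
        "cdsEnd": int(f[7]),
        "exonCount": int(f[8]),
        "exons": list(zip(starts, ends)),
        "gene": f[12],         # gene symbol (name2 column)
    }


def introns(exons):
    """Return the gaps between consecutive exons as (start, end) pairs."""
    return [(exons[i][1], exons[i + 1][0]) for i in range(len(exons) - 1)]


# Two hypothetical transcripts of one hypothetical gene.
rows = [
    "0\tNM_HYPO1\tchr22\t-\t1000\t5000\t1200\t4800\t3\t1000,2000,4500,\t1500,2500,5000,\t0\tGENE1\tcmpl\tcmpl\t0,0,0,",
    "0\tNM_HYPO2\tchr22\t-\t1000\t5000\t1200\t4800\t2\t1000,4500,\t1500,5000,\t0\tGENE1\tcmpl\tcmpl\t0,0,",
]
records = [parse_refgene_row(r) for r in rows]
isoforms_per_gene = Counter(rec["gene"] for rec in records)

for rec in records:
    print(rec["name"], rec["strand"], rec["exonCount"], introns(rec["exons"]))
print(dict(isoforms_per_gene))
```

Counting rows per gene symbol gives the number of isoforms in that catalog only. Other gene models (GENCODE, Ensembl, UCSC) may list a different number.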
ENCODE super track settings
ENCODE track setting
ENCODE: genome.ucsc.edu/ENCODE/
ENCODE: search tracks
Demo and Exercise 2
• Identify the transcription factor binding sites in your gene’s promoter region.
• Do any RNA-seq data show that your gene is expressed in your favorite cell type/tissue? If so, which isoform is most likely to be expressed?
Add custom track
OR
track name='sample peaks' description='sample peaks'
track name='User Track' description='User Supplied Track'
File format: see the links above, or view http://genome.ucsc.edu/FAQ/FAQformat.html
WI UCSC Browser: membrane.wi.mit.edu
URL for viewing files: http://tak.wi.mit.edu/solexa_ucsc/
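A custom track is plain text: a `track` line like the ones shown on this slide, followed by the data lines. The sketch below assembles such text for a small BED track. The helper name `custom_track_text` and the two regions are hypothetical. The result can be pasted into the custom track form or saved to a file and uploaded.

```python
def custom_track_text(name, description, regions):
    """Return custom-track text: a track line followed by BED lines.

    regions is a list of (chrom, start, end) tuples in BED coordinates
    (0-based start, end not included).
    """
    lines = [f"track name='{name}' description='{description}'"]
    for chrom, start, end in regions:
        if not 0 <= start < end:
            raise ValueError(f"bad interval: {chrom} {start} {end}")
        lines.append(f"{chrom}\t{start}\t{end}")
    return "\n".join(lines) + "\n"


# Hypothetical peak regions, for illustration only.
text = custom_track_text(
    "sample peaks",
    "sample peaks",
    [("chr22", 20000000, 20000500), ("chr22", 20010000, 20010800)],
)
print(text, end="")
```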
File Formats
BED: regions (example: enriched ChIP-seq signal for TF binding)
Wig(gle): continuous signal (example: ChIP-seq signal)
BAM: alignment of reads (example: RNA-seq alignment)
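One detail worth keeping in mind with BED files: BED starts are 0-based and the end is not included, while the position box in the browser uses 1-based, inclusive coordinates. A minimal sketch of converting between the two follows. The function names are hypothetical.

```python
def bed_to_position(chrom, start, end):
    """Convert a BED interval (0-based start, exclusive end) to a
    browser position string (1-based, inclusive)."""
    return f"{chrom}:{start + 1}-{end}"


def position_to_bed(position):
    """Convert a browser position string such as 'chr22:1,001-2,000'
    to a BED-style (chrom, start, end) tuple."""
    chrom, span = position.split(":")
    start, end = span.replace(",", "").split("-")
    return chrom, int(start) - 1, int(end)


print(bed_to_position("chr22", 0, 100))
print(position_to_bed("chr22:1,001-2,000"))
```

The length of a BED interval is simply `end - start`, which is one reason the format uses this convention.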
UCSC Table Browser
Demo and Exercise 3
• Load peaks (BED format) derived from ChIP-seq:
– GM12878 H3K36me3 Histone Mods by ChIP-seq Peaks from ENCODE/Broad
– To save time, only peaks in chr22
• Identify the RefSeq genes that could be regulated by H3K36me3:
– Go to the Table Browser
– Choose the refGene table
– Intersect with the uploaded track above
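The intersection step in this exercise amounts to keeping every gene whose interval overlaps at least one peak on the same chromosome. The sketch below shows that logic on toy data, assuming an "any overlap" rule and BED-style half-open intervals. The gene names and coordinates are invented, and a real analysis would use the Table Browser output instead.

```python
def overlaps(a_start, a_end, b_start, b_end):
    """True if two half-open intervals share at least one base."""
    return a_start < b_end and b_start < a_end


def genes_overlapping_peaks(genes, peaks):
    """Return names of genes that overlap any peak on the same chromosome.

    genes: list of (name, chrom, start, end)
    peaks: list of (chrom, start, end)
    """
    hits = []
    for name, chrom, start, end in genes:
        if any(
            chrom == p_chrom and overlaps(start, end, p_start, p_end)
            for p_chrom, p_start, p_end in peaks
        ):
            hits.append(name)
    return hits


# Toy data (hypothetical names and coordinates).
genes = [
    ("GENE_A", "chr22", 1000, 5000),
    ("GENE_B", "chr22", 8000, 9000),
    ("GENE_C", "chr21", 1000, 5000),
]
peaks = [("chr22", 4500, 6000), ("chr22", 9000, 9500)]

print(genes_overlapping_peaks(genes, peaks))
```

This scan compares every gene with every peak, which is fine for one chromosome. For whole-genome data, sorted intervals or a dedicated interval tool would scale better.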
Convert to other genome/build
Save to file
Convert to other genome/build
LiftOver: multiple regions
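For many regions, the same conversion can be run locally with UCSC's command-line `liftOver` tool, which takes an input file, a chain file, an output file, and a file for regions that could not be mapped. The sketch below only assembles and optionally runs that command. It assumes `liftOver` is installed and a chain file has been downloaded, and all file names shown are hypothetical placeholders.

```python
import shutil
import subprocess


def liftover_command(bed_in, chain, bed_out, unmapped):
    """Build the argument list: liftOver oldFile map.chain newFile unMapped."""
    return ["liftOver", bed_in, chain, bed_out, unmapped]


def run_liftover(bed_in, chain, bed_out, unmapped):
    """Run liftOver, failing early if the executable is not on PATH."""
    cmd = liftover_command(bed_in, chain, bed_out, unmapped)
    if shutil.which(cmd[0]) is None:
        raise FileNotFoundError("liftOver executable not found on PATH")
    subprocess.run(cmd, check=True)


# Hypothetical file names; the chain file must match the two builds.
cmd = liftover_command(
    "peaks.hg19.bed",
    "hg19ToHg38.over.chain.gz",
    "peaks.hg38.bed",
    "peaks.unmapped.bed",
)
print(" ".join(cmd))
```

Regions written to the unmapped file are worth checking, since they were not carried over to the target build.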
BLAT: align sequences
view alignment in browser
Track Data Hubs
Share/save session
More Information
• UCSC Browser Tutorials: https://genome.ucsc.edu/training/index.html
– Free videos: https://genome.ucsc.edu/training/vids/index.html
– OpenHelix: http://www.openhelix.com/ucsc
• FAQ: https://genome.ucsc.edu/FAQ/
• Genomewiki: http://genomewiki.ucsc.edu/index.php/Main_Page