The UCSC Genome Browser

Post on 31-Dec-2015



The UCSC Genome Browser: From Men to Mice

WJ Kent, C Sugnet, T Furey, T Pringle, M Schwartz, R Baertsch, R Weber, K Roskin, D Thomas, S Rogic, M Diekhans, F Hsu, D Karolchik, D Haussler

Cardiac Troponin T2

Comparative Genomics at BMP10

Normalized eScores

Mouse/Human Synteny

Track Options & Filters

Mini-buttons bring up track options, such as those for the spliced EST track below.

Which EST to Sequence?

MGC ESTs Drawn in Red

DNA Coloring

gctcgttcaggggtaaaggtgtattctagatCCACAACAAGCCCCGTGGTCTAGCACAGC AAAGAGAAAAAAAGAGAACACGAAAATGCCCTTGCTCCCCTCCGGGGGCCCCTTTTGTGC GGTTCTTGCCAACGCAGCAGCCCTCCTGCTATATAGCCCGCCGCGCCgCAGCCCCACCCG CTCAGCGCCGCCGCCCCACCAGCTCAGCACCGCCGTGCGCCCAGCCAGCCATGGGGAAGG TGAGCCCAGCCTGCGCCCCGGGACCCCGGAGCTTCCTCCATCGCGGGGGCCAGAGACTGG GGCAGGAGCAGGCCTGTGAGACCTCGCCTTGTCCCGCCTTGCCTTGCAGATCACCCTCTA CGAGGACCGGGGCTTCCAGGGCCGCCACTATGAATGCAGCAGCGACCACCCCAACCTGCA GCCCTACTTGAGCCGCTGCAACTCGGCGCGCGTGGACAGCGGCTGCTGGATGCTCTATGA GCAGCCCAACTACTCGGGCCTCCAGTACTTCCTGCGCCGCGGCGACTATGCCGACCACCA GCAGTGGATGGGCCTCAGCGACTCGGTCCGCTCCTGCCGCCTCATCCCCCACGTGAGTAC ATCCTCAAGTCAGGACCCAGGCCCTCAGGACACTCACTGGAtgGTTTCAAGCAAAAGTTA AACATTAGAAGTAGTGATCAGTcacaataaCTGAGAGTGGACAAAAGATGAACTATAGTG GATTAAGTCAATAGagttTGCTCCCCACATAAGCAAAGTATTACCCAGACAcCAGTTAAT caCAATTAATCCACAAATATGTATTGAGTAGGAATGTGTCTCCTGCCctAGGGGTTGTAT

Coloring CRYGD Start

Gene Expression Tracks

Alt Splicing Tracks

Complex Transcription

Add Your Own Tracks

• Users can extend the browser with their own tracks.

• User tracks can be private or public.

• No programming required.

• GFF, GTF, PSL or BED formats supported

#chrom start end [name score strand]

chr1 1302347 1302357 SP1 800 +

chr1 1504778 1504787 SP2 980 -
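As a rough illustration of how little is needed to read a BED custom track, the Python sketch below parses one BED line at a time. It assumes the standard BED column order (chrom, start, end, then optional name, score, strand) with 0-based, half-open coordinates. `BedRecord` and `parse_bed_line` are hypothetical names made up for this example; they are not part of the browser.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class BedRecord:
    chrom: str
    start: int                     # 0-based, inclusive
    end: int                       # exclusive
    name: Optional[str] = None
    score: Optional[int] = None
    strand: Optional[str] = None   # "+", "-" or "."


def parse_bed_line(line: str) -> Optional[BedRecord]:
    """Parse one BED line; return None for blank, comment, and header lines."""
    line = line.strip()
    # Skip comments and the "track"/"browser" lines used in custom track files.
    if not line or line.startswith(("#", "track", "browser")):
        return None
    fields = line.split()
    if len(fields) < 3:
        raise ValueError(f"BED needs at least chrom, start, end: {line!r}")
    record = BedRecord(fields[0], int(fields[1]), int(fields[2]))
    if record.start > record.end:
        raise ValueError(f"start is greater than end: {line!r}")
    if len(fields) > 3:
        record.name = fields[3]
    if len(fields) > 4:
        record.score = int(fields[4])
    if len(fields) > 5:
        if fields[5] not in ("+", "-", "."):
            raise ValueError(f"unexpected strand {fields[5]!r}: {line!r}")
        record.strand = fields[5]
    return record


# Usage: the two SP1/SP2 features, written in standard BED column order.
sp1 = parse_bed_line("chr1 1302347 1302357 SP1 800 +")
sp2 = parse_bed_line("chr1 1504778 1504787 SP2 980 -")
header = parse_bed_line("#chrom start end [name score strand]")
```

Only the first three columns are required, so the same function accepts minimal three-column lines as well as the six-column form.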

The Underlying Database

• Power users and bioinformaticians sometimes want the underlying database.

• There is a table for each track.

• Larger tracks have a table for each chromosome.

• The format of a track table is generally similar to the add-your-own track formats.

• Pieces of the database are available from the ‘tables’ browser.

• The whole database is available as tab-separated files.
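Because the tables are distributed as tab-separated text, they can be read with very little code. The sketch below is only an illustration: the column layout in `COLUMNS` is a hypothetical one chosen to resemble the BED example, and real track tables differ from track to track, so the column list would need to be adjusted for each table. The function names are likewise invented for this example.

```python
import csv
import io
from typing import Dict, Iterable, Iterator, List, TextIO

# Hypothetical column layout, for illustration only. Real tables vary.
COLUMNS: List[str] = ["chrom", "chromStart", "chromEnd", "name"]


def read_track_table(handle: TextIO,
                     columns: List[str] = COLUMNS) -> Iterator[Dict[str, str]]:
    """Yield each row of a tab-separated table dump as a dict keyed by column."""
    reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        if row:  # skip empty lines
            yield dict(zip(columns, row))


def rows_overlapping(rows: Iterable[Dict[str, str]],
                     chrom: str, start: int, end: int) -> Iterator[Dict[str, str]]:
    """Keep rows whose half-open interval overlaps [start, end) on chrom."""
    for row in rows:
        if (row["chrom"] == chrom
                and int(row["chromStart"]) < end
                and int(row["chromEnd"]) > start):
            yield row


# Usage with an in-memory stand-in for a downloaded file.
dump = "chr1\t1302347\t1302357\tSP1\nchr1\t1504778\t1504787\tSP2\n"
hits = list(rows_overlapping(read_track_table(io.StringIO(dump)),
                             "chr1", 1300000, 1400000))
names = [row["name"] for row in hits]
```

For a table that is split by chromosome, the same functions could be applied to each per-chromosome file in turn.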

Parasol and Kilo Cluster

• UCSC cluster has 1000 CPUs running Linux

• 1,000,000 BLASTZ jobs in 25 hours for mouse/human alignment

• We wrote the Parasol job scheduler to keep up.

– Very fast and free.

– Jobs are organized into batches.

– Error checking at job and at batch level.
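The idea of grouping jobs into batches and checking for errors at both levels can be sketched in a few lines of Python. This is a toy, single-process illustration and not Parasol's actual interface: the `Job` and `Batch` classes, their fields, and the job names are all invented for this example, and a real scheduler would distribute the jobs across the cluster's CPUs instead of running them in a loop.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Job:
    name: str
    run: Callable[[], object]          # the work itself
    check: Callable[[object], bool]    # job-level check on the result
    ok: bool = False
    error: str = ""


@dataclass
class Batch:
    jobs: List[Job] = field(default_factory=list)

    def run_all(self) -> None:
        """Run every job, recording crashes and failed output checks."""
        for job in self.jobs:
            try:
                result = job.run()
                job.ok = job.check(result)
                job.error = "" if job.ok else "output failed job-level check"
            except Exception as exc:
                job.ok = False
                job.error = f"{type(exc).__name__}: {exc}"

    def failed(self) -> List[Job]:
        return [job for job in self.jobs if not job.ok]

    def is_complete(self) -> bool:
        """Batch-level check: the batch has jobs and none of them failed."""
        return bool(self.jobs) and not self.failed()


# Usage: one job that succeeds and one that crashes.
batch = Batch([
    Job("align-1", lambda: 42, lambda result: result == 42),
    Job("align-2", lambda: 1 / 0, lambda result: True),
])
batch.run_all()
failed_names = [job.name for job in batch.failed()]
```

Keeping a per-job error record means that only the failed jobs need to be rerun before the batch as a whole passes its check. For scale, 1,000,000 jobs in 25 hours works out to roughly 11 jobs finishing per second.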

Acknowledgements

NHGRI, The Wellcome Trust, HHMI, NCI, and Taxpayers in the US and worldwide.

Whitehead, Sanger, Wash U, Baylor, Stanford, DOE, and the international sequencing centers.

NCBI, Penn State, Ensembl, Genoscope, The SNP Consortium, UC Berkeley, LBL, LLL, Riken, The Mammalian Gene Collection, Softberry, IMIM, Affymetrix, Perlegen, Rosetta, the Mouse Homology Group

The thousands of people who worked on the sequence and annotations