Professional Development Course 1 – Molecular Medicine
Genome Biology, June 12, 2012
Ansuman Chattopadhyay, PhD
Head, Molecular Biology Information Services
Health Sciences Library System
University of Pittsburgh
[email protected]
http://www.hsls.pitt.edu/guides/genetics
Genomic achievements since the Human Genome Project
http://www.hsls.pitt.edu/molbio
Objective
Organism Whole Genome Sequence Databases
Genome Browsers
Topics
Genome Sequencing Projects
• NCBI Genome resources
• Integrated Microbial Genome
• UCSC Genome Bioinformatics
Genome Browsers
• UCSC Genome Browser
• UCSC Table Browser
• NCBI Map Viewer
• Generic Genome Browser (GBrowse)
Genome Biology
Human Genome Project Video
Chromosome Structure
Genome Biology: Karyotype
Adapted from NHGRI
Trisomy 21
Monosomy X
Genome Biology: Karyotype
NHGRI
Genome Biology: Molecular Cloning
p53
CFTR
NFkB
8 September 1989
Genome Biology: Timeline
1976: RNA bacteriophage MS2
1995: Haemophilus influenzae
1996: Yeast
1998: C. elegans
2001: Human genome draft sequence
2002: Drosophila
2003: Complete human reference genome published
2007: Diploid genome sequence of an individual human
2008: Jim Watson genome
2011: Published complete genomes: 1,863 organisms
DNA Sequencing Cost
Oxford Nanopore
A 20-node installation, using 8,000-nanopore cartridges, is expected to deliver a complete human genome at 50-fold coverage in 15 minutes, according to the company, or 3 terabases of data per day, based on a sequencing speed of 300 bases per second. For that setup, the cost per gigabase is expected to be under $10.
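The quoted throughput and cost figures can be sanity-checked with some quick arithmetic. The sketch below is only a back-of-envelope estimate: it assumes every nanopore reads continuously at the stated speed, and it takes a haploid human genome to be roughly 3.1 gigabases. Neither assumption comes from the company statement.

```python
# Back-of-envelope check of the quoted Oxford Nanopore figures.
NODES = 20
PORES_PER_CARTRIDGE = 8_000
BASES_PER_SECOND_PER_PORE = 300
SECONDS_PER_DAY = 24 * 60 * 60

HUMAN_GENOME_GB = 3.1   # assumption: approximate haploid genome size in gigabases
COVERAGE = 50           # 50-fold coverage, as quoted
COST_PER_GB_USD = 10    # quoted upper bound per gigabase


def peak_bases_per_day(nodes=NODES, pores=PORES_PER_CARTRIDGE,
                       speed=BASES_PER_SECOND_PER_PORE):
    """Theoretical ceiling if every pore sequenced non-stop for 24 hours."""
    return nodes * pores * speed * SECONDS_PER_DAY


def cost_ceiling_usd(genome_gb=HUMAN_GENOME_GB, coverage=COVERAGE,
                     cost_per_gb=COST_PER_GB_USD):
    """Upper bound on the cost of one genome at the given coverage."""
    return genome_gb * coverage * cost_per_gb


print(f"Peak output: {peak_bases_per_day() / 1e12:.2f} terabases/day")
print(f"Cost ceiling for one 50x genome: ${cost_ceiling_usd():,.0f}")
```

Under these assumptions the theoretical ceiling comes out somewhat above the quoted 3 terabases per day, which is what one would expect if not every pore is reading at all times.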
Organism Whole Genome Sequences
2001 2012
Organism Whole Genome Sequences
Human
Mouse
Rat
Dog
Cow
Chimp
Rabbit
……..
Genomes OnLine Database (GOLD) http://www.genomesonline.org/index.htm
Global comprehensive access to information regarding complete and ongoing genome projects, as well as metagenomes & metadata
Genome Resources
Search for an organism's whole genome sequence
Genome Resources
NCBI Genome Resources: http://www.ncbi.nlm.nih.gov/guide/genomes/
NCBI Genome: http://www.ncbi.nlm.nih.gov/sites/entrez?db=genome
JGI Integrated Microbial Genome (IMG): http://img.jgi.doe.gov/cgi-bin/w/main.cgi
NCBI Genome
NCBI BioProject
Query: Check the status of genome sequencing for an organism, such as honey bee.
Answer:
• Enter the search term under BioProject
• Select the appropriate organism
• The BioProject summary page provides information on available projects and sequencing status
• Click on Project Type for more detailed information
• Explore Related Resources
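The same BioProject lookup can also be scripted. The sketch below is one possible approach, not part of the course material: it assumes NCBI's E-utilities `esearch` endpoint with `db=bioproject`, and the query term `Apis mellifera[Organism]` (the honey bee) is an illustrative choice. The helper names `build_esearch_url` and `search_ids` are hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# NCBI E-utilities search endpoint (assumed here; the slides use the web interface).
EUTILS_ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_esearch_url(term, db="bioproject", retmax=20):
    """Build an ESearch URL for the given Entrez database and query term."""
    params = {"db": db, "term": term, "retmode": "json", "retmax": retmax}
    return f"{EUTILS_ESEARCH}?{urlencode(params)}"


def search_ids(term, db="bioproject", retmax=20):
    """Run the search over the network and return the matching record IDs."""
    with urlopen(build_esearch_url(term, db, retmax)) as response:
        payload = json.load(response)
    return payload["esearchresult"]["idlist"]


# Building the URL needs no network access; calling search_ids() does.
url = build_esearch_url("Apis mellifera[Organism]")
print(url)
```

Passing `db="genome"` instead would point the same query at the NCBI Genome database listed under Genome Resources.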
Link to the video tutorial: http://media.hsls.pitt.edu/media/clres2705/rabbit.swf
Resources
• NCBI Genome Project: http://www.ncbi.nlm.nih.gov/genomeprj
• NCBI Genome: http://www.ncbi.nlm.nih.gov/sites/genome
Find the genomic sequence for an organism, such as rabbit.
NCBI Genome Project
A collection of complete and in-progress large-scale sequencing, assembly, annotation, and mapping projects for cellular organisms. The database is organized into organism-specific overviews that function as portals for browsing and retrieving projects pertaining to each organism.
Click: Rabbit
http://www.ncbi.nlm.nih.gov/genomeprj
NCBI Genome Project : Rabbit Genome
Link to the video tutorial: http://media.hsls.pitt.edu/media/molbiovideos/img.swf
Resources
Integrated Microbial Genome (IMG): http://img.jgi.doe.gov/cgi-bin/w/main.cgi
Find the genomic sequence for a bacterium, such as Salmonella enterica.
Human genome sequence
Genomic achievements since the Human Genome Project
http://goo.gl/bsZdN
Genome Biology: Structural Variations
Genome Reference Consortium
Link to the PLoS Biology paper on the GRC: http://goo.gl/30Xun
NCBI Genome Resources
http://www.ncbi.nlm.nih.gov/guide/genomes/
What is a Genome Browser?
Genome Browsers enable researchers to visualize & browse entire genomes with annotated data including:
• gene prediction and structure
• proteins
• expression
• regulation
• variation
• comparative analysis
• etc.
Annotated data usually comes from multiple, diverse sources.
Eukaryotic Genome Browsers
Display: Vertical
Display: Horizontal
Non-vertebrate Genome Browsers
Genome Browsers
The Big Three
• NCBI Map Viewer
• UCSC Genome Browser
• EBI Ensembl
Generic Genome Browser (GBrowse)
Display: Vertical
Display: Horizontal
UCSC Genome Browser
UCSC Genome Browser Default Tracks
UCSC Genome Browser Page
Groups of data (Tracks)
• mRNA and EST Tracks
• Expression (such as microarray)
• Comparative Genomics (as a group or by individual species)
• Variation and Repeats (including SNPs, copy number variation)
• ENCODE Tracks
• Phenotype and Disease Tracks
• Regulation (including TFBS)
Navigating the Human Genome
Browse the region of human chromosome 7 between 54,318,043 and 55,974,438 bp (chr7:54,318,043-55,974,438)
Link to the video tutorial: http://media.hsls.pitt.edu/media/clres2705/ucsc_genes.swf
Resource
UCSC Genome Browser: http://genome.ucsc.edu/
Browse the region of human chromosome 7 between 54,318,043 and 55,974,438 bp.
What genes are present in this region?
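A position string such as chr7:54,318,043-55,974,438 can be typed into the browser's search box, or turned into a direct link. The sketch below parses the position and builds such a link. It assumes the browser's `hgTracks` URL with `db` and `position` parameters, and it assumes the hg19 assembly, which the slides do not name. The function names are hypothetical.

```python
import re
from urllib.parse import urlencode

UCSC_HGTRACKS = "https://genome.ucsc.edu/cgi-bin/hgTracks"

# Matches e.g. "chr7:54,318,043-55,974,438" (commas in the coordinates are optional).
POSITION_RE = re.compile(r"^(chr\w+):([\d,]+)-([\d,]+)$")


def parse_position(position):
    """Split a position string into (chromosome, start, end) with integer coordinates."""
    match = POSITION_RE.match(position.strip())
    if match is None:
        raise ValueError(f"not a chrN:start-end position: {position!r}")
    chrom, start, end = match.groups()
    start_bp = int(start.replace(",", ""))
    end_bp = int(end.replace(",", ""))
    if start_bp > end_bp:
        raise ValueError("start coordinate is greater than end coordinate")
    return chrom, start_bp, end_bp


def region_length(position):
    """Number of bases in the region, counting both endpoints."""
    _, start_bp, end_bp = parse_position(position)
    return end_bp - start_bp + 1


def browser_url(position, db="hg19"):
    """Build a browser link for the region; db='hg19' is an assumed assembly."""
    chrom, start_bp, end_bp = parse_position(position)
    params = {"db": db, "position": f"{chrom}:{start_bp}-{end_bp}"}
    return f"{UCSC_HGTRACKS}?{urlencode(params)}"


region = "chr7:54,318,043-55,974,438"
print(browser_url(region))
print(region_length(region), "bp")
```

Validating the string first catches slips such as a misplaced comma or swapped coordinates before the link is opened.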
UCSC Genome Browser
UCSC Genome Browser: Navigating a Genomic Region
What genes are present in this region?
Bioinformatics Institutions
http://www.ebi.ac.uk/
http://www.ncbi.nlm.nih.gov/
UCSC Genome Browser: Navigating a Genomic Region
What is RefSeq?
NCBI Sequence Databases
• GenBank: archival database of nucleotide sequences from >160,000 organisms
• RefSeq: non-redundant, expert-verified database of reference sequences, based on GenBank records
International Nucleotide Sequence Database Collaboration
Primary vs. Derivative Databases
RefSeq Scope & Accessions
Genomic DNA
• NC_123456: complete genome, complete chromosome, complete plasmid
• NG_123456: genomic region
• NT_123456: genomic contig
mRNA: NM_123456
Protein: NP_123456
more about RefSeq scope and accessions...
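The accession prefixes above are easy to check programmatically. The sketch below is a minimal lookup covering only the five prefixes listed on this slide; RefSeq defines further prefixes that it does not handle. The function name `classify_accession` is hypothetical, and the accessions used are the slide's placeholder numbers, not real records.

```python
# Prefix-to-molecule mapping, limited to the prefixes listed on the slide.
REFSEQ_PREFIXES = {
    "NC": "genomic DNA: complete genome, chromosome, or plasmid",
    "NG": "genomic DNA: genomic region",
    "NT": "genomic DNA: genomic contig",
    "NM": "mRNA",
    "NP": "protein",
}


def classify_accession(accession):
    """Return the molecule type implied by a RefSeq accession prefix."""
    prefix, separator, _ = accession.strip().upper().partition("_")
    if not separator or prefix not in REFSEQ_PREFIXES:
        return "unknown or not covered by this sketch"
    return REFSEQ_PREFIXES[prefix]


for example in ("NC_123456", "NM_123456", "NP_123456"):
    print(example, "->", classify_accession(example))
```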
RefSeq Status Codes
• Provisional
• Reviewed
• Predicted
• Genome Annotation
more about RefSeq status codes
UCSC Genome Browser: Navigating a Genomic Region
Display Options
Hide: removes a track from view
Dense: all items collapsed into a single line
Squish: each item = separate line, but 50% height + packed
Pack: each item separate, but efficiently stacked (full height)
Full: each item on separate line
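These five modes can also be set from a link rather than from the track controls. The sketch below assumes that the browser accepts `trackName=mode` pairs as URL parameters alongside `db` and `position`. The track names `refGene` and `knownGene` and the hg19 assembly are illustrative assumptions, and `track_query` is a hypothetical helper.

```python
from urllib.parse import urlencode

# The five display modes listed above, from least to most detailed.
VISIBILITY_MODES = ("hide", "dense", "squish", "pack", "full")


def track_query(db, position, track_modes):
    """Build a query string that sets a display mode for each named track."""
    for track, mode in track_modes.items():
        if mode not in VISIBILITY_MODES:
            raise ValueError(f"{track}: unknown display mode {mode!r}")
    params = {"db": db, "position": position, **track_modes}
    return urlencode(params)


query = track_query(
    "hg19",                      # assumed assembly
    "chr7:55033691-55282150",    # region used in the SNP exercise
    {"refGene": "pack", "knownGene": "dense"},  # assumed track names
)
print("https://genome.ucsc.edu/cgi-bin/hgTracks?" + query)
```

Rejecting unknown modes up front avoids producing a link that silently falls back to the browser's defaults.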
UCSC Genome Browser: Navigating a Genomic Region
Gene Description
Informative description
other resource links
microarray data
mRNA secondary structure
links to sequences
protein domains/structure
orthologs in other species
Gene Ontology™ descriptions
mRNA descriptions
pathways
genetic association studies
comparative toxicology
gene model
UCSC Genome Browser: Navigating a Genomic Region
Find SNPs present in this region
Link to the video tutorial: http://media.hsls.pitt.edu/media/clres2705/ucsc_snp.swf
File: UCSC_part2.swf
Resource
UCSC Genome Browser: http://genome.ucsc.edu/
Browse the region of human chromosome 7 between 55,033,691 and 55,282,150 bp.
What genetic variations are present in this region?
Retrieve the DNA sequence of this genomic region showing SNPs in red and all gene exons in blue
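The browser does this highlighting itself when DNA is retrieved with track-based coloring. To illustrate the underlying idea, the sketch below marks up a sequence in plain text, substituting upper case for the blue exon color and square brackets for the red SNP color. The sequence, coordinates, exon interval, and SNP positions are all made-up toy data, and `annotate_sequence` is a hypothetical helper.

```python
def annotate_sequence(seq, region_start, exons, snps):
    """Mark exon bases in upper case and wrap SNP positions in brackets.

    seq          -- DNA string for the region
    region_start -- genomic coordinate of the first base (1-based)
    exons        -- list of (start, end) intervals, 1-based and inclusive
    snps         -- set of genomic positions that carry a SNP
    """
    marked = []
    for offset, base in enumerate(seq):
        position = region_start + offset
        in_exon = any(start <= position <= end for start, end in exons)
        letter = base.upper() if in_exon else base.lower()
        marked.append(f"[{letter}]" if position in snps else letter)
    return "".join(marked)


# Toy data only: a 10-base region starting at coordinate 101.
toy = annotate_sequence("ACGTACGTAC", 101, exons=[(103, 106)], snps={105, 109})
print(toy)
```

Here the exon spans coordinates 103 to 106, so those four bases appear in upper case, and the SNPs at 105 and 109 are bracketed whether or not they fall inside the exon.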
UCSC Genome Browser: Navigating a Genomic Region
BLAT: Map a protein sequence to the genome
UCSC Blat: Place a Peptide Seq into the Genome
Peptide Seq: NKSSHFYSNVGLQIQTYELQESNVQLKLTVVET
Nucleotide seq: AAATCCTCACATTTTTACTCAAATGTTGGACTTCAAATTCAGACATATGAACTTCAGGAAAGCAATGTTCA
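Before submitting either sequence to BLAT, it can help to confirm that the two correspond. The sketch below translates the nucleotide sequence in its first reading frame with the standard genetic code and checks where the result falls within the peptide. It assumes the space inside the nucleotide sequence is a line-wrap artifact and removes it. The function name `translate` is hypothetical.

```python
from itertools import product

# Standard genetic code, with codons ordered by the bases T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

PEPTIDE = "NKSSHFYSNVGLQIQTYELQESNVQLKLTVVET"
NUCLEOTIDE = ("AAATCCTCACATTTTTACTCAAATGTTGGACTTCAAATTCAGACATATGAACTTCAGGAAAGC"
              "AATGTTCA")


def translate(dna):
    """Translate frame 1 of a DNA string, ignoring whitespace and any trailing partial codon."""
    cleaned = "".join(dna.split()).upper()
    usable = len(cleaned) - len(cleaned) % 3
    return "".join(CODON_TABLE[cleaned[i:i + 3]] for i in range(0, usable, 3))


translated = translate(NUCLEOTIDE)
print(translated)
print("Offset within peptide:", PEPTIDE.find(translated))
```

The translation matches the peptide starting at its second residue, so the nucleotide fragment covers only part of the peptide. Both should map to the same genomic location.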
Link to the video tutorial: http://media.hsls.pitt.edu/media/clres2705/blat.swf
File: Blat.swf
Resource
UCSC BLAT: http://genome.ucsc.edu/cgi-bin/hgBlat?command=start
Place an mRNA or peptide sequence into the human genome
UCSC Blat
http://genome.ucsc.edu/cgi-bin/hgBlat
UCSC Blat
Peptide Seq: NKSSHFYSNVGLQIQTYELQESNVQLKLTVVET
Thank you!
Any questions?
Carrie Iwema: [email protected], 412-383-6887
Ansuman Chattopadhyay: [email protected], 412-648-1297