NCBI FieldGuide NCBI Molecular Biology Resources March 2007 Using Entrez.

Page 1:

NCBI Molecular Biology Resources

March 2007

Using Entrez

Page 2:

WWW Access

Entrez & BLAST

Page 3:

Entrez: Database Integration

[Diagram: the Entrez databases and the links between them. Databases: PubMed abstracts, Nucleotide sequences, Protein sequences, 3-D Structure, Genomes, Taxonomy. Link labels: Word weight, VAST, BLAST, Phylogeny, Hard Link, Neighbors, Related Sequences, Related Structures, BLink, Domains.]

Page 4:

Database Searching with Entrez

• Using limits and field restriction to find human MutL homolog
• Linking and neighboring with MutL
• Mapping SNPs onto structure and the genome

Page 5:

Global NCBI (Entrez) Search

Human hereditary nonpolyposis colon cancer

Page 6:

Global Entrez Search Results

Page 7:

Nucleotide Sequences

Nucleotide database now three parts
• EST: expressed sequence tags
• GSS: genome survey sequences
• CoreNucleotide: everything else

Page 8:

Advanced Search Options

Tabs

Page 9:

More Precise Nucleotides Search

nonpolyposis[All Fields] AND colon cancer[Title] AND human[Organism] AND biomol_mrna[Properties] AND srcdb_refseq[Properties]
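The query on this slide can also be assembled programmatically. The following is a minimal sketch, not part of the original slides: it joins (term, field) pairs with AND and builds an E-utilities ESearch URL without sending any request. The helper names `entrez_query` and `esearch_url` are hypothetical, and the database name `nuccore` is an assumption about how CoreNucleotide is addressed through E-utilities.

```python
from urllib.parse import urlencode

# Public E-utilities search endpoint (the slides only show the web form).
ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def entrez_query(*clauses):
    """Join (term, field) pairs into 'term[Field] AND term[Field] ...'."""
    return " AND ".join(f"{term}[{field}]" for term, field in clauses)


def esearch_url(db, query, retmax=20):
    """Build an ESearch URL for the query; no network request is made."""
    params = {"db": db, "term": query, "retmax": retmax}
    return ESEARCH + "?" + urlencode(params)


# The field-restricted search from the slide.
query = entrez_query(
    ("nonpolyposis", "All Fields"),
    ("colon cancer", "Title"),
    ("human", "Organism"),
    ("biomol_mrna", "Properties"),
    ("srcdb_refseq", "Properties"),
)

# "nuccore" is assumed here to be the E-utilities name for CoreNucleotide.
url = esearch_url("nuccore", query)
print(query)
print(url)
```

Keeping the clauses as pairs makes it easy to drop or swap a single restriction, for example replacing the RefSeq source restriction with another properties term.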

Page 10:

Useful Field Restrictions

[Title]: Definition line in GenBank / GenPept format, shown in Summary format

glyceraldehyde 3 phosphate dehydrogenase[Title]

[Organism]: NCBI’s taxonomy. Organizing system for molecular databases

mouse[organism]; green plants[organism]; Streptomyces coelicolor[organism]

[Properties]: molecule type, location, database source

biomol_mrna[properties]; biomol_genomic[properties]; gene_in_mitochondrion[properties]; srcdb_pdb[properties]

[Filter]: subsets of data, Entrez links

all[filter]; nucleotide_mapview[filter]; nucleotide_omim[filter]
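Every example above follows the same pattern: a search term followed by a field name in square brackets. As an illustration only, here is a small sketch that splits a semicolon-separated list of such examples into (term, field) pairs. The function `parse_terms` and the regular expression are hypothetical helpers, not an NCBI tool, and lowercasing the field name is a choice made for this sketch (the slides write both [Organism] and [organism]).

```python
import re

# One field-restricted term: free text, then a field name in square brackets.
TERM_RE = re.compile(r"^\s*(?P<term>.+?)\s*\[(?P<field>[^\[\]]+)\]\s*$")


def parse_terms(text):
    """Split 'a[field]; b[field]' into a list of (term, field) tuples."""
    pairs = []
    for chunk in text.split(";"):
        if not chunk.strip():
            continue  # ignore empty pieces such as a trailing ';'
        match = TERM_RE.match(chunk)
        if match is None:
            raise ValueError(f"not a field-restricted term: {chunk!r}")
        # Field names are lowercased here so both spellings compare equal.
        pairs.append((match["term"], match["field"].lower()))
    return pairs


examples = "mouse[organism]; green plants[organism]; Streptomyces coelicolor[organism]"
parsed = parse_terms(examples)
for term, field in parsed:
    print(f"{field}: {term}")
```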

Page 11:

Organism Field: NCBI’s Taxonomy

Page 12:

Useful Properties Field Terms

Molecule type

biomol_mrna

biomol_genomic

GenBank division

gbdiv_est

gbdiv_htg

gbdiv_xxx

Gene location

gene_in_mitochondrion

gene_in_chloroplast

gene_in_genomic

Source Database

srcdb_refseq

srcdb_pdb

srcdb_swiss_prot
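The properties terms listed above share a prefix per category (biomol_, gbdiv_, gene_in_, srcdb_). A minimal sketch of how that grouping could be used to sanity-check terms before combining them into a [Properties] clause follows. The names `PROPERTY_CATEGORIES`, `property_category`, and `properties_clause` are hypothetical, and the prefix table covers only the categories shown on this slide.

```python
# Prefix -> category, taken from the groupings on the slide.
PROPERTY_CATEGORIES = {
    "biomol_": "Molecule type",
    "gbdiv_": "GenBank division",
    "gene_in_": "Gene location",
    "srcdb_": "Source database",
}


def property_category(term):
    """Return the slide's category for a properties term, based on its prefix."""
    for prefix, category in PROPERTY_CATEGORIES.items():
        if term.startswith(prefix):
            return category
    raise ValueError(f"unrecognized properties term: {term!r}")


def properties_clause(*terms):
    """AND together properties terms after checking each prefix."""
    for term in terms:
        property_category(term)  # raises ValueError on an unknown prefix
    return " AND ".join(f"{term}[Properties]" for term in terms)


print(property_category("gene_in_mitochondrion"))
print(properties_clause("biomol_mrna", "srcdb_refseq"))
```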

Page 13:

Human MutL RefSeq

GenBank Records

Page 14:

NM_000249: Links

Page 15:

Literature Links

OMIM

Page 16:

OMIM: Human Disease Genes

Conserved Domain

Page 17:

Sequence Links

Finding Homologs and Structures

Page 18:

Protein Link

BLAST Link

Conserved Domains

Page 19:

BLink: BLAST Link

top 200 only

Redundant GIs

Page 20:

BLink: non-redundant relatives

zebrafish homolog

BLAST

Page 21:

Short Cut: Related Structures

Page 22:

E. coli MutL Structure

Cn3D viewer

Conserved Domains

3D Domain Neighbors

Structure Neighbors

PubChem compound

Page 23:

MLH1 Domain Structure: CDD

ATPase Domain

Mismatch Repair Domain

Page 24:

MLH1: ATPase Domain

Page 25:

Mapping Polymorphisms onto Structure

Page 26:

GeneView: Variations Human

MLH1

ATPase domain

Page 27:

Related Structures

Page 28:

Mapping Variation Onto Structure

Conserved Asn

Asn – Ile

Ile – Val

Page 29:

Genome Resources

Page 30:

NM_000249: Genome Links

Page 31:

The Map Viewer

Genome BLAST

Page 32:

Map Viewer: Human MLH1

Customizable

NCBI Assembly

EST Hits

Gene Annotations

Models

Transcripts

Download data and sequences

Page 33:

Synteny: Mammalian Genomes

Albumin Gene Family

Page 34:

HomoloGene

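[Diagram: an early globin gene undergoes gene duplication into an A-chain gene and a B-chain gene, giving frog A, chick A, mouse A, mouse B, chick B, and frog B. The A genes are orthologs of one another, as are the B genes; the A and B genes are paralogs.]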

• Completely Annotated Eukaryotic Genomes
• Homologous UniGene determined for other organisms
• Protein similarities first
• Guided by taxonomic tree
• Includes orthologs and paralogs
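The ortholog/paralog distinction in the globin diagram can be sketched in a few lines. This is a toy illustration under one stated assumption, which holds for the tree on the slide: the A/B gene duplication happened before the frog, chick, and mouse lineages split. It is not how HomoloGene builds its clusters, and the names `GENES` and `relationship` are hypothetical.

```python
from itertools import combinations

# (species, chain) leaves of the globin tree shown on the slide.
GENES = [
    ("frog", "A"), ("chick", "A"), ("mouse", "A"),
    ("mouse", "B"), ("chick", "B"), ("frog", "B"),
]


def relationship(gene1, gene2):
    """Classify a pair of genes from the toy tree.

    Same chain: the genes diverged at a speciation event, so orthologs.
    Different chains: they trace back to the gene duplication, so paralogs.
    """
    return "orthologs" if gene1[1] == gene2[1] else "paralogs"


# Classify every unordered pair of genes.
pairs = {(a, b): relationship(a, b) for a, b in combinations(GENES, 2)}

for (a, b), kind in pairs.items():
    print(f"{a[0]} {a[1]} / {b[0]} {b[1]}: {kind}")
```

Note that under this definition mouse A and frog B are paralogs even though they come from different species, because their last common ancestor is the duplication event.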

Page 35:

HomoloGene Cluster

Page 36:

Rice Homolog

Page 37:

The Gene Database

• Gene Centered Information
• Unifies LocusLink and microbial Genomes
• 2.4 million records for 3,822 taxa

Human         38,603    Sea Urchin        30,603
Chimpanzee    31,502    Mosquito          13,763
Mouse         60,746    Fruit Fly         21,116
Rat           38,117    C. elegans        20,935
Dog           20,154    Fungi            168,802
Cow           23,677    Green Plants      76,847
Chicken       18,469    Archaea           74,627
Zebrafish     38,594    Bacteria       1,361,390

Page 38:

Genes MLH1: One Stop Shopping

Page 39:

Genes MLH1: One Stop Shopping (cont.)

Page 40:

Genes: Display Options and Links