NCBI FieldGuide NCBI Molecular Biology Resources March 2007 Using Entrez.
NCBI Molecular Biology Resources
March 2007
Using Entrez
WWW Access
Entrez & BLAST
Entrez: Database Integration
Databases: Genomes, Taxonomy, PubMed abstracts, Nucleotide sequences, Protein sequences, 3-D Structure
Link types: Word weight, VAST, BLAST, Phylogeny, Hard Link
Neighbors: Related Sequences
Neighbors: Related Sequences, BLink, Domains
Neighbors: Related Structures
Database Searching with Entrez
• Using limits and field restriction to find human MutL homolog
• Linking and neighboring with MutL
• Mapping SNPs onto structure and the genome
Global NCBI (Entrez) Search
Human hereditary nonpolyposis colon cancer
Global Entrez Search Results
Nucleotide Sequences
Nucleotide database now three parts
• EST: expressed sequence tags
• GSS: genome survey sequences
• CoreNucleotide: everything else
Advanced Search Options
Tabs
More Precise Nucleotides Search
nonpolyposis[All Fields] AND colon cancer[Title] AND human[Organism] AND biomol_mrna[Properties] AND srcdb_refseq[Properties]
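The field-restricted query above can also be assembled programmatically. The following is a minimal sketch, not something shown in the slides: `build_entrez_query` is a hypothetical helper name, and the code only joins term[Field] pairs with AND and URL-encodes the result. It does not contact NCBI.

```python
from urllib.parse import quote_plus


def build_entrez_query(restrictions):
    """Join (term, field) pairs into 'term[Field] AND term[Field] ...'.

    Hypothetical helper for illustration; not part of any NCBI library.
    """
    return " AND ".join(f"{term}[{field}]" for term, field in restrictions)


# Terms and fields copied from the slide.
restrictions = [
    ("nonpolyposis", "All Fields"),
    ("colon cancer", "Title"),
    ("human", "Organism"),
    ("biomol_mrna", "Properties"),
    ("srcdb_refseq", "Properties"),
]

query = build_entrez_query(restrictions)

# URL-encoded form, usable as the value of a query-string parameter.
encoded = quote_plus(query)

print(query)
print(encoded)
```

Keeping the terms as (term, field) pairs makes it easy to drop or swap one restriction, for example replacing `srcdb_refseq` to widen the search beyond RefSeq.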
Useful Field Restrictions
[Title]: Definition line in GenBank / GenPept format, shown in Summary format
glyceraldehyde 3 phosphate dehydrogenase[Title]
[Organism]: NCBI’s taxonomy. Organizing system for molecular databases
mouse[organism]; green plants[organism]; Streptomyces coelicolor[organism]
[Properties]: molecule type, location, database source
biomol_mrna[properties]; biomol_genomic[properties]; gene_in_mitochondrion[properties]; srcdb_pdb[properties]
[Filter]: subsets of data, Entrez links
all[filter]; nucleotide_mapview[filter]; nucleotide_omim[filter]
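To make the term[Field] pattern concrete, here is a small sketch that splits a query into its terms and field tags. It assumes a simplified grammar (terms joined only by " AND ", each with exactly one bracketed field); `parse_field_terms` is a hypothetical name, and this is not how Entrez itself parses queries. Field names are lowercased because the slides use both [Title] and [organism], and Entrez field tags are not case-sensitive.

```python
import re

# Matches "term[Field]"; the term itself may contain spaces.
TERM_RE = re.compile(r"^\s*(?P<term>.+?)\s*\[(?P<field>[^\]]+)\]\s*$")


def parse_field_terms(query):
    """Split 'a[X] AND b[Y]' into [('a', 'x'), ('b', 'y')].

    Simplified on purpose: only AND is handled, with no OR, NOT, or parentheses.
    """
    pairs = []
    for chunk in query.split(" AND "):
        match = TERM_RE.match(chunk)
        if match is None:
            raise ValueError(f"not a field-restricted term: {chunk!r}")
        pairs.append((match.group("term"), match.group("field").lower()))
    return pairs


parsed = parse_field_terms(
    "glyceraldehyde 3 phosphate dehydrogenase[Title] AND mouse[organism] AND all[filter]"
)
print(parsed)
```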
Organism Field: NCBI’s Taxonomy
Useful Properties Field Terms
Molecule type
biomol_mrna
biomol_genomic
GenBank division
gbdiv_est
gbdiv_htg
gbdiv_xxx
Gene location
gene_in_mitochondrion
gene_in_chloroplast
gene_in_genomic
Source Database
srcdb_refseq
srcdb_pdb
srcdb_swiss_prot
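The Properties terms listed above follow a prefix convention: the prefix names the category and the suffix names the value. A minimal sketch of that convention, using only the four prefixes that appear on the slide (the helper name `property_category` is hypothetical, and other prefixes may exist):

```python
# Prefix-to-category mapping taken from the slide's four groups.
PREFIX_CATEGORIES = {
    "biomol_": "Molecule type",
    "gbdiv_": "GenBank division",
    "gene_in_": "Gene location",
    "srcdb_": "Source Database",
}


def property_category(term):
    """Return the slide's category for a Properties term, or None if unknown."""
    for prefix, category in PREFIX_CATEGORIES.items():
        if term.startswith(prefix):
            return category
    return None


for term in ["biomol_mrna", "gbdiv_est", "gene_in_chloroplast", "srcdb_refseq"]:
    print(f"{term}[Properties] -> {property_category(term)}")
```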
Human MutL RefSeq
GenBank Records
NM_000249: Links
Literature Links
OMIM
OMIM: Human Disease Genes
Conserved Domain
Sequence Links
Finding Homologs and Structures
Protein Link
BLAST Link
Conserved Domains
BLink: BLAST Link
top 200 only
Redundant GIs
BLink: non-redundant relatives
zebrafish homolog
BLAST
Short Cut: Related Structures
E. coli MutL Structure
Cn3D viewer
Conserved Domains
3D Domain Neighbors
Structure Neighbors
PubChem compound
MLH1 Domain Structure: CDD
ATPase Domain
Mismatch Repair Domain
MLH1: ATPase Domain
Mapping Polymorphisms onto Structure
GeneView: Variations
Human MLH1
ATPase domain
Related Structures
Mapping Variation Onto Structure
Conserved Asn
Asn – Ile
Ile – Val
Genome Resources
NM_000249: Genome Links
The Map Viewer
Genome BLAST
Map Viewer: Human MLH1
Customizable
NCBI Assembly
EST Hits
Gene Annotations
Models
Transcripts
Download data and sequences
Synteny: Mammalian Genomes
Albumin Gene Family
Homologene
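Figure: globin gene tree. An early globin gene undergoes gene duplication, giving an A-chain gene and a B-chain gene.
Leaves: frog A, chick A, mouse A, mouse B, chick B, frog B
Labels: paralogs (A-chain vs. B-chain genes); orthologs (frog A, chick A, mouse A); orthologs (mouse B, chick B, frog B)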
• Completely Annotated Eukaryotic Genomes
• Homologous UniGene determined for other organisms
• Protein similarities first
• Guided by taxonomic tree
• Includes orthologs and paralogs
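The ortholog/paralog distinction in the globin example can be written out as a short sketch. It assumes the simple tree drawn on the slide, where a single gene duplication (A-chain vs. B-chain) predates every speciation event, so genes of the same chain in different species are orthologs and genes of different chains are paralogs. The `relationship` function is hypothetical and is not how Homologene builds its clusters.

```python
from itertools import combinations

# Leaves of the globin tree from the slide, as (species, chain) pairs.
genes = [
    ("frog", "A"), ("chick", "A"), ("mouse", "A"),
    ("mouse", "B"), ("chick", "B"), ("frog", "B"),
]


def relationship(gene1, gene2):
    """Classify a gene pair, assuming one duplication that predates all speciation."""
    (species1, chain1), (species2, chain2) = gene1, gene2
    if chain1 != chain2:
        return "paralogs"    # separated by the gene duplication
    if species1 != species2:
        return "orthologs"   # same chain, separated by speciation
    return "same gene"


counts = {"orthologs": 0, "paralogs": 0}
for g1, g2 in combinations(genes, 2):
    counts[relationship(g1, g2)] += 1

print(counts)
```

Under this assumption even a cross-species pair such as frog A and mouse B counts as paralogs, because their last common ancestor is the duplication node rather than a speciation node.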
Homologene Cluster
Rice Homolog
The Gene Database
• Gene Centered Information
• Unifies LocusLink and microbial Genomes
• 2.4 million records for 3,822 taxa
Human       38,603    Sea Urchin    30,603
Chimpanzee  31,502    Mosquito      13,763
Mouse       60,746    Fruit Fly     21,116
Rat         38,117    C. elegans    20,935
Dog         20,154    Fungi         168,802
Cow         23,677    Green Plants  76,847
Chicken     18,469    Archaea       74,627
Zebrafish   38,594    Bacteria      1,361,390
Genes MLH1: One Stop Shopping
Genes MLH1: One Stop Shopping (cont.)
Genes: Display Options and Links