Other biological databases
![Page 1: Other biological databases](https://reader036.fdocuments.us/reader036/viewer/2022062403/5681554b550346895dc31ace/html5/thumbnails/1.jpg)
Other biological databases
![Page 2: Other biological databases](https://reader036.fdocuments.us/reader036/viewer/2022062403/5681554b550346895dc31ace/html5/thumbnails/2.jpg)
Biological systems
Taxonomic data
Literature
Protein folding and 3D structure
Small molecules
Pathways and networks
Biological systems
Protein families and domains
Whole genome data
Sequence data
Ontologies - GO
![Page 3: Other biological databases](https://reader036.fdocuments.us/reader036/viewer/2022062403/5681554b550346895dc31ace/html5/thumbnails/3.jpg)
Other Biological Databases
• Transcription factor binding sites: TRANSFAC
• Protein structure databases: PDB, SCOP, CATH
• Protein family databases: Pfam, Prints, PROSITE etc.
• Chemicals and small molecules: ChEBI
• Gene expression databases: GEO, ArrayExpress
• Metabolic pathways: Reactome, KEGG
• Genome databases: Ensembl, FlyBase, WormBase etc.
• Human genetics-related databases: HapMap, dbSNP
![Page 4: Other biological databases](https://reader036.fdocuments.us/reader036/viewer/2022062403/5681554b550346895dc31ace/html5/thumbnails/4.jpg)
Transcription factor binding sites
• TRANSFAC - database of eukaryotic transcription factors: http://www.gene-regulation.com/pub/databases.html#transfac
• TESS (Transcription Element Search System) - predicts transcription factor binding sites, uses TRANSFAC: http://www.cbi.upenn.edu/tess
• TFsearch - searches for transcription factor binding sites: http://www.cbrc.jp/research/db/TFSEARCH.html
![Page 5: Other biological databases](https://reader036.fdocuments.us/reader036/viewer/2022062403/5681554b550346895dc31ace/html5/thumbnails/5.jpg)
Protein structure databases
• Main resource is Protein Data Bank (PDB): http://www.rcsb.org/pdb/
• Contains the spatial coordinates of macromolecule atoms whose 3D structure has been obtained by X-ray or NMR studies
• Proteins represent more than 90% of available structures (others are DNA, RNA, sugars, viruses, protein/DNA complexes…)
• Can search by PDB code
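As a rough illustration of what "spatial coordinates of atoms" means in practice, the sketch below reads one fixed-width ATOM record of the classic PDB file format and extracts the x, y, z coordinates. The `Atom` type, the function name and the sample record are illustrative assumptions (the values are made up, not taken from a real entry); only the column positions follow the PDB format.

```python
from typing import NamedTuple


class Atom(NamedTuple):
    name: str
    residue: str
    chain: str
    x: float
    y: float
    z: float


def parse_atom_line(line: str) -> Atom:
    """Read one fixed-width ATOM/HETATM record from a PDB-format file."""
    if not line.startswith(("ATOM", "HETATM")):
        raise ValueError("not an ATOM/HETATM record")
    return Atom(
        name=line[12:16].strip(),     # columns 13-16: atom name
        residue=line[17:20].strip(),  # columns 18-20: residue name
        chain=line[21].strip(),       # column 22: chain identifier
        x=float(line[30:38]),         # columns 31-38: x coordinate (angstroms)
        y=float(line[38:46]),         # columns 39-46: y coordinate
        z=float(line[46:54]),         # columns 47-54: z coordinate
    )


# Illustrative record with made-up values, not copied from a real entry.
SAMPLE = "ATOM      1  N   MET A   1      27.340  24.430   2.614  1.00  9.67           N"

atom = parse_atom_line(SAMPLE)
print(atom.name, atom.residue, atom.chain, (atom.x, atom.y, atom.z))
```

In real work a dedicated structure-parsing library would normally be preferable to hand-written slicing; the point here is only to show how the coordinates are laid out.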
![Page 6: Other biological databases](https://reader036.fdocuments.us/reader036/viewer/2022062403/5681554b550346895dc31ace/html5/thumbnails/6.jpg)
Searching MSD
http://www.ebi.ac.uk/msd - search by PDB code
![Page 7: Other biological databases](https://reader036.fdocuments.us/reader036/viewer/2022062403/5681554b550346895dc31ace/html5/thumbnails/7.jpg)
Protein structure-related databases
• Structural family databases based on PDB: SCOP (http://scop.mrc-lmb.cam.ac.uk/scop/) and CATH (http://www.biochem.ucl.ac.uk/bsm/cath/)
• Predicted structures in SWISS-MODEL (http://swissmodel.expasy.org//SWISS-MODEL.html)
![Page 8: Other biological databases](https://reader036.fdocuments.us/reader036/viewer/2022062403/5681554b550346895dc31ace/html5/thumbnails/8.jpg)
Protein family databases
• Databases that produce signatures for identifying protein families or domains
• Used for functional classification of proteins
• E.g. Pfam, PROSITE, Prints, SMART, TIGRFAMs etc.
• Integrated into single resource InterPro (http://www.ebi.ac.uk/interpro)
![Page 9: Other biological databases](https://reader036.fdocuments.us/reader036/viewer/2022062403/5681554b550346895dc31ace/html5/thumbnails/9.jpg)
InterProScan sequence search
Stand-alone version available
![Page 10: Other biological databases](https://reader036.fdocuments.us/reader036/viewer/2022062403/5681554b550346895dc31ace/html5/thumbnails/10.jpg)
InterPro text search
Search by keyword, protein accession or InterPro accession
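The three kinds of text query can be told apart by their shape. The sketch below is one possible way to do that, assuming InterPro accessions look like `IPR` followed by six digits and protein accessions follow the UniProt accession pattern; the function name and returned labels are hypothetical and do not reflect how the InterPro site itself handles queries.

```python
import re

# InterPro entry accessions: "IPR" followed by six digits.
INTERPRO_ACC = re.compile(r"IPR\d{6}")

# UniProt accession pattern (6 or 10 characters).
UNIPROT_ACC = re.compile(
    r"[OPQ][0-9][A-Z0-9]{3}[0-9]|[A-NR-Z][0-9](?:[A-Z][A-Z0-9]{2}[0-9]){1,2}"
)


def classify_query(query: str) -> str:
    """Guess whether a search string is an accession or a free-text keyword."""
    q = query.strip().upper()
    if INTERPRO_ACC.fullmatch(q):
        return "interpro_accession"
    if UNIPROT_ACC.fullmatch(q):
        return "protein_accession"
    return "keyword"


for example in ("IPR000719", "P04637", "kinase"):
    print(example, "->", classify_query(example))
```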
![Page 11: Other biological databases](https://reader036.fdocuments.us/reader036/viewer/2022062403/5681554b550346895dc31ace/html5/thumbnails/11.jpg)
Results for protein accession
![Page 12: Other biological databases](https://reader036.fdocuments.us/reader036/viewer/2022062403/5681554b550346895dc31ace/html5/thumbnails/12.jpg)
Example InterPro entry
![Page 13: Other biological databases](https://reader036.fdocuments.us/reader036/viewer/2022062403/5681554b550346895dc31ace/html5/thumbnails/13.jpg)
Chemicals and small molecules
• Chemical Abstracts: http://www.cas.org/
• ChEBI: http://www.ebi.ac.uk/chebi
• KEGG - part of it includes chemicals: http://www.genome.jp/kegg
• ChemIDplus - chemicals cited in NLM databases: http://chem2.sis.nlm.nih.gov/chemidplus/chemidlite.jsp
• MSD-Chem - ligands and chemicals in MSD
![Page 14: Other biological databases](https://reader036.fdocuments.us/reader036/viewer/2022062403/5681554b550346895dc31ace/html5/thumbnails/14.jpg)
ChEBI example entry
![Page 15: Other biological databases](https://reader036.fdocuments.us/reader036/viewer/2022062403/5681554b550346895dc31ace/html5/thumbnails/15.jpg)
Hierarchy for chemicals
![Page 16: Other biological databases](https://reader036.fdocuments.us/reader036/viewer/2022062403/5681554b550346895dc31ace/html5/thumbnails/16.jpg)
Gene expression databases
• NCBI Gene Expression Omnibus (GEO) http://www.ncbi.nlm.nih.gov/geo/
• ArrayExpress http://www.ebi.ac.uk/arrayexpress/
• Stanford microarray database http://genome-www5.stanford.edu/
• Can usually search for experiments or particular expression profiles
![Page 17: Other biological databases](https://reader036.fdocuments.us/reader036/viewer/2022062403/5681554b550346895dc31ace/html5/thumbnails/17.jpg)
GEO search page
![Page 18: Other biological databases](https://reader036.fdocuments.us/reader036/viewer/2022062403/5681554b550346895dc31ace/html5/thumbnails/18.jpg)
Profiles search results
![Page 19: Other biological databases](https://reader036.fdocuments.us/reader036/viewer/2022062403/5681554b550346895dc31ace/html5/thumbnails/19.jpg)
Specific entry and experiment info
![Page 20: Other biological databases](https://reader036.fdocuments.us/reader036/viewer/2022062403/5681554b550346895dc31ace/html5/thumbnails/20.jpg)
ArrayExpress search results
![Page 21: Other biological databases](https://reader036.fdocuments.us/reader036/viewer/2022062403/5681554b550346895dc31ace/html5/thumbnails/21.jpg)
What does the data look like?
• Info on experiment, array used, etc.
• Raw or processed tab-delimited file containing spots and their intensities (Cy3/Cy5 ratios) across different samples
• Files with metadata, e.g. sample info, annotation and coordinates of each spot on the array
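To make the data layout above more concrete, here is a minimal sketch that reads a small tab-delimited table of spot intensities and computes a log2 ratio per spot. The column names (`spot_id`, `cy3`, `cy5`), the values and the choice of log2(Cy5/Cy3) are assumptions for illustration; real files from GEO or ArrayExpress use their own headers and may report the ratio the other way round.

```python
import csv
import io
import math

# Hypothetical two-colour data: one row per spot, raw intensities per channel.
SAMPLE_TSV = (
    "spot_id\tcy3\tcy5\n"
    "spot_001\t1200\t2400\n"
    "spot_002\t800\t200\n"
    "spot_003\t500\t500\n"
)


def log2_ratios(handle):
    """Return {spot_id: log2(cy5 / cy3)} for a tab-delimited intensity table."""
    reader = csv.DictReader(handle, delimiter="\t")
    return {
        row["spot_id"]: math.log2(float(row["cy5"]) / float(row["cy3"]))
        for row in reader
    }


ratios = log2_ratios(io.StringIO(SAMPLE_TSV))
for spot, value in ratios.items():
    print(spot, round(value, 2))
```

Positive values mean the Cy5 channel is brighter, negative values the Cy3 channel, and zero means equal intensity in both.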
![Page 22: Other biological databases](https://reader036.fdocuments.us/reader036/viewer/2022062403/5681554b550346895dc31ace/html5/thumbnails/22.jpg)
Proteomics: SWISS-2DPAGE
![Page 23: Other biological databases](https://reader036.fdocuments.us/reader036/viewer/2022062403/5681554b550346895dc31ace/html5/thumbnails/23.jpg)
Enzymes and metabolic pathways
• Contain information describing enzymes, biochemical reactions and metabolic pathways;
• ENZYME and BRENDA: nomenclature databases that store information on enzyme names and reactions;
• IntEnz: Integrated relational Enzyme database
![Page 24: Other biological databases](https://reader036.fdocuments.us/reader036/viewer/2022062403/5681554b550346895dc31ace/html5/thumbnails/24.jpg)
Enzyme nomenclature
• E.C. (Enzyme Commission) numbers are assigned based on the reactions the enzymes catalyze
• Hierarchy, high-level groups:
– EC 1: Oxidoreductases
– EC 2: Transferases
– EC 3: Hydrolases
– EC 4: Lyases
– EC 5: Isomerases
– EC 6: Ligases
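Because the first field of an EC number encodes the top-level group, the class can be read directly from the number. The sketch below shows one way to do this; the function name is hypothetical and the mapping is limited to the six groups listed on this slide.

```python
# Top-level EC classes as listed on the slide.
EC_CLASSES = {
    1: "Oxidoreductases",
    2: "Transferases",
    3: "Hydrolases",
    4: "Lyases",
    5: "Isomerases",
    6: "Ligases",
}


def ec_top_level(ec_number: str) -> str:
    """Return the top-level class name for an EC number such as 'EC 3.4.21.4'."""
    text = ec_number.strip()
    if text.upper().startswith("EC"):
        text = text[2:].strip()
    first_field = text.split(".")[0]
    try:
        return EC_CLASSES[int(first_field)]
    except (ValueError, KeyError):
        raise ValueError(f"unrecognised EC number: {ec_number!r}") from None


print(ec_top_level("EC 3.4.21.4"))
print(ec_top_level("1.1.1.1"))
```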
![Page 25: Other biological databases](https://reader036.fdocuments.us/reader036/viewer/2022062403/5681554b550346895dc31ace/html5/thumbnails/25.jpg)
EC example
![Page 26: Other biological databases](https://reader036.fdocuments.us/reader036/viewer/2022062403/5681554b550346895dc31ace/html5/thumbnails/26.jpg)
Metabolic pathway databases
• PATHGUIDE: >200 pathways
• KEGG (Kyoto Encyclopedia of Genes and Genomes): http://www.genome.jp/kegg - includes:
– Database of chemicals, genes and networks (metabolic, regulatory etc.)
– Well-curated and quite specific
• EcoCyc (Encyclopedia of E. coli K12 genes and metabolism): http://ecocyc.org - curation of the entire genome
• Reactome - curated biological pathways: http://www.reactome.org/
• GenMAPP - pathways contributed by users
![Page 27: Other biological databases](https://reader036.fdocuments.us/reader036/viewer/2022062403/5681554b550346895dc31ace/html5/thumbnails/27.jpg)
http://www.genome.ad.jp/kegg
Pathways differ between species -> comparison
![Page 28: Other biological databases](https://reader036.fdocuments.us/reader036/viewer/2022062403/5681554b550346895dc31ace/html5/thumbnails/28.jpg)
Pathway in Reactome
![Page 29: Other biological databases](https://reader036.fdocuments.us/reader036/viewer/2022062403/5681554b550346895dc31ace/html5/thumbnails/29.jpg)
Example of a pathway in BioCyc
![Page 30: Other biological databases](https://reader036.fdocuments.us/reader036/viewer/2022062403/5681554b550346895dc31ace/html5/thumbnails/30.jpg)
Protein-protein interaction databases
• Protein-protein interaction databases store pairwise interactions or complexes
• Can get 1 to more than 20,000 interactions per publication
• IntAct: http://www.ebi.ac.uk/intact
• DIP (Database of Interacting Proteins): http://dip.doe-mbi.ucla.edu/
• BIND (Biomolecular Interaction Network Database): http://submit.bind.ca:8080/bind/
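Pairwise interactions of the kind these databases store map naturally onto an undirected graph. The sketch below builds such a graph from a list of pairs and counts the partners of each protein; the protein identifiers are placeholders, not real database records, and the function name is hypothetical.

```python
from collections import defaultdict


def build_network(pairs):
    """Build an undirected interaction graph: protein -> set of partners."""
    network = defaultdict(set)
    for a, b in pairs:
        network[a].add(b)
        network[b].add(a)
    return network


# Placeholder identifiers standing in for protein accessions.
PAIRS = [
    ("protA", "protB"),
    ("protA", "protC"),
    ("protB", "protC"),
    ("protC", "protD"),
]

network = build_network(PAIRS)
degrees = {protein: len(partners) for protein, partners in network.items()}
for protein in sorted(degrees):
    print(protein, degrees[protein])
```

Complexes with more than two members need a richer representation (for example a set of members per complex); this sketch covers binary interactions only.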
![Page 31: Other biological databases](https://reader036.fdocuments.us/reader036/viewer/2022062403/5681554b550346895dc31ace/html5/thumbnails/31.jpg)
Protein-protein interactions in IntAct
![Page 32: Other biological databases](https://reader036.fdocuments.us/reader036/viewer/2022062403/5681554b550346895dc31ace/html5/thumbnails/32.jpg)
Integrated functional interactions in STRING
![Page 33: Other biological databases](https://reader036.fdocuments.us/reader036/viewer/2022062403/5681554b550346895dc31ace/html5/thumbnails/33.jpg)
Genome browsers
• Integrate sequence & functional data for a genome
• Ensembl - genome browser for major eukaryotic genomes, e.g. human, mouse etc.: http://www.ensembl.org
• UCSC browser: http://genome.ucsc.edu/
• FlyBase - Drosophila genome database: http://www.ebi.ac.uk/flybase
• WormBase - C. elegans: http://www.wormbase.org
• PlasmoDB - Plasmodium (malaria): http://plasmodb.org
• Etc.
![Page 34: Other biological databases](https://reader036.fdocuments.us/reader036/viewer/2022062403/5681554b550346895dc31ace/html5/thumbnails/34.jpg)
Ensembl genome browser
![Page 35: Other biological databases](https://reader036.fdocuments.us/reader036/viewer/2022062403/5681554b550346895dc31ace/html5/thumbnails/35.jpg)
Ensembl gene view 1
![Page 36: Other biological databases](https://reader036.fdocuments.us/reader036/viewer/2022062403/5681554b550346895dc31ace/html5/thumbnails/36.jpg)
Ensembl gene view 2
![Page 37: Other biological databases](https://reader036.fdocuments.us/reader036/viewer/2022062403/5681554b550346895dc31ace/html5/thumbnails/37.jpg)
Gene within context on chromosome
![Page 38: Other biological databases](https://reader036.fdocuments.us/reader036/viewer/2022062403/5681554b550346895dc31ace/html5/thumbnails/38.jpg)
Human genetics databases
• GeneCards (http://www.genecards.org/)
• HapMap (http://hapmap.ncbi.nlm.nih.gov/)
• OMIM (http://www.ncbi.nlm.nih.gov/omim)
• HGDP - Human Genome Diversity Project (http://hagsc.org/hgdp/files.html)
![Page 39: Other biological databases](https://reader036.fdocuments.us/reader036/viewer/2022062403/5681554b550346895dc31ace/html5/thumbnails/39.jpg)
Mutation/polymorphism databases
Most of the databases are disease- or gene-centric, e.g. p53
![Page 40: Other biological databases](https://reader036.fdocuments.us/reader036/viewer/2022062403/5681554b550346895dc31ace/html5/thumbnails/40.jpg)
dbSNP http://www.ncbi.nlm.nih.gov/SNP/
Repository of known mutations (human and other organisms)
![Page 41: Other biological databases](https://reader036.fdocuments.us/reader036/viewer/2022062403/5681554b550346895dc31ace/html5/thumbnails/41.jpg)
Where to find the databases
• Table of addresses for major databases and tools
• Nucleic Acids Research Database issue (January each year)
• Nucleic Acids Research Software issue (new)
• Expasy list of tools: http://ca.expasy.org/links.html
![Page 42: Other biological databases](https://reader036.fdocuments.us/reader036/viewer/2022062403/5681554b550346895dc31ace/html5/thumbnails/42.jpg)
Large scale data retrieval
• Programmatic access to many databases
• MySQL access to some
• BioMart access - public and private
• FTP sites - large data downloads
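As one example of programmatic access, the sketch below builds a gene-lookup URL for the Ensembl REST service and wraps the request in a small function. The REST service is not mentioned on the slides and is used here only as an assumed illustration; the function names are hypothetical, and the request itself is left uncalled because it needs network access.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

# Assumed base URL of the Ensembl REST service.
BASE = "https://rest.ensembl.org"


def lookup_url(species: str, symbol: str) -> str:
    """Build the URL that looks up a gene by species and gene symbol."""
    return (
        f"{BASE}/lookup/symbol/{quote(species)}/{quote(symbol)}"
        "?content-type=application/json"
    )


def fetch_gene(species: str, symbol: str) -> dict:
    """Fetch the gene record as a dictionary (requires network access)."""
    with urlopen(lookup_url(species, symbol)) as response:
        return json.load(response)


print(lookup_url("homo_sapiens", "TP53"))
# To run the query for real: record = fetch_gene("homo_sapiens", "TP53")
```

For bulk downloads of whole datasets, the FTP sites or BioMart listed above are usually a better fit than issuing many single-record requests.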
![Page 43: Other biological databases](https://reader036.fdocuments.us/reader036/viewer/2022062403/5681554b550346895dc31ace/html5/thumbnails/43.jpg)
Other tutorials
• http://www.ensembl.org/info/website/tutorials/index.html
• http://www.ebi.ac.uk/training/online/
• http://www.ebi.ac.uk/2can/home.html