© Wiley Publishing. 2007. All Rights Reserved. Protein and Specialized Sequence Databases.
Protein and Specialized Sequence Databases
Learning Objectives
Finding out the basics of protein maturation
Deciphering a Swiss-Prot entry
Getting to know specialized protein databases such as KEGG (the metabolic-pathways database) or PDB (the structure database)
Outline
1. Getting from a gene to a mature protein
2. Reading a UniProt/Swiss-Prot entry
3. Exploring metabolic databases such as KEGG
4. Finding out about post-translational modifications
From a Gene to a Functional Protein
DNA genes get transcribed into mRNAs
mRNAs are translated into proteins
Proteins often need to be matured before becoming active
Matured proteins must be transported to their destination
• Cell nucleus
• Mitochondria or other organelles
• Periplasm (bacteria)
• Secreted outside the cell
The protein is functional when it reaches the place where it has to work (just like you and me)!
Protein Maturation
Maturation can involve
• Removal of some fragments
• Specific protein cleavage
• Chemical modifications
• Phosphorylation
• Addition of lipids or sugars (glycosylation)
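As a rough illustration of the first two items, removing fragments can be modeled as slicing the precursor sequence. The sequence, the cut positions and the function name `mature` are invented for this sketch; real positions have to come from experiments.

```python
def mature(precursor, keep_start, keep_end):
    """Return the mature chain, using 1-based inclusive positions."""
    if not 1 <= keep_start <= keep_end <= len(precursor):
        raise ValueError("positions fall outside the precursor")
    return precursor[keep_start - 1:keep_end]

# Hypothetical 20-residue precursor: residues 1-5 stand in for a removed
# leading fragment and residues 16-20 for a removed trailing fragment.
precursor = "MKTLLACDEFGHIKLNPQRS"
chain = mature(precursor, 6, 15)
print(chain)
```

Chemical modifications such as phosphorylation do not change the letters of the sequence, so they would need to be tracked separately, for example as a list of modified positions.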
Knowing Your Protein
To understand how your protein works, you need to know about
• Its maturation
• Its transportation
• Its mechanism of functioning
All this information must be determined experimentally
If it has been done, it’s in Swiss-Prot
The Swiss-Prot Database
Entries describe all proteins that have known functions
Small, non-redundant database: 100,000 entries
• TrEMBL contains the 4 million putative proteins found in GenBank
• Swiss-Prot contains the subset of TrEMBL with a known function
All entries are annotated manually
Most accurate database for protein function
Access Swiss-Prot at www.expasy.ch
Browsing a Swiss-Prot Entry
Find this entry at www.expasy.org/uniprot/P00533
The Main Sections of a Swiss-Prot Entry
General information
• Accession number
References
• Bibliography
Comment section
• Functional information
Cross-references
• Links to entries in other databases
Feature table
• Mapping of every known function
Sequence
The General Information in a Swiss-Prot Entry
The Entry Name
• Identifies the entry
• Can change if the entry gets merged
The Primary Accession Number
• Has a form such as PXXXXX (for example, P00533)
• Is permanent and never changes
Last Modified lets you know when the entry was last modified
The Protein Name and Synonyms provide some common names of your protein
The From and Taxonomy fields indicate where the protein comes from
The References section lists all the references used to compile this entry
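Because a primary accession number has a fixed shape, it can be checked before building a link to the entry. The sketch below uses the accession pattern given in the UniProt documentation and the URL form shown on the slide for P00533. The function names `is_accession` and `entry_url` are made up for this example, and the URL form may differ from what the site serves today.

```python
import re

# Accession pattern as documented by UniProt: six characters for classic
# accessions such as P00533, plus a longer ten-character form.
ACCESSION = re.compile(
    r"[OPQ][0-9][A-Z0-9]{3}[0-9]"
    r"|[A-NR-Z][0-9](?:[A-Z][A-Z0-9]{2}[0-9]){1,2}"
)

def is_accession(text):
    """True if the whole string looks like a primary accession number."""
    return ACCESSION.fullmatch(text) is not None

def entry_url(accession):
    """Build a link in the form used on these slides (an assumption)."""
    if not is_accession(accession):
        raise ValueError(f"not a valid accession: {accession!r}")
    return "www.expasy.org/uniprot/" + accession

print(entry_url("P00533"))
```

An entry name is rejected by this check, which matches the distinction on the slide: the entry name can change, so links are safer when built from the accession number.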
The Comments Section
The Comments section lists all the known functions of the protein
This section is a valuable document compiled manually by specialists
Comments deal with the most standard topics (see table)
Comment Section of the Entry P00533
The Cross-reference Section
Contains hyperlinks to other entries in other databases
Automatically updated
Some Important Cross-References
EMBL: Original DNA sequence (EMBL/GenBank)
PDB: Experimental structure of your protein
DIP: Proteins interacting with your protein
GlycoSuiteDB: Glycosylations
MIM: List of genetic diseases involving your protein
Ontologies: Function of your protein
Profiles: Known protein domains in your protein
ENSEMBL: Genomic location of your protein
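In the plain-text form of an entry, cross-references sit on lines that start with `DR`, with semicolon-separated fields: the database name first, then the primary identifier. A minimal sketch of grouping them by database is shown here. The function name `cross_references` is hypothetical, and the sample lines use placeholder identifiers, not real records.

```python
from collections import defaultdict

def cross_references(entry_text):
    """Group the primary identifiers on DR lines by database name."""
    refs = defaultdict(list)
    for line in entry_text.splitlines():
        if not line.startswith("DR   "):
            continue
        # Drop the line code and the trailing period, then split fields.
        fields = [f.strip() for f in line[5:].rstrip(".").split(";")]
        if len(fields) >= 2:
            refs[fields[0]].append(fields[1])
    return dict(refs)

# Made-up DR lines; every identifier below is a placeholder.
sample = """\
DR   EMBL; X00000; -; -; mRNA.
DR   PDB; 0ABC; X-ray; 2.00 A; A=1-100.
DR   PDB; 0ABD; NMR; -; A=1-100.
DR   MIM; 000000; gene.
"""
refs = cross_references(sample)
print(refs["PDB"])
```

Grouping by database makes it easy to answer questions such as "does this protein have any experimental structure?" by checking whether the PDB list is empty.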
The Features Section
Precisely locates every known functional feature of your protein on its sequence
TRANSMEM: Transmembrane domain
ACT_SITE: Active sites
BINDING: Binding sites
DISULFID: Disulfide bridges between cysteines
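Each feature can be thought of as a key plus a pair of positions on the sequence. The sketch below shows how such records map back to residues. The `Feature` type, the `residues` helper, the sequence and all positions are invented for illustration and do not come from any real entry.

```python
from typing import NamedTuple

class Feature(NamedTuple):
    key: str    # feature key, e.g. TRANSMEM or ACT_SITE
    start: int  # 1-based, inclusive
    end: int    # 1-based, inclusive

def residues(sequence, feature):
    """Return the stretch of sequence covered by a feature."""
    return sequence[feature.start - 1:feature.end]

# Hypothetical sequence and feature positions.
sequence = "MKTLLACDEFGHIKLNPQRS"
features = [
    Feature("TRANSMEM", 3, 8),
    Feature("ACT_SITE", 12, 12),
    Feature("BINDING", 16, 16),
]
for feature in features:
    print(feature.key, feature.start, feature.end, residues(sequence, feature))
```

Positions are kept 1-based and inclusive, the way they are written in an entry, and converted to Python slice indices only inside `residues`.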
Finding Out More About Your Protein’s Maturation
Proteins are often modified to make them active
Modification can involve attaching a lipid or a sugar
Use these resources to determine the details of the modification
www.ebi.ac.uk/RESID
• This site details every known post-translational modification
www.glycosuite.com
• A complete database of all known sugars found in proteins
www.lipidbank.jp
• A database of lipids
The Function of Your Protein
The Features and the Comments sections give you valuable functional information
To find out about the function of your protein, you will need to determine
• Where your protein works
• The metabolic pathway in which the protein is involved
• The protein’s 3D structure
• Which protein family it belongs to
You may find this data by following links in the Cross-reference section
Where Does Your Protein Work?
Proteins are usually part of a metabolic pathway
A metabolic pathway is like a chain of production linking several different proteins
Metabolic pathways modify metabolites by passing them from one enzyme to the next
On a KEGG pathway map, each enzyme appears with its EC number
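An EC number has four dot-separated levels, and the first level names the broad class of reaction. The following sketch splits an EC number and looks up that class. The function name `parse_ec` and the returned dictionary layout are choices made for this example. The class names follow the IUBMB enzyme nomenclature.

```python
EC_CLASSES = {
    1: "Oxidoreductases",
    2: "Transferases",
    3: "Hydrolases",
    4: "Lyases",
    5: "Isomerases",
    6: "Ligases",
    7: "Translocases",
}

def parse_ec(ec_number):
    """Split an EC number such as '2.7.10.1' into its four levels."""
    text = ec_number.strip()
    if text.upper().startswith("EC"):
        text = text[2:].strip()
    parts = text.split(".")
    if len(parts) != 4:
        raise ValueError(f"expected four levels: {ec_number!r}")
    first = int(parts[0])
    if first not in EC_CLASSES:
        raise ValueError(f"unknown enzyme class: {ec_number!r}")
    # Lower levels stay as strings so partial numbers like 3.4.-.- survive.
    return {"class": EC_CLASSES[first], "levels": parts}

print(parse_ec("EC 2.7.10.1"))
```

Two enzymes that share the first three levels catalyze closely related reactions, which is one reason EC numbers are convenient labels on a pathway map.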
Some Important Resources for Metabolic Pathways
www.genome.ad.jp/kegg
• KEGG is the most extensive database of metabolic pathways
• You can use it to compare species
www.chem.qmul.ac.uk/iubmb
• The IUBMB assigns the EC numbers used to describe an enzyme activity
www.ecocyc.org
• An exhaustive list of all known metabolic pathways in E. coli and other bacteria
What Is the Structure of Your Protein?
A protein must have the right structure to perform its function
The structure of a protein is the key to understanding it
Predicting protein structures is very difficult
Precise structure determination requires experiments
• X-ray crystallography
• Nuclear magnetic resonance
Prediction from sequence alone is possible but unreliable
Some Databases of Protein Structures
www.rcsb.org/pdb
• The database of protein structures
• The protein’s “PDB” is often synonymous with its structure
www.ncbi.nlm.nih.gov/Structure
• The other home of protein structures
swissmodel.expasy.org
• Prediction of structures from sequences
Some Important Protein Families
Proteins can be classified into families
This classification is based on both function and sequence
Very specialized databases are available for the most important families
www.kinasenet.org
• Kinases regulate many cellular processes; their deregulation is a cause of many cancers
imgt.cines.fr
• Immunoglobulins are key elements of our natural defenses
rebase.neb.com
• This site is a key resource on restriction enzymes
Wrapping It Up
Predicting protein function is a central goal in biology
Protein databases help organize knowledge
They provide the material for
• Developing new biological experiments
• Developing new prediction algorithms
• Extrapolating experimental data to unknown sequences