Protein analysis and proteomics

Friday, 27 January 2006

Introduction to Bioinformatics

DA McClellan

Four perspectives on a protein (Fig. 8.1, page 224):

[1] Protein families

[2] Physical properties

[3] Protein localization

[4] Protein function

Perspective 1: Protein families (domains and motifs)

Page 225

Definitions

Signature:
• a protein category such as a domain or motif

Domain:
• a region of a protein that can adopt a 3D structure
• a characteristic fold or functional region
• a family (superfamily) is a group of proteins that share a domain
• examples: zinc finger domain, immunoglobulin domain

Motif (or fingerprint):
• a short, conserved region of a protein
• typically 10 to 20 contiguous amino acid residues

Page 225

15 most common domains (human)

Domain                        Proteins
Zn finger, C2H2 type          1093
Immunoglobulin                1032
EGF-like                      471
Zn-finger, RING               458
Homeobox                      417
Pleckstrin-like               405
RNA-binding region RNP-1      400
SH3                           394
Calcium-binding EF-hand       392
Fibronectin, type III         300
PDZ/DHR/GLGF                  280
Small GTP-binding protein     261
BTB/POZ                       236
bHLH                          226
Cadherin                      226

Table 8-3, page 227. Source: Integr8 program at www.ebi.ac.uk/proteome/

Definition of a domain

According to InterPro at EBI (http://www.ebi.ac.uk/interpro/):

A domain is an independent structural unit, found alone or in conjunction with other domains or repeats. Domains are evolutionarily related.

Tables 8-1, 8-2, page 226

According to SMART (http://smart.embl-heidelberg.de):

A domain is a conserved structural entity with distinctive secondary structure content and a hydrophobic core. Homologous domains with common functions usually show sequence similarities.

Varieties of protein domains

Fig. 8.2, page 228

Extending along the length of a protein

Occupying a subset of a protein sequence

Occurring one or more times

Example of a protein with domains: Methyl CpG binding protein 2 (MeCP2)

Domains shown: MBD, TRD (page 227)

The protein includes a methylated DNA-binding domain (MBD) and a transcriptional repression domain (TRD). MeCP2 is a transcriptional repressor.

Mutations in the gene encoding MeCP2 cause Rett syndrome, a neurological disorder that primarily affects girls.

Fig. 8.3, page 228

Result of an MeCP2 blastp search: a methyl-binding domain shared by several proteins

Are proteins that share only a domain homologous?

Fig. 8.3, page 228

ProDom entry for HIV-1 pol shows many related proteins

Fig. 8.7, page 231

Proteins can have both domains and patterns (motifs)

Domain (aspartyl protease)
Domain (reverse transcriptase)
Pattern (several residues)
Pattern (several residues)

Fig. 8.7, page 231
Fig. 8.8, page 232

Definition of a motif

A motif (or fingerprint) is a short, conserved region of a protein. Its size is often 10 to 20 amino acids.

Simple motifs include transmembrane domains and phosphorylation sites. These do not imply homology when found in a group of proteins.

PROSITE (www.expasy.org/prosite) is a dictionary of motifs (>1300 entries as of 9/05). In PROSITE, a pattern is a qualitative motif description (a protein either matches a pattern, or not). In contrast, a profile is a quantitative motif description. We will encounter profiles in Pfam, ProDom, SMART, and other databases.

Pages 231-233
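
Because a PROSITE pattern is qualitative (a sequence either matches or it does not), it maps naturally onto a regular expression. The sketch below is illustrative only: the function names prosite_to_regex and find_pattern are made up for this example, the toy sequence is invented, and only the core of the pattern syntax is handled (x, [..], {..}, repeat counts, and the < and > terminus anchors). The pattern N-{P}-[ST]-{P} is the PROSITE N-glycosylation site pattern.

```python
import re

def prosite_to_regex(pattern):
    """Translate a PROSITE-style pattern into a Python regex string."""
    parts = []
    for element in pattern.rstrip(".").split("-"):
        prefix = suffix = ""
        if element.startswith("<"):      # '<' anchors the pattern at the N-terminus
            prefix, element = "^", element[1:]
        if element.endswith(">"):        # '>' anchors the pattern at the C-terminus
            suffix, element = "$", element[:-1]
        # Separate the residue specification from an optional repeat, e.g. x(2,4).
        core, low, high = re.fullmatch(
            r"(.+?)(?:\((\d+)(?:,(\d+))?\))?", element
        ).groups()
        if core == "x":                  # x: any amino acid
            core = "."
        elif core.startswith("{"):       # {P}: any amino acid except P
            core = "[^" + core[1:-1] + "]"
        # [ST] and single letters are already valid regex fragments.
        repeat = ""
        if low:
            repeat = "{" + low + ("," + high if high else "") + "}"
        parts.append(prefix + core + repeat + suffix)
    return "".join(parts)

def find_pattern(pattern, sequence):
    """Return (1-based start, matched residues) for every match, overlaps included."""
    regex = re.compile("(?=(" + prosite_to_regex(pattern) + "))")
    return [(m.start() + 1, m.group(1)) for m in regex.finditer(sequence)]

# Invented toy sequence, for illustration only.
hits = find_pattern("N-{P}-[ST]-{P}", "MKNGSAANPTL")
print(hits)
```

A profile, by contrast, would assign each position a score rather than a yes/no answer, which is why it cannot be expressed as a plain regular expression.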

Perspective 2: Physical properties of proteins

Page 233

Fig. 8.9, page 234

Posttranslational modifications:

Fig. 8.11, page 235

Fig. 8.12, page 236

Fig. 8.13, page 238

Syntaxin, SNAP-25 and VAMP are three proteins that interact via coiled-coil domains

Introduction to Perspectives 3 and 4: Gene Ontology (GO) Consortium

Page 237

The Gene Ontology Consortium

An ontology is a description of concepts. The GO Consortium compiles a dynamic, controlled vocabulary of terms related to gene products.

There are three organizing principles:
• Molecular function
• Biological process
• Cellular component

You can visit GO at http://www.geneontology.org. There is no centralized GO database. Instead, curators of organism-specific databases assign GO terms to gene products for each organism.

Page 237

GO terms are assigned to Entrez Gene entries

Fig. 8.14, page 241

The Gene Ontology Consortium: Evidence Codes

IC    Inferred by curator
IDA   Inferred from direct assay
IEA   Inferred from electronic annotation
IEP   Inferred from expression pattern
IGI   Inferred from genetic interaction
IMP   Inferred from mutant phenotype
IPI   Inferred from physical interaction
ISS   Inferred from sequence or structural similarity
NAS   Non-traceable author statement
ND    No biological data
TAS   Traceable author statement

Table 8-7, page 240
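
The evidence codes can be treated as a small lookup table attached to each annotation. The sketch below is a hedged illustration: the annotation tuples are invented, the function name filter_annotations is hypothetical, and the choice to set aside IEA and ND annotations is an assumption made for this example, not a GO Consortium rule.

```python
# Evidence codes as listed in Table 8-7.
EVIDENCE_CODES = {
    "IC": "Inferred by curator",
    "IDA": "Inferred from direct assay",
    "IEA": "Inferred from electronic annotation",
    "IEP": "Inferred from expression pattern",
    "IGI": "Inferred from genetic interaction",
    "IMP": "Inferred from mutant phenotype",
    "IPI": "Inferred from physical interaction",
    "ISS": "Inferred from sequence or structural similarity",
    "NAS": "Non-traceable author statement",
    "ND": "No biological data",
    "TAS": "Traceable author statement",
}

def filter_annotations(annotations, exclude=("IEA", "ND")):
    """Keep (gene, term, code) tuples whose evidence code is not excluded."""
    return [a for a in annotations if a[2] not in exclude]

# Invented annotations, for illustration only.
annotations = [
    ("geneA", "retinol binding", "IDA"),
    ("geneA", "transport", "IEA"),
    ("geneB", "visual perception", "IMP"),
]

kept = filter_annotations(annotations)
for gene, term, code in kept:
    print(gene, term, code, EVIDENCE_CODES[code])
```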

Perspective 3: Protein localization

Page 242

Protein localization

Proteins may be localized to intracellular compartments, cytosol, the plasma membrane, or they may be secreted. Many proteins shuttle between multiple compartments.

A variety of algorithms predict localization, but this is essentially a cell biological question.

Page 242

Fig. 8.15, page 242

PSORT: searches for sorting signals that are characteristic of proteins localized to particular cellular compartments

Fig. 8.16, page 244
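
The idea of scanning a sequence for sorting signals can be sketched in a few lines. This is not PSORT's algorithm, only a toy under stated assumptions: it checks two well-known C-terminal signals (KDEL for retention in the endoplasmic reticulum and SKL for peroxisomal targeting), the function name and the sequences are invented, and real predictors weigh many more features.

```python
# Two C-terminal sorting signals; a deliberately tiny, illustrative table.
C_TERMINAL_SIGNALS = {
    "KDEL": "endoplasmic reticulum (retention signal)",
    "SKL": "peroxisome (PTS1-type signal)",
}

def predict_from_c_terminus(sequence):
    """Return a compartment if the sequence ends with a known signal, else None."""
    seq = sequence.upper()
    for signal, compartment in C_TERMINAL_SIGNALS.items():
        if seq.endswith(signal):
            return compartment
    return None

# Invented sequences, for illustration only.
for seq in ("MAAGTLLVKDEL", "MSTPLQASKL", "MKTAYIAKQR"):
    print(seq, "->", predict_from_c_terminus(seq))
```

A hit from a rule like this is only a hypothesis; as the slide notes, localization is ultimately settled experimentally.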

Localization of 2,900 yeast proteins

Michael Snyder and colleagues incorporated epitope tags into thousands of S. cerevisiae cDNAs, and systematically localized proteins (Kumar et al., 2002).

See http://ygac.med.yale.edu for a database including 2,900 fluorescence micrographs.

Page 243

Perspective 4: Protein function

Page 243

Protein function

Function refers to the role of a protein in the cell. We can consider protein function from a variety of perspectives.

Page 243

Each perspective is illustrated with retinol-binding protein (RBP); all panels are from Fig. 8.17, page 245.

1. Biochemical function (molecular function)
   RBP binds retinol, could be a carrier

2. Functional assignment based on homology
   RBP could be a carrier too, by homology to other carrier proteins

3. Function based on structure
   RBP forms a calyx

4. Function based on ligand binding specificity
   RBP binds vitamin A

5. Function based on cellular process
   RBP is abundant, soluble, secreted

6. Function based on biological process
   Analyze a gene knockout phenotype; RBP is essential for vision

7. Function based on "proteomics" or high throughput "functional genomics"
   High throughput analyses show...
   • RBP levels elevated in renal failure
   • RBP levels decreased in liver disease

Functional assignment of enzymes: the EC (Enzyme Commission) system

Class              Entries
Oxidoreductases    1,003
Transferases       1,076
Hydrolases         1,125
Lyases             356
Isomerases         156
Ligases            126

Table 8-8, page 246. Updated 9/04, 9/05
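
The first field of an EC number identifies which of the six classes listed above an enzyme belongs to (1 for oxidoreductases through 6 for ligases). A minimal sketch, assuming the class counts from Table 8-8; the function name ec_class is hypothetical. EC 1.1.1.1 is alcohol dehydrogenase.

```python
# Class number -> (class name, entry count from Table 8-8).
EC_CLASSES = {
    1: ("Oxidoreductases", 1003),
    2: ("Transferases", 1076),
    3: ("Hydrolases", 1125),
    4: ("Lyases", 356),
    5: ("Isomerases", 156),
    6: ("Ligases", 126),
}

def ec_class(ec_number):
    """Return the class name for an EC number such as '1.1.1.1'."""
    first_field = int(ec_number.split(".")[0])
    return EC_CLASSES[first_field][0]

total = sum(count for _, count in EC_CLASSES.values())

print(ec_class("1.1.1.1"))   # alcohol dehydrogenase
for number, (name, count) in EC_CLASSES.items():
    print(f"EC {number}  {name:<16} {count:>5}  {100 * count / total:.1f}%")
```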

Functional assignment of proteins: Clusters of Orthologous Groups (COGs)

• Information storage and processing
• Cellular processes
• Metabolism
• Poorly characterized

Table 8-9, page 247. See Chapter 14 for COGs at NCBI