Readings for this week


Page 1: Readings for this week

Readings for this week

Gogarten et al Horizontal gene transfer…..

Francke et al. Reconstructing metabolic networks…..

Sign up for meeting next week for proposal feedback/progress checkup

Page 2: Readings for this week

Inferring protein function

By genomic context………….

Page 3: Readings for this week

Inferring protein function

By homology……

Page 4: Readings for this week

COGs: Clusters of Orthologous Groups (eukaryotic versions are KOGs)

Identified using all-against-all sequence comparisons on a collection of complete genomes. Includes genes with orthologous and paralogous relationships.

COGs are grouped into large-scale functional categories
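As a rough illustration of how all-against-all comparisons can point to candidate orthologs, the sketch below finds reciprocal best hits between two genomes. This is a simplification chosen for the example: the published COG procedure is more involved, and the gene names and scores here are invented.

```python
def reciprocal_best_hits(scores_ab, scores_ba):
    """Return gene pairs (a, b) that are each other's best-scoring hit.

    scores_ab[a][b] is the similarity score of gene a (genome A) searched
    against gene b (genome B); scores_ba holds the reverse searches.
    """
    best_ab = {a: max(hits, key=hits.get) for a, hits in scores_ab.items() if hits}
    best_ba = {b: max(hits, key=hits.get) for b, hits in scores_ba.items() if hits}
    return sorted((a, b) for a, b in best_ab.items() if best_ba.get(b) == a)


# Hypothetical scores, invented for this sketch.
scores_ab = {"a1": {"b1": 250, "b2": 90}, "a2": {"b1": 80, "b2": 60}}
scores_ba = {"b1": {"a1": 245, "a2": 85}, "b2": {"a1": 95, "a2": 55}}

pairs = reciprocal_best_hits(scores_ab, scores_ba)
print(pairs)
```

Here a2 also hits b1 best, but b1 prefers a1, so only one pair is reciprocal. Genes like a2 are the kind of case where paralogous relationships have to be considered.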

Page 7: Readings for this week

Domains: Conserved structural entities with distinctive secondary structure content and a hydrophobic core

Example: Protein kinase domain

Motifs: A pattern of amino acids that is conserved across many proteins and confers a particular function on the protein.

Example: Zinc finger CX2-4C....HX2-4H

Looking at Parts of Proteins

Page 8: Readings for this week

PFAM: Protein Families Database

Based on Hidden Markov Models (HMMs): statistical probability models of multiple sequence alignments

Uses manually curated seed alignments (PFAM-A)

Based on these alignments, a Position Specific Scoring Matrix (PSSM) is created

How to identify domains?

Page 9: Readings for this week

Position Specific Scoring Matrix (PSSM)
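A minimal sketch of how a PSSM can be built from an ungapped alignment and then slid along a sequence to locate the best-matching region. The uniform background frequencies, the pseudocount of 1, and the four-sequence alignment are assumptions made for this example; real tools use observed background frequencies and more careful weighting.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def build_pssm(alignment, pseudocount=1.0):
    """Return one {residue: log2-odds score} dict per alignment column."""
    background = 1.0 / len(AMINO_ACIDS)  # assumed uniform background
    pssm = []
    for column in zip(*alignment):
        total = len(column) + pseudocount * len(AMINO_ACIDS)
        scores = {}
        for aa in AMINO_ACIDS:
            freq = (column.count(aa) + pseudocount) / total
            scores[aa] = math.log2(freq / background)
        pssm.append(scores)
    return pssm


def best_window(pssm, sequence):
    """Slide the PSSM along the sequence; return (best score, start index)."""
    width = len(pssm)
    best = None
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        score = sum(col[aa] for col, aa in zip(pssm, window))
        if best is None or score > best[0]:
            best = (score, start)
    return best


# Hypothetical ungapped alignment, invented for this sketch.
alignment = ["GKST", "GKTT", "GRST", "GKSS"]
pssm = build_pssm(alignment)
score, start = best_window(pssm, "MAAGKSTLLW")
print(start, round(score, 2))
```

Residues that are common in a column get positive scores and residues never seen there get negative ones, so each position is scored differently, unlike a single substitution matrix applied everywhere.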

Page 10: Readings for this week

PFAM: Protein Families Database

Searching a protein against PFAM results in an E value with meaning similar to BLAST E values (the number of sequences expected to score that well for that domain by chance)
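An E value is an expected count of chance hits, so it can exceed 1. Under the Poisson assumption used in standard BLAST statistics, it converts to the probability of seeing at least one chance hit that good. Whether the same conversion applies exactly to a given PFAM search is an assumption of this sketch.

```python
import math


def prob_at_least_one_chance_hit(e_value):
    """P(at least one chance hit) = 1 - exp(-E), assuming Poisson-distributed hits."""
    return -math.expm1(-e_value)


for e in (1e-5, 0.01, 1.0, 10.0):
    print(e, prob_at_least_one_chance_hit(e))
```

For small E values the two quantities are nearly equal, which is why they are often used interchangeably; they diverge as E approaches and passes 1.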

Page 11: Readings for this week

Other Protein Databases

SMART—uses HMMs, focus is signalling and regulatory proteins (tend to be more divergent than enzymes)

TIGRFAMs: TIGR-curated alignments used to generate HMMs; one advantage is that names should be functionally accurate for all proteins they represent

PRINTS—not HMM based, uses “fingerprints” of conserved motifs

Ecumenical solution—InterPro—collection of multiple databases under one umbrella

Page 12: Readings for this week

Still more kinds of BLAST

PSI-BLAST: Position Specific Iterated BLAST

Use to: find members of a protein family or build a custom position-specific score matrix

The most sensitive BLAST program, making it useful for finding very distantly related proteins or new members of a protein family

1st round: Standard BLASTP search, then a PSSM is built from all hits with E values better than the inclusion threshold

2nd round: The PSSM is used to evaluate alignments in this search. Additional hits better than the inclusion threshold are incorporated into an updated PSSM

3rd + rounds: As the second round. The search reaches convergence when no new hits are found.

Can save PSSM for use in later searching
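The round structure above can be sketched as a toy loop. This is not how PSI-BLAST itself is implemented: it assumes equal-length ungapped sequences, a uniform background, a raw score cutoff standing in for the E value inclusion threshold, and a first round scored with a PSSM built from the query alone instead of a real BLASTP search. The sequences are invented.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def build_pssm(seqs, pseudocount=0.1):
    """Log2-odds PSSM from equal-length ungapped sequences (uniform background)."""
    background = 1.0 / len(AMINO_ACIDS)
    pssm = []
    for column in zip(*seqs):
        total = len(column) + pseudocount * len(AMINO_ACIDS)
        pssm.append({aa: math.log2(((column.count(aa) + pseudocount) / total) / background)
                     for aa in AMINO_ACIDS})
    return pssm


def score(pssm, seq):
    """Sum the per-position PSSM scores for one sequence."""
    return sum(col[aa] for col, aa in zip(pssm, seq))


def iterate_search(query, database, threshold, max_rounds=10):
    """Rebuild the PSSM from included hits until a round adds nothing new."""
    included = {query}
    for round_no in range(1, max_rounds + 1):
        pssm = build_pssm(sorted(included))
        hits = {seq for seq in database if score(pssm, seq) >= threshold}
        new_hits = hits - included
        if not new_hits:  # convergence: no new hits this round
            return included, round_no
        included |= new_hits
    return included, max_rounds


# Hypothetical toy database, invented for this sketch.
database = ["FGELAM", "FGEIAM", "WWWWWW"]
found, rounds = iterate_search("FGELAL", database, threshold=12.0)
print(sorted(found), rounds)
```

In this toy run the sequence with two differences from the query falls below the cutoff in the first round and is picked up only after the one-difference hit has been folded into the PSSM, which is the effect that makes iteration more sensitive than a single search.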

Page 13: Readings for this week

Still more kinds of BLAST

PHI-BLAST: Pattern Hit Initiated BLAST

Find proteins similar to the query around a given pattern

Must enter both a query sequence containing the pattern AND a pattern to search on

Example Pattern: (easy) FGELA

(harder) [LIVMF]-G-E-x-[GAS]-[LIVM]-x(5,11)-R-[STAQ]-A-x-[LIVMA]-x-[STACV]

Matching peptide: FGELALMYNTPRAATIVA
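To check that the peptide really satisfies the harder pattern, the pattern can be translated into a regular expression. The converter below is a sketch that handles only the syntax elements used on this slide (literal residues, bracketed alternatives, x, and repeat counts in parentheses), not the full pattern syntax; the function name is made up for the example.

```python
import re

# One pattern element: x, a residue, or [alternatives], with an optional (n) or (n,m).
TOKEN = re.compile(r"(x|[A-Z]|\[[A-Z]+\])(?:\((\d+)(?:,(\d+))?\))?")


def pattern_to_regex(pattern):
    """Translate a PROSITE-style pattern into a Python regular expression."""
    compact = pattern.replace("-", "")  # dashes are only separators
    parts = []
    pos = 0
    while pos < len(compact):
        match = TOKEN.match(compact, pos)
        if match is None:
            raise ValueError(f"unsupported pattern element at position {pos}")
        core, low, high = match.groups()
        part = "." if core == "x" else core  # x means any residue
        if low is not None:
            part += "{" + low + ("," + high if high is not None else "") + "}"
        parts.append(part)
        pos = match.end()
    return "".join(parts)


easy = pattern_to_regex("FGELA")
harder = pattern_to_regex(
    "[LIVMF]-G-E-x-[GAS]-[LIVM]-x(5,11)-R-[STAQ]-A-x-[LIVMA]-x-[STACV]"
)
peptide = "FGELALMYNTPRAATIVA"
print(harder)
print(bool(re.search(easy, peptide)), bool(re.fullmatch(harder, peptide)))
```

The variable-length gap x(5,11) is matched here by the five residues MYNTP, the shortest spacing the pattern allows.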

Page 14: Readings for this week

Enzyme Nomenclature

1. Oxidoreductases

2. Transferases

3. Hydrolases

4. Lyases

5. Isomerases

6. Ligases

EC Numbers: A hierarchical classification scheme for enzymes

enzymes are named and classified according to the reactions they catalyze
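A small sketch of what the hierarchy means in practice: the first field of an EC number selects one of the six classes listed above, and each further field narrows the reaction type. The helper names are made up for this example, and only the six classes from this slide are included.

```python
EC_CLASSES = {
    1: "Oxidoreductases",
    2: "Transferases",
    3: "Hydrolases",
    4: "Lyases",
    5: "Isomerases",
    6: "Ligases",
}


def parse_ec(ec_number):
    """Split an EC number such as 'EC 1.1.1.1' into its class name and four levels."""
    text = ec_number.strip()
    if text.upper().startswith("EC"):
        text = text[2:].strip()
    levels = text.split(".")
    if len(levels) != 4:
        raise ValueError(f"expected four dot-separated fields, got {ec_number!r}")
    top = int(levels[0])
    if top not in EC_CLASSES:
        raise ValueError(f"unknown top-level class in {ec_number!r}")
    return {"class": EC_CLASSES[top], "levels": levels}


def shared_levels(ec_a, ec_b):
    """Count how many leading levels two EC numbers have in common."""
    count = 0
    for a, b in zip(parse_ec(ec_a)["levels"], parse_ec(ec_b)["levels"]):
        if a != b:
            break
        count += 1
    return count


print(parse_ec("EC 1.1.1.1")["class"])
print(shared_levels("2.7.1.1", "2.7.1.11"))
```

Two enzymes that share more leading levels catalyze more closely related reactions; a dash in a trailing field, as in 3.4.21.-, is kept as-is and marks an incompletely specified enzyme.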

Page 17: Readings for this week

KEGG– Kyoto Encyclopedia of Genes and Genomes

Collection of manually drawn metabolic/cellular pathway maps, based on the most up-to-date biochemical information

Metabolic maps are the strongest feature: they use EC-numbered enzymes as key players, so pathways of different genomes can be mapped easily from their predetermined EC content

Also has a growing collection of signalling/cellular process maps
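The idea of mapping a genome onto pathway maps through its EC content can be sketched as a set comparison. The pathway definition and genome annotation below are toy data typed in for this example, not retrieved from KEGG, and the function name is hypothetical.

```python
def pathway_coverage(pathway_ecs, genome_ecs):
    """For each pathway, report which required EC numbers the genome has or lacks."""
    report = {}
    for name, required in pathway_ecs.items():
        present = required & genome_ecs
        missing = required - genome_ecs
        fraction = len(present) / len(required) if required else 0.0
        report[name] = {
            "present": sorted(present),
            "missing": sorted(missing),
            "fraction": fraction,
        }
    return report


# Toy data for illustration only.
pathways = {"upper glycolysis (toy)": {"2.7.1.1", "5.3.1.9", "2.7.1.11"}}
genome = {"2.7.1.1", "5.3.1.9", "1.1.1.1"}

report = pathway_coverage(pathways, genome)
print(report["upper glycolysis (toy)"])
```

A missing EC number does not prove the step is absent from the organism: the gene may be unannotated or too divergent to have been assigned an EC number by homology.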

Putting it all together….
