Mathematical Challenges in Protein Motif Recognition Bonnie Berger MIT.
![Page 1: Mathematical Challenges in Protein Motif Recognition Bonnie Berger MIT.](https://reader035.fdocuments.us/reader035/viewer/2022062421/56649c765503460f9492a3e6/html5/thumbnails/1.jpg)
![Page 2: Mathematical Challenges in Protein Motif Recognition Bonnie Berger MIT.](https://reader035.fdocuments.us/reader035/viewer/2022062421/56649c765503460f9492a3e6/html5/thumbnails/2.jpg)
Mathematical Challenges in Protein Motif Recognition
Bonnie Berger
MIT
![Page 3: Mathematical Challenges in Protein Motif Recognition Bonnie Berger MIT.](https://reader035.fdocuments.us/reader035/viewer/2022062421/56649c765503460f9492a3e6/html5/thumbnails/3.jpg)
Approaches to Structural Motif Recognition
Alignments
Multiple alignments & HMMs
Threading
Profile methods (1D, 3D)
* Statistical methods
![Page 4: Mathematical Challenges in Protein Motif Recognition Bonnie Berger MIT.](https://reader035.fdocuments.us/reader035/viewer/2022062421/56649c765503460f9492a3e6/html5/thumbnails/4.jpg)
Structural Motif Recognition
1) Collect a database of positive examples of a motif (e.g., coiled coil, beta helix).
2) Devise a method to determine if an unknown sequence folds as the motif or not.
3) Verification in lab.
![Page 5: Mathematical Challenges in Protein Motif Recognition Bonnie Berger MIT.](https://reader035.fdocuments.us/reader035/viewer/2022062421/56649c765503460f9492a3e6/html5/thumbnails/5.jpg)
Our Coiled-Coil Programs
PairCoil [Berger, Wilson, Wolf, Tonchev, Milla, Kim, 1995]
• predicts 2-stranded CCs
• http://theory.lcs.mit.edu/paircoil
MultiCoil [Wolf, Kim, Berger, 1997]
• predicts 3-stranded CCs
• http://theory.lcs.mit.edu/multicoil
LearnCoil-Histidine Kinase [Singh, Berger, Kim, Berger, Cochran, 1998]
• predicts CCs in histidine kinase linker domains
• http://theory.lcs.mit.edu/learncoil
LearnCoil-VMF [Singh, Berger, Kim, 1999]
• predicts CCs in viral membrane fusion proteins
• http://theory.lcs.mit.edu/learncoil-vmf
![Page 6: Mathematical Challenges in Protein Motif Recognition Bonnie Berger MIT.](https://reader035.fdocuments.us/reader035/viewer/2022062421/56649c765503460f9492a3e6/html5/thumbnails/6.jpg)
Long Distance Correlations
In beta structures, amino acids that are close in the folded 3D structure may be far apart in the linear sequence.
![Page 7: Mathematical Challenges in Protein Motif Recognition Bonnie Berger MIT.](https://reader035.fdocuments.us/reader035/viewer/2022062421/56649c765503460f9492a3e6/html5/thumbnails/7.jpg)
Biological Importance of Beta Helices
Surface proteins in human infectious disease:
• virulence factors (plants, too)
• adhesins
• toxins
• allergens
Amyloid fibrils (e.g., Alzheimer's, Creutzfeldt-Jakob (Mad Cow) disease)
Potential new materials
![Page 8: Mathematical Challenges in Protein Motif Recognition Bonnie Berger MIT.](https://reader035.fdocuments.us/reader035/viewer/2022062421/56649c765503460f9492a3e6/html5/thumbnails/8.jpg)
What is Known
Solved beta-helix structures:
12 structures in PDB in 7 different SCOP families
Related work:
• 1D profile of pectate lyase (Heffron et al. '98)
• HMM (e.g., HMMER)
• Threading (e.g., 3D-PSSM)
![Page 9: Mathematical Challenges in Protein Motif Recognition Bonnie Berger MIT.](https://reader035.fdocuments.us/reader035/viewer/2022062421/56649c765503460f9492a3e6/html5/thumbnails/9.jpg)
Key Databases
Solved structures:
Protein Data Bank (PDB) (100's of non-redundant structures) [www.rcsb.org/pdb/]
Sequence databases:
GenBank (100's of thousands of protein sequences) [www.ncbi.nlm.nih.gov/Genbank/GenbankSearch.html]
SWISS-PROT (10's of thousands of protein sequences) [www.ebi.ac.uk/swissprot]
![Page 10: Mathematical Challenges in Protein Motif Recognition Bonnie Berger MIT.](https://reader035.fdocuments.us/reader035/viewer/2022062421/56649c765503460f9492a3e6/html5/thumbnails/10.jpg)
BetaWrap Program
[Bradley, Cowen, Menke, King, Berger: RECOMB 2001]
Performance:
• On PDB: no false positives & no false negatives.
• Recognizes beta helices in PDB across SCOP families in cross-validation.
• Recognizes many new potential beta helices.
• Runs in linear time (~5 min. on SWISS-PROT).
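One way to read the cross-validation across SCOP families mentioned above is as a leave-one-family-out split: every protein from one family is held out, and only the remaining families are used to build the scoring tables. The sketch below illustrates only that split. The function name, the `(name, family)` input format, and the placeholder protein names are assumptions for illustration, not details from the talk.

```python
def leave_one_family_out(proteins):
    """Yield (held_out_family, train_names, test_names) splits.

    proteins: list of (protein_name, family_label) tuples (hypothetical format).
    Each family is held out exactly once, so a protein is never scored by
    tables that were built from members of its own family.
    """
    families = sorted({family for _, family in proteins})
    for held_out in families:
        train = [name for name, family in proteins if family != held_out]
        test = [name for name, family in proteins if family == held_out]
        yield held_out, train, test


# Placeholder names and family labels, not real database entries.
example = [("protA", "fam1"), ("protB", "fam1"), ("protC", "fam2"), ("protD", "fam3")]
splits = list(leave_one_family_out(example))
```

Splitting by family rather than by individual protein is the stricter test, since close relatives of a held-out protein cannot leak into the training set.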
![Page 11: Mathematical Challenges in Protein Motif Recognition Bonnie Berger MIT.](https://reader035.fdocuments.us/reader035/viewer/2022062421/56649c765503460f9492a3e6/html5/thumbnails/11.jpg)
BetaWrap Program
Histogram of protein scores for:
• beta helices not in database (12 proteins)
• non-beta helices in PDB (1346 proteins)
![Page 12: Mathematical Challenges in Protein Motif Recognition Bonnie Berger MIT.](https://reader035.fdocuments.us/reader035/viewer/2022062421/56649c765503460f9492a3e6/html5/thumbnails/12.jpg)
Single Rung of a Beta Helix
![Page 13: Mathematical Challenges in Protein Motif Recognition Bonnie Berger MIT.](https://reader035.fdocuments.us/reader035/viewer/2022062421/56649c765503460f9492a3e6/html5/thumbnails/13.jpg)
![Page 14: Mathematical Challenges in Protein Motif Recognition Bonnie Berger MIT.](https://reader035.fdocuments.us/reader035/viewer/2022062421/56649c765503460f9492a3e6/html5/thumbnails/14.jpg)
3D Pairwise Correlations
Stacking residues in adjacent beta-strands exhibit strong correlations
Residues in the T2 turn have special correlations (Asparagine ladder, aliphatic stacking)
[Figure labels: B1, B2, B3, T2]
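As an illustration of how stacking correlations of this kind might be quantified, the sketch below turns counts of vertically stacked residue pairs into log-odds scores against an independence baseline. This is a generic log-odds estimate with pseudocounts, not necessarily the statistic used in the talk. The input format, the function name, and the pseudocount value are assumptions.

```python
import math
from collections import Counter


def pair_log_odds(stacked_pairs, pseudocount=1.0):
    """Log-odds score for residue pairs observed in stacked positions.

    stacked_pairs: list of (upper, lower) one-letter residue codes taken from
    vertically adjacent positions in neighbouring strands (hypothetical input).
    A positive score means the pair stacks more often than expected if the
    two residues were chosen independently.
    """
    pair_counts = Counter(stacked_pairs)
    single_counts = Counter()
    for upper, lower in stacked_pairs:
        single_counts[upper] += 1
        single_counts[lower] += 1

    residues = sorted(single_counts)
    n_pairs = len(stacked_pairs)
    n_single = 2 * n_pairs
    n_cells = len(residues) ** 2  # number of possible ordered pairs

    scores = {}
    for a in residues:
        for b in residues:
            # Smoothed joint probability of seeing (a, b) stacked.
            p_pair = (pair_counts[(a, b)] + pseudocount) / (n_pairs + pseudocount * n_cells)
            # Probability expected if the two positions were independent.
            p_indep = (single_counts[a] / n_single) * (single_counts[b] / n_single)
            scores[(a, b)] = math.log(p_pair / p_indep)
    return scores


# Toy counts only: an asparagine ladder (N over N) and aliphatic stacking (V over I).
toy_pairs = [("N", "N")] * 8 + [("V", "I")] * 6 + [("N", "V"), ("I", "N")]
toy_scores = pair_log_odds(toy_pairs)
```

With these toy counts, the N over N pair scores above zero while the never-observed V over V pair scores below zero.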
![Page 16: Mathematical Challenges in Protein Motif Recognition Bonnie Berger MIT.](https://reader035.fdocuments.us/reader035/viewer/2022062421/56649c765503460f9492a3e6/html5/thumbnails/16.jpg)
![Page 17: Mathematical Challenges in Protein Motif Recognition Bonnie Berger MIT.](https://reader035.fdocuments.us/reader035/viewer/2022062421/56649c765503460f9492a3e6/html5/thumbnails/17.jpg)
Question: how can we find these correlations, which lie a variable distance apart in sequence?
[Tailspike, 63 residue turn]
![Page 18: Mathematical Challenges in Protein Motif Recognition Bonnie Berger MIT.](https://reader035.fdocuments.us/reader035/viewer/2022062421/56649c765503460f9492a3e6/html5/thumbnails/18.jpg)
Finding Candidate Wraps
• Assume we have the correct locations of a single T2 turn (fixed B2 & B3).
• Generate the 5 best-scoring candidates for the next rung.
[Figure labels: B2, B3, T2, Candidate Rung]
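The candidate-generation step above can be sketched as a scan over possible start positions for the next rung, keeping the five highest-scoring ones. The scoring function here is a deliberately crude stand-in that counts aligned hydrophobic pairs. The strand width, the offset range, and all function names are assumptions for illustration.

```python
import heapq

HYDROPHOBIC = set("AVILMFWC")


def toy_stack_score(seq, i, j, width):
    """Toy stand-in score: aligned positions where both residues are hydrophobic."""
    return sum(
        1
        for k in range(width)
        if seq[i + k] in HYDROPHOBIC and seq[j + k] in HYDROPHOBIC
    )


def next_rung_candidates(seq, fixed, width=5, min_offset=10, max_offset=40,
                         keep=5, score=toy_stack_score):
    """Return the `keep` best (score, start) candidates for the next rung.

    fixed: start index of the strand in the rung already placed.
    The offset range is a hypothetical bound on how far along the sequence
    the next rung may begin; it is not a value taken from the talk.
    """
    scored = []
    for offset in range(min_offset, max_offset + 1):
        start = fixed + offset
        if start + width > len(seq):
            break  # candidate strand would run off the end of the sequence
        scored.append((score(seq, fixed, start, width), start))
    return heapq.nlargest(keep, scored)


# Toy sequence: two hydrophobic strands separated by a glycine spacer.
toy_seq = "VIVIV" + "G" * 15 + "VIVIV" + "G" * 20
top_five = next_rung_candidates(toy_seq, fixed=0)
```

Keeping several candidates instead of only the best one lets a later rung rescue a wrap whose locally best choice turns out to be wrong.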
![Page 19: Mathematical Challenges in Protein Motif Recognition Bonnie Berger MIT.](https://reader035.fdocuments.us/reader035/viewer/2022062421/56649c765503460f9492a3e6/html5/thumbnails/19.jpg)
Scoring Candidate Wraps (rung-to-rung)
Similar to the probabilistic framework, plus:
• Pairwise probabilities taken from amphipathic beta (not beta helix) structures in PDB.
• Additional stacking bonuses on internal pairs.
• Incorporates a distribution on turn lengths.
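A minimal sketch of how the three ingredients listed above could be combined into one rung-to-rung score: a pairwise term for each aligned residue pair, a bonus for stacking residues at inward-pointing positions, and the log-probability of the turn length. The additive form, the choice of inward positions, the bonus value, and the set of stacking residues are all assumptions; the talk does not give the actual formula.

```python
import math


def rung_to_rung_score(upper, lower, turn_len, pair_score, turn_len_prob,
                       inward=(0, 2, 4), stack_bonus=1.0,
                       stackers=frozenset("VILFY")):
    """Score the alignment of two strands in consecutive rungs (illustrative).

    upper, lower: aligned strand segments of equal length.
    pair_score: dict mapping (residue, residue) to a log-odds score.
    turn_len_prob: dict mapping turn length to its probability.
    inward, stack_bonus, stackers: hypothetical parameters for the
    stacking bonus on internal (inward-pointing) pairs.
    """
    if len(upper) != len(lower):
        raise ValueError("strands must have equal length")

    total = 0.0
    for k, (a, b) in enumerate(zip(upper, lower)):
        total += pair_score.get((a, b), 0.0)
        # Extra reward when both inward-pointing residues are good stackers.
        if k in inward and a in stackers and b in stackers:
            total += stack_bonus

    p_turn = turn_len_prob.get(turn_len, 0.0)
    if p_turn == 0.0:
        return float("-inf")  # turn length never observed: reject outright
    return total + math.log(p_turn)


# Toy tables only; none of these numbers come from real data.
toy_pair_score = {("V", "I"): 0.5}
toy_turn_prob = {2: 0.5, 5: 0.25}
good = rung_to_rung_score("VGV", "IGI", 2, toy_pair_score, toy_turn_prob, inward=(0, 2))
bad = rung_to_rung_score("VGV", "IGI", 9, toy_pair_score, toy_turn_prob, inward=(0, 2))
```

Working in log space keeps the turn-length term on the same additive scale as the pairwise log-odds.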
![Page 20: Mathematical Challenges in Protein Motif Recognition Bonnie Berger MIT.](https://reader035.fdocuments.us/reader035/viewer/2022062421/56649c765503460f9492a3e6/html5/thumbnails/20.jpg)
Scoring Candidate Wraps (5 rungs)
• Iterate out to 5 rungs generating candidate wraps:
• Score each wrap:
- sum the rung-to-rung scores
- B1 correlations filter
- screen for alpha-helical content
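The iteration described above can be sketched as repeatedly extending each partial wrap by its candidate next rungs, summing the rung-to-rung scores, and applying the filters at the end. The candidate generator and the filter are passed in as functions. Their names, and the choice to count the starting rung as one of the five, are assumptions for illustration.

```python
def best_wrap(start, next_candidates, passes_filters, n_rungs=5):
    """Enumerate wraps of n_rungs rungs from `start` and return the best one.

    next_candidates(pos) -> list of (rung_to_rung_score, next_pos) pairs.
    passes_filters(positions) -> bool, standing in for the B1-correlation
    filter and the alpha-helical screen (both hypothetical callables here).
    Returns (total_score, positions), or None if every wrap is filtered out.
    """
    wraps = [(0.0, [start])]
    for _ in range(n_rungs - 1):
        extended = []
        for total, positions in wraps:
            for score, nxt in next_candidates(positions[-1]):
                extended.append((total + score, positions + [nxt]))
        wraps = extended

    kept = [wrap for wrap in wraps if passes_filters(wrap[1])]
    return max(kept, default=None)


# Toy candidate generator: two options per rung with fixed scores.
def toy_candidates(pos):
    return [(1.0, pos + 20), (0.5, pos + 25)]


unfiltered = best_wrap(0, toy_candidates, lambda positions: True)
filtered = best_wrap(0, toy_candidates, lambda positions: 40 not in positions)
nothing = best_wrap(0, toy_candidates, lambda positions: False)
```

With 5 candidates per rung and 4 extensions, at most 5^4 = 625 wraps are enumerated per starting position, so exhaustive enumeration stays cheap.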
![Page 21: Mathematical Challenges in Protein Motif Recognition Bonnie Berger MIT.](https://reader035.fdocuments.us/reader035/viewer/2022062421/56649c765503460f9492a3e6/html5/thumbnails/21.jpg)
Potential Beta Helices
Toxins:
• Vacuolating cytotoxin from the human gastric pathogen H. pylori
• Toxin B from the enterohemorrhagic E. coli strain O157:H7
Allergens:
• Antigen AMB A II, major allergen from A. artemisiifolia (ragweed)
• Major pollen allergen CRY J II, from C. japonica (Japanese cedar)
Adhesins:
• AIDA-I, involved in diffuse adherence of diarrheagenic E. coli
Other cell surface proteins:
• Outer membrane protein B from Rickettsia japonica
• Putative outer membrane protein F from Chlamydia trachomatis
• Toxin-like outer membrane protein from Helicobacter pylori
![Page 22: Mathematical Challenges in Protein Motif Recognition Bonnie Berger MIT.](https://reader035.fdocuments.us/reader035/viewer/2022062421/56649c765503460f9492a3e6/html5/thumbnails/22.jpg)
The Problem
Given an amino acid residue subsequence, does it fold as a coiled coil? A beta helix?
Very difficult:
• peptide synthesis (1-2 months)
• X-ray crystallography, NMR (>1 year)
• molecular dynamics
Our goal: predict folded structure based on a template of positive examples.
![Page 23: Mathematical Challenges in Protein Motif Recognition Bonnie Berger MIT.](https://reader035.fdocuments.us/reader035/viewer/2022062421/56649c765503460f9492a3e6/html5/thumbnails/23.jpg)
Collaborators
Math / CS
Mona Singh
Ethan Wolf
Phil Bradley
Lenore Cowen
Matt Menke
David Wilson
Theo Tonchev
Biologists
Peter S. Kim
Jonathan King
Andrea Cochran
James Berger
Mari Milla