Bioinformatics – NSF Summer School 2003 Z. Luthey-Schulten, UIUC.
![Page 1: Bioinformatics – NSF Summer School 2003 Z. Luthey-Schulten, UIUC.](https://reader035.fdocuments.us/reader035/viewer/2022070411/56649f3e5503460f94c5f407/html5/thumbnails/1.jpg)
Bioinformatics – NSF Summer School 2003
Z. Luthey-Schulten, UIUC
![Page 2: Bioinformatics – NSF Summer School 2003 Z. Luthey-Schulten, UIUC.](https://reader035.fdocuments.us/reader035/viewer/2022070411/56649f3e5503460f94c5f407/html5/thumbnails/2.jpg)
Sequence-Sequence Alignment
• Smith-Waterman
• Needleman-Wunsch
Sequence-Structure Alignment
• Threading
• Hidden Markov
![Page 3: Bioinformatics – NSF Summer School 2003 Z. Luthey-Schulten, UIUC.](https://reader035.fdocuments.us/reader035/viewer/2022070411/56649f3e5503460f94c5f407/html5/thumbnails/3.jpg)
Sequence Alignment & Dynamic Programming
Seq. 1: a1 a2 a3 - - a4 a5…an
Seq. 2: c1 - c2 c3 c4 c5 - …cm
number of possible alignments:
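The formula shown on the slide did not survive extraction. As a rough illustration of why exhaustive enumeration is impractical, the sketch below counts alignments under one simple convention: every alignment column is a residue pair, a gap in Seq. 1, or a gap in Seq. 2. That convention is an assumption and the slide may have counted differently; the function name `count_alignments` is hypothetical.

```python
def count_alignments(n: int, m: int) -> int:
    """Count alignments of two sequences of lengths n and m.

    Assumed convention: each column is a residue pair, a gap in
    sequence 1, or a gap in sequence 2. The count then obeys
    D(i, j) = D(i-1, j) + D(i, j-1) + D(i-1, j-1), with D(i, 0) = D(0, j) = 1.
    """
    table = [[1] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            table[i][j] = (
                table[i - 1][j]        # column with a gap in sequence 2
                + table[i][j - 1]      # column with a gap in sequence 1
                + table[i - 1][j - 1]  # column pairing two residues
            )
    return table[n][m]


if __name__ == "__main__":
    for length in (1, 2, 3, 10, 100):
        print(length, count_alignments(length, length))
```

The count grows exponentially with sequence length, which is the motivation for the dynamic programming algorithms on the following slides.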
Smith-Waterman alignment algorithm
Similarity matrix: S
Score matrix: H
Traceback
AWGHEAW--HE
![Page 4: Bioinformatics – NSF Summer School 2003 Z. Luthey-Schulten, UIUC.](https://reader035.fdocuments.us/reader035/viewer/2022070411/56649f3e5503460f94c5f407/html5/thumbnails/4.jpg)
AWGHEAW--HE
Smith-Waterman Local Alignment Score Matrix
![Page 5: Bioinformatics – NSF Summer School 2003 Z. Luthey-Schulten, UIUC.](https://reader035.fdocuments.us/reader035/viewer/2022070411/56649f3e5503460f94c5f407/html5/thumbnails/5.jpg)
    A   R   N   D   C   Q   E   G   H   I   L   K   M   F   P   S   T   W   Y   V   B   Z   X
A   5  -2  -1  -1  -2   0  -1   1  -2  -1  -2  -1  -1  -3  -2   1   0  -3  -2   0  -1  -1   0
R  -2   9   0  -1  -3   2  -1  -3   0  -3  -2   3  -1  -2  -3  -1  -2  -2  -1  -2  -1   0  -1
N  -1   0   8   2  -2   1  -1   0   1  -2  -3   0  -2  -3  -2   1   0  -4  -2  -3   4   0  -1
D  -1  -1   2   9  -2  -1   2  -2   0  -4  -3   0  -3  -4  -2   0  -1  -5  -3  -3   6   1  -1
C  -2  -3  -2  -2  16  -4  -2  -3  -4  -4  -2  -3  -3  -2  -5  -1  -1  -6  -4  -2  -2  -3  -2
Q   0   2   1  -1  -4   8   2  -2   0  -3  -2   1  -1  -4  -2   1  -1  -1  -1  -3   0   4  -1
E  -1  -1  -1   2  -2   2   7  -3   0  -4  -2   1  -2  -3   0   0  -1  -2  -2  -3   1   5  -1
G   1  -3   0  -2  -3  -2  -3   8  -2  -4  -4  -2  -2  -3  -1   0  -2  -2  -3  -4  -1  -2  -1
H  -2   0   1   0  -4   0   0  -2  13  -3  -2  -1   1  -2  -2  -1  -2  -5   2  -4   0   0  -1
I  -1  -3  -2  -4  -4  -3  -4  -4  -3   6   2  -3   1   1  -2  -2  -1  -3   0   4  -3  -4  -1
L  -2  -2  -3  -3  -2  -2  -2  -4  -2   2   6  -2   3   2  -4  -3  -1  -1   0   2  -3  -2  -1
K  -1   3   0   0  -3   1   1  -2  -1  -3  -2   6  -1  -3  -1   0   0  -2  -1  -2   0   1  -1
M  -1  -1  -2  -3  -3  -1  -2  -2   1   1   3  -1   7   0  -2  -2  -1  -2   1   1  -3  -2   0
F  -3  -2  -3  -4  -2  -4  -3  -3  -2   1   2  -3   0   9  -4  -2  -1   1   4   0  -3  -4  -1
P  -2  -3  -2  -2  -5  -2   0  -1  -2  -2  -4  -1  -2  -4  11  -1   0  -4  -3  -3  -2  -1  -2
S   1  -1   1   0  -1   1   0   0  -1  -2  -3   0  -2  -2  -1   5   2  -5  -2  -1   0   0   0
T   0  -2   0  -1  -1  -1  -1  -2  -2  -1  -1   0  -1  -1   0   2   6  -4  -1   1   0  -1   0
W  -3  -2  -4  -5  -6  -1  -2  -2  -5  -3  -1  -2  -2   1  -4  -5  -4  19   3  -3  -4  -2  -2
Y  -2  -1  -2  -3  -4  -1  -2  -3   2   0   0  -1   1   4  -3  -2  -1   3   9  -1  -3  -2  -1
V   0  -2  -3  -3  -2  -3  -3  -4  -4   4   2  -2   1   0  -3  -1   1  -3  -1   5  -3  -3  -1
B  -1  -1   4   6  -2   0   1  -1   0  -3  -3   0  -3  -3  -2   0   0  -4  -3  -3   5   2  -1
Z  -1   0   0   1  -3   4   5  -2   0  -4  -2   1  -2  -4  -1   0  -1  -2  -2  -3   2   5  -1
X   0  -1  -1  -1  -2  -1  -1  -1  -1  -1  -1  -1   0  -1  -2   0   0  -2  -1  -1  -1  -1  -1
BLOSUM40 Substitution Matrix
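To make the roles of the similarity matrix S, the score matrix H, and the traceback concrete, here is a minimal Smith-Waterman sketch. It is an illustration under stated assumptions, not the code used in the course: the scoring function `simple_score` (+2 match, -1 mismatch) and the linear gap penalty of -2 are hypothetical stand-ins, and a lookup into a substitution matrix such as BLOSUM40 could be passed as `score` instead.

```python
def simple_score(x: str, y: str) -> int:
    """Hypothetical similarity S(x, y): +2 for a match, -1 for a mismatch."""
    return 2 if x == y else -1


def smith_waterman(a: str, b: str, score=simple_score, gap: int = -2):
    """Local alignment with a linear gap penalty (assumed value)."""
    rows, cols = len(a) + 1, len(b) + 1
    H = [[0] * cols for _ in range(rows)]  # first row and column stay 0
    best, best_pos = 0, (0, 0)
    for i in range(1, rows):
        for j in range(1, cols):
            H[i][j] = max(
                0,                                           # start a new local alignment
                H[i - 1][j - 1] + score(a[i - 1], b[j - 1]),  # pair a[i-1] with b[j-1]
                H[i - 1][j] + gap,                           # gap in b
                H[i][j - 1] + gap,                           # gap in a
            )
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)

    # Traceback from the highest-scoring cell until a zero is reached.
    i, j = best_pos
    top, bottom = [], []
    while i > 0 and j > 0 and H[i][j] > 0:
        if H[i][j] == H[i - 1][j - 1] + score(a[i - 1], b[j - 1]):
            top.append(a[i - 1])
            bottom.append(b[j - 1])
            i, j = i - 1, j - 1
        elif H[i][j] == H[i - 1][j] + gap:
            top.append(a[i - 1])
            bottom.append("-")
            i -= 1
        else:
            top.append("-")
            bottom.append(b[j - 1])
            j -= 1
    return best, "".join(reversed(top)), "".join(reversed(bottom))


if __name__ == "__main__":
    print(smith_waterman("TTTACGTTT", "GGACGGG"))
```

The `max(0, ...)` term is what makes the alignment local: a cell never carries a negative score forward, so the traceback stops where the aligned segment begins.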
![Page 6: Bioinformatics – NSF Summer School 2003 Z. Luthey-Schulten, UIUC.](https://reader035.fdocuments.us/reader035/viewer/2022070411/56649f3e5503460f94c5f407/html5/thumbnails/6.jpg)
Protein Structural Relationships
Can protein structural relationships help us to understand evolutionary dynamics?
Is there a connection between evolutionary events and changes in protein structure?
What is the effect of gene duplication, horizontal gene transfer, and other evolutionary mechanisms on protein shape?
Substitution Indel Domain Insertion
O’Donoghue and Luthey-Schulten, UIUC 2003
![Page 7: Bioinformatics – NSF Summer School 2003 Z. Luthey-Schulten, UIUC.](https://reader035.fdocuments.us/reader035/viewer/2022070411/56649f3e5503460f94c5f407/html5/thumbnails/7.jpg)
Sequence Alignment & Dynamic Programming
Seq. 1: a1 a2 a3 - - a4 a5…an
Seq. 2: c1 - c2 c3 c4 c5 - …cm
number of possible alignments:
Needleman-Wunsch alignment algorithm
Similarity matrix: S
Score matrix: H
Traceback
??? Tutorial: W=d
![Page 8: Bioinformatics – NSF Summer School 2003 Z. Luthey-Schulten, UIUC.](https://reader035.fdocuments.us/reader035/viewer/2022070411/56649f3e5503460f94c5f407/html5/thumbnails/8.jpg)
Needleman-Wunsch Global Alignment
Similarity Values Initialization of Gap Penalties
http://www.dkfz-heidelberg.de/tbi/bioinfo/PracticalSection/AliApplet/index.html
![Page 9: Bioinformatics – NSF Summer School 2003 Z. Luthey-Schulten, UIUC.](https://reader035.fdocuments.us/reader035/viewer/2022070411/56649f3e5503460f94c5f407/html5/thumbnails/9.jpg)
Filling out the Score Matrix H
http://www.dkfz-heidelberg.de/tbi/bioinfo/PracticalSection/AliApplet/index.html
![Page 10: Bioinformatics – NSF Summer School 2003 Z. Luthey-Schulten, UIUC.](https://reader035.fdocuments.us/reader035/viewer/2022070411/56649f3e5503460f94c5f407/html5/thumbnails/10.jpg)
Traceback and Alignment
The Alignment
Traceback (blue) from optimal score
http://www.dkfz-heidelberg.de/tbi/bioinfo/PracticalSection/AliApplet/index.html
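The three steps on these slides (initializing the gap penalties, filling out the score matrix H, and tracing back from the optimal score) can be sketched as follows. This is a minimal illustration, assuming a linear gap penalty of -2 and a hypothetical `simple_score` function (+1 match, -1 mismatch); the applet and the tutorial may use different values.

```python
def simple_score(x: str, y: str) -> int:
    """Hypothetical similarity value: +1 for a match, -1 for a mismatch."""
    return 1 if x == y else -1


def needleman_wunsch(a: str, b: str, score=simple_score, gap: int = -2):
    """Global alignment with a linear gap penalty (assumed value)."""
    rows, cols = len(a) + 1, len(b) + 1
    H = [[0] * cols for _ in range(rows)]

    # Initialization: leading gaps accumulate the gap penalty.
    for i in range(1, rows):
        H[i][0] = i * gap
    for j in range(1, cols):
        H[0][j] = j * gap

    # Fill: each cell takes the best of the three possible moves.
    for i in range(1, rows):
        for j in range(1, cols):
            H[i][j] = max(
                H[i - 1][j - 1] + score(a[i - 1], b[j - 1]),  # pair two residues
                H[i - 1][j] + gap,                           # gap in b
                H[i][j - 1] + gap,                           # gap in a
            )

    # Traceback: start at the bottom-right cell and walk back to the origin.
    i, j = len(a), len(b)
    top, bottom = [], []
    while i > 0 or j > 0:
        if i > 0 and j > 0 and H[i][j] == H[i - 1][j - 1] + score(a[i - 1], b[j - 1]):
            top.append(a[i - 1])
            bottom.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and H[i][j] == H[i - 1][j] + gap:
            top.append(a[i - 1])
            bottom.append("-")
            i -= 1
        else:
            top.append("-")
            bottom.append(b[j - 1])
            j -= 1
    return H[len(a)][len(b)], "".join(reversed(top)), "".join(reversed(bottom))


if __name__ == "__main__":
    print(needleman_wunsch("ACGT", "AGT"))
```

Unlike the local algorithm, the traceback here always runs from the last cell to the first, so both sequences appear in full in the result.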
![Page 11: Bioinformatics – NSF Summer School 2003 Z. Luthey-Schulten, UIUC.](https://reader035.fdocuments.us/reader035/viewer/2022070411/56649f3e5503460f94c5f407/html5/thumbnails/11.jpg)
Energy Landscape Theory of Structure Prediction
![Page 12: Bioinformatics – NSF Summer School 2003 Z. Luthey-Schulten, UIUC.](https://reader035.fdocuments.us/reader035/viewer/2022070411/56649f3e5503460f94c5f407/html5/thumbnails/12.jpg)
Protein Structure Prediction
SISSIRVKSKRIQLG….
1-D protein sequence 3-D protein structure
Sequence Alignment
Ab Initio protein folding
SISSRVKSKRIQLGLNQAELAQKV------GTTQ…
QFANEFKVRRIKLGYTQTNVGEALAAVHGS…
Target protein of unknown structure
Homologous/analogous protein of known structure
Sequence Alignment: the Energy Function
E = E_match + E_gap
E_gap =
E_match =
![Page 13: Bioinformatics – NSF Summer School 2003 Z. Luthey-Schulten, UIUC.](https://reader035.fdocuments.us/reader035/viewer/2022070411/56649f3e5503460f94c5f407/html5/thumbnails/13.jpg)
Target sequence: A1 A2 A3 A4 A5 …
“Scaffold” structure
Threading alignment between target and scaffold
Threading: Sequence-Structure Alignment
Threading Energy Function
R. Goldstein, Z. Luthey-Schulten, P. Wolynes (1992, PNAS)
![Page 14: Bioinformatics – NSF Summer School 2003 Z. Luthey-Schulten, UIUC.](https://reader035.fdocuments.us/reader035/viewer/2022070411/56649f3e5503460f94c5f407/html5/thumbnails/14.jpg)
Gap Penalties
Sequence-Structure Gap Energy
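$H = E_{\text{contact}} + E_{\text{profile}} + E_{\text{H-bonds}} + E_{\text{gap}}$
$E_{\text{gap}} = kT \log(P_g)$
Distribution of Gaps
[Figure: insertion, deletion, and bulge gaps between target and scaffold. Labels: $i$, $j$, $i'$, $j'$, $l = j - i$, $r_{i'j'}$.]
Insertion: $P_{\text{insertion}}(l) = a_1 \exp(-b_1 l)$
Deletion: $P_{\text{deletion}}(r) = a_2 \exp\left(-\frac{(r - b_2)^2}{2\sigma_2^2}\right)$, range ⇒ 3.0 Å < r < 7.5 Å
Bulge: $P_{\text{bulge}}(l, r)$, an exponential form in $l$ and $r$ with parameters $a_3$, $c_3$, $\sigma_3$; range ⇒ r > 4.0 Å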
R. Goldstein, Z. Luthey-Schulten, P. Wolynes (1994) Proc 27th Annu Hawaii Int Conf Sys Sci:306.
![Page 15: Bioinformatics – NSF Summer School 2003 Z. Luthey-Schulten, UIUC.](https://reader035.fdocuments.us/reader035/viewer/2022070411/56649f3e5503460f94c5f407/html5/thumbnails/15.jpg)
Similarity Measures
Sequence Identity: fraction of identically matched residues
Q, “Structural Identity”: fraction of native contacts
[Figure labels: $r_{ij}$ and its primed counterpart]
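Both measures can be illustrated with a short sketch. The exact definition of Q on the slide was not recovered, so the version below is an assumption: two residues count as "in contact" when they lie within a hypothetical 8 Å cutoff and are at least 3 positions apart in sequence, and Q is the fraction of the native contacts that are also present in the model. Sequence identity is computed over aligned, gap-free columns, which is likewise an assumed convention. All function names are hypothetical.

```python
import math


def sequence_identity(aligned_a: str, aligned_b: str) -> float:
    """Fraction of aligned (gap-free) columns holding identical residues."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(x, y) for x, y in zip(aligned_a, aligned_b) if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    return sum(x == y for x, y in pairs) / len(pairs)


def contacts(coords, cutoff=8.0, min_separation=3):
    """Residue pairs (i, j) closer than `cutoff` (assumed contact definition)."""
    found = set()
    for i in range(len(coords)):
        for j in range(i + min_separation, len(coords)):
            if math.dist(coords[i], coords[j]) < cutoff:
                found.add((i, j))
    return found


def fraction_native_contacts(native, model, cutoff=8.0, min_separation=3):
    """Q: share of native contacts that are also formed in the model."""
    native_set = contacts(native, cutoff, min_separation)
    if not native_set:
        return 0.0
    model_set = contacts(model, cutoff, min_separation)
    return len(native_set & model_set) / len(native_set)


if __name__ == "__main__":
    print(sequence_identity("AWGHE", "AW-HE"))
    native = [(0, 0, 0), (3, 0, 0), (6, 0, 0), (9, 0, 0)]
    model = [(0, 0, 0), (3, 0, 0), (6, 0, 0), (20, 0, 0)]
    print(fraction_native_contacts(native, model, min_separation=2))
```

Sequence identity compares residue labels, while Q compares geometry, so two proteins can score low on the first and high on the second.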
![Page 16: Bioinformatics – NSF Summer School 2003 Z. Luthey-Schulten, UIUC.](https://reader035.fdocuments.us/reader035/viewer/2022070411/56649f3e5503460f94c5f407/html5/thumbnails/16.jpg)
A summary of Energy Landscape Theory
Energy Landscape Theory
When $\langle \delta E_s / \Delta E \rangle$ is maximum the energy landscape is optimally funneled.
Onuchic, Luthey-Schulten, Wolynes (1997) Annu. Rev. Phys. Chem. 48:545-600.
Koretke, Luthey-Schulten, Wolynes (1996) Prot. Sci. 5:1043.
[Figure: energy diagram. Labels: Energy; molten globule distribution ($2\Delta E$); native states; $\delta E_s$; $E$, $E_{\text{match}}$, $E_{\text{gap}}$ (indices $m$, $g$).]
Optimization over an Ensemble of Folds
![Page 17: Bioinformatics – NSF Summer School 2003 Z. Luthey-Schulten, UIUC.](https://reader035.fdocuments.us/reader035/viewer/2022070411/56649f3e5503460f94c5f407/html5/thumbnails/17.jpg)
Homology Modeling - Threading
![Page 18: Bioinformatics – NSF Summer School 2003 Z. Luthey-Schulten, UIUC.](https://reader035.fdocuments.us/reader035/viewer/2022070411/56649f3e5503460f94c5f407/html5/thumbnails/18.jpg)
Results from CASP5 CM/FR
The prediction is never better than the scaffold.
Threading Energy function requires improvement.
![Page 19: Bioinformatics – NSF Summer School 2003 Z. Luthey-Schulten, UIUC.](https://reader035.fdocuments.us/reader035/viewer/2022070411/56649f3e5503460f94c5f407/html5/thumbnails/19.jpg)
You are now entering the twilight zone of sequence identity. We need profiles!
Watch for Bioinformants!!!
![Page 20: Bioinformatics – NSF Summer School 2003 Z. Luthey-Schulten, UIUC.](https://reader035.fdocuments.us/reader035/viewer/2022070411/56649f3e5503460f94c5f407/html5/thumbnails/20.jpg)
Profiles – Evolution Revisited
• “What molecular sequences taught us in the 1960’s was that the genealogical history of an organism is written to one extent or another into the sequences of each of its genes, an insight that became the central tenet of a new discipline, molecular evolution”
• Woese (PNAS, 2000) Pauling (1965)
![Page 21: Bioinformatics – NSF Summer School 2003 Z. Luthey-Schulten, UIUC.](https://reader035.fdocuments.us/reader035/viewer/2022070411/56649f3e5503460f94c5f407/html5/thumbnails/21.jpg)
Universal Tree
The Universal Phylogenetic Tree inferred from comparative analyses of rRNA sequences: Woese (PNAS, 1990)
![Page 22: Bioinformatics – NSF Summer School 2003 Z. Luthey-Schulten, UIUC.](https://reader035.fdocuments.us/reader035/viewer/2022070411/56649f3e5503460f94c5f407/html5/thumbnails/22.jpg)
Horizontal Gene Transfer
O’Donoghue and Luthey-Schulten, UIUC 2003
![Page 23: Bioinformatics – NSF Summer School 2003 Z. Luthey-Schulten, UIUC.](https://reader035.fdocuments.us/reader035/viewer/2022070411/56649f3e5503460f94c5f407/html5/thumbnails/23.jpg)
Multiple Sequence Alignments
• “The aminoacyl-tRNA synthetases, perhaps better than any other molecules in the cell, epitomize the current situation and help to understand (the effects) of HGT” Woese (PNAS, 2000; MMBR 2000)
![Page 24: Bioinformatics – NSF Summer School 2003 Z. Luthey-Schulten, UIUC.](https://reader035.fdocuments.us/reader035/viewer/2022070411/56649f3e5503460f94c5f407/html5/thumbnails/24.jpg)
Standard Dogma Molecular Biology
• DNA → RNA → Proteins
• Role of AARS?
• Charging of tRNA
![Page 25: Bioinformatics – NSF Summer School 2003 Z. Luthey-Schulten, UIUC.](https://reader035.fdocuments.us/reader035/viewer/2022070411/56649f3e5503460f94c5f407/html5/thumbnails/25.jpg)
NCBI 3D
![Page 26: Bioinformatics – NSF Summer School 2003 Z. Luthey-Schulten, UIUC.](https://reader035.fdocuments.us/reader035/viewer/2022070411/56649f3e5503460f94c5f407/html5/thumbnails/26.jpg)
LeuRS Canonical Tree
Woese, Olsen (UIUC), Ibba (Panum Inst.), Soll (Yale) Microbiol. Mol. Biol. Rev. March 2000.
![Page 27: Bioinformatics – NSF Summer School 2003 Z. Luthey-Schulten, UIUC.](https://reader035.fdocuments.us/reader035/viewer/2022070411/56649f3e5503460f94c5f407/html5/thumbnails/27.jpg)
D,N Sequence Phylogenetic Trees
Woese, Olsen (UIUC), Ibba (Panum Inst.), Soll (Yale) Microbiol. Mol. Biol. Rev. March 2000.
![Page 28: Bioinformatics – NSF Summer School 2003 Z. Luthey-Schulten, UIUC.](https://reader035.fdocuments.us/reader035/viewer/2022070411/56649f3e5503460f94c5f407/html5/thumbnails/28.jpg)
Fold Motifs of AARSs
O’Donoghue and Luthey-Schulten, UIUC 2003
![Page 29: Bioinformatics – NSF Summer School 2003 Z. Luthey-Schulten, UIUC.](https://reader035.fdocuments.us/reader035/viewer/2022070411/56649f3e5503460f94c5f407/html5/thumbnails/29.jpg)
Structure Conserved More than Sequence
Structural Overlap of Class II AARS
Conserved helices Conserved sheets
![Page 30: Bioinformatics – NSF Summer School 2003 Z. Luthey-Schulten, UIUC.](https://reader035.fdocuments.us/reader035/viewer/2022070411/56649f3e5503460f94c5f407/html5/thumbnails/30.jpg)
Subset of Class II Structural Tree
O’Donoghue and Luthey-Schulten, UIUC 2003
![Page 31: Bioinformatics – NSF Summer School 2003 Z. Luthey-Schulten, UIUC.](https://reader035.fdocuments.us/reader035/viewer/2022070411/56649f3e5503460f94c5f407/html5/thumbnails/31.jpg)
![Page 32: Bioinformatics – NSF Summer School 2003 Z. Luthey-Schulten, UIUC.](https://reader035.fdocuments.us/reader035/viewer/2022070411/56649f3e5503460f94c5f407/html5/thumbnails/32.jpg)
Novel Evolutionary Connections from Sequence and Structure
Woese, Olsen (UIUC), Ibba (Panum Inst.), Soll (Yale) Microbiol. Mol. Biol. Rev. March 2000.
Canonical Pattern
D E F L W Y
Canonical Pattern +
I H P M
Basal Canonical
V T A R
Gemini
K1 K2 C S G N Q
No canonical pattern
Horizontal transfer after B-AE split.
O’Donoghue, Luthey-Schulten, UIUC 2003
![Page 33: Bioinformatics – NSF Summer School 2003 Z. Luthey-Schulten, UIUC.](https://reader035.fdocuments.us/reader035/viewer/2022070411/56649f3e5503460f94c5f407/html5/thumbnails/33.jpg)
Gap Distribution Functions
[Figure: two panels. Length Gap Distribution Function: $P(l)$ and $\log(P(l))$ against $l$, gap length (residues). Spatial Gap Distribution Function: $P(r)$ against $r_{ij}$, spatial gap distance (Å).]
B. Qian & R. Goldstein. (2001) Proteins 45:102.
![Page 34: Bioinformatics – NSF Summer School 2003 Z. Luthey-Schulten, UIUC.](https://reader035.fdocuments.us/reader035/viewer/2022070411/56649f3e5503460f94c5f407/html5/thumbnails/34.jpg)
Structural Alignment Methods
• PDB - Structural Neighbors – CE (Bourne)
• Stamp - Russell
![Page 35: Bioinformatics – NSF Summer School 2003 Z. Luthey-Schulten, UIUC.](https://reader035.fdocuments.us/reader035/viewer/2022070411/56649f3e5503460f94c5f407/html5/thumbnails/35.jpg)
Multiple Structural Alignments
STAMP
1. Initial Alignment
• Multiple sequence alignment
• Rigid-body “Scan”
2. Refine Initial Alignment & Produce Multiple Structural Alignment
• Dynamic programming (Smith-Waterman) through the P matrix gives the optimal set of equivalent residues.
• This set is used to re-superpose the two chains. Then iterate until the alignment score is unchanged.
• This procedure is performed for all pairs.
R. Russell, G. Barton (1992) Proteins 14: 309.
![Page 36: Bioinformatics – NSF Summer School 2003 Z. Luthey-Schulten, UIUC.](https://reader035.fdocuments.us/reader035/viewer/2022070411/56649f3e5503460f94c5f407/html5/thumbnails/36.jpg)
Multiple Structural Alignments
STAMP – cont’d
2. Refine Initial Alignment & Produce Multiple Structural Alignment
Alignment score:
Multiple Alignment:
• Create a dendrogram using the alignment score.
• Successively align groups of proteins (from branch tips to root).
• When 2 or more sequences are in a group, then average coordinates are used.
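The tips-to-root ordering described above can be sketched with a small agglomerative clustering routine. This is only an illustration of the idea, not STAMP's actual procedure: the average-linkage rule, the function name `merge_order`, and the example scores are all assumptions. Higher scores are taken to mean more similar structures.

```python
def merge_order(names, pair_score):
    """Return the order in which groups would be merged, tips to root.

    `pair_score` maps frozenset({a, b}) to a pairwise alignment score.
    Groups are joined by highest average pairwise score (assumed rule).
    """
    clusters = {name: (name,) for name in names}  # label -> member names
    merges = []
    while len(clusters) > 1:
        labels = list(clusters)
        best = None
        for x in range(len(labels)):
            for y in range(x + 1, len(labels)):
                a, b = labels[x], labels[y]
                scores = [
                    pair_score[frozenset((p, q))]
                    for p in clusters[a]
                    for q in clusters[b]
                ]
                average = sum(scores) / len(scores)
                if best is None or average > best[0]:
                    best = (average, a, b)
        _, a, b = best
        # The merged group gets a nested label that records the tree shape.
        clusters[f"({a},{b})"] = clusters.pop(a) + clusters.pop(b)
        merges.append((a, b))
    return merges


if __name__ == "__main__":
    # Hypothetical pairwise scores for three structures.
    scores = {
        frozenset(("A", "B")): 9.0,
        frozenset(("A", "C")): 2.0,
        frozenset(("B", "C")): 3.0,
    }
    print(merge_order(["A", "B", "C"], scores))
```

In a real multiple structural alignment each merge would trigger a superposition of the two groups, with averaged coordinates standing in for any group that holds more than one structure.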
![Page 37: Bioinformatics – NSF Summer School 2003 Z. Luthey-Schulten, UIUC.](https://reader035.fdocuments.us/reader035/viewer/2022070411/56649f3e5503460f94c5f407/html5/thumbnails/37.jpg)
Stamp Output/Secondary Structure
O’Donoghue and Luthey-Schulten, UIUC 2003
![Page 38: Bioinformatics – NSF Summer School 2003 Z. Luthey-Schulten, UIUC.](https://reader035.fdocuments.us/reader035/viewer/2022070411/56649f3e5503460f94c5f407/html5/thumbnails/38.jpg)
Stamp Output/Clustal Format
O’Donoghue and Luthey-Schulten, UIUC 2003
![Page 39: Bioinformatics – NSF Summer School 2003 Z. Luthey-Schulten, UIUC.](https://reader035.fdocuments.us/reader035/viewer/2022070411/56649f3e5503460f94c5f407/html5/thumbnails/39.jpg)
Examples of Useful Web Tools
• Genomes – Sequence and Gene Information
• Domain Architecture
• Multiple Sequence Alignments
• Phylogenetic Trees
• Structural Databases
• Hidden Markov Methods
![Page 40: Bioinformatics – NSF Summer School 2003 Z. Luthey-Schulten, UIUC.](https://reader035.fdocuments.us/reader035/viewer/2022070411/56649f3e5503460f94c5f407/html5/thumbnails/40.jpg)
NCBI: Genomes
![Page 41: Bioinformatics – NSF Summer School 2003 Z. Luthey-Schulten, UIUC.](https://reader035.fdocuments.us/reader035/viewer/2022070411/56649f3e5503460f94c5f407/html5/thumbnails/41.jpg)
Charging the tRNA
Woese, Olsen (UIUC), Ibba (Panum Inst.), Soll (Yale) Microbiol. Mol. Biol. Rev. March 2000.
![Page 42: Bioinformatics – NSF Summer School 2003 Z. Luthey-Schulten, UIUC.](https://reader035.fdocuments.us/reader035/viewer/2022070411/56649f3e5503460f94c5f407/html5/thumbnails/42.jpg)
NCBI 3D
![Page 43: Bioinformatics – NSF Summer School 2003 Z. Luthey-Schulten, UIUC.](https://reader035.fdocuments.us/reader035/viewer/2022070411/56649f3e5503460f94c5f407/html5/thumbnails/43.jpg)
Report from SWISS-PROT
![Page 44: Bioinformatics – NSF Summer School 2003 Z. Luthey-Schulten, UIUC.](https://reader035.fdocuments.us/reader035/viewer/2022070411/56649f3e5503460f94c5f407/html5/thumbnails/44.jpg)
PFAM Report
![Page 45: Bioinformatics – NSF Summer School 2003 Z. Luthey-Schulten, UIUC.](https://reader035.fdocuments.us/reader035/viewer/2022070411/56649f3e5503460f94c5f407/html5/thumbnails/45.jpg)
![Page 46: Bioinformatics – NSF Summer School 2003 Z. Luthey-Schulten, UIUC.](https://reader035.fdocuments.us/reader035/viewer/2022070411/56649f3e5503460f94c5f407/html5/thumbnails/46.jpg)
Sequence Dendrogram from Clustal
Luthey-Schulten, UIUC 2003
![Page 47: Bioinformatics – NSF Summer School 2003 Z. Luthey-Schulten, UIUC.](https://reader035.fdocuments.us/reader035/viewer/2022070411/56649f3e5503460f94c5f407/html5/thumbnails/47.jpg)
Phylogenetic Tree in Tutorial
Pogorelov and Luthey-Schulten, UIUC 2003
![Page 48: Bioinformatics – NSF Summer School 2003 Z. Luthey-Schulten, UIUC.](https://reader035.fdocuments.us/reader035/viewer/2022070411/56649f3e5503460f94c5f407/html5/thumbnails/48.jpg)
![Page 49: Bioinformatics – NSF Summer School 2003 Z. Luthey-Schulten, UIUC.](https://reader035.fdocuments.us/reader035/viewer/2022070411/56649f3e5503460f94c5f407/html5/thumbnails/49.jpg)
Alignment in MOE
![Page 50: Bioinformatics – NSF Summer School 2003 Z. Luthey-Schulten, UIUC.](https://reader035.fdocuments.us/reader035/viewer/2022070411/56649f3e5503460f94c5f407/html5/thumbnails/50.jpg)
Alignment in MOE
![Page 51: Bioinformatics – NSF Summer School 2003 Z. Luthey-Schulten, UIUC.](https://reader035.fdocuments.us/reader035/viewer/2022070411/56649f3e5503460f94c5f407/html5/thumbnails/51.jpg)
Transmembrane Proteins - HMM
Example: Bacteriorhodopsin – Anurag Sethi, UIUC
![Page 52: Bioinformatics – NSF Summer School 2003 Z. Luthey-Schulten, UIUC.](https://reader035.fdocuments.us/reader035/viewer/2022070411/56649f3e5503460f94c5f407/html5/thumbnails/52.jpg)
Stamp Profile
Sethi and Luthey-Schulten, UIUC 2003
![Page 53: Bioinformatics – NSF Summer School 2003 Z. Luthey-Schulten, UIUC.](https://reader035.fdocuments.us/reader035/viewer/2022070411/56649f3e5503460f94c5f407/html5/thumbnails/53.jpg)
![Page 54: Bioinformatics – NSF Summer School 2003 Z. Luthey-Schulten, UIUC.](https://reader035.fdocuments.us/reader035/viewer/2022070411/56649f3e5503460f94c5f407/html5/thumbnails/54.jpg)
HMMer Profile-Profile Alignment
Sethi and Luthey-Schulten, UIUC 2003
![Page 55: Bioinformatics – NSF Summer School 2003 Z. Luthey-Schulten, UIUC.](https://reader035.fdocuments.us/reader035/viewer/2022070411/56649f3e5503460f94c5f407/html5/thumbnails/55.jpg)
Clustal Profile-Profile Alignment
Sethi and Luthey-Schulten, 2003
![Page 56: Bioinformatics – NSF Summer School 2003 Z. Luthey-Schulten, UIUC.](https://reader035.fdocuments.us/reader035/viewer/2022070411/56649f3e5503460f94c5f407/html5/thumbnails/56.jpg)
Structure Prediction Modeller 6.2/Hmmer
Sethi and Luthey-Schulten, UIUC 2003 Modeller 6.2 A. Sali, et al.
![Page 57: Bioinformatics – NSF Summer School 2003 Z. Luthey-Schulten, UIUC.](https://reader035.fdocuments.us/reader035/viewer/2022070411/56649f3e5503460f94c5f407/html5/thumbnails/57.jpg)
Acknowledgements
• Felix Autenrieth
• Barry Isralewitz
• Patrick O’Donoghue
• Taras Pogorelov
• Anurag Sethi