CISC667, F05, Lec20, Liao 1
CISC 467/667 Intro to Bioinformatics (Fall 2005)
Protein Structure Prediction
Protein Secondary Structure
Protein structure
• Primary: amino acid sequence of the protein
• Secondary: characteristic structure units in 3-D.
• Tertiary: the 3-dimensional fold of a protein subunit
• Quaternary: the arrangement of subunits in oligomers
Experimental Methods
• X-ray crystallography
• NMR spectroscopy
• Neutron diffraction
• Electron microscopy
• Atomic force microscopy
• Computational methods for secondary structures
  – Artificial neural networks
  – SVMs
  – …
• Computational methods for 3-D structures
  – Comparative (find homologous proteins)
  – Threading
  – Ab initio (molecular dynamics)
• Helix: one complete turn every 3.6 amino acids
• Hydrogen bond between the (-C=O) of one amino acid and the (-N-H) of the amino acid four residues further along the chain
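The two helix facts above can be turned into small calculations. This is a minimal sketch: the function names are hypothetical, residues are indexed from 0, and only the 3.6 residues per turn and the i to i + 4 hydrogen-bond spacing come from the slide.

```python
RESIDUES_PER_TURN = 3.6   # from the slide
HBOND_OFFSET = 4          # C=O of residue i pairs with N-H of residue i + 4


def rotation_per_residue(residues_per_turn=RESIDUES_PER_TURN):
    """Degrees of rotation about the helix axis contributed by one residue."""
    return 360.0 / residues_per_turn


def helix_hbond_pairs(n_residues, offset=HBOND_OFFSET):
    """Return 0-based (i, i + offset) residue pairs for a helix of n_residues.

    Helices shorter than offset + 1 residues yield an empty list.
    """
    return [(i, i + offset) for i in range(n_residues - offset)]


if __name__ == "__main__":
    print(rotation_per_residue())
    print(helix_hbond_pairs(6))
```

A full turn is 360 degrees, so 3.6 residues per turn corresponds to about 100 degrees per residue.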
Sheet: hydrogen bond between the carbonyl oxygen atom on one chain and the NH group on the adjacent chain
Ramachandran Plot
Helix: PHI: -57; PSI: -47
Ramachandran Plot
Parallel: PHI: -119; PSI: 113
Anti-parallel: PHI: -139; PSI: 135
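As an illustration of how the reference angles on the Ramachandran plots might be used, the sketch below assigns a (PHI, PSI) pair to the nearest of the three reference conformations listed on the slides. The nearest-point rule and all function names are assumptions made for this example, not a method from the lecture; real plots show broad allowed regions rather than single points.

```python
import math

# Reference (PHI, PSI) angles in degrees, as listed on the slides.
CANONICAL = {
    "helix": (-57.0, -47.0),
    "parallel sheet": (-119.0, 113.0),
    "anti-parallel sheet": (-139.0, 135.0),
}


def angle_diff(a, b):
    """Smallest absolute difference between two angles in degrees (wraps at 360)."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)


def nearest_conformation(phi, psi):
    """Return the name of the reference conformation closest to (phi, psi)."""
    def dist(name):
        ref_phi, ref_psi = CANONICAL[name]
        return math.hypot(angle_diff(phi, ref_phi), angle_diff(psi, ref_psi))
    return min(CANONICAL, key=dist)


if __name__ == "__main__":
    print(nearest_conformation(-60.0, -45.0))
    print(nearest_conformation(-140.0, 130.0))
```

This toy rule always returns one of the three labels, so it cannot recognise coil or disallowed regions.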
Residue conformation preferences
Helix: A, E, K, L, M, R
Sheet: C, I, F, T, V, W, Y
Coil: D, G, N, P, S
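The preference table above can be read as a lookup from residue to its favoured conformation. The sketch below does exactly that and nothing more: it is a naive per-residue baseline written for illustration, not a prediction method from the lecture. The state letters H (helix), E (sheet), and C (coil) and the function name are assumptions. Residues not in the table (H and Q) are marked with "-".

```python
# Residue preferences copied from the slide; keys are assumed state labels.
PREFERENCES = {
    "H": "AEKLMR",    # helix
    "E": "CIFTVWY",   # sheet
    "C": "DGNPS",     # coil
}

# Invert the table: amino acid letter -> preferred state label.
STATE_OF = {aa: state for state, residues in PREFERENCES.items() for aa in residues}


def naive_states(sequence, unknown="-"):
    """Label each residue with its preferred state, ignoring all context."""
    return "".join(STATE_OF.get(aa, unknown) for aa in sequence.upper())


if __name__ == "__main__":
    print(naive_states("AKVGHQ"))
```

Because each residue is labelled independently, this ignores the neighbouring residues that the neural-network methods later in the lecture take as input.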
Artificial neural networks
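• Perceptron: $o(x_1, \ldots, x_n) = g\left(\sum_j W_j x_j\right)$
[Figure: a perceptron unit. Input links $x_1, \ldots, x_n$ with weights $W_1, \ldots, W_n$, plus a fixed input $x_0 = 1$ with weight $W_0$, feed the input function $\sum_j W_j x_j$; the activation function $g$ produces the output $o$.]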
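A minimal sketch of the perceptron output o = g(∑j Wj xj), with the fixed input x0 = 1 prepended so that W0 acts as the bias. The function names and the AND-gate weights are chosen for illustration only and are not taken from the lecture.

```python
def perceptron_output(weights, inputs, g):
    """Compute o = g(sum_j W_j * x_j).

    weights = [W0, W1, ..., Wn]; inputs = [x1, ..., xn].
    The fixed input x0 = 1 is prepended here.
    """
    xs = [1.0] + list(inputs)
    if len(xs) != len(weights):
        raise ValueError("expected one weight per input plus W0")
    return g(sum(w * x for w, x in zip(weights, xs)))


def step(x, t=0.0):
    """Step activation with threshold t."""
    return 1 if x >= t else 0


# Illustrative weights (an assumption): W0 = -1.5, W1 = W2 = 1 gives logical AND.
AND_WEIGHTS = [-1.5, 1.0, 1.0]

if __name__ == "__main__":
    for a in (0, 1):
        for b in (0, 1):
            print(a, b, perceptron_output(AND_WEIGHTS, [a, b], step))
```

With these weights the weighted sum is non-negative only when both inputs are 1.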
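• Activation functions
  – $\mathrm{Step}_t(x) = \begin{cases} 1 & \text{if } x \ge t \\ 0 & \text{otherwise} \end{cases}$
  – $\mathrm{Sign}(x) = \begin{cases} +1 & \text{if } x \ge 0 \\ -1 & \text{otherwise} \end{cases}$
  – $\mathrm{Sigmoid}(x) = \dfrac{1}{1 + e^{-x}}$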
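The three activation functions can be written directly from their definitions. This is a minimal sketch; the default threshold t = 0 for the step function is an assumption made for convenience.

```python
import math


def step(x, t=0.0):
    """Step_t(x): 1 if x >= t, otherwise 0."""
    return 1 if x >= t else 0


def sign(x):
    """Sign(x): +1 if x >= 0, otherwise -1."""
    return 1 if x >= 0 else -1


def sigmoid(x):
    """Sigmoid(x) = 1 / (1 + e^(-x)); smooth, with values strictly between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-x))


if __name__ == "__main__":
    for x in (-2.0, 0.0, 2.0):
        print(x, step(x), sign(x), sigmoid(x))
```

Only the sigmoid is differentiable everywhere, which is what gradient-based training needs.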
Artificial Neural Networks
2-unit output
• Learning: determine the weights and thresholds for all nodes (neurons) so that the net approximates the training data within a given error range.
  – Back-propagation algorithm
    • Feed forward from input to output
    • Calculate and back-propagate the error (the difference between the network output and the target output)
    • Adjust the weights (by gradient descent) to decrease the error.
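The three back-propagation steps can be sketched for a very small network. Everything specific here is an assumption made for illustration: a 2-input, 2-hidden-unit, 1-output network with sigmoid units, squared error E = 0.5 (output - target)^2, per-example updates, and the logical OR function as toy training data.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def forward(x, w_hidden, w_out):
    """Feed forward; returns inputs and hidden activations (each with a leading 1) and the output."""
    xb = [1.0] + list(x)                       # x0 = 1 carries the bias weight
    hidden = [sigmoid(sum(w * v for w, v in zip(ws, xb))) for ws in w_hidden]
    hb = [1.0] + hidden
    out = sigmoid(sum(w * v for w, v in zip(w_out, hb)))
    return xb, hb, out


def train_step(x, target, w_hidden, w_out, rate):
    """One feed-forward pass, one backward pass, one gradient-descent update."""
    xb, hb, out = forward(x, w_hidden, w_out)
    # Error term at the output unit: dE/d(net input) for E = 0.5 * (out - target)^2.
    delta_out = (out - target) * out * (1.0 - out)
    # Back-propagate through the output weights (computed before they are updated).
    delta_hidden = [delta_out * w_out[j + 1] * hb[j + 1] * (1.0 - hb[j + 1])
                    for j in range(len(w_hidden))]
    # Gradient descent: w_new = w_old - rate * dE/dw.
    for j in range(len(w_out)):
        w_out[j] -= rate * delta_out * hb[j]
    for j, ws in enumerate(w_hidden):
        for i in range(len(ws)):
            ws[i] -= rate * delta_hidden[j] * xb[i]


def total_error(data, w_hidden, w_out):
    return 0.5 * sum((forward(x, w_hidden, w_out)[2] - t) ** 2 for x, t in data)


def train_on_or(epochs=2000, rate=0.5, seed=0):
    """Train on logical OR; return the total error before and after training."""
    rng = random.Random(seed)
    w_hidden = [[rng.uniform(-0.5, 0.5) for _ in range(3)] for _ in range(2)]
    w_out = [rng.uniform(-0.5, 0.5) for _ in range(3)]
    data = [((0, 0), 0.0), ((0, 1), 1.0), ((1, 0), 1.0), ((1, 1), 1.0)]
    before = total_error(data, w_hidden, w_out)
    for _ in range(epochs):
        for x, t in data:
            train_step(x, t, w_hidden, w_out, rate)
    return before, total_error(data, w_hidden, w_out)


if __name__ == "__main__":
    print(train_on_or())
```

The hidden-unit error terms are computed from the output weights before those weights are changed, so each update uses the gradient of the error at the current weights.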
Gradient descent
[Figure: error surface E[w] plotted over the weights w0 and w1]
$w_{\text{new}} = w_{\text{old}} - r \, \dfrac{\partial E}{\partial w}$
where r is a positive constant called the learning rate, which determines the size of the step by which the weights are altered in the steepest-descent direction along the error surface.
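A minimal sketch of the update rule w_new = w_old - r ∂E/∂w on a toy error surface. The quadratic E(w) = (w0 - 1)^2 + (w1 + 2)^2, the starting point, and the learning rate are all assumptions chosen so that the minimum, at (1, -2), is known in advance.

```python
def gradient_descent(grad, w, rate, steps):
    """Repeatedly apply w_new = w_old - rate * dE/dw and return the final weights."""
    for _ in range(steps):
        g = grad(w)
        w = [wi - rate * gi for wi, gi in zip(w, g)]
    return w


def example_gradient(w):
    """Gradient of the toy error E(w) = (w0 - 1)^2 + (w1 + 2)^2."""
    return [2.0 * (w[0] - 1.0), 2.0 * (w[1] + 2.0)]


if __name__ == "__main__":
    print(gradient_descent(example_gradient, [0.0, 0.0], rate=0.1, steps=100))
```

On this surface each step with r = 0.1 shrinks the distance to the minimum by a factor of 0.8. A rate above 1.0 would overshoot and diverge, which is why the step size matters.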
Data representation
• Issues with ANNs
  – Network architecture
    • Feedforward (fully connected vs. sparsely connected)
    • Recurrent
    • Number of hidden layers, number of hidden units within a layer
  – Network parameters
    • Learning rate
    • Momentum term
  – Input/output encoding
    • One of the most significant factors for good performance
    • Extract maximal info
    • Similar instances are encoded to “closer” vectors
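To make the input-encoding issue concrete, here is one common direct encoding, assumed for illustration rather than taken from the lecture. Each residue in a sliding window becomes a 21-unit one-hot vector (20 amino acids plus one spacer unit for positions beyond the sequence ends). The alphabet order, window size, and function names are hypothetical.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"   # assumed ordering of the 20 standard residues


def one_hot(residue):
    """21-unit vector: 20 amino acid units plus a final spacer unit.

    Anything not in AMINO_ACIDS (including the '-' spacer) sets the last unit.
    """
    vec = [0] * (len(AMINO_ACIDS) + 1)
    idx = AMINO_ACIDS.find(residue)
    vec[idx if idx >= 0 else len(AMINO_ACIDS)] = 1
    return vec


def encode_window(sequence, center, half_width):
    """Concatenate one-hot vectors for the window centred on `center`."""
    vec = []
    for pos in range(center - half_width, center + half_width + 1):
        residue = sequence[pos] if 0 <= pos < len(sequence) else "-"
        vec.extend(one_hot(residue))
    return vec


if __name__ == "__main__":
    v = encode_window("ACD", center=0, half_width=1)
    print(len(v), sum(v))
```

This encoding places every pair of distinct residues at the same distance from each other, so chemically similar residues are not encoded to "closer" vectors.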
An on-line service
• Performance
  – Ceiling at about 65% for direct encoding
    • Local encoding schemes present limited correlation information between residues
    • Little or no improvement using multiple hidden layers.
  – Surpassing 70% by
    • Including evolutionary information (contained in multiple alignment)
    • Using cascaded neural networks
    • Incorporating global information (e.g., position specific conservation weights)
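One way evolutionary information can be drawn from a multiple alignment is as a per-column residue frequency profile, which can replace the one-hot input for each position. The sketch below is an assumption about what such a profile might look like, not the specific scheme used by any method cited in the lecture. The toy alignment and function name are hypothetical.

```python
from collections import Counter


def column_frequencies(alignment, gap="-"):
    """Return, for each alignment column, a dict of residue -> relative frequency.

    Gaps are ignored; a column containing only gaps yields an empty dict.
    """
    if not alignment:
        return []
    length = len(alignment[0])
    if any(len(row) != length for row in alignment):
        raise ValueError("all aligned sequences must have the same length")
    profile = []
    for column in zip(*alignment):
        counts = Counter(aa for aa in column if aa != gap)
        total = sum(counts.values())
        profile.append({aa: n / total for aa, n in counts.items()} if total else {})
    return profile


if __name__ == "__main__":
    for col in column_frequencies(["AK-", "AR-", "GK-"]):
        print(col)
```

A column dominated by one residue indicates a conserved position, which is the kind of signal a single-sequence encoding cannot provide.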
Cathy Wu, Computers Chem. 21 (1997) 237-256
Resources
• Protein Structure Classification
  – CATH: http://www.biochem.ucl.ac.uk/bsm/cath/
  – SCOP: http://scop.mrc-lmb.cam.ac.uk/scop/
  – FSSP:
• PDB: http://www.rcsb.org/pdb/