By Eng. Monther Alhamdoosh
Supervisor: Prof. Rita Casadio
Co-supervisor: Dr. Piero Fariselli
Disulfide Connectivity Prediction Using Machine Learning Approaches
LAUREA MAGISTRALE IN BIOINFORMATICS
INTERNATIONAL BOLOGNA MASTER IN BIOINFORMATICS
ALMA MATER STUDIORUM ▪ UNIVERSITÀ DI BOLOGNA
Session II 2009/2010
In Literature
September 10th, 2010, M.Sc. Thesis in Bioinformatics, Eng. Monther Alhamdoosh
Accuracy indices
Qp: the percentage of connectivity patterns that are correctly predicted.
Qc: the percentage of disulfide bridges that are correctly predicted.
δ(x, y) = 1 when the predicted pattern y matches the correct pattern x, and 0 otherwise.
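The two accuracy indices can be illustrated with a minimal sketch. It assumes the pattern-level index is the one reported as Qp in the result tables and the bridge-level index is the one reported as Qc, which is the usual convention but is not spelled out on the slide. The data representation (a pattern as a set of cysteine-index pairs), the helper names, and the two toy proteins are hypothetical and are not taken from the thesis.

```python
def delta(x, y):
    """Return 1 when the predicted pattern y matches the correct pattern x, else 0."""
    return 1 if x == y else 0


def pattern(*pairs):
    """Build a connectivity pattern: a set of bridges, each a pair of cysteine indices."""
    return frozenset(frozenset(p) for p in pairs)


def accuracy_indices(correct, predicted):
    """Return (Qp, Qc) as percentages over a list of proteins."""
    n_patterns = len(correct)
    n_bridges = sum(len(x) for x in correct)
    # Qp: share of proteins whose whole pattern is predicted correctly.
    qp = 100.0 * sum(delta(x, y) for x, y in zip(correct, predicted)) / n_patterns
    # Qc: share of individual bridges that appear in the predicted pattern.
    qc = 100.0 * sum(len(x & y) for x, y in zip(correct, predicted)) / n_bridges
    return qp, qc


# Toy data (hypothetical): one protein with 2 bridges, one with 3 bridges.
correct = [pattern((1, 2), (3, 4)), pattern((1, 4), (2, 5), (3, 6))]
predicted = [pattern((1, 2), (3, 4)), pattern((1, 4), (2, 6), (3, 5))]
qp, qc = accuracy_indices(correct, predicted)
print(qp, qc)
```

With two bridges, getting one bridge right fixes the other, so the two indices coincide for B = 2; for larger B the bridge-level index can exceed the pattern-level one.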
• Introduction
  - The Amino Acid Cysteine
  - Importance of SS Bonds
  - Machine Learning
• Statement of the Problem
  - Aim of Research
  - In Literature
• Our Proposed Solutions
• Results
• Comparisons with previous methods
• Conclusions
Our Proposed Solutions
[Diagram: Basic System Design. Four numbered steps (1 to 4), with the labels Machine Learning and Pattern Scoring Schemes]
Our Proposed Solutions
Step 3: Estimate the disulfide propensity
Neural Networks-based Models: Single-Layer Feed-forward Network (SLFN).
• Extreme Learning Machines (ELMs): pseudo-inverse matrix to get output weights.
  Additive (Sigmoid) hidden neurons; RBF (Gaussian) hidden neurons.
• Back-propagation (BP): gradient descent to get all weights.
Support Vector Machines (SVM): Support Vector Regression (SVR).
• Radial Basis Function (RBF) kernels.
• Grid search is used to find the best values for g and c.
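The ELM idea named above (random hidden layer, output weights from a pseudo-inverse) can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions, not the thesis implementation: the uniform weight range, the function names `elm_train` and `elm_predict`, the number of hidden neurons, and the synthetic regression target are all hypothetical, and the sketch uses additive (sigmoid) hidden neurons only.

```python
import numpy as np


def elm_train(X, y, n_hidden, rng):
    """Fit a single-hidden-layer network the ELM way."""
    # Hidden weights and biases are drawn at random and never trained
    # (the [-1, 1] range is an assumption made for this sketch).
    W = rng.uniform(-1.0, 1.0, size=(X.shape[1], n_hidden))
    b = rng.uniform(-1.0, 1.0, size=n_hidden)
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))  # additive (sigmoid) hidden neurons
    beta = np.linalg.pinv(H) @ y            # output weights via the pseudo-inverse
    return W, b, beta


def elm_predict(X, W, b, beta):
    """Apply the fixed hidden layer, then the learned output weights."""
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))
    return H @ beta


# Synthetic regression data (hypothetical; stands in for propensity targets).
rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(200, 3))
y = np.sin(X[:, 0]) + 0.5 * X[:, 1] * X[:, 2]

W, b, beta = elm_train(X, y, n_hidden=40, rng=rng)
mse = float(np.mean((elm_predict(X, W, b, beta) - y) ** 2))
print(f"training MSE: {mse:.6f}")
```

Because only the output weights are solved for, and in closed form, there is no iterative weight update; BP instead adjusts all weights by gradient descent.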
SLFN
ELM (Additive vs. RBF hidden neurons): training time curves
[Figure: training time curves in two panels, Additive Hidden Neurons and RBF Hidden Neurons; x-axis: Number of Neurons]
ELM outperforms BP
The accuracy values of ELM and BP
Performance Enhancement
Comparison of different ELM and BP models.
Model       B = 2     B = 3     B = 4     B = 5     Overall   Best # of   Time (s)
            Qc  Qp    Qc  Qp    Qc  Qp    Qc  Qp    Qc  Qp    neurons
ELM (Sig)   65  65    42  28    42  24    27   4    46  41    150         28.52
ELM (RBF)   66  66    45  32    45  26    31   5    48  43    90          18.55
BP (Sig)    62  62    38  26    40  23    29   5    44  38    95559.29
Performance of our method with L1 RBF kernels initialized using k-means clustering. The best-performing number of hidden neurons is 270, and the corresponding training time is 425.11 seconds.

Connectivity Size   2    3    4    5    Overall
Qc                  67   48   44   37   51
Qp                  67   36   27    6   45
SVR vs. NN
Comparison of SVR and NN-based methods. Both were tested on PDB0909 with Set A of descriptors.
Method      B = 2     B = 3     B = 4     B = 5     Overall
            Qc  Qp    Qc  Qp    Qc  Qp    Qc  Qp    Qc  Qp
SVR (BSP)   69  69    59  46    44  23    45  23    56  50
NN (ELM)    67  67    48  36    44  27    37   6    51  45