Single-model quality assessment using protein structural and …caora//materials/MCBIOS... ·...
![Page 1: Single-model quality assessment using protein structural and …caora//materials/MCBIOS... · 2016-06-24 · CASP (Critical Assessment of Techniques for Protein Structure Prediction).](https://reader034.fdocuments.us/reader034/viewer/2022042412/5f2af327b5a59d74a66e7b08/html5/thumbnails/1.jpg)
Speaker: Renzhi Cao
Major: Computer Science
Fifth year Ph.D
Single-model quality assessment using protein structural and contact information with machine learning techniques
Outline
Part I: Introduction
  Protein quality assessment
  CASP competition
Part II: QAcon method
Part III: Result
Part IV: Conclusion
Part I: Introduction
>T0759 HR9083A, Human, 109 residues
MGHHHHHHSHMVVIHPDPGRELSPEEAHRAGLIDWNMFVKLRSQECDWEEISVKGPNGES
SVIHDRKSGKKFSIEEALQSGRLTPAHYDRYVNKDMSIQELAVLVSGQK
[Figure: the native structure alongside Predicted model1 and Predicted model2]
[Figure: Predicted model2 and Predicted model3]
GDT-TS: 0.07 GDT-TS: 0.33 GDT-TS: 0.21
Predicted model1 Predicted model2 Predicted model3
Evaluating metrics
1. Loss: $\text{Loss} = GDT_{\text{best model}} - GDT_{\text{predicted top 1}}$
2. Correlation: $r = \frac{n\sum xy - (\sum x)(\sum y)}{\sqrt{[n\sum x^2 - (\sum x)^2][n\sum y^2 - (\sum y)^2]}}$
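The two metrics on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the evaluation script used for CASP. The function names `top1_loss` and `pearson` and the predicted scores in the example are hypothetical. The three GDT-TS values are borrowed from the example slide, with their order assumed.

```python
import math

def top1_loss(true_gdt, predicted_scores):
    """Loss: GDT-TS of the best model minus GDT-TS of the model ranked first."""
    # Index of the model that the QA method scores highest.
    top1 = max(range(len(predicted_scores)), key=lambda i: predicted_scores[i])
    return max(true_gdt) - true_gdt[top1]

def pearson(x, y):
    """Pearson correlation in the sum form shown on the slide."""
    n = len(x)
    sx, sy = sum(x), sum(y)
    sxy = sum(a * b for a, b in zip(x, y))
    sxx = sum(a * a for a in x)
    syy = sum(b * b for b in y)
    denom = math.sqrt((n * sxx - sx ** 2) * (n * syy - sy ** 2))
    # Return 0.0 when either variable is constant (assumption for this sketch).
    return (n * sxy - sx * sy) / denom if denom else 0.0

# Hypothetical example: three models of one target.
true_gdt = [0.07, 0.33, 0.21]    # real quality (GDT-TS)
predicted = [0.10, 0.25, 0.30]   # made-up predicted quality scores
print("loss:", top1_loss(true_gdt, predicted))
print("correlation:", pearson(predicted, true_gdt))
```

In this toy case the method ranks the third model first, so the loss is the gap between the best model's GDT-TS and the third model's GDT-TS. Both metrics are computed per target and then averaged over targets.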
CASP (Critical Assessment of Techniques for Protein Structure Prediction).
Sel20 (Stage1)
Top150 (Stage2)
Part II: QAcon method
12 features
1. RWp
2. MoE
3. Dop
4. Con
5. SS
6. SP
7. EC
8. SU
9. EM
10. ES
11. SA
12. RF
1. RWp: RWplus score
2. MoE: ModelEvaluator score
3. Dop: Dope score
4. Con: Contact satisfaction score
Contact threshold is set to 8 Å.
[Figure: a residue pair, Res1 and Res2]
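A contact satisfaction score along these lines could be sketched as follows. The slides do not give the exact formula, so this assumes the score is the fraction of predicted contacts that are also present in the model, that distances are measured between C-alpha atoms, and that a pair is in contact when its distance is below the 8 Å threshold. The function names are hypothetical.

```python
import math

def model_contacts(ca_coords, threshold=8.0):
    """Residue pairs (i, j), i < j, closer than `threshold` in the model."""
    contacts = set()
    n = len(ca_coords)
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(ca_coords[i], ca_coords[j]) < threshold:
                contacts.add((i, j))
    return contacts

def contact_satisfaction(predicted_contacts, ca_coords, threshold=8.0):
    """Fraction of predicted contacts found in the model (assumed definition)."""
    if not predicted_contacts:
        return 0.0
    observed = model_contacts(ca_coords, threshold)
    hits = sum(
        1 for i, j in predicted_contacts
        if (min(i, j), max(i, j)) in observed
    )
    return hits / len(predicted_contacts)

# Toy model: four residues on a straight line, 3.8 Å apart.
coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
predicted = [(0, 1), (0, 2), (0, 3), (1, 3)]  # made-up predicted contacts
print(contact_satisfaction(predicted, coords))
```

In the toy example, three of the four predicted pairs lie within 8 Å, so three quarters of the predicted contacts are satisfied.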
5. SS: Secondary structure similarity score
6. SP: Secondary structure penalty score
$S_{\text{penalty}} = \frac{\text{NumHelix} + \text{NumSheet}}{N}$
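As a rough illustration of the SP formula, the sketch below counts the positions where the predicted and model secondary structures agree on helix or on sheet, then divides by the sequence length N. Reading NumHelix and NumSheet as counts of matching positions is an assumption based on the supplementary description. The three-state strings (`H` helix, `E` sheet, `C` coil) and the function name `sp_score` are hypothetical.

```python
def sp_score(predicted_ss, model_ss):
    """(NumHelix + NumSheet) / N over aligned three-state strings."""
    if len(predicted_ss) != len(model_ss):
        raise ValueError("secondary structure strings must have equal length")
    n = len(predicted_ss)
    if n == 0:
        return 0.0
    pairs = list(zip(predicted_ss, model_ss))
    # Positions where both the prediction and the model are helix.
    num_helix = sum(1 for p, m in pairs if p == m == "H")
    # Positions where both the prediction and the model are sheet.
    num_sheet = sum(1 for p, m in pairs if p == m == "E")
    return (num_helix + num_sheet) / n

# Made-up example: 2 matching helix and 2 matching sheet positions out of 8.
print(sp_score("HHHCCEEE", "HHCCCEEC"))
```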
7. EC: Euclidean compact score
$S_{\text{Eucli}} = \frac{\sum_{i,j} \text{Eucli}(i, j)}{\sum_{i,j} 3.8 \cdot |i - j|}$
8. SU: Surface score
$S_{\text{surf}} = \frac{\sum_i \text{exposed nonpolar}_i}{\sum_i \text{exposed area}_i}$
9. EM: Exposed mass score
$S_{\text{mass}} = \frac{\sum_i \text{Area of Nonpolar}_i \cdot \text{Mass}_i}{\sum_i \text{Area of Exposed}_i \cdot \text{Mass}_i}$
10. ES: Exposed surface score
$S_{ES} = \frac{\sum_i \text{Exposed area}_i}{\sum_i \text{Area}_i}$
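The three surface-based ratios (SU, EM and ES) share the same per-residue inputs, so they can be sketched together. This is an illustration under stated assumptions, not the QAcon implementation. Each residue is a hypothetical tuple `(exposed_area, total_area, mass, is_nonpolar)`. The EM numerator is assumed to be the mass-weighted exposed area of nonpolar residues. How the areas are obtained, for example from a solvent accessibility tool, is left open.

```python
def surface_scores(residues):
    """Return the SU, EM and ES ratios for a list of residue tuples.

    Each residue is (exposed_area, total_area, mass, is_nonpolar).
    """
    exposed = sum(r[0] for r in residues)
    total = sum(r[1] for r in residues)
    exposed_nonpolar = sum(r[0] for r in residues if r[3])
    exposed_mass = sum(r[0] * r[2] for r in residues)
    exposed_nonpolar_mass = sum(r[0] * r[2] for r in residues if r[3])
    return {
        # SU: exposed nonpolar area over total exposed area.
        "SU": exposed_nonpolar / exposed if exposed else 0.0,
        # EM: same ratio with each residue's area weighted by its mass.
        "EM": exposed_nonpolar_mass / exposed_mass if exposed_mass else 0.0,
        # ES: exposed area over total residue area.
        "ES": exposed / total if total else 0.0,
    }

# Made-up residues: one nonpolar and mostly buried, one polar and more exposed.
residues = [
    (10.0, 100.0, 2.0, True),
    (30.0, 100.0, 1.0, False),
]
print(surface_scores(residues))
```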
11. SA: Solvent accessibility similarity score
12. RF: RF_CB_SRS_OD score
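[Figure: feed-forward neural network. The 12 features (RF, RWp, MoE, Con, SP, SS, …) feed a hidden layer, which feeds a single output.]
Forward propagation: $\sum_i w_{i,j} x_i$
Sigmoid function: $\sigma(x) = \frac{1}{1 + e^{-x}}$
Error: $E = \frac{1}{2}(y - t)^2$
Backward propagation: $w_{t+1} = w_t - \eta \frac{\partial E}{\partial w}$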
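The network on this slide can be illustrated with a small pure-Python sketch: 12 inputs, one sigmoid hidden layer, one sigmoid output, squared error, and plain gradient descent. The hidden layer size, learning rate, weight initialization and the absence of bias terms are assumptions made here, since the slide does not state them. The update uses the usual gradient-descent convention of subtracting the gradient. The class name `TinyNet` is hypothetical.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

class TinyNet:
    """One hidden layer, no bias terms (assumed sizes, for illustration only)."""

    def __init__(self, n_in=12, n_hidden=5, seed=0):
        rng = random.Random(seed)
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
                   for _ in range(n_hidden)]
        self.w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]

    def forward(self, x):
        # Forward propagation: weighted sum of the inputs, then sigmoid.
        self.h = [sigmoid(sum(w * xi for w, xi in zip(row, x)))
                  for row in self.w1]
        self.y = sigmoid(sum(w * hj for w, hj in zip(self.w2, self.h)))
        return self.y

    def train_step(self, x, t, eta=0.5):
        y = self.forward(x)
        # Error E = 1/2 (y - t)^2; the sigmoid derivative is y * (1 - y).
        delta_out = (y - t) * y * (1.0 - y)
        # Hidden deltas are computed with the weights from before the update.
        delta_hidden = [delta_out * w * h * (1.0 - h)
                        for w, h in zip(self.w2, self.h)]
        # Gradient descent: w <- w - eta * dE/dw.
        self.w2 = [w - eta * delta_out * h for w, h in zip(self.w2, self.h)]
        self.w1 = [[w - eta * d * xi for w, xi in zip(row, x)]
                   for row, d in zip(self.w1, delta_hidden)]
        return 0.5 * (y - t) ** 2

# Made-up training example: 12 feature values and a target quality of 0.9.
net = TinyNet()
features = [0.5] * 12
for _ in range(200):
    error = net.train_step(features, 0.9)
print("final error:", error)
```

The printed error shrinks as the weights are updated, which is the behaviour the forward and backward propagation steps on the slide describe.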
Part III: Result
| Server name | Ave. corr. | Ave. loss |
| --- | --- | --- |
| ProQ2 | 0.643 | 0.090 |
| QAcon | 0.639 | 0.100 |
| VoroMQA | 0.561 | 0.108 |
| Wang_SVM | 0.655 | 0.109 |
| Wang_deep_1 | 0.613 | 0.128 |
| RWplus | 0.536 | 0.135 |
| raghavagps-qaspro | 0.35 | 0.156 |

Table 1. The per-target average correlation and average loss for QAcon and other methods on sel20 of CASP11.
| Server name | Ave. corr. | Ave. loss |
| --- | --- | --- |
| ProQ2 | 0.372 | 0.058 |
| QAcon | 0.395 | 0.067 |
| VoroMQA | 0.401 | 0.069 |
| RWplus | 0.295 | 0.084 |
| Wang_SVM | 0.362 | 0.085 |
| raghavagps-qaspro | 0.222 | 0.085 |
| Wang_deep_1 | 0.302 | 0.089 |

Table 2. The per-target average correlation and average loss for QAcon and other methods on top150 of CASP11.
| Target name | Contact satisfaction |
| --- | --- |
| T0778 | 0.6142 |
| T0825 | 0.6049 |
| T0807 | 0.5387 |
| T0815 | 0.5189 |
| T0817 | 0.5181 |
| T0811 | 0.5176 |
| T0854 | 0.4953 |
| T0762 | 0.4607 |
| T0819 | 0.4531 |
| T0768 | 0.4529 |
| T0776 | 0.4492 |
| T0798 | 0.4343 |
| T0805 | 0.4252 |
| T0801 | 0.3936 |
| T0847 | 0.3864 |

Table 3. Contact satisfaction score of all CASP11 native structures (top 15)

| Contact satisfaction | Ave. Corr | Ave. Loss |
| --- | --- | --- |
| Con (Top 25) | 0.682 | 0.156 |
| Con (Bottom 25) | -0.016 | 0.233 |

Table 4. The average correlation and loss for CASP11 sel20 targets

| Contact satisfaction | Ave. Corr | Ave. Loss |
| --- | --- | --- |
| Con (Top 25) | 0.221 | 0.146 |
| Con (Bottom 25) | 0.080 | 0.134 |

Table 5. The average correlation and loss for CASP11 top150 targets
Part IV: Conclusion
QAcon
Contact as a potential feature for QA
Q & A
Email: [email protected]
Badri Adhikari
Debswapna Bhattacharya
Miao Sun
Jianlin Cheng
All other lab members
Jie Hou
Supplementary
1. RF_CB_SRS_OD score (Rykunov and Fiser, 2007): an energy score for evaluating protein structure based on statistical distance-dependent pairwise potentials.
2. RWplus score (Zhang and Zhang, 2010): an energy score for evaluating protein models based on a distance-dependent atomic potential.
3. ModelEvaluator score (Wang, et al., 2009): a score for evaluating protein models based on structural features and support vector machines.
4. Dope score (Shen and Sali, 2006): an energy score for evaluating protein models based on the reference state of non-interacting atoms in a homogeneous sphere.
5. Con score: the contact score measures how well the contacts predicted from the sequence are satisfied by the contacts parsed from the model. PSICOV is used for contact prediction, and NNcon is used when PSICOV fails to make predictions.
6. SS score: this score is calculated from the difference between the secondary structure predicted by Spine X (Faraggi, et al., 2012) from the protein sequence and that of the model parsed by DSSP (Kabsch and Sander, 1983).
7. SP score: this score is calculated as the percentage of helix and sheet matching between the predicted secondary structure and the one parsed from the model.
8. EC score: the Euclidean compact score is calculated as the summation of pairwise Euclidean distances between amino acids divided by (N*N-1)*3.8, where N is the total number of amino acids in the sequence.
9. SU score: the surface score is calculated as the total area of exposed nonpolar residues divided by the total area of all residues.
10. EM score: the exposed mass score is calculated as the total mass-weighted area of nonpolar residues divided by the total mass-weighted area of exposed residues.
11. ES score: the exposed surface score is calculated as the total exposed residue area divided by the total residue area.
12. SA score: the solvent accessibility score is calculated as the percentage of difference between the predicted solvent accessibility and the one parsed from the model.