Multi-factor model for prediction of the caspase degradome Lawrence Wee


Page 1: Multi-factor model for prediction of the caspase degradome  Lawrence Wee

Multi-factor model for prediction of the caspase degradome

Lawrence Wee

Page 2: Multi-factor model for prediction of the caspase degradome  Lawrence Wee

What are Caspases?

1. Fuentes-Prior et al. Biochem J. 2004 Dec 1;384(Pt 2):201-32.

2. Thornberry et al. J Biol Chem. 1997 Jul 18;272(29):17907-11.

The Biochemistry of Caspases (1)

Caspases are cysteine proteases.

In vitro optimal tetrapeptide specificities (2)

             P4    P3   P2   P1
Group I
Caspase-1    W     E    H    D
Caspase-4    W/L   E    H    D
Caspase-5    W/L   E    H    D
Group II
CED-3        D     E    T    D
Caspase-3    D     E    V    D
Caspase-7    D     E    V    D
Caspase-2    D     E    H    D
Group III
Caspase-6    V     E    H    D
Caspase-8    L     E    T    D
Caspase-9    L     E    H    D

Recognize tetrapeptide sequence on substrates (P4-P3-P2-P1).

P4   P3   P2   P1  ---  P1'  P2'
D    E    V    D   ---  T    Y

Cleave after canonical Asp (D) residue at the P1 position.
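The P4-P1 notation above can be illustrated with a minimal Python sketch that scans a sequence for an optimal tetrapeptide and reports the P1 position and the two prime-side residues. The function name, the motif dictionary and the example sequence are hypothetical and are not part of the original slides; only the motifs themselves come from the specificity table.

```python
import re

# Optimal tetrapeptides (P4-P1) copied from the specificity table above.
OPTIMAL_MOTIFS = {
    "Caspase-3": "DEVD",
    "Caspase-8": "LETD",
    "Caspase-9": "LEHD",
}

def find_candidate_sites(sequence, motif):
    """Return (P1 position, P4-P1 motif, P1'-P2' residues) per occurrence.

    Positions are 1-based; the P1 position is the Asp after which
    cleavage would occur.
    """
    hits = []
    # A zero-width lookahead also reports overlapping occurrences.
    for match in re.finditer(f"(?={re.escape(motif)})", sequence):
        p1_index = match.start() + len(motif) - 1   # 0-based index of P1
        prime_side = sequence[p1_index + 1:p1_index + 3]
        hits.append((p1_index + 1, motif, prime_side))
    return hits

# Hypothetical sequence containing the DEVD---TY example from the slide.
example = "MKGDEVDTYAA"
sites = find_candidate_sites(example, OPTIMAL_MOTIFS["Caspase-3"])
print(sites)
```

A strict motif match like this is only the simplest form of cleavage site prediction; the scoring-matrix and SVM methods discussed later in the deck go beyond exact matching.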

Page 3: Multi-factor model for prediction of the caspase degradome  Lawrence Wee

What are Caspases?

1. Hengartner MO. The biochemistry of apoptosis. Nature. 2000 Oct 12;407(6805):770-6.

As the final effectors of apoptosis, caspases cleave many protein substrates.

Caspases in Apoptosis (1)

[Figure: extrinsic and intrinsic apoptosis pathways]

Page 4: Multi-factor model for prediction of the caspase degradome  Lawrence Wee

The Caspase Degradome

The State of the Caspase Degradome (1)

Functional Distribution of Caspase Substrates

Ser/Thr-Protein kinases in signal transduction: 12%
Cytoskeletal and structural proteins: 10%
DNA-binding and transcription factors: 9%
RNA synthesis and splicing: 7%
DNA synthesis, cleavage and repair: 7%
Cell adhesion: 6%
Cell Cycle proteins: 6%
Calcium, c-AMP, c-GMP and Lipid metabolism: 5%
Nuclear structural and abundant proteins: 4%
Membrane Receptors: 4%
Neurodegeneration: 4%
Protein translation: 4%
Other substrates: 4%
G protein signaling: 3%
Apoptosis regulation: 3%
Protein degradation: 3%
Viral proteins: 3%
ER and Golgi-resident proteins: 1%
Protein phosphatases: 1%
Protein modification: 1%
Tyr protein kinases: 1%
Adapter proteins: 1%
Cytokines: 1%

1. Categories are assigned according to Fischer et al (2003).

Page 5: Multi-factor model for prediction of the caspase degradome  Lawrence Wee

What is the Degradome?

The Degradome

The Caspase Degradome: The repertoire of proteins cleaved by caspases.

Degradome: The protease-substrate repertoire in a cell, tissue or organism (1)

Genome, Transcriptome, Proteome, Degradome (substrates and proteases)

Genomics, Transcriptomics, Proteomics, Degradomics

1. Lopez-Otin C and Overall CM. Protease degradomics: a new challenge for proteomics. Nat Rev Mol Cell Biol. 2002 Jul;3(7):509-19.

Page 6: Multi-factor model for prediction of the caspase degradome  Lawrence Wee

Question

What proteins are cleaved by caspases?

Page 7: Multi-factor model for prediction of the caspase degradome  Lawrence Wee

Question

What proteins are cleaved by caspases?

Strategy

How about predicting the caspase degradome?

Page 8: Multi-factor model for prediction of the caspase degradome  Lawrence Wee

Algorithms and servers

Existing algorithms and servers

Program         Algorithm                           Accuracy      Authors                        Dataset
PeptideCutter   Consensus motifs                    Not reported  Gasteiger et al (2005)         Outdated dataset
PEPS            Consensus motifs                    Not reported  Lohmuller et al (2003)         Outdated dataset
CasPredictor    Position-specific scoring matrices  81% (1)       Garay-Malpartida et al (2005)  137 sequences (Fischer, 2003)
GraBCas         Position-specific scoring matrices  87% (2)       Backes et al (2005)            No dataset provided
BBBF NN         Neural networks                     96% (1)       Yang (2005)                    Small dataset (12 sequences)
SVM Prediction  SVM                                 82-97%        Wee et al (2006, 2007)         219 sequences

1. Accuracy as reported in papers using the authors' datasets.

2. GraBCas accuracy when tested on our dataset.

Page 9: Multi-factor model for prediction of the caspase degradome  Lawrence Wee

Server for SVM-based prediction

CASVM Web Server (1,2)

The CASVM web server predicts caspase cleavage sites using our SVM algorithm: www.casbase.org/casvm

1. Wee et al. CASVM: web server for SVM-based prediction of caspase substrates cleavage sites. Bioinformatics. 2007 Dec 1;23(23):3241-3.

2. Wee et al. SVM-based prediction of caspase substrate cleavage sites. BMC Bioinformatics. 2006 Dec 18;7 Suppl 5:S14.

Page 10: Multi-factor model for prediction of the caspase degradome  Lawrence Wee

Question

What proteins are cleaved by caspases?

Strategy

How about predicting the caspase degradome?

Problem

Predicting caspase cleavage sites is not good enough

Page 11: Multi-factor model for prediction of the caspase degradome  Lawrence Wee

Predicting the Caspase Degradome

Limitations of Caspase Cleavage Sites Prediction

1. Analysis on our caspase substrates dataset.

Not all bona fide cleavage site motifs are cleaved in vivo (1):

• 80% of true substrates contain at least one additional copy of a caspase cleavage site sequence that is not reported as a true cleavage site in the literature.

- Tpr (DDED-2117)
- p28BAP31 (AAVD-163)
- golgin 160 (SEVD-311)
- Topo I (PEDD-123)
- heterogeneous nuclear ribonucleoprotein C1/C2 (GEDD-305)
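The observation above, that a reported cleavage tetrapeptide can recur elsewhere in the same protein without being cleaved, can be checked with a short sketch like the one below. The function names and the toy sequence are hypothetical; no real substrate sequence is reproduced here.

```python
def motif_occurrences(sequence, tetrapeptide):
    """Return the 1-based P1 position of every occurrence of the motif."""
    positions = []
    start = sequence.find(tetrapeptide)
    while start != -1:
        # P1 is the last residue of the tetrapeptide.
        positions.append(start + len(tetrapeptide))
        start = sequence.find(tetrapeptide, start + 1)
    return positions

def unreported_copies(sequence, tetrapeptide, reported_p1):
    """Motif occurrences other than the reported cleavage site."""
    return [p for p in motif_occurrences(sequence, tetrapeptide)
            if p != reported_p1]

# Toy sequence (hypothetical) with two copies of SEVD.
toy_sequence = "GGSEVDAAKLSEVDGG"
all_sites = motif_occurrences(toy_sequence, "SEVD")
extra = unreported_copies(toy_sequence, "SEVD", reported_p1=6)
print(all_sites, extra)
```

Any position returned by `unreported_copies` is a site that a purely sequence-based predictor would score identically to the reported one, which is the limitation this slide describes.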

Page 12: Multi-factor model for prediction of the caspase degradome  Lawrence Wee

Problem

Predicting caspase cleavage sites is not good enough

Solution

How about incorporating other structural factors?

Page 13: Multi-factor model for prediction of the caspase degradome  Lawrence Wee

Problem

Predicting caspase cleavage sites is not good enough

Solution

How about incorporating other structural factors?

Secondary structures? Solvent exposure?

Page 14: Multi-factor model for prediction of the caspase degradome  Lawrence Wee

Analysis of caspase cleavage sites

Caspase cleavage sites are analyzed for:

• Propensity for secondary structures

• Propensity for solvent exposure

Input: dataset of caspase cleavage sites and non-cleavage sites, analyzed with SABLE (1)

1. http://sable.cchmc.org/

Page 15: Multi-factor model for prediction of the caspase degradome  Lawrence Wee

Analysis of caspase cleavage sites

Cleavage sites tend to locate in unstructured regions

Figure 1 Figure 2

Cleavage sites prefer unstructured regions

Page 16: Multi-factor model for prediction of the caspase degradome  Lawrence Wee

Analysis of caspase cleavage sites

Cleavage sites tend to locate in solvent exposed regions

Figure 3 Figure 4

Cleavage sites prefer solvent exposed regions

Page 17: Multi-factor model for prediction of the caspase degradome  Lawrence Wee

Analysis of caspase cleavage sites

Cleavage sites tend to locate in unstructured and solvent exposed regions

Cleavage sites prefer highly unstructured regions with high solvent exposure

Figure 5

Page 18: Multi-factor model for prediction of the caspase degradome  Lawrence Wee

Analysis of caspase cleavage sites

Cleavage sites tend to locate in unstructured and solvent exposed regions

Non-cleavage sites prefer regions with secondary structures and less solvent exposure

Figure 6

Page 19: Multi-factor model for prediction of the caspase degradome  Lawrence Wee

Multi-factor model

Current algorithms: Cleavage site prediction

Better algorithm? Cleavage site prediction + Secondary structures + Solvent exposure

Page 20: Multi-factor model for prediction of the caspase degradome  Lawrence Wee

Multi-factor model prediction

Schematic diagram of the multi-factor algorithm

Step 1 - Caspase cleavage site prediction (using an existing algorithm)

......MIREYRQMVETELKLICCDILDVLDKHLIPAANTGESKVF.....

Step 2 – Selection of structurally favorable candidates

Candidate subsequences:

VETELKLICCDILDVLDKHLIPAA

ELKLICCDILDVLDKHLIPAANTG

LAKAAFDDAIAELDTLSEESYKDS

Cp, Sp and P-score are calculated for all subsequences.

Cleavage sites in subsequences with P-score above cut-off are selected:

ELKLICCDILDVLDKHLIPAANTG
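The two-step scheme can be sketched in Python as follows. The slides do not define Cp, Sp or the P-score, so this sketch makes loud assumptions: Cp is taken as the mean predicted coil (unstructured) propensity over the window, Sp as the mean predicted solvent exposure, and the P-score as their product. The 24-residue window length matches the subsequences shown, but the placement of the cleavage site within the window (10 residues before P1, 13 after) is also an assumption, as are all function names.

```python
from statistics import mean

def window_bounds(p1, length, n_flank=10, c_flank=13):
    """0-based [start, end) window around a 1-based P1 position.

    The flank sizes are assumed, chosen to give a 24-residue window.
    """
    start = max(0, p1 - 1 - n_flank)
    end = min(length, p1 + c_flank)
    return start, end

def score_site(p1, coil, exposure):
    """Return (Cp, Sp, P-score) under the assumed definitions."""
    start, end = window_bounds(p1, len(coil))
    cp = mean(coil[start:end])        # assumed: mean coil propensity
    sp = mean(exposure[start:end])    # assumed: mean solvent exposure
    return cp, sp, cp * sp            # assumed: P-score = Cp * Sp

def select_sites(predicted_p1, coil, exposure, cutoff):
    """Step 2: keep step 1 predictions whose P-score exceeds the cut-off."""
    selected = []
    for p1 in predicted_p1:
        _, _, p_score = score_site(p1, coil, exposure)
        if p_score > cutoff:
            selected.append((p1, round(p_score, 3)))
    return selected

# Toy per-residue profiles: structured, buried N-terminal half and
# unstructured, exposed C-terminal half (values are made up).
coil = [0.2] * 20 + [0.9] * 20
exposure = [0.3] * 20 + [0.8] * 20
selected = select_sites([12, 30], coil, exposure, cutoff=0.5)
print(selected)
```

In practice the per-residue profiles would come from a structure predictor such as SABLE and the step 1 positions from an existing cleavage site predictor; the cut-off of 0.5 used here is arbitrary.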

Page 21: Multi-factor model for prediction of the caspase degradome  Lawrence Wee

Algorithms and servers

Existing algorithms and servers

Program         Algorithm                           Accuracy      Authors                        Dataset
PeptideCutter   Consensus motifs                    Not reported  Gasteiger et al (2005)         Outdated dataset
PEPS            Consensus motifs                    Not reported  Lohmuller et al (2003)         Outdated dataset
CasPredictor    Position-specific scoring matrices  81% (1)       Garay-Malpartida et al (2005)  137 sequences (Fischer, 2003)
GraBCas         Position-specific scoring matrices  87% (2)       Backes et al (2005)            No dataset provided
BBBF NN         Neural networks                     96% (1)       Yang (2005)                    Small dataset (12 sequences)
SVM Prediction  SVM                                 82-97%        Wee et al (2006, 2007)         219 sequences

1. Accuracy as reported in papers using the authors' datasets.

2. GraBCas accuracy when tested on our dataset.

Page 22: Multi-factor model for prediction of the caspase degradome  Lawrence Wee

Multi-factor model prediction

Validating the multi-factor model

Dataset of caspase cleavage sites and non-cleavage sites

Analysis

Test Multi-factor model prediction

Page 23: Multi-factor model for prediction of the caspase degradome  Lawrence Wee

Multi-factor model prediction

Validating the multi-factor model

Figure 7

Using CASVM

Page 24: Multi-factor model for prediction of the caspase degradome  Lawrence Wee

Multi-factor model prediction

Validating the multi-factor model

Figure 8

Using GraBCas

Page 25: Multi-factor model for prediction of the caspase degradome  Lawrence Wee

Multi-factor model prediction

Validating the multi-factor model

Figure 9: Positive Predictive Values of Models. PPV (%), from 0 to 100, plotted against P-score, from 0 to 0.75, for CASVM and GraBCas.
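A PPV-versus-cut-off curve of the kind plotted in Figure 9 can be computed with a sketch like the one below, using the standard definition PPV = TP / (TP + FP). The function name and the toy data are hypothetical; the actual values behind the figure are not available in this transcript.

```python
def ppv_at_cutoff(scored_sites, cutoff):
    """PPV (%) among predicted sites whose P-score exceeds the cut-off.

    scored_sites: list of (p_score, is_true_cleavage_site) pairs.
    Returns None when no site passes the cut-off.
    """
    passed = [is_true for score, is_true in scored_sites if score > cutoff]
    if not passed:
        return None
    true_positives = sum(passed)   # True counts as 1
    return 100.0 * true_positives / len(passed)

# Toy predictions (made up): P-score and whether the site is a true one.
toy = [(0.10, False), (0.20, True), (0.35, False), (0.50, True), (0.70, True)]
curve = {c: ppv_at_cutoff(toy, c) for c in (0.0, 0.3, 0.6)}
print(curve)
```

Raising the cut-off discards low-scoring predictions, so the PPV can rise at the cost of fewer predicted sites; the same function would be run once per step 1 predictor (for example CASVM and GraBCas) to compare curves.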

Page 26: Multi-factor model for prediction of the caspase degradome  Lawrence Wee

RTK cleavage prediction

Prediction of potential caspase cleavage among RTKs

Receptor Tyrosine Kinases (RTKs)

Belong to the tyrosine kinase superfamily

Plasma membrane bound

Involved in cell survival, proliferation, differentiation

Image taken from http://www.pvrireview.org/viewimage.asp?img=PVRIReview_2009_1_2_124_50732_u1.jpg

Page 27: Multi-factor model for prediction of the caspase degradome  Lawrence Wee

Questions

Which RTKs are cleaved by caspases?

What are the consequences of cleavage?

Page 28: Multi-factor model for prediction of the caspase degradome  Lawrence Wee

RTK cleavage prediction

Prediction of RTK cleavage using the multi-factor model

52 RTKs from UniProt

Step 1 - Caspase cleavage sites predicted with CASVM

Step 2 - Cleavage sites scored for Cp, Sp and P-score

Selection of structurally favorable cleavage sites

Page 29: Multi-factor model for prediction of the caspase degradome  Lawrence Wee

RTK cleavage prediction

Prediction of potential caspase cleavage among RTKs

Global mapping of predicted caspase cleavage sites on receptor tyrosine kinases

RTK Family        RTKs     UniProt ID  Predicted Caspase Cleavage Sites (1)
EGF receptor      EGFR     P00533      321 458 587 770 916 1006 1009 1012 1083 1127 1152 1171
                  ERBB2    P04626      277 326 382 639 1016 1019 1087 1125
                  ERBB3    P21860      162 165 242 581 1010 1020 1327
                  ERBB4    Q15303      218 245 300 335 510 564 585 595 878 922 1012 1015 1018 1068 1241
Insulin receptor  INSR     P06213      75 483 526 546 549 672 704 716 949 985 1145 1210 1259 1330 1344
                  INSRR    P14616      154 585 676 816 916 1101 1166 1207 1280
                  IGF1R    P08069      156 300 342 519 539 542 675 1121 1186 1235 1294 1306
                  ROS1     P08922      100 358 483 513 684 711 842 1202 1391 1853 2058 2062 2135 2247
PDGF receptor     PDGFRA   P16234      215 244 287 422 568 733 763 846 902 919 1015 1024 1033 1074
                  PDGFRB   P09619      78 200 285 575 691 737 1091
                  CSF1R    P07333      51 63 269 741 746
                  KIT      P10721      439 479 768
                  FLT3     P36888      200 455 600 959
FGF receptor      FGFR1    P11362      69 90 110 130 131 132 133 142 218 527 768 782
                  FGFR2    P21802      75 126 135 136 138 506 521 530 785 794 795
                  FGFR3    P22607      77 136 139 143 147 497 516 521 776 792
                  FGFR4    P22455      119 129 187 240 507 516 575 770 779
VEGF receptor     VEGFR1   P17948      372 495 630 958 987 1135 1165 1168 1262
                  VEGFR3   P35916      19 45 77 304 371 556 725 728 1130 1216 1274

Page 30: Multi-factor model for prediction of the caspase degradome  Lawrence Wee

RTK cleavage prediction

Prediction of potential caspase cleavage among RTKs

Results and conclusion

• Cleavage sites are found throughout the length of the receptor

• 92% of all RTKs contain intracellular cleavage sites

• 98% contain extracellular cleavage sites

• 21% contain juxtamembrane domain cleavage sites (in cytoplasmic portion)

• 80% contain cleavage sites within the tyrosine kinase domain (in cytoplasmic portion)
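Percentages like those above depend on assigning each predicted site to a receptor region. A minimal sketch of that bookkeeping is shown below for the EGFR sites listed in the table. The region boundaries are illustrative placeholders supplied for this example only; they are not taken from the slides and would need to be replaced with curated domain annotation for each receptor. The function name is also hypothetical.

```python
# Predicted EGFR (P00533) sites, copied from the table in the slides.
EGFR_SITES = [321, 458, 587, 770, 916, 1006, 1009, 1012, 1083, 1127, 1152, 1171]

# ASSUMPTION: illustrative (start, end) residue ranges, inclusive.
# Replace with curated annotation before drawing any conclusions.
ILLUSTRATIVE_REGIONS = {
    "extracellular": (25, 645),
    "juxtamembrane": (669, 711),
    "kinase domain": (712, 979),
    "C-terminal tail": (980, 1210),
}

def classify_sites(sites, regions):
    """Group site positions by the first region whose range contains them."""
    by_region = {name: [] for name in regions}
    by_region["unassigned"] = []
    for site in sites:
        for name, (start, end) in regions.items():
            if start <= site <= end:
                by_region[name].append(site)
                break
        else:
            # No region matched (for example a transmembrane position).
            by_region["unassigned"].append(site)
    return by_region

egfr_by_region = classify_sites(EGFR_SITES, ILLUSTRATIVE_REGIONS)
for region, positions in egfr_by_region.items():
    print(region, positions)
```

Repeating this for every receptor and counting how many have at least one site per region would give family-wide percentages of the kind reported on this slide.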

Page 31: Multi-factor model for prediction of the caspase degradome  Lawrence Wee

Conclusion

Conclusion

• The multi-factor model may be applicable to other protease-substrate prediction problems.

• A two-step approach may be better than a single step.

• Other factors can be incorporated as separate steps (exosite prediction, protein-protein interactions), but correlations between the factors must be low.

Page 32: Multi-factor model for prediction of the caspase degradome  Lawrence Wee

The End