epubs.surrey.ac.ukepubs.surrey.ac.uk/814092/1/REF EURUROL-D-16-01757... · Web viewThe optimal...
Molecular subgroup of primary prostate cancer presenting with metastatic biology
Authors’ Information
Steven M. Walker 1, 2, Laura A. Knight 1, 2, Andrena M. McCavigan 2, Gemma E. Logan 2,
Viktor Berge 3, Amir Sherif 4, Hardev Pandha 5, Anne Y. Warren 6, Catherine Davidson 1,
Adam Uprichard 1, Jaine K. Blayney 1, Bethanie Price 2, Gera L. Jellema 2, Aud Svindland 3,
Simon S. McDade 1, Christopher G. Eden 5, Chris Foster 7, Ian G. Mills 1, 3, 8, 9, David E. Neal 10,
Malcolm D. Mason 11, Elaine W. Kay 12, David J. Waugh 1, D. Paul Harkin 1, 2, R. William Watson 13,
Noel W. Clarke 14, Richard D. Kennedy 1, 2
1 Centre for Cancer Research and Cell Biology, Queen’s University Belfast, 97 Lisburn Road, Belfast, BT9 7BL, UK
2 Almac Diagnostics, 19 Seagoe Industrial Estate, Craigavon, BT63 5QD, UK
3 Department of Urology, Oslo University Hospital (Aker), Oslo, N-0424, Norway
4 Department of Surgical and Perioperative Sciences, Urology and Andrology, Umeå University, Umeå, Sweden, SE-901 87
5 Department of Microbial Sciences, University of Surrey, Leggett Building, Guildford, GU2 7XH, UK
6 Department of Pathology, Addenbrooke’s Hospital, Cambridge, CB2 2QQ, UK
7 Institute of Translational Medicine, University of Liverpool, Merseyside, L69 3BX, UK
8 Department of Molecular Oncology, Oslo University Hospital/Institute for Cancer Research, Oslo, N-0424, Norway
9 Prostate Cancer Research Group, Centre for Molecular Medicine Norway (NCMM), University of Oslo and Oslo University Hospitals, Forskningsparken, Oslo, N-0349, Norway
10 Uro-oncology Research Group, Cambridge Research Institute, Cambridge, CB2 0RE, UK
11 Wales Cancer Bank, Cardiff University, Health Park, Cardiff, CF14 4XN, UK
12 Centre for Systems Medicine, RCSI, Beaumont Hospital, Dublin, Ireland
13 UCD School of Medicine, 8 Conway Institute, University College Dublin, Belfield, Dublin 4, Ireland
14 Christie NHS Foundation Trust, 550 Wilmslow Rd, Manchester, M20 4BX, UK
Corresponding Author:
Professor Richard D. Kennedy
Centre for Cancer Research and Cell Biology
Queen’s University of Belfast
Lisburn Road
Key Words: prostate cancer, prognostic, recurrence, progression, Metastatic Assay
Word Count: Total = 3155 words; Main = 2881 words; Abstract = 274 words
ABSTRACT
BACKGROUND: Approximately 4-25% of patients with early prostate cancer develop
disease recurrence following radical prostatectomy.
OBJECTIVE: To identify a molecular subgroup of prostate cancers with metastatic
potential at presentation resulting in a high risk of recurrence following radical
prostatectomy.
DESIGN, SETTING & PARTICIPANTS: Unsupervised hierarchical clustering was
performed using gene expression data from 70 primary resections, 31 metastatic lymph
nodes and 25 normal prostate samples. Independent assay validation was performed
using 322 radical prostatectomy samples from four sites with a mean follow-up of 50.3
months.
OUTCOME MEASURES & STATISTICAL ANALYSIS: Molecular subgroups were identified
using unsupervised hierarchical clustering. A partial least squares approach was used to
generate a gene expression assay. Relationships with outcome (time to biochemical and
metastatic recurrence) were analyzed using multivariable Cox regression and log-rank
analysis.
RESULTS & LIMITATIONS: A molecular subgroup of primary prostate cancer with
biology similar to metastatic disease was identified. A 70-transcript signature
(Metastatic Assay) was developed and independently validated in the radical
prostatectomy samples. Metastatic Assay positive patients had increased risk of
biochemical recurrence (multivariable HR=1.62 [1.13-2.33]; p=0.0092) and metastatic
recurrence (multivariable HR=3.20 [1.76-5.80]; p=0.0001). A combined model with
CAPRA-S identified patients at increased risk of biochemical and metastatic recurrence
superior to either model alone (HR=2.67 [1.90-3.75]; p<0.0001 and HR=7.53 [4.13-13.73];
p<0.0001, respectively). The retrospective nature of the study is acknowledged as
a potential limitation.
CONCLUSIONS: The Metastatic Assay may identify a molecular subgroup of primary
prostate cancers with metastatic potential.
PATIENT SUMMARY: The Metastatic Assay may improve the ability to detect patients at
risk of metastatic recurrence following radical prostatectomy. The impact of adjuvant
therapies should be assessed in this higher risk population.
INTRODUCTION
Although prognosis for localized prostate cancer patients following radical
prostatectomy is very good, 4-25% (dependent upon disease stage and use of
population PSA screening) will develop metastatic disease within 15 years 1,2. In
addition, patients with low and some intermediate risk prostate cancers are best
treated by active surveillance; however, there is clinical uncertainty about progression
in this population 3. Progression in low/intermediate risk disease may be due to a more
biologically aggressive genotype of primary tumours, whilst in clinically higher risk
groups there may be undetected micro-metastatic disease at presentation 4. This could
be treated by adjuvant approaches including pelvic radiotherapy 5, extended lymph
node dissection 6, adjuvant hormone therapy 7 or chemotherapy 8.
Presently metastatic risk is estimated from histopathologic grade (Gleason score and
clinical grade grouping), tumour stage and presenting PSA level. These prognostic
factors have limitations; 15% of lower-grade prostate cancer patients (Gleason≤7)
experience disease recurrence 9 whereas 74-76% of higher-grade patients (Gleason>7)
do not develop metastatic disease following surgery 10. For Gleason 7 tumours,
dominant lesion grade affects prognosis: 40% of Gleason 4+3 patients develop
recurrence by 5 years compared with 15% of Gleason 3+4 patients 11. Clearly there is a need to
identify additional prognostic factors to guide adjuvant treatment. Current approaches
can be broadly classified as mathematical risk models using clinical factors such as
CAPRA 12 and CAPRA-Surgery (CAPRA-S) 13 scoring, or biomarkers measured from
tumour tissue. Regarding biomarkers, researchers have taken immunohistochemical
approaches such as high Ki67 expression 14 or PTEN loss to indicate metastatic potential
15. Others have used multiplexing approaches where a gene expression 16-18 or proteomic
signature 19 has been trained against known outcomes to predict high and low risk
disease using archived material.
It is recognized that malignancies originating from the same anatomical site can
represent different molecular entities 20. We hypothesized that a unique molecular
subgroup of primary prostate cancers may exist that has a gene expression pattern
associated with metastatic disease. We took an unsupervised hierarchical clustering
approach using primary localised prostate cancer, primary prostate cancer presenting
with concomitant metastatic disease, lymph node metastasis and normal prostate
samples to identify a novel “metastatic molecular subgroup”. A 70-transcript signature
(Metastatic Assay) was developed using this approach and independently validated in a
cohort of radical prostatectomy samples for biochemical and metastatic recurrence.
PATIENTS & METHODS
Study design
Study design followed the reporting recommendations for tumour marker prognostic
studies (REMARK) guidelines as outlined in the criteria checklists (Supplemental Table
1 & Appendix A) and REMARK study design diagram (Supplementary Figure 1).
Patients
Formalin Fixed Paraffin Embedded (FFPE) sections from 126 samples (70 primary
prostate cancer specimens from radical prostatectomy resections including those with
known concomitant metastases, 31 metastatic disease in lymph nodes and 25
histologically confirmed normal prostate samples that did not display hypertrophy,
sourced from bladder resections) were collected from the University of Cambridge and
the Institute of Karolinska for molecular subgroup identification (Supplementary Table
2). A secondary training dataset of 75 primary resection samples was collected, of
which 20 were profiled in duplicate, to aid selection of the final signature length
(Supplementary Table 3). For independent in-silico validation, three public datasets
were identified 17,21,22: GSE25136 (n=79; Supplemental Table 4), GSE46691 (n=545;
Supplemental Table 5) and GSE21034 (n=126; Supplemental Table 6). A further 322 FFPE
prostatectomy samples from four sites were collected for independent validation of the
assay (Supplementary Table 7). Biochemical recurrence was defined as a post-
prostatectomy rise in PSA of >0.2 ng/ml followed by a subsequent rise. Metastatic
recurrence was defined as radiologic evidence of any metastatic disease, including
lymph nodes, bone and visceral metastases. Inclusion criteria were T1a-T3c NX M0
prostate cancers treated by radical-prostatectomy, no previous systemic adjuvant or
neoadjuvant treatment in non-recurrence patients and at least 3 years follow-up.
Ethical approval was obtained from East of England Research Ethics Committee (Ref:
14/EE/1066).
Metastatic-subgroup Assay Discovery
The 126 discovery samples were analyzed for gene expression using a cDNA microarray
platform optimized for FFPE tissue. Unsupervised hierarchical clustering, an unbiased
statistical method to discover structure in data, was applied to the gene expression
profiles. Genes were selected using variance-intensity ranking and then an iterative
procedure of clustering with different gene-lists to determine the optimal set for
reproducibility (Supplementary Methods). Data matrices were standardized to median
gene expression and agglomerative two-dimensional hierarchical clustering was performed
using Euclidean distance and Ward’s linkage. The optimal numbers of sample and gene
clusters were identified using the GAP statistic 23.
Gene Ontology (GO) biological process terms were used to assess the biological
significance of the gene clusters. Chi-squared or ANOVA tests were used to assess
association of sample clusters with clinical data. Class-labels were assigned to samples,
classifying the subgroup enriched with metastatic tumours as the “metastatic-subgroup”
and the subgroup enriched with normal prostate samples as the “non-metastatic-subgroup”.
A signature to identify the metastatic-subgroup was developed using partial least
squares (PLS) regression. All model development steps (pre-processing,
gene filtering/selection and model parameter estimation) were nested within 10 repeats of 5-fold
cross-validation (CV). Signature score reproducibility was also assessed across
5 separate FFPE sections, and repeatability across 20 resection samples from the
secondary training dataset profiled as technical duplicates. In sum, area under the ROC curve
(AUC), C-index performance for metastatic recurrence in the additional dataset of 75
resections and assay stability across replicates were used to guide the final number of
transcripts detected by the assay. Thresholds for dichotomizing predictions were
selected at the point where sensitivity and specificity for detecting the metastatic-
subgroup reached a joint maximum.
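
One way to read the threshold rule above is as a search over candidate cut-offs for the point that maximises sensitivity plus specificity (Youden's J). The sketch below assumes that interpretation, which the text does not spell out. The scores, labels and the function name `select_threshold` are hypothetical and are not taken from the study data.

```python
def select_threshold(scores, labels):
    """Return (threshold, sensitivity, specificity) maximising sens + spec.

    scores: continuous assay scores (hypothetical values).
    labels: 1 = metastatic-subgroup, 0 = non-metastatic-subgroup.
    A sample is called positive when its score is >= threshold.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    best = None
    for t in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 1)
        tn = sum(1 for s, y in zip(scores, labels) if s < t and y == 0)
        sens, spec = tp / n_pos, tn / n_neg
        # Keep the first (lowest) threshold in the event of a tie.
        if best is None or sens + spec > best[1] + best[2]:
            best = (t, sens, spec)
    return best


# Hypothetical toy data, for illustration only.
scores = [0.05, 0.10, 0.20, 0.30, 0.45, 0.60, 0.70, 0.90]
labels = [0, 0, 0, 1, 0, 1, 1, 1]
threshold, sens, spec = select_threshold(scores, labels)
```

Ties are broken in favour of the lower cut-off here; that is a design choice of the sketch, not something stated in the Methods.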
Statistical Assessment of Assay Performance
The performance of the Metastatic Assay regarding biochemical and metastatic
progression was assessed by sensitivity and specificity. Cox regression was used to
investigate prognostic effects of the assay with respect to time to recurrence endpoints.
The estimated effect of the assay was adjusted for PSA, age and Gleason score in a
multivariable model. A second multivariable analysis was performed to investigate the
prognostic effect of the assay when adjusting for CAPRA-S 13, whilst further assessing
additional prognostic effect of a combined model generated for the assay and CAPRA-S
together. The proportional hazards assumption was checked using a
statistical test based on the Schoenfeld residuals 24. Samples with unknown clinical
factors were excluded. All tests of statistical significance were 2-sided at 5% level of
significance.
Combined model development and application (Metastatic Assay and CAPRA-S)
A combined model using Metastatic Assay dichotomized calls and CAPRA-S
dichotomized into Low risk (CAPRA-S: 0-5) and High risk (CAPRA-S: 6-10) was assessed
in the resection validation cohort independently against biochemical and metastatic
endpoints using Cox regression analysis (Supplementary Methods). Subjects were
classified ‘Low risk’ given a combined model result Assay Negative/CAPRA-S low risk;
otherwise subjects were labelled ‘High Risk’ (i.e. samples that were classified as Assay
Negative/CAPRA-S high risk, Assay Positive/CAPRA-S low risk or Assay
Positive/CAPRA-S high risk).
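
The combined classification rule described above can be written as a short function. This is a minimal sketch: the CAPRA-S cut-point of 6 follows the dichotomization given in the text, while the function and constant names are hypothetical.

```python
# CAPRA-S 0-5 is treated as Low risk and 6 or above as High risk,
# following the dichotomization described in the text.
CAPRA_S_HIGH_MIN = 6


def combined_risk(assay_positive, capra_s):
    """Return 'Low risk' only for Assay Negative / CAPRA-S low risk subjects.

    Every other combination is labelled 'High Risk'.
    """
    capra_high = capra_s >= CAPRA_S_HIGH_MIN
    if assay_positive or capra_high:
        return "High Risk"
    return "Low risk"


# Hypothetical subjects covering the four possible combinations.
examples = [
    combined_risk(False, 3),  # Assay Negative / CAPRA-S low
    combined_risk(False, 7),  # Assay Negative / CAPRA-S high
    combined_risk(True, 2),   # Assay Positive / CAPRA-S low
    combined_risk(True, 9),   # Assay Positive / CAPRA-S high
]
```

Because a subject is high risk when either component is positive, the combined model can only flag at least as many subjects as either component alone.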
See Supplemental Methods for additional experimental detail.
RESULTS
Molecular Subtyping and Identification of a Metastatic-subgroup in the Discovery
cohort
We hypothesized a molecular subgroup of poor prognosis primary prostate cancers
would be transcriptionally similar to metastatic disease. To identify this subgroup, we
measured gene expression in primary prostate cancers, primary prostate cancers with
known concomitant metastases, metastatic lymph node samples and histologically
confirmed normal prostate tissue (Supplementary Table 2).
Unsupervised hierarchical clustering identified two sample groups and two gene
clusters (Figure 1A). Importantly, one of the molecular subgroups (C1) demonstrated
significant enrichment for primary cancers with known concomitant metastatic disease
(Figure 1A & 1B, chi-squared p<0.0001). In addition, the C1 group contained all
metastatic lymph node samples and no normal prostate samples. We defined this
subgroup as the ‘metastatic-subgroup’ and the other (C2) the ‘non-metastatic-
subgroup’.
Identifying Metastatic-subgroup Biology
A feature of the metastatic-subgroup was loss of gene expression observed in Gene
cluster 1 (G1) (Figure 1A & Supplementary Table 8). To investigate if loss of gene
expression was due to epigenetic silencing we measured DNA methylation in 8
metastatic and 14 non-metastatic-subgroup samples (Supplementary Table 9). Semi-
supervised hierarchical clustering of the methylation data of down-regulated genes (G1)
separated the samples into 2 groups (Supplementary Figure 2 & Supplementary Table
10), with 7/8 samples (88%) from the metastatic-subgroup (M2), and 10/14 samples
(71%) from the non-metastatic-subgroup clustering together (M1) (chi-squared,
p=0.02). Functional analysis demonstrated that the metastatic-subgroup had higher
levels of methylation in genes that negatively regulate pathways known to be involved
in aggressive prostate cancer, such as WNT and growth signalling (Supplementary Table
11) 25. Together these data suggest that epigenetic silencing is a feature of the
metastatic-subgroup and may therefore be important in metastasis.
To better understand molecular processes upregulated in the metastatic-subgroup we
performed differential gene expression analysis, identifying 222 genes that were over-expressed.
Ingenuity Pathway Analysis (IPA) (www.ingenuity.com) identified 2 up-regulated
pathways in the metastatic-subgroup (FDR p<0.05). The ToppGene Suite 26 identified
18 up-regulated pathways (FDR p<0.05) (Supplementary Table 12). These pathways
represented mitotic progression and Forkhead Box M1 (FOXM1) pathways.
Consistently, FOXM1 was 2.80 fold over-expressed in the metastatic-subgroup.
Development of a Metastatic Assay
Next we developed an assay that could identify metastatic-subgroup tumours
(Supplementary Figure 3). Computational classification using PLS-regression resulted
in a 70-transcript Metastatic Assay. In the training set, the AUC under CV for detecting
the metastatic-subgroup was 99.1 [98.5-99.8]. The standard deviation (SD) in assay
scores using 5 separate sections from the same tumour was 0.06 representing 6.9% of
the assay range and 100% agreement in assay call. In a secondary training dataset of 75
primary resections, the C-index for detecting the metastatic subgroup was 90.4, with an
SD in assay scores using 20 patient samples with technical replicates of 0.02
representing 2.9% of assay range (Supplementary Figure 4).
Importantly, as the assay was trained against a distinct molecular subgroup rather than
clinical outcome, there was a bimodal distribution of scores (Supplementary Figure 5).
The Metastatic Assay gene list and weightings are listed in Supplementary Table 13.
Metastatic Assay performance in public datasets
The assay was applied to three independent public prostate cancer resection gene
expression datasets. Assay scores were calculated using the partial least squares model
and dichotomized into assay-positive and assay-negative (Supplemental Methods). In
the first (n=79) 21 the assay was significantly associated with biochemical recurrence
with a sensitivity of 70.3% and specificity of 66.7% (Chi-square p=0.0049). In a second,
(n=545) 17, the assay was significantly associated with metastatic progression with a
sensitivity of 67.0% and specificity of 54.6% (Chi-square p<0.0001). Using a third
dataset with time to event data, (n=126) 22, multivariable analysis adjusting for Gleason
(grades represented in four subgroups), age and PSA demonstrated increased risk of
biochemical recurrence (HR=3.03 [1.43-6.41]; p=0.0040) (Table 1)(Figure 2B).
However, possibly due to the small number of metastatic events (11%), the association
with metastatic recurrence in multivariable analysis did not reach statistical significance (HR=2.53
[0.67-9.54]; p=0.1735) (Table 1).
Metastatic Assay performance in an independent primary prostate cancer resection
dataset
The assay was then applied to 322 FFPE prostatectomy samples from four clinical sites
with a median follow-up of 50.3 months using predefined inclusion/exclusion criteria per
REMARK guidelines (Supplementary Figure 1). A pre-defined assay cut-off of 0.3613
was used to define Metastatic Assay positivity. On multivariable analysis a positive
assay result was associated with increased risk of biochemical recurrence (HR=1.62
[1.13 – 2.33]; p=0.0092) (Figure 3A and Table 2) and metastatic recurrence (HR=3.20
[1.76 – 5.80]; p=0.0001) (Table 2 and Figure 3B).
Comparison of the Metastatic Assay with clinical risk stratification
To test assay independence from approaches used in the clinic, we assessed its
performance within risk groups defined by Gleason score and the CAPRA-S model in the
independent resection validation cohort. When separated by Gleason (high risk GS≥4+3
and low risk GS≤3+4) the Metastatic Assay identified patients at higher risk of
metastatic recurrence (HR=2.43 [1.14-5.17]; p=0.0036 and HR=5.61 [1.19-26.47];
p=0.0013 in the high and low risk GS groups, respectively) (Figure 3C).
The CAPRA-S prognostic model uses PSA at presentation, age, Gleason score, T-stage,
seminal vesicle invasion (SVI), extracapsular extension (ECE), lymph node invasion
(LNI) and surgical margins 13. In multivariable analysis adjusted for CAPRA-S, both the
Metastatic Assay and CAPRA-S were significantly associated with biochemical
recurrence (HR=1.72 [1.19-2.48]; p=0.0042 and HR=2.52 [1.79-3.54]; p<0.0001) and
development of metastatic disease (HR=2.94 [1.60-5.40]; p=0.0005 and HR=4.76 [2.46-
9.23], p<0.0001) (Table 2). Given the independence of the Metastatic Assay result and
CAPRA-S score, a combined model was assessed. Patients classified as high risk by the
combined model (Assay Positive and/or CAPRA-S high risk) were at significantly increased risk of both
biochemical and metastatic recurrence (HR=2.67 [1.90-3.75]; p<0.0001 and HR=7.53
[4.13-13.7]; p<0.0001, respectively), demonstrating superiority to either model alone
(Figure 4 and Table 2, Combined Model).
To assess the clinical impact of the combined model of Metastatic Assay plus CAPRA-S,
additional performance metrics were assessed for the metastatic endpoint in the
independent resection validation cohort. As the assay was dichotomous, the
comparison of sensitivity and specificity between the Metastatic Assay alone, CAPRA-S
alone and the Combined Model was investigated. Whilst the sensitivity of CAPRA-S
(70.5%) was greater than that of the Metastatic Assay alone (47.7%), there was an
increase in sensitivity to 80.1% when the two were combined in the model. There was, however, a
decrease in specificity from 81.9% (Metastatic Assay) and 71.5% (CAPRA-S) to 61.1% in
the Combined Model, which may reflect patients who have not yet experienced
recurrence within the 50.3 month median follow-up (Supplementary Table 15).
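
The direction of these changes follows from the way the Combined Model is built: a subject is called high risk when either component is positive, so sensitivity cannot fall below that of either component and specificity cannot rise above it. The sketch below illustrates this with a hypothetical ten-subject cohort; the data and names are invented for illustration and do not reproduce the study results.

```python
def sens_spec(calls, events):
    """Return (sensitivity, specificity) for binary calls against observed events."""
    tp = sum(1 for c, e in zip(calls, events) if c and e)
    tn = sum(1 for c, e in zip(calls, events) if not c and not e)
    n_event = sum(1 for e in events if e)
    n_no_event = len(events) - n_event
    return tp / n_event, tn / n_no_event


# Hypothetical toy cohort: 1 = metastatic recurrence / positive call.
events     = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
assay_pos  = [1, 1, 0, 0, 1, 0, 0, 0, 0, 0]
capra_high = [0, 1, 1, 0, 0, 1, 1, 0, 0, 0]

# High risk if either component is positive (the rule given in the Methods).
combined = [a or c for a, c in zip(assay_pos, capra_high)]

assay_ss = sens_spec(assay_pos, events)
capra_ss = sens_spec(capra_high, events)
combined_ss = sens_spec(combined, events)
```

In this toy cohort the combined calls catch more recurrences than either component while flagging more non-recurrent subjects, which is the same trade-off reported above.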
Metastatic Assay performance as a continuous predictor of recurrence
A Combined Model of continuous Metastatic Assay scores and CAPRA-S had higher
performance for predicting metastatic recurrence, with the highest C-index, HR and AUC
compared to either metric alone, within two validation cohorts (MSKCC: AUC=0.88
[0.81-0.93], HR 1.55 [1.26-1.91]; p<0.0001, C-index=0.83 [0.74-0.91] (Supplementary
Table 16) and Independent Resection Validation: AUC=0.80 [0.74-0.85], HR 1.66 [1.43-1.93];
p<0.0001, C-index=0.82 [0.76-0.86] (Supplementary Table 17)). The Metastatic
Assay is an independent predictor of both biochemical and metastatic recurrence when
assessed as a continuous variable in multivariate analysis in two validation datasets
(MSKCC: HR 2.00 [1.24-3.24]; p=0.0050 and HR 2.99 [1.10-8.17]; p=0.0334 and
Independent Resection Cohort: HR 1.16 [1.03-1.30]; p=0.0155 and HR 1.52 [1.24-1.85];
p<0.0001 (per 0.1 unit change in assay score)) (Supplementary Table 18).
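
Hazard ratios for a continuous score depend on the step size they are reported for. Assuming the usual Cox model form, in which the log-hazard is linear in the score, a hazard ratio quoted per 0.1 unit can be rescaled to any other step. The sketch below is illustrative only; the function name is hypothetical and the 0.2 unit step is an arbitrary example.

```python
import math


def rescale_hazard_ratio(hr, from_step, to_step):
    """Convert a hazard ratio reported per `from_step` units to per `to_step` units.

    Assumes the log-hazard is linear in the score, as in a standard Cox model.
    """
    beta = math.log(hr) / from_step  # log-hazard change per one unit of score
    return math.exp(beta * to_step)


# HR of 1.52 per 0.1 unit (metastatic recurrence, Independent Resection Cohort)
# expressed per 0.2 unit change in assay score.
hr_per_0_2 = rescale_hazard_ratio(1.52, 0.1, 0.2)
```

Doubling the step squares the hazard ratio, so 1.52 per 0.1 unit corresponds to roughly 2.31 per 0.2 unit under this assumption.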
DISCUSSION
The majority of early prostate cancer patients treated by radical resection are cured.
However, up to 25% of patients develop metastatic disease within 15 years 1,2. In
surveillance for low/intermediate risk disease there is concern about risks of clinical
under-grading and disease progression, with a proportion of patients needing
treatment within 5 years 3. This engenders clinical uncertainty in modern practice in
two key areas: firstly in the appropriate and safe selection of patients for active
surveillance, particularly in the Gleason 3+4 intermediate group, and secondly in
patients undergoing radical local treatment for intermediate and higher grade tumours,
where adjuvant loco-regional and systemic treatment may improve outcome. A test
which helps to select patients at higher risk of progression in these settings will have
significant clinical utility.
Several prognostic gene expression assays have been developed by comparing gene
expression data between good and poor outcome patients 16-18. In contrast, we identified
a molecular subgroup of primary prostate cancer samples that shared biology with
metastatic disease. We developed an assay for this molecular subgroup which identified
patients at risk of biochemical and metastatic recurrence in three publicly available and
one prospectively collected multicentre dataset.
Consistent with the molecular subgroup representing metastatic biology, the assay was
better at predicting metastatic progression than biochemical recurrence. The
latter does not necessarily predict metastatic development; only one third of patients
with biochemical recurrence develop measurable metastatic disease 8 years after
resection 27. In addition, the HR of 3.20 for metastatic recurrence compares favourably
to the reported hazard ratios for other prognostic assays to predict metastatic disease,
HRs ranging between 1.40 and 3.30 16-18. A significant feature of assay performance was
independence from CAPRA-S, allowing the development of a combined risk model with
superior performance to either CAPRA-S or the Metastatic Assay individually.
An interesting feature of the metastatic-subgroup was methylation and loss of
expression of genes such as OLFM4, which is known to inhibit metastatic processes including WNT
signalling 28. It is therefore possible that novel therapies aimed at reversing epigenetic
silencing or targeting WNT signalling may act against the metastatic biology in this
molecular subgroup 29. Regarding up-regulated genes in the metastatic-subgroup, a
significant proportion were regulated by FOXM1, which is known to promote prostate cancer
progression 30. Indeed, others have found increased FOXM1 gene expression to be
prognostic and have included it in a 31-gene expression assay 16. Interestingly, only
6 of the 70 genes in the Metastatic Assay overlapped with three prognostic signatures that are
entering clinical practice (AZGP1 18; PTTG1, TK1 and KIF11 16; ANO7 and MYBPC1 17).
The overlap with the GenomeDx (p=0.06), GHI (p=0.16) and Myriad (p=0.06) signatures did not
reach statistical significance after Benjamini-Hochberg correction for multiple testing,
likely reflecting the distinct approach of
molecular subtyping versus trained endpoint analysis (Supplementary Figure 6).
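
For readers unfamiliar with the Benjamini-Hochberg procedure mentioned above, the sketch below shows how adjusted p-values are obtained from a set of raw p-values. The raw values used here are hypothetical and are not the unadjusted values from this study, which are not reported in the text.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw p-values for three comparisons.
raw = [0.01, 0.04, 0.03]
adj = benjamini_hochberg(raw)
```

Each raw p-value is scaled by the number of tests divided by its rank, and the running minimum keeps the adjusted values from decreasing as the raw values increase.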
A potential limitation of this study is the retrospective validation of the assay in historic
datasets. Diagnostic and surgical approaches have improved with time, which may
reduce disease recurrence. We expect, however, that the effect of these improvements
would mostly be on local recurrence, whereas this assay has been developed to predict
metastatic disease progression, which is likely to be largely beyond surgical control at presentation.
CONCLUSIONS
We have identified a molecular subgroup of primary prostate cancer with metastatic
capacity. We hypothesize that using this molecular subtyping approach may improve
stratification of patients considering active surveillance and may benefit patients with
higher risk clinically localized disease by focusing loco-regional and systemic adjuvant
therapy in those at highest risk of regional and systemic failure.
AUTHOR CONTRIBUTIONS
Study concept and design: Walker, Harkin and Kennedy.
Acquisition of data: Walker, Knight, Logan, Blayney, McCavigan, Price and Jellema.
Analysis and interpretation of data: Walker, Knight and Kennedy.
Writing of the manuscript: Walker, Logan, Knight, Clarke and Kennedy.
Critical revision of the manuscript for important intellectual content: Waugh, Mills, Neal,
Clarke and Harkin.
Obtaining funding: Kennedy, Harkin.
Administrative, technical or material support: Sherif, Warren, Neal, Berge, Svindland,
Pandha, Mason, McDade, Watson, Davidson, Uprichard and Kay.
ACKNOWLEDGEMENTS
We acknowledge the Welsh Cancer Biobank/Cardiff University Health, Irish Prostate
Cancer Research Consortium Biobank, the Northern Ireland Biobank and The Prostate
Biobank associated with Oslo University Hospital, along with the members of their tissue
acquisition teams. In particular we thank E. Smith (University of Surrey) and L. Spary
(Welsh Cancer Bank) for their support in acquiring samples and corresponding clinical
data from the clinical sites. We would also like to thank J. Fay (RCSI, Beaumont Hospital)
for continued support and guidance with pathology. In addition, this work was
supported by the Belfast-Manchester Movember Centre of Excellence (CE013_2-004),
funded in partnership with Prostate Cancer UK (Waugh, Clarke and Mills) and by
European Regional Development Fund through Invest Northern Ireland (INI), Ref:
RD1208001 and RD0115336.
SUPPLEMENTARY METHODS
Patients
The Discovery cohort was collected in collaboration with two clinical sites, University of
Cambridge and Institute of Karolinska. These samples were anonymous without clinical
outcome and were solely used for molecular subgroup identification. Samples ranged in
type with 56% primary tumours, 17% primary tumours with concomitant metastases,
8% known metastatic disease and 19% normal. In total, 44% of patients had a high
Gleason score (>7), 19% had Gleason 7 and the remaining patients were either <7 or
unknown (Supplementary Table 2). Of the Discovery cohort, 22 patient samples were
selected for methylation analysis. In total, 36% (8/22 patients) were ‘Assay Positive’
within the C1 subgroup of metastatic biology, with 72% harboring a Gleason score >7
(Supplementary Table 9). Within the secondary training dataset, 7% had a Gleason score of
<7, 77% Gleason 7 and 16% Gleason >7. The median pre-operative PSA level was 7.7
ng/ml and the median age was 58 years. The first in silico dataset (GSE25136) consisted
of 79 patients in total, with a 53% to 47% split of non-recurrence to recurrence patients.
Overall, 56% of patients had a resection Gleason score of 7, with equal proportions
of patients (22% each) having a Gleason score of <7 or >7. The median pre-operative PSA
level was 7.6 ng/ml and the median age was 61.2 years (Supplemental Table 4). An
additional in silico dataset (GSE46691) comprised 545 patients in total, of whom 39%
had known metastatic progression. Exactly half the population had a Gleason score of 7, with
38% >7 and 12% <7 (Supplemental Table 5). The final in silico dataset (GSE21034)
consisted of 126 patients in total, of whom 25% had biochemical recurrence
and 11% had metastatic progression. Overall, 57% of patients had a resection
Gleason score of 7, with 11% >7 and 32% <7. The median pre-operative PSA level was
5.92 ng/ml and the median age was 57.6 years (Supplemental Table 6).
Prior to acquiring the retrospective validation cohort, a power calculation was
performed using a hazard ratio of 2.0. Using preliminary data we estimated that the
metastatic group (Assay Positive) is approximately 30% of the population, with a
recurrence rate of 40%. A cohort of 263 patients with approximately 105 recurrence
events would therefore give a study power of approximately 90% at a significance level of 0.05. The
retrospective validation cohort was collected in collaboration with four clinical sites,
University Hospital of Oslo, Wales Cancer Bank, University of Surrey and the Irish
Prostate Cancer Research Consortium. Samples ranged across recurrence subgroups
with 53% non-recurrence, 32% biochemical recurrence and 15% known metastatic
progression. Median time to recurrence event was 12 months (biochemical) and 3
months (metastatic). In total 17% of patients had a high Gleason score >7, with 61%
having a Gleason 7 and the remaining patients either <7 or unknown. The majority of
patients (99%) had a pathological T-stage of either T2 or T3. The median pre-operative PSA
level was 8.4 ng/ml and the median age was 62 years. Seminal vesicle invasion (19%),
lymph node invasion (5%), extracapsular extension (30%) and positive surgical
margins (32%) were also appropriately represented across the validation cohort
(Supplementary Table 7).
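
The power calculation described at the start of this section does not state which formula was used. As a rough check, the sketch below assumes Schoenfeld's approximation for the number of events needed to detect a given hazard ratio between two groups, d = (z_{1-α/2} + z_{power})² / (p(1-p)(ln HR)²), where p is the proportion of Assay Positive patients. The function name, and the reading of the 40% figure as the overall event rate, are assumptions.

```python
import math
from statistics import NormalDist


def required_events(hazard_ratio, prop_positive, alpha=0.05, power=0.90):
    """Events needed under Schoenfeld's approximation (two-sided test)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    denom = prop_positive * (1 - prop_positive) * math.log(hazard_ratio) ** 2
    return (z_alpha + z_beta) ** 2 / denom


# Inputs taken from the text: HR 2.0, 30% Assay Positive, 90% power, alpha 0.05.
events = math.ceil(required_events(2.0, 0.30))
# Assumed overall recurrence rate of 40% converts events to patients.
patients = math.ceil(events / 0.40)
```

Under these assumptions the sketch gives 105 events and 263 patients, in line with the figures quoted in the power calculation.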
Whilst overarching ethical approval was obtained, additional clinical site ethical
approval was also obtained from collaborators, namely The Prostate Biobank Oslo and
the Irish Prostate Cancer Research Consortium Biobank/Mater Misericordiae University
Hospital ethics committee.
Molecular Profiling of Prostate Cancer samples
Samples were pathology reviewed to identify the most dominant Gleason grade within
the tumour for macrodissection. Total RNA was extracted from 2x10 µm
macrodissected FFPE tissue slides using the Roche High-Pure RNA Paraffin Kit (Roche
Diagnostics GmbH, Mannheim, Germany). RNA was converted into cDNA, amplified and
converted into single-stranded form using SPIA® technology of the WT-Ovation™ FFPE
RNA Amplification System (NuGEN Technologies Inc., San Carlos, CA, USA). Amplified
cDNA was fragmented, biotin-labelled using FL-Ovation™ cDNA Biotin Module (NuGEN
Technologies Inc.), and hybridized to the Almac Prostate Cancer DSA™. Arrays were
scanned using the Affymetrix GeneChip® Scanner 7G (Affymetrix Inc., Santa Clara, CA,
USA). Stratagene Universal Human Reference (UHR) samples and ES-2 cell lines were
used as process controls.
Methylation Profiling of Prostate Cancer samples
For the 22 patients, 8 metastatic-subgroup and 14 non-metastatic-subgroup, DNA was
extracted using RecoverAll (Life Technologies). Genomic DNA (800 ng) was treated with
sodium bisulfite using the Zymo EZ DNA Methylation Kit™ (Zymo Research, Orange,
CA, USA) according to the manufacturer’s procedure, with the alternative incubation
conditions recommended when using the Illumina Infinium Methylation Assay. The
methylation assay was performed on 4 µl bisulfite-converted genomic DNA at 50 ng/µl
according to the Infinium HD Methylation Assay protocol. Samples were processed onto
Illumina 450k arrays as per manufacturer’s procedures.
Data preparation & Quality Control (QC)
Microarray Data
Samples were pre-processed using the Robust Multi-array Average (RMA) methodology
30. The QC assessment comprised a combination of quality metrics,
including array image analysis, GeneChip QC, principal components analysis (PCA) and
intensity distribution analysis. Array data were examined to identify any image artefacts.
As part of the GeneChip QC, percent present (%P), average signal absent, scale factor,
average background and raw Q were all assessed. Samples with a %P<15% were
deemed a QC fail. Hotelling’s T² and residual Q methods were used to identify
sample outliers at the expression level within the PCA. Finally, the Kolmogorov-Smirnov
statistic 31 was used to examine the intensity distribution of the samples and
identify outliers.
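Two of the QC checks above, the %P threshold and the Kolmogorov-Smirnov comparison of intensity distributions, can be sketched in a few lines. This is an illustrative sketch only: the text does not say what each array's distribution was compared against, so comparing each array with the pooled intensities of all other arrays is an assumption, and `ks_statistic` and `qc_report` are hypothetical names. The Hotelling's T² and residual Q checks are not covered here.

```python
from bisect import bisect_right


def ks_statistic(a, b):
    """Two-sample Kolmogorov-Smirnov statistic: largest gap between empirical CDFs."""
    a, b = sorted(a), sorted(b)
    d = 0.0
    for x in a + b:
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        d = max(d, abs(cdf_a - cdf_b))
    return d


def qc_report(percent_present, intensities, pp_min=15.0):
    """Return {array: (fails_percent_present, ks_vs_other_arrays)}.

    pp_min=15 is the %P threshold from the text; the pooled comparison is an
    assumption made for illustration.
    """
    report = {}
    for name, values in intensities.items():
        pooled = [v for other, vals in intensities.items() if other != name for v in vals]
        report[name] = (percent_present[name] < pp_min, ks_statistic(values, pooled))
    return report


# Toy data (hypothetical): s3 has a low %P and a shifted intensity distribution.
percent_present = {"s1": 42.0, "s2": 38.5, "s3": 12.0}
intensities = {
    "s1": [5.1, 6.0, 7.2, 8.3],
    "s2": [5.0, 6.1, 7.0, 8.5],
    "s3": [9.5, 10.2, 11.0, 12.4],
}
report = qc_report(percent_present, intensities)
for name, (fails_pp, ks) in report.items():
    print(name, "fails %P" if fails_pp else "passes %P", f"KS={ks:.3f}")
```

Arrays with the largest KS statistic would be the candidates for closer inspection; any numerical cut-off on the statistic would need to be chosen separately.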
Methylation Data
Raw data were processed using the R package “lumi”, which was used to
correct any colour bias and to normalise the data using a quantile approach.
Uncorrected β-values were extracted using the same software.
Hierarchical clustering
Genes were ranked based on variance and intensity (variance high → low; intensity high
→ low). A two-step process was implemented to determine the optimal data matrix size
to be used for subgroup identification (unsupervised analysis). Firstly, the most stable
number of sample groups was identified. Secondly, the optimal number of genes leading
to the identified number of sample groups was determined. The most stable number of
sample groups was identified by subsetting the ranked and sorted data matrix into 50
sub-matrices in increments of 100 genes (maximum 5000 genes). The GAP statistic 21 was run to
determine the optimal number of sample groups in each of these sub-matrices. This
index gives an indication of the within-cluster tightness and between-cluster
separateness. The smallest number of genes generating the optimal sample cluster
number was selected as the list of most variable genes to take forward for unsupervised
subgroup identification. For the purpose of clustering, the data matrices were
standardized to the median value of gene expression. Standardization of the data allows
the comparison of different genes’ expression levels, which may not necessarily be on
the same scale or at the same intensity levels. Following standardization, 2-dimensional
hierarchical clustering was performed (samples x genes). Euclidean distance was used
to calculate the distance matrix, which is a multidimensional matrix representing the
distance from each data point (gene-sample pair) to all the other data points. Ward’s
linkage method was subsequently applied to join the samples and genes together, based
on the calculated distance matrix. In order to determine the optimal number of sample
clusters and gene clusters, the GAP statistic was calculated for a range of potential
clusters.
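The data preparation steps that precede clustering can be sketched as below. This is a simplified illustration under stated assumptions: genes are ranked on variance only (the text also ranks on intensity), "standardized to the median" is read as subtracting each gene's median, and all function names are hypothetical.

```python
from statistics import median, pvariance


def rank_genes_by_variance(expr):
    """Return gene names sorted from highest to lowest variance across samples."""
    return sorted(expr, key=lambda g: pvariance(expr[g]), reverse=True)


def nested_gene_sets(ranked_genes, step=100, max_genes=5000):
    """Yield the top-100, top-200, ... gene lists (50 lists for 5000 genes)."""
    upper = min(max_genes, len(ranked_genes))
    for size in range(step, upper + 1, step):
        yield ranked_genes[:size]


def median_centre(values):
    """Standardise one gene's expression to its median across samples."""
    m = median(values)
    return [v - m for v in values]


# Toy expression matrix (hypothetical): gene -> expression across three samples.
expr = {
    "GENE_A": [5.0, 5.1, 4.9],
    "GENE_B": [2.0, 8.0, 5.0],
    "GENE_C": [6.0, 7.0, 8.0],
}
ranked = rank_genes_by_variance(expr)
subsets = list(nested_gene_sets(ranked, step=1, max_genes=3))
centred = {g: median_centre(expr[g]) for g in ranked}
print(ranked, [len(s) for s in subsets])
```

Each sub-matrix would then be passed to a GAP statistic routine, and the chosen matrix to a hierarchical clustering implementation with Euclidean distance and Ward's linkage, for example `scipy.cluster.hierarchy.linkage` with `method="ward"`; the choice of library is an assumption, as the text does not name the software used.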
Functional enrichment analysis was performed to determine the significance of each
gene cluster. Enrichment analysis consisted of the comparison of the gene list of interest
to other gene lists of known function grouped according to the GO classification
“Biological processes” (entities). Entities were ranked according to a statistically
derived enrichment score 32 and adjusted for multiple testing 34, thereby measuring the
likelihood that the association between the gene set of interest and a
given process is due to chance.
Differential Expression Analysis
The pre-processed data was filtered to remove all Affymetrix AFFX control probe sets
and uninformative probe sets whose expression resides in the background noise region
(background filtering). Background filtering was performed based on a combination of
the expression and the variance of individual probe sets. Expression selects those probe
sets whose average expression is above the threshold defined by σBg at the
user-specified significance level α. The variance selects those probe sets whose variance is
above the variance of the background σBg. A t-test was performed on the reliably
detected probe set list to establish the variance contributions of the factor of interest
(cluster). Multiple test correction (MTC) was applied using the False Discovery Rate (FDR,
33). Data were filtered based on a fold change greater than 2 and an adjusted p-value below
0.05. Functional enrichment analysis was performed on the resulting gene list to
provide insight into pathways associated with the genes in the list. Using the commercial
software IPA, functional enrichment analysis was conducted to identify and rank
biological entities associated with the gene sets of interest 32.
Entities were ranked according to a statistically derived enrichment score 34 and
adjusted for multiple testing 33 (pFDR threshold < 0.05), thereby measuring the
likelihood that the association between the gene set of interest and a given
process or pathway is due to chance.
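The fold-change and FDR filter described above can be illustrated as follows. This sketch assumes the Benjamini-Hochberg procedure for the FDR adjustment and reads "fold change greater than 2" as an absolute log2 fold change above 1 in either direction; neither detail is confirmed by the text, and the function names and probe set IDs are hypothetical.

```python
import math


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    n = len(p_values)
    order = sorted(range(n), key=lambda i: p_values[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * n / rank)
        adjusted[i] = running_min
    return adjusted


def select_probesets(results, min_fold=2.0, max_fdr=0.05):
    """Keep probe sets with |fold change| > min_fold and adjusted p < max_fdr.

    `results` maps probe set ID -> (log2 fold change, raw p-value).
    """
    ids = list(results)
    adj = benjamini_hochberg([results[i][1] for i in ids])
    log2_cut = math.log2(min_fold)
    return [i for i, q in zip(ids, adj) if abs(results[i][0]) > log2_cut and q < max_fdr]


# Toy t-test results (hypothetical values).
results = {
    "ps1": (1.8, 0.005),
    "ps2": (0.4, 0.01),
    "ps3": (-2.1, 0.03),
    "ps4": (1.5, 0.6),
}
selected = select_probesets(results)
print(selected)
```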
Signature Generation
The following steps summarize the procedure for developing the gene signature:
1. Cross-validation : The samples were randomly split into 5 cross-validation (CV)
folds for signature training/testing, and this was repeated 10 times to allow an
unbiased estimation of the model performance.
2. Pre-processing : RMA background correction of the data at the probe intensity
level, followed by a median summary of the intensities of probes to probe sets
and subsequently probe sets to Entrez gene ID. The Entrez gene level
summarized data matrix was log2 transformed and quantile normalized. Note
that samples in the CV test set were normalized using a quantile normalisation
model from the corresponding CV training set to ensure that all estimates of
model performance are based on signature scores pre-processed on a per sample
basis.
3. Filtering : A gene filter was applied before model development to remove 75
percent of genes with low variance and low intensity.
4. Machine Learning : Partial Least Squares (PLS) was used to train the algorithm
against the “metastatic-subgroup” endpoint.
5. Feature Selection : A wrapper based method for feature selection was
implemented, where genes (those remaining after the initial filter) are ranked
using the respective weights defined by the PLS algorithm and 10 percent of
genes with the lowest absolute weights are removed. This process is repeated
after each round of feature elimination (within cross-validation): the genes
are re-ranked to identify those with the lowest absolute weights,
and 10 percent are removed each time until only 2 genes remain.
6. Interim validation data set : Five separate sections across an FFPE tumour block
were profiled in order to evaluate the impact of biological heterogeneity on the
signature score. A secondary training dataset of 75 samples, of which 20 were
profiled in duplicate, was additionally used to guide signature selection.
Signature scores for each of these sections were calculated under CV alongside
each CV test set.
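The resampling and feature-elimination schedule in the steps above can be sketched as follows. This is an illustration only: the PLS fitting itself is omitted, rounding the 10 percent down (with at least one gene removed per round) is an assumption, and `repeated_cv_folds` and `elimination_schedule` are hypothetical names.

```python
import random


def repeated_cv_folds(sample_ids, n_folds=5, n_repeats=10, seed=0):
    """Yield (repeat, fold, train_ids, test_ids) for repeated k-fold CV."""
    rng = random.Random(seed)
    for repeat in range(n_repeats):
        shuffled = list(sample_ids)
        rng.shuffle(shuffled)
        for fold in range(n_folds):
            test = shuffled[fold::n_folds]
            test_set = set(test)
            train = [s for s in shuffled if s not in test_set]
            yield repeat, fold, train, test


def elimination_schedule(n_genes, drop_fraction=0.10, min_genes=2):
    """Signature lengths visited when 10% of genes are dropped per round."""
    sizes = [n_genes]
    while sizes[-1] > min_genes:
        n_drop = max(1, int(sizes[-1] * drop_fraction))
        sizes.append(max(min_genes, sizes[-1] - n_drop))
    return sizes


# 5 folds x 10 repeats gives 50 train/test splits.
splits = list(repeated_cv_folds([f"S{i}" for i in range(20)]))
sizes = elimination_schedule(100)
print(len(splits), sizes[:4], sizes[-1])
```

Within each split, a model would be trained at every signature length in the schedule, with the test-fold samples normalised against the training-fold reference as described in step 2.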
Model selection included the following steps:
1. Evaluating the Area under the Receiver Operating Characteristic (ROC) Curve
(AUC) in the training data and C-index performance in the secondary training
dataset under cross validation.
2. Evaluating the variability in signature scores across the five separate sections of
FFPE material which were predicted under CV. The variability was determined
by calculating the standard deviation (SD) of the signature scores across the five
samples and expressing the SD as a fraction of the signature score range (i.e.
calculating a percent SD).
3. Evaluating the variability in signature scores across 20 patients with technical
replicates which were predicted under CV. The variability was determined by
calculating the pooled standard deviation (SD) of the signature scores across the
20 patient technical replicates and expressing the SD as a fraction of the
signature score range (i.e. calculating a percent SD).
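The two variability measures used in steps 2 and 3 can be written out as below. This is a minimal sketch: the dataset from which the signature score range is taken is not specified in the text, so it is passed in as a parameter, and the function names are hypothetical.

```python
from math import sqrt
from statistics import stdev, variance


def percent_sd(scores, score_range):
    """SD of repeated scores expressed as a percentage of the score range."""
    return 100.0 * stdev(scores) / score_range


def pooled_sd(replicate_groups):
    """Pooled SD across patients, each with two or more technical replicates."""
    num = sum((len(g) - 1) * variance(g) for g in replicate_groups)
    den = sum(len(g) - 1 for g in replicate_groups)
    return sqrt(num / den)


# Hypothetical values: five sections from one block, and three duplicate pairs.
section_scores = [0.31, 0.35, 0.29, 0.33, 0.32]
replicates = [[0.30, 0.34], [0.52, 0.50], [0.10, 0.16]]
score_range = 1.2  # assumed overall range of signature scores

print(f"section percent SD: {percent_sd(section_scores, score_range):.1f}%")
print(f"replicate percent SD: {100.0 * pooled_sd(replicates) / score_range:.1f}%")
```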
The signature length that yielded a high AUC in the training set, a high C-index in the
secondary training set and low SDs in both the reproducibility samples and clinical
technical replicates was selected. Following migration of the Metastatic Assay to a
platform with an improved chemistry (NuGEN Ovation FFPE WTA V3), a technical bias
adjustment was applied to the assay threshold, which was used to dichotomize assay
scores for the resection clinical validation cohort.
Generation of Metastatic Assay Scores
Probeset expression was summarized to an Entrez Gene ID level using the median
value. Assay scores were calculated using the partial least squares model:
$$\text{Signature score} = \sum_i w_i (x_i - b_i) + k$$
where $w_i$ is the weight of each Entrez gene, $x_i$ is the gene expression, $b_i$ is the Entrez gene-specific
bias and $k = 0.4365$. Assay calls were assigned based upon a predefined cut-off for
resection (0.3613). Samples with a continuous score greater than the cut-off were labelled ‘assay
positive’; all others were labelled ‘assay negative’.
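As an illustration of the scoring step, the sketch below assumes the score is a weighted sum of bias-adjusted gene expression values plus the constant k. The constants 0.4365 and 0.3613 are taken from the text; the gene names, weights and biases are hypothetical placeholders (the actual values are listed in Suppl. Table 13).

```python
K = 0.4365        # constant k quoted in the text
CUT_OFF = 0.3613  # predefined resection cut-off quoted in the text


def signature_score(expression, weights, biases, k=K):
    """Weighted sum of bias-adjusted gene expression plus the constant k."""
    return sum(w * (expression[g] - biases[g]) for g, w in weights.items()) + k


def assay_call(score, cut_off=CUT_OFF):
    """Scores strictly above the cut-off are called assay positive."""
    return "assay positive" if score > cut_off else "assay negative"


# Hypothetical two-gene example; real signatures use the published gene list.
weights = {"GENE_A": 0.02, "GENE_B": -0.01}
biases = {"GENE_A": 7.0, "GENE_B": 6.0}
expression = {"GENE_A": 8.0, "GENE_B": 5.0}

score = signature_score(expression, weights, biases)
print(f"{score:.4f}", assay_call(score))
```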
Methylation Data Analysis
The data was transformed using the logit transformation and negative values were
corrected for by adding a factor of 10 to the data matrix. Semi-supervised hierarchical
cluster analysis was performed in the methylation data using the genes in G1 (gene
cluster 1) which were under-expressed in the metastatic biology subgroup relative to
the non-metastatic biology subgroup. Gene symbols were mapped to methylation probe
IDs using HumanMethylation450_15017482_v1-2 annotation which were then
summarized to gene level using the median. Functional enrichment analysis of the gene
clusters was performed as previously described for the microarray analysis.
Performance analysis of the Metastatic Assay as a continuous predictor
Predicted score outputs from the Metastatic Assay were transformed to a continuous
scale between 0 and 1 using the overall range of scores, i.e. $\text{Score}_i = (X_i - \min(X)) / (\max(X) - \min(X))$.
A combined model of continuous Metastatic Assay scores with CAPRA-S
was developed under cross-validation to reduce bias in performance estimates. Cox
proportional hazards regression method was used to estimate the univariate and
multivariable hazard ratios (HRs) (incorporating Gleason, Age & iPSA) of the continuous
Metastatic Assay scores, CAPRA-S and the combined model scores. Area under the
receiver-operating characteristic curve (AUC) and concordance Index (C-Index)
performance metrics were also calculated to determine significance for prediction of
biochemical and metastatic outcomes. Net benefit scores were determined across risk
thresholds ranging from 0-40% for metastatic events using decision curve analysis.
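The score rescaling and the net benefit calculation behind a decision curve can be sketched as follows. The net benefit formula is the standard one from decision curve analysis (true positives minus false positives weighted by the odds of the threshold, per patient); the predicted risks and outcomes below are hypothetical, and how the authors derived predicted risks from the assay scores is not stated in the text.

```python
def min_max_scale(scores):
    """Rescale scores to the 0-1 interval using the overall range."""
    lo, hi = min(scores), max(scores)
    return [(s - lo) / (hi - lo) for s in scores]


def net_benefit(risks, events, threshold):
    """Net benefit of treating patients whose predicted risk is >= threshold."""
    n = len(risks)
    tp = sum(1 for r, e in zip(risks, events) if r >= threshold and e)
    fp = sum(1 for r, e in zip(risks, events) if r >= threshold and not e)
    return tp / n - fp / n * threshold / (1 - threshold)


scaled = min_max_scale([2.0, 4.0, 6.0])

# Hypothetical predicted risks of metastasis and observed outcomes (1 = event).
predicted_risk = [0.10, 0.20, 0.50, 0.90]
observed = [0, 0, 1, 1]
for pt in (0.05, 0.15, 0.30, 0.40):
    print(f"threshold={pt:.2f} net benefit={net_benefit(predicted_risk, observed, pt):.3f}")
```

Plotting net benefit against thresholds from 0 to 40% for each model, alongside the "treat all" and "treat none" strategies, gives the decision curve.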
REFERENCES
1. Wilt TJ, Brawer MK, Jones KM, et al. Radical prostatectomy versus observation for localized prostate cancer. N Engl J Med. 2012;367(3):203-213.
2. Bill-Axelson A, Holmberg L, Garmo H, et al. Radical prostatectomy or watchful waiting in early prostate cancer. N Engl J Med. 2014;370(10):932-942.
3. Klotz L, Vesprini D, Sethukavalan P, et al. Long-term follow-up of a large active surveillance cohort of patients with prostate cancer. J Clin Oncol. 2015;33(3):272-277.
4. Bader P, Burkhard FC, Markwalder R, Studer UE. Is a limited lymph node dissection an adequate staging procedure for prostate cancer? J Urol. 2002;168(2):514-518; discussion 518.
5. Roach M, 3rd, DeSilvio M, Lawton C, et al. Phase III trial comparing whole-pelvic versus prostate-only radiotherapy and neoadjuvant versus adjuvant combined androgen suppression: Radiation Therapy Oncology Group 9413. J Clin Oncol. 2003;21(10):1904-1911.
6. Abdollah F, Gandaglia G, Suardi N, et al. More extensive pelvic lymph node dissection improves survival in patients with node-positive prostate cancer. Eur Urol. 2015;67(2):212-219.
7. Zapatero A, Guerrero A, Maldonado X, et al. High-dose radiotherapy with short-term or long-term androgen deprivation in localised prostate cancer (DART01/05 GICOR): a randomised, controlled, phase 3 trial. Lancet Oncol. 2015;16(3):320-327.
8. James ND, Sydes MR, Clarke NW, et al. Addition of docetaxel, zoledronic acid, or both to first-line long-term hormone therapy in prostate cancer (STAMPEDE): survival results from an adaptive, multiarm, multistage, platform randomised controlled trial. Lancet. 2016;387(10024):1163-1177.
9. Cooperberg MR, Lubeck DP, Meng MV, Mehta SS, Carroll PR. The changing face of low-risk prostate cancer: trends in clinical presentation and primary management. J Clin Oncol. 2004;22(11):2141-2149.
10. Bolla M, van Poppel H, Tombal B, et al. Postoperative radiotherapy after radical prostatectomy for high-risk prostate cancer: long-term results of a randomised controlled trial (EORTC trial 22911). Lancet. 2012;380(9858):2018-2027.
11. Makarov DV, Sanderson H, Partin AW, Epstein JI. Gleason score 7 prostate cancer on needle biopsy: is the prognostic difference in Gleason scores 4 + 3 and 3 + 4 independent of the number of involved cores? J Urol. 2002;167(6):2440-2442.
12. Cooperberg MR, Pasta DJ, Elkin EP, et al. The University of California, San Francisco Cancer of the Prostate Risk Assessment score: a straightforward and reliable preoperative predictor of disease recurrence after radical prostatectomy. J Urol. 2005;173(6):1938-1942.
13. Cooperberg MR, Hilton JF, Carroll PR. The CAPRA-S score: A straightforward tool for improved prediction of outcomes after radical prostatectomy. Cancer. 2011;117(22):5039-5046.
14. Khor LY, Bae K, Paulus R, et al. MDM2 and Ki-67 predict for distant metastasis and mortality in men treated with radiotherapy and androgen deprivation for prostate cancer: RTOG 92-02. J Clin Oncol. 2009;27(19):3177-3184.
15. Cuzick J, Yang ZH, Fisher G, et al. Prognostic value of PTEN loss in men with conservatively managed localised prostate cancer. Br J Cancer. 2013;108(12):2582-2589.
16. Cuzick J, Swanson GP, Fisher G, et al. Prognostic value of an RNA expression signature derived from cell cycle proliferation genes in patients with prostate cancer: a retrospective study. Lancet Oncol. 2011;12(3):245-255.
17. Erho N, Crisan A, Vergara IA, et al. Discovery and validation of a prostate cancer genomic classifier that predicts early metastasis following radical prostatectomy. PLoS One. 2013;8(6):e66855.
18. Klein EA, Cooperberg MR, Magi-Galluzzi C, et al. A 17-gene assay to predict prostate cancer aggressiveness in the context of Gleason grade heterogeneity, tumor multifocality, and biopsy undersampling. Eur Urol. 2014;66(3):550-560.
19. Shipitsin M, Small C, Choudhury S, et al. Identification of proteomic biomarkers predicting prostate cancer aggressiveness and lethality despite biopsy-sampling error. Br J Cancer. 2014;111(6):1201-1212.
20. Perou CM, Sorlie T, Eisen MB, et al. Molecular portraits of human breast tumours. Nature. 2000;406(6797):747-752.
21. Glinsky GV, Glinskii AB, Stephenson AJ, Hoffman RM, Gerald WL. Gene expression profiling predicts clinical outcome of prostate cancer. J Clin Invest. 2004;113(6):913-923.
22. Taylor BS, Schultz N, Hieronymus H, et al. Integrative genomic profiling of human prostate cancer. Cancer Cell. 2010;18(1):11-22.
23. Tibshirani R, Walther G, Hastie T. Estimating the number of clusters in a data set via the gap statistic. Journal of the Royal Statistical Society: Series B (Statistical Methodology). 2001;63(2):411-423.
24. Grambsch PM, Therneau TM. Proportional hazards tests and diagnostics based on weighted residuals. Biometrika. 1994;81(3):515-526.
25. Kypta RM, Waxman J. Wnt/beta-catenin signalling in prostate cancer. Nat Rev Urol. 2012;9(8):418-428.
26. Chen J, Bardes EE, Aronow BJ, Jegga AG. ToppGene Suite for gene list enrichment analysis and candidate gene prioritization. Nucleic Acids Res. 2009;37(Web Server issue):W305-311.
27. Pound CR, Partin AW, Eisenberger MA, Chan DW, Pearson JD, Walsh PC. Natural history of progression after PSA elevation following radical prostatectomy. JAMA. 1999;281(17):1591-1597.
28. Li H, Liu W, Chen W, Zhu J, Deng CX, Rodgers GP. Olfactomedin 4 deficiency promotes prostate neoplastic progression and is associated with upregulation of the hedgehog-signaling pathway. Sci Rep. 2015;5:16974.
29. Thibault A, Figg WD, Bergan RC, et al. A phase II study of 5-aza-2'deoxycytidine (decitabine) in hormone independent metastatic (D2) prostate cancer. Tumori. 1998;84(1):87-89.
30. Aytes A, Mitrofanova A, Lefebvre C, et al. Cross-species regulatory network analysis identifies a synergistic interaction between FOXM1 and CENPF that drives prostate cancer malignancy. Cancer Cell. 2014;25(5):638-651.
FIGURE & TABLE LEGENDS
Table 1 – Multivariable analysis of the MSKCC cohort for biochemical recurrence (left)
and metastatic progression (right). P-values, hazard ratios (HR) and 95% confidence
intervals (CI) of the HR are outlined within the table. Covariate analysis of the
Metastatic Assay adjusting for CAPRA-S within the MSKCC cohort is also included, with
p-values, hazard ratios (HR) and 95% confidence intervals (CI) of the HR outlined.
Table 2 - Multivariable analysis of the Metastatic Assay in the independent resection
validation cohort for biochemical recurrence (left) and metastatic progression (right).
P-values, hazard ratios (HR) and 95% confidence intervals (CI) of the HR are outlined
within the table. Covariate analysis of the Metastatic Assay adjusting for CAPRA-S
within the independent resection validation cohort is also included, with p-values,
hazard ratios (HR) and 95% confidence intervals (CI) of the HR outlined. Analysis from
a combined model of the Metastatic Assay and CAPRA-S within the independent
resection validation cohort was also assessed, outlining p-values, hazard ratios and
confidence intervals for biochemical and metastatic disease progression.
Suppl. Table 1 – Checklist for REMARK classification of Biomarkers
Suppl. Table 2 - Summary of demographic, clinical and pathological variables
considered for analysis of the Discovery resection cohort. Table outlines total number
of patients with each defined tumour type (Primary, Primary with metastasis,
Metastatic disease and Normal), the number and percentage of patients associated with
each of the representative Gleason grades and the number (%) of patients obtained
from each of the two clinical sites.
Suppl. Table 3 - Summary of demographic, clinical and pathological variables for the
secondary training dataset. Table outlines total number of patients, the median and
range of age at surgery (years), time to recurrence (months), pre-operative PSA levels
(ng/ml), Gleason scores, within each pathological T-stage subgroup, with lymph node
invasion (LNI), and patients with positive and negative surgical margins.
Suppl. Table 4 – Summary of demographic, clinical and pathological variables
considered for analysis of the Glinsky in silico validation sample cohort. Table outlines
total number and percentage of patients with recurrence events, disease relapse, each
representative Gleason score for both resection and biopsy, lymph node invasion (LNI)
and surgical margins. Medians and range are also summarized for pre-operative PSA
levels and age.
Suppl. Table 5 – Summary of demographic, clinical and pathological variables
considered for analysis of the Erho in silico validation sample cohort. Table outlines
total number and percentage of patients with metastatic recurrence events and
representative Gleason scores.
Suppl. Table 6 - Summary of demographic, clinical and pathological variables
considered for analysis of the MSKCC in silico validation sample cohort. Table outlines
total number and percentage of patients with recurrence events either biochemical or
metastatic, site of metastasis, pathological T-stage, each representative prostatectomy
Gleason score, seminal vesicle invasion (SVI) and surgical margins. Medians and range
are also summarized for pre-operative PSA levels and age.
Suppl. Table 7 - Summary of demographic, clinical and pathological variables
considered for analysis of the independent resection validation cohort. Table outlines
total number of patients, the median and range of age at surgery (years), time to
recurrence (months), pre-operative PSA levels (ng/ml) and the number (%) of patients
from each of the four clinical sites, within each recurrence subgroup, associated with
each of the representative Gleason scores, within each pathological T-stage subgroup,
with lymph node invasion (LNI), seminal vesicle invasion (SVI), extracapsular extension
(ECE) and patients with negative, diffuse or focal surgical margins.
Suppl. Table 8 – Summary of up-regulated and down-regulated genes of the
‘metastatic-subgroup’ compared to the non-metastatic-subgroup, outlining gene
symbol, fold change and FDR p-values.
Suppl. Table 9 – Summary of demographic, clinical and pathological variables
considered for analysis of the methylation sample cohort. Table outlines total number
of patients from each clinical site, with each defined tumour type (Primary or
Metastatic), the number and percentage of patients associated with each of the two
identified subgroups (C1 and C2), the representative Gleason score and clinical T-stage.
Suppl. Table 10 - Summary of genes identified in gene clusters G1 and G2.
Suppl. Table 11 - Summary of biological processes from Gene Ontology functional
analysis within the cluster of G1, p-values, pFDR-value and genes involved within the
pathway.
Suppl. Table 12 – Summary of canonical pathways from IPA and Toppgene functional
analysis within the over-expressed and under-expressed subgroups including pathway
name, p-values, pFDR-value and genes involved within the pathway.
Suppl. Table 13 – Developed Metastatic Assay with the 70 gene transcripts, Entrez
gene ID, weightings and bias.
Suppl. Table 14 – Performance metrics (Sensitivity and Specificity) of the Metastatic
Assay for biochemical recurrence from Glinsky in silico dataset and metastatic
recurrence from Erho in silico dataset.
Suppl. Table 15 – Performance metrics (Sensitivity and Specificity) for metastatic
recurrence of CAPRA-S alone and the developed combined model (CAPRA-S +
Metastatic Assay).
Suppl. Table 16 – Univariate assessment of the Metastatic Assay as a continuous
predictor in the MSKCC cohort both alone and in a combined model with CAPRA-S.
Suppl. Table 17 – Univariate assessment of the Metastatic Assay as a continuous
predictor in the Independent Resection Validation cohort both alone and in a combined
model with CAPRA-S.
Suppl. Table 18 – Multivariable assessment of the Metastatic Assay as a continuous
predictor in the MSKCC cohort and the Independent Resection Validation cohort.
Figure 1 – A) Hierarchical clustering of transcriptional profiles from the Discovery
cohort. Specific genes which are upregulated (red) or downregulated (green) are
labelled on the vertical axis within gene clusters. Sample cluster C1 represents the
‘Metastatic-subgroup’ characterized by a shut-down of gene expression (G1) compared
to sample cluster C2. B) Bar chart representing the number and type of each tumour
mapping to each of the two identified sample clusters within the Discovery cohort. C)
Pie chart depicting association with increased methylation rates (dark blue) in the
under-expressed genes of 30% and the over-expressed genes of 6% (p<0.001).
Figure 2 - Kaplan-Meier survival analysis of the Metastatic Assay for
predicting time to biochemical recurrence (A) and metastatic progression (B) in the
MSKCC in silico cohort. Survival probability (%) showed reduced progression-free
survival (PFS), in months, in the ‘Assay Positive’ group (yellow; 85 patients) compared
with the ‘Assay Negative’ group (blue; 41 patients) for biochemical and metastatic disease
respectively (HR = 3.76 [1.70-8.34], p < 0.0001 and HR = 6.00 [1.90-18.91], p = 0.0005).
Figure 3 - Kaplan-Meier survival analysis of the Metastatic Assay for
predicting time to biochemical recurrence (A) and metastatic progression (B) in the
resection validation cohort. Survival probability (%) showed reduced progression-free
survival (PFS), in months, in the ‘Assay Positive’ group (yellow; 74 patients) compared
with the ‘Assay Negative’ group (blue; 248 patients) for biochemical and metastatic disease
respectively (HR = 1.76 [1.18-2.64], p = 0.0008 and HR = 3.47 [1.70-7.07], p < 0.0001). C)
Association of the Metastatic Assay with metastatic progression, stratified into
low risk (GS≤3+4) tumours and high risk (GS≥4+3) tumours: HR 5.61 [1.19-26.47],
p = 0.0013 and HR 2.43 [1.14-5.17], p = 0.0036 respectively.
Figure 4 - A) Association of a combined model (Assay + CAPRA-S) at predicting time to
biochemical recurrence of high/low risk disease in the resection cohort. Reduced
progression-free survival (PFS) in months of the ‘High Risk’ subgroup (yellow) of 112
patients when compared to the ‘Low Risk’ subgroup (blue) of 125 patients (HR = 2.67
[1.90-3.75]; p<0.0001). B) Association of a combined model (Assay + CAPRA-S) at
predicting time to metastatic disease progression of high/low risk disease in the
resection cohort. Reduced progression-free survival (PFS) in months of the ‘High Risk’
subgroup (yellow) of 112 patients compared to ‘Low Risk’ subgroup (blue) of 125
patients (HR =7.53 [4.13 – 13.73]; p <0.0001).
Suppl. Figure 1 – REMARK study design flow diagram.
Suppl. Figure 2 - Hierarchical clustering of methylation profiles from the specific genes.
Genes which have increased (red) or decreased (green) levels of methylation are
labelled on the vertical axis within gene clusters. 7/8 samples (88%) from the
metastatic-subgroup clustered together (M2), as did 10/14 samples (71%) from the
non-metastatic-subgroup (M1) (chi-squared, p=0.02).
Suppl. Figure 3 - Workflow outlining the training and validation processes of the
Metastatic Assay development and optimization.
Suppl. Figure 4 - A) Assessment of the C-index in a cohort of 75 primary prostate
cancer cases treated with primary resection. C-index showing the identification of
metastatic recurrence by the Metastatic Assay (C-index=90.4). B) Standard deviation
(SD) of the signature scores across the 20 patient technical replicates.
Suppl. Figure 5 - Histogram to show the bimodal distribution of Metastatic Assay
scores in the Discovery cohort.
Suppl. Figure 6 – Venn diagram outlining the overlap of genes between the Metastatic
Assay and the three clinically utilised prognostic assays.
Suppl. Figure 7 – Decision Curve Analysis showing the net benefit of the Metastatic
Assay, CAPRA-S and the Combined Model in (A) MSKCC and (B) Independent Resection
Validation across probability thresholds for the metastatic endpoint compared with
treating all (All) or no patients (None).
MAIN TABLES & FIGURES
Table 1. Validation of Metastatic Assay in MSKCC cohort

Biochemical Recurrence                                   Metastatic Recurrence

Multivariate Model 1                                     Multivariate Model 1
Covariate          HR     95% CI          p              Covariate          HR       95% CI             p
Metastatic Assay   3.03   1.43 to 6.41    0.0040         Metastatic Assay   2.53     0.67 to 9.54       0.1735
Gleason: (3+4)                                           Gleason: (3+4)*
  <7               0.38   0.10 to 1.37    0.1409           <7               0.00     0.00               0.9658
  4+3              2.04   0.76 to 5.43    0.1579           4+3              22.61    2.34 to 218.06     0.0073
  8-10             8.09   2.74 to 23.91   0.0002           8-10             187.79   16.52 to 2134.99   <0.0001
Age                0.99   0.94 to 1.04    0.6564         Age                0.88     0.80 to 0.97       0.0110
PSA                1.00   0.96 to 1.04    0.9857         PSA                0.94     0.89 to 0.98       0.0106

Multivariate Model 2                                     Multivariate Model 2
Covariate          HR     95% CI          p              Covariate          HR       95% CI             p
Metastatic Assay   3.35   1.62 to 6.94    0.0012         Metastatic Assay   3.95     1.15 to 13.53      0.0298
CAPRA-S            3.92   1.92 to 7.99    0.0002         CAPRA-S            3.50     1.13 to 10.80      0.0302

Abbreviations: HR, hazard ratio; CI, confidence intervals; PSA, prostate specific antigen; CAPRA-S, Cancer of the Prostate Risk Assessment post-surgical.
*Absence of metastatic events in patients with Gleason score <3+4.
Table 2. Validation of Metastatic Assay in Primary Prostate Cancer Resection Dataset.

Biochemical Recurrence                                        Metastatic Recurrence

Multivariate Model 1                                          Multivariate Model 1
Covariate                   HR     95% CI          p          Covariate                   HR     95% CI           p
Metastatic Assay            1.62   1.13 to 2.33    0.0092     Metastatic Assay            3.20   1.76 to 5.80     0.0001
Gleason: (3+4)                                                Gleason: (3+4)
  <7                        0.76   0.44 to 1.30    0.3224       <7                        0.72   0.19 to 2.73     0.6358
  4+3                       1.95   1.29 to 2.95    0.0017       4+3                       4.33   1.89 to 9.93     0.0006
  8-10                      2.79   1.82 to 4.30    <0.0001      8-10                      6.85   2.92 to 16.04    <0.0001
Age                         1.00   0.97 to 1.03    0.9027     Age                         0.97   0.92 to 1.02     0.2828
PSA                         1.01   1.00 to 1.01    0.0321     PSA                         1.00   0.99 to 1.01     0.6423

Multivariate Model 2                                          Multivariate Model 2
Covariate                   HR     95% CI          p          Covariate                   HR     95% CI           p
Metastatic Assay            1.72   1.19 to 2.48    0.0042     Metastatic Assay            2.94   1.60 to 5.40     0.0005
CAPRA-S                     2.52   1.79 to 3.54    <0.0001    CAPRA-S                     4.76   2.46 to 9.23     <0.0001

Combined Model              HR     95% CI          p          Combined Model              HR     95% CI           p
Metastatic Assay + CAPRA-S  2.67   1.90 to 3.75    <0.0001    Metastatic Assay + CAPRA-S  7.53   4.13 to 13.73    <0.0001

Abbreviations: HR, hazard ratio; CI, confidence intervals; PSA, prostate specific antigen; CAPRA-S, Cancer of the Prostate Risk Assessment post-surgical.
Figure 1 Molecular Subtyping and Identification of the Metastatic subgroup
[Figure 1, panel A. Cluster labels visible in the image: G1, G2, C1, C2.]
Figure 1 Cont’d Molecular Subtyping and Identification of the Metastatic-subgroup
[Figure 1, panel B. Annotation repeated throughout the panel: P<0.001.]
Figure 2 Validation of the Metastatic Assay in resections using the MSKCC in silico dataset
Figure 3 Validation of the Metastatic Assay in resections using the retrospective independent resection validation dataset
[Figure 3, panels A, B and C. Labels under panel C: GS ≤ 3+4, GS ≥ 4+3.]
Figure 4 Validation of the Metastatic Assay in resections using a combined model with CAPRA-S to stratify high and low risk
SUPPLEMENTARY DATA
Supplementary Table 1 – contained within the Supplemental File (Tab 1)
Supplementary Table 2. Demographic and Clinical characteristics of Discovery Cohort.
Covariate | No. | %
Clinical Site
  Cambridge | 73 | 58
  Karolinska | 53 | 42
Sample Type
  Primary Tumour | 70 | 56
  Primary Tumour with Mets | 21 | 17
  Metastatic Disease | 10 | 8
  Normal | 25 | 19
Gleason Score
  6 | 10 | 8
  7 | 24 | 19
  8 - 10 | 56 | 44
  N/A | 36 | 29
Supplementary Table 3. Demographic and Clinical Characteristics of the Secondary Training Dataset
Covariate | No. | %
Age | 58 (45-71) |
Recurrence Event
  Recurrence | 26 | 35
  Non-Recurrence | 49 | 65
Time to Recurrence
  Biochemical | 37 (7-102) |
  Metastatic | 64 (19-105) |
Pre-operative PSA
  Median (Min - Max) | 7.7 (2.6 - 61.4) |
Gleason Score
  <7 | 5 | 7
  7 | 58 | 77
  >7 | 12 | 16
Surgical Margins (SM)
  Positive | 33 | 56
  Negative | 42 | 44
Lymph Node Invasion (LNI)
  Yes | 2 | 2
  No | 17 | 23
  Unknown | 56 | 75
T-stage
  T2 | 36 | 48
  T3 | 39 | 52
Supplementary Table 4. Demographic and Clinical Characteristics of the Glinsky in silico validation cohort
Covariate | No. | %
Recurrence Event
  Recurrence | 37 | 47
  Non-Recurrence | 42 | 53
Relapse
  Yes | 37 | 47
  No | 42 | 53
Pre-operative PSA
  Median (Min - Max) | 7.6 (1.5 - 62.1) |
Gleason Score (Resection)
  <7 | 17 | 22
  7 | 44 | 56
  >7 | 18 | 22
Surgical Margins (SM)
  Positive | 50 | 63
  Negative | 29 | 37
Lymph Node Invasion (LNI)
  Yes | 3 | 4
  No | 76 | 96
Gleason Score (Biopsy)
  <7 | 37 | 47
  7 | 32 | 41
  >7 | 10 | 12
Age
  Median (Min - Max) | 61.2 (44.9 - 72.7) |
Supplementary Table 5. Demographic and Clinical Characteristics of the Erho in silico validation cohort
Covariate | No. | %
Gleason Score
  <7 | 63 | 12
  7 | 271 | 50
  >7 | 211 | 38
Metastatic Recurrence
  No | 333 | 61
  Yes | 212 | 39
Supplementary Table 6. Demographic and Clinical Characteristics of the MSKCC in silico validation cohort
Covariate | No. | %
Type of Tumour
  Primary | 126 | 100
Biochemical Recurrence
  Yes | 32 | 25
  No | 94 | 75
Metastatic Recurrence
  Yes | 14 | 11
  No | 112 | 89
Site of Metastasis
  Bone | 7 | 6
  Local | 2 | 2
  RP node, Bone | 1 | 1
  Bone, Soft Tissue | 1 | 1
  RP node, Lung | 1 | 1
  Pelvic node | 1 | 1
  Lung | 1 | 1
  None | 112 | 89
Age at Diagnosis
  Median (Min - Max) | 57.6 (37.3 - 72.78) |
Pre-operative PSA
  Median (Min - Max) | 5.9 (1.1 - 46.4) |
Pathological T-stage
  T1 | 71 | 56
  T2 | 50 | 40
  T3 | 5 | 4
Surgical Margins (SM)
  Positive | 30 | 24
  Negative | 96 | 76
Seminal Vesicle Invasion (SVI)
  Yes | 13 | 10
  No | 113 | 90
Lymph Node Invasion (LNI)
  No | 97 | 77
  Yes | 6 | 5
  Unknown | 23 | 18
Race
  Black Non-Hispanic | 23 | 18
  White Non-Hispanic | 94 | 75
  Black Hispanic | 3 | 2
  Asian | 2 | 2
  Unknown | 4 | 3
Gleason
  <7 | 40 | 32
  7 | 72 | 57
  >7 | 14 | 11
Supplementary Table 7. Demographic and Clinical characteristics of the Independent Resection Validation Cohort.
Covariate | No. | %
Clinical Site
  IPCRC | 61 | 19
  Oslo | 142 | 44
  Surrey | 34 | 11
  WCB | 85 | 26
Age at Surgery
  Median | 62 (41-75) |
Recurrence Event
  Non-recurrence | 172 | 53
  Biochemical recurrence | 103 | 32
  Metastatic progression | 47 | 15
Time to Recurrence - Median (range)
  Biochemical recurrence | 12 (1-100) |
  Metastatic progression | 6 (3-63) |
Pre-operative PSA
  Median (range), ng/ml | 8.4 (2 - 253) |
Gleason score
  <6 | 2 | 1
  6 | 67 | 21
  7 | 197 | 61
  8 - 10 | 55 | 17
Pathological T-stage
  T1 | 1 | 0.5
  T2 | 174 | 54
  T3 | 146 | 45
  T4 | 1 | 0.5
Lymph Node Invasion (LNI)
  Yes | 16 | 5
  No | 105 | 33
  Unknown | 201 | 62
Seminal Vesicle Invasion (SVI)
  Yes | 62 | 19
  No | 260 | 81
Extracapsular Extension (ECE)
  Yes | 97 | 30
  No | 190 | 59
  Unknown | 35 | 11
Surgical Margins (SM)
  Negative | 132 | 41
  Focal | 40 | 12
  Diffuse | 65 | 20
  Unknown | 85 | 27
Site of Metastasis
  Bone Mets | 24 | 51
  Visceral Mets | 17 | 36
  Varied Mets | 6 | 13
Supplementary Table 8 – contained within the Supplemental File (Tab 2)
Supplementary Table 9. Demographic & Clinical Characteristics of the Methylation Cohort
Covariate | No. | %
Clinical Site
  Cambridge | 5 | 23
  Karolinska | 17 | 77
Sample Type
  Primary Tumour | 20 | 91
  Metastatic Disease | 2 | 8
Class Label/Subgroup
  C1 | 8 | 36
  C2 | 14 | 64
Gleason Score
  6 | 2 | 9
  7 | 4 | 18
  8 | 9 | 41
  9 | 7 | 32
Stage
  T2 | 5 | 23
  T3 | 14 | 64
  Unknown | 3 | 13
Supplementary Table 10 – contained within the Supplemental File (Tab 3)
Supplementary Table 11 – contained within the Supplemental File (Tab 4)
Supplementary Table 12 – contained within the Supplemental File (Tabs 5 & 6)
Supplementary Table 13. Signature gene IDs, weights and bias.
Gene Name Entrez Gene ID Weight Bias
CAPN6 827 -0.010899 4.440873
THBS4 7060 -0.009632 6.912586
PLP1 5354 -0.008886 4.383572
MT1A 4489 -0.008681 6.747957
MIR205HG 406988 -0.008279 7.215245
SEMG1 6406 -0.007935 4.230423
RSPO3 84870 -0.007296 4.293173
ANO7 50636 -0.007164 6.522548
PCP4 5121 -0.007139 7.621758
ANKRD1 27063 -0.006922 5.928315
MYBPC1 4604 -0.006845 4.574319
MMP7 4316 -0.006835 6.756722
SERPINA3 12 -0.006831 5.745462
SELE 6401 -0.006810 5.977682
KRT5 3852 -0.006403 6.080494
LTF 4057 -0.006400 6.497260
KIAA1210 57481 -0.006381 3.559966
TMEM158 25907 -0.006312 8.063421
ZFP36 7538 -0.006271 9.960827
FOSB 2354 -0.006108 6.954936
PCA3 50652 -0.006102 5.262342
TRPM8 79054 -0.006060 4.865791
PTTG1 9232 0.006017 4.712693
LOC283194 283194 -0.005950 4.980381
PAGE4 9506 -0.005837 7.073907
STEAP4 79689 -0.005685 8.105295
TMEM178A 130733 -0.005647 7.594526
CXCL2 2920 -0.005598 8.928978
HS3ST3A1 9955 -0.005593 4.232782
EYA1 2138 -0.005581 5.504276
RSPO2 340419 -0.005563 3.922421
PKP1 5317 -0.005553 5.912186
MUC6 4588 -0.005522 6.640037
PENK 5179 -0.005506 4.514855
DEFB1 1672 -0.005400 6.825491
SLC7A3 84889 -0.005390 4.649004
MIR578 693163 -0.005355 5.087389
PI15 51050 -0.005264 4.858716
UBXN10-AS1 101928017 -0.005259 6.065878
PDK4 5166 -0.005249 4.174094
PHGR1 644844 -0.005208 5.183571
SERPINE1 5054 -0.005195 6.691866
PDZRN4 29951 -0.005147 4.752328
ZNF185 7739 -0.005105 6.900544
ADRA2C 152 -0.005055 7.078377
AZGP1 563 -0.005018 8.191178
TK1 7083 0.004966 5.581335
POTEH 23784 -0.004961 4.824976
KIF11 3832 0.004929 3.917669
CLDN1 9076 -0.004924 4.960283
MIR4530 100616163 -0.004908 10.536452
MAFF 23764 -0.004901 8.497945
ZNF765 91661 -0.004862 3.976333
CKS2 1164 0.004856 6.503981
TCEAL7 56849 -0.004856 4.819328
PLIN1 5346 0.004831 4.629392
SIGLEC1 6614 0.004773 5.503752
FAM150B 285016 -0.004773 6.664595
MFAP5 8076 -0.004772 4.129177
SFRP1 6422 -0.004762 7.901262
DUSP5 1847 -0.004718 5.762678
VARS2 57176 0.004675 5.223455
ABCC4 10257 -0.004664 5.230377
SH3BP4 23677 -0.004623 4.882708
SORD 6652 -0.004573 8.958411
MTERFD1 51001 0.004522 5.334199
DPP4 1803 -0.004506 4.659748
AATBC 284837 0.004502 4.905313
FAM3B 54097 -0.004443 7.388071
KLK3 354 -0.004425 10.226441
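
Supplementary Table 13 gives a weight and a bias for each of the 70 genes, but this section does not state how they are combined into a signature score. The sketch below assumes one common linear form, the sum over genes of weight × (expression − bias). That rule, the function name `signature_score` and the expression values are all assumptions made for illustration; any constant offset or classification threshold used by the assay is not represented.

```python
# Three of the 70 rows of Supplementary Table 13: gene -> (weight, bias).
SIGNATURE = {
    "CAPN6": (-0.010899, 4.440873),
    "PTTG1": (0.006017, 4.712693),
    "KLK3": (-0.004425, 10.226441),
}

def signature_score(expression, signature=SIGNATURE):
    """ASSUMED scoring rule: sum of weight * (expression - bias) over genes."""
    missing = [gene for gene in signature if gene not in expression]
    if missing:
        raise KeyError(f"missing expression values for: {missing}")
    return sum(
        weight * (expression[gene] - bias)
        for gene, (weight, bias) in signature.items()
    )

# Hypothetical log-scale expression values for one sample (not study data).
sample = {"CAPN6": 5.440873, "PTTG1": 5.712693, "KLK3": 11.226441}
print(round(signature_score(sample), 6))
```

Under this assumed rule, a gene expressed exactly at its bias contributes nothing, and the sign of the weight determines whether expression above the bias raises or lowers the score. Most weights in the table are negative, with a minority of genes such as PTTG1, TK1, KIF11 and CKS2 carrying positive weights.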
Supplementary Table 14. Performance metrics for Biochemical recurrence (Glinsky) and Metastatic recurrence (Erho) for the Metastatic Assay
Metric | Metastatic Assay (Glinsky cohort) | Metastatic Assay (Erho cohort)
Sensitivity | 70.27% [56.41-80.00] | 66.67% [52.44-76.63]
Specificity | 66.98% [60.51-71.36] | 54.65% [50.43-57.38]

Supplementary Table 15. Performance metrics for metastatic recurrence for CAPRA-S and the Combined Model (CAPRA-S + Metastatic Assay)
Metric | Metastatic Assay | CAPRA-S | Combined Model
Sensitivity | 47.73% [32.46-63.31] | 70.45% [54.80-83.24] | 84.09% [69.93-93.36]
Specificity | 81.87% [75.69-87.03] | 71.50% [64.58-77.75] | 64.14% [53.88-68.06]
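
For readers less familiar with the metrics in Supplementary Tables 14 and 15, the sketch below shows the standard definitions: sensitivity is the fraction of patients with the event who are assay positive, and specificity is the fraction of patients without the event who are assay negative. The function name and the example labels are hypothetical, and the confidence intervals reported in the tables are not reproduced here.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary outcomes and calls.

    y_true: 1 if the patient had the event (e.g. metastatic recurrence).
    y_pred: 1 if the assay called the patient positive.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t and p)
    fn = sum(1 for t, p in pairs if t and not p)
    tn = sum(1 for t, p in pairs if not t and not p)
    fp = sum(1 for t, p in pairs if not t and p)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one event and one non-event")
    return tp / (tp + fn), tn / (tn + fp)

# Hypothetical labels for eight patients (illustration only, not study data).
outcome = [1, 1, 1, 0, 0, 0, 0, 0]
assay = [1, 1, 0, 0, 0, 0, 1, 0]
sens, spec = sensitivity_specificity(outcome, assay)
print(f"sensitivity={sens:.2%}, specificity={spec:.2%}")
```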
Supplementary Table 16. Univariate assessment of the Metastatic Assay as a continuous predictor in the MSKCC cohort both alone and in a combined model with CAPRA-S
Biochemical Recurrence
Metric | Metastatic Assay | CAPRA-S | Combined Model
AUC | 0.67 [0.58-0.75] | 0.79 [0.71-0.86] | 0.79 [0.71-0.86]
HR | 1.39 [1.16-1.66]; p=0.0003 | 1.39 [1.24-1.56]; p<0.0001 | 1.60 [1.39-1.85]; p<0.0001
C-index | 0.71 [0.60-0.79] | 0.75 [0.66-0.83] | 0.80 [0.72-0.87]

Metastatic Recurrence
Metric | Metastatic Assay | CAPRA-S | Combined Model
AUC | 0.71 [0.63-0.79] | 0.87 [0.80-0.93] | 0.88 [0.81-0.93]
HR | 1.48 [1.13-1.95]; p=0.0048 | 1.39 [1.19-1.61]; p<0.0001 | 1.55 [1.26-1.91]; p<0.0001
C-index | 0.69 [0.52-0.84] | 0.79 [0.69-0.88] | 0.83 [0.74-0.91]
Supplementary Table 17. Univariate assessment of the Metastatic Assay as a continuous predictor in the Independent Resection Validation cohort both alone and in a combined model with CAPRA-S
Biochemical Recurrence
Metric | Metastatic Assay | CAPRA-S | Combined Model
AUC | 0.59 [0.54-0.65] | 0.76 [0.70-0.82] | 0.77 [0.71-0.82]
HR | 1.19 [1.06-1.34]; p=0.0030 | 1.27 [1.19-1.36]; p<0.0001 | 1.38 [1.27-1.50]; p<0.0001
C-index | 0.58 [0.54-0.63] | 0.62 [0.58-0.68] | 0.68 [0.64-0.74]

Metastatic Recurrence
Metric | Metastatic Assay | CAPRA-S | Combined Model
AUC | 0.71 [0.66-0.76] | 0.76 [0.70-0.81] | 0.80 [0.74-0.85]
HR | 1.58 [1.29-1.93]; p<0.0001 | 1.45 [1.28-1.64]; p<0.0001 | 1.66 [1.43-1.93]; p<0.0001
C-index | 0.71 [0.64-0.78] | 0.73 [0.66-0.79] | 0.82 [0.76-0.86]
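
The C-index values in Supplementary Tables 16 and 17 measure how well a continuous score orders patients by time to event. This section does not say which estimator was used; the sketch below assumes Harrell's concordance index for right-censored data, with hypothetical follow-up times and scores. A pair of patients is comparable when the one with the shorter follow-up had an event; the pair is concordant when that patient also has the higher score.

```python
def harrell_c_index(times, events, scores):
    """Harrell's C for right-censored data (assumed estimator, see lead-in).

    times:  follow-up time per patient.
    events: 1 if the event was observed, 0 if censored.
    scores: risk score, higher meaning higher predicted risk.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored patient cannot be the earlier event in a pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if scores[i] > scores[j]:
                    concordant += 1.0
                elif scores[i] == scores[j]:
                    concordant += 0.5  # tied scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable

# Hypothetical data for four patients (illustration only).
times = [5, 10, 15, 20]
events = [1, 1, 0, 1]
print(harrell_c_index(times, events, [0.9, 0.6, 0.4, 0.2]))
print(harrell_c_index(times, events, [0.2, 0.6, 0.4, 0.9]))
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering, which is the scale on which the tabulated values for the Metastatic Assay, CAPRA-S and the Combined Model should be read.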
Supplementary Table 18. Multivariable assessment of the Metastatic Assay as a continuous predictor in the MSKCC cohort and the Independent Resection Validation cohort
Multivariate Model 1 - MSKCC

Biochemical Recurrence
Covariate | HR | 95% CI | p
Metastatic Assay | 2.00 | 1.24 to 3.24 | 0.0050
Gleason: (3+4)
  <7 | 0.51 | 0.14 to 1.90 | 0.3173
  4+3 | 2.09 | 0.78 to 5.57 | 0.1434
  8-10 | 4.27 | 1.27 to 14.35 | 0.0195
Age | 0.97 | 0.92 to 1.03 | 0.3552
PSA | 0.97 | 0.92 to 1.02 | 0.2005

Metastatic Recurrence
Covariate | HR | 95% CI | p
Metastatic Assay | 2.99 | 1.10 to 8.17 | 0.0334
Gleason: (3+4)*
  <7 | 0.00 | 0.00 | 0.9672
  4+3 | 21.02 | 2.22 to 198.59 | 0.0082
  8-10 | 80.14 | 7.10 to 901.73 | 0.0004
Age | 0.89 | 0.82 to 0.97 | 0.0093
PSA | 0.89 | 0.83 to 0.96 | 0.0026

Multivariate Model 2 – Independent Resection Cohort

Biochemical Recurrence
Covariate | HR | 95% CI | p
Metastatic Assay | 1.16 | 1.03 to 1.30 | 0.0155
Gleason: (3+4)
  <7 | 0.77 | 0.45 to 1.32 | 0.3416
  4+3 | 1.97 | 1.30 to 2.98 | 0.0015
  8-10 | 2.81 | 1.82 to 4.32 | <0.0001
Age | 1.00 | 0.97 to 1.02 | 0.7220
PSA | 1.01 | 1.00 to 1.01 | 0.0163

Metastatic Recurrence
Covariate | HR | 95% CI | p
Metastatic Assay | 1.52 | 1.24 to 1.85 | <0.0001
Gleason: (3+4)
  <7 | 0.78 | 0.21 to 2.96 | 0.7211
  4+3 | 4.40 | 1.92 to 10.11 | 0.0005
  8-10 | 7.08 | 3.01 to 16.62 | <0.0001
Age | 0.96 | 0.92 to 1.01 | 0.1269
PSA | 1.00 | 0.99 to 1.02 | 0.5067

Abbreviations: HR, hazard ratio; CI, confidence intervals; PSA, prostate specific antigen; CAPRA-S, Cancer of the Prostate Risk Assessment post-surgical.
*Absence of metastatic events in patients with Gleason score <3+4.
Suppl. Figure 1 – REMARK Study Design Flow Diagram
Suppl. Figure 2 - Semi-supervised hierarchical clustering of the methylation data of down-regulated genes
[Suppl. Figure 2. Cluster labels visible in the image: M1, M2.]
Suppl. Figure 3 Process for the development of the Metastatic Assay
Suppl. Figure 4 Assessment of C-index (A) and SD (B) performance metrics for the metastatic endpoint in the secondary training dataset to justify selection of a 70-gene signature
Suppl. Figure 5 Distribution of the Metastatic Assay signature scores in the Discovery dataset
Suppl. Figure 6 - Venn diagram outlining the overlap of genes between the Metastatic Assay and the three clinically utilised prognostic assays.
[Suppl. Figure 6. Gene names labelled in the diagram: PTTG1, TK1, KIF11, AZGP1, ANO7, MYBPC1.]
Suppl. Figure 7 – Decision Curve Analysis showing the net benefit of the Metastatic Assay, CAPRA-S and the Combined Model in (A) MSKCC and (B) Independent Resection Validation across probability thresholds for the metastatic endpoint compared with treating all (All) or no patients (None)
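
The net benefit plotted in a decision curve can be illustrated with the standard definition: at a probability threshold pt, net benefit is the true positive fraction minus the false positive fraction weighted by pt / (1 − pt). The sketch below uses that definition with hypothetical outcomes and predicted probabilities; the function names are illustrative and the study's own software and settings are not stated in this section.

```python
def net_benefit(y_true, y_prob, threshold):
    """Net benefit of treating patients whose predicted risk >= threshold."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must be strictly between 0 and 1")
    n = len(y_true)
    tp = sum(1 for t, p in zip(y_true, y_prob) if p >= threshold and t)
    fp = sum(1 for t, p in zip(y_true, y_prob) if p >= threshold and not t)
    # False positives are penalised by the odds of the threshold probability.
    return tp / n - (fp / n) * (threshold / (1.0 - threshold))

def net_benefit_treat_all(y_true, threshold):
    """Reference strategy 'All': every patient is treated."""
    return net_benefit(y_true, [1.0] * len(y_true), threshold)

# Hypothetical outcomes and predicted probabilities (illustration only).
outcome = [1, 0, 1, 0, 0]
risk = [0.9, 0.2, 0.6, 0.7, 0.1]
print(net_benefit(outcome, risk, 0.5))
print(net_benefit_treat_all(outcome, 0.5))
```

The reference strategy "None" has a net benefit of zero at every threshold, so a model is useful at a given threshold only when its curve lies above both zero and the "All" curve.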
APPENDIX A
**Gemma to add further REMARK text**