On the Road to Predictive Oncology Challenges for Statistics and for Clinical Investigation
![Page 1: On the Road to Predictive Oncology Challenges for Statistics and for Clinical Investigation](https://reader033.fdocuments.us/reader033/viewer/2022051623/56815d34550346895dcb31e9/html5/thumbnails/1.jpg)
On the Road to Predictive Oncology: Challenges for Statistics and for Clinical Investigation
Richard Simon, D.Sc.
Chief, Biometric Research Branch
National Cancer Institute
http://brb.nci.nih.gov
![Page 2: On the Road to Predictive Oncology Challenges for Statistics and for Clinical Investigation](https://reader033.fdocuments.us/reader033/viewer/2022051623/56815d34550346895dcb31e9/html5/thumbnails/2.jpg)
Biometric Research Branch Website
http://brb.nci.nih.gov
• PowerPoint presentations
• Reprints
• BRB-ArrayTools software
• Web-based tools for clinical trial design with predictive biomarkers
![Page 3: On the Road to Predictive Oncology Challenges for Statistics and for Clinical Investigation](https://reader033.fdocuments.us/reader033/viewer/2022051623/56815d34550346895dcb31e9/html5/thumbnails/3.jpg)
Prediction Tools for Informing Treatment Selection
• Most cancer treatments benefit only a minority of patients to whom they are administered
• Being able to predict which patients are likely or unlikely to benefit from a treatment might
  – Save patients from unnecessary complications and enhance their chance of receiving a more appropriate treatment
  – Help control medical costs
  – Improve the success rate of clinical drug development
![Page 4: On the Road to Predictive Oncology Challenges for Statistics and for Clinical Investigation](https://reader033.fdocuments.us/reader033/viewer/2022051623/56815d34550346895dcb31e9/html5/thumbnails/4.jpg)
Types of Biomarkers
• Predictive biomarkers
  – Measured before treatment to identify who is likely or unlikely to benefit from a particular treatment
• Prognostic biomarkers
  – Measured before treatment to indicate long-term outcome for patients untreated or receiving standard treatment
![Page 5: On the Road to Predictive Oncology Challenges for Statistics and for Clinical Investigation](https://reader033.fdocuments.us/reader033/viewer/2022051623/56815d34550346895dcb31e9/html5/thumbnails/5.jpg)
• Surrogate endpoints
  – Measured longitudinally to track the pace of disease and how it is affected by treatment, for use as an early indication of the clinical effectiveness of treatment
![Page 6: On the Road to Predictive Oncology Challenges for Statistics and for Clinical Investigation](https://reader033.fdocuments.us/reader033/viewer/2022051623/56815d34550346895dcb31e9/html5/thumbnails/6.jpg)
Prognostic & Predictive Biomarkers
• Single gene or protein measurement
  – ER protein expression
  – HER2 amplification
  – EGFR mutation
  – KRAS mutation
• Index or classifier that summarizes expression levels of multiple genes
  – OncotypeDx recurrence score
![Page 7: On the Road to Predictive Oncology Challenges for Statistics and for Clinical Investigation](https://reader033.fdocuments.us/reader033/viewer/2022051623/56815d34550346895dcb31e9/html5/thumbnails/7.jpg)
Validation = Fit for Intended Use
• Analytical validation
  – Accuracy, reproducibility, robustness
• Clinical validation
  – Does the biomarker predict a clinical endpoint or phenotype?
• Clinical utility
  – Does use of the biomarker result in patient benefit?
    • By informing treatment decisions
    • Is it actionable?
![Page 8: On the Road to Predictive Oncology Challenges for Statistics and for Clinical Investigation](https://reader033.fdocuments.us/reader033/viewer/2022051623/56815d34550346895dcb31e9/html5/thumbnails/8.jpg)
Pusztai et al. The Oncologist 8:252-8, 2003
• 939 articles on “prognostic markers” or “prognostic factors” in breast cancer in past 20 years
• ASCO guidelines only recommended routine testing for ER, PR and HER-2 in breast cancer
![Page 9: On the Road to Predictive Oncology Challenges for Statistics and for Clinical Investigation](https://reader033.fdocuments.us/reader033/viewer/2022051623/56815d34550346895dcb31e9/html5/thumbnails/9.jpg)
• Most prognostic markers or prognostic models are not used because, although they correlate with a clinical endpoint, they do not facilitate therapeutic decision making.
• Most prognostic marker studies are based on a “convenience sample” of heterogeneous patients, often not limited by stage or treatment.
• The studies are not planned or analyzed with clear focus on an intended use of the marker
• Retrospective studies of prognostic markers should be planned and analyzed with specific focus on intended use of the marker
• Prospective studies should address medical utility for a specific intended use of the biomarker
  – Treatment options and practice guidelines
  – Other prognostic factors
![Page 10: On the Road to Predictive Oncology Challenges for Statistics and for Clinical Investigation](https://reader033.fdocuments.us/reader033/viewer/2022051623/56815d34550346895dcb31e9/html5/thumbnails/10.jpg)
Potential Uses of Prognostic Biomarkers
• Identify patients who have very good prognosis on standard treatment and do not require more intensive regimens
• Identify patients who have poor prognosis on standard chemotherapy who are good candidates for experimental regimens
![Page 11: On the Road to Predictive Oncology Challenges for Statistics and for Clinical Investigation](https://reader033.fdocuments.us/reader033/viewer/2022051623/56815d34550346895dcb31e9/html5/thumbnails/11.jpg)
Predictive Biomarkers
![Page 12: On the Road to Predictive Oncology Challenges for Statistics and for Clinical Investigation](https://reader033.fdocuments.us/reader033/viewer/2022051623/56815d34550346895dcb31e9/html5/thumbnails/12.jpg)
![Page 13: On the Road to Predictive Oncology Challenges for Statistics and for Clinical Investigation](https://reader033.fdocuments.us/reader033/viewer/2022051623/56815d34550346895dcb31e9/html5/thumbnails/13.jpg)
![Page 14: On the Road to Predictive Oncology Challenges for Statistics and for Clinical Investigation](https://reader033.fdocuments.us/reader033/viewer/2022051623/56815d34550346895dcb31e9/html5/thumbnails/14.jpg)
Major Changes in Oncology
• Recognition of the heterogeneity of tumors of the same primary site with regard to molecular oncogenesis
• Availability of the tools of genomics for characterizing tumors
• Focus on molecularly targeted drugs
• These changes have resulted in
  – Increased interest in prediction problems
  – Need for new clinical trial designs
  – Increased pace of innovation
![Page 15: On the Road to Predictive Oncology Challenges for Statistics and for Clinical Investigation](https://reader033.fdocuments.us/reader033/viewer/2022051623/56815d34550346895dcb31e9/html5/thumbnails/15.jpg)
• p>n prediction problems, in which the number of variables is much greater than the number of cases
  – Many of the methods of statistics are based on inference problems
  – Standard model building and evaluation strategies are not effective for p>n prediction problems
![Page 16: On the Road to Predictive Oncology Challenges for Statistics and for Clinical Investigation](https://reader033.fdocuments.us/reader033/viewer/2022051623/56815d34550346895dcb31e9/html5/thumbnails/16.jpg)
Model Evaluation for p>n Prediction Problems
• Goodness of fit is not a proper measure of predictive accuracy
• Importance of Separating Training Data from Testing Data for p>n Prediction Problems
![Page 17: On the Road to Predictive Oncology Challenges for Statistics and for Clinical Investigation](https://reader033.fdocuments.us/reader033/viewer/2022051623/56815d34550346895dcb31e9/html5/thumbnails/17.jpg)
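Significance in training and validation sets for 10 simulations

| Simulation | Training | Validation |
|---|---|---|
| 1 | p=7.0e-05 | p=0.70 |
| 2 | p=4.2e-07 | p=0.54 |
| 3 | p=2.4e-13 | p=0.60 |
| 4 | p=1.3e-10 | p=0.89 |
| 5 | p=1.8e-13 | p=0.36 |
| 6 | p=5.5e-11 | p=0.81 |
| 7 | p=3.2e-09 | p=0.46 |
| 8 | p=1.8e-07 | p=0.61 |
| 9 | p=1.1e-07 | p=0.49 |
| 10 | p=4.3e-09 | p=0.09 |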
![Page 18: On the Road to Predictive Oncology Challenges for Statistics and for Clinical Investigation](https://reader033.fdocuments.us/reader033/viewer/2022051623/56815d34550346895dcb31e9/html5/thumbnails/18.jpg)
Separating Training Data from Testing Data
• Split-sample method
• Re-sampling methods
  – Leave-one-out cross-validation
  – K-fold cross-validation
  – Replicated split-sample
  – Bootstrap re-sampling
![Page 19: On the Road to Predictive Oncology Challenges for Statistics and for Clinical Investigation](https://reader033.fdocuments.us/reader033/viewer/2022051623/56815d34550346895dcb31e9/html5/thumbnails/19.jpg)
• “Prediction is very difficult; especially about the future.”
![Page 20: On the Road to Predictive Oncology Challenges for Statistics and for Clinical Investigation](https://reader033.fdocuments.us/reader033/viewer/2022051623/56815d34550346895dcb31e9/html5/thumbnails/20.jpg)
Prediction on Simulated Null Data
Simon et al. J Nat Cancer Inst 95:14, 2003
Generation of Gene Expression Profiles
• 20 specimens ($P_i$ is the expression profile for specimen i)
• Log-ratio measurements on 6000 genes
• $P_i \sim \mathrm{MVN}(0, I_{6000})$
• Can we distinguish between the first 10 specimens (Class 1) and the last 10 (Class 2)?
Prediction Method
• Compound covariate predictor built from the log-ratios of the 10 most differentially expressed genes.
![Page 21: On the Road to Predictive Oncology Challenges for Statistics and for Clinical Investigation](https://reader033.fdocuments.us/reader033/viewer/2022051623/56815d34550346895dcb31e9/html5/thumbnails/21.jpg)
[Figure: proportion of simulated data sets (vertical axis) by number of misclassifications, 0 to 20 (horizontal axis), for three procedures: no cross-validation (resubstitution method), cross-validation after gene selection, and cross-validation prior to gene selection]
![Page 22: On the Road to Predictive Oncology Challenges for Statistics and for Clinical Investigation](https://reader033.fdocuments.us/reader033/viewer/2022051623/56815d34550346895dcb31e9/html5/thumbnails/22.jpg)
Cross Validation
• With proper cross-validation, the model must be developed from scratch for each leave-one-out training set. This means that feature selection must be repeated for each leave-one-out training set.
• The cross-validated estimate of misclassification error is an estimate of the prediction error for the model developed by applying the specified algorithm to the full dataset
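The contrast between selecting genes before and inside the leave-one-out loop can be sketched in a few lines. This is a minimal illustration, not the code behind the cited simulation: it assumes a compound covariate predictor that weights the selected genes by their pooled-variance t-statistics and cuts at the midpoint of the two class mean scores, and the function names (`t_stats`, `top_genes`, `loocv_errors`, `simulate`) are hypothetical.

```python
import numpy as np

def t_stats(X, y):
    """Pooled-variance two-sample t-statistic per gene (class 0 minus class 1)."""
    a, b = X[y == 0], X[y == 1]
    na, nb = len(a), len(b)
    pooled = ((na - 1) * a.var(axis=0, ddof=1)
              + (nb - 1) * b.var(axis=0, ddof=1)) / (na + nb - 2)
    return (a.mean(axis=0) - b.mean(axis=0)) / np.sqrt(pooled * (1 / na + 1 / nb))

def top_genes(X, y, k):
    """Indices of the k genes with the largest absolute t-statistic."""
    return np.argsort(-np.abs(t_stats(X, y)))[:k]

def loocv_errors(X, y, k=10, select_inside=True):
    """Leave-one-out misclassification count for a compound covariate predictor.

    select_inside=True  : genes are re-selected in every training set (proper).
    select_inside=False : genes are selected once from all cases (improper).
    """
    n = len(y)
    genes_full = top_genes(X, y, k)  # sees the case that is later left out
    errors = 0
    for i in range(n):
        keep = np.arange(n) != i
        Xtr, ytr = X[keep], y[keep]
        genes = top_genes(Xtr, ytr, k) if select_inside else genes_full
        w = t_stats(Xtr[:, genes], ytr)           # weights from training set only
        scores = Xtr[:, genes] @ w
        cut = 0.5 * (scores[ytr == 0].mean() + scores[ytr == 1].mean())
        pred = 0 if X[i, genes] @ w > cut else 1  # class 0 scores are higher
        errors += int(pred != y[i])
    return errors

def simulate(n_datasets=20, n=20, p=6000, seed=0):
    """Mean LOOCV errors on pure-noise data for proper and improper selection."""
    rng = np.random.default_rng(seed)
    y = np.repeat([0, 1], n // 2)
    proper, improper = [], []
    for _ in range(n_datasets):
        X = rng.standard_normal((n, p))  # null data: no class difference
        proper.append(loocv_errors(X, y, select_inside=True))
        improper.append(loocv_errors(X, y, select_inside=False))
    return float(np.mean(proper)), float(np.mean(improper))

if __name__ == "__main__":
    print(simulate())
```

On null data the improper variant should report far fewer errors than the proper one, even though no classifier can do better than chance here.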
![Page 23: On the Road to Predictive Oncology Challenges for Statistics and for Clinical Investigation](https://reader033.fdocuments.us/reader033/viewer/2022051623/56815d34550346895dcb31e9/html5/thumbnails/23.jpg)
Permutation Distribution of Cross-validated Misclassification Rate of a Multivariate Classifier
Radmacher, McShane & Simon, J Comp Biol 9:505, 2002
• Randomly permute class labels and repeat the entire cross-validation
• Re-do for all (or 1000) random permutations of class labels
• Permutation p value is the fraction of random permutations that gave as few or fewer cross-validated misclassifications as in the real data
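The permutation procedure can be sketched as follows. The classifier here is a stand-in chosen for brevity (a nearest-centroid rule on the `k` genes with the largest class mean difference, re-selected inside every leave-one-out fold); the cited paper does not prescribe it, and the names `loocv_errors` and `permutation_p_value` are hypothetical.

```python
import numpy as np

def loocv_errors(X, y, k=5):
    """LOOCV misclassifications; gene selection is repeated inside every fold."""
    n = len(y)
    errors = 0
    for i in range(n):
        keep = np.arange(n) != i
        Xtr, ytr = X[keep], y[keep]
        m0 = Xtr[ytr == 0].mean(axis=0)
        m1 = Xtr[ytr == 1].mean(axis=0)
        genes = np.argsort(-np.abs(m0 - m1))[:k]
        d0 = np.sum((X[i, genes] - m0[genes]) ** 2)
        d1 = np.sum((X[i, genes] - m1[genes]) ** 2)
        pred = 0 if d0 <= d1 else 1
        errors += int(pred != y[i])
    return errors

def permutation_p_value(X, y, n_perm=200, seed=0):
    """Fraction of label permutations with as few or fewer CV errors as observed."""
    rng = np.random.default_rng(seed)
    observed = loocv_errors(X, y)
    hits = 0
    for _ in range(n_perm):
        # The entire cross-validation, selection included, is redone per permutation.
        if loocv_errors(X, rng.permutation(y)) <= observed:
            hits += 1
    return observed, hits / n_perm
```

Because the whole cross-validation is repeated for every permutation, the cost grows with `n_perm` times the number of folds; that is the price of a valid null distribution.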
![Page 24: On the Road to Predictive Oncology Challenges for Statistics and for Clinical Investigation](https://reader033.fdocuments.us/reader033/viewer/2022051623/56815d34550346895dcb31e9/html5/thumbnails/24.jpg)
Model Evaluation for p>n Prediction Problems
• Odds ratios and hazards ratios are not proper measures of prediction accuracy
• Statistical significance of regression coefficients is not a proper measure of predictive accuracy
![Page 25: On the Road to Predictive Oncology Challenges for Statistics and for Clinical Investigation](https://reader033.fdocuments.us/reader033/viewer/2022051623/56815d34550346895dcb31e9/html5/thumbnails/25.jpg)
Evaluation of Prediction Accuracy
• For binary outcome
  – Cross-validated prediction error
  – Cross-validated sensitivity & specificity
  – Cross-validated ROC curve
• For survival outcome
  – Cross-validated Kaplan-Meier curves for predicted high and low risk groups
    • Cross-validated K-M curves within levels of standard prognostic staging system
  – Cross-validated time-dependent ROC curves
![Page 26: On the Road to Predictive Oncology Challenges for Statistics and for Clinical Investigation](https://reader033.fdocuments.us/reader033/viewer/2022051623/56815d34550346895dcb31e9/html5/thumbnails/26.jpg)
LOOCV Error Estimates for Linear Classifiers
• $x_i$ = p-dimensional vector of expression levels for case i
• $y_i$ = binary {0,1} class indicator for case i
• $w^{(-i)}$ = p-dimensional weights for linear classifier computed for training set with case i omitted; component j is zero if variable j is not selected in feature selection step for training set −i
• $w^{(-i)\top} x_i$ = score for case i computed from model developed with case i omitted
• Predicted class: $\hat{y}_i(c) = I\left(w^{(-i)\top} x_i > c\right)$
• Mis-classification error: $\frac{1}{n}\sum_{i=1}^{n} I\left(\hat{y}_i(c) \ne y_i\right)$
• Sensitivity(c): $\sum_{i=1}^{n} y_i\, I\left(\hat{y}_i(c) = y_i\right) \Big/ \sum_{i=1}^{n} y_i$
• Specificity(c): $\sum_{i=1}^{n} (1-y_i)\, I\left(\hat{y}_i(c) = y_i\right) \Big/ \sum_{i=1}^{n} (1-y_i)$
• ROC curve: sensitivity(c) vs 1 − specificity(c)
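Given the leave-one-out scores, the error, sensitivity and specificity at a cut-point `c` reduce to a few array operations. The sketch below assumes the scores have already been computed with each case omitted from its own training set; `loocv_roc_point` and `loocv_roc_curve` are hypothetical names.

```python
import numpy as np

def loocv_roc_point(scores, y, c):
    """Error, sensitivity and specificity at cut-point c from LOOCV scores."""
    scores = np.asarray(scores, dtype=float)
    y = np.asarray(y, dtype=int)
    pred = (scores > c).astype(int)          # predicted class is 1 if score > c
    error = float(np.mean(pred != y))
    sensitivity = float(np.sum(y * (pred == y)) / np.sum(y))
    specificity = float(np.sum((1 - y) * (pred == y)) / np.sum(1 - y))
    return error, sensitivity, specificity

def loocv_roc_curve(scores, y):
    """(1 - specificity, sensitivity) pairs over all distinct cut-points."""
    cuts = np.concatenate(([-np.inf], np.unique(scores), [np.inf]))
    points = []
    for c in cuts:
        _, sens, spec = loocv_roc_point(scores, y, c)
        points.append((1 - spec, sens))
    return points
```

For example, `loocv_roc_point([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1], 0.3)` classifies the last three cases as class 1, so one of the two class 0 cases is misclassified.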
![Page 27: On the Road to Predictive Oncology Challenges for Statistics and for Clinical Investigation](https://reader033.fdocuments.us/reader033/viewer/2022051623/56815d34550346895dcb31e9/html5/thumbnails/27.jpg)
Cross-validated Kaplan-Meier Curves for Predicted High and Low Risk Groups
$$\lambda(t; x) = \frac{f(t; x)}{1 - F(t; x)} = \lambda_0(t)\exp\left(w^{\top} x\right)$$
• $w^{(-i)}$ = estimate of weights based on algorithm applied to training set with case i omitted
• Algorithm may involve feature selection, penalized regression, ...
• Classify case i as high risk if $w^{(-i)\top} x_i > c$
![Page 28: On the Road to Predictive Oncology Challenges for Statistics and for Clinical Investigation](https://reader033.fdocuments.us/reader033/viewer/2022051623/56815d34550346895dcb31e9/html5/thumbnails/28.jpg)
Cross-Validated Time Dependent ROC Curve
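• $t^*$ = landmark time of interest
• Sensitivity:
$$\text{sensitivity}(c) = \Pr\left[w^{(-i)\top} x > c \mid t \le t^*\right] = \frac{\Pr\left[t \le t^* \mid w^{(-i)\top} x > c\right]\Pr\left[w^{(-i)\top} x > c\right]}{\Pr\left[t \le t^*\right]}$$
  – Estimate 1st term in numerator using KM estimator for cases with $w^{(-i)\top} x_i > c$
  – Estimate 2nd term as proportion of n cases with $w^{(-i)\top} x_i > c$
  – Estimate denominator using KM estimator for all n cases
• Specificity:
$$\text{specificity}(c) = \Pr\left[w^{(-i)\top} x \le c \mid t > t^*\right] = \frac{\Pr\left[t > t^* \mid w^{(-i)\top} x \le c\right]\Pr\left[w^{(-i)\top} x \le c\right]}{\Pr\left[t > t^*\right]}$$
• Time-dependent ROC is sensitivity(c) vs 1 − specificity(c)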
![Page 29: On the Road to Predictive Oncology Challenges for Statistics and for Clinical Investigation](https://reader033.fdocuments.us/reader033/viewer/2022051623/56815d34550346895dcb31e9/html5/thumbnails/29.jpg)
Is Accurate Prediction Possible For p>n?
• Yes, in many cases, but standard statistical methods for model building and evaluation are often not effective
• Standard methods may over-fit the data and lead to poor predictions
• With p>n, unless data is inconsistent, a linear model can always be found that classifies the training data perfectly
![Page 30: On the Road to Predictive Oncology Challenges for Statistics and for Clinical Investigation](https://reader033.fdocuments.us/reader033/viewer/2022051623/56815d34550346895dcb31e9/html5/thumbnails/30.jpg)
Is Accurate Prediction Possible For p>>n?
• Some problems are easy; real problems are often difficult
• Simple methods like DLDA, nearest neighbor classifiers and shrunken centroid classifiers are at least as effective as more complex methods for many datasets
• Because of correlated variables, there are often many very distinct models that predict about equally well
![Page 31: On the Road to Predictive Oncology Challenges for Statistics and for Clinical Investigation](https://reader033.fdocuments.us/reader033/viewer/2022051623/56815d34550346895dcb31e9/html5/thumbnails/31.jpg)
• p>n prediction problems are not multiple testing problems
• The objective of prediction problems is accurate prediction, not controlling the false discovery rate
  – Parameters that control feature selection in prediction problems are tuning parameters to be optimized for prediction accuracy
    • Optimization by cross-validation nested within the cross-validation used for evaluating prediction accuracy
• Biological understanding is often a career objective; accurate prediction can sometimes be achieved in less time
![Page 32: On the Road to Predictive Oncology Challenges for Statistics and for Clinical Investigation](https://reader033.fdocuments.us/reader033/viewer/2022051623/56815d34550346895dcb31e9/html5/thumbnails/32.jpg)
Model Instability Does Not Mean Prediction Inaccuracy
• Validation of a predictive model means that the model predicts accurately for independent data
• Validation does not mean that the model is stable or that using the same algorithm on independent data will give a similar model
• With p>n and many genes with correlated expression, the classifier will not be stable.
![Page 33: On the Road to Predictive Oncology Challenges for Statistics and for Clinical Investigation](https://reader033.fdocuments.us/reader033/viewer/2022051623/56815d34550346895dcb31e9/html5/thumbnails/33.jpg)
Traditional Approach to Oncology Clinical Drug Development
• Phase III trials with broad eligibility to test the null hypothesis that a regimen containing the new drug is on average not better than the control treatment for all patients who might be treated by the new regimen
• Perform exploratory subset analyses but regard results as hypotheses to be tested on independent data
![Page 34: On the Road to Predictive Oncology Challenges for Statistics and for Clinical Investigation](https://reader033.fdocuments.us/reader033/viewer/2022051623/56815d34550346895dcb31e9/html5/thumbnails/34.jpg)
Traditional Clinical Trial Approaches
• Have protected us from false claims resulting from post-hoc data dredging not based on pre-defined biologically based hypotheses
• Have led to widespread over-treatment of patients with drugs from which many don’t benefit
• Are less suitable for evaluation of new molecularly targeted drugs which are expected to benefit only the patients whose tumors are driven by de-regulation of the target of the drug
![Page 35: On the Road to Predictive Oncology Challenges for Statistics and for Clinical Investigation](https://reader033.fdocuments.us/reader033/viewer/2022051623/56815d34550346895dcb31e9/html5/thumbnails/35.jpg)
Molecular Heterogeneity of Human Cancer
• Cancers of a primary site in many cases appear to represent a heterogeneous group of diverse molecular diseases which vary fundamentally with regard to
  – their oncogenesis and pathogenesis
  – their responsiveness to specific drugs
• The established molecular heterogeneity of human cancer requires the use of new approaches to the development and evaluation of therapeutics
![Page 36: On the Road to Predictive Oncology Challenges for Statistics and for Clinical Investigation](https://reader033.fdocuments.us/reader033/viewer/2022051623/56815d34550346895dcb31e9/html5/thumbnails/36.jpg)
How Can We Develop New Drugs in a Manner More Consistent With Modern Tumor Biology and Obtain Reliable Information About What Regimens Work for What Kinds of Patients?
![Page 37: On the Road to Predictive Oncology Challenges for Statistics and for Clinical Investigation](https://reader033.fdocuments.us/reader033/viewer/2022051623/56815d34550346895dcb31e9/html5/thumbnails/37.jpg)
Develop Predictor of Response to New Drug
Using phase II data, develop predictor of response to new drug
• Patient predicted responsive: randomize between new drug and control
• Patient predicted non-responsive: off study
![Page 38: On the Road to Predictive Oncology Challenges for Statistics and for Clinical Investigation](https://reader033.fdocuments.us/reader033/viewer/2022051623/56815d34550346895dcb31e9/html5/thumbnails/38.jpg)
Evaluating the Efficiency of Enrichment and Stratification Clinical Trial Designs With
Predictive Biomarkers
• Simon R and Maitournam A. Evaluating the efficiency of targeted designs for randomized clinical trials. Clinical Cancer Research 10:6759-63, 2004; Correction and supplement 12:3229, 2006
• Maitournam A and Simon R. On the efficiency of targeted clinical trials. Statistics in Medicine 24:329-339, 2005.
![Page 39: On the Road to Predictive Oncology Challenges for Statistics and for Clinical Investigation](https://reader033.fdocuments.us/reader033/viewer/2022051623/56815d34550346895dcb31e9/html5/thumbnails/39.jpg)
Model for Two Treatments With Binary Response
• New treatment T
• Control treatment C
• $1-\gamma$ = proportion marker +
• $p_c$ = control response probability
• Response probability for T:
  – Marker +: $p_c + \delta_1$
  – Marker −: $p_c + \delta_0$
![Page 40: On the Road to Predictive Oncology Challenges for Statistics and for Clinical Investigation](https://reader033.fdocuments.us/reader033/viewer/2022051623/56815d34550346895dcb31e9/html5/thumbnails/40.jpg)
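Randomized Ratio (normal approximation)
• RandRat = $n_{untargeted}/n_{targeted}$
• $\delta_1$ = rx effect in marker + patients
• $\delta_0$ = rx effect in marker − patients
• $\gamma$ = proportion of marker − patients
$$\text{RandRat} \approx \left(\frac{\delta_1}{(1-\gamma)\,\delta_1 + \gamma\,\delta_0}\right)^{2}$$
• If $\delta_0 = 0$, RandRat = $1/(1-\gamma)^2$
• If $\delta_0 = \delta_1/2$, RandRat = $1/(1-\gamma/2)^2$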
![Page 41: On the Road to Predictive Oncology Challenges for Statistics and for Clinical Investigation](https://reader033.fdocuments.us/reader033/viewer/2022051623/56815d34550346895dcb31e9/html5/thumbnails/41.jpg)
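Randomized Ratio $n_{untargeted}/n_{targeted}$

| $1-\gamma$ (proportion expressing target) | $\delta_0 = 0$ | $\delta_0 = \delta_1/2$ |
|---|---|---|
| 0.75 | 1.78 | 1.31 |
| 0.5 | 4 | 1.78 |
| 0.25 | 16 | 2.56 |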
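The tabulated ratios can be reproduced from the normal-approximation formula, in which the untargeted trial sees the marker-positive effect diluted by the marker-negative patients. In this sketch `gamma` stands for the proportion of marker-negative patients and `delta1`, `delta0` for the treatment effects in marker-positive and marker-negative patients; these names are assumptions, since the symbols did not survive in the transcript.

```python
def rand_rat(gamma, delta1, delta0):
    """Ratio of patients randomized, untargeted design over targeted design.

    gamma  : proportion of marker-negative patients
    delta1 : treatment effect in marker-positive patients
    delta0 : treatment effect in marker-negative patients
    """
    diluted_effect = (1 - gamma) * delta1 + gamma * delta0
    return (delta1 / diluted_effect) ** 2

if __name__ == "__main__":
    for frac_positive in (0.75, 0.5, 0.25):
        gamma = 1 - frac_positive
        no_benefit = rand_rat(gamma, 1.0, 0.0)    # delta0 = 0
        half_benefit = rand_rat(gamma, 1.0, 0.5)  # delta0 = delta1 / 2
        print(frac_positive, round(no_benefit, 2), round(half_benefit, 2))
```

Only the ratio `delta0 / delta1` matters, so `delta1` is set to 1 in the loop.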
![Page 42: On the Road to Predictive Oncology Challenges for Statistics and for Clinical Investigation](https://reader033.fdocuments.us/reader033/viewer/2022051623/56815d34550346895dcb31e9/html5/thumbnails/42.jpg)
• Relative efficiency of targeted design depends on
  – proportion of patients who are test positive
  – effectiveness of new drug (compared to control) for test negative patients
• When less than half of patients are test positive and the drug has little or no benefit for test negative patients, the targeted design requires dramatically fewer randomized patients
![Page 43: On the Road to Predictive Oncology Challenges for Statistics and for Clinical Investigation](https://reader033.fdocuments.us/reader033/viewer/2022051623/56815d34550346895dcb31e9/html5/thumbnails/43.jpg)
Trastuzumab (Herceptin)
• Metastatic breast cancer
• 234 randomized patients per arm
• 90% power for 13.5% improvement in 1-year survival over 67% baseline at 2-sided .05 level
• If benefit were limited to the 25% assay + patients, overall improvement in survival would have been 3.375%
  – 4025 patients/arm would have been required
![Page 44: On the Road to Predictive Oncology Challenges for Statistics and for Clinical Investigation](https://reader033.fdocuments.us/reader033/viewer/2022051623/56815d34550346895dcb31e9/html5/thumbnails/44.jpg)
Developmental Strategy (II)
Develop Predictor of Response to New Rx
• Predicted responsive to new Rx: randomize between new Rx and control
• Predicted non-responsive to new Rx: randomize between new Rx and control
![Page 45: On the Road to Predictive Oncology Challenges for Statistics and for Clinical Investigation](https://reader033.fdocuments.us/reader033/viewer/2022051623/56815d34550346895dcb31e9/html5/thumbnails/45.jpg)
Developmental Strategy (II)
• Do not use the diagnostic to restrict eligibility, but to structure a prospective analysis plan
• Having a prospective analysis plan is essential
• “Stratifying” (balancing) the randomization is useful to ensure that all randomized patients have tissue available but is not a substitute for a prospective analysis plan
• The purpose of the study is to evaluate the new treatment overall and for the pre-defined subsets; not to modify or refine the classifier
![Page 46: On the Road to Predictive Oncology Challenges for Statistics and for Clinical Investigation](https://reader033.fdocuments.us/reader033/viewer/2022051623/56815d34550346895dcb31e9/html5/thumbnails/46.jpg)
• R Simon. Using genomics in clinical trial design, Clinical Cancer Research 14:5984-93, 2008
• R Simon. Designs and adaptive analysis plans for pivotal clinical trials of therapeutics and companion diagnostics, Expert Opinion in Medical Diagnostics 2:721-29, 2008
![Page 47: On the Road to Predictive Oncology Challenges for Statistics and for Clinical Investigation](https://reader033.fdocuments.us/reader033/viewer/2022051623/56815d34550346895dcb31e9/html5/thumbnails/47.jpg)
![Page 48: On the Road to Predictive Oncology Challenges for Statistics and for Clinical Investigation](https://reader033.fdocuments.us/reader033/viewer/2022051623/56815d34550346895dcb31e9/html5/thumbnails/48.jpg)
Analysis Plan B (Fall-back Plan)
• Compare the new drug to the control overall for all patients, ignoring the classifier.
  – If $p_{overall} \le 0.03$, claim effectiveness for the eligible population as a whole
• Otherwise perform a single subset analysis evaluating the new drug in the classifier + patients
  – If $p_{subset} \le 0.02$, claim effectiveness for the classifier + patients.
![Page 49: On the Road to Predictive Oncology Challenges for Statistics and for Clinical Investigation](https://reader033.fdocuments.us/reader033/viewer/2022051623/56815d34550346895dcb31e9/html5/thumbnails/49.jpg)
Analysis Plan C (Interaction Plan)
• Test for difference (interaction) between treatment effect in test positive patients and treatment effect in test negative patients
• If interaction is significant at level $\alpha_{int}$, then compare treatments separately for test positive patients and test negative patients
• Otherwise, compare treatments overall
![Page 50: On the Road to Predictive Oncology Challenges for Statistics and for Clinical Investigation](https://reader033.fdocuments.us/reader033/viewer/2022051623/56815d34550346895dcb31e9/html5/thumbnails/50.jpg)
Sample Size Planning for Analysis Plan C
• 88 events in test + patients needed to detect 50% reduction in hazard at 5% two-sided significance level with 90% power
• If 25% of patients are positive, when there are 88 events in positive patients there will be about 264 events in negative patients
  – 264 events provide 90% power for detecting 33% reduction in hazard at 5% two-sided significance level
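These event counts are consistent with the standard Schoenfeld approximation for a 1:1 randomized comparison, events = 4(z₁₋α/₂ + z_power)² / (ln HR)². The slides do not state which formula was used, so the sketch below is an assumption that happens to reproduce the figures; `events_needed` and `power_for_events` are hypothetical names.

```python
from math import ceil, log, sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()

def events_needed(hazard_ratio, alpha=0.05, power=0.90):
    """Events required to detect a hazard ratio (two-sided alpha, 1:1 allocation)."""
    z = STD_NORMAL.inv_cdf(1 - alpha / 2) + STD_NORMAL.inv_cdf(power)
    return ceil(4 * z ** 2 / log(hazard_ratio) ** 2)

def power_for_events(events, hazard_ratio, alpha=0.05):
    """Approximate power of the log-rank test with a given number of events."""
    z_alpha = STD_NORMAL.inv_cdf(1 - alpha / 2)
    return STD_NORMAL.cdf(sqrt(events) / 2 * abs(log(hazard_ratio)) - z_alpha)

if __name__ == "__main__":
    positives = events_needed(0.5)   # 50% reduction in hazard
    negatives = 3 * positives        # 25% positive implies about 3 negatives per positive
    print(positives, negatives, round(power_for_events(negatives, 0.67), 3))
```

The factor of 3 assumes event rates are similar in test positive and test negative patients, which is what "about 264 events" implies.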
![Page 51: On the Road to Predictive Oncology Challenges for Statistics and for Clinical Investigation](https://reader033.fdocuments.us/reader033/viewer/2022051623/56815d34550346895dcb31e9/html5/thumbnails/51.jpg)
Simulation Results for Analysis Plan C
• Using $\alpha_{int}=0.10$, the interaction test has power 93.7% when there is a 50% reduction in hazard in test positive patients and no treatment effect in test negative patients
• A significant interaction and significant treatment effect in test positive patients is obtained in 88% of cases under the above conditions
• If the treatment reduces hazard by 33% uniformly, the interaction test is negative and the overall test is significant in 87% of cases
![Page 52: On the Road to Predictive Oncology Challenges for Statistics and for Clinical Investigation](https://reader033.fdocuments.us/reader033/viewer/2022051623/56815d34550346895dcb31e9/html5/thumbnails/52.jpg)
• It can be difficult to identify a single completely defined classifier candidate prior to initiation of the phase III trial evaluating the new treatment
![Page 53: On the Road to Predictive Oncology Challenges for Statistics and for Clinical Investigation](https://reader033.fdocuments.us/reader033/viewer/2022051623/56815d34550346895dcb31e9/html5/thumbnails/53.jpg)
![Page 54: On the Road to Predictive Oncology Challenges for Statistics and for Clinical Investigation](https://reader033.fdocuments.us/reader033/viewer/2022051623/56815d34550346895dcb31e9/html5/thumbnails/54.jpg)
Generalization of Biomarker Adaptive Threshold Design (Global Test Approach)
• Have identified K candidate predictive binary classifiers B1 , …, BK thought to be predictive of patients likely to benefit from T relative to C
• Eligibility not restricted by candidate biomarkers
![Page 55: On the Road to Predictive Oncology Challenges for Statistics and for Clinical Investigation](https://reader033.fdocuments.us/reader033/viewer/2022051623/56815d34550346895dcb31e9/html5/thumbnails/55.jpg)
End of Trial Analysis
• Compare T to C for all patients at significance level $\alpha_{overall}$ (e.g. 0.03)
– If overall H0 is rejected, then claim effectiveness of T for eligible patients
– Otherwise
![Page 56: On the Road to Predictive Oncology Challenges for Statistics and for Clinical Investigation](https://reader033.fdocuments.us/reader033/viewer/2022051623/56815d34550346895dcb31e9/html5/thumbnails/56.jpg)
• Test T vs C restricted to patients positive for $B_k$ for k=1,…,K
  – Let $S_k$ be log likelihood ratio statistic for treatment effect in patients positive for $B_k$ (k=1,…,K)
• Let $S^* = \max_k\{S_k\}$, $k^* = \arg\max_k\{S_k\}$
• Compute null distribution of $S^*$ by permuting treatment labels
• If the unpermuted data value of $S^*$ is significant at level $0.05 - \alpha_{overall}$, claim effectiveness of T for patients positive for $B_{k^*}$
![Page 57: On the Road to Predictive Oncology Challenges for Statistics and for Clinical Investigation](https://reader033.fdocuments.us/reader033/viewer/2022051623/56815d34550346895dcb31e9/html5/thumbnails/57.jpg)
Cross-Validated Adaptive Signature Design
(Clinical Cancer Research, Jan 2010)
W Jiang, B Freidlin, R Simon
![Page 58: On the Road to Predictive Oncology Challenges for Statistics and for Clinical Investigation](https://reader033.fdocuments.us/reader033/viewer/2022051623/56815d34550346895dcb31e9/html5/thumbnails/58.jpg)
Cross-Validated Adaptive Signature Design
End of Trial Analysis
• Compare T to C for all patients at significance level $\alpha_{overall}$ (e.g. 0.03)
– If overall H0 is rejected, then claim effectiveness of T for eligible patients
– Otherwise
![Page 59: On the Road to Predictive Oncology Challenges for Statistics and for Clinical Investigation](https://reader033.fdocuments.us/reader033/viewer/2022051623/56815d34550346895dcb31e9/html5/thumbnails/59.jpg)
Otherwise
• Partition the full data set into K parts $P_1, \ldots, P_K$
• Form a training set by omitting one of the K parts, e.g. part k
  – $Tr_k = \{1, \ldots, n\} - P_k$
• The omitted part $P_k$ is the test set
• Using the training set, develop a predictive binary classifier $B_{-k}$ of the subset of patients who benefit preferentially from the new treatment compared to control
• Classify the patients i in the test set as sensitive, $B_{-k}(x_i)=1$, or insensitive, $B_{-k}(x_i)=0$
  – Let $S_k = \{i \in P_k : B_{-k}(x_i)=1\}$
![Page 60: On the Road to Predictive Oncology Challenges for Statistics and for Clinical Investigation](https://reader033.fdocuments.us/reader033/viewer/2022051623/56815d34550346895dcb31e9/html5/thumbnails/60.jpg)
• Repeat this procedure K times, leaving out a different part each time
• After this is completed, all patients in the full dataset are classified as sensitive or insensitive
  – $S_{cv} = \bigcup_{k} S_k$
![Page 61: On the Road to Predictive Oncology Challenges for Statistics and for Clinical Investigation](https://reader033.fdocuments.us/reader033/viewer/2022051623/56815d34550346895dcb31e9/html5/thumbnails/61.jpg)
• For patients classified as sensitive, compare outcomes for patients who received new treatment T to those who received control treatment C.
  – Outcomes for patients in $S_{cv} \cap T$ vs outcomes for patients in $S_{cv} \cap C$
• Compute a test statistic $D_{sens}$
  – e.g. the difference in response proportions or log-rank statistic for survival
• Generate the null distribution of $D_{sens}$ by permuting the treatment labels and repeating the entire K-fold cross-validation procedure
• Perform test at significance level $0.05 - \alpha_{overall}$
![Page 62: On the Road to Predictive Oncology Challenges for Statistics and for Clinical Investigation](https://reader033.fdocuments.us/reader033/viewer/2022051623/56815d34550346895dcb31e9/html5/thumbnails/62.jpg)
• If H0 is rejected, claim superiority of new treatment T for future patients with expression vector x for which B(x)=1 where B is the classifier of sensitive patients developed using the full dataset
• The estimate of treatment effect for future sensitive patients is Dsens computed from the cross-validated sensitive subset Scv
• The stability of the sensitive subset {x:B(x)=1} can be evaluated based on applying the classifier development algorithm to non-parametric bootstrap samples of the full dataset {1,...,n}
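The subset stage of the cross-validated design can be sketched for a binary response. Everything specific here is an illustrative assumption: `develop_classifier` is a deliberately simple stand-in (it picks the single covariate whose above-median half shows the largest T-minus-C response difference), the statistic is the difference in response proportions, and all function names are hypothetical. The design itself allows any pre-specified classifier development algorithm.

```python
import numpy as np

def rate_difference(trt, resp):
    """Response proportion on T (trt == 1) minus response proportion on C."""
    if (trt == 1).sum() == 0 or (trt == 0).sum() == 0:
        return 0.0  # an arm is empty, so no comparison is possible
    return float(resp[trt == 1].mean() - resp[trt == 0].mean())

def develop_classifier(X, trt, resp):
    """Hypothetical stand-in for the classifier development algorithm."""
    best_effect, best_j, best_cut = -np.inf, 0, 0.0
    for j in range(X.shape[1]):
        cut = float(np.median(X[:, j]))
        high = X[:, j] > cut
        effect = rate_difference(trt[high], resp[high])
        if effect > best_effect:
            best_effect, best_j, best_cut = effect, j, cut
    return lambda Xnew: Xnew[:, best_j] > best_cut

def cv_sensitive_subset(X, trt, resp, folds):
    """Classify every patient with a classifier built without that patient's fold."""
    sensitive = np.zeros(len(trt), dtype=bool)
    for k in np.unique(folds):
        test = folds == k
        classifier = develop_classifier(X[~test], trt[~test], resp[~test])
        sensitive[test] = classifier(X[test])
    return sensitive

def cv_asd_subset_test(X, trt, resp, K=10, n_perm=200, seed=0):
    """D_sens and its permutation p value; the whole K-fold CV is redone per permutation."""
    rng = np.random.default_rng(seed)
    folds = rng.permutation(len(trt)) % K

    def d_sens(labels):
        subset = cv_sensitive_subset(X, labels, resp, folds)
        return rate_difference(labels[subset], resp[subset])

    observed = d_sens(trt)
    null = np.array([d_sens(rng.permutation(trt)) for _ in range(n_perm)])
    return observed, float(np.mean(null >= observed))
```

The returned p value would be compared with 0.05 minus the level spent on the overall test. A survival endpoint would replace `rate_difference` with a log-rank statistic.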
![Page 63: On the Road to Predictive Oncology Challenges for Statistics and for Clinical Investigation](https://reader033.fdocuments.us/reader033/viewer/2022051623/56815d34550346895dcb31e9/html5/thumbnails/63.jpg)
• 70% response to T in sensitive patients
• 25% response to T otherwise
• 25% response to C
• 20% of patients sensitive, n=400

| Test | ASD | CV-ASD |
|---|---|---|
| Overall 0.05 test | 0.486 | 0.503 |
| Overall 0.04 test | 0.452 | 0.471 |
| Sensitive subset 0.01 test | 0.207 | 0.588 |
| Overall power | 0.525 | 0.731 |
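A short worked calculation may help explain why the overall test has only moderate power in this scenario. It assumes the 20% sensitive fraction applies equally to both arms and that response to C does not depend on sensitivity; neither assumption is stated on the slide.

$$
\begin{aligned}
P(\text{response} \mid T) &= 0.20 \times 0.70 + 0.80 \times 0.25 = 0.34 \\
P(\text{response} \mid C) &= 0.25 \\
\text{overall difference} &= 0.34 - 0.25 = 0.09 \\
\text{difference in sensitive patients} &= 0.70 - 0.25 = 0.45
\end{aligned}
$$

The large effect in the sensitive fifth of patients is diluted to a 9-point difference in the full population. The 0.04 overall test and the 0.01 sensitive subset test together spend a total significance level of 0.04 + 0.01 = 0.05.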
![Page 64: On the Road to Predictive Oncology Challenges for Statistics and for Clinical Investigation](https://reader033.fdocuments.us/reader033/viewer/2022051623/56815d34550346895dcb31e9/html5/thumbnails/64.jpg)
![Page 65: On the Road to Predictive Oncology Challenges for Statistics and for Clinical Investigation](https://reader033.fdocuments.us/reader033/viewer/2022051623/56815d34550346895dcb31e9/html5/thumbnails/65.jpg)
![Page 66: On the Road to Predictive Oncology Challenges for Statistics and for Clinical Investigation](https://reader033.fdocuments.us/reader033/viewer/2022051623/56815d34550346895dcb31e9/html5/thumbnails/66.jpg)
Prediction Based Analysis of Clinical Trials
• Using cross-validation we can evaluate any classification algorithm for identifying the patients sensitive to the new treatment relative to the control using any set of covariates.
• The algorithm and covariates should be pre-specified.
• The algorithm A, when applied to a dataset D, should provide a function B(x;A,D) that maps a covariate vector x to {0,1}, where 1 means that treatment T is preferred to treatment C for the patient.
• The algorithm can be simple or complex, frequentist or Bayesian based.
– Prediction effectiveness depends on the algorithm and the dataset
– Complex algorithms may over-fit the data and provide poor results, including Bayesian models with many parameters and non-informative priors
• Prediction effectiveness for the given clinical trial dataset can be evaluated by cross-validation
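The B(x;A,D) notation can be made concrete with a small interface sketch. Everything named here is hypothetical: `treat_everyone` and `median_split_rule` are toy algorithms invented for illustration, and the tuple layout of a patient record is an assumption. The point is only that any pre-specified algorithm A with the signature "dataset in, classifier out" can be plugged into the same cross-validation loop.

```python
from typing import Callable, List, Tuple

Patient = Tuple[List[float], str, int]             # (covariates x, "T" or "C", response 0/1)
Classifier = Callable[[List[float]], int]          # B(x) in {0, 1}
Algorithm = Callable[[List[Patient]], Classifier]  # A applied to D gives B(. ; A, D)

def effect(patients: List[Patient]) -> float:
    """T-minus-C difference in response proportions (0 if an arm is empty)."""
    t = [y for _, trt, y in patients if trt == "T"]
    c = [y for _, trt, y in patients if trt == "C"]
    return sum(t) / len(t) - sum(c) / len(c) if t and c else 0.0

def treat_everyone(data: List[Patient]) -> Classifier:
    """Simplest possible A: ignores D and always prefers T."""
    return lambda x: 1

def median_split_rule(data: List[Patient]) -> Classifier:
    """Toy A: split covariate 0 at its training median and prefer T in the
    half with the larger positive treatment effect, if there is one."""
    xs = sorted(x[0] for x, _, _ in data)
    cut = xs[len(xs) // 2]
    high = effect([p for p in data if p[0][0] >= cut])
    low = effect([p for p in data if p[0][0] < cut])
    if max(high, low) <= 0:
        return lambda x: 0
    if high >= low:
        return lambda x: 1 if x[0] >= cut else 0
    return lambda x: 1 if x[0] < cut else 0

def cross_validated_labels(algorithm: Algorithm, data: List[Patient], k: int = 5) -> List[int]:
    """Label each patient with a classifier developed without that patient's fold."""
    labels = [0] * len(data)
    for fold in range(k):
        train = [p for i, p in enumerate(data) if i % k != fold]
        b = algorithm(train)
        for i in range(fold, len(data), k):
            labels[i] = b(data[i][0])
    return labels

def cross_validated_effect(algorithm: Algorithm, data: List[Patient], k: int = 5) -> float:
    """Treatment effect among patients whose cross-validated label is 1."""
    labels = cross_validated_labels(algorithm, data, k)
    return effect([p for p, lab in zip(data, labels) if lab == 1])
```

Calling `cross_validated_effect` with different algorithms on the same trial dataset gives a like-for-like comparison of prediction effectiveness, provided each algorithm was fixed before looking at the outcomes.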
![Page 67: On the Road to Predictive Oncology Challenges for Statistics and for Clinical Investigation](https://reader033.fdocuments.us/reader033/viewer/2022051623/56815d34550346895dcb31e9/html5/thumbnails/67.jpg)
Conclusions
• A more personalized oncology is rapidly developing based (so far) on information in the tumor genome
• Genomics has spawned new and interesting areas of biostatistics including methods for p>n prediction problems, systems biology and the design of predictive clinical trials
• There are important opportunities and great needs for young biostatisticians with rigorous training in biostatistics and high motivation for trans-disciplinary research in biology and biomedicine
![Page 68: On the Road to Predictive Oncology Challenges for Statistics and for Clinical Investigation](https://reader033.fdocuments.us/reader033/viewer/2022051623/56815d34550346895dcb31e9/html5/thumbnails/68.jpg)
Acknowledgements
• Kevin Dobbin
• Boris Freidlin
• Wenyu Jiang
• Aboubakar Maitournam
• Michael Radmacher
• Jyothi Subramanian
• Yingdong Zhao
![Page 69: On the Road to Predictive Oncology Challenges for Statistics and for Clinical Investigation](https://reader033.fdocuments.us/reader033/viewer/2022051623/56815d34550346895dcb31e9/html5/thumbnails/69.jpg)
BRB-ArrayTools
• Architect – R Simon
• Developer – Emmes Corporation
• Contains wide range of analysis tools that I have selected
• Designed for use by biomedical scientists
• Imports data from all gene expression and copy-number platforms
– Automated import of data from NCBI Gene Expression Omnibus
• Highly computationally efficient
• Extensive annotations for identified genes
• Integrated analysis of expression data, copy number data, pathway data and other biological data
![Page 70: On the Road to Predictive Oncology Challenges for Statistics and for Clinical Investigation](https://reader033.fdocuments.us/reader033/viewer/2022051623/56815d34550346895dcb31e9/html5/thumbnails/70.jpg)
Predictive Classifiers in BRB-ArrayTools
• Classifiers
– Diagonal linear discriminant
– Compound covariate
– Bayesian compound covariate
– Support vector machine with inner product kernel
– K-nearest neighbor
– Nearest centroid
– Shrunken centroid (PAM)
– Random forest
– Tree of binary classifiers for k-classes
• Survival risk-group
– Supervised principal components
– With clinical covariates
– Cross-validated K-M curves
• Predict quantitative trait
– LARS, LASSO
• Feature selection options
– Univariate t/F statistic
– Hierarchical random variance model
– Restricted by fold effect
– Univariate classification power
– Recursive feature elimination
– Top-scoring pairs
• Validation methods
– Split-sample
– LOOCV
– Repeated k-fold CV
– .632+ bootstrap
• Permutational statistical significance
![Page 71: On the Road to Predictive Oncology Challenges for Statistics and for Clinical Investigation](https://reader033.fdocuments.us/reader033/viewer/2022051623/56815d34550346895dcb31e9/html5/thumbnails/71.jpg)
BRB-ArrayTools, June 2009
• 10,000+ registered users
• 68 countries
• 1000+ citations