COMPUTER INTENSIVE AND RE-RANDOMIZATION TESTS IN CLINICAL TRIALS

Thomas Hammerstrom, Ph.D.

USFDA, Division of Biometrics

The opinions expressed are those of the author and do not necessarily reflect those of the FDA.

OBJECTIVE OF TALK

Discuss role of randomization and deliberate balancing in experimental design.

Compare standard and computer-intensive tests to examine the robustness of the level and power of common tests under deliberately balanced assignment when the assumed distribution of responses is incorrect.

OUTLINE OF TALK

I. Testing with Deliberately Balanced Assignment

II. Common Mistakes in Views on Randomization and Balance

III. Robustness Studies on Inference in Deliberately Balanced Designs

I. TESTING WITH DYNAMIC ALLOCATION

DYNAMIC ASSIGNMENTS

1. Identify several relevant, discrete covariates, e.g., age, sex, CD4 count

2. Change randomization probabilities at each assignment to get each level of each covariate split nearly 50-50 between arms

3. Assign the new subject randomly if all covariates are balanced; if they are not currently balanced, assign deterministically or with unequal probabilities to move toward marginal balance.
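The three steps above can be sketched in a few lines of Python. This is a simplified minimization rule in the spirit of such schemes, not the specific algorithm used in the talk: the function name `minimization_assign`, the parameter `p_favor`, and the use of a summed marginal imbalance as the balance criterion are all assumptions made for illustration.

```python
import random

def minimization_assign(covariates, p_favor=0.8, seed=0):
    """Assign subjects to arm 0 or 1 in enrollment order.

    covariates: list of tuples, one per subject, holding that subject's
        discrete covariate levels, e.g. (age_group, sex, cd4_group).
    p_favor: hypothetical probability of choosing the arm that reduces
        imbalance; 1.0 makes that choice deterministic.
    """
    rng = random.Random(seed)
    counts = {}  # (covariate index, level) -> [n in arm 0, n in arm 1]
    arms = []
    for levels in covariates:
        # Sum of (arm 1 minus arm 0) counts over this subject's own levels.
        # Assumption: "balanced" means this summed marginal imbalance is zero.
        imbalance = 0
        for k, level in enumerate(levels):
            n0, n1 = counts.get((k, level), [0, 0])
            imbalance += n1 - n0
        if imbalance == 0:
            arm = rng.randint(0, 1)  # balanced: fair coin
        else:
            lagging = 0 if imbalance > 0 else 1  # arm with fewer such subjects
            arm = lagging if rng.random() < p_favor else 1 - lagging
        for k, level in enumerate(levels):
            counts.setdefault((k, level), [0, 0])[arm] += 1
        arms.append(arm)
    return arms
```

With `p_favor=1.0` every unbalanced step is deterministic, which is the case in which the assumptions behind standard tests are most open to question.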

ISSUES WITH DYNAMIC ASSIGNMENTS

1. Why bother with this elaborate procedure?

2. Are the levels of tests for treatment effect preserved when standard tests are used with dynamic (minimization) assignments?

3. Does the use of minimization increase power in the presence of both treatment and covariate effects?

II. COMMON MISTAKES IN ANALYSIS OF BASELINE COVARIATES

Mistake 1. Purpose of Randomization is to Create Balance in Baseline Covariates

Fact: Purpose of Randomization is to Guarantee Distributional Assumptions of Test Statistics and Estimators

Mistake 2. It is good practice in a randomized trial to test for equality between arms of a baseline covariate.

Fact: All observed differences between arms in baseline covariates are known with certainty to be due to chance. There is no alternative hypothesis whose truth can be supported by such a test.

Mistake 3. If a test for equality between arms of a baseline covariate is significant, then one should worry.

Fact: Such test statistics are not even good descriptive statistics since p-values depend on sample size, not just the magnitude of the difference.
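The dependence on sample size can be illustrated with a small sketch. It assumes a normally distributed baseline covariate with known standard deviation and a two-sided z-test; the function name and the numbers are illustrative and do not come from the talk.

```python
from math import sqrt
from statistics import NormalDist

def baseline_p_value(diff_in_sd_units, n_per_arm):
    """Two-sided z-test p-value for a between-arm difference in means.

    diff_in_sd_units: observed difference divided by the (assumed known)
        standard deviation of the covariate.
    n_per_arm: number of subjects in each arm.
    """
    z = diff_in_sd_units / sqrt(2.0 / n_per_arm)
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))

# The same imbalance of 0.2 standard deviations at two trial sizes.
p_small_trial = baseline_p_value(0.2, 50)
p_large_trial = baseline_p_value(0.2, 500)
```

The imbalance is identical in both calls, yet only the larger trial crosses the conventional 0.05 threshold, so the p-value says more about N than about the size of the difference.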

Mistake 4. Observed Imbalances in Baseline Covariates cast Doubt on the Reality of Statistically Significant Findings in the Primary Analysis.

Fact: The standard error of the primary statistic is large enough to ensure that such imbalances create significant treatment effects no more frequently than the nominal level of the test.

Mistake 5. Type I Errors can be Reduced by Replacing the Primary Analysis with one Based on Stratifying on Baseline Covariates Observed Post Facto to be Unbalanced.

Fact: The Operating Characteristics of Procedures Selected on the Basis of Observation of the Data are not generally Quantifiable.

If the Agency approved of Post Hoc Fixing of Type I Errors by Adding New Covariates to the Analysis (or by other Adjustments to ‘Fix Randomization Failures’), then it should also Approve of Similar Post Hoc Fixing of Type II Errors when ‘Failure of Randomization’ Leads to Imbalance in Favor of the Control Arm.

Mistake 6. If the same Random Assignment Method gave more even Balance in Trial A than in Trial B, then one should place more trust in a Rejection of the Null Hypothesis from Trial A.

Fact: Balance on Baseline Covariates Decreases the Variance of Test Statistics and Estimators. It Increases the Power of Tests when the Alternative Hypothesis is True. It has no Effect on Type I Error.

Mistake 7. Balance on Baseline Covariates Leads to Important Reductions in Variances.

Fact: Even without Balance, the Variances of Test Statistics and Estimators are of size $O(1/N)$, where $N$ = sample size.

Balancing on $p$ Baseline Covariates Decreases these Variances by Subtracting a Term of size $O(p/N^2)$.

Typical model for Continuous Response:

$Y_{ik} = \mu_i + \gamma_1 x^1_{ik} + \cdots + \gamma_p x^p_{ik} + e_{ik}$

where $e_{ik} \sim N(0, \sigma^2)$,

$\mu_i$ = treatment effect,

$X_{ik} = (x^1_{ik}, \ldots, x^p_{ik})$ = vector of covariates,

$\gamma_1, \ldots, \gamma_p$ = unknown vector of covariate effects

$\sigma^2 \times$ Precision of Estimate of $(\mu_1 - \mu_0)$ = $N/2 - Z'Z$

where $N$ = number per arm,

$Z = V^{-1}(X_{1\cdot} - X_{0\cdot})$,

$V^2$ = matrix of cross-products of $X$ / $2N$, and

the randomization distribution of $(X_{1\cdot} - X_{0\cdot})$ is $N(0, V^2)$, of $Z$ is $N(0, I_p)$, and of $Z'Z$ is Chi-square($p$).

Precision with Balance = $N/2$,

E(Precision without Balance) = $N/2 - O(p)$
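As a rough worked step that is not on the slides, the precision formula can be turned into a variance. This assumes that "precision" means the reciprocal of the variance and that $Z'Z$ is small relative to $N/2$; here $\sigma^2$ denotes the error variance and $\hat\mu_1 - \hat\mu_0$ the estimated treatment difference.

```latex
\[
\operatorname{Var}(\hat\mu_1 - \hat\mu_0)
  = \frac{\sigma^2}{N/2 - Z'Z}
  = \frac{2\sigma^2}{N}\left(1 - \frac{2\,Z'Z}{N}\right)^{-1}
  \approx \frac{2\sigma^2}{N} + \frac{4\sigma^2\,Z'Z}{N^2}.
\]
```

Since $E(Z'Z) = p$ under the chi-square randomization distribution, the expected excess variance without balance is about $4\sigma^2 p/N^2$, which is the $O(p/N^2)$ term, set against a leading term of $2\sigma^2/N = O(1/N)$.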

III. ROBUSTNESS STUDIES ON INFERENCE IN DELIBERATELY BALANCED DESIGNS

A. MODELS USED TO COMPARE METHODS

METHODS COMPARED

1. Dynamic Allocation analyzed by F-statistic from ANCOVA based on arm and covariates

2. Dynamic Allocation analyzed by re-randomization test, using difference in means

3. Randomized Pairs, analyzed by F-statistic from ANCOVA using arm and covariates

BASIC FORM OF SIMULATED DATA

1. Control & test arms, $N$ subjects randomized 1:1

2. $X_{1j}, \ldots, X_{7j}$ = binary covariates for subject $j$

3. $e_j$ = unobserved error for subject $j$

4. $Y_j$ = observed response for subject $j$

5. $I_{1j} = 1$ if subject $j$ is in arm 1, the test arm

6. $Y_j = \mu_j I_{1j} + e_j + \delta \sum_{k=1}^{7} X_{kj}$

MODELS FOR ERRORS

1. $e_j \sim N(0, 1)$: Normal

2. $e_j \sim \exp(N(0, 1))$: Lognormal

3. $e_j \sim N(4j/N, 1)$: Trend

4. $e_j \sim 0.9\,N(0, 1) + 0.1\,N(0, 25)$: Mixed

5. $e_j \sim N(0, 4j/N)$: Hetero

6. $e_j \sim N(\cos(2j/N), 1)$: Sine wave

7. $e_j \sim N(0, 1)$ if $j < J$, $e_j \sim N(4, 1)$ if $j \ge J$: Step
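The seven error models can be sketched as a single generator. Several details are assumptions rather than facts from the slides: the second argument of N(., .) is read as a variance, the sine-wave mean is taken literally as printed, the change point J is not given in the source and defaults here to the midpoint, and the function and model names are hypothetical.

```python
import math
import random

def simulate_errors(model, n, rng, change_point=None):
    """Return the errors e_1, ..., e_n for one of the seven error models.

    Assumption: the second argument of N(., .) on the slide is a variance.
    change_point: hypothetical value of J for the step model (default n // 2).
    """
    if change_point is None:
        change_point = n // 2
    errors = []
    for j in range(1, n + 1):
        t = j / n  # enrollment position as a fraction of the trial
        if model == "normal":
            e = rng.gauss(0.0, 1.0)
        elif model == "lognormal":
            e = math.exp(rng.gauss(0.0, 1.0))
        elif model == "trend":
            e = rng.gauss(4.0 * t, 1.0)
        elif model == "mixed":
            # 90% N(0, 1) and 10% N(0, 25); gauss takes a standard deviation
            sd = 1.0 if rng.random() < 0.9 else 5.0
            e = rng.gauss(0.0, sd)
        elif model == "hetero":
            e = rng.gauss(0.0, math.sqrt(4.0 * t))
        elif model == "sine":
            e = rng.gauss(math.cos(2.0 * t), 1.0)  # mean as printed: cos(2j/N)
        elif model == "step":
            e = rng.gauss(0.0 if j < change_point else 4.0, 1.0)
        else:
            raise ValueError("unknown error model: " + model)
        errors.append(e)
    return errors
```

For example, `simulate_errors("trend", 200, random.Random(1))` gives errors whose mean drifts upward with enrollment order, the kind of violation of the IID assumption that the robustness study targets.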

MODELS FOR COVARIATES

$X_{1j}, \ldots, X_{7j}$ are

1. independent with $p_1, \ldots, p_7$ constant in $j$

2. correlated with $p_1, \ldots, p_7$ constant

3. independent with $p_1, \ldots, p_7$ monotone in $j$

4. independent with $p_1, \ldots, p_7$ sinusoidal in $j$

Coefficient $\delta$ = 1 or 0

MODELS FOR TREATMENT

1. Treatment effect $\mu_j = \mu$, constant over $j$

2. Treatment effect $\mu_j = \mu \cdot (4j/N)$, increasing over $j$

COMPARISONS

1. Select one of the models

2. Generate 200 sets of covariates and unobserved errors

3. For each set, construct $I_{1j}$ once by dynamic allocation and once by randomized pairs

4. Compute the 200 p-values for different tests and assignment methods
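One of the p-values in step 4 comes from a re-randomization test on the difference in means. A minimal sketch follows, assuming that responses are held fixed under the null hypothesis while the assignment rule is re-run; the function names are hypothetical, and `randomized_pairs` reflects one plausible reading of "randomized pairs" (consecutive subjects paired, one to each arm), which the slides do not spell out.

```python
import random

def diff_in_means(y, arms):
    """Mean response in arm 1 minus mean response in arm 0."""
    y1 = [v for v, a in zip(y, arms) if a == 1]
    y0 = [v for v, a in zip(y, arms) if a == 0]
    return sum(y1) / len(y1) - sum(y0) / len(y0)

def randomized_pairs(n, rng):
    """Pair consecutive subjects and send one of each pair to each arm.

    Assumes n is even.
    """
    arms = []
    for _ in range(n // 2):
        first = rng.randint(0, 1)
        arms += [first, 1 - first]
    return arms

def rerandomization_p_value(y, observed_arms, assign, n_rerand=999, seed=0):
    """Two-sided re-randomization p-value for the difference in means.

    assign(n, rng) must regenerate a full assignment using the same rule
    that produced observed_arms; the responses y stay fixed.
    """
    rng = random.Random(seed)
    observed = abs(diff_in_means(y, observed_arms))
    as_extreme = 0
    for _ in range(n_rerand):
        arms = assign(len(y), rng)
        if abs(diff_in_means(y, arms)) >= observed:
            as_extreme += 1
    # Count the observed assignment itself so the p-value is never zero.
    return (as_extreme + 1) / (n_rerand + 1)
```

To analyze a dynamically allocated trial the same way, `assign` would instead re-run the dynamic allocation rule on the subjects' covariates in their original enrollment order.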

SIMULATED DATA FOR COX REGRESSION

1. Control & test arms, $N$ subjects randomized 1:1

2. $X_{1j}, \ldots, X_{7j}$ = binary covariates for subject $j$

3. $Y_{Lj}$ = potentially observed failure time for subject $j$ on arm $L$ = 0 or 1

4. $Y_{Lj} / [\delta_L (1 + \sum_{k=1}^{7} X_{kj})] \sim F_L$, $L$ = 0 or 1

5. $F_L$ = Exponential or Weibull

6. Censoring ~ Exponential, with scale large or small
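The time-to-event setup above can be sketched as follows for the exponential case only. The scale values in `delta`, the censoring scale, and the function name are hypothetical, and the sketch draws a failure time only for the arm each subject was actually assigned to rather than both potential outcomes.

```python
import random

def simulate_survival(covariates, arms, delta=(1.0, 1.5), censor_scale=5.0, seed=0):
    """Simulate censored failure times with a unit-exponential baseline.

    covariates: list of tuples of 0/1 covariates, one tuple per subject.
    arms: list of 0/1 arm labels, one per subject.
    delta: hypothetical arm-specific scale factors (delta_0, delta_1).
    censor_scale: hypothetical mean of the exponential censoring time.
    Returns (times, events), with events[j] = 1 if the failure was observed.
    """
    rng = random.Random(seed)
    times, events = [], []
    for x, arm in zip(covariates, arms):
        # Failure time divided by delta_L * (1 + sum of covariates) ~ Exp(1)
        scale = delta[arm] * (1 + sum(x))
        failure = scale * rng.expovariate(1.0)
        censor = rng.expovariate(1.0 / censor_scale)  # mean = censor_scale
        times.append(min(failure, censor))
        events.append(1 if failure <= censor else 0)
    return times, events
```

A small `censor_scale` produces heavy censoring and a large one produces light censoring, matching the two censoring regimes listed in item 6.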

RESULTS WITH COX REGRESSION

1. Assign subjects by dynamic allocation.

2. Estimate treatment effect by proportional hazards regression

3. Re-randomize and compute new proportional hazards regression estimates many times.

4. Compare the parametric p-value with the percentile of the real estimate among all re-randomized treatment estimates

III. ROBUSTNESS STUDIES ON INFERENCE IN DELIBERATELY BALANCED DESIGNS

B. RESULTS OF SIMULATIONS

SIMULATION RESULTS

1. In most cases considered, the gold-standard but computer-intensive re-randomization test gave the same power curve as the standard ANCOVA F-test under dynamic allocation. Both the level, when H0 was true, and the power, otherwise, were the same.

SIMULATION RESULTS

2. In most cases considered, the ANCOVA F-test gave the same power curve whether the subjects were assigned by dynamic allocation or randomized pairs. Deliberate balance on baseline covariates gave no improvement in power.

SIMULATION RESULTS

3. There was one clear exception to the above findings. When untreated responses showed a trend with time of enrollment, the ANCOVA F-test for treatment gave incorrectly low power.

SIMULATION RESULTS

4. In most cases considered with time to event data with dynamic allocation, the re-randomization test gave the same results as the Cox regression.

SUMMARY

1. Modifying a Randomization Method to Achieve Deliberate Balance Serves Mainly Cosmetic Purposes & Should be Discouraged

2. Balance on Covariates Reduces Variance of Test Stats & Estimators but Only by Small Amounts

Var(trt effect) = $O(1/N)$ when balanced

When unbalanced, Var is larger by a term = $O(p/N^2)$

SUMMARY

3. Re-randomization analyses based on Finite Population Models are the gold standard for randomized trials

4. IID Error models are only approximations

5. Approximation is adequate for level with common minimization allocations under a wide variety of potential violations of the assumptions.

SUMMARY

6. Deliberate Balance Allocations and Simple Tests Require Belief that God is Randomizing Your Subjects’ Responses.

Randomization and Finite Population Based Tests Protect You if the Devil is Determining the Order of Your Subjects’ Responses