Group Sequential and Adaptive Designs - Harvard Catalyst...Power if designed with base-case...

Group Sequential and Adaptive Designs

Part II: Adaptive Designs

May 2, 2015

Cyrus Mehta, Ph.D. – Cytel Inc.

• Adaptive designs are defined as:

“…changes in design or analyses guided by examination of the accumulated data at an interim point in the trial…”

-- 2010 FDA guidance

• Two types: “generally well understood” and “less well understood”

Adaptive Designs

2

• Well Understood:

• Group sequential designs

• Blinded sample size re-estimation

• Less Well Understood:

• Unblinded sample size re-estimation

• Dose selection

• Population enrichment

• Switching endpoints

Adaptive Designs in Confirmatory Trials

3

• Respond flexibly to accumulating data

• Efficiently and effectively streamline drug development

• Reduce costs and time to market

• Increase probability of success

• Combine adaptive design with adaptive financing

Opportunities for Adaptive Methods

4

Opportunities yes, but…

Potential for operational bias:

• premature unblinding of interim results

• investigator behavior changes after interim

• misclassification of primary outcome

• open label studies especially prone to this bias

Inflation of type I error rate

• handled by appropriate statistical methodology

Adaptations may not match pre-planned decision rules

• handled by adjusting the final significance level (like GSD)

Challenges for Adaptive Methods

5

• Therapy for relapsed or refractory AML is generally unsatisfactory; no approved drugs; dismal prognosis

• Vosaroxin and Ara-C combination evaLuating Overall survival in Relapsed/refractory AML

• Phase 3, double-blind, placebo-controlled, multinational trial for first-relapsed or refractory Acute Myeloid Leukemia (AML)

• Evaluate efficacy and safety of Vos+Ara-C versus Placebo+Ara-C

Case Study I: VALOR Trial for AML

6

VALOR Phase 3 Schema

After Cycle 1, all subsequent cycles at 70 mg/m2 vosaroxin on Days 1 and 4.

7

• Primary endpoint is overall survival

• Design for 90% power at 5% (2-sided) significance level

• Complete the trial in 30 months

• Enroll for 24 months

• Follow for 6 additional months

Design Objectives

8

• Limited information on Vosaroxin from a single phase 2 trial of 69 patients with no active comparator

• Median OS for Vosaroxin estimated at 7 months from phase 2 trial

• Median OS for Cytarabine estimated at 5 months from meta-analysis of prior studies and consultation with KOLs

• Hazard ratio estimated to be 0.71 amidst considerable uncertainty

Prior Phase 2 Data

9

• Based on phase 2 data:

• Assume 5/7 month median for Ctrl/Trtm (HR=0.71)

• Require 375 events and 450 subjects @ 19/month

• But phase 2 estimates are subject to uncertainty

• What if 5/6.5 month median on Ctrl/Trtm (HR=0.77)?

• HR=0.77 is still clinically meaningful

• Require 616 events and 732 subjects @ 31/month

• Not a feasible option for sponsor

• Given these constraints, how to design this single pivotal trial?
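The event counts above can be approximated with the Schoenfeld formula for a 1:1 log-rank test. This is a minimal sketch, not the calculation used for the slides: the function name `schoenfeld_events` is hypothetical, and the formula is for a fixed-sample design with no interim analysis. Under those assumptions it reproduces the 616 events quoted for HR=0.77 but gives somewhat fewer than 375 events for HR=0.71, so the slide's base-case figure presumably reflects other design inputs.

```python
import math
from statistics import NormalDist


def schoenfeld_events(hr, alpha_two_sided=0.05, power=0.90):
    """Approximate number of events for a 1:1 log-rank test (Schoenfeld formula).

    Assumption: fixed-sample design, no adjustment for interim analyses.
    """
    z = NormalDist().inv_cdf
    z_alpha = z(1 - alpha_two_sided / 2)  # two-sided significance level
    z_beta = z(power)
    return math.ceil(4 * (z_alpha + z_beta) ** 2 / math.log(hr) ** 2)


for hr in (0.71, 0.77):
    print(f"HR={hr}: about {schoenfeld_events(hr)} events")
```

The number of subjects and the enrollment rate then follow from the accrual and follow-up assumptions (24 months of enrollment plus 6 months of follow-up), which this sketch does not model.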

Sponsor’s Dilemma

10

Sponsor is Resource and Time Constrained

True HR            Power if designed with base-case    Power if designed with conservative
                   assumption (HR=0.71)                assumption (HR=0.77)

0.71               91%                                 99%

0.74               83%                                 97%

0.77               71%                                 90%

Resources Needed   450 patients @ 19/mth               732 patients @ 31/mth
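The power figures in this table can be checked approximately with the normal approximation to the log-rank test. This is a hedged sketch: the helper `logrank_power` is a hypothetical name, it assumes 1:1 allocation and a single final analysis at two-sided 5%, and it uses the event counts quoted earlier (375 and 616). It agrees with the table to within a percentage point or two; it is not the software calculation behind the slide.

```python
import math
from statistics import NormalDist


def logrank_power(events, true_hr, alpha_two_sided=0.05):
    """Approximate power of a 1:1 log-rank test with a given number of events."""
    nd = NormalDist()
    z_alpha = nd.inv_cdf(1 - alpha_two_sided / 2)
    # Expected value of the log-rank Z statistic: sqrt(events / 4) * |log HR|
    drift = math.sqrt(events / 4) * abs(math.log(true_hr))
    return nd.cdf(drift - z_alpha)


print("True HR  base-case (375 events)  conservative (616 events)")
for hr in (0.71, 0.74, 0.77):
    print(f"{hr:.2f}     {logrank_power(375, hr):.0%}                     {logrank_power(616, hr):.0%}")
```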

Risk of designing for the base case (HR=0.71)

• Pilots or POC trials often demonstrate greater efficacy than larger multicenter trials (Pereira et al., JAMA 2012; 308(16): 1676-1684)

Difficulty of designing with the conservative assumption (HR=0.77)

• Unable to muster the resources for such a large investment up-front

• Rule of thumb: cost/patient is about $50-80K for an oncology trial with an OS endpoint

11

• Design up-front for 90% power at HR=0.71 (requires 375 events; 450 subjects @ 19/mth)

• One interim analysis after 50% information (187 events)

• Stop early if overwhelming evidence of efficacy

• Stop early for futility if low conditional power

• Increase sample size and events if interim results are promising

• Key Idea: Milestone-Driven Investment. Invest additional resources and re-power the study only after seeing promising interim results

A Strategy of Staged Investment

12

Partition the interim outcome into three zones based on the interim estimate of conditional power:

• Unfavorable: CP < 30%; no change to design

• Promising: 30% ≤ CP < 90%; increase resources

• Favorable: CP ≥ 90%; no change to design
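The zone classification can be sketched with the usual conditional power formula evaluated at the interim estimate of the hazard ratio. The names `conditional_power` and `zone` are hypothetical, and several inputs are assumptions rather than values taken from the slides: information is approximated as events/4, and the final critical value is taken as 1.96, ignoring any adjustment for the interim efficacy boundary.

```python
import math
from statistics import NormalDist


def conditional_power(hr_hat, events_interim, events_final, z_crit=1.96):
    """Conditional power under the current trend (true effect = interim estimate)."""
    nd = NormalDist()
    theta = -math.log(hr_hat)        # estimated benefit on the log-hazard scale
    info1 = events_interim / 4       # approximate information at the interim
    info2 = events_final / 4         # approximate information at the final analysis
    z1 = theta * math.sqrt(info1)    # interim Wald statistic
    remaining = info2 - info1
    # Value the incremental statistic must exceed for the final test to reject
    needed = (z_crit * math.sqrt(info2) - z1 * math.sqrt(info1)) / math.sqrt(remaining)
    return 1 - nd.cdf(needed - theta * math.sqrt(remaining))


def zone(cp):
    """Map conditional power to the three zones defined above."""
    if cp < 0.30:
        return "unfavorable"
    if cp < 0.90:
        return "promising"
    return "favorable"


# Illustration with the interim estimate reported for VALOR (HR 0.7628 at 187 of 375 events)
cp = conditional_power(0.7628, 187, 375)
print(f"CP = {cp:.0%}, zone = {zone(cp)}")
```

Under these assumptions the interim estimate reported for VALOR lands in the promising zone, in line with the interim result described in these slides.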

The Promising Zone Design

13

Increase Maximum Information in Promising Zone

14

Increasing events from 374 to 561 if in Promising Zone at interim

15

Conditional Power of Conventional Design (375 events) if HR = 0.77

16

Conditional Power of Adaptive Design at interim if HR = 0.77

17

Superimpose CP Curves of Conventional and Adaptive Designs

18

Include distribution of Zstat at interim analysis time point

19

Preserving the Type-1 Error

20

Operating Characteristics

21

• Design specification including simulation results and method for controlling type-1 error presented for FDA review

• Actual decision rule for triggering the adaptation was included as a restricted appendix in the FDA briefing book and the DMC charter

• The sponsor’s copies of the briefing book and DMC charter did not contain the restricted appendix

Regulatory Interactions and Confidentiality

22

Results at Interim Analysis

• The estimate of the hazard ratio was 0.7628

• The conditional power was 82%, in the promising zone

• Sample size and events were increased by 50%

23

Kaplan-Meier Plots at Interim

24

Interim Result Triggered Additional Investment

25

Preserving the Type I Error (CHW Adjustment)
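As a rough illustration of the CHW (Cui, Hung and Wang) idea: the final test combines the interim statistic and the statistic from the post-interim data using weights fixed by the originally planned information fractions, so the weights do not change when the number of events is increased. The sketch below is not taken from the slides; the function name and the example z-values are hypothetical.

```python
import math


def chw_statistic(z_interim, z_final, events_interim, events_planned, events_actual):
    """Weighted (CHW) statistic with weights fixed by the original design.

    z_interim: log-rank Z at the interim (first events_interim events)
    z_final:   conventional log-rank Z on all events_actual events
    """
    t1 = events_interim / events_planned  # pre-specified information fraction
    # Back out the statistic based only on events accrued after the interim
    z_increment = (
        math.sqrt(events_actual) * z_final - math.sqrt(events_interim) * z_interim
    ) / math.sqrt(events_actual - events_interim)
    return math.sqrt(t1) * z_interim + math.sqrt(1 - t1) * z_increment


# Hypothetical values: events increased by 50% after a promising interim result
print(chw_statistic(z_interim=1.85, z_final=2.10,
                    events_interim=187, events_planned=375, events_actual=562))
```

When the number of events is not changed (`events_actual == events_planned`), the weighted statistic reduces to the conventional one.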

26

• Primary Endpoint Overall Survival:

• 7.5 months on Vos vs. 6.1 months on Placebo.

• unstratified results: HR = 0.87, p=0.06

• stratified results: HR = 0.83, p=0.02

• Unfortunately, protocol pre-specified unstratified analysis as primary and stratified analysis as key secondary

• Complete Response rate: 30.1% vosaroxin arm vs. 16.3% placebo arm, p<0.0001

VALOR Results

27

• Totality of data suggest benefit of vosaroxin in relapsed/refractory AML.

• Adaptive design played an important role in demonstrating drug activity.

• Staged investment

• Careful implementation led to

− Control of Type I error

− Minimized potential for operational bias

• Without the sample size adaptation p-value = 0.22

• Practical questions emerge with implementation

Concluding Remarks

28

• Diarrhea affects 20-30% of HIV-infected individuals in the post-highly active anti-retroviral therapy (HAART) era [1]

• No existing therapy has been shown to be effective, safe, and well-tolerated by individuals with HIV diarrhea. [2]

• Diarrhea negatively impacts quality of life [1,3] and compliance with antiretroviral medications [4,5].

• Noncompliance leads to reduced drug levels, higher viral loads, and drug resistance. [6,7]

• Successful treatment of diarrhea could improve HIV treatment outcomes.

Case Study II: The ADVENT Trial for the Treatment of HIV-Induced Diarrhea

29

Crofelemer: From Plant to Molecule

[Images: Croton lechleri; latex from Croton lechleri; crofelemer molecule]

30

Results of Phase II Trial (37554-210)

[Figure panel: Percent Change in Abnormal Stool Weight (Inpatient Period), Significant Baseline Diarrhea (SBD) Subset. x-axis: days 2 to 7; y-axis: % change; series: Placebo, 250 mg T, 500 mg T, 500 mg B]

Analysis of patients with secretory diarrhea and urgency at baseline showed a significant effect of 250 mg and 500 mg crofelemer tablets in percent reduction in abnormal stool weight (p=0.014 and p=0.005, respectively); significant in 500 mg beads as well

[Figure panel: Percent Change in Abnormal Stool Frequency (Inpatient Period), Significant Baseline Diarrhea (SBD) Subset. x-axis: days 2 to 7; y-axis: % change in frequency; series: Placebo, 250 mg T, 500 mg T, 500 mg B]

Analysis of patients with secretory diarrhea and urgency at baseline showed a significant effect of 250 mg and 500 mg crofelemer tablets in percent reduction in number of abnormal stools (p=0.019 and p=0.003, respectively); significant in 500 mg beads as well

[Study timeline: antidiarrheals withdrawn; inpatient period, days 1 to 7; outpatient period, weeks 2 to 4 (days 14, 21, 28); washout to day 35]

• Previous Phase 3 HIV diarrhea study (37554-210) design – Double-blind placebo-controlled randomized trial

• 1-day Baseline and 6-day inpatient stool collection period

• Responders were eligible for 3-week double-blind outpatient treatment period; followed by 1-week treatment washout

31

Option 1: A Single 4-Arm Trial

Randomize

125 mg bid

250 mg bid

500 mg bid

Placebo

• 80% power to detect: p0=35%; p1=35%; p2=35%; p3=55%

• Perform the Bonferroni-Holm Test for the final analysis

• Require 520 patients (130/arm) for 80% power with 1-sided α=0.025
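The 130 patients per arm can be sanity-checked with a normal-approximation power calculation. This sketch makes several assumptions not stated on the slide: it uses a plain Bonferroni split of α across the three dose-versus-placebo comparisons (a lower bound on the power of the Holm procedure), a pooled-variance two-proportion z-test, and a hypothetical helper name `bonferroni_power`.

```python
import math
from statistics import NormalDist


def bonferroni_power(p_ctrl, p_trt, n_per_arm, n_comparisons=3, alpha_one_sided=0.025):
    """Approximate power to show one effective dose beats placebo at alpha / n_comparisons."""
    nd = NormalDist()
    z_crit = nd.inv_cdf(1 - alpha_one_sided / n_comparisons)
    p_bar = (p_ctrl + p_trt) / 2
    se_null = math.sqrt(2 * p_bar * (1 - p_bar) / n_per_arm)  # pooled SE under H0
    se_alt = math.sqrt((p_ctrl * (1 - p_ctrl) + p_trt * (1 - p_trt)) / n_per_arm)
    return nd.cdf(((p_trt - p_ctrl) - z_crit * se_null) / se_alt)


# Placebo 35% versus the one effective dose at 55%, 130 patients per arm
print(f"{bonferroni_power(0.35, 0.55, 130):.0%}")
```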

32

Option 2: Separate Phase 2 and Phase 3 Trials (Operationally Seamless)

• Trial 1: make a dose selection (sponsor involvement permitted)

• Trial 2: test selected dose versus placebo with 216 patients at FULL α=0.025

• This design requires 200 + 216 = 416 patients for 80% power

Advantage: Very flexible for dose selection

Disadvantage: data from Trial 1 cannot be used for final analysis of Trial 2

33

Option 3: One Integrated Trial (Inferentially Seamless)

• Dose selection at interim analysis as before

• Final analysis combines the data from both stages utilizing the method of Posch et al. (Statistics in Medicine, 2005)

• This design requires 380 patients for 80% power

34

Sample Size Savings

Method                                         Sample Size

Bonferroni-Holm                                520

Two Separate Trials (operationally seamless)   416

Combined Phase 2-3 (inferentially seamless)    380

• The team selected the combined phase 2-3 trial

• Used the Posch et al. (2005) method for controlling the type-1 error

• Prior to launch, design details and simulation details were submitted to FDA along with operational details for conducting the interim analysis and preventing bias

35

Why is Posch method necessary?

• Suppose s is the selected dose at Stage 1

• Then the Wald statistic for the final analysis can be written as

• Why is it not OK to simply test Zs > 1.96?

• We cannot ignore that two doses were dropped; now Zs(2) is the maximum of three Wald statistics.

• Thus Zs is not N(0,1) under H0 and α=0.025 is not preserved

• In this case we can show that $P_0(Z_s > 1.96) \approx 0.05$; double the nominal α=0.025
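A small simulation illustrates the inflation. It is a sketch under stated assumptions, not the calculation behind the slide: three doses share one placebo arm, the best dose at stage 1 is carried forward, outcomes are treated as standardized normal arm means under the global null, and stage 1 contributes a hypothetical 35% of the per-arm information. The function name `naive_type1_error` is made up for this example.

```python
import math
import random


def naive_type1_error(n_sims=100_000, stage1_fraction=0.35, cutoff=1.96, seed=0):
    """Type-1 error of the naive pooled test after picking the best of three doses."""
    rng = random.Random(seed)
    w1 = math.sqrt(stage1_fraction)
    w2 = math.sqrt(1 - stage1_fraction)
    rejections = 0
    for _ in range(n_sims):
        placebo1 = rng.gauss(0, 1)
        # Stage 1: Wald statistics of three doses against the shared placebo arm;
        # selecting the best dose means keeping the maximum of the three
        z_stage1 = max((rng.gauss(0, 1) - placebo1) / math.sqrt(2) for _ in range(3))
        # Stage 2: selected dose versus placebo on fresh patients
        z_stage2 = (rng.gauss(0, 1) - rng.gauss(0, 1)) / math.sqrt(2)
        # Naive pooled statistic that ignores the selection step
        if w1 * z_stage1 + w2 * z_stage2 > cutoff:
            rejections += 1
    return rejections / n_sims


print(naive_type1_error())
```

With these inputs the rejection rate comes out well above the nominal 0.025, and it changes with `stage1_fraction`, which is why a simulated cutoff depends on the timing of the interim look.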

36

Why not control the α by simulation?

• Could we not use a more conservative cutoff than 1.96 to preserve the α?

• For example, we can show by simulation that $P_0(Z_s > 2.257) \approx 0.025$

• Not acceptable to FDA for confirmatory trials

• cut off depends on timing of interim look

• cannot ensure strong control of type-1 error

37

Strong Control of Type-1 Error

Strong control means that the probability of making a false claim is less than α no matter which of the above null hypotheses is applicable

38

Study Design

• Drug: Crofelemer (NP-303, SP-303)

• Study: Phase 3

• Indication: Symptomatic Relief for the Treatment of HIV-associated diarrhea

• Doses: 125 mg, 250 mg, 500 mg b.i.d. vs. placebo

• Study design: 2-stage adaptive design

– Stage I: Dose Selection

– Stage II: Dose Confirmation

• Study endpoints: efficacy

– Primary – Clinical response, defined as two or fewer watery bowel movements per week during at least two of the four weeks of the efficacy assessment period

39

ADVENT Trial (NP 303-101): Stage I

[Schema: history of diarrhea; 10-day screening; stop ADM; 3-day run-in; baseline; 4 weeks of study medication; 5-month placebo-free extension]

Arm           Evaluable patients

500 mg BID    46

250 mg BID    54

125 mg BID    44

Placebo       50

Stage 1 – 194 evaluable patients for Dose Selection

40

Criteria for Dose Selection for Stage 2 by IAC

• Selection of the dose of crofelemer for Stage 2 will be made by the IDMC based on the following criteria:

• The primary efficacy variable in the ITT population, concomitant with AE and SAE rates will be used for dose selection

• Assuming there are no safety issues, the crofelemer dose selected for Stage 2 will be the one for which the primary efficacy variable is at least 2.0 percentage points greater than for the other crofelemer treatments

• If the response percentages of 2 or 3 treatment groups are within 2 percentage points of each other, and there are no safety issues, the lowest of these doses will be selected for Stage 2

• No futility rule for efficacy reasons, only safety
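The efficacy part of this selection rule can be sketched as follows. The function `select_dose` is hypothetical, safety review is not modeled, and "within 2% of each other" is interpreted here as "within 2 percentage points of the best-performing dose", which is an assumption about the intended rule.

```python
def select_dose(response_pct, margin=2.0):
    """Pick the Stage 2 dose from Stage 1 response rates (percent), efficacy only.

    response_pct maps dose in mg to the response rate in percent.
    Assumption: doses within `margin` points of the best dose count as tied,
    and ties are resolved in favor of the lowest dose.
    """
    best = max(response_pct.values())
    contenders = [dose for dose, pct in response_pct.items() if best - pct < margin]
    return min(contenders)


# Stage 1 response rates reported for ADVENT: 9/44, 5/54 and 9/46
stage1 = {125: 100 * 9 / 44, 250: 100 * 5 / 54, 500: 100 * 9 / 46}
print(select_dose(stage1))
```

With the Stage 1 rates reported in these slides, the 125 mg and 500 mg doses are within 2 percentage points of each other, so the lower dose is chosen.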

41

ADVENT Trial (NP 303-101) – Stage II

[Schema: history of diarrhea; 10-day screening; stop ADM; 3-day run-in; baseline; study medication; 5-month placebo-free extension]

Arm           Evaluable patients

125 mg BID    92

Placebo       88

Stage 2 – 180 additional evaluable patients for Dose Assessment phase (to be combined with Stage 1 patients for the placebo and 125 mg BID dose groups)

42

Timing and Activities during the Interim Analysis of ADVENT

Milestones and Events

• Completed Stage-1 enrollment: mid-June 2009

• Last patient placebo-controlled treatment period end: mid-July 2009

• Data clean-up completed: end of July 2009

• Stage I database locked for Interim Analysis

• Interim Analysis Committee meets August 3 2009

• Enrollment re-opens to Stage II: August 4 2009

• Interim Investigator Meeting: August 6-8 2009

ADVENT protocol (per FDA mandate) stipulated that Stage II must be initiated within 8 weeks from the time enrollment was stopped after Stage I

43

Final Analysis

Marginal Results for Each Stage

Dose       Stage 1 Response   Stage 1 P-value   Stage 2 Response   Stage 2 P-value

Placebo    1/50 (2%)          ------            10/88 (11.4%)      ------

125 mg     9/44 (20.5%)       0.0019            15/92 (16.3%)      0.1690

250 mg     5/54 (9.3%)        0.0563

500 mg     9/46 (19.6%)       0.0024

The 125 mg dose was selected for Stage 2

Combination P-value for the 125 mg dose from the two stages:

$$C(p_1, q_1) = 1 - \Phi\left[\sqrt{1/2}\,\Phi^{-1}(1 - p_1) + \sqrt{1/2}\,\Phi^{-1}(1 - q_1)\right] = 0.0032$$

Therefore H(1) is rejected at the local α=0.025 level

44

Must invoke closed testing

• Although H(1) is rejected at the local α=0.025, we cannot yet make a claim for the 125 mg dose

• We must also reject H(12), H(13) and H(123), at their respective local α=0.025 levels

• Use Simes test to compute the adjusted p-values p(12), p(13) and p(123)

Reject H(12) if:

$$C(p^{(12)}, q^{(12)}) = 1 - \Phi\left[\sqrt{1/2}\,\Phi^{-1}(1 - p^{(12)}) + \sqrt{1/2}\,\Phi^{-1}(1 - q^{(12)})\right] \le 0.025$$

Reject H(13) if:

$$C(p^{(13)}, q^{(13)}) = 1 - \Phi\left[\sqrt{1/2}\,\Phi^{-1}(1 - p^{(13)}) + \sqrt{1/2}\,\Phi^{-1}(1 - q^{(13)})\right] \le 0.025$$

Reject H(123) if:

$$C(p^{(123)}, q^{(123)}) = 1 - \Phi\left[\sqrt{1/2}\,\Phi^{-1}(1 - p^{(123)}) + \sqrt{1/2}\,\Phi^{-1}(1 - q^{(123)})\right] \le 0.025$$

45

Simes-adjusted stage wise and combination P-values

Adjusted p-values for Stage 1

$$p_1 = 0.0019; \quad p_2 = 0.0564; \quad p_3 = 0.0024$$

$$p^{(12)} = \min\{2\min(p_1, p_2),\ \max(p_1, p_2)\} = 0.0038$$

$$p^{(13)} = \min\{2\min(p_1, p_3),\ \max(p_1, p_3)\} = 0.0024$$

$$p^{(123)} = \min\{3\min(p_1, p_2, p_3),\ 1.5\,\mathrm{med}(p_1, p_2, p_3),\ \max(p_1, p_2, p_3)\} = 0.0036$$

Adjusted p-values for Stage 2

$$q_1 = q^{(12)} = q^{(13)} = q^{(123)} = 0.1690$$

Combination p-values from the two stages

$$C(p_1, q_1) = 0.00321 \Rightarrow \text{reject } H^{(1)}; \quad C(p^{(12)}, q^{(12)}) = 0.00514 \Rightarrow \text{reject } H^{(12)}$$

$$C(p^{(13)}, q^{(13)}) = 0.00382 \Rightarrow \text{reject } H^{(13)}; \quad C(p^{(123)}, q^{(123)}) = 0.00503 \Rightarrow \text{reject } H^{(123)}$$

Since all combination p-values are < 0.025, we can claim that the 125 mg dose is better than placebo at overall α=0.025 level
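The closed-testing calculation can be reproduced approximately in a few lines. This is a sketch under assumptions: it uses the rounded p-values shown on the slide, the equal-weight inverse normal combination function, and the Simes test for the Stage 1 intersection hypotheses; the function and variable names are hypothetical. Because the inputs are rounded, the results match the slide only to about the third or fourth decimal place.

```python
import math
from statistics import NormalDist

ND = NormalDist()


def simes(pvals):
    """Simes p-value for an intersection hypothesis."""
    m = len(pvals)
    return min(m * p / rank for rank, p in enumerate(sorted(pvals), start=1))


def combine(p, q):
    """Inverse normal combination of stage-wise p-values with equal weights."""
    z = (ND.inv_cdf(1 - p) + ND.inv_cdf(1 - q)) / math.sqrt(2)
    return 1 - ND.cdf(z)


stage1 = {1: 0.0019, 2: 0.0564, 3: 0.0024}  # doses 1, 2, 3 versus placebo at Stage 1
q_stage2 = 0.1690                           # only dose 1 (125 mg) continues to Stage 2

# Every intersection hypothesis that contains H(1) must be rejected
intersections = [(1,), (1, 2), (1, 3), (1, 2, 3)]
combined = {
    h: combine(simes([stage1[i] for i in h]), q_stage2) for h in intersections
}
adjusted_p = max(combined.values())

for h, value in combined.items():
    print(h, round(value, 5))
print("multiplicity-adjusted p-value for H(1):", round(adjusted_p, 5))
```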

46

The Overall P-value for H(1)

Find the overall P-value associated with each individual hypothesis test

$$\mathrm{Pval}^{(1)} = 0.0032; \quad \mathrm{Pval}^{(12)} = 0.00514; \quad \mathrm{Pval}^{(13)} = 0.00382; \quad \mathrm{Pval}^{(123)} = 0.00503$$

The multiplicity adjusted overall P-value for testing H(1) is:

$$\max\left(\mathrm{Pval}^{(1)}, \mathrm{Pval}^{(12)}, \mathrm{Pval}^{(13)}, \mathrm{Pval}^{(123)}\right) = 0.00514$$

Because there was no early stopping, the combination p-value and the overall p-value are the same

47

ADVENT: A Single Pivotal Dose-Selection and Dose-Confirmation Phase 3 Trial

[Schema: Stage I, followed by the adaptive design interim analysis, followed by Stage II]

Stage I (194 patients): crofelemer p.o. b.i.d. 125 mg (n = 44), 250 mg (n = 54), 500 mg (n = 46); placebo p.o. b.i.d. (n = 50)

Stage II (180 patients): crofelemer 125 mg p.o. b.i.d. (n = 92); placebo p.o. b.i.d. (n = 88)

Stage I + Stage II: 136 patients on crofelemer 125 mg and 138 patients on placebo

374 total patients in the adaptive trial

Primary efficacy endpoint analysis on 274 patients from Stage 1 and 2: P = 0.00514

48

Package Label for Fulyzaq (Crofelemer)

49

Workshop3: Adaptive Lung Cancer Study

50

• Study Introduction - A new chemical entity (NCE) is being developed for the treatment of reward deficiency syndrome, specifically alcohol dependence and binge eating disorder. Compared with other orally available treatments, NCE was designed to exhibit enhanced oral bioavailability, thereby providing improved efficacy for the treatment of alcohol dependence.

• Multicenter, randomized, double-blind, placebo-controlled study conducted in two parts using a 2-stage adaptive design.

• Primary Endpoint - The endpoint is based on the patient-reported number of standard alcoholic drinks per day, transformed into a binary outcome measure, abstinence from heavy drinking.

Workshop4 – MaMs using p-value combination

57

• In Stage 1, approximately 400 eligible subjects are randomized equally among four treatment arms (NCE at 1, 2.5, or 10 mg, and matching placebo).

• After observing the data on 400 subjects, perform an interim analysis and carry forward the best two doses out of three. Enroll an additional 100 subjects on each of those two doses and on placebo, for a total sample size of 700 (400 + 300).

• Desired Power = 80%

• Overall adjusted type-1 error = 0.025

• Use Bonferroni for multiplicity adjustment

• Let us not have early stopping boundaries for efficacy and futility

• Run 1000 simulations

Design Inputs

58

East 6.4 – Input Windows

59

East 6.4 – Input Windows (Contd..)

60

• Save this design as “Best2” in the Library

• Question1 - What is the global power of this study?

Global Power: ____

• Question2 - How frequently was the pair (1 mg–10 mg) selected for Stage 2 and observed to be efficacious?

% Efficacy:____

Q&A

61

• Question3 – How frequently was 10 mg selected for Stage 2, regardless of whether it was found significant at the end of Stage 2?

% Sims 10 mg Selected:____

• The previous design dropped one dose. Run the simulations again without dropping any doses (set r=3). Save this design as “All3” in the Library. Notice the change in power.

62

• Set r=2 and run the simulations under the null hypothesis (same response proportion on all arms)

• Save this design in the Library. Does it preserve the FWER?

Question4 – What is the simulated power?___

• Let us monitor the “Best2” design. Select it and click IM icon.

Enter the proportions: 0.1, 0.11, 0.19, 0.21

63

• Enter the data for the second stage as shown below. We are dropping the 1 mg dose, the poorest-performing of the three.

64
