Adaptive dose-finding: Proof of concept with type I error control



Frank Miller*

AstraZeneca, Statistics & Informatics, S-15185 Södertälje, Sweden

Received 2 September 2009, revised 16 July 2010, accepted 25 July 2010

We consider an adaptive dose-finding study with two stages. The doses for the second stage are chosen based on the first stage results. Instead of considering pairwise comparisons with placebo, we apply one test to show an upward trend across doses. This is a possibility according to the ICH guideline for dose-finding studies (ICH-E4). In this article, we are interested in trend tests based on a single contrast or on the maximum of multiple contrasts. We are interested in flexibly choosing the Stage 2 doses, including the possibility to add doses. If certain requirements for the interim decision rules are fulfilled, the final trend test that ignores the adaptive nature of the trial (naïve test) can control the type I error. However, for the more common case that these requirements are not fulfilled, we need to take the adaptivity into account and discuss a method for type I error control. We apply the general conditional error approach to adaptive dose-finding and discuss special issues appearing in this application. We call the test based on this approach the Adaptive Multiple Contrast Test. Using an example, we illustrate the theory and compare the performance of several tests for the adaptive design in a simulation study.

Key words: Adaptive design; Adaptive multiple contrast test; Clinical studies; Conditional error approach; Dose-finding.

Supporting Information for this article is available from the author or on the WWW under http://dx.doi.org/10.1002/bimj.200900222.

1 Introduction

One aim of a dose-finding study within the development process for a drug is to gain knowledge about the dose–response relationship of the new drug. We consider here a dose-finding study conducted prior to performing confirmatory Phase III studies. The dose–response information is necessary for choosing the dose(s) for these Phase III studies.

Another important aim of many dose-finding studies is Proof of Concept: It is important to show by a significance test that the drug has an effect. One possibility for Proof of Concept is to test the result at each dose versus placebo (pairwise comparisons). Another possibility mentioned in the ICH guideline E4 (1994) is to perform one single test to show an upward trend across doses. For this, trend tests with contrasts can be applied, see Robertson et al. (1988) and Ruberg (1995b). A single test uses information from all doses and has the potential to be more powerful in general than a pairwise testing procedure. However, the test may not be very powerful for certain alternatives. More robust tests to show an upward trend are e.g. the Bartholomew test (Bartholomew, 1961), the trend test based on the maximum of multiple contrasts (Steward and Ruberg, 2000; Bretz et al., 2005) or tests based on the maximum of other robust tests like the Jonckheere test (Neuhäuser et al., 1998).

According to the ICH guideline E4 (1994), a successful and "well-controlled dose–response study is also a study that can serve as primary evidence of effectiveness" when regulatory authorities later have to decide whether the new drug is approved or not. Therefore, it is important to use adequate statistical methodology for the analysis of a Phase II dose-finding study, including control of the type I error.

*Corresponding author: e-mail: [email protected], Phone: +46-8-553-279-08, Fax: +46-8-553-289-47

© 2010 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim

Biometrical Journal 52 (2010) 5, 577–589 DOI: 10.1002/bimj.200900222


When a dose-finding study is planned, information about potential dose–effect relationships is usually available from sources such as literature data for related compounds, preclinical data, and pharmacokinetic and pharmacodynamic analyses from Phase I studies. This information is then used to determine the doses for the dose-finding study.

There might still be considerable uncertainty about the dose range of interest even when all available information is used. In this case, an adaptive dose-finding design can be of value: the study is started with some doses. In one or more interim analyses, data from the ongoing study are analysed and the doses are chosen for the next stage of the study.

Several methods for testing the drug effect at the end of an adaptive dose-finding study have been proposed. When considering pairwise comparisons versus placebo, one needs to account for multiple comparisons in order to preserve the overall type I error rate. Posch et al. (2005) discuss a method based on the combination test approach. An adaptive Dunnett procedure was elaborated by Koenig et al. (2008). Stallard and Friede (2008) apply group-sequential testing to adaptive dose-finding designs. Friede and Stallard (2008) compare these different methods for significance tests with regard to their statistical properties.

In this article, we are interested in applying a trend test for an adaptive dose-finding study. Lang et al. (2000) consider a two-stage study having the same dose arms in both stages. In the first stage, the Bartholomew test is used for testing. The effects of the doses are then estimated using a monotone regression technique. In the second stage, a trend test can be adaptively applied with contrasts based on these estimates. The final test is then a p-value combination test using the first and second stage tests, which controls the type I error by construction. Their observation is that this test improves robustness, i.e. it increases the power in situations where non-adaptive testing performs poorly. However, they did not consider the possibility of choosing the second stage doses based on Stage 1 data.

For a two-stage study where treatments can be dropped in the interim analysis, Chang et al. (2006) and Chow and Chang (2006, Section 8.4) suggest using trend tests in both stages and reshaping their contrasts based on the estimates of the first stage effects.

In Section 2.1, we introduce the non-adaptive single stage trend test based on a single contrast. The mathematical notation covers the case of non-balanced allocation since in practice most studies end with different patient numbers per group. We develop in Section 2.2 a two-stage trend test with a single contrast, which is stratified for stage. Section 2.3 generalizes the notation to a trend test where the test statistic is the maximum of multiple contrast statistics. In Section 3, we discuss the multiple contrast test in the adaptive setting in general. A sufficient condition under which the naïve version of the test (using the critical value from the non-adaptive test) controls the type I error is provided. Further, we discuss how the test can be modified to control the type I error if this condition is not fulfilled. For this, we apply the general conditional error approach to adaptive dose-finding and discuss special issues with the application in this context. We call the test based on this approach the Adaptive Multiple Contrast Test (AMCT). Using an example, we illustrate in Section 4 how a single and a multiple contrast test can be applied for an adaptive dose-finding study. The example also illustrates a reasonable strategy to choose Stage 2 doses, including the possibility to add doses. Several tests which control the type I error, based both on a single contrast and on multiple contrasts, are described. Simulations of the characteristics of the tests for the adaptive design example are provided. Finally, a few issues of practical importance are discussed in Section 5.

2 Proof of concept with a trend test

2.1 Single stage trend test with a single contrast

Let $n$ be the sample size and $Y$ be the vector of all $n$ observations. We assume that

$$Y \sim N_n(\mu, \sigma^2 I_n) \tag{1}$$


with $n$-dimensional mean vector $\mu$, $n$-dimensional identity matrix $I_n$ and $\sigma^2 > 0$. We assume here that $\sigma^2$ is known. The case of unknown $\sigma^2$ is discussed in Section 5. Let $P = I_n - \mathbf{1}(\mathbf{1}^\top \mathbf{1})^{-1}\mathbf{1}^\top$ be the projection matrix onto the subspace orthogonal to the $n$-dimensional vector $\mathbf{1} = (1, \ldots, 1)^\top$.

To test the null hypothesis $H_0\colon \mu = \beta \cdot \mathbf{1}$ with unknown $\beta$, we can use as test statistic for a trend test

$$T = \frac{c^\top P Y}{\sqrt{c^\top P c}} \sim N\!\left(\frac{c^\top P \mu}{\sqrt{c^\top P c}},\; \sigma^2\right)$$

where $c$ is the $n$-dimensional contrast vector. It is possible to express the test statistic without using the projection matrix $P$ since $c^\top P Y = (c - \bar{c}\mathbf{1})^\top Y = c^\top (Y - \bar{Y}\mathbf{1}) = (c - \bar{c}\mathbf{1})^\top (Y - \bar{Y}\mathbf{1})$, with $\bar{c}$ and $\bar{Y}$ the means of $c$ and $Y$, respectively. The test rejects the null hypothesis if and only if $T/\sigma > \Phi^{-1}(1-\alpha)$, with $\Phi$ being the distribution function of the standard normal distribution. The vector $c$ is chosen such that large values of $T$ indicate the desired drug effect.
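The single stage test can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `trend_test`, the data, and the level `alpha = 0.025` are hypothetical and not taken from the article; $\sigma$ is treated as known, as in the text.

```python
import numpy as np
from scipy.stats import norm

def trend_test(y, c, sigma=1.0, alpha=0.025):
    """One-sided single-contrast trend test with known sigma (illustrative sketch)."""
    y = np.asarray(y, dtype=float)
    c = np.asarray(c, dtype=float)
    c_centered = c - c.mean()              # P c: subtract the mean of c
    w = c_centered @ c_centered            # c' P c
    t = c_centered @ y / np.sqrt(w)        # T = c' P Y / sqrt(c' P c)
    return t, bool(t / sigma > norm.ppf(1 - alpha))

# Hypothetical data: 4 dose groups with 3 patients each, linear contrast
c = np.repeat([-3.0, -1.0, 1.0, 3.0], 3)
y = np.array([0.1, -0.2, 0.0, 0.3, 0.1, 0.2, 0.4, 0.6, 0.5, 0.9, 0.7, 0.8])
t, reject = trend_test(y, c)
print(t, reject)
```

Centring $c$ replaces the explicit $n \times n$ projection matrix, which is the identity $c^\top P Y = (c - \bar{c}\mathbf{1})^\top Y$ from the text.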

As a function of the contrast vector $c$, the power of the trend test is maximised if and only if $c$ is such that $Pc$ is a (positive) multiple of $P\mu$. This is well known (see e.g. Robertson et al. (1988), p. 178) and can be seen using the Cauchy–Schwarz inequality:

$$\frac{c^\top P \mu}{\sqrt{c^\top P c}} \le \frac{\sqrt{c^\top P c \cdot \mu^\top P \mu}}{\sqrt{c^\top P c}} = \sqrt{\mu^\top P \mu}.$$

Note that with our notation this property is easy to see irrespective of whether balanced or unbalanced allocation to the doses is used. In practice, the vector $\mu$ is of course unknown. Therefore, the contrast vector $c$ cannot be chosen equal to $\mu$ (implying $Pc = P\mu$) in advance. A reasonable approach is to choose $c$ equal or close to an anticipated value for $\mu$.

The definition of the trend test above is generally valid for a regression model. We are, however, especially interested here in dose-finding studies. First, for the application of the trend test in a dose-finding setting, the components of $c$ belonging to observations for the same dose should coincide. Second, it is possible to rewrite the test statistic for the special case of dose-finding: Let $m_1, \ldots, m_g$ be the numbers of patients per dose for the $g$ dose-groups and let $\gamma_1, \ldots, \gamma_g$ be the components of $c$ belonging to dose $i = 1, \ldots, g$. Define $\bar{\gamma} = (1/n)\sum_{i=1}^{g} m_i \gamma_i$. Then $c^\top P Y = (c - \bar{c}\mathbf{1})^\top Y = \sum_{i=1}^{g} m_i(\gamma_i - \bar{\gamma})\bar{Y}_i$, with $\bar{Y}_i$ being the mean of dose-group $i$. Further, $c^\top P c = \sum_{i=1}^{g} m_i(\gamma_i - \bar{\gamma})^2$.

Despite this possibility of reformulation, we think it is more convenient to use the matrix–vector formulation in the sequel.
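As a quick numerical check of the group-level reformulation, the following sketch compares both sides for an unbalanced allocation. The group sizes, contrast coefficients and data are hypothetical values chosen only for illustration.

```python
import numpy as np

m = np.array([5, 3, 4])                   # hypothetical patients per dose group
gamma = np.array([0.0, 1.0, 3.0])         # hypothetical contrast coefficient per dose
rng = np.random.default_rng(0)
n = m.sum()
y = rng.normal(size=n)                    # hypothetical observations, ordered by group

c = np.repeat(gamma, m)                   # patient-level contrast vector
P = np.eye(n) - np.ones((n, n)) / n       # projection orthogonal to the vector of ones

group_means = np.array([seg.mean() for seg in np.split(y, np.cumsum(m)[:-1])])
gamma_bar = (m * gamma).sum() / n

lhs_num = c @ P @ y                                        # c' P Y
rhs_num = (m * (gamma - gamma_bar) * group_means).sum()    # sum m_i (gamma_i - gamma_bar) Ybar_i
lhs_den = c @ P @ c                                        # c' P c
rhs_den = (m * (gamma - gamma_bar) ** 2).sum()             # sum m_i (gamma_i - gamma_bar)^2
print(np.isclose(lhs_num, rhs_num), np.isclose(lhs_den, rhs_den))
```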

2.2 Two-stage trend test with a single contrast

For a study with two stages and $n_1$ and $n_2$ observations in Stages 1 and 2, respectively, it is usually desirable to use a "stratified" trend test allowing for a stage effect. This means that a stage mean is calculated for each stage separately and only deviations from the within-stage mean are considered to be a sign of a drug effect. Hence, the null hypothesis is that the true mean of the first stage observations is an unknown constant, say $\beta_1$, and the mean of the second stage observations is a possibly different constant, say $\beta_2$. In other words, $H_0\colon \mu = X(\beta_1, \beta_2)^\top$ with $X$ being the $n \times 2$ matrix having $n_1$ ones followed by $n_2$ zeros in the first column and $n_1$ zeros followed by $n_2$ ones in the second column. Let $Y_j$, $c_j$, $\mu_j$, $P_j$, $T_j$ be the values for Stage $j$, $j = 1, 2$, denoted previously by $Y$, $c$, $\mu$, $P$, $T$. Define $Y = (Y_1^\top, Y_2^\top)^\top$, $c = (c_1^\top, c_2^\top)^\top$, $\mu = (\mu_1^\top, \mu_2^\top)^\top$, $n = n_1 + n_2$. Define further $P$ by

$$P = \begin{pmatrix} P_1 & 0 \\ 0 & P_2 \end{pmatrix} = I_n - X(X^\top X)^{-1}X^\top.$$

The test statistic for the stratified trend test is

$$T = \frac{c^\top P Y}{\sqrt{c^\top P c}} = \frac{c_1^\top P_1 Y_1 + c_2^\top P_2 Y_2}{\sqrt{c_1^\top P_1 c_1 + c_2^\top P_2 c_2}} = \frac{1}{\sqrt{w_1 + w_2}}\left(\sqrt{w_1}\,T_1 + \sqrt{w_2}\,T_2\right) \tag{2}$$

with $w_j = c_j^\top P_j c_j$, $j = 1, 2$. The test rejects the null hypothesis if and only if $T/\sigma > \Phi^{-1}(1-\alpha)$.
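A minimal sketch of formula (2) follows. The helper names `stage_stat` and `stratified_trend_stat` and the simulated data are hypothetical. The last two lines illustrate the point of stratification: adding a constant to all Stage 2 observations leaves the statistic unchanged.

```python
import numpy as np

def stage_stat(y, c):
    """Return (T_j, w_j) for one stage, with w_j = c_j' P_j c_j."""
    y, c = np.asarray(y, dtype=float), np.asarray(c, dtype=float)
    cc = c - c.mean()                      # P_j c_j: centre within the stage
    w = cc @ cc
    return cc @ y / np.sqrt(w), w

def stratified_trend_stat(y1, c1, y2, c2):
    """Stratified two-stage trend statistic T of formula (2)."""
    t1, w1 = stage_stat(y1, c1)
    t2, w2 = stage_stat(y2, c2)
    return (np.sqrt(w1) * t1 + np.sqrt(w2) * t2) / np.sqrt(w1 + w2)

# Hypothetical data: 4 dose groups with 5 patients each in both stages
rng = np.random.default_rng(0)
c1 = np.repeat([-3.0, -1.0, 1.0, 3.0], 5)
c2 = np.repeat([-3.0, -1.0, 1.0, 3.0], 5)
y1 = rng.normal(size=c1.size)
y2 = rng.normal(size=c2.size)

t = stratified_trend_stat(y1, c1, y2, c2)
t_shifted = stratified_trend_stat(y1, c1, y2 + 5.0, c2)   # stage effect added to Stage 2
print(t, t_shifted)
```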


With the same argument as for the single stage trend test, the stratified trend test has maximised power if $Pc = P\mu$, which follows if $c_1$ is chosen equal to $\mu_1$ and $c_2$ is chosen equal to $\mu_2$.

2.3 Two-stage trend test with multiple contrasts

For tests based on multiple contrasts, the maximum of, say, $k$ contrast tests forms the test statistic. Let $c^{(r)} = (c_1^{(r)\top}, c_2^{(r)\top})^\top$ be the $r$-th contrast vector, $r = 1, \ldots, k$, and let $T^{(r)}$ be the test statistic defined with the contrast vector $c^{(r)}$ according to formula (2). The multiple contrast test uses the maximum $\max\{T\}$ of the vector $T = (T^{(1)}, \ldots, T^{(k)})^\top$ as test statistic. Let $C = (c^{(1)}, \ldots, c^{(k)})$ be the $n \times k$ matrix of contrast coefficients for the whole study and $C_1$, $C_2$ the matrices for Stages 1 and 2, respectively. Let $W = \mathrm{diag}(w^{(1)}, \ldots, w^{(k)}) = \mathrm{diag}(c^{(1)\top} P c^{(1)}, \ldots, c^{(k)\top} P c^{(k)})$ be the $k \times k$ diagonal matrix with the variances of $c^{(r)\top} P Y$, $r = 1, \ldots, k$, in the diagonal and $W_j = \mathrm{diag}(w_j^{(1)}, \ldots, w_j^{(k)}) = \mathrm{diag}(c_j^{(1)\top} P_j c_j^{(1)}, \ldots, c_j^{(k)\top} P_j c_j^{(k)})$, $j = 1, 2$.

If for each of the $k$ contrast tests we consider a stratified trend test, the vector of contrast test statistics can be written as

$$T = W^{-1/2} C^\top P Y = (W_1 + W_2)^{-1/2} C_1^\top P_1 Y_1 + (W_1 + W_2)^{-1/2} C_2^\top P_2 Y_2 = (W_1 + W_2)^{-1/2}\left(W_1^{1/2} T_1 + W_2^{1/2} T_2\right) \tag{3}$$

with $T_j = W_j^{-1/2} C_j^\top P_j Y_j$. The test rejects the null hypothesis (which is the same as in Section 2.2) if and only if $\max\{T\}/\sigma > u_\alpha$. The critical value $u_\alpha$ can be derived using multidimensional integration over the $k$-dimensional density of $T/\sigma$, which under the null hypothesis is a normal distribution with mean vector 0 and covariance matrix

$$R = (\rho^{(r,s)})_{r=1,\ldots,k;\; s=1,\ldots,k} \quad\text{where}\quad \rho^{(r,s)} = \frac{c^{(r)\top} P c^{(s)}}{\sqrt{c^{(r)\top} P c^{(r)}\; c^{(s)\top} P c^{(s)}}}. \tag{4}$$

Numerical integration routines described e.g. by Genz and Bretz (2009) and Bornkamp et al. (2009) can be used to determine $u_\alpha$.
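One way to sketch the computation of $u_\alpha$ in Python is to combine SciPy's multivariate normal distribution function with a root finder. This is an assumption made for illustration, not the software referenced above; the function names, the two contrasts and the level 0.025 are hypothetical, and the contrasts are assumed to be linearly independent after centring so that $R$ is non-singular.

```python
import numpy as np
from scipy.optimize import brentq
from scipy.stats import multivariate_normal, norm

def contrast_correlation(C, stage):
    """Correlation matrix R of formula (4).
    C: n x k matrix with one contrast per column; stage: length-n stage labels."""
    PC = np.array(C, dtype=float)
    stage = np.asarray(stage)
    for s in np.unique(stage):
        rows = stage == s
        PC[rows] -= PC[rows].mean(axis=0)   # P C: centre each column within its stage
    V = PC.T @ PC                           # C' P C, since P is a projection
    d = np.sqrt(np.diag(V))
    return V / np.outer(d, d)

def critical_value(R, alpha):
    """u_alpha with P(max Z > u_alpha) = alpha for Z ~ N(0, R)."""
    k = R.shape[0]
    mvn = multivariate_normal(mean=np.zeros(k), cov=R)
    f = lambda u: mvn.cdf(np.full(k, u)) - (1.0 - alpha)
    # The root lies between the single-test and the Bonferroni critical values
    return brentq(f, norm.ppf(1 - alpha) - 0.5, norm.ppf(1 - alpha / k) + 0.5)

# Hypothetical design: 4 dose groups, 2 patients per group and stage, 2 contrasts
groups = np.array([[-3.0, 0.0], [-1.0, 2.0], [1.0, 3.0], [3.0, 3.0]])
C = np.vstack([np.repeat(groups, 2, axis=0)] * 2)
stage = np.repeat([1, 2], 8)
R = contrast_correlation(C, stage)
u = critical_value(R, alpha=0.025)
print(R, u)
```

The bracket passed to `brentq` uses the fact that the critical value of a maximum of $k$ correlated statistics cannot be smaller than the single-test quantile or larger than the Bonferroni quantile.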

3 Adaptive multiple contrast test

So far, we have not introduced the possibility to choose the doses or the contrast vector $c_2$ for Stage 2 depending on Stage 1. We will discuss this in Sections 3.1 and 3.2. The value $w_2$ in (2) for the single contrast case, and the diagonal matrix $W_2$ in (3) and the covariance matrix $R$ in (4) for the multiple contrast case, are then no longer fixed constants but random variables depending on the Stage 1 results. Hence, it is not clear whether the test "Reject $H_0$ if and only if $T/\sigma > \Phi^{-1}(1-\alpha)$" (or $\max\{T\}/\sigma > u_\alpha$ for the multiple contrast case) controls the type I error. There are two ways in which the type I error can be affected. First, since $W_2$ depends on Stage 1, the two stages receive a data-dependent weighting. If Stage 1 results suggesting the alternative hypothesis are upweighted and results suggesting the null hypothesis are downweighted, the adaptive test will not control the type I error. This phenomenon is well known from sample size re-estimation. Second, the critical value of the multiple contrast test is based on the covariance matrix $R$, which becomes random in the adaptive design. This effect is specific to multiple contrast tests applied to adaptive dose-finding trials. Possibilities to control the type I error by adjusting the critical value in the adaptive setting are discussed in Section 3.3.


3.1 Choice of doses for Stage 2

Dose-finding studies usually have several objectives: Proof of Concept is only one among others. The main interest is usually to collect information about the dose–response relationship. More specifically, certain aspects of the dose–response curve can be of interest, like estimation of a target dose. The objective(s) can be quite different in different dose-finding studies, as pointed out by Ruberg (1995a). When an underlying dose–response model (e.g. an Emax-sigmoid model) is assumed, the dose-finding objective can often be expressed explicitly. This makes it possible to apply optimal design theory to choose the doses for the study and the allocation proportions to each dose. Dose-finding objectives in statistical terms can then be, for example, to estimate some or all parameters in a parametric dose–response model, to estimate the dose with a certain prespecified effect, or to estimate the "interesting region" of a dose–response curve. Burman et al. (2010) discuss such objectives for dose-finding studies. Since optimal designs for dose-finding models usually depend on unknown model parameters, it is an advantage to apply optimal design in an adaptive setting. The knowledge collected prior to an interim analysis can then be used to calculate the optimal design for the next stage of the study. Miller et al. (2007) have computed optimal designs based on prior knowledge which could be updated with a new optimal design in an interim analysis. In addition, several methods described by Dragalin et al. (2010) are based on optimal designs assuming a flexible working model as the underlying dose–response relationship.

However, it is not possible in every real dose-finding situation to establish the requirements needed for such adaptive optimal designs: specifying the objective(s) exactly, specifying the anticipated models, and determining possible prior assumptions about unknown parameters can be challenging. In particular, discussions within the clinical team might show that there are different views and that consensus is not possible. As an alternative to using mathematically optimised designs as in Miller et al. (2007) or in some of the methods in Dragalin et al. (2010), simpler rules can be used for choosing Stage 2 doses based on different outcomes from Stage 1. Even if such rules seem more heuristic, optimal design ideas can help the statistician to derive them. In the example in Section 4.1, we will use rules that depend less on model assumptions and, more importantly, are easier to communicate to non-statisticians.

3.2 Choice of the contrast vector for Stage 2

When doses can be added or dropped or allocation ratios can be changed for Stage 2, the question arises how the Stage 2 contrasts of a two-stage trend test should be specified.

The Stage 2 contrast can be directly adjusted by using the estimated effects at the doses as contrasts for Stage 2; the observed effect is thereby used as the best-guess assumption for Stage 2. This method was proposed by Lang et al. (2000) and by Chang et al. (2006) for the case that no new doses are added in Stage 2. If new doses are added, the contrast coefficients for these doses need to be specified based on the observed effect at other doses, e.g. by interpolation between the effects of adjacent doses. If multiple contrasts are used for Stage 2, contrasts can be specified based not only on the observed values but also on the level of uncertainty, which will usually exist after observing Stage 1 data of limited size.

Another, indirect application of this method is to fix in advance the number of treatment groups and the contrast coefficients for the treatment groups in Stage 2, but to adjust the doses of the treatment groups so that the expected effect is closer to the prespecified contrast coefficients.

When an important goal of the study is to determine the best fitting dose–response shape among several candidates, as is done in the MCP-Mod method (see Bretz et al., 2005; Pinheiro et al., 2006), it is more reasonable to stick to prespecified response shapes and thereby to the prespecified contrasts. However, the contrasts often depend on unknown parameter values and therefore initial guesses for them are required in Stage 1. Hence, as we discuss briefly in Section 5, it is a possibility to estimate these parameters in the interim analysis and to use the contrasts based on them for Stage 2.


When the MCP-Mod method is applied, it is possible to combine modelling and dose–response estimation with Proof of Concept testing, see Pinheiro et al. (2006) and Bretz et al. (2008).

3.3 Type I error control

We discuss in this section how the adaptive contrast test can be specified in order to control the type I error. If the two-stage trend test is applied in a situation where the Stage 2 doses or contrasts are chosen based on Stage 1 results, we call the test AMCT.

3.3.1 Type I error control of the AMCT used with the non-adaptive critical value

If in Stage 2 only designs (dose choices, sample sizes per dose, contrast coefficients) are allowed which all have the same $V_2 = C_2^\top P_2 C_2$, then the multiple contrast test

$$T = W^{-1/2} C^\top P Y = (W_1 + W_2)^{-1/2}\left(W_1^{1/2} T_1 + W_2^{1/2} T_2\right)$$

Reject $H_0$ if and only if $\max\{T\}/\sigma > u_\alpha$

controls the type I error $\alpha$, where $u_\alpha$ is determined according to Section 2.3.

The proof is straightforward since the distribution of $T$ is independent of Stage 1 data: Conditioned on the Stage 1 results, $T_2$ is $N(0, \sigma^2 W_2^{-1/2} V_2 W_2^{-1/2})$-distributed under $H_0$. Since the conditional distribution is independent of Stage 1, the unconditional distribution of $T_2$ is $N(0, \sigma^2 W_2^{-1/2} V_2 W_2^{-1/2})$, $T_1$ and $T_2$ are independent, and $T \sim N(0, \sigma^2 W^{-1/2} C^\top P C W^{-1/2})$ under $H_0$.

3.3.2 Control of the type I error of the AMCT in the general case

If the possibility is desired to choose Stage 2 settings with differing $V_2$, the adaptivity needs to be taken into account for the test. We apply here the conditional error approach (see e.g. Müller and Schäfer, 2001, 2004) in order to control the type I error. In our context, the approach can be summarized as follows:

(i) Prespecify $V_2$ as $V_2^*$ prior to the study. By this, the weight matrix $W_2$ (diagonal matrix with the same diagonal elements as $V_2$) and the covariance matrix $R = (W_1 + W_2)^{-1/2}(V_1 + V_2)(W_1 + W_2)^{-1/2}$, $V_1 = C_1^\top P_1 C_1$, are also specified. We call these matrices $W_2^*$ and $R^*$, respectively. Derive the (non-adaptive) critical value $u_\alpha^*$ based on $R^*$ (which defines a "base-test") using multidimensional integration over a normal distribution density.

(ii) In the interim analysis, determine the conditional error

$$a = P_{H_0}(\max\{T^*\}/\sigma > u_\alpha^* \mid Y_1) \tag{5}$$

where $T^*/\sigma$, under the condition $Y_1$ and under the null hypothesis, is multidimensional normally distributed with mean vector $(W_1 + W_2^*)^{-1/2} C_1^\top P_1 Y_1$ and covariance matrix $R_2^* = (W_1 + W_2^*)^{-1/2} V_2^* (W_1 + W_2^*)^{-1/2}$.

(iii) Change the design and the contrast vector, leading to a new $C_2$ and $V_2 = C_2^\top P_2 C_2$, which determines $W_2$ and $R_2 = (W_1 + W_2)^{-1/2} V_2 (W_1 + W_2)^{-1/2}$.

(iv) Compute a new critical value $u_a$ such that

$$P_{H_0}(\max\{T\}/\sigma > u_a \mid Y_1) = a \tag{6}$$

where $T/\sigma$, under the condition $Y_1$ and under the null hypothesis, is multidimensional normally distributed with mean vector $(W_1 + W_2)^{-1/2} C_1^\top P_1 Y_1$ and covariance matrix $R_2$.


To compute the conditional errors in (5) and (6), $k$-dimensional integration over a normal distribution needs to be performed again, and numerical integration routines as mentioned before can be applied.
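The four steps of the conditional error approach can be sketched as follows. This is a rough illustration under several assumptions that are not from the article: $\sigma = 1$, equal group sizes, SciPy's multivariate normal distribution function as integration routine, and hypothetical contrasts, Stage 1 means, level and function names.

```python
import numpy as np
from scipy.optimize import brentq
from scipy.stats import multivariate_normal

def prob_max_exceeds(u, mean, cov):
    """P(max Z > u) for Z ~ N(mean, cov)."""
    k = len(mean)
    return 1.0 - multivariate_normal(mean=mean, cov=cov).cdf(np.full(k, u))

def solve_critical_value(level, mean, cov):
    """Find u with P(max Z > u) = level by root finding (needs 0 < level < 1)."""
    f = lambda u: prob_max_exceeds(u, mean, cov) - level
    return brentq(f, np.min(mean) - 10.0, np.max(mean) + 10.0)

def gram(G, m):
    """V = C' P C for one stage with m patients in each dose group.
    G holds one contrast per column and one row per dose group."""
    Gc = G - G.mean(axis=0)
    return m * Gc.T @ Gc

def conditional_law(s1, w1, V2):
    """Mean and covariance of T given Y1 under H0, as in steps (ii) and (iv).
    s1 = C1' P1 Y1, w1 = diagonal of V1, V2 = C2' P2 C2; sigma = 1 assumed."""
    scale = np.diag(1.0 / np.sqrt(w1 + np.diag(V2)))
    return scale @ s1, scale @ V2 @ scale

alpha, m = 0.025, 41                       # hypothetical level; 41 patients per group
G1 = np.array([[-3.0, 0.0], [-1.0, 2.0], [1.0, 3.0], [3.0, 3.0]])      # hypothetical
G2_star = G1                               # prespecified Stage 2 contrasts
G2_new = np.array([[-3.0, 0.0], [-1.0, 1.0], [1.0, 3.0], [3.0, 3.0]])  # adapted
ybar1 = np.array([0.0, 0.1, 0.3, 0.4])     # hypothetical Stage 1 group means

V1, V2_star, V2_new = gram(G1, m), gram(G2_star, m), gram(G2_new, m)
w1 = np.diag(V1)
s1 = m * (G1 - G1.mean(axis=0)).T @ ybar1  # C1' P1 Y1 computed from group means

# Step (i): base test with prespecified V2*
scale = np.diag(1.0 / np.sqrt(w1 + np.diag(V2_star)))
R_star = scale @ (V1 + V2_star) @ scale
u_star = solve_critical_value(alpha, np.zeros(2), R_star)

# Step (ii): conditional error of the base test, formula (5)
a = prob_max_exceeds(u_star, *conditional_law(s1, w1, V2_star))

# Steps (iii)-(iv): adapted design and its critical value, formula (6)
u_new = solve_critical_value(a, *conditional_law(s1, w1, V2_new))
print(u_star, a, u_new)
```

If the Stage 2 design is left unchanged, step (iv) returns the base critical value again, which is a useful sanity check for any implementation.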

3.3.3 Some remarks for the adaptive single contrast test ($k = 1$)

The test derived with the conditional error approach coincides with the weighted Adaptive Single Contrast Test with prespecified $w_2^*$:

$$T^* = \frac{1}{\sqrt{w_1 + w_2^*}}\left(\sqrt{w_1}\,T_1 + \sqrt{w_2^*}\,T_2\right)$$

Reject $H_0$ if and only if $T^*/\sigma > \Phi^{-1}(1-\alpha)$.

In Step (i) of the conditional error approach, the choice of $V_2^* = w_2^*$ determines the fixed weight of Stage 2.

Further, this test can be written as a weighted inverse normal combination test, see Lehmacher and Wassmer (1999). The p-values from the Stage 1 and Stage 2 trend tests, $p_j = 1 - \Phi(T_j)$, $j = 1, 2$, are combined according to:

$$T^* = \frac{1}{\sqrt{w_1 + w_2^*}}\left(\sqrt{w_1}\,\Phi^{-1}(1 - p_1) + \sqrt{w_2^*}\,\Phi^{-1}(1 - p_2)\right).$$
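The equivalence of the two forms can be checked numerically. In this sketch the stage statistics and weights are hypothetical values, and $\sigma = 1$ is assumed so that $p_j = 1 - \Phi(T_j)$ applies directly.

```python
import numpy as np
from scipy.stats import norm

def weighted_stat(t1, t2, w1, w2_star):
    """Weighted single contrast statistic T* with prespecified Stage 2 weight."""
    return (np.sqrt(w1) * t1 + np.sqrt(w2_star) * t2) / np.sqrt(w1 + w2_star)

def inverse_normal_combination(p1, p2, w1, w2_star):
    """Weighted inverse normal combination of the stage-wise p-values."""
    z1, z2 = norm.isf(p1), norm.isf(p2)    # isf(p) equals Phi^{-1}(1 - p)
    return (np.sqrt(w1) * z1 + np.sqrt(w2_star) * z2) / np.sqrt(w1 + w2_star)

t1, t2 = 1.2, 1.9                          # hypothetical stage-wise statistics
w1, w2_star = 820.0, 820.0                 # hypothetical weights
p1, p2 = norm.sf(t1), norm.sf(t2)          # p_j = 1 - Phi(T_j)

print(weighted_stat(t1, t2, w1, w2_star),
      inverse_normal_combination(p1, p2, w1, w2_star))
```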

Koenig et al. (2008) have applied the conditional error approach for pairwise comparisons in adaptive designs where treatments can be dropped in the interim analysis. Since they consider a Dunnett test, they derive the distribution of the maximum of several contrasts for pairwise comparisons. In our paper, the trend test vector has a more general covariance. Owing to this, we are not restricted to having the same allocation ratios to doses throughout the study, which is required by Koenig et al. (2008). Another difference of our approach is that there is no natural default Stage 2 design. This makes the specification of $V_2^*$ in Step (i) of Section 3.3.2 necessary, which defines a "base-test".

4 Adaptive dose-finding example

In this section, we discuss how the AMCT can be applied to an adaptive dose-finding study using an example.

4.1 Design of the example study

Let us assume that in our dose-finding study we have the possibility to use six doses, called 1, 2, …, 6, and placebo.

The unknown true dose–response relation could look like one of the three scenarios in Table 1, assuming a continuous primary outcome and a largest effect of 0.5. The dose range of increasing efficacy of the new drug could be between dose 0 and 2 ("high potency" scenario), between dose 2 and 4 ("medium potency" scenario), or between dose 4 and 6 ("low potency" scenario). It is of high importance to investigate not only doses without effect and doses on the plateau of the dose–effect relation but also to obtain information about effect and tolerability in the dose range of increasing efficacy.

We consider the following adaptive dose-finding design: In Stage 1, the four dose-groups 0, 2, 4 and 6 will be used with $n_1/4$ patients each. Then, an interim analysis of the results in Stage 1 is performed and the effect is estimated for the dose-groups. Let the mean results be $\bar{Y}_0$, $\bar{Y}_2$, $\bar{Y}_4$ and $\bar{Y}_6$ and the differences between subsequent doses $i$ and $j$ be denoted by $\Delta_{ij} = \bar{Y}_i - \bar{Y}_j$. In Stage 2, $n_2/4$ patients in each of four dose-groups will be observed and the groups will be chosen according to the following four decision rules that cover all possible outcomes of Stage 1:

(i) If $\Delta_{02} > \max(\Delta_{24}, \Delta_{46})$, use dose-groups 0, 1, 2, and 6 in Stage 2 (Design 1),
(ii) If $\Delta_{24} \ge \Delta_{02} > \Delta_{46}$, use dose-groups 0, 2, 3, and 6 in Stage 2 (Design 2),
(iii) If $\Delta_{24} > \Delta_{46} \ge \Delta_{02}$, use dose-groups 0, 3, 4, and 6 in Stage 2 (Design 3),
(iv) If $\Delta_{46} \ge \max(\Delta_{02}, \Delta_{24})$, use dose-groups 0, 4, 5, and 6 in Stage 2 (Design 4).

The intention of this rule is to focus on the most interesting dose range, i.e. the one with the largest increase in estimated effect from one dose to the next. In the range with the largest increase, the uncertainty about the effect of doses not yet investigated is largest. Note that an underlying assumption here is that the true effect is a non-decreasing function of dose, which makes it reasonable, for example, always to continue with dose 6.
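The four decision rules translate directly into a small function. The sketch below is illustrative only: the function name is hypothetical, and each argument is read as the increase in mean response from the lower to the higher of two adjacent Stage 1 doses, an assumption made here to match the stated intention of the rule.

```python
def choose_stage2_doses(d02, d24, d46):
    """Return the Stage 2 dose groups given the estimated increases between
    adjacent Stage 1 doses (0 to 2, 2 to 4, 4 to 6)."""
    if d02 > max(d24, d46):
        return (0, 1, 2, 6)        # Design 1
    if d24 >= d02 > d46:
        return (0, 2, 3, 6)        # Design 2
    if d24 > d46 >= d02:
        return (0, 3, 4, 6)        # Design 3
    return (0, 4, 5, 6)            # Design 4: d46 >= max(d02, d24)

# Hypothetical interim estimates with the steepest increase between doses 2 and 4
print(choose_stage2_doses(0.10, 0.50, 0.00))
```

The final `return` needs no explicit condition because the first three rules exclude every outcome except $\Delta_{46} \ge \max(\Delta_{02}, \Delta_{24})$, so the four cases are mutually exclusive and exhaustive.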

For n1 5 n2 5 164 patients in each stage, an assumed variance of s2 5 1 and normally distributeddata, we simulated five scenarios: a flat (no-effect) scenario with true effect 0 at each dose, a scenariowith linearly increasing effect over the whole dose range (effect i/12 for dose i5 0,y,6), and thethree scenarios in Table 1. Note that an effect size (maximal effect/standard deviation) of 0.5 is atypical setting in clinical applications and the sample size is reasonable from a power perspective forthis effect size as we will see in Section 4.2.

The simulation results are given in Table 2. They show that for the three scenarios with a dose-range of increasing efficacy between two Stage 1 doses, there is a high likelihood (>85%) of picking the dose between these two doses for Stage 2. Hence, the design has the desired property: a dose in the range of increasing efficacy is included with high probability for these scenarios. For the flat and linear dose–response scenarios, the probability of picking dose 1, 3, or 5 is almost equal (around 1/3 each).
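The selection probabilities can be approximated with a short simulation. The sketch below is not the author's program (which was written in R). It assumes known $\sigma = 1$ and 41 patients per dose, and it draws the four Stage 1 group means directly from their normal sampling distribution rather than simulating individual patients. The function and dictionary names are hypothetical.

```python
import random

STAGE1_DOSES = (0, 2, 4, 6)

# True effects at doses 0..6 (Table 1 plus the flat and linear scenarios).
SCENARIOS = {
    "flat": [0.0] * 7,
    "high potency": [0, 0.25, 0.5, 0.5, 0.5, 0.5, 0.5],
    "medium potency": [0, 0, 0, 0.25, 0.5, 0.5, 0.5],
    "low potency": [0, 0, 0, 0, 0, 0.25, 0.5],
    "linear": [i / 12 for i in range(7)],
}


def choose_design(y0, y2, y4, y6):
    """Interim decision rules (i)-(iv); returns the design number 1-4."""
    d02, d24, d46 = y2 - y0, y4 - y2, y6 - y4
    if d02 > max(d24, d46):
        return 1
    if d24 >= d02 > d46:
        return 2
    if d24 > d46 >= d02:
        return 3
    return 4


def selection_probabilities(effects, n_per_group=41, sigma=1.0,
                            n_sim=20000, seed=1):
    """Estimate the probability of each Stage 2 design by simulation."""
    rng = random.Random(seed)
    se = sigma / n_per_group ** 0.5   # standard error of a group mean
    counts = [0, 0, 0, 0]
    for _ in range(n_sim):
        means = [rng.gauss(effects[d], se) for d in STAGE1_DOSES]
        counts[choose_design(*means) - 1] += 1
    return [c / n_sim for c in counts]


if __name__ == "__main__":
    for name, effects in SCENARIOS.items():
        probs = selection_probabilities(effects)
        print(name, [round(100 * p, 1) for p in probs])
```

With 20 000 replications the Monte Carlo standard error is below 0.4 percentage points, so the estimates should be comparable to, but not identical with, the entries of Table 2.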

4.2 Adaptive dose-finding with type I error control

In this section, we compare by simulation the performance of several tests which control the type I error in the adaptive dose-finding trial. Tests 1–3 include single contrast tests; Tests 4 and 5 are based on multiple contrast tests.

Test 1: A linear contrast test for the doses in Stage 1 (using the contrasts −3, −1, 1, 3) and a linear contrast test for the chosen doses in Stage 2 (using the contrasts −3, −1, 1, 3 for the four chosen doses). With the decision rules of the interim analysis described in Section 4.1, doses are added where the effect increases most and doses can be eliminated if the increase in effect between two adjacent doses is not large. Therefore, if we put the chosen Stage-2-doses on an ordinal scale (0, 1, 2, 3), the expected dose–effect relation should be close to linear. The coefficients for the smallest and highest dose, placebo and dose 6, should remain unchanged in Stage 2. The four designs for Stage 2 all have the same patient number $n_2/4 = 41$ for each dose. Consequently,

$$w_2 = c_2^\top P_2 c_2 = c_2^\top c_2 = 41\,\bigl((-3)^2 + (-1)^2 + 1^2 + 3^2\bigr) = 820$$

is equal for all possible Stage-2-choices or, in other words, pre-specified to be 820. The adaptive contrast test based on

$$T = \bigl(\sqrt{w_1}\,T_1 + \sqrt{w_2}\,T_2\bigr) \big/ \sqrt{w_1 + w_2}$$

therefore controls the type I error.
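The arithmetic behind the weight and the combined statistic can be checked with a few lines of Python. This is a minimal sketch with hypothetical function names. It assumes that each patient carries the contrast coefficient of their dose-group and that $P$ removes the overall mean, which leaves a contrast summing to zero unchanged.

```python
from math import sqrt


def contrast_weight(coefficients, n_per_group):
    """w = c'Pc for equal group sizes, with P the centring projection."""
    mean = sum(coefficients) / len(coefficients)
    return n_per_group * sum((c - mean) ** 2 for c in coefficients)


def combined_statistic(t1, t2, w1, w2):
    """T = (sqrt(w1) T1 + sqrt(w2) T2) / sqrt(w1 + w2)."""
    return (sqrt(w1) * t1 + sqrt(w2) * t2) / sqrt(w1 + w2)


if __name__ == "__main__":
    w1 = contrast_weight((-3, -1, 1, 3), 41)   # Stage 1 doses 0, 2, 4, 6
    w2 = contrast_weight((-3, -1, 1, 3), 41)   # any of the four Stage 2 designs
    print(w1, w2)
    print(combined_statistic(1.5, 2.0, w1, w2))
```

Since the same scores and group sizes are used in both stages, $w_1 = w_2$ here and the combination reduces to $(T_1 + T_2)/\sqrt{2}$.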

Table 1 Hypothetical dose–response scenarios.

                  True effect at dose
Scenario          0     1     2     3     4     5     6
High potency      0     0.25  0.5   0.5   0.5   0.5   0.5
Medium potency    0     0     0     0.25  0.5   0.5   0.5
Low potency       0     0     0     0     0     0.25  0.5

Test 2: A linear contrast test for the doses in Stage 1 (using the contrasts −3, −1, 1, 3) and a single contrast test based on the estimated effects in Stage 2 are used. This method is mentioned by Chow and Chang (2006). However, in contrast to the situation in that reference, we have here the complication that a new dose is investigated in Stage 2. We have therefore extended their test by obtaining the contrast coefficient for a new dose by linear interpolation between the estimated effects of the two adjacent doses. Further, since we assume a monotone dose–response relation, we estimate the effects by monotone regression at the interim and base the Stage 2 contrasts on these monotonized estimates.
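The two ingredients of this extension, monotone regression and linear interpolation, can be sketched as follows. The code is an illustration under stated assumptions, not the author's implementation: it uses the pool-adjacent-violators algorithm with equal weights (all Stage 1 groups have the same size), and it assumes that the Stage 2 contrast is the centred vector of the monotonized, interpolated effect estimates. All function names are hypothetical.

```python
def pava(values):
    """Non-decreasing least-squares fit (equal weights) by pooling violators."""
    blocks = []  # each block holds [sum of values, number of values]
    for v in values:
        blocks.append([v, 1])
        # Merge backwards while the previous block mean exceeds the last one.
        while (len(blocks) > 1
               and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]):
            s, n = blocks.pop()
            blocks[-1][0] += s
            blocks[-1][1] += n
    fitted = []
    for s, n in blocks:
        fitted.extend([s / n] * n)
    return fitted


def stage2_contrast(stage1_doses, stage1_means, stage2_doses):
    """Centred Stage 2 contrast from monotonized interim estimates.

    Assumes every new dose lies strictly between two Stage 1 doses.
    """
    estimate = dict(zip(stage1_doses, pava(stage1_means)))
    effects = []
    for d in stage2_doses:
        if d in estimate:
            effects.append(estimate[d])
        else:
            lo = max(x for x in stage1_doses if x < d)
            hi = min(x for x in stage1_doses if x > d)
            frac = (d - lo) / (hi - lo)
            effects.append(estimate[lo] + frac * (estimate[hi] - estimate[lo]))
    mean = sum(effects) / len(effects)
    return [e - mean for e in effects]


if __name__ == "__main__":
    # Hypothetical interim means at doses 0, 2, 4, 6; Design 2 adds dose 3.
    print(stage2_contrast((0, 2, 4, 6), [0.0, 0.2, 0.6, 0.5], (0, 2, 3, 6)))
```

In the example, the estimates at doses 4 and 6 violate monotonicity and are pooled to 0.55, and the new dose 3 receives the midpoint between the monotonized estimates at doses 2 and 4.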

Test 3: A Bartholomew test for Stage 1 (Bartholomew, 1961, or Robertson et al., 1988) and a single contrast test based on the estimated effects in Stage 2, using interpolation as above. Except for the interpolation, this test follows the idea of Lang et al. (2000).

Test 4: Two candidate models are used: an E-max model with a concave shape and a convex dose–response relation with an exponential-type shape (when restricted to the dose range 0–6). See Table 3 for the model specification and contrast coefficients. The contrast coefficients for each dose are here prespecified for both stages and not based on interim data.

Test 5: As above, we use a multiple contrast test, now with three candidate models: an E-max, a logistic and an exponential-type model, see Table 3.
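The contrast coefficients of Table 3 follow directly from the model formulas listed there. The sketch below evaluates the five candidate models at doses 0 to 6; the dictionary and function names are hypothetical and chosen only for this illustration.

```python
from math import exp

# Candidate dose-response models as specified in Table 3.
CANDIDATE_MODELS = {
    "Test 4": {
        "E-max": lambda d: 3 / 2 * d / (d + 3),
        "Exponential-type": lambda d: 1 - 3 / 2 * (6 - d) / (9 - d),
    },
    "Test 5": {
        "E-max": lambda d: 7 / 6 * d / (d + 1),
        "Logistic": lambda d: 1 / (1 + exp(2 * (3 - d))),
        "Exponential-type": lambda d: 1 - 7 / 6 * (6 - d) / (7 - d),
    },
}


def contrast_coefficients(test, doses=range(7)):
    """Evaluate every candidate model of the given test at each dose."""
    return {name: [model(d) for d in doses]
            for name, model in CANDIDATE_MODELS[test].items()}


if __name__ == "__main__":
    for test in CANDIDATE_MODELS:
        for name, coeffs in contrast_coefficients(test).items():
            print(test, name, [round(c, 2) for c in coeffs])
```

All models are scaled so that the coefficient is 0 (or nearly 0 for the logistic model) at placebo and 1 (or nearly 1) at dose 6.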

To illustrate the theory introduced before, we derive the values for the matrices of Test 4. The matrix $C_1$ is $n_1 \times k$-dimensional, i.e. $164 \times 2$ for the example, and given by:

$$C_1^\top = \begin{pmatrix} 0 & \cdots & 0 & 0.60 & \cdots & 0.60 & 0.86 & \cdots & 0.86 & 1 & \cdots & 1 \\ 0 & \cdots & 0 & 0.14 & \cdots & 0.14 & 0.40 & \cdots & 0.40 & 1 & \cdots & 1 \end{pmatrix}$$

The $2 \times 2$-matrix $V_1 = C_1^\top P_1 C_1$ is then given by $V_1 = (v_1^{(i,j)})_{i,j}$ with $v_1^{(1,1)} = v_1^{(2,2)} = 24.00$ and $v_1^{(1,2)} = v_1^{(2,1)} = 19.71$. Analogously, we calculate the matrix $V_2 = C_2^\top P_2 C_2$ for each of the four possible Stage-2-designs. E.g. for Design 1 with 41 patients on dose 0, 1, 2, and 6,

$$C_2^\top = \begin{pmatrix} 0 & \cdots & 0 & 0.38 & \cdots & 0.38 & 0.60 & \cdots & 0.60 & 1 & \cdots & 1 \\ 0 & \cdots & 0 & 0.06 & \cdots & 0.06 & 0.14 & \cdots & 0.14 & 1 & \cdots & 1 \end{pmatrix}$$
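The matrices $V_1$ and $V_2$ can be reproduced numerically. The sketch below assumes that $P_1$ and $P_2$ are the projections that remove the overall mean (a model with intercept only); this assumption is not stated in this section, but it reproduces the reported values. Function names are hypothetical.

```python
def emax(d):
    """E-max candidate model of Test 4."""
    return 3 / 2 * d / (d + 3)


def exponential_type(d):
    """Exponential-type candidate model of Test 4."""
    return 1 - 3 / 2 * (6 - d) / (9 - d)


MODELS = (emax, exponential_type)


def v_matrix(doses, n_per_group=41):
    """V = C'PC for equal group sizes, with P assumed to centre each column."""
    columns = [[model(d) for d in doses] for model in MODELS]
    centred = [[x - sum(col) / len(col) for x in col] for col in columns]
    return [[n_per_group * sum(a * b for a, b in zip(ci, cj))
             for cj in centred] for ci in centred]


if __name__ == "__main__":
    print("V1:", [[round(v, 2) for v in row] for row in v_matrix((0, 2, 4, 6))])
    for design, doses in enumerate([(0, 1, 2, 6), (0, 2, 3, 6),
                                    (0, 3, 4, 6), (0, 4, 5, 6)], start=1):
        v2 = v_matrix(doses)
        print("Design", design, [[round(v, 2) for v in row] for row in v2])
```

Because every dose-group has 41 patients, the $164 \times 2$ matrix never has to be built explicitly: centring the four distinct coefficients per column and multiplying the cross-products by 41 gives the same result.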

The results for $V_2$ and $R_2$ are summarized in Table 4. The diagonal elements of the matrices $V_2$ relative to those of $V_1$ can be interpreted as the weighting of Stage 2 for the contrasts in a (non-adaptive) two-stage multiple contrast test. In order not to deviate too much from the non-adaptive weighting between the stages in the contrast vectors, we use as matrix $V_2^*$ the elementwise mean of the four possible matrices $V_2$. This matrix will then be used as "base-test" for calculation of the conditional power based on Stage 1. Note that the matrix $V_2^*$ defined this way has the properties of a covariance matrix, i.e. it is symmetric and positive semi-definite. The matrix $R^* = (r_*^{(r,s)})$ is then given by $r_*^{(j,j)} = 1$, $j = 1, 2$, and $r_*^{(1,2)} = r_*^{(2,1)} = 0.8272$. The critical value for $\alpha = 0.025$ is $u_\alpha = 2.1427$, based on numerical calculation using the qmvnorm-function in the R-package mvtnorm.
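The base-test quantities can be checked without special software. The following Python sketch starts from the rounded entries of $V_1$ and Table 4, forms the elementwise mean $V_2^*$, and computes the critical value by one-dimensional numerical integration. It is a pure-Python stand-in for qmvnorm, not the author's code. It assumes that $R^*$ is the correlation matrix obtained by standardising $V_1 + V_2^*$, a reading that reproduces the reported 0.8272 up to rounding. All names are hypothetical.

```python
from math import erf, exp, pi, sqrt

V1 = [[24.00, 19.71], [19.71, 24.00]]
V2_DESIGNS = [
    [[21.54, 21.07], [21.07, 27.10]],
    [[22.22, 18.65], [18.65, 24.51]],
    [[24.51, 18.65], [18.65, 22.22]],
    [[27.10, 21.07], [21.07, 21.54]],
]

# Elementwise mean of the four possible V2 matrices.
V2_STAR = [[sum(m[i][j] for m in V2_DESIGNS) / len(V2_DESIGNS)
            for j in range(2)] for i in range(2)]
TOTAL = [[V1[i][j] + V2_STAR[i][j] for j in range(2)] for i in range(2)]
RHO = TOTAL[0][1] / sqrt(TOTAL[0][0] * TOTAL[1][1])


def norm_pdf(x):
    return exp(-0.5 * x * x) / sqrt(2 * pi)


def norm_cdf(x):
    return 0.5 * (1 + erf(x / sqrt(2)))


def prob_max_below(u, rho, steps=2000, lower=-9.0):
    """P(max(Z1, Z2) <= u) for a standard bivariate normal with correlation rho.

    Simpson's rule applied to phi(z) * Phi((u - rho z) / sqrt(1 - rho^2)).
    """
    scale = sqrt(1 - rho * rho)
    h = (u - lower) / steps

    def f(z):
        return norm_pdf(z) * norm_cdf((u - rho * z) / scale)

    total = f(lower) + f(u)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(lower + i * h)
    return total * h / 3


def critical_value(alpha, rho):
    """Solve P(max(Z1, Z2) > u) = alpha for u by bisection."""
    lo, hi = 0.0, 5.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if 1 - prob_max_below(mid, rho) > alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


if __name__ == "__main__":
    print(round(RHO, 4), round(critical_value(0.025, 0.8272), 4))
```

The critical value lies between 1.96 (perfectly correlated contrasts) and about 2.24 (independent contrasts), which illustrates how the high correlation between the two candidate shapes keeps the multiplicity penalty small.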

Table 2 Probabilities of choosing doses for the second stage of the adaptive dose-finding design (100 000 simulations for each scenario).

                            Stage-2-doses
Scenario                    0+1+2+6   0+2+3+6   0+3+4+6   0+4+5+6
Flat (%)                    31.8      18.2      18.5      31.5
High potency (%)            86.7      7.9       1.0       4.4
Medium potency (%)          7.5       42.8      42.2      7.5
Low potency (%)             4.4       1.0       8.0       86.6
Linear dose–response (%)    31.8      18.3      18.2      31.7

Under the null hypothesis, a flat dose–response relation, the rejection probability is by construction of the tests $\alpha = 2.5\%$. Table 5 shows the power of the five adaptive tests described before, applied to the dose-finding example and based on simulations. For each scenario, 100 000 simulations were performed with the programme R, using for Tests 4 and 5 the R package mvtnorm to calculate the conditional errors (5) and (6). All five tests were evaluated on the same data.

Test 1, which indirectly adapts the doses to the contrast vector, has good power for the linear dose–response compared with the other tests, but limited power for the low and high potency scenarios. Tests 2 and 3 differ only in the test used in the first stage: Test 2, which uses a linear trend test, is slightly more powerful for the linear dose–response shape, while Test 3, which uses a Bartholomew test in the first stage, is more powerful for the high and low potency scenarios. Test 5, the multiple contrast test based on three contrasts, outperforms Test 2 for all four considered scenarios. Overall, Test 5 performs very well. Only for the linear dose–response scenario are Tests 1 and 4 slightly better. Note here that Test 5 does not include a contrast with a linear dose–response shape. The power of Test 4 for the medium potency scenario is lower than that of the other tests. This can be explained by the fact that the two contrasts of Test 4 do not include a contrast similar to this scenario. This missing contrast is included in Test 5 as the logistic contrast, which explains the improvement for this scenario. In general, it is also notable that the power differences between the five tests are not large.

Note that the total sample size ($n_1 + n_2 = 328$) was chosen so that the considered tests have at least around 90% power for each of the four scenarios.
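For Test 1, the simulation is simple enough to sketch in a few lines, because $w_1 = w_2 = 820$ and the combined statistic reduces to $(T_1 + T_2)/\sqrt{2}$. The following Python code is an illustration, not the author's R program. It assumes a known $\sigma = 1$, stage-wise statistics $T_1$ and $T_2$ standardised to unit variance, and the one-sided critical value 1.959964 for $\alpha = 0.025$. The names are hypothetical.

```python
import random
from math import sqrt

SCORES = (-3, -1, 1, 3)
STAGE1_DOSES = (0, 2, 4, 6)
DESIGNS = {1: (0, 1, 2, 6), 2: (0, 2, 3, 6), 3: (0, 3, 4, 6), 4: (0, 4, 5, 6)}


def choose_design(y0, y2, y4, y6):
    """Interim decision rules (i)-(iv) of Section 4.1."""
    d02, d24, d46 = y2 - y0, y4 - y2, y6 - y4
    if d02 > max(d24, d46):
        return 1
    if d24 >= d02 > d46:
        return 2
    if d24 > d46 >= d02:
        return 3
    return 4


def linear_contrast_z(means, n_per_group, sigma):
    """Standardised linear contrast statistic, N(0, 1) under a flat curve."""
    estimate = sum(c * m for c, m in zip(SCORES, means))
    return estimate / (sigma * sqrt(sum(c * c for c in SCORES) / n_per_group))


def rejection_rate_test1(effects, n_per_group=41, sigma=1.0,
                         critical=1.959964, n_sim=20000, seed=2010):
    """Monte Carlo rejection probability of adaptive Test 1."""
    rng = random.Random(seed)
    se = sigma / sqrt(n_per_group)
    rejections = 0
    for _ in range(n_sim):
        y1 = [rng.gauss(effects[d], se) for d in STAGE1_DOSES]
        doses2 = DESIGNS[choose_design(*y1)]
        y2 = [rng.gauss(effects[d], se) for d in doses2]
        t = (linear_contrast_z(y1, n_per_group, sigma)
             + linear_contrast_z(y2, n_per_group, sigma)) / sqrt(2)
        rejections += t > critical
    return rejections / n_sim


if __name__ == "__main__":
    print("flat:", rejection_rate_test1([0.0] * 7))
    print("high potency:",
          rejection_rate_test1([0, 0.25, 0.5, 0.5, 0.5, 0.5, 0.5]))
    print("linear:", rejection_rate_test1([i / 12 for i in range(7)]))
```

Under the flat scenario the rejection rate should be close to 2.5%, since the Stage 2 weight does not depend on the interim choice. Under the other scenarios it should be close to the Test 1 column of Table 5, up to Monte Carlo error.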

Table 3 Contrast coefficients for multiple contrast Tests 4 and 5.

                                                     Contrast coefficients for dose
Test  Contrast name     Dose–response model          0     1     2     3     4     5     6
4     E-max             3/2 · d/(d+3)                0.00  0.38  0.60  0.75  0.86  0.94  1.00
4     Exponential-type  1 − 3/2 · (6−d)/(9−d)        0.00  0.06  0.14  0.25  0.40  0.62  1.00
5     E-max             7/6 · d/(d+1)                0.00  0.58  0.78  0.88  0.93  0.97  1.00
5     Logistic          [1 + exp{2(−d+3)}]^(−1)      0.00  0.02  0.12  0.50  0.88  0.98  1.00
5     Exponential-type  1 − 7/6 · (6−d)/(7−d)        0.00  0.03  0.07  0.12  0.22  0.42  1.00

Table 4 Matrices $V_2 = C_2^\top P_2 C_2$ and $R_2$ for the possible Stage-2-designs. The last row (*) gives the elementwise mean over the four designs.

        V_2 = (v_2^(r,s))_{r,s}         R_2 = (r_2^(r,s))_{r,s}
Design  v_2^(1,1)  v_2^(1,2)  v_2^(2,2)  r_2^(1,1)  r_2^(1,2)  r_2^(2,2)
1       21.54      21.07      27.10      0.4731     0.4368     0.5304
2       22.22      18.65      24.51      0.4807     0.3939     0.5053
3       24.51      18.65      22.22      0.5053     0.3939     0.4807
4       27.10      21.07      21.54      0.5304     0.4368     0.4731
*       23.84      19.86      23.84      0.4984     0.4152     0.4984

5 Discussion

A general possibility when a two-stage design is performed is to calculate separate tests and p-values for the stages and then to apply a p-value combination test. Bauer and Röhmel (1995) applied this approach to an adaptive dose-finding trial with choice of second stage doses based on first stage results. Indeed, we have noted that for the single contrast case ($k = 1$), the conditional error approach is equal to a fixed weight combination of z-statistics, which is known to be equivalent to a weighted version of the inverse normal combination test. Our approach helps to justify the weighting between the stages and offers the possibility to add covariates for the whole study, not necessarily separately for the two stages. For $k > 1$, the two approaches differ: by using the inverse normal combination test, the maximum is taken over the contrast tests within the stages. It is reasonable to apply the inverse normal combination if the multiple contrasts are applied purely for robustness reasons without special interpretation of the contrasts. Hence, it is possible to combine the results from two totally different contrast tests. With the approach discussed in this paper, the corresponding contrast tests from the two stages are first combined and after that, the maximum is taken.
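The equivalence for the single contrast case can be made concrete. The sketch below assumes a known variance and the usual form of the conditional error of a fixed-weight base-test, $A(t_1) = 1 - \Phi\bigl((u\sqrt{w_1 + w_2} - \sqrt{w_1}\,t_1)/\sqrt{w_2}\bigr)$. The article's own expressions (5) and (6) are not reproduced here, so this formula and all function names should be read as assumptions of the illustration.

```python
from math import erf, sqrt


def norm_cdf(x):
    return 0.5 * (1 + erf(x / sqrt(2)))


def conditional_error(t1, w1, w2, u):
    """Probability, given Stage 1, that the base-test rejects under H0."""
    return 1 - norm_cdf((u * sqrt(w1 + w2) - sqrt(w1) * t1) / sqrt(w2))


def reject_by_conditional_error(t1, t2, w1, w2, u):
    """Reject if the Stage 2 p-value does not exceed the conditional error."""
    p2 = 1 - norm_cdf(t2)
    return p2 <= conditional_error(t1, w1, w2, u)


def reject_by_combination(t1, t2, w1, w2, u):
    """Reject if the fixed-weight combination of z-statistics reaches u."""
    return (sqrt(w1) * t1 + sqrt(w2) * t2) / sqrt(w1 + w2) >= u


if __name__ == "__main__":
    u = 1.959964  # one-sided critical value for alpha = 0.025
    for t1, t2 in [(1.0, 1.5), (1.0, 2.0), (2.5, 0.2), (0.0, 2.9)]:
        print(t1, t2,
              reject_by_conditional_error(t1, t2, 820, 820, u),
              reject_by_combination(t1, t2, 820, 820, u))
```

Both rules give the same decision for every pair of stage-wise statistics, because the Stage 2 p-value falls below the conditional error exactly when $t_2$ exceeds the threshold inside $\Phi$.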

We have applied in this paper the conditional error approach to an adaptive dose-finding trial. In contrast to other applications of this approach, e.g. sample size re-estimation, there is no natural default design for Stage 2. Therefore, we need a "base-test" in order to calculate the conditional power. This base-test needs to be specified in advance, i.e. in the Clinical Study Protocol. Our approach to specifying this test can be viewed as defining an "average-test", where we took the "mean" over the tests for the four possible Stage 2 designs. However, in practice, an interim monitoring committee might decide to apply another, very different Stage 2 design (which was not used to calculate the base-test). The only disadvantage is then that the weighting between the stages might deviate more from the non-adaptive weighting (implying some reduced efficiency), but the resulting test after application of the conditional error approach is statistically valid, i.e. it controls the type I error. This flexibility of the method is important. One alternative to our average-test approach is to calculate the conditional error at the interim as the maximum over all preplanned Stage 2 designs, avoiding the definition of the hypothetical average-test. Then, the adaptive and non-adaptive tests coincide if the design with the largest conditional error is chosen for Stage 2. Hence, this alternative is an interesting possibility if the decision rule for the Stage 2 design is related to maximisation of the conditional error. However, in situations like the considered example, we see no such relation and therefore prefer the average-test approach, which limits the largest deviation between the adaptive and the non-adaptive test.

We have mentioned in Section 3.2 the MCP-Mod method, which uses several candidate models. For simplicity, we used in the example candidate models defining contrasts which do not depend on unknown parameters. If the contrasts are parameter dependent, one might start Stage 1 with contrasts based on an initial guess of the parameters, estimate the parameters at the interim and define the contrasts for Stage 2 based on them. The approach described in this paper is applicable to this case. The only new question which one has to think about is the definition of the base-test. The easiest way is to define it based purely on the initial parameter guesses, but in principle some additional parameter values could be included as well in order to define the base-test.

In practical applications, the variance $\sigma^2$ is unknown and has to be estimated from the data by an estimator $S^2$. As estimator $S^2$, one can choose the residual variance after applying an ANCOVA (or mixed model). Doses and Stage can be included as covariates together with other covariates.

Table 5 Power of 5 tests for the adaptive dose-finding design (100 000 simulations for each scenario).

Scenario                    Test 1  Test 2  Test 3  Test 4  Test 5
High potency (%)            90.1    91.4    93.8    92.5    93.6
Medium potency (%)          96.3    96.4    96.5    94.8    97.2
Low potency (%)             90.2    91.3    93.8    92.5    93.5
Linear dose–response (%)    90.8    89.4    88.1    91.1    89.9

Assuming normally distributed data as we did in Eq. (1), $S^2 \cdot p/\sigma^2$ has a $\chi^2$-distribution with $p < n_1 + n_2$ degrees of freedom. The degrees of freedom depend on the number of doses and covariates in the model. The single contrast tests discussed in this paper can then be modified to "Reject $H_0$ if and only if $T/S > F_p^{-1}(1 - \alpha)$" with $F_p$ being the distribution function of the t-distribution with $p$ degrees of freedom. The multiple contrast test is then "Reject $H_0$ if and only if $\max\{T\}/S > u_\alpha$" with $u_\alpha$ being the critical value for a multivariate t-distribution. The critical value can be numerically calculated using the qmvt function of the R-package mvtnorm. The difficulty in this application is that the conditional errors (see (5) and (6)) depend on the unknown variance. Approximations can be applied to deal with this dependence, see Posch et al. (2004). For large sample sizes like in the example considered in this paper, the approximations work well.
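For the single contrast case, the modified rejection rule only requires a quantile of the univariate t-distribution. The following sketch computes it in pure Python by numerical integration of the t-density and bisection. It is an illustration with hypothetical function names, and the value $p = 320$ used in the example is an assumed number of residual degrees of freedom, not one given in the article.

```python
from math import exp, lgamma, log, pi


def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    log_norm = lgamma((df + 1) / 2) - lgamma(df / 2) - 0.5 * log(df * pi)
    return exp(log_norm - (df + 1) / 2 * log(1 + x * x / df))


def t_cdf(x, df, steps=2000):
    """Distribution function F_p(x): Simpson's rule on [0, |x|] plus symmetry."""
    a = abs(x)
    h = a / steps
    area = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        area += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area *= h / 3
    return 0.5 + area if x >= 0 else 0.5 - area


def t_quantile(prob, df):
    """Inverse of t_cdf by bisection."""
    lo, hi = -100.0, 100.0
    for _ in range(80):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < prob:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def reject_single_contrast(t_stat, s, df, alpha=0.025):
    """Reject H0 if and only if T / S exceeds the (1 - alpha) t-quantile."""
    return t_stat / s > t_quantile(1 - alpha, df)


if __name__ == "__main__":
    print(t_quantile(0.975, 320))            # slightly above the normal 1.96
    print(reject_single_contrast(2.3, 1.05, 320))
```

With several hundred degrees of freedom the t-quantile is only marginally larger than the normal quantile, which is in line with the remark that the approximations work well for large sample sizes.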

We have not considered here the possibility of an interim stop if no evidence for a drug effect is observed (futility stop). In most practical situations, a drug effect in the target population is not yet established when a dose-finding study is performed, and hence such a possibility should be implemented. However, in many cases it is advisable not to adjust the critical value of the final test for this possibility (non-binding futility rule): if an interim monitoring committee continues the study although the rule says stop, the final trend test can still be performed and is statistically valid. The methods described in this paper can therefore also be used if a non-binding futility rule is included. The test will, however, be slightly conservative, and the implications for the power of the test should be investigated.

Acknowledgements The author would like to thank Olivier Guilbaud for several helpful discussions on this topic. He would further like to thank the anonymous referees and an anonymous Associate Editor for helpful and very inspiring comments leading to an improvement of this paper.

Conflict of Interest

The author has declared no conflict of interest.

References

Bartholomew, D. J. (1961). Ordered tests in the analysis of variance. Biometrika 48, 325–332.

Bauer, P. and Röhmel, J. (1995). An adaptive method for establishing a dose–response relationship. Statistics in Medicine 14, 1595–1607.

Bornkamp, B., Pinheiro, J. and Bretz, F. (2009). MCPMod: an R package for the design and analysis of dose-finding studies. Journal of Statistical Software 29.

Bretz, F., Hsu, J., Pinheiro, J. and Liu, Y. (2008). Dose finding – a challenge in statistics. Biometrical Journal 50, 480–504.

Bretz, F., Pinheiro, J. and Branson, M. (2005). Combining multiple comparisons and modeling techniques in dose–response studies. Biometrics 61, 738–748.

Burman, C. F., Miller, F. and Wong, K. W. (2010). Improving dose-finding – a philosophic view. In: Chow, S. C., Pong, A. (Eds.). Handbook of Adaptive Designs in Pharmaceutical and Clinical Development. Taylor & Francis, Boca Raton.

Chang, M., Chow, S. C. and Pong, A. (2006). Adaptive design in clinical research: issues, opportunities, and recommendations. Journal of Biopharmaceutical Statistics 16, 299–309.

Chow, S. C. and Chang, M. (2006). Adaptive Design Methods in Clinical Trials. Chapman & Hall/CRC, Englewood Cliffs, NJ, Boca Raton.

Dragalin, V., Bornkamp, B., Bretz, F., Miller, F., Padmanabhan, S. K., Patel, N., Perevozskaya, I., Pinheiro, J. and Smith, J. R. (2010). A simulation study to compare new adaptive dose-ranging designs. Statistics in Biopharmaceutical Research (to appear).

Friede, T. and Stallard, N. (2008). A comparison of methods for adaptive treatment selection. Biometrical Journal 50, 767–781.

Genz, A. and Bretz, F. (2009). Computation of Multivariate Normal and t Probabilities. Springer, Heidelberg.

ICH guideline E4 (1994). International conference on harmonisation of technical requirements for registration of pharmaceuticals for human use. ICH Topic E4: Dose–response Information to Support Drug Registration. ICH Technical Coordination, EMEA, London.

Koenig, F., Brannath, W., Bretz, F. and Posch, M. (2008). Adaptive Dunnett tests for treatment selection. Statistics in Medicine 27, 1612–1625.

Lang, T., Auterith, A. and Bauer, P. (2000). Trendtests with adaptive scoring. Biometrical Journal 42, 1007–1020.

Lehmacher, W. and Wassmer, G. (1999). Adaptive sample size calculations in group sequential trials. Biometrics 55, 1286–1290.

Miller, F., Guilbaud, O. and Dette, H. (2007). Optimal designs for estimating the interesting part of a dose–effect curve. Journal of Biopharmaceutical Statistics 17, 1097–1115.

Müller, H. H. and Schäfer, H. (2001). Adaptive group sequential designs for clinical trials. Biometrics 57, 886–891.

Müller, H. H. and Schäfer, H. (2004). A general statistical principle for changing a design any time during the course of a trial. Statistics in Medicine 23, 2497–2508.

Neuhäuser, M., Liu, P. Y. and Hothorn, L. A. (1998). Nonparametric tests for trend: Jonckheere's test, a modification and a maximum test. Biometrical Journal 40, 899–909.

Pinheiro, J., Bretz, F. and Branson, M. (2006). Analysis of dose–response studies: modeling approaches. In: Ting, N. (Ed.). Dose Finding in Drug Development. Springer, New York, 146–171.

Posch, M., Koenig, F., Branson, M., Brannath, W., Dunger-Baldauf, C. and Bauer, P. (2005). Testing and estimating in flexible group sequential designs with adaptive treatment selection. Statistics in Medicine 24, 3697–3714.

Posch, M., Timmesfeld, N., König, F. and Müller, H. H. (2004). Conditional rejection probabilities of Student's t-test and design adaptations. Biometrical Journal 46, 389–403.

Robertson, T., Wright, F. T. and Dykstra, R. L. (1988). Order Restricted Statistical Inference. Wiley, Chichester.

Ruberg, S. J. (1995a). Dose response studies: I. Some design considerations. Journal of Biopharmaceutical Statistics 5, 1–14.

Ruberg, S. J. (1995b). Dose response studies: II. Analysis and interpretation. Journal of Biopharmaceutical Statistics 5, 15–42.

Stallard, N. and Friede, T. (2008). A group-sequential design for clinical trials with treatment selection. Statistics in Medicine 27, 6209–6227.

Steward, W. H. and Ruberg, S. J. (2000). Detecting dose response with contrasts. Statistics in Medicine 19, 913–921.
