Sample size determination. Nick Barrowman, PhD, Senior Statistician, Clinical Research Unit, CHEO Research Institute. March 29, 2010


Sample size determination

Nick Barrowman, PhD

Senior Statistician

Clinical Research Unit, CHEO Research Institute

March 29, 2010

Outline

• Example: lowering blood pressure

• Introduction to some statistical issues in sample size determination

• Two simple approximate formulas

• Descriptions of sample size calculations from the literature

Example

• Physicians design an intervention to reduce blood pressure in patients with high blood pressure

• But does it work? Need a study.

• How many participants are required?

• Too few: may not detect an effect even if there is one.

• Too many: may unnecessarily expose patients to risk.

The null hypothesis

• For intervention studies, the null hypothesis is usually this: on average there is no effect.

• “Innocent until proven guilty”

• The physicians who designed the intervention believe the null hypothesis is false.

• The study is designed to test the null hypothesis.

• Often write H0 for the null hypothesis.

The study

• The population is considered to be all people who might be eligible for the intervention (might depend on age, other medical conditions, etc.)

• Study participants are viewed as a sample from this population.

• Suppose for each study participant we measure blood pressure at baseline, and after 6 weeks of intervention

• Outcome is change in blood pressure

• H0 is that mean change in BP is 0.

Population vs. sample

[Diagram: a random sample is drawn from the population. The sample mean of the change in blood pressure is obtained by calculation and is used for inference about the population mean of the change in blood pressure.]

Probability distributions

[Figure: population distribution of change in blood pressure, marked with the mean and ± 1 standard deviation.]

Recall that variance is the square of the standard deviation, often written as sigma^2 (σ²).

Population distribution of change in blood pressure

Sampling distribution of mean change in blood pressure (N=1)

Sampling distribution of mean change in blood pressure (N=2)

Sampling distribution of mean change in blood pressure (N=5)

Sampling distribution of mean change in blood pressure (N=10)

Increasing sample size reduces the variability of the sample mean.

Standard error = standard deviation / √N, i.e. SE = SD / √N
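The relationship SE = SD / √N can be checked numerically. The sketch below assumes a population standard deviation of 7 mmHg (a value borrowed from the worked example later in the talk, used here purely for illustration) and prints the standard error for the sample sizes shown on the preceding slides. The function name `standard_error` is a hypothetical helper, not something from the talk.

```python
import math


def standard_error(sd: float, n: int) -> float:
    """Standard error of the sample mean for n independent observations."""
    return sd / math.sqrt(n)


# Assumed population SD in mmHg (illustrative only).
ASSUMED_SD = 7.0

for n in (1, 2, 5, 10):
    print(f"N={n:2d}  SE={standard_error(ASSUMED_SD, n):.2f} mmHg")
```

Quadrupling N halves the standard error, which is why large gains in precision become expensive.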

Variance and sample size

• As we’ve seen, increasing the sample size is akin to reducing the variance

• Equivalently, reducing the variance (e.g. using a more precise measurement device) can reduce the sample size requirements

Hypothesis test

Sampling distribution of the mean under the null hypothesis, a.k.a. the null distribution

Hypothesis test

[Figure: null distribution with the rejection region in the tails and the observed mean marked.]

Reject the null hypothesis if the observed mean is far in the tails of the null distribution, i.e. the result would be unlikely to arise by chance alone if the null hypothesis were true.
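As a rough illustration of the rejection rule, the sketch below performs a two-sided test of H0 (mean change = 0) using a normal approximation with a known standard deviation. Those simplifications, the function name `rejects_null`, and the numbers in the usage lines are assumptions made for illustration; the slides do not specify a particular test.

```python
import math
from statistics import NormalDist


def rejects_null(observed_mean: float, sd: float, n: int, alpha: float = 0.05) -> bool:
    """Two-sided z-test of H0: population mean change = 0 (SD assumed known)."""
    se = sd / math.sqrt(n)                        # standard error of the mean
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 for alpha = 0.05
    # The rejection region is both tails of the null distribution,
    # beyond +/- z_crit standard errors from zero.
    return abs(observed_mean) > z_crit * se


print(rejects_null(5.0, sd=7.0, n=16))  # observed mean far in the tail
print(rejects_null(1.0, sd=7.0, n=16))  # observed mean near the centre
```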

Possible scenarios

Based on the study findings we infer either that the intervention has no effect (accept H0) or that the intervention has an effect (reject H0).

In reality, either the intervention has no effect (H0 is true) or the intervention has an effect (H0 is false).

Four possible scenarios

                                 In reality, either …
Based on the study findings      the intervention has       the intervention has
we infer either …                no effect (H0 is true)     an effect (H0 is false)

that the intervention has
no effect (accept H0)            Correctly accept H0        Type-II error

that the intervention has
an effect (reject H0)            Type-I error               Correctly reject H0

Type-I error

If the null hypothesis is true, the rejection region of the test represents type-I error.

The probability of type-I error is the area of the rejection region under the null distribution (shaded red in the original figure), and is denoted by alpha (α).
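One way to see that the area of the rejection region equals alpha is by simulation: if H0 is true and the study is repeated many times, the fraction of studies whose sample mean lands in the rejection region should be close to alpha. The sketch below assumes a normal sampling distribution with known SD. The SD of 7 mmHg, N of 16, the seed, and the function name `simulated_type_i_rate` are illustrative choices, not values from this slide.

```python
import math
import random
from statistics import NormalDist


def simulated_type_i_rate(sd: float, n: int, alpha: float = 0.05,
                          n_studies: int = 20_000, seed: int = 1) -> float:
    """Fraction of simulated studies that reject H0 when H0 is true."""
    rng = random.Random(seed)
    se = sd / math.sqrt(n)
    cutoff = NormalDist().inv_cdf(1 - alpha / 2) * se  # edge of the rejection region
    # Under H0 the sample mean is drawn from the null distribution N(0, se).
    rejections = sum(abs(rng.gauss(0.0, se)) > cutoff for _ in range(n_studies))
    return rejections / n_studies


print(f"{simulated_type_i_rate(sd=7.0, n=16):.3f}")  # should be close to 0.05
```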

Type-II error

• Type-II error is failing to reject the null hypothesis when it is false.

• The probability of type-II error is denoted beta (β).

• It depends on how big the true effect is

• Sample size calculations require specification of an alternative hypothesis, which indicates the size of effect we would like to detect
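To make the dependence on the true effect concrete, the sketch below computes the type-II error probability for a two-sided test under a normal approximation with known SD. The SD of 7 mmHg, N of 16, the candidate effect sizes, and the function name `type_ii_error` are assumptions for illustration only.

```python
import math
from statistics import NormalDist


def type_ii_error(true_effect: float, sd: float, n: int, alpha: float = 0.05) -> float:
    """Probability of failing to reject H0 when the true mean change is true_effect."""
    std_normal = NormalDist()
    se = sd / math.sqrt(n)
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    shift = true_effect / se  # true effect in standard-error units
    # Power is the probability that the sample mean falls in either tail
    # of the rejection region when the alternative hypothesis is true.
    power = std_normal.cdf(shift - z_crit) + std_normal.cdf(-shift - z_crit)
    return 1 - power


for effect in (2.0, 5.0, 8.0):
    print(f"true effect {effect} mmHg: beta = {type_ii_error(effect, sd=7.0, n=16):.2f}")
```

The larger the effect named in the alternative hypothesis, the smaller the type-II error probability at a given sample size.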


Relationship between type-I and type-II error

[Figures: three panels, for alpha=0.05, alpha=0.10 and alpha=0.20.]

• Sample size calculations depend on the tradeoff between type-I and type-II error.

• We usually fix the probability of type-I error (alpha) at 5% and then try to minimize the probability of type-II error (beta).

• Define Power = 1 – beta

• We want to maximize power

• One way to do this is by increasing the sample size
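The tradeoff can be sketched numerically. The code below assumes a two-sided test under a normal approximation with known SD; the SD of 7 mmHg, the true effect of 5 mmHg, the sample sizes, and the function name `power` are illustrative assumptions. It first varies alpha at a fixed sample size, then varies the sample size with alpha fixed at 0.05.

```python
import math
from statistics import NormalDist


def power(true_effect: float, sd: float, n: int, alpha: float) -> float:
    """Approximate power of a two-sided z-test of H0: mean change = 0."""
    std_normal = NormalDist()
    shift = true_effect / (sd / math.sqrt(n))
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    return std_normal.cdf(shift - z_crit) + std_normal.cdf(-shift - z_crit)


# Loosening alpha lowers beta (raises power) at a fixed sample size.
for alpha in (0.05, 0.10, 0.20):
    p = power(5.0, sd=7.0, n=10, alpha=alpha)
    print(f"alpha={alpha:.2f}  beta={1 - p:.2f}  power={p:.2f}")

# With alpha fixed at 0.05, a larger sample raises power.
for n in (10, 20, 50):
    print(f"N={n:2d}  power={power(5.0, sd=7.0, n=n, alpha=0.05):.2f}")
```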

How sample size affects power

[Figures: three panels, the second titled "Sample size (doubled)" and the third "Sample size (quintupled)".]

An approximate formula for the blood pressure example

• Suppose the variance in the change in blood pressure, sigma^2, is the same for the null and alternative hypotheses

• Suppose alpha is fixed at 0.05 and we use two-sided tests (allowing for the possibility that blood pressure could be either increased or decreased by the intervention)

• Then we will have approximately 80% power to detect a mean change in blood pressure delta if we enroll N participants, where

N = 8 sigma^2 / delta^2 (approximately)
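The constant 8 can be traced to the usual normal-approximation argument. The derivation below is a sketch under that approximation and is not taken from the slides: $z_{p}$ denotes the $p$-th quantile of the standard normal distribution, and $\sigma$ and $\delta$ stand for the slide's sigma and delta.

```latex
N \;\approx\; \frac{\left(z_{1-\alpha/2} + z_{1-\beta}\right)^{2}\,\sigma^{2}}{\delta^{2}}
  \;=\; \frac{(1.960 + 0.842)^{2}\,\sigma^{2}}{\delta^{2}}
  \;\approx\; \frac{7.85\,\sigma^{2}}{\delta^{2}}
  \;\approx\; \frac{8\,\sigma^{2}}{\delta^{2}}
  \qquad (\alpha = 0.05,\ 1-\beta = 0.80)
```

Rounding 7.85 up to 8 makes the rule of thumb slightly conservative.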

Example

• Suppose the standard deviation of the change in blood pressure is anticipated to be 7 mmHg (so the variance is 49)

• Suppose we fix alpha at 0.05 and we’d like to have approximately 80% power to detect a mean change of 5 mmHg

• Then we would need about 16 participants
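The arithmetic behind this figure is short enough to check directly. In the sketch below, the function name `approx_n_one_group` is a hypothetical helper; it simply applies the approximate one-group formula (constant 8, valid for two-sided alpha = 0.05 and about 80% power) and rounds up to a whole participant.

```python
import math


def approx_n_one_group(sd: float, delta: float) -> int:
    """Approximate N for one group: 8 * sigma^2 / delta^2, rounded up."""
    return math.ceil(8 * sd**2 / delta**2)


print(approx_n_one_group(sd=7.0, delta=5.0))  # 8 * 49 / 25 = 15.68, rounded up to 16
```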

When there are two groups

• So far, the example has used a single group of study participants

• Usually we want to compare two groups: a control group that receives “standard of care” or placebo, and an experimental group that receives a new intervention

• This is how most randomized controlled trials are set up

• In this case, delta is the difference between the means of the two groups.

• For simplicity, assume that the variance is the same in the two groups.

An approximate sample size formula for the case of two groups

• A similar approximate formula applies, again assuming alpha=0.05 and power=80%:

N per group = 16 sigma^2 / delta^2 (approximately)

• Careful! This is the required sample size per group.

• Also, note that the constant is double what it was for the case of a single group.

• So the total sample size is 4 times as large.
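The doubling of the constant follows from the variance of a difference between two independent group means. The sketch below assumes equal group sizes $n$, a common variance $\sigma^{2}$, and the same normal approximation as in the one-group case, where $(z_{1-\alpha/2} + z_{1-\beta})^{2} \approx 7.85$ for $\alpha = 0.05$ and 80% power. The notation is an assumption of this sketch, not the slides'.

```latex
\operatorname{Var}\!\left(\bar{X}_{1} - \bar{X}_{2}\right)
  = \frac{\sigma^{2}}{n} + \frac{\sigma^{2}}{n}
  = \frac{2\sigma^{2}}{n}
\quad\Longrightarrow\quad
n_{\text{per group}} \approx \frac{2\left(z_{1-\alpha/2} + z_{1-\beta}\right)^{2}\sigma^{2}}{\delta^{2}}
  \approx \frac{2 \times 7.85\,\sigma^{2}}{\delta^{2}}
  \approx \frac{16\,\sigma^{2}}{\delta^{2}},
\qquad
N_{\text{total}} = 2\,n_{\text{per group}} \approx \frac{32\,\sigma^{2}}{\delta^{2}}
  = 4 \times \frac{8\,\sigma^{2}}{\delta^{2}}
```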

Example

• Suppose we want to compare patients randomized to placebo with patients randomized to a new intervention

• Suppose the standard deviation is anticipated to again be 7 mmHg (so the variance is 49)

• Suppose we fix alpha at 0.05 and we’d like to have approximately 80% power to detect a difference of 5 mmHg between the means of the two groups

• Then we would need about 32 participants per group, for a total of about 64 participants
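The two-group figures can be checked the same way. The function name `approx_n_per_group` is a hypothetical helper that applies the approximate per-group formula (constant 16, valid for two-sided alpha = 0.05 and about 80% power) and rounds up.

```python
import math


def approx_n_per_group(sd: float, delta: float) -> int:
    """Approximate N per group: 16 * sigma^2 / delta^2, rounded up."""
    return math.ceil(16 * sd**2 / delta**2)


per_group = approx_n_per_group(sd=7.0, delta=5.0)  # 16 * 49 / 25 = 31.36, rounded up to 32
total = 2 * per_group                              # two groups of equal size
print(per_group, total)  # 32 64
```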

Summary

Required sample size …

• increases with variance

• decreases with size of effect to detect

• decreases with probability of type-I error, alpha

• decreases with probability of type-II error, beta
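These four directions can be checked with a general version of the one-group formula in which alpha and beta are free parameters. The sketch below assumes the normal approximation with a two-sided test; the function name `required_n` and all input values are illustrative assumptions.

```python
import math
from statistics import NormalDist


def required_n(sd: float, delta: float, alpha: float = 0.05, beta: float = 0.20) -> int:
    """One-group sample size from the normal approximation (two-sided test)."""
    z = NormalDist().inv_cdf
    constant = (z(1 - alpha / 2) + z(1 - beta)) ** 2  # about 7.85 for the defaults
    return math.ceil(constant * sd**2 / delta**2)


baseline = required_n(sd=7.0, delta=5.0)
print("baseline:       ", baseline)
print("larger variance:", required_n(sd=10.0, delta=5.0))
print("larger effect:  ", required_n(sd=7.0, delta=8.0))
print("larger alpha:   ", required_n(sd=7.0, delta=5.0, alpha=0.10))
print("larger beta:    ", required_n(sd=7.0, delta=5.0, beta=0.30))
```

Only the first variation should print a number larger than the baseline.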

Sample size determination has many other aspects

• Different types of outcomes: dichotomous (e.g. mortality), time-to-event (e.g. survival time), etc.

• Different designs: observational studies (e.g. case-control), surveys, prevalence studies

• Practical considerations: e.g. costs, feasibility of recruitment

Questions?

Review: A comedy of errors …

α = Probability of type-I error

(Rejecting the null hypothesis when it is in fact true.)

Probability of a false conviction

Power = 1 – β

(Rejecting the null hypothesis when it is in fact false.)

Probability of a true conviction