Kameshwar Prasad
Professor of Neurology, Former Chief, Neurosciences Centre
All India Institute of Medical Sciences, New Delhi
From diagnostic test to
hypothesis test
Plan
• Revise concepts related to diagnostic
tests
• Learn concepts related to hypothesis
testing
How do I teach sample size
calculation?
Dichotomous Outcome (2 Independent Samples)
• Test H0: p1 = p2 vs. HA: p1 ≠ p2
• Assuming two-sided alternative and equal allocation
$$n_{\text{per group}} = \frac{\left[ z_{1-\alpha/2}\sqrt{2\bar{p}\bar{q}} + z_{1-\beta}\sqrt{p_1 q_1 + p_2 q_2} \right]^2}{\Delta^2}$$
***Always Round Up To Nearest Integer!
p1, p2 = projected true probabilities of “success” in the two groups
q1 = 1 - p1, q2 = 1 - p2
Δ = p1 - p2
p̄ = (p1 + p2)/2, q̄ = 1 - p̄
z(1-α/2) is the N(0,1) cutoff corresponding to α/2
z(1-β) is the N(0,1) cutoff corresponding to β
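A minimal Python sketch of this slide's formula: the per-group n is the square of [z(1-α/2)·√(2p̄q̄) + z(1-β)·√(p1q1 + p2q2)] divided by Δ², rounded up. The function name `n_per_group_binary` and the defaults α = 0.05 and power = 0.80 are assumptions made for illustration. The inputs 40% and 20% are the mortality figures from the surgery hypothesis used later in this talk.

```python
from math import ceil, sqrt
from statistics import NormalDist

def n_per_group_binary(p1, p2, alpha=0.05, power=0.80):
    """Per-group sample size for two proportions (two-sided test, 1:1 allocation)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # N(0,1) cutoff for alpha/2
    z_beta = NormalDist().inv_cdf(power)           # N(0,1) cutoff for beta = 1 - power
    q1, q2 = 1 - p1, 1 - p2
    p_bar = (p1 + p2) / 2
    q_bar = 1 - p_bar
    delta = p1 - p2
    numerator = (z_alpha * sqrt(2 * p_bar * q_bar)
                 + z_beta * sqrt(p1 * q1 + p2 * q2)) ** 2
    return ceil(numerator / delta ** 2)  # always round up to the next integer

n = n_per_group_binary(0.40, 0.20)
print(n)
```

With these inputs the sketch gives 82 per group, close to the 84 obtained from the 16p(100 - p)/(d x d) shortcut.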
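Dichotomous Outcome (2 Independent Samples)
$$\text{Power} = \Phi\left( \frac{\sqrt{n}\,|\Delta| - z_{1-\alpha/2}\sqrt{2\bar{p}\bar{q}}}{\sqrt{p_1 q_1 + p_2 q_2}} \right)$$
where Φ is the (cumulative) probability from a standard normal distribution and n is the sample size per group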
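The power expression on this slide, Φ of [√n·|Δ| - z(1-α/2)·√(2p̄q̄)] / √(p1q1 + p2q2), can be sketched as follows. The function name `power_binary` and the default α = 0.05 are assumptions; n = 84 per group with 40% vs. 20% is taken from the surgery example later in the talk.

```python
from math import sqrt
from statistics import NormalDist

def power_binary(n, p1, p2, alpha=0.05):
    """Approximate power of a two-sided test of two proportions, n per group."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    p_bar = (p1 + p2) / 2
    q_bar = 1 - p_bar
    delta = abs(p1 - p2)
    z = (sqrt(n) * delta - z_alpha * sqrt(2 * p_bar * q_bar)) / sqrt(
        p1 * (1 - p1) + p2 * (1 - p2)
    )
    return NormalDist().cdf(z)  # Phi(z)

print(round(power_binary(84, 0.40, 0.20), 3))
```

For 84 per group the result is a little above 80%, which is what the shortcut is meant to deliver.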
Continuous Outcome (2 Independent Samples)
• Test H0: μ1 = μ2 vs. HA: μ1 ≠ μ2
• Two-sided alternative and equal allocation
• Assume outcome normally distributed with:
mean μ1 and variance σ1² in Group 1
mean μ2 and variance σ2² in Group 2
$$n_{\text{per group}} = \frac{(\sigma_1^2 + \sigma_2^2)\,(z_{1-\alpha/2} + z_{1-\beta})^2}{\Delta^2}, \qquad \Delta = \mu_1 - \mu_2$$
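A hedged sketch of the continuous-outcome formula, n per group = (σ1² + σ2²)(z(1-α/2) + z(1-β))² / Δ². The function name `n_per_group_continuous` and the defaults α = 0.05 and power = 0.80 are assumptions; the inputs (standard deviation 10 in both groups, difference 5) are the LVEF figures used later in the talk.

```python
from math import ceil
from statistics import NormalDist

def n_per_group_continuous(sd1, sd2, delta, alpha=0.05, power=0.80):
    """Per-group sample size for comparing two means (two-sided, 1:1 allocation)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    n = (sd1 ** 2 + sd2 ** 2) * (z_alpha + z_beta) ** 2 / delta ** 2
    return ceil(n)  # round up

print(n_per_group_continuous(10, 10, 5))
```

This gives 63 per group, one fewer than the 64 from the 16 s²/d² shortcut.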
For RCTs
Sample size with grade 3 math
Randomized Controlled Trial (RCT)
Two groups of equal size
Parallel groups
• Hypothesis : In patients with hypertensive brain
haemorrhage, surgery reduces 30-day mortality from
40% to 20%.
Best medical management alone: 40% (Pc)
Surgery + best medical management: 20% (Pe)
RCT : Superiority Hypothesis
Pe = 20%
Pc = 40%
Find out their average = 30% (p)
Find out the difference between Pe & Pc = 20% (d)
RCT : Superiority Hypothesis
Average of Pe & Pc = 30% (p)
Difference between Pe & Pc = 20% (d)
Sample size per group = 16p(100 - p) / (d x d)
= (16 x 30 x 70) / (20 x 20) = 84 per group
Total N = 168
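The arithmetic on this slide can be sketched in a few lines of Python. The function name `rule_of_16_binary` is hypothetical; inputs are percentages, as on the slide.

```python
from math import ceil

def rule_of_16_binary(pe_pct, pc_pct):
    """Shortcut: n per group = 16 * p * (100 - p) / d^2, with p and d in percent."""
    p = (pe_pct + pc_pct) / 2    # average of Pe and Pc
    d = abs(pe_pct - pc_pct)     # difference between Pe and Pc
    return ceil(16 * p * (100 - p) / (d * d))

per_group = rule_of_16_binary(20, 40)
print(per_group, 2 * per_group)  # 84 168
```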
RCT : Superiority Hypothesis
Sample size per group = 16p(100 - p) / (d x d)
This is for a study with two equal parallel groups.
p = average of Pe & Pc
d = difference between Pe and Pc
Exercise: calculate sample size for the following hypothesis:
Dexamethasone adjunctive therapy reduces mortality from 15% to 5% in children with neonatal meningitis.
Sample size per group = 16p(100 - p) / (d x d)
This is for a study with two equal parallel groups.
Thank You
By now, you have at least one
question?
• Where does the ‘16’ come from?
• Before we address this, I will take your
test…….
Truth revealed by Gold standard
            Disease +        Disease -
Test +      True positive    False positive
Test -      False negative   True negative
Hypothetical Example of a study with sample size of 400
Truth revealed by Gold standard
            Disease +   Disease -
Test +      190         40
Test -      10          160
Total       200         200
Hypothetical Example of a study with sample size of 400
Truth revealed by Gold standard
            Disease +              Disease -
Test +      190 (True positive)    40 (False positive)
Test -      10 (False negative)    160 (True negative)
Total       200                    200
Hypothetical Example of a study with sample size of 400
Truth revealed by Gold standard
            Disease +   Disease -
Test +      95%         20%
Test -      5%          80%
Truth revealed by Gold standard
            Disease +   Disease -
Test +      TP rate     FP rate
Test -      FN rate     TN rate
What we call rate is actually probability.
Hypothetical Example of a study with sample size of 400
Truth revealed by Gold standard
            Disease +                   Disease -
Test +      95% (True positive rate)    20% (False positive rate)
Test -      5% (False negative rate)    80% (True negative rate)
Giving names: ‘terms’
Truth revealed by Gold standard
            Disease +                           Disease -
Test +      Sensitivity (True Positive rate)    (False Positive rate)
Test -      (False Negative rate)               Specificity (True Negative rate)
Think of relationship between false-negative rate and sensitivity
2 by 2 table: sensitivity
            Disease +              Disease -
Test +      a (True positives)
Test -      c (False negatives)
Sensitivity = a / (a + c)
Proportion of people with the disease who have a positive test result.
So, a test with 84% sensitivity means that the test identifies 84 out of 100 people WITH the disease.
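To make the relationship concrete, here is a small sketch that computes the four rates from the hypothetical study of 400 shown earlier (190, 40, 10, 160). The function name `diagnostic_rates` and the dictionary keys are assumptions for illustration.

```python
def diagnostic_rates(tp, fp, fn, tn):
    """Column-wise rates of a 2 by 2 diagnostic table."""
    diseased = tp + fn       # Disease + column total
    non_diseased = fp + tn   # Disease - column total
    return {
        "sensitivity": tp / diseased,              # true positive rate
        "false_negative_rate": fn / diseased,
        "specificity": tn / non_diseased,          # true negative rate
        "false_positive_rate": fp / non_diseased,
    }

rates = diagnostic_rates(tp=190, fp=40, fn=10, tn=160)
print(rates)
```

Sensitivity and the false-negative rate share the same denominator, so they always add up to 1: here 95% and 5%.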
Conducting a study is like doing a
diagnostic test
• Want to know (diagnose) the truth
• But there is no ‘gold standard’ to reveal
the truth, which is known only to ‘God’
• Any study, like a diagnostic test, is an
attempt to find the truth
Truth about New Treatment
                                  Works (+)         Does not Work (-)
Study finds: Works (+)            True positive     False positive
Study finds: Does not Work (-)    False negative    True negative
Four Possible Results
• Where do results go wrong?
• Where do errors occur?
Truth about New Treatment
                                  Works (+)               Does not Work (-)
Study finds: Works (+)            True Positive           False Positive error
Study finds: Does not Work (-)    False Negative error    True Negative
Four Possible Results
Truth about New Treatment
                                  Works (+)                         Does not Work (-)
Study finds: Works (+)            True Positive                     Type I error (False positive)
Study finds: Does not Work (-)    Type II error (False negative)    True Negative
Types of errors
False positive result: Type I error
False negative result: Type II error
Cannot plan for error free results
Four Possible Results
Truth about New Treatment
                                  Works (+)                           Does not Work (-)
Study finds: Works (+)            True Positive probability           False Positive error probability
Study finds: Does not Work (-)    False Negative error probability    True Negative probability
Consultation with prof of
biostatistics
• Sir, can you help with sample size
calculation of my thesis (RCT)?
• And, so on….
Four Possible Results
Truth about New Treatment
                                  Works (+)                                  Does not Work (-)
Study finds: Works (+)            True Positive probability                  False Positive error probability (alpha)
Study finds: Does not Work (-)    False Negative error probability (beta)    True Negative probability
Think of relationship between
false-negative rate and sensitivity
What happens to the true positive probability as beta varies?
What is the other name for true positive probability?
Truth about New Treatment
                                  Works (+)                                  Does not Work (-)
Study finds: Works (+)            True Positive probability = POWER          False Positive error probability (alpha)
Study finds: Does not Work (-)    False Negative error probability (beta)    True Negative probability
How much risk of error do you want to take?
Type I error rate
5%
Type II error rate
5%, 10%, 20%
Probability of Errors
Probability of FP (Type I ) error: α
Probability of FN (Type II) error: β
Almost always α=5%
Usually β = 20 %, 10%.
With 20%, power = ?, With 10%, power = ?
Source for 16
α = 5%, β = 20%, Power = 80%
Allocation ratio = 1:1
Dichotomous Outcome (2 Independent Samples)
• Test H0: p1 = p2 vs. HA: p1 ≠ p2
• Assuming two-sided alternative and equal allocation
$$n_{\text{per group}} = \frac{\left[ z_{1-\alpha/2}\sqrt{2\bar{p}\bar{q}} + z_{1-\beta}\sqrt{p_1 q_1 + p_2 q_2} \right]^2}{\Delta^2}$$
***Always Round Up To Nearest Integer!
p1, p2 = projected true probabilities of “success” in the two groups
q1 = 1 - p1, q2 = 1 - p2
Δ = p1 - p2
p̄ = (p1 + p2)/2, q̄ = 1 - p̄
z(1-α/2) is the N(0,1) cutoff corresponding to α/2
z(1-β) is the N(0,1) cutoff corresponding to β
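One way to see where the 16 comes from, under the added assumption that p1q1 + p2q2 is close to 2p̄q̄ (reasonable when p1 and p2 are not far apart), with α = 5% (z = 1.96) and β = 20% (z = 0.84):

$$n_{\text{per group}} \approx \frac{2\,(z_{1-\alpha/2} + z_{1-\beta})^2\,\bar{p}\bar{q}}{\Delta^2} = \frac{2\,(1.96 + 0.84)^2\,\bar{p}\bar{q}}{\Delta^2} \approx \frac{15.7\,\bar{p}\bar{q}}{\Delta^2} \approx \frac{16\,\bar{p}\,(1-\bar{p})}{\Delta^2}$$

Writing p for p̄ and d for Δ gives the shortcut 16p(1 - p)/(d x d), or 16p(100 - p)/(d x d) when p and d are in percent.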
ANY QUESTION?
Table: Multiplication factors for frequently used power and alpha*
Power   Factor for alpha = 5%   Factor for alpha = 1%
80%     16                      23
90%     21                      30
95%     26                      36
99%     37                      48
* Rounded to the nearest whole number.
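The factors in this table can be reproduced, as a sketch, from 2 x (z for alpha/2 + z for power)², assuming a two-sided test and 1:1 allocation. The function name `multiplication_factor` is hypothetical.

```python
from statistics import NormalDist

def multiplication_factor(alpha, power):
    """2 * (z_{1-alpha/2} + z_{power})^2 for a two-sided test, equal groups."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    return 2 * (z_alpha + z_beta) ** 2

for power in (0.80, 0.90, 0.95, 0.99):
    row = [round(multiplication_factor(alpha, power)) for alpha in (0.05, 0.01)]
    print(f"{power:.0%}", row)
```

Each printed row holds the rounded factors for alpha = 5% and alpha = 1%, in that order.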
Summary (what have we learnt)
• Concepts of hypothesis testing are
similar to those used in studies of
diagnostic tests
• Type I error/ Type II error
• Alpha/ beta
• Power
• ‘God is great’
Thank You
Sample size for a diagnostic test study
• How many cases (Disease +) do you need?
• What is your expected (for the test to be useful) sensitivity? Say 90% (= p)
• Within what range do you want to estimate this? Say +/- 5%
• d = difference between upper and lower limits of the range = 10%
• Now do the mental math.
Sample size for a diagnostic test study
• How many controls (Disease -) do you need?
• What is your expected (for the test to be useful) specificity? Say 80% (= p)
• Within what range do you want to estimate this? Say +/- 5%
• d = 10%
• Now do the mental math.
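A sketch of the mental math, assuming it refers to the same 16p(100 - p)/(d x d) shortcut with d taken as the full width of the range (the slides do not spell this out). The function name `n_for_accuracy_estimate` is hypothetical.

```python
from math import ceil

def n_for_accuracy_estimate(expected_pct, half_width_pct):
    """Subjects needed to estimate a sensitivity or specificity within +/- half_width."""
    d = 2 * half_width_pct  # difference between upper and lower limits
    return ceil(16 * expected_pct * (100 - expected_pct) / (d * d))

cases = n_for_accuracy_estimate(90, 5)     # sensitivity 90%, +/- 5%
controls = n_for_accuracy_estimate(80, 5)  # specificity 80%, +/- 5%
print(cases, controls)  # 144 256
```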
What kind of outcome measure
does your study have?
• Two category (dichotomous, binary)
• Numerical (continuous)
What kind of outcome measures
are there in these statements?
• Rosiglitazone reduces blood sugar level in diabetes.
• Clopidogrel reduces incidence of myocardial infarction.
• Carotid angioplasty prevents stroke.
• Nifedipine controls BP effectively in hypertensive emergencies.
• Statins control cholesterol level in high-risk individuals.
• Steroids induce remission in SLE.
• Ramipril improves LV ejection fraction.
RCT-Superiority hypothesis
Continuous outcome measure
• RCT : Two groups of equal size.
• Formula : 16x. How much effect are you interested in?
• The value of ‘x’ depends on size of effect.
Effect size   x    Sample
Small         25   16x25 = 400 per group
Moderate      4    16x4 = 64 per group
Large         2    16x2 = 32 per group
Effect Size
• How much is the difference with respect
to its variation
• Difference d
• Variation s.d.
• Effect size = d / s.d.
• Large 0.8, Moderate 0.5, Small 0.2
Example
• You are planning a study to improve
LVEF using stem cells in acute MI
• You expect moderate effect, hence the
sample size will be 16x4 = 64 per group
• What is the general formula?
• n per group = 16 s²/d²
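The general formula can be sketched as below. The function name `rule_of_16_continuous` is hypothetical, and the example values (standard deviation 10, difference 5, that is, an effect size of 0.5) are the LVEF figures used in this talk.

```python
from math import ceil

def rule_of_16_continuous(sd, diff):
    """Shortcut: n per group = 16 * s^2 / d^2 = 16 / (effect size)^2."""
    effect_size = diff / sd
    # round() guards against floating-point noise before rounding up
    return ceil(round(16 / effect_size ** 2, 9))

print(rule_of_16_continuous(sd=10, diff=5))   # moderate effect (0.5)
print(rule_of_16_continuous(sd=1, diff=0.2))  # small effect (0.2)
```

These calls give 64 and 400 per group, matching the moderate and small rows of the effect-size table. For an effect size of 0.8 the same expression gives 25; the table rounds x up to 2, which yields the more cautious 32.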
One MI Study using stem cells
Variable                   Placebo (N=92)   BMC (N=95)
Global LVEF (%)
  Baseline
    Mean                   46.9±10.4        48.3±9.2
    Median                 47.5             50.6
  4 Mo
    Mean                   49.9±13.0        53.8±10.2
    Median                 53.2             54.7
  Absolute difference
    Mean                   3.0±10           5.5±10
    Median                 4.0              5.0
Death, recurrence of MI, & any revascularization procedure: 40/103 vs. 23/101
Calculating sample size for LVEF change
• Need two things :
Standard deviation (s)
Difference expected (d)
s = 10%
d = 5%
Sample size = 16 s² / d²
= (16 x 10 x 10) / (5 x 5)
= 64 per group
Table: Study sizes necessary to achieve approximately the same power in a trial with two groups, of which one contains k times as many individuals as the other.
k     n1      n2      n1+n2
1     n       n       2n
2     0.75n   1.5n    2.25n
3     0.67n   2.0n    2.67n
4     0.62n   2.5n    3.12n
5     0.60n   3.0n    3.60n
10    0.55n   5.5n    6.05n
100   0.50n   50.5n   51.0n
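The rows of this table are consistent with keeping 1/n1 + 1/n2 equal to 2/n, which gives n1 = (k + 1)n / (2k) and n2 = k x n1. That relation is an assumption on my part (the slides do not state how the table was derived), and the function name `unequal_allocation` is hypothetical.

```python
def unequal_allocation(k):
    """Return (n1, n2, total) as multiples of n, the per-group size under 1:1."""
    n1 = (k + 1) / (2 * k)  # smaller group
    n2 = k * n1             # larger group, k times as many
    return n1, n2, n1 + n2

for k in (1, 2, 3, 4, 5, 10):
    n1, n2, total = unequal_allocation(k)
    print(f"k={k}: n1={n1:.2f}n  n2={n2:.2f}n  total={total:.2f}n")
```

The total grows quickly with k, which is why very unequal groups are an expensive way to keep the same power.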
Reference for the formula
• Lehr R. Sixteen S-squared over D-squared: a relation for crude sample size estimates. Statistics in Medicine 1992;11:1099-1102
Disclaimer
• The sample size formula discussed in the given time
works only for –
RCT with superiority hypothesis and dichotomous
outcomes
NOT for case control, cohort or cross-sectional
studies
NOT for non-inferiority or equivalence hypothesis
Thank You
RCT : Superiority Hypothesis
In %: 16p(100 - p) / (d x d)
In decimals: 16p(1 - p) / (d x d)
p = average of Pe & Pc
d = difference between Pe and Pc