Understanding P- values and CI 20Nov08 (1)


Understanding P-values and Confidence Intervals

Thomas B. Newman, MD, MPH

20 Nov 08


Announcements

Optional reading about P-values and Confidence Intervals on the website

Exam questions due Monday 11/24/08 5:00 PM

Next week (11/27) is Thanksgiving

Following week: Physicians and Probability (Chapter 12) and Course Review

Final exam to be distributed in SECTION 12/4 and posted on web

Exam due 12/11 8:45 AM

Key will be posted shortly thereafter


Overview

Introduction and justification

What P-values and Confidence Intervals don't mean

What they do mean: analogy between diagnostic tests and clinical research

Useful confidence interval tips

CI for negative studies; absolute vs. relative risk

Confidence intervals for small numerators


Why cover this material here?

P-values and confidence intervals are ubiquitous in clinical research

Widely misunderstood and mistaught

Pedagogical argument:

Is it important?

Can you handle it?


Example: Douglas Altman Definition of 95% Confidence Intervals*

"A strictly correct definition of a 95% CI is, somewhat opaquely, that 95% of such intervals will contain the true population value.

Little is lost by the less pure interpretation of the CI as the range of values within which we can be 95% sure that the population value lies."

*Quoted in: Guyatt, G., D. Rennie, et al. (2002). Users' guides to the medical literature: essentials of evidence-based clinical practice. Chicago, IL, AMA Press.


Understanding P-values and confidence intervals is important because

It explains things which otherwise do not make sense, e.g. the need to state hypotheses in advance and correction for multiple hypothesis testing

You will be using them all the time

You are future leaders in clinical research


You can handle it because

We have already covered the important concepts at length earlier in this course

Prior probability

Posterior probability

What you thought before + new information = what you think now

We will support you through the process


Review of traditional statistical significance testing

State null (Ho) and alternative (Ha) hypotheses

Choose α

Calculate value of test statistic from your data

Calculate P-value from test statistic

If P-value < α, reject Ho
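
The steps above can be sketched in a few lines of Python. This is a minimal illustration, not the course's own method: the function name `two_proportion_test` and the choice of a pooled two-proportion z-test are assumptions, and the counts are the bacteremia results (19/507 vs 8/448) from the amoxicillin example later in these slides.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_test(events_a, n_a, events_b, n_b, alpha=0.05):
    """Two-sided z-test of Ho: both groups share one underlying risk."""
    risk_a = events_a / n_a
    risk_b = events_b / n_b
    # Under Ho the groups have the same risk, so pool them to estimate it.
    pooled = (events_a + events_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (risk_a - risk_b) / se                    # test statistic
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))  # two-sided P-value
    return z, p_value, p_value < alpha            # last item: reject Ho?


z, p, reject = two_proportion_test(19, 507, 8, 448, alpha=0.05)
print(f"z = {z:.2f}, P = {p:.3f}, reject Ho: {reject}")
```

With these counts the P-value comes out at roughly 0.07, in line with the P = 0.07 on the results slide, so Ho is not rejected at α = 0.05.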


Problem:

Traditional statistical significance testing has led to widespread misinterpretation of P-values


What P-values don't mean

If the P-value is 0.05, there is a 95% probability that

The results did not occur by chance

The null hypothesis is false

There really is a difference between the groups


So if P = 0.05, what IS there a 95% probability of?


White board:

2x2 tables and false positive confusion

Analogy with diagnostic tests

(This is covered step-by-step in the course book.)


Analogy between diagnostic tests and research studies

Diagnostic Test Research Study

Absence of disease

Presence of disease

Severity of disease in the diseased group

Cutoff for distinguishing positive and negative results

Test result


Analogy between diagnostic tests and research studies

Diagnostic Test Research Study

Negative result (test within normal limits)

Positive result

Sensitivity

False positive rate (1 - specificity)

Prior probability of disease (of a given severity)

Posterior probability of disease, given test result
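
One way to make the analogy concrete is to compute a posterior probability for a research hypothesis exactly as one would for a disease. The sketch below assumes the usual mapping, which the surviving slide text does not spell out: the significance level α plays the role of the false positive rate and the study's power plays the role of sensitivity. The function name and the example priors are hypothetical.

```python
def posterior_probability_true(prior, power, alpha):
    """Probability the research hypothesis is true after a 'positive' study (P < alpha)."""
    true_positives = prior * power          # hypothesis true and the study detects it
    false_positives = (1 - prior) * alpha   # hypothesis false but P < alpha anyway
    return true_positives / (true_positives + false_positives)


# Same study characteristics, three different prior probabilities (illustrative values).
for prior in (0.5, 0.1, 0.01):
    post = posterior_probability_true(prior, power=0.8, alpha=0.05)
    print(f"prior = {prior:.2f} -> posterior = {post:.2f}")
```

The same "significant" result leaves very different posterior probabilities depending on the prior, just as a positive test means more in a high-prevalence population.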


Extending the Analogy

Intentionally ordered tests and hypotheses stated in advance

Multiple tests and multiple hypotheses

Laboratory error and bias

Alternative diagnoses and confounding


Bonferroni

Inequality: If we do k different tests, each with significance level α, the probability that one or more will be significant is less than or equal to k × α

Correction: If we test k different hypotheses and want our total Type 1 error rate to be no more than α, then we should reject Ho only if P < α/k
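
A short numerical check of the inequality and the correction. This is a sketch under one added assumption, independence of the k tests, which lets the probability of at least one significant result be written as 1 - (1 - α)^k; the function names are illustrative only.

```python
def familywise_error_independent(k, alpha):
    """P(at least one of k independent tests is significant) when every Ho is true."""
    return 1 - (1 - alpha) ** k


def bonferroni_threshold(k, alpha):
    """Per-test P-value cutoff that keeps the total Type 1 error rate at or below alpha."""
    return alpha / k


alpha = 0.05
for k in (1, 2, 5, 10, 20):
    actual = familywise_error_independent(k, alpha)
    bound = min(1.0, k * alpha)  # Bonferroni upper bound, capped at 1
    print(f"k = {k:2d}: P(any significant) = {actual:.3f} <= {bound:.2f}; "
          f"reject only if P < {bonferroni_threshold(k, alpha):.4f}")
```

The gap between the actual probability and the bound k × α widens as k grows, which is one way to see why the correction is conservative.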


Derivation

Let A and B be the events of making a Type 1 error for hypotheses A and B

P(A or B) = P(A) + P(B) - P(A & B)

Under Ho, P(A) = P(B) = α

So P(A or B) = α + α - P(A & B) = 2α - P(A & B).

Of course, it is possible to falsely reject 2 different null hypotheses, so P(A & B) > 0. Therefore, the probability of falsely rejecting either of the null hypotheses must be less than 2α.

Note that often A & B are not independent, in which case Bonferroni will be even more excessively conservative
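
The derivation can be checked numerically for the two extreme cases. The sketch below is illustrative: it assumes α = 0.05 and compares independent hypotheses, where P(A & B) = α², with perfectly dependent ones, where a Type 1 error on one always accompanies a Type 1 error on the other so that P(A & B) = α.

```python
def prob_either_type1(alpha, p_both):
    """P(A or B) = P(A) + P(B) - P(A & B), with P(A) = P(B) = alpha under Ho."""
    return alpha + alpha - p_both


alpha = 0.05
independent = prob_either_type1(alpha, p_both=alpha * alpha)  # P(A & B) = alpha^2
identical = prob_either_type1(alpha, p_both=alpha)            # B errs exactly when A does

print(f"Bonferroni bound 2*alpha       = {2 * alpha:.4f}")
print(f"independent hypotheses         = {independent:.4f}")
print(f"perfectly dependent hypotheses = {identical:.4f}")
```

With perfect dependence the true probability of any Type 1 error is just α, half the Bonferroni bound, which is the sense in which the correction becomes more conservative when hypotheses are not independent.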


Problems with Bonferroni correction

Overly conservative (especially when hypotheses are not independent)

Maintains specificity at the expense of sensitivity

Does not take prior probability into account

Not clear when to use it

BUT can be useful if results still significant



CONFIDENCE INTERVALS


What Confidence Intervals don't mean

There is a 95% chance that the true value is within the interval

If you conclude that the true value is within the interval you have a 95% chance of being right

The range of values within which we can be 95% sure that the population value lies


One source of confusion: Statistical "confidence"

(Some) statisticians say: "You can be 95% confident that the population value is in the interval."

This is NOT the same as "There is a 95% probability that the population value is in the interval."

"Confidence" is tautologously defined by statisticians as what you get from a confidence interval


Illustration

If a 95% CI has a 95% chance of containing the true value, then a 90% CI should have a 90% chance and a 40% CI should have a 40% chance.

Study: 4 deaths in 10 subjects in each group

RR = 1.0 (95% CI: 0.34 to 2.9)

40% CI: 0.75 to 1.33

Conclude from this study that there is a 60% chance that the true RR is < 0.75 or > 1.33?
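
The intervals on this slide can be reproduced with the standard normal approximation on the log risk ratio. The sketch below assumes that method (the slide does not say how its intervals were computed), and the function name `risk_ratio_ci` is illustrative.

```python
from math import exp, log, sqrt
from statistics import NormalDist


def risk_ratio_ci(events_1, n_1, events_0, n_0, level=0.95):
    """Risk ratio with a CI from the normal approximation on the log scale."""
    rr = (events_1 / n_1) / (events_0 / n_0)
    se_log_rr = sqrt(1 / events_1 - 1 / n_1 + 1 / events_0 - 1 / n_0)
    z = NormalDist().inv_cdf(0.5 + level / 2)  # 1.96 for 95%, about 0.52 for 40%
    return rr, exp(log(rr) - z * se_log_rr), exp(log(rr) + z * se_log_rr)


# 4 deaths in 10 subjects in each group, at three confidence levels.
for level in (0.95, 0.90, 0.40):
    rr, lo, hi = risk_ratio_ci(4, 10, 4, 10, level)
    print(f"{level:.0%} CI: RR = {rr:.1f} ({lo:.2f} to {hi:.2f})")
```

Lowering the confidence level only shrinks the multiplier z; nothing about the data changes, which is why reading the 40% interval as a 40% probability statement about the true RR is so implausible.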


Confidence Intervals apply to a Process

Consider a bag with 19 white and 1 pink grapefruit

The process of selecting a grapefruit at random has a 95% probability of yielding a white one

But once I've selected one, does it still have a 95% chance of being white?

You may have prior knowledge that changes the probability (e.g., pink grapefruit have thinner peel, are denser, etc.)


Confidence Intervals for negative studies: 5 levels of sophistication

Example 1: Oral amoxicillin to treat possible occult bacteremia in febrile children*

Randomized, double-blind trial

3-36 month old children with T ≥ 39 °C (N=955)

Treatment: Amox 125 mg tid (≤ 10 kg) or 250 mg tid (> 10 kg)

Outcome: major infectious morbidity

*Jaffe et al., New Engl J Med 1987;317:1175-80


Amoxicillin for possible occult bacteremia 2: Results

Bacteremia in 19/507 (3.7%) with amox, vs 8/448 (1.8%) with placebo (P = 0.07)

Major Infectious Morbidity 2/19 (10.5%) with amox vs 1/8 (12.5%) with placebo (P = 0.9)

Conclusion: Data do not support routine use of standard doses of amoxicillin


5 levels of sophistication

Level 1: P > 0.05 = treatment does not work

Level 2: Look at power for study. (Authors reported power = 0.24 for OR = 4. Therefore, study underpowered and negative study uninformative.)


5 levels of sophistication, cont'd

Level 3: Look at 95% CI!

Authors calculated OR = 1.2 (95% CI: 0.02 to 30.4)

This is based on 1/8 (12.5%) with placebo vs 2/19 (10.5%) with amox

(They put placebo on top)

(Silly to use OR)

With amox on top, RR = 0.84 (95% CI: 0.09 to 8.0)

This was level of TBN in letter to the editor (1987)


5 levels of sophistication, cont'd

Level 4: Make sure you do an intention to treat analysis!

It is not OK to restrict attention to bacteremic patients

So it should be 2/507 (0.39%) with amox vs 1/448 (0.22%) with placebo

RR = 1.8 (95% CI: 0.16 to 19.4)


Level 5: the clinically relevant quantity is the Absolute Risk Reduction (ARR)!

2/507 (0.39%) with amox vs 1/448 (0.22%) with placebo

ARR = -0.17% {amoxicillin worse}

95% CI (-0.9% {harm} to +0.5% {benefit})

Therefore, LOWER limit of 95% CI for benefit (i.e., best case) is NNT = 1/0.5% = 200

So this study suggests need to treat at least 200 children to prevent Major Infectious Morbidity in one
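
The arithmetic on this slide can be sketched as follows. The code assumes a Wald (normal approximation) interval for the risk difference and defines ARR as control risk minus treated risk, so negative values mean the treatment did worse; the function name `arr_ci` is illustrative.

```python
from math import sqrt
from statistics import NormalDist


def arr_ci(events_treated, n_treated, events_control, n_control, level=0.95):
    """Absolute risk reduction (control risk minus treated risk) with a Wald CI."""
    risk_t = events_treated / n_treated
    risk_c = events_control / n_control
    arr = risk_c - risk_t  # positive = benefit, negative = harm
    se = sqrt(risk_t * (1 - risk_t) / n_treated + risk_c * (1 - risk_c) / n_control)
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return arr, arr - z * se, arr + z * se


arr, lower, upper = arr_ci(2, 507, 1, 448)
print(f"ARR = {arr:.2%} (95% CI: {lower:.2%} to {upper:.2%})")
# The most favourable end of the interval gives the smallest plausible NNT.
if upper > 0:
    print(f"Best-case NNT = 1 / {upper:.4f} = {1 / upper:.0f}")
```

The unrounded upper limit is about 0.53%, so the computed best-case NNT lands a little under the slide's 200, which uses the rounded 0.5%.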


Stata output

. csi 2 1 505 447

                 |   Exposed   Unexposed  |      Total
-----------------+------------------------+------------
           Cases |         2           1  |          3
        Noncases |       505         447  |        952
-----------------+------------------------+------------
           Total |       507         448  |        955
                 |                        |
            Risk |  .0039448    .0022321  |   .0031414
                 |                        |
                 |      Point estimate    |    [95% Conf. Interval]
                 |------------------------+------------------------
 Risk difference |         .0017126       |    -.005278    .0087032
      Risk ratio |         1.767258       |    .1607894    19.42418
 Attr. frac. ex. |         .4341518       |   -5.219315    .9485178
 Attr. frac. pop |         .2894345       |
                 +-------------------------------------------------
                               chi2(1) =     0.22  Pr>chi2 = 0.6369


Example 2: Pyelonephritis and new renal scarring in the International Reflux Study in Children*

RCT of ureteral reimplantation vs prophylactic antibiotics for children with vesicoureteral reflux

Overall result: surgery group fewer episodes of pyelonephritis (8% vs 22%; NNT = 7; P chi2 = 0.0437


Conclusions

No evidence that new pyelonephritis causes scarring

Some evidence that it does not

P-values and confidence intervals are approximate, especially for small sample sizes

There is nothing magical about 0.05

Key concept: calculate 95% CI for negative studies

ARR for clinical questions (less generalizable)

RR for etiologic questions


Confidence intervals for small numerators

Observed numerator    Approximate numerator for upper limit of 95% CI
0                     3
1                     5
2                     7
3                     9
4                     10
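
For comparison with the approximate values in the table, exact upper limits can be computed by treating the numerator as a Poisson count. This is an assumption made for illustration: the slide does not say how its values were derived or whether they correspond to a 2.5% or a 5% upper tail, so the sketch prints both. The function names are hypothetical.

```python
from math import exp, factorial


def poisson_cdf(k, mean):
    """P(X <= k) for a Poisson variable with the given mean."""
    return sum(exp(-mean) * mean ** i / factorial(i) for i in range(k + 1))


def poisson_upper_limit(observed, tail=0.025):
    """Largest mean still compatible with seeing `observed` or fewer events.

    Solves P(X <= observed | mean) = tail by bisection.
    """
    low, high = 0.0, observed + 20.0  # the CDF at `high` is far below any usual tail
    for _ in range(100):
        mid = (low + high) / 2
        if poisson_cdf(observed, mid) > tail:
            low = mid   # this mean is still plausible; the limit lies higher
        else:
            high = mid
    return (low + high) / 2


for observed in range(5):
    two_sided = poisson_upper_limit(observed, tail=0.025)
    one_sided = poisson_upper_limit(observed, tail=0.05)
    print(f"{observed}: {two_sided:.2f} (2.5% tail), {one_sided:.2f} (5% tail)")
```

The table's rounded entries fall near or between the two sets of exact limits. To get the upper limit of a proportion, divide the numerator limit by the denominator: with 0 events in 100 subjects, for example, the 5% tail limit is about 3/100.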


When P-values and Confidence Intervals Disagree

Usually P < 0.05 means 95% CI excludes null value.

But both 95% CI and P-values are based on approximations, so this may not be the case

Illustrated by IRSC slide above

If you want 95% CI and P-values to agree, use test-based confidence intervals (see next slide)


Alternative Stata output: Test-based CI

. csi 2 18 28 58, tb

                 |   Exposed   Unexposed  |      Total
-----------------+------------------------+------------
           Cases |         2          18  |         20
        Noncases |        28          58  |         86
-----------------+------------------------+------------
           Total |        30          76  |        106
                 |                        |
            Risk |  .0666667    .2368421  |   .1886792
                 |                        |
                 |      Point estimate    |    [95% Conf. Interval]
                 |------------------------+------------------------
 Risk difference |        -.1701754       |   -.3363063   -.0040446  (tb)
      Risk ratio |         .2814815       |    .0816554    .9703199  (tb)
 Prev. frac. ex. |         .7185185       |    .0296801    .9183446  (tb)
 Prev. frac. pop |         .2033543       |
                 +-------------------------------------------------
                               chi2(1) =     4.07  Pr>chi2 = 0.0437
