Comparing Two Proportions ( p 1 vs. p 2 )


Page 1: Comparing Two Proportions ( p 1  vs. p 2 )

Comparing Two Proportions (p1 vs. p2)

Page 2: Comparing Two Proportions ( p 1  vs. p 2 )

Inferential Methods

• Large independent samples
  1. z-test for comparing p1 vs. p2
  2. CI for (p1 – p2)
  3. Effect size (see below)

• Quantifying Risk (or Benefit)
  1. Relative Risk (RR) ~ tests and CI
  2. Odds Ratio (OR) ~ tests and CI
  3. Number Needed to Treat (NNT) & Number Needed to Harm (NNH)

• Small independent samples
  - Fisher’s Exact Test (use software)

Page 3: Comparing Two Proportions ( p 1  vs. p 2 )

Inferential Methods (cont’d)

• Small dependent samples
  - McNemar’s test (binomial)

• Large dependent samples
  - McNemar’s test (chi-square)

Page 4: Comparing Two Proportions ( p 1  vs. p 2 )

Example 1: Nutrition Education for Pregnant Teens

• Research Question:

“Do the pregnant teens who receive nutrition education produce a lower proportion of low birth weight babies than do pregnant teens who do not receive such instruction?”

• Study:

To conduct the study, 314 pregnant teens were randomly assigned to receive the nutrition education and 316 pregnant teens were assigned to the non-instruction group.

Page 5: Comparing Two Proportions ( p 1  vs. p 2 )

Experimental Comparative Study

Population (e.g. pregnant teens): randomly assign to two groups.

Population 1 (proportion p1): sample size = n1; calculate the sample proportion.

Population 2 (proportion p2): sample size = n2; calculate the sample proportion.

Compare p1 vs. p2.

To make inferences use: Hypothesis test, CI for difference in proportions (and possibly RR, OR, & NNT/NNH).

Page 6: Comparing Two Proportions ( p 1  vs. p 2 )

Test Statistic for Large Independent Samples

For testing equality of the two proportions only

H0: (p1 – p2) = 0

HA: (p1 – p2) > 0 (upper-tail)
    (p1 – p2) < 0 (lower-tail)
    (p1 – p2) ≠ 0 (two-tail, use CI approach)

Provided n1p1 > 10 & n1q1 > 10 and n2p2 > 10 & n2q2 > 10
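For reference, a common form of the large-sample statistic for this equality null pools the two samples to estimate the shared proportion. The notation here ($x_i$ for the number of "successes" in sample $i$, $\hat{p}$ for the pooled proportion) is an assumption and is not taken from the slide:

```latex
\[
\hat{p}_i = \frac{x_i}{n_i}, \qquad
\hat{p} = \frac{x_1 + x_2}{n_1 + n_2}, \qquad
z = \frac{\hat{p}_1 - \hat{p}_2}
         {\sqrt{\hat{p}\,(1-\hat{p})\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}
\]
```

Under H0 and the sample-size conditions above, z is treated as approximately standard normal.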

Page 7: Comparing Two Proportions ( p 1  vs. p 2 )

Test Statistic for Large Independent Samples

For testing to see if the difference is at least Δ

H0: (p1 – p2) = Δ

HA: (p1 – p2) > Δ (upper-tail)
    (p1 – p2) < Δ (lower-tail)

Provided n1p1 > 10 & n1q1 > 10 and n2p2 > 10 & n2q2 > 10

Most important case
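A minimal sketch of a test against a hypothesized difference. It assumes the usual textbook choice of an unpooled standard error when the null difference is not zero; the function name, the `delta` parameter, and the counts in the usage lines are hypothetical and not from the slides.

```python
import math

def normal_cdf(z):
    """Standard normal CDF computed from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def z_test_difference(x1, n1, x2, n2, delta=0.0):
    """z statistic for H0: p1 - p2 = delta, using the unpooled standard error.

    Returns (z, lower-tail p-value, upper-tail p-value).
    """
    p1, p2 = x1 / n1, x2 / n2
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    z = (p1 - p2 - delta) / se
    lower = normal_cdf(z)
    return z, lower, 1.0 - lower

# Hypothetical counts, for illustration only (not from the slides).
z, p_lower, p_upper = z_test_difference(30, 200, 50, 200, delta=-0.05)
print(f"z = {z:.3f}, lower-tail p = {p_lower:.4f}, upper-tail p = {p_upper:.4f}")
```

Which tail to report depends on the direction of the alternative hypothesis.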

Page 8: Comparing Two Proportions ( p 1  vs. p 2 )

Confidence Interval for (p1 – p2) for Large Independent Samples

Provided n1p1 > 10 & n1q1 > 10 and n2p2 > 10 & n2q2 > 10

The confidence interval for (p1 – p2) has a general form:

$$(\hat{p}_1 - \hat{p}_2) \pm (z\text{-value}) \cdot SE(\hat{p}_1 - \hat{p}_2), \qquad SE(\hat{p}_1 - \hat{p}_2) = \sqrt{\frac{\hat{p}_1\hat{q}_1}{n_1} + \frac{\hat{p}_2\hat{q}_2}{n_2}}$$

z-values

90% z = 1.645

95% z = 1.960

99% z = 2.576

Page 9: Comparing Two Proportions ( p 1  vs. p 2 )

Effect Size for Large Independent Samples

There are three main ways to quantify effect size in situations where we are comparing proportions across two populations or treatment groups.

1) Difference in sample proportions = $\hat{p}_1 - \hat{p}_2$

(this might also be referred to as the risk difference)

2) Relative Risk or Risk Ratio (RR) = $\hat{p}_1 / \hat{p}_2$

3) Odds Ratio (OR) (see Probability ppt)

Page 10: Comparing Two Proportions ( p 1  vs. p 2 )

Effect Size for Large Independent Samples

Which measure you use depends on the context of the experiment/study and, more importantly, how the data were collected.

In observational studies (e.g. case-control studies) the odds ratio (OR) is primarily used because the outcome of interest is NOT random. Therefore we cannot talk about the proportion of people with the disease, we can only talk about the proportion with the risk factor.

(See example later in this PowerPoint)

Page 11: Comparing Two Proportions ( p 1  vs. p 2 )

Example 1: Nutrition Education for Pregnant Teens

Here we are interested in determining if pregnant teens who receive the nutrition education have a lower prevalence of low birth weight infants, but we are not necessarily looking for a certain size (Δ) for that difference.

Let,

pE = proportion of babies with low birth weight born to teens who underwent nutrition education.

pN = proportion of babies with low birth weight born to teens who did not receive nutrition education.

Page 12: Comparing Two Proportions ( p 1  vs. p 2 )

Example 1: Nutrition Education for Pregnant Teens

STEP 1) State Hypotheses

H0: pE = pN or equivalently (pE – pN) = 0

HA: pE < pN or equivalently (pE – pN) < 0

STEP 2) Determine Test Criteria

a) Choose α = .05 (we could use something else)

b) From the CDC website we find that around 9% of infants born in the U.S. are classified as having low birth weights. For teen mothers that percentage is probably higher, but smaller p’s require larger samples, thus we will use p = .09 to check sample size considerations. Here n1 = 314 and n2 = 316 so …

n1p1 = 28, n1q1 = 286, n2p2 = 28, n2q2 = 288 (i.e. samples are LARGE)

THUS WE USE LARGE SAMPLE TEST FOR COMPARING POPULATION PROPORTIONS ASSUMING EQUALITY UNDER THE NULL, i.e. Δ = 0.
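The sample-size check in step b) can be sketched as follows. The helper name `large_sample_ok` and the `threshold` parameter are hypothetical; the planning value p = .09 and the group sizes come from the slide.

```python
def large_sample_ok(n, p, threshold=10):
    """Return (n*p, n*q, ok), where ok means both products exceed the threshold."""
    np_, nq_ = n * p, n * (1 - p)
    return np_, nq_, (np_ > threshold and nq_ > threshold)

p_planning = 0.09  # planning value from the slide (CDC figure for U.S. births)
for label, n in (("nutrition education", 314), ("no instruction", 316)):
    np_, nq_, ok = large_sample_ok(n, p_planning)
    print(f"{label}: n*p = {np_:.1f}, n*q = {nq_:.1f}, large enough: {ok}")
```

Both groups clear the threshold comfortably, which is what justifies the large-sample z-test here.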

Page 13: Comparing Two Proportions ( p 1  vs. p 2 )

Example 1: Nutrition Education for Pregnant Teens

STEP 3) Collect Data and Compute Test Statistic

In the study, 23 of the 314 teen mothers receiving nutrition education had low birth weight babies compared to 39 of the 316 mothers in the non-instruction group.

Using these results we can calculate all the necessary proportions to use in the z-test statistic shown below.

Sample Proportion Calculations

Page 14: Comparing Two Proportions ( p 1  vs. p 2 )

Example 1: Nutrition Education for Pregnant Teens

STEP 3) Collect Data and Compute Test Statistic

In the study, 23 of the 314 teen mothers receiving nutrition education had low birth weight babies compared to 39 of the 316 mothers in the non-instruction group.

Finally, calculating the test statistic, we see that the difference in the sample proportions is over 2 SEs below 0.

Test Statistic Calculations
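The calculation images for this step did not survive in the transcript, so here is a sketch of the arithmetic using the counts stated above. It assumes the pooled standard error (the usual choice when the null is equality); the variable names are illustrative.

```python
import math

x_e, n_e = 23, 314   # low birth weight babies / mothers, nutrition education group
x_n, n_n = 39, 316   # low birth weight babies / mothers, non-instruction group

p_e = x_e / n_e                       # sample proportion, education group
p_n = x_n / n_n                       # sample proportion, non-instruction group
p_pool = (x_e + x_n) / (n_e + n_n)    # pooled proportion under H0: pE = pN

se_pooled = math.sqrt(p_pool * (1 - p_pool) * (1 / n_e + 1 / n_n))
z = (p_e - p_n) / se_pooled

print(f"p_E = {p_e:.4f}, p_N = {p_n:.4f}, pooled = {p_pool:.4f}")
print(f"SE = {se_pooled:.4f}, z = {z:.2f}")
```

The resulting z agrees with the value of about -2.11 used in the next step.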

Page 15: Comparing Two Proportions ( p 1  vs. p 2 )

Example 1: Nutrition Education for Pregnant Teens

STEPS 4 & 5) Compute p-value and make decision

Our observed test statistic value is z = -2.11. To find the p-value we use the fact that our test statistic has a standard normal distribution.

From standard normal table or computer

P(Z < -2.11) = .0172

The probability that chance variation alone would produce an observed proportion for the education group this small or smaller when compared to the non-instruction group is 1.72%. Thus we have evidence to suggest that the proportion of low birth weight babies born to teen mothers in the education group is smaller than that for the non-instruction group (p = .0172).
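As an alternative to a printed table, the lower-tail area can be computed from the error function in the standard library. This is a sketch; `normal_cdf` is an illustrative helper name.

```python
import math

def normal_cdf(z):
    """Standard normal CDF: P(Z < z)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

z_obs = -2.11                    # observed test statistic, rounded as on the slide
p_value = normal_cdf(z_obs)      # lower-tail area because HA is pE < pN
print(f"P(Z < {z_obs}) = {p_value:.4f}")
```

The result is roughly .017, in line with the slide; small differences in the last digit depend on how many decimals of z are carried.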

Page 16: Comparing Two Proportions ( p 1  vs. p 2 )

Example 1: Nutrition Education for Pregnant Teens

STEP 6) Quantify Significant Effects

• 95% CI for Difference in Proportions

Necessary Computations
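A sketch of the interval computation, assuming the plain large-sample (Wald) interval with the unpooled standard error; the function name is hypothetical.

```python
import math

def wald_ci_difference(x1, n1, x2, n2, z_value=1.960):
    """Large-sample CI for p1 - p2 using the unpooled standard error."""
    p1, p2 = x1 / n1, x2 / n2
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    diff = p1 - p2
    return diff - z_value * se, diff + z_value * se

# Education group (23 of 314) minus non-instruction group (39 of 316).
lower, upper = wald_ci_difference(23, 314, 39, 316)
print(f"95% CI for (pE - pN): ({lower:.4f}, {upper:.4f})")
```

The upper limit matches the reported interval to rounding. The lower limit from this plain Wald computation comes out near -.0965, slightly different from the -.0985 reported in the slides; the reason for that gap is not clear from the transcript.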

Page 17: Comparing Two Proportions ( p 1  vs. p 2 )

Example 1: Nutrition Education for Pregnant Teens

• 95% CI for (pE – pN) = (-.0985, -.0039)

or (-9.85%, -.39%)

One potential interpretation of CI:

We estimate that the percentage of low birth weight babies born to teen mothers who participate in a nutrition education program is between .39 and 9.85 percentage points smaller than that for teen mothers who are not given this instruction.

Page 18: Comparing Two Proportions ( p 1  vs. p 2 )

Example 1: Nutrition Education for Pregnant Teens

• 95% CI for (pE – pN) = (-.0985, -.0039)

or (-9.85%, -.39%)

Another potential interpretation of CI:

For pregnant teens participating in the nutrition education program we estimate that the prevalence of low birth weight is between .39 and 9.85 percentage points smaller than that for teen mothers receiving no such education.

Page 19: Comparing Two Proportions ( p 1  vs. p 2 )

Relative Risk or Risk Ratio

• Recall from the probability presentation that risk ratio or relative risk is defined as:

$$RR = \frac{P(\text{disease} \mid \text{risk present})}{P(\text{disease} \mid \text{risk absent})}$$

• We can use this in the study of potentially beneficial treatments by computing it as follows:

$$RR = \frac{P(\text{disease} \mid \text{control})}{P(\text{disease} \mid \text{treatment})}$$

Page 20: Comparing Two Proportions ( p 1  vs. p 2 )

Example 1: Nutrition Education for Pregnant Teens

• Using Relative Risk or Risk Ratio (RR)

We have, $\hat{p}_N = 39/316 = .1234$ and $\hat{p}_E = 23/314 = .0732$.

The relative risk (RR) of low birth weight associated with being in the control (non-instruction) group is given by:

Relative Risk or Risk Ratio (RR) = .1234/.0732 = 1.686

i.e., teen mothers not participating in the nutrition education program have 1.686 times the chance of having a baby with a low birth weight. Another way we state it is that their risk of having a low birth weight baby is 68.6% higher.

Page 21: Comparing Two Proportions ( p 1  vs. p 2 )

Example 1: Nutrition Education for Pregnant Teens

• Using Relative Risk or Risk Ratio (RR)

We have, $\hat{p}_N = 39/316 = .1234$ and $\hat{p}_E = 23/314 = .0732$.

Another way to look at this is in terms of the benefit associated with being in the education vs. the non-instruction (control) group. This is achieved by reciprocating the RR.

“Risk Reduction” = .0732/.1234 = .5932, which constitutes a roughly 41% reduction in risk of having a low birth weight baby associated with receiving the nutrition education.
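Both directions of the risk ratio can be computed from the raw counts, as in this sketch. The function name is hypothetical; because it uses unrounded proportions, the third decimal differs slightly from the hand calculation with .1234 and .0732.

```python
def relative_risk(x_group, n_group, x_reference, n_reference):
    """Ratio of two sample risks: (x_group/n_group) / (x_reference/n_reference)."""
    return (x_group / n_group) / (x_reference / n_reference)

# Risk of low birth weight: no instruction (39 of 316) vs. education (23 of 314).
rr_control = relative_risk(39, 316, 23, 314)
rr_education = 1 / rr_control          # reciprocal: education relative to control

print(f"RR (no instruction vs. education) = {rr_control:.3f}")
print(f"RR (education vs. no instruction) = {rr_education:.3f}")
print(f"relative reduction in risk        = {1 - rr_education:.1%}")
```

Reading the reciprocal as "one minus the ratio" is what turns .593 into the roughly 41% reduction quoted above.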

Page 22: Comparing Two Proportions ( p 1  vs. p 2 )

Example 1: Nutrition Education for Pregnant Teens

• But wait… there is more!

For situations where we are looking at a potentially beneficial treatment we can report the NNT.

• NNT (Number Needed to Treat): the number of patients who need to be treated to prevent 1 adverse outcome. To find the NNT we simply compute:

$$NNT = \frac{1}{|\hat{p}_1 - \hat{p}_2|}$$

Page 23: Comparing Two Proportions ( p 1  vs. p 2 )

Example 1: Nutrition Education for Pregnant Teens

• NNT (Number Needed to Treat): the number of patients who need to be treated to prevent 1 adverse outcome.

Here NNT = 1/|.0732 – .1234| = 1/.0502 ≈ 20.

• Thus we estimate that we would need to have 20 teen mothers participate in the nutrition education program to see 1 fewer baby born with a low birth weight amongst teen moms.

• Note: If we reciprocate the confidence limits for a CI for the “risk difference” we obtain a 95% CI for the NNT. Here we would have,

(1/.0985 , 1/.0039) = (10.15 , 256.41)

So we estimate, with 95% confidence, that we would need to have between 10 and 256 teen mothers participate in the program to see 1 fewer low birth weight baby.
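The NNT arithmetic above can be sketched as follows, using the rounded proportions and the confidence limits exactly as reported on the slides. The helper name is hypothetical.

```python
def nnt_from_risk_difference(risk_difference):
    """Number needed to treat: reciprocal of the absolute risk difference."""
    return 1.0 / abs(risk_difference)

p_e, p_n = 0.0732, 0.1234            # rounded sample proportions from the slides
nnt = nnt_from_risk_difference(p_e - p_n)

# 95% CI limits for (pE - pN) as reported on the slide.
ci_low, ci_high = -0.0985, -0.0039
# Reciprocating the limits is only meaningful when the CI excludes 0, as it does here.
nnt_ci = tuple(sorted(nnt_from_risk_difference(limit) for limit in (ci_low, ci_high)))

print(f"NNT = {nnt:.1f} (about {round(nnt)})")
print(f"95% CI for NNT = ({nnt_ci[0]:.2f}, {nnt_ci[1]:.2f})")
```

The very wide upper limit reflects how close the risk-difference interval comes to zero.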

Page 24: Comparing Two Proportions ( p 1  vs. p 2 )

Example 1: Nutrition Education for Pregnant Teens

• There is no reason to use the odds ratio (OR) here because the “disease” outcome (i.e. low birth weight) is random. We can still calculate it however.

• Recall from probability presentation

$$OR = \frac{P(\text{disease} \mid \text{risk present}) \,/\, [1 - P(\text{disease} \mid \text{risk present})]}{P(\text{disease} \mid \text{risk absent}) \,/\, [1 - P(\text{disease} \mid \text{risk absent})]}$$

Page 25: Comparing Two Proportions ( p 1  vs. p 2 )

Example 1: Nutrition Education for Pregnant Teens

Here,

$$OR = \frac{P(\text{disease} \mid \text{risk present}) \,/\, [1 - P(\text{disease} \mid \text{risk present})]}{P(\text{disease} \mid \text{risk absent}) \,/\, [1 - P(\text{disease} \mid \text{risk absent})]} = \frac{.1234 / .8766}{.0732 / .9268} = 1.78$$

So teen mothers who received no nutrition education during pregnancy have 1.78 times higher odds for having a baby with low birth weight when compared to teen mothers who did receive nutrition instruction.

Page 26: Comparing Two Proportions ( p 1  vs. p 2 )

Example 1: Nutrition Education for Pregnant Teens

We can display data from this study in a 2 X 2 contingency table format.

Treatment                 Low Birthweight   Normal Birthweight   Row Totals
No Instruction (N)        39                277                  nN = 316
Nutrition Education (E)   23                291                  nE = 314
Column Totals             62                568                  630

Study Results:

In the study, 23 of the 314 teen mothers receiving nutrition education had low birth weight babies compared to 39 of the 316 mothers in the non-instruction group.

Page 27: Comparing Two Proportions ( p 1  vs. p 2 )

Example 1: Nutrition Education for Pregnant Teens

Recall from the probability presentation that the OR has an easy formula when our data are displayed in a 2 X 2 table.

Treatment                 Low Birthweight   Normal Birthweight   Row Totals
No Instruction (N)        a = 39            b = 277              nN = 316
Nutrition Education (E)   c = 23            d = 291              nE = 314
Column Totals             62                568                  630

Easier Formula!

OR = ad/bc = (39)(291)/[(277)(23)] = 1.78
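A short sketch checking that the cell-count shortcut agrees with the definition in terms of conditional odds. The variable names are illustrative.

```python
# Cell counts from the 2 X 2 table: rows are treatment groups, columns are outcomes.
a, b = 39, 277    # no instruction: low birth weight, normal birth weight
c, d = 23, 291    # nutrition education: low birth weight, normal birth weight

# Definition: odds of low birth weight within each row, then their ratio.
risk_n = a / (a + b)
risk_e = c / (c + d)
or_from_odds = (risk_n / (1 - risk_n)) / (risk_e / (1 - risk_e))

# Shortcut: cross-product ratio.
or_from_cells = (a * d) / (b * c)

print(f"OR from odds  = {or_from_odds:.2f}")
print(f"OR from ad/bc = {or_from_cells:.2f}")
```

The two routes agree because each row's odds reduce to a ratio of cell counts (a/b and c/d).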

Page 28: Comparing Two Proportions ( p 1  vs. p 2 )

Example 1: Nutrition Education for Pregnant Teens

Whew! Let’s stop and summarize our findings to this point in a table.

Perhaps we should have confidence intervals for the RR and OR as well!

Page 29: Comparing Two Proportions ( p 1  vs. p 2 )

Confidence Intervals for Relative Risk (RR) and Odds Ratio (OR)

Before we look at the computational procedures for finding these CI’s we must note that the 2 X 2 table for our data MUST BE in the format below:

                             Outcome Present   Outcome Absent   Row Totals
Risk Present (or Control)    a                 b                n1 = a + b
Risk Absent (or Treatment)   c                 d                n2 = c + d

The key is identifying which cell is “a” and that risk or treatment is always the row variable!

Page 30: Comparing Two Proportions ( p 1  vs. p 2 )

Confidence Intervals for RR

1. Take natural log of estimated RR, ln(RR)

2. Compute standard error of ln(RR)

$$SE(\ln(RR)) = \sqrt{\frac{b}{a\,n_1} + \frac{d}{c\,n_2}}$$

3. Find CI for ln(RR)

$$\ln(RR) \pm (z\text{-value}) \cdot SE(\ln(RR)) = (L, U)$$

Page 31: Comparing Two Proportions ( p 1  vs. p 2 )

Confidence Intervals for RR

4. Find CI for RR by taking the antilog (e^x) of the endpoints of the CI for RR in log scale:

LCL for RR = e^L

UCL for RR = e^U

i.e.,

CI for RR = (e^L , e^U)
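Steps 1 to 4 for the RR can be sketched as one function. The function name is hypothetical, and the cell labels follow the a, b, c, d layout of the 2 X 2 table (row totals n1 = a + b and n2 = c + d).

```python
import math

def rr_confidence_interval(a, b, c, d, z_value=1.960):
    """CI for the relative risk, built on the log scale and then exponentiated."""
    n1, n2 = a + b, c + d
    rr = (a / n1) / (c / n2)
    se_log = math.sqrt(b / (a * n1) + d / (c * n2))     # SE of ln(RR)
    log_lower = math.log(rr) - z_value * se_log         # L
    log_upper = math.log(rr) + z_value * se_log         # U
    return rr, math.exp(log_lower), math.exp(log_upper)

# Nutrition education data: a=39, b=277 (no instruction); c=23, d=291 (education).
rr, lcl, ucl = rr_confidence_interval(39, 277, 23, 291)
print(f"RR = {rr:.3f}, 95% CI = ({lcl:.3f}, {ucl:.3f})")
```

Working on the log scale keeps the interval above zero and makes it asymmetric around the estimate, which suits a ratio. Because no intermediate rounding is done, the limits can differ from a hand calculation in the second or third decimal.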

Page 32: Comparing Two Proportions ( p 1  vs. p 2 )

Confidence Intervals for OR

1. Take natural log of estimated OR, ln(OR)

2. Compute standard error of ln(OR)

$$SE(\ln(OR)) = \sqrt{\frac{1}{a} + \frac{1}{b} + \frac{1}{c} + \frac{1}{d}}$$

3. Find CI for ln(OR)

$$\ln(OR) \pm (z\text{-value}) \cdot SE(\ln(OR)) = (L, U)$$

Page 33: Comparing Two Proportions ( p 1  vs. p 2 )

Confidence Intervals for OR

4. Find CI for OR by taking the antilog (e^x) of the endpoints of the CI for ln(OR):

LCL for OR = e^L

UCL for OR = e^U

i.e.,

CI for OR = (e^L , e^U)
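The same four steps for the OR, as a sketch with a hypothetical function name and the same a, b, c, d cell layout.

```python
import math

def or_confidence_interval(a, b, c, d, z_value=1.960):
    """CI for the odds ratio, built on the log scale and then exponentiated."""
    odds_ratio = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)   # SE of ln(OR)
    half_width = z_value * se_log
    log_or = math.log(odds_ratio)
    return odds_ratio, se_log, math.exp(log_or - half_width), math.exp(log_or + half_width)

# Nutrition education data: a=39, b=277 (no instruction); c=23, d=291 (education).
odds_ratio, se_log, lcl, ucl = or_confidence_interval(39, 277, 23, 291)
print(f"OR = {odds_ratio:.3f}, SE(ln OR) = {se_log:.3f}, 95% CI = ({lcl:.3f}, {ucl:.3f})")
```

Computed directly from the four cell counts, the standard error comes out near .276, a little below the .280 used in the worked example in these slides, so the limits differ slightly in the second decimal place.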

Page 34: Comparing Two Proportions ( p 1  vs. p 2 )

Hypothesis Testing for RR and OR

• In general we are interested in identifying situations where the RR/OR are greater than 1 (increased risk) or less than 1 (decreased risk).

• For either, the null hypothesis says that the RR or OR is equal to 1, or equivalently that ln(RR) or ln(OR) is 0, because ln(1) = 0.

H0: RR = 1 is equivalent to H0: ln(RR) = 0

H0: OR = 1 is equivalent to H0: ln(OR) = 0

Page 35: Comparing Two Proportions ( p 1  vs. p 2 )

Hypothesis Testing for RR and OR

• The test statistic in either case, provided our sample sizes are “large”, is

$$z = \frac{\ln(\text{estimate})}{SE(\ln(\text{estimate}))} \sim N(0,1)$$

• So we use the standard normal distribution to find the p-value associated with the test statistic.

• A better approach in practice is to simply look at whether or not the CI for RR/OR contains 1; if it does not contain 1 we reject H0.
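A sketch of this test for a ratio estimate. The function name is hypothetical; the inputs in the usage lines are the rounded estimates and standard errors from the nutrition education example in these slides, and the upper tail is used because the alternative there is a ratio greater than 1.

```python
import math

def log_ratio_z_test(estimate, se_log):
    """z statistic for H0: ratio = 1, with the upper-tail p-value P(Z > z)."""
    z = math.log(estimate) / se_log
    p_upper = 0.5 * math.erfc(z / math.sqrt(2.0))
    return z, p_upper

z_rr, p_rr = log_ratio_z_test(1.68, 0.251)   # relative risk and SE(ln(RR))
z_or, p_or = log_ratio_z_test(1.78, 0.280)   # odds ratio and SE(ln(OR))

print(f"RR: z = {z_rr:.3f}, P(Z > z) = {p_rr:.4f}")
print(f"OR: z = {z_or:.3f}, P(Z > z) = {p_or:.4f}")
```

For a lower-tail alternative (ratio less than 1) the p-value would be one minus the value returned here.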

Page 36: Comparing Two Proportions ( p 1  vs. p 2 )

Example 1: Nutrition Education for Pregnant Teens

Treatment                 Low Birthweight   Normal Birthweight   Row Totals
No Instruction (N)        a = 39            b = 277              nN = 316
Nutrition Education (E)   c = 23            d = 291              nE = 314
Column Totals             62                568                  630

Find a 95% CI for RR

1) RR = .1234/.0732 = 1.68

2) ln(RR) = .519

3) Find SE(ln(RR)) =

$$\sqrt{\frac{277}{39(316)} + \frac{291}{23(314)}} = .251$$

4) Find confidence limits for ln(RR)

.519 ± (1.96)(.251) = (.027, 1.011)

5) Take antilog (e^x) of endpoints

(e^.027, e^1.011) = (1.027, 2.748)

The CI contains only values above 1, thus we conclude the lack of nutritional education is associated with increased risk and nutritional education is associated with decreased risk of low birth weight.

Page 37: Comparing Two Proportions ( p 1  vs. p 2 )

Example 1: Nutrition Education for Pregnant Teens

Treatment                 Low Birthweight   Normal Birthweight   Row Totals
No Instruction (N)        a = 39            b = 277              nN = 316
Nutrition Education (E)   c = 23            d = 291              nE = 314
Column Totals             62                568                  630

Find a 95% CI for OR

1) OR = (39)(291)/[(277)(23)] = 1.78

2) ln(OR) = .577

3) Find SE(ln(OR)) =

$$\sqrt{\frac{1}{39} + \frac{1}{277} + \frac{1}{23} + \frac{1}{291}} = .280$$

4) Find confidence limits for ln(OR)

.577 ± (1.96)(.280) = (.029, 1.125)

5) Take antilog (e^x) of endpoints

(e^.029, e^1.125) = (1.029, 3.082)

Again we see the CI contains only values above 1, thus we conclude the lack of nutritional education is associated with increased risk and nutritional education is associated with decreased risk of low birth weight.

Page 38: Comparing Two Proportions ( p 1  vs. p 2 )

Example 1: Nutrition Education for Pregnant Teens

Hypothesis Tests for RR and OR (even though the CI’s were enough!)

Test Statistic for RR

$$z = \frac{.519}{.251} = 2.068, \qquad P(Z > 2.068) = .0193$$

Test Statistic for OR

$$z = \frac{.577}{.280} = 2.061, \qquad P(Z > 2.061) = .0197$$

Both p-values are less than α, therefore we reject the null and conclude there is increased risk associated with being a control and hence decreased risk of low birth weight associated with the nutritional education program for pregnant teens. This agrees with our conclusion from the CI’s.

Page 39: Comparing Two Proportions ( p 1  vs. p 2 )

Example 1: Nutrition Education for Pregnant Teens

That’s it! Let’s summarize our final findings in a table.

All of this is much easier to do using statistical software, e.g. JMP.

Page 40: Comparing Two Proportions ( p 1  vs. p 2 )

Example 1: Nutrition Education for Pregnant Teens

Enter data table as shown below

Page 41: Comparing Two Proportions ( p 1  vs. p 2 )

Example 1: Nutrition Education for Pregnant Teens

You can see the three options related to what we just discussed below:

Check them all

Page 42: Comparing Two Proportions ( p 1  vs. p 2 )

Example 1: Nutrition Education for Pregnant Teens

All the CI’s we calculated “by hand” are shown below.

Page 43: Comparing Two Proportions ( p 1  vs. p 2 )

Fisher’s Exact Test

• When sample sizes are “small”, or whenever it is available, one should use Fisher’s Exact Test for comparing p1 vs. p2.

• The computations are tedious and finding a p-value requires special tables, but it is implemented in many statistical software packages.

• By default JMP will always calculate p-values for Fisher’s Exact Test when 2 X 2 contingency tables are analyzed.

Page 44: Comparing Two Proportions ( p 1  vs. p 2 )

Fisher’s Exact Test

• The results from JMP are shown below:

• The alternatives are communicated verbally alongside the p-values; the one we are interested in is boxed. It states that…

The probability of having a baby with a normal birth weight is greater for those in the group that received nutritional education (p = .0235).

This p-value is EXACT and does not come from a normal approximation!
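To show where an exact p-value of this kind comes from, here is a sketch of the one-sided Fisher calculation using only the standard library. With all margins of the 2 X 2 table held fixed, the count in any one cell follows a hypergeometric distribution, and the p-value is a tail sum. The function names are hypothetical, and this is not a description of JMP's internals.

```python
from math import comb

def hypergeom_pmf(k, N, K, n):
    """P(X = k) when n items are drawn from N, of which K are 'successes'."""
    return comb(K, k) * comb(N - K, n - k) / comb(N, n)

def fisher_lower_tail(k, N, K, n):
    """One-sided exact p-value: P(X <= k) with all table margins fixed."""
    lowest = max(0, n - (N - K))
    return sum(hypergeom_pmf(i, N, K, n) for i in range(lowest, k + 1))

# 630 mothers in total, 62 low birth weight babies overall,
# 314 mothers in the education group, 23 low birth weight babies among them.
p_value = fisher_lower_tail(k=23, N=630, K=62, n=314)
print(f"one-sided exact p-value = {p_value:.4f}")
```

A small count of low birth weight babies in the education group is the same event as a large count of normal birth weight babies there, so this tail corresponds to the alternative quoted above.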

Page 45: Comparing Two Proportions ( p 1  vs. p 2 )

Preliminary Summary of Independent Sample Comparisons (p1 vs. p2)

• When sample sizes are “large” one can use a z-test and CI to make inferences about (p1 – p2), otherwise use Fisher’s Exact Test.

• To further quantify and discuss effect size one can use RR and OR, along with inferential methods for them.

• If it makes sense for the given situation, NNT can also be calculated from (p1 – p2).

Page 46: Comparing Two Proportions ( p 1  vs. p 2 )

Observational Comparative Study (e.g. case-control)

Population 1 (proportion p1): sample size = n1; calculate the sample proportion.

Population 2 (proportion p2): sample size = n2; calculate the sample proportion.

Compare p1 vs. p2.

To make inferences use: Hypothesis test, CI for difference in proportions (and possibly RR, OR, & NNT/NNH).

Page 47: Comparing Two Proportions ( p 1  vs. p 2 )

Example 2: Age at 1st Pregnancy and Cervical Cancer (Case-Control Study)

• In a case-control study, we sample individuals who have a “disease” of interest (cases) and individuals who do not have the “disease” (controls) and compare these two populations in terms of potential risk factors.

• In this study, samples of women who have cervical cancer and women who did not have cervical cancer were independently taken. The proportions of women who had their first child at or before the age of 25 were compared for these two populations of women.

Page 48: Comparing Two Proportions ( p 1  vs. p 2 )

Example 2: Age at 1st Pregnancy and Cervical Cancer (Case-Control Study)

In conducting the study 49 women with cervical cancer and 317 women of similar age & background without cervical cancer were sampled. The number of women having their first child at or before the age of 25 was determined for both samples.

Page 49: Comparing Two Proportions ( p 1  vs. p 2 )

Example 2: Age at 1st Pregnancy and Cervical Cancer (Case-Control Study)

• Because the number of women with the disease was chosen by the researchers we cannot consider P(disease | risk), thus RR cannot be calculated (so no RR test or CI).

• We will only compare the proportion of women with the “risk factor” in each group (z-test, Fisher’s Exact Test, CI for (p1 – p2)).

• If the prevalence of the risk factor is greater for the disease group then we have evidence of an association or link between the factor and the disease.

Page 50: Comparing Two Proportions ( p 1  vs. p 2 )

Example 2: Cervical Cancer & Age at 1st Pregnancy

Enter data into JMP like this

Notice that risk factor presence is the Y because it is the random outcome variable and X is case-control status.

85.7% of those in case group had the risk factor

64.0% of those in the control group had the risk factor.

The proportion of women without the risk factor is greater for the control group than for the case group (p = .0014).

Women who have their first pregnancy at or before 25 years of age have 3.369 times higher odds for developing cervical cancer.
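The data table for this example is not reproduced in the transcript, so the sketch below works from counts inferred from the reported group sizes and percentages: 42 of 49 cases (85.7%) and 203 of 317 controls (64.0%) with the risk factor. Those two counts are an assumption, chosen because they are the only whole numbers that reproduce the stated percentages.

```python
# Assumed counts, inferred from the reported percentages (not taken from a data table).
cases_with, cases_total = 42, 49            # cervical cancer group
controls_with, controls_total = 203, 317    # control group

a, b = cases_with, cases_total - cases_with            # cases: risk factor yes / no
c, d = controls_with, controls_total - controls_with   # controls: risk factor yes / no

pct_cases = 100 * a / cases_total
pct_controls = 100 * c / controls_total
odds_ratio = (a * d) / (b * c)

print(f"risk factor present: cases {pct_cases:.1f}%, controls {pct_controls:.1f}%")
print(f"odds ratio = {odds_ratio:.3f}")
```

The cross-product ratio is the same whether the table is read by rows or by columns, which is why the OR remains usable in a case-control design even though RR is not.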