Statistics for clinicians

Biostatistics course by Kevin E. Kip, Ph.D., FAHA
Professor and Executive Director, Research Center, University of South Florida, College of Nursing
Professor, College of Public Health, Department of Epidemiology and Biostatistics
Associate Member, Byrd Alzheimer’s Institute, Morsani College of Medicine
Tampa, FL, USA


SECTION 4.1

Module Overview and Introduction

Hypothesis testing for 2 or more independent groups and non-parametric methods.

Module 4 Learning Outcomes:

1. Calculate and interpret 2 sample hypotheses:
   a) 2 sample – continuous outcome
   b) >2 samples – continuous outcome
   c) 2 sample – dichotomous outcome
   d) >2 samples – dichotomous outcome

2. Specify 2-sample hypotheses and conduct formal testing using SPSS

3. Differentiate between parametric and non-parametric tests

4. Identify properties of non-parametric tests

5. Calculate and interpret non-parametric tests:
   a) 2 independent samples – Wilcoxon Rank Sum Test
   b) Matched samples – Wilcoxon Signed Rank Test
   c) >2 independent samples – Kruskal Wallis Test

6. Conduct and interpret non-parametric analyses using SPSS.

Assigned Reading:

Textbook: Essentials of Biostatistics in Public Health

Chapter 7: Sections 7.5, 7.7 to 7.9 (pages 138-141 and 144-162)

Chapter 10

SECTION 4.2

Framework of hypothesis testing

General Steps for Hypothesis Testing:

1) Set up the hypothesis and determine the level of statistical significance (including 1- versus 2-sided hypothesis).
2) Select the appropriate test statistic.
3) Set up the decision rule.
4) Compute the test statistic.
5) State the conclusion (interpretation).

Hypothesis Testing Calculations:

1) Two Sample – Independent Groups
   a) Continuous outcome (Student's t test)
   b) Dichotomous outcome (risk difference or risk ratio; chi-square test)
2) More than 2 Samples – Independent Groups
   a) Continuous outcome (analysis of variance, ANOVA)
   b) Categorical outcome (chi-square test)

Framework of Hypothesis Testing

The goal is to compare sample estimates of parameters (e.g. mean, proportion) between 2 or more independent groups.

The groups can be defined from a clinical trial, such as treatment versus placebo, or from an observational study, such as men versus women, or exposed versus not exposed.

With 2 groups, one group serves as the “comparison” or “control” group representing the null value.

Groups do not need to be of the same size.

With more than 2 groups, we can test whether any groups differ (e.g. in their means) or whether groups differ in an ordered manner.

SECTION 4.3

Two-sample: independent groups – continuous outcome

1. Two-Sample: Independent Groups-Continuous Outcome

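Parameter: Difference in population means: μ1 – μ2

H0: μ1 – μ2 = 0 (equivalently, μ1 = μ2)

H1: μ1 > μ2; μ1 < μ2; or μ1 ≠ μ2

Test statistics (X1, X2 = sample means; Sp = pooled estimate of the common standard deviation):

Sp = sqrt( [(n1 – 1)s1² + (n2 – 1)s2²] / (n1 + n2 – 2) )

n1 ≥ 30 and n2 ≥ 30: z = (X1 – X2) / (Sp × sqrt(1/n1 + 1/n2)); critical value of z in Table 1C

n1 < 30 or n2 < 30: t = (X1 – X2) / (Sp × sqrt(1/n1 + 1/n2)); critical value of t in Table 2, d.f. = n1 + n2 – 2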

1. Two-Sample: Independent Groups-Continuous Outcome

Example: From the Framingham Heart Study (offspring), compare mean systolic blood pressure between men and women.

Group      n       X       s
Men      1623   128.2   17.5
Women    1911   126.5   20.1

1) Set up the hypothesis and determine the level of statistical significance (including 1 versus 2-sided hypothesis).

H0: μ1 = μ2

H1: μ1 ≠ μ2 (two-sided hypothesis); α = 0.05


2) Select the appropriate test statistic: n1 ≥ 30 and n2 ≥ 30, so use z

3) Set up the decision rule: Reject H0 if z < -1.96 or z > 1.96


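4) Compute the test statistic:

Sp = sqrt( [(1623 – 1)(17.5)² + (1911 – 1)(20.1)²] / (1623 + 1911 – 2) ) = sqrt(359.12) = 19.0

z = (128.2 – 126.5) / (19.0 × sqrt(1/1623 + 1/1911)) = 1.7 / 0.64 = 2.66

5) Conclusion: Reject H0 because 2.66 > 1.96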
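The five steps for this example can be checked with a few lines of Python. This is a minimal sketch using only the standard library; the function name `pooled_two_sample` and the variable names are illustrative and are not part of the course materials. Carried at full precision, it gives a pooled standard deviation near 19.0 and a z of about 2.66.

```python
import math
from statistics import NormalDist


def pooled_two_sample(n1, mean1, sd1, n2, mean2, sd2):
    """Return (pooled SD, test statistic) for two independent groups."""
    # Pooled variance: each group's variance weighted by its degrees of freedom
    pooled_var = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2)
    sp = math.sqrt(pooled_var)
    # Difference in means divided by its standard error
    stat = (mean1 - mean2) / (sp * math.sqrt(1 / n1 + 1 / n2))
    return sp, stat


# Framingham offspring summary statistics from the slide (men vs. women)
sp, z = pooled_two_sample(1623, 128.2, 17.5, 1911, 126.5, 20.1)

# Two-sided critical value at alpha = 0.05 from the standard normal distribution
alpha = 0.05
z_crit = NormalDist().inv_cdf(1 - alpha / 2)

reject_h0 = abs(z) > z_crit
print(f"Sp = {sp:.2f}, z = {z:.2f}, critical value = {z_crit:.2f}, reject H0: {reject_h0}")
```

The same function can be reused for the Heart SCORE cholesterol practice example by swapping in its n, mean, and s values.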

1. Two-Sample: Independent Groups-Continuous Outcome (Practice)

Example: From the Heart SCORE Study, compare mean total cholesterol levels between men and women. (α = 0.05)

Group      n      X         s
Men       165   198.88   38.416
Women     337   222.23   42.023

1) Set up the hypothesis and determine the level of statistical significance (including 1 versus 2-sided hypothesis).

H0: _________________
H1: _________________

2) Select the appropriate test statistic: n1 ≥ 30 and n2 ≥ 30, so use z __________________

3) Set up the decision rule: _________________________________________


1) Set up the hypothesis and determine the level of statistical significance (including 1 versus 2-sided hypothesis).

H0: μ1 = μ2

H1: μ1 ≠ μ2 (two-sided hypothesis)

2) Select the appropriate test statistic: n1 ≥ 30 and n2 ≥ 30, so use z

3) Set up the decision rule: Reject H0 if z < -1.96 or z > 1.96


4)Compute the test statistic

5) Conclusion: _________________________________


4) Compute the test statistic:

Sp = sqrt( [(165 – 1)(38.416)² + (337 – 1)(42.023)²] / (165 + 337 – 2) ) = 40.875

z = (198.88 – 222.23) / (40.875 × sqrt(1/165 + 1/337)) = -23.35 / 3.884 = -6.01

5) Conclusion: Reject H0 because abs(-6.01) > 1.96

SPSS: Heart SCORE total cholesterol example (α = 0.05)
Analyze > Compare Means > Independent Samples T-Test
  Test Variable: Total cholesterol
  Group Variable: Gender (defined as 1, 2)
  Options: 95% C.I.

SECTION 4.4

Two-sample: independent groups – dichotomous outcome

2. Two-Sample: Independent Groups-Dichotomous Outcome

Parameter: Risk Difference (RD) (p1 – p2) or Risk Ratio (RR) (p1 / p2)

H0: RD: p1 = p2, or p1 – p2 = 0; RR: p1 / p2 = 1.0
H1: RD: p1 ≠ p2, or p1 – p2 ≠ 0; RR: p1 / p2 ≠ 1.0

Test statistic (critical value of z in Table 1C):

z = (p1 – p2) / sqrt( p(1 – p)(1/n1 + 1/n2) )

Use z when:
min[n1p1, n1(1 – p1)] ≥ 5
min[n2p2, n2(1 – p2)] ≥ 5

Note: p1, p2 = proportion of successes (outcomes) in each group; p = proportion of successes in the two groups combined

2. Two-Sample: Independent Groups-Dichotomous Outcome

Example: From the Framingham Heart Study (offspring), compare the prevalence of CVD between smokers and non-smokers.

Group         No CVD   CVD   Total
Smoker          663     81     744    p1 = 81/744 = 0.1089
Non-smoker     2757    298    3055    p2 = 298/3055 = 0.0975

Risk Difference (RD): p1 – p2 = 0.0114; Risk Ratio (RR): p1 / p2 = 1.12

1) Set up the hypothesis and determine the level of statistical significance (including 1 versus 2-sided hypothesis).

H0: p1 = p2

H1: p1 ≠ p2 (two-sided hypothesis); α = 0.05


2) Select the appropriate test statistic: min[n1p1, n1(1 – p1)] ≥ 5 and min[n2p2, n2(1 – p2)] ≥ 5, so use z

3) Set up the decision rule: Reject H0 if z < -1.96 or z > 1.96


4) Compute the test statistic:

p = (81 + 298) / (744 + 3055) = 379 / 3799 = 0.0998

z = (0.1089 – 0.0975) / sqrt( 0.0998(1 – 0.0998)(1/744 + 1/3055) ) = 0.0114 / 0.0123 = 0.927

5) Conclusion: Do not reject H0 because -1.96 < 0.927 < 1.96
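The same calculation can be sketched in Python from the raw counts. The function name `two_proportion_z` is illustrative, not from the course. Because the code does not round intermediate values, z comes out close to 0.92 rather than exactly the hand-computed 0.927; the conclusion is the same.

```python
import math
from statistics import NormalDist


def two_proportion_z(events1, n1, events2, n2):
    """Return (p1, p2, pooled p, z) for two independent proportions."""
    p1 = events1 / n1
    p2 = events2 / n2
    # Pooled proportion under H0: all events over all participants
    p_pooled = (events1 + events2) / (n1 + n2)
    se = math.sqrt(p_pooled * (1 - p_pooled) * (1 / n1 + 1 / n2))
    return p1, p2, p_pooled, (p1 - p2) / se


# CVD among smokers (81 of 744) and non-smokers (298 of 3055), from the slide
p1, p2, p_pooled, z = two_proportion_z(81, 744, 298, 3055)

risk_difference = p1 - p2
risk_ratio = p1 / p2

z_crit = NormalDist().inv_cdf(0.975)  # two-sided test, alpha = 0.05
reject_h0 = abs(z) > z_crit
print(f"RD = {risk_difference:.4f}, RR = {risk_ratio:.2f}, z = {z:.3f}, reject H0: {reject_h0}")
```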

2. Two-Sample: Independent Groups-Dichotomous Outcome (Practice)

Example: From the Heart SCORE Study, compare the prevalence of diabetes by level of weekly exercise.

Exercise        No diabetes   Diabetes   Total
< 3 times/wk        177           18       195    p1 = _____________
≥ 3 times/wk        278           23       301    p2 = _____________

Risk Difference (RD): p1 – p2 = _______; Risk Ratio (RR): p1 / p2 = _______

1) Set up the hypothesis and determine the level of statistical significance (including 1 versus 2-sided hypothesis).

H0: _____________________________
H1: _____________________________ α = 0.05

Exercise        No diabetes   Diabetes   Total
< 3 times/wk        177           18       195    p1 = 18/195 = 0.0923
≥ 3 times/wk        278           23       301    p2 = 23/301 = 0.0764

Risk Difference (RD): p1 – p2 = 0.0159; Risk Ratio (RR): p1 / p2 = 1.21

1) Set up the hypothesis and determine the level of statistical significance (including 1 versus 2-sided hypothesis).

H0: p1 = p2

H1: p1 ≠ p2 (two-sided hypothesis); α = 0.05


2) Select the appropriate test statistic: min[n1p1, n1(1 – p1)] ≥ 5 and min[n2p2, n2(1 – p2)] ≥ 5, so use z

3) Set up the decision rule: Reject H0 if: ________________________


2) Select the appropriate test statistic: min[n1p1, n1(1 – p1)] ≥ 5 and min[n2p2, n2(1 – p2)] ≥ 5, so use z

3) Set up the decision rule: Reject H0 if z < -1.96 or z > 1.96


4) Compute the test statistic:

p = ______________ = __________

z = __________________________________

5) Conclusion: ________________________________


4) Compute the test statistic:

p = (18 + 23) / (195 + 301) = 41 / 496 = 0.0827

z = (0.0923 – 0.0764) / sqrt( 0.0827(1 – 0.0827)(1/195 + 1/301) ) = 0.0159 / 0.0253 = 0.628

5) Conclusion: Do not reject H0 because -1.96 < 0.628 < 1.96

SPSS: Heart SCORE diabetes and weekly exercise example (α = 0.05)
Analyze > Descriptive Statistics > Crosstabs
  Row Variable: Exercise (times/week)
  Column Variable: History of diabetes
  Statistics: Chi-square
  Cells: Observed, Expected

Note: For a 2 × 2 table, SPSS reports the Pearson chi-square and, on a separate line labelled “Continuity Correction”, the chi-square with Yates correction.
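To cross-check the SPSS chi-square output against the hand-computed z, the 2 × 2 chi-square can be computed directly from observed and expected counts. The sketch below is illustrative (the function name `chi_square_2x2` is an assumption, not SPSS syntax). It computes the statistic both without and with the Yates continuity correction; the uncorrected value equals the square of the two-proportion z statistic.

```python
def chi_square_2x2(a, b, c, d, yates=False):
    """Pearson chi-square for the 2 x 2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    observed = [a, b, c, d]
    # Expected count in each cell = row total * column total / N
    expected = [row1 * col1 / n, row1 * col2 / n,
                row2 * col1 / n, row2 * col2 / n]
    chi2 = 0.0
    for o, e in zip(observed, expected):
        diff = abs(o - e)
        if yates:
            # Yates continuity correction: shrink each |O - E| by 0.5
            diff = max(diff - 0.5, 0.0)
        chi2 += diff**2 / e
    return chi2


# Heart SCORE practice table: rows = exercise level, columns = (no diabetes, diabetes)
chi2_plain = chi_square_2x2(177, 18, 278, 23)
chi2_yates = chi_square_2x2(177, 18, 278, 23, yates=True)

CHI2_CRIT_DF1 = 3.84  # 0.05 critical value for 1 degree of freedom (1.96 squared)
print(f"chi-square = {chi2_plain:.3f}, with Yates correction = {chi2_yates:.3f}")
```

The corrected statistic is always the smaller of the two, so it gives a more conservative test.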

SECTION 4.5

More than two samples: independent groups – continuous outcome

3. More Than Two Independent Groups-Continuous Outcome

Parameter: Difference in means for more than 2 groups (ANOVA)

H0: μ1 = μ2 = … = μk

H1: Means are not all equal

Test statistic: F value

F = [ Σ nj(Xj – X)² / (k – 1) ] / [ ΣΣ (x – Xj)² / (N – k) ]

Find critical value in Table 4 (df1 = k – 1; df2 = N – k)

Where:
nj = sample size in the jth group (e.g. j = 1, 2, 3, …)
Xj = mean in the jth group
X = overall mean
x = individual observation
k = number of independent groups (k > 2)
N = total number of observations in analysis

ANOVA assumptions:
Outcome follows a normal distribution (all groups)
Variances approximately equal among groups

F = “between-group” variability / “residual or error” variability

The denominator is the “within-group” variability, i.e. the variability in the outcome itself. The null hypothesis is that all groups are random samples from the same population.

The F statistic assesses whether differences among the means (the numerator) are larger than expected by chance.

The F statistic has 2 degrees of freedom: df1 (numerator) and df2 (denominator); df1 = k – 1; df2 = N – k

Table 4 contains critical values for the F distribution

Analysis of Variance (ANOVA) Table

Source of Variation                           Sum of Squares (SS)   Degrees of freedom (df)   Mean Squares (MS)       F
Between-group                                 SSB                   k – 1                     MSB = SSB / (k – 1)     F = MSB / MSE
Error or residual (random), “within-group”    SSE                   N – k                     MSE = SSE / (N – k)*    ------
Total                                         SST                   N – 1                     ------                  ------

*Textbook on page 150 has typographical error

Example: Weight Loss by Treatment (in Pounds)

Low Calorie   Low Fat    Low Carb   Control
8             2          3          2
9             4          5          2
6             3          4          -1
7             5          2          0
3             1          3          3
n1 = 5        n2 = 5     n3 = 5     n4 = 5
X1 = 6.6      X2 = 3.0   X3 = 3.4   X4 = 1.2

1) Set up the hypothesis and determine level of statistical significance

H0: µ1 = µ2 = µ3 = µ4
H1: Means are not all equal; α = 0.05

2) Select the appropriate test statistic: F (ANOVA)


3) Set up the decision rule (see critical value in Table 4):
df1 = k – 1 = 4 – 1 = 3
df2 = N – k = 20 – 4 = 16

Reject H0 if F > 3.24


4) Compute the test statistic (ANOVA table):

MSB = SSB / (k – 1); MSE = SSE / (N – k); F = MSB / MSE

SSB = 5(6.6 – 3.6)² + 5(3.0 – 3.6)² + 5(3.4 – 3.6)² + 5(1.2 – 3.6)²
    = 45.0 + 1.8 + 0.2 + 28.8 = 75.8 (overall mean X = 3.6, rounded)

SSE = 21.4 + 10.0 + 5.4 + 10.6 = 47.4 (see tables 7-24 to 7-28, page 151 of text)

MSB = 75.8 / (4 – 1) = 25.3

MSE = 47.4 / (20 – 4) = 3.0

F = 25.3 / 3.0 = 8.43

5) Conclusion: Reject H0 because 8.43 > 3.24
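The ANOVA table for the weight-loss data can be reproduced from the raw observations. This is a minimal sketch in plain Python; the function name `one_way_anova` is illustrative. Because it carries full precision (overall mean 3.55 rather than 3.6, and unrounded squared deviations), it gives SSB = 75.75, SSE = 47.2, and an F of about 8.56 rather than the hand-rounded 8.43. The decision is unchanged.

```python
groups = {
    "Low Calorie": [8, 9, 6, 7, 3],
    "Low Fat": [2, 4, 3, 5, 1],
    "Low Carb": [3, 5, 4, 2, 3],
    "Control": [2, 2, -1, 0, 3],
}


def one_way_anova(samples):
    """Return (SSB, SSE, df1, df2, F) for a list of independent samples."""
    k = len(samples)
    n_total = sum(len(s) for s in samples)
    grand_mean = sum(sum(s) for s in samples) / n_total
    means = [sum(s) / len(s) for s in samples]
    # Between-group sum of squares: group size times squared distance of
    # each group mean from the overall mean
    ssb = sum(len(s) * (m - grand_mean) ** 2 for s, m in zip(samples, means))
    # Within-group (error) sum of squares: squared distance of each
    # observation from its own group mean
    sse = sum((x - m) ** 2 for s, m in zip(samples, means) for x in s)
    df1, df2 = k - 1, n_total - k
    f_stat = (ssb / df1) / (sse / df2)
    return ssb, sse, df1, df2, f_stat


ssb, sse, df1, df2, f_stat = one_way_anova(list(groups.values()))

F_CRIT = 3.24  # Table 4 value for df1 = 3, df2 = 16, alpha = 0.05 (from the slide)
reject_h0 = f_stat > F_CRIT
print(f"SSB = {ssb:.2f}, SSE = {sse:.2f}, F({df1}, {df2}) = {f_stat:.2f}, reject H0: {reject_h0}")
```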


ANOVA orthogonal contrasts of means:

Sometimes, rather than just testing for any difference among the means, we wish to compare specific means, or to test whether the means increase or decrease in a monotonic (linear) manner. This can be achieved with orthogonal contrasts of the means.

The sum of the coefficients in each linear contrast must equal zero. In the example above:

Contrast                     Low Calorie   Low Fat   Low Carb   Control
µ1 versus (µ2, µ3, µ4)       -3            1         1          1
(µ1, µ2) versus (µ3, µ4)     -1            -1        1          1
(µ1, µ2, µ3) versus µ4       -1            -1        -1         3
Linear trend                 -2            -1        1          2
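As a rough illustration of how a contrast is evaluated, the sketch below checks that each set of coefficients sums to zero and then tests the linear-trend contrast on the weight-loss means. The slides do not give the contrast formula; the code assumes the standard one, SS(contrast) = L² / Σ(cj² / nj) with L = Σ cj × Xj, tested against MSE on 1 and N – k degrees of freedom. The MSE used here (47.2 / 16) is the unrounded value for these data, and the function name `contrast_f` is illustrative.

```python
# Group sizes and means from the weight-loss example
n = [5, 5, 5, 5]
means = [6.6, 3.0, 3.4, 1.2]

contrasts = {
    "mu1 vs (mu2, mu3, mu4)": [-3, 1, 1, 1],
    "(mu1, mu2) vs (mu3, mu4)": [-1, -1, 1, 1],
    "(mu1, mu2, mu3) vs mu4": [-1, -1, -1, 3],
    "linear trend": [-2, -1, 1, 2],
}

# Every valid contrast has coefficients that sum to zero
coefficient_sums = {name: sum(c) for name, c in contrasts.items()}


def contrast_f(coefs, group_means, group_sizes, mse):
    """Return (contrast estimate L, F statistic on 1 and N - k df)."""
    estimate = sum(c * m for c, m in zip(coefs, group_means))
    ss_contrast = estimate**2 / sum(c**2 / nj for c, nj in zip(coefs, group_sizes))
    return estimate, ss_contrast / mse


MSE = 47.2 / 16  # assumption: unrounded within-group mean square for these data
estimate, f_linear = contrast_f(contrasts["linear trend"], means, n, MSE)
print(f"L = {estimate:.1f}, F(1, 16) = {f_linear:.2f}")
```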

3. More Than Two Independent Groups-Continuous Outcome (Practice)

Example: Body Mass Index by Blood Pressure Classification

Normal        Pre-hypertensive   Htn Stage I   Htn Stage II
n1 = 88       n2 = 191           n3 = 139      n4 = 55
X1 = 28.42    X2 = 29.43         X3 = 30.75    X4 = 33.39
s1 = 5.37     s2 = 5.75          s3 = 5.89     s4 = 6.39

1) Set up the hypothesis and determine level of statistical significance

H0: __________________________________
H1: ___________________________________ α = 0.05

2) Select the appropriate test statistic: __________________


1) Set up the hypothesis and determine level of statistical significance

H0: µ1 = µ2 = µ3 = µ4

H1: Means are not all equal
H1 (trend): Means increase or decrease in a monotonic (linear) manner; α = 0.05

2) Select the appropriate test statistic: F (ANOVA)


3) Set up the decision rule --- see critical value in Table 4

df1 = k – 1 = ___________
df2 = N – k = ___________

http://www.danielsoper.com/statcalc3/calc.aspx?id=4

Reject H0 if: _______________________

Total N = n1 + n2 + n3 + n4 = ___________________________


3) Set up the decision rule --- see critical value in Table 4

df1 = k – 1 = 4 – 1 = 3
df2 = N – k = 473 – 4 = 469

http://www.danielsoper.com/statcalc3/calc.aspx?id=4

Reject H0 if F > 2.62

Total N = n1 + n2 + n3 + n4 = 88 + 191 + 139 + 55 = 473


Source of Variation                           Sum of Squares (SS)   Degrees of freedom (df)   Mean Squares (MS)              F
Between-group                                 SSB = 990.9           k – 1 = ________          MSB = SSB / (k – 1) = _____    F = MSB / MSE = _______
Error or residual (random), “within-group”    SSE = 15766.2         N – k = ________          MSE = SSE / (N – k) = _____    ------
Total                                         SST = 16757.1         N – 1 = ________          ------                         ------

4) Compute the test statistic: F = ____________ (N = ____)

5) Conclusion: ___________________________


Source of Variation                           Sum of Squares (SS)   Degrees of freedom (df)   Mean Squares (MS)                F
Between-group                                 SSB = 990.9           k – 1 = 3                 MSB = 990.9 / 3 = 330.3          F = 330.3 / 33.6 = 9.83
Error or residual (random), “within-group”    SSE = 15766.2         N – k = 469               MSE = 15766.2 / 469 = 33.6       ------
Total                                         SST = 16757.1         N – 1 = 472               ------                           ------

4) Compute the test statistic: F = MSB / MSE = 9.83 (N = 473)

5) Conclusion: Reject H0 because 9.83 > 2.62
Linear trend: F = 11.27
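When only group sizes, means, and standard deviations are available, the ANOVA table can be approximated from those summary statistics, since each group's within-group sum of squares equals (nj – 1) × sj². The sketch below is illustrative (the function name `anova_from_summary` is an assumption). Because the published means and standard deviations are rounded, the result (SSB near 988, SSE near 15783, F near 9.8) differs slightly from the raw-data values in the table above.

```python
# Summary statistics from the slide: (n, mean BMI, standard deviation) per group
summary = {
    "Normal": (88, 28.42, 5.37),
    "Pre-hypertensive": (191, 29.43, 5.75),
    "Htn Stage I": (139, 30.75, 5.89),
    "Htn Stage II": (55, 33.39, 6.39),
}


def anova_from_summary(stats):
    """Return (SSB, SSE, df1, df2, F) from per-group (n, mean, sd) tuples."""
    k = len(stats)
    n_total = sum(n for n, _, _ in stats)
    # Overall mean is the sample-size-weighted average of the group means
    grand_mean = sum(n * mean for n, mean, _ in stats) / n_total
    ssb = sum(n * (mean - grand_mean) ** 2 for n, mean, _ in stats)
    # Each group's sum of squared deviations equals (n - 1) * sd^2
    sse = sum((n - 1) * sd**2 for n, _, sd in stats)
    df1, df2 = k - 1, n_total - k
    return ssb, sse, df1, df2, (ssb / df1) / (sse / df2)


ssb, sse, df1, df2, f_stat = anova_from_summary(list(summary.values()))

F_CRIT = 2.62  # critical value for df1 = 3, df2 = 469, alpha = 0.05 (from the slide)
print(f"SSB = {ssb:.1f}, SSE = {sse:.1f}, F({df1}, {df2}) = {f_stat:.2f}, reject H0: {f_stat > F_CRIT}")
```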

SPSS: Body mass index and blood pressure classification in the Heart SCORE Study (α = 0.05)
Analyze > Compare Means > One-Way ANOVA
  Dependent Variable: Body mass index
  Group Variable (Factor): Blood pressure class
  Contrasts: -2 -1 1 2
  Options: Descriptive; Homogeneity of variance test; Means plot