The Role of Confidence Intervals in Research

Thought Questions

1. A study compared the serum HDL cholesterol levels of people with low-fat diets with those of people with high-fat diets.

From the study, a 95% confidence interval for the mean HDL cholesterol for the low-fat group extends from 43.5 to 50.5.

a. Does this mean that 95% of all people with low-fat diets will have HDL cholesterol levels between 43.5 and 50.5? Explain.

b. A 95% confidence interval for the mean HDL cholesterol for the low-fat group extends from 43.5 to 50.5. A 95% confidence interval for the mean HDL cholesterol for the high-fat group extends from 54.5 to 61.5.

Based on these results, would you conclude that people with low-fat diets have lower HDL cholesterol levels, on average, than people with high-fat diets?

2. In Question 1, we compared average HDL cholesterol levels for two diet groups by computing separate confidence intervals for the two means. Is there a more direct value (and a single confidence interval) to examine in order to make the comparison between the two groups?

Making Guesses About an Individual

While waiting for a friend outside the library, try to guess whether the next student leaving the library is overweight.

To keep things simple, select the next male student, but not an athlete (hence excluding the 300 lb offensive lineman).

We'll also imagine that our task is to guess body mass index (BMI), which is weight in kilograms divided by the square of height in meters.

In a study of 100 non-athlete male students at your university, the mean BMI was 26.0 and the standard deviation was 3.9. So if you had to guess the BMI of the next guy leaving the library, what would your "best guess" be?

How confident are you that his BMI is exactly your "best guess"?

Your alternative to guessing a single value would be to say something like, "I guess that his BMI is somewhere between 22 and 30." How confident should you be in this answer?

If you guessed "between 18 and 34" (within about two standard deviations of the mean), about what percentage of your guesses would be correct?

Making Guesses About the Results of a Study

Let's try to do the same with the results of studies. Let's say your statistics teacher is finalizing his own analysis of BMI in 100 non-athlete males.

What will he announce as the mean?

Again, your best guess would be 26.0, but again you know that results of studies can vary. This time we aren't worried about how BMI varies between different individuals but about how the mean BMI varies between studies.

So we want to think about the standard error of the sample mean rather than the standard deviation of the population.

The standard error of our study is the standard deviation divided by the square root of the sample size, which gives 0.39.

About 95% of studies like this one will produce a sample mean within two standard errors of the true population mean (the mean you would get if you measured the BMI of all non-athlete males at Columbia).

If we guessed that the results of the lecturer's study would be between 25.2 and 26.8 we'd have a pretty good chance of being right.
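The arithmetic behind this range can be reproduced in a few lines. This is only a minimal sketch: the variable names are illustrative, and the inputs are the figures quoted above (mean 26.0, standard deviation 3.9, sample size 100).

```python
import math

# Figures quoted in the text for the BMI study of non-athlete male students.
sample_mean = 26.0
sample_sd = 3.9
n = 100

# Standard error of the mean: SD divided by the square root of the sample size.
sem = sample_sd / math.sqrt(n)

# Range within two standard errors of the sample mean.
low = sample_mean - 2 * sem
high = sample_mean + 2 * sem

print(f"SEM = {sem:.2f}")
print(f"range = {low:.1f} to {high:.1f}")
```

Note how much narrower this range is than the "18 to 34" range for an individual: dividing by the square root of 100 shrinks the spread by a factor of 10.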

Confidence Intervals for Studies

The range we give for results of a study is called a confidence interval.

A confidence interval is useful for interpreting the results of a study.

For example, imagine that we were looking at a study of whether a "mentoring" program affected SAT scores.

We read that mentoring was associated with an increase in SAT scores by 5 points, with a 95% confidence interval from 2 to 8 points.

What does the confidence interval tell us?

What would we conclude if the confidence interval went from -2 to 11?

Confidence Intervals for Population Means

Examples of measures (parameters) of interest: What is the mean number of hours Columbia first-year students study each week? What was their mean grade point average in high school?

Sampling Distribution of a Sample Mean

The sampling distribution of the sample mean is approximately Normal when the sample size n is large or when the population the sample is drawn from is Normal.

The mean of the sampling distribution is equal to the true population mean.

The standard deviation (SD) of the sampling distribution of the sample mean is

population standard deviation / √(sample size)

For proportions, the standard deviation of the sample proportion is determined by the proportion itself. This is not the case with means: the population standard deviation is a separate unknown.

We'll do the best we can: estimate the population standard deviation with the sample standard deviation.

SEM = standard error of the sample mean = sample standard deviation / √n

Recall: Conditions for Rule for Sample Means

1. Population of measurements is bell-shaped, and a random sample of any size is measured.

OR

2. Population of measurements of interest is not bell-shaped, but a large random sample is measured. A sample of size 40 is considered large, but if there are extreme outliers, it is better to have a larger sample.

Constructing a Confidence Interval for a Mean

In about 95% of all samples, the sample mean will fall within 2 standard errors of the true population mean.

A 95% confidence interval for a population mean:

sample mean ± 2 (SEM)

SEM = standard error of the sample mean = sample standard deviation / √n

Example: Comparing Diet and Exercise

Compare weight loss (over 1 year) in men who diet but do not exercise and vice versa.

Diet Only Group:

sample mean = 7.2 kg
sample standard deviation = 3.7 kg
sample size = n = 42
standard error of the sample mean = 3.7 / √42 = 0.571

95% confidence interval for population mean: 7.2 ± 2(0.571) = 7.2 ± 1.1 = 6.1 kg to 8.3 kg

Exercise Only Group:

sample mean = 4.0 kg
sample standard deviation = 3.9 kg
sample size = n = 47
standard error of the sample mean = 3.9 / √47 = 0.569

95% confidence interval for population mean: 4.0 ± 2(0.569) = 4.0 ± 1.1 = 2.9 kg to 5.1 kg
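The two intervals in this example can be checked with a small helper. This is a sketch under the slide's own rule of thumb (multiplier of 2); the function name `mean_ci` and its parameters are illustrative and not part of the original material.

```python
import math


def mean_ci(sample_mean, sample_sd, n, multiplier=2.0):
    """Approximate confidence interval: sample mean +/- multiplier * SEM."""
    sem = sample_sd / math.sqrt(n)  # standard error of the sample mean
    margin = multiplier * sem
    return sample_mean - margin, sample_mean + margin


# Summary statistics from the diet and exercise example.
diet_low, diet_high = mean_ci(7.2, 3.7, 42)
exercise_low, exercise_high = mean_ci(4.0, 3.9, 47)

print(f"diet only:     {diet_low:.1f} kg to {diet_high:.1f} kg")
print(f"exercise only: {exercise_low:.1f} kg to {exercise_high:.1f} kg")
```

Passing `multiplier=1.96` instead of the default gives the more exact 95% interval; at this precision the rounded endpoints barely move.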

Interpretation of Your Confidence Interval

Diet Only Group: 95% confidence interval: 6.1 kg to 8.3 kg; sample mean: 7.2 kg

Which of these statements correctly interprets the interval?

95% of all men will lose between 6.1 and 8.3 kg on this diet.

We are 95% confident that a randomly selected man will lose between 6.1 and 8.3 kg on this diet.

If we took many random samples of men, about 95 out of every 100 of them would produce a confidence interval that contained the true mean weight loss of men on this diet.

The true mean weight loss of men is 7.2 kg 95% of the time.

95% of all samples will have a weight loss between 6.1 and 8.3 kg.

Project 03 City Data - 2009 School Survey

Parent Engagement Score:

sample mean = 7.1
sample standard deviation = 0.5
sample size = n = 30
standard error of the sample mean = 0.5 / √30 = 0.091

95% confidence interval for population mean: 7.1 ± 2(0.091) = 7.1 ± 0.182 = 6.918 to 7.282

True Population Mean: 7.2

Teacher Engagement Score:

sample mean = 7.0
sample standard deviation = 0.8
sample size = n = 30
standard error of the sample mean = 0.8 / √30 = 0.146

95% confidence interval for population mean: 7.0 ± 2(0.146) = 7.0 ± 0.292 = 6.708 to 7.292

True Population Mean: 7.1

Confidence Intervals for Difference Between Two Means

In many instances, such as in the diet versus exercise example, we are interested in comparing the population means under two conditions or for two groups. Construct a single confidence interval for the difference in the population means for the two groups/conditions.

General form for Confidence Intervals: sample value ± 2 × measure of variability

1. Collect a large sample of observations, independently, under each condition/from each group. Compute the mean and standard deviation for each sample.

2. Compute the standard error of the mean (SEM) for each sample by dividing the sample standard deviation by the square root of the sample size.

3. For independent random quantities, variances add. Square the two SEMs and add them together. Then take the square root. This will give you the standard error of the difference in two means. measure of variability = √[(SEM1)² + (SEM2)²]

4. A 95% confidence interval for the difference in the two population means is: difference in sample means ± 2 √[(SEM1)² + (SEM2)²]
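The four steps can be sketched as a single function. This is an illustrative sketch, not a definitive implementation: the name `difference_ci` and its parameters are assumptions, it uses the rule-of-thumb multiplier of 2, and it presumes the two samples are independent. The inputs are the diet and exercise summary statistics given earlier.

```python
import math


def difference_ci(mean1, sd1, n1, mean2, sd2, n2, multiplier=2.0):
    """Approximate CI for (population mean 1 - population mean 2)."""
    # Step 2: standard error of the mean for each sample.
    sem1 = sd1 / math.sqrt(n1)
    sem2 = sd2 / math.sqrt(n2)
    # Step 3: variances add for independent quantities.
    se_diff = math.sqrt(sem1 ** 2 + sem2 ** 2)
    # Step 4: difference in sample means +/- multiplier * standard error.
    diff = mean1 - mean2
    return diff - multiplier * se_diff, diff + multiplier * se_diff


# Diet only (7.2, 3.7, n = 42) minus exercise only (4.0, 3.9, n = 47).
low, high = difference_ci(7.2, 3.7, 42, 4.0, 3.9, 47)
print(f"difference in mean weight loss: {low:.1f} kg to {high:.1f} kg")
```

Returning the two endpoints as a tuple makes it easy to check whether the interval covers zero, for example with `low <= 0 <= high`.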

Example: Comparing Diet and Exercise

Compare weight loss (over 1 year) in men who diet but do not exercise and vice versa.

Diet Only Group:

sample mean = 7.2 kg
sample standard deviation = 3.7 kg
sample size = n = 42
standard error = SEM1 = 3.7 / √42 = 0.571

Exercise Only Group:

sample mean = 4.0 kg
sample standard deviation = 3.9 kg
sample size = n = 47
standard error = SEM2 = 3.9 / √47 = 0.569

Compute standard error of the difference in two means:

measure of variability = √[(0.571)² + (0.569)²] = 0.81

Compute the confidence interval: [7.2 - 4.0] ± 2(0.81) = 3.2 ± 1.6 = 1.6 kg to 4.8 kg

Project 03 City Data - 2009 School Survey

Parent Engagement Score:

sample mean = 7.1
sample standard deviation = 0.5
sample size = n = 30
standard error of the sample mean = 0.5 / √30 = 0.091

Teacher Engagement Score:

sample mean = 7.0
sample standard deviation = 0.8
sample size = n = 30
standard error of the sample mean = 0.8 / √30 = 0.146

Compute standard error of the difference in two means (Parent minus Teacher mean score):

measure of variability = √[(0.091)² + (0.146)²] = 0.172

Compute the confidence interval: [7.1 - 7.0] ± 2(0.172) = 0.1 ± 0.344 = -0.244 to 0.444

Actual Difference between Population Means: 7.2 - 7.1 = 0.1

How Journals Present Confidence Intervals

Study of the relationship between smoking during pregnancy and subsequent IQ of child. The journal article (Olds, Henderson, and Tatelbaum, 1994) provided 95% confidence intervals, most comparing the means for mothers who didn't smoke and mothers who smoked ten or more cigarettes per day, hereafter called smokers.

After control for confounding background variables (diet, education, age, drug use, parents' IQ, quality of parental care, and duration of breast feeding), the average difference observed at 12 and 24 months was 2.59 points (95% CI: -3.03, 8.20); the difference observed at 36 and 48 months was reduced to 4.35 points (95% CI: 0.02, 8.68).

Reporting Standard Errors of the Mean

Comparison of serum DHEA-S levels for practitioners and non-practitioners of transcendental meditation.

Serum DHEA-S Concentrations (± SEM)

difference in sample means ± 2 √[(SEM1)² + (SEM2)²]

[117 - 88] ± 2 √[(12)² + (11)²]

29 ± 2(16.3)

29 ± 32.6

-3.6 to 61.6

How do we interpret this interval?

Understanding the Confidence Level

For a confidence level of 95%, we expect that about 95% of all such intervals will actually cover the true population value. The remaining 5% will not. Confidence is in the procedure over the long run.
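The long-run reading of the confidence level can be illustrated by simulation. The sketch below draws many samples from a population whose mean is known, builds a "mean ± 2 SEM" interval from each, and counts how often the interval covers the true mean. The population (Normal, mean 50, SD 10), the sample size, the number of trials, and the function name `coverage_rate` are all assumptions chosen for illustration; none come from the original material.

```python
import math
import random
import statistics


def coverage_rate(true_mean, true_sd, n, trials, multiplier=2.0, seed=0):
    """Fraction of simulated intervals that contain the true mean."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mean, true_sd) for _ in range(n)]
        mean = statistics.mean(sample)
        sem = statistics.stdev(sample) / math.sqrt(n)
        if mean - multiplier * sem <= true_mean <= mean + multiplier * sem:
            hits += 1
    return hits / trials


# Hypothetical population: mean 50, SD 10; 2000 samples of size 40 each.
rate = coverage_rate(true_mean=50.0, true_sd=10.0, n=40, trials=2000)
print(f"share of intervals covering the true mean: {rate:.3f}")
```

The printed share should land close to 0.95. Any single interval either covers the true mean or it does not; the 95% describes how often the procedure succeeds across repeated samples.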

90% confidence level => multiplier = 1.645
95% confidence level => multiplier = 2 (to be exact, it is 1.96)
99% confidence level => multiplier = 2.576

More confidence => wider interval
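These multipliers are percentiles of the standard Normal curve, so they can be recovered with Python's standard library. This is a sketch assuming the Normal approximation used throughout (large samples); the helper name `multiplier_for` is illustrative.

```python
from statistics import NormalDist


def multiplier_for(confidence_level):
    """Two-sided standard Normal multiplier for a given confidence level."""
    # Leave half of the excluded probability in each tail.
    upper_tail_point = (1 + confidence_level) / 2
    return NormalDist().inv_cdf(upper_tail_point)


for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%} confidence level => multiplier = {multiplier_for(level):.3f}")
```

Because the multiplier grows with the confidence level, the margin (multiplier × standard error) grows too, which is why more confidence means a wider interval.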

Text Questions

6. Suppose you were given a 95% confidence interval for the difference in two population means. What could you conclude about the population means if:

a. The confidence interval did not cover zero

b. The confidence interval did cover zero
