Chapter 7 Statistical Inference: Confidence Intervals


Transcript of Chapter 7 Statistical Inference: Confidence Intervals

Page 1: Chapter 7 Statistical Inference:  Confidence Intervals

Agresti/Franklin Statistics, 1 of 87

Chapter 7 Statistical Inference: Confidence Intervals

Learn …

How to Estimate a Population Parameter Using Sample Data

Page 2: Chapter 7 Statistical Inference:  Confidence Intervals

Section 7.1

What Are Point and Interval Estimates of Population Parameters?

Page 3: Chapter 7 Statistical Inference:  Confidence Intervals

Point Estimate

A point estimate is a single number that is our “best guess” for the parameter

Page 4: Chapter 7 Statistical Inference:  Confidence Intervals

Interval Estimate

An interval estimate is an interval of numbers within which the parameter value is believed to fall.

Page 5: Chapter 7 Statistical Inference:  Confidence Intervals

Point Estimate vs Interval Estimate

Page 6: Chapter 7 Statistical Inference:  Confidence Intervals

Point Estimate vs Interval Estimate

A point estimate doesn’t tell us how close the estimate is likely to be to the parameter

An interval estimate is more useful

• It incorporates a margin of error, which helps us to gauge the accuracy of the point estimate

Page 7: Chapter 7 Statistical Inference:  Confidence Intervals

Point Estimation: How Do We Make a Best Guess for a Population Parameter?

Use an appropriate sample statistic:

• For the population mean, use the sample mean

• For the population proportion, use the sample proportion

Page 8: Chapter 7 Statistical Inference:  Confidence Intervals

Point Estimation: How Do We Make a Best Guess for a Population Parameter?

Point estimates are the most common form of inference reported by the mass media

Page 9: Chapter 7 Statistical Inference:  Confidence Intervals

Properties of Point Estimators

Property 1: A good estimator has a sampling distribution that is centered at the parameter

• An estimator with this property is unbiased

• The sample mean is an unbiased estimator of the population mean

• The sample proportion is an unbiased estimator of the population proportion

Page 10: Chapter 7 Statistical Inference:  Confidence Intervals

Properties of Point Estimators

Property 2: A good estimator has a small standard error compared to other estimators

• This means it tends to fall closer than other estimates to the parameter

Page 11: Chapter 7 Statistical Inference:  Confidence Intervals

Interval Estimation: Constructing an Interval that Contains the Parameter (We Hope!)

Inference about a parameter should provide not only a point estimate but should also indicate its likely precision

Page 12: Chapter 7 Statistical Inference:  Confidence Intervals

Confidence Interval

A confidence interval is an interval containing the most believable values for a parameter

The probability that this method produces an interval that contains the parameter is called the confidence level

• This is a number chosen to be close to 1, most commonly 0.95

Page 13: Chapter 7 Statistical Inference:  Confidence Intervals

What is the Logic Behind Constructing a Confidence Interval?

To construct a confidence interval for a population proportion, start with the sampling distribution of a sample proportion

Page 14: Chapter 7 Statistical Inference:  Confidence Intervals

The Sampling Distribution of the Sample Proportion

Gives the possible values for the sample proportion and their probabilities

Is approximately a normal distribution for large random samples

Has a mean equal to the population proportion

Has a standard deviation called the standard error
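The properties listed above can be checked with a small simulation. This is a minimal sketch, not part of the original slides: the population proportion 0.45, the sample size 400, and the helper name `simulate_sample_proportions` are all illustrative assumptions.

```python
import math
import random
from statistics import mean, pstdev


def simulate_sample_proportions(p, n, num_samples, seed=0):
    """Draw many random samples of size n and return each sample proportion."""
    rng = random.Random(seed)
    return [
        sum(rng.random() < p for _ in range(n)) / n
        for _ in range(num_samples)
    ]


# Hypothetical population proportion and sample size (assumptions for illustration).
p_true, n = 0.45, 400
props = simulate_sample_proportions(p_true, n, num_samples=5000)

# Standard error predicted by the formula sqrt(p(1 - p)/n).
theoretical_se = math.sqrt(p_true * (1 - p_true) / n)

print(f"mean of sample proportions:   {mean(props):.4f} (population p = {p_true})")
print(f"std dev of sample proportions: {pstdev(props):.4f}")
print(f"theoretical standard error:    {theoretical_se:.4f}")
```

The simulated mean should sit close to p, and the simulated standard deviation close to the theoretical standard error.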

Page 15: Chapter 7 Statistical Inference:  Confidence Intervals

A 95% Confidence Interval for a Population Proportion

Fact: Approximately 95% of a normal distribution falls within 1.96 standard deviations of the mean

• That means: With probability 0.95, the sample proportion falls within about 1.96 standard errors of the population proportion
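The 1.96 figure can be verified numerically. The sketch below uses Python's standard-library `statistics.NormalDist`; the variable names are illustrative.

```python
from statistics import NormalDist

std_normal = NormalDist(mu=0.0, sigma=1.0)

# Probability that a normal value falls within 1.96 standard deviations of its mean.
prob_within = std_normal.cdf(1.96) - std_normal.cdf(-1.96)

# Going the other way: the z-score that leaves probability 0.025 in each tail.
z_95 = std_normal.inv_cdf(0.975)

print(f"P(-1.96 < Z < 1.96) = {prob_within:.4f}")
print(f"z-score for 95% confidence = {z_95:.3f}")
```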

Page 16: Chapter 7 Statistical Inference:  Confidence Intervals

Margin of Error

The margin of error measures how accurate the point estimate is likely to be in estimating a parameter

The distance of 1.96 standard errors is the margin of error for a 95% confidence interval

Page 17: Chapter 7 Statistical Inference:  Confidence Intervals

Confidence Interval

A confidence interval is constructed by adding a margin of error to, and subtracting it from, a given point estimate

When the sampling distribution is approximately normal, a 95% confidence interval has margin of error equal to 1.96 standard errors

Page 18: Chapter 7 Statistical Inference:  Confidence Intervals

Section 7.2

How Can We Construct a Confidence Interval to Estimate a Population Proportion?

Page 19: Chapter 7 Statistical Inference:  Confidence Intervals

Finding the 95% Confidence Interval for a Population Proportion

We symbolize a population proportion by p

The point estimate of the population proportion is the sample proportion

We symbolize the sample proportion by p̂

Page 20: Chapter 7 Statistical Inference:  Confidence Intervals

Finding the 95% Confidence Interval for a Population Proportion

A 95% confidence interval uses a margin of error = 1.96(standard errors)

[point estimate ± margin of error] = $\hat{p} \pm 1.96(\text{standard errors})$

Page 21: Chapter 7 Statistical Inference:  Confidence Intervals

Finding the 95% Confidence Interval for a Population Proportion

The exact standard error of a sample proportion equals:

$\sqrt{p(1-p)/n}$

This formula depends on the unknown population proportion, p

In practice, we don’t know p, and we need to estimate the standard error

Page 22: Chapter 7 Statistical Inference:  Confidence Intervals

Finding the 95% Confidence Interval for a Population Proportion

In practice, we use an estimated standard error:

$se = \sqrt{\hat{p}(1-\hat{p})/n}$

Page 23: Chapter 7 Statistical Inference:  Confidence Intervals

Finding the 95% Confidence Interval for a Population Proportion

A 95% confidence interval for a population proportion p is:

$\hat{p} \pm 1.96(se)$, with $se = \sqrt{\hat{p}(1-\hat{p})/n}$

Page 24: Chapter 7 Statistical Inference:  Confidence Intervals

Example: Would You Pay Higher Prices to Protect the Environment?

In 2000, the GSS asked: “Are you willing to pay much higher prices in order to protect the environment?”

• Of n = 1154 respondents, 518 were willing to do so

Page 25: Chapter 7 Statistical Inference:  Confidence Intervals

Example: Would You Pay Higher Prices to Protect the Environment?

Find and interpret a 95% confidence interval for the population proportion of adult Americans willing to do so at the time of the survey

Page 26: Chapter 7 Statistical Inference:  Confidence Intervals

Example: Would You Pay Higher Prices to Protect the Environment?

$\hat{p} = 518/1154 = 0.45$

$se = \sqrt{(0.45)(0.55)/1154} = 0.015$

$\hat{p} \pm 1.96(se) = 0.45 \pm 1.96(0.015) = 0.45 \pm 0.03 = (0.42, 0.48)$
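The calculation for this example can be reproduced in a few lines. This is a sketch under the large-sample formula from the preceding slides; the function name `proportion_ci` is illustrative, and only the counts 518 and 1154 come from the example.

```python
import math


def proportion_ci(successes, n, z=1.96):
    """Large-sample confidence interval for a proportion: p_hat ± z * se."""
    p_hat = successes / n
    se = math.sqrt(p_hat * (1 - p_hat) / n)  # estimated standard error
    margin = z * se
    return p_hat, se, (p_hat - margin, p_hat + margin)


# GSS example: 518 of 1154 respondents were willing to pay much higher prices.
p_hat, se, (lower, upper) = proportion_ci(518, 1154)
print(f"p_hat = {p_hat:.2f}, se = {se:.3f}, 95% CI = ({lower:.2f}, {upper:.2f})")
# prints: p_hat = 0.45, se = 0.015, 95% CI = (0.42, 0.48)
```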

Page 27: Chapter 7 Statistical Inference:  Confidence Intervals

Sample Size Needed for Large-Sample Confidence Interval for a Proportion

For the 95% confidence interval for a proportion p to be valid, you should have at least 15 successes and 15 failures:

$n\hat{p} \ge 15$ and $n(1-\hat{p}) \ge 15$

Page 28: Chapter 7 Statistical Inference:  Confidence Intervals

“95% Confidence”

With probability 0.95, a sample proportion value occurs such that the confidence interval contains the population proportion, p

With probability 0.05, the method produces a confidence interval that misses p

Page 29: Chapter 7 Statistical Inference:  Confidence Intervals

How Can We Use Confidence Levels Other than 95%?

In practice, the confidence level 0.95 is the most common choice

But, some applications require greater confidence

To increase the chance of a correct inference, we use a larger confidence level, such as 0.99

Page 30: Chapter 7 Statistical Inference:  Confidence Intervals

A 99% Confidence Interval for p

$\hat{p} \pm 2.58(se)$

Page 31: Chapter 7 Statistical Inference:  Confidence Intervals

Different Confidence Levels

Page 32: Chapter 7 Statistical Inference:  Confidence Intervals

Different Confidence Levels

In using confidence intervals, we must compromise between the desired margin of error and the desired confidence of a correct inference

• As the desired confidence level increases, the margin of error gets larger

Page 33: Chapter 7 Statistical Inference:  Confidence Intervals

What is the Error Probability for the Confidence Interval Method?

The general formula for the confidence interval for a population proportion is:

Sample proportion ± (z-score)(std. error)

which in symbols is

$\hat{p} \pm z(se)$
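The z-score in the general formula follows from the confidence level: the error probability (1 minus the confidence level) is split equally between the two tails of the standard normal distribution. A minimal sketch, with the helper name `z_for_confidence` as an illustrative assumption:

```python
from statistics import NormalDist


def z_for_confidence(level):
    """z-score leaving (1 - level)/2 probability in each tail of the standard normal."""
    error_probability = 1 - level
    return NormalDist().inv_cdf(1 - error_probability / 2)


for level in (0.90, 0.95, 0.99):
    print(f"confidence level {level:.2f}: z = {z_for_confidence(level):.3f}")
```

For 0.95 this gives about 1.96, and for 0.99 about 2.576, which the slides round to 2.58.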

Page 34: Chapter 7 Statistical Inference:  Confidence Intervals

What is the Error Probability for the Confidence Interval Method?

Page 35: Chapter 7 Statistical Inference:  Confidence Intervals

Summary: Confidence Interval for a Population Proportion, p

A confidence interval for a population proportion p is:

$\hat{p} \pm z\sqrt{\hat{p}(1-\hat{p})/n}$

Page 36: Chapter 7 Statistical Inference:  Confidence Intervals

Summary: Effects of Confidence Level and Sample Size on Margin of Error

The margin of error for a confidence interval:

• Increases as the confidence level increases

• Decreases as the sample size increases
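Both effects can be seen by tabulating the margin of error for a few combinations. This is an illustrative sketch: the sample proportion 0.45 is borrowed from the earlier GSS example, while the sample sizes 100, 400 and 1600 are hypothetical.

```python
import math
from statistics import NormalDist


def margin_of_error(p_hat, n, level=0.95):
    """Margin of error z * sqrt(p_hat(1 - p_hat)/n) for a given confidence level."""
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    return z * math.sqrt(p_hat * (1 - p_hat) / n)


p_hat = 0.45
levels = (0.90, 0.95, 0.99)
print("n     " + "  ".join(f"{level:.2f} " for level in levels))
for n in (100, 400, 1600):
    row = "  ".join(f"{margin_of_error(p_hat, n, level):.3f}" for level in levels)
    print(f"{n:<5d} {row}")
```

Reading across a row, the margin grows with the confidence level; reading down a column, it shrinks as n grows, halving each time n is quadrupled.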

Page 37: Chapter 7 Statistical Inference:  Confidence Intervals

What Does It Mean to Say that We Have “95% Confidence”?

If we used the 95% confidence interval method to estimate many population proportions, then in the long run about 95% of those intervals would give correct results, containing the population proportion
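This long-run interpretation can be illustrated by simulation: build many 95% intervals from repeated random samples and count how often they contain the true proportion. The sketch below assumes a hypothetical population proportion of 0.45 and samples of size 500; the function name `coverage_rate` is illustrative.

```python
import math
import random


def coverage_rate(p, n, num_intervals, z=1.96, seed=1):
    """Fraction of simulated intervals p_hat ± z * se that contain the true p."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(num_intervals):
        successes = sum(rng.random() < p for _ in range(n))
        p_hat = successes / n
        se = math.sqrt(p_hat * (1 - p_hat) / n)
        if p_hat - z * se <= p <= p_hat + z * se:
            hits += 1
    return hits / num_intervals


rate = coverage_rate(p=0.45, n=500, num_intervals=2000)
print(f"Share of intervals containing p: {rate:.3f}")
```

The share should come out near 0.95, though any single run of the simulation will vary a little around that value.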

Page 38: Chapter 7 Statistical Inference:  Confidence Intervals

A recent survey asked: “During the last year, did anyone take something from you by force?”

Of 987 subjects, 17 answered “yes”

Find the point estimate of the proportion of the population who were victims

a. .17

b. .017

c. .0017

Page 39: Chapter 7 Statistical Inference:  Confidence Intervals

Section 7.3

How Can We Construct a Confidence Interval To Estimate a Population Mean?

Page 40: Chapter 7 Statistical Inference:  Confidence Intervals

How to Construct a Confidence Interval for a Population Mean

Point estimate ± margin of error

The sample mean is the point estimate of the population mean

The exact standard error of the sample mean is $\sigma/\sqrt{n}$

In practice, we estimate σ by the sample standard deviation, s

Page 41: Chapter 7 Statistical Inference:  Confidence Intervals

How to Construct a Confidence Interval for a Population Mean

For large n… and also for small n from an underlying population that is normal…

The confidence interval for the population mean is:

$\bar{x} \pm z(\sigma/\sqrt{n})$

Page 42: Chapter 7 Statistical Inference:  Confidence Intervals

How to Construct a Confidence Interval for a Population Mean

In practice, we don’t know the population standard deviation

Substituting the sample standard deviation s for σ to get $se = s/\sqrt{n}$ introduces extra error

To account for this increased error, we replace the z-score by a slightly larger score, the t-score

Page 43: Chapter 7 Statistical Inference:  Confidence Intervals

How to Construct a Confidence Interval for a Population Mean

In practice, we estimate the standard error of the sample mean by $se = s/\sqrt{n}$

Then, we multiply se by a t-score from the t-distribution to get the margin of error for a confidence interval for the population mean

Page 44: Chapter 7 Statistical Inference:  Confidence Intervals

Properties of the t-distribution

The t-distribution is bell shaped and symmetric about 0

The probabilities depend on the degrees of freedom, df

The t-distribution has thicker tails and is more spread out than the standard normal distribution

Page 45: Chapter 7 Statistical Inference:  Confidence Intervals

t-Distribution

Page 46: Chapter 7 Statistical Inference:  Confidence Intervals

Summary: 95% Confidence Interval for a Population Mean

A 95% confidence interval for the population mean µ is:

$\bar{x} \pm t_{.025}(s/\sqrt{n})$; df = n − 1

To use this method, you need:

• Data obtained by randomization

• An approximately normal population distribution

Page 47: Chapter 7 Statistical Inference:  Confidence Intervals

Example: eBay Auctions of Palm Handheld Computers

Do you tend to get a higher, or a lower, price if you give bidders the “buy-it-now” option?

Page 48: Chapter 7 Statistical Inference:  Confidence Intervals

Example: eBay Auctions of Palm Handheld Computers

Consider some data from sales of the Palm M515 PDA (personal digital assistant)

During the first week of May 2003, 25 of these handheld computers were auctioned off, 7 of which had the “buy-it-now” option

Page 49: Chapter 7 Statistical Inference:  Confidence Intervals

Example: eBay Auctions of Palm Handheld Computers

“Buy-it-now” option:

235 225 225 240 250 250 210

Bidding only:

250 249 255 200 199 240 228 255 232 246 210 178 246 240 245 225 246 225

Page 50: Chapter 7 Statistical Inference:  Confidence Intervals

Example: eBay Auctions of Palm Handheld Computers

Summary of selling prices for the two types of auctions:

buy_now   N    Mean   StDev  Minimum      Q1  Median      Q3  Maximum
no       18  231.61   21.94   178.00  221.25  240.00  246.75   255.00
yes       7  233.57   14.64   210.00  225.00  235.00  250.00   250.00

Page 51: Chapter 7 Statistical Inference:  Confidence Intervals

Example: eBay Auctions of Palm Handheld Computers

Page 52: Chapter 7 Statistical Inference:  Confidence Intervals

Example: eBay Auctions of Palm Handheld Computers

To construct a confidence interval using the t-distribution, we must assume a random sample from an approximately normal population of selling prices

Page 53: Chapter 7 Statistical Inference:  Confidence Intervals

Example: eBay Auctions of Palm Handheld Computers

Let µ denote the population mean for the “buy-it-now” option

The estimate of µ is the sample mean:

$\bar{x} = \$233.57$

The sample standard deviation is:

$s = \$14.64$

Page 54: Chapter 7 Statistical Inference:  Confidence Intervals

Example: eBay Auctions of Palm Handheld Computers

The 95% confidence interval for the “buy-it-now” option is:

$\bar{x} \pm t_{.025}(s/\sqrt{n}) = 233.57 \pm 2.447(14.64/\sqrt{7})$

which is 233.57 ± 13.54 or (220.03, 247.11)
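The interval for the “buy-it-now” auctions can be recomputed from the seven selling prices listed earlier. This is a sketch using only the standard library; because the standard library has no t-distribution, the t-score 2.447 for df = 6 is entered by hand from a t table, and the function name `t_interval` is illustrative.

```python
import math
from statistics import mean, stdev

# Selling prices for the seven "buy-it-now" auctions in the example.
buy_it_now = [235, 225, 225, 240, 250, 250, 210]

# t-score with right-tail probability 0.025 and df = n - 1 = 6 (taken from a t table).
T_025_DF6 = 2.447


def t_interval(data, t_score):
    """Confidence interval x_bar ± t * s / sqrt(n) for a population mean."""
    n = len(data)
    x_bar = mean(data)
    s = stdev(data)  # sample standard deviation (divides by n - 1)
    margin = t_score * s / math.sqrt(n)
    return x_bar, s, margin, (x_bar - margin, x_bar + margin)


x_bar, s, margin, (lower, upper) = t_interval(buy_it_now, T_025_DF6)
print(f"mean = {x_bar:.2f}, s = {s:.2f}, margin = {margin:.2f}")
print(f"95% CI = ({lower:.2f}, {upper:.2f})")
# prints: mean = 233.57, s = 14.64, margin = 13.54
#         95% CI = (220.03, 247.11)
```

The same function applies to the bidding-only prices once the t-score for df = 17 is looked up.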

Page 55: Chapter 7 Statistical Inference:  Confidence Intervals

Example: eBay Auctions of Palm Handheld Computers

The 95% confidence interval for the mean sales price for the bidding only option is:

(220.70, 242.52)

Page 56: Chapter 7 Statistical Inference:  Confidence Intervals

Example: eBay Auctions of Palm Handheld Computers

Notice that the two intervals overlap a great deal:

• “Buy-it-now”: (220.03, 247.11)

• Bidding only: (220.70, 242.52)

There is not enough information for us to conclude that one probability distribution clearly has a higher mean than the other

Page 57: Chapter 7 Statistical Inference:  Confidence Intervals

How Do We Find a t- Confidence Interval for Other Confidence Levels?

The 95% confidence interval uses $t_{.025}$, since 95% of the probability falls between $-t_{.025}$ and $t_{.025}$

For 99% confidence, the error probability is 0.01 with 0.005 in each tail and the appropriate t-score is $t_{.005}$

Page 58: Chapter 7 Statistical Inference:  Confidence Intervals

If the Population is Not Normal, is the Method “Robust”?

A basic assumption of the confidence interval using the t-distribution is that the population distribution is normal

Many variables have distributions that are far from normal

Page 59: Chapter 7 Statistical Inference:  Confidence Intervals

If the Population is Not Normal, is the Method “Robust”?

How problematic is it if we use the t- confidence interval even if the population distribution is not normal?

Page 60: Chapter 7 Statistical Inference:  Confidence Intervals

If the Population is Not Normal, is the Method “Robust”?

For large random samples, it’s not problematic

The Central Limit Theorem applies: for large n, the sampling distribution is bell-shaped even when the population is not

Page 61: Chapter 7 Statistical Inference:  Confidence Intervals

If the Population is Not Normal, is the Method “Robust”?

What about a confidence interval using the t-distribution when n is small?

Even if the population distribution is not normal, confidence intervals using t-scores usually work quite well

We say the t confidence interval method is robust with respect to the normality assumption

Page 62: Chapter 7 Statistical Inference:  Confidence Intervals

Cases Where the t- Confidence Interval Does Not Work

With binary data

With data that contain extreme outliers

Page 63: Chapter 7 Statistical Inference:  Confidence Intervals

The Standard Normal Distribution is the t-Distribution with df = ∞

Page 64: Chapter 7 Statistical Inference:  Confidence Intervals

The 2002 GSS asked: “What do you think is the ideal number of children in a family?”

The 497 females who responded had a median of 2, mean of 3.02, and standard deviation of 1.81. What is the point estimate of the population mean?

a. 497

b. 2

c. 3.02

d. 1.81

Page 65: Chapter 7 Statistical Inference:  Confidence Intervals

Section 7.4

How Do We Choose the Sample Size for a Study?

Page 66: Chapter 7 Statistical Inference:  Confidence Intervals

How are the Sample Sizes Determined in Polls?

It depends on how much precision is needed as measured by the margin of error

The smaller the margin of error, the larger the sample size must be

Page 67: Chapter 7 Statistical Inference:  Confidence Intervals

Choosing the Sample Size for Estimating a Population Proportion

First, we must decide on the desired margin of error

Second, we must choose the confidence level for achieving that margin of error

In practice, 95% confidence intervals are most common

Page 68: Chapter 7 Statistical Inference:  Confidence Intervals

Example: What Sample Size Do You Need For An Exit Poll?

A television network plans to predict the outcome of an election between two candidates – Levin and Sanchez

They will do this with an exit poll that randomly samples voters on election day

Page 69: Chapter 7 Statistical Inference:  Confidence Intervals

Example: What Sample Size Do You Need For An Exit Poll?

The final poll a week before election day estimated Levin to be well ahead, 58% to 42%

• So the outcome is not expected to be close

The researchers decide to use a sample size for which the margin of error is 0.04

Page 70: Chapter 7 Statistical Inference:  Confidence Intervals

Example: What Sample Size Do You Need For An Exit Poll?

What is the sample size for which a 95% confidence interval for the population proportion has margin of error equal to 0.04?

Page 71: Chapter 7 Statistical Inference:  Confidence Intervals

Example: What Sample Size Do You Need For An Exit Poll?

The 95% confidence interval for a population proportion p is:

$\hat{p} \pm 1.96(se)$

If the sample size is such that 1.96(se) = 0.04, then the margin of error will be 0.04

Page 72: Chapter 7 Statistical Inference:  Confidence Intervals

Example: What Sample Size Do You Need For An Exit Poll?

Find the value of the sample size n for which 0.04 = 1.96(se):

$0.04 = 1.96\sqrt{\hat{p}(1-\hat{p})/n}$

Solve algebraically for n:

$n = (1.96)^2\,\hat{p}(1-\hat{p})/(0.04)^2$

Page 73: Chapter 7 Statistical Inference:  Confidence Intervals

Example: What Sample Size Do You Need For An Exit Poll?

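Using the earlier poll’s estimate $\hat{p} = 0.58$ gives $n = (1.96)^2(0.58)(0.42)/(0.04)^2 \approx 584.9$

A random sample of size n = 585 should give a margin of error of about 0.04 for a 95% confidence interval for the population proportion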
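The exit-poll sample size can be reproduced directly from the formula n = z² p̂(1 − p̂)/m². In this sketch the function name `sample_size_for_proportion` is illustrative, and rounding up to the next whole respondent is an assumption about how to treat a fractional result.

```python
import math


def sample_size_for_proportion(p_guess, margin, z=1.96):
    """Smallest whole n with z * sqrt(p_guess(1 - p_guess)/n) <= margin."""
    return math.ceil(z**2 * p_guess * (1 - p_guess) / margin**2)


# Exit-poll example: guessed proportion 0.58 for Levin, desired margin of error 0.04.
n_needed = sample_size_for_proportion(p_guess=0.58, margin=0.04)
print(f"Required sample size: {n_needed}")
# prints: Required sample size: 585
```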

Page 74: Chapter 7 Statistical Inference:  Confidence Intervals

How Can We Select a Sample Size Without Guessing a Value for the Sample Proportion?

In the formula for determining n, setting $\hat{p} = 0.50$ gives the largest value for n out of all the possible values to substitute for $\hat{p}$

Doing this is the “safe” approach that guarantees we’ll have enough data
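The reason 0.50 is the safe choice is that the product p̂(1 − p̂) in the sample-size formula is largest there. The sketch below tabulates the required n for several guessed proportions at a margin of error of 0.04; the guessed values other than 0.58 are hypothetical, and rounding up is an assumption.

```python
import math


def sample_size_for_proportion(p_guess, margin, z=1.96):
    """Smallest whole n with z * sqrt(p_guess(1 - p_guess)/n) <= margin."""
    return math.ceil(z**2 * p_guess * (1 - p_guess) / margin**2)


margin = 0.04
guesses = (0.10, 0.30, 0.50, 0.58, 0.70, 0.90)
sizes = {p: sample_size_for_proportion(p, margin) for p in guesses}

for p, n in sizes.items():
    print(f"guessed proportion {p:.2f}: n = {n}")

# The safe approach: no guess for the sample proportion yields a larger n than 0.50.
safe_n = sizes[0.50]
print(f"Largest required n occurs at 0.50: {safe_n == max(sizes.values())}")
```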

Page 75: Chapter 7 Statistical Inference:  Confidence Intervals

Sample Size for Estimating a Population Parameter

The random sample size n for which a confidence interval for a population proportion p has margin of error m (such as m = 0.04) is

$n = \hat{p}(1-\hat{p})\,z^2/m^2$

Page 76: Chapter 7 Statistical Inference:  Confidence Intervals

Sample Size for Estimating a Population Parameter

The z-score is based on the confidence level, such as z = 1.96 for 95% confidence

You either guess the value you’d get for the sample proportion based on other information or take the safe approach of setting $\hat{p} = 0.50$

Page 77: Chapter 7 Statistical Inference:  Confidence Intervals

Sample Size for Estimating a Population Mean

The random sample size n for which a 95% confidence interval for a population mean has margin of error approximately equal to m is

$n = 4s^2/m^2$

To use this formula, you guess the value you’ll get for the sample standard deviation, s

Page 78: Chapter 7 Statistical Inference:  Confidence Intervals

Sample Size for Estimating a Population Mean

In practice, since you don’t yet have the data, you don’t know the value of the sample standard deviation, s

You must substitute an educated guess for s

• You can use the sample standard deviation from a similar study

Page 79: Chapter 7 Statistical Inference:  Confidence Intervals

Example: Finding n to Estimate Mean Education in South Africa

A social scientist plans a study of adult South Africans to investigate educational attainment in the black community

How large a sample size is needed so that a 95% confidence interval for the mean number of years of education has margin of error equal to 1 year?

Page 80: Chapter 7 Statistical Inference:  Confidence Intervals

Example: Finding n to Estimate Mean Education in South Africa

No prior information about the standard deviation of educational attainment is available

We might guess that the sample education values fall within a range of about 18 years

Page 81: Chapter 7 Statistical Inference:  Confidence Intervals

Example: Finding n to Estimate Mean Education in South Africa

If the data distribution is bell-shaped, the range from $\bar{x} - 3s$ to $\bar{x} + 3s$ will contain nearly all the distribution

The distance from $\bar{x} - 3s$ to $\bar{x} + 3s$ equals 6s

Solving 18 = 6s for s yields s = 3

So ‘3’ is a crude estimate of s

Page 82: Chapter 7 Statistical Inference:  Confidence Intervals

Example: Finding n to Estimate Mean Education in South Africa

The desired margin of error is m = 1 year

The required sample size is:

$n = 4s^2/m^2 = 4(3)^2/(1)^2 = 36$
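The steps of this example (guess a range, divide by 6 to get a crude s, then apply n = 4s²/m²) can be sketched as follows. The function name `sample_size_for_mean` is illustrative, and rounding up to a whole subject is an assumption.

```python
import math


def sample_size_for_mean(s_guess, margin):
    """Approximate n for a 95% interval: n = 4 s^2 / m^2 (4 is 1.96^2 rounded up)."""
    return math.ceil(4 * s_guess**2 / margin**2)


# Crude estimate of s: a bell-shaped distribution spans about 6 standard deviations.
guessed_range_years = 18
s_guess = guessed_range_years / 6  # 3.0 years

n_needed = sample_size_for_mean(s_guess, margin=1)
print(f"s_guess = {s_guess}, required sample size = {n_needed}")
# prints: s_guess = 3.0, required sample size = 36
```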

Page 83: Chapter 7 Statistical Inference:  Confidence Intervals

What Factors Affect the Choice of the Sample Size?

The first is the desired precision, as measured by the margin of error, m

The second is the confidence level

Page 84: Chapter 7 Statistical Inference:  Confidence Intervals

What Other Factors Affect the Choice of the Sample Size?

A third factor is the variability in the data

• If subjects have little variation (that is, s is small), we need fewer data than if they have substantial variation

A fourth factor is financial

• Cost is often a major constraint

Page 85: Chapter 7 Statistical Inference:  Confidence Intervals

What if You Have to Use a Small n?

The t methods for a mean are valid for any n

• However, you need to be extra cautious to look for extreme outliers or great departures from the normal population assumption

Page 86: Chapter 7 Statistical Inference:  Confidence Intervals

What if You Have to Use a Small n?

In the case of the confidence interval for a population proportion, the method works poorly for small samples

Page 87: Chapter 7 Statistical Inference:  Confidence Intervals

Constructing a Small-Sample Confidence Interval for a Proportion

Suppose a random sample does not have at least 15 successes and 15 failures

The confidence interval formula

$\hat{p} \pm z\sqrt{\hat{p}(1-\hat{p})/n}$

● Is still valid if we use it after adding ‘2’ to the original number of successes and ‘2’ to the original number of failures

● This results in adding ‘4’ to the sample size n
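The small-sample adjustment described above can be sketched in a few lines. The counts used here (3 successes in 20 trials) are hypothetical and chosen only because they fail the “at least 15 successes and 15 failures” condition; the function name `plus_four_ci` is likewise illustrative.

```python
import math


def plus_four_ci(successes, n, z=1.96):
    """Interval computed after adding 2 successes and 2 failures to the data."""
    adj_successes = successes + 2
    adj_n = n + 4  # 2 extra successes + 2 extra failures
    p_adj = adj_successes / adj_n
    se = math.sqrt(p_adj * (1 - p_adj) / adj_n)
    return p_adj, (p_adj - z * se, p_adj + z * se)


# Hypothetical small sample: 3 successes and 17 failures.
p_adj, (lower, upper) = plus_four_ci(successes=3, n=20)
print(f"adjusted proportion = {p_adj:.3f}, 95% CI = ({lower:.3f}, {upper:.3f})")
```

The adjusted proportion here is 5/24, so the interval is centered slightly closer to 0.5 than the raw sample proportion 3/20.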