Chapter 11 Problems of Estimation 11.1 Estimation of means 11.2 Estimation of means (unknown...

Post on 19-Dec-2015


Chapter 11 Problems of Estimation

11.1 Estimation of means

11.2 Estimation of means (unknown variance)

11.3 Skip

11.4 Estimation of proportions

11.1 The Estimation of Means

How do we estimate the population mean μ and standard deviation σ from sample data x1, x2, …, xn?

We usually use the sample mean $\bar{x}$ to estimate μ and the sample standard deviation s to estimate σ.

$\bar{x}$ and s are called point estimates.

Point estimate of the mean

• For a given sample, the sample mean, which is the point estimate of the population mean, is a single number.

• Since sample means fluctuate from sample to sample, we must expect an error.

• A point estimate alone does not tell us about the possible size of the error.

Interval Estimate—Confidence intervals

• An interval estimate consists of an interval which will contain the quantity it is supposed to estimate with a specified probability (or degree of confidence).

• Recall that for large random samples from infinite populations, the sampling distribution of the mean is approximately a normal distribution with

$\mu_{\bar{x}} = \mu$ and $\sigma_{\bar{x}} = \sigma/\sqrt{n}$

• So we will use some properties of the normal distribution to explain a confidence interval.

For a standard normal curve

Define $z_{\alpha/2}$ to be such that $P(Z > z_{\alpha/2}) = \alpha/2$. Hence the area under the standard normal curve between $-z_{\alpha/2}$ and $z_{\alpha/2}$ is equal to $1-\alpha$.

α/2          0.10    0.05    0.025   0.010   0.005
z_{α/2}      1.282   1.645   1.96    2.326   2.576

[Figure: standard normal curve with central area 1-α between -z_{α/2} and z_{α/2}]
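The tabulated critical values can be reproduced numerically. The following is a minimal sketch using Python's standard library; the helper name `z_critical` is an assumption introduced here for illustration and is not part of the slides.

```python
from statistics import NormalDist

def z_critical(tail_area):
    """Return z such that P(Z > z) = tail_area for a standard normal Z."""
    # inv_cdf gives the point with the requested area to its LEFT,
    # so we ask for 1 - tail_area.
    return NormalDist().inv_cdf(1.0 - tail_area)

# Upper-tail areas alpha/2 used in the table above.
for tail in (0.10, 0.05, 0.025, 0.010, 0.005):
    print(f"alpha/2 = {tail:<6} z = {z_critical(tail):.3f}")
```

Rounded to three decimals, these values should agree with the table.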

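For X normal with mean μ and standard deviation σ,

With probability 1-α, $\bar{x}$ deviates from μ by no more than $z_{\alpha/2}\,(\sigma/\sqrt{n})$.

This is called the maximum error of estimate with probability 1-α:

$E = z_{\alpha/2}\,\dfrac{\sigma}{\sqrt{n}}$

[Figure: distribution of $\bar{x}$ with central area 1-α between $\mu - z_{\alpha/2}(\sigma/\sqrt{n})$ and $\mu + z_{\alpha/2}(\sigma/\sqrt{n})$]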

For X normal with mean μ and standard deviation σ,

The probability is 0.95 that $\bar{x}$ will differ from μ by at most

$E = 1.96\,\dfrac{\sigma}{\sqrt{n}}$

that is, it will be "off" either way by at most 1.96 (approximately 2) standard errors of the mean.

[Figure: distribution of $\bar{x}$ with central area 0.95 between $\mu - 1.96(\sigma/\sqrt{n})$ and $\mu + 1.96(\sigma/\sqrt{n})$, roughly $\mu \pm 2(\mathrm{se})$]

Maximum error E with probability 1-α

• With probability 0.95, $\bar{x}$ deviates from μ by no more than

$E = 1.96\,\dfrac{\sigma}{\sqrt{n}}$

(approximately 2 standard errors away from the true value)

Probability    Maximum error E
0.80           $1.282\,\sigma/\sqrt{n}$
0.90           $1.645\,\sigma/\sqrt{n}$
0.95           $1.96\,\sigma/\sqrt{n}$
0.99           $2.576\,\sigma/\sqrt{n}$

Maximum error E with probability 1-α

• The maximum error depends on both the confidence level and the sample size!

• You can determine the sample size according to the confidence level and the maximum error.
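To see how the maximum error responds to the confidence level and the sample size, here is a small sketch. The function name `max_error` is an assumption made for this illustration; σ = 0.025 and n = 25 are borrowed from Example 11.1, and n = 100 is a hypothetical second sample size.

```python
import math
from statistics import NormalDist

def max_error(sigma, n, confidence):
    """Maximum error E = z_{alpha/2} * sigma / sqrt(n) for known sigma."""
    alpha = 1.0 - confidence
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    return z * sigma / math.sqrt(n)

sigma = 0.025  # known population standard deviation (value from Example 11.1)
for confidence in (0.80, 0.90, 0.95, 0.99):
    for n in (25, 100):  # n = 100 is a hypothetical comparison value
        E = max_error(sigma, n, confidence)
        print(f"confidence = {confidence:.2f}  n = {n:<4} E = {E:.4f}")
```

Raising the confidence level widens E, while quadrupling n halves it.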

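Sample size for estimating μ

• How large must our sample be to keep our error no more than E with probability 1-α?

$E = z_{\alpha/2}\,\dfrac{\sigma}{\sqrt{n}} \;\Rightarrow\; \sqrt{n} = \dfrac{z_{\alpha/2}\,\sigma}{E} \;\Rightarrow\; n = \left(\dfrac{z_{\alpha/2}\,\sigma}{E}\right)^{2} = \dfrac{z_{\alpha/2}^{2}\,\sigma^{2}}{E^{2}}$

As σ² increases, n increases.

As E decreases, n increases.

As our error probability α decreases, n increases.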
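The three monotonicity statements can be checked numerically. This is a sketch only: the function name `sample_size` and the input values (σ = 1.0 or 2.0, E = 0.2 or 0.1) are hypothetical and chosen purely for illustration.

```python
import math
from statistics import NormalDist

def sample_size(sigma, E, confidence):
    """Smallest n with z_{alpha/2} * sigma / sqrt(n) <= E (always round up)."""
    z = NormalDist().inv_cdf(1.0 - (1.0 - confidence) / 2.0)
    return math.ceil((z * sigma / E) ** 2)

base = sample_size(sigma=1.0, E=0.2, confidence=0.95)
larger_sigma = sample_size(sigma=2.0, E=0.2, confidence=0.95)    # sigma up
smaller_E = sample_size(sigma=1.0, E=0.1, confidence=0.95)       # E down
smaller_alpha = sample_size(sigma=1.0, E=0.2, confidence=0.99)   # alpha down

print(base, larger_sigma, smaller_E, smaller_alpha)
```

Each of the three changes should produce a required n larger than the baseline.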

Confidence Interval for Means

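After computing the sample mean $\bar{x}$, find a range of values such that 95% of the time the resulting range includes the true value μ.

For known σ and $\bar{x}$ normal,

$P\left(-1.96 < \dfrac{\bar{x}-\mu}{\sigma/\sqrt{n}} < 1.96\right) = 0.95$

$P\left(\bar{x} - 1.96\,\dfrac{\sigma}{\sqrt{n}} < \mu < \bar{x} + 1.96\,\dfrac{\sigma}{\sqrt{n}}\right) = 0.95$

or $\bar{x} \pm 1.96\,\sigma/\sqrt{n}$ is a 95% confidence interval for μ.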

Degree of Confidence

The degree of confidence states the probability that the interval will give a correct answer.

• If you use 95% confidence intervals often, in the long run 95% of your intervals will contain the true parameter value.

• When the method is applied once, you do not know if your interval gave a correct value (95% of the time) or not (5% of the time).

Example 11.1

• Suppose we measure specific gravity of a metal, and σ=0.025.

• Send each of you into the lab to take n=25 measurements:

$\sigma_{\bar{x}} = \dfrac{\sigma}{\sqrt{n}} = \dfrac{0.025}{5} = 0.005$

• 95% CI for the mean:

$\bar{x} - 1.96(0.005) < \mu < \bar{x} + 1.96(0.005)$

• If the true value is 2, then about 95% of students will find this is true:

$\bar{x} - 1.96(0.005) < 2 < \bar{x} + 1.96(0.005)$
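The "about 95% of students" claim can be illustrated with a small simulation. This is a sketch under stated assumptions: measurements are taken to be normally distributed around the true value 2 with σ = 0.025, and the function name `coverage`, the number of simulated students (2000), and the random seed are all hypothetical choices.

```python
import math
import random
from statistics import NormalDist, fmean

def coverage(mu, sigma, n, trials, confidence=0.95, seed=0):
    """Fraction of simulated intervals xbar +/- E that contain mu."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    z = NormalDist().inv_cdf(1.0 - (1.0 - confidence) / 2.0)
    E = z * sigma / math.sqrt(n)
    hits = 0
    for _ in range(trials):
        # One "student": n normally distributed measurements.
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        xbar = fmean(sample)
        if xbar - E < mu < xbar + E:
            hits += 1
    return hits / trials

result = coverage(mu=2.0, sigma=0.025, n=25, trials=2000)
print(f"Fraction of intervals containing the true value: {result:.3f}")
```

The printed fraction should land close to 0.95, although any single student still cannot tell whether their own interval is one of the hits.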

Confidence Intervals

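100(1-α)% CI: $\bar{X} \pm z_{\alpha/2}\,\sigma/\sqrt{n}$

80%    $\bar{X} \pm 1.282\,\sigma/\sqrt{n}$
90%    $\bar{X} \pm 1.645\,\sigma/\sqrt{n}$
95%    $\bar{X} \pm 1.96\,\sigma/\sqrt{n}$
99%    $\bar{X} \pm 2.576\,\sigma/\sqrt{n}$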

Example 11.2

• X = breaking strength of a fish line. σ=0.10. In a random sample of size n=10, $\bar{x} = 10.3$. Find a 95% confidence interval for μ, the true average breaking strength.

Solution:

• Standard error of the mean: $\sigma_{\bar{x}} = \dfrac{\sigma}{\sqrt{n}} = \dfrac{0.10}{\sqrt{10}} = 0.0316$

• Critical value = 1.96; maximum error is $E = 1.96\,\sigma_{\bar{x}} = 0.062$

• CI: $10.3 \pm 0.062$, from 10.24 to 10.36

Example 11.2 (continued)

• How large a sample size is needed in order to get a maximum error of no more than 0.01 with 95% probability if the sample mean is used to estimate the true mean?

• Solution

$n = \dfrac{z_{\alpha/2}^{2}\,\sigma^{2}}{E^{2}} = \dfrac{1.96^{2}\,(0.10)^{2}}{0.01^{2}} = 384.16$

n=385, always round up!
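Both parts of Example 11.2 can be reproduced in a few lines. This is a sketch using the numbers given in the example and the table value z = 1.96; the variable names are assumptions made for illustration.

```python
import math

# Values stated in Example 11.2.
sigma, n, xbar = 0.10, 10, 10.3
z = 1.96  # table value for 95% confidence

# Part 1: 95% confidence interval for the mean.
se = sigma / math.sqrt(n)          # standard error of the mean
E = z * se                         # maximum error
lower, upper = xbar - E, xbar + E
print(f"se = {se:.4f}, E = {E:.3f}, CI = ({lower:.2f}, {upper:.2f})")

# Part 2: sample size needed for a maximum error of 0.01.
target_E = 0.01
n_exact = (z * sigma / target_E) ** 2
n_needed = math.ceil(n_exact)      # always round up
print(f"n = {n_exact:.2f}, so use n = {n_needed}")
```

Rounding up rather than to the nearest integer matters here: n = 384 would leave the maximum error slightly above 0.01.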

11.2 Estimation of Means (unknown variance)

• A sample of size n: x1, x2, …, xn from a normal population with mean μ and standard deviation σ.

• If σ is known, with probability 1-α,

$\bar{x} - z_{\alpha/2}\,\sigma/\sqrt{n} < \mu < \bar{x} + z_{\alpha/2}\,\sigma/\sqrt{n}$

If σ is unknown

• Estimate σ by the sample standard deviation s

• The estimated standard error of the mean will be

$SE = s/\sqrt{n}$

• Using the estimated standard error we have a confidence interval of

$\bar{x} \pm (\_\_\_\_)\,\dfrac{s}{\sqrt{n}}$

• The multiplier needs to be bigger than $z_{\alpha/2}$ (e.g., 1.96). The confidence interval needs to be wider to take into account the added uncertainty in using s to estimate σ.

• The correct multipliers were figured out by a Guinness Brewery worker.

What is the correct multiplier? “t”

• 100(1-α)% confidence interval when σ is unknown:

$\bar{x} \pm t_{\alpha/2}\,(s/\sqrt{n})$

• 95% CI (α = 0.05) when σ is unknown:

$\bar{x} \pm t_{0.025}\,(s/\sqrt{n})$

Properties of t distribution

• The value of $t_{\alpha/2}$ depends on how much information we have about σ. The amount of information we have about σ depends on the sample size.

• The information is “degrees of freedom” and for a sample from one normal population this will be: df=n-1.

t curve and z curve

Both the standard normal curve N(0,1) (the z distribution), and all t(k) distributions are density curves, symmetric about a mean of 0, but t distributions have more probability in the tails.

As the sample size increases, this extra tail probability decreases and the t distribution more closely approximates the z distribution. By n = 1000 they are virtually indistinguishable from one another.

Critical values of t distribution

• t table is given in the book (p. 497)

• It depends on the degrees of freedom as well: $P(t > t_{\alpha}) = \alpha$

df    alpha    t
5     0.10     1.476
10    0.05     1.812
20    0.01     2.528
25    0.025    2.060

Areas under the curve

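• The area between $-t_{\alpha/2}$ and $t_{\alpha/2}$ is 1-α:

$P(-t_{\alpha/2} < t < t_{\alpha/2}) = 1-\alpha$

$P\left(-t_{\alpha/2} < \dfrac{\bar{x}-\mu}{s/\sqrt{n}} < t_{\alpha/2}\right) = 1-\alpha$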

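Confidence interval for the mean when σ is unknown

• With probability 1-α,

$\bar{x} - t_{\alpha/2}\,\dfrac{s}{\sqrt{n}} < \mu < \bar{x} + t_{\alpha/2}\,\dfrac{s}{\sqrt{n}}$

• Maximum error

$E = t_{\alpha/2}\,\dfrac{s}{\sqrt{n}}$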

Example (ex. 11.16, p 273)

• Noise level, n=12

74.0 78.6 76.8 75.5 73.8 75.6

77.3 75.8 73.9 70.2 81.0 73.9

1. Point estimate for the average noise level of vacuum cleaners;

2. 95% Confidence interval

Solution

• n=12, $\bar{x} = 75.53$, $s = 2.75$

• Critical value with df=11: $t_{0.025} = 2.201$

• 95% CI:

$75.53 \pm 2.201\,\dfrac{2.75}{\sqrt{12}} = 75.53 \pm 1.75$

$73.78 < \mu < 77.28$
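The noise-level calculation can be verified from the raw data. This sketch uses only the standard library, so the critical value 2.201 is taken from the t table (df = 11) rather than computed; the variable names are assumptions made for illustration.

```python
import math
from statistics import mean, stdev

# Noise levels of the 12 vacuum cleaners from the example.
noise = [74.0, 78.6, 76.8, 75.5, 73.8, 75.6,
         77.3, 75.8, 73.9, 70.2, 81.0, 73.9]

n = len(noise)
xbar = mean(noise)            # point estimate of the average noise level
s = stdev(noise)              # sample standard deviation (divides by n - 1)
t_crit = 2.201                # t_{0.025} with df = 11, read from the t table
E = t_crit * s / math.sqrt(n) # maximum error

print(f"n = {n}, xbar = {xbar:.2f}, s = {s:.2f}, E = {E:.2f}")
print(f"95% CI: {xbar - E:.2f} to {xbar + E:.2f}")
```

Because this code carries unrounded intermediate values, the printed lower bound can differ in the last digit from the slide's 73.78, which was obtained from the rounded figures 75.53 and 1.75.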

11.4 The Estimation of Proportions

• Notation:

1. μ, σ: mean and standard deviation

2. p proportion=probability of a success

Consider count data:

n=# of trials, p=probability of a success

Estimate of p

• Xi = 0 or 1, with probability 1-p or p

• Mean of Xi = p: population mean

• X = sum of Xi

• Sample proportion (mean): X/n estimates p

$\hat{p} = \dfrac{X}{n}$

Example 11.4

• Toss a coin 100 times and you get 45 heads

• Estimate p=probability of getting a head

Solution:

$\hat{p} = \dfrac{45}{100} = 0.45$

Is the coin a balanced one?

Estimate of p

If np≥5 and n(1-p)≥5, then $\hat{p}$ is approximately normal with

$\mu_{\hat{p}} = p$ and $\sigma_{\hat{p}} = \sqrt{\dfrac{p(1-p)}{n}}$

where $\hat{p}$ = sample proportion.

Maximum error

• We have (1-α)100% confidence that the error in our estimate is at most

$E = z_{\alpha/2}\sqrt{\dfrac{p(1-p)}{n}} \;\le\; z_{\alpha/2}\sqrt{\dfrac{\frac{1}{2}\cdot\frac{1}{2}}{n}}$

(worst case is p=1/2.)

CI

• An approximate 100(1-α)% confidence interval for p is

$\hat{p} \pm z_{\alpha/2}\sqrt{\dfrac{\hat{p}(1-\hat{p})}{n}}$

Sample Size

• The sample size required to have probability 1-α that our error is no more than E is

$n = p(1-p)\left(\dfrac{z_{\alpha/2}}{E}\right)^{2} \;\le\; \dfrac{1}{2}\cdot\dfrac{1}{2}\left(\dfrac{z_{\alpha/2}}{E}\right)^{2}$

Since p is unknown, you have to estimate it in the formula.

Maximize p(1-p) to get the sample size

• If you don't have any prior information about p, then

Maximum p(1-p)=1/4

$n = \dfrac{z_{\alpha/2}^{2}}{4E^{2}}$

If you know p is somewhere …

• If $p \le 0.3$, then

maximum p(1-p)=0.3(1-0.3)=0.21

$n = \dfrac{0.21\,z_{\alpha/2}^{2}}{E^{2}}$

• If $p \ge 0.6$, then

maximum p(1-p)=0.6(1-0.6)=0.24

$n = \dfrac{0.24\,z_{\alpha/2}^{2}}{E^{2}}$

How to estimate the maximum

• Estimate p(1-p) by substituting for p the value in its known range that is closest to 0.5:

p in (0, 0.1): use p=0.1

p in (0.3, 0.4): use p=0.4

p in (0.6, 1.0): use p=0.6
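The "closest to 0.5" rule amounts to clamping 0.5 into the known range for p. Here is a minimal sketch; the function name `worst_case_p` is an assumption made for illustration.

```python
def worst_case_p(low, high):
    """Value in [low, high] closest to 0.5, where p(1-p) is largest."""
    # Clamp 0.5 into the interval: if 0.5 lies inside, it is returned as is.
    return min(max(0.5, low), high)

# The three ranges listed above, plus one that contains 0.5.
for low, high in [(0.0, 0.1), (0.3, 0.4), (0.6, 1.0), (0.2, 0.8)]:
    p = worst_case_p(low, high)
    print(f"range ({low}, {high}): use p = {p}, p(1-p) = {p * (1 - p):.2f}")
```

This works because p(1-p) rises toward p = 0.5 from either side, so the endpoint nearest 0.5 gives the largest value in the range.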

Example 11.4 (continued)

• 95% CI for p

$0.45 \pm 1.96\sqrt{\dfrac{0.45(0.55)}{100}} = 0.45 \pm 0.0975$

• 0.3525<p<0.5475 with 95% probability
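The coin-toss interval can be reproduced as follows. This is a sketch using the table value z = 1.96; the function name `proportion_ci` and its return layout are assumptions made for illustration.

```python
import math

def proportion_ci(successes, n, z=1.96):
    """Approximate CI for p: p_hat +/- z * sqrt(p_hat * (1 - p_hat) / n)."""
    p_hat = successes / n
    E = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat, E, p_hat - E, p_hat + E

# Example 11.4: 45 heads in 100 tosses.
p_hat, E, lower, upper = proportion_ci(45, 100)
print(f"p_hat = {p_hat:.2f}, E = {E:.4f}, CI = ({lower:.4f}, {upper:.4f})")

# Rough check of the normal-approximation condition, using p_hat in place of p.
print("approximation reasonable:", 100 * p_hat >= 5 and 100 * (1 - p_hat) >= 5)
```

Since 0.5 lies inside the interval, these 100 tosses do not rule out a balanced coin.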

Example 11.5 (example 11.13 in text)

• A state highway dept wants to estimate what proportion of all trucks operating between two cities carry too heavy a load

• 95% probability to assert that the error is no more than 0.04

• Sample size needed if

1. p is between 0.10 and 0.25

2. we have no idea what p is

Solution

1. E=0.04, p=0.25, $z_{0.025} = 1.96$

$n = \dfrac{0.25(0.75)\,(1.96)^{2}}{(0.04)^{2}} = 450.19$

Round up to get n=451

2. E=0.04, p(1-p)=1/4, $z_{0.025} = 1.96$

$n = \dfrac{(1.96)^{2}}{4\,(0.04)^{2}} = 600.25$

n=601
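Both sample sizes in this solution follow from one formula. This is a sketch with the table value z = 1.96; the function name `sample_size_proportion` and its default arguments are assumptions made for illustration.

```python
import math

def sample_size_proportion(E, p=0.5, z=1.96):
    """Smallest n with z * sqrt(p(1-p)/n) <= E; p = 0.5 is the worst case."""
    return math.ceil(p * (1 - p) * (z / E) ** 2)

# Case 1: p known to lie between 0.10 and 0.25, so use p = 0.25
# (the value in that range closest to 0.5).
n_informed = sample_size_proportion(E=0.04, p=0.25)

# Case 2: no information about p, so use the worst case p = 0.5.
n_worst = sample_size_proportion(E=0.04)

print(n_informed, n_worst)
```

Knowing only that p is at most 0.25 saves 150 observations compared with the worst case.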