
π: ESTIMATES, CONFIDENCE INTERVALS, AND TESTS

Business Statistics

The CLT for π

Estimating proportion

Hypothesis on the proportion

Old exam question

Further study

CONTENTS

▪ Estimates, confidence intervals, and hypothesis tests for $\mu$ are based on the central limit theorem
  ▪ and therefore on the normal distribution
▪ For $\sigma^2$ we needed another distribution
  ▪ the $\chi^2$-distribution
▪ What to use for $\pi$?
  ▪ the probability of success in a Bernoulli experiment
▪ Based on sampling theory
  ▪ so, repeated Bernoulli experiments
  ▪ so, a binomial distribution
  ▪ and for large $n$, approximately a normal distribution (→ CLT)

THE CLT FOR π

Define $X_i$ as the outcome (0 or 1) in one Bernoulli experiment

▪ Total number of “1”s in $n$ Bernoulli experiments
  ▪ $Y = \sum_{i=1}^{n} X_i$
▪ Average number of “1”s (due to CLT, with binomial results):
  ▪ $P = \frac{Y}{n} = \bar{X} \sim N\left(\mu_X, \frac{\sigma_X^2}{n}\right) = N\left(\pi, \frac{\pi(1-\pi)}{n}\right)$
  ▪ provided $n\pi \ge 5$ and $n(1-\pi) \ge 5$

THE CLT FOR π
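The normal approximation for $P = Y/n$ can be checked with a small simulation. The sketch below is only an illustration: the values $\pi = 0.3$, $n = 100$, the number of repetitions, and the helper name `simulate_proportions` are assumptions chosen for the example, not part of the slides.

```python
import math
import random
import statistics

def simulate_proportions(pi, n, reps, seed=0):
    """Return `reps` simulated sample proportions P = Y/n (hypothetical helper)."""
    rng = random.Random(seed)
    props = []
    for _ in range(reps):
        # Y counts the "1"s in n independent Bernoulli(pi) experiments
        y = sum(1 for _ in range(n) if rng.random() < pi)
        props.append(y / n)
    return props

pi, n = 0.3, 100  # illustrative values, not taken from the slides
# rule of thumb from the slide: n*pi >= 5 and n*(1 - pi) >= 5
assert n * pi >= 5 and n * (1 - pi) >= 5

props = simulate_proportions(pi, n, reps=5000)
sim_mean = statistics.mean(props)
sim_sd = statistics.pstdev(props)
theory_sd = math.sqrt(pi * (1 - pi) / n)  # sqrt of the variance pi(1 - pi)/n

print(f"simulated mean of P: {sim_mean:.4f} (theory: {pi})")
print(f"simulated sd of P:   {sim_sd:.4f} (theory: {theory_sd:.4f})")
```

With these settings the simulated mean and standard deviation should land close to $\pi$ and $\sqrt{\pi(1-\pi)/n}$, which is what the CLT result predicts.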

$P$ is the estimator of $\pi$; a concrete estimate is $p$

Estimator:
▪ for $\mu$: $\bar{X} \sim N\left(\mu_X, \frac{\sigma_X^2}{n}\right)$
▪ for $\pi$: $P \sim N\left(\pi, \frac{\pi(1-\pi)}{n}\right)$

Point estimate:
▪ for $\mu$: $\hat{\mu} = \bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$, with observation $x_i \in \mathbb{R}$
▪ for $\pi$: $\hat{\pi} = p = \frac{1}{n}\sum_{i=1}^{n} x_i$, with observation $x_i = 0$ or $1$

Standard error of estimate:
▪ for $\mu$: $\sigma_{\bar{X}} = \frac{\sigma_X}{\sqrt{n}}$
▪ for $\pi$: $\sigma_P = \sqrt{\frac{\pi(1-\pi)}{n}}$

THE CLT FOR π
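The two standard-error formulas can be tabulated for a few sample sizes. This is a minimal sketch under assumed inputs: $\sigma_X = 2$ and $\pi = 0.5$ are illustrative values, and the function names `se_mean` and `se_proportion` are hypothetical.

```python
import math

def se_mean(sigma, n):
    """Standard error of the sample mean: sigma_X / sqrt(n)."""
    return sigma / math.sqrt(n)

def se_proportion(pi, n):
    """Standard error of the sample proportion: sqrt(pi(1 - pi)/n)."""
    return math.sqrt(pi * (1 - pi) / n)

sigma, pi = 2.0, 0.5  # illustrative values, not taken from the slides
for n in (25, 100, 400):
    print(f"n={n:4d}  se_mean={se_mean(sigma, n):.4f}  "
          f"se_proportion={se_proportion(pi, n):.4f}")
```

Because $n$ enters only through $\sqrt{n}$, quadrupling the sample size halves each standard error.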

Both standard errors decrease with $n$

▪ Estimating $\pi$ by $p$
  ▪ and estimating $\sigma_P = \sqrt{\frac{\pi(1-\pi)}{n}}$ by $s_P = \sqrt{\frac{p(1-p)}{n}}$
  ▪ standard error of proportion
▪ So, we have for $\pi$
  ▪ a point estimate $p = \frac{y}{n}$
  ▪ an interval estimate $\left(p - z_{\alpha/2}\sqrt{\frac{p(1-p)}{n}},\; p + z_{\alpha/2}\sqrt{\frac{p(1-p)}{n}}\right)$
▪ $1-\alpha$ confidence interval for $\pi$
  ▪ $p - z_{\alpha/2}\sqrt{\frac{p(1-p)}{n}} \le \pi \le p + z_{\alpha/2}\sqrt{\frac{p(1-p)}{n}}$

ESTIMATING PROPORTION

Example

Context: a sample of 75 retail in-store purchases showed that 24 were paid in cash. Give a 95% confidence interval for $\pi$.

▪ $p = \frac{y}{n} = \frac{24}{75} = 0.32$; this is the point estimate for $\pi$
▪ standard error of the estimate:
  ▪ $s_P = \sqrt{\frac{p(1-p)}{n}} = \sqrt{\frac{0.32(1-0.32)}{75}} = 0.054$
▪ $CI_{\pi,0.95}$:
  ▪ $(0.32 - 1.96 \times 0.054,\; 0.32 + 1.96 \times 0.054) = (0.214,\; 0.426)$
  ▪ or: $0.214 \le \pi \le 0.426$
  ▪ or: $0.32 \pm 0.106$

ESTIMATING PROPORTION

Check validity: $np \ge 5$ and $n(1-p) \ge 5$
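The cash-payment example, including the validity check, can be reproduced in a few lines. This is a sketch rather than a reference implementation: the function name `proportion_ci` is hypothetical, and it uses the exact normal quantile from Python's `statistics.NormalDist` instead of the rounded table value 1.96.

```python
import math
from statistics import NormalDist

def proportion_ci(y, n, confidence=0.95):
    """Normal-approximation confidence interval for a proportion (hypothetical helper).

    Returns the point estimate p, the standard error s_P, and (lower, upper).
    """
    p = y / n
    # validity check from the slide: n*p >= 5 and n*(1 - p) >= 5
    if n * p < 5 or n * (1 - p) < 5:
        raise ValueError("normal approximation not justified for this sample")
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # z_{alpha/2}
    s_p = math.sqrt(p * (1 - p) / n)
    return p, s_p, (p - z * s_p, p + z * s_p)

# Example from the slide: 24 cash payments in a sample of 75 purchases
p, s_p, (lower, upper) = proportion_ci(24, 75)
print(f"p = {p:.2f}, s_P = {s_p:.3f}, 95% CI = ({lower:.3f}, {upper:.3f})")
```

Rounded to three decimals, this gives the same interval as the hand calculation, (0.214, 0.426).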

You flip a coin 100 times and get heads 45 times. Give a 95% confidence interval for $\pi_{head}$.

EXERCISE 1

Test a hypothesis on the proportion of a Bernoulli process

▪ Example:
  ▪ you are a police officer
  ▪ you wonder whether fewer than 50% of the (one-sided) traffic accidents involve a female driver

HYPOTHESES ON THE PROPORTION

▪ Statistical model
  ▪ each accident has an underlying Bernoulli process of happening to a man (0) or to a woman (1), $X \sim \mathrm{alt}(\pi)$
  ▪ you observe the next $n = 5$ car accidents, and report the outcomes (0/1)
  ▪ you define $Y$ as the number of accidents caused by a woman
  ▪ the sequence of 5 observations can be regarded as a binomial process, $Y \sim \mathrm{Bin}(\pi, 5)$
  ▪ you start by assuming the accident rates are equal, i.e., hypothesize that $\pi = 0.5$
▪ Suppose you observed $y = 1$, i.e., one car accident caused by a woman

HYPOTHESES ON THE PROPORTION

▪ Step 1:
  ▪ $H_0: \pi \ge 0.5$; $H_1: \pi < 0.5$; $\alpha = 0.05$
▪ Step 2:
  ▪ sample statistic: $Y$ = #female; reject for “too small” values
▪ Step 3:
  ▪ if $H_0$ is just true, $Y \sim \mathrm{Bin}(0.5, 5)$; no assumptions required
▪ Step 4:
  ▪ $p$-value $= P_{\mathrm{Bin}(0.5,5)}(Y \le 1) = P(Y = 0) + P(Y = 1) = 0.03125 + 0.15625 = 0.1875$
▪ Step 5:
  ▪ $p$-value $> \alpha$; do not reject $H_0$; there is not sufficient evidence for concluding that $\pi < 0.5$

HYPOTHESES ON THE PROPORTION
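The exact binomial $p$-value in step 4 can be computed directly from the binomial formula. The sketch below uses only the Python standard library; the helper names `binom_pmf` and `binom_cdf` are hypothetical.

```python
from math import comb

def binom_pmf(k, n, pi):
    """P(Y = k) for Y ~ Bin(pi, n)."""
    return comb(n, k) * pi**k * (1 - pi)**(n - k)

def binom_cdf(k, n, pi):
    """P(Y <= k) for Y ~ Bin(pi, n)."""
    return sum(binom_pmf(j, n, pi) for j in range(k + 1))

n, y, pi0, alpha = 5, 1, 0.5, 0.05

# Left-sided test: small values of Y speak against H0: pi >= 0.5
p_value = binom_cdf(y, n, pi0)
reject = p_value < alpha

print(f"P(Y=0) = {binom_pmf(0, n, pi0):.5f}")
print(f"P(Y=1) = {binom_pmf(1, n, pi0):.5f}")
print(f"p-value = {p_value:.4f}, reject H0: {reject}")
```

The exact value is $6/32 = 0.1875$, well above $\alpha = 0.05$, so $H_0$ is not rejected.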

What if we have a large sample, say $n = 100$?
▪ binomial tables and hand calculation with the binomial formula are no longer practical

Use normal approximation
▪ if $Y \sim \mathrm{Bin}(\pi, n)$ then, approximately, $Z = \frac{Y - n\pi}{\sqrt{n\pi(1-\pi)}} \sim N(0,1)$
▪ conditions: $n\pi \ge 5$ and $n(1-\pi) \ge 5$: OK

Example
▪ same as before (car accidents by gender)
▪ but now based on $n = 100$
  ▪ with $y = 40$ observed accidents by women

HYPOTHESES ON THE PROPORTION

▪ Step 1:
  ▪ $H_0: \pi \ge 0.5$; $H_1: \pi < 0.5$; $\alpha = 0.05$
▪ Step 2:
  ▪ sample statistic: $Y$ = #female; reject for “too small” values
▪ Step 3:
  ▪ if $H_0$ is just true, $Z = \frac{Y - n\pi}{\sigma_Y} = \frac{Y - n\pi}{\sqrt{n\pi(1-\pi)}} \sim N(0,1)$
  ▪ normal approximation OK ($n\pi \ge 5$ and $n(1-\pi) \ge 5$)
▪ Step 4:
  ▪ $z_{calc} = \frac{40 - 100 \times 0.5}{\sqrt{100 \times 0.5(1-0.5)}} = -2.00$ (see, however, next page!)
  ▪ $z_{crit} = -1.645$
▪ Step 5:
  ▪ $z_{calc} < z_{crit}$; reject $H_0$, accept $H_1$; there is sufficient evidence for concluding that $\pi < 0.5$

HYPOTHESES ON THE PROPORTION
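Steps 3 to 5 of the large-sample test can be sketched as follows. The function name `proportion_z_test_lower` is hypothetical, and the sketch deliberately leaves out the continuity correction, as the calculation above does.

```python
import math
from statistics import NormalDist

def proportion_z_test_lower(y, n, pi0, alpha=0.05):
    """Left-sided z-test for a proportion, without continuity correction."""
    # conditions for the normal approximation
    if n * pi0 < 5 or n * (1 - pi0) < 5:
        raise ValueError("normal approximation not justified")
    sigma_y = math.sqrt(n * pi0 * (1 - pi0))  # standard deviation of Y under H0
    z_calc = (y - n * pi0) / sigma_y
    z_crit = NormalDist().inv_cdf(alpha)      # lower-tail critical value
    p_value = NormalDist().cdf(z_calc)
    return z_calc, z_crit, p_value

# Example from the slide: y = 40 accidents by women out of n = 100
z_calc, z_crit, p_value = proportion_z_test_lower(40, 100, 0.5)
print(f"z_calc = {z_calc:.2f}, z_crit = {z_crit:.3f}, p-value = {p_value:.4f}")
print("reject H0" if z_calc < z_crit else "do not reject H0")
```

Comparing $z_{calc}$ with $z_{crit}$ and comparing the $p$-value with $\alpha$ are equivalent decision rules; both lead to rejecting $H_0$ here.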

▪ Note:
  ▪ we forgot about the continuity correction
  ▪ a slightly more accurate result can be achieved with the continuity correction
▪ Example:
  ▪ $P(Y \le 40) \approx P\left(Y \le 40\tfrac{1}{2}\right) = P\left(Z \le \frac{40\frac{1}{2} - 100 \times 0.5}{\sqrt{100 \times 0.5 \times (1-0.5)}}\right) = P(Z \le -1.9) < 0.05$
▪ When needed?
  ▪ not when $p$-value $= 0.002$ or $p$-value $= 0.743$
  ▪ but required in cases like the example, when $p$-value $\approx \alpha$

HYPOTHESES ON THE PROPORTION
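To see what the continuity correction buys, the sketch below compares three versions of $P(Y \le 40)$ under $\pi = 0.5$ and $n = 100$: the plain normal approximation, the continuity-corrected one, and the exact binomial sum. The exact sum is added here only as a benchmark and is not part of the slides.

```python
import math
from math import comb
from statistics import NormalDist

n, y, pi0 = 100, 40, 0.5
mu = n * pi0                          # mean of Y under H0
sd = math.sqrt(n * pi0 * (1 - pi0))   # standard deviation of Y under H0
approx = NormalDist(mu, sd)

p_plain = approx.cdf(y)               # no correction: z = -2.0
p_corrected = approx.cdf(y + 0.5)     # continuity correction: z = -1.9
# exact binomial probability P(Y <= 40), used as the benchmark
p_exact = sum(comb(n, k) * pi0**k * (1 - pi0)**(n - k) for k in range(y + 1))

print(f"plain normal approximation: {p_plain:.4f}")
print(f"with continuity correction: {p_corrected:.4f}")
print(f"exact binomial:             {p_exact:.4f}")
```

In this example the corrected value is much closer to the exact binomial probability than the uncorrected one, although all three stay below $\alpha = 0.05$.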

21 May 2015, Q1m

OLD EXAM QUESTION

Doane & Seward 5/E 11.1-11.2

Tutorial exercises week 5

confidence intervals

hypothesis tests (binomial)

hypothesis tests (normal)

FURTHER STUDY