Lecture 8 Notes

ECON1203/ECON2292 Business and Economic Statistics Week 8

Transcript of Lecture 8 Notes

Page 1: Lecture 8 Notes

ECON1203/ECON2292 Business and Economic Statistics

Week 8

Page 2: Lecture 8 Notes

Week 8 topics
- Hypothesis testing
- Type I & type II errors
- Test about the mean when population variance is known
- p-values

Key references
- Keller 11, esp. 11.1-11.2

Page 3: Lecture 8 Notes

Basic inference problems

Estimation
- Process of obtaining reliable information about population parameters
- Could be point or interval estimates

Hypothesis testing
- In many applications we want to answer questions like:
  - On average, does a customer of CCResort spend more than $255 per day?
  - Is the earth warming up?
  - Did sales increase after the advertising?
- i.e., we have a belief/hypothesis about a population parameter of interest
- Need to “test” whether the data support that belief/hypothesis

Page 4: Lecture 8 Notes

Hypothesis testing examples and concepts

The judicial system
- Defendant is presumed innocent (hypothesis). Note: not the other way around.
- Look for evidence to reject that hypothesis
  - If enough evidence then reject (defendant found guilty), otherwise do not reject (defendant goes free)
- Given the evidence, defendant is found either not guilty or guilty
- Possible for these verdicts to be in error
  - Sometimes a guilty person goes free
  - Sometimes a ‘miscarriage of justice’: innocent person jailed
- Concepts in statistical inference are similar to those in the judicial system

Page 5: Lecture 8 Notes

Hypothesis testing examples and concepts…

Business example: Quality control at McDonald’s
- “Quarter-pounder with cheese” is presumed to comprise 0.25 pounds (0.11 kg) of precooked meat
- Given data (sample of hamburgers), is McDonald’s claim correct or not?

Null hypothesis (denote by H0)
- Some statement about a population parameter
- Let X be weight of precooked meat with mean µ
- Then null hypothesis is H0: µ = 0.25

Alternative hypothesis (denote by H1)
- Will depend on the research objective
- Some possibilities here:
  - H1: µ ≠ 0.25, two-tailed hypothesis (Truth in advertising)
  - H1: µ < 0.25, one (lower) tailed hypothesis (Trading standards agency)

Page 6: Lecture 8 Notes

Hypothesis testing examples and concepts…

A statistical test uses information from the data to make a decision where there are just TWO possibilities
- Do not reject the null hypothesis, or
- Reject the null hypothesis (in favour of the alternative)

Type I errors occur when we reject a true null hypothesis
- Denote P(Type I error) = α
- Also called the significance level

Type II errors occur when we don’t reject a false null hypothesis
- Denote P(Type II error) = β

Which of these errors does our criminal judicial system seek to minimize?

Page 7: Lecture 8 Notes

Hypothesis testing examples and concepts…

Statistical tests are designed such that P(Type I error) is controlled (made small), but not P(Type II error). Therefore:
- When we reject the null, we are “confident”
- When we do not reject the null, it means we don’t have sufficient evidence in the data to refute it

If McDonald’s claim is not rejected, does that prove McDonald’s quarter-pounders do contain a quarter pound of meat?
- No!

If a defendant is found not guilty, does it prove the defendant is innocent?
- No, it means strictly that the evidence is not strong enough to consider the person to be guilty

Page 8: Lecture 8 Notes

Hypothesis testing examples and concepts…

How are data used to test the null hypothesis?
- Proceed by comparing a test statistic with the value specified in H0 & decide whether differences are:
  - Small enough to be attributable to random sampling errors: do not reject H0, or
  - So large that it is more likely that H0 is not correct: reject H0

Formally define a rejection (or critical) region
- Values of the test statistic that are so extreme they lead us to reject H0 in favour of H1
- Other values of the test statistic that are not so extreme lie in the non-critical region
- The value that separates the critical region from the non-critical region is called the critical value

Page 9: Lecture 8 Notes

Quality control at McDonald’s
- Consider H0: µ = 0.25, H1: µ < 0.25
- A sample of 25 hamburgers produces sample mean weights (in pounds!) of: (a) 0.24 (b) 0.23 (c) 0.28 (d) 0.21
- Which of these represent evidence against H0?
- Which of these would lead you to reject H0?
- For which are you most likely to reject H0?

Page 10: Lecture 8 Notes

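Quality control at McDonald’s…

Assume weight of hamburger meat is $X \sim N(\mu, \sigma^2)$

$H_0: \mu = 0.25 \qquad H_1: \mu < 0.25$

Need to determine the rejection region, which will be: Reject $H_0$ if $\bar{X} < x_L$.

Then P(type I error), computed when the null is true, is

$\alpha = P(\bar{X} < x_L \mid \mu = 0.25) = P\left(Z < \frac{x_L - 0.25}{\sigma/\sqrt{n}}\right)$

Thus

$\frac{x_L - 0.25}{\sigma/\sqrt{n}} = -z_\alpha \;\Rightarrow\; x_L = 0.25 - z_\alpha \frac{\sigma}{\sqrt{n}}$

Let $\alpha = 0.05$, $n = 25$, $\sigma = 0.05$; then $z_{0.05} = 1.645$ and $x_L = 0.2336$.

If (a) $\bar{X} = 0.24$ then would not reject $H_0$. But (b) $\bar{X} = 0.23$ would lead to rejection of $H_0$.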
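The cutoff on this slide can be checked numerically. The following is a minimal sketch, assuming Python's standard-library `statistics.NormalDist`; the names `x_L` and `reject_h0` are illustrative and not part of the lecture.

```python
from math import sqrt
from statistics import NormalDist

# Values from the slide: hypothesized mean, population sd, sample size, significance level.
mu0, sigma, n, alpha = 0.25, 0.05, 25, 0.05

# z_alpha is the point with upper-tail area alpha under N(0, 1).
z_alpha = NormalDist().inv_cdf(1 - alpha)

# Lower-tail cutoff for the sample mean: x_L = mu0 - z_alpha * sigma / sqrt(n).
x_L = mu0 - z_alpha * sigma / sqrt(n)

def reject_h0(sample_mean: float) -> bool:
    """Reject H0: mu = 0.25 in favour of H1: mu < 0.25 when the mean is below x_L."""
    return sample_mean < x_L

print(f"z_alpha = {z_alpha:.3f}, x_L = {x_L:.4f}")
for label, xbar in [("a", 0.24), ("b", 0.23)]:
    print(label, xbar, "reject H0" if reject_h0(xbar) else "do not reject H0")
```

Working with the cutoff for the sample mean, rather than the standardized statistic, keeps the decision in the original units (pounds).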

Page 11: Lecture 8 Notes

Quality control at McDonald’s…

The critical region can be written as

Reject $H_0$ if $Z < -z_\alpha$, where $Z = \frac{\bar{X} - 0.25}{\sigma/\sqrt{n}}$

Note: Choice of significance level matters!
- Suppose α = 0.01; the new critical region is Z < -2.33 or, in terms of the sample mean, the cutoff is 0.2267 rather than 0.2336
- Does it make sense that the two critical values are less than before, when α = 0.05?
- Thus, case (b) with sample mean of 0.23 would now not lead to rejection of the null hypothesis
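The effect of the significance level can be sketched in a few lines. The code below recomputes the sample-mean cutoff for α = 0.05 and α = 0.01 and, to illustrate the trade-off with Type II error, evaluates β at a true mean of 0.24. That value of 0.24 is an assumption made purely for illustration; the slides do not specify an alternative. Function names are also hypothetical.

```python
from math import sqrt
from statistics import NormalDist

mu0, sigma, n = 0.25, 0.05, 25        # values from the McDonald's example
se = sigma / sqrt(n)                  # standard error of the sample mean
std_normal = NormalDist()

def lower_cutoff(alpha: float) -> float:
    """Sample-mean cutoff below which H0: mu = mu0 is rejected against H1: mu < mu0."""
    return mu0 - std_normal.inv_cdf(1 - alpha) * se

def type_ii_prob(alpha: float, true_mu: float) -> float:
    """P(do not reject H0) when the true mean is true_mu (a hypothetical value)."""
    return 1 - NormalDist(true_mu, se).cdf(lower_cutoff(alpha))

for alpha in (0.05, 0.01):
    cutoff = lower_cutoff(alpha)
    beta = type_ii_prob(alpha, true_mu=0.24)  # 0.24 is assumed, not from the slides
    print(f"alpha={alpha}: cutoff={cutoff:.4f}, "
          f"reject at 0.23? {0.23 < cutoff}, beta={beta:.3f}")
```

Under these assumptions, the smaller α pushes the cutoff further from 0.25, which makes rejection harder and β larger.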

Page 12: Lecture 8 Notes

One or two tail tests?

Quality control at McDonald’s
- What are null & alternative hypotheses from McDonald’s perspective?

“It is expected that you will spend at least ten hours per week studying this course.”
- What are null & alternative hypotheses here from lecturer’s perspective?

Page 13: Lecture 8 Notes

One or two tail tests?

What if we want to test H0: µ = µ0 vs H1: µ > µ0?
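For the upper-tailed case the rejection region mirrors the lower-tailed McDonald’s test. A sketch, assuming as before a normal population with known σ and a sample of size n:

```latex
\[
Z = \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}}, \qquad
\text{reject } H_0 \text{ if } Z > z_\alpha
\;\Longleftrightarrow\;
\bar{X} > \mu_0 + z_\alpha \frac{\sigma}{\sqrt{n}} .
\]
```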

Page 14: Lecture 8 Notes

One or two tail tests?

What if we want to test H0: µ = µ0 vs H1: µ ≠ µ0

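For the two-tailed case the significance level is split equally between the two tails. A sketch under the same assumptions (normal population, known σ, sample size n):

```latex
\[
Z = \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}}, \qquad
\text{reject } H_0 \text{ if } |Z| > z_{\alpha/2},
\qquad \text{e.g. } \alpha = 0.05 \;\Rightarrow\; z_{0.025} = 1.96 .
\]
```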

Page 15: Lecture 8 Notes

Procedure for solving hypothesis testing problems
1. State the hypotheses
   - The claim that we are looking for evidence in data to support should be stated as the alternative hypothesis
2. Find the critical value(s)
   - Depend on the significance level α
3. Compute the test statistic
4. Make the decision to reject or not reject the null hypothesis
5. Summarize the results
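The five steps can be sketched as a small function for the case covered this week (test about the mean, population variance known). This is an illustrative sketch only: the function name `z_test`, its arguments and its return format are assumptions, not part of the lecture.

```python
from math import sqrt
from statistics import NormalDist

def z_test(xbar, mu0, sigma, n, alpha=0.05, tail="two"):
    """Test H0: mu = mu0 with known sigma; tail is 'lower', 'upper' or 'two'."""
    std_normal = NormalDist()
    # Step 2: critical value, which depends on alpha and on the form of H1.
    if tail == "two":
        crit = std_normal.inv_cdf(1 - alpha / 2)
    elif tail in ("lower", "upper"):
        crit = std_normal.inv_cdf(1 - alpha)
    else:
        raise ValueError("tail must be 'lower', 'upper' or 'two'")
    # Step 3: standardized test statistic.
    z = (xbar - mu0) / (sigma / sqrt(n))
    # Step 4: decision rule for the chosen alternative.
    if tail == "lower":
        reject = z < -crit
    elif tail == "upper":
        reject = z > crit
    else:
        reject = abs(z) > crit
    # Step 5: summarize the results.
    return {"z": z, "critical_value": crit, "reject_h0": reject}

# Step 1 (stating the hypotheses) is up to the caller:
# here H0: mu = 0.25 vs H1: mu < 0.25, with a sample mean of 0.23.
result = z_test(xbar=0.23, mu0=0.25, sigma=0.05, n=25, alpha=0.05, tail="lower")
print(result)
```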

Page 16: Lecture 8 Notes

Skills test
- Human Resources (HR) for a company has conducted skills tests for some time
  - Past experience indicates test scores (out of 10) for job applicants are X ~ N(6.6, 1)
- A new skills course is offered to prospective employees
  - HR wants to know whether such applicants tend to perform better than the norm in the test
  - What is your response given a sample of 25 applicants with such a course, if they completed the test with an average score of 7.09 & if α = 0.05?

Page 17: Lecture 8 Notes

Skills test…

Let $\mu$ be mean score for those taking course

$H_0: \mu = 6.6 \qquad H_1: \mu > 6.6$

Given $\alpha = .05$, reject $H_0$ if: $z > z_{.05} = 1.645$

Standardized test statistic: $z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}} = \frac{7.09 - 6.6}{1/\sqrt{25}} = 2.46$

Decision: Reject $H_0$; evidence supports effectiveness of course
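The skills-test decision can be reproduced with a short calculation. This is a sketch using the figures quoted on the slides; variable names are illustrative. Computed directly from the rounded inputs 7.09 and 6.6, the statistic comes out at about 2.45 rather than the slide's 2.46, presumably a rounding difference, and the decision is the same either way.

```python
from math import sqrt
from statistics import NormalDist

mu0, sigma = 6.6, 1.0          # past applicants: X ~ N(6.6, 1)
xbar, n, alpha = 7.09, 25, 0.05

z = (xbar - mu0) / (sigma / sqrt(n))        # standardized test statistic
z_crit = NormalDist().inv_cdf(1 - alpha)    # upper-tail critical value
reject_h0 = z > z_crit

print(f"z = {z:.2f}, critical value = {z_crit:.3f}, reject H0: {reject_h0}")
```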

Page 18: Lecture 8 Notes

p-values

How do I choose the significance level α?
- No rules
- Conventional choices are α = 0.1 or 0.05 or 0.01
- In McDonald’s example we saw it could matter

Why do I have to choose a particular α?
- You don’t
- Calculate the “empirical significance level” or p-value

p-value is the probability of obtaining a value of the test statistic more extreme than that observed, given that the null hypothesis is true
- “More extreme” depends on form of alternative hypothesis
- For skills test it would be probability of a statistic larger than 7.09

Page 19: Lecture 8 Notes

p-values…

$p\text{-value} = P(\bar{X} > 7.09) = P\left(\frac{\bar{X} - \mu}{\sigma/\sqrt{n}} > \frac{7.09 - 6.60}{1/5}\right) = P(Z > 2.46) = .0069$

- Thus very unlikely to find such an extreme value for mean test scores given H0
- There is strong evidence to reject H0
- Alternatively, for any choice of significance level α greater than .0069 we would reject H0
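The p-value calculation can be sketched as follows, assuming the sampling distribution of the sample mean under H0 is N(6.6, 1/25). Variable names are illustrative. Computed from the rounded inputs the result is roughly 0.007, close to the slide's .0069; the small gap comes from the statistic being about 2.45 rather than 2.46.

```python
from math import sqrt
from statistics import NormalDist

mu0, sigma, n, xbar = 6.6, 1.0, 25, 7.09

# Sampling distribution of the sample mean under H0: N(mu0, sigma^2 / n).
xbar_dist = NormalDist(mu0, sigma / sqrt(n))

# Upper-tailed alternative, so "more extreme" means a sample mean above 7.09.
p_value = 1 - xbar_dist.cdf(xbar)
print(f"p-value = {p_value:.4f}")

# H0 is rejected at any significance level larger than the p-value.
for alpha in (0.10, 0.05, 0.01, 0.005):
    print(alpha, "reject H0" if p_value < alpha else "do not reject H0")
```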

Page 20: Lecture 8 Notes

SIA: Student progress in BES

Student outcomes potentially influenced by
- University resources such as IT & library facilities
- Quality of teaching staff
- Quality of fellow students - peer effects

Take the results from first feedback quiz as an indicator of student quality
- How good is this modelling assumption likely to be?

Past results over several years taken as population distribution

Page 21: Lecture 8 Notes

SIA: Student progress in BES...

Key features of past results for FBQ#1
- On average students do well
  - Median = 9, mean = 8.5, only 5.3% with mark < 5, st.dev. = 1.88
- Distribution is “non-normal” as “piles up” in right tail (skew to left)
  - 22% with mark of 9 & 41% with 10

[Figure: Feedback Quiz # 1: Population distribution. Relative frequency (vertical axis, 0.00 to 0.45) against Mark (horizontal axis, 0 to 10).]

Page 22: Lecture 8 Notes

SIA: Student progress in BES...

Q1: What if you randomly chose a BES student to help you with your work?
- What is the probability that their test mark is at least 9?

Q2: Define a “good” tutorial to be one where the mean mark over 20 students is at least 9
- What is the probability that you’re in a “good” tutorial?

Q3: Is the 2014 BES cohort any different from past?
- Test this hypothesis if a randomly selected tutorial of size 25 has a mean mark of 8.8

Page 23: Lecture 8 Notes

SIA: Student progress in BES...

Let X denote the marks in the quiz
- Take relative frequency distribution of X to be population distribution (non-normal, remember)
- Thus µ = 8.5, σ = 1.88

Q1: Use population distribution:

$P(X \ge 9) = P(X = 9) + P(X = 10) = 0.22 + 0.41 = 0.63$

Q2: Assume by CLT $\bar{X} \sim N(8.5, 1.88^2/20)$:

$P(\bar{X} \ge 9) = P\left(\frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \ge \frac{9 - 8.5}{1.88/\sqrt{20}}\right) = P(Z \ge 1.19) = 0.117$
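Both answers can be checked with a short calculation. This is a sketch, with illustrative variable names. Q1 uses the population relative frequencies directly, while Q2 relies on the CLT normal approximation for the mean of 20 marks, which is only approximate given the skewed population.

```python
from math import sqrt
from statistics import NormalDist

# Q1: read directly off the population distribution of marks.
p_mark_9, p_mark_10 = 0.22, 0.41
p_q1 = p_mark_9 + p_mark_10

# Q2: CLT approximation for the mean of n = 20 marks.
mu, sigma, n = 8.5, 1.88, 20
xbar_dist = NormalDist(mu, sigma / sqrt(n))
p_q2 = 1 - xbar_dist.cdf(9)

print(f"Q1: P(X >= 9) = {p_q1:.2f}")
print(f"Q2: P(mean >= 9) is approximately {p_q2:.3f}")
```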

Page 24: Lecture 8 Notes

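SIA: Student progress in BES...

Q3: $H_0: \mu = 8.5 \qquad H_1: \mu \ne 8.5$

By CLT $\bar{X} \sim N(8.5, 1.88^2/25)$; now calculate p-value:

$p = 2 \times P(\bar{X} > 8.8) = 2 \times P\left(\frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \ge \frac{8.8 - 8.5}{1.88/\sqrt{25}}\right) = 2 \times P(Z \ge 0.80) = 2 \times 0.2119 = 0.4238$

$\Rightarrow$ at say $\alpha = 0.01$, insufficient evidence to reject $H_0$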
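The two-tailed p-value for Q3 can be sketched the same way, assuming the CLT approximation for a tutorial of 25; variable names are illustrative. Without rounding z to 0.80 the result is about 0.425, marginally different from the slide's 0.4238, and the conclusion is unchanged.

```python
from math import sqrt
from statistics import NormalDist

mu0, sigma, n, xbar = 8.5, 1.88, 25, 8.8

z = (xbar - mu0) / (sigma / sqrt(n))

# Two-tailed alternative: double the tail area beyond |z|.
p_value = 2 * (1 - NormalDist().cdf(abs(z)))

alpha = 0.01
print(f"z = {z:.2f}, p-value = {p_value:.4f}")
print("reject H0" if p_value < alpha else "insufficient evidence to reject H0")
```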

Page 25: Lecture 8 Notes

Progress report #5
- Have procedures to test hypotheses in a range of circumstances
- Need to consider situations where population variance (as well as population mean) is unknown
  - Leads us to t distribution
- Need to recognize that the same concepts apply to situations other than estimating the mean
  - Leads us to confidence intervals & hypothesis testing for the population proportion