Lecture 11 Notes - Asal Aslemand -...

1 Lecture 11 Notes Chapter 17. More About Tests Chapter 18. Inferences About Means


Page 1: Lecture 11 Notes - Asal Aslemand - Welcomeasalaslemand.weebly.com/.../1/31310805/lecture_11_notes.pdf · 2019-01-20 · 1 Lecture 11 Notes Chapter 17. ... Recall the example in lecture


Lecture 11 Notes

➢ Chapter 17. More About Tests

➢ Chapter 18. Inferences About Means

Important Ideas from Lecture 10

• All hypothesis tests boil down to the same question: “Is an observed difference or pattern too large to be attributed to chance?”

• We measure “how large” by putting our sample results in the context of a sampling distribution model (e.g., the Normal model or the 𝑡 distribution, which we will learn later in this lecture).

Steps in conducting Hypothesis Testing:

1. State the null and the alternative hypothesis.

2. Check the necessary assumptions.

3. Identify the test-statistic. Find the value of the test-statistic.

4. Find the p-value of the test-statistic.

5. State (if any) a conclusion.

Important Ideas from Lecture 10

P-value:

• It is a conditional probability.

• It is not the probability that Ho (null hypothesis: current belief) is true.

• It is: P(observed statistic value [or even more extreme] | Ho). We condition on Ho (the null hypothesis) because Ho gives the parameter values that we need to find the required probability.

• The p-value serves as a measure of the strength of the evidence against the null hypothesis (but it should not serve as a hard and fast rule for decision).

• If the p-value = 0.03 (for example), all we can say is that, if Ho is true, there is a 3% chance of observing the statistic value we actually observed (or one even more inconsistent with the null value).

More About P-values

Recall the example in lecture 10 in which we investigated whether there is evidence to suggest that the true

proportion 𝑝 of all Canadians who worked at a job or business at anytime (between July 2007 and June 2008),

regardless of the number of hours per week, was less than 50%.

• The p-value of the observed test statistic was less than 0.0001. With this p-value, all we can say is that if the true percentage of all Canadians who worked at a job or business at anytime (between July 2007 and June 2008), regardless of the number of hours per week, was 50%, then the probability of observing a sample percentage at least as low as the one we observed in a sample like this is less than 1 chance in 10,000.

More About P-values

Example:

A New England Journal of Medicine paper reported that the seven-year risk of heart attack in diabetes patients

taking the drug Avandia was increased from the baseline of 20.2% to an estimated risk of 28.9%.

• This study estimated the p-value: $P(\hat{p} \geq 28.9\% \mid p = 20.20\%) = 0.03$

This p-value means that a heart attack rate at least as high as the one they observed could be expected in 3% of similar experiments, even if there were no increased risk from taking Avandia.

• An earlier study had estimated the seven-year risk to be 26.9% and thus reported the p-value of:

$P(\hat{p} \geq 26.9\% \mid p = 20.20\%) = 0.27$

This p-value means that a heart attack rate at least as high as the one they observed could be expected in 27% of similar experiments, even if there were no increased risk from taking Avandia. This is not remarkable enough to reject Ho: $p = 20.20\%$. In other words, this study was not convincing.

More About P-values

• Big p-values just mean that what we have observed is not surprising. That is, the results are in line with our assumption that the null hypothesis models the world, so we have no reason to reject it.

• A big p-value does not prove that the null hypothesis is true.

• When we see a big p-value, all we can say is: we cannot reject Ho (we fail to reject Ho) – we cannot conclude Ha

(We have no evidence to support Ha).

Example of Hypothesis Testing for a Proportion

In the 1980s, it was generally believed that congenital abnormalities affect 5% of the nation’s children. Some people believe that the increase in the number of chemicals in the environment in recent years has led to an increase in the incidence of abnormalities. A recent study examined 384 children and found that 46 of them showed signs of abnormality. Is this strong evidence that the risk has increased?

We will aim at stating and/or answering the following:

a. Write appropriate hypotheses.

b. Check the necessary assumptions.

c. Identify the test-statistic. Find the value of the test-statistic.

d. Find the p-value of the test-statistic and explain its meaning in the context of this problem.

e. Give (if any) a conclusion.

f. Do environmental chemicals cause congenital abnormalities?

Example of Hypothesis Testing for a Proportion

a. Write appropriate Hypotheses (State the Null and the Alternative Hypotheses)

Ho: 𝑝 = 0.05 versus Ha: 𝑝 > 0.05

b. Check the Necessary Assumptions:

• Independence Assumption:

There is no reason to think that one child having genetic abnormalities would affect the probability that other children have them.

• Randomization Condition:

This sample may not be random, but genetic abnormalities are plausibly independent. The sample is probably representative of all children, with regards to genetic abnormalities.

• 10% Condition:

The sample of 384 children is less than 10% of all children.

• Success/Failure Condition:

np = (384)(0.05) = 19.2 and n(1 − 𝑝) = (384)(0.95) = 364.8 are both greater than 10, so the sample is large enough.

Example of Hypothesis Testing

c. Identify the test statistic. Find the value of the test statistic.

We showed in part b that the conditions have been satisfied, so a Normal model can be used to model the sampling distribution of the sample proportion. That is:

$\hat{p}$ is approximately normal with mean $p = 0.05$ and standard error $\sigma_{\hat{p}} = \sqrt{\frac{p(1-p)}{n}} = \sqrt{\frac{0.05(1-0.05)}{384}} = 0.0111$

Under Ho: $p = 0.05$, the test statistic has a standard normal (Z) distribution.

Thus, we perform a one-proportion z-test:

$\hat{p} = \frac{46}{384} = 0.1198$ is the estimated proportion of children with genetic abnormalities

$Z = \frac{\hat{p} - p}{\sigma_{\hat{p}}} = \frac{0.1198 - 0.05}{\sqrt{\frac{0.05(1-0.05)}{384}}} = \frac{0.1198 - 0.05}{0.0111} \cong 6.28$

The value of Z is approximately 6.28, meaning that the observed proportion of children with genetic abnormalities is over 6 standard deviations above the hypothesized proportion ($p = 0.05$).

Example of Hypothesis Testing

d. Find the p-value of the test-statistic and explain its meaning in the context of this problem.

P-value = P(Z > 6.28) ≅ 0.000

Note: We find the area above the Z of 6.28 since Ha: 𝑝 > 0.05

If 5% of children have genetic abnormalities, the chance of observing 46 or more children with genetic abnormalities in a random sample of 384 children is almost 0.
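The test above can be reproduced numerically. The following is a minimal sketch, assuming only the Python standard library; the function names `normal_upper_tail` and `one_proportion_z_test` are hypothetical helpers introduced for illustration, not part of the lecture.

```python
import math

def normal_upper_tail(z):
    """P(Z > z) for a standard normal Z, via the complementary error function."""
    return 0.5 * math.erfc(z / math.sqrt(2))

def one_proportion_z_test(successes, n, p0):
    """Upper-tail one-proportion z-test of Ho: p = p0 versus Ha: p > p0."""
    p_hat = successes / n
    sd_null = math.sqrt(p0 * (1 - p0) / n)  # standard deviation of p-hat under Ho
    z = (p_hat - p0) / sd_null
    return p_hat, z, normal_upper_tail(z)

n, successes, p0 = 384, 46, 0.05

# Success/Failure Condition: both expected counts under Ho should be at least 10.
success_failure_ok = n * p0 >= 10 and n * (1 - p0) >= 10

p_hat, z, p_value = one_proportion_z_test(successes, n, p0)
print(f"conditions ok: {success_failure_ok}")
print(f"p-hat = {p_hat:.4f}, z = {z:.2f}, p-value = {p_value:.3g}")
```

Because the sketch does not round intermediate values, its z can differ in the last digit from the hand calculation.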

Example of Hypothesis Testing

e. Give (if any) a conclusion.

With a P-value this low, we reject the null hypothesis. There is very strong evidence that more than 5% of children have genetic abnormalities.

f. Do environmental chemicals cause congenital abnormalities?

We do not know that environmental chemicals cause genetic abnormalities. We merely have evidence that suggests that a greater percentage of children are diagnosed with genetic abnormalities now, compared to the 1980s.

Using 95% CI for a Proportion

• The p-value in the previous example was extremely small (less than 0.0001). That is strong evidence to suggest that more than 5% of children have genetic abnormalities. However, it does not say that the percentage of children with genetic abnormalities is “a lot more than 5%”. That is, the p-value by itself says nothing about how much greater the percentage might be. The confidence interval provides that information.

• To assess the difference in practical terms, we should also construct a confidence interval:

95% CI for $p$: $\hat{p} \pm ME = \hat{p} \pm (Z^* \times SE(\hat{p}))$

= 0.1198 ± (1.96 × 0.0166)

= 0.1198 ± 0.0325 = (0.0873, 0.1523)

Recall that $\hat{p} = \frac{46}{384} = 0.1198$

Z* for a 95% interval: Z* = 1.96

$SE(\hat{p}) = \sqrt{\frac{\hat{p}(1-\hat{p})}{n}} = \sqrt{\frac{0.1198(1-0.1198)}{384}} = 0.0166$

Interpretation:

We are 95% confident that the true percentage of children with genetic abnormalities is between 8.73% and 15.23%.
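As a check on the arithmetic, here is a minimal sketch of the same interval in Python (standard library only). The helper name `wald_interval` is hypothetical; the interval it computes is the usual $\hat{p} \pm Z^* \times SE(\hat{p})$ form used above.

```python
import math

def wald_interval(successes, n, z_star=1.96):
    """Confidence interval p-hat +/- z* x SE(p-hat), with SE based on p-hat."""
    p_hat = successes / n
    se = math.sqrt(p_hat * (1 - p_hat) / n)  # estimated standard error
    margin = z_star * se                     # margin of error (ME)
    return p_hat - margin, p_hat + margin, se, margin

lower, upper, se, margin = wald_interval(46, 384)
print(f"SE = {se:.4f}, ME = {margin:.4f}")
print(f"95% CI: ({lower:.4f}, {upper:.4f})")
```

The result agrees with the hand calculation up to rounding in the last digit.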

Same Example: CI and Hypothesis Testing for a Proportion (Two-sided Test) – in R

Ho: $p = 0.05$

Ha: $p \neq 0.05$

$Z = \sqrt{39.377} \cong 6.28$ (take the positive root because the difference between $\hat{p} = 0.1198$ and $p = 0.05$ is positive: +0.0698)

P-value = $3.494 \times 10^{-10}$ < 0.0001. The p-value is less than 𝛼 = 0.05.

We reject Ho and conclude Ha. We have very strong evidence to conclude that more than 5% of all children have genetic abnormalities.

95% CI for $p$: (9.1%, 15.6%). We are 95% confident that the true percentage of all children that have genetic abnormalities is between approximately 9.1% and 15.6%. Since both endpoints of this CI are above the hypothesized value of $p = 0.05$ (5%), we can further infer, with 95% confidence, that this true percentage is more than 5%.
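The software figures can be approximated by hand. The sketch below (Python, standard library only) computes the two-sided p-value from the Normal model and, as an assumption about what the software did, a score (Wilson) interval without continuity correction. That interval type is a guess made here because it happens to land near (9.1%, 15.6%); the notes do not state which method was used. The names `two_sided_p_value` and `wilson_interval` are hypothetical.

```python
import math

def two_sided_p_value(z):
    """Two-sided Normal p-value: P(|Z| >= |z|)."""
    return math.erfc(abs(z) / math.sqrt(2))

def wilson_interval(successes, n, z_star=1.96):
    """Score (Wilson) interval for a proportion, without continuity correction."""
    p_hat = successes / n
    denom = 1 + z_star**2 / n
    center = (p_hat + z_star**2 / (2 * n)) / denom
    half_width = z_star * math.sqrt(
        p_hat * (1 - p_hat) / n + z_star**2 / (4 * n**2)
    ) / denom
    return center - half_width, center + half_width

n, successes, p0 = 384, 46, 0.05
p_hat = successes / n
z = (p_hat - p0) / math.sqrt(p0 * (1 - p0) / n)

p_two_sided = two_sided_p_value(z)
score_lower, score_upper = wilson_interval(successes, n)

print(f"z^2 = {z**2:.3f}, two-sided p-value = {p_two_sided:.3e}")
print(f"score interval: ({score_lower:.3f}, {score_upper:.3f})")
```

The squared z statistic is what links the value 39.377 to Z ≅ 6.28, and doubling the one-sided tail area gives the two-sided p-value.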

Decision Errors in Tests

• When Ho is true, a Type I error occurs if Ho is rejected.

The probability of making a type I error is denoted by 𝛼.

• When Ho is false, a Type II error occurs if Ho is not rejected.

The probability of making a type II error is denoted by 𝛽.

Example of Decision Errors in Tests

In medical disease testing, the null hypothesis is usually the assumption that a person is healthy. The alternative is

that the person has the disease we are testing for.

Ho: Healthy versus Ha: Infected

• Type I error: Reject Ho when it is true.

A Type I error is a false positive: A healthy person is diagnosed with the disease.

That is, the person must undergo further testing.

• Type II error: Fail to reject Ho (“Accept Ho”) when it is false.

A Type II error is a false negative in which an infected person is diagnosed as disease-free.

That is, a sick person goes untreated.

For example: If a new treatment is being tested for a disease (e.g., epilepsy), a Type I error will lead to future

patients getting a useless treatment; a Type II error means a useful treatment will remain undiscovered.

Example of Decision Errors in Tests

Jury Trial:

Ho: Innocent versus Ha: Guilty

• Type I error: Reject Ho when it is true.

A Type I error occurs if the jury convicts an innocent person.

• Type II error: Fail to reject Ho (“Accept Ho”) when it is false.

A Type II error occurs if the jury fails to convict a guilty person.

What type of error could we be making in our example?

In the 1980s, it was generally believed that congenital abnormalities affect 5% of the nation’s children. Some people believe that the increase in the number of chemicals in the environment in recent years has led to an increase in the incidence of abnormalities. A recent study examined 384 children and found that 46 of them showed signs of abnormality. Is this strong evidence that the risk has increased?

Ho: 𝑝 = 0.05 versus Ha: 𝑝 > 0.05

𝑍 ≅ 6.28, 𝑝-value < 0.0001 (which is less than 𝛼 = 0.05). We reject Ho and conclude Ha.

This means we could be making a Type I error. We decided that the true percentage is more than 5% based on our data (as evidence against Ho); however, it could be that the hypothesized value of 5% is true (i.e., 𝐻0: 𝑝 = 0.05 could be true).

What type of error could we be making in our example?

Suppose it was claimed that the percentage of adult Canadians who worked at a job or business at

anytime (between July 2007 and June 2008), regardless of the number of hours per week, was 50%. Of

the 4,756 respondents, 1,581 indicated that they worked at a job or business at anytime (between July

2007 and June 2008), regardless of the number of hours per week . Is there evidence to suggest that the

true proportion 𝑝 is greater than 0.50?

Ho: 𝑝 = 0.50 Ha: 𝑝 > 0.50

Z = - 23.02

P-value = P(Z > -23.02) ≅ 1, P-value > 𝛼 = 0.05; We Fail to Reject Ho; We cannot conclude Ha.

This means we could be making a Type II error. We indicated that there is no evidence to conclude that the true percentage of adult Canadians who worked at a job or business at anytime (between July 2007 and June 2008), regardless of the number of hours per week, was more than 50%. This conclusion implies that Ho: 𝑝 = 0.50 is plausible, but that might not be the case.

What about making a correct decision? Power of Test

The power of a test is the probability of correctly rejecting Ho when it is false: P(reject Ho | Ho is false)

Power = 1 – Beta = 1 – P(fail to reject Ho | Ho is false)

Note: The complement of “reject Ho” is “fail to reject Ho”, and for any events A and B: P(A|B) = 1 − P(Aᶜ|B)

• When we think about power, we imagine the null hypothesis is false.

• The value of power depends on how far the truth lies from the value we hypothesize.

• We call this distance between the null hypothesis value (for example) 𝑝0 and the truth 𝑝, the effect size.

• The effect size is unknown, of course, since it involves the true p.

• But, we can estimate the effect size as the difference between the null value and the observed estimate.

• The effect size is central to how we think about the power of hypothesis test.

• A larger effect is easier to see and results in larger power.

• Small effects are difficult to detect. They will result in more Type II errors and therefore lower power.

• The power of the test depends on both the size of the effect and the amount of variability in the sampling model. For proportions, we use a Normal sampling model (for $\hat{p}$) with standard deviation inversely proportional to the square root of the sample size, n.

Example of Power and Beta

A newsletter reports that 90% of adults drink milk. The researchers are interested in investigating if

less than 90% of adults drink milk (at alpha = 0.05). They collect a random sample of 200 adults in

a certain region.

a. Calculate power of the test if the percentage of adults who drink milk is really 85%.

b. Calculate beta if the percentage of adults who drink milk is really 85%.

Example of Power and Beta

A newsletter reports that 90% of adults drink milk. The researchers are interested in investigating if less than 90% of

adults drink milk (at alpha = 0.05). They collect a random sample of 200 adults in a certain region.

a. Calculate power of the test if the percentage of adults who drink milk is really 85%.

Alpha = 0.05 = P(reject Ho | Ho is true)

P(Z < -1.645) = 0.05 (this is the rejection region)

Z critical value is -1.645

Power = P(reject Ho | Ho is false)

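$= P\left(\frac{\hat{p} - 0.90}{\sqrt{\frac{0.90(1-0.90)}{200}}} < -1.645 \mid p = 0.85\right)$

$= P(\hat{p} < 0.8651 \mid p = 0.85)$

$= P\left(Z < \frac{0.8651 - 0.85}{\sqrt{\frac{0.85(1-0.85)}{200}}} \cong 0.60\right) \cong 0.73$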
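The two-step calculation (find the critical value of $\hat{p}$ under Ho, then find the chance of falling below it when the truth is 0.85) can be sketched as follows. This is an illustrative Python sketch using only the standard library; `power_lower_tail` is a hypothetical helper name.

```python
import math

def normal_cdf(z):
    """P(Z < z) for a standard normal Z."""
    return 0.5 * math.erfc(-z / math.sqrt(2))

def power_lower_tail(p0, p_true, n, z_crit=-1.645):
    """Power of the lower-tail test of Ho: p = p0 when the true proportion is p_true."""
    # Step 1: critical sample proportion, using the standard deviation under Ho.
    p_star = p0 + z_crit * math.sqrt(p0 * (1 - p0) / n)
    # Step 2: probability that p-hat falls below p_star when p = p_true.
    z = (p_star - p_true) / math.sqrt(p_true * (1 - p_true) / n)
    return p_star, normal_cdf(z)

p_star, power = power_lower_tail(p0=0.90, p_true=0.85, n=200)
beta = 1 - power  # probability of a Type II error
print(f"critical p-hat = {p_star:.4f}, power = {power:.2f}, beta = {beta:.2f}")
```

Note that the standard deviation changes between the two steps: the rejection region is set under Ho (p = 0.90), while the power is evaluated under the assumed truth (p = 0.85).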

Example of Power and Beta

A newsletter reports that 90% of adults drink milk. The researchers are interested in investigating if less than 90% of

adults drink milk (at alpha = 0.05). They collect a random sample of 200 adults in a certain region.

b. Calculate beta if the percentage of adults who drink milk is really 85%.

Beta = 1 – power = 1 – 0.73 = 0.27

Example of Power and Sample Size

A newsletter reports that 90% of adults drink milk. The researchers are interested in investigating if less than 90% of

adults drink milk (at alpha = 0.05). They collect a random sample of 100 adults in a certain region.

a. Calculate power of the test if the percentage of adults who drink milk is really 85%.

Alpha = 0.05 = P(reject Ho | Ho is true)

P(Z < -1.645) = 0.05 (this is the rejection region)

Z critical value is -1.645

Power = P(reject Ho | Ho is false)

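$= P\left(\frac{\hat{p} - 0.90}{\sqrt{\frac{0.90(1-0.90)}{100}}} < -1.645 \mid p = 0.85\right)$

$= P(\hat{p} < 0.85065 \mid p = 0.85)$

$= P\left(Z < \frac{0.85065 - 0.85}{\sqrt{\frac{0.85(1-0.85)}{100}}} \cong 0.02\right) \cong 0.51$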

Power increases as Sample Size, n increases

A newsletter reports that 90% of adults drink milk. The researchers are interested in investigating if less than 90%

of adults drink milk (at alpha = 0.05).

They collect a random sample of 50 adults in a

certain region. Calculate power of the test if the

percentage of adults who drink milk is really 85%.

They collect a random sample of 200 adults in a

certain region. Calculate power of the test if the

percentage of adults who drink milk is really 85%.

If we keep 𝛼 at the same size, larger sample sizes increase the power of the test because the sampling distributions are much narrower (there is less sampling variability). The critical value, 𝑝∗, gets closer to 𝑝0 and farther from p.
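To see the effect of sample size directly, the sketch below repeats the power calculation for n = 50, 100, and 200 with everything else held fixed (Ho: p = 0.90, true p = 0.85, alpha = 0.05). It is an illustrative Python sketch using only the standard library; `power_lower_tail` is a hypothetical helper name.

```python
import math

def normal_cdf(z):
    """P(Z < z) for a standard normal Z."""
    return 0.5 * math.erfc(-z / math.sqrt(2))

def power_lower_tail(p0, p_true, n, z_crit=-1.645):
    """Power of the lower-tail test of Ho: p = p0 when the true proportion is p_true."""
    p_star = p0 + z_crit * math.sqrt(p0 * (1 - p0) / n)  # critical value under Ho
    z = (p_star - p_true) / math.sqrt(p_true * (1 - p_true) / n)
    return p_star, normal_cdf(z)

results = {}
for n in (50, 100, 200):
    p_star, power = power_lower_tail(0.90, 0.85, n)
    results[n] = (p_star, power)
    print(f"n = {n:3d}: critical p-hat = {p_star:.4f}, power = {power:.2f}")
```

As n grows, the critical value moves toward 0.90 and away from 0.85, so more of the true sampling distribution falls in the rejection region.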


The Sampling Model for Sample Mean

• When a random sample is drawn from any population with mean $\mu$ and standard deviation $\sigma$, its sample mean $\bar{y}$ has a sampling distribution with the same mean as the population mean $\mu$, and standard error $\sigma_{\bar{y}} = \frac{\sigma}{\sqrt{n}}$.

• The Central Limit Theorem tells us that no matter what population the random sample comes from, the shape of the sampling distribution is approximately normal as long as the sample size n is large. That is, the larger the sample used, the more closely the Normal model approximates the sampling distribution of the sample mean.

• For large n (n > 60; note some use n > 30), we express the sampling distribution of $\bar{y}$ as: $\bar{y} \sim N\left(\mu, \sigma_{\bar{y}} = \frac{\sigma}{\sqrt{n}}\right)$

However, in practice the population parameters are unknown. For example, $\sigma$ is unknown. In that case, we estimate $\sigma$ by the sample standard deviation, $s$. Thus, we replace $\sigma$ with $s$: $\bar{y} \sim N\left(\mu, \sigma_{\bar{y}} = \frac{\sigma}{\sqrt{n}} \cong \frac{s}{\sqrt{n}}\right)$


Testing for the Mean of a Quantitative Population for the Big Sample

Suppose we wish to test a hypothesis about a mean of a quantitative population denoted by 𝜇:

𝐻0: 𝜇 = 𝜇0 𝐻𝑎: 𝜇 ≠ 𝜇0

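For large n (n > 60), we know that by the Central Limit Theorem the sampling distribution of $\bar{y}$ is: $\bar{y} \sim N\left(\mu, \sigma_{\bar{y}} = \frac{\sigma}{\sqrt{n}}\right)$

And if $\sigma$ is unknown we estimate $\sigma$ by the sample standard deviation, $s$.

Thus, we replace $\sigma$ with $s$: $\bar{y} \sim N\left(\mu, \sigma_{\bar{y}} = \frac{\sigma}{\sqrt{n}} \cong \frac{s}{\sqrt{n}}\right)$

Our test statistic is: $Z = \frac{\bar{y} - \mu}{\sigma/\sqrt{n}} \cong \frac{\bar{y} - \mu}{s/\sqrt{n}}$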

Example: Testing for the Mean of a Quantitative Population with large Sample

Researchers claimed that the true mean number of hours of work for all Canadians with poor health is

different from 40 hours (usual hours of work per week). In order to test their hypothesis, these researchers

relied on the obtained statistics from the Canadian Community Health Survey (CCHS, 2011) for a random

sample of 65 Canadians with poor health. For this random sample, the mean hours of work was 33.91

hours and with standard deviation of 13.85. The histogram of hours of work for this sample was

approximately normal.

• How far away from 40 hours does the sample mean need to be in order for the researchers to be able to support their claim?

• In other words, how many estimated standard errors does the sample mean need to be away from the value of 40 hours so that the researchers could support their claim?

• We need to find the value of the test statistic, which summarizes how far (i.e., how many estimated standard errors) the point estimate (e.g., $\bar{y}$) is away from the hypothesized $H_0$ value.

• In this example, we are interested in how many estimated standard errors $\bar{y} = 33.91$ is away from 40.

• Recall the CLT: For a large random sample, $\bar{y} \sim N\left(\mu, \sigma_{\bar{y}} = \frac{\sigma}{\sqrt{n}}\right)$. However, $\sigma$ is unknown in this example. We estimate $\sigma$ by the sample standard deviation, $s$. Thus, the estimated standard error is $\frac{s}{\sqrt{n}}$.

Example: Testing for the Mean of a Quantitative Population with large Sample

Researchers claimed that the true mean number of hours of work for all Canadians with poor health is

different from 40 hours (usual hours of work per week). In order to test their hypothesis, these researchers

relied on the obtained statistics from the Canadian Community Health Survey (CCHS, 2011) for a random

sample of 65 Canadians with poor health. For this random sample, the mean hours of work was 33.91

hours and with standard deviation of 13.85 hours. The histogram of hours of work for this sample was

approximately normal.

$H_0: \mu = 40$  $H_a: \mu \neq 40$

$Z = \frac{\bar{y} - \mu}{\sigma/\sqrt{n}} \cong \frac{\bar{y} - \mu}{s/\sqrt{n}} = \frac{33.91 - 40}{13.85/\sqrt{65}} = -3.55$

𝑝-value = 2 × Area (below Z of −3.55) = 2(0.0002) = 0.0004

The 𝑝-value of 0.0004 is less than 𝛼 = 0.05. We reject Ho and conclude Ha.

We have strong evidence to conclude that the mean hours of work for Canadians with poor health is different from 40.
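The same test can be checked numerically. This is a minimal Python sketch (standard library only) under the large-sample Normal approximation used above; `z_test_mean_two_sided` is a hypothetical helper name.

```python
import math

def z_test_mean_two_sided(y_bar, mu0, s, n):
    """Large-sample two-sided z-test of Ho: mu = mu0, with s estimating sigma."""
    se = s / math.sqrt(n)                            # estimated standard error
    z = (y_bar - mu0) / se
    p_value = math.erfc(abs(z) / math.sqrt(2))       # 2 x tail area beyond |z|
    return se, z, p_value

se, z, p_value = z_test_mean_two_sided(y_bar=33.91, mu0=40, s=13.85, n=65)
print(f"SE = {se:.2f}, z = {z:.2f}, two-sided p-value = {p_value:.4f}")
```

The exact tail area differs slightly from the table value of 0.0002 per tail, but the two-sided p-value still rounds to 0.0004.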


Confidence Interval for the Mean of a Quantitative Population when the Sample is Big

Recall the generic form of stating (finding) a CI: Point Estimate ± Margin of Error

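Therefore, the confidence interval for the mean $\mu$ has the form: $\hat{\mu} \pm$ Margin of Error

$= \bar{y} \pm ME$

$= \bar{y} \pm (Z^* \times SE(\bar{y}))$

$= \bar{y} \pm \left(Z^* \times \frac{\sigma}{\sqrt{n}}\right)$

$\cong \bar{y} \pm \left(Z^* \times \frac{s}{\sqrt{n}}\right)$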

Example: CI for the Mean of a Quantitative Population with large Sample

For a random sample of 65 Canadians with poor health, the hours of work had mean 33.91 hours and standard deviation of 13.85 hours. Find the 95% CI for $\mu$:

$\bar{y} \pm ME$

$= \bar{y} \pm (Z^* \times SE(\bar{y}))$

$= \bar{y} \pm \left(Z^* \times \frac{\sigma}{\sqrt{n}}\right)$

$\cong \bar{y} \pm \left(Z^* \times \frac{s}{\sqrt{n}}\right)$

$= 33.91 \pm \left(1.96 \times \frac{13.85}{\sqrt{65}}\right)$

= 33.91 ± (1.96 × 1.72)

= 33.91 ± 3.37 = (30.54, 37.28)

Interpretation: We are 95% confident that the true mean hours of work for Canadians with poor health is between 30.54 and 37.28 hours.
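A short Python sketch of the same interval (standard library only; `z_interval_mean` is a hypothetical helper name) makes the three ingredients explicit: the point estimate, the critical value, and the estimated standard error.

```python
import math

def z_interval_mean(y_bar, s, n, z_star=1.96):
    """Large-sample confidence interval y-bar +/- z* x s / sqrt(n)."""
    se = s / math.sqrt(n)      # estimated standard error of the sample mean
    margin = z_star * se       # margin of error (ME)
    return y_bar - margin, y_bar + margin, margin

lower, upper, margin = z_interval_mean(y_bar=33.91, s=13.85, n=65)
print(f"ME = {margin:.2f}, 95% CI: ({lower:.2f}, {upper:.2f})")
```

The interval excludes 40 hours, which is consistent with rejecting Ho: 𝜇 = 40 in the two-sided test at 𝛼 = 0.05.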


The t distribution

The density of the t distribution was derived by William Gosset.

Recall: When the population standard deviation 𝝈 is unknown, its value is estimated by the sample standard deviation S. The value of S is different for different random samples with different sizes (n).

• The t distribution is bell-shaped and symmetric (like the Normal model) about the mean 0.

• The standard deviation is a bit larger than 1 and its value depends on the degrees of freedom, df = n − 1 (one less than the sample size).

• The t distribution has a slightly different spread for different values of df.

• The t distribution has a wider shape than the Z standard normal distribution when the sample size is small.

• When df is about 60 or more, the two distributions (Z and t) are nearly identical.

• We can think of the t distribution with df = ∞ (infinity) as the standard normal distribution, Z, because as n (the sample size) increases, we have $\frac{s}{\sqrt{n}} \cong \frac{\sigma}{\sqrt{n}}$
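To make "a bit larger than 1" concrete: for df > 2, the standard deviation of a t distribution is $\sqrt{df/(df-2)}$, a standard result that is not stated in the notes. The short Python sketch below (standard library only; `t_standard_deviation` is a hypothetical helper name) tabulates it for a few values of df.

```python
import math

def t_standard_deviation(df):
    """Standard deviation of Student's t with df degrees of freedom (needs df > 2)."""
    if df <= 2:
        raise ValueError("the standard deviation is finite only for df > 2")
    return math.sqrt(df / (df - 2))

sds = {df: t_standard_deviation(df) for df in (3, 5, 10, 17, 30, 60, 1000)}
for df, sd in sds.items():
    print(f"df = {df:4d}: SD = {sd:.4f}")
```

The values shrink toward 1 as df grows, which matches the statement that the t and Z distributions are nearly identical for df of about 60 or more.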


Assumptions and Conditions for the t distribution

• Independence Assumption:

The data values should be independent (from each other).

• Randomization Condition:

This condition is satisfied if the data arise from a random sample or a suitably randomized experiment. Randomly sampled data, especially data from a simple random sample, are ideal: they are almost surely independent, with a well-defined target population.

• Normal Population Assumption:

Student’s t distribution will not work for data that are badly skewed, so examine graphical displays (e.g., a histogram or boxplot) and compare the mean with the median. Note, however, that t methods perform adequately even when the assumption of normality is mildly violated (e.g., slight skewness in the data). Thus, we say that t methods are robust to mild violations of the normality assumption.

Even for small n (sample size) check for Nearly Normal Condition:

• The data come from a distribution that is unimodal and reasonably symmetric. Check this assumption by making a histogram, boxplot, or normal probability plot.

• For a very small sample, n < 15, the data should follow a Normal model fairly closely. If you find clear outliers or skewness, do not use t methods.

• For sample size n between 15 and 40, the t method will work reasonably well for mildly to moderately skewed unimodal data, but would perform badly in the presence of strong skewness or outliers. Make a histogram, boxplot, or normal probability plot of data.



Testing for the Mean of a Quantitative Population for Small Sample with Unknown 𝜎

Suppose we wish to test a hypothesis about a mean of a quantitative population denoted by 𝜇:

𝐻0: 𝜇 = 𝜇0 𝐻𝑎: 𝜇 ≠ 𝜇0

When certain assumptions and conditions are met, the standardized sample mean

$t = \frac{\bar{y} - \mu}{s/\sqrt{n}}$

follows a Student’s t-model with n − 1 degrees of freedom. Note that we estimate the standard deviation of $\bar{y}$ with:

$SE(\bar{y}) = \frac{s}{\sqrt{n}}$


Example of Hypothesis Testing for a Population Mean 𝝁

Studies on attitudes toward statistics showed that, on a 7-point Likert scale from “1” (indicating strongly negative attitudes) to “7” (indicating strongly positive attitudes about statistics), male students’ feelings concerning statistics (known as the “Affect” component) score 4, on average. A researcher takes a random sample of 18 male students who were enrolled in an introductory statistics course and estimates their mean affect toward statistics. Her sample had mean 4.08 and standard deviation of 0.90. Is there evidence to suggest that the true mean Affect for the male group is more than 4?


Checking the Assumptions and Conditions in our Example

The histogram looks bell-shaped and symmetric.

The boxplot shows no outlier and is approx. symmetric.

The normal probability plot is close to the straight line.

All three plots suggest that there is no violation of the assumption of normality.


Checking the Assumptions and Conditions in our Example

Independence Assumption: Male students are selected independently of each other.

Randomization Condition: Male students are randomly selected for this study.

Normal Population Assumption: The histogram, boxplot, and normal probability plot show that the data came from a distribution that is unimodal and reasonably symmetric.




Example of Hypothesis Testing for a Population Mean 𝝁

It is claimed that males’ mean Affect, that is, their feelings concerning statistics, is 4 (on a 7-point Likert scale). A researcher takes a random sample of 18 male students who were enrolled in an introductory statistics course and estimates their affect toward statistics. Her sample had a mean of 4.08 and a standard deviation of 0.90. Is there evidence to suggest that the true mean Affect for the male group is more than 4?

𝐻0: 𝜇 = 4 𝐻𝑎: 𝜇 > 4

Under Ho, our test statistic is: $t = \dfrac{\bar{y} - \mu}{s/\sqrt{n}}$ (t distribution with df = n − 1)

$t = \dfrac{4.08 - 4}{0.90/\sqrt{18}} = \dfrac{0.08}{0.211} = 0.39$ with df = 18 – 1 = 17

𝑡 = 0.39 has p-value greater than 0.10

(see explanation: next slide)

So, p-value is greater than 𝛼 = 0.05.

We fail to reject Ho, so we cannot conclude Ha.

We do not have sufficient evidence to conclude that the mean Affect for the male group is more than 4 (𝐻0: 𝜇 = 4 remains plausible, although we could be making a Type II error).
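The calculation above can be checked numerically. The following is a minimal sketch, not part of the original slides: it uses only the Python standard library, the function names (`t_pdf`, `t_upper_tail`, `one_sample_t`) are illustrative, and the p-value is obtained by integrating the t density with Simpson's rule. With the rounded summaries (4.08, 0.90, 18) the statistic comes out near 0.38; the slide's 0.39 presumably reflects unrounded sample values, and the conclusion is the same either way.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def t_upper_tail(t, df, steps=2000):
    """P(T > t), using Simpson's rule on [0, |t|] (steps must be even)."""
    a = abs(t)
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    upper = 0.5 - total * h / 3          # area above |t|, by symmetry
    return upper if t >= 0 else 1.0 - upper

def one_sample_t(ybar, mu0, s, n):
    """Return (t statistic, df, one-sided p-value for Ha: mu > mu0)."""
    se = s / math.sqrt(n)                # standard error of the mean
    t = (ybar - mu0) / se
    return t, n - 1, t_upper_tail(t, n - 1)

# Summary values from the Affect example (assumed rounded).
t, df, p = one_sample_t(ybar=4.08, mu0=4.0, s=0.90, n=18)
print(f"t = {t:.2f}, df = {df}, one-sided p-value = {p:.3f}")
```

The printed p-value is well above 0.10, which agrees with the table-based reasoning on the next slide.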


Finding P-Value from t-table

Is there evidence to suggest that the true mean Affect for the male group is more than 4?

𝐻0: 𝜇 = 4 𝐻𝑎: 𝜇 > 4

𝑡 = 0.39 with df = 18 – 1 = 17

Find P-value (Picture version of this explanation is on the next slide):

Go along the row for df = 17 and look for a t-score close to 0.39.

The t-values in the df = 17 row increase from left to right, and the smallest one listed is 1.333, so we take it as our reference.

Our test statistic of t = 0.39 is smaller than t = 1.333.

The area above t = 1.333 is 0.100, so the area above t = 0.39 must be greater than 0.100. Our p-value is large.

𝑡 = 0.39 has p-value > 𝛼 = 0.05. We Fail to Reject Ho.

We cannot conclude Ha. We have no evidence to conclude that the

mean Affect for the male students taking an introductory statistics

course is more than 4

(this implies that 𝐻0: 𝜇 = 4 is plausible but we could be making a

Type II error).


Finding P-Value using t-distribution

Online Applet: https://istats.shinyapps.io/tdist/

Is there evidence to suggest that the true mean Affect for the male group is more than 4?

𝑯𝟎: 𝝁 = 𝟒 𝑯𝒂: 𝝁 > 𝟒


One-sample t-Interval for the Mean: Example: 95% CI for a Mean (when n is small)

When the assumptions and conditions are met, the confidence interval for a mean is:

$\bar{y} \pm t^{*}_{n-1} \times SE(\bar{y})$

where the standard error of the mean is $SE(\bar{y}) = \dfrac{s}{\sqrt{n}}$.

The critical value $t^{*}_{n-1}$ depends on the confidence level that you specify and on the number of degrees of freedom, n − 1, which we get from the sample size.


95% Confidence Interval for Males’ Mean Affect:

95% CI for 𝜇: $\bar{y} \pm ME$

$\bar{y} \pm t^{*}_{n-1} \times SE(\bar{y})$

Confidence Level is 0.95

𝛼 = 0.05 (error probability)

𝛼/2 = 0.025

n = 18; df = n - 1 = 18 – 1 = 17

t-score: $t_{0.025}$ with df = 17 is 2.110

$\bar{y} \pm t_{0.025}\left(\dfrac{s}{\sqrt{n}}\right) = 4.08 \pm (2.110)\dfrac{0.897}{\sqrt{18}} = 4.08 \pm 0.45 = (3.63, 4.53)$

Interpretation: We are 95% confident that the true mean Affect for male students is between 3.63 and 4.53.
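As a quick arithmetic check of this interval, here is a small sketch in Python. The function name `t_interval` is illustrative, and the critical value 2.110 is taken from the t-table as on the slide rather than computed.

```python
import math

def t_interval(ybar, s, n, t_star):
    """One-sample t-interval: ybar +/- t_star * s / sqrt(n)."""
    se = s / math.sqrt(n)      # standard error of the mean
    me = t_star * se           # margin of error
    return ybar - me, ybar + me

# Affect example: n = 18, df = 17, table value t* = 2.110.
low, high = t_interval(ybar=4.08, s=0.897, n=18, t_star=2.110)
print(f"95% CI: ({low:.2f}, {high:.2f})")  # 95% CI: (3.63, 4.53)
```

The same function can be reused for any one-sample summary by passing the matching table value for `t_star`.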


CI and Hypothesis Testing for a Population Mean 𝝁 – in R

One-sided test:

𝑯𝟎: 𝝁 = 𝟒 𝑯𝒂: 𝝁 > 𝟒

Two-sided test:

𝑯𝟎: 𝝁 = 𝟒 𝑯𝒂: 𝝁 ≠ 𝟒


Example of Hypothesis Testing for a Population Mean 𝝁

Laughter is often called “the best medicine”; studies have shown that laughter can reduce muscle tension and increase oxygenation of the blood. Researchers investigated the physiological changes that accompany laughter. Twenty-five subjects (18-34 years old) watched film clips designed to evoke laughter. During the laughing period, researchers measured the heart rate (beats per minute) of each subject and obtained $\bar{y} = 73.5$ and $S = 6$. It is well known that the mean resting heart rate is 71 beats per minute. Is there evidence that the true mean heart rate during laughter exceeds 71 beats per minute? Use alpha = 0.05.

Assumptions

A random sample of 18-34-year-olds is taken from the population.

Heart rate during laughter has a normal distribution.


Example of Hypothesis Testing for a Population Mean 𝝁

Ho: 𝜇 = 71 Ha: 𝜇 > 71

$t_{observed} = \dfrac{\bar{y} - \mu}{S/\sqrt{n}} = \dfrac{73.5 - 71}{6/\sqrt{25}} = 2.083 \sim t(df = 24)$

P-value: $P(t_{24} > 2.083) = ?$

Look in the t-table (we will do this in class together).

Along the df = 24 row, search for a value close to 2.083.

We find: 2.064 < 2.083 < 2.492

Look at the top of the table, in the second row (since we have a one-sided test), and note the tail area associated with each of these values: 0.010 < p-value < 0.025.

P-value < 0.05. The p-value is small. We reject Ho.

We have evidence to indicate that the true mean heart rate during laughter exceeds 71 beats per minute.
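The table lookup described above can be sketched in code. This is only an illustration: the two rows below are copied from a standard t-table (upper-tail areas 0.100 to 0.005), the names `TAIL_AREAS`, `T_TABLE`, and `bracket_p_value` are hypothetical, and the sketch handles only non-negative test statistics for an upper-tailed test.

```python
# Upper-tail areas (column headings) and critical values (rows) from a t-table.
TAIL_AREAS = [0.100, 0.050, 0.025, 0.010, 0.005]
T_TABLE = {
    17: [1.333, 1.740, 2.110, 2.567, 2.898],
    24: [1.318, 1.711, 2.064, 2.492, 2.797],
}

def bracket_p_value(t_obs, df):
    """Return (lower, upper) bounds on P(T > t_obs) read off the table row."""
    if t_obs < 0:
        raise ValueError("this sketch handles t_obs >= 0 only")
    upper = 0.5  # for t_obs >= 0 the upper-tail area cannot exceed 0.5
    for area, crit in zip(TAIL_AREAS, T_TABLE[df]):
        if t_obs < crit:
            return area, upper  # t_obs falls to the left of this column
        upper = area
    return 0.0, upper           # t_obs is beyond the last column

print(bracket_p_value(2.083, 24))  # (0.01, 0.025)
print(bracket_p_value(0.39, 17))   # (0.1, 0.5)
```

The first call reproduces the bracket 0.010 < p-value < 0.025 for the laughter example; the second shows that a small statistic such as t = 0.39 with df = 17 only tells us that the p-value exceeds 0.100.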


Example of Hypothesis Testing for a Population Mean 𝜇 – Using CI

Let’s find the 95% CI for the true mean heart rate during laughter.

Recall n = 25 had $\bar{y} = 73.5$ and $S = 6$. Thus, df = 25 − 1 = 24.

Alpha = 0.05 (since confidence level is 0.95)

The confidence interval is two-sided, so we divide alpha by 2:

Alpha/2 = 0.05/2 = 0.025

Look up the value in the t-table: $t_{0.025;\,df=24} = 2.064$

95% CI for the true mean heart rate during laughter is:

$\bar{y} \pm t_{0.025;24}\dfrac{S}{\sqrt{n}} = 73.5 \pm 2.064 \times \dfrac{6}{\sqrt{25}} = 73.5 \pm 2.4768 = (71.02, 75.98)$

Conduct hypothesis testing: Ho: 𝜇 = 71 Ha: 𝜇 > 71

We reject Ho since 71 lies below the lower limit of this interval. (A two-sided 95% interval corresponds to a one-sided test at 𝛼 = 0.025, so the conclusion also holds at 𝛼 = 0.05.)

We conclude that the true mean heart rate during laughter exceeds 71 beats per minute.

Interpretation: We are 95% confident that the true mean heart rate during laughter is between 71.02 and 75.98 beats per minute.


Determining Sample Size

How large a sample is needed if researchers want to estimate the true mean heart rate during laughter to within 4% (ME = 0.04) with a 95% confidence level?

• Suppose we know the actual value of 𝜎 = 5.

To solve for n, rearrange the margin of error part of the confidence interval:

$n = \dfrac{(z^{*})^{2}\,\sigma^{2}}{ME^{2}} = \dfrac{(1.96)^{2}(5)^{2}}{(0.04)^{2}} = 60025$

• Suppose we do not know the actual value of 𝜎; we estimate 𝜎 with previous information:

Recall n = 25 had $\bar{y} = 73.5$ and $S = 6$. Thus, df = 25 − 1 = 24.

We use $t_{0.025;24} = 2.064$

$n = \dfrac{(t_{0.025;24})^{2}\,S^{2}}{ME^{2}} = \dfrac{(2.064)^{2}(6)^{2}}{(0.04)^{2}} = 95852.16$

Round up so that the margin of error is no larger than required: n = 95853
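Both sample-size calculations can be reproduced with a short sketch. The function name `sample_size` is illustrative, the critical values 1.96 and 2.064 are the table values used on the slide, and exact fractions are used (an implementation choice, not something from the slides) so that rounding up is not disturbed by floating-point error.

```python
import math
from fractions import Fraction

def sample_size(crit, sd, me):
    """Smallest whole n for which crit * sd / sqrt(n) is at most me."""
    crit, sd, me = Fraction(crit), Fraction(sd), Fraction(me)
    return math.ceil((crit * sd / me) ** 2)   # always round up

# Known sigma = 5, using z* = 1.96.
print(sample_size("1.96", "5", "0.04"))    # 60025
# Unknown sigma, estimated by S = 6, using t* = 2.064 (df = 24).
print(sample_size("2.064", "6", "0.04"))   # 95853
```

Passing the inputs as strings lets `Fraction` read them as exact decimals, so 60025 is not accidentally bumped to 60026.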