Week 8_2014s2
-
8/10/2019 Week 8_2014s2
-
Week 8 topics
Confidence intervals for the population mean
More on hypothesis testing
Type I and Type II errors
p-values
Power
-
Interval estimation: review
Point estimators produce a single estimate of the parameter of interest
In many real-world situations, some notion of the margin of error would be useful
Interval estimators produce an interval (i.e., a range of values) and a degree of confidence associated with that interval
Hence the name "confidence interval"
How often would you expect the true population parameter to be in this (sample-specific) interval?
-
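Interval estimation for means
Suppose \(\bar{X} \sim N(\mu, \sigma^2/n)\) and thus \(Z = \dfrac{\bar{X} - \mu}{\sigma/\sqrt{n}} \sim N(0, 1)\)
Consider a symmetric interval around this standardized value of \(\bar{X}\):
\[P\left(-\text{bound} < \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} < \text{bound}\right) = P(-\text{bound} < Z < \text{bound}) = 1 - \alpha\]
Say we choose \(\alpha = .05\) (\(\alpha/2 = .025\)), so bound \(= 1.96\) (from tables):
\[P\left(-1.96 < \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} < 1.96\right) = .95\]
Now we can rearrange this probability statement to yield:
\[P\left(\bar{X} - 1.96\,\frac{\sigma}{\sqrt{n}} < \mu < \bar{X} + 1.96\,\frac{\sigma}{\sqrt{n}}\right) = .95\]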
-
Interval estimation
Then the endpoints, \(\bar{X} \pm 1.96\,\sigma/\sqrt{n}\), define a confidence interval!
Remember: The endpoints of the interval are themselves random variables
We have constructed a random interval
μ is a constant
For a particular sample (and sample mean value), μ is either in the confidence interval, or it is not
If 100 size-n samples were drawn, we would expect about 95 of the resulting 95% confidence intervals to include μ
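The repeated-sampling interpretation above can be checked by simulation. The following is a minimal Python sketch and is not part of the original slides: the population values (mu = 10, sigma = 2), the sample size, the number of replications and the helper names are all hypothetical choices for illustration, and sigma is assumed known.

```python
import random
from statistics import NormalDist, fmean


def z_interval(sample, sigma, confidence=0.95):
    """Known-sigma confidence interval for the mean: x_bar +/- z * sigma / sqrt(n)."""
    n = len(sample)
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # about 1.96 for 95%
    half_width = z * sigma / n ** 0.5
    x_bar = fmean(sample)
    return x_bar - half_width, x_bar + half_width


def coverage(mu, sigma, n, reps, seed=0):
    """Share of simulated intervals that contain the true mean mu."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        lower, upper = z_interval(sample, sigma)
        hits += lower < mu < upper
    return hits / reps


# Hypothetical population: mu = 10, sigma = 2, samples of size 25.
rate = coverage(mu=10.0, sigma=2.0, n=25, reps=10_000)
print(f"Share of intervals containing mu: {rate:.3f}")
```

Each simulated interval either contains mu or it does not; only the long-run share of intervals that do so should sit near the chosen confidence level.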
-
Confidence intervals
CIs for means and proportions typically have a similar structure
Centred at sample statistics
Endpoints are some multiple of the standard error (if we don't know sigma) or standard deviation (if we do know sigma) of the sampling distribution away from the centre
The multiple is determined by the confidence level chosen by the investigator
Remember: If you don't know sigma and have a small sample, use the t-distribution tables to get your bounds, not the Z!
-
Selecting sample size
Recall the auditor Clare from last lecture
Suppose she is OK with assuming she knows sigma, and needs to decide on a sample size
She wants a sample size that yields a margin of error of $4, and she is willing to set the confidence level at 90%
We can now write down the CI and use it to solve for the sample size n that she requires.
\[\bar{X} \pm z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}} \quad \text{defines the confidence interval}\]
Thus we require
\[z_{.05}\,\frac{\sigma}{\sqrt{n}} = 4 \quad \text{or} \quad \frac{1.645 \times 30.334}{\sqrt{n}} = 4, \quad \text{which gives } n \approx 156\]
Recall, σ = $30.334, by assumption (based on historical data)
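The sample-size calculation above can be reproduced in a few lines. This is a small sketch under the slide's assumptions (sigma known, 90% confidence, margin of error of 4); the function name `required_sample_size` is hypothetical, and the critical value comes from Python's standard library rather than from printed tables.

```python
from math import ceil
from statistics import NormalDist


def required_sample_size(sigma, margin, confidence):
    """Smallest n with z_{alpha/2} * sigma / sqrt(n) <= margin (sigma assumed known)."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.645 for 90% confidence
    return ceil((z * sigma / margin) ** 2)


n = required_sample_size(sigma=30.334, margin=4.0, confidence=0.90)
print(n)
```

Rounding up rather than to the nearest integer ensures the margin of error is no larger than the target.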
-
Hypothesis testing examples and concepts, again
Maintained or null hypothesis
Some statement about a population parameter
Let X be the weight of precooked meat, with mean μ
Then the null hypothesis is H0: μ = 0.25
Alternative hypothesis
Will depend on the research objective
Some possibilities here:
H1: μ ≠ 0.25, two-tailed hypothesis test (so a value too extreme in either direction violates the tenet of "Truth in Advertising")
H1: μ < 0.25, one (lower)-tailed hypothesis test (so a value too low violates the minimum standard of a trading standards agency or consumer advocacy group)
-
Hypothesis testing examples and concepts
Recall how data are used to test a null hypothesis:
Proceed by comparing a test statistic with the value specified in H0 and decide whether the difference is:
Small enough to be attributable to random sampling errors: do not reject H0, or
So large that H0 is more likely not to be correct: reject H0
Formally define a rejection (or critical) region
Values of the test statistic that are so extreme they lead us to reject H0 in favour of H1
Other values of the test statistic that are not so extreme lie in the non-critical region
-
Quality control at McDonald's
A quarter-pounder with cheese is presumed to comprise 0.25 pounds (0.11 kg) of precooked meat
Consider H0: μ = 0.25, H1: μ < 0.25
A sample of 25 hamburgers produces sample mean weights (in pounds!) of: (a) 0.24 (b) 0.23 (c) 0.28 (d) 0.21
Which of these represents evidence against H0?
Which of these would lead you to reject H0?
For which are you most likely to reject H0?
-
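Quality control at McDonald's
Assume the weight of hamburger meat is \(X \sim N(\mu, \sigma^2)\)
\[H_0: \mu = 0.25\]
\[H_1: \mu < 0.25\]
We need to determine the rejection region, which will be:
Reject \(H_0\) if \(\bar{X} < x_L\), which implies
\[P(\bar{X} < x_L) = P\left(Z < \frac{x_L - .25}{\sigma/\sqrt{n}}\right) = \alpha\]
Thus if we set \(\alpha\) and know \(\sigma\) and \(n\), then we can determine \(x_L\).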
-
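Quality control at McDonald's
[Figure: standard normal density plotted over z from -3 to 3. The lower-tail rejection region has area 0.05 and lies to the left of z = -1.645, which corresponds to a sample mean of x = 0.2336; the centre of the distribution corresponds to μ = 0.25.]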
-
Quality control at McDonald's
Our choice of significance level matters!
Suppose α = 0.01
Our new rejection region is then z < -2.33 instead of z < -1.645, or in terms of the sample mean, the cutoff is 0.2267 rather than 0.2336
Does it make sense that the critical value is lower on the number line when α = 0.01 than when α = 0.05?
Thus, case (b) with a sample mean of 0.23 would now NOT lead to rejection of the null hypothesis
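To see how the cutoff moves with the significance level, here is a short Python sketch (not from the original slides). It assumes a known population standard deviation of 0.05 pounds and n = 25, the values used for this example later in the lecture; the function and variable names are hypothetical.

```python
from statistics import NormalDist

MU_0 = 0.25    # mean weight under H0 (pounds)
SIGMA = 0.05   # population standard deviation, as assumed later in the lecture
N = 25         # number of hamburgers sampled


def lower_tail_cutoff(alpha, mu_0=MU_0, sigma=SIGMA, n=N):
    """Critical sample mean x_L for H1: mu < mu_0; reject H0 when x_bar < x_L."""
    z_alpha = NormalDist().inv_cdf(alpha)  # negative, e.g. about -1.645 for alpha = 0.05
    return mu_0 + z_alpha * sigma / n ** 0.5


sample_means = {"a": 0.24, "b": 0.23, "c": 0.28, "d": 0.21}
for alpha in (0.05, 0.01):
    cutoff = lower_tail_cutoff(alpha)
    rejected = [case for case, x_bar in sample_means.items() if x_bar < cutoff]
    print(f"alpha = {alpha}: reject H0 if x_bar < {cutoff:.4f}; cases rejected: {rejected}")
```

A smaller alpha pushes the cutoff further into the lower tail, so fewer of the four sample means fall in the rejection region.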
-
p-values
How do I choose the significance level α?
No rules
Conventional choices are α = 0.1, 0.05, or 0.01
In the McDonald's example, we saw it could matter
Why do I have to choose a particular α?
You don't (though doing so can helpfully bind your hands)
You can calculate the empirical significance level, or p-value
The p-value associated with a given test statistic is the probability of obtaining a value of the test statistic as or more extreme than that observed, given that the null hypothesis is true
"More extreme" depends on the form of the alternative hypothesis
-
p-values
From the skills test example of last week:
\[p\text{-value} = P(\bar{X} > 7.09) = P\left(\frac{\bar{X} - \mu}{\sigma/\sqrt{n}} > \frac{7.09 - 6.60}{1/5}\right) = P(Z > 2.46) = .0069\]
Thus, it is very unlikely (less than a 1% chance) to find such an extreme value for the sample mean test score if H0 is correct
There is strong evidence to reject H0
Put another way: for any choice of significance level (α) greater than .0069, we would reject H0
We can only calculate this if we assume the sampling distribution of \(\bar{X}\) is centred at a particular value, which we assume to be the population mean under the null!
-
Student progress in BES...
Key features of past results for the 10-mark quiz
On average, students do well
Median = 9, mean = 8.5, only 5.3% with mark
-
Student progress in BES...
Q1: Suppose a student randomly chose another BES student to help him with his work
What is the probability that the chosen student's test mark is at least 9?
Q2: Define a "good" tutorial (of 20 students) to be one where the mean mark is at least 9
What is the probability that any given student is in a good tutorial?
Q3: Was the semester 1 2014 BES cohort any different from what we have seen in the past?
Test this hypothesis using data from a randomly-selected s1 2014 tutorial of size 25, in which the mean mark is 8.8
-
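Student progress in BES...
Q3: \(H_0: \mu = 8.5\), \(H_1: \mu \neq 8.5\)
By CLT, \(\bar{X} \sim N(8.5,\ 1.88^2/25)\). We can calculate the p-value associated with our sample mean, assuming the two-tailed test stated above.
\[p = 2\,P(\bar{X} > 8.8) = 2\,P\left(Z > \frac{8.8 - 8.5}{1.88/\sqrt{25}}\right) = 2\,P(Z > 0.80) = 2 \times 0.2119 = 0.4238\]
(8.8 is the sample mean in our sample!)
At, say, \(\alpha = 0.01\) (or any other conventional level!), there is insufficient evidence to reject \(H_0\).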
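The two-tailed p-value for Q3 can be checked numerically. The sketch below is illustrative only: the function name `two_tailed_p_value` is hypothetical, and it treats 1.88 as the known population standard deviation, as the slide does.

```python
from statistics import NormalDist


def two_tailed_p_value(x_bar, mu_0, sigma, n):
    """p-value for H0: mu = mu_0 against H1: mu != mu_0, with sigma known."""
    z = (x_bar - mu_0) / (sigma / n ** 0.5)
    return 2 * (1 - NormalDist().cdf(abs(z)))


p = two_tailed_p_value(x_bar=8.8, mu_0=8.5, sigma=1.88, n=25)
print(f"p-value = {p:.4f}")
```

Because this uses the unrounded test statistic (about 0.798) rather than the table value 0.80, the result is close to, but not identical to, 0.4238. The conclusion is the same at any conventional significance level.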
-
Hypothesis testing:
A note about types of errors
Concepts about errors one can make during hypothesis testing are similar in statistical inference and in the judicial system (introduced in last lecture)
Recall the McDonald's quality control example
A quarter-pounder with cheese is presumed to comprise 0.25 pounds (0.11 kg) of precooked meat
Given data (a sample of hamburgers), we can evaluate this implicit claim
Yet we might conclude that their hamburgers do contain 0.25 pounds when in fact they don't (false negative)
Or, we might conclude that their hamburgers don't contain 0.25 pounds when in fact they do (false positive)
-
Hypothesis testing examples and concepts
Type I errors occur when we reject a true null hypothesis
Only possible to make this error when the null is true
Denote P(Type I error) = P(Reject H0 | H0 true) = α (the significance level)
Type II errors occur when we don't reject a false null hypothesis
Only possible to make this error when the null is false
Denote P(Type II error) = β
P(Do not reject H0 | H0 not correct) = β
P(Type II error) depends on what the actual (alternative) parameter value is!
-
Calculating probability of Type II errors
Recall our McDonald's example:
Suppose we conduct this one-tailed test:
H0: μ = 0.25, H1: μ < 0.25
With n = 25, σ = 0.05, and α = 0.05, we previously found the relevant decision rule to be a rejection of H0 if we find that our sample mean < 0.2336
Suppose that in fact, McDonald's only puts 0.24 pounds in their quarter pounder. Will we detect this?
\[\beta = P(\bar{X} > .2336 \mid \mu = .24) = P\left(Z > \frac{.2336 - .24}{0.05/\sqrt{25}}\right) = P(Z > -0.64) = .7389\]
We now assume the sampling distribution is centred at the alternative!
There is a 74% chance of not detecting the discrepancy!
β is the probability that we get a test statistic that makes us fail to reject the null, given that an alternative is true
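The Type II error probability for this lower-tailed test can be computed directly. This is a minimal sketch rather than anything from the slides: the function name is hypothetical, and it uses the rounded cutoff 0.2336 quoted above so that the result lines up with the tabulated answer.

```python
from statistics import NormalDist


def type_ii_error_lower_tail(cutoff, mu_alt, sigma, n):
    """beta = P(X_bar >= cutoff | mu = mu_alt) for a test that rejects H0 when x_bar < cutoff."""
    z = (cutoff - mu_alt) / (sigma / n ** 0.5)
    return 1 - NormalDist().cdf(z)


beta = type_ii_error_lower_tail(cutoff=0.2336, mu_alt=0.24, sigma=0.05, n=25)
print(f"beta = {beta:.4f}, power = {1 - beta:.4f}")
```

Trying alternatives further below 0.25 (for example `mu_alt=0.22`) shows beta shrinking, which is the sense in which the Type II error probability depends on the actual parameter value.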
-
Power of a test
Power (in statistics): The probability of correctly rejecting a false null hypothesis
P(Do not reject H0 | H0 not correct) = β
P(Reject H0 | H0 not correct) = Power = 1 − β
From the prior slide:
\[\beta = P(Z > -0.64) = .7389\]
So, given a true population parameter of 0.24 pounds of meat in the quarter pounder, the power of this test is:
\[1 - \beta = 1 - 0.7389 = 0.2611\]
-
Power of a test
Suppose a company is considering installing an additional press to continuously extrude copper. The investment is only viable if the press extrudes more than 170 metres of copper per hour. This suggests the following hypothesis test:
H0: μ = 170; H1: μ > 170;
with the firm investing in an additional press upon rejection of the null.
A large random sample of 400 production hours from existing similar presses in the plant has a sample mean of 176 m/h and a sample standard deviation of 65 m/h. Further suppose that this sample is large enough to invoke the CLT.
-
Power of a test
Say the firm sets up the hypothesis test using α = 0.01.
H0: μ = 170; H1: μ > 170
n = 400, α = 0.01, s = 65
Decision rule:
Reject H0 if sample mean > 177.57
The firm would hate to make a mistake over this critical investment. If the new press had a mean production of 180 m/h, it would be a very attractive investment. What is the power of the test if, in actual fact, μ = 180?
\[\beta = P(\bar{X} < 177.57 \mid \mu = 180)\]
\[\beta = P\left(Z < \frac{177.57 - 180}{65/\sqrt{400}}\right)\]
\[\beta = P(Z < -0.75) = 0.2266\]
\[1 - \beta = 1 - 0.2266 = 0.7734\]
Verify at home!!
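One way to do the verification is to retrace the steps in code. The sketch below is an illustration, not the lecturer's working: it takes the table value z = 2.33 for α = 0.01 as given, treats s = 65 as if it were the population standard deviation (relying on the large sample), and uses variable names chosen here for readability.

```python
from statistics import NormalDist

mu_0, mu_alt = 170.0, 180.0   # null value and the alternative of interest (m/h)
s, n = 65.0, 400              # sample standard deviation and sample size
z_crit = 2.33                 # table value for alpha = 0.01, upper tail

std_error = s / n ** 0.5                          # 65 / 20 = 3.25
cutoff = mu_0 + z_crit * std_error                # reject H0 if x_bar > cutoff
beta = NormalDist(mu_alt, std_error).cdf(cutoff)  # P(X_bar < cutoff | mu = 180)
power = 1 - beta

print(f"cutoff = {cutoff:.2f}, beta = {beta:.4f}, power = {power:.4f}")
```

The cutoff reproduces the decision rule of 177.57. The computed power differs from 0.7734 only in the third decimal place, because the hand calculation rounds the standardized value to -0.75 before using the tables.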
-
Power of a test
What happens to power if we increase α to 0.05?
The power increases: 1 − β = 1 − 0.0764 = 0.9236.
What happens to power if we increase n to 1000?
The power increases: 1 − β ≈ 1 − 0 = 1
-
Power of a test...
Summary: H0: μ = 170; H1: μ = 180
If α = 0.01, then β = 0.2266 (Power = 0.7734) when n = 400
If α = 0.05, then β = 0.0764 (Power = 0.9236) when n = 400
If α = 0.05, then β ≈ 0.0000 (Power ≈ 1) when n = 1000
With a different alternative, e.g. H1: μ = 178 (verify!):
If α = 0.05, then β = 0.2075 (Power = 0.7925) when n = 400
This (closer-to-the-null) alternative is harder to detect!
P(Z
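The four scenarios in the summary can be compared side by side with a small helper. This is a sketch under the same assumptions as the copper-press example (upper-tailed test, s = 65 treated as the population standard deviation); the function name `upper_tail_power` and the scenario list are hypothetical, and exact normal quantiles are used instead of rounded table values, so the figures agree with the summary only to about two or three decimals.

```python
from statistics import NormalDist


def upper_tail_power(mu_0, mu_alt, s, n, alpha):
    """Power of the test of H0: mu = mu_0 vs H1: mu > mu_0 when the true mean is mu_alt."""
    std_error = s / n ** 0.5
    cutoff = mu_0 + NormalDist().inv_cdf(1 - alpha) * std_error
    return 1 - NormalDist(mu_alt, std_error).cdf(cutoff)


scenarios = [
    # (alpha, n, alternative mean)
    (0.01, 400, 180.0),
    (0.05, 400, 180.0),
    (0.05, 1000, 180.0),
    (0.05, 400, 178.0),
]
results = {sc: upper_tail_power(170.0, sc[2], 65.0, sc[1], sc[0]) for sc in scenarios}
for (alpha, n, mu_alt), power in results.items():
    print(f"alpha={alpha}, n={n}, mu={mu_alt}: power = {power:.4f}")
```

Reading down the output shows the three levers at work: a larger α, a larger n, or an alternative further from the null each raise the power.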
-
Progress report
We now have procedures to test hypotheses in a range of circumstances, using the sample proportion and the sample mean
We can use the standard normal tables and the t-tables, as appropriate, in generating confidence intervals and testing hypotheses
We know the mistakes we could make in our testing, and about the power of our tests.
Next week: Chi-squared tests.
After that, the final broad module of the course begins: Linear regression!