Transcript of Chapter 8
Slide 1
Chapter 8
Descriptive Statistics
Distributions
Inferential Statistics
• Estimating a population mean
• Confidence Intervals
• Margin of error
• Student’s t-distribution
Slide 2
Chips Ahoy! 1000 chips challenge
At least 1000 chocolate chips per 18 ounce bag?
Chip – “distinct piece of chocolate baked into
cookie dough”.
Stats class at US Air Force Academy
275 bags from all over country, 42 randomly
selected.
Cookies dissolved in water, chips counted.
Results…………
Slide 3
Results of 42 bags. . . . .
1200 1219 1103 1213 1258 1325 1295
1247 1098 1185 1087 1377 1363 1121
1279 1269 1199 1244 1294 1356 1137
1545 1135 1143 1215 1402 1419 1166
1132 1514 1270 1345 1214 1154 1307
1293 1546 1228 1239 1440 1219 1191
Estimate the mean number of chips per bag for all Chips Ahoy! Cookies?
Slide 4
Point Estimate
Questions:
1. What is the mean age of people in the labour force?
2. What is the mean starting salary of Arts graduates?
Answer: For a small population, take a census; for a large population, we need to sample. Sampling can achieve pretty accurate, cost-effective results.
The sample mean, x̄, is a point estimate of the population mean, μ.
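As an illustration of a point estimate, the sample mean of the 42 bag counts listed on Slide 3 can be computed directly. This is a minimal sketch; the variable names are illustrative only.

```python
from statistics import mean

# Chip counts for the 42 sampled bags (Slide 3)
chips = [
    1200, 1219, 1103, 1213, 1258, 1325, 1295,
    1247, 1098, 1185, 1087, 1377, 1363, 1121,
    1279, 1269, 1199, 1244, 1294, 1356, 1137,
    1545, 1135, 1143, 1215, 1402, 1419, 1166,
    1132, 1514, 1270, 1345, 1214, 1154, 1307,
    1293, 1546, 1228, 1239, 1440, 1219, 1191,
]

# The sample mean is the point estimate of the population mean
x_bar = mean(chips)
print(len(chips), round(x_bar, 3))  # 42 1261.571
```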
Slide 5
Confidence Interval
The sample mean will (probably) not exactly equal the population mean; some sampling error is to be expected. We need information on the accuracy of the point estimate: this is what the confidence level provides.
Confidence interval – abbreviate to CI
Slide 6
Confidence Interval – new mobile home prices
Random sample of n = 36 prices
x̄ = 42.2, σ = 7.2
Remember: x̄ is normally distributed, with standard deviation σ/√n = 7.2/√36 = 1.2
95.44% of sample means lie within 2 s.d. of μ (2 × 1.2 = 2.4)
x̄ - 2.4 = 39.8 and x̄ + 2.4 = 44.6
We can be 95.44% confident that μ lies between 39.8 and 44.6
Slide 7
Simulation for new mobile home prices
Slide 8
Simulation for new mobile home prices
Green dot signifies sample mean
In 19/20 (95%) of the samples the population mean lies inside the sample CI
If we simulate more than 40 samples, say 1000, expect about 95.44% of the CIs to contain the population mean!
Assume normally distributed?
x̄ is approx. normally distributed by the CLT, so the CI based on x̄ is approximately valid also.
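The simulation idea can be sketched in a few lines. This is not the simulation shown on the slides: the population mean MU is an assumed value (the slides do not give it), the population is assumed normal, and the seed is fixed only so the run is repeatable.

```python
import random
from math import sqrt

random.seed(8)  # fixed seed so the run is repeatable

MU = 42.0     # ASSUMPTION: true population mean, not given in the slides
SIGMA = 7.2   # population s.d. from the mobile home example
N = 36        # sample size from the mobile home example
HALF_WIDTH = 2 * SIGMA / sqrt(N)  # 2 s.d. of x-bar, i.e. 2.4


def ci_covers_mu():
    """Draw one sample and report whether x-bar +/- 2.4 contains MU."""
    sample = [random.gauss(MU, SIGMA) for _ in range(N)]
    x_bar = sum(sample) / N
    return x_bar - HALF_WIDTH <= MU <= x_bar + HALF_WIDTH


trials = 1000
coverage = sum(ci_covers_mu() for _ in range(trials)) / trials
print(f"{coverage:.1%} of {trials} intervals contain mu")
```

The printed share should sit close to 95.44%, but it varies a little from run to run if the seed is changed.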
Slide 9
Obtaining CIs for a population mean when σ is known
95.44% of all samples have means within 2 s.d. of μ.
Generally, 100(1-α)% of all samples have means within zα/2 s.d.'s of μ.
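The link between a confidence level and its z value can be checked numerically. A minimal sketch using Python's standard library; the helper name z_alpha_over_2 is illustrative only.

```python
from statistics import NormalDist

std_normal = NormalDist()  # standard normal: mean 0, s.d. 1

# Share of a normal distribution lying within 2 s.d. of its mean
within_2sd = std_normal.cdf(2) - std_normal.cdf(-2)
print(round(within_2sd, 4))


def z_alpha_over_2(confidence):
    """z value that leaves alpha/2 in each tail for the given confidence level."""
    alpha = 1 - confidence
    return std_normal.inv_cdf(1 - alpha / 2)


for level in (0.90, 0.95, 0.9544, 0.99):
    print(level, round(z_alpha_over_2(level), 3))
```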
Slide 10
More Formally
Slide 11
When to use the Z-interval procedure
For small samples (n < 15) the z-interval procedure should be used only when the variable under consideration is normally distributed (or close).
For samples of moderate size (15 ≤ n < 30) the z-interval procedure can be used unless the data contain outliers or the variable under consideration is far from being normally distributed.
For large samples (n ≥ 30) the z-interval procedure can be used essentially without restriction. However, if outliers are present and their removal is not justified, the effect of the outliers on the CI should be examined; if the effect is significant, resample.
Slide 12
Fundamental Principle of Data Analysis
Before performing a statistical inference procedure, examine the sample data. If any of the conditions required for using the procedure appear to be violated, do not apply the procedure. Instead use a different, more appropriate procedure or, if you are unaware of one, consult a statistician.
Slide 13
Age of the labour force
Sample of 50 people’s ages, σ = 12.1 years, x̄ = 36.4 years
Find a 95% CI?
Table II: for a 95% CI, 1-α = 0.95, so α = 0.05, or 0.025 in each tail; we get a value of zα/2 = 1.96.
x̄ ± zα/2 · σ/√n
36.4 ± 1.96 × 12.1/√50
33.0 to 39.8
We can be 95% confident that the mean age, μ, of all the people in the labour force is somewhere between 33.0 and 39.8 years.
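The labour force interval can be reproduced with a short function. A minimal sketch, assuming σ is known; the function name z_interval is illustrative, and the z value is computed rather than read from Table II.

```python
from math import sqrt
from statistics import NormalDist


def z_interval(x_bar, sigma, n, confidence=0.95):
    """Return (low, high) for x_bar +/- z * sigma / sqrt(n)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sigma / sqrt(n)
    return x_bar - margin, x_bar + margin


low, high = z_interval(x_bar=36.4, sigma=12.1, n=50)
print(round(low, 1), round(high, 1))  # 33.0 39.8
```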
Slide 14
Confidence and precision
For a fixed sample size, decreasing the confidence level increases the precision (a narrower interval), and vice versa.
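The trade-off can be seen by holding the sample fixed and varying only the confidence level. A minimal sketch that reuses the labour force figures (σ = 12.1, n = 50); the three confidence levels are chosen only for illustration.

```python
from math import sqrt
from statistics import NormalDist

sigma, n = 12.1, 50  # labour force example

# Margin of error at each confidence level, sample size held fixed
margins = {}
for confidence in (0.90, 0.95, 0.99):
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margins[confidence] = z * sigma / sqrt(n)
    print(f"{confidence:.0%}: margin of error = {margins[confidence]:.2f} years")
```

A lower confidence level gives a smaller margin of error, hence a more precise interval.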
Slide 15
Sample Size for estimating μ
The sample size required for a (1-α) level confidence interval for μ with a specified margin of error, E, is given by the formula
n = (zα/2 · σ / E)²
rounded up to the nearest whole number.
e.g. 95% CI for μ with a margin of error of 0.5 year (σ = 12.1 years):
n = (1.96 × 12.1 / 0.5)² = 2249.79, so n = 2250
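The sample size calculation is easy to script. A minimal sketch using the table value z = 1.96; the function name sample_size is illustrative only.

```python
from math import ceil


def sample_size(z, sigma, margin_of_error):
    """Smallest whole n with z * sigma / sqrt(n) <= margin_of_error."""
    return ceil((z * sigma / margin_of_error) ** 2)


# 95% confidence, sigma = 12.1 years, margin of error E = 0.5 year
n = sample_size(z=1.96, sigma=12.1, margin_of_error=0.5)
print(n)  # 2250
```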
Slide 16
CI for population mean when σ is unknown
Standardised version of x̄ (σ known): z = (x̄ - μ) / (σ/√n)
With s in place of σ: t = (x̄ - μ) / (s/√n), which follows Student's t-distribution
Different dist. for each sample size; each dist. is identified by its 'degrees of freedom', df = n-1
Slide 17
t-dist & t-curves
Thicker tails than the standard normal curve; the t-curves converge on the normal dist. as df increases
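The thicker tails can be checked by comparing curve heights at a point far from the centre. A minimal sketch that codes the standard t density formula by hand (the standard library has no t-distribution); the comparison point x = 3 and the df values are chosen only for illustration.

```python
from math import exp, lgamma, log, pi
from statistics import NormalDist


def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    # Work on the log scale so large df values do not overflow
    log_coeff = lgamma((df + 1) / 2) - lgamma(df / 2) - 0.5 * log(df * pi)
    return exp(log_coeff - (df + 1) / 2 * log(1 + x * x / df))


normal_pdf = NormalDist().pdf

# Height of each curve at x = 3, out in the tail
for df in (2, 10, 30, 1000):
    print(df, round(t_pdf(3, df), 5))
print("normal", round(normal_pdf(3), 5))
```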
Slide 18
Properties
Slide 19
Using the t-table
Slide 20
σ unknown
Pick pocket crimes
447 207 627 430 883
313 844 253 397 214
217 768 1064 26 587
833 277 805 653 549
649 554 570 223 443
Slide 21
Value ($) lost for sample of 25 pick pocket crimes
n = 25
df = n-1 = 25-1 = 24
x̄ = 513.32
s = 262.23
Slide 22
Value lost for 25 pick pocket crimes
Normality? Outliers?
x̄ ± tα/2 · s/√n
513.32 ± 2.064 × 262.23/√25
95% CI for mean value lost
405.07 to 621.57
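The pick pocket interval can be recomputed from the 25 raw values on Slide 20. A minimal sketch: the critical value 2.064 (t0.025 with df = 24) is taken from the slide's calculation rather than computed, because Python's standard library has no t-distribution.

```python
from math import sqrt
from statistics import mean, stdev

# Value ($) lost in each of the 25 sampled pick pocket crimes (Slide 20)
losses = [
    447, 207, 627, 430, 883,
    313, 844, 253, 397, 214,
    217, 768, 1064, 26, 587,
    833, 277, 805, 653, 549,
    649, 554, 570, 223, 443,
]

T_CRIT = 2.064  # t0.025 with df = 24, table value used on the slide

n = len(losses)
x_bar = mean(losses)
s = stdev(losses)  # sample standard deviation (divides by n - 1)
margin = T_CRIT * s / sqrt(n)
low, high = x_bar - margin, x_bar + margin

print(round(x_bar, 2), round(s, 2))   # 513.32 262.23
print(round(low, 2), round(high, 2))  # 405.07 621.57
```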
Slide 23
Treatment of Outliers
Slide 24
Chips Ahoy! 1000 chips challenge
Variable   Obs   Mean       Std. Err.   [95% Conf. Interval]
Chips      42    1261.571   18.14284    1224.931   1298.212
n = 42; d.f. = 41; tα/2 = t0.025 = 2.021; x̄ = 1261.6; s = 117.6
x̄ ± tα/2 · s/√n
1261.6 ± 2.021 × 117.6/√42
1224.9 to 1298.2
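The same t-interval steps can be applied to the 42 bag counts from Slide 3 to revisit the 1000 chips challenge. A minimal sketch: it assumes the critical value 2.021, the usual table entry for df = 40 (the nearest row to df = 41), so the limits can differ slightly from the software output, which uses the exact df.

```python
from math import sqrt
from statistics import mean, stdev

# Chip counts for the 42 sampled bags (Slide 3)
chips = [
    1200, 1219, 1103, 1213, 1258, 1325, 1295,
    1247, 1098, 1185, 1087, 1377, 1363, 1121,
    1279, 1269, 1199, 1244, 1294, 1356, 1137,
    1545, 1135, 1143, 1215, 1402, 1419, 1166,
    1132, 1514, 1270, 1345, 1214, 1154, 1307,
    1293, 1546, 1228, 1239, 1440, 1219, 1191,
]

T_CRIT = 2.021  # ASSUMPTION: table value of t0.025 for df = 40

n = len(chips)
x_bar = mean(chips)
std_err = stdev(chips) / sqrt(n)  # standard error of the mean
low = x_bar - T_CRIT * std_err
high = x_bar + T_CRIT * std_err

print(f"95% CI: {low:.1f} to {high:.1f}")
print("Whole interval above 1000 chips:", low > 1000)
```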
[Figure: normal probability plot, Chips Ahoy! (vertical axis: Normal F[(var1-m)/s], 0.00 to 1.00; horizontal axis: 0.00 to 1.00)]
[Figure: histogram, Chips Ahoy! (vertical axis: Density, 0 to .005; horizontal axis: chips, 1100 to 1500)]
Slide 25
Department of Finance
Monthly Economic Bulletin October 2014
Monthly bulletin best source of Dept Finance general
economic information.
It contains many of the variables that we have looked
at in other publications.
This publication also contains some detailed
budgetary and economic statistics.
Slide 26
Slide 28
Slide 29
Slide 30