Inferensi Statistika
Confidence Interval
Mean, , is unknown
Population Random SampleI am 95%
confident that is between 40 &
60, which is
a point estimate + or - margin of
error.
Mean X = 50
Estimation Process
Sample
Provides Range of Values Based on Observations from 1 Sample
Gives Information about Closeness to Unknown Population Parameter
Stated in terms of Probability Never 100% Sure
Confidence Interval Estimation
Confidence Interval Sample Statistic
Confidence Limit (Lower)
Confidence Limit (Upper)
A Probability (confidence level, denoted by C) That the Population Parameter Falls
Somewhere Within the Interval.
Elements of Confidence Interval Estimation
90% Samples
95% Samples
estimate_
Confidence Intervals
xx .. 64516451
xx 96.196.1
xx .. 582582 99% Samples
nzXzX X
**
X_
Probability that the unknown population parameter falls
within the interval
Denoted by C = level of confidence
e.g. C=90%, 95%, 99%.
Level of Confidence
Confidence Intervals
Intervals Extend from
C proportion of Intervals Contains .
1-C proportion Does Not.
Area=C (1-C)/2(1-C)/2
X_
x_
Intervals & Level of Confidence
Sampling Distribution of
the Mean
toXzX *
XzX *
X
Data Variation
measured by Sample Size
Level of Confidence
C
Intervals Extend from
Factors Affecting Interval Width
X - z* to X + z*
xx
n/XX
Assumptions Population Standard Deviation Is Known Population Is Normally Distributed If Not Normal, use large samples
Confidence Interval Estimate
Confidence Intervals (Known)
nzX
* n
zX
*
Assumptions Population Standard Deviation Is
Unknown Population Must Be Normally Distributed
Use Student’s t Distribution
Confidence Interval Estimate
Confidence Intervals (Unknown)
n
StX n 1
* n
StX n 1
*
Zt
0
t (df = 5)
Standard Normal
t (df = 13)Bell-Shaped
Symmetric
‘Fatter’ Tails
Student’s t Distribution
Upper Tail Area
df .25 .10 .05
1 1.000 3.078 6.314
2 0.817 1.886 2.920
3 0.765 1.638 2.353
t0
Assume: n = 3 df = n - 1 = 2
= .10 /2 =.05
2.920t Values
/ 2=(1-C)/2
.05
Student’s t Table
A random sample of n = 25 has = 50 and s = 8. Set up a 95% confidence interval estimate for .
. .46 69 53 30
X
Example: Interval EstimationUnknown
n
StX n 1
* n
StX n 1
*
25
80639250 . 25
80639250 .
Sample Size
Too Big:•Requires toomuch resources
Too Small:•Won’t do the job
What sample size is needed to be 90% confident of being correct
within ± 5? A pilot study suggested that the standard
deviation is 45.
nZ
m
2 2
2
2 2
2
1645 45
5219 2 220
..
Example: Sample Size for Mean
Round Up
Test of Significance
The second method of making statistical inference.
It is to test a Hypothesis about the population parameters
A hypothesis is an assumption about the population parameter.
A parameter is a Population mean or proportion
The parameter must be identified before analysis.
I assume the mean GPA of this class is 3.5!
What is a Hypothesis?
States the Assumption (numerical) to be tested
e.g. The average # TV sets in US homes is at 3 (H0: 3)
Begin with the assumption that the null hypothesis is TRUE.
(Similar to the notion of innocent until proven guilty)
The Null Hypothesis, H0
•Refers to the Status Quo
•The Null Hypothesis may or may not be rejected.
Is the opposite of the null hypothesis e.g. The average # TV sets in US homes is less than 3 (Ha: < 3) or is NOT equal to 3 (Ha: 3).
Challenges the Status Quo Never contains the ‘=‘ sign The Alternative Hypothesis may
or may not be accepted
The Alternative Hypothesis, Ha
State the Null Hypothesis (H0: 3)
State the Alternative Hypothesis (Ha: < 3 or Ha: ) as the conclusion if H0 is not true.
Logically, if the alternative Ha: < 3 is not true, we can accept either Ho: > 3 or Ho: = 3.
Identify the Problem
Population
Assume thepopulationmean age is 50.(Null Hypothesis)
REJECT
The SampleMean Is 20
SampleNull Hypothesis
50?20 XIs
Hypothesis Testing Process
No, not likely!
Sample Mean = 50
Sampling DistributionIt is unlikely that we would get a sample mean of this value ...
... if in fact this were the population mean.
... Therefore, we reject the null
hypothesis that = 50.
20H0
Reason for Rejecting H0
Defines Unlikely Values of Sample Statistic if Null Hypothesis Is True Called Rejection Region of Sampling Distribution
Designated (alpha) Typical values are 0.01, 0.05, 0.10
Selected by the Researcher at the Start
Provides the Critical Value(s) of the Test
Level of Significance,
Level of Significance, and the Rejection Region
H0: 3
H1: < 30
0
0
H0: 3
H1: > 3
H0: 3
H1: 3
/2
Critical Value(s)
Rejection Regions
Convert Sample Statistic (e.g., ) to Standardized Z Variable
Compare to Critical Z Value(s) If Z test Statistic falls in Critical Region,
Reject H0; Otherwise Do Not Reject H0
Z-Test Statistics (Known)
Test Statistic
X
n
XXZ
X
X
1. State H0 H0 : 3
2. State H1 Ha :
3. Choose = .05
4. Choose n n = 100
5. Choose Test: Z Test (or p Value)
Hypothesis Testing: Steps
Test the Assumption that the true mean # of TV sets in US homes is at least 3.
6. Set Up Critical Value(s) Z = -1.645
7. Collect Data 100 households surveyed
8. Compute Test Statistic Computed Test Stat.= -2
9. Make Statistical Decision Reject Null Hypothesis
10. Express Decision The true mean # of TV set is less than 3 in the US
households.
Hypothesis Testing: Steps
Test the Assumption that the average # of TV sets in US homes is at least 3.
(continued)
Assumptions Population Is Normally Distributed If Not Normal, use large samples Null Hypothesis Has or Sign
Only
Z Test Statistic:
One-Tail Z Test for Mean (Known)
n
xxz
x
x
Z0
Reject H0
Z0
Reject H0
H0: H1: < 0
H0: 0 H1: > 0
Must Be Significantly Below = 0
Small values don’t contradict H0
Don’t Reject H0!
Rejection Region
Does an average box of cereal contain more than 368 grams of cereal? A random sample of 25 boxes showed X = 372.5. The company has specified to be 15 grams. Test at the 0.05 level.
368 gm.
Example: One Tail Test
H0: 368 H1: > 368
_
Z .04 .06
1.6 .5495 .5505 .5515
1.7 .5591 .5599 .5608
1.8 .5671 .5678 .5686
.5738 .5750
Z0
Z = 1
1.645
.50 -.05
.45
.05
1.9 .5744
Standardized Normal Probability Table (Portion)
What Is Z Given = 0.05?
= .05
Finding Critical Values: One Tail
Critical Value = 1.645
= 0.025n = 25Critical Value: 1.645
Test Statistic:
Decision:
Conclusion:
Do Not Reject at = .05
No Evidence True Mean Is More than 368Z0 1.645
.05
Reject
Example Solution: One Tail
H0: 368 H1: > 368 50.1
n
XZ
Probability of Obtaining a Test Statistic More Extreme or ) than Actual Sample Value Given H0 Is True
Called Observed Level of Significance Smallest Value of a H0 Can Be Rejected
Used to Make Rejection Decision If p value Do Not Reject H0
If p value <, Reject H0
p Value Approach -- used in the pdf Chapter
Z0 1.50
p Value.0668
Z Value of Sample Statistic
From Z Table: Lookup 1.50
.9332
Use the alternative hypothesis to find the direction of the test.
1.0000 - .9332 .0668
p Value is P(Z 1.50) = 0.0668
p Value Solution
0 1.50 Z
Reject
(p Value = 0.0668) ( = 0.05). Do Not Reject.
p Value = 0.0668
= 0.05
Test Statistic Is In the Do Not Reject Region
p Value Solution
Does an average box of cereal contains 368 grams of cereal? A random sample of 25 boxes showed X = 372.5. The company has specified to be 15 grams. Test at the 0.05 level.
368 gm.
Example: Two Tail Test
H0: 368
H1: 368
= 0.05n = 25Critical Value: ±1.96
Test Statistic:
Decision:
Conclusion:
Do Not Reject at = .05
No Evidence that True Mean Is Not 368Z0 1.96
.025
Reject
Example Solution: Two Tail
-1.96
.025
H0: 386
H1: 38650.1
2515
3685.372
n
XZ
Connection to Confidence Intervals
For X = 372.5oz, = 15 and n = 25,
The 95% Confidence Interval is:
372.5 - (1.96) 15/ 25 to 372.5 + (1.96) 15/ 25
or
366.62 378.38
If this interval contains the Hypothesized mean (368), we do not reject the null hypothesis.
It does. Do not reject.
_
Assumptions Population is normally distributed If not normal, only slightly skewed &
a large sample taken
Parametric test procedure t test statistic
t-Test: Unknown
nSX
t
Example: One Tail t-Test
Does an average box of cereal contain more than 368 grams of cereal? A random sample of 36 boxes showed X = 372.5, and 15. Test at the 0.01 level.
368 gm.
H0: 368 H1: 368
is not given,
= 0.01n = 36, df = 35Critical Value: 2.4377
Test Statistic:
Decision:
Conclusion:
Do Not Reject at = .01
No Evidence that True Mean Is More than 368Z0 2.4377
.01
Reject
Example Solution: One Tail
H0: 368 H1: 368 80.1
3615
3685.372
nSX
t
Type I Error Reject True Null Hypothesis Has Serious Consequences Probability of Type I Error Is
Called Level of Significance
Type II Error Do Not Reject False Null Hypothesis Probability of Type II Error Is
(Beta)
Errors in Making Decisions
H0: Innocent
Jury Trial Hypothesis Test
Actual Situation Actual Situation
Verdict Innocent Guilty Decision H0 True H0 False
Innocent Correct ErrorDo NotReject
H0
1 - Type IIError ( )
Guilty Error Correct RejectH0
Type IError( )
Power(1 - )
Result Possibilities
Reduce probability of one error and the other one goes up.
& Have an Inverse Relationship
True Value of Population Parameter Increases When Difference Between
Hypothesized Parameter & True Value Decreases
Significance Level Increases When Decreases
Population Standard Deviation Increases When Increases
Sample Size n Increases When n Decreases
Factors Affecting Type II Error,
n
rt test =n (r i test)2
---------------------------1 + (n-1) (r i test)2
Top Related