PPA 501 – Analytical Methods in Administration
Lecture 6a – Normal Curve, Z-Scores, and Estimation
Normal Curve
The normal curve is central to the theory that underlies inferential statistics.
The normal curve is a theoretical model: a frequency polygon that is perfectly symmetrical and smooth. It is bell shaped and unimodal, with infinite tails.
Crucial point: distances along the horizontal axis, when measured in standard deviations, always encompass the same proportion of the area under the curve.
Computing Z-Scores
To find the percentage of the total area (or number of cases) above, below, or between scores in an empirical distribution, the original scores must be expressed in units of the standard deviation or converted into Z scores.
\[ Z = \frac{X_i - \bar{X}}{s} \]
Computing Z-Scores – Mean ideology of House delegation by state.
Computing Z-Scores: Examples
What percentage of the cases fall between -0.5 and 0.0 on the ideology scale? (The mean is 0.01 and the standard deviation is 0.205.)
From Excel, =STANDARDIZE(-0.5, 0.01, 0.205) gives Z = -2.4878; =NORMSDIST(-2.4878) gives p = 0.006427.
From Excel, =STANDARDIZE(0.0, 0.01, 0.205) gives Z = -0.04878; =NORMSDIST(-0.04878) gives p = 0.480547.
P(-0.5 to 0.0) = 0.480547 - 0.006427 = 0.474120. 47.4% of the distribution lies between -0.5 and 0 on the ideology scale.
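\[ Z = \frac{X_i - \bar{X}}{s} = \frac{-0.5 - 0.01}{0.205} = \frac{-0.51}{0.205} = -2.488 \]
\[ Z = \frac{X_i - \bar{X}}{s} = \frac{0.00 - 0.01}{0.205} = \frac{-0.01}{0.205} = -0.049 \]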
Computing Z-Scores: Examples
What percentage of the House delegations from 1953 to 2005 have more conservative scores than 0.5? (1 - .992 = 0.008 or 0.8%)
What percentage have more liberal scores than -0.25? (10.2%).
Xi      Xmean   s       Z           p
0.5     0.01    0.205   2.390244    0.991581
-0.25   0.01    0.205   -1.268293   0.102347
Computing Z-scores: Rules
If you want the area between a score and the mean, subtract the probability from .5 if Z is negative. Subtract .5 from the probability if Z is positive.
If you want the area beyond a score: for the area below a score that is lower than the mean, use the probability from Excel directly. For the area above a score that is higher than the mean, subtract the Excel probability from 1.
Computing Z-scores: Rules
If you want the difference between two scores other than the mean: Calculate Z for each score, identify the
appropriate probability, and subtract the smaller probability from the larger.
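The rules above can be sketched in Python. This is a minimal illustration, assuming the standard library's `statistics.NormalDist` as a stand-in for Excel's NORMSDIST; the helper names (`z_score`, `area_below`, `area_above`, `area_between`) are hypothetical and not part of the lecture.

```python
from statistics import NormalDist


def z_score(x, mean, sd):
    """Express a raw score in standard-deviation units."""
    return (x - mean) / sd


def area_below(z):
    """Cumulative area under the standard normal curve below z."""
    return NormalDist().cdf(z)


def area_above(z):
    """Area above z: subtract the cumulative probability from 1."""
    return 1.0 - area_below(z)


def area_between(z1, z2):
    """Area between two scores: larger probability minus the smaller."""
    return abs(area_below(z2) - area_below(z1))


# House ideology example from the slides: mean 0.01, s 0.205
MEAN, SD = 0.01, 0.205
between = area_between(z_score(-0.5, MEAN, SD), z_score(0.0, MEAN, SD))
above = area_above(z_score(0.5, MEAN, SD))
below = area_below(z_score(-0.25, MEAN, SD))
print(f"between -0.5 and 0.0: {between:.4f}")
print(f"more conservative than 0.5: {above:.4f}")
print(f"more liberal than -0.25: {below:.4f}")
```

The three printed values should agree with the Excel results quoted in the examples (about 47.4%, 0.8%, and 10.2%).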
Estimation Procedures
Bias – does the mean of the sampling distribution equal the mean of the population?
Efficiency – how closely does the sampling distribution cluster around the mean? You can improve efficiency by increasing sample size.
Estimation Procedures
Point estimate – draw a sample, calculate a proportion or mean, and estimate that the population has the same value as the sample. There is always some probability of error.
Estimation Procedures
Confidence interval – a range around the sample mean. First step: choose a confidence level, that is, how much error you are willing to tolerate. The common standard is 5% or .05: you are willing to be wrong 5% of the time in estimating population values. This figure is known as alpha or α. If an infinite number of confidence intervals are constructed, 95% will contain the population mean and 5% won't.
Estimation Procedures
We now work in reverse on the normal curve. Divide the probability of error between the upper and lower tails of the curve (so that the 95% is in the middle), and find the Z-score that leaves 2.5% of the area under the curve in each tail. That Z-score is ±1.96.
Similar Z-scores for 90% (alpha=.10), 99% (alpha=.01), and 99.9% (alpha=.001) are ±1.65, ±2.58, and ±3.29.
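The critical Z-scores above can be recovered by working backward from alpha. The sketch below assumes Python's `statistics.NormalDist.inv_cdf` as the inverse normal function; `critical_z` is a hypothetical helper name.

```python
from statistics import NormalDist


def critical_z(alpha):
    """Two-tailed critical Z: leaves alpha / 2 of the area in each tail."""
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)


for alpha in (0.10, 0.05, 0.01, 0.001):
    print(f"alpha = {alpha}: Z = +/-{critical_z(alpha):.3f}")
```

The results are roughly 1.645, 1.960, 2.576, and 3.291, which round to the values listed on the slide.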
Estimation Procedures
\[ c.i. = \bar{X} \pm Z \left( \frac{\sigma}{\sqrt{N}} \right) \]
where
c.i. = confidence interval
$\bar{X}$ = the sample mean
$Z$ = the Z score as determined by the alpha level
$\sigma / \sqrt{N}$ = the population standard error of the mean
Estimation Procedures – Sample Mean
\[ c.i. = \bar{X} \pm Z \left( \frac{s}{\sqrt{n - 1}} \right) \]
where
c.i. = confidence interval
$\bar{X}$ = the sample mean
$Z$ = the Z score as determined by the alpha level
$s / \sqrt{n - 1}$ = the standard error of the mean
Only use if sample is 100 or greater
Estimation Procedures
You can control the width of the confidence intervals by adjusting the confidence level or alpha or by adjusting sample size.
Confidence Interval Examples
Mean House Ideology for presidential disaster requests (1953 to 2005) with 90%, 95%, and 99% confidence intervals.
Confidence Interval Examples from Presidential Disaster Decisions, 1953 to 2005

Variable                                      Mean    Std. Deviation  No. of cases  Std. Error  Cnfd. Int.  Lower Bound  Upper Bound
Mean Ideology of House Delegation by State    0.006   0.205           2493          0.004       90%         0.000        0.013
Mean Ideology of House Delegation by State    0.006   0.205           2493          0.004       95%         -0.002       0.015
Mean Ideology of House Delegation by State    0.006   0.205           2493          0.004       99%         -0.004       0.017
Mean Ideology of Senate Delegation by State   -0.022  0.300           2493          0.006       90%         -0.031       -0.012
Mean Ideology of Senate Delegation by State   -0.022  0.300           2493          0.006       95%         -0.033       -0.010
Mean Ideology of Senate Delegation by State   -0.022  0.300           2493          0.006       99%         -0.037       -0.006
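The sample-mean interval formula can be sketched as follows. This is an illustration only: `confidence_interval` is a hypothetical helper, the exact inverse-normal Z is used instead of the rounded table value, and the inputs are the rounded Senate figures from the table, so the bounds agree with the published ones only approximately.

```python
import math
from statistics import NormalDist


def confidence_interval(mean, sd, n, alpha=0.05):
    """Return (lower, upper) for mean +/- Z * s / sqrt(n - 1)."""
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # two-tailed critical Z
    std_error = sd / math.sqrt(n - 1)
    return mean - z * std_error, mean + z * std_error


# Senate delegation ideology: mean -0.022, s 0.300, N 2493
for alpha in (0.10, 0.05, 0.01):
    lower, upper = confidence_interval(-0.022, 0.300, 2493, alpha)
    print(f"{100 * (1 - alpha):.0f}%: {lower:.3f} to {upper:.3f}")
```

Lowering alpha (raising the confidence level) widens the interval, while a larger sample narrows it, which is the trade-off described in the Estimation Procedures slide.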
PPA 501 – Analytical Methods in Administration
Lecture 6b – One-Sample and Two-Sample Tests
Five-step Model of Hypothesis Testing
Step 1. Making assumptions and meeting test requirements.
Step 2. Stating the null hypothesis.
Step 3. Selecting the sampling distribution and establishing the critical region.
Step 4. Computing the test statistic.
Step 5. Making a decision and interpreting the results of the test.
Five-step Model of Hypothesis Testing – One-sample Z Scores
Step 1. Making assumptions. Model: random sampling. Interval-ratio measurement. Normal sampling distribution.
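Step 2. Stating the null hypothesis (no difference) and the research hypothesis.
\[ H_0: \mu_1 = \mu \]
\[ H_1: \mu_1 \neq \mu \ \text{(two-tailed test)}; \quad \mu_1 > \mu \ \text{(one-tailed test)}; \quad \text{or } \mu_1 < \mu \ \text{(one-tailed test)} \]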
Five-step Model of Hypothesis Testing – One-sample Z Scores
Step 3. Selecting the sampling distribution and establishing the critical region. Sampling distribution = Z distribution. α = 0.05. Z(critical) = ±1.96 (two-tailed); +1.65 or -1.65 (one-tailed).
Five-step Model of Hypothesis Testing – One-sample Z Scores
Step 4. Computing the test statistic. Use z-formula.
Step 5. Making a decision. Compare z-critical to z-obtained. If z-obtained is greater in magnitude than z-critical, reject the null hypothesis. Otherwise, fail to reject the null hypothesis.
Five-Step Model: Critical Choices
Choice of alpha level: .05, .01, .001.
Selection of research hypothesis.
Two-tailed test: the research hypothesis simply states that the means of the sample and the population are different.
One-tailed test: the mean of the sample is larger or smaller than the mean of the population.
Type of error to minimize: Type I or Type II.
Type I – rejecting a null hypothesis that is true.
Type II – failing to reject a null hypothesis that is false.
Five-step Model: Example
Is the average age of voters in the 2000 National Election Study different than the average age of all adults in the U.S. population?
Five-step Model of Hypothesis Testing – Large-sample Z Scores
Step 1. Making assumptions. Model: random sampling. Interval-ratio measurement. Normal sampling distribution.
Step 2. Stating the null hypothesis (no difference) and the research hypothesis.
\[ H_0: \mu_1 = 45.24 \]
\[ H_1: \mu_1 \neq 45.24 \ \text{(two-tailed test)} \]
Five-step Model of Hypothesis Testing – Large-sample Z Scores
Step 3. Selecting the sampling distribution and establishing the critical region. Sampling distribution = Z distribution. α = 0.05. Z(critical) = ±1.96 (two-tailed).
Five-step Model of Hypothesis Testing – Large-sample Z Scores
Step 4. Computing the test statistic.
\[ Z(\text{obtained}) = \frac{\bar{X} - \mu}{\sigma / \sqrt{N}} = \frac{47.21 - 45.24}{17.88 / \sqrt{1798}} = \frac{1.97}{.4217} = 4.67 \]
Step 5. Making a decision.
\[ Z(\text{obtained}) = 4.67 > Z(\text{critical}) = 1.96 \]
Reject the null hypothesis of no difference. The sample is significantly older than the voting age population.
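The large-sample test above can be checked with a few lines of Python. This is a sketch using the figures from the slide (sample mean 47.21, population mean 45.24, standard deviation 17.88, N = 1798); `z_obtained` is a hypothetical helper name.

```python
import math


def z_obtained(sample_mean, pop_mean, sd, n):
    """One-sample Z: (sample mean - population mean) / (sd / sqrt(N))."""
    return (sample_mean - pop_mean) / (sd / math.sqrt(n))


Z_CRITICAL = 1.96  # two-tailed, alpha = 0.05

z = z_obtained(47.21, 45.24, 17.88, 1798)
reject_null = abs(z) > Z_CRITICAL
print(f"Z(obtained) = {z:.2f}, reject null: {reject_null}")
```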
Five-Step Model: Small Sample T-test (One Sample)
Formula
\[ t(\text{obtained}) = \frac{\bar{X} - \mu}{s / \sqrt{N - 1}} \]
Five-Step Model: Small Sample T-test (One Sample)
Step 1. Making Assumptions. Random sampling. Interval-ratio measurement. Normal sampling distribution.
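Step 2. Stating the null hypothesis.
\[ H_0: \mu_1 = \mu \]
\[ H_1: \mu_1 \neq \mu \ \text{(two-tailed test)}; \quad \mu_1 > \mu \ \text{(one-tailed test)}; \quad \text{or } \mu_1 < \mu \ \text{(one-tailed test)} \]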
Five-step Model of Hypothesis Testing – One-sample t Scores
Step 3. Selecting the sampling distribution and establishing the critical region. Sampling distribution = t distribution. α = 0.05. df = N - 1. t(critical) from Appendix A, Table B in Agresti and Franklin.
Five-step Model of Hypothesis Testing – One-sample t Scores
Step 4. Computing the test statistic.
\[ t(\text{obtained}) = \frac{\bar{X} - \mu}{s / \sqrt{N - 1}} \]
Step 5. Making a decision. Compare t-critical to t-obtained. If t-obtained is greater in magnitude than t-critical, reject the null hypothesis. Otherwise, fail to reject the null hypothesis.
Five-step Model of Hypothesis Testing – One-sample t Scores
Is the average age of individuals in the JCHA 2000 sample survey higher than the national average age for all adults? (One-tailed.)
Five-Step Model: Small Sample T-test (One Sample) – JCHA 2000
Step 1. Making Assumptions. Random sampling. Interval-ratio measurement. Normal sampling distribution.
Step 2. Stating the null hypothesis.
\[ H_0: \mu_1 = 45.24 \]
\[ H_1: \mu_1 > 45.24 \ \text{(one-tailed test)} \]
Five-Step Model: Small Sample T-test (One Sample) – JCHA 2000
Step 3. Selecting the sampling distribution and establishing the critical region. Sampling distribution = t distribution. α = 0.05. df = 41 - 1 = 40. t(critical) = 1.684.
Five-Step Model: Small Sample T-test (One Sample) – JCHA 2000
Step 4. Computing the test statistic.
\[ t(\text{obtained}) = \frac{\bar{X} - \mu}{s / \sqrt{N - 1}} = \frac{52.78 - 45.24}{20.866 / \sqrt{40}} = \frac{7.54}{3.299} = 2.29 \]
Step 5. Making a decision. t(obtained) > t(critical). Therefore, reject the null hypothesis. The sample of residents from the Jefferson County Housing Authority is significantly older than the adult population of the United States.
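A short Python sketch of the same one-sample t computation, using the JCHA figures from the slide (mean 52.78, s = 20.866, N = 41). The helper name `t_obtained` is hypothetical, and the critical value is entered by hand from the table because the Python standard library has no t distribution.

```python
import math


def t_obtained(sample_mean, pop_mean, sd, n):
    """One-sample t: (sample mean - population mean) / (s / sqrt(N - 1))."""
    return (sample_mean - pop_mean) / (sd / math.sqrt(n - 1))


T_CRITICAL = 1.684  # one-tailed, alpha = 0.05, df = 40 (table value from the slide)

t = t_obtained(52.78, 45.24, 20.866, 41)
reject_null = t > T_CRITICAL  # one-tailed test in the positive direction
print(f"t(obtained) = {t:.2f}, reject null: {reject_null}")
```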
Two-Sample Models – Large Samples
Most of the time we do not have the population means or proportions. All we can do is compare the means or proportions of population subsamples.
Adds the additional assumption of independent random samples.
Two-Sample Models – Large Samples
Formula.
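\[ Z(\text{obtained}) = \frac{\bar{X}_1 - \bar{X}_2}{\sigma_{\bar{X}_1 - \bar{X}_2}} \]
\[ \sigma_{\bar{X}_1 - \bar{X}_2} = \sqrt{\frac{s_1^2}{N_1 - 1} + \frac{s_2^2}{N_2 - 1}} \]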
Five-Step Model – Large Two-Sample Tests (Z Distribution)
Step 1. Making assumptions. Model: Independent random samples. Interval-ratio measurement. Normal sampling distribution.
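Step 2. Stating the null hypothesis (no difference) and the research hypothesis.
\[ H_0: \mu_1 = \mu_2 \]
\[ H_1: \mu_1 \neq \mu_2 \ \text{(two-tailed test)}; \quad \mu_1 > \mu_2 \ \text{(one-tailed test)}; \quad \text{or } \mu_1 < \mu_2 \ \text{(one-tailed test)} \]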
Five-Step Model – Large Two-Sample Tests (Z Distribution)
Step 3. Selecting the sampling distribution and establishing the critical region. Sampling distribution = Z distribution. α = 0.05. Z(critical) = ±1.96 (two-tailed); +1.65 or -1.65 (one-tailed).
Five-Step Model – Large Two-Sample Tests (Z Distribution)
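Step 4. Computing the test statistic.
\[ Z(\text{obtained}) = \frac{\bar{X}_1 - \bar{X}_2}{\sigma_{\bar{X}_1 - \bar{X}_2}}, \qquad \sigma_{\bar{X}_1 - \bar{X}_2} = \sqrt{\frac{s_1^2}{N_1 - 1} + \frac{s_2^2}{N_2 - 1}} \]
Step 5. Making a decision. Compare z-critical to z-obtained. If z-obtained is greater in magnitude than z-critical, reject the null hypothesis. Otherwise, fail to reject the null hypothesis.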
Five-Step Model – Large Two-Sample Tests (Z Distribution)
Do non-white citizens of Birmingham, Alabama, believe that discrimination is more of a problem than white citizens?
Five-Step Model – Large Two-Sample Tests (Fair Housing)
Step 1. Making assumptions. Model: Independent random samples. Interval-ratio measurement. Normal sampling distribution.
Step 2. Stating the null hypothesis (no difference) and the research hypothesis.
\[ H_0: \mu_1 = \mu_2 \]
\[ H_1: \mu_1 > \mu_2 \ \text{(one-tailed test)} \]
Five-Step Model – Large Two-Sample Tests (Z Distribution)
Step 3. Selecting the sampling distribution and establishing the critical region. Sampling distribution = Z distribution. α = 0.05. Z(critical) = +1.65 (one-tailed).
Five-Step Model – Large Two-Sample Tests (Z Distribution)
Step 4. Computing the test statistic.
\[ Z(\text{obtained}) = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\dfrac{s_1^2}{N_1 - 1} + \dfrac{s_2^2}{N_2 - 1}}} = \frac{2.70 - 2.14}{\sqrt{\dfrac{(1.058)^2}{141} + \dfrac{(.966)^2}{42}}} = \frac{.56}{\sqrt{.008 + .022}} = \frac{.56}{.173} = 3.224 \]
Step 5. Making a decision. Z(obtained) is greater than Z(critical), therefore reject the null hypothesis of no difference. Non-whites believe that discrimination is more of a problem in Birmingham.
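The two-sample Z computation can be sketched in Python. The means and standard deviations come from the slide; the sample sizes 142 and 43 are an assumption, chosen so that N - 1 matches the denominators 141 and 42 shown in the worked calculation. `two_sample_z` is a hypothetical helper name.

```python
import math


def two_sample_z(mean1, sd1, n1, mean2, sd2, n2):
    """Difference of means over sqrt(s1^2 / (N1 - 1) + s2^2 / (N2 - 1))."""
    se_diff = math.sqrt(sd1 ** 2 / (n1 - 1) + sd2 ** 2 / (n2 - 1))
    return (mean1 - mean2) / se_diff


Z_CRITICAL = 1.65  # one-tailed, alpha = 0.05

# n1 = 142 and n2 = 43 are assumed so that N - 1 gives 141 and 42
z = two_sample_z(2.70, 1.058, 142, 2.14, 0.966, 43)
reject_null = z > Z_CRITICAL
print(f"Z(obtained) = {z:.3f}, reject null: {reject_null}")
```

Carrying full precision through the denominator gives about 3.22, in line with the 3.224 reported on the slide.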
Five-Step Model – Small Two-Sample Tests
If N1 + N2 < 100, use this formula.
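\[ t(\text{obtained}) = \frac{\bar{X}_1 - \bar{X}_2}{\sigma_{\bar{X}_1 - \bar{X}_2}} \]
\[ \sigma_{\bar{X}_1 - \bar{X}_2} = \sqrt{\frac{N_1 s_1^2 + N_2 s_2^2}{N_1 + N_2 - 2}} \sqrt{\frac{N_1 + N_2}{N_1 N_2}} \]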
Five-Step Model – Small Two-Sample Tests (t Distribution)
Step 1. Making assumptions. Model: Independent random samples. Interval-ratio measurement. Equal population variances ($\sigma_1^2 = \sigma_2^2$). Normal sampling distribution.
Step 2. Stating the null hypothesis (no difference) and the research hypothesis.
\[ H_0: \mu_1 = \mu_2 \]
\[ H_1: \mu_1 \neq \mu_2 \ \text{(two-tailed test)}; \quad \mu_1 > \mu_2 \ \text{(one-tailed test)}; \quad \text{or } \mu_1 < \mu_2 \ \text{(one-tailed test)} \]
Five-Step Model – Small Two-Sample Tests (t Distribution)
Step 3. Selecting the sampling distribution and establishing the critical region. Sampling distribution = t distribution. α = 0.05. df = N1 + N2 - 2. For t(critical), see Appendix A, Table B.
Five-Step Model – Small Two-Sample Tests (t Distribution)
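Step 4. Computing the test statistic.
\[ t(\text{obtained}) = \frac{\bar{X}_1 - \bar{X}_2}{\sigma_{\bar{X}_1 - \bar{X}_2}}, \qquad \sigma_{\bar{X}_1 - \bar{X}_2} = \sqrt{\frac{N_1 s_1^2 + N_2 s_2^2}{N_1 + N_2 - 2}} \sqrt{\frac{N_1 + N_2}{N_1 N_2}} \]
Step 5. Making a decision. Compare t-critical to t-obtained. If t-obtained is greater in magnitude than t-critical, reject the null hypothesis. Otherwise, fail to reject the null hypothesis.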
Five-Step Model – Small Two-Sample Tests (t Distribution)
Did white and nonwhite residents of the Jefferson County Housing Authority have significantly different lengths of residence in 2000?
Five-Step Model – Small Two-Sample Tests (JCHA 2000)
Step 1. Making assumptions. Model: Independent random samples. Interval-ratio measurement. Equal population variances ($\sigma_1^2 = \sigma_2^2$). Normal sampling distribution.
Step 2. Stating the null hypothesis (no difference) and the research hypothesis.
\[ H_0: \mu_1 = \mu_2 \]
\[ H_1: \mu_1 \neq \mu_2 \ \text{(two-tailed test)} \]
Five-Step Model – Small Two-Sample Tests (JCHA 2000)
Step 3. Selecting the sampling distribution and establishing the critical region. Sampling distribution = t distribution. α = 0.05, two-tailed. df = N1 + N2 - 2 = 14 + 25 - 2 = 37. t(critical) from Appendix B = 2.042.
Five-Step Model – Small Two-Sample Tests (t Distribution)
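Step 4. Computing the test statistic.
\[
\begin{aligned}
t(\text{obtained}) &= \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\dfrac{N_1 s_1^2 + N_2 s_2^2}{N_1 + N_2 - 2}} \sqrt{\dfrac{N_1 + N_2}{N_1 N_2}}}
 = \frac{70.21 - 82.84}{\sqrt{\dfrac{14(56.337)^2 + 25(93.744)^2}{14 + 25 - 2}} \sqrt{\dfrac{14 + 25}{(14)(25)}}} \\
&= \frac{-12.63}{\sqrt{7138.7147}\,\sqrt{.1114}} = \frac{-12.63}{(84.4909)(.3338)} = \frac{-12.63}{28.2002} = -.448
\end{aligned}
\]
Step 5. Making a decision. t(obtained) is less than t(critical) in magnitude. Fail to reject the null hypothesis. Whites and nonwhites in the JCHA 2000 survey do not have significantly different lengths of residence in public housing.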
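The small-sample (pooled) t statistic can be sketched the same way. The inputs are the JCHA figures from the slide; `pooled_t` is a hypothetical helper name, and the critical value is taken from the slide rather than computed.

```python
import math


def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Two-sample t with pooled variance, following the slide's formula."""
    pooled_sd = math.sqrt((n1 * sd1 ** 2 + n2 * sd2 ** 2) / (n1 + n2 - 2))
    se_diff = pooled_sd * math.sqrt((n1 + n2) / (n1 * n2))
    return (mean1 - mean2) / se_diff


T_CRITICAL = 2.042  # two-tailed, alpha = 0.05 (table value from the slide)

t = pooled_t(70.21, 56.337, 14, 82.84, 93.744, 25)
reject_null = abs(t) > T_CRITICAL
print(f"t(obtained) = {t:.3f}, reject null: {reject_null}")
```

The sign of t is negative because the first group's mean is lower; the two-tailed decision depends only on its magnitude.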
PPA 501 – Analytical Methods in Administration
Lecture 6c – Analysis of Variance
Introduction
Analysis of variance (ANOVA) can be considered an extension of the t-test.
The t-test assumes that the independent variable has only two categories.
ANOVA assumes that the nominal or ordinal independent variable has two or more categories.
Introduction
The null hypothesis is that the populations from which each of the samples (categories) is drawn are equal on the characteristic measured (usually a mean or proportion).
Introduction
If the null hypothesis is correct, the means for the dependent variable within each category of the independent variable should be roughly equal.
ANOVA proceeds by making comparisons across the categories of the independent variable.
Computation of ANOVA
The computation of ANOVA compares the amount of variation within each category (SSW) to the amount of variation between categories (SSB).
Total sum of squares.
\[ SST = \sum (X_i - \bar{X})^2 \]
\[ SST = \sum X^2 - N\bar{X}^2 \quad \text{(computational)} \]
\[ SST = SSB + SSW \]
Computation of ANOVA
Sum of squares within (variation within categories).
\[ SSW = \sum (X_i - \bar{X}_k)^2 \]
where
SSW = the sum of the squares within the categories
$\bar{X}_k$ = the mean of a category
Sum of squares between (variation between categories).
\[ SSB = \sum N_k (\bar{X}_k - \bar{X})^2 \]
where
SSB = the sum of squares between the categories
$N_k$ = the number of cases in a category
$\bar{X}_k$ = the mean of a category
Computation of ANOVA
Degrees of freedom.
\[ dfw = N - k \]
\[ dfb = k - 1 \]
where
dfw = degrees of freedom associated with SSW
dfb = degrees of freedom associated with SSB
N = number of cases
k = number of categories
Computation of ANOVA
Mean square estimates.
\[ \text{Mean square within} = \frac{SSW}{dfw} \]
\[ \text{Mean square between} = \frac{SSB}{dfb} \]
\[ F = \frac{\text{Mean square between}}{\text{Mean square within}} \]
Computation of ANOVA
Computational steps for shortcut.
Find SST using the computational formula.
Find SSB.
Find SSW by subtraction.
Calculate degrees of freedom.
Construct the mean square estimates.
Compute the F-ratio.
Five-Step Hypothesis Test for ANOVA.
Step 1. Making assumptions. Independent random samples. Interval ratio measurement. Normally distributed populations. Equal population variances.
Step 2. Stating the null hypothesis.
\[ H_0: \mu_1 = \mu_2 = \cdots = \mu_k \]
\[ H_1: \text{at least one of the means is different} \]
Five-Step Hypothesis Test for ANOVA.
Step 3. Selecting the sampling distribution and establishing the critical region. Sampling distribution = F distribution. Alpha = .05 (or .01 or . . .). Degrees of freedom within = N – k. Degrees of freedom between = k – 1. F(critical): see Appendix D, pp. 499-500.
Step 4. Computing the test statistic. Use the procedure outlined above.
Five-Step Hypothesis Test for ANOVA.
Step 5. Making a decision. If F(obtained) is greater than F(critical), reject
the null hypothesis of no difference. At least one population mean is different from the others.
ANOVA – Example 1 – JCHA 2000
Report: JCHA Program Rating

Marital Status   Mean     N    Std. Deviation
Married          3.0313   8    1.70837
Separated        4.5000   2    .70711
Widowed          4.6667   6    .81650
Never Married    4.0556   9    .79822
Divorced         3.6731   13   1.20927
Total            3.8289   38   1.25082
What impact does marital status have on respondents' ratings of JCHA services? The sum of the squared ratings ($\sum X^2$) is 615.
ANOVA – Example 1 – JCHA 2000
Step 1. Making assumptions. Independent random samples. Interval ratio measurement. Normally distributed populations. Equal population variances.
Step 2. Stating the null hypothesis.
\[ H_0: \mu_1 = \mu_2 = \mu_3 = \mu_4 = \mu_5 \]
\[ H_1: \text{at least one of the means is different} \]
ANOVA – Example 1 – JCHA 2000
Step 3. Selecting the sampling distribution and establishing the critical region. Sampling distribution = F distribution. Alpha = .05. Degrees of freedom within = N – k = 38 – 5 = 33. Degrees of freedom between = k – 1 = 5 – 1 = 4. F-critical = 2.69.
ANOVA – Example 1 – JCHA 2000
Step 4. Computing the test statistic.

ANOVA Table: JCHA Program Rating * Marital Status

                            Sum of Squares   df   Mean Square   F       Sig.
Between Groups (Combined)   10.980           4    2.745         1.931   .128
Within Groups               46.908           33   1.421
Total                       57.888           37
ANOVA – Example 1 – JCHA 2000
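\[ SST = \sum X^2 - N\bar{X}^2 = 615 - 38(3.8289)^2 = 615 - 557.0981 = 57.9019 \]
\[
\begin{aligned}
SSB &= \sum N_k (\bar{X}_k - \bar{X})^2 \\
&= 8(3.0313 - 3.8289)^2 + 2(4.5 - 3.8289)^2 + 6(4.6667 - 3.8289)^2 \\
&\quad + 9(4.0556 - 3.8289)^2 + 13(3.6731 - 3.8289)^2 \\
&= 5.0893 + 0.9008 + 4.2115 + 0.4625 + 0.3156 = 10.9797
\end{aligned}
\]
\[ SSW = SST - SSB = 57.9019 - 10.9797 = 46.9222 \]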
ANOVA – Example 1 – JCHA 2000
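\[ dfw = N - k = 38 - 5 = 33 \]
\[ dfb = k - 1 = 5 - 1 = 4 \]
\[ \text{Mean square within} = \frac{SSW}{dfw} = \frac{46.9222}{33} = 1.4219 \]
\[ \text{Mean square between} = \frac{SSB}{dfb} = \frac{10.9797}{4} = 2.7449 \]
\[ F = \frac{\text{Mean square between}}{\text{Mean square within}} = \frac{2.7449}{1.4219} = 1.9304 \]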
ANOVA – Example 1 – JCHA 2000.
Step 5. Making a decision. F(obtained) is 1.93. F(critical) is 2.69.
F(obtained) < F(critical). Therefore, we fail to reject the null hypothesis of no difference. Approval of JCHA services does not vary significantly by marital status.
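The shortcut computation for Example 1 can be sketched in Python from the summary statistics alone (category means, category sizes, and the sum of squared ratings). `anova_from_summary` is a hypothetical helper; it computes the grand mean from the category means rather than using the rounded 3.8289, so intermediate values differ slightly from the hand calculation.

```python
def anova_from_summary(groups, sum_of_squared_scores):
    """One-way ANOVA from (mean, n) pairs and the sum of squared raw scores."""
    n_total = sum(n for _, n in groups)
    k = len(groups)
    grand_mean = sum(mean * n for mean, n in groups) / n_total
    sst = sum_of_squared_scores - n_total * grand_mean ** 2  # computational formula
    ssb = sum(n * (mean - grand_mean) ** 2 for mean, n in groups)
    ssw = sst - ssb  # found by subtraction
    dfb, dfw = k - 1, n_total - k
    f_ratio = (ssb / dfb) / (ssw / dfw)
    return {"SST": sst, "SSB": ssb, "SSW": ssw, "dfb": dfb, "dfw": dfw, "F": f_ratio}


# (mean rating, N) by marital status, from the JCHA 2000 report table
jcha_groups = [(3.0313, 8), (4.5000, 2), (4.6667, 6), (4.0556, 9), (3.6731, 13)]
result = anova_from_summary(jcha_groups, 615)

F_CRITICAL = 2.69  # alpha = .05, table value from the slide
print(f"F(obtained) = {result['F']:.2f}, reject null: {result['F'] > F_CRITICAL}")
```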
ANOVA – Example 2 – Presidential Disaster Data Set
What impact does Presidential administration have on the president’s recommendation of disaster assistance?
ANOVA – Example 2 – Presidential Disaster Data Set
Step 1. Making assumptions. Independent random samples. Interval ratio measurement. Normally distributed populations. Equal population variances.
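Step 2. Stating the null hypothesis.
\[ H_0: \mu_1 = \mu_2 = \mu_3 = \mu_4 = \mu_5 = \mu_6 = \mu_7 = \mu_8 = \mu_9 = \mu_{10} \]
\[ H_1: \text{at least one of the means is different} \]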
ANOVA – Example 2 – Presidential Disaster Data Set
Step 3. Selecting the sampling distribution and establishing the critical region. Sampling distribution = F distribution. Alpha = .05. Degrees of freedom within = N – k = 2642 – 10 = 2632. Degrees of freedom between = k – 1 = 10 – 1 = 9. F-critical = 1.883.
ANOVA – Example 2 – Presidential Disaster Data Set
Step 4. Computing the test statistic.
ANOVA – Example 2 – Presidential Disaster Data Set
Step 5. Making a decision. F(obtained) is 12.863. F(critical) is 1.883.
F(obtained) > F(critical). Therefore, we can reject the null hypothesis of no difference. Approval of federal disaster assistance does vary by presidential administration.