PPA 501 – Analytical Methods in Administration Lecture 6a – Normal Curve, Z- Scores, and...

PPA 501 – Analytical Methods in Administration

Lecture 6a – Normal Curve, Z-Scores, and Estimation

Normal Curve

The normal curve is central to the theory that underlies inferential statistics.

The normal curve is a theoretical model: a frequency polygon that is perfectly symmetrical and smooth. It is bell shaped and unimodal, with infinite tails. Crucial point: distances along the horizontal axis, when measured in standard deviations, always mark off the same proportion of the area under the curve.

Normal Curve

Computing Z-Scores

To find the percentage of the total area (or number of cases) above, below, or between scores in an empirical distribution, the original scores must be expressed in units of the standard deviation or converted into Z scores.

$$Z = \frac{X_i - \bar{X}}{s}$$

Computing Z-Scores – Mean ideology of House delegation by state.

Computing Z-Scores: Examples

What percentage of the cases fall between -0.5 and 0.0 on the ideology scale?

From Excel, =standardize(-0.5, 0.01, 0.205) = Z = -2.4878; =normsdist(-2.4878); p=0.006427

From Excel, =standardize(0.0,0.01,0.205)= Z = -0.04878; =normsdist(-0.04878); p=0.480547

P(-0.5 to 0.0) = 0.480547 - 0.006427 = 0.474120. 47.4% of the distribution lies between -0.5 and 0.0 on the ideology scale.

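$$Z = \frac{X_i - \bar{X}}{s} = \frac{-0.5 - 0.01}{0.205} = \frac{-0.51}{0.205} = -2.488$$

$$Z = \frac{X_i - \bar{X}}{s} = \frac{0.00 - 0.01}{0.205} = \frac{-0.01}{0.205} = -0.049$$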

Computing Z-Scores: Examples

What percentage of the House delegations from 1953 to 2005 have more conservative scores than 0.5? (1 - .992 = 0.008 or 0.8%)

What percentage have more liberal scores than -0.25? (10.2%).

Xi      Xmean   s       Z           p
0.5     0.01    0.205   2.390244    0.991581
-0.25   0.01    0.205   -1.268293   0.102347

Computing Z-scores: Rules

If you want the area between a score and the mean, subtract the probability from .5 if Z is negative. Subtract .5 from the probability if Z is positive.

If you want the area beyond a score: for the area below a score that is lower than the mean, use the probability from Excel. For the area above a score that is higher than the mean, subtract the probability in Excel from 1.

Computing Z-scores: Rules

If you want the area between two scores other than the mean: calculate Z for each score, identify the appropriate probability, and subtract the smaller probability from the larger.
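The rules above can be sketched in a few lines of Python. This is a minimal illustration, not the lecture's own procedure: the helper names `z_score` and `normal_cdf` are made up for this example, and `normal_cdf` is assumed to play the role of Excel's NORMSDIST by using the error function from the standard library.

```python
import math

def z_score(x, mean, s):
    """Express a raw score in standard-deviation units."""
    return (x - mean) / s

def normal_cdf(z):
    """Area under the standard normal curve below z (stand-in for NORMSDIST)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

mean, s = 0.01, 0.205  # House ideology example from the lecture

# Area between two scores: subtract the smaller probability from the larger.
p_low = normal_cdf(z_score(-0.5, mean, s))
p_high = normal_cdf(z_score(0.0, mean, s))
area_between = p_high - p_low

# Area above a score higher than the mean: subtract the probability from 1.
area_above = 1.0 - normal_cdf(z_score(0.5, mean, s))

# Area below a score lower than the mean: use the probability directly.
area_below = normal_cdf(z_score(-0.25, mean, s))

print(round(area_between, 3), round(area_above, 3), round(area_below, 3))
```

The three printed areas correspond to the three worked questions in the examples (between -0.5 and 0.0, above 0.5, below -0.25).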

Estimation Procedures

Bias – does the mean of the sampling distribution equal the mean of the population?

Efficiency – how closely does the sampling distribution cluster around the mean? You can improve efficiency by increasing sample size.


Point estimate – construct a sample, calculate a proportion or mean, and estimate the population will have the same value as the sample. Always some probability of error.


Confidence interval – range around the sample mean. First step: determine a confidence level, that is, how much error you are willing to tolerate. The common standard is 5% or .05: you are willing to be wrong 5% of the time in estimating populations. This figure is known as alpha or α. If an infinite number of confidence intervals are constructed, 95% will contain the population mean and 5% won't.


We now work in reverse on the normal curve. Divide the probability of error between the upper and lower tails of the curve (so that the 95% is in the middle), and find the Z-score that leaves 2.5% of the area under the curve in each tail. That Z-score is ±1.96.

The corresponding Z-scores for 90% (alpha=.10), 99% (alpha=.01), and 99.9% (alpha=.001) are ±1.65, ±2.58, and ±3.29.


$$c.i. = \bar{X} \pm Z\left(\frac{\sigma}{\sqrt{N}}\right)$$

where

c.i. = confidence interval

$\bar{X}$ = the sample mean

Z = the Z score as determined by the alpha level

$\sigma/\sqrt{N}$ = the population standard error of the mean

Estimation Procedures – Sample Mean

$$c.i. = \bar{X} \pm Z\left(\frac{s}{\sqrt{n-1}}\right)$$

where

c.i. = confidence interval

$\bar{X}$ = the sample mean

Z = the Z score as determined by the alpha level

$s/\sqrt{n-1}$ = the standard error of the mean

Only use if sample is 100 or greater


You can control the width of the confidence intervals by adjusting the confidence level or alpha or by adjusting sample size.

Confidence Interval Examples

Mean House Ideology for presidential disaster requests (1953 to 2005) with 90%, 95%, and 99% confidence intervals.

Confidence Interval Examples from Presidential Disaster Decisions, 1953 to 2005

Variable                                      Mean     Std. Deviation   No. of cases   Std. Error   Cnfd. Int.   Lower Bound   Upper Bound
Mean Ideology of House Delegation by State    0.006    0.205            2493           0.004        90%          0.000         0.013
Mean Ideology of House Delegation by State    0.006    0.205            2493           0.004        95%          -0.002        0.015
Mean Ideology of House Delegation by State    0.006    0.205            2493           0.004        99%          -0.004        0.017
Mean Ideology of Senate Delegation by State   -0.022   0.300            2493           0.006        90%          -0.031        -0.012
Mean Ideology of Senate Delegation by State   -0.022   0.300            2493           0.006        95%          -0.033        -0.010
Mean Ideology of Senate Delegation by State   -0.022   0.300            2493           0.006        99%          -0.037        -0.006
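As a rough sketch of how such intervals can be computed, the following Python applies the sample-mean formula (Z times s over the square root of n - 1). The function name `confidence_interval` and the lookup table `Z_BY_LEVEL` are assumptions made for this illustration. The inputs are the rounded House figures from the table, so the bounds may differ from the published ones in the third decimal.

```python
import math

# Z-scores for common confidence levels, as listed in the lecture.
Z_BY_LEVEL = {0.90: 1.65, 0.95: 1.96, 0.99: 2.58, 0.999: 3.29}

def confidence_interval(sample_mean, s, n, level=0.95):
    """Return (lower, upper) using c.i. = mean +/- Z * s / sqrt(n - 1)."""
    z = Z_BY_LEVEL[level]
    standard_error = s / math.sqrt(n - 1)
    margin = z * standard_error
    return sample_mean - margin, sample_mean + margin

# Rounded House ideology figures: mean 0.006, s 0.205, N 2493.
for level in (0.90, 0.95, 0.99):
    lower, upper = confidence_interval(0.006, 0.205, 2493, level)
    print(f"{level:.0%}: {lower:.3f} to {upper:.3f}")
```

Raising the confidence level widens the interval, while a larger sample shrinks the standard error and narrows it.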


Lecture 6b – One-Sample and Two-Sample Tests

Five-step Model of Hypothesis Testing

Step 1. Making assumptions and meeting test requirements.

Step 2. Stating the null hypothesis.

Step 3. Selecting the sampling distribution and establishing the critical region.

Step 4. Computing the test statistic.

Step 5. Making a decision and interpreting the results of the test.

Five-step Model of Hypothesis Testing – One-sample Z Scores

Step 1. Making assumptions. Model: random sampling. Interval-ratio measurement. Normal sampling distribution.

Step 2. Stating the null hypothesis (no difference) and the research hypothesis.

$H_0: \mu_1 = \mu$

$H_1: \mu_1 \neq \mu$; two-tailed test

$H_1: \mu_1 > \mu$; one-tailed test, or

$H_1: \mu_1 < \mu$; one-tailed test


Step 3. Selecting the sampling distribution and establishing the critical region. Sampling distribution = Z distribution. α=0.05. Z(critical)=±1.96 (two-tailed); +1.65 or -1.65 (one-tailed).


Step 4. Computing the test statistic. Use z-formula.

Step 5. Making a decision. Compare z-critical to z-obtained. If z-obtained is greater in magnitude than z-critical, reject the null hypothesis. Otherwise, fail to reject the null hypothesis.

Five-Step Model: Critical Choices

Choice of alpha level: .05, .01, .001.

Selection of research hypothesis.

Two-tailed test: the research hypothesis simply states that the means of the sample and the population are different.

One-tailed test: the mean of the sample is larger or smaller than the mean of the population.

Type of error to minimize: Type I or Type II. Type I – rejecting a null hypothesis that is true. Type II – failing to reject a null hypothesis that is false.

Five-Step Model: Critical Choices

Five-step Model: Example

Is the average age of voters in the 2000 National Election Study different from the average age of all adults in the U.S. population?

Five-step Model of Hypothesis Testing – Large-sample Z Scores

Step 1. Making assumptions. Model: random sampling. Interval-ratio measurement. Normal sampling distribution.


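Step 2. Stating the null hypothesis (no difference) and the research hypothesis.

$H_0: \mu_1 = 45.24$

$H_1: \mu_1 \neq 45.24$; two-tailed test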


Step 3. Selecting the sampling distribution and establishing the critical region. Sampling distribution = Z distribution. α=0.05. Z(critical)=1.96 (two-tailed)


Step 4. Computing the test statistic.

$$Z(\text{obtained}) = \frac{\bar{X} - \mu}{\sigma/\sqrt{N}} = \frac{47.21 - 45.24}{17.888/\sqrt{1798}} = \frac{1.97}{.4217} = 4.67$$

Step 5. Making a decision.

$$Z(\text{obtained}) = 4.67 > Z(\text{critical}) = 1.96$$

Reject the null hypothesis of no difference. The sample is significantly older than the voting age population.
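The same calculation can be checked with a short Python sketch. The function name `one_sample_z` is hypothetical; the inputs (sample mean 47.21, population mean 45.24, standard deviation 17.888, N = 1798) are the ones used in this example.

```python
import math

def one_sample_z(sample_mean, pop_mean, sd, n):
    """Z(obtained) = (sample mean - population mean) / (sd / sqrt(N))."""
    return (sample_mean - pop_mean) / (sd / math.sqrt(n))

z_obtained = one_sample_z(47.21, 45.24, 17.888, 1798)
z_critical = 1.96  # alpha = 0.05, two-tailed

# Two-tailed test: compare magnitudes.
reject_null = abs(z_obtained) > z_critical
print(round(z_obtained, 2), reject_null)
```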

Five-Step Model: Small Sample T-test (One Sample)

Formula

$$t(\text{obtained}) = \frac{\bar{X} - \mu}{s/\sqrt{N-1}}$$

Five-Step Model: Small Sample T-test (One Sample)

Step 1. Making Assumptions. Random sampling. Interval-ratio measurement. Normal sampling distribution.

Step 2. Stating the null hypothesis.

$H_0: \mu_1 = \mu$

$H_1: \mu_1 \neq \mu$; two-tailed test

$H_1: \mu_1 > \mu$; one-tailed test, or

$H_1: \mu_1 < \mu$; one-tailed test

Five-step Model of Hypothesis Testing – One-sample t Scores

Step 3. Selecting the sampling distribution and establishing the critical region. Sampling distribution = t distribution. α=0.05. Df=N-1. t(critical) from Appendix A, Table B in Agresti and Franklin.



Step 4. Computing the test statistic.

$$t(\text{obtained}) = \frac{\bar{X} - \mu}{s/\sqrt{N-1}}$$

Step 5. Making a decision. Compare t-critical to t-obtained. If t-obtained is greater in magnitude than t-critical, reject the null hypothesis. Otherwise, fail to reject the null hypothesis.


Is the average age of individuals in the JCHA 2000 sample survey older than the national average age for all adults? (One-tailed).

Five-Step Model: Small Sample T-test (One Sample) – JCHA 2000

Step 1. Making Assumptions. Random sampling. Interval-ratio measurement. Normal sampling distribution.

Step 2. Stating the null hypothesis.

$H_0: \mu_1 = 45.24$

$H_1: \mu_1 > 45.24$; one-tailed test


Step 3. Selecting the sampling distribution and establishing the critical region. Sampling distribution = t distribution. α=0.05. Df=41-1=40. t(critical)=1.684.



Step 4. Computing the test statistic.

$$t(\text{obtained}) = \frac{\bar{X} - \mu}{s/\sqrt{N-1}} = \frac{52.78 - 45.24}{20.866/\sqrt{40}} = \frac{7.54}{3.299} = 2.29$$

Step 5. Making a decision. t(obtained) > t(critical). Therefore, reject the null hypothesis. The sample of residents from the Jefferson County Housing Authority is significantly older than the adult population of the United States.
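A minimal Python sketch of the one-sample t calculation follows, using the JCHA figures above (sample mean 52.78, s = 20.866, N = 41). The function name `one_sample_t` is an assumption for this illustration; the critical value is the one quoted in Step 3 rather than computed.

```python
import math

def one_sample_t(sample_mean, pop_mean, s, n):
    """t(obtained) = (sample mean - population mean) / (s / sqrt(N - 1))."""
    return (sample_mean - pop_mean) / (s / math.sqrt(n - 1))

t_obtained = one_sample_t(52.78, 45.24, 20.866, 41)
t_critical = 1.684  # alpha = 0.05, one-tailed, df = 40 (from the lecture)

# One-tailed test in the "older than" direction: no absolute value here.
reject_null = t_obtained > t_critical
print(round(t_obtained, 2), reject_null)
```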

Two-Sample Models – Large Samples

Most of the time we do not have the population means or proportions. All we can do is compare the means or proportions of population subsamples.

Adds the additional assumption of independent random samples.

Two-Sample Models – Large Samples

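Formula.

$$Z(\text{obtained}) = \frac{\bar{X}_1 - \bar{X}_2}{\sigma_{\bar{X}_1 - \bar{X}_2}}$$

$$\sigma_{\bar{X}_1 - \bar{X}_2} = \sqrt{\frac{s_1^2}{N_1 - 1} + \frac{s_2^2}{N_2 - 1}}$$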

Five-Step Model – Large Two-Sample Tests (Z Distribution)

Step 1. Making assumptions. Model: Independent random samples. Interval-ratio measurement. Normal sampling distribution.


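Step 2. Stating the null hypothesis (no difference) and the research hypothesis.

$H_0: \mu_1 = \mu_2$

$H_1: \mu_1 \neq \mu_2$; two-tailed test

$H_1: \mu_1 > \mu_2$; one-tailed test, or

$H_1: \mu_1 < \mu_2$; one-tailed test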


Step 3. Selecting the sampling distribution and establishing the critical region. Sampling distribution = Z distribution. α=0.05. Z(critical)=±1.96 (two-tailed); +1.65 or -1.65 (one-tailed).



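Step 4. Computing the test statistic.

$$Z(\text{obtained}) = \frac{\bar{X}_1 - \bar{X}_2}{\sigma_{\bar{X}_1 - \bar{X}_2}}, \qquad \sigma_{\bar{X}_1 - \bar{X}_2} = \sqrt{\frac{s_1^2}{N_1 - 1} + \frac{s_2^2}{N_2 - 1}}$$

Step 5. Making a decision. Compare z-critical to z-obtained. If z-obtained is greater in magnitude than z-critical, reject the null hypothesis. Otherwise, fail to reject the null hypothesis.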


Do non-white citizens of Birmingham, Alabama, believe that discrimination is more of a problem than white citizens?

Five-Step Model – Large Two-Sample Tests (Fair Housing)

Step 1. Making assumptions. Model: Independent random samples. Interval-ratio measurement. Normal sampling distribution.


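Step 2. Stating the null hypothesis (no difference) and the research hypothesis.

$H_0: \mu_1 = \mu_2$

$H_1: \mu_1 > \mu_2$; one-tailed test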


Step 3. Selecting the sampling distribution and establishing the critical region. Sampling distribution = Z distribution. α=0.05. Z(critical)=+1.65 (one-tailed).



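Step 4. Computing the test statistic.

$$Z(\text{obtained}) = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\frac{s_1^2}{N_1 - 1} + \frac{s_2^2}{N_2 - 1}}} = \frac{2.70 - 2.14}{\sqrt{\frac{(1.058)^2}{141} + \frac{(.966)^2}{42}}} = \frac{.56}{\sqrt{.008 + .022}} = \frac{.56}{.173} = 3.224$$

Step 5. Making a decision. Z(obtained) is greater than Z(critical), therefore reject the null hypothesis of no difference. Non-whites believe that discrimination is more of a problem in Birmingham.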
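The large two-sample calculation can be sketched in Python as below. The function name `two_sample_z` is made up for this example. One loud assumption: the worked example shows only the denominators 141 and 42, so the sample sizes N1 = 142 and N2 = 43 are chosen here purely so that N - 1 reproduces those denominators; the true sample sizes are not stated in the lecture.

```python
import math

def two_sample_z(mean1, s1, n1, mean2, s2, n2):
    """Z(obtained) for two independent samples, with N - 1 in each denominator."""
    standard_error = math.sqrt(s1 ** 2 / (n1 - 1) + s2 ** 2 / (n2 - 1))
    return (mean1 - mean2) / standard_error

# ASSUMPTION: 142 and 43 are picked only so that N - 1 equals 141 and 42.
z_obtained = two_sample_z(2.70, 1.058, 142, 2.14, 0.966, 43)
z_critical = 1.65  # alpha = 0.05, one-tailed

reject_null = z_obtained > z_critical
print(round(z_obtained, 2), reject_null)
```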

Five-Step Model – Small Two-Sample Tests

If N1 + N2 < 100, use this formula.

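$$t(\text{obtained}) = \frac{\bar{X}_1 - \bar{X}_2}{\sigma_{\bar{X}_1 - \bar{X}_2}}$$

$$\sigma_{\bar{X}_1 - \bar{X}_2} = \sqrt{\frac{N_1 s_1^2 + N_2 s_2^2}{N_1 + N_2 - 2}} \sqrt{\frac{N_1 + N_2}{N_1 N_2}}$$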

Five-Step Model – Small Two-Sample Tests (t Distribution)

Step 1. Making assumptions. Model: Independent random samples. Interval-ratio measurement. Equal population variances ($\sigma_1^2 = \sigma_2^2$). Normal sampling distribution.

Step 2. Stating the null hypothesis (no difference) and the research hypothesis.

$H_0: \mu_1 = \mu_2$

$H_1: \mu_1 \neq \mu_2$; two-tailed test

$H_1: \mu_1 > \mu_2$; one-tailed test, or

$H_1: \mu_1 < \mu_2$; one-tailed test


Step 3. Selecting the sampling distribution and establishing the critical region. Sampling distribution = t distribution. α=0.05. Df=N1+N2-2. For t(critical), see Appendix A, Table B.



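Step 4. Computing the test statistic.

$$t(\text{obtained}) = \frac{\bar{X}_1 - \bar{X}_2}{\sigma_{\bar{X}_1 - \bar{X}_2}}, \qquad \sigma_{\bar{X}_1 - \bar{X}_2} = \sqrt{\frac{N_1 s_1^2 + N_2 s_2^2}{N_1 + N_2 - 2}} \sqrt{\frac{N_1 + N_2}{N_1 N_2}}$$

Step 5. Making a decision. Compare t-critical to t-obtained. If t-obtained is greater in magnitude than t-critical, reject the null hypothesis. Otherwise, fail to reject the null hypothesis.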


Did white and nonwhite residents of the Jefferson County Housing Authority have significantly different lengths of residence in 2000?

Five-Step Model – Small Two-Sample Tests (JCHA 2000)

Step 1. Making assumptions. Model: Independent random samples. Interval-ratio measurement. Equal population variances ($\sigma_1^2 = \sigma_2^2$). Normal sampling distribution.

Step 2. Stating the null hypothesis (no difference) and the research hypothesis.

$H_0: \mu_1 = \mu_2$

$H_1: \mu_1 \neq \mu_2$; two-tailed test

Five-Step Model – Small Two-Sample Tests (JCHA 2000)

Step 3. Selecting the sampling distribution and establishing the critical region. Sampling distribution = t distribution. α=0.05, two-tailed. Df=N1+N2-2=14+25-2=37. t(critical) from Appendix B = 2.042.



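Step 4. Computing the test statistic.

$$t(\text{obtained}) = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\frac{N_1 s_1^2 + N_2 s_2^2}{N_1 + N_2 - 2}} \sqrt{\frac{N_1 + N_2}{N_1 N_2}}} = \frac{70.21 - 82.84}{\sqrt{\frac{14(56.337)^2 + 25(93.744)^2}{14 + 25 - 2}} \sqrt{\frac{14 + 25}{(14)(25)}}}$$

$$= \frac{-12.63}{\sqrt{7138.7147}\,\sqrt{.1114}} = \frac{-12.63}{84.4909(.3338)} = \frac{-12.63}{28.20} = -.448$$

Step 5. Making a decision. t(obtained) is less than t(critical) in magnitude. Fail to reject the null hypothesis. Whites and nonwhites in the JCHA 2000 survey do not have significantly different lengths of residence in public housing.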
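For readers who want to verify the arithmetic, here is a small Python sketch of the pooled small-sample formula using the JCHA figures above (means 70.21 and 82.84, standard deviations 56.337 and 93.744, N1 = 14, N2 = 25). The function name `pooled_t` is hypothetical, and the critical value is the one quoted in Step 3.

```python
import math

def pooled_t(mean1, s1, n1, mean2, s2, n2):
    """t(obtained) with the pooled estimate of the standard error."""
    pooled = math.sqrt((n1 * s1 ** 2 + n2 * s2 ** 2) / (n1 + n2 - 2))
    standard_error = pooled * math.sqrt((n1 + n2) / (n1 * n2))
    return (mean1 - mean2) / standard_error

t_obtained = pooled_t(70.21, 56.337, 14, 82.84, 93.744, 25)
t_critical = 2.042  # alpha = 0.05, two-tailed (value used in the lecture)

# Two-tailed test: compare magnitudes.
reject_null = abs(t_obtained) > t_critical
print(round(t_obtained, 3), reject_null)
```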


Lecture 6c – Analysis of Variance

Introduction

Analysis of variance (ANOVA) can be considered an extension of the t-test.

The t-test assumes that the independent variable has only two categories.

ANOVA assumes that the nominal or ordinal independent variable has two or more categories.

Introduction

The null hypothesis is that the populations from which each of the samples (categories) is drawn are equal on the characteristic measured (usually a mean or proportion).

Introduction

If the null hypothesis is correct, the means for the dependent variable within each category of the independent variable should be roughly equal.

ANOVA proceeds by making comparisons across the categories of the independent variable.

Computation of ANOVA

The computation of ANOVA compares the amount of variation within each category (SSW) to the amount of variation between categories (SSB).

Total sum of squares.

$$SST = \sum (X_i - \bar{X})^2$$

$$SST = \sum X^2 - N\bar{X}^2 \quad \text{(computational)}$$

$$SST = SSB + SSW$$

2


Sum of squares within (variation within categories).

$$SSW = \sum (X_i - \bar{X}_k)^2$$

where

SSW = the sum of the squares within the categories

$\bar{X}_k$ = the mean of a category

Sum of squares between (variation between categories).

$$SSB = \sum N_k (\bar{X}_k - \bar{X})^2$$

where

SSB = the sum of squares between the categories

$N_k$ = the number of cases in a category

$\bar{X}_k$ = the mean of a category


Degrees of freedom.

$$dfw = N - k$$

$$dfb = k - 1$$

where

dfw = degrees of freedom associated with SSW

dfb = degrees of freedom associated with SSB

N = number of cases

k = number of categories


Mean square estimates.

$$\text{Mean square within} = \frac{SSW}{dfw}$$

$$\text{Mean square between} = \frac{SSB}{dfb}$$

$$F = \frac{\text{Mean square between}}{\text{Mean square within}}$$


Computational steps for shortcut. Find SST using computation formula. Find SSB. Find SSW by subtraction. Calculate degrees of freedom. Construct the mean square estimates. Compute the F-ratio.
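The shortcut steps can be sketched as a short Python function. This is an illustration only: the function name `anova_f` and the three small groups of scores are hypothetical and do not come from the lecture's data sets.

```python
def anova_f(groups):
    """Return (F, SSB, SSW, SST) for a list of groups of raw scores."""
    scores = [x for group in groups for x in group]
    n = len(scores)   # number of cases
    k = len(groups)   # number of categories
    grand_mean = sum(scores) / n

    # Step 1: SST from the computational formula, sum of X^2 minus N * mean^2.
    sst = sum(x * x for x in scores) - n * grand_mean ** 2
    # Step 2: SSB, each category's size times its squared distance from the grand mean.
    ssb = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    # Step 3: SSW by subtraction.
    ssw = sst - ssb
    # Step 4: degrees of freedom.
    dfb, dfw = k - 1, n - k
    # Steps 5 and 6: mean square estimates and the F-ratio.
    f_ratio = (ssb / dfb) / (ssw / dfw)
    return f_ratio, ssb, ssw, sst

# Hypothetical scores for three categories.
f_ratio, ssb, ssw, sst = anova_f([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
print(round(f_ratio, 2), round(ssb, 2), round(ssw, 2), round(sst, 2))
```

Finding SSW by subtraction avoids computing a separate sum of squared deviations inside every category.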

Five-Step Hypothesis Test for ANOVA.

Step 1. Making assumptions. Independent random samples. Interval ratio measurement. Normally distributed populations. Equal population variances.

Step 2. Stating the null hypothesis.

$H_0: \mu_1 = \mu_2 = \cdots = \mu_k$

$H_1$: at least one of the means is different


Step 3. Selecting the sampling distribution and establishing the critical region. Sampling distribution = F distribution. Alpha = .05 (or .01 or . . .). Degrees of freedom within = N – k. Degrees of freedom between = k – 1. F-critical=Use Appendix D, p. 499-500.

Step 4. Computing the test statistic. Use the procedure outlined above.


Step 5. Making a decision. If F(obtained) is greater than F(critical), reject the null hypothesis of no difference. At least one population mean is different from the others.

ANOVA – Example 1 – JCHA 2000

Report

JCHA Program Rating

Marital Status   Mean     N    Std. Deviation
Married          3.0313   8    1.70837
Separated        4.5000   2    .70711
Widowed          4.6667   6    .81650
Never Married    4.0556   9    .79822
Divorced         3.6731   13   1.20927
Total            3.8289   38   1.25082

What impact does marital status have on respondents' ratings of JCHA services? The sum of the squared ratings ($\sum X^2$) is 615.




Step 2. Stating the null hypothesis.

$H_0: \mu_1 = \mu_2 = \mu_3 = \mu_4 = \mu_5$

$H_1$: at least one of the means is different


Step 3. Selecting the sampling distribution and establishing the critical region. Sampling distribution = F distribution. Alpha = .05. Degrees of freedom within = N – k = 38 – 5 = 33. Degrees of freedom between = k – 1 = 5 – 1 = 4. F-critical=2.69.



ANOVA Table

JCHA Program Rating * Marital Status

                            Sum of Squares   df   Mean Square   F       Sig.
Between Groups (Combined)   10.980           4    2.745         1.931   .128
Within Groups               46.908           33   1.421
Total                       57.888           37


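Step 4. Computing the test statistic.

$$SST = \sum X^2 - N\bar{X}^2 = 615 - 38(3.8289)^2 = 615 - 557.0981 = 57.9019$$

$$SSB = \sum N_k (\bar{X}_k - \bar{X})^2 = 8(3.0313 - 3.8289)^2 + 2(4.5 - 3.8289)^2 + 6(4.6667 - 3.8289)^2 + 9(4.0556 - 3.8289)^2 + 13(3.6731 - 3.8289)^2$$

$$SSB = 5.0893 + 0.9008 + 4.2115 + 0.4625 + 0.3156 = 10.9797$$

$$SSW = SST - SSB = 57.9019 - 10.9797 = 46.9222$$

$$dfw = N - k = 38 - 5 = 33$$

$$dfb = k - 1 = 5 - 1 = 4$$

$$\text{Mean square within} = \frac{SSW}{dfw} = \frac{46.9222}{33} = 1.4219$$

$$\text{Mean square between} = \frac{SSB}{dfb} = \frac{10.9797}{4} = 2.7449$$

$$F = \frac{\text{Mean square between}}{\text{Mean square within}} = \frac{2.7449}{1.4219} = 1.9304$$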

ANOVA – Example 1 – JCHA 2000.

Step 5. Making a decision. F(obtained) is 1.93. F(critical) is 2.69. F(obtained) < F(critical). Therefore, we fail to reject the null hypothesis of no difference. Approval of JCHA services does not vary significantly by marital status.
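Because this example supplies only category means, category sizes, and the sum of squared ratings, the F-ratio can be rebuilt from those summary figures alone. The Python below is a sketch under that reading; the variable names are assumptions for this illustration, and the grand mean 3.8289 and F(critical) 2.69 are taken from the lecture rather than recomputed.

```python
# (mean, N) for each marital-status category, from the Report table.
categories = {
    "Married": (3.0313, 8),
    "Separated": (4.5000, 2),
    "Widowed": (4.6667, 6),
    "Never Married": (4.0556, 9),
    "Divorced": (3.6731, 13),
}
sum_x_squared = 615.0  # sum of the squared ratings
grand_mean = 3.8289    # overall mean from the Total row

n = sum(count for _, count in categories.values())
k = len(categories)

sst = sum_x_squared - n * grand_mean ** 2
ssb = sum(count * (mean - grand_mean) ** 2 for mean, count in categories.values())
ssw = sst - ssb  # found by subtraction

dfb, dfw = k - 1, n - k
f_obtained = (ssb / dfb) / (ssw / dfw)
f_critical = 2.69  # alpha = .05 with 4 and 33 degrees of freedom (from the lecture)

reject_null = f_obtained > f_critical
print(round(sst, 2), round(ssb, 2), round(ssw, 2), round(f_obtained, 2), reject_null)
```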

ANOVA – Example 2 – Presidential Disaster Set

What impact does Presidential administration have on the president’s recommendation of disaster assistance?

ANOVA – Example 2 – Presidential Disaster Data Set



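Step 2. Stating the null hypothesis.

$H_0: \mu_1 = \mu_2 = \mu_3 = \mu_4 = \mu_5 = \mu_6 = \mu_7 = \mu_8 = \mu_9 = \mu_{10}$

$H_1$: at least one of the means is different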


Step 3. Selecting the sampling distribution and establishing the critical region. Sampling distribution = F distribution. Alpha = .05. Degrees of freedom within = N – k = 2642 – 10 = 2632. Degrees of freedom between = k – 1 = 10 – 1 = 9. F-critical=1.883.


Step 5. Making a decision. F(obtained) is 12.863. F(critical) is 1.883. F(obtained) > F(critical). Therefore, we can reject the null hypothesis of no difference. Approval of federal disaster assistance does vary by presidential administration.
