

SPSS Supplement Guide

Version Date: July 2007

Courtney A. Pindling, PhD


CONTENT

Introduction
Descriptive Statistics
  Scales of Measurements
  Frequency Distributions
  Variable Measurement Level
  Central Tendency
  Measure of Dispersion
  Box Plots
  The Standard Score
  The Normal Distribution
  Correlation
Inference Statistics
  Introduction
  Hypothesis Testing - One-sample case for the mean
  Hypothesis Testing - Two-sample case for the mean
  Hypothesis Testing - Correlated samples case for the mean
  Simple Linear Regression
  Chi-square Tests of Association
Appendix - Statistical Tables
  Z-score Probability Distribution Table (cumulative)
  Values of t at the 0.05 and 0.01 level: Two-tailed
  Values of t at various significance levels: One-tailed
  Chi-square Table
  Critical Values for Correlation Coefficient, r


Introduction

The purpose of this guide is to help students get started using SPSS. Topics
presented in this guide are: an introduction to SPSS, installation of the software, the
Data/Variable Views, handling data, getting help, basic SPSS menus, and examples of
using SPSS for analysis.

SPSS solutions have been assisting college and university administrators for more

than 37 years. The software is used at thousands of colleges and universities worldwide

in a wide variety of disciplines. These solutions assist faculty and administrators in

several key areas:

SPSS was originally designed for use by social scientists to analyze data from

surveys. Over the years it has grown to include a wide range of techniques, which are

outlined briefly below. SPSS for Windows is a version of this statistical package that is

especially configured to work in the Windows operating environment. This version has a

wide range of statistical procedures and also a selection of high-resolution graphics

facilities. It also has links to many other packages. The MS-Windows version surrounds

this core with an extensive help and menu system.

The limit on the size of problem that can be tackled is essentially dependent on

the amount of RAM or virtual memory available on your machine. There are effectively

no limits on the number of variables or cases that the program can handle.

Students in advanced quantitative courses or those conducting graduate-level

research need a powerful statistics package to get results. That's why the SPSS Graduate

Pack includes the full version of SPSS Base, two add-on modules, and for Windows®

users, software for structural equation modeling (SEM) to give students the advanced

statistics and techniques you can't find in most student software packages.

Recommended Graduate Pack (http://spss.com/gradpack/):

The SPSS Graduate Pack provides the most complete tool set for use in your

advanced courses. Use the SPSS Graduate Pack for topics such as:

1. Quantitative methods

2. Research methods


3. Educational administration

4. Nursing research

5. And many more

Students can get the best information from any dataset using the in-depth statistics

of the SPSS Graduate Pack. Use basic statistics from counts and crosstabs, to advanced

procedures, including general linear models, linear mixed models, binomial and

multinomial logistic regression, and structural equation models.

Available for Microsoft® Windows® and Macintosh®, the SPSS Graduate Pack

includes:

1. Full version of SPSS Base

2. Two add-on modules, SPSS Regression™ and SPSS Advanced Models™

3. Software for structural equation modeling, Amos™ (available only on Windows)

As students' analytical needs increase, the SPSS Graduate Pack grows right along

with them. They can purchase additional add-on modules and software for specialized

techniques, such as complex sampling and correspondence analysis, publication-ready

tabular reporting, and much more. Best of all, the SPSS Graduate Pack is affordable for

students—up to 85 percent off * the commercial list price of SPSS Base. Students have

the option to purchase the SPSS Graduate Pack on-line at www.journeyed.com,

www.academicsuperstore.com, www.studentdiscounts.com, or lease a copy for 6 or 12

months at www.e-academy.com.

Please note:

The SPSS Graduate Pack is an educational tool, not intended for commercial use

This software will operate for approximately four years

Technical support is limited to installation questions only

The SPSS Graduate Pack is available for use in the United States and Canada only

Purchase by anyone other than degree-seeking students is strictly prohibited by

the license agreement

This guide provides an overview of a first semester course in statistics using

SPSS. It is divided into several sections, one section for each topic.


The section on central tendency deals with statistics used to describe a typical

data value for a data set. It employs such statistics as: mean, median, mode, and

frequency polygon (shows mode, sometimes mean and median).

The section on variability deals with statistics that describe the variability of a

sample distribution with such measures as: range, standard deviation, and variance.

The standard score and normal curve allow one to make statements about how far

a data point is from its mean and estimate the probability of other points relative to the

mean.

While descriptive statistics describes data, inference statistics makes predictions
or inferences about data. The t-test is a useful tool for comparing the mean of a data set
to some constant value or comparing the means of two samples or distributions. We also
use the t-test to compare both independent samples and correlated samples. The t-test
assumes that both data sets are fairly normally distributed, but it can be used for
non-normal data sets within certain limits.

The Pearson Chi-square test is used to measure the degree of association between
two or more categorical variables (based on observed versus expected frequencies); the
data are often nominal numbers. This is a distribution-free analysis; however, the
samples must be random samples.

The null hypothesis states that the quantities being compared (means or
distributions) are the same. We reject the null hypothesis if: a. the significance of the
test (p-value) is less than 0.05 (95% confidence), b. the test statistic is greater than the
table lookup value, and c. the confidence interval for the test does not contain zero.
Table 1 shows a summary of the statistics used in this workbook.

Table 2 shows the SPSS procedures (basic SPSS menu steps) commonly used in a
first semester statistics course. For example, to create a new variable containing the
standard scores (z-scores) for each score of a variable, select Descriptive Statistics from
the SPSS Analyze menu, then Descriptives; move the variable you want standardized
into the variable selection window, check “Save standardized values as variable” in the
dialog, and press OK.


Table 1

Statistics Summary

Statistics | Description | Remarks

Descriptive Statistics
Central Tendency | Typical value | Mean, median, mode, and frequency
Variability | Spread of distribution | Range, variance, and standard deviation (std dev.)
Standard Score | Number of standard deviations from the mean | 68%: mean ± 1 std dev.; 95%: mean ± 2 std dev.; 99%: mean ± 3 std dev.
Normal Distribution | Bell-shaped, symmetric distribution | Assumed by most statistics
Correlation | Measures relationships between 2 variables | Strength and direction of relationship, r

Inference Statistics
One-sample t-test | Compares mean of a variable to a constant value | See note
Two-sample t-test (independent) | Compares two sample means | See note
Correlated t-test | Compares means of two related samples | See note
Pearson Chi-square | Measures associations between categorical data | Nominal numbers

Note. For the t-tests, the means are different if the p-value < 0.05, if the test statistic > the table lookup value, and if the confidence interval does not contain 0.


Table 2

SPSS Statistics Procedure Summary

Statistics | SPSS Procedure

Descriptive Statistics
Central Tendency | Analyze > Descriptive Statistics > Frequencies [Select Variable(s)] > Select Central Tendency statistics from Statistics Option > OK (Option to Check Frequency Table display)
Variability | Analyze > Descriptive Statistics > Frequencies [Select Variable(s)] > Select Dispersion statistics from Statistics Option > OK (Option to Check Frequency Table display)
Standard Score | Analyze > Descriptive Statistics > Descriptives [Select Variable] > Check Save standardized values as variable > OK (New Variable Created with z-scores)
Normal Distribution | Use Standard Normal Probability Distribution Tables to find Probabilities (Pr) of Variable (X) from: z = (X – M)/SD
Correlation | Analyze > Correlate > Bivariate [Select at least 2 Variables; Correlation (Pearson or Spearman); Significance (Two-tail or One-tail)] > Check Flag significant correlations > OK

Inference Statistics
One-sample t-test | Analyze > Compare Means > One-Sample T Test [Select Variable(s)] > Enter Test value (reference mean) > OK
Two-sample t-test (independent) | Analyze > Compare Means > Independent Samples T Test [Select Test Variable(s) and Grouping Variable] > Define Groups (Option Define Confidence Interval) > OK
Correlated t-test | Analyze > Compare Means > Paired-Samples T Test [Select Two Variables] > (Option Define Confidence Interval) > OK
Pearson Chi-square | Analyze > Nonparametric Tests > Chi Square Test [Select At least Two Variables] > OK


CHAPTER ONE

Descriptive Statistics

Scales of Measurements

Measurement is defined as the assignment of numbers to objects or events
according to prescribed rules. Once assigned, these numbers have certain properties of
which we must be aware as we perform arithmetic or mathematical operations.

Nominal scale is the assignment of numbers for the sole purpose of

differentiating one object from another. Joe’s book locker is labeled 50 which

differentiates his from Sam’s whose locker is labeled 80. Assigning “1” for female and

“2” for male in order to categorize gender in a survey is a nominal assignment of a

number. Nominal assignments are not subjected to arithmetic manipulations.

Ordinal scale is the assignment of numbers for the purpose of differentiating

between objects as well as showing the direction of the difference between them. The

ranking of objects or events is a good example of an ordinal scale. On a survey

questionnaire one may assign “1” for Low Sociability, “2” for Average Sociability, and

“3” for High Sociability. We can now use “more than” or “less than” terms to compare
numbers. However, one cannot say that a person with an Average Sociability assignment
of “2” is twice as sociable as a person with an assignment of “1”.

Interval scale is the assignment of numbers to differentiate and assess the
amount of the difference between objects or events in equal intervals. A good example
of an interval scale is measurement of temperature on the Fahrenheit (F) or Celsius (C)
scale. We can say that a temperature increase from 20° to 40° F is twice as large as an
increase from 50° to 60° F.


Ratio Scale has all the characteristics of the interval scale plus an absolute zero
point. An absolute zero point allows us to make statements involving ratios of two
numerical observations, such as “twice as long” or “half as fast”. The Fahrenheit
temperature scale has an arbitrary zero point; therefore, we cannot say that 80° F is
twice as warm as 40° F. If it takes Joe 6 minutes to run a mile and Sam takes 12
minutes to run a mile, we can say that Sam takes twice as long as Joe (that is, Joe is
twice as fast as Sam) because 0 minutes is an absolute zero point; this assignment
belongs to a ratio scale.

Most physical scales such as time, length, and weight are ratio scales, but very

few behavioral measurements are of this type.

Frequency Distributions

Frequency Distribution is a table constructed to show how many times a given
score or group of scores occurred in a set of data. A simple frequency distribution (see
Table 3) is the ordering of the frequencies of a set of data by score in a table. When
scores are grouped into intervals showing how many scores occurred in each interval,
this is called a grouped frequency distribution (see Table 4).

Apparent Limits are the limits displayed in a grouped frequency table (Table 4,
col. 1); these limits give a reasonable range between which groups of data exist.

Real Limits (for continuous data – measurements or observations that depend

upon the accuracy of the measuring instrument) of any interval extend from ½ unit below

and above the apparent lower and upper limits respectively. The real limits of the 20 – 30

interval are 19.5 and 30.5. The real lower limit is designated L and the real upper limit is

designated U.

The Midpoint, MP, of an interval is its exact center. The MP of any interval is

found by adding the apparent upper limit to the apparent lower limit and dividing by 2.

The MP of the 20 – 30 interval is 25.

Interval size is denoted by the symbol i; it is the distance between the real lower

limit and the real upper limit. The interval size is determined by subtracting L from U. It

is recommended that you group data so that there are between 8 and 15 intervals.
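The definitions of real limits, midpoint, and interval size above can be checked with a few lines of Python. This is only a sketch: the function name `interval_properties` and the `unit` parameter (the measurement unit, assumed to be 1) are illustrative and not part of SPSS.

```python
def interval_properties(apparent_lower, apparent_upper, unit=1.0):
    """Return (L, U, MP, i) for one class interval.

    L and U are the real limits (half a unit beyond the apparent limits),
    MP is the midpoint, and i is the interval size U - L.
    """
    half_unit = unit / 2
    real_lower = apparent_lower - half_unit                 # L
    real_upper = apparent_upper + half_unit                 # U
    midpoint = (apparent_lower + apparent_upper) / 2        # MP
    interval_size = real_upper - real_lower                 # i
    return real_lower, real_upper, midpoint, interval_size


# The 20 - 30 interval used in the text.
print(interval_properties(20, 30))
```

For the 20 – 30 interval this gives real limits of 19.5 and 30.5 and a midpoint of 25, matching the values stated above.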


Variable Measurement Level

You can specify the level of measurement as scale (numeric data on an interval or ratio scale), ordinal, or nominal. Nominal and ordinal data can be either string (alphanumeric) or numeric. Measurement specification is relevant only for:

Custom Tables procedure and chart procedures that identify variables as scale or categorical. Nominal and ordinal are both treated as categorical. (Custom Tables is available only in the Tables add-on component.)

SPSS-format data files used with Answer Tree.

You can select one of three measurement levels:

Scale. Data values are numeric values on an interval or ratio scale--for example, age or income. Scale variables must be numeric.

Ordinal. Data values represent categories with some intrinsic order (for example, low, medium, high; strongly agree, agree, disagree, strongly disagree). Ordinal variables can be either string (alphanumeric) or numeric values that represent distinct categories (for example, 1 = low, 2 = medium, 3 = high).

Note: For ordinal string variables, the alphabetic order of string values is assumed to reflect the true order of the categories. For example, for a string variable with the values of low, medium, high, the order of the categories is interpreted as high, low, medium, which is not the correct order. In general, it is more reliable to use numeric codes to represent ordinal data.

Nominal. Data values represent categories with no intrinsic order--for example, job category or company division. Nominal variables can be either string (alphanumeric) or numeric values that represent distinct categories--for example, 1 = Male, 2 = Female.

For SPSS-format data files created in earlier versions of SPSS products, the following rules apply:

String (alphanumeric) variables are set to nominal.

String and numeric variables with defined value labels are set to ordinal.

Numeric variables without defined value labels but less than a specified number of unique values are set to ordinal.

Numeric variables without defined value labels and more than a specified number of unique values are set to scale.

Figure 1. SPSS: Help menu, measurement levels.


Frequency (f) simply indicates how many scores are located or counted in each

interval.

Number, N, is the number of scores in a distribution or the total of all the

frequencies for all the intervals. N is also called the sample size.
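Outside SPSS, a simple frequency distribution with percent and cumulative percent columns can be sketched in Python as follows. The function name `frequency_table` and the scores are hypothetical; they are not taken from the ODE data.

```python
from collections import Counter


def frequency_table(scores):
    """Return rows of (score, f, percent, cumulative percent), lowest score first."""
    n = len(scores)                      # N, the sample size
    counts = Counter(scores)             # frequency f of each distinct score
    rows = []
    cumulative = 0
    for score in sorted(counts):
        f = counts[score]
        cumulative += f
        rows.append((score, f,
                     round(100 * f / n, 1),
                     round(100 * cumulative / n, 1)))
    return rows


# Hypothetical scores for illustration only.
scores = [50, 47, 50, 61, 47, 68, 50, 61, 47, 50]
for row in frequency_table(scores):
    print(row)
```

The sum of the f column equals N, and the last cumulative percent is always 100.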

The Figures and Tables below show how to use SPSS to create the

frequency distribution table and polygon (histogram) and some SPSS outputs.

Figure 2. SPSS frequency procedure: Analyze -> Charts -> Select Histograms


Table 3

Simple Frequency Distribution of 9th Graders (ODE)

Score | Frequency (f) | Percent | Cumulative Percent
28 | 1 | 1.1 | 1.1
33 | 1 | 1.1 | 2.1
34 | 1 | 1.1 | 3.2
37 | 1 | 1.1 | 4.3
40 | 1 | 1.1 | 5.3
47 | 3 | 3.2 | 8.5
49 | 1 | 1.1 | 9.6
50 | 4 | 4.3 | 13.8
51 | 2 | 2.1 | 16.0
52 | 1 | 1.1 | 17.0
53 | 1 | 1.1 | 18.1
54 | 2 | 2.1 | 20.2
55 | 2 | 2.1 | 22.3
56 | 2 | 2.1 | 24.5
57 | 1 | 1.1 | 25.5
59 | 3 | 3.2 | 28.7
60 | 2 | 2.1 | 30.9
61 | 4 | 4.3 | 35.1
62 | 2 | 2.1 | 37.2
63 | 1 | 1.1 | 38.3
64 | 3 | 3.2 | 41.5
65 | 3 | 3.2 | 44.7
66 | 3 | 3.2 | 47.9
67 | 3 | 3.2 | 51.1
68 | 5 | 5.3 | 56.4
69 | 4 | 4.3 | 60.6
71 | 3 | 3.2 | 63.8
72 | 3 | 3.2 | 67.0
73 | 5 | 5.3 | 72.3
74 | 3 | 3.2 | 75.5
75 | 4 | 4.3 | 79.8
77 | 2 | 2.1 | 81.9
78 | 3 | 3.2 | 85.1
79 | 1 | 1.1 | 86.2
80 | 1 | 1.1 | 87.2
81 | 2 | 2.1 | 89.4
82 | 1 | 1.1 | 90.4
83 | 2 | 2.1 | 92.6
84 | 2 | 2.1 | 94.7
85 | 1 | 1.1 | 95.7
86, 95, 98, 100 | 1 | 1.1 | 96.8


Table 4

Grouped Frequency Distribution of 9th Graders (ODE_New)

Apparent Limits (Scores) | Frequency (f) | Real Limits (L – U) | Midpoint (MP) | Interval Size (i)
90 – 100 | 3 | 89.5 – 100 | 95.0 | 10.5
80 – 89 | 7 | 79.5 – 89.5 | 84.5 | 10
70 – 79 | 8 | 69.5 – 79.5 | 74.5 | 10
60 – 69 | 11 | 59.5 – 69.5 | 64.5 | 10
50 – 59 | 9 | 49.5 – 59.5 | 54.5 | 10
40 – 49 | 3 | 39.5 – 49.5 | 44.5 | 10
30 – 39 | 2 | 29.5 – 39.5 | 34.5 | 10
20 – 29 | 1 | 19.5 – 29.5 | 24.5 | 10

Note. This Group Frequency Table is manually created

Table 5

Grouped Frequency SPSS Output

Scores | Frequency | Cum Percent
20 – 29 | 1 | 1.1
30 – 39 | 3 | 4.3
40 – 49 | 5 | 9.6
50 – 59 | 18 | 28.7
60 – 69 | 30 | 60.6
70 – 79 | 24 | 86.2
80 – 89 | 10 | 96.8
90 – 100 | 3 | 100.0

Figure 3. Histogram of grouped frequency (x-axis: grade9th; y-axis: Frequency; Mean = 61.2766, Std. Dev. = 13.5388, N = 94).


Central Tendency

Central tendency is an attempt to devise a statistical method that yields a single
value that tells us something about the typical value(s) of a distribution. The three
most common central tendency statistics are the arithmetic mean, the median, and the
mode.

The mean or arithmetic mean or average is the sum of all values divided by the

number of values, N. When the mean is determined from a subset of data from an entire

population, it is often denoted by the symbol, M. When the mean is of the entire

population, it is designated by the symbol, µ.

The median is a central tendency statistic that attempts to find the exact center
or midpoint of the data or scores. The median is the value that separates the upper half of
the data set or distribution from the lower half; often this is called the 50th percentile.

The mode is the statistic that shows which score(s) is the most frequent. The
mode is often found by selecting the score(s) that has the highest frequency from the
simple frequency distribution table. A bimodal distribution has two modes. The figure
below shows the SPSS method: Load ODE Data; Analyze -> Descriptive Statistics ->
Frequencies -> Statistics, select Mean, Median, and Mode -> select Display Frequency,
Continue.

The table below shows the output from an SPSS procedure used to
generate the descriptive statistics showing the mean, median, and mode.

Table 6

Descriptive Statistics on 9th Grade (ODE)

Statistics Values

Mean 65.86

Median 67.00

Mode 68, 73

*Multiple modes exist. The smallest value is shown in the frequency statistical summary.
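As a cross-check outside SPSS, the three central tendency statistics can be computed with Python's standard `statistics` module. The scores below are hypothetical (they are not the ODE data), chosen so that two modes exist, as in Table 6.

```python
import statistics

# Hypothetical scores for illustration only.
scores = [60, 68, 68, 70, 73, 73, 75, 81]

mean = statistics.mean(scores)        # sum of all values divided by N
median = statistics.median(scores)    # 50th percentile
modes = statistics.multimode(scores)  # every score tied for the highest frequency

print("Mean:", mean)
print("Median:", median)
print("Modes:", modes)
print("Smallest mode:", min(modes))
```

The last line mirrors the footnote to Table 6: when multiple modes exist, the SPSS summary shows only the smallest one.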


Figure 4. Central tendency measures of pass9th variable from ODE.

Measure of Dispersion

Variation is the fluctuation of scores about a measure of central tendency. To

describe a set of measurements accurately we need to know both the central tendency and

the measure of the variation.

The range is the simplest and most straightforward measure of variability; it is
the difference between the highest score and the lowest score. It does not, however, tell
us anything about the pattern of the distribution of data.

The variance is the average of the squared deviations of a set of data from the
mean and is often denoted by the symbol S².

The standard deviation is the square root of the variance, that is, the square root
of the average squared deviation of a set of data from the mean. It is the most widely
used measure of variability and is reported in most research statistical summaries. It is
often denoted by the symbol S (SD is used when reporting in APA style).

Table 7 shows the measures of dispersion for the pass9th variable from the
ODE data table. The figure below shows the SPSS selection process for measures of
variation: Analyze -> Descriptive Statistics -> Frequencies -> Statistics -> select Std.
Deviation, Variance, Range -> Continue.

Table 7

Measures of Dispersion for pass9th Variable (ODE)

Statistics Values

Std. Deviation 13.61

Variance 185.22

Range 72.00

N is 94
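The measures of dispersion can also be computed by hand or with Python's `statistics` module. The sketch below uses hypothetical scores, not the ODE data. It shows both the N - 1 (sample) and N (population) forms of the variance; the assumption here, not confirmed by this guide, is that SPSS reports the N - 1 form.

```python
import statistics

# Hypothetical scores for illustration only.
scores = [60, 68, 68, 70, 73, 73, 75, 81]

score_range = max(scores) - min(scores)              # highest minus lowest score
sample_variance = statistics.variance(scores)        # squared deviations / (N - 1)
sample_sd = statistics.stdev(scores)                 # square root of the sample variance
population_variance = statistics.pvariance(scores)   # squared deviations / N

print("Range:", score_range)
print("Variance (N - 1):", round(sample_variance, 2))
print("Std. Deviation (N - 1):", round(sample_sd, 2))
print("Variance (N):", population_variance)
```

Whichever divisor is used, the standard deviation is always the square root of the corresponding variance.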


Figure 5. Variability measures of pass9th variable from ODE.

Box Plots

A Box Plot is a graphical chart or diagram that illustrates both the central
tendency and variability of a data set: minimum, 25th percentile (25% of the data fall
below this point), median (50th percentile), 75th percentile, and maximum value. SPSS
procedure: Analyze -> Descriptive Statistics -> Explore -> select the variable; under
Statistics select Descriptives and Percentiles; under Plots select Boxplots.

Figure 6. SPSS procedure for Boxplot.

The SPSS output for Boxplot with selected statistics may look like the following

Figures and Tables:


Table 8

SPSS Output for Boxplot: Descriptive Statistics Table

Variable | Name of Statistics | Statistic | Std. Error
VAR00001 | Mean | 85.7000 | .78951
 | 95% Confidence Interval for Mean, Lower Bound | 83.9140 |
 | 95% Confidence Interval for Mean, Upper Bound | 87.4860 |
 | 5% Trimmed Mean | 85.6667 |
 | Median | 85.5000 |
 | Variance | 6.233 |
 | Std. Deviation | 2.49666 |
 | Minimum | 82.00 |
 | Maximum | 90.00 |
 | Range | 8.00 |
 | Interquartile Range | 3.75 |
 | Skewness | .373 | .687
 | Kurtosis | -.336 | 1.334

Table 9

SPSS Output Boxplot: Percentile Table

Percentiles | 5 | 10 | 25 | 50 | 75 | 90 | 95
Weighted Average (Definition 1) | 82.0000 | 82.1000 | 83.7500 | 85.5000 | 87.5000 | 89.9000 | .
Tukey's Hinges | | | 84.0000 | 85.5000 | 87.0000 | |

Figure 7. SPSS Output: Boxplot diagram (variable VAR00001; axis from 82.00 to 90.00).
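The five values a box plot displays can be sketched in Python with the standard `statistics` module. The data are hypothetical (not the VAR00001 data), and `five_number_summary` is an illustrative name. Percentile definitions differ between packages, so the quartiles here need not match either row of the SPSS percentile table exactly.

```python
import statistics


def five_number_summary(values):
    """Return (minimum, 25th percentile, median, 75th percentile, maximum)."""
    # quantiles(..., n=4) returns the three quartile cut points.
    q1, q2, q3 = statistics.quantiles(values, n=4)
    return min(values), q1, q2, q3, max(values)


# Hypothetical scores for illustration only.
scores = [82, 83, 84, 84, 85, 86, 86, 87, 88, 90]
minimum, q1, median, q3, maximum = five_number_summary(scores)

print("Min:", minimum, "Q1:", q1, "Median:", median, "Q3:", q3, "Max:", maximum)
print("Interquartile range:", q3 - q1)
```

The box in the plot spans Q1 to Q3, the line inside it marks the median, and the whiskers extend toward the minimum and maximum.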

The Standard Score

The standard score or z-score is simply a way of telling how far a score is from
the mean in standard deviation units. Knowing the z-score for a particular data point, X,
tells us not only how far that data point is from the mean but also what percent of the
distribution or data set is below or above X. We use the z-score table in the Appendix
to determine this percentile.

It is a bit awkward when discussing a score, X, to say that it is “2 standard
deviations above the mean” or “1.5 standard deviations below the mean.” The z-score
states the same fact as a single signed number: z = +2 or z = -1.5.


Note that when the z-score is positive, the score is located above the mean, and
when it is negative, the score is below the mean. A z-score of zero (0) means the score is
equal to the mean; for a normal distribution, 50% of the data lie below it and 50% above.

The formula for converting any score, X into its corresponding z-score is:

$$ z = \frac{X - M}{S} $$

where

z is the z-score

X is the observed score or data point being examined

M is the mean of the distribution of scores

S is the standard deviation

From the ODE data table, we know that for the pass9th variable that M=65.86 and

S=13.61.

Arlington schools:

If we would like to know how the Arlington schools did relative to the rest of the schools

for the pass9th variable in ODE table we would compute Arlington’s z-score or standard

score. Arlington’s observed score, X was 84, so its z-score is:

$$ z = \frac{X - M}{S} = \frac{84 - 65.86}{13.61} = 1.33 $$

The positive 1.33 tells us that the Arlington schools scored above the mean of all the
schools; their score is 1.33 standard deviations above the mean (84 ≈ 65.86 + 1.33 x
13.61). The percentile for this z-score (see shaded cell of the z-score cumulative table in
Appendix) is 0.9082 or 90.82% (0.9082 x 100). This means that 90.82% of the schools
had pass9th scores below that of Arlington, or equivalently that Arlington’s pass9th
score was higher than that of 90.82% of the schools.


Lima schools:

If we would like to know how the Lima schools did relative to the rest of the schools for

the pass9th variable in the ODE table we would compute Lima’s z-score or standard

score. Lima’s observed score, X was 40, so its z-score is:

$$ z = \frac{X - M}{S} = \frac{40 - 65.86}{13.61} = -1.90 $$

The negative 1.90 tells us that the Lima schools scored below the mean of all the
schools; their score is 1.90 standard deviations below the mean (40 ≈ 65.86 - 1.90 x
13.61). The percentile for this z-score (see shaded cell of the z-score cumulative table in
Appendix) is 0.028716493 or 2.87% (0.028716493 x 100). This means that 2.87% of the
schools had pass9th scores below that of Lima, or that Lima’s pass9th score was higher
than that of only 2.87% of the schools. One may also state that 97.13% (100 - 2.87) of
the schools scored higher than Lima for the pass9th variable.
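The Arlington and Lima calculations can be reproduced with a short Python sketch. It uses `NormalDist` from the standard `statistics` module in place of the Appendix table, so it assumes, as the table lookup does, that the scores are approximately normally distributed. The helper name `z_score` is illustrative.

```python
from statistics import NormalDist


def z_score(x, mean, sd):
    """Distance of x from the mean, in standard deviation units."""
    return (x - mean) / sd


M, S = 65.86, 13.61             # pass9th mean and standard deviation from the text
std_normal = NormalDist()       # standard normal curve: mean 0, standard deviation 1

z_arlington = round(z_score(84, M, S), 2)   # rounded to two decimals, as in the table
z_lima = round(z_score(40, M, S), 2)

pct_arlington = std_normal.cdf(z_arlington)  # proportion of scores below Arlington
pct_lima = std_normal.cdf(z_lima)            # proportion of scores below Lima

print(z_arlington, round(pct_arlington, 4))
print(z_lima, round(pct_lima, 4))
```

The `cdf` call plays the role of the cumulative z-score table: it returns the area under the standard normal curve to the left of the given z-score.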

SPSS can generate the standard score (see Figure 8) for you with the following
procedure: Analyze -> Descriptive Statistics -> Descriptives -> select the variable and
check Save standardized values as variables. SPSS will create a new variable with the
z-scores for each value of the variable.

The Normal Distribution

Many physiological and psychological measurements are normally distributed;

that is, a graph of the measurement looks like the familiar symmetrical, bell-shaped

distribution shown below. If we know or can assume that our data is similar to that of a

normal distribution, then there are many statistical conclusions or predictions we can

make about our data. Many statistical calculations assume that the data is normally

distributed. If the data is not normally distributed we can use non-parametric statistical

tests to draw conclusions or make predictions about our data.

There is a whole family of normal curves, depending on the sample or population
mean, M, and the sample or population standard deviation, S. The normal curve can also
be drawn with the z-score instead of the actual data values.


Figure 8. SPSS procedure: Standard scores


Figure 9. IQ Normal Curve.

Figure 10. Standard Normal Curve (z-scores).

One important characteristic of the normal curve is the information that the area

under the curve tells us about the probability of the set of data we are studying.

The area under the normal curve to the left of a z-score of 0 is 50%. This is
indicated in the z-score table in the Appendix as 0.50. The area under the normal curve
to the left of a z-score of +1.0 is 84.13%, or 0.8413 in the z-score cumulative probability
distribution table in the Appendix.

Figure 11. Area under curve when z = 0.0.

Figure 12. Area under curve when z = 1.0.


Using the z-score table in Appendix, the percent of the sample or population

between z-score of 0 and 1.0 is 84.13% - 50% = 34.13%. The percent of the sample

distribution below a z-score of -1.0 is 15.87%. Therefore, the percent of the sample

between a z-score of -1.0 and +1.0 is 84.13% - 15.87% = 68.26% or approximately 68%.

Note that the percent of the population or sample between z = -2 and z = +2 is

approximately 95% and the percent between z = -3 and z = +3 is about 99%.

Figure 13. Area under curve when z = -1.

Figure 14. Area between z=-1 and z = +1.

Figure 15. Area between z = -2 and z = +2.

Figure 16. Area above z = 1.65.


Figure 17. SPSS procedure for creating a normal curve fit.

If we wanted to know what percent of the data is above a z-score of 1.65, we would
observe from the z-score table in the Appendix that the cumulative probability below
1.65 is 0.9505 (95.05%). Therefore, the percentage above 1.65 is 4.95% (100 - 95.05).
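The areas discussed in this section can be checked without the Appendix table. The sketch below uses `NormalDist` from Python's standard `statistics` module; the helper names `area_between` and `area_above` are illustrative.

```python
from statistics import NormalDist

std_normal = NormalDist()  # standard normal curve: mean 0, standard deviation 1


def area_between(z_low, z_high):
    """Proportion of the distribution between two z-scores."""
    return std_normal.cdf(z_high) - std_normal.cdf(z_low)


def area_above(z):
    """Proportion of the distribution above a z-score."""
    return 1 - std_normal.cdf(z)


for k in (1, 2, 3):
    print(f"between z = -{k} and z = +{k}: {100 * area_between(-k, k):.2f}%")
print(f"above z = 1.65: {100 * area_above(1.65):.2f}%")
```

Small differences from the table-based figures (for example 68.27% here versus 68.26% from the table) come from rounding in the four-decimal table entries.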

Figure 18. Normal curve fit for the pass9th variable from ODE (histogram; x-axis: Passed 9th Grade; y-axis: Frequency; Mean = 65.86, Std. Dev. = 13.609, N = 94).


Correlation

Correlation is a statistical method that tries to determine if there is a relationship
between two variables, such as high school GPA and success in college. If there is a
relationship, we say that the two variables are correlated.

Correlation techniques indicate the strength or amount of relationship, so that a
single value will tell us how any two variables are related. This single value is called the
correlation coefficient, r. When we wish to predict scores on one variable, knowing
the scores on another, we use another statistical technique called regression. So
correlation tells us if a relationship exists, and regression enables us to use this
relationship to predict the score on one variable, given the score on the other.

The correlation coefficient, r (Pearson), ranges from -1 to +1. An r value of +1
indicates that the two variables have a perfect positive linear relationship; that is, as the
scores of one variable increase, the scores of the other also increase (from Table 10:
Ability and Speed). An r value of -1 indicates a perfect negative linear relationship; as
the scores of one variable increase, the scores of the other decrease (Table 10: Ability
and GPA). When the r value is 0, there is no linear relationship, or no correlation.

Three assumptions are made about Pearson r: 1. it requires interval or ratio data,

2. the relationship between variables must be linear, and 3. the technique requires pairs of

data values.


Table 10

Correlation Example Table

Player | Ability | GPA | Speed | Index
A | 1 | 5 | 1 | 3
B | 2 | 4 | 2 | 2
C | 3 | 3 | 3 | 4
D | 4 | 2 | 4 | 2
E | 5 | 1 | 5 | 3

Figure 19. Positive correlation: Ability vs. Speed.

Figure 20. Negative correlation: Ability vs. GPA.


The sign of the correlation coefficient tells us whether one variable is increasing or
decreasing as the other is increasing. The size of r indicates the strength of the
relationship.

Figure 21. No correlation: Ability vs. Index.

The Pearson r for the data above is shown below:

Table 11

Correlation Matrix for Correlation Example

Variable | Ability | GPA | Speed | Index
Ability | 1 | -1** | 1** | 0
GPA | | 1 | -1** | 0
Speed | | | 1 | 0
Index | | | | 1

**Correlation is significant at the 0.01 level (2-tailed).
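The coefficients in the correlation matrix can be checked by hand. The following is a minimal sketch in Python (not part of the SPSS procedure); the function name pearson_r is an illustrative assumption, and the data are the five players from Table 10.

```python
from math import sqrt


def pearson_r(x, y):
    """Return the Pearson correlation coefficient for paired lists x and y."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sums of squares about the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Data copied from Table 10 (players A to E).
ability = [1, 2, 3, 4, 5]
gpa = [5, 4, 3, 2, 1]
speed = [1, 2, 3, 4, 5]
index = [3, 2, 4, 2, 3]

r_speed = pearson_r(ability, speed)
r_gpa = pearson_r(ability, gpa)
r_index = pearson_r(ability, index)

print(f"Ability vs Speed: r = {r_speed:.3f}")
print(f"Ability vs GPA:   r = {r_gpa:.3f}")
print(f"Ability vs Index: r = {abs(r_index):.3f}")
```

The three values agree with the Ability row of the correlation matrix: a perfect positive, a perfect negative, and a zero correlation.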

If the relationship between the variables is linear, we may use the Pearson correlation coefficient, r. For relationships that are monotonic but not linear, or for ranked (ordinal) data, we may use the Spearman rank correlation coefficient, rs.

The significance of the correlation coefficient, r, depends on the sample size and the level of confidence one wishes to have in the correlation coefficient. In the SPSS correlation method this is related to the p-value or significance level. If the p-value is small (less than 0.05, for 95% confidence), then the correlation is significant and the two variables are linearly related (especially so for the Pearson r). If the p-value is large (p > 0.05), the correlation is not significant, and we have no evidence that the two variables are linearly related.

Most textbooks use a scheme like the following to interpret the absolute value of the correlation coefficient:

Table 12

Interpretations for Correlation Coefficient

Correlation Coefficient value   Interpretation
>= 0.80                         Very Strong
0.60 to 0.80                    Strong
0.40 to 0.60                    Moderate
0.20 to 0.40                    Low
<= 0.20                         Very Low
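The interpretation scheme in Table 12 can be written as a small lookup. This is only a sketch: the function name interpret_r is hypothetical, the scheme is applied to the absolute value of r, and, because the bands in the table share their end points, a value that falls exactly on a boundary is assumed here to belong to the higher band.

```python
def interpret_r(r):
    """Label the strength of a correlation coefficient using the Table 12 bands."""
    size = abs(r)  # the sign gives direction only; strength uses the magnitude
    # Assumption: a boundary value (e.g. exactly 0.60) goes to the higher band.
    if size >= 0.80:
        return "Very Strong"
    if size >= 0.60:
        return "Strong"
    if size >= 0.40:
        return "Moderate"
    if size >= 0.20:
        return "Low"
    return "Very Low"


# A few coefficients taken from the CPS50 correlation matrix (Table 13).
for r in (0.863, 0.774, 0.422, -0.31, -0.687):
    print(f"r = {r:+.3f}: {interpret_r(r)}")
```

Note that the label describes only the size of r; whether the coefficient is statistically significant still depends on the p-value reported by SPSS.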

Figure 22. SPSS correlation procedure: Analyze -> Correlate (Pearson).

Table 13

Correlation Matrix for First 25 Data for CPS50

Variable                      1         2         3         4         5         6
1. Independent living scale   1         0.774**   0.37      0.623**   0.707**   -0.687**
2. Self confidence score                1         0.559**   0.764**   0.863**   -0.720**
3. Academic aptitude test                         1         0.552**   0.422*    -0.405*
4. Personal adjustment scale                                1         0.669**   -0.31
5. Social skills inventory                                            1         -0.618**
6. Age of student                                                               1

Note. Columns 1 to 6 are the same variables as the numbered rows.

** Correlation is significant at the 0.01 level (2-tailed).

* Correlation is significant at the 0.05 level (2-tailed).


CHAPTER TWO

Inference Statistics

Introduction

Read the sections on Sampling Theory and Hypothesis Testing in the Appendix [not attached] for an introduction to inference statistics. Before you perform any inference statistical test it is best to: a. know a little about your sample and/or population, b. have a hypothesis that you want tested, and c. determine which statistical test you will use to evaluate your hypothesis. This chapter deals with the selection of appropriate statistical tests to evaluate your hypothesis or make inferences about your sample(s).

The Student t-test: The t-distribution is similar to a normal distribution and approaches the normal curve when the sample size is very large. The following are properties of the t-distribution:

1. There are an infinite number of Student's t distributions, one for each degree of freedom, df. The number of degrees of freedom for the One-sample case is the sample size minus 1. That is, df = N - 1.

2. The Student's t distribution is similar to the standard normal distribution in shape: it has a mean of t = 0, and the total area under the curve (the probability density function) is 1.

3. As the sample size gets larger, the curve gets closer to the standard normal curve. Therefore, for sample sizes >= 30 the z-score statistics can be used as an approximation.
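The third property can be illustrated numerically. The sketch below compares a few two-tailed critical t values at alpha = 0.05, copied from the t table in the Appendix, with the corresponding z value computed by Python's standard library. The dictionary name T_CRIT_05 is an illustrative assumption; Python itself is not part of the SPSS workflow.

```python
from statistics import NormalDist

# Two-tailed critical t values at alpha = 0.05, copied from the Appendix table.
T_CRIT_05 = {1: 12.706, 5: 2.571, 10: 2.228, 20: 2.086}

alpha = 0.05
# Two-tailed critical z value: the point with alpha/2 of the area above it.
z_crit = NormalDist().inv_cdf(1 - alpha / 2)

# Gap between each critical t and the critical z.
gaps = {df: t - z_crit for df, t in T_CRIT_05.items()}

print(f"critical z = {z_crit:.3f}")
for df, t in T_CRIT_05.items():
    print(f"df = {df:2d}: critical t = {t:.3f}, t - z = {gaps[df]:.3f}")
```

The gap shrinks steadily as df grows, which is why the z statistics become an acceptable approximation for larger samples.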


Figure 23. Normal and t-distribution curves.

Hypothesis Testing - One-sample case for the mean

One-Sample Case: If you wish to compare the mean of your sample to a test value, a constant, you use the t distribution. For this test you will need your sample mean, M; the sample standard deviation, S; the sample size, N; and the value that you want the sample mean compared to.

Hypothesis: Your null hypothesis, H0 (test statement), is that your sample mean = test value; that is, there is no reason to believe that the sample mean differs from the test value. Your alternate hypothesis, Ha, is that the sample mean differs from the test value.

H0: M = E, where E is your test value

Statistics Used: Once you select the appropriate t-test method in your SPSS program, it automatically calculates these statistics for you:

1. Degrees of freedom: df = N - 1

2. Standard error: $S_{\bar{X}} = \dfrac{S}{\sqrt{N}}$

3. Test statistic, t-test: $t = \dfrac{M - E}{S_{\bar{X}}}$

4. Alpha level: a = 0.05 for a 95% confidence statement and a = 0.01 for 99%

5. The t-value: ta (t alpha) is the table lookup value for the given alpha (0.05) and df

Significance level: The probability at which you reject the null hypothesis.

Acceptance and Rejection Criteria:

Reject the null hypothesis for any of the following reasons:

1. The significance level < alpha or p-value < 0.05

2. The absolute value of the t-test (calculated) > t-value (table lookup)

3. The confidence interval for the mean difference does not contain zero.

Do not reject (accept) the null hypothesis for any of the following reasons:

1. The significance level > alpha or p-value > 0.05

2. The absolute value of the t-test (calculated) < t-value (table lookup)

3. The confidence interval for the mean difference contains zero

The SPSS output for the One-sample t-test will provide sufficient information for

you to make a decision about your null hypothesis. The following are two examples of

the one-sample t-test.

The SPSS procedure to calculate the one-sample t-test for a single variable and

compare it to a constant value is as follows: Analyze -> Compare Means -> One-Sample T Test

-> Enter Test Value and under Option Select Confidence Interval of 95% (for alpha of 0.05 or significance

level).


Figure 24. SPSS one-sample t-test procedure

Example 1: Is the mean of the pass9th variable 70? Since we wish to compare the mean of a single variable to a constant value of 70, we can use the one-sample t-test with an alpha (a) of 0.05 (95% confidence). When we performed this analysis using SPSS, we obtained the following results:

The null hypothesis is H0: mean of pass9th variable = 70. SPSS calculates the test statistic from the difference (sample mean - test value), so the t-test is negative here because the sample mean is below 70; we use its absolute value when comparing it with the table value.

Table 14

SPSS One-Sample t-test of pass9th Variable: Basic Statistics

                   N    Mean    Std. Deviation   Std. Error Mean
Passed 9th Grade   94   65.86   13.609           1.404

Table 15

SPSS One-Sample t-test of pass9th Variable: t-test Analysis for Test Value of 70

Test Value = 70
                                                                       95% Confidence Interval
                                                                       of the Difference
                   t        df   Sig. (2-tailed)   Mean Difference   Lower    Upper
Passed 9th Grade   -2.948   93   .004              -4.138            -6.93    -1.35

Making inference or Results of the test:

The t-value obtained from the one-tailed t-distribution table in the Appendix is 1.661 (df = 90+, a = 0.05). Given the following observations from the results in Table 15, we reject the null hypothesis in favor of the alternate hypothesis that there is a significant difference between the test value of 70 and the sample mean for the pass9th variable. We could also say that the mean for the pass9th variable of 65.86 is significantly less than 70.

Reject the null hypothesis for the following reasons (any one):

1. The significance p-value = 0.004 (2-tailed, as reported by SPSS) < 0.05
2. The absolute value of the t-test of 2.948 > t-value of 1.661 (Appendix)
3. The confidence interval (-6.93 to -1.35) of the mean difference does not contain zero.
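As a cross-check on the SPSS output, the standard error and the t statistic can be recomputed from the summary values in Table 14. This is a minimal sketch in Python, with variable names chosen for illustration; small differences in the last decimal place come from the rounded mean shown in the table.

```python
from math import sqrt

# Summary statistics copied from Table 14 and the test value from Table 15.
n = 94
mean = 65.86
sd = 13.609
test_value = 70.0

df = n - 1                     # degrees of freedom, N - 1
se = sd / sqrt(n)              # standard error of the mean, S / sqrt(N)
t = (mean - test_value) / se   # test statistic, (M - E) / standard error

print(f"df = {df}")
print(f"standard error = {se:.3f}")
print(f"t = {t:.3f}")
```

The recomputed standard error matches the Std. Error Mean in Table 14, and the t value agrees with Table 15 to within rounding.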


Hypothesis Testing - Two-sample case for the mean

Two-Sample Independent Case: If you wish to compare the means of two independent groups, you use the two-sample t-test (two-tailed). For this test you will need to know both sample means, M1 and M2; both sample standard deviations, S1 and S2; and the sample sizes, N1 and N2. We are assuming independence for this case, i.e. the selection of a member of one group does not influence the selection of any member of the other group.

Hypothesis: Your null hypothesis, H0, is that mean1 = mean2. Your alternate hypothesis, Ha, is that there is a difference in the means.

H0: M1 = M2

Statistics Used:

1. Degrees of freedom: df = N1 + N2 - 2

2. Standard error of the mean difference, $S_D$

3. Test statistic, t-test: $t = \dfrac{M_1 - M_2}{S_D}$

4. Test for Equality of Variance (Levene's Test for Equality of Variances).

a. If the significance for Levene's Test > 0.05, use the row of information for Equal Variance (F-test: p > 0.05)

b. If the significance for Levene's Test < 0.05, use the row of information for Unequal Variance (F-test: p < 0.05)

5. Alpha level: a = 0.05 for a 95% confidence statement and a = 0.01 for 99%

6. The t-value: ta (t alpha) is the table lookup value for the given alpha (0.05) and df (from the two-tailed t-distribution table).






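Acceptance and Rejection Criteria:

Reject the null hypothesis for any of the following reasons:

1. The significance level < alpha or p-value < 0.05

2. The absolute value of the t-test (calculated) > t-value (lookup)

3. The confidence interval for the mean difference does not contain zero.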






The SPSS output for the two-sample t-test will provide sufficient information for

you to make a decision about your null hypothesis. The following is an example of the

two-sample t-test.

The SPSS procedure to calculate the two-sample t-test for two independent

variables is as follows: Analyze -> Compare Means -> Independent Samples T Test -> Select

and Define a Grouping Variable (two values) -> Under Option make sure the Confidence Interval

is 95% (for alpha of 0.05 or significance level).


Example: Are the means of the pass9th variable the same for the two values of the hilo variable (values 1 and 2)? Since we wish to compare two means (two groupings of the pass9th variable by the hilo variable), we can use the two-sample t-test with an alpha of 0.05 (95% confidence; 95% or 0.95 = 1.00 - 0.05). When we performed the analysis using SPSS, we obtained the following results:

Figure 25. SPSS two-sample t-test procedure for independent samples.

The null hypothesis is that: H0: mean1 (value 1) = mean2 (value 2) of pass9th variable.

Table 16

SPSS Two-Sample t-test of pass9th Variable by Hilo: Basic Statistics

                   hilo   N    Mean    Std. Deviation   Std. Error Mean
Passed 9th Grade   1.00   47   62.49   14.368           2.096
                   2.00   47   69.23   12.033           1.755

Table 17

SPSS Two-Sample t-test of pass9th Variable by Hilo: t-test Analysis

                   Levene's Test for
                   Equality of Variances   t-test for Equality of Means
                                                                                                 Confidence Interval, 95%
                   F       Sig.            t        df       Sig. 2-tailed   Mean Diff   Std Error   Lower     Upper
Equal Variance     0.756   0.387           -2.467   92       0.015           -6.745      2.734       -12.174   -1.315
Unequal Variance                           -2.467   89.251   0.016           -6.745      2.734       -12.176   -1.313

Making inference or Results of the test:

Since the F-test (Levene's Test) significance level of 0.387 > 0.05, we assume that the variances are equal and use the row for Equal Variance to make our inference about the means (notice that the statistics look almost the same for both rows, as expected when the two groups are the same size and have similar variances).

The t-value obtained from the two-tailed t-distribution table in the Appendix is 1.987. Given the following observations from the results in Table 17, we reject the null hypothesis in favor of the alternate hypothesis that there is a significant difference between the two means for hilo values for the pass9th variable.

We reject the null hypothesis for the following reasons:

1. The significance p-value = 0.015 < 0.05

2. The absolute value of the t-test of 2.467 > t-value of 1.987 (Appendix)

3. The confidence interval (-12.174 to -1.315) of the mean difference does not contain zero.
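The Equal Variance row of Table 17 can be approximately reproduced from the group statistics in Table 16. The sketch below is written in Python for illustration only. It assumes the usual textbook pooled-variance formula for the standard error of the mean difference, which this guide does not spell out, and it works from the rounded means in Table 16, so the last decimal place may differ slightly from the SPSS output.

```python
from math import sqrt

# Group statistics copied from Table 16 (hilo = 1 and hilo = 2).
n1, m1, s1 = 47, 62.49, 14.368
n2, m2, s2 = 47, 69.23, 12.033

df = n1 + n2 - 2  # degrees of freedom for the equal-variance case

# Assumption: pooled variance, weighting each sample variance by its df.
pooled_var = ((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / df

# Standard error of the mean difference and the t statistic.
se_diff = sqrt(pooled_var * (1 / n1 + 1 / n2))
t = (m1 - m2) / se_diff

print(f"df = {df}")
print(f"standard error of the difference = {se_diff:.3f}")
print(f"t = {t:.3f}")
```

The standard error agrees with the Std Error column of Table 17, and the t value matches to within the rounding of the table means.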

Hypothesis Testing - Correlated samples case for the mean

Correlated or Paired Samples: If you wish to compare the means of two variables that are related or correlated, you use the Paired t-test statistical strategy. For this test you will need to know both sample means, M1 and M2; both sample standard deviations, S1 and S2; and the sample size, N1 = N2. If the test for correlation shows a low correlation coefficient that is not significant (a high p-value), consider using the Independent Samples t-test.

Usually the two correlated variables represent the same group of subjects measured at different times (e.g. before and after an event or treatment), or related groups (e.g. husbands and wives or the left wheel and right wheel of the same automobile).

Hypothesis: Your null hypothesis, H0, is that mean1 = mean2. Your alternate hypothesis, Ha, is that there is a difference in the means.

H0: M1 = M2

Statistics Used:

1. Degrees of freedom: df = N1 - 1

2. Standard error of the mean difference: $SE = \dfrac{S_D}{\sqrt{N - 1}}$, where $S_D = \sqrt{\dfrac{\sum D^2}{N} - \bar{D}^2}$, D is the difference between the paired scores, and $\bar{D}$ is the mean of the differences

3. Test statistic, t-test: $t = \dfrac{\bar{D}}{SE}$

4. Test for Correlation

a. Correlated if r is high and correlation p < 0.05

b. Use Independent Samples t-test, if r is low and correlation p > 0.05

5. Alpha level: a = 0.05 for a 95% confidence statement and a = 0.01 for 99%

6. The t-value: ta (t alpha) is the table lookup value for the given alpha (0.05) and df (from the two-tailed t-distribution table).





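Acceptance and Rejection Criteria:

Reject the null hypothesis for any of the following reasons:

1. The significance level < alpha or p-value < 0.05

2. The absolute value of the t-test (calculated) > t-value (table lookup)

3. The confidence interval for the mean difference does not contain zero.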






The SPSS output for the Paired t-test will provide sufficient information for you

to make a decision about your null hypothesis. The following is an example of the Paired

t-test for two correlated variables.


The SPSS procedure to calculate the Paired t-test for two correlated variables is as

follows: Analyze -> Compare Means -> Paired-samples T Test -> Select the Correlated

Variables -> Under Option make sure the Confidence Interval is 95% (for alpha of 0.05 or significance

level).

Example: Are the means for the Visual and Mathach variables from the HSB500 table the same? Since we wish to compare two means for related or correlated variables, we can use the Paired t-test with a suggested alpha of 0.05 (95% confidence). When we performed the analysis using SPSS, we obtained the following results:

The null hypothesis is that: H0 : mean1 (Visual) = mean2 (Mathach)

Table 19

SPSS Paired t-test of Visual and Mathach Variables: Basic Statistics

         Variable   Mean     N     Std. Deviation   Std. Error Mean
Pair 1   visual     5.697    500   3.887            0.174
         mathach    13.098   500   6.605            0.295

Table 20

SPSS Paired t-test Correlation Table

                            N     Correlation   Sig.
Pair 1   visual & mathach   500   .438          .000

Figure 26. SPSS Paired t-test procedure.

Table 21

SPSS Paired t-test Analysis Table

                   Paired Differences
                                                              95% Confidence Interval
                                                              of the Difference
                   Mean     Std. Deviation   Std. Error Mean   Lower    Upper    t         df    Sig. (2-tailed)
visual - mathach   -7.401   6.018            0.269             -7.930   -6.872   -27.499   499   .000

Making inference from the t-test Analysis

The t-value obtained from the two-tailed t-distribution table in the Appendix is 1.960. Given the following observations from the results in Table 21, we reject the null hypothesis in favor of the alternate hypothesis that there is a significant difference between the means of the Visual and Mathach variables.

We reject the null hypothesis for the following reasons:

1. The significance p-value = 0.000 (reported to three decimals, i.e. p < 0.001) < 0.05

2. The absolute value of the t-test of 27.499 > t-value of 1.960 (Appendix)

3. The confidence interval (-7.930 to -6.872) of the mean difference does not contain zero.
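The paired t statistic can be recomputed from the values SPSS reports for the paired differences. The following Python sketch is illustrative only; it takes the mean and standard deviation of the differences from Table 21 and uses the standard error as Std. Deviation / sqrt(N), which reproduces the Std. Error Mean that SPSS prints.

```python
from math import sqrt

# Values copied from Tables 19 and 21.
n = 500
mean_visual = 5.697
mean_mathach = 13.098
mean_diff = -7.401   # mean of the paired differences (visual - mathach)
sd_diff = 6.018      # standard deviation of the paired differences

df = n - 1               # degrees of freedom for the paired test
se = sd_diff / sqrt(n)   # standard error of the mean difference
t = mean_diff / se       # test statistic

# The mean difference equals the difference of the two variable means.
check_diff = mean_visual - mean_mathach

print(f"df = {df}")
print(f"mean difference = {check_diff:.3f}")
print(f"standard error = {se:.3f}")
print(f"t = {t:.3f}")
```

The results agree with Table 21 to within rounding.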

Simple Linear Regression

If two variables are correlated, we can find a formula that predicts the value of one variable given the value of the other. The variable that is used for prediction is called the independent variable and is often denoted X. The predicted variable, Y, is called the dependent variable because its calculation depends upon the other variable, X.

A linear equation is the formula of a straight line and is given by the equation y = mx + b, where m is the slope of the line, b is the y-intercept (the value of y when x = 0, or the value of the dependent variable when the independent variable is zero), and X and Y are the independent and dependent variables respectively. If the graph of two related variables looks like a straight line, we say that they appear to be linear (linearity exists between them).

The purpose of linear regression is to find the equation of the best straight line that represents the linear relationship between both sets of data. Knowing the linear formula for a pair of data sets allows us to predict the value of one variable given the value of the other.

Assumptions for the Pearson r in Regression

There are four assumptions needed before one can meaningfully apply the linear regression model:

1. Both variables must be correlated; r must be significant

2. The relationship between both variables must be linear (close to a straight line)

3. Both variables are fairly normally distributed (in the population)

4. The standard deviation of the dependent variable, Y, for a given value of X is about the same for all values of X (homoscedasticity)

The SPSS procedure for Linear Regression is (HWJ100 table): Analyze -> Regression ->

Linear -> Select Dependent and Independent Variable.


Figure 27. Regression SPSS procedure: Equation.

The outputs for the regression analysis are many, and tell us different things about the analysis. The Regression Summary (Table 22) shows us the Pearson correlation coefficient value, r. Since r is high, 0.838, it indicates that the variables are linearly related, and the Coefficient of Determination, r2, is 0.703. The Coefficient of Determination tells us how much of the variation in the dependent variable, Y, is associated with change in the independent variable, X. An r2 of 0.703 tells us that 70.3% of the variation in Verbal scores is associated with changes in GPA. That means that the remaining 29.7% is due to other factors. Tables 22 and 23 indicate that GPA is the independent variable (predictor variable) and Verbal is the dependent variable.

Table 22

SPSS Regression Table: Summary

Model   R         R Square   Adjusted R Square   Std. Error of the Estimate
1       .838(a)   .703       .700                7.29910

a Predictors: (Constant), GPA

The regression coefficients table (Table 23) tells us many other things. The t value for GPA of 15.221 is high, which is good, and its significance level of 0.000 < 0.05. This indicates that GPA significantly predicts the Verbal scores. Rule of thumb: a t value greater than 2 or less than -2 shows significant predictability of the independent variable.

The linear regression model is shown in the B column of Table 23; b is the Constant's B value of 40.540 and m is the GPA's B value of 25.011. So the equation for this regression line is:

Verbal = 40.540 + 25.011(GPA)

So a GPA of 2.1 would predict a Verbal score of 93.063 = 40.540 + 25.011(2.1)
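The regression equation can be turned into a small prediction function. This is a sketch in Python for illustration; the function name predict_verbal is hypothetical, and the coefficients and standard error are copied from the B and Std. Error columns of Table 23.

```python
# Coefficients copied from the B column of Table 23.
INTERCEPT = 40.540   # b, the Constant
SLOPE = 25.011       # m, the coefficient for gpa
SLOPE_STD_ERROR = 1.643


def predict_verbal(gpa):
    """Predict a Verbal score from a GPA using Verbal = b + m * GPA."""
    return INTERCEPT + SLOPE * gpa


predicted = predict_verbal(2.1)

# The t value for a coefficient is the coefficient divided by its standard error.
t_gpa = SLOPE / SLOPE_STD_ERROR

print(f"Predicted Verbal for GPA 2.1: {predicted:.3f}")
print(f"t for gpa: {t_gpa:.3f}")
```

Predictions are only meaningful for GPA values within the range of the data used to fit the line.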

Table 23

SPSS Regression Table: Correlation

                 Unstandardized Coefficients   Standardized Coefficients
Model            B        Std. Error           Beta                        t        Sig.
1   (Constant)   40.540   4.673                                            8.676    .000
    gpa          25.011   1.643                .838                        15.221   .000

a Dependent Variable: Verbal


Figure 28. SPSS procedure: Analyze -> Regression -> Curve Estimation.

The Analysis of Variance (ANOVA) in Table 24 tells us how good the regression model is at predicting the Verbal variable. An F-test significance level of 0.000 < 0.05 (p-value < a) shows that the regression model is a significantly good model for predicting the outcome, Verbal, given GPA. Other information in Table 24 also shows the significance of the model: a high Sum of Squares for the Regression row relative to the Residual row and a high F-test statistic both indicate significance.

Table 24

SPSS Regression Analysis: ANOVA

Model            Sum of Squares   df   Mean Square   F         Sig.
1   Regression   12343.457        1    12343.457     231.685   .000(a)
    Residual     5221.133         98   53.277
    Total        17564.590        99

a Predictors: (Constant), GPA

b Dependent Variable: Verbal
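The entries of the ANOVA table are related to one another by simple arithmetic, which the following Python sketch illustrates using the sums of squares and degrees of freedom copied from Table 24. The variable names are chosen for illustration.

```python
# Sums of squares and degrees of freedom copied from Table 24.
ss_regression = 12343.457
ss_residual = 5221.133
df_regression = 1
df_residual = 98

ss_total = ss_regression + ss_residual

# Each mean square is a sum of squares divided by its degrees of freedom.
ms_regression = ss_regression / df_regression
ms_residual = ss_residual / df_residual

# The F statistic is the ratio of the two mean squares.
f_stat = ms_regression / ms_residual

# R Square is the share of the total variation explained by the regression.
r_squared = ss_regression / ss_total

print(f"Total sum of squares = {ss_total:.3f}")
print(f"Residual mean square = {ms_residual:.3f}")
print(f"F = {f_stat:.3f}")
print(f"R Square = {r_squared:.3f}")
```

The recomputed F and R Square agree with Table 24 and with the R Square in the regression summary table.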

Figure 29. SPSS: Linear Plot of GPA and Verbal (observed values and fitted line; verbal plotted against gpa).

Chi-square Tests of Association

The Chi-square test is a nonparametric statistical test that is not affected by the distribution of the data (the data can be non-normal); it requires only that the sample be random. Like correlation, it tests for an association between variables. It does this by comparing the actual (observed) numbers in each group (category) with those expected theoretically or by chance. The Chi-square test requires that data be expressed as frequencies (numbers in each category); this is the nominal level of measurement.

For the Chi-square test to be reliable, the expected frequency in each category should not be less than 5. The categories should be independent of each other; that is, no data should fall into more than one category.

We will illustrate the Chi-square test of association with two examples. The first example tests the hypothesis of whether there is any relationship or association between a college student's sex (gender) and his or her father's level of education (students attending college from the HSB500 table). Before we perform the analysis, what would you expect the outcome to be? The second example tests the hypothesis of whether there is any relationship between a college student's sex and grade index.

Example 1: Is there a relationship between sex (0, 1) and father's education (2 - 10)?

A summary of the number of cases (frequencies) in each category is shown below. Table 25 is called the contingency table, specifically a “9 x 2” contingency table (nine columns for the father's education variable and two rows for the sex variable). It is also called the cross-tabulation; rows and columns are totaled.

Table 25

9 x 2 Contingency Table: Sex and Father's Education

        Father's education (code)
Sex     2     3     4     5     6     7     8     9     10    Total
0       72    73    12    22    14    20    41    15    11    280
1       60    56    2     16    13    6     35    21    11    220
Total   132   129   14    38    27    26    76    36    22    500

Note. Father's education codes: 2 = less than HS, 3 = HS grad, 4 = less than 2 yr Voc, 5 = more than 2 yr Voc, 6 = less than 2 yr Coll, 7 = more than 2 yr Coll, 8 = Coll grad, 9 = Master's, 10 = MD/PhD.

The contingency table for this data was generated with the SPSS program from the HSB500 table. The SPSS procedure for a simple Chi-square association analysis is shown below. Note that there are other types of Chi-square analysis beyond the scope of this text.

SPSS Chi-square procedure: Analyze -> Descriptive Statistics -> Crosstabs -> select row variable (sex) and column variable(s) (faed) -> under Statistics, check Chi-square.

The output for the Chi-square association analysis is listed in Table 26. Notice the following:

1. The Pearson Chi-square statistic is 13.465; this is less than the table value $\chi^2_{(1-\alpha)}$: 13.465 < 15.51 (df = 8, $\alpha$ = 0.05). So we do not reject the null hypothesis; there is no evidence of an association between sex and father's education.

2. The significance level is 0.097 > alpha of 0.05

3. No (0%) expected frequency is less than 5 (20% maximum allowed)
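The Pearson Chi-square statistic can be computed directly from the contingency table of sex by father's education. The sketch below, in Python for illustration only, uses the observed counts from Table 25, derives the expected counts from the row and column totals, and compares the statistic with the table value of 15.51 quoted above. Variable names are assumptions made for this example.

```python
# Observed counts from Table 25: rows are sex (0, 1), columns are
# father's education codes 2 through 10.
observed = [
    [72, 73, 12, 22, 14, 20, 41, 15, 11],
    [60, 56, 2, 16, 13, 6, 35, 21, 11],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Expected count for each cell: (row total * column total) / N.
expected = [[r * c / n for c in col_totals] for r in row_totals]

# Pearson Chi-square: sum over all cells of (observed - expected)^2 / expected.
chi_square = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)

df = (len(row_totals) - 1) * (len(col_totals) - 1)
min_expected = min(min(row) for row in expected)

CRITICAL_VALUE = 15.51  # table value for df = 8, alpha = 0.05 (from the text)
reject_null = chi_square > CRITICAL_VALUE

print(f"N = {n}, df = {df}")
print(f"Pearson Chi-square = {chi_square:.3f}")
print(f"Minimum expected count = {min_expected:.2f}")
print(f"Reject null hypothesis: {reject_null}")
```

The statistic, the degrees of freedom, and the minimum expected count agree with the values SPSS reports for this analysis.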

Figure 30. SPSS Procedure: Chi-square for association.

Table 26

Chi-square Analysis: Sex and Father's Education

                               Value       df   Asymp. Sig. (2-sided)
Pearson Chi-square             13.465(a)   8    .097
Likelihood Ratio               14.467      8    .070
Linear-by-Linear Association   .543        1    .461
N of Valid Cases               500

a 0 cells (.0%) have expected count less than 5. The minimum expected count is 6.16.

Example 2: Is sex related to students' grade index in the HSB500 table?

The cross-tabulation or 7 x 2 contingency table (7 columns and 2 rows of frequencies associated with grades and sex) is shown below:

        Grades
Sex     2     3     4     5     6     7     8     Total
0       1     12    26    57    69    79    36    280
1       2     14    40    47    52    42    23    220
Total   3     26    66    104   121   121   59    500

The Chi-square analysis shows an association between a person's sex (gender) and their grade index, because Pearson's Chi-square value is 13.987 > 12.59 (df = 6, $\alpha$ = 0.05, or $1 - \alpha$ = 0.95). The significance level is 0.030 < $\alpha$ = 0.05, so there is an association between these two variables. Note that fewer than 20% of the expected frequencies are less than 5 (14.3%).

Table 27

Chi-square Analysis: Sex and Grades

                               Value       df   Asymp. Sig. (2-sided)
Pearson Chi-Square             13.987(a)   6    .030
Likelihood Ratio               14.014      6    .029
Linear-by-Linear Association   10.505      1    .001
N of Valid Cases               500

a 2 cells (14.3%) have expected count less than 5. The minimum expected count is 1.32.


Appendix - Statistical Tables

Z-score Probability Distribution Table (cumulative)

z  0  0.01  0.02  0.03  0.04  0.05  0.06  0.07  0.08  0.09
-3.6  0.000159146  0.0001531  0.00014734  0.00014175  0.00013635  0.00013115  0.00012614  0.00012131  0.00011665  0.00011216
-3.5  0.000232673  0.0002241  0.00021582  0.00020782  0.0002001  0.00019266  0.00018547  0.00017853  0.00017184  0.00016538
-3.4  0.000336981  0.0003249  0.00031316  0.00030184  0.00029091  0.00028034  0.00027013  0.00026028  0.00025075  0.00024156
-3.3  0.000483483  0.0004665  0.00045014  0.00043429  0.00041895  0.00040411  0.00038977  0.00037589  0.00036248  0.00034952
-3.2  0.000687202  0.0006637  0.00064102  0.00061901  0.00059771  0.00057709  0.00055712  0.0005378  0.0005191  0.000501
-3.1  0.000967671  0.0009355  0.00090432  0.0008741  0.00084481  0.00081642  0.00078891  0.00076226  0.00073644  0.00071143
-3  0.001349967  0.0013063  0.00126394  0.00122284  0.00118296  0.00114428  0.00110675  0.00107036  0.00103507  0.00100085
-2.9  0.00186588  0.0018072  0.00175022  0.00169488  0.00164113  0.00158894  0.00153826  0.00148907  0.00144131  0.00139496
-2.8  0.002555191  0.0024771  0.00240124  0.00232746  0.00225574  0.00218603  0.00211827  0.00205242  0.00198844  0.00192628
-2.7  0.003467023  0.0033642  0.00326415  0.00316677  0.00307201  0.00297982  0.00289012  0.00280287  0.002718  0.00263546
-2.6  0.004661222  0.0045271  0.00439653  0.00426928  0.00414534  0.00402463  0.00390708  0.00379261  0.00368115  0.00357265
-2.5  0.00620968  0.0060366  0.00586776  0.00570315  0.00554265  0.00538617  0.00523363  0.00508495  0.00494005  0.00479883
-2.4  0.008197529  0.0079763  0.00776025  0.00754941  0.00734363  0.00714281  0.00694686  0.00675566  0.00656913  0.00638717
-2.3  0.010724081  0.0104441  0.01017041  0.00990305  0.00964185  0.00938669  0.00913745  0.00889403  0.00865631  0.00842418
-2.2  0.013903399  0.0135525  0.01320934  0.01287368  0.01254542  0.01222443  0.01191059  0.01160376  0.01130381  0.01101063
-2.1  0.017864357  0.0174291  0.01700296  0.01658575  0.01617733  0.01577755  0.01538628  0.01500337  0.01462868  0.01426207
-2  0.022750062  0.0222155  0.02169162  0.0211782  0.02067509  0.02018215  0.0196992  0.01922611  0.0187627  0.01830884
-1.9  0.028716493  0.0280665  0.02742888  0.02680335  0.02618978  0.02558799  0.02499783  0.02441912  0.02385169  0.0232954
-1.8  0.035930266  0.0351478  0.03437945  0.03362491  0.03288406  0.03215671  0.0314427  0.03074184  0.03005397  0.02937891
-1.7  0.044565432  0.0436329  0.04271618  0.0418151  0.04092947  0.04005911  0.03920386  0.03836352  0.03753793  0.0367269
-1.6  0.054799289  0.0536989  0.05261613  0.05155074  0.05050257  0.04947145  0.04845721  0.04745966  0.04647863  0.04551395
-1.5  0.066807229  0.0655217  0.06425551  0.06300838  0.06178019  0.06057077  0.05937995  0.05820756  0.05705344  0.0559174
-1.4  0.080756711  0.0792699  0.07780389  0.07635856  0.07493374  0.0735293  0.07214508  0.07078091  0.06943666  0.06811215
-1.3  0.096800549  0.095098  0.09341757  0.0917592  0.09012273  0.08850805  0.08691502  0.08534351  0.08379338  0.08226449
-1.2  0.115069732  0.1131395  0.1112325  0.10934862  0.10748776  0.10564984  0.10383475  0.10204238  0.10027263  0.09852539
-1.1  0.135666102  0.1334996  0.13135693  0.12923816  0.1271432  0.12507199  0.12302446  0.12100054  0.11900017  0.11702326
-1  0.15865526  0.1562477  0.15386424  0.15150502  0.14916997  0.14685908  0.14457233  0.14230969  0.14007112  0.13785661
-0.9  0.184060092  0.1814112  0.17878635  0.17618552  0.17360876  0.17105611  0.1685276  0.16602324  0.16354306  0.16108706
-0.8  0.211855334  0.20897  0.20610799  0.20326933  0.20045414  0.19766249  0.19489447  0.19215016  0.18942961  0.18673291
-0.7  0.241963578  0.238852  0.23576242  0.23269502  0.22964992  0.22662728  0.22362722  0.22064988  0.21769537  0.21476382
-0.6  0.274253065  0.2709308  0.26762883  0.26434723  0.26108623  0.25784604  0.25462685  0.25142882  0.24825216  0.24509702
-0.5  0.308537533  0.3050257  0.30153177  0.29805594  0.29459849  0.29115966  0.28773968  0.28433881  0.28095726  0.27759528
-0.4  0.344578303  0.340903  0.33724276  0.33359785  0.32996858  0.32635524  0.32275813  0.31917752  0.3156137  0.31206695
-0.3  0.382088643  0.3782805  0.37448423  0.37070005  0.36692833  0.36316941  0.35942363  0.3556913  0.35197276  0.34826832
-0.2  0.420740313  0.4168339  0.41293561  0.40904593  0.40516518  0.40129373  0.39743194  0.39358019  0.38973881  0.38590818
-0.1  0.460172104  0.4562046  0.45224153  0.44828318  0.44432997  0.44038229  0.43644053  0.43250507  0.42857629  0.42465458
0  0.5  0.5039894  0.50797835  0.51196653  0.5159535  0.51993887  0.52392225  0.52790324  0.53188144  0.53585646
0.1  0.539827896  0.5437954  0.54775847  0.55171682  0.55567003  0.55961771  0.56355947  0.56749493  0.57142371  0.57534542
0.2  0.579259687  0.5831661  0.58706439  0.59095407  0.59483482  0.59870627  0.60256806  0.60641981  0.61026119  0.61409182
0.3  0.617911357  0.6217195  0.62551577  0.62929995  0.63307167  0.63683059  0.64057637  0.6443087  0.64802724  0.65173168
0.4  0.655421697  0.659097  0.66275724  0.66640215  0.67003142  0.67364476  0.67724187  0.68082248  0.6843863  0.68793305
0.5  0.691462467  0.6949743  0.69846823  0.70194406  0.70540151  0.70884034  0.71226032  0.71566119  0.71904274  0.72240472
0.6  0.725746935  0.7290692  0.73237117  0.73565277  0.73891377  0.74215396  0.74537315  0.74857118  0.75174784  0.75490298
0.7  0.758036422  0.761148  0.76423758  0.76730498  0.77035008  0.77337272  0.77637278  0.77935012  0.78230463  0.78523618
0.8  0.788144666  0.79103  0.79389201  0.79673067  0.79954586  0.80233751  0.80510553  0.80784984  0.81057039  0.81326709
0.9  0.815939908  0.8185888  0.82121365  0.82381448  0.82639124  0.82894389  0.8314724  0.83397676  0.83645694  0.83891294
1  0.84134474  0.8437523  0.84613576  0.84849498  0.85083003  0.85314092  0.85542767  0.85769031  0.85992888  0.86214339
1.1  0.864333898  0.8665004  0.86864307  0.87076184  0.8728568  0.87492801  0.87697554  0.87899946  0.88099983  0.88297674
1.2  0.884930268  0.8868605  0.8887675  0.89065138  0.89251224  0.89435016  0.89616525  0.89795762  0.89972737  0.90147461
1.3  0.903199451  0.904902  0.90658243  0.9082408  0.90987727  0.91149195  0.91308498  0.91465649  0.91620662  0.91773551
1.4  0.919243289  0.9207301  0.92219611  0.92364144  0.92506626  0.9264707  0.92785492  0.92921909  0.93056334  0.93188785
1.5  0.933192771  0.9344783  0.93574449  0.93699162  0.93821981  0.93942923  0.94062005  0.94179244  0.94294656  0.9440826
1.6  0.945200711  0.9463011  0.94738387  0.94844926  0.94949743  0.95052855  0.95154279  0.95254034  0.95352137  0.95448605
1.7  0.955434568  0.9563671  0.95728382  0.9581849  0.95907053  0.95994089  0.96079614  0.96163648  0.96246207  0.9632731
1.8  0.964069734  0.9648522  0.96562055  0.96637509  0.96711594  0.96784329  0.9685573  0.96925816  0.96994603  0.97062109
1.9  0.971283507  0.9719335  0.97257112  0.97319665  0.97381022  0.97441201  0.97500217  0.97558088  0.97614831  0.9767046
2  0.977249938  0.9777845  0.97830838  0.9788218  0.97932491  0.97981785  0.9803008  0.98077389  0.9812373  0.98169116
2.1  0.982135643  0.9825709  0.98299704  0.98341425  0.98382267  0.98422245  0.98461372  0.98499663  0.98537132  0.98573793
2.2  0.986096601  0.9864475  0.98679066  0.98712632  0.98745458  0.98777557  0.98808941  0.98839624  0.98869619  0.98898937
2.3  0.989275919  0.9895559  0.98982959  0.99009695  0.99035815  0.99061331  0.99086255  0.99110597  0.99134369  0.99157582
2.4  0.991802471  0.9920237  0.99223975  0.99245059  0.99265637  0.99285719  0.99305314  0.99324434  0.99343087  0.99361283
2.5  0.99379032  0.9939634  0.99413224  0.99429685  0.99445735  0.99461383  0.99476637  0.99491505  0.99505995  0.99520117
2.6  0.995338778  0.9954729  0.99560347  0.99573072  0.99585466  0.99597537  0.99609292  0.99620739  0.99631885  0.99642735
2.7  0.996532977  0.9966358  0.99673585  0.99683323  0.99692799  0.99702018  0.99710988  0.99719713  0.997282  0.99736454
2.8  0.997444809  0.9975229  0.99759876  0.99767254  0.99774426  0.99781397  0.99788173  0.99794758  0.99801156  0.99807372
2.9  0.99813412  0.9981928  0.99824978  0.99830512  0.99835887  0.99841106  0.99846174  0.99851093  0.99855869  0.99860504
3  0.998650033  0.9986937  0.99873606  0.99877716  0.99881704  0.99885572  0.99889325  0.99892964  0.99896493  0.99899915
3.1  0.999032329  0.9990645  0.99909568  0.9991259  0.99915519  0.99918358  0.99921109  0.99923774  0.99926356  0.99928857
3.2  0.999312798  0.9993363  0.99935898  0.99938099  0.99940229  0.99942291  0.99944288  0.9994622  0.9994809  0.999499
3.3  0.999516517  0.9995335  0.99954986  0.99956571  0.99958105  0.99959589  0.99961023  0.99962411  0.99963752  0.99965048
3.4  0.999663019  0.9996751  0.99968684  0.99969816  0.99970909  0.99971966  0.99972987  0.99973972  0.99974925  0.99975844
3.5  0.999767327  0.9997759  0.99978418  0.99979218  0.9997999  0.99980734  0.99981453  0.99982147  0.99982816  0.99983462
3.6  0.999840854  0.9998469  0.99985266  0.99985825  0.99986365  0.99986885  0.99987386  0.99987869  0.99988335  0.99988784
3.7  0.99989217  0.9998963  0.99990036  0.99990423  0.99990796  0.99991156  0.99991502  0.99991835  0.99992156  0.99992465
3.8  0.999927628  0.9999305  0.99993325  0.99993591  0.99993846  0.99994092  0.99994329  0.99994556  0.99994775  0.99994986

Note. Generated using the standard normal cumulative distribution function. Each entry is P(Z ≤ z), where z is the row value plus the column value.
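As a rough cross-check of the table, the entries can be recomputed with Python's standard library. This is only a sketch: the function name `z_table_entry` is hypothetical, and the guide itself does not say which software produced the table. Recomputed values agree with the printed ones to about six decimal places.

```python
from statistics import NormalDist


def z_table_entry(row: float, col: float) -> float:
    """Cumulative probability P(Z <= row + col) for a standard normal Z.

    `row` is the z value to one decimal place (the table's first column)
    and `col` is the second-decimal offset (the table's header row).
    """
    return NormalDist().cdf(row + col)


if __name__ == "__main__":
    # Compare a few recomputed entries with the printed table values.
    for row, col, printed in [(1.2, 0.00, 0.884930268),
                              (2.0, 0.05, 0.97981785),
                              (3.0, 0.00, 0.998650033)]:
        value = z_table_entry(row, col)
        print(f"z = {row + col:.2f}: computed {value:.6f}, table {printed:.6f}")
```

To find the area above a given z, subtract the entry from 1; for example, the upper-tail area beyond z = 2.00 is roughly 1 − 0.97725 = 0.02275.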


Values of t at the 0.05 and 0.01 level: Two-tailed

df α=0.05 α=0.01
1 12.706 63.657
2 4.303 9.925
3 3.183 5.841
4 2.777 4.604
5 2.571 4.032
6 2.447 3.707
7 2.365 3.500
8 2.306 3.355
9 2.262 3.250
10 2.228 3.169
11 2.201 3.106
12 2.179 3.055
13 2.160 3.012
14 2.145 2.977
15 2.132 2.947
16 2.120 2.921
17 2.110 2.898
18 2.101 2.879
19 2.093 2.861
20 2.086 2.845
21 2.080 2.831
22 2.074 2.819
23 2.069 2.807
24 2.064 2.797
25 2.060 2.787
26 2.056 2.779
27 2.052 2.771
28 2.048 2.763
29 2.045 2.756
30 2.042 2.750
40 2.021 2.705
50 2.009 2.678
60 2.000 2.660
70 1.994 2.648
80 1.990 2.639
90 1.987 2.632
100 1.984 2.626
∞ 1.960 2.576

Adapted from Sockloff, A., & Edney, J. (1972). Some extension of Student’s t and Pearson’s r central distributions, Technical Report (May 1972). Measurement and Research, Temple University, Philadelphia.


Values of t at various significance levels: One-tailed

Significance Levels, p

df 0.4 0.25 0.1 0.05 0.025 0.01 0.005 0.0025 0.001 0.0005
1 0.325 1.000 3.078 6.314 12.706 31.821 63.656 127.321 318.289 636.578
2 0.289 0.816 1.886 2.920 4.303 6.965 9.925 14.089 22.328 31.600
3 0.277 0.765 1.638 2.353 3.182 4.541 5.841 7.453 10.214 12.924
4 0.271 0.741 1.533 2.132 2.776 3.747 4.604 5.598 7.173 8.610
5 0.267 0.727 1.476 2.015 2.571 3.365 4.032 4.773 5.894 6.869
6 0.265 0.718 1.440 1.943 2.447 3.143 3.707 4.317 5.208 5.959
7 0.263 0.711 1.415 1.895 2.365 2.998 3.499 4.029 4.785 5.408
8 0.262 0.706 1.397 1.860 2.306 2.896 3.355 3.833 4.501 5.041
9 0.261 0.703 1.383 1.833 2.262 2.821 3.250 3.690 4.297 4.781
10 0.260 0.700 1.372 1.812 2.228 2.764 3.169 3.581 4.144 4.587
11 0.260 0.697 1.363 1.796 2.201 2.718 3.106 3.497 4.025 4.437
12 0.259 0.695 1.356 1.782 2.179 2.681 3.055 3.428 3.930 4.318
13 0.259 0.694 1.350 1.771 2.160 2.650 3.012 3.372 3.852 4.221
14 0.258 0.692 1.345 1.761 2.145 2.624 2.977 3.326 3.787 4.140
15 0.258 0.691 1.341 1.753 2.131 2.602 2.947 3.286 3.733 4.073
16 0.258 0.690 1.337 1.746 2.120 2.583 2.921 3.252 3.686 4.015
17 0.257 0.689 1.333 1.740 2.110 2.567 2.898 3.222 3.646 3.965
18 0.257 0.688 1.330 1.734 2.101 2.552 2.878 3.197 3.610 3.922
19 0.257 0.688 1.328 1.729 2.093 2.539 2.861 3.174 3.579 3.883
20 0.257 0.687 1.325 1.725 2.086 2.528 2.845 3.153 3.552 3.850
21 0.257 0.686 1.323 1.721 2.080 2.518 2.831 3.135 3.527 3.819
22 0.256 0.686 1.321 1.717 2.074 2.508 2.819 3.119 3.505 3.792
23 0.256 0.685 1.319 1.714 2.069 2.500 2.807 3.104 3.485 3.768
24 0.256 0.685 1.318 1.711 2.064 2.492 2.797 3.091 3.467 3.745
25 0.256 0.684 1.316 1.708 2.060 2.485 2.787 3.078 3.450 3.725
26 0.256 0.684 1.315 1.706 2.056 2.479 2.779 3.067 3.435 3.707
27 0.256 0.684 1.314 1.703 2.052 2.473 2.771 3.057 3.421 3.689
28 0.256 0.683 1.313 1.701 2.048 2.467 2.763 3.047 3.408 3.674
29 0.256 0.683 1.311 1.699 2.045 2.462 2.756 3.038 3.396 3.660
30 0.256 0.683 1.310 1.697 2.042 2.457 2.750 3.030 3.385 3.646
40 0.255 0.681 1.303 1.684 2.021 2.423 2.704 2.971 3.307 3.551
50 0.255 0.679 1.299 1.676 2.009 2.403 2.678 2.937 3.261 3.496
60 0.254 0.679 1.296 1.671 2.000 2.390 2.660 2.915 3.232 3.460
70 0.254 0.678 1.294 1.667 1.994 2.381 2.648 2.899 3.211 3.435
80 0.254 0.678 1.292 1.664 1.990 2.374 2.639 2.887 3.195 3.416
90 0.254 0.677 1.291 1.662 1.987 2.368 2.632 2.878 3.183 3.402
100 0.254 0.677 1.290 1.660 1.984 2.364 2.626 2.871 3.174 3.390
110 0.254 0.677 1.289 1.659 1.982 2.361 2.621 2.865 3.166 3.381
120 0.254 0.677 1.289 1.658 1.980 2.358 2.617 2.860 3.160 3.373
∞ 0.253 0.674 1.282 1.645 1.960 2.326 2.576 2.807 3.090 3.290

Note. Generated using the t distribution (one-tailed) at given alpha levels
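The last row of the table is the large-sample limit, where the t distribution coincides with the standard normal, so its entries can be checked without a t-distribution library. The sketch below uses only Python's standard library; the names `ONE_TAILED_LEVELS`, `TABLE_LAST_ROW`, and `large_sample_critical_values` are illustrative and not part of the guide. The recomputed values agree with the printed row to within 0.001.

```python
from statistics import NormalDist

# One-tailed significance levels from the table's header row.
ONE_TAILED_LEVELS = [0.4, 0.25, 0.1, 0.05, 0.025, 0.01, 0.005, 0.0025, 0.001, 0.0005]

# Printed values of the table's final (large-sample) row.
TABLE_LAST_ROW = [0.253, 0.674, 1.282, 1.645, 1.960, 2.326, 2.576, 2.807, 3.090, 3.290]


def large_sample_critical_values(levels=ONE_TAILED_LEVELS):
    """Upper-tail critical values of the standard normal for each level p."""
    std_normal = NormalDist()
    return [std_normal.inv_cdf(1 - p) for p in levels]


if __name__ == "__main__":
    for p, computed, printed in zip(ONE_TAILED_LEVELS,
                                    large_sample_critical_values(),
                                    TABLE_LAST_ROW):
        print(f"p = {p}: computed {computed:.4f}, table {printed:.3f}")
```

The same table serves two-tailed tests: the one-tailed column for p = 0.025 holds the two-tailed critical values for α = 0.05, and the column for p = 0.005 holds those for α = 0.01.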


Chi-square Table

df 0.005 0.01 0.025 0.05 0.10 0.90 0.95 0.975 0.99 0.995
1 0.000039 0.00016 0.00098 0.0039 0.0158 2.71 3.84 5.02 6.63 7.88
2 0.01 0.0201 0.0506 0.1026 0.2107 4.61 5.99 7.38 9.21 10.6
3 0.0717 0.115 0.216 0.352 0.584 6.25 7.81 9.35 11.34 12.84
4 0.207 0.297 0.484 0.711 1.064 7.78 9.49 11.14 13.28 14.86
5 0.412 0.554 0.831 1.15 1.61 9.24 11.07 12.83 15.09 16.75
6 0.676 0.872 1.24 1.64 2.2 10.64 12.59 14.45 16.81 18.55
7 0.989 1.24 1.69 2.17 2.83 12.02 14.07 16.01 18.48 20.28
8 1.34 1.65 2.18 2.73 3.49 13.36 15.51 17.53 20.09 21.96
9 1.73 2.09 2.7 3.33 4.17 14.68 16.92 19.02 21.67 23.59
10 2.16 2.56 3.25 3.94 4.87 15.99 18.31 20.48 23.21 25.19
11 2.6 3.05 3.82 4.57 5.58 17.28 19.68 21.92 24.73 26.76
12 3.07 3.57 4.4 5.23 6.3 18.55 21.03 23.34 26.22 28.3
13 3.57 4.11 5.01 5.89 7.04 19.81 22.36 24.74 27.69 29.82
14 4.07 4.66 5.63 6.57 7.79 21.06 23.68 26.12 29.14 31.32
15 4.6 5.23 6.26 7.26 8.55 22.31 25 27.49 30.58 32.8
16 5.14 5.81 6.91 7.96 9.31 23.54 26.3 28.85 32 34.27
18 6.26 7.01 8.23 9.39 10.86 25.99 28.87 31.53 34.81 37.16
20 7.43 8.26 9.59 10.85 12.44 28.41 31.41 34.17 37.57 40
24 9.89 10.86 12.4 13.85 15.66 33.2 36.42 39.36 42.98 45.56
30 13.79 14.95 16.79 18.49 20.6 40.26 43.77 46.98 50.89 53.67
40 20.71 22.16 24.43 26.51 29.05 51.81 55.76 59.34 63.69 66.77
60 35.53 37.48 40.48 43.19 46.46 74.4 79.08 83.3 88.38 91.95
120 83.85 86.92 91.58 95.7 100.62 140.23 146.57 152.21 158.95 163.64

Note. Generated using the chi-square distribution at various 1 − α levels.
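Two rows of the chi-square table can be spot-checked with closed-form expressions, which is a quick way to confirm how the columns should be read (as cumulative probabilities, 1 − α). For df = 1 the quantile is the square of a standard normal quantile, and for df = 2 it is −2 ln(1 − p). The function names below are hypothetical, and this sketch covers only those two rows; other rows need a statistics package.

```python
import math
from statistics import NormalDist


def chi2_quantile_df1(p: float) -> float:
    """Chi-square quantile for df = 1 at cumulative probability p.

    If Z is standard normal, Z**2 is chi-square with 1 df, so the
    quantile is the square of the normal quantile at (1 + p) / 2.
    """
    return NormalDist().inv_cdf((1 + p) / 2) ** 2


def chi2_quantile_df2(p: float) -> float:
    """Chi-square quantile for df = 2 at cumulative probability p.

    With 2 df the distribution is exponential with mean 2, so the
    quantile has the closed form -2 * ln(1 - p).
    """
    return -2.0 * math.log(1.0 - p)


if __name__ == "__main__":
    for p in (0.95, 0.99):
        print(f"p = {p}: df=1 -> {chi2_quantile_df1(p):.2f}, "
              f"df=2 -> {chi2_quantile_df2(p):.2f}")
```

A test statistic larger than the tabled value in the 0.95 column is significant at α = 0.05; for instance, with df = 1 the cutoff is about 3.84.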


Critical Values for Correlation Coefficient, r

Level of Significance (p) for a Two-Tailed Test

df (n-2) 0.1 0.05 0.02 0.01
1 0.988 0.997 0.9995 0.9999
2 0.9 0.95 0.98 0.99
3 0.805 0.878 0.934 0.959
4 0.729 0.811 0.882 0.917
5 0.669 0.754 0.833 0.874
6 0.622 0.707 0.789 0.834
7 0.582 0.666 0.75 0.798
8 0.549 0.632 0.716 0.765
9 0.521 0.602 0.685 0.735
10 0.497 0.576 0.658 0.708
11 0.476 0.553 0.634 0.684
12 0.458 0.532 0.612 0.661
13 0.441 0.514 0.592 0.641
14 0.426 0.497 0.574 0.623
15 0.412 0.482 0.558 0.606
16 0.4 0.468 0.542 0.59
17 0.389 0.456 0.528 0.575
18 0.378 0.444 0.516 0.561
19 0.369 0.433 0.503 0.549
20 0.36 0.423 0.492 0.537
21 0.352 0.413 0.482 0.526
22 0.344 0.404 0.472 0.515
23 0.337 0.396 0.462 0.505
24 0.33 0.388 0.453 0.496
25 0.323 0.381 0.445 0.487
26 0.317 0.374 0.437 0.479
27 0.311 0.367 0.43 0.471
28 0.306 0.361 0.423 0.463
29 0.301 0.355 0.416 0.456
30 0.296 0.349 0.409 0.449
35 0.275 0.325 0.381 0.418
40 0.257 0.304 0.358 0.393
45 0.243 0.288 0.338 0.372
50 0.231 0.273 0.322 0.354
60 0.211 0.25 0.295 0.325
70 0.195 0.232 0.274 0.303
80 0.183 0.217 0.256 0.283
90 0.173 0.205 0.242 0.267
100 0.164 0.195 0.23 0.254
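Critical values of r are tied to critical values of t by a standard identity: since t = r·sqrt(df) / sqrt(1 − r²) with df = n − 2, solving for r gives r = t / sqrt(t² + df). The sketch below applies that identity to two-tailed t values taken from the t table in this guide. The function name `critical_r` is hypothetical, and the guide does not state that its r table was produced this way.

```python
import math


def critical_r(t_critical: float, df: int) -> float:
    """Critical correlation coefficient implied by a critical t value.

    Uses r = t / sqrt(t**2 + df), where df = n - 2 for a Pearson
    correlation computed from n pairs of scores.
    """
    return t_critical / math.sqrt(t_critical ** 2 + df)


if __name__ == "__main__":
    # (df, two-tailed alpha, t from the two-tailed t table, r from the r table)
    checks = [(10, 0.05, 2.228, 0.576),
              (20, 0.01, 2.845, 0.537),
              (30, 0.05, 2.042, 0.349)]
    for df, alpha, t_value, printed_r in checks:
        r = critical_r(t_value, df)
        print(f"df = {df}, alpha = {alpha}: computed r = {r:.3f}, table r = {printed_r}")
```

An observed correlation whose absolute value exceeds the tabled r for its df is significant at the corresponding two-tailed level.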
