8/2/2019 Correla Test
1/19
SPSS for Psychologists Contents vii
Contents
Dedications v
Preface xi
Acknowledgements xv
Chapter One Introduction 1
1 Psychological research and SPSS 2
2 Some basic statistical concepts 4
3 Working with SPSS 18
4 Starting SPSS 20
5 How to exit from SPSS 24
6 Some useful option settings in SPSS 25
Chapter Two Data entry in SPSS 27
1 The Data Editor window 28
2 Defining a variable in SPSS 30
3 Entering data 42
4 Saving a data file 45
5 Opening a data file 48
6 Data entry exercises 50
7 Answers to data entry exercises 54
8 Summary descriptive statistics and the Viewer window 56
Chapter Three Tests of difference for two sample designs 69
1 An introduction to the t-test 70
2 The independent t-test 71
3 The paired t-test 79
4 An introduction to the nonparametric equivalents of the t-test 85
5 The Mann–Whitney test 86
6 The Wilcoxon test 89
Chapter Four Tests of correlation 93
1 An introduction to tests of correlation 94
2 Descriptive statistics in correlation 95
3 Pearson's r: parametric test of correlation 102
4 Spearman's rs: nonparametric test of correlation 106
Chapter Five Tests for nominal data 109
1 Nominal data and dichotomous variables 110
2 Chi-square tests versus the chi-square distribution 112
3 The goodness-of-fit chi-square 113
4 The multi-dimensional chi-square 114
5 The McNemar test for repeated measures 127
Chapter Six Data handling 131
1 An introduction to data handling 132
2 Sorting a file 133
3 Splitting a file 135
4 Selecting cases 137
5 Recoding values 141
6 Computing new variables 146
7 Counting values 149
8 Ranking cases 152
9 Other useful functions 155
10 Data file for scales or questionnaires 157
Chapter Seven Analysis of variance 161
1 An introduction to analysis of variance (ANOVA) 162
2 One-way between-subjects ANOVA 175
3 Two-way between-subjects ANOVA 182
4 One-way within-subjects ANOVA 188
5 Two-way within-subjects ANOVA 194
6 Mixed ANOVA 204
7 Some additional points 210
8 Planned and unplanned comparisons 213
9 Nonparametric equivalents to ANOVA: Kruskal–Wallis and Friedman 221
Chapter Eight Multiple regression 227
1 An introduction to multiple regression 228
2 Performing a multiple regression on SPSS 235
Chapter Nine Analysis of covariance and multivariate analysis of variance 245
1 An introduction to analysis of covariance 246
2 Performing analysis of covariance on SPSS 250
3 An introduction to multivariate analysis of variance 263
4 Performing multivariate analysis of variance on SPSS 267
Chapter Ten Discriminant analysis and logistic regression 273
1 Discriminant analysis and logistic regression 274
2 An introduction to discriminant analysis 276
3 Performing discriminant analysis on SPSS 280
4 An introduction to logistic regression 293
5 Performing logistic regression on SPSS 294
Chapter Eleven Factor analysis, and reliability and dimensionality of scales 301
1 An introduction to factor analysis 302
2 Performing a basic factor analysis on SPSS 313
3 Other aspects of factor analysis 326
4 Reliability analysis for scales and questionnaires 331
5 Dimensionality of scales and questionnaires 337
Chapter Twelve Beyond the basics 341
1 The syntax window 342
2 Option settings in SPSS 350
3 Getting help in SPSS 352
4 Printing from SPSS 355
5 Incorporating SPSS output into other documents 358
6 Graphing tips 359
7 Interactive charts 365
Glossary 367
References 387
Appendix I: Data files 391
Appendix II: Defining a variable in SPSS versions 8 and 9 423
Appendix III: Adding regression lines to scattergrams before Version 12 433
1 Simple scattergram (Chapter 4, Section 2) 434
2 Scattergram with multiple groups (Chapter 9, Section 2) 438
Index 441
Chapter Four
Tests of correlation
An introduction to tests of correlation
Descriptive statistics in correlation
Pearson's r: parametric test of correlation
Spearman's rs: nonparametric test of correlation
Section 1: An introduction to tests of correlation
Researchers often wish to measure the degree of relationship between two variables.
For example, there is likely to be a relationship between age and reading ability in
children. Such an investigation is not a true experiment, for the same reason that a
natural independent groups design (e.g., when age group or sex is selected as the
grouping variable) is not a true experiment. In both, the experimenter does not
manipulate the independent variable, and no statement about causation can be
made. In a natural independent groups design, the experimenter chooses the levels
of the independent variable from natural characteristics, and then looks for
differences between the groups. In a correlation there is no independent variable:
you simply measure two variables. So, if someone wished to investigate the effect of smoking on respiratory function, then, in a natural independent groups design, they could choose to measure and then compare respiratory function in smokers with that in non-smokers. A more common design, however, would be for researchers to measure both how many cigarettes people smoke and their respiratory function, and then test for a correlation.
An important point to remember is that correlation does not imply causation. In any
correlation, there could be a third variable which explains the association between the
two variables that you measured. For example, there may be a correlation between the
number of ice creams sold and the number of people who drown. Here temperature is
the third variable, which could explain the relationship between the measured
variables. Even when there seems to be a clear cause and effect relationship, a
correlation alone is not sufficient evidence for a causal relationship. Only if one
variable has been manipulated can one draw such conclusions.
Francis Galton carried out early work on correlation, and one of his colleagues, Pearson, developed a method of calculating correlation coefficients for parametric data: Pearson's Product Moment Correlation Coefficient (Pearson's r). When one or both of the scales is neither interval nor ratio, or if the data do not meet the other two assumptions for using parametric statistical tests, then a nonparametric test of correlation such as Spearman's rs should be used. The subscript s distinguishes it from Pearson's r. This test was originally called Spearman's ρ (the Greek letter rho).
Note that for a correlation to be acceptable one should normally test at least 100
participants; otherwise a small number of participants with extreme scores could
skew the data and either prevent a correlation from being revealed when it does
exist or cause an apparent correlation that does not really exist. The scattergram is a
useful tool for checking such eventualities.
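The two coefficients introduced above can be sketched outside SPSS. Below is a minimal illustration in Python with SciPy (not part of this book's SPSS workflow), using invented age and reading-ability scores: Pearson's r is computed on the raw scores, while Spearman's rs is computed from their ranks.

```python
import numpy as np
from scipy import stats

# Hypothetical data: age (years) and reading-ability score for ten children.
age = np.array([5, 6, 6, 7, 7, 8, 8, 9, 10, 11])
reading = np.array([20, 24, 27, 30, 29, 35, 38, 41, 45, 50])

# Parametric: Pearson's r (interval/ratio scales, parametric assumptions met).
r, p_pearson = stats.pearsonr(age, reading)

# Nonparametric: Spearman's rs (computed on the ranks of the scores).
rs, p_spearman = stats.spearmanr(age, reading)

print(f"Pearson's r   = {r:.3f} (p = {p_pearson:.4f})")
print(f"Spearman's rs = {rs:.3f} (p = {p_spearman:.4f})")
```

With these invented scores both coefficients come out strongly positive, but with real data the two can diverge whenever the relationship is monotonic without being linear.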
Section 2: Descriptive statistics in correlation
One of the easiest ways to tell if two items are related and to spot trends is to plot
scattergrams or scatterplots. Figure 4.1 shows a hypothetical example. Each point on
the scattergram represents the age and the reading ability of one child. The line running
through the data points is called a regression line. It represents the best fit of a
straight line to the data points. The line in Figure 4.1 slopes upwards from left to right:
as one variable increases in value, the other variable also increases in value and this is
called a positive correlation. The closer the points are to being on the line itself, the
stronger the correlation. If all the points fall along the straight line, then it is said to be a
perfect correlation. The scattergram will also show you any outliers.
Figure 4.1. Scattergram illustrating a positive correlation: hypothetical data for the
relationship between age and reading ability in children.
In the scattergram shown in Figure 4.2, the dots are scattered randomly, all over the
graph. It is not possible to draw any meaningful best fit line at all, and the
correlation would be close to zero: that is, there is no relationship between the two
variables.
Figure 4.2. Scattergram showing two variables with zero relationship.
It is often the case that as one variable increases in value, the other variable decreases in
value: this is called a negative correlation. In the following example of how to produce
a scattergram with SPSS, we are going to use data that give a negative correlation.
EXAMPLE STUDY: RELATIONSHIP BETWEEN AGE AND CFF
A paper by Mason, Snelgar, Foster, Heron and Jones (1982) described an
investigation of (among other things) whether the negative correlation between age
and CFF (explained below) is different for people with Multiple Sclerosis than for
control participants. For this example, we have created a data file that will
reproduce some of the findings for the control participants. CFF can be described
briefly and somewhat simplistically as follows. If a light is flickering on and off at a
low frequency, then most people can detect the flicker. If the frequency of flicker is
increased then eventually it looks like a steady light. The frequency at which
someone can no longer perceive flicker is called his or her critical flicker frequency (CFF). (These data are available in Appendix I or from the web address listed there.)
How to obtain a scattergram
Click on Graphs on the menu bar, and then from the menu select Scatter. In the
Scatter/Dot dialogue box, shown below, click on the Simple Scatter display, then
click on the Define button. (Note: in Version 12 and earlier, the dialogue box is called
Scatterplot.)
The other options in the Scatter/Dot (Scatterplot) dialogue box produce other types of
graph, which you can explore in the future. We will only be describing the Simple
Scatter command. After you have clicked on the Define button, the Simple
Scatterplot dialogue box will appear.
In the Simple Scatterplot dialogue box, shown below, move the variable names,
one into the box labelled X Axis, and one into the Y Axis box. You can use the
Titles button and the Options button if you wish.
When you have finished, click on OK. The Output Window will open, containing
the scattergram: a part of that window is shown on the next page.
TIP The Panel by facility, introduced in Version 13, allows you to plot a scattergram
for two different groups at the same time. For example, if we had recorded the gender of
the participants then we could plot a scattergram for men and for women separately by
moving the grouping variable name into Columns. A second grouping variable (e.g., patient or control participant) could be used in Rows to produce four separate scattergrams in all.
How to add a regression line to the scattergram
To add the regression line, you have to edit the graph: start by double-clicking in
the scattergram, and the SPSS Chart Editor window, shown at the bottom of this
page, will appear.
From this point in the procedure, changes were made between SPSS Versions 11
and 12, and another change between Versions 12 and 13. Here we show you how to
produce the regression line for Version 12 and also for Version 13. The procedure
for Version 10 or 11 is shown in Appendix III.
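The "best fit" line that Chart Editor superimposes is an ordinary least-squares regression line, and its computation can be sketched outside SPSS. A minimal Python/NumPy illustration (the age/CFF values are invented to mimic the example study, not the book's data file):

```python
import numpy as np

# Hypothetical age/CFF pairs echoing the negative correlation in the example study.
age = np.array([25, 30, 34, 38, 41, 45, 50, 54, 58, 62, 66], dtype=float)
cff = np.array([44, 43, 42, 42, 41, 40, 39, 38, 38, 36, 35], dtype=float)

# Degree-1 least-squares fit: the slope and intercept of the regression line.
slope, intercept = np.polyfit(age, cff, deg=1)
print(f"CFF ≈ {intercept:.1f} + ({slope:.2f}) × age")

# The fitted values at the two ends of the age range define the line to draw
# (a plotting library such as matplotlib would then overlay it on the points).
line_y = intercept + slope * np.array([age.min(), age.max()])
```

Because CFF falls as age rises in these invented data, the slope is negative, so the line runs downward from left to right, the visual signature of a negative correlation.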
You can copy the scattergram and paste it into a Word document for a report, adding a
suitable figure legend. For example, see Figure 4.3 on the next page.
TIP Figure legends should be suitable for the work into which you are incorporating
the figure. The legend to Figure 4.3 might be suitable for a report about the study into
age and CFF. The legends to Figures 4.1 and 4.2, however, are intended to help you
follow the explanation in this book, and would not be suitable for a report.
In addition to adding the regression line, you can edit other elements of the chart to improve its appearance. For example, SPSS charts are usually rather large. If you leave them large, then the report will be spread over more pages than necessary, which can hinder the ease with which the reader follows your argument. You can shrink charts easily in Word, but it is best to change the size in Chart Editor, as then the font and symbol size will automatically be adjusted for legibility. Editing is also useful when a number of cases all fall at the same point. The data that we use to illustrate the use of Spearman's rs (Section 4) demonstrate that situation. To illustrate the data clearly, you can edit the data symbols in Chart Editor so that they vary in size according to the number of cases at each point. Guidelines on the appearance of figures are given in APA (2001).
Figure 4.3. Critical flicker frequency (in Hz) plotted against participants' age (in years).
A scattergram is a descriptive statistic that illustrates the data, and can be used to check the data. For example, there may be some extreme outliers that strongly influence the regression line, or there may be a non-linear relationship. If there does appear to be a linear relationship (Pearson's r makes the assumption that any relationship will be linear), we can find out whether or not it is significant with an inferential statistical test of correlation. A test of correlation will give both the significance value and the strength of the correlation. The strength of correlation is indicated by the value of the correlation coefficient, which varies between −1 and +1. A perfect negative correlation would have a coefficient of −1, and a perfect positive correlation would have a coefficient of +1. In psychology, perfect correlations (in which all the points fall exactly on the regression line) are extremely rare and rather suspect.
Note the R Sq Linear value that appears in the scattergram (Versions 12 and 13). This is not the correlation coefficient itself; it is the square of Pearson's r (which we demonstrate in Section 3). r² is itself a useful statistic that we will return to in Section 3. You can remove the R Sq legend if you wish: in the Chart Editor window, double-click on the legend so that it is selected, then press the delete key.
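The relationship between the coefficient and the R Sq Linear value can be sketched in a few lines of Python/NumPy (invented data, outside the SPSS workflow):

```python
import numpy as np

# Hypothetical age/CFF pairs showing a negative relationship.
age = np.array([25, 30, 38, 45, 52, 58, 66], dtype=float)
cff = np.array([44, 43, 42, 40, 39, 37, 35], dtype=float)

# Pearson's r is the off-diagonal entry of the 2x2 correlation matrix.
r = np.corrcoef(age, cff)[0, 1]

# The "R Sq Linear" value SPSS prints on the scattergram is simply r squared.
r_squared = r ** 2
print(f"r = {r:.3f}, R Sq Linear = {r_squared:.3f}")
```

Squaring discards the sign, which is why the scattergram legend cannot tell you whether the correlation is positive or negative: for that you need the regression line's direction or the coefficient itself.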
Section 3: Pearson's r: parametric test of correlation
To illustrate how to carry out this parametric test of correlation, we will use the
same data as we used to obtain the scattergram and regression line.
The hypothesis tested was that there would be a negative correlation between CFF
and age.
The study employed a correlational design. Two variables were measured. The first
was age, operationalised by asking participants who ranged in age from 25 to 66 to
participate. The second variable was CFF, operationalised by using a flicker
generator to measure CFF for each participant: six measures were made, and the
mean taken to give a single CFF score for each participant.
HOW TO PERFORM A PEARSON'S R
TIP SPSS will correlate each variable that you include with every other variable that you include. Thus, if you included three variables A, B and C, it will calculate the correlation coefficient for A * B, A * C and B * C. In the Pearson's r example we have just two variables, but in the Spearman's rs example we include three variables so that you can see what a larger correlation matrix looks like.
TIP In the Bivariate Correlations dialogue box, you have the option of choosing either a one- or two-tailed test, and SPSS will then print the appropriate value of p. In the statistical tests that we have covered previously, SPSS prints the two-tailed p value, and if you have a one-tailed hypothesis you halve that value to give the one-tailed p value.
The annotated output for Pearson's r is shown on the next page.
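The halving rule for one-tailed hypotheses can be checked outside SPSS. A hedged Python/SciPy sketch with invented data (SciPy, like the SPSS tests covered previously, reports a two-tailed p by default):

```python
import numpy as np
from scipy import stats

# Invented data with a built-in negative trend plus noise.
rng = np.random.default_rng(1)
x = np.arange(20, dtype=float)
y = -x + rng.normal(scale=4.0, size=20)

r, p_two = stats.pearsonr(x, y)  # two-tailed p by default

# One-tailed hypothesis: we predicted a negative correlation in advance.
# If the observed direction matches the prediction, halve the two-tailed p;
# if it does not, the one-tailed p is 1 minus half the two-tailed p.
p_one = p_two / 2 if r < 0 else 1 - p_two / 2

print(f"r = {r:.3f}, two-tailed p = {p_two:.5f}, one-tailed p = {p_one:.5f}")
```

Note that the halving shortcut is only legitimate when the direction was predicted before looking at the data; a post hoc "prediction" of the observed direction is not a one-tailed test.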
SPSS OUTPUT FOR PEARSON'S R
Obtained Using Menu Item: Correlate > Bivariate
What you might write in a report is given below, after we tell you about effect sizes
in correlation.
TIP For correlations, the sign of the coefficient indicates whether the correlation is
positive or negative, so you must report it (unlike the sign in a t-test analysis).
EFFECT SIZES IN CORRELATION
The value of r indicates the strength of the correlation, and it is a measure of effect size (see Chapter 1, Section 2). As a rule of thumb, r values of 0 to .2 are generally considered weak, .3 to .6 moderate, and .7 to 1 strong. The strength of the correlation alone is not necessarily an indication of whether it is an important correlation: the significance value should normally also be considered. With small sample sizes this is crucial, as strong correlations may easily occur by chance. With large to very large sample sizes, however, even a small correlation can be highly statistically significant. To illustrate that, look at a table of the critical values of r (in the back of most statistics textbooks). For example, if you carry out a correlation study with a sample of 100 and obtain an r of .2, it is significant at the .05 level, two-tailed. Yet .2 is only a weak correlation. In some survey studies sample sizes may be in the thousands, so significance alone cannot be used as a guide. Instead, the effect size and the proportion of variation explained may be more important.
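The relationship between r, sample size and significance that such tables encode can be computed directly. A Python/SciPy sketch (the t-transformation of r is standard; the helper function name is ours):

```python
import math
from scipy import stats

def r_to_p(r, n, tails=2):
    """p value for a Pearson correlation of r from n pairs of scores,
    via the t-transformation t = r * sqrt(n - 2) / sqrt(1 - r**2)."""
    t = r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)
    p_one = stats.t.sf(abs(t), df=n - 2)
    return 2 * p_one if tails == 2 else p_one

# A weak correlation (r = .2) is significant at the .05 level with n = 100...
print(r_to_p(0.2, 100))  # below .05

# ...but nowhere near significance with a sample of 20.
print(r_to_p(0.2, 20))
```

This makes the book's point concrete: the same weak r of .2 crosses the .05 threshold purely because the sample grew, which is why significance alone cannot be used as a guide in large-sample studies.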
The concept of proportion of variance explained is described in Chapter 7, Section 1. Briefly, a correlation coefficient allows us to estimate the proportion of variation within our data that is explained by the relationship between the two variables. (The remaining variation is down to extraneous variables, both situational and participant.) The proportion of variation explained is given by r². Thus, for the age and CFF example in which r = −.78, r² = .6084 and we can say that 60% of the variation in the CFF data can be attributed to age. Note that, logically, we can just as easily say that 60% of the variation in the age data can be attributed to CFF. The latter statement should make it clear that we are not implying a causal relationship: we cannot do so with correlation. The important practical point is that the two variables have quite a lot of variation in common, and one could use a person's age to predict what their CFF might be. If their measured CFF is outside the lower confidence limit for their age, then we could investigate further.
Note that the proportion of variation explained does not have to be large to be
important. How important it is may depend on the purpose of the study (see Howell, 2002, pp. 304–305). Proportion of variance explained in correlational designs will be returned to in Chapter 8 on multiple regression.
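The idea of using one variable to predict the other can be sketched in Python with SciPy (the age/CFF values are invented to mimic the example study, not taken from the book's data file):

```python
import numpy as np
from scipy import stats

# Hypothetical age/CFF pairs with a strong negative relationship.
age = np.array([25, 29, 33, 38, 42, 47, 51, 55, 60, 66], dtype=float)
cff = np.array([44, 43, 43, 41, 40, 39, 38, 37, 36, 34], dtype=float)

# linregress returns the regression slope and intercept along with r.
fit = stats.linregress(age, cff)
print(f"r = {fit.rvalue:.3f}, variance explained = {fit.rvalue ** 2:.1%}")

# Predict CFF for a hypothetical 50-year-old from the regression equation.
predicted = fit.intercept + fit.slope * 50
print(f"predicted CFF at age 50: {predicted:.1f} Hz")
```

A measured CFF falling well below such a prediction for the person's age is the kind of discrepancy the text suggests would prompt further investigation; a proper screening rule would of course use confidence limits rather than the point prediction alone.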
Reporting the results
In a report you might write: There was a significant negative correlation between age and CFF (r = −.780, N = 20, p < .0005, one-tailed). It is a fairly strong correlation: 60.8% of the variation is explained. The scattergram (Figure 4.3) shows that the data points are reasonably well distributed along the regression line, in a linear relationship with no outliers.
Section 4: Spearman's rs: nonparametric test of correlation
If either (or both) of the two variables involved in a correlational design are nonparametric (because they do not meet the assumptions for parametric data, see Chapter 1, Section 2), then we use a nonparametric measure of correlation. Here, we describe two such tests, Spearman's rs and Kendall's tau-b.
EXAMPLE STUDY: THE RELATIONSHIPS BETWEEN ATTRACTIVENESS,
BELIEVABILITY AND CONFIDENCE
Previous research using mock juries has shown that attractive defendants are less
likely to be found guilty than unattractive defendants, and that attractive individuals
are frequently rated more highly on other desirable traits, such as intelligence. In a study undertaken by one of our students, participants saw the testimony of a woman
in a real case of alleged rape. They were asked to rate her, on a scale of 1 to 7, in
terms of how much confidence they placed in her testimony, how believable she
was and how attractive she was. (These data are available in Appendix I or from the
web address listed there.)
The design employed was correlational, with three variables each measured on a 7-point scale. Although it is often accepted that such data could be considered interval in nature (see Chapter 1, Section 2), for the purpose of this section we will consider them as ordinal data. The hypotheses tested were that:
1. There would be a positive relationship between attractiveness and confidence
placed in testimony.
2. There would be a positive relationship between attractiveness and believability.
3. There would be a positive relationship between confidence placed in testimony and
believability.
TIP We are using this study to illustrate the use of Spearman's rs and some other aspects of correlation. However, multiple regression (Chapter 8) would usually be more appropriate for three or more variables in a correlational design.
HOW TO PERFORM SPEARMAN'S RS
Carry out steps 1 to 5 as for Pearson's r (previous section). At step 6, select Spearman instead of Pearson (see Bivariate Correlations dialogue box below).
This example also illustrates the fact that you can carry out more than one
correlation at once. There are three variables, and we want to investigate the relationship of each variable with each of the other two. To do this you
simply highlight all three variable names and move them all into the Variables box.
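The same all-against-all correlation matrix can be sketched outside SPSS with SciPy. The ten sets of 7-point ratings below are invented for illustration (the real study had N = 89):

```python
import numpy as np
from scipy import stats

# Hypothetical 7-point ratings from ten mock jurors.
ratings = np.array([
    # attractiveness, believability, confidence
    [5, 6, 5],
    [3, 3, 4],
    [6, 5, 6],
    [2, 3, 2],
    [4, 5, 4],
    [7, 7, 6],
    [1, 2, 2],
    [5, 4, 5],
    [6, 6, 7],
    [3, 4, 3],
])

# With more than two columns, spearmanr returns the full rho and p matrices,
# correlating every variable with every other, as SPSS does.
rho, pval = stats.spearmanr(ratings)
print(np.round(rho, 3))
```

The diagonal of the matrix is all 1s (each variable correlated with itself), and the matrix is symmetric, so only the entries above (or below) the diagonal carry information, which is why SPSS output effectively reports each pair once.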
The SPSS output for Spearman's rs is shown below.
SPSS OUTPUT FOR SPEARMAN'S RS
Obtained Using Menu Item: Correlate > Bivariate
REPORTING THE RESULTS
When reporting the outcome for each correlation, you would write at the
appropriate points:
There was a significant positive correlation between confidence in testimony and believability (rs = .372, N = 89, p < .0005, two-tailed).
There was no significant correlation between confidence in testimony and attractiveness (rs = .157, N = 89, p = .143, two-tailed).
There was a significant positive correlation between attractiveness and believability (rs = .359, N = 89, p = .001, two-tailed).
You could illustrate each pair of variables in a scattergram (see Section 2). These
data illustrate an aspect of scattergrams mentioned in Section 2. Many cases have
the same values on both variables and it is unclear where all the cases are. To
clearly illustrate the data you can edit the data symbols in Chart Editor, so that they
vary in size according to the number of cases at each position.
Note that the R Sq Linear value, given in the scattergram when you add a regression line, is the square of Pearson's r (r²) and not the square of Spearman's rs. As described in Section 3, r² indicates the proportion of variation explained. You will see that it is rather small for each of these three relationships; the largest is 18.4%.
As this research deals with possible influences on jury decisions, a small amount of
variance explained might nonetheless be important.
HOW TO PERFORM KENDALL'S TAU-B
Some researchers prefer to use Kendall's tau instead of Spearman's rs. To undertake a Kendall's tau, follow the same steps as for Pearson's r, but at step 6 select Kendall's tau-b. The output takes the same form as that for Spearman's rs. Kendall's tau-b takes ties into account. Kendall's tau-c, a variant suited to tables in which the two variables have different numbers of categories, is available in Crosstabs (see Chapter 5, Section 4).
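Kendall's tau-b can likewise be sketched outside SPSS; SciPy's default variant is tau-b, which, like the SPSS Bivariate procedure, adjusts for ties. The ratings below are invented for illustration:

```python
from scipy import stats

# Hypothetical ordinal ratings with ties, like the 7-point scales above.
attractiveness = [5, 3, 6, 2, 4, 7, 1, 5, 6, 3]
believability = [6, 3, 5, 3, 5, 7, 2, 4, 6, 4]

# kendalltau defaults to the tau-b variant, which corrects for tied ranks.
tau, p = stats.kendalltau(attractiveness, believability)
print(f"Kendall's tau-b = {tau:.3f} (p = {p:.4f})")
```

For the same data, tau-b typically comes out somewhat smaller in absolute value than Spearman's rs; the two coefficients answer the same directional question on different scales, so they should not be compared numerically against each other.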