7/27/2019 MEASURES OF RELATIONSHIP.docx
Measures of Relationship
Chapter 5 of the textbook introduced you to the two most widely used measures of
relationship: the Pearson product-moment correlation and the Spearman rank-order
correlation. We will be covering these statistics in this section, as well as other measures of relationship among variables.
What is a Relationship?
Correlation coefficients are measures of the degree of relationship between two or
more variables. When we talk about a relationship, we are talking about the manner in
which the variables tend to vary together. For example, if one variable tends to
increase at the same time that another variable increases, we would say there is a
positive relationship between the two variables. If one variable tends to decrease as
another variable increases, we would say that there is a negative relationship between the two variables. It is also possible that the variables might be unrelated to one
another, so that there is no predictable change in one variable based on knowing about
changes in the other variable.
As a child grows from an infant into a toddler into a young child, both the child's
height and weight tend to change. Those changes are not always tightly locked to one
another, but they do tend to occur together. So if we took a sample of children from a
few weeks old to 3 years old and measured the height and weight of each child, we
would likely see a positive relationship between the two.
A relationship between two variables does not necessarily mean that one variable
causes the other. When we see a relationship, there are three possible causal
interpretations. If we label the variables A and B, A could cause B, B could cause A,
or some third variable (we will call it C) could cause both A and B. With the
relationship between height and weight in children, it is likely that the general growth
of children, which increases both height and weight, accounts for the observed
correlation. It is very foolish to assume that the presence of a correlation implies a
causal relationship between the two variables. There is an extended discussion of this
issue in Chapter 7 of the text.
Scatter Plots and Linear Relationships
A helpful way to visualize a relationship between two variables is to construct a
scatter plot, which you were briefly introduced to in our discussion of graphical
techniques (http://www.ablongman.com/graziano6e/text_site/MATERIAL/statconcepts/graphs.htm). A scatter plot represents each set of paired scores on a two-dimensional
graph, in which the dimensions are defined by the variables. For example, if we
wanted to create a scatter plot of our sample of 100 children for the variables of height
and weight, we would start by drawing the X and Y axes, labeling one height and the
other weight, and marking off the scales so that the range on these axes is sufficient to
handle the range of scores in our sample. Let's suppose that our first child is 27 inches
tall and 21 pounds. We would find the point on the weight axis that represents 21 pounds and the point on the height axis that represents 27 inches. Where lines drawn from these two
points cross, we would put a dot that represents the combination of height and weight
for that child, as shown in the figure below.
We then continue the process for all of the other children in our sample, which might
produce the scatter plot illustrated below.
It is always a good idea to produce scatter plots for the correlations that you compute
as part of your research. Most will look like the scatter plot above, suggesting a linear
relationship. Others will show a distribution that is less organized and more scattered,
suggesting a weak relationship between the variables. But on rare occasions, a scatter
plot will indicate a relationship that is not a simple linear relationship, but rather
shows a complex relationship that changes at different points in the scatter plot. The
scatter plot below illustrates a nonlinear relationship, in which Y increases
as X increases, but only up to a point; after that point, the relationship reverses
direction. Using a simple correlation coefficient for such a situation would be a
mistake, because the correlation cannot capture accurately the nature of a nonlinear
relationship.
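The point about nonlinear relationships can be illustrated with a small Python sketch. The data are hypothetical (an inverted-U pattern invented for this illustration), and the helper name pearson_r is an assumption, not something taken from the text. The sketch computes the correlation over the whole range of X and then separately over the rising and falling halves.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation computed from deviation scores."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sp = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return sp / sqrt(ss_x * ss_y)

# Hypothetical inverted-U data: Y rises with X up to X = 5, then falls.
xs = list(range(11))                   # 0, 1, ..., 10
ys = [25 - (x - 5) ** 2 for x in xs]   # peak of 25 at X = 5

r_all = pearson_r(xs, ys)              # whole range of X
r_rising = pearson_r(xs[:6], ys[:6])   # X from 0 to 5 only
r_falling = pearson_r(xs[5:], ys[5:])  # X from 5 to 10 only
print(r_all, r_rising, r_falling)
```

Over the whole range the correlation is zero even though Y is completely determined by X, while each half taken alone shows a strong relationship in opposite directions. This is why the scatter plot should be inspected before a single coefficient is trusted.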
Pearson Product-Moment Correlation
The Pearson product-moment correlation was devised by Karl Pearson in 1895,
and it is still the most widely used correlation coefficient. The history behind
the mathematical development of this index is fascinating. Those interested in that
history can read about it at http://www.ablongman.com/graziano6e/text_site/MATERIAL/statconcepts/pearsondev.htm. But you need not know that history to understand how
the Pearson correlation works.
The Pearson product-moment correlation is an index of the degree of linear
relationship between two variables that are both measured on at least an interval scale
of measurement. The index is structured so that a correlation of 0.00 means that there
is no linear relationship, a correlation of +1.00 means that there is a perfect positive
relationship, and a correlation of -1.00 means that there is a perfect negative
relationship. As you move from zero to either end of this scale, the strength of the
relationship increases. You can think of the strength of a linear relationship as how
tightly the data points in a scatter plot cluster around a straight line. In a perfect
relationship, either negative or positive, the points all fall on a single straight line. We
will see examples of that later. The symbol for the Pearson correlation is a
lowercase r, which is often subscripted with the two variables. For example, r_xy would
stand for the correlation between the variables X and Y.
The Pearson product-moment correlation was originally defined in terms of Z-scores. In fact, you can compute the product-moment correlation as the average cross-product
of the Z-scores, as shown in the first equation below. But that is an equation that is difficult
to use to do computations. The more commonly used equation now is the second
equation below. Although this equation looks much more complicated and looks like
it would be much more difficult to compute, in fact, this second equation is by far the
easier of the two to use if you are doing the computations with nothing but a
calculator.
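The two equations described in this paragraph can be written in the following standard forms. The notation is an assumption: N is the number of paired scores, the Z-scores are computed with the standard deviation that divides by N, and the second equation is one common arrangement of the raw-score formula.

```latex
% Definitional form: the average cross-product of the Z-scores
r_{xy} = \frac{\sum Z_X Z_Y}{N}

% Computational (raw-score) form
r_{xy} = \frac{\sum XY - \dfrac{(\sum X)(\sum Y)}{N}}
              {\sqrt{\left[\sum X^2 - \dfrac{(\sum X)^2}{N}\right]
                     \left[\sum Y^2 - \dfrac{(\sum Y)^2}{N}\right]}}
```

The second form needs only five running totals (the sums of X, Y, XY, X squared, and Y squared), which is why it is the easier one to work through on a calculator.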
You can learn how to compute the Pearson product-moment correlation either by hand
or using SPSS for Windows by following one of the links below.
Compute the Pearson product-moment correlation by hand: http://www.ablongman.com/graziano6e/text_site/MATERIAL/Stats/manpear.htm
Compute the Pearson product-moment correlation using SPSS: http://www.ablongman.com/graziano6e/text_site/MATERIAL/Stats/pearson.htm
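As a rough check that the Z-score definition and the raw-score formula give the same answer, here is a minimal Python sketch. The function names and the five pairs of heights and weights are made up for illustration, and the Z-scores are assumed to use the standard deviation that divides by N.

```python
from math import sqrt

def pearson_z(xs, ys):
    """Average cross-product of Z-scores (assumes SDs that divide by N)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sx = sqrt(sum((x - mx) ** 2 for x in xs) / n)
    sy = sqrt(sum((y - my) ** 2 for y in ys) / n)
    return sum(((x - mx) / sx) * ((y - my) / sy) for x, y in zip(xs, ys)) / n

def pearson_raw(xs, ys):
    """Raw-score (computational) formula built from five sums."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = sum_xy - sum_x * sum_y / n
    denominator = sqrt((sum_x2 - sum_x ** 2 / n) * (sum_y2 - sum_y ** 2 / n))
    return numerator / denominator

# Made-up heights (inches) and weights (pounds) for five children.
heights = [20, 24, 27, 31, 36]
weights = [9, 15, 21, 24, 30]
print(pearson_z(heights, weights), pearson_raw(heights, weights))
```

Both functions return the same strongly positive value apart from floating-point rounding. The raw-score version never computes a mean or a Z-score, which is what makes it convenient for hand calculation.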
Spearman Rank-Order Correlation
The Spearman rank-order correlation provides an index of the degree of linear
relationship between the ranks of two variables that are both measured on at least an ordinal scale
of measurement. If one of the variables is on an ordinal scale and the other is on an
interval or ratio scale, it is always possible to convert the interval or ratio scale to an
ordinal scale. That process is discussed in the section showing you how to compute
this correlation by hand.
The Spearman correlation has the same range as the Pearson correlation, and the
numbers mean the same thing. A zero correlation means that there is no relationship,
whereas correlations of +1.00 and -1.00 mean that there are perfect positive and
negative relationships, respectively. The formula for computing this correlation is
shown below. Traditionally, the lowercase r with a subscript s is used to designate the
Spearman correlation (i.e., r_s). The one term in the formula that is not familiar to you
is d, which is equal to the difference in the ranks for the two variables. This is
explained in more detail in the section that covers the manual computation of the
Spearman rank-order correlation.
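The formula described here is usually written in the following standard form. The notation is an assumption: N is the number of pairs, d is the difference between the two ranks for each pair, and the formula is exact only when there are no tied ranks.

```latex
r_s = 1 - \frac{6 \sum d^2}{N\left(N^2 - 1\right)}
```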
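Compute the Spearman rank-order correlation by hand: http://www.ablongman.com/graziano6e/text_site/MATERIAL/Stats/manspear.htm
Compute the Spearman rank-order correlation using SPSS: http://www.ablongman.com/graziano6e/text_site/MATERIAL/Stats/spearman.htm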
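The following Python sketch shows one way the rank-difference computation might look, including the step of converting an interval-scale variable to ranks. The data and function names are invented for illustration, and the ranking helper assumes there are no tied scores.

```python
def ranks(values):
    """Rank scores from 1 (smallest) to N; assumes no tied scores."""
    order = sorted(values)
    return [order.index(v) + 1 for v in values]

def spearman_rs(xs, ys):
    """Spearman rank-order correlation from the differences in ranks (d)."""
    n = len(xs)
    rank_x, rank_y = ranks(xs), ranks(ys)
    sum_d2 = sum((rx - ry) ** 2 for rx, ry in zip(rank_x, rank_y))
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))

# Made-up data: a judge's ordinal ranking and an interval-scale test score.
judge_rank = [1, 2, 3, 4, 5, 6]
test_score = [88, 92, 75, 70, 64, 67]
print(spearman_rs(judge_rank, test_score))
```

Here the test scores are converted to ranks before the differences are taken, so both variables end up on an ordinal scale. In this made-up example the result is strongly negative, because the people ranked first by the judge tend to have the highest test scores and therefore the highest score ranks.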
The Phi Coefficient
The Phi coefficient is an index of the degree of relationship between two variables
that are measured on a nominal scale. Because variables measured on a nominal scale are simply classified by type, rather than measured in the more general sense, there is
no such thing as a linear relationship. Nevertheless, it is possible to see if there is a
relationship. For example, suppose you want to study the relationship between
religious background and occupations. You have a classification system for religion
that includes Catholic, Protestant, Muslim, Other, and Agnostic/Atheist. You have
also developed a classification for occupations that includes Unskilled Laborer, Skilled
Laborer, Clerical, Middle Manager, Small Business Owner, and Professional/Upper
Management. You want to see if the distribution of religious preferences differs by
occupation, which is just another way of saying that there is a relationship between
these two variables.
The Phi coefficient is not used nearly as often as the Pearson and Spearman correlations. Therefore, we will not be devoting space here to the computational
procedures. However, interested students can consult advanced statistics textbooks for
the details. You can compute Phi easily as one of the options in
the crosstabs procedure in SPSS for Windows, as described at the link below.
Using Crosstabs in SPSS for Windows: http://www.ablongman.com/graziano6e/text_site/MATERIAL/Stats/phi.htm
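Although the full computational procedures are left to other sources, the special case of two variables with two categories each is short enough to sketch in Python. Everything here is an assumption made for illustration: the function name, the layout of the table as [[a, b], [c, d]], and the counts themselves. Tables with more than two categories per variable, such as the religion and occupation example, call for a chi-square-based version that this sketch does not attempt.

```python
from math import sqrt

def phi_coefficient(a, b, c, d):
    """Phi for a 2 x 2 table of counts laid out as [[a, b], [c, d]]."""
    numerator = a * d - b * c
    denominator = sqrt((a + b) * (c + d) * (a + c) * (b + d))
    return numerator / denominator

# Made-up counts: rows are two groups, columns are two outcome categories.
print(phi_coefficient(30, 10, 10, 30))
```

When the counts are spread evenly across all four cells the coefficient is zero, and when every case falls on one diagonal of the table it reaches 1.00.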
Advanced Correlational Techniques
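Correlational techniques are immensely flexible and can be extended dramatically to
solve various kinds of statistical problems. Covering the details of these advanced
correlational techniques is beyond the scope of this text and website. However, we
have included brief discussions of several advanced correlational techniques on
the Student Resource Website, including multidimensional scaling, path
analysis, taxonomic search techniques, and statistical analysis of neuroimages.
Multidimensional scaling: http://www.ablongman.com/graziano6e/text_site/MATERIAL/multidimensional_scaling.htm
Path analysis: http://www.ablongman.com/graziano6e/text_site/MATERIAL/path_analysis.htm
Taxonomic search techniques: http://www.ablongman.com/graziano6e/text_site/MATERIAL/taxometrics.htm
Statistical analysis of neuroimages: http://www.ablongman.com/graziano6e/text_site/MATERIAL/neuroimages.htm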
Nonlinear Correlational Procedures
The vast majority of correlational techniques used in psychology are linear
correlations. However, there are times when one can expect to find nonlinear
relationships and would like to apply statistical procedures to capture such complex
relationships. This topic is far too complex to cover here. The interested student will
want to consult advanced statistical textbooks that specialize in regression analyses.
There are two words of caution that we want to state about using such nonlinear
correlational procedures. Although it is relatively easy to do the computations using
modern statistical software, you should not use these procedures unless you actually
understand them and their pitfalls. It is easy to misuse the techniques and to be fooled
into believing things that are not true from a naive analysis of the output of computer
programs.
The second word of caution is that there should be a strong theoretical reason to
expect a nonlinear relationship if you are going to use nonlinear correlational
procedures. Many psychophysiological processes are by their nature nonlinear, so
using nonlinear correlations in studying those processes makes complete sense. But
for most psychological processes, there is no good theoretical reason to expect a
nonlinear relationship.
Linear Regression
As you learned in Chapters 5 and 7 of the text, the value of correlations is that they
can be used to predict one variable from another variable. This process is called linear
regression or simply regression. It involves fitting mathematically a straight line to
the data from a scatter plot. Below is a scatter plot from our discussion of
correlations. We have added a regression line to that scatter plot to illustrate how
regression works. We compute the regression line with formulas that we will present to you shortly. The regression line is based on our data. Once we have the regression
line, we can then use it to predict Y from knowing X. The scatter plot below shows the
relationship of height and weight in young children (birth to three years old). The line
that runs through the data points is called the regression line. It is determined by an
equation, which we will discuss shortly. If we know the value of X (in this case,
weight) and we want to predict Y from X, we draw a line straight up from our value
of X until it intersects the regression line, and then we draw a line that is parallel to
the X-axis over to the Y-axis. We then read from the Y-axis our predicted value
for Y (in this case, height).
In order to fit a line mathematically, there must be some stated mathematical criterion
for what constitutes a good fit. In the case of linear regression, that mathematical
criterion is called the least squares criterion, which is shorthand for the line being
positioned so that the sum of the squared distances from the score to the predicted score is as small as it can be. If you are predicting Y, you will compute a regression
line that minimizes the sum of the (Y - Y')². Traditionally, a predicted score is referred
to by using the letter of the score and adding a single quotation mark after it (Y' is
read Y prime or Y predicted). To illustrate this concept, we removed most of the
clutter of data points from the above scatter plot and showed the distances that are
involved in the least squares criterion. Note that it is the vertical distance from the point
to the prediction line, that is, the difference between the predicted Y (along the regression
line) and the actual Y (represented by the data point). A common misconception is that
you measure the shortest distance to the line, which will be a line to the point that is at
right angles to the regression line. It may not be immediately obvious, but if you were trying to predict X from Y, you would be minimizing the sum of the squared
distances X - X'. That means that the regression line for predicting Y from X may not be
the same as the regression line for predicting X from Y. In fact, it is rare that they are
exactly alike.
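The least squares idea can be checked numerically with a short Python sketch. The data and function names are made up for illustration, and the slope and intercept are computed with the standard deviation-score formulas for a least squares line, which is an assumption about the method rather than something quoted from the text. The sketch fits the line for predicting Y from X, confirms that nudging the line makes the sum of squared vertical distances larger, and then fits the line for predicting X from Y.

```python
def fit_line(xs, ys):
    """Least squares slope and intercept for predicting ys from xs."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sp = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    slope = sp / ss_x
    return slope, mean_y - slope * mean_x

def sum_squared_errors(xs, ys, slope, intercept):
    """Sum of squared vertical distances, i.e. the sum of (Y - Y')^2."""
    return sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))

# Made-up weights (X, pounds) and heights (Y, inches).
weights = [9, 15, 21, 24, 30]
heights = [20, 24, 27, 31, 36]

b_yx, a_yx = fit_line(weights, heights)   # predicting Y from X
b_xy, a_xy = fit_line(heights, weights)   # predicting X from Y

best = sum_squared_errors(weights, heights, b_yx, a_yx)
nudged = sum_squared_errors(weights, heights, b_yx + 0.05, a_yx)
print(best < nudged)   # the fitted line beats a slightly tilted one

# Slope of each line when both are drawn with Y against X.
print(b_yx, 1 / b_xy)
```

The last line prints two different slopes, so for these data the line that minimizes the vertical distances (predicting Y) is not the line that minimizes the horizontal distances (predicting X).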
The first equation below is the basic form of the regression line. It is simply the
equation for a straight line, which you probably learned in high school math. The two
new notational items are b_yx and a_yx, which are the slope and the intercept of the
regression line for predicting Y from X. The slope is how much the Y scores increase per unit of X score increase. The slope in the figure above is approximately .80. For
every 10 units of movement along the line on the X axis, the Y axis moves about 8 units.
The intercept is the point at which the line crosses the Y axis (i.e., the point at
which X is equal to zero). The equations for computing the slope and intercept of the
line are listed as the second and third equations, respectively. If you want to
predict X from Y, simply replace all the Xs with Ys and the Ys with Xs in the equations
below.
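The three equations described in this paragraph can be written in the following standard forms. The notation is an assumption: N is the number of paired scores, and the ratio in the slope is written with standard deviations (the square roots of the variances), so it equals 1 whenever the two variances are equal.

```latex
% Regression line for predicting Y from X
Y' = b_{yx} X + a_{yx}

% Slope: correlation form, then raw-score form
b_{yx} = r_{xy}\,\frac{\sigma_Y}{\sigma_X}
       = \frac{\sum XY - \dfrac{(\sum X)(\sum Y)}{N}}
              {\sum X^2 - \dfrac{(\sum X)^2}{N}}

% Intercept
a_{yx} = \bar{Y} - b_{yx}\,\bar{X}
```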
A careful inspection of these equations will reveal a couple of important ideas. First, if
you look at the first version of the equation for the slope (the one using the correlation and the population variances), you will see that the slope is equal to the correlation if
the population variances are equal. That would be true either for
predicting X from Y or Y from X. What is less clear is that equal variances do not make the two
regression lines identical. When both lines are drawn on the same axes, the line for
predicting X from Y is steeper than the line for predicting Y from X unless the correlation
is perfect (+1.00 or -1.00). A perfect correlation is the ONLY situation in which the regression lines are the
same. Second, if the correlation is zero (i.e., no relationship between X and Y), then
the slope will be zero (look at the first part of the second equation). If you are
predicting Y from X, your regression line will be horizontal, and if you are
predicting X from Y, your regression line will be vertical. Furthermore, if you look at
the third equation, you will see that the horizontal line for predicting Y will be at the mean of Y and the vertical line for predicting X will be at the mean of X. Think about
that for a minute. If X and Y are uncorrelated and you are trying to predict Y, the best
prediction that you can make is the mean of Y. If you have no useful information
about a variable and are asked to predict the score of a given individual, your best bet
is to predict the mean. To the extent that the variables are correlated, you can make a
better prediction by using the information from the correlated variable and the
regression equation.
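Two of these ideas can be checked with a minimal Python sketch: that a zero correlation gives a flat line at the mean of Y, and that the slope equals the correlation when the two variables have equal variances. The function name and both small data sets are made up for illustration, and the slope is assumed to be the correlation times the ratio of the standard deviations (each dividing by N).

```python
from math import sqrt

def regression_from_r(xs, ys):
    """Correlation, slope, and intercept for predicting Y from X."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sd_x = sqrt(sum((x - mean_x) ** 2 for x in xs) / n)  # SD dividing by N
    sd_y = sqrt(sum((y - mean_y) ** 2 for y in ys) / n)
    sp = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    r = sp / (n * sd_x * sd_y)
    slope = r * sd_y / sd_x
    intercept = mean_y - slope * mean_x
    return r, slope, intercept

# Made-up uncorrelated scores: the cross-products of deviations cancel out.
xs = [1, 2, 3, 4, 5]
ys = [4, 1, 5, 1, 4]
r, slope, intercept = regression_from_r(xs, ys)
print(r, slope, intercept, sum(ys) / len(ys))

# Made-up scores with equal variances: ys2 is a reordering of xs.
ys2 = [2, 1, 4, 3, 5]
r2, slope2, intercept2 = regression_from_r(xs, ys2)
print(r2, slope2)
```

In the first data set the slope is zero and the intercept equals the mean of Y, so every prediction is simply the mean. In the second, the slope and the correlation come out to the same value because the ratio of the standard deviations is 1.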