7/27/2019 MEASURES OF RELATIONSHIP.docx


Measures of Relationship

Chapter 5 of the textbook introduced you to the two most widely used measures of

relationship: the Pearson product-moment correlation and the Spearman rank-order

correlation. We will be covering these statistics in this section, as well as other measures of relationship among variables.

What is a Relationship?

Correlation coefficients are measures of the degree of relationship between two or

more variables. When we talk about a relationship, we are talking about the manner in

which the variables tend to vary together. For example, if one variable tends to

increase at the same time that another variable increases, we would say there is a

positive relationship between the two variables. If one variable tends to decrease as

another variable increases, we would say that there is a negative relationship between the two variables. It is also possible that the variables might be unrelated to one

another, so that there is no predictable change in one variable based on knowing about

changes in the other variable.

As a child grows from an infant into a toddler into a young child, both the child's

height and weight tend to change. Those changes are not always tightly locked to one

another, but they do tend to occur together. So if we took a sample of children from a

few weeks old to 3 years old and measured the height and weight of each child, we

would likely see a positive relationship between the two.

A relationship between two variables does not necessarily mean that one variable

causes the other. When we see a relationship, there are three possible causal

interpretations. If we label the variables A and B, A could cause B, B could cause A,

or some third variable (we will call it C) could cause both A and B. With the

relationship between height and weight in children, it is likely that the general growth

of children, which increases both height and weight, accounts for the observed

correlation. It is very foolish to assume that the presence of a correlation implies a

causal relationship between the two variables. There is an extended discussion of this

issue in Chapter 7 of the text.

Scatter Plots and Linear Relationships

A helpful way to visualize a relationship between two variables is to construct a

scatter plot, which you were briefly introduced to in our discussion of graphical
techniques. A scatter plot represents each set of paired scores on a two-dimensional

graph, in which the dimensions are defined by the variables. For example, if we

wanted to create a scatter plot of our sample of 100 children for the variables of height

and weight, we would start by drawing the X and Y axes, labeling one height and the

other weight, and marking off the scales so that the range on these axes is sufficient to

handle the range of scores in our sample. Let's suppose that our first child is 27 inches

tall and 21 pounds. We would find the point on the weight axis that represents 21 pounds and the point on the height axis that represents 27 inches. Where lines drawn from these two
points cross, we would put a dot that represents the combination of height and weight

for that child, as shown in the figure below.

We then continue the process for all of the other children in our sample, which might

produce the scatter plot illustrated below.



It is always a good idea to produce scatter plots for the correlations that you compute

as part of your research. Most will look like the scatter plot above, suggesting a linear

relationship. Others will show a distribution that is less organized and more scattered,

suggesting a weak relationship between the variables. But on rare occasions, a scatter plot will indicate a relationship that is not a simple linear relationship, but rather
shows a complex relationship that changes at different points in the scatter plot. The
scatter plot below illustrates a nonlinear relationship, in which Y increases
as X increases, but only up to a point; after that point, the relationship reverses

direction. Using a simple correlation coefficient for such a situation would be a

mistake, because the correlation cannot capture accurately the nature of a nonlinear

relationship.
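A small sketch can make this point concrete. The data here are hypothetical (they do not come from the text): Y rises as X goes from 0 to 5 and then falls symmetrically. The helper `pearson_r` is an illustrative function written for this example, not a library routine.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation computed from deviations around the two means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sum_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sum_xx = sum((x - mean_x) ** 2 for x in xs)
    sum_yy = sum((y - mean_y) ** 2 for y in ys)
    return sum_xy / math.sqrt(sum_xx * sum_yy)

# Hypothetical inverted-U data: Y increases up to X = 5, then decreases.
xs = list(range(0, 11))
ys = [25 - (x - 5) ** 2 for x in xs]

r_all = pearson_r(xs, ys)              # the whole range of X
r_rising = pearson_r(xs[:6], ys[:6])   # X from 0 to 5 only
r_falling = pearson_r(xs[5:], ys[5:])  # X from 5 to 10 only

print("whole range:", round(r_all, 3))
print("rising half:", round(r_rising, 3))
print("falling half:", round(r_falling, 3))
```

Over the whole range the correlation is zero, even though Y is completely determined by X. Each half, taken alone, shows a strong correlation, and the two halves have opposite signs. A scatter plot would reveal the pattern immediately; the single coefficient hides it.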



Pearson Product-Moment Correlation

The Pearson product-moment correlation was devised by Karl Pearson in 1895,

and it is still the most widely used correlation coefficient. The history behind
the mathematical development of this index is fascinating, and those interested can
read about it on the website. But you need not know that history to understand how

the Pearson correlation works.

The Pearson product-moment correlation is an index of the degree of linear

relationship between two variables that are both measured on at least an interval scale
of measurement. The index is structured so that a correlation of 0.00 means that there

is no linear relationship, a correlation of +1.00 means that there is a perfect positive

relationship, and a correlation of -1.00 means that there is a perfect negative

relationship. As you move from zero to either end of this scale, the strength of the

relationship increases. You can think of the strength of a linear relationship as how

tightly the data points in a scatter plot cluster around a straight line. In a perfect

relationship, either negative or positive, the points all fall on a single straight line. We

will see examples of that later. The symbol for the Pearson correlation is a

lowercase r, which is often subscripted with the two variables. For example, $r_{xy}$ would
stand for the correlation between the variables X and Y.

The Pearson product-moment correlation was originally defined in terms of Z-scores. In fact, you can compute the product-moment correlation as the average cross-product
of the Z-scores, as shown in the first equation below. But that is an equation that is difficult

to use to do computations. The more commonly used equation now is the second

equation below. Although this equation looks much more complicated and looks like

it would be much more difficult to compute, in fact, this second equation is by far the

easier of the two to use if you are doing the computations with nothing but a

calculator.
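In standard notation, the two equations described above are usually written as follows. This is a reconstruction from the conventional forms, so treat the exact layout as an assumption. N is the number of paired scores, and the Z-scores are computed with the population standard deviation.

$$r_{xy} = \frac{\sum Z_x Z_y}{N}$$

$$r_{xy} = \frac{N\sum XY - \sum X \sum Y}{\sqrt{\left[N\sum X^2 - \left(\sum X\right)^2\right]\left[N\sum Y^2 - \left(\sum Y\right)^2\right]}}$$

The sketch below checks that the two forms give the same answer. The function names and the data are hypothetical; only the first child (27 inches, 21 pounds) comes from the example in the text.

```python
import math

def pearson_from_z(xs, ys):
    """Average cross-product of Z-scores (population standard deviations)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / n)
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / n)
    cross_products = [((x - mean_x) / sd_x) * ((y - mean_y) / sd_y)
                      for x, y in zip(xs, ys)]
    return sum(cross_products) / n

def pearson_raw(xs, ys):
    """Raw-score (computational) formula that uses only sums."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator

# Hypothetical heights (inches) and weights (pounds) for five children.
heights = [27, 30, 33, 35, 38]
weights = [21, 24, 27, 30, 34]

r_z = pearson_from_z(heights, weights)
r_raw = pearson_raw(heights, weights)
print(round(r_z, 4), round(r_raw, 4))
```

The raw-score version needs only running sums, which is why it is the easier one to work through with a calculator.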

Separate sections of this website show how to compute the Pearson product-moment
correlation by hand and using SPSS for Windows.

Spearman Rank-Order Correlation

The Spearman rank-order correlation provides an index of the degree of linear
relationship between the ranks of two variables that are both measured on at least an ordinal scale

of measurement. If one of the variables is on an ordinal scale and the other is on an

interval or ratio scale, it is always possible to convert the interval or ratio scale to an

ordinal scale. That process is discussed in the section showing you how to compute

this correlation by hand.

The Spearman correlation has the same range as the Pearson correlation, and the

numbers mean the same thing. A zero correlation means that there is no relationship,

whereas correlations of +1.00 and -1.00 mean that there are perfect positive and

negative relationships, respectively. The formula for computing this correlation is

shown below. Traditionally, a lowercase r with a subscript s is used to designate the
Spearman correlation (i.e., $r_s$). The one term in the formula that is not familiar to you

is d, which is equal to the difference in the ranks for the two variables. This is

explained in more detail in the section that covers the manual computation of the

Spearman rank-order correlation.
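The formula referred to above is conventionally written as follows, where N is the number of pairs. This is the standard form for data without tied ranks, given here as a reconstruction rather than a copy of the original equation.

$$r_s = 1 - \frac{6\sum d^2}{N(N^2 - 1)}$$

A minimal sketch of the computation follows. The function names and the data are hypothetical, and the code assumes there are no tied scores. One variable is already a set of ranks; the other is a score that is first converted to ranks.

```python
def to_ranks(scores):
    """Convert scores to ranks (1 = lowest score). Assumes no tied scores."""
    if len(set(scores)) != len(scores):
        raise ValueError("this sketch does not handle tied scores")
    ordered = sorted(scores)
    return [ordered.index(score) + 1 for score in scores]

def spearman_rs(xs, ys):
    """Spearman correlation from the differences (d) between paired ranks."""
    n = len(xs)
    rank_x = to_ranks(xs)
    rank_y = to_ranks(ys)
    sum_d2 = sum((rx - ry) ** 2 for rx, ry in zip(rank_x, rank_y))
    return 1 - (6 * sum_d2) / (n * (n ** 2 - 1))

# Hypothetical data: a judge's ranking (ordinal) and a test score (ratio scale).
judge_rank = [1, 2, 3, 4, 5]
test_score = [62, 55, 80, 71, 93]

rs_example = spearman_rs(judge_rank, test_score)

# Y always increases with X here, but not along a straight line.
rs_curved = spearman_rs([1, 2, 3, 4, 5], [1, 8, 27, 64, 125])

print(round(rs_example, 3), round(rs_curved, 3))
```

Because only the ranks enter the formula, the curved data in the second call still produce a perfect correlation: the ordering of the two variables agrees exactly.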



The Phi Coefficient

The Phi coefficient is an index of the degree of relationship between two variables

that are measured on a nominal scale. Because variables measured on a nominal scale are simply classified by type, rather than measured in the more general sense, there is

no such thing as a linear relationship. Nevertheless, it is possible to see if there is a

relationship. For example, suppose you want to study the relationship between

religious background and occupations. You have a classification system for religion
that includes Catholic, Protestant, Muslim, Other, and Agnostic/Atheist. You have
also developed a classification for occupations that includes Unskilled Laborer, Skilled

Laborer, Clerical, Middle Manager, Small Business Owner, and Professional/Upper

Management. You want to see if the distribution of religious preferences differs by
occupation, which is just another way of asking whether there is a relationship between

these two variables.

The Phi Coefficient is not used nearly as often as the Pearson and Spearman correlations. Therefore, we will not be devoting space here to the computational
procedures. However, interested students can consult advanced statistics textbooks for
the details. You can also compute Phi easily as one of the options in
the crosstabs procedure in SPSS for Windows, which is covered in a separate section of the website.
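For readers who want a feel for the index, here is a minimal sketch. In its usual form, Phi is defined for two variables that each have only two categories, so the example uses a hypothetical 2 x 2 table of counts rather than the multi-category example above. The function name, the table layout, and the counts are all assumptions made for illustration.

```python
import math

def phi_coefficient(a, b, c, d):
    """Phi for a 2 x 2 table of counts laid out as
           col 1   col 2
    row 1    a       b
    row 2    c       d
    """
    numerator = a * d - b * c
    denominator = math.sqrt((a + b) * (c + d) * (a + c) * (b + d))
    return numerator / denominator

# Hypothetical counts: the two classifications tend to go together.
phi_related = phi_coefficient(30, 10, 10, 30)

# Hypothetical counts: the row a person is in tells us nothing about the column.
phi_unrelated = phi_coefficient(20, 20, 20, 20)

print(phi_related, phi_unrelated)
```

As with the other coefficients, a value of zero means that knowing one classification tells you nothing about the other, and larger absolute values mean a stronger relationship.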


Advanced Correlational Techniques

Correlational techniques are immensely flexible and can be extended dramatically to

solve various kinds of statistical problems. Covering the details of these advanced

correlational techniques is beyond the scope of this text and website. However, we
have included brief discussions of several advanced correlational techniques on
the Student Resource Website, including multidimensional scaling, path
analysis, taxonomic search techniques, and statistical analysis of neuroimages.

Nonlinear Correlational Procedures

The vast majority of correlational techniques used in psychology are linear

correlations. However, there are times when one can expect to find nonlinear

relationships and would like to apply statistical procedures to capture such complex

relationships. This topic is far too complex to cover here. The interested student will

want to consult advanced statistical textbooks that specialize in regression analyses.

There are two words of caution that we want to state about using such nonlinear

correlational procedures. First, although it is relatively easy to do the computations using

modern statistical software, you should not use these procedures unless you actually

understand them and their pitfalls. It is easy to misuse the techniques and to be fooled

into believing things that are not true from a naive analysis of the output of computer

programs.

The second word of caution is that there should be a strong theoretical reason to

expect a nonlinear relationship if you are going to use nonlinear correlational

procedures. Many psychophysiological processes are by their nature nonlinear, so

using nonlinear correlations in studying those processes makes complete sense. But

for most psychological processes, there is no good theoretical reason to expect a

nonlinear relationship.

Linear Regression

As you learned in Chapters 5 and 7 of the text, the value of correlations is that they

can be used to predict one variable from another variable. This process is called linear

regression or simply regression. It involves mathematically fitting a straight line to
the data from a scatter plot. Below is a scatter plot from our discussion of
correlations. We have added a regression line to that scatter plot to illustrate how
regression works. We compute the regression line with formulas that we will present to you shortly. The regression line is based on our data. Once we have the regression
line, we can then use it to predict Y from knowing X. The scatter plot below shows the
relationship of height and weight in young children (birth to three years old). The line
that runs through the data points is called the regression line. It is determined by an
equation, which we will discuss shortly. If we know the value of X (in this case,
weight) and we want to predict Y from X, we draw a line straight up from our value
of X until it intersects the regression line, and then we draw a line that is parallel to
the X-axis over to the Y-axis. We then read from the Y-axis our predicted value
for Y (in this case, height).



In order to fit a line mathematically, there must be some stated mathematical criterion
for what constitutes a good fit. In the case of linear regression, that mathematical
criterion is called the least squares criterion, which is shorthand for the line being
positioned so that the sum of the squared distances from each score to its predicted score is as small as it can be. If you are predicting Y, you will compute a regression
line that minimizes the sum of the $(Y - Y')^2$. Traditionally, a predicted score is referred
to by using the letter of the score and adding a single quotation mark after it (Y' is
read Y prime or Y predicted). To illustrate this concept, we removed most of the
clutter of data points from the above scatter plot and showed the distances that are
involved in the least squares criterion. Note that it is the vertical distance from the point
to the prediction line, that is, the difference between the predicted Y (along the regression
line) and the actual Y (represented by the data point). A common misconception is that
you measure the shortest distance to the line, which would run from the point to the line at
right angles to the regression line. It may not be immediately obvious, but if you were trying to predict X from Y, you would be minimizing the sum of the squared
distances $(X - X')^2$. That means that the regression line for predicting Y from X may not be
the same as the regression line for predicting X from Y. In fact, it is rare that they are
exactly alike.
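The least squares idea can be checked numerically. The sketch below uses hypothetical weights and heights and the standard deviation-score formulas for the slope and intercept (an assumption here, since the formulas have not yet been presented). It fits the line for predicting Y from X, confirms that nudging the line in any direction makes the sum of squared vertical distances larger, and then fits the line for predicting X from Y to show that the two lines differ.

```python
def fit_line(xs, ys):
    """Least squares slope and intercept for predicting ys from xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return slope, intercept

def sum_squared_errors(xs, ys, slope, intercept):
    """Sum of squared vertical distances between actual and predicted ys."""
    return sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))

# Hypothetical data: X = weight in pounds, Y = height in inches.
weights = [10, 14, 18, 21, 25, 29]
heights = [22, 25, 26, 27, 31, 32]

b_yx, a_yx = fit_line(weights, heights)
best = sum_squared_errors(weights, heights, b_yx, a_yx)

# Any nudged line has a larger sum of squared errors than the fitted one.
nudged = [sum_squared_errors(weights, heights, b_yx + db, a_yx + da)
          for db in (-0.05, 0.0, 0.05)
          for da in (-0.5, 0.0, 0.5)
          if (db, da) != (0.0, 0.0)]

# Predicting X from Y minimizes horizontal distances instead.
b_xy, a_xy = fit_line(heights, weights)
# Drawn on the same axes (weight across, height up), that line has slope 1 / b_xy.
slope_x_line_same_axes = 1 / b_xy

print("Y from X slope:", round(b_yx, 3))
print("X from Y line, slope on the same axes:", round(slope_x_line_same_axes, 3))
```

The two slopes printed at the end differ, which is the numerical counterpart of the statement that the two regression lines are rarely the same.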



The first equation below is the basic form of the regression line. It is simply the
equation for a straight line, which you probably learned in high school math. The two
new notational items are $b_{yx}$ and $a_{yx}$, which are the slope and the intercept of the
regression line for predicting Y from X. The slope is how much the Y scores increase per unit of X score increase. The slope in the figure above is approximately .80. For
every 10 units of movement along the line on the X axis, the Y axis moves about 8 units.
The intercept is the point at which the line crosses the Y axis (i.e., the point at
which X is equal to zero). The equations for computing the slope and intercept of the
line are listed as the second and third equations, respectively. If you want to
predict X from Y, simply replace all the Xs with Ys and the Ys with Xs in the equations
below.
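In conventional notation, the three equations described above are usually written as follows. This is a reconstruction from the standard forms and from the surrounding description, so the exact layout is an assumption. The first version of the slope is written with the population standard deviations, which are the square roots of the variances, so equal variances mean equal standard deviations.

```latex
\begin{align}
Y' &= b_{yx} X + a_{yx} \\
b_{yx} &= r_{xy}\,\frac{\sigma_y}{\sigma_x}
        = \frac{N\sum XY - \sum X \sum Y}{N\sum X^2 - \left(\sum X\right)^2} \\
a_{yx} &= \bar{Y} - b_{yx}\,\bar{X}
\end{align}
```

The two versions of the slope are algebraically identical: the first is convenient when the correlation is already known, and the second works directly from the raw sums.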



A careful inspection of these equations will reveal a couple of important ideas. First, if
you look at the first version of the equation for the slope (the one using the correlation and the population variances), you will see that the slope is equal to the correlation if
the population variances are equal. That would be true either for
predicting X from Y or Y from X. What is less clear is that equal slopes do not make the
two regression lines the same line. Plotted on the same axes, the regression lines for
predicting X and for predicting Y will be identical only if the correlation is perfect
(+1.00 or -1.00). That is the ONLY situation in which the regression lines are the
same. Second, if the correlation is zero (i.e., no linear relationship between X and Y), then
the slope will be zero (look at the first part of the second equation). If you are
predicting Y from X, your regression line will be horizontal, and if you are
predicting X from Y, your regression line will be vertical. Furthermore, if you look at
the third equation, you will see that the horizontal line for predicting Y will be at the mean of Y and the vertical line for predicting X will be at the mean of X. Think about
that for a minute. If X and Y are uncorrelated and you are trying to predict Y, the best
prediction that you can make is the mean of Y. If you have no useful information
about a variable and are asked to predict the score of a given individual, your best bet
is to predict the mean. To the extent that the variables are correlated, you can make a
better prediction by using the information from the correlated variable and the
regression equation.
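Both ideas can be verified with a short sketch. The data and function names are hypothetical, and the slope is computed as the correlation times the ratio of the population standard deviations, which is assumed here to be the first version of the slope equation. The first data set is built so that X and Y are exactly uncorrelated; the second is standardized so that both variables have equal variances.

```python
import math

def regression_line(xs, ys):
    """Return (slope, intercept, r) for predicting ys from xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / n)
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / n)
    r = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n * sd_x * sd_y)
    slope = r * sd_y / sd_x
    intercept = mean_y - slope * mean_x
    return slope, intercept, r

def standardize(values):
    """Convert scores to Z-scores, so the variance becomes 1."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    return [(v - mean) / sd for v in values]

# Hypothetical uncorrelated data: the cross-products sum to zero.
xs = [1, 2, 3, 4, 5]
ys = [2, 5, 1, 5, 2]
slope0, intercept0, r0 = regression_line(xs, ys)
mean_y = sum(ys) / len(ys)
prediction_at_4 = slope0 * 4 + intercept0  # same prediction for any X

# Hypothetical correlated data, standardized so the variances are equal.
weights = [10, 14, 18, 21, 25, 29]
heights = [22, 25, 26, 27, 31, 32]
slope_z, intercept_z, r_z = regression_line(standardize(weights), standardize(heights))

print("uncorrelated: slope", slope0, "prediction", prediction_at_4, "mean of Y", mean_y)
print("equal variances: slope", round(slope_z, 4), "r", round(r_z, 4))
```

With the uncorrelated data the line is flat and every prediction is the mean of Y. With equal variances the slope and the correlation are the same number.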