MEASURES OF RELATIONSHIP.docx

  • 7/27/2019 MEASURES OF RELATIONSHIP.docx

    Measures of Relationship

    Chapter 5 of the textbook introduced you to the two most widely used measures of

    relationship: the Pearson product-moment correlation and the Spearman rank-order

    correlation. We will be covering these statistics in this section, as well as other measures of relationship among variables.

    What is a Relationship?

    Correlation coefficients are measures of the degree of relationship between two or

    more variables. When we talk about a relationship, we are talking about the manner in

    which the variables tend to vary together. For example, if one variable tends to

    increase at the same time that another variable increases, we would say there is a

    positive relationship between the two variables. If one variable tends to decrease as

    another variable increases, we would say that there is a negative relationship between the two variables. It is also possible that the variables might be unrelated to one

    another, so that there is no predictable change in one variable based on knowing about

    changes in the other variable.

    As a child grows from an infant into a toddler into a young child, both the child's

    height and weight tend to change. Those changes are not always tightly locked to one

    another, but they do tend to occur together. So if we took a sample of children from a

    few weeks old to 3 years old and measured the height and weight of each child, we

    would likely see a positive relationship between the two.

    A relationship between two variables does not necessarily mean that one variable

    causes the other. When we see a relationship, there are three possible causal

    interpretations. If we label the variables A and B, A could cause B, B could cause A,

    or some third variable (we will call it C) could cause both A and B. With the

    relationship between height and weight in children, it is likely that the general growth

    of children, which increases both height and weight, accounts for the observed

    correlation. It is very foolish to assume that the presence of a correlation implies a

    causal relationship between the two variables. There is an extended discussion of this

    issue in Chapter 7 of the text.

    Scatter Plots and Linear Relationships

    A helpful way to visualize a relationship between two variables is to construct a

    scatter plot, which you were briefly introduced to in our discussion of graphical

    techniques (http://www.ablongman.com/graziano6e/text_site/MATERIAL/statconcepts/graphs.htm). A scatter plot represents each set of paired scores on a two-dimensional

    graph, in which the dimensions are defined by the variables. For example, if we

    wanted to create a scatter plot of our sample of 100 children for the variables of height

    and weight, we would start by drawing the X and Y axes, labeling one height and the

    other weight, and marking off the scales so that the range on these axes is sufficient to

    handle the range of scores in our sample. Let's suppose that our first child is 27 inches

    tall and 21 pounds. We would find the point on the weight axis that represents 21 pounds and the point on the height axis that represents 27 inches. Where these two

    points cross, we would put a dot that represents the combination of height and weight

    for that child, as shown in the figure below.

    We then continue the process for all of the other children in our sample, which might

    produce the scatter plot illustrated below.

    It is always a good idea to produce scatter plots for the correlations that you compute

    as part of your research. Most will look like the scatter plot above, suggesting a linear

    relationship. Others will show a distribution that is less organized and more scattered,

    suggesting a weak relationship between the variables. But on rare occasions, a scatter plot will indicate a relationship that is not a simple linear relationship, but rather

    shows a complex relationship that changes at different points in the scatter plot. The

    scatter plot below illustrates a nonlinear relationship, in which Y increases

    as X increases, but only up to a point; after that point, the relationship reverses

    direction. Using a simple correlation coefficient for such a situation would be a

    mistake, because the correlation cannot capture accurately the nature of a nonlinear

    relationship.
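A small sketch can make this point concrete. The Python below uses made-up data (not from the text) in which Y rises with X up to a point and then falls, and computes the Pearson correlation with a helper we have named pearson_r for illustration. It is a minimal sketch under those assumptions, not a substitute for inspecting the scatter plot.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation computed from deviation scores (illustrative helper)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sum_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sum_xx = sum((x - mean_x) ** 2 for x in xs)
    sum_yy = sum((y - mean_y) ** 2 for y in ys)
    return sum_xy / math.sqrt(sum_xx * sum_yy)

# Hypothetical inverted-U data: Y rises until X = 5, then falls.
xs = list(range(11))                    # 0, 1, ..., 10
ys = [25 - (x - 5) ** 2 for x in xs]

r_all = pearson_r(xs, ys)               # correlation over the whole range
r_rising = pearson_r(xs[:6], ys[:6])    # correlation over the rising half only

print(f"whole range: r = {r_all:.2f}")
print(f"rising half: r = {r_rising:.2f}")
```

With these symmetric data the correlation over the whole range is zero even though Y is completely determined by X, while the rising half alone gives a strong positive correlation. That is the sense in which a single linear coefficient misrepresents a relationship that reverses direction.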

    Pearson Product-Moment Correlation

    The Pearson product-moment correlation was devised by Karl Pearson in 1895,

    and it is still the most widely used correlation coefficient. The history behind

    the mathematical development of this index is fascinating. Those interested in that

    history can follow the link (http://www.ablongman.com/graziano6e/text_site/MATERIAL/statconcepts/pearsondev.htm). But you need not know that history to understand how

    the Pearson correlation works.

    The Pearson product-moment correlation is an index of the degree of linear

    relationship between two variables that are both measured on at least an interval scale

    of measurement. The index is structured so that a correlation of 0.00 means that there

    is no linear relationship, a correlation of +1.00 means that there is a perfect positive

    relationship, and a correlation of -1.00 means that there is a perfect negative

    relationship. As you move from zero to either end of this scale, the strength of the

    relationship increases. You can think of the strength of a linear relationship as how

    tightly the data points in a scatter plot cluster around a straight line. In a perfect

    relationship, either negative or positive, the points all fall on a single straight line. We

    will see examples of that later. The symbol for the Pearson correlation is a

    lowercase r, which is often subscripted with the two variables. For example, r_xy would

    stand for the correlation between the variables X and Y.

    The Pearson product-moment correlation was originally defined in terms of Z-scores. In fact, you can compute the product-moment correlation as the average cross-product

    of the Z-scores, as shown in the first equation below. But that is an equation that is difficult

    to use to do computations. The more commonly used equation now is the second

    equation below. Although this equation looks much more complicated and looks like

    it would be much more difficult to compute, in fact, this second equation is by far the

    easier of the two to use if you are doing the computations with nothing but a

    calculator.
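The two equations referred to in this paragraph were images on the original page and did not survive extraction. As a hedged reconstruction, the standard textbook forms that match the description (an average cross-product of Z-scores, and a raw-score computational formula) are shown here. N is the number of paired scores, and the Z-scores are assumed to be computed with the population standard deviation (dividing by N); the original page may have used slightly different notation.

```latex
% First equation: the average cross-product of Z-scores
r_{xy} = \frac{\sum Z_X Z_Y}{N}

% Second equation: the raw-score computational form
r_{xy} = \frac{\sum XY - \dfrac{(\sum X)(\sum Y)}{N}}
              {\sqrt{\left[\sum X^2 - \dfrac{(\sum X)^2}{N}\right]
                     \left[\sum Y^2 - \dfrac{(\sum Y)^2}{N}\right]}}
```

The second form needs only five running totals (the sums of X, Y, XY, X squared, and Y squared), which is why it is the easier one to work through on a calculator.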

    You can learn how to compute the Pearson product-moment correlation either by hand

    or using SPSS for Windows by following one of the links below.

    Compute the Pearson product-moment correlation by hand:
    http://www.ablongman.com/graziano6e/text_site/MATERIAL/Stats/manpear.htm

    Compute the Pearson product-moment correlation using SPSS:
    http://www.ablongman.com/graziano6e/text_site/MATERIAL/Stats/pearson.htm
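For readers who want to check the arithmetic without SPSS, the following Python sketch computes the Pearson correlation two ways: as the average cross-product of Z-scores and with the raw-score computational formula. The function names and the five height and weight pairs are our own illustrative assumptions (only the 27-inch, 21-pound child comes from the text), and the Z-scores are assumed to use the population standard deviation.

```python
import math

def pearson_from_z_scores(xs, ys):
    """Average cross-product of Z-scores (population SD, dividing by N)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / n)
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / n)
    cross_products = (((x - mean_x) / sd_x) * ((y - mean_y) / sd_y)
                      for x, y in zip(xs, ys))
    return sum(cross_products) / n

def pearson_computational(xs, ys):
    """Raw-score formula built from five running totals."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = sum_xy - (sum_x * sum_y) / n
    denominator = math.sqrt((sum_x2 - sum_x ** 2 / n) * (sum_y2 - sum_y ** 2 / n))
    return numerator / denominator

# Hypothetical sample: heights in inches and weights in pounds for five children.
heights = [20, 24, 27, 30, 34]
weights = [9, 15, 21, 24, 31]

r_from_z = pearson_from_z_scores(heights, weights)
r_computational = pearson_computational(heights, weights)

print(f"Z-score formula:       r = {r_from_z:.4f}")
print(f"computational formula: r = {r_computational:.4f}")
```

The two functions should agree to within rounding error, which is a useful check when doing the computation by hand.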

    Spearman Rank-Order Correlation

    The Spearman rank-order correlation provides an index of the degree of monotonic (consistently increasing or consistently decreasing)

    relationship between two variables that are both measured on at least an ordinal scale

    of measurement. If one of the variables is on an ordinal scale and the other is on an

    interval or ratio scale, it is always possible to convert the interval or ratio scale to an

    ordinal scale. That process is discussed in the section showing you how to compute

    this correlation by hand.

    The Spearman correlation has the same range as the Pearson correlation, and the

    numbers mean the same thing. A zero correlation means that there is no relationship,

    whereas correlations of +1.00 and -1.00 mean that there are perfect positive and

    negative relationships, respectively. The formula for computing this correlation is

    shown below. Traditionally, the lowercase r with a subscript s is used to designate the

    Spearman correlation (i.e., r_s). The one term in the formula that is not familiar to you

    is d, which is equal to the difference in the ranks for the two variables. This is

    explained in more detail in the section that covers the manual computation of the

    Spearman rank-order correlation.
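The formula referred to above was an image on the original page and did not survive extraction. As a hedged reconstruction, the standard textbook form that matches the description (a single unfamiliar term d, the difference between the two ranks for each case) is shown here, with N as the number of paired ranks. This form is exact only when there are no tied ranks.

```latex
r_s = 1 - \frac{6 \sum d^2}{N\,(N^2 - 1)}
```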

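    Compute the Spearman rank-order correlation by hand:
    http://www.ablongman.com/graziano6e/text_site/MATERIAL/Stats/manspear.htm

    Compute the Spearman rank-order correlation using SPSS:
    http://www.ablongman.com/graziano6e/text_site/MATERIAL/Stats/spearman.htm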
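The ranking step and the role of d can be sketched in a few lines of Python. The function names and the data are hypothetical, the formula used is the standard difference-in-ranks form, and the ranking helper assumes there are no tied scores; ties need averaged ranks, which this sketch does not handle.

```python
def ranks(values):
    """Rank scores from 1 (smallest) to N. Assumes no tied values."""
    ordered = sorted(values)
    return [ordered.index(v) + 1 for v in values]

def spearman_rs(xs, ys):
    """Spearman rank-order correlation from the differences in ranks (d)."""
    n = len(xs)
    sum_d_squared = sum((rx - ry) ** 2
                        for rx, ry in zip(ranks(xs), ranks(ys)))
    return 1 - (6 * sum_d_squared) / (n * (n ** 2 - 1))

# Hypothetical data: exam scores (interval scale) and hours studied.
# Both are converted to ranks before the correlation is computed.
exam_scores = [88, 72, 95, 60, 79]
hours_studied = [10, 6, 9, 4, 8]

r_s = spearman_rs(exam_scores, hours_studied)
print(f"r_s = {r_s:.2f}")
```

Here only two of the five cases differ in rank between the two variables, each by one position, so the sum of squared differences is small and the correlation is high.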

    The Phi Coefficient

    The Phi coefficient is an index of the degree of relationship between two variables

    that are measured on a nominal scale. Because variables measured on a nominal scale are simply classified by type, rather than measured in the more general sense, there is

    no such thing as a linear relationship. Nevertheless, it is possible to see if there is a

    relationship. For example, suppose you want to study the relationship between

    religious background and occupations. You have a classification system for religion

    that includes Catholic, Protestant, Muslim, Other, and Agnostic/Atheist. You have

    also developed a classification for occupations that includes Unskilled Laborer, Skilled

    Laborer, Clerical, Middle Manager, Small Business Owner, and Professional/Upper

    Management. You want to see if the distribution of religious preferences differs by

    occupation, which is just another way of saying that there is a relationship between

    these two variables.

    The Phi Coefficient is not used nearly as often as the Pearson and Spearman correlations. Therefore, we will not be devoting space here to the computational

    procedures. However, interested students can consult advanced statistics textbooks for

    the details. You can also compute Phi easily as one of the options in

    the crosstabs procedure in SPSS for Windows. Follow the link below to see how.

    Using Crosstabs in SPSS for Windows:
    http://www.ablongman.com/graziano6e/text_site/MATERIAL/Stats/phi.htm

    Advanced Correlational Techniques

    Correlational techniques are immensely flexible and can be extended dramatically to

    solve various kinds of statistical problems. Covering the details of these advanced

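    correlational techniques is beyond the scope of this text and website. However, we

    have included brief discussions of several advanced correlational techniques on

    the Student Resource Website, including multidimensional scaling (http://www.ablongman.com/graziano6e/text_site/MATERIAL/multidimensional_scaling.htm), path

    analysis (http://www.ablongman.com/graziano6e/text_site/MATERIAL/path_analysis.htm), taxonomic search techniques (http://www.ablongman.com/graziano6e/text_site/MATERIAL/taxometrics.htm), and statistical analysis of neuroimages (http://www.ablongman.com/graziano6e/text_site/MATERIAL/neuroimages.htm).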

    Nonlinear Correlational Procedures

    The vast majority of correlational techniques used in psychology are linear

    correlations. However, there are times when one can expect to find nonlinear

    relationships and would like to apply statistical procedures to capture such complex

    relationships. This topic is far too complex to cover here. The interested student will

    want to consult advanced statistical textbooks that specialize in regression analyses.

    There are two words of caution that we want to state about using such nonlinear

    correlational procedures. Although it is relatively easy to do the computations using

    modern statistical software, you should not use these procedures unless you actually

    understand them and their pitfalls. It is easy to misuse the techniques and to be fooled

    into believing things that are not true from a naive analysis of the output of computer

    programs.

    The second word of caution is that there should be a strong theoretical reason to

    expect a nonlinear relationship if you are going to use nonlinear correlational

    procedures. Many psychophysiological processes are by their nature nonlinear, so

    using nonlinear correlations in studying those processes makes complete sense. But

    for most psychological processes, there is no good theoretical reason to expect a

    nonlinear relationship.

    Linear Regression

    As you learned in Chapters 5 and 7 of the text, the value of correlations is that they

    can be used to predict one variable from another variable. This process is called linear

    regression or simply regression. It involves mathematically fitting a straight line to

    the data from a scatter plot. Below is a scatter plot from our discussion of

    correlations. We have added a regression line to that scatter plot to illustrate how

    regression works. We compute the regression line with formulas that we will present to you shortly. The regression line is based on our data. Once we have the regression

    line, we can then use it to predict Y from knowing X. The scatter plot below shows the

    relationship of height and weight in young children (birth to three years old). The line

    that runs through the data points is called the regression line. It is determined by an

    equation, which we will discuss shortly. If we know the value of X (in this case,

    weight) and we want to predict Y from X, we draw a line straight up from our value

    of X until it intersects the regression line, and then we draw a line that is parallel to

    the X-axis over to the Y-axis. We then read from the Y-axis our predicted value

    for Y (in this case, height).

    In order to fit a line mathematically, there must be some stated mathematical criterion

    for what constitutes a good fit. In the case of linear regression, that mathematical

    criterion is called the least squares criterion, which is shorthand for the line being

    positioned so that the sum of the squared distances from the score to the predicted score is as small as it can be. If you are predicting Y, you will compute a regression

    line that minimizes the sum of the squared differences (Y - Y')². Traditionally, a predicted score is referred

    to by using the letter of the score and adding a single quotation mark after it (Y' is

    read Y prime or Y predicted). To illustrate this concept, we removed most of the

    clutter of data points from the above scatter plot and showed the distances that are

    involved in the least squares criterion. Note that it is the vertical distance from the point

    to the prediction line, that is, the difference between the predicted Y (along the regression

    line) and the actual Y (represented by the data point). A common misconception is that

    you measure the shortest distance to the line, which will be a line to the point that is at

    right angles to the regression line. It may not be immediately obvious, but if you were trying to predict X from Y, you would be minimizing the sum of the squared

    distances (X - X')². That means that the regression line for predicting Y from X may not be

    the same as the regression line for predicting X from Y. In fact, it is rare that they are

    exactly alike.
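The least squares idea can be checked numerically. The Python sketch below fits a line with the standard least squares slope and intercept, confirms that nudging the slope away from the fitted value makes the sum of squared vertical distances larger, and then fits the line in the other direction to show that the two regression lines differ. The data and all function and variable names are hypothetical, chosen only for illustration.

```python
def least_squares_line(xs, ys):
    """Slope and intercept that minimize the sum of squared (y - y') distances."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sum_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sum_xx = sum((x - mean_x) ** 2 for x in xs)
    slope = sum_xy / sum_xx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def sum_squared_errors(xs, ys, slope, intercept):
    """Sum of squared vertical distances from each point to the line."""
    return sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))

# Hypothetical data: weights in pounds (X) and heights in inches (Y).
weights = [9, 15, 21, 24, 31]
heights = [20, 24, 27, 30, 34]

# Predicting Y (height) from X (weight).
b_yx, a_yx = least_squares_line(weights, heights)
best_sse = sum_squared_errors(weights, heights, b_yx, a_yx)
nudged_sse = sum_squared_errors(weights, heights, b_yx + 0.05, a_yx)

# Predicting X (weight) from Y (height): swap the roles of the variables.
b_xy, a_xy = least_squares_line(heights, weights)
# Re-express that line on the same axes (height against weight) to compare slopes.
slope_xy_on_same_axes = 1 / b_xy

print(f"Y from X: slope = {b_yx:.4f}, SSE = {best_sse:.4f}")
print(f"nudged slope gives SSE = {nudged_sse:.4f}")
print(f"X from Y, drawn on the same axes: slope = {slope_xy_on_same_axes:.4f}")
```

Even with data this strongly correlated, the two slopes are close but not equal, which illustrates why the direction of prediction has to be decided before the line is computed.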

    The first equation below is the basic form of the regression line. It is simply the

    equation for a straight line, which you probably learned in high school math. The two

    new notational items are b_yx and a_yx, which are the slope and the intercept of the

    regression line for predicting Y from X. The slope is how much the Y scores increase per unit of X score increase. The slope in the figure above is approximately .80. For

    every 10 units of movement along the X axis, the line rises about 8 units on the Y axis.

    The intercept is the point at which the line crosses the Y axis (i.e., the point at

    which X is equal to zero). The equations for computing the slope and intercept of the

    line are listed as the second and third equations, respectively. If you want to

    predict X from Y, simply replace all the Xs with Ys and the Ys with Xs in the equations

    below.
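The three equations referred to above were images on the original page and did not survive extraction. As a hedged reconstruction, the standard textbook forms that match the description are shown here: the straight-line equation, a slope equation with two versions (one using the correlation and the variability of the two variables, one using raw scores), and the intercept. The standard deviations are written as population values to match the wording of the text; the exact notation of the original is an assumption.

```latex
% First equation: the regression line for predicting Y from X
Y' = b_{yx} X + a_{yx}

% Second equation: the slope, in two equivalent versions
b_{yx} = r_{xy} \, \frac{\sigma_Y}{\sigma_X}
       = \frac{\sum XY - \dfrac{(\sum X)(\sum Y)}{N}}
              {\sum X^2 - \dfrac{(\sum X)^2}{N}}

% Third equation: the intercept
a_{yx} = \bar{Y} - b_{yx} \bar{X}
```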

    A careful inspection of these equations will reveal a couple of important ideas. First, if

    you look at the first version of the equation for the slope (the one using the correlation and the population variances), you will see that the slope is equal to the correlation if

    the population variances are equal. That would be true either for

    predicting X from Y or Y from X, so in that case the two regression equations

    have the same slope coefficient. Even then, the two regression lines drawn on the

    same axes coincide only when the correlation is perfect (+1.00 or -1.00). That is the ONLY situation in which the regression lines are the

    same. Second, if the correlation is zero (i.e., no relationship between X and Y), then

    the slope will be zero (look at the first part of the second equation). If you are

    predicting Y from X, your regression line will be horizontal, and if you are

    predicting X from Y, your regression line will be vertical. Furthermore, if you look at

    the third equation, you will see that the horizontal line for predicting Y will be at the mean of Y and the vertical line for predicting X will be at the mean of X. Think about

    that for a minute. If X and Y are uncorrelated and you are trying to predict Y, the best

    prediction that you can make is the mean of Y. If you have no useful information

    about a variable and are asked to predict the score of a given individual, your best bet

    is to predict the mean. To the extent that the variables are correlated, you can make a

    better prediction by using the information from the correlated variable and the

    regression equation.
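Both ideas can be verified with a short Python sketch. It uses two small made-up data sets: one in which X and Y have equal variances, and one in which X and Y are uncorrelated. The function names are our own, and the slope and intercept are computed with the standard least squares formulas, which we assume match the equations discussed above.

```python
import math

def regression_line(xs, ys):
    """Least squares slope and intercept for predicting Y from X."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sum_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sum_xx = sum((x - mean_x) ** 2 for x in xs)
    slope = sum_xy / sum_xx
    return slope, mean_y - slope * mean_x

def pearson_r(xs, ys):
    """Pearson correlation from deviation scores."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sum_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sum_xx = sum((x - mean_x) ** 2 for x in xs)
    sum_yy = sum((y - mean_y) ** 2 for y in ys)
    return sum_xy / math.sqrt(sum_xx * sum_yy)

def predict(x, slope, intercept):
    """Predicted score Y' for a given X."""
    return slope * x + intercept

xs = [1, 2, 3, 4, 5]

# Hypothetical case 1: Y is a reshuffling of X, so the variances are equal.
ys_equal_var = [2, 1, 4, 3, 5]
slope_equal_var, _ = regression_line(xs, ys_equal_var)
r_equal_var = pearson_r(xs, ys_equal_var)

# Hypothetical case 2: X and Y are uncorrelated (the cross-products sum to zero).
ys_uncorrelated = [2, 1, 4, 1, 2]
flat_slope, flat_intercept = regression_line(xs, ys_uncorrelated)
mean_y = sum(ys_uncorrelated) / len(ys_uncorrelated)

print(f"equal variances: slope = {slope_equal_var:.2f}, r = {r_equal_var:.2f}")
print(f"uncorrelated: slope = {flat_slope:.2f}, intercept = {flat_intercept:.2f}")
print(f"prediction at X = 10: {predict(10, flat_slope, flat_intercept):.2f}, mean of Y: {mean_y:.2f}")
```

In the first case the slope and the correlation come out equal. In the second the regression line is flat, so every prediction is simply the mean of Y, whatever value of X is supplied.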