
Chapter 12

Examining Relationships in Quantitative Research

Copyright © 2013 by The McGraw-Hill Companies, Inc. All rights reserved. McGraw-Hill/Irwin

12-2

Relationships between Variables

• Is there a relationship between the two variables we are interested in?

• How strong is the relationship?

• How can that relationship be best described?

12-3

Relationships between Variables

– Linear relationship: The strength and nature of the relationship remain the same over the range of both variables

– Curvilinear relationship: The strength and/or direction of the relationship changes over the range of both variables

12-4

Covariation and Variable Relationships

• Covariation: The amount of change in one variable that is consistently related to the change in another variable of interest

– Scatter diagram: A graphic plot of the relative position of two variables, using a horizontal and a vertical axis to represent the values of the respective variables

• A way of visually describing the covariation between two variables (see the sketch below)
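The scatter diagrams on the next slides appear as figures only. As an illustration (not part of the original deck, which uses SPSS charts), a minimal Python sketch of a scatter diagram for two synthetic, positively related variables:

```python
# Minimal scatter-diagram sketch; the data are invented for illustration.
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(42)
x = rng.normal(5, 1, 100)                 # e.g., ratings of one attribute
y = 0.8 * x + rng.normal(0, 0.5, 100)     # a positively related variable

plt.scatter(x, y)
plt.xlabel("X (independent variable)")
plt.ylabel("Y (dependent variable)")
plt.title("Scatter diagram showing positive covariation")
plt.show()
```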

12-5

Positive Relationship between X and Y

12-6

Negative Relationship between X and Y

12-7

Curvilinear Relationship between X and Y

12-8

No Relationship between X and Y

12-9

Correlation Analysis

• Pearson correlation coefficient: Statistical measure of the strength of a linear relationship between two numerical variables

– Varies between –1.00 and +1.00

• 0 represents absolutely no association between the two variables

• –1.00 or +1.00 represents a perfect association between the two variables

– –1.00 represents a perfect negative (indirect) association

– +1.00 represents a perfect positive (direct) association

12-10

Rules of Thumb about the Strength of Correlation Coefficients

12-11

Assumptions for CalculatingPearson’s Correlation Coefficient

• The two variables have been measured using interval- or ratio-scaled measures

• Relationship is linear• Variables come from a normally distributed

population

12-12

SPSS Pearson Correlation Example

What is the extent of the relationship between ‘Satisfaction’ and ‘Likelihood of Recommending’?
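The slide shows SPSS output. As a stand-in, here is a minimal Python sketch of the same computation; the ratings are invented, and 'satisfaction' and 'recommend' are hypothetical names rather than the actual Santa Fe Grill variables:

```python
# Pearson's r and its p-value for two survey items (invented data).
import numpy as np
from scipy import stats

satisfaction = np.array([5, 4, 6, 7, 3, 5, 6, 4, 7, 5])
recommend    = np.array([4, 4, 5, 7, 2, 5, 6, 3, 7, 4])

r, p_value = stats.pearsonr(satisfaction, recommend)
print(f"Pearson r = {r:.3f}, p = {p_value:.4f}")
```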

12-13

Coefficient of Determination

• Coefficient of determination (r²): A number measuring the proportion of variation in one variable accounted for by the variation in another variable

– Can be thought of as a percentage; varies from 0 to 100% (0.0 to 1.0)

– The larger the coefficient of determination:

• The stronger the linear relationship between the two variables being examined

• The greater the proportion of variation in the DV that is explained by variation in the IV

– How is this the same as / different from the correlation coefficient (r)?
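A quick sketch of the relationship to the correlation coefficient: in the bivariate case, r² is simply the square of Pearson's r (illustrative data):

```python
# r-squared is the square of Pearson's r for two variables.
from scipy import stats

x = [1, 2, 3, 4, 5, 6]
y = [2, 3, 5, 4, 6, 7]

r, _ = stats.pearsonr(x, y)
r_squared = r ** 2
print(f"r = {r:.3f}; r^2 = {r_squared:.3f} "
      f"({r_squared:.1%} of the variation in y explained by x)")
```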

12-14

Correlating Rank Data

• Spearman rank order correlation coefficient: A statistical measure of the association between two variables where both have been measured using rank order scales

– Measures essentially the same thing as the Pearson correlation coefficient (it is, in effect, the Pearson correlation computed on the ranks)

12-15

SPSS Spearman Rank Order Correlation

What is the extent of the relationship between ‘Food Quality’ and ‘Service’ rankings by customers of Santa Fe Grill?
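Again the slide shows SPSS output; a minimal Python sketch with invented rankings follows (the variable names are hypothetical stand-ins for the Santa Fe Grill data):

```python
# Spearman rank-order correlation on two sets of invented rankings.
from scipy import stats

food_quality_rank = [1, 2, 3, 4, 5, 6, 7, 8]
service_rank      = [2, 1, 4, 3, 6, 5, 8, 7]

rho, p_value = stats.spearmanr(food_quality_rank, service_rank)
print(f"Spearman rho = {rho:.3f}, p = {p_value:.4f}")
```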

12-16

What is Regression Analysis?

• A method for arriving at more mathematically detailed relationships (predictions) than those provided by the correlation coefficient

• Allows numerical predictions of DVs from IVs

• Assumptions

– Variables are measured on interval or ratio scales

– Variables come from a normally distributed population

– Error terms are normally and independently distributed

12-17

Bivariate regression analysis

• Bivariate regression analysis: A statistical technique that analyzes the linear relationship between two variables by estimating coefficients for the equation of a straight line

– One variable is designated as the dependent variable (DV)

– The other is designated the independent or predictor variable (IV)

12-18

Fundamentals of Bivariate Regression

• General formula for a straight line:

• Y = a + bX + ei

• Where,

– Y = The dependent variable

– a = The intercept (the point where the straight line intersects the Y-axis when X = 0)

– b = The slope (the change in Y for every 1-unit change in X)

– X = The independent variable used to predict Y

– ei = The error of the prediction

12-19

The Straight Line Relationship in Regression

12-20

Fitting the Regression Line Using the “Least Squares” Procedure

12-21

Ordinary Least Squares

• A statistical procedure that estimates regression equation coefficients that produce the lowest sum of squared differences between the actual and predicted values of the dependent variable (see the numerical sketch below)

Regression Coefficient

• Same as "slope coefficient"

• An indicator of the importance of an independent variable in predicting a dependent variable

• Variables with large coefficients are strong predictors and variables with small coefficients are weak predictors – ** this only applies to bivariate regression!!
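A numerical sketch of the least squares idea, using invented data: the closed-form slope and intercept are exactly the values that minimize the sum of squared prediction errors.

```python
# Closed-form OLS estimates for a bivariate regression (invented data).
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 2.9, 4.2, 4.8, 6.1])

b = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
a = y.mean() - b * x.mean()

predicted = a + b * x
sse = np.sum((y - predicted) ** 2)   # the quantity OLS minimizes
print(f"Y = {a:.3f} + {b:.3f}X, sum of squared errors = {sse:.3f}")
```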

12-22

SPSS Results for Bivariate Regression

What is the mathematical relationship between ‘Satisfaction’ and customers’ perception of ‘Reasonable Prices’ at Santa Fe Grill?

12-23

Multiple Regression Analysis

• A statistical technique that analyzes the linear relationship between a dependent variable and multiple independent variables by estimating multiple slope coefficients for the equation of a straight line

– Each IV has a slope coefficient that partially predicts the DV

– Much more complicated than bivariate regression

12-24

Fundamentals of Multiple Regression

• General formula for a straight line:

• Y = a + b1X1 + b2X2 + b3X3 + … + ei

• Where,

– Y = The dependent variable

– a = The intercept (the point where the straight line intersects the Y-axis when all Xs = 0)

– b1 = The slope (the change in Y for every 1-unit change in X1)

– X1 = The first independent variable used to predict Y

– b2 = The slope (the change in Y for every 1-unit change in X2)

– X2 = The second independent variable used to predict Y

– ei = The error of the prediction
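A minimal sketch of fitting this equation, using Python's statsmodels in place of the SPSS procedure the slides use; all data and variable names are invented:

```python
# Multiple regression: estimate a, b1, b2 for Y = a + b1*X1 + b2*X2 + e.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
x1 = rng.normal(5, 1, 50)                       # e.g., a 'fresh food' rating
x2 = rng.normal(5, 1, 50)                       # e.g., a 'food taste' rating
y = 1.0 + 0.6 * x1 + 0.3 * x2 + rng.normal(0, 0.5, 50)

X = sm.add_constant(np.column_stack([x1, x2]))  # adds the intercept term a
model = sm.OLS(y, X).fit()
print(model.params)                              # a, b1, b2
```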

12-25

Standardized Beta Coefficient

• An estimated regression coefficient recalculated from variables that have been standardized (rescaled) to a mean of 0 and a standard deviation of 1

• Enables independent variables with different units of measurement to be directly compared on the strength of their association with the dependent variable
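A sketch of how standardized betas can be obtained, assuming the common approach of z-scoring all variables before fitting (data invented):

```python
# Standardized betas via z-scoring the DV and IVs before fitting.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
x1 = rng.normal(50, 10, 60)                   # measured in one unit
x2 = rng.normal(5, 1, 60)                     # measured in another unit
y = 2.0 * x1 + 30.0 * x2 + rng.normal(0, 5, 60)

def z(v):
    return (v - v.mean()) / v.std(ddof=1)     # rescale to mean 0, std dev 1

X = sm.add_constant(np.column_stack([z(x1), z(x2)]))
betas = sm.OLS(z(y), X).fit().params[1:]      # drop the ~0 intercept
print(f"standardized betas: {betas}")         # now directly comparable
```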

12-26

Examining the Omnibus Statistical Significance of the Regression Model

• Model F statistic: The magnitude of the "Model F" is used to determine whether the entire regression model is significant

– A significant F statistic indicates that the regression model as a whole is "significant" (i.e., can be trusted!)

– Look for a p-value of the F statistic less than .05
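Continuing the statsmodels stand-in for SPSS output, the omnibus F statistic and its p-value can be read directly from a fitted model (synthetic data):

```python
# Omnibus Model F test for a multiple regression (synthetic data).
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(2)
x = rng.normal(size=(40, 2))
y = x @ np.array([0.7, 0.4]) + rng.normal(0, 0.5, 40)

result = sm.OLS(y, sm.add_constant(x)).fit()
print(f"Model F = {result.fvalue:.2f}, p = {result.f_pvalue:.4f}")
# p < .05 -> the regression as a whole is statistically significant
```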

12-27

Substantive Significance

• The multiple r² (coefficient of determination) describes the strength of the relationship between all the independent variables as a group and the dependent variable

– The larger the r² measure, the more of the dependent measure's behavior is explained by the group of independent variables

– 1 – r² = the "coefficient of alienation," or the portion of DV variation that remains unexplained

12-28

Examining the Statistical Significance of Each Coefficient

• Each regression coefficient is divided by its standard error to produce a t statistic

• P-values from the t-tests (testing whether each coefficient = 0) that are less than .05 are typically regarded as "significant"

– Significantly different from "0"

– If "significant," we can be confident in the coefficient's (i.e., the variable's) mathematical effect on the DV
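A sketch of the same idea in the statsmodels stand-in: dividing each coefficient by its standard error reproduces the reported t statistics (synthetic data; the second predictor is deliberately irrelevant):

```python
# Per-coefficient t statistics and p-values (synthetic data).
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(3)
x = rng.normal(size=(40, 2))
y = x @ np.array([0.8, 0.0]) + rng.normal(0, 0.5, 40)   # x2 is irrelevant

result = sm.OLS(y, sm.add_constant(x)).fit()
print(result.params / result.bse)   # t statistics, computed by hand
print(result.tvalues)               # the same values from statsmodels
print(result.pvalues)               # flag coefficients with p < .05
```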

12-29

Multiple Regression Assumptions

• Linear relationship between DV and IVs

• Homoskedasticity: The pattern of covariation is constant (the same) around the regression line, whether the values are small, medium, or large

– Heteroskedasticity: The pattern of covariation around the regression line is not constant, and varies in some way as the values change from smaller to larger

• Normal distribution: All variables are normally distributed
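One common diagnostic for the constant-variance assumption, which the slides do not cover explicitly, is the Breusch-Pagan test; a sketch with synthetic data whose error variance grows with X:

```python
# Breusch-Pagan heteroskedasticity test: a small p-value suggests the
# error variance is not constant around the regression line.
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.diagnostic import het_breuschpagan

rng = np.random.default_rng(4)
x = rng.uniform(1, 10, 100)
y = 2 * x + rng.normal(0, x, 100)        # error variance grows with x

X = sm.add_constant(x)
result = sm.OLS(y, X).fit()
lm_stat, lm_pvalue, f_stat, f_pvalue = het_breuschpagan(result.resid, X)
print(f"Breusch-Pagan p = {lm_pvalue:.4f}")
```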

12-30

Example of Heteroskedasticity

12-31

Example of a Normally Distributed Variable

12-32

SPSS Results for Multiple Regression

What is the mathematical relationship between ‘Satisfaction’ and customers’ perceptions of ‘Fresh Food’, ‘Food Taste’ and ‘Proper Food Temperature’ at Santa Fe Grill? Which is the best regression model?

12-33

Evaluating a Regression Analysis - Summary

• Assess the statistical significance of the overall (omnibus) regression model using the "Model F" statistic and its associated p-value

• Evaluate the regression's adjusted multiple R-squared (i.e., coefficient of determination)

• Examine the individual regression coefficients and their p-values to see which are statistically significant

• Look for p < .05, but consider "marginal" cases

• Look at the values of the standardized beta coefficients to assess the relative influence of each predictor (IV) on the dependent variable (DV)

12-34

Multicollinearity

• A situation in which several independent variables are highly correlated with each other

• Can result in difficulty estimating independent regression coefficients for the correlated variables

• Standard errors of beta coefficients become unreasonably high

• Beta coefficients will typically not be significant

12-35

Multicollinearity – How to Avoid or Fix it!!

• Eliminate or replace highly correlated IVs

– Examine a correlation matrix of the IVs (see the sketch below)

– Typically search for correlations higher than .5 in absolute value

• Use factor analysis techniques (also called "principal components analysis") to combine correlated IVs

• "Live with it"
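A sketch of the correlation-matrix check, plus variance inflation factors (VIFs), a standard multicollinearity diagnostic not mentioned on the slides; the data are synthetic, with x2 deliberately near-duplicating x1:

```python
# Two multicollinearity checks: IV correlation matrix and VIFs
# (VIF above roughly 5-10 is a common warning sign).
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

rng = np.random.default_rng(5)
x1 = rng.normal(size=100)
x2 = x1 + rng.normal(0, 0.1, 100)        # nearly duplicates x1
x3 = rng.normal(size=100)

iv = np.column_stack([x1, x2, x3])
print(np.corrcoef(iv, rowvar=False))     # look for |r| > .5

X = sm.add_constant(iv)
for i in range(1, X.shape[1]):           # skip the constant column
    print(f"VIF x{i}: {variance_inflation_factor(X, i):.1f}")
```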