Exploring relationships between variables Ch. 10 Scatterplots, Associations, and Correlations Ch. 10...
Uploaded by tracy-daniel
Exploring relationships between variables
Ch. 10 Scatterplots, Associations, and Correlations
Scatterplots
• Show change over time
• Show patterns
• Show trends
• Show relationships
• Show outlier values
Scatterplots
• The association can be positive or negative
• Show the relationship between 2 variables
• Can be examined in more depth through the z-scores of both variables ($z_x$, $z_y$)
Z-scores
• $z_x = (x - \bar{x}) / s_x$, where $s_x$ is the standard deviation of x
• $z_y = (y - \bar{y}) / s_y$, where $s_y$ is the standard deviation of y
• Calculate the standard deviations in the same way as before.
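The z-score formulas above can be sketched in a few lines of Python. This is only an illustration: the data values and the helper name `z_scores` are made up for the example, and the sample standard deviation (dividing by n - 1) is assumed.

```python
from statistics import mean, stdev

def z_scores(values):
    """Return the z-score of each value: (value - mean) / standard deviation."""
    m = mean(values)
    s = stdev(values)  # sample standard deviation (divides by n - 1)
    return [(v - m) / s for v in values]

# Hypothetical data, not taken from the slides
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

zx = z_scores(x)
zy = z_scores(y)
print(zx)
print(zy)
```

A quick sanity check is that a set of z-scores always has mean 0 and standard deviation 1.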
Ratio
• Correlation coefficient
• $r = \sum z_x z_y / (n - 1)$
• Correlation measures the strength of the linear association between 2 variables
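As a rough sketch of the correlation coefficient, the snippet below multiplies the paired z-scores, adds them up, and divides by n - 1. The function name `correlation` and the data are assumptions made for the example.

```python
from statistics import mean, stdev

def correlation(xs, ys):
    """Correlation coefficient: sum of (z_x * z_y) divided by (n - 1)."""
    n = len(xs)
    mx, my = mean(xs), mean(ys)
    sx, sy = stdev(xs), stdev(ys)  # sample standard deviations
    total = sum(((x - mx) / sx) * ((y - my) / sy) for x, y in zip(xs, ys))
    return total / (n - 1)

# Hypothetical data, not taken from the slides
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

r = correlation(x, y)
print(round(r, 4))
```

The result always falls between -1 and 1; a variable correlated with itself gives 1.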
Variables
• Explanatory variable: x
• Response variable: y
Least-Squares Line
• $y = a + bx$
• a = y-intercept
• b = slope
• $a = \bar{y} - b\bar{x}$
• $b = SS_{xy} / SS_x$
• $SS_x$ = sum of squares of x
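The slope and intercept formulas can be tried out with a short Python sketch. The function name `least_squares_line` and the data are hypothetical; the sums of squares are computed here from deviations about the means.

```python
def least_squares_line(xs, ys):
    """Return (a, b) for the least-squares line y = a + b*x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    ss_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    ss_x = sum((x - x_bar) ** 2 for x in xs)
    b = ss_xy / ss_x        # slope
    a = y_bar - b * x_bar   # intercept
    return a, b

# Hypothetical data, not taken from the slides
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

a, b = least_squares_line(x, y)
print(f"y = {a:.2f} + {b:.2f}x")
```

Because a is defined as the mean of y minus b times the mean of x, the fitted line always passes through the point (mean of x, mean of y).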
SSx
• Calculate this by summing the squares of the x values: $\sum x^2$
• Then subtract the square of the sum of x, divided by n: $SS_x = \sum x^2 - (\sum x)^2 / n$
• You can also get $SS_x$ on the calculator by squaring the standard deviation and multiplying it by (n - 1): $SS_x = (n - 1) s_x^2$
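To see that the two routes to SSx agree, here is a small Python check on made-up data. The variable names are chosen for the example only.

```python
from statistics import stdev

# Hypothetical data, not taken from the slides
x = [1, 2, 3, 4, 5]
n = len(x)

# Route 1: sum of squared x values minus (sum of x) squared over n
ss_x_formula = sum(v ** 2 for v in x) - sum(x) ** 2 / n

# Route 2: square the sample standard deviation, multiply by (n - 1)
ss_x_from_sd = stdev(x) ** 2 * (n - 1)

print(ss_x_formula, ss_x_from_sd)
```

Both routes give the same number up to floating-point rounding.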
SSxy
• Sum of cross products of x and y
• Take the sum of each x value times its paired y value: $\sum xy$
• Then subtract from that total (sum of x) * (sum of y) divided by n: $SS_{xy} = \sum xy - (\sum x)(\sum y) / n$
SSxy
• The computational formula for $SS_{xy}$ is a more efficient way of computing the sum of each $(x - \bar{x})(y - \bar{y})$
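A short Python comparison, on made-up data, of the two ways of getting SSxy: the computational shortcut and the sum of products of deviations. The variable names are assumptions for the example.

```python
# Hypothetical data, not taken from the slides
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
n = len(x)

# Shortcut: sum of x*y minus (sum of x) * (sum of y) / n
ss_xy_shortcut = sum(xi * yi for xi, yi in zip(x, y)) - sum(x) * sum(y) / n

# Definition: sum of (x - xbar) * (y - ybar)
x_bar = sum(x) / n
y_bar = sum(y) / n
ss_xy_definition = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

print(ss_xy_shortcut, ss_xy_definition)
```

The shortcut needs only three running totals (sum of x, sum of y, sum of xy), which is why it is quicker by hand or on a calculator.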
Complete Guided Ex. #3 page 566
Standard Error of Estimate
• $S_e = \sqrt{\sum (y - y_p)^2 / (n - 2)}$
• How to calculate: $S_e = \sqrt{(SS_y - b \, SS_{xy}) / (n - 2)}$, where $SS_y$ is the sum of squares of y
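The sketch below computes the standard error of estimate both ways on made-up data: from the squared differences between observed and predicted y, and from the sums-of-squares shortcut. The function name and data are hypothetical, and the shortcut is written with SSy and SSxy as an assumption about what the slide intends.

```python
import math

def standard_error_of_estimate(xs, ys):
    """Return (Se from the definition, Se from the sums-of-squares shortcut)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    ss_x = sum((x - x_bar) ** 2 for x in xs)
    ss_y = sum((y - y_bar) ** 2 for y in ys)
    ss_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = ss_xy / ss_x
    a = y_bar - b * x_bar
    # Definition: squared gaps between observed y and predicted y
    sse = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    se_definition = math.sqrt(sse / (n - 2))
    # Shortcut: (SSy - b * SSxy) / (n - 2)
    se_shortcut = math.sqrt((ss_y - b * ss_xy) / (n - 2))
    return se_definition, se_shortcut

# Hypothetical data, not taken from the slides
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

se_def, se_short = standard_error_of_estimate(x, y)
print(round(se_def, 4), round(se_short, 4))
```

The division by n - 2 means at least three data points are needed.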
Residuals
• You can graph the residuals to see whether the regression model fits the data
• A residual is the difference between the observed value and the predicted value
• Residual = observed - predicted
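As an illustration, residuals can be listed with a few lines of Python. The data are made up, and the intercept 2.2 and slope 0.6 are the least-squares values for this particular data set, hard-coded to keep the example short.

```python
def residuals(xs, ys, a, b):
    """Return observed y minus predicted y for each data point."""
    return [y - (a + b * x) for x, y in zip(xs, ys)]

# Hypothetical data, not taken from the slides
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
a, b = 2.2, 0.6  # least-squares intercept and slope for this data

res = residuals(x, y, a, b)
for xi, ri in zip(x, res):
    print(xi, round(ri, 2))
```

For a least-squares line the residuals add up to zero (apart from rounding), so it is the pattern in a residual plot, not the total, that shows whether a line is a reasonable model.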
Confidence Intervals
• $y_p - E < y < y_p + E$
• $y_p$ = predicted value of y
What does this mean (better understanding)
Types of data
• Outlier
• Leverage
• Influential Point
• Lurking Variable
Outlier
• Any data point that stands away from the others
Leverage
• Data points with x-values that are far from the mean of x
• Can alter the least-squares regression line
Influential Point
• Omitting this point can drastically alter the regression model
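One way to see influence is to fit the line with and without a suspect point. In the made-up data below, the last point has an x-value far from the others (high leverage) and does not follow their trend; the function name `fit_line` is hypothetical.

```python
def fit_line(xs, ys):
    """Return (a, b) for the least-squares line y = a + b*x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    ss_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    ss_x = sum((x - x_bar) ** 2 for x in xs)
    b = ss_xy / ss_x
    return y_bar - b * x_bar, b

# Hypothetical data; the last point (20, 0) sits far from the rest
x = [1, 2, 3, 4, 5, 20]
y = [2, 4, 5, 4, 5, 0]

a_all, b_all = fit_line(x, y)
a_without, b_without = fit_line(x[:-1], y[:-1])  # omit the last point

print("slope with the point:   ", round(b_all, 3))
print("slope without the point:", round(b_without, 3))
```

Here the slope changes sign when the point is omitted, which is the kind of drastic change that marks a point as influential.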
Lurking Variable
• A hidden variable that does not appear in the equation
• It is not explicitly part of the model but affects the way the variables in the model appear to be related