Correlation & Regression-Moataza Mahmoud
-
Upload
anum-seher -
Category
Documents
-
view
7 -
download
0
description
Transcript of Correlation & Regression-Moataza Mahmoud
![Page 1: Correlation & Regression-Moataza Mahmoud](https://reader030.fdocuments.us/reader030/viewer/2022032704/55cf8f91550346703b9d9400/html5/thumbnails/1.jpg)
Correlation & Correlation & RegressionRegression
![Page 2: Correlation & Regression-Moataza Mahmoud](https://reader030.fdocuments.us/reader030/viewer/2022032704/55cf8f91550346703b9d9400/html5/thumbnails/2.jpg)
CorrelationCorrelation
Finding the relationship between two quantitative variables without being able to infer causal relationships
Correlation is a statistical technique used to determine the degree to which two variables are related
![Page 3: Correlation & Regression-Moataza Mahmoud](https://reader030.fdocuments.us/reader030/viewer/2022032704/55cf8f91550346703b9d9400/html5/thumbnails/3.jpg)
• Rectangular coordinate
• Two quantitative variables
• One variable is called independent (X) and
the second is called dependent (Y)
Scatter diagram
![Page 4: Correlation & Regression-Moataza Mahmoud](https://reader030.fdocuments.us/reader030/viewer/2022032704/55cf8f91550346703b9d9400/html5/thumbnails/4.jpg)
Example
![Page 5: Correlation & Regression-Moataza Mahmoud](https://reader030.fdocuments.us/reader030/viewer/2022032704/55cf8f91550346703b9d9400/html5/thumbnails/5.jpg)
Scatter diagram of weight and systolic blood Scatter diagram of weight and systolic blood pressurepressure
80
100
120
140
160
180
200
220
60 70 80 90 100 110 120wt (kg)
SBP(mmHg)
![Page 6: Correlation & Regression-Moataza Mahmoud](https://reader030.fdocuments.us/reader030/viewer/2022032704/55cf8f91550346703b9d9400/html5/thumbnails/6.jpg)
80
100
120
140
160
180
200
220
60 70 80 90 100 110 120Wt (kg)
SBP(mmHg)
Scatter diagram of weight and systolic blood pressure
![Page 7: Correlation & Regression-Moataza Mahmoud](https://reader030.fdocuments.us/reader030/viewer/2022032704/55cf8f91550346703b9d9400/html5/thumbnails/7.jpg)
Scatter plots
The pattern of data is indicative of the type of relationship between your two variables:
positive relationship negative relationship no relationship
![Page 8: Correlation & Regression-Moataza Mahmoud](https://reader030.fdocuments.us/reader030/viewer/2022032704/55cf8f91550346703b9d9400/html5/thumbnails/8.jpg)
Positive relationshipPositive relationship
![Page 9: Correlation & Regression-Moataza Mahmoud](https://reader030.fdocuments.us/reader030/viewer/2022032704/55cf8f91550346703b9d9400/html5/thumbnails/9.jpg)
0
2
4
6
8
10
12
14
16
18
0 10 20 30 40 50 60 70 80 90
Age in Weeks
Hei
gh
t in
CM
![Page 10: Correlation & Regression-Moataza Mahmoud](https://reader030.fdocuments.us/reader030/viewer/2022032704/55cf8f91550346703b9d9400/html5/thumbnails/10.jpg)
Negative relationshipNegative relationship
Reliability
Age of Car
![Page 11: Correlation & Regression-Moataza Mahmoud](https://reader030.fdocuments.us/reader030/viewer/2022032704/55cf8f91550346703b9d9400/html5/thumbnails/11.jpg)
No relationNo relation
![Page 12: Correlation & Regression-Moataza Mahmoud](https://reader030.fdocuments.us/reader030/viewer/2022032704/55cf8f91550346703b9d9400/html5/thumbnails/12.jpg)
Correlation CoefficientCorrelation Coefficient
Statistic showing the degree of relation between two variables
![Page 13: Correlation & Regression-Moataza Mahmoud](https://reader030.fdocuments.us/reader030/viewer/2022032704/55cf8f91550346703b9d9400/html5/thumbnails/13.jpg)
Simple Correlation coefficient Simple Correlation coefficient (r)(r)
It is also called Pearson's correlation It is also called Pearson's correlation or product moment correlation or product moment correlationcoefficient. coefficient.
It measures the It measures the naturenature and and strengthstrength between two variables ofbetween two variables ofthe the quantitativequantitative type. type.
![Page 14: Correlation & Regression-Moataza Mahmoud](https://reader030.fdocuments.us/reader030/viewer/2022032704/55cf8f91550346703b9d9400/html5/thumbnails/14.jpg)
The The signsign of of rr denotes the nature of denotes the nature of association association
while the while the valuevalue of of rr denotes the denotes the strength of association.strength of association.
![Page 15: Correlation & Regression-Moataza Mahmoud](https://reader030.fdocuments.us/reader030/viewer/2022032704/55cf8f91550346703b9d9400/html5/thumbnails/15.jpg)
If the sign is If the sign is +ve+ve this means the relation this means the relation is is direct direct (an increase in one variable is (an increase in one variable is associated with an increase in theassociated with an increase in theother variable and a decrease in one other variable and a decrease in one variable is associated with avariable is associated with adecrease in the other variable).decrease in the other variable).
While if the sign is While if the sign is -ve-ve this means an this means an inverse or indirectinverse or indirect relationship (which relationship (which means an increase in one variable is means an increase in one variable is associated with a decrease in the other).associated with a decrease in the other).
![Page 16: Correlation & Regression-Moataza Mahmoud](https://reader030.fdocuments.us/reader030/viewer/2022032704/55cf8f91550346703b9d9400/html5/thumbnails/16.jpg)
The value of r ranges between ( -1) and ( +1)The value of r ranges between ( -1) and ( +1) The value of r denotes the strength of the The value of r denotes the strength of the
association as illustratedassociation as illustratedby the following diagram.by the following diagram.
-1 10-0.25-0.75 0.750.25
strong strongintermediate intermediateweak weak
no relation
perfect correlation
perfect correlation
Directindirect
![Page 17: Correlation & Regression-Moataza Mahmoud](https://reader030.fdocuments.us/reader030/viewer/2022032704/55cf8f91550346703b9d9400/html5/thumbnails/17.jpg)
If If rr = Zero = Zero this means no association or this means no association or correlation between the two variables.correlation between the two variables.
If If 0 < 0 < rr < 0.25 < 0.25 = weak correlation. = weak correlation.
If If 0.25 ≤ 0.25 ≤ rr < 0.75 < 0.75 = intermediate correlation. = intermediate correlation.
If If 0.75 ≤ 0.75 ≤ rr < 1 < 1 = strong correlation. = strong correlation.
If If r r = l= l = perfect correlation. = perfect correlation.
![Page 18: Correlation & Regression-Moataza Mahmoud](https://reader030.fdocuments.us/reader030/viewer/2022032704/55cf8f91550346703b9d9400/html5/thumbnails/18.jpg)
n
y)(y.
n
x)(x
n
yxxy
r2
22
2
How to compute the simple correlation coefficient (r)
![Page 19: Correlation & Regression-Moataza Mahmoud](https://reader030.fdocuments.us/reader030/viewer/2022032704/55cf8f91550346703b9d9400/html5/thumbnails/19.jpg)
ExampleExample::
A sample of 6 children was selected, data about their A sample of 6 children was selected, data about their age in years and weight in kilograms was recorded as age in years and weight in kilograms was recorded as shown in the following table . It is required to find the shown in the following table . It is required to find the correlation between age and weight.correlation between age and weight.
serial No
Age (years)
Weight (Kg)
1712
268
3812
4510
5611
6913
![Page 20: Correlation & Regression-Moataza Mahmoud](https://reader030.fdocuments.us/reader030/viewer/2022032704/55cf8f91550346703b9d9400/html5/thumbnails/20.jpg)
These 2 variables are of the quantitative type, one These 2 variables are of the quantitative type, one variable (Age) is called the independent and variable (Age) is called the independent and denoted as (X) variable and the other (weight)denoted as (X) variable and the other (weight)is called the dependent and denoted as (Y) is called the dependent and denoted as (Y) variables to find the relation between age and variables to find the relation between age and weight compute the simple correlation coefficient weight compute the simple correlation coefficient using the following formula:using the following formula:
n
y)(y.
n
x)(x
n
yxxy
r2
22
2
![Page 21: Correlation & Regression-Moataza Mahmoud](https://reader030.fdocuments.us/reader030/viewer/2022032704/55cf8f91550346703b9d9400/html5/thumbnails/21.jpg)
Serial n.
Age (years)
(x)
Weight (Kg)
(y)xyX2Y2
17128449144
268483664
38129664144
45105025100
56116636121
691311781169
Total∑x=41
∑y=66
∑xy= 461
∑x2=291
∑y2=742
![Page 22: Correlation & Regression-Moataza Mahmoud](https://reader030.fdocuments.us/reader030/viewer/2022032704/55cf8f91550346703b9d9400/html5/thumbnails/22.jpg)
r = 0.759r = 0.759
strong direct correlation strong direct correlation
6
(66)742.
6
(41)291
6
6641461
r22
![Page 23: Correlation & Regression-Moataza Mahmoud](https://reader030.fdocuments.us/reader030/viewer/2022032704/55cf8f91550346703b9d9400/html5/thumbnails/23.jpg)
EXAMPLE: Relationship between Anxiety and EXAMPLE: Relationship between Anxiety and Test ScoresTest Scores
AnxietyAnxiety
))XX((
Test Test score (Y)score (Y)
XX22YY22XYXY
101022100100442020
88336464992424
22994481811818
117711494977
5566252536363030
6655363625253030
∑∑X = 32X = 32∑∑Y = 32Y = 32∑∑XX22 = 230 = 230∑∑YY22 = 204 = 204∑∑XY=129XY=129
![Page 24: Correlation & Regression-Moataza Mahmoud](https://reader030.fdocuments.us/reader030/viewer/2022032704/55cf8f91550346703b9d9400/html5/thumbnails/24.jpg)
Calculating Correlation CoefficientCalculating Correlation Coefficient
94.)200)(356(
1024774
32)204(632)230(6
)32)(32()129)(6(22
r
r = - 0.94
Indirect strong correlation
![Page 25: Correlation & Regression-Moataza Mahmoud](https://reader030.fdocuments.us/reader030/viewer/2022032704/55cf8f91550346703b9d9400/html5/thumbnails/25.jpg)
Regression AnalysesRegression Analyses
Regression: technique concerned with predicting some variables by knowing others
The process of predicting variable Y using variable X
![Page 26: Correlation & Regression-Moataza Mahmoud](https://reader030.fdocuments.us/reader030/viewer/2022032704/55cf8f91550346703b9d9400/html5/thumbnails/26.jpg)
RegressionRegression
Uses a variable (x) to predict some outcome Uses a variable (x) to predict some outcome variable (y)variable (y)
Tells you how values in y change as a function Tells you how values in y change as a function of changes in values of xof changes in values of x
![Page 27: Correlation & Regression-Moataza Mahmoud](https://reader030.fdocuments.us/reader030/viewer/2022032704/55cf8f91550346703b9d9400/html5/thumbnails/27.jpg)
Correlation and RegressionCorrelation and Regression
Correlation describes the strength of a Correlation describes the strength of a linear relationship between two variables
Linear means “straight line”
Regression tells us how to draw the straight line described by the correlation
![Page 28: Correlation & Regression-Moataza Mahmoud](https://reader030.fdocuments.us/reader030/viewer/2022032704/55cf8f91550346703b9d9400/html5/thumbnails/28.jpg)
Regression Calculates the “best-fit” line for a certain set of dataCalculates the “best-fit” line for a certain set of data
The regression line makes the sum of the squares of The regression line makes the sum of the squares of the residuals smaller than for any other linethe residuals smaller than for any other line
Regression minimizes residuals
80
100
120
140
160
180
200
220
60 70 80 90 100 110 120Wt (kg)
![Page 29: Correlation & Regression-Moataza Mahmoud](https://reader030.fdocuments.us/reader030/viewer/2022032704/55cf8f91550346703b9d9400/html5/thumbnails/29.jpg)
By using the least squares method (a procedure By using the least squares method (a procedure that minimizes the vertical deviations of plotted that minimizes the vertical deviations of plotted points surrounding a straight line) we arepoints surrounding a straight line) we areable to construct a best fitting straight line to the able to construct a best fitting straight line to the scatter diagram points and then formulate a scatter diagram points and then formulate a regression equation in the form of:regression equation in the form of:
n
x)(x
n
yxxy
b2
2
1)xb(xyy b
bXay
![Page 30: Correlation & Regression-Moataza Mahmoud](https://reader030.fdocuments.us/reader030/viewer/2022032704/55cf8f91550346703b9d9400/html5/thumbnails/30.jpg)
Regression Equation
Regression equation describes the regression line mathematically Intercept Slope 80
100
120
140
160
180
200
220
60 70 80 90 100 110 120Wt (kg)
SBP(mmHg)
![Page 31: Correlation & Regression-Moataza Mahmoud](https://reader030.fdocuments.us/reader030/viewer/2022032704/55cf8f91550346703b9d9400/html5/thumbnails/31.jpg)
Linear EquationsLinear EquationsLinear EquationsLinear Equations
Y
Y = bX + a
a = Y-intercept
X
Changein Y
Change in X
b = Slope
bXay
![Page 32: Correlation & Regression-Moataza Mahmoud](https://reader030.fdocuments.us/reader030/viewer/2022032704/55cf8f91550346703b9d9400/html5/thumbnails/32.jpg)
Hours studying and Hours studying and gradesgrades
![Page 33: Correlation & Regression-Moataza Mahmoud](https://reader030.fdocuments.us/reader030/viewer/2022032704/55cf8f91550346703b9d9400/html5/thumbnails/33.jpg)
Regressing grades on hours grades on hours
Linear Regression
2.00 4.00 6.00 8.00 10.00
Number of hours spent studying
70.00
80.00
90.00
Final grade in course = 59.95 + 3.17 * studyR-Square = 0.88
Predicted final grade in class =
59.95 + 3.17*(number of hours you study per week)
![Page 34: Correlation & Regression-Moataza Mahmoud](https://reader030.fdocuments.us/reader030/viewer/2022032704/55cf8f91550346703b9d9400/html5/thumbnails/34.jpg)
Predict the final grade ofPredict the final grade of……
Someone who studies for 12 hours Final grade = 59.95 + (3.17*12) Final grade = 97.99
Someone who studies for 1 hour: Final grade = 59.95 + (3.17*1) Final grade = 63.12
Predicted final grade in class = 59.95 + 3.17*(hours of study)
![Page 35: Correlation & Regression-Moataza Mahmoud](https://reader030.fdocuments.us/reader030/viewer/2022032704/55cf8f91550346703b9d9400/html5/thumbnails/35.jpg)
Multiple Regression
Multiple regression analysis is a straightforward extension of simple regression analysis which allows more than one independent variable.