Correlation and Regression
1. Definition of Correlation
2. Scatter Diagram
3. Karl Pearson Coefficient of Correlation
4. Limits for Correlation Coefficient
5. Definition of Regression
6. Lines of Regression
7. Regression Curves
8. Regression Coefficients
9. Properties of Regression Coefficients
10. Correlation Analysis vs. Regression Analysis
Subject :- Statistics & Numerical Mathematics Created By Prof . Santosh Ashilwar
Definition of Correlation
Correlation is a statistical tool that studies the relationship
between two variables.
Correlation analysis involves various methods and techniques
for studying and measuring the extent of the relationship
between the two variables.
If two variables are related so that when one changes the other
also changes, in the same or in the opposite direction, we say
that the two variables are correlated.
Correlation thus means the study of the existence, magnitude and
direction of the relation between two or more variables.
The astronomer Bravais, Sir Francis Galton, Karl Pearson,
Prof. Neiswanger and many others have contributed to this subject.
Scatter Plots and Types of Correlation
Negative correlation: as x increases, y decreases
x = hours of training (horizontal axis)
y = number of accidents (vertical axis)
[Figure: scatter plot of accidents (y-axis, 0 to 60) against hours of training (x-axis, 0 to 20)]
Positive correlation: as x increases, y increases
x = SAT score
y = GPA
[Figure: scatter plot of GPA (y-axis, 1.50 to 4.00) against Math SAT score (x-axis, 300 to 800)]
No linear correlation
x = height
y = IQ
[Figure: scatter plot of IQ (y-axis, 80 to 160) against height (x-axis, 60 to 80)]
Types of Correlation
Positive and negative correlation
Linear and non-linear correlation
Positive and negative correlation
If two variables change in the same direction, then this
is called a positive correlation. For example:
Advertising and sales.
If two variables change in the opposite direction then
the correlation is called a negative correlation. For
example: T.V. registrations and cinema attendance.
Linear and non-linear correlation
The shape of the graph indicates the type of correlation between
two variables. If the plotted points lie along a straight line, the
correlation is called a "linear correlation"; if they do not lie
along a straight line, the correlation is non-linear or curvilinear.
Degrees of Correlation
Through the coefficient of correlation, we can measure the degree or
extent of the correlation between two variables.
On the basis of the coefficient of correlation we can also determine
whether the correlation is positive or negative and also its degree or
extent.
Perfect correlation: If two variables change in the same direction and
in the same proportion, the correlation between the two is perfect
positive; if they change in opposite directions in the same proportion,
it is perfect negative.
Absence of correlation: If two series of two variables exhibit no
relation between them, that is, a change in one variable does not lead
to a change in the other, the correlation is absent.
Limited degrees of correlation: If two variables are neither perfectly
correlated nor completely uncorrelated, we term the correlation
limited correlation.
Limited correlation falls into three categories: high degree, moderate
degree and low degree. The following table gives the range of the
coefficient of correlation for each degree.

Degree                    Positive          Negative
Absence of correlation    0                 0
Perfect correlation       +1                -1
High degree               +0.75 to +1       -0.75 to -1
Moderate degree           +0.25 to +0.75    -0.25 to -0.75
Low degree                0 to +0.25        0 to -0.25
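The table can be read as a simple lookup from a value of r to a verbal degree. The sketch below is one possible reading, not part of the original notes: the function name degree_of_correlation is hypothetical, and because the table does not say which band the boundary values 0.25 and 0.75 belong to, the code assumes they go to the higher band.

```python
def degree_of_correlation(r: float) -> str:
    """Map a correlation coefficient r to the verbal degree used in the table."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1 and +1")

    size = abs(r)
    if size == 0.0:
        return "absence of correlation"

    sign = "positive" if r > 0 else "negative"
    if size == 1.0:
        return f"perfect {sign} correlation"

    # Assumption: the boundary values 0.25 and 0.75 are assigned to the higher band.
    if size >= 0.75:
        degree = "high"
    elif size >= 0.25:
        degree = "moderate"
    else:
        degree = "low"
    return f"{degree} degree of {sign} correlation"


print(degree_of_correlation(-0.9))
print(degree_of_correlation(0.1))
```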
Methods Of Determining Correlation
We shall consider the following two most commonly used
methods.
(1) Scatter Plot
(2) Karl Pearson’s coefficient of correlation
Scatter Plot Method
In this method the values of the two variables are
plotted on graph paper.
One is taken along the horizontal (x-axis) and the
other along the vertical (y-axis).
By plotting the data, we get points (dots) on the
graph which are generally scattered, hence the
name ‘Scatter Plot’.
The manner in which these points are scattered
suggests the degree and the direction of correlation.
The degree of correlation is denoted by ‘r’ and its
direction is given by its sign, positive or negative.
i) If all points lie on a rising straight line, the correlation is
perfectly positive and r = +1 (see fig. 1).
ii) If all points lie on a falling straight line, the correlation is
perfectly negative and r = -1 (see fig. 2).
iii) If the points lie in a narrow strip, rising upwards, the correlation
is of a high degree and positive (see fig. 3).
iv) If the points lie in a narrow strip, falling downwards, the
correlation is of a high degree and negative (see fig. 4).
v) If the points are spread widely over a broad strip, rising
upwards, the correlation is of a low degree and positive (see fig. 5).
vi) If the points are spread widely over a broad strip, falling
downwards, the correlation is of a low degree and negative (see fig. 6).
vii) If the points are scattered without any specific pattern,
the correlation is absent, i.e. r = 0 (see fig. 7).
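These cases can be checked numerically on small made-up data sets. The sketch below is illustrative only: the data values are invented for this purpose, the helper name pearson_r is hypothetical, and r is computed with the usual deviation-form formula (sum of products of deviations divided by the square root of the product of the sums of squared deviations).

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson's r computed from deviations about the two means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    sum_xy = sum(a * b for a, b in zip(dx, dy))
    sum_x2 = sum(a * a for a in dx)
    sum_y2 = sum(b * b for b in dy)
    return sum_xy / sqrt(sum_x2 * sum_y2)


xs = [1, 2, 3, 4, 5]

# Invented data sets, one per pattern described in the list.
patterns = {
    "rising line (case i)": (xs, [2 * x + 1 for x in xs]),
    "falling line (case ii)": (xs, [10 - 3 * x for x in xs]),
    "narrow rising strip (case iii)": (xs, [3.1, 4.8, 7.2, 8.9, 11.0]),
    "broad rising strip (case v)": (xs, [4, 9, 2, 10, 6]),
    "no pattern (case vii)": ([1, 1, 3, 3], [1, 3, 1, 3]),  # corners of a square
}

for name, (px, py) in patterns.items():
    print(f"{name}: r = {pearson_r(px, py):+.3f}")
```

With only a handful of points the "strips" are crude, but the ordering of the r values follows the list: exact lines give +1 or -1, a narrow strip stays close to 1, a broad strip gives a small positive value, and the square gives 0.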
Scatter Plot Method
Though this method is simple and gives a rough idea
of the existence and the degree of correlation, it
is not reliable. As it is not a mathematical method,
it cannot measure the degree of correlation numerically.
Karl Pearson’s coefficient of correlation
It gives a numerical measure of correlation and is denoted by ‘r’.
The value of ‘r’ gives the magnitude of the correlation and its sign
denotes its direction. It is defined as
\[ r = \frac{\sum xy}{\sqrt{\sum x^2 \, \sum y^2}} \]
where x = xi - x̄ and y = yi - ȳ are the deviations of the
observations from their respective means.
Example: Calculate the coefficient of correlation between the
heights of fathers and their sons for the following data.
Height of father (cm): 165 166 167 168 167 169 170 172
Height of son (cm):    167 168 165 172 168 172 169 171
Solution: n = 8 (pairs of observations)
Mean height of fathers x̄ = 1344/8 = 168; mean height of sons ȳ = 1352/8 = 169.

xi (father)   yi (son)   x = xi - x̄   y = yi - ȳ   xy    x²    y²
165           167        -3           -2           6     9     4
166           168        -2           -1           2     4     1
167           165        -1           -4           4     1     16
167           168        -1           -1           1     1     1
168           172         0            3           0     0     9
169           172         1            3           3     1     9
170           169         2            0           0     4     0
172           171         4            2           8     16    4
Total: 1344   1352        0            0           24    36    44

Σxi = 1344, Σyi = 1352, Σxy = 24, Σx² = 36, Σy² = 44
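The totals in the table can be carried through to a value of r. The sketch below recomputes the deviation columns from the raw heights and assumes the usual deviation form r = Σxy / sqrt(Σx² · Σy²); the variable names are chosen here for illustration.

```python
from math import sqrt

father = [165, 166, 167, 168, 167, 169, 170, 172]
son = [167, 168, 165, 172, 168, 172, 169, 171]

n = len(father)
mean_father = sum(father) / n  # 1344 / 8 = 168
mean_son = sum(son) / n        # 1352 / 8 = 169

# Deviations from the means (the x and y columns of the table).
x = [xi - mean_father for xi in father]
y = [yi - mean_son for yi in son]

sum_xy = sum(a * b for a, b in zip(x, y))  # 24
sum_x2 = sum(a * a for a in x)             # 36
sum_y2 = sum(b * b for b in y)             # 44

r = sum_xy / sqrt(sum_x2 * sum_y2)
print(f"r = {r:.4f}")  # r = 0.6030
```

By the degree table given earlier, a value of about 0.60 corresponds to a moderate degree of positive correlation between the heights of fathers and sons.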
Example: From the following data compute the coefficient of
correlation between x and y.
Limits for Correlation Coefficient
The Pearsonian correlation coefficient cannot exceed 1
numerically. In other words, it lies between -1 and +1.
Symbolically, -1 ≤ r ≤ +1.
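One standard way to see why these limits hold is the Cauchy-Schwarz inequality applied to the deviations from the means. The symbols u_i and v_i below are notation introduced for this sketch only.

```latex
\[
u_i = x_i - \bar{x}, \qquad v_i = y_i - \bar{y}, \qquad
\Bigl(\sum_{i=1}^{n} u_i v_i\Bigr)^{2} \;\le\; \Bigl(\sum_{i=1}^{n} u_i^{2}\Bigr)\Bigl(\sum_{i=1}^{n} v_i^{2}\Bigr)
\]
\[
r^{2} \;=\; \frac{\bigl(\sum u_i v_i\bigr)^{2}}{\sum u_i^{2}\,\sum v_i^{2}} \;\le\; 1
\quad\Longrightarrow\quad -1 \le r \le +1
\]
```

Equality holds only when every v_i is the same multiple of u_i, that is, when all the points lie exactly on a straight line, which matches the perfect correlation cases r = +1 and r = -1.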
Definition of Regression
1. Regression can be defined as a method that estimates the value of
one variable when that of the other variable is known, provided the
variables are correlated.
2. The dictionary meaning of regression is "to go backward." The term
was first used by Sir Francis Galton in his research paper
"Regression towards mediocrity in hereditary stature."
3. Regression helps us to estimate one variable (the dependent
variable) from the other variable (the independent variable).
4. According to Blair, “Regression is the measure of the average
relationship between two or more variables in terms of the original
units of the data.”
5. According to Wallis and Roberts, “It is often more important to find
out what the relation actually is, in order to estimate or predict one
variable (the dependent variable); and the statistical technique
appropriate to such a case is called regression analysis.”
Lines of Regression
1. In the scatter plot, we have seen that if the variables are highly correlated then the points (dots) lie in a narrow strip. If the strip is nearly straight, we can draw a straight line such that all points are close to it on both sides.
2. This line is called the line of best fit if it minimizes the distances of all data points from it.
3. This line is also called the line of regression. Prediction is now easy, because all we need to do is extend the line and read off the value.
4. Thus, to obtain a line of regression, we need a line of best fit. But statisticians do not measure the distances by dropping perpendiculars from the points onto the line.
5. They measure deviations (or errors, or residuals, as they are called) (i) vertically and (ii) horizontally. Thus we get two lines of regression, as shown in figures (1) and (2).
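In symbols, and assuming the "distance" being minimized is the sum of squared deviations (the least-squares criterion), the two lines come from two different minimization problems. The primed constants a' and b' are notation introduced here to keep the two lines apart.

```latex
\[
\text{y on x (vertical deviations):}\qquad
\min_{a,\,b}\ \sum_{i=1}^{n} \bigl(y_i - a - b\,x_i\bigr)^{2}
\]
\[
\text{x on y (horizontal deviations):}\qquad
\min_{a',\,b'}\ \sum_{i=1}^{n} \bigl(x_i - a' - b'\,y_i\bigr)^{2}
\]
```

Because the two sums measure errors in different directions, they generally give two different lines; the lines coincide only when the points lie exactly on one straight line.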
Lines of Regression
The lines of regression are obtained
(I) Graphically, by the scatter plot
(II) Mathematically, by the method of least squares.
Lines of Regression: Graphically, by the Scatter Plot Method
(1) Line of regression of y on x
Its form is y = a + b x
It is used to estimate y when x is given.
Here a is the intercept of the line and b is the slope of the line of y on x.
(2) Line of regression of x on y
Its form is x = a + b y
It is used to estimate x when y is given.
Here a is the intercept of the line and b is the slope of the line of x on y.
Lines of Regression: by the Method of Least Squares
Line of regression of y on x
Where
Line of regression of x on y
Where
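A minimal sketch of the least-squares calculation, assuming the standard results: the line of y on x is y - ȳ = b_yx (x - x̄) with b_yx = Σxy / Σx², and the line of x on y is x - x̄ = b_xy (y - ȳ) with b_xy = Σxy / Σy², where x and y inside the sums are deviations from the means. The function name regression_lines is hypothetical, and the data are the father and son heights from the correlation example.

```python
def regression_lines(xs, ys):
    """Return ((a, b) for y = a + b*x, (a, b) for x = a + b*y) by least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    sum_xy = sum(a * b for a, b in zip(dx, dy))
    sum_x2 = sum(a * a for a in dx)
    sum_y2 = sum(b * b for b in dy)

    b_yx = sum_xy / sum_x2          # slope of y on x (vertical deviations)
    b_xy = sum_xy / sum_y2          # slope of x on y (horizontal deviations)
    a_yx = mean_y - b_yx * mean_x   # both lines pass through (mean_x, mean_y)
    a_xy = mean_x - b_xy * mean_y
    return (a_yx, b_yx), (a_xy, b_xy)


father = [165, 166, 167, 168, 167, 169, 170, 172]
son = [167, 168, 165, 172, 168, 172, 169, 171]

(a_yx, b_yx), (a_xy, b_xy) = regression_lines(father, son)
print(f"y on x: y = {a_yx:.2f} + {b_yx:.4f} x")
print(f"x on y: x = {a_xy:.2f} + {b_xy:.4f} y")
print(f"b_yx * b_xy = {b_yx * b_xy:.4f}")
```

For these data the product of the two slopes is 576/1584, about 0.3636, which equals the square of r ≈ 0.603 obtained in the correlation example; this is the usual link between the regression coefficients and the correlation coefficient.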
Example on Regression by the Method of Least Squares
A panel of two judges, A and B, graded dramatic performances by independently awarding marks as follows:
Solution: