Post on 08-Apr-2018
8/6/2019 Cent Tend SD Corr Reg
http://slidepdf.com/reader/full/cent-tend-sd-corr-reg 1/69
Measurement of central tendency
Measurement of dispersion
Correlation
Regression
Statistical methods
Data & its types
Definition of data: facts, figures, enumerations & other materials, past and present, serving as the basis for study and analysis. They are the raw material for analysis and provide the basis for testing hypotheses and developing scales and tables.
Data help researchers draw inferences on specific issues/problems.
Quality of findings depends on the relevance, adequacy & reliability of the data.
Types of data (not in the statistical sense)
A.
1. Personal data (individual as a source)
   Demographic & socio-economic characteristics
   Behaviour variables: attitudes, behaviour, opinions; awareness, preferences, knowledge; practices, intentions
2. Organisational data (organisational sources): archives, manuscript libraries, museums
3. Territorial data: economic structure, occupation pattern
B. I Secondary (paper method)
Methods & Techniques of Data Collection
I-Secondary data
How to scrutinize
Published & unpublished
Methods where used
A-Meta analysis
B- Historical method
C-Content analysis
D-Informetrics
E-Use studies
II-Primary data
A-Records & relics
B-Observation
C-Experimentation
D-Simulation
E-Ask people orally
F-Ask people in writing
G-Panel study
H-Projective techniques
I-Sociometry
J-Case study
-Interview / Depth interview / Schedule
-Mail survey / questionnaire
-Mechanical devices
Data Sources
Primary Data
Secondary Data
1. Internet sites / web pages of different companies and organizations
2. Central and local govt. studies and reports
3. Rules on international trading, import and exports, state budgets
4. FICCI (Federation of Indian Chambers of Commerce and Industry), CII (Confederation of Indian Industry), ASSOCHAM (Associated Chambers of Commerce and Industry)
5. Policies on foreign direct investment
Skewness and Kurtosis: some examples
[Figure: two histograms with Frequency on the vertical axis, one of Educational Attainment (Std. Dev. = 1.81) and one of Reason for Termination; the remaining summary values are not legible in the source]
Pictogram
Annotated box plot
Describing Data Numerically
Central Tendency: Arithmetic Mean, Median, Mode
Variation: Range, Interquartile Range, Variance, Standard Deviation, Coefficient of Variation
Measures of Central Tendency
Overview
Mean: arithmetic average, $\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$
Median: midpoint of ranked values
Mode: most frequently observed value
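Arithmetic Mean
The arithmetic mean (mean) is the most common measure of central tendency
For a population of N values:
$$\mu = \frac{\sum_{i=1}^{N} x_i}{N} = \frac{x_1 + x_2 + \cdots + x_N}{N}$$
where N is the population size and the $x_i$ are the population values
For a sample of size n:
$$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} = \frac{x_1 + x_2 + \cdots + x_n}{n}$$
where n is the sample size and the $x_i$ are the observed values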
Arithmetic Mean (continued)
The most common measure of central tendency
Mean = sum of values divided by the number of values
Affected by extreme values (outliers)
Values 1, 2, 3, 4, 5: $\frac{1+2+3+4+5}{5} = \frac{15}{5} = 3$, so Mean = 3
Values 1, 2, 3, 4, 10: $\frac{1+2+3+4+10}{5} = \frac{20}{5} = 4$, so Mean = 4
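The effect of a single extreme value on the mean can be checked directly. The sketch below uses the two five-value data sets from the slide; the variable names are illustrative only.

```python
from statistics import mean

# Two small data sets that differ only in their largest value.
without_outlier = [1, 2, 3, 4, 5]
with_outlier = [1, 2, 3, 4, 10]

mean_without = mean(without_outlier)  # 15 / 5
mean_with = mean(with_outlier)        # 20 / 5

print(mean_without, mean_with)
```

Replacing 5 by 10 moves the mean from 3 to 4 even though four of the five values are unchanged.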
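Median
In an ordered list, the median is the "middle" number (50% above, 50% below)
Not affected by extreme values
For grouped data: Median = L + [(N/2 - C)/f] h = Q2, where L is the lower limit of the median class, N the total frequency, C the cumulative frequency before the median class, f the frequency of the median class and h the class width
Example use: compare the knowledge level in two subjects for a group of students by the median
[Figure: two dot plots on a 0 to 10 axis; Median = 3 in both]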
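Quartiles, Deciles, Percentiles
Similar to the median, which divides the data into two parts, quartiles divide the data into four parts, deciles into ten parts and percentiles into 100 parts.
Empirical relation: Mode = 3 Median - 2 Mean
$$Q_j = L + \left(\frac{jN/4 - C}{f}\right) h, \quad j = 1, 2, 3$$
$$D_j = L + \left(\frac{jN/10 - C}{f}\right) h, \quad j = 1, 2, \ldots, 9$$
$$P_j = L + \left(\frac{jN/100 - C}{f}\right) h, \quad j = 1, 2, \ldots, 99$$
where L is the lower limit of the class containing the required value, N the total frequency, C the cumulative frequency of the preceding class, f the frequency of that class and h the class width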
Finding the Median
The location of the median:
$$\text{Median position} = \frac{n+1}{2} \text{ position in the ordered data}$$
If the number of values is odd, the median is the middle number
If the number of values is even, the median is the average of the two middle numbers
Note that $\frac{n+1}{2}$ is not the value of the median, only the position of the median in the ranked data
Mode
A measure of central tendency
Value that occurs most often
Not affected by extreme values
Used for either numerical or categorical data
There may be several modes
For grouped data: Mode = L + [(f - f_{-1})/(2f - f_{-1} - f_1)] h, where f is the frequency of the modal class, f_{-1} the frequency before the modal class and f_1 the frequency after the modal class
[Figure: dot plot on a 0 to 14 axis with Mode = 9; dot plot on a 0 to 6 axis with No Mode]
Review Example
Five houses on a hill by the beach
House Prices:
$2,000,000
500,000
300,000
100,000
100,000
Review Example: Summary Statistics
House Prices:
$2,000,000
500,000
300,000
100,000
100,000
Sum 3,000,000
Mean: ($3,000,000/5) = $600,000
Median: middle value of ranked data = $300,000
Mode: most frequent value = $100,000
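The three summary statistics for the house prices can be reproduced with Python's standard `statistics` module. This is only a quick check of the slide's figures; the variable names are illustrative.

```python
from statistics import mean, median, mode

# House prices from the review example, in dollars.
prices = [2_000_000, 500_000, 300_000, 100_000, 100_000]

price_mean = mean(prices)      # sum of values / number of values
price_median = median(prices)  # middle value of the ranked data
price_mode = mode(prices)      # most frequent value

print(price_mean, price_median, price_mode)
```

The single $2,000,000 house pulls the mean well above the median, which is why the median is often preferred for prices.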
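Example
Ungrouped data (three series):
Series 1: 5, 9, 7, 9, 10, 9, 5
Series 2: 1, 2, 3, 4, 5, 7, 7
Series 3: 19, 20, 22, 22, 17

                   Series 1    Series 2    Series 3
Mean               7.714286    4.142857    20
Mode               9           7           22
Median             9           4           20
Sample variance    4.238095    5.47619     4.5

Grouped data:
Class    Frequency    C.F. (less than)    C.F. (more than)
5-10     5            5                   49
10-15    6            11                  44
15-20    15           26                  38
20-25    10           36                  23
25-30    5            41                  13
30-35    4            45                  8
35-40    2            47                  4
40-45    2            49                  2

Median = L + [(N/2 - C)/f] h
Mode = L + [(f - f_{-1})/(2f - f_{-1} - f_1)] h
Median class = class in which the cumulative frequency first reaches (total frequency)/2; modal class = class with the maximum frequency
Median class = 15-20, i.e. N/2 = 24.5 is first reached at 26 in the cumulative frequency
Modal class = 15-20, i.e. 15 is the maximum frequency
Median = 15 + [((1/2)49 - 11)/15] x 5 = 19.5
Mode = 15 + [(15 - 6)/(2 x 15 - 6 - 10)] x 5 = 18.21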
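The grouped-data formulas for the median and mode can be sketched as two small functions. The class table is taken from the example; the function names and the `(lower, upper, frequency)` layout are assumptions made for illustration.

```python
# Class table from the example: (lower limit, upper limit, frequency).
classes = [(5, 10, 5), (10, 15, 6), (15, 20, 15), (20, 25, 10),
           (25, 30, 5), (30, 35, 4), (35, 40, 2), (40, 45, 2)]


def grouped_median(classes):
    """Median = L + ((N/2 - C) / f) * h for the median class."""
    total = sum(f for _, _, f in classes)
    cumulative = 0  # C: cumulative frequency before the current class
    for lower, upper, f in classes:
        if cumulative + f >= total / 2:
            return lower + (total / 2 - cumulative) / f * (upper - lower)
        cumulative += f


def grouped_mode(classes):
    """Mode = L + (f - f_before) / (2f - f_before - f_after) * h."""
    freqs = [f for _, _, f in classes]
    k = freqs.index(max(freqs))  # index of the modal class
    lower, upper, f = classes[k]
    before = freqs[k - 1] if k > 0 else 0
    after = freqs[k + 1] if k + 1 < len(freqs) else 0
    return lower + (f - before) / (2 * f - before - after) * (upper - lower)


median_value = grouped_median(classes)
mode_value = grouped_mode(classes)
print(median_value, round(mode_value, 2))
```

Both values fall in the 15-20 class, which is the median class and the modal class for this table.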
Which measure of location is the "best"?
Mean is generally used, unless extreme values (outliers) exist
Then median is often used, since the median is not sensitive to extreme values.
Example: Median home prices may be reported for a region (less sensitive to outliers)
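Geometric mean & Harmonic mean
Geometric mean is the nth root of the product of n observations (e.g. average percent increase in sales, production); best suited to constructing index numbers.
$$\text{G.M.} = \operatorname{antilog}\left(\frac{\sum \log X}{N}\right)$$
Combined geometric mean of two groups:
$$\log G = \frac{N_1 \log G_1 + N_2 \log G_2}{N_1 + N_2}$$
Harmonic mean: restricted use, such as the average rate of increase of profits or the average price at which an article has been sold
$$\text{H.M.} = \frac{N}{\sum \left(\frac{1}{X}\right)}, \qquad \text{H.M.} = \frac{N}{\sum \left(\frac{f}{X}\right)} \text{ for frequency data}$$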
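A short sketch of both means follows, using `statistics.geometric_mean` and `statistics.harmonic_mean` (available from Python 3.8). The growth factors and prices are hypothetical values chosen for illustration; they do not come from the slides.

```python
from math import log10
from statistics import geometric_mean, harmonic_mean

# Hypothetical yearly growth factors (10%, 20% and 30% increases).
growth_factors = [1.10, 1.20, 1.30]

gm = geometric_mean(growth_factors)
# Same result via G.M. = antilog(sum(log X) / N), as in the formula above.
gm_via_logs = 10 ** (sum(log10(x) for x in growth_factors) / len(growth_factors))

# Hypothetical prices per unit when equal amounts of money are spent at each price.
prices = [40, 60]
hm = harmonic_mean(prices)  # N / sum(1/X) = 2 / (1/40 + 1/60)

print(round(gm, 4), round(gm_via_logs, 4), hm)
```

The harmonic mean of 40 and 60 is 48, lower than the arithmetic mean of 50, because more units are bought at the lower price.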
Measures of Variability
Variation: Range, Interquartile Range, Variance, Standard Deviation, Coefficient of Variation
Measures of variation give information on the spread or variability of the data values.
[Figure: two distributions with the same center but different variation]
Range
Simplest measure of variation
Difference between the largest and the smallest observations:
Range = X_largest - X_smallest
Example: [dot plot on a 0 to 14 axis]
Range = 14 - 1 = 13
Disadvantages of the Range
Ignores the way in which data are distributed
[Figure: two dot plots on a 7 to 12 axis with different distributions; in both, Range = 12 - 7 = 5]
Sensitive to outliers
1,1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,3,3,3,3,4,5: Range = 5 - 1 = 4
1,1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,3,3,3,3,4,120: Range = 120 - 1 = 119
Interquartile Range
Can eliminate some outlier problems by using the interquartile range.
Eliminate high- and low-valued observations and calculate the range of the middle 50% of the data
Interquartile range = 3rd quartile - 1st quartile
IQR = Q3 - Q1
$$\text{Coefficient of range} = \frac{L - S}{L + S}$$
where L and S are the largest and smallest observations
$$\text{Coefficient of quartile deviation} = \frac{Q_3 - Q_1}{Q_3 + Q_1}$$
Interquartile Range
Example:
X minimum = 12, Q1 = 30, Median (Q2) = 45, Q3 = 57, X maximum = 70 (25% of the observations lie in each of the four segments)
Interquartile range = 57 - 30 = 27
Quartiles
Quartiles split the ranked data into 4 segments with an equal number of values per segment (25% each, separated by Q1, Q2 and Q3)
The first quartile, Q1, is the value for which 25% of the observations are smaller and 75% are larger
Q2 is the same as the median (50% are smaller, 50% are larger)
Only 25% of the observations are greater than the third quartile
Quartile Formulas
Find a quartile by determining the value in the appropriate position in the ranked data, where
First quartile position: Q1 = 0.25(n+1)
Second quartile position: Q2 = 0.50(n+1) (the median position)
Third quartile position: Q3 = 0.75(n+1)
where n is the number of observed values
Quartiles
Example: Find the first quartile
Sample Ranked Data: 11 12 13 16 16 17 18 21 22
(n = 9)
Q1 is in the 0.25(9+1) = 2.5 position of the ranked data,
so use the value halfway between the 2nd and 3rd values,
so Q1 = 12.5
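The position rule can be sketched as a small helper that interpolates linearly when the position is fractional. The helper name `quartile_by_position` is an assumption for illustration; it implements the p(n+1) rule described here, not any particular library's default.

```python
def quartile_by_position(ranked, p):
    """Value at 1-based position p * (n + 1) in ranked data, interpolating linearly."""
    n = len(ranked)
    position = p * (n + 1)
    lower_rank = int(position)        # 1-based rank just below the position
    fraction = position - lower_rank  # how far to move towards the next value
    if lower_rank < 1:
        return ranked[0]
    if lower_rank >= n:
        return ranked[-1]
    below = ranked[lower_rank - 1]
    above = ranked[lower_rank]
    return below + fraction * (above - below)


ranked = [11, 12, 13, 16, 16, 17, 18, 21, 22]
q1 = quartile_by_position(ranked, 0.25)  # position 2.5
q2 = quartile_by_position(ranked, 0.50)  # position 5.0
q3 = quartile_by_position(ranked, 0.75)  # position 7.5
print(q1, q2, q3)
```

Other quartile conventions exist and can give slightly different values for small samples, so results from software should be compared with the rule it documents.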
Variance = square of S.D.
Standard deviation
SD (from actual mean):
$$\sigma = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{N}}$$
SD (assumed mean A), with $d = x - A$:
$$\sigma = \sqrt{\frac{\sum d^2}{N} - \left(\frac{\sum d}{N}\right)^2}$$
and for frequency data:
$$\sigma = \sqrt{\frac{\sum f d^2}{N} - \left(\frac{\sum f d}{N}\right)^2}$$
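Sample Standard Deviation
Most commonly used measure of variation
Shows variation about the mean
Has the same units as the original data
Sample standard deviation:
$$s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}}$$
(the population standard deviation divides by N instead of n - 1)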
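Calculation Example: Sample Standard Deviation
Sample Data (x_i): 10 12 14 15 17 18 18 24
n = 8, Mean = $\bar{x}$ = 16
$$s = \sqrt{\frac{(10-\bar{x})^2 + (12-\bar{x})^2 + (14-\bar{x})^2 + \cdots + (24-\bar{x})^2}{n-1}}$$
$$= \sqrt{\frac{(10-16)^2 + (12-16)^2 + (14-16)^2 + \cdots + (24-16)^2}{8-1}} = \sqrt{\frac{130}{7}} = 4.3095$$
A measure of the "average" scatter around the mean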
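The sample standard deviation for these eight values can be worked through step by step and compared with `statistics.stdev`, which also divides by n - 1. Variable names are illustrative. For this data the squared deviations from the mean of 16 sum to 130.

```python
from math import sqrt
from statistics import stdev

sample = [10, 12, 14, 15, 17, 18, 18, 24]

n = len(sample)
x_bar = sum(sample) / n                                   # sample mean
squared_deviations = sum((x - x_bar) ** 2 for x in sample)
s = sqrt(squared_deviations / (n - 1))                    # divide by n - 1 for a sample

print(x_bar, squared_deviations, round(s, 4))
print(round(stdev(sample), 4))                            # library value for comparison
```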
SD Example
Size x    Freq f    d = x - 9    fd     fd^2
6         3         -3           -9     27
7         6         -2           -12    24
8         9         -1           -9     9
9         13        0            0      0
10        8         1            8      8
11        5         2            10     20
12        4         3            12     36
Total     48                            124

$$\text{S.D.} = \sqrt{\frac{\sum_{i=1}^{n} f_i (x_i - \bar{x})^2}{\sum f_i}} = \sqrt{\frac{\sum f d^2}{\sum f}} = \sqrt{\frac{124}{48}} = 1.607$$
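The frequency-table calculation can be sketched with the assumed-mean formula. Sizes and frequencies are those of the table (the size for the row with f = 9 and d = -1 is 8); variable names are illustrative.

```python
from math import sqrt

sizes = [6, 7, 8, 9, 10, 11, 12]
freqs = [3, 6, 9, 13, 8, 5, 4]
assumed_mean = 9  # A in d = x - A

N = sum(freqs)
sum_fd = sum(f * (x - assumed_mean) for x, f in zip(sizes, freqs))
sum_fd2 = sum(f * (x - assumed_mean) ** 2 for x, f in zip(sizes, freqs))

# SD = sqrt(sum(f d^2)/N - (sum(f d)/N)^2)
sd = sqrt(sum_fd2 / N - (sum_fd / N) ** 2)

print(N, sum_fd, sum_fd2, round(sd, 3))
```

Because the sum of fd is zero, the assumed mean 9 equals the actual mean and the correction term vanishes.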
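SD
Who is the better scorer & who is more consistent?
Deviations are taken from the assumed mean 51: d1 = x - 51, d2 = y - 51

Batsman A = x    Batsman B = y
12               47
115              12
6                16
73               42
7                4
19               51
119              37
36               48
84               13
29               0

Totals: sum of d1 = -10, sum of d1^2 = 17508; sum of d2 = -240, sum of d2^2 = 9302

$$\text{Mean}_A = 50, \quad \text{S.D.}_A = \sqrt{\frac{\sum d_1^2}{n} - \left(\frac{\sum d_1}{n}\right)^2} = \sqrt{1750.8 - (-1)^2} = 41.8$$
$$\text{Mean}_B = 27, \quad \text{S.D.}_B = \sqrt{\frac{\sum d_2^2}{n} - \left(\frac{\sum d_2}{n}\right)^2} = \sqrt{930.2 - (-24)^2} = 18.8$$
$$\text{C.V.}_A = (41.8/50) \times 100 = 83.6\%, \quad \text{C.V.}_B = (18.8/27) \times 100 = 69.6\%$$
A has the higher mean, so A is the better scorer; B has the lower coefficient of variation, so B is more consistent.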
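The comparison of the two batsmen can be checked with a short script. It computes the population standard deviation (dividing by n, as the slide's formula does) and the coefficient of variation; the helper name is an assumption for illustration.

```python
from statistics import mean, pstdev

batsman_a = [12, 115, 6, 73, 7, 19, 119, 36, 84, 29]
batsman_b = [47, 12, 16, 42, 4, 51, 37, 48, 13, 0]


def coefficient_of_variation(values):
    """Population SD expressed as a percentage of the mean."""
    return pstdev(values) / mean(values) * 100


mean_a, mean_b = mean(batsman_a), mean(batsman_b)
cv_a = coefficient_of_variation(batsman_a)
cv_b = coefficient_of_variation(batsman_b)

print(mean_a, round(cv_a, 1))
print(mean_b, round(cv_b, 1))
```

Computed without intermediate rounding, the coefficients come out near 83.7% and 69.7%; rounding the standard deviations to 41.8 and 18.8 first gives 83.6% and 69.6%. Either way A has the higher average and B the smaller relative variation.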
Measuring variation
Small standard deviation
Large standard deviation
Comparing Standard Deviations
Data A: Mean = 15.5, s = 3.338
Data B: Mean = 15.5, s = 0.926
Data C: Mean = 15.5, s = 4.570
[Figure: three dot plots on an 11 to 21 axis, one per data set]
Advantages of Variance and Standard Deviation
Each value in the data set is used in the calculation
Values far from the mean are given extra weight
(because deviations from the mean are squared)
The Empirical Rule
If the data distribution is bell-shaped, then the interval:
$\bar{x} \pm 1s$ contains about 68% of the values in the population or the sample
$\bar{x} \pm 2s$ contains about 95% of the values in the population or the sample
$\bar{x} \pm 3s$ contains about 99.7% of the values in the population or the sample
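The three percentages can be checked against an exact normal distribution using `statistics.NormalDist` (Python 3.8+). This assumes a perfectly normal, bell-shaped distribution; real data will only match approximately.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)


def share_within(k):
    """Probability of falling within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


shares = {k: share_within(k) for k in (1, 2, 3)}
for k, share in shares.items():
    print(k, round(share * 100, 1))
```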
Coefficient of Variation
Measures relative variation
Always in percentage (%)
Shows variation relative to mean
Can be used to compare two or more sets of data measured in different units
$$CV = \left(\frac{s}{\bar{x}}\right) \cdot 100\%$$
Comparing Coefficient of Variation
Stock A:
Average price last year = $50
Standard deviation = $5
$$CV_A = \left(\frac{s}{\bar{x}}\right) \cdot 100\% = \frac{\$5}{\$50} \cdot 100\% = 10\%$$
Stock B:
Average price last year = $100
Standard deviation = $5
$$CV_B = \left(\frac{s}{\bar{x}}\right) \cdot 100\% = \frac{\$5}{\$100} \cdot 100\% = 5\%$$
Both stocks have the same standard deviation, but stock B is less variable relative to its price
Approximations for Grouped Data
Suppose a data set contains values m_1, m_2, ..., m_K, occurring with frequencies f_1, f_2, ..., f_K
For a population of N observations the mean is
$$\mu = \frac{\sum_{i=1}^{K} f_i m_i}{N}, \quad \text{where } N = \sum_{i=1}^{K} f_i$$
For a sample of n observations, the mean is
$$\bar{x} = \frac{\sum_{i=1}^{K} f_i m_i}{n}, \quad \text{where } n = \sum_{i=1}^{K} f_i$$
Shape of a Distribution
Describes how data are distributed
Measures of shape: symmetric or skewed
Left-Skewed: Mean < Median
Symmetric: Mean = Median
Right-Skewed: Median < Mean
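Moments
Moments are defined as
$$\text{rth moment about the mean: } \mu_r = \frac{1}{N} \sum_{i=1}^{n} f_i (x_i - \bar{x})^r$$
$$\text{rth moment about an arbitrary point } A\text{: } \mu'_r = \frac{1}{N} \sum_{i=1}^{n} f_i (x_i - A)^r$$
we get
$$\mu_2 = \mu'_2 - \mu'^2_1$$
$$\mu_3 = \mu'_3 - 3\mu'_2 \mu'_1 + 2\mu'^3_1$$
$$\mu_4 = \mu'_4 - 4\mu'_3 \mu'_1 + 6\mu'_2 \mu'^2_1 - 3\mu'^4_1$$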
Skewness
Skewness refers to lack of symmetry (about the mean). Measures of variation tell how much the values differ; skewness tells the direction of the asymmetry, i.e. left-skewed or right-skewed.
Karl Pearson coefficient of skewness = (Mean - Mode)/Standard deviation
Bowley's or quartile coefficient of skewness:
$$\text{Bowley skewness} = \frac{Q_3 + Q_1 - 2\,\text{med}}{Q_3 - Q_1}$$
Coefficient of skewness based on the third moment:
$$\beta_1 = \frac{\mu_3^2}{\mu_2^3}, \quad \gamma_1 = \sqrt{\beta_1}, \quad \text{where } \beta_1 \text{ is always positive}$$
Moments
From the given data find the first four moments about origin
Monthly profit    No. of companies (f)
Less than 7.5     4
7.5-12.5          10
12.5-17.5         20
17.5-22.5         36
22.5-27.5         16
27.5-32.5         12
32.5-37.5         2
SD
Find the first four moments about origin
Monthly profit    Mid point X    No. of companies (f)    d = (X-20)/5    fd     fd^2    fd^3    fd^4
Less than 7.5     5              4                       -3              -12    36      -108    324
7.5-12.5          10             10                      -2              -20    40      -80     160
12.5-17.5         15             20                      -1              -20    20      -20     20
17.5-22.5         20             36                      0               0      0       0       0
22.5-27.5         25             16                      1               16     16      16      16
27.5-32.5         30             12                      2               24     48      96      192
32.5-37.5         35             2                       3               6      18      54      162
Total                            N=100                                   -6     178     -42     874
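Moments about arbitrary mean and mean
With class width i = 5 and arbitrary mean A = 20:
$$\mu'_1 = \frac{\sum f d}{N} \times i = \frac{-6}{100} \times 5 = -0.3, \qquad \mu'_2 = \frac{\sum f d^2}{N} \times i^2 = \frac{178}{100} \times 25 = 44.5,$$
$$\mu'_3 = \frac{\sum f d^3}{N} \times i^3 = \frac{-42}{100} \times 125 = -52.5, \qquad \mu'_4 = \frac{\sum f d^4}{N} \times i^4 = \frac{874}{100} \times 625 = 5462.5$$
$$\mu_2 = 44.5 - (-0.3)^2 = 44.41,$$
$$\mu_3 = -52.5 - 3(44.5)(-0.3) + 2(-0.3)^3 = -12.504,$$
$$\mu_4 = 5462.5 - 4(-52.5)(-0.3) + 6(44.5)(-0.3)^2 - 3(-0.3)^4 = 5423.5057$$
Skewness:
$$\beta_1 = \frac{\mu_3^2}{\mu_2^3}, \quad \gamma_1 = \sqrt{\beta_1} = 0.0422$$
Kurtosis:
$$\beta_2 = \frac{\mu_4}{\mu_2^2} = 2.75, \quad \gamma_2 = \beta_2 - 3 = -0.25$$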
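The moment calculations for the monthly-profit table can be reproduced in a few lines. The midpoints and frequencies are those of the table, with arbitrary mean 20 and class width 5; the variable names are illustrative.

```python
midpoints = [5, 10, 15, 20, 25, 30, 35]
freqs = [4, 10, 20, 36, 16, 12, 2]
arbitrary_mean, class_width = 20, 5

N = sum(freqs)
d = [(x - arbitrary_mean) / class_width for x in midpoints]

# Moments about the arbitrary mean: (sum(f d^r) / N) * width^r for r = 1..4.
m1, m2, m3, m4 = (
    class_width ** r * sum(f * di ** r for f, di in zip(freqs, d)) / N
    for r in (1, 2, 3, 4)
)

# Moments about the actual mean, from the conversion formulas.
mu2 = m2 - m1 ** 2
mu3 = m3 - 3 * m2 * m1 + 2 * m1 ** 3
mu4 = m4 - 4 * m3 * m1 + 6 * m2 * m1 ** 2 - 3 * m1 ** 4

beta1 = mu3 ** 2 / mu2 ** 3  # skewness measure
beta2 = mu4 / mu2 ** 2       # kurtosis measure

print(m1, m2, m3, m4)
print(round(mu2, 2), round(mu3, 3), round(mu4, 4))
print(round(beta1 ** 0.5, 4), round(beta2, 2))
```

Since the third central moment is negative the distribution is very slightly left-skewed, and a beta2 just under 3 makes it slightly platykurtic.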
Kurtosis
Kurtosis refers to the bulginess, or degree of flatness or peakedness, of a distribution.
More peaked than normal: leptokurtic
Less peaked than normal: platykurtic
The normal curve is mesokurtic
$$\beta_2 = \frac{\mu_4}{\mu_2^2}, \quad \text{kurtosis } \gamma_2 = \beta_2 - 3$$
Leptokurtic: $\beta_2 > 3$
Platykurtic: $\beta_2 < 3$
Scatter Plots of Data with Various Correlation Coefficients
[Figure: six scatter plots of Y against X, with r = -1, r = -.6, r = 0, r = +.3, r = +1 and a second panel with r = 0]
Correlation
Correlation helps in determining the degree of relationship between two or more variables. However, it does not tell us about a cause-effect relationship.
Methods: scatter diagram, Karl Pearson coefficient of correlation, Spearman's rank correlation
Karl Pearson formula:
$$r = \frac{\sum (X - \bar{X})(Y - \bar{Y})}{\sqrt{\sum (X - \bar{X})^2 \sum (Y - \bar{Y})^2}}$$
Ex: Psychological tests of intelligence ratio and engineering ability ratio of 10 students gave the following results. Calculate the coefficient of correlation. (Mean x = xbar = 99, Mean y = ybar = 98)
Student    Intelligence ratio x    X = x-xbar    X^2    Engg ratio y    Y = y-ybar    Y^2    XY
A          105                     6             36     101             3             9      18
B          104                     5             25     103             5             25     25
C          102                     3             9      100             2             4      6
D          101                     2             4      98              0             0      0
E          100                     1             1      95              -3            9      -3
F          99                      0             0      96              -2            4      0
G          98                      -1            1      104             6             36     -6
H          96                      -3            9      92              -6            36     18
I          93                      -6            36     97              -1            1      6
J          92                      -7            49     94              -4            16     28
TOTAL      990                     0             170    980             0             140    92

$$r = \frac{\sum XY}{\sqrt{\sum X^2 \times \sum Y^2}} = \frac{92}{\sqrt{170 \times 140}} \approx 0.596$$
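The Karl Pearson coefficient for the ten students can be recomputed from the raw scores. This is a minimal sketch using deviations from the means; variable names are illustrative.

```python
from math import sqrt

intelligence = [105, 104, 102, 101, 100, 99, 98, 96, 93, 92]
engineering = [101, 103, 100, 98, 95, 96, 104, 92, 97, 94]

x_bar = sum(intelligence) / len(intelligence)  # 99
y_bar = sum(engineering) / len(engineering)    # 98

# Deviations from the means.
X = [x - x_bar for x in intelligence]
Y = [y - y_bar for y in engineering]

sum_xy = sum(a * b for a, b in zip(X, Y))
sum_x2 = sum(a * a for a in X)
sum_y2 = sum(b * b for b in Y)

r = sum_xy / sqrt(sum_x2 * sum_y2)
print(sum_xy, sum_x2, sum_y2, round(r, 3))
```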
Correlation of bivariate grouped data
When frequency data is given:
$$r = \frac{N \sum f d_x d_y - \sum f d_x \sum f d_y}{\sqrt{N \sum f d_x^2 - \left(\sum f d_x\right)^2}\,\sqrt{N \sum f d_y^2 - \left(\sum f d_y\right)^2}}$$
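Spearman Rank Correlation Ex:
Persons    Rank in stat R1    Rank in income R2    D = R1-R2    D squared
A          9                  1                    8            64
B          10                 2                    8            64
C          6                  3                    3            9
D          5                  4                    1            1
E          7                  5                    2            4
F          2                  6                    -4           16
G          4                  7                    -3           9
H          8                  8                    0            0
I          1                  9                    -8           64
J          3                  10                   -7           49
Total                                                           280

$$r = 1 - \frac{6 \sum D^2}{n(n^2 - 1)} = 1 - \frac{6 \times 280}{10(10^2 - 1)} = -0.697$$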
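The rank correlation can be verified with the formula r = 1 - 6 * sum(D^2) / (n(n^2 - 1)), which applies when there are no tied ranks, as in this table. Variable names are illustrative.

```python
rank_stat = [9, 10, 6, 5, 7, 2, 4, 8, 1, 3]
rank_income = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

n = len(rank_stat)
# Sum of squared rank differences D = R1 - R2.
sum_d2 = sum((r1 - r2) ** 2 for r1, r2 in zip(rank_stat, rank_income))

rho = 1 - 6 * sum_d2 / (n * (n ** 2 - 1))
print(sum_d2, round(rho, 3))
```

A value near -0.7 indicates that people ranked high in statistics tend to be ranked low in income in this small sample.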
Features of Correlation Coefficient, r
Unit free
Ranges between -1 and 1
The closer to -1, the stronger the negative linear relationship
The closer to 1, the stronger the positive linear relationship
The closer to 0, the weaker the linear relationship
Interpreting the Result
r = .733
There is a relatively strong positive linear relationship between test score #1 and test score #2
Students who scored high on the first test tended to score high on the second test
[Figure: Scatter Plot of Test Scores, Test #2 Score against Test #1 Score, both axes from 70 to 100]
Obtaining Linear Relationships, i.e. regression
An equation can be fit to show the best linear relationship between two variables:
Y = a + bX
where Y is the dependent variable and X is the independent variable
Normal equations for the regression line of y on x:
$$\sum y = na + b \sum x,$$
$$\sum xy = a \sum x + b \sum x^2.$$
Regression example
Regression line of x on y:
$$x = a + by, \quad \sum x = na + b \sum y, \quad \sum xy = a \sum y + b \sum y^2$$
s.n.    x     y     x^2    y^2    xy
1       1     2     1      4      2
2       2     5     4      25     10
3       3     3     9      9      9
4       4     8     16     64     32
5       5     7     25     49     35
n=5     15    25    55     151    88
Regression example
Regression line of y on x:
$$y = a + bx, \quad \sum y = na + b \sum x, \quad \sum xy = a \sum x + b \sum x^2$$
s.n.    x     y     x^2    y^2    xy
1       1     2     1      4      2
2       2     5     4      25     10
3       3     3     9      9      9
4       4     8     16     64     32
5       5     7     25     49     35
n=5     15    25    55     151    88
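Both regression lines for this five-point example can be obtained by solving the two normal equations directly. The helper `fit_line` is a hypothetical name used for illustration; it solves sum(v) = n*a + b*sum(u) and sum(uv) = a*sum(u) + b*sum(u^2) for a and b.

```python
def fit_line(u, v):
    """Solve the normal equations for the line v = a + b*u."""
    n = len(u)
    sum_u, sum_v = sum(u), sum(v)
    sum_uu = sum(ui * ui for ui in u)
    sum_uv = sum(ui * vi for ui, vi in zip(u, v))
    b = (n * sum_uv - sum_u * sum_v) / (n * sum_uu - sum_u ** 2)
    a = (sum_v - b * sum_u) / n
    return a, b


x = [1, 2, 3, 4, 5]
y = [2, 5, 3, 8, 7]

a_yx, b_yx = fit_line(x, y)  # regression line of y on x
a_xy, b_xy = fit_line(y, x)  # regression line of x on y

print(round(a_yx, 2), round(b_yx, 2))
print(round(a_xy, 2), round(b_xy, 2))
```

With the table's totals this gives y = 1.1 + 1.3x and x = 0.5 + 0.5y.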
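Regression Coefficient when deviations are taken from assumed mean
From the normal equation $\sum y = na + b \sum x$ on the previous slide, dividing by n:
$$\frac{\sum y}{n} = a + b \frac{\sum x}{n}, \quad \text{i.e. } \bar{y} = a + b\bar{x}.$$
This shows that the means lie on y = a + bx. Shifting the origin to (xbar, ybar), the equation
$$\sum xy = a \sum x + b \sum x^2$$
takes the form
$$\sum (x - \bar{x})(y - \bar{y}) = a \sum (x - \bar{x}) + b \sum (x - \bar{x})^2,$$
$$\sum (x - \bar{x})(y - \bar{y}) = 0 + b \sum (x - \bar{x})^2,$$
$$b = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sum (x - \bar{x})^2} = \frac{\sum XY}{\sum X^2} = \frac{\sum XY / n}{\sigma_x^2} = r \frac{\sigma_y}{\sigma_x}, \quad \text{since } r = \frac{\sum XY}{n \sigma_x \sigma_y},$$
$$\text{thus } (y - \bar{y}) = r \frac{\sigma_y}{\sigma_x} (x - \bar{x})$$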
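Regression coefficients of y on x and x on y
So we get the regression lines of y on x and x on y respectively:
$$(y - \bar{y}) = b_{yx} (x - \bar{x}), \quad b_{yx} = \frac{\sum XY}{\sum X^2} = r \frac{\sigma_y}{\sigma_x}, \quad \text{i.e. } (y - \bar{y}) = r \frac{\sigma_y}{\sigma_x} (x - \bar{x})$$
$$(x - \bar{x}) = b_{xy} (y - \bar{y}), \quad b_{xy} = \frac{\sum XY}{\sum Y^2} = r \frac{\sigma_x}{\sigma_y}, \quad \text{i.e. } (x - \bar{x}) = r \frac{\sigma_x}{\sigma_y} (y - \bar{y})$$
$$b_{yx} \times b_{xy} = r \frac{\sigma_y}{\sigma_x} \times r \frac{\sigma_x}{\sigma_y} = r^2$$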
Ex: From the following data find the two regression equations.
Sales       91    97    108    121    67    124    51    73    111    57
Purchase    71    75    69     97     70    91     39    61    80     47

Sales x    X = x-xbar = x-90    X^2     Purchase y    Y = y-ybar = y-70    Y^2     XY
91         1                    1       71            1                    1       1
97         7                    49      75            5                    25      35
108        18                   324     69            -1                   1       -18
121        31                   961     97            27                   729     837
67         -23                  529     70            0                    0       0
124        34                   1156    91            21                   441     714
51         -39                  1521    39            -31                  961     1209
73         -17                  289     61            -9                   81      153
111        21                   441     80            10                   100     210
57         -33                  1089    47            -23                  529     759
900        0                    6360    700           0                    2868    3900
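Regression coefficients and regression lines of y on x and x on y
Regression line of y on x:
$$b_{yx} = \frac{\sum XY}{\sum X^2} = \frac{3900}{6360} = 0.613,$$
$$(y - \bar{y}) = b_{yx}(x - \bar{x}), \quad (y - 70) = 0.613(x - 90), \quad y = 0.613x + 14.83$$
Regression line of x on y:
$$b_{xy} = \frac{\sum XY}{\sum Y^2} = \frac{3900}{2868} = 1.36,$$
$$(x - \bar{x}) = b_{xy}(y - \bar{y}), \quad (x - 90) = 1.36(y - 70), \quad x = 1.36y - 5.2$$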
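The two regression coefficients for the sales and purchase data can be recomputed from deviations about the means. This is a minimal sketch; variable names are illustrative.

```python
sales = [91, 97, 108, 121, 67, 124, 51, 73, 111, 57]
purchase = [71, 75, 69, 97, 70, 91, 39, 61, 80, 47]

x_bar = sum(sales) / len(sales)        # 90
y_bar = sum(purchase) / len(purchase)  # 70

# Deviations from the means.
X = [x - x_bar for x in sales]
Y = [y - y_bar for y in purchase]

sum_XY = sum(a * b for a, b in zip(X, Y))
sum_X2 = sum(a * a for a in X)
sum_Y2 = sum(b * b for b in Y)

b_yx = sum_XY / sum_X2  # slope of the regression line of y on x
b_xy = sum_XY / sum_Y2  # slope of the regression line of x on y

intercept_yx = y_bar - b_yx * x_bar
intercept_xy = x_bar - b_xy * y_bar

r_squared = b_yx * b_xy  # product of the two coefficients equals r^2

print(round(b_yx, 3), round(intercept_yx, 2))
print(round(b_xy, 2), round(intercept_xy, 2))
print(round(r_squared, 3))
```

Using unrounded slopes, the intercepts come out near 14.81 and -5.19; the hand calculation rounds the slopes to 0.613 and 1.36 first, which gives 14.83 and -5.2.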