Cent Tend SD Corr Reg

Measurement of central tendency Measurement of dispersion Correlation Regression Statistical methods

Transcript of Cent Tend SD Corr Reg

Page 1: Cent Tend SD Corr Reg

8/6/2019 Cent Tend SD Corr Reg

http://slidepdf.com/reader/full/cent-tend-sd-corr-reg 1/69

Measurement of central tendency

Measurement of dispersion

Correlation

Regression

Statistical methods

Page 2: Cent Tend SD Corr Reg

Data and its types

Definition of data: facts, figures, enumerations and other materials, past and present, serving as a basis for study and analysis. They are the raw material for analysis and provide a basis for testing hypotheses and developing scales and tables.

Data help researchers draw inferences on specific issues and problems.

The quality of findings depends on the relevance, adequacy and reliability of the data.

Types of data (not in the statistical sense)

A. 1. Personal data (individual as a source)
      Demographic and socio-economic characteristics
      Behaviour variables: attitudes, behaviour, opinions; awareness, preferences, knowledge; practices, intentions

   2. Organisational data (organisational sources)
      Archives, manuscript libraries, museums

   3. Territorial data
      Economic structure, occupation pattern

B. I. Secondary (paper method)

Page 3: Cent Tend SD Corr Reg

Methods & Techniques of Data Collection

I-Secondary data

How to scrutinize

Published & unpublished

Methods where used

A-Meta analysis

B-Historical method

C-Content analysis

D-Informetrics

E-Use studies

Page 4: Cent Tend SD Corr Reg

II-Primary data

A-Records & relics
B-Observation
C-Experimentation
D-Simulation
E-Ask people orally
F-Ask people in writing
G-Panel study
H-Projective techniques
I-Sociometry
J-Case study

-Interview / Depth interview / Schedule
-Mail survey / questionnaire
-Mechanical devices

Page 5: Cent Tend SD Corr Reg

Data Sources

Primary Data

Secondary Data
1. Internet sites / web pages of different companies and organizations
2. Central and local government studies and reports
3. Rules on international trading, imports and exports, state budgets
4. FICCI (Federation of Indian Chambers of Commerce and Industry), CII (Confederation of Indian Industry), ASSOCHAM (Associated Chambers of Commerce and Industry of India)
5. Policies on foreign direct investment

Page 6: Cent Tend SD Corr Reg

Skewness and Kurtosis: some examples

[Figure: histogram of Educational Attainment against frequency; Std. Dev. = 1.81]

[Figure: histogram of Reason for Termination against frequency]

Page 13: Cent Tend SD Corr Reg

Pictogram

Page 14: Cent Tend SD Corr Reg


Annotated box plot

Page 15: Cent Tend SD Corr Reg

Describing Data Numerically

Central Tendency
- Arithmetic Mean
- Median
- Mode

Variation
- Variance
- Standard Deviation
- Coefficient of Variation
- Range
- Interquartile Range

Page 16: Cent Tend SD Corr Reg

Measures of Central Tendency

Overview

Mean: arithmetic average, $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$

Median: midpoint of ranked values

Mode: most frequently observed value

Page 17: Cent Tend SD Corr Reg

Arithmetic Mean

The arithmetic mean (mean) is the most common measure of central tendency.

For a population of N values (N = population size, $x_i$ = population values):

$\mu = \frac{\sum_{i=1}^{N} x_i}{N} = \frac{x_1 + x_2 + \cdots + x_N}{N}$

For a sample of size n (n = sample size, $x_i$ = observed values):

$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} = \frac{x_1 + x_2 + \cdots + x_n}{n}$

Page 18: Cent Tend SD Corr Reg

Arithmetic Mean (continued)

The most common measure of central tendency

Mean = sum of values divided by the number of values

Affected by extreme values (outliers)

Values 1, 2, 3, 4, 5 (on a 0 to 10 axis): Mean = 3

$\frac{1 + 2 + 3 + 4 + 5}{5} = \frac{15}{5} = 3$

Values 1, 2, 3, 4, 10 (on a 0 to 10 axis): Mean = 4

$\frac{1 + 2 + 3 + 4 + 10}{5} = \frac{20}{5} = 4$
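The effect of a single extreme value on the mean can be checked directly. The following is a minimal sketch using the two five-value data sets from this slide; the helper name `mean` is only illustrative.

```python
def mean(values):
    """Arithmetic mean: sum of the values divided by the number of values."""
    return sum(values) / len(values)

base = [1, 2, 3, 4, 5]           # first data set on the slide
with_outlier = [1, 2, 3, 4, 10]  # same data with 5 replaced by 10

mean_base = mean(base)              # 15 / 5
mean_outlier = mean(with_outlier)   # 20 / 5
print(mean_base, mean_outlier)
```

Changing one value out of five moves the mean by a full unit, which is the sensitivity to outliers the slide describes.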

Page 19: Cent Tend SD Corr Reg

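Median

In an ordered list, the median is the "middle" number (50% above, 50% below).

Not affected by extreme values.

For grouped data: Median = Q2 = L + [(N/2 - C)/f]h, where L is the lower limit of the median class, N the total frequency, C the cumulative frequency of the class before the median class, f the frequency of the median class and h the class width.

Example use: compare the knowledge level in two subjects for a group of students by the median.

[Dot plot on a 0 to 10 axis] Median = 3

[Dot plot on a 0 to 10 axis, with one extreme value] Median = 3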

Page 20: Cent Tend SD Corr Reg

Quartiles, Deciles, Percentiles

Just as the median divides the data into two parts, quartiles divide the data into four parts, deciles into ten parts and percentiles into 100 parts.

Empirical relation: Mode = 3 Median - 2 Mean

For grouped data, with L the lower limit of the class containing the required value, N the total frequency, C the cumulative frequency of the preceding class (p.c.f.), f the frequency of that class and h the class width:

$Q_j = L + \left(\frac{jN/4 - C}{f}\right) h, \quad j = 1, 2, 3$

$D_j = L + \left(\frac{jN/10 - C}{f}\right) h, \quad j = 1, 2, \ldots, 9$

$P_j = L + \left(\frac{jN/100 - C}{f}\right) h, \quad j = 1, 2, \ldots, 99$
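The three formulas share one pattern, so a single function can cover quartiles, deciles and percentiles. The sketch below is an illustration rather than the deck's own method: the function name `grouped_quantile` and its parameters are assumptions, and the class frequencies are borrowed from the worked example that appears later in the deck (classes 5-10 through 40-45, total frequency 49).

```python
def grouped_quantile(lower_limits, freqs, width, j, parts):
    """j-th quantile of grouped data split into `parts` equal parts.

    Implements L + ((j*N/parts - C) / f) * h, where C is the cumulative
    frequency before the class that contains the target position.
    """
    total = sum(freqs)
    target = j * total / parts
    cum = 0  # cumulative frequency of the classes already passed
    for lower, f in zip(lower_limits, freqs):
        if cum + f >= target:
            return lower + (target - cum) / f * width
        cum += f
    raise ValueError("j must satisfy 0 < j < parts")

# Class table taken from the deck's later grouped-data example.
lower_limits = [5, 10, 15, 20, 25, 30, 35, 40]
freqs = [5, 6, 15, 10, 5, 4, 2, 2]

q1 = grouped_quantile(lower_limits, freqs, 5, 1, 4)     # first quartile
q2 = grouped_quantile(lower_limits, freqs, 5, 2, 4)     # median
q3 = grouped_quantile(lower_limits, freqs, 5, 3, 4)     # third quartile
d9 = grouped_quantile(lower_limits, freqs, 5, 9, 10)    # ninth decile
p90 = grouped_quantile(lower_limits, freqs, 5, 90, 100) # 90th percentile
print(q1, q2, q3, d9, p90)
```

The ninth decile and the 90th percentile coincide, as expected, because both target the same position 0.9N.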

Page 21: Cent Tend SD Corr Reg

Finding the Median

The location of the median:

$\text{Median position} = \frac{n+1}{2}$ position in the ordered data

If the number of values is odd, the median is the middle number.

If the number of values is even, the median is the average of the two middle numbers.

Note that $\frac{n+1}{2}$ is not the value of the median, only the position of the median in the ranked data.

Page 22: Cent Tend SD Corr Reg

Mode

A measure of central tendency

Value that occurs most often

Not affected by extreme values

Used for either numerical or categorical data

There may be several modes

For grouped data: Mode = L + [(f - f₋₁)/(2f - f₋₁ - f₁)]h, where L is the lower limit of the modal class, f its frequency, f₋₁ the frequency before the modal class, f₁ the frequency after the modal class and h the class width.

[Dot plot on a 0 to 14 axis] Mode = 9

[Dot plot on a 0 to 6 axis] No Mode

Page 23: Cent Tend SD Corr Reg

Review Example

Five houses on a hill by the beach

House Prices:

$2,000,000

500,000

300,000

100,000

100,000

Page 24: Cent Tend SD Corr Reg

Review Example: Summary Statistics

House Prices:

$2,000,000

500,000

300,000

100,000

100,000

Sum 3,000,000

Mean: ($3,000,000/5) = $600,000

Median: middle value of ranked data = $300,000

Mode: most frequent value = $100,000
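The three summary statistics can be reproduced with Python's standard `statistics` module. This is a small sketch on the five house prices above; the variable names are illustrative.

```python
import statistics

# The five house prices from the review example.
prices = [2_000_000, 500_000, 300_000, 100_000, 100_000]

mean_price = statistics.mean(prices)      # sum of prices divided by 5
median_price = statistics.median(prices)  # middle value of the ranked data
mode_price = statistics.mode(prices)      # most frequent value
print(mean_price, median_price, mode_price)
```

The single $2,000,000 house pulls the mean to twice the median, which is why the median is often preferred for price data.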

Page 25: Cent Tend SD Corr Reg

Example

Ungrouped data (three series):

Series 1: 5 9 7 9 10 9 5
Series 2: 1 2 3 4 5 7 7
Series 3: 19 20 22 22 17

                  Series 1   Series 2   Series 3
Mean              7.714286   4.142857   20
Mode              9          7          22
Median            9          4          20
Sample variance   4.238095   5.47619    4.5
Sample SD         2.059      2.340      2.121

Grouped data:

Class    Frequency   C.F. (less than)   C.F. (more than)
5-10     5           5                  49
10-15    6           11                 44
15-20    15          26                 38
20-25    10          36                 23
25-30    5           41                 13
30-35    4           45                 8
35-40    2           47                 4
40-45    2           49                 2

Median = L + [(N/2 - C)/f]h

Median class = class containing the (total frequency)/2 = 24.5th value, i.e. 15-20 (cumulative frequency 26)

Median = 15 + [((1/2)49 - 11)/15]5 = 19.5

Mode = L + [(f - f₋₁)/(2f - f₋₁ - f₁)]h

Modal class = class with the maximum frequency, i.e. 15-20 (frequency 15)

Mode = 15 + [(15 - 6)/(2x15 - 6 - 10)]5 = 18.21
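The grouped median and mode can be verified by locating the median class and the modal class in code. The sketch below uses the class table from this slide; the variable names, and the choice to treat a missing neighbouring class as frequency 0, are assumptions made for the illustration.

```python
lower_limits = [5, 10, 15, 20, 25, 30, 35, 40]  # lower class limits
freqs = [5, 6, 15, 10, 5, 4, 2, 2]              # class frequencies
h = 5                                           # class width
N = sum(freqs)                                  # total frequency, 49

# Median class: first class whose cumulative frequency reaches N/2.
cum = 0
for i, f in enumerate(freqs):
    if cum + f >= N / 2:
        median = lower_limits[i] + (N / 2 - cum) / f * h
        break
    cum += f

# Modal class: class with the largest frequency.
m = freqs.index(max(freqs))
f_prev = freqs[m - 1] if m > 0 else 0               # frequency before modal class
f_next = freqs[m + 1] if m < len(freqs) - 1 else 0  # frequency after modal class
mode = lower_limits[m] + (freqs[m] - f_prev) / (2 * freqs[m] - f_prev - f_next) * h

print(median, round(mode, 2))
```

Both values fall inside the 15-20 class, consistent with that class being the median class and the modal class at once.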

Page 26: Cent Tend SD Corr Reg

Which measure of location is the "best"?

Mean is generally used, unless extreme values (outliers) exist.

Then the median is often used, since the median is not sensitive to extreme values.

Example: median home prices may be reported for a region, since they are less sensitive to outliers.

Page 27: Cent Tend SD Corr Reg

Geometric mean & Harmonic mean

Geometric mean is the nth root of the product of n observations (e.g. average percent increase in sales or production). It is best suited to constructing index numbers.

$\text{G.M.} = \text{antilog}\left(\frac{\sum \log X}{N}\right)$

Combined geometric mean of two groups:

$\log G = \frac{N_1 \log G_1 + N_2 \log G_2}{N_1 + N_2}$

Harmonic mean: restricted use, such as the average rate of increase of profits or the average price at which an article has been sold.

$\frac{1}{\text{H.M.}} = \frac{1}{N}\sum\left(\frac{1}{X}\right), \quad \text{H.M.} = \frac{N}{\sum\left(\frac{f}{X}\right)}$ for a frequency distribution
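Both means are short to compute. The following sketch uses natural logarithms in place of the log and antilog tables implied by the slide (the result is the same in any base); the function names and the two-value sample `[2, 8]` are hypothetical and chosen only because the answers are easy to check by hand.

```python
import math

def geometric_mean(values):
    """Antilog of the mean of the logs, i.e. the n-th root of the product."""
    return math.exp(sum(math.log(x) for x in values) / len(values))

def harmonic_mean(values):
    """Reciprocal of the mean of the reciprocals: N / sum(1/x)."""
    return len(values) / sum(1 / x for x in values)

sample = [2, 8]  # hypothetical values: sqrt(2 * 8) = 4 and 2 / (1/2 + 1/8) = 3.2
gm = geometric_mean(sample)
hm = harmonic_mean(sample)
am = sum(sample) / len(sample)
print(round(gm, 6), round(hm, 6), am)
```

For positive values the ordering H.M. ≤ G.M. ≤ A.M. always holds, which the sample reflects (3.2, 4, 5).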

Page 28: Cent Tend SD Corr Reg

Measures of Variability

Variation
- Range
- Interquartile Range
- Variance
- Standard Deviation
- Coefficient of Variation

Measures of variation give information on the spread or variability of the data values.

[Figure: two distributions with the same center, different variation]

Page 29: Cent Tend SD Corr Reg

Range

Simplest measure of variation

Difference between the largest and the smallest observations:

Range = $X_{\text{largest}} - X_{\text{smallest}}$

Example: [dot plot on a 0 to 14 axis] Range = 14 - 1 = 13

Page 30: Cent Tend SD Corr Reg

Disadvantages of the Range

Ignores the way in which data are distributed

[Two dot plots on a 7 to 12 axis with different distributions] Range = 12 - 7 = 5 in both cases

Sensitive to outliers

1,1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,3,3,3,3,4,5

Range = 5 - 1 = 4

1,1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,3,3,3,3,4,120

Range = 120 - 1 = 119

Page 31: Cent Tend SD Corr Reg

Interquartile Range

Can eliminate some outlier problems by using the interquartile range.

Eliminate high- and low-valued observations and calculate the range of the middle 50% of the data.

Interquartile range = 3rd quartile - 1st quartile

IQR = Q3 - Q1

$\text{Coefficient of range} = \frac{L - S}{L + S}$, where L is the largest and S the smallest observation

$\text{Coefficient of quartile deviation} = \frac{Q_3 - Q_1}{Q_3 + Q_1}$

Page 32: Cent Tend SD Corr Reg

Interquartile Range

Example:

X minimum   Q1   Median (Q2)   Q3   X maximum
12          30   45            57   70

Each of the four segments holds 25% of the observations.

Interquartile range = 57 - 30 = 27

Page 33: Cent Tend SD Corr Reg

Quartiles

Quartiles split the ranked data into 4 segments with an equal number of values per segment:

25% | 25% | 25% | 25%
    Q1    Q2    Q3

The first quartile, Q1, is the value for which 25% of the observations are smaller and 75% are larger.

Q2 is the same as the median (50% are smaller, 50% are larger).

Only 25% of the observations are greater than the third quartile, Q3.

Page 34: Cent Tend SD Corr Reg

Quartile Formulas

Find a quartile by determining the value in the appropriate position in the ranked data, where

First quartile position: Q1 = 0.25(n+1)

Second quartile position: Q2 = 0.50(n+1) (the median position)

Third quartile position: Q3 = 0.75(n+1)

where n is the number of observed values

Page 35: Cent Tend SD Corr Reg

Quartiles

Example: Find the first quartile

Sample Ranked Data: 11 12 13 16 16 17 18 21 22 (n = 9)

Q1 is in the 0.25(9+1) = 2.5 position of the ranked data,

so use the value halfway between the 2nd and 3rd values,

so Q1 = 12.5
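The position rule can be turned into a short routine. The sketch below is one possible reading: it interpolates linearly between neighbouring ranked values when the position is fractional, which matches the "halfway between" step used here but is an assumption for other fractions. The helper name `value_at_position` is illustrative.

```python
def value_at_position(ranked, position):
    """Value at a 1-based position in ranked data, interpolating linearly
    between neighbours when the position is not a whole number."""
    lower = int(position)    # whole part of the position
    frac = position - lower  # fractional part
    if frac == 0:
        return ranked[lower - 1]
    return ranked[lower - 1] + frac * (ranked[lower] - ranked[lower - 1])

data = [11, 12, 13, 16, 16, 17, 18, 21, 22]  # ranked sample from the slide
n = len(data)

q1 = value_at_position(data, 0.25 * (n + 1))  # position 2.5
q2 = value_at_position(data, 0.50 * (n + 1))  # position 5 (the median)
q3 = value_at_position(data, 0.75 * (n + 1))  # position 7.5
print(q1, q2, q3, q3 - q1)
```

Library routines such as `statistics.quantiles` use closely related but not always identical conventions, so small differences from the hand method are possible on other data sets.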

Page 36: Cent Tend SD Corr Reg

Standard deviation

Variance = square of S.D.

SD (from actual mean):

$\sigma = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{N}}$

SD (from assumed mean A):

$\sigma = \sqrt{\frac{\sum f d^2}{N} - \left(\frac{\sum f d}{N}\right)^2}, \quad d = x - A$

Page 37: Cent Tend SD Corr Reg

Population Standard Deviation

Most commonly used measure of variation

Shows variation about the mean

Has the same units as the original data

Population standard deviation:

$\sigma = \sqrt{\frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}}$

For a sample, the divisor is n - 1:

$s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}}$

Page 38: Cent Tend SD Corr Reg

Calculation Example: Sample Standard Deviation

Sample Data ($x_i$): 10 12 14 15 17 18 18 24

n = 8, Mean = $\bar{x}$ = 16

$s = \sqrt{\frac{(10 - \bar{x})^2 + (12 - \bar{x})^2 + (14 - \bar{x})^2 + \cdots + (24 - \bar{x})^2}{n - 1}}$

$= \sqrt{\frac{(10 - 16)^2 + (12 - 16)^2 + (14 - 16)^2 + \cdots + (24 - 16)^2}{8 - 1}} = \sqrt{\frac{130}{7}} = 4.3095$

A measure of the "average" scatter around the mean
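The calculation is easy to check numerically. The sketch below recomputes the sum of squared deviations and the sample standard deviation for the eight values on this slide and compares the result with `statistics.stdev`, which also divides by n - 1. For these data the squared deviations sum to 130, giving s ≈ 4.31.

```python
import math
import statistics

data = [10, 12, 14, 15, 17, 18, 18, 24]  # sample data from the slide
n = len(data)
x_bar = sum(data) / n                          # sample mean
ss = sum((x - x_bar) ** 2 for x in data)       # sum of squared deviations
s = math.sqrt(ss / (n - 1))                    # sample standard deviation

print(n, x_bar, ss, round(s, 4))
print(round(statistics.stdev(data), 4))        # library check, same divisor n - 1
```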

Page 39: Cent Tend SD Corr Reg

SD Example

Size x   Freq f   d = x - 9   fd    fd²
6        3        -3          -9    27
7        6        -2          -12   24
8        9        -1          -9    9
9        13       0           0     0
10       8        1           8     8
11       5        2           10    20
12       4        3           12    36
Total    Σf = 48                    Σfd² = 124

$\text{S.D.} = \sqrt{\frac{\sum_{i=1}^{n} f_i (x_i - \bar{x})^2}{\sum f_i}} = \sqrt{\frac{\sum f d^2}{\sum f}} = \sqrt{\frac{124}{48}} = 1.607$
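The frequency-table calculation can be sketched as follows, using the sizes and frequencies from this slide and the assumed-mean form of the formula with A = 9. Variable names are illustrative.

```python
import math

sizes = [6, 7, 8, 9, 10, 11, 12]  # values x
freqs = [3, 6, 9, 13, 8, 5, 4]    # frequencies f
A = 9                             # assumed mean used for d = x - A

N = sum(freqs)
sum_fd = sum(f * (x - A) for x, f in zip(sizes, freqs))        # sum of f*d
sum_fd2 = sum(f * (x - A) ** 2 for x, f in zip(sizes, freqs))  # sum of f*d^2

# SD from an assumed mean: sqrt(sum(f d^2)/N - (sum(f d)/N)^2)
sd = math.sqrt(sum_fd2 / N - (sum_fd / N) ** 2)
print(N, sum_fd, sum_fd2, round(sd, 3))
```

Here the sum of f·d is zero, so the assumed mean 9 is also the actual mean and the correction term vanishes.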

Page 40: Cent Tend SD Corr Reg

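SD

Who is the better scorer and who is more consistent?

Batsman A (x)   Batsman B (y)
12              47
115             12
6               16
73              42
7               4
19              51
119             37
36              48
84              13
29              0

Deviations are taken from the assumed mean 51: d1 = x - 51, d2 = y - 51, with n = 10.

Totals: Σd1 = -10, Σd1² = 17508, Σd2 = -240, Σd2² = 9302

$\text{Mean}_A = 50, \quad \text{S.D.}_A = \sqrt{\frac{\sum d_1^2}{n} - \left(\frac{\sum d_1}{n}\right)^2} = \sqrt{\frac{17508}{10} - (-1)^2} = 41.8$

$\text{Mean}_B = 27, \quad \text{S.D.}_B = \sqrt{\frac{\sum d_2^2}{n} - \left(\frac{\sum d_2}{n}\right)^2} = \sqrt{\frac{9302}{10} - (-24)^2} = 18.8$

Coefficient of variation A = (41.8/50) x 100 = 83.6%

Coefficient of variation B = (18.8/27) x 100 = 69.6%

A is the better scorer (higher mean); B is more consistent (lower coefficient of variation).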
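The comparison of the two batsmen can be reproduced with a small helper. This is a sketch under the slide's own setup (deviations from the assumed mean 51, divisor n); the function name `mean_sd_cv` is hypothetical.

```python
import math

scores_a = [12, 115, 6, 73, 7, 19, 119, 36, 84, 29]  # batsman A
scores_b = [47, 12, 16, 42, 4, 51, 37, 48, 13, 0]    # batsman B

def mean_sd_cv(scores, assumed_mean=51):
    """Mean, SD (divisor n) and coefficient of variation in percent,
    computed from deviations about an assumed mean."""
    n = len(scores)
    d = [x - assumed_mean for x in scores]
    mean = assumed_mean + sum(d) / n
    sd = math.sqrt(sum(v * v for v in d) / n - (sum(d) / n) ** 2)
    return mean, sd, sd / mean * 100

mean_a, sd_a, cv_a = mean_sd_cv(scores_a)
mean_b, sd_b, cv_b = mean_sd_cv(scores_b)
print(mean_a, round(sd_a, 1), round(cv_a, 1))
print(mean_b, round(sd_b, 1), round(cv_b, 1))
```

Without intermediate rounding the coefficient for B comes out near 69.7%, a touch above the hand figure, which divides the rounded 18.8 by 27.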

Page 41: Cent Tend SD Corr Reg


Measuring variation

Small standard deviation

Large standard deviation

Page 42: Cent Tend SD Corr Reg

Comparing Standard Deviations

[Three dot plots on an 11 to 21 axis]

Data A: Mean = 15.5, s = 3.338

Data B: Mean = 15.5, s = 0.926

Data C: Mean = 15.5, s = 4.570

Page 43: Cent Tend SD Corr Reg

Advantages of Variance and Standard Deviation

Each value in the data set is used in the calculation

Values far from the mean are given extra weight (because deviations from the mean are squared)

Page 44: Cent Tend SD Corr Reg

The Empirical Rule

If the data distribution is bell-shaped, then the interval:

$\bar{x} \pm 1s$ contains about 68% of the values in the population or the sample

Page 45: Cent Tend SD Corr Reg

The Empirical Rule

$\bar{x} \pm 2s$ contains about 95% of the values in the population or the sample

$\bar{x} \pm 3s$ contains about 99.7% of the values in the population or the sample

Page 46: Cent Tend SD Corr Reg

Coefficient of Variation

Measures relative variation

Always in percentage (%)

Shows variation relative to mean

Can be used to compare two or more sets of data measured in different units

$CV = \left(\frac{s}{\bar{x}}\right) \cdot 100\%$

Page 47: Cent Tend SD Corr Reg

Comparing Coefficients of Variation

Stock A:

Average price last year = $50

Standard deviation = $5

$CV_A = \left(\frac{s}{\bar{x}}\right) \cdot 100\% = \frac{\$5}{\$50} \cdot 100\% = 10\%$

Stock B:

Average price last year = $100

Standard deviation = $5

$CV_B = \left(\frac{s}{\bar{x}}\right) \cdot 100\% = \frac{\$5}{\$100} \cdot 100\% = 5\%$

Both stocks have the same standard deviation, but stock B is less variable relative to its price.

Page 48: Cent Tend SD Corr Reg

Approximations for Grouped Data

Suppose a data set contains values m1, m2, . . ., mK, occurring with frequencies f1, f2, . . ., fK.

For a population of N observations the mean is

$\mu = \frac{\sum_{i=1}^{K} f_i m_i}{N}$, where $N = \sum_{i=1}^{K} f_i$

For a sample of n observations, the mean is

$\bar{x} = \frac{\sum_{i=1}^{K} f_i m_i}{n}$, where $n = \sum_{i=1}^{K} f_i$

Page 49: Cent Tend SD Corr Reg

Shape of a Distribution

Describes how data are distributed

Measures of shape: symmetric or skewed

Left-Skewed: Mean < Median

Symmetric: Mean = Median

Right-Skewed: Median < Mean

Page 50: Cent Tend SD Corr Reg

Moments

Moments are defined as

rth moment about the mean: $\mu_r = \frac{1}{N}\sum_{i=1}^{n} f_i (x_i - \bar{x})^r$

rth moment about an arbitrary point a: $\mu_r' = \frac{1}{N}\sum_{i=1}^{n} f_i (x_i - a)^r$

From these we get

$\mu_2 = \mu_2' - \mu_1'^2$

$\mu_3 = \mu_3' - 3\mu_2'\mu_1' + 2\mu_1'^3$

$\mu_4 = \mu_4' - 4\mu_3'\mu_1' + 6\mu_2'\mu_1'^2 - 3\mu_1'^4$

Page 51: Cent Tend SD Corr Reg

Skewness

Skewness refers to a lack of symmetry (about the mean). Whereas measures of variation describe the amount of spread, skewness describes the direction of the asymmetry: left-skewed or right-skewed.

Karl Pearson's coefficient of skewness = (Mean - Mode)/Standard deviation

Bowley's (quartile) coefficient of skewness:

$\text{Bowley's skewness} = \frac{Q_3 + Q_1 - 2\,\text{Median}}{Q_3 - Q_1}$

Coefficient of skewness based on the third moment:

$\beta_1 = \frac{\mu_3^2}{\mu_2^3}, \quad \gamma_1 = \sqrt{\beta_1}$, where $\beta_1$ is always positive

Page 52: Cent Tend SD Corr Reg

Moments

From the given data find the first four moments about origin

Monthly profit   No. of companies (f)
Less than 7.5    4
7.5-12.5         10
12.5-17.5        20
17.5-22.5        36
22.5-27.5        16
27.5-32.5        12
32.5-37.5        2

Page 53: Cent Tend SD Corr Reg

Find the first four moments about origin

Monthly profit   Mid point X   No. of companies (f)   d = (X - 20)/5
Less than 7.5    5             4                      -3
7.5-12.5         10            10                     -2
12.5-17.5        15            20                     -1
17.5-22.5        20            36                     0
22.5-27.5        25            16                     1
27.5-32.5        30            12                     2
32.5-37.5        35            2                      3

Totals: N = 100, Σfd = -6, Σfd² = 178, Σfd³ = -42, Σfd⁴ = 874

Page 54: Cent Tend SD Corr Reg

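Moments about arbitrary mean and mean

Moments about the arbitrary point 20 (class width i = 5):

$\mu_1' = \frac{\sum f d}{N} \times i = \frac{-6}{100} \times 5 = -0.3$

$\mu_2' = \frac{\sum f d^2}{N} \times i^2 = \frac{178}{100} \times 25 = 44.5$

$\mu_3' = \frac{\sum f d^3}{N} \times i^3 = \frac{-42}{100} \times 125 = -52.5$

$\mu_4' = \frac{\sum f d^4}{N} \times i^4 = \frac{874}{100} \times 625 = 5462.5$

Moments about the mean:

$\mu_2 = \mu_2' - \mu_1'^2 = 44.5 - (-0.3)^2 = 44.41$

$\mu_3 = \mu_3' - 3\mu_2'\mu_1' + 2\mu_1'^3 = -52.5 - 3(44.5)(-0.3) + 2(-0.3)^3 = -12.504$

$\mu_4 = \mu_4' - 4\mu_3'\mu_1' + 6\mu_2'\mu_1'^2 - 3\mu_1'^4 = 5462.5 - 4(-52.5)(-0.3) + 6(44.5)(-0.3)^2 - 3(-0.3)^4 = 5423.5057$

Ans.

Skewness: $\beta_1 = \frac{\mu_3^2}{\mu_2^3}, \quad \sqrt{\beta_1} = 0.0422$, with $\mu_3 < 0$ indicating a slight negative skew

Kurtosis: $\beta_2 = \frac{\mu_4}{\mu_2^2} = 2.75 < 3$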
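The moment calculations follow mechanically from the frequency table, so they are a natural candidate for a quick check. The sketch below uses the class frequencies and step deviations d = (X - 20)/5 from the worked example; the variable names are illustrative.

```python
import math

freqs = [4, 10, 20, 36, 16, 12, 2]  # number of companies per class
d = [-3, -2, -1, 0, 1, 2, 3]        # step deviations d = (X - 20) / 5
h = 5                               # class width
N = sum(freqs)

# Moments about the arbitrary point 20: (sum(f * d^r) / N) * h^r
m1, m2, m3, m4 = (
    sum(f * di ** r for f, di in zip(freqs, d)) / N * h ** r
    for r in (1, 2, 3, 4)
)

# Moments about the mean, from the conversion formulas.
mu2 = m2 - m1 ** 2
mu3 = m3 - 3 * m2 * m1 + 2 * m1 ** 3
mu4 = m4 - 4 * m3 * m1 + 6 * m2 * m1 ** 2 - 3 * m1 ** 4

beta1 = mu3 ** 2 / mu2 ** 3  # skewness measure (always non-negative)
beta2 = mu4 / mu2 ** 2       # kurtosis measure (3 for a normal curve)

print(round(mu2, 2), round(mu3, 3), round(mu4, 4))
print(round(math.sqrt(beta1), 4), round(beta2, 2))
```

A value of β2 slightly below 3 suggests a curve marginally flatter than the normal curve, and the negative third moment suggests a very mild left skew.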

Page 55: Cent Tend SD Corr Reg

Kurtosis

Kurtosis refers to the bulginess, or degree of flatness or peakedness, of a distribution.

More peaked than the normal curve: leptokurtic

Less peaked than the normal curve: platykurtic

The normal curve is mesokurtic

$\beta_2 = \frac{\mu_4}{\mu_2^2}, \quad \gamma_2 = \beta_2 - 3$

Leptokurtic: $\beta_2 > 3$

Platykurtic: $\beta_2 < 3$

Page 56: Cent Tend SD Corr Reg

Scatter Plots of Data with Various Correlation Coefficients

[Figure: six scatter plots of Y against X, with r = -1, r = -.6, r = 0, r = +.3, r = +1 and r = 0]

Page 57: Cent Tend SD Corr Reg

Correlation

Correlation helps in determining the degree of relationship between two or more variables. However, it does not tell us about a cause-and-effect relationship.

Methods: scatter diagram, Karl Pearson's coefficient of correlation, Spearman's rank correlation

Karl Pearson's formula:

$r = \frac{\sum (X - \bar{X})(Y - \bar{Y})}{\sqrt{\sum (X - \bar{X})^2 \sum (Y - \bar{Y})^2}}$

Page 58: Cent Tend SD Corr Reg

Ex: A psychological test of intelligence ratio and engineering ability ratio of 10 students gave the following results. Calculate the coefficient of correlation. (Mean x = xbar = 99, Mean y = ybar = 98)

Student   Intelligence ratio x   X = x - xbar   Engg ratio y   Y = y - ybar
A         105                    6              101            3
B         104                    5              103            5
C         102                    3              100            2
D         101                    2              98             0
E         100                    1              95             -3
F         99                     0              96             -2
G         98                     -1             104            6
H         96                     -3             92             -6
I         93                     -6             97             -1
J         92                     -7             94             -4
TOTAL     990                    0              980            0

Totals of the remaining columns: ΣX² = 170, ΣY² = 140, ΣXY = 92

$r = \frac{\sum XY}{\sqrt{\sum X^2 \sum Y^2}} = \frac{92}{\sqrt{170 \times 140}} = 0.59$
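The coefficient for the ten students can be recomputed from the raw scores. This sketch follows the deviation form of Karl Pearson's formula used in the example; variable names are illustrative.

```python
import math

x = [105, 104, 102, 101, 100, 99, 98, 96, 93, 92]  # intelligence ratio
y = [101, 103, 100, 98, 95, 96, 104, 92, 97, 94]   # engineering ability ratio

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Sums of squares and cross-products of deviations from the means.
sxx = sum((xi - x_bar) ** 2 for xi in x)
syy = sum((yi - y_bar) ** 2 for yi in y)
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

r = sxy / math.sqrt(sxx * syy)
print(x_bar, y_bar, sxx, syy, sxy, round(r, 3))
```

To three decimals the result is 0.596, which the worked example truncates to 0.59.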

Page 59: Cent Tend SD Corr Reg

Correlation of bivariate grouped data

When frequency data is given:

$r = \frac{N \sum f d_x d_y - \sum f d_x \sum f d_y}{\sqrt{\left[N \sum f d_x^2 - \left(\sum f d_x\right)^2\right]\left[N \sum f d_y^2 - \left(\sum f d_y\right)^2\right]}}$

Page 60: Cent Tend SD Corr Reg

Spearman Rank Correlation Ex:

Person   Rank in stat (R1)   Rank in income (R2)
A        9                   1
B        10                  2
C        6                   3
D        5                   4
E        7                   5
F        2                   6
G        4                   7
H        8                   8
I        1                   9
J        3                   10

With D = R1 - R2, the total of the D squared column is ΣD² = 280.

$r = 1 - \frac{6 \sum D^2}{n(n^2 - 1)} = 1 - \frac{6 \times 280}{10(10^2 - 1)} = -0.697$
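The rank correlation can be checked from the two rank columns. The following is a minimal sketch of the formula for untied ranks, using the ten persons in the example; variable names are illustrative.

```python
rank_stat = [9, 10, 6, 5, 7, 2, 4, 8, 1, 3]      # R1, persons A to J
rank_income = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]    # R2, persons A to J

n = len(rank_stat)
# Sum of squared rank differences D = R1 - R2.
sum_d2 = sum((r1 - r2) ** 2 for r1, r2 in zip(rank_stat, rank_income))

# Spearman's coefficient for ranks without ties.
rho = 1 - 6 * sum_d2 / (n * (n ** 2 - 1))
print(sum_d2, round(rho, 3))
```

The negative value indicates that, in this sample, a better rank in statistics tends to go with a worse rank in income.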

Page 61: Cent Tend SD Corr Reg

Features of Correlation Coefficient, r

Unit free

Ranges between -1 and 1

The closer to -1, the stronger the negative linear relationship

The closer to 1, the stronger the positive linear relationship

The closer to 0, the weaker the linear relationship

Page 62: Cent Tend SD Corr Reg

Interpreting the Result

r = .733

There is a relatively strong positive linear relationship between test score #1 and test score #2.

Students who scored high on the first test tended to score high on the second test.

[Figure: Scatter Plot of Test Scores; x-axis: Test #1 Score (70 to 100), y-axis: Test #2 Score (70 to 100)]

Page 63: Cent Tend SD Corr Reg

Obtaining Linear Relationships, i.e. regression

An equation can be fit to show the best linear relationship between two variables:

Y = a + bX

where Y is the dependent variable and X is the independent variable

Normal equations for the regression line of y on x:

$\sum y = na + b\sum x$

$\sum xy = a\sum x + b\sum x^2$

Page 64: Cent Tend SD Corr Reg

Regression example

Regression line of x on y

$x = a + by, \quad \sum x = na + b\sum y, \quad \sum xy = a\sum y + b\sum y^2$

S. no.   x    y    x square   y square   xy
1        1    2    1          4          2
2        2    5    4          25         10
3        3    3    9          9          9
4        4    8    16         64         32
5        5    7    25         49         35
n = 5    15   25   55         151        88

Page 65: Cent Tend SD Corr Reg

Regression example

Regression line of y on x

$y = a + bx, \quad \sum y = na + b\sum x, \quad \sum xy = a\sum x + b\sum x^2$

S. no.   x    y    x square   y square   xy
1        1    2    1          4          2
2        2    5    4          25         10
3        3    3    9          9          9
4        4    8    16         64         32
5        5    7    25         49         35
n = 5    15   25   55         151        88
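Solving the two normal equations for a and b gives both regression lines for the five-point example. The sketch below is an illustration: the helper name `fit_line` is hypothetical, and it solves the equations in closed form rather than by elimination on paper.

```python
x = [1, 2, 3, 4, 5]
y = [2, 5, 3, 8, 7]

def fit_line(u, v):
    """Solve the normal equations for v = a + b*u:
       sum(v)   = n*a      + b*sum(u)
       sum(u*v) = a*sum(u) + b*sum(u^2)
    """
    n = len(u)
    su, sv = sum(u), sum(v)
    suu = sum(ui * ui for ui in u)
    suv = sum(ui * vi for ui, vi in zip(u, v))
    b = (n * suv - su * sv) / (n * suu - su ** 2)
    a = (sv - b * su) / n
    return a, b

a_yx, b_yx = fit_line(x, y)  # regression line of y on x: y = a + b*x
a_xy, b_xy = fit_line(y, x)  # regression line of x on y: x = a + b*y
print(round(a_yx, 3), round(b_yx, 3))
print(round(a_xy, 3), round(b_xy, 3))
```

With the column totals from the table (Σx = 15, Σy = 25, Σx² = 55, Σy² = 151, Σxy = 88) this gives y = 1.1 + 1.3x and x = 0.5 + 0.5y.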

Page 66: Cent Tend SD Corr Reg

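Regression coefficient when deviations are taken from the mean

Dividing the normal equation $\sum y = na + b\sum x$ by n:

$\frac{\sum y}{n} = a + b\,\frac{\sum x}{n}$, i.e. $\bar{y} = a + b\bar{x}$

This shows that the means lie on y = a + bx. Shifting the origin to (xbar, ybar), the equation

$\sum xy = a\sum x + b\sum x^2$

takes the form

$\sum (x - \bar{x})(y - \bar{y}) = a\sum (x - \bar{x}) + b\sum (x - \bar{x})^2$

$\sum (x - \bar{x})(y - \bar{y}) = 0 + b\sum (x - \bar{x})^2$, since $\sum (x - \bar{x}) = 0$

$b = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sum (x - \bar{x})^2} = \frac{\sum XY}{\sum X^2} = r\,\frac{\sigma_y}{\sigma_x}$, using $r = \frac{\sum XY}{n\,\sigma_x \sigma_y}$

thus $y - \bar{y} = r\,\frac{\sigma_y}{\sigma_x}(x - \bar{x})$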

Page 67: Cent Tend SD Corr Reg

Regression coefficients of y on x and x on y

So we get the regression lines of y on x and x on y respectively:

$y - \bar{y} = b_{yx}(x - \bar{x}), \quad b_{yx} = \frac{\sum XY}{\sum X^2} = r\,\frac{\sigma_y}{\sigma_x}$

$x - \bar{x} = b_{xy}(y - \bar{y}), \quad b_{xy} = \frac{\sum XY}{\sum Y^2} = r\,\frac{\sigma_x}{\sigma_y}$

$b_{yx} \cdot b_{xy} = r^2$

Page 68: Cent Tend SD Corr Reg

Ex: From the following data find the two regression equations.

Sales 91 97 108 121 67 124 51 73 111 57

Purchase 71 75 69 97 70 91 39 61 80 47

Sales x   X = x - xbar = x - 90   X square   Purchase y   Y = y - ybar = y - 70   Y square   XY
91        1                       1          71           1                       1          1
97        7                       49         75           5                       25         35
108       18                      324        69           -1                      1          -18
121       31                      961        97           27                      729        837
67        -23                     529        70           0                       0          0
124       34                      1156       91           21                      441        714
51        -39                     1521       39           -31                     961        1209
73        -17                     289        61           -9                      81         153
111       21                      441        80           10                      100        210
57        -33                     1089       47           -23                     529        759
900       0                       6360       700          0                       2868       3900

Page 69: Cent Tend SD Corr Reg

Regression coefficients and regression lines of y on x and x on y

Regression line of y on x:

$b_{yx} = \frac{\sum XY}{\sum X^2} = \frac{3900}{6360} = 0.613$

$(y - 70) = 0.613(x - 90)$, i.e. $y = 0.613x + 14.83$

Regression line of x on y:

$b_{xy} = \frac{\sum XY}{\sum Y^2} = \frac{3900}{2868} = 1.36$

$(x - 90) = 1.36(y - 70)$, i.e. $x = 1.36y - 5.2$
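Both regression lines for the sales and purchase data can be recomputed from the raw values. The sketch below uses the deviation form of the coefficients; variable names are illustrative, and intercepts are computed without rounding the slopes first, so they may differ from hand-rounded figures in the second decimal.

```python
sales = [91, 97, 108, 121, 67, 124, 51, 73, 111, 57]  # x
purchase = [71, 75, 69, 97, 70, 91, 39, 61, 80, 47]   # y

n = len(sales)
x_bar = sum(sales) / n
y_bar = sum(purchase) / n

# Sums of squares and cross-products of deviations from the means.
sxx = sum((x - x_bar) ** 2 for x in sales)
syy = sum((y - y_bar) ** 2 for y in purchase)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(sales, purchase))

b_yx = sxy / sxx                    # slope of the line of y on x
b_xy = sxy / syy                    # slope of the line of x on y
intercept_yx = y_bar - b_yx * x_bar  # y = intercept_yx + b_yx * x
intercept_xy = x_bar - b_xy * y_bar  # x = intercept_xy + b_xy * y
r_squared = b_yx * b_xy              # product of the two coefficients

print(round(b_yx, 3), round(intercept_yx, 2))
print(round(b_xy, 3), round(intercept_xy, 2))
print(round(r_squared, 3))
```

The product of the two regression coefficients equals r², so the correlation between sales and purchases here is about 0.91.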