Correlation

CORRELATION - Rank Correlation

Definition:

In a distribution, if a change in one variable brings about a change in the other variable, the variables are said to be correlated (or there is a correlation between the variables).

Let X and Y measure some characteristics of a particular system. To study the overall measure of the system, it is necessary to measure the interdependence of X and Y.

If the quantities (X, Y) vary in such a way that a change in one variable corresponds to a change in the other variable, then the variables X and Y are correlated.

Types of Correlation:

The important ways of classifying correlation are:

1. Positive and Negative

2. Simple, Partial and Multiple

3. Linear and Non-Linear

Methods of Studying Correlation:

The following are the important methods of ascertaining correlation between two variables:

• Scatter diagram method

• Karl Pearson’s Coefficient of Correlation

• Spearman’s Rank Correlation Coefficient

• Scatter Diagram Method: The simplest device for studying correlation between two variables is a special type of dot chart, known as a scatter diagram.

Karl Pearson’s Coefficient of Correlation:

r = (∑xy) / (N σx σy)

where

x = (X - X̄), the deviation of each X value from the mean of series X

y = (Y - Ȳ), the deviation of each Y value from the mean of series Y

σx = standard deviation of series X

σy = standard deviation of series Y

r = the correlation coefficient

N = number of pairs of observations
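As a rough illustration of Karl Pearson's formula, the sketch below computes r from deviations about the mean, using population standard deviations (dividing by N). The function name `pearson_r` and the sample values are made up for this example and are not part of the original slides.

```python
import math


def pearson_r(xs, ys):
    """Pearson's r = sum(x*y) / (N * sigma_x * sigma_y), with x, y as deviations."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two series of equal length with at least 2 pairs")

    mean_x = sum(xs) / n
    mean_y = sum(ys) / n

    # Deviations of each observation from its series mean.
    dev_x = [x - mean_x for x in xs]
    dev_y = [y - mean_y for y in ys]

    sum_xy = sum(a * b for a, b in zip(dev_x, dev_y))

    # Population standard deviations (divide by N, as in the formula).
    sigma_x = math.sqrt(sum(a * a for a in dev_x) / n)
    sigma_y = math.sqrt(sum(b * b for b in dev_y) / n)
    if sigma_x == 0 or sigma_y == 0:
        raise ValueError("r is undefined when a series has zero variance")

    return sum_xy / (n * sigma_x * sigma_y)


# Hypothetical data, for illustration only.
series_x = [1.0, 2.0, 3.0, 4.0, 5.0]
series_y = [2.0, 4.0, 5.0, 4.0, 5.0]
r = pearson_r(series_x, series_y)
print(round(r, 4))
```

For these values the deviations give ∑xy = 6, with ∑x² = 10 and ∑y² = 6, so r = 6 / √60.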

Demerits

• By applying the scatter diagram method we can get an idea about the direction of correlation and also whether it is high or low. But we cannot establish the exact degree of correlation between the variables, as is possible by applying mathematical methods.

• Karl Pearson’s method is based on the assumption that the population being studied is normally distributed. When it is known that the population is not normal, or when the shape of the distribution is not known, there is need for a measure of correlation that involves no assumption about the parameters of the population.

So Why Rank Correlation?

It is possible to avoid making any assumptions about the populations being studied by ranking the observations according to size and basing the calculations on the ranks rather than upon the original observations. It does not matter which way the items are ranked; item number one may be the largest or it may be the smallest. Using ranks rather than actual observations gives the coefficient of rank correlation.

This method of finding out covariability, or the lack of it, between two variables was developed by the British psychologist Charles Edward Spearman in 1904.

Ranking

A ranking is a relationship between a set of items such that, for any two items, the first is either 'ranked higher than', 'ranked lower than' or 'ranked equal to' the second. In mathematics, this is known as a weak order or total preorder of objects. It is not necessarily a total order of objects because two different objects can have the same ranking. The rankings themselves are totally ordered. For example, materials are totally preordered by hardness, while degrees of hardness are totally ordered.

Rank Correlation

“Rank correlation” is the study of relationships between different rankings on the same set of items. It deals with measuring correspondence between two rankings, and assessing the significance of this correspondence. Spearman’s rank correlation coefficient is defined as:

r = 1 - (6 ∑D²) / (N(N² - 1))

where r denotes the rank coefficient of correlation, D refers to the difference of ranks between paired items in the two series, and N is the number of pairs.
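A minimal sketch of Spearman's formula, assuming the two rankings are already given and contain no repeated ranks. The function name `spearman_from_ranks` and the two judges' rankings are hypothetical, chosen only to show the arithmetic.

```python
def spearman_from_ranks(rank_1, rank_2):
    """Spearman's r = 1 - 6*sum(D^2) / (N*(N^2 - 1)) for untied ranks."""
    n = len(rank_1)
    if n != len(rank_2) or n < 2:
        raise ValueError("need two rankings of equal length with at least 2 items")

    # D is the difference between the paired ranks.
    differences = [a - b for a, b in zip(rank_1, rank_2)]
    sum_d_squared = sum(d * d for d in differences)

    return 1 - (6 * sum_d_squared) / (n * (n ** 2 - 1))


# Hypothetical rankings of five items by two judges.
judge_a = [1, 2, 3, 4, 5]
judge_b = [2, 1, 4, 3, 5]
r = spearman_from_ranks(judge_a, judge_b)
print(r)
```

Here D = -1, 1, -1, 1, 0, so ∑D² = 4 and r = 1 - 24/120 = 0.8.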

Features

The rank method has the following principal features:

• The sum of the differences between the ranks of the two variables is zero (∑D = 0).

• Spearman’s rank correlation coefficient ρ is the Pearsonian correlation coefficient between the ranks.

• The rank correlation can be interpreted in the same way as Karl Pearson’s correlation coefficient.
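The first two features can be checked numerically. The sketch below, which assumes untied ranks and uses made-up rankings, confirms that the rank differences sum to zero and that Pearson's coefficient computed on the ranks agrees with Spearman's formula. The helper `pearson_r` is written here for illustration only.

```python
import math


def pearson_r(xs, ys):
    """Pearson's coefficient from deviations about the mean."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sum_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sum_xx = sum((x - mean_x) ** 2 for x in xs)
    sum_yy = sum((y - mean_y) ** 2 for y in ys)
    return sum_xy / math.sqrt(sum_xx * sum_yy)


# Hypothetical untied ranks for six items.
rank_x = [1, 2, 3, 4, 5, 6]
rank_y = [2, 1, 4, 3, 6, 5]
n = len(rank_x)

differences = [a - b for a, b in zip(rank_x, rank_y)]
sum_d = sum(differences)                      # feature 1: should be 0
sum_d_squared = sum(d * d for d in differences)

spearman = 1 - (6 * sum_d_squared) / (n * (n ** 2 - 1))
pearson_on_ranks = pearson_r(rank_x, rank_y)  # feature 2: should match

print(sum_d, spearman, pearson_on_ranks)
```

The agreement between the two values holds only when no ranks are repeated; with ties the simple formula is an approximation.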

• Karl Pearson’s correlation coefficient assumes that the sample observations are drawn from a normal population. The rank correlation coefficient is a distribution-free measure, since no strict assumption is made about the population from which the sample is drawn.

• The values obtained from the two formulae generally differ because some information in the data is lost when ranks are used.

• Of the methods discussed here, Spearman’s formula is the only one available to find the correlation between qualitative characters.

Types of Rank Methods

In rank correlation we may have three types of problems:

• Where ranks are given

• Where ranks are not given

• Where repeated ranks occur

Note:

If r = 1, there is a perfect positive correlation.

If r = 0, the variables are uncorrelated.

If r = -1, there is a perfect negative correlation.

Where ranks are given:

Where actual ranks are given to us, the steps required for computing rank correlation are:

• Take the difference of the two ranks, i.e., (R1 - R2), and denote these differences by D.

• Square these differences and obtain the total ∑D².

• Apply the formula r = 1 - (6 ∑D²) / (N(N² - 1)).

Where ranks are not given

• When we are given the actual data and not the ranks, it is necessary to assign the ranks. Ranks can be assigned by taking either the highest value as 1 or the lowest value as 1. But whether we start with the lowest value or the highest value, we must follow the same method for both variables.
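The ranking rule can be sketched as follows, assuming the data contain no repeated values. The helper names `assign_ranks` and `spearman_from_ranks` and the marks data are hypothetical. The example ranks both series from the highest value and then from the lowest value, and the coefficient comes out the same as long as one direction is used for both variables.

```python
def assign_ranks(values, highest_first=True):
    """Return the rank of each value (1 = highest if highest_first, else 1 = lowest).

    Assumes all values are distinct (no repeated ranks).
    """
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=highest_first)
    ranks = [0] * len(values)
    for position, index in enumerate(order, start=1):
        ranks[index] = position
    return ranks


def spearman_from_ranks(rank_1, rank_2):
    """Spearman's r = 1 - 6*sum(D^2) / (N*(N^2 - 1)) for untied ranks."""
    n = len(rank_1)
    sum_d_squared = sum((a - b) ** 2 for a, b in zip(rank_1, rank_2))
    return 1 - (6 * sum_d_squared) / (n * (n ** 2 - 1))


# Hypothetical marks of five students in two subjects.
marks_x = [78, 36, 98, 25, 75]
marks_y = [84, 51, 91, 60, 68]

# Highest value ranked 1 for both variables.
ranks_x_high = assign_ranks(marks_x, highest_first=True)
ranks_y_high = assign_ranks(marks_y, highest_first=True)
r_high = spearman_from_ranks(ranks_x_high, ranks_y_high)

# Lowest value ranked 1 for both variables.
ranks_x_low = assign_ranks(marks_x, highest_first=False)
ranks_y_low = assign_ranks(marks_y, highest_first=False)
r_low = spearman_from_ranks(ranks_x_low, ranks_y_low)

print(ranks_x_high, ranks_y_high, r_high, r_low)
```

Mixing directions (highest first for one variable and lowest first for the other) would flip the sign of the result, which is why the same method must be used for both.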

Steps To Find RC

• Step 1: Draw a table for the data, the ranks, d and d².

• Step 2: Fill the data columns with the given data.

• Step 3: Assign ranks to the data.

• Step 4: Find the difference d and d².

• Step 5: Apply the formula:

r = 1 - (6 ∑d²) / (n(n² - 1))

where d = difference between paired ranks, n = number of pairs of data
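The five steps can be traced in a short script. This is only a sketch: the column layout is an assumption (the table from the original slide is not available), the data are invented, ranks are assigned with the highest value as 1, and repeated values are not handled.

```python
def rank_highest_first(values):
    """Rank distinct values so that the highest value gets rank 1."""
    ordered = sorted(values, reverse=True)
    return [ordered.index(v) + 1 for v in values]


def rank_correlation_table(xs, ys):
    """Build rows of (x, y, rank_x, rank_y, d, d^2) and return them with r."""
    rank_x = rank_highest_first(xs)          # Step 3: rank each series
    rank_y = rank_highest_first(ys)
    rows = []
    for x, y, rx, ry in zip(xs, ys, rank_x, rank_y):
        d = rx - ry                          # Step 4: difference and its square
        rows.append((x, y, rx, ry, d, d * d))
    n = len(rows)
    sum_d_squared = sum(row[5] for row in rows)
    r = 1 - (6 * sum_d_squared) / (n * (n ** 2 - 1))   # Step 5: apply the formula
    return rows, r


# Steps 1 and 2: hypothetical data filled into the table.
data_x = [10, 20, 30, 40]
data_y = [15, 35, 25, 45]
rows, r = rank_correlation_table(data_x, data_y)

print(f"{'X':>4} {'Y':>4} {'R1':>4} {'R2':>4} {'d':>4} {'d2':>4}")
for row in rows:
    print(" ".join(f"{value:>4}" for value in row))
print("r =", r)
```

With these values ∑d² = 2 and n = 4, so r = 1 - 12/60 = 0.8.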

Thank you.
