Dr. A. K. Parida, Ph.D. STAT PRACT MANUAL.pdf · 2015-02-10 · Dr. A. K. Parida, Ph.D. Associate...


Dr. A. K. Parida, Ph.D.
Associate Professor

Department of Agricultural Statistics
College of Agriculture (OUAT), Bhubaneswar-3

PREFACE

Statistics is of great importance for teaching, research and extension in agriculture and the allied sciences. Knowledge of the subject is immensely helpful to teachers, scientists, students and research scholars in their areas of study and application. We collect data from different sources, by different methods and for different purposes. As these data are random in nature, they are subjected to various manipulations in order to draw valid conclusions for efficient further use and correct decisions. No doubt, the voluminous data so generated can be handled with computers and software. But the fundamental concepts of statistics, and knowledge of its procedures, principles and techniques, play a vital role in arriving at a valid and meaningful conclusion.

This practical manual has been conceived and prepared to acquaint students and teachers alike with the basic concepts of statistical principles and procedures of calculation, as per the syllabi of the 4th Dean’s Committee of ICAR for undergraduate courses in agriculture and allied sciences. The manuscript draws on my long years of teaching experience and was prepared at the persuasion of students and teachers of the university. The contents have been drawn from many textbooks, journals, manuals and the internet; I acknowledge the help of those sources. I welcome comments from users of this manual on any additions, deletions or improvements for the future. I hope the practical manual will be very useful for students and research workers.

I also thank the authorities for providing funds from the XIth ICAR development grant for printing the manual.

Date: March 25, 2009 Amulya Kumar Parida


CONTENTS

Practicals Topics

I. STATISTICAL METHODS

1.1 Construction of Frequency Table

1.2 Graphical representation of frequency distribution

1.3 Measures of central tendency or central value - Arithmetic Mean, Geometric Mean, Harmonic Mean, Median, Mode, Quartile, Decile and Percentiles

1.4 Measures of dispersion of a frequency distribution - Mean deviation, Standard Deviation, Variance, and Coefficient of Variation (C.V.)

1.4 Moments and Measure of skewness and kurtosis

1.5 Testing of Hypothesis or Test of Significance or decision rule

1.6 Standard normal deviate (SND) or Z tests or Large Sample Tests - for single mean and difference of two means

1.7 Small Sample Tests - test of 2 variances, test for single mean, two independent means and two dependent means

1.8 Chi-square test (χ²) - Goodness-of-fit and independence or association of attributes

1.9 Correlation and regression - Pearson’s correlation coefficient and its test, Spearman's Rank correlation coefficient; fitting of regression equations of two variables Y and X

II. DESIGN AND ANALYSIS OF EXPERIMENTS

2.1 Basic concepts on design of experiments - Analysis of variance: one-way and two-way classification

2.2 Analysis of data in completely randomized design (CRD): unequal replications, equal replications

2.3 Analysis of data in randomised complete block design (RCBD)

2.4 Analysis of data in Latin square design (LSD)

2.5 Missing plot technique in design of Experiments

2.6 Analysis of data in RCBD with one missing observation

2.7 Analysis of data in LSD with one missing observation

III. SAMPLING TECHNIQUES

3.1 Principal steps in a sample survey

3.2 Simple random sampling (SRS): Selection of sampling units from a Population

3.3 Parameter estimation in SRS: SRSWOR, SRSWR

3.4 Stratified sampling

3.5 Systematic sampling

APPENDIX: STATISTICAL TABLES (t, F, χ², r, Z, random number)

Table-1(a): Critical values for t-distribution

Table-1(b): Critical values for t-distribution (One & Two-tailed)

Table-2: Critical values for F-distribution

Table-3: χ² (Chi-Squared) Distribution: Critical Values of χ²

Table-4: Critical value for Correlation coefficients (Simple or Partial)

Table-5: Percentage points of the normal distribution, Z

Table-6: Random numbers


UG Practical Manual on Statistics

Department of Agricultural Statistics, OUAT Page-1

PRACTICAL MANUAL ON STATISTICS

Two major practical aspects of scientific investigation are the collection of data and the interpretation of the collected data. The data may be generated through a sample survey of a naturally existing population or a designed experiment on a hypothetical population. The collected data are condensed, and useful information is extracted, through techniques of statistical inference. This manual deals essentially with the statistical methods and techniques used for objectively tabulating data, computing results step by step and drawing valid inferences from them, which will be useful for undergraduate students.

General Objective: To impart knowledge to the students on the basic concepts and statistical techniques applied in agriculture and allied sciences.

Specific objectives:

By the end of the practical exercises, the students will be able to:

1. Acquaint themselves with the practical applications of statistical techniques in agriculture.

2. Become self-sufficient in applying statistical techniques and in drawing valid conclusions from them.

I. STATISTICAL METHODS

1.1. Construction of frequency table

A frequency table is a technique that meaningfully summarizes a set of observations in tabular form so as to bring out the essential information contained in it. A tabular arrangement of data by classes together with the corresponding class frequencies is called a frequency distribution or frequency table.

There are two types of frequency table.

i. Exclusive type

ii. Inclusive type

The frequency table of exclusive type (the lower limit value is included and the upper limit is excluded) is formed when the data are continuous, and it is called a continuous distribution. The frequency table of inclusive type (both lower and upper limit values are included) is used when the data are discrete or discontinuous, and it is called a discontinuous / discrete distribution.


Procedure:

The following steps are followed to construct a frequency table from a set of data.

Step-1. Determination of number of classes

Usually the number of classes should be between 5 and 15; otherwise the information contained in the data may be lost. One may use Sturges' rule to determine the number of classes, K:

$$K = 1 + 3.322 \log_{10} N, \quad \text{where } N = \text{No. of observations}$$

Step-2. Determination of magnitude of class interval (CI)

From the given set of observations, locate the maximum (Max) and minimum (Min) values.

Then, Range = Max − Min

and the CI or class width (d) will be:

$$d = \frac{\text{Max} - \text{Min}}{K}$$

If 'd' has a decimal value, take the nearest integral value as the class width.

Step-3. Choice of class limits or class boundaries

First, check whether the observed variable is of continuous or discrete type: measurements such as height, weight and volume are continuous variables, whereas counts such as the number of trees or the number of students are discrete variables. Use the exclusive method of frequency distribution if the variable is continuous, and the inclusive method if it is discrete.

Step-4. Formation of classes:

a. Exclusive method: From the first class, the subsequent classes are made by adding d to both the lower and upper limits, e.g. if the first class is L to L+d, then the second class is L+d to L+2d, and so on.
Example: 10 to 15, 15 to 20, 20 to 25 etc.

b. Inclusive method: From the first class, the subsequent classes are made by adding (d+1) instead of d to both the lower and upper limits, e.g. if the first class is L to L+d, then the second class is [L+(d+1)] to [L+(2d+1)], and so on.
Example: 10 to 15, 16 to 21, 22 to 27 etc.

Step-5. Determination of Class frequency

The class frequency is the number of observations of the variable that fall in a class. The class frequencies are determined with the help of tally marks (|).

Step-6. Construction of frequency distribution table


The frequency table has the following headings.

Classes Tally mark Frequency

(1) (2) (3)

The classes are formed starting from the minimum value of the set of observations, each class having the class width (d). Then tally marks are made against each class as the observations are read one after another. When a 5th tally mark is required in a class, either a slash (/) or an overhead mark (¯) is drawn across the group of 4 tally marks. Tallying runs from the first observation to the end of the data. Finally, the tally marks in each class are counted and entered as the frequency of the class in the last column.

Problem-1. Construct the frequency distribution table with the following 30 observations.

10(Min),15,17,20,21,16,17,18,20,31,35(Max),13,12,15,14,12,15,17,14,13,15,14,13,14,20,19,18,28,24,25.

Solution:

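(i). No. of Classes, K = 1 + 3.322 log10 N, where N = 30

$$K = 1 + 3.322 \log_{10} 30 = 1 + 3.322 \times 1.4771 = 1 + 4.90 = 5.90 \approx 6$$

(ii). Class size,

$$d = \frac{\text{Max} - \text{Min}}{K} = \frac{35 - 10}{6} = \frac{25}{6} = 4.16 \approx 5$$

(The width is rounded up to 5 here, since six classes of width 4 starting at 10 would not reach the maximum value 35.)

a. Exclusive method:

Table-1. Construction of frequency distribution table with CI=5

Class     Tally marks     Frequency
10-15     IIII IIII       10
15-20     IIII IIII I     11
20-25     IIII            5
25-30     II              2
30-35     I               1
35-40     I               1
Total                     30

(Each IIII group stands for five: four strokes crossed by a fifth.)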


b. Inclusive method:

Table-2. Construction of frequency distribution table with CI=5

Class     Tally mark      Frequency
10-14     IIII IIII       10
15-19     IIII IIII I     11
20-24     IIII            5
25-29     II              2
30-34     I               1
35-39     I               1
Total                     30
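Steps 1 to 6 can be sketched in a few lines of Python. This is only an illustrative sketch, not part of the manual: the function name `frequency_table` is hypothetical, the class width is rounded up (as Problem-1 does when it takes 4.16 to 5), and the inclusive labels assume integer-valued data.

```python
import math

def frequency_table(data, inclusive=False):
    """Build classes and class frequencies following Steps 1 to 6."""
    n = len(data)
    k = round(1 + 3.322 * math.log10(n))       # Step-1: Sturges' rule
    lo, hi = min(data), max(data)
    d = math.ceil((hi - lo) / k)               # Step-2: width, rounded up (assumption)
    rows = []
    lower = lo
    while lower <= hi:                         # Step-4: form successive classes
        upper = lower + d
        # Step-5: count observations with lower <= x < upper
        count = sum(1 for x in data if lower <= x < upper)
        # Inclusive labels (e.g. 10-14) assume integer-valued data
        label_upper = upper - 1 if inclusive else upper
        rows.append((lower, label_upper, count))
        lower = upper
    return k, d, rows

observations = [10, 15, 17, 20, 21, 16, 17, 18, 20, 31, 35, 13, 12, 15, 14,
                12, 15, 17, 14, 13, 15, 14, 13, 14, 20, 19, 18, 28, 24, 25]

k, d, rows = frequency_table(observations)
for lower, upper, count in rows:
    print(f"{lower}-{upper}: {count}")
```

Calling `frequency_table(observations, inclusive=True)` gives the same frequencies with the class labels of Table-2.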

1.2. Graphical representation of frequency distribution

Graphical representation of the observations gives a better and deeper understanding of how the observations are distributed. A frequency distribution can be represented in the form of a Histogram, Frequency polygon, Frequency curve or Ogive.

Procedure:

a. Histogram: A histogram is a set of vertical bars in a 2-dimensional graph whose areas are proportional to the frequencies of the classes. It is drawn by taking the classes on the X-axis and drawing bars of the corresponding class frequencies along the Y-axis.

b. Frequency polygon: It is made by joining, with straight lines, the mid points of the tops of the bars of the histogram.

c. Frequency curve: A frequency curve is a graphical representation of frequencies against their variate values by a smooth free-hand curve. It is drawn when the CI of each class is small enough to allow a smooth curve, by joining the points of the frequency polygon free-hand.

d. Ogive: It is a graph of the variate values against their corresponding cumulative frequencies in a frequency distribution. Its shape is like an elongated “S”. An Ogive is prepared by using the ‘more than’ type or the ‘less than’ type of cumulative frequencies, or both.

The above graphs are easily made from a frequency table of exclusive type. If a frequency table is of inclusive type, it is first converted into exclusive type and then the graphs are drawn.

Cumulative frequency is the running sum of the class frequencies, accumulated downward through the classes of the frequency table (less than type) or upward (more than type).


Problem–2. Construct the Histogram, Frequency Polygon, Frequency curve and Ogive of the following frequency distribution of the length (cm) of 60 sorghum ear heads.

Class (Length)    : 18-20   21-23   24-26   27-29   30-32   33-35   36-38
No. of ear heads  : 4       10      14      16      10      4       2

Solution:

As the given frequency table is of inclusive type, classes of exclusive type are first formed to make the classes continuous, and then both types of cumulative frequency are computed.

Table-3. Cumulative frequency table

Class     Exclusive class   Mid value   Frequency   Cumulative frequency
                                                    Less than   Greater than
18-20     17.5-20.5         19          4           4           60
21-23     20.5-23.5         22          10          14          56
24-26     23.5-26.5         25          14          28          46
27-29     26.5-29.5         28          16          44          32
30-32     29.5-32.5         31          10          54          16
33-35     32.5-35.5         34          4           58          6
36-38     35.5-38.5         37          2           60          2
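The columns of Table-3 can be reproduced with a short Python sketch. The variable names are illustrative only, and the 0.5 adjustment to the class limits assumes, as in this problem, a gap of 1 between the upper limit of one inclusive class and the lower limit of the next.

```python
from itertools import accumulate

# Inclusive classes and frequencies from Problem-2
classes = [(18, 20), (21, 23), (24, 26), (27, 29), (30, 32), (33, 35), (36, 38)]
frequencies = [4, 10, 14, 16, 10, 4, 2]

# Exclusive class boundaries: widen each class by half the gap (0.5) on both sides
boundaries = [(lo - 0.5, hi + 0.5) for lo, hi in classes]
mid_values = [(lo + hi) / 2 for lo, hi in classes]

# Less than type: running sum downward; more than type: running sum upward
less_than = list(accumulate(frequencies))
more_than = list(accumulate(reversed(frequencies)))[::-1]

for b, m, f, lt, mt in zip(boundaries, mid_values, frequencies, less_than, more_than):
    print(b, m, f, lt, mt)
```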

[Figures: Fig. 1. Histogram; Fig. 2. Frequency polygon; Fig. 3. Frequency curve; Fig. 4. Ogive (1 - less than type, 2 - more than type)]


Exercise: Construct a frequency distribution table, histogram, frequency polygon, frequency curve and ogive for the following data and interpret the results.

25, 32, 45, 8, 24, 42, 22, 12, 9, 15, 26, 35, 23, 41, 47, 18, 44, 37, 27,46, 38, 24, 43,46, 10, 21, 36, 45, 22, 18.

1.3. Measures of central tendency or central value

Central tendency or central value is the property of a distribution of data by which a single central value can be computed to represent all the other values. It is commonly measured by the Arithmetic Mean (or Mean), Geometric Mean, Harmonic Mean, Median and Mode.

Procedure:

Mean or Arithmetic Mean (A.M.)

The arithmetic mean is the sum of the observations divided by the total number of observations.

i. For a series of data: If the series has ‘n’ values of a variable ‘X’, i.e. x1, x2, ..., xn, the Arithmetic Mean (A.M.) is given by:

$$\text{A.M.} = \bar{X} = \frac{x_1 + x_2 + \cdots + x_n}{n} = \frac{\text{Sum of values}}{\text{No. of values}} = \frac{\sum_{i=1}^{n} x_i}{n}$$

ii. For ungrouped frequency distribution:

Suppose the values x1, x2, ..., xn occur with frequencies f1, f2, ..., fn; then the A.M. is given by:

$$\bar{X} = \frac{\sum_{i=1}^{n} f_i x_i}{N}, \quad N = \sum_{i=1}^{n} f_i$$

iii. For grouped frequency distribution:

If data are grouped according to different class intervals, the mid value of each class is taken as an approximation to the value of the variable representing that class. If m1, m2, ..., mn represent the mid values of ‘n’ classes of the variable ‘X’ and f1, f2, ..., fn represent the corresponding frequencies, the Arithmetic Mean of X is


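$$\bar{X} = \frac{\sum_{i=1}^{n} f_i m_i}{\sum_{i=1}^{n} f_i}$$

a). Short-cut method (or change of origin): If d_i = (x_i − A), where A = any arbitrary value (called the origin), then

$$\bar{X} = A + \frac{\sum_{i=1}^{n} f_i d_i}{\sum_{i=1}^{n} f_i}$$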

b). Step-deviation method (or change of origin and scale):

If

$$u_i = \frac{x_i - A}{h}$$

where A = any arbitrary value (called the origin) and h = magnitude of class interval (or scale), then

$$\bar{X} = A + \frac{h}{N} \sum_{i=1}^{n} f_i u_i$$
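The step-deviation formula follows directly from the definition of the mean. The short derivation below is added only as a reading aid; it uses the symbols already defined above (A, h, u_i, f_i, N) and introduces nothing new.

```latex
% Since u_i = (x_i - A)/h, each value can be written as x_i = A + h u_i.
\begin{align*}
\bar{X} &= \frac{1}{N}\sum_{i=1}^{n} f_i x_i
         = \frac{1}{N}\sum_{i=1}^{n} f_i \,(A + h\,u_i) \\
        &= \frac{A}{N}\sum_{i=1}^{n} f_i + \frac{h}{N}\sum_{i=1}^{n} f_i u_i
         = A + \frac{h}{N}\sum_{i=1}^{n} f_i u_i ,
         \qquad \text{because } \sum_{i=1}^{n} f_i = N .
\end{align*}
% Setting h = 1 (so that u_i = d_i) gives the short-cut formula.
```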

Geometric Mean (G.M.)

The geometric mean is the ‘n-th’ root of the product of all ‘n’ values.

i. For a series of data: If the values of the variable are x1, x2, ..., xn, then the Geometric mean of ‘x’ is:

$$G = (x_1 \cdot x_2 \cdots x_n)^{1/n}$$

Alternatively,

$$\log_{10} G = \frac{1}{n} \sum_{i=1}^{n} \log_{10} x_i \quad \text{or} \quad G = \text{Antilog}\left[\frac{1}{n} \sum_{i=1}^{n} \log_{10} x_i\right]$$

ii. For ungrouped frequency distribution:

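If the values x1, x2, ..., xn occur with frequencies f1, f2, ..., fn respectively, then

$$G = \left(x_1^{f_1} \cdot x_2^{f_2} \cdots x_n^{f_n}\right)^{\frac{1}{f_1 + f_2 + \cdots + f_n}} \quad \text{or} \quad G = \text{Antilog}\left[\frac{1}{N} \sum_{i=1}^{n} f_i \log_{10} x_i\right]$$

where N = f1 + f2 + ... + fn.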

iii. For grouped frequency distribution:

$$G = \left(m_1^{f_1} \cdot m_2^{f_2} \cdots m_n^{f_n}\right)^{\frac{1}{f_1 + f_2 + \cdots + f_n}} \quad \text{or} \quad G = \text{Antilog}\left[\frac{1}{N} \sum_{i=1}^{n} f_i \log_{10} m_i\right]$$

where N = f1 + f2 + ... + fn and m1, m2, ..., mn are the mid-values of the classes.

Harmonic mean (H.M.)

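The Harmonic Mean is the reciprocal of the mean of the reciprocals of the observations.

i. For a series of data: If x1, x2, ..., xn are values of a given variable, then the Harmonic Mean is:

$$\text{H.M.} = \frac{1}{\dfrac{1}{n}\left(\dfrac{1}{x_1} + \dfrac{1}{x_2} + \cdots + \dfrac{1}{x_n}\right)} = \frac{n}{\sum_{i=1}^{n} \dfrac{1}{x_i}}$$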

ii. For ungrouped frequency distribution:

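If x1, x2, ..., xn occur with the frequencies f1, f2, ..., fn respectively, then

$$\text{H.M.} = \frac{\sum_{i=1}^{n} f_i}{\dfrac{f_1}{x_1} + \dfrac{f_2}{x_2} + \cdots + \dfrac{f_n}{x_n}} = \frac{\sum_{i=1}^{n} f_i}{\sum_{i=1}^{n} \dfrac{f_i}{x_i}}$$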

iii. For grouped frequency distribution:

$$\text{H.M.} = \frac{\sum_{i=1}^{n} f_i}{\sum_{i=1}^{n} \dfrac{f_i}{m_i}}$$

where m1, m2, ..., mn are the mid values of the classes.

Problem-3. The frequency distribution of weight (g) of 180 sorghum ear-heads is given in the following table. Calculate the A.M., G.M. and H.M.

Table-4. Frequency distribution of sorghum ear heads

Weight of ear head in g (X)    No. of ear heads (f)
40-60                          6
60-80                          28
80-100                         35
100-120                        50
120-140                        30
140-160                        10
160-180                        12
180-200                        9
Total                          180


Solution:

Table-5. Computation of mean (A.M.) by direct method, short-cut method and step-deviation method (A = 110, h = 20)

Class (X)   Mid value (mi)   fi      fi·mi    di = mi − A   fi·di    ui = (mi − A)/h   fi·ui
40-60       50               6       300      -60           -360     -3                -18
60-80       70               28      1960     -40           -1120    -2                -56
80-100      90               35      3150     -20           -700     -1                -35
100-120     110              50      5500     0             0        0                 0
120-140     130              30      3900     20            600      1                 30
140-160     150              10      1500     40            400      2                 20
160-180     170              12      2040     60            720      3                 36
180-200     190              9       1710     80            720      4                 36
Total                        N=180   Σfi·mi = 20060   -     Σfi·di = 260   -           Σfi·ui = 13

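The mean weight of ear head is given by:

i. Direct method:

$$\bar{X} = \frac{\sum_{i=1}^{n} f_i m_i}{N} = \frac{20060}{180} = 111.44 \text{ g}$$

ii. Short-cut method:

$$\bar{X} = A + \frac{\sum_{i=1}^{n} f_i d_i}{N} = 110 + \frac{260}{180} = 110 + 1.44 = 111.44 \text{ g}$$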

iii. Step-deviation method:

$$\bar{X} = A + \frac{h}{N} \sum_{i=1}^{n} f_i u_i = 110 + \frac{20}{180} \times 13 = 110 + 1.44 = 111.44 \text{ g}$$

Table-6. Computation of Geometric mean (G.M.)

Class (x)   Mid value (mi)   Frequency (fi)   log10 mi   fi·log10 mi
40-60       50               6                1.69       10.14
60-80       70               28               1.84       51.52
80-100      90               35               1.95       68.25
100-120     110              50               2.04       102.00
120-140     130              30               2.11       63.30
140-160     150              10               2.17       21.70
160-180     170              12               2.23       26.76
180-200     190              9                2.27       20.43
Total                        180              -          364.1

$$\log_{10} G = \frac{\sum_{i=1}^{n} f_i \log_{10} m_i}{\sum_{i=1}^{n} f_i} = \frac{364.1}{180} = 2.02; \quad G = \text{Antilog}(2.02) = 104.71 \text{ g}$$

Table-7. Computation of Harmonic Mean (H.M.)

Class (x)   Mid value (mi)   Frequency (fi)   fi/mi
40-60       50               6                0.12
60-80       70               28               0.40
80-100      90               35               0.38
100-120     110              50               0.45
120-140     130              30               0.23
140-160     150              10               0.06
160-180     170              12               0.07
180-200     190              9                0.04
Total       -                N=180            Σ(fi/mi) = 1.75

$$\text{Harmonic mean (H.M.)} = \frac{\sum f_i}{\sum \dfrac{f_i}{m_i}} = \frac{180}{1.75} = 102.85 \text{ g}$$

Conclusion: From the above calculation, the Arithmetic Mean (A.M.), Geometric Mean (G.M.) and Harmonic Mean (H.M.) of the weight of sorghum ear-heads are 111.44 g, 104.71 g and 102.85 g respectively, and the relation obtained is A.M. > G.M. > H.M.
Note: In general, for positive values, A.M. ≥ G.M. ≥ H.M., with equality only when all the values are equal.
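The three means of Problem-3 can be cross-checked with the Python sketch below. The function name `grouped_means` is hypothetical. Because the code keeps full precision while Tables 6 and 7 round the logarithms and ratios to two decimals, the G.M. and H.M. it returns (roughly 106.3 g and 101.2 g) differ a little from the hand-computed figures; the ordering A.M. > G.M. > H.M. is unchanged.

```python
import math

# Mid values and frequencies from Table-4 / Table-5
mid_values = [50, 70, 90, 110, 130, 150, 170, 190]
frequencies = [6, 28, 35, 50, 30, 10, 12, 9]

def grouped_means(m, f):
    """Return (A.M., G.M., H.M.) of a grouped frequency distribution."""
    n = sum(f)
    am = sum(fi * mi for fi, mi in zip(f, m)) / n
    # G.M. = Antilog[(1/N) * sum(f_i * log10(m_i))]
    gm = 10 ** (sum(fi * math.log10(mi) for fi, mi in zip(f, m)) / n)
    # H.M. = sum(f_i) / sum(f_i / m_i)
    hm = n / sum(fi / mi for fi, mi in zip(f, m))
    return am, gm, hm

am, gm, hm = grouped_means(mid_values, frequencies)
print(f"A.M. = {am:.2f} g, G.M. = {gm:.2f} g, H.M. = {hm:.2f} g")
```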

Median, Quartile, Decile and Percentiles

In a frequency distribution (arranged in increasing or decreasing order), the median is that value which has half of the observations above it and half below it. Similarly, Quartiles, Deciles and Percentiles are those values of the variate which divide the total frequency into 4, 10 and 100 equal parts respectively.

Procedure:

Prepare a cumulative frequency table and then calculate i·N/4, i·N/10 or i·N/100 to find the ith Quartile class, ith Decile class or ith Percentile class respectively. For Quartiles, i = 1, 2, 3; for Deciles, i = 1, 2, ..., 9; and for Percentiles, i = 1, 2, ..., 99.


Formula:

$$T_i = L_0 + \frac{h}{f_i}\left(\frac{i \cdot N}{x} - c.f.\right)$$

where,
L0 = lower limit of the ith Quartile class (for the ith Quartile), the ith Decile class (for the ith Decile) or the ith Percentile class (for the ith Percentile)
h = width of the frequency distribution class
fi = frequency of the ith Quartile, ith Decile or ith Percentile class
N = total frequency (= Σfi)
c.f. = less than cumulative frequency preceding the ith Quartile, ith Decile or ith Percentile class
x = 4, 10 or 100 for Quartiles, Deciles and Percentiles respectively.

How to find a quartile/decile/percentile class?

In a frequency table, to find the ith Quartile class, ith Decile class or ith Percentile class, compute i·N/4, i·N/10 or i·N/100 respectively. Then locate the first class in the table whose c.f. is more than this value. For Quartiles, i = 1, 2, 3; for Deciles, i = 1, 2, ..., 9; and for Percentiles, i = 1, 2, ..., 99.

Problem-4. Find the Median (2nd Quartile), lower Quartile (1st Quartile), 7th Decile and 85th Percentile of the frequency distribution given below:

Marks in statistics : below 10   10-20   20-30   30-40   40-50   50-60   60-70   above 70
No. of students     : 8          12      20      32      30      28      12      4

Solution:

Table-8. Computation of Median, Quartile, Decile and Percentile

Marks in Statistics (X)   No. of students (fi)   Less than cumulative frequency (c.f.)
<10                       8                      8
10-20                     12                     20
20-30                     20                     40
30-40                     32                     72
40-50                     30                     102
50-60                     28                     130
60-70                     12                     142
>70                       4                      146 = N

(i) Median = 2nd quartile, denoted by Q2, i.e. i = 2

So, for i = 2, i·N/4 = 2N/4 = 146/2 = 73

Hence the Median class is 40-50, corresponding to c.f. = 102, which is > 73.

$$\text{Median} = L_0 + \frac{h}{f_i}\left(\frac{N}{2} - c.f.\right) = 40 + \frac{10}{30}(73 - 72) = 40 + 0.33 = 40.33$$

(ii) First Quartile = Q1. Here, i = 1

So, for i = 1, i·N/4 = 1 × N/4 = 146/4 = 36.5

Hence the Q1 class is 20-30, corresponding to c.f. = 40, which is > 36.5.

$$Q_1 = L_0 + \frac{h}{f_i}\left(\frac{N}{4} - c.f.\right) = 20 + \frac{10}{20}(36.5 - 20) = 20 + 8.25 = 28.25$$

(iii) Seventh Decile = D7. Here, i = 7

So, for i = 7, i·N/10 = 7N/10 = (7 × 146)/10 = 102.2

And the 7th Decile class is 50-60.

$$D_7 = L_0 + \frac{h}{f_i}\left(\frac{7N}{10} - c.f.\right) = 50 + \frac{10}{28}(102.2 - 102.0) = 50 + 0.07 = 50.07$$

(iv) 85th Percentile = P85. Here, i = 85

So, for i = 85, i·N/100 = 85N/100 = (85 × 146)/100 = 124.1

And the 85th Percentile class is 50-60.

$$P_{85} = 50 + \frac{10}{28}(124.1 - 102) = 50 + 7.89 = 57.89$$
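The four results of Problem-4 come from one formula, which the Python sketch below applies to any quartile, decile or percentile. The function name `grouped_quantile` is hypothetical, and the lower limit 0 used for the open class "below 10" is an assumption made only so that every class has a numeric lower limit; it does not affect the four values computed here.

```python
from itertools import accumulate

def grouped_quantile(lower_limits, freqs, h, i, x):
    """ith quantile of order x (4, 10 or 100): L0 + h/f * (i*N/x - c.f.)."""
    n = sum(freqs)
    target = i * n / x
    cum = list(accumulate(freqs))                 # less than cumulative frequencies
    # Quantile class: first class whose c.f. is more than i*N/x
    idx = next(j for j, c in enumerate(cum) if c > target)
    cf_before = cum[idx - 1] if idx > 0 else 0    # c.f. preceding that class
    return lower_limits[idx] + h / freqs[idx] * (target - cf_before)

# Table-8; the lower limit 0 for the class "<10" is an assumption
lower_limits = [0, 10, 20, 30, 40, 50, 60, 70]
freqs = [8, 12, 20, 32, 30, 28, 12, 4]

median = grouped_quantile(lower_limits, freqs, 10, 2, 4)
q1 = grouped_quantile(lower_limits, freqs, 10, 1, 4)
d7 = grouped_quantile(lower_limits, freqs, 10, 7, 10)
p85 = grouped_quantile(lower_limits, freqs, 10, 85, 100)
print(round(median, 2), round(q1, 2), round(d7, 2), round(p85, 2))
```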

Mode of a frequency distribution

The Mode is the value of the variate which occurs most frequently in the data set. In a frequency table, the Modal class is the class which has the greatest frequency.

Procedure:

i. For a series or ungrouped data: The observation which has the highest frequency, i.e. the value which occurs the maximum number of times, is the mode.

ii. For grouped data:

Formula:

$$\text{Mode} = M_o = L_0 + \frac{(f - f_p)}{(2f - f_p - f_s)} \times h$$

Where, L0 = lower limit of the modal class
f = frequency of the modal class
fp = frequency of the class preceding the modal class
fs = frequency of the class succeeding the modal class
h = width of the frequency distribution class

Note: The class which has the highest frequency is the modal class.

Problem-5. Compute the Modal value of the wages of workers in a farm from the following frequency distribution.

Wages (Rs.)   No. of workers
30-35         12
35-40         18
40-45         22
45-50         27
50-55         17
55-60         23
60-65         29
65-70         8

Solution:

Modal class = class with the maximum frequency (= 29), i.e. 60-65

$$\text{Mode} = L_0 + \frac{(f - f_p)}{(2f - f_p - f_s)} \times h$$

L0 = lower limit of the modal class = 60

f = frequency of the modal class = 29

fp = frequency of the class preceding the modal class = 23

fs = frequency of the class succeeding the modal class = 8

h = class size = 5

$$\text{Mode} = 60 + \frac{(29 - 23) \times 5}{(2 \times 29 - 23 - 8)} = 60 + \frac{6}{27} \times 5 = 60 + 1.11 = 61.11$$
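The modal-class formula can be checked with the following Python sketch. The function name `grouped_mode` is hypothetical; treating a missing neighbouring class as having frequency 0 (when the modal class is the first or last class) is an assumption that the manual does not discuss.

```python
def grouped_mode(lower_limits, freqs, h):
    """Mode = L0 + (f - fp) / (2f - fp - fs) * h for the modal class."""
    k = max(range(len(freqs)), key=freqs.__getitem__)   # index of the modal class
    f = freqs[k]
    fp = freqs[k - 1] if k > 0 else 0                   # preceding class (0 if none: assumption)
    fs = freqs[k + 1] if k + 1 < len(freqs) else 0      # succeeding class (0 if none: assumption)
    return lower_limits[k] + (f - fp) / (2 * f - fp - fs) * h

# Wages data of Problem-5
lower_limits = [30, 35, 40, 45, 50, 55, 60, 65]
workers = [12, 18, 22, 27, 17, 23, 29, 8]

mode = grouped_mode(lower_limits, workers, 5)
print(round(mode, 2))
```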

1.4. Measures of dispersion of a frequency distribution

The literal meaning of dispersion is scatteredness. We study dispersion to get an idea of the homogeneity or heterogeneity of a distribution, i.e. the scatter of the observations about a central value. There are several measures of dispersion, and each provides specific information about the scatter of the values in a distribution. The mean together with a measure of dispersion tells more about the data than the mean alone. The measures of dispersion are Range, Quartile Deviation, Mean Deviation, Standard Deviation, Variance and Coefficient of Variation.


Mean deviation from a particular value ‘A’ (Mean or Median or Mode) of a frequency distribution

Procedure:

Mean deviation is defined as the arithmetic mean of the absolute deviations of the variate values from a particular measure of location. This mean deviation may be about the Mean, about the Median or about the Mode.

In a frequency distribution,

$$\text{M.D.} = \frac{1}{N} \sum_{i=1}^{n} f_i \, |x_i - A|$$

where x1, x2, ..., xn are the values, or the mid-values of the classes, with frequencies f1, f2, ..., fn,

$$N = \text{Total frequency} = \sum_{i=1}^{n} f_i$$

and A = either the Mean or the Median or the Mode.

Problem-6. Compute the Mean Deviation from the Mean from the following data.

Wages (Rs.)   Number of labourers
60-70         5
50-60         10
40-50         20
30-40         8
20-30         3

Solution:

Table-9. Computation of Mean Deviation from Mean

Wages (Rs.)   Mid value (xi)   Number of labourers (fi)   fi·xi   |d| = |xi − mean|   fi·|d|
60-70         65               5                          325     18.70               93.50
50-60         55               10                         550     8.70                87.00
40-50         45               20                         900     1.30                26.00
30-40         35               8                          280     11.30               90.40
20-30         25               3                          75      21.30               63.90
Total                          46                         2130    -                   360.80

$$\text{Mean} = \frac{\sum f_i x_i}{\sum f_i} = \frac{2130}{46} = 46.30$$

$$\text{Mean Deviation from mean} = \frac{\sum f_i |d_i|}{\sum f_i} = \frac{360.80}{46} = 7.843$$
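A minimal Python sketch of the mean-deviation formula is given below, with the hypothetical function name `mean_deviation`. By default it measures deviations from the mean; any other value of A (median or mode) can be passed through the `about` argument. At full precision the result for Problem-6 is about 7.845; the 7.843 in Table-9 reflects the mean having been rounded to 46.30 before the deviations were taken.

```python
def mean_deviation(values, freqs, about=None):
    """M.D. = (1/N) * sum(f_i * |x_i - A|); A defaults to the arithmetic mean."""
    n = sum(freqs)
    if about is None:
        about = sum(f * x for f, x in zip(freqs, values)) / n
    return sum(f * abs(x - about) for f, x in zip(freqs, values)) / n

# Mid values and frequencies from Table-9
mid_values = [65, 55, 45, 35, 25]
labourers = [5, 10, 20, 8, 3]

md = mean_deviation(mid_values, labourers)
print(round(md, 3))
```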


Standard Deviation, Variance and Coefficient of Variation (C.V.)

Procedure:

The arithmetic mean of the squares of the deviations of the variate values from their arithmetic mean is defined as the Variance. The positive square root of the Variance is called the Standard Deviation (S.D.).

Coefficient of Variation (C.V.) is the magnitude of variation in the observations relative to the magnitude of their arithmetic mean. It is defined as the ratio of the standard deviation to the arithmetic mean, expressed as a percentage.

There are two methods for calculating the Standard Deviation:
i). Direct method
ii). Short-cut method (by change of origin and scale)

i. Direct method:

Step 1 : Calculate the mid value (xi) for grouped data

Step 2 : Calculate fi·xi for each class and finally Σfi·xi

Step 3 : Calculate xi² and fi·xi² and finally Σfi·xi²

Step 4 : Calculate the S.D. (σ) by using the formula

$$\text{S.D.} = \sigma = +\sqrt{\frac{\sum f_i x_i^2}{N} - \left(\frac{\sum f_i x_i}{N}\right)^2}, \quad \text{where } N = \sum f_i$$

and Variance,

$$\sigma^2 = \frac{\sum f_i x_i^2}{N} - \left(\frac{\sum f_i x_i}{N}\right)^2$$

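ii. Short-cut Method or Step deviation method:

Step 1 : Calculate the mid value (xi) for grouped data

Step 2 : Calculate the deviation value (di), where

$$d_i = \frac{x_i - A}{c}$$

where A = any arbitrary value or mean, and c = class size

Step 3 : Calculate fi·di and fi·di² for each class and finally Σfi·di and Σfi·di²

Step 4 : Calculate the S.D. by using the formula

$$\text{S.D.} = c \times \sqrt{\frac{\sum f_i d_i^2}{N} - \left(\frac{\sum f_i d_i}{N}\right)^2}$$

So,

$$\text{Variance} = \sigma^2 = c^2 \left[\frac{\sum f_i d_i^2}{N} - \left(\frac{\sum f_i d_i}{N}\right)^2\right]$$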


$$\text{Coefficient of Variation, C.V.} = \frac{\sigma}{\bar{X}} \times 100 = \frac{\text{S.D.}}{\text{Mean}} \times 100$$

Standard deviation is an absolute measure of dispersion, whereas C.V. is a relative measure of dispersion, expressed as a percentage, for comparing two or more data sets.

Problem-7. Compute the Standard Deviation, Variance and C.V. from the following data.

Size of the holding (ha)   No. of farmers
2.5-3.5                    1000
3.5-4.5                    2300
4.5-5.5                    3600
5.5-6.5                    2400
6.5-7.5                    1700
7.5-8.5                    3000
8.5-9.5                    500

Solution:

Table-10. Calculation table for Standard Deviation

Size of holding (ha)   Mid value (xi)   fi      fi·xi   fi·xi²   di = (xi − A) for A = 6   fi·di   fi·di²
2.5-3.5                3                1000    3000    9000     -3                        -3000   9000
3.5-4.5                4                2300    9200    36800    -2                        -4600   9200
4.5-5.5                5                3600    18000   90000    -1                        -3600   3600
5.5-6.5                6                2400    14400   86400    0                         0       0
6.5-7.5                7                1700    11900   83300    1                         1700    1700
7.5-8.5                8                3000    24000   192000   2                         6000    12000
8.5-9.5                9                500     4500    40500    3                         1500    4500
Total                                   14500   85000   538000                             -2000   40000

a). Direct method:

$$\text{S.D.} = \sqrt{\frac{\sum f_i x_i^2}{N} - \left(\frac{\sum f_i x_i}{N}\right)^2} = \sqrt{\frac{538000}{14500} - \left(\frac{85000}{14500}\right)^2} = \sqrt{37.103 - 34.364} = 1.65$$

b). Step Deviation Method:

i. With c = 1,

$$\text{S.D.} = c \times \sqrt{\frac{\sum f_i d_i^2}{N} - \left(\frac{\sum f_i d_i}{N}\right)^2} = 1 \times \sqrt{\frac{40000}{14500} - \left(\frac{-2000}{14500}\right)^2} = \sqrt{2.758 - 0.019} = \sqrt{2.739} = 1.655$$

ii. Variance = (S.D.)² = (1.655)² = 2.739

iii. Coefficient of Variation, C.V. = (S.D. / Mean) × 100

Here,

$$\text{Mean} = \frac{\sum f_i x_i}{\sum f_i} = \frac{85000}{14500} = 5.862$$

$$\text{C.V.} = \frac{\text{S.D.}}{\text{Mean}} \times 100 = \frac{1.655}{5.862} \times 100 = 28.23\%$$
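Both routes to the standard deviation in Problem-7 can be checked against each other with the Python sketch below. The function names are hypothetical; the data are the mid values and frequencies of Table-10, with A = 6 and c = 1 as in the worked solution.

```python
import math

def sd_direct(x, f):
    """Direct method: returns (mean, variance, S.D.)."""
    n = sum(f)
    mean = sum(fi * xi for fi, xi in zip(f, x)) / n
    var = sum(fi * xi * xi for fi, xi in zip(f, x)) / n - mean ** 2
    return mean, var, math.sqrt(var)

def sd_step_deviation(x, f, a, c):
    """Step-deviation method with origin a and scale c: returns (variance, S.D.)."""
    n = sum(f)
    d = [(xi - a) / c for xi in x]
    m1 = sum(fi * di for fi, di in zip(f, d)) / n
    m2 = sum(fi * di * di for fi, di in zip(f, d)) / n
    var = c * c * (m2 - m1 ** 2)
    return var, math.sqrt(var)

mid_values = [3, 4, 5, 6, 7, 8, 9]
farmers = [1000, 2300, 3600, 2400, 1700, 3000, 500]

mean, var, sd = sd_direct(mid_values, farmers)
var_step, sd_step = sd_step_deviation(mid_values, farmers, a=6, c=1)
cv = sd / mean * 100                      # coefficient of variation in percent
print(round(sd, 3), round(var, 3), round(cv, 2))
```

The two methods agree because shifting the origin does not change the spread, and the scale factor c is restored when the result is multiplied by c.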

Moments, skewness and kurtosis

First four moments about mean of a frequency distribution

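Procedure:

Generally there are two types of moments.

1). Moments about the mean (μ_r):

$$\mu_r = \frac{\sum f_i (x_i - \bar{x})^r}{\sum f_i}$$

2). Moments about the origin (μ'_r):

$$\mu'_r = \frac{\sum f_i d_i^r}{\sum f_i}, \quad \text{where } d_i = x_i - A \text{ and } A = \text{any arbitrary value}$$

By the step deviation method:

$$\mu'_r = h^r \, \frac{\sum f_i d_i^r}{\sum f_i}, \quad \text{where } d_i = \frac{x_i - A}{h}$$

The moments about the mean are:

$$\mu_1 = 0$$

$$\mu_2 = \mu'_2 - (\mu'_1)^2$$

$$\mu_3 = \mu'_3 - 3\mu'_2 \mu'_1 + 2(\mu'_1)^3$$

$$\mu_4 = \mu'_4 - 4\mu'_3 \mu'_1 + 6\mu'_2 (\mu'_1)^2 - 3(\mu'_1)^4$$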
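The conversion from moments about an arbitrary origin to moments about the mean can be sketched in Python as follows. The function names `raw_moments` and `central_from_raw` are hypothetical, and the small data set at the end is made up purely to exercise the formulas; it is not from the manual.

```python
def raw_moments(x, f, a, h=1):
    """First four moments about the arbitrary origin a (step-deviation form)."""
    n = sum(f)
    d = [(xi - a) / h for xi in x]
    return [h ** r * sum(fi * di ** r for fi, di in zip(f, d)) / n
            for r in (1, 2, 3, 4)]

def central_from_raw(m1, m2, m3, m4):
    """Moments about the mean from moments about an arbitrary origin."""
    mu2 = m2 - m1 ** 2
    mu3 = m3 - 3 * m2 * m1 + 2 * m1 ** 3
    mu4 = m4 - 4 * m3 * m1 + 6 * m2 * m1 ** 2 - 3 * m1 ** 4
    return 0.0, mu2, mu3, mu4

# Illustrative data only: values 1..5, each occurring once, origin A = 2
values = [1, 2, 3, 4, 5]
freqs = [1, 1, 1, 1, 1]
mu1, mu2, mu3, mu4 = central_from_raw(*raw_moments(values, freqs, a=2))
print(mu1, mu2, mu3, mu4)
```

The result does not depend on the choice of A or h, which is why a convenient origin near the centre of the data is normally used in hand calculation.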

Measure of Skewness and Kurtosis for a frequency distribution

Skewness is defined as lack of symmetry about the central value. Measures of Skewness signify the direction and extent of skewness (skewed to the left or to the right). There are two methods of finding a measure of Skewness from a given frequency table.

First method – Karl Pearson coefficient of Skewness

Step-1. Find out Mean, Mode and S.D.


Step-2. Calculate measure of Skewness by using the formula given by Karl Pearson,

$$S_k = \frac{\text{Mean} - \text{Mode}}{\text{S.D.}}$$

Second method - For wide class of frequency distribution

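Step-1. Find 2nd and 3rd moments about mean

Step-2. Calculate measure of Skewness,

$$\beta_1 = \frac{\mu_3^2}{\mu_2^3}, \qquad \gamma_1 = \sqrt{\beta_1}$$

Where,

$$\mu_2 = \frac{\sum f_i (x_i - \bar{x})^2}{\sum f_i}, \qquad \mu_3 = \frac{\sum f_i (x_i - \bar{x})^3}{\sum f_i}$$

If $\beta_1 = 0$ or $\gamma_1 = 0$, the distribution is symmetrical; otherwise it is skewed to the left or right according as the sign of $\mu_3$ is -ve or +ve.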

Kurtosis is a measure of the peakedness or flatness of the curve of a distribution. Kurtosis is of three types: Platykurtic, Leptokurtic and Mesokurtic. Kurtosis can be computed by the following steps.

Step-1. Find out 2nd and 4th moments about the mean of the distribution

Step-2. Calculate Kurtosis as,

$$\beta_2 = \frac{\mu_4}{\mu_2^2} \quad \text{or} \quad \gamma_2 = \beta_2 - 3$$

Where, $\mu_4$ = 4th central moment (about mean) and $\mu_2$ = 2nd central moment (about mean).

If $\beta_2 = 3$ or $\gamma_2 = 0$, the distribution is normal i.e. mesokurtic

$\beta_2 > 3$ or $\gamma_2 > 0$, the distribution is more peaked i.e. leptokurtic

$\beta_2 < 3$ or $\gamma_2 < 0$, the distribution is more flattened i.e. platykurtic

Problem-8. Calculate the four moments about mean and find out the measures of Skewness & Kurtosis from the following table.

| Class interval | 10-20 | 20-30 | 30-40 | 40-50 | 50-60 | 60-70 | 70-80 |
|---|---|---|---|---|---|---|---|
| Frequency | 3 | 7 | 4 | 14 | 8 | 6 | 3 |

Solution:


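Table-11. Calculation of moments (A = 45, h = 10)

| Class interval | Frequency ($f_i$) | Mid value ($x_i$) | $d_i = (x_i - A)/h$ | $f_i d_i$ | $f_i d_i^2$ | $f_i d_i^3$ | $f_i d_i^4$ |
|---|---|---|---|---|---|---|---|
| 10-20 | 3 | 15 | -3 | -9 | 27 | -81 | 243 |
| 20-30 | 7 | 25 | -2 | -14 | 28 | -56 | 112 |
| 30-40 | 4 | 35 | -1 | -4 | 4 | -4 | 4 |
| 40-50 | 14 | 45 | 0 | 0 | 0 | 0 | 0 |
| 50-60 | 8 | 55 | 1 | 8 | 8 | 8 | 8 |
| 60-70 | 6 | 65 | 2 | 12 | 24 | 48 | 96 |
| 70-80 | 3 | 75 | 3 | 9 | 27 | 81 | 243 |
| Total | 45 | | | 2 | 118 | -4 | 706 |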

From the table,

$$\mu'_1 = h\,\frac{\sum f_i d_i}{\sum f_i} = 10 \times \frac{2}{45} = 0.44$$

$$\mu'_2 = h^2\,\frac{\sum f_i d_i^2}{\sum f_i} = 100 \times \frac{118}{45} = 262.22$$

$$\mu'_3 = h^3\,\frac{\sum f_i d_i^3}{\sum f_i} = 1000 \times \frac{-4}{45} = -88.88$$

$$\mu'_4 = h^4\,\frac{\sum f_i d_i^4}{\sum f_i} = 10{,}000 \times \frac{706}{45} = 156888.88$$

$$\mu_2 = \mu'_2 - (\mu'_1)^2 = 262.22 - (0.44)^2 = 262.02$$

$$\mu_3 = \mu'_3 - 3\mu'_2\mu'_1 + 2(\mu'_1)^3 = -88.88 - 3(262.22)(0.44) + 2(0.44)^3 = -435.01 + 0.170 = -434.84$$

$$\mu_4 = \mu'_4 - 4\mu'_3\mu'_1 + 6\mu'_2(\mu'_1)^2 - 3(\mu'_1)^4 = 156888.88 - 4(-88.88)(0.44) + 6(262.22)(0.44)^2 - 3(0.44)^4$$

$$= 156888.88 + 156.43 + 304.59 - 0.11 = 157349.79$$

So,

$$\text{Skewness} = \gamma_1 = \sqrt{\beta_1} = \sqrt{\frac{(\mu_3)^2}{(\mu_2)^3}} = \sqrt{\frac{(-434.84)^2}{(262.02)^3}} = \sqrt{\frac{189085.83}{17988846.95}} = 0.10$$

$$\text{Kurtosis} = \beta_2 = \frac{\mu_4}{(\mu_2)^2} = \frac{157349.79}{(262.02)^2} = \frac{157349.79}{68654.48} = 2.29$$

By moment method, Skewness and Kurtosis of the given distribution are 0.10 and 2.29 respectively.

lyrespective

andareondistributigiventheofKurtosisandSkewnessmethodmomentBy

So, it is concluded that the distribution of the data is not symmetrical, i.e. skewed to the left, as $\gamma_1 = 0.10$ and the sign of $\mu_3$ is -ve. Again the distribution is also not normal, i.e. less peaked (platykurtic), as $\beta_2$ is less than 3, i.e. $\beta_2 = 2.29$.
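The moment calculations of Problem-8 can be reproduced with the following sketch. It is written in Python with illustrative names that are not from the manual; it applies the step deviation formula for the moments about A and then the conversion formulas for the moments about the mean. Since it keeps full precision instead of rounding μ'1 to 0.44, the intermediate moments differ a little from the hand-worked figures, while the skewness and kurtosis agree to two decimals.

```python
import math

# Mid values and frequencies from Table-11; A and h as used there.
mid_values = [15, 25, 35, 45, 55, 65, 75]
freqs = [3, 7, 4, 14, 8, 6, 3]
A, h = 45, 10


def moments_about_a(x, f, a, h):
    """First four moments about the arbitrary value a (step deviation method)."""
    n = sum(f)
    d = [(xi - a) / h for xi in x]
    return [h ** r * sum(fi * di ** r for di, fi in zip(d, f)) / n
            for r in range(1, 5)]


def central_moments(m1, m2, m3, m4):
    """Convert moments about a into moments about the mean (mu2, mu3, mu4)."""
    mu2 = m2 - m1 ** 2
    mu3 = m3 - 3 * m2 * m1 + 2 * m1 ** 3
    mu4 = m4 - 4 * m3 * m1 + 6 * m2 * m1 ** 2 - 3 * m1 ** 4
    return mu2, mu3, mu4


m1, m2, m3, m4 = moments_about_a(mid_values, freqs, A, h)
mu2, mu3, mu4 = central_moments(m1, m2, m3, m4)

beta1 = mu3 ** 2 / mu2 ** 3
gamma1 = math.sqrt(beta1)       # magnitude only; direction comes from the sign of mu3
beta2 = mu4 / mu2 ** 2

print(f"mu2 = {mu2:.2f}, mu3 = {mu3:.2f}, mu4 = {mu4:.2f}")
print(f"gamma1 = {gamma1:.2f} ({'left' if mu3 < 0 else 'right'} skew)")
print(f"beta2  = {beta2:.2f}")
```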

Exercise: The following are the 405 soybean plant heights collected from a particular plot.

| Plant height (cm) | 8-12 | 13-17 | 18-22 | 23-27 | 28-32 | 33-37 | 38-42 | 43-47 | 48-52 | 53-57 |
|---|---|---|---|---|---|---|---|---|---|---|
| No. of plants ($f_i$) | 6 | 17 | 25 | 86 | 125 | 77 | 55 | 9 | 4 | 1 |

Compute:
i). A.M., G.M., H.M., Median, Mode
ii). Mean Deviation from mean, S.D., Variance, C.V.
iii). Coefficient of Skewness and Kurtosis
iv). Interpret the above results for soybean

1.5. Testing of Hypothesis or Test of Significance or decision rule

An estimate based on sample values does not equal the true value in the population, because of the inherent variation in the population. Different samples drawn will give different estimates compared to the true value. It has to be verified whether the difference between the sample estimate and the population value is due to sampling fluctuation or is a real difference. If the difference is due to sampling fluctuation only, it can be safely said that the sample belongs to the population under question; if the difference is real, we have every reason to believe that the sample may not belong to the population under question.

Steps involved in test of hypothesis:

1) The null and alternative hypotheses will be formulated.
2) Test statistic will be constructed.
3) Level of significance will be fixed.
4) The table (critical) values will be found out from the tables for a given level of significance.
5) The null hypothesis will be rejected at the given level of significance if the value of test statistic is greater than or equal to the critical value. Otherwise null hypothesis will be accepted.
6) In the case of rejection the variation in the estimates will be called "significant" variation. In the case of acceptance the variation in the estimates will be called "not significant".

1.6. Standard normal deviate (SND) or Z tests or Large SampleTests

If the sample size n ≥ 30 then it is considered a large sample, and if the sample size n < 30 then it is considered a small sample; accordingly there are large sample and small sample tests.

SND Test or One Sample (Z-test) for single mean

Case-I: Population standard deviation (σ) is known

Assumptions:

1. Population is normally distributed
2. The sample is drawn at random

Conditions:

1. Population standard deviation is known
2. Size of the sample is large (n ≥ 30)

Procedure: Let x1, x2, ..., xn be a random sample of size n from a normal population with mean μ and variance σ². Let $\bar{x}$ be the mean of the sample of size n.

Null hypothesis is H0: μ = μ0 (a specified value) and alternative is H1: μ ≠ μ0 (two-tail)

Under H0, the test statistic is

$$Z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} \sim N(0,1)$$

i.e. the above statistic follows the Normal Distribution with mean 0 and variance 1.

If Zcal ≤ Ztab at 5% level of significance, H0 is accepted and hence we conclude that there is no significant difference between the population mean and the one specified in H0 as μ0.

Problem-9. A sample of 900 leaves has a mean of 3.4 cms and S.D. of 2.61 cms. Is the sample drawn from a large population of mean 3.25 cms?

Solution:

Here, Null Hypothesis is H0: μ = μ0

and alternative is H1: μ ≠ μ0 (two-tail)


Given $\bar{x}$ = 3.4, μ0 = 3.25, σ = 2.61 and n = 900

Putting the values in the formula, we get

$$Z = \frac{3.4 - 3.25}{2.61/\sqrt{900}} = \frac{0.15}{0.087} = 1.72$$

The tabulated value of Z at 5% is 1.96.

So, Z calculated is less than tabulated. Hence, H0 is accepted i.e. the sample is drawn from a large population of mean 3.25 cms.
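The substitution in Problem-9 can be checked with a few lines of code. This is a minimal sketch, assuming Python 3.8 or later for `statistics.NormalDist`; the helper name `z_one_sample` is illustrative and not from the manual. It computes Z from the given summary values and compares it with the two-tail 5% critical value.

```python
import math
from statistics import NormalDist


def z_one_sample(xbar, mu0, sigma, n):
    """Z statistic for a single mean when the population S.D. is known."""
    return (xbar - mu0) / (sigma / math.sqrt(n))


# Values given in Problem-9.
z_cal = z_one_sample(xbar=3.4, mu0=3.25, sigma=2.61, n=900)

# Two-tail critical value at the 5% level from the standard normal curve.
z_tab = NormalDist().inv_cdf(1 - 0.05 / 2)

reject_h0 = abs(z_cal) > z_tab
print(f"Z calculated = {z_cal:.3f}, Z tabulated = {z_tab:.2f}")
print("H0 rejected" if reject_h0 else "H0 accepted")
```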

Exercise: A herd of 1500 steer was fed a special high-protein grain for a month. A random sample of 29 was weighed and had gained an average of 6.7 kgs. If the standard deviation of weight gain for the entire herd is 7.1 kgs., test the hypothesis that the average weight gain per steer for the month was more than 5 kgs. (Hints: H0: μ = 5, H1: μ > 5, Zcal = 1.289)

Case-II: If σ is not known

Null hypothesis (H0): μ = μ0

Under H0, the test statistic

$$Z = \frac{|\bar{x} - \mu_0|}{s/\sqrt{n}} \sim N(0,1)$$

Where,

$$s = \sqrt{\frac{1}{n}\left[\sum x^2 - \frac{(\sum x)^2}{n}\right]}$$

and x's are sample observations.

If Zcal ≤ Ztab at 5% level of significance, H0 is accepted and hence we conclude that there is no significant difference between the population mean and the one specified in H0; otherwise we do not accept H0.

The table below gives some critical values of Z:

| Level of significance | Critical value of Z (Two-tail) | Critical value of Z (One-tail) |
|---|---|---|
| 10% | 1.645 | 1.28 |
| 5% | 1.96 | 1.645 |
| 1% | 2.58 | 2.33 |
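The critical values in the table come from the standard normal curve and can be regenerated rather than looked up. The sketch below assumes Python 3.8 or later (for `statistics.NormalDist`); the function name `z_critical` is illustrative.

```python
from statistics import NormalDist

std_normal = NormalDist()  # standard normal: mean 0, S.D. 1


def z_critical(alpha, two_tail=True):
    """Critical value of Z for level of significance alpha."""
    tail_area = alpha / 2 if two_tail else alpha
    return std_normal.inv_cdf(1 - tail_area)


for alpha in (0.10, 0.05, 0.01):
    print(f"{alpha:.0%}: two-tail = {z_critical(alpha):.3f}, "
          f"one-tail = {z_critical(alpha, two_tail=False):.3f}")
```

For a two-tail test the level of significance is split equally between the two tails, which is why the 5% two-tail value equals the 2.5% one-tail value.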

SND test for two sample means or Z-test of significance for difference of two means

Case-I: when σ is known

Procedure:

Let $\bar{x}_1$ be the mean of a random sample of size n1 from a population with mean μ1 and variance σ1², and let $\bar{x}_2$ be the mean of a random sample of size n2 from another population with mean μ2 and variance σ2².

The hypothesis is H0: μ1 = μ2 and H1: μ1 ≠ μ2 (two-tail), i.e. the null hypothesis states that the population means of the two samples are identical. Under the null hypothesis the test statistic becomes

$$Z = \frac{|\bar{x}_1 - \bar{x}_2|}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}} \sim N(0,1)$$

i.e. the above statistic follows the Normal Distribution with mean 0 and variance 1.

If σ1² = σ2² = σ² (say), i.e. both samples have the same standard deviation (or variance), then the test statistic becomes

$$Z = \frac{|\bar{x}_1 - \bar{x}_2|}{\sigma\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}} \sim N(0,1)$$

If Zcal ≤ Ztab at 5% level of significance, H0 is accepted; otherwise it is rejected.

If H0 is accepted, there is no significant difference between the population means of the two samples, i.e. the means are taken as identical.

Problem-10. The average panicle length of 60 paddy plants in field No.1 is 18.5 cm and that of 70 paddy plants in field No.2 is 20.3 cm, with common S.D. of 1.15 cm. Test whether there is significant difference between two paddy fields w.r.t. mean of panicle length.

Solution:

Hypothesis, H0: There is no significant difference between the means of two paddy fields w.r.t. panicle length, i.e. μ1 = μ2

Under H0, the test statistic becomes

$$Z = \frac{|\bar{x}_1 - \bar{x}_2|}{\sigma\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}} \sim N(0,1)$$

where $\bar{x}_1$ = 18.5, $\bar{x}_2$ = 20.3, n1 = 60, n2 = 70, σ = 1.15

Substituting the given values in the formula, we get Z = 8.89

Conclusion: At 5% level of significance 8.89 > 1.96 (table value) and hence H0 is rejected, which means there is significant difference between the mean panicle lengths of the two paddy populations.
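The two-sample statistic can be sketched in code as below. The function `z_two_sample` is an illustrative name, not part of the manual; it implements the general formula with separate population S.D.s, and the common-S.D. case of Problem-10 is obtained by passing the same value twice.

```python
import math


def z_two_sample(xbar1, xbar2, n1, n2, sigma1, sigma2):
    """Z statistic for the difference of two means with known population S.D.s."""
    std_error = math.sqrt(sigma1 ** 2 / n1 + sigma2 ** 2 / n2)
    return abs(xbar1 - xbar2) / std_error


# Problem-10: common S.D. of 1.15 cm for both fields.
z_cal = z_two_sample(18.5, 20.3, n1=60, n2=70, sigma1=1.15, sigma2=1.15)
Z_TAB_5_PERCENT = 1.96  # two-tail table value used in the text

print(f"Z calculated = {z_cal:.2f}")
print("H0 rejected" if z_cal > Z_TAB_5_PERCENT else "H0 accepted")
```

With equal S.D.s the standard error reduces to σ√(1/n1 + 1/n2), so one function covers both forms of the test statistic.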


Example: The amount of a certain trace element in blood is known to vary with a standard deviation of 14.1 ppm (parts per million) for male blood donors and 9.5 ppm for female donors. Random samples of 75 male and 50 female donors yield concentration means of 28 and 33 ppm, respectively. Test whether the population means of concentrations of the element are the same for men and women, assuming unequal variance.

(Hints: H0: μ1 = μ2, H1: μ1 ≠ μ2, Zcal = -2.37)

Case-II: when S.D. of both populations not known

The above methods are followed only after estimating the S.D. of the two populations from the sample observations as:

$$S_1 = \sqrt{\frac{1}{n_1}\left[\sum x_1^2 - \frac{(\sum x_1)^2}{n_1}\right]}, \qquad S_2 = \sqrt{\frac{1}{n_2}\left[\sum x_2^2 - \frac{(\sum x_2)^2}{n_2}\right]}$$

Where x1 and x2 are the independent sample observations with sizes n1 and n2 from the two normal populations respectively.

The pooled variance (S2) or S.D.(S) is computed as:

S2=

Problem-11. A breeder wants to investigate whether the number of filled grains per panicle is the same in a new variety of paddy ACM.5 and an old variety ADT.36. To verify, a random sample of 50 plants of ACM.5 and 60 plants of ADT.36 were selected from the experimental fields. The following results were obtained:

| | For ACM.5 | For ADT.36 |
|---|---|---|
| Mean | 139.4 | 112.9 |
| S.D. | S1 = 26.864 | S2 = 20.1096 |
| Sample size | N1 = 50 | N2 = 60 |

Test whether the claim of the breeder is correct.

Solution:

The hypothesis is H0: μ1 = μ2 and H1: μ1 ≠ μ2 (two-tail)

Assuming that the two population variances are unequal, put the given values in the formula

$$Z = \frac{|\bar{x}_1 - \bar{x}_2|}{\sqrt{\dfrac{S_1^2}{n_1} + \dfrac{S_2^2}{n_2}}} = \frac{|139.4 - 112.9|}{\sqrt{\dfrac{(26.864)^2}{50} + \dfrac{(20.1096)^2}{60}}} = \frac{26.5}{\sqrt{14.433 + 6.740}} = \frac{26.5}{4.601} = 5.76$$

Calculated value of Z > table value of Z at 5% level of significance (= 1.96), so H0 is rejected. We conclude that the number of filled grains per panicle is significantly different in the two varieties ACM.5 and ADT.36.

1.7. Small Sample Tests


It is applicable when the sample size n<30.

Test of hypothesis on equality of two variances (Snedecor's F-test or variance ratio test)

Let x1, x2, ..., xn1 be a sample of size n1 drawn from a normal population with variance σx², and y1, y2, ..., yn2 be another sample of size n2 drawn independently from a normal population with variance σy², for the same variable under study. Now we are interested to know whether the two samples are drawn from two different normal populations or they belong to the same normal population w.r.t. variance or scatteredness of the observations.

Procedure:

Step-1. The Assumptions in F-test:

i. Parent population must be normal.
ii. Samples are independent.

Step-2. Take the null hypothesis H0: σx² = σy² against the alternate hypothesis H1: σx² ≠ σy²

Step-3. Choose the level of significance i.e. 5% or 1%.

Step-4. Choose the location of Critical region i.e. one tailed or two tailed test.

Step-5. Compute the observed value of F as:

$$F = \frac{S_x^2}{S_y^2} \text{ with } (n_1 - 1) \text{ and } (n_2 - 1) \text{ d.f., if } S_x^2 > S_y^2 \text{ (the greater value is taken in the numerator)}$$

Where,

$$S_x^2 = \frac{\sum (x_i - \bar{x})^2}{n_1 - 1}, \qquad S_y^2 = \frac{\sum (y_i - \bar{y})^2}{n_2 - 1}$$

Step-6. Compare the observed value with tabular value.

Step-7. If Fcal > Ftab then the null hypothesis is rejected and it is significant.
If Fcal ≤ Ftab then the null hypothesis is accepted and it is not significant.

Problem-12. Two independent samples on dry weight (g) of plants were observed from two populations as:

Sample-1 (x): 39, 41, 43, 41, 45, 39, 42, 44

Sample-2 (y): 40, 42, 40, 44, 39, 38, 40

Do the estimates of the population variances differ significantly?

Solution:


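The Hypothesis is:

H0: σx² = σy² (take the hypothesis that the populations have the same variances)

H1: σx² ≠ σy²

Level of significance, α = 0.05

Test Statistic,

$$F = \frac{S_x^2}{S_y^2}, \quad \text{where } S_x^2 = \frac{\sum (x_i - \bar{x})^2}{n_1 - 1} \text{ and } S_y^2 = \frac{\sum (y_i - \bar{y})^2}{n_2 - 1}$$

Table-12. Calculation of variances

| Obs. No. | x | y | $(x - \bar{x})$ | $(y - \bar{y})$ | $(x - \bar{x})^2$ | $(y - \bar{y})^2$ |
|---|---|---|---|---|---|---|
| 1 | 39 | 40 | -2.75 | -0.42 | 7.5625 | 0.1764 |
| 2 | 41 | 42 | -0.75 | 1.58 | 0.5625 | 2.4964 |
| 3 | 43 | 40 | 1.25 | -0.42 | 1.5625 | 0.1764 |
| 4 | 41 | 44 | -0.75 | 3.58 | 0.5625 | 12.8164 |
| 5 | 45 | 39 | 3.25 | -1.42 | 10.5625 | 2.0164 |
| 6 | 39 | 38 | -2.75 | -2.42 | 7.5625 | 5.8564 |
| 7 | 42 | 40 | 0.25 | -0.42 | 0.0625 | 0.1764 |
| 8 | 44 | - | 2.25 | - | 5.0625 | - |
| Total | 334 | 283 | - | - | 33.5 | 23.7148 |

$$\bar{x} = \frac{\sum x_i}{n_1} = \frac{334}{8} = 41.75, \qquad \bar{y} = \frac{\sum y_i}{n_2} = \frac{283}{7} = 40.42$$

$$S_x^2 = \frac{\sum (x_i - \bar{x})^2}{n_1 - 1} = \frac{33.5}{7} = 4.7857, \qquad S_y^2 = \frac{\sum (y_i - \bar{y})^2}{n_2 - 1} = \frac{23.7148}{6} = 3.9525$$

$$F = \frac{S_x^2}{S_y^2} = \frac{4.7857}{3.9525} = 1.21$$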

As n1 = 8 and n2 = 7, for 7 and 6 degrees of freedom at α = 0.05 the critical value of F is 4.21. Since the calculated value of F = 1.21 is less than the critical value (= 4.21), H0 is accepted, i.e. the estimates of the population variances do not differ significantly. It is concluded that the two samples have been drawn from the same population or the variances of the two populations are the same.
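The variance ratio for Problem-12 can be computed directly from the raw observations. The sketch below uses `statistics.variance` from the Python standard library, which divides by (n - 1) as in the formulas above; the helper name `variance_ratio` is illustrative. The table value of F is not computed here and still has to be read from the F-table for the returned degrees of freedom.

```python
from statistics import variance

# Dry weight (g) of plants in the two samples of Problem-12.
x = [39, 41, 43, 41, 45, 39, 42, 44]
y = [40, 42, 40, 44, 39, 38, 40]


def variance_ratio(a, b):
    """Return (F, d.f. of numerator, d.f. of denominator).

    The larger sample variance is placed in the numerator.
    """
    var_a, var_b = variance(a), variance(b)
    if var_a >= var_b:
        return var_a / var_b, len(a) - 1, len(b) - 1
    return var_b / var_a, len(b) - 1, len(a) - 1


f_cal, df_num, df_den = variance_ratio(x, y)
print(f"Sx^2 = {variance(x):.4f}, Sy^2 = {variance(y):.4f}")
print(f"F = {f_cal:.2f} with ({df_num}, {df_den}) d.f.")
```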

Test for single mean (Student’s t-test)


This test is used to test if the sample mean ($\bar{x}$) differs significantly from the hypothetical value of the population mean μ0.

Procedure:

Step-1. Let x1, x2, ..., xn be a random sample of size n drawn from a population with the following assumptions:

i. Parent population must be normal.
ii. The sample is random.
iii. The population standard deviation is not known.
iv. The sample size must be < 30.

Step-2. Take Null hypothesis H0: μ = μ0
Alternate hypothesis H1: μ ≠ μ0

Step-3. Level of Significance as 5% or 1%

Step-4. Choose the location of Critical Region i.e. one tailed or two tailed.

Step-5. Compute the sample statistic (observed) of Student's t-test.

$$t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}} \text{ with } (n-1) \text{ degrees of freedom}$$

Where, $\bar{x}$ = Sample mean, μ0 = Specified population mean, s = Sample standard deviation, i.e.

$$s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n-1}}$$

Step-6. Compare the sample statistic with tabulated value.

Step-7. Decision Rule

i. If t(cal) > t(tab) then it is significant and the null hypothesis is rejected.
ii. If t(cal) ≤ t(tab) then it is not significant and the null hypothesis is accepted.

Problem-13. Ten animals are fed with an animal feed. The gains in wt. (kg) of the animals are given below. A negative value indicates loss in weight. Test whether there is significant gain in weight as a result of consumption of that particular animal feed.

| Animal No. | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
|---|---|---|---|---|---|---|---|---|---|---|
| Gain in Wt. (x) | 25 | 10 | 11 | 13 | 12 | 8 | 5 | 13 | 7 | -4 |

Solution:


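Null hypothesis H0: μ = 0 (i.e. there is no gain in weight)
Alternate hypothesis H1: μ > 0 (i.e. there is gain in weight)

This is a case of one tailed test.

Table-13. Calculation for t-Statistic

| Animal No. | Gain in wt. (x) | $(x - \bar{x})$ | $(x - \bar{x})^2$ |
|---|---|---|---|
| 1 | 25 | 15 | 225 |
| 2 | 10 | 0 | 0 |
| 3 | 11 | 1 | 1 |
| 4 | 13 | 3 | 9 |
| 5 | 12 | 2 | 4 |
| 6 | 8 | -2 | 4 |
| 7 | 5 | -5 | 25 |
| 8 | 13 | 3 | 9 |
| 9 | 7 | -3 | 9 |
| 10 | -4 | -14 | 196 |
| Total | Σx = 100 | | Σ$(x - \bar{x})^2$ = 482 |

$$\text{Mean, } \bar{x} = \frac{\sum x}{n} = \frac{100}{10} = 10$$

$$s = \sqrt{\frac{\sum (x - \bar{x})^2}{n-1}} = \sqrt{\frac{482}{9}} = 7.32$$

$$t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}} = \frac{10 - 0}{7.32/\sqrt{10}} = 4.32$$

where $\bar{x}$ = 10, μ0 = 0 and n = 10.

Since the calculated t-value of 4.32 is more than the table value of t = 1.833 at 5% level of significance for 9 d.f. for a one tail test, the null hypothesis is rejected and the alternate hypothesis is accepted. So, we can conclude that there is +ve gain in wt. due to consumption of the particular feed.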
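The t-statistic of Problem-13 can be verified from the raw gains with the sketch below. It uses `statistics.mean` and `statistics.stdev` (the latter divides by n - 1); the helper name `t_one_sample` is illustrative, and the table value is entered by hand from the text because the standard library has no t-distribution.

```python
import math
from statistics import mean, stdev

# Gain in weight (kg) of the ten animals in Problem-13.
gains = [25, 10, 11, 13, 12, 8, 5, 13, 7, -4]


def t_one_sample(sample, mu0):
    """Student's t for a single mean; returns (t, degrees of freedom)."""
    n = len(sample)
    t = (mean(sample) - mu0) / (stdev(sample) / math.sqrt(n))
    return t, n - 1


t_cal, df = t_one_sample(gains, mu0=0)
T_TAB = 1.833  # one-tail 5% table value for 9 d.f., as quoted in the text

print(f"t = {t_cal:.2f} with {df} d.f.")
print("H0 rejected" if t_cal > T_TAB else "H0 accepted")
```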

Exercise: A random sample of height (ft.) of 10 trees from a forest was observed. Test whether the mean height of trees of that forest is 100 ft. or not at 5% level. (Hints: Calculated t = -0.62)

Test for difference of two means for Independent samples (Fisher's t-test)

This test is used to test the difference between two population means on the basis of two independent sample means, or to test whether two samples have been drawn from the same population having the same mean.

Procedure:

Let x1, x2, ..., xn1 be a random sample of size n1 drawn from a population with mean μx, and y1, y2, ..., yn2 be another independent random sample with mean μy, having the following assumptions.

i. Parent population must be normal.
ii. Samples are random and independent of each other.

Case-I: Population variance for both the samples same and unknown.

Step-1. Take Null hypothesis H0: μx = μy

Step-2. Alternative hypothesis H1: μx ≠ μy

Step-3. Choose the level of significance either 5% or 1%.

Step-4. Choose the location of Critical region i.e. one tailed test or two tailed test.

Step-5. Compute the sample t value (calculated) by the following formula of Fisher's t-test.

$$t = \frac{\bar{x} - \bar{y}}{s\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}} \text{ with } (n_1 + n_2 - 2) \text{ d.f.}$$

Here,

$$s = \sqrt{\frac{\sum (x_i - \bar{x})^2 + \sum (y_i - \bar{y})^2}{n_1 + n_2 - 2}}$$

is the estimated standard deviation of the population.

Where, $\bar{x}$ = sample mean of 1st sample, n1 = no. of observations of 1st sample, $\bar{y}$ = sample mean of 2nd sample, n2 = no. of observations of 2nd sample.

Step-6. Compare the calculated value with table value.

Step-7. If t(cal) > t(tab) then the null hypothesis is rejected and it is significant.
If t(cal) ≤ t(tab) then the null hypothesis is accepted and it is not significant.

Problem-14. The interest is to study the effect of two treatments A & B on the yield of a crop, each of the treatments being repeated in 5 plots, with the yield/plot noted below.

Yield (in kg/plot)

| Treatment | 1 | 2 | 3 | 4 | 5 | Mean |
|---|---|---|---|---|---|---|
| Treatment-A (x) | 9 | 10 | 13 | 11 | 7 | $\bar{x}$ = 10 |
| Treatment-B (y) | 15 | 10 | 14 | 15 | 11 | $\bar{y}$ = 13 |

Test whether the mean yields obtained as a result of these two treatments differ significantly.

Solution:

Step-1. Null hypothesis, H0: μA = μB (i.e. no significant difference between two means)

Alternate Hypothesis, H1: μA ≠ μB (i.e. two means differ significantly)

Step-2. This is a case of two-tailed test.

Step-3. The level of significance chosen is 5%.

Step-4.

Table-14. Calculation for Fisher's-t Statistic

| Sl. No. | x | y | $(x - \bar{x})$ | $(y - \bar{y})$ | $(x - \bar{x})^2$ | $(y - \bar{y})^2$ |
|---|---|---|---|---|---|---|
| 1 | 9 | 15 | -1 | 2 | 1 | 4 |
| 2 | 10 | 10 | 0 | -3 | 0 | 9 |
| 3 | 13 | 14 | 3 | 1 | 9 | 1 |
| 4 | 11 | 15 | 1 | 2 | 1 | 4 |
| 5 | 7 | 11 | -3 | -2 | 9 | 4 |
| Total | 50 | 65 | - | - | 20 | 22 |

So, $\bar{x} = \dfrac{50}{5} = 10$, $\bar{y} = \dfrac{65}{5} = 13$, n1 = n2 = 5 and

$$s = \sqrt{\frac{\sum (x_i - \bar{x})^2 + \sum (y_i - \bar{y})^2}{n_1 + n_2 - 2}} = \sqrt{\frac{20 + 22}{8}} = \sqrt{\frac{42}{8}} = \sqrt{5.25} = 2.29$$

Test Statistic,

$$t = \frac{\bar{x} - \bar{y}}{s\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}} = \frac{10 - 13}{2.29\sqrt{\dfrac{1}{5} + \dfrac{1}{5}}} = \frac{-3}{2.29 \times 0.63} = \frac{-3}{1.44} = -2.08$$

Step-5. The two tailed table value for t at 5% significance level with 8 d.f. is 2.306. So, the calculated |t| = 2.08 is less than the table value and hence the null hypothesis is accepted. It is concluded that the two treatments do not produce any significant difference in the mean yield.

Exercise: To assess the effect of inoculation with mycorrhiza on the height growth of seedlings of a crop, 10 seedlings inoculated with mycorrhiza (Group-1) and another 10 seedlings without inoculation (Group-2) were collected from an experiment. The height of seedlings obtained under the two groups of seedlings was:


| Plot | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
|---|---|---|---|---|---|---|---|---|---|---|
| Group I | 23 | 17.4 | 17 | 20.5 | 22.7 | 24 | 22.5 | 22.7 | 19.4 | 18.8 |
| Group II | 8.5 | 9.6 | 7.7 | 10.1 | 9.7 | 13.2 | 10.3 | 9.1 | 10.5 | 7.4 |

Under the assumption of equality of variance of seedling height in the two groups, test the equality of means. (tcal = 11.75)

Exercise: Using the data of the example of F-test, test equality of 2 means.
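The pooled-variance t computation used for independent samples (as in Problem-14) can be sketched as follows. The function name `pooled_t` is illustrative and not from the manual; the code keeps full precision, so the result can differ in the second decimal from a hand calculation that rounds s and the square root.

```python
import math


def pooled_t(x, y):
    """Fisher's t for two independent samples with a common unknown variance.

    Returns (t, pooled S.D., degrees of freedom).
    """
    n1, n2 = len(x), len(y)
    xbar, ybar = sum(x) / n1, sum(y) / n2
    sum_sq = sum((v - xbar) ** 2 for v in x) + sum((v - ybar) ** 2 for v in y)
    s = math.sqrt(sum_sq / (n1 + n2 - 2))
    t = (xbar - ybar) / (s * math.sqrt(1 / n1 + 1 / n2))
    return t, s, n1 + n2 - 2


# Yields (kg/plot) under treatments A and B from Problem-14.
treatment_a = [9, 10, 13, 11, 7]
treatment_b = [15, 10, 14, 15, 11]

t_cal, s, df = pooled_t(treatment_a, treatment_b)
T_TAB = 2.306  # two-tail 5% table value for 8 d.f., as quoted in the text

print(f"s = {s:.2f}, t = {t_cal:.2f} with {df} d.f.")
print("H0 rejected" if abs(t_cal) > T_TAB else "H0 accepted")
```

The same function can be applied to the two exercises above by replacing the two lists.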

Test for difference of two dependent sample means (paired t-test)

Procedure:

Let (x1, y1), (x2, y2), ..., (xn, yn) be n paired observations of a sample from a population with basic assumptions as follows:

i. Parent population must be normal.
ii. Samples are dependent and occur pair-wise.

Step-1. Take Null hypothesis H0: μx = μy or H0: $\bar{d}$ = 0, i.e. no difference

Step-2. Alternate hypothesis H1: μx ≠ μy or H1: $\bar{d}$ ≠ 0 (or $\bar{d}$ > 0 or $\bar{d}$ < 0)

Step-3. Choose the level of significance either 5% or 1%.

Step-4. Choose the location of Critical region i.e. one tailed test or two tailed test.

Step-5. Compute the observed t statistic by the following formula of paired t-test:

$$t = \frac{\bar{d}}{s/\sqrt{n}} \text{ with } (n-1) \text{ d.f.}$$

Where, $d_i = x_i - y_i$, $\bar{d}$ is the mean of the variable d, and

$$s = \sqrt{\frac{\sum (d_i - \bar{d})^2}{n-1}}$$

Step-6. Compare the observed value with tabular value.

Step-7. If t-calculated > t-tabulated then the null hypothesis is rejected and it is significant; otherwise the null hypothesis is accepted.

Problem-15. Memory capacity of 9 students was tested before and after training. Test at 5 per cent level of significance whether the training was effective from the following scores.

| Student | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
|---|---|---|---|---|---|---|---|---|---|
| Before (x) | 10 | 15 | 9 | 3 | 7 | 12 | 16 | 17 | 4 |
| After (y) | 12 | 17 | 8 | 5 | 6 | 11 | 18 | 20 | 3 |

Solution:

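Here, marks obtained by the same batch of students in the tests are available. Hence, the marks are expected to be correlated. So, the paired t-test will be appropriate. Then taking the null hypothesis that the mean of difference is zero, we can write,

H0: μx = μy, which is equivalent to testing H0: $\bar{d}$ = 0

H1: μx ≠ μy

As we are having matched pairs, we use the paired t-test, which is given by

$$t = \frac{\bar{d}}{S/\sqrt{n}} \text{ with } (n-1) \text{ d.f.}$$

Table-15. Calculation for paired-t

| Student | Score (x) $x_i$ | Score (y) $y_i$ | Difference $d_i = (x_i - y_i)$ | $d_i^2$ |
|---|---|---|---|---|
| 1 | 10 | 12 | -2 | 4 |
| 2 | 15 | 17 | -2 | 4 |
| 3 | 9 | 8 | 1 | 1 |
| 4 | 3 | 5 | -2 | 4 |
| 5 | 7 | 6 | 1 | 1 |
| 6 | 12 | 11 | 1 | 1 |
| 7 | 16 | 18 | -2 | 4 |
| 8 | 17 | 20 | -3 | 9 |
| 9 | 4 | 3 | 1 | 1 |
| Total | - | - | -7 | 29 |

Here,

$$\bar{d} = \frac{\sum d_i}{n} = \frac{-7}{9} = -0.778$$

$$s = \sqrt{\frac{\sum (d_i - \bar{d})^2}{n-1}} = \sqrt{\frac{\sum d_i^2 - n\,\bar{d}^2}{n-1}} = \sqrt{\frac{29 - 9(-0.778)^2}{9 - 1}} = \sqrt{2.944} = 1.715$$

$$t = \frac{\bar{d}}{S/\sqrt{n}} = \frac{-0.778}{1.715/\sqrt{9}} = \frac{-0.778}{0.572} = -1.361$$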


Table value of t at 5% level for 8 d.f. is 2.306. The absolute value of the calculated t is less than the table value. Hence, it is not significant and the null hypothesis is accepted. So we can conclude that the training was not effective.
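The paired t calculation of Problem-15 can be checked with the sketch below, which works on the differences d = x - y. The helper name `paired_t` is illustrative; `statistics.stdev` divides by (n - 1), matching the formula for s. The table value is entered by hand from the text.

```python
import math
from statistics import mean, stdev

# Memory scores of the nine students before and after training (Problem-15).
before = [10, 15, 9, 3, 7, 12, 16, 17, 4]
after = [12, 17, 8, 5, 6, 11, 18, 20, 3]


def paired_t(x, y):
    """Paired t statistic on d = x - y; returns (t, mean of d, S.D. of d, d.f.)."""
    d = [xi - yi for xi, yi in zip(x, y)]
    n = len(d)
    d_bar, s = mean(d), stdev(d)
    return d_bar / (s / math.sqrt(n)), d_bar, s, n - 1


t_cal, d_bar, s, df = paired_t(before, after)
T_TAB = 2.306  # two-tail 5% table value for 8 d.f., as quoted in the text

print(f"d-bar = {d_bar:.3f}, s = {s:.3f}, t = {t_cal:.3f} with {df} d.f.")
print("H0 rejected" if abs(t_cal) > T_TAB else "H0 accepted")
```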

Exercise: Data pertaining to organic carbon (OC) content measured at two different layers of 10 soil pits in a natural forest were collected to study whether the OC content is the same or different:

Organic carbon (%)

| Soil pit | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
|---|---|---|---|---|---|---|---|---|---|---|
| Layer 1 (x) | 1.59 | 1.39 | 1.64 | 1.17 | 1.27 | 1.58 | 1.64 | 1.53 | 1.21 | 1.48 |
| Layer 2 (y) | 1.21 | 0.92 | 1.31 | 1.52 | 1.62 | 0.91 | 1.23 | 1.21 | 1.58 | 1.18 |

Analyse the data and draw your conclusion.

(Hints: sd² = 0.1486, tcal = 1.485)

1.8. Chi-square test (χ2)

Chi-square test of significance is for testing the agreement between observation and hypothesis (or expectation) where the data are purely qualitative or enumerative in character. Such enumerative data are characterized by the frequency of occurrence or non-occurrence of events or attributes or categories, expressed as counts or proportions or percentages. But the expected frequency in each category should preferably be more than 5 and the total number of observations should be large, say, more than 50.

χ2-test for Goodness-of-fit

This involves testing the significance of the difference between observed frequencies and the frequencies expected on some prior hypothesis or rule. If Oi is a set of observed frequencies and Ei is the corresponding set of expected frequencies (i = 1, 2, ..., n), Karl Pearson's Chi-square (χ2) is given by:

$$\chi^2 = \sum_{i=1}^{n} \frac{(O_i - E_i)^2}{E_i}$$

Procedure:

Step-1. Follow the following assumptions

i. Sample observations should be independent.
ii. Constraint on cell frequency should be linear i.e. ΣOi = ΣEi
iii. Total number of frequency should be reasonably large.
iv. No theoretical (expected) cell frequency should be less than 5.

Step-2. Take the null hypothesis H0: Oi = Ei
Alternative hypothesis H1: Oi ≠ Ei

Step-3. Choose the level of significance either α = 5% or 1%.

Step-4. Choose the location of critical region i.e. one tailed or two tailed

Step-5. Compute the Chi-square value as per formula.

Step-6. Compare the observed value with tabular value and take decision as:

If χ2cal > χ2tab then the null hypothesis is rejected and it is significant at α.
If χ2cal ≤ χ2tab then the null hypothesis is accepted and it is non significant at α.

Problem-16. In a cross between parents of the genetic constitution AAbb and aaBB the phenotypes in the F2 sample are classified as follows.

| AB | Ab | aB | ab | Total |
|---|---|---|---|---|
| 87 | 29 | 32 | 12 | 160 |

They are expected to occur in a 9:3:3:1 ratio.

Does the segregation agree with the theoretical ratio?

Solution:

H0: The segregation agrees with the theoretical ratio.
H1: The segregation does not agree with the theoretical ratio.
Level of Significance α = 0.05

Test Statistic is

$$\chi^2 = \sum_{i=1}^{4} \frac{(O_i - E_i)^2}{E_i} \text{ with 3 d.f.}$$

The expected frequencies are computed on the basis of the theoretical segregation ratio 9:3:3:1. The total is 9+3+3+1 = 16. We expect 9 out of 16 to belong to the AB group, that is, the probability of AB is 9/16.

The expected frequency of AB is therefore $\dfrac{9}{16} \times 160 = 90$

The expected frequency of Ab is $\dfrac{3}{16} \times 160 = 30$

The expected frequency of aB is $\dfrac{3}{16} \times 160 = 30$

And the expected frequency of ab is $\dfrac{1}{16} \times 160 = 10$

Table-16. Calculation for Chi-square value

| Phenotype | Observed frequency (Oi) | Expected frequency (Ei) | (Oi - Ei) | (Oi - Ei)² | (Oi - Ei)²/Ei |
|---|---|---|---|---|---|
| AB | 87 | 90 | -3 | 9 | 0.100 |
| Ab | 29 | 30 | -1 | 1 | 0.033 |
| aB | 32 | 30 | 2 | 4 | 0.133 |
| ab | 12 | 10 | 2 | 4 | 0.400 |
| χ2 value | | | | | 0.666 |

The calculated χ2 value is 0.666, which is less than the critical value of χ2 (7.815 with 3 d.f. at α = 0.05). Therefore, the calculated χ2 value is not significant. Hence we accept the null hypothesis and conclude that the observed phenotypic ratio conforms to the theoretical segregation ratio of 9:3:3:1.
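The goodness-of-fit calculation of Problem-16 can be sketched in code as below. The function name `chi_square_gof` and the dictionary layout are illustrative choices, not from the manual; the expected frequencies are derived from the 9:3:3:1 ratio exactly as in the worked solution, and the critical value is entered by hand from the text.

```python
# Observed F2 phenotype counts and the theoretical ratio from Problem-16.
observed = {"AB": 87, "Ab": 29, "aB": 32, "ab": 12}
ratio = {"AB": 9, "Ab": 3, "aB": 3, "ab": 1}


def chi_square_gof(observed, ratio):
    """Chi-square goodness-of-fit; returns (chi2, expected frequencies, d.f.)."""
    total = sum(observed.values())
    parts = sum(ratio.values())
    expected = {k: total * ratio[k] / parts for k in observed}
    chi2 = sum((observed[k] - expected[k]) ** 2 / expected[k] for k in observed)
    return chi2, expected, len(observed) - 1


chi2_cal, expected, df = chi_square_gof(observed, ratio)
CHI2_TAB = 7.815  # table value for 3 d.f. at the 5% level, as quoted in the text

print("Expected:", expected)
print(f"chi-square = {chi2_cal:.3f} with {df} d.f.")
print("H0 rejected" if chi2_cal > CHI2_TAB else "H0 accepted")
```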

Exercise: Data were collected on the number of insect species from an undisturbed area of a Wildlife Sanctuary in different months to test whether there are any significant differences between the numbers of insect species found in different months. (Hints: we may state the null hypothesis as the diversity in terms of number of insect species is the same in all months and derive the expected frequencies in different months accordingly.) Test the data. (Ans. χ2 = 134.84)

| Month | Jan. | Feb. | Mar. | Apr. | May | Jun. | Jul. | Aug. | Sep. | Oct. | Nov. | Dec. | Total |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Oi | 67 | 115 | 118 | 72 | 67 | 77 | 75 | 63 | 42 | 24 | 32 | 52 | 804 |

χ2-test of independence or association of attributes

When individuals are classified simultaneously on the basis of variables or attributes or categories, the resulting table of frequencies is called a (r x c) contingency table, i.e. r rows and c columns. The χ2 test may be applied to a contingency table to find out if the variables are independent or associated.

Procedure:

The χ2 value for this test may be obtained in two ways:
i. By estimating the value of Ei (Expected frequency) from the values of Oi (Observed frequency) and applying χ2 as goodness-of-fit.
ii. For a 2x2 contingency table, by the direct formula given below.

2 x 2 Contingency table

| Group | Category I | Category II | Total |
|---|---|---|---|
| 1 | a | b | a+b |
| 2 | c | d | c+d |
| Total | a+c | b+d | N = a+b+c+d |

The simple formula to calculate χ2 is

$$\chi^2 = \frac{N(ad - bc)^2}{(a+b)(c+d)(a+c)(b+d)} \text{ with 1 d.f.}$$

Where a, b, c, d are observed cell frequencies. If any of the expected cell frequencies is less than 5, then a slightly modified formula is necessary. The corrected formula for a 2x2 contingency table, called Yates' Correction for continuity, is:

$$\chi^2 = \frac{N\left(|ad - bc| - \dfrac{N}{2}\right)^2}{(a+b)(c+d)(a+c)(b+d)}$$

Problem-17. In a survey of fertilizer practices in India each of 323 cotton growing fields selected for survey was classified on the twin criteria of irrigation practice (irrigated or non-irrigated) and the practice of manuring (manured or un-manured), resulting in the following contingency table.

| | Irrigated | Non-irrigated | Total |
|---|---|---|---|
| Manured | 75 (a) | 35 (b) | 110 |
| Un-manured | 115 (c) | 98 (d) | 213 |
| Total | 190 | 133 | 323 |

It is required to test whether the practice of irrigation and the practice of manuring are independent or related (associated).

Solution:

Ho: these two-factors irrigation and manuring are independent.H1: these two-factors irrigation and manuring are dependent orassociated.

First Method: Goodness-of-fit

The expected frequency of each cell is calculated as (row total x column total) / N:

Cell (a): $E_1 = \frac{(a+b)(a+c)}{N} = \frac{110 \times 190}{323} = 64.7$

Cell (b): $E_2 = \frac{(a+b)(b+d)}{N} = \frac{110 \times 133}{323} = 45.29$

Cell (c): $E_3 = \frac{(c+d)(a+c)}{N} = \frac{213 \times 190}{323} = 125.29$

Cell (d): $E_4 = \frac{(c+d)(b+d)}{N} = \frac{213 \times 133}{323} = 87.7$

The χ² is calculated using the formula

$\chi^2 = \sum_i \frac{(O_i - E_i)^2}{E_i}$, which follows the χ² distribution with (2-1)(2-1) = 1 d.f.

Table-17. Calculation of chi-square value

               Irrigated      Non-irrigated    Total
Manured        75 (O1)        35 (O2)          110
               64.7 (E1)      45.3 (E2)
Un-manured     115 (O3)       98 (O4)          213
               125.3 (E3)     87.7 (E4)
Total          190            133              323

The χ² value computed for the above table is

$\chi^2 = \sum_{i=1}^{4} \frac{(O_i-E_i)^2}{E_i} = \frac{(75-64.7)^2}{64.7} + \frac{(35-45.3)^2}{45.3} + \frac{(115-125.3)^2}{125.3} + \frac{(98-87.7)^2}{87.7} = 6.03 \approx 6.00$

Second Method: Independence of attributes

$\chi^2 = \frac{N(ad-bc)^2}{(a+b)(c+d)(a+c)(b+d)} = \frac{323\,(75 \times 98 - 35 \times 115)^2}{(75+35)(115+98)(75+115)(35+98)} = \frac{323\,(3325)^2}{592076100} = 6.03 \approx 6.0$

The χ² value computed by both methods is 6.03, or about 6.00. Since the table has two rows and two columns, the d.f. for the above contingency table is (2-1)(2-1) = 1. The table value of χ² with 1 d.f. at the 5% level of significance is 3.84. The calculated χ² value is higher than the table value, so the null hypothesis of independence of the two factors, irrigation and manuring, is rejected and it is concluded that they are related or associated.
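Both methods of Problem-17 can be checked with a short Python sketch. The variable names are illustrative and not part of the manual; the table value 3.84 is the one quoted in the text.

```python
# Observed counts from Problem-17: rows = manured / un-manured,
# columns = irrigated / non-irrigated
observed = [[75, 35], [115, 98]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# First method: expected count = (row total x column total) / N
chi_sq_fit = 0.0
for i, row in enumerate(observed):
    for j, o in enumerate(row):
        e = row_totals[i] * col_totals[j] / n
        chi_sq_fit += (o - e) ** 2 / e

# Second method: shortcut formula for a 2 x 2 table
(a, b), (c, d) = observed
chi_sq_short = n * (a * d - b * c) ** 2 / (
    (a + b) * (c + d) * (a + c) * (b + d)
)

TABLE_VALUE = 3.84  # chi-square, 1 d.f., 5% level (value quoted in the text)
print(round(chi_sq_fit, 2), round(chi_sq_short, 2))
print("reject H0" if chi_sq_short > TABLE_VALUE else "do not reject H0")
```

The two statistics agree because the shortcut formula is an algebraic rearrangement of the goodness-of-fit sum for a 2 x 2 table.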

Exercise: The following table shows the result of inoculation against cholera in a group of people. Examine the effect of inoculation in controlling susceptibility to cholera. (Hint: apply Yates' correction)

                  Not attacked    Attacked
Inoculated        43              5
Not-inoculated    7               28


1.9. Correlation and regression

In many natural systems, changes in one attribute are accompanied by changes in another attribute and a definite relation exists between the two. In other words, there is a correlation between the two variables. For instance, several soil properties like nitrogen content, organic carbon content or pH are correlated and exhibit simultaneous variation. Strong correlation is found to occur between several morphometric features of a tree. In such instances, an investigator may be interested in measuring the strength of the relationship. Having made a set of paired observations (xi, yi); i = 1, ..., n, from n independent sampling units, a measure of the linear relationship between the two variables can be obtained by a quantity called Pearson's product moment correlation coefficient, or simply the correlation coefficient.

Correlation is the study of co-variation between two variables to understand how closely the variables are related. In correlation analysis, both variables are assumed to be continuous and normally distributed. For discovering and measuring the magnitude and direction of the relationship between two variables we use the statistical tool known as the correlation coefficient, whose range is -1 to +1. The + or – sign indicates the direction of the relationship and the value gives the magnitude or strength of the relationship between the two variables.

Regression is the functional relationship between two or more variables and thereby provides a mechanism for prediction or forecasting. When the relationship between two variables is a straight line, it is called simple linear regression.

Karl Pearson’s correlation coefficient and its test of significance

Procedure: Let (Xi, Yi); i = 1, 2, 3, ..., n, be paired observations on two quantitative variables from n independent sampling units.

a). Direct Method:
Step-1. Construct a table for finding the X², Y² and XY values.
Step-2. Calculate ΣX, ΣY, ΣXY, ΣX² and ΣY².
Step-3. Calculate Karl Pearson's correlation coefficient by

$r_{xy} = \frac{n\sum XY - \sum X \sum Y}{\sqrt{n\sum X^2 - (\sum X)^2}\;\sqrt{n\sum Y^2 - (\sum Y)^2}}$

b). Step deviation method (change of origin and scale):

Step-1. Calculate U and V, where

$U = \frac{X - A}{h}$; $V = \frac{Y - B}{k}$

A, B are arbitrary values from X and Y, and h, k are suitably chosen scales.

Step-2. Construct a table for finding U, V, UV, U² and V².

Step-3. Calculate ΣU, ΣV, ΣUV, ΣU² and ΣV².

Step-4. Calculate the correlation coefficient by

$r_{uv} = \frac{n\sum UV - \sum U \sum V}{\sqrt{n\sum U^2 - (\sum U)^2}\;\sqrt{n\sum V^2 - (\sum V)^2}}$ OR $r_{uv} = \frac{\sum UV - n\bar{U}\bar{V}}{\sqrt{\sum U^2 - n\bar{U}^2}\;\sqrt{\sum V^2 - n\bar{V}^2}}$

where $\bar{U} = \sum U / n$ and $\bar{V} = \sum V / n$. Both methods give the same value, i.e. rxy = ruv.
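That the direct method and the step deviation method agree can be checked numerically. The sketch below is only an illustration: the data and the choices of A, h, B and k are hypothetical, and the helper name `pearson_r` is not part of the manual. It assumes h and k are positive, as a scale should be.

```python
import math


def pearson_r(x, y):
    """Direct-method correlation coefficient from raw sums."""
    n = len(x)
    sx, sy = sum(x), sum(y)
    sxy = sum(a * b for a, b in zip(x, y))
    sxx = sum(a * a for a in x)
    syy = sum(b * b for b in y)
    return (n * sxy - sx * sy) / math.sqrt(
        (n * sxx - sx ** 2) * (n * syy - sy ** 2)
    )


# Hypothetical data and hypothetical choices of origin (A, B) and scale (h, k)
x = [10, 20, 30, 40, 50]
y = [12, 15, 21, 20, 30]
A, h, B, k = 30, 10, 20, 1
u = [(xi - A) / h for xi in x]
v = [(yi - B) / k for yi in y]

r_xy = pearson_r(x, y)
r_uv = pearson_r(u, v)
print(round(r_xy, 4), round(r_uv, 4))
```

The change of origin and scale only simplifies the arithmetic; the coefficient itself is unchanged.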

Test of correlation coefficient:

Null hypothesis, H0: ρ = 0 and Alternative, H1: ρ ≠ 0

Here ρ is the correlation in the population and r is the estimate of ρ from the sample observations.

Level of significance, α = 0.05

Test statistic, $t = \frac{r\sqrt{n-2}}{\sqrt{1-r^2}}$ ~ Student's t distribution with (n-2) d.f.

The tcal is compared with ttab. If |tcal| ≤ ttab, then H0 is accepted, which means the correlation is not significant, i.e. the data give no evidence of a linear relationship between the two variables (some other relationship, such as a nonlinear one, may still exist). If |tcal| > ttab, then H1 is accepted, which means the correlation is significant, i.e. the two variables are linearly related with the magnitude and direction given by r.

Problem-18. The following data give the heights of fathers and their sons in 10 families. Compute the correlation coefficient of the heights, test its significance and give your conclusion.

Height of father (cm) 63 69 65 67 68 69 69 70 71 71

Height of son (cm) 65 63 63 65 67 67 68 71 61 69

Solution:


Table-17. Calculation of correlation coefficient

Ht. of father (X)   Ht. of son (Y)   X²      Y²      XY      U=X-A   V=Y-B   UV    U²   V²
63                  65               3969    4225    4095    -5      0       0     25   0
69                  63               4761    3969    4347    1       -2      -2    1    4
65                  63               4225    3969    4095    -3      -2      6     9    4
67                  65 (B)           4489    4225    4355    -1      0       0     1    0
68 (A)              67               4624    4489    4556    0       2       0     0    4
69                  67               4761    4489    4623    1       2       2     1    4
69                  68               4761    4624    4692    1       3       3     1    9
70                  71               4900    5041    4970    2       6       12    4    36
71                  61               5041    3721    4331    3       -4      -12   9    16
71                  69               5041    4761    4899    3       4       12    9    16
Total = 682         659              46572   43513   44963   2       9       21    60   93

a). Direct Method:

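$r_{xy} = \frac{n\sum XY - \sum X \sum Y}{\sqrt{n\sum X^2 - (\sum X)^2}\;\sqrt{n\sum Y^2 - (\sum Y)^2}}$ and putting the values,

$r_{xy} = \frac{10 \times 44963 - 682 \times 659}{\sqrt{10 \times 46572 - 465124}\;\sqrt{10 \times 43513 - 434281}} = \frac{192}{\sqrt{596 \times 849}} = \frac{192}{711.33} = 0.27$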

b). Step Deviation method:

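$\bar{U} = \frac{\sum U}{n} = 0.2$; $\bar{V} = \frac{\sum V}{n} = 0.9$; $\bar{U}^2 = 0.04$; $\bar{V}^2 = 0.81$

$r_{uv} = \frac{\sum UV - n\bar{U}\bar{V}}{\sqrt{\sum U^2 - n\bar{U}^2}\;\sqrt{\sum V^2 - n\bar{V}^2}} = \frac{21 - 10(0.18)}{\sqrt{60 - 10(0.04)}\;\sqrt{93 - 10(0.81)}} = \frac{19.2}{(7.72)(9.21)} = \frac{19.2}{71.102} = 0.27$

The correlation coefficient between the heights of father and son is 0.27 by both methods.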

Test of significance of r:


Putting the value of r in the formula $t = \frac{r\sqrt{n-2}}{\sqrt{1-r^2}}$, the t statistic is

$t = \frac{0.27\sqrt{10-2}}{\sqrt{1-(0.27)^2}} = 0.79$

The ttab = 2.31 with 8 d.f. at the 5% level of significance.

So, tcal < ttab and H0 is accepted, i.e. the correlation is not significant. It is concluded that the heights of fathers and their sons are not linearly related in these data; in other words, an increase or decrease in the height of the father does not indicate an increase or decrease in the height of the son.
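The calculation of Problem-18 can be reproduced with a short Python sketch using the direct method and the t statistic given above. Variable names are illustrative; the table value 2.31 is the one quoted in the text, not computed by the code.

```python
import math

father = [63, 69, 65, 67, 68, 69, 69, 70, 71, 71]
son = [65, 63, 63, 65, 67, 67, 68, 71, 61, 69]
n = len(father)

sum_x, sum_y = sum(father), sum(son)
sum_xy = sum(x * y for x, y in zip(father, son))
sum_x2 = sum(x * x for x in father)
sum_y2 = sum(y * y for y in son)

# Direct method for r
r = (n * sum_xy - sum_x * sum_y) / math.sqrt(
    (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
)

# t statistic with n - 2 degrees of freedom
t_cal = r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

T_TAB = 2.31  # 5% table value for 8 d.f., as quoted in the text
print(round(r, 2), round(t_cal, 2))
print("significant" if abs(t_cal) > T_TAB else "not significant")
```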

Exercise: The data on pH and organic carbon content were measured from soil samples collected from 15 pits taken in natural forests, as given below:

Soil pit    pH (x)    Organic carbon (y) (%)
1           5.7       2.10
2           6.1       2.17
3           5.2       1.97
4           5.7       1.39
5           5.6       2.26
6           5.1       1.29
7           5.8       1.17
8           5.5       1.14
9           5.4       2.09
10          5.9       1.01
11          5.3       0.89
12          5.4       1.60
13          5.1       0.90
14          5.1       1.01
15          5.2       1.21

Compute a suitable statistic and test whether an increase in the pH of the soil affects the organic carbon in that forest. (Hints: r = 0.3541 and tcal = 1.3652)

Exercise: The following data contain 15 paired values of photosynthetic rate (Y) and light interception (X) observed on leaves of a particular tree species. The photosynthetic rate is the dependent variable and the quantity of light is the independent variable. Study the linear relationship between the two variables and test its significance.

Tree    X         Y
1       0.7619    7.58
2       0.7684    9.46
3       0.7961    10.76
4       0.8380    11.51
5       0.8381    11.68
6       0.8435    12.68
7       0.8599    12.76
8       0.9209    13.73
9       0.9993    13.89
10      1.0041    13.97
11      1.0089    14.05
12      1.0137    14.13
13      1.0184    14.20
14      1.0232    14.28
15      1.0280    14.36

Spearman's Rank correlation coefficient

A rank correlation is any of several statistics that measure the relationship between rankings of different ordinal variables or different rankings of the same variable, where a "ranking" is the assignment of the labels "first", "second", "third", etc. to different observations of a particular variable. Like any correlation calculation, it is appropriate for both continuous and discrete variables, including ordinal variables. A rank correlation coefficient measures the degree of similarity between two rankings, and can be used to assess the significance of the relation between them. The test of significance of the rank correlation coefficient shows whether the measured relationship is small enough to be likely a coincidence. Rank correlation is commonly measured by Spearman's rank correlation coefficient, or Spearman's rho, denoted by the Greek letter ρ (rho), which is a measure of statistical dependence between two variables. It assesses how well the relationship between two variables can be described by a monotonic function, and it lies in the interval [-1, +1]. An increasing rank correlation coefficient implies increasing agreement between rankings. The coefficient value can be interpreted as:

i. 1 if the agreement between the two rankings is perfect; the two rankings are the same.
ii. 0 if the rankings are completely independent.
iii. −1 if the disagreement between the two rankings is perfect; one ranking is the reverse of the other.

For a sample of size n, the n raw scores or values Xi, Yi are converted to ranks xi, yi and ρ is computed. Identical values (rank ties or value duplicates) are assigned a rank equal to the average of their positions in the ordered list of values.

The Spearman's correlation coefficient is:

$\rho = 1 - \frac{6\sum d_i^2}{n(n^2-1)}$

where di = xi – yi (i = 1, 2, 3, ..., n)

Procedure:

For sample observations the Spearman rank correlation coefficient is:

$r_s = 1 - \frac{6\sum d_i^2}{n(n^2-1)}$

and when ties occur,

$r_s = 1 - \frac{6\left[\sum d_i^2 + \frac{m(m^2-1)}{12}\right]}{n(n^2-1)}$

Here, di = xi - yi, xi = rank of the 1st variable, yi = rank of the 2nd variable and m = number of tied observations in a group.

The following steps are applicable for finding the rank correlation:

Step-1. Rank all observations.

I. Ranking should be made from the highest to the lowest of the observations.
II. If two or more observations are the same in magnitude, all of them must carry the same rank (the average of the ranks they would otherwise occupy).

Step-2. When a common rank is assigned to m different observations of a factor, m(m²-1)/12 is added to Σdi² in the numerator of the 2nd term of the formula for the correlation coefficient; one such term is added for each group of tied observations.
Step-3. The sum of the differences of the ranks should be equal to zero, which is a check on the correctness of the calculation.

Problem-19. Find the rank correlation between the following data and determine the relationship between preference share price and debenture price.

Preference Price (x) 73.2 85.8 78.9 75.8 77.2 81.2 83.8

Debenture Price (y) 97.8 99.2 98.8 98.3 98.3 96.7 97.1

Solution:

Table-18. Calculation of rank correlation coefficient

Preference share price (x)   Rank x (xi)   Debenture price (y)   Rank y (yi)   di=xi-yi   di²
73.2                         7             97.8                  5             2          4
85.8                         1             99.2                  1             0          0
78.9                         4             98.8                  2             2          4
75.8                         6             98.3                  3.5           2.5        6.25
77.2                         5             98.3                  3.5           1.5        2.25
81.2                         3             96.7                  7             -4         16
83.8                         2             97.1                  6             -4         16

Σdi² = 48.50

Here, y has 2 identical values (m = 2) and n = 7.

Therefore, the rank correlation is

$r_s = 1 - \frac{6\left[\sum d_i^2 + \frac{m(m^2-1)}{12}\right]}{n(n^2-1)} = 1 - \frac{6\left[48.5 + \frac{2(2^2-1)}{12}\right]}{7(7^2-1)} = 1 - \frac{6(48.5+0.5)}{336} = 0.125$

It is concluded that the two prices are poorly related, i.e. when one price increases, the other does not increase to the same extent.
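The ranking rules and the tie correction can be sketched in Python as follows. The prices are those listed in Table-18; the helper names `average_ranks` and `tie_correction` are hypothetical and only illustrate the steps.

```python
from collections import Counter


def average_ranks(values):
    """Rank from highest (rank 1) to lowest; tied values share the average rank."""
    ordered = sorted(values, reverse=True)
    positions = {}
    for pos, val in enumerate(ordered, start=1):
        positions.setdefault(val, []).append(pos)
    return [sum(positions[v]) / len(positions[v]) for v in values]


def tie_correction(values):
    """Sum of m(m^2 - 1)/12 over every group of m tied values."""
    return sum(m * (m * m - 1) / 12 for m in Counter(values).values() if m > 1)


# Prices as listed in Table-18
x = [73.2, 85.8, 78.9, 75.8, 77.2, 81.2, 83.8]
y = [97.8, 99.2, 98.8, 98.3, 98.3, 96.7, 97.1]
n = len(x)

rank_x, rank_y = average_ranks(x), average_ranks(y)
d = [rx - ry for rx, ry in zip(rank_x, rank_y)]
sum_d2 = sum(di ** 2 for di in d)
correction = tie_correction(x) + tie_correction(y)

r_s = 1 - 6 * (sum_d2 + correction) / (n * (n ** 2 - 1))
print(sum(d), sum_d2, r_s)
```

The first printed value is the Step-3 check: the rank differences should sum to zero.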

Exercise: In a survey, observations on 10 persons were taken on IQ (X) and number of hours spent watching TV per week (Y), as below. Compute the rank correlation and study whether an increase in the IQ of persons is associated with the hours spent watching TV per week.

Person    IQ (X)    No. of hours spent in TV per week (Y)
1         106       7
2         86        0
3         100       27
4         101       50
5         99        28
6         103       29
7         97        20
8         113       12
9         112       6
10        110       17

(Hints: Ans. rs = −0.1757)

Fitting of regression equations of two variables Y and X

In regression analysis, both variables are assumed to be normally distributed; one of the variables represents the cause (independent or explanatory variable) and the other the effect (dependent or response variable). The relationship between two variables can be expressed as a function known as regression. When only two variables are involved, the functional relationship is known as simple regression. If the relationship between the two variables is linear, it is known as simple linear regression.

For simple linear regression, the two regression equations are given by:

Y on X: $Y - \bar{Y} = b_{yx}(X - \bar{X})$

X on Y: $X - \bar{X} = b_{xy}(Y - \bar{Y})$

where byx = regression coefficient of Y on X, bxy = regression coefficient of X on Y, $\bar{X} = \sum X / n$, $\bar{Y} = \sum Y / n$ and n = number of observations.

Procedure:

Fitting of regression equations is carried out in two phases.

a). Calculation of regression coefficients (bYX and bXY)

i). Direct method:

Step-1. Construct a table to find X², Y² and XY.
Step-2. Compute ΣX, ΣY, ΣX², ΣY², ΣXY, Ȳ and X̄ from the table.
Step-3. Calculate the regression coefficients by the formulae:

$b_{yx} = \frac{n\sum XY - \sum X \sum Y}{n\sum X^2 - (\sum X)^2}$

$b_{xy} = \frac{n\sum XY - \sum X \sum Y}{n\sum Y^2 - (\sum Y)^2}$

ii). Step deviation method:

Step-1. Reduce the values of X and Y to U and V, where

$U = \frac{X - A}{h}$, $V = \frac{Y - B}{k}$

and A and B are arbitrary values and h and k are suitable scales.

Step-2. Construct the table to compute U², V² and UV.

Step-3. Compute ΣU, ΣV, ΣUV, ΣU² and ΣV² from the table.

Step-4. Compute the regression coefficients by the formulae:

Regression coefficient of U on V, $b_{UV} = \frac{n\sum UV - \sum U \sum V}{n\sum V^2 - (\sum V)^2}$

Regression coefficient of V on U, $b_{VU} = \frac{n\sum UV - \sum U \sum V}{n\sum U^2 - (\sum U)^2}$

where n = number of pairs of observations.

Step-5. Compute $b_{XY} = \frac{h}{k}\, b_{UV}$ and $b_{YX} = \frac{k}{h}\, b_{VU}$

b). Finding the regression equations

After estimating the values of X̄, Ȳ, bYX and bXY, the regression equations are obtained by putting these values in the following equations:

$Y - \bar{Y} = b_{yx}(X - \bar{X})$ and $X - \bar{X} = b_{xy}(Y - \bar{Y})$

Problem-20. The following data give the monthly income and expenditure on food of 10 families.

Income (x) 120 90 80 150 130 140 110 95 70 105

Expenditure (y) 40 36 40 45 40 44 45 38 50 35

Find the two linear regression equations and correlation coefficient.

Solution:

Let $U = \frac{X - A}{h}$, $V = \frac{Y - B}{k}$

Here, A = 110, h = 5; B = 40, k = 1


Table-19. Calculation of sums & sum of squares

Income (X)      Expenditure (Y)   U=(X-A)/h   V=(Y-B)/k   UV     U²     V²
120             40                2           0           0      4      0
90              36                -4          -4          16     16     16
80              40                -6          0           0      36     0
150             45                8           5           40     64     25
130             40                4           0           0      16     0
140             44                6           4           24     36     16
110             45                0           5           0      0      25
95              38                -3          -2          6      9      4
70              50                -8          10          -80    64     100
105             35                -1          -5          5      1      25
Total = 1090    413               -2          13          11     246    211

Here n = 10

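Regression coefficient of U on V,

$b_{UV} = \frac{n\sum UV - \sum U \sum V}{n\sum V^2 - (\sum V)^2} = \frac{10(11) - (-2)(13)}{10(211) - (13)^2} = \frac{110 + 26}{2110 - 169} = \frac{136}{1941} = 0.07$

Regression coefficient of V on U,

$b_{VU} = \frac{n\sum UV - \sum U \sum V}{n\sum U^2 - (\sum U)^2} = \frac{136}{10(246) - (-2)^2} = 0.055$

So, $b_{XY} = \frac{h}{k}\, b_{UV} = \frac{5}{1}(0.07) = 0.35$ and $b_{YX} = \frac{k}{h}\, b_{VU} = \frac{1}{5}(0.055) = 0.011$

The regression coefficients of y on x and x on y are 0.011 and 0.35, respectively.

Therefore, the two regression equations and the correlation coefficient are:
i. Y on X: Y − 41.3 = 0.011 (X − 109)
ii. X on Y: X − 109 = 0.35 (Y − 41.3)
iii. Correlation of X and Y = √(0.011 x 0.35) = 0.062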
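As a cross-check of Problem-20, the regression coefficients can also be computed by the direct method in a few lines of Python. This is a sketch with illustrative variable names; `predict_y` is a hypothetical helper that evaluates the fitted line of Y on X.

```python
import math

income = [120, 90, 80, 150, 130, 140, 110, 95, 70, 105]
spend = [40, 36, 40, 45, 40, 44, 45, 38, 50, 35]
n = len(income)

sum_x, sum_y = sum(income), sum(spend)
sum_xy = sum(x * y for x, y in zip(income, spend))
sum_x2 = sum(x * x for x in income)
sum_y2 = sum(y * y for y in spend)

cross = n * sum_xy - sum_x * sum_y
b_yx = cross / (n * sum_x2 - sum_x ** 2)   # regression coefficient of Y on X
b_xy = cross / (n * sum_y2 - sum_y ** 2)   # regression coefficient of X on Y
mean_x, mean_y = sum_x / n, sum_y / n

# r is the geometric mean of the two coefficients and carries their sign
r = math.copysign(math.sqrt(b_yx * b_xy), b_yx)


def predict_y(x):
    """Fitted Y from the regression line of Y on X."""
    return mean_y + b_yx * (x - mean_x)


print(round(b_yx, 3), round(b_xy, 3), round(r, 3))
```

The direct method gives the same coefficients as the step deviation method, up to the rounding used in the hand calculation.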

Exercise: From the exercise on correlation with data on photosynthetic rate (Y) and light interception (X), find the regression equation of Y on X and estimate Y when X = 0.95.


II. DESIGN AND ANALYSIS OF EXPERIMENTS

2.1. Basic concepts on design of experiments

Planning an experiment to obtain appropriate data with respect to any problem under investigation is known as 'design of experiment'. It is a complete sequence of steps taken well in time to ensure that appropriate data will be obtained in a way which permits an objective analysis of the data, leading to valid inferences with respect to the stated problems. "Design of experiment" comprises the process of planning experiments, analysing the data or observations and interpreting the results. The technique for making inferences is known as the "analysis of variance". There are three basic principles of the design of experiments:

(i) Replication, (ii) Randomization and (iii) Local control.

(i). Replication: The repetition of treatments by applying them to more than one experimental unit under investigation is known as replication. Replication is necessary in order to get an estimate of the experimental error variation caused by uncontrolled factors. In addition to providing an estimate of error, replication increases the precision of treatment comparisons by reducing the error of the treatment means.

(ii). Randomization: Assigning treatments or factors to be tested to experimental units according to a definite law of probability is known as randomization. Under the principle of randomization, every experimental unit has the same chance of receiving any one of the treatments under study. For an objective comparison it is necessary that treatments are allotted randomly to the various experimental units. Statistical procedures employed in making inferences about treatments hold good only when the treatments are allotted randomly to the various experimental units.

(iii). Local control: Though every experiment should provide an estimate of error variation, it is not desirable to have a large experimental error. The reduction of experimental error can be achieved by making use of the fact that adjacent areas in the field are relatively more homogeneous than those widely separated. The aim of local control is to reduce the error by suitably modifying the allocation of treatments to the experimental units on the basis of previous knowledge.

Analysis of variance (ANOVA)

Analysis of variance is basically a technique of partitioning the overall variation in the responses observed in an investigation into different assignable sources of variation, some of which are specifiable and others unknown. Further, it helps in testing whether the variation due to any particular component is significant as compared to the residual variation that can occur among the observational units.

Some important definition for experimental designs

Treatment: In experimentation, the various objects of comparison are known as treatments. In practice, treatments may refer to a physical substance (fertilizers, varieties of crops, animal breeds, feeds, etc.) or to a procedure or condition (methods of cultivation, sowing, housing conditions, etc.) which is applied to experimental units for getting a response.

Experimental Unit: The basic objects on which the experiment is done are known as experimental units.

Model: In statistics, a model is generally expressed in terms of symbols, usually as a set of equations consisting of factors and treatments with a random effect.

Fixed effect model: A model in which the factors are fixed effects and the error effect is random is called a fixed effect model. A fixed effect model with two factors is written as:

$y_{ijk} = \mu + \alpha_i + \beta_j + e_{ijk}$, where $e_{ijk}$ is i.i.d. ~ $N(0, \sigma_e^2)$

Random effect model: A model in which the factors are random effects and the error effect is random is called a random effect model.

Mixed effect model: A model in which some factors are fixed and some are random, with a random error effect, is called a mixed effect model.

Hypothesis: Any assumption or statement about a population characteristic is called a hypothesis. It may be parametric or non-parametric.

Null hypothesis: It is the hypothesis which is tested for possible rejection under the assumption that it is true.

Degrees of Freedom: The degrees of freedom correspond to the number of independent deviations or contrasts that are available from the data. They can be calculated by deducting the number of constants that are calculated from the data from the number of values available.

Level of significance: This is the probability (under Ho) of rejecting the null hypothesis when it is true, i.e. the size of the rejection region. It is generally denoted by the symbol α and is usually taken as 0.05 (or 5%) or 0.01 (or 1%).

Basic assumptions for analysis of variance:
(i) All the effects of different sources of variation (e.g. treatment, environment, etc.) are additive.
(ii) Experimental errors are independent.
(iii) Experimental errors have common variance.
(iv) Experimental errors are normally distributed (at least asymptotically), i.e. i.i.d. ~ N(0, σe²)


Analysis of variance of one-way classified data

Let there be n observations yij, which are grouped into t classes or treatments such that in the i-th group there are ni observations, i.e. i = 1, 2, 3, ..., t; j = 1, 2, 3, ..., ni and $n = \sum_i n_i$, and yij is the response of the j-th unit receiving the i-th treatment.

Layout:

                     Treatments
         1        2        ...    i        ...    t
         y11      y21      ...    yi1      ...    yt1
         y12      y22      ...    yi2      ...    yt2
         ...      ...             ...             ...
         y1j      y2j      ...    yij      ...    ytj
         ...      ...             ...             ...
         y1n1     y2n2     ...    yini     ...    ytnt
Total    T1       T2       ...    Ti       ...    Tt       Grand total = G
Mean     T1/n1    T2/n2    ...    Ti/ni    ...    Tt/nt    Grand mean = G/n

Model: $y_{ij} = \mu + t_i + e_{ij}$

where μ is a constant representing the general conditions to which all the observations are subjected; ti is the unknown effect of the i-th class, to be estimated; and the eij are independent random variables with zero mean and constant variance $\sigma_e^2$.

Hypothesis: Under certain additional assumptions, analysis of variance leads to testing the following hypotheses,

$H_0: t_1 = t_2 = \dots = t_t$ and $H_1: t_i \neq t_j$ for at least one pair i and j

Analysis:

Step-1. Compute Correction Factor, $CF = \frac{G^2}{n}$

Step-2. Compute Total Sum of Squares, $TSS = \sum_{i,j} y_{ij}^2 - CF$

Step-3. Compute Treatment Sum of Squares, $TrSS = \sum_i \frac{T_i^2}{n_i} - CF$

Step-4. Compute Error Sum of Squares, ESS = TSS - TrSS

Step-5. Prepare ANOVA Table

Sources of variation   d.f.   SS     MSS                  Fcal       F (tab)
Treatments             t-1    TrSS   TMS = TrSS/(t-1)     TMS/EMS
Error                  n-t    ESS    EMS = ESS/(n-t)
Total                  n-1    TSS

Step-6. Compare F values as:
If Fcal ≤ Ftab at α level, then H0 is accepted, i.e. all treatment effects are the same or not significant.

If Fcal > Ftab at α level, then H1 is accepted, i.e. at least two treatment effects are different or significant.

Step-7. If the test in ANOVA is not significant, which means all the treatments are equal in effect, then stop further analysis as the result is concluded. But if the test is significant, which means at least two treatments differ in effect, then proceed to compare the differences of treatment effects by the Critical Difference (CD) or Least Significant Difference (LSD) test.

CD Test:
i). Estimate the SE of the i-th treatment mean, $SE(m) = \sqrt{EMS / n_i}$

ii). Estimate the SE of the difference between the i-th and j-th treatment means,

$SE(d) = \sqrt{EMS\left(\frac{1}{n_i} + \frac{1}{n_j}\right)}$

If ni = nj = r, then $SE(d) = \sqrt{\frac{2\,EMS}{r}}$

iii). Compute CD = SE(d) x t, where t = tabulated t with error d.f. at α level

iv). Compare the difference of any two treatment means (DTM) with the CD value to find the significant differences between treatments. If a DTM is less than or equal to CD, the two treatments are not significantly different; otherwise they are significantly different. All such treatment pairs are compared likewise.

Step-8. In order to find out the reliability of the experiment, the coefficient of variation (CV) is computed as:

$CV = \frac{\sqrt{EMS}}{\text{Overall mean}} \times 100$

If the CV is 20% or less, it is an indication of better precision of the experiment; when the CV is more than 20%, the experiment may be repeated and efforts made to reduce the experimental error.

Analysis of variance of two-way classified data

Two-way ANOVA is carried out when the data are classified by two factors. Examples are treatment as the first factor and blocking as the second factor in agricultural experiments; feed and housing condition in poultry; learning process and education standard in social science; tree species and agro-climatic condition, etc. Let yij be the response due to the i-th treatment (i = 1, 2, 3, ..., t) in the j-th block (j = 1, 2, 3, ..., r) in a trial.

Layout: Let there be t treatments with r blocks or replications for studying the response of a characteristic, y.

              Replication
Treatment     r1     r2     ..    rr     Total    Mean
t1            Y11    Y12    ..    Y1r    T1       T1/r
t2            Y21    Y22    ..    Y2r    T2       T2/r
..            ..     ..     ..    ..     ..       ..
tt            Yt1    Yt2    ..    Ytr    Tt       Tt/r
Total         R1     R2     ..    Rr     G        M = G/rt

Model: The model for two-way classified data with one observation per cell is:

$y_{ij} = \mu + t_i + b_j + e_{ij}$

Hypothesis: Under certain additional assumptions, analysis of variance leads to testing the following hypotheses,

$H_0: t_1 = t_2 = \dots = t_t$ and $H_1: t_i \neq t_j$ for at least one pair i and j

Analysis:
Step-1. Compute Correction Factor, $CF = \frac{G^2}{rt}$

Step-2. Compute Total Sum of Squares, $TSS = \sum_{i,j} y_{ij}^2 - CF$

Step-3. Compute Treatment Sum of Squares, $TrSS = \sum_i \frac{T_i^2}{r} - CF$

Step-4. Compute Replication Sum of Squares, $RSS = \sum_j \frac{R_j^2}{t} - CF$

Step-5. Compute Error Sum of Squares, ESS = TSS – TrSS - RSS

Step-6. Prepare ANOVA Table

Sources of variation   d.f.          SS     MS                          Fcal       F (tab)
Replication            r-1           RSS    RMS = RSS/(r-1)             RMS/EMS
Treatments             t-1           TrSS   TMS = TrSS/(t-1)            TMS/EMS
Error                  (r-1)(t-1)    ESS    EMS = ESS/[(r-1)(t-1)]
Total                  rt-1          TSS

Step-7. Compare F values as:

If Fcal ≤ Ftab at α level, then H0 is accepted, i.e. all treatment effects are the same or not significant.
If Fcal > Ftab at α level, then H1 is accepted, i.e. at least two treatment effects are different or significant.

Step-8. If the test in ANOVA is not significant, which means all the treatments are equal in effect, then stop further analysis as the result is concluded. But if the test is significant, which means at least two treatments differ in effect, then proceed to compare the differences of treatment effects by the Critical Difference (CD) or Least Significant Difference (LSD) test as above.

Step-9. SE of mean, $SE(m) = \sqrt{EMS / r}$ and SE of the difference of 2 means, $SE(d) = \sqrt{2\,EMS / r}$

Step-10. $CV = \frac{\sqrt{EMS}}{M} \times 100$
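The steps for two-way classified data can be sketched as a small Python function. The function name `two_way_anova` and the yield figures are hypothetical, chosen only to show the arithmetic; the tabulated F value still has to be read from an F table.

```python
def two_way_anova(data):
    """ANOVA for a t x r table: rows are treatments, columns are replications."""
    t, r = len(data), len(data[0])
    grand = sum(sum(row) for row in data)
    cf = grand ** 2 / (r * t)                       # correction factor
    tss = sum(y * y for row in data for y in row) - cf
    trss = sum(sum(row) ** 2 for row in data) / r - cf
    rss = sum(sum(col) ** 2 for col in zip(*data)) / t - cf
    ess = tss - trss - rss
    ems = ess / ((r - 1) * (t - 1))                 # error mean square
    return {
        "TSS": tss, "TrSS": trss, "RSS": rss, "ESS": ess,
        "F_treatment": (trss / (t - 1)) / ems,
        "F_replication": (rss / (r - 1)) / ems,
        "CV_percent": 100 * ems ** 0.5 / (grand / (r * t)),
    }


# Hypothetical yields: 3 treatments (rows) x 3 replications (columns)
yields = [[10, 12, 14], [13, 15, 17], [19, 20, 24]]
result = two_way_anova(yields)
for name, value in result.items():
    print(name, round(value, 3))
```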

2.2. Analysis of data in completely randomized design (CRD)

The simplest design, using only two essential principles of field experimentation, viz. replication and randomization, is the completely randomised design (CRD). This is a one-way classification of data. In this design the whole of the experimental material is divided into a number of experimental units depending on the number of treatments and the number of replications for each treatment. The treatments are then allotted randomly to the units of the entire homogeneous material, and observations on different characteristics or variables of interest are recorded. This design is useful for laboratory or green house experiments where treatment is the only variable of interest for comparison.
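The random allotment of treatments to units in a CRD can be sketched with Python's standard `random` module. The function name `crd_layout`, the treatment labels and the seed are hypothetical; in practice a random number table or any other chance mechanism serves the same purpose.

```python
import random
from collections import Counter


def crd_layout(replications, seed=None):
    """Randomly allot treatments to units; replications maps treatment -> count."""
    labels = [trt for trt, r in replications.items() for _ in range(r)]
    rng = random.Random(seed)
    rng.shuffle(labels)
    # Unit numbers start at 1; each unit receives exactly one treatment
    return {unit: trt for unit, trt in enumerate(labels, start=1)}


# Hypothetical trial: three treatments with unequal replication
plan = crd_layout({"T1": 3, "T2": 4, "T3": 5}, seed=2015)
counts = Counter(plan.values())
print(plan)
print(dict(counts))
```

Shuffling the full list of treatment labels gives every unit the same chance of receiving any treatment while keeping the planned number of replications.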

Procedure:

The analysis is the same as that of one-way classification with respect to model, assumptions, hypothesis and steps of calculation.

Model: $Y_{ij} = \mu + t_i + e_{ij}$

where Yij is the value of the variate in the j-th replicate of the i-th treatment (i = 1, 2, ..., t; j = 1, 2, ..., ri),

μ is the general mean effect,

ti is the effect due to the i-th treatment,

eij is the random error, which is i.i.d. ~ N(0, σe²)

Step-1. The observations of a variable y can be arranged as follows:

Arrangement of observations of CRD

Treatment       1        2        3        ...    t
                Y11      Y21      Y31      ...    Yt1
                Y12      Y22      Y32      ...    Yt2
                Y13      Y23      Y33      ...    Yt3
                ...      ...      ...      ...    ...
Total           T1       T2       T3       ...    Tt       GT
No. of repl.    r1       r2       r3       ...    rt       n
Treat. mean     T1/r1    T2/r2    T3/r3    ...    Tt/rt

Step-2. The hypotheses to be tested are

$H_0: t_1 = t_2 = \dots = t_t$ and $H_1: t_i \neq t_j$ for at least one pair i and j

Step-3. Analysis of data

i). Correction Factor, $C.F. = \frac{(GT)^2}{n}$

ii). Total Sum of Squares, $TSS = \sum_i \sum_j Y_{ij}^2 - C.F. = (Y_{11}^2 + Y_{12}^2 + \dots + Y_{tr}^2) - C.F.$

iii). Treatment Sum of Squares, $TrSS = \left(\frac{T_1^2}{r_1} + \frac{T_2^2}{r_2} + \dots + \frac{T_t^2}{r_t}\right) - C.F.$

iv). Error Sum of Squares (ESS) = TSS – TrSS

Step-4. Preparation of ANOVA table

Sources of variation   d.f.   SS     MSS                  Fcal       F (tab)
Treatments             t-1    TrSS   TMS = TrSS/(t-1)     TMS/EMS
Error                  n-t    ESS    EMS = ESS/(n-t)
Total                  n-1    TSS

Step-5. If the calculated value of F is greater than the table value of $F_{\alpha;\, t-1,\, n-t}$, where α denotes the level of significance, the hypothesis Ho is rejected and it can be inferred that some or all of the treatment effects are significantly different.

Step-6. Calculation of standard errors and CD value for pair comparison:
(a). Estimated SE of the i-th treatment mean, $SE(m) = \sqrt{EMS / r_i}$

(b). Estimated SE of the difference between the i-th and j-th treatment means is

$SE(d) = \sqrt{EMS\left(\frac{1}{r_i} + \frac{1}{r_j}\right)}$

If ri = rj = r, then $SE(d) = \sqrt{\frac{2\,EMS}{r}}$

(c). CD = SE(d) x t
(d). The treatment means are arranged according to their ranks in descending order. Using the CD value, the bar chart is completed to interpret the treatment comparisons.

CRD with unequal replications

Problem-21. A varietal trial on green gram was conducted in a green house under CRD with five varieties V1, V2, V3, V4 and V5, replicated 3, 4, 5, 4 and 4 times, respectively. The data recorded on grain yield are presented below.

Grain yield of green gram (kg/pot)

Varieties   V1      V2      V3      V4      V5
            1.6     2.5     1.3     2.0     1.6
            1.2     2.2     0.9     1.5     1.0
            1.5     2.4     0.8     1.6     0.8
            --      1.9     1.1     1.4     0.9
            --      --      1.0     --      --
Total       4.3     9.0     5.1     6.5     4.3
Repl        3       4       5       4       4
Mean        1.43    2.25    1.02    1.62    1.08
Variance    0.043   0.070   0.037   0.069   0.129

Analyse the data and find the variety with the highest grain yield.

Solution:

Step-1. Null hypothesis Ho: T1 = T2 = T3 = T4 = T5, i.e. all varieties give the same yield;

H1: at least two of T1, T2, …, T5 differ, i.e. the varieties do not all give the same yield.

Step-2. Calculation

i). C.F. = $(29.2)^2 / 20 = 42.6320$

ii). TSS = $[(1.6)^2 + (1.2)^2 + \dots + (0.9)^2] - \text{C.F.} = 47.840 - 42.632 = 5.208$

iii). SS due to treatments (varieties) = TrSS or VSS

$= \dfrac{(4.3)^2}{3} + \dfrac{(9.0)^2}{4} + \dfrac{(5.1)^2}{5} + \dfrac{(6.5)^2}{4} + \dfrac{(4.3)^2}{4} - \text{C.F.} = 46.8003 - 42.6320 = 4.1683$

iv). ESS = TSS - VSS = 5.2080 - 4.1683 = 1.0397

Step-3. Construction of ANOVA table

Sources of variation   d.f.   SS       MSS      Fcal       F0.01
Variety                4      4.1683   1.0421   15.037**   4.893
Error                  15     1.0397   0.0693
Total                  19     5.2080

** Significant at 1% level

Step-4. Since the observed value of F is greater than the 1% tabulated F value, the null hypothesis is rejected. This indicates that some of the treatment pairs are different. So, the C.D. test is required for pairwise comparison.

Step-5. Calculation of SE for V1 and V2

$SE(d) = \sqrt{EMS\left(\dfrac{1}{r_1} + \dfrac{1}{r_2}\right)} = \sqrt{0.0693\left(\dfrac{1}{3} + \dfrac{1}{4}\right)} = \sqrt{0.040423} = 0.2011$

The table value of t for α = 0.05 and 15 d.f. is 2.131.
Hence, CD = (2.131)(0.2011) = 0.4285.
Similarly, the CD values of the other pairs are:

V1 and V3 = 0.4096
V1 and V4; V1 and V5 = 0.4285
V2 and V3; V3 and V4; V3 and V5 = 0.3763
V2 and V4; V2 and V5; V4 and V5 = 0.3966

Comparison of the differences between the mean yields of the varieties with the corresponding CD values results in the following bar chart.

V2 V4 V1 V5 V3

Conclusion: It is concluded that variety V2 is the best variety, giving the highest grain yield, followed by V1 & V4 and V3 & V5.
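As a cross-check of Problem-21, the sketch below recomputes the sums of squares, the F ratio and the pairwise CD values directly from the green gram data. The variable and function names are illustrative only; the t value 2.131 is the table value quoted in the solution.

```python
from math import sqrt

# Grain yield of green gram (kg/pot), unequal replications.
yields = {
    "V1": [1.6, 1.2, 1.5],
    "V2": [2.5, 2.2, 2.4, 1.9],
    "V3": [1.3, 0.9, 0.8, 1.1, 1.0],
    "V4": [2.0, 1.5, 1.6, 1.4],
    "V5": [1.6, 1.0, 0.8, 0.9],
}

n = sum(len(v) for v in yields.values())
t = len(yields)
gt = sum(sum(v) for v in yields.values())
cf = gt ** 2 / n
tss = sum(y * y for v in yields.values() for y in v) - cf
vss = sum(sum(v) ** 2 / len(v) for v in yields.values()) - cf
ess = tss - vss
ems = ess / (n - t)
f_cal = (vss / (t - 1)) / ems

T_05 = 2.131  # table value of t for alpha = 0.05 and 15 d.f. (from the text)

def cd(a, b):
    """Critical difference for varieties a and b with their own replications."""
    return T_05 * sqrt(ems * (1 / len(yields[a]) + 1 / len(yields[b])))

print(round(cf, 4), round(tss, 4), round(vss, 4), round(ess, 4), round(f_cal, 3))
print(round(cd("V1", "V2"), 4), round(cd("V2", "V3"), 4))
```

Small differences in the last decimal place relative to the hand calculation come from the manual rounding EMS to 0.0693 before taking the square root.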

Exercise: The data come from a laboratory experiment in which observations were made on the mycelial growth of different Rhizoctonia solani isolates on PDA medium:

R. solani isolates   Mycelial growth
                     Repl. 1   Repl. 2   Repl. 3
RS-1                 29.0      28.0      29.0
RS-2                 33.5      31.5      29.0
RS-3                 26.5      30.0      ----
RS-4                 48.5      46.5      49.0
RS-5                 34.5      31.0      ----

Analyse the data and draw conclusions on the significance of differences among the Rhizoctonia solani isolates.

CRD with equal replications

Problem-22. In order to find out the yielding abilities of five varieties of sesamum, an experiment was conducted in a poly house using a CRD with four plots per variety. The observations are given in the table below.

Seed yield of sesamum (g/plot)

Varieties   1      2      3      4      5
            25     25     24     20     14
            21     28     24     17     15
            21     24     16     16     13
            18     25     21     19     11
Total       85     102    85     72     53
Mean        21.2   25.5   21.2   18.0   13.2

Analyse the data and draw conclusions on the performance of the different sesamum varieties.

Solution:

Step-1. Null hypothesis Ho: V1 = V2 = … = V5; H1: at least 2 varieties are different.

Step-2. Calculation

(i). C.F. = $\dfrac{(397)^2}{20} = 7880.45$

(ii). TSS = $[(25)^2 + (21)^2 + \dots + (11)^2] - \text{C.F.} = 8307 - 7880.45 = 426.55$

(iii). Varieties SS = VSS = $\dfrac{1}{4}(85^2 + 102^2 + \dots + 53^2) - \text{C.F.} = 8211.75 - 7880.45 = 331.30$

(iv). ESS = 426.55 - 331.30 = 95.25

Step-3. Construction of ANOVA table

Sources of variation   d.f.   SS       MSS      Fcal        Ftab
Varieties              4      331.30   82.825   13.043 **   4.893
Error                  15     95.25    6.350
Total                  19     426.55

** Significant at 1% level.


Step-4. Since the observed value of F is greater than the 1% table value, the null hypothesis is rejected. So, we proceed to the CD test.

$SE(d) = \sqrt{\dfrac{2 \times 6.350}{4}} = 1.7819$

The table value of t for α = 0.05 and 15 d.f. is 2.131.

Hence, CD = (2.131)(1.7819) = 3.7972 ≈ 3.80

The arrangement of treatments according to their ranks and the bar chart will be: V2 V1 V3 V4 V5

Conclusion: From the analysis, it is concluded that variety V2 is the best.
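With equal replications a single CD value serves every pair, so the pairwise comparisons of Problem-22 can be listed mechanically. The following sketch (names are illustrative, t = 2.131 taken from the text) recomputes the CD and flags which pairs of variety means differ by more than it.

```python
from math import sqrt
from itertools import combinations

# Seed yield of sesamum (g/plot), four plots per variety.
data = {
    "V1": [25, 21, 21, 18],
    "V2": [25, 28, 24, 25],
    "V3": [24, 24, 16, 21],
    "V4": [20, 17, 16, 19],
    "V5": [14, 15, 13, 11],
}
r = 4
t = len(data)
n = r * t

gt = sum(sum(v) for v in data.values())
cf = gt ** 2 / n
tss = sum(y * y for v in data.values() for y in v) - cf
vss = sum(sum(v) ** 2 for v in data.values()) / r - cf
ess = tss - vss
ems = ess / (n - t)

cd = 2.131 * sqrt(2 * ems / r)          # t(0.05, 15 d.f.) x SE(d)
means = {k: sum(v) / r for k, v in data.items()}

# True means the two varieties differ significantly at the 5% level.
significant = {(a, b): abs(means[a] - means[b]) > cd
               for a, b in combinations(means, 2)}

for pair, flag in significant.items():
    print(pair, "significant" if flag else "at par")
```

Pairs reported as "at par" are the ones that would be joined by a common bar in the bar chart.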

Exercise: The data represent a set of observations on wood density obtained on a randomly collected set of 7 stems belonging to five cane species.

Stem   Species
       1      2      3      4      5
1      0.58   0.53   0.49   0.53   0.57
2      0.54   0.63   0.55   0.61   0.64
3      0.38   0.68   0.58   0.53   0.63
4      0.32   0.55   0.54   0.47   0.68
5      0.52   0.45   0.41   0.41   0.61
6      0.41   0.59   0.63   0.58   0.74
7      0.47   0.65   0.58   0.44   0.71

Analyse the data and draw conclusions on the differences among cane species.

2.3. Analysis of data in randomised complete block design (RCBD or RBD) with one observation per cell

In order to control variability in one direction in the experimental material, it is desirable to divide the experimental units into homogeneous groups of units, called blocks, perpendicular to treatments. The treatments are randomly allocated within each of these blocks. This procedure gives an arrangement of ‘t’ treatments in ‘r’ blocks such that each treatment occurs precisely once in each block.

Procedure:

The analysis of a Randomised Complete Block Design is similar to the analysis of two-way classified data. For the analysis of this design we use the linear additive model,

$Y_{ij} = \mu + t_i + r_j + e_{ij}$

where μ = the overall mean; $t_i$ = the i-th treatment effect; $r_j$ = the j-th replication effect; and $e_{ij}$ = the error term, iid ~ $N(0, \sigma_e^2)$.

Step-1. The observations from an RBD can be arranged as follows:

Arrangement of data in RBD with t treatments and r replications

Treatment   Replication                          Total
            1     2     3     ...   r
1           Y11   Y12   Y13   ...   Y1r          T1
2           Y21   Y22   Y23   ...   Y2r          T2
3           Y31   Y32   Y33   ...   Y3r          T3
...         ...   ...   ...   ...   ...          ...
t           Yt1   Yt2   Yt3   ...   Ytr          Tt
Total       R1    R2    R3    ...   Rr           GT

Step-2. The data can be analysed as:

(i). C.F. = $(GT)^2 / rt$

(ii). Total SS = TSS = $\sum\sum y_{ij}^2 - \text{C.F.}$

(iii). Replication SS = RSS = $\dfrac{1}{t}\sum_j R_j^2 - \text{C.F.}$

(iv). Treatment SS = TrSS = $\dfrac{1}{r}\sum_i T_i^2 - \text{C.F.}$

(v). Error SS = ESS = TSS - RSS - TrSS

Step-3. We are interested in testing the hypothesis

Ho: t1 = t2 = … = tt, against the alternative that at least 2 t’s are not equal.

Step-4. ANOVA table

Sources of variation   d.f.         SS     MSS   Fcal        F(tab)
Replication            r-1          RSS    RMS   RMS / EMS
Treatment              t-1          TrSS   TMS   TMS / EMS
Error                  (r-1)(t-1)   ESS    EMS
Total                  rt-1         TSS

Step-5. If the F-test shows that there is no significant difference between replications, it indicates that the RBD will not contribute to precision in detecting treatment differences. In such situations the adoption of RBD in preference to CRD is not advantageous.

Step-6. If the F-test shows a significant difference between treatments, then we can use the CD for comparing pairs of treatments. The CD is given by:

CD = tα x SE(d)

where tα = table value of t for the α (0.01 or 0.05) level of significance and error degrees of freedom,

and $SE(d) = \sqrt{\dfrac{2\,EMS}{r}}$

Based on the CD value the bar chart can be drawn and conclusions can be written.
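The RBD sums of squares above translate directly into code. The sketch below is one possible arrangement, assuming the data are stored as a list of rows with one row per treatment and one column per replication; the function name `rbd_anova` and the tiny data set are hypothetical.

```python
def rbd_anova(table):
    """ANOVA for an RBD; table[i][j] is the observation on treatment i in replication j."""
    t = len(table)            # number of treatments
    r = len(table[0])         # number of replications (blocks)
    gt = sum(sum(row) for row in table)
    cf = gt ** 2 / (r * t)
    tss = sum(y * y for row in table for y in row) - cf
    trss = sum(sum(row) ** 2 for row in table) / r - cf        # treatment SS
    rss = sum(sum(col) ** 2 for col in zip(*table)) / t - cf   # replication SS
    ess = tss - rss - trss
    rms = rss / (r - 1)
    tms = trss / (t - 1)
    ems = ess / ((r - 1) * (t - 1))
    return {"CF": cf, "TSS": tss, "RSS": rss, "TrSS": trss, "ESS": ess,
            "F_rep": rms / ems, "F_trt": tms / ems, "EMS": ems}

# Hypothetical data: 3 treatments x 2 replications.
demo = rbd_anova([[1, 2], [3, 5], [6, 7]])
print(demo)
```

Transposing with `zip(*table)` gives the replication totals without a second loop; the F ratios are then compared with the table values for the corresponding d.f.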

Problem-23. The plan and yield (kg/plot) of six paddy strains (A, B, C, D, E, F) in an RBD experiment with four replications are shown below.

Block-I:    A (12)   E (14)   C (11)   D (7)    B (5)    F (10)
Block-II:   B (4)    C (6)    E (11)   A (16)   D (8)    F (9)
Block-III:  B (7)    C (9)    D (9)    E (15)   F (12)   A (14)
Block-IV:   F (8)    A (18)   C (10)   B (6)    D (8)    E (12)

(Figures in parentheses are yield observations)

Analyse the data and draw conclusions on the yield performance of the paddy strains.

Solution:

Step-1. Null hypothesis H0: TA = TB = … = TF (all the varieties have the same mean yield); H1: at least 2 strains are different.

Step-2. The data can be arranged in the following two-way classification.

Paddy yield (in kg/plot)

Treatment    Replication or Blocks      Treatment Total   Mean
             I     II    III   IV
A            12    16    14    18       60                15
B            5     4     7     6        22                5.5
C            11    6     9     10       36                9
D            7     8     9     8        32                8
E            14    11    15    12       52                13
F            10    9     12    8        39                9.8
Rep. Total   59    54    66    62       GT = 241

Step-3. Calculation. Here, N = r x t = 4 x 6 = 24

(i). Correction factor, CF = $\dfrac{(GT)^2}{N} = \dfrac{(241)^2}{24} \approx 2420$

(ii). Total SS = TSS = $(12^2 + \dots + 8^2) - CF = 2717 - 2420 = 297$

(iii). Replication or Block SS = RSS = $\dfrac{59^2 + \dots + 62^2}{6} - CF = 2432 - 2420 = 12$

(iv). Variety SS = VSS = $\dfrac{60^2 + \dots + 39^2}{4} - CF = 2657 - 2420 = 237$

(v). Error SS = ESS = TSS - RSS - VSS = 297 - 12 - 237 = 48

Step-4. Construction of ANOVA Table

Sources of variation   d.f.          SS    MSS    Fcal      Ftab 5%   Ftab 1%
Block                  (r-1) = 3     12    4      1.25 ns
Variety                (t-1) = 5     237   47.4   14.8 **   2.90      4.56
Error                  15            48    3.2
Total                  (rt-1) = 23   297

ns: Not significant   ** Significant at 1% level

Step-5. Since the calculated F value of variety is greater than the F table value for 5 and 15 d.f. at the 1% level, the conclusion is that the varieties differ significantly at the 1% level, i.e. the varietal differences are highly significant.

Step-6. Critical difference, CD = $\sqrt{\dfrac{2\,EMS}{r}} \times t_{0.05}$ for 15 d.f. $= \sqrt{\dfrac{2 \times 3.2}{4}} \times 2.131 = 2.69$

Step-7. The arrangement of treatments according to their ranks with respect to their means, and their bar chart, is as follows:

Varieties: A E F C D B

Conclusion: The bar chart shows that varieties (A & E) are superior to B & (C, D, F), while (C, D, F) are at par with respect to yield performance of these 6 paddy strains.
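The Problem-23 figures can be re-derived from the two-way table as a check. Note that the hand calculation above rounds the correction factor and the intermediate totals to whole numbers, so the unrounded values produced by this sketch (for example ESS of about 46.96 rather than 48, and CD of about 2.67 rather than 2.69) differ slightly; the ranking of the strains is unaffected. Variable names are illustrative.

```python
from math import sqrt

# Paddy yield (kg/plot): one row per strain, columns are blocks I-IV.
table = {
    "A": [12, 16, 14, 18],
    "B": [5, 4, 7, 6],
    "C": [11, 6, 9, 10],
    "D": [7, 8, 9, 8],
    "E": [14, 11, 15, 12],
    "F": [10, 9, 12, 8],
}
t = len(table)
r = 4

gt = sum(sum(v) for v in table.values())
cf = gt ** 2 / (r * t)
tss = sum(y * y for v in table.values() for y in v) - cf
vss = sum(sum(v) ** 2 for v in table.values()) / r - cf
block_totals = [sum(col) for col in zip(*table.values())]
rss = sum(b * b for b in block_totals) / t - cf
ess = tss - rss - vss
ems = ess / ((r - 1) * (t - 1))

f_variety = (vss / (t - 1)) / ems
cd = 2.131 * sqrt(2 * ems / r)      # t(0.05, 15 d.f.) from the text

means = {k: sum(v) / r for k, v in table.items()}
ranked = sorted(means, key=means.get, reverse=True)

print(block_totals, round(cf, 2), round(tss, 2), round(rss, 2),
      round(vss, 2), round(ess, 2))
print(round(f_variety, 2), round(cd, 2), ranked)
```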

Exercise: In a field experiment laid out under RCBD, observations were made on seven provenances of Gmelina arborea for the girth at breast-height (gbh) attained by the trees 6 years after planting.

gbh (cm) of trees in plots 6 years after planting

Treatment (Provenance)   Replication
                         I       II      III
1                        30.85   38.01   35.10
2                        30.24   28.43   35.93
3                        30.94   31.64   34.95
4                        29.89   29.12   36.75
5                        21.52   24.07   20.76
6                        25.38   32.14   32.19
7                        22.89   19.66   26.92

Analyse the data and draw conclusions on treatment differences.

2.4. Analysis of data in Latin square design (LSD)

This design controls heterogeneity in two directions in the experimental material. In this design two restrictions are imposed by forming blocks in two perpendicular directions, row-wise and column-wise. Treatments are allotted in such a way that every treatment occurs once and only once in each row and each column. Thus, a Latin square of ‘t’ treatments is an arrangement of t x t or t² cells such that every row and every column contains every treatment precisely once. By this arrangement the error variation can be reduced considerably further.

Procedure:

For the analysis of this design we use the linear additive model

$y_{ijk} = \mu + r_i + c_j + t_k + e_{ijk}$

where $y_{ijk}$ is the observation on the k-th treatment in the i-th row and j-th column (i = 1, 2, …, s; j = 1, 2, …, s; k = 1, 2, …, s), μ is the general mean effect, $r_i$ is the effect due to the i-th row, $c_j$ is the effect due to the j-th column, $t_k$ is the effect due to the k-th treatment and $e_{ijk}$ is the random error component, which is assumed to be independently and identically normally distributed with mean zero and a constant variance $\sigma_e^2$.

Analysis:

Let there be s treatments arranged in s rows and s columns; then compute,

(i). $R_i$ = Total of i-th row = $\sum_j y_{ijk}$

(ii). $C_j$ = Total of j-th column = $\sum_i y_{ijk}$

(iii). $T_k$ = Total of k-th treatment in the design

(iv). C.F. = $(GT)^2 / s^2$, where GT is the Grand Total

(v). TSS (Total Sum of Squares) = $\sum_i \sum_j y_{ijk}^2 - \text{C.F.}$

(vi). RSS (Row Sum of Squares) = $\dfrac{\sum_i R_i^2}{s} - \text{C.F.}$

(vii). CSS (Column Sum of Squares) = $\dfrac{\sum_j C_j^2}{s} - \text{C.F.}$

(viii). TrSS (Treatment Sum of Squares) = $\dfrac{\sum_k T_k^2}{s} - \text{C.F.}$

(ix). ESS (Error Sum of Squares) = TSS - RSS - CSS - TrSS

(x). Hypothesis Ho: t1 = t2 = … = ts against H1 that the t’s are not all equal

(xi). ANOVA Table

Sources     d.f.         SS     MSS                        Fcal
Row         (s-1)        RSS    Sr² = RSS/(s-1)
Column      (s-1)        CSS    Sc² = CSS/(s-1)
Treatment   (s-1)        TrSS   St² = TrSS/(s-1)           St²/Se²
Error       (s-1)(s-2)   ESS    Se² = ESS/((s-1)(s-2))
Total       (s²-1)       TSS

If the calculated value of F for treatment is greater than the table value of F for (s-1) and (s-1)(s-2) d.f., the hypothesis Ho is rejected. We can infer that the treatment effects are significantly different. To detect the differences, the CD test is performed.

The estimated SE of the difference between the i-th and j-th treatment means is

$SE(d) = \sqrt{\dfrac{2\,S_e^2}{s}}$

The critical difference (CD) can be calculated as

CD = SE(d) x t at error d.f.

The degrees of freedom for t are the same as those for error. The treatment means are computed as Tk/s (k = 1, 2, …, s). These means can be compared with the help of the CD value. Any two treatment means are said to differ significantly if their difference is larger than the CD value.

Problem-24. An experiment was carried out on sorghum with 5 varieties (A, B, C, D & E) in a (5 x 5) LSD. The plan and grain yield (kg/plot) are given below:

Rows           Columns                                        Row total
               I        II       III      IV       V
I              B (6)    A (11)   E (8)    D (6)    C (5)      36
II             A (9)    D (9)    C (4)    E (14)   B (10)     46
III            C (3)    B (8)    D (7)    A (12)   E (8)      38
IV             E (10)   C (5)    A (10)   B (7)    D (10)     42
V              D (8)    E (15)   B (9)    C (3)    A (18)     53
Column total   36       48       38       42       51         215

(Figures in parentheses are yield observations of the respective treatments)

Perform the ANOVA and compare the variety mean yields.


Solution:

Step-1. Hypothesis:

$H_0: \mu_A = \mu_B = \mu_C = \mu_D = \mu_E$

$H_1: \mu_A \neq \dots \neq \mu_E$

Step-2. Yield (kg/plot) of varieties and their totals

      A    B    C    D    E
      11   6    5    6    8
      9    10   4    9    14
      12   8    3    7    8
      10   7    5    10   10
      18   9    3    8    15
Tk    60   40   20   40   55

Variety totals are: A = 60; B = 40; C = 20; D = 40; E = 55

Step-3. Calculation

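(i). Grand total, GT = 215; total no. of observations = N = 25

(ii). No. of varieties, s = 5

(iii). Correction factor, C.F. = $\dfrac{(GT)^2}{N} = \dfrac{(215)^2}{25} = 1849$

(iv). Total Sum of Squares = TSS = $\sum_i \sum_j y_{ijk}^2 - \text{C.F.} = (6^2 + 11^2 + \dots + 18^2) - 1849 = 2163 - 1849 = 314$

(v). Row Sum of Squares = RSS = $\dfrac{\sum_i R_i^2}{s} - \text{C.F.} = \dfrac{36^2 + 46^2 + 38^2 + 42^2 + 53^2}{5} - 1849 = 1885.8 - 1849 = 36.8$

(vi). Column Sum of Squares = CSS = $\dfrac{\sum_j C_j^2}{s} - \text{C.F.} = \dfrac{36^2 + 48^2 + 38^2 + 42^2 + 51^2}{5} - 1849 = 1881.8 - 1849 = 32.8$

(vii). Variety Sum of Squares = VrSS = $\dfrac{\sum_k T_k^2}{s} - \text{C.F.} = \dfrac{60^2 + 40^2 + 20^2 + 40^2 + 55^2}{5} - 1849 = 2045 - 1849 = 196$

(viii). Error SS = ESS = TSS - RSS - CSS - VrSS = 314 - 36.8 - 32.8 - 196 = 48.4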

Step-4. Construction of ANOVA Table

Source of variation   df   SS      MSS    Fcal                    Ftab 5%   Ftab 1%
Rows                  4    36.8    9.2    (9.2/4.03) = 2.28 ns
Columns               4    32.8    8.2    (8.2/4.03) = 2.03 ns
Variety               4    196.0   49.0   (49/4.03) = 12.15 **    3.26      5.41
Error                 12   48.4    4.03
Total                 24   314

Step-5. Comparing the F ratios for Rows, Columns and Varieties with the table value of F (for 4 and 12 d.f.), it is found that only the differences in varietal means are highly significant.

Step-6. CD at 5% = SE(d) x t0.05 for 12 d.f.

$= \sqrt{\dfrac{2 \times 4.03}{5}} \times 2.18 = 1.27 \times 2.18 = 2.77$

The variety means are arranged according to their ranks, and the bar chart is drawn by comparing the differences with the CD value.

Variety   A    E    B    D    C
Means     12   11   8    8    4

and the bar coding is: A E   B D   C

Conclusion: The analysis reveals that varietal differences are present. Varieties A & E are at par; varieties B & D are also at par; but C is completely different in the yield of the crop. Varieties A & E are the best varieties for yield performance.
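The Latin square analysis of Problem-24 can be reproduced from the field plan itself. In the sketch below the plan is stored as rows of (variety, yield) pairs, which is simply one convenient representation and not something prescribed by the manual; row, column and variety totals are then accumulated from it.

```python
# Field plan of the 5 x 5 LSD: each cell is (variety, grain yield in kg/plot).
layout = [
    [("B", 6), ("A", 11), ("E", 8), ("D", 6), ("C", 5)],
    [("A", 9), ("D", 9), ("C", 4), ("E", 14), ("B", 10)],
    [("C", 3), ("B", 8), ("D", 7), ("A", 12), ("E", 8)],
    [("E", 10), ("C", 5), ("A", 10), ("B", 7), ("D", 10)],
    [("D", 8), ("E", 15), ("B", 9), ("C", 3), ("A", 18)],
]
s = len(layout)

row_totals = [sum(y for _, y in row) for row in layout]
col_totals = [sum(row[j][1] for row in layout) for j in range(s)]
trt_totals = {}
for row in layout:
    for variety, y in row:
        trt_totals[variety] = trt_totals.get(variety, 0) + y

gt = sum(row_totals)
cf = gt ** 2 / s ** 2
tss = sum(y * y for row in layout for _, y in row) - cf
rss = sum(x * x for x in row_totals) / s - cf
css = sum(x * x for x in col_totals) / s - cf
vss = sum(x * x for x in trt_totals.values()) / s - cf
ess = tss - rss - css - vss
ems = ess / ((s - 1) * (s - 2))
f_variety = (vss / (s - 1)) / ems

print(row_totals, col_totals, trt_totals)
print(cf, tss, round(rss, 1), round(css, 1), vss, round(ess, 1), round(f_variety, 2))
```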

Exercise: In a varietal trial on paddy to test the yielding ability of 5 varieties (A, B, C, D, E), an experiment was laid out in a 5 x 5 LSD. The results are given below.

Grain yield of paddy (kg/plot)

D 39.0   A 24.1   E 26.1   B 37.0   C 42.2
E 21.2   B 38.1   A 24.0   C 39.3   D 33.1
C 35.6   E 33.5   B 38.1   D 40.8   A 24.2
A 30.8   C 31.1   D 46.7   E 28.7   B 44.9
B 44.3   D 29.6   C 41.1   A 26.3   E 24.4

Analyse the data and draw conclusions on the yielding ability of the paddy varieties.

2.5. Missing plot technique in design of Experiments

Statistical concept: In agricultural field experiments, the experimenter often encounters the situation that the observations of a particular plot are lost, or are so much affected by some extraneous causes that it would not be desirable to regard them as normal experimental observations. Such data are generally analysed through the missing plot technique. Statistical analysis of designs where the observations on one or more plots are missing is somewhat complicated, because the initially symmetrical distribution of plots among the different treatments and among the different blocks is disturbed. The analysis of such experiments, however, can be carried out by one of the following methods.

(a) Estimating the missing value(s) using the principle of least squares, i.e. minimizing the error sum of squares.
(b) Method of interaction
(c) Method of fitting constants, and
(d) Analysis of the data with missing observation by the technique of analysis of covariance.

In the following, we shall use the first method for the analysis of data with one missing observation.

2.6. Analysis of data in RCBD with one missing observation

Procedure:

When any one observation of a character under study is missing, we first estimate the missing observation, substitute the estimated value in that place and proceed with the analysis. The method consists of selecting a value ‘x’ for the unknown missing value such that the error variance is kept at a minimum.

Consider a randomized block design with t treatments and r replications in which one observation is missing.

Let x be the value of the missing observation; it is estimated as:

$x = \dfrac{rB' + tT' - G'}{(r-1)(t-1)}$

where,

B’ = total of available values of the replication that contains the missing value
T’ = total of available values of the treatment that contains the missing value
G’ = grand total of all the available values

The analysis is then carried out as usual after substituting the estimated value for the missing value, with the following changes.

i). The d.f. for error and total are corrected by subtracting 1 from the actual d.f.

ii). The Treatment Sum of Squares is corrected by subtracting the bias,

$B = \dfrac{(B' + tT' - G')^2}{t(t-1)(r-1)^2}$

iii). Standard error for testing the significance of the difference between treatment means:

(a). Standard error of the difference between two treatment means not involving the missing value:

$SE(d) = \sqrt{\dfrac{2\,S_e^2}{r}}$

where $S_e^2$ is the Error Mean Square.

(b). Standard error of the difference between two treatment means one of which involves the missing value:

$SE(d) = \sqrt{\dfrac{EMS}{r}\left[2 + \dfrac{t}{(r-1)(t-1)}\right]}$

Problem-25. To find out the best source of nitrogen at 60 kg/ha, an experiment was conducted on paddy with 5 sources of nitrogen in 4 blocks. The yield data for the different treatments are given below.

Yield of grain (kg/plot)

Blocks   Ammonium   Ammonium   Urea   Chilean   Ammonium
         Sulphate   Chloride          nitrate   Sulphate Nitrate
         S1         S2         S3     S4        S5
I        25.4       32.5       37.5   22.5      20.5
II       17.3       --         25.4   14.7      21.5
III      22.4       28.4       30.1   23.5      23.5
IV       30.5       33.4       34.5   22.4      28.5

The observation relating to the application of Ammonium Chloride in the second block is missing. Estimate the missing value and analyse the data.

Solution:

Step-1. Prepare the following two-way table between treatments and blocks, treating the yield corresponding to S2 in the second block as missing.

Treatment x Block table

Blocks   Treatments                            Total
         1      2      3       4      5
I        25.4   32.5   37.5    22.5   20.5     138.4
II       17.3   --     25.4    14.7   21.5     78.9
III      22.4   28.4   30.1    23.5   23.5     127.9
IV       30.5   33.4   34.5    22.4   28.5     149.3
Total    95.6   94.3   127.5   83.1   94.0     494.5

Step-2. Estimate the missing value, x:

$x = \dfrac{rB' + tT' - G'}{(r-1)(t-1)} = \dfrac{(4 \times 78.9) + (5 \times 94.3) - 494.5}{3 \times 4} = 24.4$

Step-3. Insert the estimated missing value and carry out the analysis of variance according to the usual procedure of RBD, except for subtracting 1 d.f. from the d.f. for total S.S. as well as from the d.f. for error S.S.

Step-4. Calculation of sums of squares

C.F. = $\dfrac{(GT)^2}{tr} = \dfrac{(518.9)^2}{20} = 13462.86$

Total S.S. = TSS = $\sum\sum y_{ij}^2 - \text{C.F.} = 14121.21 - 13462.86 = 658.35$

Block S.S. = BSS = $\dfrac{\sum_i B_i^2}{t} - \text{C.F.} = 13694.87 - 13462.86 = 232.01$

Treatment S.S. = TrSS = $\dfrac{\sum_j T_j^2}{r} - \text{C.F.} = 13806.73 - 13462.86 = 343.87$

Error S.S. = ESS = TSS - BSS - TrSS = 658.35 - 232.01 - 343.87 = 82.47

While the error mean square is an unbiased estimate of the error variance, the treatment S.S. is an overestimate and has to be corrected by subtracting from it a bias, B:

$B = \dfrac{(B' + tT' - G')^2}{t(t-1)(r-1)^2} = \dfrac{(78.9 + 5 \times 94.3 - 494.5)^2}{5 \times 4 \times 3^2} = 17.36$

Corrected Treatment S.S. = 343.87 - 17.36 = 326.51

Step-5. ANOVA Table

Sources                  d.f.   SS       MSS     F
Blocks                   3      232.01   77.34   10.31 **
Treatments               4      343.87   85.97   11.46 **
Error                    11     82.47    7.50    --
Total                    18     658.35   --      --
Treatments (Corrected)   4      326.51   81.63   8.99 **
Error                    11     99.83    9.08

** Significant at 1% level.

Step-6. Calculation of standard errors

(a). Standard error of the difference between two treatment means not involving the missing value:

$SE(d) = \sqrt{\dfrac{2\,S_e^2}{r}} = \sqrt{\dfrac{2 \times 7.50}{4}} = 1.936$ kg/plot

(b). Standard error of the difference between two treatment means one of which has a missing value:

$SE(d) = \sqrt{\dfrac{EMS}{r}\left[2 + \dfrac{t}{(r-1)(t-1)}\right]} = 2.13$ kg/plot
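The statement that x is chosen to keep the error variance at a minimum can be checked numerically on the Problem-25 data. The sketch below computes the estimate and the bias from the formulas above and then evaluates the error S.S. as a function of the inserted value; the helper `error_ss` and the `None` marker for the missing plot are illustrative choices, not part of the manual.

```python
from math import sqrt

# Rows are blocks I-IV, columns are sources S1-S5; None marks the missing plot.
blocks = [
    [25.4, 32.5, 37.5, 22.5, 20.5],
    [17.3, None, 25.4, 14.7, 21.5],
    [22.4, 28.4, 30.1, 23.5, 23.5],
    [30.5, 33.4, 34.5, 22.4, 28.5],
]
r, t = len(blocks), len(blocks[0])
mi, mj = 1, 1  # block II, source S2

B = sum(v for v in blocks[mi] if v is not None)             # B'
T = sum(row[mj] for row in blocks if row[mj] is not None)   # T'
G = sum(v for row in blocks for v in row if v is not None)  # G'

x_hat = (r * B + t * T - G) / ((r - 1) * (t - 1))
bias = (B + t * T - G) ** 2 / (t * (t - 1) * (r - 1) ** 2)

def error_ss(x):
    """Error S.S. of the RBD after inserting x for the missing value."""
    filled = [[x if v is None else v for v in row] for row in blocks]
    gt = sum(sum(row) for row in filled)
    cf = gt ** 2 / (r * t)
    tss = sum(v * v for row in filled for v in row) - cf
    bss = sum(sum(row) ** 2 for row in filled) / t - cf
    trss = sum(sum(col) ** 2 for col in zip(*filled)) / r - cf
    return tss - bss - trss

# SE(d) when one of the two means involves the missing plot (EMS = 7.50 from the text).
se_missing = sqrt(7.50 / r * (2 + t / ((r - 1) * (t - 1))))

print(round(x_hat, 2), round(bias, 2), round(se_missing, 2))
print(round(error_ss(x_hat - 0.5), 3), round(error_ss(x_hat), 3),
      round(error_ss(x_hat + 0.5), 3))
```

The middle value on the last line is the smallest of the three, which illustrates that moving away from the estimate in either direction increases the error S.S.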

Exercise: In an experiment under RCBD for comparing the fodder yield of 7 sorghum varieties, the following data were obtained:

Fodder yield (t/ha)

Variety   Replication
          I      II     III
V1        14.5   14.0   14.0
V2        16.5   16.9   16.7
V3        x      16.7   17.4
V4        17.6   16.9   17.5
V5        18.5   17.9   17.6
V6        19.3   18.3   18.8
V7        19.5   19.0   20.0

Here the observation on V3 in R-I is missing. Analyse the data and draw your conclusion.

2.7. Analysis of data in LSD with one missing observation

Procedure:

Step-1. Estimate the missing value, x:

$x = \dfrac{t(R' + C' + T') - 2G'}{(t-1)(t-2)}$

where,

t = no. of treatments

R’ = total of available values of the row containing the missing value

C’ = total of available values of the column containing the missing value

T’ = total of available values of the treatment containing the missing value

G’ = grand total of all available values

Step-2. The estimated missing value, x, is then inserted and the analysis is carried out according to the usual procedure for LSD, except for subtracting 1 d.f. from the d.f. for total S.S. and error S.S. and computing the corrected treatment S.S. by adjusting for the bias, B:

$B = \dfrac{[G' - R' - C' - (t-1)T']^2}{[(t-1)(t-2)]^2}$

Step-3. Standard errors for testing the significance of the difference between two treatment means are obtained as follows:

a. SE of the difference between two treatment means not involving the missing value,

$SE(d) = \sqrt{\dfrac{2\,S_e^2}{t}}$

where $S_e^2$ is the error mean square.

b. SE of the difference between two treatment means one of which has a missing value,

$SE(d) = \sqrt{S_e^2\left[\dfrac{2}{t} + \dfrac{1}{(t-1)(t-2)}\right]}$

Problem-26. The data on grain yield of paddy from a varietal trial in a 5 x 5 Latin square design are shown in the following table. The yield of variety C is missing from the second row.

Grain yield of paddy (kg/plot)

                                               Total
         E 26    C 42    D 39    B 37    A 24    168
         A 24    D 33    E 21    C x     B 38    116
         D 47    B 45    A 31    E 29    C 31    183
         B 38    A 24    C 36    D 41    E 34    173
         C 41    E 24    B 44    A 26    D 30    165
Total    176     168     171     133     157     805

Analyse the data and draw your conclusion.

Solution:

Step-1. We first estimate the missing value, x, as

$x = \dfrac{t(R' + C' + T') - 2G'}{(t-1)(t-2)} = \dfrac{5(116 + 133 + 150) - 2(805)}{(5-1)(5-2)} = \dfrac{385}{12} \approx 32$

Step-2. On substitution of the estimated value in the missing place, we get the corrected totals as follows:

Total of second row = 148; Total of 4th column = 165

Total of treatment C = 182; Grand total = 837

Step-3. Calculate the various sums of squares as in a normal LSD:

CF = (GT)²/t² = 28022.76

Total SS = TSS = 29399.00 - CF = 1376.24

Row SS = RSS = 28154.20 - CF = 131.44

Column SS = CSS = 28063.00 - CF = 40.24

Treatment SS = TrSS = 28925.00 - CF = 902.24

Error SS = ESS = TSS - RSS - CSS - TrSS = 302.32

Step-4. Upward bias, B

$B = \dfrac{[G' - R' - C' - (t-1)T']^2}{[(t-1)(t-2)]^2} = \dfrac{[805 - 116 - 133 - 4(150)]^2}{(4 \times 3)^2} = 13.44$

Corrected treatment SS = TrSS(Adj.) = 902.24 - 13.44 = 888.80

Step-5. Construction of ANOVA Table:

ANOVA Table

Sources of variation   d.f.   SS        MS        F
Row                    4      131.44    32.86     1.196
Column                 4      40.24     10.06     <1
Treatment (Adj.)       4      888.80    222.20    8.085
Error                  11     302.32    27.4836
Total                  23     1362.80

Step-6. Estimation of Standard errors (SE):

a. SE of the difference between two treatment means not involving the missing value:

$SE(d) = \sqrt{\dfrac{2\,S_e^2}{t}} = \sqrt{\dfrac{2 \times 27.4836}{5}} = 3.3156$

CD = (2.201)(3.3156) = 7.2976

b. SE of the difference between two treatment means one of which involves the missing value:

$SE(d) = \sqrt{S_e^2\left[\dfrac{2}{t} + \dfrac{1}{(t-1)(t-2)}\right]} = \sqrt{27.4836\left[\dfrac{2}{5} + \dfrac{1}{(5-1)(5-2)}\right]} = \sqrt{13.2839} = 3.6447$

CD = (2.201)(3.6447) = 8.0220

Step-7. Arrange the variety means in descending order of value and prepare the bar chart as:

B D C E A

Conclusion: For yield performance, varieties B, D & C are at par and best, followed by both E & A.
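The quantities R', C', T' and G' used in Problem-26 can be picked up directly from the field plan, which reduces the chance of a transcription slip when the totals are copied by hand. The sketch below does this and evaluates the missing-value estimate, the bias and the two standard errors; the plan representation with `None` for the missing cell is an illustrative assumption, while Se² = 27.4836 and t = 2.201 are the values quoted in the solution.

```python
from math import sqrt

# Field plan of the 5 x 5 LSD; None marks the missing yield of variety C in row 2.
layout = [
    [("E", 26), ("C", 42), ("D", 39), ("B", 37), ("A", 24)],
    [("A", 24), ("D", 33), ("E", 21), ("C", None), ("B", 38)],
    [("D", 47), ("B", 45), ("A", 31), ("E", 29), ("C", 31)],
    [("B", 38), ("A", 24), ("C", 36), ("D", 41), ("E", 34)],
    [("C", 41), ("E", 24), ("B", 44), ("A", 26), ("D", 30)],
]
t = len(layout)

# Locate the missing cell and the variety it belongs to.
mi, mj = next((i, j) for i, row in enumerate(layout)
              for j, (_, y) in enumerate(row) if y is None)
missing_variety = layout[mi][mj][0]

R = sum(y for _, y in layout[mi] if y is not None)                          # R'
C = sum(row[mj][1] for row in layout if row[mj][1] is not None)             # C'
T = sum(y for row in layout for v, y in row
        if v == missing_variety and y is not None)                          # T'
G = sum(y for row in layout for _, y in row if y is not None)               # G'

x_hat = (t * (R + C + T) - 2 * G) / ((t - 1) * (t - 2))
bias = (G - R - C - (t - 1) * T) ** 2 / ((t - 1) * (t - 2)) ** 2

se2 = 27.4836   # error mean square from the ANOVA table
se_a = sqrt(2 * se2 / t)                                  # neither mean involves the missing plot
se_b = sqrt(se2 * (2 / t + 1 / ((t - 1) * (t - 2))))      # one mean involves the missing plot

print(R, C, T, G, round(x_hat, 2), round(bias, 2))
print(round(se_a, 4), round(2.201 * se_a, 4), round(se_b, 4), round(2.201 * se_b, 4))
```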

Exercise: Estimate the missing value in the following LSD layout having 4 treatments A, B, C & D and analyse the data to draw a conclusion.

A 12 C 19 B 10 D 8

C 18 B 12 D 6 A --

B 22 D 10 A 5 C 21

D 12 A 7 C 27 B 17

III. SAMPLING TECHNIQUES

Essentially, sampling consists of obtaining information from only a part of a large group or population so as to infer about the whole population. The object of sampling is thus to secure a sample which will represent the population and reproduce the important characteristics of the population under study as closely as possible. The principal advantages of sampling as compared to complete enumeration of the population are reduced cost, greater speed, greater scope and improved accuracy. The smaller size of the sample makes the supervision more effective. Moreover, it is important to note that the precision of the estimates obtained from certain types of samples can be estimated from the sample itself. The most ‘convenient’ method of sampling is that in which the investigator selects a number of sampling units which he considers ‘representative’ of the whole population.

When sampling is performed so that every unit in the population has some chance of being selected in the sample and the probability of selection of every unit is known, the method of sampling is called probability sampling. An example of probability sampling is random selection, which implies a strict process of selection equivalent to that of drawing lots and should be clearly distinguished from haphazard selection. In this manual, any reference to sampling, unless otherwise stated, will relate to some form of probability sampling. The probability that any sampling unit will be selected in the sample depends on the sampling procedure used. The important point to note is that the precision and reliability of the estimates obtained from a sample can be evaluated only for a probability sample. The object of designing a sample survey is to minimise the error in the final estimates. Even if the sample is a probability sample, the sample, being based on observations on a part of the population, cannot in general exactly represent the population. The average magnitude of the sampling errors of most probability samples can be estimated from the data collected. The magnitude of the sampling errors depends on the size of the sample, the variability within the population and the sampling method adopted. Thus, if a probability sample is used, it is possible to predetermine the size of the sample needed to obtain a desired and specified degree of precision. A sampling scheme is determined by the size of the sampling units, the number of sampling units to be used, the distribution of the sampling units over the entire area to be sampled, the type and method of measurement in the selected units and the statistical procedures for analysing the survey data. A variety of sampling methods and estimating techniques, developed to meet the varying demands of the survey statistician, offer the user a wide selection for specific situations. One can choose the method or combination of methods that will yield a desired degree of precision at minimum cost.
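To illustrate how a sample size can be predetermined for a specified precision, the sketch below uses the common textbook approximation for simple random sampling, n0 = (z·S/d)², with an optional finite population adjustment n = n0 / (1 + n0/N). This formula is not given in the manual and is an assumption made here for illustration; S (the population standard deviation), d (the permissible margin of error) and N are inputs the survey designer would have to supply, and the numbers used are hypothetical.

```python
from math import ceil

def srs_sample_size(sd, margin, z=1.96, population=None):
    """Approximate sample size for estimating a mean under simple random sampling.

    sd         : anticipated standard deviation of the character (assumed known)
    margin     : permissible error in the estimate, in the same units as sd
    z          : normal deviate for the confidence level (1.96 for about 95%)
    population : population size N for the finite population adjustment, if known
    """
    n0 = (z * sd / margin) ** 2
    if population is not None:
        n0 = n0 / (1 + n0 / population)
    return ceil(n0)

# Hypothetical inputs: sd of 10 units, permissible error of 2 units.
print(srs_sample_size(10, 2))                   # large population
print(srs_sample_size(10, 2, population=1000))  # population of 1000 units
```

Halving the permissible error roughly quadruples the required sample size, which is why the desired accuracy has to be fixed before the survey is designed.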

3.1. Principal steps in a sample survey

In any sample survey, we must first decide on the type of data to be collected and determine how adequate the results should be. Secondly, we must formulate the sampling plan for each of the characters for which data are to be collected. We must also know how to combine the sampling procedures for the various characters so that no duplication of field work occurs. Thirdly, the field work must be efficiently organised, with adequate provision for supervising the work of the field staff. Lastly, the analysis of the data collected should be carried out using appropriate statistical techniques, and the report should be drafted giving full details of the basic assumptions made, the sampling plan and the results of the statistical analysis.

(i) Specification of the objectives of the survey: Careful consideration must be given at the outset to the purposes for which the survey is to be undertaken. The characteristics on which information is to be collected and the degree of detail to be attempted should be fixed. If it is a survey of trees, it must be decided which species of trees are to be enumerated, and whether only the number of trees under specified diameter classes is to be estimated or whether, in addition, the volume of trees is also to be estimated. It must also be decided at the outset what accuracy is desired for the estimates.

(ii) Construction of a frame of units: The first requirement of a probability sample of any nature is the establishment of a frame. A frame is a list of sampling units which may be unambiguously defined and identified in the population. The sampling units may be compartments, topographical sections, strips of a fixed width or plots of a definite shape and size. The sampling frame is collected from secondary sources like the revenue department or any related offices, books, journals or records etc.

(iii) Choice of a sampling design: If it is agreed that the sampling design should provide a statistically meaningful measure of the precision of the final estimates, then the sample should be a probability sample, in that every unit in the population should have a known probability of being selected in the sample. The choice of units to be enumerated from the frame of units should be based on some objective rule which leaves nothing to the opinion of the field worker. The determination of the number of units to be included in the sample and the method of selection is also governed by the allowable cost of the survey and the accuracy required in the final estimates.

(iv) Organisation of the field work: The entire success of a sampling survey depends on the reliability of the field work. Proper selection of the personnel, intensive training, clear instructions and proper supervision of the fieldwork are essential to obtain satisfactory results. The field parties should correctly locate the selected units and record the necessary measurements according to the specific instructions given. The supervising staff should check a part of their work in the field and satisfy themselves that the survey is carried out in its entirety as planned.

(v) Analysis of the data: Depending on the sampling design used and the information collected, proper formulae should be used in obtaining the estimates, and the precision of the estimates should be computed. The computations should be double-checked to safeguard accuracy in the analysis.

(vi) Preliminary survey (pilot trials): The design of a sampling scheme for a survey requires both knowledge of the statistical theory and experience with data regarding the nature of the study area, the pattern of variability and operational cost. If prior knowledge in these matters is not available, a statistically planned small scale ‘pilot survey’ may have to be conducted before undertaking any large scale survey in that area. Such exploratory or pilot surveys will provide adequate knowledge regarding the variability of the material and will afford opportunities to test and improve field procedures, train field workers and study the operational efficiency of a design. A pilot survey will also provide data for estimating the various components of cost of operations in a survey like time of travel, time of location and enumeration of sampling units, etc. The above information will be of great help in deciding the proper type of design and intensity of sampling that will be appropriate for achieving the objects of the survey.


Sampling terminology

Population: The word population is defined as the aggregate of units from which a sample is chosen. Example: all the plots, trees, plants, insects, blocks, villages or people etc. of the study area.

Sampling units: Sampling units are all the well defined units of the population from which a sample is to be collected.

Sampling frame: A list of sampling units of a population of units.

Sample: One or more sampling units selected from a population according to some specified procedure constitute a sample.

Sampling intensity or sampling fraction: It is the ratio of the number of units in the sample to the number of units in the population (n/N).

Population total: Suppose a finite population consists of units U1, U2, …, UN. Let the value of the characteristic for the i-th unit be denoted by yi for every unit Ui. The population total of the values yi (i = 1, 2, …, N) is:

$$Y = \sum_{i=1}^{N} y_i$$

Population mean: The arithmetic mean or average of the yi values is:

$$\bar{Y} = \frac{Y}{N} = \frac{1}{N}\sum_{i=1}^{N} y_i$$

Population variance: A measure of the variation between units of the population is:

$$S_y^2 = \frac{1}{N-1}\sum_{i=1}^{N} (y_i - \bar{Y})^2$$

which measures the variation among the population units. Large values indicate large variation between units and small values indicate that the values of the characteristic for the units are close to the population mean. The square root of the variance is known as the standard deviation.

Coefficient of variation: The ratio of the standard deviation to the mean expressed in percentage is:

$$C.V. = \frac{S_y}{\bar{Y}} \times 100$$

Being unitless, it is used to compare the variability of two or more populations or sets of observations.

Parameter: A function of the values of the units in the population. Example: population mean, variance, C.V., correlation etc. are population parameters. The problem in sampling theory is to estimate the parameters from a sample by a procedure that makes it possible to measure the precision of the estimates.

Estimator and estimate: Let y1, y2, …, yn be the observations in a sample of size n. Any function of the sample observations will be called a statistic. When a statistic is used to estimate a population parameter, the statistic will be called an estimator. Example: the sample mean is an estimator of the population mean. Any particular value of an estimator computed from an observed sample will be called an estimate.

Bias in estimation: A statistic t is said to be an unbiased estimator of a population parameter θ if its expected value, E(t), is equal to θ. A sampling procedure based on a probability scheme gives rise to a number of possible samples by repetition of the sampling procedure. If the values of the statistic t are computed for each of the possible samples and if the average of the values is equal to the population value θ, then t is said to be an unbiased estimator of θ. In case E(t) is not equal to θ, the statistic t is said to be a biased estimator of θ and the bias is given by, bias = E(t) - θ.

Sampling variance: It is defined as the average magnitude over all possible samples of the squares of deviations of the estimator from its expected value and is given by $V(t) = E[t - E(t)]^2$.

The larger the sample and the smaller the variability between units in the population, the smaller will be the sampling error and the greater will be the confidence in the results.
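The definitions of bias and sampling variance can be checked by brute force on a very small population. The sketch below is only an illustration: the five population values are hypothetical (they do not come from this manual), and the statistic t is taken to be the sample mean under simple random sampling without replacement.

```python
from itertools import combinations
from statistics import mean

# Hypothetical population of N = 5 unit values (assumed for illustration only).
population = [2, 4, 6, 8, 10]
n = 2  # sample size

pop_mean = mean(population)  # the parameter theta being estimated

# Value of the statistic t (sample mean) for every possible sample of size n.
sample_means = [mean(s) for s in combinations(population, n)]

expected_t = mean(sample_means)                      # E(t)
bias = expected_t - pop_mean                         # bias = E(t) - theta
sampling_variance = mean((t - expected_t) ** 2 for t in sample_means)  # V(t)
standard_error = sampling_variance ** 0.5            # square root of V(t)

print(bias, sampling_variance, standard_error)
```

Because the average of t over all possible samples equals the population mean, the bias is zero for this design, which is what "unbiased estimator" means in the definition above.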

Standard error of an estimator: The square root of the sampling variance of an estimator is known as the standard error of the estimator. The standard error of an estimate divided by the value of the estimate is called the relative standard error, which is usually expressed in percentage.

Accuracy and precision: The precision of an estimate is measured, on an inverse scale, by its standard error or sampling variance. Accuracy usually refers to the size of the deviations of the sample estimate from the true population value θ, whereas precision refers to the size of the deviations from the mean m = E(t) obtained by repeated application of the sampling procedure; the bias is thus measured by m - θ. It is the accuracy of the sample estimate in which we are chiefly interested, and it is the precision which we are able to measure in most instances. We strive to design the survey and analyse the data using appropriate statistical methods in such a way that the precision is increased to the maximum and bias is reduced to the minimum.

Confidence limits: If the estimator t is normally distributed (generally valid for large samples), a confidence interval defined by a lower and upper limit can be expected to include the population parameter θ with a specified probability level. The limits are given by

Lower limit = $t - Z_\alpha \sqrt{\hat{V}(t)}$

Upper limit = $t + Z_\alpha \sqrt{\hat{V}(t)}$

where $\hat{V}(t)$ is the estimate of the variance of t and $Z_\alpha$ is the value of the normal deviate corresponding to a desired α% confidence probability. When $Z_\alpha$ is taken as 1.96, we say that the chance of the true value of θ being contained in the random interval is 95 per cent.
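As a minimal sketch, the two limits can be wrapped in a small helper. The function name `confidence_limits` and the example figures (an estimate of 50 with estimated variance 4) are hypothetical and chosen only to show the arithmetic.

```python
from math import sqrt

def confidence_limits(t, var_t, z=1.96):
    """Return (lower, upper) limits t -/+ z * sqrt(V(t)).

    t     : the estimate
    var_t : estimated variance of the estimator
    z     : normal deviate (1.96 for 95 per cent confidence)
    """
    half_width = z * sqrt(var_t)
    return t - half_width, t + half_width

# Hypothetical estimate and variance, for illustration only.
lower, upper = confidence_limits(t=50.0, var_t=4.0)
print(round(lower, 2), round(upper, 2))
```

For small samples the same helper can be used with a t-table value passed as `z`, as is done later in the SRS examples.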

Some general remarks: Capital letters will be used to denote population values and small letters to denote sample values. The symbol ‘cap’ (^) above a symbol for a population value denotes its estimate based on sample observations. While describing the different sampling methods, the formulae for estimating only the population mean and its sampling variance are given. Two related parameters are the population total and the ratio of the character under study (y) to some auxiliary variable (x). These related statistics can always be obtained from the mean by using the following general relations.

$$\hat{Y} = N\,\hat{\bar{Y}}$$

$$\hat{R} = \frac{\hat{Y}}{X}$$

where
$\hat{Y}$ = Estimate of the population total
N = Total number of units in the population
$\hat{R}$ = Estimate of the population ratio
X = Population total of the auxiliary variable

3.2. Simple random sampling (SRS)

A sampling procedure such that each possible combination of sampling units out of the population has the same chance of being selected is referred to as simple random sampling. From theoretical considerations, simple random sampling is the simplest form of sampling and is the basis for many other sampling methods. Simple random sampling is most applicable for the initial survey in an investigation and for studies which involve sampling from a small area where the sample size is relatively small. The irregular distribution of the sampling units in an area in simple random sampling may be of great disadvantage where accessibility is poor and the costs of travel and locating the plots are considerably higher than the cost of enumerating the plot.

Selection of sampling units from a Population

In practice, a random sample is selected unit by unit. Two methods of random selection for simple random sampling without replacement (WOR) are explained in this section.

i). Lottery method: The units in the population are numbered 1 to N and then N identical paper chits numbered 1 to N are obtained and one chit is chosen at random after shuffling the chits. The process is repeated n times without replacing the chits selected. The units which correspond to the numbers on the chosen chits form a simple random sample of size n from the population of N units. In this way, the probability of selecting any chit is the same for all the N chits.

ii). Selection based on random number tables: The procedure of selection using the lottery method obviously becomes rather inconvenient when N is large. To overcome this difficulty, we may use a table of random numbers such as those published by Fisher and Yates, a sample of which is given in the Appendix. The tables of random numbers have been developed in such a way that the digits 0 to 9 appear independent of each other and approximately an equal number of times in the table. The simplest way of selecting a random sample of the required size consists in selecting a set of n random numbers one by one from 1 to N in the random number table and then taking the units bearing those numbers. This procedure may involve a number of rejections since all the numbers greater than N appearing in the table are not considered for selection. In such cases, the procedure is modified as follows. If N is a d-digit number, we first determine the highest d-digit multiple of N, say N’. Then a random number r is chosen from 1 to N’ and the unit having the serial number equal to the remainder obtained on dividing r by N is considered as selected. If the remainder is zero, the last unit is selected.
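The remainder rule described above can be sketched in a few lines. This is an illustrative sketch, not the manual's own procedure in code form: the function names are hypothetical, the stream of two-digit random numbers is made up rather than read from the Appendix table, and skipping a unit that has already been drawn is an added assumption so that the sample is without replacement.

```python
def highest_d_digit_multiple(N):
    """N': the largest multiple of N that has as many digits as N."""
    d = len(str(N))
    return (10 ** d - 1) // N * N

def unit_from_random_number(r, N):
    """Serial number (1..N) given by the remainder of r / N; remainder 0 -> unit N."""
    remainder = r % N
    return N if remainder == 0 else remainder

def select_srswor(random_numbers, N, n):
    """Read random numbers in order and return the first n distinct units."""
    n_prime = highest_d_digit_multiple(N)
    chosen = []
    for r in random_numbers:
        if not 1 <= r <= n_prime:      # reject numbers outside 1..N'
            continue
        unit = unit_from_random_number(r, N)
        if unit not in chosen:         # assumption: repeats are skipped (WOR)
            chosen.append(unit)
        if len(chosen) == n:
            break
    return chosen

# Hypothetical stream of two-digit random numbers (not taken from the Appendix).
stream = [39, 92, 90, 27, 74, 7, 55]
sample = select_srswor(stream, N=40, n=5)
print(sample)  # [39, 27, 34, 7, 15]
```

With N = 40 the highest two-digit multiple is N’ = 80, so 92 and 90 are rejected, while 74 and 55 map to units 34 and 15 through their remainders.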

Problem-27: Select a simple random sample of n=5 units from a population of size N=40.

Solution:
i). Serially number the population units from 1 to 40 (here 40 is a 2-digit number).
ii). Find the highest two-digit number divisible by 40, which is 80.
iii). Let us select the 5th column of the random number table (Table-5 of Appendix).
iv). The value 39 (which is less than N=40) is selected as the 1st member of the sample.
v). The next values in the column, 92 and 90, are rejected as they are greater than 80.
vi). 27 (which lies in 1-40) is selected as the 2nd sample unit.
vii). 00 is taken as the 40th unit, which is selected as the 3rd sample unit.
viii). The next value is 74. Dividing it by 40, the remainder is 34, so the 34th unit is selected as the 4th sample unit.
ix). Next comes 07, so the 7th unit is selected as the 5th sample unit.

So, the 5 sample units selected from the population of 40 members are: 39, 27, 40, 34 and 7.

Exercise: Select a random sample of 11 cows from a list of 112 milch cows of a herd by using the random number table.

3.3. Parameter estimation in SRS

a). SRS WOR (without replacement)

Let y1, y2, …, yn be the measurements on a particular characteristic on n selected units in a sample from a population of N sampling units. It can be shown in the case of simple random sampling without replacement that the sample mean,

$$\hat{\bar{Y}} = \bar{y} = \frac{1}{n}\sum_{i=1}^{n} y_i$$

is an unbiased estimator of the population mean, $\bar{Y}$. An unbiased estimate of the sampling variance of $\bar{y}$ is given by,

$$\hat{V}(\bar{y}) = \frac{N-n}{Nn}\, s_y^2$$

where,

$$s_y^2 = \frac{1}{n-1}\sum_{i=1}^{n} (y_i - \bar{y})^2$$

Assuming that the estimate $\bar{y}$ is normally distributed, a confidence interval on the population mean can be set with the lower and upper confidence limits defined by,

Lower limit = $\bar{y} - z\sqrt{\hat{V}(\bar{y})}$ and Upper limit = $\bar{y} + z\sqrt{\hat{V}(\bar{y})}$

where z is the table value which depends on how many observations there are in the sample. If there are 30 or more observations we can read the values from the table of the normal distribution. If there are fewer than 30 observations, the table value should be read from the table of the t distribution using n - 1 degrees of freedom.


b). SRS WR (with replacement)

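Let y1, y2, …, yn be the measurements on a particular characteristic on n units selected with replacement from a population of N sampling units. Then,

1. Estimate of population mean, $\hat{\bar{Y}} = \bar{y} = \frac{1}{n}\sum_{i=1}^{n} y_i$

2. Estimate of variance of sample mean, $\hat{V}(\hat{\bar{Y}}) = \frac{N-1}{Nn}\, s_y^2$

where $s_y^2 = \frac{1}{n-1}\sum_{i=1}^{n} (y_i - \bar{y})^2$

3. Estimate of population total, $\hat{Y} = N\bar{y}$

4. Estimate of C.I. of population mean:

Lower limit, $\hat{\bar{Y}}_L = \bar{y} - Z\, s_y \sqrt{\frac{N-1}{Nn}}$

Upper limit, $\hat{\bar{Y}}_U = \bar{y} + Z\, s_y \sqrt{\frac{N-1}{Nn}}$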

Problem-28: A forest has been divided up into 1000 plots of 0.1 hectare each and a simple random sample of 25 plots has been selected. For each of these sample plots the wood volumes in m³ were recorded as:

Sample Obs.    1   2   3   4   5   6   7   8   9  10  11  12  13
Wood Volume    7   8   2   6   7  10   8   6   7   3   7   8   9

Sample Obs.   14  15  16  17  18  19  20  21  22  23  24  25
Wood Volume   11   8   4   7   7   8   7   7   5   8   8   7

Estimate the population mean, 95% C.I. of mean, C.V. and total volume of wood in the forest by SRSWOR and SRSWR. Compare the efficiency of the two methods.

Solution:


a). SRSWOR

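Let the wood volume of the i-th sampling unit (i = 1, 2, 3, …, 25) be designated as yi. Now, an unbiased estimate of the population mean is obtained using the formula as:

$$\bar{y} = \frac{1}{25}\sum_{i=1}^{25} y_i = \frac{175}{25} = 7 \text{ m}^3$$

which is the mean wood volume per plot of 0.1 ha in the forest area.

An estimate ($s_y^2$) of the variance of individual values of y is obtained using the formula as:

$$s_y^2 = \frac{1}{25-1}\sum_{i=1}^{25} (y_i - 7)^2 = \frac{92}{24} = 3.833$$

Then the unbiased estimate of the sampling variance of the mean is

$$\hat{V}(\bar{y}) = \frac{1000-25}{(1000)(25)}(3.833) = 0.1495 \quad\text{and}\quad SE(\bar{y}) = \sqrt{0.1495} = 0.3867 \text{ m}^3$$

The relative standard error,

$$C.V. = \frac{SE(\bar{y})}{\bar{y}}(100) = \frac{0.3867}{7}(100) = 5.52\,\%$$

The confidence limits on the population mean, using the t table value of 2.064 for 24 degrees of freedom, are:

Lower limit = 7 - (2.064)(0.3867) = 6.20

Upper limit = 7 + (2.064)(0.3867) = 7.80

The 95% confidence interval for the population mean is (6.20, 7.80) m³. Thus, we are 95% confident that the confidence interval (6.20, 7.80) m³ would include the population mean.

An estimate of the total wood volume in the forest area sampled can easily be obtained by multiplying the estimate of the mean by the total number of plots in the population. Thus,

$$\hat{Y} = N\bar{y} = 1000 \times 7 = 7000 \text{ m}^3$$

with a confidence interval of (6200, 7800) m³ obtained by multiplying the confidence limits on the mean by N = 1000.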

b). SRSWR

An unbiased estimate of the population mean is again obtained using the formula as:

$$\bar{y} = \frac{1}{25}\sum_{i=1}^{25} y_i = \frac{175}{25} = 7 \text{ m}^3$$

which is the mean wood volume per plot of 0.1 ha in the forest area.

An estimate ($s_y^2$) of the variance of individual values of y is also obtained using the formula as:

$$s_y^2 = \frac{1}{25-1}\sum_{i=1}^{25} (y_i - 7)^2 = \frac{92}{24} = 3.833$$

Now, the estimate of the sampling variance of the mean is

$$\hat{V}(\hat{\bar{Y}}) = \frac{1000-1}{(1000)(25)}(3.833) = 0.153167 \quad\text{and}\quad SE(\hat{\bar{Y}}) = s_y\sqrt{\frac{N-1}{Nn}} = 0.391365 \text{ m}^3$$

The relative standard error, C.V. = 0.3914 x 100 / 7 = 5.59%

The confidence limits on the population mean are:

$$\hat{\bar{Y}}_L = \bar{y} - Z\, s_y\sqrt{\frac{N-1}{Nn}} = 7 - (2.064)(0.3914) = 6.19$$

$$\hat{\bar{Y}}_U = \bar{y} + Z\, s_y\sqrt{\frac{N-1}{Nn}} = 7 + (2.064)(0.3914) = 7.81$$

Lower limit = 6.19

Upper limit = 7.81

The 95% confidence interval for the population mean is (6.19, 7.81) m³. Thus, we are 95% confident that the confidence interval (6.19, 7.81) m³ would include the population mean.

An estimate of the total wood volume in the forest area sampled can easily be obtained by multiplying the estimate of the mean by the total number of plots in the population. Thus,

$$\hat{Y} = N\bar{y} = 1000 \times 7 = 7000 \text{ m}^3$$

with a confidence interval of (6190, 7810) m³ obtained by multiplying the confidence limits on the mean by N = 1000.

The efficiency of SRSWOR w.r.t. SRSWR = (0.1495/0.1532) x 100 = 97.58%, i.e. the variance of the mean under SRSWOR is about 2.4% smaller than under SRSWR.
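The arithmetic of Problem-28 can be reproduced with a short script. This is a sketch under the formulas stated in this section: the SRSWR variance uses the factor (N-1)/(Nn) exactly as given in the manual, the table value 2.064 is the one used in the worked solution, and the helper name `describe` is hypothetical.

```python
from math import sqrt
from statistics import mean, variance

# Wood volumes (m3) of the 25 sample plots from Problem-28.
volumes = [7, 8, 2, 6, 7, 10, 8, 6, 7, 3, 7, 8, 9,
           11, 8, 4, 7, 7, 8, 7, 7, 5, 8, 8, 7]
N, n = 1000, len(volumes)
t_table = 2.064              # t value for 24 d.f., as used in the solution

y_bar = mean(volumes)        # estimate of the population mean
s2 = variance(volumes)       # sample variance with divisor n - 1

def describe(var_mean):
    """Standard error, C.V., confidence limits and total for a given V(mean)."""
    se = sqrt(var_mean)
    return {
        "variance": var_mean,
        "se": se,
        "cv_percent": 100 * se / y_bar,
        "ci_mean": (y_bar - t_table * se, y_bar + t_table * se),
        "total": N * y_bar,
    }

srswor = describe((N - n) / (N * n) * s2)   # without replacement
srswr = describe((N - 1) / (N * n) * s2)    # with replacement (manual's formula)

# Ratio of the two variances, as used in the manual's efficiency comparison.
ratio_percent = 100 * srswor["variance"] / srswr["variance"]

for name, result in (("SRSWOR", srswor), ("SRSWR", srswr)):
    print(name, {k: v for k, v in result.items()})
print("variance ratio (%):", round(ratio_percent, 2))
```

The script carries full precision, so its last digits can differ slightly from the hand calculation, which rounds the variance to 3.833 before continuing.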

Exercise: In an agricultural survey the following data have been recorded on holding size of land (in acres):

Sl.No.  Holding Size    Sl.No.  Holding Size    Sl.No.  Holding Size
  1        21.04          13        8.29          25       22.13
  2        12.59          14        7.27          26        1.68
  3        20.30          15        1.47          27       49.58
  4        16.16          16        1.12          28        1.68
  5        23.82          17       10.67          29        4.80
  6         1.79          18        5.94          30       12.72
  7        26.91          19        3.15          31        6.31
  8         7.41          20        4.84          32       14.18
  9         7.68          21        9.07          33       22.19
 10        66.55          22        3.69          34        2.50
 11       141.80          23       14.61          35       25.29
 12        28.12          24        1.10          36       20.99

Q.1. Draw a random sample of size, n=10 from these 36 observations.

Q.2. Estimate the population mean, variance, total, C.V. and C.I. of the mean at 95% confidence by SRSWOR and SRSWR.

Q.3. Compare the relative precision of SRSWOR with SRSWR.

3.4. Stratified sampling

The basic idea in stratified random sampling is to divide a heterogeneous population into sub-populations, usually known as strata, each of which is internally homogeneous. In that case a precise estimate of any stratum mean can be obtained based on a small sample from that stratum and, by combining such estimates, a precise estimate for the whole population can be obtained. Stratified sampling provides a better cross section of the population than the procedure of simple random sampling. It may also simplify the organisation of the field work. Geographical proximity is sometimes taken as the basis of stratification. The assumption here is that geographically contiguous areas are often more alike than areas that are far apart. Administrative convenience may also dictate the basis on which the stratification is made. A fairly effective method of stratification is to conduct a quick reconnaissance survey of the area or pool the information already at hand and stratify the area according to some characteristics like land, slope, breed, age, plant types, stand density, site quality etc. If the characteristic under study is known to be correlated with a supplementary variable for which actual data or at least good estimates are available for the units in the population, the stratification may be done using the information on the supplementary variable. For instance, the rainfall estimates obtained at a previous inventory of an area may be used for stratification of the population.

In stratified sampling, the variance of the estimator consists of only the ‘within strata’ variation. Thus the larger the number of strata into which a population is divided, the higher, in general, the precision, since it is likely that, in this case, the units within a stratum will be more homogeneous. For estimating the variance within strata, there should be a minimum of 2 units in each stratum. The larger the number of strata the higher will, in general, be the cost of enumeration. So, depending on administrative convenience, cost of the survey and variability of the characteristic under study in the area, a decision on the number of strata will have to be arrived at.

Allocation and selection of the sample within strata

Suppose the population is divided into k strata of N1, N2, …, Nk units respectively, and that a sample of n units is to be drawn from the population. The problem of allocation concerns the choice of the sample sizes in the respective strata, i.e., how many units should be taken from each stratum such that the total sample is n.

Other things being equal, a larger sample may be taken from a stratum with a larger variance so that the variance of the estimates of strata means gets reduced. The application of the above principle requires advance estimates of the variation within each stratum. These may be available from a previous survey or may be based on pilot surveys of a restricted nature. Thus if this information is available, the sampling fraction (ni/Ni) in each stratum may be taken proportional to the standard deviation of each stratum.

In case the cost per unit of conducting the survey in each stratum is known and varies from stratum to stratum, an efficient method of allocation for minimum cost will be to take large samples from the strata where sampling is cheaper and variability is higher. To apply this procedure one needs information on variability and cost of observation per unit in the different strata.

Where information regarding the relative variances within strata and cost of operations is not available, the allocation in the different strata may be made in proportion to the number of units in them or the total area of each stratum. This method is usually known as ‘proportional allocation’.
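The three allocation rules can be sketched with one helper. This is an illustration under assumptions: the function name `allocate` is hypothetical, the weights N_t S_t / sqrt(c_t) are the standard textbook form of allocation by variability and cost (the manual states the rule only in words), and the standard deviations used below are invented. The stratum sizes 29, 16 and 24 and the total n = 23 are those of Problem-29.

```python
from math import sqrt

def allocate(n, sizes, sds=None, costs=None):
    """Split a total sample of n units among strata (unrounded values).

    sizes : stratum sizes N_t
    sds   : stratum standard deviations S_t (omit for proportional allocation)
    costs : cost per unit c_t in each stratum (omit if costs are equal)
    Weights are N_t * S_t / sqrt(c_t), so n_t / N_t is proportional to S_t.
    """
    sds = sds or [1.0] * len(sizes)
    costs = costs or [1.0] * len(sizes)
    weights = [N_t * S_t / sqrt(c_t) for N_t, S_t, c_t in zip(sizes, sds, costs)]
    total = sum(weights)
    return [n * w / total for w in weights]

sizes = [29, 16, 24]                      # stratum sizes from Problem-29

proportional = allocate(23, sizes)        # proportional allocation
hypothetical_sds = [1.0, 0.6, 1.4]        # assumed S_t values, illustration only
by_variability = allocate(23, sizes, sds=hypothetical_sds)

print([round(x) for x in proportional])   # [10, 5, 8]
print([round(x, 2) for x in by_variability])
```

Rounding the proportional allocation gives 10, 5 and 8 units. In general, rounded values may not add up to n and need a small manual adjustment.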

For the selection of units within strata, in general, any method which is based on a probability selection of units can be adopted. But the selection should be independent in each stratum. If independent random samples are taken from each stratum, the sampling procedure will be known as ‘stratified random sampling’. Other modes of selection such as systematic sampling can also be adopted within the different strata.

Estimation of mean and variance


We shall assume that the population of N units is first divided into k strata of N1, N2, …, Nk units respectively. These strata are non-overlapping and together they comprise the whole population, so that

N1 + N2 + … + Nk = N

When the strata have been determined, a sample is drawn from each stratum, the selection being made independently in each stratum. The sample sizes within the strata are denoted by n1, n2, …, nk respectively, so that

n1 + n2 + … + nk = n

Let ytj (j = 1, 2, …, Nt ; t = 1, 2, …, k) be the value of the characteristic under study for the j-th unit in the t-th stratum. Then,

i). the population mean in the t-th stratum is given by

$$\bar{Y}_t = \frac{1}{N_t}\sum_{j=1}^{N_t} y_{tj}$$

The overall population mean is given by

$$\bar{Y} = \frac{1}{N}\sum_{t=1}^{k} N_t \bar{Y}_t$$

The estimate of the population mean, $\bar{Y}$, in this case will be obtained by

$$\hat{\bar{Y}} = \frac{1}{N}\sum_{t=1}^{k} N_t \bar{y}_t$$

Where,

$$\bar{y}_t = \frac{1}{n_t}\sum_{j=1}^{n_t} y_{tj}$$

ii). Estimate of the variance of $\hat{\bar{Y}}$ is given by

$$\hat{V}(\hat{\bar{Y}}) = \frac{1}{N^2}\sum_{t=1}^{k} N_t^2 \left(\frac{N_t - n_t}{N_t n_t}\right) s_t^2$$

Where,

$$s_t^2 = \frac{1}{n_t - 1}\sum_{j=1}^{n_t} (y_{tj} - \bar{y}_t)^2$$

Stratification, if properly done as explained in the previous sections, will usually give lower variance for the estimated population total or mean than a simple random sample of the same size. However, a stratified sample taken without due care and planning may not be better than a simple random sample.

Problem-29: A forest area consisting of 69 compartments was divided into three strata containing compartments 1-29, compartments 30-45, and compartments 46-69, and samples of 10, 5 and 8 compartments respectively were chosen at random from the three strata. The serial numbers of the selected compartments in each stratum are given in column (4) of the following Table. The corresponding observed volume of the particular species in each selected compartment in m³/ha is shown in column (5).

Table-20. Estimation of parameters under stratified sampling

Stratum   Total number   Number of   Selected   Volume     Squared
number    of units in    units       sampling   (m³/ha)    volume
          the stratum    sampled     unit
          (Nt)           (nt)        number     (ytj)      (ytj²)
(1)       (2)            (3)         (4)        (5)        (6)

I                                     1          5.40       29.16
                                     18          4.87       23.72
                                     28          4.61       21.25
                                     12          3.26       10.63
                                     20          4.96       24.60
                                     19          4.73       22.37
                                      9          4.39       19.27
                                      6          2.34        5.48
                                     17          4.74       22.47
                                      7          2.85        8.12
Total     29             10          ..         42.15      187.07

II                                   43          4.79       22.94
                                     42          4.57       20.88
                                     36          4.89       23.91
                                     45          4.42       19.54
                                     39          3.44       11.83
Total     16             5           ..         22.11       99.10

III                                  59          7.41       54.91
                                     50          3.70       13.69
                                     49          5.45       29.70
                                     58          7.01       49.14
                                     54          3.83       14.67
                                     69          5.25       27.56
                                     52          4.50       20.25
                                     47          6.51       42.38
Total     24             8           ..         43.66      252.30

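Step-1. Compute the following quantities.

N = (29 + 16 + 24) = 69
n = (10 + 5 + 8) = 23

Stratum means $\bar{y}_t$: $\bar{y}_{I}$ = 4.215, $\bar{y}_{II}$ = 4.422, $\bar{y}_{III}$ = 5.458

Step-2. Estimation of the population mean

$$\hat{\bar{Y}} = \frac{1}{N}\sum_{t} N_t \bar{y}_t = \frac{(29)(4.215) + (16)(4.422) + (24)(5.458)}{69} = \frac{323.979}{69} \approx 4.695$$

Step-3. Estimation of the variance of $\hat{\bar{Y}}$

The within-stratum variances are first obtained from the totals of columns (5) and (6) as $s_t^2 = \frac{1}{n_t - 1}\left[\sum_j y_{tj}^2 - \frac{(\sum_j y_{tj})^2}{n_t}\right]$, which gives approximately $s_I^2$ = 1.045, $s_{II}^2$ = 0.332 and $s_{III}^2$ = 2.004. Then

$$\hat{V}(\hat{\bar{Y}}) = \frac{1}{N^2}\sum_{t} N_t^2 \left(\frac{N_t - n_t}{N_t n_t}\right) s_t^2 \approx 0.0348$$

and

$$SE(\hat{\bar{Y}}) = \sqrt{\hat{V}(\hat{\bar{Y}})} \approx 0.186$$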

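Step-4. If we ignore the strata and assume that the same sample of size n = 23 formed a simple random sample (WOR) from the population of N = 69, the estimate of the population mean would reduce to

$$\bar{y} = \frac{1}{n}\sum_{t}\sum_{j} y_{tj} = \frac{42.15 + 22.11 + 43.66}{23} = \frac{107.92}{23} \approx 4.692$$

Estimate of the variance of the mean is,

$$\hat{V}(\bar{y}) = \frac{N-n}{Nn}\, s^2$$

Where,

$$s^2 = \frac{1}{n-1}\sum_{t}\sum_{j} (y_{tj} - \bar{y})^2$$

so that the relative standard error is

$$C.V. = \frac{\sqrt{\hat{V}(\bar{y})}}{\bar{y}}\,(100)$$

The gain in precision due to stratification over SRSWOR is computed by

$$\frac{\hat{V}(\bar{y})_{SRS}}{\hat{V}(\hat{\bar{Y}})_{stratified}}\,(100) = 121.8$$

Thus the gain in precision is 21.8%.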
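The whole of Problem-29 can be re-computed from the volumes in Table-20. The sketch below assumes the stratified and SRSWOR formulas given in this section; the variable names are hypothetical. Because it works from the unrounded data, the ratio it prints can differ in the first decimal from the hand-computed figure, which depends on how intermediate values were rounded.

```python
from statistics import mean, variance

# Stratum size N_t and sampled volumes (m3/ha) from Table-20.
strata = {
    "I":   (29, [5.40, 4.87, 4.61, 3.26, 4.96, 4.73, 4.39, 2.34, 4.74, 2.85]),
    "II":  (16, [4.79, 4.57, 4.89, 4.42, 3.44]),
    "III": (24, [7.41, 3.70, 5.45, 7.01, 3.83, 5.25, 4.50, 6.51]),
}
N = sum(N_t for N_t, _ in strata.values())

# Stratified estimate of the mean: weighted average of the stratum means.
strat_mean = sum(N_t * mean(y) for N_t, y in strata.values()) / N

# Estimated variance of the stratified mean (variance() uses divisor n_t - 1).
strat_var = sum(
    N_t ** 2 * (N_t - len(y)) / (N_t * len(y)) * variance(y)
    for N_t, y in strata.values()
) / N ** 2

# Same 23 observations treated as a single SRSWOR sample.
pooled = [v for _, y in strata.values() for v in y]
n = len(pooled)
srs_mean = mean(pooled)
srs_var = (N - n) / (N * n) * variance(pooled)

gain_percent = 100 * srs_var / strat_var

print(round(strat_mean, 3), round(strat_var, 4))
print(round(srs_mean, 3), round(srs_var, 4))
print(round(gain_percent, 1))
```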

Exercise: 2000 wheat cultivators’ holdings in a GP were stratified according to their sizes and the results of the stratification are given below.

Stratum No.   No. of holdings   Mean area per      S.D. of area per
              (Nt)              holding ($\bar{Y}_t$)      holding (St)
1             394                5.4                 8.3
2             461               16.3                13.3
3             381               24.3                15.1
4             334               34.5                19.8
5             169               42.1                24.5
6             113               50.1                26.0
7             148               63.8                35.2

Estimate:

1. Mean of wheat area of the GP

2. Variance of mean of Wheat area of GP

3. C.V. of area of GP

4. Mean area, variance of mean, and C.V. of GP if considered as SRSWOR

5. Gain in precision of stratification with SRSWOR

3.5. Systematic sampling

Systematic sampling employs a simple rule of selecting every k-th unit starting with a number chosen at random from 1 to k (k = N/n) as the random start. Let there be N sampling units in the population numbered 1 to N; then a systematic sample of n units is selected starting with the random start and taking the others at an interval of k (called the sampling interval) from it. This type of sampling is often convenient in exercising control over field work. Apart from these operational considerations, the procedure of systematic sampling is observed to provide estimators more efficient than simple random sampling under normal conditions. The property of the systematic sample in spreading the sampling units evenly over the population can be taken advantage of by listing the units so that homogeneous units are put together or such that the values of the characteristic for the units are in ascending or descending order of magnitude, i.e. in some order. For example, knowing the fertility trend of the forest area, the units (for example strips) may be listed along the fertility trend.

Selection of a systematic sample

Consider a population of N=48 units. A sample of n=4 units is needed. Here, k = (48/4) = 12. If the random number selected from the set of numbers from 1 to 12 is 11, then the units associated with serial numbers 11, 23, 35 and 47 will be selected. This technique will generate k systematic samples with equal probability.
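The selection rule can be written as a short function. This is a minimal sketch: the name `systematic_sample` is hypothetical, and drawing the random start with Python's `random.randint` stands in for reading a number from a random number table.

```python
import random

def systematic_sample(N, n, start=None):
    """Serial numbers of a systematic sample from units numbered 1..N.

    k is the sampling interval (integer nearest to N/n); start is the
    random start between 1 and k, drawn at random when not supplied.
    """
    k = round(N / n)
    if start is None:
        start = random.randint(1, k)   # random start in 1..k inclusive
    if not 1 <= start <= k:
        raise ValueError("start must lie between 1 and k")
    return list(range(start, N + 1, k))

units = systematic_sample(N=48, n=4, start=11)
print(units)  # [11, 23, 35, 47]
```

Each of the k = 12 possible starts gives a different sample, so there are exactly 12 possible systematic samples here, each selected with probability 1/12.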

In situations where N is not fully divisible by n, k is calculated as the integer nearest to N/n. In this situation, the sample size is not necessarily n; in some cases it may be n - 1, so the possible samples are of unequal sizes.

Parameter estimation

The estimate for the population mean per unit is given by the sample mean

$$\hat{\bar{Y}} = \bar{y} = \frac{1}{n}\sum_{i=1}^{n} y_i$$

where n is the number of units in the sample.

One-dimensional Systematic sampling

In the case of systematic strip surveys or, in general, any one-dimensional systematic sampling, an approximation to the standard error may be obtained from the differences between pairs of successive units. If there are n units enumerated in the systematic sample, there will be (n-1) differences. The variance per unit is therefore given by the sum of squares of the differences divided by twice the number of differences. Thus if y1, y2, …, yn are the observed values (say volume) for the n units in the systematic sample and defining the first difference d(yi) as given below,

$$d(y_i) = y_{i+1} - y_i \; ; \quad (i = 1, 2, …, n-1)$$

the approximate variance of the mean per unit is estimated as

$$\hat{V}(\bar{y}) = \frac{1}{2n(n-1)}\sum_{i=1}^{n-1} [d(y_i)]^2$$

Problem-30: The following Table gives the observed diameters of 10trees selected by systematic selection of 1 in 20 trees from a standcontaining 195 trees in rows of 15 trees. The first tree was selected as the8th tree from one of the outside edges of the stand starting from onecorner and the remaining trees were selected systematically by takingevery 20th tree switching to the nearest tree of the next row after the lasttree in any row is encountered.

Table-21. Tree diameter on a systematic sample of 10 trees from a plot

Tree No. DBH(cm), yi First difference, d(yi)

8 14.8

28 12.0 -2.8

48 13.6 +1.6

68 14.2 +0.6

88 11.8 -2.4

108 14.1 +2.3

128 11.6 -2.5

148 9.0 -2.6

168 10.1 +1.1

188 9.5 -0.6

Solution:

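Average diameter,

$$\bar{y} = \frac{14.8 + 12.0 + \cdots + 9.5}{10} = \frac{120.7}{10} = 12.07 \text{ cm}$$

The nine first differences can be obtained as shown in column (3) of the Table. The error variance of the mean per unit is thus,

$$\hat{V}(\bar{y}) = \frac{(-2.8)^2 + (1.6)^2 + \cdots + (-0.6)^2}{2 \times 10 \times 9} = \frac{36.39}{180} = 0.202167$$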
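The arithmetic of Problem-30 can be checked with a short Python sketch. The function name `successive_difference_variance` is an assumption made for this illustration; the data are the ten diameters of Table-21.

```python
def successive_difference_variance(y):
    """Approximate variance of the mean of a one-dimensional systematic
    sample, based on the differences between successive units."""
    n = len(y)
    # first differences d(y_i) = y_(i+1) - y_i, for i = 1, ..., n-1
    d = [y[i + 1] - y[i] for i in range(n - 1)]
    sum_sq = sum(v * v for v in d)
    var_per_unit = sum_sq / (2 * (n - 1))  # variance per unit
    return var_per_unit / n                # variance of the sample mean

# DBH (cm) of the 10 trees in Table-21
dbh = [14.8, 12.0, 13.6, 14.2, 11.8, 14.1, 11.6, 9.0, 10.1, 9.5]

mean_dbh = sum(dbh) / len(dbh)
var_mean = successive_difference_variance(dbh)

print(round(mean_dbh, 2))   # average diameter in cm
print(round(var_mean, 6))   # error variance of the mean
```

The two printed values should agree with the average diameter and the error variance worked out by hand in the solution.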

k-Independent Systematic sampling of equal sample size

A difficulty with systematic sampling is that one systematic sample by itself will not furnish a valid assessment of the precision of the estimates. To obtain valid estimates of the precision, one may resort to partially systematic samples. A theoretically valid method that uses the idea of systematic sampling and at the same time leads to unbiased estimates of the sampling error is to draw a minimum of two systematic samples with independent random starts. If $\bar{y}_1, \bar{y}_2, \ldots, \bar{y}_m$ are m estimates of the population mean based on m independent systematic samples, the combined estimate of the population mean is:

$$\bar{y} = \frac{1}{m}\sum_{i=1}^{m}\bar{y}_i$$

The estimate of the variance of $\bar{y}$ is given by

$$\hat{V}(\bar{y}) = \frac{1}{m(m-1)}\sum_{i=1}^{m}\left(\bar{y}_i - \bar{y}\right)^2$$

Notice that the precision increases with the number of independent systematic samples.

Problem-31: The data given in the following Table comprise one systematic sample along with another systematic sample selected with an independent random start. In the second sample, the first tree was selected as the 10th tree.

Table-22. Tree diameter on two independent systematic samples of 10 trees from a plot.

Sample-1 Sample-2

Tree No. DBH (cm), yi Tree No. DBH (cm), yi

8 14.8 10 13.6

28 12.0 30 10.0

48 13.6 50 14.8

68 14.2 70 14.2

88 11.8 90 13.8

108 14.1 110 14.5

128 11.6 130 12.0

148 9.0 150 10.0

168 10.1 170 10.5

188 9.5 190 8.5

Solution:

Here, n=10, k=20 and N=200

The average diameter for the first sample is $\bar{y}_1 = 12.07$ cm and for the second sample is $\bar{y}_2 = 12.19$ cm. The combined estimate of the population mean ($\bar{y}$) is obtained as:

$$\bar{y} = \frac{12.07 + 12.19}{2} = 12.13 \text{ cm}$$

The estimate of the variance of the mean is obtained as:

$$\hat{V}(\bar{y}) = \frac{(12.07 - 12.13)^2 + (12.19 - 12.13)^2}{2 \times 1} = 0.0036$$

$$SE(\bar{y}) = \sqrt{0.0036} = 0.06 \text{ cm and C.V.} = \frac{0.06 \times 100}{12.13} = 0.49\%$$

Total = 200 × 12.13 = 2426 cm
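The steps of Problem-31 can be reproduced with the following Python sketch. The function name `combine_systematic_samples` and the keys of the returned dictionary are assumptions made for this illustration; the data are those of Table-22 and N = 200 is taken from the solution.

```python
from math import sqrt

def combine_systematic_samples(samples, N):
    """Combine m independent systematic samples of equal size.

    Returns the combined mean, the variance and standard error of that
    mean, the coefficient of variation (%) and the estimated total.
    """
    m = len(samples)
    means = [sum(s) / len(s) for s in samples]  # one mean per sample
    ybar = sum(means) / m                       # combined estimate
    var = sum((mi - ybar) ** 2 for mi in means) / (m * (m - 1))
    se = sqrt(var)
    return {
        "means": means,
        "mean": ybar,
        "variance": var,
        "se": se,
        "cv_percent": 100 * se / ybar,
        "total": N * ybar,
    }

sample_1 = [14.8, 12.0, 13.6, 14.2, 11.8, 14.1, 11.6, 9.0, 10.1, 9.5]
sample_2 = [13.6, 10.0, 14.8, 14.2, 13.8, 14.5, 12.0, 10.0, 10.5, 8.5]

result = combine_systematic_samples([sample_1, sample_2], N=200)
for name, value in result.items():
    print(name, value)
```

With only m = 2 samples the variance estimate rests on a single degree of freedom, which is why the text notes that precision improves as more independent systematic samples are taken.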

Exercise: Given below are data for 10 systematic samples of size 4 from a population of 40 units.

Systematic sample number
 1   2   3   4   5   6   7   8   9  10
 0   1   2   1   4   5   6   7   7   9
 7   8   9  10  12  13  15   6  16  17
18  18  19  20  21  20  24  13  28  29
29  30  31  31  30  32  35  37  38  63

Work out the estimates of the population mean, total, variance and C.V., and the relative precision of systematic sampling compared with SRSWOR.

*****************XXX******************

APPENDIX

STATISTICAL TABLES (t, F, χ², r, Z, random number)

Table-1(a): Critical values for t-distribution

     Probability          Probability          Probability
DF   0.01    0.05    DF   0.01   0.05    DF    0.01   0.05
1    63.657  12.706  41   2.701  2.020   81    2.637  1.990
2    9.925   4.303   42   2.698  2.018   82    2.637  1.989
3    5.841   3.182   43   2.695  2.017   83    2.636  1.989
4    4.604   2.776   44   2.692  2.016   84    2.635  1.989
5    4.032   2.571   45   2.689  2.014   85    2.634  1.988
6    3.707   2.447   46   2.687  2.013   86    2.634  1.987
7    3.499   2.365   47   2.684  2.012   87    2.633  1.987
8    3.355   2.306   48   2.682  2.011   88    2.632  1.987
9    3.250   2.262   49   2.679  2.010   89    2.632  1.987
10   3.169   2.228   50   2.677  2.008   90    2.631  1.987
11   3.106   2.201   51   2.675  2.007   91    2.630  1.986
12   3.055   2.179   52   2.673  2.006   92    2.630  1.986
13   3.012   2.160   53   2.671  2.005   93    2.629  1.986
14   2.977   2.145   54   2.670  2.004   94    2.629  1.986
15   2.947   2.131   55   2.668  2.004   95    2.628  1.986
16   2.921   2.120   56   2.666  2.003   96    2.628  1.985
17   2.898   2.110   57   2.664  2.002   97    2.627  1.985
18   2.878   2.101   58   2.663  2.002   98    2.626  1.984
19   2.861   2.093   59   2.661  2.001   99    2.626  1.984
20   2.845   2.086   60   2.660  2.000   100   2.625  1.984
21   2.831   2.080   61   2.658  1.999   105   2.623  1.983
22   2.819   2.074   62   2.657  1.998   110   2.621  1.982
23   2.807   2.069   63   2.656  1.998   115   2.619  1.981
24   2.797   2.064   64   2.654  1.997   120   2.617  1.980
25   2.787   2.060   65   2.653  1.996   125   2.616  1.979
26   2.779   2.056   66   2.652  1.996   130   2.614  1.978
27   2.771   2.052   67   2.651  1.995   135   2.613  1.978
28   2.763   2.048   68   2.650  1.995   140   2.611  1.977
29   2.756   2.045   69   2.649  1.994   145   2.610  1.976
30   2.750   2.042   70   2.647  1.994   150   2.609  1.976
31   2.744   2.040   71   2.646  1.993   160   2.607  1.975
32   2.738   2.037   72   2.645  1.993   170   2.605  1.974
33   2.733   2.035   73   2.644  1.993   180   2.603  1.973
34   2.728   2.033   74   2.643  1.993   190   2.602  1.973
35   2.723   2.030   75   2.643  1.992   200   2.601  1.972
36   2.719   2.028   76   2.642  1.992   250   2.596  1.969
37   2.715   2.026   77   2.641  1.991   300   2.592  1.968
38   2.711   2.024   78   2.640  1.991   350   2.590  1.967
39   2.707   2.022   79   2.639  1.991   400   2.588  1.966
40   2.704   2.021   80   2.638  1.990   ∞     2.576  1.960

Table-1(b): Critical values for t-distribution (One & Two-tailed)

Percentage (P)

One-tailed Two-tailed

Degree of freedom (v) 5% 1% 5% 1%

1 6.31 31.8 12.7 63.7

2 2.92 6.96 4.30 9.92

3 2.35 4.54 3.18 5.84

4 2.13 3.75 2.78 4.60

5 2.02 3.36 2.57 4.03

6 1.94 3.14 2.45 3.71

7 1.89 3.00 2.36 3.50

8 1.86 2.90 2.31 3.36

9 1.83 2.82 2.26 3.25

10 1.81 2.76 2.23 3.17

11 1.80 2.72 2.20 3.11

12 1.78 2.68 2.18 3.05

13 1.77 2.65 2.16 3.01

14 1.76 2.62 2.14 2.98

15 1.75 2.60 2.13 2.95

16 1.75 2.58 2.12 2.92

17 1.74 2.57 2.11 2.90

18 1.73 2.55 2.10 2.88

19 1.73 2.54 2.09 2.86

20 1.72 2.53 2.09 2.85

22 1.72 2.51 2.07 2.82

24 1.72 2.49 2.06 2.80

26 1.71 2.48 2.06 2.78

28 1.70 2.47 2.05 2.76

30 1.70 2.46 2.04 2.75

35 1.69 2.44 2.03 2.72

40 1.68 2.42 2.02 2.70

45 1.68 2.41 2.01 2.69

50 1.68 2.40 2.01 2.68

55 1.67 2.40 2.00 2.67

60 1.67 2.39 2.00 2.66

∞ 1.64 2.33 1.96 2.58

Table-2: Critical values for F-distribution

Rows: degrees of freedom for smaller mean square (n2), at the 5% and 1% levels. Columns: degrees of freedom for greater mean square (n1).

n2  P   1 2 3 4 5 6 7 8 9 10
1   5%  161.00 200.00 216.00 225.00 230.00 234.00 237.00 239.00 241.00 242.00
    1%  4052.00 4999.00 5403.00 5625.00 5764.00 5859.00 5928.00 5981.00 6022.00 6056.00
2   5%  18.51 19.00 19.16 19.25 19.30 19.33 19.36 19.37 19.38 19.39
    1%  98.49 99.00 99.17 99.25 99.30 99.33 99.34 99.36 99.38 99.40
3   5%  10.13 9.55 9.28 9.12 9.01 8.94 8.88 8.84 8.81 8.78
    1%  34.12 30.82 29.46 28.71 28.24 27.91 27.67 27.49 27.34 27.23
4   5%  7.71 6.94 6.59 6.39 6.26 6.16 6.09 6.04 6.00 5.96
    1%  21.20 18.00 16.69 15.98 15.52 15.21 14.98 14.80 14.66 14.54
5   5%  6.61 5.79 5.41 5.19 5.05 4.95 4.88 4.82 4.78 4.74
    1%  16.26 13.27 12.06 11.39 10.97 10.67 10.45 10.29 10.15 10.05
6   5%  5.99 5.14 4.76 4.53 4.39 4.28 4.21 4.15 4.10 4.06
    1%  13.74 10.92 9.78 9.15 8.75 8.47 8.26 8.10 7.98 7.87
7   5%  5.59 4.74 4.35 4.12 3.97 3.87 3.79 3.73 3.68 3.63
    1%  12.25 9.55 8.45 7.85 7.46 7.19 7.00 6.84 6.71 6.62
8   5%  5.32 4.46 4.07 3.84 3.69 3.58 3.50 3.44 3.39 3.34
    1%  11.26 8.65 7.59 7.01 6.63 6.37 6.19 6.03 5.91 5.82
9   5%  5.12 4.26 3.86 3.63 3.48 3.37 3.29 3.23 3.18 3.13
    1%  10.56 8.02 6.99 6.42 6.06 5.80 5.62 5.47 5.35 5.26
10  5%  4.96 4.10 3.71 3.48 3.33 3.22 3.14 3.07 3.02 2.97
    1%  10.04 7.56 6.55 5.99 5.64 5.39 5.21 5.06 4.95 4.85
11  5%  4.84 3.98 3.59 3.36 3.20 3.09 3.01 2.95 2.90 2.86
    1%  9.65 7.20 6.22 5.67 5.32 5.07 4.88 4.74 4.63 4.54
12  5%  4.75 3.88 3.49 3.26 3.11 3.00 2.92 2.85 2.80 2.76
    1%  9.33 6.93 5.95 5.41 5.06 4.82 4.65 4.50 4.39 4.30
13  5%  4.67 3.80 3.41 3.18 3.02 2.92 2.84 2.77 2.72 2.67
    1%  9.07 6.70 5.74 5.20 4.86 4.62 4.44 4.30 4.19 4.10
14  5%  4.60 3.74 3.34 3.11 2.96 2.85 2.77 2.70 2.65 2.60
    1%  8.86 6.51 5.56 5.03 4.69 4.46 4.28 4.14 4.03 3.94
15  5%  4.54 3.68 3.29 3.06 2.90 2.79 2.70 2.64 2.59 2.55
    1%  8.68 6.36 5.42 4.89 4.56 4.32 4.14 4.00 3.89 3.80
16  5%  4.49 3.63 3.24 3.01 2.85 2.74 2.66 2.59 2.54 2.49
    1%  8.53 6.23 5.29 4.77 4.44 4.20 4.03 3.89 3.78 3.69
17  5%  4.45 3.59 3.20 2.96 2.81 2.70 2.62 2.55 2.50 2.45
    1%  8.40 6.11 5.18 4.67 4.34 4.10 3.93 3.79 3.68 3.59

Table-2 (Continued…): Critical values for F-distribution

Rows: degrees of freedom for smaller mean square (n2), at the 5% and 1% levels. Columns: degrees of freedom for greater mean square (n1).

n2  P   1 2 3 4 5 6 7 8 9 10
18  5%  4.41 3.55 3.16 2.93 2.77 2.66 2.58 2.51 2.46 2.41
    1%  8.28 6.01 5.09 4.58 4.25 4.01 3.85 3.71 3.60 3.51
19  5%  4.38 3.52 3.13 2.90 2.74 2.63 2.55 2.48 2.43 2.38
    1%  8.18 5.93 5.01 4.50 4.17 3.94 3.77 3.63 3.52 3.43
20  5%  4.35 3.49 3.10 2.87 2.71 2.60 2.52 2.45 2.40 2.35
    1%  8.10 5.85 4.94 4.43 4.10 3.87 3.71 3.56 3.45 3.37
21  5%  4.32 3.47 3.07 2.84 2.68 2.57 2.49 2.42 2.37 2.32
    1%  8.02 5.78 4.87 4.37 4.04 3.81 3.65 3.51 3.40 3.31
22  5%  4.30 3.44 3.05 2.82 2.66 2.55 2.47 2.40 2.35 2.30
    1%  7.94 5.72 4.82 4.31 3.99 3.76 3.59 3.45 3.35 3.26
23  5%  4.28 3.42 3.03 2.80 2.64 2.53 2.45 2.38 2.32 2.28
    1%  7.88 5.66 4.76 4.26 3.94 3.71 3.54 3.41 3.30 3.21
24  5%  4.26 3.40 3.01 2.78 2.62 2.51 2.43 2.36 2.30 2.26
    1%  7.82 5.61 4.72 4.22 3.90 3.67 3.50 3.36 3.25 3.17
25  5%  4.24 3.38 2.99 2.76 2.60 2.49 2.41 2.34 2.28 2.24
    1%  7.77 5.57 4.68 4.18 3.86 3.63 3.46 3.32 3.21 3.13
26  5%  4.22 3.37 2.98 2.74 2.59 2.47 2.39 2.32 2.27 2.22
    1%  7.72 5.53 4.64 4.14 3.82 3.59 3.42 3.29 3.17 3.09
27  5%  4.21 3.35 2.96 2.73 2.57 2.46 2.37 2.30 2.25 2.20
    1%  7.68 5.49 4.60 4.11 3.79 3.56 3.39 3.26 3.14 3.06
28  5%  4.20 3.34 2.95 2.71 2.56 2.44 2.36 2.29 2.24 2.19
    1%  7.64 5.45 4.57 4.07 3.76 3.53 3.36 3.23 3.11 3.03
29  5%  4.18 3.33 2.93 2.70 2.54 2.43 2.35 2.28 2.22 2.18
    1%  7.60 5.42 4.54 4.04 3.73 3.50 3.33 3.20 3.08 3.00
30  5%  4.17 3.32 2.92 2.69 2.53 2.42 2.34 2.27 2.21 2.16
    1%  7.56 5.39 4.51 4.02 3.70 3.47 3.30 3.17 3.06 2.98
31  5%  4.16 3.31 2.91 2.68 2.52 2.41 2.33 2.26 2.20 2.15
    1%  7.53 5.37 4.49 4.00 3.68 3.45 3.28 3.15 3.04 2.96
32  5%  4.15 3.30 2.90 2.67 2.51 2.40 2.32 2.25 2.19 2.14
    1%  7.50 5.34 4.46 3.97 3.66 3.42 3.25 3.12 3.01 2.94
33  5%  4.14 3.29 2.89 2.66 2.50 2.39 2.31 2.24 2.18 2.13
    1%  7.47 5.32 4.44 3.95 3.64 3.40 3.23 3.10 2.99 2.92
34  5%  4.13 3.28 2.88 2.65 2.49 2.38 2.30 2.23 2.17 2.12
    1%  7.44 5.29 4.42 3.93 3.61 3.38 3.21 3.08 2.97 2.89

Table-2 (Continued…): Critical values for F-distribution

Rows: degrees of freedom for smaller mean square (n2), at the 5% and 1% levels. Columns: degrees of freedom for greater mean square (n1).

n2  P   1 2 3 4 5 6 7 8 9 10
35  5%  4.12 3.27 2.87 2.64 2.49 2.37 2.29 2.22 2.16 2.11
    1%  7.42 5.27 4.40 3.91 3.60 3.37 3.20 3.06 2.96 2.88
36  5%  4.11 3.26 2.86 2.63 2.48 2.36 2.28 2.21 2.15 2.10
    1%  7.39 5.25 4.38 3.89 3.58 3.35 3.18 3.04 2.94 2.86
37  5%  4.11 3.26 2.86 2.63 2.47 2.36 2.27 2.20 2.15 2.10
    1%  7.37 5.23 4.36 3.88 3.56 3.34 3.17 3.03 2.93 2.84
38  5%  4.10 3.25 2.85 2.62 2.46 2.35 2.26 2.19 2.14 2.09
    1%  7.35 5.21 4.34 3.86 3.54 3.32 3.15 3.02 2.91 2.82
39  5%  4.09 3.24 2.85 2.62 2.46 2.35 2.26 2.19 2.13 2.08
    1%  7.33 5.20 4.33 3.85 3.53 3.31 3.14 3.01 2.90 2.81
40  5%  4.08 3.23 2.84 2.61 2.45 2.34 2.25 2.18 2.12 2.07
    1%  7.31 5.18 4.31 3.83 3.51 3.29 3.12 2.99 2.88 2.80
41  5%  4.08 3.23 2.84 2.61 2.45 2.33 2.25 2.18 2.12 2.07
    1%  7.29 5.17 4.30 3.82 3.50 3.28 3.11 2.98 2.87 2.79
42  5%  4.07 3.22 2.83 2.60 2.44 2.32 2.24 2.17 2.11 2.06
    1%  7.27 5.15 4.29 3.80 3.49 3.26 3.10 2.96 2.86 2.77
43  5%  4.07 3.22 2.83 2.60 2.44 2.32 2.24 2.17 2.11 2.06
    1%  7.26 5.14 4.28 3.79 3.48 3.25 3.09 2.95 2.85 2.76
44  5%  4.06 3.21 2.82 2.59 2.43 2.31 2.23 2.16 2.10 2.05
    1%  7.24 5.12 4.26 3.78 3.46 3.24 3.07 2.94 2.84 2.75
45  5%  4.06 3.21 2.82 2.59 2.43 2.31 2.23 2.15 2.10 2.05
    1%  7.23 5.11 4.25 3.77 3.45 3.23 3.06 2.93 2.83 2.74
46  5%  4.05 3.20 2.81 2.58 2.42 2.30 2.22 2.14 2.09 2.04
    1%  7.21 5.10 4.24 3.76 3.44 3.22 3.05 2.92 2.82 2.73
47  5%  4.05 3.20 2.81 2.58 2.42 2.30 2.22 2.14 2.09 2.04
    1%  7.20 5.09 4.23 3.75 3.43 3.21 3.05 2.91 2.81 2.72
48  5%  4.04 3.19 2.80 2.57 2.41 2.30 2.21 2.14 2.08 2.03
    1%  7.19 5.08 4.22 3.74 3.42 3.20 3.04 2.90 2.80 2.71
49  5%  4.04 3.19 2.80 2.57 2.41 2.30 2.21 2.14 2.08 2.03
    1%  7.18 5.07 4.21 3.73 3.42 3.19 3.03 2.89 2.79 2.71
50  5%  4.03 3.18 2.79 2.56 2.40 2.29 2.20 2.13 2.07 2.02
    1%  7.17 5.06 4.20 3.72 3.41 3.18 3.02 2.88 2.78 2.70

Table-2 (Continued…): Critical values for F-distribution

Rows: degrees of freedom for smaller mean square (n2), at the 5% and 1% levels. Columns: degrees of freedom for greater mean square (n1).

n2    P   1 2 3 4 5 6 7 8 9 10
55    5%  4.02 3.17 2.78 2.54 2.38 2.27 2.18 2.11 2.05 2.00
      1%  7.12 5.01 4.16 3.68 3.37 3.15 2.98 2.85 2.75 2.66
60    5%  4.00 3.15 2.76 2.52 2.37 2.25 2.17 2.10 2.04 1.99
      1%  7.08 4.98 4.13 3.65 3.34 3.12 2.95 2.82 2.72 2.63
65    5%  3.99 3.14 2.75 2.51 2.36 2.24 2.15 2.08 2.02 1.98
      1%  7.04 4.95 4.10 3.62 3.31 3.09 2.93 2.79 2.70 2.61
70    5%  3.98 3.13 2.74 2.50 2.35 2.23 2.14 2.07 2.01 1.97
      1%  7.01 4.92 4.08 3.60 3.29 3.07 2.91 2.77 2.67 2.59
80    5%  3.96 3.11 2.72 2.48 2.33 2.21 2.12 2.05 1.99 1.95
      1%  6.96 4.88 4.04 3.56 3.25 3.04 2.87 2.74 2.64 2.55
100   5%  3.94 3.09 2.70 2.46 2.30 2.19 2.10 2.03 1.97 1.92
      1%  6.90 4.82 3.98 3.51 3.20 2.99 2.82 2.69 2.59 2.51
125   5%  3.92 3.07 2.68 2.44 2.29 2.17 2.08 2.01 1.95 1.90
      1%  6.84 4.78 3.94 3.47 3.17 2.95 2.79 2.65 2.56 2.47
150   5%  3.91 3.06 2.67 2.43 2.27 2.16 2.07 2.00 1.94 1.89
      1%  6.81 4.75 3.91 3.44 3.14 2.92 2.76 2.62 2.53 2.44
200   5%  3.89 3.04 2.65 2.41 2.26 2.14 2.05 1.98 1.92 1.87
      1%  6.76 4.71 3.88 3.41 3.11 2.90 2.73 2.60 2.50 2.41
400   5%  3.86 3.02 2.62 2.39 2.23 2.12 2.03 1.96 1.90 1.85
      1%  6.70 4.66 3.83 3.36 3.06 2.85 2.69 2.55 2.46 2.37
1000  5%  3.85 3.00 2.61 2.38 2.22 2.10 2.02 1.95 1.89 1.84
      1%  6.66 4.62 3.80 3.34 3.04 2.82 2.66 2.53 2.43 2.34
∞     5%  3.84 2.99 2.60 2.37 2.21 2.09 2.01 1.94 1.88 1.83
      1%  6.64 4.60 3.78 3.32 3.02 2.80 2.64 2.51 2.41 2.32

Table-2 (Continued…): Critical values for F-distribution

Rows: degrees of freedom for smaller mean square (n2), at the 5% and 1% levels. Columns: degrees of freedom for greater mean square (n1).

n2  P   11 12 13 14 15 16 17 18 19 20
1   5%  243.00 244.00 244.50 245.00 245.50 246.00 246.50 247.00 247.50 248.00
    1%  6082.00 6106.00 6124.00 6142.00 6156.00 6169.00 6177.00 6186.00 6194.00 6208.00
2   5%  19.40 19.41 19.42 19.42 19.43 19.43 19.43 19.44 19.44 19.44
    1%  99.41 99.42 99.42 99.43 99.43 99.44 99.44 99.45 99.45 99.45
3   5%  8.76 8.74 8.73 8.71 8.70 8.69 8.68 8.68 8.67 8.66
    1%  27.13 27.05 26.99 26.92 26.88 26.83 26.80 26.76 26.73 26.69
4   5%  5.93 5.91 5.89 5.87 5.86 5.84 5.83 5.82 5.81 5.80
    1%  14.45 14.37 14.31 14.24 14.20 14.15 14.11 14.07 14.04 14.02
5   5%  4.70 4.68 4.66 4.64 4.62 4.60 4.59 4.58 4.57 4.56
    1%  9.96 9.89 9.81 9.77 9.73 9.68 9.65 9.62 9.58 9.55
6   5%  4.03 4.00 3.98 3.96 3.94 3.92 3.91 3.90 3.88 3.87
    1%  7.79 7.72 7.66 7.60 7.56 7.52 7.49 7.46 7.42 7.39
7   5%  3.60 3.57 3.55 3.52 3.51 3.49 3.48 3.47 3.45 3.44
    1%  6.54 6.47 6.41 6.35 6.31 6.27 6.24 6.21 6.18 6.15
8   5%  3.31 3.28 3.26 3.23 3.22 3.20 3.19 3.18 3.16 3.15
    1%  5.74 5.67 5.62 5.56 5.52 5.48 5.46 5.42 5.39 5.36
9   5%  3.10 3.07 3.05 3.02 3.00 2.98 2.97 2.96 2.94 2.93
    1%  5.18 5.11 5.06 5.00 4.96 4.92 4.89 4.86 4.83 4.80
10  5%  2.94 2.91 2.89 2.86 2.84 2.82 2.81 2.80 2.78 2.77
    1%  4.78 4.71 4.66 4.60 4.56 4.52 4.49 4.47 4.44 4.41
11  5%  2.82 2.79 2.77 2.74 2.72 2.70 2.69 2.68 2.66 2.65
    1%  4.46 4.40 4.35 4.29 4.25 4.21 4.18 4.16 4.13 4.10
12  5%  2.72 2.69 2.67 2.64 2.62 2.60 2.59 2.57 2.56 2.54
    1%  4.22 4.16 4.11 4.05 4.02 3.98 3.95 3.92 3.89 3.86
13  5%  2.63 2.60 2.58 2.55 2.53 2.51 2.50 2.49 2.47 2.46
    1%  4.02 3.96 3.92 3.85 3.82 3.78 3.75 3.73 3.70 3.67
14  5%  2.56 2.53 2.51 2.48 2.46 2.44 2.43 2.42 2.40 2.39
    1%  3.86 3.80 3.75 3.70 3.66 3.62 3.59 3.57 3.54 3.51
15  5%  2.51 2.48 2.46 2.43 2.41 2.39 2.38 2.36 2.35 2.33
    1%  3.73 3.67 3.66 3.56 3.52 3.48 3.45 3.42 3.39 3.36
16  5%  2.45 2.42 2.40 2.37 2.35 2.33 2.32 2.31 2.29 2.28
    1%  3.61 3.55 3.50 3.45 3.41 3.37 3.34 3.31 3.28 3.25
17  5%  2.41 2.38 2.36 2.33 2.31 2.29 2.28 2.26 2.25 2.23
    1%  3.52 3.45 3.40 3.35 3.31 3.27 3.24 3.22 3.19 3.16

Table-2 (Continued…): Critical values for F-distribution

Rows: degrees of freedom for smaller mean square (n2), at the 5% and 1% levels. Columns: degrees of freedom for greater mean square (n1).

n2  P   11 12 13 14 15 16 17 18 19 20
18  5%  2.37 2.34 2.32 2.29 2.27 2.25 2.24 2.22 2.21 2.19
    1%  3.44 3.37 3.32 3.27 3.23 3.19 3.16 3.13 3.10 3.07
19  5%  2.34 2.31 2.29 2.26 2.24 2.21 2.20 2.18 2.17 2.15
    1%  3.36 3.30 3.25 3.19 3.16 3.12 3.09 3.06 3.03 3.00
20  5%  2.31 2.28 2.26 2.23 2.21 2.18 2.17 2.15 2.14 2.12
    1%  3.30 3.23 3.18 3.13 3.09 3.05 3.02 3.00 2.97 2.94
21  5%  2.28 2.25 2.23 2.20 2.18 2.15 2.14 2.12 2.12 2.09
    1%  3.24 3.17 3.12 3.07 3.03 2.99 2.96 2.94 2.91 2.88
22  5%  2.26 2.23 2.21 2.18 2.16 2.13 2.12 2.10 2.09 2.07
    1%  3.18 3.12 3.07 3.02 2.98 2.94 2.91 2.89 2.86 2.83
23  5%  2.24 2.20 2.17 2.14 2.12 2.10 2.09 2.07 2.06 2.04
    1%  3.14 3.07 3.02 2.97 2.93 2.89 2.86 2.84 2.81 2.78
24  5%  2.22 2.18 2.16 2.13 2.11 2.09 2.07 2.06 2.04 2.02
    1%  3.09 3.03 2.98 2.93 2.89 2.85 2.82 2.80 2.77 2.74
25  5%  2.20 2.16 2.14 2.11 2.09 2.07 2.05 2.04 2.02 2.00
    1%  3.05 2.99 2.94 2.89 2.85 2.81 2.78 2.76 2.73 2.70
26  5%  2.18 2.15 2.13 2.10 2.08 2.05 2.04 2.02 2.01 1.99
    1%  3.02 2.96 2.91 2.86 2.82 2.77 2.74 2.72 2.69 2.66
27  5%  2.16 2.13 2.11 2.08 2.06 2.03 2.02 2.00 1.99 1.97
    1%  2.98 2.93 2.88 2.83 2.79 2.74 2.71 2.69 2.66 2.63
28  5%  2.15 2.12 2.09 2.06 2.04 2.02 2.01 1.99 1.98 1.96
    1%  2.95 2.90 2.85 2.80 2.76 2.71 2.68 2.66 2.63 2.60
29  5%  2.14 2.10 2.08 2.05 2.03 2.00 1.99 1.97 1.96 1.94
    1%  2.92 2.87 2.82 2.77 2.73 2.68 2.65 2.63 2.60 2.57
30  5%  2.12 2.09 2.05 2.04 2.02 1.99 1.98 1.96 1.95 1.93
    1%  2.90 2.84 2.79 2.74 2.70 2.66 2.63 2.61 2.58 2.55
31  5%  2.11 2.08 2.05 2.03 2.01 1.98 1.97 1.95 1.94 1.92
    1%  2.88 2.82 2.77 2.72 2.68 2.64 2.61 2.59 2.56 2.53
32  5%  2.10 2.07 2.05 2.02 2.00 1.97 1.96 1.94 1.93 1.91
    1%  2.86 2.80 2.75 2.70 2.66 2.62 2.59 2.57 2.54 2.51
33  5%  2.09 2.06 2.04 2.01 1.99 1.96 1.95 1.93 1.92 1.90
    1%  2.84 2.78 2.73 2.68 2.64 2.60 2.57 2.55 2.52 2.49
34  5%  2.08 2.05 2.03 2.00 1.98 1.95 1.94 1.92 1.91 1.89
    1%  2.82 2.76 2.71 2.66 2.62 2.58 2.55 2.53 2.50 2.47

Table-2 (Continued…): Critical values for F-distribution

Rows: degrees of freedom for smaller mean square (n2), at the 5% and 1% levels. Columns: degrees of freedom for greater mean square (n1).

n2  P   11 12 13 14 15 16 17 18 19 20
35  5%  2.07 2.04 2.02 1.99 1.97 1.94 1.93 1.91 1.90 1.88
    1%  2.80 2.74 2.69 2.64 2.60 2.56 2.53 2.51 2.48 2.45
36  5%  2.06 2.03 2.01 1.98 1.96 1.93 1.92 1.90 1.89 1.87
    1%  2.78 2.72 2.67 2.62 2.58 2.54 2.51 2.49 2.46 2.43
37  5%  2.06 2.03 2.00 1.97 1.95 1.93 1.91 1.89 1.88 1.86
    1%  2.77 2.71 2.66 2.61 2.57 2.53 2.50 2.47 2.44 2.41
38  5%  2.05 2.02 1.99 1.96 1.94 1.92 1.90 1.89 1.87 1.85
    1%  2.75 2.69 2.64 2.59 2.55 2.51 2.48 2.46 2.43 2.40
39  5%  2.05 2.01 1.99 1.96 1.93 1.91 1.89 1.88 1.86 1.85
    1%  2.74 2.68 2.63 2.58 2.54 2.50 2.48 2.45 2.42 2.38
40  5%  2.04 2.00 1.98 1.95 1.93 1.90 1.89 1.87 1.86 1.84
    1%  2.73 2.66 2.61 2.56 2.53 2.49 2.46 2.43 2.40 2.37
41  5%  2.01 2.00 1.98 1.95 1.92 1.90 1.88 1.86 1.85 1.83
    1%  2.72 2.65 2.60 2.55 2.51 2.48 2.45 2.42 2.39 2.36
42  5%  2.02 1.99 1.97 1.94 1.92 1.89 1.87 1.86 1.84 1.82
    1%  2.70 2.64 2.59 2.54 2.50 2.46 2.43 2.41 2.38 2.35
43  5%  2.02 1.99 1.96 1.93 1.91 1.89 1.87 1.85 1.83 1.82
    1%  2.69 2.63 2.58 2.53 2.49 2.45 2.42 2.39 2.36 2.33
44  5%  2.01 1.98 1.95 1.92 1.90 1.88 1.86 1.85 1.83 1.81
    1%  2.68 2.62 2.57 2.52 2.48 2.44 2.41 2.38 2.35 2.32
45  5%  2.01 1.98 1.95 1.92 1.90 1.88 1.86 1.84 1.82 1.81
    1%  2.67 2.61 2.56 2.51 2.47 2.43 2.40 2.37 2.34 2.31
46  5%  2.00 1.97 1.94 1.91 1.89 1.87 1.84 1.84 1.82 1.80
    1%  2.66 2.60 2.55 2.50 2.46 2.42 2.39 2.36 2.33 2.30
47  5%  2.00 1.97 1.94 1.91 1.89 1.87 1.85 1.83 1.81 1.80
    1%  2.65 2.59 2.54 2.51 2.45 2.41 2.38 2.35 2.32 2.29
48  5%  1.99 1.96 1.93 1.90 1.88 1.86 1.85 1.83 1.81 1.79
    1%  2.64 2.58 2.53 2.48 2.44 2.40 2.37 2.34 2.31 2.28
49  5%  1.99 1.96 1.93 1.90 1.88 1.86 1.84 1.82 1.80 1.79
    1%  2.63 2.57 2.52 2.47 2.43 2.40 2.36 2.33 2.30 2.27
50  5%  1.98 1.95 1.92 1.89 1.87 1.85 1.83 1.82 1.80 1.78
    1%  2.62 2.56 2.51 2.46 2.43 2.39 2.36 2.33 2.29 2.26

Table-2 (Continued…): Critical values for F-distribution

Rows: degrees of freedom for smaller mean square (n2), at the 5% and 1% levels. Columns: degrees of freedom for greater mean square (n1).

n2    P   11 12 14 16 20 24 30 40 50 75
55    5%  1.97 1.93 1.88 1.83 1.76 1.72 1.67 1.61 1.58 1.52
      1%  2.59 2.53 2.43 2.35 2.23 2.15 2.06 1.96 1.90 1.82
60    5%  1.95 1.92 1.86 1.81 1.75 1.70 1.65 1.59 1.56 1.50
      1%  2.56 2.50 2.40 2.32 2.20 2.12 2.03 1.93 1.87 1.79
65    5%  1.94 1.90 1.85 1.80 1.73 1.68 1.63 1.57 1.54 1.49
      1%  2.54 2.47 2.37 2.30 2.18 2.09 2.00 1.90 1.84 1.76
70    5%  1.93 1.89 1.84 1.79 1.72 1.67 1.62 1.56 1.53 1.47
      1%  2.51 2.45 2.35 2.28 2.15 2.07 1.98 1.88 1.82 1.74
80    5%  1.91 1.88 1.82 1.77 1.70 1.65 1.60 1.54 1.51 1.45
      1%  2.48 2.41 2.32 2.24 2.11 2.03 1.94 1.84 1.78 1.70
100   5%  1.88 1.85 1.79 1.75 1.68 1.63 1.57 1.51 1.48 1.42
      1%  2.43 2.36 2.26 2.19 2.06 1.98 1.89 1.79 1.73 1.64
125   5%  1.86 1.83 1.77 1.72 1.65 1.60 1.55 1.49 1.45 1.39
      1%  2.40 2.33 2.23 2.15 2.03 1.94 1.85 1.75 1.68 1.59
150   5%  1.85 1.82 1.76 1.71 1.64 1.59 1.54 1.47 1.44 1.37
      1%  2.37 2.30 2.20 2.12 2.00 1.91 1.83 1.72 1.66 1.56
200   5%  1.83 1.80 1.74 1.69 1.62 1.57 1.52 1.45 1.42 1.35
      1%  2.34 2.28 2.17 2.09 1.97 1.88 1.79 1.69 1.62 1.53
400   5%  1.81 1.78 1.72 1.67 1.60 1.54 1.49 1.42 1.38 1.32
      1%  2.29 2.23 2.12 2.04 1.92 1.84 1.74 1.64 1.57 1.47
1000  5%  1.80 1.76 1.70 1.65 1.58 1.53 1.47 1.41 1.36 1.30
      1%  2.26 2.20 2.09 2.01 1.89 1.81 1.71 1.61 1.54 1.44
∞     5%  1.79 1.75 1.69 1.64 1.57 1.52 1.46 1.40 1.35 1.28
      1%  2.24 2.18 2.07 1.99 1.87 1.79 1.69 1.59 1.52 1.41

Table-3: χ² (Chi-Squared) Distribution: Critical Values of χ²

Table-4: Critical value for Correlation coefficients (Simple or Partial)

     Probability         Probability          Probability
DF   0.01   0.05    DF   0.01   0.05    DF    0.01   0.05
1    1.000  0.997   41   0.389  0.301   130   0.223  0.171
2    0.990  0.950   42   0.384  0.297   135   0.219  0.168
3    0.959  0.878   43   0.380  0.294   140   0.215  0.165
4    0.917  0.811   44   0.376  0.291   145   0.212  0.162
5    0.874  0.754   45   0.372  0.288   150   0.208  0.159
6    0.834  0.707   46   0.368  0.285   160   0.202  0.154
7    0.798  0.666   47   0.365  0.282   170   0.196  0.150
8    0.765  0.632   48   0.361  0.279   180   0.190  0.145
9    0.735  0.602   49   0.358  0.276   190   0.185  0.142
10   0.708  0.576   50   0.354  0.273   200   0.181  0.138
11   0.684  0.553   52   0.348  0.268   250   0.162  0.124
12   0.661  0.532   54   0.341  0.263   300   0.148  0.113
13   0.641  0.514   56   0.336  0.259   350   0.137  0.105
14   0.623  0.497   58   0.330  0.254   400   0.128  0.098
15   0.606  0.482   60   0.325  0.250   450   0.121  0.092
16   0.590  0.468   62   0.320  0.246   500   0.115  0.088
17   0.575  0.456   64   0.315  0.242   600   0.105  0.080
18   0.561  0.444   66   0.310  0.239   700   0.097  0.074
19   0.549  0.433   68   0.306  0.235   800   0.091  0.069
20   0.537  0.423   70   0.302  0.232   900   0.086  0.065
21   0.526  0.413   72   0.298  0.229   1000  0.081  0.062
22   0.515  0.404   74   0.294  0.226
23   0.505  0.396   76   0.290  0.223
24   0.496  0.388   78   0.286  0.220
25   0.487  0.381   80   0.283  0.217
26   0.478  0.374   82   0.280  0.215
27   0.470  0.367   84   0.276  0.212
28   0.463  0.361   86   0.273  0.210
29   0.456  0.355   88   0.270  0.207
30   0.449  0.349   90   0.267  0.205
31   0.442  0.344   92   0.264  0.203
32   0.436  0.339   94   0.262  0.201
33   0.430  0.334   96   0.259  0.199
34   0.424  0.329   98   0.256  0.197
35   0.418  0.325   100  0.254  0.195
36   0.413  0.320   105  0.248  0.190
37   0.408  0.316   110  0.242  0.186
38   0.403  0.312   115  0.237  0.182
39   0.398  0.308   120  0.232  0.178
40   0.393  0.304   125  0.228  0.174

Table-5: Percentage points of the normal distribution, Z

This table gives percentage points of the standard normal distribution. These are the values of z for which a given percentage, P, of the standard normal distribution lies outside the range from -z to +z.

P (%)   Z
90      0.1257
80      0.2533
70      0.3853
60      0.5244
50      0.6745
40      0.8416
30      1.0364
20      1.2816
15      1.4395
10      1.6449
5       1.9600
2       2.3263
1       2.5758
0.50    2.8070
0.25    3.0233
0.10    3.2905
0.01    3.8906

Table-6: Random numbers

Each digit in the following table is independent and has a probability of (1/10). The table was computed from a population in which the digits 0 to 9 were equally likely.

77 21 24 33 39 07 83 00 02 77 28 11 37 33
78 02 65 38 92 90 07 13 11 95 58 88 64 55
77 10 41 31 90 76 35 00 25 78 80 18 77 32
85 21 57 89 27 08 70 32 14 58 81 83 41 55
75 05 14 19 00 64 53 01 50 80 01 88 74 21
57 19 77 98 74 82 07 22 42 89 12 37 16 56
59 59 47 98 07 41 38 12 06 09 19 80 44 13
76 96 73 88 44 25 72 27 21 90 22 76 69 67
96 90 76 82 74 19 81 28 61 91 95 02 47 31
63 61 36 80 48 50 26 71 16 08 25 65 91 75
65 02 65 25 45 97 17 84 12 19 59 27 79 18
37 16 64 00 80 06 62 11 62 88 59 54 12 53
58 29 55 59 57 73 78 43 28 99 91 77 93 89
79 68 43 00 06 63 26 10 26 83 94 48 25 31
87 92 56 91 74 30 83 39 85 99 11 73 34 98
96 86 39 03 67 35 64 09 62 36 46 86 54 13
72 20 60 14 48 08 36 92 58 99 15 30 47 87
67 61 97 37 73 55 47 97 25 65 67 67 41 35
25 09 03 43 83 82 60 26 81 96 51 05 77 72
72 14 78 75 39 54 75 77 55 59 71 73 15 56
59 93 34 37 34 27 07 66 15 63 14 50 74 29
21 48 85 56 91 43 50 71 58 96 14 31 55 61
96 32 49 79 42 71 79 69 52 39 45 04 49 91
16 85 53 65 11 36 08 14 86 60 40 18 51 15
64 28 96 90 23 12 98 92 28 94 57 41 99 11
60 54 36 51 15 63 83 42 63 08 01 89 18 53
42 86 68 06 36 25 82 26 85 49 76 15 90 13
00 49 62 15 53 32 31 28 38 88 14 97 80 33
26 64 87 61 67 53 23 68 51 98 60 59 02 33
02 95 21 53 34 23 10 82 82 82 48 71 02 39
65 47 77 14 75 30 32 81 10 83 03 97 24 37
28 55 15 36 46 33 06 22 29 23 81 14 20 91
59 75 78 49 51 02 20 17 02 30 32 78 44 79
87 54 57 69 63 31 61 25 92 31 16 44 02 10
94 53 87 97 15 23 08 71 26 06 25 87 48 97
79 43 75 93 39 10 18 51 28 17 65 43 22 06
48 38 71 77 53 37 80 13 60 63 59 75 89 73
98 30 59 32 90 05 86 12 83 70 50 30 25 65
85 80 16 77 35 74 09 32 06 30 91 55 92 33
87 03 96 27 05 59 64 25 33 07 03 08 55 58
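One common way of using such a table is to read two-digit numbers in order and keep those that correspond to a unit in the population. The Python sketch below illustrates this under stated assumptions: the population has at most 99 units numbered from 1, numbers outside 1 to N (including 00) and repeats are skipped so that the draw is without replacement, and the function name `select_units` is made up for the example.

```python
def select_units(table_numbers, N, n):
    """Pick n distinct serial numbers (1..N) by reading two-digit
    random numbers in order and skipping unusable ones."""
    chosen = []
    for r in table_numbers:
        # keep a number only if it labels a unit and is not a repeat
        if 1 <= r <= N and r not in chosen:
            chosen.append(r)
        if len(chosen) == n:
            break
    return chosen

# First row of Table-6, read from left to right
first_row = [77, 21, 24, 33, 39, 7, 83, 0, 2, 77, 28, 11, 37, 33]

# Select 4 units from a population of 40 units
print(select_units(first_row, N=40, n=4))
```

Here 77 is discarded because it exceeds 40, and the next four usable numbers (21, 24, 33 and 39) form the sample. The same reading rule, with N replaced by the interval k and n = 1, can supply the random start of a systematic sample.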