RM_5___6_Group_no_10_

8/2/2019 RM_5___6_Group_no_10_


RESEARCH METHODOLOGY : MFM SEM II GROUP 10


Group 10

Roll No. Name

8 Sarvesh Desai

17 Pooja Gupta

24 Nilesh Jadhav

41 Rupesh Phalke

55 Venugopalan Swaminathan

RM Assignment: RM 5

Q1 Differentiate between the following:

1. Parameter and statistic.
2. Level of significance and level of confidence.
3. Null and alternate hypothesis.
4. Type-I and type-II error.
5. One-tailed and two-tailed test of hypothesis.
6. Testing of hypothesis and estimation.
7. Point estimate and interval estimate.
8. Parametric and non-parametric test of hypothesis.
9. Z-test and t-test of hypothesis.
10. Test of goodness of fit and test of independence, under chi-square test.
11. 1-way ANOVA and 2-way ANOVA.
12. Test of confirmation and test of comparison.

Solution:

Q1.1 Parameter vs. statistic

1. Parameter: describes a full population.
   Statistic: describes a sample.

2. Parameter: a property of the underlying population distribution.
   Statistic: a function of a sample or observation.

3. Parameter: the population mean is a parameter.
   Statistic: the sample mean is a statistic; as the sample becomes large, it approaches the population mean.
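The distinction in Q1.1 can be illustrated with a minimal sketch. The population of scores, the seeds, and the helper name `sample_mean` are made up for illustration; they are not part of the assignment.

```python
import random
from statistics import mean

# Hypothetical population: scores of 1,000 students (made-up data).
random.seed(42)
population = [random.randint(0, 100) for _ in range(1000)]

# Parameter: computed from the whole population.
population_mean = mean(population)

def sample_mean(data, n, rng):
    """Statistic: mean of a simple random sample of size n (without replacement)."""
    return mean(rng.sample(data, n))

rng = random.Random(7)
for n in (10, 100, 1000):
    # As n grows, the statistic moves towards the parameter.
    print(n, round(sample_mean(population, n, rng), 2), "parameter:", round(population_mean, 2))
```

When the sample is the whole population (n = 1000 here), the statistic coincides with the parameter.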

Q1.2 Level of significance vs. level of confidence

1. Level of significance: indicates the likelihood that the answer will fall outside the stated range.
   Level of confidence: the expected percentage of times that the actual value will fall within the stated precision limits.

2. Level of significance: a 1% significance level means a 99% confidence level.
   Level of confidence: a 95% confidence level means 95 chances in 100 that the sample represents the true condition.

Q1.3 Null hypothesis vs. alternate hypothesis


1. Null hypothesis (H0): the finding occurred by chance.
   Alternate hypothesis (H1): the finding did not occur by chance.

2. Null hypothesis: assumed to be true unless we find evidence to the contrary.
   Alternate hypothesis: if we find that the evidence is too unlikely given the null hypothesis, we assume the alternate hypothesis is more likely to be correct.

Q1.4 Type I error vs. Type II error

1. Type I error: rejecting a hypothesis which should have been accepted.
   Type II error: accepting a hypothesis which should have been rejected.

2. Type I error: denoted by alpha.
   Type II error: denoted by beta.

3. Type I error: can be controlled by fixing it at a lower level.
   Type II error: depends on the level fixed for the Type I error.

Q1.5 One-tailed hypothesis test vs. two-tailed hypothesis test

1. One-tailed: the rejection region lies on one side of the distribution only.
   Two-tailed: the rejection region lies on both sides of the distribution.

Q1.6 Testing of hypothesis vs. estimation

Testing of hypothesis: carried out to test an assumed value or criterion.
Estimation: population parameters are unknown, so they have to be estimated from the sample.

Q1.7 Point estimate vs. interval estimate

The estimate of a population parameter may be one single value or it could be a range.

Point estimate: as the name suggests, the estimation of the population parameter with one number.

Interval estimate: estimating the parameter is not sufficient on its own. It is necessary to analyse how confident we can be about this particular estimate. One way of doing so is to define confidence intervals. If we have estimated θ, we want to know whether the true parameter is close to our estimate. In other words, we want to find an interval that satisfies the following relation:

$$P(\theta_L \le \theta \le \theta_U) = 1 - \alpha$$

where $\theta_L$ and $\theta_U$ are the lower and upper limits of the interval and $1 - \alpha$ is the level of confidence.


Q1.8 Parametric test of hypothesis vs. non-parametric test of hypothesis

1. Parametric: the observations must be independent.
   Non-parametric: observations are independent.

2. Parametric: the observations must be drawn from normally distributed populations.
   Non-parametric: the variable under study has underlying continuity.

3. Parametric: these populations must have the same variances.

4. Parametric: the means of these normal and homoscedastic populations must be linear combinations of effects due to columns and/or rows.

Q1.9 Z-test vs. t-test

1. Z-test: a statistical hypothesis test that follows a normal distribution.
   t-test: follows a Student's t-distribution.

2. Z-test: appropriate when you are handling moderate to large samples (n > 30).
   t-test: appropriate when you are handling small samples (n < 30).

3. Z-test: will often require certain conditions to be reliable.
   t-test: more adaptable than the Z-test.

4. Z-test: less commonly used than t-tests.
   t-test: more commonly used than Z-tests.

Q1.10 Test of goodness of fit vs. test of independence (chi-square)

1. Goodness of fit: a one-variable chi-square test.
   Independence: a two-variable chi-square test.

2. Goodness of fit: the goal is to determine whether a set of frequencies or proportions is similar to, and therefore fits with, a hypothesized set of frequencies or proportions.
   Independence: the goal is to determine whether the first variable is related to, or independent of, the second variable.

3. Goodness of fit: similar to a one-sample t-test.
   Independence: similar to the test for an interaction effect in ANOVA.

4. Goodness of fit: determines if a sample is similar to, and representative of, a population.
   Independence: asks whether the outcome in one variable is related to the outcome in some other variable.
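As a hedged illustration of the one-variable versus two-variable distinction, the sketch below computes the chi-square statistic for both tests by hand. The function names and all counts are made up; the statistic would still have to be compared with a critical value from a chi-square table.

```python
def chi_square_goodness_of_fit(observed, expected):
    """One-variable test: compare observed counts with hypothesized counts."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

def chi_square_independence(table):
    """Two-variable test: expected counts are derived from row and column totals."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof

# Made-up data: 60 die rolls tested against a fair die (10 expected per face).
gof = chi_square_goodness_of_fit([8, 12, 9, 11, 10, 10], [10] * 6)

# Made-up 2x2 contingency table: rows = group, columns = preference.
ind, dof = chi_square_independence([[30, 20], [20, 30]])
print(gof, ind, dof)
```

In the goodness-of-fit case the expected counts come from the hypothesis; in the independence case they are computed from the table itself.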

Q1.11 1-way ANOVA vs. 2-way ANOVA

1. 1-way ANOVA: the purpose is to verify whether the data collected from different sources converge on a common mean.
   2-way ANOVA: the purpose is to verify whether the data collected from different sources converge on a common mean based on two categories of defining characteristics.


2. 1-way ANOVA: used to find out whether the groups carried out the same procedures in conducting research.
   2-way ANOVA: used in the comparison of treatment means. This involves the introduction of a randomized block design. The experiment conducted in the case of two-way ANOVA normally gets split into many mini experiments. In short, two-way ANOVA is employed for designs with two or more treatment means, which can be called factorial designs.
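A minimal sketch of the F ratio used in 1-way ANOVA is given below, assuming the usual between-group and within-group mean squares. The function name and the data are illustrative only, and the result would be compared with a tabulated critical F value.

```python
from statistics import mean

def one_way_anova_f(groups):
    """Return F = between-group mean square / within-group mean square."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    k, n = len(groups), len(all_values)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    ms_between = ss_between / (k - 1)   # degrees of freedom: k - 1
    ms_within = ss_within / (n - k)     # degrees of freedom: n - k
    return ms_between / ms_within

# Made-up observations from three sources.
groups = [[1, 2, 3], [2, 3, 4], [3, 4, 5]]
f_ratio = one_way_anova_f(groups)
print(f_ratio)

# Adding the same constant to every value leaves the F ratio unchanged.
shifted = [[x + 5 for x in g] for g in groups]
print(one_way_anova_f(shifted))
```

The function also accepts groups of unequal size, since each group contributes its own length to the sums of squares.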

Q1.12 Test of confirmation vs. test of comparison

1

2


Q2 State whether the following statements are true or false, giving reasons:

1) Level of significance is type-I error.
2) In 1-way ANOVA, we need all samples to be of equal size.
3) Point estimate is often insufficient because it is either right or wrong.
4) In Z distribution, area contained between +/- 3 * standard deviation is equal to 100%.
5) In fixing critical value of t, we need to specify level of significance or degrees of freedom or one/two tailed.
6) All tests of hypothesis are repetitive and hence universal.
7) If the test fails to support null hypothesis, it also indicates why test fails.
8) (1 - beta error) is called power of test.
9) 1% level of significance gives greater confidence to decision maker than 5% level of significance.
10) In 1-way ANOVA, if F calculated is lesser than 1, it means the factor which differentiates columns is the strong reason explaining variation in data.
11) If all data values are increased by 5, ANOVA inference drawn earlier will change.
12) Client is supposed to give beta error to researcher in advance.
13) In chi-square test, we want to confirm whether chi-square value is zero or not.
14) Level of significance is rejection area under the sampling distribution beyond critical value of test statistic.
15) Good hypothesis can result into type-II error only.
16) Alternate hypothesis can decide whether test is one tailed or two tailed in case of large sample Z test.
17) Randomised block experimental design results into one-way ANOVA.
18) Difference between sample statistic and population parameter is always significant.
19) We use chi-square test of goodness of fit on nominal data 2-way classified.
20) Latin square experimental design will lead to 3-way ANOVA.

Solution:

Q2.1 Level of significance is type-I error.
Answer: TRUE
Reason: The level of significance is the probability of rejecting the hypothesis even though it is true, which is the Type I error.

Q2.2 In 1-way ANOVA, we need all samples to be of equal size.
Answer: FALSE
Reason: Not necessary. 1-way ANOVA can also be carried out with unequal sample sizes.

Q2.3 Point estimate is often insufficient because it is either right or wrong.
Answer: TRUE
Reason: A point estimate gives one value, which can be right or wrong, whereas an interval estimate gives a range against which to check the answer.

Q2.4 In Z distribution, area contained between +/- 3 * standard deviation is equal to 100%.
Answer: FALSE
Reason: In the Z distribution, the area contained between +/- 3 SD is 99.73%.

Q2.5 In fixing critical value of t, we need to specify level of significance or degrees of freedom or one/two tailed.
Answer: TRUE
Reason: To fix the critical value of t, we need to specify the level of significance (LOS), the degrees of freedom (DOF), and whether the test is one- or two-tailed.


Q2.6 All tests of hypothesis are repetitive and hence universal.
Answer: TRUE
Reason: When the sample changes, we need to repeat the test of hypothesis.

Q2.7 If the test fails to support null hypothesis, it also indicates why test fails.
Answer: FALSE
Reason: No. It does not tell why the test fails.

Q2.8 (1 - beta error) is called power of test.
Answer: TRUE
Reason: Beta is the probability of a Type II error, in which a false H0 is accepted. 1 - beta is therefore the probability of correctly rejecting a false H0, which is the power of the test.

Q2.9 1% level of significance gives greater confidence to decision maker than 5% level of significance.
Answer: TRUE
Reason: 1% LOS corresponds to a 99% confidence level, which is greater than the 95% confidence level given by 5% LOS.

Q2.10 In 1-way ANOVA, if F calculated is lesser than 1, it means the factor which differentiates columns is the strong reason explaining variation in data.
Answer: FALSE
Reason: An F ratio below 1 means the variation between columns is smaller than the variation within columns, so the column factor is not a strong reason explaining the variation in the data.

Q2.11 If all data values are increased by 5, ANOVA inference drawn earlier will change.
Answer: FALSE

Q2.12 Client is supposed to give beta error to researcher in advance.
Answer: TRUE
Reason: The researcher should know the success rate the client expects.

Q2.13 In chi-square test, we want to confirm whether chi-square value is zero or not.
Answer: TRUE

Q2.14 Level of significance is rejection area under the sampling distribution beyond critical value of test statistic.
Answer: TRUE
Reason: LOS is the area of the rejection region, i.e. the percentage of cases in which the test statistic falls beyond the critical value even though the hypothesis is true.

Q2.15 Good hypothesis can result into type-II error only.
Answer: TRUE
Reason: Here a false H0 is accepted, indicating that failures are accepted; hence a good hypothesis.

Q2.16 Alternate hypothesis can decide whether test is one tailed or two tailed in case of large sample Z test.
Answer: TRUE
Reason: Alternate hypothesis tells the

Q2.17 Randomised block experimental design results into one-way ANOVA.
Answer: FALSE
Reason: The completely randomized (CR) design results in one-way ANOVA.

Q2.18 Difference between sample statistic and population parameter is always significant.
Answer: FALSE
Reason: Let's say the population has a seasonality factor; if the sampling is not done in a proper way, the sample statistic and the population parameter can be different.

Q2.19 We use chi-square test of goodness of fit on nominal data 2-way classified.
Answer: FALSE
Reason: The goodness-of-fit test is a one-variable chi-square test; nominal data classified two ways calls for the chi-square test of independence.


Q2.20 Latin square experimental design will lead to 3-way ANOVA.
Answer: TRUE
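Two of the numerical points in the Q2 answers (the area within +/- 3 standard deviations in Q2.4, and the 1% versus 5% levels of significance in Q2.9) can be checked with a short sketch using Python's standard library. The helper names `area_within` and `two_tailed_critical_z` are illustrative, not part of the assignment.

```python
from statistics import NormalDist

z = NormalDist()  # standard normal distribution: mean 0, standard deviation 1

def area_within(k):
    """Area under the Z curve between -k and +k standard deviations."""
    return z.cdf(k) - z.cdf(-k)

def two_tailed_critical_z(alpha):
    """Critical |z| that leaves alpha / 2 in each tail."""
    return z.inv_cdf(1 - alpha / 2)

for k in (1, 2, 3):
    print(k, round(area_within(k) * 100, 2))   # 68.27, 95.45, 99.73

# A smaller level of significance pushes the critical value further out.
print(round(two_tailed_critical_z(0.05), 2))   # 1.96
print(round(two_tailed_critical_z(0.01), 2))   # 2.58
```

The area within +/- 3 SD is close to, but never equal to, 100%, because the normal curve has tails that extend indefinitely.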


Q3 State whether the following statements are true or false, giving reasons:

1. Partial correlation analysis is same as multiple correlation analysis.
2. If byx = 0.8, bxy = -0.2, hence r = -0.4.
3. If byx = 0.8, bxy = 1.6, hence r = 1.13.
4. byx and bxy must be less than 1, always.
5. y = a + bx: this equation can be used to estimate value of x for a given value of y always.
6. If two regression lines are perpendicular to each other, correlation coefficient is 1.
7. If r = 0.7, amount of variation in y because of x is 70%.
8. Coefficient of determination can be negative sometimes.
9. If one variable is constant, correlation between x and y is positive perfect.
10. If coefficient of determination is less, stronger will be relationship between x and y.
11. Coefficient of indetermination and standard error of estimate are same in concepts.
12. Variance and co-variance mean the same thing.
13. If correlation coefficient between x and y is 0.90, this definitely proves that relationship is always causal.
14. If two regression lines coincide, coefficient of correlation is always +1.
15. Intersection of two regression lines is the mean of each variable.

Solution:

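Q3.1 Partial correlation analysis is same as multiple correlation analysis.
Answer: FALSE
Reason: Partial correlation measures the effect of one independent variable on the dependent variable while the other variables are held constant, whereas multiple correlation takes into account two or more independent variables and one dependent variable together.

Q3.2 If byx = 0.8, bxy = -0.2, hence r = -0.4.
Answer: FALSE
Reason: r^2 = byx * bxy, so both regression coefficients must carry the same sign. Here byx * bxy = -0.16, which is negative, so this pair of coefficients cannot occur.

Q3.3 If byx = 0.8, bxy = 1.6, hence r = 1.13.
Answer: FALSE
Reason: r^2 = 0.8 * 1.6 = 1.28, which would give r = 1.13. Since r cannot exceed 1, this pair of coefficients cannot occur.

Q3.4 byx and bxy must be less than 1, always.
Answer: FALSE
Reason: Only the product byx * bxy = r^2 must not exceed 1; one coefficient can be greater than 1 if the other is correspondingly small.

Q3.5 y = a + bx: this equation can be used to estimate value of x for a given value of y always.
Answer: FALSE
Reason: y = a + bx is the regression of y on x and is used to estimate y for a given x; the regression of x on y is used to estimate x.

Q3.6 If two regression lines are perpendicular to each other, correlation coefficient is 1.
Answer: FALSE
Reason: The two regression lines are perpendicular when r = 0; they coincide when r = +/- 1.

Q3.7 If r = 0.7, amount of variation in y because of x is 70%.
Answer: FALSE
Reason: The explained variation is given by r^2 = 0.49, i.e. 49%.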


Q3.8 Coefficient of determination can be negative sometimes.
Answer: TRUE
Reason: Negative values of R^2 may occur when fitting non-linear trends to data.

Q3.9 If one variable is constant, correlation between x and y is positive perfect.
Answer: FALSE

Q3.10 If coefficient of determination is less, stronger will be relationship between x and y.
Answer: FALSE

Q3.11 Coefficient of indetermination and standard error of estimate are same in concepts.
Answer: FALSE

Q3.12 Variance and co-variance mean the same thing.
Answer: FALSE

Q3.13 If correlation coefficient between x and y is 0.90, this definitely proves that relationship is always causal.
Answer: FALSE

Q3.14 If two regression lines coincide, coefficient of correlation is always +1.
Answer: FALSE
Reason: When r = +/- 1, there is an exact linear relationship between X and Y and the two regression lines coincide with each other, so r can be -1 as well as +1.

Q3.15 Intersection of two regression lines is the mean of each variable.
Answer: TRUE
Reason: The two regression lines always intersect each other at the point (mean of X, mean of Y).
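The relationship between the two regression coefficients and r (Q3.2 to Q3.4) can be checked with a small sketch, assuming the standard identity r^2 = byx * bxy, with r taking the common sign of the two coefficients. The function name is made up for illustration.

```python
import math

def correlation_from_regression(b_yx, b_xy):
    """Return r from the two regression coefficients, or None if the pair is impossible."""
    product = b_yx * b_xy
    if product < 0 or product > 1:
        return None  # opposite signs, or |r| would exceed 1
    r = math.sqrt(product)
    return -r if b_yx < 0 else r

print(correlation_from_regression(0.8, 0.2))    # a valid pair, r is about 0.4
print(correlation_from_regression(0.8, -0.2))   # opposite signs: None
print(correlation_from_regression(0.8, 1.6))    # product above 1: None
print(correlation_from_regression(2.0, 0.4))    # one coefficient may exceed 1
```

The last call shows that a single coefficient greater than 1 is acceptable as long as the product stays within 0 to 1.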


Q4 Explain the importance of the following in statistical analysis (under what circumstances will you recommend the following in analyzing the data collected?):

1. Mode as measure of central tendency.
2. Coefficient of variation.
3. Interquartile range.
4. Measures of skewness and kurtosis.
5. Syx: standard error of estimate of y because of x.
6. Coefficient of determination (r^2).
7. Co-variance in bivariate analysis.
8. Interval estimate.
9. Classification, tabulation, presentation of data.
10. Frequency curve and histogram.
11. Correlation and regression analysis.
12. Yule's coefficient of association.

Solution:

1) Mode as measure of central tendency.

The mode is the most frequently occurring value in the data set. The mode in a distribution is that item around which there is maximum concentration. In general, the mode is the size of the item which has the maximum frequency.

For example, in the data set {1,2,3,4,4}, the mode is equal to 4. A data set can have more

than a single mode, in which case it is multimodal. In the data set {1,1,2,3,3} there are two

modes: 1 and 3.

The mode can be very useful for dealing with categorical data. For example, if a sandwich

shop sells 10 different types of sandwiches, the mode would represent the most popular

sandwich. The mode also can be used with ordinal, interval, and ratio data. However, in

interval and ratio scales, the data may be spread thinly with no data points having the same

value. In such cases, the mode may not exist or may not be very meaningful.
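The examples above can be reproduced with Python's standard library. The sandwich orders are made-up data used only to illustrate the categorical case.

```python
from collections import Counter
from statistics import multimode

single = multimode([1, 2, 3, 4, 4])    # one mode
double = multimode([1, 1, 2, 3, 3])    # multimodal data set
print(single, double)                  # [4] [1, 3]

# Hypothetical sandwich orders (categorical data): the mode is the most popular item.
orders = ["veg", "cheese", "veg", "chicken", "veg", "cheese"]
most_popular = Counter(orders).most_common(1)
print(most_popular)                    # [('veg', 3)]
```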

2) Coefficient of variation

The coefficient of variation measures variability in relation to the mean (or average) and is

used to compare the relative dispersion in one type of data with the relative dispersion in

another type of data. The data to be compared may be in the same units, in different units,

with the same mean, or with different means.

Suppose you want to evaluate the relative dispersion of grades for two classes of students:

Class A and Class B. The coefficient of variation can be used to compare these two groups

and determine how the grade dispersion in Class A compares to the grade dispersion in

Class B. This is one example of how the coefficient of variation can be applied.

The coefficient of variation is a calculation built on other calculations, the standard deviation and the mean, as follows:

$$CV = \frac{\sigma}{\bar{x}} \times 100$$

This reads as "the coefficient of variation is equal to the standard deviation divided by the mean, multiplied by 100 (to produce a percentage)".

The steps required for calculating the coefficient of variation are:


1. Calculate the mean for the data set.
2. Calculate the standard deviation.
3. Divide the standard deviation by the mean.
4. Multiply the result of step 3 by 100.
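The four steps can be sketched as follows. The grades for the two classes are made up, and the population standard deviation (`pstdev`) is an assumption; a sample standard deviation could be used instead.

```python
from statistics import mean, pstdev

def coefficient_of_variation(data):
    """Standard deviation divided by the mean, expressed as a percentage."""
    m = mean(data)        # step 1: mean
    sd = pstdev(data)     # step 2: standard deviation (population form assumed)
    return sd / m * 100   # steps 3 and 4: divide, then multiply by 100

# Made-up grades for two classes with the same mean but different spread.
class_a = [60, 70, 80]
class_b = [40, 70, 100]
print(round(coefficient_of_variation(class_a), 2))
print(round(coefficient_of_variation(class_b), 2))
```

Class B shows the larger coefficient of variation, i.e. the greater relative dispersion of grades.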

3) Interquartile range.

The interquartile range (IQR) is the distance between the 75th percentile and the 25th percentile. The IQR is essentially the range of the middle 50% of the data. Because it uses

the middle 50%, the IQR is not affected by outliers or extreme values.

The IQR is also equal to the length of the box in a box plot.
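A short sketch of the IQR and its insensitivity to an extreme value follows. The data are made up, and the "inclusive" quartile method is only one of several conventions, so other tools may give slightly different quartiles.

```python
from statistics import quantiles

def interquartile_range(data):
    """Distance between the 75th and 25th percentiles."""
    q1, _median, q3 = quantiles(data, n=4, method="inclusive")
    return q3 - q1

data = [1, 2, 3, 4, 5, 6, 7, 8, 9]
with_outlier = [1, 2, 3, 4, 5, 6, 7, 8, 900]   # last value replaced by an extreme one

print(interquartile_range(data))
print(interquartile_range(with_outlier))
print(max(data) - min(data), max(with_outlier) - min(with_outlier))
```

The range changes drastically when the extreme value is introduced, while the IQR stays the same.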

4) Measures of skewness and kurtosis

Skewness is a measure of symmetry, or more precisely, the lack of symmetry. A distribution,

or data set, is symmetric if it looks the same to the left and right of the center point.

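For univariate data Y1, Y2, ..., YN, the formula for skewness is:

$$\text{skewness} = \frac{\sum_{i=1}^{N} (Y_i - \bar{Y})^3 / N}{s^3}$$

where $\bar{Y}$ is the mean, $s$ is the standard deviation (computed with N in the denominator), and N is the number of data points. The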

skewness for a normal distribution is zero, and any symmetric data should have a skewness

near zero. Negative values for the skewness indicate data that are skewed left and positive

values for the skewness indicate data that are skewed right. By skewed left, we mean that

the left tail is long relative to the right tail. Similarly, skewed right means that the right tail is

long relative to the left tail. Some measurements have a lower bound and are skewed right.

For example, in reliability studies, failure times cannot be negative.

Kurtosis is a measure of whether the data are peaked or flat relative to a normal

distribution. That is, data sets with high kurtosis tend to have a distinct peak near the mean,

decline rather rapidly, and have heavy tails. Data sets with low kurtosis tend to have a flat top near the mean rather than a sharp peak. A uniform distribution would be the extreme case.

For univariate data Y1, Y2, ..., YN, the formula for kurtosis is:

$$\text{kurtosis} = \frac{\sum_{i=1}^{N} (Y_i - \bar{Y})^4 / N}{s^4}$$

where $\bar{Y}$ is the mean, $s$ is the standard deviation (computed with N in the denominator), and N is the number of data points.
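As a hedged sketch, skewness and kurtosis can be computed from central moments. This assumes the moment-based definitions with N in the denominator; other conventions (for example N - 1, or "excess" kurtosis with 3 subtracted) give different numbers. The function names and data are illustrative.

```python
from statistics import mean

def _central_moment(data, order):
    """Average of (Y_i - mean)^order over all N data points."""
    m = mean(data)
    return sum((y - m) ** order for y in data) / len(data)

def skewness(data):
    s = _central_moment(data, 2) ** 0.5   # standard deviation with N in the denominator
    return _central_moment(data, 3) / s ** 3

def kurtosis(data):
    return _central_moment(data, 4) / _central_moment(data, 2) ** 2

symmetric = [1, 2, 3, 4, 5]
right_skewed = [1, 1, 1, 2, 10]   # long right tail
print(skewness(symmetric), skewness(right_skewed))
print(kurtosis(symmetric))
```

The symmetric data set has zero skewness, and the data set with a long right tail has positive skewness.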

5) Syx: standard error of estimate of y because of x.

Let us consider y_est as the estimated value of y for a given value of x. This estimated value can be obtained from the regression curve of y on x. From this, the measure of the scatter about the regression curve is supplied by the quantity:

$$S_{y.x} = \sqrt{\frac{\sum (y - y_{est})^2}{N}}$$


The above equation is called the Standard Error of Estimate of y on x. It is important to note that this Standard Error of Estimate has properties analogous to those of the standard deviation.
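A minimal sketch of the standard error of estimate for a straight-line regression of y on x is given below. It assumes a least-squares line and a divisor of N (some textbooks use N - 2 instead); the function names and data are made up.

```python
from statistics import mean

def fit_y_on_x(xs, ys):
    """Least-squares line y = a + b * x."""
    mx, my = mean(xs), mean(ys)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    a = my - b * mx
    return a, b

def standard_error_of_estimate(xs, ys):
    """Square root of the mean squared deviation of y from its estimated value."""
    a, b = fit_y_on_x(xs, ys)
    residual_ss = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    return (residual_ss / len(xs)) ** 0.5

print(standard_error_of_estimate([1, 2, 3], [2, 4, 6]))   # points exactly on a line
print(standard_error_of_estimate([1, 2, 3], [1, 3, 2]))   # points scattered about the line
```

Like a standard deviation, the value is zero when there is no scatter about the line and grows as the scatter increases.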

6) Coefficient of determination (r^2)

The coefficient of determination, r^2, is useful because it gives the proportion of

the variance (fluctuation) of one variable that is predictable from the other variable.

It is a measure that allows us to determine how certain one can be in making

predictions from a certain model/graph.

The coefficient of determination is the ratio of the explained variation to the total

variation.

The coefficient of determination is such that 0 <= r^2 <= 1, and denotes the strength of the linear association between x and y.

The coefficient of determination represents the percent of the data that is the closest to the line of best fit. For example, if r = 0.922, then r^2 = 0.850, which means that

85% of the total variation in y can be explained by the linear relationship between x

and y (as described by the regression equation). The other 15% of the total variation

in y remains unexplained.

The coefficient of determination is a measure of how well the regression line

represents the data. If the regression line passes exactly through every point on the

scatter plot, it would be able to explain all of the variation. The further the line is

away from the points, the less it is able to explain.
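The "explained variation over total variation" definition can be sketched as follows for a least-squares line of y on x. The function name and the data points are illustrative assumptions.

```python
from statistics import mean

def coefficient_of_determination(xs, ys):
    """r^2 = explained variation / total variation for the least-squares line of y on x."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sxy / sxx
    a = my - b * mx
    explained = sum((a + b * x - my) ** 2 for x in xs)   # variation captured by the line
    total = sum((y - my) ** 2 for y in ys)               # total variation in y
    return explained / total

print(coefficient_of_determination([1, 2, 3], [2, 4, 6]))   # every point on the line
print(coefficient_of_determination([1, 2, 3], [1, 3, 2]))   # partly explained
print(round(0.922 ** 2, 3))                                  # the r = 0.922 example
```

When the line passes through every point, all of the variation is explained and the ratio is 1.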

7) Co-variance in bivariate analysis

8) Interval estimate.

An interval estimate is defined by two numbers, between which a population parameter is said to lie. For example, a < μ < b is an interval estimate of the population mean μ. It indicates that the population mean is greater than a but less than b.
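One way to obtain the two numbers a and b is sketched below. It assumes a large sample, so that a z-based interval is reasonable; for small samples a t-based interval would be used instead. The function name and the sample values are made up.

```python
from statistics import NormalDist, mean, stdev

def mean_confidence_interval(sample, confidence=0.95):
    """Large-sample (z-based) interval estimate (a, b) for the population mean."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    centre = mean(sample)                              # the point estimate
    margin = z * stdev(sample) / len(sample) ** 0.5    # z times the standard error
    return centre - margin, centre + margin

# Made-up sample of 36 observations.
sample = [50 + (i % 7) - 3 for i in range(36)]
a95, b95 = mean_confidence_interval(sample, 0.95)
a99, b99 = mean_confidence_interval(sample, 0.99)
print(round(a95, 2), round(b95, 2))
print(round(a99, 2), round(b99, 2))
```

A higher level of confidence produces a wider interval around the same point estimate.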

9) Classification, tabulation, presentation of data

Tabulation refers to the systematic arrangement of the information in rows and columns. Rows are the horizontal arrangement. In simple words, tabulation is a layout of figures in rectangular form with appropriate headings to explain different rows and columns. The main purpose of the table is to simplify the presentation and to facilitate comparisons.

"A statistical table is a systematic organisation of data in columns and rows."

"Tabulation involves the orderly and systematic presentation of numerical data in a form designed to elucidate the problem under consideration."

10) Frequency curve and histogram

A frequency curve is obtained by joining the points of a frequency polygon with a freehand smoothed curve. Unlike the frequency polygon, where the points are joined by straight lines, we join the points freehand in order to get a smoothed frequency curve. It is used to remove the ruggedness of the polygon and to present it in a good form or shape. We


smoothen the angularities of the polygon only without making any basic change in the

shape of the curve. In this case also the curve begins and ends at base line, as is in case of

polygon. Area under the curve must remain almost the same as in the case of polygon.

A histogram is a way of summarising data that are measured on an interval scale (either

discrete or continuous). It is often used in exploratory data analysis to illustrate the major

features of the distribution of the data in a convenient form. It divides up the range of

possible values in a data set into classes or groups. For each group, a rectangle is

constructed with a base length equal to the range of values in that specific group, and an

area proportional to the number of observations falling into that group. This means that the

rectangles might be drawn of non-uniform height.

The histogram is only appropriate for variables whose values are numerical and measured

on an interval scale. It is generally used when dealing with large data sets (>100

observations), when stem and leaf plots become tedious to construct. A histogram can also

help detect any unusual observations or any gaps in the data set.

11) Correlation and regression analysis

Regression analysis is the mathematical process of using observations to find the line of best fit through the data in order to make estimates and predictions about the behaviour of the variables. This line of best fit may be linear (straight) or curvilinear, following some mathematical formula.

Correlation analysis is the process of finding how well (or badly) the line fits the

observations, such that if all the observations lie exactly on the line of best fit, the

correlation is considered to be 1 or unity.

12) Yule's coefficient of association

In order to find the degree of intensity of association between two or more sets of attributes, we should work out the coefficient of association. Professor Yule's coefficient of association is:

$$Q_{AB} = \frac{(AB)(ab) - (Ab)(aB)}{(AB)(ab) + (Ab)(aB)}$$

where

Q_AB = Yule's coefficient of association between attributes A & B

(AB)=Frequency of class AB in which A & B are present

(Ab) = Frequency of class Ab in which A is present & B is absent

(aB) = Frequency of class aB in which A is absent & B is present

(ab)= Frequency of class ab in which both A & B are absent
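The formula translates directly into a small function. The argument names mirror the class frequencies defined above, and the frequencies passed in are made up for illustration.

```python
def yules_q(AB, Ab, aB, ab):
    """Yule's coefficient of association from the four class frequencies."""
    numerator = AB * ab - Ab * aB
    denominator = AB * ab + Ab * aB
    return numerator / denominator

print(yules_q(30, 20, 20, 30))   # moderate positive association
print(yules_q(10, 20, 20, 40))   # attributes independent: 0
print(yules_q(10, 0, 0, 10))     # perfect positive association: 1
print(yules_q(0, 10, 10, 0))     # perfect negative association: -1
```

The coefficient ranges from -1 to +1, with 0 indicating no association between the attributes.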


RM Assignment: RM 6

Q1 Differentiate between the following:

1. Completely randomized (CR) and randomized block (RB) experimental design.
2. Stratified sampling and cluster sampling.
3. Sampling and non-sampling errors.
4. Probability and non-probability sampling.
5. Survey and experiment.
6. Simple random sampling and systematic sampling.
7. Nominal data and ratio data.
8. Exploratory and diagnostic research.
9. Validity and reliability in attitude measurement.
10. Bias and error in research.
11. Structured and un-structured interview.
12. Latin square and factorial experimental design.
13. Principle of randomizing and principle of replication.
14. Multi-stage sampling and multi-phase sampling.
15. Informal experimental and formal experimental design.

Solution:

Q1.1 Completely randomized (CR) vs. randomized block (RB) experimental design

1. CR: a simpler design than RB.
   RB: an improvement over CR.

2. CR: involves two principles, viz. the principle of replication and the principle of randomization.
   RB: the principle of local control can be applied along with the other two principles of experimental design.

3. CR: subjects are randomly assigned to experimental treatments.
   RB: subjects are divided into groups (blocks) such that within each group the subjects are relatively homogeneous with respect to some other variable.

4. CR: analysed by 1-way ANOVA.
   RB: analysed by 2-way ANOVA.

Q1.2 Stratified sampling vs. cluster sampling

1. Stratified: used when the population from which a sample is to be drawn does not constitute a homogeneous group.
   Cluster: for bigger samples, divide the area into a number of smaller non-overlapping areas and then randomly select a number of these smaller areas (clusters).

2. Stratified: generally used to obtain a representative sample.

3. Stratified: the sampling population is divided into several sub-populations (strata) that are individually more homogeneous than the total population; items are then selected from each stratum for sampling.
   Cluster: the sample is divided into clusters, which may themselves be made up of clusters.

4. Stratified: the sample size for stratum i is

$$n_i = \frac{n \, N_i \, \sigma_i}{N_1 \sigma_1 + N_2 \sigma_2 + \dots + N_k \sigma_k}$$


5. Stratified: high cost required.
   Cluster: low cost involved.

6. Stratified: more precise.
   Cluster: less precise.
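The stratum sample-size formula in row 4 can be sketched as below, reading it as n_i = n * N_i * sigma_i / (N_1 * sigma_1 + ... + N_k * sigma_k), where N_i is the stratum size and sigma_i the stratum standard deviation. That reading, the function name, and the figures are assumptions for illustration.

```python
def optimum_allocation(n, stratum_sizes, stratum_sds):
    """Split a total sample of n across strata in proportion to N_i * sigma_i."""
    weights = [size * sd for size, sd in zip(stratum_sizes, stratum_sds)]
    total_weight = sum(weights)
    return [n * w / total_weight for w in weights]

# Made-up population of three strata.
sizes = [5000, 2000, 3000]   # N_i
sds = [15, 18, 5]            # sigma_i
allocation = optimum_allocation(84, sizes, sds)
print(allocation)            # [50.0, 24.0, 10.0]
```

Larger and more variable strata receive a larger share of the sample; in practice the results are rounded to whole units.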

Q1.3 Probability sampling vs. non-probability sampling

1. Probability: also known as random sampling or chance sampling.
   Non-probability: also known as deliberate sampling.

2. Probability: every item of the universe has an equal chance of inclusion in the sample.
   Non-probability: the organisers of the inquiry purposively choose the particular units of the universe for constituting a sample, on the basis that the small mass they select out of a huge one will be typical or representative of the whole.

3. Probability: the probability of any particular sample being selected is 1/NCn.
   Non-probability: no probability basis, e.g. quota sampling.

Q1.4 Survey vs. experiment

1. Experiment: the process of examining the truth of a statistical hypothesis relating to some research problem is known as an experiment.

2. Experiment: two types, absolute and comparative.

3. Survey: conducted in the case of descriptive research studies.
   Experiment: part of experimental research studies.

4. Survey: larger samples.
   Experiment: small samples.

5. Survey: normally used for social and behavioural sciences.
   Experiment: the researcher measures the effects of an experiment which he conducts intentionally.

6. Survey: example, field research.
   Experiment: example, laboratory research.

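Q1.5 Simple random sampling vs. systematic sampling

1. Simple random: just a random sample.
   Systematic: various systematic approaches.

2. Simple random: every entity in the universe may become part of the sample.
   Systematic: a selection logic is defined in order to have better control over the sample.

3. Simple random: low cost.
   Systematic: high cost is involved.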
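The two selection procedures can be contrasted with a minimal sketch. The sampling frame of 100 numbered units, the seed, and the function names are made up; the systematic version shown (random start, then every k-th unit) is one common variant.

```python
import random

def simple_random_sample(units, n, rng):
    """Every unit has the same chance; units are drawn without replacement."""
    return rng.sample(units, n)

def systematic_sample(units, n, rng):
    """Pick a random start within the first interval, then take every k-th unit."""
    k = len(units) // n          # sampling interval
    start = rng.randrange(k)     # random start between 0 and k - 1
    return [units[start + i * k] for i in range(n)]

units = list(range(1, 101))      # hypothetical sampling frame of 100 units
rng = random.Random(1)
srs = simple_random_sample(units, 10, rng)
sys_sample = systematic_sample(units, 10, rng)
print(sorted(srs))
print(sys_sample)
```

In the systematic sample only the starting point is random; the remaining units follow at a fixed interval.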

Q1.6 Nominal data vs. ratio data

1. Nominal: simply a system of assigning number symbols to events in order to label them.
   Ratio: has an absolute or true zero of measurement.

2. Nominal: convenient for keeping track of items.
   Ratio: represents the actual amounts of variables.

3. Nominal: only the mode can be used as a measure of central tendency.
   Ratio: geometric or harmonic means can be used as measures of central tendency.

4. Nominal: widely used in surveys.
   Ratio: used for physical measurement.

Q1.7 Exploratory research vs. diagnostic research

1. Exploratory: carried out for exploring new ideas with support.
   Diagnostic: carried out for diagnosing a certain problem.


2. Exploratory: general research leading to surveys.
   Diagnostic: extensive research involving in-depth study and statistical tools.

3. Exploratory: low to moderate cost compared to diagnostic research.
   Diagnostic: high cost compared to exploratory research.

Q1.8 Validity in attitude measurement vs. reliability in attitude measurement

Q1.9 Bias in research vs. error in research

1. Bias: may impact the results of the research.
   Error: impacts the results of the research a great deal.

2. Bias: related to attitude.
   Error: related to the system.

Q1.10 Structured interview vs. un-structured interview

1. Structured: involves a set of predetermined questions.
   Un-structured: questions are not fixed.

2. Structured: highly standardised techniques of recording.
   Un-structured: normal standards for recording.

3. Structured: rigid procedure for the interview.
   Un-structured: freedom to conduct the interview.

4. Structured: question order is fixed.
   Un-structured: question sequence may be changed.

Q1.11 Latin square vs. factorial experimental design

1. Latin square: very frequently used in agricultural research.
   Factorial: used in experiments where the effects of varying more than one factor are to be determined.

2. Latin square: assumes that there is no interaction between the row factor and the column factor.
   Factorial: there is interaction between the row and column entities.

3. Latin square: the number of rows and columns is required to be equal.
   Factorial: more complex problems are examined, with multiple rows and columns.

4. Latin square: accuracy is low compared to factorial design.
   Factorial: provides equivalent accuracy with less labour and as such is a source of economy.

Q1.12 Principle of randomizing vs. principle of replication

1

2

Q1.13 Multi-stage sampling vs. multi-phase sampling

1. Multi-stage: a further development of cluster sampling.

2. Multi-stage: easier to administer.

3. Multi-stage: a large number of units can be sampled for a given cost.


Q1.14 Informal experimental design vs. formal experimental design

1. Informal: of 3 types:
   - Before and after without control design
   - After only with control design
   - Before and after with control design
   Formal: of 4 types:
   - Completely randomized design (CR)
   - Randomized block design (RB)
   - Latin square design (LS)
   - Factorial design

2. Informal: less sophisticated.
   Formal: offers more control.

3. Informal: based on differences of magnitude.
   Formal: uses precise statistical procedures for analysis.

Q2 Justify the following statements:

1. Quota sampling is a non-probability sampling.
2. We don't need hypothesis firmed up in diagnostic research.
3. Wording of questionnaire can cause ineffective instrument.
4. In Latin square experimental design it is assumed that factors are independent of each other.
5. Stratified sampling method assumes strata to be homogeneous within and heterogeneous between.
6. Convenience sampling is a method of probability sampling.
7. Semantic differential scale requires identifying bi-polar adjectives describing the object.
8. Likert scale is a summative model for attitude measurement.
9. Principle of replication in experimental design is aimed at increasing statistical accuracy.
10. Principle of local control in experimental design is identifying effect of known source of variation in data.
11. Non-sampling errors cannot be totally avoided in research.
12. Word association test is a projective method of data collection.
13. Defining the problem involves identifying unit of analysis and characteristic of interest, time and space references and environmental conditions.
14. Projective methods of data collection are used for inferred characteristics.
15. On ordinal data, we can do all mathematical operations.
16. Optimal sample size is based on degree of accuracy and level of confidence expected.
17. Cluster sampling needs each cluster to be homogeneous between and heterogeneous within.
18. Systematic sampling is not truly probability sampling.
19. Parameters of quality data are same whether it is primary data or secondary data.
20. We firm up hypothesis based on exploratory, descriptive and diagnostic research.

Solution:

1) Quota sampling is a non-probability sampling.

The first step in quota sampling is to divide the population into mutually exclusive
subgroups. Then, the researcher must identify the proportions of these subgroups in the
population; this same proportion will be applied in the sampling process. Finally, the

researcher selects subjects from the various subgroups while taking into consideration the
proportions noted in the previous step. This final step aims to make the sample match the
population on the chosen subgroup characteristics, and it allows the researcher to study
traits and characteristics that are noted for each subgroup. Because subjects within each
subgroup are picked by the researcher's judgement or convenience rather than by random
selection, the probability of any unit being selected is unknown; hence quota sampling is
called non-probability sampling.
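The three steps described for quota sampling can be sketched in Python. This is only an illustration under stated assumptions: the function name `quota_sample`, the subgroup labels "A" and "B", and the 60/40 split are hypothetical and do not come from the assignment. Subjects are taken in the order the researcher happens to meet them, which is what makes the method non-probabilistic.

```python
from collections import Counter


def quota_sample(encountered, group_of, proportions, sample_size):
    """Fill a quota per subgroup, taking subjects in the order they are met.

    encountered : iterable of subjects, in the order the researcher meets them
    group_of    : function mapping a subject to its subgroup label
    proportions : dict of subgroup label -> share of the population (sums to 1)
    """
    # Step 2: turn population proportions into per-subgroup quotas.
    quotas = {g: round(share * sample_size) for g, share in proportions.items()}
    sample = []
    # Step 3: accept each subject only while its subgroup quota is still open.
    for subject in encountered:
        g = group_of(subject)
        if quotas.get(g, 0) > 0:
            sample.append(subject)
            quotas[g] -= 1
    return sample


# Hypothetical population: 60 people in subgroup "A", 40 in subgroup "B".
people = [("A", i) for i in range(60)] + [("B", i) for i in range(40)]
sample = quota_sample(people, lambda p: p[0], {"A": 0.6, "B": 0.4}, 10)
counts = Counter(g for g, _ in sample)
```

The sample reproduces the 60/40 split, yet anyone met after a quota is full has no chance of selection, so selection probabilities cannot be stated.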

2) We don't need hypothesis firmed up in diagnostic research.

Diagnostic research (DR) aims to identify the causes of a problem and its possible solutions, so the hypothesis does not need to be firmed up beforehand.

3) Wording of questionnaire can cause ineffective instrument.

Consistent wording and ordering of questions ensure that each respondent receives the same
stimuli; otherwise the purpose of the survey will not be served.

4) In Latin square experimental design it is assumed that factors are independent of each other.

A Latin square is used in experimental designs in which one wishes to compare
treatments and to control for two other known sources of variation. It was recognized
that within a field there would be fertility trends running both across the field and up
and down the field. So in an experiment to test, say, four different fertilizers, A, B, C and
D, the field would be divided into four horizontal strips and four vertical strips, thus
producing 16 smaller plots. A Latin square design gives a random allocation of
fertilizer type to plots in such a way that each fertilizer type is used once in each
horizontal strip (row) and once in each vertical strip (column). The analysis assumes that
the row, column and treatment factors do not interact, which is the sense in which the
factors are taken to be independent of each other.
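The fertilizer layout described above can be sketched as follows. This is a minimal illustration, not a standard library routine: the helper names `latin_square` and `is_latin` are hypothetical, and shuffling the rows and columns of a cyclic square is one simple way to randomize that does not draw uniformly from all possible Latin squares.

```python
import random


def latin_square(treatments, seed=None):
    """Return an n x n layout with each treatment once per row and column."""
    rng = random.Random(seed)
    n = len(treatments)
    # Cyclic square: row r is the treatment list shifted left by r positions.
    square = [[treatments[(r + c) % n] for c in range(n)] for r in range(n)]
    # Permuting whole rows, then whole columns, preserves the Latin property.
    rng.shuffle(square)
    cols = list(range(n))
    rng.shuffle(cols)
    return [[row[c] for c in cols] for row in square]


def is_latin(square):
    """Check that every row and every column contains each symbol exactly once."""
    n = len(square)
    symbols = set(square[0])
    rows_ok = all(len(row) == n and set(row) == symbols for row in square)
    cols_ok = all({row[c] for row in square} == symbols for c in range(n))
    return len(symbols) == n and rows_ok and cols_ok


# Four fertilizers on a field split into 4 rows x 4 columns (16 plots).
layout = latin_square(["A", "B", "C", "D"], seed=1)
for row in layout:
    print(" ".join(row))
```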

5) Stratified sampling method assumes strata to be homogeneous within and heterogeneous between.

6) Convenience sampling is a method of probability sampling.

Convenience sampling is in fact a non-probability sampling technique, in which subjects are
selected because of their convenient accessibility and proximity to the researcher.

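7) Semantic differential scale requires identifying bi-polar adjectives describing the object.

Yes. The semantic differential is a type of rating scale designed to measure the connotative
meaning of objects, events, and concepts. Respondents rate the object on scales whose two
ends are anchored by opposite adjectives (for example, good and bad), so these bi-polar
adjectives must be identified first.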

8) Likert scale is a summative model for attitude measurement.

Likert (1932) developed the principle of measuring attitudes by asking people to respond
to a series of statements about a topic, in terms of the extent to which they agree with
them, and so tapping into the cognitive and affective components of attitudes. The scores
on the individual statements are then summed to give an overall attitude score, which is
why the Likert scale is called a summated rating scale.
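The "summative" part of the statement can be illustrated with a short sketch. It assumes a 5-point agreement scale (1 = strongly disagree, 5 = strongly agree) and allows for negatively worded statements that are reverse-scored; the function name `likert_score` and the example responses are hypothetical.

```python
def likert_score(responses, reverse_items=(), points=5):
    """Sum item responses into one attitude score.

    responses     : list of integer answers, each between 1 and `points`
    reverse_items : indexes of negatively worded statements to reverse-score
    """
    total = 0
    for index, answer in enumerate(responses):
        if not 1 <= answer <= points:
            raise ValueError(f"answer {answer} is outside 1..{points}")
        # A reversed item maps 1 -> points, 2 -> points - 1, and so on.
        total += (points + 1 - answer) if index in reverse_items else answer
    return total


# Four statements; the third (index 2) is negatively worded.
score = likert_score([4, 5, 2, 4], reverse_items={2})
```

Here the reversed third answer counts as 4, so the total is 4 + 5 + 4 + 4 = 17 out of a possible 20.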

9) Principle of replication in experimental design is aimed at increasing statistical accuracy.

Measurements are usually subject to variation and uncertainty. Measurements are
repeated and full experiments are replicated to help identify the sources of variation, to
better estimate the true effects of treatments, to further strengthen the experiment's
reliability and validity, and to add to the existing knowledge about the topic.[13]

However, certain conditions must be met before the replication of the experiment is

commenced: the original research question has been published in a peer-reviewed

journal or widely cited, the researcher is independent of the original experiment, the

researcher must first try to replicate the original findings using the original data, and the

write-up should state that the study conducted is a replication study that tried to follow

the original study as strictly as possible.

10) Principle of local control in experimental design is identifying effect of known source of variation in data.

Local control refers to grouping of the experimental units in such a way that the units

within a group (i.e., block) are more homogeneous than are units in different groups.

The experimental materials or conditions are more alike within a group. Thus, the

variation among experimental units within a group is less than the variation would have

been without grouping.

11) Non-sampling errors cannot be totally avoided in research.

Non-sampling errors are part of the total error that can arise from doing a statistical

analysis. The remainder of the total error arises from sampling error. Unlike sampling

error, increasing the sample size will not have any effect on reducing non-sampling

error. Unfortunately, it is virtually impossible to eliminate non-sampling errors entirely.

12) Word association test is a projective method of data collection.

Word Association Test: An individual is given a clue or hint and asked to respond with the
first thing that comes to mind. The association can take the shape of a picture or a word.
There can be many interpretations of the same thing. The respondent is given a list of
words without knowing which of them the researcher is most interested in.

13) Defining the problem involves identifying unit of analysis and characteristic of interest, time and space references and environmental conditions.

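14) Projective methods of data collection are used for inferred characteristics.

Projective methods rest on the idea that an individual puts structure on an ambiguous
situation in a way that is consistent with their own conscious and unconscious needs, so
characteristics that cannot be observed directly are inferred from the responses.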

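15) On ordinal data, we can do all mathematical operations.

Ordinal data is the second level of measurement. The experimental (scientific)
method depends on physically measuring things, and the concept of measurement has been
developed in conjunction with the concepts of numbers and units of measurement.
Statisticians categorize measurements according to levels, and each level corresponds to
how the measurement can be treated mathematically. Ordinal values can be ranked and
compared, but the intervals between ranks are not known to be equal, so arithmetic such as
addition or averaging is not meaningful. Hence not all mathematical operations can be done
on ordinal data.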

16) Optimal sample size is based on degree of accuracy and level of confidence expected.
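One common way to see how accuracy and confidence drive sample size is the textbook formula for estimating a proportion, n = z² p(1 − p) / e², where e is the acceptable margin of error and z is the normal quantile for the chosen confidence level. The sketch below is an illustration under those assumptions, not a method taken from the assignment; the function name is hypothetical, and p = 0.5 is the usual conservative default when the true proportion is unknown.

```python
import math
from statistics import NormalDist


def sample_size_for_proportion(confidence, margin_of_error, p=0.5):
    """Smallest n so that a proportion is estimated within +/- margin_of_error.

    confidence      : e.g. 0.95 for a 95% confidence level
    margin_of_error : desired accuracy, e.g. 0.05 for +/- 5 percentage points
    p               : assumed population proportion (0.5 gives the largest n)
    """
    # Two-sided z value: 95% confidence leaves 2.5% in each tail.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil(z ** 2 * p * (1 - p) / margin_of_error ** 2)


n_95 = sample_size_for_proportion(0.95, 0.05)
n_99 = sample_size_for_proportion(0.99, 0.05)
```

Raising the confidence level or tightening the margin of error both increase the required sample size, which is the dependence the statement describes.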

17) Cluster sampling needs each cluster to be homogeneous between and heterogeneous within.

A common motivation for cluster sampling is to reduce the average cost per interview.

Given a fixed budget, this can allow an increased sample size.

18) Systematic sampling is not truly probability sampling.

Systematic sampling is still thought of as being random, as long as the periodic interval is
determined beforehand and the starting point is random. For example, if you wanted to
select a random group of 1,000 people from a population of 50,000 using systematic
sampling, you would simply select every 50th person, since 50,000/1,000 = 50.
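The 50,000-to-1,000 example can be sketched in a few lines. The function name `systematic_sample` is hypothetical, and the population is stood in for by the integers 0 to 49,999; the sketch assumes the population size is at least the sample size.

```python
import random


def systematic_sample(population, sample_size, seed=None):
    """Pick every k-th unit after a random start, where k = N // n."""
    if not 0 < sample_size <= len(population):
        raise ValueError("sample_size must be between 1 and the population size")
    rng = random.Random(seed)
    k = len(population) // sample_size   # sampling interval, fixed beforehand
    start = rng.randrange(k)             # random starting point in the first interval
    return [population[start + i * k] for i in range(sample_size)]


population = list(range(50_000))
sample = systematic_sample(population, 1_000, seed=0)
interval = sample[1] - sample[0]
```

The interval works out to 50, and only the starting point is left to chance, which is why the random start matters for treating the method as probability sampling.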

19) Parameters of quality data are same whether it is primary data or secondary data.

Data that has been collected from first-hand experience is known as primary data.
Primary data has not been published yet and is more reliable, authentic and objective.
Primary data has not been changed or altered by human beings, therefore its validity is
greater than that of secondary data. The review of literature in any research is based on
secondary data, mostly from books, journals and periodicals.

20) We firm up hypothesis based on exploratory, descriptive and diagnostic research.
