SRIT / PICM105 – SFM / Statistics for Management

SRIT / M & H / M. Vijaya Kumar

SRI RAMAKRISHNA INSTITUTE OF TECHNOLOGY

(AN AUTONOMOUS INSTITUTION)

COIMBATORE- 641010

PICM105 & STATISTICS FOR MANAGEMENT

TESTING OF HYPOTHESIS

History:

In inferential statistics, the null hypothesis is a general statement or default position

that there is no relationship between two measured phenomena, or no association among

groups. Testing (accepting, approving, rejecting, or disproving) the null hypothesis - and

thus concluding that there are or are not grounds for believing that there is a relationship

between two phenomena (e.g. that a potential treatment has a measurable effect) - is a

central task in the modern practice of science; the field of statistics gives precise criteria

for rejecting a null hypothesis.

The null hypothesis is generally assumed to be true until evidence indicates

otherwise. In statistics, it is often denoted $H_0$.

The concept of a null hypothesis is used differently in two approaches to statistical

inference. In the significance testing approach of Ronald Fisher, a null hypothesis is

rejected if the observed data are significantly unlikely to have occurred if the null

hypothesis were true. In this case the null hypothesis is rejected and an alternative

hypothesis is accepted in its place. If the data are consistent with the null hypothesis, then

the null hypothesis is not rejected. In neither case is the null hypothesis or its alternative

proven; the null hypothesis is tested with data and a decision is made based on how likely

or unlikely the data are. This is analogous to the legal principle of presumption of

innocence, in which a suspect or defendant is assumed to be innocent (null is not rejected)

until proven guilty (null is rejected) beyond a reasonable doubt (to a statistically

significant degree).

In the hypothesis testing approach of Jerzy Neyman and Egon Pearson, a null

hypothesis is contrasted with an alternative hypothesis and the two hypotheses are

distinguished on the basis of data, with certain error rates.




Hypothesis testing

Hypothesis testing or significance testing is a method for testing a claim or

hypothesis about a parameter in a population, using data measured in a sample. In this

method, we test some hypothesis by determining the likelihood that a sample statistic

could have been selected, if the hypothesis regarding the population parameter were true.

A statistical hypothesis is an assumption about any aspect of a population. It could concern

the parameters of a distribution describing the population (such as the mean of a normal

distribution), the parameters of two or more populations, or the correlation or association

between two or more characteristics of a population, such as age and height.

Hypothesis is an integral part of any research or investigation. Many a time, initially

experiments or investigations are carried out to test a hypothesis, and the ultimate

decisions are taken on the basis of the collected information and the result of the test.

Level of Significance:

The level of significance is defined as the probability of rejecting a null hypothesis by

the test when it is really true, and is denoted by $\alpha$. That is, $\alpha = P(\text{reject } H_0 \mid H_0 \text{ is true})$.

Critical region:

The critical region is the region of values that corresponds to the rejection of the null

hypothesis at some chosen probability level.
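As a rough illustration of how the level of significance fixes the critical region, the sketch below computes critical values of the standard normal distribution with Python's standard library. The function name `z_critical` and its `tail` argument are illustrative choices and not part of these notes.

```python
from statistics import NormalDist


def z_critical(alpha, tail="two"):
    """Critical value of the standard normal distribution for a given alpha."""
    std_normal = NormalDist()  # mean 0, standard deviation 1
    if tail == "two":
        # alpha is split equally between the two tails
        return std_normal.inv_cdf(1 - alpha / 2)
    if tail == "right":
        return std_normal.inv_cdf(1 - alpha)
    if tail == "left":
        return std_normal.inv_cdf(alpha)
    raise ValueError("tail must be 'two', 'left' or 'right'")


for alpha in (0.05, 0.01):
    print(alpha, round(z_critical(alpha), 3), round(z_critical(alpha, "right"), 3))
```

For a level of 0.05 this gives approximately 1.96 (two-tailed) and 1.645 (one-tailed), which are the table values used in the problems that follow.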

Parameters

Parameters are numbers that summarize data for an entire population.

Statistics

Statistics are numbers that summarize data from a sample, i.e. some subset of the

entire population.

Example:

A researcher wants to estimate the average height of women aged 20 years or older.

From a simple random sample of 45 women, the researcher obtains a sample mean height

of 63.9 inches.

The parameter is the average height of all women aged 20 years or older.

The statistic is the average height of 63.9 inches from the sample of 45 women.



Null Hypothesis

The null hypothesis, denoted $H_0$, is a statistical hypothesis that states that there is no

difference between a parameter and a specific value, or that there is no difference between

two parameters.

Alternative Hypothesis

The alternative hypothesis, denoted $H_1$, is a statistical hypothesis that states that there is

a difference between a parameter and a specific value, or that there is a difference between

two parameters.

When we make a conclusion from a statistical test there are two types of errors that

we could make. They are called: Type I and Type II Errors.

Type I error: Reject $H_0$ when $H_0$ is true.

Type II error: Accept $H_0$ when $H_0$ is false.

              $H_0$ is true       $H_0$ is false

Reject $H_0$    Type I Error        Correct decision

Accept $H_0$    Correct decision    Type II Error
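The meaning of a Type I error can be checked by simulation. The sketch below is only an illustration under assumed settings (a standard normal population, samples of size 30, a two-tailed test with critical value 1.96); the function name `type_one_error_rate` is hypothetical. It repeatedly samples from a population in which $H_0$ is true and counts how often the test rejects anyway.

```python
import random
from math import sqrt


def type_one_error_rate(trials=2000, n=30, z_crit=1.96, seed=0):
    """Estimate how often a true H0 (mu = 0, sigma = 1) is rejected."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    rejections = 0
    for _ in range(trials):
        # Draw a sample from a population where H0 really is true.
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        sample_mean = sum(sample) / n
        z = (sample_mean - 0.0) / (1.0 / sqrt(n))
        if abs(z) > z_crit:  # test value falls in the two-tailed critical region
            rejections += 1
    return rejections / trials


print(type_one_error_rate())
```

The printed proportion should be close to 0.05, the level of significance that corresponds to the critical value 1.96.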

One-tailed test:

A one-tailed test indicates that the null hypothesis should be rejected when the test

value is in the critical region on one side of the distribution.

Left-tailed test – when the critical region is on the left side of the distribution of the test value.

Right-tailed test – when the critical region is on the right side of the distribution of the test value.

Two-tailed test:

In a two-tailed test, the null hypothesis should be rejected when the test value is in either of

two critical regions, one on each side of the distribution of the test value.

One sample test for mean of large samples (z-test)

A normally distributed population, variance known:

$$z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}}$$

A normally distributed population, variance unknown:

$$z = \frac{\bar{x} - \mu}{s / \sqrt{n}}$$

where $\bar{x}$ = Sample mean; $\mu$ = Population mean; $n$ = Sample size;

$\sigma$ = Population Standard Deviation; $s$ = Sample Standard Deviation

Problem: 1

The mean lifetime of a sample of 100 light tubes produced by a company is found to

be 1580 hours with standard deviation of 90 hours. Test the hypothesis that the mean

lifetime of the tubes produced by the company is 1600 hours.

Answer:

Given data: $n = 100$, $\bar{x} = 1580$, $s = 90$ and $\mu = 1600$.

Null Hypothesis $H_0$: $\mu = 1600$

Alternative Hypothesis $H_1$: $\mu \neq 1600$ (Two tailed test)

Test statistic:

A normally distributed population, large sample (the sample standard deviation is used in place of $\sigma$):

$$z = \frac{\bar{x} - \mu}{s / \sqrt{n}} = \frac{1580 - 1600}{90 / \sqrt{100}} = -2.22$$

Table value:

$z_\alpha$ at 5% significance level is 1.96.

Decision rule:

$|z| = 2.22 > 1.96$, so we reject $H_0$.

Conclusion:

The mean lifetime of the tubes produced by the company is not 1600 hours.
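The arithmetic of Problem 1 can be checked with a short script. This is a minimal sketch using only the Python standard library; the helper names `one_sample_z` and `reject_h0` are illustrative assumptions, and the critical value is computed from the normal distribution rather than read from a table.

```python
from math import sqrt
from statistics import NormalDist


def one_sample_z(x_bar, mu0, sd, n):
    """z statistic for a sample mean against a hypothesised mean mu0."""
    return (x_bar - mu0) / (sd / sqrt(n))


def reject_h0(z, alpha, tail="two"):
    """True if z falls in the critical region for the chosen tail."""
    std_normal = NormalDist()
    if tail == "two":
        return abs(z) > std_normal.inv_cdf(1 - alpha / 2)
    if tail == "right":
        return z > std_normal.inv_cdf(1 - alpha)
    if tail == "left":
        return z < std_normal.inv_cdf(alpha)
    raise ValueError("tail must be 'two', 'left' or 'right'")


# Problem 1: n = 100, sample mean 1580, standard deviation 90, H0: mu = 1600
z = one_sample_z(1580, 1600, 90, 100)
print(round(z, 2), reject_h0(z, 0.05))
```

The same two helpers cover the one-tailed problems by passing `tail="right"` or `tail="left"`.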

Problem: 2

The mean breaking strength of the cables supplied by a manufacturer is 1800 with an

S.D of 100. By a new technique in the manufacturing process, it is claimed that the breaking

strength of the cables has increased. To test this claim, a sample of 50 cables is tested and it is

found that the mean breaking strength is 1850. Can we support the claim at 1% level of

significance?

Answer:

Given data: $n = 50$, $\bar{x} = 1850$, $\sigma = 100$ and $\mu = 1800$.

Null Hypothesis $H_0$: $\mu = 1800$

Alternative Hypothesis $H_1$: $\mu > 1800$ (Right Tailed)

Test statistic:

$$z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}} = \frac{1850 - 1800}{100 / \sqrt{50}} = 3.54$$

Table value:

$z_\alpha$ at 1% significance level is 2.326.

Decision rule:

$z = 3.54 > 2.326$, so we reject $H_0$.

Conclusion:

We conclude that the mean breaking strength of the cables has increased.

Problem: 3

A sample of 100 students is taken from a large population. The mean height of the

students in this sample is 160 cm. Can it be reasonably regarded that this sample is from a

population of mean 165 cm and S.D 10 cm?

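Answer:

Given data: $n = 100$, $\bar{x} = 160$, $\sigma = 10$ and $\mu = 165$.

Null Hypothesis $H_0$: $\mu = 165$

Alternative Hypothesis $H_1$: $\mu \neq 165$ (Two Tailed)

Test statistic:

$$z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}} = \frac{160 - 165}{10 / \sqrt{100}} = -5$$

Table value:

$z_\alpha$ at 5% significance level is 1.96.

Decision rule:

$|z| = 5 > 1.96$, so we reject $H_0$.

Conclusion:

The sample is not drawn from a population with mean 165 cm.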

Problem: 4

A sample of 900 members has a mean of 3.4 cm and S.D 2.61 cm. Is the sample from a

large population of mean 3.25 cm and S.D 2.61 cm? Assume that the population is normal

and its mean is unknown.

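Answer:

Given data: $n = 900$, $\bar{x} = 3.4$, $\sigma = 2.61$ and $\mu = 3.25$.

Null Hypothesis $H_0$: $\mu = 3.25$

Alternative Hypothesis $H_1$: $\mu \neq 3.25$ (Two Tailed)

Test statistic:

$$z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}} = \frac{3.4 - 3.25}{2.61 / \sqrt{900}} = 1.72$$

Table value:

$z_\alpha$ at 5% significance level is 1.96.

Decision rule:

$|z| = 1.72 < 1.96$, so we accept $H_0$.

Conclusion:

The sample is drawn from the population with mean 3.25 cm.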

Problem: 5

A cosmetics company fills its best-selling 8 ounce jars of facial cream by an automatic

dispensing machine. The machine is set to dispense a mean of 8.1 ounces per jar.

Uncontrollable factors in the process can shift the mean away from 8.1 and cause either

under fill or overfill, both of which are undesirable. In such a case the dispensing machine

is stopped and recalibrated. Regardless of the mean amount dispensed, the standard

deviation of the amount dispensed always has value 0.22 ounce. A quality control engineer

routinely selects 30 jars from the assembly line to check the amounts filled. On one

occasion, the sample mean is 8.2 ounces and the sample standard deviation is 0.25 ounce.

Determine if there is sufficient evidence in the sample to indicate, at the 1% level of

significance, that the machine should be recalibrated.

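Answer:

Given data: $n = 30$, $\bar{x} = 8.2$, $s = 0.25$ and $\mu = 8.1$.

Null Hypothesis $H_0$: $\mu = 8.1$

Alternative Hypothesis $H_1$: $\mu \neq 8.1$ (Two Tailed)

Test statistic:

A normally distributed population, variance unknown (large sample):

$$z = \frac{\bar{x} - \mu}{s / \sqrt{n}} = \frac{8.2 - 8.1}{0.25 / \sqrt{30}} = 2.19$$

Table value:

$z_\alpha$ at 1% significance level is 2.58.

Decision rule:

$|z| = 2.19 < 2.58$, so we accept $H_0$.

Conclusion:

We conclude that the machine does not need to be recalibrated.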

Problem: 6

It is hoped that a newly developed pain reliever will more quickly produce perceptible

reduction in pain to patients after minor surgeries than a standard pain reliever. The

standard pain reliever is known to bring relief in an average of 3.5 minutes with standard

deviation 2.1 minutes. To test whether the new pain reliever works more quickly than the

standard one, 50 patients with minor surgeries were given the new pain reliever and their

times to relief were recorded. The experiment yielded sample mean minutes and

sample standard deviation 1.5 minutes. Is there sufficient evidence in the sample to

indicate, at the 5% level of significance, that the newly developed pain reliever does deliver

perceptible relief more quickly?

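Answer:

Given data: $n = 50$, $s = 1.5$, $\mu = 3.5$ and the sample mean $\bar{x}$ recorded in the experiment.

Null Hypothesis $H_0$: $\mu = 3.5$

Alternative Hypothesis $H_1$: $\mu < 3.5$ (Left tailed test)

Test statistic:

$$z = \frac{\bar{x} - \mu}{s / \sqrt{n}} = \frac{\bar{x} - 3.5}{1.5 / \sqrt{50}}$$

Table value:

$z_\alpha$ at 5% significance level is $-1.645$.

Decision rule:

$z < -1.645$, so we reject $H_0$.

Conclusion:

The newly developed pain reliever delivers perceptible relief more quickly.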

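Testing of Hypothesis about Difference between two Means

The test statistic from normally distributed populations with known variances

(large samples)

$$z = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}$$

where $\bar{x}_1, \bar{x}_2$ = Sample means; $n_1, n_2$ = Sample sizes;

$\sigma_1, \sigma_2$ = Population Standard Deviations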

Problem: 7

The buyer of electric bulbs bought 100 bulbs each of two famous brands. Upon

testing these he found that brand A had a mean life of 1500 hours with a standard

deviation of 50 hours whereas brand B had a mean life of 1530 hours with a standard

deviation of 60 hours. Can it be concluded at 5% level of significance, that the two brands

differ significantly in quality?

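Solution:

Given data: $n_1 = 100$, $\bar{x}_1 = 1500$, $s_1 = 50$ and $n_2 = 100$, $\bar{x}_2 = 1530$, $s_2 = 60$.

Null Hypothesis $H_0$: $\mu_1 = \mu_2$

Alternative Hypothesis $H_1$: $\mu_1 \neq \mu_2$ (Two tailed)

Test statistic:

The test statistic is given by

$$z = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}} = \frac{1500 - 1530}{\sqrt{\dfrac{50^2}{100} + \dfrac{60^2}{100}}} = -3.84$$

Table value:

$z_\alpha$ at 5% significance level is 1.96.

Decision rule:

$|z| = 3.84 > 1.96$, so we reject $H_0$.

Conclusion:

The two brands of bulbs differ significantly in quality.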
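The two-sample statistic used in Problem 7 can be checked numerically. The sketch below is an illustration only; the function name `two_sample_z` is an assumption, and the sample standard deviations are used in place of the population values, which is reasonable only for large samples.

```python
from math import sqrt


def two_sample_z(mean1, sd1, n1, mean2, sd2, n2):
    """z statistic for the difference between two large-sample means."""
    standard_error = sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    return (mean1 - mean2) / standard_error


# Problem 7: brand A (1500, 50, 100) against brand B (1530, 60, 100)
z = two_sample_z(1500, 50, 100, 1530, 60, 100)
print(round(z, 2))  # compare |z| with 1.96 for a two-tailed test at 5%
```

Because the statistic is signed, a left-tailed or right-tailed test compares `z` itself, not its absolute value, with the table value.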

Problem: 8

Intelligence test given to two groups of boys and girls gave the following information

         Mean Score    S.D    Number

Girls    75            10     50

Boys     70            12     100

Is the difference in the mean scores of boys and girls statistically significant?

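Answer:

Given data: $n_1 = 50$, $\bar{x}_1 = 75$, $s_1 = 10$ (girls) and $n_2 = 100$, $\bar{x}_2 = 70$, $s_2 = 12$ (boys).

Null Hypothesis $H_0$: $\mu_1 = \mu_2$

Alternative Hypothesis $H_1$: $\mu_1 \neq \mu_2$ (Two tailed test)

Test statistic:

$$z = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}} = \frac{75 - 70}{\sqrt{\dfrac{10^2}{50} + \dfrac{12^2}{100}}} = 2.70$$

Table value:

$z_\alpha$ at 5% significance level is 1.96.

Decision rule:

$|z| = 2.70 > 1.96$, so we reject $H_0$.

Conclusion:

The difference in the mean scores of boys and girls is statistically significant.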

Problem: 9

A simple sample of heights of 6400 Englishmen has a mean of 170 cm and S.D of 6.4

cm, while a simple sample of heights of 1600 Americans has mean of 172 cm and S.D of 6.3

cm. Do the data indicate that Americans are on the average taller than Englishmen?

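Solution:

Given that $n_1 = 6400$, $\bar{x}_1 = 170$, $s_1 = 6.4$ (Englishmen)

and $n_2 = 1600$, $\bar{x}_2 = 172$, $s_2 = 6.3$ (Americans).

Null Hypothesis $H_0$: $\mu_1 = \mu_2$

i.e., there is no significant difference between the heights of Americans and Englishmen.

Alternative Hypothesis $H_1$: $\mu_1 < \mu_2$ (Left Tailed)

Test statistic:

$$z = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}} = \frac{170 - 172}{\sqrt{\dfrac{6.4^2}{6400} + \dfrac{6.3^2}{1600}}} = -11.32$$

Table value:

$z_\alpha$ at 5% significance level is $-1.645$.

Decision rule:

$z = -11.32 < -1.645$, so we reject $H_0$.

Conclusion:

Hence Americans are, on the average, taller than Englishmen.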

Problem: 10

In a certain factory there are two independent processes manufacturing the same

item. The average weight in a sample of 250 items produced from one process is found to

be 120 Ozs, with a s.d of 12 Ozs, while the corresponding figures in a sample of 400 items

from the other process are 124 Ozs and 14 Ozs. Is the difference between the two sample

means significant?

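Answer:

Given data: $n_1 = 250$, $\bar{x}_1 = 120$, $s_1 = 12$ and $n_2 = 400$, $\bar{x}_2 = 124$, $s_2 = 14$.

Null Hypothesis $H_0$: $\mu_1 = \mu_2$

i.e., the sample means do not differ significantly.

Alternative Hypothesis $H_1$: $\mu_1 \neq \mu_2$ (Two tailed)

Test statistic:

$$z = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}} = \frac{120 - 124}{\sqrt{\dfrac{12^2}{250} + \dfrac{14^2}{400}}} = -3.87$$

Table value:

$z_\alpha$ at 5% significance level is 1.96.

Decision rule:

$|z| = 3.87 > 1.96$, so we reject $H_0$.

Conclusion:

That is, there is a significant difference between the sample means.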

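One sample test for mean of small samples (t-test)

$$t = \frac{\bar{x} - \mu}{s / \sqrt{n - 1}}$$

where $\bar{x}$ = Sample mean; $\mu$ = Population mean; $n$ = Sample size;

$s$ = Sample Standard Deviation (computed with divisor $n$). If $s$ is computed with divisor $n - 1$ instead, the equivalent form is $t = \dfrac{\bar{x} - \mu}{s / \sqrt{n}}$. The statistic has $n - 1$ degrees of freedom.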

Problem: 11

A random sample of 10 boys has the following IQ’s 70, 83, 88, 95, 98, 100, 101, 107, 110

and 120. Do these data support the assumption of a population mean IQ of 100 at 5% level

of significance?

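Answer:

To find mean and variance

$x$: 70 83 88 95 98 100 101 107 110 120 (Total: 972)

$(x - \bar{x})^2$: 739.84 201.64 84.64 4.84 0.64 7.84 14.44 96.04 163.84 519.84 (Total: 1833.6)

$$\bar{x} = \frac{\sum x}{n} = \frac{972}{10} = 97.2, \qquad s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}} = \sqrt{\frac{1833.6}{9}} = 14.27$$

Given data: $n = 10$, $\bar{x} = 97.2$, $s = 14.27$ and $\mu = 100$.

Null Hypothesis $H_0$: $\mu = 100$

Alternative Hypothesis $H_1$: $\mu \neq 100$ (Two tailed)

Test statistic

$$t = \frac{\bar{x} - \mu}{s / \sqrt{n}} = \frac{97.2 - 100}{14.27 / \sqrt{10}} = -0.62$$

Table value:

$t_\alpha$ at 5% significance level with $n - 1 = 10 - 1 = 9$ degrees of freedom is 2.262.

Decision rule:

$|t| = 0.62 < 2.262$, so we accept $H_0$.

Conclusion:

The given data supports the assumption of a population mean IQ of 100.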
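The computation in Problem 11 can be reproduced from the raw IQ scores. The sketch below is a hedged illustration: `one_sample_t` is a hypothetical helper name, `statistics.stdev` uses the divisor $n - 1$, and the table value 2.262 is taken from the notes rather than computed, since the Python standard library has no t-distribution.

```python
from math import sqrt
from statistics import mean, stdev


def one_sample_t(data, mu0):
    """Return the t statistic and its degrees of freedom for H0: mu = mu0."""
    n = len(data)
    # stdev() divides by n - 1, so the standard error is s / sqrt(n)
    t = (mean(data) - mu0) / (stdev(data) / sqrt(n))
    return t, n - 1


iq_scores = [70, 83, 88, 95, 98, 100, 101, 107, 110, 120]
t, df = one_sample_t(iq_scores, 100)
table_value = 2.262  # two-tailed 5% value for 9 degrees of freedom, from the table
print(round(t, 2), df, abs(t) > table_value)
```

Calling the same function on the heights of Problem 12 with `mu0 = 64` gives the statistic for that problem, which is then compared with the one-tailed table value.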



Problem: 12

The heights of 10 males of a given locality are found to be 70, 67, 62, 68, 61, 68, 70,

64, 64, 66 inches. Is it reasonable to believe that the average height is greater than 64

inches?

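Answer:

$x$: 70 67 62 68 61 68 70 64 64 66 (Total: 660)

$(x - \bar{x})^2$: 16 1 16 4 25 4 16 4 4 0 (Total: 90)

$$\bar{x} = \frac{\sum x}{n} = \frac{660}{10} = 66, \qquad s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}} = \sqrt{\frac{90}{9}} = 3.16$$

Given data: $n = 10$, $\bar{x} = 66$, $s = 3.16$ and $\mu = 64$.

Null Hypothesis $H_0$: $\mu = 64$

Alternative Hypothesis $H_1$: $\mu > 64$ (Right tailed)

Test statistic

$$t = \frac{\bar{x} - \mu}{s / \sqrt{n}} = \frac{66 - 64}{3.16 / \sqrt{10}} = 2$$

Table value:

$t_\alpha$ at 5% significance level with $n - 1 = 9$ degrees of freedom is 1.833.

Decision rule:

$t = 2 > 1.833$, so we reject $H_0$.

Conclusion:

We say that the average height is greater than 64 inches.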

Problem: 13

A simple random sample of 10 people from a certain population has a mean age of

27. Can we conclude that the mean age of the population is not 30? The variance is known

to be 20. Let $\alpha = 0.05$.

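Answer:

Given data: $n = 10$, $\bar{x} = 27$, $\sigma^2 = 20$ and $\mu = 30$.

Null Hypothesis $H_0$: $\mu = 30$

Alternative Hypothesis $H_1$: $\mu \neq 30$ (Two tailed)

Test statistic

$$t = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}} = \frac{27 - 30}{\sqrt{20} / \sqrt{10}} = -2.12$$

Table value:

$t_\alpha$ at 5% significance level with $n - 1 = 9$ degrees of freedom is 2.262.

Decision rule:

$|t| = 2.12 < 2.262$, so we accept $H_0$.

Conclusion:

We conclude that the mean age of the population is 30.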

Problem: 14

Ten oil tins are taken at random from an automatic filling machine. The mean weight of the

tins is 15.8 kg with a standard deviation of 0.5 kg. Does the sample mean differ significantly

from the intended weight of 16 kg?

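Answer:

Given data: $n = 10$, $\bar{x} = 15.8$, $s = 0.5$ and $\mu = 16$.

Null Hypothesis $H_0$: $\mu = 16$

The sample mean does not differ significantly from the intended weight.

Alternative Hypothesis $H_1$: $\mu \neq 16$ (Two tailed)

The sample mean differs significantly from the intended weight.

Test statistic

A normally distributed population (small sample):

$$t = \frac{\bar{x} - \mu}{s / \sqrt{n}} = \frac{15.8 - 16}{0.5 / \sqrt{10}} = -1.26$$

Table value:

$t_\alpha$ at 5% significance level with $n - 1 = 9$ degrees of freedom is 2.262.

Decision rule:

$|t| = 1.26 < 2.262$, so we accept $H_0$.

Conclusion:

The sample mean does not differ significantly from the intended weight of 16 kg.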

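Test for difference between two means of small samples (t-test)

$$t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}} \quad \text{with } n_1 + n_2 - 2 \text{ degrees of freedom}$$

where $\bar{x}_1, \bar{x}_2$ = Sample means; $n_1, n_2$ = Sample sizes;

$s_1, s_2$ = Sample Standard Deviations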

Problem: 15

The height of six randomly chosen sailors are (in inches): 63, 65, 68, 69, 71 and 72.

Those of 10 randomly chosen soldiers are 61, 62, 65, 66, 69, 69, 70, 71, 72 and 73. Discuss

the light that these data throw on the suggestion that sailors are on the average taller

than soldiers.

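Answer:

$x_1$    $(x_1 - \bar{x}_1)^2$    $x_2$    $(x_2 - \bar{x}_2)^2$

63    25    61    46.24

65    9     62    33.64

68    0     65    7.84

69    1     66    3.24

71    9     69    1.44

72    16    69    1.44

            70    4.84

            71    10.24

            72    17.64

            73    27.04

$\sum x_1 = 408$, $\sum (x_1 - \bar{x}_1)^2 = 60$, $\sum x_2 = 678$, $\sum (x_2 - \bar{x}_2)^2 = 153.6$

$$\bar{x}_1 = \frac{408}{6} = 68, \qquad s_1 = \sqrt{\frac{\sum (x_1 - \bar{x}_1)^2}{n_1 - 1}} = \sqrt{\frac{60}{5}} = 3.46$$

$$\bar{x}_2 = \frac{678}{10} = 67.8, \qquad s_2 = \sqrt{\frac{\sum (x_2 - \bar{x}_2)^2}{n_2 - 1}} = \sqrt{\frac{153.6}{9}} = 4.13$$

Given data: $n_1 = 6$, $\bar{x}_1 = 68$, $s_1 = 3.46$ and $n_2 = 10$, $\bar{x}_2 = 67.8$, $s_2 = 4.13$.

Null Hypothesis $H_0$: $\mu_1 = \mu_2$

Alternative Hypothesis $H_1$: $\mu_1 > \mu_2$ (Right tailed)

Test statistic

The test statistic is given by (small sample)

$$t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}} = \frac{68 - 67.8}{\sqrt{\dfrac{12}{6} + \dfrac{17.07}{10}}} = 0.10$$

Table value:

$t_\alpha$ at 5% significance level with $n_1 + n_2 - 2 = 6 + 10 - 2 = 14$ degrees of freedom is 1.761.

Decision rule:

$t = 0.10 < 1.761$, so we accept $H_0$.

Conclusion:

The data do not support the suggestion that sailors are on the average taller than soldiers.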

Problem: 16

In a packing plant, a machine packs cartons with jars. It is supposed that a new

machine will pack faster on the average than the machine currently used. To test that

hypothesis, the times it takes each machine to pack ten cartons are recorded. The results

(machine.txt), in seconds, are shown in the following table.

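New machine: 42.1 41.3 42.4 43.2 41.8 41 41.8 42.8 42.3 42.7

Old machine: 42.7 43.8 42.5 43.1 44 43.6 43.3 43.5 41.7 44.1

Data source: https://onlinecourses.science.psu.edu/stat500/sites/onlinecourses.science.psu.edu.stat500/files/data/machine/index.txt

Answer:

First we need to find the mean and variance of the given data

$x_1$    $(x_1 - \bar{x}_1)^2$    $x_2$    $(x_2 - \bar{x}_2)^2$

42.1    0.0016    42.7    0.2809

41.3    0.7056    43.8    0.3249

42.4    0.0676    42.5    0.5329

43.2    1.1236    43.1    0.0169

41.8    0.1156    44      0.5929

41      1.2996    43.6    0.1369

41.8    0.1156    43.3    0.0049

42.8    0.4356    43.5    0.0729

42.3    0.0256    41.7    2.3409

42.7    0.3136    44.1    0.7569

$\sum x_1 = 421.4$, $\sum (x_1 - \bar{x}_1)^2 = 4.204$, $\sum x_2 = 432.3$, $\sum (x_2 - \bar{x}_2)^2 = 5.061$

$$\bar{x}_1 = \frac{421.4}{10} = 42.14, \qquad s_1 = \sqrt{\frac{4.204}{9}} = 0.683$$

$$\bar{x}_2 = \frac{432.3}{10} = 43.23, \qquad s_2 = \sqrt{\frac{5.061}{9}} = 0.750$$

Given data: $n_1 = 10$, $\bar{x}_1 = 42.14$, $s_1 = 0.683$ (new machine) and $n_2 = 10$, $\bar{x}_2 = 43.23$, $s_2 = 0.750$ (old machine).

Null Hypothesis $H_0$: $\mu_1 = \mu_2$

i.e., there is no difference between the mean packing times of the two machines.

Alternative Hypothesis $H_1$: $\mu_1 < \mu_2$ (Left tailed, since a faster machine has a smaller mean packing time)

Test statistic

$$t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}} = \frac{42.14 - 43.23}{\sqrt{\dfrac{0.467}{10} + \dfrac{0.562}{10}}} = -3.40$$

Table value:

$t_\alpha$ at 5% significance level with $n_1 + n_2 - 2 = 10 + 10 - 2 = 18$ degrees of freedom is 1.734, so the left-tailed critical value is $-1.734$.

Decision rule:

$t = -3.40 < -1.734$, so we reject $H_0$.

Conclusion:

The new machine packs faster on the average than the old machine.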

Problem: 17

Two independent samples of 8 and 7 items respectively had the following values.

Sample I 9 11 13 11 15 9 12 14

Sample II 10 12 10 14 9 8 10

Is the difference between the means of samples significant?

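Answer:

$x_1$    $(x_1 - \bar{x}_1)^2$    $x_2$    $(x_2 - \bar{x}_2)^2$

9     7.56     10    0.18

11    0.56     12    2.46

13    1.56     10    0.18

11    0.56     14    12.74

15    10.56    9     2.04

9     7.56     8     5.90

12    0.06     10    0.18

14    5.06

$\sum x_1 = 94$, $\sum (x_1 - \bar{x}_1)^2 = 33.48$, $\sum x_2 = 73$, $\sum (x_2 - \bar{x}_2)^2 = 23.68$

$$\bar{x}_1 = \frac{94}{8} = 11.75, \qquad s_1 = \sqrt{\frac{33.48}{7}} = 2.19$$

$$\bar{x}_2 = \frac{73}{7} = 10.43, \qquad s_2 = \sqrt{\frac{23.68}{6}} = 1.99$$

Given data: $n_1 = 8$, $\bar{x}_1 = 11.75$, $s_1 = 2.19$ and $n_2 = 7$, $\bar{x}_2 = 10.43$, $s_2 = 1.99$.

Null Hypothesis $H_0$: $\mu_1 = \mu_2$

Alternative Hypothesis $H_1$: $\mu_1 \neq \mu_2$ (Two tailed)

Test statistic

$$t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}} = \frac{11.75 - 10.43}{\sqrt{\dfrac{4.78}{8} + \dfrac{3.95}{7}}} = 1.22$$

Table value:

$t_\alpha$ at 5% significance level with $n_1 + n_2 - 2 = 8 + 7 - 2 = 13$ degrees of freedom is 2.16.

Decision rule:

$|t| = 1.22 < 2.16$, so we accept $H_0$.

Conclusion:

There is no significant difference between the sample means.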
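The two-sample calculation of Problem 17 can be reproduced directly from the raw values. The sketch below assumes the statistic has the form $t = (\bar{x}_1 - \bar{x}_2) / \sqrt{s_1^2/n_1 + s_2^2/n_2}$ with $n_1 + n_2 - 2$ degrees of freedom and sample variances computed with divisor $n - 1$; many textbooks use a pooled variance instead, which gives a slightly different value. The function name `two_sample_t` is illustrative.

```python
from math import sqrt
from statistics import mean, variance


def two_sample_t(sample1, sample2):
    """t statistic for the difference of two small-sample means (unpooled form)."""
    n1, n2 = len(sample1), len(sample2)
    # variance() divides by n - 1
    standard_error = sqrt(variance(sample1) / n1 + variance(sample2) / n2)
    t = (mean(sample1) - mean(sample2)) / standard_error
    return t, n1 + n2 - 2


sample_1 = [9, 11, 13, 11, 15, 9, 12, 14]
sample_2 = [10, 12, 10, 14, 9, 8, 10]
t, df = two_sample_t(sample_1, sample_2)
table_value = 2.16  # two-tailed 5% value for 13 degrees of freedom, from the table
print(round(t, 2), df, abs(t) > table_value)
```

Small differences from the hand calculation in the second decimal place come from rounding the intermediate sums.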

Problem: 18

In a test given to two groups of students the marks obtained were as follows.

First Group 18 20 36 50 49 36 34 49 41

Second Group 29 28 26 35 30 44 46

Examine the significant difference between the means of marks secured by students of the

above two groups.

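Answer:

$x_1$    $(x_1 - \bar{x}_1)^2$    $x_2$    $(x_2 - \bar{x}_2)^2$

18    361    29    25

20    289    28    36

36    1      26    64

50    169    35    1

49    144    30    16

36    1      44    100

34    9      46    144

49    144

41    16

$\sum x_1 = 333$, $\sum (x_1 - \bar{x}_1)^2 = 1134$, $\sum x_2 = 238$, $\sum (x_2 - \bar{x}_2)^2 = 386$

$$\bar{x}_1 = \frac{333}{9} = 37, \qquad s_1 = \sqrt{\frac{1134}{8}} = 11.91$$

$$\bar{x}_2 = \frac{238}{7} = 34, \qquad s_2 = \sqrt{\frac{386}{6}} = 8.02$$

Given data: $n_1 = 9$, $\bar{x}_1 = 37$, $s_1 = 11.91$ and $n_2 = 7$, $\bar{x}_2 = 34$, $s_2 = 8.02$.

Null Hypothesis $H_0$: $\mu_1 = \mu_2$

Alternative Hypothesis $H_1$: $\mu_1 \neq \mu_2$ (Two tailed)

Test statistic

$$t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}} = \frac{37 - 34}{\sqrt{\dfrac{141.75}{9} + \dfrac{64.33}{7}}} = 0.60$$

Table value:

$t_\alpha$ at 5% significance level with $n_1 + n_2 - 2 = 9 + 7 - 2 = 14$ degrees of freedom is 2.145.

Decision rule:

$|t| = 0.60 < 2.145$, so we accept $H_0$.

Conclusion:

There is no significant difference between the sample means.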

Testing of Hypothesis about Population Proportion

The test statistic for the population proportion is

$$z = \frac{p - P}{\sqrt{\dfrac{PQ}{n}}}$$

where $p$ = Sample Proportion; $n$ = Sample size; $P$ = Population Proportion; $Q = 1 - P$

Problem: 19

A university has found over the years that out of all the students who are offered

admission, the proportion who accepts is .70. After a new director of admissions is hired,

the university wants to check if the proportion of students accepting has changed

significantly. Suppose they offer admission to 1200 students and 888 accept. Is this

evidence at the $\alpha = 0.05$ level that there has been a real change from the status quo?

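Answer:

Given data: $n = 1200$, $x = 888$, $p = \dfrac{888}{1200} = 0.74$, $P = 0.70$,

$Q = 1 - P = 0.30$

Null Hypothesis $H_0$: $P = 0.70$

Alternative Hypothesis $H_1$: $P \neq 0.70$ (Two tailed)

Test statistic:

$$z = \frac{p - P}{\sqrt{\dfrac{PQ}{n}}} = \frac{0.74 - 0.70}{\sqrt{\dfrac{0.70 \times 0.30}{1200}}} = 3.02$$

Table value:

$z_\alpha$ at 5% significance level is 1.96.

Decision rule:

$|z| = 3.02 > 1.96$, so we reject $H_0$.

Hence the proportion of students accepting has changed (it has increased) since the new director was appointed.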

Problem: 20

The CEO of a large electric utility claims that 80 percent of his 1,000,000 customers are

very satisfied with the service they receive. To test this claim, the local newspaper

surveyed 100 customers, using simple random sampling. Among the sampled customers,

73 percent say they are very satisfied. Based on these findings, can we reject the CEO's

hypothesis that 80% of the customers are very satisfied? Use a 0.05 level of significance.

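Answer:

Given data: $n = 100$, $p = 0.73$, $P = 0.80$,

$Q = 1 - P = 0.20$

Null Hypothesis $H_0$: $P = 0.80$

Alternative Hypothesis $H_1$: $P \neq 0.80$ (Two tailed)

Test statistic:

$$z = \frac{p - P}{\sqrt{\dfrac{PQ}{n}}} = \frac{0.73 - 0.80}{\sqrt{\dfrac{0.80 \times 0.20}{100}}} = -1.75$$

Table value:

$z_\alpha$ at 5% significance level is 1.96.

Decision rule:

$|z| = 1.75 < 1.96$, so we accept $H_0$.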
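The test for a single proportion in Problem 20 reduces to one line of arithmetic. The sketch below is a minimal illustration; the helper name `one_proportion_z` is an assumption, and the normal approximation it relies on is only reasonable for large samples.

```python
from math import sqrt


def one_proportion_z(p_sample, p_null, n):
    """z statistic for a sample proportion against a hypothesised proportion."""
    q_null = 1 - p_null
    return (p_sample - p_null) / sqrt(p_null * q_null / n)


# Problem 20: 73% of 100 sampled customers are satisfied, H0: P = 0.80
z = one_proportion_z(0.73, 0.80, 100)
print(round(z, 2))  # compare |z| with 1.96 for a two-tailed test at 5%
```

When the data are given as counts, as in Problem 21 (36 survivors out of 40), the sample proportion is passed as `36 / 40`.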

Problem: 21

40 people are attacked by disease and only 36 survived. Will you reject the hypothesis that

the survival rate, if attacked by this disease, is 85% in favour of hypothesis that it is more,

at 5% level of significance?

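Answer:

Given data: $n = 40$, $x = 36$, $p = \dfrac{36}{40} = 0.90$, $P = 0.85$,

$Q = 1 - P = 0.15$

Null Hypothesis $H_0$: $P = 0.85$

Alternative Hypothesis $H_1$: $P > 0.85$ (Right tailed)

Test statistic:

$$z = \frac{p - P}{\sqrt{\dfrac{PQ}{n}}} = \frac{0.90 - 0.85}{\sqrt{\dfrac{0.85 \times 0.15}{40}}} = 0.89$$

Table value:

$z_\alpha$ at 5% significance level is 1.645.

Decision rule:

$z = 0.89 < 1.645$, so we accept $H_0$.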

Problem: 22

A producer confesses that 22% of the items manufactured by him will be defective. To test

his claim a random sample of 80 items were selected and 20 items were noted to be

defective. Test the validity of the producer’s claim at 1% level of significance.

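Answer:

Given data: $n = 80$, $x = 20$, $p = \dfrac{20}{80} = 0.25$, $P = 0.22$,

$Q = 1 - P = 0.78$

Null Hypothesis $H_0$: $P = 0.22$

Alternative Hypothesis $H_1$: $P > 0.22$ (Right tailed)

Test statistic:

$$z = \frac{p - P}{\sqrt{\dfrac{PQ}{n}}} = \frac{0.25 - 0.22}{\sqrt{\dfrac{0.22 \times 0.78}{80}}} = 0.65$$

Table value:

$z_\alpha$ at 1% significance level is 2.33.

Decision rule:

$z = 0.65 < 2.33$, so we accept $H_0$.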

Problem: 23

A die is thrown 9000 times and throw of 3 or 4 is observed 3240 times. Show that the die

cannot be regarded as an unbiased one and find the limits between which the probability

of a throw of 3 or 4 lies.

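Answer:

Given that $n = 9000$ and $x = 3240$, so $p = \dfrac{3240}{9000} = 0.36$.

For an unbiased die, $P = P(3 \text{ or } 4) = \dfrac{2}{6} = \dfrac{1}{3}$ and

$Q = 1 - P = \dfrac{2}{3}$

Null Hypothesis $H_0$: $P = \dfrac{1}{3}$ (the die is unbiased)

Alternative Hypothesis $H_1$: $P \neq \dfrac{1}{3}$ (Two tailed)

Test statistic:

$$z = \frac{p - P}{\sqrt{\dfrac{PQ}{n}}} = \frac{0.36 - 0.3333}{\sqrt{\dfrac{(1/3)(2/3)}{9000}}} = 5.37$$

Table value:

$z_\alpha$ at 5% significance level is 1.96.

Decision rule:

$|z| = 5.37 > 1.96$, so we reject $H_0$.

Hence the die is a biased one.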

Problem: 24

A manufacturer of light bulbs claims that an average 2% of the bulbs manufactured by his

firm are defective. A random sample of 400 bulbs contained 13 defective bulbs. On the

basis of this sample, can you support the manufacturer’s claim at 5% level of significance?

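Answer:

Given that $n = 400$, $x = 13$, $p = \dfrac{13}{400} = 0.0325$

Population proportion $P = 0.02$, $Q = 1 - P = 0.98$

Null Hypothesis $H_0$: $P = 0.02$

Alternative Hypothesis $H_1$: $P > 0.02$ (Right tailed)

Test statistic:

$$z = \frac{p - P}{\sqrt{\dfrac{PQ}{n}}} = \frac{0.0325 - 0.02}{\sqrt{\dfrac{0.02 \times 0.98}{400}}} = 1.79$$

Table value:

$z_\alpha$ at 5% significance level is 1.645.

Decision rule:

$z = 1.79 > 1.645$, so we reject $H_0$.

That is, the manufacturer's claim of 2% defective bulbs cannot be supported.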

Problem: 25

A quality control engineer suspects that the proportion of defective units among certain

manufactured items has increased from the set limit of 0.01. To test his claim, he

randomly selected 100 of these items and found that the proportion of defective units in

the sample was 0.02. Test the engineer’s hypothesis at 0.05 level of significance.

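Solution:

Given that $n = 100$, $p = 0.02$

Population proportion $P = 0.01$, $Q = 1 - P = 0.99$

Null Hypothesis $H_0$: $P = 0.01$

Alternative Hypothesis $H_1$: $P > 0.01$ (Right tailed)

Test statistic:

$$z = \frac{p - P}{\sqrt{\dfrac{PQ}{n}}} = \frac{0.02 - 0.01}{\sqrt{\dfrac{0.01 \times 0.99}{100}}} = 1.005$$

Table value:

$z_\alpha$ at 5% significance level is 1.645.

Decision rule:

$z = 1.005 < 1.645$, so we accept $H_0$.

That is, the proportion of defective units has not increased.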

Problem: 26

A coin is tossed 256 times and 132 heads are obtained. Would you conclude that the coin is

a biased one?

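Answer:

Given that $n = 256$, $x = 132$, $p = \dfrac{132}{256} = 0.5156$

Since the population proportion is not given, we take the value for an unbiased coin:

Population proportion $P = 0.5$, $Q = 1 - P = 0.5$

Null Hypothesis $H_0$: $P = 0.5$ (the coin is unbiased)

Alternative Hypothesis $H_1$: $P \neq 0.5$ (Two tailed)

Test statistic:

$$z = \frac{p - P}{\sqrt{\dfrac{PQ}{n}}} = \frac{0.5156 - 0.5}{\sqrt{\dfrac{0.5 \times 0.5}{256}}} = 0.5$$

Table value:

$z_\alpha$ at 5% significance level is 1.96.

Decision rule:

$|z| = 0.5 < 1.96$, so we accept $H_0$.

Hence the coin is an unbiased one.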

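TESTING OF HYPOTHESIS ABOUT THE DIFFERENCE BETWEEN TWO PROPORTIONS

The test statistic for the difference between two proportions is

$$z = \frac{p_1 - p_2}{\sqrt{PQ \left( \dfrac{1}{n_1} + \dfrac{1}{n_2} \right)}}$$

where $p_1, p_2$ = Sample proportions; $n_1, n_2$ = Sample sizes; $P$ = Population proportion; $Q = 1 - P$

If P is not known, we use

$$P = \frac{n_1 p_1 + n_2 p_2}{n_1 + n_2}, \qquad Q = 1 - P$$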

Problem: 27

Suppose the Acme Drug Company develops a new drug, designed to prevent colds. The

company states that the drug is equally effective for men and women. To test this claim,

they choose a simple random sample of 100 women and 200 men from a population of

100,000 volunteers. At the end of the study, 38% of the women caught a cold; and 51% of

the men caught a cold. Based on these findings, can we reject the company's claim that the

drug is equally effective for men and women? Use a 0.05 level of significance.

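Answer:

Given data: $n_1 = 100$, $p_1 = 0.38$ (women) and $n_2 = 200$, $p_2 = 0.51$ (men).

Null Hypothesis $H_0$: $P_1 = P_2$

Alternative Hypothesis $H_1$: $P_1 \neq P_2$ (Two tailed)

To find $P$:

$$P = \frac{n_1 p_1 + n_2 p_2}{n_1 + n_2} = \frac{38 + 102}{100 + 200} = 0.4667, \qquad Q = 1 - P = 0.5333$$

Test statistic:

$$z = \frac{p_1 - p_2}{\sqrt{PQ \left( \dfrac{1}{n_1} + \dfrac{1}{n_2} \right)}} = \frac{0.38 - 0.51}{\sqrt{0.4667 \times 0.5333 \left( \dfrac{1}{100} + \dfrac{1}{200} \right)}} = -2.13$$

Table value:

$z_\alpha$ at 5% significance level is 1.96.

Decision rule:

$|z| = 2.13 > 1.96$, so we reject $H_0$.

The drug is not equally effective for men and women.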
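The pooled two-proportion statistic of Problem 27 can be checked as follows. This is a sketch under the assumption that the common proportion is estimated by pooling both samples; the function name `two_proportion_z` and its count-based arguments are illustrative.

```python
from math import sqrt


def two_proportion_z(successes1, n1, successes2, n2):
    """Pooled z statistic for the difference between two sample proportions."""
    p1 = successes1 / n1
    p2 = successes2 / n2
    # Pooled estimate of the common population proportion under H0
    p_pooled = (successes1 + successes2) / (n1 + n2)
    q_pooled = 1 - p_pooled
    standard_error = sqrt(p_pooled * q_pooled * (1 / n1 + 1 / n2))
    return (p1 - p2) / standard_error


# Problem 27: 38 of 100 women and 102 of 200 men caught a cold
z = two_proportion_z(38, 100, 102, 200)
print(round(z, 2))  # compare |z| with 1.96 for a two-tailed test at 5%
```

The counts 38 and 102 are the stated percentages (38% and 51%) applied to the two sample sizes.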

Problem: 28

Time magazine reported the result of a telephone poll of 800 adult Americans. The

question posed of the Americans who were surveyed was: "Should the federal tax on

cigarettes be raised to pay for health care reform?" The results of the survey were:

Non – Smokers Smokers

said yes said yes

Is there sufficient evidence at the α = 0.05 level, say, to conclude that the two populations

smokers and non-smokers differ significantly with respect to their opinions?

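Answer:

Let $n_1, n_2$ be the numbers of non-smokers and smokers surveyed ($n_1 + n_2 = 800$) and $p_1, p_2$ the sample proportions who said yes.

Null Hypothesis $H_0$: $P_1 = P_2$

Alternative Hypothesis $H_1$: $P_1 \neq P_2$ (Two tailed)

To find $P$:

$$P = \frac{n_1 p_1 + n_2 p_2}{n_1 + n_2}, \qquad Q = 1 - P$$

Test statistic:

$$z = \frac{p_1 - p_2}{\sqrt{PQ \left( \dfrac{1}{n_1} + \dfrac{1}{n_2} \right)}}$$

Table value:

$z_\alpha$ at 5% significance level is 1.96.

Decision rule:

$|z| > 1.96$, so we reject $H_0$.

Hence we conclude that the two populations, smokers and non-smokers, differ significantly with respect to their opinions.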

Problem: 29

A swimming school wants to determine whether a recently hired instructor is working out.

Sixteen out of 25 of Instructor A's students passed the lifeguard certification test on the

first try. In comparison, 57 out of 72 of more experienced Instructor B's students passed

the test on the first try. Is Instructor A's success rate worse than Instructor B's? Use

α = 0.10.

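Answer:

Given data: $n_1 = 25$, $p_1 = \dfrac{16}{25} = 0.64$ (Instructor A) and $n_2 = 72$, $p_2 = \dfrac{57}{72} = 0.79$ (Instructor B).

Null Hypothesis $H_0$: $P_1 = P_2$

Alternative Hypothesis $H_1$: $P_1 < P_2$ (Left tailed)

To find $P$:

$$P = \frac{n_1 p_1 + n_2 p_2}{n_1 + n_2} = \frac{16 + 57}{25 + 72} = 0.753, \qquad Q = 1 - P = 0.247$$

Test statistic:

$$z = \frac{p_1 - p_2}{\sqrt{PQ \left( \dfrac{1}{n_1} + \dfrac{1}{n_2} \right)}} = \frac{0.64 - 0.79}{\sqrt{0.753 \times 0.247 \left( \dfrac{1}{25} + \dfrac{1}{72} \right)}} = -1.51$$

Table value:

$z_\alpha$ at 10% significance level for a left-tailed test is $-1.28$.

Decision rule:

$z = -1.51 < -1.28$, so we reject $H_0$.

At the 10% level, Instructor A's success rate is worse than Instructor B's.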

Problem: 30

Two types of medication for hives are being tested to determine if there is a difference in

the proportions of adult patient reactions. Twenty out of a random sample of 200 adults

given medication A still had hives 30 minutes after taking the medication. Twelve out of

another random sample of 200 adults given medication B still had hives 30 minutes after

taking the medication. Test at a 1% level of significance.

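Answer:

Given data: $n_1 = 200$, $p_1 = \dfrac{20}{200} = 0.10$ (medication A) and $n_2 = 200$, $p_2 = \dfrac{12}{200} = 0.06$ (medication B).

Null Hypothesis $H_0$: $P_1 = P_2$

Alternative Hypothesis $H_1$: $P_1 \neq P_2$ (Two tailed)

To find $P$:

$$P = \frac{n_1 p_1 + n_2 p_2}{n_1 + n_2} = \frac{20 + 12}{200 + 200} = 0.08, \qquad Q = 1 - P = 0.92$$

Test statistic:

$$z = \frac{p_1 - p_2}{\sqrt{PQ \left( \dfrac{1}{n_1} + \dfrac{1}{n_2} \right)}} = \frac{0.10 - 0.06}{\sqrt{0.08 \times 0.92 \left( \dfrac{1}{200} + \dfrac{1}{200} \right)}} = 1.47$$

Table value:

$z_\alpha$ at 1% significance level is 2.58.

Decision rule:

$|z| = 1.47 < 2.58$, so we accept $H_0$.

There is no significant difference between the two medications.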

Problem: 31

Researchers conducted a study of smartphone use among adults. A cell phone company

claimed that iPhone smartphones are more popular with whites (non-Hispanic) than with

African Americans. The results of the survey indicate that of the 232 African American cell

phone owners randomly sampled, 5% have an iPhone. Of the 1,343 white cell phone

owners randomly sampled, 10% own an iPhone. Test at the 5% level of significance. Is the

proportion of white iPhone owners greater than the proportion of African American

iPhone owners?

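Answer:

Given data: $n_1 = 232$, $p_1 = 0.05$ (African American owners) and $n_2 = 1343$, $p_2 = 0.10$ (white owners).

Null Hypothesis $H_0$: $P_1 = P_2$

Alternative Hypothesis $H_1$: $P_1 < P_2$ (Left tailed)

To find $P$:

$$P = \frac{n_1 p_1 + n_2 p_2}{n_1 + n_2} = \frac{11.6 + 134.3}{232 + 1343} = 0.0926, \qquad Q = 1 - P = 0.9074$$

Test statistic:

$$z = \frac{p_1 - p_2}{\sqrt{PQ \left( \dfrac{1}{n_1} + \dfrac{1}{n_2} \right)}} = \frac{0.05 - 0.10}{\sqrt{0.0926 \times 0.9074 \left( \dfrac{1}{232} + \dfrac{1}{1343} \right)}} = -2.43$$

Table value:

$z_\alpha$ at 5% significance level is $-1.645$.

Decision rule:

$z = -2.43 < -1.645$, so we reject $H_0$.

Hence we conclude that a larger proportion of white cell phone owners use iPhones than African American cell phone owners.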

F-distribution

F-distribution in statistics

The F-distribution is a right-skewed distribution used most commonly in Analysis of

Variance. When referencing the F-distribution, the numerator degrees of freedom are

always given first, as switching the order of degrees of freedom changes the distribution

(e.g., $F(\nu_1, \nu_2)$ does not equal $F(\nu_2, \nu_1)$).

F-Test to Compare Two Variances

A statistical test uses an F statistic to compare two variances, $s_1^2$ and $s_2^2$, by dividing

them. The result is always a positive number (because variances are always positive). The

equation for comparing two variances with the F-test is:

$$F = \frac{s_1^2}{s_2^2}$$

If the variances are equal, the ratio of the variances will equal 1.

Characteristics of F-distribution.

The F-distribution is positively skewed, and its skewness decreases as the degrees of

freedom $\nu_1$ and $\nu_2$ increase. The value of F is always positive or zero, since

variances are the squares of deviations and hence cannot assume negative values. Its value

lies between 0 and ∞.

Problem: 32

Two independent samples of sizes 9 and 7 from a normal population had the following

values of the variables.

Sample I : 18 13 12 15 12 14 16 14 15

Sample II : 16 19 13 16 18 13 15

Do the estimates of population variance differ significantly at 5% level of significance?

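Answer:

$\sum x_1 = 129$, $\sum (x_1 - \bar{x}_1)^2 = 30$, $\sum x_2 = 110$, $\sum (x_2 - \bar{x}_2)^2 = 31.43$

$$\bar{x}_1 = \frac{129}{9} = 14.33, \qquad s_1^2 = \frac{\sum (x_1 - \bar{x}_1)^2}{n_1 - 1} = \frac{30}{8} = 3.75$$

$$\bar{x}_2 = \frac{110}{7} = 15.71, \qquad s_2^2 = \frac{\sum (x_2 - \bar{x}_2)^2}{n_2 - 1} = \frac{31.43}{6} = 5.24$$

Given data: $n_1 = 9$, $s_1^2 = 3.75$ and $n_2 = 7$, $s_2^2 = 5.24$.

Null Hypothesis $H_0$: $\sigma_1^2 = \sigma_2^2$

There is no significant difference between the two sample variances.

Alternative Hypothesis $H_1$: $\sigma_1^2 \neq \sigma_2^2$

There is some significant difference between the two sample variances.

The test statistic, with the larger variance in the numerator, is

$$F = \frac{s_2^2}{s_1^2} = \frac{5.24}{3.75} = 1.40$$

Table value:

$F_{0.05}(n_2 - 1, n_1 - 1) = F_{0.05}(7 - 1, 9 - 1) = F_{0.05}(6, 8) = 3.58$.

Decision rule:

$F = 1.40 < 3.58$, so we accept $H_0$.

Conclusion:

There is no significant difference between the two sample variances.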
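The variance ratio of Problem 32 can be computed from the raw samples. The sketch below is illustrative only; `variance_ratio_f` is a hypothetical helper that puts the larger sample variance in the numerator and reports the degrees of freedom in the matching order, so the result can be compared with the table value $F(6, 8) = 3.58$ quoted in the notes.

```python
from statistics import variance


def variance_ratio_f(sample1, sample2):
    """Return (F, numerator df, denominator df) with the larger variance on top."""
    var1, var2 = variance(sample1), variance(sample2)  # divisor n - 1
    if var1 >= var2:
        return var1 / var2, len(sample1) - 1, len(sample2) - 1
    return var2 / var1, len(sample2) - 1, len(sample1) - 1


sample_1 = [18, 13, 12, 15, 12, 14, 16, 14, 15]
sample_2 = [16, 19, 13, 16, 18, 13, 15]
f_value, df_num, df_den = variance_ratio_f(sample_1, sample_2)
print(round(f_value, 2), df_num, df_den)
```

Reporting the degrees of freedom together with the ratio avoids looking up the table with the two values in the wrong order.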

Problem: 33

Time taken by workers in performing a job are given below:

Type I 21 17 27 28 24 23

Type II 28 34 43 36 33 35 39

Test whether there is any significant difference between the variances of time distribution.

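Answer:

$\sum x_1 = 140$, $\sum (x_1 - \bar{x}_1)^2 = 81.33$, $\sum x_2 = 248$, $\sum (x_2 - \bar{x}_2)^2 = 133.71$

$$\bar{x}_1 = \frac{140}{6} = 23.33, \qquad s_1^2 = \frac{\sum (x_1 - \bar{x}_1)^2}{n_1 - 1} = \frac{81.33}{5} = 16.27$$

$$\bar{x}_2 = \frac{248}{7} = 35.43, \qquad s_2^2 = \frac{\sum (x_2 - \bar{x}_2)^2}{n_2 - 1} = \frac{133.71}{6} = 22.29$$

Given data: $n_1 = 6$, $s_1^2 = 16.27$ and $n_2 = 7$, $s_2^2 = 22.29$.

Null Hypothesis $H_0$: $\sigma_1^2 = \sigma_2^2$

There is no significant difference between the variances of time distribution.

Alternative Hypothesis $H_1$: $\sigma_1^2 \neq \sigma_2^2$

There is some significant difference between the variances of time distribution.

The test statistic, with the larger variance in the numerator, is

$$F = \frac{s_2^2}{s_1^2} = \frac{22.29}{16.27} = 1.37$$

Table value:

$F_{0.05}(n_2 - 1, n_1 - 1) = F_{0.05}(7 - 1, 6 - 1) = F_{0.05}(6, 5) = 4.95$.

Decision rule:

$F = 1.37 < 4.95$, so we accept $H_0$.

Conclusion:

There is no significant difference between the two variances of time distribution.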

Problem: 34

Test whether there is any significant difference between the variances of the populations

from which the following samples are taken:

Sample I: 20 16 26 27 23 22

Sample II: 27 33 42 35 32 34 38

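Answer:

$\sum x_1 = 134$, $\sum (x_1 - \bar{x}_1)^2 = 81.33$, $\sum x_2 = 241$, $\sum (x_2 - \bar{x}_2)^2 = 133.71$

$$\bar{x}_1 = \frac{134}{6} = 22.33, \qquad s_1^2 = \frac{\sum (x_1 - \bar{x}_1)^2}{n_1 - 1} = \frac{81.33}{5} = 16.27$$

$$\bar{x}_2 = \frac{241}{7} = 34.43, \qquad s_2^2 = \frac{\sum (x_2 - \bar{x}_2)^2}{n_2 - 1} = \frac{133.71}{6} = 22.29$$

Given data: $n_1 = 6$, $s_1^2 = 16.27$ and $n_2 = 7$, $s_2^2 = 22.29$.

Null Hypothesis $H_0$: $\sigma_1^2 = \sigma_2^2$

There is no significant difference between the variances of the two populations.

Alternative Hypothesis $H_1$: $\sigma_1^2 \neq \sigma_2^2$

There is some significant difference between the variances of the two populations.

The test statistic, with the larger variance in the numerator, is

$$F = \frac{s_2^2}{s_1^2} = \frac{22.29}{16.27} = 1.37$$

Table value:

$F_{0.05}(n_2 - 1, n_1 - 1) = F_{0.05}(7 - 1, 6 - 1) = F_{0.05}(6, 5) = 4.95$.

Decision rule:

$F = 1.37 < 4.95$, so we accept $H_0$.

Conclusion:

There is no significant difference between the two sample variances.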



Problem: 35



Sample I: 76 68 70 43 94 68 33

Sample II: 40 48 92 85 70 76 68 22

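Answer:

Let $x$ denote the Sample I values and $y$ the Sample II values.

$\sum x = 452$, $\sum (x - \bar{x})^2 = 2511.71$, $\sum y = 501$, $\sum (y - \bar{y})^2 = 4001.88$

$\bar{x} = \frac{\sum x}{n_1} = \frac{452}{7} = 64.57$

$S_1 = \sqrt{\frac{\sum (x - \bar{x})^2}{n_1 - 1}} = \sqrt{\frac{2511.71}{6}} = \sqrt{418.62} = 20.46$

$\bar{y} = \frac{\sum y}{n_2} = \frac{501}{8} = 62.63$

$S_2 = \sqrt{\frac{\sum (y - \bar{y})^2}{n_2 - 1}} = \sqrt{\frac{4001.88}{7}} = \sqrt{571.70} = 23.91$

Given data: $n_1 = 7$ and $n_2 = 8$.

Null Hypothesis: $H_0: \sigma_1^2 = \sigma_2^2$

There is no significant difference between the variances.

Alternative Hypothesis: $H_1: \sigma_1^2 \neq \sigma_2^2$

There is some significant difference between the variances.

The test statistic is

$F = \frac{S_2^2}{S_1^2} = \frac{571.70}{418.62} = 1.366$ [since $S_2^2 > S_1^2$]

Table value:

$F_{0.05}(n_2 - 1, n_1 - 1) = F_{0.05}(8 - 1, 7 - 1) = F_{0.05}(7, 6) = 4.21$.

Decision rule:

Calculated $F = 1.366 < 4.21$ (table value), so we accept $H_0$.

Conclusion:

There is no significant difference between the two sample variances.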
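The variance-ratio test used in Problems 32 to 35 can be cross-checked with a few lines of Python. The sketch below is illustrative only: the function name `variance_ratio_f` is hypothetical, the data are those of Problem 33, and the critical value is assumed to be read from an F table (it is not computed here).

```python
from statistics import variance  # sample variance with divisor n - 1


def variance_ratio_f(sample_1, sample_2):
    """Return (F, numerator df, denominator df), larger variance on top."""
    v1, v2 = variance(sample_1), variance(sample_2)
    if v1 >= v2:
        return v1 / v2, len(sample_1) - 1, len(sample_2) - 1
    return v2 / v1, len(sample_2) - 1, len(sample_1) - 1


# Data of Problem 33 (time taken by two types of workers).
type_1 = [21, 17, 27, 28, 24, 23]
type_2 = [28, 34, 43, 36, 33, 35, 39]

f_value, df_num, df_den = variance_ratio_f(type_1, type_2)

# Assumed table value: F(6, 5) at the 5% level, taken from an F table.
F_TABLE = 4.95
accept_h0 = f_value < F_TABLE

print(f"F = {f_value:.3f} with ({df_num}, {df_den}) degrees of freedom")
print("Accept H0" if accept_h0 else "Reject H0")
```

Placing the larger variance in the numerator keeps $F \geq 1$, so only the upper-tail table value needs to be consulted.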

ANALYSIS OF VARIANCE

History:

The t- and z-tests developed in the 20th century were used until 1918, when Ronald

Fisher created the analysis of variance. ANOVA is also called the Fisher analysis of variance,

and it is the extension of the t- and the z-tests. The term became well-known in 1925, after

appearing in Fisher's book, "Statistical Methods for Research Workers." It was employed in

experimental psychology and later expanded to subjects that are more complex.

Definition:

Analysis of variance

Analysis of variance is a collection of statistical models and their associated

estimation procedures used to analyze the differences among group means in a sample.

ANOVA was developed by statistician and evolutionary biologist Ronald Fisher.

https://www.investopedia.com/terms/z/z-test.asp



One Way Classification

Completely Randomized Design (CRD)

The one-way analysis of variance (ANOVA) is used to determine whether there are

any statistically significant differences between the means of two or more independent

(unrelated) groups (although you tend to only see it used when there are a minimum of

three, rather than two groups).

For example, you could use a one-way ANOVA to understand whether exam

performance differed based on test anxiety levels amongst students, dividing students into

three independent groups (e.g., low, medium and high-stressed students). Also, it is

important to realize that the one-way ANOVA is an omnibus test statistic and cannot tell

you which specific groups were statistically significantly different from each other; it only

tells you that at least two groups were different. Since you may have three, four, five or

more groups in your study design, determining which of these groups differ from each

other is important.

Advantages of randomized complete block designs

Complete flexibility. Can have any number of treatments and blocks.

Provides more accurate results than the completely randomized design due to

grouping.

Relatively easy statistical analysis even with missing data.

Allows calculation of unbiased error for specific treatments.

Disadvantages of randomized complete block designs

Not suitable for large numbers of treatments because blocks become too large.

Not suitable when complete block contains considerable variability.

Interactions between block and treatment effects increase error.



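ANOVA table for One-way classification.

Source of variation | Sum of squares | Degrees of freedom | Mean sum of squares | F-ratio

Between samples | SSC | $k - 1$ | $MSC = \frac{SSC}{k - 1}$ | $F = \frac{MSC}{MSE}$

Within samples | SSE | $N - k$ | $MSE = \frac{SSE}{N - k}$ |

Here $k$ is the number of samples (columns) and $N$ is the total number of observations.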
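The sums of squares that enter this table are computed as in the worked problems that follow. As a compact reference, a sketch of the formulas is given below; the symbols are assumptions introduced here for illustration: $x_{ij}$ is the $i$-th observation in sample $j$, $n_j$ the size of sample $j$, $T_j$ its total, $T$ the grand total and $N$ the total number of observations.

```latex
\begin{aligned}
T &= \sum_{j}\sum_{i} x_{ij}, \qquad \text{C.F.} = \frac{T^2}{N} \\
SST &= \sum_{j}\sum_{i} x_{ij}^2 - \frac{T^2}{N} \\
SSC &= \sum_{j} \frac{T_j^2}{n_j} - \frac{T^2}{N} \\
SSE &= SST - SSC
\end{aligned}
```

Subtracting a constant from every observation (coding) changes $T$ and the correction factor but leaves SST, SSC and SSE unchanged, which is why the worked problems code the data first.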

Problem: 36

A random sample is selected from each of three makes of ropes and their breaking

strength (in pounds) are measured with the following results:

I II III

70 100 60

72 110 65

75 108 57

80 112 84

83 113 87

120 73

107

Test whether the breaking strength of the ropes differs significantly.

Answer:

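To simplify the work we subtract 80 from each entry and form the following table of coded values $X_1$, $X_2$, $X_3$:

S. No | $X_1$ | $X_2$ | $X_3$

1 | −10 | 20 | −20

2 | −8 | 30 | −15

3 | −5 | 28 | −23

4 | 0 | 32 | 4

5 | 3 | 33 | 7

6 | | 40 | −7

7 | | 27 |

Total | −20 | 210 | −54

Null Hypothesis: Let us take the null hypothesis that the breaking strength of the ropes does not differ significantly.

$N = 18$

$T = \sum\sum x = -20 + 210 - 54 = 136$

Correction factor $= \frac{T^2}{N} = \frac{(136)^2}{18} = 1027.56$

Total sum of squares SST $= [(-10)^2 + (-8)^2 + (-5)^2 + (0)^2 + (3)^2 + (20)^2 + (30)^2 + (28)^2 + (32)^2 + (33)^2 + (40)^2 + (27)^2 + (-20)^2 + (-15)^2 + (-23)^2 + (4)^2 + (7)^2 + (-7)^2] - 1027.56$

$= 7992 - 1027.56 = 6964.44$

Between ropes (Column) sum of squares SSC $= \frac{(-20)^2}{5} + \frac{(210)^2}{7} + \frac{(-54)^2}{6} - 1027.56$

$= 80 + 6300 + 486 - 1027.56 = 5838.44$

Error sum of squares SSE $= SST - SSC = 6964.44 - 5838.44 = 1126$

ANOVA Table

Source of Variation | Degrees of Freedom | Sum of squares | Mean square | F-ratio

Between ropes | $3 - 1 = 2$ | 5838.44 | 2919.22 | $\frac{2919.22}{75.07} = 38.89$

Error | $18 - 3 = 15$ | 1126 | 75.07 |

Total | 17 | 6964.44 | |

Table value:

$F_{0.05}(2, 15) = 3.68$.

Conclusion:

Calculated $F$ (38.89) $>$ table value (3.68), so we reject the null hypothesis; there is a significant difference

between the breaking strengths of the ropes.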
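The hand computation for Problem 36 can be cross-checked with a short script. This is a minimal sketch under stated assumptions: the helper name `one_way_anova` is hypothetical, it works on the raw (uncoded) data, and it uses the correction-factor formulas of the one-way classification with unequal sample sizes.

```python
def one_way_anova(groups):
    """Return (SSC, SSE, F) for a list of samples of possibly unequal size."""
    n_total = sum(len(g) for g in groups)
    grand_total = sum(sum(g) for g in groups)
    correction = grand_total ** 2 / n_total            # T^2 / N

    sst = sum(x * x for g in groups for x in g) - correction
    ssc = sum(sum(g) ** 2 / len(g) for g in groups) - correction
    sse = sst - ssc

    msc = ssc / (len(groups) - 1)                      # between-samples mean square
    mse = sse / (n_total - len(groups))                # within-samples mean square
    return ssc, sse, msc / mse


# Breaking strengths of the three makes of rope (Problem 36).
rope_1 = [70, 72, 75, 80, 83]
rope_2 = [100, 110, 108, 112, 113, 120, 107]
rope_3 = [60, 65, 57, 84, 87, 73]

ssc, sse, f_ratio = one_way_anova([rope_1, rope_2, rope_3])
print(f"SSC = {ssc:.2f}, SSE = {sse:.2f}, F = {f_ratio:.2f}")
```

Because the sums of squares are unaffected by subtracting a constant, the script gives the same SSC, SSE and F as the coded table, up to rounding.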

Problem: 37

The following are the number of mistakes made in 5 successive days by 4 technicians

working for a photographic laboratory. Test whether the difference among the four

samples mean can be attributed to chance. [Test at a level of significance ].

I II III IV

6 14 10 9

14 9 12 12

10 12 7 8

8 10 15 10

11 14 11 11

Answer:

S. No I II III IV

1 6 14 10 9

2 14 9 12 12

3 10 12 7 8

4 8 10 15 10

5 11 14 11 11

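Null Hypothesis: $H_0: \mu_1 = \mu_2 = \mu_3 = \mu_4$

i.e., the difference among the four sample means can be attributed to chance.

Alternative Hypothesis: $H_1$:

There is a significant difference among the four sample means.

$N = 20$

$T = \sum\sum x = 49 + 59 + 55 + 50 = 213$

Correction factor $= \frac{T^2}{N} = \frac{(213)^2}{20} = 2268.45$

Total sum of squares SST $= [6^2 + 14^2 + 10^2 + 8^2 + 11^2 + 14^2 + 9^2 + 12^2 + 10^2 + 14^2 + 10^2 + 12^2 + 7^2 + 15^2 + 11^2 + 9^2 + 12^2 + 8^2 + 10^2 + 11^2] - 2268.45$

$= 2383 - 2268.45 = 114.55$

Between column sum of squares SSC $= \frac{(49)^2}{5} + \frac{(59)^2}{5} + \frac{(55)^2}{5} + \frac{(50)^2}{5} - 2268.45$

$= 2281.4 - 2268.45 = 12.95$

Error sum of squares SSE $= SST - SSC = 114.55 - 12.95 = 101.6$

ANOVA Table

Source of Variation | Degrees of Freedom | Sum of squares | Mean sum of squares | F-ratio

Between technicians | $4 - 1 = 3$ | 12.95 | 4.317 | $\frac{4.317}{6.35} = 0.68$

Error | $20 - 4 = 16$ | 101.6 | 6.35 |

Total | $20 - 1 = 19$ | 114.55 | |

Table value:

$F_{0.05}(3, 16) = 3.24$ and $F_{0.01}(3, 16) = 5.29$.

Conclusion:

Calculated $F$ (0.68) is less than the table value at either level, so we accept the null hypothesis; there is no significant difference among

the four sample means.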

Problem: 38

As part of the investigation of the collapse of the roof of a building, a testing

laboratory is given all the available bolts that connected all the steel structure at three

different positions on the roof. The forces required to shear each of these bolts (coded

values) are as follows:

Position 1 90 82 79 98 83 91

Position 2 105 89 93 104 89 95 86

Position 3 83 89 80 94

Answer:

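To simplify the calculations we subtract 90 from each value.

S. No | Position 1 | Position 2 | Position 3

1 | 0 | 15 | −7

2 | −8 | −1 | −1

3 | −11 | 3 | −10

4 | 8 | 14 | 4

5 | −7 | −1 |

6 | 1 | 5 |

7 | | −4 |

Total | −17 | 31 | −14

Null Hypothesis: $H_0: \mu_1 = \mu_2 = \mu_3$

i.e., the difference among the sample means at the three positions is not significant.

Alternative Hypothesis: $H_1$: The differences between the sample means are significant.

$N = 17$

$T = \sum\sum x = -17 + 31 - 14 = 0$

Correction factor $= \frac{T^2}{N} = \frac{(0)^2}{17} = 0$

Total sum of squares SST $= [(0)^2 + (-8)^2 + (-11)^2 + (8)^2 + (-7)^2 + (1)^2 + (15)^2 + (-1)^2 + (3)^2 + (14)^2 + (-1)^2 + (5)^2 + (-4)^2 + (-7)^2 + (-1)^2 + (-10)^2 + (4)^2] - 0$

$= 938$

Between positions (column) sum of squares SSC $= \frac{(-17)^2}{6} + \frac{(31)^2}{7} + \frac{(-14)^2}{4} - 0$

$= 48.17 + 137.29 + 49 = 234.45$

Error sum of squares SSE $= SST - SSC = 938 - 234.45 = 703.55$

ANOVA Table

Source of Variation | Degrees of Freedom | Sum of squares | Mean sum of squares | F-ratio

Between Positions | $3 - 1 = 2$ | 234.45 | 117.23 | $\frac{117.23}{50.25} = 2.33$

Error | $17 - 3 = 14$ | 703.55 | 50.25 |

Total | 16 | 938 | |

Table value:

$F_{0.05}(2, 14) = 3.74$

Conclusion:

Calculated $F$ (2.33) $<$ table value (3.74), so we accept the null hypothesis; there is no significant difference among

the sample means at the three positions.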

Problem: 39

A completely randomized design experiment with 10 plots and 3 treatments gave the

following results:

Plot No 1 2 3 4 5 6 7 8 9 10

Treatment A B C A C C A B A B

Yield 5 4 3 7 5 1 3 4 1 7

Analyze the results for treatment effects.

OR

A completely randomized design experiment with ten plots and three treatments gave the

results given below. Analyze the results for the effects of treatments.

Treatment Replications

A 5 7 1 3

B 4 4 7

C 3 1 5

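Answer:

S. No | Treatment A | Treatment B | Treatment C

1 | 5 | 4 | 3

2 | 7 | 4 | 1

3 | 1 | 7 | 5

4 | 3 | |

Total | 16 | 15 | 9

Null Hypothesis: $H_0$:

There is no significant difference in the effects of treatments.

Alternative Hypothesis: $H_1$:

There is significant difference in the effects of treatments.

$N = 10$

$T = \sum\sum x = 16 + 15 + 9 = 40$

Correction factor $= \frac{T^2}{N} = \frac{(40)^2}{10} = 160$

Total sum of squares SST $= [5^2 + 7^2 + 1^2 + 3^2 + 4^2 + 4^2 + 7^2 + 3^2 + 1^2 + 5^2] - 160$

$= 200 - 160 = 40$

Between treatments (column) sum of squares SSC $= \frac{(16)^2}{4} + \frac{(15)^2}{3} + \frac{(9)^2}{3} - 160$

$= 64 + 75 + 27 - 160 = 6$

Error sum of squares SSE $= SST - SSC = 40 - 6 = 34$

ANOVA Table

Source of Variation | Degrees of Freedom | Sum of squares | Mean sum of squares | F-ratio

Between treatments | $3 - 1 = 2$ | 6 | 3 | $\frac{3}{4.857} = 0.62$

Error | $10 - 3 = 7$ | 34 | 4.857 |

Total | 9 | 40 | |

Table value:

$F_{0.05}(2, 7) = 4.74$

Conclusion:

Calculated $F$ (0.62) $<$ table value (4.74). We accept $H_0$ and conclude that there is no significant difference between

the effects of treatments.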

Two Way Classification

Randomized Block Design (RBD)

With a randomized block design, the experimenter divides subjects into subgroups

called blocks, such that the variability within blocks is less than the variability between

blocks. Then, subjects within each block are randomly assigned to treatment conditions.

Compared to a completely randomized design, this design reduces variability within

treatment conditions and potential confounding, producing a better estimate of treatment

effects.

The table below shows a randomized block design for a hypothetical medical

experiment.

Gender Treatment

Placebo Vaccine

Male 250 250

Female 250 250

Subjects are assigned to blocks, based on gender. Then, within each block, subjects

are randomly assigned to treatments (either a placebo or a cold vaccine). For this design,

250 men get the placebo, 250 men get the vaccine, 250 women get the placebo, and 250

women get the vaccine.

It is known that men and women are physiologically different and react differently to

medication. This design ensures that each treatment condition has an equal proportion of

men and women. As a result, differences between treatment conditions cannot be

attributed to gender. This randomized block design removes gender as a potential source

of variability and as a potential confounding variable.

https://stattrek.com/Help/Glossary.aspx?Target=Completely%20randomized%20design

https://stattrek.com/Help/Glossary.aspx?Target=Placebo



Advantages of randomized block designs

The precision is more in RBD.

The amount of information obtained in RBD is more as compared to CRD.

RBD is more flexible. Statistical analysis is simple and easy.

Even if some values are missing, still the analysis can be done by using missing

plot technique.

Disadvantages of randomized complete block designs

When the number of treatments is increased, the block size will increase.

If the block size is large maintaining homogeneity is difficult and hence when more

number of treatments is present this design may not be suitable.

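ANOVA table for Two-way classification.

Source of variation | Sum of squares | Degrees of freedom | Mean sum of squares | F-ratio

Between Columns | SSC | $c - 1$ | $MSC = \frac{SSC}{c - 1}$ | $F_C = \frac{MSC}{MSE}$

Between Rows | SSR | $r - 1$ | $MSR = \frac{SSR}{r - 1}$ | $F_R = \frac{MSR}{MSE}$

Error | SSE | $(r - 1)(c - 1)$ | $MSE = \frac{SSE}{(r - 1)(c - 1)}$ |

Here $r$ is the number of rows and $c$ is the number of columns.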
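For reference, the sums of squares in the two-way table can be written out as below. This is a sketch with notation assumed here for illustration: $x_{ij}$ is the entry in row $i$ and column $j$ of an $r \times c$ table with one observation per cell, $T_{i\cdot}$ and $T_{\cdot j}$ are the row and column totals, $T$ is the grand total and $N = rc$.

```latex
\begin{aligned}
\text{C.F.} &= \frac{T^2}{N}, \qquad N = rc \\
SST &= \sum_{i=1}^{r}\sum_{j=1}^{c} x_{ij}^2 - \frac{T^2}{N} \\
SSC &= \frac{1}{r}\sum_{j=1}^{c} T_{\cdot j}^2 - \frac{T^2}{N} \\
SSR &= \frac{1}{c}\sum_{i=1}^{r} T_{i\cdot}^2 - \frac{T^2}{N} \\
SSE &= SST - SSC - SSR
\end{aligned}
```

Each column total is a sum of $r$ entries and each row total a sum of $c$ entries, which is where the divisors $r$ and $c$ come from.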

Problem: 40

The following data represents the number of units of production per day turned out

by different workers using 4 different types of machines.

Machine Type

A B C D

1 44 38 47 36

2 46 40 52 43

Workers 3 34 36 44 32

4 43 38 46 33

5 38 42 49 39

1. Test whether the five men differ with respect to mean productivity and

2. Test whether the mean productivity is the same for the four different machine types.

http://ecoursesonline.iasri.res.in/mod/page/view.php?id=15612




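Answer:

Null hypothesis:

1. The 5 workers do not differ with respect to mean productivity.

2. The mean productivity is the same for the four different machines.

To simplify calculation let us subtract 40 from each value; the new values are

Workers | A | B | C | D | Total

1 | 4 | −2 | 7 | −4 | 5

2 | 6 | 0 | 12 | 3 | 21

3 | −6 | −4 | 4 | −8 | −14

4 | 3 | −2 | 6 | −7 | 0

5 | −2 | 2 | 9 | −1 | 8

Total | 5 | −6 | 38 | −17 | 20

$N = 20$, $T = \sum\sum x = 20$

Correction factor $= \frac{T^2}{N} = \frac{(20)^2}{20} = 20$

Total sum of squares SST $= [(4)^2 + (-2)^2 + (7)^2 + (-4)^2 + (6)^2 + (0)^2 + (12)^2 + (3)^2 + (-6)^2 + (-4)^2 + (4)^2 + (-8)^2 + (3)^2 + (-2)^2 + (6)^2 + (-7)^2 + (-2)^2 + (2)^2 + (9)^2 + (-1)^2] - 20$

$= 594 - 20 = 574$

Between machines (column) sum of squares SSC $= \frac{(5)^2}{5} + \frac{(-6)^2}{5} + \frac{(38)^2}{5} + \frac{(-17)^2}{5} - 20$

$= 358.8 - 20 = 338.8$

Between workers (row) sum of squares SSR $= \frac{(5)^2}{4} + \frac{(21)^2}{4} + \frac{(-14)^2}{4} + \frac{(0)^2}{4} + \frac{(8)^2}{4} - 20$

$= 181.5 - 20 = 161.5$

Error sum of squares SSE $= SST - SSC - SSR = 574 - 338.8 - 161.5 = 73.7$

ANOVA table

Source of variation | Degrees of freedom | Sum of squares (SS) | Mean sum of squares (MS) | Variance Ratio (F-Ratio)

Machines | $4 - 1 = 3$ | 338.8 | 112.93 | $F_C = \frac{112.93}{6.14} = 18.39$

Workers | $5 - 1 = 4$ | 161.5 | 40.38 | $F_R = \frac{40.38}{6.14} = 6.57$

Error | 12 | 73.7 | 6.14 |

Total | 19 | 574 | |

Table value:

$F_{0.05}(3, 12) = 3.49$ and $F_{0.05}(4, 12) = 3.26$

Conclusion:

$F_C$ (18.39) $> F_{0.05}(3, 12)$ (3.49). Hence $H_0$ is rejected. That is, the mean productivity is not the same

for the four machines.

$F_R$ (6.57) $> F_{0.05}(4, 12)$ (3.26). Hence $H_0$ is rejected. That is, the mean productivity is not the same

for the five different workers.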
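The two-way computation of Problem 40 can also be checked in code. The following is a minimal sketch, assuming one observation per cell; the function name `two_way_anova` and the dictionary keys are hypothetical, and the script works on the raw (uncoded) data.

```python
def two_way_anova(table):
    """Two-way ANOVA (one observation per cell) for a list of equal-length rows."""
    r, c = len(table), len(table[0])
    n = r * c
    grand_total = sum(sum(row) for row in table)
    correction = grand_total ** 2 / n                          # T^2 / N

    sst = sum(x * x for row in table for x in row) - correction
    ssr = sum(sum(row) ** 2 for row in table) / c - correction
    ssc = sum(sum(col) ** 2 for col in zip(*table)) / r - correction
    sse = sst - ssr - ssc

    mse = sse / ((r - 1) * (c - 1))
    return {
        "SSC": ssc,
        "SSR": ssr,
        "SSE": sse,
        "F_columns": (ssc / (c - 1)) / mse,
        "F_rows": (ssr / (r - 1)) / mse,
    }


# Problem 40: rows are workers 1 to 5, columns are machine types A to D.
production = [
    [44, 38, 47, 36],
    [46, 40, 52, 43],
    [34, 36, 44, 32],
    [43, 38, 46, 33],
    [38, 42, 49, 39],
]

result = two_way_anova(production)
for name, value in result.items():
    print(f"{name} = {value:.2f}")
```

The column F-ratio is then compared with the table value for $(c - 1, (r - 1)(c - 1))$ degrees of freedom and the row F-ratio with the table value for $(r - 1, (r - 1)(c - 1))$ degrees of freedom.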



Problem: 41

A company appoints 4 salesmen A, B, C and D and observes their sales in 3 seasons:

summer, winter and monsoon. The figures (in lakhs of Rs.) are given in the following table:

Salesmen

Season A B C D

Summer 45 40 38 37

Winter 43 41 45 38

Monsoon 39 39 41 41

Carry out an analysis of variance.

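Answer:

Null hypothesis:

$H_{01}$: There is no significant difference between the sales in the three seasons.

$H_{02}$: There is no significant difference between the four salesmen.

To simplify calculation let us subtract 40 from each value; the new values are

Season | A | B | C | D | Total

Summer | 5 | 0 | −2 | −3 | 0

Winter | 3 | 1 | 5 | −2 | 7

Monsoon | −1 | −1 | 1 | 1 | 0

Total | 7 | 0 | 4 | −4 | 7

$N = 12$, $T = \sum\sum x = 7$

Correction factor $= \frac{T^2}{N} = \frac{(7)^2}{12} = 4.083$

Total sum of squares SST $= [(5)^2 + (0)^2 + (-2)^2 + (-3)^2 + (3)^2 + (1)^2 + (5)^2 + (-2)^2 + (-1)^2 + (-1)^2 + (1)^2 + (1)^2] - 4.083$

$= 81 - 4.083 = 76.917$

Between salesmen (column) sum of squares SSC $= \frac{(7)^2}{3} + \frac{(0)^2}{3} + \frac{(4)^2}{3} + \frac{(-4)^2}{3} - 4.083$

$= 27 - 4.083 = 22.917$

Between seasons (row) sum of squares SSR $= \frac{(0)^2}{4} + \frac{(7)^2}{4} + \frac{(0)^2}{4} - 4.083$

$= 12.25 - 4.083 = 8.167$

Error sum of squares SSE $= SST - SSC - SSR = 76.917 - 22.917 - 8.167 = 45.833$

ANOVA table

Source of variation | Degrees of freedom | Sum of squares (SS) | Mean sum of squares (MS) | Variance Ratio (F-Ratio)

Salesmen | $4 - 1 = 3$ | 22.917 | 7.639 | $F_C = \frac{7.639}{7.639} = 1.00$

Seasons | $3 - 1 = 2$ | 8.167 | 4.0835 | $F_R = \frac{4.0835}{7.639} = 0.53$

Error | 6 | 45.833 | 7.639 |

Total | 11 | 76.917 | |

Table value:

$F_{0.05}(3, 6) = 4.76$ and $F_{0.05}(2, 6) = 5.14$

Conclusion:

$F_C$ (1.00) $< F_{0.05}(3, 6)$ (4.76). Hence we accept the null hypothesis. That is, there is no difference

between the sales of the four salesmen.

$F_R$ (0.53) $< F_{0.05}(2, 6)$ (5.14). Hence we accept the null hypothesis. That is, there is no difference

between the sales in the seasons.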

Problem: 42

Analyse the following RBD and draw your conclusion.

Treatments

Blocks

12 14 20 22

17 27 19 15

15 14 17 12

18 16 22 12

19 15 20 14

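Answer:

Null hypothesis:

There is no significant difference between treatments and blocks.

To simplify calculation let us subtract 15 from each value; the new values are (treatments as columns and blocks as rows, numbered in the order given)

Blocks | 1 | 2 | 3 | 4 | Total

1 | −3 | −1 | 5 | 7 | 8

2 | 2 | 12 | 4 | 0 | 18

3 | 0 | −1 | 2 | −3 | −2

4 | 3 | 1 | 7 | −3 | 8

5 | 4 | 0 | 5 | −1 | 8

Total | 6 | 11 | 23 | 0 | 40

$N = 20$, $T = \sum\sum x = 40$

Correction factor $= \frac{T^2}{N} = \frac{(40)^2}{20} = 80$

Total sum of squares SST $= [(-3)^2 + (-1)^2 + (5)^2 + (7)^2 + (2)^2 + (12)^2 + (4)^2 + (0)^2 + (0)^2 + (-1)^2 + (2)^2 + (-3)^2 + (3)^2 + (1)^2 + (7)^2 + (-3)^2 + (4)^2 + (0)^2 + (5)^2 + (-1)^2] - 80$

$= 372 - 80 = 292$

Between column sum of squares SSC $= \frac{(6)^2}{5} + \frac{(11)^2}{5} + \frac{(23)^2}{5} + \frac{(0)^2}{5} - 80$

$= 137.2 - 80 = 57.2$

Between row sum of squares SSR $= \frac{(8)^2}{4} + \frac{(18)^2}{4} + \frac{(-2)^2}{4} + \frac{(8)^2}{4} + \frac{(8)^2}{4} - 80$

$= 130 - 80 = 50$

Error sum of squares SSE $= SST - SSC - SSR = 292 - 57.2 - 50 = 184.8$

ANOVA table

Source of variation | Degrees of freedom | Sum of squares (SS) | Mean sum of squares (MS) | Variance Ratio (F-Ratio)

Treatments | $4 - 1 = 3$ | 57.2 | 19.07 | $F_C = \frac{19.07}{15.4} = 1.24$

Blocks | $5 - 1 = 4$ | 50 | 12.5 | $F_R = \frac{12.5}{15.4} = 0.81$

Error | 12 | 184.8 | 15.4 |

Total | 19 | 292 | |

Table value:

$F_{0.05}(3, 12) = 3.49$ and $F_{0.05}(4, 12) = 3.26$

Conclusion:

$F_C$ (1.24) $< F_{0.05}(3, 12)$ (3.49). Hence the null hypothesis is accepted, that is, there is no significant

difference between treatments.

$F_R$ (0.81) $< F_{0.05}(4, 12)$ (3.26). Hence the null hypothesis is accepted, that is, there is no significant

difference between blocks.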

Problem: 43

A set of data involving four tropical feed stuffs A, B, C, D tried on 20 chicks is

given below. All the twenty chicks are treated alike in all respects except the feeding

treatments and each feeding treatment is given to 5 chicks. Analyze the data. Weight gain

of baby chicks fed on different feeding materials composed of tropical feed stuffs.

Total

A 55 49 42 21 52 219

B 61 112 30 89 63 355

C 42 97 81 95 92 407

D 169 137 169 85 154 714

Grand Total 1695

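Answer:

Null hypothesis:

There is no significant difference between rows and columns.

Feed | 1 | 2 | 3 | 4 | 5 | Total

A | 55 | 49 | 42 | 21 | 52 | 219

B | 61 | 112 | 30 | 89 | 63 | 355

C | 42 | 97 | 81 | 95 | 92 | 407

D | 169 | 137 | 169 | 85 | 154 | 714

Total | 327 | 395 | 322 | 290 | 361 | 1695

$N = 20$, $T = \sum\sum x = 1695$

Correction factor $= \frac{T^2}{N} = \frac{(1695)^2}{20} = 143651.25$

Total sum of squares SST $= [55^2 + 49^2 + 42^2 + 21^2 + 52^2 + 61^2 + 112^2 + 30^2 + 89^2 + 63^2 + 42^2 + 97^2 + 81^2 + 95^2 + 92^2 + 169^2 + 137^2 + 169^2 + 85^2 + 154^2] - 143651.25$

$= 181445 - 143651.25 = 37793.75$

Between column sum of squares SSC $= \frac{(327)^2}{4} + \frac{(395)^2}{4} + \frac{(322)^2}{4} + \frac{(290)^2}{4} + \frac{(361)^2}{4} - 143651.25$

$= 145264.75 - 143651.25 = 1613.5$

Between row sum of squares SSR $= \frac{(219)^2}{5} + \frac{(355)^2}{5} + \frac{(407)^2}{5} + \frac{(714)^2}{5} - 143651.25$

$= 169886.2 - 143651.25 = 26234.95$

Error sum of squares SSE $= SST - SSC - SSR = 37793.75 - 1613.5 - 26234.95 = 9945.3$

ANOVA table

Source of variation | Degrees of freedom | Sum of squares (SS) | Mean sum of squares (MS) | Variance Ratio (F-Ratio)

Between Columns | $5 - 1 = 4$ | 1613.5 | 403.38 | $F_C = \frac{403.38}{828.78} = 0.49$

Between Rows | $4 - 1 = 3$ | 26234.95 | 8744.98 | $F_R = \frac{8744.98}{828.78} = 10.55$

Error | 12 | 9945.3 | 828.78 |

Total | 19 | 37793.75 | |

Table value:

$F_{0.05}(4, 12) = 3.26$ and $F_{0.05}(3, 12) = 3.49$

Conclusion:

$F_C$ (0.49) $< F_{0.05}(4, 12)$ (3.26). Hence the null hypothesis is accepted, that is, there is no significant

difference between columns.

$F_R$ (10.55) $> F_{0.05}(3, 12)$ (3.49). Hence the null hypothesis is rejected, that is, there is some

significant difference between rows.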

Two Marks

1. State level of significance and critical region.

Answer:

Level of Significance:

The level of significance is defined as the probability of rejecting a null hypothesis by

the test when it is really true, which is denoted as $\alpha$. That is, $P(\text{Type I error}) = \alpha$.

Critical region:

The critical region is the region of values that corresponds to the rejection of the null

hypothesis at some chosen probability level.
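As an illustration of how the level of significance fixes the critical region, the sketch below computes the critical value of a z-test from $\alpha$ and checks whether a test statistic falls in the rejection region. The function name `in_critical_region` is hypothetical, and the numbers used are those of Problem 1 in the question bank (sample mean 1580, hypothesized mean 1600, S.D. 90, $n = 100$), with the 5% level assumed.

```python
from statistics import NormalDist


def in_critical_region(z, alpha=0.05, two_tailed=True):
    """Return True if z lies in the rejection region of a z-test.

    For a one-tailed test this sketch checks the right tail only.
    """
    tail_area = alpha / 2 if two_tailed else alpha
    z_critical = NormalDist().inv_cdf(1 - tail_area)
    statistic = abs(z) if two_tailed else z
    return statistic > z_critical


# Two-tailed critical value at the 5% level (about 1.96).
z_crit = NormalDist().inv_cdf(1 - 0.05 / 2)

# Test statistic for a sample mean: z = (x_bar - mu_0) / (s / sqrt(n)).
z = (1580 - 1600) / (90 / 100 ** 0.5)
reject_h0 = in_critical_region(z, alpha=0.05)

print(f"critical value = {z_crit:.3f}, z = {z:.3f}, reject H0: {reject_h0}")
```

A smaller $\alpha$ pushes the critical value further into the tail, so the critical region shrinks and the null hypothesis is harder to reject.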

2. What are parameters and statistics in sampling?

Answer:

Parameters : Parameters are numbers that summarize data for an entire population.

Statistics : Statistics are numbers that summarize data from a sample, i.e. some

subset of the entire population.

3. Mention the various steps involved in testing of hypothesis.

Answer:

The 7 Step Process of Statistical Hypothesis Testing

Step 1: State the Null Hypothesis.

Step 2: State the Alternative Hypothesis.

Step 3: Set $\alpha$.

Step 4: Collect Data.

Step 5: Calculate a test statistic.

Step 6: Construct Acceptance / Rejection regions.

Step 7: Draw a conclusion about $H_0$.

4. Define Type – I error and Type – II errors.

Answer:

Type I error: Reject $H_0$ when $H_0$ is true.

Type II error: Accept $H_0$ when $H_0$ is false.
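The meaning of a Type I error can be made concrete with a small simulation: when the null hypothesis is true, a test at level $\alpha$ should reject it in roughly a fraction $\alpha$ of repeated samples. The sketch below is illustrative only; the function name, sample size, number of trials and seed are all assumptions chosen for the example.

```python
import random
from statistics import NormalDist, mean


def type_1_error_rate(trials=2000, n=25, alpha=0.05, seed=1):
    """Estimate how often a two-tailed z-test rejects a true H0: mu = 0."""
    rng = random.Random(seed)
    z_critical = NormalDist().inv_cdf(1 - alpha / 2)
    rejections = 0
    for _ in range(trials):
        # Sample from N(0, 1), so H0 (mu = 0, sigma = 1 known) is true.
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        z = mean(sample) / (1.0 / n ** 0.5)
        if abs(z) > z_critical:
            rejections += 1          # a Type I error: true H0 rejected
    return rejections / trials


rate = type_1_error_rate()
print(f"estimated Type I error rate = {rate:.3f}")
```

The estimated rate should be close to the chosen $\alpha$; a Type II error rate could be estimated in the same way by sampling from a population for which $H_0$ is false and counting how often it is accepted.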

5. What are null and alternate hypothesis?

Answer:

There are two types of statistical hypotheses:

Null Hypothesis

A statistical hypothesis that states that there is no difference between a parameter

and a specific value, or that there is no difference between two parameters.

Alternative Hypothesis

A statistical hypothesis that there exists a significant difference between a parameter

and a specific value, or states that there is a difference between two parameters.

6. What are the applications of the t-distribution?

Answer:

To test whether the sample mean differs significantly from the hypothetical value of the

population mean.

To test the significance of the difference between two sample means.

To test the significance of an observed sample correlation coefficient and sample

regression coefficient.

To test the significance of observed partial correlation coefficients.



7. Define Analysis of variance.

Answer:

Analysis of variance is a collection of statistical models and their associated

estimation procedures used to analyze the differences among group means in a sample.

ANOVA was developed by statistician and evolutionary biologist Ronald Fisher.

8. State the basic principles of design of Experiments

Answer:

The major three principles of experimental designs are:

Replication : to provide an estimate of experimental error.

Randomization : to ensure that this estimate is statistically valid.

Local control : to reduce experimental error by making the experiment more efficient.

9. What do you understand by “Design of an experiment”?

Answer:

Design of experiments (DOE) is a systematic method to determine the relationship

between factors affecting a process and the output of that process.

In other words, it is used to find cause-and-effect relationships. This information is

needed to manage process inputs in order to optimize the output.

10. What is the aim of the design of experiment?

Answer:

Design of experiments (DOE) is a systematic method to determine the relationship

between factors affecting a process and the output of that process.

In other words, it is used to find cause-and-effect relationships. This information is

needed to manage process inputs in order to optimize the output.

11. State the assumptions involved in ANOVA.

Answer:

The experimental errors of the data are normally distributed.

Equal variances between treatments (homogeneity of variances, also called homoscedasticity).

Independence of samples (each sample is randomly selected and independent).



12. What are the basic steps in ANOVA?

Answer:

Set up hypotheses

Determine the level of significance.

Select the appropriate test statistic. ...

Set up decision rule. ...

Compute the test statistic. ...

Write Conclusion.

13. When do you apply the analysis of variance technique?

Answer:

Suppose we consider three or more samples at a time, in this situation we need

another testing hypothesis that all the samples are drawn from the same population, i.e.,

they have the same means. In this case we use analysis of variance to test the homogeneity

of several means.

14. What is a completely randomized design?

Answer:

The one-way analysis of variance or a completely randomized design is used to

determine whether there are any statistically significant differences between the means of

two or more independent (unrelated) groups (although you tend to only see it used when

there are a minimum of three, rather than two groups).

15. State any two advantages of a Completely Randomized Experimental Design.

Answer:

Complete flexibility. Can have any number of treatments and blocks.

Easy to calculate and perform the layouts.



16. Write down the ANOVA table for one way classification.

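Answer:

Source of variation | Sum of squares | Degrees of freedom | Mean sum of squares | F-ratio

Between samples | SSC | $k - 1$ | $MSC = \frac{SSC}{k - 1}$ | $F = \frac{MSC}{MSE}$

Within samples | SSE | $N - k$ | $MSE = \frac{SSE}{N - k}$ |

Here $k$ is the number of samples (columns) and $N$ is the total number of observations.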

17. Define: RBD.

Answer:

With a randomized block design, the experimenter divides subjects into subgroups

called blocks, such that the variability within blocks is less than the variability between

blocks. Then, subjects within each block are randomly assigned to treatment conditions.

Compared to a completely randomized design, this design reduces variability within

treatment conditions and potential confounding, producing a better estimate of treatment

effects.

18. Write the ANOVA table for randomized block design.

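Answer:

Source of variation | Sum of squares | Degrees of freedom | Mean sum of squares | F-ratio

Between Columns | SSC | $c - 1$ | $MSC = \frac{SSC}{c - 1}$ | $F_C = \frac{MSC}{MSE}$

Between Rows | SSR | $r - 1$ | $MSR = \frac{SSR}{r - 1}$ | $F_R = \frac{MSR}{MSE}$

Error | SSE | $(r - 1)(c - 1)$ | $MSE = \frac{SSE}{(r - 1)(c - 1)}$ |

Here $r$ is the number of rows and $c$ is the number of columns.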

19. Discuss the advantages and disadvantages of Randomized block design.

Answer:

Advantages:

The precision is more in RBD.

The amount of information obtained in RBD is more as compared to CRD.

RBD is more flexible. Statistical analysis is simple and easy.

https://stattrek.com/Help/Glossary.aspx?Target=Completely%20randomized%20design




Disadvantages:

When the number of treatments is increased, the block size will increase.

If the block size is large maintaining homogeneity is difficult and hence when more

number of treatments is present this design may not be suitable.

20. Compare one-way classification model with two-way classification model.

Answer:

S. No | One-way ANOVA | Two-way ANOVA

1 | We cannot test two sets of hypotheses at a time. | Two sets of hypotheses can be tested at a time.

2 | Data are classified according to one factor. | Data are classified according to two different factors.

21. Write any two differences between RBD and CRD.

Answer:

S. No | Completely randomized design | Randomized block design

1 | We cannot test two sets of hypotheses at a time. | Two sets of hypotheses can be tested at a time.

2 | Data are classified according to one factor. | Data are classified according to two different factors.

Problem: 1

The mean lifetime of a sample of 100 light tubes produced by a company is found to

be 1580 hours with standard deviation of 90 hours. Test the hypothesis that the mean

lifetime of the tubes produced by the company is 1600 hours.

Problem: 2

The mean breaking strength of cables supplied by a manufacturer is 1800 with the

S.D of 100. By a new technique in the manufacturing process, it is claimed that the breaking

strength of the cable has increased. To test this claim a sample of 50 cables is tested and is



found that the mean breaking strength is 1850. Can we support the claim at 1% level of

significance?

Problem: 3

A sample of 100 students is taken from a large population. The mean height of the

students in this sample is 160 cm. Can it be reasonably regarded that this sample is from a

population of mean 165 cm and S.D 10 cm?

Problem: 4

A sample of 900 members has a mean 3.4 cm and S.D 2.61 cm. Is the sample from a

large population of mean 3.25 cm and S.D 2.61 cm. If the population is normal and the

mean is unknown.

Problem: 5

A cosmetics company fills its best-selling 8 ounce jars of facial cream by an automatic

dispensing machine. The machine is set to dispense a mean of 8.1 ounces per jar.

Uncontrollable factors in the process can shift the mean away from 8.1 and cause either

under fill or overfill, both of which are undesirable. In such a case the dispensing machine

is stopped and recalibrated. Regardless of the mean amount dispensed, the standard

deviation of the amount dispensed always has value 0.22 ounce. A quality control engineer

routinely selects 30 jars from the assembly line to check the amounts filled. On one

occasion, the sample mean is 8.2 ounces and the sample standard deviation is 0.25 ounce.

Determine if there is sufficient evidence in the sample to indicate, at the 1% level of

significance, that the machine should be recalibrated.

Problem: 6

It is hoped that a newly developed pain reliever will more quickly produce perceptible

reduction in pain to patients after minor surgeries than a standard pain reliever. The

standard pain reliever is known to bring relief in an average of 3.5 minutes with standard

deviation 2.1 minutes. To test whether the new pain reliever works more quickly than the

standard one, 50 patients with minor surgeries were given the new pain reliever and their

times to relief were recorded. The experiment yielded sample mean minutes and

sample standard deviation 1.5 minutes. Is there sufficient evidence in the sample to



indicate, at the 5% level of significance, that the newly developed pain reliever does deliver

perceptible relief more quickly?

Problem: 7

The buyer of electric bulbs bought 100 bulbs each of two famous brands. Upon

testing these he found that brand A had a mean life of 1500 hours with a standard

deviation of 50 hours whereas brand B had a mean life of 1530 hours with a standard

deviation of 60 hours. Can it be concluded at 5% level of significance, that the two brands

differ significantly in quality?

Problem: 8

Intelligence test given to two groups of boys and girls gave the following information

Mean Score S.D Number

Girls 75 10 50

Boys 70 12 100

Is the difference in the mean scores of boys and girls statistically significant?

Problem: 9

A simple sample of heights of 6400 Englishmen has a mean of 170 cm and S.D of 6.4

cm, while a simple sample of heights of 1600 Americans has mean of 172 cm and S.D of 6.3

cm. Do the data indicate that Americans are on the average taller than Englishmen?

Problem: 10

In a certain factory there are two independent processes manufacturing the same

item. The average weight in a sample of 250 items produced from one process is found to

be 120 Ozs, with a s.d of 12 Ozs, while the corresponding figures in a sample of 400 items

from the other process are 124 Ozs and 14 Ozs. Is the difference between the two sample

means significant?

Problem: 11

A random sample of 10 boys has the following IQ’s 70, 83, 88, 95, 98, 100, 101, 107, 110

and 120. Do these data support the assumption of a population mean IQ of 100 at 5% level

of significance?



Problem: 12

The heights of 10 males of a given locality are found to be 70, 67, 62, 68, 61, 68, 70,

64, 64, 66 inches. Is it reasonable to believe that the average height is greater than 64

inches?

Problem: 13

A simple random sample of 10 people from a certain population has a mean age of

27. Can we conclude that the mean age of the population is not 30? The variance is known

to be 20. Let .

Problem: 14

Ten oil tins are taken at random from an automatic filling machine. The mean weight of the

tins 15.8 kg and standard deviation of 0.5 kg. Does the sample mean differ significantly

from the intended weight of 16 kg?

Problem: 15

The height of six randomly chosen sailors are (in inches): 63, 65, 68, 69, 71 and 72.

Those of 10 randomly chosen soldiers are 61, 62, 65, 66, 69, 69, 70, 71, 72 and 73. Discuss

the light that these data throw on the suggestion that sailors are on the average taller

than soldiers.

Problem: 16

In a packing plant, a machine packs cartons with jars. It is supposed that a new

machine will pack faster on the average than the machine currently used. To test that

hypothesis, the times it takes each machine to pack ten cartons are recorded. The results

(machine.txt), in seconds, are shown in the following table.

New machine: 42.1 41.3 42.4 43.2 41.8 41 41.8 42.8 42.3 42.7

Old machine: 42.7 43.8 42.5 43.1 44 43.6 43.3 43.5 41.7 44.1

Problem: 17

Two independent samples of 8 and 7 items respectively had the following values.

Sample I 9 11 13 11 15 9 12 14

Sample II 10 12 10 14 9 8 10

Is the difference between the means of samples significant?

https://onlinecourses.science.psu.edu/stat500/sites/onlinecourses.science.psu.edu.stat500/files/data/machine/index.txt



Problem: 18

In a test given to two groups of students the marks obtained were as follows.

First Group 18 20 36 50 49 36 34 49 41

Second Group 29 28 26 35 30 44 46

Examine the significant difference between the means of marks secured by students of the

above two groups.

Problem: 19

A university has found over the years that out of all the students who are offered

admission, the proportion who accepts is .70. After a new director of admissions is hired,

the university wants to check if the proportion of students accepting has changed

significantly. Suppose they offer admission to 1200 students and 888 accept. Is this

evidence at the level that there has been a real change from the status quo?

Problem: 20

The CEO of a large electric utility claims that 80 percent of his 1,000,000 customers are

very satisfied with the service they receive. To test this claim, the local newspaper

surveyed 100 customers, using simple random sampling. Among the sampled customers,

73 percent say they are very satisfied. Based on these findings, can we reject the CEO's

hypothesis that 80% of the customers are very satisfied? Use a 0.05 level of significance.

Problem: 21

40 people are attacked by disease and only 36 survived. Will you reject the hypothesis that

the survival rate, if attacked by this disease, is 85% in favour of hypothesis that it is more,

at 5% level of significance?

Problem: 22

A producer confesses that 22% of the items manufactured by him will be defective. To test

his claim a random sample of 80 items were selected and 20 items were noted to be

defective. Test the validity of the producer’s claim at 1% level of significance.

Problem: 23

A die is thrown 9000 times and throw of 3 or 4 is observed 3240 times. Show that the die

cannot be regarded as an unbiased one and find the limits between which the probability

of a throw of 3 or 4 lies.



Problem: 24

A manufacturer of light bulbs claims that an average 2% of the bulbs manufactured by his

firm are defective. A random sample of 400 bulbs contained 13 defective bulbs. On the

basis of this sample, can you support the manufacturer’s claim at 5% level of significance?

Problem: 25

A quality control engineer suspects that the proportion of defective units among certain

manufactured items has increased from the set limit of 0.01. To test his claim, he

randomly selected 100 of these items and found that the proportion of defective units in

the sample was 0.02. Test the engineer’s hypothesis at 0.05 level of significance.

Problem: 26

A coin is tossed 256 times and 132 heads are obtained. Would you conclude that the coin is

a biased one?

Problem: 27

Suppose the Acme Drug Company develops a new drug, designed to prevent colds. The

company states that the drug is equally effective for men and women. To test this claim,

they choose a simple random sample of 100 women and 200 men from a population of

100,000 volunteers. At the end of the study, 38% of the women caught a cold; and 51% of

the men caught a cold. Based on these findings, can we reject the company's claim that the

drug is equally effective for men and women? Use a 0.05 level of significance.

Problem: 28

Time magazine reported the result of a telephone poll of 800 adult Americans. The

question posed of the Americans who were surveyed was: "Should the federal tax on

cigarettes be raised to pay for health care reform?" The results of the survey were:

Non – Smokers Smokers

said yes said yes

Is there sufficient evidence at the α = 0.05 level, say, to conclude that the two populations

smokers and non-smokers differ significantly with respect to their opinions?



Problem: 29

A swimming school wants to determine whether a recently hired instructor is working out.

Sixteen out of 25 of Instructor A's students passed the lifeguard certification test on the

first try. In comparison, 57 out of 72 of more experienced Instructor B's students passed

the test on the first try. Is Instructor A's success rate worse than Instructor B's? Use

α = 0.10.

Problem: 30

Two types of medication for hives are being tested to determine if there is a difference in

the proportions of adult patient reactions. Twenty out of a random sample of 200 adults

given medication A still had hives 30 minutes after taking the medication. Twelve out of

another random sample of 200 adults given medication B still had hives 30 minutes after

taking the medication. Test at a 1% level of significance.

Problem: 31

Researchers conducted a study of smartphone use among adults. A cell phone company

claimed that iPhone smartphones are more popular with whites (non-Hispanic) than with

African Americans. The results of the survey indicate that of the 232 African American cell

phone owners randomly sampled, 5% have an iPhone. Of the 1,343 white cell phone

owners randomly sampled, 10% own an iPhone. Test at the 5% level of significance. Is the

proportion of white iPhone owners greater than the proportion of African American

iPhone owners?

Problem: 32

Two independent samples of sizes 9 and 7 from a normal population had the following

values of the variables.

Sample I : 18 13 12 15 12 14 16 14 15

Sample II : 16 19 13 16 18 13 15

Do the estimates of population variance differ significantly at 5% level of significance?
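The variance-ratio (F) statistic for this problem can be checked with a few lines of Python. This is a sketch, assuming the usual convention of putting the larger sample variance in the numerator. The critical value 3.58 is an assumed table entry for (6, 8) degrees of freedom at the upper 5% point and should be confirmed against the F table used in class.

```python
from statistics import variance

sample_1 = [18, 13, 12, 15, 12, 14, 16, 14, 15]
sample_2 = [16, 19, 13, 16, 18, 13, 15]

# Unbiased estimates of the population variance (divisor n - 1).
var_1 = variance(sample_1)
var_2 = variance(sample_2)

# Larger variance goes in the numerator so that F >= 1.
if var_1 >= var_2:
    f_stat, df = var_1 / var_2, (len(sample_1) - 1, len(sample_2) - 1)
else:
    f_stat, df = var_2 / var_1, (len(sample_2) - 1, len(sample_1) - 1)

F_CRITICAL = 3.58  # assumed table value for df = (6, 8) at the 5% level

print(f"s1^2 = {var_1:.3f}, s2^2 = {var_2:.3f}")
print(f"F = {f_stat:.3f} with df = {df}")
print("Significant" if f_stat > F_CRITICAL else "Not significant")
```

Replacing the two lists lets the same sketch be used for the other variance-ratio problems. The critical value then has to be looked up again for the new degrees of freedom.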

Problem: 33

The times taken by workers to perform a job are given below:

Type I 21 17 27 28 24 23

Type II 28 34 43 36 33 35 39



Test whether there is any significant difference between the variances of time distribution.

Problem: 34



Sample I: 20 16 26 27 23 22

Sample II: 27 33 42 35 32 34 38

Problem: 35



Sample I: 76 68 70 43 94 68 33

Sample II: 40 48 92 85 70 76 68 22

Problem: 36

A random sample is selected from each of three makes of ropes and their breaking

strength (in pounds) are measured with the following results:

Make I : 70 72 75 80 83

Make II : 100 110 108 112 113 120 107

Make III : 60 65 57 84 87 73

Test whether the breaking strength of the ropes differs significantly.

Problem: 37

The following are the number of mistakes made in 5 successive days by 4 technicians

working for a photographic laboratory. Test whether the differences among the four

sample means can be attributed to chance. [Test at a level of significance ].

I     II    III   IV

6     14    10    9

14    9     12    12

10    12    7     8

8     10    15    10

11    14    11    11



Problem: 38

As part of the investigation of the collapse of the roof of a building, a testing

laboratory is given all the available bolts that connected all the steel structure at three

different positions on the roof. The forces required to shear each of these bolts (coded

values) are as follows:

Position 1 90 82 79 98 83 91

Position 2 105 89 93 104 89 95 86

Position 3 83 89 80 94

Problem: 39

A completely randomized design experiment with 10 plots and 3 treatments gave the

following results:

Plot No      1   2   3   4   5   6   7   8   9   10

Treatment    A   B   C   A   C   C   A   B   A   B

Yield        5   4   3   7   5   1   3   4   1   7

Analyze the results for treatment effects.

OR

A completely randomized design experiment with ten plots and three treatments gave the

results given below. Analyze the results for the effects of treatments.

Treatment    Replications

A            5   7   1   3

B            4   4   7

C            3   1   5
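The one-way analysis of variance for this completely randomized design can be sketched as follows. It uses the correction-factor method (total, between-treatment and within-treatment sums of squares). The function name `one_way_anova` is illustrative only. No critical value is computed here; the resulting F ratio is to be compared with the tabulated F for (2, 7) degrees of freedom.

```python
def one_way_anova(groups):
    """Return (SS between, SS within, df between, df within, F) for a one-way layout."""
    values = [x for g in groups.values() for x in g]
    n = len(values)
    correction = sum(values) ** 2 / n  # correction factor T^2 / N
    ss_total = sum(x * x for x in values) - correction
    ss_between = sum(sum(g) ** 2 / len(g) for g in groups.values()) - correction
    ss_within = ss_total - ss_between
    df_between = len(groups) - 1
    df_within = n - len(groups)
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return ss_between, ss_within, df_between, df_within, f_ratio


# Yields grouped by treatment, as in the second form of the problem.
yields = {"A": [5, 7, 1, 3], "B": [4, 4, 7], "C": [3, 1, 5]}

ss_b, ss_w, df_b, df_w, f_ratio = one_way_anova(yields)
print(f"SS between = {ss_b:.2f} (df {df_b}), SS within = {ss_w:.2f} (df {df_w})")
print(f"F = {f_ratio:.3f}")
```

Because the groups may have unequal sizes, the same function also fits the other one-way problems in this set once their data are entered as lists.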

Problem: 40

The following data represents the number of units of production per day turned out

by different workers using 4 different types of machines.

Workers    Machine Type

           A     B     C     D

1          44    38    47    36

2          46    40    52    43

3          34    36    44    32

4          43    38    46    33

5          38    42    49    39

1. Test whether the five men differ with respect to mean productivity and

2. Test whether the mean productivity is the same for the four different machine types.
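A two-way analysis of variance with one observation per cell, as needed for both parts of this problem, can be sketched as below. Rows are workers and columns are machine types. The function name `two_way_anova` is illustrative. The F ratios it returns still have to be compared with tabulated values for (4, 12) and (3, 12) degrees of freedom at whatever significance level is chosen, since the problem does not state one.

```python
def two_way_anova(table):
    """Return (F for rows, F for columns) for a two-way layout without replication."""
    r, c = len(table), len(table[0])
    n = r * c
    grand_total = sum(sum(row) for row in table)
    correction = grand_total ** 2 / n  # correction factor T^2 / N
    ss_total = sum(x * x for row in table for x in row) - correction
    ss_rows = sum(sum(row) ** 2 for row in table) / c - correction
    ss_cols = sum(sum(col) ** 2 for col in zip(*table)) / r - correction
    ss_error = ss_total - ss_rows - ss_cols
    ms_error = ss_error / ((r - 1) * (c - 1))
    f_rows = (ss_rows / (r - 1)) / ms_error
    f_cols = (ss_cols / (c - 1)) / ms_error
    return f_rows, f_cols


# Units produced per day: one row per worker, columns are machine types A to D.
production = [
    [44, 38, 47, 36],
    [46, 40, 52, 43],
    [34, 36, 44, 32],
    [43, 38, 46, 33],
    [38, 42, 49, 39],
]

f_workers, f_machines = two_way_anova(production)
print(f"F (workers)  = {f_workers:.2f} with df (4, 12)")
print(f"F (machines) = {f_machines:.2f} with df (3, 12)")
```

The same layout (rows by columns, one value per cell) applies to the salesmen-by-season and randomized block problems later in this set.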



Problem: 41

A company appoints 4 salesmen A, B, C and D and observes their sales in 3 seasons:

summer, winter and monsoon. The figures (in lakhs of Rs.) are given in the following table:

Season     Salesmen

           A     B     C     D

Summer     45    40    38    37

Winter     43    41    45    38

Monsoon    39    39    41    41

Carry out an analysis of variance.

Problem: 42

Analyse the following RBD and draw your conclusion.

Treatments

Blocks

12 14 20 22 17 27 19 15 15 14 17 12 18 16 22 12 19 15 20 14

Problem: 43

A set of data involving four tropical feed stuffs A, B, C, D tried on 20 chicks is

given below. All the twenty chicks are treated alike in all respects except the feeding

treatments and each feeding treatment is given to 5 chicks. Analyze the data. Weight gain

of baby chicks fed on different feeding materials composed of tropical feed stuffs.

Feed    Weight gain                      Total

A       55     49     42     21    52     219

B       61     112    30     89    63     355

C       42     97     81     95    92     407

D       169    137    169    85    154    714

Grand Total 1695



Assignment Problems:

1. A machine puts out 16 imperfect articles in a sample of 500. After the machine is

overhauled, it puts out 3 imperfect articles in a batch of 100. Has the machine

improved?

2. Before an increase in excise duty on tea, 800 persons out of a sample of 1000 persons

were found to be tea drinkers. After an increase in duty, 800 people were tea drinkers in

a sample of 1200 people. State whether there is a significant decrease in the

consumption of tea after the increase in excise duty?

3. In two large populations, there are 30 and 25 percent respectively of blue-eyed people.

Is this difference likely to be hidden in samples of 1200 and 900 respectively from the

two populations?

4. In a random sample of 100 men taken from village A, 60 were found to be

consuming alcohol. In another sample of 200 men taken from village B, 100 were

found to be consuming alcohol. Do the two villages differ significantly in respect of the

proportion of men who consume alcohol?

5. In a referendum submitted to the student body at a university, 850 men and 560

women voted. 500 men and 320 women voted favorably. Does this indicate a significant

difference of opinion between men and women on this matter at 1% level of

significance?

6. In a year there are 956 births in a town A of which 52.5% were male, while in towns A &

B combined, this proportion in a total of 1406 births was 0.496. Is there any significant

difference in the proportion of male births in the two towns?

7. A cigarette manufacturing firm claims that its brand A cigarette outsells its brand B by

8%. If it is found that 42 out of a sample of 200 smokers prefer brand A and 18 out of

another random sample of 100 smokers prefer brand B, test whether the 8% difference

is a valid claim.

8. A sample of 10 boys had the I.Q’s: 70, 120, 110, 101, 88, 83, 95, 98, 100 and 107. Test

whether the population mean I.Q may be 100.



9. A mathematics test was given to 50 girls and 75 boys. The girls made an average grade

of 76 with a SD of 6, while boys made an average grade of 82 with a SD of 2. Test

whether there is any significant difference between the performance of boys and girls.

10. A random sample of 100 bulbs from a company P shows a mean life 1300 hours and

standard deviation of 82 hours. Another random sample of 100 bulbs from company Q

showed a mean life 1248 hours and standard deviation of 93 hours. Are the bulbs of

company P superior to bulbs of company Q at 5% level of significance?

11. Test if the difference in the means is significant for the following data:

Sample I: 76 68 70 43 94 68 33

Sample II: 40 48 92 85 70 76 68 22

12. The sales manager of a large company conducted a sample survey in two places A

and B taking 200 samples in each case. The results are given in the following table. Test

whether the average sales is the same in the 2 areas at 5% level.

Place A Place B

Average Sales Rs. 2,000 Rs. 1,700

S.D Rs. 200 Rs. 450

13. Random samples drawn from two places gave the following data relating to the

heights of male adults:

Place A Place B

Mean Height (in Inches) 68.5 65.5

S.D (in Inches) 2.5 3

No. Of Adult males in sample 1200 1500

Test at 5% level of significance that the mean height is the same for adults in the two

places.

14. Examine whether the difference in the variability in yields is significant at 5% level

of significance, for the following.

                       Sets of 40 plots    Sets of 60 plots

Mean yield per plot    1256                1243

S.D per plot           34                  28

15. The following table shows the lives in hours of four brands of electric lamps:

Brand A 1610 1610 1650 1680 1700 1720 1800

Brand B 1580 1640 1640 1700 1750

Brand C 1460 1550 1600 1620 1640 1660 1740 1820

Brand D 1510 1520 1530 1570 1600 1680

Perform an analysis of variance and test the homogeneity of the mean lives of the

four brands of lamps.

16. Suppose that a random sample of n = 5 was selected from the vineyard properties for

sale in Sonoma County, California, in each of three years. The following data are

consistent with summary information on price per acre for disease-resistant grape

vineyards in Sonoma County. Carry out an ANOVA to determine whether there is

evidence to support the claim that the mean price per acre for vineyard land in Sonoma

County was not the same for each of the three years considered. Test at the 0.05 level

and at the 0.01 level.

1996: 30000 34000 36000 38000 40000

1997: 30000 35000 37000 38000 40000

1998: 40000 41000 43000 44000 50000

17. The accompanying data resulted from an experiment comparing the degree of

soiling for fabric copolymerized with the 3 different mixtures of methacrylic acid.

Analyse the classification

Mixture 1 0.56 1.12 0.90 1.07 0.94

Mixture 2 0.72 0.69 0.87 0.78 0.91

Mixture 3 0.62 1.08 1.07 0.99 0.93

18. The following data represent the time taken by a certain person to travel to work from Monday to Friday by

four different routes.

Routes    Days

          Mon    Tue    Wed    Thu    Fri

1         22     26     25     25     31

2         25     27     28     26     29

3         26     29     33     30     33

4         26     28     27     30     30

Test at 5% level of significance whether the differences among the means obtained for

the different routes are significant and whether the differences among the means

obtained for the different days of the week are significant.

19. The sales of 4 salesmen in 3 seasons are tabulated here. Carry out an analysis of

variance.

Season     Salesmen

           A     B     C     D

Summer     36    36    21    35

Winter     28    29    31    32

Monsoon    26    28    29    29

20. Perform a two-way ANOVA on the data given below:

Treatment 2    Treatment 1

               1     2     3

1              30    26    38

2              24    29    28

3              33    24    35

4              36    31    30

5              27    35    33

21. Three varieties of coal were analyzed by four chemists and the ash content is tabulated

below. Perform an analysis of variance.

Coal    Chemists

        A    B    C    D

I       8    5    5    7

II      7    6    4    4

III     3    6    5    4

"Never stop learning; when we stop

learning, we stop growing."

~Loyal Jack Lewman~
