578 Assignment 1 F14 Sol

8/10/2019 578 Assignment 1 F14 Sol


BA 578 Assignment (Solution), due by Midnight (11:59 pm) Monday, Sept 15th, 2014 (Chapters 1, 2, 3 and 4): Total 75 points

True/False (One point each)

Chapter 1

1. An example of a quantitative variable is the telephone number of an individual. FALSE

2. An example of an interval scale variable is the make of a car. FALSE

3. Credit score is an example of an interval scale variable. TRUE There is no intrinsic Zero.

An arbitrary minimum is established. Therefore, it is an interval scale variable.

4. The number of people eating at a local café between noon and 2:00 p.m. is an example of a

discrete variable. TRUE

Chapter 2

5. When establishing the classes for a frequency table it is generally agreed that the more classes

you use the better your frequency table will be. FALSE We try to follow the 2^k rule. Having

too many classes is not good.

6. The cumulative distribution function is never decreasing. TRUE It either increases or stays level, and it

becomes flat once it reaches 1 at the end point.

7. A Histogram is a graphic that is used to depict quantitative data. TRUE Bar Chart is used for

qualitative data.

Chapter 3

8. The Mean is the measure of central tendency that divides a population or sample into two

equal parts (that is, two parts with equal frequencies). FALSE It is the median that does that.

9. If there are 7 classes in a frequency distribution then the fourth class necessarily contains the

median. FALSE It depends on the class frequencies

10. The sum of deviations from the mean (taking into account the frequencies) can be negative, zero or positive. FALSE It is always zero.

11. The median is said to be less sensitive to extreme values. TRUE This statement is a

relative statement (implicitly) comparing Median with the other popular measure of

central tendency, namely, the Mean. But some students read the statement in absolute



terms and answered it wrong although they knew that Median is not sensitive to extreme

values. Therefore, I removed this question from grading.

12. The Empirical Rule is used to describe a population that is not highly skewed. TRUE It is

based on the symmetrical Normal distribution and can be safely applied only for slightly

skewed non-Normal distributions. For a highly skewed distribution it is not appropriate.

Chapter 4

13. If events A and B are independent and A is not an impossible event, then P(A/B) is not equal

to zero. TRUE In fact P(A/B) equals P(A) if A and B are independent, which is not zero unless

A is an impossible event.

14. If events A and B are mutually exclusive, then the conditional probability P(A/B) is a

positive number greater than zero but less than 1. FALSE This is obvious from the definition of

mutually exclusive events. If B occurs then A cannot occur. Therefore P(A/B) = 0.

15. The union of events A and B is given by all basic outcomes common to both A and B

FALSE This statement is for Intersection, not for Union.

Multiple Choices (each question carries two points):

Chapter 1

1. Ratio variables have the following unique or special characteristic:

A. Meaningful order

B. Predictable

C. Categorical in nature

D. An inherently defined zero value

2. Which of the following is a quantitative variable?

A. The make of a TV

B. The price of a TV

C. The VIN of a car

D. The rank of a police officer

E. The Driver's License Number

3. Which of the following is a categorical or Nominal variable?

A. The Social Security Number of a person

B. Bank Account Balance

C. Daily Sales in a Store

D. Air Temperature

E. Value of Company Stock



4. The level of Satisfaction in a Consumer survey would represent a(n) ____________ level of

measurement.

A. Nominative

B. Ordinal

C. Interval

D. Ratio

Chapter 2

5. When developing a frequency distribution, the class (group) intervals must be

A. Large

B. Small

C. Mutually exclusive

D. Whole numbers

E. Equal

Having equal intervals (or nearly equal intervals) is generally (not always) desirable. But it is not necessary and not even appropriate in some applications. For example, in Income

distribution the classes are arbitrarily formed and are generally unequal. Similarly many

distributions have the lowest and/or highest class with open bounds which make these class

intervals different from other classes.

6. If there are 80 values in a data set, how many classes should be created for a frequency

histogram?

A. 4

B. 5

C. 6

D. 7

E. 8

Just apply the 2^k rule: 2^6 = 64 is less than 80 while 2^7 = 128 exceeds it, so k = 7 classes.
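The rule can be sketched in a few lines of Python. This is only an illustration: the function name is made up here, and the rule is taken to mean the smallest k with 2^k at least n (some textbooks use a strict inequality, which gives the same answer for n = 80).

```python
def classes_by_2k_rule(n: int) -> int:
    """Return the smallest k such that 2**k >= n (hypothetical helper)."""
    k = 1
    while 2 ** k < n:
        k += 1
    return k


k = classes_by_2k_rule(80)
print(k)  # 7, because 2**6 = 64 < 80 <= 2**7 = 128
```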

7. Consider the following frequency distribution from Excel. What is the missing value?

Bin       Frequency   Cumulative %
584       1           4.00%
1774.4    ?           64.00%
2964.8    4           80.00%
4155.2    3           92.00%
5345.6    1           96.00%
More      1           100.00%



A. 4

B. 10

C. 12

D. 15

E. 20
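One way to recover the missing frequency in question 7 is to work backwards from the cumulative percentages. The sketch below assumes the first bin's frequency of 1 corresponds exactly to its 4.00% share; the variable names are illustrative.

```python
# Values copied from the Excel output in question 7.
first_bin_freq = 1          # frequency of the first bin (584)
first_bin_cum_pct = 0.04    # cumulative share after the first bin
second_bin_cum_pct = 0.64   # cumulative share after the second bin (1774.4)

# One observation is 4% of the data, so the data set has 25 values.
n = round(first_bin_freq / first_bin_cum_pct)

# The second bin accounts for the jump from 4% to 64% of the data.
missing_freq = round((second_bin_cum_pct - first_bin_cum_pct) * n)

print(n, missing_freq)  # 25 15

# Sanity check: all six frequencies should add up to n.
all_freqs = [first_bin_freq, missing_freq, 4, 3, 1, 1]
print(sum(all_freqs) == n)  # True
```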

Chapter 3

8. In a statistics class, 10 scores were randomly selected with the following results obtained: 75,

74, 77, 77, 71, 70, 65, 78, 67, and 66. What is the Standard deviation?

A. 21.40

B. 23.78

C. 4.88

D. 4.63

E. 214.00

X       X − X̄    (X − X̄)²
75       3        9
74       2        4
77       5       25
77       5       25
71      -1        1
70      -2        4
65      -7       49
78       6       36
67      -5       25
66      -6       36
720      0      214

s² = 214 / (10 − 1) = 23.78 and s = √23.78 = 4.88
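The same arithmetic can be checked with a short Python sketch. It follows the table column by column and then compares the result with `statistics.stdev`, which also uses the n − 1 divisor.

```python
import statistics

scores = [75, 74, 77, 77, 71, 70, 65, 78, 67, 66]

mean = sum(scores) / len(scores)                      # 720 / 10
squared_devs = [(x - mean) ** 2 for x in scores]      # last table column
sum_sq = sum(squared_devs)                            # 214
sample_var = sum_sq / (len(scores) - 1)               # divide by n - 1
sample_sd = sample_var ** 0.5

print(round(sample_var, 2), round(sample_sd, 2))  # 23.78 4.88

# Cross-check against the standard library.
print(round(statistics.stdev(scores), 2))  # 4.88
```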

9. According to a survey of the top 10 employers in a major city in the Midwest, a worker

spends an average of 413 minutes a day on the job. Suppose the standard deviation is 26.8

minutes and the time spent is approximately normally distributed. Between what times will

approximately 95.45% of all workers fall?



A. [387.5 438.5]

B. [386.2 439.8]

C. [372.8 453.2]

D. [359.4 466.6]

E. [332.6 493.4]
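Under the normal assumption, 95.45% of values lie within two standard deviations of the mean, so the interval for question 9 is 413 ± 2(26.8). A small Python sketch of that step, with the coverage confirmed using `statistics.NormalDist` (Python 3.8 or later):

```python
from statistics import NormalDist

mean_minutes = 413.0
sd_minutes = 26.8
z = 2  # number of standard deviations on each side of the mean

# Share of a normal distribution within z standard deviations of the mean.
coverage = NormalDist().cdf(z) - NormalDist().cdf(-z)
print(round(coverage, 4))  # 0.9545

lower = mean_minutes - z * sd_minutes
upper = mean_minutes + z * sd_minutes
print(round(lower, 1), round(upper, 1))  # 359.4 466.6
```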

10. When using Chebyshev's theorem to obtain the bounds for 99.73 percent of the values

in a population, the interval generally will be ___________ the interval obtained for the same

percentage if normal distribution is assumed (empirical rule).

A. Shorter than

B. Wider than

C. The same as

D. A Subset of

See Instructions. Chebyshev's theorem is more general but is less precise compared to the

empirical rule.
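To see how much wider the Chebyshev interval is, one can solve 1 − 1/k² = 0.9973 for k and compare it with the k = 3 used by the empirical rule. The sketch below does just that; the variable names are illustrative.

```python
import math

coverage = 0.9973

# Chebyshev: at least 1 - 1/k**2 of the values lie within k standard
# deviations of the mean, for any distribution. Solve for k.
k_chebyshev = 1 / math.sqrt(1 - coverage)

# Empirical rule (normal distribution): 99.73% lies within 3 standard deviations.
k_empirical = 3

print(round(k_chebyshev, 1))  # 19.2
print(k_chebyshev > k_empirical)  # True: the Chebyshev interval is wider
```

So the distribution-free bound needs roughly 19 standard deviations on each side, against 3 under normality.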

11. In a hearing test, subjects estimate the loudness (in decibels) of sound and the results are:

68, 67, 70, 71, 67, 75, 69, 62, 80, 73, 68 What is the median?

A. 67

B. 68

C. 69

D. 70

E. 71

Put items in order: 62, 67, 67, 68, 68, 69, 70, 71, 73, 75, 80. Median = the (11 + 1)/2 = 6th item, which is 69.
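The ordering step and the position rule can be written out as a brief Python sketch (valid as written for an odd number of observations, as in this question):

```python
import statistics

readings = [68, 67, 70, 71, 67, 75, 69, 62, 80, 73, 68]

ordered = sorted(readings)
position = (len(ordered) + 1) // 2   # 6th item when n = 11
median = ordered[position - 1]       # lists are 0-indexed

print(ordered)
print(position, median)  # 6 69

# Cross-check against the standard library.
print(statistics.median(readings))  # 69
```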

12. The numbers of rooms for 15 homes recently sold were: 8, 8, 8, 5, 9, 8, 7, 6, 6, 7, 7, 7, 7, 9, 9

What is the standard deviation?

A. 1.96

B. 1.40

C. 1.31

D. 1.14

E. 1.18

X       X − X̄    (X − X̄)²
5      -2.4      5.76
6      -1.4      1.96
6      -1.4      1.96
7      -0.4      0.16
7      -0.4      0.16
7      -0.4      0.16
7      -0.4      0.16
7      -0.4      0.16
8       0.6      0.36
8       0.6      0.36
8       0.6      0.36
8       0.6      0.36
9       1.6      2.56
9       1.6      2.56
9       1.6      2.56
111     0       19.6

Mean = 111/15 = 7.4; Sample variance s² = 19.6/14 = 1.4 and s = √1.4 = 1.18

Chapter 4

13. Two mutually exclusive events having positive probabilities are ______________

dependent.

A. Never

B. Sometimes

C. Always

They are necessarily dependent because the occurrence of one (seriously) affects the probability

of the other (makes it zero). Instructions on Ch 4 page 4

14. If P(A) >0 and P(B) > 0 and events A and B are independent, then:

A. P(A) = P(B)

B. P(A|B) = P(A)

C. P(A ∩ B) = 0

D. P(A ∩ B) = P(A)/P(B/A)

E. Both A and C are correct

See My Instructions on Ch 4 page 5. Independence does not imply equality of probabilities. So

the first choice is clearly wrong. The third choice applies to mutually exclusive events not



independent events. The fourth choice is also incorrect because there should be multiplication on

the right hand side not division. So the correct answer is B.

Essay Type Questions (4 points each)

Chapter 2

1. Consider the following data on distances traveled by people to visit the local amusement

park.

Distance Frequency

1-8 miles 15

8-15 miles 14

15-22 miles 10

22-29 miles 8

29-36 miles 3

Expand and construct the table adding columns for relative frequency and cumulative relative

frequency and construct the histogram of frequencies, plot the frequency polygon and the

Ogive curve using Excel.

distance freq rel.fr cum.rel.fr

1-8 15 0.30 0.30

8-15 14 0.28 0.58

15-22 10 0.20 0.78

22-29 8 0.16 0.94

29-36 3 0.06 1.00

total 50 1.00 na
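The two added columns can also be produced programmatically. The following Python sketch is only an illustration (the assignment itself asks for Excel); it divides each frequency by the total and accumulates the counts before dividing, to avoid rounding drift.

```python
from itertools import accumulate

classes = ["1-8", "8-15", "15-22", "22-29", "29-36"]
freqs = [15, 14, 10, 8, 3]

total = sum(freqs)                                     # 50 visitors
rel_freq = [f / total for f in freqs]                  # relative frequency
cum_rel_freq = [c / total for c in accumulate(freqs)]  # cumulative relative frequency

print("distance  freq  rel.fr  cum.rel.fr")
for label, f, r, c in zip(classes, freqs, rel_freq, cum_rel_freq):
    print(f"{label:8}  {f:4d}  {r:6.2f}  {c:10.2f}")
```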

The following plots were obtained using simple Excel and Insert/scatter plot functions (without using

the Analysis ToolPak).

[Figure: Histogram of Frequency by distance class (1-8, 8-15, 15-22, 22-29 and 29-36 miles); bar heights 15, 14, 10, 8, 3]

[Figure: Frequency Polygon of Frequency by distance class; plotted values 15, 14, 10, 8, 3]

[Figure: Ogive Curve of Cumulative Relative Frequency; plotted values 0.30, 0.58, 0.78, 0.94, 1.00]



2. Math test anxiety can be found throughout the general population. A study of 120 seniors at a

local high school was conducted. The following table was produced from the data. Complete the

missing parts. (Work step by step to solve this puzzle. Round the frequencies to the nearest

whole number.)

               Score Range   Frequency   Rel frequency   Cumulative Rel. freq.
Very anxious   37-50                     0.20
Anxious        33-36         12
Mild Anxiety   27-32
Relaxed        20-26         24
Very Relaxed   10-19                     0.30
Total

We have to work step by step using our knowledge of Frequency tables to solve this puzzle.

For the first class, Relative Frequency and Cumulative Relative Frequency will be the same.

So we write 0.20 in the first row last column. Moreover, we find the frequency for this class by

multiplying Relative frequency 0.20 by total frequency 120 to get 24. Thus, first row is

completely filled. In the second row we convert the given frequency 12 into relative frequency

after dividing by 120 which gives 0.10. Therefore, the cumulative relative frequency in the

second row will be 0.30. Thus, second row is filled too. Next we convert the given relative

frequency in the fifth row into frequency after multiplying 0.30 by 120 and rounding to get 36.

Since the total frequency is given as 120, we can find the remaining frequency for the third

row once we have the frequencies for the other four rows. It is calculated as 24. The rest of

the story should be clear to you. Just remember that the total of all frequencies must be the

given number 120 and the total of all relative frequencies must always be 1.

               Score Range   Frequency   Rel frequency   Cumulative Rel. freq.
Very anxious   37-50         24          0.20            0.20
Anxious        33-36         12          0.10            0.30
Mild Anxiety   27-32         24          0.20            0.50
Relaxed        20-26         24          0.20            0.70
Very Relaxed   10-19         36          0.30            1.00
Total                        120         1.00            NA
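The step-by-step reasoning for this puzzle can be mirrored in a short Python sketch. It assumes, as the solution does, that the given entries are the relative frequencies 0.20 and 0.30 and the frequencies 12 and 24; the dictionary names are illustrative.

```python
total = 120

# Known entries from the question; None marks a blank cell.
freq = {"Very anxious": None, "Anxious": 12, "Mild Anxiety": None,
        "Relaxed": 24, "Very Relaxed": None}
given_rel = {"Very anxious": 0.20, "Very Relaxed": 0.30}

# Step 1: turn given relative frequencies into frequencies (rounded).
for name, r in given_rel.items():
    freq[name] = round(r * total)

# Step 2: the one remaining blank is whatever is left of the total.
blank = [name for name, f in freq.items() if f is None][0]
freq[blank] = total - sum(f for f in freq.values() if f is not None)

# Step 3: relative and cumulative relative frequencies, top row first.
rows = []
running = 0
for name, f in freq.items():
    running += f
    rows.append((name, f, round(f / total, 2), round(running / total, 2)))

for row in rows:
    print(row)
```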

3. The number of items rejected daily by a manufacturer because of defects for the last 30 days

are: 22, 21, 8, 17, 25, 20, 18, 19, 14, 13, 11, 6, 21, 23, 4, 19, 11, 12, 16, 16, 10, 28, 24, 6, 21, 20,

25, 5, 17, 9. Complete this frequency table for the above data showing columns for Frequency,

Relative Frequency and Cumulative Relative Frequency and plot the Ogive curve.

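Class   Frequency   Relative Frequency   Cum Relative Frequency
4-9     5           0.167                0.167
9-14    6           0.200                0.367
14-19   6           0.200                0.567
19-24   9           0.300                0.867
24-29   4           0.133                1.000
Total   30          1.000                NA

(Each class includes its lower limit and excludes its upper limit.)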
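Tallying the 30 values into the classes 4-9, 9-14, 14-19, 19-24 and 24-29 can be sketched in Python as follows. The boundary convention is an assumption: each class is taken to include its lower limit and exclude its upper limit, so a value of 9 falls in 9-14.

```python
from itertools import accumulate

# Daily reject counts copied from question 3.
data = [22, 21, 8, 17, 25, 20, 18, 19, 14, 13, 11, 6, 21, 23, 4,
        19, 11, 12, 16, 16, 10, 28, 24, 6, 21, 20, 25, 5, 17, 9]
edges = [4, 9, 14, 19, 24, 29]

# Assumed convention: class "a-b" holds values with a <= x < b.
freqs = [sum(lo <= x < hi for x in data) for lo, hi in zip(edges, edges[1:])]

n = len(data)
rel_freq = [f / n for f in freqs]
cum_rel_freq = [c / n for c in accumulate(freqs)]

for lo, hi, f, r, c in zip(edges, edges[1:], freqs, rel_freq, cum_rel_freq):
    print(f"{lo}-{hi}: {f:2d}  {r:.3f}  {c:.3f}")
```

The last column gives the heights plotted on the Ogive.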



[Figure: Ogive; horizontal axis classes 4-9, 9-14, 14-19, 19-24, 24-29]

Chapter 3

4. The following frequency table summarizes the distances in miles of 100 patients from a

regional hospital.

Distance (miles)   Frequency
0-4                40
4-8                30
8-12               20
12-16              5
16-20              5

Calculate the sample standard deviation for this data (since it is a case of grouped data with

classes, use group or class midpoints in the formula in place of X values).

Calculate the Sample Mean

Distance   Class Midpoint (Mi)   Frequency (fi)   fi*Mi
0-4        2                     40               80
4-8        6                     30               180
8-12       10                    20               200
12-16      14                    5                70
16-20      18                    5                90
Total      NA                    100              620

The Sample Mean = 620/100 = 6.2

Calculate the standard deviation:

[Figure (Ogive for essay question 3): cumulative relative frequencies 0.167, 0.367, 0.567, 0.867, 1.000]

Distance   Class Midpoint (Mi)   Frequency (fi)   Deviation (Mi − X̄)   Squared Deviation (Mi − X̄)²   fi*(Mi − X̄)²
0-4        2                     40               -4.2                 17.64                         705.6
4-8        6                     30               -0.2                 0.04                          1.2
8-12       10                    20               3.8                  14.44                         288.8
12-16      14                    5                7.8                  60.84                         304.2
16-20      18                    5                11.8                 139.24                        696.2
Total      NA                    100              NA                   NA                            1996

Sample Variance, s² = 1996/(100 − 1) = 20.1616; Sample Standard Deviation, s = √20.1616 = 4.49
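The grouped-data calculation can be reproduced with a short Python sketch in which the class midpoints stand in for the individual X values, exactly as the question instructs:

```python
import math

midpoints = [2, 6, 10, 14, 18]   # Mi, midpoint of each distance class
freqs = [40, 30, 20, 5, 5]       # fi, number of patients in each class

n = sum(freqs)                                                   # 100
mean = sum(f * m for f, m in zip(freqs, midpoints)) / n          # 620 / 100
sum_sq = sum(f * (m - mean) ** 2 for f, m in zip(freqs, midpoints))
sample_var = sum_sq / (n - 1)                                    # divide by n - 1
sample_sd = math.sqrt(sample_var)

print(round(mean, 1), round(sum_sq, 1))            # 6.2 1996.0
print(round(sample_var, 4), round(sample_sd, 2))   # 20.1616 4.49
```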

5. Use the data in Essay question number 3 above to calculate the sample Mean, Variance and Standard deviation without grouping the data (that is, as a series of

individual values).

Answer: Mean = 16.033

Variance = 44.102

Standard Deviation = 6.641

Data: 22, 21, 8, 17, 25, 20, 18, 19, 14, 13, 11, 6, 21, 23, 4, 19, 11, 12, 16, 16, 10, 28, 24, 6, 21, 20, 25, 5, 17, 9
Mean 16.033, Count 30

Column1
Mean                 16.033
Standard Error       1.212
Median               17.000
Mode                 21.000
Standard Deviation   6.641
Sample Variance      44.102
Kurtosis             -0.955
Skewness             -0.225
Range                24.000
Minimum              4.000
Maximum              28.000
Sum                  481.000
Count                30.000

Using calculator

X       X − X̄      (X − X̄)²
22      5.967      35.601
21      4.967      24.668
8       -8.033     64.534
17      0.967      0.934
25      8.967      80.401
20      3.967      15.734
18      1.967      3.868
19      2.967      8.801
14      -2.033     4.134
13      -3.033     9.201
11      -5.033     25.334
6       -10.033    100.668
21      4.967      24.668
23      6.967      48.534
4       -12.033    144.801
19      2.967      8.801
11      -5.033     25.334
12      -4.033     16.268
16      -0.033     0.001
16      -0.033     0.001
10      -6.033     36.401
28      11.967     143.201
24      7.967      63.468
6       -10.033    100.668
21      4.967      24.668
20      3.967      15.734
25      8.967      80.401
5       -11.033    121.734
17      0.967      0.934
9       -7.033     49.468
Total   481        0.000      1278.967

Sample Variance = 1278.967/29 = 44.102

Sample Std. Dev. = √44.102 = 6.641
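As a cross-check on the ungrouped answer, the same three statistics can be obtained from Python's standard `statistics` module, whose `variance` and `stdev` functions use the n − 1 divisor. This is only a sketch for verification, not the Excel route used above.

```python
import statistics

# Daily reject counts from essay question 3, treated as individual values.
data = [22, 21, 8, 17, 25, 20, 18, 19, 14, 13, 11, 6, 21, 23, 4,
        19, 11, 12, 16, 16, 10, 28, 24, 6, 21, 20, 25, 5, 17, 9]

mean = statistics.mean(data)
sample_var = statistics.variance(data)   # sum of squared deviations / (n - 1)
sample_sd = statistics.stdev(data)

print(len(data), sum(data))                                   # 30 481
print(round(mean, 3), round(sample_var, 3), round(sample_sd, 3))  # 16.033 44.102 6.641
```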

Chapter 4

6. At a college, 55 percent of the students are women and 40 percent of the students receive a

grade of C. About 30 percent of the students are female but not C students. Use this contingency

table.

         C      Not C   Total
Female          0.30    0.55
Male
Total    0.40

If a randomly selected student is a C student, what is the probability the student is a male

student?

The completed table is:

         C      Not C   Total
Female   0.25   0.30    0.55
Male     0.15   0.30    0.45
Total    0.40   0.60    1.00

P(M/C) = P(M and C)/P(C) = 0.15/0.40 = 0.375 or 37.5% chance.

Some of you just answered 0.15, but that is the probability of "Male and C", not the probability

of "Male" given C.
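Filling in the table and taking the conditional probability can be sketched in Python as below. The inputs are the three values printed in the contingency table (0.55, 0.40 and 0.30); the variable names are illustrative.

```python
# Values given in the contingency table.
p_female = 0.55          # row total for Female
p_c = 0.40               # column total for C
p_female_not_c = 0.30    # Female and Not C

# Fill in the joint probabilities by subtraction.
p_female_c = p_female - p_female_not_c   # Female and C = 0.25
p_male_c = p_c - p_female_c              # Male and C   = 0.15

# Conditional probability: restrict attention to the C column.
p_male_given_c = p_male_c / p_c

print(round(p_male_c, 2))        # 0.15 (joint probability, not the answer)
print(round(p_male_given_c, 3))  # 0.375
```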

7. The contingency table about customers of a store who buy cigars and/or beer is given below.

           Beer   No Beer
Cigars            0.20
No cigar   0.10   0.40



Determine the probability that a customer will buy at least one of these items: cigar or beer.

The completed table is:

           Beer   No Beer   Total
Cigars     0.30   0.20      0.50
No cigar   0.10   0.40      0.50
Total      0.40   0.60      1.00

Answer: P(C or B) = P(C) + P(B) - P(C and B) = 0.50 + 0.40 - 0.30 = 0.60 or 60% chance.

You can also obtain the same probability by working with the rule of complements. The

opposite of buying Cigar or Beer or both is neither Cigar nor Beer. The probability for neither Cigar nor Beer according to the contingency table is 0.40. Therefore, by the rule of complements, the probability asked is 1 - 0.40 = 0.60.
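Both routes to the answer, the addition rule and the rule of complements, can be checked with a few lines of Python. The three inputs are the cells given in the question; everything else is derived.

```python
import math

# Joint probabilities given in the question.
p_cigar_no_beer = 0.20
p_beer_no_cigar = 0.10
p_neither = 0.40

# The four cells must sum to 1, which fixes the missing one.
p_both = 1 - (p_cigar_no_beer + p_beer_no_cigar + p_neither)   # 0.30

p_cigar = p_both + p_cigar_no_beer   # 0.50
p_beer = p_both + p_beer_no_cigar    # 0.40

# Addition rule and rule of complements should agree.
p_union_addition = p_cigar + p_beer - p_both
p_union_complement = 1 - p_neither

print(round(p_union_addition, 2), round(p_union_complement, 2))  # 0.6 0.6
print(math.isclose(p_union_addition, p_union_complement))        # True
```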

8. Four employees who work as drive-through attendees at a local fast food restaurant are being

evaluated. As a part of quality improvement initiative and employee evaluation these workers

were observed over three days. One of the statistics collected is the proportion of times an employee

forgets to include a napkin in the bag. Related information is given in the table.

Worker   Proportion of Dinners Packed   Proportion of forgetting Napkin when packing Dinner
Joe      0.20                           0.05
Jan      0.30                           0.02
Cheryl   0.15                           0.14
Clay     0.35                           0.04



You just purchased a dinner and found that there is no napkin in your bag, what is the probability

that Clay has prepared your order?

Answer: First note that the last column in the above table gives conditional probabilities.

For example, 0.05 is the probability of forgetting the napkin given that Joe packed the dinner

or P(No napkin/Joe). In the question we are given that No napkin has occurred and asked

to find the probability of Clay in light of this result. So here we are asked a reverse

conditionality than the one given in the contingency table. According to the Instructions for

Chapter 4, this requires Bayesian rule. Therefore,

P(Clay/ No napkin) = 0.014/0.051 = 0.2745 or 27.45%

The numerator is P(Clay and No napkin)=P(Dinner packed by Clay)*P(No napkin given

that Clay packed Dinner) = 0.35*0.04 = 0.014.

The denominator is P(No napkin)= P(Joe and no napkin)+ P(Jan and No napkin)+

P(Cheryl and No napkin)+ P(Clay and no napkin) = 0.010 + 0.006 + 0.021 + 0.014 = 0.051

as shown in the table below (everything converted to decimals instead of percentage,

because working with percentage is messy):

Worker   Proportion of Dinners Packed   Proportion of forgetting Napkin given    Joint probability
         by individual workers          the worker (conditional probability)     Col. 2*Col. 3
Joe      0.20                           0.05                                     0.010
Jan      0.30                           0.02                                     0.006
Cheryl   0.15                           0.14                                     0.021
Clay     0.35                           0.04                                     0.014
Total                                                                            0.051

This formula is also called the Bayesian rule for probability revision based on the results of

an experiment. Here the prior probability of Clay is 35%, but it has been revised downward

to 27.45% (called the revised or posterior probability) after noticing that the dinner had

no napkin, because Clay is one of the least forgetful workers. If

the question were for Cheryl the posterior probability would be higher than the prior

probability because she has a very high chance of forgetting napkin (14%).
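The whole revision can be sketched in Python using the two columns of the table as priors and conditional probabilities. The dictionary names are illustrative; the computation is the joint-probability table followed by division by P(No napkin).

```python
# Prior: share of dinners packed by each worker.
prior = {"Joe": 0.20, "Jan": 0.30, "Cheryl": 0.15, "Clay": 0.35}
# Likelihood: P(No napkin given the worker packed the dinner).
p_no_napkin_given = {"Joe": 0.05, "Jan": 0.02, "Cheryl": 0.14, "Clay": 0.04}

# Joint probabilities P(worker and No napkin) = Col. 2 * Col. 3.
joint = {w: prior[w] * p_no_napkin_given[w] for w in prior}

# Total probability of finding no napkin.
p_no_napkin = sum(joint.values())

# Posterior P(worker given No napkin) for every worker.
posterior = {w: joint[w] / p_no_napkin for w in prior}

print(round(p_no_napkin, 3))           # 0.051
print(round(posterior["Clay"], 4))     # 0.2745, down from the prior 0.35
print(round(posterior["Cheryl"], 4))   # 0.4118, up from the prior 0.15
```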
