578 Assignment 1 F14 Sol
Transcript of 578 Assignment 1 F14 Sol
-
8/10/2019 578 Assignment 1 F14 Sol
1/15
1
BA 578 Assignment Solution, due by midnight (11:59 pm) Monday, Sept 15th, 2014 (Chapters 1, 2, 3 and 4): Total 75 points
True/False (One point each)
Chapter 1
1. An example of a quantitative variable is the telephone number of an individual. FALSE
2. An example of an interval scale variable is the make of a car. FALSE
3. Credit score is an example of an interval scale variable. TRUE There is no intrinsic Zero.
An arbitrary minimum is established. Therefore, it is an interval scale variable.
4. The number of people eating at a local café between noon and 2:00 p.m. is an example of a
discrete variable. TRUE
Chapter 2
5. When establishing the classes for a frequency table it is generally agreed that the more classes
you use the better your frequency table will be. FALSE We try to follow the $2^k$ rule. Having
too many classes is not good.
6. The cumulative distribution function is never decreasing. TRUE It either rises or stays level
as the value increases, and it becomes flat at 1 beyond the end point.
7. A Histogram is a graphic that is used to depict quantitative data. TRUE Bar Chart is used for
qualitative data.
Chapter 3
8. The Mean is the measure of central tendency that divides a population or sample into two
equal parts (that is two parts with equal frequencies) FALSE It is the median which does that.
9. If there are 7 classes in a frequency distribution then the fourth class necessarily contains the
median. FALSE It depends on the class frequencies
10. The sum of deviations from the mean (taking into account the frequencies) can be negative,
zero or positive. FALSE It is always zero.
11. The median is said to be less sensitive to extreme values. TRUE This statement is a
relative statement (implicitly) comparing Median with the other popular measure of
central tendency, namely, the Mean. But some students read the statement in absolute
terms and answered it wrong although they knew that Median is not sensitive to extreme
values. Therefore, I removed this question from grading.
12. The Empirical Rule is used to describe a population that is not highly skewed. TRUE It is
based on the symmetrical Normal distribution and can be safely applied only for slightly
skewed non-Normal distributions. For highly skewed distributions it is not appropriate.
Chapter 4
13. If events A and B are independent and A is not an impossible event, then P(A/B) is not equal
to zero. TRUE In fact P(A/B) equals P(A) if A and B are independent, which is not zero unless
A is an impossible event.
14. If events A and B are mutually exclusive, then the conditional probability P(A/B) is a
positive number greater than zero but less than 1. FALSE This is obvious from the definition of
mutually exclusive events. If B occurs then A cannot occur. Therefore P(A/B) = 0.
15. The union of events A and B is given by all basic outcomes common to both A and B
FALSE This statement is for Intersection, not for Union.
Multiple Choices (each question carries two points):
Chapter 1
1. Ratio variables have the following unique or special characteristic:
A. Meaningful order
B. Predictable
C. Categorical in nature
D. An inherently defined zero value
2. Which of the following is a quantitative variable?
A. The make of a TV
B. The price of a TV
C. The VIN of a car
D. The rank of a police officer
E. The Driver's License Number
3. Which of the following is a categorical or Nominal variable?
A. The Social Security Number of a person
B. Bank Account Balance
C. Daily Sales in a Store
D. Air Temperature
E. Value of Company Stock
4. The level of Satisfaction in a Consumer survey would represent a(n) ____________ level of
measurement.
A. Nominative
B. Ordinal
C. Interval
D. Ratio
Chapter 2
5. When developing a frequency distribution, the class (group) intervals must be
A. Large
B. Small
C. Mutually exclusive
D. Whole numbers
E. Equal
Having equal intervals (or nearly equal intervals) is generally (not always) desirable. But
it is not necessary and not even appropriate in some applications. For example, in an income
distribution the classes are arbitrarily formed and are generally unequal. Similarly, many
distributions have a lowest and/or highest class with an open bound, which makes these class
intervals different from the other classes.
6. If there are 80 values in a data set, how many classes should be created for a frequency
histogram?
A. 4
B. 5
C. 6
D. 7
E. 8
Just apply the $2^k$ rule: $2^6 = 64$ is less than 80, while $2^7 = 128$ exceeds it, so 7 classes (choice D) are needed.
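As a quick check, the rule can be sketched in Python. This is only a minimal sketch: the function name `classes_by_2k_rule` is hypothetical, and it assumes the strict form of the rule (the smallest k for which 2^k is greater than the number of values).

```python
def classes_by_2k_rule(n: int) -> int:
    """Return the smallest k such that 2**k > n.

    Assumption: the strict inequality is used; some texts state the
    rule with "at least" instead, which differs only when n is an
    exact power of 2.
    """
    k = 1
    while 2 ** k <= n:
        k += 1
    return k


# 80 values, as in this question
num_classes = classes_by_2k_rule(80)
print(num_classes)
```

For 80 values the loop stops at k = 7, since 2^6 = 64 is too small and 2^7 = 128 is large enough.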
7. Consider the following frequency distribution from Excel. What is the missing value?
Bin       Frequency   Cumulative %
584       1           4.00%
1774.4    ____        64.00%
2964.8    4           80.00%
4155.2    3           92.00%
5345.6    1           96.00%
More      1           100.00%
A. 4
B. 10
C. 12
D. 15
E. 20
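One way to recover the missing value is sketched below in Python. It assumes that the Cumulative % column is the running total of Frequency divided by the total number of observations; the variable names are illustrative only.

```python
# Cumulative percentages from the Excel output, in bin order
cumulative_pct = [4.0, 64.0, 80.0, 92.0, 96.0, 100.0]

# The first bin has frequency 1 and accounts for 4% of the data,
# so the total count is 1 / 0.04
first_frequency = 1
total_count = round(first_frequency / (cumulative_pct[0] / 100))

# The second bin adds 64% - 4% = 60% of the observations
missing_frequency = round(
    (cumulative_pct[1] - cumulative_pct[0]) / 100 * total_count
)

print(total_count, missing_frequency)
```

The result can be cross-checked against the table: with the missing value filled in, the frequencies 1, 15, 4, 3, 1 and 1 add up to the total count of 25.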
Chapter 3
8. In a statistics class, 10 scores were randomly selected with the following results obtained: 75,
74, 77, 77, 71, 70, 65, 78, 67, and 66. What is the Standard deviation?
A. 21.40
B. 23.78
C. 4.88
D. 4.63
E. 214.00
X    $X - \bar{X}$    $(X - \bar{X})^2$
75 3 9
74 2 4
77 5 25
77 5 25
71 -1 1
70 -2 4
65 -7 49
78 6 36
67 -5 25
66 -6 36
Total 720 0 214
$\bar{X} = 720/10 = 72$; $s_x^2 = 214 / (10 - 1) = 23.78$; $s_x = \sqrt{23.78} = 4.88$
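The same calculation can be reproduced in Python as a check. This is a minimal sketch using only the standard library; the variable names are illustrative.

```python
import statistics

scores = [75, 74, 77, 77, 71, 70, 65, 78, 67, 66]

mean = statistics.mean(scores)                   # 720 / 10
sum_sq_dev = sum((x - mean) ** 2 for x in scores)
sample_var = sum_sq_dev / (len(scores) - 1)      # divide by n - 1
sample_sd = sample_var ** 0.5

print(mean, sum_sq_dev, round(sample_var, 2), round(sample_sd, 2))

# statistics.stdev uses the same n - 1 divisor
print(round(statistics.stdev(scores), 2))
```

Dividing by n - 1 rather than n is what makes this the sample standard deviation; dividing by 10 would give 21.40 for the variance and 4.63 for the standard deviation, which appear among the wrong choices.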
9. According to a survey of the top 10 employers in a major city in the Midwest, a worker
spends an average of 413 minutes a day on the job. Suppose the standard deviation is 26.8
minutes and the time spent is approximately normally distributed. Within what interval of times
will approximately 95.45% of all workers fall?
A. [387.5 438.5]
B. [386.2 439.8]
C. [372.8 453.2]
D. [359.4 466.6]
E. [332.6 493.4]
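Under the empirical rule, about 95.45% of a normal population lies within two standard deviations of the mean. The sketch below, in Python with illustrative variable names, computes that interval and uses the standard library's `statistics.NormalDist` to confirm the coverage.

```python
from statistics import NormalDist

mean_minutes = 413
sd_minutes = 26.8
k = 2  # number of standard deviations for about 95.45% coverage

lower = mean_minutes - k * sd_minutes
upper = mean_minutes + k * sd_minutes
print(round(lower, 1), round(upper, 1))

# Coverage of mean +/- 2 sd under a standard normal distribution
z = NormalDist()
coverage = z.cdf(k) - z.cdf(-k)
print(round(coverage, 4))
```

The interval matches choice D, [359.4 466.6].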
10. When using Chebyshev's theorem to obtain the bounds for 99.73 percent of the values
in a population, the interval generally will be ___________ the interval obtained for the same
percentage if a normal distribution is assumed (empirical rule).
A. Shorter than
B. Wider than
C. The same as
D. A subset of
See Instructions. Chebyshev's theorem is more general but less precise than the
empirical rule.
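To see how much wider the Chebyshev interval is, the sketch below (Python, illustrative variable names) solves 1 - 1/k^2 = 0.9973 for k and compares it with the 3 standard deviations used by the empirical rule.

```python
import math

coverage = 0.9973

# Chebyshev: at least 1 - 1/k**2 of the values lie within k
# standard deviations of the mean, for any distribution
k_chebyshev = math.sqrt(1 / (1 - coverage))

# Empirical rule: about 99.73% lie within 3 standard deviations
# when the distribution is normal
k_empirical = 3

print(round(k_chebyshev, 1), k_empirical)
```

Chebyshev's theorem needs roughly 19 standard deviations on each side to guarantee the same percentage, because it makes no assumption about the shape of the distribution.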
11. In a hearing test, subjects estimate the loudness (in decibels) of sound and the results are:
68, 67, 70, 71, 67, 75, 69, 62, 80, 73, 68 What is the median?
A. 67
B. 68
C. 69
D. 70
E. 71
Put items in order: 62, 67, 67, 68, 68, 69, 70, 71, 73, 75, 80. The median is the (11 + 1)/2 = 6th item, which is 69.
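The position rule can be checked with a short Python sketch (variable names are illustrative):

```python
import statistics

readings = [68, 67, 70, 71, 67, 75, 69, 62, 80, 73, 68]

ordered = sorted(readings)
position = (len(ordered) + 1) // 2     # (11 + 1) / 2 = 6th item
median_by_position = ordered[position - 1]

print(ordered)
print(median_by_position, statistics.median(readings))
```

With an odd number of values the median is a single middle item; with an even number it would be the average of the two middle items.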
12. The numbers of rooms for 15 homes recently sold were: 8, 8, 8, 5, 9, 8, 7, 6, 6, 7, 7, 7, 7, 9, 9
What is the standard deviation?
A. 1.96
B. 1.40
C. 1.31
D. 1.14
E. 1.18
X    $X - \bar{X}$    $(X - \bar{X})^2$
5 -2.4 5.76
6 -1.4 1.96
6 -1.4 1.96
7 -.4 .16
7 -.4 .16
7 -.4 .16
7 -.4 .16
7 -.4 .16
8 .6 .36
8 .6 .36
8 .6 .36
8 .6 .36
9 1.6 2.56
9 1.6 2.56
9 1.6 2.56
Total 111 0 19.6
Mean = 111/15 = 7.4; sample variance $s_x^2 = 19.6/14 = 1.4$ and $s_x = \sqrt{1.4} = 1.18$
Chapter 4
13. Two mutually exclusive events having positive probabilities are ______________
dependent.
A. Never
B. Sometimes
C. Always
They are necessarily dependent because the occurrence of one (seriously) affects the probability
of the other (makes it zero). See Instructions on Ch 4, page 4.
14. If P(A) >0 and P(B) > 0 and events A and B are independent, then:
A. P(A) = P(B)
B. P(A|B) = P(A)
C. P(A ∩ B) = 0
D. P(A ∩ B) = P(A)/P(B|A)
E. Both A and C are correct
See My Instructions on Ch 4 page 5. Independence does not imply equality of probabilities. So
the first choice is clearly wrong. The third choice applies to mutually exclusive events not
independent events. The fourth choice is also incorrect because there should be multiplication on
the right hand side not division. So the correct answer is B.
Essay Type Questions (4 points each)
Chapter 2
1. Consider the following data on distances traveled by people to visit the local amusement
park.
Distance Frequency
1-8 miles 15
8-15 miles 14
15-22 miles 10
22-29 miles 8
29-36 miles 3
Expand and construct the table adding columns for relative frequency and cumulative relative
frequency and construct the histogram of frequencies, plot the frequency polygon and the
Ogive curve using Excel.
distance freq rel.fr cum.rel.fr
1-8 15 0.30 0.30
8-15 14 0.28 0.58
15-22 10 0.20 0.78
22-29 8 0.16 0.94
29-36 3 0.06 1.00
total 50 1.00 na
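The two added columns can be reproduced with a short Python sketch (illustrative variable names): each relative frequency is the class frequency divided by the total, and the cumulative column is the running total of frequencies divided by the same total.

```python
from itertools import accumulate

classes = ["1-8", "8-15", "15-22", "22-29", "29-36"]
freqs = [15, 14, 10, 8, 3]

total = sum(freqs)
rel_freqs = [f / total for f in freqs]
cum_rel_freqs = [c / total for c in accumulate(freqs)]

for label, f, r, c in zip(classes, freqs, rel_freqs, cum_rel_freqs):
    print(f"{label:>6} {f:>4} {r:.2f} {c:.2f}")
```

Accumulating the integer frequencies first and dividing once avoids small rounding drift in the cumulative column.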
The following plots were obtained using basic Excel and the Insert/Scatter plot functions (without using
the Analysis ToolPak).
[Figure: Histogram of Frequency by distance class (1-8, 8-15, 15-22, 22-29, 29-36 miles), with bar heights 15, 14, 10, 8, 3]
[Figure: Frequency Polygon of Frequency by distance class, with points at 15, 14, 10, 8, 3]
[Figure: Ogive Curve of Cumulative Relative Frequency by distance class, with points at 0.30, 0.58, 0.78, 0.94, 1.00]
2. Math test anxiety can be found throughout the general population. A study of 120 seniors at a
local high school was conducted. The following table was produced from the data. Complete the
missing parts. (Work step by step to solve this puzzle. Round the frequencies to the nearest
whole number.)
               Score Range   Frequency   Rel frequency   Cumulative Rel. freq.
Very anxious   37-50                     0.20
Anxious        33-36         12
Mild Anxiety   27-32
Relaxed        20-26         24
Very Relaxed   10-19                     0.30
Total
We have to work step by step using our knowledge of frequency tables to solve this puzzle.
For the first class, Relative Frequency and Cumulative Relative Frequency will be the same.
So we write 0.20 in the first row, last column. Moreover, we find the frequency for this class by
multiplying the relative frequency 0.20 by the total frequency 120 to get 24. Thus, the first row is
completely filled. In the second row we convert the given frequency 12 into a relative frequency
by dividing by 120, which gives 0.10. Therefore, the cumulative relative frequency in the
second row will be 0.30. Thus, the second row is filled too. Next we convert the given relative
frequency in the fifth row into a frequency by multiplying 0.30 by 120 and rounding to get 36.
Since the total frequency is given as 120, we can find the remaining frequency for the third
row once we have the frequencies for the other four rows. It is calculated as 24. The rest of
the story should be clear to you. Just remember that the total of all frequencies must be the
given number 120 and the total of all relative frequencies must always be 1.
               Score Range   Frequency   Rel frequency   Cumulative Rel. freq.
Very anxious   37-50         24          0.20            0.20
Anxious        33-36         12          0.10            0.30
Mild Anxiety   27-32         24          0.20            0.50
Relaxed        20-26         24          0.20            0.70
Very Relaxed   10-19         36          0.30            1.00
Total                        120         1.00            NA
3. The numbers of items rejected daily by a manufacturer because of defects over the last 30 days
are: 22, 21, 8, 17, 25, 20, 18, 19, 14, 13, 11, 6, 21, 23, 4, 19, 11, 12, 16, 16, 10, 28, 24, 6, 21, 20,
25, 5, 17, 9. Complete the frequency table for the above data showing columns for Frequency,
Relative Frequency and Cumulative Relative Frequency, and plot the Ogive curve.
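Class    Frequency   Relative Frequency   Cum Relative Frequency
4-9      5           0.167                0.167
9-14     6           0.200                0.367
14-19    6           0.200                0.567
19-24    9           0.300                0.867
24-29    4           0.133                1.000
Total    30          1.000                NA
[Figure: Ogive of cumulative relative frequency over the classes 4-9, 9-14, 14-19, 19-24, 24-29]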
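The class counts can be tallied with a short Python sketch. It assumes each class includes its lower limit and excludes its upper limit (so a value of 9 falls in 9-14), which is the convention that reproduces the cumulative values plotted on the Ogive; the variable names are illustrative.

```python
from itertools import accumulate

data = [22, 21, 8, 17, 25, 20, 18, 19, 14, 13, 11, 6, 21, 23, 4,
        19, 11, 12, 16, 16, 10, 28, 24, 6, 21, 20, 25, 5, 17, 9]

# Class limits 4-9, 9-14, 14-19, 19-24, 24-29
edges = [4, 9, 14, 19, 24, 29]

# Count values with lower <= x < upper for each class
freqs = [sum(lo <= x < hi for x in data)
         for lo, hi in zip(edges, edges[1:])]

n = len(data)
rel_freqs = [f / n for f in freqs]
cum_rel_freqs = [c / n for c in accumulate(freqs)]

print(freqs)
print([round(c, 3) for c in cum_rel_freqs])
```

The cumulative relative frequencies are the heights of the Ogive at the upper limit of each class.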
Chapter 3
4. The following frequency table summarizes the distances in miles of 100 patients from a
regional hospital.
Distance (miles)   Frequency
0-4                40
4-8                30
8-12               20
12-16              5
16-20              5
Calculate the sample standard deviation for this data (since it is a case of grouped data with
classes, use group or class midpoints in the formula in place of X values).
Calculate the sample mean:
Distance   Class Midpoint ($M_i$)   Frequency ($f_i$)   $f_i M_i$
0-4        2                        40                  80
4-8        6                        30                  180
8-12       10                       20                  200
12-16      14                       5                   70
16-20      18                       5                   90
Total      NA                       100                 620
The sample mean = $\bar{X} = 620/100 = 6.2$
Calculate the standard deviation:
Distance   Class Midpoint ($M_i$)   Frequency ($f_i$)   Deviation ($M_i - \bar{X}$)   Squared Deviation $(M_i - \bar{X})^2$   $f_i (M_i - \bar{X})^2$
0-4        2                        40                  -4.2                          17.64                                   705.6
4-8        6                        30                  -0.2                          0.04                                    1.2
8-12       10                       20                  3.8                           14.44                                   288.8
12-16      14                       5                   7.8                           60.84                                   304.2
16-20      18                       5                   11.8                          139.24                                  696.2
Total      NA                       100                 NA                            NA                                      1996
Sample variance, $s^2 = 1996/(100 - 1) = 20.1616$; sample standard deviation, $s = \sqrt{20.1616} = 4.49$
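The grouped-data calculation above can be sketched in Python, with class midpoints standing in for the individual values. Variable names are illustrative.

```python
import math

midpoints = [2, 6, 10, 14, 18]     # class midpoints M_i
freqs = [40, 30, 20, 5, 5]         # class frequencies f_i

n = sum(freqs)
mean = sum(f * m for f, m in zip(freqs, midpoints)) / n

# Frequency-weighted sum of squared deviations from the mean
weighted_ss = sum(f * (m - mean) ** 2 for f, m in zip(freqs, midpoints))

sample_var = weighted_ss / (n - 1)
sample_sd = math.sqrt(sample_var)

print(round(mean, 1), round(weighted_ss, 1),
      round(sample_var, 4), round(sample_sd, 2))
```

Because midpoints replace the actual distances, the result is an approximation to the standard deviation of the underlying ungrouped data.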
5. Use the data in Essay question number 3 above to calculate the sample Mean, Variance and
Standard deviation without grouping the data (that is, as a series of individual values).
Answer: Mean = 16.033
Variance = 44.102
Standard Deviation = 6.641
Data: 22, 21, 8, 17, 25, 20, 18, 19, 14, 13, 11, 6, 21, 23, 4, 19, 11, 12, 16, 16, 10, 28, 24, 6, 21, 20, 25, 5, 17, 9
Descriptive statistics output (Column1):
Mean                 16.033
Standard Error       1.212
Median               17.000
Mode                 21.000
Standard Deviation   6.641
Sample Variance      44.102
Kurtosis             -0.955
Skewness             -0.225
Range                24.000
Minimum              4.000
Maximum              28.000
Sum                  481.000
Count                30.000
Using calculator
X    $X - \bar{X}$    $(X - \bar{X})^2$
22 5.967 35.601
21 4.967 24.668
8 -8.033 64.534
17 0.967 0.934
25 8.967 80.401
20 3.967 15.734
18 1.967 3.868
19 2.967 8.801
14 -2.033 4.134
13 -3.033 9.201
11 -5.033 25.334
6 -10.033 100.668
21 4.967 24.668
23 6.967 48.534
4 -12.033 144.801
19 2.967 8.801
11 -5.033 25.334
12 -4.033 16.268
16 -0.033 0.001
16 -0.033 0.001
10 -6.033 36.401
28 11.967 143.201
24 7.967 63.468
6 -10.033 100.668
21 4.967 24.668
20 3.967 15.734
25 8.967 80.401
5 -11.033 121.734
17 0.967 0.934
9 -7.033 49.468
Total 481 0.000 1278.967
Sample Variance = 1278.967/29 = 44.102
Sample Std. Dev. = $\sqrt{44.102} = 6.641$
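The ungrouped results can be verified with Python's standard `statistics` module, whose `variance` and `stdev` functions use the n - 1 divisor. This is a minimal sketch with illustrative variable names.

```python
import statistics

data = [22, 21, 8, 17, 25, 20, 18, 19, 14, 13, 11, 6, 21, 23, 4,
        19, 11, 12, 16, 16, 10, 28, 24, 6, 21, 20, 25, 5, 17, 9]

mean = statistics.mean(data)
sample_var = statistics.variance(data)   # sum of squares / (n - 1)
sample_sd = statistics.stdev(data)

# Sum of squared deviations, as in the calculator table
sum_sq_dev = sum((x - mean) ** 2 for x in data)

print(round(mean, 3), round(sum_sq_dev, 3),
      round(sample_var, 3), round(sample_sd, 3))
```

Compared with the grouped calculation, no class midpoints are involved here, so these are the exact sample statistics for the 30 values.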
Chapter 4
6. At a college, 55 percent of the students are women and 40 percent of the students receive a
grade of C. About 30 percent of the students are female but not C students. Use this contingency
table.
          C      Not C    Total
Female           0.30     0.55
Male
Total     0.40
If a randomly selected student is a C student, what is the probability the student is a male
student?
The completed table is:
          C      Not C    Total
Female    0.25   0.30     0.55
Male      0.15   0.30     0.45
Total     0.40   0.60     1.00
P(M/C) = P(M and C)/P(C) = 0.15/0.40 = 0.375, or a 37.5% chance.
Some of you just answered 0.15, but that is the probability of "Male and C", not the probability
of "Male" given C.
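The table completion and the conditional probability can be sketched in Python. The inputs are the three values given in the contingency table (0.55, 0.40 and the 0.30 entry for Female and Not C); the variable names are illustrative.

```python
# Given values from the contingency table
p_female = 0.55
p_c = 0.40
p_female_not_c = 0.30

# Fill in the remaining cells from the row and column totals
p_female_c = p_female - p_female_not_c      # Female and C
p_male_c = p_c - p_female_c                 # Male and C
p_male = 1 - p_female
p_male_not_c = p_male - p_male_c            # Male and Not C

# Conditional probability: divide the joint by the given event
p_male_given_c = p_male_c / p_c

print(round(p_male_c, 2), round(p_male_given_c, 3))
```

The division by P(C) is what separates the conditional probability 0.375 from the joint probability 0.15.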
7. The contingency table about customers of a store who buy cigars and/or beer is given below.
            Beer    No Beer
Cigars              0.20
No cigar    0.10    0.40
Determine the probability that a customer will buy at least one of these items: cigar or beer.
The completed table is:
            Beer    No Beer   Total
Cigars      0.30    0.20      0.50
No cigar    0.10    0.40      0.50
Total       0.40    0.60      1.00
Answer: P(C or B) = P(C) + P(B) - P(C and B) = 0.50 + 0.40 - 0.30 = 0.60, or a 60% chance.
You can also obtain the same probability by working with the rule of complements. The
opposite of buying Cigar or Beer or both is buying neither Cigar nor Beer. The probability of
neither Cigar nor Beer according to the contingency table is 0.40. Therefore, by the rule
of complements, the probability asked for is 1 - 0.40 = 0.60.
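Both routes to the answer can be checked with a few lines of Python, using the cell values from the completed table (variable names are illustrative):

```python
# Cells of the completed contingency table
p_cigar_beer = 0.30
p_cigar_no_beer = 0.20
p_no_cigar_beer = 0.10
p_neither = 0.40

p_cigar = p_cigar_beer + p_cigar_no_beer     # row total
p_beer = p_cigar_beer + p_no_cigar_beer      # column total

# Addition rule: subtract the overlap so it is not counted twice
p_union_addition = p_cigar + p_beer - p_cigar_beer

# Rule of complements: "at least one" is the opposite of "neither"
p_union_complement = 1 - p_neither

print(round(p_union_addition, 2), round(p_union_complement, 2))
```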
8. Four employees who work as drive-through attendants at a local fast food restaurant are being
evaluated. As a part of a quality improvement initiative and employee evaluation, these workers
were observed over three days. One of the statistics collected is the proportion of times each employee
forgets to include a napkin in the bag. Related information is given in the table.
Worker   Proportion of Dinners Packed   Proportion of forgetting Napkin when packing Dinner
Joe      0.20                           0.05
Jan      0.30                           0.02
Cheryl   0.15                           0.14
Clay     0.35                           0.04
You just purchased a dinner and found that there is no napkin in your bag. What is the probability
that Clay prepared your order?
Answer: First note that the last column in the above table gives conditional probabilities.
For example, 0.05 is the probability of forgetting the napkin given that Joe packed the dinner,
or P(No napkin/Joe). In the question we are given that No napkin has occurred and asked
to find the probability of Clay in light of this result. So here we are asked for the reverse of the
conditional probability given in the table. According to the Instructions for
Chapter 4, this requires the Bayesian rule. Therefore,
P(Clay/No napkin) = P(Clay and No napkin)/P(No napkin) = 0.014/0.051 = 0.2745 or 27.45%
The numerator is P(Clay and No napkin) = P(Dinner packed by Clay)*P(No napkin given
that Clay packed Dinner) = 0.35*0.04 = 0.014.
The denominator is P(No napkin) = P(Joe and No napkin) + P(Jan and No napkin) +
P(Cheryl and No napkin) + P(Clay and No napkin) = 0.010 + 0.006 + 0.021 + 0.014 = 0.051,
as shown in the table below (everything is in decimals rather than percentages,
because working with percentages is messy):
Worker   Proportion of Dinners Packed   Proportion of forgetting Napkin given    Joint probability
         by individual workers          the worker (conditional probability)    (Col. 2 * Col. 3)
Joe      0.20                           0.05                                    0.010
Jan      0.30                           0.02                                    0.006
Cheryl   0.15                           0.14                                    0.021
Clay     0.35                           0.04                                    0.014
Total                                                                           0.051
This formula is also called the Bayesian rule for probability revision based on the results of
an experiment. Here the prior probability of Clay is 35%, but after noticing that the dinner had
no napkin it is revised downward to 27.45% (called the revised or posterior probability),
because Clay is one of the least forgetful workers. If
the question were about Cheryl, the posterior probability would be higher than the prior
probability because she has a very high chance of forgetting the napkin (14%).
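The revision for all four workers can be sketched in Python. Each prior is multiplied by its conditional probability to get the joint probability, and each joint is divided by their sum. The dictionary layout and variable names are illustrative.

```python
# worker: (prior = share of dinners packed,
#          likelihood = P(No napkin given the worker))
workers = {
    "Joe": (0.20, 0.05),
    "Jan": (0.30, 0.02),
    "Cheryl": (0.15, 0.14),
    "Clay": (0.35, 0.04),
}

# Joint probability of each worker packing AND forgetting the napkin
joint = {name: prior * lik for name, (prior, lik) in workers.items()}

# Total probability of No napkin (the denominator)
p_no_napkin = sum(joint.values())

# Posterior probability of each worker given No napkin
posterior = {name: j / p_no_napkin for name, j in joint.items()}

for name, p in posterior.items():
    print(f"{name:>6}: prior {workers[name][0]:.2f} -> posterior {p:.4f}")
```

Running this for all workers shows the pattern described above: Clay's probability falls from 0.35 to about 0.2745, while Cheryl's rises from 0.15 to about 0.41.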