4-1 Chapter Four McGraw-Hill/Irwin © 2005 The McGraw-Hill Companies, Inc., All Rights Reserved.
4-1
Chapter
Four
McGraw-Hill/Irwin
© 2005 The McGraw-Hill Companies, Inc., All Rights Reserved.
4-2
Chapter Four
Describing Data: Displaying and Exploring Data
GOALS
When you have completed this chapter, you will be able to:
ONE Develop and interpret a dot plot.
TWO Develop and interpret a stem-and-leaf display.
THREE Compute and interpret quartiles, deciles, and percentiles.
FOUR Construct and interpret box plots.
4-3
GOALS
FIVE Compute and understand the coefficient of variation and the coefficient of skewness.
SIX Draw and interpret a scatter diagram.
SEVEN Set up and interpret a contingency table.
4-4
Dot Plot
Dot plots report the details of each observation and are useful for comparing two or more data sets.
4-5
This example gives the percentages of men and women participating in the workforce in a recent
year for the fifty states of the United States. Compare the dispersions of labor force
participation by gender.
Example 1
4-6
Example 1 (continued)
4-7
Percentage of men participating in the labor force for the 50 states.
Percentage of women participating in the labor force for the 50 states.
Example 1 (continued)
4-8
Stem-and-leaf Displays
Stem-and-leaf display: A statistical technique for displaying a set of data. Each numerical value is divided into two parts: the leading digits become the stem and the trailing digits the leaf.
Note: An advantage of the stem-and-leaf display over a frequency distribution is that we do not lose the identity of each observation.
4-9
Example 2
Stock prices on twelve consecutive days for a major publicly traded company:
86, 79, 92, 84, 69, 88, 91, 83, 96, 78, 82, 85.
[Chart: stock price (axis 50 to 100) for days 1 to 12]
4-10
Example 2 (continued)
Stem-and-leaf display of stock prices
stem | leaf
6    | 9
7    | 8 9
8    | 2 3 4 5 6 8
9    | 1 2 6
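As a minimal sketch, the display above can be rebuilt from the twelve prices by splitting each value into its tens digit (stem) and units digit (leaf). The function name `stem_and_leaf` is illustrative and not from the slides, and the code assumes two-digit whole-number prices.

```python
from collections import defaultdict


def stem_and_leaf(values):
    """Map each stem (tens digit) to its sorted list of leaves (units digits).

    Assumption: values are non-negative whole numbers, so divmod by 10
    separates the leading digits from the trailing digit.
    """
    display = defaultdict(list)
    for value in sorted(values):
        stem, leaf = divmod(value, 10)
        display[stem].append(leaf)
    return dict(display)


prices = [86, 79, 92, 84, 69, 88, 91, 83, 96, 78, 82, 85]
display = stem_and_leaf(prices)

for stem in sorted(display):
    print(stem, "|", " ".join(str(leaf) for leaf in display[stem]))
```

Because every leaf is kept, each original price can still be read back from the display, which is the advantage over a frequency distribution.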
4-11
Quartiles
Quartiles divide a set of observations into four equal parts.
4-12
Quartiles (continued)
Locate the median (50th percentile).
4-13
Quartiles (continued)
Locate the median (50th percentile) and the first quartile (25th percentile).
4-14
Quartiles (continued)
Locate the median (50th percentile), the first quartile (25th percentile), and the third quartile (75th percentile).
4-15
Quartiles (continued)
$L_p = (n+1)\frac{P}{100}$
where $P$ is the desired percentile and $n$ is the number of observations.
4-16
Example 2 (continued)
Using the twelve stock prices, we can find the median, 25th, and 75th percentiles as follows:
Quartile 1: $L_{25} = (12+1)\frac{25}{100} = 3.25$th observation
Median: $L_{50} = (12+1)\frac{50}{100} = 6.50$th observation
Quartile 3: $L_{75} = (12+1)\frac{75}{100} = 9.75$th observation
4-17
Example 2 (continued)
Price:        96  92  91  88  86  85  84  83  82  79  78  69
Observation:  12  11  10   9   8   7   6   5   4   3   2   1
25th percentile (Q1): Price at 3.25 observation = 79 + .25(82 - 79) = 79.75
50th percentile (Q2, median): Price at 6.50 observation = 84 + .5(85 - 84) = 84.50
75th percentile (Q3): Price at 9.75 observation = 88 + .75(91 - 88) = 90.25
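The location formula and the interpolation step can be sketched in a few lines. This is an illustration under stated assumptions: the helper names `percentile_location` and `percentile` are hypothetical, and the code only handles locations that fall between the 1st and the n-th observation.

```python
def percentile_location(n, p):
    """Position of the p-th percentile among n ordered observations: (n + 1) * P / 100."""
    return (n + 1) * p / 100


def percentile(values, p):
    """Interpolate between the two observations that surround the location."""
    ordered = sorted(values)
    location = percentile_location(len(ordered), p)
    if location < 1 or location > len(ordered):
        raise ValueError("location falls outside the ordered observations")
    whole = int(location)          # rank of the observation just below the location
    fraction = location - whole    # how far to move toward the next observation
    lower = ordered[whole - 1]     # ranks start at 1, list indexes start at 0
    if fraction == 0:
        return lower
    upper = ordered[whole]
    return lower + fraction * (upper - lower)


prices = [86, 79, 92, 84, 69, 88, 91, 83, 96, 78, 82, 85]
q1 = percentile(prices, 25)
median = percentile(prices, 50)
q3 = percentile(prices, 75)
print(q1, median, q3)
```

Statistical packages often use other interpolation rules, so their quartiles may differ slightly from the (n + 1) method used here.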
4-18
Interquartile Range
The interquartile range is the distance between the third quartile Q3 and the first quartile Q1. This distance includes the middle 50 percent of the observations.
Interquartile range = Q3 - Q1
4-19
Example 3
For a set of observations the third quartile is 24 and the first quartile is 10. What is the interquartile range?
The interquartile range is 24 - 10 = 14. Fifty percent of the observations will occur between 10 and 24.
4-20
Box Plots
A box plot is a graphical display, based on quartiles, that helps to picture a set of data.
Five pieces of data are needed to construct a box plot: the Minimum Value, the First Quartile, the Median, the Third Quartile, and the Maximum Value.
4-21
Example 4
Based on a sample of 20 deliveries,
Buddy’s Pizza determined the following information. The
minimum delivery time was 13 minutes and the maximum 30
minutes. The first quartile was 15 minutes, the median 18
minutes, and the third quartile 22 minutes. Develop a box plot
for the delivery times.
4-22
Example 4 continued
4-23
Example 4 continued
[Box plot of delivery times on an axis from 12 to 32 minutes, with Min, Q1, Median, Q3, and Max marked]
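A rough text rendering can show how the five values from Example 4 map onto the axis. This is only a sketch: the function name `text_box_plot`, the characters chosen, and the one-character-per-minute scale are illustrative assumptions, and it expects whole-number values.

```python
def text_box_plot(minimum, q1, median, q3, maximum, axis_start, axis_end):
    """Draw a one-line box plot with one character per axis unit.

    '|' marks the minimum and maximum, '[' and ']' mark Q1 and Q3,
    'M' marks the median, '-' is a whisker and '=' is the box.
    Assumption: all arguments are integers within the axis range.
    """
    line = [" "] * (axis_end - axis_start + 1)
    for x in range(minimum, maximum + 1):
        line[x - axis_start] = "-"       # whiskers span min to max
    for x in range(q1, q3 + 1):
        line[x - axis_start] = "="       # box spans Q1 to Q3
    line[minimum - axis_start] = "|"
    line[maximum - axis_start] = "|"
    line[q1 - axis_start] = "["
    line[q3 - axis_start] = "]"
    line[median - axis_start] = "M"
    return "".join(line)


# Delivery times from Example 4, drawn on the 12-to-32-minute axis.
plot = text_box_plot(13, 15, 18, 22, 30, axis_start=12, axis_end=32)
print(plot)
```

In this rendering the right whisker is longer than the left one, because the maximum (30) lies 8 minutes above Q3 while the minimum (13) lies only 2 minutes below Q1.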
4-24
Coefficient of Variation
The coefficient of variation is the ratio of the standard deviation to the arithmetic mean, expressed as a percentage:
$CV = \frac{s}{\bar{X}}(100\%)$
where $s$ is the standard deviation and $\bar{X}$ is the mean. It is a measure of relative dispersion.
Skewness is the measurement of the lack of symmetry of the distribution.
The coefficient of skewness can range from -3.00 up to 3.00 when using the following formula:
$sk = \frac{3(\bar{X} - \text{Median})}{s}$
A value of 0 indicates a symmetric distribution.
Some software packages use a different formula, which results in a wider range for the coefficient.
4-26
Example 2 revisited
Using the twelve stock prices, we find the mean to be 84.42, the standard deviation 7.18, and the median 84.5.
Coefficient of variation:
$CV = \frac{s}{\bar{X}}(100\%) = 8.5\%$
Coefficient of skewness:
$sk = \frac{3(\bar{X} - \text{Median})}{s} = -.035$
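The figures in this example can be checked with Python's standard `statistics` module. The sketch assumes the slide's standard deviation is the sample standard deviation (divisor n - 1), which is what `statistics.stdev` computes; the variable names are illustrative.

```python
import statistics

prices = [86, 79, 92, 84, 69, 88, 91, 83, 96, 78, 82, 85]

mean = statistics.mean(prices)
s = statistics.stdev(prices)        # sample standard deviation (n - 1 divisor)
median = statistics.median(prices)  # average of the 6th and 7th ordered prices

cv = s / mean * 100                 # coefficient of variation, in percent
sk = 3 * (mean - median) / s        # coefficient of skewness used on the slides

print(round(mean, 2), round(s, 2), median)
print(round(cv, 1), round(sk, 3))
```

The skewness is computed from the unrounded mean and standard deviation; substituting the rounded values 84.42 and 7.18 gives a slightly different third decimal.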
4-27
Scatter diagram
Scatter diagram: A technique used to show the relationship between variables.
Variables must be at least interval scaled.
Relationship can be positive (direct) or negative (inverse).
Example: The twelve days of stock prices and the overall market index on each day are given as follows:
4-28
Example 2 revisited
Price:         96   92   91   88   86   85   84   83   82   79   78   69
Index (000s):  8.0  7.5  7.5  7.3  7.2  7.2  7.1  7.1  7.0  6.2  6.2  5.1
[Scatter diagram: Relationship between Market Index and Stock Price; horizontal axis Index (5 to 10), vertical axis Price (50 to 100)]
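The slides read the direction of the relationship off the chart. As a crude numerical stand-in (not the slides' method), one can compare every pair of days and count how often index and price move in the same direction versus opposite directions. The function name `pair_directions` is hypothetical, and the pairing of each index value with the price listed in the same position is an assumption about how the two rows line up.

```python
from itertools import combinations

# Assumption: the i-th index value belongs to the same day as the i-th price.
index = [8.0, 7.5, 7.5, 7.3, 7.2, 7.2, 7.1, 7.1, 7.0, 6.2, 6.2, 5.1]
price = [96, 92, 91, 88, 86, 85, 84, 83, 82, 79, 78, 69]


def pair_directions(xs, ys):
    """Count pairs of observations that move together and pairs that move apart.

    Pairs tied on either variable are ignored.
    """
    same = opposite = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        product = (x2 - x1) * (y2 - y1)
        if product > 0:
            same += 1
        elif product < 0:
            opposite += 1
    return same, opposite


same, opposite = pair_directions(index, price)
if same > opposite:
    label = "positive (direct)"
elif opposite > same:
    label = "negative (inverse)"
else:
    label = "no clear direction"
print(same, opposite, label)
```

Under the pairing assumption, no pair of days moves in opposite directions, which matches a scatter diagram that rises from left to right.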
4-29
Contingency table
A contingency table is a cross tabulation that simultaneously summarizes two variables of interest.
A contingency table is used to classify observations according to two identifiable characteristics.
Contingency tables are used when one or both variables are nominally scaled.
4-30
Example 5
Weight Loss
Forty-five adults, all 60 pounds overweight, are randomly assigned to three weight loss programs. Twenty weeks into the program, a researcher gathers data on weight loss and divides the loss into three categories: less than 20 pounds, 20 up to 40 pounds, and 40 or more pounds. Here are the results.
4-31
Example 5 continued
Weight loss by plan:
Plan     Less than 20 pounds   20 up to 40 pounds   40 pounds or more
Plan 1            4                    8                    3
Plan 2            2                   12                    1
Plan 3           12                    2                    1
Compare the weight loss under the three plans.
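One way to compare the plans is to add row and column totals and convert each row to percentages. The sketch below does this with plain dictionaries; the variable names are illustrative, and the choice of row percentages is an assumption about how the comparison might be made.

```python
categories = ["Less than 20 pounds", "20 up to 40 pounds", "40 pounds or more"]
table = {
    "Plan 1": [4, 8, 3],
    "Plan 2": [2, 12, 1],
    "Plan 3": [12, 2, 1],
}

# Totals for each plan (rows) and each weight-loss category (columns).
row_totals = {plan: sum(counts) for plan, counts in table.items()}
column_totals = [sum(column) for column in zip(*table.values())]

# Share of each plan's participants falling in each category, in percent.
row_percents = {
    plan: [round(100 * count / row_totals[plan], 1) for count in counts]
    for plan, counts in table.items()
}

for plan, percents in row_percents.items():
    print(plan, dict(zip(categories, percents)))
print("Column totals:", dict(zip(categories, column_totals)))
```

Each plan has 15 participants. In Plan 3, 12 of the 15 lost less than 20 pounds; in Plan 2, 12 of the 15 lost 20 up to 40 pounds; Plan 1 has the most participants (3) in the 40 pounds or more category.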