Chapter 1 · 2020. 12. 11. · BPS - 5th Ed.
BPS - 5th Ed. Chapter 1 1
Chapter 1
Picturing Distributions with Graphs
BPS - 5th Ed. Chapter 1 2
Statistics
Statistics is a science that involves the extraction of
information from numerical data obtained during an
experiment or from a sample. It involves the design
of the experiment or sampling procedure, the
collection and analysis of the data, and making
inferences (statements) about the population based
upon information in a sample.
BPS - 5th Ed. Chapter 1 3
Individuals and Variables
Individuals
– the objects described by a set of data
– may be people, animals, or things
Variable
– any characteristic of an individual
– can take different values for different
individuals
BPS - 5th Ed. Chapter 1 4
Variables
Categorical
– Places an individual into one of several
groups or categories
Quantitative (Numerical)
– Takes numerical values for which
arithmetic operations such as adding and
averaging make sense
BPS - 5th Ed. Chapter 1 5
Case Study
Weight Gain Spells
Heart Risk for Women
“Weight, weight change, and coronary heart disease
in women.” W. C. Willett et al., Journal of the
American Medical Association, vol. 273(6), Feb. 8, 1995.
(Reported in Science News, Feb. 4, 1995, p. 108)
BPS - 5th Ed. Chapter 1 6
Case Study
Weight Gain Spells
Heart Risk for Women
Objective:
To recommend a range of body mass index
(a function of weight and height) in terms of
coronary heart disease (CHD) risk in women.
BPS - 5th Ed. Chapter 1 7
Case Study
Study started in 1976 with 115,818
women aged 30 to 55 years and without
a history of previous CHD.
Each woman’s weight (body mass) was
determined.
Each woman was asked her weight at
age 18.
BPS - 5th Ed. Chapter 1 8
Case Study
The cohort of women was followed for
14 years.
The number of CHD cases (fatal and nonfatal)
was counted (1292 cases).
BPS - 5th Ed. Chapter 1 9
Case Study
Variables measured
Quantitative
– Age (in 1976)
– Weight in 1976
– Weight at age 18
Categorical
– Incidence of coronary heart disease
– Smoker or nonsmoker
– Family history of heart disease
BPS - 5th Ed. Chapter 1 10
Distribution
Tells what values a variable takes and
how often it takes these values
Can be a table, graph, or function
BPS - 5th Ed. Chapter 1 11
Displaying Distributions
Categorical variables
– Pie charts
– Bar graphs
Quantitative variables
– Histograms
– Stemplots (stem-and-leaf plots)
BPS - 5th Ed. Chapter 1 12
Data Table
Class Make-up on First Day
Year        Count   Percent
Freshman    18      41.9%
Sophomore   10      23.3%
Junior       6      14.0%
Senior       9      20.9%
Total       43      100.1%
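The percent column can be recomputed from the counts. The following is a minimal sketch (the variable names are illustrative, not part of the slides) that also shows why the rounded percents total 100.1% rather than exactly 100%.

```python
# Counts from the class make-up table on the first day.
counts = {"Freshman": 18, "Sophomore": 10, "Junior": 6, "Senior": 9}

total = sum(counts.values())

# Percent of the class in each category, rounded to one decimal place.
percents = {year: round(100 * n / total, 1) for year, n in counts.items()}

for year, n in counts.items():
    print(f"{year:<10} {n:>3} {percents[year]:>6.1f}%")

# The rounded percents need not add to exactly 100 (roundoff error).
percent_total = round(sum(percents.values()), 1)
print(f"{'Total':<10} {total:>3} {percent_total:>6.1f}%")
```

Each percent is rounded separately, so small roundoff errors can accumulate in the total.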
BPS - 5th Ed. Chapter 1 13
Pie Chart
Class Make-up on First Day
Slices: Freshman 41.9%, Sophomore 23.3%, Junior 14.0%, Senior 20.9%
BPS - 5th Ed. Chapter 1 14
Bar Graph
Class Make-up on First Day
Horizontal axis: Year in School (Freshman, Sophomore, Junior, Senior)
Vertical axis: Percent (0.0% to 45.0%)
Bar heights: Freshman 41.9%, Sophomore 23.3%, Junior 14.0%, Senior 20.9%
BPS - 5th Ed. Chapter 1 15
Example: U.S. Solid Waste (2000)
Data Table
Material                    Weight (million tons)   Percent of total
Food scraps                  25.9                    11.2%
Glass                        12.8                     5.5%
Metals                       18.0                     7.8%
Paper, paperboard            86.7                    37.4%
Plastics                     24.7                    10.7%
Rubber, leather, textiles    15.8                     6.8%
Wood                         12.7                     5.5%
Yard trimmings               27.7                    11.9%
Other                         7.5                     3.2%
Total                       231.9                   100.0%
BPS - 5th Ed. Chapter 1 16
Example: U.S. Solid Waste (2000)
Pie Chart
BPS - 5th Ed. Chapter 1 17
Example: U.S. Solid Waste (2000)
Bar Graph
BPS - 5th Ed. Chapter 1 18
Examining the Distribution of
Quantitative Data
Overall pattern of graph
Deviations from overall pattern
Shape of the data
Center of the data
Spread of the data (Variation)
Outliers
BPS - 5th Ed. Chapter 1 19
Shape of the Data
Symmetric
– bell shaped
– other symmetric shapes
Asymmetric
– right skewed
– left skewed
Unimodal, bimodal
BPS - 5th Ed. Chapter 1 20
Symmetric
Bell-Shaped
BPS - 5th Ed. Chapter 1 21
Symmetric
Mound-Shaped
BPS - 5th Ed. Chapter 1 22
Symmetric
Uniform
BPS - 5th Ed. Chapter 1 23
Asymmetric
Skewed to the Left
BPS - 5th Ed. Chapter 1 24
Asymmetric
Skewed to the Right
BPS - 5th Ed. Chapter 1 25
Outliers
Extreme values that fall outside the
overall pattern
– May occur naturally
– May occur due to error in recording
– May occur due to error in measuring
– Observational unit may be fundamentally
different
BPS - 5th Ed. Chapter 1 26
Histograms
For quantitative variables that take
many values
Divide the possible values into class
intervals (we will only consider equal widths)
Count how many observations fall in
each interval (may change to percents)
Draw picture representing distribution
BPS - 5th Ed. Chapter 1 27
Histograms: Class Intervals
How many intervals?
– One rule is to calculate the square root of the
sample size, and round up.
Size of intervals?
– Divide range of data (max−min) by number of
intervals desired, and round to convenient number
Pick intervals so each observation can only
fall in exactly one interval (no overlap)
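The two rules above can be sketched in a few lines. This is only an illustration: the function name `class_intervals` and the sample figures (30 observations ranging from 12 to 47) are hypothetical, and rounding the raw width to a convenient number is left as a judgment call.

```python
import math

def class_intervals(n, minimum, maximum):
    """Return (number of intervals, raw class width) using the square-root rule."""
    k = math.ceil(math.sqrt(n))        # square root of the sample size, rounded up
    width = (maximum - minimum) / k    # range of the data divided by the number of intervals
    return k, width

# Hypothetical example: 30 observations ranging from 12 to 47.
k, width = class_intervals(30, 12, 47)
print(k, round(width, 2))
```

Here the raw width is a little under 6, so a width of 6 would be a convenient choice.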
BPS - 5th Ed. Chapter 1 28
Case Study
Weight Data
Introductory Statistics class
Spring, 1997
Virginia Commonwealth University
BPS - 5th Ed. Chapter 1 29
Weight Data
192 110 195 180 170 215
152 120 170 130 130 125
135 185 120 155 101 194
110 165 185 220 180
128 212 175 140 187
180 119 203 157 148
260 165 185 150 106
170 210 123 172 180
165 186 139 175 127
150 100 106 133 124
BPS - 5th Ed. Chapter 1 30
Weight Data: Frequency Table
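Weight Group   Count
100 - <120      7
120 - <140     12
140 - <160      7
160 - <180      9
180 - <200     12
200 - <220      4
220 - <240      1
240 - <260      0
260 - <280      1
sqrt(53) ≈ 7.3, rounded up to 8 intervals; range (260−100=160) / 8 = 20 = class width.
Because the right endpoint of each group is excluded, the maximum (260) falls in a ninth group, 260 - <280.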
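As a check, the frequency table can be rebuilt directly from the 53 listed weights. The sketch below assumes groups that start at the minimum (100) and include the left endpoint but not the right one; the variable names are illustrative. Counting this way gives 9 observations in the 160 - <180 group, so the counts add up to 53.

```python
import math

weights = [
    192, 110, 195, 180, 170, 215,
    152, 120, 170, 130, 130, 125,
    135, 185, 120, 155, 101, 194,
    110, 165, 185, 220, 180,
    128, 212, 175, 140, 187,
    180, 119, 203, 157, 148,
    260, 165, 185, 150, 106,
    170, 210, 123, 172, 180,
    165, 186, 139, 175, 127,
    150, 100, 106, 133, 124,
]

n = len(weights)
k = math.ceil(math.sqrt(n))                  # square-root rule, rounded up
width = (max(weights) - min(weights)) // k   # 160 // 8 = 20 (divides evenly here)
start = min(weights)

# Left endpoint included, right endpoint excluded: each weight lands in exactly
# one group. The maximum (260) sits on a boundary, so it opens one extra group.
n_groups = (max(weights) - start) // width + 1
counts = [0] * n_groups
for w in weights:
    counts[(w - start) // width] += 1

for i, c in enumerate(counts):
    lo = start + i * width
    print(f"{lo} - <{lo + width}: {c}")
```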
BPS - 5th Ed. Chapter 1 31
Weight Data: Histogram
Histogram of the weight data (class width 20)
Horizontal axis: Weight (100 to 280)
Vertical axis: Frequency, the number of students (0 to 14)
* Left endpoint is included in the group, right endpoint is not.
BPS - 5th Ed. Chapter 1 32
Stemplots (Stem-and-Leaf Plots)
For quantitative variables
Separate each observation into a stem (first
part of the number) and a leaf (the remaining
part of the number)
Write the stems in a vertical column; draw a
vertical line to the right of the stems
Write each leaf in the row to the right of its
stem; order leaves if desired
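The steps above can be sketched as a short function. This is a minimal illustration for whole-number data, assuming the stem is everything but the last digit and the leaf is the last digit; the function name `stemplot` is hypothetical. It is applied here to the class weight data.

```python
from collections import defaultdict

def stemplot(values):
    """Return stemplot rows for whole numbers: stem = tens and above, leaf = ones digit."""
    leaves = defaultdict(list)
    for v in sorted(values):                 # sorting first keeps the leaves in order
        leaves[v // 10].append(v % 10)
    rows = []
    for stem in range(min(leaves), max(leaves) + 1):   # keep stems that have no leaves
        row = "".join(str(d) for d in leaves.get(stem, []))
        rows.append(f"{stem:>2} | {row}".rstrip())
    return rows

weights = [
    192, 110, 195, 180, 170, 215, 152, 120, 170, 130, 130, 125,
    135, 185, 120, 155, 101, 194, 110, 165, 185, 220, 180,
    128, 212, 175, 140, 187, 180, 119, 203, 157, 148,
    260, 165, 185, 150, 106, 170, 210, 123, 172, 180,
    165, 186, 139, 175, 127, 150, 100, 106, 133, 124,
]

rows = stemplot(weights)
for line in rows:
    print(line)
```

Stems with no leaves (23 through 25 here) are kept so that the gap before the value 260 stays visible.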
BPS - 5th Ed. Chapter 1 34
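Weight Data:
Stemplot (Stem & Leaf Plot)
10 |
11 |
12 |
13 | 5
14 |
15 | 2
16 |
17 |
18 |
19 | 2
20 |
21 |
22 |
23 |
24 |
25 |
26 |
First leaves placed: 192 goes on stem 19 as leaf 2, 152 on stem 15 as leaf 2, 135 on stem 13 as leaf 5
Key: 20|3 means 203 pounds
Stems = 10’s, Leaves = 1’s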
BPS - 5th Ed. Chapter 1 35
Weight Data:
Stemplot (Stem & Leaf Plot)
10 | 0166
11 | 009
12 | 0034578
13 | 00359
14 | 08
15 | 00257
16 | 555
17 | 000255
18 | 000055567
19 | 245
20 | 3
21 | 025
22 | 0
23 |
24 |
25 |
26 | 0
Key: 20|3 means 203 pounds
Stems = 10’s, Leaves = 1’s
BPS - 5th Ed. Chapter 1 36
Extended Stem-and-Leaf Plots
If there are very few stems (when the
data cover only a very small range of
values), then we may want to create
more stems by splitting the original
stems.
BPS - 5th Ed. Chapter 1 37
Extended Stem-and-Leaf Plots
Example: if all of the data values were
between 150 and 179, then we may
choose to use the following stems:
15
15
16
16
17
17
Leaves 0-4 would go on each
upper stem (first “15”), and leaves
5-9 would go on each lower stem
(second “15”).
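Splitting stems can be sketched as follows. The function name `split_stemplot` is hypothetical, and the input is simply the subset of the class weight data that lies between 150 and 179, used here only to illustrate the layout.

```python
def split_stemplot(values):
    """Stemplot with each stem split in two: leaves 0-4 on the first row, 5-9 on the second."""
    rows = {}
    for v in sorted(values):
        key = (v // 10, 0 if v % 10 <= 4 else 1)   # (stem, first or second half)
        rows.setdefault(key, []).append(v % 10)
    lo_stem, hi_stem = min(values) // 10, max(values) // 10
    out = []
    for stem in range(lo_stem, hi_stem + 1):
        for half in (0, 1):
            leaves = "".join(str(d) for d in rows.get((stem, half), []))
            out.append(f"{stem} | {leaves}".rstrip())
    return out

# Weights from the class data that fall between 150 and 179.
subset = [170, 152, 170, 155, 165, 175, 157, 165, 150, 170, 172, 165, 175, 150]

lines = split_stemplot(subset)
for line in lines:
    print(line)
```

With only three distinct stems, the split version spreads the same 14 values over six rows, which makes the shape easier to see.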
BPS - 5th Ed. Chapter 1 38
Time Plots
A time plot shows behavior over time.
Time is always on the horizontal axis, and the
variable being measured is on the vertical axis.
Look for an overall pattern (trend), and
deviations from this trend. Connecting the data
points by lines may emphasize this trend.
Look for patterns that repeat at known regular
intervals (seasonal variations).
BPS - 5th Ed. Chapter 1 39
Class Make-up on First Day (Fall Semesters: 1985-1993)
Time plot
Horizontal axis: Year of Fall Semester (1985 to 1993)
Vertical axis: Percent of Class That Are Freshmen (0% to 70%)
BPS - 5th Ed. Chapter 1 40
Average Tuition (Public vs. Private)