DISPLAYING DISTRIBUTIONS WITH GRAPHS
Section 1.1
INTRODUCTION TO STEMPLOTS
Video:
https://www.learner.org/courses/againstallodds/unitpages/unit02.html
STEMPLOT (STEM AND LEAF PLOT)
GRAPHING QUANTITATIVE DATA - STEMPLOTS
Also referred to as a stem-and-leaf plot.
Gives a quick picture of the shape of a distribution
Includes the actual numerical values in the graph
Works best for:
- Small number of observations (Guideline: 15 – 150)
HOW TO MAKE A STEMPLOT
Separate each observation into a stem (consisting of all but the final, right-most digit) and a leaf (the final digit).
- Stems may have as many digits as needed (can trim if needed)
- Each leaf contains only 1 digit
Write the stems in a vertical column with the smallest at the top, and draw a vertical line at the right of this column
Write each leaf in the row to the right of its stem, increasing in order out from the stem
- Leaves are normally written to the right of the stems, unless making a back-to-back stemplot.
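The steps above can be sketched in Python. This is a minimal illustration, not a standard library routine: the function name `stemplot` and the sample values are made up, and it assumes non-negative whole numbers so that the leaf is simply the last digit.

```python
from collections import defaultdict


def stemplot(values):
    """Return the rows of a stemplot for non-negative whole numbers."""
    leaves = defaultdict(list)
    for v in sorted(values):
        # Stem = all but the final digit, leaf = the final digit.
        stem, leaf = divmod(v, 10)
        leaves[stem].append(leaf)

    lo, hi = min(leaves), max(leaves)
    width = len(str(hi))
    rows = []
    # Walk every stem from smallest to largest so empty stems stay visible.
    for stem in range(lo, hi + 1):
        row = " ".join(str(d) for d in leaves.get(stem, []))
        rows.append(f"{stem:>{width}} | {row}".rstrip())
    return rows


# Hypothetical data, for illustration only.
scores = [47, 52, 58, 61, 61, 75, 79, 80]
print("\n".join(stemplot(scores)))
```

Sorting the values first is what puts the leaves in increasing order out from the stem.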
LITERACY IN ISLAMIC COUNTRIES
EX: CREATE A STEMPLOT OF FEMALE LITERACY RATES
You can use your calculator to sort data…as well as many other useful things
2 | 9
3 | 1 8
4 | 6
5 |
6 | 0 3 3
7 | 0 1 1 8
8 | 2 5 6
9 | 9 9 9
The overall pattern of the stemplot is irregular.
- This is often the case when there are very few observations.
There appear to be 2 clusters of data.
This suggests we may want to investigate the variation in literacy.
GRAPHING QUANTITATIVE DATA - STEMPLOTS
BACK-TO-BACK STEMPLOTS
Leaves on each side are ordered out from the common stem.
 Female |    | Male
      9 |  2 |
    8 1 |  3 |
      6 |  4 |
        |  5 | 0
  3 3 0 |  6 | 8 8
8 1 1 0 |  7 | 0 8
  6 5 2 |  8 | 3 4 5 9
  9 9 9 |  9 | 2 2 4 5 6
        | 10 | 0 0 0
Be sure to label each set of leaves (here, Female and Male).
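A back-to-back stemplot can be sketched the same way. The function `back_to_back`, its parameters, and the sample values below are hypothetical; the sketch assumes non-negative whole numbers and reverses the left-hand leaves so they increase outward from the shared stem.

```python
from collections import defaultdict


def back_to_back(left, right, left_label="Left", right_label="Right"):
    """Return the rows of a back-to-back stemplot sharing one stem column."""
    def group(values):
        grouped = defaultdict(list)
        for v in sorted(values):
            grouped[v // 10].append(v % 10)
        return grouped

    lg, rg = group(left), group(right)
    stems = range(min(min(lg), min(rg)), max(max(lg), max(rg)) + 1)

    # Left leaves are written in reverse so they grow outward from the stem.
    left_rows = {
        s: " ".join(str(d) for d in reversed(lg.get(s, []))) for s in stems
    }
    lw = max(len(left_label), max(len(r) for r in left_rows.values()))
    sw = len(str(stems[-1]))

    # Header row carries the label for each side.
    rows = [f"{left_label:>{lw}} | {' ' * sw} | {right_label}"]
    for s in stems:
        right_row = " ".join(str(d) for d in rg.get(s, []))
        rows.append(f"{left_rows[s]:>{lw}} | {s:>{sw}} | {right_row}".rstrip())
    return rows


# Hypothetical data, for illustration only.
print("\n".join(back_to_back([12, 15, 21], [18, 23, 23, 30])))
```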
STEMPLOT GUIDELINES
Stemplots do not work well for large data sets where each stem must hold a large number of leaves.
- The plot becomes cluttered and loses its shape
Two helpful modifications:
- Splitting stems into 2 (0-4 and 5-9) or 5 (0-1, 2-3, 4-5, 6-7, and 8-9)
- Trimming: when observed values have many digits, remove the last digit or digits before making the stemplot (trim, do not round)
Have to use your best judgment in deciding when to split or trim.
- Keep in mind that the purpose of the stemplot is to display the shape of a distribution.
- Statistical software will often make this decision for you.
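Both modifications can be sketched in a few lines of Python. The helper names `trim` and `split_stemplot` and the dollar amounts are made up for illustration; the sketch assumes non-negative whole numbers and does not pad stems to a common width.

```python
def trim(value, drop_digits=1):
    """Drop the last `drop_digits` digits without rounding."""
    return value // 10 ** drop_digits


def split_stemplot(values, parts=2):
    """Stemplot rows with each stem split into 2 or 5 rows."""
    if parts not in (2, 5):
        raise ValueError("parts must be 2 or 5")
    span = 10 // parts  # leaves per row: 5 (0-4, 5-9) or 2 (0-1, 2-3, ...)
    rows = {}
    for v in sorted(values):
        stem, leaf = divmod(v, 10)
        rows.setdefault((stem, leaf // span), []).append(leaf)

    lines = []
    key, last = min(rows), max(rows)
    while key <= last:
        stem, part = key
        leaves = "".join(str(d) for d in rows.get(key, []))
        lines.append(f"{stem} | {leaves}".rstrip())
        # Move to the next row of this stem, or to the next stem.
        key = (stem, part + 1) if part + 1 < parts else (stem + 1, 0)
    return lines


# Hypothetical dollar amounts: trimming keeps thousands and drops the rest.
amounts = [9120, 10480, 12999, 15300, 21750]
trimmed = [trim(a, 3) for a in amounts]  # 12999 becomes 12, not 13
print("\n".join(split_stemplot(trimmed, parts=2)))
```

Trying `parts=2` and `parts=5` on the same data is a quick way to judge which version shows the shape best.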
VIRGINIA TUITION FOR 06-07 SCHOOL YEAR
Goal: Trim data and make a stemplot
Minimum number of digits for a stemplot: 2
Here we will use ten-thousands for the stem and thousands for the leaf, based on the smallest value.
0 | 9
1 | 0122223444444556667788999
2 | 1111222225
3 | 4
OK, let's split the stems into two (0-4 and 5-9)
0 |
0 | 9
1 | 0122223444444
1 | 556667788999
2 | 111122222
2 | 5
3 | 4
3 |
Now split stems into five (0-1, 2-3, 4-5, 6-7, and 8-9)
0 | 9
1 | 01
1 | 22223
1 | 44444455
1 | 66677
1 | 88999
2 | 1111
2 | 22222
2 | 5
2 |
2 |
3 |
3 |
3 | 4
Stemplot Examples
INTRODUCTION TO HISTOGRAMS
Video:
https://www.learner.org/courses/againstallodds/unitpages/unit03.html
HISTOGRAMS
Stemplots:
• display the actual values of the observations, making them awkward for large data sets.
• divide the observations into groups (stems) determined by the number system, not by judgment.
Histograms do not have these limitations.
A histogram breaks the range of values into classes (quantitative) and displays only the count or percent of the observations that fall into each class.
You can choose any convenient number of classes (usually 5-8) but you must always choose classes of equal width.
Too few classes can lead to a graph that looks like a skyscraper.
Too many classes can lead to a graph that looks like a pancake.
HISTOGRAMS (CONT)
Good to know: histograms are slow to construct by hand.
They also do not display the actual values observed.
Because of this, we use stemplots for small data sets.
CREATING A HISTOGRAM
Determine a starting point for the classes and a common width.
First find the range of the values
Then determine a convenient starting point and width.
Count the number of individual observations in each class (this will determine the height of the class).
Graph the histogram (Be sure to leave no space between classes, unless a class is empty).
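The counting step can be sketched as follows. The function `class_counts` and the data are hypothetical, and the sketch assumes each class includes its lower endpoint and excludes its upper endpoint, which is one common convention rather than the only one.

```python
def class_counts(values, start, width):
    """Count observations in equal-width classes [start, start + width), ..."""
    if min(values) < start:
        raise ValueError("start must not exceed the smallest value")
    n_classes = int((max(values) - start) // width) + 1
    counts = [0] * n_classes
    for v in values:
        # Index of the class whose lower endpoint is at or below v.
        counts[int((v - start) // width)] += 1
    return [
        (start + i * width, start + (i + 1) * width, c)
        for i, c in enumerate(counts)
    ]


# Hypothetical data, for illustration only.
data = [12, 15, 21, 22, 28, 34, 41]
for lo, hi, count in class_counts(data, start=10, width=10):
    print(f"{lo} to under {hi}: {count}")
```

Each count becomes the height of the bar drawn over its class.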
Create a histogram based on the data below
Data range is 81 - 145
Start at 75, go up by 10
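As a rough check on these choices, the class boundaries can be listed in Python. The helper `class_intervals` is a made-up name, and the sketch assumes each class includes its lower endpoint but not its upper endpoint.

```python
def class_intervals(start, width, maximum):
    """List (lower, upper) class boundaries until the maximum is covered."""
    intervals = []
    lower = start
    while lower <= maximum:
        intervals.append((lower, lower + width))
        lower += width
    return intervals


# Start at 75, go up by 10, and cover the largest value, 145.
for lower, upper in class_intervals(75, 10, 145):
    print(f"{lower} to under {upper}")
```

Under this convention the largest value, 145, opens its own class (145 to under 155), giving eight classes. If classes instead include their upper endpoint, 145 falls in the 135 to 145 class and seven classes are enough.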
CLASS AND COUNTS
HISTOGRAMS VS. BAR GRAPHS
Histograms resemble bar graphs, but are very different.
Bar Graphs are for a categorical variable.
Histograms are for a quantitative variable.
Bar Graphs have a blank space to separate the items being compared.
Histograms have no space (unless there is an empty class).
This indicates that all values of the variable are covered.
DESCRIBING DISTRIBUTIONS
When you are asked to describe a distribution you are actually being tasked with describing the overall pattern.
Look to see if there is an overall pattern or if there are any striking differences from the pattern.
The overall pattern consists of: Shape, Center, and Spread.
An important deviation from the pattern is an Outlier. This is an individual value that falls outside the overall pattern.
For now, use your best judgement to determine if an individual is an outlier. Later we will learn a specific set of guidelines for determining outliers.
DESCRIBING DISTRIBUTIONS
Use SOCS or CUSS to help you remember the steps in describing distributions.
Shape, Outliers, Center, Spread
or...
Center, Unusual, Shape, Spread
We will be using this ALL year☺
CENTER
For now we will describe the center of a distribution by its midpoint:
The value with about half of the observations with smaller values and half of the observations with larger values.
There is usually a peak associated with the center of the distribution (when unimodal).
Later we will learn additional measures of center.
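One way to pin down this midpoint is to sort the values and take the middle one. The sketch below is an illustration with made-up data; the function name `midpoint` and the rule of averaging the two middle values when the count is even are assumptions, not something these slides specify.

```python
def midpoint(values):
    """Value with about half the observations below it and half above."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    # Even count: take the average of the two middle values (an assumption).
    return (ordered[mid - 1] + ordered[mid]) / 2


# Hypothetical data, for illustration only.
print(midpoint([52, 47, 61, 80, 75]))  # middle of the sorted values
```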
UNUSUAL / OUTLIERS
For now, identifying outliers is a matter of judgement.
Points that are clearly apart from the body of the data, not just the largest or smallest value.
Later we will learn an objective process for determining if an outlier is present.
Figure labels: Probably not an outlier, Probably an outlier
SHAPE
Stemplots and histograms display the shape of data in the same way.
Imagine the stemplot on its side so that the larger values lie to the right.
Peaks: Does the distribution have one or several major peaks (modes)?
Unimodal, Bimodal, Multimodal
SHAPE (CONT.)
Skew: Is the distribution approximately symmetric or is it skewed in one direction?
Symmetric, Left Skewed, Right Skewed
SHAPE (CONT. AGAIN)
• Some variables commonly have distributions with predictable shapes.
• Biological measurements on specimens from the same species and sex often have symmetric distributions.
• Salaries, savings, and home prices often have right-skewed distributions.
SPREAD
The spread of a distribution can be described by giving the smallest (minimum) and largest (maximum) values.
Later we will learn additional ways to describe the spread.
Example: the spread is from 40 to 100.
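Describing spread this way takes only the smallest and largest values. A minimal sketch, with a made-up function name `spread` and hypothetical data chosen so that the result runs from 40 to 100:

```python
def spread(values):
    """Describe spread by the smallest and largest observations."""
    return min(values), max(values)


# Hypothetical data, for illustration only.
smallest, largest = spread([72, 40, 88, 100, 65])
print(f"Spread is from {smallest} to {largest}")
```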