Chapter 2
Presenting Data in Tables and Charts
Note:
• Sections 2.1 & 2.2 - examining data from 1 numerical variable.
• Section 2.3 - examining data from 2 numerical variables.
• Section 2.4 - examining data from 1 categorical variable (read).
• Section 2.5 - examining data from 2 categorical variables.
Section 2.1
Organizing Numerical Data
Examining One Numerical Variable.
Ordered Array
• Array of data ordered from smallest to largest value
– Makes it easier to see the extreme values and where the majority of values are located.
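As a minimal sketch of an ordered array, the snippet below sorts the eleven values used in this section's examples. The shuffled starting order and the variable names are assumptions made for illustration.

```python
# The eleven example values from this section, deliberately out of order.
data = [80.2, 74.0, 89.2, 74.6, 78.4, 86.0, 74.3, 82.0, 79.8, 84.7, 81.4]

# sorted() returns a new list ordered from smallest to largest.
ordered = sorted(data)

print(ordered)
print("smallest:", ordered[0], "largest:", ordered[-1])
```

Once the values are ordered, the extremes sit at the two ends of the list and clusters of close values become easy to spot.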
Using Excel
• Data | Sort
• Select the heading of the column you want to sort by first. Choose ascending or descending.
• Select the heading of the column you want to sort by second. Choose ascending or descending. Etc.
• Choose the appropriate button, “Header row” or “No header row”.
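Outside Excel, the same idea of sorting by one column first and a second column next can be sketched in Python. The rows below are hypothetical and exist only to show the two-level sort.

```python
# Hypothetical rows: (major, salary in dollars). Not data from the chapter.
rows = [
    ("Math", 52000),
    ("English", 31000),
    ("Math", 28000),
    ("English", 55000),
]

# Sort by major ascending first, then by salary descending within each major.
# Negating the salary reverses the order for that key only.
by_major_then_salary = sorted(rows, key=lambda r: (r[0], -r[1]))

for row in by_major_then_salary:
    print(row)
```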
Stem & Leaf Display
• Shows how the data varies over a range of observations
• Separates data according to leading digits (stems) and trailing digits (leaves).
Stem & Leaf Display
Stem Unit of 1
74 | 0 3 6
75 |
76 |
77 |
78 | 4
79 | 8
80 | 2
81 | 4
82 | 0
83 |
84 | 7
85 |
86 | 0
87 |
88 |
89 | 2
Data
74 74.3 74.6 78.4 79.8 80.2 81.4 82.0 84.7 86.0 89.2
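A stem-and-leaf display with a stem unit of 1 can be sketched as follows: the integer part of each value is the stem and the first decimal digit is the leaf. The function name `stem_and_leaf` is an assumption for illustration, and this is not how PHStat builds its display.

```python
from collections import defaultdict

data = [74, 74.3, 74.6, 78.4, 79.8, 80.2, 81.4, 82.0, 84.7, 86.0, 89.2]

def stem_and_leaf(values):
    """Group values by integer part (stem); the first decimal digit is the leaf."""
    leaves = defaultdict(list)
    for v in sorted(values):
        tenths = round(v * 10)           # work in whole tenths to avoid float noise
        stem, leaf = divmod(tenths, 10)  # e.g. 743 -> stem 74, leaf 3
        leaves[stem].append(leaf)
    # Include empty stems so gaps in the data stay visible.
    return {s: leaves.get(s, []) for s in range(min(leaves), max(leaves) + 1)}

display = stem_and_leaf(data)
for stem, leaf_list in display.items():
    print(stem, "|", " ".join(str(leaf) for leaf in leaf_list))
```

Whole-number values such as 74 and 82.0 get a leaf of 0 in this sketch.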
Stem & Leaf Display x
7 | 4 5 8
8 | 0 0 1 5 9
Stem unit: 10
Using PHStat
7 | 4 4 5 8 10
8 | 0 1 2 5 6 9
The 10 in the top right cell shows that the value (79.8) rounds to 80 but is in the 70’s.
Data
74 74.3 74.6 78.4 79.8 80.2 81.4 82.0 84.7 86.0 89.2
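The PHStat rows with a stem unit of 10 can be reproduced with the sketch below. It assumes the stem comes from the unrounded value and the leaf from the value rounded to the nearest integer; that rule is inferred from the 10 leaf in the display and is not documented PHStat behavior. The function name is hypothetical.

```python
import math

data = [74, 74.3, 74.6, 78.4, 79.8, 80.2, 81.4, 82.0, 84.7, 86.0, 89.2]

def stem_unit_10(values):
    """Stem = tens digit of the raw value; leaf = rounded value minus the stem's tens."""
    rows = {}
    for v in sorted(values):
        stem = math.floor(v / 10)          # 79.8 -> stem 7
        rounded = math.floor(v + 0.5)      # round half up: 79.8 -> 80
        rows.setdefault(stem, []).append(rounded - stem * 10)  # 80 - 70 -> leaf 10
    return rows

rows = stem_unit_10(data)
for stem, leaves in rows.items():
    print(stem, "|", " ".join(str(leaf) for leaf in leaves))
```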
Using PHStat to create a Stem & Leaf Display
• PHStat | Descriptive Statistics | Stem-and-Leaf Display
• Enter range of values
• If selection contains a heading, leave selected “First cell contains a label”.
• Select Stem Unit
• Enter Title
Section 2.2
Tables And Charts For Numerical Data
Examining One Numerical Variable
The Frequency Distribution
• Data is arranged into class groupings.
• Creating class groupings
– Number of classes
• Depends on number of observations
• Typically 5 <= class groupings < 15
– Intervals should be the same width. Use the following:
• Width of interval = Range / Number of class groupings
– Avoid overlapping classes
Frequency Distribution (continued)
• Consists of the number of values that fall within each interval.
• Advantage - Data characteristics can be approximated.
• Disadvantage - Individual values are lost due to the grouping.
Ex. Given the following data:
74
74.3
74.6
78.4
79.8
80.2
81.4
82.0
84.7
86.0
89.2
Number of classes: Let's choose 5.
Width of interval = (89.2 - 74) / 5 = 15.2 / 5 = 3.04
Approx. 3
Frequency Distribution
Interval Frequency
74 - 77 3
77 - 80 2
80 - 83 3
83 - 86 1
86 - 89 1
89 - 92 1
The right boundary of each interval is not included (e.g. 86.0 is counted in 86 - 89, not in 83 - 86).
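The width calculation and the counts in the table above can be checked with a short sketch. The function name `frequency_distribution` and the choice to start the first class at 74 are assumptions for illustration.

```python
data = [74, 74.3, 74.6, 78.4, 79.8, 80.2, 81.4, 82.0, 84.7, 86.0, 89.2]

num_classes = 5
width_exact = (max(data) - min(data)) / num_classes  # (89.2 - 74) / 5, about 3.04
width = round(width_exact)                           # approx. 3

def frequency_distribution(values, start, width):
    """Count values in [low, high) classes; the right boundary is excluded."""
    counts = {}
    low = start
    while low <= max(values):
        high = low + width
        counts[(low, high)] = sum(1 for v in values if low <= v < high)
        low = high
    return counts

freq = frequency_distribution(data, 74, width)
for (low, high), count in freq.items():
    print(f"{low} - {high}: {count}")
```

Because the width was rounded down from 3.04 to 3, a sixth class (89 - 92) is needed to cover the largest value, 89.2.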
Using PHStat to create a Frequency Distribution
• PHStat | Descriptive Statistics | Frequency Distribution
• Enter the variable cell range
• Enter the bin cell range
• If you selected the heading when selecting the data, leave selected “First cell in each range contains label”.
• Leave selected “Single Group Variable”
• Enter title of your choice.
Bin (Used for PHStat only)
• Contains the values that approximate the maximum value of each class.
• For example:
– If your intervals are,
• -20.0 to -10.0
• -10.0 to 0.0
• 0.0 to 10.0
• 10.0 to 20.0
– Your bin values could be
• -10.1
• -0.1
• 9.9
• 19.9
If your data were recorded with 2 places after the
decimal, your bin values would be:
-10.01
-0.01
9.99
19.99
Example
See the file Sec2.2.xls
Relative Frequency Distribution
• First create a Frequency Distribution.
• The values in the Relative Frequency Distribution are formed by dividing the frequency of each class by the total number of values.
• The Relative Frequency Distribution contains the proportion of times a value occurs within each class.
Relative Frequency Distribution
Interval Frequency Relative Frequency
74 - 77 3 3/11 = .2727
77 - 80 2 2/11 = .1818
80 - 83 3 3/11 = .2727
83 - 86 1 1/11 = .0909
86 - 89 1 1/11 = .0909
89 - 92 1 1/11 = .0909
Total 11
Percentage Distribution
• First create a Relative Frequency Distribution
• The values in the Percentage Distribution are formed by multiplying each proportion in the Rel. Freq. Dist. by 100.
Percentage Distribution
Interval Freq. Rel. Freq. Percentage Freq.
0 - 74 0 0.00 0%
74 - 77 3 .2727 27.27%
77 - 80 2 .1818 18.18%
80 - 83 3 .2727 27.27%
83 - 86 1 .0909 9.09%
86 - 89 1 .0909 9.09%
89 - 92 1 .0909 9.09%
Total 11
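The relative frequency and percentage columns follow directly from the frequencies. A minimal sketch, with the dictionary layout and variable names chosen only for illustration:

```python
# Frequencies from the frequency distribution of the eleven example values.
freq = {
    "74 - 77": 3,
    "77 - 80": 2,
    "80 - 83": 3,
    "83 - 86": 1,
    "86 - 89": 1,
    "89 - 92": 1,
}

total = sum(freq.values())  # 11 values in all

# Relative frequency = class frequency / total number of values.
relative = {interval: count / total for interval, count in freq.items()}

# Percentage = relative frequency * 100.
percentage = {interval: 100 * rel for interval, rel in relative.items()}

for interval in freq:
    print(f"{interval:8} {freq[interval]:4} {relative[interval]:.4f} {percentage[interval]:.2f}%")
```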
Benefit of a Relative Frequency Distribution or Percentage Distribution
• Essential when comparing two sets of data consisting of a different number of values.
For example:
Study 1: 2 5 8 2 9
Study 2: 2 5 2 8 5 5 5 8 5 2 5 5
In Study 1, 5 occurs 1 time out of 5. 1/5 = 0.2, or 20% of the time.
In Study 2, 5 occurs 7 times out of 12. 7/12 = 0.583, or 58.3% of the time.
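The comparison can be sketched in a few lines. Which list carries the label Study 1 and which Study 2 is an assumption here; the function name is also hypothetical.

```python
# Assumed split of the seventeen values: five in one study, twelve in the other.
study_1 = [2, 5, 8, 2, 9]
study_2 = [2, 5, 2, 8, 5, 5, 5, 8, 5, 2, 5, 5]

def relative_frequency(values, target):
    """Proportion of the values that equal target."""
    return values.count(target) / len(values)

for name, values in (("Study 1", study_1), ("Study 2", study_2)):
    rel = relative_frequency(values, 5)
    print(f"{name}: 5 occurs {values.count(5)} of {len(values)} times = {rel:.3f} ({100 * rel:.1f}%)")
```

The raw counts (1 versus 7) are not comparable because the studies differ in size; the proportions are.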
Cumulative Percentage Distribution
• Shows how the percentage of values accumulates across the classes.
Cumulative Percentage Distribution
Interval Rel. Freq. Cumulative Dist.
0 - 74 0.00 0% = 0.0%
74 - 77 0.2727 0% = 0.0%
77 - 80 0.1818 27.27% = 27.27%
80 - 83 0.2727 27.27% + 18.18% = 45.45%
83 - 86 0.0909 27.27% + 18.18% + 27.27% = 72.72%
86 - 89 0.0909 27.27% + 18.18% + 27.27% + 9.09% = 81.81%
89 - 92 0.0909 27.27% + 18.18% + 27.27% + 9.09% + 9.09% = 90.90%
92 - 95 0.00 27.27% + 18.18% + 27.27% + 9.09% + 9.09% + 9.09% = 99.99%
Total .9999
Each cumulative value is the percentage of values below the lower boundary of its interval. The final value is 99.99% rather than 100% because the percentages are rounded.
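A cumulative percentage distribution can be computed as a running total. This sketch reports, for each lower boundary, the percentage of values below it. It works from the exact counts rather than adding rounded percentages, so some values differ from hand-added ones in the last digit and the final value is exactly 100%. The variable names are assumptions for illustration.

```python
# Lower class boundaries and the frequency of the class starting at each one.
lower_boundaries = [74, 77, 80, 83, 86, 89, 92]
frequencies = [3, 2, 3, 1, 1, 1, 0]

total = sum(frequencies)

cumulative = []
running = 0  # number of values below the current lower boundary
for lower, count in zip(lower_boundaries, frequencies):
    cumulative.append((lower, 100 * running / total))
    running += count

for lower, pct in cumulative:
    print(f"less than {lower}: {pct:.2f}%")
```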
Cumulative Percentage Distribution
• Top of Pg. 56. SOLUTION From Table 2.5 ...
• Error
Using PHStat to create a Percentage or Cumulative Percentage Distribution
• These are automatically generated when you create a Frequency Distribution.
Class Midpoint
• Point halfway between the boundaries of each class.
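For the class intervals used in this section's frequency distribution, the midpoints can be computed as the average of the two boundaries. A minimal sketch:

```python
# Class boundaries from the frequency distribution example.
intervals = [(74, 77), (77, 80), (80, 83), (83, 86), (86, 89), (89, 92)]

# Midpoint = (lower boundary + upper boundary) / 2.
midpoints = [(low + high) / 2 for low, high in intervals]

print(midpoints)
```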
Histogram
• Using a picture to demonstrate data.
• Describes the numerical data that has been grouped into a frequency, relative frequency, or percentage distribution.
• The random variable of interest is displayed along the horizontal axis (x-axis).
• The number, proportion or percentage of values per class is plotted along the vertical axis (y-axis).
Histogram
[Figure: histogram. Vertical axis: Frequency (0 to 3). Horizontal axis: intervals 0 - 74, 74 - 77, 77 - 80, 80 - 83, 83 - 86, 86 - 89, 89 - 92, 92 - 95.]
Polygon (same info as Histogram)
• Using a picture to demonstrate data.
• Describes the numerical data that has been grouped into a frequency, relative frequency, or percentage distribution.
• The random variable of interest is displayed along the horizontal axis (x-axis).
• The number, proportion or percentage of values per class is plotted along the vertical axis (y-axis).
Polygon
[Figure: polygon. Vertical axis: Frequency (0 to 3.5). Horizontal axis: intervals 0 - 74, 74 - 77, 77 - 80, 80 - 83, 83 - 86, 86 - 89, 89 - 92, 92 - 95.]
Using PHStat to create a Histogram & Polygon
• PHStat | Descriptive Statistics | Histogram & Polygons
• Enter the Variable Cell Range
• Enter the Bin Cell Range
• Enter the Midpoints Cell Range
• If the first row contains headings, leave selected “First cell in each range contains label”.
• Select “Multiple Groups - Unstacked”.
• Enter title of your choice
• Leave check boxes on default selection.
Section 2.3
• Graphing Bivariate Numerical Data
• Examining 2 numerical variables.
Scatter Diagram
• Used to demonstrate the relationship between two numerical variables.
• One numerical variable is plotted on the x-axis.
• The other numerical variable is plotted on the y-axis.
• The result is a point on the x-y plane.
Example
• Cholesterol Level: 200 176 115 100 120 199 151 100 150
• Meat Consumption in Ounces / Day: 24 21 8 3 3 30 26 6 15
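Each point of a scatter diagram is one pair of observations. The sketch below pairs the two variables position by position, which assumes the values in the two rows are listed in the same order of individuals. The variable names are chosen for illustration.

```python
cholesterol = [200, 176, 115, 100, 120, 199, 151, 100, 150]
meat_oz_per_day = [24, 21, 8, 3, 3, 30, 26, 6, 15]

# Pair the i-th cholesterol level with the i-th meat consumption value.
points = list(zip(cholesterol, meat_oz_per_day))

for x, y in points:
    print(f"cholesterol = {x:3}, meat = {y:2} oz/day")
```

Passing the two lists to any plotting tool as x and y values would then draw the diagram.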
Scatter Diagram of previous data:
[Figure: scatter diagram. Horizontal axis: Cholesterol Level (0 to 250). Vertical axis: Meat Consumption in Ounces / Day (0 to 35).]
Section 2.4
• Tables and charts for categorical data
• Covered in CSC 199
– Read
Section 2.5
• Tabulating and Graphing Bivariate Categorical Data
• Use a Contingency Table or a Side-By-Side Chart.
Contingency Table
• Also called, “Cross-Classification Table”
• Used to study the values from two categorical variables.
Example:
A sample of 14 graduates was taken and each individual was asked:
1. What was your major?
2. What is your salary level?
<= $30,000
$30,000 - $50,000
>= $50,000
Degree Salary Level
English >=$50,000
Math $30,000 - $50,000
Math <= $30,000
English $30,000 - $50,000
English <= $30,000
Philosophy $30,000 - $50,000
Philosophy <= $30,000
English >=$50,000
Philosophy <= $30,000
Math >=$50,000
Math $30,000 - $50,000
Math >=$50,000
Math >=$50,000
English $30,000 - $50,000
A count of the number of degrees within each salary range.
Degree <= $30,000 $30,000 - $50,000 >= $50,000 Total
English 1 2 2 5
Math 1 2 3 6
Philosophy 2 1 0 3
Grand Total 4 5 5 14
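The counts in the contingency table can be tallied from the raw responses. A minimal sketch using `collections.Counter`; the variable names are assumptions for illustration.

```python
from collections import Counter

# The fourteen (degree, salary level) responses listed above.
responses = [
    ("English", ">= $50,000"), ("Math", "$30,000 - $50,000"),
    ("Math", "<= $30,000"), ("English", "$30,000 - $50,000"),
    ("English", "<= $30,000"), ("Philosophy", "$30,000 - $50,000"),
    ("Philosophy", "<= $30,000"), ("English", ">= $50,000"),
    ("Philosophy", "<= $30,000"), ("Math", ">= $50,000"),
    ("Math", "$30,000 - $50,000"), ("Math", ">= $50,000"),
    ("Math", ">= $50,000"), ("English", "$30,000 - $50,000"),
]

degrees = ["English", "Math", "Philosophy"]
levels = ["<= $30,000", "$30,000 - $50,000", ">= $50,000"]

counts = Counter(responses)  # missing (degree, level) pairs count as 0

# One row per degree, one column per salary level.
table = {d: [counts[(d, lvl)] for lvl in levels] for d in degrees}
row_totals = {d: sum(row) for d, row in table.items()}
col_totals = [sum(col) for col in zip(*table.values())]

for d in degrees:
    print(d, table[d], row_totals[d])
print("Grand Total", col_totals, sum(col_totals))
```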
Percentages based on overall total
Degree <= $30,000 $30,000 - $50,000 >= $50,000 Total
English 7.14% 14.29% 14.29% 35.71%
Math 7.14% 14.29% 21.43% 42.86%
Philosophy 14.29% 7.14% 0.0% 21.43%
Total 28.57% 35.71% 35.71% 100.00%
Each value is divided by the total (14)
28.57 % of all polled make $30,000 or under.
42.86 % of all polled majored in math.
21.43 % of all polled majored in math and make $50,000 or more.
Percentages based on row total
Degree <= $30,000 $30,000 - $50,000 >= $50,000 Total
English 20.00 % 40.00 % 40.00 % 100.00 %
Math 16.67 % 33.33 % 50.00 % 100.00 %
Philosophy 66.67 % 33.33 % 0.0 % 100.00 %
Total 28.57 % 35.71 % 35.71 % 100.00 %
Each value is divided by the total of its row.
Of those who majored in math, 50.00 % make $50,000 or more.
Of those who majored in philosophy, 66.67 % make $30,000 or less.
Percentages based on column total
Degree <= $30,000 $30,000 - $50,000 >= $50,000 Total
English 25.00 % 40.00 % 40.00 % 35.71 %
Math 25.00 % 40.00 % 60.00 % 42.86 %
Philosophy 50.00 % 20.00 % 0.0 % 21.43 %
Total 100.00 % 100.00 % 100.00 % 100.00 %
Each value is divided by the total of its column
Of those who make $30,000 or less, 50.00 % majored in philosophy.
Of those who make between $30,000 and $50,000, 20.00 % majored in philosophy.
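The three kinds of percentages differ only in the divisor: the grand total, the row total, or the column total. A minimal sketch starting from the table of counts, with names chosen for illustration:

```python
# Counts per degree, in column order: <= $30,000, $30,000 - $50,000, >= $50,000.
counts = {
    "English": [1, 2, 2],
    "Math": [1, 2, 3],
    "Philosophy": [2, 1, 0],
}

grand_total = sum(sum(row) for row in counts.values())  # 14
col_totals = [sum(col) for col in zip(*counts.values())]  # [4, 5, 5]

# Divide each count by the overall total.
overall = {d: [100 * c / grand_total for c in row] for d, row in counts.items()}

# Divide each count by the total of its row.
by_row = {d: [100 * c / sum(row) for c in row] for d, row in counts.items()}

# Divide each count by the total of its column.
by_col = {d: [100 * c / t for c, t in zip(row, col_totals)] for d, row in counts.items()}

for name, result in (("overall", overall), ("row", by_row), ("column", by_col)):
    print(name)
    for degree, row in result.items():
        print(" ", degree, [f"{p:.2f}%" for p in row])
```

Which divisor to use depends on the question: row percentages describe salary levels within a major, while column percentages describe majors within a salary level.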
Side-By-Side Chart
• Visual display of bivariate categorical data.
• Used to detect relationships in the data.
Consider the following data:
NC SC NE IL
Percentage of Pop. that is literate 93 89 99 98
Percent of crime-related deaths 10 15 4 5
Side-By-Side Chart of the previous data
[Figure: side-by-side bar chart of Crime Rate and Literacy Rate for NC, SC, NE and IL. Value axis: 0 to 150.]
See the following:
• Excel Handbook for Chapter 2
• Pg. 93 - 104