Principles of Graphical Excellence Best Paper: ALAIR April 5–6, 2001 AIR: June 2-5, 2002, Toronto...
-
Upload
kevin-gilbert -
Category
Documents
-
view
212 -
download
0
Transcript of Principles of Graphical Excellence Best Paper: ALAIR April 5–6, 2001 AIR: June 2-5, 2002, Toronto...
Principles of Graphical Excellence
Best Paper: ALAIR April 5–6, 2001
AIR: June 2-5, 2002, Toronto
Focus-IR, February 21, 2003
Anna T. Waggener, Ph.D.Institutional Assessment
United States Army War College
The Visual Display of Quantitative Information
Leading authority: Edward R. Tufte
History of Graphical Development
• First geographic maps were drawn on clay tablets.
• 17th Century: combined map skills and statistical skills to construct maps.
• Trade winds and monsoons on a world map.• Chart patterns of disease.• Later sophistication showed distribution of
1.3 million galaxies.
“Graphical excellence consists of the efficient
communication of complex quantitative ideas.”
• Organizing Numerical Data:• The Ordered Array and Stem-leaf Display
• Tabulating and Graphing Numerical Data:• Frequency Distributions: Tables, Histograms, Polygons
• Cumulative Distributions: Tables, the Ogive
Presentation Topics
• Tabulating and Graphing Univariate Categorical Data:
• The Summary Table• Bar and Pie Charts, the Pareto Diagram
•Tabulating and Graphing Bivariate Categorical Data:
• Contingency Tables• Side by Side Bar charts
• Graphical Excellence and Common Errors in Presenting Data
Presentation Topics (continued)
“At their best, graphics are instruments for reasoning
about quantitative information.”
2 144677
3 028
4 1
Organizing Numerical Data
Numerical Data
Ordered Array
Stem and LeafDisplay
Frequency DistributionsCumulative Distributions
Histograms
Polygons
Ogive
Tables
41, 24, 32, 26, 27, 27, 30, 24, 38, 21
21, 24, 24, 26, 27, 27, 30, 32, 38, 41
2 1 4 4 6 7 7
Organizing Numerical Data:
•Data in RawRaw form (as collected):24, 26, 24, 21, 27, 27, 30, 41, 32, 38
•Date OrderedOrdered from Smallest to LargestSmallest to Largest: 21, 24, 24, 26, 27, 27, 30, 32, 38, 41
•Stem and Leaf display:3 0 2 8
4 1
“Design is choice.”
O g ive
0
20
40
60
80
100
120
10 20 30 40 50 60
0
1
2
3
4
5
6
7
10 20 30 40 50 60
2 144677
3 028
4 1
Tabulating and Graphing Numerical Data
Numerical Data
Ordered Array
Stem and LeafDisplay
Histograms Ogive
Tables
41, 24, 32, 26, 27, 27, 30, 24, 38, 21
21, 24, 24, 26, 27, 27, 30, 32, 38, 41
Frequency DistributionsCumulative Distributions
Polygons
Tabulating Numerical Data: Frequency Distributions
Data in ordered array:12, 13, 17, 21, 24, 24, 26, 27, 27, 30, 32, 35, 37, 38, 41, 43, 44, 46, 53, 58
Class Frequency
10 but under 20 3 .15 15
20 but under 30 6 .30 30
30 but under 40 5 .25 25
40 but under 50 4 .20 20
50 but under 60 2 .10 10
Total 20 1 100
RelativeFrequency
Percentage
(continued)
His togram
0
3
6
54
2
00
1
2
3
4
5
67
5 15 25 36 45 55 More
Fre
qu
en
cy
Graphing Numerical Data:The Histogram
Data in ordered array:12, 13, 17, 21, 24, 24, 26, 27, 27, 30, 32, 35, 37, 38, 41, 43, 44, 46, 53, 58
Class Midpoints
No Gaps Between
Bars
Graphing Numerical Data:The Frequency Polygon
Frequency
0
1
2
3
4
5
6
7
5 15 25 36 45 55 More
Data in ordered array:12, 13, 17, 21, 24, 24, 26, 27, 27, 30, 32, 35, 37, 38, 41, 43, 44, 46, 53, 58
Class Midpoints
Cumulative CumulativeClass Frequency % Frequency
10 but under 20 3 15
20 but under 30 9 45
30 but under 40 14 70
40 but under 50 18 90
50 but under 60 20 100
Tabulating Numerical Data:Cumulative Frequency
Data in ordered array:12, 13, 17, 21, 24, 24, 26, 27, 27, 30, 32, 35, 37, 38, 41, 43, 44, 46, 53, 58
Graphing Numerical Data:The Ogive (Cumulative % Polygon)
Data in ordered array:12, 13, 17, 21, 24, 24, 26, 27, 27, 30, 32, 35, 37, 38, 41, 43, 44, 46, 53, 58
O give
0
20
40
60
80
100
120
10 20 30 40 50 60
Tabulating and Graphing Categorical Data: Univariate Data
Categorical Data
Tabulating Data
The Summary Table
Graphing Data
Pie Charts
Pareto DiagramBar Charts
Summary Table(University Revenues)
Revenue Category Amount Percentage (in thousands $)
Patient Services 46.5 42.27
Tuition/fees 32 29.09
Appropriations 15.5 14.09
Grants/Contracts 16 14.55
Total 110 100
Variables are Categorical.
0
5
1 0
1 5
2 0
2 5
3 0
3 5
4 0
4 5
S to c k s B o n d s S a vin g s C D
0
2 0
4 0
6 0
8 0
1 0 0
1 2 0
0 1 0 2 0 3 0 4 0 5 0
S to c k s
B o n d s
S a vin g s
C D
Graphing Categorical Data: Univariate Data
Categorical Data
Tabulating Data
The Summary Table
Graphing Data
Pie Charts
Pareto DiagramBar Charts
Bar ChartEnrollment Summary
0 500 1000 1500 2000 2500 3000
Freshmen
Sophomores
Juniors
Seniors
Unclass
Grad
1st Prof
Pie Chart(for a factbook)
Percentages are rounded to the nearest percent.
Students by Classification
Seniors 15%
Sophomores 14%
Juniors
29%
Freshmen42%
Pareto Diagram
Axis for line graph shows
cumulative %
Axis for bar
chart shows % in each
category
0
510
15
2025
30
3540
45
1 2 3 4
0
20
40
60
80
100
120
Tabulating and Graphing Bivariate Categorical Data
•Contingency Tables
•Side by Side Charts
Tabulating Categorical Data: Bivariate Data
Contingency Table: Enrollment by College
Enrollment A&S BUS NRS Total Category
Freshmen 46 55 27 128
Sophomores 32 44 19 95
Juniors 15 20 13 48
Seniors 16 28 7 51
Total 109 147 66 322
Graphing Categorical Data: Bivariate Data
Side by SideChart
0 200 400 600 800 1000 1200
Freshmen
Sophomores
Juniors
Seniors
Unclass
Grad
1st Prof
A&S AH NS
Principles of Graphical Excellence
• Well designed presentation of data that provides:
Substance Statistics Design
• Communicates complex ideas with clarity, precision and efficiency
• Gives the largest number of ideas in the most efficient manner
• Almost always involves several dimensions• Requires telling the truth about the data
Data-Ink Ratio
Data information
Total ink used to print the graphic
“Much of twentieth-century thinking about statistical graphics
has been preoccupied with the question of how some amateurish chart might fool a naive viewer.”
• Using ‘chart junk’
• No relative basis In comparing data Batches
• Compressing the Vertical axis
• No zero point on the
Vertical axis
Errors in Presenting Data
‘Chart Junk’
Good Presentation
1960: $1.00
1970: $1.60
1980: $3.10
1990: $3.80
Minimum Wage Minimum Wage
0
2
4
1960 1970 1980 1990
$
Bad Presentation
Lie Factor
Size of effect shown in graphic
Size of effect in data
No Relative Basis
Good PresentationA’s received by
students.A’s received by
students.
Bad Presentation
0
200
300
FR SO JR SR
Freq.
10%
30%
FR SO JR SR
%
FR = Freshmen, SO = Sophomore, JR = Junior, SR = Senior
Compressing Vertical Axis
Good Presentation
Quarterly Income Quarterly Income
Bad Presentation
0
25
50
Q1 Q2 Q3 Q4
$
0
100
200
Q1 Q2 Q3 Q4
$
No Zero Point on Vertical Axis
Good PresentationMonthly Expenses Monthly Expenses
Bad Presentation
0
39
42
45
J F M A M J
$
36
39
42
45
J F M A M J
$
Graphing the first six months of sales.
36
No Zero Point on Vertical Axis
Good Presentation
Monthly Expenses Monthly Expenses
Bad Presentation
0
20
40
60
J F M A M J
$
36
39
42
45
J F M A M J
$
Graphing the first six months of sales.
Main defense of the lying graphic....
“Well, at least it was approximately correct, we were just trying to show the general
direction of change.”
Presentation Summary
• Organized Numerical Data:• The Ordered Array and Stem-leaf Display
• Tabulated and Graphed Numerical Data• Frequency Distributions: Tables, Histograms, Polygons
• Cumulative Distributions: Tables, the Ogive
Presentation Summary (continued)
• Tabulated and Graphed Univariate Categorical Data: • The Summary Table• Bar and Pie Charts, the Pareto diagram
•Tabulated and Graphed Bivariate Categorical Data: • Contingency Tables• Side by Side charts
• Discussed Graphical Excellence and Common Errors in Presenting Data
“There remain, however, many other consideration in the design
of statistical graphics – not only of efficiency, but also of complexity,
structure, density, and even beauty.”
“Without data, it is anyone’s opinion.”
Author unknown