Statistics 120 Good and Bad Graphsihaka/120/Lectures/lecture03.pdfStatistics 120 Good and Bad Graphs...

32
First Prev Next Last Go Back Full Screen Close Quit Statistics 120 Good and Bad Graphs

Transcript of Statistics 120 Good and Bad Graphsihaka/120/Lectures/lecture03.pdfStatistics 120 Good and Bad Graphs...

Page 1: Statistics 120 Good and Bad Graphsihaka/120/Lectures/lecture03.pdfStatistics 120 Good and Bad Graphs •First •Prev •Next •Last •Go Back •Full Screen •Close •Quit The

•First •Prev •Next •Last •Go Back •Full Screen •Close •Quit

Statistics 120Good and Bad Graphs

Page 2: Statistics 120 Good and Bad Graphsihaka/120/Lectures/lecture03.pdfStatistics 120 Good and Bad Graphs •First •Prev •Next •Last •Go Back •Full Screen •Close •Quit The

•First •Prev •Next •Last •Go Back •Full Screen •Close •Quit

The Plan

• In this lecture we will try to set down some basic rulesfor drawing good graphs.

• We will do this by showing that violating the rulesproduces bad graphs.

• Later in the course we see that there is a solidperceptual basis for some of these rules.

Page 3: Statistics 120 Good and Bad Graphsihaka/120/Lectures/lecture03.pdfStatistics 120 Good and Bad Graphs •First •Prev •Next •Last •Go Back •Full Screen •Close •Quit The

•First •Prev •Next •Last •Go Back •Full Screen •Close •Quit

Data Content

• It does not make sense to use graphs to display verysmall amounts of data.

• The human brain is quite capable of grasping one two,or even three values.

Page 4: Statistics 120 Good and Bad Graphsihaka/120/Lectures/lecture03.pdfStatistics 120 Good and Bad Graphs •First •Prev •Next •Last •Go Back •Full Screen •Close •Quit The

•First •Prev •Next •Last •Go Back •Full Screen •Close •Quit

Page 5: Statistics 120 Good and Bad Graphsihaka/120/Lectures/lecture03.pdfStatistics 120 Good and Bad Graphs •First •Prev •Next •Last •Go Back •Full Screen •Close •Quit The

•First •Prev •Next •Last •Go Back •Full Screen •Close •Quit

Male Female

Gender

Per

cent

age

0

10

20

30

40

50

60

Class Gender Breakdown

Page 6: Statistics 120 Good and Bad Graphsihaka/120/Lectures/lecture03.pdfStatistics 120 Good and Bad Graphs •First •Prev •Next •Last •Go Back •Full Screen •Close •Quit The

•First •Prev •Next •Last •Go Back •Full Screen •Close •Quit

Data Relevance

• Graphs are only as good as the data they display.

• No amount of creativity can produce a good graph fromdubious data.

Page 7: Statistics 120 Good and Bad Graphsihaka/120/Lectures/lecture03.pdfStatistics 120 Good and Bad Graphs •First •Prev •Next •Last •Go Back •Full Screen •Close •Quit The

•First •Prev •Next •Last •Go Back •Full Screen •Close •Quit

Page 8: Statistics 120 Good and Bad Graphsihaka/120/Lectures/lecture03.pdfStatistics 120 Good and Bad Graphs •First •Prev •Next •Last •Go Back •Full Screen •Close •Quit The

•First •Prev •Next •Last •Go Back •Full Screen •Close •Quit

Auckland City Council:City Scene

Page 9: Statistics 120 Good and Bad Graphsihaka/120/Lectures/lecture03.pdfStatistics 120 Good and Bad Graphs •First •Prev •Next •Last •Go Back •Full Screen •Close •Quit The

•First •Prev •Next •Last •Go Back •Full Screen •Close •Quit

Complexity

• Graphs should be no more complex than the data whichthey portray.

• Unnecessary complexity can be introduced by

– irrelevant decoration

– colour

– 3d effects

• These are collectively known as “chartjunk.”

Page 10: Statistics 120 Good and Bad Graphsihaka/120/Lectures/lecture03.pdfStatistics 120 Good and Bad Graphs •First •Prev •Next •Last •Go Back •Full Screen •Close •Quit The

•First •Prev •Next •Last •Go Back •Full Screen •Close •Quit

Age Structure of CollegeEnrolment (1972-1976)

This graph presents five values. Ituses six colours, unnecessaryperspective and a split axis to do so.

American Education Magazine.

Page 11: Statistics 120 Good and Bad Graphsihaka/120/Lectures/lecture03.pdfStatistics 120 Good and Bad Graphs •First •Prev •Next •Last •Go Back •Full Screen •Close •Quit The

•First •Prev •Next •Last •Go Back •Full Screen •Close •Quit

1972 1973 1974 1975 1976

28

29

30

31

32

33

Year

Per

cent

Age Structure of College EnrolmentPercent of Total Enrolment, Aged 25 and Over

Page 12: Statistics 120 Good and Bad Graphsihaka/120/Lectures/lecture03.pdfStatistics 120 Good and Bad Graphs •First •Prev •Next •Last •Go Back •Full Screen •Close •Quit The

•First •Prev •Next •Last •Go Back •Full Screen •Close •Quit

1972 1973 1974 1975 1976

Year

Per

cent

0

5

10

15

20

25

30

35

Age Structure of College EnrolmentPercent of Total Enrolment, Aged 25 and Over

Page 13: Statistics 120 Good and Bad Graphsihaka/120/Lectures/lecture03.pdfStatistics 120 Good and Bad Graphs •First •Prev •Next •Last •Go Back •Full Screen •Close •Quit The

•First •Prev •Next •Last •Go Back •Full Screen •Close •Quit

Share Results

The extra dimension used in thisgraph has confused even the personwho created it.

The Washington Post, 1979.

Page 14: Statistics 120 Good and Bad Graphsihaka/120/Lectures/lecture03.pdfStatistics 120 Good and Bad Graphs •First •Prev •Next •Last •Go Back •Full Screen •Close •Quit The

•First •Prev •Next •Last •Go Back •Full Screen •Close •Quit

1972 1973 1974 1975 1976 1977

Earnings Per Share and Dividends

Dol

lars

1.021.08

1.16 1.16

1.281.34

1.53

1.711.63

1.721.82

1.7

EarningsDividends

Page 15: Statistics 120 Good and Bad Graphsihaka/120/Lectures/lecture03.pdfStatistics 120 Good and Bad Graphs •First •Prev •Next •Last •Go Back •Full Screen •Close •Quit The

•First •Prev •Next •Last •Go Back •Full Screen •Close •Quit

Distortion

• Graphs should not provide a distorted picture of thevalues they portray.

• Distortion can be either deliberate or accidental.

• (Of course, it can be useful to know how to produce agraph which bends the truth.)

Page 16: Statistics 120 Good and Bad Graphsihaka/120/Lectures/lecture03.pdfStatistics 120 Good and Bad Graphs •First •Prev •Next •Last •Go Back •Full Screen •Close •Quit The

•First •Prev •Next •Last •Go Back •Full Screen •Close •Quit

Page 17: Statistics 120 Good and Bad Graphsihaka/120/Lectures/lecture03.pdfStatistics 120 Good and Bad Graphs •First •Prev •Next •Last •Go Back •Full Screen •Close •Quit The

•First •Prev •Next •Last •Go Back •Full Screen •Close •Quit

Theology

Fine Arts

Music

Medicine

Architecture

Education

Engineering

Law

Science

Commerce

Arts

0 2000 4000 6000 8000

Faculty Size

Page 18: Statistics 120 Good and Bad Graphsihaka/120/Lectures/lecture03.pdfStatistics 120 Good and Bad Graphs •First •Prev •Next •Last •Go Back •Full Screen •Close •Quit The

•First •Prev •Next •Last •Go Back •Full Screen •Close •Quit

Engineering

Theology

Architecture

Commerce

Science

Medicine

Law

Fine Arts

Arts

Music

Education

0 20 40 60 80 100

Percentage of Female Students

Page 19: Statistics 120 Good and Bad Graphsihaka/120/Lectures/lecture03.pdfStatistics 120 Good and Bad Graphs •First •Prev •Next •Last •Go Back •Full Screen •Close •Quit The

•First •Prev •Next •Last •Go Back •Full Screen •Close •Quit

The “Lie Factor”

• Ed Tufte of Yale University has defined a “lie factor” asa measure of the amount of distortion in a graph. (Don’ttake this too seriously – i.e don’t learn it for the exam).

• The lie factor is defined to be:

Lie Factor=size of effect shown in graphic

size of effect shown in data

• If the lie factor of a graph is greater than 1, the graph isexaggerating the size of the effect.

Page 20: Statistics 120 Good and Bad Graphsihaka/120/Lectures/lecture03.pdfStatistics 120 Good and Bad Graphs •First •Prev •Next •Last •Go Back •Full Screen •Close •Quit The

•First •Prev •Next •Last •Go Back •Full Screen •Close •Quit

Data Effect=27.5−18

18= 0.53, Graph Effect=

5.3− .6.6

= 7.83,

Lie Factor= 14.8

Page 21: Statistics 120 Good and Bad Graphsihaka/120/Lectures/lecture03.pdfStatistics 120 Good and Bad Graphs •First •Prev •Next •Last •Go Back •Full Screen •Close •Quit The

•First •Prev •Next •Last •Go Back •Full Screen •Close •Quit

1978 1979 1980 1981 1982 1983 1984 1985

Required Fuel Economy Standards:New Cars Built from 1978 to 1985

year

MP

G

0

5

10

15

20

25

30

Page 22: Statistics 120 Good and Bad Graphsihaka/120/Lectures/lecture03.pdfStatistics 120 Good and Bad Graphs •First •Prev •Next •Last •Go Back •Full Screen •Close •Quit The

•First •Prev •Next •Last •Go Back •Full Screen •Close •Quit

Common Sources of Distortion

• The use of 3 dimensional “effects” is a common sourceof distortions in graphs.

• Another common source is the inappropriate use oflinear scaling when using area or volume to representvalues.

Page 23: Statistics 120 Good and Bad Graphsihaka/120/Lectures/lecture03.pdfStatistics 120 Good and Bad Graphs •First •Prev •Next •Last •Go Back •Full Screen •Close •Quit The

•First •Prev •Next •Last •Go Back •Full Screen •Close •Quit

Lie Factor = 9.4

Page 24: Statistics 120 Good and Bad Graphsihaka/120/Lectures/lecture03.pdfStatistics 120 Good and Bad Graphs •First •Prev •Next •Last •Go Back •Full Screen •Close •Quit The

•First •Prev •Next •Last •Go Back •Full Screen •Close •Quit

Is the bottom dollar note roughly half the size of the top one?

Page 25: Statistics 120 Good and Bad Graphsihaka/120/Lectures/lecture03.pdfStatistics 120 Good and Bad Graphs •First •Prev •Next •Last •Go Back •Full Screen •Close •Quit The

•First •Prev •Next •Last •Go Back •Full Screen •Close •Quit

0.0

0.2

0.4

0.6

0.8

1.0$1.00

94c

83c

64c

44c

1958 1963 1968 1973 1978Eisenhower Kennedy Johnson Nixon Carter

Purchasing Power of the Diminishing Dollar

Page 26: Statistics 120 Good and Bad Graphsihaka/120/Lectures/lecture03.pdfStatistics 120 Good and Bad Graphs •First •Prev •Next •Last •Go Back •Full Screen •Close •Quit The

•First •Prev •Next •Last •Go Back •Full Screen •Close •Quit

Deliberate Distortion

• Sometimes graphs contain deliberate distortions.

• Usually these are an attempt to hide some feature of thedata.

Page 27: Statistics 120 Good and Bad Graphsihaka/120/Lectures/lecture03.pdfStatistics 120 Good and Bad Graphs •First •Prev •Next •Last •Go Back •Full Screen •Close •Quit The

•First •Prev •Next •Last •Go Back •Full Screen •Close •Quit

Page 28: Statistics 120 Good and Bad Graphsihaka/120/Lectures/lecture03.pdfStatistics 120 Good and Bad Graphs •First •Prev •Next •Last •Go Back •Full Screen •Close •Quit The

•First •Prev •Next •Last •Go Back •Full Screen •Close •Quit

Year

●● ●

● ●●

●● ● ●

● ●

1939 1947 1951 1955 1963 1965 1967 1970 1972 1973 1974 1975 1976

0

10,000

20,000

30,000

40,000

50,000

60,000

Median Net Incomes

Doctors

Other Professionals

Page 29: Statistics 120 Good and Bad Graphsihaka/120/Lectures/lecture03.pdfStatistics 120 Good and Bad Graphs •First •Prev •Next •Last •Go Back •Full Screen •Close •Quit The

•First •Prev •Next •Last •Go Back •Full Screen •Close •Quit

Year

●● ●

● ●●

●● ● ●

● ●

1940 1945 1950 1955 1960 1965 1970 1975

0

10,000

20,000

30,000

40,000

50,000

60,000

Median Net Incomes

Doctors

Other Professionals

Page 30: Statistics 120 Good and Bad Graphsihaka/120/Lectures/lecture03.pdfStatistics 120 Good and Bad Graphs •First •Prev •Next •Last •Go Back •Full Screen •Close •Quit The

•First •Prev •Next •Last •Go Back •Full Screen •Close •Quit

Errors

• Sometimes distortions occur because of errors.

• This can be because the artist tried to be clever andsometimes because they don’t understand statistics.

Page 31: Statistics 120 Good and Bad Graphsihaka/120/Lectures/lecture03.pdfStatistics 120 Good and Bad Graphs •First •Prev •Next •Last •Go Back •Full Screen •Close •Quit The

•First •Prev •Next •Last •Go Back •Full Screen •Close •Quit

Page 32: Statistics 120 Good and Bad Graphsihaka/120/Lectures/lecture03.pdfStatistics 120 Good and Bad Graphs •First •Prev •Next •Last •Go Back •Full Screen •Close •Quit The

•First •Prev •Next •Last •Go Back •Full Screen •Close •Quit

Drawing Good Graphs

• If the “story” is simple, keep it simple.

• If the “story” is complex, make it look simple.

• Tell the truth – don’t distort the data.