Telling Stories With Data: Class 1

Post on 05-Apr-2017


Telling Stories With Data
Class #1
March 20th, 2017

David Newbury — @workergnome 1

What We're Doing Today:

— Syllabus Review

— (Brief) History of Data Visualization

— (Tiny) Theory of Visualization

— (Nerdy) Overview of Concepts

— (Fake) Data Exploration

Course Website: datastories.davidnewbury.com

Which is biggest?

15012, 8271, 30193, 1189, 9913, 16000, 92481, 49801, 100407, 2910, 3809, 8018, 61528, 18083, 38691, 1800


Why do we visualize?

(Brief) History of Data Visualization

Tabula Peutingeriana, 5th century CE

René Descartes, 1600s

Joseph Priestley, A New Chart of History (1769)

William Playfair (1786 & 1801)

John Snow, London Cholera Map (1854)

Cholera Map

Florence Nightingale, War Deaths (1855)

Charles Minard, March on Moscow (1869)

More recent history.

New York Times

(Tiny) Theory of Visualization

Dataviz is constructed reality.

You are telling a story, not (just) stating facts.

data art

as opposed to

data visualization

as opposed to

statistical graphics

Statistical Graphics

How do I create Statistical Graphs in SAS 9.1.3 without Proc Gplot? UCLA: Statistical Consulting Group. http://www.ats.ucla.edu/stat/sas/notes2/

Data Art

Dear Data, Giorgia Lupi & Stefanie Posavec. http://www.dear-data.com

Two Uses

1.) help people grasp things outside their reach

2.) tell stories

explanatory visualization work

as opposed to

exploratory visualizations

Dataviz is constructed reality.

Do you care how true your story is?

Do you care how accurate your story is?

Are you trying to teach, entertain, or convince?

(Nerdy) Overview of Concepts

What can you visualize?

Potential Subjects.

subways, sheep, the solar system, shoes, sleep, skyline, snow, supermarket, sausages, school, the sea, spiders, staircases, syrup, soap, sawmills, stereos...

...and other things that begin with S.

What are you interested in?

I'm interested in subways.

Data Visualization starts with...

A Question.

What question about your subject are you interested in?

— Are subways more efficient than owning a car?

— How often do I ride the subway in a year?

— Which locations have the best access to subways?

— What's the average subway commute in Pittsburgh?

Dimension and Scope are about choosing what to focus on.

Dimension

Which bits of information about a subject are you going to focus on?

Possible Dimensions

number of cars
duration of ride
date of a ride
different lines
number of stops
cost per ride
number of stops per day
time between stops
cleanliness

Scope

Out of the infinite ways to look at your subject, how are you going to choose one?

Possible Scopes

All trains in a day
All the rides that I've been on this year
My train this morning
All of the stops in the city
Each line
Every train stop in the past 50 years

(Fake) Data Exploration

Choose one.

subways, sheep, the solar system, shoes, sleep, skyline, snow, supermarket, sausages, school, the sea, spiders, staircases, syrup, soap, sawmills, stereos...

...and other things that begin with S.

TRY IT.

1. Write down your subject
2. Write down your question
3. Write down as many dimensions as you can
4. Write down possible scopes for your data

What does your data look like?

Types of Data

Dates
Numbers
Geo Coordinates
Strings
Categories

Types of Data

number of cars - Numeric
duration of ride - Numeric
date of a ride - Date
different lines - Category
number of stops - Numeric
cost per ride - Category
number of stops per day - Numeric
time between stops - Numeric
cleanliness - String

Two (related) ideas:

Categories & Measures

Categories are Discrete Things

Measures are for Counting

number of cars - Measure
duration of ride - Measure
date of a ride - Measure
different lines - Category
number of stops - Measure
cost per ride - Category
number of stops per day - Measure
time between stops - Measure
cleanliness - Category

A hidden dimension:

David, Daniel, Dawn, Danique


A hidden dimension:

David (1), Daniel (2), Dawn (3), Danique (4)

Position of the item in the group.
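One way to make this hidden dimension explicit is to number the items as you read them. The snippet below is only a sketch in Python: the names come from the slide, the 1-based numbering follows the slide, and the variable names are made up for illustration.

```python
# Names in the order they appear on the slide.
names = ["David", "Daniel", "Dawn", "Danique"]

# enumerate() exposes the "hidden" dimension: each item's position in the group.
# start=1 matches the 1-based numbering shown on the slide.
positions = {name: index for index, name in enumerate(names, start=1)}

for name, index in positions.items():
    print(f"{name} ({index})")
```

Once the position is a value of its own, it can be treated like any other dimension.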

TRY IT.

1. Choose a scope for your data.

2. Identify which dimensions are relevant.

3. Is each dimension a category or a measure?

Now What?

We need to map our data

from a domain to a range.

Domain

number of cars - 1...8
duration of ride - 30 sec...2 hours
date of a ride - - 24ft...200ft
different lines - Red Line, Blue Line, Green Line, Silver Line, Yellow Line
number of stops - 2...20
cost per ride - $2.50, $1.75, $3.00, $0.00
number of stops per day - ??...???
time between stops - 30 sec...5 minutes
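A domain can usually be read straight off the data. The Python sketch below assumes a simple convention that is not from the slides: a measure's domain is its (min, max) pair, and a category's domain is its list of distinct values. The function name `domain_of` and the sample values are hypothetical.

```python
def domain_of(values, kind):
    """Return the domain of one dimension.

    kind is "measure" or "category" (labels chosen for this sketch).
    """
    if kind == "measure":
        # Measures span a continuous interval: smallest to largest value.
        return (min(values), max(values))
    if kind == "category":
        # Categories are discrete: keep each distinct value once,
        # in the order it was first seen.
        return list(dict.fromkeys(values))
    raise ValueError(f"unknown kind: {kind!r}")


# Hypothetical sample data for two of the dimensions listed above.
cars_per_train = [4, 8, 1, 6, 2]
lines = ["Red", "Blue", "Red", "Green", "Blue"]

cars_domain = domain_of(cars_per_train, "measure")
lines_domain = domain_of(lines, "category")
print(cars_domain, lines_domain)
```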

Range

Domain is the possible input values

Range is the possible output values
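The simplest mapping from a domain to a range is a linear scale. The following is a minimal Python sketch, not the method of any particular charting library; the name `linear_scale`, the minute units, and the 0 to 400 pixel range are assumptions made for illustration.

```python
def linear_scale(domain, range_):
    """Build a function that maps a value in `domain` onto `range_` linearly."""
    d0, d1 = domain
    r0, r1 = range_
    if d0 == d1:
        raise ValueError("domain must span more than a single value")

    def scale(value):
        # Multiply before dividing so whole-number inputs stay exact.
        return r0 + (value - d0) * (r1 - r0) / (d1 - d0)

    return scale


# Assumed example: ride durations of 0 to 120 minutes drawn across 0 to 400 pixels.
minutes_to_px = linear_scale((0, 120), (0, 400))

print(minutes_to_px(60))
```

A one-hour ride sits halfway through the domain, so it lands halfway across the range.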

Data
3, 7, 10, 6, 2
Position of the item in the group.

Domain
[0-10]
[1-5]

Range
X: 400px
Y: 800px

Mapping
X: item position
Y: numeric value
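The mapping on this slide can be worked through by hand or in a few lines of code. The Python sketch below uses the slide's data, domains, and ranges. The helper name `to_range` is hypothetical, and the sketch assumes the origin is at pixel 0 on both axes (many screen coordinate systems flip Y, which is ignored here).

```python
data = [3, 7, 10, 6, 2]

WIDTH_PX = 400    # X range from the slide
HEIGHT_PX = 800   # Y range from the slide


def to_range(value, domain, range_):
    """Linearly map `value` from `domain` onto `range_`."""
    d0, d1 = domain
    r0, r1 = range_
    return r0 + (value - d0) * (r1 - r0) / (d1 - d0)


# X comes from the item's position (domain 1 to 5),
# Y comes from the numeric value (domain 0 to 10).
points = [
    (
        to_range(position, (1, 5), (0, WIDTH_PX)),
        to_range(value, (0, 10), (0, HEIGHT_PX)),
    )
    for position, value in enumerate(data, start=1)
]

for x, y in points:
    print(x, y)
```

The third item has the largest value, so it reaches the full 800px of the Y range.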

Data
3, 7, 10, 6, 2
Position of the item in the group.

Area

Data
3, 7, 10, 6, 2
Position of the item in the group.

Color

Data

val1: 3, 7, 10, 6, 2
val2: 5, 8, 1, 8, 3
val3: Cat, Dog, Cat, Cat, Dog
Position of the item in the group.

Mapping

X: item position
Y: val1
Size: val2
Color: val3
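With several dimensions, each one is assigned to its own visual channel. The Python sketch below builds one "mark" per item from the slide's three value lists. The color palette is an assumption, and the values are left unscaled to keep the focus on which dimension feeds which channel.

```python
val1 = [3, 7, 10, 6, 2]
val2 = [5, 8, 1, 8, 3]
val3 = ["Cat", "Dog", "Cat", "Cat", "Dog"]

# Assumed palette: a category maps to a discrete color, not a number.
COLORS = {"Cat": "orange", "Dog": "steelblue"}

# One mark per item: X from position, Y from val1, size from val2, color from val3.
marks = [
    {"x": position, "y": v1, "size": v2, "color": COLORS[v3]}
    for position, (v1, v2, v3) in enumerate(zip(val1, val2, val3), start=1)
]

for mark in marks:
    print(mark)
```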

Dimensions beyond X and Y.

Color
Size
Shape
Labels
Patterns
Icons
Anything Else You Can Imagine

TRY IT.

1. Identify your domains

2. For each domain, choose a range

3. Draw it!

Finishing Touches

Measures get Axes
Categories get Headers

Labels

Axes

Category Axis
Number Axis
Date Axis
Log Axis

Legends

TRY IT.

1. Add a title to your chart

2. Label your axes

3. Add legends and labels as needed


Review

Dimensions & Scope

Domain & Range

Categories & Measures

Thank You.
