From Data to Visualization, what happens in between?

FROM DATA TO VISUALIZATION: What happens in between? Krist Wongsuphasawat (@kristw), Senior Data Visualization Scientist, Twitter

description

A talk at Data Visualization Summit 2014 in Santa Clara, CA.

ABSTRACT: What is the thought process that transforms data into visualizations? In this presentation, I will talk about guidelines that will help you when starting with raw data, walk through standard techniques, and also discuss things to keep in mind when making design decisions.

Transcript of From Data to Visualization, what happens in between?

Page 1: From Data to Visualization, what happens in between?

FROM DATA TO VISUALIZATION

Krist Wongsuphasawat (@kristw)

Senior Data Visualization Scientist, Twitter

Page 2: From Data to Visualization, what happens in between?

Twitter Analytics / Visual Insights

Internal

• Dashboarding system

• Exploratory data visualization tools

External

• Public-facing visualizations (#interactive)

Page 3: From Data to Visualization, what happens in between?

#interactive

http://twitter.github.io/interactive

Page 4: From Data to Visualization, what happens in between?

Examples

Page 5: From Data to Visualization, what happens in between?

What are visualizations?

Page 6: From Data to Visualization, what happens in between?

pretty graphics

POWER OF THE EYES

Page 7: From Data to Visualization, what happens in between?

pretty

MEANINGFUL

Page 8: From Data to Visualization, what happens in between?

Anscombe’s Quartet

       #1              #2              #3              #4
   X       Y       X       Y       X       Y       X       Y
 10.0    8.04    10.0    9.14    10.0    7.46     8.0    6.58
  8.0    6.95     8.0    8.14     8.0    6.77     8.0    5.76
 13.0    7.58    13.0    8.74    13.0   12.74     8.0    7.71
  9.0    8.81     9.0    8.77     9.0    7.11     8.0    8.84
 11.0    8.33    11.0    9.26    11.0    7.81     8.0    8.47
 14.0    9.96    14.0    8.10    14.0    8.84     8.0    7.04
  6.0    7.24     6.0    6.13     6.0    6.08     8.0    5.25
  4.0    4.26     4.0    3.10     4.0    5.39    19.0   12.50
 12.0   10.84    12.0    9.13    12.0    8.15     8.0    5.56
  7.0    4.82     7.0    7.26     7.0    6.42     8.0    7.91
  5.0    5.68     5.0    4.74     5.0    5.73     8.0    6.89

Page 9: From Data to Visualization, what happens in between?

Anscombe’s Quartet

Property                       Value (same for #1, #2, #3, #4)
Mean of X                      9.0
Variance of X                  10.0 (population variance)
Mean of Y                      7.5
Variance of Y                  3.75 (population variance)
Correlation between X and Y    0.816
Linear regression              y = 3.0 + 0.5x

Identical statistics!
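The summary properties can be recomputed directly from the four datasets. The following is a minimal sketch in plain Python; the names `QUARTET` and `summarize` are illustrative and not from the talk, and the variances are assumed to be population variances (divide by n).

```python
from math import sqrt

# Anscombe's quartet, copied from the data slide above.
X_SHARED = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
QUARTET = {
    "#1": (X_SHARED, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "#2": (X_SHARED, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "#3": (X_SHARED, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "#4": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
           [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}


def summarize(xs, ys):
    """Return summary statistics, using population (divide-by-n) variance."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    var_x = sum((x - mean_x) ** 2 for x in xs) / n
    var_y = sum((y - mean_y) ** 2 for y in ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    slope = cov / var_x  # least-squares slope of y on x
    return {
        "mean_x": mean_x, "var_x": var_x, "mean_y": mean_y, "var_y": var_y,
        "correlation": cov / sqrt(var_x * var_y),
        "slope": slope, "intercept": mean_y - slope * mean_x,
    }


if __name__ == "__main__":
    for name, (xs, ys) in QUARTET.items():
        stats = summarize(xs, ys)
        print(name, {key: round(value, 3) for key, value in stats.items()})
```

Running it prints one line per dataset; the four lines should agree to roughly two decimal places, which is the point of the slide.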

Page 10: From Data to Visualization, what happens in between?

Anscombe’s Quartet

[Figure: four scatterplots of Y against X, one per dataset: #1, #2, #3, #4]

but very different
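To see the difference without any plotting library, the datasets can be drawn as rough text-mode scatterplots. This is only a sketch: `ascii_scatter` and its grid size and axis limits are arbitrary choices made for illustration, not something from the talk.

```python
# Anscombe's quartet, copied from the data slide above.
X_SHARED = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
QUARTET = {
    "#1": (X_SHARED, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "#2": (X_SHARED, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "#3": (X_SHARED, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "#4": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
           [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}


def ascii_scatter(xs, ys, width=40, height=12, x_max=20.0, y_max=14.0):
    """Draw points as '*' on a character grid; the origin is bottom-left."""
    grid = [[" "] * width for _ in range(height)]
    for x, y in zip(xs, ys):
        # Map data values to grid cells, clamping to the last row/column.
        col = min(width - 1, int(x / x_max * (width - 1)))
        row = min(height - 1, int(y / y_max * (height - 1)))
        grid[height - 1 - row][col] = "*"  # flip so larger y is higher up
    return "\n".join("".join(line) for line in grid)


if __name__ == "__main__":
    for name, (xs, ys) in QUARTET.items():
        print(name)
        print(ascii_scatter(xs, ys))
        print("-" * 40)
```

Even at this resolution, #4 collapses into a single vertical column plus one outlier, while #1 spreads along a diagonal.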

Page 11: From Data to Visualization, what happens in between?

Napoleon’s March

geography

timecourse (attack/retreat)

quantity of troops

temperature

direction

Page 12: From Data to Visualization, what happens in between?

London Cholera Outbreak

Page 13: From Data to Visualization, what happens in between?

London Cholera Outbreak

Page 14: From Data to Visualization, what happens in between?

Visualization

• Power

• Understand data quickly

• Discover hidden facts

• Usage

• Storytelling / Reporting

• Exploratory data analysis

Page 15: From Data to Visualization, what happens in between?

“Visualization”

• Information Visualization (academia)

• InfoVis

• Data Visualization (commonly used)

• DataVis

• infographics (...)

Page 16: From Data to Visualization, what happens in between?

How to start?

• What tool should I use?

DATA

Page 17: From Data to Visualization, what happens in between?

How to start?

• What tool should I use?

1. What type of data do I have?

DATA

Page 18: From Data to Visualization, what happens in between?

DATA

1) What type of data?

Page 19: From Data to Visualization, what happens in between?

DATA

1) What type of data?

vis1, vis2, vis3, vis4, vis5, vis6, vis7

Many options... Which visualization technique should I use?

Page 20: From Data to Visualization, what happens in between?

1) What type of data?

• Visualizations are categorized by data types:

• 2- and 3-dimensional

• Multi-dimensional

• Temporal

• Tree

• Network

• etc.
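One way to make this categorization concrete is a lookup from data type to candidate techniques. The sketch below fills the table with the techniques named on the slides of this talk; the dictionary name, the key spellings, and the `candidates` helper are illustrative assumptions.

```python
# Candidate techniques per data type, using the slide titles from this talk.
# Key names and the wording of the 2D/3D entries are illustrative paraphrases.
TECHNIQUES_BY_DATA_TYPE = {
    "2D/3D": ["map (2D)", "3D object view"],
    "multi-dimensional": ["scatterplot", "scatterplot matrix", "parallel coordinates"],
    "temporal": ["line chart", "calendar chart", "events on timeline"],
    "tree": ["node-link tree", "treemap", "icicle", "sunburst"],
    "network": ["node-link diagram", "matrix"],
}


def candidates(*data_types):
    """Return the techniques worth considering for the given data types."""
    found = []
    for data_type in data_types:
        for technique in TECHNIQUES_BY_DATA_TYPE.get(data_type, []):
            if technique not in found:  # keep order, drop duplicates
                found.append(technique)
    return found


if __name__ == "__main__":
    print(candidates("network"))
    print(candidates("multi-dimensional", "temporal"))
```

Knowing the data type narrows the options but does not pick a single winner; that still depends on what you want from the data.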

Page 21: From Data to Visualization, what happens in between?

Let’s take a tour.

Page 22: From Data to Visualization, what happens in between?

2D, 3D data (real world objects)

a.k.a. Scientific Visualization (SciVis)

Page 23: From Data to Visualization, what happens in between?

2D: Maps

Page 24: From Data to Visualization, what happens in between?

3D: Brain

Page 25: From Data to Visualization, what happens in between?

Multi-dimensional data: abstract dimensions (+ real world dimensions)

Page 26: From Data to Visualization, what happens in between?

Flowers

species     sepalLength  sepalWidth  petalLength  petalWidth
setosa      5.1          3.5         1.4          0.2
setosa      4.9          3.0         1.4          0.2
setosa      4.7          3.2         1.3          0.2
virginica   4.6          3.1         1.5          0.2
virginica   5.0          3.6         1.4          0.2
virginica   5.4          3.9         1.7          0.4

DATA

Page 27: From Data to Visualization, what happens in between?

Scatterplot

http://bl.ocks.org/mbostock/3887118

Sepal Length

Sepal Width

Page 28: From Data to Visualization, what happens in between?

Scatterplot Matrix

http://bl.ocks.org/mbostock/4063663

Sepal Length

Sepal Width

Petal Length

Petal Width
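A scatterplot matrix is simply one scatterplot per pair of dimensions. The sketch below builds the point list for every panel from the flower rows listed on the data slide; the function name and the panel representation are illustrative assumptions, and the drawing itself is left out.

```python
from itertools import product

DIMENSIONS = ["sepalLength", "sepalWidth", "petalLength", "petalWidth"]

# Rows as (species, sepalLength, sepalWidth, petalLength, petalWidth),
# copied from the Flowers data slide.
FLOWERS = [
    ("setosa", 5.1, 3.5, 1.4, 0.2),
    ("setosa", 4.9, 3.0, 1.4, 0.2),
    ("setosa", 4.7, 3.2, 1.3, 0.2),
    ("virginica", 4.6, 3.1, 1.5, 0.2),
    ("virginica", 5.0, 3.6, 1.4, 0.2),
    ("virginica", 5.4, 3.9, 1.7, 0.4),
]


def scatterplot_matrix_panels(rows, dimensions):
    """Return {(x_dim, y_dim): [(x, y, species), ...]} for every pair."""
    panels = {}
    for x_dim, y_dim in product(dimensions, repeat=2):
        # +1 skips the species column at index 0 of each row.
        xi = dimensions.index(x_dim) + 1
        yi = dimensions.index(y_dim) + 1
        panels[(x_dim, y_dim)] = [(row[xi], row[yi], row[0]) for row in rows]
    return panels


if __name__ == "__main__":
    panels = scatterplot_matrix_panels(FLOWERS, DIMENSIONS)
    print(len(panels), "panels")
    print(panels[("sepalLength", "sepalWidth")])
```

With four dimensions this gives a 4 x 4 grid of 16 panels; species would typically be encoded as color in each panel.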

Page 29: From Data to Visualization, what happens in between?

Cars

Name           economy (mpg)  cylinders  power (hp)  weight (lb)  0-60 mph (s)
Ford Mustang   18             6          88          3139         14.5
Honda Accord   31.5           4          68          2045         18.5
Honda Civic    24             4          97          2489         15
Mazda RX-7     23.7           3          100         2420         12.5

DATA

Page 30: From Data to Visualization, what happens in between?

Parallel Coordinates

http://bl.ocks.org/jasondavies/1341281
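In parallel coordinates each dimension gets its own vertical axis, and each record becomes a polyline crossing every axis at its value. A minimal sketch of the data side, assuming min-max scaling per axis (the linked example may scale differently), using the car rows from the data slide:

```python
AXES = ["economy (mpg)", "cylinders", "power (hp)", "weight (lb)", "0-60 mph (s)"]

# Values in the same order as AXES, copied from the Cars data slide.
CARS = {
    "Ford Mustang": [18, 6, 88, 3139, 14.5],
    "Honda Accord": [31.5, 4, 68, 2045, 18.5],
    "Honda Civic": [24, 4, 97, 2489, 15],
    "Mazda RX-7": [23.7, 3, 100, 2420, 12.5],
}


def to_polylines(records, axes):
    """Scale every axis to [0, 1] and return one polyline per record."""
    columns = list(zip(*records.values()))
    lows = [min(column) for column in columns]
    highs = [max(column) for column in columns]
    polylines = {}
    for name, values in records.items():
        polylines[name] = [
            # Position along the axis: 0.0 = axis minimum, 1.0 = axis maximum.
            # A constant column is placed mid-axis to avoid dividing by zero.
            (axis, (value - low) / (high - low) if high > low else 0.5)
            for axis, value, low, high in zip(axes, values, lows, highs)
        ]
    return polylines


if __name__ == "__main__":
    for name, line in to_polylines(CARS, AXES).items():
        print(name, [(axis, round(position, 2)) for axis, position in line])
```

Each `(axis, position)` pair is one vertex; drawing a line through a record's vertices from the leftmost to the rightmost axis gives its polyline.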

Page 31: From Data to Visualization, what happens in between?

The Geography of Tweets (@miguelrios)

Page 32: From Data to Visualization, what happens in between?

The Geography of Tweets (@miguelrios)

DATA

tweet counts   latitude    longitude
20,000         27.174526   78.042153
9,000          49.124093   52.201304
1,000          12.2995     31.59592
...            ...         ...

abstract dimension: tweet counts

real world dimensions: latitude, longitude
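The two kinds of dimensions map to different visual channels: latitude and longitude determine position, while the tweet count can drive a mark's size. The sketch below assumes a plain equirectangular projection and circle area proportional to count; the talk does not say how The Geography of Tweets was actually rendered, so both choices and all function names are illustrative.

```python
from math import sqrt

# (tweet count, latitude, longitude) rows from the data slide.
ROWS = [
    (20000, 27.174526, 78.042153),
    (9000, 49.124093, 52.201304),
    (1000, 12.2995, 31.59592),
]


def project(lat, lon, width=360.0, height=180.0):
    """Equirectangular projection: degrees to x/y, origin at the top-left."""
    x = (lon + 180.0) / 360.0 * width
    y = (90.0 - lat) / 180.0 * height
    return x, y


def radius(count, max_count, max_radius=10.0):
    """Circle radius such that circle AREA is proportional to count."""
    return max_radius * sqrt(count / max_count)


if __name__ == "__main__":
    largest = max(count for count, _, _ in ROWS)
    for count, lat, lon in ROWS:
        x, y = project(lat, lon)
        print(f"x={x:.1f} y={y:.1f} r={radius(count, largest):.2f}")
```

Scaling the radius by the square root keeps the visual weight of a circle in line with the count it represents.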

Page 33: From Data to Visualization, what happens in between?

Temporal Data: value changes over time, events

Page 34: From Data to Visualization, what happens in between?

Line charts

http://bl.ocks.org/mbostock/3884955

Page 35: From Data to Visualization, what happens in between?

Calendar chart
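A calendar chart needs one value per day and a grid cell for each day. The sketch below aggregates event timestamps into daily counts and places each day by ISO week (column) and weekday (row). The timestamps are made-up sample data, and this week-by-weekday layout is one common convention, assumed here for illustration.

```python
from collections import Counter
from datetime import date, datetime

# Hypothetical event timestamps (ISO 8601), for illustration only.
EVENTS = [
    "2014-01-01T00:05:00",
    "2014-01-01T13:30:00",
    "2014-01-02T09:00:00",
    "2014-01-06T18:45:00",
]


def daily_counts(timestamps):
    """Count how many events fall on each calendar day."""
    return Counter(datetime.fromisoformat(ts).date() for ts in timestamps)


def calendar_cell(day):
    """Return (ISO week number, ISO weekday) with Monday=1 ... Sunday=7."""
    _, iso_week, iso_weekday = day.isocalendar()
    return iso_week, iso_weekday


if __name__ == "__main__":
    for day, count in sorted(daily_counts(EVENTS).items()):
        print(day, "cell", calendar_cell(day), "count", count)
```

The count would then be encoded as the cell's color; the same daily counts could equally feed a line chart.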

Page 36: From Data to Visualization, what happens in between?

Events on timeline

http://evolutionofweb.appspot.com/#/evolution/day

Page 37: From Data to Visualization, what happens in between?

Trees: hierarchy

Page 38: From Data to Visualization, what happens in between?

Tree

http://bl.ocks.org/mbostock/4339083

Page 39: From Data to Visualization, what happens in between?

Stock Market

All stocks

  Financial, Healthcare, Technology, ...

    Apple, Google, Canon, ...

DATA

Page 40: From Data to Visualization, what happens in between?

TreeMaps

http://www.marketwatch.com/tools/stockresearch/marketmap
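A treemap gives every node a rectangle whose area is proportional to its value, nested inside its parent's rectangle. The sketch below uses the simple slice-and-dice layout, which alternates the split direction at each level; the layout used by the linked market map is not stated in the talk and may well differ. The numeric values, and placing the three companies under Technology, are made up for illustration.

```python
# Hypothetical values; the hierarchy loosely follows the Stock Market slide.
TREE = {
    "name": "All stocks",
    "children": [
        {"name": "Financial", "value": 30},
        {"name": "Healthcare", "value": 20},
        {"name": "Technology", "children": [
            {"name": "Apple", "value": 25},
            {"name": "Google", "value": 15},
            {"name": "Canon", "value": 10},
        ]},
    ],
}


def total(node):
    """Sum of leaf values below (or at) this node."""
    children = node.get("children")
    if children:
        return sum(total(child) for child in children)
    return node["value"]


def slice_and_dice(node, x, y, w, h, depth=0, out=None):
    """Return {leaf name: (x, y, width, height)} for all leaves."""
    if out is None:
        out = {}
    children = node.get("children")
    if not children:
        out[node["name"]] = (x, y, w, h)
        return out
    node_total = total(node)
    offset = 0.0  # fraction of the parent already handed out
    for child in children:
        share = total(child) / node_total
        if depth % 2 == 0:  # even depth: place children left to right
            slice_and_dice(child, x + offset * w, y, share * w, h, depth + 1, out)
        else:               # odd depth: stack children top to bottom
            slice_and_dice(child, x, y + offset * h, w, share * h, depth + 1, out)
        offset += share
    return out


if __name__ == "__main__":
    for name, rect in slice_and_dice(TREE, 0, 0, 100, 100).items():
        print(name, tuple(round(v, 1) for v in rect))
```

Slice-and-dice is easy to follow but tends to produce thin slivers; squarified layouts trade that simplicity for better aspect ratios.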

Page 41: From Data to Visualization, what happens in between?

Icicle

http://bl.ocks.org/mbostock/1005873

Page 42: From Data to Visualization, what happens in between?

Sunburst

http://bl.ocks.org/mbostock/4348373

Page 43: From Data to Visualization, what happens in between?

Networks: nodes and edges

Page 44: From Data to Visualization, what happens in between?

Character Co-occurrences

{
  nodes: [
    'valjean',
    'fantine',
    'cosette',
    ...
  ],
  edges: [
    {character1: 'valjean', character2: 'fantine', 10},
    {character1: 'valjean', character2: 'cosette', 5},
    ...
  ]
}

DATA

Page 45: From Data to Visualization, what happens in between?

Node-link diagram

http://bl.ocks.org/mbostock/4062045

Page 46: From Data to Visualization, what happens in between?

Matrix

http://bost.ocks.org/mike/miserables/
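The matrix view draws the same nodes-and-edges data as a grid: one row and one column per node, with each cell holding the edge weight. A minimal sketch of that conversion, using the three characters and two co-occurrence counts from the data slide (the tuple format and the function name are illustrative):

```python
NODES = ["valjean", "fantine", "cosette"]

# (character1, character2, co-occurrence count), from the data slide.
EDGES = [
    ("valjean", "fantine", 10),
    ("valjean", "cosette", 5),
]


def adjacency_matrix(nodes, edges):
    """Build a symmetric matrix of edge weights; 0 means no edge."""
    index = {name: position for position, name in enumerate(nodes)}
    matrix = [[0] * len(nodes) for _ in nodes]
    for first, second, weight in edges:
        i, j = index[first], index[second]
        matrix[i][j] = weight
        matrix[j][i] = weight  # co-occurrence is undirected
    return matrix


if __name__ == "__main__":
    for name, row in zip(NODES, adjacency_matrix(NODES, EDGES)):
        print(f"{name:8s}", row)
```

A node-link diagram reads the edge list directly, whereas the matrix form stays legible for dense networks where links would overlap.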

Page 47: From Data to Visualization, what happens in between?

Combination

• Multi-D + Temporal

• Multi-D + Tree

• Multi-D + Network

• Temporal + Tree

• Temporal + Network

• ...

Page 49: From Data to Visualization, what happens in between?

VISUALIZATION: visual encodings + interactions

visual encodings (by data type): bar chart, line chart, matrix, node-link, treemaps, etc., or multiple views

interactions: tooltips, animation, highlight, filter, etc.

Page 50: From Data to Visualization, what happens in between?

DATA

1) What type of data?

vis1, vis2, vis3, vis4, vis5, vis6, vis7

Many options... Which visualization technique should I use?

Page 51: From Data to Visualization, what happens in between?

DATA

1) What type of data?

vis3, vis4, vis7

Fewer options... Still, which one should I use?

Page 52: From Data to Visualization, what happens in between?

How to start?

• What tool should I use?

1. What type of data do I have?

2. What do I want from the data?

DATA

Page 53: From Data to Visualization, what happens in between?

2) What do I want from the data?

• Many ways to visualize one type of data.

• Things to consider:

• audience (data scientist, execs, etc.)

• goal (storytelling, exploratory analysis)

• tasks

Page 54: From Data to Visualization, what happens in between?

Storytelling

Exploratory

Page 55: From Data to Visualization, what happens in between?

Four more years

https://www.youtube.com/watch?v=01un0ORjQps

Page 56: From Data to Visualization, what happens in between?

Photogrid (Treemap + photo)

http://twitter.github.io/interactive/sochi

Page 57: From Data to Visualization, what happens in between?

Soccer Tournament

https://uclfinal.twitter.com/

Page 58: From Data to Visualization, what happens in between?

State of the Union

http://twitter.github.io/interactive/sotu2014/#p1

Page 59: From Data to Visualization, what happens in between?

Ok, now tools.

1. What type of data do I have?

2. What do I want from the data?

Page 60: From Data to Visualization, what happens in between?

Tools

Option 1: Programming library. You have to write code.

Option 2: Packaged software. (Mostly) no coding involved.

Page 61: From Data to Visualization, what happens in between?

Programming libraries

• d3.js, Processing, R, etc.

• Copy and modify from examples.

• Can do custom stuff (if you can figure out how)

• More overhead for common tasks

Page 62: From Data to Visualization, what happens in between?

Packaged software

• Tableau (multi-dimensional)

• Gephi (graph)

• NodeXL (graph)

• Research projects (contact authors)

• Just use the software. No hassle of coding/debugging

• Functionality limited to what the tools can do

• Custom designs more difficult

Page 63: From Data to Visualization, what happens in between?

Ideal workflow

1. What type of data do I have?

2. What do I want from the data?

3. Pick appropriate techniques/tools

4. Done!

Page 64: From Data to Visualization, what happens in between?

Ideal workflow

1. What type of data do I have?

2. What do I want from the data?

3. Pick appropriate techniques/tools

4. Done!

Not that easy!

Page 65: From Data to Visualization, what happens in between?

Real-life workflow

1. What type of data do I have?

2. Pre-process data (data are dirty: transform)

3. What do I want from the data?

4. Pick appropriate techniques/tools

5. See results

6. Unsatisfied? Change goal or change perspective, then repeat.
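The pre-processing step is often just a defensive parse that turns raw text into typed values and drops what cannot be used. The sketch below cleans rows shaped like the tweet-count table earlier in the talk; the field names follow that table, while the sample rows, the validation rules, and the function names are assumptions for illustration.

```python
# Hypothetical raw rows; the last three are deliberately dirty.
RAW_ROWS = [
    {"tweet counts": "20,000", "latitude": "27.174526", "longitude": "78.042153"},
    {"tweet counts": " 9,000 ", "latitude": "49.124093", "longitude": "52.201304"},
    {"tweet counts": "...", "latitude": "...", "longitude": "..."},
    {"tweet counts": "1,000", "latitude": "123.0", "longitude": "31.59592"},
    {"tweet counts": "500", "latitude": None, "longitude": "10.0"},
]


def parse_row(raw):
    """Return (count, lat, lon), or None if the row is unusable."""
    try:
        # Remove thousands separators and surrounding whitespace.
        count = int(raw["tweet counts"].replace(",", "").strip())
        lat = float(raw["latitude"])
        lon = float(raw["longitude"])
    except (KeyError, ValueError, TypeError, AttributeError):
        return None
    # Reject coordinates outside the valid geographic range.
    if not (-90.0 <= lat <= 90.0 and -180.0 <= lon <= 180.0):
        return None
    return count, lat, lon


def preprocess(rows):
    """Keep only the rows that parse cleanly."""
    return [parsed for parsed in map(parse_row, rows) if parsed is not None]


if __name__ == "__main__":
    clean = preprocess(RAW_ROWS)
    print(f"kept {len(clean)} of {len(RAW_ROWS)} rows")
    print(clean)
```

In practice it is worth logging the rejected rows as well, since what turns out to be dirty frequently changes what you decide to ask of the data.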

Page 66: From Data to Visualization, what happens in between?

New year 2014

http://twitter.github.io/interactive/newyear2014/

Page 67: From Data to Visualization, what happens in between?

Behind the scenes

Page 68: From Data to Visualization, what happens in between?
Page 69: From Data to Visualization, what happens in between?
Page 70: From Data to Visualization, what happens in between?
Page 71: From Data to Visualization, what happens in between?
Page 72: From Data to Visualization, what happens in between?
Page 73: From Data to Visualization, what happens in between?
Page 74: From Data to Visualization, what happens in between?

FROM DATA TO VISUALIZATION

@kristw

Page 75: From Data to Visualization, what happens in between?

FROM DATA TO VISUALIZATION

@kristw

DATA first, not tools.

Page 76: From Data to Visualization, what happens in between?

FROM DATA TO VISUALIZATION

@kristw

DATA first, not tools.

choose: visual encodings (by data types) + interactions

Page 77: From Data to Visualization, what happens in between?

FROM DATA TO VISUALIZATION

@kristw

DATA first, not tools.

choose: visual encodings (by data types) + interactions

twitter.github.io/interactive

Page 78: From Data to Visualization, what happens in between?

Thank you