From Data To Graphics

Data Visualization Nikhil Srivastava, 2015

Nikhil Srivastava

Moringa School

Summer 2015

• What is Data Visualization?

• Thinking and Seeing

• From Data to Graphics

• Principles and Guidelines

• Highcharts and Javascript

• Class Project

Stages: introduction, foundation & theory, building blocks, design & critique, construction

Last time …

Bandwidth of Our Senses

Why Vision?

The Software: Visual Perception

“Bottom-up”
• Sensory input
• Low-level features: orientation, shape, color, movement
• Rapid, parallel, automatic

“Top-down”
• High-level concepts: objects, symbols
• Involves working memory
• Slow, sequential, conscious

Task: Counting

Slow, sequential, conscious:

1281768756138976546984506985604982826762
9809858458224509856458945098450980943585
9091030209905959595772564675050678904567
8845789809821677654876364908560912949686

Rapid, parallel, automatic: the same block of digits appears a second time on the slide (any highlighting of the target digit is not preserved in this transcript).

Eye != Camera

Summary

• Human vision is constrained and imperfect

• Use “pre-attentive” attributes carefully

• Minimize unnecessary visual movement

• Layout and scope as important as measurement

From Data to Graphics

What kind of data do we have?
How can we represent the data visually?
How can we organize this into a visualization?

Athi River      Machakos      139,380
Awasi           Kisumu         93,369
Kangundo-Tala   Machakos      218,557
Karuri          Kiambu        129,934
Kiambu          Kiambu         88,869
Kikuyu          Kiambu        233,231
Kisumu          Kisumu        409,928
Kitale          Trans-Nzoia   106,187
Kitui           Kitui         155,896
Limuru          Kiambu        104,282
Machakos        Machakos      150,041
Molo            Nakuru        107,806
Mwingi          Kitui          83,803
Naivasha        Nakuru        181,966
Nakuru          Nakuru        307,990
Nandi Hills     Trans-Nzoia    73,626

 

Visual Encoding

Data as Input

Clean / Restructure

Explore / Analyze

DATA

Visualization Goals
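
The clean and restructure steps can be sketched in Python. This is a minimal sketch, not the course's code: the rows are copied from the town table earlier in the deck, but the field names `town`, `county`, and `population` are assumptions (the slide shows no column headers), and `clean_rows` and `total_by_county` are hypothetical helper names.

```python
from collections import defaultdict

# A few rows copied from the slide, with the stray whitespace and
# thousands separators left in on purpose.
RAW_ROWS = """
  Athi River   Machakos  139,380
  Awasi   Kisumu  93,369
  Kangundo-Tala   Machakos  218,557
  Kisumu   Kisumu  409,928
"""


def clean_rows(raw):
    """Clean: turn each non-empty text line into a typed record."""
    records = []
    for line in raw.splitlines():
        line = line.strip()
        if not line:
            continue
        # Split from the right so multi-word town names stay intact.
        # Assumes the last two fields contain no spaces.
        town, county, population = line.rsplit(None, 2)
        records.append({
            "town": town,
            "county": county,
            "population": int(population.replace(",", "")),
        })
    return records


def total_by_county(records):
    """Restructure: aggregate town records into one total per county."""
    totals = defaultdict(int)
    for record in records:
        totals[record["county"]] += record["population"]
    return dict(totals)


records = clean_rows(RAW_ROWS)
totals = total_by_county(records)
print(totals)
```

Once the numbers are real integers and the rows are grouped, the data is ready for the explore and analyze steps.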

Model and Attribute

item    attr_A     attr_B     …   attr_Z
item1   value1_A   value1_B   …
item2   value2_A   value2_B   …
…       …          …
itemN                             valueN_Z

Data Types

             CATEGORICAL              ORDINAL               NUMERICAL (Interval)   NUMERICAL (Ratio)
Examples     Male / Female            Small / Med / Large   Latitude/Longitude     Length
             Asia / Africa / Europe   Low / High            Compass direction      Count
             True / False             Yes / Maybe / No      Time (event)           Time (duration)
Operations   =                        = < >                 = < > + -              = < > + - * /
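
The operations listed for each data type can be written down as a small lookup. This is an illustrative sketch only; the names `OPERATIONS`, `supports`, `SIZE_ORDER`, and `ordinal_less` are made up for this example.

```python
# Each data type allows the operations of the type before it, plus more.
OPERATIONS = {
    "categorical": {"="},
    "ordinal": {"=", "<", ">"},
    "interval": {"=", "<", ">", "+", "-"},
    "ratio": {"=", "<", ">", "+", "-", "*", "/"},
}


def supports(data_type, operation):
    """Return True if the operation is meaningful for the data type."""
    return operation in OPERATIONS[data_type]


# Ordinal values need an explicit order; plain string comparison would
# sort these alphabetically ("Large" < "Med" < "Small"), which is wrong.
SIZE_ORDER = ["Small", "Med", "Large"]


def ordinal_less(a, b, order):
    """Compare two ordinal values by their position in the given order."""
    return order.index(a) < order.index(b)


print(supports("ordinal", "<"))   # comparing sizes is fine
print(supports("ordinal", "+"))   # adding sizes is not
print(ordinal_less("Small", "Large", SIZE_ORDER))
```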

Data Types: Example

ID   Gender   Test Score   Grade   Size     Temperature
1    Male     77           C       Small    36.5
2    Female   85           B       Large    37.2
3    Female   95           A       Medium   36.7
4    Male     90           A       Large    37.4

• Which are categorical? (=)
• Which are ordinal? (= < >)
• Which are interval? (= < > + -)
• Which are ratio? (= < > + - * /)
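
One way to test the interval-versus-ratio question is to check whether a ratio survives a change of units. The sketch below assumes the Temperature column is in degrees Celsius (the slide does not say), and uses hypothetical lengths in centimetres for comparison; `ratio_preserved` is a helper name invented here.

```python
import math


def celsius_to_fahrenheit(c):
    return c * 9 / 5 + 32


def cm_to_inches(cm):
    return cm / 2.54


def ratio_preserved(x, y, convert):
    """True if y / x is the same before and after the unit conversion."""
    return math.isclose(y / x, convert(y) / convert(x))


# Temperatures from the example table (assumed Celsius): the ratio changes
# with the unit, so "* /" are not meaningful and the scale is interval.
print(ratio_preserved(36.5, 37.4, celsius_to_fahrenheit))

# Hypothetical lengths: the ratio is unit-independent, so the scale is ratio.
print(ratio_preserved(10.0, 25.0, cm_to_inches))

# Differences do remain comparable on an interval scale: a gap in Celsius
# always maps to the same gap in Fahrenheit (times 9/5).
gap_c = 37.4 - 36.5
gap_f = celsius_to_fahrenheit(37.4) - celsius_to_fahrenheit(36.5)
print(math.isclose(gap_f, gap_c * 9 / 5))
```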

Data Type Transformation

CATEGORICAL              ORDINAL               NUMERICAL (Interval)   NUMERICAL (Ratio)
Male / Female            Small / Med / Large   Time                   Time
Asia / Africa / Europe   Low / High            Latitude/Longitude     Length
True / False             Yes / Maybe / No      Compass direction      Count

Transformations: Binning/Categorizing, Differencing/Normalization
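
A rough sketch of two of these transformations, under one reading of the slide: binning turns a numerical attribute into an ordinal one, and differencing turns event times (interval) into durations (ratio). The thresholds, labels, and timestamps below are hypothetical, as are the function names.

```python
from datetime import datetime


def bin_value(value, edges, labels):
    """Binning: map a number to an ordinal label.

    `edges` holds the upper bound of every bin except the last,
    so len(labels) must equal len(edges) + 1.
    """
    for edge, label in zip(edges, labels):
        if value < edge:
            return label
    return labels[-1]


def durations(events):
    """Differencing: seconds elapsed between consecutive event times."""
    return [(later - earlier).total_seconds()
            for earlier, later in zip(events, events[1:])]


# Test scores from the example table, with made-up thresholds.
scores = [77, 85, 95, 90]
print([bin_value(s, [80, 90], ["Low", "Medium", "High"]) for s in scores])

# Made-up event timestamps.
events = [
    datetime(2015, 6, 1, 9, 0),
    datetime(2015, 6, 1, 9, 30),
    datetime(2015, 6, 1, 11, 0),
]
print(durations(events))
```

Binning discards information (77 and 79 become the same label), so it is usually applied just before plotting rather than to the stored data.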

Advanced Data Types

• Networks/Graphs

– Hierarchies/Trees

• Text

• Maps: points, regions, routes

Class Exercise

• How can I represent 3 and 5 on the whiteboard?

Visual Encodings

Marks
• point
• line
• area
• volume

Channels
• position
• size
• shape
• color
• angle/tilt
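
As a toy illustration of marks and channels, the same two values from the class exercise (3 and 5) can be drawn in plain text with two different encodings. The function names and the text rendering are inventions for this sketch, not part of the course material.

```python
def encode_as_length(value, mark="#"):
    """Line mark, size channel: the length of the line shows the value."""
    return mark * value


def encode_as_position(value, axis_length=10, mark="o"):
    """Point mark, position channel: the point sits at `value` on an axis."""
    if not 0 <= value <= axis_length:
        raise ValueError("value must lie on the axis")
    return "." * value + mark + "." * (axis_length - value)


for value in (3, 5):
    print(value, encode_as_length(value), encode_as_position(value))
```

Both encodings carry the same data; what changes is which visual property the reader has to judge.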

Channel Effectiveness

“Spatial position is such a good visual coding of data that the first decision of visualization design is which variables get spatial encoding at the expense of others”

Color as a Channel

             Categorical      Quantitative
Hue          Good (6-8 max)   Poor
Value        Poor             Good
Saturation   Poor             Okay
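
The table suggests two kinds of palette: vary hue for categories, vary value for quantities. The sketch below builds both with Python's standard `colorsys` module. The saturation and value constants, the default hue, and the function names are arbitrary choices for illustration.

```python
import colorsys


def _to_hex(rgb):
    """Convert an (r, g, b) tuple of floats in [0, 1] to '#rrggbb'."""
    return "#{:02x}{:02x}{:02x}".format(*(round(c * 255) for c in rgb))


def categorical_palette(n):
    """n colors that differ only in hue, evenly spaced around the wheel."""
    if not 1 <= n <= 8:
        raise ValueError("hue separates roughly 6-8 categories at most")
    return [_to_hex(colorsys.hsv_to_rgb(i / n, 0.65, 0.9)) for i in range(n)]


def sequential_ramp(n, hue=0.6, brightest=0.95, darkest=0.25):
    """n colors that share one hue and step from bright to dark (value)."""
    if n < 2:
        raise ValueError("need at least two steps")
    step = (brightest - darkest) / (n - 1)
    return [_to_hex(colorsys.hsv_to_rgb(hue, 0.65, brightest - i * step))
            for i in range(n)]


print(categorical_palette(6))
print(sequential_ramp(5))
```

The hard limit in `categorical_palette` mirrors the "6-8 max" note in the table; past that point neighbouring hues become difficult to tell apart.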

type                        mark    channel           data represented
Scatter Plot                point   position          2 quantitative
Scatter + Hue               point   position, color   2 quantitative, 1 categorical
Scatter + Size (“Bubble”)   point   position, size    3 quantitative
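
Behind a scatter or bubble chart, each quantitative value is converted to a pixel position or a radius. A minimal sketch of those two mappings follows; the function names and the pixel sizes are assumptions for illustration, and charting libraries handle this internally.

```python
import math


def linear_scale(domain, pixel_range):
    """Position channel: map a data interval onto a pixel interval."""
    d0, d1 = domain
    r0, r1 = pixel_range

    def scale(value):
        return r0 + (value - d0) / (d1 - d0) * (r1 - r0)

    return scale


def bubble_radius(value, max_value, max_radius):
    """Size channel: make the bubble's AREA proportional to the value.

    Area grows with the square of the radius, so the radius must grow
    with the square root of the value.
    """
    return max_radius * math.sqrt(value / max_value)


x = linear_scale((0, 100), (0, 500))
# Screen y usually grows downward, so the pixel range is reversed.
y = linear_scale((0, 10), (300, 0))

print(x(50), y(10), y(0))
print(bubble_radius(25, 100, 20))
```

Scaling the radius directly by the value would make a bubble for 100 look sixteen times larger in area than a bubble for 25, not four times.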

Scatter Plot – Applications

CORRELATION, GROUPING, OUTLIERS

Scatter Plot – Dangers

OCCLUSION (DENSITY)

OCCLUSION (OVERLAP)

3-D

type         mark   channel                  data represented
Line Chart   line   position (orientation)   2 quantitative
Area Chart   area   size (length)            2 quantitative

Line Chart – Applications

PATTERN OVER TIME, COMPARISON

Line Chart – Dangers

Y SCALING

X SCALING

OVERLOAD
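
The Y-scaling danger can be made concrete with a little arithmetic: the same change in the data fills a very different share of the plot depending on the axis limits. The values and axis limits below are hypothetical, and `share_of_plot_height` is a name made up for this sketch.

```python
import math


def share_of_plot_height(start, end, axis_min, axis_max):
    """Fraction of the vertical axis covered by a move from start to end."""
    return abs(end - start) / (axis_max - axis_min)


# A series that rises from 100 to 110 (a 10% increase).
full_axis = share_of_plot_height(100, 110, axis_min=0, axis_max=120)
tight_axis = share_of_plot_height(100, 110, axis_min=95, axis_max=115)

print(round(full_axis, 3))   # small slope on an axis that starts at zero
print(tight_axis)            # the same data fills half of a cropped axis
```

Neither choice is wrong in itself; the point is that the axis limits decide how dramatic the line looks, so they should be chosen and labelled deliberately.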

type        mark   channel         data represented
Bar Chart   line   size (length)   1 categorical, 1 quantitative
Histogram   line   size (length)   1 ordinal/quantitative, 1 quantitative (count)
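
The "count" column of a histogram comes from binning the quantitative variable first. Below is a small sketch using the sixteen numbers from the town table earlier in the deck (assumed here to be populations); the bin width of 100,000 and the function name are arbitrary choices.

```python
def histogram(values, bin_width, start=0):
    """Count how many values fall into each equal-width bin.

    Returns {lower_bound_of_bin: count}; empty bins are left out.
    """
    counts = {}
    for value in values:
        index = (value - start) // bin_width
        lower = start + index * bin_width
        counts[lower] = counts.get(lower, 0) + 1
    return dict(sorted(counts.items()))


POPULATIONS = [
    139380, 93369, 218557, 129934, 88869, 233231, 409928, 106187,
    155896, 104282, 150041, 107806, 83803, 181966, 307990, 73626,
]

for lower, count in histogram(POPULATIONS, 100_000).items():
    # One text bar per bin: bar length encodes the count.
    print(f"{lower:>7}+ {'#' * count}")
```

Changing `bin_width` changes the shape of the distribution the reader sees, so it is worth trying more than one value.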

Bar Chart – Applications

COMPARE CATEGORIES, DISTRIBUTION

Bar Chart – Dangers

TOO MANY CATEGORIES

POORLY SORTED
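
Both dangers have a common remedy in the data preparation step: sort the categories by value and fold the smallest ones into a single bar. This is one possible approach, sketched with six rows from the town table; the `"Other"` label and the function name are assumptions.

```python
def sort_and_limit(values, top_n, other_label="Other"):
    """Sort categories by value (largest first) and keep only the top_n.

    Everything else is summed into one trailing `other_label` entry.
    """
    ranked = sorted(values.items(), key=lambda item: item[1], reverse=True)
    head, tail = ranked[:top_n], ranked[top_n:]
    if tail:
        head.append((other_label, sum(value for _, value in tail)))
    return head


TOWNS = {
    "Athi River": 139380,
    "Awasi": 93369,
    "Kangundo-Tala": 218557,
    "Karuri": 129934,
    "Kiambu": 88869,
    "Kikuyu": 233231,
}

for name, value in sort_and_limit(TOWNS, top_n=3):
    print(f"{name:<15}{value:>9,}")
```

The combined bar can end up larger than the named ones, which is why it is kept last rather than sorted in with the rest.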

type        mark   channel        data represented
Pie Chart   area   size (angle)   1 quantitative

Pie Chart – Dangers

AREA SCALE, SIMILAR AREAS, OVERLOAD

Multi-Series Bar Charts

GROUPED BAR CHART

STACKED BAR CHART

Multi-Series Line Charts

MULTIPLE LINE

STACKED AREA CHART

Normalization

NORMALIZED BAR, NORMALIZED AREA
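
Normalizing a stacked bar or area chart means converting each stack to shares of its own total, so every stack reaches 100%. A minimal sketch with made-up series follows; it assumes every stack has a positive total, and the function name is hypothetical.

```python
def normalize_stacks(series):
    """Convert stacked values to percentages of each stack's total.

    `series` maps a series name to a list with one value per category
    (or per time step). All lists must have the same length.
    """
    names = list(series)
    length = len(series[names[0]])
    totals = [sum(series[name][i] for name in names) for i in range(length)]
    return {
        name: [100 * value / total
               for value, total in zip(series[name], totals)]
        for name in names
    }


# Hypothetical data: two series measured in two categories.
raw = {"A": [30, 10], "B": [70, 30]}
print(normalize_stacks(raw))
```

Normalization makes proportions easy to compare across stacks but hides the absolute totals (100 versus 40 in this example), so the choice depends on which comparison matters.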

Small Multiples Chart

More Charts

Treemap (Hierarchical Data)
Channels: ?
Strengths: nested relationships
Concerns: order vs aspect ratio

Multi-Level Pie (Hierarchical Data)
Channels: ?
Strengths: nested relationships
Concerns: readability

Heat Map (Table/Field Data)
Channels: ?
Strengths: pattern/outlier detection
Concerns: ordering/clustering

Choropleth Map (Region Data)
Channels: ?
Strengths: geography
Concerns: region size, color spectrum

Cartogram (Region Data)
Channels: ?
Strengths: geographic pattern
Concerns: base map knowledge

Highcharts: Review

• Basics

• Hello Chart

• API/Documentation

• Data import/manipulation

Highcharts Cloud