This tutorial was developed for the Visualizing Venice 2016 Summer Workshop.
http://www.dukewired.org/visualizing-venice-the-ghetto-of-venice/
Instructors: Mark J.V. Olson, Victoria Szabo
Teaching Assistants: Ludovica Galeazzo, Hannah L. Jacobs, Edward Triplett http://www.dukewired.org/
It is licensed under CC-BY-NC-SA 3.0 US.
https://creativecommons.org/licenses/by-nc-sa/3.0/us/
Please use, reuse, mix, and cite your source(s)!
Data Visualizations in RAW
About this Tutorial
RAW is an open web application that creates custom vector-based visualizations built on the D3.js
JavaScript library using a graphical user interface in your browser.* D3.js is a JavaScript library that
converts data into a visualization of your design without the need for a proprietary framework.†
Part I: Preparing the Data
For this tutorial, we will return to the historical data on shops in the Ghetto.
1. Locate and open the data, “Shops_Raw.csv”, in Microsoft Excel. This spreadsheet is a
compilation of the data from 1661, 1712, and 1739. Note the three columns: Tipologia (type),
Anno (year), and Nome (name).
2. Close the file and locate it in Finder. Right-click on it, and select Open With… → Atom. The
data has been saved in Excel as a .csv (comma-separated values) file. In a text editor, we can
see the raw form of the data, in which each line corresponds to a record, and columns are
separated by commas. We will need this data format for creating visualizations with RAW.
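To make the one-record-per-line structure concrete, here is a minimal Python sketch that reads CSV text with the standard csv module. Only the column headers come from this tutorial; the three sample rows are invented placeholders, not values from Shops_Raw.csv.

```python
import csv
import io

# Hypothetical sample in the same shape as Shops_Raw.csv.
# The header names come from the tutorial; the rows are made up.
SAMPLE_CSV = """Tipologia,Anno,Nome
Food,1661,Baker
Textiles,1712,Tailor
Food,1739,Baker
"""

def parse_records(text):
    """Return one dict per data line, keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text)))

records = parse_records(SAMPLE_CSV)
for record in records:
    # Each line of the file becomes one record with three named fields.
    print(record["Anno"], record["Tipologia"], record["Nome"])
```

Note that every value is read as text at this stage; deciding which columns are numbers, strings, or dates is a separate step.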
* Language paraphrased from http://raw.densitydesign.org/.
† Language paraphrased from http://d3js.org/.
Part II: Creating a RAW Visualization
1. Navigate to http://raw.densitydesign.org/.
2. Click “Use it now!” to open the four-step visualization process.
3. In the window that loads, we can paste in our data. In the text editor, select all of your data.
Copy it (Edit → Copy) and paste it (Edit → Paste) into the RAW text field.
4. RAW will attempt to read your data and will give you a thumbs up if it has successfully read
(parsed) your data.
5. Scroll down in your browser to reveal the next step: choosing a visualization type.
There are 16 visualization types available as well as an option to add your own visualization
style created in d3.js.
6. Each visualization type can show a different aspect of our data. We might choose to highlight
change over time, typological quantities, business quantities, etc. In Part IV below, you can find
information on each visualization type and how you might choose to use it.
7. Let’s first try visualizing quantities of business types. Select
the circle packing visualization in the “Choose a Chart”
section.
8. A third step will appear below. Scroll down the page.
9. In the “Map your dimensions” section, we must choose how RAW creates a visualization using
the three columns of data. The dimensions, attributes that affect how a chart will look (size,
color, label, groupings, etc.), change depending on the chart you choose.
10. The column headers are listed in green on the left. Dimensions are listed to the right. Note
that each dimension type requires certain types of data: numbers, strings, and/or dates. RAW
attempts to identify each column as one of these types—listed next to the column name. If a
dimension does not accept a data type, that column’s box will turn yellow when it is added to
the dimension.
11. Drag and drop a column header into a dimension box: start by adding “Nome” to Hierarchy
and Color. (It is often possible to apply a column to multiple dimensions. It may also be
possible in some cases to leave some dimensions empty.)
12. Scroll down now to view and customize the visualization you’ve created.
In your visualization, each colored bubble corresponds to a business type (Nome), and the size
of each bubble is determined by the number of times that type is listed in the dataset.
13. Does this visualization accomplish what we’ve set out to show? Might there be another way to
go about it?
14. Let’s now load into RAW the data saved in 1661_Shops_Raw.csv.
15. This data contains the list of Jewish business types present in and around the Ghetto in 1661.
Note, however, that the Anno column is gone, and a new “Quantity” column has been added.
Each business type appears only once in the data, and the “Quantity” column gives the
number of shops present.
16. Let’s see what happens when we use this data to create a new clustered force layout
visualization using the following as dimensions:
[Screenshot: dimension settings for the clustered force layout]
17. This time, we should see a visualization that gives us only the quantities of business types for
one year, color coded by Tipologia/type. In this case, removing the data from other years and
adding a column with a numeric data type made it possible for us to be more explicit in the
visualization we’ve created. We’ve also chosen a slightly different visualization format that
replaces the “Hierarchy” dimension with a much more democratic “Clusters” dimension, as
there is no need to show any hierarchy in this case.
18. We can now customize the visualization’s appearance in the left menu
in “Customize your Visualization”.
19. Here we can adjust the visualization’s height and width, the padding
between nodes and clusters, and the colors chosen.
20. When you’re satisfied with these customizations, move to the final
sharing section where you may choose to download the visualization
as a .png, .svg, or .json file or to embed the visualization on a web
page using XML code provided in the “Embed Code” text box.
21. Note that it is possible to create further customizations in
the .svg, .json, and XML formats if you are familiar with XML or JSON.
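The “Quantity” column described in step 15 can be produced by counting rows in the original one-row-per-shop data. The following Python sketch shows one way to do that. It assumes the summary file keeps the Tipologia and Nome columns in that order, which the tutorial does not state, and the sample rows are invented for illustration.

```python
import csv
import io
from collections import Counter

# Hypothetical rows in the shape of Shops_Raw.csv (values are made up).
RAW_ROWS = """Tipologia,Anno,Nome
Food,1661,Baker
Food,1661,Baker
Textiles,1661,Tailor
Food,1712,Baker
"""

def quantities_for_year(text, year):
    """Count rows per (Tipologia, Nome) pair for a single year."""
    reader = csv.DictReader(io.StringIO(text))
    counts = Counter(
        (row["Tipologia"], row["Nome"])
        for row in reader
        if row["Anno"] == str(year)
    )
    return [
        {"Tipologia": tipologia, "Nome": nome, "Quantity": n}
        for (tipologia, nome), n in sorted(counts.items())
    ]

def to_csv(rows):
    """Write the summary back out as CSV text ready to paste into RAW."""
    out = io.StringIO()
    writer = csv.DictWriter(
        out, fieldnames=["Tipologia", "Nome", "Quantity"], lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()

summary = quantities_for_year(RAW_ROWS, 1661)
print(to_csv(summary))
```

Each business type now appears once, and the count that was implicit in the repeated rows becomes an explicit number that a size dimension can use.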
Part III: Visualizing Time in RAW
We’ve made one visualization that shows different types of shops during one year, but what if we
want to add the time dimension to a visualization? We can do this in RAW using the Bump Chart,
Small Multiples, or Streamgraph.
1. Load the dataset “Shows_Time_Raw.csv” in RAW.
2. Note again that the data structure has been adjusted slightly from the original file to include
quantities in the far right field. Note also that the date format has changed to reflect RAW’s
accepted date formats.
3. This time, choose bump chart and set your dimensions to
Group: Nome; Date: Anno; Size: Quantity
4. View your chart. What does it tell you about changes in shops across the time span? What is
not made clear through this visualization?
5. Try the streamgraph visualization. What does it show that the bump chart does not? What
about the small multiples?
One next step for this visualization might be to consider comparing only a few of the business
types listed. What other kinds of visualizations can you envision from this dataset?
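As a rough illustration of how a time-based dataset like the one in this part could be prepared, the Python sketch below counts shops per name and year and optionally keeps only a few business types for comparison. The sample rows and the `keep` parameter are hypothetical, and the sketch leaves the year as a plain number rather than guessing at RAW's accepted date formats.

```python
from collections import Counter

# Hypothetical (Nome, Anno) pairs, one per shop; values are made up.
ROWS = [
    ("Baker", 1661), ("Baker", 1661), ("Tailor", 1661),
    ("Baker", 1712), ("Tailor", 1712), ("Tailor", 1712),
    ("Printer", 1739),
]

def quantities_by_year(rows, keep=None):
    """Count rows per (name, year), optionally keeping only some names."""
    counts = Counter(
        (name, year)
        for name, year in rows
        if keep is None or name in keep
    )
    return sorted((name, year, n) for (name, year), n in counts.items())

# Compare only two business types across the time span.
for name, year, quantity in quantities_by_year(ROWS, keep={"Baker", "Tailor"}):
    print(name, year, quantity)
```

The resulting triples line up with the Group, Date, and Size dimensions used in step 3.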
Part IV: RAW Visualization Types
Alluvial Diagram (Fineo-like)
Best for: showing relationships between individuals and/or categories
Dimensions: Steps, Size
Data types: numbers, strings, dates
Alluvial diagrams show correlations between categorical dimensions by visually linking elements that share the same categories. They can be used to show the evolution of clusters (movement of an element from one type of category to another) and to show groupings of elements that share common categories. Alluvial diagrams are also sometimes called Sankey diagrams or bipartite graphs. Compare to Parallel Coordinates.
Examples:
Raw’s design of the Alluvial Diagram is inspired by http://bost.ocks.org/mike/sankey.
Another example of the Alluvial Diagram in action can be seen in the People of Medieval Scotland 1093-1314 Relationships Explorer: http://db.poms.ac.uk/labs/connectionscloud.
Bump Chart
Best for: comparing quantitative changes over time between multiple individuals/groups
Dimensions: Group, Date, Size
Data types: numbers, strings, dates
Raw’s Bump Chart shows viewers quantitative change over time as compared between different data groups. These “groups” themselves can be numbers, strings, or dates and could represent a single entity (such as one person) or a group of entities (as in the first example, people born in a particular area in the US).
Examples:
Raw’s design of the Bump Chart is inspired by the New York Times’ visualization: http://www.nytimes.com/interactive/2014/08/13/upshot/where-people-in-each-state-were-born.html?_r=0.
Popular baby names: http://www.visualcinnamon.com/babynamesus.
Circle Packing
Best for: showing hierarchical structures and quantitative relationships between elements based on size and position
Dimensions: hierarchy, size, color, label
Data types: numbers, strings, dates
Circle packing, or nested circles, enables users to show hierarchical and quantitative relationships simultaneously. This visualization is particularly effective for showing proportions between elements at different levels of hierarchy. See also Clustered Force Layout.
Examples:
Raw’s design of the Circle Packing chart is inspired by Mike Bostock’s demonstration: http://bl.ocks.org/mbostock/4063530.
Jane Austen adaptations by year: https://janeaustendatavisualization.wordpress.com/2015/03/28/circle-packing-colour-pride-prejudice-adaptations-data/.
Circular Dendrogram
Best for: showing a large (wide) non-weighted hierarchy in a more compact way, especially when the hierarchy begins with more than one top level category
Dimensions: hierarchy
Data types: numbers, strings, dates
Dendrograms are tree-like diagrams used to represent the distribution of a non-weighted hierarchical clustering. This circular dendrogram places the highest hierarchical level, the single “root” above your data’s top categories, in the center. (If you have only one top category, this will appear as the root.) Each concentric ring moving outward is a progressive step down in the hierarchy. See also Cluster Dendrogram.
Examples:
Raw’s design of the Circular Dendrogram chart is inspired by Mike Bostock’s demonstration: http://bl.ocks.org/mbostock/4063570
Visualizing My Craft Beer Consumption: http://vizthinker.com/visualizing-my-craft-beer-consumption-with-circular-dendrograms/
Cluster Dendrogram
Best for: showing a non-weighted hierarchy
Dimensions: hierarchy
Data types: numbers, strings, dates
Dendrograms are tree-like diagrams used to represent the distribution of a non-weighted hierarchical clustering. The different depth levels represented by each node are visualized on the horizontal axis with the highest level of the hierarchy appearing on the left and radiating down the hierarchy to the right. See also Circular Dendrogram and Reingold-Tilford Tree.
Examples:
Raw’s design of the Cluster Dendrogram chart is inspired by Mike Bostock’s demonstration: http://bl.ocks.org/mbostock/4063570
Unified Astronomy Thesaurus: http://www.altbibl.io/astrothesaurus/uat/dendrogram.html
Clustered Force Layout
Best for: categorizing and comparing individual elements
Dimensions: clusters, size, label, color
Data types: numbers, strings, dates
Similar to Circle Packing, Clustered Force Layout enables users to show simultaneously categorical and quantitative relationships so that elements can be analyzed both within and across categories. See also Circle Packing.
Examples:
Raw’s design of the Clustered Force Layout chart is inspired by Mike Bostock’s demonstration: http://bl.ocks.org/mbostock/7882658
World’s Biggest Data Breaches: http://www.informationisbeautiful.net/visualizations/worlds-biggest-data-breaches-hacks/
Convex Hull
Best for: showing a dataset’s overall quantitative and/or temporal value on two axes
Dimensions: X axis, Y axis
Data types: numbers, dates
A convex hull is the smallest convex polygon that encloses every point in a scatter plot, in which two values (quantitative or temporal) are compared for each element. Each element is represented by a point inside or on the edge of the polygon.
Examples: Raw’s design of the Convex Hull chart is inspired by Mike Bostock’s demonstration: http://bl.ocks.org/mbostock/4341699
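For readers curious about how a convex hull can be computed from scatter plot points, here is a short Python sketch using Andrew's monotone chain method. This is one common technique chosen for illustration; the tutorial does not say which method RAW or D3.js uses, and the sample points are invented.

```python
def cross(o, a, b):
    """Positive when the turn o -> a -> b is counter-clockwise."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def half_hull(points):
    """Build one side of the hull, dropping points that turn clockwise."""
    chain = []
    for p in points:
        while len(chain) >= 2 and cross(chain[-2], chain[-1], p) <= 0:
            chain.pop()
        chain.append(p)
    return chain

def convex_hull(points):
    """Return hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower = half_hull(pts)
    upper = half_hull(reversed(pts))
    # The last point of each half is the first point of the other half.
    return lower[:-1] + upper[:-1]

# Hypothetical (x, y) values: four corners plus two interior points.
sample = [(0, 0), (4, 0), (4, 3), (0, 3), (2, 1), (1, 2)]
print(convex_hull(sample))
```

The two interior points are left out of the result because they do not lie on the boundary of the polygon.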
Delaunay Triangulation
Best for: showing a dataset’s overall quantitative and/or temporal value on two axes
Dimensions: X axis, Y axis
Data types: numbers, dates
This visualization is closely related to the Convex Hull and the Voronoi Tessellation. It shows elements quantitatively by comparing two of their quantitative values. Each element in the dataset is represented by a vertex. Vertices connect to form triangles, which together make up a planar mesh representing the entire dataset. The outer boundary of this mesh is the dataset’s convex hull, the smallest convex polygon that contains every point.
See also Voronoi tessellation and Convex Hull.
Examples: Raw’s design of the Delaunay Triangulation chart is inspired by Mike Bostock’s demonstration: http://bl.ocks.org/mbostock/4341156
Hexagonal Binning
Best for: showing a large dataset’s most common values (quantitative or temporal) in two variables (or categories)
Dimensions: X axis, Y axis
Data types: numbers, dates
Like a scatter plot, except that the plot area is divided into hexagons and each hexagon is shaded according to how many points fall inside it. The darker the hexagon, the more often that combination of numbers and/or dates occurs in the dataset.
Examples: Raw’s design of the Hexagonal Binning chart is inspired by Mike Bostock’s demonstration: http://bl.ocks.org/mbostock/4248145
Parallel Coordinates
Best for: comparing two or more quantitative attributes across a dataset while also using color to show categories
Dimensions: dimensions (columns), color
Data types: numbers, dates, string
Parallel Coordinates charts are a great way to analyze multivariate data, or data with multiple different values. Each column (dimension) in Parallel Coordinates represents a quantitative value, while the color can be qualitative. An element is represented by a line that moves from one column to the next. The line moves up or down along the columns depending on its value in each dimension. Compare to Alluvial Diagrams.
Examples: Raw’s design of the Parallel Coordinates chart is inspired by Jason Davies’s demonstration: http://bl.ocks.org/jasondavies/1341281
Women Writers Project: http://www.wwp.northeastern.edu/wwo/lab/textbase.html
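The way a line “moves up or down” in a parallel coordinates chart can be sketched as rescaling each column to a shared 0 to 1 range, so that every vertical axis has the same height. The Python example below is an illustration only: the column names and values are hypothetical, and RAW's own scaling has not been checked against it.

```python
def scale_columns(rows, columns):
    """Map each numeric column onto 0..1 so every axis shares one height."""
    ranges = {}
    for col in columns:
        values = [row[col] for row in rows]
        ranges[col] = (min(values), max(values))

    scaled = []
    for row in rows:
        line = []
        for col in columns:
            low, high = ranges[col]
            if high == low:
                # A constant column has no spread; place it mid-axis.
                line.append(0.5)
            else:
                line.append((row[col] - low) / (high - low))
        scaled.append(line)
    return scaled

# Hypothetical elements with two quantitative attributes each.
elements = [
    {"a": 0, "b": 10},
    {"a": 5, "b": 30},
    {"a": 10, "b": 20},
]
for line in scale_columns(elements, ["a", "b"]):
    # Each inner list is the height of one element's line on each axis.
    print(line)
```

Because each axis is scaled separately, columns measured in very different units can still be compared side by side.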
Reingold-Tilford Tree
Best for: showing a non-weighted hierarchy
Dimensions: hierarchy
Data types: numbers, dates, string
As with the circular dendrogram, this visualization shows hierarchies, beginning with a top-level “root”: the top of your hierarchy if there is only one top-level value, or a step above your dataset’s top levels if there is more than one value in your chosen hierarchy. The root appears at the left of the chart, and data categories (attributes) descend in hierarchical order to the right. Compare to the Cluster Dendrogram.
Examples: Raw’s design of the Reingold-Tilford Tree chart is inspired by Mike Bostock’s demonstration: http://bl.ocks.org/mbostock/4339184.
Scatter Plot
Best for: showing relationships between two quantitative or temporal values for an element
Dimensions: X axis, Y axis, size, color, label
Data types: numbers, dates, string
A scatter plot is a type of mathematical diagram that compares values of two variables (categories) for a dataset on an X-Y graph (Cartesian coordinates). Each element in the dataset is a point on the graph. The point’s position is determined by the X and Y values of the element.
Small Multiples (Area)
Best for: showing quantitative change over time
Dimensions: group, date, size
Data types: numbers, dates, string
A small multiple is a series of small similar graphics or charts, in this case a type of line graph. See also Streamgraph.
Examples: Raw’s design of the Small Multiples chart is inspired by Mike Bostock’s demonstration: http://bl.ocks.org/mbostock/9490313.
The Rise and Decline of Ask MetaFilter: http://www.projects.flowingdata.com/tut/linked_small_multiples_demo/.
Streamgraph
Best for: showing quantitative change over time
Dimensions: group, date, size
Data types: numbers, dates, string
A streamgraph can show continuous change over time and is comparable to a stacked area chart. See also Small Multiples.
Examples: Raw’s design of the Streamgraph chart is inspired by Mike Bostock’s demonstration: http://bl.ocks.org/mbostock/4060954.
Will Turman’s D3 Interactive Streamgraph: http://bl.ocks.org/WillTurman/4631136.
Treemap
Best for: showing categorized hierarchies
Dimensions: hierarchy, size, color, label
Data types: numbers, dates, string
A treemap shows hierarchies and proportions between a dataset’s elements. Hierarchical levels are clustered together. Large rectangles represent categories and are subdivided into further rectangles, stepping down the hierarchy until individual elements are reached.
Examples: Raw’s design of the Treemap chart is inspired by Mike Bostock’s demonstration: http://bl.ocks.org/mbostock/4063582.
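To show how a treemap turns quantities into rectangles, the Python sketch below splits one rectangle into strips whose areas are proportional to each item's value. This is the simple “slice” approach, used here for clarity; the demonstrations linked in this section may use a more elaborate tiling. The labels and values are hypothetical.

```python
def slice_layout(items, x, y, width, height, vertical=True):
    """Split a rectangle into strips with areas proportional to the values.

    items is a list of (label, value) pairs. Each result is
    (label, x, y, width, height).
    """
    total = sum(value for _, value in items)
    rects = []
    offset = 0.0
    for label, value in items:
        share = value / total
        if vertical:
            # Strips sit side by side and share the full height.
            rects.append((label, x + offset, y, width * share, height))
            offset += width * share
        else:
            # Strips are stacked and share the full width.
            rects.append((label, x, y + offset, width, height * share))
            offset += height * share
    return rects

# Hypothetical categories with their quantities.
categories = [("A", 2), ("B", 1), ("C", 1)]
for rect in slice_layout(categories, 0, 0, 100, 50):
    print(rect)
```

To step down a hierarchy, the same function could be called again on each strip with the direction flipped, using that category's own children as the items.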
Voronoi Tessellation
Best for: showing a dataset’s overall quantitative and/or temporal value on two axes
Dimensions: X axis, Y axis, color
Data types: numbers, dates, string
Voronoi Tessellation divides the chart into polygons, one around each point in a dataset, so that each polygon contains the area closer to its point than to any other point. The points are positioned by two variables (categories), as in a scatter plot. The polygons are helpful for seeing distance between points. See also Delaunay Triangulation.
Examples: Raw’s design of the Voronoi Tessellation chart is inspired by Mike Bostock’s demonstration: http://bl.ocks.org/mbostock/4060366.
Resources
- RAW FAQs: https://github.com/densitydesign/raw/wiki/FAQs
- Add your own d3.js visualization: https://github.com/densitydesign/raw/wiki/Adding-New-Charts
- Intro to Data Visualization (UCLA Center for Digital Humanities): http://dh101.humanities.ucla.edu/?page_id=40
- Data + Design: a simple introduction to preparing and visualizing data: https://infoactive.co/data-design