MPhil Lecture on Data Vis for Analysis

An Introduction to Data Visualisation for Analysis Exploring the Dataset - Textual, Numerical and Otherwise http://www.slideshare.net/shawnday/m-phil-datavisforanalysis


Transcript of MPhil Lecture on Data Vis for Analysis

Page 1: MPhil Lecture on Data Vis for Analysis

An Introduction to Data Visualisation for Analysis

Exploring the Dataset - Textual, Numerical and Otherwise

http://www.slideshare.net/shawnday/m-phil-datavisforanalysis

Page 2: MPhil Lecture on Data Vis for Analysis

Agenda

Thoughts from last week - wordpress.com?

Introduction

What do we mean by Data Analysis?

Some foundation terms and concepts

The Data Visualisation Process

Tools and Methods

Extending your toolset

An Exercise

Page 3: MPhil Lecture on Data Vis for Analysis

Objective

To appreciate the rich variety of techniques and tools available to digital humanities scholars for data visualisation and analysis. The intention is to be able to add tools to your arsenal and to have a sense of where to look for more.

Page 4: MPhil Lecture on Data Vis for Analysis

Breakpoint

One of the keys to good visualization is understanding what your immediate goals are.

Are you visualizing data to understand what’s in it, or are you trying to communicate meaning to others?

You - Visualisation for Data Analysis

Others - Visualisation for Presentation

Page 5: MPhil Lecture on Data Vis for Analysis

Speaking of Data Analysis

SPSS

SAS

OS Equivalents

Page 6: MPhil Lecture on Data Vis for Analysis

So Why Would You Want to Visualise Your Data?

Bypass language centres to tap directly into the visual cortex

Leverage ability to recognise patterns - what they call visual sense-making

Powerful graphics engines now allow for live data processing and sophisticated animations and interactive research environments

Sources: Geoff McGhee, Getting Started with Data Viz

Page 7: MPhil Lecture on Data Vis for Analysis

So Why Would You Want to Visualise Your Data?

Work with new data to create new knowledge

Explore data to discover things that used to be unknown, unknowable or impractical to know

Take a new perspective on the familiar to reveal previously hidden insights

Page 8: MPhil Lecture on Data Vis for Analysis

Visualising New Information

Tourists vs Locals, Eric Fischer, 2010 - Flickr

Page 9: MPhil Lecture on Data Vis for Analysis

Visualising New Information

Flickr Flow, Martin Wattenberg and Fernanda Viegas, 2009

Page 10: MPhil Lecture on Data Vis for Analysis

The Familiar through New Eyes

The Times Atlas

Page 11: MPhil Lecture on Data Vis for Analysis

How Could You Use Data Analysis?

“In the Lab” - for your own analysis

Online as part of collaborative groups

Through dissemination for extension of own work - crowdsourcing

Others?

Page 12: MPhil Lecture on Data Vis for Analysis

The Time Ribbon and the Tree Map

Page 13: MPhil Lecture on Data Vis for Analysis

Visualisation Objective

Exploring the ordinary life of rural pioneers in nineteenth-century Ontario

Page 14: MPhil Lecture on Data Vis for Analysis

Farm Journal

William Sunter Farm Diary, 1858

Page 15: MPhil Lecture on Data Vis for Analysis

Diaries: the raw materials

• 100s of pages

• Varying hands

• Varying quality

Page 16: MPhil Lecture on Data Vis for Analysis

The Process

• Generate word frequency (Voyeur, TAPoR)

• Isolate known farm activities (NLP - LanguageWare)

• Collocate to link activity references to time, duration, and resources (Voyeur)
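The first and third steps above (word frequency and collocation) can be sketched in a few lines. This is a minimal illustration using only the Python standard library, not how Voyeur or TAPoR work internally. The function names, the window size and the sample text are all assumptions made up for the example; the sample is not taken from the Sunter diary.

```python
import re
from collections import Counter


def tokenize(text):
    """Lower-case the text and return its alphabetic word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def word_frequencies(text):
    """Count how often each word occurs in the text."""
    return Counter(tokenize(text))


def collocates(text, keyword, window=3):
    """Count the words found within `window` tokens of each occurrence of keyword."""
    tokens = tokenize(text)
    counts = Counter()
    for i, token in enumerate(tokens):
        if token == keyword:
            start = max(0, i - window)
            neighbours = tokens[start:i] + tokens[i + 1:i + 1 + window]
            counts.update(neighbours)
    return counts


# Invented diary-style sample text (hypothetical, for illustration only).
sample = "Monday cut hay all day. Tuesday drew hay to the barn. Wednesday ploughing."
freq = word_frequencies(sample)
near_hay = collocates(sample, "hay", window=2)
```

Here `freq.most_common()` gives a ranked word list, and `near_hay` shows which words sit close to an activity term, which is the kind of link to days and resources the process describes.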

Page 17: MPhil Lecture on Data Vis for Analysis

Example: Medical Diary

Medical Diary by BlueChillies

Page 18: MPhil Lecture on Data Vis for Analysis

Example: History Flow

History flow by Martin Wattenberg and Fernanda Viegas

Page 19: MPhil Lecture on Data Vis for Analysis

The Result / New Patterns

Page 20: MPhil Lecture on Data Vis for Analysis

The Result / New Patterns

• Less time haying

• The impact of technology

• More tasks faster

Page 21: MPhil Lecture on Data Vis for Analysis

How Else Could This Be Done?

Page 22: MPhil Lecture on Data Vis for Analysis

What is the Value of this Visualisation?

• Easier to compare over intervals

• Multiple vectors with greater granularity in a compressed space

• The challenge is to find rich enough source materials to yield substantive datasets

Page 23: MPhil Lecture on Data Vis for Analysis

The Tree Map

Page 24: MPhil Lecture on Data Vis for Analysis

Example: Newsmap

Page 25: MPhil Lecture on Data Vis for Analysis

Example: Panopticon

Page 26: MPhil Lecture on Data Vis for Analysis

Case Study: Occupations of Politicians

• What are we studying?

–Self-declared occupations of politicians

• Why?

–What bias might they bring to their job?

• How?

–Visualising past occupations and mapping them to the political platform of each politician's party
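A tree map draws each category as a rectangle whose area is proportional to its count, so the case study reduces to counting occupations and dividing up a rectangle. The sketch below assumes the simplest possible one-level "slice" layout; it is not the layout algorithm of any tool named in these slides. The function name, the canvas size and the occupation list are invented for illustration.

```python
from collections import Counter


def slice_layout(counts, width=100.0, height=60.0):
    """Split a width x height rectangle into vertical slices.

    Each slice's area is proportional to its count. Returns a list of
    (label, x, y, slice_width, slice_height) tuples, largest count first.
    """
    total = sum(counts.values())
    x = 0.0
    rects = []
    for label, n in sorted(counts.items(), key=lambda kv: -kv[1]):
        w = width * n / total
        rects.append((label, x, 0.0, w, height))
        x += w
    return rects


# Invented self-declared occupations (hypothetical data, not real politicians).
occupations = ["teacher", "farmer", "teacher", "solicitor", "teacher", "farmer"]
counts = Counter(occupations)
rects = slice_layout(counts)
```

Running the same layout once per party or per parliament gives side-by-side rectangles that can be compared by eye.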

Page 27: MPhil Lecture on Data Vis for Analysis

Occupations of TDs in the 30th Dáil

Page 28: MPhil Lecture on Data Vis for Analysis

Occupations of MPs in the 2nd Parliament

Page 29: MPhil Lecture on Data Vis for Analysis

Occupations of MPs in the 37th Parliament

Page 30: MPhil Lecture on Data Vis for Analysis

The Result / New Patterns

• The emergence of the professional politician with no private sector experience

• Occupational continuity across changes in governing party

Page 31: MPhil Lecture on Data Vis for Analysis

How Else Could This Be Done?

Page 32: MPhil Lecture on Data Vis for Analysis

The Value of Data Vis for Analysis

• New ways of presenting allow new ways of seeing

• Hidden patterns become evident

• Suggests other hypotheses to test

Page 33: MPhil Lecture on Data Vis for Analysis

Basic Terms

Data mining

Statistics

Structured/Unstructured Data

Visualisation

Modelling

Page 34: MPhil Lecture on Data Vis for Analysis

Types of Data to Visualise

Audio Data

Categorical Data

Cartographic Data

Collections

Image Data

–Still

–Moving

Metadata

Multimedia Data

Network Data

–Social

–Other

Numerical Data

Temporal Data

Textual Data

–Narrative

–Qualitative

????

Page 35: MPhil Lecture on Data Vis for Analysis

General Steps in Data Vis for DH

Discovery / Acquisition

Cleaning / ‘Munging’

Analysis / Exploratory Vis

Presentation

Page 36: MPhil Lecture on Data Vis for Analysis

Discovery / Acquisition

Original Research

Spreadsheets

Databases

Digitized Media

Other Downloads

–Public Data

–Archives/Libraries

–Academic Partners

–Purchase

Scraping

–Junar

–Outwit Hub

–ScraperWiki

Page 38: MPhil Lecture on Data Vis for Analysis

Cleaning / Munging (Normalisation, Format Conversion)

Tools:

–Data Wrangler

–Google Refine

–Mr. Data Converter

Data Wrangler

Performs simple split, clear and fold/unfold transforms on data

See example --> Data and Script

Google Refine

Works with larger datasets
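To make the split and fold transforms concrete, here is a minimal sketch in plain Python. It only illustrates the idea and is not Data Wrangler's or Google Refine's implementation. The function names, column names and sample row are hypothetical.

```python
def split_column(rows, column, sep, new_names):
    """Split one delimited text column into several named columns."""
    out = []
    for row in rows:
        parts = [p.strip() for p in row[column].split(sep)]
        new_row = {k: v for k, v in row.items() if k != column}
        new_row.update(zip(new_names, parts))
        out.append(new_row)
    return out


def fold(rows, id_column, value_columns):
    """Fold wide columns into one (id, key, value) row per cell: wide to long."""
    return [
        {id_column: row[id_column], "key": col, "value": row[col]}
        for row in rows
        for col in value_columns
    ]


# Invented sample row (hypothetical place names and figures).
raw = [{"place": "Town A, Region X", "1901": "290", "1911": "305"}]
split_rows = split_column(raw, "place", ",", ["town", "region"])
tidy = fold(split_rows, "town", ["1901", "1911"])
```

The folded, one-observation-per-row shape is the layout that most charting tools expect as input.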

Page 42: MPhil Lecture on Data Vis for Analysis

Analysis / Exploratory Visualisation

Web Services

Google Fusion Tables

Google Spreadsheets

IBM ManyEyes

TimeFlow

Applications

Tableau/Tableau Public

MS Office

OpenOffice

Gephi

Node XL (plug-in for Excel)

Spotfire

R

Processing

Page 43: MPhil Lecture on Data Vis for Analysis

Google Ngram Viewer

Examine word frequency in digitised books

Currently about 4% of books ever published

In English, Chinese, French, German, Hebrew, Russian, and Spanish

Changes in word usage

Trends

Check out the Cultural Observatory @ Harvard
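Broadly, the trend lines in this kind of viewer show how often a word occurs relative to all words published in each year. The sketch below shows that calculation on a toy corpus. It is an illustration of the idea only, not Google's pipeline, and the function name and the two sample "years" of text are invented.

```python
import re


def relative_frequency(texts_by_year, word):
    """Return {year: share of all tokens in that year that equal `word`}."""
    result = {}
    for year, text in sorted(texts_by_year.items()):
        tokens = re.findall(r"[a-z']+", text.lower())
        result[year] = tokens.count(word) / len(tokens) if tokens else 0.0
    return result


# Invented toy corpus (hypothetical text, not drawn from any digitised book).
corpus = {
    1850: "the horse drew the plough",
    1950: "the tractor drew the plough and the tractor stopped",
}
trend = relative_frequency(corpus, "tractor")
```

Dividing by the yearly token total matters because the number of books differs from year to year; raw counts alone would mostly track how much was published.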

Page 45: MPhil Lecture on Data Vis for Analysis

Wordle

Visually present word frequency using size, weight, colour

Consider "Word Clouds Considered Harmful"
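The core of a word cloud is a mapping from word count to font size. A minimal sketch follows, assuming simple linear scaling between a smallest and largest point size; Wordle's actual sizing and layout rules are not described in these slides, and the function name, the point sizes and the counts are invented.

```python
from collections import Counter


def font_sizes(counts, min_pt=10, max_pt=48):
    """Scale each word's count linearly into a font size between min_pt and max_pt."""
    lowest, highest = min(counts.values()), max(counts.values())
    span = (highest - lowest) or 1  # avoid dividing by zero when all counts are equal
    return {
        word: min_pt + (max_pt - min_pt) * (n - lowest) / span
        for word, n in counts.items()
    }


# Invented word counts (hypothetical).
counts = Counter({"hay": 5, "plough": 3, "barn": 1})
sizes = font_sizes(counts)
```

Because only size carries the data here, a word that is slightly more frequent can look much more important than it is, which is one reason to treat word clouds with caution.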

Page 46: MPhil Lecture on Data Vis for Analysis

Exercise

Choose a dataset from a source such as:

The CSO

Project Gutenberg

or your own material

Choose an appropriate data visualisation from a web service we explored in the workshop.

Explain the process and how you made your choice, and embed the visualisation in your own blog using wordpress.com as we explored last week.

Suggest a research question that can be answered by using this data visualisation as a research environment.

Send the link to me at: [email protected]

Maybe: http://politicalreform.ie/2011/12/04/state-of-enda-sunday-business-post-red-c-poll-4th-september-2011/