Info vis 4-22-2013-dc-vis-meetup-shneiderman

Information Visualization for Knowledge Discovery. Ben Shneiderman, [email protected], @benbendc. Founding Director (1983-2000), Human-Computer Interaction Lab; Professor, Department of Computer Science; Member, Institute for Advanced Computer Studies


Slide show on information visualization for the Data Visualization Meetup in Washington, DC, during Big Data Week (www.bigdataweek.com), April 22, 2013

Transcript of Info vis 4-22-2013-dc-vis-meetup-shneiderman

Page 1: Info vis 4-22-2013-dc-vis-meetup-shneiderman

Information Visualization for Knowledge Discovery

Ben Shneiderman [email protected] @benbendc

Founding Director (1983-2000), Human-Computer Interaction Lab

Professor, Department of Computer Science

Member, Institute for Advanced Computer Studies

University of Maryland, College Park, MD 20742

Page 2: Info vis 4-22-2013-dc-vis-meetup-shneiderman

Turning Messy BigData into Actionable SmallData

@benbendc

University of Maryland, College Park, MD 20742

Page 3: Info vis 4-22-2013-dc-vis-meetup-shneiderman

Interdisciplinary research community - Computer Science & Info Studies - Psych, Socio, Poli Sci & MITH (www.cs.umd.edu/hcil)

Page 4: Info vis 4-22-2013-dc-vis-meetup-shneiderman

Design Issues

• Input devices & strategies

• Keyboards, pointing devices, voice

• Direct manipulation

• Menus, forms, commands

• Output devices & formats

• Screens, windows, color, sound

• Text, tables, graphics

• Instructions, messages, help

• Collaboration & Social Media

• Help, tutorials, training

• Search

• Visualization

www.awl.com/DTUI

Fifth Edition: 2010

Page 5: Info vis 4-22-2013-dc-vis-meetup-shneiderman

HCI Pride: Serving 5B Users

Mobile, desktop, web, cloud

Diverse users: novice/expert, young/old, literate/illiterate, abled/disabled, cultural, ethnic & linguistic diversity, gender, personality, skills, motivation, ...

Diverse applications: E-commerce, law, health/wellness, education, creative arts, community relationships, politics, IT4ID, policy negotiation, mediation, peace studies, ...

Diverse interfaces: Ubiquitous, pervasive, embedded, tangible, invisible, multimodal, immersive/augmented/virtual, ambient, social, affective, empathic, persuasive, ...

Page 6: Info vis 4-22-2013-dc-vis-meetup-shneiderman

Obama Unveils “Big Data” Initiative (3/2012)

 Big Data challenges:

•Developing scalable algorithms for processing imperfect data in distributed data stores

•Creating effective human-computer interaction tools for facilitating rapidly customizable visual reasoning for diverse missions.

http://www.whitehouse.gov/sites/default/files/microsites/ostp/big_data_press_release_final_2.pdf

Page 7: Info vis 4-22-2013-dc-vis-meetup-shneiderman

Information Visualization & Visual Analytics

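• Visual bandwidth is enormous

• Human perceptual skills are remarkable

• Trend, cluster, gap, outlier...

• Color, size, shape, proximity...

• Three challenges

• Meaningful visual displays of massive data

• Interaction: widgets & window coordination

• Process models for discovery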

1999

Page 8: Info vis 4-22-2013-dc-vis-meetup-shneiderman

Information Visualization & Visual Analytics

• Visual bandwidth is enormous

• Human perceptual skills are remarkable

• Trend, cluster, gap, outlier...

• Color, size, shape, proximity...

• Three challenges

• Meaningful visual displays of massive data

• Interaction: widgets & window coordination

• Process models for discovery

1999 2004

Page 9: Info vis 4-22-2013-dc-vis-meetup-shneiderman

Information Visualization & Visual Analytics

• Visual bandwidth is enormous

• Human perceptual skills are remarkable

• Trend, cluster, gap, outlier...

• Color, size, shape, proximity...

• Three challenges

• Meaningful visual displays of massive data

• Interaction: widgets & window coordination

• Process models for discovery

1999 2004 2010

Page 10: Info vis 4-22-2013-dc-vis-meetup-shneiderman

Business takes action

• General Dynamics buys MayaViz

• Agilent buys GeneSpring

• Google buys Gapminder

• Oracle buys Hyperion

• Microsoft buys Proclarity

• InfoBuilders buys Advizor Solutions

• SAP buys (Business Objects buys Xcelsius & Inxight & Crystal Reports)

• IBM buys (Cognos buys Celequest) & ILOG

• TIBCO buys Spotfire

Page 11: Info vis 4-22-2013-dc-vis-meetup-shneiderman

Spotfire: Retinol’s role in embryos & vision

Page 12: Info vis 4-22-2013-dc-vis-meetup-shneiderman

Spotfire: DC natality data

Page 13: Info vis 4-22-2013-dc-vis-meetup-shneiderman

http://registration.spotfire.com/eval/default_edu.asp

Page 14: Info vis 4-22-2013-dc-vis-meetup-shneiderman

10M - 100M pixels: Large displays

Page 15: Info vis 4-22-2013-dc-vis-meetup-shneiderman

100M-pixels & more

Page 16: Info vis 4-22-2013-dc-vis-meetup-shneiderman

1M-pixels & less Small mobile devices

Page 17: Info vis 4-22-2013-dc-vis-meetup-shneiderman

Information Visualization: Mantra

• Overview, zoom & filter, details-on-demand

• Overview, zoom & filter, details-on-demand

• Overview, zoom & filter, details-on-demand

• Overview, zoom & filter, details-on-demand

• Overview, zoom & filter, details-on-demand

• Overview, zoom & filter, details-on-demand

• Overview, zoom & filter, details-on-demand

• Overview, zoom & filter, details-on-demand

• Overview, zoom & filter, details-on-demand

• Overview, zoom & filter, details-on-demand

Page 18: Info vis 4-22-2013-dc-vis-meetup-shneiderman

Information Visualization: Data Types

• 1-D Linear Document Lens, SeeSoft, Info Mural

• 2-D Map GIS, ArcView, PageMaker, Medical imagery

• 3-D World CAD, Medical, Molecules, Architecture

• Multi-Var Spotfire, Tableau, Qliktech, Visual Insight

• Temporal LifeLines, TimeSearcher, Palantir, DataMontage

• Tree Cone/Cam/Hyperbolic, SpaceTree, Treemap

• Network Pajek, UCINet, NodeXL, Gephi, Tom Sawyer

InfoViz, SciViz

infosthetics.com visualcomplexity.com eagereyes.org flowingdata.com perceptualedge.com datakind.org visual.ly Visualizing.org infovis.org

Page 19: Info vis 4-22-2013-dc-vis-meetup-shneiderman

Anscombe’s Quartet

1              2              3              4
x      y       x      y       x      y       x      y
10.0   8.04    10.0   9.14    10.0   7.46    8.0    6.58
8.0    6.95    8.0    8.14    8.0    6.77    8.0    5.76
13.0   7.58    13.0   8.74    13.0   12.74   8.0    7.71
9.0    8.81    9.0    8.77    9.0    7.11    8.0    8.84
11.0   8.33    11.0   9.26    11.0   7.81    8.0    8.47
14.0   9.96    14.0   8.10    14.0   8.84    8.0    7.04
6.0    7.24    6.0    6.13    6.0    6.08    8.0    5.25
4.0    4.26    4.0    3.10    4.0    5.39    19.0   12.50
12.0   10.84   12.0   9.13    12.0   8.15    8.0    5.56
7.0    4.82    7.0    7.26    7.0    6.42    8.0    7.91
5.0    5.68    5.0    4.74    5.0    5.73    8.0    6.89

Page 20: Info vis 4-22-2013-dc-vis-meetup-shneiderman

Anscombe’s Quartet

1              2              3              4
x      y       x      y       x      y       x      y
10.0   8.04    10.0   9.14    10.0   7.46    8.0    6.58
8.0    6.95    8.0    8.14    8.0    6.77    8.0    5.76
13.0   7.58    13.0   8.74    13.0   12.74   8.0    7.71
9.0    8.81    9.0    8.77    9.0    7.11    8.0    8.84
11.0   8.33    11.0   9.26    11.0   7.81    8.0    8.47
14.0   9.96    14.0   8.10    14.0   8.84    8.0    7.04
6.0    7.24    6.0    6.13    6.0    6.08    8.0    5.25
4.0    4.26    4.0    3.10    4.0    5.39    19.0   12.50
12.0   10.84   12.0   9.13    12.0   8.15    8.0    5.56
7.0    4.82    7.0    7.26    7.0    6.42    8.0    7.91
5.0    5.68    5.0    4.74    5.0    5.73    8.0    6.89

Property            Value
Mean of x           9.0
Variance of x       11.0
Mean of y           7.5
Variance of y       4.12
Correlation         0.816
Linear regression   y = 3 + 0.5x
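
The shared properties can be checked directly from the table. The following is a minimal sketch using only the Python standard library; the helper name `summarize` and the dictionary layout are illustrative choices, not part of the slides. It uses the sample variance (n - 1 denominator), which is what yields the 11.0 listed for x.

```python
from statistics import mean, variance

# Anscombe's quartet, as listed on the slide.
X_SHARED = [10.0, 8.0, 13.0, 9.0, 11.0, 14.0, 6.0, 4.0, 12.0, 7.0, 5.0]
QUARTET = {
    1: (X_SHARED, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    2: (X_SHARED, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    3: (X_SHARED, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    4: ([8.0] * 7 + [19.0] + [8.0] * 3,
        [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}


def summarize(xs, ys):
    """Return the slide's summary properties for one (x, y) data set."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx  # least-squares slope of y on x
    return {
        "mean_x": mx,
        "var_x": variance(xs),  # sample variance, n - 1 denominator
        "mean_y": my,
        "var_y": variance(ys),
        "correlation": sxy / (sxx * syy) ** 0.5,
        "slope": slope,
        "intercept": my - slope * mx,
    }


for label, (xs, ys) in QUARTET.items():
    stats = summarize(xs, ys)
    print(label, {name: round(value, 2) for name, value in stats.items()})
```

All four data sets agree on these numbers to about two decimal places, which is why the summary table alone cannot distinguish them and a plot is needed.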

Page 21: Info vis 4-22-2013-dc-vis-meetup-shneiderman

Anscombe’s Quartet

Page 22: Info vis 4-22-2013-dc-vis-meetup-shneiderman

Temporal Data: TimeSearcher 1.3

• Time series

• Stocks

• Weather

• Genes

• User-specified patterns

• Rapid search

Page 23: Info vis 4-22-2013-dc-vis-meetup-shneiderman

Temporal Data: TimeSearcher 2.0

• Long Time series (>10,000 time points)

• Multiple variables

• Controlled precision in match (Linear, offset, noise, amplitude)
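
The idea of matching a user-specified pattern with controlled precision can be sketched as a sliding-window search. This is an illustration under stated assumptions, not TimeSearcher's actual algorithm: the function names `normalize` and `find_matches`, the `tolerance` parameter (standing in for allowed noise), and the two flags for offset and amplitude are all hypothetical, and the "Linear" option from the slide is not modeled.

```python
from statistics import mean, pstdev


def normalize(window, ignore_offset, ignore_amplitude):
    """Optionally remove the vertical offset and the amplitude of a window."""
    values = list(window)
    if ignore_offset:
        center = mean(values)
        values = [v - center for v in values]
    if ignore_amplitude:
        spread = pstdev(values)
        if spread > 0:  # leave flat windows unscaled
            values = [v / spread for v in values]
    return values


def find_matches(series, pattern, tolerance,
                 ignore_offset=True, ignore_amplitude=False):
    """Return start indices where every point is within `tolerance` of the pattern."""
    target = normalize(pattern, ignore_offset, ignore_amplitude)
    length = len(pattern)
    hits = []
    for start in range(len(series) - length + 1):
        window = normalize(series[start:start + length],
                           ignore_offset, ignore_amplitude)
        worst = max(abs(a - b) for a, b in zip(window, target))
        if worst <= tolerance:
            hits.append(start)
    return hits


# Hypothetical data: the same bump appears at two different baseline levels.
series = [0, 0, 1, 2, 1, 0, 0, 10, 11, 12, 11, 10, 0]
pattern = [0, 1, 2, 1, 0]
print(find_matches(series, pattern, tolerance=0.1))
print(find_matches(series, pattern, tolerance=0.1, ignore_offset=False))
```

With offset ignored, both bumps match; with offset respected, only the bump at the pattern's own baseline does. Loosening `tolerance` admits noisier matches.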

Page 24: Info vis 4-22-2013-dc-vis-meetup-shneiderman

LifeLines: Patient Histories

www.cs.umd.edu/hcil/lifelines

Page 25: Info vis 4-22-2013-dc-vis-meetup-shneiderman

LifeLines2: Align-Rank-Filter & Summarize
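
The three operations named in the title can be sketched on a toy set of records. This is a minimal illustration, assuming each record is a list of (event name, time) pairs; the function names, event names, and record IDs are hypothetical and do not reflect the LifeLines2 implementation.

```python
def align(records, sentinel):
    """Shift each record so the first `sentinel` event is at time 0.

    Records that never contain the sentinel event are dropped.
    """
    aligned = {}
    for record_id, events in records.items():
        times = [time for name, time in events if name == sentinel]
        if not times:
            continue
        zero = min(times)
        aligned[record_id] = [(name, time - zero) for name, time in events]
    return aligned


def rank(records, event_name):
    """Order record IDs by how often `event_name` occurs, most frequent first."""
    def occurrences(record_id):
        return sum(1 for name, _ in records[record_id] if name == event_name)
    return sorted(records, key=occurrences, reverse=True)


def filter_followed_by(aligned, event_name, within):
    """Keep aligned records where `event_name` occurs in (0, within] after alignment."""
    return {
        record_id: events
        for record_id, events in aligned.items()
        if any(name == event_name and 0 < time <= within for name, time in events)
    }


# Hypothetical patient records: (event, day).
records = {
    "p1": [("Admit", 2), ("Drug", 5), ("Test", 6)],
    "p2": [("Test", 1), ("Admit", 4), ("Test", 5), ("Test", 9)],
    "p3": [("Test", 3)],
}
aligned = align(records, "Admit")
print(aligned)
print(rank(aligned, "Test"))
print(filter_followed_by(aligned, "Test", within=2))
```

Aligning first makes time relative to the sentinel event, so "within 2 days after admission" becomes a simple range check on the shifted times.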

Page 26: Info vis 4-22-2013-dc-vis-meetup-shneiderman

LifeFlow: Aggregation Strategy

Temporal Categorical Data (4 records)

LifeLines2 format

Tree of Event Sequences

LifeFlow Aggregation

www.cs.umd.edu/hcil/lifeflow
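
The aggregation step, from individual records to a tree of event sequences, can be sketched as a prefix tree in which each node counts the records that share that sequence prefix. This is a minimal sketch, assuming records are lists of (event type, time) pairs; the `Node` class, the `mean_gap` field, and the four sample records are hypothetical and not taken from the LifeFlow code.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    count: int = 0            # number of records passing through this prefix
    total_gap: float = 0.0    # summed time since the previous event
    children: dict = field(default_factory=dict)

    @property
    def mean_gap(self):
        return self.total_gap / self.count if self.count else 0.0


def aggregate(records):
    """Merge records into a tree keyed by event-type sequence."""
    root = Node()
    for events in records:
        root.count += 1
        node, previous_time = root, None
        for event_type, time in sorted(events, key=lambda event: event[1]):
            child = node.children.setdefault(event_type, Node())
            child.count += 1
            if previous_time is not None:
                child.total_gap += time - previous_time
            previous_time = time
            node = child
    return root


def show(node, name="(all records)", depth=0):
    """Print the tree with record counts and mean time gaps."""
    print("  " * depth + f"{name}: n={node.count}, mean gap={node.mean_gap:.1f}")
    for child_name, child in node.children.items():
        show(child, child_name, depth + 1)


# Four hypothetical records, echoing the "4 records" on the slide.
records = [
    [("Arrival", 0), ("ER", 1), ("ICU", 3)],
    [("Arrival", 0), ("ER", 2), ("Floor", 5)],
    [("Arrival", 0), ("ER", 3), ("ICU", 4)],
    [("Arrival", 0), ("Floor", 6)],
]
tree = aggregate(records)
show(tree)
```

Each node's count could drive the height of a bar and its mean gap the horizontal spacing, so many records collapse into one summary view.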

Page 27: Info vis 4-22-2013-dc-vis-meetup-shneiderman

LifeFlow: Interface with User Controls

Page 28: Info vis 4-22-2013-dc-vis-meetup-shneiderman
Page 29: Info vis 4-22-2013-dc-vis-meetup-shneiderman
Page 30: Info vis 4-22-2013-dc-vis-meetup-shneiderman
Page 31: Info vis 4-22-2013-dc-vis-meetup-shneiderman
Page 32: Info vis 4-22-2013-dc-vis-meetup-shneiderman
Page 33: Info vis 4-22-2013-dc-vis-meetup-shneiderman

EventFlow: Original Dataset

Page 34: Info vis 4-22-2013-dc-vis-meetup-shneiderman

LABA_ICSs Merged

Page 35: Info vis 4-22-2013-dc-vis-meetup-shneiderman

SABAs Merged

Page 36: Info vis 4-22-2013-dc-vis-meetup-shneiderman

Align by First LABA_ICS

Page 37: Info vis 4-22-2013-dc-vis-meetup-shneiderman

Reduce Window Size

Page 38: Info vis 4-22-2013-dc-vis-meetup-shneiderman

EventFlow Team: Oracle support

www.cs.umd.edu/hcil/eventflow

www.umdrightnow.umd.edu/news/umd-research-team-developing-powerful-data-visualization-tool-support-oracle

Page 39: Info vis 4-22-2013-dc-vis-meetup-shneiderman

Treemap: Gene Ontology

www.cs.umd.edu/hcil/treemap/

+ Space filling

+ Space limited

+ Color coding

+ Size coding

- Requires learning

(Shneiderman, ACM Trans. on Graphics, 1992 & 2003)
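
The space-filling and size-coding properties can be illustrated with a slice-and-dice layout, which alternates the split direction at each level of the hierarchy. This is a minimal sketch, assuming a tree of nested dictionaries with positive leaf sizes; the node format, function names, and sample tree are hypothetical, and other treemap layouts (such as squarified) work differently.

```python
def total_size(node):
    """Sum of leaf sizes under a node."""
    if "children" in node:
        return sum(total_size(child) for child in node["children"])
    return node["size"]


def slice_and_dice(node, x, y, width, height, depth=0):
    """Return (name, x, y, width, height) for every leaf.

    Even depths split the rectangle left to right, odd depths top to bottom,
    so each leaf's area is proportional to its size.
    """
    if "children" not in node:
        return [(node["name"], x, y, width, height)]
    rectangles = []
    total = total_size(node)
    offset = 0.0  # fraction of the parent already used
    for child in node["children"]:
        share = total_size(child) / total
        if depth % 2 == 0:
            rectangles += slice_and_dice(
                child, x + offset * width, y, share * width, height, depth + 1)
        else:
            rectangles += slice_and_dice(
                child, x, y + offset * height, width, share * height, depth + 1)
        offset += share
    return rectangles


# Hypothetical hierarchy with sizes 6, 3 and 1.
tree = {"name": "root", "children": [
    {"name": "A", "size": 6},
    {"name": "B", "children": [
        {"name": "B1", "size": 3},
        {"name": "B2", "size": 1},
    ]},
]}
layout = slice_and_dice(tree, 0, 0, 100, 50)
for name, x, y, width, height in layout:
    print(name, round(x, 1), round(y, 1), round(width, 1), round(height, 1))
```

The rectangles tile the full 100 by 50 area with no gaps, and every leaf receives area in proportion to its size; color can then encode a second attribute.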

Page 40: Info vis 4-22-2013-dc-vis-meetup-shneiderman

www.smartmoney.com/marketmap

Treemap: Smartmoney MarketMap

Page 41: Info vis 4-22-2013-dc-vis-meetup-shneiderman

Market falls steeply Feb 27, 2007, with one exception

Page 42: Info vis 4-22-2013-dc-vis-meetup-shneiderman

Market falls steeply Sept 22, 2011, some exceptions

Page 43: Info vis 4-22-2013-dc-vis-meetup-shneiderman

Market mixed, February 8, 2008 Energy & Technology up, Financial & Health Care down

Page 44: Info vis 4-22-2013-dc-vis-meetup-shneiderman

Market rises, September 1, 2010, Gold contrarians

Page 45: Info vis 4-22-2013-dc-vis-meetup-shneiderman

Market rises, March 21, 2011, Sprint declines

Page 46: Info vis 4-22-2013-dc-vis-meetup-shneiderman

newsmap.jp

Treemap: Newsmap (Marcos Weskamp)

Page 47: Info vis 4-22-2013-dc-vis-meetup-shneiderman

Treemap: WHC Emergency Room (6304 patients in Jan 2006)

Group by Admissions/MF, size by service time, color by age

Page 48: Info vis 4-22-2013-dc-vis-meetup-shneiderman

Treemap: WHC Emergency Room (6304 patients in Jan 2006) (only those with service time > 12 hours)

Group by Admissions/MF, size by service time, color by age

Page 49: Info vis 4-22-2013-dc-vis-meetup-shneiderman

www.hivegroup.com

Treemap: Supply Chain

Page 50: Info vis 4-22-2013-dc-vis-meetup-shneiderman

www.hivegroup.com

Treemap: Nutritional Analysis

Page 51: Info vis 4-22-2013-dc-vis-meetup-shneiderman

www.spotfire.com

Treemap: Spotfire Bond Portfolio Analysis

Page 52: Info vis 4-22-2013-dc-vis-meetup-shneiderman

Treemap: NY Times – Car & Truck Sales

www.cs.umd.edu/hcil/treemap/

Page 53: Info vis 4-22-2013-dc-vis-meetup-shneiderman

Treemap (Voronoi): NY Times - Inflation

www.nytimes.com/interactive/2008/05/03/business/20080403_SPENDING_GRAPHIC.html

Page 54: Info vis 4-22-2013-dc-vis-meetup-shneiderman
Page 55: Info vis 4-22-2013-dc-vis-meetup-shneiderman

VisualComplexity.com : Manuel Lima

Page 56: Info vis 4-22-2013-dc-vis-meetup-shneiderman

SocialAction

• Integrates statistics & visualization

• 4 case studies, 4-8 weeks (journalist, bibliometrician, terrorist analyst, organizational analyst)

• Identified desired features, gave strong positive feedback about benefits of integration

Perer & Shneiderman, CHI2008, IEEE CG&A 2009

www.cs.umd.edu/hcil/socialaction

Page 57: Info vis 4-22-2013-dc-vis-meetup-shneiderman

www.centrifugesystems.com

Network from Database Tables

Page 58: Info vis 4-22-2013-dc-vis-meetup-shneiderman

NodeXL: Network Overview for Discovery & Exploration in Excel

www.codeplex.com/nodexl

Page 59: Info vis 4-22-2013-dc-vis-meetup-shneiderman

NodeXL: Network Overview for Discovery & Exploration in Excel

www.codeplex.com/nodexl

Page 60: Info vis 4-22-2013-dc-vis-meetup-shneiderman

NodeXL: Import Dialogs

www.codeplex.com/nodexl

Page 61: Info vis 4-22-2013-dc-vis-meetup-shneiderman

Tweets at #WIN09 Conference: 2 groups

Page 62: Info vis 4-22-2013-dc-vis-meetup-shneiderman

Flickr networks

Page 63: Info vis 4-22-2013-dc-vis-meetup-shneiderman

Twitter discussion of #GOP

Red: Republicans, anti-Obama, mention Fox

Blue: Democrats, pro-Obama, mention CNN

Green: non-affiliated

Node size is number of followers

Politico is major bridging group

Page 64: Info vis 4-22-2013-dc-vis-meetup-shneiderman

Analogy: Clusters Are Occluded

Hard to count nodes, clusters

Page 65: Info vis 4-22-2013-dc-vis-meetup-shneiderman

Separate Clusters Are More Comprehensible

Page 66: Info vis 4-22-2013-dc-vis-meetup-shneiderman

Twitter networks: #SOTU

Page 67: Info vis 4-22-2013-dc-vis-meetup-shneiderman

Group-In-A-Box: Twitter Network for #CI2012

Page 68: Info vis 4-22-2013-dc-vis-meetup-shneiderman

Twitter Network for “TTW”

Page 69: Info vis 4-22-2013-dc-vis-meetup-shneiderman

Pennsylvania Innovation Network

Page 70: Info vis 4-22-2013-dc-vis-meetup-shneiderman

Legend: PatentTech; SBIR (federal); PA DCED (state); Related patent

2: Federal agency; 3: Enterprise; 5: Inventors; 9: Universities; 10: PA DCED; 11/12: Phil/Pitt metro cnty; 13-15: Semi-rural/rural cnty; 17: Foreign countries; 19: Other states

Labels: Pittsburgh Metro; Westinghouse Electric; Pharmaceutical/Medical; No Location; Philadelphia; Navy

Page 71: Info vis 4-22-2013-dc-vis-meetup-shneiderman

Innovation Patterns: 11,000 vertices, 26,000 edges

Page 72: Info vis 4-22-2013-dc-vis-meetup-shneiderman

Legend: PatentTech; SBIR (federal); PA DCED (state); Related patent

2: Federal agency; 3: Enterprise; 5: Inventors; 9: Universities; 10: PA DCED; 11/12: Phil/Pitt metro cnty; 13-15: Semi-rural/rural cnty; 17: Foreign countries; 19: Other states

Labels: Pittsburgh Metro; Westinghouse Electric; Pharmaceutical/Medical; No Location; Philadelphia; Navy

Innovation Clusters: People, Locations, Companies

Page 73: Info vis 4-22-2013-dc-vis-meetup-shneiderman
Page 74: Info vis 4-22-2013-dc-vis-meetup-shneiderman

Interactive Methods to Reveal Patterns

Filtering: Node & link attribute values or statistics

Clustering: Cluster algorithmically by link connectivity

Grouping: Group based on node attributes

Motif Simplification: Common, meaningful structures replaced with simplified glyphs

Page 75: Info vis 4-22-2013-dc-vis-meetup-shneiderman

Senate Co-Voting

Page 76: Info vis 4-22-2013-dc-vis-meetup-shneiderman

Group-In-A-Box by Region

Page 77: Info vis 4-22-2013-dc-vis-meetup-shneiderman

Interactive Methods to Reveal Patterns

Filtering: Node & link attribute values or statistics

Clustering: Cluster algorithmically by link connectivity

Grouping: Group based on node attributes

Motif Simplification: Common, meaningful structures replaced with simplified glyphs

Page 78: Info vis 4-22-2013-dc-vis-meetup-shneiderman

Motif Simplification

(a) Fan motifs & glyphs (b) Connector motifs & glyphs
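
Finding the two motif types can be sketched on an undirected edge list. The sketch assumes a fan is a head node with several neighbors that connect to nothing else, and a connector is a set of nodes that all link exactly the same two or more anchor nodes; these definitions, the function names, and the sample graph are assumptions made for illustration, not the NodeXL implementation.

```python
from collections import defaultdict


def build_adjacency(edges):
    """Map each node to the set of its neighbors (undirected)."""
    adjacency = defaultdict(set)
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)
    return adjacency


def find_fans(adjacency, min_leaves=2):
    """Return {head: leaves} where each leaf's only neighbor is the head."""
    fans = defaultdict(set)
    for node, neighbors in adjacency.items():
        if len(neighbors) == 1:
            (head,) = neighbors
            fans[head].add(node)
    return {head: leaves for head, leaves in fans.items()
            if len(leaves) >= min_leaves}


def find_connectors(adjacency, min_members=2):
    """Return {anchor set: members} for nodes sharing an identical neighbor set."""
    groups = defaultdict(set)
    for node, neighbors in adjacency.items():
        if len(neighbors) >= 2:
            groups[frozenset(neighbors)].add(node)
    return {anchors: members for anchors, members in groups.items()
            if len(members) >= min_members}


# Hypothetical graph: H has three leaves; p and q both bridge X and Y.
edges = [("H", "a"), ("H", "b"), ("H", "c"), ("H", "X"), ("X", "Y"),
         ("X", "p"), ("Y", "p"), ("X", "q"), ("Y", "q")]
adjacency = build_adjacency(edges)
print(find_fans(adjacency))
print(find_connectors(adjacency))
```

Each detected group could then be drawn as a single glyph sized by its member count, which removes nodes and links from the display without hiding the structure.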

Page 79: Info vis 4-22-2013-dc-vis-meetup-shneiderman

Motif Simplification

Page 80: Info vis 4-22-2013-dc-vis-meetup-shneiderman

Motif Simplification

Page 81: Info vis 4-22-2013-dc-vis-meetup-shneiderman

Clique Motifs & Glyphs: 4, 5 & 6

Page 82: Info vis 4-22-2013-dc-vis-meetup-shneiderman

Senate Co-Voting: 65% Agreement

Page 83: Info vis 4-22-2013-dc-vis-meetup-shneiderman

Senate Co-Voting: 70% Agreement

Page 84: Info vis 4-22-2013-dc-vis-meetup-shneiderman

Senate Co-Voting: 80% Agreement

Page 85: Info vis 4-22-2013-dc-vis-meetup-shneiderman

Senate Co-Voting: 90% Agreement

Page 86: Info vis 4-22-2013-dc-vis-meetup-shneiderman

Senate Co-Voting: 95% Agreement
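
The effect of raising the agreement threshold across these slides can be sketched as edge filtering followed by a connected-components count. The agreement values and node names below are made up for illustration and are not Senate data; the function names are likewise hypothetical.

```python
def filter_edges(agreement, threshold):
    """Keep pairs whose agreement share is at least `threshold`."""
    return [pair for pair, share in agreement.items() if share >= threshold]


def connected_components(nodes, edges):
    """Group nodes into connected components with a depth-first search."""
    neighbors = {node: set() for node in nodes}
    for a, b in edges:
        neighbors[a].add(b)
        neighbors[b].add(a)
    seen, components = set(), []
    for start in nodes:
        if start in seen:
            continue
        stack, group = [start], set()
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            group.add(node)
            stack.extend(neighbors[node] - seen)
        components.append(group)
    return components


# Hypothetical co-voting shares between six legislators.
nodes = ["A", "B", "C", "D", "E", "F"]
agreement = {
    ("A", "B"): 0.97, ("A", "C"): 0.85, ("B", "C"): 0.82,
    ("C", "D"): 0.68,
    ("D", "E"): 0.93, ("D", "F"): 0.91, ("E", "F"): 0.96,
}
for threshold in (0.65, 0.70, 0.80, 0.90, 0.95):
    groups = connected_components(nodes, filter_edges(agreement, threshold))
    print(f"{threshold:.0%}: {len(groups)} component(s)")
```

At low thresholds everything is one dense component; as the threshold rises, weak cross-group links drop out first and the remaining structure becomes readable.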

Page 87: Info vis 4-22-2013-dc-vis-meetup-shneiderman

Analyzing Social Media Networks with NodeXL

I. Getting Started with Analyzing Social Media Networks
1. Introduction to Social Media and Social Networks
2. Social Media: New Technologies of Collaboration
3. Social Network Analysis

II. NodeXL Tutorial: Learning by Doing
4. Layout, Visual Design & Labeling
5. Calculating & Visualizing Network Metrics
6. Preparing Data & Filtering
7. Clustering & Grouping

III. Social Media Network Analysis Case Studies
8. Email
9. Threaded Networks
10. Twitter
11. Facebook
12. WWW
13. Flickr
14. YouTube
15. Wiki Networks

www.elsevier.com/wps/find/bookdescription.cws_home/723354/description

Page 88: Info vis 4-22-2013-dc-vis-meetup-shneiderman

Social Media Research Foundation

Researchers who want to
- create open tools
- generate & host open data
- support open scholarship

Map, measure & understand social media  

Support tool projects to collect, analyze & visualize social media data.

smrfoundation.org

Page 89: Info vis 4-22-2013-dc-vis-meetup-shneiderman

Sense-Making Loop

Thomas & Cook: Illuminating the Path (2004)

Page 90: Info vis 4-22-2013-dc-vis-meetup-shneiderman

Sense-Making Loop: Expanded

Thomas & Cook: Illuminating the Path (2004)

Page 91: Info vis 4-22-2013-dc-vis-meetup-shneiderman

Discovery Process: Systematic Yet Flexible

Preparation

• Own the problem & define the schedule

• Data cleaning & conditioning

• Handle missing & uncertain data

• Extract subsets & link to related information

Page 92: Info vis 4-22-2013-dc-vis-meetup-shneiderman

Preparation

• Own the problem & define the schedule

• Data cleaning & conditioning

• Handle missing & uncertain data

• Extract subsets & link to related information

Purposeful exploration – Hypothesis testing

• Range & distribution

• Relationships & correlations

• Clusters & gaps

• Outliers & anomalies

• Aggregation & summary

• Split & trellis

• Temporal comparisons & multiple views

• Statistics & forecasts

Discovery Process: Systematic Yet Flexible

Page 93: Info vis 4-22-2013-dc-vis-meetup-shneiderman

Preparation

• Own the problem & define the schedule

• Data cleaning & conditioning

• Handle missing & uncertain data

• Extract subsets & link to related information

Purposeful exploration – Hypothesis testing

• Range & distribution

• Relationships & correlations

• Clusters & gaps

• Outliers & anomalies

• Aggregation & summary

• Split & trellis

• Temporal comparisons & multiple views

• Statistics & forecasts

Situated decision making - Social context

• Annotation & marking

• Collaboration & coordination

• Decisions & presentations

Discovery Process: Systematic Yet Flexible

Page 94: Info vis 4-22-2013-dc-vis-meetup-shneiderman

UN Millennium Development Goals

• Eradicate extreme poverty and hunger

• Achieve universal primary education

• Promote gender equality and empower women

• Reduce child mortality

• Improve maternal health

• Combat HIV/AIDS, malaria and other diseases

• Ensure environmental sustainability

• Develop a global partnership for development

To be achieved by 2015

Page 95: Info vis 4-22-2013-dc-vis-meetup-shneiderman

30th Anniversary Symposium

May 22-23, 2013

www.cs.umd.edu/hcil

Page 96: Info vis 4-22-2013-dc-vis-meetup-shneiderman

For More Information

• Visit the HCIL website for 700+ papers & info on videos www.cs.umd.edu/hcil

• See Chapter 14 on Info Visualization: Shneiderman, B. and Plaisant, C., Designing the User Interface: Strategies for Effective Human-Computer Interaction: Fifth Edition (2010), www.awl.com/DTUI

• Edited Collections:

Card, S., Mackinlay, J., and Shneiderman, B. (1999) Readings in Information Visualization: Using Vision to Think

Bederson, B. and Shneiderman, B. (2003) The Craft of Information Visualization: Readings and Reflections

Page 97: Info vis 4-22-2013-dc-vis-meetup-shneiderman

For More Information

• Treemaps

• HiveGroup: www.hivegroup.com

• Smartmoney: www.smartmoney.com/marketmap

• HCIL Treemap 4.0: www.cs.umd.edu/hcil/treemap

• Spotfire: www.spotfire.com

• TimeSearcher: www.cs.umd.edu/hcil/timesearcher

• NodeXL: nodexl.codeplex.com

• Hierarchical Clustering Explorer: www.cs.umd.edu/hcil/hce

• LifeLines2: www.cs.umd.edu/hcil/lifelines2

• EventFlow: www.cs.umd.edu/hcil/eventflow