1 i247: Information Visualization and Presentation Marti Hearst April 9, 2008.

74
1 i247: Information Visualization and Presentation Marti Hearst April 9, 2008
  • date post

    15-Jan-2016
  • Category

    Documents

  • view

    221
  • download

    0

Transcript of 1 i247: Information Visualization and Presentation Marti Hearst April 9, 2008.

Page 1: 1 i247: Information Visualization and Presentation Marti Hearst April 9, 2008.

1

i247: Information Visualization and PresentationMarti Hearst

April 9, 2008 

 

Page 2: 1 i247: Information Visualization and Presentation Marti Hearst April 9, 2008.

2

Search and Text Visualization

• Nominal data is hard to visualize• Goals of search vs. text analysis

– Only a tiny fraction of those people who want to use search want to analyze text.

– For those analysts, there are many interesting ideas available.

• Today:– Viz for text mining and sense making– Viz for text concordances– What’s up with Tag Clouds?

Page 3: 1 i247: Information Visualization and Presentation Marti Hearst April 9, 2008.

3

Text Mining Support

• TAKMI, by Nasukawa and Nagano, ‘01

• The system integrates:– Analysis tasks (customer service help)– Content analysis– Information Visualization

Page 4: 1 i247: Information Visualization and Presentation Marti Hearst April 9, 2008.

4

Text MiningTAKMI, by Nasukawa and Nagano, 2001

• Documents containing “windows 98”

Page 5: 1 i247: Information Visualization and Presentation Marti Hearst April 9, 2008.

5

TAKMI, by Nasukawa and Nagano, 2001

TAKMI, by Nasukawa and Nagano, 2001• Patent documents containing “inkjet”, organized by

entity and year

Page 6: 1 i247: Information Visualization and Presentation Marti Hearst April 9, 2008.

6

Text Mining: Jigsaw by Stasko et al.

Page 7: 1 i247: Information Visualization and Presentation Marti Hearst April 9, 2008.

7

Text Mining: WebFountain

Page 8: 1 i247: Information Visualization and Presentation Marti Hearst April 9, 2008.

8

Visualization Support for SenseMaking

• DLITE by Cousins et al. ‘97

Page 9: 1 i247: Information Visualization and Presentation Marti Hearst April 9, 2008.

9

Query History

Entities

Dimensions

TRIST (The Rapid Information Scanning Tool) is the work space for Information Retrieval and Information Triage.

Launch Queries

AnnotatedDocument Browser

Comparative Analysisof Answers and Content

User Defined andAutomatic

Categorization

Rapid Scanningwith Context

Linked Multi-Dimensional Views Speed Scanning

Visualization in Sensemaking

TRIST, Jonkers et al 05

Page 10: 1 i247: Information Visualization and Presentation Marti Hearst April 9, 2008.

10

Visualization for Sensemaking

Quick Emphasis of Items of

Importance.

Sandbox, Wright et al ‘06

Direct interactionwith Gestures(no dialog, no controls).

DynamicAnalytical Models.

Assertions with Proving/Disprovin

g Gates.

Page 11: 1 i247: Information Visualization and Presentation Marti Hearst April 9, 2008.

11

Concordances & Word Frequencies

From www.concordancesoftware.co.uk

Page 12: 1 i247: Information Visualization and Presentation Marti Hearst April 9, 2008.

12

Concordances & Word Frequencies

SeeSoft by Eick et al.

Page 13: 1 i247: Information Visualization and Presentation Marti Hearst April 9, 2008.

13

Concordances & Word Frequencies

TextArc by Paley.

Page 14: 1 i247: Information Visualization and Presentation Marti Hearst April 9, 2008.

14

Concordances & Word Frequencies

WordTree by Wattenberg

Page 15: 1 i247: Information Visualization and Presentation Marti Hearst April 9, 2008.

15

Concordances & Word Frequencies

Bubble Charts (implemented by Wattenberg)

Page 16: 1 i247: Information Visualization and Presentation Marti Hearst April 9, 2008.

16

Putting it together: Werschkul of the NYTimes

Page 17: 1 i247: Information Visualization and Presentation Marti Hearst April 9, 2008.

17

What’s up with Tag Clouds?

Page 18: 1 i247: Information Visualization and Presentation Marti Hearst April 9, 2008.

18

What are tags?

Page 19: 1 i247: Information Visualization and Presentation Marti Hearst April 9, 2008.

19

Social Tagging• Metadata assignment without all the bother• Spontaneous, easy, and tends towards single terms• Usually used in the context of social media

Page 20: 1 i247: Information Visualization and Presentation Marti Hearst April 9, 2008.

20

Example from del.icio.us

Page 21: 1 i247: Information Visualization and Presentation Marti Hearst April 9, 2008.

21

The Tagging Opportunity• At last! Content-oriented metadata in the large!

• Attempts at metadata standardization always end up with something like the Dublin Core– author, date, publisher, … yaaawwwwnnn.

• I’ve always thought the action was in the subject metadata, and have focused on how to navigate collections given such data.

Page 22: 1 i247: Information Visualization and Presentation Marti Hearst April 9, 2008.

22

The Tagging Opportunity

• Tags are inherently faceted !– It is assumed that multiple labels will be assigned to each

item• Rather than placing them into a folder• Rather than placing them into a hierarchy

– Concepts are assigned from many different content categories

– Helps alleviate the metadata wars:• Allows for both splitters and lumpers

– Is this a bird or a robin– Doesn’t matter, you can do both!

• Allows for differing organizational views– Does NASCAR go under sports or entertainment?– Doesn’t matter, you can do both!

Page 23: 1 i247: Information Visualization and Presentation Marti Hearst April 9, 2008.

23

Tagging Problems• Tags aren’t organized• Thorough coverage isn’t controlled for• The haphazard assignments lead to problems with

– Synonymy– Homonymy

• See how this author attempts to compensate:

Page 24: 1 i247: Information Visualization and Presentation Marti Hearst April 9, 2008.

24

Tagging Problems / Opportunities

• Some tags are fleeting in meaning or too personal– toread todo

• Tags are not “professional”– (I personally don’t think this matters)

Page 25: 1 i247: Information Visualization and Presentation Marti Hearst April 9, 2008.

25Investigating social tagging and folksonomy in the art museumwith steve.museum", J. Trant, B. Wyman, WWW 2006 Collaborative Tagging Workshop

Page 26: 1 i247: Information Visualization and Presentation Marti Hearst April 9, 2008.

26

What about Browsing?• I think tags need some organization• Currently most tags are used as a direct index into items

– Click on tag, see items assigned to it, end of story• Co-occurring tags are not shown• Grouping into small hierarchies is not usually done

– del.icio.us now has bundles, but navigation isn’t good– IBM’s dogear and RawSugar come the closest

• I think the solution is to organize tags into faceted hierarchies and do browsing in the standard way

Page 27: 1 i247: Information Visualization and Presentation Marti Hearst April 9, 2008.

27

What does a typical tag cloud look like?

Page 28: 1 i247: Information Visualization and Presentation Marti Hearst April 9, 2008.

28

Definition

Tag Cloud: A visual representation of social tags, organized into paragraph-style layout, usually in alphabetical order, where the relative size and weight of the font for each tag corresponds to the relative frequency of its use.

Page 29: 1 i247: Information Visualization and Presentation Marti Hearst April 9, 2008.

29

Definition

Tag Cloud: A visual representation of social

tags, organized into paragraph-style layout, usually in alphabetical order, where

the relative size and weight of the font for each tag corresponds to the relative frequency of its use.

Page 30: 1 i247: Information Visualization and Presentation Marti Hearst April 9, 2008.

30

flickr’s tag cloud

Page 31: 1 i247: Information Visualization and Presentation Marti Hearst April 9, 2008.

31

del.icio.us

Page 32: 1 i247: Information Visualization and Presentation Marti Hearst April 9, 2008.

32

del.icio.us

Page 33: 1 i247: Information Visualization and Presentation Marti Hearst April 9, 2008.

33

blogs

Page 34: 1 i247: Information Visualization and Presentation Marti Hearst April 9, 2008.

34

ma.gnolia.com

Page 35: 1 i247: Information Visualization and Presentation Marti Hearst April 9, 2008.

35

NYTimes.com: tags from most frequent search terms

Page 36: 1 i247: Information Visualization and Presentation Marti Hearst April 9, 2008.

36

IBM’s manyeyes project

Page 37: 1 i247: Information Visualization and Presentation Marti Hearst April 9, 2008.

37

Amazon.com: Tag clouds on term frequencies

Page 38: 1 i247: Information Visualization and Presentation Marti Hearst April 9, 2008.

38

I was puzzled by the questions:

• What are designers and authors’ intentions in creating or using tag clouds?

• How do they expect their readers to use them?

Page 39: 1 i247: Information Visualization and Presentation Marti Hearst April 9, 2008.

39

On the positive side:

• Compact• Draws the eye towards the most frequent

(important?) tags• You get three dimensions simultaneously!

– alphabetical order– size indicating importance– the tags themselves

Page 40: 1 i247: Information Visualization and Presentation Marti Hearst April 9, 2008.

40

Weirdnesses

• Initial encounters unencouraging– Some reports from industry:

• Is the computer broken?• Is this a ransom note?

Page 41: 1 i247: Information Visualization and Presentation Marti Hearst April 9, 2008.

41

Weirdnesses• Violates principles of perceptual design

– Longer words grab more attention than shorter• Length of tag is conflated with its size

– White space implies meaning when there is none intended• Ascenders and descenders can also effect focus

– Eye moves around erratically, no flow or guides for visual focus

– Proximity does not hold meaning• The paragraph-style layout makes it quite arbitrary which

terms are above, below, and otherwise near which other terms

– Position within paragraph has saliency effects– Visual comparisons difficult (see Tufte)

Page 42: 1 i247: Information Visualization and Presentation Marti Hearst April 9, 2008.

42

Weirdnesses

• Meaningful associations are lost– Where are the

different country names in this tag clouds?

Page 43: 1 i247: Information Visualization and Presentation Marti Hearst April 9, 2008.

43

Weirdnesses Which operating systems are mentioned?

Page 44: 1 i247: Information Visualization and Presentation Marti Hearst April 9, 2008.

44

Tag Cloud Study (1)

• First part compared tag cloud layouts– Independent Variables:

• Tag size• Tag proximity to a large font• Tag quadrant position

– Task: recall after a distractor task– 13 participants; effects for size and quadrant

• Second part compared tag clouds to lists– 11 participants– Tested recognition (from a set of like words)

and impression formation• Alphabetical lists were best for the latter; no

differences for the former

Getting our head in the clouds: Toward evaluation studies of tagclouds, Walkyria Rivadeneira Daniel M. Gruen Michael J. Muller David R. Millen, CHI 2007 note

Page 45: 1 i247: Information Visualization and Presentation Marti Hearst April 9, 2008.

45

Tag Cloud Study (2)

• 62 participants did a selection task– (find this country out of a list of 10 countries)– Independent Variables:

• Horizontal list• Horizontal list, alphabetical• Vertical list• Vertical list, alphabetical• Spatial tag cloud• Spatial tag cloud, alphabetical

– Order for non-alphabetical not described– Alphabetical fastest in all cases, lists faster

than spatial– May have used poor clouds (some people

couldn’t “see” larger font answers)

• An Assessment of Tag Presentation Techniques; Martin Halvey, Mark Keane, poster at WWW 2007.

Page 46: 1 i247: Information Visualization and Presentation Marti Hearst April 9, 2008.

46

A Justifying Claim

• You get three dimensions simultaneously!– alphabetical order– size indicating importance– the tags themselves

… but is this really a conscious design decision?

Page 47: 1 i247: Information Visualization and Presentation Marti Hearst April 9, 2008.

47

Solution: Celebrity Interviews

• I was really confused about tag clouds, so I decided to ask the people behind the puffs– 15 interviews, conducted at foocamp’06

• Several web 2.0 leaders

– 5 more interviews at Google and Berkeley

Page 48: 1 i247: Information Visualization and Presentation Marti Hearst April 9, 2008.

48

A Surprise• 7 interviewees DID NOT REALIZE that alphabetical

ordering is standard.– 2 of these people were in charge of such sites but had had

others write the code

• What was the answer given to “what order are tags shown in?”– hadn’t thought about it– don’t think about tag clouds that way– random order– ordered by semantic similarity

• Suggests that perhaps people are too distracted by the layout to use the alphabetical ordering

Page 49: 1 i247: Information Visualization and Presentation Marti Hearst April 9, 2008.

49

Suggested main purposes:• To signal the presence of tags on the site

• A good way to get the gist of the site

• An inviting and fun way to get people interacting with the site

• To show what kinds of information are on the site– Some of these said they are good for navigation

• Easy to implement

Page 50: 1 i247: Information Visualization and Presentation Marti Hearst April 9, 2008.

50

Tag Clouds as Self-Descriptions• Several noted that a tag cloud showing one’s own

tags can be evocative– A good summary of what one is thinking and reading about– Useful for self-reflection– Useful for showing others one’s thoughts

• One example: comparing someone else’s tags to own’s one to see what you have in common, and what special interests differentiate you

• Useful for tracking changes in friends’ lives– Oh, a new girl’s name has gotten larger; he must have a

new girlfriend!

Page 51: 1 i247: Information Visualization and Presentation Marti Hearst April 9, 2008.

51

Tag Clouds as showing “Trends”• Several people used this term, that tag clouds show

trends in someone’s behavior– Trends are usually patterns across time, which are not

inherently visible in tag clouds– To note a trend using a tag cloud, one must remember what

was there at an earlier time, and what changed• tracking the girls’ names example

– This suggests a reason for the importance of the large tags – draws one’s attention to what is big now versus was used to be large.

– Suggests also why it doesn’t matter that you can’t see small tags.

Page 52: 1 i247: Information Visualization and Presentation Marti Hearst April 9, 2008.

52

New Perspective: Tag Clouds are Social!

• It’s not about the “information”!• Not surprising in retrospect; tagging is in large part

about the social aspect– Seems to work mainly when the tags can be seen by many – Even better when items can be tagged by many and seen

by many

• What does this mean though when tag clouds are applied to non-social information?

Page 53: 1 i247: Information Visualization and Presentation Marti Hearst April 9, 2008.

53

Follow-up Study• Informed by the interview results, we search for, read, and

coded web pages that mentioned tag clouds.– Looked at about 140 discussions– Developed 20 codes– Looked at another 90 discussions

• Used web queries: “tag clouds”, usability tag clouds, etc• Sampled every 10th url

– 58% personal blogs– 20% commercial blogs– 10% commercial web pages– rest from group blogs and discussion lists

– Doesn’t tell us what people who don’t write about tag clouds think.

Page 54: 1 i247: Information Visualization and Presentation Marti Hearst April 9, 2008.

54

Sample

Page 55: 1 i247: Information Visualization and Presentation Marti Hearst April 9, 2008.

55

Coding Results

Page 56: 1 i247: Information Visualization and Presentation Marti Hearst April 9, 2008.

56

The Role of Popularity• Popularity in the sense that tag clouds (and tagging) are trendy

and popular.– Some people liked the visualization, but their popularity made

them less appealing• Famous post: “Tag clouds are the new mullets”• Led to self-consciousness about liking them

– Many complained about unaesthetic cloud designs– Little consensus on if they are a fad or have staying power

• Popularity also in the sense of the large font size for more popular tags– Many people like the prominence of large tags, but several

commented on the tyranny of the popular

Page 57: 1 i247: Information Visualization and Presentation Marti Hearst April 9, 2008.

57

Commentary on Popularity

Page 58: 1 i247: Information Visualization and Presentation Marti Hearst April 9, 2008.

58

Commentary on Popularity

Page 59: 1 i247: Information Visualization and Presentation Marti Hearst April 9, 2008.

59

The Role of Navigation

• Opinions vary– Many simply state they are useful for navigation, but

with no support for this claim– Some claim the compactness makes navigation

easier than a vertical list– Some object to the varying font size on scannability– Others object to the lack of organization– Overall, there is no evidence either way that we

could find in the blog community

Page 60: 1 i247: Information Visualization and Presentation Marti Hearst April 9, 2008.

60

Commentary on Navigation

Page 61: 1 i247: Information Visualization and Presentation Marti Hearst April 9, 2008.

61

Commentary on Navigation

Page 62: 1 i247: Information Visualization and Presentation Marti Hearst April 9, 2008.

62

Aesthetic Considerations

• Disagreement on the aesthetic and emotional appeal, especially for lay users.

• Those who like them find them fun and appealing

• Those who don’t find them messy, strange, like a ransom note

• Informal reports with first time users who are not in the Web 2.0 community are negative

Page 63: 1 i247: Information Visualization and Presentation Marti Hearst April 9, 2008.

63

Commentary on New User Reaction

Page 64: 1 i247: Information Visualization and Presentation Marti Hearst April 9, 2008.

64

Commentary on New User Reaction

Page 65: 1 i247: Information Visualization and Presentation Marti Hearst April 9, 2008.

65

Trends again

• As in the interviews, the benefit of “trends” was mentioned many times.

• There is another sense of “trend” as “tendency or inclination,” and this might be what people mean.

Page 66: 1 i247: Information Visualization and Presentation Marti Hearst April 9, 2008.

66

Tag Clouds as Social Information

• An emphasis that tag clouds are meant to show human behavior.

• We found reports of people commenting on other uses that were invalid because they did not reflect live user input:– One blogger noted the incongruity of an online library using

keyword frequencies in a tag cloud rather than having it reflect patron’s usage of the collection.

– An online community noticed one site’s cloud didn’t change over time and realized the sizes were decided by marketing. This was greated with derision.

Page 67: 1 i247: Information Visualization and Presentation Marti Hearst April 9, 2008.

67

Commentary on Implicit Social Assumption

Page 68: 1 i247: Information Visualization and Presentation Marti Hearst April 9, 2008.

68

Commentary on Implicit Social Assumption

Page 69: 1 i247: Information Visualization and Presentation Marti Hearst April 9, 2008.

69

Implications

• Assume tag clouds are meant to reflect human mental activity (individual or group)

• Then what might seem design flaws from an information conveyance perspective may not be

• A large part of the appeal is the fun and liveliness.– The informality of the layout reflects the human activity

beneath it.

Page 70: 1 i247: Information Visualization and Presentation Marti Hearst April 9, 2008.

70

Judith Donath, CACM 45(4), 2002

“Traditional data visualization focuses on making abstract numbers and relationships into concrete, spatialized images; the goal is to highlight important patterns while also representing the data accurately. This is a fine approach for social scientists studying the dynamics of online interactions. Yet for our purpose it is also important that the visualization evoke an appropriate intuitive response representing the feel of the conversation as well as depicting its dynamics.”

Page 71: 1 i247: Information Visualization and Presentation Marti Hearst April 9, 2008.

71

Judith Donath, CACM 45(4), 2002

“[O]ne argument for deliberately designing evocative visualizations for online social environments is the existing default textual interfaces are themselves evocative, they simply evoke an aura of business-like monotony rather than the lively social scene that actually exists.''

Page 72: 1 i247: Information Visualization and Presentation Marti Hearst April 9, 2008.

72

Alternative: “Semantic” Layout• Improving Tag-

Clouds as Visual Information Retrieval Interfaces, Hassan-Monteroa & Herrero-Solana, InSciT2006

• Tags grouped by “similarity, based on clustering techniques and co-occurrence analysis”

Page 73: 1 i247: Information Visualization and Presentation Marti Hearst April 9, 2008.

73

Tag Cloud AlternativesProvided by Martin Wattenberg

Page 74: 1 i247: Information Visualization and Presentation Marti Hearst April 9, 2008.

74

Conclusions

• Social tagging is, in my view, a terrific way to get good content metadata.

• I think automated techniques can do a lot to help clean them up and organize them.

• They are an inherently social phenomenon, part of social media, which is a really exciting area.

• The “socialness” of social media can yield surprises, like tag clouds.