Using Information Scent to Model Users in Web1.0 and Web2.0
Image from: http://www.flickr.com/photos/ourcommon/480538715/
Modeling of Web Users from Web1.0 to Web2.0
Ed H. Chi, Principal Scientist and Area Manager
Augmented Social Cognition Area, Palo Alto Research Center
2010-03-20 Utrecht CogModeling
PARC Overview
Interdisciplinary research center
Founded in 1970
Spun out of Xerox in 2002
Business model:
– Contract research
– Licensing
– Joint ventures
– Spinoffs
PARC Innovation
Chartered to create the architecture of information & the office of the future
- invented distributed personal computing
- established Xerox’s laser printing business
- created the foundation for the digital revolution
Graphical User Interface
Laser Printing
Ethernet
Bit-mapped Displays
Distributed File Systems
Page Description Languages
First Commercial Mouse
Object-oriented Programming
WYSIWYG Editing
Distributed Computing
VLSI Design Methodologies
Optical Storage
Client/Server Architecture
Device Independent Imaging
Cedar Programming Language
How do people navigate?
Scan, Skim, Decide, Action
Ecological Approach
Human-information interaction is adaptive to the extent that it maximizes:
\[ \text{MAXIMIZE} \left[ \frac{\text{Net Knowledge Gained}}{\text{Costs of Interaction}} \right] \]
Analogy to Optimal Foraging
Energy ↔ Information
Information Scent: The Theory
Information Scent is the user's perception of the cost and value of information.
– Similar to hunters following animal footprints.
Information Scent: The Idea
[Figure: network linking the Information Need (what the user wants) to a Text snippet (what the user sees). Node labels: cell, patient, dose, beam, new, medical, treatments, procedures]
• Spreading activation
– Bayesian prediction of relevance of individual elements
[Figure: chunk i (bread) linked to chunks j (butter, sandwich, flour)]
Activation depends on a base level plus activation spread from associated chunks:
\[ A_i = B_i + \sum_j W_j S_{ji} \]
– A_i: activation of chunk i
– B_i: base-level activation of chunk i
– \sum_j W_j S_{ji}: activation spread from linked chunks j
\[ B_i = \log \frac{\Pr(i)}{\Pr(\text{not } i)} \]
(log likelihood of i occurring)
\[ S_{ji} = \log \frac{\Pr(j \mid i)}{\Pr(j \mid \text{not } i)} \]
(log likelihood of i occurring with j)
Base-level activation reflects the log likelihood of events in the world. Strength of spread reflects the log likelihood of event co-occurrence.
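The three activation formulas can be worked through numerically. The sketch below is only an illustration: the function names and the probabilities in the example are hypothetical and do not come from the talk.

```python
import math

def base_level(p_i):
    """B_i = log(Pr(i) / Pr(not i))."""
    return math.log(p_i / (1.0 - p_i))

def strength(p_j_given_i, p_j_given_not_i):
    """S_ji = log(Pr(j|i) / Pr(j|not i))."""
    return math.log(p_j_given_i / p_j_given_not_i)

def activation(p_i, sources):
    """A_i = B_i + sum_j W_j * S_ji.

    sources is a list of (W_j, Pr(j|i), Pr(j|not i)) tuples,
    one per linked chunk j.
    """
    spread = sum(w * strength(a, b) for w, a, b in sources)
    return base_level(p_i) + spread

# Hypothetical numbers: chunk i = "bread", one linked chunk j = "butter".
bread_activation = activation(0.5, [(1.0, 0.4, 0.1)])
```

With Pr(i) = 0.5 the base level is zero, so the activation of "bread" in this example comes entirely from the spread contributed by "butter".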
Attacking The Problem
Users have information goals; their surfing patterns are guided by information scent.
Two questions:
– Given an information goal and a starting point, where do users go? (Behavior)
– Given some surfing pattern, what is the user’s goal? (Need)
WUFIS: Web User Flow by Information Scent
[Diagram: User Information Goal + Web site (Web Page content, links) → Web user flow simulation → Predicted paths]
InfoScent: How does it work?
Start users at a page with some goal
Flow users through the network
Examine user patterns
Scent Values: Probabilities of Transition
InfoScent Simulation
[Figure: worked matrix example with the word-by-document Weight Matrix (W), the Query (Q), the Relevant Docs (R), the from/to Topology matrix (T), and the Scent Matrix (S)]
W = Weight matrix (document, word)
Q = Query
R = Relevant documents
T = Topology matrix (from, to)
S = Scent matrix (normalized to probability)
Now with the Scent Matrix, we then perform Spreading Activation.
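The step from topology and relevance to a scent matrix, followed by spreading activation, can be sketched as below. This is a minimal illustration under stated assumptions, not the WUFIS implementation: it assumes the scent of a link is the relevance of its destination page, and the function names and example numbers are hypothetical.

```python
def scent_matrix(topology, relevance):
    """Row-normalize link relevance into transition probabilities.

    topology[i][j] is 1 if page i links to page j, else 0.
    relevance[j] is the relevance of page j to the information goal
    (assumption: link scent equals destination relevance).
    """
    n = len(topology)
    scent = []
    for i in range(n):
        row = [topology[i][j] * relevance[j] for j in range(n)]
        total = sum(row)
        scent.append([v / total if total else 0.0 for v in row])
    return scent

def spread(scent, start, steps):
    """Flow one unit of users from `start` for `steps` clicks.

    Returns the accumulated visit mass per page, including the start.
    """
    n = len(scent)
    current = [0.0] * n
    current[start] = 1.0
    visits = current[:]
    for _ in range(steps):
        nxt = [0.0] * n
        for i, mass in enumerate(current):
            for j, p in enumerate(scent[i]):
                nxt[j] += mass * p
        current = nxt
        visits = [v + c for v, c in zip(visits, current)]
    return visits

# Hypothetical 3-page site: page 0 links to pages 1 and 2.
example_scent = scent_matrix([[0, 1, 1], [0, 0, 0], [0, 0, 0]], [0.0, 1.0, 3.0])
example_visits = spread(example_scent, start=0, steps=1)
```

In this example three quarters of the simulated users follow the link to page 2, because its relevance is three times that of page 1.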
Proximal Cue Words
Goal: Find words that represent information cues for hyperlinks:
1. Text of the link itself
2. Words around the link
– Lists, paragraphs
Information Cues
3. If the above two fail:
– Content words on the distal page
– Title words of the distal page
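The fallback order for cue words can be sketched as follows. The tokenizer, the stopword list, and the function names are assumptions made for illustration; the talk does not specify them.

```python
import re

# Hypothetical stopword list; a real system would use a fuller one.
STOPWORDS = frozenset({"a", "an", "and", "click", "here", "of", "the", "to"})

def content_words(text):
    """Lowercase the text and keep alphanumeric tokens that are not stopwords."""
    return [w for w in re.findall(r"[a-z0-9]+", text.lower()) if w not in STOPWORDS]

def cue_words(link_text, nearby_text="", distal_title="", distal_content=""):
    """Return cue words for a hyperlink.

    Uses the link text and the words around it first; only if both
    yield nothing does it fall back to the distal (linked-to) page.
    """
    proximal = content_words(link_text) + content_words(nearby_text)
    if proximal:
        return proximal
    return content_words(distal_content) + content_words(distal_title)
```

For example, `cue_words("click here", distal_title="Color Copiers", distal_content="fast printing")` falls through to the distal page because the link text contains only stopwords.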
Bloodhound Project
INPUT
Starting point: www.xerox.com
Task: look for “high end copiers”
OUTPUT
Usability metrics
Input Tasks
Stanford CS
ONR
Instrumentation: WebLogger
(BEFORE-NAVIGATE (http://altavista.com/ ) 105.331s 0.100s 951763010 10:36:50)
(DOC-MOUSEMOVE (881 122 ) 105.431s 0.100s 951763010 10:36:50)
(NAVIGATE-COMPLETE (http://www.altavista.com/) 105.632s 0.201s 951763011 10:36:51)
(EYETRACKER-SYNC (103 ) 106.242s 0.610s 951763011 10:36:51)
(DOCUMENT-COMPLETE (http://www.altavista.com/) 106.773s 0.531s 951763012 10:36:52)
(SCROLL-POSITION (0 0 759 1181 ) 106.853s 0.080s 951763012 10:36:52)
(DOC-MOUSEMOVE (874 123 ) 107.024s 0.171s 951763012 10:36:52)
(DOC-MOUSEMOVE (874 123 ) 107.044s 0.020s 951763012 10:36:52)
(DOC-MOUSEMOVE (874 123 ) 107.214s 0.170s 951763012 10:36:52)
(EYETRACKER-SYNC (104 ) 107.244s 0.030s 951763012 10:36:52)
(CHAR (a 874 123 ) 108.125s 2.904s 951763013 10:36:53)
(EYETRACKER-SYNC (105 ) 108.245s 1.001s 951763013 10:36:53)
(DOC-KEYPRESS (a INPUT ) 108.446s 0.201s 951763013 10:36:53)
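A log in this shape can be read with a small parser. The sketch below is illustrative only: the field names (`elapsed_s`, `delta_s`, `epoch`, `clock`) are guesses at what the four trailing columns mean, and the regular expression is fitted to the sample shown on the slide, not to any WebLogger specification.

```python
import re

# Pattern fitted to the sample events; field meanings are assumptions.
EVENT_RE = re.compile(
    r"\((?P<event>[A-Z-]+) \((?P<args>[^)]*)\)\s*"
    r"(?P<elapsed>[\d.]+)s (?P<delta>[\d.]+)s "
    r"(?P<epoch>\d+) (?P<clock>\d{2}:\d{2}:\d{2})\)"
)

def parse_weblogger(text):
    """Return one dict per logged event found in `text`."""
    events = []
    for m in EVENT_RE.finditer(text):
        events.append({
            "event": m.group("event"),
            "args": m.group("args").split(),
            "elapsed_s": float(m.group("elapsed")),
            "delta_s": float(m.group("delta")),
            "epoch": int(m.group("epoch")),
            "clock": m.group("clock"),
        })
    return events

SAMPLE = (
    "(BEFORE-NAVIGATE (http://altavista.com/ ) 105.331s 0.100s 951763010 10:36:50) "
    "(DOC-MOUSEMOVE (881 122 ) 105.431s 0.100s 951763010 10:36:50)"
)
sample_events = parse_weblogger(SAMPLE)
```

Filtering the parsed events by name (for example keeping only `NAVIGATE-COMPLETE`) would give the sequence of pages visited, which is the kind of user trace compared against predictions later in the talk.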
User Traces
Compare Visitation Distributions
For each task, produce a user summary vector that describes the frequency distribution of page visits over the document space.
For each task, run Bloodhound and create the Bloodhound-predicted frequency distribution.
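The comparison can be sketched as building two visit-frequency vectors over the same page list and correlating them. The slides report a correlation coefficient without naming the type, so the use of Pearson's r here is an assumption, as are the function names and the example paths.

```python
import math
from collections import Counter

def visit_distribution(paths, pages):
    """Frequency distribution of page visits over a fixed page list."""
    counts = Counter(page for path in paths for page in path)
    total = sum(counts[p] for p in pages)
    return [counts[p] / total if total else 0.0 for p in pages]

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Hypothetical observed user paths over a three-page document space.
pages = ["home", "products", "copiers"]
observed = visit_distribution(
    [["home", "products", "copiers"], ["home", "products"]], pages
)
```

A predicted vector produced by the simulation would be passed to `pearson` together with `observed` to obtain one coefficient per task and site.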
Results
Correlation coefficients:
          Yahoo    REI      HivInSite  Parcweb
task 1a   0.7528   0.4701   0.6811     0.7394
task 1b   0.7218   0.4763   0.7885     0.8756
task 2a   0.7489   0.9892   0.6671     0.8930
task 2b   0.8840   0.7073   0.6880     0.8573
task 3a   0.7768   0.7321   0.8835     0.7197
task 3b   0.6973   0.6979   0.5660     0.7123
task 4a   0.9022   0.9415   0.8407     0.8340
task 4b   0.9052   0.7600   0.4634     0.9344
• Produced click streams that:
– Correlated strongly 1/3 of the time
– Correlated moderately slightly less than 2/3 of the time
– Problem: we do not know a priori which third.
IUNIS: Inferring User Need by Info Scent
[Diagram: Observed paths + Web site (Web Page content, links) → Web user flow simulation → User Information Goal]
Evaluation of IUNIS
Procedure:
– 10 path booklets
– Single rating sheet with the ten 20-word summaries. A copy of this rating sheet is attached to each of the 10 path booklets.
– Users are asked to read through each booklet and rate each of the path summaries.
– Each summary is rated on a 5-point Likert scale. Users also indicate which of the ten summaries was the best match.
Evaluation of IUNIS
Results:
– Matching summary mean = 4.58 (median = 5)
– Non-matching summary mean = 1.97 (median = 1)
– Difference highly significant (p < .001)
– Best match summary: 5.6 out of 10 (Cohen's Kappa = 0.51)
The evaluation yields strong evidence that IUNIS generates good summaries of the Web paths.
ScentTrails: Pre-highlight navigation path
A store that knows your goal. Over 50% reduction in task time.
Web page with highlighted link anchors
Partial information goal: “remote diagnostic technology”
62 copies/min.
92 copies/min.
Remainder of information goal: “speed >= 75”
ScentTrails algorithm
Identify tasty pages
Waft scent backward along links
– Loses intensity as it travels
[Figure: example site with scent wafting back from a product page. Labels: remote diagnostics; copiers, fax machines, other maintenance, ...; digital copiers, color copiers; XC4411, XC5001; XC4411 copier features: remote diagnostics, ...]
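The two algorithm steps can be sketched as a backward propagation over the link graph. This is a minimal illustration, not the published ScentTrails algorithm: the decay factor, the rule of keeping the strongest incoming scent, the fixed number of rounds, and the link structure in the example are all assumptions.

```python
def waft_scent(links, tasty, decay=0.5, rounds=10):
    """Propagate scent backward from tasty pages along links.

    links maps a page to the pages it links to.
    tasty maps a page to its initial scent (pages matching the goal).
    Each hop multiplies scent by `decay`; a page keeps the strongest
    scent that reaches it (both are assumptions of this sketch).
    Returns (page_scent, link_scent).
    """
    scent = dict(tasty)
    for _ in range(rounds):
        updated = dict(scent)
        for page, targets in links.items():
            for target in targets:
                carried = decay * scent.get(target, 0.0)
                if carried > updated.get(page, 0.0):
                    updated[page] = carried
        if updated == scent:
            break  # no further change: scent has reached every ancestor
        scent = updated
    link_scent = {
        (page, target): scent.get(target, 0.0)
        for page, targets in links.items()
        for target in targets
    }
    return scent, link_scent

# Hypothetical link structure using page names from the slide.
links = {
    "home": ["copiers", "fax machines"],
    "copiers": ["XC4411", "XC5001"],
}
page_scent, link_scent = waft_scent(links, {"XC4411": 1.0})
```

The `link_scent` values are what a browser overlay could use to decide how strongly to highlight each link anchor on a page.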
Results of user study
[Figure: bar charts comparing ScentTrails, ShortScent, search, and browse. Axes: Task Completion Time (minutes), 0 to 6; Fraction Above 5 Minutes, 0% to 50% (times capped at five minutes)]
10/12 subjects preferred ScentTrails to both searching and browsing
ScentIndex
Associated entries underlined in red
ScentHighlight
User first types search keywords: “anthrax symptoms”
Conceptually highlight any relevant passages and keywords
Draw user attention
Method
User Study Summary
Overall, the ScentIndex eBook performed better than the physical book.
Faster speed:
– Subjects using ScentIndex completed their tasks faster, whether they were experts or novices, F(1,12)=12.96, p<.01.
More accurate:
– Answers that they provided while using the ScentIndex interface were more accurate, F(1,12)=3.991, p=.06.
Heuristics
Poor heuristic
Good heuristic
“Hints”
Solo
Cooperative (“good hints”)
Finding a Restaurant
Appropriate for the occasion
Research Vision
Augmented Social Cognition
Cognition: the ability to remember, think, and reason; the faculty of knowing.
Social Cognition: the ability of a group to remember, think, and reason; the construction of knowledge structures by a group.
– (not quite the same as in the branch of psychology that studies the cognitive processes involved in social interaction, though included)
Augmented Social Cognition: Supported by systems, the enhancement of the ability of a group to remember, think, and reason; the system-supported construction of knowledge structures by a group.
Citation: Chi, IEEE Computer, Sept 2008
Research Methodology
Characterize activity on social systems with analytics
Model interaction social and community dynamics and variables
Prototype tools to increase benefits or reduce cost
Evaluate prototypes via Living Laboratories with real users
[Diagram: Characterization, Models, Prototypes, Evaluations]
Two Sides of Tagging
Encoding, Retrieval
http://edge.org
“science research cognition”
http://www.ted.com/index.php/speakers
“video people talks technology”
Using Information Theory to Model Social Tagging
[Ed H. Chi, Todd Mytkowicz, ACM Hypertext 2008]
[Diagram: Users encode Documents (Topics, Concepts) into Tags T1…Tn; Encoding, Decoding, Noise]
H(Tag) shows saturation in tag usage
I(Doc; Tag) Mutual Information
Source: Hypertext 2008 study on del.icio.us (Chi & Mytkowicz)
Rise in avg. tags per bookmark
(note the parallel development in the increasing # of query words)
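Both quantities named on these slides can be computed from a list of (document, tag) bookmark pairs. The sketch below uses the standard definitions of Shannon entropy and mutual information, with I(Doc; Tag) = H(Doc) + H(Tag) - H(Doc, Tag); the function names and the toy data are hypothetical and are not from the del.icio.us study.

```python
import math
from collections import Counter

def entropy(items):
    """Shannon entropy in bits of the empirical distribution of `items`."""
    counts = Counter(items)
    n = sum(counts.values())
    return -sum(c / n * math.log2(c / n) for c in counts.values())

def mutual_information(pairs):
    """I(Doc; Tag) = H(Doc) + H(Tag) - H(Doc, Tag) over (doc, tag) pairs."""
    docs = [d for d, _ in pairs]
    tags = [t for _, t in pairs]
    return entropy(docs) + entropy(tags) - entropy(pairs)

# Toy data: each tag identifies exactly one document.
informative = [("d1", "science"), ("d2", "video")]
# Toy data: every tag is applied to every document.
uninformative = [("d1", "a"), ("d1", "b"), ("d2", "a"), ("d2", "b")]
```

In the first toy set a tag fully determines the document (1 bit of mutual information); in the second, knowing the tag says nothing about the document (0 bits).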
Social Tagging Creates Noise
People use different tag words to express similar concepts.
• Synonyms
• Misspellings
• Morphologies
TagSearch: Use Semantic Analysis to Reduce Noise
http://mrtaggy.com
Semantic Similarity Graph
[Figure: graph of related tags: Guide, Web, Howto, Tips, Help, Tools, Tip, Tricks, Tutorial, Tutorials, Reference]
MapReduce Implementation
Spreading activation in a bi-graph
Computation over a very large data set
– 150 Million+ bookmarks
[Diagram: bi-graph of Tags and URLs, with edges labeled P(URL|Tag) and P(Tag|URL)]
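One round of spreading activation over the tag and URL bi-graph can be sketched in memory as below. This is an illustration under stated assumptions, not the MapReduce implementation: the conditional probabilities are estimated by simple counting, and the function names and bookmarks are hypothetical.

```python
from collections import defaultdict

def conditional_tables(bookmarks):
    """Estimate P(URL|Tag) and P(Tag|URL) by counting (url, tag) pairs."""
    tag_url = defaultdict(lambda: defaultdict(int))
    url_tag = defaultdict(lambda: defaultdict(int))
    for url, tag in bookmarks:
        tag_url[tag][url] += 1
        url_tag[url][tag] += 1

    def normalize(table):
        return {
            key: {k: c / sum(row.values()) for k, c in row.items()}
            for key, row in table.items()
        }

    return normalize(tag_url), normalize(url_tag)

def related_tags(query_tag, p_url_given_tag, p_tag_given_url):
    """Spread activation tag -> URLs -> tags for one round."""
    scores = defaultdict(float)
    for url, p_url in p_url_given_tag.get(query_tag, {}).items():
        for tag, p_tag in p_tag_given_url[url].items():
            scores[tag] += p_url * p_tag
    return dict(scores)

# Hypothetical bookmarks as (url, tag) pairs.
bookmarks = [
    ("u1", "tutorial"), ("u1", "howto"),
    ("u2", "tutorial"), ("u2", "guide"),
]
p_url_given_tag, p_tag_given_url = conditional_tables(bookmarks)
scores = related_tags("tutorial", p_url_given_tag, p_tag_given_url)
```

Tags that co-occur on the same URLs as the query tag receive activation, which is one way semantically similar tags such as "howto" and "guide" could be surfaced for "tutorial".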
Understanding a new area…
MrTaggy.com: social search browser with social bookmarks
Joint work with Rowan Nairn, Lawrence Lee
Kammerer, Y., Nairn, R., Pirolli, P., and Chi, E. H. 2009. Signpost from the masses: learning effects in an exploratory social tag search browser. In Proceedings of the 27th international Conference on Human Factors in Computing Systems (Boston, MA, USA, April 04 - 09, 2009). CHI '09. ACM, New York, NY, 625-634.
Understanding a new area…
Baseline Interface
Experiment Design
2 interface x 3 task domain design
– 2 interfaces (between-subjects): Exploratory vs. Baseline
– 3 task domains (within-subjects): Future Architecture, Global Warming, Web Mashups
30 subjects (22 male, 8 female)
– Intermediate or advanced computer and web search skills
– Half assigned Exploratory, half Baseline
For each domain, single block with 3 task types:
– Easy and Difficult Page Collection Task [6 min each]
– Summarization Task [12 min]
– Keyword Generation Task [2 min]
Procedure [2 hours]
Prior Knowledge Test
1st Task Domain
– With easy and difficult page collection tasks, summarization and keyword generation task
– NASA cognitive load questionnaire
2nd Task Domain
– Same battery of tasks and cognitive load questionnaire
3rd Task Domain
Experimental Survey
Experimental Evaluation [Kammerer et al, CHI2009]
Exploratory interface users:
– performed more queries,
– took more time,
– wrote better summaries (in 2/3 domains),
– generated more relevant keywords (in 2/3 domains), and
– had a higher cognitive load.
Suggestive of deeper engagement and better learning.
Some evidence of scaffolding for novices in the keyword generation and summarization tasks.
The Team
Image from: http://www.flickr.com/photos/ourcommon/480538715/
Augmented Social Cognition: From Social Foraging to Social Sensemaking
Research Vision: Understand how social computing systems can enhance the ability of a group of people to remember, think, and reason.
Living Laboratory: Create applications that harness collective intelligence to improve knowledge capture, transfer, and discovery.
http://asc-parc.blogspot.com
http://[email protected]
Enhanced Thumbnails
Andrew Faulring, Allison Woodruff and Ruth Rosenholtz
[Figure: enhanced vs. plain thumbnails]
Popout Prism [Suh & Woodruff]
TagSearch Exploratory Focus
3 kinds of search
navigational (28%): You know what you want and where it is
transactional (13%): You know what you want to do
– Existing search engines are OK
informational (59%): You roughly know what you want but don’t know how to find it
– Difficult for existing search engines
– Opportunity