UCB HCC Retreat
Search
Text Mining
Web Site Usability
Marti Hearst, SIMS
BAILANDO Projects
Better Access to Information using Language Analysis and Novel Dynamic Organizations
Current BAILANDO Projects
CHA-CHA: Web Search results in Context
LINDI: UI support for Search Text Data Mining
TANGO: Automated Web Site Usability
Search UIs
Combine Browsing & Search
Place Search Results in Context
Large Category Hierarchies
Cha-Cha
Students: Mike Chen, Jamie Laflen, Jason Hong, Jimmy Lin, Shiang Chen
Medical Category Hierarchy
Medicine
  Disease: Migraine, MS
  Anatomy: Carotid Artery, Spinal Cord
  Drugs: Tamoxifen, Steroids
DynaCat (Pratt, Hearst, & Fagan 99)
DynaCat Study Design
Three queries
24 cancer patients
Compared three interfaces: ranked list, clusters, categories
Results
  Participants strongly preferred categories
  Participants found more answers using categories
  Participants took same amount of time with all three interfaces
Similar results have been verified by another study by Chen and Dumais (CHI 2000)
Cat-a-Cone Interface (Hearst & Karadi 97)
Improving Search via Large Category Hierarchies
How to show intersections across category types?
How to preview related categories in a user-tailored, dynamic manner?
Information retrieval
Text Data Mining
Information retrieval
Selection or rejection of existing documents based on a function of word match.
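A minimal sketch of "selection or rejection based on a function of word match", assuming the simplest possible function: a count of distinct query words that also appear in the document. The function names and the threshold parameter are hypothetical, chosen only for illustration.

```python
def word_match_score(query, document):
    """Count distinct query words that also occur in the document."""
    query_words = set(query.lower().split())
    doc_words = set(document.lower().split())
    return len(query_words & doc_words)


def select_documents(query, documents, threshold=1):
    """Select documents whose score reaches the threshold; reject the rest."""
    return [d for d in documents if word_match_score(query, d) >= threshold]
```

Nothing new is created here: the output is always a subset of the documents that already exist, which is the contrast with text data mining.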
Text Data Mining
Relationships between information in documents can create new facts, not previously known.
Imagine
You are a medical researcher
Your patient has
  spinal inflammation
  numbness in fingers
  low TC levels
  negative results for all tests
How can you help her?
Idea
A new way of searching text.
Link pieces of information together to formulate hypotheses …
LINDI: Linking Information for New DIscoveries
Students: Barbara Rosario, David Blei
Three main parts
Search UI for building and reusing hypothesis seeking strategies.
Statistical language analysis techniques for interpreting the text.
Backend for interfacing with various databases and translating different formats.
Gathering Evidence
Spinal Inflammation
Numbness in fingers
Low TC Levels
Find diseases associated with each
Supporting Cascaded Search Operations
Spinal Inflammation
Numbness in fingers
Low TC Levels
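One plausible reading of a cascaded search, sketched under assumptions: run one search per symptom, then feed each result set into the next step by intersecting them. The association table is invented placeholder data, and the function names are hypothetical; a real system would query literature databases instead.

```python
# Hypothetical symptom -> disease associations, invented for illustration only.
ASSOCIATIONS = {
    "spinal inflammation": {"disease A", "disease B", "disease C"},
    "numbness in fingers": {"disease B", "disease C", "disease D"},
    "low TC levels": {"disease C", "disease E"},
}


def find_diseases(symptom, associations):
    """One search operation: diseases associated with a single symptom."""
    return set(associations.get(symptom, set()))


def cascade(symptoms, associations):
    """Chain the searches, keeping only diseases found at every step."""
    candidates = None
    for symptom in symptoms:
        found = find_diseases(symptom, associations)
        candidates = found if candidates is None else candidates & found
    return candidates or set()
```

With the placeholder data, only "disease C" is associated with all three symptoms, so it is the one candidate left after the cascade.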
New Language Analysis
First use category labels to retrieve candidate documents
Then use language analysis to detect causal relationships between concepts
  Title: Magnesium deficiency implicated in increased stress levels.
  Interpretation: <nutrient><reduction> related-to <increase><symptom>
Use these to find relationships and formulate hypotheses
Statistical Semantic Parsing
Modern statistical techniques
  Mainly applied to syntactic structure
Probabilistic knowledge representation
  Represent hypotheses with different degrees of certainty.
Automating Assessment of
Web Site Usability
Why Worry?
Problem: IBM's extranet
  Heavy use of help and search
  Unhappy users
Solution
  Massive web site redesign
  Focus on info-organization, not the purchasing process.
  Cost: "in the millions"
Results
  Not announced or trumped up
  Use of "help" decreased 84%
  Sales increased 400%
Web TANGO: Tool for Assessing NaviGation & Organization
Student: Melody Ivory
Goal: automated support for comparing design alternatives
How: Assess usability of the information architecture
Approximate people’s information-seeking behavior (Monte Carlo simulation)
Output quantitative usability metrics
Anatomy of Web Site Design
Courtesy of Mark Newman
Information Architecture
Navigation Design
Information Design
Graphic Design
Usability Evaluation: Standard Techniques
User studies
  Have people use the interface to complete some tasks
  Requires an implemented interface
  "Discount" vs. Scientific Results
Heuristic Evaluation
  An expert assesses a design or implementation according to certain guidelines
Automated Usability Evaluation
Logging/capture
  Pro: Easy
  Con: Requires implemented system
  Con: Don't know the user task (web)
  Con: Don't present alternatives
  Con: Don't distinguish error from success
Analytical Modeling
  Pro: doable at design phase
  Con: models an expert
  Con: academic exercise
Simulation
Existing Metrics
Web metric analysis tools report on what is easy to measure, e.g.:
  Predicted download time
  Depth/breadth of site
We want to worry about
  Content
  User goals/tasks
Not available from logs
We also want to compare alternative designs.
Monte Carlo Simulation
Have a model of information structure
Have a set of user goals
Want to assess navigation structure
  Compare alternatives/tradeoffs
  Identify bottlenecks
  Identify critically important pages/links
  Check all pairs of start/end points
  Check overall reachability before and after a change.
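The reachability check over all start/end pairs can be sketched without any randomness, assuming the site is modeled as a plain dictionary from each page to the pages it links to. The representation and the function names are assumptions made for this example, not the tool's actual data model.

```python
from collections import deque


def reachable_from(site, start):
    """Breadth-first search: every page reachable from start by following links."""
    seen = {start}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for nxt in site.get(page, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def unreachable_pairs(site):
    """All (start, end) pairs where end cannot be reached from start."""
    pages = set(site) | {p for links in site.values() for p in links}
    return sorted(
        (s, t) for s in pages for t in pages - reachable_from(site, s)
    )
```

Running `unreachable_pairs` on the site model before and after a proposed change, and comparing the two lists, shows which start/end pairs the change breaks or repairs.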
Monte Carlo Simulation
At each step in the simulation
  Assume a probability distribution over a set of next choices.
  The next choice is a function of:
    The current goal
    The understandability of the choice
    The overall complexity of the set of choices
    Prior interaction history
  These can use models of "scent"
Varying the distribution corresponds to varying properties of the links
Spot-check important choices
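A minimal sketch of such a simulation, under strong simplifying assumptions: each link carries a single fixed weight that stands in for goal, understandability and scent combined, and interaction history is ignored. The site, the weights and all function names are invented for illustration and are not taken from the actual simulator.

```python
import random

# Hypothetical site: page -> {linked page: weight}. The weights are invented
# and stand in for how strongly each link "smells" like the current goal.
SITE = {
    "home": {"products": 1.0, "support": 3.0},
    "products": {"home": 1.0},
    "support": {"home": 1.0, "renter support": 4.0},
    "renter support": {},
}


def step(site, page, rng):
    """Sample the next page from the distribution over the current page's links."""
    choices = site[page]
    if not choices:
        return None  # dead end: no outgoing links
    pages = list(choices)
    weights = [choices[p] for p in pages]
    return rng.choices(pages, weights=weights, k=1)[0]


def simulate(site, start, target, rng, max_steps=50):
    """Run one simulated visit; return steps taken, or None if target not found."""
    page, steps = start, 0
    while page != target and steps < max_steps:
        nxt = step(site, page, rng)
        if nxt is None:
            return None
        page, steps = nxt, steps + 1
    return steps if page == target else None


def average_steps(site, start, target, runs=1000, seed=0):
    """Average step count over the runs that reached the target."""
    rng = random.Random(seed)
    results = [simulate(site, start, target, rng) for _ in range(runs)]
    finished = [r for r in results if r is not None]
    return sum(finished) / len(finished) if finished else None
```

Raising the weight on a link, for example `SITE["home"]["support"]`, and rerunning `average_steps` is the code-level counterpart of varying the properties of that link and observing the effect on navigation.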
One Monte Carlo simulation step for Design 1, Task 1. Simulation starts from the home page and the target information is at Renter Support.
Monte Carlo simulation results for Design 1, Task 1. Simulation runs start from all pages in the site. Average Navigation times are shown for Tasks 2 & 3.
Using Simulator Results
Design Decisions
  Use Design 1
  Improve Tasks 1 & 2
Next Steps
  Analyze results for Tasks 1 & 2
  Create new Design 1
  Repeat simulation to compare old & new designs
  Iterate if necessary

        Design 1          Design 2
Task    Time     Errors   Time     Errors
1       41 sec   2        38 sec   4
2       38 sec   4        43 sec   5
3       32 sec   2        74 sec   6
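As a small worked check of the comparison, the per-task figures on this slide can be totaled per design. The numbers are copied from the slide; the dictionary layout and the function name are assumptions made for this sketch.

```python
# (time in seconds, errors) per task, copied from the figures on this slide.
RESULTS = {
    "Design 1": {1: (41, 2), 2: (38, 4), 3: (32, 2)},
    "Design 2": {1: (38, 4), 2: (43, 5), 3: (74, 6)},
}


def totals(results):
    """Sum time and errors over all tasks for each design."""
    return {
        design: (
            sum(time for time, _ in tasks.values()),
            sum(errors for _, errors in tasks.values()),
        )
        for design, tasks in results.items()
    }
```

Design 1 totals 111 seconds and 8 errors against 155 seconds and 15 errors for Design 2, which is consistent with the decision to use Design 1.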
Research Issues: Navigation Predictions
Develop IR model for predicting link selection
Requirements
  Information need (task metadata)
  Representation of pages (page metadata)
  Method for selecting links (relevance ranking)
  Maintaining user’s conceptual model during site traversal (scent [Fur97,LC98,Pir97])
One possible approach
  Information Foraging Theory [PC95,Pir97,PPR96]
  Functional categorization of pages based on features
  Prediction of relevance to current page
    Consider link connectivity, text similarity & usage
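A rough sketch of the text-similarity ingredient only, assuming the information need and each link's target page are both represented as short bags of words and compared by cosine similarity. Link connectivity and usage are left out, and the function names and metadata format are hypothetical.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between the word-count vectors of two strings."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[w] * vb[w] for w in va)
    norm = math.sqrt(sum(c * c for c in va.values())) * math.sqrt(
        sum(c * c for c in vb.values())
    )
    return dot / norm if norm else 0.0


def rank_links(task, links):
    """Rank link labels by similarity of their page text to the task text.

    links maps a link label to text describing its target page (an assumed
    stand-in for page metadata). Ties are broken alphabetically.
    """
    scored = [(cosine(task, text), label) for label, text in links.items()]
    return [label for _, label in sorted(scored, key=lambda s: (-s[0], s[1]))]
```

The resulting ranking could be turned into the next-choice distribution used at each simulation step, for instance by normalizing the similarity scores.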
Other HCC-Related Projects
Using a large digital desk in design
  Ame Elliot
Using visualization for light design
  Dan Glaser
User interfaces and computer security
  Prof. Doug Tygar, Rachna Dhamija