FaSet: A Set Theory Model for Faceted Search
Dario Bonino, Fulvio Corno, Laura Farinetti
Politecnico di Torino, Dipartimento di Automatica e Informatica
http://elite.polito.it
WI/IAT 2009, Milano, Italy
Outline
Faceted Search
Goal
The FaSet Set-Theoretical Model
FaSet Relational Implementation
Faceted Classification
Originated in Library Science
Ranganathan, 1962
Content-based classification scheme
Multi-dimensional
Facet = classification dimension
Multi-valued
Focus = allowed value in one of the facets
Example
Facets, with the allowed foci for each facet:
Color: Yellow, Red, Orange, Green, Blue, White, Black
Shape: Cube, Sphere, Cone, Cylinder
Taste: Sweet, Bitter, Neutral, Acid
An item is described by a choice of foci from these facets.
Faceted Search Systems
Faceted Classification
Simple, intuitive, versatile, powerful
Adopted by more and more web sites
As a classification system for their products/items/documents/resources/…
As a model for the user interface in search, filtering, refinement
Examples
[Three slides of images, not reproduced in this transcript]
Facets in the real world
Multi-valued classification
During classification
During search
AND vs OR semantics?
Hierarchical (nested) facets
Parents selectable?
Incomplete classification
Numerical ranges
Example facets:
Color: Yellow, Red, Orange, Green, Blue, White, Black, Other
Shape: Squared (Cube, Parallelepiped), Rounded (Sphere, Cylinder)
Weight: 0-50 g, 50-100 g, 100+ g
Facets in the Literature
User Interfaces
Active research field since ~2000
Usability studies, mainly for search interfaces
Application case studies
Web vs desktop environment
Mainly for multimedia data
Data and logic model
Methodologies from Library science (Broughton, Vickery)
Formal models
Dynamic Taxonomies (Sacco)
Uniformities, Lattices (Priss)
Granular computing
Less applicable results
Goal of the paper
Propose a formal model: FaSet
for representing
Faceted Classification of resources
Faceted Search Interfaces for such resource sets
Searching, Filtering, Ranking operations
compatible with modern web applications
Mathematically simple
Easy mapping to Relational Algebra
Decouple classification and resources
versatile and flexible
Supports all “real-world” variations on Facets
Facets and Foci
Facets: disjoint sets
Fa, Fb, Fc, …
Facet space:
U = Fa × Fb × Fc × …
Focus L: subset
La ⊆ Fa
Many foci for each facet
Focus name: index list
La<i,j,k,…>
[Figures: the facet space U with facets Fa and Fb; the foci La<1>, La<2>, La<1,1>, La<1,2> within facet Fa]
Hierarchy
Hierarchical nesting of foci is represented by subset containment
La<narrower> ⊆ La<broader>
Focus names are chosen to represent hierarchical containment
La<i,j,k> ⊆ La<i,j>
Reminiscent of the Dewey Decimal Classification
Incomplete taxonomy
No overlap allowed
A focus may be larger than the union of its sub-foci
[Figure: foci La<1>, La<2>, La<1,1>, La<1,2> within facet Fa]
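The naming rule above means containment between foci can be checked on the names alone. A minimal sketch, assuming focus names are stored as Python tuples of integers and that the empty tuple names the whole facet (both are illustrative choices, not part of the slides):

```python
from typing import Tuple

# Assumed encoding: (1, 2) stands for the focus named <1,2>.
FocusName = Tuple[int, ...]


def is_narrower_or_equal(narrow: FocusName, broad: FocusName) -> bool:
    """True when `broad` is a prefix of `narrow`, i.e. the focus named
    `narrow` is contained in the focus named `broad`."""
    return narrow[:len(broad)] == broad


def parent(name: FocusName) -> FocusName:
    """Name of the immediately broader focus (drops the last index)."""
    return name[:-1]
```

For instance, `is_narrower_or_equal((1, 2), (1,))` holds, while `is_narrower_or_equal((2, 1), (1,))` does not.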
Classification (Facet)
Resources r are classified w.r.t. the facet space
“Projection”: r↓Fa
We may only represent projections built by combining foci
r↓Fa = ∪p La<p>
Just the focus names are needed
{<1,1>,<2>}
[Figure: the projection r↓Fa drawn over the foci La<1>, La<2>, La<1,1>, La<1,2> of facet Fa]
Classification (Multidimensional)
On the multi-dimensional space, the cartesian product is taken
r↓U = r↓Fa × r↓Fb × ...
Just the focus names are needed
[Figure: r↓U as the product of the projections r↓Fa and r↓Fb]
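Since only focus names are needed, a classified resource can be held as one set of names per facet, and the cartesian product can be enumerated on demand. A minimal sketch under assumed conventions: the facet labels "a" and "b", the tuple encoding of names, and the helper `cells` are all hypothetical.

```python
from itertools import product
from typing import Dict, List, Set, Tuple

FocusName = Tuple[int, ...]                  # (1, 1) stands for <1,1>
Classification = Dict[str, Set[FocusName]]   # facet label -> focus names

# Hypothetical resource: foci <1,1> and <2> on facet a, focus <3> on facet b.
r: Classification = {
    "a": {(1, 1), (2,)},
    "b": {(3,)},
}


def cells(c: Classification) -> List[Dict[str, FocusName]]:
    """List every combination of one focus per facet; the union of the
    corresponding product regions is the resource's region in the facet space."""
    facets = sorted(c)
    combos = product(*(sorted(c[f]) for f in facets))
    return [dict(zip(facets, combo)) for combo in combos]
```

Here `cells(r)` yields two combinations, one for each focus chosen on facet a.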
Searching in FaSet
Resources r
Classified as r↓U
Query q
Expressed uniformly as q↓U
Search = Filtering + Ranking
Filtering: r is relevant to q iff (r↓U) ⋂ (q↓U) ≠ ∅
Ranking: estimate the similarity S(q, r) of r to q
[Figure: query q and resources r1, r2 drawn in the plane of facets Fa and Fb]
Filtering
All resources that match the query, even partially
(r↓U) ⋂ (q↓U) ≠ ∅
May be easily computed by checking focus names
Prefix-compatibility: La<p1> ≍ La<p2> iff
p1 = p2, or
p1 is a prefix of p2, or
p2 is a prefix of p1
At least one pair of foci, for each facet, must be prefix-compatible
∀Fa : ∃ La<p1> ∈ q, La<p2> ∈ r : La<p1> ≍ La<p2>
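The filtering rule can be sketched directly on focus names. This is an illustration, not the authors' implementation; it assumes names are tuples of integers, a classification is a dict from facet label to a set of names, and a facet absent from the query is treated as unconstrained.

```python
from typing import Dict, Set, Tuple

FocusName = Tuple[int, ...]
Classification = Dict[str, Set[FocusName]]


def prefix_compatible(p1: FocusName, p2: FocusName) -> bool:
    """True when p1 == p2 or one name is a prefix of the other."""
    n = min(len(p1), len(p2))
    return p1[:n] == p2[:n]


def is_relevant(q: Classification, r: Classification) -> bool:
    """For every facet in the query, at least one (query focus, resource focus)
    pair must be prefix-compatible. A resource with no foci on a queried
    facet is rejected (an assumption of this sketch)."""
    return all(
        any(prefix_compatible(p1, p2)
            for p1 in q_foci
            for p2 in r.get(facet, ()))
        for facet, q_foci in q.items()
    )
```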
Example
Focus hierarchy:
L<>
L<1>: L<1,1>, L<1,2>, L<1,3>
L<2>: L<2,1>, L<2,2>
Query and resources:
q = {<1,3>, <2>}
r1 = {<1>}
r2 = {<2,2>}
r3 = {<1,2>, <1,3>}
r4 = {<1,1>, <1,2>}
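The example can be checked mechanically on a single facet. The sketch below reads the five groups of focus names in order as q, r1, r2, r3, r4, which is an assumption about the slide layout; the tuple encoding is also illustrative.

```python
def prefix_compatible(p1, p2):
    """True when the names are equal or one is a prefix of the other."""
    n = min(len(p1), len(p2))
    return p1[:n] == p2[:n]


# Assumed pairing of focus names with labels, read in slide order.
q = {(1, 3), (2,)}
resources = {
    "r1": {(1,)},
    "r2": {(2, 2)},
    "r3": {(1, 2), (1, 3)},
    "r4": {(1, 1), (1, 2)},
}

# A resource passes the filter if any of its foci is prefix-compatible
# with any focus of the query.
matches = {
    name: any(prefix_compatible(a, b) for a in q for b in foci)
    for name, foci in resources.items()
}
```

Under this reading, r1, r2 and r3 pass the filter (via <1>, <2,2> and <1,3> respectively), while r4 does not, because neither <1,1> nor <1,2> is prefix-compatible with <1,3> or <2>.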
Ranking
Compute similarity between resource and query
Often neglected by Faceted Search Interfaces
Define a Similarity Measure S(q, r) ∈ [0,1]
Compute similarity between matching foci (deeper matches give higher scores)
Aggregate focus-based similarity measures in the same facet (fuzzy sum)
Normalize facet-level results
Aggregate facet-based similarity measures across all facets (fuzzy product)
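One possible reading of these steps is sketched below. The slides do not give the formulas, so every concrete choice here is an assumption: the focus-level score is the matched depth divided by a `max_depth` parameter, the fuzzy sum is the probabilistic sum a + b - ab, the fuzzy product is ordinary multiplication, and no separate normalization is applied because these operators already stay within [0,1].

```python
from functools import reduce
from typing import Dict, Iterable, Set, Tuple

FocusName = Tuple[int, ...]
Classification = Dict[str, Set[FocusName]]


def focus_similarity(p1: FocusName, p2: FocusName, max_depth: int) -> float:
    """Assumed score: 0 if the names are not prefix-compatible, otherwise
    the depth of the match relative to max_depth (deeper scores higher)."""
    n = min(len(p1), len(p2))
    if p1[:n] != p2[:n]:
        return 0.0
    return min(1.0, n / max_depth)


def fuzzy_sum(values: Iterable[float]) -> float:
    """Probabilistic sum, one common fuzzy OR: a + b - a*b."""
    return reduce(lambda a, b: a + b - a * b, values, 0.0)


def similarity(q: Classification, r: Classification, max_depth: int) -> float:
    """Fuzzy sum inside each queried facet, plain product across facets."""
    score = 1.0
    for facet, q_foci in q.items():
        pair_scores = [focus_similarity(p1, p2, max_depth)
                       for p1 in q_foci
                       for p2 in r.get(facet, ())]
        score *= fuzzy_sum(pair_scores)
    return score
```

With these choices a resource that fails the filter on any queried facet gets a score of 0, so ranking stays consistent with filtering.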
FaSet Relational Implementation
The FaSet classification requires
A constant set of Facets
A constant set of Foci
An “index” table storing the list of focus names for each resource
[Figure: resource database alongside the constant facet and focus sets]
FaSet Relational Implementation
The FaSet search algorithm uses
Set operations
Universal and existential quantification
Aggregate operations for computing ranking measures
Directly supported by Relational DBMS primitives
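As an illustration of how filtering might map onto a relational engine, the sketch below uses Python's built-in `sqlite3` module. The schema is hypothetical: table and column names are invented, the constant facet and focus tables are omitted, and focus names are stored as dotted paths with a trailing dot (`<1,3>` becomes `1.3.`) so that prefix-compatibility reduces to a `LIKE` test.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE resource_focus (resource TEXT, facet TEXT, focus TEXT);  -- index table
CREATE TABLE query_focus (facet TEXT, focus TEXT);                    -- current query
""")
conn.executemany(
    "INSERT INTO resource_focus VALUES (?, ?, ?)",
    [("r1", "a", "1."), ("r2", "a", "2.2."),
     ("r3", "a", "1.2."), ("r3", "a", "1.3."),
     ("r4", "a", "1.1."), ("r4", "a", "1.2.")],
)
conn.executemany(
    "INSERT INTO query_focus VALUES (?, ?)",
    [("a", "1.3."), ("a", "2.")],
)

# Existential part: a join on prefix-compatible foci within the same facet.
# Universal part: a resource must match on every facet named in the query.
FILTER_SQL = """
SELECT r.resource
FROM query_focus AS q
JOIN resource_focus AS r
  ON r.facet = q.facet
 AND (r.focus LIKE (q.focus || '%') OR q.focus LIKE (r.focus || '%'))
GROUP BY r.resource
HAVING COUNT(DISTINCT q.facet) = (SELECT COUNT(DISTINCT facet) FROM query_focus)
ORDER BY r.resource
"""
relevant = [row[0] for row in conn.execute(FILTER_SQL)]
```

The trailing dot keeps `1.` from matching `11.`; the ranking aggregates could be added as further `GROUP BY` expressions in the same query.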
Future work
Experimentation of FaSet on sample data sets
Performance evaluation
Integration with front-end AJAX interfaces
CMS module
MIT Exhibit
Evaluation of the ranking algorithm from the Information Retrieval point of view
Conclusions - FaSet
Formally defined faceted Representation & Search model
Light formalism
Supports hierarchies, nesting, multiple classification, incomplete specifications, …
Compatible with modern web development technologies
Thank you!