ISKOD conference, Konstanz, 20-22 February 2008
Freely faceted classification for a Web-based bibliographic archive
The BioAcoustic Reference Database
Claudio Gnoli, Gabriele Merli, Gianni Pavan, Elisabetta Bernuzzi, Marco Priano (University of Pavia. Dep’t Mathematics & CIBRA)
• interdisciplinarity
• requires some new KOS
• based on phenomena
• allowing shifts between perspectives
• by analytico-synthetic techniques
The León Manifesto
The heresy
Disciplines! or phenomena...
sources: Hajdu Barát, Gnoli
Freely faceted classification
Developed within NATO-granted CRG research for a new general scheme
mainly by Douglas Foskett and Derek Austin
then partially evolved into the PRECIS verbal system
source: Vickery
Existing freely faceted verbal indexing systems
• relational indexing [Farradane, 1950s]
• Syntol [Gardin, 1960s]
• PRECIS [Austin, 1970s]
• POPSI [Bhattacharyya, 1980s]
Freely faceted classification
• Any concept has a constant notation, and
• can be combined with any other
• by expressing the kind of relationship.
• Concepts are not bound to disciplinary classes, but organized in classes of phenomena.
[Austin 1969, Prospects for a new general classification, J. librarianship, 1, n. 3, p. 149-169]
How phenomena can be ordered
...
e atoms
f molecules
l cells
m organisms
n populations
s communities
t institutions
v cultures
...
(increasing organization)
The ILC project
FFC: constant notation
mqvtn2a whales in Atlantic ocean
t8mqvtn institutions dealing with whales
wa4mqvtn food consisting of whales
wni60mqvtn vessels damaged by whales
xg8mqvtn painting of whales
FFC searching
It suits computer applications, as each concept can be retrieved separately by searching for the corresponding notation.
“whales” mqvtn
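The retrieval idea above can be sketched in a few lines of Python. This is only an illustration, not the database's actual code: the `RECORDS` table and the `search` function are hypothetical, the classmarks and captions are copied from the slides, and the sketch assumes (as the pair mq "animals" / mqvtn "whales" suggests) that a narrower class extends the notation of its broader class.

```python
import re

# Hypothetical mini-index: classmarks and captions copied from the slides.
RECORDS = {
    "mqvtn2a": "whales in Atlantic ocean",
    "t8mqvtn": "institutions dealing with whales",
    "wa4mqvtn": "food consisting of whales",
    "wni60mqvtn": "vessels damaged by whales",
    "xg8mqvtn": "painting of whales",
    "wni60mq": "vessels damaged by animals",
}

def search(notation, records=RECORDS):
    """Return the classmarks in which some concept starts with `notation`.

    Concepts are taken to be the runs of letters between facet digits
    (an assumption made for this sketch).
    """
    hits = []
    for classmark in records:
        concepts = re.findall(r"[a-z]+", classmark)
        if any(concept.startswith(notation) for concept in concepts):
            hits.append(classmark)
    return sorted(hits)

for classmark in search("mqvtn"):
    print(classmark, RECORDS[classmark])
```

Because the notation of a concept never changes, the same call finds whales whether they lead the classmark or appear in a later facet; searching for "mq" would additionally return the record about animals in general.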
FFC browsing
mqvtn whales
mqvtn25e whales in estuaries
u8mqvtn whale economy
t8mqvtn institutions dealing with whales
xg8mqvtn paintings of whales
Results can be sorted systematically
FFC-like use of existing KOSs
Traditional classifications (DDC, UDC) can be used in this way for retrieval purposes, by assigning multiple classes to a document
Example: NEBIS OPAC [Pika]
FFC: free combinations
wni60mq vessels damaged by animals
mq60wni animals damaged by vessels
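A minimal sketch of how such free combinations could be built, assuming that the digits "60" between two concepts express the relationship "damaged by" (a reading inferred from the two captions above, not a documented rule). The tables and the `combine` function are hypothetical.

```python
# Concept notations as they appear in the slides.
CONCEPTS = {"mq": "animals", "wni": "vessels"}

# Assumed meaning of the facet digits, inferred from the example captions.
RELATIONS = {"60": "damaged by"}

def combine(first, relation, second):
    """Join two constant notations with a relationship indicator.

    Returns the resulting classmark together with a verbal caption.
    """
    classmark = first + relation + second
    caption = f"{CONCEPTS[first]} {RELATIONS[relation]} {CONCEPTS[second]}"
    return classmark, caption

print(combine("wni", "60", "mq"))
print(combine("mq", "60", "wni"))
```

Swapping the two arguments reverses the meaning while each concept keeps the same notation, which is the point of the pair of examples above.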
FFC: citation order
Facets of the same relevance are cited in a standard citation order (as in classic FC),
but focus facets can be promoted to the leading positions (as in Nuovo Soggettario)
wa4mq29q food consisting of animals in Japan
wa29q4mq Japanese food, consisting of animals
FFC problems
More freedom requires more skills...
Users want simple notation (a virtue of DDC and BC2)
Austin concluded that FFC was good for IR, while mark-and-park systems were good for shelving: two separate systems?!
Possible solutions
• Indexers can be helped by semi-automatic classification, and
• assisted by visual interfaces
Possible solutions
• Notation can be shortened by extra-defined foci
nyc oceanic environment
25 [ny] in environment
25c in oceanic environment
Possible solutions
• only using letters, digits, and brackets
abcd9e(5fg)8h
(figure labels: main class, facets, subfacets)
A property of FFC
• items with more facets are more retrievable (by one facet or another)
• paradoxically, specialized documents tend to be retrieved more often
• a balanced cataloguing policy is needed
Application
Research needs
The database is fed with papers actually used by the CIBRA staff in bioacoustic research, in both field recording and signal processing
Indexing interface
The indexer can edit the classmark and dynamically see the caption she is producing
Suggested classes
She can be helped by automatic suggestions generated by matching the title with the DB thesaurus
[Screenshot of the indexing interface; labels: article title, edited notation, suggested classes, automatic caption]
Suggested classes
For each title word, classes are suggested which match caption, or synonyms, or description, or discipline
To improve precision
• stopwords ignored: words < 4 letters, “with”, ...
To improve recall
• -s truncated
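The matching rules above might look roughly like this in Python. It is a sketch under stated assumptions, not the system's implementation: the `THESAURUS` entries are hypothetical (only the notations mq, mqvtn, wni and their captions come from the slides; synonyms and disciplines are invented for illustration), and the stopword list contains just the one word named on the slide.

```python
import re

# Hypothetical thesaurus records; synonyms and disciplines are invented.
THESAURUS = [
    {"notation": "mq", "caption": "animals", "synonyms": ["fauna"],
     "description": "", "discipline": "zoology"},
    {"notation": "mqvtn", "caption": "whales", "synonyms": ["cetaceans"],
     "description": "", "discipline": "zoology"},
    {"notation": "wni", "caption": "vessels", "synonyms": ["ships", "boats"],
     "description": "", "discipline": "engineering"},
]

STOPWORDS = {"with"}  # the slide's list is longer ("with", ...)

def normalize(word):
    """Lowercase a word, drop stopwords and short words, truncate a final -s."""
    word = word.lower()
    if len(word) < 4 or word in STOPWORDS:
        return None
    return word[:-1] if word.endswith("s") else word

def keys(text):
    """Set of normalized words found in a piece of text."""
    words = re.findall(r"[A-Za-z]+", text)
    return {key for key in map(normalize, words) if key}

def suggest(title, thesaurus=THESAURUS):
    """Notations whose caption, synonyms, description or discipline
    share at least one normalized word with the title."""
    title_keys = keys(title)
    hits = []
    for entry in thesaurus:
        fields = " ".join([entry["caption"], *entry["synonyms"],
                           entry["description"], entry["discipline"]])
        if title_keys & keys(fields):
            hits.append(entry["notation"])
    return hits

print(suggest("Collisions of vessels with whales"))
```

Applying the same normalization to both the title and the thesaurus fields is a design choice of this sketch; it lets "whale" in a title match the caption "whales" and vice versa.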
Verbal captions
They are synthesized from notation by a PHP script
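The slides do not show the PHP script, so the following Python sketch only illustrates the general idea of turning a classmark into a caption. The lookup tables are hypothetical and limited to notations whose captions can be read off the earlier examples (wa4mqvtn, wni60mq); the splitting rule (letters are concepts, digits are relationships) is likewise an assumption.

```python
import re

# Notations and wordings read off the example classmarks in the slides.
CONCEPTS = {"mq": "animals", "mqvtn": "whales", "wa": "food", "wni": "vessels"}
RELATIONS = {"4": "consisting of", "60": "damaged by"}

def caption(classmark):
    """Synthesize a verbal caption from a flat (bracket-free) classmark.

    Runs of letters are looked up as concepts and runs of digits as
    relationships; unknown pieces are kept in square brackets.
    """
    tokens = re.findall(r"[a-z]+|[0-9]+", classmark)
    words = []
    for token in tokens:
        table = RELATIONS if token.isdigit() else CONCEPTS
        words.append(table.get(token, f"[{token}]"))
    return " ".join(words)

print(caption("wa4mqvtn"))
print(caption("wni60mq"))
```

Nested brackets and promoted facets are deliberately left out; handling them would need a real parser rather than a flat token list.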
Indexing interface
Interface usability is still to be improved, e.g. with click-and-select, drag-and-drop, and an automatic default citation order:
gxxx kyy bzzz kyy gxx bzz
Classification by methods
The León Manifesto advocates classification by phenomena, theories, and methods:
birds, according to Darwinism, studied by observation
mqvo04d03b
Classification by methods
In bioacoustics, methods are relevant more often than theories,
while in human sciences, the opposite seems to be true
[Szostak & Gnoli 2008, proc ISKO10 Montréal]
Classification by methods
Complexity issues
“Guidelines on the applications of the environment protection and biodiversity conservation act to interactions between offshore operations and larger cetaceans”
tn8ve(4qvtn(902o68v(3)25c))4d
Deep facet nesting becomes problematic, even for the PHP script...
Taming complexity
tn8V4d ve4V mqvtn902o68v(3)25c
Deictic V refers to the whole subsequent phase, thus avoiding most brackets
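To make the difference between the two notations concrete, the bracket nesting of a classmark can be measured with a small helper. The function `max_nesting` is a hypothetical illustration and says nothing about how the PHP script actually parses classmarks.

```python
def max_nesting(classmark):
    """Return the deepest level of round-bracket nesting in a classmark.

    Raises ValueError if the brackets are not balanced.
    """
    depth = 0
    deepest = 0
    for char in classmark:
        if char == "(":
            depth += 1
            deepest = max(deepest, depth)
        elif char == ")":
            depth -= 1
            if depth < 0:
                raise ValueError("unbalanced closing bracket")
    if depth != 0:
        raise ValueError("unbalanced opening bracket")
    return deepest

nested = "tn8ve(4qvtn(902o68v(3)25c))4d"
deictic = "tn8V4d ve4V mqvtn902o68v(3)25c"
print(max_nesting(nested), max_nesting(deictic))
```

The fully nested classmark reaches three levels of brackets, while the version using the deictic V needs only one.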
Possible solutions
The system can be used at various degrees of complexity, from purely free to fully faceted, according to the needs.
Websites: free classification
Specialized literature: freely faceted classification
User search
One or more facets can be selected...
(also in combination with author/title/date)
Results display
The zero match problem
[Tudhope & Binding 2008, Faceted thesauri, Axiomathes, 18, special issue on facet analysis, in prep.]
birds x threat x Europe = 0
Possible solution: enable “fuzzy” search by
• ignoring one facet at a time (in which order?...)
• going one step up in the hierarchy
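The first of these options can be sketched as follows. Everything in the example is hypothetical: the three documents and their facet labels are invented so that the query from the slide returns nothing, and the sketch sidesteps the open question of order by simply trying every single-facet omission.

```python
from itertools import combinations

# Invented records: each document is indexed by a set of facet labels.
DOCS = {
    "doc1": {"birds", "threat", "Africa"},
    "doc2": {"birds", "Europe", "migration"},
    "doc3": {"mammals", "threat", "Europe"},
}

def exact(query, docs=DOCS):
    """Documents indexed with every facet in the query."""
    return sorted(doc for doc, facets in docs.items() if set(query) <= facets)

def fuzzy(query, docs=DOCS):
    """Exact search; on zero matches, retry with one facet ignored at a time.

    Returns a dict mapping each facet combination that was actually
    searched and matched to its list of documents.
    """
    query = set(query)
    hits = exact(query, docs)
    if hits:
        return {tuple(sorted(query)): hits}
    relaxed = {}
    for kept in combinations(sorted(query), len(query) - 1):
        found = exact(kept, docs)
        if found:
            relaxed[kept] = found
    return relaxed

print(fuzzy({"birds", "threat", "Europe"}))
```

Returning the relaxed combinations alongside their results lets an interface tell the user which facet was dropped for each group of hits.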
Search refinement
Users should be allowed to refine their search by navigating through facets and hierarchy, according to the number of results (average futility point: 30)
This has been partially done already in a related archive...
Search refinement
Future developments
• Complete fully faceted classmarks for all articles
• Derive consistent indexing policies from practice
• Fix automatic caption generation for complex cases
• Improve the facet selection menu
• Allow subject search by typing words
• Make the indexing interface evolve into a real assistant tool
Conclusion
Freely faceted classification by phenomena, theories, and methods is feasible
[Szostak 2007, Proc ISKOE León]
ILC people: Claudio Gnoli, Mela Bosch, Enzo Cesanelli, Viviana Doldi, Hong Mei, Gabriele Merli, Marcella Patania, Roberto Poli, Rick Szostak, Lorena Zuccolo
CIBRA people: Gianni Pavan, Elisabetta Bernuzzi, Claudio Fossati, Amanda May Koltz, Michele Manghi, Marco Priano
Published reports:
Gnoli & Poli 2004, Levels of reality and levels of representation, Knowl org 31, 3, 151-160
Gnoli & Merli 2005, Notazione e interfaccia di ricerca per una classificazione a livelli, AIDA informazioni, 23, 1-2, 57-72
Hong 2005, A phenomenon approach to faceted classification, 53rd conf Japan Soc LIS
Gnoli 2006, The meaning of facets in nondisciplinary classifications, proc 9th ISKO conf, Vienna, 11-18
Gnoli & Hong 2006, Freely faceted classification for Web-based information retrieval, New rev hypermedia & multimedia, 12, 1, 63-81
Gnoli, Bosch & Mazzocchi 2007, A new relationship for multidisciplinary knowledge organization systems: dependence, proc 8th ISKO Spain conf, León, 399-409
Gnoli 2007, “Classic” vs. “freely” faceted classification, ISKO UK meeting Ranganathan revisited, London
Gnoli, Pavan, Bernuzzi, Merli & Priano 2007, Freely faceted classification for the BioAcoustic Reference Database, poster 21st IBAC conf, Pavia
Szostak & Gnoli 2008, Classifying by phenomena, theories, and methods, proc 10th ISKO conf, Montréal
Website: www.iskoi.org/ilc
...thank you very much!