-
Advanced Information Systems Laboratory, GeoSpatiumLab S.L.
ThManager
University of Zaragoza, Computer Science and Systems Engineering Department, Advanced Information Systems Laboratory (IA3)
http://iaaa.cps.unizar.es/
GeoSpatiumLab S.L.
http://www.geoslab.com/
-
Outline
Introduction
Capabilities
Conclusion
-
Introduction to thesauri
"A thesaurus is a set of terms that describe the vocabulary of a controlled indexing language, formally organized so that the a priori relationships between concepts (for example synonymous terms, broader terms, narrower terms and related terms) are made explicit" [ISO 2788]
Used to improve the precision and recall of information retrieval in digital libraries
o provide a uniform and consistent vocabulary for indexing metadata ("description of the data holdings")
o supply users with a suitable vocabulary for retrieval
o expand user queries by automatically adding new terms to the query
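The query expansion mentioned above can be illustrated with a minimal sketch. The THESAURUS dictionary, its terms and the expand_query function are hypothetical names chosen for illustration; they are not part of ThManager.

```python
# Hypothetical mini-thesaurus: each term maps to its related vocabulary.
THESAURUS = {
    "river": {
        "synonyms": ["stream"],
        "narrower": ["tributary"],
        "broader": ["watercourse"],
    },
}


def expand_query(terms, thesaurus, relations=("synonyms", "narrower")):
    """Return the original terms followed by terms reached through the given relations."""
    expanded = list(terms)
    for term in terms:
        entry = thesaurus.get(term.lower(), {})
        for relation in relations:
            for extra in entry.get(relation, []):
                if extra not in expanded:
                    expanded.append(extra)
    return expanded


print(expand_query(["river", "pollution"], THESAURUS))
# ['river', 'pollution', 'stream', 'tributary']
```

Expanding with synonyms and narrower terms tends to raise recall; adding broader terms as well would widen the query further at the cost of precision.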
-
ThManager
ThManager facilitates the management of knowledge organization systems
thesauri and other types of controlled vocabularies, such as taxonomies or classification schemes
In particular, it facilitates the creation and visualization of SKOS RDF vocabularies
a W3C initiative for the representation of knowledge organization systems using the Resource Description Framework (RDF)
[Figure: SKOS model. Classes: ConceptScheme, Concept. Properties shown: rdfs:label, skos:prefLabel, skos:altLabel, skos:scopeNote, skos:broader, skos:narrower, skos:related, skos:definition, skos:hasTopConcept, skos:example, skos:inScheme, skos:symbol (dcmiType:image), skos:prefSymbol, skos:altSymbol. Dublin Core model: dc:title, dc:publisher, ...]
http://www.w3.org/2004/02/skos/
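As a rough illustration of the SKOS model, the sketch below represents a concept scheme and its concepts as plain Python data classes. The class and field names and the example.org URIs are assumptions made for this example; they do not reflect ThManager's internal Java classes.

```python
from dataclasses import dataclass, field


@dataclass
class Concept:
    uri: str
    pref_label: dict = field(default_factory=dict)  # language code -> skos:prefLabel
    alt_labels: dict = field(default_factory=dict)  # language code -> list of skos:altLabel
    definition: dict = field(default_factory=dict)  # language code -> skos:definition
    broader: list = field(default_factory=list)     # URIs of skos:broader concepts
    narrower: list = field(default_factory=list)    # URIs of skos:narrower concepts
    related: list = field(default_factory=list)     # URIs of skos:related concepts


@dataclass
class ConceptScheme:
    uri: str
    title: str                                        # dc:title
    top_concepts: list = field(default_factory=list)  # URIs linked by skos:hasTopConcept
    concepts: dict = field(default_factory=dict)      # URI -> Concept

    def add(self, concept, top=False):
        """Register a concept and optionally mark it as a top concept."""
        self.concepts[concept.uri] = concept
        if top and concept.uri not in self.top_concepts:
            self.top_concepts.append(concept.uri)


scheme = ConceptScheme("http://example.org/scheme", "Example thesaurus")
water = Concept("http://example.org/water", pref_label={"en": "water", "es": "agua"})
river = Concept("http://example.org/river",
                pref_label={"en": "river", "es": "río"},
                broader=[water.uri])
water.narrower.append(river.uri)
scheme.add(water, top=True)
scheme.add(river)
print(scheme.top_concepts)  # ['http://example.org/water']
```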
-
General features
Distributed as an Open Source tool through SourceForge.net
o http://thmanager.sourceforge.net/
Developed in Java
Multi-platform (Windows, Unix)
Storage of metadata and thesauri is managed directly through the file system
Multilingual
o Java internationalization methodology
o Currently: Spanish and English (with a procedure to support new languages)
-
Capabilities
Repository of available thesauri
Description of thesauri by means of metadata
Browsing of thesaurus content
Editing of thesaurus content
Exchange of thesauri according to SKOS format
Interconnection of thesauri through WordNet lexical database
-
Repository of available thesauri
Main window of the application
o Browser of available thesauri in the local repository
Allowed operations
o Selection of thesauri for subsequent operations (browse content, export, delete, …)
o Sorting/filtering of thesauri according to descriptor values (columns)
-
Description of thesauri by means of metadata
Each thesaurus is described by means of a metadata application profile of Dublin Core
http://thmanager.sourceforge.net/docthesaurusdc_en.html
Metadata can be either visualized in HTML or edited through a form
-
Browsing of thesaurus content
It allows the browsing of terms with different viewers (language sensitive)
o Hierarchical viewer: a tree showing the hierarchical structure of thesaurus concepts
o Alphabetic viewer: a list of concepts ordered alphabetically in the selected language
o Search tool: the search is based on preferred labels and supports the criteria "equals", "starts with" and "contains"
For each selected concept
o It shows all the properties
o It allows navigation to the related concepts by means of hyperlinks
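The three search criteria on preferred labels can be sketched as follows. This is a minimal illustration, not ThManager's implementation: the search function, the concept identifiers and the case-insensitive comparison are all assumptions.

```python
def search(pref_labels, text, criterion="contains"):
    """Return the ids of concepts whose preferred label matches the text.

    pref_labels maps a concept id to its preferred label in the selected language.
    Matching ignores case, which is an assumption of this sketch.
    """
    needle = text.casefold()
    matchers = {
        "equals": lambda label: label == needle,
        "starts with": lambda label: label.startswith(needle),
        "contains": lambda label: needle in label,
    }
    match = matchers[criterion]
    return sorted(cid for cid, label in pref_labels.items() if match(label.casefold()))


labels = {"c1": "Water", "c2": "Water pollution", "c3": "Groundwater"}
print(search(labels, "water", "equals"))       # ['c1']
print(search(labels, "water", "starts with"))  # ['c1', 'c2']
print(search(labels, "water", "contains"))     # ['c1', 'c2', 'c3']
```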
-
Editing of thesaurus content
The tool provides an editing interface to modify the content of a thesaurus:
creation of concepts
deletion of concepts
editing of properties and relations
o broader and narrower relations to define a hierarchical structure of concepts
o marking concepts as top concepts: the broader concept of a micro-thesaurus, or concepts in a plain list
o preferred label, alternative label, definition and scope note as multilingual properties (structure: property type + language + value)
o notation properties (structure: type (URI) + value), useful for creating classification schemes that provide multiple codings of terms; for example, the ISO-639 list of languages has 2-letter and 3-letter codes
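A minimal sketch of two of the editing operations listed above: keeping broader and narrower relations consistent as inverses, and attaching several notations to one concept. The function names, the dictionary layout and the example.org notation type URIs are hypothetical. Listing concepts without a broader concept only suggests candidates; in the tool, top concepts are marked explicitly.

```python
def add_broader(concepts, child, parent):
    """Record child -> parent as a broader relation and store the inverse narrower relation."""
    for name in (child, parent):
        concepts.setdefault(name, {"broader": set(), "narrower": set()})
    concepts[child]["broader"].add(parent)
    concepts[parent]["narrower"].add(child)


def top_concept_candidates(concepts):
    """Concepts without any broader concept are candidates for top concepts."""
    return sorted(name for name, rel in concepts.items() if not rel["broader"])


concepts = {}
add_broader(concepts, "lake", "water body")
add_broader(concepts, "river", "water body")
print(top_concept_candidates(concepts))  # ['water body']

# Notations as (type URI, value) pairs: Spanish in the 2-letter and 3-letter ISO-639 codes.
# The type URIs are placeholders.
spanish_notations = [
    ("http://example.org/notation/iso639-2letter", "es"),
    ("http://example.org/notation/iso639-3letter", "spa"),
]
```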
-
Exchange of thesauri
Exchange of thesauri according to SKOS format
Import/export operations include metadata describing each thesaurus
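To give an idea of what a SKOS export involves, the sketch below serializes a few concepts to RDF/XML with Python's standard xml.etree.ElementTree module. The export_skos function, the input dictionary layout and the example.org URIs are assumptions; the sketch covers only preferred labels and broader relations and omits the thesaurus metadata that ThManager includes.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
SKOS = "http://www.w3.org/2004/02/skos/core#"
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def export_skos(concepts):
    """Serialize {uri: {"prefLabel": {lang: text}, "broader": [uri, ...]}} as RDF/XML."""
    ET.register_namespace("rdf", RDF)
    ET.register_namespace("skos", SKOS)
    root = ET.Element(f"{{{RDF}}}RDF")
    for uri, data in concepts.items():
        node = ET.SubElement(root, f"{{{SKOS}}}Concept", {f"{{{RDF}}}about": uri})
        for lang, label in data.get("prefLabel", {}).items():
            element = ET.SubElement(node, f"{{{SKOS}}}prefLabel", {XML_LANG: lang})
            element.text = label
        for target in data.get("broader", []):
            ET.SubElement(node, f"{{{SKOS}}}broader", {f"{{{RDF}}}resource": target})
    return ET.tostring(root, encoding="unicode")


document = export_skos({
    "http://example.org/river": {
        "prefLabel": {"en": "river", "es": "río"},
        "broader": ["http://example.org/watercourse"],
    },
})
print(document)
```

An import would do the reverse: parse the file and rebuild the concepts from the skos:Concept elements.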
-
Interconnection of thesauri through WordNet lexical database
Thesauri are intended for the homogeneous classification of resources
o They are used to fill metadata keywords
However, there is still heterogeneity in metadata keywords
o Metadata creators use different thesauri in different application domains
o If metadata catalogs provide access to the general public, queries may not contain the same terms as the keywords in metadata records
A possible solution to bridge the semantic gap
o Interconnection of thesauri through a general-purpose lexical ontology
-
Extraction of related concepts in WordNet
ThManager generates an automatic mapping of thesaurus concepts against the concepts of the WordNet lexical database
This functionality is activated through the import dialog
[Figure: WordNet together with Thesaurus 1, Thesaurus 2, ..., Thesaurus N, Controlled list 1, Controlled list 2, ..., Controlled list N and other knowledge representation models]
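The slides do not describe how the mapping is computed. As an illustration only, the sketch below uses the simplest conceivable approach, an exact lookup of concept labels in a lexical index. The LEXICON dictionary is a hypothetical stand-in for WordNet with placeholder synset identifiers, and map_concepts is not ThManager's algorithm.

```python
# Hypothetical stand-in for a lexical database: lemma -> synset identifiers (placeholders).
LEXICON = {
    "river": ["synset-river-1"],
    "bank": ["synset-bank-1", "synset-bank-2"],
}


def map_concepts(concepts, lexicon):
    """Map each concept id to the candidate synsets found for any of its labels."""
    mapping = {}
    for concept_id, labels in concepts.items():
        candidates = []
        for label in labels:
            for synset in lexicon.get(label.casefold(), []):
                if synset not in candidates:
                    candidates.append(synset)
        mapping[concept_id] = candidates
    return mapping


thesaurus_concepts = {"c1": ["River"], "c2": ["Bank", "Riverbank"], "c3": ["Aquifer"]}
print(map_concepts(thesaurus_concepts, LEXICON))
# {'c1': ['synset-river-1'], 'c2': ['synset-bank-1', 'synset-bank-2'], 'c3': []}
```

A label such as "Bank" matches several senses, so a practical mapping would need a disambiguation step, for instance one that uses the broader and narrower concepts as context.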
-
Conclusions
ThManager is a flexible tool to manage thesauri
o It provides enhanced functionality for the improvement of classifications
o Tested with well-known thesauri: EEA - GEMET (General Multilingual Environmental Thesaurus), FAO - AGROVOC, UNESCO Thesaurus, European Commission - EUROVOC
This tool can be easily integrated in other tools
o It is integrated within CatMDEdit to select the appropriate terms for metadata elements
o Accessible as a Web Service (Web Ontology Service) for integration within Web applications that require selection of controlled vocabularies