SKOS Ecoterm 2006 Alistair Miles CCLRC Rutherford Appleton Laboratory
SKOS
Ecoterm 2006
Alistair Miles
CCLRC Rutherford Appleton Laboratory
Semantic Web Best Practices and Deployment
http://www.w3.org/2004/02/skos
Alistair Miles, Ecoterm 2006, slide 2
Reminder: what is it?
• Simple Knowledge Organisation System
• Formal language for representing controlled structured vocabularies (thesauri, classification schemes, … ?)
• Subject metadata & information retrieval …
  – ‘this document is about romantic love’.
  – ‘this document is about the cure of tuberculosis by x-ray in India in the 1950s’.
• Application of RDF
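To make "application of RDF" a little more concrete, a concept and its labels can be written as subject-predicate-object triples. The sketch below uses plain Python tuples rather than an RDF library. The `prefLabel`, `altLabel` and `broader` properties are taken from the SKOS Core vocabulary; the `http://example.org/concepts#` identifiers and the labels are invented for illustration only.

```python
# SKOS Core namespace; the EX namespace and its concepts are hypothetical.
SKOS = "http://www.w3.org/2004/02/skos/core#"
EX = "http://example.org/concepts#"

# Each triple is (subject, predicate, object).
triples = {
    (EX + "love", SKOS + "prefLabel", "love"),
    (EX + "romanticLove", SKOS + "prefLabel", "romantic love"),
    (EX + "romanticLove", SKOS + "altLabel", "romance"),
    (EX + "romanticLove", SKOS + "broader", EX + "love"),
}


def objects(subject, predicate):
    """Return every object stated for the given subject and predicate."""
    return sorted(o for s, p, o in triples if s == subject and p == predicate)


print(objects(EX + "romanticLove", SKOS + "prefLabel"))
print(objects(EX + "romanticLove", SKOS + "broader"))
```

A statement such as ‘this document is about romantic love’ would then be one more triple linking a document identifier to the concept identifier.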
Since Ecoterm 2005 …
• SKOS Core Guide & SKOS Core Vocabulary Specification …
  – First Working Draft May 2005
  – Second Working Draft October 2005
• Minor changes
• Quick Guide to Publishing a Thesaurus on the Semantic Web …
  – First Working Draft May 2005
What comes next … ?
• Life after SWBPD-WG … ?
• Plans for next phase of W3C Semantic Web Activity …
• New WG?
• SKOS W3C Recommendation by end 2007?
• N.B. Not yet approved!
If Rec then …
• What is the scope? What is the fundamental design goal?
• First part of SKOS Rec would be requirements specification.
• Between now and Sept/Oct 2006 … define scope and requirements.
What I’d like to do here …
• Talk about some of the assumptions behind SKOS.
• Sketch some ideas on how to define scope and requirements for SKOS.
• Get your [email protected]
“SKOS: Requirements for Standardization”
isegserv.itd.rl.ac.uk/public/skos/press/dc2006/paper.pdf
Brief history of scope …
• 2003-04: SWAD-Europe
  – ISO 2788 thesauri
  – “Non-standard” thesauri via extensibility e.g. GeMET
  – Classification scheme (PACS)
  – Multilingual thesauri
  – Semantic mapping
• 2004: W3C Glossaries
• 2005: Discussion re “terminologies”
• Subject headings? Gazetteers? Folksonomies? Taxonomies?
Assumptions: purpose …
• Formal representation of controlled structured vocabularies intended for use in information retrieval applications.
Assumptions: workflow …
a) Build a vocabulary
b) Build an index
c) Retrieve
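The three workflow steps can be sketched in a few lines. This is only a toy model under stated assumptions: the concept identifiers (`c1` …), document identifiers and the `retrieve` helper are all hypothetical, and a real index would be far richer than a dictionary of sets.

```python
# (a) Build a vocabulary: concept id -> preferred label (hypothetical data).
vocabulary = {"c1": "tuberculosis", "c2": "x-ray therapy", "c3": "India"}

# (b) Build an index: document id -> concept ids assigned by an indexer.
index = {
    "doc1": {"c1", "c2", "c3"},
    "doc2": {"c1"},
}


# (c) Retrieve: find documents indexed with every requested concept.
def retrieve(concept_ids):
    wanted = set(concept_ids)
    unknown = wanted - set(vocabulary)
    if unknown:
        raise KeyError(f"not in vocabulary: {sorted(unknown)}")
    return sorted(doc for doc, concepts in index.items() if wanted <= concepts)


print(retrieve(["c1"]))
print(retrieve(["c1", "c2"]))
```

The point of the sketch is that steps (b) and (c) both depend on the identifiers fixed in step (a), which is why the three activities need a shared representation.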
Assumptions: components …
• Vocabulary Development Application
  – Something to help build a vocabulary
• Indexing Application
  – Something to help build an index
• Retrieval Application
  – Something to help retrieve things
• SKOS ultimately designed to support interoperation of these three “key components”.
Proposed scope …
• SKOS is a formal language for representing controlled structured vocabularies intended for use within information retrieval applications.
• SKOS is required to support the interoperation of these three key components.
• I.e. define the requirements for SKOS by describing a set of functionalities that must be enabled.
Other components …
• Vocabulary mapping … ?
• Metadata registries … ?
• … ?
Component specs …
• … first discuss social and technological context, then return to component specs …
Context …
• What is the social and technological context in which controlled structured vocabs are used?
• Assume two basic needs…
  – Locate something I already know about.
  – Discover something new.
• N.B. a good location service is not necessarily a good discovery service.
  – Cf. Google and del.icio.us
Strategies …
• Basic strategies for implementing retrieval services …
  1. Statistical text analysis
  2. Analysis of user behaviour
  3. Index with controlled vocab
• Other strategies …
  1. … KOS-assisted text analysis?
Cost problem …
• Given that applying controlled structured vocab for retrieval involves significant initial and ongoing investment…
• Given that other strategies are cheaper…
• Huge pressure to drive down cost and increase utility.
• Requirement for seamless integration.
  – I.e. controlled vocab is seldom used in isolation; most applications will combine strategies.
Use case …
• Search portal …
• Use combined strategies.
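One way a search portal might combine strategies is to blend a score from text analysis with a match against the controlled index. The slides do not say how the combination should be done; the linear blend, the `weight` parameter and the sample scores below are assumptions made purely for illustration.

```python
def combined_score(text_score, concept_match, weight=0.5):
    """Blend a text-analysis score in [0, 1] with a controlled-index match.

    The linear blend and the default weight are illustrative assumptions.
    """
    index_score = 1.0 if concept_match else 0.0
    return (1 - weight) * text_score + weight * index_score


# Hypothetical candidates: document id -> (text score, matched index concept?).
candidates = {
    "doc1": (0.2, True),
    "doc2": (0.9, False),
    "doc3": (0.6, True),
}

ranked = sorted(
    candidates,
    key=lambda doc: combined_score(*candidates[doc]),
    reverse=True,
)
print(ranked)
```

With equal weighting, a document that an indexer tagged with the query concept can outrank one that merely scores well on text, which is the kind of added utility that would have to justify the indexing cost.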
Component specs …
• Important factors …
• Minimise cost.
  – Decentralisation.
  – Assistance.
• Maximise “utility”.
  – Query expansion.
  – Smart ranking.
  – Maximise lifetime.
• Use the Semantic Web!
  – Situation A: search across many collections, where indexers use same controlled vocab.
  – Situation B: search across many collections, where indexers use different controlled vocabs.
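Situation B can be illustrated with a small mapping table between two vocabularies. Everything named here is hypothetical: the `vocA:` and `vocB:` concept identifiers, the two indexes, and the `search_both` helper. The sketch assumes a mapping that simply states which concepts in vocabulary B are equivalent to a concept in vocabulary A.

```python
# Two collections indexed with different (hypothetical) vocabularies.
index_a = {"a:doc1": {"vocA:tb"}, "a:doc2": {"vocA:radiology"}}
index_b = {"b:doc1": {"vocB:tuberculosis"}, "b:doc2": {"vocB:india"}}

# Hypothetical mapping from vocabulary A concepts to equivalent B concepts.
mapping_a_to_b = {"vocA:tb": {"vocB:tuberculosis"}}


def search_both(concept_a):
    """Search collection A directly and collection B via the mapping."""
    hits = [doc for doc, concepts in index_a.items() if concept_a in concepts]
    targets = mapping_a_to_b.get(concept_a, set())
    hits += [doc for doc, concepts in index_b.items() if concepts & targets]
    return sorted(hits)


print(search_both("vocA:tb"))
print(search_both("vocA:radiology"))
```

In Situation A the mapping step disappears, because both indexes already share concept identifiers.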
Focus areas …
• Decentralisation requires different models of collaboration and change.
• Representing change a key factor to keeping a vocab applicable.
• Ranking and scoring well understood for text, less so for controlled index.
• Theory of query expansion? Field trials of query expansion?
• Strategies for providing assistance?
Change and collaboration
• Continuum of collaboration models: centralised <-> decentralised
• Continuum of change management models: continuous <-> discrete
• Decentralisation can reduce cost of development and maintenance
• Change management can ensure continued utility – maximise ROI
• Support for declarative representation of change a requirement for SKOS.
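A declarative representation of change means recording changes as data that tools can replay, rather than silently overwriting the vocabulary. The record format below, (date, action, concept id, value), and the three action names are invented for this sketch; the slides do not propose a concrete format.

```python
# Starting state of a hypothetical vocabulary: concept id -> label.
vocabulary = {"c1": "consumption"}

# Each change is data, not code: (ISO date, action, concept id, value).
changes = [
    ("2005-05-01", "add", "c2", "x-ray therapy"),
    ("2005-10-01", "relabel", "c1", "tuberculosis"),
    ("2006-03-01", "deprecate", "c2", None),
]


def apply_changes(vocab, change_log, up_to):
    """Return the vocabulary state after all changes dated on or before up_to."""
    state = dict(vocab)
    for date, action, concept, value in sorted(change_log, key=lambda c: c[0]):
        if date > up_to:  # ISO dates compare correctly as strings
            break
        if action in ("add", "relabel"):
            state[concept] = value
        elif action == "deprecate":
            state.pop(concept, None)
    return state


print(apply_changes(vocabulary, changes, "2005-12-31"))
print(apply_changes(vocabulary, changes, "2006-12-31"))
```

Replaying the log up to a chosen date yields a discrete snapshot, while the log itself captures continuous change, so one representation can serve both ends of the change management continuum.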
Semantic Web architecture…
• Exploit Semantic Web facility to distribute and merge data.
• However, best practices for publishing data on the Semantic Web still need work.
• See “Best Practice Recipes for Publishing RDF Vocabularies” W3C Working Draft (Google “publishing RDF”).
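The "distribute and merge" facility rests on a simple property: when two sources use the same identifiers, their statements can be combined by taking the union of their triples. The sketch below shows this with Python sets; the `ex:` identifiers are hypothetical, prefixed names stand in for full URIs, and blank nodes (which need extra care when merging real RDF graphs) are ignored.

```python
# Two sources publish triples about the same (hypothetical) concept.
source_1 = {
    ("ex:love", "skos:prefLabel", "love"),
}
source_2 = {
    ("ex:love", "skos:prefLabel", "love"),
    ("ex:love", "skos:narrower", "ex:romanticLove"),
}

# Merging is set union: duplicate statements collapse automatically.
merged = source_1 | source_2

for triple in sorted(merged):
    print(triple)
```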
Semantic Web architecture
Direct interaction …
Information retrieval…
• Indexing and query evaluation well understood for text content.
• Less well understood for controlled metadata.
• Query types?
• Query evaluation strategies, e.g. query expansion?
• Ranking?
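As one possible reading of query expansion and ranking over a controlled index, the sketch below expands a query concept to all of its transitively narrower concepts and ranks exact matches ahead of matches on more distant concepts. The hierarchy, the index and the depth-based ranking rule are all assumptions for illustration; the slides leave the actual strategy as an open question.

```python
from collections import deque

# Hypothetical hierarchy: concept -> directly narrower concepts.
narrower = {
    "love": ["romantic love", "parental love"],
    "romantic love": ["courtly love"],
}

# Hypothetical index: document id -> assigned concepts.
index = {
    "doc1": {"love"},
    "doc2": {"courtly love"},
    "doc3": {"tuberculosis"},
}


def expand(concept):
    """Map the concept and each transitively narrower concept to its depth."""
    depths = {concept: 0}
    queue = deque([concept])
    while queue:  # breadth-first, so each depth is the shortest one
        current = queue.popleft()
        for child in narrower.get(current, []):
            if child not in depths:
                depths[child] = depths[current] + 1
                queue.append(child)
    return depths


def search(concept):
    """Return matching documents, closest concept matches first."""
    depths = expand(concept)
    scored = []
    for doc, concepts in index.items():
        matched = concepts & set(depths)
        if matched:
            scored.append((min(depths[c] for c in matched), doc))
    return [doc for _, doc in sorted(scored)]


print(search("love"))
```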
Assistance for indexers …
• Provide suggestions
  – Comparison of labels and annotations
  – Machine learning
  – Exploit lexical resources
  – … ?
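The "comparison of labels" idea can be sketched with approximate string matching from Python's standard `difflib` module. The label table, the `suggest` helper and the 0.8 cutoff are hypothetical choices; a production indexing assistant would likely need stemming, multilingual labels and better phrase detection.

```python
import difflib

# Hypothetical vocabulary labels -> concept ids.
labels = {"tuberculosis": "c1", "x-ray therapy": "c2", "romantic love": "c3"}


def suggest(text, cutoff=0.8):
    """Suggest concept ids whose labels closely match words or word pairs."""
    words = text.lower().split()
    phrases = words + [" ".join(pair) for pair in zip(words, words[1:])]
    found = set()
    for phrase in phrases:
        matches = difflib.get_close_matches(phrase, list(labels), n=1, cutoff=cutoff)
        for label in matches:
            found.add(labels[label])
    return sorted(found)


print(suggest("Cure of tuberculosis by x-ray therapy"))
```

The suggestions are only candidates for a human indexer to accept or reject, which keeps the assistance cheap without handing over control of the index.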
Assistance for mappers …
• Provide suggestions …
  – Analysis of labels and annotations
  – Exploit lexical resources
  – … ?
Summary
• SKOS: fundamental requirement to support information retrieval using controlled structured vocabularies.
• Define requirements by describing information retrieval functionalities.
• Divide functionalities into:
  – Presentation styles
  – Query types e.g. compound queries, coordination …
  – Query evaluation strategies
• Assumptions:
  – Key components
  – Semantic Web interaction
  – Context – pressure to make vocabularies “profitable”
  – …
• Issues: change, assistance, theory …