Linked Library Data and the Semantic Web
Leveraging Library Authority Control outside of MARC Applications
Presented 2008-09-17 at the National Library of Sweden
Corey A Harper
2008-09-17 National Library of Sweden 2

Topical Overview
• Linked Open Data and SemWeb
• Library Authorities and Controlled Vocabularies – Toward Library LOD
• Work in progress in these areas
• Metadata Normalization, Harmonization and Recombination
• Possibilities…
“The vast bulk of data to be on the Semantic Web is already sitting in databases … all that is needed [is] to write an adapter to convert a particular format into RDF and all the content in that format is available.”
-Tim Berners-Lee in an interview with the Consortium Standards Bulletin

Linked Open Data
• Use URIs as names for things.
• Use HTTP URIs so that people can look up those names.
• When someone looks up a URI, provide useful information.
• Include links to other URIs, so that they can discover more things.
http://www.w3.org/DesignIssues/LinkedData.html
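
A minimal sketch of what "looking up a name" can mean in practice, assuming a hypothetical subject URI under example.org: the client issues an HTTP GET and uses the Accept header to ask for an RDF description. The URI and the helper name `build_lookup` are illustrative assumptions, and the request is only built here, not sent.

```python
from urllib.request import Request


def build_lookup(uri: str, media_type: str = "application/rdf+xml") -> Request:
    """Build a GET request that asks for a machine-readable description of a URI."""
    return Request(uri, headers={"Accept": media_type}, method="GET")


# Hypothetical URI, used only for illustration.
req = build_lookup("http://example.org/subjects/95000541")

# Passing `req` to urllib.request.urlopen() would perform the actual lookup.
print(req.get_method(), req.full_url, req.get_header("Accept"))
```

A server following the four principles would answer such a request with useful information about the thing named, including links to other URIs.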

Linked Library Data
• Resources get URIs early in lifecycle
• Properties get URIs
• Vocabularies get URIs
• Everything is dereferenceable: able to request meaning over HTTP

Library Authority Data
“Include links to other URIs, so that they can discover more things.”
Short of providing and linking to URIs, this *is* authority data.
This is what our authority files are for.

Authority Information
• Controlled Vocabulary
• SKOS for LCSH, Dewey, LCC, MeSH, others
• Need a structure for Name Authorities
  – FOAF is only part of the answer
• Standard URIs for concepts and agents
  – Possibly for FRBR Entities?

Library Controlled Vocabularies: Benefits
• Reputation - Trusted Tradition
• Mature - Time tested and carefully developed
• General & Comprehensive - Cover large knowledge spaces

Library Controlled Vocabularies: Drawbacks
• Overly Complicated - extraneous information
• Archaic Syntax - MARC Records
• Slow to evolve - authorities control the authority control
LCSH

LCSH in Dublin Core
• Encoding Scheme for DC Subject
• No easy way to draw on equivalent terms and cross-references
• Abstract Model, RDF and SKOS could enable applications to make use of the whole vocabulary

Vocabulary Encodings
• MARC - Great for Library Applications
• MARC-XML
• MADS
• SKOS - Designed for use with RDF
} Helping Get Library Apps online

LCSH in SKOS
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:skos="http://www.w3.org/2004/02/skos/core#">
  <skos:Concept rdf:about="http://example.com/lcsh#95000541">
    <skos:prefLabel>World Wide Web</skos:prefLabel>
    <skos:altLabel>W3 (World Wide Web)</skos:altLabel>
    <skos:altLabel>Web (World Wide Web)</skos:altLabel>
    <skos:altLabel>World Wide Web (Information Retrieval System)</skos:altLabel>
    <skos:broader rdf:resource="http://example.com/lcsh#88002671"/>
    <skos:broader rdf:resource="http://example.com/lcsh#92002381"/>
    <skos:related rdf:resource="http://example.com/lcsh#92002816"/>
    <skos:narrower rdf:resource="http://example.com/lcsh#2002000569"/>
    <skos:narrower rdf:resource="http://example.com/lcsh#2003001415"/>
    <skos:narrower rdf:resource="http://example.com/lcsh#97003254"/>
  </skos:Concept>
</rdf:RDF>
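
A rough sketch of how an application outside the MARC world might consume such a record, using only Python's standard library. The embedded record is an abbreviated copy of the slide's example, written with `rdf:resource` on the linking properties (the RDF/XML way of pointing at another resource); the function names `read_concept` and `label_index` are illustrative assumptions, not part of any library.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
SKOS = "http://www.w3.org/2004/02/skos/core#"

# Abbreviated copy of the example record, wrapped so that it parses on its own.
RECORD = f"""<rdf:RDF xmlns:rdf="{RDF}" xmlns:skos="{SKOS}">
  <skos:Concept rdf:about="http://example.com/lcsh#95000541">
    <skos:prefLabel>World Wide Web</skos:prefLabel>
    <skos:altLabel>W3 (World Wide Web)</skos:altLabel>
    <skos:altLabel>Web (World Wide Web)</skos:altLabel>
    <skos:broader rdf:resource="http://example.com/lcsh#88002671"/>
    <skos:narrower rdf:resource="http://example.com/lcsh#97003254"/>
  </skos:Concept>
</rdf:RDF>"""


def read_concept(rdf_xml: str) -> dict:
    """Extract labels and links from the first skos:Concept in an RDF/XML string."""
    concept = ET.fromstring(rdf_xml).find(f"{{{SKOS}}}Concept")

    def links(prop: str) -> list:
        return [e.get(f"{{{RDF}}}resource") for e in concept.findall(f"{{{SKOS}}}{prop}")]

    return {
        "uri": concept.get(f"{{{RDF}}}about"),
        "prefLabel": concept.findtext(f"{{{SKOS}}}prefLabel"),
        "altLabels": [e.text for e in concept.findall(f"{{{SKOS}}}altLabel")],
        "broader": links("broader"),
        "narrower": links("narrower"),
    }


def label_index(concept: dict) -> dict:
    """Map every label (preferred or alternate) to the concept URI, lower-cased."""
    labels = [concept["prefLabel"], *concept["altLabels"]]
    return {label.lower(): concept["uri"] for label in labels}


concept = read_concept(RECORD)
index = label_index(concept)
print(index["w3 (world wide web)"])
```

The point of the index is that a search on a cross-reference such as "W3 (World Wide Web)" lands on the same concept URI as the preferred heading.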

Diagram courtesy of Ed Summers. See upcoming DC2008 paper.

Expected Benefits
• Common RDF Semantics
• Many Possible Web Services
• Publish Vocabulary in Multiple Formats
  – Ease of re-use
• Entertainment

Name Authorities
• Many National Authority Files
• Separate records representing same author
  – Different Languages
  – Different Scripts

VIAF
• Virtual International Authority File
• First try - Merging
• Second try - Linking (then merging?)
• Why not just link….?

Same Entity/Variant Scripts
Japanese
japanisch

Linking Open Names
• Need an RDF Vocabulary for Names and Corporations
• FOAF is one piece of the puzzle
• DC Agents Application Profile
  – Quasi-Active DCMI Task Group

VIAF as LOD
• Use owl:sameAs to declare equality
• Every national authority file gets a SPARQL endpoint
• No need to merge authority files
• Applications can query, merging relevant sets locally
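
One way to picture "merging relevant sets locally" is a small union-find pass over owl:sameAs links. This is a sketch under stated assumptions, not how VIAF works: the URIs and name forms below are made up for illustration, the function name `merge_by_same_as` is hypothetical, and in a real application the input would come from queries against each authority file's endpoint.

```python
from collections import defaultdict


def merge_by_same_as(labels, same_as):
    """Cluster name forms by following owl:sameAs links between record URIs.

    labels:  dict mapping a record URI to its list of name forms
    same_as: iterable of (uri, uri) pairs asserted equal
    """
    parent = {uri: uri for uri in labels}

    def find(uri):
        parent.setdefault(uri, uri)
        while parent[uri] != uri:
            parent[uri] = parent[parent[uri]]  # path halving
            uri = parent[uri]
        return uri

    for a, b in same_as:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            # Keep the lexicographically smaller URI as the cluster key.
            parent[max(root_a, root_b)] = min(root_a, root_b)

    clusters = defaultdict(set)
    for uri, names in labels.items():
        clusters[find(uri)].update(names)
    return {root: sorted(names) for root, names in clusters.items()}


# Made-up records standing in for three national authority files.
labels = {
    "http://example.org/file-a/1": ["Tolstoy, Leo"],
    "http://example.org/file-b/2": ["Tolstoj, Lev"],
    "http://example.org/file-c/3": ["Толстой, Лев"],
    "http://example.org/file-a/4": ["Austen, Jane"],
}
same_as = [
    ("http://example.org/file-a/1", "http://example.org/file-b/2"),
    ("http://example.org/file-b/2", "http://example.org/file-c/3"),
]

merged = merge_by_same_as(labels, same_as)
for key, names in merged.items():
    print(key, names)
```

Nothing is merged at the source: each file keeps its own record and URI, and the cluster exists only in the application that asked for it.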

Renew, reuse, recycle
• Enable better sharing within Library community
• Share our data with other communities
• Reuse Authority Data in new and interesting ways…

[Diagram: Shared Data Store; Local Data Store; Identity System; LCSH Service; LC-NAF Service; The Rest of the Web; Discovery Systems]

[Diagram: example RDF graph. Node labels: Summers, Ed; DC2008 Conference Proceedings; Authority Files; Subject Headings; Semantic Web; LCSH, SKOS and Linked Data; Article; Tag; Blog Post. Property labels: dc:subject, dc:creator, dc:title, dcterms:isPartOf, skos:broader, owl:sameAs, taggedBy, tagTarget, tagName]
This is only an example!!
• The Graph may not be entirely correct
• Tagging ontologies are very new
• May involve blank nodes &/or reification

Controlled Vocabularies Recontextualized
• LOD notion of “Information” vs. “Non-information” resources.
  – Info - documents on the web
  – Non-info - anything else: people, places, things, books
• Non-info resources have representations / descriptions
• These are info resources
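
A common Linked Data convention for keeping the two kinds of resource apart is to answer a lookup on a non-information resource with a 303 See Other redirect to the document that describes it, while documents themselves are served with 200. The sketch below only simulates that decision in memory; the `Resource` class, the `respond` function and the example.org URIs are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class Resource:
    is_document: bool                   # True for an information resource
    body: str = ""                      # content served for documents
    described_by: Optional[str] = None  # URI of the describing document


def respond(uri: str, registry: Dict[str, Resource]) -> Tuple[int, Dict[str, str], str]:
    """Return (status, headers, body) for a lookup of `uri`."""
    resource = registry.get(uri)
    if resource is None:
        return 404, {}, ""
    if resource.is_document:
        return 200, {"Content-Type": "text/plain"}, resource.body
    if resource.described_by is None:
        return 404, {}, ""
    # A person, place, thing or book cannot be sent over HTTP; point to its description.
    return 303, {"Location": resource.described_by}, ""


REGISTRY = {
    "http://example.org/id/person/1": Resource(
        False, described_by="http://example.org/doc/person/1"
    ),
    "http://example.org/doc/person/1": Resource(
        True, body="Authority record for person 1"
    ),
}

status, headers, _ = respond("http://example.org/id/person/1", REGISTRY)
doc_status, _, doc_body = respond(headers["Location"], REGISTRY)
print(status, headers["Location"], doc_status, doc_body)
```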
Controlled Vocabularies Recontextualized
• Authority records are descriptions of non-information resources
• Bibliographic records are (usually) descriptions of non-information resources
• Other areas of Authority Control…

Image from the Getty Museum: http://www.getty.edu/research/conducting_research/standards/cdwa/entity.html

FRBR
• Library community’s first formalization of our data model
• Untested
• Incredibly complicated
• Not reflected well in descriptive standards or practice

FRBR
“Simply by clustering your records into work sets, you have not moved your records into the FRBR model. FRBR is a complete data model that is a new way of looking at our data, not just taking existing records and identifying work relationships”
- J. Rochkind - bibwild.wordpress.com

…and Library data is extremely complicated

MARC Record Graph
• Does not include authority data
• Coins new URIs for any non-literal value
• Contains a few minor modeling errors
<modsrdf:Publisher modsrdf:value="Crowell"
    rdf:about="http://simile.mit.edu/2006/01/publisher/Crowell">
  <modsrdf:location>
    <modsrdf:Place modsrdf:name="New York"
        rdf:about="http://simile.mit.edu/2006/01/place/marccountry/nyu"/>
  </modsrdf:location>
</modsrdf:Publisher>

A Distinction
• Metadata Harmonization:
  – the “ability to use several different metadata standards in a single software system.”
• Metadata Normalization:
  – mapping several different metadata standards to a single schema or structure for use in a single software system.
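
To make the normalization side of the distinction concrete, here is a minimal sketch that maps records from two source formats onto one flat target schema. The target field names, the simplified record shape (a dict of field code to list of values) and the `normalize` function are assumptions for illustration; the MARC field/subfield codes and Dublin Core element names are the familiar ones.

```python
# Crosswalks from each source format to one hypothetical target schema.
CROSSWALKS = {
    "marc": {"245a": "title", "100a": "creator", "650a": "subject"},
    "oai_dc": {"dc:title": "title", "dc:creator": "creator", "dc:subject": "subject"},
}


def normalize(record: dict, source_format: str) -> dict:
    """Map a source record onto the single target schema, dropping unmapped fields."""
    crosswalk = CROSSWALKS[source_format]
    normalized = {}
    for field, values in record.items():
        target = crosswalk.get(field)
        if target is None:
            continue  # anything the crosswalk does not cover is lost
        normalized.setdefault(target, []).extend(values)
    return normalized


marc_record = {
    "245a": ["James E. Jackson and Esther Cooper Jackson papers"],
    "650a": ["Civil rights movements"],
    "300a": ["52 linear feet"],  # no mapping defined, so it is dropped
}
dc_record = {
    "dc:title": ["Guide to the James E. Jackson and Esther Cooper Jackson papers"],
    "dc:subject": ["Civil rights movements"],
}

print(normalize(marc_record, "marc"))
print(normalize(dc_record, "oai_dc"))
```

A harmonizing system would instead keep each record in its native standard and teach the software to work with both; the sketch shows why normalization is simpler to build and also where it discards information.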

Primo: A Case Study
• Normalization Rules
• Delivery templates
• Tight SFX and MetaLib Integration
• “Pipes” for different data sources
• Hourly Availability Checking
  – (Real Time in Version 2.0)

Harvesting
• Different Data Sources
• Different Normalization Rules
• All standardized on Primo Normalized XML (PNX) Record
  – Very Flat, sections corresponding to Primo Functionality

Issues and Challenges
• Managing Deduplication
  – Dedup Data only out of box for MARC
  – Writing for OAI-PMH sources (EAD)
• Consortial Environment(s)
• Appropriate Delivery Options
• “Interpreting” Metadata

EAD Records
• Archivists Toolkit
  – Previously in Access, Notepad, Excel
  – Authority Control (sort of)
• OAI-PMH Overlay
• Multiple layers of Crosswalking
• Deduping

EAD / Aleph Dedup
• Aleph Title:
  – James E. Jackson and Esther Cooper Jackson papers
• EAD Title:
  – Guide to the James E. Jackson and Esther Cooper Jackson papers 1917-2004 (Bulk 1937-1992) Tamiment 347
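
The two titles describe the same collection but do not match as strings. A rough sketch of the kind of match key a deduplication rule might compute is shown below; it is a hypothetical heuristic tuned to this one example (strip a leading "Guide to the", cut at the first four-digit year, drop punctuation), not Primo's actual dedup algorithm.

```python
import re


def match_key(title: str) -> str:
    """Reduce a title to a crude key for comparing EAD and catalog records."""
    key = title.strip().lower()
    # Finding aids often prefix the collection title.
    key = re.sub(r"^guide to the\s+", "", key)
    # Cut at the first four-digit year: drops date ranges, bulk dates, call numbers.
    key = re.split(r"\s+\d{4}\b", key, maxsplit=1)[0]
    # Drop punctuation and collapse whitespace.
    key = re.sub(r"[^\w\s]", "", key)
    return " ".join(key.split())


aleph_title = "James E. Jackson and Esther Cooper Jackson papers"
ead_title = (
    "Guide to the James E. Jackson and Esther Cooper Jackson papers "
    "1917-2004 (Bulk 1937-1992) Tamiment 347"
)

print(match_key(aleph_title))
print(match_key(aleph_title) == match_key(ead_title))
```

Rules like this are brittle, since a title that legitimately contains a year would be truncated, which is part of why matching on shared authority identifiers would be preferable to matching on strings.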

MARC + EAD
[Diagram: EAD Record; Aleph Record; Authority Records; MARC Record w/ Auth Data; OAI-DC Record w/ FT of EAD; EAD PNX; Aleph PNX; Dedup PNX]

Value of Dedup
• Indexing the Best of Both Worlds
• EAD Records:
  – Inventory
  – Long Biographical / Historical Notes
• MARC Data:
  – Cross References for Access Points

It shouldn’t be this hard!
• Dedup Process shouldn’t be necessary
  – Authority files should be useable within non-MARC applications
  – Merging is easier with more granularity, more homogeneity, in data sets

Endless possibilities
• This barely scratches the surface
• Authority Data is only a small part
• With more soundly modeled bibliographic and authority data…
  – Terminology Services
  – Context sensitive searching
  – Customized interfaces
  – Customized exhibits
  – Mashups
  – Web Services
  – User Profiling
  – Collaboration tools