VIAF and WorldCat/WorldCat Identities.
The world’s libraries. Connected.
VIAF and WorldCat/WorldCat Identities.
Improving works and expressions in VIAF
VIAF Council Meeting, Singapore, 2013-08-16
Janifer Gatenby, OCLC
Please note: Excerpts from the following full presentation were presented by Janifer Gatenby to the VIAF Council meeting, 2013-08-16.
Multilingual WorldCat
Presented by Janifer Gatenby
IFLA, Singapore, 2013-08-19
Karen Smith-Yoshimura
Eric Childress
Janifer Gatenby
Jean Godby
Richard Greene
Jenny Toves
Diane Vizine-Goetz
Robert Bremer
JD Shipengrover
Gail Thornburg
Jay Weitz
WorldCat Today
• Resources in nearly all languages
• Contributed by more than 20,000 libraries worldwide
• More than half the database is for works not in English
WorldCat Today
• Bibliographic records
  • Hybrid records
  • Parallel records
  • Clustered at Work level (FRBR)
Existing Architecture
[Diagram: Bibliographic record; Authors; Subj/Classif; Holdings; Work cluster; Content cluster; Manifestation cluster]
Complementary Initiatives
• Work Level Record
• GLIMIR: Manifestation & Content Clusters
• Multi-lingual Bibliographic Structure
Work Level Record
http://www.oclc.org/research/activities/workrecs.html
Work Level Record: Objective
Create a landing page summarizing content for a work
GLIMIR
• The Content Cluster
  • Enables better work record displays by reducing the number of lines that display for large works
  • Enables a choice of format and presents the formats that could be acceptable substitutes
  • Consolidates holdings for identical content
• The Manifestation Cluster is important
  • Consolidates holdings at manifestation level
  • In the short term, allows the record catalogued in the language of the interface to be chosen for display
  • Reduces apparent duplication
  • Allows a more accurate count of the number of manifestations in WorldCat (as opposed to the number of records)
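The manifestation-cluster points above can be illustrated with a minimal sketch. It is not the GLIMIR algorithm: the `manifestation_key` field, the record layout, and the sample library names are all assumptions made for illustration, and how such a key would actually be computed is outside the scope of the slides.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class BibRecord:
    record_id: str
    manifestation_key: str      # hypothetical key shared by records for one manifestation
    cataloguing_language: str   # language the record was catalogued in
    holdings: set = field(default_factory=set)


def cluster_manifestations(records):
    """Group records by manifestation key and pool their holdings."""
    clusters = defaultdict(lambda: {"records": [], "holdings": set()})
    for rec in records:
        cluster = clusters[rec.manifestation_key]
        cluster["records"].append(rec)
        cluster["holdings"] |= rec.holdings
    return dict(clusters)


def record_for_interface(cluster, interface_language):
    """Prefer the record catalogued in the interface language, else the first record."""
    for rec in cluster["records"]:
        if rec.cataloguing_language == interface_language:
            return rec
    return cluster["records"][0]


# Invented sample data: two parallel records for one manifestation, plus one other.
records = [
    BibRecord("r1", "m1", "eng", {"LibA", "LibB"}),
    BibRecord("r2", "m1", "ger", {"LibB", "LibC"}),
    BibRecord("r3", "m2", "eng", {"LibA"}),
]
clusters = cluster_manifestations(records)
print(len(records), "records,", len(clusters), "manifestations")
print(sorted(clusters["m1"]["holdings"]))
print(record_for_interface(clusters["m1"], "ger").record_id)
```

Counting clusters rather than records gives the manifestation count, and the pooled `holdings` set is the consolidated holdings for that manifestation.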
Multilingual Bibliographic Structure Project
• Creates true multi-lingual displays
  • At work and manifestation levels
  • Using all available data instead of “most appropriate record”
  • Generates data
• Corrects many of the 28 million records coded “und”
• Better control and linking of translations
• Input to refinement of work clusters
• Smarter data storage
“Most appropriate” questioned
• WorldCat.org selects the most appropriate record to show to a user as representative of the work in the short result list and beyond
• The end result will not be very satisfactory from a multi-lingual viewpoint… here’s why
Which record is better to present to a German speaker?
Incomplete Swedish Record
Hybrid record
Most appropriate display
Build the display from all available data
Multilingual Bibliographic Structure Project
• Work-level data, mined from all associated bibliographic records, will be displayed, supplemented with expression/manifestation-level data as the user drills through the short to fuller versions of the metadata.
• The end-user interface will show works and manifestations, not bibliographic records; the cataloguing client will also show bibliographic records.
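As a rough sketch of "build the display from all available data", the function below merges every record in a work cluster into one work-level view instead of picking a single representative record. The dictionary layout, field names, and sample values are assumptions made for illustration, not WorldCat's internal structure.

```python
def build_work_display(records):
    """Merge all records of one work cluster into a single work-level view."""
    display = {"authors": set(), "subjects_by_language": {}, "manifestations": set()}
    for rec in records:
        display["authors"].add(rec["author"])
        display["manifestations"].add(rec["manifestation_key"])
        # Keep subject headings grouped by the language they were catalogued in.
        by_lang = display["subjects_by_language"].setdefault(
            rec["cataloguing_language"], set()
        )
        by_lang.update(rec["subjects"])
    return display


# Invented sample cluster: the same work catalogued in English and in German.
work_records = [
    {"author": "Author A", "cataloguing_language": "eng",
     "subjects": ["Law"], "manifestation_key": "m1"},
    {"author": "Author A", "cataloguing_language": "ger",
     "subjects": ["Recht"], "manifestation_key": "m1"},
    {"author": "Author A", "cataloguing_language": "eng",
     "subjects": ["Law", "Fiction"], "manifestation_key": "m2"},
]
display = build_work_display(work_records)
print(sorted(display["subjects_by_language"]["ger"]))
print(len(display["manifestations"]), "manifestations")
```

With this shape, a German-language interface can show the German subject headings even when most records in the cluster were catalogued in English.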
Proposed new architecture
[Diagram: Work; Manifestations (eng, fre) with Holdings; Notes/Contents, Subj/Classif, and Authors, each tagged by language (eng, fre, ger, jpn); Translations (Language of work)]
Important principles
• Language tagging of elements, particularly
  • Summaries (MARC 21 field 520)
  • Subject headings
• Display in script preferred by the user if data is available
• Improve translated interfaces
• Show consolidated holdings as appropriate
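The language-tagging and preferred-script principles can be sketched as a simple selection with fallback. The `lang`/`script` tags on each element (MARC language codes and ISO 15924 script codes) and the fallback order are assumptions made for illustration; the slides do not specify how tagged elements would be stored or ranked.

```python
def choose_variant(variants, preferred_language, preferred_script=None):
    """Pick the element variant that best matches the user's language and script.

    Falls back to any variant in the preferred language, then to the first
    available variant, and returns None when there is no data at all.
    """
    same_language = [v for v in variants if v["lang"] == preferred_language]
    if preferred_script:
        for variant in same_language:
            if variant["script"] == preferred_script:
                return variant
    if same_language:
        return same_language[0]
    return variants[0] if variants else None


# Invented language-tagged summaries for one work.
summaries = [
    {"lang": "eng", "script": "Latn", "text": "A summary in English."},
    {"lang": "jpn", "script": "Latn", "text": "Nihongo no yoyaku."},
    {"lang": "jpn", "script": "Jpan", "text": "日本語の要約。"},
]
print(choose_variant(summaries, "jpn", "Jpan")["text"])
print(choose_variant(summaries, "ger")["text"])
```

The second call shows the "if data is available" case: no German summary exists, so the sketch falls back to the first variant rather than showing nothing.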
Translations
Great works are translated
• The cream of the world’s cultural and knowledge heritage is shared by being translated
• WorldCat contains many rich cataloguing records for these translations
GOAL: Data mine the really good records to improve clustering, presentation, authority records and linked data
Translations
• Inconsistencies cause work clusters to be incomplete and search results to be less than optimal
  • Titles without subtitles
  • Different forms of uniform title or missing uniform title
  • Inverted title
  • Different coding of original and translated information
Generated uniform title authority records will overcome most of these differences without needing to edit individual records
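To make the inconsistencies above concrete, here is a minimal sketch of a title comparison key that ignores subtitles, inversion, diacritics, and leading articles. It is only an illustration of why a generated authority form can bridge such variants, not the matching logic used for WorldCat work clusters; the `ARTICLES` list and the `title_key` helper are hypothetical.

```python
import re
import unicodedata

# Hypothetical, deliberately short list of articles; a real list would be per language.
ARTICLES = {"the", "a", "an", "le", "la", "les", "der", "die", "das"}


def title_key(title):
    """Reduce a title to a comparison key that ignores common cataloguing differences."""
    # Drop a subtitle introduced by a colon ("Title : subtitle").
    main = title.split(":", 1)[0].strip()
    # Undo inversion such as "Grand design, The".
    head, comma, tail = main.rpartition(",")
    if comma and tail.strip().casefold() in ARTICLES:
        main = f"{tail.strip()} {head}"
    # Strip diacritics, then case-fold and keep only word characters.
    decomposed = unicodedata.normalize("NFKD", main)
    plain = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    words = re.findall(r"\w+", plain.casefold())
    # Ignore a leading article so "The grand design" matches "Grand design".
    if words and words[0] in ARTICLES:
        words = words[1:]
    return " ".join(words)


for variant in ["Grand design", "Grand design, The", "The grand design : a subtitle"]:
    print(repr(variant), "->", title_key(variant))
```

Records whose titles reduce to the same key could then be linked to one generated uniform title without editing the individual records.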
Generate uniform title authority records
• Improve FRBR work groups
• Made by data mining
• Contribute to VIAF
• Diffuse via VIAF as linked data
• Possibility to create web page / web service
Translation records in VIAF
• Will enrich VIAF significantly
• New elements: translated title and translator
Author  | Title                | Expressions in VIAF | Translation count in WorldCat
Atwood  | Blind assassin       | 8                   | 31
Guevara | Notas de viaje       | 0                   | 11
Hawking | Grand design         | 0                   | 18
Lenard  | Grosse naturforscher | 1                   | 3
Loti    | Pêcheur d’Islande    | 1                   | 31
Diffusion of Translation records
• Records are freely available to the world from VIAF in
  • MARC-21
  • XML
  • RDF (linked data)
  • Just links in JSON
  • And other formats as introduced
We don’t know now, but soon will
• Number of manifestations as opposed to number of records
• Number of works that have translations
• Top translated authors and works
• And more
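Once records carry work, manifestation, and language information, statistics of this kind become simple aggregations. The sketch below assumes a hypothetical flat record layout with `work`, `manifestation`, `language`, and `original_language` fields; the sample data is invented and the counts say nothing about WorldCat itself.

```python
from collections import Counter


def translation_statistics(records):
    """Summarise records, manifestations, translated works, and top translated authors."""
    manifestations = {rec["manifestation"] for rec in records}
    translated_works = set()
    per_author = Counter()
    counted = set()  # manifestations already counted, so duplicate records are ignored
    for rec in records:
        is_translation = rec["language"] != rec["original_language"]
        if is_translation and rec["manifestation"] not in counted:
            counted.add(rec["manifestation"])
            translated_works.add(rec["work"])
            per_author[rec["author"]] += 1
    return {
        "records": len(records),
        "manifestations": len(manifestations),
        "works_with_translations": len(translated_works),
        "top_translated_authors": per_author.most_common(3),
    }


# Invented sample: one work with two translations (one catalogued twice), one untranslated work.
sample = [
    {"work": "w1", "author": "Author A", "manifestation": "m1", "language": "eng", "original_language": "eng"},
    {"work": "w1", "author": "Author A", "manifestation": "m2", "language": "fre", "original_language": "eng"},
    {"work": "w1", "author": "Author A", "manifestation": "m2", "language": "fre", "original_language": "eng"},
    {"work": "w1", "author": "Author A", "manifestation": "m3", "language": "ger", "original_language": "eng"},
    {"work": "w2", "author": "Author B", "manifestation": "m4", "language": "spa", "original_language": "spa"},
]
stats = translation_statistics(sample)
print(stats)
```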