WorldCat Growth & Quality: Vision and Practice
Ted Fons, Director
WorldCat Global Metadata Network
Asia Pacific Regional Council 2010
April 15, 2010
OCLC: The world’s libraries. Connected.
More collaboration
More institutions
More Web-scale
More synchronization
More innovation
Local
Group
Global
More
Better
Union Catalogue – Pivotal Role
Blogs, repositories, various sites
WorldCat Growth –
Growing WorldCat Faster
Create system-wide efficiencies in library management
WorldCat Growth since 1998
Year  Millions of records
1998  39
1999  41
2000  44
2001  47
2002  50
2003  52
2004  55
2005  61
2006  67
2007  86
2008  108
2009  139
2010  170
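The growth chart can be restated as year-over-year increases with a little arithmetic. The following is a minimal sketch, assuming the thirteen charted values map one-to-one onto the years 1998 to 2010; the function and variable names are illustrative and do not come from the presentation.

```python
# Millions of WorldCat records per year, as read from the chart (1998-2010).
YEARS = list(range(1998, 2011))
RECORDS_M = [39, 41, 44, 47, 50, 52, 55, 61, 67, 86, 108, 139, 170]


def yearly_increase(values):
    """Return the absolute increase between each pair of consecutive values."""
    return [later - earlier for earlier, later in zip(values, values[1:])]


def compound_annual_growth(first, last, periods):
    """Return the compound annual growth rate as a fraction (0.10 means 10%)."""
    return (last / first) ** (1 / periods) - 1


increases = yearly_increase(RECORDS_M)
cagr = compound_annual_growth(RECORDS_M[0], RECORDS_M[-1], len(RECORDS_M) - 1)

for year, added in zip(YEARS[1:], increases):
    print(f"{year}: +{added} million records")
print(f"Compound annual growth, 1998-2010: {cagr:.1%}")
```

Read this way, the largest yearly additions fall at the end of the series, which is the period the following slides attribute to batch loading.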
WorldCat Growth = Batch Services
[Chart: Sources of WorldCat Growth FY09. Categories: New Records Created, Records Enriched. Series: Batch Services, Online Cataloging, Quality Control. Scale: 0 to 50,000,000.]
WorldCat Growth –
Is It Working?
Local
Group
Global
WorldCat Today
170 million records
1.5+ billion holdings
1 January 2010
Files loaded or pending for WorldCat
ABES (France)
Bavarian State Library
Bibliotheca Alexandrina (Egypt)
Bibliothekszentrum Baden Württemberg (Germany)
British Library
DANBIB (Denmark)
GBV (Germany)
HeBIS (Germany)
IDS Informationsverbund Deutsch-Schweiz (Switzerland)
Lebanese American University
LIBRIS (Sweden)
Qatar University
UnityUK
Zayed University Consortium (UAE)
National files loaded or pending for WorldCat
Bibliothèque nationale de France
German National Library
Libraries Australia
National Central Library, Taiwan
National Library Board, Singapore
National Library of Barbados
National Library of China
National Library of Finland
National Library of Israel
National Library of Mexico
National Library of New Zealand
National Library of Scotland
National Library of Spain
National Library of Sweden
National Library of Wales
Swiss National Library
Percentage of Non-English Records
1998  36%
2009  53.8%
Total Records (millions)
Language     1998   2009
Total        37.5   117.2
English      23.9   64.3
French       2.3    8.5
German       2.2    17.9
Spanish      1.6    4.5
Japanese     0.8    2.8
Russian      0.8    2.3
Chinese      0.7    4.3
Italian      0.7    2.1
Latin        0.3    1.9
Portuguese   0.3    1.1
Dutch        0.2    2.9
Hebrew       0.2    1.2
Multilingual WorldCat
1.9 billion items and growing!
170 million bib records
3.6 million digital items
1.5 billion holdings
325 million electronic database records
NEW! JSTOR Metadata: 4.5 million records
30 million items (Google, HathiTrust, OAIster)
Physical holdings in WorldCat
Licensed digital content in library collections
Local library content being digitized
The collective collection
OAIster
WorldCat Growth –
Synchronization with
Libraries, Repositories & Metadata Hubs
Local
Group
Global
Growing WorldCat Faster
New Data Ingest Platform under Service-Oriented Architecture
[Diagram labels: Partner Data; Automatic Evaluation; Automatic Manipulation; Automatic Processing; Not Good Data; Good Data]
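One way to read the ingest diagram is as a small pipeline. The sketch below is an interpretation and not OCLC's implementation: it assumes that partner data is evaluated first, that good data goes straight to processing, and that not-good data is manipulated and then evaluated again. The record fields, the quality rule, and every function name are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class PartnerRecord:
    """A toy partner record; the fields are hypothetical."""
    title: str = ""
    identifier: str = ""
    notes: list = field(default_factory=list)


def evaluate(record):
    """Automatic evaluation (illustrative rule): both fields present and trimmed."""
    title_ok = bool(record.title) and record.title == record.title.strip()
    id_ok = bool(record.identifier) and record.identifier == record.identifier.strip()
    return title_ok and id_ok


def manipulate(record):
    """Automatic manipulation (illustrative): strip stray whitespace."""
    return PartnerRecord(
        title=record.title.strip(),
        identifier=record.identifier.strip(),
        notes=record.notes + ["normalized"],
    )


def process(record):
    """Automatic processing (illustrative): produce a load-ready dictionary."""
    return {"title": record.title, "identifier": record.identifier, "status": "loaded"}


def ingest(records):
    """Route each record: good data is processed, not-good data is repaired first."""
    loaded, rejected = [], []
    for record in records:
        if not evaluate(record):
            record = manipulate(record)
        if evaluate(record):
            loaded.append(process(record))
        else:
            rejected.append(record)
    return loaded, rejected


batch = [
    PartnerRecord("Moby Dick", "id-1"),
    PartnerRecord("  Walden ", " id-2"),
    PartnerRecord("", "id-3"),
]
loaded, rejected = ingest(batch)
print(len(loaded), "loaded;", len(rejected), "rejected")
```

In this toy batch the second record is repaired and loaded, while the third has no title to repair and is set aside.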
Using Publisher Data to Grow WorldCat
Establish partnerships with publishers
Ingest publisher and vendor metadata in ONIX
Enhance publisher metadata
Enrich WorldCat with publisher metadata
Output enhanced ONIX data to publishers/other partners
http://www.oclc.org/partnerships/material/nexgen/nextgencataloging.htm
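As an illustration of the ingest step, the sketch below reads a publisher record written with a simplified subset of ONIX for Books 2.1 reference tags and pulls out the ISBN-13 and title. The sample record is invented, real ONIX messages carry a header and many more elements, and the function name is hypothetical; this is not a description of OCLC's actual processing.

```python
import xml.etree.ElementTree as ET

# Invented sample using a small subset of ONIX 2.1 reference tags.
SAMPLE_ONIX = """<ONIXMessage>
  <Product>
    <RecordReference>example-0001</RecordReference>
    <ProductIdentifier>
      <ProductIDType>15</ProductIDType>
      <IDValue>9780000000002</IDValue>
    </ProductIdentifier>
    <Title>
      <TitleType>01</TitleType>
      <TitleText>An Invented Example Title</TitleText>
    </Title>
  </Product>
</ONIXMessage>"""

ISBN13_TYPE = "15"  # ONIX product identifier type code for ISBN-13


def extract_products(onix_xml):
    """Return one dictionary per Product with its reference, ISBN-13 and title."""
    root = ET.fromstring(onix_xml)
    products = []
    for product in root.iter("Product"):
        isbn13 = None
        for identifier in product.findall("ProductIdentifier"):
            if identifier.findtext("ProductIDType") == ISBN13_TYPE:
                isbn13 = identifier.findtext("IDValue")
        products.append({
            "reference": product.findtext("RecordReference"),
            "isbn13": isbn13,
            "title": product.findtext("Title/TitleText"),
        })
    return products


products = extract_products(SAMPLE_ONIX)
for item in products:
    print(item["isbn13"], "-", item["title"])
```

A dictionary like this could then be matched against existing records by ISBN before any enrichment is attempted.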
Metadata Services for Publishers
[Diagram labels: Publisher; Central Library; District Library; Tech School Library; Book Seller; Bib Data; Enriched Bib Data]
Real Time Update & Record Enrichment
[Diagram labels: Union Catalogue; Central Library; District Library; Tech School Library; Territory Library; Design School Library; SRU]
Commenced  Records  Holdings   Merge %  Changes
02.2008    573,854  2,850,000  20-30%   c. 30,000/month
02.2009    159,741  1,130,000  50-60%   Not yet
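The diagram names SRU (Search/Retrieve via URL) as the protocol between the libraries and the union catalogue. A minimal sketch of building an SRU searchRetrieve request follows. The endpoint URL, the CQL query, and the `marcxml` schema name are assumptions for illustration; the slide does not say which SRU version or record schema was used.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Hypothetical endpoint; a real deployment would publish its own base URL.
SRU_BASE_URL = "https://catalogue.example.org/sru"


def build_search_url(cql_query, max_records=10, start_record=1, schema="marcxml"):
    """Build an SRU searchRetrieve URL for a CQL query."""
    params = {
        "operation": "searchRetrieve",
        "version": "1.1",
        "query": cql_query,
        "startRecord": start_record,
        "maximumRecords": max_records,
        "recordSchema": schema,  # assumed schema name
    }
    return f"{SRU_BASE_URL}?{urlencode(params)}"


url = build_search_url('dc.title = "union catalogue"', max_records=5)
print(url)

# Parsing the URL back shows the parameters a server would receive.
received = parse_qs(urlsplit(url).query)
print(received["query"][0])
```

Because the whole request is an ordinary URL, it can be issued by any HTTP client, which is part of what makes SRU convenient for real-time lookups between systems.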
WorldCat Growth –
Syndication
Local
Group
Global
OCLC and Google to exchange data, link digitized books to WorldCat
Synchronizes WorldCat with digital collections of interest to the membership
Participating organizations provide OCLC with a regular feed of metadata
WorldCat is automatically updated with new MARC records as materials become available
Reciprocal linking between WorldCat and the host site
Automatic
OCLC Syndicates WorldCat Data with Google Books
WorldCat Quality –
Improving the Quality of the Database
Local
Group
Global
Quality Control Activities
Activity                    FY08       FY09
Bib Records Replaced        2,105,325  6,804,903
Manual Merges               207,742    137,832
Authority Records Replaced  395,817    1,112,815
Change Requests Received    182,348    134,902
Automated Enrichment of Master Records
[Chart: master records enriched, by field type: LC-type call number, Dewey-type call number, Other call number, Contents/Summaries, Subject terms, URLs. Series: 2009 July Actual, 2009 Oct Actual, 2010 Jan Actual, 2010 Apr Actual, 2010 July Target. Scale: 0 to 120,000,000.]
Reducing Duplicates – An Improved Algorithm
First production run, May 18, 2009
Running small files (500 – 3000)
Statistics for May & June 2009
33,023 records processed
1,777 duplicates removed (5.7%)
846 records deferred for manual review
Reducing Duplicates – An Improved Algorithm
Full production run, Feb. 2, 2010
Entire Database, beginning with OCN #1
Statistics so far:
7.5 million records processed
Almost 650,000 records removed
Unique fields from deletes merged into the master record
The exception to this is non-Latin fields. We try to ensure that all non-Latin fields are in the retained record even if they are not on the list of mergeable fields.
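The merge rule described above can be sketched in a few lines. This is an illustration under stated assumptions, not the production algorithm: records are modeled as lists of (tag, value) pairs, the list of mergeable fields is hypothetical, and "non-Latin" is approximated by checking Unicode character names.

```python
import unicodedata

# Hypothetical list of field tags whose unique values may be carried over.
MERGEABLE_FIELDS = {"subject", "summary", "contents"}


def has_non_latin(text):
    """Return True if any alphabetic character is outside the Latin script."""
    return any(
        ch.isalpha() and "LATIN" not in unicodedata.name(ch, "")
        for ch in text
    )


def merge_into_master(master, duplicate):
    """Copy unique mergeable fields, plus all non-Latin fields, into the master."""
    merged = list(master)
    for tag, value in duplicate:
        if (tag, value) in merged:
            continue  # already present in the retained record
        if tag in MERGEABLE_FIELDS or has_non_latin(value):
            merged.append((tag, value))
    return merged


master = [("title", "War and Peace"), ("subject", "Russia History")]
duplicate = [
    ("title", "War and peace."),
    ("title", "Война и мир"),
    ("subject", "Russia History"),
    ("subject", "Napoleonic Wars Fiction"),
]
merged = merge_into_master(master, duplicate)
for tag, value in merged:
    print(tag, ":", value)
```

Here the Cyrillic title survives even though `title` is not on the mergeable list, while the variant Latin-script title from the deleted record is dropped.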
Expert Community Experiment
Experiment to test “social cataloging” with OCLC’s expert community (modeled on Wikipedia)
Interest and motivation from WorldCat Local libraries that want to use WorldCat Local as their “database of record”
Total Replaces = 108,766
Period            Replaces
February (15-28)  5,816
March             18,010
April             19,489
May               16,704
June              19,387
July              20,287
August (1-15)     9,073
Expert Community
What are they saying?
“I am loving the ability to fix typos, add more subject headings, etc. Some of which were things I would do locally but were too much of a hassle to fix at the oclc level.”
“Thank you so much for the opportunity to participate in the community enhancement experiment! Having the ability to correct typos, flesh out minimal…cataloging…is really wonderful. I hope the experiment works out well … -- I would love to see it made a permanent feature”
WorldCat is much more than a warehouse of records
• Continuous improvement of WorldCat records by members:
  • Enhance
  • Record enrichments
  • Expert Community Experiment
  • Error reporting
• OCLC’s quality management role:
  • WorldCat Quality group
  • Automated record enrichment
  • FRBRization
  • Duplicate detection and resolution
  • Support for Program for Cooperative Cataloging – NACO, CONSER, BIBCO, etc.
  • Ongoing conformance to library standards
A partnership of members and OCLC
WorldCat Quality –
Let’s Probe What “Quality” Really Means
Local
Group
Global
Online Catalogs: What Users and Librarians Want
End-Users expect online catalogs:
to look like popular Web sites
to have summaries, abstracts, tables of contents
to help find needed information
Librarians expect online catalogs:
to serve end users’ information needs
to help staff carry out work responsibilities
to have accurate, structured data
to exhibit classical principles of organization
http://www.oclc.org/us/en/reports/onlinecatalogs/default.htm
End-User Results: Recommended Enhancements
[Chart: Recommended enhancements to WorldCat, total end-user responses]
Librarian/Staff Results: Highlighted Differences
What did we learn?
End-user focus group results
Key observations:
• Delivery is as important as discovery, if not more important.
• Seamless, easy flow from discovery through delivery is critical.
• Summaries and tables of contents are key elements of a description.
• Improved search relevance is necessary.
WorldCat Registry –
Enabling Services
Local
Group
Global
Metadata about Libraries
WorldCat Registry
• A repository of metadata about libraries:
  • Location
  • Contacts
  • Policies
  • Links
WorldCat Registry Value Proposition
The WorldCat Registry allows your library to:
• Provide direct linking to local library services across a variety of OCLC products, including WorldCat.org and WorldCat Local
• Create and manage a profile that centralizes and automates information sharing with vendors and OCLC
• Gain greater internet visibility as a free benefit, regardless of OCLC membership
worldcat.org/registry/institutions
Registry Growth 2007-2009
2007
• 70,000 records
• some library users
• 20,000 requests/mo via OpenURL Gateway
2009
• 130,000 records
• Over 4,500 library users managing records
• Processing 200,000-300,000 requests/mo via OpenURL Gateway
• Multiple OCLC and non-OCLC Services that rely on this data
Bringing It All Together: RedLaser App
http://redlaser.com
The WorldCat knowledgebase
The Vision
Achieve web scale for KB services
Move the KB to the cloud
Provide KB services through an API model to:
• Provide a central platform for KB data management
• Allow read and write access to the KB within OCLC services
• Allow read and write access to the KB for external services
The KB can be managed in one place, but exposed anywhere
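The idea of one knowledge base managed in a single place and exposed to several services can be sketched with a tiny in-memory model. Everything here is hypothetical: the class, its methods, and the two consuming services are illustrative stand-ins, and none of it reflects the actual WorldCat knowledge base API.

```python
class KnowledgeBase:
    """A toy central KB: one store, with read and write access for any service."""

    def __init__(self):
        self._links = {}  # library name -> {title: link}

    def write(self, library, title, link):
        """Record (or update) the link for a title held by a library."""
        self._links.setdefault(library, {})[title] = link

    def read(self, library):
        """Return a copy of the title-to-link map for a library."""
        return dict(self._links.get(library, {}))


def link_resolver(kb, library, title):
    """Illustrative service 1: resolve a title to a link, or None if not held."""
    return kb.read(library).get(title)


def a_to_z_list(kb, library):
    """Illustrative service 2: an alphabetical title list from the same data."""
    return sorted(kb.read(library))


kb = KnowledgeBase()
kb.write("Central Library", "Journal of Examples", "https://example.org/joe")
kb.write("Central Library", "Annals of Samples", "https://example.org/aos")

print(link_resolver(kb, "Central Library", "Journal of Examples"))
print(a_to_z_list(kb, "Central Library"))
```

Both services read the same store, so a single write is visible to each of them without maintaining a separate KB per product.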
Web scale value proposition
[Chart labels: 70%; 30%; Infrastructure; Initiative]
Amazon.com: http://www.slideshare.net/goodfriday/amazon-web-services-building-a-webscale-computing-architecture
Cloud Computing
A style of computing in which scalable and elastic IT-enabled capabilities are delivered as a service to external customers using Internet technologies. - Gartner Group
Simple: Web-based applications with shared data and services.
Traditional KB Services
The traditional model for KB services is to build a separate KB to support each service or product
[Diagram labels: KB; KB]
Powering the library
[Diagram labels: Link Resolver; ERM; A-Z; Metasearch; KB]
More power
[Diagram labels: A-Z; ERM; Link Resolver; Metasearch; KB; Users; Suppliers; Partners]
Efficient storage of data in the cloud: Common use data
[Diagram labels: Bib; Holdings; User Data; Common Use Data; Library; Users; Suppliers; Partners]
Efficient storage of data in the cloud: Common use data
[Diagram labels: Titles; Holdings; User Data; Common Use KB Data; Library; Collections]
The Vision
Achieve web scale for KB services
Move the KB to the cloud
Provide KB services through an API model to:
• Provide a central platform for KB data management
• Allow read and write access to the KB within OCLC services
• Allow read and write access to the KB for external services
The KB can be managed in one place, but exposed anywhere
Traditional Link Resolver Model
[Diagram labels: A&I Database; Citation; Link Resolver; KB; Available in: Science Direct, Ebsco, Gale]
WorldCat Knowledgebase model
[Diagram series labels: WorldCat; KBWC (Collections, Titles, Holdings, Links); KBWC API; WorldCat Link Manager; A&I Database; Collection Builder; WMS License Manager; WMS ERM; WC Resource Sharing; WorldCat.org; Touchpoint; WC Local; Third Party; Citation; Lenders; Links; Rights; Filters]
WorldCat Growth –
The Value of the Cooperative
Local
Group
Global
The Value of the Shared WorldCat Network
• Record supply: An incomparable source of library-standard records to support local or group library discovery and collection management.
• Registration of holdings: Bibliographic and holdings data from more than 70,000 libraries, underpinning delivery of library collections, resource sharing, and collection analysis.
• Knowledge organization: An infrastructure utilizing library standards for description, name authority control, classification, and terminologies.
Record Supply: Where do WorldCat records come from?
The cooperative provides the content.
The cooperative activity provides the value.
Cataloging: Key to All OCLC Services
WorldCat
• Find & Get Items: Discovery
• Holdings & Availability: Resource Sharing
• Bibliographic Descriptions: Cataloging & Metadata Services
• Library Descriptions: WorldCat Registry
OCLC: The world’s libraries. Connected.
More collaboration
More institutions
More Web-scale
More synchronization
More innovation
Local
Group
Global
More
Better