Achieving Thresholds for Discovery
![Page 1: Achieving Thresholds for Discovery](https://reader035.fdocuments.us/reader035/viewer/2022062400/568168d3550346895ddfc435/html5/thumbnails/1.jpg)
Addressing Issues with EAD to Increase Discovery and Access
Merrilee Proffitt, Senior Program Officer, OCLC Research
5 December 2013, OCLC TAI-CHI webinar series, #oclcr
Achieving Thresholds for Discovery
Dan Santamaria, Assistant University Archivist for Technical Services, Seeley G. Mudd Manuscript Library, Princeton University
![Page 2: Achieving Thresholds for Discovery](https://reader035.fdocuments.us/reader035/viewer/2022062400/568168d3550346895ddfc435/html5/thumbnails/2.jpg)
Issues with EAD
![Page 3: Achieving Thresholds for Discovery](https://reader035.fdocuments.us/reader035/viewer/2022062400/568168d3550346895ddfc435/html5/thumbnails/3.jpg)
http://journal.code4lib.org/articles/8956
![Page 4: Achieving Thresholds for Discovery](https://reader035.fdocuments.us/reader035/viewer/2022062400/568168d3550346895ddfc435/html5/thumbnails/4.jpg)
EAD analysis
• Based on an April 2013 harvest of EAD-encoded finding aids for ArchiveGrid
• Analysis of elements that would support five dimensions of a discovery system:
1. Search
2. Browse
3. Display
4. Sort
5. Limit
![Page 5: Achieving Thresholds for Discovery](https://reader035.fdocuments.us/reader035/viewer/2022062400/568168d3550346895ddfc435/html5/thumbnails/5.jpg)
EAD analysis
• Focus on support for discovery, not on standards or best practices (although the two are not mutually exclusive).
![Page 6: Achieving Thresholds for Discovery](https://reader035.fdocuments.us/reader035/viewer/2022062400/568168d3550346895ddfc435/html5/thumbnails/6.jpg)
A Review of Discovery Options
![Page 7: Achieving Thresholds for Discovery](https://reader035.fdocuments.us/reader035/viewer/2022062400/568168d3550346895ddfc435/html5/thumbnails/7.jpg)
Methodology
• Recreated analysis* done by Wisser and Dean
– XPath queries across the data set
• Considered which elements would (or could) be used to “power” various aspects of discovery
• *Not all tables reproduced
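A minimal sketch of what "XPath queries across the data set" might look like in practice. It is not the presenters' code: the two sample documents, the `PATHS` table, and the `element_usage` function are hypothetical, the samples carry no XML namespace, and Python's `xml.etree.ElementTree` supports only a limited XPath subset.

```python
import xml.etree.ElementTree as ET

# Hypothetical miniature "harvest": two tiny EAD-like documents (no namespace).
SAMPLE_DOCS = [
    "<ead><archdesc><did><unittitle>Papers</unittitle>"
    "<unitdate normal='1951/1978'>1951-1978</unitdate></did></archdesc></ead>",
    "<ead><archdesc><did><unittitle>Records</unittitle></did></archdesc></ead>",
]

# Illustrative collection-level paths, relative to the <ead> root element.
PATHS = {
    "unittitle": "./archdesc/did/unittitle",
    "unitdate": "./archdesc/did/unitdate",
    "extent": "./archdesc/did/physdesc/extent",
}


def element_usage(docs, paths):
    """Return, per path, the percentage of documents where it matches at least once."""
    roots = [ET.fromstring(doc) for doc in docs]
    usage = {}
    for name, path in paths.items():
        hits = sum(1 for root in roots if root.find(path) is not None)
        usage[name] = 100.0 * hits / len(roots)
    return usage


usage = element_usage(SAMPLE_DOCS, PATHS)
for name, percent in usage.items():
    print(f"{name}: {percent:.0f}% of finding aids")
```

Note that this only measures presence. As the findings slide points out, an element can be present and still hold content that is hard to use for discovery.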
![Page 8: Achieving Thresholds for Discovery](https://reader035.fdocuments.us/reader035/viewer/2022062400/568168d3550346895ddfc435/html5/thumbnails/8.jpg)
Methodology
The distribution of element usage was roughly divided into 4 groups:
• Low: 0% to 50%
• Medium: 51% to 80%
• High: 81% to 95%
• Complete: 96% to 100%
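The four groups can be expressed as a small lookup, sketched here under one assumption that the slide does not state: a fractional value that falls between two bands (for example 50.5%) is assigned to the higher band. The function name `usage_group` is hypothetical.

```python
def usage_group(percent):
    """Map an element-usage percentage to one of the four groups on the slide.

    Assumption (not from the slide): values between bands, such as 50.5,
    go to the higher band.
    """
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    if percent <= 50:
        return "Low"
    if percent <= 80:
        return "Medium"
    if percent <= 95:
        return "High"
    return "Complete"


for value in (12, 64, 88, 99.5):
    print(value, usage_group(value))
```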
![Page 9: Achieving Thresholds for Discovery](https://reader035.fdocuments.us/reader035/viewer/2022062400/568168d3550346895ddfc435/html5/thumbnails/9.jpg)
Findings
• Lots of “medium,” few “high” or “complete”
• Even when an element is accounted for, the content may make it difficult to use (unitdate and extent are two examples)
• Most “complete” elements are administrative in nature, or are required by the DTD/schema
• In short, EAD encoding may not (now) give a lot of bang for the discovery buck.
![Page 10: Achieving Thresholds for Discovery](https://reader035.fdocuments.us/reader035/viewer/2022062400/568168d3550346895ddfc435/html5/thumbnails/10.jpg)
Is hope on the horizon?
• Finding aids in ArchiveGrid may represent legacy encoding
• New focus on shared authoring tools may help
• EAD3 may help
• Tools and techniques for improving finding aids (with an emphasis on discovery) may help
![Page 11: Achieving Thresholds for Discovery](https://reader035.fdocuments.us/reader035/viewer/2022062400/568168d3550346895ddfc435/html5/thumbnails/11.jpg)
Over to Dan...
![Page 12: Achieving Thresholds for Discovery](https://reader035.fdocuments.us/reader035/viewer/2022062400/568168d3550346895ddfc435/html5/thumbnails/12.jpg)
Finding Aids and Thresholds for Discovery at Princeton
Dan Santamaria Seeley G. Mudd Manuscript Library
OCLC Research Webinar
![Page 13: Achieving Thresholds for Discovery](https://reader035.fdocuments.us/reader035/viewer/2022062400/568168d3550346895ddfc435/html5/thumbnails/13.jpg)
Discovery: Profession-Wide Challenges
• The reluctance to embrace archival standards
• EAD and document-centric description
• Most of all, the persistence of backlogs
![Page 14: Achieving Thresholds for Discovery](https://reader035.fdocuments.us/reader035/viewer/2022062400/568168d3550346895ddfc435/html5/thumbnails/14.jpg)
Challenges: Backlogs
• An internet-accessible finding aid exists for 44% of archival collections
– Source: OCLC “Taking Our Pulse” survey
![Page 15: Achieving Thresholds for Discovery](https://reader035.fdocuments.us/reader035/viewer/2022062400/568168d3550346895ddfc435/html5/thumbnails/15.jpg)
Discovery: Institution-Specific Challenges
• Backlogs
– Princeton University Archives had no finding aids as late as 1990.
– 2005: 2/3 of University Archives lacked descriptive records of any kind.
• Little structured data for “Finding Aids” from any division.
• Most arrangement and description work done by staff on short-term and soft money positions.
![Page 16: Achieving Thresholds for Discovery](https://reader035.fdocuments.us/reader035/viewer/2022062400/568168d3550346895ddfc435/html5/thumbnails/16.jpg)
Thresholds for Discovery: Phase 1
• Efficient backlog reduction
• DACS compliance
• Collection-level and series-level focus
• Make sure all of our collections were represented online
![Page 17: Achieving Thresholds for Discovery](https://reader035.fdocuments.us/reader035/viewer/2022062400/568168d3550346895ddfc435/html5/thumbnails/17.jpg)
Phase 1: Our Approach
Punting on idiosyncratic legacy description
TMs, pp. numbered 1-62, (pp. numbered 1-23 are photocopies of the original), ANs and holograph corrections 215 pages (pages 19 and 20 are missing). Dates and locations, 1975 March 26-1976 June 29; Princeton, N.J. (1-26, 31-34) Madison, Wis. (26-30) . Hanover, N.H. (34-38) . Sitges, Spain (39-215). Notebook on Casa de campo. Preoccupation with plot details, characterization, chapter transitions. After a long period away from home and from the novel (1-52), the author resumes work on it by re-evaluating each chapter. By the end of the notebook he has completed a second draft of the novel's first part (chs. 1-7) and the first chapter of the second part. The notebook contains a variety of personal comments about the author and those around him.
![Page 18: Achieving Thresholds for Discovery](https://reader035.fdocuments.us/reader035/viewer/2022062400/568168d3550346895ddfc435/html5/thumbnails/18.jpg)
Phase 1: Our Approach
• Stated goals
– Provide minimum level of online access to collections (collection-level records).
– Gain acceptable level of intellectual control over collections.
– Provide a centralized entry point for researchers and staff.
![Page 19: Achieving Thresholds for Discovery](https://reader035.fdocuments.us/reader035/viewer/2022062400/568168d3550346895ddfc435/html5/thumbnails/19.jpg)
Phase 1: Our Approach
• Survey entire holdings and record holdings/location information and very basic descriptive data
• Create collection-level records for all collections
– MARC
– DACS single-level optimum
![Page 20: Achieving Thresholds for Discovery](https://reader035.fdocuments.us/reader035/viewer/2022062400/568168d3550346895ddfc435/html5/thumbnails/20.jpg)
Collection-Level EAD
![Page 21: Achieving Thresholds for Discovery](https://reader035.fdocuments.us/reader035/viewer/2022062400/568168d3550346895ddfc435/html5/thumbnails/21.jpg)
Phase 1: Results
• All collections encoded in EAD and MARC by end of 2007
• DACS single-level and multi-level optimum
• Processing and retro-conversion happening concurrently
– More than 800 finding aids encoded, 2006-2007
– More than 2500 linear feet processed/described in 2006-2007
![Page 22: Achieving Thresholds for Discovery](https://reader035.fdocuments.us/reader035/viewer/2022062400/568168d3550346895ddfc435/html5/thumbnails/22.jpg)
Thresholds for Discovery: Phase 2
![Page 23: Achieving Thresholds for Discovery](https://reader035.fdocuments.us/reader035/viewer/2022062400/568168d3550346895ddfc435/html5/thumbnails/23.jpg)
Phase 2: Requirements and Goals
![Page 24: Achieving Thresholds for Discovery](https://reader035.fdocuments.us/reader035/viewer/2022062400/568168d3550346895ddfc435/html5/thumbnails/24.jpg)
Principles
• User focus
– Find
– Identify
– Select
– Obtain
• Data not documents
![Page 25: Achieving Thresholds for Discovery](https://reader035.fdocuments.us/reader035/viewer/2022062400/568168d3550346895ddfc435/html5/thumbnails/25.jpg)
Data Analysis
![Page 26: Achieving Thresholds for Discovery](https://reader035.fdocuments.us/reader035/viewer/2022062400/568168d3550346895ddfc435/html5/thumbnails/26.jpg)
Search/Browse/Sort/Display/Limit
![Page 27: Achieving Thresholds for Discovery](https://reader035.fdocuments.us/reader035/viewer/2022062400/568168d3550346895ddfc435/html5/thumbnails/27.jpg)
Search/Browse/Sort/Display/Limit
![Page 28: Achieving Thresholds for Discovery](https://reader035.fdocuments.us/reader035/viewer/2022062400/568168d3550346895ddfc435/html5/thumbnails/28.jpg)
Search/Browse/Sort/Display/Limit
![Page 29: Achieving Thresholds for Discovery](https://reader035.fdocuments.us/reader035/viewer/2022062400/568168d3550346895ddfc435/html5/thumbnails/29.jpg)
Beyond Collection-Level
Sort by title
Sort by date
![Page 30: Achieving Thresholds for Discovery](https://reader035.fdocuments.us/reader035/viewer/2022062400/568168d3550346895ddfc435/html5/thumbnails/30.jpg)
Data Enhancement
• Specific Elements
– Dates
– Extent
– Titles
– Creators
– “Access Points”
– Digital Content
• ALL EADs
– Minimize mixed content
– Unnumber <c0X>
– Denest <unittitle> and <unitdate>
– Remove <head> and @label
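Two of the "ALL EADs" clean-ups, unnumbering `<c0X>` and removing `<head>` and `@label`, could be scripted roughly as follows. This is a sketch under stated assumptions, not the Princeton normalization tool: the function name `clean_components` and the sample fragment are hypothetical, the input has no XML namespace, and every `<head>` is treated as disposable.

```python
import re
import xml.etree.ElementTree as ET

# Matches the numbered component tags c01 through c12.
NUMBERED_C = re.compile(r"^c(0[1-9]|1[0-2])$")


def clean_components(root):
    """Rename c01..c12 to c, drop @label attributes, and remove <head> elements."""
    for element in root.iter():
        if NUMBERED_C.match(element.tag):
            element.tag = "c"
        element.attrib.pop("label", None)
    # Snapshot the tree before removing children so iteration stays safe.
    # Caveat: ElementTree drops any text that directly follows a removed <head>.
    for parent in list(root.iter()):
        for head in parent.findall("head"):
            parent.remove(head)
    return root


# Hypothetical fragment for illustration only.
SAMPLE = (
    "<dsc><head>Contents</head>"
    "<c01 level='series'><did><unittitle label='Title'>Series 1</unittitle></did>"
    "<c02 level='file'><did><unittitle>Folder 1</unittitle></did></c02>"
    "</c01></dsc>"
)

root = clean_components(ET.fromstring(SAMPLE))
cleaned = ET.tostring(root, encoding="unicode")
print(cleaned)
```

Nesting depth is still recoverable from the tree itself after unnumbering, which is why the numbered tags carry no information a generic `<c>` lacks.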
![Page 31: Achieving Thresholds for Discovery](https://reader035.fdocuments.us/reader035/viewer/2022062400/568168d3550346895ddfc435/html5/thumbnails/31.jpg)
Dates
Collection-Level
• Virtually all present
• Virtually all normalized
• Little work required
Component-Level
• WORK REQUIRED!
• 2 months
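To illustrate why normalized dates matter for sorting, here is a minimal sketch that turns a `normal` attribute value into a pair of sortable dates. It assumes hyphenated values of the form `YYYY`, `YYYY-MM`, or `YYYY-MM-DD`, optionally given as a `start/end` range; other ISO 8601 forms are not handled. The names `parse_normal` and `_to_date` are hypothetical.

```python
import calendar
from datetime import date


def _to_date(text, end_of_period):
    """Expand YYYY, YYYY-MM, or YYYY-MM-DD to the first or last day it covers."""
    parts = [int(part) for part in text.split("-")]
    year = parts[0]
    month = parts[1] if len(parts) > 1 else (12 if end_of_period else 1)
    if len(parts) > 2:
        day = parts[2]
    else:
        day = calendar.monthrange(year, month)[1] if end_of_period else 1
    return date(year, month, day)


def parse_normal(normal):
    """Split a @normal value such as '1951-08-21/1978-12-31' into (start, end)."""
    start_text, _, end_text = normal.partition("/")
    start = _to_date(start_text, end_of_period=False)
    # A single date (no slash) covers its own whole period.
    end = _to_date(end_text or start_text, end_of_period=True)
    return start, end


# Sorting hypothetical components by the start of their date range.
components = [("Series 3", "1951-08-21/1978-12-31"), ("Series 1", "1920/1935")]
by_date = sorted(components, key=lambda item: parse_normal(item[1])[0])
print([title for title, _ in by_date])
```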
![Page 32: Achieving Thresholds for Discovery](https://reader035.fdocuments.us/reader035/viewer/2022062400/568168d3550346895ddfc435/html5/thumbnails/32.jpg)
Extent
Collection-level
• Virtually all present
• Little structure
• Effective for display
• Ineffective for sorting; reporting; analysis
Component-level
• Consistently present at series/subseries level
• Infrequently present at lower component levels
• Little structure
![Page 33: Achieving Thresholds for Discovery](https://reader035.fdocuments.us/reader035/viewer/2022062400/568168d3550346895ddfc435/html5/thumbnails/33.jpg)
Coming Soon: <physdescstructured>
• Attributes:
– @coverage = whole or part
– @physdescstructuredtype = carrier, materialtype, or spaceoccupied
• Required Elements
– <quantity>
– <unittype>
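As a rough illustration of moving from free-text extent to the structured form described above, the sketch below splits a simple statement such as "1 folder" into `<quantity>` and `<unittype>`. It is an assumption-laden example: the regular expression only handles a leading number followed by a unit, the function name `structured_extent` and its defaults are hypothetical, and anything it cannot parse is returned as `None` for manual review rather than guessed at.

```python
import re
import xml.etree.ElementTree as ET

# A leading integer or decimal, whitespace, then the unit text.
EXTENT = re.compile(r"^\s*(\d+(?:\.\d+)?)\s+(.+?)\s*$")


def structured_extent(text, coverage="whole", kind="spaceoccupied"):
    """Build a <physdescstructured> element from a simple extent string, or return None."""
    match = EXTENT.match(text)
    if match is None:
        return None  # leave unparseable statements for a person to review
    element = ET.Element(
        "physdescstructured",
        {"coverage": coverage, "physdescstructuredtype": kind},
    )
    ET.SubElement(element, "quantity").text = match.group(1)
    ET.SubElement(element, "unittype").text = match.group(2)
    return element


folder = structured_extent("1 folder", kind="carrier")
print(ET.tostring(folder, encoding="unicode"))
print(structured_extent("circa several boxes"))
```

With quantity and unit separated, extents can be summed, sorted, and reported on, which the slide on extent notes is not possible with display-only text.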
![Page 34: Achieving Thresholds for Discovery](https://reader035.fdocuments.us/reader035/viewer/2022062400/568168d3550346895ddfc435/html5/thumbnails/34.jpg)
Access Points: Subjects and “Topics”
<subject rules="local" source="local" encodinganalog="690" authfilenumber="t9">American literature
</subject>
EAD SKOS
![Page 35: Achieving Thresholds for Discovery](https://reader035.fdocuments.us/reader035/viewer/2022062400/568168d3550346895ddfc435/html5/thumbnails/35.jpg)
Indexing
![Page 36: Achieving Thresholds for Discovery](https://reader035.fdocuments.us/reader035/viewer/2022062400/568168d3550346895ddfc435/html5/thumbnails/36.jpg)
Component Identifiers
<c id="C0041_c0070" level="series">
  <did>
    <unittitle>Series 3: Correspondence</unittitle>
    <unitdate normal="1951-08-21/1978-12-31" type="inclusive">1951 August 21-1978</unitdate>
    <physdesc>
      <extent type="computed">1 folder</extent>
    </physdesc>
  </did>
![Page 37: Achieving Thresholds for Discovery](https://reader035.fdocuments.us/reader035/viewer/2022062400/568168d3550346895ddfc435/html5/thumbnails/37.jpg)
Data Management
• RelaxNG schema
– Loose
– Strict
• Normalization tool
![Page 38: Achieving Thresholds for Discovery](https://reader035.fdocuments.us/reader035/viewer/2022062400/568168d3550346895ddfc435/html5/thumbnails/38.jpg)
Lessons Learned
Iterative Description Works
![Page 39: Achieving Thresholds for Discovery](https://reader035.fdocuments.us/reader035/viewer/2022062400/568168d3550346895ddfc435/html5/thumbnails/39.jpg)
Lessons Learned: Content Standards
![Page 40: Achieving Thresholds for Discovery](https://reader035.fdocuments.us/reader035/viewer/2022062400/568168d3550346895ddfc435/html5/thumbnails/40.jpg)
Lessons Learned: Usability
![Page 41: Achieving Thresholds for Discovery](https://reader035.fdocuments.us/reader035/viewer/2022062400/568168d3550346895ddfc435/html5/thumbnails/41.jpg)
Lessons Learned: Discovery Happens Elsewhere
[Chart: Traffic Sources]
Shares shown, in slide order: 55%, 19%, 10%, 8%, 4%, 2%, 1%, 1%
Legend, in slide order: google / organic; (direct) / (none); princeton.edu / referral; en.wikipedia.org / referral; library.princeton.edu / referral; bing / organic; catalog.princeton.edu / referral; yahoo / organic
![Page 42: Achieving Thresholds for Discovery](https://reader035.fdocuments.us/reader035/viewer/2022062400/568168d3550346895ddfc435/html5/thumbnails/42.jpg)
Lessons Learned
Think beyond EAD: Monitor developments with conceptual models and linked data.
http://www.ica.org/13799/the-experts-group-on-archival-description/
![Page 43: Achieving Thresholds for Discovery](https://reader035.fdocuments.us/reader035/viewer/2022062400/568168d3550346895ddfc435/html5/thumbnails/43.jpg)
Where to Start
1. DACS
2. Structure
3. Iterate
Tools that support all three
![Page 44: Achieving Thresholds for Discovery](https://reader035.fdocuments.us/reader035/viewer/2022062400/568168d3550346895ddfc435/html5/thumbnails/44.jpg)
Credits
Archival Description Working Group (2011-2013)
• Maureen Callahan
• John Delaney
• Shaun Ellis
• Regine Heberlein
• Dan Santamaria
• Jon Stroop
• Don Thornbury
![Page 46: Achieving Thresholds for Discovery](https://reader035.fdocuments.us/reader035/viewer/2022062400/568168d3550346895ddfc435/html5/thumbnails/46.jpg)
Thank You!
©2013 OCLC. This work is licensed under a Creative Commons Attribution 3.0 Unported License. Suggested attribution: “This work uses content from “Achieving Thresholds for Discovery” © OCLC & Dan Santamaria, used under a Creative Commons Attribution license: http://creativecommons.org/licenses/by/3.0/”
Merrilee Proffitt [email protected]
Dan Santamaria [email protected]