An Iterative Approach to Building Sustainable Repository Services on Fedora
Open Repositories 2009, May 19, 2009
Outline
• Organizational overview and background
  o Claire Stewart, Head, Digital Collections
• Winterton Collection project
  o Karen Miller, Monographic and Digital Projects Cataloger, Bibliographic Services
• Iterative approach
  o Bill Parod, Repository Architect, Enterprise Systems
Parallel committee/department structure
May 2008
Repository Implementation Group project schedule
Repository Implementation Group project schedule
May 2009
EAD to images to EAD+images
Winterton Collection cataloging
• Full cataloging for each of the 76 original collections and at the container level (album, envelope, etc.) for collections of more than one container.
• Individual photographs are not (generally) cataloged fully:
  o Title
  o Note (optional)
  o Publisher or Creator (if available)
• Full cataloging included
  o Title
  o Dates of coverage
  o Abstract
  o Scope and contents description
  o Biographical or historical note
  o Physical description (size of album, how many pages, photos, etc.)
  o Subject headings
Providing cataloging at the album level means that:
• Many individual photographs will not be described concisely by the subject headings assigned.
• Some subject headings may not apply at all to some photographs.
Transcribing only the photograph titles results in such problems as these when keyword searching:
• Non-English words are not translated
• People referred to in captions by their initials, not names
• Animals referred to by given name, not by species
• Non-descriptive captions
"A.E.B. and his well-identified crowd"
"Enmei and his rhino"
Repository Development Strategy
1. Implement models and services for ingest, preservation, and access of core content.
2. Provide tools for staff to ingest and manage repository content.
3. Facilitate integration of repository materials with end-user tools and services.
4. Iterate…
Draw Detailed Requirements from Project Commitments:
A) OAI-ORE Annotation of OCA texts
B) Cross Collection Search Project
C) Winterton Photography Collection
D) Kirtas Mounting Books Project
E) EAD Initiative
F) Hesler Photography Collection
G) Chemical Bulletin
H) Fava Masks
I) Curator-driven Digitization Project
J) Charlotte Moorman / Prgm. African Studies Audio
Inventory Content Types
1) EAD encoded finding aids
2) TEI encoded text transcriptions
3) High resolution images
4) Virtual crops of high resolution images
5) Page imaged books
6) 3D objects
7) Aggregations: full text, fielded, and faceted search
8) Audio
9) Video
Project / Content Type Matrix
Services by Content Type
• Text Service
• Image Service
• Metadata Conversion Service
• Discovery Service
Text Service
EAD Objects
EAD Disseminator Methods:
• getEADHeader
• getComponentAsHTML(unitid)
• getComponentStructure
• getChildComponents(unitid)
• getComponents
• getComponentStructure(unitid)
• getAncestorComponents(unitid)
• getComponentChildrenAsJSON(unitid)
• getComponentAsEmbeddedHTML(unitid)
• getComponent(unitid)
• getElementById(xml:id)
• getArchDescNoComponents
• getElementsByName(element_name)
• getDigest(unitid)
• getComponentAsDC(unitid)
• getComponentAsMODS(unitid)
• reindex
Datastreams:
• DC
• MODS
• EAD
• EAD to DC XSL
• EAD to MODS XSL
• EAD to HTML XSL
• EAD to HTML Frag XSL
• EAD Children to JSON XSL
• RELS-EXT
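The component-navigation methods listed above (getChildComponents, getAncestorComponents, getComponentChildrenAsJSON) can be illustrated with a small sketch. This is not the presenters' implementation, which the slides describe as XSLT plus a retrieval service; it only shows the kind of lookup those methods imply. The sample EAD fragment, its unitid values, and all function names are hypothetical.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical, heavily simplified EAD fragment (no namespace), for illustration only.
SAMPLE_EAD = """<ead>
  <archdesc>
    <dsc>
      <c01><did><unitid>1</unitid><unittitle>Album 1</unittitle></did>
        <c02><did><unitid>1.1</unitid><unittitle>Page 1</unittitle></did></c02>
        <c02><did><unitid>1.2</unitid><unittitle>Page 2</unittitle></did></c02>
      </c01>
    </dsc>
  </archdesc>
</ead>"""


def is_component(elem):
    """EAD components are <c> or numbered <c01> .. <c12>."""
    tag = elem.tag
    return tag == "c" or (len(tag) == 3 and tag[0] == "c" and tag[1:].isdigit())


def find_component(root, unitid):
    """Return the component whose did/unitid matches, or None."""
    for elem in root.iter():
        if is_component(elem) and elem.findtext("did/unitid") == unitid:
            return elem
    return None


def child_components(root, unitid):
    """Roughly what getChildComponents(unitid) might return: direct children only."""
    comp = find_component(root, unitid)
    if comp is None:
        return []
    return [
        {"unitid": c.findtext("did/unitid"), "title": c.findtext("did/unittitle")}
        for c in comp
        if is_component(c)
    ]


def ancestor_components(root, unitid):
    """Roughly what getAncestorComponents(unitid) might return: nearest ancestor first."""
    parent_of = {child: parent for parent in root.iter() for child in parent}
    comp = find_component(root, unitid)
    ancestors = []
    while comp is not None and comp in parent_of:
        comp = parent_of[comp]
        if is_component(comp):
            ancestors.append(comp.findtext("did/unitid"))
    return ancestors


root = ET.fromstring(SAMPLE_EAD)
# A JSON rendering in the spirit of getComponentChildrenAsJSON(unitid).
children_json = json.dumps(child_components(root, "1"))
```

Keeping the lookup keyed on unitid mirrors the method signatures on the slide, where almost every call takes a unitid parameter.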
TEI Objects
TEI Disseminator Methods:
• getTOC
• getImageTextTOC
• getStructuredTextTOC
• getHeader(xml:id)
• getHeading
• getChunk(xml:id)
• getPageByNumber(pageOrdinal)
• getPageByID(xml:id)
• reindex
Datastreams:
• DC
• MARCXML
• DejaVu
• Book ORE REM
• Page Image ORE REM
• TEI
• RELS-EXT
Fedora Text Disseminator
• getComponent: unitid
• getComponentAsHTML: unitid
• getComponentAsDC: unitid
• getComponentAsMODS: unitid
• ....
• reindex
SGREPServlet
• Encapsulate query syntax
• XSLT optional on query result
SGREP: Executable program on service host
Add Fedora Disseminator Methods
Add Fedora Disseminators
Add/Modify XSLT Processing on Retrieval
Add/Modify SGREP Queries
Replace Retrieval Software
Text Service Stack Enhancement Options
Examples:
• EAD “Digest” - C0n + title/id of children and ancestors
• JSON support for EXT-JS
• HTML design iteration
• EAD to MODS conversion maturation
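The layering in this stack (disseminator method, a servlet that encapsulates query syntax and an optional transform, and a replaceable retrieval program) might be sketched as below. This is an illustration only: `TextService`, `fake_retriever`, and the query templates are hypothetical names, and the template strings are placeholders, not real SGREP query syntax.

```python
from typing import Callable, Dict, Optional, Tuple

Retriever = Callable[[str], str]   # bottom layer: runs a query, returns text
Transform = Callable[[str], str]   # optional post-processing (XSLT in the slides)


class TextService:
    """Middle layer: maps method names to query templates and optional transforms."""

    def __init__(self, retriever: Retriever) -> None:
        self.retriever = retriever
        self.methods: Dict[str, Tuple[str, Optional[Transform]]] = {}

    def register(self, name: str, query_template: str,
                 transform: Optional[Transform] = None) -> None:
        # Adding or changing a method touches only this table,
        # not the retrieval software underneath.
        self.methods[name] = (query_template, transform)

    def call(self, name: str, **params: str) -> str:
        query_template, transform = self.methods[name]
        result = self.retriever(query_template.format(**params))
        return transform(result) if transform else result


def fake_retriever(query: str) -> str:
    """Stand-in for the retrieval program; simply echoes the query it received."""
    return f"<result query='{query}'/>"


service = TextService(fake_retriever)
# Placeholder template: NOT real SGREP syntax.
service.register("getComponent", "component[unitid={unitid}]")
# str.upper stands in for an XSLT step applied to the query result.
service.register("getComponentAsHTML", "component[unitid={unitid}]", transform=str.upper)
```

Each enhancement option on the slide corresponds to one seam here: registering a new method, changing a template or transform, or passing a different retriever to the constructor.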
Image Service: Source Image
Cropped Image
Single Image File Referenced By Crop Information:
<svg:svg xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:svg="http://www.w3.org/2000/svg">
  <svg:image x="0" y="0" width="10656" height="7992" xlink:href="inu-wint/inu-wint-22.30.jp2">
    <svg:clipPath>
      <svg:rect x="0" y="1166" width="8034" height="6036"/>
    </svg:clipPath>
  </svg:image>
</svg:svg>
Cropped Photo
Single Image File Referenced By Crop Information:
<svg:svg xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:svg="http://www.w3.org/2000/svg">
  <svg:use xlink:href="http://repository.library.northwestern.edu/fedora/get/inu:inu-wint-22-30/DELIV-OPS">
    <svg:clipPath>
      <svg:rect x="4246" y="1436" width="2997" height="2518"></svg:rect>
    </svg:clipPath>
  </svg:use>
</svg:svg>
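As a rough illustration of how a client or servlet could read such a crop datastream, the sketch below extracts the referenced source and the clip rectangle from the SVG shown above using Python's standard library. The function name `read_crop` is hypothetical; the SVG text is copied from the slide.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"
XLINK_NS = "http://www.w3.org/1999/xlink"

CROP_SVG = """<svg:svg xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:svg="http://www.w3.org/2000/svg">
  <svg:use xlink:href="http://repository.library.northwestern.edu/fedora/get/inu:inu-wint-22-30/DELIV-OPS">
    <svg:clipPath>
      <svg:rect x="4246" y="1436" width="2997" height="2518"></svg:rect>
    </svg:clipPath>
  </svg:use>
</svg:svg>"""


def read_crop(svg_text):
    """Return (source_href, (x, y, width, height)) from a crop SVG."""
    root = ET.fromstring(svg_text)
    # The slides show two variants: svg:image (file reference) and svg:use (object reference).
    source = root.find(f"{{{SVG_NS}}}use")
    if source is None:
        source = root.find(f"{{{SVG_NS}}}image")
    if source is None:
        raise ValueError("no svg:use or svg:image element found")
    rect = source.find(f"{{{SVG_NS}}}clipPath/{{{SVG_NS}}}rect")
    if rect is None:
        raise ValueError("no clip rectangle found")
    region = tuple(int(rect.get(key)) for key in ("x", "y", "width", "height"))
    return source.get(f"{{{XLINK_NS}}}href"), region


href, region = read_crop(CROP_SVG)
```

Because the crop is stored only as a reference plus a rectangle, no second image file is needed for the cropped photo.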
Image and Crop Objects
Image Service methods (supported by both image and crop objects):
• getWithWidth(width)
• getWithLongSide(length)
• getWithHeight(height)
• getCropWithWidth(x, y, width, height, destwidth)
• getCropWithHeight(x, y, width, height, destheight)
• getCropWithSize(x, y, width, height, destwidth, destheight)
• getWithSize(destwidth, destheight)
Image Object Datastreams:
• DC
• MODS
• PREMIS
• SVG
• TIFF
• EXIF
• JP2
• MIX_TIFF
• MIX_JP2
• RELS-EXT
Crop Object Datastreams:
• DC
• MODS
• PREMIS
• SVG
• RELS-EXT
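The sizing methods above presumably scale the image proportionally; the slides do not spell out the arithmetic. Under that assumption, the output dimensions for getWithLongSide and getWithWidth could be computed as in this sketch. The function names are hypothetical, and the rounding rule is a guess.

```python
def size_for_long_side(src_width, src_height, length):
    """Scale so the longer side equals `length`, preserving aspect ratio (assumed behavior)."""
    if src_width <= 0 or src_height <= 0 or length <= 0:
        raise ValueError("dimensions must be positive")
    long_side = max(src_width, src_height)
    return (
        max(1, round(src_width * length / long_side)),
        max(1, round(src_height * length / long_side)),
    )


def size_for_width(src_width, src_height, dest_width):
    """Scale to a fixed width, preserving aspect ratio (assumed behavior)."""
    if src_width <= 0 or src_height <= 0 or dest_width <= 0:
        raise ValueError("dimensions must be positive")
    return dest_width, max(1, round(src_height * dest_width / src_width))


# The 8034 x 6036 clip region from the cropped-image slide, requested with length=150.
thumb = size_for_long_side(8034, 6036, 150)
```

For that region the long side is the width, so the result is 150 pixels wide and about 113 pixels high.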
http:/.../fedora/get/inu:inu-wint-22-30-2/inu:sdef-addimage/getWithLongSide?length=150
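The URL above follows the pattern base/get/object PID/service definition PID/method, with method parameters in the query string. A small helper for building such URLs might look like the following sketch. The helper name and the `localhost` base URL are assumptions; the host in the slide is elided and is not filled in here.

```python
from urllib.parse import quote, urlencode


def dissemination_url(base, pid, sdef, method, **params):
    """Build base/get/<pid>/<sdef>/<method>?<params>, keeping ':' in PIDs unescaped."""
    path = "/".join(quote(part, safe=":") for part in (pid, sdef, method))
    url = f"{base.rstrip('/')}/get/{path}"
    if params:
        url += "?" + urlencode(params)
    return url


# Hypothetical base URL; PID, service definition, and method are taken from the slide.
url = dissemination_url(
    "http://localhost:8080/fedora",
    "inu:inu-wint-22-30-2",
    "inu:sdef-addimage",
    "getWithLongSide",
    length=150,
)
```

Since image and crop objects support the same methods, the same helper works for either kind of object; only the PID changes.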
Fedora Image Disseminator
• getWithWidth(width)
• getWithLongSide(length)
• getWithHeight(height)
• getCropWithSize(x, y, width, height, destwid…)
Image Servlet
• Encapsulate rendering parameters
• Object specific rendering parameters (SVG)
• User request rendering parameters
• Rendering service parameters and location
Rendering Service: Aware, Djatoka
Add Fedora Disseminator Methods
Add Fedora Disseminators
Add/Modify Rendering Options
Add/Modify Rendering Service Parameters
Replace Rendering Software
Image Service Stack Enhancement Options
Examples:
• Added getLongSide(length)
• Added rotation
• Optimized rendering parameters
• Rendering features - vector overlay
• Object reference chaining
• Djatoka experimentation
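One way to read "object reference chaining" is that a crop requested on a crop object has to be resolved back to coordinates on the underlying source image before rendering. The sketch below shows that composition under this assumption; the slides do not describe how the servlet actually does it. `Rect` and `resolve` are hypothetical names.

```python
from typing import NamedTuple


class Rect(NamedTuple):
    x: int
    y: int
    width: int
    height: int


def resolve(inner: Rect, outer: Rect) -> Rect:
    """Express `inner` (given in `outer`'s coordinates) in the coordinates of
    outer's parent, clamping it so it never extends beyond `outer`."""
    x = max(0, min(inner.x, outer.width))
    y = max(0, min(inner.y, outer.height))
    width = max(0, min(inner.width, outer.width - x))
    height = max(0, min(inner.height, outer.height - y))
    return Rect(outer.x + x, outer.y + y, width, height)


# Clip rectangle of the cropped photo from the SVG slide, in source-image pixels.
photo = Rect(4246, 1436, 2997, 2518)
# A user request such as getCropWithSize(100, 200, 500, 400, ...) on the crop object.
requested = Rect(100, 200, 500, 400)
on_source = resolve(requested, photo)
```

Applying `resolve` repeatedly would handle a crop of a crop of a crop, which is why the composition is written as a pure function of two rectangles.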
EAD Objects
EAD Service methods:
• getEADHeader
• getComponentAsHTML(unitid)
• getComponentStructure
• getChildComponents(unitid)
• getComponents
• getComponentStructure(unitid)
• getAncestorComponents(unitid)
• getComponentChildrenAsJSON(unitid)
• getComponentAsEmbeddedHTML(unitid)
• getComponent(unitid)
• getElementById(xml:id)
• getArchDescNoComponents
• getElementsByName(element_name)
• getDigest(unitid)
• getComponentAsDC(unitid)
• getComponentAsMODS(unitid)
• reindex
Datastreams:
• DC
• MODS
• EAD
• EAD to DC XSL
• EAD to MODS XSL
• EAD to HTML XSL
• EAD to HTML Frag XSL
• EAD Children to JSON XSL
• RELS-EXT
Image/Crop Objects
Image Service methods:
• getWithWidth(width)
• getWithLongSide(length)
• getWithHeight(height)
• getCropWithWidth(x, y, width, height, destwidth)
• getCropWithHeight(x, y, width, height, destheight)
• getCropWithSize(x, y, width, height, destwidth, destheight)
• getWithSize(destwidth, destheight)
Datastreams:
• DC
• MODS
• PREMIS
• SVG
• TIFF
• EXIF
• JP2
• MIX_TIFF
• MIX_JP2
• RELS-EXT
Searching
• SOLR
• MODS described collections
• Metadata conversion services
• Faceting
• “Searchable” Interface
  o MODS Collection Datastream
  o Facet list
  o Field List
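A faceted search against Solr is typically issued as an HTTP request with `facet=true` and one `facet.field` parameter per facet. The sketch below builds such a request URL from a facet list of the kind a “Searchable” collection might publish. The base URL and the field names `subject` and `collection` are assumptions; the slides do not list the actual index fields.

```python
from urllib.parse import urlencode


def solr_select_url(base, query, facet_fields, rows=10):
    """Build a Solr select URL with faceting enabled on the given fields."""
    params = [("q", query), ("rows", rows), ("wt", "json"), ("facet", "true")]
    # Solr accepts the facet.field parameter once per facet.
    params += [("facet.field", field) for field in facet_fields]
    return f"{base.rstrip('/')}/select?{urlencode(params)}"


# Hypothetical Solr location and facet list.
url = solr_select_url("http://localhost:8983/solr", "rhino", ["subject", "collection"])
```

Driving the facet parameters from a per-collection list keeps the search interface generic across collections with different metadata.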
Project Checklist
A) OAI-ORE Annotation of OCA texts
B) Cross Collection Search Project
C) Winterton Photography Collection
D) Kirtas Mounting Books Project
E) EAD Initiative
F) Hesler Photography Collection
G) Chemical Bulletin
H) Fava Masks
I) Curator-driven Digitization Project
J) Charlotte Moorman / Prgm. African Studies Audio
Next Step: Collection Management Tools
EAD Ingest Processing
Image Ingest Processing
Heterogeneous Ingest Processing
Ingest Management Tools
• Curator-selected Ingest (Ad-hoc collections)
• Digital Image Library (Art Slide Library +)
• Kirtas Book Scanning
Mounting Books Project - OR09 Tuesday, 3:00: Session 8B