Post on 17-Dec-2015
ComPADRE
Experiences developing an OAI server over an existing database repository
Resources for Physics and Astronomy Education
Lyle Barbato
American Association of Physics Teachers
lbarbato@aapt.org
The ComPADRE Project
NSF/NSDL funded repository for Physics and Astronomy specific resources
Partnerships and collaborations
– AAPT: American Association of Physics Teachers
– AAS: American Astronomical Society
– APS: American Physical Society
– AIP/SPS: Society of Physics Students
– Merlot: Multimedia Educational Resource for Learning and Online Teaching
Multiple collection interfaces for different target audiences
Collection interfaces share a common architecture so we can build once and use everywhere
Why OAI?
Fits well as a shared service
Necessary step to becoming a service provider for physics and astronomy data providers
NSDL harvesting
Increased content visibility
Theoretically, it has a ‘simple’ and ‘easy’ implementation
Our Backend Structure
Database driven
– Our records are generated on the fly
Database model based on the IEEE Learning Object Metadata (LOM) standard
OAI Implementation Experience
Research of OAI tools and protocol
OAI Server Implementation and corresponding database changes
Implementation of metadata formats
– OAI_DC: a few hours
– NSDL_DC: a few weeks
OAI Server Implementation
Only 6 Verbs!
– But opportunities still exist for problems and confusion
Focusing on implementation problems, they were either:
– Related to our database infrastructure: datestamps for exported ‘on the fly’ records, and general deletion handling
– Or related to supporting shared item records for multiple collections with separate workflows
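
The six verbs and their argument rules can be sketched as a small request validator. This is a minimal illustration, not the ComPADRE code: the argument rules follow the OAI-PMH 2.0 specification, while the names VERBS, RESUMABLE and validate_request are hypothetical.

```python
# Hypothetical validator for incoming OAI-PMH requests (names are illustrative).
# Required/optional arguments per verb, as defined by OAI-PMH 2.0.
VERBS = {
    "Identify": {"required": set(), "optional": set()},
    "ListMetadataFormats": {"required": set(), "optional": {"identifier"}},
    "ListSets": {"required": set(), "optional": set()},
    "GetRecord": {"required": {"identifier", "metadataPrefix"}, "optional": set()},
    "ListIdentifiers": {"required": {"metadataPrefix"},
                        "optional": {"from", "until", "set"}},
    "ListRecords": {"required": {"metadataPrefix"},
                    "optional": {"from", "until", "set"}},
}
# Verbs whose responses may be split up and continued with a resumptionToken.
RESUMABLE = {"ListSets", "ListIdentifiers", "ListRecords"}


def validate_request(params):
    """Return None if the request is well-formed, else an OAI-PMH error code."""
    verb = params.get("verb")
    if verb not in VERBS:
        return "badVerb"
    args = set(params) - {"verb"}
    if "resumptionToken" in args:
        # resumptionToken is exclusive: no other argument may accompany it.
        if verb in RESUMABLE and args == {"resumptionToken"}:
            return None
        return "badArgument"
    rules = VERBS[verb]
    if not rules["required"] <= args:
        return "badArgument"  # a required argument is missing
    if args - rules["required"] - rules["optional"]:
        return "badArgument"  # an argument this verb does not accept
    return None
```

For example, validate_request({"verb": "ListRecords"}) returns "badArgument" because metadataPrefix is missing. A real server would also need to check datestamp granularity, set names and token validity.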
Datestamp Issues
OAI requires datestamps be applied on the metadata record level.
This raises issues with database-based items:
– When exports are done on the fly, item-level last-modified datestamps are not good enough, since fields that change in one format may not be included in another
– When a new metadata format is created, it will have a different OAI datestamp for creation; however, the database record will already exist
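
One possible way to derive a datestamp per metadata record rather than per item is sketched below. It assumes the database tracks a last-modified time for each field, which the slides do not state. The field names, format dates, and the functions record_datestamp and to_oai_datestamp are all hypothetical.

```python
from datetime import datetime, timezone

# Hypothetical mapping: which database fields each metadata format exports.
FORMAT_FIELDS = {
    "oai_dc": {"title", "description", "creator"},
    "nsdl_dc": {"title", "description", "creator", "education_level"},
}

# Hypothetical dates on which each format was first exposed by the server.
FORMAT_CREATED = {
    "oai_dc": datetime(2004, 6, 1, tzinfo=timezone.utc),
    "nsdl_dc": datetime(2005, 3, 1, tzinfo=timezone.utc),
}


def record_datestamp(field_modified, metadata_prefix):
    """Datestamp of one item as exported in one metadata format.

    field_modified maps a field name to that field's last-modified time.
    Only fields the format exports count, and the result is never earlier
    than the date the format itself was introduced.
    """
    fields = FORMAT_FIELDS[metadata_prefix]
    stamps = [t for name, t in field_modified.items() if name in fields]
    stamps.append(FORMAT_CREATED[metadata_prefix])
    return max(stamps)


def to_oai_datestamp(moment):
    """Format a timezone-aware datetime as an OAI-PMH UTC datestamp."""
    return moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
```

With this approach, a change to education_level moves the nsdl_dc datestamp forward but leaves the oai_dc datestamp alone, so harvesters of oai_dc are not asked to re-fetch an unchanged record.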
Deleted Record Issues
The goal is for ‘persistent’ deleted records
However:
– The datestamp for a deleted record must be the time it was deleted. Thus we could not use the ‘Last Modified’ date of our records, due to potential future database-wide global vocabulary updates.
– Metadata versioning: it happens. Plan to consistently export all metadata format versions forever, or plan to have ‘deleted’ record holders for existing records at the time of the metadata format removal.
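
The two points above can be sketched with a separate deletion log, kept apart from the item's ‘Last Modified’ field. This is an assumption-laden illustration: DeletionLog and its methods are hypothetical, and an in-memory dictionary stands in for a database table.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone


class DeletionLog:
    """Hypothetical in-memory stand-in for a table of deleted records."""

    def __init__(self):
        # (identifier, metadata_prefix) -> time the record was deleted
        self._deleted = {}

    def mark_deleted(self, identifier, metadata_prefix, when=None):
        """Record a deletion; the first recorded time is kept permanently."""
        key = (identifier, metadata_prefix)
        self._deleted.setdefault(key, when or datetime.now(timezone.utc))

    def retire_format(self, identifiers, metadata_prefix, when=None):
        """Leave a 'deleted' holder for every record of a removed format."""
        for identifier in identifiers:
            self.mark_deleted(identifier, metadata_prefix, when)

    def deleted_header(self, identifier, metadata_prefix):
        """Build the OAI-PMH <header status="deleted"> element, or None."""
        when = self._deleted.get((identifier, metadata_prefix))
        if when is None:
            return None
        header = ET.Element("header", {"status": "deleted"})
        ET.SubElement(header, "identifier").text = identifier
        stamp = when.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
        ET.SubElement(header, "datestamp").text = stamp
        return header
```

Because the deletion time lives in its own store, a later database-wide vocabulary update cannot move it, and removing a metadata format becomes a bulk call to retire_format.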
Additional recommendations
For Service Providers
– To avoid loss of context and information, place set info into the record using IsPartOf to alleviate “On the Horse” problems
– If possible, indicate the set size in the setDescription
– Provide internal unique ids for de-duping
– Indicate source of metadata
– Expose the richest metadata format possible
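
As a rough illustration of the first and third recommendations, the sketch below adds a Dublin Core Terms isPartOf element and a local identifier to a record's metadata. The plain metadata wrapper element, the function add_context, and the sample values are hypothetical; a real record would sit inside whatever schema the chosen metadata format defines.

```python
import xml.etree.ElementTree as ET

DC = "http://purl.org/dc/elements/1.1/"   # Dublin Core elements namespace
DCT = "http://purl.org/dc/terms/"         # Dublin Core terms namespace
ET.register_namespace("dc", DC)
ET.register_namespace("dct", DCT)


def add_context(metadata, collection_name, local_id):
    """Append the owning collection and a repository-local id to a record."""
    # isPartOf keeps the collection context when the record is read alone.
    ET.SubElement(metadata, f"{{{DCT}}}isPartOf").text = collection_name
    # A stable local id gives harvesters something to de-duplicate on.
    ET.SubElement(metadata, f"{{{DC}}}identifier").text = local_id
    return metadata


# Hypothetical record: the title and id are made up for the example.
record = ET.Element("metadata")
ET.SubElement(record, f"{{{DC}}}title").text = "Pendulum Lab"
add_context(record, "ComPADRE", "compadre:item:42")
xml_text = ET.tostring(record, encoding="unicode")
```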
Provide Richer Metadata – How?
Expose the richest metadata format possible
– However, the OAI literature has little on how to create your own XML schema if you need to. The Open Archives Forum (http://www.oaforum.org) has a great tutorial on creating a new schema by extending the oai_dc schema with new elements:
http://www.oaforum.org/tutorial/english/page5.htm#section1
– Also, if you have an internal vocabulary that you wish to expose, this page gives instructions for creating an XML schema ‘vocabulary’ via enumeration:
http://www.xml.com/pub/a/2003/02/05/wxs-enum.html
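
To show what a ‘vocabulary’ via enumeration amounts to, here is a minimal sketch that generates an XML Schema simpleType restricted to a fixed list of terms. The function enumeration_schema, the type name, the terms and the target namespace are all made up for illustration; they are not the ComPADRE vocabulary.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"
ET.register_namespace("xs", XS)


def enumeration_schema(type_name, terms, target_namespace):
    """Build an XSD that restricts xs:string to the given vocabulary terms."""
    schema = ET.Element(
        f"{{{XS}}}schema",
        {"targetNamespace": target_namespace, "elementFormDefault": "qualified"},
    )
    simple = ET.SubElement(schema, f"{{{XS}}}simpleType", {"name": type_name})
    restriction = ET.SubElement(simple, f"{{{XS}}}restriction", {"base": "xs:string"})
    for term in terms:
        # One xs:enumeration facet per allowed vocabulary term.
        ET.SubElement(restriction, f"{{{XS}}}enumeration", {"value": term})
    return schema


# Hypothetical vocabulary and namespace, for illustration only.
schema = enumeration_schema(
    "educationLevel",
    ["High School", "Lower Undergraduate"],
    "http://example.org/vocab",
)
xsd_text = ET.tostring(schema, encoding="unicode")
```

Generating the schema from the database's own vocabulary table would keep the published enumeration in step with the terms actually in use.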
When it’s all said and done
There are tools which can do this for you
BUT
If your infrastructure doesn’t mesh with existing tools, OAI is simple enough to implement yourself.
References
ComPADRE
– http://www.compadre.org/portal
Open Archives Initiative
– http://www.openarchives.org/
Digital Library Federation OAI and Shareable Metadata Best Practices Working Group
– http://oai-best.comm.nsdl.org/cgi-bin/wiki.pl
NSDL Metadata Resources Page
– http://metamanagement.comm.nsdlib.org/IntroPage.html
Dublin Core Metadata Initiative
– http://dublincore.org
References
The Basics of OAI: An introduction to the Protocol for Metadata Harvesting
– Web-Wise 2004: Sharing Digital Resources
– Tim Cole, Sarah Shreeves, Martin Halbert
– http://imlsdcc.grainger.uiuc.edu/OAI_Tutorial_WebWise.ppt
Bitter Harvest: Problems & Suggested Solutions for OAI-PMH Data & Service Providers
– Roy Tennant, California Digital Library
– http://www.cdlib.org/inside/projects/harvesting/bitter_harvest.html
Challenges for Service Providers When Importing Metadata in Digital Libraries
– Marilyn McClelland, David McArthur, Sarah Giersch, Gary Geisler
– http://www.dlib.org/dlib/april02/mcclelland/04mcclelland.html
References
JCDL OAI Workshop 2003
– Naomi Dushay
– http://www.cs.cornell.edu/people/simeon/workshops/JCDL2003/JCDL2003_OAI_Workshop_Naomi_Dushay.ppt
Improving Metadata Quality: Augmentation and Recombination
– Naomi Dushay, Diane Hillmann, Jon Phipps
– http://metamanagement.comm.nsdl.org/Metadata_Augmentation--DC2004.pdf
DLF Best Practices for OAI Data Providers and for Sharable Metadata - Proposal for a Workplan
– Sarah Shreeves
– http://imlsdcc.grainger.uiuc.edu/DLF_OAI/DLF_OAI_Workplan.pdf
Set Best Practices
– http://oai-best.comm.nsdl.org/cgi-bin/wiki.pl?SetPractices
References
Service Provider (SP) Issues
– Katrina (Kat) Hagedorn
– http://www.kathagedorn.com/SP_issues.html
CIC-OAI project recommendations for Dublin Core metadata providers
– Muriel Foulonneau & Timothy W. Cole
– http://cicharvest.grainger.uiuc.edu/dcguidelines.asp
Description Guidelines for RLG Cultural Materials (Metadata Recommendations)
– http://www.rlg.org/en/page.php?Page_ID=214
Open Archives Forum
– http://www.oaforum.org/
Remaining questions
Selective Harvesting issue regarding moving in and out of a set
Persistence of a record’s ‘deleted’ status