SSHELCO 2016 metadata workshop

Plays Well with Others: Getting Your Digital Collections Metadata Ready for the World. 2016 SSHELCO Annual, March 17, 2016


Page 1: SSHELCO 2016 metadata workshop

Plays Well with Others:

Getting Your Digital Collections Metadata Ready

for the World

2016 SSHELCO Annual

March 17, 2016

Page 2: SSHELCO 2016 metadata workshop

Linda Ballinger, Penn State
Doreva Belfiore, Temple University
Bill Fee, State Library of Pennsylvania
Leanne Finnigan, Temple University
Kristen Yarmey, University of Scranton
Elise Warshavsky, Temple University

The Pennsylvania Digital Collections Project (PDCP) Metadata team!

Page 3: SSHELCO 2016 metadata workshop

On the agenda

• PDCP/DPLA Overview
• Meet the aggregator
• Why metadata matters
• Field by Field metadata madness!
  ○ Derived Fields
  ○ Required Fields
  ○ Highly Recommended Fields
  ○ Recommended Fields
  ○ Optional Fields

Page 4: SSHELCO 2016 metadata workshop

Before we start

• Q&A throughout
• Fun breaks at panelists’ discretion!

Slides and guidelines will be available.

Most of all: Don’t panic. We’re all in this together.

Page 5: SSHELCO 2016 metadata workshop

PDCP Project Overview

Page 6: SSHELCO 2016 metadata workshop

Toward a PA DPLA Hub

August 2014: meeting at the Free Library of Philadelphia

Initiated by Joe Lucia and Stacey Aldrich, former PA State Librarian

Including representatives from a number of institutions across the state

Page 7: SSHELCO 2016 metadata workshop

Founders

Page 8: SSHELCO 2016 metadata workshop

Why get involved?

DPLA as major discoverability conduit:
• Worldwide exposure for PA content

DPLA as a means of working efficiently:
• Collaboration at the cross-institutional level
• Taking advantage of economy of scale
• DPLA portal / API vs. customized silos

Page 9: SSHELCO 2016 metadata workshop

Digital Public Library of America

http://dp.la

Page 10: SSHELCO 2016 metadata workshop

DPLA Hub and Spoke Model

Content Hubs:
• Single institutions, 200K+ objects, e.g. NARA, HathiTrust, NYPL

Service Hubs:
• Content aggregation for many institutions
• State/regional level; ideally 1:1 ratio
• Digital Commonwealth (MA), Mountain West Digital Library, Empire State Digital Network (NY)

Page 11: SSHELCO 2016 metadata workshop
Page 12: SSHELCO 2016 metadata workshop

Digitization and Repository Support Activities

Digitization:
• For organizations that have not started digitizing materials, or have not done much
• Potential for remote, local and mobile digitization options (a.k.a. “scannebagos”)
• Provided by the State Library of Pennsylvania

Content Hosting:
• For organizations that already have digital files but no current digital repository capabilities
• Provided by POWER Library (HSLC)
• Free for Pennsylvania institutions

Page 13: SSHELCO 2016 metadata workshop

SUCCESS!

PDCP Announced as DPLA Pennsylvania Service Hub, August 28, 2015!

Estimated Timeline:
• September 2015: Orientation
• October-November 2015: Signing Legal Agreements, Metadata Revision
• October-December 2015: Metadata normalization, harvesting tests and QA
• Early 2016: Planned live ingest of records into the DPLA!

Page 14: SSHELCO 2016 metadata workshop

PA-DPLA Aggregator

Proof-of-concept prototype
• Development: Pennsylvania State University / Temple University partnership
• Dec. 2014 - Mar. 2015
• Hydra (Fedora) - Open Source Platform
• Harvesting & exposing metadata via OAI-PMH
• https://github.com/tulibraries/dplah

Released production version
• Summer 2015
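
To make "harvesting via OAI-PMH" concrete, here is a minimal sketch of how a harvest request URL can be built. It is not taken from the dplah code base. The `ListRecords` verb and the `oai_dc` metadata prefix come from the OAI-PMH protocol itself; the endpoint URL, the set name, and the helper name `list_records_url` are illustrative assumptions.

```python
from urllib.parse import urlencode


def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None):
    """Build an OAI-PMH ListRecords request URL for a repository endpoint."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        # Restrict the harvest to one set (often one digital collection).
        params["set"] = set_spec
    return f"{base_url}?{urlencode(params)}"


# Hypothetical endpoint and set name, for illustration only.
url = list_records_url("http://www.server.org/oai", set_spec="cookbooks")
print(url)
```

Fetching that URL from a real OAI-PMH endpoint would return an XML document of Dublin Core records, which is the raw material the aggregator works from.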

Page 15: SSHELCO 2016 metadata workshop

PA-DPLA Aggregator

OAI-PMH Metadata - Human readable with faceted browsing and searching

http://libcollab.temple.edu/aggregator

Page 16: SSHELCO 2016 metadata workshop

PA-DPLA Aggregator

http://libcollab.temple.edu/aggregator

Page 17: SSHELCO 2016 metadata workshop

PA-DPLA Aggregator

Kitchen Memories, Scranton Public Library, http://content.lackawannadigitalarchives.org/cdm/ref/collection/cookbooks/id/1657

http://libcollab.temple.edu/aggregator

Page 18: SSHELCO 2016 metadata workshop

Testing Data http://libcollab.temple.edu/aggregator-testing/

Page 19: SSHELCO 2016 metadata workshop

Prototype Harvested Content

“Lowest hanging fruit”:

○ OAI-PMH harvestable data
  ■ 29 institutions, 147K+ harvested records

○ Primarily targeting collections from PDCP Steering and Planning Committee institutions

○ Keep numbers manageable for testing purposes

○ Scalable to full production mode for the future

Page 20: SSHELCO 2016 metadata workshop

Why do we need good metadata?

Page 21: SSHELCO 2016 metadata workshop

http://iknowwhereyourcatlives.com/

Page 23: SSHELCO 2016 metadata workshop

http://dp.la/map?utf8=%E2%9C%93&q=textiles

Page 26: SSHELCO 2016 metadata workshop

http://dp.la/apps?page=2

Page 27: SSHELCO 2016 metadata workshop

Collection Policies

Page 28: SSHELCO 2016 metadata workshop

CC0 Metadata

Contributing institutions are required to share their metadata and thumbnails under a CC0 license (full access - no rights reserved).

The digital objects themselves retain any original specified rights.

Page 29: SSHELCO 2016 metadata workshop

Collecting Scope

The following types of collections are NOT currently accepted by the DPLA:
• Scholarly materials: ETDs, Journal Articles
• Finding Aids: EADs, Collection Guides
• Aggregate Description: Objects described at the folder, series, or collection level instead of the item level
• Items that don’t resolve to a publicly accessible URL
• Individual page-level objects instead of compound ones

Page 30: SSHELCO 2016 metadata workshop

Restricted Items

If your institution needs to restrict any digital objects at the item level, for copyright or other reasons:

Enter the string pdcp_noharvest in fields that map to either of these Dublin Core values:
• dcterms:accessRights
• dc:rights

Restricted Area for Humans Only, Ronyasoft, http://www.ronyasoft.com/products/poster-forge/templates/funny-signs/restricted-area-sign-template/images/restricted-area-sign-template.jpg
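
As a rough illustration of the restricted-items rule, the sketch below shows one way a harvester could skip flagged records. The flag string and the two field names come from the slide; the record shape (a dict of field name to list of strings) and the function name `is_restricted` are assumptions made for this example, not the aggregator's actual implementation.

```python
NOHARVEST_FLAG = "pdcp_noharvest"
RIGHTS_FIELDS = ("dcterms:accessRights", "dc:rights")


def is_restricted(record):
    """Return True if any rights field of the record carries the no-harvest flag.

    `record` is assumed to map field names to lists of string values.
    """
    for field in RIGHTS_FIELDS:
        for value in record.get(field, []):
            if NOHARVEST_FLAG in value:
                return True
    return False


# Hypothetical records, for illustration only.
records = [
    {"dc:title": ["Kitchen Memories"], "dc:rights": ["In the public domain."]},
    {"dc:title": ["Donor file"], "dc:rights": ["pdcp_noharvest"]},
]
harvestable = [r for r in records if not is_restricted(r)]
print([r["dc:title"][0] for r in harvestable])
```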

Page 31: SSHELCO 2016 metadata workshop

Derived Fields

Page 32: SSHELCO 2016 metadata workshop

Derived Fields

Derived fields are those metadata fields that are created by the PDCP aggregator automatically from the OAI-PMH feed.

Happy Face, Temple University, http://digital.library.temple.edu/cdm/ref/collection/p15037coll3/id/6541

Derived = “Don’t worry, be happy!”

Page 33: SSHELCO 2016 metadata workshop

Thumbnail

Thumbnails are the small preview versions of your digital object that are shown both in your repository and in the DPLA.

They are important because they give viewers a confirmation that they have found (or not found) what they are looking for.

Page 34: SSHELCO 2016 metadata workshop

Thumbnail

Thumbnails can be derived by our aggregator from these common repository systems:
• CONTENTdm
• Bepress
• Omeka
• VUDL
• … and more to be added

Page 35: SSHELCO 2016 metadata workshop

Thumbnail

Portrait of Zapata, Kutztown University, http://digital.klnpa.org/cdm/ref/collection/asaro/id/21

Page 36: SSHELCO 2016 metadata workshop

Thumbnail

For other systems, we need a consistent path where the thumbnail is housed, e.g.:

http://www.server.org/repo/thumbs/$identifier/
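
A consistent path matters because the thumbnail URL can then be computed from the record identifier alone. The sketch below illustrates the idea with Python's `string.Template`, which happens to use the same `$identifier` placeholder syntax as the example path. The function name and the sample identifier are hypothetical, and this is not the aggregator's own code.

```python
from string import Template
from urllib.parse import quote

# The path pattern from the slide; $identifier is filled in per record.
THUMBNAIL_PATTERN = Template("http://www.server.org/repo/thumbs/$identifier/")


def thumbnail_url(identifier):
    """Fill the record identifier into the repository's thumbnail path."""
    # Percent-encode the identifier so unusual characters cannot break the URL.
    return THUMBNAIL_PATTERN.substitute(identifier=quote(identifier, safe=""))


print(thumbnail_url("1657"))
```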

Page 37: SSHELCO 2016 metadata workshop

Collection

The collection name is set up by the team before harvesting. It generally matches the digital collection name found online.

Page 38: SSHELCO 2016 metadata workshop

Contributing Institution

The contributing institution name refers to YOUR ORGANIZATION and is set up by the team before harvesting.

Are YOU in This?, Temple University, http://digital.library.temple.edu/cdm/ref/collection/p16002coll9/id/2952

Page 39: SSHELCO 2016 metadata workshop

Contributing Institution

The Contributing Institution name can also be pulled automatically from the following DC fields:
• Contributor
• Creator
• Publisher
• Source

Page 40: SSHELCO 2016 metadata workshop

Intermediate Provider

If your data is hosted by an aggregator or common repository, then we list that entity as an Intermediate Provider, e.g.:
• Keystone Library Network (KLN)
• Lackawanna Valley Digital Archives (LVDA)
• POWER Library (via HSLC)

Page 41: SSHELCO 2016 metadata workshop

Resource Location

The Resource Location is a trackback to the original collection URL for a digital object.

Example: http://content.lackawannadigitalarchives.org/cdm/ref/collection/SPL/id/36

Page 42: SSHELCO 2016 metadata workshop

Resource Location

Page 43: SSHELCO 2016 metadata workshop

Resource Location

Early Library Staff, Scranton Public Library, http://content.lackawannadigitalarchives.org/cdm/ref/collection/SPL/id/36

Page 44: SSHELCO 2016 metadata workshop

Resource Location

Required by DPLA to present your original data record.

Can be derived from the OAI-PMH data feed for typical systems:
• CONTENTdm
• Bepress
• Omeka
• VUDL

Can be custom mapped if needed for other systems, e.g.:
http://www.server.org/repo/$identifier/

Page 45: SSHELCO 2016 metadata workshop

Required fields

Page 46: SSHELCO 2016 metadata workshop

Title

• Other than the thumbnail, the title is often the first piece of information a user sees on a results list
• Should be the name by which an object is known, not a file name

Page 47: SSHELCO 2016 metadata workshop

Language

Required if appropriate

• Three-letter ISO 639-2 language codes are preferred
• Aggregator normalizes these codes to full language names for display

Examples:

eng ---> English
ita ---> Italian
spa ---> Spanish
lat ---> Latin
san ---> Sanskrit
vie ---> Vietnamese
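
The normalization step can be pictured as a simple lookup. The sketch below covers only the six codes shown on the slide; a real implementation would need the full ISO 639-2 code list. The function name and the choice to pass unknown values through unchanged are assumptions for illustration, not a description of the aggregator's code.

```python
# Only the six codes from the slide; the full ISO 639-2 list is much longer.
LANGUAGE_NAMES = {
    "eng": "English",
    "ita": "Italian",
    "spa": "Spanish",
    "lat": "Latin",
    "san": "Sanskrit",
    "vie": "Vietnamese",
}


def normalize_language(value):
    """Map a three-letter code to its display name; leave other values as-is."""
    code = value.strip().lower()
    return LANGUAGE_NAMES.get(code, value)


print(normalize_language("eng"))
print(normalize_language(" SPA "))
```

Passing unrecognized values through unchanged keeps free-text entries visible for later review instead of silently dropping them.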

Page 48: SSHELCO 2016 metadata workshop

Language

Page 49: SSHELCO 2016 metadata workshop

Rights

Contains information about rights associated with the resource

“In the public domain and may be used without copyright restriction.”

“Content is under copyright of the University of Scranton.”

http://creativecommons.org/licenses/by-sa/3.0/

REMINDER: DPLA will only accept objects that are available and viewable to the general public

pdcp_noharvest

Page 50: SSHELCO 2016 metadata workshop

Rights

‘Getting it Right on Rights’
• Working group (DPLA, Europeana, etc.)
• Released white paper May 2015 and opened it up for comments
• Standardized rights statements
• Coming soon!

Page 51: SSHELCO 2016 metadata workshop

Rights

The aggregator can add rights statements at the collection level

Page 52: SSHELCO 2016 metadata workshop

Type

The nature or genre of the resource

• DCMI Type Vocabulary recommended
• Assign ‘Text’ type to images of texts
• Think of the user

Page 53: SSHELCO 2016 metadata workshop

Type

Types used by DPLA:

text, image, sound, moving image, physical object

The aggregator can map your local types to these at the collection/seed level
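
A collection-level type mapping might look like the sketch below. The five target values are the DPLA types listed above; the local type names on the left are invented examples, and the function is a hypothetical illustration rather than the aggregator's configuration format.

```python
DPLA_TYPES = {"text", "image", "sound", "moving image", "physical object"}

# Hypothetical local values mapped to DPLA types for one collection.
LOCAL_TYPE_MAP = {
    "photograph": "image",
    "still image": "image",
    "manuscript": "text",
    "oral history": "sound",
    "video": "moving image",
    "artifact": "physical object",
}


def map_type(local_type):
    """Return the DPLA type for a local type value, or None if unmapped."""
    key = local_type.strip().lower()
    if key in DPLA_TYPES:
        return key  # already a DPLA type
    return LOCAL_TYPE_MAP.get(key)


print(map_type("Photograph"))
print(map_type("Text"))
```

Returning None for unmapped values makes it easy to list the local terms that still need a decision before harvesting.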

Page 54: SSHELCO 2016 metadata workshop

Type

Page 55: SSHELCO 2016 metadata workshop

Highly Recommended Fields

Page 56: SSHELCO 2016 metadata workshop

Date Created

Date of creation of the original resource.

(Not date digitized.)
(Not date range or time period.)

Page 57: SSHELCO 2016 metadata workshop

Date Created

Map to dcterms:created (preferred) or dc:date

Page 58: SSHELCO 2016 metadata workshop

Date Created

Use ISO 8601 (W3CDTF) format, which looks like:

YYYY-MM-DD
YYYY-MM
YYYY

http://xkcd.com/1179/

Page 59: SSHELCO 2016 metadata workshop

Date Created

Watch out for:
• Extra spaces or symbols: _1943
• Missing digits: 1943-1-5
• Placeholders and qualifying terms: Unknown, n.d., ca. 1950, 1950s
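
The three accepted date shapes are easy to check before you submit metadata. The following is a minimal sketch of such a check; the function name is an assumption, and the pattern validates only the shape of the value (it does not confirm that, say, February 31 is a real day).

```python
import re

# Accepts YYYY, YYYY-MM, or YYYY-MM-DD with zero-padded month and day.
W3CDTF_DATE = re.compile(r"\d{4}(-(0[1-9]|1[0-2])(-(0[1-9]|[12]\d|3[01]))?)?")


def is_valid_date(value):
    """Return True if the value is exactly YYYY, YYYY-MM, or YYYY-MM-DD."""
    return W3CDTF_DATE.fullmatch(value) is not None


# Good values, followed by the problem values from the slide.
for sample in ["1943", "1943-01", "1943-01-05",
               "_1943", "1943-1-5", "Unknown", "n.d.", "ca. 1950", "1950s"]:
    print(f"{sample!r}: {is_valid_date(sample)}")
```

Running a check like this over an exported spreadsheet column is a quick way to find the records that need manual cleanup.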

Page 60: SSHELCO 2016 metadata workshop

Place

Spatial characteristics of the resource. Geographic location relevant to the original item.

Page 61: SSHELCO 2016 metadata workshop

Place

Map to dcterms:spatial (preferred) or dc:coverage

Page 62: SSHELCO 2016 metadata workshop

Place

Where is this thing?

Page 63: SSHELCO 2016 metadata workshop

Place

Multiple choice:
• Philadelphia
• Philadelphia; Pennsylvania
• Philadelphia (Pa.)
• Philadelphia, Pennsylvania, United States
• Seventh and Sansom Streets (Philadelphia, Pa.)
• Franklin Institute (Philadelphia, Pa.)

Facade of the original Franklin Institute building prior to moving to the Parkway in 1934.

Page 64: SSHELCO 2016 metadata workshop

Place

Use LCNAF (preferred) or TGN, FAST, ...

Addresses, lat/long, or other location markers can also be mapped to dcterms:spatial.

Page 65: SSHELCO 2016 metadata workshop

Place

Examples:
• Pittsburgh (Pa.)
• Allegheny County (Pa.)
• Harrison (Allegheny County, Pa. : Township)

Page 66: SSHELCO 2016 metadata workshop

Place

Watch out for:
• Place ≠ Time Period

Page 67: SSHELCO 2016 metadata workshop

Subject

The topic of the resource.

Page 68: SSHELCO 2016 metadata workshop

Subject

Map to dc:subject

Page 69: SSHELCO 2016 metadata workshop

Subject

Many variations on a theme:
• Newspapers
• Student newspaper
• New Holland (Pa.) Newspapers
• Scranton (Pa.) -- Newspapers
• West Chester University Student Newspapers
• College student newspapers and periodicals -- Pennsylvania -- Scranton
• University of Scranton -- Students -- Newspapers
• Lock Haven University of Pennsylvania Student Newspaper Archive

Page 70: SSHELCO 2016 metadata workshop

Subject

Use LC Authorities (preferred) or DDC, MeSH, UDC, AAT, TGN…

For LCSH, use space and double hyphen: term -- term

Page 71: SSHELCO 2016 metadata workshop

Subject

Examples:
• Harlem Renaissance -- Maps
• Coal miners -- Pennsylvania -- Social conditions
• Cassatt, Mary, 1844-1926

Page 72: SSHELCO 2016 metadata workshop

Subject

Watch out for:
• Quotation marks: "D.O.R.A at Westminster"
• Separate terms with semi-colons: , Holt, Colbin32 Carat Club,anniversary ,charitable organizations,social services
• Odd symbols or characters: 2nd &amp
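
Several of these subject problems can be caught with a small cleanup pass. The sketch below assumes the subject values arrive as a single semicolon-delimited string. It decodes leftover HTML entities, strips stray quotation marks and spaces, and normalizes the LCSH double hyphen to "term -- term". The function name and sample input are hypothetical, and it deliberately does not try to fix comma-delimited values, since commas are legitimate inside headings such as personal names.

```python
import html
import re


def clean_subjects(raw):
    """Split a semicolon-delimited subject string into tidy terms."""
    # Decode entities such as &amp; first, so their semicolons are not
    # mistaken for term separators.
    text = html.unescape(raw)
    terms = []
    for part in text.split(";"):
        term = part.strip().strip('"').strip()
        # LCSH subdivisions: exactly one space on each side of the double hyphen.
        term = re.sub(r"\s*--\s*", " -- ", term)
        if term:
            terms.append(term)
    return terms


# Hypothetical messy value, for illustration only.
print(clean_subjects('"Coal miners--Pennsylvania" ; Newspapers;; 2nd &amp; Main'))
```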

Page 73: SSHELCO 2016 metadata workshop

Subject

Watch out for:
• Standardization

Page 74: SSHELCO 2016 metadata workshop

Subject

Watch out for:
• Local terms with limited global meaning
  ○ KSOM
  ○ WML20

Page 75: SSHELCO 2016 metadata workshop

Recommended Fields

Page 76: SSHELCO 2016 metadata workshop

Creator - Old way (DON’T DO THIS)

Page 77: SSHELCO 2016 metadata workshop

Creator – Proper way

Page 78: SSHELCO 2016 metadata workshop

Description

• The 500 field of the Dublin Core world

Page 79: SSHELCO 2016 metadata workshop

Format

From New York Heritage
http://cdm16694.contentdm.oclc.org/cdm/ref/collection/p15109coll6/id/2083

From African Americans Seen Through the Eyes of the Newsreel Cameraman
http://collections.contentdm.oclc.org/cdm/singleitem/collection/p9002coll1/id/277/rec/1

Page 80: SSHELCO 2016 metadata workshop

Identifiers

• OCLC # 12177842
• Call # PT 1.1
• FRBR linking code
• OCLC number (of object)
• CONTENTdm number
• CONTENTdm file name
• Identifier
• Generated identifier
• HPHWPZ201404000165 (unique, assigned)

Page 81: SSHELCO 2016 metadata workshop

Publisher

Publisher State Library of Pennsylvania NO!

Publisher Mount Pleasant, Pa.: Mount Pleasant Press, 1906

Page 82: SSHELCO 2016 metadata workshop

Optional Fields

Page 83: SSHELCO 2016 metadata workshop

Alternate Title

Page 84: SSHELCO 2016 metadata workshop

Contributor

NO!

Page 85: SSHELCO 2016 metadata workshop

Wrap-Up

Page 86: SSHELCO 2016 metadata workshop

What’s next?

See PDCP Metadata Guidelines (still in draft)
• Let us know if you have feedback!
• We plan to finalize v.1 in April
• Living document

Would your institution like to contribute to the DPLA?
• Email: [email protected]
• Institutions will be forwarded to different organizations based upon needs and readiness for data harvest (harvesting and metadata support, repository support, digitization support)

Page 87: SSHELCO 2016 metadata workshop

Checklist to Contribute Data

• Permission letter agreeing to share metadata and thumbnails to DPLA under a CC0 license
• Data available on a publicly accessible website
• Ability to share metadata via OAI-PMH or CSV file
• Staff available to work with PDCP about metadata issues

Page 88: SSHELCO 2016 metadata workshop

Stay in touch:

• Come to PA Backwards session tomorrow morning (9am)!
• Email the PDCP team: [email protected]
• Twitter: Follow us at @pdcp_pa
• PADIGITAL Listserv - general information about statewide digital [email protected]
• Send a message to listserv@albright org with the text “subscribe padigital” in the body

Page 89: SSHELCO 2016 metadata workshop

Resources and Support

Documentation: https://drive.google.com/folderview?id=0B-icpMLW3cRXQmVQMnJoMkZJUDg&usp=sharing

Onboarding: Leanne Finnigan ([email protected]) or Elise Warshavsky ([email protected])

Online office hours

Webinar workshops

METADATA

Original: The Doctor Is In, Peanuts Worldwide, LLC.

0 ¢

Page 90: SSHELCO 2016 metadata workshop

Thank you!

http://xkcd.com/1543/

PDCP Metadata Team
Linda Ballinger
Doreva Belfiore
Bill Fee
Leanne Finnigan
Kristen Yarmey