November 22, 2003 DASER Conference. Copyright MIT, 2003
METS: Metadata Encoding & Transmission Standard
Part One: Problem definition
Digital (Library) Objects
• Reformatted to digital
  – scanned photographs, books and journals
  – digitized audio/video files
• “Born digital”
  – TEI-encoded texts
  – digital images, audio, video files
  – GIS, statistical datasets
  – interactive content
Digital (Library) Objects
• Simple Objects
  – single files, e.g.
    • visual TIFF images
    • MP3 files
    • TEI-encoded text
  – objects stand alone
    • no relationships to other objects
Digital (Library) Objects
• Complex Objects
  – multiple related files, e.g.
    • page images from books or articles
    • multiple channels in digital audio files
    • related sound and text files (multimedia)
    • statistical dataset and codebook
  – objects cannot stand alone
    • multiple files required to interpret the object
    • requires structural metadata to model
Structural metadata
• Maps physical files (digital assets) to logical items (complex digital objects)
• Examples
  – Scanned print material
    • complex publication structures (e.g. journal runs)
    • ordered relationship between digital page images
  – A/V material
    • multiple resolutions of an image
    • multiple channels of an audio file
Structural metadata
• Examples, continued
  – Multimedia presentations
    • relationship between images, text, sound, video, etc. (time-based or other)
  – Web sites
    • linkages between web pages
    • sitemaps
  – Databases
    • table models and ER diagrams
Digital (Library) Objects
• Also have other (non-structural) metadata
  – descriptive
    • MARC, DC, FGDC, VRA core, other ontologies
  – administrative
    • rights, provenance
  – technical
    • format details, OAIS “representation information”
• Standards exist or are emerging for these
Part Two: Introduction to METS
METS Scope
• Supports
  – Structural metadata
    • complex reformatted or born digital objects
  – Metadata wrapper framework
    • descriptive, administrative, structural, etc.
    • structural required
    • others use namespaces to reference “extension schemas”
Brief History
• 1997-2001 Making Of America II project
  – Funded by DLF and NEH
  – Included Berkeley, Cornell, NYPL, Penn State, Stanford, U of Michigan
  – Designed for scanned archival collections
  – SGML DTD included pre-defined descriptive, administrative, structural metadata
• February 2001 DLF workshop on structural metadata produced METS framework
METS metadata “buckets”
• METS Header (optional)
• Descriptive metadata (optional)
• Administrative metadata (optional)
• File Inventory (optional)
• Structure map (required)
• Behavioral metadata (optional)
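The buckets can be pictured as the top-level children of a single METS document. The following is a minimal sketch, not a schema-valid record: the element names (`metsHdr`, `dmdSec`, `amdSec`, `fileSec`, `structMap`, `behaviorSec`) and the namespace URI reflect the published METS schema as I understand it, the sections are left empty for brevity, and the helper function names are invented for illustration.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"  # METS namespace URI

# A skeleton document: every bucket appears once, but only structMap is mandatory.
SKELETON = f"""
<mets xmlns="{METS_NS}">
  <metsHdr/>
  <dmdSec ID="dmd1"/>
  <amdSec/>
  <fileSec/>
  <structMap>
    <div LABEL="Whole object"/>
  </structMap>
  <behaviorSec/>
</mets>
"""

def buckets_present(xml_text):
    """Return the local names of the top-level sections, in document order."""
    root = ET.fromstring(xml_text)
    return [child.tag.split("}", 1)[1] for child in root]

def has_required_structmap(xml_text):
    """True if at least one structMap (the only required bucket) is present."""
    root = ET.fromstring(xml_text)
    return root.find(f"{{{METS_NS}}}structMap") is not None

sections = buckets_present(SKELETON)
print(sections)
print(has_required_structmap(SKELETON))
```

A check like `has_required_structmap` is only a convenience; real validation would be done against the METS XML Schema itself.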
METS metadata
• XML “extension schemas”
  – descriptive metadata
    • Dublin Core, MARC, FGDC, VRA, etc.
    • Berkeley’s GDM schema (from MOA2)
  – administrative/technical metadata
    • NISO image technical metadata
    • LC schemas for A/V technical metadata
    • Rights metadata (e.g. PRISM, XrML, etc.)
    • Provenance metadata
METS Descriptive Metadata Section
[Diagram: Descriptive Metadata contains a Metadata Reference and a Metadata Wrapper]
Metadata Reference (mdRef): A link to external descriptive metadata. The type of link (URN, Handle, etc.) is included as an attribute, as is the metadata type.
Metadata Wrapper (mdWrap): Included descriptive metadata, as either binary data (Base64 encoded) or arbitrary XML using the namespace mechanism. The metadata type is specified as an attribute.
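To make the reference/wrapper distinction concrete, here is a small sketch with one descriptive section of each kind. The URN, the Dublin Core title, and the function name are invented for illustration; the attribute names (`LOCTYPE`, `MDTYPE`, `xlink:href`), the `xmlData` container, and the XLink namespace URI are my understanding of the METS schema and should be checked against the schema version in use.

```python
import xml.etree.ElementTree as ET

METS = "http://www.loc.gov/METS/"
XLINK = "http://www.w3.org/1999/xlink"      # assumed XLink namespace
DC = "http://purl.org/dc/elements/1.1/"     # Dublin Core element set

DMD_EXAMPLE = f"""
<mets xmlns="{METS}" xmlns:xlink="{XLINK}">
  <!-- mdRef: the record lives elsewhere; only a typed link is stored -->
  <dmdSec ID="dmd1">
    <mdRef LOCTYPE="URN" MDTYPE="MARC" xlink:href="urn:x-example:record42"/>
  </dmdSec>
  <!-- mdWrap: the record travels inside the METS document -->
  <dmdSec ID="dmd2">
    <mdWrap MDTYPE="DC">
      <xmlData>
        <dc:title xmlns:dc="{DC}">Oral history interview</dc:title>
      </xmlData>
    </mdWrap>
  </dmdSec>
  <structMap><div/></structMap>
</mets>
"""

def describe_dmd_sections(xml_text):
    """Return (section ID, 'external' or 'embedded', metadata type) tuples."""
    root = ET.fromstring(xml_text)
    found = []
    for sec in root.findall(f"{{{METS}}}dmdSec"):
        ref = sec.find(f"{{{METS}}}mdRef")
        wrap = sec.find(f"{{{METS}}}mdWrap")
        if ref is not None:
            found.append((sec.get("ID"), "external", ref.get("MDTYPE")))
        if wrap is not None:
            found.append((sec.get("ID"), "embedded", wrap.get("MDTYPE")))
    return found

dmd_summary = describe_dmd_sections(DMD_EXAMPLE)
print(dmd_summary)
```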
METS Administrative Metadata Section
[Diagram: Administrative Metadata contains Technical Metadata, IP Rights Metadata, Source Metadata, and Preservation Metadata]
Technical Metadata (techMD): technical metadata regarding content files
IP Rights Metadata (rightsMD): rights metadata regarding content files or primary source material
Source Metadata (sourceMD): provenance information for content files
Preservation Metadata (digiprovMD, “digital provenance”): metadata to assist in preservation of digital content
All sections use generic metadata reference and wrapper subelements.
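A sketch of how the four administrative subsections might be attached to a content file follows. It assumes, per my reading of the METS schema, that each subsection carries an `ID`, that a `file` lists the relevant IDs in a space-separated `ADMID` attribute, and that the preservation bucket is spelled `digiprovMD`. The URNs and the function name are hypothetical.

```python
import xml.etree.ElementTree as ET

METS = "http://www.loc.gov/METS/"
XLINK = "http://www.w3.org/1999/xlink"  # assumed XLink namespace

AMD_EXAMPLE = f"""
<mets xmlns="{METS}" xmlns:xlink="{XLINK}">
  <amdSec>
    <techMD ID="tech1">
      <mdRef LOCTYPE="URN" MDTYPE="OTHER" xlink:href="urn:x-example:tech1"/>
    </techMD>
    <rightsMD ID="rights1">
      <mdRef LOCTYPE="URN" MDTYPE="OTHER" xlink:href="urn:x-example:rights1"/>
    </rightsMD>
    <sourceMD ID="src1">
      <mdRef LOCTYPE="URN" MDTYPE="OTHER" xlink:href="urn:x-example:src1"/>
    </sourceMD>
    <digiprovMD ID="prov1">
      <mdRef LOCTYPE="URN" MDTYPE="OTHER" xlink:href="urn:x-example:prov1"/>
    </digiprovMD>
  </amdSec>
  <fileSec>
    <fileGrp>
      <file ID="f1" MIMETYPE="audio/x-wav" ADMID="tech1 rights1"/>
    </fileGrp>
  </fileSec>
  <structMap><div/></structMap>
</mets>
"""

def admin_sections_for(xml_text, file_id):
    """Resolve a file's ADMID references to (section ID, section kind) pairs."""
    root = ET.fromstring(xml_text)
    kinds = {}
    for sec in root.find(f"{{{METS}}}amdSec"):
        kinds[sec.get("ID")] = sec.tag.split("}", 1)[1]
    for f in root.iter(f"{{{METS}}}file"):
        if f.get("ID") == file_id:
            return [(ref, kinds[ref]) for ref in f.get("ADMID", "").split()]
    return []

admin_for_f1 = admin_sections_for(AMD_EXAMPLE, "f1")
print(admin_for_f1)
```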
METS File Inventory
[Diagram: the File Inventory (a File Group) nests further File Groups and Files, and so on]
File Group (fileGrp): provides a mechanism for hierarchically subdividing physical files, for example by type
File (file): provides a pointer to an external file (FLocat) or includes file content internally (FContent) in Base64 encoding
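The nesting of file groups can be sketched as a recursive walk. In this illustration the group IDs, file IDs, and example.org URLs are made up, and the location is read from an `xlink:href` attribute on `FLocat`, which is how I understand the published schema to carry it.

```python
import xml.etree.ElementTree as ET

METS = "http://www.loc.gov/METS/"
XLINK = "http://www.w3.org/1999/xlink"  # assumed XLink namespace

FILESEC = f"""
<mets xmlns="{METS}" xmlns:xlink="{XLINK}">
  <fileSec>
    <fileGrp ID="all">
      <fileGrp ID="masters">
        <file ID="f1" MIMETYPE="image/tiff">
          <FLocat LOCTYPE="URL" xlink:href="http://example.org/page1.tif"/>
        </file>
      </fileGrp>
      <fileGrp ID="thumbnails">
        <file ID="f2" MIMETYPE="image/jpeg">
          <FLocat LOCTYPE="URL" xlink:href="http://example.org/page1.jpg"/>
        </file>
      </fileGrp>
    </fileGrp>
  </fileSec>
  <structMap><div/></structMap>
</mets>
"""

def inventory(group, path=()):
    """Walk nested fileGrp elements; yield (group path, file ID, MIME type, location)."""
    here = path + (group.get("ID"),)
    for f in group.findall(f"{{{METS}}}file"):
        loc = f.find(f"{{{METS}}}FLocat")
        href = loc.get(f"{{{XLINK}}}href") if loc is not None else None
        yield "/".join(here), f.get("ID"), f.get("MIMETYPE"), href
    for sub in group.findall(f"{{{METS}}}fileGrp"):
        yield from inventory(sub, here)

inventory_root = ET.fromstring(FILESEC)
top_group = inventory_root.find(f"{{{METS}}}fileSec/{{{METS}}}fileGrp")
rows = list(inventory(top_group))
for row in rows:
    print(row)
```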
METS Structural Map
[Diagram: the Structural Map contains a Division; each Division may contain METS Pointers, File Pointers, and further Divisions, and so on]
The Structural Map provides a tree structure describing the original document. Each division (div) element is a node in that tree, and can identify content files associated with that division by a METS Pointer (mptr) or a File Pointer (fptr).
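The tree described here can be traversed depth-first to produce an outline. The sketch below uses an invented two-chapter book: one chapter points at files through `fptr`, the other delegates to a separate METS document through `mptr`. The `FILEID` attribute name, the labels, and the example.org URL are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

METS = "http://www.loc.gov/METS/"
XLINK = "http://www.w3.org/1999/xlink"  # assumed XLink namespace

STRUCT = f"""
<mets xmlns="{METS}" xmlns:xlink="{XLINK}">
  <structMap TYPE="logical">
    <div TYPE="book" LABEL="Sample book">
      <div TYPE="chapter" LABEL="Chapter 1">
        <div TYPE="page" LABEL="Page 1"><fptr FILEID="f1"/></div>
        <div TYPE="page" LABEL="Page 2"><fptr FILEID="f2"/></div>
      </div>
      <div TYPE="chapter" LABEL="Chapter 2">
        <mptr LOCTYPE="URL" xlink:href="http://example.org/chapter2-mets.xml"/>
      </div>
    </div>
  </structMap>
</mets>
"""

def outline(div, depth=0):
    """Yield (depth, label, file IDs, external METS links) for each div, depth-first."""
    files = [p.get("FILEID") for p in div.findall(f"{{{METS}}}fptr")]
    links = [p.get(f"{{{XLINK}}}href") for p in div.findall(f"{{{METS}}}mptr")]
    yield depth, div.get("LABEL"), files, links
    for child in div.findall(f"{{{METS}}}div"):
        yield from outline(child, depth + 1)

struct_root = ET.fromstring(STRUCT)
top_div = struct_root.find(f"{{{METS}}}structMap/{{{METS}}}div")
nodes = list(outline(top_div))
for depth, label, files, links in nodes:
    print("  " * depth + label, files, links)
```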
METS Pointer and File Pointer
METS Pointer (mptr): xlink to another METS file containing the content for the associated div. Useful for breaking up large objects (e.g., a journal run) into a series of smaller METS documents.
File Pointer (fptr): Identifies one or more entries in the File Inventory section containing the content for the associated div element. Can also limit the link from a div element to a portion of a content file (e.g., a segment of an audio or video file, a subarea of an image or video file, etc.).
METS File Pointer Mechanisms
[Diagram: a File Pointer contains Parallel Files or Sequential Files, each of which contains Areas]
File Pointer (fptr): Can identify a single file in the File Inventory using ID/IDREF linking
Parallel/Sequential (par/seq): Allows a div to be associated with several content files that should be played/displayed in parallel (e.g. video with a separate audio track file) or sequentially.
Area (area): identifies a point, linear segment, or 2D area within a content file that corresponds with the associated div element.
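The three mechanisms can be read off a div as a simple "playback plan". This is an illustrative sketch only: the file IDs and the function name are invented, and the `area` attribute is written `FILEID` as I understand the published schema to spell it, whereas these slides write `FILE`; adjust to the schema version in use.

```python
import xml.etree.ElementTree as ET

METS = "http://www.loc.gov/METS/"

FPTR_EXAMPLE = f"""
<div xmlns="{METS}" LABEL="Interview, part 1">
  <fptr>
    <par>
      <area FILEID="video1"/>
      <area FILEID="audio1"/>
    </par>
  </fptr>
  <fptr>
    <seq>
      <area FILEID="page1"/>
      <area FILEID="page2"/>
    </seq>
  </fptr>
  <fptr FILEID="transcript1"/>
</div>
"""

def playback_plan(div):
    """Return (mode, file IDs) pairs: 'single', 'par' (together) or 'seq' (in order)."""
    plan = []
    for fptr in div.findall(f"{{{METS}}}fptr"):
        if fptr.get("FILEID"):
            # Direct ID/IDREF link to exactly one file
            plan.append(("single", [fptr.get("FILEID")]))
        for mode in ("par", "seq"):
            for group in fptr.findall(f"{{{METS}}}{mode}"):
                ids = [a.get("FILEID") for a in group.findall(f"{{{METS}}}area")]
                plan.append((mode, ids))
    return plan

plan = playback_plan(ET.fromstring(FPTR_EXAMPLE))
print(plan)
```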
METS Area Element Attributes
FILE: ID for File element in File Inventory
SHAPE: As in HTML Area element
COORDS: As in HTML Area element
BEGIN: A start point within a file for defining a segment
END: An end point within a file for defining a segment
BETYPE: Begin/End type: IDREF, Byte Offset, or SMPTE time code
EXTENT: Length/duration of segment
EXTYPE: Extent type: Bytes or SMPTE
Structure Example
<!-- File Inventory entry: the audio file -->
<file ID="f1" MIMETYPE="audio/x-wav" SEQ="1">
  <FLocat LOCTYPE="URN">urn:x-nyu:violet42</FLocat>
</file>
<!-- Structural map division pointing at one segment of file f1 -->
<div N="5" LABEL="Question 5">
  <fptr>
    <seq>
      <area FILE="f1" BEGIN="00:23:17:00" END="00:23:38:00" BETYPE="SMPTE"/>
    </seq>
  </fptr>
</div>
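As a worked illustration of the BEGIN/END values in this example, the sketch below converts the two SMPTE time codes to seconds and computes the length of the segment. It assumes the HH:MM:SS:FF form and a non-drop-frame rate of 30 frames per second; the frame rate is my assumption, not something the slide states, and it does not affect this result because both frame fields are 00.

```python
def smpte_to_seconds(timecode, fps=30):
    """Convert HH:MM:SS:FF to seconds, assuming a non-drop-frame rate of `fps`."""
    hours, minutes, seconds, frames = (int(part) for part in timecode.split(":"))
    return hours * 3600 + minutes * 60 + seconds + frames / fps

begin = smpte_to_seconds("00:23:17:00")
end = smpte_to_seconds("00:23:38:00")
duration = end - begin
print(begin, end, duration)
```

Under these assumptions the div labelled “Question 5” covers a 21-second stretch of the audio file.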
Related standards: SMIL (W3C), MPEG-7 (ISO)
• Created for multimedia structural encoding
• SMIL has “time-based” orientation
  – for playing multimedia presentations
• Very complex
• May eventually be incorporated
Related standards: RDF (W3C)
• Also metadata wrapper framework
• Structural metadata could be supported, but doesn’t specify how…
• Opaque to use
  – No element semantics provided
  – element names deliberately meaningless
• Originally designed for descriptive metadata
Related standards: OAIS framework
METS and OAIS framework
• Submission Information Package (SIP)
  – METS as transfer syntax
• Dissemination Information Package (DIP)
  – METS as transfer syntax
  – METS as input to display applications
• Archival Information Package (AIP)
  – METS stored internally in an archive
Library Applications
• Digital Object transfer syntax
  – between systems
    • enables interoperability
  – between institutions
    • enables collection sharing
  – implements OAIS SIP/DIP/AIP
Library Applications
• Input to Digital Object delivery systems (aka “disseminators”)
  – Simple bit-streaming
  – XSL stylesheet
  – Custom program for complex digital object display
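A “custom program” of the kind listed above could start as little more than a join between the structural map and the file inventory. The following is a hypothetical sketch, not a description of any existing disseminator: it lists page locations in structural-map order regardless of the order in which files are inventoried. The document, URLs, function name, and the `FILEID` / `xlink:href` attribute names are assumptions.

```python
import xml.etree.ElementTree as ET

METS = "http://www.loc.gov/METS/"
XLINK = "http://www.w3.org/1999/xlink"  # assumed XLink namespace

PAMPHLET = f"""
<mets xmlns="{METS}" xmlns:xlink="{XLINK}">
  <fileSec>
    <fileGrp>
      <file ID="f2" MIMETYPE="image/tiff">
        <FLocat LOCTYPE="URL" xlink:href="http://example.org/p2.tif"/>
      </file>
      <file ID="f1" MIMETYPE="image/tiff">
        <FLocat LOCTYPE="URL" xlink:href="http://example.org/p1.tif"/>
      </file>
    </fileGrp>
  </fileSec>
  <structMap>
    <div LABEL="Pamphlet">
      <div LABEL="Page 1"><fptr FILEID="f1"/></div>
      <div LABEL="Page 2"><fptr FILEID="f2"/></div>
    </div>
  </structMap>
</mets>
"""

def page_sequence(xml_text):
    """Return (div label, file location) pairs in structural-map order."""
    root = ET.fromstring(xml_text)
    # Index the file inventory: file ID -> location
    locations = {}
    for f in root.iter(f"{{{METS}}}file"):
        loc = f.find(f"{{{METS}}}FLocat")
        if loc is not None:
            locations[f.get("ID")] = loc.get(f"{{{XLINK}}}href")
    # Walk divs in document order and resolve each file pointer
    pages = []
    for div in root.iter(f"{{{METS}}}div"):
        for fptr in div.findall(f"{{{METS}}}fptr"):
            pages.append((div.get("LABEL"), locations.get(fptr.get("FILEID"))))
    return pages

pages = page_sequence(PAMPHLET)
print(pages)
```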
Part Three: METS Summary
METS summary
• Descriptive/technical/administrative metadata
  – not defined internally
  – points to external standard schemas
    • Dublin Core, MARC, MPEG-7, etc.
    • AES audio metadata
  – set of “best practice” schemas being identified
METS summary
• Structural metadata
  – defined internally and required
  – SMIL-lite
    • simple support for multimedia, audio/visual
    • SMIL may replace eventually
METS summary
• Current users include
  – UC Berkeley (archival collections)
  – Harvard (scanned print publications, e-journals)
  – Library of Congress (audio/visual collections)
  – British Library
  – RLG and OCLC
  – EU METAe project (historic newspapers)
  – Michigan State (oral history collections)
  – Univ of Virginia (FEDORA digital objects)
  – more daily...
METS summary
• Tools under development for
  – metadata capture
  – transformation
  – transfer
  – dissemination/display
• Profiles necessary for interoperation
  – Which extension schemas used?
  – How structure maps are organized…
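One way to picture the first profile question is a check of which metadata types a document actually uses against a list that partners have agreed on. The sketch below is purely illustrative: the “profile” is a plain Python set rather than any registered profile format, the URNs and function names are invented, and `MDTYPE` is the attribute I understand `mdRef` and `mdWrap` to use for naming the extension schema.

```python
import xml.etree.ElementTree as ET

METS = "http://www.loc.gov/METS/"
XLINK = "http://www.w3.org/1999/xlink"  # assumed XLink namespace

PROFILE_DOC = f"""
<mets xmlns="{METS}" xmlns:xlink="{XLINK}">
  <dmdSec ID="dmd1">
    <mdWrap MDTYPE="DC"><xmlData/></mdWrap>
  </dmdSec>
  <dmdSec ID="dmd2">
    <mdRef LOCTYPE="URN" MDTYPE="MARC" xlink:href="urn:x-example:marc1"/>
  </dmdSec>
  <amdSec>
    <techMD ID="tech1">
      <mdRef LOCTYPE="URN" MDTYPE="NISOIMG" xlink:href="urn:x-example:tech1"/>
    </techMD>
  </amdSec>
  <structMap><div/></structMap>
</mets>
"""

def metadata_types(xml_text):
    """Collect every MDTYPE declared on mdRef and mdWrap elements."""
    root = ET.fromstring(xml_text)
    types = set()
    for tag in ("mdRef", "mdWrap"):
        for el in root.iter(f"{{{METS}}}{tag}"):
            types.add(el.get("MDTYPE"))
    return types

def profile_violations(xml_text, allowed):
    """Return, sorted, the metadata types used but not permitted by the profile."""
    return sorted(metadata_types(xml_text) - set(allowed))

used_types = metadata_types(PROFILE_DOC)
violations = profile_violations(PROFILE_DOC, allowed={"DC", "NISOIMG"})
print(sorted(used_types), violations)
```

The second question, how structure maps are organized, would need a comparable agreement on `div` types and nesting, which this sketch does not attempt.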
METS summary
• Current status
  – version 1.3 available from LC
  – editorial board in place
  – LC standards office as maintenance agency
  – DLF and RLG underwriting
    • RLG will host editorial board, offer documentation and training, develop tools
  – Several extension schemas available
  – Opening Day in October 2004
METS summary
• METS is not all things to all people…
  – Designed for local institutional application support
    • Solving an immediate local problem
    • Common to many institutions
    • Flexible framework supports many institutional situations
  – Profiling necessary to interoperate
    • For OAIS packages
    • For shared tools
    • For other kinds of interoperation (e.g. cross-repository search)