METS: An Introduction, Part II: METS Mechanisms
What is METS?
• An XML-based standard for encoding “hub” documents for materials whose content is digital.
– XML is a markup language like SGML.
– A hub document draws together dispersed but related digital files and content
– METS uses XML to provide a vocabulary and syntax for identifying the digital pieces that together comprise a digital entity, for specifying the location of these pieces, and for expressing the relationships between these digital pieces
What is XML?
• Stands for Extensible Markup Language
• Markup Language like SGML (of which HTML is a flavor)
• Intended to serve many of the same purposes as SGML, only better
XML: Key Vocabulary & Concepts
1. Elements and Attributes
2. Pattern and Instance Documents
3. Namespaces
Elements and Attributes
• XML documents consist of a hierarchically arranged sequence of elements.
• An element consists of:
– Element tag delimited by angle brackets. The tag contains:
  • Element name
  • Element attributes: [attribute name]='[attribute value]'
– Element content or value. Elements don't always have a value; they may simply have a structural purpose.
– Nested elements: an element can contain other elements
– Element close tag: </[element name]>
Element Example 1
<metsHdr CREATEDATE="2001-10-23T00:00:00">
<agent ROLE="CREATOR">
<name>Rick Beaubien</name>
</agent>
</metsHdr>
Element Example 2
<structMap>
<div TYPE="QUAD15" LABEL="San Francisco Quad">
<fptr FILEID="FID1"/>
<fptr FILEID="FID20"/>
<div TYPE="map" LABEL="1895" DMDID="DM2">
<fptr FILEID="FID2"/>
<fptr FILEID="FID14"/>
<fptr FILEID="FID8"/>
</div>
</div>
</structMap>
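As a minimal sketch of how the element/attribute vocabulary above maps onto a parsed document, the following reads Element Example 1 with Python's standard `xml.etree.ElementTree` module. The choice of Python and the variable names are assumptions made for illustration; the slides do not prescribe any tooling.

```python
import xml.etree.ElementTree as ET

# Element Example 1 from the slide, unchanged apart from whitespace.
SNIPPET = """
<metsHdr CREATEDATE="2001-10-23T00:00:00">
  <agent ROLE="CREATOR">
    <name>Rick Beaubien</name>
  </agent>
</metsHdr>
""".strip()

root = ET.fromstring(SNIPPET)   # the <metsHdr> element
agent = root.find("agent")      # nested element, purely structural
name = agent.find("name")       # nested element that carries a value

# Attributes live on the open tag; content lives between open and close tags.
print(root.tag, root.attrib["CREATEDATE"])
print(agent.tag, agent.get("ROLE"))
print(name.tag, name.text)
```

Here `<agent>` has no text value of its own beyond whitespace; it exists only to group `<name>`, which is the "structural purpose" case described above.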
Patterns and Instances
• Two main categories of documents in XML
– “Pattern” or “rules” document: specifies the vocabulary and syntax to which a particular type of XML instance document must adhere
– “Instance” document:
  • Follows the rules specified in its governing pattern document
  • Uses these rules to instantiate a particular digital entity
Pattern Types
• Two main types of pattern documents
– DTDs: Document Type Definition
  • Carryover from SGML
  • DTDs are not expressed in XML at all
  • Example: MOA2.DTD
– Schemas
  • Can express the patterns or rules governing a particular document type
    – Can also just define a set of attributes or elements that are intended for use in a variety of other contexts
  • Schemas are themselves XML documents
  • Controlled by a DTD
  • Example: METS.xsd
What do Schemas and DTDs Specify?
• Element level
– Names
– Namespaces
– Sequence/nesting
– Data types of content
– Required/optional/repeatable
– Attributes
• Attribute level
– Names
– Data types
– Namespace
XML: Namespaces
• Each XML Schema “pattern” document can create a unique target “namespace” that will be associated with it.
• Elements/attributes defined in the schema are said to belong to the declared target namespace.
• A schema can reference elements and attributes from external namespaces, and allow them to be used in specific contexts in the instance documents it governs.
– Specific elements/attributes from specific namespaces
– Any element from any namespace
Incorporating Specific Elements
• Schema can provide for use of specific elements or attributes from external namespaces in specific contexts
• Elements/attributes from external namespaces must be preceded by a prefix identifying the namespace, followed by a “:”. Elements from the primary namespace may also include a namespace prefix. Example (from METS instance document):
<METS:file ID="FID1">
  <METS:FLocat LOCTYPE="URL" xlink:href="http://sunsite.berkeley.edu/brk10a.jpg"/>
</METS:file>
Allowing Any External Element
• Schema can provide for use of any element from any external namespace in specific contexts
• Examples (from METS instance documents):
<METS:dmdSec ID="DM2">
  <METS:mdWrap MDTYPE="OTHER">
    <METS:xmlData>
      <gdm:gdm>
        <gdm:title>[Patrick Breen Diary]</gdm:title>
        <gdm:creator>Breen, Patrick</gdm:creator>
      </gdm:gdm>
    </METS:xmlData>
  </METS:mdWrap>
</METS:dmdSec>
Allowing Any External Element (cont’d)
<METS:dmdSec ID="DM3">
  <METS:mdWrap MDTYPE="OTHER">
    <METS:xmlData>
      <dc:dc>
        <dc:title>[Patrick Breen Diary]</dc:title>
        <dc:creator>Breen, Patrick</dc:creator>
      </dc:dc>
    </METS:xmlData>
  </METS:mdWrap>
</METS:dmdSec>
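The slide examples omit the `xmlns:` declarations that bind each prefix to a namespace URI, so they will not parse on their own. A minimal sketch of a parseable version follows, assuming the METS namespace `http://www.loc.gov/METS/` and the Dublin Core elements namespace `http://purl.org/dc/elements/1.1/`; neither URI appears in the slides, and the prefix-to-URI mapping `NS` is a name chosen here for illustration.

```python
import xml.etree.ElementTree as ET

# Prefix-to-URI map used for lookups. The URIs are assumptions: the slides
# show only the prefixes (METS:, dc:), not the namespace declarations.
NS = {
    "METS": "http://www.loc.gov/METS/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

DOC = """<METS:dmdSec xmlns:METS="http://www.loc.gov/METS/"
             xmlns:dc="http://purl.org/dc/elements/1.1/" ID="DM3">
  <METS:mdWrap MDTYPE="OTHER">
    <METS:xmlData>
      <dc:dc>
        <dc:title>[Patrick Breen Diary]</dc:title>
        <dc:creator>Breen, Patrick</dc:creator>
      </dc:dc>
    </METS:xmlData>
  </METS:mdWrap>
</METS:dmdSec>"""

dmd = ET.fromstring(DOC)

# The parser stores each name as "{namespace-uri}local-name"; the prefix
# is only shorthand inside the serialized document.
print(dmd.tag)

title = dmd.find("METS:mdWrap/METS:xmlData/dc:dc/dc:title", NS)
creator = dmd.find(".//dc:creator", NS)
print(title.text, "/", creator.text)
```

The point of the sketch is that elements from two namespaces coexist in one instance document, and a consumer selects them by namespace URI rather than by the literal prefix text.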
Intro to XML: Conclusion
• Key vocabulary and concepts:
– Building blocks: elements & attributes
– Controls: schemas, DTDs & instance documents
– Mix and match: namespaces
• Limits of presentation:
– XML presentation very crude & restricted
– Not covered: how to read or create XML Schemas
– Examples all from METS instance documents. We will not look at the METS schema itself, just at what it specifies.
Building a METS Document: The Framework
<METS:mets>
<METS:metsHdr /> Header
<METS:dmdSec /> Descriptive MD
<METS:amdSec /> Administrative MD
<METS:fileSec /> File list
<METS:structMap /> Structural Map
<METS:behaviorSec /> Behavior Section
</METS:mets>
METS Diagrammed
[Diagram: the METS sections grouped by function. Structure: structMap (div). Content: fileSec (fileGrp, file). Administrative Md: amdSec (techMD, sourceMD, digiprovMD, rightsMD). Descriptive Md: dmdSec. Behavior: behaviorSec.]
Building a METS Document: 5 key aspects
1. Expressing the Structure
2. Linking Structure with Content
3. Linking Structure with Descriptive Metadata
4. Linking Structure and Content Files with Administrative metadata
5. Not covered: Linking behaviors with structures.
Building a METS Document: Aspect 1
1. Expressing the Structure
a. Key elements:
   i. <structMap>: structure is expressed in the context of a <structMap> element.
      a) <div>
         i. Structure expressed through hierarchy of <div> elements
         ii. <div> elements can be nested to any depth
Expressing structure: Add <div>s
<METS:mets TYPE="diary" LABEL="Breen Diary">
  <METS:dmdSec />
  <METS:amdSec />
  <METS:fileSec />
  <METS:structMap TYPE="physical">
    <METS:div ORDER="1" TYPE="diary" LABEL="Breen Diary">
      <METS:div ORDER="1" TYPE="page" LABEL="Page 1" />
      <METS:div ORDER="2" TYPE="page" LABEL="Page 2" />
      …
    </METS:div>
  </METS:structMap>
</METS:mets>
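To illustrate how a nested `<div>` hierarchy expresses structure, the following sketch walks a structMap and produces a table-of-contents outline, ordering siblings by their ORDER attribute. It assumes the METS namespace URI `http://www.loc.gov/METS/` (not shown in the slides); the function name `outline` is hypothetical, and the sample deliberately lists the pages out of order to show ORDER at work.

```python
import xml.etree.ElementTree as ET

NS = {"METS": "http://www.loc.gov/METS/"}   # assumed namespace URI
DIV = "{http://www.loc.gov/METS/}div"       # fully qualified <div> tag

# Trimmed version of the slide example; only the structMap is kept.
DOC = """<METS:mets xmlns:METS="http://www.loc.gov/METS/" TYPE="diary" LABEL="Breen Diary">
  <METS:structMap TYPE="physical">
    <METS:div ORDER="1" TYPE="diary" LABEL="Breen Diary">
      <METS:div ORDER="2" TYPE="page" LABEL="Page 2"/>
      <METS:div ORDER="1" TYPE="page" LABEL="Page 1"/>
    </METS:div>
  </METS:structMap>
</METS:mets>"""


def outline(div, depth=0):
    """Return (depth, TYPE, LABEL) rows for a <div> and all its descendants."""
    rows = [(depth, div.get("TYPE"), div.get("LABEL"))]
    # Siblings are sorted by ORDER rather than by document position.
    children = sorted(div.findall(DIV), key=lambda d: int(d.get("ORDER", "0")))
    for child in children:
        rows.extend(outline(child, depth + 1))
    return rows


root_div = ET.fromstring(DOC).find("METS:structMap/METS:div", NS)
toc = outline(root_div)
for depth, div_type, label in toc:
    print("  " * depth + f"{label} ({div_type})")
```

Because `outline` recurses, it handles `<div>` elements nested to any depth, which matches the point that the root `<div>` stands for the whole object and each child for a segment of it.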
<structMap> Element
• Each <structMap> expresses a structure for the digital entity represented
– A METS object may contain more than one <structMap>
• Attributes:
– TYPE: logical, physical, or other
– LABEL: clarifies the purpose of the structMap (type of structure) to the user
<div> Element (structMap)
• Each <div> represents a logical or physical segment of the digital entity represented. Root <div> represents entire object.
• Attributes:
– ORDER: order among siblings
– ORDERLABEL: string representation of ORDER
– LABEL: identifies div to end user (as part of TOC)
– TYPE: type of division (chapter, page, entry, photograph, etc).
Building a METS object: Aspect 2
2. Linking Structure with Content
a) Key elements and attributes:
   i. <fptr>: links <div>s with <file> element(s) in the <fileSec> via the FILEID attribute or via
      a. <area>: points to a segment within a <file>
      b. <seq>: points to files that must be played in sequence
      c. <par>: points to files that must be played in parallel
   ii. <mptr>: links a <div> with an independent, external METS object via a URI
   iii. <file>: element in the <fileSec> that points to a content file and/or itself contains the file contents. Links to an external file via
      a. <FLocat>: points via URI to an external content file
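Linking Structure with Content
[Diagram: Structure and Content side by side. Under Structure, a structMap <div> holds <fptr> and <mptr>; an <fptr> may hold an <area>, a <seq> of <area>s, or a <par> of <area>s. Under Content, fileSec holds fileGrp, file, and FLocat.]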
Linking in Simple Content 1
<METS:mets>
  <METS:fileSec>
    <METS:fileGrp VERSDATE="2000-08-22T06:32:00">
      <METS:file ID="FID3" MIMETYPE="image/gif">
        <METS:FLocat LOCTYPE="URL" xlink:href="http:…" />
      </METS:file>
    </METS:fileGrp>
  </METS:fileSec>
  <METS:structMap TYPE="physical">
    <METS:div ORDER="1" TYPE="diary" LABEL="Breen Diary">
      <METS:div ORDER="1" TYPE="page" LABEL="Page 1">
        <METS:fptr FILEID="FID3" />
        <METS:fptr FILEID="FID35" />
      </METS:div>
      <METS:div ORDER="2" TYPE="page" LABEL="Page 2" />
    </METS:div>
  </METS:structMap>
</METS:mets>
Linking in Simple Content 2
<METS:fileSec>
  <METS:fileGrp VERSDATE="2000-08-22T06:32:00">
    <METS:file ID="FID3" MIMETYPE="image/gif">
      <METS:FLocat LOCTYPE="URL" xlink:href="http:…" />
    </METS:file>
  </METS:fileGrp>
  <METS:fileGrp VERSDATE="2000-08-22T07:32:00">
    <METS:file ID="FID35" MIMETYPE="image/jpg">
      <METS:FLocat LOCTYPE="URL" xlink:href="http:…" />
    </METS:file>
  </METS:fileGrp>
</METS:fileSec>
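The FILEID link between a `<div>`'s `<fptr>` elements and the `<file>` elements in the `<fileSec>` can be sketched as a simple ID lookup. This is an illustration only: the namespace URIs are assumed (the slides do not show them), the `example.org` URLs are hypothetical stand-ins for the elided `http:…` values, and `manifestations` is a made-up helper name.

```python
import xml.etree.ElementTree as ET

METS = "http://www.loc.gov/METS/"          # assumed METS namespace URI
XLINK = "http://www.w3.org/1999/xlink"     # assumed XLink namespace URI
NS = {"METS": METS}

# The two slide fragments combined. The example.org URLs are hypothetical.
DOC = f"""<METS:mets xmlns:METS="{METS}" xmlns:xlink="{XLINK}">
  <METS:fileSec>
    <METS:fileGrp VERSDATE="2000-08-22T06:32:00">
      <METS:file ID="FID3" MIMETYPE="image/gif">
        <METS:FLocat LOCTYPE="URL" xlink:href="http://example.org/page1.gif"/>
      </METS:file>
    </METS:fileGrp>
    <METS:fileGrp VERSDATE="2000-08-22T07:32:00">
      <METS:file ID="FID35" MIMETYPE="image/jpg">
        <METS:FLocat LOCTYPE="URL" xlink:href="http://example.org/page1.jpg"/>
      </METS:file>
    </METS:fileGrp>
  </METS:fileSec>
  <METS:structMap TYPE="physical">
    <METS:div ORDER="1" TYPE="diary" LABEL="Breen Diary">
      <METS:div ORDER="1" TYPE="page" LABEL="Page 1">
        <METS:fptr FILEID="FID3"/>
        <METS:fptr FILEID="FID35"/>
      </METS:div>
    </METS:div>
  </METS:structMap>
</METS:mets>"""

mets = ET.fromstring(DOC)

# Index every <file> by its ID so each FILEID can be resolved directly.
files_by_id = {f.get("ID"): f for f in mets.iterfind(".//METS:file", NS)}


def manifestations(div):
    """Return (MIMETYPE, URL) for each simple <fptr> directly under a <div>."""
    out = []
    for fptr in div.findall("METS:fptr", NS):
        file_el = files_by_id[fptr.get("FILEID")]
        flocat = file_el.find("METS:FLocat", NS)
        out.append((file_el.get("MIMETYPE"), flocat.get(f"{{{XLINK}}}href")))
    return out


page1 = mets.find(".//METS:div[@LABEL='Page 1']", NS)
for mimetype, url in manifestations(page1):
    print(mimetype, url)
```

Each tuple returned corresponds to one available manifestation of the page, here a GIF and a JPEG drawn from two different `<fileGrp>` versions.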
<fptr> Element (structMap.div)
• <div> element will contain an <fptr> element for each available manifestation of the <div>: thumbnail, med-res jpeg, hi-res jpeg, etc
• <fptr> points to associated content <file> or <file>s in the <fileSec>.
– In the case of simple content, it points directly to the associated content <file> via the FILEID attribute
<fileGrp> Element (fileSec)
• <file> elements are organized into <fileGrp> elements representing versions of the content.
– Example:
  • One <fileGrp> might contain master tif versions
  • One <fileGrp> might contain thumbnail versions
  • One <fileGrp> might contain medium-res jpg versions
• Attributes
– VERSDATE: ISO format date/time of creation
<file> Element (fileSec.fileGrp)
• <file> element represents a content file
• Main attributes:
– ID: required. Means for linking from <fptr> in <div>
– MIMETYPE
– SEQ
– SIZE: in bytes
– CREATED: ISO format date/time of creation
– CHECKSUM: MD5 digest value
– OWNERID: primary identifier assigned by owner
• <file> may point to an external file (via a <FLocat> element), contain the actual file contents in Base64 (via a <FContent> element), or both
<FLocat> Element (fileSec.fileGrp.file)
• <FLocat> element points to external content via its xlink:href attribute (as do all METS elements that point to external content)
• Main attributes:
– xlink:SimpleLink attributes: xlink:href, xlink:role, xlink:arcrole, xlink:title, xlink:show, xlink:actuate
– LOCTYPE attribute: specifies the kind of xlink:href provided: URN, URL, PURL, HANDLE, DOI, OTHER
Linking in Complex Content: <area>
<METS:structMap TYPE="physical">
  <METS:div ORDER="1" TYPE="diary" LABEL="Breen Diary">
    <METS:div ORDER="1" TYPE="page" LABEL="Page 1">
      <METS:fptr FILEID="FID3" />
      <METS:fptr FILEID="FID35" />
      <METS:fptr>
        <METS:area FILEID="FID1" BETYPE="IDREF" BEGIN="PAGE1" END="ENDPAGE1" />
      </METS:fptr>
    </METS:div>
    <METS:div ORDER="2" TYPE="entry" LABEL="Nov 21" />
  </METS:div>
</METS:structMap>
<area> Element (structMap.div.fptr)
• <area> element links a <div> to a segment of a content file
• <area> element provides numerous attributes for specifying an area within a file. These include:
– SHAPE (HTML 4 conventions: circ, poly, rect)
– COORDS (HTML 4 conventions)
– BEGIN
– END
– BETYPE
  • BYTE, IDREF, SMIL, MIDI, SMPTE, TIME, TCF
– EXTENT (duration)
– EXTTYPE
  • BYTE, SMIL, MIDI, SMPTE, TIME, TCF
Linking in Complex Content: <seq>
<METS:structMap TYPE="logical">
  <METS:div ORDER="1" TYPE="diary" LABEL="Breen Diary">
    <METS:div ORDER="1" TYPE="entry" LABEL="Nov 20">
      <METS:fptr>
        <METS:seq>
          <METS:area FILEID="FID2" />
          <METS:area FILEID="FID3" />
          <METS:area FILEID="FID4" />
        </METS:seq>
      </METS:fptr>
    </METS:div>
  </METS:div>
</METS:structMap>
<seq> Element (structMap.div.fptr)
• <fptr> element may link to content via a <seq> element
• <seq> element uses multiple <area> elements to identify files or parts of files that must be displayed/played in sequence to express the content of the associated <div>.
Linking in Complex Content: <par> element
<METS:div ORDER="1" TYPE="mmDiary" LABEL="Breen Diary">
  <METS:div ORDER="1" TYPE="page" LABEL="Page 1">
    <METS:fptr>
      <METS:par>
        <METS:area FILEID="FID2" /> <!-- image file -->
        <METS:area FILEID="FID33" BETYPE="TIME" BEGIN="00:00:00" END="00:01:00" /> <!-- sound file -->
      </METS:par>
    </METS:fptr>
  </METS:div>
</METS:div>
<par> Element (structMap.div.fptr)
• <fptr> element may link to content via a <par> element
• <par> element uses multiple <area> elements to identify files or parts of files that must be displayed/played in parallel to express content.
Linking in External METS object: <mptr>
<METS:structMap TYPE="logical">
  <METS:div ORDER="1" TYPE="diary" LABEL="Breen Diary">
    <METS:div ORDER="1" TYPE="entry" LABEL="Nov 20">
      <METS:fptr FILEID="FID3" />
      <METS:fptr FILEID="FID35" />
      <METS:fptr>
        <METS:area FILEID="FID1" BETYPE="IDREF" BEGIN="ENTRY1" END="ENTRYEND1" />
      </METS:fptr>
    </METS:div>
    <METS:div ORDER="2" TYPE="entry" LABEL="Nov 21" />
    …
    <METS:div ORDER="35" TYPE="letter" LABEL="Letter from …">
      <METS:mptr LOCTYPE="URL" xlink:href="http://…/l.xml" />
    </METS:div>
  </METS:div>
</METS:structMap>
<mptr> Element (structMap.div.mptr)
• A <div> in a <structMap> may want to “pass the baton” to an external METS object
• A <mptr> element is used for this purpose
• Main attributes:
– xlink:SimpleLink attributes: xlink:href, xlink:role, xlink:arcrole, xlink:title, xlink:show, xlink:actuate
– LOCTYPE attribute: specifies the kind of xlink:href provided: URN, URL, PURL, HANDLE, DOI, OTHER
Summary: Linking Structure with Content
• Structure is expressed in the <structMap> through a hierarchy of <div>s
• <div>s are linked to content by means of <fptr> elements and/or <mptr> elements
• Each <fptr> or <mptr> associated with the <div> represents a manifestation of the <div>
Summary: Linking Structure with Content (cont’d)
• <fptr> element may point to content in four ways:
– <fptr> may directly point to a <file> element in the <fileSec>
– <fptr> may contain an <area> element that points to a segment of a file in the <fileSec>
– <fptr> may contain a <seq> element. The <seq> element contains a sequence of <area> elements that point to <file>s or segments of <file>s that must be played/displayed in sequence
– <fptr> may contain a <par> element. The <par> element contains a sequence of <area> elements that point to <file>s that must be played/displayed in parallel
Summary: Linking Structure with Content (cont’d)
• <mptr> element may point to external METS object.
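The four ways an `<fptr>` can point to content can be told apart by inspecting its attributes and children. The sketch below classifies each `<fptr>` under one `<div>`; it assumes the METS namespace URI `http://www.loc.gov/METS/`, combines the FILEID values from the earlier slide examples into a single illustrative `<div>`, and uses a hypothetical helper named `describe`.

```python
import xml.etree.ElementTree as ET

METS = "http://www.loc.gov/METS/"   # assumed METS namespace URI
NS = {"METS": METS}

# One <div> holding one <fptr> of each kind (values taken from the slides).
DOC = f"""<METS:div xmlns:METS="{METS}" ORDER="1" TYPE="page" LABEL="Page 1">
  <METS:fptr FILEID="FID3"/>
  <METS:fptr>
    <METS:area FILEID="FID1" BETYPE="IDREF" BEGIN="PAGE1" END="ENDPAGE1"/>
  </METS:fptr>
  <METS:fptr>
    <METS:seq>
      <METS:area FILEID="FID2"/>
      <METS:area FILEID="FID3"/>
    </METS:seq>
  </METS:fptr>
  <METS:fptr>
    <METS:par>
      <METS:area FILEID="FID2"/>
      <METS:area FILEID="FID33" BETYPE="TIME" BEGIN="00:00:00" END="00:01:00"/>
    </METS:par>
  </METS:fptr>
</METS:div>"""


def describe(fptr):
    """Return (kind, [FILEIDs]) for one <fptr>."""
    # Case 1: direct pointer via the FILEID attribute.
    if fptr.get("FILEID") is not None:
        return ("file", [fptr.get("FILEID")])
    # Cases 3 and 4: a <seq> or <par> wrapper around several <area>s.
    for kind in ("seq", "par"):
        group = fptr.find(f"METS:{kind}", NS)
        if group is not None:
            return (kind, [a.get("FILEID") for a in group.findall("METS:area", NS)])
    # Case 2: a single <area> pointing into part of one file.
    area = fptr.find("METS:area", NS)
    if area is not None:
        return ("area", [area.get("FILEID")])
    return ("empty", [])


div = ET.fromstring(DOC)
kinds = [describe(f) for f in div.findall("METS:fptr", NS)]
for kind, file_ids in kinds:
    print(kind, file_ids)
```

A real consumer would go on to interpret BETYPE, BEGIN, and END for the `<area>` cases; the sketch stops at identifying which files are involved.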
Building a METS object: Aspect 3
3. Linking Structure with Descriptive Metadata
a) Key elements and attributes
   i. <div> element in <structMap> may link to one or more <dmdSec> elements via a DMDID attribute.
   ii. <dmdSec> may
      a. point to external descriptive metadata via a <mdRef> element
      b. itself contain descriptive metadata in an <mdWrap> element
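Linking Structure with Descriptive Metadata
[Diagram: Structure and Descriptive Md side by side. Under Structure, structMap holds div. Under Descriptive Md, one dmdSec holds mdRef and another dmdSec holds mdWrap.]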
Linking to External Descriptive Metadata: DMDID
<METS:structMap TYPE="logical">
  <METS:div ORDER="1" TYPE="diary" LABEL="Breen Diary" DMDID="DM1">
    <METS:div ORDER="1" TYPE="entry" LABEL="Nov 20">
      <METS:fptr FILEID="FID3" />
      <METS:fptr FILEID="FID35" />
      <METS:fptr>
        <METS:area FILEID="FID1" BETYPE="IDREF" BEGIN="ENTRY1" END="ENTRYEND1" />
      </METS:fptr>
    </METS:div>
    <METS:div ORDER="2" TYPE="entry" LABEL="Nov 21" />
  </METS:div>
</METS:structMap>
Linking to External Descriptive Metadata: <mdRef>
<METS:dmdSec ID="DM1">
  <METS:mdRef LOCTYPE="URL" MDTYPE="EAD" xlink:href="http://…/breen" LABEL="Finding Aid" />
</METS:dmdSec>
<mdRef> Element (dmdSec)
• <mdRef> element in the context of the <dmdSec> points to external descriptive metadata (finding aid, catalog record)
• <mdRef> element provides numerous attributes for qualifying a metadata reference:
– METS standard linking attributes: xlink:SimpleLink
– LOCTYPE, OTHERLOCTYPE
– MIMETYPE
– MDTYPE (MARC, EAD, DC)
– OTHERMDTYPE
– LABEL
– XPTR (XPointer to location within file)
Linking to Internal Descriptive Metadata: DMDID 2
<METS:structMap TYPE="logical">
  <METS:div ORDER="1" TYPE="diary" LABEL="Breen Diary" DMDID="DM1 DM2">
    <METS:div ORDER="1" TYPE="entry" LABEL="Nov 20">
      <METS:fptr FILEID="FID3" />
      <METS:fptr FILEID="FID35" />
      <METS:fptr>
        <METS:area FILEID="FID1" BETYPE="IDREF" BEGIN="ENTRY1" END="ENTRYEND1" />
      </METS:fptr>
    </METS:div>
    <METS:div ORDER="2" TYPE="entry" LABEL="Nov 21" />
  </METS:div>
</METS:structMap>
Linking to Internal Descriptive Metadata: <mdWrap>
<METS:dmdSec ID="DM1">
  <METS:mdRef LOCTYPE="URL" MDTYPE="EAD" xlink:href="http://…/breen" LABEL="Finding Aid" />
</METS:dmdSec>
<METS:dmdSec ID="DM2">
  <METS:mdWrap MDTYPE="OTHER" OTHERMDTYPE="GDM">
    <METS:xmlData>
      <gdm:gdm>
        <gdm:core>
          <gdm:coreDate>1846</gdm:coreDate>
          <gdm:title>[Patrick Breen Diary…]</gdm:title>
        </gdm:core>
        <gdm:creator ROLE="Author">Breen, Patrick</gdm:creator>
      </gdm:gdm>
    </METS:xmlData>
  </METS:mdWrap>
</METS:dmdSec>
<mdWrap> Element (dmdSec)
• <mdWrap> provides a wrapper for metadata
• <mdWrap> may wrap an <xmlData> element containing metadata encoded according to an external schema: DC, MARCLITE, GDM, etc.
• <mdWrap> may wrap a <binData> element containing base64Binary encoded data
• Attributes:
– MIMETYPE
– MDTYPE: MARC, EAD, DC, etc.
– OTHERMDTYPE: if MDTYPE is OTHER
– LABEL: for presentation to end user
Summary: Linking Structure with Descriptive Metadata
• <div>s are linked to <dmdSec> elements by means of DMDID attribute containing idref(s).
• <div> at any level of the <structMap> hierarchy may reference a <dmdSec>
• Each <dmdSec> references or contains a discrete unit of descriptive metadata
• A <dmdSec> can (either/both)
– reference external md via a <mdRef> element
– wrap metadata via an <mdWrap> element:
  • xml-encoded md conforming to external schema
  • base64Binary encoded metadata such as a MARC record
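Since DMDID holds one or more space-separated IDREFs, resolving a `<div>`'s descriptive metadata amounts to splitting the attribute and looking up each `<dmdSec>` by ID. The sketch below does this for the DM1/DM2 example. The METS and XLink namespace URIs are assumed, `urn:example:gdm` is a placeholder namespace for the `gdm` prefix, `http://example.org/breen` is a hypothetical stand-in for the elided URL, and `descriptive_md` is a made-up helper name.

```python
import xml.etree.ElementTree as ET

METS = "http://www.loc.gov/METS/"          # assumed METS namespace URI
XLINK = "http://www.w3.org/1999/xlink"     # assumed XLink namespace URI
NS = {"METS": METS}

# urn:example:gdm and the example.org URL are placeholders for illustration.
DOC = f"""<METS:mets xmlns:METS="{METS}" xmlns:xlink="{XLINK}"
    xmlns:gdm="urn:example:gdm">
  <METS:dmdSec ID="DM1">
    <METS:mdRef LOCTYPE="URL" MDTYPE="EAD" LABEL="Finding Aid"
        xlink:href="http://example.org/breen"/>
  </METS:dmdSec>
  <METS:dmdSec ID="DM2">
    <METS:mdWrap MDTYPE="OTHER" OTHERMDTYPE="GDM">
      <METS:xmlData>
        <gdm:gdm><gdm:title>[Patrick Breen Diary]</gdm:title></gdm:gdm>
      </METS:xmlData>
    </METS:mdWrap>
  </METS:dmdSec>
  <METS:structMap TYPE="logical">
    <METS:div ORDER="1" TYPE="diary" LABEL="Breen Diary" DMDID="DM1 DM2"/>
  </METS:structMap>
</METS:mets>"""

mets = ET.fromstring(DOC)
dmd_by_id = {d.get("ID"): d for d in mets.findall("METS:dmdSec", NS)}


def descriptive_md(div):
    """Resolve each IDREF in DMDID to (id, 'external' | 'wrapped', detail)."""
    found = []
    for idref in div.get("DMDID", "").split():
        sec = dmd_by_id[idref]
        ref = sec.find("METS:mdRef", NS)
        wrap = sec.find("METS:mdWrap", NS)
        if ref is not None:    # metadata lives outside the METS document
            found.append((idref, "external", ref.get(f"{{{XLINK}}}href")))
        if wrap is not None:   # metadata is embedded in the METS document
            found.append((idref, "wrapped", wrap.get("OTHERMDTYPE") or wrap.get("MDTYPE")))
    return found


root_div = mets.find("METS:structMap/METS:div", NS)
links = descriptive_md(root_div)
for link in links:
    print(link)
```

The two `if` tests are independent on purpose, reflecting the "either/both" wording: a single `<dmdSec>` may carry an `<mdRef>` and an `<mdWrap>` at once.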
Building a METS object: Aspect 4
4. Linking Structure and Files with Administrative metadata.
a) Key attributes and elements:
   i. <div> elements in the <structMap> may link to one or more administrative metadata units via an ADMID attribute.
   ii. <file> elements in the <fileSec> may link to one or more administrative metadata units via an ADMID attribute
   iii. <amdSec>, <techMD>, <rightsMD>, <sourceMD> and <digiprovMD> elements may
      a. point to external administrative metadata via a <mdRef> element
      b. themselves contain administrative metadata in an <mdWrap> element.
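Linking Structure and Content with Administrative Md
[Diagram: Structure, Content, and Administrative Md side by side. Under Structure, structMap holds div. Under Content, fileSec holds fileGrp and file. Under Administrative Md, amdSec holds techMD, sourceMD, digiprovMD, and rightsMD, each of which may hold mdRef or mdWrap.]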
Linking <div> to Admin Md: Adding ADMID
<METS:structMap TYPE="logical">
  <METS:div ORDER="1" TYPE="diary" LABEL="Breen Diary" DMDID="DM1 DM2" ADMID="RM1">
    <METS:div ORDER="1" TYPE="entry" LABEL="Nov 20">
      <METS:fptr FILEID="FID3" />
      <METS:fptr FILEID="FID35" />
      <METS:fptr>
        <METS:area FILEID="FID1" BETYPE="IDREF" BEGIN="ENTRY1" END="ENTRYEND1" />
      </METS:fptr>
    </METS:div>
    <METS:div ORDER="2" TYPE="entry" LABEL="Nov 21" />
    …
  </METS:div>
</METS:structMap>
Linking to Administrative Md: Adding <rightsMD>, <mdWrap>
<METS:amdSec>
  <METS:rightsMD ID="RM1">
    <METS:mdWrap MDTYPE="OTHER" OTHERMDTYPE="GAMRIGHTS">
      <METS:xmlData>
        <gamrights:gamrights>
          <gamrights:copyRest>Copyright has been assigned to the Bancroft Library. All requests…</gamrights:copyRest>
        </gamrights:gamrights>
      </METS:xmlData>
    </METS:mdWrap>
  </METS:rightsMD>
</METS:amdSec>
<amdSec> Element
• <amdSec> expresses administrative metadata through 4 repeatable elements:
– <rightsMD>
– <techMD>
– <sourceMD>
– <digiprovMD>
• Each of these elements expresses admin md via the same means as <dmdSec> expresses descriptive md:
– <mdRef>: can point to external metadata
– <mdWrap>: wraps metadata internally
• <div>s, <file>s, <fileGrp>s can link to <rightsMD>, <techMD>, <sourceMD>, <digiprovMD> or parent <amdSec> via ADMID.
Linking <file> to Admin MD: Add ADMID
<METS:fileSec>
  <METS:fileGrp VERSDATE="2000-08-22T07:32:00">
    <METS:file ID="FID55" ADMID="TM1 SM1" MIMETYPE="image/tif">
      <METS:FLocat LOCTYPE="URL" xlink:href="http:…/x.tif" />
    </METS:file>
  </METS:fileGrp>
  …
</METS:fileSec>
Linking <file> to Admin MD: <techMD>
<METS:amdSec>
  <METS:techMD ID="TM1">
    <METS:mdWrap MDTYPE="OTHER" OTHERMDTYPE="GAMTECH">
      <METS:xmlData>
        <gamtech:gamtech>
          <gamtech:compression>LZW</gamtech:compression>
          <gamtech:resolution>800</gamtech:resolution>
        </gamtech:gamtech>
      </METS:xmlData>
    </METS:mdWrap>
  </METS:techMD>
</METS:amdSec>
Linking <file> to Admin MD: <sourceMD>
<METS:amdSec>
  <METS:sourceMD ID="SM1">
    <METS:mdWrap MDTYPE="OTHER" OTHERMDTYPE="GAMSOURCE">
      <METS:xmlData>
        <gamsource:gamsource>
          <gamsource:sourceID>BANC MSS C-E 176</gamsource:sourceID>
          <gamsource:orgDimen X="12" Y="17" UNIT="cm" />
        </gamsource:gamsource>
      </METS:xmlData>
    </METS:mdWrap>
  </METS:sourceMD>
</METS:amdSec>
Summary: Linking Structure and files with Admin Metadata
• <div>s are linked to admin md elements by means of ADMID attribute containing idref(s).
• <div> at any level of the <structMap> hierarchy may reference <rightsMD> or other amd element
• <file>s and <fileGrp>s are linked to admin md elements by means of ADMID attribute. May link to <techMD>, <rightsMD>, <sourceMD>, <digiprovMD>, or entire <amdSec>
• Each <techMD>, <rightsMD>, <sourceMD>, <digiprovMD> references or contains a discrete unit of administrative metadata
Summary: Linking Structure and files with Admin Metadata (cont’d)
• <techMD>, <rightsMD>, <sourceMD>, <digiprovMD> can (either/both)
– reference external md via a <mdRef> element
  • <mdRef> uses xlink:SimpleLink attributes to point to external administrative metadata.
– wrap metadata (either/or)
  • xml-encoded md conforming to external schema in a <xmlData> element.
  • base64Binary encoded metadata in a <binData> element
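ADMID works the same way as DMDID, except that its targets are the typed children of `<amdSec>`, so a consumer can tell from the element name what kind of administrative metadata it has reached. The sketch below resolves the ADMID values from the slide examples. It assumes the METS namespace URI `http://www.loc.gov/METS/`, leaves the `<mdWrap>` payloads empty for brevity (so the sample is not schema-valid METS), and uses hypothetical helper names `local_name` and `admin_md`.

```python
import xml.etree.ElementTree as ET

METS = "http://www.loc.gov/METS/"   # assumed METS namespace URI
NS = {"METS": METS}

# IDs and attribute values follow the slides; wrapped payloads are omitted.
DOC = f"""<METS:mets xmlns:METS="{METS}">
  <METS:amdSec>
    <METS:techMD ID="TM1"><METS:mdWrap MDTYPE="OTHER" OTHERMDTYPE="GAMTECH"/></METS:techMD>
    <METS:rightsMD ID="RM1"><METS:mdWrap MDTYPE="OTHER" OTHERMDTYPE="GAMRIGHTS"/></METS:rightsMD>
    <METS:sourceMD ID="SM1"><METS:mdWrap MDTYPE="OTHER" OTHERMDTYPE="GAMSOURCE"/></METS:sourceMD>
  </METS:amdSec>
  <METS:fileSec>
    <METS:fileGrp VERSDATE="2000-08-22T07:32:00">
      <METS:file ID="FID55" ADMID="TM1 SM1" MIMETYPE="image/tif"/>
    </METS:fileGrp>
  </METS:fileSec>
  <METS:structMap TYPE="logical">
    <METS:div ORDER="1" TYPE="diary" LABEL="Breen Diary" ADMID="RM1"/>
  </METS:structMap>
</METS:mets>"""

mets = ET.fromstring(DOC)

# Index the typed children of <amdSec> (techMD, rightsMD, ...) by ID.
amd_by_id = {el.get("ID"): el for el in mets.find("METS:amdSec", NS) if el.get("ID")}


def local_name(el):
    """Strip the '{namespace}' part from a qualified tag name."""
    return el.tag.split("}", 1)[1]


def admin_md(el):
    """Resolve the ADMID IDREFs on a <div> or <file> to (id, element name)."""
    return [(idref, local_name(amd_by_id[idref])) for idref in el.get("ADMID", "").split()]


file_el = mets.find(".//METS:file", NS)
div_el = mets.find(".//METS:div", NS)
print("file:", admin_md(file_el))
print("div: ", admin_md(div_el))
```

In this example the `<file>` reaches technical and source metadata, while the root `<div>` reaches the rights statement, mirroring the split shown across the preceding slides.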
Building a METS object
1. Expressing the Structure
2. Linking Structure with Content
3. Linking Structure with Descriptive Metadata
4. Linking Structure and Files with Administrative metadata
5. Not covered: Linking behaviors with structures.
METS Mechanisms: Conclusion
• METS provides varied and flexible mechanisms for
– expressing structure or structures of a digital entity
– linking structure with simple and complex content
– linking structure with descriptive metadata
– linking structure and content files with administrative metadata
– linking behaviors with structure and content