Metadata and Tagging

Post on 22-Jan-2015


University College Cork's Digital Arts and Humanities MA class slides on "Metadata and Tagging"


Metadata

What is Metadata?

Metadata is ‘data about data’ or information about information.

Michael Day (Metadata in a Nutshell) defines metadata as “standardised descriptive information about resources, including non-digital ones.” (e.g. book metadata, photograph metadata, etc.)

3 types of Metadata

1. Descriptive Metadata (Human Level): “content about the content” or metacontent.

It supports the discovery and identification of resources.

2. Structural Metadata (Machine Level): data about the containers of data.

3. Administrative Metadata (Human Level): information for managing a resource.

Why create Metadata?

Metadata makes your work readily available because:

Searchable: it’s easy to find.

Authority: it’s clear who created it.

Citation: it’s easy for others to cite your work in their publications.

Collaboration: it helps other people to build on your work rather than having to recreate it.

Efficiency: you save time and money if everyone creates metadata.

Funding: you may be required to make your work readily available to others.

What does Metadata do?

Metadata is the key to ensuring the survival and accessibility of resources into the future.

Descriptive metadata facilitates the discovery of relevant information.

Resource discovery (e.g. library catalogs)

Organize electronic resources

Facilitate interoperability and resource integration

Provide digital identification

Support archiving and preservation

Interoperability

Metadata is the key to interoperability.

Interoperability is the ability of multiple systems with different hardware and software platforms, data structures and interfaces to exchange data with minimal loss of content and functionality.

Metadata promotes interoperability by making digital information understandable to both humans and machines.

Storing Metadata

Metadata can be stored separately or embedded in a digital object.

Storing separately:

Can facilitate search and retrieval

Stored in database and linked to the object described.

E.g. Semantic web

Storing metadata with object:

Ensures it won’t be lost

Removes the problem of linking between data and metadata

Metadata and object are updated simultaneously

E.g. TEI

Digital Identification

Metadata schemes include elements, such as standard numbers, to uniquely identify the work or object to which the metadata refers.

Location of a digital object is given using:

a file name

URL (Uniform Resource Locator)

PURL (Persistent URL): preferred

DOI (Digital Object Identifier)
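
As a rough illustration of how a DOI differs from a file name or plain URL, the sketch below turns a bare DOI into a link through the public https://doi.org/ resolver. The function name `doi_to_url` is hypothetical, and the sketch assumes the DOI contains no characters that would need percent-encoding.

```python
def doi_to_url(doi: str) -> str:
    """Turn a bare DOI into a link through the https://doi.org/ resolver."""
    doi = doi.strip()
    # Accept the common "doi:" prefix as well as a bare identifier.
    if doi.lower().startswith("doi:"):
        doi = doi[4:].strip()
    # Every DOI starts with the directory indicator "10.".
    if not doi.startswith("10."):
        raise ValueError(f"not a DOI: {doi!r}")
    return "https://doi.org/" + doi


# Example DOI used purely for illustration.
print(doi_to_url("doi:10.1000/182"))
```

The identifier stays the same even if the object moves; only the resolver's record of its current location changes.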

Collaborative Metadata

A tag is a non-hierarchical keyword or term assigned to a piece of information (Wikipedia)

Effectively, a tag is a form of metadata.

O’Shea (2013) describes 3 types of tagging:

1. Personal

2. Algorithmic

3. Social

Hashtags

A hashtag is a universal, standardised metadata tag/mark.

Therefore, #uccmadah is a form of descriptive metadata.

You choose your tags on Twitter, Facebook, WordPress, YouTube, etc.

However, YouTube, Delicious, and WordPress algorithms also suggest tags for you.
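
To show how a hashtag works as machine-readable descriptive metadata, here is a minimal sketch that pulls hashtags out of a post with a regular expression. The sample post and the function name `extract_hashtags` are made up for illustration; real platforms apply their own, more involved rules.

```python
import re

# A hashtag here is '#' followed by letters, digits or underscores.
HASHTAG = re.compile(r"#(\w+)")


def extract_hashtags(text: str) -> list[str]:
    """Return hashtags in order of appearance, lower-cased, without the '#'."""
    return [tag.lower() for tag in HASHTAG.findall(text)]


post = "Slides from class on metadata and tagging #UCCMADAH #metadata"
print(extract_hashtags(post))  # ['uccmadah', 'metadata']
```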

HTML Metatags

HTML meta tags are all placed inside the <head> element.

Example:

<head>

<meta name="author" content="Paul O’Shea">

<meta name="keywords" content="metadata, metacontent, HTML, etc.">

<meta name="description" content="MA DAH teaching Powerpoint slides">

</head>
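
One way to see that these tags are readable by machines is to parse them. The sketch below uses Python's standard `html.parser` module to collect name/content pairs from a page; the class name `MetaTagParser` and the sample page are assumptions made for the example.

```python
from html.parser import HTMLParser


class MetaTagParser(HTMLParser):
    """Collect <meta name="..." content="..."> pairs into a dictionary."""

    def __init__(self):
        super().__init__()
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attrs = dict(attrs)
        # Only keep tags that carry both a name and a content attribute.
        if "name" in attrs and "content" in attrs:
            self.meta[attrs["name"]] = attrs["content"]


page = """<head>
<meta name="author" content="Paul O'Shea">
<meta name="keywords" content="metadata, metacontent, HTML">
<meta name="description" content="MA DAH teaching slides">
</head>"""

parser = MetaTagParser()
parser.feed(page)
print(parser.meta["author"])
print([k.strip() for k in parser.meta["keywords"].split(",")])
```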

Social Tagging

Social tagging is also known as folksonomy.

Folksonomy originates from folk + taxonomy

Taxonomy is the branch of science concerned with systematic classifications.

Ergo -> crowd-sourced tagging

See plugins for MediaWiki/Ruby on Rails for example!
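
A folksonomy emerges when many people's tags are pooled. The sketch below assumes a small, invented set of users and tags and counts how many users applied each tag; it is one simple way to aggregate crowd-sourced tags, not how any particular plugin does it.

```python
from collections import Counter

# Hypothetical data: which tags each user gave to the same resource.
taggings = {
    "alice": ["metadata", "slides", "ucc"],
    "bob": ["metadata", "dublin-core"],
    "carol": ["Metadata", "slides"],
}


def folksonomy(taggings):
    """Count how many users applied each tag, ignoring case and spacing."""
    counts = Counter()
    for tags in taggings.values():
        # A set ensures each tag is counted once per user.
        counts.update({t.strip().lower() for t in tags})
    return counts


counts = folksonomy(taggings)
print(counts.most_common(2))  # [('metadata', 3), ('slides', 2)]
```

No one defined the vocabulary in advance; the most popular tags simply rise to the top.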

Geotagging

The process of adding geo-spatially represented metadata to various media and resources.

Provides location-specific information

Examples include:

Digital cameras automatically geotag using GPS.

Facebook and Twitter mobile apps allow you to geotag tweets and status updates.
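
Cameras typically record GPS positions in a photo's EXIF data as degrees, minutes and seconds plus a hemisphere letter, while most mapping tools expect signed decimal degrees. The sketch below shows that conversion; the function name and the coordinates (roughly Cork, Ireland) are illustrative assumptions.

```python
def dms_to_decimal(degrees, minutes, seconds, ref):
    """Convert degrees/minutes/seconds plus N/S/E/W to signed decimal degrees."""
    value = degrees + minutes / 60 + seconds / 3600
    # South and west are negative in decimal notation.
    return -value if ref in ("S", "W") else value


# Illustrative coordinates, approximately Cork, Ireland.
lat = dms_to_decimal(51, 53, 52.8, "N")
lon = dms_to_decimal(8, 29, 24.0, "W")
print(round(lat, 3), round(lon, 3))
```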

Archiving and Preservation

Special metadata elements are required to:

Track the lineage of a digital object.

Detail its physical characteristics.

Document its behaviour for future technologies.

Composing Metadata

Metadata schemes:

Sets of metadata elements designed for a specific purpose, e.g. describing a particular type of information resource.

If the resource lives on the internet, one may use a URI to locate it and RDF to describe it.

Metadata Schemes

The definitions (meanings) of the metadata elements are the semantics of the scheme.

The values given to the metadata elements are the content.

Metadata schemes specify the names of elements and their semantics, and also the syntax rules for how elements and their content should be encoded.
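
The split between element names, semantics, content and syntax can be sketched in a few lines. The two-element scheme below is entirely hypothetical: each element has a name, a human-readable definition (its semantics) and a rule for how its content must be encoded (its syntax).

```python
import re

# Hypothetical mini-scheme: element name -> (semantics, syntax check).
SCHEME = {
    "title": (
        "A name given to the resource",
        lambda v: bool(v.strip()),
    ),
    "date": (
        "When the resource was created, written YYYY, YYYY-MM or YYYY-MM-DD",
        lambda v: re.fullmatch(r"\d{4}(-\d{2}(-\d{2})?)?", v) is not None,
    ),
}


def validate(record):
    """Return a list of problems; an empty list means the record conforms."""
    problems = []
    for name, value in record.items():
        if name not in SCHEME:
            problems.append(f"unknown element: {name}")
        elif not SCHEME[name][1](value):
            problems.append(f"bad content for {name}: {value!r}")
    return problems


print(validate({"title": "Metadata and Tagging", "date": "22-Jan-2015"}))
print(validate({"title": "Metadata and Tagging", "date": "2015-01-22"}))
```

The first record fails because its date content breaks the scheme's syntax rule, even though the element name and meaning are correct.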

The Dublin Core Metadata Element Set

Originated from a 1995 workshop sponsored by OCLC and NCSA, held in Dublin, Ohio

Common vocabulary, like FOAF (Friend of a Friend – describing interpersonal networks)

Important because it is endorsed by IETF and ISO

See Europeana: http://www.europeana.eu/

The Dublin Core Metadata Initiative (DCMI)

Original objective:

To define a set of elements that could be used by authors to describe their own web resources.

15 Elements:

Title, Creator, Subject, Description, Publisher, Contributor, Date, Type, Format, Identifier, Source, Language, Relation, Coverage and Rights.

http://dublincore.org/documents/dces/

Dublin Core

Designed to be simple and concise

To describe Web based documents

Minimalist view vs. Structuralist view (See Lumpers and Splitters)

Minimalists: minimum elements, simple semantics and syntax

Structuralists: finer semantic distinctions and more extensibility for particular communities.

Dublin Core Example

Title="Metadata Demystified"

Creator="Brand, Amy"

Creator="Daly, Frank"

Creator="Meyers, Barbara"

Subject="metadata"

Description="Presents an overview of metadata conventions in publishing."

Publisher="NISO Press"

Publisher="The Sheridan Press"

Date="2003-07"

Type="Text"

Format="application/pdf"

Identifier="http://www.niso.org/standards/resources/Metadata_Demystified.pdf"

Language="en"
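
A record like the one above is often exchanged as XML. The sketch below serialises part of it with Python's standard `xml.etree.ElementTree`, using the Dublin Core element namespace `http://purl.org/dc/elements/1.1/`. The wrapper element `metadata` is an assumption for the example; real applications choose their own container.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)

# Elements may repeat, so the record is a list of pairs rather than a dict.
record = [
    ("title", "Metadata Demystified"),
    ("creator", "Brand, Amy"),
    ("creator", "Daly, Frank"),
    ("creator", "Meyers, Barbara"),
    ("subject", "metadata"),
    ("date", "2003-07"),
    ("type", "Text"),
    ("format", "application/pdf"),
    ("language", "en"),
]

root = ET.Element("metadata")  # hypothetical wrapper element
for name, value in record:
    ET.SubElement(root, f"{{{DC_NS}}}{name}").text = value

xml_text = ET.tostring(root, encoding="unicode")
print(xml_text)
```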

Semantic Web

Focuses on semantic content rather than plain text content.

Allows for disambiguation of terms with the same name, but different meanings.

Evolved from limited, but simple, HTML meta tags to a complex ‘web’ of standards.

Some of the well established standards are Unicode, URI, XML, RDF, Web Ontology Language (OWL), etc.

Tim Berners-Lee http://www.ted.com/talks/tim_berners_lee_on_the_next_web.html
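
The disambiguation idea can be sketched with plain subject-predicate-object triples, the basic shape of RDF data. Two places share the label "Dublin", but each has its own URI, so statements about one never get mixed up with the other. The `example.org` URIs and the `locatedIn` property are hypothetical; only the `rdfs:label` URI is a real vocabulary term.

```python
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"
LOCATED_IN = "http://example.org/vocab/locatedIn"  # hypothetical property

# Each triple is (subject URI, predicate URI, object value).
triples = [
    ("http://example.org/place/dublin-ohio", RDFS_LABEL, "Dublin"),
    ("http://example.org/place/dublin-ohio", LOCATED_IN, "United States"),
    ("http://example.org/place/dublin-ireland", RDFS_LABEL, "Dublin"),
    ("http://example.org/place/dublin-ireland", LOCATED_IN, "Ireland"),
]


def subjects_with_label(label):
    """Find every resource whose label matches, sorted by URI."""
    return sorted(s for s, p, o in triples if p == RDFS_LABEL and o == label)


def values_of(subject, predicate):
    """Return all object values for one subject and predicate."""
    return [o for s, p, o in triples if s == subject and p == predicate]


matches = subjects_with_label("Dublin")
for uri in matches:
    print(uri, "->", values_of(uri, LOCATED_IN))
```

A plain-text search for "Dublin" returns both places; the URIs are what tell them apart.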

TEI

The Text Encoding Initiative http://www.tei-c.org/index.xml

International project to develop guidelines for marking up electronic texts such as novels, plays, poetry, correspondence, etc.

TEI Guidelines for Electronic Text Encoding and Interchange

Specify a header portion, embedded in the resource, that consists of metadata about the work.

TEI Header can be used to record bibliographic information about the digital edition.
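
To show what metadata embedded in the object looks like in practice, the sketch below parses a minimal, invented TEI document and reads the title from its header. It assumes the TEI P5 namespace `http://www.tei-c.org/ns/1.0`; the document content is a placeholder, not a real edition.

```python
import xml.etree.ElementTree as ET

TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}

# A minimal, made-up TEI document: header (metadata) plus text (the work).
doc = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader>
    <fileDesc>
      <titleStmt>
        <title>Sample poem: a digital edition</title>
        <author>Anonymous</author>
      </titleStmt>
      <publicationStmt><p>Unpublished classroom example.</p></publicationStmt>
      <sourceDesc><p>Born digital.</p></sourceDesc>
    </fileDesc>
  </teiHeader>
  <text><body><p>Text of the poem goes here.</p></body></text>
</TEI>"""

root = ET.fromstring(doc)
# The bibliographic title lives in teiHeader/fileDesc/titleStmt/title.
title = root.find("tei:teiHeader/tei:fileDesc/tei:titleStmt/tei:title", TEI_NS)
print(title.text)
```

Because the header travels inside the same file as the text, the metadata cannot become separated from the work it describes.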