Page 1: Content industry standards activities

ITU Focus Group on Identity Management
Geneva, February 2007

Norman Paskin

TERTIUS Ltd

Page 2: Content industry standards activities

• Norman Paskin
  – Member of ISO TC46/SC9 Identifier Interoperability working group
  – Digital Object Identifier system
  – Chair of CONTECS (indecs2 consortium)
  – Member of ACAP Technical Working Group, etc

Outline of the presentation:
• Relevance for ITU FG
• Terminology traps
• Overview of major activities:
  – ISO content identifiers
  – DOI (Digital Object Identifier)
  – music; publishing; licensing
  – MPEG
  – Party identifiers
  – Web-based identifiers
• Common themes and lessons

Page 3: Content industry standards activities

• ITU FG scope: “management of...attributes of an entity”
• Accommodate existing and new identity schemes
• There is relevant ongoing work in other standards fora
• A consistent approach to all kinds of inter-related entities is now recognised as necessary:

[Diagram: People make Stuff; People use Stuff; People do Deals about Stuff]

Parties: living or deceased, people or organisations; groups; pseudonyms; avatars; characters; etc

Usual focus of “identity management”

Page 4: Relevance to ITU FG IdM

• ITU FG scope: “management of...attributes of an entity”
• Accommodate existing and new identity schemes
• There is relevant ongoing work in other standards fora
• A consistent approach to all kinds of inter-related entities is now recognised as necessary:
  – “… which entities digital identities need to be tied to, from users via networks, services, applications, content etc. to “things” in general”
  – “The need to support roles and partial identities targeted to specific roles or usage contexts.”
  – “the requirement to support both roles that represent real persons as well as the construction of virtual persons..”
  (ITU Workshop on Digital Identity for Next Generation Networks, Dec 06)

Page 5: Terminology: the over-used term “identifier”

Identifier = numbering schemes
• Registries
• Normally central control, commitment
• Examples: ISBN, EAN bar codes, IANA, ITU phone numbering plans etc
• Normally focus on attributes (metadata)

Identifier = syntax specifications
• Normally little central control
• e.g. URI (URL); MPEG-21 DII
• Few structured attributes, low barriers to entry
• Some more structured than others: e.g. URN, info URI

Other confusions:
• Some practical systems use both schemes and specifications (e.g. DOI)
• Interactions between schemes and specifications:
  – e.g. an ISBN can be expressed as a URL, as an EAN bar code, a DOI, etc
• Identifier as “system” versus as a “unique label”
• There are many badly-designed numbering schemes
• There are many incorrect uses of well-designed numbering schemes
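As a hedged illustration of the interaction between schemes and specifications noted above, the sketch below takes one ISBN and re-expresses it in two other forms: the 13-digit EAN form (prefix 978 plus the standard EAN-13 check digit) and a URN in the isbn namespace. The function names are illustrative only and belong to no standard API; the DOI form is left out because the slides do not give its syntax.

```python
def isbn10_to_ean13(isbn10: str) -> str:
    """Re-express a 10-character ISBN as a 13-digit EAN (prefix 978)."""
    chars = isbn10.replace("-", "").replace(" ", "")
    if len(chars) != 10 or not chars[:9].isdigit():
        raise ValueError(f"not a 10-character ISBN: {isbn10!r}")
    core = "978" + chars[:9]  # the old ISBN-10 check character is dropped
    # EAN-13 check digit: weights alternate 1, 3 over the first 12 digits.
    total = sum(int(d) * (3 if i % 2 else 1) for i, d in enumerate(core))
    return core + str((10 - total % 10) % 10)


def isbn_to_urn(isbn: str) -> str:
    """Express an ISBN as a URN in the 'isbn' namespace."""
    return "urn:isbn:" + isbn.replace(" ", "")


if __name__ == "__main__":
    isbn = "0-306-40615-2"
    print(isbn10_to_ean13(isbn))
    print(isbn_to_urn(isbn))
```

The same registered number is the "unique label"; the EAN and URN strings are only different syntaxes for carrying it.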

Page 6: ISO content “identification numbering”

ISO 2108 International Standard Book Numbering (ISBN)
ISO 3297 International Standard Serial Number (ISSN)
ISO 3901 International Standard Recording Code (ISRC)
ISO 10957 International Standard Music Number (ISMN)
ISO 15706 International Standard Audiovisual Number (ISAN)
ISO 15706-2 Version identifier for Audiovisual Works (V-ISAN)
ISO 15707 International Standard Musical Work Code (ISWC)
ISO 21047 International Standard Text Code (ISTC)

Information and Documentation - Identification and Description: http://www.collectionscanada.ca/iso/tc46sc9/

Defining metadata is now a requirement for each identifier scheme: entities must be described as well as named.
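Several of the numbering schemes listed above carry a check character so that transcription errors can be caught before any registry lookup. The sketch below shows this for ISSN, assuming the usual modulus-11 rule (weights 8 down to 2 over the first seven digits, with X standing for 10); the function names are illustrative.

```python
def issn_check_character(first_seven: str) -> str:
    """Compute the ISSN check character for the first seven digits."""
    if len(first_seven) != 7 or not first_seven.isdigit():
        raise ValueError("expected exactly seven digits")
    # Weights run 8, 7, 6, 5, 4, 3, 2 from left to right.
    total = sum(int(d) * w for d, w in zip(first_seven, range(8, 1, -1)))
    value = (11 - total % 11) % 11
    return "X" if value == 10 else str(value)


def is_valid_issn(issn: str) -> bool:
    """Return True if the string is a well-formed ISSN with a correct check character."""
    chars = issn.replace("-", "").upper()
    if len(chars) != 8 or not chars[:7].isdigit():
        return False
    return issn_check_character(chars[:7]) == chars[7]


if __name__ == "__main__":
    print(is_valid_issn("0378-5955"))
```

A check character only guards the name; it says nothing about the entity named, which is why the schemes now also require defining metadata.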

Page 7: Some relevant current ISO TC46/SC9 activities

• International Standard Party Identifier (ISPI)
  – ISO Project 27729
  – “a new international identification system for the parties (persons and corporate bodies) involved in the creation and production of content entities”.
  – Work on the ISPI project began in August 2006

• Digital Object Identifier (DOI) System
  – ISO/WD 26324
  – To standardise the existing DOI system (syntax is already a national US standard, NISO Z39.84)

• Identifier Interoperability working group
  – Informal group
  – To consider what steps are necessary to improve interoperability of existing and future ISO TC46/SC9 identifiers
  – “Identifier Interoperability: a report…” http://www.dlib.org/dlib/april06/

Page 8: The DOI System

• DOI (Digital Object Identifier) system: www.doi.org
• Initially developed from the publishing industry but now wider
• a non-profit collaboration to develop infrastructure for persistent identification and management of content
• Approx 2000 user organisations (through agencies)
• Currently being standardised in ISO (TC46/SC9)
• the home of ISBN etc “content identifiers”
• One application of the Handle System
• adds to it additional features: social and technical infrastructure, policies, metadata management
• focus on one area of interest (content/intellectual property)
• offers a specific data model based on indecs (discussed later)
• DOI technology equally applicable for parties and licences
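To make the identifier side of the DOI system concrete, here is a minimal sketch of splitting a DOI name into its prefix and suffix and forming a resolution link. It assumes the familiar DOI shape (a prefix beginning "10.", a slash, then a suffix chosen by the registrant) and assumes the public proxy at https://doi.org/; neither the proxy address nor the function names come from the slides.

```python
from urllib.parse import quote


def split_doi(doi: str):
    """Split a DOI name into (prefix, suffix); raise ValueError if malformed."""
    prefix, sep, suffix = doi.strip().partition("/")
    if not prefix.startswith("10.") or len(prefix) == 3 or not sep or not suffix:
        raise ValueError(f"not a DOI name: {doi!r}")
    return prefix, suffix


def proxy_url(doi: str, proxy: str = "https://doi.org/") -> str:
    """Build a link that asks a proxy (assumed address) to resolve the DOI."""
    prefix, suffix = split_doi(doi)
    # Percent-encode characters in the suffix that are unsafe inside a URL.
    return f"{proxy}{prefix}/{quote(suffix, safe='/')}"


if __name__ == "__main__":
    print(split_doi("10.1000/182"))
    print(proxy_url("10.1000/182"))
```

The name itself carries no location; the proxy link is only one way of handing the name to the resolution system.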

Page 9: Music supply chain

CISAC = International Confederation of Societies of Authors and Composers
• Co-ordinates a music industry information system (member-based)
• IPI = Interested Party Identifier (“which John Williams?”)
• Long established system
• Recent MWLI: Musical Works Licence Identifier*

DDEX = Digital Data Exchange*
• http://www.ddex.net
• Messaging standards for music industry chain
• Modelled on earlier publishing industry efforts (ONIX) etc
• Has its own Party ID (http://ddex.net/evaluation/licenceform.html)

GRid = Global Release Identifier*
• for digital tracks etc.

* Spun out from Music Industry Integrated Identifiers Project (Mi3p)

Page 10: Publishing supply chain

ONIX = Online Information Exchange http://www.editeur.org/

• EDItEUR: international umbrella body for book industry standards development
• Collaborative effort with international, national and sectoral organisations
• Develops and maintains ONIX, EDItEUR / EDIFACT & XML / EDI standards etc
• Messaging exchange between publishers, booksellers (Amazon etc), libraries
• Works closely with ISBN International and others
• Expanding into related areas

Page 11: Some relevant ONIX developments

• ONIX is developing standards for licensing and for multimedia, both of which require rich semantic interoperability:
  – ONIX for Licensing Terms: need for licence terms to be expressed in a standard processable format
  – DLF Electronic Resource Management Initiative (ERMI) working with NISO and EDItEUR to enable standardised statement of usage rights linked with digital resources

• RDA (Resource Description and Access, the new AACR); shared “RDA/ONIX Framework for resource categorisation”
  – http://www.dlib.org/dlib/january07/dunsire/01dunsire.html
  – Cataloguing, digital archiving and preservation projects have similar requirements

Page 12: The ACAP project

Automated Content Access Protocol: http://www.the-acap.org/

• Recently launched
• “Technical framework which will allow publishers to provide permissions information (relating to access and use of their content) in a form in which it can be recognised and where necessary interpreted by a search engine “crawler””
• Aim: search engine operator (and perhaps, ultimately, any other user) is enabled systematically to comply with a policy or licence.
• “Being developed as an industry standard by the publishing industry, working with search engines and other technical and commercial partners”.
• “the availability or otherwise of standard methods of identification of content, licenses, systems and business partners are key issues for ACAP. Identification is crucial for authentication of systems and partners as well as for location of content and licenses.”

Page 13: MPEG 21 (ISO/IEC 21000)

Moving Picture Experts Group (MPEG): working group of ISO/IEC
• Builds on MPEG standards MPEG-1, 2, 4, 7...
• MPEG 21: The “Multimedia Framework”

[Diagram: User A; User B; Transaction/Use/Relationship; Digital Item; Authorization/Value Exchange]

• Transaction of some digital item
• Information about that transaction

Page 14: MPEG 21 (ISO/IEC 21000)

• Part 3: “Digital Item Identifier”
  – syntax placeholder for e.g. URL, DOI, GRid
• Part 5: “Rights Expression Language”
  – can identify Principals
• Part 6: “Rights Data Dictionary”
  – 2000-term data dictionary for semantic interoperability
  – Contextual, event-based, managed data model
  – http://iso21000-6.net/
  – Methodology for continuing extensibility; more later
• Part 15: “Event Reporting”
  – enables owners of content to receive information about what has happened to their stuff

18 standards under various categories:
• Digital Item Identification
• Intellectual Property Management and Protection
• Terminals and Networks
• Digital Item Management and Usage
• Digital Item Representation
• Event Reporting

Page 15: Identifying parties

• Some industry-specific standards
  – e.g. CIS IPI system (availability/governance issues)
  – Current STM publishers work on author and institute disambiguation
• Impractical to identify everybody
• End-user identification mainly an issue of authentication
  – ATHENS, SHIBBOLETH
• Identification of individual and corporate persons a major issue for rights (and authority control in libraries)
• Parties are more than just persons
  – Organisations, personae, pseudonyms, avatars…
• <indecs> identified need for a “directory of parties” linking person identifier schemes

Page 16: Interparty

• An EU-funded project (2002-2003) looking at the interoperation of “party identifiers”
  – www.interparty.org
• Aimed to demonstrate how (and why) existing schemes could interoperate, e.g.
  – Library authority files
  – CISAC / IPI
  – Bibliographic databases
  – Performer databases
• Identified mechanisms for issues such as partial matching
• Built on an earlier project: <indecs>
• ISPI (ISO TC46/SC9) should learn from this
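The slides do not describe Interparty's actual matching mechanism, so the following is only a sketch of what "partial matching" between party records from different schemes could look like. Every record, field, identifier and rule here is hypothetical: two records are treated as a full match when the names agree and every other attribute agrees, as a partial match when the names agree but some attributes are missing on one side, and as different when anything conflicts.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PartyRecord:
    scheme: str                      # which identifier scheme issued the record
    party_id: str                    # hypothetical identifier within that scheme
    name: str
    birth_year: Optional[int] = None
    role: Optional[str] = None


def normalise(name: str) -> str:
    """Make 'Williams, John' and 'John Williams' compare equal."""
    return " ".join(sorted(name.replace(",", " ").lower().split()))


def compare(a: PartyRecord, b: PartyRecord) -> str:
    """Return 'match', 'partial' or 'different' for two party records."""
    if normalise(a.name) != normalise(b.name):
        return "different"
    agreed = 0
    for attr in ("birth_year", "role"):
        x, y = getattr(a, attr), getattr(b, attr)
        if x is None or y is None:
            continue                 # missing data neither confirms nor conflicts
        if x != y:
            return "different"       # same name, conflicting facts
        agreed += 1
    return "match" if agreed == 2 else "partial"


if __name__ == "__main__":
    library = PartyRecord("library-authority", "lib-0001", "Williams, John", 1932, "composer")
    society = PartyRecord("rights-society", "soc-0042", "John Williams", None, "composer")
    guitarist = PartyRecord("performer-db", "perf-0007", "John Williams", 1941, "guitarist")
    print(compare(library, society))
    print(compare(library, guitarist))
```

A partial result is the "which John Williams?" case: enough agreement to propose a link between schemes, not enough to assert it automatically.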

Page 17: Web-related identifiers

• URI, URL and URN
• Internet community has been through some debate and confusion regarding URI and URN specifications.
• Confusions seem to centre on:
  – Conflation of “indication of the location of the end point” and an “indication of identity”
  – Differing views of whether DNS should be optional or required for resolution
• “Contemporary point of view” of the URI working group aims at reconciliation
• Still some different views compared to ontology work
  – semantic web work may throw light on this
• Related work specific to information industries through NISO:

Page 18: NISO

NISO = National Information Standards Organization www.niso.org

• OpenURL: NISO standard Z39.88. A syntax to create web-transportable packages of metadata and/or identifiers about an information object.
  – Not an identifier, but a complementary technology for appropriate redirection of an identifier resolution
  – e.g. in use with Digital Object Identifiers (DOI) http://www.crossref.org/03libraries/16openurl.html

• "info" URI Registry. IETF RFC 4452: The "info" URI Scheme for Information Assets with Identifiers in Public Namespaces. http://info-uri.info/
  – Turn legacy identifiers into URIs (e.g. info:lccn/2002022641)
  – Now formalizing policies for the "info" URI registry.
  – “This identifier and its registry could serve as a focal point for NISO's identifier activity, creating a trusted brand and a starting point for community members doing work that requires identifiers.” (NISO workshop on identifiers 2006)
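The two NISO items above can be sketched together: an "info" URI wraps a legacy identifier as namespace plus identifier, and an OpenURL carries such an identifier to a resolver as key/value pairs. The sketch assumes the key/value form of Z39.88-2004 (url_ver and rft_id keys); the resolver address is a placeholder and the function names are illustrative.

```python
from urllib.parse import urlencode


def parse_info_uri(uri: str):
    """Split an 'info' URI into (namespace, identifier)."""
    scheme, sep, rest = uri.partition(":")
    if scheme.lower() != "info" or not sep:
        raise ValueError(f"not an info URI: {uri!r}")
    namespace, sep, identifier = rest.partition("/")
    if not namespace or not sep or not identifier:
        raise ValueError(f"missing namespace or identifier: {uri!r}")
    return namespace, identifier


def openurl_for_doi(resolver_base: str, doi: str) -> str:
    """Package a DOI as an OpenURL query for a (placeholder) link resolver."""
    params = [
        ("url_ver", "Z39.88-2004"),      # version of the OpenURL standard
        ("rft_id", "info:doi/" + doi),   # the referent, named by an info URI
    ]
    return resolver_base + "?" + urlencode(params)


if __name__ == "__main__":
    print(parse_info_uri("info:lccn/2002022641"))
    print(openurl_for_doi("https://resolver.example.org/openurl", "10.1000/182"))
```

The OpenURL does not replace the identifier; it transports it, so that a resolver chosen by the user's institution can decide where the request should go.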

Page 19: Common themes and unifying activities

These are not unrelated independent efforts.

(1) Many of these standards and projects share a common view (and fundamental data model) of identifiers and metadata
- the <indecs> view which has a strong lineage over almost ten years:

Page 20: The <indecs> framework

Interoperability of Data in E-Commerce Systems
• <indecs> project 1998-2000
• <indecs2> 2001-2002 (= MPEG21 Rights Data Dictionary)

• Focus on multimedia rights metadata: recognized that rights and descriptive metadata were inseparable. Produced an event-based reference model/framework (parties, resources, agreements)

• 50% EC funding + consortium members including:
  – EDItEUR (international book industry standards/ONIX)
  – IFPI (international record industry)
  – MPAA (international film industry)
  – Various copyright societies and associations
  – Various technology providers
  – Library and author representatives
  – International DOI Foundation

• Metadata in networks needs to support interoperability across
  – media (e.g. books, serials, audiovisual, software, abstract works).
  – functions (e.g. cataloguing, discovery, workflow, rights mgmt).
  – levels of metadata (from simple to complex).
  – semantic barriers.
  – linguistic barriers.

Page 21: The <indecs> framework

Principles:
• Unique Identification: every entity should be uniquely identified within an identified namespace.
• Functional Granularity: it should be possible to identify an entity whenever it needs to be distinguished [1st class]
• Designated Authority: the author of an item of metadata should be securely identified.
• Appropriate Access: everyone requires access to the metadata on which they depend, and privacy and confidentiality for their own metadata from those who are not dependent on it.
• Definition of metadata: an item of metadata is a relationship that someone claims to exist between two referents (description)

Delivered:
• Generic data model of e-commerce in all types of intellectual property
• Specifications for supporting services
• Standardisation proposals
• Documentation at www.indecs.org

Led to:
• Contextual ontology architecture: contexts, roles, identities
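The definition of metadata above translates almost directly into a data structure. The sketch below is one possible reading, not the <indecs> specification: a referent is identified within a named namespace (unique identification), and an item of metadata records who claims the relationship (designated authority) as well as the two referents it links. All class names, field names, relators and identifier values are illustrative.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Referent:
    """An entity uniquely identified within an identified namespace."""
    namespace: str
    identifier: str


@dataclass(frozen=True)
class MetadataItem:
    """A relationship that the claimant asserts between two referents."""
    claimant: Referent
    subject: Referent
    relator: str
    target: Referent


def claims_about(items: Iterable[MetadataItem], subject: Referent) -> List[MetadataItem]:
    """All claims made about one referent, whoever made them."""
    return [item for item in items if item.subject == subject]


def claims_by(items: Iterable[MetadataItem], claimant: Referent) -> List[MetadataItem]:
    """All claims whose author is the given party."""
    return [item for item in items if item.claimant == claimant]


if __name__ == "__main__":
    book = Referent("isbn", "9780306406157")
    author = Referent("party-scheme-a", "0001")       # hypothetical party scheme
    publisher = Referent("party-scheme-a", "0002")
    library = Referent("party-scheme-b", "L-17")
    items = [
        MetadataItem(publisher, book, "has_author", author),
        MetadataItem(library, book, "has_subject", Referent("subject-scheme", "physics")),
    ]
    print(len(claims_about(items, book)), len(claims_by(items, publisher)))
```

Keeping the claimant on every item is what lets two parties describe the same entity differently without either description overwriting the other.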

Page 22: Contextual ontology metamodel overview

[Figure 1: COA MetaModel Overview]

• EntityTypes: Agent, Place, Time, Resource, Context. An Entity may have typed relationships with Entities of any kind (including those of its own kind).
• AttributeTypes: An Entity may have Attributes of any kind. (Attributes, which are a type of Resource, may have their own Attributes.)
• Attributes shown (illustrative: any Entity or Attribute may have Attributes of any type): Descriptor, Name, Identifier, Annotation, Category, Flag, Quantity
• Contextual Relationships: labelled Role in the figure
• Non Contextual Relationships (illustrative: any Type of Entity may relate to any other)
• Every Relationship has a Relator
• Other labels in the figure: Verb

Page 23: e.g. Terms of a Licence as a group of Events

[Diagram: a LicensingEvent linked to other Events]
• LicensingEvent Permits (MAY) UseEvent: 1-n
• LicensingEvent Prohibits (MUST NOT) UseEvent: 0-n
• LicensingEvent Requires (MUST) Payment, ReportingEvent, etc: 0-n
• Further labels: Has Exception; Has Precondition

This structure allows for whatever level of flexibility or granularity may be required now or in the future.

Event = time, place, entities
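A licence modelled as a group of events can be sketched with a few small types. This is an illustrative reading of the diagram, not the ONIX or MPEG-21 data model: each term pairs a modality (MAY, MUST NOT, MUST) with a named event, a licence must contain at least one permitted use (the 1-n cardinality), and, as an assumption of this sketch, a prohibition overrides a permission for the same use. Exceptions and preconditions are left out.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Tuple


class Modality(Enum):
    PERMITS = "MAY"
    PROHIBITS = "MUST NOT"
    REQUIRES = "MUST"


@dataclass(frozen=True)
class Term:
    modality: Modality
    event: str            # name of the use, payment or reporting event


@dataclass(frozen=True)
class LicensingEvent:
    terms: Tuple[Term, ...]

    def __post_init__(self):
        # Cardinality from the diagram: Permits is 1-n, the others are 0-n.
        if not any(t.modality is Modality.PERMITS for t in self.terms):
            raise ValueError("a licence needs at least one permitted use")

    def _events(self, modality: Modality) -> List[str]:
        return [t.event for t in self.terms if t.modality is modality]

    def allows(self, use: str) -> bool:
        """Permitted and not prohibited (precedence is an assumption here)."""
        return use in self._events(Modality.PERMITS) and use not in self._events(Modality.PROHIBITS)

    def obligations(self) -> List[str]:
        """Events the licensee must bring about, e.g. payment or reporting."""
        return self._events(Modality.REQUIRES)


if __name__ == "__main__":
    licence = LicensingEvent((
        Term(Modality.PERMITS, "print"),
        Term(Modality.PERMITS, "index"),
        Term(Modality.PROHIBITS, "redistribute"),
        Term(Modality.REQUIRES, "payment"),
        Term(Modality.REQUIRES, "usage-report"),
    ))
    print(licence.allows("print"), licence.allows("redistribute"), licence.obligations())
```

Because every term is itself an event, finer granularity (who, where, when) can be added to an individual term without changing the shape of the licence.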

Page 24: Contextual Ontology usage examples

• ISO MPEG-21 Rights Data Dictionary (http://iso21000-6.net/)
• DDEX Digital Data EXchange - music industry (http://ddex.net/)
• ONIX: Book industry (+) messaging schemas (www.editeur.org)
• ONIX: Rights: ONIX for Licensing Terms, Repertoire and Distribution
• Digital Library Federation - communication of licence terms (ERMI: working with ONIX for licensing terms)
• DOI Data Dictionary (http://www.doi.org)
• Rightscom’s OntologyX - licensee of early output, plus their own later work (www.rightscom.com)
• RDA (Resource Description and Access); next generation of AACR/MARC cataloguing – RDA/ONIX common framework
• ACAP: Automated Content Access Protocol (http://www.the-acap.org/)
• Consistent with FRBR, ABC-Harmony, OWL, CIDOC CRM, etc

Page 25: Common themes and unifying activities

1. Many of these standards and projects share a common view (and fundamental data model) of identifiers and metadata

2. Some of these standards and projects share a common view (and fundamental data model) of identifier resolution
• Internet registries and distributed resolution
• First class naming, functional granularity
• Info URI, URN?
• The Handle System: ideal choice to provide resolution for all identifiers
  – 10 years +
  – See separate presentation
  – DOI is a prime example
  – schemes that don’t want to use DOI can use own handle implementation
• Existing numbering schemes may be a suffix of a Handle
  – DOI currently working with ISBN International (ISBNs as DOIs)
• Or metadata may be linked through data values in handle record
• First class naming, appropriate granularity
• Authentication, security, does not conflate identity and data (e.g. location), etc.
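The two options above (an existing number used as the suffix of a Handle, and metadata linked through typed data values in the handle record) can be illustrated with a toy in-memory registry. This is not the Handle System software or its protocol; the prefix, the value types other than URL, and all class and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Optional


@dataclass(frozen=True)
class HandleValue:
    index: int     # position of the value within the record
    type: str      # what kind of data this is, e.g. "URL"
    data: str


def handle_for(prefix: str, local_id: str) -> str:
    """Use an existing scheme's number as the suffix of a handle."""
    return f"{prefix}/{local_id}"


class ToyHandleRegistry:
    """In-memory stand-in for a resolution service (illustration only)."""

    def __init__(self) -> None:
        self._records: Dict[str, List[HandleValue]] = {}

    def register(self, handle: str, values: Iterable[HandleValue]) -> None:
        if "/" not in handle:
            raise ValueError("a handle has the form prefix/suffix")
        self._records[handle] = list(values)

    def resolve(self, handle: str, value_type: Optional[str] = None) -> List[HandleValue]:
        """Return all values of a record, or only those of one type."""
        values = self._records[handle]
        if value_type is None:
            return list(values)
        return [v for v in values if v.type == value_type]


if __name__ == "__main__":
    registry = ToyHandleRegistry()
    handle = handle_for("99999.example", "9780306406157")   # hypothetical prefix
    registry.register(handle, [
        HandleValue(1, "URL", "https://publisher.example.org/books/9780306406157"),
        HandleValue(2, "METADATA_URL", "https://publisher.example.org/meta/9780306406157"),
    ])
    print([v.data for v in registry.resolve(handle, "URL")])
```

The name stays fixed while the typed values in the record can change, which is the sense in which resolution does not conflate identity with data such as location.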

Page 26: Conclusion

• Content industry standards activities are extending their old focus on numbering schemes
  – into party identification, licensing, data modelling, and fundamental principles
  – interoperability, internet registries, ontologies

• Management of identifiers and metadata = “Naming and meaning of digital objects” http://www.doi.org/topics/060927AXMEDIS2006DOI.pdf

• Need for first class naming
  – Handle system
  – infrastructure for extensible distributed services for using names to locate and disseminate objects

• Need for semantic interoperability
  – Contextual ontology (<indecs>): Contexts, roles, relationships
  – functional granularity
  – use of existing metadata schemes

• Identity management discussions can learn from and use these techniques
