PERICLES Modelling Policies - Acting on Change 2016

GRANT AGREEMENT: 601138 | SCHEME FP7 ICT 2011.4.3 Promoting and Enhancing Reuse of Information throughout the Content Lifecycle taking account of Evolving Semantics [Digital Preservation] “This project has received funding from the European Union’s Seventh Framework Programme for research, technological development and demonstration under grant agreement no601138”. INTERACTIVE WORKSHOP: Modelling Policies, Exploring Real Use Cases Justin Simpson (Artefactual) Fabio Corubolo (Univ. Liverpool/PERICLES) Jean-Yves Vion-Dury (Xerox/PERICLES) Stratos Kontopoulos (CERTH/PERICLES) Joel Simpson (Artefactual) @PericlesFP7 #PERIconf2016

Transcript of PERICLES Modelling Policies - Acting on Change 2016

Page 1: PERICLES Modelling Policies - Acting on Change 2016

GRANT AGREEMENT: 601138 | SCHEME FP7 ICT 2011.4.3 Promoting and Enhancing Reuse of Information throughout the Content Lifecycle taking account of Evolving Semantics [Digital Preservation]

“This project has received funding from the European Union’s Seventh Framework Programme for research, technological development and demonstration under grant agreement no601138”.

INTERACTIVE WORKSHOP: Modelling Policies, Exploring Real Use Cases
Justin Simpson (Artefactual)
Fabio Corubolo (Univ. Liverpool/PERICLES)
Jean-Yves Vion-Dury (Xerox/PERICLES)
Stratos Kontopoulos (CERTH/PERICLES)
Joel Simpson (Artefactual)

@PericlesFP7 #PERIconf2016

Page 2: PERICLES Modelling Policies - Acting on Change 2016

▶Objectives & Motivation

▶Why model policies?

▶Ontology-based Policy Modelling

▶Policy and change management

▶Examples

Outline

Page 3: PERICLES Modelling Policies - Acting on Change 2016

▶What do different types of stakeholders (e.g. archivists vs. technologists, generalists vs. specialists) think is significant from a “policy” perspective?

▶Are ontologies a helpful tool in describing policies?

▶How could we represent real policies with a proposed design pattern?

Objectives

Page 4: PERICLES Modelling Policies - Acting on Change 2016

Why Model Policies?

We believe the focus on policies in Archivematica today provides a number of benefits:

▶Simplification: separating rules (policies) from workflow makes both easier to configure and manage

▶Understandability: abstracting policies from technical implementation enables non-technical users to interact more directly with the system

▶Shareability: enables some level of sharing of best practices across the community

Page 5: PERICLES Modelling Policies - Acting on Change 2016

Why Model Policies?

We think the PERICLES approach may help us improve upon our existing focus on policies:

▶Simplification: many important preservation decisions are still deeply embedded in technical implementation

▶Understandability: using well-defined vocabularies & languages (ontologies) to define policy will make it easier to be precise and eliminate ambiguity

▶Shareability: using common standards will make it easier to share policy within a community

Page 6: PERICLES Modelling Policies - Acting on Change 2016

Why Model Policies? New Benefits

▶Impact analysis: ability to determine the impact of a system or policy change before it is committed

▶Reasoning / change management: In some cases we can automate the management (resolution) of change issues

▶Validation: we can attach ad hoc validation processes (tests)

▶Reuse: making use of existing ontological knowledge bases on formats and preservation policies in general

Page 7: PERICLES Modelling Policies - Acting on Change 2016

Abstraction of complex systems as models that can be manipulated independently

Model-driven Preservation

Models

Digital ecosystem

◦ Analogy with biological systems

◦ Evolving systems of interdependent entities

Capture and representation of the environment

Continuous change and reuse

Continuum approach

▶ Merging of active-life and archival phases

▶ Non-custodial

Page 8: PERICLES Modelling Policies - Acting on Change 2016

▶Models can be constructed on existing infrastructure

▶Does not require replacing existing services

▶Adds preservation and policy management on top of what exists

▶Saves costs and adoption time

Model-driven approach

Page 9: PERICLES Modelling Policies - Acting on Change 2016

PERICLES introduced ontologies at different levels that are partially independent:

▶LRM - ontology for linked resources

▶Policy ODP - generic policy ontology

▶DEM - formalism for digital ecosystem (uses LRM)

▶Domain specific instances

PERICLES Ontologies

Page 10: PERICLES Modelling Policies - Acting on Change 2016

▶Relation between change and dependency

▶Understanding dependencies between digital objects and resources within their environment is key to managing change

▶Given objects A and B, A is dependent on B if changes to B have a significant impact on the state of A, or if changes to B can impact the ability to perform function X on A.

Dependency and Change
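The dependency definition above lends itself to a small impact-analysis sketch. The following is a minimal illustration in plain Python rather than in an ontology language; the entity names and the `DEPENDS_ON` table are hypothetical, and treating impact as transitive is an assumption that the slide does not state.

```python
from collections import deque

# Hypothetical ecosystem: each key depends on the entities listed for it.
DEPENDS_ON = {
    "video.mkv": ["player", "codec"],
    "player": ["operating_system"],
    "codec": [],
    "operating_system": [],
}

def impacted_by(changed, depends_on):
    """Return every entity that depends, directly or transitively, on `changed`."""
    # Invert the edges so we can ask "who depends on this entity?"
    dependents = {}
    for entity, targets in depends_on.items():
        for target in targets:
            dependents.setdefault(target, set()).add(entity)

    # Breadth-first walk over the inverted edges.
    impacted, queue = set(), deque([changed])
    while queue:
        current = queue.popleft()
        for dependent in dependents.get(current, ()):
            if dependent not in impacted:
                impacted.add(dependent)
                queue.append(dependent)
    return impacted

print(sorted(impacted_by("operating_system", DEPENDS_ON)))  # ['player', 'video.mkv']
```

A change to the operating system reaches the video only through the player, which is the kind of indirect impact a dependency model is meant to surface.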

Page 11: PERICLES Modelling Policies - Acting on Change 2016

Dependency: the association, relation or interaction among two or more Resources

Plan: presents a set of actions/steps to be executed by an Agent

precondition and impactDescription:

intention: the intended usage of a Resource

specification: the context of the Dependency itself

LRM Dependency
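As a rough illustration of the terms on this slide, a dependency can be pictured as a record that links resources and carries its intention and specification. This is a plain-Python stand-in, not the LRM ontology itself: the field names, the from/to direction, and the example values are all assumptions, and the slide gives no definitions for precondition and impactDescription, so they appear here only as optional free text.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Dependency:
    """Plain-Python stand-in for an LRM Dependency (field names are illustrative)."""
    from_resources: tuple          # assumed: the resources that depend on something
    to_resources: tuple            # assumed: the resources they depend on
    intention: str                 # the intended usage of a resource
    specification: str             # the context of the dependency itself
    precondition: str = ""         # not defined on the slide; free text here
    impact_description: str = ""   # not defined on the slide; free text here

def dependencies_on(resource, dependencies):
    """Select the dependencies whose target side includes `resource`."""
    return [d for d in dependencies if resource in d.to_resources]

# Hypothetical example instance.
playback = Dependency(
    from_resources=("video.mkv",),
    to_resources=("player",),
    intention="render the video for viewing",
    specification="the player must support the video's container format",
)

print([d.intention for d in dependencies_on("player", [playback])])
```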

Page 12: PERICLES Modelling Policies - Acting on Change 2016

LRM Dependency

Page 13: PERICLES Modelling Policies - Acting on Change 2016

PERICLES Design Pattern for Policies

http://ontologydesignpatterns.org/wiki/Submissions:Policy

Page 14: PERICLES Modelling Policies - Acting on Change 2016

Detailed definition of the policy contents

PERICLES Design Pattern for Policies

Page 15: PERICLES Modelling Policies - Acting on Change 2016

Associates a policy with an entity that is subject to the policy.

PERICLES Design Pattern for Policies

Page 16: PERICLES Modelling Policies - Acting on Change 2016

Associates a policy with the user community the policy has been designed for.

PERICLES Design Pattern for Policies

Page 17: PERICLES Modelling Policies - Acting on Change 2016

Associates a policy with the agent that is responsible for the application of the policy (person or role)

PERICLES Design Pattern for Policies

Page 18: PERICLES Modelling Policies - Acting on Change 2016

Associates a policy with a process that is used to implement the policy

PERICLES Design Pattern for Policies
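The associations described on the preceding slides (policy contents, subject entity, user community, responsible agent, implementing process) can be pictured as one record. This is a minimal sketch in plain Python, assuming hypothetical field names; it does not reproduce the property names of the published design pattern, and the example values are invented for illustration.

```python
from dataclasses import dataclass, field

@dataclass
class Policy:
    """Illustrative record mirroring the associations named on the slides."""
    statement: str                                         # the policy contents
    subject_entities: list = field(default_factory=list)   # entities subject to the policy
    target_community: str = ""                             # community it was designed for
    responsible_agent: str = ""                            # person or role applying it
    implemented_by: list = field(default_factory=list)     # processes implementing it

def missing_links(policy):
    """List the associations that have not been filled in yet."""
    gaps = []
    if not policy.subject_entities:
        gaps.append("subject entity")
    if not policy.target_community:
        gaps.append("user community")
    if not policy.responsible_agent:
        gaps.append("responsible agent")
    if not policy.implemented_by:
        gaps.append("implementing process")
    return gaps

# Hypothetical instance: only some associations are known so far.
email_policy = Policy(
    statement="Preferred email preservation format is maildir",
    subject_entities=["email collection"],
    responsible_agent="archivist",
)

print(missing_links(email_policy))  # ['user community', 'implementing process']
```

A check like `missing_links` is one way to spot a policy that nobody is responsible for or that no process implements.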

Page 19: PERICLES Modelling Policies - Acting on Change 2016

▶Policies can be expressed in formal languages

◦ The SPIN and ReAL languages work on ontologies

▶They can impose constraints, perform changes and run validation

▶Changes in the models (incl. policies) can be managed using these rules and techniques

Modelling for Change Management

Page 20: PERICLES Modelling Policies - Acting on Change 2016

▶Policy: all videos from Collection X must be renderable on at least one of the players Y

▶Model based on the ODP pattern we just described

▶Uses PERICLES models and ideas

▶This policy is a pattern on its own: keep data processable

Video Playback Example
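The video playback policy can be read as a validation rule over the model. The sketch below is a plain-Python approximation, not the SPIN or ReAL rules used in PERICLES; the file names, format labels and player names are hypothetical.

```python
def unrenderable(videos, players):
    """Return the videos whose format no player in `players` supports.

    videos:  mapping of video name -> format label
    players: mapping of player name -> set of supported format labels
    """
    supported = set()
    for formats in players.values():
        supported |= set(formats)
    return sorted(name for name, fmt in videos.items() if fmt not in supported)

# Hypothetical Collection X and players Y.
collection_x = {"interview.mkv": "matroska", "lecture.mp4": "mp4"}
players_y = {"player_a": {"mp4"}, "player_b": {"matroska", "mp4"}}

print(unrenderable(collection_x, players_y))      # []

# A change in the ecosystem: player_b is retired.
players_after = {"player_a": {"mp4"}}
print(unrenderable(collection_x, players_after))  # ['interview.mkv']
```

Running the same check before and after a change to the set of players shows which videos would start to violate the policy.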

Page 21: PERICLES Modelling Policies - Acting on Change 2016

Change in Video Playback

Page 22: PERICLES Modelling Policies - Acting on Change 2016

Example of change propagation

Page 23: PERICLES Modelling Policies - Acting on Change 2016

Scalable, flexible: a system view

Page 24: PERICLES Modelling Policies - Acting on Change 2016

▶Policy: Preferred email preservation format is “maildir”

▶Preservation Task: When email is provided in a format that is not suitable for preservation, we normalize the email to the appropriate preservation format.

◦ E.g. Normalize a “pst” object (a proprietary email format from Microsoft) into a “maildir” object.

▶Implementation: We use the open source tool “readpst”.

Example

Page 25: PERICLES Modelling Policies - Acting on Change 2016

▶Policy: Preferred email preservation format is “mbox”

▶Preservation Task: When email is provided in a format that is not suitable for preservation, we normalize the email to the appropriate preservation format.

◦ E.g. Normalize a “pst” object (a proprietary email format from Microsoft) into an “mbox” object.

▶Implementation: We use the open source tool “readpst”.

Change:
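The two slides above differ only in the policy line, which suggests keeping the preferred format in the policy and deriving the preservation task from it. The following is a minimal sketch under that assumption; the class and function names are hypothetical, and it deliberately does not build a readpst command line, since the flags needed for each target format are outside what the slides state.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class FormatPolicy:
    content_type: str       # e.g. "email"
    preferred_format: str   # e.g. "maildir" or "mbox"

@dataclass(frozen=True)
class NormalizationTask:
    source_format: str
    target_format: str
    tool: str

# Implementation detail kept outside the policy: which tool handles which input.
TOOLS = {"pst": "readpst"}

def plan(source_format, policy, tools=TOOLS):
    """Derive the preservation task for one object from the current policy."""
    if source_format == policy.preferred_format:
        return None  # already in the preferred format, nothing to do
    tool = tools.get(source_format)
    if tool is None:
        raise LookupError(f"no tool registered for {source_format!r}")
    return NormalizationTask(source_format, policy.preferred_format, tool)

before = FormatPolicy("email", "maildir")
after = FormatPolicy("email", "mbox")

print(plan("pst", before))  # task targets maildir
print(plan("pst", after))   # same tool, task now targets mbox
```

Only the policy object changes between the two calls; the task is recomputed from it, which is the separation of rules from workflow that the earlier slides argue for.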

Page 26: PERICLES Modelling Policies - Acting on Change 2016

Policy Definition Diagram

Page 27: PERICLES Modelling Policies - Acting on Change 2016

Policy Definition Diagram

Page 28: PERICLES Modelling Policies - Acting on Change 2016

Exercise Time!

Modelling Considerations

1. Process vs. Policy: What’s important to define as “policy” from a preservation perspective? What aspects of the preservation process would be significant from a compliance point of view? Are there aspects of the process that are purely operational or technical (e.g. do we care which tool is used to perform a process? should that be part of policy?)

2. PERICLES Design Pattern for Policies: how would you describe the policy (or policies) from the example process using the constructs in the design pattern for policies? (e.g. requirement level; policy type)

3. Linked Resource Model: how would you model the policy along with the related concepts in a digital ecosystem model? e.g. what preconditions should exist, or specifications, impacts etc.

Page 29: PERICLES Modelling Policies - Acting on Change 2016

Exercise Time!

Consider one of the following example preservation processes. Which aspects of these processes make sense to model as policies? What might that model look like?

1. Extracting Attachments from email to enable other preservation processes to act on the individual attachment (e.g. format identification, characterization, normalization)

2. Format Identification using tools such as Siegfried or DROID to identify the format of digital objects (email formats: .msg, .pst, .eml, .mbox or attachments: .doc, .pptx)

3. Virus Scans & Quarantine processes - using tools such as ClamAV to identify viruses and taking further action to address viruses found

4. Format Validation using tools such as JHOVE to determine if a particular digital object fully or partially complies with the specification for the purported format

5. Email Signature Validation - the process of validating individual emails that have been provided with a digital signature (e.g. using DKIM or DMARC)