Mapping Research Infrastructures with the ENVRI Reference Model


description

Describes work done by the EC FP7 ENVRI project (http://www.envri.eu/) on understanding the common requirements of ESFRI environmental research infrastructures, and on developing a "reference model" to support a common language of communication and understanding between these vastly different communities of environmental scientists.

Transcript of Mapping Research Infrastructures with the ENVRI Reference Model

Page 1: Mapping Research Infrastructures with the ENVRI Reference Model

10/04/2023


Page 2: Mapping Research Infrastructures with the ENVRI Reference Model

Project number: 283465

Picture is Creative Commons by Quinn Dombrowski, used under CC-BY-SA 2.0, cropped

The ENVRI Reference Model

• Why we need it

• How we built it

• And what it is

• Early adoption and use

• Benefits and conclusions

Page 3: Mapping Research Infrastructures with the ENVRI Reference Model


Why do we need it?

To help the community reach a common vision

To provide a common language for communication

To provide a uniform framework into which RIs’ components can be placed and compared

To provide common solutions to common problems

To secure interoperability

To enable reuse and sharing of resources and experiences, and to avoid duplicated effort


Page 4: Mapping Research Infrastructures with the ENVRI Reference Model


Intended audience

• Implementation teams: architects, designers, integrators, engineers

• Operations teams

• Third-party solution / component providers

Page 5: Mapping Research Infrastructures with the ENVRI Reference Model


How did we build it?

By analysing common requirements of Environmental Research Infrastructures


IAGOS, EURO-Argo, ICOS, LifeWatch, COPAL, SIOS, EISCAT 3D, EPOS, EMSO

Page 6: Mapping Research Infrastructures with the ENVRI Reference Model


How did we build it?

Page 7: Mapping Research Infrastructures with the ENVRI Reference Model


How did we build it?

We identified 5 common subsystems, with points of reference between them

Page 8: Mapping Research Infrastructures with the ENVRI Reference Model


ENVRI Common Subsystems


Chen, Y. et al., Analysis of Common Requirements for Environmental Science Research Infrastructures, ISGC 2013


• Data Processing: facilities for analysis, mining, experiments (combined/derived data)

• Community Support: supports users to conduct their roles in communities (data about users)

• Data Acquisition: brings measurements / data streams into the infrastructure (non-reproducible data)

• Data Curation: manages / maintains quality data (reproducible data)

• Data Access: facilities for discovery and access (published data)
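The five subsystem descriptions above can be captured as a small lookup structure. This is only an illustrative sketch: the purpose and data-category wording comes from the slide, while the subsystem labels are assumed from headings used elsewhere in this deck, and the `Subsystem` class and `by_data_category` helper are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Subsystem:
    name: str           # label assumed from other slides in this deck
    purpose: str        # wording taken from the slide
    data_category: str  # the parenthesised data type on the slide


SUBSYSTEMS = (
    Subsystem("Data Acquisition",
              "brings measurements / data streams into the infrastructure",
              "non-reproducible data"),
    Subsystem("Data Curation",
              "manages / maintains quality data",
              "reproducible data"),
    Subsystem("Data Access",
              "facilities for discovery and access",
              "published data"),
    Subsystem("Data Processing",
              "facilities for analysis, mining, experiments",
              "combined/derived data"),
    Subsystem("Community Support",
              "supports users to conduct their roles in communities",
              "data about users"),
)


def by_data_category(category: str) -> Subsystem:
    """Return the subsystem responsible for a given category of data."""
    for subsystem in SUBSYSTEMS:
        if subsystem.data_category == category:
            return subsystem
    raise KeyError(category)


if __name__ == "__main__":
    for s in SUBSYSTEMS:
        print(f"{s.name}: {s.purpose} ({s.data_category})")
```

A component of a research infrastructure can then be placed by asking which category of data it handles, which is the kind of comparison the uniform framework is meant to support.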

Page 9: Mapping Research Infrastructures with the ENVRI Reference Model


Common Functions (Curation)

Identified the functions/operations of Data Curation

| Functions/Embedded Services | ICOS | EPOS | EMSO | EISCAT-3D | LifeWatch | EURO-Argo |
|---|---|---|---|---|---|---|
| Data Quality Checking | Yes | Yes | Unknown | Yes | Not Applicable | Yes |
| Data Quality Verification | Yes | Unknown | Unknown | Unknown | Not Applicable | Yes |
| Data Identification | Yes | Yes | Yes | Unknown | Not Applicable | Unknown |
| Data Cataloguing | Unknown | Yes | Yes | Unknown | Not Applicable | Unknown |
| Data Product Generation | Yes | Yes | Yes | Yes | Not Applicable | Yes |
| Data Versioning | Yes | Unknown | Unknown | Unknown | Not Applicable | Unknown |
| Workflow Enactment | No | Yes | Unknown | Yes | Not Applicable | No |
| Data Preservation | Yes | Yes | Yes | Yes | Not Applicable | Yes |
| Data Replication | No | Yes | Unknown | Yes | Not Applicable | Yes |
| Data Replication Synchronisation | No | Unknown | No | Unknown | Not Applicable | Yes |
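One way to read the curation table is to count, per function, how many infrastructures answered "Yes". The sketch below does that with the values transcribed from the slide; the threshold of 4 and the helper names are assumptions made for illustration, not criteria stated by the ENVRI project.

```python
# Column order matches the slide's table header.
RIS = ["ICOS", "EPOS", "EMSO", "EISCAT-3D", "LifeWatch", "EURO-Argo"]

NA = "Not Applicable"

# Answers per curation function, transcribed from the slide.
CURATION = {
    "Data Quality Checking": ["Yes", "Yes", "Unknown", "Yes", NA, "Yes"],
    "Data Quality Verification": ["Yes", "Unknown", "Unknown", "Unknown", NA, "Yes"],
    "Data Identification": ["Yes", "Yes", "Yes", "Unknown", NA, "Unknown"],
    "Data Cataloguing": ["Unknown", "Yes", "Yes", "Unknown", NA, "Unknown"],
    "Data Product Generation": ["Yes", "Yes", "Yes", "Yes", NA, "Yes"],
    "Data Versioning": ["Yes", "Unknown", "Unknown", "Unknown", NA, "Unknown"],
    "Workflow Enactment": ["No", "Yes", "Unknown", "Yes", NA, "No"],
    "Data Preservation": ["Yes", "Yes", "Yes", "Yes", NA, "Yes"],
    "Data Replication": ["No", "Yes", "Unknown", "Yes", NA, "Yes"],
    "Data Replication Synchronisation": ["No", "Unknown", "No", "Unknown", NA, "Yes"],
}


def support_count(answers):
    """Number of infrastructures that explicitly answered 'Yes'."""
    return sum(1 for answer in answers if answer == "Yes")


def widely_supported(matrix, threshold):
    """Functions with at least `threshold` 'Yes' answers (threshold is an assumption)."""
    return [name for name, answers in matrix.items()
            if support_count(answers) >= threshold]


if __name__ == "__main__":
    for name, answers in CURATION.items():
        print(f"{name}: {support_count(answers)} of {len(RIS)}")
    print(widely_supported(CURATION, 4))
```

"Unknown" and "Not Applicable" are deliberately not counted as support, so the tally is a lower bound on how common each function is.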

Page 10: Mapping Research Infrastructures with the ENVRI Reference Model


Common Functions (Access)

Identified the functions/operations of Data Access

| Functions/Embedded Services | ICOS | EPOS | EMSO | EISCAT-3D | LifeWatch | EURO-Argo |
|---|---|---|---|---|---|---|
| Access Control | Unknown | Yes | Unknown | Yes | Unknown | Unknown |
| Data Conversion | Yes | Yes | Yes | Yes | Yes | Yes |
| Data Compression | No | No | No | No | Yes | No |
| Data Visualisation | Yes | Yes | Yes | Yes | Yes | Yes |
| Data Publication | Yes | Unknown | Yes | Unknown | Yes | Yes |
| Data Citation | No | Unknown | Yes | No | Unknown | No |
| (Resources/Data) Annotation | Yes | Yes | Yes | No | Yes | Yes |
| Metadata Harvesting | Unknown | Unknown | Yes | No | Unknown | No |
| Resource Registration | Unknown | Yes | Yes | No | Yes | No |
| Semantic Harmonisation | No | Yes | Yes | No | Yes | No |
| Data Discovery and Access | Yes | Yes | Yes | Yes | Yes | Unknown |

A full function list is on the ENVRI RM website: http://confluence.envri.eu:8090/x/GwAF

Page 11: Mapping Research Infrastructures with the ENVRI Reference Model


How did we build it?


Analysis of common requirements resulted in a set of common functionalities

Identified a minimal model

• Focuses on core interactions

• Represents the most fundamental functionalities

• A skeleton that can be extended

• Future development based on community interests

Page 12: Mapping Research Infrastructures with the ENVRI Reference Model


How did we build it?

Using Open Distributed Processing (ODP) (ISO/IEC 10746)

A framework for structuring the specification of large-scale, complex distributed systems

• An object modelling approach

• A viewpoints-based approach


Page 13: Mapping Research Infrastructures with the ENVRI Reference Model


ODP Viewpoints

18/03/2014

Adapted from ISO/IEC 19793, 2009


Science

Page 14: Mapping Research Infrastructures with the ENVRI Reference Model


ENVRI RM: Science Viewpoint

Derived from common requirements by identifying communities, roles and behaviours

Model defines:

5 common communities, corresponding to the 5 subsystems

• Data Acquisition community: collects raw data

• Data Curation community: manages and archives quality data

• Data Publication community: assists publication, discovery & access

• Data Service Provision community: provides services to derive knowledge

• Data Usage community: makes use of data/services

For each community: roles & behaviours

e.g. Acquisition

Roles: Scientist, Technician, Observer, Sensor, etc.

Behaviours: Design of measurement model, instrument configuration, calibration, data collection

www.envri.eu/rm

Page 15: Mapping Research Infrastructures with the ENVRI Reference Model


ENVRI RM: Information Viewpoint

Data-oriented approach:

• Follow the data life cycle in each subsystem

• Identify information objects, actions, and state changes when events/actions occur

Model defines:

• A set of information objects handled by a subsystem

• A set of action types that cause the state changes

• Dynamic schema: how information objects evolve as the system operates, including constraints on state changes

• Static schema: instantaneous views at life-cycle stages
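The idea of a dynamic schema (constraints on how an information object may change state) and a static schema (a snapshot at one life-cycle stage) can be sketched as a small state machine. The state names and the "add metadata" and "check quality" actions appear on the Acquisition Subsystem slide of this deck; the transition table itself, the "publish data" and "process data" actions, and all class and method names are assumptions made purely for illustration.

```python
from enum import Enum


class State(Enum):
    RAW = "Raw"
    REVIEWED = "Reviewed"
    PUBLISHED = "Published"
    PROCESSED = "Processed"


# Dynamic schema: allowed (current state, action) -> next state.
# ASSUMPTION: this particular table is illustrative, not taken from the RM.
TRANSITIONS = {
    (State.RAW, "add metadata"): State.RAW,
    (State.RAW, "check quality"): State.REVIEWED,
    (State.REVIEWED, "publish data"): State.PUBLISHED,   # hypothetical action
    (State.PUBLISHED, "process data"): State.PROCESSED,  # hypothetical action
}


class InformationObject:
    """An information object whose state changes only through allowed actions."""

    def __init__(self, identifier):
        self.identifier = identifier
        self.state = State.RAW
        self.history = [State.RAW]

    def apply(self, action):
        """Apply an action type, enforcing the dynamic-schema constraints."""
        key = (self.state, action)
        if key not in TRANSITIONS:
            raise ValueError(f"{action!r} is not allowed in state {self.state.value}")
        self.state = TRANSITIONS[key]
        self.history.append(self.state)
        return self.state

    def snapshot(self):
        """Static schema: an instantaneous view of the object."""
        return {"identifier": self.identifier, "state": self.state.value}


if __name__ == "__main__":
    obj = InformationObject("demo-1")
    obj.apply("add metadata")
    obj.apply("check quality")
    print(obj.snapshot())
```

Keeping the allowed transitions in one table makes the constraints easy to compare across subsystems, which is the purpose of writing them down as a schema.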


Page 16: Mapping Research Infrastructures with the ENVRI Reference Model


ENVRI RM: Computational Viewpoint

Service-oriented, brokered approach

Core functionalities encapsulated as a set of service objects

Model defines two types of service objects:

• A set of computational objects. Each encapsulates specific functionalities and provides a set of interfaces to invoke functions

• A set of binding objects to coordinate multi-party interactions
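The split between computational objects and binding objects can be sketched in a few lines. The interface names (configure instrument, acquire data, import data) and the "Raw data collection" interaction are taken from the Acquisition Subsystem slide of this deck; the classes, their bodies and the `rate_hz` setting are hypothetical stand-ins, not part of the reference model.

```python
class InstrumentHost:
    """Computational object: encapsulates access to one instrument (stand-in)."""

    def __init__(self, readings):
        self._readings = list(readings)  # canned readings instead of real hardware
        self.settings = {}
        self.configured = False

    def configure_instrument(self, settings):
        self.settings = dict(settings)
        self.configured = True

    def acquire_data(self):
        if not self.configured:
            raise RuntimeError("instrument must be configured before acquisition")
        return list(self._readings)


class ImportService:
    """Computational object on the curation side: stores incoming raw data."""

    def __init__(self):
        self.raw_data = []

    def import_data(self, records):
        self.raw_data.extend(records)
        return len(records)


class RawDataCollection:
    """Binding object: coordinates the parties involved in one interaction."""

    def __init__(self, host, importer):
        self.host = host
        self.importer = importer

    def run(self, settings):
        self.host.configure_instrument(settings)
        records = self.host.acquire_data()
        return self.importer.import_data(records)


if __name__ == "__main__":
    host = InstrumentHost([1.0, 2.5, 3.1])
    importer = ImportService()
    imported = RawDataCollection(host, importer).run({"rate_hz": 1})
    print(imported, importer.raw_data)
```

The computational objects know nothing about each other; only the binding object holds references to both, so either side could be replaced without touching the other.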


Page 17: Mapping Research Infrastructures with the ENVRI Reference Model


Science

Acquisition Subsystem


Adapted from ISO/IEC 19793, 2009


Information objects: Specification of measurements, Measurement result, Persistent data, Data state, Metadata, Persistent identifier

Action types (cause state change): Perform measurement, Add metadata, Check quality, Store data

States: Raw, Reviewed, Published, Processed, etc.

Computational objects: Instrument host, Acquisition service

Interfaces: Configure instrument, Acquire data, Import data

Reference interactions: Raw data collection coordinates the above objects with the Import service object and the Raw data object in the Curation subsystem

Community: Acquisition

Roles: Scientist, Technician, Observer, Sensor, etc.

Behaviours: Design measurement model, Configure instrument, Calibrate, Collect data

http://envri.eu/rm

Page 18: Mapping Research Infrastructures with the ENVRI Reference Model


Reference Model Ontology


Science Viewpoint

Information Viewpoint

Computational Viewpoint

RM OWL version: http://staff.science.uva.nl/~zhiming/Ontology/

The online tool: http://envriontology.appspot.com/main/

Page 19: Mapping Research Infrastructures with the ENVRI Reference Model


Early Adoption and use of the RM

Interactions with target audiences:

• ESFRI ENV RIs: EISCAT 3D, ICOS, EPOS, EMSO

• Others: GfBio, Helsinki University

All starting to use the language and model concepts

RDA Data Foundation & Terminology: use case for evaluation

DASISH (ESFRI social sciences and humanities cluster)

ODP & Reference Model workshop, Colchester, 17 March 2014

CROSSING: Cross-cutting Services to Support data sharing

A top 5 topic for further study by (almost) all RIs

Page 20: Mapping Research Infrastructures with the ENVRI Reference Model


EISCAT 3D Research Infrastructure


EISCAT: European incoherent scatter radar for atmospheric, geospace research

EISCAT 3D: next generation 3D imaging radar

Studies how Earth’s atmosphere is coupled to space; uniquely located for studies of the Arctic ionosphere

Pilot study, Feb 2013 to date; dialogue continues

EISCAT International Symposium, Lancaster, 10 Aug 2013

Page 21: Mapping Research Infrastructures with the ENVRI Reference Model

Integrated Carbon Observation System (ICOS)

“A pan-European research infrastructure for quantifying and understanding the greenhouse gas balance of the European continent and adjacent regions”

Integrating atmospheric, marine and ecosystem measurements with standardized procedures and analysis; operational by 2016/17

Page 22: Mapping Research Infrastructures with the ENVRI Reference Model


ICOS RI dataflow with RM labels

[Figure: ICOS dataflow diagram. Elements shown: ICOS measurement station networks; Atmospheric, Ecosystem and Oceanic Thematic Centers; ICOS Carbon Portal (data & metadata curation; elaborated products & syntheses); externally compiled data; externally produced elaborated products; users (scientists, policy makers, general public). RM labels applied: Data acquisition, Data curation, Data processing & synthesis, Community support.]

Page 23: Mapping Research Infrastructures with the ENVRI Reference Model


Data acquisition

A. DATA ACQUISITION

| Functionality | No. | Entries (columns HO, CP, *TCs, *S-PI) |
|---|---|---|
| Configuration logging | A.5 | develop, recommend? yes? |
| Data collection | A.10 | recommend? yes |
| Data sampling | A.12 | develop? ? |
| Noise reduction | A.13 | develop, operate? |
| Realtime data collection | A.11 | ? ? |
| Data transmission | A.14 | develop, operate |
| Data transmission monitoring | A.16 | yes yes? |
| Realtime data transmission | A.15 | yes: ATC, OTC ?? |
| Instrument access | A.4 | ? yes |
| Instrument calibration | A.3 | CAL yes |
| Instrument configuration | A.2 | ? yes? |
| Instrument integration | A.1 | ? yes? |
| Instrument monitoring | A.6 | yes? yes? |
| Parameter visualization | A.7 | provide links to TCs; provide, operate |
| Realtime parameter visualization | A.8 | provide links to TCs, stations; operate; operate? |
| Process control | A.9 | coordinate yes? |

Discussions since January 2014 with tech and management. First try! NOT final by any means!

Next workshop: London, June 2014

Page 24: Mapping Research Infrastructures with the ENVRI Reference Model


GfBio

German Federation for the Curation of Biological Data

Sustainable, service-oriented, national data infrastructure facilitating data sharing for biological and environmental research

Based on well-established archives, e.g. MARUM, PANGAEA

ENVRI RM as common terminology

Architecture: define and document GfBio service portfolios and critical components based on the minimal model and common functions

Business model: estimate GfBio costs and compensation models required for operation of these services

Initially for PANGAEA, Bexis, and DWB

Page 25: Mapping Research Infrastructures with the ENVRI Reference Model

ENVRI RM and GfBio: PANGAEA portfolio

B. Data Curation Subsystem

| No. | Function | Offered service | Cost, justification | Cost numeric | Cost category | Compensation model |
|---|---|---|---|---|---|---|
| B.1 | Data Quality Checking | Technical quality control, plausibility checks | computing | | per dataset | in kind |
| B.2 | Data Quality Verification | Iterative data peer-review process by data curators in cooperation with PI | curation | | per dataset | charges |
| B.3 | Data Identification | Persistent and unique identification and citability of data with a Digital Object Identifier (DOI) | computing | | per dataset | in kind |
| B.4 | Data Cataloguing | Iterative metadata completion and ontology harmonisation by data curators | curation | | per dataset | charges |
| B.4 | | Provision of PANGAEA editorial system | software licence, maintenance, administration | | basis cost or per project | charges |
| B.4 | | Project data curator training | training | | per project | charges |
| B.5 | Data Product Generation | Preparation of data compilations | curation | | per compilation | charges |
| B.6 | Data Versioning | | | | | |
| B.7 | Workflow Enactment | Provision and maintenance of Data submission and Ticket System (Jira) | licences, maintenance | | per user | in kind |
| B.8 | Data Storage & Preservation | Long-term archiving and storage of data according to the ICSU WDS practices, incl. data authenticity and integrity checks | curation | | per dataset | charges |
| B.8 | | Iterative data reformatting and ingest by data curator | curation | | per dataset | charges |
| B.8 | | Long-term provision and maintenance of 4D Metadata catalogue | licences, maintenance | | basis | in kind |

© http://libraryjumpers.webs.com/

Page 26: Mapping Research Infrastructures with the ENVRI Reference Model


Benefits of Using the RM (Immediate 1-5 years)

Professional framework for clearly defining roles and processes in RI operations

Makes it far easier to design an RI in the Construction Phase

Helps to evaluate current RIs for division of tasks

Helps to find missing or duplicated actions

Easier definition of requirements of IT components

Enables a more modular approach for the RI IT solutions

Makes it easier to use external suppliers (e.g. international IT co-operation projects) for component development


Page 27: Mapping Research Infrastructures with the ENVRI Reference Model


Benefits of Using the RM (Intermediate 5-10 years)

A common language ensures common understanding

Avoiding duplications

Enabling re-use of components, solutions & policies

The use of a planned, standard, modular approach enables scalable design solutions

Better risk management of RI development, due to the possibility of changing individual modules and operations of the RIs without needing to completely redesign the systems because of some ad-hoc solutions

Improving the trustworthiness of the RI products due to clearly defined and standardized ways to present workflows

Page 28: Mapping Research Infrastructures with the ENVRI Reference Model


Benefits of Using the RM (Long-term 10-20 years)

Greater level of interoperability through the use of common standards, enabling data usage and communication between the RIs to become commonplace

Support of cross-disciplinary perspectives and products and enablement of systems science approach

Larger potential user base due to easier use of the RI products, which increases the impact and return on investment of RIs


Page 29: Mapping Research Infrastructures with the ENVRI Reference Model

We need the same language to make things fit together

Thank you – Any questions?

Picture is Creative Commons by www.glynlowe.com, used under CC-BY-SA 2.0

http://envri.eu/rm