Page 1: Roy Lowry Adam Leadbetter British Oceanographic Data Centre.

SeaDataNet and EMODNET Vocabularies

Roy Lowry, Adam Leadbetter

British Oceanographic Data Centre

Page 2:

Overview

Automated parameter aggregation (P35) vocabulary status

EMODNET chemical filter

P01 semantic model exposure status

Management of concept deprecations

Page 3:

P35 Status

P35 is a vocabulary of parameters for EMODNET chemistry lot products

EMODNET data parameters are marked up using the P01 vocabulary

P01 is much finer grained than P35

Therefore aggregation of P01 parameters into P35 parameters is required

To date this has been done by a lot of painful manual work in ODV

Page 4:

P35 Status

However, within NVS it is possible to maintain and serve a mapping between P01 and P35

Each P35 concept has a URL that resolves to an XML document

This document can be used to drive automated parameter aggregation by identifying all P01 codes that may be incorporated into a P35 code
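The aggregation step described above can be sketched in Python. This is a minimal sketch under stated assumptions, not the NVS implementation: it assumes the P35 document is RDF/XML in which the P35 concept points to its member P01 concepts through `skos:narrower`. The sample document, the `example.org` URIs and the codes `DEMO0001`, `AAAA0001`, `AAAA0002` and `ZZZZ9999` are all invented for illustration.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
SKOS = "http://www.w3.org/2004/02/skos/core#"

# Hypothetical stand-in for the XML document behind a P35 concept URL.
SAMPLE_P35_DOC = f"""<rdf:RDF xmlns:rdf="{RDF}" xmlns:skos="{SKOS}">
  <skos:Concept rdf:about="http://example.org/P35/DEMO0001/">
    <skos:prefLabel>Demo aggregated temperature</skos:prefLabel>
    <skos:narrower rdf:resource="http://example.org/P01/AAAA0001/"/>
    <skos:narrower rdf:resource="http://example.org/P01/AAAA0002/"/>
    <skos:related rdf:resource="http://example.org/P06/UNIT0001/"/>
  </skos:Concept>
</rdf:RDF>"""


def p01_members(p35_xml, marker="/P01/"):
    """Return the set of P01 codes that the P35 document maps to.

    Assumption: member concepts are linked with skos:narrower and their
    URIs contain the marker "/P01/" followed by the code.
    """
    root = ET.fromstring(p35_xml)
    codes = set()
    for link in root.iter(f"{{{SKOS}}}narrower"):
        uri = link.get(f"{{{RDF}}}resource", "")
        if marker in uri:
            # The code is taken to be the last path segment of the URI.
            codes.add(uri.rstrip("/").rsplit("/", 1)[-1])
    return codes


def aggregate(rows, p35_code, members):
    """Relabel (p01_code, value) rows whose code belongs to the P35 concept."""
    return [(p35_code, value) for code, value in rows if code in members]


members = p01_members(SAMPLE_P35_DOC)
rows = [("AAAA0001", 12.1), ("ZZZZ9999", 35.0), ("AAAA0002", 12.3)]
aggregated = aggregate(rows, "DEMO0001", members)
print(aggregated)
```

Rows whose P01 code is not a member of the P35 concept are simply dropped here; a real tool would need to decide whether to keep, log or reject them.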

Page 5:

P35 Status

P35 presents design issues

P35 granularity (e.g. should there be separate products for unfiltered and filtered samples?)

Which P01 terms should map to a given P35 term?

Design issues need governance - domain experts who can make these decisions

Governance now established based on experts from EMODNET partners communicating by list-server e-mail and Webex.

Page 6:

P35 Status

Example P35 concept (ITS90 temperature) set up in October

Just over 100 additional entries considered by governance and loaded this month

These cover

Salinity

Dissolved oxygen

Nutrients

Metals in the water column

Next target is metals in sediments and biota (900-1000)

P35 could easily reach several thousand entries

Page 7:

EMODNET Chemical Filter

Need to consider what is required here

One approach is to specify a list of P02 codes that cover the themes included in the EU legislation

This comes with risks

Some data outside the intended scope will be captured (e.g. methylated arsenic in a trawl designed for organotins)

Easy to overlook consequences of any P02 rationalisation

P02 list can be tested against P35 (both map to P01)
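Because both vocabularies map to P01, the test mentioned above reduces to comparing two sets of P01 codes. The following is a minimal sketch, assuming the two mappings have already been loaded into plain dictionaries; every code shown (`ORGTIN`, `P35TBT`, `TBT00001` and so on) is a hypothetical placeholder rather than a real vocabulary entry.

```python
def compare_filters(p02_list, p02_to_p01, p35_to_p01):
    """Compare the P01 codes reached via a P02 list with those reached via P35.

    Returns the P01 codes the P02 filter would capture that no P35 concept
    covers, and the P01 codes P35 covers that the P02 filter would miss.
    """
    via_p02 = set().union(*(p02_to_p01.get(code, set()) for code in p02_list))
    via_p35 = set().union(*p35_to_p01.values())
    return {
        "outside_p35": sorted(via_p02 - via_p35),
        "missed_by_p02": sorted(via_p35 - via_p02),
    }


# Hypothetical mappings: one P02 theme that also sweeps up an arsenic code.
p02_to_p01 = {"ORGTIN": {"TBT00001", "MEAS0001"}}
p35_to_p01 = {"P35TBT": {"TBT00001", "TBT00002"}}

report = compare_filters(["ORGTIN"], p02_to_p01, p35_to_p01)
print(report)
```

A non-empty `outside_p35` list corresponds to the first risk above (out-of-scope data captured), while `missed_by_p02` flags gaps in the P02 list.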

Page 8:

EMODNET Chemical Filter

Alternative approach

Capture P01 codes through data mining

Translate P35 into a list of P01 codes

Do the chemical filter on the basis of P01 rather than P02
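The three steps above might look like the sketch below. It assumes each data record carries its P01 code in a `p01` field and that the P35-to-P01 mapping is available as a dictionary; the record layout and all codes are hypothetical.

```python
from collections import Counter


def mine_p01_codes(records):
    """Step 1: count which P01 codes actually occur in the data holdings."""
    return Counter(record["p01"] for record in records)


def p35_as_p01(p35_to_p01):
    """Step 2: flatten the P35 mapping into a single set of P01 codes."""
    return set().union(*p35_to_p01.values())


def chemical_filter(records, wanted_p01):
    """Step 3: keep only records whose P01 code is in the wanted set."""
    return [record for record in records if record["p01"] in wanted_p01]


# Hypothetical holdings and mapping.
records = [
    {"id": 1, "p01": "TBT00001"},
    {"id": 2, "p01": "MEAS0001"},
    {"id": 3, "p01": "TBT00001"},
]
p35_to_p01 = {"P35TBT": {"TBT00001", "TBT00002"}}

usage = mine_p01_codes(records)
selected = chemical_filter(records, p35_as_p01(p35_to_p01))
print(usage, [record["id"] for record in selected])
```

Mined codes that fall outside the P35 set (here `MEAS0001`) are the candidates governance would need to review.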

Page 9:

P01 Model Exposure Status

Both ODIP and EMODNET require access to the factored semantic model that underpins P01

Strong pressure from ODIP (primarily Simon Cox) for this to be delivered through RDF-XML

For this every element of the factorisation requires a URI

This requires every element to be covered by a controlled vocabulary

Page 10:

P01 Model Exposure Status

The biological entity in the factorisation is already covered (S25 vocabulary)

Parameter - matrix relationship already covered (S02 vocabulary)

Currently working on the matrix entity

Concepts like 'water body particulate >0.2um phase'

Taking longer than expected (part-time working, EMODNET, IMOS vocabulary demands, past misdemeanours)

But getting very close

Then we just need the parameter entity
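The requirement that every element of the factorisation has a URI can be illustrated with a small check. This is a sketch only: the four field names follow the elements listed on this slide, the concept is assumed to use all four of them, and the `example.org` URIs are hypothetical.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class FactoredP01:
    """One P01 concept split into the elements named on the slide."""
    parameter: Optional[str] = None          # parameter entity (vocabulary still needed)
    relationship: Optional[str] = None       # parameter - matrix relationship (S02)
    matrix: Optional[str] = None             # matrix entity (work in progress)
    biological_entity: Optional[str] = None  # biological entity (S25)


def elements_without_uri(concept):
    """List the elements that cannot yet be exposed because they lack a URI."""
    return [f.name for f in fields(concept) if getattr(concept, f.name) is None]


# Hypothetical concept: only the S02 and S25 elements have URIs so far.
concept = FactoredP01(
    relationship="http://example.org/S02/DEMO01/",
    biological_entity="http://example.org/S25/DEMO01/",
)
print(elements_without_uri(concept))
```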

Page 11:

Concept Deprecation

Many SeaDataNet vocabularies have evolved, with concepts added to satisfy specific demands

Governance explicitly prohibits deletion

This leads to issues

Unintended duplicates

Cause confusion

Unnecessarily complicate aggregation

Variable granularity

Discovery made more difficult (too many terms)

Patchy domain coverage

Unnecessarily complicates metadata markup

Page 12:

Concept Deprecation

NVS 1.0 handled deprecation poorly (URI changed)

Issues addressed in NVS 2.0

All payload documents include

skos:note element set to 'accepted' or 'deprecated'

owl:deprecated boolean element

Deprecated concept documents also have a dc:isReplacedBy element

Full controlled vocabulary requests may be designated 'accepted', 'deprecated' or 'all' (default)
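A client could read these elements as in the sketch below. It is a minimal illustration, not the NVS 2.0 schema: the sample document is invented, the `dc` prefix is assumed to be bound to the Dublin Core terms namespace, and `owl:deprecated` is assumed to carry the text `true` or `false`.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
SKOS = "http://www.w3.org/2004/02/skos/core#"
OWL = "http://www.w3.org/2002/07/owl#"
DC = "http://purl.org/dc/terms/"  # assumed binding for the dc: prefix

# Hypothetical payload document for a deprecated concept.
SAMPLE_CONCEPT = f"""<rdf:RDF xmlns:rdf="{RDF}" xmlns:skos="{SKOS}"
         xmlns:owl="{OWL}" xmlns:dc="{DC}">
  <skos:Concept rdf:about="http://example.org/P02/OLD1/">
    <skos:note>deprecated</skos:note>
    <owl:deprecated>true</owl:deprecated>
    <dc:isReplacedBy rdf:resource="http://example.org/P02/NEW1/"/>
  </skos:Concept>
</rdf:RDF>"""


def deprecation_status(concept_xml):
    """Return (is_deprecated, replacement_uri_or_None) for one concept."""
    root = ET.fromstring(concept_xml)
    concept = root.find(f"{{{SKOS}}}Concept")
    flag = concept.findtext(f"{{{OWL}}}deprecated", default="false")
    replacement = concept.find(f"{{{DC}}}isReplacedBy")
    uri = replacement.get(f"{{{RDF}}}resource") if replacement is not None else None
    return flag.strip().lower() == "true", uri


status = deprecation_status(SAMPLE_CONCEPT)
print(status)
```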

Page 13:

Concept Deprecation

Concept deprecation causes issues for the SeaDataNet architecture

Deprecated concepts contained in SeaDataNet metadatabases

Deprecated concepts in SeaDataNet filestock

Consequently, much needed vocabulary improvements (P03, P02) held back due to concern about the consequences

Page 14:

Concept Deprecation

The following deprecation support is needed:

Deprecation handling within the SeaDataNet vocabulary client, which could either

Only display accepted concepts (easy to implement)

Flag the deprecated concepts (more work but a better result)

Automatic parameter substitution in metadatabase file and data file import tools

Metadatabase sweepers (run regularly to clean up any concepts that have been deprecated since ingestion)
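Automatic substitution and a sweeper can share the same core logic: follow the replacement links until an accepted concept is reached. The sketch below assumes the replacement links have been collected into a dictionary (for example from the dc:isReplacedBy elements) and that each metadata record holds its concept in a `code` field; both the layout and the codes are hypothetical.

```python
def resolve(code, replaced_by):
    """Follow replacement links until a code with no replacement is reached."""
    seen = set()
    while code in replaced_by:
        if code in seen:
            # Guard against a circular chain of replacements.
            raise ValueError(f"replacement cycle at {code}")
        seen.add(code)
        code = replaced_by[code]
    return code


def sweep(records, replaced_by):
    """Substitute deprecated codes in place; return how many records changed."""
    changed = 0
    for record in records:
        current = resolve(record["code"], replaced_by)
        if current != record["code"]:
            record["code"] = current
            changed += 1
    return changed


# Hypothetical chain: OLD1 was replaced by OLD2, which was later replaced by NEW1.
replaced_by = {"OLD1": "OLD2", "OLD2": "NEW1"}
records = [{"code": "OLD1"}, {"code": "NEW1"}]

changed = sweep(records, replaced_by)
print(changed, records)
```

Running `sweep` at import time gives the automatic substitution; running it on a schedule gives the sweeper behaviour for concepts deprecated after ingestion.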