Mapping Data Models to VOTable: The specification

Published version: https://volute.googlecode.com/svn/trunk/projects/dm/vo-dml/doc/MappingDMtoVOTable-v1.0-20150427.pdf

Last working version at: https://volute.googlecode.com/svn/trunk/projects/dm/vo-dml/doc/MappingDMtoVOTable-v1.0-201506xx.docx



Current Document Outline

1. Introduction
2. Use Cases
3. The need for a mapping language
4. Mapping with the <VODML> element
5. General information about this spec
6. Examples: Mapping VO-DML ⇒ VOTable
7. Patterns for annotating VOTable [Normative]
8. Notable absences
9. Serializing to other file formats
10. References
Appendix A: The VODML annotation element
Appendix B: Growing complexity: naïve, advanced and guru clients
Appendix C: Regular expressions for mapping
Appendix D: Frequently Asked Questions

Please read it!


Discussion items

1. Mapping = identifying instances in (ad hoc) serializations
2. Using VO-DML
3. Use of VOTable meta-data elements
   a. No change apart from <VODML>
   b. Use of GROUPs
4. Structure VODML element
5. Mapping expression syntax
6. Use of ivoa:quantity types
7. Use of vo-dml: vodmlrefs
8. vodml-ref prefix, explicitly declared or name?
9. Use of identifiers
10. Mapping references
   a. vo-dml:GROUPref, vo-dml:ORMReference, vo-dml:RemoteReference
11. Mapping to TAP_SCHEMA, FITS


1. Mapping = identifying instances in (ad hoc) serializations

• The doc discusses a formalized way to map VO-DML models to tables.

• It defines this as: identifying instances of model types inside tabular data structures.

• Using VO-DML means: we have formal expressions for the data model elements and can use these in the mapping description.


Reminder

• Data Models, when properly reflecting the real world, are not lists

• They are graphs.


Especially

For information integration the best modelling approach is to try to represent the world as it is.

– Domain Model
– For we have no idea how someone else has represented the world for their application
– But if we assume they have based it on a conceptual model of the world (at least implicitly) there is a chance to map between them.


Pragmatism

We live in the same world, even though we want to do different things with(in) it.

Let’s model that world as directly as possible.

Main issue then is level of detail.


The world is complex, whether you like it or not

[Figure: UML class diagram of a large domain model, spanning the packages Standards, Values, Experiments, Protocols, Products, Types and Phenomenology (for example Standards::Unit, Values::Quantity, Experiments::Measurement, Protocols::Protocol, Phenomenology::Phenomenon), with their attributes and association multiplicities.]


2. Using VO-DML

• Standard, formal modelling language (aka meta-model)
• Explicit modelling concepts promote understanding
  – E.g. object types vs value types
  – Restrictions facilitate making modelling choices
• Standardization supports reuse of models by models
• Machine readability supports validation, implementation
  – E.g. XSLT scripts
• Allows standard (faithful) serialization
  – E.g. XML schema, HTML documentation, Java classes, TAP_SCHEMAs
  – See VO-URP


Use of VOTable meta-data elements

• No change required to VOTable schema apart from <VODML>
  – discussion on Banff
  – replace use of @utype by a new element
  – only on: GROUP, PARAM(ref), FIELDref
• Use of GROUPs
  – data models group (more) primitive elements into (structured) types
  – That’s what GROUPs are for
• Often elements themselves are structured:
  – GROUP hierarchies
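A GROUP hierarchy mirroring a structured type might be read as in the following minimal sketch. It assumes a namespace-free VOTable fragment, and the model names (src:Source, src:SkyCoordinate and their roles) and column IDs are made up for illustration; only the placement of <VODML> on GROUP and FIELDref comes from the slides.

```python
import xml.etree.ElementTree as ET

# Hypothetical, namespace-free fragment; model names and column IDs are invented.
FRAGMENT = """
<GROUP>
  <VODML><TYPE>src:Source</TYPE></VODML>
  <FIELDref ref="col_name">
    <VODML><ROLE>src:Source.name</ROLE></VODML>
  </FIELDref>
  <GROUP>
    <VODML>
      <TYPE>src:SkyCoordinate</TYPE>
      <ROLE>src:Source.position</ROLE>
    </VODML>
    <FIELDref ref="col_ra">
      <VODML><ROLE>src:SkyCoordinate.longitude</ROLE></VODML>
    </FIELDref>
    <FIELDref ref="col_dec">
      <VODML><ROLE>src:SkyCoordinate.latitude</ROLE></VODML>
    </FIELDref>
  </GROUP>
</GROUP>
"""


def describe(element):
    """Turn an annotated GROUP or FIELDref into a nested dict."""
    node = {
        "role": element.findtext("VODML/ROLE"),  # None when no ROLE is given
        "type": element.findtext("VODML/TYPE"),  # None when no TYPE is given
    }
    if element.tag == "FIELDref":
        node["column"] = element.get("ref")
    else:
        # A GROUP: recurse into nested GROUPs and column references.
        node["members"] = [
            describe(child)
            for child in element
            if child.tag in ("GROUP", "FIELDref", "PARAMref")
        ]
    return node


source = describe(ET.fromstring(FRAGMENT.strip()))
```

The outer GROUP carries only a TYPE, while the nested GROUP carries both the ROLE it plays in its parent and its own TYPE, which is how the hierarchy of structured types is expressed here.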


4. <VODML> element structure

<xs:complexType name="VODMLAnnotation">
  <xs:sequence>
    <xs:element name="TYPE" type="VODMLReference" minOccurs="0" maxOccurs="unbounded"/>
    <xs:element name="ROLE" type="VODMLReference" minOccurs="0"/>
    <xs:element name="OPTIONMAPPING" type="VODMLOptionMapping" minOccurs="0" maxOccurs="unbounded"/>
  </xs:sequence>
</xs:complexType>
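The child order and multiplicities in the xs:sequence above can be checked without a schema validator. The following is a minimal sketch, assuming a namespace-free <VODML> element; the helper name follows_sequence is hypothetical and it checks only element order and counts, not the content of each child.

```python
import re
import xml.etree.ElementTree as ET

# Child order taken from the xs:sequence above:
# TYPE (0..*), then ROLE (0..1), then OPTIONMAPPING (0..*).
_SEQUENCE = re.compile(r"(TYPE )*(ROLE )?(OPTIONMAPPING )*")


def follows_sequence(vodml):
    """Return True if the children of a VODML element respect the sequence."""
    tags = "".join(child.tag + " " for child in vodml)
    return _SEQUENCE.fullmatch(tags) is not None


ok = follows_sequence(
    ET.fromstring("<VODML><TYPE>a:B</TYPE><ROLE>a:B.c</ROLE></VODML>")
)
swapped = follows_sequence(
    ET.fromstring("<VODML><ROLE>a:B.c</ROLE><TYPE>a:B</TYPE></VODML>")
)
```

With the schema as written, a ROLE placed before a TYPE is rejected, which is exactly the ordering question the slides leave open.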


VODML contents

• Order: TYPE-ROLE or ROLE-TYPE?
• TYPE multiplicity
  – 0..1 : “casting” of ROLE to actual type
  – 0..* : list ALL¹ types compatible with the ROLE, i.e. actual type and all super types (for "naïve clients")
• ¹ Maybe only “first” super type in separate models.
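The 0..* option can be sketched as follows: given the actual type, emit one TYPE per compatible type. The inheritance table and the src: type names are hypothetical; a real client would read the hierarchy from the VO-DML model documents.

```python
# Hypothetical single-inheritance hierarchy: type -> direct super type.
SUPER_TYPE = {
    "src:DetectedSource": "src:Source",
    "src:Source": "src:AstroObject",
    "src:AstroObject": None,
}


def compatible_types(actual_type, super_type=SUPER_TYPE):
    """Return the actual type followed by all of its super types."""
    chain = []
    current = actual_type
    while current is not None:
        chain.append(current)
        current = super_type.get(current)
    return chain


def type_elements(actual_type):
    """Render one <TYPE> element per compatible type (the 0..* option)."""
    return [f"<TYPE>{name}</TYPE>" for name in compatible_types(actual_type)]


listed = type_elements("src:DetectedSource")
```

Under the 0..1 option only the first entry of that list would be written, and the client would have to derive the super types itself.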


OPTIONMAPPING

• Enumeration or SKOSConcept define valid values for attribute
  – map to VOTable’s VALUES/OPTION pattern
• (Legacy) serialization may have chosen own values
• Do we need to allow translation?
• How if no OPTION defined on FIELD or PARAM?
• Is OPTIONMAPPING on VODML best way to model?
  – disadvantage: repeat of OPTION values required
  – advantage: multiple annotation possible
  – alt: add VODML to OPTION itself, with only the vodml-ref to the literal


E.g. value ⇒ enum literal (from skyserver.sdss.org/PhotoObjAll.type)

<VODML>
  <ROLE>src:Source.classification</ROLE>
  <OPTIONMAPPING>
    <VALUE>6</VALUE>
    <LITERAL>src:source.SourceClassification.star</LITERAL>
  </OPTIONMAPPING>
  <OPTIONMAPPING>
    <VALUE>3</VALUE>
    <LITERAL>src:source.SourceClassification.galaxy</LITERAL>
  </OPTIONMAPPING>
  <OPTIONMAPPING>
    <VALUE>0</VALUE>
    <LITERAL>src:source.SourceClassification.unknown</LITERAL>
  </OPTIONMAPPING>
</VODML>
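A client could turn the OPTIONMAPPING entries above into a lookup table for translating serialized values. This is a minimal sketch that assumes a namespace-free <VODML> element; the function names option_map and translate are hypothetical.

```python
import xml.etree.ElementTree as ET

# The annotation from the example above, without XML namespaces.
VODML = """
<VODML>
  <ROLE>src:Source.classification</ROLE>
  <OPTIONMAPPING>
    <VALUE>6</VALUE>
    <LITERAL>src:source.SourceClassification.star</LITERAL>
  </OPTIONMAPPING>
  <OPTIONMAPPING>
    <VALUE>3</VALUE>
    <LITERAL>src:source.SourceClassification.galaxy</LITERAL>
  </OPTIONMAPPING>
  <OPTIONMAPPING>
    <VALUE>0</VALUE>
    <LITERAL>src:source.SourceClassification.unknown</LITERAL>
  </OPTIONMAPPING>
</VODML>
"""


def option_map(vodml):
    """Map each serialized VALUE (as text) to its enum LITERAL vodml-ref."""
    return {
        mapping.findtext("VALUE"): mapping.findtext("LITERAL")
        for mapping in vodml.findall("OPTIONMAPPING")
    }


MAPPING = option_map(ET.fromstring(VODML.strip()))


def translate(raw_value):
    """Translate a table cell value; returns None for unmapped values."""
    return MAPPING.get(str(raw_value))
```

Values are compared as text because that is how they appear in the annotation; an unmapped value such as 5 simply yields None here, and what a client should do in that case is one of the open questions on the slide.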


5: Mapping expression syntax

• Chapter 7 uses a somewhat abstract, but formal, syntax to define mapping patterns

{GROUP G | G ⊂ RESOURCE & G ⊄ TABLE & G ⊄ GROUP[VODML] & G/VODML/TYPE ⇒ ObjectType & G/VODML/ROLE = NULL}

== a GROUP (G) with a VODML/TYPE identifying an ObjectType and no VODML/ROLE, contained in a RESOURCE, not contained in a TABLE, not contained in a GROUP with a VODML element.

• Formally define all patterns that a client should understand, that a provider should use to express meaning.

• Aims to be comprehensive and rigorous, scope open for discussion
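The example pattern (a GROUP with an ObjectType TYPE and no ROLE, inside a RESOURCE, outside any TABLE and outside any annotated GROUP) can be sketched as a client-side filter. The snippet assumes a namespace-free VOTable, and the set of ObjectType names is supplied by hand; a real client would look those up in the VO-DML model. It is an illustration of the pattern, not the spec's algorithm.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: only g1 should match the pattern.
DOC = """
<VOTABLE>
  <RESOURCE>
    <GROUP ID="g1">
      <VODML><TYPE>src:Source</TYPE></VODML>
      <GROUP ID="g2">
        <VODML><TYPE>src:SkyCoordinate</TYPE><ROLE>src:Source.position</ROLE></VODML>
      </GROUP>
    </GROUP>
    <TABLE>
      <GROUP ID="g3"><VODML><TYPE>src:Source</TYPE></VODML></GROUP>
    </TABLE>
  </RESOURCE>
</VOTABLE>
"""
OBJECT_TYPES = {"src:Source"}  # assumed to be known from the model


def ancestors(element, parent_of):
    """Yield the ancestors of an element, nearest first."""
    while element in parent_of:
        element = parent_of[element]
        yield element


def matching_groups(root, object_types):
    """Return the GROUPs that satisfy the pattern described above."""
    # ElementTree has no parent pointers, so build a child -> parent map.
    parent_of = {child: parent for parent in root.iter() for child in parent}
    hits = []
    for group in root.iter("GROUP"):
        ups = list(ancestors(group, parent_of))
        in_resource = any(a.tag == "RESOURCE" for a in ups)
        in_table = any(a.tag == "TABLE" for a in ups)
        in_annotated_group = any(
            a.tag == "GROUP" and a.find("VODML") is not None for a in ups
        )
        types = [t.text for t in group.findall("VODML/TYPE")]
        is_object_type = any(t in object_types for t in types)
        has_role = group.find("VODML/ROLE") is not None
        if (in_resource and not in_table and not in_annotated_group
                and is_object_type and not has_role):
            hits.append(group)
    return hits


found = [g.get("ID") for g in matching_groups(ET.fromstring(DOC.strip()), OBJECT_TYPES)]
```

Here g2 is skipped because it sits inside an annotated GROUP and carries a ROLE, and g3 is skipped because it sits inside a TABLE.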


6. Use of ivoa: model
https://volute.googlecode.com/svn/trunk/projects/dm/vo-dml/models/ivoa/IVOA.vo-dml.xml

• Defines reusable primitive types
  – conceptual: ivoa:string, ivoa:integer, ivoa:real
  – not: float vs double, int vs long vs short
• Defines reusable quantity types
  – ivoa:quantity.RealQuantity
  – with ucd, unit etc
  – can be mapped directly to FIELDref
  – ucd/unit implicitly mapped to FIELD’s attributes
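The implicit ucd/unit mapping can be sketched as follows: a FIELDref annotated as ivoa:quantity.RealQuantity takes its unit and ucd from the FIELD it points to. The table fragment is namespace-free, and the role src:Source.flux, the column ID and the attribute values are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical table metadata; only ivoa:quantity.RealQuantity is from the slides.
TABLE = """
<TABLE>
  <GROUP>
    <VODML><TYPE>src:Source</TYPE></VODML>
    <FIELDref ref="col_flux">
      <VODML>
        <TYPE>ivoa:quantity.RealQuantity</TYPE>
        <ROLE>src:Source.flux</ROLE>
      </VODML>
    </FIELDref>
  </GROUP>
  <FIELD ID="col_flux" name="flux" datatype="double" unit="mJy" ucd="phot.flux"/>
</TABLE>
"""


def quantity_metadata(table):
    """Collect unit and ucd for every FIELDref typed as a RealQuantity."""
    fields = {field.get("ID"): field for field in table.iter("FIELD")}
    result = {}
    for ref in table.iter("FIELDref"):
        if ref.findtext("VODML/TYPE") != "ivoa:quantity.RealQuantity":
            continue
        field = fields[ref.get("ref")]  # follow the reference to the FIELD
        result[ref.findtext("VODML/ROLE")] = {
            "column": field.get("name"),
            "unit": field.get("unit"),
            "ucd": field.get("ucd"),
        }
    return result


meta = quantity_metadata(ET.fromstring(TABLE.strip()))
```

No separate annotation of unit or ucd is needed in this sketch, because the quantity's attributes are read straight from the referenced FIELD.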


Issues

• sufficient number of predefined types?
  – too many?
• do we need 'ucd' attribute in quantity types?
  – can we support other (SKOS) vocabularies?


7: Use of vo-dml: serialization model
https://volute.googlecode.com/svn/trunk/projects/dm/vo-dml/models/vo-dml/VO-DML.vo-dml.xml

• Some serialization concepts must be available for annotations
  – concepts not required in model definition
• <objectType>vo-dml:Model : defines how/which models are used in serialization
• <dataType>vo-dml:Identifier : how to define explicit identifiers
• <objectType>vo-dml:Reference : how to serialize references
• <objectType>vo-dml:ObjectTypeInstance
  – implicit super type of all serialized ObjectType instances
  – defines roles useful for serializations
  – <attribute>vo-dml:ObjectTypeInstance.ID [vo-dml:Identifier]
  – <reference>vo-dml:ObjectTypeInstance.container [vo-dml:ObjectTypeInstance]


Issues

• Do we want this model?
  – how formal do we want to be
  – model introduced to preserve invariant that each vodml-ref points to a concept in a model
• Change name?
  – vo-dml-s (for serialization)
  – ???


8: vodml-ref prefix, explicitly declared or Model.name ?

• Long discussion came to conclusion not to allow flexible prefixes
  – use fixed name of Model
• Advantage:
  – simpler for clients, can always look for same prefix
• Disadvantage:
  – need fixed list of names
  – independently created custom models may clash
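With fixed prefixes, a client can resolve a vodml-ref by splitting at the first colon and looking the prefix up in a known list. This is a minimal sketch; the two model URLs are the ones quoted on the slides, while the registry dict and function names are hypothetical.

```python
# Fixed model names mapped to the model documents quoted in the slides.
KNOWN_MODELS = {
    "ivoa": "https://volute.googlecode.com/svn/trunk/projects/dm/vo-dml/models/ivoa/IVOA.vo-dml.xml",
    "vo-dml": "https://volute.googlecode.com/svn/trunk/projects/dm/vo-dml/models/vo-dml/VO-DML.vo-dml.xml",
}


def split_vodml_ref(vodml_ref):
    """Split 'prefix:element.path' into (prefix, element path)."""
    prefix, separator, element = vodml_ref.partition(":")
    if not separator or not prefix or not element:
        raise ValueError(f"not a vodml-ref: {vodml_ref!r}")
    return prefix, element


def resolve(vodml_ref, models=KNOWN_MODELS):
    """Return (model document URL, element path) for a vodml-ref."""
    prefix, element = split_vodml_ref(vodml_ref)
    if prefix not in models:
        # The stated disadvantage: a prefix outside the fixed list is unknown.
        raise KeyError(f"unknown model name: {prefix!r}")
    return models[prefix], element


url, element = resolve("ivoa:quantity.RealQuantity")
```

The lookup needs no per-document prefix declarations, which is the simplicity argument on the slide; the price is that a custom model such as a hypothetical "src" has to be added to the fixed list before its references resolve.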


9. Use of identifiers

• VO-DML does not define explicit identifier role
  – Each objectType instance is implicitly assumed to have an ID
  – No statement on structure
• Objects can be identified in many different ways.
  – Primary key column(s) in table
  – xsd:ID or xsd:key in XML serialization
  – GROUP/@ID element in VOTable
  – ivo/identifier
  – URI
  – ...
• vo-dml:Identifier DataType
  – vo-dml:Identifier.field [0..*]
  – May be mapped to single column
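Because vo-dml:Identifier.field has multiplicity 0..*, an identifier may span one or several columns. The sketch below treats the identifier as a tuple of column values; the row layout, column names and helper names are hypothetical, and this is only one possible reading of the datatype.

```python
def identifier_of(row, id_columns):
    """Build an identifier from one or more column values of a row."""
    if not id_columns:
        raise ValueError("at least one identifying column is needed")
    return tuple(row[name] for name in id_columns)


def index_by_identifier(rows, id_columns):
    """Index rows by identifier, rejecting duplicates."""
    index = {}
    for row in rows:
        key = identifier_of(row, id_columns)
        if key in index:
            raise ValueError(f"duplicate identifier {key!r}")
        index[key] = row
    return index


# Hypothetical rows: neither column alone is unique, the pair is.
ROWS = [
    {"survey": "A", "obj_id": 1, "ra": 10.5},
    {"survey": "A", "obj_id": 2, "ra": 11.0},
    {"survey": "B", "obj_id": 1, "ra": 200.1},
]
by_id = index_by_identifier(ROWS, ["survey", "obj_id"])
```

The single-column case is the same call with a one-element column list, which matches the remark that the identifier may be mapped to a single column.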


10. Mapping references

Different types:
• vo-dml:GROUPref : refer to singleton-GROUP
• vo-dml:ORMReference : Object-relational mapping pattern
• vo-dml:RemoteReference : reference to object in remote document
  – E.g. standard coordinate frames or photometry filters stored in some registry(?)
  – Acceptable remote serializations?


11. Mapping to TAP_SCHEMA, FITS

• Wrap with VOTable
• Need standardized way of expressing wrapped data structures
• E.g. (from “VO-DML Mapper”)
  – TBD


Implementations, code, tools


Guru clients

• VOTable validator
  – Check
• VOTable loader
  – generate in-memory instances of Java classes generated from VO-DML using XSLT
• VOTable interpreter
  – Writes instances to XML document following VO-DML-I schema
  – Using Java classes generated from VO-DML, not necessary
• All load VO-DML docs dynamically
• Algorithms will be written up


that’s it