How to prepare data for integration in SeaDataNet V1?
OBSERVATIONS
& PRÉVISIONS CÔTIÈRES
www.seadatanet.org
SeaDataNet annual meeting, Madrid, 25-27 March 2009
How to prepare data for integration in SeaDataNet V1?
M. Fichaut, R. Lowry, R. Schlitzer
Overview
• 2 parts
• The first part gives an overview of the SeaDataNet system and of the tools available to SeaDataNet partners, and details some practical use cases
• The second part presents ODV version 4
SDN V1 – Data centres
• In SeaDataNet version 1: 2 types of data centres
• Pilot data centres (11 TTT members + volunteers)
• Automatic data download from their system to the SeaDataNet portal
• Requires a minimal technical infrastructure (an application server such as Tomcat or IIS) and a software implementation including the “Download manager” and the “Coupling table”
• Other data centres
• Manual preparation of data for downloading by the SeaDataNet Web portal
[Diagram: partner system of a pilot data centre. Data input: a collection of ASCII files in format X, or data in a database. Metadata input: metadata in a database or in Excel files. Tools: NEMO, ODV and Med2MedSDN (collection of ASCII files in SDN format), MIKADO (XML metadata files) and the XML Validator, all linked to the SeaDataNet vocabulary. The XML metadata files go to the SeaDataNet portal catalogues (CDI, CSR, EDMERP, EDMED). Data requests from the portal are served by the download manager, using the coupling table and a local copy of the data to download.]
[Diagram: partner system of another (non-pilot) data centre. Data input: a collection of ASCII files in format X, or data in a database. Metadata input: metadata in a database or in Excel files. Tools: NEMO, ODV and Med2MedSDN (collection of ASCII files in SDN format), MIKADO (XML metadata files) and the XML Validator, all linked to the SeaDataNet vocabulary. The XML metadata files go to the SeaDataNet portal catalogues (CDI, CSR, EDMERP, EDMED). Data requests arrive by email; the data are prepared manually and a local copy of the data is made available for download.]
Summary
• SeaDataNet Vocabulary
• SeaDataNet formats
• SeaDataNet reformatting tools : NEMO and Med2MedSDN
• MIKADO tool and XML validator
• Interaction of these tools with the download manager
• Some use cases
• ODV version 4
SeaDataNet vocabulary
• SeaDataNet vocabularies populate many metadata fields and the parameter descriptions in data
• They are delivered through a Vocabulary Server
• May be viewed through a client on the SeaDataNet web site (http://seadatanet.maris2.nl/v_bodc_vocab/welcome.aspx)
• May be accessed programmatically as described in Athens (IMDIS 2008 conference)
• Master copy of vocabularies always accessible from a well-known location (BODC)
• Vocabularies developed through the group governance of the SeaDataNet TTT or wider international bodies (SeaVoX, ICES platforms)
SeaDataNet vocabulary
• Vocabularies in metadata
• Most partners will encounter vocabularies in metadata through the MIKADO and NEMO tools
• The most common problem will be that an entry required in a vocabulary is missing
• For example, a ship required for a CSR record isn’t present in the C174 list.
• If this happens, contact the SeaDataNet help desk
• They will advise what you should do and contact Roy if necessary
SeaDataNet vocabulary
• Vocabularies in metadata
• Adding new entries involves:
• Discussing the proposed change on the appropriate e-mail list
• Editing the master vocabulary database
• Publishing the changes
• This takes time, so please send requests as soon as possible and be patient.
SeaDataNet vocabulary
• Vocabularies in metadata
• Ship codes
• If the ship isn’t present in the full ICES list as published on the ICES web site (the SeaDataNet Ship and Platform Codes at http://www.ices.dk/datacentre/reco/reco.asp) a new code must be obtained from ICES
• This has caused delays
• New on-line application system now available that will streamline the process (next April)
SeaDataNet vocabulary
• Vocabularies in data
• Parameters are labelled using terms from the P011 vocabulary
• This is comprehensive, but very large (21,000 terms)
• Thesaurus navigation tool on the SeaDataNet web site (http://seadatanet.maris2.nl/v_bodc_vocab/vocabrelations.aspx?list=P081) helps a lot
• Mapping for MEDATLAS parameter codes under construction and accessible through NEMO and Med2MedSDN tools
• Report other mapping problems to the SeaDataNet help desk
• Roy will provide whatever help he can
SeaDataNet formats
• ASCII formats
• Defined for vertical profiles, time-series and trajectories
• ODV mandatory
• MEDATLAS optional
• NetCDF format
• CF (Climate and Forecast) compatible
• For gridded data (model output, satellite data and data syntheses)
• Also for other types of data difficult to handle in ASCII formats, due to their large volume or structural complexity
• Still being defined
http://www.seadatanet.org/standards_software/data_transport_formats
SeaDataNet extensions to ODV and MEDATLAS (1)
• SeaDataNet format extensions fulfil two functions
• Provide a linkage between data and metadata
• ODV : 2 additional columns :
LOCAL_CDI_ID and EDMO_CODE of the data centre providing the CDI
• MEDATLAS : 2 additional comment lines with key-words :
* LOCAL_CDI_ID =
* EDMO_CODE =
• Provide a linkage to standardised SeaDataNet semantic information such as detailed parameter descriptions
• ODV and MEDATLAS : additional comment lines
SeaDataNet extensions to ODV and MEDATLAS (2)
• Additional comment lines for parameter mapping
• ODV
//SDN_parameter_mapping
//<subject>SDN:LOCAL:DEPH</subject><object>SDN:P011::ADEPZZ01</object><unit>SDN:P061::ULAA</unit>
//<subject>SDN:LOCAL:TEMP</subject><object>SDN:P011::TEMPPR01</object><unit>SDN:P061::UPAA</unit>
• MEDATLAS
*SDN_parameter_mapping
*<subject>SDN:LOCAL:PRES</subject><object>SDN:P011::PRESPR01</object><unit>SDN:P061::UPDB</unit>
*<subject>SDN:LOCAL:TEMP</subject><object>SDN:P011::TEMPPR01</object><unit>SDN:P061::UPAA</unit>
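The mapping comment lines above have a regular structure, so they can be read back with a short script. The following is only a sketch, assuming the lines look exactly like the examples on this slide; the function name parse_parameter_mapping is illustrative and is not part of any SeaDataNet tool.

```python
import re

# One mapping line. The comment marker is "//" in ODV files and "*" in
# MEDATLAS files, as in the slide examples.
MAPPING_RE = re.compile(
    r"^(?://|\*)\s*"
    r"<subject>SDN:LOCAL:(?P<local>[^<]+)</subject>"
    r"<object>SDN:P011::(?P<p011>[^<]+)</object>"
    r"<unit>SDN:P061::(?P<p061>[^<]+)</unit>\s*$"
)


def parse_parameter_mapping(lines):
    """Return {local parameter name: (P011 code, P061 unit code)}."""
    mapping = {}
    for line in lines:
        match = MAPPING_RE.match(line.strip())
        if match:  # other comment lines are simply ignored
            mapping[match["local"]] = (match["p011"], match["p061"])
    return mapping


odv_header = [
    "//SDN_parameter_mapping",
    "//<subject>SDN:LOCAL:DEPH</subject><object>SDN:P011::ADEPZZ01</object><unit>SDN:P061::ULAA</unit>",
    "//<subject>SDN:LOCAL:TEMP</subject><object>SDN:P011::TEMPPR01</object><unit>SDN:P061::UPAA</unit>",
]
medatlas_header = [
    "*SDN_parameter_mapping",
    "*<subject>SDN:LOCAL:PRES</subject><object>SDN:P011::PRESPR01</object><unit>SDN:P061::UPDB</unit>",
]

odv_mapping = parse_parameter_mapping(odv_header)
medatlas_mapping = parse_parameter_mapping(medatlas_header)
print(odv_mapping)
print(medatlas_mapping)
```

The same pattern serves both formats because only the comment marker differs between them.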
Tools to generate SeaDataNet ASCII formats
NEMO
• Java tool to reformat ASCII files to SeaDataNet ODV and MEDATLAS formats - available under Windows
• Version 1.2.0 and user manual available at :
• http://www.seadatanet.org/standards_software/software/nemo
Med2MedSDN
• Java tool to translate MEDATLAS files to SeaDataNet MEDATLAS files - available under Windows
• Version 1.0 and user manual available at :
• http://www.seadatanet.org/standards_software/software/Med2MedSDN
NEMO main features
• Reformat any ASCII file of vertical profiles, time-series or trajectories to a SeaDataNet ASCII format (ODV, MEDATLAS).
• The input ASCII files can be:
• one file per station, for vertical profiles or time series
• one file per cruise, for vertical profiles, time series or trajectories
• related to cruises or not
• If the files are not related to a cruise, only ODV re-formatting is available
NEMO main principles
• Users of NEMO describe the format of the input files so that NEMO can find the information required by the SeaDataNet formats.
• A first prerequisite is that all input files processed together by NEMO share the same format. The information about the stations must:
• be located at the same position: same line in the file, and same position on the line (or same column for CSV files)
• be in the same format
• For example, for all the stations the latitude is:
• on line 3 of the station header
• from character 21 to character 27, or in the 3rd column for CSV files
• in the format +DD.ddd
• The other prerequisite is that the data must be provided in columns in the data files.
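To make the "same line, same position" requirement concrete, here is a minimal sketch of reading a field from a fixed position. It is not NEMO code: the helper names and the sample station header are invented for illustration, and only the position (line 3, characters 21 to 27, or 3rd CSV column) and the +DD.ddd format come from the slide.

```python
def read_fixed_field(lines, line_no, first_char, last_char):
    """Return the text at a fixed position; all indices are 1-based and inclusive."""
    return lines[line_no - 1][first_char - 1:last_char]


def read_csv_field(lines, line_no, column, sep=","):
    """Return one column (1-based) of a delimited line."""
    return lines[line_no - 1].split(sep)[column - 1].strip()


# Hypothetical fixed-width station header: latitude on line 3,
# characters 21 to 27, written as +DD.ddd.
station_header = [
    "STATION 0001",
    "DATE 2009-03-25",
    "LATITUDE".ljust(20) + "+43.250" + " N",
]
latitude = float(read_fixed_field(station_header, 3, 21, 27))

# Hypothetical CSV variant: latitude in the 3rd column of line 3.
csv_header = ["STATION,0001", "DATE,2009-03-25", "LAT,N,+43.250"]
csv_latitude = float(read_csv_field(csv_header, 3, 3))

print(latitude, csv_latitude)
```

If one file in the batch stored the latitude on another line or in another column, the same description would return the wrong text, which is why all files processed together must share one layout.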
NEMO – 5 steps for file conversion
NEMO : 5 steps
• Description of the file
• Description of the cruise : input manually or import of CSR XML V1
• Description of the station header
• Description of the measured parameters
• File conversion
• Model can be saved and reused
NEMO – Description of the input files (1)
NEMO – Description of the input files (2)
NEMO new functionalities
• Trajectories taken into account
• SeaDataNet extensions to ODV and MEDATLAS formats
• Quality flags present in the input files can be kept and mapped to the SeaDataNet QC flag scale
• Generation of a CDI summary file directly usable by MIKADO to generate XML CDI exports
• Generation of the coupling file, which maps each LOCAL_CDI_ID (one profile, one time-series or one trajectory) to the name of the file containing this LOCAL_CDI_ID. This coupling file is used by the download manager
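The idea behind the coupling file can be sketched as a simple lookup from LOCAL_CDI_ID to file name. The sketch below works purely in memory and does not reproduce the real coupling file layout, which is not described on these slides; the function name, file names and identifiers are all hypothetical.

```python
def build_coupling(files_to_ids):
    """Invert {file name: [LOCAL_CDI_ID, ...]} into {LOCAL_CDI_ID: file name}."""
    coupling = {}
    for file_name, cdi_ids in files_to_ids.items():
        for cdi_id in cdi_ids:
            if cdi_id in coupling:
                # One LOCAL_CDI_ID must resolve to exactly one file.
                raise ValueError(
                    f"{cdi_id} appears in both {coupling[cdi_id]} and {file_name}"
                )
            coupling[cdi_id] = file_name
    return coupling


# Hypothetical collection: two mono-station files and one multi-station file.
converted_files = {
    "ctd_0001_odv.txt": ["CTD_0001"],
    "ctd_0002_odv.txt": ["CTD_0002"],
    "cruise_xbt_odv.txt": ["XBT_0001", "XBT_0002"],
}
coupling = build_coupling(converted_files)
print(coupling["XBT_0002"])
```

Rejecting duplicates early is a design choice of this sketch: a data request names one LOCAL_CDI_ID, so an ambiguous entry could not be served.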
NEMO next version will
• Fix the known bugs and any new ones reported by users
• Take into account the latest release of the ODV format, with ISO 8601 dates and data type ‘*’
• Improve response time for the conversion of large files and for vocabulary updates
• Accept ODV multi-station files as input
• Be tested under Unix and Linux
• Be released in June 2009
Med2MedSDN main features
• Reformats MEDATLAS files to MEDATLAS SeaDataNet format
• Java tool, bilingual (French, English)
• Adds the SeaDataNet extensions: parameter mapping, LOCAL_CDI_ID and EDMO_CODE
• Able to reformat one file or a large number of files (in one directory)
• Linked to SeaDataNet vocabularies through Web services for parameter mapping and for the list of EDMO codes
• Requires an internet connection while updating the lists
Med2MedSDN main screen
Med2MedSDN log files
• Errors are recorded in a log file, which can be opened from the Med2MedSDN main screen by clicking on “see log” in the error window
• Each line in the log file is composed as follows:
• Date, Name of the Software, Error severity level, Error message
• Severity levels:
• INFO: informative messages for the start of the conversion or a successful conversion
• ERROR: conversion errors; the conversion is cancelled for the current file but continues with the other files
• FATAL: conversion errors which stop the processing of the files
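A log with this four-part line structure is easy to summarise by severity level. The sketch below assumes the four fields are separated by commas, which the slide suggests but does not confirm; the sample lines, timestamps and messages are invented for illustration and are not real Med2MedSDN output.

```python
from collections import Counter

LEVELS = ("INFO", "ERROR", "FATAL")


def parse_log_line(line):
    """Split one log line into (date, software, level, message)."""
    # maxsplit=3 keeps any commas inside the message itself.
    date, software, level, message = (part.strip() for part in line.split(",", 3))
    if level not in LEVELS:
        raise ValueError(f"unknown severity level: {level!r}")
    return date, software, level, message


# Hypothetical log content.
sample_log = [
    "2009-03-25 10:00:00, Med2MedSDN, INFO, Conversion started",
    "2009-03-25 10:00:02, Med2MedSDN, ERROR, Cannot convert file a.med, skipping it",
    "2009-03-25 10:00:05, Med2MedSDN, INFO, Conversion successful",
]
counts = Counter(parse_log_line(line)[2] for line in sample_log)
print(dict(counts))
```

A non-zero ERROR count means some files were skipped, while any FATAL entry means the batch stopped and must be rerun.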
Med2MedSDN log file
Med2MedSDN next version will
• Create the coupling file for the SeaDataNet download manager
• Be released in June 2009
Tool to generate XML meta-data files
MIKADO
• Java tool to generate XML descriptions of SeaDataNet catalogues
• EDMED : catalogue of Marine Environmental Datasets
• EDMERP : Marine Environmental Research Projects
• CSR : Cruise Summary Reports
• CDI : Common Data Index
• Version 1.5 and user manual available at :
http://www.seadatanet.org/standards_software/software/mikado
MIKADO version 1.5
• New functionalities
• Download of EDMED files directly from the central BODC catalogue through Web services: an interim solution pending the new EDMED V1 user interface developments
• World map to manage Marsden squares for CSR
• Data centre type options for CDI (SeaDataNet, ECOOP): to allow data web sites other than SeaDataNet
• Mapping download from BODC: to get existing mappings from the BODC web site
• Sybase driver for JDBC
• Vocabulary update without restarting MIKADO
Next versions of MIKADO
• Version 1.6
• Being tested now
• Available next May
• Able to generate the coupling.txt file used by the download manager, for data stored in ASCII files or in a relational database
• Version 1.7
• EDIOS
• Released by the end of 2009
NEMO to MIKADO to SeaDataNet CDI
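[Diagram: NEMO converts a collection of ASCII files into ASCII SDN files and a CDI summary CSV file; MIKADO uses the CDI summary CSV file to produce XML CDI files for the SeaDataNet CDI. Configuration file: Summary_CDI_NEMO.xml (explanation in the NEMO user manual).]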
Links with SDN Download manager
[Diagram: coupling.txt files feed the coupling table used by the download manager, which serves the SeaDataNet portal. Coupling.txt files as labelled on the slide: MIKADO, modus 1, 2, 3; Med2MedSDN, modus 1, 3; one further coupling.txt file, modus 1, 3 (source label not recoverable).]
• Modus 1 : data in mono-station file
• Modus 2 : data in database
• Modus 3 : data in multi-station file
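The three modus above can be written down as a small lookup that answers "which tool can generate the coupling file for my kind of storage?". This is only a sketch: the pairing of tools and modus follows the slide labels as they appear to read (MIKADO: 1, 2, 3; Med2MedSDN: 1, 3), and the names Modus, COUPLING_SUPPORT and tools_for are illustrative.

```python
from enum import IntEnum


class Modus(IntEnum):
    MONO_STATION_FILE = 1   # data in mono-station file
    DATABASE = 2            # data in database
    MULTI_STATION_FILE = 3  # data in multi-station file


# Assumption: tool/modus pairing as read from the slide labels.
COUPLING_SUPPORT = {
    "MIKADO": {Modus.MONO_STATION_FILE, Modus.DATABASE, Modus.MULTI_STATION_FILE},
    "Med2MedSDN": {Modus.MONO_STATION_FILE, Modus.MULTI_STATION_FILE},
}


def tools_for(modus):
    """Return the tools able to generate a coupling file for this modus."""
    return sorted(
        tool for tool, supported in COUPLING_SUPPORT.items() if modus in supported
    )


print(tools_for(Modus.DATABASE))
print(tools_for(Modus.MULTI_STATION_FILE))
```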
[Diagram repeated: partner system of a pilot data centre, with the same components as the earlier pilot data centre diagram.]
Use cases
• Prerequisites for all use cases:
• Prepare the mapping between your metadata and:
• SeaDataNet vocabularies: Sea areas, BODC parameters (PDV), Platform classes, SDN device categories …
• Some mappings are already available on the BODC Web site: MEDATLAS to PDV, MEDATLAS units to BODC storage units
• EDMO: Marine organisations
• EDMERP: Marine environmental projects
(Incremental mapping managed by MIKADO)
• Quality checks of the data must be done using ODV or other software, before sending metadata to CDI
Use case 1 – collection of XBTs or CTDs or Time-series files – no relational database
1. Verify that all files of the collection have a homogeneous format
2. Run NEMO
• to convert the files to SeaDataNet ODV
• to generate a CDI summary file
• [to generate the coupling.txt file for these data]
3. Run MIKADO to generate the XML CDI files, using the configuration file delivered with NEMO for the CDI summary file
4. Use the XML validator to validate your XML files
5. [Implement the coupling file]
6. Send the XML CDI files to the central catalogue
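Step 1 of this use case can be partly automated before running NEMO. The sketch below is a rough check only, assuming delimited text files; it compares the number of columns per file and flags files that differ from the most common layout. The file names and contents are invented, and a real check would also need to cover header lines and field formats.

```python
from collections import Counter


def column_signature(text, sep=","):
    """Return the set of column counts found on the non-empty lines of one file."""
    return frozenset(
        len(line.split(sep)) for line in text.splitlines() if line.strip()
    )


def find_inhomogeneous(files, sep=","):
    """Return names of files whose column layout differs from the most common one."""
    signatures = {name: column_signature(text, sep) for name, text in files.items()}
    most_common, _ = Counter(signatures.values()).most_common(1)[0]
    return sorted(name for name, sig in signatures.items() if sig != most_common)


# Hypothetical collection held in memory as {file name: file content}.
collection = {
    "xbt_001.csv": "0.0,18.2\n5.0,18.1\n",
    "xbt_002.csv": "0.0,17.9\n5.0,17.8\n",
    "xbt_003.csv": "0.0;17.5\n5.0;17.4\n",  # uses a different separator
}
suspects = find_inhomogeneous(collection)
print(suspects)
```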
Use case 2 – collection of MEDATLAS files and metadata in relational database
1. Run Med2MedSDN
• to convert MEDATLAS files to MEDATLAS SDN files
• [to generate the coupling.txt file for these MEDATLAS SDN files]
2. Run MIKADO on the metadata database
• to generate the XML CDI descriptions of the stations of these MEDATLAS files
• [to generate the coupling.txt file for these MEDATLAS SDN files]
3. Use the XML validator to validate your XML files
4. [Implement the coupling file]
5. Send the XML CDI files to the central catalogue
Use case 3 – collection of ASCII files and metadata in relational database
1. Run NEMO
• to convert ASCII files to ODV [and MEDATLAS] SDN files
• [to generate the coupling.txt file for these SDN files]
2. Run MIKADO on the metadata database
• to generate the XML CDI descriptions of the stations of these files
• [to generate the coupling.txt file for these SDN files]
3. Use the XML validator to validate your XML files
4. [Implement the coupling file]
5. Send the XML CDI files to the central catalogue
Use case 4 – XBTs, CTDs, Time-series measurements – data and metadata in a relational database
1. Run MIKADO
• to create the configuration that retrieves the metadata for these data from the database
• to export the corresponding XML CDI files
2. Run MIKADO to create the coupling table, with the appropriate select statement to retrieve these measurements from the database
3. Use the XML validator to validate your XML files
4. Implement the coupling file
5. Send the XML CDI files to the central catalogue
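The "appropriate select statement" in step 2 is a query that returns the measurements belonging to one LOCAL_CDI_ID. The sketch below shows the idea with an in-memory SQLite database; the table, column names and values are hypothetical, and the actual statement depends entirely on each partner's own schema and on how MIKADO expects it to be written.

```python
import sqlite3

# Hypothetical schema: one row per measurement, keyed by LOCAL_CDI_ID.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE measurement (local_cdi_id TEXT, depth REAL, temperature REAL)"
)
conn.executemany(
    "INSERT INTO measurement VALUES (?, ?, ?)",
    [("CTD_0001", 5.0, 18.1), ("CTD_0001", 0.0, 18.2), ("CTD_0002", 0.0, 17.9)],
)

# Select statement that retrieves the measurements of one LOCAL_CDI_ID.
SELECT_BY_CDI = (
    "SELECT depth, temperature FROM measurement "
    "WHERE local_cdi_id = ? ORDER BY depth"
)


def fetch_measurements(connection, local_cdi_id):
    """Return the (depth, temperature) rows of one profile, shallowest first."""
    return connection.execute(SELECT_BY_CDI, (local_cdi_id,)).fetchall()


rows = fetch_measurements(conn, "CTD_0001")
print(rows)
```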
NEMO and Med2MedSDN demonstrations are possible, just ask!
Questions or problems on MIKADO are welcome too.
And now: all about ODV, version 4, by Reiner …