A CHAIN-REDS Perspective about Data Access and Metadata Management
Co-ordination & Harmonisation of Advanced e-Infrastructures for Research and Education Data Sharing
[email protected] Agreement n. 306819
A CHAIN-REDS Perspective about Data Access and Metadata Management
Rafael Mayo-García, CIEMAT
Tunis / 12-13 Dec 2013
Roberto Barbera (a, b), Carla Carrubba (b), Giuseppina Inserra (b), Christos Kanellopoulos (c), Kostas Koumantaros (c), Rafael Mayo-García (d), Ognjen Prnjat (c), Rita Ricceri (b), Manuel Rodriguez Pascual (d), Antonio Rubio-Montero (d), Federico Ruggieri (e)
(a) University of Catania
(b) INFN-Catania
(c) GRNET
(d) CIEMAT
(e) GARR & INFN-Roma Tre
CHAIN: Coordination & Harmonisation of Advanced e-INfrastructures
CHAIN-REDS: A legacy from CHAIN
CHAIN-REDS is an EC-funded project (306819)
• Budget: ~2.1 M€
• Start: 1 December 2012, duration: 30 months
Structured in five work packages
• WP 1 Project Management
• WP 2 Dissemination, Training and Outreach
• WP 3 Interoperation and coordination of e-Infrastructures
• WP 4 Data Infrastructures
• WP 5 Support to small groups and emerging communities
WP4 in CHAIN-REDS
Partners: INFN, CIEMAT, GRNET, CESNET, UBUNTUNET, CLARA, IHEP, ASREN, SIGMA ORIONIS, C-DAC
Regions: Europe, Africa, Latin America, Asia, Middle East
WP4 ‘Data infrastructures’
Public outreach and dissemination are focused on reporting on trans-continental data infrastructures and data repositories, and on several use cases
• D4.1 Trans-continental Data Infrastructures and Data repositories
• D4.2 Analysis of Data Infrastructures and Data repositories (coming soon)
Available at http://www.chain-project.eu/deliverables
WP4 ‘Data infrastructures’
CHAIN-REDS has established official collaborations (MoUs) with other VRC-related communities:
• AgINFRA
• DCH-RP
• EarthServer
• EIFL
• ENGAGE
WP4 ‘Data infrastructures’
Conversations are being held with EUDAT, H3Africa, iMENTORS, IVOA, SAEON, SKA Africa, Univ. Cape Town
WP4 ‘Data infrastructures’
Extend the CHAIN-REDS Knowledge Base (KB) with Data capabilities: http://www.chain-project.eu/knowledge-base
Knowledge Base: Infrastructure
RREN(s), NREN, NGI, CA(s), Ident. Fed(s), ROC(s), Grid site(s), Application(s)
An investigation on the available (Open Access) Data and Document Repositories has been performed
Information has been collected in Africa, Asia, Europe, Latin America and the Middle East
Newly identified repositories have been incorporated into the Knowledge Base
These new repositories range from databases owned by a single group to huge continental collaborations
Knowledge Base: Document & Data repositories
• 3,200 repos
• >33 M docs
For Open Access Data Repositories, the following standards are being promoted:
• OAI-PMH for metadata retrieval
• Dublin Core as metadata schema
• SPARQL for semantic web search
• VOTable (XML) as a potential standard for the interchange of data represented as a set of tables
• Persistent Identifiers (PID)
Standards
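To make the first two standards concrete, the sketch below parses an OAI-PMH ListRecords response carrying Dublin Core (oai_dc) metadata. It is only an illustration: the sample record, its identifiers and the function name parse_list_records are assumptions made for this example, not taken from the CHAIN-REDS Knowledge Base. The XML namespaces are the standard OAI-PMH 2.0 and Dublin Core ones.

```python
import xml.etree.ElementTree as ET

# Standard namespaces used by OAI-PMH 2.0 and simple Dublin Core.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "oai_dc": "http://www.openarchives.org/OAI/2.0/oai_dc/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Hypothetical response, shaped like the answer to
# <base-url>?verb=ListRecords&metadataPrefix=oai_dc
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header>
        <identifier>oai:repo.example.org:1234</identifier>
        <datestamp>2013-11-05</datestamp>
      </header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Sample data set</dc:title>
          <dc:creator>Doe, Jane</dc:creator>
          <dc:subject>Earth Science</dc:subject>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""


def parse_list_records(xml_text):
    """Return one dict per record: its OAI identifier and its Dublin Core fields."""
    root = ET.fromstring(xml_text)
    records = []
    for record in root.iterfind(".//oai:record", NS):
        oai_id = record.findtext("oai:header/oai:identifier", default="", namespaces=NS)
        dc_root = record.find("oai:metadata/oai_dc:dc", NS)
        fields = {}
        if dc_root is not None:
            for element in dc_root:
                # Strip the "{namespace}" prefix to get the bare element name.
                name = element.tag.split("}", 1)[-1]
                # Dublin Core elements are repeatable, so collect values in lists.
                fields.setdefault(name, []).append((element.text or "").strip())
        records.append({"oai_identifier": oai_id, "dc": fields})
    return records


records = parse_list_records(SAMPLE_RESPONSE)
print(records[0]["oai_identifier"], records[0]["dc"]["title"])
```

Each Dublin Core element maps to a list because the schema allows any element to repeat, for example several creators or subjects per record.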
The adopted standards have been implemented in the CHAIN-REDS KB
Developments on (Open Access) Document and Data Repositories:
• A semantic web enrichment
• A semantic search engine
OADRs and DRs
Semantic enrichment
[Figure: Semantic search engine architecture. Components shown: OADRs and Data Repos. (accessed via OAI-PMH), Harvester (running on grid/cloud), Semantic-web enrichment, End-points, Linked-data search engine]
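The slides do not detail how the semantic-web enrichment is implemented. As a rough illustration of the idea, the sketch below turns harvested Dublin Core fields into RDF statements in N-Triples syntax, the kind of linked data an end-point could expose. The function names, the example URIs and the rule that treats http(s) values as IRIs are assumptions made for this example.

```python
DC_NS = "http://purl.org/dc/elements/1.1/"


def escape_literal(value):
    """Escape the characters that N-Triples requires to be escaped in literals."""
    return (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def record_to_ntriples(subject_uri, dc_fields):
    """Map {element name: [values]} to one N-Triples line per value.

    Simplification (assumption): values that look like http(s) URLs become
    IRIs so they can link to other resources; everything else is a literal.
    """
    lines = []
    for name, values in dc_fields.items():
        for value in values:
            if value.startswith(("http://", "https://")):
                obj = f"<{value}>"
            else:
                obj = f'"{escape_literal(value)}"'
            lines.append(f"<{subject_uri}> <{DC_NS}{name}> {obj} .")
    return lines


# Hypothetical record and subject URI, for illustration only.
triples = record_to_ntriples(
    "http://example.org/record/1234",
    {
        "title": ["Sample data set"],
        "identifier": ["http://hdl.handle.net/0000/example"],
    },
)
print("\n".join(triples))
```

Once metadata are expressed as triples, records from different repositories that share a subject, creator or identifier can be linked and queried together, for example with SPARQL.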
The semantic search engine on CHAIN-REDS linked data is available
It allows searching among the semantically enriched metadata coming from the OADRs and DRs included in the KB
OADRs and DRs
New knowledge discovery!
Single and parallel semantic searches are available:
• Single: the usual semantic search service described before
• Parallel: the new parallel semantic search service that allows users to search in parallel across the millions of resources contained in the CHAIN-REDS Knowledge Base and in the ENGAGE Platform
Parallel semantic search engines have also been made available in other Science Gateways:
• agINFRA (CHAIN-REDS Knowledge Base & OpenAgris repository)
• DCH-RP (CHAIN-REDS Knowledge Base & Europeana, Cultura Italia and Isidore repositories)
Semantic Search Engine
Performs sequential and parallel searches: ENGAGE, agINFRA, DCH-RP
Semantic Search Engine
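The difference between the single and the parallel mode can be sketched with a thread pool that sends the same keyword to several back-ends at once and gathers the answers per source. This is not the project's implementation: the two back-end functions are local stubs with made-up catalogues, standing in for the CHAIN-REDS Knowledge Base and a partner repository.

```python
from concurrent.futures import ThreadPoolExecutor


def search_knowledge_base(keyword):
    """Hypothetical stub standing in for a query to the Knowledge Base."""
    catalogue = ["Grid computing survey", "Open data in agriculture"]
    return [title for title in catalogue if keyword.lower() in title.lower()]


def search_partner_repository(keyword):
    """Hypothetical stub standing in for a query to a partner repository."""
    catalogue = ["Agriculture data set 2012", "Cultural heritage archive"]
    return [title for title in catalogue if keyword.lower() in title.lower()]


def parallel_search(keyword, backends):
    """Run every back-end concurrently and return {backend name: results}."""
    with ThreadPoolExecutor(max_workers=max(1, len(backends))) as pool:
        # Submit all searches first so that they run at the same time.
        futures = {name: pool.submit(fn, keyword) for name, fn in backends.items()}
        # Then wait for each one and collect its results.
        return {name: future.result() for name, future in futures.items()}


results = parallel_search(
    "agriculture",
    {
        "knowledge_base": search_knowledge_base,
        "partner_repository": search_partner_repository,
    },
)
print(results)
```

Threads suit this case because each search spends most of its time waiting for a remote service, so the total time is close to that of the slowest back-end rather than the sum of all of them.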
Programmatic use of the CHAIN-REDS Semantic Search Engine is also possible by means of a RESTful API
• http://www.chain-project.eu/semantic-search-api (CHAIN-REDS webpage, Semantic Search Web)
• Example: http://www.chain-project.eu/virtuoso/api/resources?keyword=<KEYWORD>&limit=<NUMBER_OF_RESOURCES>
Semantic Search Engine
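A minimal client sketch for the RESTful API follows. Only the endpoint and the keyword and limit parameters come from the example URL on the slide. The function names are made up for this example, and the response format is not stated in the slides, so the body is returned as raw bytes rather than parsed. The service may also no longer be reachable.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Endpoint taken from the example URL on the slide.
API_BASE = "http://www.chain-project.eu/virtuoso/api/resources"


def build_search_url(keyword, limit=10):
    """Build the query URL, percent-encoding the keyword safely."""
    if limit < 1:
        raise ValueError("limit must be a positive integer")
    return f"{API_BASE}?{urlencode({'keyword': keyword, 'limit': limit})}"


def fetch_raw(keyword, limit=10, timeout=30.0):
    """Call the API and return the undecoded body (response format not assumed)."""
    with urlopen(build_search_url(keyword, limit), timeout=timeout) as response:
        return response.read()


print(build_search_url("climate change", limit=5))
# http://www.chain-project.eu/virtuoso/api/resources?keyword=climate+change&limit=5
```

Building the query string with urlencode avoids malformed URLs when the keyword contains spaces or reserved characters.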
Future developments on:
• A tool for extracting the data associated with OADRs
• The execution of distributed jobs in the Science Gateway
Data Accessibility, Reproducibility and Trustworthiness (DART)
• Based on the interoperability demo performed by CHAIN-REDS at EGI TF 2013
• Aiming to seamlessly perform the cycle:
  • Access to a document
  • Extraction of associated raw data
  • Execution of a code taking those data as input
  • Generation of new results
  • Upload of the new results and article
Coming actions
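The five steps of the DART cycle listed above can be sketched as a small pipeline. Everything here is a stand-in chosen for illustration: the in-memory repository, the Document class, the identifiers and the mean-value "analysis" are assumptions, not the tools used in the EGI TF 2013 demo.

```python
from dataclasses import dataclass, field
from statistics import mean


@dataclass
class Document:
    identifier: str   # hypothetical persistent identifier
    title: str
    raw_data: list    # data set associated with the document


@dataclass
class InMemoryRepository:
    """Stand-in for an open access repository (hypothetical)."""
    documents: dict = field(default_factory=dict)

    def access(self, identifier):
        # Step 1: access to a document.
        return self.documents[identifier]

    def upload(self, document):
        # Step 5: upload of the new results and article.
        self.documents[document.identifier] = document


def extract_raw_data(document):
    # Step 2: extraction of the associated raw data.
    return list(document.raw_data)


def run_analysis(data):
    # Steps 3 and 4: execute a code on the data and generate new results.
    return {"n": len(data), "mean": mean(data)}


def dart_cycle(repository, identifier, new_identifier):
    source = repository.access(identifier)
    data = extract_raw_data(source)
    results = run_analysis(data)
    derived = Document(
        identifier=new_identifier,
        title=f"Re-analysis of {source.title}",
        raw_data=[results["n"], results["mean"]],
    )
    repository.upload(derived)
    return results


repo = InMemoryRepository()
repo.upload(Document("doc:1", "Temperature series", [1.0, 2.0, 3.0]))
print(dart_cycle(repo, "doc:1", "doc:2"))
```

Uploading the derived document back into the same repository closes the loop, so the new results can themselves be accessed and re-analysed in a later cycle.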
In a first phase, CHAIN-REDS has identified several fields of interest in the different regions:
• Agriculture
• Cultural Heritage
• e-Government
• Earth Science
• Astronomy and Astrophysics
Potential collaborations with initiatives and projects working in these areas are being pursued
Conclusions
Other fields and groups are also of interest
OADRs’ and DRs’ managers/owners are welcome to contact the project to share their data within the CHAIN Knowledge Base (this is already happening in both Africa and Latin America)
CHAIN-REDS also looks forward to receiving feedback from all interested organizations on the Knowledge Base and the semantic search service
Conclusions
Data developments have been carried out in the Regions of interest to CHAIN-REDS
A special action in the Middle East is now a priority for CHAIN-REDS
The semantic search engine and semantic-web enrichment are powerful tools to link data and retrieve information
DART
Conclusions
Thank you !