Metadata & Brokering - a modern approach for INGV RI
![Page 1: Metadata & Brokering - a modern approach for INGV RI](https://reader036.fdocuments.us/reader036/viewer/2022062313/55cc944fbb61eb6e208b458e/html5/thumbnails/1.jpg)
METADATA
a modern approach
Daniele Bailo
![Page 2: Metadata & Brokering - a modern approach for INGV RI](https://reader036.fdocuments.us/reader036/viewer/2022062313/55cc944fbb61eb6e208b458e/html5/thumbnails/2.jpg)
CHARACTERS
![Page 3: Metadata & Brokering - a modern approach for INGV RI](https://reader036.fdocuments.us/reader036/viewer/2022062313/55cc944fbb61eb6e208b458e/html5/thumbnails/3.jpg)
Leading Actor
Digital Data
Sequence of (digital) symbols
- With a meaning
- Can be stored
- Can be transmitted
- Can be computed
![Page 4: Metadata & Brokering - a modern approach for INGV RI](https://reader036.fdocuments.us/reader036/viewer/2022062313/55cc944fbb61eb6e208b458e/html5/thumbnails/4.jpg)
Guest Star
Metadata
Data about Data (really?)
Functions
- Manage data (discovery, selection, etc.)
Issues (selection of)
- What is metadata to me can be data to others
- Many standards
- Ontologies
![Page 5: Metadata & Brokering - a modern approach for INGV RI](https://reader036.fdocuments.us/reader036/viewer/2022062313/55cc944fbb61eb6e208b458e/html5/thumbnails/5.jpg)
Actor
Broker(ing system)
Intermediary software
Functions
- Accesses several systems on your behalf
- Collects data for you (integration)
Issues (selection of)
- Performance
- Works better with metadata
![Page 6: Metadata & Brokering - a modern approach for INGV RI](https://reader036.fdocuments.us/reader036/viewer/2022062313/55cc944fbb61eb6e208b458e/html5/thumbnails/6.jpg)
Actor
The Triad
A set of 3 elements to fully manage data
Functions
- PID – persistent identifier
- Metadata – discovery & selection
- DO – data of interest
<PID, metadata, DO>
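A minimal sketch of how one triad could be represented and used for selection. The class name `Triad`, the helper `select`, and all example values are hypothetical, chosen only for illustration; the slides do not prescribe any particular representation.

```python
from dataclasses import dataclass


@dataclass
class Triad:
    """One <PID, metadata, DO> triple."""
    pid: str        # persistent identifier
    metadata: dict  # used for discovery & selection
    do: bytes       # the data of interest


def select(triads, **criteria):
    """Return the triads whose metadata matches every given key/value pair."""
    return [
        t for t in triads
        if all(t.metadata.get(key) == value for key, value in criteria.items())
    ]


# Hypothetical example values, for illustration only.
record = Triad(
    pid="pid:example/0001",
    metadata={"title": "Example dataset", "format": "A"},
    do=b"raw bytes of the dataset",
)

matches = select([record], format="A")
```

Selection here looks only at the metadata; the data itself is reached through the triad once a match is found.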
![Page 7: Metadata & Brokering - a modern approach for INGV RI](https://reader036.fdocuments.us/reader036/viewer/2022062313/55cc944fbb61eb6e208b458e/html5/thumbnails/7.jpg)
Technical support staff
Data Base
Collection of (organized) Data
Alias
Repository, Data Center, etc.
Superpowers
- DBMS (allows definition, creation, querying, update, and administration of databases)
![Page 8: Metadata & Brokering - a modern approach for INGV RI](https://reader036.fdocuments.us/reader036/viewer/2022062313/55cc944fbb61eb6e208b458e/html5/thumbnails/8.jpg)
Technical support staff
APIs (Application Programming Interfaces)
Standard procedures or instructions for accessing a service (or function)
Alias
Web service, RESTful service, [thin layer], etc.
Needs
- Standards for requests
- Standards for responses
![Page 9: Metadata & Brokering - a modern approach for INGV RI](https://reader036.fdocuments.us/reader036/viewer/2022062313/55cc944fbb61eb6e208b458e/html5/thumbnails/9.jpg)
Themes
1. Optimization of resources
2. Single point of access… to several databases and services
3. OPEN ACCESS obligations: Berlin Declaration, DPC…
4. Interoperation for data re-use: new multidisciplinary science
5. Citation and data provenance
![Page 10: Metadata & Brokering - a modern approach for INGV RI](https://reader036.fdocuments.us/reader036/viewer/2022062313/55cc944fbb61eb6e208b458e/html5/thumbnails/10.jpg)
Comments?
Questions?
![Page 11: Metadata & Brokering - a modern approach for INGV RI](https://reader036.fdocuments.us/reader036/viewer/2022062313/55cc944fbb61eb6e208b458e/html5/thumbnails/11.jpg)
SCENARIOS
1. Friendship based discovery
2. Manual discovery
3. Advanced manual discovery
4. Brokering (canonical form)
5. Metadata driven canonical brokering
6. Metadata driven canonical brokering with contextualization
PREMISE
Structured data (standards)
![Page 12: Metadata & Brokering - a modern approach for INGV RI](https://reader036.fdocuments.us/reader036/viewer/2022062313/55cc944fbb61eb6e208b458e/html5/thumbnails/12.jpg)
#0 friendship based discovery
1. Data stored on USB pendrives, CDs etc.
2. Phone calls
3. Emails
Issues
Works well in masonry clubs
![Page 13: Metadata & Brokering - a modern approach for INGV RI](https://reader036.fdocuments.us/reader036/viewer/2022062313/55cc944fbb61eb6e208b458e/html5/thumbnails/13.jpg)
#1 Manual discovery
1. User discovers data
2. Repositories do not have web services
3. No metadata (or metadata embedded in the file or directory structure)
4. Manual match & mapping
Issues
- Performance, efficiency, error-prone, partial datasets
[Diagram: datasets in data formats A, B and C in repositories A, B and C; example request: "Data from Irpinia"]
![Page 14: Metadata & Brokering - a modern approach for INGV RI](https://reader036.fdocuments.us/reader036/viewer/2022062313/55cc944fbb61eb6e208b458e/html5/thumbnails/14.jpg)
#2 Advanced manual discovery
1. User discovers data
2. Repositories have access interfaces (APIs, WS…)
3. Minimal metadata set
4. Manual match & mapping
Issues
- Performance, efficiency, error-prone
- Some standardization in place
[Diagram: datasets in data formats A, B and C in repositories A, B and C, each with an API; example request: "Data from Irpinia"]
![Page 15: Metadata & Brokering - a modern approach for INGV RI](https://reader036.fdocuments.us/reader036/viewer/2022062313/55cc944fbb61eb6e208b458e/html5/thumbnails/15.jpg)
#4 Brokering (canonical form)
1. Broker discovers data
2. Repositories have access interfaces (APIs, WS…)
3. Minimal metadata set
4. Minimal match & mapping
5. Multidisciplinary (ontologies)
Issues
- Single AP
- Development and maintenance
- “Hardcoded” metadata
[Diagram: datasets in data formats A, B and C in repositories A, B and C, each with an API; Broker with its own API and a metadata canonical form; example request: "Data from Irpinia"]
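A minimal sketch of brokering through a canonical form, assuming each repository API returns items in its own format and the broker holds one adapter per format. The `Broker` class, the two formats, and the canonical fields `id` and `value` are all hypothetical, invented here for illustration.

```python
from typing import Callable, Dict, Iterable, List, Tuple

# Assumed canonical form: a dict with the keys "id" and "value".
Canonical = Dict[str, object]


def from_format_a(raw: str) -> Canonical:
    """Hypothetical format A: a string 'id;value'."""
    item_id, value = raw.split(";")
    return {"id": item_id, "value": float(value)}


def from_format_b(raw: dict) -> Canonical:
    """Hypothetical format B: a dict with the keys 'key' and 'val'."""
    return {"id": raw["key"], "value": float(raw["val"])}


class Broker:
    """Collects items from several sources and returns them in canonical form."""

    def __init__(self) -> None:
        self._sources: List[Tuple[Callable[[], Iterable], Callable]] = []

    def register(self, fetch: Callable[[], Iterable], to_canonical: Callable) -> None:
        # One adapter per source format: this mapping is "hardcoded" in software.
        self._sources.append((fetch, to_canonical))

    def query(self) -> List[Canonical]:
        results: List[Canonical] = []
        for fetch, to_canonical in self._sources:
            results.extend(to_canonical(item) for item in fetch())
        return results


broker = Broker()
broker.register(lambda: ["x1;4.2"], from_format_a)                   # stands in for repository A's API
broker.register(lambda: [{"key": "x2", "val": "3.1"}], from_format_b)  # stands in for repository B's API
records = broker.query()
```

The adapters are where the development and maintenance cost listed under Issues shows up: every new format needs new code in the broker.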
![Page 16: Metadata & Brokering - a modern approach for INGV RI](https://reader036.fdocuments.us/reader036/viewer/2022062313/55cc944fbb61eb6e208b458e/html5/thumbnails/16.jpg)
#5 Metadata driven canonical Brokering
1. Broker discovers data
2. Access interfaces
3. Full metadata set
4. Advanced match & mapping
5. Multidisciplinary (ontologies)
Issues
- Single AP
- Stored graph metadata
- Huge metadata superset
[Diagram: datasets in data formats A, B and C in repositories A, B and C, each with an API; Broker with its own API and a metadata catalog; example request: "Data from Irpinia"]
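A minimal sketch of metadata-driven discovery, assuming the broker first searches a metadata catalog and only then calls the repositories that hold matching datasets. `CatalogEntry`, `MetadataCatalog`, `broker_query`, the keyword-based search, and the dataset identifiers are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, FrozenSet, Iterable, List


@dataclass(frozen=True)
class CatalogEntry:
    dataset_id: str
    repository: str          # which repository holds the dataset
    keywords: FrozenSet[str]  # stands in for the stored metadata


class MetadataCatalog:
    def __init__(self, entries: Iterable[CatalogEntry]) -> None:
        self._entries = list(entries)

    def search(self, keyword: str) -> List[CatalogEntry]:
        return [e for e in self._entries if keyword in e.keywords]


def broker_query(
    catalog: MetadataCatalog,
    repositories: Dict[str, Callable[[str], str]],
    keyword: str,
) -> Dict[str, str]:
    """Discover through the catalog, then fetch only the matching datasets."""
    return {
        entry.dataset_id: repositories[entry.repository](entry.dataset_id)
        for entry in catalog.search(keyword)
    }


catalog = MetadataCatalog([
    CatalogEntry("ds-1", "A", frozenset({"Irpinia"})),
    CatalogEntry("ds-2", "B", frozenset({"elsewhere"})),
    CatalogEntry("ds-3", "C", frozenset({"Irpinia"})),
])

# Each function stands in for one repository's API.
repositories = {
    "A": lambda dataset_id: f"A:{dataset_id}",
    "B": lambda dataset_id: f"B:{dataset_id}",
    "C": lambda dataset_id: f"C:{dataset_id}",
}

found = broker_query(catalog, repositories, "Irpinia")
```

Repository B is never contacted for this request, because the catalog already shows it holds nothing relevant.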
![Page 17: Metadata & Brokering - a modern approach for INGV RI](https://reader036.fdocuments.us/reader036/viewer/2022062313/55cc944fbb61eb6e208b458e/html5/thumbnails/17.jpg)
#6 Metadata driven canonical Brokering with contextualization
1. Map & match only contextualization metadata
2. Pointers to detailed metadata
[Diagram: datasets in data formats A, B and C in repositories A, B and C, each with an API; Broker with its own API and a metadata catalog; example request: "Data from Irpinia"]
![Page 18: Metadata & Brokering - a modern approach for INGV RI](https://reader036.fdocuments.us/reader036/viewer/2022062313/55cc944fbb61eb6e208b458e/html5/thumbnails/18.jpg)
#6 Metadata driven canonical Brokering with contextualization
1. Map & match only contextualization metadata
2. Pointers to detailed metadata
3. Export metadata in any standard
3 layer metadata model
- Discovery (DC) and (CKAN, eGMS)
- Contextual (CERIF metadata model)
- Detailed (community specific)
[Diagram: datasets in data formats A, B and C in repositories A, B and C, each with an API; arrows labelled "Generate" and "Point to" between the metadata layers]
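A minimal sketch of the three layers, assuming that the contextual layer is the one stored in the catalog, that the discovery layer is generated from it, and that it points to the detailed layer. It also assumes "DC" means Dublin Core. `ContextualRecord` is a hypothetical stand-in and is not the CERIF schema; the PID and URL are invented example values.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class ContextualRecord:
    """Hypothetical contextual-layer entry (NOT the actual CERIF model)."""
    pid: str
    title: str
    creator: str
    detailed_metadata_url: str  # pointer to community-specific detailed metadata


def to_discovery(record: ContextualRecord) -> Dict[str, str]:
    """Generate a discovery-level view using Dublin Core element names."""
    return {
        "dc:identifier": record.pid,
        "dc:title": record.title,
        "dc:creator": record.creator,
    }


# Invented example values, for illustration only.
record = ContextualRecord(
    pid="pid:example/0001",
    title="Example dataset",
    creator="Example Observatory",
    detailed_metadata_url="https://repository.example/metadata/0001",
)

discovery_view = to_discovery(record)
```

Only the contextual record is mapped and matched by the broker; the detailed metadata stays in its community format and is reached through the pointer.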
![Page 19: Metadata & Brokering - a modern approach for INGV RI](https://reader036.fdocuments.us/reader036/viewer/2022062313/55cc944fbb61eb6e208b458e/html5/thumbnails/19.jpg)
Question
There is a missing actor.
WHO?
![Page 20: Metadata & Brokering - a modern approach for INGV RI](https://reader036.fdocuments.us/reader036/viewer/2022062313/55cc944fbb61eb6e208b458e/html5/thumbnails/20.jpg)
<PID, metadata, DO>
1. PID uniquely identifies a Digital Object
2. Metadata provides a description of the Object
3. DO is the Digital Object… to be defined
- Discovery (DC) and (CKAN, eGMS)
- Contextual (CERIF metadata model)
- Detailed (community specific)
[Diagram: datasets behind APIs; request: "Data from Irpinia"; response: <PID, metadata, DO>]
![Page 21: Metadata & Brokering - a modern approach for INGV RI](https://reader036.fdocuments.us/reader036/viewer/2022062313/55cc944fbb61eb6e208b458e/html5/thumbnails/21.jpg)
Wrapping up
We need
1. Metadata describing data
2. APIs & web services
3. Defined WS output format
4. PID system
5. Brokering system
6. Metadata catalogue supporting
   1. Ontologies
   2. Contextualization
![Page 22: Metadata & Brokering - a modern approach for INGV RI](https://reader036.fdocuments.us/reader036/viewer/2022062313/55cc944fbb61eb6e208b458e/html5/thumbnails/22.jpg)
Q&A
![Page 23: Metadata & Brokering - a modern approach for INGV RI](https://reader036.fdocuments.us/reader036/viewer/2022062313/55cc944fbb61eb6e208b458e/html5/thumbnails/23.jpg)
#3 Metadata driven canonical brokering
1. Broker discovers data
2. Repositories have access interfaces (APIs, WS…)
3. Significant metadata set
4. Good match & mapping
Issues
- Development and maintenance
- Single AP
- “Hardcoded” metadata
[Diagram: datasets in data formats A, B and C in repositories A, B and C, each with an API; Broker with its own API and a metadata catalog; example request: "Data from Irpinia"]
![Page 24: Metadata & Brokering - a modern approach for INGV RI](https://reader036.fdocuments.us/reader036/viewer/2022062313/55cc944fbb61eb6e208b458e/html5/thumbnails/24.jpg)
#4 Metadata driven canonical brokering
Issues
1. Predefined tools for matching and mapping
2. Writing software: n conversion algorithms to canonical form
3. Ontologies
4. Multidisciplinary, but many formats
5. Good data discovery
6. Not all metadata used
[Diagram: Broker with a catalog; datasets in any data format, described in metadata formats A and B; example request: "Data from Irpinia"]
![Page 25: Metadata & Brokering - a modern approach for INGV RI](https://reader036.fdocuments.us/reader036/viewer/2022062313/55cc944fbb61eb6e208b458e/html5/thumbnails/25.jpg)
#1 Conventional Brokering
Issues
1. Writing software: n*(n-1) conversion algorithms
2. Does not scale in costs of development and maintenance
3. Matching and mapping
4. Works within a restricted research domain
5. “Complex” data discovery
[Diagram: Broker; datasets in data formats A, B and C; example request: "Data from Irpinia"]
![Page 26: Metadata & Brokering - a modern approach for INGV RI](https://reader036.fdocuments.us/reader036/viewer/2022062313/55cc944fbb61eb6e208b458e/html5/thumbnails/26.jpg)
#2 Brokering with canonical form
Issues
1. Writing software: n conversion algorithms to canonical form
2. Works within a restricted research domain
3. Matching and mapping
4. “Complex” data discovery
[Diagram: Broker; datasets in data formats A, B and C, plus a canonical format A in the legend; example request: "Data from Irpinia"]
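A short worked comparison of the two converter counts quoted in the Issues lists: n*(n-1) for conventional brokering (one converter per ordered pair of formats) and n with a canonical form (one converter per format, into the canonical form). The function names are hypothetical.

```python
def pairwise_converters(n: int) -> int:
    """Converters needed when every format is converted directly to every other."""
    return n * (n - 1)


def canonical_converters(n: int) -> int:
    """Converters needed when every format is converted to one canonical form."""
    return n


for n in (3, 5, 10):
    print(f"n={n}: pairwise={pairwise_converters(n)}, canonical={canonical_converters(n)}")
```

With the three formats A, B and C this is 6 converters against 3; with ten formats it is 90 against 10, which is why the pairwise approach does not scale in development and maintenance costs.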
![Page 27: Metadata & Brokering - a modern approach for INGV RI](https://reader036.fdocuments.us/reader036/viewer/2022062313/55cc944fbb61eb6e208b458e/html5/thumbnails/27.jpg)
#3 Metadata driven simple brokering
Issues
1. Good data discovery
2. Predefined tools for matching and mapping
3. Multidisciplinary, but many formats
4. Writing software: n*(n-1) conversion algorithms
5. Ontologies
[Diagram: Broker with METADATA; datasets in any data format, described in metadata formats A and B; example request: "Data from Irpinia"]
![Page 28: Metadata & Brokering - a modern approach for INGV RI](https://reader036.fdocuments.us/reader036/viewer/2022062313/55cc944fbb61eb6e208b458e/html5/thumbnails/28.jpg)
#2 Metadata driven canonical brokering
Issues
1. Predefined tools for matching and mapping
2. Writing software: n conversion algorithms to canonical form
3. Ontologies
4. Multidisciplinary, but many formats
5. Good data discovery
[Diagram: Broker with a METADATA catalog; datasets in any data format, described in metadata formats A and B; example request: "Data from Irpinia"]