Some problems with standard geospatial metadata...Simon Cox, Bruce Simons, Nick Car 12 March 2015...

Simon Cox, Bruce Simons, Nick Car

12 March 2015

LAND AND WATER FLAGSHIP

Some problems with standard geospatial metadata

This presentation

• Asks some questions

• Does not provide all the answers

• … but suggests some directions …

Problems with metadata | Nick Car 2 |

Outline

• ANZLIC and GeoNetwork

• Where did ANZLIC come from?

• Records

• Uses of metadata

• UML vs XML

• RDF

• RDF vocabularies

ANZLIC Metadata

Where did ANZLIC come from?

● ANZLIC is a profile of ISO 19115:2003

● ISO 19115 was designed by a committee (horse designed by committee = camel)

○ US FGDC metadata a strong precedent

○ requirements collected in the 1990s

○ image and map librarians

> dawn of the internet, dataset == file

> 10,000s of datasets in standard series, metadata == digital ‘index cards’

Problem #1: Data ≠ Datasets?

• When cataloguing books, maps, images, even files, the card-index metaphor is OK

• A discrete record for each item of data

• Now we expect to access data at a variety of granularities, so the dataset/metadata record paradigm no longer applies

• It is a sea of data, and should be matched by a sea of metadata (maybe in the same place)

Breaking it down

• Structural decomposition

• Functional decomposition

Lawrence, Lowry, Miller, Snaith & Woolf, Information in environmental data grids. Phil. Trans. A, 2009

Problem #2: One record can’t serve all purposes

• But one ‘record’ is all you got!

ISO metadata was formalized as UML classes

GeoNetwork stores metadata as XML documents in a text database (Lucene)

Problem #3: Documents package text, not objects

• Instances of UML classes = Objects

• XML document = serialization for transport

• Treating the XML document as ‘canonical’ makes a basic category error:

➢ XML validation ≠ quality control

➢ if you only intend to manage it as text, why bother with a UML analysis?

For object-oriented behavior, the serialized form must be ‘un-marshalled’ for processing
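The distinction between a serialized document and an object can be sketched in a few lines. The record below is hypothetical and heavily simplified (real ISO 19139 XML is namespaced and far larger), and the `Record` class, its fields, and the `problems` method are illustrative names only. The point is that a check comparing two typed values only becomes natural once the text has been un-marshalled.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from datetime import date
from typing import List

# Hypothetical, simplified record: well-formed XML, but the latitudes are swapped.
RECORD_XML = """
<record>
  <title>Soil moisture grid</title>
  <created>2015-03-12</created>
  <south>-10.0</south>
  <north>-44.0</north>
</record>
"""


@dataclass
class Record:
    """Typed object form of the record (illustrative fields only)."""
    title: str
    created: date
    south: float
    north: float

    def problems(self) -> List[str]:
        """Quality checks that compare typed values against each other."""
        found = []
        if self.south > self.north:
            found.append("south bound is north of north bound")
        if self.created > date.today():
            found.append("creation date is in the future")
        return found


def unmarshal(xml_text: str) -> Record:
    """Turn the serialized text into an object with real dates and numbers."""
    root = ET.fromstring(xml_text)
    return Record(
        title=root.findtext("title", default="").strip(),
        created=date.fromisoformat(root.findtext("created")),
        south=float(root.findtext("south")),
        north=float(root.findtext("north")),
    )


record = unmarshal(RECORD_XML)
print(record.problems())
```

The document parses without complaint, yet the object reports that its southern bound lies north of its northern bound. A check of element structure alone would not flag that.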

Metadata creation

Problem #4: Index cards are not infrastructure

• Metadata-entry paradigm encourages record counting as a KPI

• Surely there are better measures of usefulness?

• How can we know, if it is not part of a joined-up architecture?

What does everyone else do?

1. Specialist systems for specialized communities

– Is spatial special? Do we want our spatial data in the mainstream?

2. Don’t bother with metadata, just index the content

– The original strategy of the search engines

– Google Knowledge Graph now works with entities, not text

– (shame the entities don’t have persistent URIs …)

3. Metadata annotations

– schema.org – semantic-web-lite

4. What about the Data Repositories?

Research Data Repositories

• Still a lot of variation

• RIF-CS

• MARC

• Dublin Core

• Data Catalog Vocabulary (DCAT)

• RDF vocabularies?

• DC, DCAT

• FOAF, PROV-O, VoID, SKOS, ADMS, LOCN
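To give a feel for what a description built from these vocabularies might look like, here is a minimal sketch of one dataset expressed with DCAT and Dublin Core Terms and serialized as JSON-LD. The two namespace URIs are the published ones; every `example.org` identifier, the title, and the keywords are made up for illustration, and this is not an INSPIRE or ANZLIC record.

```python
import json

# Namespace URIs are the published DCAT and Dublin Core Terms namespaces.
# All example.org identifiers and literal values are invented.
dataset = {
    "@context": {
        "dcat": "http://www.w3.org/ns/dcat#",
        "dct": "http://purl.org/dc/terms/",
    },
    "@id": "http://example.org/dataset/soil-moisture",
    "@type": "dcat:Dataset",
    "dct:title": "Soil moisture grid",
    "dcat:keyword": ["soil", "moisture"],
    # The distribution is a resource in its own right, with its own URI.
    "dcat:distribution": {
        "@id": "http://example.org/dataset/soil-moisture/netcdf",
        "@type": "dcat:Distribution",
        "dcat:accessURL": {"@id": "http://example.org/data/soil-moisture.nc"},
    },
}

print(json.dumps(dataset, indent=2))
```

Because the dataset and its distribution each carry a URI, other descriptions can point at either one without copying the whole record.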

INSPIRE profile of DCAT-AP

INSPIRE metadata record as RDF

RDF benefits

• Standard vocabularies used in the broader community

• Intrinsically object/resource oriented

• URIs for keys - linked data

• Open world – missing information doesn’t make it invalid

• No intrinsic granularity
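Several of these points can be illustrated without any RDF tooling. The sketch below uses a plain Python set of subject/predicate/object tuples as a stand-in for an RDF graph, and a small wildcard matcher as a stand-in for a query language such as SPARQL. All identifiers are hypothetical, with `ex:` standing in for full URIs; a real system would use an RDF store rather than this toy.

```python
from typing import Iterator, Optional, Set, Tuple

Triple = Tuple[str, str, str]

# Toy graph; every identifier here is invented for illustration.
GRAPH: Set[Triple] = {
    ("ex:dataset1", "dct:title", "Soil moisture grid"),
    ("ex:dataset1", "dct:creator", "ex:alice"),
    ("ex:dataset1", "dcat:distribution", "ex:dist1"),
    ("ex:dist1", "dcat:accessURL", "http://example.org/data/soil.nc"),
    # A statement about a part of a dataset: no fixed granularity.
    ("ex:dataset1/band3", "dct:description", "Calibrated 2014"),
    # ex:dataset2 has no title; under an open-world reading that is not an error.
    ("ex:dataset2", "dct:creator", "ex:alice"),
}


def match(s: Optional[str] = None,
          p: Optional[str] = None,
          o: Optional[str] = None) -> Iterator[Triple]:
    """Yield triples matching the pattern; None acts as a wildcard."""
    for triple in GRAPH:
        if all(want is None or want == got
               for want, got in zip((s, p, o), triple)):
            yield triple


# A 'discovery' subset: titles only.
titles = sorted(match(p="dct:title"))
# An 'attribution' subset: everything created by ex:alice.
by_alice = sorted(s for s, _, _ in match(p="dct:creator", o="ex:alice"))

print(titles)
print(by_alice)
```

Each purpose pulls its own subset from the same graph, so no single record has to serve every function.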

Summary

ANZLIC + GeoNetwork:

☹ Record-oriented metadata doesn’t match granularity of data

☹ Each record must serve multiple functions

☹ Object oriented design, but serialization-oriented processing

☹ Incentive to create records, not architecture

☹ Not aligned with anyone else’s metadata

RDF?:

☺ Graph of metadata to match graph of data

☺ Targeted metadata subsets can be constructed using SPARQL

☺ Intrinsically resource-oriented

☺ Part of web of Linked Data

☺ Standard RDF vocabularies

Thank you

Land and Water Flagship
Nick Car
Research Engineer
t +61 7 3833 5600
e nicholas.car@csiro.au

Land and Water Flagship
Simon Cox
Research Scientist
t +61 3 9252 6342
e simon.cox@csiro.au
w people.csiro.au/C/S/Simon-Cox