Meta-data Resources in Europe: within NSIs and from Dosis Projects

Wilfried Grossmann

Department of Statistics and Decision Support Systems

University of Vienna

29.3.2000 Metadata Resources in Europe 2

Contents

Introduction

Contents of Meta-data

IT Structures for Meta-data

Processing Meta-data

Conclusions

Introduction

Continuing hot topics in the meta-data discussion

Content-orientation versus IT-orientation

There is a lack of communication between these two groups

Introduction

Meta-data providers versus meta-data users

Who provides which type of information for whom?

Contents of Meta-data

What kind of objects should be documented?

Basic statistical structures: variables, values, data sets

____________________

Statistical output

Statistical systems

Statistical processing

Contents of Meta-data

Approaches towards meta-data content

The template oriented approach

The data warehouse approach

The process oriented approach

Contents of Meta-data: The template oriented approach

Templates defined by a number of working groups

For micro data and data sets: DDI, Dublin Core

For (economic) macro data: OECD, IMF, ECE (Internet)

Contents of Meta-data: The template oriented approach

The OECD Template:

Concepts and sources

Data Collection

Data manipulation by national source

Data quality

Data Transmission

International Standards

Data Storage and Manipulation by OECD

Output preparation and delivery by OECD

Contents of Meta-data: The template oriented approach

The IMF Template:

Coverage

Periodicity

Timeliness

Quality of disseminated data

Integrity of disseminated data

Access by the public

Contents of Meta-data: The template oriented approach

Although the OECD approach seems more reliable from a statistical point of view, the IMF template is currently favoured by international organisations (EUROSTAT)

Contents of Meta-data: The warehouse approach

Integration of the data inside the NSIs in a data warehouse

Output and dissemination as first step

Meta-data are oriented towards the needs of the data warehouse

Contents of Meta-data: The warehouse approach

Projects in this direction in many NSIs

Best documentation: Australian Office

Definitional meta-data

Procedural meta-data

Operational meta-data

Systems meta-data

Datasets meta-data

Contents of Meta-data: The process oriented approach

Combines statistical and IT considerations

Statistical data are considered not as final products but as the result of a process chain

More detailed consideration of statistical terminology

Contents of Meta-data: The process oriented approach

Starting point was the SCB-DOC model (Rosen and Sundgren, 1991)

• A sequence of templates accompanying the statistical production process

• Ongoing activities at Statistics Sweden

• A number of NSIs want to adopt the model

Contents of Meta-data: The process oriented approach

The IDARESA model

Object oriented representation based on SCB-DOC, with emphasis on possible semi-automatic processing

Contents of Meta-data: The process oriented approach

The US Bureau of the Census model (Gillman, Appel et al., ongoing project):

Statistical system defined as an identifiable process .... to produce one or more deliverables

Contents of Meta-data: Summary

The process oriented approach seems favourable for a number of reasons

Two Examples:

Classification servers

Data Quality

Contents of Meta-data Summary: Classification server

A classification server should

Support unified use of terminology inside NSIs or international organisations

Support harmonisation between (international) standard classifications and locally defined (adapted) classifications

Contents of Meta-data Summary: Classification server

Requirements for a classification server

• A data base supporting easy and user friendly manipulation of hierarchy trees

• A mapping tool supporting the definition of correspondence tables between classifications

• A management strategy for implementation
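The first two requirements (hierarchy trees and correspondence tables) can be illustrated with a minimal sketch. This is not the design of any server named in this talk. The class names, the fallback rule in `translate`, and all codes are assumptions made up for the illustration.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Classification:
    """A classification held as a hierarchy tree (code -> parent code)."""
    name: str
    parent_of: dict[str, Optional[str]] = field(default_factory=dict)

    def add(self, code: str, parent: Optional[str] = None) -> None:
        if parent is not None and parent not in self.parent_of:
            raise KeyError(f"unknown parent code: {parent}")
        self.parent_of[code] = parent

    def ancestors(self, code: str) -> list[str]:
        """Parent codes from `code` upwards, nearest first."""
        chain = []
        current = self.parent_of[code]
        while current is not None:
            chain.append(current)
            current = self.parent_of[current]
        return chain


@dataclass
class CorrespondenceTable:
    """Links codes of a source classification to codes of a target one."""
    source: Classification
    target: Classification
    links: dict[str, set[str]] = field(default_factory=dict)

    def link(self, source_code: str, target_code: str) -> None:
        if source_code not in self.source.parent_of:
            raise KeyError(f"unknown source code: {source_code}")
        if target_code not in self.target.parent_of:
            raise KeyError(f"unknown target code: {target_code}")
        self.links.setdefault(source_code, set()).add(target_code)

    def translate(self, source_code: str) -> set[str]:
        """Target codes for a source code.

        Assumed rule: if the code itself is not linked, fall back to the
        nearest linked ancestor in the source hierarchy.
        """
        for code in [source_code, *self.source.ancestors(source_code)]:
            if code in self.links:
                return self.links[code]
        return set()


# Invented codes, for illustration only.
local = Classification("local")
local.add("A")
local.add("A1", parent="A")
local.add("A1x", parent="A1")

standard = Classification("standard")
standard.add("01")
standard.add("011", parent="01")

table = CorrespondenceTable(local, standard)
table.link("A1", "011")
print(table.translate("A1x"))
```

Here the locally defined code `A1x` has no link of its own, so the lookup falls back to its parent `A1`. A real server would also need versioning and a management procedure, which this sketch leaves out.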

Contents of Meta-data Summary: Classification server

Up to now, only a few successful implementations of partial solutions:

EUROSTAT (SIMONE-Server)

New Zealand

Contents of Meta-data Summary: Data Quality

Data Quality

Criteria for quality of statistics are well known (relevance, accuracy, timeliness, accessibility, comparability, coherence, completeness)

The problem

• Achieve quality in the production process

• Document quality by appropriate meta-data

Contents of Meta-data Summary: Data Quality

Experience shows that documentation quality is rather poor as soon as it is separated from the production process

Example of an integration project: the SIDI approach by ISTAT

IT Structures for Meta-data

The Internet and data warehouses offer new opportunities for

• Meta-data and data repositories

• Meta-data access and exchange

They lead towards a more open policy in data dissemination

IT Structures for Meta-data: Meta-data repositories

Approaches towards repositories

The thesaurus approach

The template oriented approach

The Data Warehouse oriented approach

IT Structures for Meta-data: Meta-data repositories

Example of a thesaurus oriented approach

EUROSTAT servers for concepts and definitions

• Advantage: available on the Internet

• Problem: navigation is not easy

IT Structures for Meta-data: Meta-data repositories

• Contents

– Descriptions (dictionaries)

– Semantic (coverage, standard classifications, coherence of information)

– Administration (responsible persons)

– Selection (keywords, search facilities)
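As a rough sketch, the four kinds of content listed above could be held in one record per documented concept. The field names, the `search` helper and the example entries are assumptions for illustration, not the structure of any EUROSTAT server.

```python
from dataclasses import dataclass, field


@dataclass
class RepositoryEntry:
    # Descriptions: dictionary-style name and definition
    name: str
    definition: str
    # Semantic: coverage and the standard classification referred to
    coverage: str
    classification: str
    # Administration: who is responsible for the entry
    responsible: str
    # Selection: keywords used by the search facility
    keywords: set[str] = field(default_factory=set)


def search(entries: list[RepositoryEntry], keyword: str) -> list[RepositoryEntry]:
    """Return the entries tagged with `keyword`, ignoring case."""
    wanted = keyword.lower()
    return [e for e in entries if wanted in {k.lower() for k in e.keywords}]


# Invented example entries.
entries = [
    RepositoryEntry("turnover", "Value of sales in the period", "enterprises",
                    "activity classification", "unit A", {"Business", "sales"}),
    RepositoryEntry("household size", "Number of persons in a household",
                    "private households", "none", "unit B", {"population"}),
]
print([e.name for e in search(entries, "business")])
```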

IT Structures for Meta-data: Meta-data repositories

Example of the template oriented approach

StatBase: supporting access to meta-data as well as data and reports

• Meets the requirements of the OECD data template quite well

• No direct connection between data and meta-data

IT Structures for Meta-data: Meta-data repositories

Example of the warehouse oriented approach

StatLine (CBS): Based on data access from multidimensional tables (cubes)

• Accompanying meta-information is only in Dutch

• Extraction of special meta-data items is not as easy as in StatBase

IT Structures for Meta-data: Meta-data access and exchange

Ongoing work in access and exchange

New Standards for access and exchange

Accessing distributed sources

Combination of information

IT Structures for Meta-data: Meta-data access and exchange

Current trends in standardization

• Traditional standards for data and meta-data exchange, such as GESMES or CLASET, will probably move to an XML platform.

• New standards from the Object Management Group (OMG)
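To give a feel for what XML-based meta-data exchange looks like, here is a minimal round-trip sketch using Python's standard `xml.etree.ElementTree` module. The element and attribute names (`metadata`, `item`, `name`) are invented for the illustration. They are not taken from GESMES, CLASET or any OMG specification.

```python
import xml.etree.ElementTree as ET


def metadata_to_xml(record: dict[str, str]) -> str:
    """Serialise a flat meta-data record into a small XML document."""
    root = ET.Element("metadata")
    for key, value in record.items():
        item = ET.SubElement(root, "item", name=key)
        item.text = value
    return ET.tostring(root, encoding="unicode")


def xml_to_metadata(text: str) -> dict[str, str]:
    """Read the record back from the XML produced above."""
    root = ET.fromstring(text)
    return {item.get("name"): item.text or "" for item in root.findall("item")}


# Invented record, for illustration only.
record = {"variable": "turnover", "unit": "EUR", "period": "1999"}
text = metadata_to_xml(record)
print(text)
print(xml_to_metadata(text) == record)
```

The point of the sketch is only that sender and receiver need to agree on the tag vocabulary. Agreeing on that vocabulary is the actual standardization work.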

IT Structures for Meta-data: Meta-data access and exchange

Example MOF (Meta Object Facility)

– Extensible Framework for meta-data model definition

– Programming interface for storage and access of meta-data

– Integration facilities across domains

But note: This is a general approach for warehouses, not necessarily tied to statistics

IT Structures for Meta-data: Meta-data access and exchange

Example of accessing and processing distributed sources

ADDSIA: Accessing and processing distributed sources for analysis purposes

• Minimum requirements for standardisation in advance

• Orientation towards statistical problems

Processing Meta-data

Goal: Data and meta-data are processed together

<OldDataSets, OldMetadataSets> → <NewData, NewMetadata>
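A minimal sketch of this goal: one operation takes a pair of data and meta-data and returns a new pair, so the documentation is produced by the processing step itself. The function name, the meta-data keys (`n_records`, `history`) and the example values are assumptions for illustration.

```python
def select_records(data, metadata, predicate, description):
    """Apply a selection to the data and document it in the meta-data."""
    new_data = [row for row in data if predicate(row)]
    new_metadata = dict(metadata)  # shallow copy; the old meta-data stay untouched
    new_metadata["n_records"] = len(new_data)
    new_metadata["history"] = [*metadata.get("history", []), description]
    return new_data, new_metadata


# Invented data and meta-data.
old_data = [{"region": "N", "income": 10}, {"region": "S", "income": 20}]
old_metadata = {"source": "invented survey", "n_records": 2}

new_data, new_metadata = select_records(
    old_data,
    old_metadata,
    predicate=lambda row: row["region"] == "N",
    description="selected region N",
)
print(new_data)
print(new_metadata)
```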

Processing Meta-data

Advantages

• Reduction of documentation effort

• More consistency in meta-data

Requirements

• Software tools supporting this view

• Operational models for meta-data

Processing Meta-data

Up to now only prototypes, with emphasis on different aspects of processing

The planning approach

The throughput approach

The transformation approach

Processing Meta-data: The planning approach

Develop software tools (workbench) for setting up meta-data documentation

BRIDGE/IMIM: A desktop for planning surveys and statistical production

• Meta-data generated in the planning phase are managed by the system

• No data are processed

Processing Meta-data: The planning approach

Improvement and adaptation of meta-data models for new tasks like quality and use of administrative sources

SIDI (Statistics Italy)

• Integration of quality in the statistical production process

• Standardization of the production process

Processing Meta-data: The throughput approach

Use as much meta-data as possible from OldMeta-data to obtain NewMeta-data

CBS (ongoing work):

• Use BLAISE meta-data as input

• Produce StatLine meta-data as output

Processing Meta-data: The transformation approach

Define meta-data algorithms for all types of data algorithms

• Throughput meta-data

• Modified meta-data

• New meta-data

• Meta-data summarization
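One possible reading of these four kinds of meta-data algorithm, sketched for a single data algorithm (aggregation by sum). How each meta-data item is assigned to a category below is an interpretation, not something the slides spell out. The keys and values are invented.

```python
from collections import defaultdict


def aggregate_sum(data, metadata, group_var, value_var):
    """Sum `value_var` per `group_var` and derive the matching meta-data."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for row in data:
        totals[row[group_var]] += row[value_var]
        counts[row[group_var]] += 1
    new_data = [{group_var: g, value_var: t} for g, t in sorted(totals.items())]

    new_metadata = {
        # Throughput meta-data: passed on unchanged
        "source": metadata["source"],
        "reference_period": metadata["reference_period"],
        # Modified meta-data: the statistical unit changes with aggregation
        "unit": f"{group_var} (aggregated from {metadata['unit']})",
        # New meta-data: information that did not exist before this step
        "aggregation": f"sum of {value_var} by {group_var}",
        # Meta-data summarization: condensed information about the inputs
        "records_per_group": dict(counts),
    }
    return new_data, new_metadata


# Invented micro data and meta-data.
micro = [
    {"region": "N", "income": 10.0},
    {"region": "N", "income": 20.0},
    {"region": "S", "income": 5.0},
]
micro_meta = {"source": "invented survey", "reference_period": "1999", "unit": "person"}

macro, macro_meta = aggregate_sum(micro, micro_meta, "region", "income")
print(macro)
print(macro_meta)
```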

Processing Meta-data: The transformation approach

IDARESA project

Meta-data algorithms for elementary data base operations

ISMIS

Identification of added value in meta-data (new meta-data)

Pursuit of the production process inside EUROSTAT

Processing Meta-data: The transformation approach

[Figure: process chain in which every data set is paired with its meta-data. Input data 1 to 7 with input meta-data 1 to 7 are processed through interim data 1 to 5 with interim meta-data 1 to 5 into output data with output meta-data.]
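The process chain on this slide, where inputs are combined into interim results and finally into an output with the meta-data travelling along, could be sketched as follows. The wiring of the example (which inputs feed which interim result) is invented, since it cannot be read off the transcript. The meta-data are reduced to a name and a lineage for brevity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Documented:
    """A data set together with its meta-data (here only name and lineage)."""
    name: str
    data: tuple
    lineage: tuple = ()  # names of the original inputs; empty for an input itself


def combine(name, *inputs):
    """Merge several documented data sets and derive the lineage meta-data."""
    data = tuple(value for item in inputs for value in item.data)
    lineage = tuple(
        source
        for item in inputs
        for source in (item.lineage or (item.name,))
    )
    return Documented(name, data, lineage)


# Invented wiring: two inputs form an interim result, which is then
# combined with a third input to give the output.
input1 = Documented("input 1", (1, 2))
input2 = Documented("input 2", (3,))
input3 = Documented("input 3", (4, 5))

interim1 = combine("interim 1", input1, input2)
output = combine("output", interim1, input3)
print(output.data)
print(output.lineage)
```

Because each step derives its meta-data from the meta-data of its inputs, the output can still name every original source without anyone documenting it by hand.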

Conclusions

Is there progress in meta-data research and development?

Yes, but rather slow, because

There is a lack of co-ordination in research (probably improved by a forthcoming meta-data working group)

There is an information gap between meta-data research groups and NSIs

NSIs seem to prefer their own solutions