Metadata Harvesting And Validation

Post on 26-Dec-2014


Metadata Harvesting and Validation

Bram Vandeputte, K.U.Leuven

slideshare

• http://www.slideshare.net/bramvandeputte

Overview

• Validation Service

• Online Validation Service

• OAI-PMH

• Harvesting Infrastructure

Validation Service

• Interoperability: Application Profile (AP)

• Manual check: very time-consuming

• Need a tool for enforcing an AP => validation scheme

• A set of validation rules

• Reusable & extendable

Validation Service

Best practices derived from previous projects such as MELT and MACE

Reusable: modular + inheritance possible

• Components:

  • XML schema: structure

  • schematron:

    • mandatory/conditional elements

    • empty fields

    • vocabularies (auto generated)

    • ...

  • vCard component
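To illustrate what a single component does, the empty-fields check can be sketched in Python. This is not the ASPECT schematron rule set; the helper name `find_empty_elements` and the sample record are assumptions made for this example, and namespaces are left out for brevity.

```python
import xml.etree.ElementTree as ET


def find_empty_elements(xml_text):
    """Return the paths of leaf elements that carry no text (hypothetical helper)."""
    root = ET.fromstring(xml_text)
    empty = []

    def walk(elem, path):
        here = f"{path}/{elem.tag}"
        children = list(elem)
        # A leaf element with no (non-whitespace) text counts as an empty field.
        if not children and not (elem.text or "").strip():
            empty.append(here)
        for child in children:
            walk(child, here)

    walk(root, "")
    return empty


# Made-up record: the title is filled in, the description string is empty.
SAMPLE = """<lom><general>
  <title><string>Photosynthesis</string></title>
  <description><string></string></description>
</general></lom>"""

print(find_empty_elements(SAMPLE))
```

A real component would express the same rule as a schematron assertion; the Python version only shows the kind of check involved.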

Validation Service

• Terminology:

  • Validation Component: atomic block which does specific validation checking

  • Validation Scheme: collection of components that ensures validity against a whole AP

  • Validation Scheme URI: unique identifier of a scheme, e.g. http://aspect-project.org/validation/ASPECTv1.0/core
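The relationship between components, schemes and scheme URIs can be sketched as follows. This is a minimal illustration, not the service's implementation: records are plain dicts rather than LOM XML, `ValidationScheme` and `require` are hypothetical names, the URI of the recommended scheme is assumed by analogy with the core URI, and the assumption that the recommended scheme extends the core scheme is made only for the example.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional

# A component takes a record and returns a list of error messages.
Component = Callable[[dict], List[str]]


@dataclass
class ValidationScheme:
    uri: str                                   # unique identifier of the scheme
    components: List[Component] = field(default_factory=list)
    extends: Optional["ValidationScheme"] = None

    def validate(self, record: dict) -> List[str]:
        # Inherited checks run first, then the scheme's own components.
        errors = self.extends.validate(record) if self.extends else []
        for component in self.components:
            errors.extend(component(record))
        return errors


def require(field_name: str) -> Component:
    """Build a component that checks one mandatory element (hypothetical)."""
    def check(record: dict) -> List[str]:
        if record.get(field_name):
            return []
        return [f"missing mandatory element: {field_name}"]
    return check


BASE = "http://aspect-project.org/validation/ASPECTv1.0/"
core = ValidationScheme(BASE + "core", [require("title")])
# Assumed URI and assumed inheritance, for illustration only.
recommended = ValidationScheme(BASE + "recommended", [require("description")], extends=core)

REGISTRY = {scheme.uri: scheme for scheme in (core, recommended)}
print(REGISTRY[BASE + "recommended"].validate({"title": "Photosynthesis"}))
```

Looking a scheme up by URI keeps callers independent of how the scheme is assembled, which is what makes the components reusable across application profiles.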

Validation Service

[Diagram: validation schemes and the validation components they use or extend. Labels shown: LOM loose, lomloose.xsd, vcard validator, empty attribute fields, vocabulary bank, core schematron rules, recommended schematron rules, ASPECT, ASPECTv1.0/core, ASPECTv1.0/recommended, IMS ILOX. Legend: validation scheme, validation component, uses, extends.]

Online Validation Service demo

Validation against the LRE AP; refer to the LRE AP document.

OAI-PMH

• Client-server model

• Pull mechanism

• Options:

  • selective harvesting (date and set)

  • incremental harvesting

• Metadata-agnostic

OAI-PMH

• Verbs: Identify, ListRecords, GetRecord

• Parameters:

  • baseUrl

  • from & until date

  • metadataPrefix

  • sets
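The verbs and parameters above combine into plain HTTP GET requests. The sketch below only builds the request URLs; the base URL `http://example.org/oai`, the set name `aspect` and the metadata prefix `oai_lom` are placeholders chosen for the example, and `build_request` is a hypothetical helper, not part of any harvester.

```python
from urllib.parse import urlencode


def build_request(base_url, verb, metadata_prefix=None, from_date=None,
                  until_date=None, set_spec=None, identifier=None):
    """Build an OAI-PMH request URL from a verb and its optional arguments."""
    params = {"verb": verb}
    optional = {
        "metadataPrefix": metadata_prefix,
        "from": from_date,        # selective harvesting by date
        "until": until_date,
        "set": set_spec,          # selective harvesting by set
        "identifier": identifier, # used by GetRecord
    }
    # Only send the arguments that were actually supplied.
    params.update({k: v for k, v in optional.items() if v is not None})
    return f"{base_url}?{urlencode(params)}"


print(build_request("http://example.org/oai", "Identify"))
print(build_request("http://example.org/oai", "ListRecords",
                    metadata_prefix="oai_lom", from_date="2010-01-01",
                    set_spec="aspect"))
```

Because the protocol is metadata-agnostic, switching to another format only changes the `metadataPrefix` value.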

Harvest Component

• Multiple targets

• Each target has separate properties (sets, date granularity, metadataPrefix, ...)

• Storing metadata (SPI, Filesystem, APP, ...)

• Extra features:

  • Incremental harvesting

  • Harvesting scheduling

  • Metadata validation + reporting

  • OAI-PMH target validation

  • (User-friendly) GUI
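Per-target properties and incremental harvesting can be sketched together: each target remembers when it was last harvested and asks only for records changed since then. The `Target` class, its field names and the example values are assumptions for illustration; the two granularity strings are the ones defined by OAI-PMH.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class Target:
    base_url: str
    metadata_prefix: str
    granularity: str = "YYYY-MM-DD"          # or "YYYY-MM-DDThh:mm:ssZ"
    set_spec: Optional[str] = None
    last_harvest: Optional[datetime] = None  # None means never harvested

    def format_date(self, moment: datetime) -> str:
        # Format the datestamp at the granularity this target supports.
        if self.granularity == "YYYY-MM-DD":
            return moment.strftime("%Y-%m-%d")
        return moment.strftime("%Y-%m-%dT%H:%M:%SZ")

    def next_request(self) -> dict:
        params = {"verb": "ListRecords", "metadataPrefix": self.metadata_prefix}
        if self.set_spec:
            params["set"] = self.set_spec
        if self.last_harvest:
            # Incremental harvest: only records changed since the last run.
            params["from"] = self.format_date(self.last_harvest)
        return params

    def mark_harvested(self, moment: datetime) -> None:
        self.last_harvest = moment


target = Target("http://example.org/oai", "oai_lom", set_spec="aspect")
print(target.next_request())   # first run: full harvest, no "from"
target.mark_harvested(datetime(2010, 3, 1, 12, 0, tzinfo=timezone.utc))
print(target.next_request())   # later run: incremental harvest
```

A scheduler would call `next_request` for every configured target at its own interval and call `mark_harvested` after a successful run.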

The Harvest component

[Diagram, built up over several animation steps numbered 1 to 6. Labels shown: External Repository (OAI), OAI-PMH, LOM, ARIADNE Harvester, validation service, Validation Msg, harvester log, SPI, SQI, ASPECT Repository.]

Invalid records: discarded, or the identifier is recorded for the next harvesting.
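The handling of valid and invalid records in the harvest flow can be sketched as a single loop. This is an assumption-laden illustration, not the ARIADNE Harvester's code: `harvest`, the stand-in validator and the in-memory repository are all hypothetical, and the sketch takes the "record the identifier" option for invalid records.

```python
def harvest(records, validate, store, log):
    """Validate each harvested record; store valid ones, remember invalid ones.

    records  -- iterable of (identifier, lom_xml) pairs
    validate -- callable returning a list of error messages (empty if valid)
    store    -- callable(identifier, lom_xml) that publishes a valid record
    log      -- list that receives (identifier, errors) entries
    """
    retry_later = []
    for identifier, lom in records:
        errors = validate(lom)
        if errors:
            log.append((identifier, errors))
            retry_later.append(identifier)   # picked up at the next harvesting
        else:
            store(identifier, lom)
    return retry_later


# Stand-ins for the validation service and the target repository.
repository = {}
harvester_log = []
records = [
    ("oai:example:1", "<lom><title>ok</title></lom>"),
    ("oai:example:2", "<lom/>"),
]
retry = harvest(
    records,
    validate=lambda lom: [] if "<title>" in lom else ["missing title"],
    store=repository.__setitem__,
    log=harvester_log,
)
print(sorted(repository), retry)
```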

Harvester: screenshot or live demo

Validation Reports

• After harvesting -> report generated and put online

• report has 4 “levels” :

• full log (incl. metadata)

• reporting log

• Grouped Errors

• Error Summary
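How the two most condensed report levels could be derived from a log is sketched below. The slides do not describe the report format, so the log entries, the message texts and the function names `grouped_errors` and `error_summary` are all made up for the example.

```python
from collections import Counter, defaultdict

# Made-up reporting log: one (record identifier, error message) pair per finding.
reporting_log = [
    ("oai:example:1", "missing mandatory element: title"),
    ("oai:example:2", "empty field: description"),
    ("oai:example:3", "missing mandatory element: title"),
]


def grouped_errors(entries):
    """Group record identifiers by error message."""
    groups = defaultdict(list)
    for identifier, message in entries:
        groups[message].append(identifier)
    return dict(groups)


def error_summary(entries):
    """Count how often each error message occurs."""
    return Counter(message for _, message in entries)


print(grouped_errors(reporting_log))
print(error_summary(reporting_log).most_common())
```

The grouped view tells a content provider which records to fix, while the summary shows which rule of the application profile causes the most failures.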

• Questions?