NLW Linked Open Data Sets

Post on 11-Apr-2017



NLW Datasets

National Library of Wales

Owain Roberts @owainrr

Glen Robson @glenrobson

Background

• NLW has been digitising since the late 90s

• Digitised material tends to be static: once digitised, it doesn’t change with time!

• Datasets and databases treated outside the collections systems

• Infrastructure gap identified for dealing with datasets

• Move to dealing with born-digital material and datasets

Datasets / Derived Content

[Diagram labels: Physical collections; Digitisation; Digitised collections; Transcription; Automation / Crowdsourcing; Derived content]

[Diagram labels: Physical collections; Digital collections; Automation / Crowdsourcing; Derived content]

The Storage Problem

Where do we put these?

• Datasets derived from physical collections

• Data derived from digital collections

What is linked data?

A way of connecting silos of data

A way of enhancing existing data

A way of structuring data (like a database or XML)

A standard way of sharing data (like an API)

Triples

Subject Predicate Object

• Person hasName Owain
• Person hasAge 24
• Person worksIn NLW – literal
• Person worksIn NlW – literal
• Person worksIn http://www.llgc.org.uk/ – URI
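
The difference between the two literal spellings and the single URI in the triples above can be sketched in a few lines of Python. This is only an illustration, not how NLW stores its data: the `http://example.org/` namespace, the `person/1` identifier and the class names are assumptions made for the example; only the values (Owain, 24, NLW, NlW, http://www.llgc.org.uk/) come from the slide.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class URI:
    """A global identifier: two URIs are the same thing only if they match exactly."""
    value: str


@dataclass(frozen=True)
class Literal:
    """A plain value (text or number) with no identity beyond its own content."""
    value: Union[str, int]


@dataclass(frozen=True)
class Triple:
    subject: URI
    predicate: URI
    obj: Union[URI, Literal]


EX = "http://example.org/"  # hypothetical namespace, not an NLW vocabulary

person = URI(EX + "person/1")  # hypothetical identifier for the person
works_in = URI(EX + "worksIn")

triples = [
    Triple(person, URI(EX + "hasName"), Literal("Owain")),
    Triple(person, URI(EX + "hasAge"), Literal(24)),
    Triple(person, works_in, Literal("NLW")),
    Triple(person, works_in, Literal("NlW")),
    Triple(person, works_in, URI("http://www.llgc.org.uk/")),
]

# Collect every object of the worksIn predicate, then split by kind.
employers = {t.obj for t in triples if t.predicate == works_in}
literal_employers = {o for o in employers if isinstance(o, Literal)}
uri_employers = {o for o in employers if isinstance(o, URI)}

print(len(literal_employers), "literal spellings,", len(uri_employers), "URI")
```

The two literals look like two different employers to a machine, while the URI names one organisation unambiguously, which is what makes it possible to link to it from other datasets.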

Aberystwyth Shipping Records

Transcribed as part of NLW Volunteer Programme

544 Ships covering period 1856-1914

An example…

Cynefin Tithe Maps: 1838 - 1947

Cymru 1914.org and Wales at War: 1914 - 1918

Shipping Records: 1856 - 1914

Crime and Punishment Database: 1730 - 1830

Welsh Biography Online: 0 - 1970

Welsh Newspapers Online: 1804 - 1919

EXTERNAL DATASETS

Cymru 1900: ~1900

Aberystwyth Observer, 23 March 1905

Aberystwyth Observer, 23 April 1905

Events


New Developments

• IIIF
• Linked Open Data

Community

National Libraries
• Austria
• British Library
• France
• Denmark
• Egypt
• Israel
• New Zealand
• Norway
• Poland
• Serbia
• Vatican
• Wales

http://www.slideshare.net/azaroth42/introduction-to-iiif

Research Institutions
• C2RMF (France)
• Cornell University
• Johns Hopkins Univ.
• Harvard University
• Oxford University
• Princeton University
• Stanford University
• Wellcome Library
• Yale University
• plus several more

Museums
• YCBA
• British Museum

Aggregators
• Artstor
• DPLA
• Europeana

Projects
• Biblissima
• e-codices
• TPEN
• TextGrid

What can I do with it?

Repositories: National Library of Wales Repository; British Library Digital Library; National Library of Norway Repository; BnF Repository

APIs: Image API; Presentation API

IIIF Viewers: Mirador; Wellcome/Universal Viewer
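
To make the Image API a little more concrete, here is a minimal sketch of how an image request URL is put together. The path pattern `{identifier}/{region}/{size}/{rotation}/{quality}.{format}` is the one defined by the IIIF Image API; the base URL `https://example.org/iiif` and the identifier `abc123` are placeholders, not a real NLW endpoint or item, and the helper function names are made up for this example.

```python
from urllib.parse import quote


def iiif_image_url(base, identifier, region="full", size="max",
                   rotation="0", quality="default", fmt="jpg"):
    """Build an IIIF Image API request URL from its path components."""
    # The identifier is percent-encoded so that characters such as "/" are safe in the path.
    encoded_id = quote(identifier, safe="")
    return f"{base.rstrip('/')}/{encoded_id}/{region}/{size}/{rotation}/{quality}.{fmt}"


def iiif_info_url(base, identifier):
    """Build the URL of the image information document (info.json)."""
    return f"{base.rstrip('/')}/{quote(identifier, safe='')}/info.json"


BASE = "https://example.org/iiif"  # placeholder image server, an assumption

full_image = iiif_image_url(BASE, "abc123")
thumbnail = iiif_image_url(BASE, "abc123", size="200,")  # scale to 200 pixels wide
info = iiif_info_url(BASE, "abc123")

print(full_image)
print(thumbnail)
print(info)
```

Because every compliant repository answers the same URL pattern, a viewer such as Mirador or the Universal Viewer can show images from any of the repositories listed above without repository-specific code.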

Mirador

http://stanford.io/1PW789d

Linked Open Data

• “A method of publishing structured data so that it can be interlinked and become more useful through semantic queries”

• Tim Berners-Lee coined the term in a 2006 design note about the Semantic Web project

https://en.wikipedia.org/wiki/Linked_data

• Turning the web into data rather than documents

Benefits for Research

• Queryable: SPARQL
• Open Data
  – Not limited by website
  – Not limited to an API
  – Keys to the database
• Linkable to other datasets
  – Wikipedia
  – Geonames

• Built to be added to
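
As a rough idea of what "queryable" means in practice, the sketch below prepares a SPARQL query for sending to an endpoint over HTTP GET, using the `query` parameter defined by the SPARQL 1.1 Protocol. Everything dataset-specific is an assumption: the endpoint `https://example.org/sparql`, the `ex:` vocabulary and the property names are invented for illustration and are not the published NLW model. No request is actually sent.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

ENDPOINT = "https://example.org/sparql"  # hypothetical endpoint, not an NLW service

# Hypothetical vocabulary: ex:Ship, ex:name and ex:registeredAt are placeholders.
QUERY = """
PREFIX ex: <http://example.org/vocab/>
SELECT ?ship ?name WHERE {
  ?ship a ex:Ship ;
        ex:name ?name ;
        ex:registeredAt ex:Aberystwyth .
}
LIMIT 10
""".strip()


def sparql_get_url(endpoint, query):
    """Return the GET URL that carries the query in the 'query' parameter."""
    return endpoint + "?" + urlencode({"query": query})


url = sparql_get_url(ENDPOINT, QUERY)

# Decode the URL again to confirm the query survives encoding unchanged.
decoded_query = parse_qs(urlsplit(url).query)["query"][0]
print(url)
```

The researcher writes the question, not the library, which is the sense in which open data hands over the "keys to the database" rather than a fixed set of API calls.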

Book of Remembrance

• Once transcribed it will be a complete dataset of the Welsh fallen
  – Query by rank, location, service
• Linkable
  – Geonames – county/area
  – Wales at War - http://www.walesatwar.org

Shipping Registers

• 544 merchant vessels registered at the port of Aberystwyth

• 1856-1914
• Crew lists – name, position, birth date, reason for leaving, location
• Transcribed by volunteers
• Modelled in Linked Open Data

Top 4 Places Visited

Top Visits

Problems

• Linking out
  – Places -> Geonames, Cynefin
  – Ships -> Wikipedia
  – Ships -> Newspapers?
• Disambiguation
  – Between people
    • In Shipping Records
    • Across resources e.g. Newspapers
  – Between resources
  – Dutch Shipping to Newspaper linking: http://bit.ly/1Talish/
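
One simple way to approach disambiguation between people is to group records that share a normalised name and birth year, and treat each group as a set of candidate matches for a person to review. The sketch below assumes that approach; it is not the method used on the Shipping Records. The record fields (`id`, `name`, `birth_year`) and the three sample records are invented for illustration.

```python
import unicodedata
from collections import defaultdict


def normalise(name):
    """Lower-case a name, strip accents and collapse repeated whitespace."""
    decomposed = unicodedata.normalize("NFKD", name)
    without_accents = "".join(c for c in decomposed if not unicodedata.combining(c))
    return " ".join(without_accents.lower().split())


def candidate_groups(records):
    """Group record ids by (normalised name, birth year); keep groups of two or more."""
    groups = defaultdict(list)
    for rec in records:
        key = (normalise(rec["name"]), rec["birth_year"])
        groups[key].append(rec["id"])
    return {key: ids for key, ids in groups.items() if len(ids) > 1}


# Invented sample records, not taken from the crew lists.
records = [
    {"id": "crew/1", "name": "John  Jones", "birth_year": 1860},
    {"id": "crew/2", "name": "john jones", "birth_year": 1860},
    {"id": "crew/3", "name": "John Jones", "birth_year": 1871},
]

groups = candidate_groups(records)
print(groups)
```

Exact matching on name and year will miss spelling variants and will wrongly merge different people with common names, so the groups are a starting point for checking rather than a final answer.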

Going forward

• Release more datasets as LOD
• Crowdsourcing to create data
• Working with researchers on enhancing datasets
• Can we turn the Newspapers into a queryable dataset?
  – Named Entity Recognition
  – Crowd
  – Research Projects

• Can we link our digital resources together?