Linked dataworkshopintro14aug2014

Linked Data: a practical approach Wellcome Institute, 14 August 2014 Adrian Stevenson and Jane Stevenson “Linked Data is Storytelling for computers. It doesn’t have the full richness, complexity and nuance that we invest in our narratives, but it does at least help computers to fit all the bits together in meaningful ways.”

description

Linked Data workshop for Archives Hub contributors. An introduction to Linked Data concepts, including entities, URIs, RDF, and use of Open Refine for name matching.

Transcript of Linked dataworkshopintro14aug2014

Page 1: Linked dataworkshopintro14aug2014

Linked Data: a practical approach

Wellcome Institute, 14 August 2014

Adrian Stevenson and Jane Stevenson

“Linked Data is Storytelling for computers. It doesn’t have the full richness, complexity and nuance that we invest in our narratives, but it does at least help computers to fit all the bits together in meaningful ways.”

Page 2: Linked dataworkshopintro14aug2014

Linked Data workshop

• Entities and Identities
• Documents and Data
• URIs and Connections
• Triples
• Data Creation
• RDF Graphs and the Archives Hub Graph
• Vocabularies
• Locah: our experience of creating RDF
• Connecting datasets
• Demonstration websites
• Name Matching and Demo of Open Refine
• Linking Lives interface
• Calm and Linked Data
• Round up and close

Page 3: Linked dataworkshopintro14aug2014

Beatrice Webb

Page 4: Linked dataworkshopintro14aug2014

Martha Beatrice Webb, 1858-1943, social reformer

Page 5: Linked dataworkshopintro14aug2014

Martha Beatrice Webb, 1858-1943, social reformer

is the creator of some archive collections

Page 6: Linked dataworkshopintro14aug2014

Each of these is about an archive collection

Each of these is a document

Page 7: Linked dataworkshopintro14aug2014

Each document has lots of useful information

Each is formatted so a human reader can understand it

But let’s give each document an identifier that works on the Web…

The Web works with http://

Page 8: Linked dataworkshopintro14aug2014

http://archiveshub.ac.uk/data/gb394we

http://archiveshub.ac.uk/data/gb227msda865.w4

http://archiveshub.ac.uk/data/gb0097collmisc0241

http://archiveshub.ac.uk/data/gb0097sr1100

http://archiveshub.ac.uk/data/gb0097webblocalgovernment

http://archiveshub.ac.uk/data/gb0097collmisc0243

http://archiveshub.ac.uk/data/gb0097collmisc0242

http://archiveshub.ac.uk/data/gb0097passfield

http://archiveshub.ac.uk/data/gb0097webbtradeunion

Page 9: Linked dataworkshopintro14aug2014

http://archiveshub.ac.uk/data/gb227msda865.w4

Page 10: Linked dataworkshopintro14aug2014
Page 11: Linked dataworkshopintro14aug2014

Martha Beatrice Webb, 1858-1943, social reformer

http://data.archiveshub.ac.uk/id/person/nra/webbmarthabeatrice1858-1943socialreformer

or….

http://data.archiveshub.ac.uk/id/person/86607236

or….

http://viaf.org/viaf/86607236/

Page 12: Linked dataworkshopintro14aug2014

Now we can make the statement:

http://data.archiveshub.ac.uk/id/person/nra/webbmarthabeatrice1858-1943socialreformer

<creator-of>

http://data.archiveshub.ac.uk/id/archivalresource/gb394-we

Martha Beatrice Webb

is the creator of the archive

Beatrice Webb: A summer holiday in Scotland, 1884

…identifiers for the Web (for a machine)

…labels for humans
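The statement on this slide can be sketched in Python as a plain tuple of identifiers, with a separate lookup table holding the human-readable labels. This is only an illustration: `LABELS` and `describe` are hypothetical names, `<creator-of>` is the informal predicate written on the slide rather than a term from a real vocabulary, and the labels are paired with the URIs as they appear on the slide.

```python
# Identifiers are for machines; labels are for humans.
CREATOR_OF = "<creator-of>"  # informal predicate from the slide, not a real vocabulary term

webb = "http://data.archiveshub.ac.uk/id/person/nra/webbmarthabeatrice1858-1943socialreformer"
archive = "http://data.archiveshub.ac.uk/id/archivalresource/gb394-we"

# One statement: (subject, predicate, object)
statement = (webb, CREATOR_OF, archive)

# Hypothetical lookup table: labels paired with identifiers as on the slide
LABELS = {
    webb: "Martha Beatrice Webb",
    CREATOR_OF: "is the creator of the archive",
    archive: "Beatrice Webb: A summer holiday in Scotland, 1884",
}


def describe(triple):
    """Render a triple for a human reader by swapping identifiers for labels."""
    return " ".join(LABELS.get(term, term) for term in triple)


print(describe(statement))
```

The machine only ever compares the three identifiers; the labels are looked up at display time.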

Page 13: Linked dataworkshopintro14aug2014

http://data.archiveshub.ac.uk/id/person/ncarules/shawgeorgebernard1856-1950irishdramatistcriticandnovelist

<creator-of>

http://archiveshub.ac.uk/id/archivalresource/gb0097sr0293

George Bernard Shaw, 1856-1950, playwright

is the creator of the archive

George Bernard Shaw Diaries

…identifiers for the Web (for a machine)

…labels for humans

Page 14: Linked dataworkshopintro14aug2014

We can start to say things about relationships…

http://data.archiveshub.ac.uk/id/person/nra/webbmarthabeatrice1858-1943socialreformer

<knew>

http://data.archiveshub.ac.uk/id/person/ncarules/shawgeorgebernard1856-1950irishdramatistcriticandnovelist

Page 15: Linked dataworkshopintro14aug2014

We can start to say things that go beyond what is known within our own space…

http://data.archiveshub.ac.uk/id/person/nra/webbmarthabeatrice1858-1943socialreformer

<is the same as>

http://viaf.org/viaf/86607236/

Page 16: Linked dataworkshopintro14aug2014

We can start to find different sources about the same person…

<is the same as>

http://viaf.org/viaf/121884166/

http://data.archiveshub.ac.uk/id/person/ncarules/shawgeorgebernard1856-1950irishdramatistcriticandnovelist

http://dbpedia.org/page/George_Bernard_Shaw

<is the same as>

Page 17: Linked dataworkshopintro14aug2014

We can put these ideas together…

http://data.archiveshub.ac.uk/id/person/nra/webbmarthabeatrice1858-1943socialreformer

<knew>

http://data.archiveshub.ac.uk/id/person/ncarules/shawgeorgebernard1856-1950irishdramatistcriticandnovelist

http://dbpedia.org/page/George_Bernard_Shaw

<also known as>
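Putting these ideas together can be sketched as a tiny in-memory graph: a list of triples plus two small lookup functions. This is a minimal sketch, not how the Archives Hub stores its data. The predicate strings are the informal ones from the slides (the DBpedia link is treated here as `<is the same as>`), and `objects` and `known_people_with_aliases` are hypothetical helper names.

```python
WEBB = "http://data.archiveshub.ac.uk/id/person/nra/webbmarthabeatrice1858-1943socialreformer"
SHAW = "http://data.archiveshub.ac.uk/id/person/ncarules/shawgeorgebernard1856-1950irishdramatistcriticandnovelist"

# Informal predicates as written on the slides, not real vocabulary terms
KNEW = "<knew>"
SAME_AS = "<is the same as>"

# The statements made on the preceding slides, as (subject, predicate, object)
triples = [
    (WEBB, KNEW, SHAW),
    (WEBB, SAME_AS, "http://viaf.org/viaf/86607236/"),
    (SHAW, SAME_AS, "http://viaf.org/viaf/121884166/"),
    (SHAW, SAME_AS, "http://dbpedia.org/page/George_Bernard_Shaw"),
]


def objects(subject, predicate):
    """Return every object linked from `subject` by `predicate`."""
    return [o for s, p, o in triples if s == subject and p == predicate]


def known_people_with_aliases(person):
    """For each person `person` knew, list the external identifiers for them."""
    return {friend: objects(friend, SAME_AS) for friend in objects(person, KNEW)}


for friend, aliases in known_people_with_aliases(WEBB).items():
    print(friend)
    for alias in aliases:
        print("  same as:", alias)
```

Starting from Webb's identifier, two hops are enough to reach the VIAF and DBpedia descriptions of Shaw, which is the kind of connection the slide is pointing at.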

Page 18: Linked dataworkshopintro14aug2014

GRAPHS AND DATA MODELLING

Page 19: Linked dataworkshopintro14aug2014

Archival Resource Person

Created

Subject: Archival Resource
Predicate: CreatedBy
Object: Person

Subject > Predicate > Object

CreatedBy

Triple statement

Page 20: Linked dataworkshopintro14aug2014

CREATING AN ARCHIVE DESCRIPTION

Page 21: Linked dataworkshopintro14aug2014

Archival Resource Start date

Subject: Archival Resource
Predicate: dateCreated
Object: Start date

Subject > Predicate > Object

dateCreated

Triple statement

Page 22: Linked dataworkshopintro14aug2014

Archival Resource

biographical history

has

Beatrice Webb (1858-1943), nee Potter, social reformer and diarist. Married to Sidney Webb, pioneers of social science. She was involved in many spheres of political and social activity including the Labour Party, Fabianism, social observation, investigations into poverty, development of socialism, the foundation of the National Health Service and post war welfare state, the London School of Economics, and the New Statesman.

has

http://archiveshub.ac.uk/data/gb227msda865.w4

Page 23: Linked dataworkshopintro14aug2014

http://data.archiveshub.ac.uk/id/archivalresource/gb394-we

http://lexvo.org/id/iso639-3/eng

isLanguageOf

Subject: Archival Resource
Predicate: hasLanguage
Object: Language

Subject > Predicate > Object

hasLanguage

Triple statement

Page 24: Linked dataworkshopintro14aug2014

Archival Resource Repository

Archival Record

describedBy

heldAt

encodedAs

EAD document

Title

has

An RDF Graph

Place

locatedIn

Page 25: Linked dataworkshopintro14aug2014

[Diagram: the Archives Hub RDF data model.

Classes shown: ArchivalResource, Finding Aid, EAD Document, Biographical History, Agent, Family, Person, Org, Repository (Agent), Place, Concept, ConceptScheme, Genre, Function, Subject, Book, Object, Language, Level, Extent, PostcodeUnit, Creation, Birth, Death, TemporalEntity.

Relationships shown: maintainedBy/maintains, origination, associatedWith, accessProvidedBy/providesAccessTo, topic/page, hasPart/partOf, encodedAs/encodes, administeredBy/administers, hasBiogHist/isBiogHistFor, foaf:focus, level, language, inScheme, representedBy, extent, participates in, at time, product of, in, Is-a.]

Page 26: Linked dataworkshopintro14aug2014

Archival Resource ‘Creator’

?

Vocabularies

Page 27: Linked dataworkshopintro14aug2014

“You share vocabularies, so that other people (and computers) know when you’re talking about the same sorts of things. You share identifiers, so that other people (and computers) know that you’re talking about a specific person, place, object or whatever.”

Tim Sherratt, Web Developer and Digital Historian, Australia

Page 28: Linked dataworkshopintro14aug2014

http://archiveshub.ac.uk/locah/2011/03/describing-the-things-the-rdf-terms-used-part-2/

Page 29: Linked dataworkshopintro14aug2014

http://data.archiveshub.ac.uk/id/archivalresource/gb394-we

http://data.archiveshub.ac.uk/id/person/nra/webbmarthabeatrice1858-1943socialreformer

Subject: Archival Resource
Predicate: ‘origination’
Object: Person

Subject > Predicate > Object

archiveshub:origination

Triple statement

Page 30: Linked dataworkshopintro14aug2014

Archival Resource

biographical history

has

Beatrice Webb (1858-1943), nee Potter, social reformer and diarist. Married to Sidney Webb, pioneers of social science. She was involved in many spheres of political and social activity including the Labour Party, Fabianism, social observation, investigations into poverty, development of socialism, the foundation of the National Health Service and post war welfare state, the London School of Economics, and the New Statesman.

archiveshub:hasBiographicalHistory

http://data.archiveshub.ac.uk/id/archivalresource/gb394-we

Page 31: Linked dataworkshopintro14aug2014

http://data.archiveshub.ac.uk/id/archivalresource/gb394-we

http://lexvo.org/id/iso639-3/eng

Subject: Archival Resource
Predicate: dcterms:language
Object: Language

Subject > Predicate > Object

dcterms:language

Triple statement
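Prefixed names such as `archiveshub:origination` and `dcterms:language` are shorthand for full URIs. The sketch below expands them and writes each statement as one N-Triples line. The `dcterms` namespace is the standard Dublin Core Terms one; the `archiveshub` namespace URI used here is an assumption for illustration and should be checked against the published vocabulary. `PREFIXES`, `expand` and `ntriple` are hypothetical names.

```python
PREFIXES = {
    # ASSUMPTION: illustrative namespace for the Archives Hub vocabulary; verify before use
    "archiveshub": "http://data.archiveshub.ac.uk/def/",
    # Dublin Core Terms namespace
    "dcterms": "http://purl.org/dc/terms/",
}


def expand(prefixed_name):
    """Turn 'prefix:local' into a full URI using the PREFIXES table."""
    prefix, local = prefixed_name.split(":", 1)
    return PREFIXES[prefix] + local


def ntriple(subject, predicate, obj):
    """Format one statement whose subject and object are both URIs as an N-Triples line."""
    return f"<{subject}> <{expand(predicate)}> <{obj}> ."


resource = "http://data.archiveshub.ac.uk/id/archivalresource/gb394-we"
person = "http://data.archiveshub.ac.uk/id/person/nra/webbmarthabeatrice1858-1943socialreformer"
english = "http://lexvo.org/id/iso639-3/eng"

lines = [
    ntriple(resource, "archiveshub:origination", person),
    ntriple(resource, "dcterms:language", english),
]
print("\n".join(lines))
```

Because the predicate is itself a URI from a shared vocabulary, anyone else's software can recognise `dcterms:language` as meaning the same thing in their data.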

Page 32: Linked dataworkshopintro14aug2014

CONNECTING DATASETS

Page 33: Linked dataworkshopintro14aug2014

Linking Datasets

• If something is identified, it can be linked to
• We can then take items from one dataset and link them to items from other datasets

BBC

VIAF

DBpedia

Archives Hub

Copac

GeoNames

Page 34: Linked dataworkshopintro14aug2014

“Humans, presented with pieces of information about people, put things into the form of a story.” (Edward Ayers)

“even isolated and inert pieces of evidence – a list, a letter, a map, a picture – can assume new and unimagined meanings when placed in juxtaposition with other fragments.” (Edward Ayers)

“You use a glass mirror to see your face; you use works of art to see your soul”

Page 35: Linked dataworkshopintro14aug2014
Page 36: Linked dataworkshopintro14aug2014
Page 37: Linked dataworkshopintro14aug2014

http://archiveshub.ac.uk/blog/2013/08/hub-viaf-namematching/

Page 38: Linked dataworkshopintro14aug2014
Page 39: Linked dataworkshopintro14aug2014
Page 40: Linked dataworkshopintro14aug2014
Page 41: Linked dataworkshopintro14aug2014

historywall.nma.gov.au

Page 42: Linked dataworkshopintro14aug2014

wraggelabs.com/shed/presentations/anzsi

Page 43: Linked dataworkshopintro14aug2014

USING OPEN REFINE

Page 44: Linked dataworkshopintro14aug2014

Workshop resources at http://data.archiveshub.ac.uk/workshops/wellcome2014/

Workshop Resources

• Workshop resources available from:

http://data.archiveshub.ac.uk/workshops/wellcome2014/

Page 45: Linked dataworkshopintro14aug2014

owl:sameAs

Archives Hub Person

owl:sameAs

VIAF Person

Page 46: Linked dataworkshopintro14aug2014

owl:sameAs

<http://data.archiveshub.ac.uk/id/person/nra/webbmarthabeatrice1858-1943socialreformer>

owl:sameAs

<http://viaf.org/viaf/86607236> .

Page 47: Linked dataworkshopintro14aug2014

Matching Tools

• LOD Refine
  • http://code.zemanta.com/sparkica/download.html
• SILK Framework
  • http://wifo5-03.informatik.uni-mannheim.de/bizer/silk/#workbench
• Module 3 at http://euclid-project.eu/ good for use of Open Refine and SILK

Page 48: Linked dataworkshopintro14aug2014

LOD Refine

• Install files available from:
  – Mac:
    • http://data.archiveshub.ac.uk/workshops/wellcome2014/Mac.zip
  – Windows:
    • http://data.archiveshub.ac.uk/workshops/wellcome2014/Windows.zip
  – Direct:
    • http://code.zemanta.com/sparkica/download.html
• Install LOD Refine, run it and then in a web browser go to http://localhost:3333/

Page 49: Linked dataworkshopintro14aug2014

LOD Refine

• Download example matching file from:
  – http://data.archiveshub.ac.uk/workshops/wellcome2014/Matching_Sample.csv
• In LOD Refine go to ‘Create Project’ and import the Matching_Sample.csv data.

Page 50: Linked dataworkshopintro14aug2014

Name Concatenation

• To concatenate the FamilyName, GivenName and Dates:
• Add new column:
  – Click the down arrow to the left of ‘?Dates’ and select ‘Edit Column’ > ‘Add Column Based on this Column’
  – Name the new column, e.g. ‘ConcatName’
  – Use the following GREL expression:
    • cells["?FamilyName"].value + ", " + cells["?GivenName"].value + ", " + cells["?Dates"].value
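For anyone who wants to check the result outside LOD Refine, the same concatenation can be sketched in Python. The column headers `?FamilyName`, `?GivenName` and `?Dates` are taken from the GREL expression; the two sample rows are made up for illustration and are not the contents of Matching_Sample.csv, and `concat_name` is a hypothetical helper name.

```python
import csv
import io

# Hypothetical sample rows; the real Matching_Sample.csv may look different
SAMPLE = """?FamilyName,?GivenName,?Dates
Webb,Martha Beatrice,1858-1943
Shaw,George Bernard,1856-1950
"""


def concat_name(row):
    """Python equivalent of the GREL expression: family name, given name, dates."""
    return row["?FamilyName"] + ", " + row["?GivenName"] + ", " + row["?Dates"]


rows = list(csv.DictReader(io.StringIO(SAMPLE)))
for row in rows:
    # Add the new 'ConcatName' column alongside the existing ones
    row["ConcatName"] = concat_name(row)
    print(row["ConcatName"])
```

The concatenated form (name plus dates) gives the reconciliation step more to match on than a bare surname would.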

Page 51: Linked dataworkshopintro14aug2014

Page 52: Linked dataworkshopintro14aug2014

Reconcile to VIAF

• Info on Roderick Page’s VIAF reconciliation service at:
  – http://iphylo.blogspot.co.uk/2013/04/reconciling-author-names-using-open.html
• Add the VIAF reconciliation service by clicking on the concatenated-name column’s down arrow and selecting ‘Reconcile’ > ‘Start reconciling’
• Add the URI for the VIAF reconciliation service:
  – http://iphylo.org/~rpage/phyloinformatics/services/reconciliation_viaf.php
• Start Reconciling!
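Behind the ‘Start reconciling’ button, Refine sends the column values to the service as a batch of queries. The sketch below only builds such a request and decodes it again; it does not contact the service. It assumes the service follows the standard Refine reconciliation convention of a `queries` parameter holding a JSON object keyed `q0`, `q1`, and so on, and it makes no claim that the 2014 service URL is still live. `build_reconcile_url` is a hypothetical helper name.

```python
import json
from urllib.parse import parse_qs, urlencode, urlsplit

# Service URI from the slide; whether it is still available is not verified here
SERVICE = "http://iphylo.org/~rpage/phyloinformatics/services/reconciliation_viaf.php"


def build_reconcile_url(names, limit=3):
    """Build a batch reconciliation request URL (ASSUMES the standard 'queries' parameter)."""
    queries = {f"q{i}": {"query": name, "limit": limit} for i, name in enumerate(names)}
    return SERVICE + "?" + urlencode({"queries": json.dumps(queries)})


url = build_reconcile_url(["Webb, Martha Beatrice, 1858-1943"])

# Decode the query string again to show what the service would receive
decoded = json.loads(parse_qs(urlsplit(url).query)["queries"][0])
print(decoded)
```

Each key in the decoded object corresponds to one cell in the column, and the service is expected to return candidate matches under the same keys.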

Page 53: Linked dataworkshopintro14aug2014

Page 54: Linked dataworkshopintro14aug2014

VIAF Reconciliation

• Facet the reconciliation results by judgement
• Confirm the matched and unmatched data as required
• Possibly create another column for e.g. SKOS close matches or ‘isLikes’
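Faceting by judgement simply groups the rows by the outcome of reconciliation so that each group can be reviewed in turn. A rough Python equivalent is sketched below. The result rows are hypothetical, the judgement values `matched` and `none` are assumed to mirror the ones Refine displays in its facet, and `facet_by_judgment` is a made-up helper name.

```python
from collections import defaultdict

# Hypothetical reconciliation results, for illustration only
results = [
    {"name": "Webb, Martha Beatrice, 1858-1943", "judgment": "matched", "viaf_id": "86607236"},
    {"name": "Shaw, George Bernard, 1856-1950", "judgment": "matched", "viaf_id": "121884166"},
    {"name": "Example, Person, 1900-1950", "judgment": "none", "viaf_id": None},
]


def facet_by_judgment(rows):
    """Group row names by their reconciliation judgement, like a text facet."""
    facets = defaultdict(list)
    for row in rows:
        facets[row["judgment"]].append(row["name"])
    return dict(facets)


for judgment, names in facet_by_judgment(results).items():
    print(f"{judgment}: {len(names)}")
```

Rows in the unmatched group are the ones to check by hand before any links are published.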

Page 55: Linked dataworkshopintro14aug2014

Create VIAF URI Column

• Select the reconciled column’s dropdown menu > Edit column > Add column based on this column
• Give the column a name and add the GREL expression:
  – "http://viaf.org/viaf/"+cell.recon.match.id
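The GREL expression above prefixes the matched identifier with the VIAF base URI. A Python sketch of the same step follows; returning `None` for a row with no match is a choice made for this sketch rather than a description of what Refine does, and `viaf_uri` is a hypothetical name.

```python
VIAF_BASE = "http://viaf.org/viaf/"


def viaf_uri(match_id):
    """Build the VIAF URI for a matched id; rows with no match yield None."""
    if match_id is None:
        return None
    return VIAF_BASE + match_id


# The VIAF id for Beatrice Webb used elsewhere in these slides
print(viaf_uri("86607236"))
```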

Page 56: Linked dataworkshopintro14aug2014

Export the VIAF Triples

• Edit the RDF skeleton to include the columns to be matched and link using the owl:sameAs property.
• Check the preview
• Export the RDF as Turtle or RDF/XML as required.
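The exported file is essentially a list of owl:sameAs statements, one per confirmed match. The sketch below writes that shape by hand in Turtle so the expected output is easy to recognise. It is not the exporter LOD Refine uses, `to_turtle` is a hypothetical name, and the single pair is the Beatrice Webb example from earlier in the slides.

```python
# Confirmed (Archives Hub URI, VIAF URI) pairs; one example pair from the slides
pairs = [
    (
        "http://data.archiveshub.ac.uk/id/person/nra/webbmarthabeatrice1858-1943socialreformer",
        "http://viaf.org/viaf/86607236",
    ),
]


def to_turtle(uri_pairs):
    """Serialise each pair as an owl:sameAs statement in Turtle."""
    lines = ["@prefix owl: <http://www.w3.org/2002/07/owl#> .", ""]
    for hub_uri, viaf_uri in uri_pairs:
        lines.append(f"<{hub_uri}> owl:sameAs <{viaf_uri}> .")
    return "\n".join(lines) + "\n"


print(to_turtle(pairs))
```

Comparing this against the preview in LOD Refine is a quick way to confirm the RDF skeleton is linking the intended columns.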

Page 57: Linked dataworkshopintro14aug2014

Page 58: Linked dataworkshopintro14aug2014

How we created the tabular data

Page 59: Linked dataworkshopintro14aug2014

http://wraggelabs.com/shed/presentations/anzsi/

What we need is a data framework that sits beneath the text, identifying people, dates and places, and defining relationships between them and our documentary sources. A framework that computers could understand and interpret, so that if they saw something they knew was a placename they could head off and look for other people associated with that place. Instead of just presenting our research we’d be creating a whole series of points of connection, discovery and aggregation. (Tim Sherratt)

…this is the goal of Linked Data.

Page 60: Linked dataworkshopintro14aug2014

http://archiveshub.ac.uk

http://archiveshub.ac.uk/blog

http://archiveshub.ac.uk/locah

http://data.archiveshub.ac.uk/linkinglives

This presentation is available under Creative Commons Non Commercial-Share Alike: http://creativecommons.org/licenses/by-nc/2.0/uk/