Mon norton tut_querying cultural heritage data
Uploaded by: eswcsummerschool
Querying Cultural Heritage Data
Dr. Barry Norton, Development Manager, ResearchSpace*
* Funded by the Andrew W. Mellon Foundation
* Hosted by the Curatorial Directorate, British Museum
Statements and Patterns
• For one edge in a graph:
bm-obj:EOC3130 --crm:P52_has_current_owner--> bm-id:the-british-museum
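• We can declare/retrieve one (N)Triple:
<http://collection.britishmuseum.org/id/object/EOC3130> <http://erlangen-crm.org/current/P52_has_current_owner> <http://collection.britishmuseum.org/id/the-british-museum> .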
• Or write this in Turtle:
@prefix crm: <http://erlangen-crm.org/current/> .
@prefix bm-obj: <http://collection.britishmuseum.org/id/object/> .
@prefix bm-id: <http://collection.britishmuseum.org/id/> .
bm-obj:EOC3130 crm:P52_has_current_owner bm-id:the-british-museum .
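The relationship between the prefixed Turtle form and a full N-Triples line can be sketched in a few lines of Python. This is only an illustration: the prefix table is copied from the Turtle above, while the helper names expand and to_ntriple are made up for this sketch and handle IRIs only (no literals or blank nodes).

```python
# Prefix table copied from the Turtle declarations above.
PREFIXES = {
    "crm": "http://erlangen-crm.org/current/",
    "bm-obj": "http://collection.britishmuseum.org/id/object/",
    "bm-id": "http://collection.britishmuseum.org/id/",
}


def expand(prefixed_name):
    """Turn a prefixed name such as 'bm-id:the-british-museum' into a full IRI."""
    prefix, local = prefixed_name.split(":", 1)
    return PREFIXES[prefix] + local


def to_ntriple(subject, predicate, obj):
    """Render one statement as an N-Triples line (hypothetical helper, IRIs only)."""
    terms = ("<" + expand(term) + ">" for term in (subject, predicate, obj))
    return " ".join(terms) + " ."


line = to_ntriple(
    "bm-obj:EOC3130",
    "crm:P52_has_current_owner",
    "bm-id:the-british-museum",
)
print(line)
```

Turtle prefixes are purely a shorthand: both spellings denote the same three IRIs and therefore the same edge in the graph.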
• And check for it in SPARQL:
PREFIX crm: <http://erlangen-crm.org/current/>
PREFIX bm-obj: <http://collection.britishmuseum.org/id/object/>
PREFIX bm-id: <http://collection.britishmuseum.org/id/>
ASK {bm-obj:EOC3130 crm:P52_has_current_owner bm-id:the-british-museum}
true
Statements and Patterns
• For a set of edges:
bm-obj:EOC3130 --crm:P51_has_former_or_current_owner--> bm-id:the-british-museum
bm-obj:EOC3130 --crm:P51_has_former_or_current_owner--> ?
• We can do the work on the client
• Or have the server do it by turning the triple into a triple pattern:
bm-obj:EOC3130 crm:P51_has_former_or_current_owner ?owner
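To make "turning the triple into a triple pattern" concrete, here is a minimal Python sketch of what a server does when it matches a pattern against stored triples. It is a toy, not how a real triplestore is implemented: the two ex: owners are hypothetical placeholders (the real former owners are the subject of the exercise), and match and ask are names invented for this sketch.

```python
# Toy graph. The British Museum edge comes from the slides; the two "ex:"
# owners are hypothetical placeholders, not real British Museum identifiers.
TRIPLES = {
    ("bm-obj:EOC3130", "crm:P51_has_former_or_current_owner", "bm-id:the-british-museum"),
    ("bm-obj:EOC3130", "crm:P51_has_former_or_current_owner", "ex:former-owner-1"),
    ("bm-obj:EOC3130", "crm:P51_has_former_or_current_owner", "ex:former-owner-2"),
}


def is_variable(term):
    """Variables are written SPARQL-style, with a leading question mark."""
    return term.startswith("?")


def match(pattern, triples):
    """Return one {variable: value} binding per triple that fits the pattern."""
    results = []
    for triple in sorted(triples):
        binding = {}
        for want, have in zip(pattern, triple):
            if is_variable(want):
                if binding.setdefault(want, have) != have:
                    break  # the same variable would need two different values
            elif want != have:
                break  # a fixed term in the pattern does not match
        else:
            results.append(binding)
    return results


def ask(pattern, triples):
    """Like SPARQL ASK: is there at least one match?"""
    return bool(match(pattern, triples))


pattern = ("bm-obj:EOC3130", "crm:P51_has_former_or_current_owner", "?owner")
owners = [binding["?owner"] for binding in match(pattern, TRIPLES)]
print(owners)
```

A pattern with no variables behaves like the ASK query shown earlier; replacing one position with ?owner returns every value that fits there.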
Exercise
Questions:
• Why is the answer different?
• Who are the two (other) one-time owners?
Solutions & Exercises
• Why is the answer different?
– Reasoning, part of the work done by the server (being a triplestore), means that if two things are related by crm:P52_has_current_owner then they’re also related by crm:P51_has_former_or_current_owner
• This is part of the work that the server (triplestore) can do for you
• Exercise: query for the (strictly) former owners…
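The reasoning step can be pictured as a small forward-chaining loop. The sketch below assumes the only rule in play is the one just described (current owner implies former-or-current owner), written as a sub-property table; the names EXPLICIT, SUB_PROPERTY_OF and infer are invented for illustration, and a real triplestore applies many more rules far more efficiently.

```python
# Only the current-owner edge is stated explicitly, as on the earlier slides.
EXPLICIT = {
    ("bm-obj:EOC3130", "crm:P52_has_current_owner", "bm-id:the-british-museum"),
}

# The entailment described above, written as a sub-property table.
SUB_PROPERTY_OF = {
    "crm:P52_has_current_owner": "crm:P51_has_former_or_current_owner",
}


def infer(triples):
    """Add the super-property triple for every sub-property triple, to a fixpoint."""
    closure = set(triples)
    changed = True
    while changed:
        changed = False
        for s, p, o in list(closure):
            parent = SUB_PROPERTY_OF.get(p)
            if parent and (s, parent, o) not in closure:
                closure.add((s, parent, o))
                changed = True
    return closure


INFERRED = infer(EXPLICIT)
for triple in sorted(INFERRED):
    print(triple)
```

Querying the inferred graph for crm:P51_has_former_or_current_owner therefore returns the current owner too, even though that triple was never stated.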
Solution 1/2
• Using specific server functionality:
Solution 2/2
• In pure SPARQL:
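Whichever route is taken, the result amounts to a set difference: everyone who is a former-or-current owner, minus whoever is the current owner. In SPARQL 1.1 this kind of subtraction is usually written with FILTER NOT EXISTS or MINUS. The Python sketch below only illustrates the logic and is not the query from the slide; the ex: owners are hypothetical placeholders and objects is a helper invented here.

```python
# Toy graph after reasoning. The "ex:" owners are hypothetical placeholders.
TRIPLES = {
    ("bm-obj:EOC3130", "crm:P52_has_current_owner", "bm-id:the-british-museum"),
    ("bm-obj:EOC3130", "crm:P51_has_former_or_current_owner", "bm-id:the-british-museum"),
    ("bm-obj:EOC3130", "crm:P51_has_former_or_current_owner", "ex:former-owner-1"),
    ("bm-obj:EOC3130", "crm:P51_has_former_or_current_owner", "ex:former-owner-2"),
}


def objects(subject, predicate, triples):
    """All values in object position for a given subject and predicate."""
    return {o for s, p, o in triples if s == subject and p == predicate}


former_or_current = objects("bm-obj:EOC3130", "crm:P51_has_former_or_current_owner", TRIPLES)
current = objects("bm-obj:EOC3130", "crm:P52_has_current_owner", TRIPLES)

# Strictly former owners: related by P51 but not by P52.
strictly_former = former_or_current - current
print(sorted(strictly_former))
```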
Solutions & Exercises
Who are the two (other) one-time owners?
• Since people and institutions (and places) are treated as concepts, the names of the former owners are attached using skos:prefLabel
• Exercise: if you didn’t already, include the names in your query results
Solutions & Exercises
If you didn’t already, include the names in your query results:
Question: Why are we back at two answers?
Answer
• Answer:
– Just as we can add triples together to make a graph in RDF, so we can add triple patterns together in SPARQL to make a graph pattern
– By default all triple patterns must be matched, but we can use the OPTIONAL {} pattern to allow variation
• Exercise:
– Query for the owners and their names, if they exist*
* N.B. this bug in the BM data will be fixed soon
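The difference between requiring a pattern and making it OPTIONAL can be sketched as the difference between an inner join and a left join. In this Python illustration the owner identifiers and label values are hypothetical, one owner is deliberately left without a label to mimic the missing name, and join_required and join_optional are names invented for the sketch.

```python
# Hypothetical owners; only the British Museum identifier comes from the slides.
OWNERS = ["bm-id:the-british-museum", "ex:former-owner-1", "ex:former-owner-2"]

# Hypothetical skos:prefLabel values. One owner deliberately has no label.
LABELS = {
    "bm-id:the-british-museum": "Label A",
    "ex:former-owner-1": "Label B",
}


def join_required(owners, labels):
    """Both patterns must match: owners without a label drop out of the results."""
    return [(owner, labels[owner]) for owner in owners if owner in labels]


def join_optional(owners, labels):
    """The label pattern is optional: every owner is kept, the name may be None."""
    return [(owner, labels.get(owner)) for owner in owners]


print(len(join_required(OWNERS, LABELS)), "rows when the name is required")
print(len(join_optional(OWNERS, LABELS)), "rows when the name is optional")
```

An unbound variable in an OPTIONAL result corresponds to the None in the last row here.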
Solution
Exercise
• Take a look here:
• Exercise: copy and run this query
CSV Exercise
• Type:
• Observe that one can now paste the query including line breaks*
• Type:
* N.B. for now you should first replace the "s with 's and change the one occurrence of ecrm: with crm: - we’ll fix this
* N.B. currently the query needs to be simplified as the BBC data is not loaded – this will be available soon
Data Analysis
• One can import this CSV file into many tools:
– A spreadsheet can be a good way to carry out basic visualisations
– A scripting environment like (i)python/scipy or R can allow more analysis before visualisation, but:
• both languages also have libraries to encapsulate interaction via SPARQL (rdflib/sparqlwrapper and SPARQL/RCurl respectively)
• one should decide whether more analysis should first be carried out using SPARQL…
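As a rough idea of the scripting route, the following Python sketch reads a CSV result with the standard library alone. The column names (object, material) and the rows are invented stand-ins for whatever the real query returns, and the text is held in a string so the example runs without a downloaded file.

```python
import csv
import io

# Hypothetical stand-in for the downloaded result file: the column names and
# rows are invented, not real British Museum data.
CSV_TEXT = """object,material
ex:object-1,basalt
ex:object-2,wood
ex:object-3,basalt
"""


def load_rows(text):
    """Parse CSV text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text)))


rows = load_rows(CSV_TEXT)
materials = sorted({row["material"] for row in rows})
print(len(rows), "rows,", len(materials), "distinct materials:", materials)
```

With a real file, open(path, newline="") can be passed to csv.DictReader in place of the io.StringIO wrapper.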
Exercise
• If you haven’t so far, click on one of the (HotW) 100 Objects (such as number 70, Hoa Hakananai'a Easter Island Statue) having run the main query
• Choose a material and observe the query for other objects in this material
• Adapt this query to count how many BM objects are made from basalt
Solution & Exercise
• Exercise: Now count the ‘top ten’ materials and the number of objects for each
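For comparison, the same "top ten" idea can be sketched on the client side in Python. In SPARQL 1.1 the server-side equivalent combines COUNT, GROUP BY, ORDER BY DESC and LIMIT 10. The (object, material) pairs below are invented, and top_materials is a hypothetical helper, so the counts say nothing about the real collection.

```python
from collections import Counter

# Hypothetical (object, material) pairs, not real British Museum data.
OBJECT_MATERIALS = [
    ("ex:object-1", "basalt"),
    ("ex:object-2", "wood"),
    ("ex:object-3", "basalt"),
    ("ex:object-4", "gold"),
    ("ex:object-5", "wood"),
    ("ex:object-6", "basalt"),
]


def top_materials(pairs, limit=10):
    """Count objects per material and keep the most frequent ones."""
    counts = Counter(material for _obj, material in pairs)
    return counts.most_common(limit)


top = top_materials(OBJECT_MATERIALS)
for material, count in top:
    print(material, count)
```

Doing the grouping on the server avoids transferring one row per object, which matters once the collection is large.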
Solution
A Last Word
• SPARQLing a ‘native RDF’ database (often called a ‘triplestore’) is not the only option before defaulting to programming
• A ‘native graph’ database indexes the graph in a different way, supporting traversal-oriented queries
Exercise
Double click