A Semantic Web understanding of the Factoid Prosopography model

John Bradley
Department of Digital Humanities, King's College London

Ontologies for Prosopography: Who's Who? or, Who was Who?
Workshop, July 8, 2014, at the DH2014 conference, Lausanne, Switzerland

twitter: #KingsDH @kingsdh

Tim Berners-Lee on Linked Data

All kinds of conceptual things, they have names now that start with HTTP.

I get important information back. I will get back some data in a standard format which is kind of useful data that somebody might like to know about that thing, about that event.

I get back that information it's not just got somebody's height and weight and when they were born, it's got relationships. And when it has relationships, whenever it expresses a relationship then the other thing that it's related to is given one of those names that starts with HTTP.

Tim Berners-Lee: Linked Data presentation at TED 2009

Ontologies and a “prosopographical” domain

A vast amount of data is related to people to various degrees: what is most properly prosopographical?

Good ontology design practice: a kind of “Occam's Razor”.

So what information related to persons is IN prosopography and what is OUT?

Traditional Prosopography

[Figure: excerpt from J.R. Martindale, The Prosopography of the Later Roman Empire, 3: A.D. 527-641. Cambridge: Cambridge University Press, 1992. Labels on the slide: Sources, People, Places.]

Prosopography and Identity of Persons

An important part of prosopography is the identifying of persons. What constitutes an historical person's identity?

Formal URIs provide a part of the linked data answer.
◦ http://db.poms.ac.uk/record/person/749/

However, historical identity, by its nature of being contestable, requires more than this:
◦ Abraham, bishop of Dunblane (fl.1210×14-1220×25) (id 749)

Prosopography: more than “just” person identification

Historical persons survive for us through their appearance in sources, and historians identify them not only by their name, but also by what they did and by other ways that they are described.

Information about persons is a part of prosopography.

Assertions about a person have traditionally formed the basis of prosopography.

One could argue that historical people from before the immediate past only "survive" in our memory through their presence in sources: what sources assert about them.

Arguments about their identity flow from what these sources say to us.

Identity as a Contestable thing

Historical prosopography must be contestable, since it deals in information which is often uncertain.

◦ About a person (I think your A is my (slightly different) B)

◦ About statements about a person (I think this statement belongs to a different person, or misinterprets the sources)

◦ About date/timing (I think this event happened at a different time)

Thus, simply naming a person A, and then simply asserting that s/he is different from someone else's person B, is not sufficient!

The identification not only has to be made, it has to be testable and contestable.

To make such assertions testable, one needs data about the person.

This is Berners-Lee's 2nd point!

Data in a standard format for prosopography?

"I get important information back. I will get back some data in a standard format which is kind of useful data that somebody might like to know about that thing, about that event." (Berners-Lee: TED 2009)

What is the data in a "standard format"?

Mainstream Linked Data takes this to mean data in RDF
◦ highly structured.

What, then, is a highly structured representation of a person?

A set of Assertions

Core structure for DDH’s Prosopographical databases: "factoid model"

[Diagram. Entities: Person, Assertion (factoid), Authority Lists, Assertion Type, Source, Location, Possession. Link labels: Instance of, Typed by, Connected to, Appears in, Role / Name, Date.]
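
The entities in this diagram can be read as a small set of record types. The following is only an illustrative sketch, not the schema of any DDH database: the class and field names (Person, PersonRef, Factoid, role_or_name, and so on) are assumptions made for this example, and the sample values come from the PASE Cuthbert factoid used elsewhere in this talk.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(frozen=True)
class Person:
    # Hypothetical record for a person as identified by the project.
    key: str  # e.g. "Cuthbert 1"
    description: str = ""


@dataclass(frozen=True)
class PersonRef:
    # Connects a person to a factoid, with the role or name form used there.
    person: Person
    role_or_name: Optional[str] = None


@dataclass
class Factoid:
    # An assertion: what a source says about one or more persons.
    factoid_type: str  # a value drawn from an authority list
    source: str        # the source the assertion appears in
    persons: List[PersonRef] = field(default_factory=list)
    text: Optional[str] = None        # wording found in the source
    date: Optional[str] = None
    location: Optional[str] = None
    possession: Optional[str] = None


cuthbert = Person("Cuthbert 1")
factoid = Factoid(
    factoid_type="Occupation: Anchorite",
    source="Anon.VitCuthberti",
    persons=[PersonRef(cuthbert)],
    text="sanctus anachorita Dei",
    location="Farne, island, Northumberland",
)
print(factoid.persons[0].person.key, "->", factoid.factoid_type)
```

The point of the sketch is that the factoid, not the person, sits at the centre: source, type, date, and place all hang off the assertion.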

Structuring Prosopography: the factoid

Pasin, Bradley (2011). Factoid-based Prosopography and Computer Ontologies: towards an integrated approach

Factoid Assertions are …

Time dependent:
◦ The identity of PoMS's Abraham (ID 749) seems to be wrapped up in his having been Bishop of Dunblane.
◦ Even so, an assertion such as "Abraham (ID 749) was Bishop of Dunblane" is actually time dependent: material about him might well exist from times before or after he was Bishop of Dunblane.

Source driven:
◦ The factoid model aims to present what the sources are saying, and downplays what modern-day prosopographers, as historians, believe.

Source Assertion: An Act of Document Interpretation

Pasin, Bradley (2011). Factoid-based Prosopography and Computer Ontologies: towards an integrated approach

Martindale asserts that... “Greg. Tur HF” asserts that... Victorius 4 imprisoned Eucherius 4

“Two levels” of assertion
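
The two levels of assertion can be pictured as one assertion wrapped inside another. The sketch below is an illustration only; the names Claim, Assertion, render, and depth are invented for this example and do not come from the factoid model itself.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Claim:
    # A bare statement, e.g. "Victorius 4 imprisoned Eucherius 4".
    text: str


@dataclass(frozen=True)
class Assertion:
    # "asserter asserts that content"; content may itself be an Assertion.
    asserter: str
    content: Union["Assertion", Claim]


def render(node):
    """Render nested assertions as one sentence."""
    if isinstance(node, Claim):
        return node.text
    return "{} asserts that {}".format(node.asserter, render(node.content))


def depth(node):
    """Count the levels of assertion wrapped around the bare claim."""
    return 0 if isinstance(node, Claim) else 1 + depth(node.content)


source_level = Assertion("Greg. Tur HF", Claim("Victorius 4 imprisoned Eucherius 4"))
prosopography_level = Assertion("Martindale", source_level)
print(render(prosopography_level))
```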

Problems: Agent and Event driven approaches

Historical persons are not (always) agents
◦ Oxford English Dictionary: 2: A person or thing that takes an active role or produces a specified effect. 2.1: (Grammar) The doer of an action, typically expressed as the subject of an active verb or in a by phrase with a passive verb.
◦ Merriam-Webster: 1. one that acts or exerts power

Assertions are not always about events

TEI: Personography: “Basic Principles”

Information about people, places, and organizations, of whatever type, essentially comprises a series of statements or assertions relating to:
◦ characteristics or traits which do not, by and large, change over time
◦ characteristics or states which hold true only at a specific time
◦ events or incidents which may lead to a change of state or, less frequently, trait.

TEI, section 13.3.1 (http://www.tei-c.org/release/doc/tei-p5-doc/en/html/ND.html)

Factoid types as traits/states and events

PoMS/PoNE:
* Titles/Occupation (state)
* Relationship (trait?)
* Possession (state)
* Transaction (event)

PASE:
* Authorship (trait?)
* Education (trait?)
* Event (event)
* Relationship (trait?)
* Occupation (state)
* Office (state)
* Status (trait)
* Personal Info (trait)
* Possession (state)
* Transaction (event)

Charlemagne:
* Attribute/relationship (state)
* Place relationship (*)
* Transaction (event)
* Miscellaneous (event)

PBW(E):
* Family (trait)
* Ethnicity (trait)
* Activity (event)
* Dignity (state)
* Kinship (trait?)
* Possession (state)

Roman Republic:
* Family (trait)
* Relationship (trait)
* Office (state)
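
The TEI distinction between traits, states, and events can be recorded alongside each project's factoid types. The following is a minimal sketch of that idea using two of the lists above; the names Kind, POMS_PONE, ROMAN_REPUBLIC, count_kinds, and non_event_types are made up for this example, and a "?" in the list is kept as a tentative flag rather than resolved.

```python
from collections import Counter
from enum import Enum


class Kind(Enum):
    TRAIT = "trait"
    STATE = "state"
    EVENT = "event"


# (kind, is_tentative): a "?" in the list is kept as is_tentative=True.
POMS_PONE = {
    "Titles/Occupation": (Kind.STATE, False),
    "Relationship": (Kind.TRAIT, True),
    "Possession": (Kind.STATE, False),
    "Transaction": (Kind.EVENT, False),
}

ROMAN_REPUBLIC = {
    "Family": (Kind.TRAIT, False),
    "Relationship": (Kind.TRAIT, False),
    "Office": (Kind.STATE, False),
}


def count_kinds(factoid_types):
    """Count how many factoid types fall under each TEI-style kind."""
    return Counter(kind for kind, _tentative in factoid_types.values())


def non_event_types(factoid_types):
    """Return, sorted by name, the factoid types that are not events."""
    return sorted(
        name
        for name, (kind, _tentative) in factoid_types.items()
        if kind is not Kind.EVENT
    )


poms_counts = count_kinds(POMS_PONE)
print(non_event_types(POMS_PONE))
```

Even in this small sample most factoid types are traits or states, which is the observation behind the claim that assertions are not always about events.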

Trait/State factoid

Factoid: iii.6, "sanctus anachorita Dei"
Occupation Type: "Anchorite"
Person: Cuthbert 1, "Saint; bishop of Hexham ..."
Source: Anon.VitCuthberti
Location: Farne, island, Northumberland

Relationship Factoid

Factoid: 1177 x 1185
Relationship Type: "Father"
Person: Alexander Seton (father of Philip), Role: "father"
Person: Philip of Seton, Role: "son"
Source: 1/6/176 (RRS, ii, no. 200)

Transaction: (Gift)

Factoid: "Gift of land of Dodin", 1177 x 1185
Transaction Type: "Gift"
Person: Henry, earl of Northumberland and Huntingdon, Role: Grantor
Person: Kelso Abbey, Role: Beneficiary
Person: Dodin of Duddingston, Role: Previous holder
Possession: Land, "land of Dodin", BWK (Berwickshire)
Source: 1/5/2 (RRS, i, no. 106)

Tim Berners-Lee on Linked Data

All kinds of conceptual things, they have names now that start with HTTP.

I get important information back. I will get back some data in a standard format which is kind of useful data that somebody might like to know about that thing, about that event.

I get back that information it's not just got somebody's height and weight and when they were born, it's got relationships. And when it has relationships, whenever it expresses a relationship then the other thing that it's related to is given one of those names that starts with HTTP.

Tim Berners-Lee: Linked Data presentation at TED 2009

Relationships through transactions

Factoid: "Gift of land of Dodin", 1177 x 1185
Transaction Type: "Gift"
Person: Henry, earl of Northumberland and Huntingdon (Grantor)
Person: Kelso Abbey (Beneficiary)
Person: Dodin of Duddingston (Previous holder)
Possession: Land, "land of Dodin", BWK (Berwickshire)
Source: 1/5/2 (RRS, i, no. 106)

Person URIs shown on the slide:
◦ http://db.poms.ac.uk/record/person/90
◦ http://db.poms.ac.uk/record/person/82
◦ http://db.poms.ac.uk/record/person/108

A network of Witnesses derived from PoMS data

Jackson, Cornell (2014). "Using Social Network Analysis to understand the PoMS database." DH2014, 10 July, 2014

Factoid Data as Semantic Web Data

D2R Server: "D2R Server is a tool for publishing relational databases on the Semantic Web." http://d2rq.org/d2r-server

Trait/State factoid as a triple

Factoid: iii.6, "sanctus anachorita Dei"
Occupation Type: "Anchorite"
Person: Cuthbert 1, "Saint; bishop of Hexham ..."
Source: Anon.VitCuthberti
Location: Farne, island, Northumberland

pasePerson:cuthbert_1 pase:hasOccupation paseAuthority:Anchorite .
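
To see what the bare triple does and does not carry, it helps to expand the prefixed names into full URIs. The sketch below does this with the standard library only. The namespace URIs are placeholders under example.org, since the slide gives only the prefixes, and the helper names expand and to_ntriples are invented for this example.

```python
# Hypothetical namespace URIs: the slide gives only the prefixes.
PREFIXES = {
    "pasePerson": "http://example.org/pase/person/",
    "pase": "http://example.org/pase/ontology#",
    "paseAuthority": "http://example.org/pase/authority/",
}


def expand(curie):
    """Expand a prefixed name such as 'pase:hasOccupation' to a full URI."""
    prefix, local = curie.split(":", 1)
    return PREFIXES[prefix] + local


def to_ntriples(subject, predicate, obj):
    """Serialise one triple of prefixed names as an N-Triples line."""
    return "<{}> <{}> <{}> .".format(
        expand(subject), expand(predicate), expand(obj)
    )


triple = ("pasePerson:cuthbert_1", "pase:hasOccupation", "paseAuthority:Anchorite")
line = to_ntriples(*triple)
print(line)
```

All three positions are names that start with HTTP, but the triple has no slot for the source, the location, or the wording "sanctus anachorita Dei".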

A reified factoid

Factoid: iii.6, "sanctus anachorita Dei"
Occupation Type: "Anchorite"
Person: Cuthbert 1, "Saint; bishop of Hexham ..."
Source: Anon.VitCuthberti
Location: Farne, island, Northumberland

pasePerson:cuthbert_1 pase:hasOccupation paseAuthority:Anchorite

pasedata:factoid2568 rdf:type rdf:Statement ;
    rdf:subject pasePerson:cuthbert_1 ;
    rdf:predicate pase:hasOccupation ;
    rdf:object paseAuthority:Anchorite ;
    pase:fromSource paseSources:Anon.VitCuthberti ;
    pase:associatedLocation paseLocation:farne.island […]
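
The step from a plain triple to a reified statement is mechanical, and can be sketched as a small function. This is an illustration using the standard RDF reification vocabulary (rdf:Statement, rdf:subject, rdf:predicate, rdf:object); the function name reify and the convention of turning keyword arguments into pase: properties are assumptions made for this example, not part of any PASE tooling.

```python
def reify(statement_id, triple, **annotations):
    """Return the triples describing `triple` as an rdf:Statement.

    Extra keyword arguments become annotation triples on the statement
    node, with the keyword used as a local name under the pase: prefix.
    """
    subject, predicate, obj = triple
    triples = [
        (statement_id, "rdf:type", "rdf:Statement"),
        (statement_id, "rdf:subject", subject),
        (statement_id, "rdf:predicate", predicate),
        (statement_id, "rdf:object", obj),
    ]
    for name, value in annotations.items():
        triples.append((statement_id, "pase:" + name, value))
    return triples


reified = reify(
    "pasedata:factoid2568",
    ("pasePerson:cuthbert_1", "pase:hasOccupation", "paseAuthority:Anchorite"),
    fromSource="paseSources:Anon.VitCuthberti",
    associatedLocation="paseLocation:farne.island",
)
for s, p, o in reified:
    print(s, p, o, ".")
```

In RDF, a reified statement describes a triple without asserting it, which suits a model that records what a source says rather than what the project believes.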

Uncertainty

A data model needs to accommodate statements of uncertainty. Where and how?

The factoid model, by being "source driven", deals with one aspect of this.

A factoid assertion is a claim about what a source says, not what the project believes (thus, a little less like biography).

Uncertainty

Dates: we use a TEI-like date model for dates in our factoids

Qualification of statements, probably at the factoid and factoid/person level: Certain, Uncertain, Very Uncertain
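
Both kinds of uncertainty can be attached to a factoid as ordinary fields. The sketch below assumes a date model with the two bounds that TEI expresses as notBefore and notAfter, and reads a date such as "1177 x 1185" as "at some point between 1177 and 1185". The class and function names, the year-only precision, and the default of Certain are all assumptions for illustration; the parser handles only this simple form.

```python
import re
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Certainty(Enum):
    CERTAIN = "Certain"
    UNCERTAIN = "Uncertain"
    VERY_UNCERTAIN = "Very Uncertain"


@dataclass(frozen=True)
class DateRange:
    # Mirrors TEI's notBefore / notAfter bounds, in whole years only.
    not_before: int
    not_after: int

    def may_overlap(self, other):
        """True if the two ranges could refer to the same year."""
        return (
            self.not_before <= other.not_after
            and other.not_before <= self.not_after
        )


def parse_range(text):
    """Parse '1177 x 1185' (or a single year) into a DateRange."""
    years = [int(y) for y in re.findall(r"\d{3,4}", text)]
    if not years or len(years) > 2:
        raise ValueError("unsupported date expression: " + repr(text))
    return DateRange(years[0], years[-1])


@dataclass(frozen=True)
class QualifiedFactoid:
    summary: str
    date: Optional[DateRange]
    # Assumed default: a factoid is Certain unless marked otherwise.
    certainty: Certainty = Certainty.CERTAIN


gift = QualifiedFactoid("Gift of land of Dodin", parse_range("1177 x 1185"))
print(gift.date, gift.certainty.value)
```

With ranges in place, a contested dating ("I think this event happened at a different time") becomes a comparison between two DateRange values rather than a dispute over a single year.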

Summary

Identity of an historical person is based on the assertions found in the sources.

The factoid model is designed to reflect this source-driven orientation.

Agent and Event orientation is not the full story for prosopography.

The factoid model can be mapped best to reified RDF statements.

Uncertainty is always an element in history, and needs to be accommodated.

... so, what's in?

◦ Identifiers for persons
◦ Names and forms of names for people as they appear in sources
◦ Assertions about them, as derived from the sources
◦ Models to deal with recording uncertainty, and complex dating?

... what's near by?

Names for onomastic studies?
◦ I'm interested in seeing how name data fits in in the SNAP-DRGN project!

Places (perhaps linked to, but the task of formally organising places is a project in its own right)

Sources (linked to, but not captured or represented)

What’s still missing?

Linked Data and History

If linked data is to connect historical data, it is likely to work best when centered on three kinds of entities:
◦ Sources
◦ Places
◦ People

What adequately represents a linked data representation of prosopography?