The Entity Registry System: Collaborative Editing of Entity Data in Poorly Connected Environments

Data Archiving and Networked Services. The Entity Registry System: Collaborative Editing of Entity Data in Poorly Connected Environments. Christophe Guéret (@cgueret), Philippe Cudré-Mauroux. AAAI Spring Symposium #SD4HumTech15, March 23-25, 2015, Stanford University.

Transcript of The Entity Registry System: Collaborative Editing of Entity Data in Poorly Connected Environments

Page 1: The Entity Registry System: Collaborative Editing of Entity Data in Poorly Connected Environments

Data Archiving and Networked Services

The Entity Registry System: Collaborative Editing of Entity Data in Poorly Connected Environments

Christophe Guéret (@cgueret)

Philippe Cudré-Mauroux

AAAI Spring Symposium #SD4HumTech15, March 23-25, 2015, Stanford University

Page 2: The Entity Registry System: Collaborative Editing of Entity Data in Poorly Connected Environments

The big question

“This symposium aims to address the question of whether the technology is mature enough to warrant further investigation or whether the disadvantages outweigh the utility of SD for this domain”

Page 3: The Entity Registry System: Collaborative Editing of Entity Data in Poorly Connected Environments

And the answer (for Linked Data) is…

Yes, it is mature enough!

But Linked Data platforms need to be downscaled before they can deliver their full potential in this specific context. So far, most of what the community has to offer does not fit.

Page 4: The Entity Registry System: Collaborative Editing of Entity Data in Poorly Connected Environments

On the upscaling of platforms

● General design approach

– Design a “one size fits all” data model for the common space

– Make a centralised store in the cloud

– Connect users to the store

● Scale up to cater for more users

● Have a hard time trying to fit in users when

– Limited or no infrastructure (connectivity, electricity, ...)

– Limited agreement on models / data heterogeneity

– Different levels of (computer) literacy

Page 5: The Entity Registry System: Collaborative Editing of Entity Data in Poorly Connected Environments

On the opposite side

● Downscaling platforms to make them fit specific, challenging, usage contexts and use-cases

http://worldwidesemanticweb.org/

Page 6: The Entity Registry System: Collaborative Editing of Entity Data in Poorly Connected Environments

Other WWSW aspects

● Interfaces: non-text-centric interaction with data (SPARQL-Voice, Icons, …)

● Relevancy: find the subset of structured data that is the most relevant, contextualised reasoning, local+global data

● Data: publication of development-related data as Linked Open Data (IATI, IDS, ...)

Short video on our website in “About”

Page 7: The Entity Registry System: Collaborative Editing of Entity Data in Poorly Connected Environments

The Entity Registry System (ERS)

Page 8: The Entity Registry System: Collaborative Editing of Entity Data in Poorly Connected Environments

Entities

● Semi-structured, interlinked descriptions of shared instances

– Persons

– Objects

– Software

– Locations

– Sensors

– …

Page 9: The Entity Registry System: Collaborative Editing of Entity Data in Poorly Connected Environments

Collaboratively describing entities

● A single information space can be useful

● But even outside a challenging context, deploying collaborative entity-editing platforms is technically very hard

– Local/Global QoS to serve arbitrary entity data

● Performance, scale-out

– Collaborative aspects

● Transactions, versioning, integration

– Offline / mobile concerns

● Caching / replication / serializability

Page 10: The Entity Registry System: Collaborative Editing of Entity Data in Poorly Connected Environments

One solution: ERS

● Web-less Linked Data

● Three-tier solution to deploy entity-powered apps

– Flexible

● Seamlessly reconcile entities in local / ad-hoc / global modes

– Collaborative

● Transactional consistency, data versioning

– Scalable

● Shared data store, tunable completeness

– Open-source

● https://github.com/ers-devs

Page 11: The Entity Registry System: Collaborative Editing of Entity Data in Poorly Connected Environments

Starting centralised design

Page 12: The Entity Registry System: Collaborative Editing of Entity Data in Poorly Connected Environments

Introducing the “Contributors”

● The central store is removed

● Contributors are their own trusted data store

● They can cache content from other contributors

● They have a private store for private data
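The contributor model listed above can be illustrated with a small Python sketch. It is an assumption-laden illustration, not the ERS code: the class `Contributor`, its three dictionaries, and the method names `describe`, `cache_from` and `lookup` are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Contributor:
    """Hypothetical model of a contributor with its three stores."""
    name: str
    public: dict = field(default_factory=dict)   # own statements, shared with peers
    cache: dict = field(default_factory=dict)    # copies of other contributors' statements
    private: dict = field(default_factory=dict)  # statements that never leave this node

    def describe(self, entity, prop, value, private=False):
        """Record a statement about an entity in the public or private store."""
        store = self.private if private else self.public
        store.setdefault(entity, []).append((prop, value))

    def cache_from(self, other):
        """Copy another contributor's public statements into the local cache."""
        for entity, statements in other.public.items():
            self.cache[(other.name, entity)] = list(statements)

    def lookup(self, entity):
        """Return every locally known statement about an entity, with its source."""
        found = [(self.name, s) for s in self.public.get(entity, [])]
        found += [(self.name, s) for s in self.private.get(entity, [])]
        for (owner, cached_entity), statements in self.cache.items():
            if cached_entity == entity:
                found += [(owner, s) for s in statements]
        return found


alice = Contributor("alice")
bob = Contributor("bob")
alice.describe("house1", "#people", "1")
alice.describe("house1", "#owner", "alice", private=True)
bob.describe("house1", "#people", "2")
bob.cache_from(alice)  # only alice's public statements are copied
```

In this sketch, `bob.lookup("house1")` returns both his own statement and the one cached from alice, each tagged with its source, while alice's private statement stays on her node.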

Page 13: The Entity Registry System: Collaborative Editing of Entity Data in Poorly Connected Environments

Adding a “Bridge”

● Can only cache content from Contributors

● Useful for asynchronous messaging

● Convenient for groups (schools, clusters, ...)

Page 14: The Entity Registry System: Collaborative Editing of Entity Data in Poorly Connected Environments

And put it on a bus, or something else

● Can be used to implement a sneakernet

● Contributors can also do this when visiting different bridges

Page 15: The Entity Registry System: Collaborative Editing of Entity Data in Poorly Connected Environments

Need to get all the data in one place?

● Use the third component of ERS: the Aggregator

● An Aggregator aggregates the content coming from several Bridges
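The flow from Contributors to Bridges to an Aggregator might look like the following Python sketch. The function names `bridge_cache` and `aggregate` and the dictionary layout are hypothetical, chosen only for illustration. The sketch also assumes that a container is identified by its owner and the entity it describes.

```python
def bridge_cache(contributors):
    """A bridge only caches what contributors publish; it authors nothing itself."""
    cache = {}
    for owner, containers in contributors.items():
        for entity, statements in containers.items():
            cache[(owner, entity)] = list(statements)
    return cache


def aggregate(bridges):
    """Merge the caches of several bridges, keeping one container per (owner, entity)."""
    merged = {}
    for cache in bridges:
        for key, statements in cache.items():
            # The first copy seen wins; version handling is left out of this sketch.
            merged.setdefault(key, statements)
    return merged


# Two groups (for example two schools), each behind its own bridge.
school_a = {"alice": {"house1": [("#people", "1")]}}
school_b = {
    "bob": {"house1": [("#people", "2")]},
    "alice": {"house1": [("#people", "1")]},  # alice also visited this bridge
}

everything = aggregate([bridge_cache(school_a), bridge_cache(school_b)])
```

Keying the merged view by owner and entity means that alice's container, seen through two bridges, appears only once, while bob's differing statement is kept alongside it rather than overwriting it.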

Page 16: The Entity Registry System: Collaborative Editing of Entity Data in Poorly Connected Environments

About consistency of statements

● Different points of view are, by design, kept in separate containers

● Provenance data is available for all containers

● Voting/consensus can resolve conflicts

<house1> “#people” “1”

<house1> “#people” “2”

<house1> “#people” “2”

<house1> “#people” “1”
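One possible reading of voting over conflicting statements is a simple majority count across containers. The sketch below assumes one vote per container and per distinct value. The function `vote`, the container layout, and the contributor names are hypothetical, and the slides do not say which voting scheme ERS applications actually use.

```python
from collections import Counter


def vote(containers, entity, prop):
    """Return the value asserted by most containers, or None on a tie or no data."""
    counts = Counter()
    for container in containers:
        # Each container votes at most once for each distinct value it holds.
        values = {v for (e, p, v) in container["statements"] if e == entity and p == prop}
        counts.update(values)
    ranked = counts.most_common(2)
    if not ranked:
        return None
    if len(ranked) == 2 and ranked[0][1] == ranked[1][1]:
        return None  # tie: leave the conflict unresolved
    return ranked[0][0]


containers = [
    {"creator": "alice", "statements": [("house1", "#people", "1")]},
    {"creator": "bob", "statements": [("house1", "#people", "2")]},
    {"creator": "carol", "statements": [("house1", "#people", "2")]},
]
majority = vote(containers, "house1", "#people")

# A fourth container restores the two-against-two situation of the statements above.
containers.append({"creator": "dave", "statements": [("house1", "#people", "1")]})
tied = vote(containers, "house1", "#people")
```

With three containers the value "2" wins. With the four statements shown above the count is tied and the function returns `None`, so the provenance attached to each container would have to decide.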

Page 17: The Entity Registry System: Collaborative Editing of Entity Data in Poorly Connected Environments

About updates and deletions

● Statement containers are uniquely identified

● Updates

– New versions of documents get automatically replicated

● Deletes

– Only the creator of a given container can delete it

– Deletions in a cache store do not get replicated
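The update and delete rules can be sketched in Python as follows. This is a minimal sketch under stated assumptions: containers are plain dictionaries with a random identifier and an integer version, and the helpers `new_container`, `update`, `delete` and `replicate` are hypothetical names. How a creator's deletion reaches other stores is not modelled here.

```python
import uuid


def new_container(creator, statements):
    """Create a uniquely identified, versioned statement container."""
    return {"id": uuid.uuid4().hex, "creator": creator,
            "version": 1, "statements": list(statements)}


def update(container, statements):
    """Return a new version of the container with replaced statements."""
    return {**container, "version": container["version"] + 1,
            "statements": list(statements)}


def delete(store, container_id, requester):
    """Delete from the creator's own store; refuse anyone but the creator."""
    if store[container_id]["creator"] != requester:
        raise PermissionError("only the creator can delete a container")
    del store[container_id]


def replicate(source_store, cache):
    """Push containers to a cache, overwriting only older versions."""
    for cid, container in source_store.items():
        cached = cache.get(cid)
        if cached is None or cached["version"] < container["version"]:
            cache[cid] = dict(container)


alice_store, bob_cache = {}, {}
c = new_container("alice", [("house1", "#people", "1")])
alice_store[c["id"]] = c
replicate(alice_store, bob_cache)

alice_store[c["id"]] = update(c, [("house1", "#people", "2")])
replicate(alice_store, bob_cache)      # the new version reaches bob's cache
cached_version = bob_cache[c["id"]]["version"]

bob_cache.pop(c["id"])                 # local cache eviction only
still_at_alice = c["id"] in alice_store
```

Evicting the container from bob's cache leaves alice's store untouched, and a later `replicate` call would simply bring the container back.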

Page 18: The Entity Registry System: Collaborative Editing of Entity Data in Poorly Connected Environments

What ERS does not solve (yet)

● Minting of identifiers

– Every contributor can create their own identifiers. There is no enforced scheme

● Global search for existing identifiers

– Only local search is possible

● Modeling of data

– Selection of vocabulary comes from the applications using ERS

Page 19: The Entity Registry System: Collaborative Editing of Entity Data in Poorly Connected Environments

Take away message

● Linked Data is a good way to create a globally integrated, yet decentralised, information space for describing entities

● ERS provides simple Linked Data without the Web, without HTTP, without SPARQL, ...

● The reference implementation is open source, based on CouchDB/JSON-LD/Python/Avahi, lightweight, and compatible with the HXL hashtag approach ;-)