© 2006 IBM Corporation Features of an Enterprise-ready Triple Store Ben Szekely June, 2006.
© 2006 IBM Corporation
Features of an Enterprise-ready Triple Store
Ben Szekely
June 2006
IBM Internet Technology
Features of an Enterprise-ready Triple Store – Metadata and Ontologies Workshop © 2006 IBM Corporation
Most examples of RDF triple stores focus on specific difficult problems
Focused on inference or standards
Preoccupied with “Billions of Triples”
Little thought given to application programming model.
Not multi-user (limited security)
Boca Overview – Multi-user, distributed enterprise RDF repository
Selective RDF replication from server to client machines
Security, including named-graph-based RDF access control
Audit trails of changes to data within named graphs
Near real-time event notifications
Sophisticated programming model
Named Graphs
A named graph is the logical unit of RDF storage in Boca. Each triple exists in exactly one named graph
– If the same statement is added to more than one named graph, it is stored once per graph, as separate triples.
– Adding and removing triples is done in the context of a named graph
Each named graph has a metadata graph, containing information such as ACLs
Named graphs can be exposed via LSIDs, URLs, Web Services
Named Graph applications
– LSID metadata
– Workflow documents
– Atom feeds
– FOAF profiles
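The "one triple, one named graph" rule can be illustrated with a minimal in-memory sketch. This is not Boca's API: the class `NamedGraphStore`, its method names, and the `ex:` identifiers are hypothetical and exist only to show that every add and remove happens in the context of a graph, and that the same statement added to two graphs is counted twice.

```python
from collections import defaultdict


class NamedGraphStore:
    """Toy store (hypothetical, not Boca's API): triples are kept per named graph."""

    def __init__(self):
        # graph URI -> set of (subject, predicate, object) triples
        self._graphs = defaultdict(set)

    def add(self, graph, triple):
        # Adding always happens in the context of a named graph.
        self._graphs[graph].add(triple)

    def remove(self, graph, triple):
        # Removing from one graph leaves copies in other graphs untouched.
        self._graphs[graph].discard(triple)

    def size(self):
        # Total number of stored statements across all graphs.
        return sum(len(triples) for triples in self._graphs.values())


store = NamedGraphStore()
statement = ("ex:alice", "foaf:knows", "ex:bob")
store.add("ex:graphA", statement)
store.add("ex:graphB", statement)
size_before = store.size()  # the statement is stored once per graph

store.remove("ex:graphA", statement)
size_after = store.size()   # the copy in ex:graphB is still there
```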
Underlying Technologies
Relational Database (DB2, Oracle, MySQL)
– RDF triples stored in a table (subject, predicate, object, graphid)
– Space saved by normalizing URIs and strings to integer ids
– Extra tables for history, ACLs, replication
J2EE (Jetty, Tomcat, WebSphere)
– Jetty: standalone server; check out from CVS and run for testing
– WebSphere Application Server (WAS): enterprise-ready web application server for real deployment
JMS Server (ActiveMQ, WebSphere MQ)
– Pub-sub messaging used for real-time notifications of triple updates
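The relational layout described above could look roughly like the following. This is a sketch only, using SQLite in place of DB2, Oracle, or MySQL. The table names `nodes` and `statements`, the helper functions, and the example URIs are assumptions; only the four column names (subject, predicate, object, graphid) come from the slide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Every URI or string literal is stored once and referenced by integer id.
CREATE TABLE nodes (
    id    INTEGER PRIMARY KEY,
    value TEXT UNIQUE NOT NULL
);
-- One row per triple, in the context of one named graph.
CREATE TABLE statements (
    subject   INTEGER NOT NULL REFERENCES nodes(id),
    predicate INTEGER NOT NULL REFERENCES nodes(id),
    object    INTEGER NOT NULL REFERENCES nodes(id),
    graphid   INTEGER NOT NULL REFERENCES nodes(id),
    PRIMARY KEY (subject, predicate, object, graphid)
);
""")


def node_id(value):
    """Return the integer id for a URI or string, inserting it if new."""
    conn.execute("INSERT OR IGNORE INTO nodes (value) VALUES (?)", (value,))
    row = conn.execute("SELECT id FROM nodes WHERE value = ?", (value,)).fetchone()
    return row[0]


def add_statement(s, p, o, g):
    """Store one triple in graph g, using normalized ids."""
    conn.execute(
        "INSERT OR IGNORE INTO statements VALUES (?, ?, ?, ?)",
        (node_id(s), node_id(p), node_id(o), node_id(g)),
    )


GRAPH = "http://example.org/graphs/g1"
KNOWS = "http://xmlns.com/foaf/0.1/knows"
add_statement("http://example.org/alice", KNOWS, "http://example.org/bob", GRAPH)
add_statement("http://example.org/alice", KNOWS, "http://example.org/carol", GRAPH)

node_count = conn.execute("SELECT COUNT(*) FROM nodes").fetchone()[0]
statement_count = conn.execute("SELECT COUNT(*) FROM statements").fetchone()[0]
```

The two statements share a subject, predicate, and graph, so only five distinct strings are stored for eight column values, which is the space saving the slide refers to.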
Replication
Boca clients have a persistent local RDF store that mirrors a subset of the triples on the Boca server.
Replicated subset specified by:
– Triple patterns; e.g. (<http://tdwg.org/meetings/GUID-2#>, <http://tdwg.org/preds/hasParticipant>, *)
– Named graph URIs
– Triple patterns within named graphs
When a replication is initiated, the service computes what has changed in the subset based on pattern and graph subscriptions.
Replication can work as a background process on the client, or be explicitly initiated.
Applications can query/write against graphs in the local and server models.
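The three kinds of subscription listed above (triple patterns, named graph URIs, triple patterns within named graphs) can be sketched as a filter over server-side quads. The function names, the use of `None` as the `*` wildcard, and the `example.org` data are assumptions for illustration. Boca's actual replication service also computes what has changed since the last replication, which this sketch does not attempt.

```python
WILDCARD = None  # stands in for "*" in a triple pattern


def matches(pattern, triple):
    """True if each pattern position is a wildcard or equals the triple's value."""
    return all(p is WILDCARD or p == t for p, t in zip(pattern, triple))


def replicated_subset(server_quads, triple_patterns=(), graph_uris=(), graph_patterns=()):
    """Select the (s, p, o, graph) quads covered by a client's subscriptions.

    graph_patterns is an iterable of (graph URI, triple pattern) pairs.
    """
    selected = set()
    for s, p, o, g in server_quads:
        triple = (s, p, o)
        if (
            g in graph_uris
            or any(matches(pat, triple) for pat in triple_patterns)
            or any(g == pg and matches(pat, triple) for pg, pat in graph_patterns)
        ):
            selected.add((s, p, o, g))
    return selected


MEETING = "http://tdwg.org/meetings/GUID-2#"
HAS_PARTICIPANT = "http://tdwg.org/preds/hasParticipant"
MEETINGS_GRAPH = "http://example.org/graphs/meetings"   # hypothetical
PROFILES_GRAPH = "http://example.org/graphs/profiles"   # hypothetical

server_quads = [
    (MEETING, HAS_PARTICIPANT, "http://example.org/people/alice", MEETINGS_GRAPH),
    (MEETING, "http://example.org/preds/location", "http://example.org/places/venue", MEETINGS_GRAPH),
    ("http://example.org/people/alice", "http://xmlns.com/foaf/0.1/name", "Alice", PROFILES_GRAPH),
]

subset = replicated_subset(
    server_quads,
    triple_patterns=[(MEETING, HAS_PARTICIPANT, WILDCARD)],
    graph_uris={PROFILES_GRAPH},
)
```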
Notification – maintaining the replica in real-time
Updates to named graphs on server are published in near real-time to clients.
Local replicas can be kept up-to-date between replications.
Notification is central to distributed RDF applications
– Ex: workflow, collaboration
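How published updates keep a local replica current between replications can be sketched with an in-process publish/subscribe stand-in. Boca uses a JMS server for this. The `Broker` and `Replica` classes, the event shape `(graph, op, triple)`, and the `ex:` identifiers below are hypothetical and only illustrate the flow.

```python
from collections import defaultdict


class Broker:
    """In-process stand-in for a pub-sub messaging server (hypothetical)."""

    def __init__(self):
        # graph URI -> list of subscriber callbacks
        self._subscribers = defaultdict(list)

    def subscribe(self, graph, callback):
        self._subscribers[graph].append(callback)

    def publish(self, graph, op, triple):
        # Deliver the update to every subscriber of this named graph.
        for callback in self._subscribers[graph]:
            callback(graph, op, triple)


class Replica:
    """Local copy of subscribed graphs, updated as notifications arrive."""

    def __init__(self):
        self.graphs = defaultdict(set)

    def on_update(self, graph, op, triple):
        if op == "add":
            self.graphs[graph].add(triple)
        elif op == "remove":
            self.graphs[graph].discard(triple)


broker = Broker()
replica = Replica()
broker.subscribe("ex:workflow", replica.on_update)

# Server-side changes to a subscribed graph reach the replica immediately.
broker.publish("ex:workflow", "add", ("ex:task1", "ex:status", "ex:running"))
broker.publish("ex:workflow", "add", ("ex:task1", "ex:status", "ex:done"))
broker.publish("ex:workflow", "remove", ("ex:task1", "ex:status", "ex:running"))

# Changes to a graph the client has not subscribed to are not delivered.
broker.publish("ex:other", "add", ("ex:x", "ex:y", "ex:z"))
```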
Access Controls
Boca users can have the following system-wide permissions:
– canInsertNamedGraphs -- a user must have this permission in order to create a new named graph (i.e. insert statements into a graph that does not yet exist in the system)
Boca users can have the following per-named-graph permissions (these apply also to the system graph):
– canRead -- a user with this permission may view the triples in the named graph and in its metadata graph
– canAdd -- a user with this permission may insert new triples into the named graph
– canRemove -- a user with this permission may remove triples from the named graph
– canChangeNamedGraphACL -- a user with this permission may change the ACL triples in the metadata graph
– canRemoveNamedGraph -- a user with this permission may entirely remove the named graph from the system
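The interplay between the system-wide and per-graph permissions can be sketched as a simple check. The permission names come from the slide. The `AccessControl` class, its dictionaries, and the user and graph names are assumptions and do not reflect how Boca stores ACL triples in metadata graphs.

```python
from dataclasses import dataclass, field


@dataclass
class AccessControl:
    """Toy permission table (hypothetical, not Boca's implementation)."""

    # user -> set of system-wide permissions
    system: dict = field(default_factory=dict)
    # graph URI -> {user -> set of per-named-graph permissions}
    graphs: dict = field(default_factory=dict)

    def has(self, user, graph, permission):
        """Check a per-named-graph permission such as canRead or canRemove."""
        return permission in self.graphs.get(graph, {}).get(user, set())

    def may_insert(self, user, graph):
        """Inserting into a graph that does not exist yet creates it, which
        needs canInsertNamedGraphs; otherwise canAdd on the graph is required."""
        if graph not in self.graphs:
            return "canInsertNamedGraphs" in self.system.get(user, set())
        return self.has(user, graph, "canAdd")


acl = AccessControl(
    system={"alice": {"canInsertNamedGraphs"}},
    graphs={"ex:g1": {"bob": {"canRead", "canAdd"}}},
)

alice_creates_graph = acl.may_insert("alice", "ex:new")  # system-wide permission
bob_creates_graph = acl.may_insert("bob", "ex:new")      # lacks canInsertNamedGraphs
bob_adds_to_g1 = acl.may_insert("bob", "ex:g1")          # has canAdd on ex:g1
bob_removes_from_g1 = acl.has("bob", "ex:g1", "canRemove")
```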
Versioning
SVN-like approach to versioning
When a triple is added to or removed from a named graph, a new revision of that named graph is created.
Simple API for reading old revisions
Provides a straightforward mechanism for concurrent distributed computing.
– When a client submits an update to a named graph, it may specify the version number that it currently has. The update will fail if the graph has been more recently modified.
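The revision check described above is a form of optimistic concurrency control, and can be sketched as follows. The class `VersionedGraph`, the exception `StaleRevisionError`, and the parameter `expected_revision` are hypothetical names. For brevity the sketch keeps a full snapshot per revision, which is an assumption and not necessarily how Boca records history.

```python
class StaleRevisionError(Exception):
    """Raised when an update is based on an out-of-date revision."""


class VersionedGraph:
    """Toy named graph with revision history (hypothetical, not Boca's API)."""

    def __init__(self):
        self.revision = 0
        # revision number -> immutable snapshot of the graph's triples
        self._history = {0: frozenset()}

    def update(self, add=(), remove=(), expected_revision=None):
        # If the client states which revision it holds, reject the update
        # when the graph has been modified since then.
        if expected_revision is not None and expected_revision != self.revision:
            raise StaleRevisionError(
                f"client has revision {expected_revision}, graph is at {self.revision}"
            )
        triples = (self._history[self.revision] | set(add)) - set(remove)
        self.revision += 1
        self._history[self.revision] = frozenset(triples)
        return self.revision

    def read(self, revision=None):
        """Return the triples at a given revision (default: latest)."""
        return self._history[self.revision if revision is None else revision]


t1 = ("ex:doc", "ex:status", "ex:draft")
t2 = ("ex:doc", "ex:reviewer", "ex:alice")

graph = VersionedGraph()
r1 = graph.update(add=[t1])
r2 = graph.update(add=[t2], expected_revision=r1)

# A second client still holding revision r1 tries to write and is rejected.
try:
    graph.update(remove=[t1], expected_revision=r1)
    conflict = False
except StaleRevisionError:
    conflict = True
```

A rejected client would typically re-read the latest revision, reapply its change, and resubmit.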
The Boca Programming Model
Named Graphs
Commands
Transactions
Versioning
Replication
Notification
Abandoned features – Collections, Statement ACLs & Reification
Collections – a statement can exist in multiple collections
– A more difficult programming model: what happens when I delete in the context of one collection?
– Expensive to maintain
– Not a widely accepted programming model (as named graphs are)
Statement-level ACLs
– Too expensive
– Difficult to program
– Not particularly useful, other than the odd, very important statement
– In that case, such a statement can live in its own named graph
Reification
– Queries were very difficult to formulate
– Most RDF applications do not deal with reification
– Reification semantics often confused with true quoting
– Reification is an arbitrary layer of indirection that can be solved with ontologies
Future Features
Arbitrary query-based replication/notification
Distributed servers
Open source
Applications
Executing OWL-S in a distributed fashion
Storing annotations
Providing LSID metadata
Web 2.0 application backend
– Wikis, Blogs, Tagging, Atom
National Cancer Institute research platform