Got bored by the relational database? Switch to a RDF store!
Uploaded by benfante
Fabrizio Giudici
Tidalwave s.a.s.
Who I am
● Java consultant since 1996
● Senior architect
● Java instructor for Sun since 1998
● Member of the NetBeans Dream Team
● Technical Writer, Blogger at Java.Net, DZone
● http://weblogs.java.net/blog/fabriziogiudici/
● http://www.tidalwave.it/people
Where I am using RDF stores
http://bluemarine.tidalwave.it
Agenda
● Why the RDBMS?
● RDBMS issues
● The Semantic Model
● OpenSesame and Elmo
● A few code samples
● Conclusion
RDBMSs are everywhere
What do we expect from an RDBMS?
● Persistence
● Reliability
● Transactions
● Integrability
● Manageability
Lack of cohesion
● Do we really need an RDBMS for those things?
● No, we don't
– Persistence and transactions are good
– The specific relational schema is evil
● RDBMSs sell all of this in a single package
ER-OO impedance
● Entity-Relationship is different from OO
– Primary keys
– No inheritance
– No behaviour
– Normalization rules
– Relationships through foreign keys
ORMs
● Tools to minimize the ER-OO impedance
● Java has a standard API: JPA
– Hibernate, TopLink, EclipseLink, OpenJPA
– Tries to abstract the database à la Java
● Good, but the RDBMS still has to be designed
● And maintained
Can we get rid of the relational database?
The Semantic Model
● Semantic Technology != Semantic Web
● RDF: Resource Description Framework
● “Triples” are the atomic information item
● Subject / predicate / object
– Java / is-a / programming-language
– Fabrizio / is-member-of / NetBeans Dream Team
– Verona / is-part-of / Veneto
– Verona / has-plate / “VR”
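The triples listed above can be modelled in a few lines of plain Java. This is only a sketch to make the subject / predicate / object idea concrete: the `TripleDemo` class, the `Triple` record and the `about` helper are names invented here, not part of Sesame or Elmo, and plain strings stand in for real resource identifiers.

```java
import java.util.List;

public class TripleDemo {
    /** One statement: subject / predicate / object (plain strings for brevity). */
    record Triple(String subject, String predicate, String object) {}

    /** The four statements from the slide. */
    static List<Triple> sampleTriples() {
        return List.of(
            new Triple("Java", "is-a", "programming-language"),
            new Triple("Fabrizio", "is-member-of", "NetBeans Dream Team"),
            new Triple("Verona", "is-part-of", "Veneto"),
            new Triple("Verona", "has-plate", "VR"));
    }

    /** Returns every statement made about the given subject. */
    static List<Triple> about(List<Triple> triples, String subject) {
        return triples.stream()
            .filter(t -> t.subject().equals(subject))
            .toList();
    }

    public static void main(String[] args) {
        // Prints the two statements whose subject is "Verona".
        about(sampleTriples(), "Verona").forEach(System.out::println);
    }
}
```

Note that nothing in the model says which predicates exist: a subject simply has whatever statements have been made about it.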
The Semantic Model
● The subject is a resource
● The predicate is a property
● The object is a value
● A value is a resource or a primitive type
● Resources and properties are identified by URLs/URNs
– Just a naming scheme
– Not necessarily web-related
Formal representation
● RDF is not tied to XML
● XML is just one of the ways to represent RDF
– XML-RDF, unfortunately often referred to simply as RDF
● Notation 3 (N3) is another popular representation
– Much more human-readable
● Other formats exist
● An RDF representation is often referred to as a “serialization”
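To make "serialization" concrete, here is a minimal sketch that writes a triple as one line of N-Triples, a very simple line-based format in the same family as N3 (resources in angle brackets, literals in double quotes, a trailing dot). The class and record names and the `http://example.org/` identifiers are assumptions made for this illustration, and the literal escaping covers only backslashes and double quotes.

```java
public class NTriplesDemo {
    /** An object value: either a resource or a plain literal. */
    interface Value {
        String toNTriples();
    }

    /** A resource, identified by a URL/URN. */
    record Resource(String iri) implements Value {
        public String toNTriples() {
            return "<" + iri + ">";
        }
    }

    /** A plain string literal (minimal escaping only). */
    record Literal(String text) implements Value {
        public String toNTriples() {
            String escaped = text.replace("\\", "\\\\").replace("\"", "\\\"");
            return "\"" + escaped + "\"";
        }
    }

    /** Subject and predicate are always resources; the object is any value. */
    record Triple(Resource subject, Resource predicate, Value object) {
        String toNTriples() {
            return subject.toNTriples() + " " + predicate.toNTriples()
                + " " + object.toNTriples() + " .";
        }
    }

    public static void main(String[] args) {
        // Hypothetical identifiers, chosen only for the example.
        Triple plate = new Triple(
            new Resource("http://example.org/Verona"),
            new Resource("http://example.org/has-plate"),
            new Literal("VR"));
        System.out.println(plate.toNTriples());
    }
}
```

The same triple could equally be written out as XML; the data model does not change, only its textual form.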
(XML-)RDF is near to you
● RSS/RDF
● Dublin Core
● XMP by Adobe
Compared to RDBMSs
● There's no fixed schema
– Everything is a triple
– “AAA slogan”: Anyone can say Anything about Any topic
● Adding new data types is adding triples
– No need to add / alter tables
– Maintenance is just updating data
● Databases can be distributed (federations)
– Can be merged by just copying triples together
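The points about schema-free growth and merging can be sketched with plain Java sets. This is an illustration of the idea only, not how Sesame federates repositories: `MergeDemo`, `Triple`, `merge` and the `has-river` predicate are names made up for the example.

```java
import java.util.LinkedHashSet;
import java.util.Set;

public class MergeDemo {
    record Triple(String subject, String predicate, String object) {}

    /** Merging two stores is a set union: copy all triples together. */
    static Set<Triple> merge(Set<Triple> a, Set<Triple> b) {
        Set<Triple> merged = new LinkedHashSet<>(a);
        merged.addAll(b); // identical triples collapse into one
        return merged;
    }

    static Set<Triple> demo() {
        Set<Triple> first = new LinkedHashSet<>();
        first.add(new Triple("Verona", "is-part-of", "Veneto"));
        first.add(new Triple("Verona", "has-plate", "VR"));

        Set<Triple> second = new LinkedHashSet<>();
        // Same statement as in the first store.
        second.add(new Triple("Verona", "has-plate", "VR"));
        // A predicate the first store has never seen: it is just one more
        // triple, with no table to add or alter.
        second.add(new Triple("Verona", "has-river", "Adige"));

        return merge(first, second);
    }

    public static void main(String[] args) {
        demo().forEach(System.out::println);
    }
}
```

Because a record's equality is based on its three components, the duplicated statement appears only once in the merged set.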
What about performance?
● Not as optimized as SQL
● Tuning know-how is not as widespread as it is for SQL
● Some missing parts
– E.g. Sesame still lacks select count(*)
OpenSesame, Elmo
● Popular Java infrastructure for RDF
– FLOSS
– http://www.openrdf.org
● Elmo provides JPA-like operations
– Annotations
– Specific API or even a subset of JPA
A simple code example
● Note the use of standard ontologies
import org.openrdf.elmo.annotations.rdf;

@rdf(GeoVocabulary.URI_GEO_LOCATION)
public class GeoLocation {
    @rdf("http://www.w3.org/2003/01/geo/wgs84_pos#lat")
    private Double latitude;

    @rdf("http://www.w3.org/2003/01/geo/wgs84_pos#long")
    private Double longitude;

    @rdf("http://www.w3.org/2003/01/geo/wgs84_pos#alt")
    private Double altitude;

    @rdf("http://www.tidalwave.it/rdf/geo/2009/02/22#code")
    private String code;
}
A simple code example
● Declare persistent classes in META-INF/org.openrdf.elmo.concepts
● Choose a store
– Memory
– Memory backed by file
– Database (transactional)

Repository repository = new SailRepository(
    new MemoryStore(new File("/tmp/RDFStore")));
repository.initialize();
ElmoModule module = new ElmoModule();
SesameManagerFactory factory =
    new SesameManagerFactory(module, repository);
SesameManager em = factory.createElmoManager();
Use as JPA EntityManager
em.getTransaction().begin();
GeoLocation genova = new GeoLocation();
genova.setLatitude(45.0);
genova.setLongitude(9.0);
genova.setCode("GE");
em.persist(genova);
em.getTransaction().commit();
Queries
● There are specific query languages
● SPARQL is one of the most popular
● Similar to SQL, but triples in place of tables

PREFIX wgs84: <http://www.w3.org/2003/01/geo/wgs84_pos#>
SELECT ?location WHERE { ?location wgs84:lat ?lat }
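The pattern `?location wgs84:lat ?lat` can be read as "any triple whose predicate is wgs84:lat". The sketch below mimics that matching over an in-memory list, using `null` for a variable. It is a toy model of a single triple pattern, not a SPARQL engine: `PatternDemo`, `match` and the sample data (including the second location and its value) are invented for the illustration.

```java
import java.util.List;

public class PatternDemo {
    record Triple(String subject, String predicate, String object) {}

    /**
     * Matches one triple pattern against the store. A null argument plays
     * the role of a variable such as ?location; a non-null argument must
     * be equal to the corresponding part of the triple.
     */
    static List<Triple> match(List<Triple> store, String s, String p, String o) {
        return store.stream()
            .filter(t -> (s == null || s.equals(t.subject()))
                      && (p == null || p.equals(t.predicate()))
                      && (o == null || o.equals(t.object())))
            .toList();
    }

    /** Illustrative data only; the values are not meant to be accurate. */
    static List<Triple> sampleStore() {
        return List.of(
            new Triple("genova", "wgs84:lat", "45.0"),
            new Triple("genova", "wgs84:long", "9.0"),
            new Triple("verona", "wgs84:lat", "45.4"));
    }

    public static void main(String[] args) {
        // Equivalent in spirit to: SELECT ?location WHERE { ?location wgs84:lat ?lat }
        List<Triple> hits = match(sampleStore(), null, "wgs84:lat", null);
        hits.forEach(t -> System.out.println(t.subject()));

        // Counting on the client side is one possible workaround when the
        // store cannot count for you.
        System.out.println(hits.size());
    }
}
```

Binding a variable to a fixed value, as `setParameter("lat", 45.0)` does in the Elmo query, corresponds here to passing a non-null argument in that position.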
Running a query
em.getTransaction().begin();
String queryString =
    "PREFIX wgs84: <http://www.w3.org/2003/01/geo/wgs84_pos#>\n" +
    "SELECT ?location WHERE \n" +
    "  {\n" +
    "    ?location a ?type.\n" +
    "    ?location wgs84:lat ?lat\n" +
    "  }";
final ElmoQuery query = em.createQuery(queryString)
    .setType("type", GeoLocation.class)
    .setParameter("lat", 45.0);
final List<GeoLocation> result = query.getResultList();

for (GeoLocation l : result) {
    System.err.println(l);
}
em.getTransaction().commit();
Scratching the surface
● Elmo is powerful
● Supports advanced constructs
– Objects with “multiple personality”
– Mixins
Open issues
● OpenSesame doesn't support all databases
● Lack of experience
– Programming skills
– Maintenance
– Tuning
– Managerial culture
● Not widespread
● Performance?
Conclusion
● RDBMSs are mainstream, but old
● They lead to rigid schemata and don't fit OO
● It's possible to use something different
● RDF stores can be a viable alternative