GraphDB update from Dimitar Manov, Product Manager for GraphDB

GraphDB User Group, April 2015

Outline – GraphDB 6.2

• 6.2 overview

• GraphDB core improvements

• New experimental features

• GraphDB Workbench

• Connectors

GraphDB 6.2 Overview

• Release plans – end of May, 2015

• Latest current release: 6.1 SP3

– The workbench there already includes some of the new features

• Semantic versioning, e.g. 6.2.0, 6.2.1, etc.

• Sesame 2.7.15 upgrade

GraphDB 6.2 core improvements

• Optional support for SPOC index:

• <URI> ?p ?o.

• Reversed literal (dates and numbers) index

• Fwd: range (dates/numbers); Reverse: ID -> value; Q10

• AVLTree page improvements for CPU caching

• Different representation of statement data for higher locality

• 4-35% improvements in certain cases

• Transaction streaming (Enterprise)

• Previously, the whole transaction was sent to the master (XML serialized)

• This easily caused OOM errors for big data files (a few GB)
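The reversed literal index bullet above can be pictured as two structures: a forward index sorted by value for range scans, and a reverse map from ID to value. The following is a minimal Python sketch of that idea only. The names `LiteralIndex`, `add`, `range_ids` and `value_of` are hypothetical, and nothing here reflects GraphDB's actual storage layout.

```python
import bisect


class LiteralIndex:
    """Toy two-way index over numeric (or date-ordinal) literals."""

    def __init__(self):
        self._sorted = []   # forward: (value, id) pairs kept in sorted order
        self._by_id = {}    # reverse: id -> value

    def add(self, literal_id, value):
        bisect.insort(self._sorted, (value, literal_id))
        self._by_id[literal_id] = value

    def range_ids(self, low, high):
        """Return ids of literals with low <= value <= high, in value order."""
        start = bisect.bisect_left(self._sorted, (low, float("-inf")))
        end = bisect.bisect_right(self._sorted, (high, float("inf")))
        return [lit_id for _, lit_id in self._sorted[start:end]]

    def value_of(self, literal_id):
        """Reverse lookup: resolve an id back to its literal value."""
        return self._by_id[literal_id]


index = LiteralIndex()
for lit_id, value in [(1, 2015), (2, 1999), (3, 2007), (4, 2020)]:
    index.add(lit_id, value)

in_range = index.range_ids(2000, 2015)
```

A range filter such as "year between 2000 and 2015" then becomes two binary searches on the forward index instead of a scan over all literals.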

New experimental features - Gremlin

• Gremlin/Blueprints support

• Blueprints has support for RDF DBs

• Operations are limited to what can be represented as RDF

• (which is a subset of the property graph model)

• Usage:

g = new com.ontotext.blueprints.GraphDBSailGraph("http://localhost:8080/gwb/repositories/myrepo");

New experimental features - reasoning

• 4 types of reasoning

– Hybrid – combines forward and backward chaining (type/subClassOf, inverse/symmetric/equivalent properties, transitive and transitiveOver)

– Parallel (multithreaded)

– GeoSPARQL support

– SPARQL-MM support – combining temporal, spatial and multimedia

Hybrid reasoning

• Properties that could be implemented via backward chaining (B/C): TYPE (rdf:type), SCO (rdfs:subClassOf), SPO (rdfs:subPropertyOf), EQC (owl:equivalentClass), EQP (owl:equivalentProperty), INV (owl:inverseOf), SYM (owl:SymmetricProperty), TRANS (owl:TransitiveProperty)

– http://www.semantic-web-journal.net/system/files/swj508_1.pdf

– It would be interesting to experiment with B/C only on huge datasets – 100-200B

– In the hybrid scenario, SmoothDelete and the sameAs optimization may not work
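To make the forward/backward distinction concrete, here is a small Python sketch for rdf:type over rdfs:subClassOf. It is an illustration under simplifying assumptions (single inheritance, in-memory dictionaries, made-up class and instance names), not a description of how GraphDB implements hybrid reasoning.

```python
# Toy data: class hierarchy and explicit type assertions (hypothetical names).
SUBCLASS_OF = {"Dog": "Mammal", "Mammal": "Animal"}
EXPLICIT_TYPES = {"rex": "Dog", "tweety": "Bird"}


def superclasses(cls):
    """Walk rdfs:subClassOf upwards, yielding cls and every ancestor."""
    while cls is not None:
        yield cls
        cls = SUBCLASS_OF.get(cls)


def forward_chain():
    """Forward chaining: materialise every inferred rdf:type statement up front."""
    return {
        (instance, ancestor)
        for instance, cls in EXPLICIT_TYPES.items()
        for ancestor in superclasses(cls)
    }


def backward_chain(instance, cls):
    """Backward chaining: answer 'instance rdf:type cls' at query time."""
    explicit = EXPLICIT_TYPES.get(instance)
    return explicit is not None and cls in superclasses(explicit)


materialised = forward_chain()
```

Forward chaining pays at load time and stores the inferred statements; backward chaining stores nothing extra and pays on every query. A hybrid setup picks one or the other per property.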

Parallel reasoning

• This is not “massive”/CUDA reasoning

• Optimal results - with 4 threads

• 10% to 400% improvements

• Locking issues in the LRUObjectCache

SPARQL-MM

SELECT ?t1 ?t2 WHERE {
  ?f1 rdfs:label "Barack Obama".
  ?f2 rdfs:label "American soldier".
  FILTER mmf:rightBeside(?f1,?f2)
} ORDER BY ?t1 ?t2

SPARQL-MM

• Spatial relations: rightBeside, spatialCovers, spatialDisjoint

• Temporal functions: after, before, temporalContains, temporalOverlaps

• Aggregation functions: spatialIntersection, spatialBoundingBox, temporalBoundingBox

• Combined aggregations: boundingBox, intersection

• http://2014.eswc-conferences.org/sites/default/files/eswc2014pd_submission_65.pdf
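The temporal functions listed above behave like ordinary interval relations. The Python sketch below shows one plausible reading of them on media fragments given as (start, end) times in seconds. The exact boundary semantics (for example, whether touching intervals count as overlapping) are assumptions here and may differ from the SPARQL-MM definitions.

```python
from typing import NamedTuple


class Interval(NamedTuple):
    start: float  # seconds from the start of the media resource
    end: float


def before(a, b):
    """a ends no later than b starts (assumed boundary handling)."""
    return a.end <= b.start


def after(a, b):
    return before(b, a)


def temporal_contains(a, b):
    """a fully covers b."""
    return a.start <= b.start and b.end <= a.end


def temporal_overlaps(a, b):
    """a and b share some stretch of time."""
    return a.start < b.end and b.start < a.end


def temporal_bounding_box(*intervals):
    """Aggregation: smallest interval covering all inputs."""
    return Interval(min(i.start for i in intervals), max(i.end for i in intervals))


speech = Interval(10.0, 40.0)
applause = Interval(35.0, 50.0)
covering = temporal_bounding_box(speech, applause)
```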

GeoSPARQL

• A standard for geospatial RDF Data

• Implemented as a GraphDB plugin

• Query optimizer uses Lucene-based index:

– Yes (index is used): select ?x { ?x a geo:Feature . ?x geo:sfWithin <urn:Europe> }

– No (index is not used): select ?x { ?x a geo:Feature . ?x geo:hasDefaultGeometry ?xg . ?xg geo:hasSerialization ?xgLit . <urn:Europe> geo:hasDefaultGeometry ?eg . ?eg geo:hasSerialization ?egLit . filter(geof:sfWithin(?xgLit, ?egLit)) }

• Custom literals-in-relations extension:

# find all Features that are roughly within Bulgaria (based on the supplied polygon that includes parts of neighboring countries too)
select ?x where { ?x a geo:Feature . ?x geo:sfWithin "Polygon((22 41, 29 41, 29 45, 22 45, 22 41))"^^geo:wktLiteral . }
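For intuition about what a "within" test against the supplied polygon computes, here is a plain ray-casting check in Python using the same coordinates as the polygon literal. This is only a sketch of the geometric predicate for points; the plugin itself relies on its index, and the city coordinates (longitude, latitude) are approximate values added for illustration.

```python
# Ring copied from the polygon literal above, as (longitude, latitude) pairs.
BULGARIA_ROUGH = [(22, 41), (29, 41), (29, 45), (22, 45), (22, 41)]


def point_within(point, ring):
    """Ray casting: count polygon edges crossed by a ray going right from point."""
    x, y = point
    inside = False
    for (x1, y1), (x2, y2) in zip(ring, ring[1:]):
        # Only edges that straddle the horizontal line through the point matter.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


sofia = (23.32, 42.70)   # approximate
paris = (2.35, 48.86)    # approximate

sofia_inside = point_within(sofia, BULGARIA_ROUGH)
paris_inside = point_within(paris, BULGARIA_ROUGH)
```

Running such a test against every feature is what the unindexed filter form amounts to, which is why the indexed form of the query is preferable.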

GraphDB Workbench

• Overall

– much better startup time and overall size/complexity

– PermGen issue fixed

– migration to AngularJS for the new views

– RESTful services for the new views that are documented and stable for outside usage

GraphDB Workbench

• SPARQL View

– much better error handling

– better user experience

– single page instead of going back and forth between query and results

GraphDB Workbench

• Simplified design & improved user experience:

– Repositories View

– Export View

• Import View

– handle huge files (chunking, retries, etc.)

– import from remote location (URL with data)

– simple text area import
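The chunking-and-retries idea for huge files can be sketched generically as below. This is not the Workbench's implementation: `send` stands in for whatever call transmits one chunk, all names and defaults are hypothetical, and a real importer would also have to split RDF on statement boundaries rather than at arbitrary byte offsets.

```python
def split_chunks(data, chunk_size):
    """Yield consecutive slices of at most chunk_size bytes."""
    for offset in range(0, len(data), chunk_size):
        yield data[offset:offset + chunk_size]


def upload_with_retries(data, send, chunk_size=4, max_attempts=3):
    """Send data chunk by chunk; retry a failed chunk up to max_attempts times."""
    sent = 0
    for chunk in split_chunks(data, chunk_size):
        for attempt in range(1, max_attempts + 1):
            try:
                send(chunk)
                sent += 1
                break
            except IOError:
                if attempt == max_attempts:
                    raise  # give up on this upload after the last attempt
    return sent


# Demo with a sender that fails once, then succeeds.
received = []
failures = {"left": 1}


def flaky_send(chunk):
    if failures["left"] > 0:
        failures["left"] -= 1
        raise IOError("simulated network error")
    received.append(chunk)


chunks_sent = upload_with_retries(b"0123456789", flaky_send)
```

Because only the failed chunk is resent, a transient error late in a multi-gigabyte upload does not force the whole file to be transferred again.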

GraphDB Enterprise Workbench

• View & Management of the cluster

Connectors

• Proved to be quite stable so far

• The Elasticsearch/Solr connectors used to work only with a single worker

• We have now implemented a transactional entity pool

• Enterprise release with 6.2 (1 Master, multiple workers)
