Graph Processing with Titan and Scylla

Graph Processing with Titan and Scylla Jason Plurad Software Engineer, IBM Open Technology PMC and Committer, Apache TinkerPop

Transcript of Graph Processing with Titan and Scylla

Page 1: Graph Processing with Titan and Scylla

Graph Processing with Titan and Scylla

Jason Plurad

Software Engineer, IBM Open Technology

PMC and Committer, Apache TinkerPop

Page 2: Graph Processing with Titan and Scylla

Graphs with Titan and Scylla

➢ Graph computing

• Graph landscape

• Titan and Scylla

Page 3: Graph Processing with Titan and Scylla

Common graph data domains

• Social network analysis

• Configuration management database

• Master data management

• Recommendation engines

• Knowledge graphs

• Internet of things

Page 4: Graph Processing with Titan and Scylla

Apache TinkerPop: Graph Computing Framework

http://tinkerpop.apache.org

Page 5: Graph Processing with Titan and Scylla

Property graph and Gremlin

• Structure
  ▪ Vertex
  ▪ Edge
  ▪ Properties

• Gremlin
  ▪ Domain specific language (DSL) for graph
  ▪ Functional, data flow approach
  ▪ Full library of traversal steps
  ▪ Support for non-JVM languages
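To make the property graph structure and Gremlin's data flow style concrete, here is a minimal in-memory sketch in plain Python. It is not the TinkerPop API: the `Vertex`, `Edge`, and `Graph` classes and the sample people are hypothetical, chosen only to show how one traversal step feeds the next.

```python
from dataclasses import dataclass, field


@dataclass
class Vertex:
    id: int
    label: str
    properties: dict = field(default_factory=dict)


@dataclass
class Edge:
    out_v: int  # id of the source vertex
    in_v: int   # id of the target vertex
    label: str
    properties: dict = field(default_factory=dict)


class Graph:
    """Hypothetical toy property graph, not a TinkerPop implementation."""

    def __init__(self):
        self.vertices = {}
        self.edges = []

    def add_vertex(self, id, label, **props):
        self.vertices[id] = Vertex(id, label, props)

    def add_edge(self, out_v, in_v, label, **props):
        self.edges.append(Edge(out_v, in_v, label, props))

    def out(self, vertices, label):
        """One traversal step: follow outgoing edges with the given label."""
        for v in vertices:
            for e in self.edges:
                if e.out_v == v.id and e.label == label:
                    yield self.vertices[e.in_v]


g = Graph()
g.add_vertex(1, "person", name="alice")
g.add_vertex(2, "person", name="bob")
g.add_vertex(3, "person", name="carol")
g.add_edge(1, 2, "knows", since=2014)
g.add_edge(2, 3, "knows", since=2016)

# Roughly the data flow of the Gremlin traversal
#   g.V().has('name', 'alice').out('knows').out('knows').values('name')
start = (v for v in g.vertices.values() if v.properties.get("name") == "alice")
friends_of_friends = g.out(g.out(start, "knows"), "knows")
names = [v.properties["name"] for v in friends_of_friends]
print(names)  # ['carol']
```

Each step consumes a stream of vertices and produces a new one, which is the functional pipeline idea behind Gremlin's traversal steps.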

Page 6: Graph Processing with Titan and Scylla

Graphs with Titan and Scylla

✓ Graph computing

➢ Graph landscape

• Titan and Scylla

Page 7: Graph Processing with Titan and Scylla

Graph Landscape

• Graph database vs Graph processor
  ▪ OLTP vs OLAP
  ▪ Neighborhood vs Whole graph
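The neighborhood versus whole-graph distinction can be sketched in a few lines of plain Python. The adjacency list and both function names are hypothetical, and PageRank is used here only as a familiar example of a whole-graph computation; the slides do not name a specific algorithm.

```python
# Hypothetical toy graph as an adjacency list (vertex -> outgoing neighbours).
graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}


def neighborhood(graph, start, hops):
    """OLTP-style query: touches only vertices within `hops` of `start`."""
    frontier = {start}
    seen = {start}
    for _ in range(hops):
        frontier = {n for v in frontier for n in graph[v]} - seen
        seen |= frontier
    return seen - {start}


def pagerank(graph, damping=0.85, iterations=20):
    """OLAP-style job: every iteration visits every vertex and every edge.

    Assumes each vertex has at least one outgoing edge.
    """
    n = len(graph)
    rank = {v: 1.0 / n for v in graph}
    for _ in range(iterations):
        nxt = {v: (1.0 - damping) / n for v in graph}
        for v, outs in graph.items():
            share = damping * rank[v] / len(outs)
            for w in outs:
                nxt[w] += share
        rank = nxt
    return rank


print(neighborhood(graph, "a", 1))
print(pagerank(graph))
```

The first function's cost depends on the size of the local neighborhood, while the second scales with the whole graph, which is why the two workloads are usually served by different engines.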

Page 8: Graph Processing with Titan and Scylla

Apache Spark or Apache Giraph

• Pick a graph processor for OLAP…
  ▪ Spark is the new hotness in analytics
  ▪ Giraph is better suited for gigantic graphs

• By using Apache TinkerPop and Gremlin, we can use either one seamlessly

Page 9: Graph Processing with Titan and Scylla

Titan (Aurelius)

• Pick a graph database for OLTP…

• Pluggable storage backend

• Pluggable indexing backend

• Gift from Matthias Broecheler and Dan LaRocque

• Apache license but not in ASF?

http://titandb.io

Page 10: Graph Processing with Titan and Scylla

DataStax Enterprise Graph?

• Apache TinkerPop compliant

• Not open source

• Titan inspired

• Gremlin tooling with DataStax Studio

Page 11: Graph Processing with Titan and Scylla

Graphs with Titan and Scylla

✓ Graph computing

✓ Graph landscape

➢ Titan and Scylla

Page 12: Graph Processing with Titan and Scylla

Why Titan?

• Designed for big graphs (10B+ edges)

• Local graph traversals (OLTP)

• Batch graph processing (OLAP)

• Desire a free, open source distributed graph database

Page 13: Graph Processing with Titan and Scylla

Titan Key Features

• Data management

• Vertex-centric indices

• Graph partitioning

• Edge compression

http://s3.thinkaurelius.com/docs/titan/1.0.0/getting-started.html
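As a rough illustration of the idea behind vertex-centric indices, the sketch below keeps each vertex's edges sorted by a key so that a range query on that key avoids scanning every incident edge. This is a plain Python toy, not Titan's storage layout or API; the class name, method names, and sample data are all hypothetical.

```python
import bisect
from collections import defaultdict


class VertexCentricIndex:
    """Hypothetical per-vertex edge index, sorted by a numeric sort key."""

    def __init__(self):
        # (vertex, edge label) -> sorted list of (sort_key, neighbour)
        self._index = defaultdict(list)

    def add_edge(self, out_v, label, in_v, sort_key):
        # insort keeps the per-vertex list ordered by sort_key.
        bisect.insort(self._index[(out_v, label)], (sort_key, in_v))

    def edges_between(self, out_v, label, low, high):
        """Neighbours reached by edges with low <= sort_key < high."""
        entries = self._index.get((out_v, label), [])
        # A 1-tuple (k,) sorts before any (k, neighbour), so these
        # binary searches find the first entry with sort_key >= k.
        lo = bisect.bisect_left(entries, (low,))
        hi = bisect.bisect_left(entries, (high,))
        return [in_v for _, in_v in entries[lo:hi]]


idx = VertexCentricIndex()
idx.add_edge("v1", "rated", "item-a", 3)
idx.add_edge("v1", "rated", "item-b", 1)
idx.add_edge("v1", "rated", "item-c", 12)

print(idx.edges_between("v1", "rated", 1, 10))  # ['item-b', 'item-a']
```

The lookup is a binary search over one vertex's edges, which matters most for vertices with very many incident edges.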

Page 14: Graph Processing with Titan and Scylla

Titan Architecture

http://s3.thinkaurelius.com/docs/titan/1.0.0/arch-overview.html

Page 15: Graph Processing with Titan and Scylla

Why Scylla?

• Drop-in replacement for Cassandra 2.1.8

• Thrift support (Duarte Nunes)
  ▪ Partial support in 1.3
  ▪ Full support in 1.4

• Titan is compatible with Scylla 1.3
  ▪ OLTP with Scylla is crazy fast
  ▪ OLAP via SparkGraphComputer

https://github.com/scylladb/scylla/issues/693
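Because Scylla speaks Thrift, one plausible way to try the combination is to point Titan's Cassandra Thrift storage backend at a Scylla node. The sketch below only renders a Titan properties file from Python; the hostname and keyspace values are assumptions for a local test setup, and it presumes Thrift is enabled on the Scylla side.

```python
# Assumed values for a local test node; adjust for a real cluster.
settings = {
    "storage.backend": "cassandrathrift",   # Titan's Cassandra Thrift adapter
    "storage.hostname": "127.0.0.1",        # assumed address of a Scylla node
    "storage.cassandra.keyspace": "titan",  # keyspace Titan will use
}

# Render the settings in Java properties format, one key=value per line.
properties_text = "\n".join(f"{key}={value}" for key, value in settings.items())
print(properties_text)
```

The printed text could be saved to a properties file and passed to Titan when opening the graph.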

Page 16: Graph Processing with Titan and Scylla

Titan reawakened with Scylla

• Next steps

• Benchmarking OLTP and OLAP with Scylla

• Transition Titan to native CQL
  ▪ Essentially a rewrite
  ▪ Materialized views

• Native search in Scylla?

Page 17: Graph Processing with Titan and Scylla

• Open source leads the way

• Partner with open communities

Page 18: Graph Processing with Titan and Scylla

Thank You!

http://titandb.io

http://tinkerpop.apache.org

Twitter/GitHub @pluradj