DataStax: What's New in Apache TinkerPop - the Graph Computing Framework
What’s New in Apache TinkerPop?
Open Source Graph Computing Framework
http://tinkerpop.incubator.apache.org/
Stephen Mallette - @spmallette
© 2015. All Rights Reserved.
By Andrea Mann from London, United Kingdom (Flickr Uploaded by Hohum) [CC BY 2.0 (http://creativecommons.org/licenses/by/2.0)], via Wikimedia Commons
Georgius Agricola, De re metallica 1556
“Woman at spinning wheel with man carding” Smithfield Decretals (British Library, Royal 10 E. IV, fol. 147v), c. 1340
“Carding, Spinning and Weaving” by Giovanni Boccaccio from De claris mulieribus, 15th Century
London, British Library, Royal 18 E.iii (15th century) [Public domain], via Wikimedia Commons
[Public domain], via Wikimedia Commons
By Unknown. Photo credit: Yale University Art Gallery. In the Public Domain. [Public domain], via Wikimedia Commons
[Public domain], via Wikimedia Commons
By Dogcow (Own work) [CC BY-SA 3.0 (http://creativecommons.org/licenses/by-sa/3.0) or GFDL (http://www.gnu.org/copyleft/fdl.html)], via Wikimedia Commons
By Adam Schuster (Flickr: Proto IBM) [CC BY 2.0 (http://creativecommons.org/licenses/by/2.0)], via Wikimedia Commons
By Arnold Reinhold [CC BY-SA 2.5 (http://creativecommons.org/licenses/by-sa/2.5)], via Wikimedia Commons
Graph Data Structure
Vertex: label: person, name: Stephen
Vertex: label: book, title: Connections
Vertex: label: person, name: James
Edges: label: bought, label: wrote
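The property graph on this slide can be modelled with two small record types. This is a minimal Python sketch, not TinkerPop's API: the class names, the numeric ids and the edge endpoints (James wrote the book, Stephen bought it) are assumptions made for illustration, since the slide text only gives the labels and properties.

```python
from dataclasses import dataclass, field


@dataclass
class Vertex:
    id: int
    label: str
    properties: dict = field(default_factory=dict)


@dataclass
class Edge:
    label: str
    out_v: int  # id of the vertex the edge leaves
    in_v: int   # id of the vertex the edge enters


# Ids 1-3 are hypothetical; labels and properties come from the slide.
vertices = {
    1: Vertex(1, "person", {"name": "Stephen"}),
    2: Vertex(2, "book", {"title": "Connections"}),
    3: Vertex(3, "person", {"name": "James"}),
}

# Assumed endpoints: the slide text does not say who bought and who wrote.
edges = [Edge("bought", 1, 2), Edge("wrote", 3, 2)]


def out(vertex_id, edge_label):
    """Vertices reached by following outgoing edges with the given label."""
    return [
        vertices[e.in_v]
        for e in edges
        if e.out_v == vertex_id and e.label == edge_label
    ]


print([v.properties["title"] for v in out(3, "wrote")])
```

Vertices and edges both carry a label, and vertices carry key/value properties; that is all the structure the later queries rely on.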
Gremlin in TinkerPop3
is NOT “just ”
It is advised not to use lambda (ƛ) expressions
supports BOTH imperative and declarative querying
$ bin/gremlin.sh

         \,,,/
         (o o)
-----oOOo-(3)-oOOo-----
plugin activated: tinkerpop.server
plugin activated: tinkerpop.utilities
plugin activated: tinkerpop.tinkergraph
gremlin> graph = GraphFactory.open("graph.properties")
==>tinkergraph[vertices:0 edges:0]
gremlin> graph.io(gryo()).readGraph('data.kryo')
==>null
gremlin> graph
==>tinkergraph[vertices:1933 edges:4125]
Graph schema (diagram)
Vertices: person, response, discussion
Edges: wrote, hasResponse, participatesIn, hasRoot
gremlin> g = graph.traversal()
==>graphtraversalsource[tinkergraph[vertices:1933 edges:4125], standard]
“Find the vertex with id 4608”
gremlin> g.V(4608)
==>v[4608]
Diagram: g.V(4608) selects the person vertex 4608.
“Get the value of the ‘userName’ property on vertex 4608”
gremlin> g.V(4608).values('userName')
==>Renlit
Diagram: g.V(4608) selects the person vertex; .values('userName') reads its userName property, Renlit.
“Find the responses posted by ‘Renlit’”
gremlin> g.V(4608).out('wrote')
==>v[354560]
==>v[640768]
...
==>v[466432]
Diagram: from person 4608, .out('wrote') follows the wrote edges to response vertices.
“Find the number of responses posted by ‘Renlit’”
gremlin> g.V(4608).out('wrote').count()
==>67
Diagram: from person 4608, .out('wrote') reaches the response vertices and .count() returns 67.
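Each Gremlin step consumes a stream of traversers and emits a new one, so g.V(4608).out('wrote').count() reads left to right as a pipeline. The sketch below mimics that pipeline with Python generators over a tiny hand-made edge list. The edge data and function names are hypothetical, and this is not how TinkerPop is implemented.

```python
# Hypothetical toy data: (out_vertex_id, edge_label, in_vertex_id).
EDGES = [
    (4608, "wrote", 101),
    (4608, "wrote", 102),
    (4608, "wrote", 103),
    (5888, "wrote", 104),
    (101, "hasResponse", 104),
]


def V(*ids):
    """Start step: emit the requested vertex ids."""
    yield from ids


def out(traversers, label):
    """For each incoming vertex, emit the vertices its labeled outgoing edges point to."""
    for v in traversers:
        for src, edge_label, dst in EDGES:
            if src == v and edge_label == label:
                yield dst


def count(traversers):
    """Reducing step: consume the stream and return its size."""
    return sum(1 for _ in traversers)


print(list(out(V(4608), "wrote")))   # responses written by vertex 4608
print(count(out(V(4608), "wrote")))  # how many there are
```

Nothing is computed until the final step pulls from the stream, which is the same lazy, step-by-step reading the slides use to explain each query.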
gremlin> t = g.V(4608).out('wrote').count();null
==>null
gremlin> t.strategies.toList()
==>ConjunctionStrategy
==>IncidentToAdjacentStrategy
==>AdjacentToIncidentStrategy
==>IdentityRemovalStrategy
==>DedupBijectionStrategy
==>MatchPredicateStrategy
==>RangeByIsCountStrategy
==>TinkerGraphStepStrategy
==>ProfileStrategy
==>EngineDependentStrategy
==>ComputerVerificationStrategy
==>StandardVerificationStrategy
Strategy Application
t.strategies.toList()
Original Query: g.V(4608).out('wrote').count()
Post-Strategies: g.V(4608).outE('wrote').count()
Strategy that changed the query: AdjacentToIncidentStrategy (counting the wrote edges gives the same result without visiting the adjacent vertices)
Other registered strategies: ConjunctionStrategy, IncidentToAdjacentStrategy, IdentityRemovalStrategy, DedupBijectionStrategy, MatchPredicateStrategy, RangeByIsCountStrategy, TinkerGraphStepStrategy, ProfileStrategy, EngineDependentStrategy, ComputerVerificationStrategy, StandardVerificationStrategy
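A strategy can be thought of as a function that rewrites the list of steps before the traversal runs. The sketch below is a toy model of the rewrite shown on this slide, covering only the out(...).count() case. The step representation and function names are hypothetical and do not reflect TinkerPop's internal classes.

```python
def adjacent_to_incident(steps):
    """Rewrite out(label) to outE(label) when the next step is count()."""
    rewritten = list(steps)
    for i in range(len(rewritten) - 1):
        name, args = rewritten[i]
        if name == "out" and rewritten[i + 1][0] == "count":
            rewritten[i] = ("outE", args)
    return rewritten


def render(steps):
    """Print a step list in Gremlin-like notation."""
    parts = []
    for name, args in steps:
        parts.append(f"{name}({', '.join(repr(a) for a in args)})")
    return "g." + ".".join(parts)


# A traversal modelled as (step name, arguments) pairs.
original = [("V", (4608,)), ("out", ("wrote",)), ("count", ())]

print(render(original))
print(render(adjacent_to_incident(original)))
```

The rewrite is safe because every wrote edge leads to exactly one adjacent vertex, so counting edges and counting the vertices behind them give the same number.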
“Get a distribution over the authors who replied to ‘Renlit’”
gremlin> g.V(4608).as('a').out('wrote').out('hasResponse').in('wrote').where(neq('a')).groupCount().next()
==>v[5376]=4
==>v[2304]=2
==>v[5888]=7
...
==>v[10496]=1
Diagram: from person 4608, .out('wrote') reaches the responses, .out('hasResponse') reaches the replies to them, and .in('wrote') reaches the persons who wrote those replies; .where(neq('a')) drops 4608 itself before .groupCount().
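The same walk can be written as nested loops, which makes the role of each step explicit. This is a plain-Python sketch over hypothetical toy data (the response ids and the two dictionaries are invented for illustration); only the vertex id 4608 and the step names come from the slide.

```python
from collections import Counter

# Hypothetical toy data.
WROTE = {            # author id -> ids of the responses they wrote
    4608: [1, 2],
    5888: [3, 4],
    2304: [5],
}
HAS_RESPONSE = {     # response id -> ids of the replies to it
    1: [3, 5, 2],    # reply 2 was written by 4608 (a self-reply)
    2: [4],
}


def authors_of(response_id):
    """Follow 'wrote' edges backwards: who wrote this response?"""
    return [a for a, responses in WROTE.items() if response_id in responses]


def reply_authors(start):
    counts = Counter()
    for post in WROTE.get(start, []):               # out('wrote')
        for reply in HAS_RESPONSE.get(post, []):    # out('hasResponse')
            for author in authors_of(reply):        # in('wrote')
                if author != start:                 # where(neq('a'))
                    counts[author] += 1             # groupCount()
    return counts


print(reply_authors(4608))
```

Without the where(neq('a')) filter, the self-reply in the toy data would put the starting author into their own distribution.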
“Get a distribution over the ‘responseLevel’ value for posts by ‘Renlit’”
gremlin> g.V(4608).out('wrote').values('responseLevel').groupCount()
==>[1:11, 2:19, 3:22, 4:9, 5:3, 6:3]
“Get a distribution over the ‘responseLevel’ for all posts in the graph”
gremlin> g.V().has('type','response').values('responseLevel').groupCount()
==>[1:358, 2:796, 3:445, 4:150, 5:57, 6:13, 7:4, 8:1]
gremlin> g.V(4608).out('wrote').values('responseLevel').groupCount()
==>[1:11, 2:19, 3:22, 4:9, 5:3, 6:3]
gremlin> g.V().has('type','response').values('responseLevel').groupCount()
==>[1:358, 2:796, 3:445, 4:150, 5:57, 6:13, 7:4, 8:1]
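The two groupCount() results are raw counts over very different totals, so they are easier to compare as shares. The sketch below normalizes the numbers printed on the slides; the variable and function names are illustrative only.

```python
# Counts copied from the two groupCount() results above.
renlit = {1: 11, 2: 19, 3: 22, 4: 9, 5: 3, 6: 3}
everyone = {1: 358, 2: 796, 3: 445, 4: 150, 5: 57, 6: 13, 7: 4, 8: 1}


def shares(counts):
    """Turn absolute counts into fractions of the total."""
    total = sum(counts.values())
    return {level: n / total for level, n in counts.items()}


renlit_share = shares(renlit)
overall_share = shares(everyone)

for level in sorted(everyone):
    print(
        f"level {level}: "
        f"Renlit {renlit_share.get(level, 0.0):6.1%}  "
        f"all posts {overall_share[level]:6.1%}"
    )
```

The Renlit counts sum to 67, matching the count() result earlier in the deck. Renlit's posts peak at responseLevel 3, while the graph as a whole peaks at level 2.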
gremlin> :install org.apache.tinkerpop hadoop-gremlin 3.0.0-incubating
==>Loaded: [org.apache.tinkerpop, hadoop-gremlin, 3.0.0-incubating] - restart the console to use [tinkerpop.hadoop]
gremlin> :exit
...
$ bin/gremlin.sh

         \,,,/
         (o o)
-----oOOo-(3)-oOOo-----
plugin activated: tinkerpop.server
plugin activated: tinkerpop.utilities
plugin activated: tinkerpop.tinkergraph
gremlin> :plugin use tinkerpop.hadoop
==>tinkerpop.hadoop activated
gremlin> hdfs.copyFromLocal('data.kryo', 'data.kryo')
==>null
gremlin> hdfs.ls()
==>rw-r--r-- smallette supergroup 5782840 data.kryo
gremlin> graph = GraphFactory.open('conf/hadoop/data-gryo.properties')
==>hadoopgraph[gryoinputformat->gryooutputformat]
gremlin> g = graph.traversal(computer(SparkGraphComputer))
==>graphtraversalsource[hadoopgraph[gryoinputformat->gryooutputformat],sparkgraphcomputer]
gremlin> g.V(4608).out('wrote').values('responseLevel').groupCount()
==>[1:11, 2:19, 3:22, 4:9, 5:3, 6:3]
gremlin> g.V().has('type','response').values('responseLevel').groupCount()
==>[1:358, 2:796, 3:445, 4:150, 5:57, 6:13, 7:4, 8:1]
Any Graph System
Gremlin steps shown: g.V(4608), g.V(), out().in(), groupCount()
Systems shown: Neo4j, Titan, Sqlg, BlueMix, Hadoop, Giraph, Spark, OrientDB, ...
gremlin> :plugin use tinkerpop.gephi
==>tinkerpop.gephi activated
gremlin> :remote connect tinkerpop.gephi
==>Connection to Gephi - http://localhost:8080/workspace0 with stepDelay:1000, startRGBColor:[0.0, 1.0, 0.5], colorToFade:g, colorFadeRate:0.7, startSize:20.0,sizeDecrementRate:0.33
gremlin> :> graph
==>tinkergraph[vertices:1933 edges:4125]
gremlin> g.V(10240).values('userName')
==>Naya
gremlin> g.V(5888).values('userName')
==>Loret
gremlin> subGraph = g.V(10240,5888).repeat(__.outE().subgraph('subGraph').inV()).times(10).cap('subGraph').next()
==>tinkergraph[vertices:1152 edges:1343]
gremlin> :> subGraph
Gephi screenshot: vertices labeled Naya and Loret.
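The repeat(outE().subgraph('subGraph').inV()).times(10) pattern keeps every edge crossed while walking outwards from the start vertices for a fixed number of hops. A rough plain-Python equivalent is sketched below. The edge list and the function name are hypothetical, and the frontier is deduplicated for brevity, which does not change the resulting edge set.

```python
def out_subgraph(edges, start_ids, hops):
    """Collect the edges crossed by following at most `hops` outgoing edges."""
    kept_edges = set()
    frontier = set(start_ids)
    for _ in range(hops):                       # times(hops)
        next_frontier = set()
        for src, label, dst in edges:
            if src in frontier:                 # outE()
                kept_edges.add((src, label, dst))   # subgraph('subGraph')
                next_frontier.add(dst)          # inV()
        frontier = next_frontier
    kept_vertices = {v for src, _, dst in kept_edges for v in (src, dst)}
    return kept_vertices, kept_edges


# Hypothetical toy data: (out_vertex_id, edge_label, in_vertex_id).
EDGES = [
    (1, "wrote", 2),
    (2, "hasResponse", 3),
    (3, "hasResponse", 4),
    (5, "wrote", 3),
    (6, "wrote", 7),
]

vertices, edges = out_subgraph(EDGES, [1, 5], hops=2)
print(sorted(vertices), len(edges))
```

Vertices 6 and 7 are left out because no outgoing path from the start vertices reaches them, which is how the deck's query shrinks the graph around two users before sending it to Gephi.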
gremlin> :remote config visualTraversal subGraph svg
==>Connection to Gephi - http://localhost:8080/workspace0 with stepDelay:1000, startRGBColor:[0.0, 1.0, 0.5], colorToFade:g, colorFadeRate:0.7, startSize:20.0,sizeDecrementRate:0.33
gremlin> svg
==>graphtraversalsource[tinkergraph[vertices:1152 edges:1343], standard]
gremlin> svg.strategies.toList()
==>ConjunctionStrategy
==>IncidentToAdjacentStrategy
==>AdjacentToIncidentStrategy
==>IdentityRemovalStrategy
==>FilterRankingStrategy
==>MatchPredicateStrategy
==>RangeByIsCountStrategy
==>TinkerGraphStepStrategy
==>EngineDependentStrategy
==>GephiTraversalVisualizationStrategy
==>ProfileStrategy
==>ComputerVerificationStrategy
gremlin> :> svg.V(10240).as('x').out('wrote').out('hasResponse').in('wrote').where(neq('x')).groupCount()
==>[v[5888]:4]
Takeaways
If you have connected data, use a Graph DB
If you use a Graph DB, consider Apache TinkerPop
If you use Apache TinkerPop, get started with Gremlin Console