Shutl

Transcript of Shutl

Page 1: Shutl

Shutl delivers with Neo4j

Tuesday, 30 July 13

Page 2: Shutl

Volker Pacher

senior developer @shutl

@vpacher

http://github.com/vpacher

Page 3: Shutl

Page 4: Shutl

Page 5: Shutl

Problems?

Page 6: Shutl

http://xkcd.com/287/

Page 7: Shutl

problems with our previous attempt (v1):

• exponential growth of joins in MySQL as features were added

• code base too complex and unmaintainable

• API response time grew too large as more data was added

• our fastest delivery was quicker than our slowest query!

Page 8: Shutl

The case for graph databases:

• relationships are stored explicitly (RDBMSs lack explicit relationships)

• domain modelling is simplified because adding new ‘subgraphs‘ doesn’t affect the existing structure and queries (additive model)

• white board friendly

• schema-less

• db performance remains relatively constant as the data grows, because each query is localized to its own portion of the graph: roughly O(1) for the same query

• traversals of relationships are easy and very fast

Page 9: Shutl

What is a graph anyway?

a collection of vertices (nodes)

connected by edges (relationships)

[Figure: four vertices, Node 1 to Node 4, connected by edges]
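The definition above can be sketched in a few lines of Python. This is only an illustration: the four node names come from the slide, but which nodes are joined by an edge is an assumption, since the figure itself is not recoverable from the transcript.

```python
# Vertices named as on the slide.
vertices = {"Node 1", "Node 2", "Node 3", "Node 4"}

# Hypothetical edges: which nodes are connected is an assumption,
# not something the transcript states. Each edge is an unordered pair.
edges = {
    frozenset({"Node 1", "Node 2"}),
    frozenset({"Node 2", "Node 3"}),
    frozenset({"Node 3", "Node 4"}),
    frozenset({"Node 4", "Node 1"}),
}


def neighbours(vertex):
    """Return every vertex that shares an edge with `vertex`."""
    return {
        other
        for edge in edges
        if vertex in edge
        for other in edge
        if other != vertex
    }


print(sorted(neighbours("Node 1")))
```

With the assumed edges, Node 1 is adjacent to Node 2 and Node 4.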

Page 10: Shutl

a short history: the seven bridges of Königsberg (1735)

Leonhard Euler

Page 11: Shutl

directed graph

each relationship has a direction, i.e. one start node and one end node

[Figure: Node 1 to Node 4 connected by directed relationships]
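A rough sketch of what direction adds: each relationship becomes an ordered (start, end) pair, so "outgoing" and "incoming" are different questions. The pairs below are hypothetical; the transcript does not say which way the arrows on the slide point.

```python
# Hypothetical directed relationships as (start, end) pairs.
# The actual arrows on the slide are not recoverable from the transcript.
directed_edges = [
    ("Node 1", "Node 2"),
    ("Node 2", "Node 3"),
    ("Node 3", "Node 4"),
    ("Node 4", "Node 1"),
]


def outgoing(node):
    """End nodes of relationships that start at `node`."""
    return [end for start, end in directed_edges if start == node]


def incoming(node):
    """Start nodes of relationships that end at `node`."""
    return [start for start, end in directed_edges if end == node]


print(outgoing("Node 1"), incoming("Node 1"))
```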

Page 12: Shutl

property graph:

• nodes contain properties (key, value)

• relationships have a type and are always directed

• relationships can contain properties too

[Figure: nodes with the properties name: Volker, name: Sam, name: Megan and name: Paul, connected by :friends, :knows and :works_for relationships; one :knows relationship carries the property since: 2005]
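The three rules of the property graph model can be sketched as two small Python classes. The class names, the `related` helper and the choice of which people are connected are all assumptions made for illustration; only the property names, the relationship types and the `since: 2005` value come from the slide.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    # Nodes contain properties (key, value).
    properties: dict = field(default_factory=dict)


@dataclass
class Relationship:
    # Relationships have a type and are always directed (start -> end).
    start: Node
    type: str
    end: Node
    # Relationships can contain properties too.
    properties: dict = field(default_factory=dict)


volker = Node({"name": "Volker"})
sam = Node({"name": "Sam"})

# Which nodes each relationship connects is an assumption for this sketch.
relationships = [
    Relationship(volker, "knows", sam, {"since": 2005}),
    Relationship(sam, "friends", volker),
]


def related(node, rel_type):
    """Names of nodes reached from `node` over outgoing `rel_type` relationships."""
    return [
        rel.end.properties["name"]
        for rel in relationships
        if rel.start is node and rel.type == rel_type
    ]


print(related(volker, "knows"))
```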

Page 13: Shutl

Page 14: Shutl

a graph is its own index (constant query performance)
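One way to picture this claim is to count how many relationship records a neighbour lookup has to read. The sketch below is a toy model, not a description of Neo4j internals: `build_ring`, `neighbours_with_cost` and `scan_edge_list` are hypothetical helpers, and the comparison is against an edge list with no index at all, which a real relational database would normally avoid by indexing.

```python
def build_ring(size):
    """Ring of `size` nodes in which each node points at the next one."""
    adjacency = {n: [(n + 1) % size] for n in range(size)}
    edge_list = [(n, (n + 1) % size) for n in range(size)]
    return adjacency, edge_list


def neighbours_with_cost(adjacency, node):
    """Follow the node's own relationships; cost is the number of records read."""
    found = adjacency[node]
    return found, len(found)


def scan_edge_list(edge_list, node):
    """Find the same neighbours by reading every record of an unindexed edge list."""
    reads = 0
    found = []
    for start, end in edge_list:
        reads += 1
        if start == node:
            found.append(end)
    return found, reads


for size in (10, 100_000):
    adjacency, edge_list = build_ring(size)
    print(size, neighbours_with_cost(adjacency, 3), scan_edge_list(edge_list, 3))
```

In this model the adjacency lookup reads one record for both graph sizes, while the unindexed scan reads as many records as the graph has relationships.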

Page 15: Shutl

Page 16: Shutl

Querying the graph: Cypher

• declarative query language specific to Neo4j

• easy to learn and intuitive

• enables the user to specify patterns to query for (something that looks like ‘this’)

• inspired partly by SQL (WHERE and ORDER BY) and SPARQL (pattern matching)

• focuses on what to query for and not how to query for it

• the switch from a MySQL world is made easier by using Cypher instead of having to learn a traversal framework straight away

Page 17: Shutl

Cypher clauses

• START: Starting points in the graph, obtained via index lookups or by element IDs.

• MATCH: The graph pattern to match, bound to the starting points in START.

• WHERE: Filtering criteria.

• RETURN: What to return.

• CREATE: Creates nodes and relationships.

• DELETE: Removes nodes, relationships and properties.

• SET: Set values to properties.

• FOREACH: Performs updating actions once per element in a list.

• WITH: Divides a query into multiple, distinct parts.

Page 18: Shutl

an example graph

Node 1: me

Node 2: Steve

Node 3: Sam

Node 4: David

Node 5: Megan

me -[:knows]-> Steve -[:knows]-> David

me -[:knows]-> Sam -[:knows]-> Megan

Megan -[:knows]-> David

Page 19: Shutl

the query

START me=node(1)
MATCH me-[:knows]->()-[:knows]->fof
RETURN fof
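To make the pattern concrete, here is a minimal Python sketch of the same friend-of-friend traversal over the example graph from the previous slide. It is not how Neo4j executes Cypher; the `knows` dictionary and the `friends_of_friends` function are illustrative names, with the relationships copied from the slide.

```python
# Outgoing :knows relationships of the example graph.
knows = {
    "me": ["Steve", "Sam"],
    "Steve": ["David"],
    "Sam": ["Megan"],
    "Megan": ["David"],
    "David": [],
}


def friends_of_friends(start):
    """Follow exactly two outgoing :knows hops, like me-[:knows]->()-[:knows]->fof."""
    return [fof for friend in knows[start] for fof in knows[friend]]


print(friends_of_friends("me"))  # ['David', 'Megan']
```

David is reached through Steve and Megan through Sam; the anonymous `()` in the query corresponds to the intermediate `friend` that is matched but not returned.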

Page 20: Shutl

START me=node(1)
MATCH me-[:knows*2..]->fof
WHERE fof.name =~ 'Da.*'
RETURN fof
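The variable-length pattern `[:knows*2..]` matches paths of two or more `:knows` hops, and the WHERE clause keeps only end nodes whose name matches the regular expression. A rough Python sketch over the example graph follows; `paths_from` and `matching_ends` are hypothetical helpers, and the "do not revisit a node" guard is a simplification of Cypher's own path rules that happens to be equivalent on this acyclic graph.

```python
import re

# Outgoing :knows relationships of the example graph.
knows = {
    "me": ["Steve", "Sam"],
    "Steve": ["David"],
    "Sam": ["Megan"],
    "Megan": ["David"],
    "David": [],
}


def paths_from(start, min_hops):
    """Yield every path from `start` that uses at least `min_hops` relationships."""
    stack = [[start]]
    while stack:
        path = stack.pop()
        if len(path) - 1 >= min_hops:
            yield path
        for nxt in knows[path[-1]]:
            if nxt not in path:  # simplification: never revisit a node
                stack.append(path + [nxt])


def matching_ends(start, min_hops, pattern):
    """End-node names of qualifying paths whose name fully matches `pattern`."""
    return [
        path[-1]
        for path in paths_from(start, min_hops)
        if re.fullmatch(pattern, path[-1])
    ]


print(matching_ends("me", 2, "Da.*"))  # ['David', 'David']
```

David shows up twice because two different paths reach him (via Steve, and via Sam and Megan); a pattern match of this kind yields one result per matched path unless duplicates are removed explicitly.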

Page 21: Shutl

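representing dates/times

[Figure: a tree rooted at root (0). Relationships named after the year (2013, 2014) lead to Year nodes, relationships named after the month (01, 05, 06) lead to Month nodes, relationships named after the day (24, 25, 26) lead to Day nodes, and happens relationships connect Day nodes to Event 1, Event 2 and Event 3.]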

Page 22: Shutl

find all events on a specific day

START root=node(0)
MATCH root-[:`2013`]-()-[:`05`]-()-[:`24`]-()-[:happens]-event
RETURN event
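The lookup walks from the root through one year, one month and one day relationship and then collects the events attached to that day. A minimal Python sketch of the same idea uses nested dictionaries whose keys play the role of the relationship names. Which event falls on which day is an assumption here, because the transcript does not preserve that part of the figure, and `events_on` is a hypothetical helper.

```python
# Nested dictionaries stand in for the date tree: year -> month -> day -> events.
# The placement of the events is assumed for illustration.
root = {
    "2013": {
        "01": {},
        "05": {
            "24": ["Event 1", "Event 2"],
            "25": ["Event 3"],
        },
    },
    "2014": {
        "06": {"26": []},
    },
}


def events_on(year, month, day):
    """Follow the year, month and day keys from the root; return that day's events."""
    node = root
    for key in (year, month, day):
        node = node.get(key)
        if node is None:  # no such branch in the tree
            return []
    return list(node)


print(events_on("2013", "05", "24"))
```

Each step only inspects the children of the current node, so the cost of the lookup depends on the depth of the tree rather than on the total number of events stored.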
