Graph Databases and Neo4j - University of Stirling
19/08/2015
Graph Databases and Neo4j
Kevin Swingler
Relationships
• We all know how a relational database models
relationships
• But there are limitations to the approach
– A foreign-key relationship can’t have a type or any properties of its own
– Permissible relationships (PK and FK) are strictly
defined in the schema and cannot be added in an ad-hoc way
– Relationships must be implemented within the relational model (as keys and join tables)
Flexible Relationships
• Imagine a set of objects that can have
arbitrary properties and arbitrary relationships
between the objects
[Figure: two nodes, Fish (Lives in: Water; Moves: Swims) and Cat (Lives in: Land; Blood: Warm), joined by relationships labelled "Eats" and "Likes: a lot"]
Main Graph DB Features
• Each entity (object) can have different properties, just like a document database
• Any entity can have a relationship with any other entity
• Relationships have a type, and any pair of entities can have a relationship of any type
• Relationships have properties, so can be thought of as entities that join other entities
• Entity pairs can have more than one relationship
Anatomy of a Graph DB
• Nodes represent entities, for example people in a social network or an organisation
• Edges represent relationships, e.g. ‘Works for’
• Edges are directional: A works for B doesn’t mean B works for A
• So relationships are INCOMING or OUTGOING with respect to a node
• Edges have properties: A works for B: since 2003, as secretary
Labels, Types and Properties
• Nodes
– Label: E.g. Person, Movie
– Properties: E.g. Name, Age
• Edges
– Type: E.g. Works for, Loves
– Properties: E.g. Since when, how much
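The anatomy above (labelled nodes, typed and directed edges, properties on both) can be sketched in plain Java. This is an illustrative in-memory model only, not Neo4j's API; the names `PropertyGraphSketch`, `Node`, `Edge` and `relate` are assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative in-memory property graph; NOT the Neo4j API.
final class PropertyGraphSketch {

    // A node carries a label and arbitrary key/value properties.
    static final class Node {
        final String label;
        final Map<String, Object> properties = new HashMap<>();
        final List<Edge> outgoing = new ArrayList<>();
        final List<Edge> incoming = new ArrayList<>();

        Node(String label) {
            this.label = label;
        }
    }

    // An edge is directed, has a type, and carries its own properties.
    static final class Edge {
        final Node from;
        final Node to;
        final String type;
        final Map<String, Object> properties = new HashMap<>();

        Edge(Node from, String type, Node to) {
            this.from = from;
            this.type = type;
            this.to = to;
        }
    }

    // Connect two nodes and register the edge on both endpoints.
    static Edge relate(Node from, String type, Node to) {
        Edge edge = new Edge(from, type, to);
        from.outgoing.add(edge);
        to.incoming.add(edge);
        return edge;
    }

    public static void main(String[] args) {
        Node a = new Node("Person");
        a.properties.put("Name", "A");
        Node b = new Node("Person");
        b.properties.put("Name", "B");

        // "A works for B: since 2003, as secretary"
        Edge worksFor = relate(a, "WORKS_FOR", b);
        worksFor.properties.put("since", 2003);
        worksFor.properties.put("role", "secretary");

        // A second relationship between the same pair is allowed.
        relate(a, "LIKES", b);

        System.out.println(a.outgoing.size()); // prints 2
        System.out.println(b.outgoing.size()); // prints 0: edges are directional
    }
}
```

Keeping separate incoming and outgoing lists on each node is one simple way to make direction explicit; a real graph database stores this very differently.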
Example
Relationship Depth
• Relationship depth measures the steps
between one entity and another
• For example Friend is depth 1, friend of a
friend is depth 2, etc.
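As a rough illustration, relationship depth can be computed with a breadth-first search that counts steps from one entity to another. The sketch below uses a plain adjacency map rather than Neo4j; the class name `RelationshipDepth` and the people in the sample data are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Queue;

// Illustrative sketch: relationship depth as the shortest number of steps.
final class RelationshipDepth {

    // Returns the number of steps from start to target, or -1 if unreachable.
    static int depth(Map<String, List<String>> friends, String start, String target) {
        Map<String, Integer> distance = new HashMap<>();
        Queue<String> queue = new ArrayDeque<>();
        distance.put(start, 0);
        queue.add(start);
        while (!queue.isEmpty()) {
            String current = queue.remove();
            if (current.equals(target)) {
                return distance.get(current);
            }
            for (String next : friends.getOrDefault(current, List.of())) {
                // Only the first visit counts, so each distance is the shortest one.
                if (!distance.containsKey(next)) {
                    distance.put(next, distance.get(current) + 1);
                    queue.add(next);
                }
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // Hypothetical friendship data.
        Map<String, List<String>> friends = Map.of(
            "Ann", List.of("Bob"),
            "Bob", List.of("Ann", "Cat"),
            "Cat", List.of("Bob"));
        System.out.println(depth(friends, "Ann", "Bob")); // prints 1 (friend)
        System.out.println(depth(friends, "Ann", "Cat")); // prints 2 (friend of a friend)
    }
}
```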
Graph Traversal
• Traversing a graph draws out relationships
• Traversing means moving from one node to
another along the relationship edges
• As a node can have more than one
relationship, traversal is not trivial
• There are algorithms that try to optimise the
traversal of a graph
Traversal Type
• A graph traversal starts with a chosen node,
either a specified root, or any given node
• It can follow INCOMING or OUTGOING edges,
so it can go in either direction
• Useful for asking “Who works for A?” or “Who
does B report to?”
• Can traverse DEPTH_FIRST or BREADTH_FIRST
DEPTH or BREADTH First
• From a node with multiple edges, leading to
long paths:
– Depth first follows the first path to its end, then
returns and follows the second ...
– Breadth first follows all the first steps first, then
lists the depth 2 paths, and so on
Example (Right to Left)
From Morpheus, starting with the edge to Reagan, following KNOWS
Breadth first order is (Reagan), (Trinity), (Reagan – Agent Smith)
Depth first is (Reagan – Agent Smith), (Trinity)
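The two orders can be reproduced with a small sketch in plain Java (not Neo4j's traversal framework). The KNOWS edges below are an assumption reconstructed from the example text, and the class name `TraversalOrder` is hypothetical. This depth-first version records every path as it is extended, so it also lists (Reagan) before (Reagan - Agent Smith).

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.Map;

// Illustrative sketch of breadth-first vs depth-first path ordering.
final class TraversalOrder {

    // Assumed KNOWS edges; the order of each list is the order edges are followed.
    static final Map<String, List<String>> KNOWS = Map.of(
        "Morpheus", List.of("Reagan", "Trinity"),
        "Reagan", List.of("Agent Smith"));

    // Breadth first: all depth-1 paths, then all depth-2 paths, and so on.
    static List<String> breadthFirst(String start) {
        List<String> paths = new ArrayList<>();
        Deque<List<String>> queue = new ArrayDeque<>();
        queue.add(List.of(start));
        while (!queue.isEmpty()) {
            List<String> path = queue.removeFirst();
            String last = path.get(path.size() - 1);
            for (String next : KNOWS.getOrDefault(last, List.of())) {
                List<String> longer = new ArrayList<>(path);
                longer.add(next);
                paths.add(render(longer));
                queue.addLast(longer);
            }
        }
        return paths;
    }

    // Depth first: follow each edge to the end of its path before the next edge.
    // No cycle check is needed here because the sample graph has no cycles.
    static List<String> depthFirst(String start) {
        List<String> paths = new ArrayList<>();
        visit(new ArrayList<>(List.of(start)), paths);
        return paths;
    }

    private static void visit(List<String> path, List<String> paths) {
        String last = path.get(path.size() - 1);
        for (String next : KNOWS.getOrDefault(last, List.of())) {
            path.add(next);
            paths.add(render(path));
            visit(path, paths);
            path.remove(path.size() - 1); // backtrack
        }
    }

    // Render a path without its start node, e.g. "Reagan - Agent Smith".
    private static String render(List<String> path) {
        return String.join(" - ", path.subList(1, path.size()));
    }

    public static void main(String[] args) {
        System.out.println(breadthFirst("Morpheus")); // [Reagan, Trinity, Reagan - Agent Smith]
        System.out.println(depthFirst("Morpheus"));   // [Reagan, Reagan - Agent Smith, Trinity]
    }
}
```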
Indexing
• Nodes and edges are indexed to speed
searches for single entities and relationships
• Means graph doesn’t need to be traversed
[Figure: an index pointing into a graph containing the nodes Neo and Trinity and the Loves relationships]
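The idea behind indexing can be sketched as a lookup table kept alongside the stored nodes, so a single entity is found without visiting the others. This is a minimal sketch and not how Neo4j implements its indexes; the class `NameIndexSketch` and its methods are hypothetical, and it assumes node names are unique.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch of an index over node names; NOT Neo4j's index implementation.
final class NameIndexSketch {

    // The node id is simply the position in this list.
    private final List<String> nodeNames = new ArrayList<>();
    // Index from name to node id (assumes names are unique).
    private final Map<String, Integer> indexByName = new HashMap<>();

    // Store a node and record it in the index at the same time.
    int addNode(String name) {
        int id = nodeNames.size();
        nodeNames.add(name);
        indexByName.put(name, id);
        return id;
    }

    // Without an index: visit nodes one by one until the name matches.
    int findByScan(String name) {
        for (int id = 0; id < nodeNames.size(); id++) {
            if (nodeNames.get(id).equals(name)) {
                return id;
            }
        }
        return -1;
    }

    // With an index: a single hash lookup, no walk over the stored nodes.
    int findByIndex(String name) {
        return indexByName.getOrDefault(name, -1);
    }

    public static void main(String[] args) {
        NameIndexSketch graph = new NameIndexSketch();
        graph.addNode("Neo");
        graph.addNode("Trinity");
        System.out.println(graph.findByIndex("Trinity")); // prints 1
        System.out.println(graph.findByIndex("Morpheus")); // prints -1
    }
}
```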
Why Use a Graph?
• Many data structures are examples of graphs:
– Linked lists
– Trees
– Maps
• So a graph is a generic data structure
• One way to address the impedance mismatch we discussed in lecture 1 – that objects in your Java don’t match the structure of a DB
• The maths and algorithms of graphs are well understood
Facebook Graphs
• Using Touchgraph
Facebook Relationships
• Friends is the obvious one
• But you might also include
– Liked
• How many times
– Commented on
– In a relationship with
– Has chatted with
Queries
• Who likes Me?
• Who has liked something I’ve posted?
• Who likes somebody I’ve liked?
• Which of my friends have chatted with each
other?
• Do any of my old school friends know any of
my university friends?
Query = Traversal
• The query “Who likes me” requires a traversal
of depth 1 along the incoming edges to me with
the type “Likes”
• “Do any of my old school friends know any of
my university friends” requires a traversal
from you to all your friends, then from friend
to friend
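The two queries above can be sketched as traversals over a plain list of typed, directed edges. This is an illustration in plain Java, not Neo4j's API; the class `QueryAsTraversal`, the edge types (`LIKES`, `KNOWS`, `SCHOOL_FRIEND`, `UNIVERSITY_FRIEND`) and the sample people are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch: queries expressed as short traversals over typed edges.
final class QueryAsTraversal {

    static final class Edge {
        final String from;
        final String type;
        final String to;

        Edge(String from, String type, String to) {
            this.from = from;
            this.type = type;
            this.to = to;
        }
    }

    // Depth-1 traversal over INCOMING edges of one type ("Who likes me?").
    static List<String> incoming(List<Edge> edges, String node, String type) {
        List<String> sources = new ArrayList<>();
        for (Edge e : edges) {
            if (e.to.equals(node) && e.type.equals(type)) {
                sources.add(e.from);
            }
        }
        return sources;
    }

    // Depth-1 traversal over OUTGOING edges of one type.
    static List<String> outgoing(List<Edge> edges, String node, String type) {
        List<String> targets = new ArrayList<>();
        for (Edge e : edges) {
            if (e.from.equals(node) && e.type.equals(type)) {
                targets.add(e.to);
            }
        }
        return targets;
    }

    // From me to my friends, then from friend to friend.
    static List<String> schoolKnowsUniversity(List<Edge> edges, String me) {
        List<String> university = outgoing(edges, me, "UNIVERSITY_FRIEND");
        List<String> pairs = new ArrayList<>();
        for (String school : outgoing(edges, me, "SCHOOL_FRIEND")) {
            for (String known : outgoing(edges, school, "KNOWS")) {
                if (university.contains(known)) {
                    pairs.add(school + " knows " + known);
                }
            }
        }
        return pairs;
    }

    public static void main(String[] args) {
        List<Edge> edges = List.of(
            new Edge("Me", "SCHOOL_FRIEND", "Sam"),
            new Edge("Me", "UNIVERSITY_FRIEND", "Una"),
            new Edge("Sam", "KNOWS", "Una"),
            new Edge("Bob", "LIKES", "Me"),
            new Edge("Una", "LIKES", "Me"));
        System.out.println(incoming(edges, "Me", "LIKES"));     // [Bob, Una]
        System.out.println(schoolKnowsUniversity(edges, "Me")); // [Sam knows Una]
    }
}
```

Scanning the whole edge list keeps the sketch short; a graph database would instead follow only the edges attached to the starting node.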
Compared to Other DBs
A Graph Database transforms an RDBMS
Topple the stacks of records in a relational
database while keeping all the relationships,
and you’ll see a graph. Where an RDBMS is
optimized for aggregated data, Neo4j is
optimized for highly connected data.
http://docs.neo4j.org/chunked/stable/tutorial-comparing-models.html
Compared to Other DBs
A Graph Database elaborates a Key-Value
Store
A Key-Value model is great for lookups of
simple values or lists. When the values are
themselves interconnected, you’ve got a
graph. Neo4j lets you elaborate the simple
data structures into more complex,
interconnected data.
Compared to Other DBs
A Graph Database navigates a Document Store
The container hierarchy of a document database
accommodates schema-free data that can easily be
represented as a tree. Which is of course a graph.
Refer to other documents (or document elements)
within that tree and you have a more expressive
representation of the same data. When in Neo4j,
those relationships are easily navigable.
D=Document, S=Subdocument, V=Value, D2/S2 = reference to subdocument in (other) document
Neo4j Query Language - Cypher
• Neo4j’s query language, Cypher, is a declarative
language, like SQL
• Graph traversal is handled at a lower level, so
you don’t need to write traversals
• Commands are built from clauses that
represent matches to patterns and
relationships
Create Nodes
• Create a node, bound to the variable Kev, label it Person and
provide properties:
CREATE (Kev:Person {Name:'Kevin', Age:45})
• And
CREATE (Beer:Drink {Name:'Beer', Alcoholic:'Yes'})
Retrieve Nodes
• Use the MATCH operator
MATCH (a:Person) WHERE a.Name = "Kevin" RETURN a
• Binds a to the node with the name "Kevin"
• i.e. {Name:'Kevin', Age:45}
Create Relationships
• Create a relationship between Kevin and Beer
MATCH (a:Person), (b:Drink) WHERE
a.Name = "Kevin" AND b.Name = "Beer"
CREATE (a)-[r:Likes]->(b)
Neo4j in Java
• Neo4j is based on Java (that is what the 4j means)
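// Embedded Java API as of Neo4j 2.x; these calls must run inside a transaction from graphDb.beginTx()
// RelTypes is a user-defined enum that implements RelationshipType
Node kevin = graphDb.createNode();
kevin.setProperty("Name", "Kevin");
kevin.setProperty("Age", 45);
Node beer = graphDb.createNode();
beer.setProperty("Alcoholic", "Yes");
Relationship rel = kevin.createRelationshipTo(beer, RelTypes.LIKES);
rel.setProperty("HowMuch", "Quite a lot");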
Summary
• Graph databases are good for representing
entities and the relationships between them
• Relationships are far richer than in a traditional
relational database
• Support ACID transactions
• Good for modelling naturally graph-like
structures such as geographic locations, social
networks, etc.