© Luca Garulli - 2012 Licensed under a Creative Commons Attribution-NoDerivs 3.0 Unported License Page 1
Select the right model
„Document vs Graph,
what is the answer?“
Document rev 1.1
OrientDB supports multiple models
Document
Graph
Custom Graph*
*available in 1.2
What is the best choice for my domain?
Facts
Graph Model has been built on top of
the Document Model
But why is it so fast?
Because in OrientDB relationships are
direct links, not relational JOINs
This is the reason why
even using the Document model
you can manage complex graphs
of objects
Relationships
Graph Model Vertex & Edges I
[Diagram: Vertex A and Vertex B connected, with the labels «out» and «in»]
Connections using bidirectional Edges
Graph Model Vertex & Edges II
[Diagram: Vertex A, Edge A-B and Vertex B, with the labels «out» and «in»]
Edges in OrientDB are records with their own RecordID, of class «OGraphEdge» or just «E» as an alias.
Graph Model Vertex & Edges III
[Diagram: Vertex A, Edge A-B and Vertex B, with the labels «out*» and «in»; footnote: * in out]
To access the outgoing vertices use «out.in» because:
1. Vertex A exits through «out» to reach
2. the Edge, and then follows
3. «in» to arrive at Vertex B
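
The «out.in» walk can be sketched with plain Python dictionaries. This is only an illustrative in-memory model, not the OrientDB API: the record IDs, the `records` dictionary and both helper functions are assumptions made up for the example.

```python
# Illustrative in-memory model; this is NOT the OrientDB API.
# Record IDs such as "#10:0" are made-up examples.
records = {
    "#10:0": {"class": "V", "name": "Vertex A", "out": ["#11:0"], "in": []},
    "#10:1": {"class": "V", "name": "Vertex B", "out": [], "in": ["#11:0"]},
    # The edge is a record too: its "out" points back to the source vertex,
    # its "in" points to the destination vertex.
    "#11:0": {"class": "E", "out": "#10:0", "in": "#10:1"},
}


def outgoing_vertices(rid):
    """Follow vertex.out to each edge, then edge.in to the target vertex."""
    vertex = records[rid]
    return [records[records[edge_rid]["in"]]["name"] for edge_rid in vertex["out"]]


def incoming_vertices(rid):
    """The reverse walk: vertex.in to each edge, then edge.out to the source."""
    vertex = records[rid]
    return [records[records[edge_rid]["out"]]["name"] for edge_rid in vertex["in"]]


print(outgoing_vertices("#10:0"))  # ['Vertex B']
print(incoming_vertices("#10:1"))  # ['Vertex A']
```

Because the same edge record is referenced from both vertices, the walk works in either direction without any extra bookkeeping.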
Graph Model summary
Connect vertices using edges
A vertex has "in" for the incoming relationships and "out" for the outgoing relationships
Connections are always bidirectional
Document Model one-way direct connections
[Diagram: Professor Jay linked to Student Steve through «students *»]
Connections are direct, without using Edges, and are always unidirectional
Document Model express the cardinality
[Diagram: City Palo Alto reached through «city 1»; Professor Jay linked to Student Steve through «students *»]
City = 1, using the LINK type
Students = N, using the LINKSET type
Document Model express the cardinality
[Diagram: Professor Jay linked to Student Steve through «students *»]
Use LINKSET or LINKMAP for an unordered collection that doesn’t accept duplicates; otherwise use LINKLIST, which is ordered and allows duplicates
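
The difference between single, set-like and list-like links can be sketched with plain Python containers standing in for the link types. This is only an analogy, not the OrientDB API; the record IDs and the `visits` field are hypothetical names introduced for the example.

```python
# Illustrative only: Python containers standing in for OrientDB link types.
professor = {
    "name": "Jay",
    "city": "#12:0",    # like LINK: exactly one target record
    "students": set(),  # like LINKSET: unordered, no duplicates
    "visits": [],       # like LINKLIST: ordered, duplicates allowed (hypothetical field)
}

# Adding the same student twice leaves a single entry in the set.
professor["students"].add("#13:0")
professor["students"].add("#13:0")

# Appending the same record twice keeps both entries, in insertion order.
professor["visits"].append("#14:0")
professor["visits"].append("#14:0")

print(len(professor["students"]))  # 1
print(professor["visits"])         # ['#14:0', '#14:0']
```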
Document Model bidirectional connections
[Diagram: Professor Jay linked to Student Steve through «students *», and Student Steve linked to Professor Jay through «professors *»]
To create bidirectional links create 2 relationships:
1. from Professor to Student
2. from Student to Professor
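
Keeping the two one-way links in step is the application's job. A minimal sketch of that bookkeeping follows, using plain dictionaries rather than the OrientDB API; the helper names `link_both_ways` and `unlink_both_ways` and the `rid` values are hypothetical.

```python
# Illustrative only: two documents holding one-way link sets to each other.
jay = {"rid": "#20:0", "name": "Jay", "students": set()}
steve = {"rid": "#21:0", "name": "Steve", "professors": set()}


def link_both_ways(professor, student):
    """Create the two one-way links that together act as one bidirectional link."""
    professor["students"].add(student["rid"])
    student["professors"].add(professor["rid"])


def unlink_both_ways(professor, student):
    """Remove both links; forgetting one side would leave a dangling reference."""
    professor["students"].discard(student["rid"])
    student["professors"].discard(professor["rid"])


link_both_ways(jay, steve)
print(jay["students"], steve["professors"])  # {'#21:0'} {'#20:0'}

unlink_both_ways(jay, steve)
print(jay["students"], steve["professors"])  # set() set()
```

Routing every change through one pair of helpers is one way to avoid updating only one side of the relationship.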
Document Model summary
Connections are always one-way; to have a bidirectional relationship create 2 connections
Single cardinality: LINK
Multiple cardinality:
LINKSET and LINKMAP: unordered, no duplicates
LINKLIST: ordered, allows duplicates
Graph Model vs
Document Model
Graph Model PROS
1. Ability to use GREMLIN and full TinkerPop
Blueprints stack
2. Connections are always bidirectional: this leaves open
the ability to move in all directions, even if that was
not planned at the beginning
3. Edges can have properties
Graph Model CONS
1. Edges are records themselves, so the db is bigger: one
more record per edge
2. Traversing between Vertices needs to load the
Edge record too, so it’s slower
3. All the outgoing relationships are inside the "out"
collection: worse performance in case you have
connections of different kinds
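
Point 3 can be sketched as follows: when every outgoing edge sits in one "out" collection, finding the connections of one kind means examining all of them. The code is an in-memory illustration, not the OrientDB API; the `label` field, the record names and the student "ann" are assumptions made up for the example.

```python
# Illustrative only: one vertex whose "out" collection mixes edge kinds.
records = {
    "jay": {"out": ["e1", "e2", "e3"]},
    "e1": {"label": "teaches", "out": "jay", "in": "steve"},
    "e2": {"label": "teaches", "out": "jay", "in": "ann"},
    "e3": {"label": "lives_in", "out": "jay", "in": "palo_alto"},
}


def targets_by_label(rid, label):
    """Return the targets of one kind and how many edge records were examined."""
    examined = 0
    targets = []
    for edge_rid in records[rid]["out"]:
        edge = records[edge_rid]  # every edge is loaded, whatever its kind
        examined += 1
        if edge["label"] == label:
            targets.append(edge["in"])
    return targets, examined


print(targets_by_label("jay", "lives_in"))  # (['palo_alto'], 3)

# Document-style alternative: one named field per kind of connection,
# so the city is reached without scanning the student links.
jay_doc = {"students": {"steve", "ann"}, "city": "palo_alto"}
print(jay_doc["city"])  # palo_alto
```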
Document Model PROS
1. Lighter than the Graph Model: no need for a separate
record to manage relationships
2. Faster on traversing because links are direct,
bypassing the Edge records
3. Finer-grained control: you set the cardinality and also the
relationship type
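
The size and traversal trade-off between the two models can be sketched by counting records. This is a toy in-memory comparison under simple assumptions (one "hub" record connected to n others, one load per record touched), not a benchmark and not the OrientDB API; all function and record names are hypothetical.

```python
# Illustrative only: count stored records and record loads in each model.
def build_graph_model(n):
    """One hub vertex connected to n vertices through n separate edge records."""
    records = {"hub": {"out": []}}
    for i in range(n):
        records[f"v{i}"] = {"in": [f"e{i}"]}
        records[f"e{i}"] = {"out": "hub", "in": f"v{i}"}
        records["hub"]["out"].append(f"e{i}")
    return records


def build_document_model(n):
    """One hub document holding n direct links, with no edge records."""
    records = {"hub": {"links": []}}
    for i in range(n):
        records[f"d{i}"] = {}
        records["hub"]["links"].append(f"d{i}")
    return records


def loads_graph(records):
    """Visit every neighbour of the hub: load the edge, then the target vertex."""
    loads = 1  # the hub itself
    for edge_rid in records["hub"]["out"]:
        edge = records[edge_rid]
        loads += 1
        _target = records[edge["in"]]
        loads += 1
    return loads


def loads_document(records):
    """Visit every neighbour of the hub: load the linked document directly."""
    loads = 1  # the hub itself
    for rid in records["hub"]["links"]:
        _target = records[rid]
        loads += 1
    return loads


graph = build_graph_model(3)
docs = build_document_model(3)
print(len(graph), loads_graph(graph))   # 7 7
print(len(docs), loads_document(docs))  # 4 4
```

With n connections the graph layout stores and loads n extra edge records, which is the cost paid for bidirectional connections and edge properties.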
Document Model CONS
1. Cannot use the GREMLIN and full TinkerPop
Blueprints stack
2. No native bidirectional connection, so the
application has to manage the double connection
3. Edges cannot have properties
Suggestions What to use?
Graph and Document models both have PROS
and CONS. Often it’s hard to select the
right one because there may not be a single
right one.
The following use cases are very generic, so
don’t take them as a rule of thumb
Suggestions (1) Social Applications
The Graph Model is highly suggested because
you’re always ready to analyze the graph in
any direction using advanced tools like the
GREMLIN language.
Graph algorithms such as Shortest Path and Ranking
are already developed in the TinkerPop stack.
Suggestions (2) CRM/Business Applications
You could select either. Here the Document
Model is a good candidate because
relationships don’t change very often and are
mostly known at the beginning
Suggestions (3) Highest performance
Since the Graph Model is heavier, because it
requires more records and more traversal
time, the suggestion here is the
Document Model
NuvolaBase.com
The first
Graph Database
on the Cloud
always available
a few seconds to set it up
use it from Web & Mobile
apps
Luca Garulli
www.twitter.com/lgarulli
CEO at
Ltd, London UK
Author of
Document-Graph NoSQL Open Source project