The Definition of GraphDB
Uploaded by: takahiro-inoue
Category: Technology
Transcript of The Definition of GraphDB
![Page 1: The Definition of GraphDB](https://reader038.fdocuments.us/reader038/viewer/2022103110/54b72cc64a79591b2d8b462e/html5/thumbnails/1.jpg)
The Definition of GraphDB
@doryokujin
GraphDB Meet-Up Japan #1
![Page 2: The Definition of GraphDB](https://reader038.fdocuments.us/reader038/viewer/2022103110/54b72cc64a79591b2d8b462e/html5/thumbnails/2.jpg)
・Takahiro Inoue (age 26)
・twitter: doryokujin
・Majored in Math (Statistics & Graph Algorithms)
・Data Scientist
・Leader of MongoDB JP
・Interest: Data Processing, GraphDB
About Me
![Page 3: The Definition of GraphDB](https://reader038.fdocuments.us/reader038/viewer/2022103110/54b72cc64a79591b2d8b462e/html5/thumbnails/3.jpg)
(1) Graph Type for GraphDB ~ Which Graph Is Better for GraphDB? ~
(2) Graph Traversals ~ Graph Query ≡ Graph Traversal ~
(3) Index Free Adjacency ~ The Key to the Definition of GraphDB ~
(4) Other Topics
Agenda
![Page 4: The Definition of GraphDB](https://reader038.fdocuments.us/reader038/viewer/2022103110/54b72cc64a79591b2d8b462e/html5/thumbnails/4.jpg)
(1) Graph Class for GraphDB ~ Which Graph Is Better for GraphDB? ~
![Page 5: The Definition of GraphDB](https://reader038.fdocuments.us/reader038/viewer/2022103110/54b72cc64a79591b2d8b462e/html5/thumbnails/5.jpg)
・A graph is an ordered pair G = (V, E)
・Set V of nodes
・Set E of edges
- 2-element subsets of V
- Representing a “relationship” between nodes
- Directed or undirected
Definition of Graph
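As a minimal sketch of the definition above, an undirected graph can be written in Python as a pair of sets. The node names are hypothetical and only serve as an illustration.

```python
# Hypothetical example data; the node names are not from the slides.
V = {"a", "b", "c"}

# Undirected edges are 2-element subsets of V, so {a, b} equals {b, a}.
E = {frozenset({"a", "b"}), frozenset({"b", "c"})}

G = (V, E)  # the ordered pair G = (V, E)


def is_graph(nodes, edges):
    """Check that every edge is a 2-element subset of the node set."""
    return all(len(e) == 2 and e <= nodes for e in edges)


print(is_graph(V, E))
print(frozenset({"b", "a"}) in E)
```

Using `frozenset` for edges makes the "no orientation" property automatic; a directed graph would store ordered tuples instead.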
![Page 6: The Definition of GraphDB](https://reader038.fdocuments.us/reader038/viewer/2022103110/54b72cc64a79591b2d8b462e/html5/thumbnails/6.jpg)
[Undirected Graph]
・Edges have no orientation
・Edges are not ordered pairs but sets {u, v}, i.e. edge (a, b) ≡ (b, a)
・All nodes have the same object type
・All edges have the same relationship
Def. Undirected Graph
G = (V, E)
![Page 7: The Definition of GraphDB](https://reader038.fdocuments.us/reader038/viewer/2022103110/54b72cc64a79591b2d8b462e/html5/thumbnails/7.jpg)
[Directed Graph (Digraph)]
・Ordered pair D = (V, A)
・A: Set of ordered pairs of nodes, called “arrows”
・All nodes have the same object type
・All edges have the same relationship
Def. Directed Graph
D = (V, A)
[Figure label: symmetric]
![Page 8: The Definition of GraphDB](https://reader038.fdocuments.us/reader038/viewer/2022103110/54b72cc64a79591b2d8b462e/html5/thumbnails/8.jpg)
Example: (Un)Directed Graph
[Facebook] (figure: user nodes joined by “friend” edges)
・relationship of all edges: “friend”
・Facebook friendship is symmetric
・node object type: “user”
[Twitter] (figure: user nodes joined by “follow” edges)
・relationship of all edges: “follow”
・the Twitter follow action is asymmetric
・node object type: “user”
![Page 9: The Definition of GraphDB](https://reader038.fdocuments.us/reader038/viewer/2022103110/54b72cc64a79591b2d8b462e/html5/thumbnails/9.jpg)
[Mixed Graph]
・Some edges may be directed and some may be undirected
[Multigraph]
・Includes (directed/undirected) loop edges and multiple edges
Def. Mixed Graph, Multigraph
G = (V, E, A)
D = (V, A)
[Figure labels: loop, multiple]
![Page 10: The Definition of GraphDB](https://reader038.fdocuments.us/reader038/viewer/2022103110/54b72cc64a79591b2d8b462e/html5/thumbnails/10.jpg)
・These types of graphs can have a common representation
・undirected edge --> 2 directed edges
・symmetric edge --> 2 asymmetric directed edges
・loops and multiple edges are allowed
Common Representation
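The conversion above can be sketched in a few lines of Python. This is only an illustration: the function name and the sample edges are hypothetical, and mapping an undirected loop to a single directed loop is an assumption the slides do not spell out.

```python
from collections import Counter


def to_common_representation(undirected_edges, directed_edges):
    """Return a multiset of directed edges (tail, head).

    Each undirected edge {u, v} becomes the two directed edges (u, v)
    and (v, u). A Counter is used so multiple edges and loops survive.
    """
    arrows = Counter(directed_edges)
    for u, v in undirected_edges:
        arrows[(u, v)] += 1
        if u != v:  # assumption: an undirected loop maps to one directed loop
            arrows[(v, u)] += 1
    return arrows


# Hypothetical input: one undirected edge, one doubled arrow, one loop.
arrows = to_common_representation(
    undirected_edges=[("A", "B")],
    directed_edges=[("B", "C"), ("B", "C"), ("C", "C")],
)
print(arrows)
```

After the conversion every edge is a directed, asymmetric arrow, which is the form the following slides build on.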
![Page 11: The Definition of GraphDB](https://reader038.fdocuments.us/reader038/viewer/2022103110/54b72cc64a79591b2d8b462e/html5/thumbnails/11.jpg)
[Figure labels: symmetric, undirected, multiple, loop]
・No undirected edge
・No symmetric edge
Common Representation
![Page 12: The Definition of GraphDB](https://reader038.fdocuments.us/reader038/viewer/2022103110/54b72cc64a79591b2d8b462e/html5/thumbnails/12.jpg)
Def. Single-Relational Graph
[Single-Relational Structures]
・Multigraph
・All edges must have the same relationship
・All nodes must have the same object type
・All graphs introduced so far are SR-Graphs
Is this class sufficient for a graph database?
![Page 13: The Definition of GraphDB](https://reader038.fdocuments.us/reader038/viewer/2022103110/54b72cc64a79591b2d8b462e/html5/thumbnails/13.jpg)
[Multi-Relational Structures]
・More flexible than single-relational structures
・All edges are directed and asymmetric
・Each edge can have a different relationship
・Each node can have a different object type
Def. Multi-Relational Graph
![Page 14: The Definition of GraphDB](https://reader038.fdocuments.us/reader038/viewer/2022103110/54b72cc64a79591b2d8b462e/html5/thumbnails/14.jpg)
Example: Multi-Relational Graph
・4 types of relationships: “Reply”, “DM”, “RT”, “Block”
・Every node still has the same object type
[Twitter]
[Figure: user nodes joined by edges labeled Reply, DM, RT, and Block]
![Page 15: The Definition of GraphDB](https://reader038.fdocuments.us/reader038/viewer/2022103110/54b72cc64a79591b2d8b462e/html5/thumbnails/15.jpg)
Example: Multi-Relational Graph
・Many types of relationship
・Connection: user --> item
・Connection: user <--> user
[Figure: users and items joined by edges labeled want!, follow, like!, exhibit, invite, bought!, and message]
[Livlis] http://www.livlis.com/
![Page 16: The Definition of GraphDB](https://reader038.fdocuments.us/reader038/viewer/2022103110/54b72cc64a79591b2d8b462e/html5/thumbnails/16.jpg)
Def. Property Graph
[Property Graph]
・Multi-Relational Graph
・Each node and edge has some properties
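・Each property is represented as a “key-value” pair and is schema-free
[Figure: two nodes joined by a “follow” edge with property since 2011/01/23. One node has id id_A, follow 100, follower 200, since 2011/01/01; the other has id id_B, follow 500, follower 1000, since 2011/06/01.]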
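A minimal sketch of a property graph in Python, using the property values shown on this slide. The class names `Node` and `Edge` and the direction of the follow edge (A to B) are assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    # Schema-free key-value properties: any keys, any values.
    properties: dict = field(default_factory=dict)


@dataclass
class Edge:
    tail: Node   # the edge leaves this node
    head: Node   # the edge points at this node
    label: str   # relationship type, e.g. "follow"
    properties: dict = field(default_factory=dict)


a = Node({"id": "id_A", "follow": 100, "follower": 200, "since": "2011/01/01"})
b = Node({"id": "id_B", "follow": 500, "follower": 1000, "since": "2011/06/01"})

# Assumption: the edge runs from A to B; the slide text does not say.
follow = Edge(tail=a, head=b, label="follow", properties={"since": "2011/01/23"})

print(follow.label, follow.properties["since"], follow.head.properties["id"])
```

Because properties are plain dictionaries, two nodes of the same kind need not carry the same keys.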
![Page 18: The Definition of GraphDB](https://reader038.fdocuments.us/reader038/viewer/2022103110/54b72cc64a79591b2d8b462e/html5/thumbnails/18.jpg)
Example: Property Graph
[Livlis]
[Figure: a graph with edges labeled want!, follow, like!, exhibit, invite, bought!, and message, with key-value properties attached to its elements. Property sets shown:
・name A, follow 100, follower 200, sex man
・name B, follow 10, follower 20, sex man
・favorite 50
・since 01/01/01
・price $50
・since 01/01/01, price $50, access 500, wanted 10, liked 30
The remaining elements are marked “...”.]
![Page 19: The Definition of GraphDB](https://reader038.fdocuments.us/reader038/viewer/2022103110/54b72cc64a79591b2d8b462e/html5/thumbnails/19.jpg)
Def. Hyper Graph
[Hyper Graph]
・Set V of nodes
・Set E of non-empty subsets of V
・i.e. an edge can connect more than two nodes
・Every node or edge carries an arbitrary value as payload
・Property Graph ⊂ Hyper Graph
H = (V, E)
Sones: manage edge types with GraphDB 2.1
![Page 20: The Definition of GraphDB](https://reader038.fdocuments.us/reader038/viewer/2022103110/54b72cc64a79591b2d8b462e/html5/thumbnails/20.jpg)
・Property graphs have a flexible representation
・Key features:
- All edges are directed and asymmetric
- Each edge can have a different relationship
- Each node can have a different object type
- All elements have properties in key-value style
・Many GraphDBs support the Property Graph model
※ Some GraphDBs support the Hyper Graph model
Summary
![Page 21: The Definition of GraphDB](https://reader038.fdocuments.us/reader038/viewer/2022103110/54b72cc64a79591b2d8b462e/html5/thumbnails/21.jpg)
(2) Graph Traversals ~ Graph Query ≡ Graph Traversal ~
![Page 22: The Definition of GraphDB](https://reader038.fdocuments.us/reader038/viewer/2022103110/54b72cc64a79591b2d8b462e/html5/thumbnails/22.jpg)
Graph Query ≡ Graph Traversal
・Not a “global” search as in an RDBMS or other NoSQL stores
・Instead, traverse over the graph from a “root node”
・“Locality” is very important
Graph Traversals
Property Graph Algorithms
![Page 23: The Definition of GraphDB](https://reader038.fdocuments.us/reader038/viewer/2022103110/54b72cc64a79591b2d8b462e/html5/thumbnails/23.jpg)
・To traverse a graph is to process every node in the graph exactly once
・The two most common traversal patterns are breadth-first traversal and depth-first traversal
・At each step, the traverser moves to its adjacent vertices
・Repeat the step a specified number of times or until some condition is fulfilled
Graph Traversals
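The two traversal patterns named above can be sketched as follows. The adjacency lists are hypothetical, and `max_depth` is an illustrative parameter standing in for "repeat a specified number of times".

```python
from collections import deque

# Hypothetical adjacency lists; the vertex names are for illustration only.
graph = {
    "A": ["B", "C"],
    "B": ["D"],
    "C": ["D"],
    "D": [],
}


def bfs(graph, root, max_depth=None):
    """Breadth-first: visit each reachable node once, nearest nodes first."""
    visited, order = {root}, []
    queue = deque([(root, 0)])
    while queue:
        node, depth = queue.popleft()
        order.append(node)
        if max_depth is not None and depth >= max_depth:
            continue  # stop expanding after the requested number of steps
        for neighbor in graph[node]:
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append((neighbor, depth + 1))
    return order


def dfs(graph, root):
    """Depth-first: follow one branch to its end before backtracking."""
    visited, order = set(), []
    stack = [root]
    while stack:
        node = stack.pop()
        if node in visited:
            continue
        visited.add(node)
        order.append(node)
        # Push in reverse so the first-listed neighbor is explored first.
        stack.extend(reversed(graph[node]))
    return order


print(bfs(graph, "A"))               # ['A', 'B', 'C', 'D']
print(bfs(graph, "A", max_depth=1))  # ['A', 'B', 'C']
print(dfs(graph, "A"))               # ['A', 'B', 'D', 'C']
```

Both functions start from a root node and only ever touch vertices reachable from it, which is the locality the previous slide stresses.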
![Page 24: The Definition of GraphDB](https://reader038.fdocuments.us/reader038/viewer/2022103110/54b72cc64a79591b2d8b462e/html5/thumbnails/24.jpg)
[Excerpt from “The Graph Traversal Pattern”]
[Figure: vertices 1 (name=Alberto Pepe), 2, 3, and 4 (name=...) joined by friend edges; one path is marked with the steps e_out, e_lab+ (friend), v_in, and name]
Fig. 3. A single path along the f traversal.
those edges with the label friend, then traverse to the incoming (i.e. head) vertices on those friend-labeled edges. Finally, of those vertices, return their name property.²¹ A single legal path according to this function is diagrammed in Figure 3. Though not diagrammed for the sake of clarity, the traversal would also go from vertex 1 to the name of vertex 2 and vertex 3. The function f is a “higher-order” adjacency defined as the composition of explicit adjacencies and serves as a join of Alberto and his friend’s names.²² The remainder of this section demonstrates graph traversals in real-world problem-solving situations.
3.1 Traversing for Recommendation
Recommendation systems are designed to help people deal with the problem of information overload by filtering information in the system that doesn’t pertain to the person [14]. In a positive sense, recommendation systems focus a person’s attention on those resources that are likely to be most relevant to their particular situation. There is a standard dichotomy in recommendation research: that of content- vs. collaborative filtering-based recommendation. The prior deals with recommending resources that share characteristics (i.e. content) with a set of resources. The latter is concerned with determining the similarity of resources based upon the similarity of the taste of the people modeled within the system [6]. These two seemingly different techniques to recommendation are conveniently solved using a graph database and two simple traversal techniques [10, 5]. Figure 4 presents a toy graph data set, where there exist a set of people, resources, and features related to each other by likes- and feature-labeled edges. This simple data set is used for the remaining examples of this subsection.
²¹ Note that the order of a composition is evaluated from right to left.
²² This is known as a virtual edge in the graph system called DEX [9].
・Single step traversal: from element i to element j, where i, j ∈ (V ∪ E).
・Can define graph traversals of arbitrary length from single step traversal
・Querying is performed through traversals, which can perform millions of "joins" per second
Graph Traversals
The Graph Traversal Pattern
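A rough sketch of how single-step traversals compose into the friend-name traversal f described in the excerpt. Only the name “Alberto Pepe” and the friend label come from the slide; the other names, the extra “likes” edge, and the helper `compose` are hypothetical.

```python
from functools import reduce

# Hypothetical toy data: only "Alberto Pepe" comes from the excerpt.
vertices = {
    1: {"name": "Alberto Pepe"},
    2: {"name": "B"},
    3: {"name": "C"},
    4: {"name": "D"},
}
# Each edge is (tail vertex, label, head vertex).
edges = [(1, "friend", 2), (1, "friend", 3), (1, "friend", 4), (1, "likes", 2)]


def e_out(vs):
    """Single step: vertices -> their outgoing edges."""
    return [e for e in edges if e[0] in vs]


def e_lab(label):
    """Single step: keep only the edges that carry the given label."""
    return lambda es: [e for e in es if e[1] == label]


def v_in(es):
    """Single step: edges -> their head (incoming) vertices."""
    return [e[2] for e in es]


def name(vs):
    """Single step: vertices -> their name property."""
    return [vertices[v]["name"] for v in vs]


def compose(*steps):
    """Compose steps right to left, as footnote 21 of the excerpt notes."""
    return lambda x: reduce(lambda acc, step: step(acc), reversed(steps), x)


# f = name after v_in after e_lab(friend) after e_out
f = compose(name, v_in, e_lab("friend"), e_out)
print(f([1]))  # ['B', 'C', 'D']
```

Longer traversals are built the same way, by adding more single steps to the composition.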
![Page 25: The Definition of GraphDB](https://reader038.fdocuments.us/reader038/viewer/2022103110/54b72cc64a79591b2d8b462e/html5/thumbnails/25.jpg)
Graph Traversals
Basic Graph Traversals
![Page 26: The Definition of GraphDB](https://reader038.fdocuments.us/reader038/viewer/2022103110/54b72cc64a79591b2d8b462e/html5/thumbnails/26.jpg)
・GraphDB is efficient with respect to local data analysis (Recommendation, Social Analytics, Shortest Path). These all focus on “a user”
・Locality is defined by direct referent structures
・Frame all solutions to problems as a traversal over local regions of the graph
Summary
![Page 27: The Definition of GraphDB](https://reader038.fdocuments.us/reader038/viewer/2022103110/54b72cc64a79591b2d8b462e/html5/thumbnails/27.jpg)
(3) Index Free Adjacency ~ The Key to the Definition of GraphDB ~
![Page 28: The Definition of GraphDB](https://reader038.fdocuments.us/reader038/viewer/2022103110/54b72cc64a79591b2d8b462e/html5/thumbnails/28.jpg)
※ A GraphDB is not just any database that can model a graph structure (an RDB, a document store, etc. can also do that)
[definition]
・A graph database is any storage system that provides “index-free adjacency”
The Definition of GraphDB
The Graph Traversal Programming Pattern
![Page 29: The Definition of GraphDB](https://reader038.fdocuments.us/reader038/viewer/2022103110/54b72cc64a79591b2d8b462e/html5/thumbnails/29.jpg)
[Important feature]
・Mini Index: Every element (node or edge) has a direct pointer to its adjacent element
・No Index lookup: we can determine which vertex is adjacent to which other vertex without looking up an index-tree
The Definition of GraphDB
![Page 30: The Definition of GraphDB](https://reader038.fdocuments.us/reader038/viewer/2022103110/54b72cc64a79591b2d8b462e/html5/thumbnails/30.jpg)
Relational Data Model
[Graph data in table] (figure: a table with column1, column2, column3 and rows 1 to 8)
[Index-tree]
[Figure, “Graph Databases and Endogenous Indices”: a property graph with created, follows, and cites edges, plus name, views, and gender property indices]
Graph Databases Make Use of Indices
[Figure: the graph (vertices A to E) and an index of vertices (by id)]
• There is more to the graph than the explicit graph structure.
• Indices index the vertices, by their properties (e.g. ids).
Indexing of Vertices
Graph Data
The Graph Traversal Programming Pattern
![Page 31: The Definition of GraphDB](https://reader038.fdocuments.us/reader038/viewer/2022103110/54b72cc64a79591b2d8b462e/html5/thumbnails/31.jpg)
[Index-tree] [Graph Data]
[Figure: an index-tree over vertices A to E with adjacency lists (B, C), (E), (D, E), next to the graph itself]
1. Want to determine the neighbors of A
2. Look up the index-tree: log_2(n) time cost
3. Get the adjacency list (B, C)
4. Move to either B or C
Relational Data Model
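The index-based lookup on this slide can be sketched as below. A sorted list searched with `bisect` stands in for the index-tree (it is not a real B-tree), and the assignment of the adjacency lists to vertices B to E is an assumption read off the figure residue.

```python
import bisect

# A sorted key list stands in for the index-tree; binary search over it
# costs O(log n), like a lookup in a balanced tree.
index_keys = ["A", "B", "C", "D", "E"]
# Assumption: which list belongs to which vertex is guessed from the figure.
adjacency_rows = [["B", "C"], ["E"], ["D", "E"], [], []]


def neighbors_via_index(vertex):
    """Look up the index (step 2), then read the adjacency list (step 3)."""
    pos = bisect.bisect_left(index_keys, vertex)
    if pos == len(index_keys) or index_keys[pos] != vertex:
        raise KeyError(vertex)
    return adjacency_rows[pos]


print(neighbors_via_index("A"))  # ['B', 'C']
```

Every hop of a traversal repeats this global lookup, so its cost grows with the number of vertices in the index.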
![Page 32: The Definition of GraphDB](https://reader038.fdocuments.us/reader038/viewer/2022103110/54b72cc64a79591b2d8b462e/html5/thumbnails/32.jpg)
[Index-tree] [Graph Data]
The lookup cost becomes larger as the graph grows: O(log_2 n)
The lookup cost becomes very high
Traversing takes a long time
Relational Data Model
![Page 33: The Definition of GraphDB](https://reader038.fdocuments.us/reader038/viewer/2022103110/54b72cc64a79591b2d8b462e/html5/thumbnails/33.jpg)
・Insert time: as the graph grows in size, the cost of an insert becomes higher
・Lookup time: as the graph grows in size, the cost of a lookup grows in proportion to log n, i.e. O(log_2 n)
・Memory size: as the graph grows in size, the memory required becomes larger
Cost of Looking Up Index-tree
![Page 34: The Definition of GraphDB](https://reader038.fdocuments.us/reader038/viewer/2022103110/54b72cc64a79591b2d8b462e/html5/thumbnails/34.jpg)
Graph DB Model
[Graph Data]
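[Mini-Index] direct references to its adjacent vertices
[Figure: vertices A to G, each storing its own adjacency list; lists shown: (B, C), (D, E), (F, F), (G), (E, F, G), (G)]
[Constant time]: the cost of a step depends only upon the number of connected edges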
![Page 35: The Definition of GraphDB](https://reader038.fdocuments.us/reader038/viewer/2022103110/54b72cc64a79591b2d8b462e/html5/thumbnails/35.jpg)
Mini-Index: Graph DB Model
[Graph Data]
The cost of a local step remains the same
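A minimal sketch of the mini-index idea, assuming each vertex object simply holds references to its neighbors. The class name `Vertex` and the three sample vertices are hypothetical.

```python
class Vertex:
    def __init__(self, name):
        self.name = name
        # The "mini index": direct references to the adjacent vertex objects.
        self.adjacent = []


a, b, c = Vertex("A"), Vertex("B"), Vertex("C")
a.adjacent.extend([b, c])


def step(vertex):
    """One local traversal step: follow stored references, no global index."""
    return vertex.adjacent


print([v.name for v in step(a)])  # ['B', 'C']
```

Adding more vertices elsewhere in the graph does not change the work done by `step(a)`; only the number of edges on `a` matters.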
![Page 36: The Definition of GraphDB](https://reader038.fdocuments.us/reader038/viewer/2022103110/54b72cc64a79591b2d8b462e/html5/thumbnails/36.jpg)
・An external indexing system is built to index the properties of the vertices and edges
Indexing their properties
Graph Databases and Endogenous Indices
[Figure: a property graph with created, follows, and cites edges; properties shown include name=twarko, age=30, name=ahzf, name=graph_blog, views=1000, name=tenderlove, gender=male, date=2007/10, name=neo4j, views=56781, page_rank=0.023, and name=peterneubauer; a name property index, a views property index, and a gender property index point into the graph]
The Graph Traversal Programming Pattern
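A property index of this kind might look like the sketch below. The property values are taken from the figure on this slide; the integer vertex ids and the function `build_property_index` are hypothetical.

```python
from collections import defaultdict

# Property values from the figure; the integer ids are hypothetical.
vertices = {
    1: {"name": "twarko", "age": 30},
    2: {"name": "ahzf"},
    3: {"name": "graph_blog", "views": 1000},
    4: {"name": "tenderlove", "gender": "male"},
    5: {"name": "neo4j", "views": 56781},
    6: {"name": "peterneubauer"},
}


def build_property_index(vertices, key):
    """Map each value of `key` to the ids of the vertices that carry it."""
    index = defaultdict(set)
    for vid, props in vertices.items():
        if key in props:
            index[props[key]].add(vid)
    return index


name_index = build_property_index(vertices, "name")
views_index = build_property_index(vertices, "views")

# The index is used once, to find the root vertex; traversal then
# continues through direct references rather than further index lookups.
start = next(iter(name_index["neo4j"]))
print(start)  # 5
```

Vertices without the indexed key (for example those with no `views` property) simply do not appear in that index.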
![Page 37: The Definition of GraphDB](https://reader038.fdocuments.us/reader038/viewer/2022103110/54b72cc64a79591b2d8b462e/html5/thumbnails/37.jpg)
・GraphDB provides “index-free adjacency”
・No index-tree lookup: each element has direct pointers to its adjacent elements
・They have an external index system for their properties (both nodes and relations)
・A very large graph can be stored on a single server, because the cost of a traversal step is independent of the growth of the graph
Summary
![Page 38: The Definition of GraphDB](https://reader038.fdocuments.us/reader038/viewer/2022103110/54b72cc64a79591b2d8b462e/html5/thumbnails/38.jpg)
(4) Other Benefits of GraphDB
![Page 39: The Definition of GraphDB](https://reader038.fdocuments.us/reader038/viewer/2022103110/54b72cc64a79591b2d8b462e/html5/thumbnails/39.jpg)
A Graph Database Transforms a RDBMS
← RDBMS
↓ GraphDB as RDBMS
Comparing Database Models
![Page 40: The Definition of GraphDB](https://reader038.fdocuments.us/reader038/viewer/2022103110/54b72cc64a79591b2d8b462e/html5/thumbnails/40.jpg)
A Graph Database Transforms a Key-Value Store
← RDBMS
↓ GraphDB as Key-Value
Comparing Database Models
![Page 41: The Definition of GraphDB](https://reader038.fdocuments.us/reader038/viewer/2022103110/54b72cc64a79591b2d8b462e/html5/thumbnails/41.jpg)
A Graph Database transforms a Document DB
↑ Document DB
↓ GraphDB as RDBMS
Comparing Database Models
![Page 42: The Definition of GraphDB](https://reader038.fdocuments.us/reader038/viewer/2022103110/54b72cc64a79591b2d8b462e/html5/thumbnails/42.jpg)
Example of e-commerce site
Square Pegs and Round Holes in the NOSQL World
![Page 43: The Definition of GraphDB](https://reader038.fdocuments.us/reader038/viewer/2022103110/54b72cc64a79591b2d8b462e/html5/thumbnails/43.jpg)
Square Pegs and Round Holes in the NOSQL World
Example of e-commerce site
![Page 44: The Definition of GraphDB](https://reader038.fdocuments.us/reader038/viewer/2022103110/54b72cc64a79591b2d8b462e/html5/thumbnails/44.jpg)
Square Pegs and Round Holes in the NOSQL World
Example of e-commerce site
Recommendation!!
![Page 45: The Definition of GraphDB](https://reader038.fdocuments.us/reader038/viewer/2022103110/54b72cc64a79591b2d8b462e/html5/thumbnails/45.jpg)
(1) Graph Type for GraphDB ~ Which Graph Is Better for GraphDB? ~
(2) Graph Traversals ~ Graph Query ≡ Graph Traversal ~
(3) Index Free Adjacency ~ The Key to the Definition of GraphDB ~
Did you Understand?