Graph Distribution - redislabs.com
RedisGraph
● Property graph
● Labeled entities
● Schema less
● Cypher query language
● Aggregations, Arithmetic expressions, Sort...
● Tabular resultset
Tables

Person
Name   Age  Height
Roi    33   187
Hila   33   170
Shany  23   167
Amit   31   180

Country
Name    Population
Israel  8.5M
Japan   127M
Italy   60M

Visit
SRC  DEST
1    2
2    2
2    3
4    1
4    3
Documents
{
  ID: 1,
  Name: 'Roi',
  Age: 33,
  Height: 187,
  Visited: [6]
}
{
  ID: 6,
  Name: 'Japan',
  Population: 127M
}
Adjacency matrix
1 0 1 0 1 0 0 0
0 1 0 0 1 1 0 1
0 0 1 0 1 0 0 0
1 1 0 1 1 0 1 1
1 0 1 0 1 1 0 0
Node i is connected to node j if A[i,j] = 1.
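The rule A[i,j] = 1 can be sketched in a few lines of Python. This is only an illustration: the edge list reuses the SRC/DEST pairs from the Tables slide, shifted to 0-based indices, and the helper names `is_connected` and `neighbors` are made up for this example.

```python
# Edges taken from the SRC/DEST pairs on the Tables slide, shifted to
# 0-based node indices (an assumption made for this sketch).
edges = [(0, 1), (1, 1), (1, 2), (3, 0), (3, 2)]
n = 4

# Build an n x n matrix of zeros, then set A[i][j] = 1 for every edge i -> j.
A = [[0] * n for _ in range(n)]
for i, j in edges:
    A[i][j] = 1

def is_connected(i, j):
    """Return True when node i has an edge to node j."""
    return A[i][j] == 1

def neighbors(i):
    """List every node j that node i points to (scan of row i)."""
    return [j for j in range(n) if A[i][j] == 1]
```

Finding the neighbors of a node is a scan over one row, which is why this layout is compact to describe but costly for very large, sparse graphs.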
Graph structure: Hexastore
Triplets
SPO:Michael:Boss:Jim
SOP:Michael:Jim:Boss
OPS:Jim:Boss:Michael
OSP:Jim:Michael:Boss
PSO:Boss:Michael:Jim
POS:Boss:Jim:Michael
Michael (S) -[Boss (P)]-> Jim (O)
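As a rough sketch of how the six triplet keys could be produced and queried: each relation is stored under all six orderings of subject (S), predicate (P) and object (O), so any lookup becomes a prefix scan. The function names `hexastore_keys` and `prefix_scan` are hypothetical, and a plain sorted list stands in for whatever ordered store holds the keys.

```python
from itertools import permutations

def hexastore_keys(subject, predicate, obj):
    """Return the six 'ORDER:a:b:c' keys for one (S, P, O) relation."""
    parts = {"S": subject, "P": predicate, "O": obj}
    keys = []
    for order in permutations("SPO"):
        prefix = "".join(order)  # e.g. "SPO", "OPS"
        keys.append(prefix + ":" + ":".join(parts[c] for c in order))
    return keys

# Sorted list as a stand-in for an ordered key store.
keys = sorted(hexastore_keys("Michael", "Boss", "Jim"))

def prefix_scan(keys, prefix):
    """Return every key that starts with the given prefix."""
    return [k for k in keys if k.startswith(prefix)]
```

For example, `prefix_scan(keys, "PSO:Boss:Michael:")` answers "whose boss is Michael?" without touching the other orderings.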
Node property set
Entities - Key value store.
Person node with attributes:
{
  'name': 'Bruce Buffer',
  'age': 60,
  'gender': 'male'
}
Problem
2 billion users
338 friends per user on average
676 billion edges
152 terabytes ~= 1024*32 bytes per user + 64*2 bytes per edge
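The size estimate can be checked with a few lines of arithmetic. The per-user and per-edge byte counts are the ones given on the slide; the variable names are only for this sketch, and terabytes are taken as 10^12 bytes.

```python
users = 2_000_000_000
avg_friends = 338

# Every user contributes avg_friends edges.
edges = users * avg_friends

bytes_per_user = 1024 * 32   # 32 KB of properties per user (from the slide)
bytes_per_edge = 64 * 2      # 128 bytes per edge (from the slide)

total_bytes = users * bytes_per_user + edges * bytes_per_edge
total_terabytes = total_bytes / 10**12
```

Users account for about 65.5 TB and edges for about 86.5 TB, which together round to the 152 TB quoted above.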
Query
Find friends of mine who've visited places I've been to and are older than me.
MATCH (ME:person)-[friend]->(F:person)-[visited]->(C:country)<-[visited]-(ME)
WHERE ME.ID = 33 AND F.age > ME.age
RETURN F.name, C.name
Query example continued
WHERE F.age > ME.age RETURN F.name, C.name
NETWORK!
Index
Entities
Fetch name of every entity in (IDs)
Entity's age > 29
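A minimal sketch of why the filter costs a network hop once the index and the entities live on different machines. Everything here is an in-memory stand-in: `index_node`, `entity_node`, the uids and the helper names are hypothetical, and a counter simulates round trips instead of real RPCs.

```python
# Hypothetical stand-ins for two machines: one holds the graph index
# (who is connected to whom), the other holds entity properties.
index_node = {"friend": {1: [2, 3, 4]}}
entity_node = {
    1: {"name": "Shany", "age": 23},
    2: {"name": "Roi", "age": 33},
    3: {"name": "Hila", "age": 33},
    4: {"name": "Amit", "age": 31},
}
network_calls = 0

def fetch_entities(ids, fields):
    """Simulate one batched round trip to the entity store."""
    global network_calls
    network_calls += 1
    return {i: {f: entity_node[i][f] for f in fields} for i in ids}

def older_friends(me_id, min_age):
    """Traverse the index locally, then fetch properties to filter on age."""
    friend_ids = index_node["friend"].get(me_id, [])
    rows = fetch_entities(friend_ids, ["name", "age"])
    return [rows[i]["name"] for i in friend_ids if rows[i]["age"] > min_age]

result = older_friends(1, 29)
```

The traversal itself never leaves the index, but the age predicate and the returned names both depend on data that only the entity store has.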
Query
Find all posts liked by friends of friends of mine, written by author X.
MATCH (ME:person)-[friend]->(:person)-[friend]->(F:person)-[like]->(post)<-[author]-(A:author)
WHERE ME.ID = 46 AND A.ID = 71070
RETURN A.name, F.name
1. Node X contains FRIEND relations.
2. Seek to my ID in Node X (1 RPC). Retrieve a list of friend uids.
3. Do multiple seeks, one for each friend uid, to generate a list of friends-of-friends uids: result set 1.
Query
Friend index
Query executor
(ME:person)-[friend]->(:person)-[friend]->(F:person)
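The friends-of-friends step could look roughly like this. The dictionary `friend_index` is a hypothetical stand-in for the FRIEND relations held by Node X, and the intermediate friend uids (12, 15) are invented; only uids 46, 70 and 92 come from the slides.

```python
# Hypothetical contents of Node X: FRIEND relations keyed by uid.
friend_index = {
    46: [12, 15],
    12: [70],
    15: [70, 92],
}

def seek(uid):
    """Look up the friend list of one uid on Node X."""
    return friend_index.get(uid, [])

def friends_of_friends(me):
    """One seek for my own uid, then one seek per friend uid."""
    result = set()
    for friend in seek(me):
        for fof in seek(friend):
            result.add(fof)
    return result

result_set_1 = friends_of_friends(46)
```

All seeks run on the node that owns the FRIEND relations, so the executor pays a single RPC to reach it regardless of how many friends are expanded.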
1. Node Y contains the posting list for predicate LIKE.
2. Ship result set 1 to Node Y (1 RPC), and do seeks to generate a list of all posts liked by result set 1: result set 2.
Query
Like index
Query executor
(F:person)-[like]->(post)
Result set 1
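A sketch of the LIKE step, assuming result set 1 has already arrived at Node Y. `like_index` is a hypothetical posting list; posts 534 and 8964 come from the slides, while post 600 and the like by uid 15 are made up to show that unrelated entries are ignored.

```python
# Hypothetical posting list for predicate LIKE on Node Y: uid -> liked posts.
like_index = {
    70: [534, 600],
    92: [8964],
    15: [534],
}

def liked_posts(uids):
    """Return (uid, post) pairs for every post liked by the given uids."""
    pairs = set()
    for uid in uids:
        for post in like_index.get(uid, []):
            pairs.add((uid, post))
    return pairs

result_set_1 = {70, 92}  # shipped in from the friends-of-friends step
result_set_2 = liked_posts(result_set_1)
```

Keeping the liker's uid next to each post is a design choice of this sketch; it lets the final step report F.name without a second pass over the LIKE data.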
Query
1. Node Z contains relations for predicate AUTHOR.
2. Ship result set 2 to Node Z (1 RPC).
3. Seek to author X, and generate a list of posts authored by X: result set 3.
Author index
Query executor
(post)<-[author]-(A:author)
Result set 2
Result set 4: intersection of result sets 2 and 3
Friend ID  Friend name  Post ID  Author ID  Author name
70         ?            534      71070      ?
92         ?            8964     71070      ?
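The AUTHOR step and the intersection can be sketched as follows. `author_index` is a hypothetical stand-in for the relations on Node Z; author 555 and posts 600 and 9000 are invented so that the intersection actually filters something out.

```python
# Hypothetical AUTHOR relations on Node Z: author uid -> authored posts.
author_index = {
    71070: [534, 8964, 9000],
    555: [600],
}

# Result set 2 as shipped from the LIKE step: (friend uid, post) pairs.
result_set_2 = {(70, 534), (70, 600), (92, 8964)}
author_id = 71070

# Result set 3: every post written by author X.
result_set_3 = set(author_index.get(author_id, []))

# Result set 4: keep only liked posts that were also written by author X.
result_set_4 = sorted(
    (friend, post, author_id)
    for friend, post in result_set_2
    if post in result_set_3
)
```

At this point the rows hold uids only, which is why the name columns in the table above are still unknown.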
Node N contains names for all uids. Ship result set 4 to Node N (1 RPC), and convert uids to names by doing multiple seeks.
Query
Author index
Query executor
RETURN A.name, F.name
Result set 4
Result set 4: intersection of result sets 2 and 3
Friend ID  Friend name  Post ID  Author ID  Author name
70         Ailon        534      71070      Omri
92         Boaz         8964     71070      Omri
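The last hop, converting uids to names, might be sketched like this. `name_store` is a hypothetical stand-in for Node N, filled with the three names shown in the table.

```python
# Hypothetical name lookup held by Node N: uid -> name.
name_store = {70: "Ailon", 92: "Boaz", 71070: "Omri"}

# Result set 4 as shipped in: (friend uid, post uid, author uid).
result_set_4 = [(70, 534, 71070), (92, 8964, 71070)]

# RETURN A.name, F.name: one seek per uid that needs a name.
rows = [
    (name_store[author], name_store[friend])
    for friend, post, author in result_set_4
]
```

Counting the hops across the whole walkthrough gives four RPCs (Nodes X, Y, Z and N), independent of how many seeks each node performs locally.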
RedisGraph
Not distributed, yet.
Work in progress:
● Compact distributed index
● Concurrent fast independent traversals
JanusGraph, successor of Titan
● Relies on a storage backend, e.g. Cassandra.
● Provides a graph interface on top of a table.
● Delegates storing, replicating, distributing and persisting a graph to the underlying storage backend.
Takes a mature application from a similar domain and introduces a new data type API on top of an existing data structure (not optimal).
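To make "a graph interface on top of a table" concrete, here is a toy sketch. It is not JanusGraph's API or storage layout: the class `TableBackedGraph`, its methods and the column naming scheme are all assumptions, and a dictionary of rows stands in for the backend table.

```python
class TableBackedGraph:
    """Toy graph facade over a row/column table (illustrative only)."""

    def __init__(self):
        # One row per vertex id; each row maps column name -> value.
        self.table = {}

    def add_vertex(self, vid, **props):
        """Store each property as a 'prop:<name>' column in the vertex row."""
        row = self.table.setdefault(vid, {})
        for key, value in props.items():
            row["prop:" + key] = value

    def add_edge(self, src, label, dst):
        """Store an outgoing edge as an 'edge:<label>:<dst>' column."""
        self.table.setdefault(src, {})["edge:%s:%s" % (label, dst)] = dst
        self.table.setdefault(dst, {})

    def out_neighbors(self, vid, label):
        """Scan the vertex row for edge columns with the given label."""
        prefix = "edge:" + label + ":"
        row = self.table.get(vid, {})
        return [dst for col, dst in row.items() if col.startswith(prefix)]

g = TableBackedGraph()
g.add_vertex(1, name="Roi")
g.add_vertex(6, name="Japan")
g.add_edge(1, "visited", 6)
```

The graph layer only translates calls into row reads and writes; replication, distribution and persistence would be whatever the table's backend already provides.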
Solutions
Dgraph
Uses the concept of RDF N-Quads to represent connections, and Badger as its key-value store.
Both the graph index and the entities are distributed.
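As a loose illustration of the idea, connections can be written as N-Quad-style lines and grouped by predicate, matching the per-predicate posting lists in the walkthrough above. The helper `to_nquad`, the bare-uid notation inside angle brackets and the nested dictionary are assumptions of this sketch, not Dgraph's actual storage format or API.

```python
from collections import defaultdict

def to_nquad(subject, predicate, obj):
    """Format one connection as an N-Quad-style line: <s> <p> <o> ."""
    return "<%s> <%s> <%s> ." % (subject, predicate, obj)

# Connections reusing uids from the walkthrough (uid 12 is made up).
triples = [
    ("46", "friend", "12"),
    ("70", "like", "534"),
    ("71070", "author", "534"),
]

lines = [to_nquad(s, p, o) for s, p, o in triples]

# Group by predicate, then by subject: a stand-in for posting lists
# kept in a key-value store.
posting_lists = defaultdict(lambda: defaultdict(list))
for s, p, o in triples:
    posting_lists[p][s].append(o)
```

Grouping by predicate is what lets each predicate's data be placed on its own node, as with Nodes X, Y and Z earlier.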