Graph Distribution - redislabs.com

Graph Distribution

Graph Database

SRC -[Relation]-> DEST

Graph Database

Use cases:

Fraud detection

Recommendation engine

Social networks...

● Property graph
● Labeled entities
● Schema-less
● Cypher query language
● Aggregations, arithmetic expressions, sort...
● Tabular result set

RedisGraph

Structure

Tables

Person
Name   Age  Height
Roi    33   187
Hila   33   170
Shany  23   167
Amit   31   180

Country
Name    Population
Israel  8.5M
Japan   127M
Italy   60M

Visit
SRC  DEST
1    2
2    2
2    3
4    1
4    3

Documents

{
  ID: 1,
  Name: ‘Roi’,
  Age: 33,
  Height: 187,
  Visited: [6]
}

{
  ID: 6,
  Name: ‘Japan’,
  Population: 127M
}

Graph structure 101

Adjacency list

[Figure: a graph with nodes 1, 2, 3 and 4, shown next to its adjacency list]

Adjacency matrix

1 0 1 0 1 0 0 0

0 1 0 0 1 1 0 1

0 0 1 0 1 0 0 0

1 1 0 1 1 0 1 1

1 0 1 0 1 1 0 0

Node i is connected to node j if A[i,j] = 1.
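The two representations above can be sketched in a few lines of Python. The edge list here is a hypothetical four-node example, not the graph drawn on the slide; it is only meant to show how the same edges land in an adjacency list and in a matrix A where A[i,j] = 1 marks a connection.

```python
# Hypothetical directed edges for a four-node graph (nodes numbered 1..4).
edges = [(1, 2), (2, 3), (4, 1), (4, 3)]
n = 4

# Adjacency list: each node maps to the list of nodes it points to.
adj_list = {node: [] for node in range(1, n + 1)}
for src, dst in edges:
    adj_list[src].append(dst)

# Adjacency matrix: A[i-1][j-1] == 1 when node i is connected to node j.
A = [[0] * n for _ in range(n)]
for src, dst in edges:
    A[src - 1][dst - 1] = 1


def connected(i, j):
    """Return True if node i has an edge to node j (1-based node IDs)."""
    return A[i - 1][j - 1] == 1
```

The list costs memory proportional to the number of edges, while the matrix costs n * n cells regardless of how sparse the graph is, in exchange for a constant-time connectivity check.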

Hexastore

S: Subject

P: Predicate

O: Object

6 orderings: SPO, SOP, PSO, POS, OSP, OPS

Graph structure Hexastore

Triplets

SPO:Michael:Boss:Jim

SOP:Michael:Jim:Boss

OPS:Jim:Boss:Michael

OSP:Jim:Michael:Boss

PSO:Boss:Michael:Jim

POS:Boss:Jim:Michael

Michael (S) -[Boss (P)]-> Jim (O)
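As a rough illustration of the triplets above, the sketch below stores one (subject, predicate, object) triple under all six orderings and answers lookups with a prefix scan. The `Hexastore` class and its `add` and `scan` methods are hypothetical names for this example and say nothing about how RedisGraph implements its index; a real store would use a sorted structure rather than a linear scan over a set.

```python
from itertools import permutations


class Hexastore:
    """Toy hexastore: every triple is stored as six prefixed string keys."""

    def __init__(self):
        self.keys = set()

    def add(self, subject, predicate, obj):
        parts = {"S": subject, "P": predicate, "O": obj}
        # One key per ordering of S, P and O, e.g. "SPO:Michael:Boss:Jim".
        for order in permutations("SPO"):
            name = "".join(order)
            self.keys.add(":".join([name] + [parts[c] for c in order]))

    def scan(self, prefix):
        """Return all keys starting with prefix, in sorted order."""
        return sorted(k for k in self.keys if k.startswith(prefix))


store = Hexastore()
store.add("Michael", "Boss", "Jim")

# Who is Jim's boss? Fix the object and predicate, scan for the subject.
bosses_of_jim = store.scan("OPS:Jim:Boss:")
```

Whichever of the three parts is unknown, one of the six orderings puts the known parts first, so the answer is always a prefix scan.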

Node property set

Entities - Key value store.

Person node with attributes:

{
  ‘name’: ‘Bruce Buffer’,
  ‘age’: 60,
  ‘gender’: ‘male’
}

2 billion users

338 friends per user on average

676 billion edges (2 billion users * 338 friends)

152 terabytes ~= 2 billion users * (1024 * 32 bytes per user) + 676 billion edges * (64 * 2 bytes per edge)
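The estimate above can be checked with a few lines of arithmetic. The figures are the ones on the slide; the only assumption is that a terabyte is taken as 10^12 bytes.

```python
users = 2_000_000_000
avg_friends = 338
edges = users * avg_friends  # one edge per (user, friend) pair

bytes_per_user = 1024 * 32   # 32 KiB of properties per user
bytes_per_edge = 64 * 2      # two 64-byte entries per edge

total_bytes = users * bytes_per_user + edges * bytes_per_edge
total_tb = total_bytes / 10**12  # decimal terabytes (assumption)

print(f"{edges:,} edges, {total_tb:.0f} TB")
```

Roughly 65 TB of the total goes to user properties and 87 TB to edges, so the graph index is the larger half of the problem.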

Problem

Partitioning

Entities distribution

Property set 1 | Property set 2 | Graph index

Query

Find friends of mine who’ve visited places I’ve been to and are older than me.

MATCH (ME:person)-[friend]->(F:person)-[visited]->(C:country)<-[visit]-(ME)
WHERE ME.ID = 33 AND F.age > ME.age
RETURN F.name, C.name

Graph traversal (each step is resolved against the graph index)

1. (ME:person) WHERE ME.ID = 33

2. (ME:person)-[friend]->(F:person)

3. (F:person)-[visited]->(C:country)

4. (C:country)<-[visit]-(ME)

Result set

Friend ID  Friend name  Country ID  Country name
70         ?            25          ?
92         ?            55          ?
56         ?            4           ?

Query

WHERE F.age > ME.age RETURN F.name, C.name

NETWORK!

Index

Entities

Fetch age for ID 33

Query example continued

WHERE F.age > ME.age RETURN F.name, C.name

NETWORK!

Index

Entities

Fetch name of every entity in (IDs) where the entity’s age > 29

Result set

Friend ID  Friend name  Country ID  Country name
70         Noam         25          Japan
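The flow above, where the graph index yields IDs only and every property lookup costs a network round trip, can be sketched as follows. `EntityStore` is a hypothetical stand-in for a remote entity node, and the ages for IDs 70, 92 and 56 are made-up values chosen so that only friend 70 passes the filter, as in the result set on the slide.

```python
class EntityStore:
    """Stand-in for a remote node holding entity properties; counts RPCs."""

    def __init__(self, props):
        self.props = props
        self.rpcs = 0

    def fetch(self, ids, field):
        self.rpcs += 1  # one batched network round trip
        return {i: self.props[i][field] for i in ids}


# Graph index: relations between IDs only, no properties.
friend = {33: [70, 92, 56]}
visit = {33: [25, 55, 4], 70: [25], 92: [55], 56: [4]}

# Entity properties live elsewhere (ages of 70, 92 and 56 are hypothetical).
entities = EntityStore({
    33: {"age": 29},
    70: {"age": 31, "name": "Noam"},
    92: {"age": 25},
    56: {"age": 21},
    25: {"name": "Japan"},
})

me = 33
# Traversal on the index alone: friends who visited a country I visited.
pairs = [(f, c) for f in friend[me] for c in visit[f] if c in visit[me]]

# Filtering and projection need properties, so they hit the network.
my_age = entities.fetch([me], "age")[me]
ages = entities.fetch(sorted({f for f, _ in pairs}), "age")
kept = [(f, c) for f, c in pairs if ages[f] > my_age]
names = entities.fetch(sorted({i for pair in kept for i in pair}), "name")

rows = [(f, names[f], c, names[c]) for f, c in kept]
```

Batching the lookups keeps the number of round trips independent of the size of the intermediate result, which matters once the traversal returns thousands of candidate IDs.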

Index distribution

Friend relation | Visit relation | Graph index

Query

Find all posts liked by friends of friends of mine, written by author X.

MATCH (ME:person)-[friend]->(:person)-[friend]->(F:person)-[like]->(post)<-[author]-(A:author)
WHERE ME.ID = 46 AND A.ID = 71070
RETURN A.name, F.name

1. Node X contains FRIEND relations.

2. Seek to my ID in Node X (1 RPC). Retrieve a list of friend uids.

3. Do multiple seeks, one for each of the friend uids, to generate a list of friends-of-friends uids (result set 1).

Query

Friend index

Query executor

(ME:person)-[friend]->(:person)-[friend]->(F:person)

Result set 1: friends of friends

Friend ID  Friend name
70         ?
92         ?
56         ?

1. Node Y contains posting list for predicate LIKE.

2. Ship result set 1 to Node Y (1 RPC), and do seeks to generate a list of all posts liked by result set 1 (result set 2).

Query

Like index

Query executor

(F:person)-[like]->(post)

Result set 1

Result set 2: liked posts

Friend ID  Friend name  Post ID
70         ?            534
70         ?            431
92         ?            8964
56         ?            12
56         ?            5356

Query

1. Node Z contains relations for predicate AUTHOR.

2. Ship result set 2 to Node Z (1 RPC).

3. Seek to author X, and generate a list of posts authored by X (result set 3).

Author index

Query executor

(post)<-[author]-(A:author)

Result set 2

Result set 4: intersection of result sets 2 and 3

Friend ID  Friend name  Post ID  Author ID  Author name
70         ?            534      71070      ?
92         ?            8964     71070      ?

Node N contains names for all uids, ship result set 4 to Node N (1 RPC), and convert uids to names by doing multiple seeks.

Query

Author index

Query executor

RETURN A.name, F.name

Result set 4

Result set 4: intersection of result sets 2 and 3, with names resolved

Friend ID  Friend name  Post ID  Author ID  Author name
70         Ailon        534      71070      Omri
92         Boaz         8964     71070      Omri
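The whole walk-through, with one cluster node per predicate and one RPC per hop between nodes, can be sketched as below. The `Node` class and the four helper functions are hypothetical names for this illustration, and the direct friend IDs 11 and 12 are invented; every other ID and name comes from the result sets above.

```python
class Node:
    """Stand-in for a cluster node that owns one predicate; counts RPCs."""

    def __init__(self, data):
        self.data = data
        self.rpcs = 0

    def run(self, fn, *args):
        self.rpcs += 1  # ship the work to the node: one round trip
        return fn(self.data, *args)


def friends_of_friends(index, me):
    # Seek to my ID, then seek once per friend, all local to Node X.
    return [fof for f in index.get(me, []) for fof in index.get(f, [])]


def liked_posts(index, uids):
    return [(u, post) for u in uids for post in index.get(u, [])]


def authored_by(index, pairs, author):
    posts = set(index.get(author, []))               # result set 3
    return [(u, p) for u, p in pairs if p in posts]  # result set 4


def to_names(names, pairs, author):
    return [(u, names[u], p, author, names[author]) for u, p in pairs]


node_x = Node({46: [11, 12], 11: [70, 92], 12: [56]})        # FRIEND
node_y = Node({70: [534, 431], 92: [8964], 56: [12, 5356]})  # LIKE
node_z = Node({71070: [534, 8964]})                          # AUTHOR
node_n = Node({70: "Ailon", 92: "Boaz", 71070: "Omri"})      # names

rs1 = node_x.run(friends_of_friends, 46)
rs2 = node_y.run(liked_posts, rs1)
rs4 = node_z.run(authored_by, rs2, 71070)
rows = node_n.run(to_names, rs4, 71070)
total_rpcs = sum(n.rpcs for n in (node_x, node_y, node_z, node_n))
```

Because each predicate lives on a single node, the number of RPCs grows with the number of hops in the query pattern rather than with the size of the intermediate result sets.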

RedisGraph

Not distributed, yet.

Work in progress:

Compact distributed index

Concurrent fast independent traversals

(you)-[ask]->(question)

@roilipman

Solutions

JanusGraph, successor of Titan

● Relies on a storage backend, e.g. Cassandra.
● Provides a graph interface on top of a table.
● Delegates storing, replicating, distributing and persisting a graph to the underlying storage backend.

Takes a mature application from a similar domain and introduces a new data type API on top of its existing data structures (not optimal).

Solutions

Dgraph

Uses the RDF N-Quad concept to represent connections, and Badger as its key-value store.

Both the graph index and the entities are distributed.

Solutions

ArangoDB

From my understanding, this multi-model database uses documents to represent all three data types: documents, key-value store and graph.

Not sure how it distributes its data, but it uses Raft to ensure consistency. It is ACID.