FOSDEM 2012 - wolfSSL · [email protected] Technical / Community Update! FOSDEM 2012
Tesseract - FOSDEM · zcourts.com Tesseract Distributed Graph Database FOSDEM 2015 1 Courtney...
Transcript of Tesseract - FOSDEM · zcourts.com Tesseract Distributed Graph Database FOSDEM 2015 1 Courtney...
![Page 1: Tesseract - FOSDEM · zcourts.com Tesseract Distributed Graph Database FOSDEM 2015 1 Courtney Robinson courtney@zcourts.com 31 Jan 2015 Background from Gephi](https://reader031.fdocuments.us/reader031/viewer/2022022006/5ac870c87f8b9a51678c46e0/html5/thumbnails/1.jpg)
zcourts.com
Tesseract
Distributed Graph Database FOSDEM 2015
1
Courtney [email protected]
31 Jan 2015Background from Gephi
![Page 2: Tesseract - FOSDEM · zcourts.com Tesseract Distributed Graph Database FOSDEM 2015 1 Courtney Robinson courtney@zcourts.com 31 Jan 2015 Background from Gephi](https://reader031.fdocuments.us/reader031/viewer/2022022006/5ac870c87f8b9a51678c46e0/html5/thumbnails/2.jpg)
zcourts.com
• I can be found around the web as “zcourts”, Google it… • The web is one very prominent example of a graph • Too big for a single machine • So we must split or “partition” it over multiple • Partitioning is hard…in fact, it has been shown to be np-
complete • All we can do is edge closer to more “optimal” solutions • The Tesseract is an ongoing research project • Its focus is on distributed graph partitioning • The rest of this presentation is a series of solutions, which
together, takes one step closer to faster distributed graph processing
2
![Page 3: Tesseract - FOSDEM · zcourts.com Tesseract Distributed Graph Database FOSDEM 2015 1 Courtney Robinson courtney@zcourts.com 31 Jan 2015 Background from Gephi](https://reader031.fdocuments.us/reader031/viewer/2022022006/5ac870c87f8b9a51678c46e0/html5/thumbnails/3.jpg)
zcourts.com
Graph - A graph G is made up of a set of vertices and edges, G = (V,E)
Vertex - Smallest unit of user accessible datum
Edge - Connects two vertices, may have a direction
Property - Key value pair available on an Edge or Vertex
3
V1V6V4
V5V2
V3
V7
Terminology
![Page 4: Tesseract - FOSDEM · zcourts.com Tesseract Distributed Graph Database FOSDEM 2015 1 Courtney Robinson courtney@zcourts.com 31 Jan 2015 Background from Gephi](https://reader031.fdocuments.us/reader031/viewer/2022022006/5ac870c87f8b9a51678c46e0/html5/thumbnails/4.jpg)
zcourts.com
Aims of the Tesseract
1. Implement distributed eventually consistent graph database
2. Develop a distributed graph partitioning algorithm
3. Develop a computational model able to support both real time and batch processing on a distributed graph
4
![Page 5: Tesseract - FOSDEM · zcourts.com Tesseract Distributed Graph Database FOSDEM 2015 1 Courtney Robinson courtney@zcourts.com 31 Jan 2015 Background from Gephi](https://reader031.fdocuments.us/reader031/viewer/2022022006/5ac870c87f8b9a51678c46e0/html5/thumbnails/5.jpg)
zcourts.com
Aims of the Tesseract
1. Implement distributed eventually consistent graph database
2. Develop a distributed graph partitioning algorithm
3. Develop a computational model able to support both real time and batch processing on a distributed graph
5
![Page 7: Tesseract - FOSDEM · zcourts.com Tesseract Distributed Graph Database FOSDEM 2015 1 Courtney Robinson courtney@zcourts.com 31 Jan 2015 Background from Gephi](https://reader031.fdocuments.us/reader031/viewer/2022022006/5ac870c87f8b9a51678c46e0/html5/thumbnails/7.jpg)
zcourts.com
CRDTs…in one slideTM
Conflict free replicated data types
6
![Page 8: Tesseract - FOSDEM · zcourts.com Tesseract Distributed Graph Database FOSDEM 2015 1 Courtney Robinson courtney@zcourts.com 31 Jan 2015 Background from Gephi](https://reader031.fdocuments.us/reader031/viewer/2022022006/5ac870c87f8b9a51678c46e0/html5/thumbnails/8.jpg)
zcourts.com
CRDTs…in one slideTM
Conflict free replicated data typesi.e provably eventually consistent (Shapiro et al) replicated & distributed data structures.
6
![Page 9: Tesseract - FOSDEM · zcourts.com Tesseract Distributed Graph Database FOSDEM 2015 1 Courtney Robinson courtney@zcourts.com 31 Jan 2015 Background from Gephi](https://reader031.fdocuments.us/reader031/viewer/2022022006/5ac870c87f8b9a51678c46e0/html5/thumbnails/9.jpg)
zcourts.com
CRDTs…in one slideTM
Conflict free replicated data typesi.e provably eventually consistent (Shapiro et al) replicated & distributed data structures.
(1+2) + 3 = 1 + (2+3)
6
Associative
![Page 10: Tesseract - FOSDEM · zcourts.com Tesseract Distributed Graph Database FOSDEM 2015 1 Courtney Robinson courtney@zcourts.com 31 Jan 2015 Background from Gephi](https://reader031.fdocuments.us/reader031/viewer/2022022006/5ac870c87f8b9a51678c46e0/html5/thumbnails/10.jpg)
zcourts.com
CRDTs…in one slideTM
Conflict free replicated data typesi.e provably eventually consistent (Shapiro et al) replicated & distributed data structures.
(1+2) + 3 = 1 + (2+3)1 + 2 = 2 + 1
6
Associative
Commutative
![Page 11: Tesseract - FOSDEM · zcourts.com Tesseract Distributed Graph Database FOSDEM 2015 1 Courtney Robinson courtney@zcourts.com 31 Jan 2015 Background from Gephi](https://reader031.fdocuments.us/reader031/viewer/2022022006/5ac870c87f8b9a51678c46e0/html5/thumbnails/11.jpg)
zcourts.com
CRDTs…in one slideTM
Conflict free replicated data typesi.e provably eventually consistent (Shapiro et al) replicated & distributed data structures.
(1+2) + 3 = 1 + (2+3)1 + 2 = 2 + 1
1 + 1 ≠1
6
Associative
Commutative Not Idempotent
![Page 12: Tesseract - FOSDEM · zcourts.com Tesseract Distributed Graph Database FOSDEM 2015 1 Courtney Robinson courtney@zcourts.com 31 Jan 2015 Background from Gephi](https://reader031.fdocuments.us/reader031/viewer/2022022006/5ac870c87f8b9a51678c46e0/html5/thumbnails/12.jpg)
zcourts.com
CRDTs…in one slideTM
Conflict free replicated data typesi.e provably eventually consistent (Shapiro et al) replicated & distributed data structures.
(1+2) + 3 = 1 + (2+3)1 + 2 = 2 + 1
1 + 1 ≠1
6
Associative
Commutative Not Idempotent
![Page 13: Tesseract - FOSDEM · zcourts.com Tesseract Distributed Graph Database FOSDEM 2015 1 Courtney Robinson courtney@zcourts.com 31 Jan 2015 Background from Gephi](https://reader031.fdocuments.us/reader031/viewer/2022022006/5ac870c87f8b9a51678c46e0/html5/thumbnails/13.jpg)
zcourts.com
CRDTs…in one slideTM
Conflict free replicated data typesi.e provably eventually consistent (Shapiro et al) replicated & distributed data structures.
(1+2) + 3 = 1 + (2+3)1 + 2 = 2 + 1
1 + 1 ≠1
Unfortunately addition isn’t enough. The CIA properties are required to have a CRDT
6
Associative
Commutative Not Idempotent
![Page 14: Tesseract - FOSDEM · zcourts.com Tesseract Distributed Graph Database FOSDEM 2015 1 Courtney Robinson courtney@zcourts.com 31 Jan 2015 Background from Gephi](https://reader031.fdocuments.us/reader031/viewer/2022022006/5ac870c87f8b9a51678c46e0/html5/thumbnails/14.jpg)
zcourts.com
CRDTs…in one slideTM
Conflict free replicated data typesi.e provably eventually consistent (Shapiro et al) replicated & distributed data structures.
(1+2) + 3 = 1 + (2+3)1 + 2 = 2 + 1
1 + 1 ≠1
Unfortunately addition isn’t enough. The CIA properties are required to have a CRDT
Luckily, graphs can be represented by a common mathematical structure which exhibits all 3 properties… Sets!
6
Associative
Commutative Not Idempotent
![Page 15: Tesseract - FOSDEM · zcourts.com Tesseract Distributed Graph Database FOSDEM 2015 1 Courtney Robinson courtney@zcourts.com 31 Jan 2015 Background from Gephi](https://reader031.fdocuments.us/reader031/viewer/2022022006/5ac870c87f8b9a51678c46e0/html5/thumbnails/15.jpg)
zcourts.com
CRDTs…in one slideTM
Conflict free replicated data typesi.e provably eventually consistent (Shapiro et al) replicated & distributed data structures.
(1+2) + 3 = 1 + (2+3)1 + 2 = 2 + 1
1 + 1 ≠1
Unfortunately addition isn’t enough. The CIA properties are required to have a CRDT
Luckily, graphs can be represented by a common mathematical structure which exhibits all 3 properties… Sets!
6
Associative
Commutative Not Idempotent
![Page 16: Tesseract - FOSDEM · zcourts.com Tesseract Distributed Graph Database FOSDEM 2015 1 Courtney Robinson courtney@zcourts.com 31 Jan 2015 Background from Gephi](https://reader031.fdocuments.us/reader031/viewer/2022022006/5ac870c87f8b9a51678c46e0/html5/thumbnails/16.jpg)
zcourts.com
CRDTs…in one slideTM
Conflict free replicated data typesi.e provably eventually consistent (Shapiro et al) replicated & distributed data structures.
(1+2) + 3 = 1 + (2+3)1 + 2 = 2 + 1
1 + 1 ≠1
Unfortunately addition isn’t enough. The CIA properties are required to have a CRDT
Luckily, graphs can be represented by a common mathematical structure which exhibits all 3 properties… Sets!
Addition with sets is done using ∪
6
Associative
Commutative Not Idempotent
![Page 17: Tesseract - FOSDEM · zcourts.com Tesseract Distributed Graph Database FOSDEM 2015 1 Courtney Robinson courtney@zcourts.com 31 Jan 2015 Background from Gephi](https://reader031.fdocuments.us/reader031/viewer/2022022006/5ac870c87f8b9a51678c46e0/html5/thumbnails/17.jpg)
zcourts.com
CRDTs…in one slideTM
Conflict free replicated data typesi.e provably eventually consistent (Shapiro et al) replicated & distributed data structures.
(1+2) + 3 = 1 + (2+3)1 + 2 = 2 + 1
1 + 1 ≠1
Unfortunately addition isn’t enough. The CIA properties are required to have a CRDT
Luckily, graphs can be represented by a common mathematical structure which exhibits all 3 properties… Sets!
Addition with sets is done using ∪
(1 ∪ 2) ∪ 3 = 1 ∪ (2 ∪ 3)
6
Associative
Commutative Not Idempotent
Associative
![Page 18: Tesseract - FOSDEM · zcourts.com Tesseract Distributed Graph Database FOSDEM 2015 1 Courtney Robinson courtney@zcourts.com 31 Jan 2015 Background from Gephi](https://reader031.fdocuments.us/reader031/viewer/2022022006/5ac870c87f8b9a51678c46e0/html5/thumbnails/18.jpg)
zcourts.com
CRDTs…in one slideTM
Conflict free replicated data typesi.e provably eventually consistent (Shapiro et al) replicated & distributed data structures.
(1+2) + 3 = 1 + (2+3)1 + 2 = 2 + 1
1 + 1 ≠1
Unfortunately addition isn’t enough. The CIA properties are required to have a CRDT
Luckily, graphs can be represented by a common mathematical structure which exhibits all 3 properties… Sets!
Addition with sets is done using ∪
(1 ∪ 2) ∪ 3 = 1 ∪ (2 ∪ 3)1 ∪ 2 = 2 ∪ 1
6
Associative
Commutative Not Idempotent
Associative
Commutative
![Page 19: Tesseract - FOSDEM · zcourts.com Tesseract Distributed Graph Database FOSDEM 2015 1 Courtney Robinson courtney@zcourts.com 31 Jan 2015 Background from Gephi](https://reader031.fdocuments.us/reader031/viewer/2022022006/5ac870c87f8b9a51678c46e0/html5/thumbnails/19.jpg)
zcourts.com
CRDTs…in one slideTM
Conflict free replicated data typesi.e provably eventually consistent (Shapiro et al) replicated & distributed data structures.
(1+2) + 3 = 1 + (2+3)1 + 2 = 2 + 1
1 + 1 ≠1
Unfortunately addition isn’t enough. The CIA properties are required to have a CRDT
Luckily, graphs can be represented by a common mathematical structure which exhibits all 3 properties… Sets!
Addition with sets is done using ∪
(1 ∪ 2) ∪ 3 = 1 ∪ (2 ∪ 3)1 ∪ 2 = 2 ∪ 1
1 ∪ 1 = 16
Associative
Commutative Not Idempotent
Associative
Commutative Idempotent!
![Page 22: Tesseract - FOSDEM · zcourts.com Tesseract Distributed Graph Database FOSDEM 2015 1 Courtney Robinson courtney@zcourts.com 31 Jan 2015 Background from Gephi](https://reader031.fdocuments.us/reader031/viewer/2022022006/5ac870c87f8b9a51678c46e0/html5/thumbnails/22.jpg)
zcourts.com
I lied, two slides…TM?
• Several types of CRDTs are available. • They provide us with “Strong Eventual Consistency”
i.e. given states propagate we’re provably guaranteed to converge.
7
![Page 23: Tesseract - FOSDEM · zcourts.com Tesseract Distributed Graph Database FOSDEM 2015 1 Courtney Robinson courtney@zcourts.com 31 Jan 2015 Background from Gephi](https://reader031.fdocuments.us/reader031/viewer/2022022006/5ac870c87f8b9a51678c46e0/html5/thumbnails/23.jpg)
zcourts.com
I lied, two slides…TM?
• Several types of CRDTs are available. • They provide us with “Strong Eventual Consistency”
i.e. given states propagate we’re provably guaranteed to converge.
• OR-set i.e. “Observed Removed”…add wins!
7
![Page 24: Tesseract - FOSDEM · zcourts.com Tesseract Distributed Graph Database FOSDEM 2015 1 Courtney Robinson courtney@zcourts.com 31 Jan 2015 Background from Gephi](https://reader031.fdocuments.us/reader031/viewer/2022022006/5ac870c87f8b9a51678c46e0/html5/thumbnails/24.jpg)
zcourts.com
I lied, two slides…TM?
• Several types of CRDTs are available. • They provide us with “Strong Eventual Consistency”
i.e. given states propagate we’re provably guaranteed to converge.
• OR-set i.e. “Observed Removed”…add wins!
7
S1
S2
S3
s{}
{}
{}
= insert = merge = replicate = remove
![Page 25: Tesseract - FOSDEM · zcourts.com Tesseract Distributed Graph Database FOSDEM 2015 1 Courtney Robinson courtney@zcourts.com 31 Jan 2015 Background from Gephi](https://reader031.fdocuments.us/reader031/viewer/2022022006/5ac870c87f8b9a51678c46e0/html5/thumbnails/25.jpg)
zcourts.com
I lied, two slides…TM?
• Several types of CRDTs are available. • They provide us with “Strong Eventual Consistency”
i.e. given states propagate we’re provably guaranteed to converge.
• OR-set i.e. “Observed Removed”…add wins!
7
S1
S2
S3
s{}
{}
{}
add(a)
= insert = merge = replicate = remove
![Page 26: Tesseract - FOSDEM · zcourts.com Tesseract Distributed Graph Database FOSDEM 2015 1 Courtney Robinson courtney@zcourts.com 31 Jan 2015 Background from Gephi](https://reader031.fdocuments.us/reader031/viewer/2022022006/5ac870c87f8b9a51678c46e0/html5/thumbnails/26.jpg)
zcourts.com
I lied, two slides…TM?
• Several types of CRDTs are available. • They provide us with “Strong Eventual Consistency”
i.e. given states propagate we’re provably guaranteed to converge.
• OR-set i.e. “Observed Removed”…add wins!
7
S1
S2
S3
s{}
{}
{}
add(a)
{aπ}
= insert = merge = replicate = remove
![Page 27: Tesseract - FOSDEM · zcourts.com Tesseract Distributed Graph Database FOSDEM 2015 1 Courtney Robinson courtney@zcourts.com 31 Jan 2015 Background from Gephi](https://reader031.fdocuments.us/reader031/viewer/2022022006/5ac870c87f8b9a51678c46e0/html5/thumbnails/27.jpg)
zcourts.com
I lied, two slides…TM?
• Several types of CRDTs are available. • They provide us with “Strong Eventual Consistency”
i.e. given states propagate we’re provably guaranteed to converge.
• OR-set i.e. “Observed Removed”…add wins!
7
S1
S2
S3
s{}
{}
{}
add(a)
{aπ}
add(a)
{aλ}
= insert = merge = replicate = remove
![Page 28: Tesseract - FOSDEM · zcourts.com Tesseract Distributed Graph Database FOSDEM 2015 1 Courtney Robinson courtney@zcourts.com 31 Jan 2015 Background from Gephi](https://reader031.fdocuments.us/reader031/viewer/2022022006/5ac870c87f8b9a51678c46e0/html5/thumbnails/28.jpg)
zcourts.com
I lied, two slides…TM?
• Several types of CRDTs are available. • They provide us with “Strong Eventual Consistency”
i.e. given states propagate we’re provably guaranteed to converge.
• OR-set i.e. “Observed Removed”…add wins!
7
S1
S2
S3
s{}
{}
{}
add(a)
{aπ}
add(a)
{aλ}
{aλ}
{aλ}
= insert = merge = replicate = remove
![Page 29: Tesseract - FOSDEM · zcourts.com Tesseract Distributed Graph Database FOSDEM 2015 1 Courtney Robinson courtney@zcourts.com 31 Jan 2015 Background from Gephi](https://reader031.fdocuments.us/reader031/viewer/2022022006/5ac870c87f8b9a51678c46e0/html5/thumbnails/29.jpg)
zcourts.com
I lied, two slides…TM?
• Several types of CRDTs are available. • They provide us with “Strong Eventual Consistency”
i.e. given states propagate we’re provably guaranteed to converge.
• OR-set i.e. “Observed Removed”…add wins!
7
S1
S2
S3
s{}
{}
{}
add(a)
{aπ}
add(a)
{aλ}
{aλ} {aλ,aπ}
{aλ}
{aπ}
= insert = merge = replicate = remove
![Page 30: Tesseract - FOSDEM · zcourts.com Tesseract Distributed Graph Database FOSDEM 2015 1 Courtney Robinson courtney@zcourts.com 31 Jan 2015 Background from Gephi](https://reader031.fdocuments.us/reader031/viewer/2022022006/5ac870c87f8b9a51678c46e0/html5/thumbnails/30.jpg)
zcourts.com
I lied, two slides…TM?
• Several types of CRDTs are available. • They provide us with “Strong Eventual Consistency”
i.e. given states propagate we’re provably guaranteed to converge.
• OR-set i.e. “Observed Removed”…add wins!
7
S1
S2
S3
s{}
{}
{}
add(a)
{aπ}
add(a)
{aλ}
{aλ} {aλ,aπ}
Each node adds “a” with a unique tag
locally
{aλ}
{aπ}
= insert = merge = replicate = remove
![Page 31: Tesseract - FOSDEM · zcourts.com Tesseract Distributed Graph Database FOSDEM 2015 1 Courtney Robinson courtney@zcourts.com 31 Jan 2015 Background from Gephi](https://reader031.fdocuments.us/reader031/viewer/2022022006/5ac870c87f8b9a51678c46e0/html5/thumbnails/31.jpg)
zcourts.com
I lied, two slides…TM?
• Several types of CRDTs are available. • They provide us with “Strong Eventual Consistency”
i.e. given states propagate we’re provably guaranteed to converge.
• OR-set i.e. “Observed Removed”…add wins!
7
S1
S2
S3
s{}
{}
{}
add(a)
{aπ}
add(a)
{aλ}
{aλ} {aλ,aπ}
{aλ}
{aπ}
= insert = merge = replicate = remove
![Page 32: Tesseract - FOSDEM · zcourts.com Tesseract Distributed Graph Database FOSDEM 2015 1 Courtney Robinson courtney@zcourts.com 31 Jan 2015 Background from Gephi](https://reader031.fdocuments.us/reader031/viewer/2022022006/5ac870c87f8b9a51678c46e0/html5/thumbnails/32.jpg)
zcourts.com
I lied, two slides…TM?
• Several types of CRDTs are available. • They provide us with “Strong Eventual Consistency”
i.e. given states propagate we’re provably guaranteed to converge.
• OR-set i.e. “Observed Removed”…add wins!
7
S1
S2
S3
s{}
{}
{}
add(a)
{aπ}
add(a)
{aλ}
{aλ} {aλ,aπ}
del(a)
{-aπ}
{aλ}
{aπ}
= insert = merge = replicate = remove
![Page 33: Tesseract - FOSDEM · zcourts.com Tesseract Distributed Graph Database FOSDEM 2015 1 Courtney Robinson courtney@zcourts.com 31 Jan 2015 Background from Gephi](https://reader031.fdocuments.us/reader031/viewer/2022022006/5ac870c87f8b9a51678c46e0/html5/thumbnails/33.jpg)
zcourts.com
I lied, two slides…TM?
• Several types of CRDTs are available. • They provide us with “Strong Eventual Consistency”
i.e. given states propagate we’re provably guaranteed to converge.
• OR-set i.e. “Observed Removed”…add wins!
7
S1
S2
S3
s{}
{}
{}
add(a)
{aπ}
add(a)
{aλ}
{aλ} {aλ,aπ}
del(a)
{aλ,-aπ}
{-aπ}
{-aπ}
{aλ}
{aπ}
= insert = merge = replicate = remove
![Page 34: Tesseract - FOSDEM · zcourts.com Tesseract Distributed Graph Database FOSDEM 2015 1 Courtney Robinson courtney@zcourts.com 31 Jan 2015 Background from Gephi](https://reader031.fdocuments.us/reader031/viewer/2022022006/5ac870c87f8b9a51678c46e0/html5/thumbnails/34.jpg)
zcourts.com
I lied, two slides…TM?
• Several types of CRDTs are available. • They provide us with “Strong Eventual Consistency”
i.e. given states propagate we’re provably guaranteed to converge.
• OR-set i.e. “Observed Removed”…add wins!
7
S1
S2
S3
s{}
{}
{}
add(a)
{aπ}
add(a)
{aλ}
{aλ} {aλ,aπ}
del(a)
{aλ,-aπ}
{-aπ}
{-aπ}
{aλ}
{aπ}
Mark only -aπ as deleted.
= insert = merge = replicate = remove
![Page 35: Tesseract - FOSDEM · zcourts.com Tesseract Distributed Graph Database FOSDEM 2015 1 Courtney Robinson courtney@zcourts.com 31 Jan 2015 Background from Gephi](https://reader031.fdocuments.us/reader031/viewer/2022022006/5ac870c87f8b9a51678c46e0/html5/thumbnails/35.jpg)
zcourts.com
I lied, two slides…TM?
• Several types of CRDTs are available. • They provide us with “Strong Eventual Consistency”
i.e. given states propagate we’re provably guaranteed to converge.
• OR-set i.e. “Observed Removed”…add wins!
7
S1
S2
S3
s{}
{}
{}
add(a)
{aπ}
add(a)
{aλ}
{aλ} {aλ,aπ}
del(a)
{aλ,-aπ}
{-aπ}
{-aπ}
{aλ}
{aπ}
= insert = merge = replicate = remove
![Page 36: Tesseract - FOSDEM · zcourts.com Tesseract Distributed Graph Database FOSDEM 2015 1 Courtney Robinson courtney@zcourts.com 31 Jan 2015 Background from Gephi](https://reader031.fdocuments.us/reader031/viewer/2022022006/5ac870c87f8b9a51678c46e0/html5/thumbnails/36.jpg)
zcourts.com
I lied, two slides…TM?
• Several types of CRDTs are available. • They provide us with “Strong Eventual Consistency”
i.e. given states propagate we’re provably guaranteed to converge.
• OR-set i.e. “Observed Removed”…add wins!
7
S1
S2
S3
s{}
{}
{}
add(a)
{aπ}
add(a)
{aλ}
{aλ} {aλ,aπ}
del(a)
{aλ,-aπ}
{-aπ}
{-aπ}
{aλ}
{aπ}{aλ,-aπ}
= insert = merge = replicate = remove
![Page 37: Tesseract - FOSDEM · zcourts.com Tesseract Distributed Graph Database FOSDEM 2015 1 Courtney Robinson courtney@zcourts.com 31 Jan 2015 Background from Gephi](https://reader031.fdocuments.us/reader031/viewer/2022022006/5ac870c87f8b9a51678c46e0/html5/thumbnails/37.jpg)
zcourts.com
I lied, two slides…TM?
• Several types of CRDTs are available. • They provide us with “Strong Eventual Consistency”
i.e. given states propagate we’re provably guaranteed to converge.
• OR-set i.e. “Observed Removed”…add wins!
7
S1
S2
S3
s{}
{}
{}
add(a)
{aπ}
add(a)
{aλ}
{aλ} {aλ,aπ}
del(a)
{aλ,-aπ}
{-aπ}
{-aπ}
{aλ}
{aπ}{aλ,-aπ}
{aλ,-aπ}
= insert = merge = replicate = remove
![Page 38: Tesseract - FOSDEM · zcourts.com Tesseract Distributed Graph Database FOSDEM 2015 1 Courtney Robinson courtney@zcourts.com 31 Jan 2015 Background from Gephi](https://reader031.fdocuments.us/reader031/viewer/2022022006/5ac870c87f8b9a51678c46e0/html5/thumbnails/38.jpg)
zcourts.com
I lied, two slides…TM?
• Several types of CRDTs are available. • They provide us with “Strong Eventual Consistency”
i.e. given states propagate we’re provably guaranteed to converge.
• OR-set i.e. “Observed Removed”…add wins!
7
S1
S2
S3
s{}
{}
{}
add(a)
{aπ}
add(a)
{aλ}
{aλ} {aλ,aπ}
del(a)
{aλ,-aπ}
{-aπ}
{-aπ}
{aλ}
{aπ}{aλ,-aπ}
{aλ,-aπ}
Merge takes symmetrical difference of the local and remote sets resulting in aλ
being in the set
= insert = merge = replicate = remove
![Page 39: Tesseract - FOSDEM · zcourts.com Tesseract Distributed Graph Database FOSDEM 2015 1 Courtney Robinson courtney@zcourts.com 31 Jan 2015 Background from Gephi](https://reader031.fdocuments.us/reader031/viewer/2022022006/5ac870c87f8b9a51678c46e0/html5/thumbnails/39.jpg)
zcourts.com
I lied, two slides…TM?
• Several types of CRDTs are available. • They provide us with “Strong Eventual Consistency”
i.e. given states propagate we’re provably guaranteed to converge.
• OR-set i.e. “Observed Removed”…add wins!
7
S1
S2
S3
s{}
{}
{}
add(a)
{aπ}
add(a)
{aλ}
{aλ} {aλ,aπ}
del(a)
{aλ,-aπ}
{-aπ}
{-aπ}
{aλ}
{aπ}{aλ,-aπ}
{aλ,-aπ}
{aλ}
{aλ}
Merge takes symmetrical difference of the local and remote sets resulting in aλ
being in the set
= insert = merge = replicate = remove
{aλ}
![Page 40: Tesseract - FOSDEM · zcourts.com Tesseract Distributed Graph Database FOSDEM 2015 1 Courtney Robinson courtney@zcourts.com 31 Jan 2015 Background from Gephi](https://reader031.fdocuments.us/reader031/viewer/2022022006/5ac870c87f8b9a51678c46e0/html5/thumbnails/40.jpg)
zcourts.com
I lied, two slides…TM?
• Several types of CRDTs are available. • They provide us with “Strong Eventual Consistency”
i.e. given states propagate we’re provably guaranteed to converge.
• OR-set i.e. “Observed Removed”…add wins!
7
S1
S2
S3
s{}
{}
{}
add(a)
{aπ}
add(a)
{aλ}
{aλ} {aλ,aπ}
del(a)
{aλ,-aπ}
{-aπ}
{-aπ}
{aλ}
{aπ}{aλ,-aπ}
{aλ,-aπ}
{aλ}
{aλ}
= insert = merge = replicate = remove
{aλ}
![Page 41: Tesseract - FOSDEM · zcourts.com Tesseract Distributed Graph Database FOSDEM 2015 1 Courtney Robinson courtney@zcourts.com 31 Jan 2015 Background from Gephi](https://reader031.fdocuments.us/reader031/viewer/2022022006/5ac870c87f8b9a51678c46e0/html5/thumbnails/41.jpg)
zcourts.com
I lied, two slides…TM?
• Several types of CRDTs are available. • They provide us with “Strong Eventual Consistency”
i.e. given states propagate we’re provably guaranteed to converge.
• OR-set i.e. “Observed Removed”…add wins!
7
S1
S2
S3
s{}
{}
{}
add(a)
{aπ}
add(a)
{aλ}
{aλ} {aλ,aπ}
del(a)
{aλ,-aπ}
{-aπ}
{-aπ}
{aλ}
{aπ}{aλ,-aπ}
{aλ,-aπ}
{aλ}
{aλ}
= insert = merge = replicate = remove
{aλ}
• User never sees tags! • Query time checks are used to enable DAGs (if violation of DAG constraint is detected
then the runtime simply says the violating edge does not exist and triggers clean up) • Note,the deleted “a” is optionally kept as a tombstone if the runtime is configured to
support “snapshots”
![Page 42: Tesseract - FOSDEM · zcourts.com Tesseract Distributed Graph Database FOSDEM 2015 1 Courtney Robinson courtney@zcourts.com 31 Jan 2015 Background from Gephi](https://reader031.fdocuments.us/reader031/viewer/2022022006/5ac870c87f8b9a51678c46e0/html5/thumbnails/42.jpg)
zcourts.com
Aims of the Tesseract
1. Implement distributed eventually consistent graph database
2. Develop a distributed graph partitioning algorithm
3. Develop a computational model able to support both real time and batch processing on a distributed graph
8
![Page 43: Tesseract - FOSDEM · zcourts.com Tesseract Distributed Graph Database FOSDEM 2015 1 Courtney Robinson courtney@zcourts.com 31 Jan 2015 Background from Gephi](https://reader031.fdocuments.us/reader031/viewer/2022022006/5ac870c87f8b9a51678c46e0/html5/thumbnails/43.jpg)
zcourts.com
CRDTs again…because they’re important
• One very important property of a CRDT is:
{a,b,c,d} :⇔ {a,b} ∪ {c,d}
• Those two sets being logically equivalent is a
desirable property
• Enables partitioning (with rendezvous hashing for e.g.)
9
![Page 44: Tesseract - FOSDEM · zcourts.com Tesseract Distributed Graph Database FOSDEM 2015 1 Courtney Robinson courtney@zcourts.com 31 Jan 2015 Background from Gephi](https://reader031.fdocuments.us/reader031/viewer/2022022006/5ac870c87f8b9a51678c46e0/html5/thumbnails/44.jpg)
zcourts.com
Naïve “cascading vertices”• Naïve graph partitioning • Depends on the query model to make up for its Naïvety • Uses hashing to place data • Two cascading algorithms formulated from:
V = the vertex to cascade n = max nodes to cascade across ń = auto-determined value of n, using logistics growth model d = deg(v) = Degree of V e = ⟨∀deg(v) ∈ G⟩ i.e. average degree of all vertices in the graph |nV| = Max number of edges per node for a vertex
i.e. cascading point (min number of edges before cascading occurs)
1. |nV| = d / n - user provides n, split evenly across nodes 2. |nV| = max(d,e) / n - user provides n, split evenly based on d or e
if e is bigger
10
![Page 45: Tesseract - FOSDEM · zcourts.com Tesseract Distributed Graph Database FOSDEM 2015 1 Courtney Robinson courtney@zcourts.com 31 Jan 2015 Background from Gephi](https://reader031.fdocuments.us/reader031/viewer/2022022006/5ac870c87f8b9a51678c46e0/html5/thumbnails/45.jpg)
zcourts.com
“Cascading vertices” by example
1111
• Let’s use Twitter followers as an example • Each letter represents a unique follower
![Page 46: Tesseract - FOSDEM · zcourts.com Tesseract Distributed Graph Database FOSDEM 2015 1 Courtney Robinson courtney@zcourts.com 31 Jan 2015 Background from Gephi](https://reader031.fdocuments.us/reader031/viewer/2022022006/5ac870c87f8b9a51678c46e0/html5/thumbnails/46.jpg)
zcourts.com
“Cascading vertices” by example
1111
S1
S2
S3
s{}
{}
{}
= insert = cascade
• Let’s use Twitter followers as an example • Each letter represents a unique follower
![Page 47: Tesseract - FOSDEM · zcourts.com Tesseract Distributed Graph Database FOSDEM 2015 1 Courtney Robinson courtney@zcourts.com 31 Jan 2015 Background from Gephi](https://reader031.fdocuments.us/reader031/viewer/2022022006/5ac870c87f8b9a51678c46e0/html5/thumbnails/47.jpg)
zcourts.com
“Cascading vertices” by example
1111
S1
S2
S3
s{}
{}
{}
add(…) x f
= insert = cascade
• Let’s use Twitter followers as an example • Each letter represents a unique follower
![Page 48: Tesseract - FOSDEM · zcourts.com Tesseract Distributed Graph Database FOSDEM 2015 1 Courtney Robinson courtney@zcourts.com 31 Jan 2015 Background from Gephi](https://reader031.fdocuments.us/reader031/viewer/2022022006/5ac870c87f8b9a51678c46e0/html5/thumbnails/48.jpg)
zcourts.com
“Cascading vertices” by example
1111
S1
S2
S3
s{}
{}
{}
add(…) x f
= insert = cascade
• Let’s use Twitter followers as an example • Each letter represents a unique follower
add(…) performs a cascade(deg(V))
![Page 49: Tesseract - FOSDEM · zcourts.com Tesseract Distributed Graph Database FOSDEM 2015 1 Courtney Robinson courtney@zcourts.com 31 Jan 2015 Background from Gephi](https://reader031.fdocuments.us/reader031/viewer/2022022006/5ac870c87f8b9a51678c46e0/html5/thumbnails/49.jpg)
zcourts.com
“Cascading vertices” by example
1111
S1
S2
S3
s{}
{}
{}
add(…) x f
= insert = cascade
• Let’s use Twitter followers as an example • Each letter represents a unique follower
![Page 50: Tesseract - FOSDEM · zcourts.com Tesseract Distributed Graph Database FOSDEM 2015 1 Courtney Robinson courtney@zcourts.com 31 Jan 2015 Background from Gephi](https://reader031.fdocuments.us/reader031/viewer/2022022006/5ac870c87f8b9a51678c46e0/html5/thumbnails/50.jpg)
zcourts.com
“Cascading vertices” by example
1111
S1
S2
S3
s{}
{}
{}
add(…) x f
{a,b…n/threshold}
= insert = cascade
• Let’s use Twitter followers as an example • Each letter represents a unique follower
![Page 51: Tesseract - FOSDEM · zcourts.com Tesseract Distributed Graph Database FOSDEM 2015 1 Courtney Robinson courtney@zcourts.com 31 Jan 2015 Background from Gephi](https://reader031.fdocuments.us/reader031/viewer/2022022006/5ac870c87f8b9a51678c46e0/html5/thumbnails/51.jpg)
zcourts.com
“Cascading vertices” by example
1111
S1
S2
S3
s{}
{}
{}
add(…) x f
{a,b…n/threshold}
= insert = cascade
• Let’s use Twitter followers as an example • Each letter represents a unique follower
{r,s…n/2*threshold}
![Page 52: Tesseract - FOSDEM · zcourts.com Tesseract Distributed Graph Database FOSDEM 2015 1 Courtney Robinson courtney@zcourts.com 31 Jan 2015 Background from Gephi](https://reader031.fdocuments.us/reader031/viewer/2022022006/5ac870c87f8b9a51678c46e0/html5/thumbnails/52.jpg)
zcourts.com
“Cascading vertices” by example
1111
S1
S2
S3
s{}
{}
{}
add(…) x f
{a,b…n/threshold}
= insert = cascade
• Let’s use Twitter followers as an example • Each letter represents a unique follower
{r,s…n/2*threshold}
cascade(deg(v)) >= threshold
![Page 53: Tesseract - FOSDEM · zcourts.com Tesseract Distributed Graph Database FOSDEM 2015 1 Courtney Robinson courtney@zcourts.com 31 Jan 2015 Background from Gephi](https://reader031.fdocuments.us/reader031/viewer/2022022006/5ac870c87f8b9a51678c46e0/html5/thumbnails/53.jpg)
zcourts.com
“Cascading vertices” by example
1111
S1
S2
S3
s{}
{}
{}
add(…) x f
{a,b…n/threshold}
= insert = cascade
• Let’s use Twitter followers as an example • Each letter represents a unique follower
{r,s…n/2*threshold}
![Page 54: Tesseract - FOSDEM · zcourts.com Tesseract Distributed Graph Database FOSDEM 2015 1 Courtney Robinson courtney@zcourts.com 31 Jan 2015 Background from Gephi](https://reader031.fdocuments.us/reader031/viewer/2022022006/5ac870c87f8b9a51678c46e0/html5/thumbnails/54.jpg)
zcourts.com
“Cascading vertices” by example
1111
S1
S2
S3
s{}
{}
{}
add(…) x f
{a,b…n/threshold}add(…) x f
= insert = cascade
• Let’s use Twitter followers as an example • Each letter represents a unique follower
{r,s…n/2*threshold}
![Page 55: Tesseract - FOSDEM · zcourts.com Tesseract Distributed Graph Database FOSDEM 2015 1 Courtney Robinson courtney@zcourts.com 31 Jan 2015 Background from Gephi](https://reader031.fdocuments.us/reader031/viewer/2022022006/5ac870c87f8b9a51678c46e0/html5/thumbnails/55.jpg)
zcourts.com
“Cascading vertices” by example
1111
S1
S2
S3
s{}
{}
{}
add(…) x f
{a,b…n/threshold}add(…) x f
= insert = cascade
• Let’s use Twitter followers as an example • Each letter represents a unique follower
{r,s…n/2*threshold}
![Page 56: Tesseract - FOSDEM · zcourts.com Tesseract Distributed Graph Database FOSDEM 2015 1 Courtney Robinson courtney@zcourts.com 31 Jan 2015 Background from Gephi](https://reader031.fdocuments.us/reader031/viewer/2022022006/5ac870c87f8b9a51678c46e0/html5/thumbnails/56.jpg)
zcourts.com
“Cascading vertices” by example
1111
S1
S2
S3
s{}
{}
{}
add(…) x f
{a,b…n/threshold}add(…) x f
= insert = cascade
• Let’s use Twitter followers as an example • Each letter represents a unique follower
{r,s…n/2*threshold}
{w,x…n/3*threshold}
![Page 57: Tesseract - FOSDEM · zcourts.com Tesseract Distributed Graph Database FOSDEM 2015 1 Courtney Robinson courtney@zcourts.com 31 Jan 2015 Background from Gephi](https://reader031.fdocuments.us/reader031/viewer/2022022006/5ac870c87f8b9a51678c46e0/html5/thumbnails/57.jpg)
zcourts.com
“Cascading vertices” by example
1111
S1
S2
S3
s{}
{}
{}
add(…) x f
{a,b…n/threshold}add(…) x f
= insert = cascade
• Let’s use Twitter followers as an example • Each letter represents a unique follower
{r,s…n/2*threshold}add(…) x f
{w,x…n/3*threshold}
![Page 58: Tesseract - FOSDEM · zcourts.com Tesseract Distributed Graph Database FOSDEM 2015 1 Courtney Robinson courtney@zcourts.com 31 Jan 2015 Background from Gephi](https://reader031.fdocuments.us/reader031/viewer/2022022006/5ac870c87f8b9a51678c46e0/html5/thumbnails/58.jpg)
zcourts.com
“Cascading vertices” by example
1111
S1
S2
S3
s{}
{}
{}
add(…) x f
{a,b…n/threshold}add(…) x f
= insert = cascade
• Let’s use Twitter followers as an example • Each letter represents a unique follower
{r,s…n/2*threshold}add(…) x f
{w,x…n/3*threshold}
![Page 59: Tesseract - FOSDEM · zcourts.com Tesseract Distributed Graph Database FOSDEM 2015 1 Courtney Robinson courtney@zcourts.com 31 Jan 2015 Background from Gephi](https://reader031.fdocuments.us/reader031/viewer/2022022006/5ac870c87f8b9a51678c46e0/html5/thumbnails/59.jpg)
zcourts.com
“Cascading vertices” by example
1111
S1
S2
S3
s{}
{}
{}
add(…) x f
{a,b…n/threshold}add(…) x f
= insert = cascade
• Let’s use Twitter followers as an example • Each letter represents a unique follower
{r,s…n/2*threshold}add(…) x f
{w,x…n/3*threshold}
wrap and repeat
![Page 60: Tesseract - FOSDEM · zcourts.com Tesseract Distributed Graph Database FOSDEM 2015 1 Courtney Robinson courtney@zcourts.com 31 Jan 2015 Background from Gephi](https://reader031.fdocuments.us/reader031/viewer/2022022006/5ac870c87f8b9a51678c46e0/html5/thumbnails/60.jpg)
zcourts.com
“Cascading vertices” by example
1111
S1
S2
S3
s{}
{}
{}
add(…) x f
{a,b…n/threshold}add(…) x f
= insert = cascade
• Let’s use Twitter followers as an example • Each letter represents a unique follower
{r,s…n/2*threshold}add(…) x f
{w,x…n/3*threshold}
![Page 61: Tesseract - FOSDEM · zcourts.com Tesseract Distributed Graph Database FOSDEM 2015 1 Courtney Robinson courtney@zcourts.com 31 Jan 2015 Background from Gephi](https://reader031.fdocuments.us/reader031/viewer/2022022006/5ac870c87f8b9a51678c46e0/html5/thumbnails/61.jpg)
zcourts.com
Aims of the Tesseract
1. Implement distributed eventually consistent graph database
2. Develop a distributed graph partitioning algorithm
3. Develop a computational model able to support both real time and batch processing on a distributed graph
12
![Page 62: Tesseract - FOSDEM · zcourts.com Tesseract Distributed Graph Database FOSDEM 2015 1 Courtney Robinson courtney@zcourts.com 31 Jan 2015 Background from Gephi](https://reader031.fdocuments.us/reader031/viewer/2022022006/5ac870c87f8b9a51678c46e0/html5/thumbnails/62.jpg)
zcourts.com
Distributed computationLocalised calculations
Amortisation
Memoization
13
![Page 63: Tesseract - FOSDEM · zcourts.com Tesseract Distributed Graph Database FOSDEM 2015 1 Courtney Robinson courtney@zcourts.com 31 Jan 2015 Background from Gephi](https://reader031.fdocuments.us/reader031/viewer/2022022006/5ac870c87f8b9a51678c46e0/html5/thumbnails/63.jpg)
zcourts.com
Amortisation• Optimise to perform more “cheap” computations • This allows us to occasionally pay the cost of more “expensive”
operations such that they computationally balance out • e.g. Checking data locally on a node vs querying over a
network
7
1
1
1
1
1
1
1
![Page 64: Tesseract - FOSDEM · zcourts.com Tesseract Distributed Graph Database FOSDEM 2015 1 Courtney Robinson courtney@zcourts.com 31 Jan 2015 Background from Gephi](https://reader031.fdocuments.us/reader031/viewer/2022022006/5ac870c87f8b9a51678c46e0/html5/thumbnails/64.jpg)
zcourts.com
Amortisation• Optimise to perform more “cheap” computations • This allows us to occasionally pay the cost of more “expensive”
operations such that they computationally balance out • e.g. Checking data locally on a node vs querying over a
network
14
7
1
1
1
1
1
1
1
![Page 65: Tesseract - FOSDEM · zcourts.com Tesseract Distributed Graph Database FOSDEM 2015 1 Courtney Robinson courtney@zcourts.com 31 Jan 2015 Background from Gephi](https://reader031.fdocuments.us/reader031/viewer/2022022006/5ac870c87f8b9a51678c46e0/html5/thumbnails/65.jpg)
zcourts.com
Memoization• Cache the results of computations
• A luxury afforded by immutability • Sacrifices disk space and memory • Provides improved query performance
15
![Page 66: Tesseract - FOSDEM · zcourts.com Tesseract Distributed Graph Database FOSDEM 2015 1 Courtney Robinson courtney@zcourts.com 31 Jan 2015 Background from Gephi](https://reader031.fdocuments.us/reader031/viewer/2022022006/5ac870c87f8b9a51678c46e0/html5/thumbnails/66.jpg)
zcourts.com
Memoization• Cache the results of computations
• A luxury afforded by immutability • Sacrifices disk space and memory • Provides improved query performance
15
1st query
Traversen secs
Cache results
![Page 67: Tesseract - FOSDEM · zcourts.com Tesseract Distributed Graph Database FOSDEM 2015 1 Courtney Robinson courtney@zcourts.com 31 Jan 2015 Background from Gephi](https://reader031.fdocuments.us/reader031/viewer/2022022006/5ac870c87f8b9a51678c46e0/html5/thumbnails/67.jpg)
zcourts.com
Memoization• Cache the results of computations
• A luxury afforded by immutability • Sacrifices disk space and memory • Provides improved query performance
15
1st query
Traversen secs
Cache results
2nd query
Cachen/r
![Page 68: Tesseract - FOSDEM · zcourts.com Tesseract Distributed Graph Database FOSDEM 2015 1 Courtney Robinson courtney@zcourts.com 31 Jan 2015 Background from Gephi](https://reader031.fdocuments.us/reader031/viewer/2022022006/5ac870c87f8b9a51678c46e0/html5/thumbnails/68.jpg)
zcourts.com
Wormhole traversals• Immutability offers guarantees • Place markers at every n vertex intervals • When traversing, don’t visit every vertex, jump to markers
instead.
16
A B C
D
EF
G
• Markers at A, G, F, D • By pass B,C,E during traversal, almost halving the time. • The resulting data has any skipped vertex asynchronously
fetched• A key part of this is in the use of
“Path summaries” • Path summary is an optimisation
that enables the runtime to skip network requests
• Allows traversal to continue locally and async request is made to gather the remote results
follows
follows follows
knows
knows
friends
![Page 71: Tesseract - FOSDEM · zcourts.com Tesseract Distributed Graph Database FOSDEM 2015 1 Courtney Robinson courtney@zcourts.com 31 Jan 2015 Background from Gephi](https://reader031.fdocuments.us/reader031/viewer/2022022006/5ac870c87f8b9a51678c46e0/html5/thumbnails/71.jpg)
zcourts.com
Going functional• Early implementation was in Haskell• Why? Because it did everything I wanted.
17
![Page 72: Tesseract - FOSDEM · zcourts.com Tesseract Distributed Graph Database FOSDEM 2015 1 Courtney Robinson courtney@zcourts.com 31 Jan 2015 Background from Gephi](https://reader031.fdocuments.us/reader031/viewer/2022022006/5ac870c87f8b9a51678c46e0/html5/thumbnails/72.jpg)
zcourts.com
Going functional• Early implementation was in Haskell• Why? Because it did everything I wanted.• Later realised it’s not Haskell in particular I wanted
• …but its semantics • Immutability • Purity • and some other stuff • and, well…functions!
17
![Page 73: Tesseract - FOSDEM · zcourts.com Tesseract Distributed Graph Database FOSDEM 2015 1 Courtney Robinson courtney@zcourts.com 31 Jan 2015 Background from Gephi](https://reader031.fdocuments.us/reader031/viewer/2022022006/5ac870c87f8b9a51678c46e0/html5/thumbnails/73.jpg)
zcourts.com
Going functional• Early implementation was in Haskell• Why? Because it did everything I wanted.• Later realised it’s not Haskell in particular I wanted
• …but its semantics • Immutability • Purity • and some other stuff • and, well…functions!
• The whole graph thing is an optimisation problem • The properties of a purely functional language
enables a run time to make a lot of assumptions • These assumptions open possibilities not otherwise
available (some times by allowing us to pretend a problem isn’t there)
17
![Page 74: Tesseract - FOSDEM · zcourts.com Tesseract Distributed Graph Database FOSDEM 2015 1 Courtney Robinson courtney@zcourts.com 31 Jan 2015 Background from Gephi](https://reader031.fdocuments.us/reader031/viewer/2022022006/5ac870c87f8b9a51678c46e0/html5/thumbnails/74.jpg)
zcourts.com
Distributed Query Model: TQL, Tesseract Query Language
18
• Haskell? • …before you start sneaking out the back doors • What would that even look like…?
![Page 75: Tesseract - FOSDEM · zcourts.com Tesseract Distributed Graph Database FOSDEM 2015 1 Courtney Robinson courtney@zcourts.com 31 Jan 2015 Background from Gephi](https://reader031.fdocuments.us/reader031/viewer/2022022006/5ac870c87f8b9a51678c46e0/html5/thumbnails/75.jpg)
zcourts.com
Distributed Query Model: TQL, Tesseract Query Language
18
• Haskell? • …before you start sneaking out the back doors • What would that even look like…?
![Page 76: Tesseract - FOSDEM · zcourts.com Tesseract Distributed Graph Database FOSDEM 2015 1 Courtney Robinson courtney@zcourts.com 31 Jan 2015 Background from Gephi](https://reader031.fdocuments.us/reader031/viewer/2022022006/5ac870c87f8b9a51678c46e0/html5/thumbnails/76.jpg)
zcourts.com
Distributed Query Model: TQL, Tesseract Query Language
• What you’re looking at is TQL
18
• Haskell? • …before you start sneaking out the back doors • What would that even look like…?
![Page 77: Tesseract - FOSDEM · zcourts.com Tesseract Distributed Graph Database FOSDEM 2015 1 Courtney Robinson courtney@zcourts.com 31 Jan 2015 Background from Gephi](https://reader031.fdocuments.us/reader031/viewer/2022022006/5ac870c87f8b9a51678c46e0/html5/thumbnails/77.jpg)
zcourts.com
Distributed Query Model: TQL, Tesseract Query Language
• What you’re looking at is TQL• a pure
18
• Haskell? • …before you start sneaking out the back doors • What would that even look like…?
![Page 78: Tesseract - FOSDEM · zcourts.com Tesseract Distributed Graph Database FOSDEM 2015 1 Courtney Robinson courtney@zcourts.com 31 Jan 2015 Background from Gephi](https://reader031.fdocuments.us/reader031/viewer/2022022006/5ac870c87f8b9a51678c46e0/html5/thumbnails/78.jpg)
zcourts.com
Distributed Query Model: TQL, Tesseract Query Language
• What you’re looking at is TQL• a pure• functional language
18
• Haskell? • …before you start sneaking out the back doors • What would that even look like…?
![Page 79: Tesseract - FOSDEM · zcourts.com Tesseract Distributed Graph Database FOSDEM 2015 1 Courtney Robinson courtney@zcourts.com 31 Jan 2015 Background from Gephi](https://reader031.fdocuments.us/reader031/viewer/2022022006/5ac870c87f8b9a51678c46e0/html5/thumbnails/79.jpg)
zcourts.com
Distributed Query Model: TQL, Tesseract Query Language
• What you’re looking at is TQL• a pure• functional language• it has type inferencing and all the cool functional widgets!
18
• Haskell? • …before you start sneaking out the back doors • What would that even look like…?
![Page 82: Tesseract - FOSDEM · zcourts.com Tesseract Distributed Graph Database FOSDEM 2015 1 Courtney Robinson courtney@zcourts.com 31 Jan 2015 Background from Gephi](https://reader031.fdocuments.us/reader031/viewer/2022022006/5ac870c87f8b9a51678c46e0/html5/thumbnails/82.jpg)
zcourts.com
Distributed Query Model: TQL pt2• How was that functional?• It employed use of:
• Functions - relation between a set of input and a set of permissible outputs
• Monads - structures that allow you to define computation in terms of the steps necessary to obtain the results of the computation.
• Monoids - a set with a single associative (1+ 2) + 3 == 1 + (2+3) binary operation an identity element (an element where, when applied to any other in the set, the value of the other element remains unchanged. e.g. given * as the binary operation and the set S={1,2,3}, 1 is the identity element since 1 * 1 = 1, 2 * 1 = 2 and 3 * 1 = 3)
• Currying - where a function which takes multiple arguments is converted into a series of functions which take a single argument, the currying technique produces partially applied functions.
• Higher order functions - functions which takes other functions as its parameter
• Function composition - the process of making the result of one function the argument of another
19
![Page 83: Tesseract - FOSDEM · zcourts.com Tesseract Distributed Graph Database FOSDEM 2015 1 Courtney Robinson courtney@zcourts.com 31 Jan 2015 Background from Gephi](https://reader031.fdocuments.us/reader031/viewer/2022022006/5ac870c87f8b9a51678c46e0/html5/thumbnails/83.jpg)
zcourts.com
Distributed Query Model: TQL pt2• How was that functional?• It employed use of:
• Functions - relation between a set of input and a set of permissible outputs
• Monads - structures that allow you to define computation in terms of the steps necessary to obtain the results of the computation.
• Monoids - a set with a single associative (1+ 2) + 3 == 1 + (2+3) binary operation an identity element (an element where, when applied to any other in the set, the value of the other element remains unchanged. e.g. given * as the binary operation and the set S={1,2,3}, 1 is the identity element since 1 * 1 = 1, 2 * 1 = 2 and 3 * 1 = 3)
• Currying - where a function which takes multiple arguments is converted into a series of functions which take a single argument, the currying technique produces partially applied functions.
• Higher order functions - functions which takes other functions as its parameter
• Function composition - the process of making the result of one function the argument of another
• Don’t believe me? Let’s look at a definition for “INSERT” shown on the previous slide
19
![Page 85: Tesseract - FOSDEM · zcourts.com Tesseract Distributed Graph Database FOSDEM 2015 1 Courtney Robinson courtney@zcourts.com 31 Jan 2015 Background from Gephi](https://reader031.fdocuments.us/reader031/viewer/2022022006/5ac870c87f8b9a51678c46e0/html5/thumbnails/85.jpg)
zcourts.com
Distributed Query Model: TQL pt3
INSERT :: ( (String -> (V…) -> (E…) -> PartialTransform) ) -> Transform
20
![Page 86: Tesseract - FOSDEM · zcourts.com Tesseract Distributed Graph Database FOSDEM 2015 1 Courtney Robinson courtney@zcourts.com 31 Jan 2015 Background from Gephi](https://reader031.fdocuments.us/reader031/viewer/2022022006/5ac870c87f8b9a51678c46e0/html5/thumbnails/86.jpg)
zcourts.com
Distributed Query Model: TQL pt3
INSERT :: ( (String -> (V…) -> (E…) -> PartialTransform) ) -> Transform
20
Function name
![Page 87: Tesseract - FOSDEM · zcourts.com Tesseract Distributed Graph Database FOSDEM 2015 1 Courtney Robinson courtney@zcourts.com 31 Jan 2015 Background from Gephi](https://reader031.fdocuments.us/reader031/viewer/2022022006/5ac870c87f8b9a51678c46e0/html5/thumbnails/87.jpg)
zcourts.com
Distributed Query Model: TQL pt3
INSERT :: ( (String -> (V…) -> (E…) -> PartialTransform) ) -> Transform
20
Function name
Graph namespace
![Page 88: Tesseract - FOSDEM · zcourts.com Tesseract Distributed Graph Database FOSDEM 2015 1 Courtney Robinson courtney@zcourts.com 31 Jan 2015 Background from Gephi](https://reader031.fdocuments.us/reader031/viewer/2022022006/5ac870c87f8b9a51678c46e0/html5/thumbnails/88.jpg)
zcourts.com
Distributed Query Model: TQL pt3
INSERT :: ( (String -> (V…) -> (E…) -> PartialTransform) ) -> Transform
20
Function name
Graph namespace
Vertex type
![Page 89: Tesseract - FOSDEM · zcourts.com Tesseract Distributed Graph Database FOSDEM 2015 1 Courtney Robinson courtney@zcourts.com 31 Jan 2015 Background from Gephi](https://reader031.fdocuments.us/reader031/viewer/2022022006/5ac870c87f8b9a51678c46e0/html5/thumbnails/89.jpg)
zcourts.com
Distributed Query Model: TQL pt3
INSERT :: ( (String -> (V…) -> (E…) -> PartialTransform) ) -> Transform
20
Function name
Graph namespace
Vertex type
Edge type
![Page 90: Tesseract - FOSDEM · zcourts.com Tesseract Distributed Graph Database FOSDEM 2015 1 Courtney Robinson courtney@zcourts.com 31 Jan 2015 Background from Gephi](https://reader031.fdocuments.us/reader031/viewer/2022022006/5ac870c87f8b9a51678c46e0/html5/thumbnails/90.jpg)
zcourts.com
Distributed Query Model: TQL pt3
INSERT :: ( (String -> (V…) -> (E…) -> PartialTransform) ) -> Transform
20
Function name
Graph namespace
Vertex type
… = var-arg+ Homogeneous
Edge type
![Page 91: Tesseract - FOSDEM · zcourts.com Tesseract Distributed Graph Database FOSDEM 2015 1 Courtney Robinson courtney@zcourts.com 31 Jan 2015 Background from Gephi](https://reader031.fdocuments.us/reader031/viewer/2022022006/5ac870c87f8b9a51678c46e0/html5/thumbnails/91.jpg)
zcourts.com
Distributed Query Model: TQL pt3
INSERT :: ( (String -> (V…) -> (E…) -> PartialTransform) ) -> Transform
20
Function name
Result of “INSERT”
Graph namespace
Vertex type
… = var-arg+ Homogeneous
Edge type
![Page 92: Tesseract - FOSDEM · zcourts.com Tesseract Distributed Graph Database FOSDEM 2015 1 Courtney Robinson courtney@zcourts.com 31 Jan 2015 Background from Gephi](https://reader031.fdocuments.us/reader031/viewer/2022022006/5ac870c87f8b9a51678c46e0/html5/thumbnails/92.jpg)
zcourts.com
Distributed Query Model: TQL pt3
INSERT :: ( (String -> (V…) -> (E…) -> PartialTransform) ) -> Transform
20
Function name
Result of “INSERT”
Graph namespace
Vertex type
… = var-arg+ Homogeneous
Edge type
Result of lambda function
![Page 93: Tesseract - FOSDEM · zcourts.com Tesseract Distributed Graph Database FOSDEM 2015 1 Courtney Robinson courtney@zcourts.com 31 Jan 2015 Background from Gephi](https://reader031.fdocuments.us/reader031/viewer/2022022006/5ac870c87f8b9a51678c46e0/html5/thumbnails/93.jpg)
zcourts.com
Distributed Query Model: TQL pt3
INSERT :: ( (String -> (V…) -> (E…) -> PartialTransform) ) -> Transform
20
Function name
Result of “INSERT”
Graph namespace
Vertex type
… = var-arg+ Homogeneous
Edge type
Result of lambda function
• Lambda function you say?
![Page 94: Tesseract - FOSDEM · zcourts.com Tesseract Distributed Graph Database FOSDEM 2015 1 Courtney Robinson courtney@zcourts.com 31 Jan 2015 Background from Gephi](https://reader031.fdocuments.us/reader031/viewer/2022022006/5ac870c87f8b9a51678c46e0/html5/thumbnails/94.jpg)
zcourts.com
Distributed Query Model: TQL pt3
INSERT :: ( (String -> (V…) -> (E…) -> PartialTransform) ) -> Transform
20
Function name
Result of “INSERT”
Graph namespace
Vertex type
… = var-arg+ Homogeneous
Edge type
Result of lambda function
• Lambda function you say? • Where, where?
![Page 95: Tesseract - FOSDEM · zcourts.com Tesseract Distributed Graph Database FOSDEM 2015 1 Courtney Robinson courtney@zcourts.com 31 Jan 2015 Background from Gephi](https://reader031.fdocuments.us/reader031/viewer/2022022006/5ac870c87f8b9a51678c46e0/html5/thumbnails/95.jpg)
zcourts.com
Distributed Query Model: TQL pt3
INSERT :: ( (String -> (V…) -> (E…) -> PartialTransform) ) -> Transform
20
• Lambda function you say? • Where, where?
![Page 96: Tesseract - FOSDEM · zcourts.com Tesseract Distributed Graph Database FOSDEM 2015 1 Courtney Robinson courtney@zcourts.com 31 Jan 2015 Background from Gephi](https://reader031.fdocuments.us/reader031/viewer/2022022006/5ac870c87f8b9a51678c46e0/html5/thumbnails/96.jpg)
zcourts.com
Distributed Query Model: TQL pt3
INSERT :: ( (String -> (V…) -> (E…) -> PartialTransform) ) -> Transform
20
• Lambda function you say? • Where, where?
![Page 97: Tesseract - FOSDEM · zcourts.com Tesseract Distributed Graph Database FOSDEM 2015 1 Courtney Robinson courtney@zcourts.com 31 Jan 2015 Background from Gephi](https://reader031.fdocuments.us/reader031/viewer/2022022006/5ac870c87f8b9a51678c46e0/html5/thumbnails/97.jpg)
zcourts.com
Distributed Query Model: TQL pt3
INSERT :: ( (String -> (V…) -> (E…) -> PartialTransform) ) -> Transform
20
![Page 98: Tesseract - FOSDEM · zcourts.com Tesseract Distributed Graph Database FOSDEM 2015 1 Courtney Robinson courtney@zcourts.com 31 Jan 2015 Background from Gephi](https://reader031.fdocuments.us/reader031/viewer/2022022006/5ac870c87f8b9a51678c46e0/html5/thumbnails/98.jpg)
zcourts.com
Distributed Query Model: TQL pt3
INSERT :: ( (String -> (V…) -> (E…) -> PartialTransform) ) -> Transform
20
From here…
![Page 99: Tesseract - FOSDEM · zcourts.com Tesseract Distributed Graph Database FOSDEM 2015 1 Courtney Robinson courtney@zcourts.com 31 Jan 2015 Background from Gephi](https://reader031.fdocuments.us/reader031/viewer/2022022006/5ac870c87f8b9a51678c46e0/html5/thumbnails/99.jpg)
zcourts.com
Distributed Query Model: TQL pt3
INSERT :: ( (String -> (V…) -> (E…) -> PartialTransform) ) -> Transform
20
From here… …to here!
![Page 100: Tesseract - FOSDEM · zcourts.com Tesseract Distributed Graph Database FOSDEM 2015 1 Courtney Robinson courtney@zcourts.com 31 Jan 2015 Background from Gephi](https://reader031.fdocuments.us/reader031/viewer/2022022006/5ac870c87f8b9a51678c46e0/html5/thumbnails/100.jpg)
zcourts.com
Distributed Query Model: TQL pt3
INSERT :: ( (String -> (V…) -> (E…) -> PartialTransform) ) -> Transform
20
![Page 101: Tesseract - FOSDEM · zcourts.com Tesseract Distributed Graph Database FOSDEM 2015 1 Courtney Robinson courtney@zcourts.com 31 Jan 2015 Background from Gephi](https://reader031.fdocuments.us/reader031/viewer/2022022006/5ac870c87f8b9a51678c46e0/html5/thumbnails/101.jpg)
zcourts.com
Distributed Query Model: TQL pt3
INSERT :: ( (String -> (V…) -> (E…) -> PartialTransform) ) -> Transform
20
• Types are optional and are inferred using Hindley–Milner style type system
![Page 102: Tesseract - FOSDEM · zcourts.com Tesseract Distributed Graph Database FOSDEM 2015 1 Courtney Robinson courtney@zcourts.com 31 Jan 2015 Background from Gephi](https://reader031.fdocuments.us/reader031/viewer/2022022006/5ac870c87f8b9a51678c46e0/html5/thumbnails/102.jpg)
zcourts.com
Distributed Query Model: TQL pt3
INSERT :: ( (String -> (V…) -> (E…) -> PartialTransform) ) -> Transform
20
• Types are optional and are inferred using Hindley–Milner style type system• Functions are translated to “enriched” lambda calculus for reduction & evaluation
![Page 103: Tesseract - FOSDEM · zcourts.com Tesseract Distributed Graph Database FOSDEM 2015 1 Courtney Robinson courtney@zcourts.com 31 Jan 2015 Background from Gephi](https://reader031.fdocuments.us/reader031/viewer/2022022006/5ac870c87f8b9a51678c46e0/html5/thumbnails/103.jpg)
zcourts.com
Distributed Query Model: TQL pt3
INSERT :: ( (String -> (V…) -> (E…) -> PartialTransform) ) -> Transform
20
• Types are optional and are inferred using Hindley–Milner style type system• Functions are translated to “enriched” lambda calculus for reduction & evaluation• Built on top of LLVM
![Page 104: Tesseract - FOSDEM · zcourts.com Tesseract Distributed Graph Database FOSDEM 2015 1 Courtney Robinson courtney@zcourts.com 31 Jan 2015 Background from Gephi](https://reader031.fdocuments.us/reader031/viewer/2022022006/5ac870c87f8b9a51678c46e0/html5/thumbnails/104.jpg)
zcourts.com
Distributed Query Model: TQL pt3
INSERT :: ( (String -> (V…) -> (E…) -> PartialTransform) ) -> Transform
20
• Types are optional and are inferred using Hindley–Milner style type system• Functions are translated to “enriched” lambda calculus for reduction & evaluation• Built on top of LLVM• TQL comes with a useful “standard” library like most languages
![Page 105: Tesseract - FOSDEM · zcourts.com Tesseract Distributed Graph Database FOSDEM 2015 1 Courtney Robinson courtney@zcourts.com 31 Jan 2015 Background from Gephi](https://reader031.fdocuments.us/reader031/viewer/2022022006/5ac870c87f8b9a51678c46e0/html5/thumbnails/105.jpg)
zcourts.com
Distributed Query Model: TQL pt3
INSERT :: ( (String -> (V…) -> (E…) -> PartialTransform) ) -> Transform
20
• Types are optional and are inferred using Hindley–Milner style type system• Functions are translated to “enriched” lambda calculus for reduction & evaluation• Built on top of LLVM• TQL comes with a useful “standard” library like most languages• An “Algorithms & machine learning” module will ship as an add-on module
![Page 106: Tesseract - FOSDEM · zcourts.com Tesseract Distributed Graph Database FOSDEM 2015 1 Courtney Robinson courtney@zcourts.com 31 Jan 2015 Background from Gephi](https://reader031.fdocuments.us/reader031/viewer/2022022006/5ac870c87f8b9a51678c46e0/html5/thumbnails/106.jpg)
zcourts.com
Distributed Query Model: TQL pt3
INSERT :: ( (String -> (V…) -> (E…) -> PartialTransform) ) -> Transform
20
• Types are optional and are inferred using Hindley–Milner style type system• Functions are translated to “enriched” lambda calculus for reduction & evaluation• Built on top of LLVM• TQL comes with a useful “standard” library like most languages• An “Algorithms & machine learning” module will ship as an add-on module• Ability to define new modules/add or override functions
![Page 107: Tesseract - FOSDEM · zcourts.com Tesseract Distributed Graph Database FOSDEM 2015 1 Courtney Robinson courtney@zcourts.com 31 Jan 2015 Background from Gephi](https://reader031.fdocuments.us/reader031/viewer/2022022006/5ac870c87f8b9a51678c46e0/html5/thumbnails/107.jpg)
zcourts.com
Distributed Query Model: TQL pt3
INSERT :: ( (String -> (V…) -> (E…) -> PartialTransform) ) -> Transform
20
• Types are optional and are inferred using Hindley–Milner style type system• Functions are translated to “enriched” lambda calculus for reduction & evaluation• Built on top of LLVM• TQL comes with a useful “standard” library like most languages• An “Algorithms & machine learning” module will ship as an add-on module• Ability to define new modules/add or override functions• Include additional modules (yours or a third party’s)
![Page 108: Tesseract - FOSDEM · zcourts.com Tesseract Distributed Graph Database FOSDEM 2015 1 Courtney Robinson courtney@zcourts.com 31 Jan 2015 Background from Gephi](https://reader031.fdocuments.us/reader031/viewer/2022022006/5ac870c87f8b9a51678c46e0/html5/thumbnails/108.jpg)
zcourts.com21
Distributed Query Model: Runtime• The model places a lot of additional work server side. • Previously enumerated properties enable the server to
make a lot of assumptions and by proxy optimisations • Client interface remains consistent • While on going research can improve the run time
without major client changes
![Page 109: Tesseract - FOSDEM · zcourts.com Tesseract Distributed Graph Database FOSDEM 2015 1 Courtney Robinson courtney@zcourts.com 31 Jan 2015 Background from Gephi](https://reader031.fdocuments.us/reader031/viewer/2022022006/5ac870c87f8b9a51678c46e0/html5/thumbnails/109.jpg)
zcourts.com21
Distributed Query Model: Runtime• The model places a lot of additional work server side. • Previously enumerated properties enable the server to
make a lot of assumptions and by proxy optimisations • Client interface remains consistent • While on going research can improve the run time
without major client changes
Tesseract runtime
![Page 110: Tesseract - FOSDEM · zcourts.com Tesseract Distributed Graph Database FOSDEM 2015 1 Courtney Robinson courtney@zcourts.com 31 Jan 2015 Background from Gephi](https://reader031.fdocuments.us/reader031/viewer/2022022006/5ac870c87f8b9a51678c46e0/html5/thumbnails/110.jpg)
zcourts.com21
Distributed Query Model: Runtime• The model places a lot of additional work server side. • Previously enumerated properties enable the server to
make a lot of assumptions and by proxy optimisations • Client interface remains consistent • While on going research can improve the run time
without major client changes
Tesseract runtime
TQL
![Page 111: Tesseract - FOSDEM · zcourts.com Tesseract Distributed Graph Database FOSDEM 2015 1 Courtney Robinson courtney@zcourts.com 31 Jan 2015 Background from Gephi](https://reader031.fdocuments.us/reader031/viewer/2022022006/5ac870c87f8b9a51678c46e0/html5/thumbnails/111.jpg)
zcourts.com21
Distributed Query Model: Runtime• The model places a lot of additional work server side. • Previously enumerated properties enable the server to
make a lot of assumptions and by proxy optimisations • Client interface remains consistent • While on going research can improve the run time
without major client changes
Tesseract runtime
TQL
CRDTs
![Page 112: Tesseract - FOSDEM · zcourts.com Tesseract Distributed Graph Database FOSDEM 2015 1 Courtney Robinson courtney@zcourts.com 31 Jan 2015 Background from Gephi](https://reader031.fdocuments.us/reader031/viewer/2022022006/5ac870c87f8b9a51678c46e0/html5/thumbnails/112.jpg)
zcourts.com21
Distributed Query Model: Runtime• The model places a lot of additional work server side. • Previously enumerated properties enable the server to
make a lot of assumptions and by proxy optimisations • Client interface remains consistent • While on going research can improve the run time
without major client changes
Tesseract runtime
TQL
CRDTs Cascading vertices
![Page 113: Tesseract - FOSDEM · zcourts.com Tesseract Distributed Graph Database FOSDEM 2015 1 Courtney Robinson courtney@zcourts.com 31 Jan 2015 Background from Gephi](https://reader031.fdocuments.us/reader031/viewer/2022022006/5ac870c87f8b9a51678c46e0/html5/thumbnails/113.jpg)
zcourts.com21
Distributed Query Model: Runtime• The model places a lot of additional work server side. • Previously enumerated properties enable the server to
make a lot of assumptions and by proxy optimisations • Client interface remains consistent • While on going research can improve the run time
without major client changes
Tesseract runtime
TQL
CRDTs Cascading vertices
Wormhole traversals
![Page 114: Tesseract - FOSDEM · zcourts.com Tesseract Distributed Graph Database FOSDEM 2015 1 Courtney Robinson courtney@zcourts.com 31 Jan 2015 Background from Gephi](https://reader031.fdocuments.us/reader031/viewer/2022022006/5ac870c87f8b9a51678c46e0/html5/thumbnails/114.jpg)
zcourts.com21
Distributed Query Model: Runtime• The model places a lot of additional work server side. • Previously enumerated properties enable the server to
make a lot of assumptions and by proxy optimisations • Client interface remains consistent • While on going research can improve the run time
without major client changes
Tesseract runtime
TQL
CRDTs Cascading vertices
Wormhole traversals Optimisations (Memoization/Amortisation/etc)
![Page 116: Tesseract - FOSDEM · zcourts.com Tesseract Distributed Graph Database FOSDEM 2015 1 Courtney Robinson courtney@zcourts.com 31 Jan 2015 Background from Gephi](https://reader031.fdocuments.us/reader031/viewer/2022022006/5ac870c87f8b9a51678c46e0/html5/thumbnails/116.jpg)
zcourts.com
Compaction & Garbage collection• Immutability means we store data that’s no longer needed i.e. garbage
22
![Page 117: Tesseract - FOSDEM · zcourts.com Tesseract Distributed Graph Database FOSDEM 2015 1 Courtney Robinson courtney@zcourts.com 31 Jan 2015 Background from Gephi](https://reader031.fdocuments.us/reader031/viewer/2022022006/5ac870c87f8b9a51678c46e0/html5/thumbnails/117.jpg)
zcourts.com
Compaction & Garbage collection• Immutability means we store data that’s no longer needed i.e. garbage• CRDTs can accumulate a large amount of garbage
• This can be avoided by not keeping tombstones at all • Without tombstones the system is unable to do a consistent
snapshot • If snapshots are disabled, tombstones are not needed • Short synchronisation are used out of the query path to do some
clean up (currently evaluating RAFT for GC consensus)
22
![Page 118: Tesseract - FOSDEM · zcourts.com Tesseract Distributed Graph Database FOSDEM 2015 1 Courtney Robinson courtney@zcourts.com 31 Jan 2015 Background from Gephi](https://reader031.fdocuments.us/reader031/viewer/2022022006/5ac870c87f8b9a51678c46e0/html5/thumbnails/118.jpg)
zcourts.com
Compaction & Garbage collection• Immutability means we store data that’s no longer needed i.e. garbage• CRDTs can accumulate a large amount of garbage
• This can be avoided by not keeping tombstones at all • Without tombstones the system is unable to do a consistent
snapshot • If snapshots are disabled, tombstones are not needed • Short synchronisation are used out of the query path to do some
clean up (currently evaluating RAFT for GC consensus)• Current work is modelled off of JVM’s generational collectors
22
![Page 119: Tesseract - FOSDEM · zcourts.com Tesseract Distributed Graph Database FOSDEM 2015 1 Courtney Robinson courtney@zcourts.com 31 Jan 2015 Background from Gephi](https://reader031.fdocuments.us/reader031/viewer/2022022006/5ac870c87f8b9a51678c46e0/html5/thumbnails/119.jpg)
zcourts.com
Compaction & Garbage collection• Immutability means we store data that’s no longer needed i.e. garbage• CRDTs can accumulate a large amount of garbage
• This can be avoided by not keeping tombstones at all • Without tombstones the system is unable to do a consistent
snapshot • If snapshots are disabled, tombstones are not needed • Short synchronisation are used out of the query path to do some
clean up (currently evaluating RAFT for GC consensus)• Current work is modelled off of JVM’s generational collectors• Algorithm needs more investigation…
22
![Page 120: Tesseract - FOSDEM · zcourts.com Tesseract Distributed Graph Database FOSDEM 2015 1 Courtney Robinson courtney@zcourts.com 31 Jan 2015 Background from Gephi](https://reader031.fdocuments.us/reader031/viewer/2022022006/5ac870c87f8b9a51678c46e0/html5/thumbnails/120.jpg)
zcourts.com
Compaction & Garbage collection• Immutability means we store data that’s no longer needed i.e. garbage• CRDTs can accumulate a large amount of garbage
• This can be avoided by not keeping tombstones at all • Without tombstones the system is unable to do a consistent
snapshot • If snapshots are disabled, tombstones are not needed • Short synchronisation are used out of the query path to do some
clean up (currently evaluating RAFT for GC consensus)• Current work is modelled off of JVM’s generational collectors• Algorithm needs more investigation…• Compaction also serves as an opportunity to optimise data location
• Write only means vertex properties and edges aren’t always next to each other in a data file
• During compaction we re-arrange contents • Helps reduce the amount of work required by spindle disks to fetch
a vertex’s data22
![Page 121: Tesseract - FOSDEM · zcourts.com Tesseract Distributed Graph Database FOSDEM 2015 1 Courtney Robinson courtney@zcourts.com 31 Jan 2015 Background from Gephi](https://reader031.fdocuments.us/reader031/viewer/2022022006/5ac870c87f8b9a51678c46e0/html5/thumbnails/121.jpg)
zcourts.com
First release due in 2-3 months
Will be Apache v2 Licensed
github.com/zcourts/Tesseract
23
![Page 122: Tesseract - FOSDEM · zcourts.com Tesseract Distributed Graph Database FOSDEM 2015 1 Courtney Robinson courtney@zcourts.com 31 Jan 2015 Background from Gephi](https://reader031.fdocuments.us/reader031/viewer/2022022006/5ac870c87f8b9a51678c46e0/html5/thumbnails/122.jpg)
zcourts.com24
End…
Questions?
Courtney Robinson Google “zcourts”
[email protected] github.com/zcourts/Tesseract