Conflict-free Replicated Data Types
in Practice
Georges Younes Vitor Enes
Wednesday 11th January, 2017
HASLab/INESC TEC & University of Minho
InfoBlender
Motivation
Background
Background: CAP Theorem
Brewer’s Conjecture
2000 Eric Brewer, PoDC Conference Keynote
2002 Seth Gilbert and Nancy Lynch, ACM SIGACT News 33(2)
Of three properties of a shared-data system (data Consistency, system Availability, and tolerance to network Partitions), only two can be achieved at any given moment in time.
Background: CAP Theorem
source: Lourenco et al. Journal of Big Data (2015) 2:18 DOI 10.1186/s40537-015-0025-0
“The best thing about being me... There are so many ’me’s.”
- Agent Smith
Background: Data Replication
Data Replication: Maintaining multiple data copies on separate machines.
• Improves availability
  • Allows access when some replicas are not available
• Improves performance
  • Reduced latency: lets users access nearby replicas
  • Increased throughput: lets multiple machines serve the data
Pessimistic
• Provide single-copy consistency
• Block access to a replica unless it is provably up to date
• Perform well in LANs (small latencies, uncommon failures)
• Do not perform well in Wide Area Networks (WANs)
Optimistic
• Provide eventual consistency
• Allow access to a replica without a priori synchronization
• Updates are propagated in the background; occasional conflicts are resolved later
• Offer many advantages over their pessimistic counterparts
Background: Eventual Consistency (EC)
At the same moment, two users can see the same tweet with a different number of favorites.
Conflict-free Replicated Data
Types
Conflict-free Replicated Data Types
CRDTs
• Conflict-free: resolves conflicts automatically, converging to the same state
• Replicated: multiple independent copies
• Data Types: Registers, Counters, Maps, Sequences, Sets, Graphs, etc.
CRDT Flavors
There are two flavors of CRDTs:
• Operation-based: broadcasts update operations to other replicas
• State-based: propagates the local state to other replicas
Op-based CRDTs
Figure 1: Operation based replication
For each update operation do:
• Prepare: Calculate the downstream (the effect observed locally)
• Apply the effect locally
• Disseminate (reliable causal broadcast) the calculated effect to be applied on all other replicas
Op-based (Counter :: N)
replica A
SA = 0
SA = update(inc, SA) = 0 + 1 = 1
m1 = inc
sendB(m1)
on receiveR(m2):
SA = update(dec, SA) = 1 − 1 = 0
replica B
SB = 0
SB = update(dec, SB) = 0 − 1 = −1
m2 = dec
sendA(m2)
on receiveR(m1):
SB = update(inc, SB) = −1 + 1 = 0
What if m1 is lost?
What if m2 is duplicated?
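The two questions above can be explored with a minimal sketch. The `OpCounter` class and the `run` helper are illustrative names, not part of the slides; the network is simulated by simply listing which remote operations each replica ends up applying.

```python
class OpCounter:
    """Op-based counter replica: the state is an int, operations are 'inc' or 'dec'."""

    def __init__(self):
        self.value = 0

    def update(self, op):
        # Apply an operation (local or received) to the replica state.
        if op == "inc":
            self.value += 1
        elif op == "dec":
            self.value -= 1
        else:
            raise ValueError(f"unknown operation: {op}")


def run(delivered_to_a, delivered_to_b):
    """A increments and B decrements locally; then each applies the listed remote ops."""
    a, b = OpCounter(), OpCounter()
    a.update("inc")  # A's local update, broadcast as m1 = inc
    b.update("dec")  # B's local update, broadcast as m2 = dec
    for op in delivered_to_a:
        a.update(op)
    for op in delivered_to_b:
        b.update(op)
    return a.value, b.value


exactly_once = run(["dec"], ["inc"])          # every message delivered exactly once
m1_lost = run(["dec"], [])                    # B never receives m1
m2_duplicated = run(["dec", "dec"], ["inc"])  # A receives m2 twice
```

With exactly-once delivery both replicas end at 0. In the other two runs the replicas end with different values, which is why plain op-based replication depends on the delivery guarantees of the broadcast layer.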
How do we solve these problems?
TRCB: Tagged Reliable Causal Broadcast
• Tags each operation with a unique timestamp (VC)
• Delivery of operations respecting causal order
• Reliable bcast: at-least-once to prevent message loss
• Reliable bcast: at-most-once to prevent message duplication
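As a rough illustration of the receiving side of such a broadcast, the sketch below buffers messages until their causal dependencies are delivered and drops duplicates. It is not the TRCB implementation from the talk: the `CausalReceiver` class, the dict-based vector clocks, and the assumption that every clock carries an entry for every replica are all choices made for this example. At-least-once delivery (retransmission by the sender) is assumed to happen elsewhere.

```python
class CausalReceiver:
    """Delivers broadcast messages at most once and in causal order."""

    def __init__(self, replica_ids):
        # delivered[j] = number of messages from replica j delivered so far
        self.delivered = {r: 0 for r in replica_ids}
        self.pending = []  # buffered (sender, vc, op) tuples that are not yet deliverable
        self.log = []      # operations in the order they were delivered

    def _is_duplicate(self, sender, vc):
        return vc[sender] <= self.delivered[sender]

    def _is_ready(self, sender, vc):
        # It must be the next message from the sender, and everything the
        # sender had seen from other replicas must already be delivered here.
        return vc[sender] == self.delivered[sender] + 1 and all(
            vc[r] <= self.delivered[r] for r in vc if r != sender
        )

    def receive(self, sender, vc, op):
        if self._is_duplicate(sender, vc) or (sender, vc, op) in self.pending:
            return  # at-most-once: drop duplicates
        self.pending.append((sender, vc, op))
        progress = True
        while progress:  # keep delivering until no buffered message is ready
            progress = False
            for msg in list(self.pending):
                s, v, o = msg
                if self._is_ready(s, v):
                    self.pending.remove(msg)
                    self.delivered[s] += 1
                    self.log.append(o)
                    progress = True


rx = CausalReceiver(["A", "B", "C"])
# A broadcast add(x); B saw it and then broadcast rmv(x). C receives them out of order.
rx.receive("B", {"A": 1, "B": 1, "C": 0}, "rmv(x)")  # buffered: depends on A's message
rx.receive("A", {"A": 1, "B": 0, "C": 0}, "add(x)")  # delivered, then unblocks rmv(x)
rx.receive("A", {"A": 1, "B": 0, "C": 0}, "add(x)")  # duplicate, dropped
```

After these three calls, `rx.log` holds the two operations in causal order and nothing remains buffered.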
Are all data types commutative?
• Counters are commutative: incr ; decr ; == decr ; incr ;
• Not all data types are (In fact most are NOT)
• A Set S1 with add(element,Set) and rmv(element,Set) operations is not
• add(a,S1); rmv(a,S1); value(S1) = {}
• rmv(a,S1); add(a,S1); value(S1) = {a}
• How do we solve concurrent operations?
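The add/rmv example above can be checked directly. This is a minimal sketch using plain Python sets applied sequentially; the function names mirror the slide, and the counter comparison is added only for contrast.

```python
def add(element, s):
    """Return the set with the element added."""
    return s | {element}


def rmv(element, s):
    """Return the set with the element removed."""
    return s - {element}


empty = frozenset()
add_then_rmv = rmv("a", add("a", empty))  # add(a,S1); rmv(a,S1)
rmv_then_add = add("a", rmv("a", empty))  # rmv(a,S1); add(a,S1)

# Counter operations, by contrast, give the same result in either order.
inc_then_dec = (0 + 1) - 1
dec_then_inc = (0 - 1) + 1
```

The two set results differ, so replicas that apply the same add and rmv in different orders would diverge unless concurrent operations are given a defined outcome.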
POLog: Partially Ordered Log
Querying the Add-Wins-Set (AWSet)
• An element v is in the set if there is an add_v operation in the POLog that is not succeeded by a rmv_v operation
• {v | (t, add_v) ∈ POLog ∧ ∄(t′, rmv_v) ∈ POLog . t → t′}
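The query rule above can be sketched as follows. The representation is an assumption made for this example: the POLog is a list of `(timestamp, op, element)` tuples, and timestamps are vector clocks stored as dicts with missing entries read as 0.

```python
def happened_before(t1, t2):
    """True if vector clock t1 causally precedes t2 (t1 -> t2)."""
    keys = set(t1) | set(t2)
    return all(t1.get(k, 0) <= t2.get(k, 0) for k in keys) and any(
        t1.get(k, 0) < t2.get(k, 0) for k in keys
    )


def awset_value(polog):
    """Elements that have an add not succeeded by a rmv of the same element."""
    result = set()
    for t, op, v in polog:
        if op != "add":
            continue
        removed = any(
            op2 == "rmv" and v2 == v and happened_before(t, t2)
            for t2, op2, v2 in polog
        )
        if not removed:
            result.add(v)
    return result


polog = [
    ({"A": 1}, "add", "x"),
    ({"A": 1, "B": 1}, "rmv", "x"),  # B removed x after seeing A's add
    ({"A": 2}, "add", "y"),
    ({"A": 1, "B": 2}, "rmv", "y"),  # concurrent with add y, so the add wins
]
current = awset_value(polog)
```

Here x is removed because its rmv causally follows the add, while y survives because its rmv is concurrent with the add rather than after it.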
So does the POLog keep growing?
• In the op-based CRDT model, the POLog keeps growing
• Pure op-based CRDT model was introduced to:
• Apply garbage collection (GC) to the POLog
• Reduce the message size
• Provide a more generic API
Pure op-based in Redis
Figure 2: Pure op-based Architecture in Redis
Operation-based (GCounter :: N)
A: 1 --inc--> 2 --inc--> 3
B: 1 --m1--> 2 --m2--> 3
Messages from A to B: m1 = inc, m2 = inc
State-based (GCounter :: I ↪ N)
A: {B1} --inc--> {A1,B1} --inc--> {A2,B1}
B: {B1} --m1--> {A1,B1} --m2--> {A2,B1}
Messages from A to B: m1 = {A1,B1}, m2 = {A2,B1}
What if m1 is lost? (monotonicity)
What if m2 is duplicated? (idempotence)
What if m1 and m2 are reordered? (commutativity)
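The three questions above can be tried out on a small state-based GCounter. This is a sketch under the usual formulation (a map from replica id to count, merged by pointwise maximum); the function names are chosen for this example.

```python
def inc(state, replica):
    """Return a new state with this replica's entry incremented."""
    new = dict(state)
    new[replica] = new.get(replica, 0) + 1
    return new


def merge(s1, s2):
    """Join of two states: pointwise maximum of the per-replica counts."""
    return {k: max(s1.get(k, 0), s2.get(k, 0)) for k in set(s1) | set(s2)}


def value(state):
    """The counter value is the sum of all per-replica counts."""
    return sum(state.values())


start = {"B": 1}      # both replicas start from {B1}
m1 = inc(start, "A")  # A's state after the first inc: {A1, B1}
m2 = inc(m1, "A")     # A's state after the second inc: {A2, B1}

in_order = merge(merge(start, m1), m2)
m1_lost = merge(start, m2)                               # only m2 arrives
m2_duplicated = merge(merge(merge(start, m1), m2), m2)   # m2 arrives twice
reordered = merge(merge(start, m2), m1)                  # m2 arrives before m1
```

In all four runs B ends in the same state as A, because each later state subsumes the earlier ones and the merge is idempotent and commutative.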
Why are state-based CRDTs so cool?
State-based CRDT
A state-based CRDT is a join-semilattice (S, ⊑, ⊔) where S is a poset, ⊑ its partial order, and ⊔ a binary join operator that derives the least upper bound for every two elements of S.
∀ s, t, u ∈ S:
• s ⊔ s = s (idempotence)
• s ⊔ t = t ⊔ s (commutativity)
• s ⊔ (t ⊔ u) = (s ⊔ t) ⊔ u (associativity)
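As a quick sanity check of these three laws, the sketch below tests them exhaustively over a handful of sample states, taking sets under union as the join. The sample values and variable names are arbitrary choices for this example; a check over samples illustrates the laws but does not prove them.

```python
from itertools import product


def join(s, t):
    """Join for sets ordered by inclusion: set union."""
    return s | t


samples = [frozenset(), frozenset("x"), frozenset("xy"), frozenset("yz")]

idempotent = all(join(s, s) == s for s in samples)
commutative = all(
    join(s, t) == join(t, s) for s, t in product(samples, repeat=2)
)
associative = all(
    join(s, join(t, u)) == join(join(s, t), u)
    for s, t, u in product(samples, repeat=3)
)

# The partial order can be recovered from the join: s is below t iff join(s, t) == t.
order_matches_subset = all(
    (join(s, t) == t) == (s <= t) for s, t in product(samples, repeat=2)
)
```

All four flags evaluate to True for these samples.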
$ $ full state transmission $ $
Avoiding full state transmission
How can we avoid replica A sending its full state to replica B?
Two fundamental problems
A knows something about B:
• B has S_old
• A has S_new and knows that B has S_old
• Goal: compute a delta d of minimum size s.t. S_new = S_old ⊔ d
A knows nothing about B:
• B has S_old
• A has S_new
• Goal: a protocol that minimizes communication cost
Solving the first problem
Solving the first problem: Delta-CRDTs
State-based (GSet :: P(E))
A: {} --add x--> {x} --add y--> {x, y} --> {x, y, z}
B: {} --> {x} --> {x, y} --add z--> {x, y, z}
Messages: A sends {x}, then {x, y} to B; B sends {x, y, z} to A
Delta-state-based
A: {} --add x--> {x} --add y--> {x, y} --> {x, y, z}
B: {} --> {x} --> {x, y} --add z--> {x, y, z}
Messages: A sends {x}, then {y} to B; B sends {z} to A
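The delta-state-based run above can be replayed with a small sketch. The `DeltaGSet` class and `min_delta` helper are names invented for this example; for a grow-only set, the smallest delta d with S_new = S_old ⊔ d is simply the set difference.

```python
def min_delta(s_new, s_old):
    """Smallest d such that s_new == s_old | d, for grow-only sets."""
    return s_new - s_old


class DeltaGSet:
    """Grow-only set whose mutator returns a delta instead of the full state."""

    def __init__(self):
        self.state = set()

    def add(self, element):
        # The delta contains only what this update actually changed.
        delta = {element} - self.state
        self.state |= delta
        return delta

    def merge(self, delta):
        # Joining a delta is the same operation as joining a full state.
        self.state |= delta


a, b = DeltaGSet(), DeltaGSet()
shipped = []
# A adds x and y (deltas go to B); B adds z (delta goes to A).
for replica, other, element in [(a, b, "x"), (a, b, "y"), (b, a, "z")]:
    delta = replica.add(element)
    other.merge(delta)
    shipped.append(delta)
```

Both replicas converge to {x, y, z}, and each message carries a single element instead of the whole set.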
Solving the second problem
Solving the second problem: Join Decompositions
State-driven
A: {a} --add x,y--> {a, x, y} --> {a, x, y, z}
B: {a} --add z--> {a, z} --> {a, x, y, z}
Messages: B sends {a, z} to A; A sends {x, y} to B
Digest-driven
A: {a} --add x,y--> {a, x, y} --> {a, x, y} --> {a, x, y, z}
B: {a} --add z--> {a, z} --> {a, x, y, z}
Messages: B sends dB to A; A sends ({x, y}, dB) to B; B sends {z} to A
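The two synchronization styles above can be sketched for a grow-only set of strings. Several things here are assumptions made for the example and do not come from the talk: the digest is modeled as the set of SHA-256 hashes of the elements, the helper names are invented, and in the digest-driven exchange A is assumed to attach a digest of its own state to its reply so that B can compute what A is missing.

```python
import hashlib


def element_hash(element):
    return hashlib.sha256(element.encode()).hexdigest()


def digest(state):
    """Hypothetical digest: the set of hashes of the elements in the state."""
    return {element_hash(e) for e in state}


def missing_from(state, remote_digest):
    """Elements of `state` whose hash does not appear in the remote digest."""
    return {e for e in state if element_hash(e) not in remote_digest}


def state_driven(a, b):
    """B ships its full state; A replies with only what B lacks."""
    to_a = set(b)    # message 1: B's full state
    to_b = a - to_a  # message 2: delta computed by A
    return a | to_a, b | to_b, [to_a, to_b]


def digest_driven(a, b):
    """B ships a digest; A replies with a delta and a digest; B replies with a delta."""
    d_b = digest(b)              # message 1: B's digest
    to_b = missing_from(a, d_b)  # message 2: delta for B (sent with A's digest)
    d_a = digest(a)
    to_a = missing_from(b, d_a)  # message 3: delta for A
    return a | to_a, b | to_b, [to_b, to_a]


a_state, b_state = {"a", "x", "y"}, {"a", "z"}
sd_a, sd_b, sd_payloads = state_driven(a_state, b_state)
dd_a, dd_b, dd_payloads = digest_driven(a_state, b_state)
```

Both exchanges leave the replicas at {a, x, y, z}. The state-driven run ships B's whole state plus one delta, while the digest-driven run ships only deltas alongside the digests, which pays off when digests are much smaller than the elements they summarize.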
Who uses that?
Lasp
• Lasp is a language for distributed, eventually consistent computations ⇒ CRDTs
• From basho/riak_dt to lasp/types
• lasp/types:
  • State-based CRDTs
  • Delta-based CRDTs
  • Pure-op-based CRDTs
GCounter state transmission
Figure: Advertisement Impression Counter. GB transmitted (y-axis, 0 to 45) with state-based vs. delta-based synchronization for 128, 256, 512, and 1024 clients (x-axis: client number).
LDB & LSim
LDB
• Benchmarking platform for CRDTs
• lasp/types + replication
  • State-based CRDTs
  • Delta-based CRDTs
  • Delta-based CRDTs + Join Decompositions
  • Pure-op-based CRDTs
LSim
• LDB + Peer Services + Workloads
• Experiments in DC/OS (Apache Mesos + Marathon)
GSet state transmission
Figure: MB transmitted (y-axis, 0 to 0.35) over time in seconds (x-axis, 0 to 200). Series: State and Decompositions, each over Ring, HyParView, and Erdos Renyi.
Challenges
1. Log aggregation
2. Time
  • Time-based graphs (e.g. the x-axis is time)
  • Total-effort graphs demand equal total time across all runs
3. Bugs
  • LDFI
  • IronFleet
  • Jepsen
  • . . .
4. $$ (On-Demand vs Spot Instances)
Questions?
bit.ly/crdts-infoblender