MochiDB: A Byzantine Fault Tolerant Datastore · 2017-12-13
![Page 1: MochiDB: A Byzantine Fault Tolerant Datastore](https://reader036.fdocuments.us/reader036/viewer/2022070811/5f0a88027e708231d42c1796/html5/thumbnails/1.jpg)
MochiDB: A Byzantine Fault Tolerant Datastore
Tigran Tsaturyan, Saravanan Dhakshinamurthy
![Page 2: MochiDB: A Byzantine Fault Tolerant Datastore](https://reader036.fdocuments.us/reader036/viewer/2022070811/5f0a88027e708231d42c1796/html5/thumbnails/2.jpg)
Description
1. BFT key-value datastore (read(k), write(k,v), delete(k))
2. Consistent
3. Supports transactions
4. Built-in sharding
5. Optimized for reads and writes over WAN
![Page 3: MochiDB: A Byzantine Fault Tolerant Datastore](https://reader036.fdocuments.us/reader036/viewer/2022070811/5f0a88027e708231d42c1796/html5/thumbnails/3.jpg)
Use case
Database to store configurations for infrastructure.
● Most infrastructure configuration is key -> value
● Need to update multiple properties together
● Infrastructure needs to be consistent
● Located in different parts of the world (next slide)
![Page 4: MochiDB: A Byzantine Fault Tolerant Datastore](https://reader036.fdocuments.us/reader036/viewer/2022070811/5f0a88027e708231d42c1796/html5/thumbnails/4.jpg)
Source: Amazon AWS + https://wondernetwork.com/pings
Figure: world map with ping latencies between locations of 140 ms, 210 ms, and 110 ms.
![Page 5: MochiDB: A Byzantine Fault Tolerant Datastore](https://reader036.fdocuments.us/reader036/viewer/2022070811/5f0a88027e708231d42c1796/html5/thumbnails/5.jpg)
Architecture
1. Quorum-based BFT; the client is the coordinator for a transaction
2. Transactions can be of two types: READ and WRITE
3. Minimum server requirement: 3f + 1 (to tolerate f Byzantine servers)
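The sizing rule above can be made concrete with a little arithmetic. The following is a minimal sketch, not code from MochiDB; the function names are hypothetical, and the 2f + 1 quorum size is taken from the WriteCertificate description later in the deck.

```python
def min_servers(f: int) -> int:
    """Smallest cluster size that tolerates f Byzantine servers (3f + 1)."""
    if f < 0:
        raise ValueError("f must be non-negative")
    return 3 * f + 1


def max_faults(n: int) -> int:
    """Largest f such that 3f + 1 <= n for a cluster of n servers."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return (n - 1) // 3


def quorum_size(f: int) -> int:
    """Number of matching servers needed for a quorum (2f + 1)."""
    if f < 0:
        raise ValueError("f must be non-negative")
    return 2 * f + 1


if __name__ == "__main__":
    # The four-server cluster drawn on the read/write slides tolerates f = 1.
    n = 4
    f = max_faults(n)
    print(n, f, quorum_size(f))
```

With the four servers drawn on the following slides, f is 1 and a quorum is 3 servers.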
![Page 6: MochiDB: A Byzantine Fault Tolerant Datastore](https://reader036.fdocuments.us/reader036/viewer/2022070811/5f0a88027e708231d42c1796/html5/thumbnails/6.jpg)
BFT Read
Each stored object (objectX, objectY) holds:
1. Value
2. WriteCertificate: “How that object happens to be that way” (signed confirmations from the servers)
3. Timestamp (TS)
4. …..
Diagram: the client sends the Transaction to server1, server2, server3, and server4; the servers return the Transaction result.
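The slide does not spell out how the client combines the servers' answers. One plausible reading, shown here purely as an assumption, is that the client accepts a result once 2f + 1 servers return the same (value, timestamp) pair. The function name `resolve_read` is hypothetical, and signature and WriteCertificate checks are left out of this sketch.

```python
from collections import Counter
from typing import Optional, Sequence, Tuple

# A server response modeled as (value, timestamp). Real responses would
# also carry the WriteCertificate; that is omitted in this sketch.
Response = Tuple[str, int]


def resolve_read(responses: Sequence[Response], f: int) -> Optional[Response]:
    """Return the (value, timestamp) reported by at least 2f + 1 servers.

    Assumption: matching answers from a quorum are enough to accept a
    read. Returns None when no answer reaches the quorum.
    """
    if not responses:
        return None
    needed = 2 * f + 1
    candidate, votes = Counter(responses).most_common(1)[0]
    return candidate if votes >= needed else None


if __name__ == "__main__":
    # Three honest servers agree; one faulty server reports something else.
    answers = [("48", 6467), ("48", 6467), ("48", 6467), ("bogus", 1)]
    print(resolve_read(answers, f=1))
```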
![Page 7: MochiDB: A Byzantine Fault Tolerant Datastore](https://reader036.fdocuments.us/reader036/viewer/2022070811/5f0a88027e708231d42c1796/html5/thumbnails/7.jpg)
BFT Write: Protocol view
Each stored object (objectX, objectY) holds:
1. Value
2. WriteCertificate
3. Timestamp (TS)
4. …..
Diagram: messages exchanged between the client and server1, server2, server3, and server4:
● Client to servers: Transaction + random seed (0-1000)
● Servers to client: collection of grants (object, timestamp, trHash). Each server grants the client the right to write the object at some TS
● Client to servers: WriteCertificate, a collection of grants from 2f+1 servers
● Servers to client: acks that the transaction was performed
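The certificate-building step can be sketched as follows. This is an illustration under assumptions, not MochiDB's implementation: the `Grant` class and `build_write_certificate` function are hypothetical names, grants are unsigned here, and the sketch assumes a certificate is formed from grants that agree on (object, timestamp, trHash) and come from 2f + 1 distinct servers.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, List, Optional, Tuple


@dataclass(frozen=True)
class Grant:
    """A server's permission to write an object at a timestamp (unsigned here)."""
    server_id: str
    object_id: str
    timestamp: int
    tr_hash: str


def build_write_certificate(grants: Iterable[Grant], f: int) -> Optional[List[Grant]]:
    """Return 2f + 1 or more matching grants from distinct servers, else None."""
    # Group by what is being granted; keep at most one grant per server.
    by_key: Dict[Tuple[str, int, str], Dict[str, Grant]] = defaultdict(dict)
    for grant in grants:
        key = (grant.object_id, grant.timestamp, grant.tr_hash)
        by_key[key].setdefault(grant.server_id, grant)
    for per_server in by_key.values():
        if len(per_server) >= 2 * f + 1:
            return list(per_server.values())
    return None


if __name__ == "__main__":
    grants = [
        Grant("server1", "ObjectX", 6315, "abc"),
        Grant("server2", "ObjectX", 6315, "abc"),
        Grant("server3", "ObjectX", 6315, "abc"),
        Grant("server4", "ObjectX", 6467, "def"),  # granted a different transaction
    ]
    certificate = build_write_certificate(grants, f=1)
    print(len(certificate) if certificate else None)
```

Counting distinct servers, rather than raw grants, keeps one faulty server from filling a quorum by repeating itself.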
![Page 8: MochiDB: A Byzantine Fault Tolerant Datastore](https://reader036.fdocuments.us/reader036/viewer/2022070811/5f0a88027e708231d42c1796/html5/thumbnails/8.jpg)
BFT Write: Server processing
Timeline diagram (time axis): Old epochs, Epoch = 5000, Epoch = 6000
● Transaction 1 (TR1): WRITE(“ObjectX”, “12”), RAND_seed = 315; Write1 grant for TR1
● Transaction 2 (TR2): WRITE(“ObjectX”, “48”), RAND_seed = 467; Write1 grant for TR2
● Order: TR1 and TR2 each show Write1 followed by Write2
● Timeline labels: Epoch for current state of the object (COMMITTED), shown twice; Current object TS = 5334; Current object TS = 6315; Current object TS = 6467
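The numbers on this slide suggest how a granted timestamp is derived: the object sits at TS 5334 (epoch 5000), and the two writes end up at 6315 and 6467, which equal the next epoch (6000) plus each transaction's random seed. The sketch below encodes that reading. It is an inference from the slide's figures, not a confirmed description of the protocol; `EPOCH_SIZE` and `grant_timestamp` are hypothetical names.

```python
EPOCH_SIZE = 1000  # assumption: epochs are 1000 wide, matching the 0-1000 seed range


def grant_timestamp(current_ts: int, seed: int) -> int:
    """Timestamp a server would grant: start of the next epoch plus the seed.

    Assumption: the upper bound of the seed range is treated as exclusive
    so that a granted timestamp never spills into the following epoch.
    """
    if not 0 <= seed < EPOCH_SIZE:
        raise ValueError("seed must be in [0, EPOCH_SIZE)")
    next_epoch = (current_ts // EPOCH_SIZE + 1) * EPOCH_SIZE
    return next_epoch + seed


if __name__ == "__main__":
    # Both grants are issued while the object is still at TS 5334.
    print(grant_timestamp(5334, 315))  # TR1
    print(grant_timestamp(5334, 467))  # TR2
```

Under this reading, the random seed orders concurrent writers inside the same epoch, so TR1 (seed 315) is applied before TR2 (seed 467).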
![Page 9: MochiDB: A Byzantine Fault Tolerant Datastore](https://reader036.fdocuments.us/reader036/viewer/2022070811/5f0a88027e708231d42c1796/html5/thumbnails/9.jpg)
Features
● Sharding: 1024 tokens are spread equally across the ring and assigned to servers. Data is replicated (replicationFactor) on the N subsequent servers
● GC: Old write grants that are never fulfilled need to be cleaned up. A server initiates GC, gets agreement on the object TS, and prunes unneeded data
● Permissions: Clients have READ, WRITE, and ADMIN permissions embedded in their certificates
● Configuration changes: Similar to 2PC
● more….
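The sharding bullet can be illustrated with a small token ring. This is a sketch under assumptions rather than MochiDB's code: the round-robin token assignment, the SHA-256 key hashing, and all function names are hypothetical. Only the 1024 tokens and the replication factor come from the slide.

```python
import hashlib
from typing import List, Sequence

TOKENS = 1024  # number of tokens on the ring, as stated on the slide


def assign_tokens(servers: Sequence[str]) -> List[str]:
    """Spread tokens equally by assigning them to servers round-robin (assumption)."""
    if not servers:
        raise ValueError("at least one server is required")
    return [servers[token % len(servers)] for token in range(TOKENS)]


def token_for_key(key: str) -> int:
    """Map a key to a token with a stable hash (SHA-256 is an assumption)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % TOKENS


def replicas_for_key(key: str, servers: Sequence[str], replication_factor: int) -> List[str]:
    """Walk the ring from the key's token and collect distinct servers."""
    if not 1 <= replication_factor <= len(set(servers)):
        raise ValueError("replication_factor must be between 1 and the server count")
    ring = assign_tokens(servers)
    token = token_for_key(key)
    chosen: List[str] = []
    while len(chosen) < replication_factor:
        owner = ring[token % TOKENS]
        if owner not in chosen:
            chosen.append(owner)
        token += 1
    return chosen


if __name__ == "__main__":
    cluster = ["server1", "server2", "server3", "server4"]
    print(replicas_for_key("ObjectX", cluster, replication_factor=3))
```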
![Page 10: MochiDB: A Byzantine Fault Tolerant Datastore](https://reader036.fdocuments.us/reader036/viewer/2022070811/5f0a88027e708231d42c1796/html5/thumbnails/10.jpg)
Engineering
Implementation
● Java/Netty/ProtoBufs/Spring
● In-memory object store (for now)
Lessons learned
● Async IO, AWS fees
● Full cluster within JVM and testing framework
● Releasing resources
● Concurrent operations
● Do not make presentation in google docs :)
Testing
● See paper
● Local latency, READS: 6 ms at the 50th percentile, 20 ms at the 99th percentile
● Local latency, WRITES: 16 ms at the 50th percentile, 60 ms at the 99th percentile
![Page 11: MochiDB: A Byzantine Fault Tolerant Datastore](https://reader036.fdocuments.us/reader036/viewer/2022070811/5f0a88027e708231d42c1796/html5/thumbnails/11.jpg)
Conclusion
THANK YOU!
Ready to run images: https://hub.docker.com/r/mochidb/mochi-db/
Source code (48,310 lines of code): https://github.com/saravan2/mochi-db
CONTRIBUTIONS APPRECIATED!
![Page 12: MochiDB: A Byzantine Fault Tolerant Datastore](https://reader036.fdocuments.us/reader036/viewer/2022070811/5f0a88027e708231d42c1796/html5/thumbnails/12.jpg)
Mochi