Weaver: A High Performance, Transactional Graph Database...
Weaver: A High Performance, Transactional Graph Database Based on Refinable Timestamps
Presented by: Ishank Jain
Department of Computer Science
02/12/2019
By Dubey et al.
CONTENT
§ Related work
§ Research question
§ Method
§ Challenges
§ Results
§ Future work
§ Questions
RELATED WORK
§ Offline Graph Processing Systems
§ Online Graph Databases
§ Temporal Graph Databases
§ Consistency Models
§ Concurrency Control
RESEARCH QUESTION
§ Existing systems either operate on offline snapshots, provide weak consistency guarantees, or use expensive concurrency control techniques that limit performance.
§ The key challenge in a transactional system is to ensure that distributed operations taking place on different machines follow a coherent timeline.
PROBLEM EXAMPLE
§ Path discovery query
n3 -> n5: removed
n5 -> n7: added
n1 -> n7 ?
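The anomaly on this slide can be reproduced with a small simulation. This is only a sketch: the starting graph (n1 -> n3 -> n5, with n7 isolated) is an assumption, since the slide lists only the two edge changes, and `racy_walk` is a hypothetical stand-in for a traversal that runs without transactional isolation.

```python
def has_path(g, src, dst):
    """Plain depth-first reachability on a fixed snapshot of the graph."""
    stack, seen = [src], set()
    while stack:
        node = stack.pop()
        if node == dst:
            return True
        if node not in seen:
            seen.add(node)
            stack.extend(g[node])
    return False


def apply_update(g):
    """The update from the slide: remove n3 -> n5, add n5 -> n7."""
    g["n3"].discard("n5")
    g["n5"].add("n7")


def racy_walk(g, src, dst, update, update_after_hops):
    """Follow out-edges one hop at a time; the update lands mid-walk."""
    node, path = src, [src]
    for hops in range(len(g)):  # bounded walk, enough for this tiny graph
        if node == dst:
            return path
        if hops == update_after_hops:
            update(g)  # concurrent writer interleaves here
        successors = sorted(g[node])
        if not successors:
            return None
        node = successors[0]
        path.append(node)
    return path if node == dst else None


def initial_graph():
    # Hypothetical starting graph; only the two edge changes come from the slide.
    return {"n1": {"n3"}, "n3": {"n5"}, "n5": set(), "n7": set()}


before = initial_graph()
after = initial_graph()
apply_update(after)
torn = racy_walk(initial_graph(), "n1", "n7", apply_update, update_after_hops=2)
print(has_path(before, "n1", "n7"), has_path(after, "n1", "n7"), torn)
```

Neither the graph before the update nor the graph after it contains a path from n1 to n7, yet the interleaved walk reports one. A transactional system must rule out exactly this kind of answer.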
REFINABLE TIMESTAMPS
§ This technique couples (a) coarse-grained vector timestamps with (b) a fine-grained timeline oracle, so that the overhead of strong consistency is paid only when needed.
§ The fine-grained timeline oracle is used for ordering only the potentially conflicting reads and writes.
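A minimal sketch of the two-level idea, assuming timestamps are plain tuples of per-gatekeeper counters. `ToyOracle` and `refine` are hypothetical names, and the toy oracle simply remembers the first order it was asked about; it does not enforce transitivity as a real timeline oracle would.

```python
def vclock_order(a, b):
    """Compare two vector timestamps of equal length."""
    if len(a) != len(b):
        raise ValueError("vector timestamps must have the same length")
    a_le_b = all(x <= y for x, y in zip(a, b))
    b_le_a = all(y <= x for x, y in zip(a, b))
    if a_le_b and b_le_a:
        return "equal"
    if a_le_b:
        return "before"
    if b_le_a:
        return "after"
    return "concurrent"


class ToyOracle:
    """Hypothetical stand-in for the timeline oracle (no transitivity)."""

    def __init__(self):
        self.decided = {}  # (a, b) -> "before", meaning a is ordered before b
        self.calls = 0

    def order(self, a, b):
        self.calls += 1
        if (a, b) in self.decided:
            return "before"
        if (b, a) in self.decided:
            return "after"
        self.decided[(a, b)] = "before"  # first request fixes the order
        return "before"


def refine(a, b, oracle):
    """Cheap vector comparison first; ask the oracle only for concurrent pairs."""
    relation = vclock_order(a, b)
    if relation != "concurrent":
        return relation
    return oracle.order(a, b)


oracle = ToyOracle()
print(refine((1, 1, 0), (3, 4, 2), oracle), oracle.calls)
print(refine((0, 1, 3), (1, 0, 3), oracle), oracle.calls)
```

In this sketch the first pair is ordered by the vector clocks alone, so the oracle is never contacted; only the second, concurrent pair costs an oracle call.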
NODE PROGRAM
§ Follows a scatter-gather-like execution model.
§ Node programs are sometimes stateful.
§ Node program state is garbage collected after the query terminates on all servers.
§ Consistency: Weaver delays execution of a node program at a shard until after execution of all preceding and concurrent transactions.
§ Supports transitivity.
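A rough single-process sketch of a stateful node program, here a reachability query. The function signature, the `run_node_program` driver, and the `visited` flag are illustrative assumptions, not Weaver's actual API; in the real system the program would hop between shard servers.

```python
from collections import deque


def reach_program(vertex, neighbors, state, params):
    """Runs at one vertex; returns (result, [(next_vertex, params), ...])."""
    if state.get("visited"):
        return None, []  # already handled during this query
    state["visited"] = True  # per-vertex, per-query state
    if vertex == params["target"]:
        return True, []
    # Scatter: propagate the program to every out-neighbor.
    return None, [(n, params) for n in neighbors]


def run_node_program(graph, program, start, params):
    """Hypothetical driver that delivers the program to vertices in turn."""
    states = {}
    queue = deque([(start, params)])
    found = False
    while queue:
        vertex, p = queue.popleft()
        state = states.setdefault(vertex, {})
        result, next_hops = program(vertex, graph.get(vertex, ()), state, p)
        if result:
            found = True
            break
        queue.extend(next_hops)
    states.clear()  # program state is discarded once the query terminates
    return found


graph = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []}
print(run_node_program(graph, reach_program, "a", {"target": "d"}))
```

The program may be delivered to the same vertex more than once (for example `d` is reachable through both `b` and `c`); the stored state is what lets the later invocation return immediately.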
ARCHITECTURE
§ Shard Servers: The shard servers are responsible for executing both node programs and transactions on the in-memory graph data.
ARCHITECTURE
§ Backing Store:
§ Uses HyperDex Warp as the backing store.
§ Provides data recovery in case of failure.
§ Directs transactions on a vertex to the shard server responsible for it.
ARCHITECTURE
§ Timeline Coordinator:
§ Gatekeeper
§ Timeline oracle
ARCHITECTURE
§ Cluster Manager:
§ Failure detection.
§ System reconfiguration.
PROACTIVE ORDERING USING GATEKEEPERS
§ Vector clock.
§ Maintains a happens-before partial order between refinable timestamps.
§ Synchronization period.
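The three bullets above can be illustrated with a toy model of two gatekeepers. The `Gatekeeper` class and its method names are assumptions made for this sketch; it only shows how a periodic clock announcement turns otherwise concurrent timestamps into ordered ones.

```python
class Gatekeeper:
    """Toy gatekeeper holding one counter per gatekeeper in the system."""

    def __init__(self, index, count):
        self.index = index
        self.clock = [0] * count

    def timestamp(self):
        """Stamp a new transaction by advancing the local counter."""
        self.clock[self.index] += 1
        return tuple(self.clock)

    def announce(self):
        """Clock value sent to peers once per synchronization period."""
        return tuple(self.clock)

    def receive(self, other_clock):
        """Merge a peer's announcement with an element-wise maximum."""
        self.clock = [max(a, b) for a, b in zip(self.clock, other_clock)]


def happens_before(a, b):
    """True if a is ordered strictly before b by the vector clocks."""
    return all(x <= y for x, y in zip(a, b)) and a != b


g0, g1 = Gatekeeper(0, 2), Gatekeeper(1, 2)
t1 = g0.timestamp()        # issued by g0 before any synchronization
t2 = g1.timestamp()        # issued by g1 before any synchronization
g1.receive(g0.announce())  # one synchronization message from g0 to g1
t3 = g1.timestamp()        # issued by g1 after hearing from g0
print(t1, t2, t3)
```

Here `t1` and `t2` are concurrent, so ordering them would need the timeline oracle, while `t3` is ordered after both by the vector clocks alone. A shorter synchronization period leaves fewer pairs in the first category at the price of more messages.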
PROACTIVE ORDERING USING GATEKEEPERS
REACTIVE ORDERING BY TIMELINE ORACLE
§ Timeline oracle:
§ Guarantees that the event dependency graph remains acyclic.
§ Event dependency graph and new event creation.
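One way to picture the oracle is as a small directed graph of events in which an ordering request either returns an order that is already implied or commits a new edge that cannot close a cycle. The sketch below is an assumption-laden toy (the class, method names, and string event ids are all hypothetical), not the oracle's real interface.

```python
class TimelineOracle:
    """Toy event dependency graph; an edge a -> b means a happens before b."""

    def __init__(self):
        self.after = {}  # event -> set of events ordered directly after it

    def create_event(self, event):
        self.after.setdefault(event, set())

    def _reaches(self, src, dst):
        """Depth-first search along happens-before edges."""
        stack, seen = [src], set()
        while stack:
            node = stack.pop()
            if node == dst:
                return True
            if node not in seen:
                seen.add(node)
                stack.extend(self.after.get(node, ()))
        return False

    def order(self, a, b):
        """Return (earlier, later), committing a -> b only if nothing is implied."""
        self.create_event(a)
        self.create_event(b)
        if self._reaches(a, b):
            return (a, b)
        if self._reaches(b, a):
            return (b, a)  # the reverse order is already implied
        self.after[a].add(b)  # safe: b does not reach a, so no cycle forms
        return (a, b)


oracle = TimelineOracle()
print(oracle.order("t1", "t2"))
print(oracle.order("t2", "t3"))
print(oracle.order("t3", "t1"))
```

The third request asks for t3 before t1, but the earlier answers already imply t1 before t3, so the sketch returns that order instead of adding an edge that would create a cycle.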
TRANSACTIONS
§ Transactions are executed on the backing store to ensure validity.
§ FIFO channels.
§ NOP transactions.
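A sketch of why FIFO channels and NOP transactions go together, under simplifying assumptions: timestamps are plain integers rather than refinable timestamps, and the `Shard` class is hypothetical. The shard keeps one FIFO queue per gatekeeper and can only execute the earliest queue head once every queue has something in it, because an empty queue might still deliver an earlier transaction.

```python
from collections import deque


class Shard:
    """Toy shard server with one FIFO queue per gatekeeper."""

    def __init__(self, gatekeepers):
        self.queues = {g: deque() for g in gatekeepers}
        self.executed = []

    def deliver(self, gatekeeper, ts, op):
        """FIFO channel: each gatekeeper's messages arrive in timestamp order."""
        self.queues[gatekeeper].append((ts, op))

    def drain(self):
        """Execute queue heads in timestamp order while every queue is non-empty."""
        while all(self.queues.values()):
            g = min(self.queues, key=lambda k: self.queues[k][0][0])
            ts, op = self.queues[g].popleft()
            if op != "NOP":  # a NOP only advances the queue, nothing to apply
                self.executed.append(op)


shard = Shard(["g0", "g1"])
shard.deliver("g0", 1, "write A")
shard.drain()
print(shard.executed)  # still empty: g1 has sent nothing yet
shard.deliver("g1", 2, "NOP")
shard.drain()
print(shard.executed)  # "write A" can now run
```

Without the NOP from the idle gatekeeper `g1`, the transaction from `g0` would wait indefinitely in this model.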
FAULT TOLERANCE
§ Graph data is persistently stored on the backing store.
§ All node programs are re-executed by Weaver with a fresh timestamp after recovery.
§ To maintain monotonicity of timestamps on gatekeeper failures, a backup gatekeeper restarts the vector clock for the failed gatekeeper.
GRAPH PARTITIONING & CACHING
§ Streaming graph partitioning algorithms:
§ To reduce communication overhead.
§ Caching analysis for path discovery:
§ The discovered path is stored in a cache at each vertex.
§ The path is deleted from the cache once an edge in the path is deleted.
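The caching bullets can be sketched as follows. It is an assumption of this example that each vertex on a discovered path caches the remaining path to the destination; the `PathCache` class and its method names are hypothetical.

```python
class PathCache:
    """Toy per-vertex cache of discovered paths."""

    def __init__(self):
        self.at_vertex = {}  # vertex -> {destination: remaining path}

    def store(self, path):
        """Cache the remaining path to the destination at every vertex on it."""
        dst = path[-1]
        for i, vertex in enumerate(path[:-1]):
            self.at_vertex.setdefault(vertex, {})[dst] = path[i:]

    def lookup(self, src, dst):
        return self.at_vertex.get(src, {}).get(dst)

    def on_edge_deleted(self, u, v):
        """Drop every cached path that uses the deleted edge u -> v."""
        for cached in self.at_vertex.values():
            stale = [d for d, p in cached.items() if (u, v) in zip(p, p[1:])]
            for d in stale:
                del cached[d]


cache = PathCache()
cache.store(["n1", "n3", "n5", "n7"])
print(cache.lookup("n3", "n7"))
cache.on_edge_deleted("n5", "n7")
print(cache.lookup("n1", "n7"))
```

A later query that reaches `n3` can reuse the cached suffix instead of traversing again, and deleting any edge on the path invalidates every cached copy that depends on it.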
EVALUATION
Average latency (secs) of a Bitcoin block query in blockchain application.
EVALUATION
Transaction latency for a social network workload on the LiveJournal graph.
EVALUATION
Weaver shows almost linear scalability with the number of shards.
RESULTS
§ Weaver enables CoinGraph to execute Bitcoin block queries 8x faster than Blockchain.info.
§ Weaver outperforms Titan by 10.9x on a social network workload and GraphLab by 4x on a node program workload.
§ Weaver scales linearly with the number of gatekeeper and shard servers for graph analysis queries.
IMPORTANT POINTS
§ The proactive costs of periodic synchronization messages between gatekeepers and the reactive costs incurred at the timeline oracle need to be carefully balanced.
§ As the synchronization period increases, reliance on the timeline oracle increases.
§ The TrueTime system assumes no network or communication latency, so a system synchronized with an average error bound ε will necessarily incur a mean latency of 2ε.
§ The number of shard servers and gatekeepers is a potential bottleneck for query throughput.
QUESTIONS
§ Why is a node program allowed to visit a vertex multiple times in the Weaver model?
§ The graph data on the shard servers is kept in memory. Will keeping all data in memory increase performance at the expense of cost?
§ Does the creation of new events by the timeline oracle affect the model in any way (for example, by adding overhead)?
REFERENCE
Ayush Dubey, Greg D. Hill, Robert Escriva, and Emin Gün Sirer. Weaver: a high-performance, transactional graph database based on refinable timestamps. Proc. VLDB Endow. 9(11): 852-863, 2016.