Masterless Distributed Applications With Riak...
Transcript of Masterless Distributed Applications With Riak...
![Page 1: Masterless Distributed Applications With Riak Coretim-tang.github.io/images/pdf/Masterless_Distributed_Applications_With_Riak_Core.pdfMasterless Distributed Applications With Riak](https://reader030.fdocuments.us/reader030/viewer/2022040821/5e6a216c36f500588931ed76/html5/thumbnails/1.jpg)
Masterless Distributed Applications With Riak Core
Tim.Tang May 2016
1
![Page 2: Masterless Distributed Applications With Riak Coretim-tang.github.io/images/pdf/Masterless_Distributed_Applications_With_Riak_Core.pdfMasterless Distributed Applications With Riak](https://reader030.fdocuments.us/reader030/viewer/2022040821/5e6a216c36f500588931ed76/html5/thumbnails/2.jpg)
Why Riak Core?
2
Distributed, Scalable, Failure-tolerant
![Page 3: Masterless Distributed Applications With Riak Coretim-tang.github.io/images/pdf/Masterless_Distributed_Applications_With_Riak_Core.pdfMasterless Distributed Applications With Riak](https://reader030.fdocuments.us/reader030/viewer/2022040821/5e6a216c36f500588931ed76/html5/thumbnails/3.jpg)
Why Riak Core?
3
Distributed, Scalable, Failure-tolerant
No central coordinator.Easy to setup/operate.
![Page 4: Masterless Distributed Applications With Riak Coretim-tang.github.io/images/pdf/Masterless_Distributed_Applications_With_Riak_Core.pdfMasterless Distributed Applications With Riak](https://reader030.fdocuments.us/reader030/viewer/2022040821/5e6a216c36f500588931ed76/html5/thumbnails/4.jpg)
Why Riak Core?
4
Distributed, Scalable, Failure-tolerant
Horizontally scalable.Easy add more physical nodes.
![Page 5: Masterless Distributed Applications With Riak Coretim-tang.github.io/images/pdf/Masterless_Distributed_Applications_With_Riak_Core.pdfMasterless Distributed Applications With Riak](https://reader030.fdocuments.us/reader030/viewer/2022040821/5e6a216c36f500588931ed76/html5/thumbnails/5.jpg)
Why Riak Core?
5
Distributed, Scalable, Failure-tolerant
No single point of failure. Self-healing.
![Page 6: Masterless Distributed Applications With Riak Coretim-tang.github.io/images/pdf/Masterless_Distributed_Applications_With_Riak_Core.pdfMasterless Distributed Applications With Riak](https://reader030.fdocuments.us/reader030/viewer/2022040821/5e6a216c36f500588931ed76/html5/thumbnails/6.jpg)
Why Consistent Hash?• Limits reshuffling of keys when
hash table data structure is rebalanced (Add/Remove Nodes).
• Uses consistent hashing to determine where to store data on a primary replica as well as fallbacks if the primary is offline.
6
![Page 7: Masterless Distributed Applications With Riak Coretim-tang.github.io/images/pdf/Masterless_Distributed_Applications_With_Riak_Core.pdfMasterless Distributed Applications With Riak](https://reader030.fdocuments.us/reader030/viewer/2022040821/5e6a216c36f500588931ed76/html5/thumbnails/7.jpg)
Why Consistent Hash?
7
Add Node 4Remove Node 3
![Page 8: Masterless Distributed Applications With Riak Coretim-tang.github.io/images/pdf/Masterless_Distributed_Applications_With_Riak_Core.pdfMasterless Distributed Applications With Riak](https://reader030.fdocuments.us/reader030/viewer/2022040821/5e6a216c36f500588931ed76/html5/thumbnails/8.jpg)
Concepts: The Ring
8
![Page 9: Masterless Distributed Applications With Riak Coretim-tang.github.io/images/pdf/Masterless_Distributed_Applications_With_Riak_Core.pdfMasterless Distributed Applications With Riak](https://reader030.fdocuments.us/reader030/viewer/2022040821/5e6a216c36f500588931ed76/html5/thumbnails/9.jpg)
Concepts: Virtual Node• One Erlang process per partition in the
consistent hashing ring.• One partition may have multi-vnodes.• Fundamental unit of replication, fault
tolerance, concurrency.• Receives work for its portion of the hash
space.
9
![Page 10: Masterless Distributed Applications With Riak Coretim-tang.github.io/images/pdf/Masterless_Distributed_Applications_With_Riak_Core.pdfMasterless Distributed Applications With Riak](https://reader030.fdocuments.us/reader030/viewer/2022040821/5e6a216c36f500588931ed76/html5/thumbnails/10.jpg)
Concepts: Virtual Node Master• Keep track of all active vnodes on its node
receives messages from coordinating FSMs.• Translates partition numbers to local PIDs
and dispatches commands to individual vnodes.
• One vnode_master per Physical Node.
10
![Page 11: Masterless Distributed Applications With Riak Coretim-tang.github.io/images/pdf/Masterless_Distributed_Applications_With_Riak_Core.pdfMasterless Distributed Applications With Riak](https://reader030.fdocuments.us/reader030/viewer/2022040821/5e6a216c36f500588931ed76/html5/thumbnails/11.jpg)
Concepts: N/R/W• N = number of replicas to store (on
distinct nodes)• R = number of replica responses
needed for a successful read per-request• W = number of replica responses
needed for a successful write perrequest
11
![Page 12: Masterless Distributed Applications With Riak Coretim-tang.github.io/images/pdf/Masterless_Distributed_Applications_With_Riak_Core.pdfMasterless Distributed Applications With Riak](https://reader030.fdocuments.us/reader030/viewer/2022040821/5e6a216c36f500588931ed76/html5/thumbnails/12.jpg)
Concepts: N/R/W
12
![Page 13: Masterless Distributed Applications With Riak Coretim-tang.github.io/images/pdf/Masterless_Distributed_Applications_With_Riak_Core.pdfMasterless Distributed Applications With Riak](https://reader030.fdocuments.us/reader030/viewer/2022040821/5e6a216c36f500588931ed76/html5/thumbnails/13.jpg)
Concepts: N/R/W
13
![Page 14: Masterless Distributed Applications With Riak Coretim-tang.github.io/images/pdf/Masterless_Distributed_Applications_With_Riak_Core.pdfMasterless Distributed Applications With Riak](https://reader030.fdocuments.us/reader030/viewer/2022040821/5e6a216c36f500588931ed76/html5/thumbnails/14.jpg)
Concepts:Preference List
14
![Page 15: Masterless Distributed Applications With Riak Coretim-tang.github.io/images/pdf/Masterless_Distributed_Applications_With_Riak_Core.pdfMasterless Distributed Applications With Riak](https://reader030.fdocuments.us/reader030/viewer/2022040821/5e6a216c36f500588931ed76/html5/thumbnails/15.jpg)
Concepts:Read Repair
15
• If a read detects that a vnode has stale data, it is repaired via asynchronous update.
• Passive Anti-Entropy, helps implement eventual consistency.
![Page 16: Masterless Distributed Applications With Riak Coretim-tang.github.io/images/pdf/Masterless_Distributed_Applications_With_Riak_Core.pdfMasterless Distributed Applications With Riak](https://reader030.fdocuments.us/reader030/viewer/2022040821/5e6a216c36f500588931ed76/html5/thumbnails/16.jpg)
Concepts: VClock
16
Ops NodeA([email protected]) NodeB([email protected]) NodeC([email protected])
NodeA +500 500 [{A,1}] 500 [{A,1}] 500 [{A,1}]
NodeA +200 700 [{A,2}] 700 [{A,2}] 700 [{A,2}]
NodeC + 300 1050 [{A,2}, {C,1}] 1050 [{A,2}, {C,1}] 1050 [{A,2}, {C,1}]
Network Split -- (A,B), (C)NodeC + 100 1050 [{A,2},{C,1}] 1050 [{A,2},{C,1}] 1150 [{A,2}, {C,2}]
NodeB + 500 1550 [{A,2}, {B,1}, {C,1}] 1550 [{A,2}, {B,1}, {C,1}] 1150 [{A,2}, {C,2}]
Network Repaired -- (A,B,C)NodeA + 50 1600 [{A,3}, {B,1}, {C,1}] 1600 [{A,3}, {B,1}, {C,1}] 1200 [{A,3}, {C,2}]
Get Request On NodeA, How To Merge Results?
![Page 17: Masterless Distributed Applications With Riak Coretim-tang.github.io/images/pdf/Masterless_Distributed_Applications_With_Riak_Core.pdfMasterless Distributed Applications With Riak](https://reader030.fdocuments.us/reader030/viewer/2022040821/5e6a216c36f500588931ed76/html5/thumbnails/17.jpg)
Concepts:VClock
17
• Last Write Wins (LWW)• Allow multiple versions to coexist, caller
reconcile the versions with full context.• Use riak_dt module to handle data
conflicting.
![Page 18: Masterless Distributed Applications With Riak Coretim-tang.github.io/images/pdf/Masterless_Distributed_Applications_With_Riak_Core.pdfMasterless Distributed Applications With Riak](https://reader030.fdocuments.us/reader030/viewer/2022040821/5e6a216c36f500588931ed76/html5/thumbnails/18.jpg)
Concepts:Handoff
18
• Handoff Types:• Ownership: occurs when a new node joins
the cluster or the vnode needs to be moved.• Hinted: occurs when a "fallback" vnode took
the responsibility for a "primary" vnode but the primary vnode is reachable again.
• Repairs: repair handoff happens when your application explicitly calls riak_core_vnode_manager:repair/3.
• Resize: > Riak core 2.0, riak_core_ring:resize().
![Page 19: Masterless Distributed Applications With Riak Coretim-tang.github.io/images/pdf/Masterless_Distributed_Applications_With_Riak_Core.pdfMasterless Distributed Applications With Riak](https://reader030.fdocuments.us/reader030/viewer/2022040821/5e6a216c36f500588931ed76/html5/thumbnails/19.jpg)
MisConcepts:Fallback
19
• One Physical Node down, all vnodes(primary) on this physical node status will fallback to another physical node. Switch to type fallback.
• Fallback is never performed by another partition, it goes to another node but keeps the index.
• For some time fallback and primary vnodes coexists.
![Page 20: Masterless Distributed Applications With Riak Coretim-tang.github.io/images/pdf/Masterless_Distributed_Applications_With_Riak_Core.pdfMasterless Distributed Applications With Riak](https://reader030.fdocuments.us/reader030/viewer/2022040821/5e6a216c36f500588931ed76/html5/thumbnails/20.jpg)
Routing With Consistent Hash
20
![Page 21: Masterless Distributed Applications With Riak Coretim-tang.github.io/images/pdf/Masterless_Distributed_Applications_With_Riak_Core.pdfMasterless Distributed Applications With Riak](https://reader030.fdocuments.us/reader030/viewer/2022040821/5e6a216c36f500588931ed76/html5/thumbnails/21.jpg)
Adding A Node
21
![Page 22: Masterless Distributed Applications With Riak Coretim-tang.github.io/images/pdf/Masterless_Distributed_Applications_With_Riak_Core.pdfMasterless Distributed Applications With Riak](https://reader030.fdocuments.us/reader030/viewer/2022040821/5e6a216c36f500588931ed76/html5/thumbnails/22.jpg)
Traditional Router
22
![Page 23: Masterless Distributed Applications With Riak Coretim-tang.github.io/images/pdf/Masterless_Distributed_Applications_With_Riak_Core.pdfMasterless Distributed Applications With Riak](https://reader030.fdocuments.us/reader030/viewer/2022040821/5e6a216c36f500588931ed76/html5/thumbnails/23.jpg)
Riak Core Router
23
![Page 24: Masterless Distributed Applications With Riak Coretim-tang.github.io/images/pdf/Masterless_Distributed_Applications_With_Riak_Core.pdfMasterless Distributed Applications With Riak](https://reader030.fdocuments.us/reader030/viewer/2022040821/5e6a216c36f500588931ed76/html5/thumbnails/24.jpg)
How Do The Routers Reach Agreement?
24
• Each node has one copy of ring cache.• Compare ring state strictly by gossip protocol.• Ring knows each vnode status.
![Page 25: Masterless Distributed Applications With Riak Coretim-tang.github.io/images/pdf/Masterless_Distributed_Applications_With_Riak_Core.pdfMasterless Distributed Applications With Riak](https://reader030.fdocuments.us/reader030/viewer/2022040821/5e6a216c36f500588931ed76/html5/thumbnails/25.jpg)
What We Get “Out Of Box”• Physical Node cluster state management.• Ring state management.• Vnode placement and replication.• Cluster and ring state gossip protocols.• Consistent hashing utilities.• Handoff activities, covering set callbacks.• Rolling upgrade capability.• Key based request dispatch.• etc…
25
![Page 26: Masterless Distributed Applications With Riak Coretim-tang.github.io/images/pdf/Masterless_Distributed_Applications_With_Riak_Core.pdfMasterless Distributed Applications With Riak](https://reader030.fdocuments.us/reader030/viewer/2022040821/5e6a216c36f500588931ed76/html5/thumbnails/26.jpg)
Building An Application On Riak Core
• MIDI demo => https://github.com/tim-tang/midi• Reference:
• http://marianoguerra.github.io/little-riak-core-book/index.html
• Rebar3 => https://www.rebar3.org/• rebar3_template_riak_core => https://github.com/
marianoguerra/rebar3_template_riak_core• Riak Core source => https://github.com/basho/
riak_core• Erlang/OTP 18.
![Page 27: Masterless Distributed Applications With Riak Coretim-tang.github.io/images/pdf/Masterless_Distributed_Applications_With_Riak_Core.pdfMasterless Distributed Applications With Riak](https://reader030.fdocuments.us/reader030/viewer/2022040821/5e6a216c36f500588931ed76/html5/thumbnails/27.jpg)
Riak Core Pitfalls
• Cluster membership is controlled by a human, even when a node failure has been (correctly) detected by the cluster manager.
• Vnode distribution around the ring is sometimes suboptimal.
![Page 28: Masterless Distributed Applications With Riak Coretim-tang.github.io/images/pdf/Masterless_Distributed_Applications_With_Riak_Core.pdfMasterless Distributed Applications With Riak](https://reader030.fdocuments.us/reader030/viewer/2022040821/5e6a216c36f500588931ed76/html5/thumbnails/28.jpg)
Is it a good fit?• It expects you to have a "key" that
links to a blob of data or service.• The key (or rather its chash)
determines its primary vnode and adjacent replicas.
• The data itself is opaque and has application context.
![Page 29: Masterless Distributed Applications With Riak Coretim-tang.github.io/images/pdf/Masterless_Distributed_Applications_With_Riak_Core.pdfMasterless Distributed Applications With Riak](https://reader030.fdocuments.us/reader030/viewer/2022040821/5e6a216c36f500588931ed76/html5/thumbnails/29.jpg)
Not Mentioned• Merkle Trees (AAE): https://
github.com/basho/riak_core/blob/develop/docs/hashtree.md
• Ring Resizing: https://github.com/basho/riak_core/blob/develop/docs/ring-resizing.md
![Page 30: Masterless Distributed Applications With Riak Coretim-tang.github.io/images/pdf/Masterless_Distributed_Applications_With_Riak_Core.pdfMasterless Distributed Applications With Riak](https://reader030.fdocuments.us/reader030/viewer/2022040821/5e6a216c36f500588931ed76/html5/thumbnails/30.jpg)
References
• Riak Core Confliction resolution => https://github.com/tim-tang/try-try-try/tree/master/04-riak-core-conflict-resolution
• CRDT LASP => https://github.com/lasp-lang/lasp
• Why Vector Clock are Easy => http://basho.com/posts/technical/why-vector-clocks-are-easy/
![Page 31: Masterless Distributed Applications With Riak Coretim-tang.github.io/images/pdf/Masterless_Distributed_Applications_With_Riak_Core.pdfMasterless Distributed Applications With Riak](https://reader030.fdocuments.us/reader030/viewer/2022040821/5e6a216c36f500588931ed76/html5/thumbnails/31.jpg)
Thanks!
Q&A.31