CAP THEOREM Large Scale Data Management

Transcript of: Large Scale Data Management - CAP Theorem (pagesperso.lina.univ-nantes.fr/~molli-p/pmwiki/...)

Page 1:

CAP  THEOREM  Large  Scale  Data  Management  

Page 2:

Consistency, Availability, Partition-Tolerance

•  Conjecture by Eric Brewer at PODC 2000:
  –  It is impossible for a web service to provide the following three guarantees:
     •  Consistency
     •  Availability
     •  Partition-tolerance

•  Established as a theorem in 2002:
  –  Gilbert, Seth, and Nancy Lynch. Brewer's conjecture and the feasibility of consistent, available, partition-tolerant web services. ACM SIGACT News, vol. 33, issue 2, 2002, p. 51-59.

Aleksandar Bradic, Vast.com - http://fr.slideshare.net/alekbr/cap-theorem

Page 3:

CAP  theorem  

•  Consistency - all nodes should see the same data at the same time

•  Availability - node failures do not prevent survivors from continuing to operate

•  Partition-tolerance - the system continues to operate despite arbitrary message loss

•  A distributed system can satisfy any two of these guarantees at the same time, but not all three

Aleksandar Bradic, Vast.com - http://fr.slideshare.net/alekbr/cap-theorem

Page 4:

Consistency  +  Availability  

•  Examples:
  –  Single-site databases
  –  Cluster databases
  –  LDAP
  –  xFS file system

•  Traits:
  –  2-phase commit (a minimal sketch follows below)
  –  cache validation protocols
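
The traits above can be made concrete with a small example. Below is a minimal, illustrative sketch of the two-phase commit message flow, assuming no network partitions (the CA setting): the coordinator commits only if every participant votes yes, which keeps replicas consistent but blocks if a participant is unreachable. The `Coordinator` and `Participant` names are assumptions for the example, not from the slides.

```python
# Minimal sketch of two-phase commit (2PC): the coordinator commits only
# when every participant votes yes, preserving consistency at the cost of
# blocking when a participant cannot be reached.

class Participant:
    def __init__(self, name):
        self.name = name
        self.state = "init"

    def prepare(self):
        # Vote yes if the local transaction can be made durable.
        self.state = "prepared"
        return True

    def commit(self):
        self.state = "committed"

    def abort(self):
        self.state = "aborted"


class Coordinator:
    def __init__(self, participants):
        self.participants = participants

    def run_transaction(self):
        # Phase 1: collect votes from every participant.
        votes = [p.prepare() for p in self.participants]
        # Phase 2: commit only if all voted yes, otherwise abort everywhere.
        if all(votes):
            for p in self.participants:
                p.commit()
            return "committed"
        for p in self.participants:
            p.abort()
        return "aborted"


if __name__ == "__main__":
    nodes = [Participant("db1"), Participant("db2")]
    print(Coordinator(nodes).run_transaction())  # -> committed
```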

Aleksandar Bradic, Vast.com - http://fr.slideshare.net/alekbr/cap-theorem

Page 5:

Consistency + Partition Tolerance

•  Examples:
  –  Distributed databases
  –  Distributed locking
  –  Majority protocols

•  Traits:
  –  Pessimistic locking
  –  Make minority partitions unavailable (quorums) - see the sketch below
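
A minimal sketch (not from the slides) of the quorum rule mentioned above: a replica serves a request only while it can reach a strict majority of the cluster, so the minority side of a partition becomes unavailable rather than inconsistent. Function and parameter names are illustrative.

```python
# Minimal sketch of "make minority partitions unavailable": a node only
# serves requests while it can reach a strict majority of the cluster.

def has_quorum(cluster_size: int, reachable_nodes: int) -> bool:
    """True if this side of the partition holds a strict majority."""
    return reachable_nodes > cluster_size // 2


def handle_request(cluster_size: int, reachable_nodes: int, op: str) -> str:
    if not has_quorum(cluster_size, reachable_nodes):
        # Sacrifice availability: refuse rather than risk inconsistency.
        return "unavailable (minority partition)"
    return f"executing {op} on majority side"


if __name__ == "__main__":
    # A 5-node cluster split 3/2 by a partition.
    print(handle_request(5, 3, "write x=1"))  # majority side keeps serving
    print(handle_request(5, 2, "read x"))     # minority side refuses
```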

Aleksandar Bradic, Vast.com - http://fr.slideshare.net/alekbr/cap-theorem

Page 6:

Availability + Partition Tolerance

•  Examples:
  –  Coda
  –  DNS
  –  Usenet

•  Traits:
  –  Expiration/leases
  –  Conflict resolution
  –  Optimistic replication (a merge sketch follows below)
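
As a concrete illustration of optimistic replication with conflict resolution, here is a minimal last-writer-wins merge sketch (one possible resolution policy, not prescribed by the slides): both sides of a partition keep accepting writes and reconcile when connectivity returns. The key names and timestamps are assumptions for the example.

```python
# Minimal sketch of optimistic replication with last-writer-wins (LWW)
# conflict resolution: divergent replicas are merged by keeping, for each
# key, the value carrying the highest timestamp.

def lww_merge(a: dict, b: dict) -> dict:
    """Merge two replicas of a key -> (value, timestamp) map."""
    merged = dict(a)
    for key, (value, ts) in b.items():
        if key not in merged or ts > merged[key][1]:
            merged[key] = (value, ts)
    return merged


if __name__ == "__main__":
    # Divergent writes accepted during a partition.
    replica1 = {"cart": (["book"], 10)}
    replica2 = {"cart": (["book", "pen"], 12)}
    print(lww_merge(replica1, replica2))  # newer write (ts=12) wins
```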

Aleksandar Bradic, Vast.com - http://fr.slideshare.net/alekbr/cap-theorem

Page 7:

Data  Store  and  CAP  

•  RDBMS: CA (master/slave replication, sharding)

•  Amazon Dynamo: AP (read-repair, application hooks)

•  Terracotta: CA (quorum vote, majority partition survival)

•  Apache Cassandra: AP (partitioning, read-repair)

•  Apache ZooKeeper: AP (consensus protocol)

•  Google BigTable: CA

•  Apache CouchDB: AP

Aleksandar Bradic, Vast.com - http://fr.slideshare.net/alekbr/cap-theorem

Page 8:

http://blog.nahurst.com/visual-guide-to-nosql-systems

Page 9:

Techniques  for  CAP  

•  Consistent hashing
•  Vector clocks (sketched below)
•  Sloppy quorum
•  Merkle trees
•  Gossip-based protocols
•  CRDTs
•  More on these later…
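
These techniques are covered later in the course. As a small taste, here is a minimal, illustrative vector-clock sketch (names are assumptions, not from the slides): each node increments its own entry on a local update, clocks are merged on message receipt, and incomparable clocks reveal concurrent updates that need conflict resolution.

```python
# Minimal vector-clock sketch: per-node counters, component-wise merge,
# and a comparison that detects causally concurrent updates.

def increment(clock: dict, node: str) -> dict:
    """Return a copy of `clock` with `node`'s entry incremented."""
    new_clock = dict(clock)
    new_clock[node] = new_clock.get(node, 0) + 1
    return new_clock


def merge(a: dict, b: dict) -> dict:
    """Component-wise maximum, used when a message carrying `b` arrives."""
    return {n: max(a.get(n, 0), b.get(n, 0)) for n in set(a) | set(b)}


def happened_before(a: dict, b: dict) -> bool:
    """True if `a` causally precedes `b` (a <= b componentwise and a != b)."""
    keys = set(a) | set(b)
    return all(a.get(n, 0) <= b.get(n, 0) for n in keys) and a != b


if __name__ == "__main__":
    c1 = increment({}, "node1")     # update on node1
    c2 = increment({}, "node2")     # concurrent update on node2
    print(happened_before(c1, c2))  # False
    print(happened_before(c2, c1))  # False -> concurrent: conflict to resolve
    print(merge(c1, c2))            # {'node1': 1, 'node2': 1}
```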

Aleksandar Bradic, Vast.com - http://fr.slideshare.net/alekbr/cap-theorem

Page 10:

Idea  of  the  proof  

•  http://www.youtube.com/watch?v=Jw1iFr4v58M

Page 11:

Atomic  Data  Object  

•  Atomic/Linearizable Consistency:
  –  There must exist a total order on all operations such that each operation looks as if it were completed at a single instant

  –  This is equivalent to requiring requests on the distributed shared memory to act as if they were executing on a single node, responding to operations one at a time (see the sketch below)
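
A minimal sketch of that single-node picture, assuming a shared-memory setting: all operations are funnelled through one lock, so there is a total order in which each read and write appears to take effect at a single instant. The `AtomicRegister` name is illustrative.

```python
# Minimal sketch of the single-node view behind atomic/linearizable
# consistency: operations execute one at a time, giving a total order.

import threading

class AtomicRegister:
    def __init__(self, initial):
        self._value = initial
        self._lock = threading.Lock()

    def write(self, value):
        with self._lock:          # operations take effect one at a time
            self._value = value

    def read(self):
        with self._lock:
            return self._value


if __name__ == "__main__":
    reg = AtomicRegister("v0")
    reg.write("v1")
    print(reg.read())  # "v1": every later read observes the completed write
```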

Aleksandar Bradic, Vast.com - http://fr.slideshare.net/alekbr/cap-theorem

Page 12:

Available  Data  Objects  

•  For a distributed system to be continuously available, every request received by a non-failing node in the system must result in a response
  –  That is, any algorithm used by the service must eventually terminate
  •  (In some ways, this is a weak definition of availability: it puts no bounds on how long the algorithm may run before terminating, and therefore allows unbounded computation)

•  (On the other hand, when qualified by the need for partition tolerance, this can be seen as a strong definition of availability: even when severe network failures occur, every request must terminate)

Aleksandar Bradic, Vast.com - http://fr.slideshare.net/alekbr/cap-theorem

Page 13:

Partition Tolerance

•  In order to model partition tolerance, the network is allowed to lose arbitrarily many messages sent from one node to another

•  When a network is partitioned, all messages sent from nodes in one component of the partition to nodes in another component are lost.

•  The atomicity requirement implies that every response will be atomic, even though arbitrary messages sent as part of the algorithm might not be delivered

•  The availability requirement therefore implies that every node receiving a request from a client must respond, even though arbitrary messages that are sent may be lost

•  Partition tolerance: no set of failures less than total network failure is allowed to cause the system to respond incorrectly

Aleksandar Bradic, Vast.com - http://fr.slideshare.net/alekbr/cap-theorem

Page 14:

Asynchronous  Network  Model  

•  There is no clock

•  Nodes must make decisions based only on the messages received and local computation

Aleksandar Bradic, Vast.com - http://fr.slideshare.net/alekbr/cap-theorem

Page 15:

Asynchronous  Networks:  impossibility  result  

•  Theorem 1: It is impossible in the asynchronous network model to implement a read/write data object that guarantees the following properties in all fair executions (including those in which messages are lost):
  –  Availability
  –  Atomic consistency

Aleksandar Bradic, Vast.com - http://fr.slideshare.net/alekbr/cap-theorem

Page 16:

Asynchronous  Networks:  impossibility  result  

•  Proof (by contradiction):
  –  Assume an algorithm A exists that meets the three criteria: atomicity, availability and partition tolerance
  –  We construct an execution of A in which there exists a request that returns an inconsistent response
  –  Assume that the network consists of at least two nodes. Thus it can be divided into two disjoint, non-empty sets G1 and G2
  –  Assume all messages between G1 and G2 are lost.
  –  If a write occurs in G1 and a read occurs in G2, then the read operation cannot return the result of the earlier write operation.

Aleksandar Bradic, Vast.com - http://fr.slideshare.net/alekbr/cap-theorem

Page 17:

Asynchronous  Networks:  impossibility  result  

•  Formal proof:
  –  Let v0 be the initial value of the atomic object
  –  Let α1 be the prefix of an execution of A in which a single write of a value not equal to v0 occurs in G1, ending with the termination of the write operation
  –  Assume that no other client requests occur in either G1 or G2; assume that no messages from G1 are received in G2, and no messages from G2 are received in G1
  –  We know that the write operation will complete (by the availability requirement)

Aleksandar Bradic, Vast.com - http://fr.slideshare.net/alekbr/cap-theorem

Page 18:

Asynchronous  Networks:  impossibility  result  

•  Let α2 be the prefix of an execution in which a single read occurs in G2 and no other client requests occur, ending with the termination of the read operation

•  During α2, no messages from G2 are received in G1 and no messages from G1 are received in G2

•  We know that the read must return a value (by the availability requirement)

•  The value returned by this execution must be v0, as no write operation has occurred in α2

Aleksandar Bradic, Vast.com - http://fr.slideshare.net/alekbr/cap-theorem

Page 19:

Asynchronous  Networks:  impossibility  result  

•  Let α be an execution beginning with α1 and continuing with α2. To the nodes in G2, α is indistinguishable from α2, as all the messages from G1 to G2 are lost (in both α1 and α2, which together make up α), and α1 does not include any client requests to nodes in G2.

•  Therefore, in the execution α, the read request (from α2) must still return v0.

•  However, the read request does not begin until after the write request (from α1) has completed

•  This contradicts the atomicity property, proving that no such algorithm exists (the contradiction is summarized below)
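
In the slides' notation, with v1 standing for the value written in α1 (a symbol introduced here only for readability), the contradiction can be summed up as:

```latex
% Compact recap of the construction behind Theorem 1.
\[
\alpha = \alpha_1 \cdot \alpha_2, \qquad
\text{where in } \alpha_1:\ \mathrm{write}(v_1) \text{ completes in } G_1,\quad v_1 \neq v_0 .
\]
\[
\text{Nodes in } G_2 \text{ cannot distinguish } \alpha \text{ from } \alpha_2
\;\Longrightarrow\; \text{the read in } \alpha_2 \text{ returns } v_0 .
\]
\[
\text{But the read begins after } \mathrm{write}(v_1) \text{ has completed, and } v_0 \neq v_1
\;\Longrightarrow\; \text{atomicity is violated.}
\]
```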

Aleksandar Bradic, Vast.com - http://fr.slideshare.net/alekbr/cap-theorem

Page 20:

Asynchronous  Networks:  Impossibility  Result  

Aleksandar Bradic, Vast.com - http://fr.slideshare.net/alekbr/cap-theorem

Page 21:

Impossibility  results  

•  Corollary 1.1: It is impossible in the asynchronous network model to implement a read/write data object that guarantees the following properties:
  –  Availability, in all fair executions
  –  Atomic consistency, in fair executions in which no messages are lost

Aleksandar Bradic, Vast.com - http://fr.slideshare.net/alekbr/cap-theorem

Page 22:

Impossibility  results  

•  Proof:
  –  The main idea is that in the asynchronous model, an algorithm has no way of determining whether a message has been lost or has been arbitrarily delayed in the transmission channel
  –  Therefore, if there existed an algorithm that guaranteed atomic consistency in executions in which no messages were lost, there would exist an algorithm that guaranteed atomic consistency in all executions
  –  This would violate Theorem 1

Aleksandar Bradic, Vast.com - http://fr.slideshare.net/alekbr/cap-theorem

Page 23:

CAP  theorem  

•  While it is impossible to provide all three properties (atomicity, availability, and partition tolerance), any two of these properties can be achieved:
  –  Atomic, Partition Tolerant
  –  Atomic, Available
  –  Available, Partition Tolerant

Aleksandar Bradic, Vast.com - http://fr.slideshare.net/alekbr/cap-theorem

Page 24:

Atomic, Partition-Tolerant

•  If availability is not required, it is easy to achieve atomic data and partition tolerance

•  The trivial system that ignores all requests meets these requirements

•  Stronger liveness criterion: if all the messages in an execution are delivered, the system is available and all operations terminate

•  A simple centralized algorithm meets these requirements: a single designated node maintains the value of an object

•  A node receiving a request forwards it to the designated node, which sends a response. When the acknowledgement is received, the node sends a response to the client (see the sketch below)

•  Many distributed databases provide this guarantee, especially algorithms based on distributed locking or quorums: if certain failure patterns occur, the liveness condition is weakened and the service no longer returns responses. If there are no failures, then liveness is guaranteed.
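
A minimal sketch of the centralized algorithm described above (class names are illustrative, not from the slides): a single designated node holds the value, other nodes forward requests to it and answer clients only after its acknowledgement, and a node cut off by a partition simply does not respond, sacrificing availability but never atomicity.

```python
# Minimal sketch of the centralized (designated-node) algorithm for the
# atomic, partition-tolerant case: forward every request to one node and
# block (no response) when that node is unreachable.

class DesignatedNode:
    def __init__(self, initial):
        self.value = initial

    def handle(self, op, value=None):
        if op == "write":
            self.value = value
        return self.value  # acknowledgement carries the current value


class ForwardingNode:
    def __init__(self, designated, reachable=True):
        self.designated = designated
        self.reachable = reachable  # False models a partition

    def request(self, op, value=None):
        if not self.reachable:
            return None  # blocked: no response until the partition heals
        ack = self.designated.handle(op, value)
        return ack       # respond to the client only after the ack


if __name__ == "__main__":
    primary = DesignatedNode("v0")
    node = ForwardingNode(primary)
    cut_off = ForwardingNode(primary, reachable=False)
    print(node.request("write", "v1"))  # v1
    print(node.request("read"))         # v1, consistent
    print(cut_off.request("read"))      # None: unavailable, never inconsistent
```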

Aleksandar Bradic, Vast.com - http://fr.slideshare.net/alekbr/cap-theorem

Page 25:

Atomic,  Available  

•  If there are no partitions, it is possible to provide atomic, available data

•  A centralized algorithm with a single designated node maintaining the value of an object meets these requirements

Aleksandar Bradic, Vast.com - http://fr.slideshare.net/alekbr/cap-theorem

Page 26:

Available, Partition-Tolerant

•  It is possible to provide high availability and partition tolerance if atomic consistency is not required

•  If there are no consistency requirements, the service can trivially return v0, the initial value, in response to every request

•  It is possible to provide weakened consistency in an available, partition-tolerant setting

•  Web caches are one example of a weakly consistent network

Aleksandar Bradic, Vast.com - http://fr.slideshare.net/alekbr/cap-theorem

Page 27:

Partially Synchronous Model

•  The Gilbert and Lynch paper also analyzes CAP in the partially synchronous model: every node has a clock, and all clocks increase at the same rate

•  However, the clocks are not synchronized

•  While Theorem 1 still holds in the partially synchronous model, Corollary 1.1 does not

•  A weaker form of consistency (t-connected consistency) can be achieved

Page 28:

CAP  Conclusion  

•  It is possible to build large-scale distributed data management systems under the CAP theorem:
  –  One property has to be sacrificed.

Page 29:

Sacrificing one Property

•  If Consistency is sacrificed (AP):
  –  Consistency problems are pushed to the applications; they can be more difficult to solve, or not… high programming cost
  –  Deployment on asynchronous infrastructure…

•  If Availability is sacrificed (CP):
  –  Blocking protocols can really block the system
  –  Cheap programming cost on asynchronous infrastructure

•  If Partition tolerance is sacrificed (CA):
  –  Need to provide a quasi-synchronous model, where complex failures never happen
  –  Cheap programming cost with synchronous infrastructure…

Stonebraker, CACM, 2010

Page 30:

Challenges  

•  Whatever choice is made - CA, AP or CP

•  The scalability and throughput that can be achieved with the different approaches will make the difference

•  The balance between programming cost and scalability/efficiency will be the key

•  Nice challenges for scientists and engineers…

Page 31:

Clash  of  cultures  

•  Classic distributed systems: focused on ACID semantics
  –  A: Atomic
  –  C: Consistent
  –  I: Isolated
  –  D: Durable

•  "Modern" Internet systems: focused on BASE
  –  Basically Available
  –  Soft-state (or scalable)
  –  Eventually consistent

NoSQL (CouchDB…) vs NewSQL (VoltDB…)

Dan Pritchett, BASE: An ACID Alternative, ACM Queue, http://queue.acm.org/detail.cfm?id=1394128

Page 32:

http://blogs.the451group.com/information_management/2011/04/15/nosql-newsql-and-beyond/