All you didn't know about the CAP theorem

CAP theorem

The CAP theorem was first presented as a conjecture by Eric Brewer at the Symposium on Principles of Distributed Computing in 2000.

In 2002, Seth Gilbert and Nancy Lynch of MIT published a formal proof of Brewer's conjecture, rendering it a theorem.

According to Brewer, he wanted the community to start a conversation about it, but his words were distorted and treated as a theorem.

What stands behind the CAP?

The CAP theorem states that in a distributed system you can choose only 2 out of 3:

Consistency: every read returns the most recent write

Availability: every non-failed node always executes queries (reads and writes)

Partition tolerance: even if the connections between nodes are down, the other two promises (A and C) are kept.

AP proof: not consistent!

CP proof: not available!

CA proof: no partition tolerance!

The CAP triangle
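The AP and CP cases can be illustrated with a toy model. The sketch below is only an illustration: the `Replica` and `Cluster` classes, the single-register data model and the "AP"/"CP" policy flag are all hypothetical names, not part of any real database.

```python
class Replica:
    """One node holding a single register value (hypothetical toy model)."""

    def __init__(self, name):
        self.name = name
        self.value = None


class Cluster:
    """Two replicas; `mode` is the policy applied while the link between them is down."""

    def __init__(self, mode):
        if mode not in ("AP", "CP"):
            raise ValueError("mode must be 'AP' or 'CP'")
        self.mode = mode
        self.a = Replica("a")
        self.b = Replica("b")
        self.partitioned = False

    def write(self, replica, value):
        other = self.b if replica is self.a else self.a
        if not self.partitioned:
            # Healthy network: the write reaches both replicas.
            replica.value = value
            other.value = value
            return "ok"
        if self.mode == "CP":
            # Refuse the request rather than let the replicas diverge.
            return "error"
        # AP: answer locally; the other replica keeps its old value.
        replica.value = value
        return "ok"

    def read(self, replica):
        if self.partitioned and self.mode == "CP":
            # The node cannot prove its value is the latest one.
            return "error"
        return replica.value
```

During a partition, the AP cluster keeps answering but a read on the other replica can return an old value (not consistent), while the CP cluster answers with an error (not available).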

Let’s take a look at PostgreSQL

Master/slave architecture is one of the common solutions

The slave can be synced with the master asynchronously or synchronously

The transaction system uses two-phase commit to ensure consistency

If a partition occurs, you can’t talk to the server (in the basic case), so the system is not CAP-available.

So, it can’t continue working during a network partition, but it provides strong consistency and high availability. It’s a CA system!

Let’s take a look at MongoDB

MongoDB provides strong consistency, because it is a single-master system and all writes go to the primary by default

MongoDB provides automatic failover in case of partitioning

If a partition occurs, it will stop accepting writes until it believes that it can safely complete them.

So, it can continue working during a network partition, and it gives up availability. It’s a CP system!
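One common way for a single-master system to decide whether writes are "safe" is a majority rule: only the side of the partition that still sees more than half of the members may keep a primary. The sketch below assumes such a rule; the function name `has_majority` is hypothetical and this is not MongoDB's actual election code.

```python
def has_majority(reachable_members: int, total_members: int) -> bool:
    """Return True if this side of a partition sees a strict majority.

    `reachable_members` counts the node itself plus every member it can
    still talk to; `total_members` is the configured cluster size.
    """
    if not 0 <= reachable_members <= total_members:
        raise ValueError("reachable_members must be between 0 and total_members")
    return reachable_members > total_members // 2
```

With three members split 2/1, only the side with two members keeps accepting writes. With four members split 2/2, neither side does, which is the "gives up availability" case.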

Let’s take a look at the Postgres + Salesforce + Heroku Connect system

It’s a master-master system

Heroku Connect is responsible for keeping the system consistent

Salesforce data and our Postgres db don’t know about each other

If a network partition occurs, both storages will be available. Heroku Connect will try to reconnect.

So, it can continue working during a network partition, and it gives up consistency. It should be an AP system!

It is so easy! Now I know everything!

Actually, no.

There are a lot of problems with the CAP theorem:

CAP uses very narrow and far-from-the-real-world definitions

In practice, the choice is only between consistency and availability

Many systems are neither CAP-consistent nor CAP-available

Pure AP systems are useless

Pure CP systems might not behave as expected

What is wrong with the definitions

Consistency in CAP actually means linearizability (and it’s really hard to reach it).

Availability in CAP is defined as “every request received by a non-failing [database] node in the system must result in a [non-error] response” and it’s not restricted by time.

The only fault considered by the CAP theorem is a network partition.

Linearizability
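To make the definition more concrete, here is a rough sketch that flags one kind of linearizability violation on a single register: a read that starts after a write has finished but still returns an older value. It is only a necessary check, not a full linearizability checker, and it assumes that writes do not overlap each other. The `Op` record and `find_stale_reads` are hypothetical names.

```python
from typing import List, NamedTuple


class Op(NamedTuple):
    kind: str     # "write" or "read"
    value: int    # value written, or value returned by the read
    start: float  # time the client sent the request
    end: float    # time the client got the response


def find_stale_reads(history: List[Op]) -> List[Op]:
    """Return the reads that observed a value older than the latest completed write."""
    writes = sorted((op for op in history if op.kind == "write"),
                    key=lambda op: op.end)
    stale = []
    for read in (op for op in history if op.kind == "read"):
        completed = [w for w in writes if w.end < read.start]
        if not completed:
            continue  # nothing had been written yet, so any result is fine here
        # Allowed results: the latest completed write, or any write
        # still in flight while the read was running.
        allowed = {completed[-1].value}
        allowed |= {w.value for w in writes
                    if w.end >= read.start and w.start <= read.end}
        if read.value not in allowed:
            stale.append(read)
    return stale
```

For example, if a write of 2 completes at time 3 and a read issued at time 4 still returns 1, that read is reported as stale.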

Why are node failures outside CAP?

By the definition of availability: ...every node (if not failed) always...

By the proof of CAP: the proof by Gilbert and Lynch relies on having code running on both sides of the partition.

In some cases, a partition will be equivalent to a failure, but this equivalence is obtained only by implementing specific logic in all the nodes.

Of course, we should manage node failures, but CAP doesn’t help us here.

AP / CP choice

Partition tolerance basically means that you’re communicating over an asynchronous network that may delay or drop messages. The internet and all our data centers have this property, so you don’t really have any choice in this matter.

Many systems are only “P”

If you have one master and one slave, and you are partitioned from the master, you can’t write, but you can read. That is not CAP-available.

OK, so it’s a CP system? Not quite: replication from master to slave is usually asynchronous, so writes made after the last sync and before the partition never reach the slave, and you do not have CAP-consistency either.
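The replication gap can be shown with a small sketch. The classes and the explicit `replicate` step below are hypothetical stand-ins for an asynchronous replication stream, not a model of any specific database.

```python
class Primary:
    """Accepts writes and queues them for later shipping to the replica."""

    def __init__(self):
        self.data = {}
        self.pending = []  # writes not yet shipped to the replica

    def write(self, key, value):
        self.data[key] = value
        self.pending.append((key, value))


class ReadReplica:
    """Serves reads from whatever has been replicated so far."""

    def __init__(self):
        self.data = {}

    def read(self, key):
        return self.data.get(key)


def replicate(primary, replica):
    """Ship all pending writes; in a real system this runs in the background."""
    for key, value in primary.pending:
        replica.data[key] = value
    primary.pending.clear()
```

If a partition happens between a `write` and the next `replicate`, the replica keeps serving the old value for as long as the partition lasts.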

AP and CP problems

Pure AP is useless: a system that just returns any random value would still count as an AP system

Pure CP is useless too: a partition in CAP has no fixed duration, so the system provides only eventual consistency, which is not the strong consistency that we want to have.

Try to digest it

How to describe a distributed system

Remember the narrow CAP definitions, as they are still widely used

Use the PACELC(A) theorem instead of CAP; it adds a consistency/latency tradeoff

Describe how ACID/BASE principles apply to your system

Decide if the system suits your needs, considering the project you are working on.

Let’s take a look at PACELC(A)

The PACELC theorem was first described and formalised by Daniel J. Abadi from Yale University in 2012.

IF there is a partition (P), how does the system trade off availability and consistency (A and C)?

ELSE (E), when the system is running normally in the absence of partitions, how does the system trade off latency (L) and consistency (C)?

As the PACELC theorem is based on CAP, it also uses the CAP definitions.
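The IF/ELSE structure maps naturally onto a two-field record. The sketch below is just a notation helper; the `Pacelc` class and its field names are hypothetical, and it produces labels of the form PA/EL or PC/EC.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pacelc:
    """A system's two tradeoffs in PACELC notation."""

    on_partition: str  # "A" (availability) or "C" (consistency)
    otherwise: str     # "L" (latency) or "C" (consistency)

    def __post_init__(self):
        if self.on_partition not in ("A", "C"):
            raise ValueError("on_partition must be 'A' or 'C'")
        if self.otherwise not in ("L", "C"):
            raise ValueError("otherwise must be 'L' or 'C'")

    def label(self) -> str:
        # IF partition -> P + choice, ELSE -> E + choice
        return f"P{self.on_partition}/E{self.otherwise}"
```

For instance, `Pacelc("A", "L").label()` gives `"PA/EL"`: available under partition, low latency otherwise.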

PACELC(A) described systems

Let’s take a look at ACID

ACID is a set of properties of database transactions.

Jim Gray defined these properties of a reliable transaction system in the late 1970s and developed technologies to achieve them automatically.

Database vendors introduced two-phase commit long ago to provide ACID across multiple database instances.
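The idea of two-phase commit can be sketched in a few lines: the coordinator first asks every participant to prepare, and commits only if all of them vote yes. This is a simplified in-memory illustration with hypothetical class and function names; it ignores timeouts, crashes and recovery logs, which are what make real implementations hard.

```python
class Participant:
    """One database instance taking part in a distributed transaction."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.state = "init"

    def prepare(self):
        # Phase 1: vote yes only if the local work can be made durable.
        self.state = "prepared" if self.can_commit else "aborted"
        return self.can_commit

    def commit(self):
        self.state = "committed"

    def abort(self):
        self.state = "aborted"


def two_phase_commit(participants):
    """Return True if the transaction committed everywhere, False if it aborted."""
    # Phase 1: collect a vote from every participant.
    votes = [p.prepare() for p in participants]
    # Phase 2: commit only on a unanimous yes, otherwise abort everywhere.
    if all(votes):
        for p in participants:
            p.commit()
        return True
    for p in participants:
        p.abort()
    return False
```

A single "no" vote is enough to abort the transaction on every instance, which is how atomicity is kept across databases.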

Let’s take a look at ACID

Atomicity. All of the operations in the transaction will complete, or none will

Consistency. The database will be in a consistent state when the transaction begins and ends

Isolation. The transaction will behave as if it is the only operation being performed upon the database

Durability. Once a transaction has been committed, it will remain so, even in the event of power loss, crashes, or errors.
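Atomicity is the easiest of the four to see in code. The sketch below uses Python's built-in `sqlite3` module with an in-memory database; the `accounts` table and the `make_db` and `transfer` helpers are hypothetical examples.

```python
import sqlite3


def make_db():
    """Create an in-memory database with two example accounts."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
    conn.executemany("INSERT INTO accounts VALUES (?, ?)",
                     [("alice", 100), ("bob", 0)])
    conn.commit()
    return conn


def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst; either both updates apply or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute("UPDATE accounts SET balance = balance - ? WHERE name = ?",
                         (amount, src))
            conn.execute("UPDATE accounts SET balance = balance + ? WHERE name = ?",
                         (amount, dst))
            (balance,) = conn.execute(
                "SELECT balance FROM accounts WHERE name = ?", (src,)).fetchone()
            if balance < 0:
                raise ValueError("insufficient funds")
        return True
    except ValueError:
        return False
```

When the transfer would overdraw the source account, the exception triggers a rollback and both balances stay exactly as they were.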

CAP/ACID definition confusion

Consistency in ACID relates to data integrity, whereas Consistency in CAP refers to Atomic Consistency (Linearizability), which is a consistency model

The term Isolation is not used in CAP, but its definition in ACID is actually the same as what linearizability stands for

Availability is not part of the ACID definitions, but when it appears in articles, it means the same as CAP-availability, except that not all non-failing nodes are required to respond.

Let’s take a look at BASE

Eventually consistent services are often classified as providing BASE semantics, in contrast to traditional ACID guarantees.

One of the earliest definitions of eventual consistency comes from 1988.

BASE essentially embraces the fact that true consistency cannot be achieved in the real world, and as such cannot be modelled in highly scalable distributed systems.

What is BASE

Basic Availability states that there will be a response to any request, but that response could still be a “failure”, or the data may be in an inconsistent or changing state

Soft state: the state of the system could change over time due to “eventual consistency” changes

Eventual consistency states that the system will eventually become consistent; the system will continue to receive input and does not check the consistency of every transaction before it moves on to the next one
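One simple way replicas can "eventually become consistent" is a last-write-wins merge, where each replica periodically exchanges its state with another one and keeps the newest version of every key. The sketch below assumes that strategy; the `merge` function and the `(timestamp, value)` layout are hypothetical, and real systems often use more careful schemes.

```python
def merge(local, remote):
    """Last-write-wins merge of two replica states.

    Each state maps key -> (timestamp, value). On equal timestamps the
    larger value wins, so both replicas pick the same winner (this
    assumes values of the same key are comparable).
    """
    merged = dict(local)
    for key, versioned in remote.items():
        if key not in merged or versioned > merged[key]:
            merged[key] = versioned
    return merged
```

Whichever replica starts the exchange, both end up with the same state, even though they disagreed before the merge.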

Try to digest it too...

Fresh look at Postgres

Postgres allows multiple cluster configurations, so it’s really hard to describe all of them. Let’s just take master-slave replication with the Slony implementation.

The system works according to ACID (there are a couple of problems with two-phase commit, but mostly it’s reliable)

In case of a partition, Slony will try to perform a switchover, and if all goes well, we have a new master with its consistency

When there is no partition, Slony gives up latency and does everything to approach strong consistency. Actually, ACID is the reason for the high latency

The system is considered PC/EC(A)

Fresh look at MongoDB

MongoDB is a NoSQL database.

MongoDB is ACID in a limited sense, at the document level

As a distributed system, it’s all about BASE

When there is no partition, the system guarantees that reads and writes are consistent

If the master node fails or is partitioned from the rest of the system, some data will not be replicated. The system elects a new master to remain available for reads and writes. (The new master and the old master are inconsistent.)

The system is considered PA/EC(A), as most of the nodes remain CAP-available in case of a partition.

Fresh look at the Postgres + Salesforce + Heroku Connect system

Salesforce is an abstraction over an Oracle database (RDBMS), and Postgres is an actual RDBMS. Heroku Connect is a tool to sync those two databases. Let’s see what we have…

Heroku Connect, Postgres and Salesforce provide us ACID, but we can’t operate on related entities

Surprise! The full system looks more like BASE.

It uses the Streaming API to sync the entities.

In case of a partition, both systems will be available, but not consistent

When there is no partition, Heroku Connect tries to sync as much as possible, and as part of a BASE system it doesn’t care about consistency.

With all that said, I’d put our system among the PA/EL(A) systems.

Conclusion

Distributed systems can and should be understood through many metrics and terms, but the starting point is their tradeoffs

It’s really difficult to classify an abstract system. You should decide what kind of system you want, and then look at what you can achieve

The second point is also the reason why good articles on the topic are hard to find

Do not overwhelm yourself with this work; you would have to be a scientist to do it 99% correctly

Links

https://dzone.com/articles/better-explaining-cap-theorem

https://martin.kleppmann.com/2015/05/11/please-stop-calling-databases-cp-or-ap.html

https://habrahabr.ru/post/231703/

http://blog.thislongrun.com/2015/03/the-confusing-cap-and-acid-wording.html

https://neo4j.com/blog/acid-vs-base-consistency-models-explained/

http://databases.about.com/od/databasetraining/a/databasesbegin.htm

https://brooker.co.za/blog/2014/07/16/pacelc.html

https://www.postgresql.org/files/developer/transactions.pdf

https://www.airpair.com/postgresql/posts/sql-vs-nosql-ko-postgres-vs-mongo

http://jennyxiaozhang.com/nosql-hbase-vs-cassandra-vs-mongodb/

http://blog.thislongrun.com/2015/04/the-unclear-cp-vs-ca-case-in-cap.html

https://www.infoq.com/articles/cap-twelve-years-later-how-the-rules-have-changed

http://cs-www.cs.yale.edu/homes/dna/papers/abadi-pacelc.pdf

http://blog.thislongrun.com/2015/03/dead-nodes-dont-bite.html

http://queue.acm.org/detail.cfm?id=2462076

https://en.wikipedia.org/
