Reasoning about data and consistency in systems

Reasoning about data and consistency in systems Daniel Norman CTO, güdTECH unba.se contributor Twitter: @DreamingInCode

Transcript of Reasoning about data and consistency in systems

Page 1: Reasoning about data and consistency in systems

Reasoning about data and consistency in systems

Daniel Norman
CTO, güdTECH
unba.se contributor
Twitter: @DreamingInCode

Page 2: Reasoning about data and consistency in systems

Caveat emptor!

There is no silver bullet.

Page 3: Reasoning about data and consistency in systems

TL;DR

● Systems model the physical world
● Don’t annoy the humans
● Many of our systems are global
● We have to be available 24x7x365
● Mind-bending conceptual models of systems
● A view of a plausibly modern system
● No refunds

Page 4: Reasoning about data and consistency in systems

What are we trying to achieve?

● Create a model of reality.
● Solve problems by performing computations thereon.
● Profit!

Page 5: Reasoning about data and consistency in systems

Humans have certain expectations

Humans don’t like it when weird stuff happens; they get irritated.

● Well, do I have a new message or not?
● Why didn’t that save? Oh, it did save? Arghhh!
● Why is this so slow?
● What kind of lousy product is this?
● This should always be available, time is money!

Page 6: Reasoning about data and consistency in systems

Consistency model:

“A set of all histories of operations allowable under a system” [1]

In other words:

A contract between the programmer (or agent) and the system, which provides a set of invariants to which the system will conform.

[1] https://aphyr.com/posts/313-strong-consistency-models
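To make "a set of allowable histories" concrete, here is a minimal sketch that treats one model, Read Your Writes, as a predicate over a history. The `Op` record, the per-key version numbers, and the function name are all assumptions made for illustration; they are not from the talk or from any particular system.

```python
from typing import Dict, List, NamedTuple, Tuple


class Op(NamedTuple):
    """One operation in a history (hypothetical record for illustration)."""
    client: str
    kind: str      # "write" or "read"
    key: str
    version: int   # assumed: versions start at 1 and grow with each write


def satisfies_read_your_writes(history: List[Op]) -> bool:
    """Return True if no client reads a version older than its own last write."""
    last_written: Dict[Tuple[str, str], int] = {}
    for op in history:
        slot = (op.client, op.key)
        if op.kind == "write":
            last_written[slot] = max(last_written.get(slot, 0), op.version)
        elif op.kind == "read" and op.version < last_written.get(slot, 0):
            # The client observed state from before its own write.
            return False
    return True


allowed = [Op("a", "write", "x", 1), Op("a", "read", "x", 1), Op("b", "read", "x", 0)]
rejected = [Op("a", "write", "x", 2), Op("a", "read", "x", 1)]
print(satisfies_read_your_writes(allowed), satisfies_read_your_writes(rejected))
```

The model is the set of histories for which the predicate holds; client "b" reading a stale version is still allowed, because this contract only constrains what a client sees of its own writes.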

Page 7: Reasoning about data and consistency in systems

Some common consistency models

● Linearizable
● Serializable
● Sequential
● Causal
● Eventual
● PRAM
● Read Your Writes
● Repeatable Read
● Monotonic Write
● Monotonic Read
● Write Follows Reads
...

Page 8: Reasoning about data and consistency in systems

Linearizable Consistency

Page 9: Reasoning about data and consistency in systems

Eventual Consistency

Page 10: Reasoning about data and consistency in systems

Causal Consistency

Page 11: Reasoning about data and consistency in systems

Reality is Causal
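One common way to track causality in software is a vector clock. The sketch below is an illustration only, not the mechanism described in the talk; the function names and the node names "alice" and "bob" are hypothetical.

```python
from typing import Dict

VectorClock = Dict[str, int]


def tick(clock: VectorClock, node: str) -> VectorClock:
    """Return a copy of `clock` with `node`'s counter advanced by one."""
    new = dict(clock)
    new[node] = new.get(node, 0) + 1
    return new


def merge(a: VectorClock, b: VectorClock) -> VectorClock:
    """Element-wise maximum: everything either side has seen."""
    return {n: max(a.get(n, 0), b.get(n, 0)) for n in a.keys() | b.keys()}


def happened_before(a: VectorClock, b: VectorClock) -> bool:
    """True if `a` causally precedes `b`."""
    nodes = a.keys() | b.keys()
    return (all(a.get(n, 0) <= b.get(n, 0) for n in nodes)
            and any(a.get(n, 0) < b.get(n, 0) for n in nodes))


def concurrent(a: VectorClock, b: VectorClock) -> bool:
    """True if each clock is ahead of the other somewhere: no causal order."""
    nodes = a.keys() | b.keys()
    return (any(a.get(n, 0) > b.get(n, 0) for n in nodes)
            and any(b.get(n, 0) > a.get(n, 0) for n in nodes))


alice_1 = tick({}, "alice")                    # alice acts on her own
bob_1 = tick({}, "bob")                        # bob acts on his own
bob_2 = tick(merge(bob_1, alice_1), "bob")     # bob sees alice's event, then acts
print(concurrent(alice_1, bob_1), happened_before(alice_1, bob_2))
```

Two events are ordered only when one could have influenced the other; everything else is concurrent, and the system does not have to pretend otherwise.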

Page 12: Reasoning about data and consistency in systems

Hey, why not use wallclock?

I can use my system clock (AKA wallclock) to order my operations, right?

No.

Definitely NOT.

● NTP is notoriously unreliable
● Bad news: simultaneity isn’t actually a thing
● Time is actually weird and lumpy (ask a physicist)
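A frequently used alternative to wallclock ordering is a logical clock. The following is a minimal sketch of a Lamport clock, assuming messages carry the sender's counter; the class and method names are hypothetical and the talk does not prescribe this particular technique.

```python
class LamportClock:
    """Logical clock: orders events by causality, not by wall time."""

    def __init__(self) -> None:
        self.time = 0

    def local_event(self) -> int:
        """Advance the clock for an event that happens on this node."""
        self.time += 1
        return self.time

    def send(self) -> int:
        """Sending is an event; the returned timestamp travels with the message."""
        return self.local_event()

    def receive(self, message_time: int) -> int:
        """Jump past the sender's timestamp so the receipt orders after the send."""
        self.time = max(self.time, message_time) + 1
        return self.time


a, b = LamportClock(), LamportClock()
for _ in range(3):
    b.local_event()           # b has been busy; its counter is ahead of a's
sent_at = a.send()
received_at = b.receive(sent_at)
print(sent_at, received_at)
```

No matter how far apart the two counters drift, a receive is always stamped later than the send that caused it, which is exactly the guarantee a system clock cannot give.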

Page 13: Reasoning about data and consistency in systems

Advantages and Disadvantages

● Linearizable: strongly consistent, single POV, may entail patience.

● Serializable: almost a single POV, allows modest concurrency, patience still required.

● Sequential: concurrent writers go nuts, ordering is arbitrary though, patience required for reads.

● Eventual: concurrent writers and no patience required! It’ll get applied, no promises when.

● Causal: no patience required! No waiting for readers or writers, but no single POV either.
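As a rough illustration of "it’ll get applied, no promises when", the sketch below uses a grow-only counter whose replicas accept writes without coordinating and converge once they exchange state. This is a swapped-in example (a simple state-based counter), not something the talk presents; the class name `GCounter` and the replica ids are assumptions.

```python
from typing import Dict


class GCounter:
    """Grow-only counter: each replica increments only its own slot."""

    def __init__(self, replica_id: str) -> None:
        self.replica_id = replica_id
        self.counts: Dict[str, int] = {}

    def increment(self, amount: int = 1) -> None:
        """Accept a write locally, with no waiting on other replicas."""
        self.counts[self.replica_id] = self.counts.get(self.replica_id, 0) + amount

    def value(self) -> int:
        """Total as seen by this replica right now (may be stale)."""
        return sum(self.counts.values())

    def merge(self, other: "GCounter") -> None:
        """Take the per-replica maximum; safe to repeat and to apply in any order."""
        for replica, count in other.counts.items():
            self.counts[replica] = max(self.counts.get(replica, 0), count)


a, b = GCounter("a"), GCounter("b")
a.increment(2)
b.increment(3)
print(a.value(), b.value())   # replicas disagree until they exchange state
a.merge(b)
b.merge(a)
print(a.value(), b.value())   # after merging, both report the same total
```

Between the writes and the merge, a reader can get a different answer depending on which replica it asks. That window is the price of requiring no patience.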

Page 14: Reasoning about data and consistency in systems

What consistency models do we really use?

“Linearizability / Serializability, obviously. End of presentation.”

Page 15: Reasoning about data and consistency in systems

But actually...

Page 16: Reasoning about data and consistency in systems

But wait! AWS MAGIC CASTLE TECHNOLOGY TO THE RESCUE!

Not so fast – AWS is pretty good, but we must still reason about their consistency models:

● S3: Read after Write
● DynamoDB: Eventual
● SQS: Sequential or Linearizable
● Aurora / RDS: Serializable

Page 17: Reasoning about data and consistency in systems

Why do we like TCP?

● FIFO makes my life easier
● It works around packet loss
● We’re accustomed to its foibles

Page 18: Reasoning about data and consistency in systems

Why do we like Serializable RDBMS?

● Single POV makes my life easier
● A central gatekeeper helps us ignore our other consistency models
● It works well in the small scale

Page 19: Reasoning about data and consistency in systems

Why isn’t eventual consistency your final answer?

It’s nice to avoid coordination, but:

● Incompatible with the user’s worldview
● Requires ad-hoc consistency models as an overlay

Page 20: Reasoning about data and consistency in systems

A few scenarios:

● LieFi: TCP linearizability gone wrong.

● Errant Promotion: asymmetries are problematic.

● Race conditions: when two systems race head to head, you lose.
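The race-condition scenario can be sketched as a lost update: two clients read the same value, both write back, and one write silently vanishes. The example below simulates the interleaving in a single thread so it is deterministic. `VersionedStore` and its methods are hypothetical names, and the version check shown is one possible mitigation, not a recommendation made in the talk.

```python
from typing import Tuple


class VersionedStore:
    """A single value with a version number bumped on every write."""

    def __init__(self, value: int = 0) -> None:
        self.value = value
        self.version = 0

    def read(self) -> Tuple[int, int]:
        return self.value, self.version

    def write_blind(self, value: int) -> None:
        """Overwrite without checking what anyone else did in the meantime."""
        self.value = value
        self.version += 1

    def write_if_version(self, value: int, expected_version: int) -> bool:
        """Write only if nobody else has written since we read."""
        if self.version != expected_version:
            return False
        self.value = value
        self.version += 1
        return True


# Two clients increment the same counter from the same stale read.
racy = VersionedStore(10)
a_value, _ = racy.read()
b_value, _ = racy.read()
racy.write_blind(a_value + 1)
racy.write_blind(b_value + 1)
print(racy.value)             # one of the two increments is lost

# Same interleaving, but the second writer is told it lost and retries.
safe = VersionedStore(10)
a_value, a_version = safe.read()
b_value, b_version = safe.read()
safe.write_if_version(a_value + 1, a_version)
if not safe.write_if_version(b_value + 1, b_version):
    b_value, b_version = safe.read()
    safe.write_if_version(b_value + 1, b_version)
print(safe.value)             # both increments are applied
```

The check does not remove the race; it turns a silent loss into a visible conflict that the loser has to handle.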

Page 21: Reasoning about data and consistency in systems

Your system is distributed

Concurrency is either something you’re dealing with, or something you’re putting off. No exceptions.

Our limited comprehension of this complexity eventually leads to:

● Mysterious System Behaviors
● Inefficient Business Processes
● Lapses in Service
● Sadness

Page 22: Reasoning about data and consistency in systems

Parting words:

Causality is great when you can use it.

Wallclock baad!

Eventual Consistency is seductive, but problematic.

Consistency models are everywhere.

Be mindful of Linearizability / Serializability limitations.

Page 23: Reasoning about data and consistency in systems

Thank you!

Daniel Norman
CTO, güdTECH
unba.se contributor
Twitter: @DreamingInCode