VoltDB and the Jepsen Test

VoltDB and the Jepsen test: What we learned about data accuracy and consistency John Hugg September 29th, 2016 @johnhugg / [email protected]

chat.voltdb.com@johnhugg

This Talk

• Intro to VoltDB (will make this quick 😀)
  Not going to explain how VoltDB works, or how to build an app

• What’s the value of consistency?

• What are the tradeoffs VoltDB made/makes to be consistent?

• Jepsen Testing Background & Results
  Not going to give a Jepsen talk the way Kyle Kingsbury does, but I will link to one

• A bit more on consistency and wrap up

What is VoltDB?

What is VoltDB?

• Scale-out, clustered, SQL relational database

• Blazing, in-memory architecture with disk persistence
  Millions of ACID multi-statement transactions per second

• Strong serializable transactions by default, even at high scale

• An excellent processing engine with rich import/export functionality

• Simple operational model: All nodes the same. Send ops to any node. Unpack tarball to install.
  Public clouds / Private clouds / Bare metal / VMs / Containers

Use Cases

• Anything where decisions based on logic are made in real time for incoming events:
  • Policy Enforcement
  • Fraud Detection
  • Real-time personalization for Ad-Tech, Gaming, Loyalty Programs, Retail
  • Payments (Micro or otherwise)

• Anything where math / calculations are done:
  • Billing and reporting on live data
  • State tracking

Example: Telco

Mobile phone is dialed.

Request sent to VoltDB to decide if it should be let through.

Single transaction looks at state and decides if this call:
• is fraudulent?
• is permitted under plan?
• has prepaid balance to cover?

State:
• Blacklists
• Fraud Rules
• Billing Info
• Recent Activity for both Numbers

Export to OLAP

99.999% of txns respond in 50ms

Example: Micro Personalization

User clicks link on a website. This generates a request to VoltDB.

VoltDB transaction scans a table of rules and checks which apply to this event.

Eventually the transaction decides what to show the user next.

That decision is exported to HDFS.

Spark ML is used to look at historical data in HDFS and generate new rules.

These rules are loaded into VoltDB every few hours.

User sees personalized content.

State

Processing

Complex, Transactional Business Logic

Scale-Out Performance

Streaming Events

SQL Relational State

More?
VoltDB.com
Visit our booth next to O’Reilly

Me
Email Me: [email protected]
Slack

What Do We Mean by Consistency?

ACID vs CAP: Fight

• ACID refers to transactional consistency.

• How do multi-statement, multi-value operations read and modify data?

• CAP refers to agreement on data values between multiple replicas.

• Consistency, Availability, Partition-Tolerance

• Do all copies of this data have the same values?

ACID: 1 Transaction = 1 Event

• Atomic: Either 100% done or 0% done. No in-between.

• (Consistent)

• Isolated: Two concurrent operations can’t interfere with each other

• Durable: If it says it’s done, then it is done.

[Diagram: Processing Code for a Single Event, Database / State]

[Diagram: Processing Code for a Single Event, Database / State, with x marks: Not Atomic]

Romeo And Juliet Explain “Atomicity”

Operation 1: Fake your death

Operation 2: Tell Romeo

[Diagram: two instances of Processing Code for a Single Event, one Database / State: Not Isolated]

CAP Tradeoffs

In the face of unreliable networks (partitions): There are some cases where a system has to choose between inconsistent data processing and not responding at all.

• CP Systems: If the system responds, the answer is the same at all replicas.

• AP Systems: The system can respond even if it isn’t sure.

AP doesn’t imply 100% uptime or even more uptime necessarily. AP often offers knobs to pick between safety and latency / availability.

• It’s possible (and common) to be neither CP nor AP. CA is not a thing.

Links for Those at Home

• Disambiguating ACID and CAP (blog post) https://www.voltdb.com/blog/disambiguating-acid-and-cap

• “All In With Determinism for Performance and Testing in Distributed Systems” (talk) https://www.youtube.com/watch?v=gJRj3vJL4wE

VoltDB offers the Strongest Consistency Guarantees of Any System Anywhere

• Serializable ACID
  Linearizable Operations
  CP in CAP

• A conscious choice from day one at VoltDB to turn the consistency dial to eleven.

• Verified by Kyle Kingsbury at jepsen.io*

*More accurately: Jepsen failed to show it was inconsistent in version 6.4.

What’s the Value of Consistency?

“Right Answers”?

• The simplest argument for consistency is that you get better answers, but that’s maybe too simple…

• There are lots of ways to take a less consistent system and get better answers.

• But most of them are more work for you, the developer. Many of them are not super efficient either.

Fewer Things Can Go Wrong

• A transaction can never partially fail => fewer bad states to worry about

• Secondary indexes and materializations are always in perfect sync

• Fewer awkward workarounds (like secondary indexes for example)

Exactly Once

• Everyone wants things to be processed and recorded exactly once.

• But distributed systems don’t care about what we want.

• In this world, bad things happen to good people.

• But there is some hope if you have strong consistency.

ACID

CP

Idempotence

Idempotence is the property of certain operations in mathematics and computer science that can be applied multiple times without changing the result beyond the initial application.

Idempotent:
• set x = 5; is the same as set x = 5; set x = 5;
• if (x % 2 == 0) x++; is the same as if (x % 2 == 0) x++; if (x % 2 == 0) x++;
• spill coffee on brown pants

Not Idempotent:
• x++; is not the same as x++; x++;
• if (x % 2 == 0) x *= 2; is not the same as if (x % 2 == 0) x *= 2; if (x % 2 == 0) x *= 2;
• eat whole plate of spaghetti
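The four code examples on this slide can be checked mechanically. The following is a minimal Python sketch, not anything from VoltDB: the function names and the `is_idempotent` helper are made up for illustration, and testing over a sample of integers gives evidence rather than a proof.

```python
# Each function models one operation from the slide as a pure function of x.
def set_to_five(x):
    return 5

def increment(x):
    return x + 1

def increment_if_even(x):
    return x + 1 if x % 2 == 0 else x

def double_if_even(x):
    return x * 2 if x % 2 == 0 else x

def is_idempotent(op, samples):
    """True if applying op twice equals applying it once for every sample."""
    return all(op(op(x)) == op(x) for x in samples)

samples = range(-10, 11)
operations = (set_to_five, increment, increment_if_even, double_if_even)
results = {op.__name__: is_idempotent(op, samples) for op in operations}

for name, ok in results.items():
    print(f"{name}: {'idempotent' if ok else 'not idempotent'}")
```

Note that `double_if_even` fails only because doubling an even number leaves it even, so the second application fires again.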

At-Least-Once Delivery + Idempotent Operations = Exactly Once Semantics

Operation | How to Make it Idempotent
Insert | Make it an upsert (PK required)
Many Inserts | Transactional upserts
Complex Conditional Logic, possibly with many writes, some to non-unique tables | If it adds a unique row somewhere, check if that row exists first. Keep a separate table with a log of work; always read the log first.
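The "log of work" approach in the last row can be sketched as follows. This is a minimal illustration, assuming an in-memory SQLite database stands in for the transactional store; the `work_log` and `balances` tables, the event IDs, and the `apply_charge` helper are hypothetical and are not VoltDB APIs.

```python
import sqlite3

def apply_charge(conn, event_id, account, amount):
    """Apply a charge once per event_id, even if the event is delivered again.

    Returns True if the charge was applied, False if it was a duplicate.
    """
    with conn:  # commits on success, rolls back on error
        seen = conn.execute(
            "SELECT 1 FROM work_log WHERE event_id = ?", (event_id,)
        ).fetchone()
        if seen:
            return False  # already processed: redelivery is a no-op
        conn.execute("INSERT INTO work_log (event_id) VALUES (?)", (event_id,))
        conn.execute(
            "UPDATE balances SET amount = amount - ? WHERE account = ?",
            (amount, account),
        )
        return True

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE work_log (event_id TEXT PRIMARY KEY)")
conn.execute("CREATE TABLE balances (account TEXT PRIMARY KEY, amount INTEGER)")
conn.execute("INSERT INTO balances VALUES ('alice', 100)")
conn.commit()

# At-least-once delivery: the same event arrives twice.
first = apply_charge(conn, "evt-1", "alice", 10)
second = apply_charge(conn, "evt-1", "alice", 10)
balance = conn.execute(
    "SELECT amount FROM balances WHERE account = 'alice'"
).fetchone()[0]
print(first, second, balance)
```

The log entry and the balance update are committed together. If they could commit separately, a crash between them would either lose the charge or allow it to be applied twice.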

Isolation

• Many “ACID” systems don’t offer (or don’t default to using) strong isolation.

• Weak isolation, like “read committed” makes idempotency more challenging.

http://www.bailis.org/blog/when-is-acid-acid-rarely/

Latency & Consistency

Low Latency Can Affect the Decision

500ms

Want to be here You lose money here

Many options for building consistency on top of eventually-consistent systems introduce extra latency, or at least much more variability in latency.

Get Into the “Fast Path”

• Policy Enforcement in Telco

• Instant Fraud Detection

• Change what a user sees in response to action:

• Change the next webpage content based on recent website actions.

• Pick what’s behind the magic door based on how the game is going.

What are the tradeoffs?

Timeouts

CAP Theorem here…

• We can’t confirm the transaction succeeded to the client until all nodes confirm.

• If a node doesn’t confirm, we wait up to a specified timeout. Then the fault resolution algorithm kicks in and ejects the node (with consensus among surviving nodes).

• This means killing a node can block transactions up to the timeout value, typically seconds, not milliseconds.

Two-Node Clusters Not Ideal

Is the other machine down?

Is the network partitioned?

3-Node clusters are better at reaching consensus.
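The difference between two and three nodes can be shown with a generic strict-majority rule. This is a minimal sketch for illustration only: `has_majority` is a hypothetical helper, and the rule is a textbook quorum check, not a description of VoltDB's actual fault resolution algorithm.

```python
def has_majority(reachable_nodes: int, cluster_size: int) -> bool:
    """True if this side of a partition can see a strict majority of the cluster.

    reachable_nodes counts the local node plus every peer it can still reach.
    """
    return 2 * reachable_nodes > cluster_size

# Two-node cluster, link lost: each node sees only itself.
# Neither side has a majority, and neither can tell a dead peer from a partition.
print(has_majority(1, 2))  # False

# Three-node cluster split 2 / 1: exactly one side can safely continue.
print(has_majority(2, 3))  # True
print(has_majority(1, 3))  # False
```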

Slower Cross Partition Write Ops

• If you need to verify all replicas of all involved partitions have the correct data, sometimes you are going to need to block until you get confirmations.

• There are tricks to make this better, but it will never be perfect.

High Bar for Testing / Slower Development

• Any feature we add must be vetted against our very strong consistency guarantees to users.

• Since it’s nearly impossible to prove a system like VoltDB is correct*, we are stuck trying to exhaustively find a counterexample.

• The amount of automated evil, self-verifying, randomized workloads we run nightly is getting pretty crazy. Jepsen is just a part of that.

*Tools like TLA+ are useful, but can’t verify all features and implementations practically, only subsets

Jepsen!

What is Jepsen?

John-Speak:

Kyle Kingsbury built a tool he called Jepsen.

He uses this tool, usually customized, to break databases.

We hired him to break VoltDB.

jepsen.io

Key Jepsen Testing Thing

• We paid Kingsbury to try to break VoltDB.

• We gave him complete editorial control over the subsequent post about his findings.

• If he found issues, he was going to write about them.

• This is atypical and speaks to Kingsbury’s integrity and value as a third party validator.

How Does it Work?

Step 1: Run a Workload and LOG EVERYTHING

Step 2: Inject lots of network failures

Step 3: Run a superpowered solver on the logs, checking for any states that contradict DB promises

Hand-drawn images were made by Kingsbury

Example Problem

Time Op Result

T0 Write(5) Success

T1 Read 5

T2 Write(6) Success

T3 Read 5
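In the history above, the read at T3 returns 5 even though the write of 6 at T2 already succeeded. A toy checker for this single case might look like the sketch below. It assumes a single register, operations that never overlap in time, and writes that all succeed; `find_stale_reads` is a hypothetical helper and is far simpler than the checking Jepsen performs on concurrent histories.

```python
def find_stale_reads(history):
    """Return (time, value_read, value_expected) for each read that does not
    match the most recent write in a strictly sequential history."""
    last_written = None
    violations = []
    for time, op, value in history:
        if op == "write":
            last_written = value
        elif op == "read" and value != last_written:
            violations.append((time, value, last_written))
    return violations

# The history from the slide, as (time, operation, value) tuples.
history = [
    ("T0", "write", 5),
    ("T1", "read", 5),
    ("T2", "write", 6),
    ("T3", "read", 5),
]

violations = find_stale_reads(history)
print(violations)
```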

Fun Reading

• http://jepsen.io/analyses.html

• Most systems fail.

• How the projects respond can be interesting.

Why Jepsen for VoltDB?

• We are always hungry for tests!

• Could build a VoltDB-Jepsen harness ourselves, but… it wouldn’t be as good and wouldn’t have Kingsbury’s credibility.

• Customers have asked about it.

• Kingsbury has a built-in audience (marketing)

VoltDB’s Thoughts

Serious question:

What’s the worst that could happen?

VoltDB Thoughts

• Our policy: Consistency or data loss bugs are blocking bugs to be prioritized above all else.

• So if Jepsen finds bugs, we need to fix them ASAP.

• The risk is that Jepsen finds bugs that we have to fix, which might impact our schedule.

• But that’s dumb. If our product has bugs, not knowing about them doesn’t make them not there.

Webinar Talk from Kyle Kingsbury: https://www.voltdb.com/wrjepsen

What was found?

Issue | Reproducible | Fixed
Under network partitions, VoltDB allows stale and/or dirty reads in read-only transactions. | Any redundant VoltDB cluster | 6.4
Under network partitions, VoltDB can lose confirmed writes. | Only when redundancy level > node count / 2 | 6.4
After a network partition, a total cluster failure, and a recovery, VoltDB can lose confirmed writes. | Only when redundancy level > node count / 2 | 6.4

What was found?

Issue | Reproducible | Fixed
Under network partitions, VoltDB allows stale and/or dirty reads in read-only transactions. | Any redundant VoltDB cluster | 6.4
Under network partitions, VoltDB can lose confirmed writes. | Only when redundancy level > node count / 2 (one production deployment vulnerable) | 6.4
After a network partition, a total cluster failure, and a recovery, VoltDB can lose confirmed writes. | Only when redundancy level > node count / 2 (one production deployment vulnerable) | 6.4

VoltDB Takeaway

Engineering Team:

• Good move for our never-ending quest to build better software.

Marketing & Perception:

• Passing Jepsen is good. People talking about VoltDB is good. Showing we care about this stuff is good.

• Having bugs is bad, but discussing and fixing issues openly and seriously can be positive.

Reproducible!

• 100% reproducible test:
  • Set up Jepsen from GitHub
  • Clone Jepsen VoltDB driver from GitHub
  • Run!

• Can’t do this with systems you don’t control.

https://github.com/jepsen-io/voltdb

https://voltdb.com/jepsen (bottom of page)

Jepsen is just one test

• Jepsen is a Key-Value test, albeit one that was extended to multiple-keys-per-transaction as part of the VoltDB work

• Doesn’t test configurations other than 5 node clusters with 5X redundancy

• Doesn’t test SQL, which can be much more complex, with many unpredictable writes per test

• Doesn’t test materialized views, Kafka importers, ElasticSearch exporters, cross-datacenter, windowing functions, complex stored procedures, etc…

Links for Those at Home

• VoltDB 6.4 Passes Jepsen Testing https://www.voltdb.com/jepsen

• How We Test at VoltDB (blog post pre-Jepsen) https://www.voltdb.com/blog/how-we-test-voltdb

• Testing VoltDB Against PostgreSQL https://www.voltdb.com/blog/testing-voltdb-against-postgresql

• Testing at VoltDB: SQLCoverage https://www.voltdb.com/blog/testing-voltdb-sqlcoverage

• “All In With Determinism for Performance and Testing in Distributed Systems” (talk) https://www.youtube.com/watch?v=gJRj3vJL4wE

Consistency Value Revisited

Why can’t I have nice things?

90ms

170ms

#speedoflightfail

• VoltDB-style CAP+ACID consistency across the globe would mean latencies of 100ms or more.

• For some apps, this is ok, but for many it’s very challenging.

• VoltDB offers Eventual-Consistency-style tools for dealing with multiple datacenter deployments.*

“speed of light” not “speedo flight”

*So everything you’ve said is a lie once I need two data centers?

Example: Telco (Revisited)

Mobile phone is dialed.

Request sent to VoltDB to decide if it should be let through.

Single transaction looks at state and decides if this call:
• is fraudulent?
• is permitted under plan?
• has prepaid balance to cover?

State:
• Blacklists
• Fraud Rules
• Billing Info
• Recent Activity for both Numbers

Export to OLAP

99.999% of txns respond in 50ms

Islands of Consistency

• New York VoltDB: Strong-Serializable ACID + CP

• London VoltDB: Strong-Serializable ACID + CP

• Between New York and London: 100ms Async Replication

• Boston User: 20ms Latency, NYC Home

• Glasgow User: 20ms Latency, London Home

Islands of Consistency

• New York VoltDB: Strong-Serializable ACID + CP

• London VoltDB: Strong-Serializable ACID + CP

• Between New York and London: 100ms Async Replication

• Boston User: 20ms Latency, NYC Home

• Client Migrates Home: Takes > 100ms

• Conflicts Extremely Rare

Local Consistency > None

• Still get full functionality locally.

• Sends only committed transactions and applies them atomically on the peer clusters.

• Putting some smarts in the client makes conflicts extremely rare.

• This requires more planning and engineering work than a single datacenter solution.

Spoiler: Any complex distributed application is going to require lots of planning and engineering work

No VoltDB?

Dynamo-Based Eventual Consistency

NYC London

(1)

Dynamo-Based Eventual Consistency

Dynamo-Based Eventual Consistency

Dynamo-Based Eventual Consistency (2)

Consistent System generating packaged events

Consistent System consuming packaged events

Kafka or Similar (3)

I want to learn more!

chat.voltdb.com
forum.voltdb.com
askanengineer@voltdb.com
@johnhugg @voltdb
voltdb.com/jepsen

Takeaways
• Consistency Good!
• Jepsen Good!
• VoltDB Interesting!

THANK YOU!

Our Strata Booth (across from O’Reilly)

all images from wikimedia w/ cc license unless otherwise noted