Atmosphere Conference 2015: Need for Async: In pursuit of scalable internet-scale applications

165
Konrad 'ktoso' Malawski GeeCON 2014 @ Kraków, PL Konrad `@ktosopl` Malawski need for async hot pursuit for Scalable apps

Transcript of Atmosphere Conference 2015: Need for Async: In pursuit of scalable internet-scale applications

Page 1: Atmosphere Conference 2015: Need for Async: In pursuit of scalable internet-scale applications

Konrad 'ktoso' MalawskiGeeCON 2014 @ Kraków, PL

Konrad `@ktosopl` Malawski

need for asynchot pursuit

for Scalable apps

Page 2: Atmosphere Conference 2015: Need for Async: In pursuit of scalable internet-scale applications

Konrad `ktoso` Malawski

Akka Team, Reactive Streams TCK

Page 3: Atmosphere Conference 2015: Need for Async: In pursuit of scalable internet-scale applications

Konrad `@ktosopl` Malawski

akka.iotypesafe.comgeecon.org

Java.pl / KrakowScala.plsckrk.com / meetup.com/Paper-Cup @ London

GDGKrakow.pl lambdakrk.pl

Page 4: Atmosphere Conference 2015: Need for Async: In pursuit of scalable internet-scale applications

Nice to meet you! Who are you guys?

Page 5: Atmosphere Conference 2015: Need for Async: In pursuit of scalable internet-scale applications

Why scalability matters:

Page 6: Atmosphere Conference 2015: Need for Async: In pursuit of scalable internet-scale applications

High Performance Software Development

For the majority of the time, high performance software development

is not about compiler hacks and bit twiddling.

It is about fundamental design principles that are key to doing any effective software development.

Martin Thompson

practicalperformanceanalyst.com/2015/02/17/getting-to-know-martin-thompson-...

Page 7: Atmosphere Conference 2015: Need for Async: In pursuit of scalable internet-scale applications

Agenda• Why?

• Async and Synch basics / definitions

• Back-pressure applied

• Async where it matters: Scheduling

• How NOT to measure Latency

• Concurrent < lock-free < wait-free

• I/O: IO, AIO, NIO, Zero

• C10K: select, poll, epoll / kqueue

• Distributed Systems: Where Async is at Home

• Wrapping up and Q/A

Page 8: Atmosphere Conference 2015: Need for Async: In pursuit of scalable internet-scale applications

Sync / Async Basics

Page 9: Atmosphere Conference 2015: Need for Async: In pursuit of scalable internet-scale applications

Sync / Async

Page 10: Atmosphere Conference 2015: Need for Async: In pursuit of scalable internet-scale applications

Sync / Async

Page 11: Atmosphere Conference 2015: Need for Async: In pursuit of scalable internet-scale applications

Sync / Async

Page 12: Atmosphere Conference 2015: Need for Async: In pursuit of scalable internet-scale applications

Sync / Async

Page 13: Atmosphere Conference 2015: Need for Async: In pursuit of scalable internet-scale applications

Sync / Async

Page 14: Atmosphere Conference 2015: Need for Async: In pursuit of scalable internet-scale applications

Sync / Async

Page 15: Atmosphere Conference 2015: Need for Async: In pursuit of scalable internet-scale applications

Sync / Async

Page 16: Atmosphere Conference 2015: Need for Async: In pursuit of scalable internet-scale applications

Sync / Async

Page 17: Atmosphere Conference 2015: Need for Async: In pursuit of scalable internet-scale applications

Sync / Async

Page 18: Atmosphere Conference 2015: Need for Async: In pursuit of scalable internet-scale applications

Asynchronous systems rely on buffering.

Page 19: Atmosphere Conference 2015: Need for Async: In pursuit of scalable internet-scale applications

Asynchronous systems rely on buffering.

What if they overflow? How do you back-pressure?

Page 20: Atmosphere Conference 2015: Need for Async: In pursuit of scalable internet-scale applications

Back-pressure? Example Without

Fast Publisher Slow Subscriber

Page 21: Atmosphere Conference 2015: Need for Async: In pursuit of scalable internet-scale applications

Back-pressure? (example without)

Subscriber usually has some kind of buffer.

Page 22: Atmosphere Conference 2015: Need for Async: In pursuit of scalable internet-scale applications

Back-pressure? (example without)

Page 23: Atmosphere Conference 2015: Need for Async: In pursuit of scalable internet-scale applications

Back-pressure? (example without)

Page 24: Atmosphere Conference 2015: Need for Async: In pursuit of scalable internet-scale applications

Back-pressure? (example without)

What if the buffer overflows?

Page 25: Atmosphere Conference 2015: Need for Async: In pursuit of scalable internet-scale applications

Back-pressure? (example without)

Use bounded buffer, drop messages + require re-sending

Page 26: Atmosphere Conference 2015: Need for Async: In pursuit of scalable internet-scale applications

Back-pressure? (example without)

Kernel does this!Routers do this!

(TCP)

Use bounded buffer, drop messages + require re-sending

Page 27: Atmosphere Conference 2015: Need for Async: In pursuit of scalable internet-scale applications

Back-pressure “not needed” when:

speed(publisher) < speed(subscriber)

Page 28: Atmosphere Conference 2015: Need for Async: In pursuit of scalable internet-scale applications

Back-pressure? Fast Subscriber, No Problem

No problem!

Page 29: Atmosphere Conference 2015: Need for Async: In pursuit of scalable internet-scale applications

Back-pressure?NACKing is NOT enough.

Page 30: Atmosphere Conference 2015: Need for Async: In pursuit of scalable internet-scale applications

Back-pressure? (example without)

Page 31: Atmosphere Conference 2015: Need for Async: In pursuit of scalable internet-scale applications

Back-pressure? Example NACKingTelling the Publisher to slow down / stop sending…

Page 32: Atmosphere Conference 2015: Need for Async: In pursuit of scalable internet-scale applications

Back-pressure? Example NACKingNACK did not make it in time,

because M was in-flight!

Page 33: Atmosphere Conference 2015: Need for Async: In pursuit of scalable internet-scale applications

Back-pressure?Reactive-Streams

= “Dynamic Push/Pull”

Page 34: Atmosphere Conference 2015: Need for Async: In pursuit of scalable internet-scale applications

Just push – not safe when Slow Subscriber

Just pull – too slow when Fast Subscriber

Back-pressure? RS: Dynamic Push/Pull

Page 35: Atmosphere Conference 2015: Need for Async: In pursuit of scalable internet-scale applications

Solution:Dynamic adjustment

Back-pressure? RS: Dynamic Push/Pull

Just push – not safe when Slow Subscriber

Just pull – too slow when Fast Subscriber

Page 36: Atmosphere Conference 2015: Need for Async: In pursuit of scalable internet-scale applications

Back-pressure? RS: Dynamic Push/PullSlow Subscriber sees it’s buffer can take 3 elements. Publisher will never blow up it’s buffer.

Page 37: Atmosphere Conference 2015: Need for Async: In pursuit of scalable internet-scale applications

Back-pressure? RS: Dynamic Push/PullFast Publisher will send at-most 3 elements. This is pull-based-backpressure.

Page 38: Atmosphere Conference 2015: Need for Async: In pursuit of scalable internet-scale applications

Back-pressure? RS: Dynamic Push/Pull

Fast Subscriber can issue more Request(n), before more data arrives!

Page 39: Atmosphere Conference 2015: Need for Async: In pursuit of scalable internet-scale applications

Back-pressure? RS: Dynamic Push/PullFast Subscriber can issue more Request(n), before more data arrives.

Publisher can accumulate demand.

Page 40: Atmosphere Conference 2015: Need for Async: In pursuit of scalable internet-scale applications

Back-pressure? RS: Accumulate demandPublisher accumulates total demand per subscriber.

Page 41: Atmosphere Conference 2015: Need for Async: In pursuit of scalable internet-scale applications

Back-pressure? RS: Accumulate demandTotal demand of elements is safe to publish. Subscriber’s buffer will not overflow.

Page 42: Atmosphere Conference 2015: Need for Async: In pursuit of scalable internet-scale applications

Back-pressure? Reactive StreamsMini in-line pitch

Specification and TCKreactive-streams.org

ImplementationsAkka Streams / Akka Http

Slick RxJava

ReactorRatPackVert.x

MongoDB driver

http://www.reactive-streams.org/announce-1.0.0

Page 43: Atmosphere Conference 2015: Need for Async: In pursuit of scalable internet-scale applications

Highly parallel systemsEvent loops

Async where it matters:

Page 44: Atmosphere Conference 2015: Need for Async: In pursuit of scalable internet-scale applications

Async where it matters: Scheduling

Page 45: Atmosphere Conference 2015: Need for Async: In pursuit of scalable internet-scale applications

Async where it matters: Scheduling

Page 46: Atmosphere Conference 2015: Need for Async: In pursuit of scalable internet-scale applications

Async where it matters: Scheduling

Page 47: Atmosphere Conference 2015: Need for Async: In pursuit of scalable internet-scale applications

Async where it matters: Scheduling

Page 48: Atmosphere Conference 2015: Need for Async: In pursuit of scalable internet-scale applications

Async where it matters: Scheduling

Page 49: Atmosphere Conference 2015: Need for Async: In pursuit of scalable internet-scale applications

Async where it matters: Scheduling

Page 50: Atmosphere Conference 2015: Need for Async: In pursuit of scalable internet-scale applications

Async where it matters: Scheduling

Page 51: Atmosphere Conference 2015: Need for Async: In pursuit of scalable internet-scale applications

Async where it matters: Scheduling

Page 52: Atmosphere Conference 2015: Need for Async: In pursuit of scalable internet-scale applications

Async where it matters: Scheduling

Page 53: Atmosphere Conference 2015: Need for Async: In pursuit of scalable internet-scale applications

Scheduling (notice they grey sync call)

Page 54: Atmosphere Conference 2015: Need for Async: In pursuit of scalable internet-scale applications

Scheduling (notice they grey sync call)

Page 55: Atmosphere Conference 2015: Need for Async: In pursuit of scalable internet-scale applications

Scheduling (now with Async db call)

Page 56: Atmosphere Conference 2015: Need for Async: In pursuit of scalable internet-scale applications

Scheduling (now with Async db call)

Page 57: Atmosphere Conference 2015: Need for Async: In pursuit of scalable internet-scale applications

Scheduling (now with Async db call)

Page 58: Atmosphere Conference 2015: Need for Async: In pursuit of scalable internet-scale applications

Scheduling (now with Async db call)

Page 59: Atmosphere Conference 2015: Need for Async: In pursuit of scalable internet-scale applications

Latency

Page 60: Atmosphere Conference 2015: Need for Async: In pursuit of scalable internet-scale applications

Latency

Time interval between

the stimulation and response.

Page 61: Atmosphere Conference 2015: Need for Async: In pursuit of scalable internet-scale applications

Latency Quiz

Gil Tene style, see: “How NOT to Measure Latency”

Is 10s latency acceptable in your app? Is 200ms latency acceptable?

How about most responses within 200ms?So mostly 20ms and some 1 minute latencies is OK?

Do people die when we go above 200ms?

So 90% below 200ms, 99% bellow 1s, 99.99% below 2s?

Page 62: Atmosphere Conference 2015: Need for Async: In pursuit of scalable internet-scale applications

Latency in the “real world”

Gil Tene style, see: “How NOT to Measure Latency”

“Our response time is 200ms average, stddev is around 60ms”

— a typical quote

Page 63: Atmosphere Conference 2015: Need for Async: In pursuit of scalable internet-scale applications

Latency in the “real world”

Gil Tene style, see: “How NOT to Measure Latency”

“Our response time is 200ms average, stddev is around 60ms”

— a typical quote

Latency does NOT behave like normal distribution!

“So yeah, our 99,99%’ is…”

Page 64: Atmosphere Conference 2015: Need for Async: In pursuit of scalable internet-scale applications

http://hdrhistogram.github.io/HdrHistogram/

Hiccups

Page 65: Atmosphere Conference 2015: Need for Async: In pursuit of scalable internet-scale applications

Gil Tene style, see: “How NOT to Measure Latency”

Hiccups

Page 66: Atmosphere Conference 2015: Need for Async: In pursuit of scalable internet-scale applications

Gil Tene style, see: “How NOT to Measure Latency”

Hiccups

Page 67: Atmosphere Conference 2015: Need for Async: In pursuit of scalable internet-scale applications

Gil Tene style, see: “How NOT to Measure Latency”

Hiccups

Page 68: Atmosphere Conference 2015: Need for Async: In pursuit of scalable internet-scale applications

Gil Tene style, see: “How NOT to Measure Latency”

Hiccups

Page 69: Atmosphere Conference 2015: Need for Async: In pursuit of scalable internet-scale applications

Gil Tene style, see: “How NOT to Measure Latency”

Hiccups

Page 70: Atmosphere Conference 2015: Need for Async: In pursuit of scalable internet-scale applications

Concurrent < lock-free < wait-free

Page 71: Atmosphere Conference 2015: Need for Async: In pursuit of scalable internet-scale applications

Concurrent < lock-free < wait-free

Page 72: Atmosphere Conference 2015: Need for Async: In pursuit of scalable internet-scale applications

Concurrent < lock-free < wait-free

Page 73: Atmosphere Conference 2015: Need for Async: In pursuit of scalable internet-scale applications

Concurrent < lock-free < wait-free

Concurrent data structure

Page 74: Atmosphere Conference 2015: Need for Async: In pursuit of scalable internet-scale applications

A locks; B tries lock... A writes;

A unlocks; B acquires the lock... B (finally) writes & unlocks

What can happen in concurrent data structures:

Concurrent < lock-free < wait-free

Page 75: Atmosphere Conference 2015: Need for Async: In pursuit of scalable internet-scale applications

A locks; B tries lock... A writes;

A unlocks; B acquires the lock... B (finally) writes & unlocks

What can happen in concurrent data structures:

Moral? 1) Thread A is not very nice to B. 2) A lot of time is wasted.

Concurrent < lock-free < wait-freeW

asted time

Page 76: Atmosphere Conference 2015: Need for Async: In pursuit of scalable internet-scale applications

Concurrent < lock-free < wait-less

Concurrency is NOT Parallelism.

Rob Pike - Concurrency is NOT Parallelism (video)

def offer(a: A): Boolean // returns on failuredef add(a: A): Unit // throws on failure

def put(a: A): Boolean // blocks until able to enqueue

Page 77: Atmosphere Conference 2015: Need for Async: In pursuit of scalable internet-scale applications

Concurrent < lock-free < wait-free

concurrent data structure

< lock-free* data structure

* lock-free a.k.a. lockless

Page 78: Atmosphere Conference 2015: Need for Async: In pursuit of scalable internet-scale applications

What lock-free programming looks like:

Page 79: Atmosphere Conference 2015: Need for Async: In pursuit of scalable internet-scale applications

An algorithm is lock-free if it satisfies that:When the program threads are run sufficiently long,

at least one of the threads makes progress.

Concurrent < lock-free < wait-free

Page 80: Atmosphere Conference 2015: Need for Async: In pursuit of scalable internet-scale applications

Concurrent < lock-free < wait-freeWhat can happen in concurrent data structures:

A tries to write; B tries to write; B wins!A tries to write; C tries to write; C wins!A tries to write; D tries to write; D wins!A tries to write; B tries to write; B wins!A tries to write; E tries to write; E wins!A tries to write; F tries to write; F wins!…

Moral? 1) Thread A is a complete loser. 2) Thread A eventually should be able to progress…

Page 81: Atmosphere Conference 2015: Need for Async: In pursuit of scalable internet-scale applications

* Both versions are used: lock-free / lockless

class CASBackedQueue[A] { val _queue = new AtomicReference(Vector[A]())

// does not block, may spin though @tailrec final def put(a: A): Unit = { val queue = _queue.get val appended = queue :+ a

if (!_queue.compareAndSet(queue, appended)) put(a) }}

Concurrent < lock-free < wait-free

Page 82: Atmosphere Conference 2015: Need for Async: In pursuit of scalable internet-scale applications

class CASBackedQueue[A] { val _queue = new AtomicReference(Vector[A]())

// does not block, may spin though @tailrec final def put(a: A): Unit = { val queue = _queue.get val appended = queue :+ a

if (!_queue.compareAndSet(queue, appended)) put(a) }}

Concurrent < lock-free < wait-free

Page 83: Atmosphere Conference 2015: Need for Async: In pursuit of scalable internet-scale applications

class CASBackedQueue[A] { val _queue = new AtomicReference(Vector[A]())

// does not block, may spin though @tailrec final def put(a: A): Unit = { val queue = _queue.get val appended = queue :+ a

if (!_queue.compareAndSet(queue, appended)) put(a) }}

Concurrent < lock-free < wait-free

Page 84: Atmosphere Conference 2015: Need for Async: In pursuit of scalable internet-scale applications

Concurrent < lock-free < wait-free

Page 85: Atmosphere Conference 2015: Need for Async: In pursuit of scalable internet-scale applications

Concurrent < lock-free < wait-free

Page 86: Atmosphere Conference 2015: Need for Async: In pursuit of scalable internet-scale applications

Concurrent < lock-free < wait-free

Page 87: Atmosphere Conference 2015: Need for Async: In pursuit of scalable internet-scale applications

Concurrent < lock-free < wait-free

Page 88: Atmosphere Conference 2015: Need for Async: In pursuit of scalable internet-scale applications

Concurrent < lock-free < wait-free

Page 89: Atmosphere Conference 2015: Need for Async: In pursuit of scalable internet-scale applications

Concurrent < lock-free < wait-free

Page 90: Atmosphere Conference 2015: Need for Async: In pursuit of scalable internet-scale applications

Concurrent < lock-free < wait-free

“concurrent” data structure

< lock-free* data structure

< wait-free data structure

* Both versions are used: lock-free / lockless

Page 91: Atmosphere Conference 2015: Need for Async: In pursuit of scalable internet-scale applications

Concurrent < lock-free < wait-free

Simple, Fast, and Practical Non-Blocking and Blocking Concurrent Queue AlgorithmsMaged M. Michael Michael L. Scott

An algorithm is wait-free if every operation has a bound on the number of steps the algorithm will take before the

operation completes.

Page 92: Atmosphere Conference 2015: Need for Async: In pursuit of scalable internet-scale applications

wait-free: j.u.c.ConcurrentLinkedQueue

Simple, Fast, and Practical Non-Blocking and Blocking Concurrent Queue AlgorithmsMaged M. Michael Michael L. Scott

public boolean offer(E e) { checkNotNull(e); final Node<E> newNode = new Node<E>(e);

for (Node<E> t = tail, p = t;;) { Node<E> q = p.next; if (q == null) { // p is last node if (p.casNext(null, newNode)) { // Successful CAS is the linearization point // for e to become an element of this queue, // and for newNode to become "live". if (p != t) // hop two nodes at a time casTail(t, newNode); // Failure is OK. return true; } // Lost CAS race to another thread; re-read next } else if (p == q) // We have fallen off list. If tail is unchanged, it // will also be off-list, in which case we need to // jump to head, from which all live nodes are always // reachable. Else the new tail is a better bet. p = (t != (t = tail)) ? t : head; else // Check for tail updates after two hops. p = (p != t && t != (t = tail)) ? t : q; }}

This is a modification of the Michael & Scott algorithm,adapted for a garbage-collected environment, with supportfor interior node deletion (to support remove(Object)). For explanation, read the paper.

Page 93: Atmosphere Conference 2015: Need for Async: In pursuit of scalable internet-scale applications

I / O

Page 94: Atmosphere Conference 2015: Need for Async: In pursuit of scalable internet-scale applications

IO / AIO

Page 95: Atmosphere Conference 2015: Need for Async: In pursuit of scalable internet-scale applications

IO / AIO / NIO

Page 96: Atmosphere Conference 2015: Need for Async: In pursuit of scalable internet-scale applications

IO / AIO / NIO / Zero

Page 97: Atmosphere Conference 2015: Need for Async: In pursuit of scalable internet-scale applications

Synchronous I / O

— Havoc Pennington(HAL, GNOME, GConf, D-BUS, now Typesafe)

When I learned J2EE about 2008 with some of my desktop colleagues our reactions included something like:

”wtf is this sync IO crap,

where is the main loop?!” :-)

Page 98: Atmosphere Conference 2015: Need for Async: In pursuit of scalable internet-scale applications

Interruption!CPU: User Mode / Kernel Mode

Page 99: Atmosphere Conference 2015: Need for Async: In pursuit of scalable internet-scale applications

Kernels and CPUs

Page 100: Atmosphere Conference 2015: Need for Async: In pursuit of scalable internet-scale applications

Kernels and CPUs

Page 101: Atmosphere Conference 2015: Need for Async: In pursuit of scalable internet-scale applications

Kernels and CPUs

Page 102: Atmosphere Conference 2015: Need for Async: In pursuit of scalable internet-scale applications

Kernels and CPUs

Page 103: Atmosphere Conference 2015: Need for Async: In pursuit of scalable internet-scale applications

Kernels and CPUs

http://wiki.osdev.org/Context_Switching

Page 104: Atmosphere Conference 2015: Need for Async: In pursuit of scalable internet-scale applications

Kernels and CPUs

[…] switching from user-level to kernel-level on a (2.8 GHz) P4 is 1348 cycles.

[…] Counting actual time, the P4 takes 481ns […]

http://wiki.osdev.org/Context_Switching

Page 105: Atmosphere Conference 2015: Need for Async: In pursuit of scalable internet-scale applications

I / O

Page 106: Atmosphere Conference 2015: Need for Async: In pursuit of scalable internet-scale applications

I / O

Page 107: Atmosphere Conference 2015: Need for Async: In pursuit of scalable internet-scale applications

I / O

Page 108: Atmosphere Conference 2015: Need for Async: In pursuit of scalable internet-scale applications

I / O

Page 109: Atmosphere Conference 2015: Need for Async: In pursuit of scalable internet-scale applications

I / O

Page 110: Atmosphere Conference 2015: Need for Async: In pursuit of scalable internet-scale applications

I / O

“Don’t worry. It only gets worse!”

Page 111: Atmosphere Conference 2015: Need for Async: In pursuit of scalable internet-scale applications

I / O“Don’t worry.

It only gets worse!”

Same data in 3 buffers!4 mode switches!

Page 112: Atmosphere Conference 2015: Need for Async: In pursuit of scalable internet-scale applications

Asynchronous I / O [Linux]

Linux AIO = JVM NIO

Page 113: Atmosphere Conference 2015: Need for Async: In pursuit of scalable internet-scale applications

Asynchronous I / O [Linux]

NewIO… since 2004!(No-one calls it “new” any more)

Linux AIO = JVM NIO

Page 114: Atmosphere Conference 2015: Need for Async: In pursuit of scalable internet-scale applications

Asynchronous I / O [Linux]

Less time wasted waiting. Same amount of buffer copies.

Page 115: Atmosphere Conference 2015: Need for Async: In pursuit of scalable internet-scale applications

ZeroCopy = sendfile [Linux]

“Work smarter. Not harder.”

http://fourhourworkweek.com/

Page 116: Atmosphere Conference 2015: Need for Async: In pursuit of scalable internet-scale applications

ZeroCopy = sendfile [Linux]

Page 117: Atmosphere Conference 2015: Need for Async: In pursuit of scalable internet-scale applications

ZeroCopy = sendfile [Linux]

Data never leaves kernel mode!

Page 118: Atmosphere Conference 2015: Need for Async: In pursuit of scalable internet-scale applications

ZeroCopy…

Page 119: Atmosphere Conference 2015: Need for Async: In pursuit of scalable internet-scale applications

C10K and beyond

Page 120: Atmosphere Conference 2015: Need for Async: In pursuit of scalable internet-scale applications

C10K and beyond

“10.000 concurrent connections”

Not a new problem, pretty old actually: ~12 years old.

http://www.kegel.com/c10k.html

Page 121: Atmosphere Conference 2015: Need for Async: In pursuit of scalable internet-scale applications

C10K and beyond

It’s not about performance.It’s about scalability.

These are orthogonal things.

Threading differences: apache httpd / nginx / akka

(1 thread = 1 request) == Very heavy

Page 122: Atmosphere Conference 2015: Need for Async: In pursuit of scalable internet-scale applications

select/poll

Page 123: Atmosphere Conference 2015: Need for Async: In pursuit of scalable internet-scale applications

C10K – poll

Page 124: Atmosphere Conference 2015: Need for Async: In pursuit of scalable internet-scale applications

C10K – poll

Page 125: Atmosphere Conference 2015: Need for Async: In pursuit of scalable internet-scale applications

C10K – poll

Page 126: Atmosphere Conference 2015: Need for Async: In pursuit of scalable internet-scale applications

epoll

Page 127: Atmosphere Conference 2015: Need for Async: In pursuit of scalable internet-scale applications

C10K – epoll [Linux]

Page 128: Atmosphere Conference 2015: Need for Async: In pursuit of scalable internet-scale applications

C10K – epoll [Linux]

Page 129: Atmosphere Conference 2015: Need for Async: In pursuit of scalable internet-scale applications

C10K – epoll [Linux]

Page 130: Atmosphere Conference 2015: Need for Async: In pursuit of scalable internet-scale applications

C10K – epoll [Linux]

Page 131: Atmosphere Conference 2015: Need for Async: In pursuit of scalable internet-scale applications

C10K – epoll [Linux]

Page 132: Atmosphere Conference 2015: Need for Async: In pursuit of scalable internet-scale applications

C10K

O(n) is a no-go for epic scalability.

Page 133: Atmosphere Conference 2015: Need for Async: In pursuit of scalable internet-scale applications

C10K

O(n) is a no-go for epic scalability.

State of Linux scheduling: O(n) ⇒ O(1) ⇒ CFS (O(1)/ O(log n))

And Socket selection: Select/Poll O(n) ⇒ EPoll (O(1))

Page 134: Atmosphere Conference 2015: Need for Async: In pursuit of scalable internet-scale applications

C10K

O(n) is a no-go for epic scalability.

State of Linux scheduling: O(n) ⇒ O(1) ⇒ CFS (O(1))

And Socket selection: Select/Poll O(n) ⇒ EPoll (O(1))

O(1) IS a go for epic scalability.Moral:

Page 135: Atmosphere Conference 2015: Need for Async: In pursuit of scalable internet-scale applications

Distributed Systems

Page 136: Atmosphere Conference 2015: Need for Async: In pursuit of scalable internet-scale applications

Distributed Systems“… in which the failure of a computer you didn't even know existed canrender your own computer unusable.”

— Leslie Lamport

http://research.microsoft.com/en-us/um/people/lamport/pubs/distributed-system.txt

Page 137: Atmosphere Conference 2015: Need for Async: In pursuit of scalable internet-scale applications

Distributed SystemsThe bigger the system, the more “random” latency / failure noise.

Embrace instead of hiding it.

Page 138: Atmosphere Conference 2015: Need for Async: In pursuit of scalable internet-scale applications

Distributed SystemsBackup Requests

Page 139: Atmosphere Conference 2015: Need for Async: In pursuit of scalable internet-scale applications

Backup requests

A technique for fighting “long tail latencies”.By issuing duplicated work, when SLA seems in danger.

Page 140: Atmosphere Conference 2015: Need for Async: In pursuit of scalable internet-scale applications

Backup requests

Page 141: Atmosphere Conference 2015: Need for Async: In pursuit of scalable internet-scale applications

Backup requests

Page 142: Atmosphere Conference 2015: Need for Async: In pursuit of scalable internet-scale applications

Backup requests - send

Page 143: Atmosphere Conference 2015: Need for Async: In pursuit of scalable internet-scale applications

Backup requests - send

Page 144: Atmosphere Conference 2015: Need for Async: In pursuit of scalable internet-scale applications

Backup requests - send

Page 145: Atmosphere Conference 2015: Need for Async: In pursuit of scalable internet-scale applications

Backup requests

Avg Std dev 95%ile 99%ile 99.9%ile

No backups 33 ms 1524 ms 24 ms 52 ms 994 ms

After 10ms 14 ms 4 ms 20 ms 23 ms 50 ms

After 50ms 16 ms 12 ms 57 ms 63 ms 68 ms

Jeff Dean - Achieving Rapid Response Times in Large Online Services Peter Bailis - Doing Redundant Work to Speed Up Distributed Queries

Akka - Krzysztof Janosz @ Akkathon, Kraków - TailChoppingRouter (docs, pr)

Page 146: Atmosphere Conference 2015: Need for Async: In pursuit of scalable internet-scale applications

Distributed SystemsCombined Requests

Page 147: Atmosphere Conference 2015: Need for Async: In pursuit of scalable internet-scale applications

Combined requests

A technique for avoiding duplicated work.By aggregating requests, possibly increasing latency.

Page 148: Atmosphere Conference 2015: Need for Async: In pursuit of scalable internet-scale applications

Combined requests

A technique for avoiding duplicated work.By aggregating requests, possibly increasing latency.

“Wat? Why would I increase latency!?”

Page 149: Atmosphere Conference 2015: Need for Async: In pursuit of scalable internet-scale applications

Combined requests

Page 150: Atmosphere Conference 2015: Need for Async: In pursuit of scalable internet-scale applications

Combined requests

Page 151: Atmosphere Conference 2015: Need for Async: In pursuit of scalable internet-scale applications

Combined requests

Page 152: Atmosphere Conference 2015: Need for Async: In pursuit of scalable internet-scale applications

Combined requests

Page 153: Atmosphere Conference 2015: Need for Async: In pursuit of scalable internet-scale applications

Combined requests

Page 154: Atmosphere Conference 2015: Need for Async: In pursuit of scalable internet-scale applications

Combined requests

Page 155: Atmosphere Conference 2015: Need for Async: In pursuit of scalable internet-scale applications

World of Tradeoffs

Page 156: Atmosphere Conference 2015: Need for Async: In pursuit of scalable internet-scale applications

Combined requests with backpressure

reactive-streams.org

Timer can be replaced with back-pressure.

Page 157: Atmosphere Conference 2015: Need for Async: In pursuit of scalable internet-scale applications

Wrapping up

Page 158: Atmosphere Conference 2015: Need for Async: In pursuit of scalable internet-scale applications

Wrapping up

• Someone has to bite the bullet though!

• We’re all running on real hardware.

• We do it so you don’t have to.

Summing up the goal of this talk:

• Keep your apps pure and simple

• Be aware of internals

• Use libraries which apply these techniques

• Maybe indeed

• Messaging all the way!

These techniques are hard to get right, yet:

Page 159: Atmosphere Conference 2015: Need for Async: In pursuit of scalable internet-scale applications

Links• akka.io • reactive-streams.org• akka-user

• Gil Tene - How NOT to measure latency, 2013• Jeff Dean @ Velocity 2014• Alan Bateman, Jeanfrancois Arcand (Sun) Async IO Tips @ JavaOne• http://linux.die.net/man/2/select• http://linux.die.net/man/2/poll• http://linux.die.net/man/4/epoll• giltene/jHiccup• Linux Journal: ZeroCopy I, Dragan Stancevis 2013

• Last slide car picture: http://actu-moteurs.com/sprint/gt-tour/jean-philippe-belloc-un-beau-challenge-avec-le-akka-asp-team/2000

Page 160: Atmosphere Conference 2015: Need for Async: In pursuit of scalable internet-scale applications

Links• http://wiki.osdev.org/Context_Switching• CppCon: Herb Sutter "Lock-Free Programming (or, Juggling Razor Blades)" • http://www.infoq.com/presentations/reactive-services-scale• Gil Tene’s HdrHistogram.org

• http://hdrhistogram.github.io/HdrHistogram/plotFiles.html• Rob Pike - Concurrency is NOT Parallelism (video)• Brendan Gregg - Systems Performance: Enterprise and the Cloud (book)• http://psy-lob-saw.blogspot.com/2015/02/hdrhistogram-better-latency-capture.html• Jeff Dean, Luiz Andre Barroso - The Tail at Scale (whitepaper, ACM)• http://highscalability.com/blog/2012/3/12/google-taming-the-long-latency-tail-when-

more-machines-equal.html• http://www.ulduzsoft.com/2014/01/select-poll-epoll-practical-difference-for-system-

architects/• Marcus Lagergren - Oracle JRockit: The Definitive Guide (book)• http://mechanical-sympathy.blogspot.com/2013/08/lock-based-vs-lock-free-

concurrent.html• Handling of Asynchronous Events - http://www.win.tue.nl/~aeb/linux/lk/lk-12.html• http://www.kegel.com/c10k.html

Page 161: Atmosphere Conference 2015: Need for Async: In pursuit of scalable internet-scale applications

Links• www.reactivemanifesto.org/• Seriously the only right way to micro benchmark on the JVM:

• JMH openjdk.java.net/projects/code-tools/jmh/• JMH for Scala: https://github.com/ktoso/sbt-jmh

• http://www.ibm.com/developerworks/library/l-async/• http://lse.sourceforge.net/io/aio.html• https://code.google.com/p/kernel/wiki/AIOUserGuide• ShmooCon: C10M - Defending the Internet At Scale (Robert Graham)

• http://blog.erratasec.com/2013/02/scalability-its-question-that-drives-us.html#.VO6E11PF8SM

• User-level threads....... with threads. - Paul Turner @ Linux Plumbers Conf 2013• Jan Pustelnik’s talk yesterday: https://github.com/gosubpl/sortbench

Page 162: Atmosphere Conference 2015: Need for Async: In pursuit of scalable internet-scale applications

Special thanks to:

• Aleksey Shipilëv• Andrzej Grzesik• Gil Tene• Kirk Pepperdine• Łukasz Dubiel• Marcus Lagergren• Martin Thompson• Mateusz Dymczyk• Nitsan Wakart

Thanks guys, you’re awesome.

alphabetically, mostly

• Peter Lawrey• Richard Warburton• Roland Kuhn• Sergey Kuksenko• Steve Poole• Viktor Klang a.k.a. √• Antoine de Saint Exupéry :-)• and the entire Akka Team• the Mechanical Sympathy Mailing List

Page 163: Atmosphere Conference 2015: Need for Async: In pursuit of scalable internet-scale applications

Learn more at:

• SCKRK.com –

• KrakowScala.pl – Kraków Scala User Group

• LambdaKRK.pl – Lambda Lounge Kraków

• GeeCON.org – awesome “all around the JVM” conference, [Kraków, Poznań, Prague, and more…!]

Software Craftsmanship KrakówComputer Science Whitepaper Reading Club Kraków

Page 164: Atmosphere Conference 2015: Need for Async: In pursuit of scalable internet-scale applications

Questions?

ktoso @ typesafe.com twitter: ktosopl

github: ktosoteam blog: letitcrash.com

home: akka.io

Page 165: Atmosphere Conference 2015: Need for Async: In pursuit of scalable internet-scale applications

©Typesafe 2015 – All Rights Reserved