State: You’re Doing it Wrong. Alternative Concurrency Paradigms For the JVM™
Jonas Bonér, Crisp AB
http://jonasboner.com
http://crisp.se
2
Agenda
> An Emergent Crisis
> Shared-State Concurrency
> Message-Passing Concurrency (Actors)
> Software Transactional Memory (STM)
> Dataflow Concurrency
> Wrap up
3
Moore’s Law
> The number of transistors doubles every 18 months
> Coined in the 1965 paper by Gordon E. Moore
> Chip manufacturers have solved our problems for years
4
Not anymore
5
The free lunch is over
> The end of Moore’s Law
> We can’t squeeze more out of one chip
6
Clock speed vs number of transistors
> The number of transistors continues to climb, at least for now
> Clock speed, however, is a different story
7
Conclusion
> This is an emergent crisis
> Multi-processors are here to stay
> We need to learn to take advantage of that
> The world is going concurrent
8
The rise of Amdahl’s Law
“The speedup of a program using multiple processors in parallel
computing is limited by the time needed for the sequential fraction of
the program”
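The quote can be made concrete with the usual statement of Amdahl's Law: if a fraction p of the work can run in parallel on n processors, the speedup is 1 / ((1 - p) + p / n). The sketch below is only an illustration; the class and method names are made up for this example.

```java
final class Amdahl {
    /**
     * Speedup predicted by Amdahl's Law.
     * p = fraction of the program that can run in parallel (0..1)
     * n = number of processors (>= 1)
     */
    static double speedup(double p, int n) {
        if (p < 0.0 || p > 1.0 || n < 1) {
            throw new IllegalArgumentException("need 0 <= p <= 1 and n >= 1");
        }
        return 1.0 / ((1.0 - p) + p / n);
    }

    public static void main(String[] args) {
        // With 10% sequential work the speedup can never reach 10x,
        // no matter how many processors are added.
        for (int n : new int[] {1, 2, 4, 8, 16, 1024}) {
            System.out.printf("n=%d speedup=%.2f%n", n, speedup(0.9, n));
        }
    }
}
```

The sequential fraction sets a hard ceiling of 1 / (1 - p), which is why shrinking it matters more than adding cores.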
9
Shared-State Concurrency
> Concurrent, synchronized access to shared, mutable state
> Protect mutable state with locks
> The Java, C#, C/C++, Ruby, Python, etc. way
10
Shared-State Concurrency is incredibly hard
> Inherently very hard to use reliably
> Even the experts get it wrong
11
Example: Transfer funds between bank accounts
12
Account
public class Account {
  private double balance;
  public void withdraw(double amount) {
    balance -= amount;
  }
  public void deposit(double amount) {
    balance += amount;
  }
}
> Not thread-safe
13
Make it thread-safe
public class Account {
  private double balance;
  public synchronized void withdraw(double amount) {
    balance -= amount;
  }
  public synchronized void deposit(double amount) {
    balance += amount;
  }
}
> Thread-safe, right?
> What about atomic transfers?
14
Let’s write a transfer method
public synchronized void transfer(
    Account from, Account to, double amount) {
  from.withdraw(amount);
  to.deposit(amount);
}
> This will work, right?
15
It’s broken
> Someone might update the accounts outside our transfer method
> synchronized on transfer locks the object that owns the method, not the two accounts
16
Transfer method; take 2
public void transfer(
    Account from, Account to, double amount) {
  synchronized(from) {
    synchronized(to) {
      from.withdraw(amount);
      to.deposit(amount);
    }
  }
}
> This will work, right?
17
Let’s transfer funds
Account a1 = ...
Account a2 = ...
// in one thread
transfer(a1, a2, 10.0D);
// in another thread
transfer(a2, a1, -3.0D);
18
Might lead to DEADLOCK!!!
> Darn, this is really hard!!!
19
We need to enforce lock ordering
> How?
> Java won’t help us
> Need to use code convention (names etc.)
> Requires knowledge about the internal state and implementation of Account
> …runs counter to the principles of encapsulation in OOP
> Opens up a Can of Worms™
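One common convention for lock ordering is to give every account a unique id and always lock the lower id first. The sketch below assumes such an id exists (it is not part of the Account class on the slides) and uses a long balance in cents; all names are illustrative.

```java
final class OrderedTransfer {
    static final class Account {
        private final long id;      // assumed unique; used only to order the locks
        private long balanceCents;

        Account(long id, long balanceCents) {
            this.id = id;
            this.balanceCents = balanceCents;
        }

        synchronized long balance() {
            return balanceCents;
        }
    }

    /** Always locks the account with the lower id first, whatever the direction. */
    static void transfer(Account from, Account to, long cents) {
        if (from == to) {
            return; // nothing to move
        }
        Account first = from.id < to.id ? from : to;
        Account second = (first == from) ? to : from;
        synchronized (first) {
            synchronized (second) {
                from.balanceCents -= cents;
                to.balanceCents += cents;
            }
        }
    }

    /** Two threads transfer in opposite directions; returns the final balances. */
    static long[] demo() {
        Account a1 = new Account(1, 100_000);
        Account a2 = new Account(2, 100_000);
        Thread t1 = new Thread(() -> {
            for (int i = 0; i < 10_000; i++) transfer(a1, a2, 10);
        });
        Thread t2 = new Thread(() -> {
            for (int i = 0; i < 10_000; i++) transfer(a2, a1, 3);
        });
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return new long[] {a1.balance(), a2.balance()};
    }
}
```

Note that the convention lives entirely in transfer: any other code that locks two accounts has to follow the same rule, which is exactly the encapsulation problem the slide points out.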
20
The devil is in the mutable state
21
The problem with locks
> Locks do not compose
> Ordering matters but can’t be enforced
    Need to rely on code conventions
> Not natural to reason about
    Does not map well to the real world
> Easy to get
    Deadlocks
    Live locks
    Starvation
22
Jonas’ Law ;-)
“Shared-State Concurrency is like children;
completely indeterministic behavior with a risk for total deadlock.”
23
Help 1: java.util.concurrent
> Great library
> Raises the abstraction level
> Simplifies concurrent code
> Use it, don’t roll your own
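As a small illustration of the raised abstraction level, the sketch below counts from several pooled tasks using an ExecutorService and an AtomicLong instead of hand-written threads and locks. The class name, pool size and timeout are arbitrary choices for this example.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

final class CounterDemo {
    /** Runs `tasks` tasks on a small pool; each increments a shared counter. */
    static long countWithPool(int tasks, int incrementsPerTask) {
        AtomicLong counter = new AtomicLong();          // lock-free shared counter
        ExecutorService pool = Executors.newFixedThreadPool(4);
        for (int t = 0; t < tasks; t++) {
            pool.execute(() -> {
                for (int i = 0; i < incrementsPerTask; i++) {
                    counter.incrementAndGet();          // atomic read-modify-write
                }
            });
        }
        pool.shutdown();                                // accept no new tasks
        try {
            if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
                throw new IllegalStateException("tasks did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return counter.get();
    }
}
```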
24
Help 2: Immutable objects
> Can be safely shared between threads
> Easy to reason about
> Less bug- and error-prone
> Should be the default (final in Java)
> Effectively immutable objects are OK
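A minimal sketch of an immutable value class in Java: a final class with final fields and no setters, where every "change" returns a new object. Money is a hypothetical example type, not something from the talk.

```java
final class Money {
    private final long cents;   // final: assigned once in the constructor

    Money(long cents) {
        this.cents = cents;
    }

    long cents() {
        return cents;
    }

    /** Returns a new Money; neither operand is modified. */
    Money plus(Money other) {
        return new Money(Math.addExact(cents, other.cents));
    }
}
```

Because no instance ever changes after construction, a Money can be handed to any number of threads without locking.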
25
Help 3: Confinement
> Thread confinement
    ThreadLocal in Java
> Stack confinement
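Both kinds of confinement can be sketched in a few lines. Below, a ThreadLocal gives each thread its own StringBuilder (thread confinement), and the local variable inside render never escapes the method (stack confinement). The names are invented for this illustration.

```java
final class ConfinementDemo {
    // Each thread lazily gets its own private StringBuilder.
    private static final ThreadLocal<StringBuilder> BUFFER =
            ThreadLocal.withInitial(StringBuilder::new);

    static String render(String name) {
        StringBuilder sb = BUFFER.get(); // this thread's buffer; no lock needed
        sb.setLength(0);
        sb.append("hello, ").append(name);
        return sb.toString();            // only an immutable String leaves the method
    }

    /** Checks that two threads really see two different buffer instances. */
    static boolean buffersAreDistinct() {
        StringBuilder[] seen = new StringBuilder[2];
        Thread t0 = new Thread(() -> seen[0] = BUFFER.get());
        Thread t1 = new Thread(() -> seen[1] = BUFFER.get());
        t0.start();
        t1.start();
        try {
            t0.join();
            t1.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen[0] != null && seen[1] != null && seen[0] != seen[1];
    }
}
```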
26
Java bet on the wrong horse
> But there are alternatives
> There is actually a better world out there…
27
Alternative paradigms
> Message-Passing Concurrency (Actors)
> Software Transactional Memory (STM)
> Dataflow Concurrency
28
Message-Passing Concurrency
29
Actors
> Implements Message-Passing Concurrency
> Share NOTHING
> Isolated lightweight processes
> Communicates through messages
> Asynchronous and non-blocking
30
Actors
> Originates in a 1973 paper by Carl Hewitt
> Implemented in Erlang, Occam, Oz
> Encapsulates state and behavior
> Closer to the definition of OO than classes
31
Alan Kay (father of Smalltalk and OOP)
“OOP to me means only messaging, local retention and protection and hiding of state-process, and extreme late-binding of all things.”
“Actually I made up the term ‘object-oriented’, and I can tell you I did not have C++ in mind.”
> Replace C++ with Java or C#
32
Actor Model of Concurrency
> No shared state
    … hence, nothing to synchronize.
> Each actor has a mailbox (message queue)
33
Actor Model of Concurrency
> Non-blocking send
> Blocking receive
> Messages are immutable
> Highly performant and scalable
    SEDA-style
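The mailbox idea can be sketched in plain Java with a BlockingQueue: send only enqueues (non-blocking on an unbounded queue), the actor's own thread blocks in take, and the messages are immutable Strings. This is a toy, not a real actor library; MiniActor, the STOP marker and the upper-casing behavior are all made up for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

final class MiniActor {
    private static final String STOP = "__stop__";        // hypothetical shutdown marker
    private final BlockingQueue<String> mailbox = new LinkedBlockingQueue<>();
    private final List<String> handled = new ArrayList<>(); // touched only by the actor thread
    private final Thread worker = new Thread(this::run);

    void start() {
        worker.start();
    }

    /** Non-blocking send: the message is queued and the caller moves on. */
    void send(String message) {
        mailbox.add(message);
    }

    /** Message loop: blocking receive, one message at a time. */
    private void run() {
        try {
            while (true) {
                String message = mailbox.take();
                if (message.equals(STOP)) {
                    return;
                }
                handled.add(message.toUpperCase());
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    /** Asks the actor to stop, waits for it, then exposes what it processed. */
    List<String> stopAndGet() {
        send(STOP);
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return handled;
    }
}
```

The actor's state (handled) needs no synchronization because only the actor thread touches it while it runs; callers read it only after join.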
34
Actor Model of Concurrency
> Easier to reason about
> Raised abstraction level
> Easier to avoid
    Race conditions
    Deadlocks
    Starvation
    Live locks
35
Fault-tolerant systems
> Link actors
> Supervisor hierarchies
    One-for-one
    All-for-one
> Ericsson’s Erlang success story
    9 nines uptime
36
Scala Actors
> Asynchronous
    Fire-and-forget
    Futures (Send Receive Reply Eventually)
> Synchronous
> Message loop with pattern (message) matching
> Erlang-style
37
Example: Hello World
val server = actor {
  loop { // message loop
    react { // match on message
      case "greeting" =>
        println("Hello World!")
      case "exit" =>
        println("Exiting")
        exit
      case _ =>
        println("Unknown message")
    }
  }
}
38
Two different models
> Thread-based
    receive { ... }
> Event-based
    react { ... }
    Very lightweight
    Can easily create millions on a single workstation
> The models are unified
39
Example: Ship scheduling system
> Uses Event Sourcing
    Capture all changes to an application state as a sequence of events
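Stripped of actors, the core of event sourcing is that current state is never stored directly; it is recomputed by replaying the event log. The single-threaded Java sketch below shows only that idea, with an invented Arrived event and ShipTracker class rather than the types used on the following slides.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class ShipTracker {
    /** Immutable event: a ship arrived at a port. */
    static final class Arrived {
        final String ship;
        final String port;

        Arrived(String ship, String port) {
            this.ship = ship;
            this.port = port;
        }
    }

    // The append-only log is the only thing that is stored.
    private final List<Arrived> log = new ArrayList<>();

    void record(Arrived event) {
        log.add(event);
    }

    /** Rebuilds ship locations from the whole log. */
    Map<String, String> replay() {
        return replay(log.size());
    }

    /** Rebuilds ship locations from the first `upTo` events only (a snapshot). */
    Map<String, String> replay(int upTo) {
        Map<String, String> location = new HashMap<>();
        for (Arrived e : log.subList(0, upTo)) {
            location.put(e.ship, e.port); // later events overwrite earlier ones
        }
        return location;
    }
}
```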
40
Define the messages
sealed abstract case class Event
abstract case class StateChange(val time: Date) extends Event {
  val recorded = new Date
  def process: Unit
}
case class DepartureEvent(
    val time: Date, val port: Port, val ship: Ship)
    extends StateChange(time) {
  override def process = ship ! this
}
case class ArrivalEvent(
    val time: Date, val port: Port, val ship: Ship)
    extends StateChange(time) {
  override def process = ship ! this
}
41
Create the Ship actor
class Ship(val name: String, val home: Port) extends Actor {
  def act = loop(home)
  private def loop(current: Port) {
    react {
      case ArrivalEvent(time, port, _) =>
        println(toString + " ARRIVED at port " + port + " @ " + time)
        loop(port)
      case DepartureEvent(time, port, _) =>
        println(toString + " DEPARTED from port " + port + " @ " + time)
        loop(Port.AT_SEA)
      case unknown =>
        error("Unknown event: " + unknown)
    }
  }
}
42
Create the EventProcessor actor
class EventProcessor extends Actor {
  def act = loop(Nil)
  private def loop(log: List[StateChange]) {
    react {
      case event: StateChange =>
        event.process
        loop(event :: log)
      case unknown =>
        error("Unknown event: " + unknown)
    }
  }
}
43
Create the Ship, EventProcessor and Ports
val processor = new EventProcessor
processor.start
val portSFO = new Port("San Francisco", Country.US)
val portLA = new Port("Los Angeles", Country.US)
val portYYV = new Port("Vancouver", Country.CANADA)
val shipKR = new Ship("King Roy", portYYV)
shipKR.start
44
Set the Ship to sea
processor ! ArrivalEvent(new Date(..), portLA, shipKR)
processor ! DepartureEvent(new Date(..), portYYV, shipKR)
processor ! ArrivalEvent(new Date(..), portYYV, shipKR)
processor ! DepartureEvent(new Date(..), portSFO, shipKR)
processor ! ArrivalEvent(new Date(..), portSFO, shipKR)
45
Add replay of messages
case object Replay extends Event
class EventProcessor extends Actor {
  def act = loop(Nil)
  private def loop(log: List[StateChange]) {
    react {
      ...
      // replay all events
      case Replay =>
        log.reverse.foreach(_.process)
        loop(log)
    }
  }
}
46
Add replay up to a specific date (e.g. snapshot)
case class ReplayUpTo(date: Date) extends Event
class EventProcessor extends Actor {
  def act = loop(Nil)
  private def loop(log: List[StateChange]) {
    react {
      ...
      // replay events up to a specific time
      // (uses the time field declared on StateChange)
      case ReplayUpTo(date) =>
        log.reverse
          .filter(_.time.getTime <= date.getTime)
          .foreach(_.process)
        loop(log)
    }
  }
}
47
Priority Messages
def act = loop(Nil)
private def loop(log: List[StateChange]) {
  reactWithin(0) {
    case ArrivalEvent(time, port, _) => ... // highest priority
    // nothing of high priority in the mailbox: fall back to the rest
    case TIMEOUT =>
      react {
        case DepartureEvent(time, port, _) => ... // lowest priority
      }
  }
}
48
Actor alternatives for the JVM
> Kilim (Java)
> Jetlang (Java)
> Actor’s Guild (Java)
> ActorFoundry (Java)
> Actorom (Java)
> GParallelizer (Groovy)
> Fan Actors (Fan)
49
Problems with Actors
> Actors don’t work well when
    We really have shared state, e.g. a bank account
    We need to form some unified consensus
    We need synchronous behavior
> A bit verbose compared to method dispatch
50
Software Transactional Memory (STM)
51
STM
> See the memory (heap and stack) as a transactional dataset
> Similar to a database
    begin
    commit
    abort/rollback
52
STM
> Transactions are retried automatically upon collision
> Rolls back the memory on abort
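The retry-on-collision idea can be illustrated on a single reference with a compare-and-set loop: read a snapshot, compute a new value, and commit only if nobody else committed in between, otherwise start over. This is not an STM (there is no multi-variable transaction or rollback), just a hedged analogy; OptimisticCell is a made-up name, and AtomicReference.updateAndGet already offers the same loop.

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.UnaryOperator;

final class OptimisticCell<T> {
    private final AtomicReference<T> ref;

    OptimisticCell(T initial) {
        ref = new AtomicReference<>(initial);
    }

    T get() {
        return ref.get();
    }

    /** Applies f optimistically; f may run several times, so it must be side-effect free. */
    T update(UnaryOperator<T> f) {
        while (true) {
            T snapshot = ref.get();                 // "begin": take a snapshot
            T next = f.apply(snapshot);             // compute against the snapshot
            if (ref.compareAndSet(snapshot, next)) {
                return next;                        // "commit" succeeded
            }
            // collision: another thread committed first, so retry
        }
    }

    /** Two threads increment concurrently; no update is lost. */
    static int demo() {
        OptimisticCell<Integer> cell = new OptimisticCell<>(0);
        Runnable work = () -> {
            for (int i = 0; i < 10_000; i++) cell.update(v -> v + 1);
        };
        Thread t1 = new Thread(work);
        Thread t2 = new Thread(work);
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return cell.get();
    }
}
```

Because the function passed to update can be re-run after a collision, anything it does besides computing the new value may happen more than once.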
53
STM
> Transactions can nest
> Transactions compose
atomic {
  ..
  atomic {
    ..
  }
}
54
Developer restrictions
> All operations in the scope of a transaction:
    Need to be idempotent
    Can’t have side-effects
55
Case study: Clojure
56
What is Clojure?
> Functional language
> Only immutable data and data structures
> Pragmatic Lisp
> Great Java interop
> Dynamic, but very fast
57
Clojure’s concurrency story
> STM (Refs)
    Synchronous, Coordinated
> Atoms
    Synchronous, Uncoordinated
> Agents
    Asynchronous, Uncoordinated
> Vars
    Synchronous, Thread Isolated
58
STM (Refs)
> A Ref holds a reference to an immutable value
> A Ref can only be changed in a transaction
> Updates are atomic and isolated (ACI)
> A transaction sees its own snapshot of the world
> Transactions are retried upon collision
59
Agents
> Manages independent state
> Asynchronous (fire-and-forget)
> Changes state by applying a function (state => newState)
> Coordinates with STM (changes are held until commit)
60
Atoms
> Manages independent state
> Synchronous
> Atomic transition
61
Clojure’s STM implementation
> MVCC (Multi-Version Concurrency Control)
> Uses wait/notify
> Deadlock detection
> One single CAS (timestamp)
62
Example: Refs
(def foo (ref {:a "fred" :b "ethel" :c 42 :d 17 :e 6}))
@foo
;; -> {:d 17, :a "fred", :b "ethel", :c 42, :e 6}
(commute foo assoc :a "lucy")
;; -> IllegalStateException: No transaction running
(dosync (commute foo assoc :a "lucy"))
@foo
;; -> {:d 17, :a "lucy", :b "ethel", :c 42, :e 6}
63
Uniform state transition
;; refs
(dosync (commute foo assoc :a "lucy"))
;; agents
(send foo assoc :a "lucy")
;; atoms
(swap! foo assoc :a "lucy")
64
My (humble) opinion on STM
> Can never work well in a language that doesn’t have compiler-enforced immutability
> E.g. never in Java
> How to do it in imperative languages is still a research topic
65
Dataflow Concurrency
> The forgotten paradigm
66
Dataflow Concurrency
> Declarative
> No observable non-determinism
> Data-driven: threads block until data is available
> No difference between concurrent and sequential code
67
Dataflow Concurrency
> Dataflow (Single-Assignment) Variables
> Dataflow Streams (the tail is a dataflow variable)
> Implemented in Oz and Alice
68
Dataflow Concurrency
> No race-conditions
> No deadlocks
> No live-locks
> Deterministic
> BEAUTIFUL
69
Three operations
> Create a dataflow variable
> Wait for the variable to be bound
> Bind the variable
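In plain Java, the three operations map loosely onto CompletableFuture: constructing one creates the variable, join waits until it is bound, and complete binds it. This is only an approximation offered as a sketch: a second complete call is silently ignored rather than rejected, and DataflowDemo is an invented name.

```java
import java.util.concurrent.CompletableFuture;

final class DataflowDemo {
    static int run() {
        // Create three unbound dataflow-style variables.
        CompletableFuture<Integer> x = new CompletableFuture<>();
        CompletableFuture<Integer> y = new CompletableFuture<>();
        CompletableFuture<Integer> z = new CompletableFuture<>();

        // This thread blocks in join() until both x and y are bound, then binds z.
        Thread tz = new Thread(() -> z.complete(x.join() + y.join()));
        // These threads bind x and y; start order does not affect the result.
        Thread tx = new Thread(() -> x.complete(40));
        Thread ty = new Thread(() -> y.complete(2));

        tz.start();
        tx.start();
        ty.start();

        return z.join(); // wait for z to be bound
    }
}
```

Whichever thread the scheduler runs first, run() returns 42, which is the determinism the previous slide refers to.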
70
Limitations
> Can’t have side-effects
    Exceptions
    IO
    etc.
71
Example: Oz-style dataflow concurrency for the JVM
> Created my own implementation (DSL)
> Implemented on top in Scala
72
API: Dataflow Variable
// Create dataflow variable
val x, y, z = new DataFlowVariable[Int]
// Access dataflow variable (Wait to be bound)
z()
// Bind dataflow variable
x << 40
// Lightweight thread
thread { y << 2 }
73
Example: Dataflow Variables
val x, y, z = new DataFlowVariable[Int]
thread {
  z << x() + y()
  println("z = " + z())
}
thread { x << 40 }
thread { y << 2 }
74
API: Dataflow Stream
// Create dataflow stream
val producer = new DataFlowStream[Int]
// Append to stream
producer <<< s
// Read from stream
producer()
75
Example: Dataflow Streams
// recursive methods need an explicit result type in Scala
def ints(n: Int, max: Int, stream: DataFlowStream[Int]): Unit = {
  if (n != max) {
    stream <<< n
    ints(n + 1, max, stream)
  }
}
def sum(s: Int, in: DataFlowStream[Int], out: DataFlowStream[Int]): Unit = {
  out <<< s
  sum(in() + s, in, out)
}
def printSum(stream: DataFlowStream[Int]): Unit = {
  println("Result: " + stream())
  printSum(stream)
}
76
Example: Dataflow Streams
val producer = new DataFlowStream[Int]
val consumer = new DataFlowStream[Int]
thread { ints(0, 1000, producer) }
thread { sum(0, producer, consumer) }
thread { printSum(consumer) }
77
Wrap up
> Parallel programming is becoming increasingly important
> We need a simpler way of writing concurrent programs
> Java-style concurrency is too hard
> There are alternatives worth exploring
    Message-Passing Concurrency
    Software Transactional Memory
    Dataflow Concurrency
78
Jonas Bonér, Crisp AB
http://jonasboner.com
http://crisp.se