Concurrency and Multithreading Demystified - Reversim Summit 2014

Reversim Summit 2014 Concurrency and MultiThreading Demystified Haim Yadid - Performize-IT

description

Life as a software engineer is so exciting! Computing power continues to rise exponentially, and software demands continue to rise exponentially as well; so far so good. The bad news is that over the last decade the computing power available to a single-threaded application has remained almost flat. If you keep ignoring concurrency and multithreading, the gap between the problems you are able to solve and what your hardware is capable of will continue to grow. In this session we will discuss different approaches to taming the concurrency beast, such as shared mutability, shared immutability, and isolated mutability, using actors, STM, etc. We will discuss the shortcomings and dangers of each approach, and we will compare different programming languages and how they choose to tackle or ignore concurrency.

Transcript of Concurrency and Multithreading Demystified - Reversim Summit 2014

Page 1: Concurrency and Multithreading Demistified - Reversim Summit 2014

Reversim Summit 2014: Concurrency and Multi-Threading Demystified!

Haim Yadid - Performize-IT

Page 2: Concurrency and Multithreading Demistified - Reversim Summit 2014

About Me: Haim Yadid

•21 years of SW development experience
•Performance expert
•Consulting R&D groups
•Training: Java Performance Optimization

Page 3: Concurrency and Multithreading Demistified - Reversim Summit 2014
Page 4: Concurrency and Multithreading Demistified - Reversim Summit 2014

Moore’s Law

•Number of transistors doubles roughly every two years
•CPU frequency -> stalled
•Performance boost through parallelism

Page 5: Concurrency and Multithreading Demistified - Reversim Summit 2014

Yes… But

•Performance is not a problem anymore
•We prefer commodity hardware
•We have Hadoop and Big Data!!!
•Hardware is cheap
•Several processes will do
•My programming language will protect me

Page 6: Concurrency and Multithreading Demistified - Reversim Summit 2014

!Concurrency

Page 7: Concurrency and Multithreading Demistified - Reversim Summit 2014

Concurrency

•Decomposition of your program into independently executing processes
•About structure
•About design
•It is not something you code; it is something you architect

Page 8: Concurrency and Multithreading Demistified - Reversim Summit 2014

Parallelism

•Simultaneous execution of (possibly related) computations
•On different cores
•About execution and scheduling
•Not about architecture

Page 9: Concurrency and Multithreading Demistified - Reversim Summit 2014

Worker

•Someone who is capable of doing an efficient job
•Hard working
•Can execute
•Will do it with pleasure

Sid

Page 10: Concurrency and Multithreading Demistified - Reversim Summit 2014

State

•What is in Sid’s mind?
•What is in the environment?
•At a certain point in time

Pile of sand

Page 11: Concurrency and Multithreading Demistified - Reversim Summit 2014

Task

•A unit of work scheduled to Sid
•Possibly code
•Data
•Means to communicate with Sid
•E.g. Java: Callable

Task

Page 12: Concurrency and Multithreading Demistified - Reversim Summit 2014

Our World

Page 13: Concurrency and Multithreading Demistified - Reversim Summit 2014
Page 14: Concurrency and Multithreading Demistified - Reversim Summit 2014

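Liveness Problems

•Deadlock
•Starvation
•Waiting for a train that will never come

package main

import (
    "fmt"
    "time"
)

func count() {
    for i := 0; i < 1000; i++ {
        fmt.Println(i)
        time.Sleep(10 * time.Millisecond)
    }
}

func main() {
    go count()
    time.Sleep(3000 * time.Millisecond)
    for {} // busy loop that never yields; on Go's cooperative scheduler of the time (before 1.14) it can starve count()
}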

Go

GOMAXPROCS=2

Page 15: Concurrency and Multithreading Demistified - Reversim Summit 2014

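Data Races

•Inopportune interleaving
•Stale values
•Losing updates
•Infinite loops

import java.util.HashSet;

class Foo {
    private HashSet<Object> h = new HashSet<>();

    // Not thread-safe: contains() and add() are two separate steps,
    // so two threads can both see "absent" and both return true.
    boolean introduceNewVal(Object v) {
        if (!h.contains(v)) {
            h.add(v);
            return true;
        }
        return false;
    }
}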

Java
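
One possible way to close the check-then-act gap in introduceNewVal is to make the test and the insertion a single atomic step. The sketch below swaps in a concurrent set obtained from ConcurrentHashMap.newKeySet(), which is not on the slide; the class name SafeFoo is hypothetical.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical thread-safe variant of Foo (not from the talk).
class SafeFoo {
    private final Set<Object> seen = ConcurrentHashMap.newKeySet();

    // add() checks and inserts atomically and reports whether v was new.
    boolean introduceNewVal(Object v) {
        return seen.add(v); // true only for the one caller that actually added v
    }
}
```

Because the decision is made inside add(), at most one thread gets true for a given value.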

Page 16: Concurrency and Multithreading Demistified - Reversim Summit 2014

Performance

•Communication overhead
•Contention
•Imbalanced task distribution
•False sharing

val v = Vector(…)
v.par.map(_ + 1)

Scala

Page 17: Concurrency and Multithreading Demistified - Reversim Summit 2014

What’s wrong here?

•incrementX is accessed from thread X
•incrementY is accessed from thread Y
•An aggregator thread will read both

class T {
    volatile int x = 0;
    volatile int y = 0;

    long incrementX() { return ++x; }
    long incrementY() { return ++y; }
}

False sharing: cache coherency -> both fields hit the same cache line

Java

Page 18: Concurrency and Multithreading Demistified - Reversim Summit 2014
Page 19: Concurrency and Multithreading Demistified - Reversim Summit 2014

Sid Life Cycle

•Creation
•Destruction

Page 20: Concurrency and Multithreading Demistified - Reversim Summit 2014

Heavyweight Sid

•Threads
•Thread pools/Executors: single threaded / fixed size / bounded min..max / unbounded
•Fork-join pools (work stealing)
•Storm bolts

Page 21: Concurrency and Multithreading Demistified - Reversim Summit 2014

Lightweight Sid

•Sid is not a thread but is scheduled onto a thread
•Green threads
•Actors (Scala/Akka)
•Agents (Clojure)
•Goroutines

Page 22: Concurrency and Multithreading Demistified - Reversim Summit 2014

Communication

•BlockingQueues
•Futures and promises, CompletableFuture
•Deques
•<- (Go channels)
•! (Scala actors)

Page 23: Concurrency and Multithreading Demistified - Reversim Summit 2014

Serial Execution

•Three queries to the DB are executed
•In serial
•Long response time

doGet(req, resp) {
    rs1 = runQuery1();
    rs2 = runQuery2();
    rs3 = runQuery3();
    resp.write(mergeResults(rs1, rs2, rs3));
}

Pseudo(Java)

Can be parallelized

Page 24: Concurrency and Multithreading Demistified - Reversim Summit 2014

Create Threads

•Run three queries in parallel…
•but:

doGet(req, resp) {
    q1 = new Query(); new Thread(q1).start();
    q2 = new Query(); new Thread(q2).start();
    q3 = new Query(); new Thread(q3).start();

    // results are read without waiting for the threads to finish
    rs1 = q1.getRs();
    rs2 = q2.getRs();
    rs3 = q3.getRs();
    resp.write(mergeResults(rs1, rs2, rs3));
}

Pseudo(Java)

Thread leak + thread creation overhead + data races

Page 25: Concurrency and Multithreading Demistified - Reversim Summit 2014

Thread Pool

doGet(req, resp) {
    ExecutorService e = Executors.newFixedThreadPool(3); // a new pool on every request
    ArrayList tasks = …;
    tasks.add(new QueryTask(Query1)); // … repeated for all 3 queries
    List<Future<Integer>> fs;
    fs = e.invokeAll(tasks); // invoke all in parallel
    ArrayList results = …;
    for (Future<Integer> f : fs) { // collect results
        if (f.isDone()) {
            results.add(f.get());
        }
    }
    resp.write(mergeResults(results));
}

Pseudo(Java)

Thread pool leak + thread pool creation overhead

Page 26: Concurrency and Multithreading Demistified - Reversim Summit 2014

Thread Pool 2

static ExecutorService e = Executors.newFixedThreadPool(3); // one pool shared by all requests

doGet(req, resp) {
    ArrayList tasks = …;
    tasks.add(new QueryTask(Query1)); // … repeated for all 3 queries
    List<Future<Integer>> fs;
    fs = e.invokeAll(tasks); // invoke all in parallel

    ArrayList results = …;
    for (Future<Integer> f : fs) { // collect results
        if (f.isDone()) {
            results.add(f.get());
        }
    }
    resp.write(mergeResults(results));
}

Pseudo(Java)

Size? / Share the thread pool? / Name the thread pool threads
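
A runnable sketch of the shared-pool idea, under some assumptions: the names QueryRunner, POOL and runAll are hypothetical, the queries are plain Callable<Integer> values standing in for the slide's QueryTask, and the pool size of 3 is simply copied from the slide rather than tuned. It also shows one way to name the pool threads through a ThreadFactory.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

class QueryRunner {
    private static final AtomicInteger THREAD_ID = new AtomicInteger();

    // One pool shared by all requests. Threads are named so they are easy to
    // spot in thread dumps, and daemon so they do not block JVM exit.
    static final ExecutorService POOL = Executors.newFixedThreadPool(3, r -> {
        Thread t = new Thread(r, "query-pool-" + THREAD_ID.incrementAndGet());
        t.setDaemon(true);
        return t;
    });

    // Runs all queries in parallel and returns their results in submission order.
    static List<Integer> runAll(List<Callable<Integer>> queries) {
        List<Integer> results = new ArrayList<>();
        try {
            for (Future<Integer> f : POOL.invokeAll(queries)) { // blocks until all are done
                results.add(f.get());
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        }
        return results;
    }
}
```

Since invokeAll returns only after every task has completed, the isDone() check from the slide is not needed here.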

Page 27: Concurrency and Multithreading Demistified - Reversim Summit 2014

Same Example With Go

func execQuery(query string, c chan *Row) {
    c <- db.Query(query)
}

func doGet(req, resp) {
    c := make(chan *Row)
    go execQuery(query1, c)
    go execQuery(query2, c)
    go execQuery(query3, c)
    for i := 0; i < 3; i++ {
        combineRs(<-c)
    }
}

Pseudo(Go)

Page 28: Concurrency and Multithreading Demistified - Reversim Summit 2014
Page 29: Concurrency and Multithreading Demistified - Reversim Summit 2014

State Management

•Eventually we need to have state
•It is easy to deal with state when we have one Sid
•But what happens when there are several?
•We have three approaches
•Most people are familiar with only one

Page 30: Concurrency and Multithreading Demistified - Reversim Summit 2014

Handling State

•Shared mutability
•Shared immutability
•Isolated mutability

Page 31: Concurrency and Multithreading Demistified - Reversim Summit 2014
Page 32: Concurrency and Multithreading Demistified - Reversim Summit 2014

Shared Mutability

•Multiple Sids access the same data
•Mutate the state
•Easily exposed to concurrency hazards

Page 33: Concurrency and Multithreading Demistified - Reversim Summit 2014

Visibility

•Is a change made by sid1 visible to sid2?
•Not so simple
•Caches
•Compiler reordering

•Solutions
•volatile keyword
•Memory model (happens-before)

[Figure: memory hierarchy: CPU, registers, L1 cache, L2 cache, L3 cache, memory]
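
A small sketch of the volatile solution, assuming a hypothetical StopFlag class: one thread spins on a flag while another sets it. With volatile, the write in stop() happens-before the later read in spin(); without it, the spinning thread may never observe the change.

```java
class StopFlag {
    // volatile: a write by one thread is visible to later reads by other threads
    private volatile boolean stopped = false;
    private long iterations = 0; // touched only by the spinning thread

    void stop() { stopped = true; }

    // Spins until another thread calls stop(); returns the iteration count.
    long spin() {
        while (!stopped) {
            iterations++;
        }
        return iterations;
    }

    // Starts a spinner, stops it from the calling thread, and reports
    // whether the spinner noticed the flag and terminated.
    static boolean demo() {
        StopFlag flag = new StopFlag();
        Thread worker = new Thread(flag::spin, "spinner");
        worker.setDaemon(true);
        worker.start();
        try {
            Thread.sleep(50);
            flag.stop();
            worker.join(5000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            flag.stop();
        }
        return !worker.isAlive();
    }
}
```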

Page 34: Concurrency and Multithreading Demistified - Reversim Summit 2014

Atomicity

•What can be done in a single step?
•CAS constructs (compare and swap)
•AtomicInteger
•AtomicLong
•ConcurrentHashMap putIfAbsent
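
The two constructs named above can be sketched as follows. The class AtomicExamples and its method names are hypothetical; compareAndSet and putIfAbsent are the standard java.util.concurrent calls.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

class AtomicExamples {
    // Raises 'max' to 'candidate' using an explicit compare-and-swap retry loop.
    static void updateMax(AtomicInteger max, int candidate) {
        int current;
        do {
            current = max.get();
            if (candidate <= current) {
                return; // already large enough, nothing to write
            }
        } while (!max.compareAndSet(current, candidate)); // retry if another thread changed it
    }

    // putIfAbsent is one atomic check-and-insert; returns the value that ends up mapped.
    static String register(ConcurrentHashMap<String, String> map, String key, String value) {
        String previous = map.putIfAbsent(key, value);
        return previous != null ? previous : value;
    }
}
```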

Page 35: Concurrency and Multithreading Demistified - Reversim Summit 2014

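Atomicity is not Viral

•An (almost) real example
•A non-transactional database

•Balance per user

•Use atomicity to solve the problem

class User {
    private AtomicLong balance = …;

    // The increment is atomic, but increment + DB write is not:
    // two threads can write their results to the DB in the wrong order.
    long updateBalance(int diff) {
        long temp = balance.addAndGet(diff);
        setToDB(temp);
        return temp;
    }
}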

Java
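
One possible repair, sketched under assumptions: both steps are guarded by a single lock so that DB writes happen in the same order as the balance updates. The class LockedUser is hypothetical, and the LongConsumer parameter is only a stand-in for the slide's setToDB, whose real signature is not shown.

```java
import java.util.function.LongConsumer;

class LockedUser {
    private final Object lock = new Object();
    private long balance;                // guarded by 'lock'
    private final LongConsumer setToDB;  // hypothetical stand-in for the DB write

    LockedUser(long initial, LongConsumer setToDB) {
        this.balance = initial;
        this.setToDB = setToDB;
    }

    // Update and DB write form one critical section, so a later
    // update can never be overwritten in the DB by an earlier one.
    long updateBalance(int diff) {
        synchronized (lock) {
            balance += diff;
            setToDB.accept(balance);
            return balance;
        }
    }
}
```

The price is that the lock is held during the DB call, which can become a contention point for a busy user.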

Page 36: Concurrency and Multithreading Demistified - Reversim Summit 2014

Locking (Pessimistic)

•Synchronized
•Sync
•Mutex
•Reentrant lock
•Semaphores
•Synchronized collections

Page 37: Concurrency and Multithreading Demistified - Reversim Summit 2014

Beware Sync Collections

•Two synchronized operations in a row are not synchronized as a whole

if (syncedHashMap.get("b") != null) {
    syncedHashMap.put("b", 2);
}

Java
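
Two common ways to make such a compound operation atomic, as a sketch. The class and method names are hypothetical. The first option relies on the documented behaviour that a map returned by Collections.synchronizedMap uses itself as its lock; the second swaps in ConcurrentHashMap.replace.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class CompoundOps {
    // Option 1: hold the synchronized map's own lock across both calls.
    static void replaceIfPresentLocked(Map<String, Integer> syncedMap, String key, int value) {
        synchronized (syncedMap) {
            if (syncedMap.get(key) != null) {
                syncedMap.put(key, value);
            }
        }
    }

    // Option 2: use a single atomic operation of ConcurrentHashMap.
    static void replaceIfPresentAtomic(ConcurrentHashMap<String, Integer> map, String key, int value) {
        map.replace(key, value); // replaces only if the key is currently mapped
    }
}
```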

Page 38: Concurrency and Multithreading Demistified - Reversim Summit 2014

Hazards

•Fine-grained locking
•-> Deadlocks

•Coarse-grained locking
•-> Contention
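
A common defence against the deadlocks that fine-grained locking invites is a fixed global lock order. The sketch below is an illustration rather than anything from the talk: OrderedAccount is a hypothetical class, and the ids are assumed to be unique.

```java
class OrderedAccount {
    private final int id;   // unique id that defines the global lock order
    private long balance;

    OrderedAccount(int id, long balance) {
        this.id = id;
        this.balance = balance;
    }

    long balance() {
        synchronized (this) { return balance; }
    }

    // Always locks the account with the smaller id first, so two opposite
    // transfers cannot each hold one lock while waiting for the other.
    static void transfer(OrderedAccount from, OrderedAccount to, long amount) {
        OrderedAccount first = from.id < to.id ? from : to;
        OrderedAccount second = first == from ? to : from;
        synchronized (first) {
            synchronized (second) {
                from.balance -= amount;
                to.balance += amount;
            }
        }
    }
}
```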

Page 39: Concurrency and Multithreading Demistified - Reversim Summit 2014

STM (Optimistic)

•Software Transactional Memory
•Transactional semantics, ACI (not Durable):
•Atomic,
•Consistent and
•Isolated

•No deadlocks: on collision, retry!
•Clojure refs, Akka refs

STM has a performance problem when there are too many mutations.

Page 40: Concurrency and Multithreading Demistified - Reversim Summit 2014

STM

•Clojure: refs and dosync
•Scala: Refs and atomic
•Java: Multiverse

Page 41: Concurrency and Multithreading Demistified - Reversim Summit 2014

Multiverse STM Example

import java.util.Date;
import org.multiverse.api.references.*;
import static org.multiverse.api.StmUtils.*;

public class Account {

    private final TxnRef<Date> lastUpdate;
    private final TxnInteger balance;

    public Account(int balance) {
        this.lastUpdate = newTxnRef(new Date());
        this.balance = newTxnInteger(balance);
    }

Java

Page 42: Concurrency and Multithreading Demistified - Reversim Summit 2014

Multiverse STM Example

    public void incBalance(final int amount, final Date date) {
        atomic(() -> {
            balance.inc(amount);
            lastUpdate.set(date);

            if (balance.get() < 0) {
                throw new IllegalStateException("Not enough money");
            }
        });
    }
}

Java8

Page 43: Concurrency and Multithreading Demistified - Reversim Summit 2014

Multiverse STM Example

public static void transfer(final Account from, final Account to, final int amount) {
    atomic(() -> {
        Date date = new Date();

        from.incBalance(-amount, date);
        to.incBalance(amount, date);
    });
}

Java8

Retries ahead: beware of side effects
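
The retry hazard is not specific to STM. As an analogy using only the JDK (not Multiverse), AtomicLong.updateAndGet may also re-apply its function when an update collides, and its documentation asks for a side-effect-free function. A minimal sketch, with OptimisticBalance and the audit callback as hypothetical names:

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.LongConsumer;

class OptimisticBalance {
    private final AtomicLong balance = new AtomicLong();

    // The lambda given to updateAndGet may run several times under contention,
    // so it only computes the new value. The side effect runs once, afterwards.
    long deposit(long amount, LongConsumer audit) {
        long updated = balance.updateAndGet(current -> current + amount);
        audit.accept(updated); // hypothetical side effect, e.g. writing a log entry
        return updated;
    }
}
```

The same discipline applies inside an atomic block: anything that cannot be undone (I/O, messages, logging) is safer outside the retried section.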

Page 44: Concurrency and Multithreading Demistified - Reversim Summit 2014
Page 45: Concurrency and Multithreading Demistified - Reversim Summit 2014

Pure Immutability

•We have shared state
•But shared state is read-only (after construction)
•No concurrency issues
•No deadlocks
•No race conditions
•No stale values
•Optimal for cache

Page 46: Concurrency and Multithreading Demistified - Reversim Summit 2014

Support From Languages

•Functional languages favour immutability
•vals in Scala, Clojure
•final keyword in Java
•freeze method in Ruby

Page 47: Concurrency and Multithreading Demistified - Reversim Summit 2014

Immutable Object Example

The object cannot be changed after construction; all fields are final.

import java.util.HashSet;
import java.util.Set;

public final class MySet {
    private final Set<String> vals = new HashSet<String>();

    public MySet(String[] names) {
        for (String name : names) vals.add(name);
    }

    public boolean containsVal(String name) { … }
    …
}

Java

Page 48: Concurrency and Multithreading Demistified - Reversim Summit 2014

CopyOnWrite Collections

•Any change to the collection creates a new copy
•Safe
•Fast reads, without synchronisation
•Iteration is fast
•Iterators do not support remove(), set(), add()

Bad performance when the mutation rate is high
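
A short sketch of the snapshot behaviour using CopyOnWriteArrayList from java.util.concurrent. The class CopyOnWriteDemo and its method names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

class CopyOnWriteDemo {
    // Modifies the list while iterating over it and returns what the iterator saw.
    static List<String> snapshotIteration() {
        CopyOnWriteArrayList<String> list =
                new CopyOnWriteArrayList<String>(Arrays.asList("a", "b"));
        List<String> seen = new ArrayList<>();
        for (String s : list) {
            list.add(s + "!"); // copies the backing array; the running iterator is unaffected
            seen.add(s);
        }
        return seen; // only the elements present when iteration started
    }

    // Iterators are read-only snapshots, so remove() is rejected.
    static boolean iteratorRemoveUnsupported() {
        Iterator<String> it =
                new CopyOnWriteArrayList<String>(Arrays.asList("a")).iterator();
        it.next();
        try {
            it.remove();
            return false;
        } catch (UnsupportedOperationException expected) {
            return true;
        }
    }
}
```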

Page 49: Concurrency and Multithreading Demistified - Reversim Summit 2014

Persistent Collections

•Immutable collections
•Which are efficient
•Preserve the same O(_) characteristics as a mutable collection
•Shared structure
•Recursive definition
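
The simplest instance of "shared structure" and "recursive definition" is an immutable singly linked list. This is a deliberately minimal sketch with a hypothetical PList class; real persistent maps and vectors use tries, but the sharing principle is the same.

```java
// A minimal persistent (immutable) singly linked list.
final class PList<T> {
    final T head;
    final PList<T> tail; // null marks the end of the list
    final int size;

    private PList(T head, PList<T> tail) {
        this.head = head;
        this.tail = tail;
        this.size = 1 + (tail == null ? 0 : tail.size);
    }

    static <T> PList<T> of(T value) {
        return new PList<>(value, null);
    }

    // O(1): the new list reuses this list as its tail instead of copying it.
    PList<T> prepend(T value) {
        return new PList<>(value, this);
    }
}
```

Two lists built by prepending to the same base share that base, and since nothing is ever mutated the sharing is safe across threads.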

Page 50: Concurrency and Multithreading Demistified - Reversim Summit 2014

Persistent Trie

[Figure: a trie with a root and nodes A, C, D, E and F, with edges labelled 0, 1 and 2]

Page 51: Concurrency and Multithreading Demistified - Reversim Summit 2014

Persistent Trie

[Figure: the same trie with a second root and a second E node alongside the original nodes]

Page 52: Concurrency and Multithreading Demistified - Reversim Summit 2014

Example: Customization Cache

•A web application server
•Serving a complicated and customisable UI
•Each user has its own customization (potentially)
•A classic case for an immutable collection
•Low mutation rate
•High read rate

Page 53: Concurrency and Multithreading Demistified - Reversim Summit 2014

Example: Customization Cache

•Customization data is immutable
•The customization data HashMap is a persistent map
•The cache is represented by a single STM reference
•An update will fail if two writers perform it at once
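
A rough JDK-only approximation of this design, with two substitutions that should be read as assumptions: an AtomicReference with compare-and-set stands in for the STM reference, and a full copy of a HashMap stands in for the persistent map. The class CustomizationCache and the method tryPut are hypothetical.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicReference;

class CustomizationCache {
    // The whole cache is one immutable map behind a single reference.
    private final AtomicReference<Map<String, String>> ref =
            new AtomicReference<>(Collections.<String, String>emptyMap());

    // Lock-free read of the current snapshot.
    String get(String user) {
        return ref.get().get(user);
    }

    // Returns false if another writer replaced the map in the meantime.
    boolean tryPut(String user, String customization) {
        Map<String, String> old = ref.get();
        Map<String, String> copy = new HashMap<>(old); // full copy; a persistent map would share structure
        copy.put(user, customization);
        return ref.compareAndSet(old, Collections.unmodifiableMap(copy));
    }
}
```

A caller that gets false can re-read and try again, which is acceptable here because the mutation rate is low.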

Page 54: Concurrency and Multithreading Demistified - Reversim Summit 2014

Immutability and GC

•Immutability is great
•But: it generates a lot of objects
•For short-lived objects, the GC can cope with it
•Long-lived immutable objects/collections which change frequently may cause the GC to have low throughput and high pause times

Page 55: Concurrency and Multithreading Demistified - Reversim Summit 2014
Page 56: Concurrency and Multithreading Demistified - Reversim Summit 2014

Isolated Mutability

•No shared state
•Each Sid has its own pile of sand
•Message passing between Sids
•Prefer passing immutable objects
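
The idea can be sketched without any actor library: one thread owns the state and everyone else talks to it through a mailbox. This is only an actor-like illustration built on a BlockingQueue, not Akka or Scala actors; CounterActor and Msg are hypothetical names.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.LinkedBlockingQueue;

// Actor-like counter: only its own thread ever touches 'count'.
final class CounterActor {
    // Immutable message: add a delta and, if 'reply' is set, report the total.
    static final class Msg {
        final int delta;
        final CompletableFuture<Integer> reply; // null for plain increments
        Msg(int delta, CompletableFuture<Integer> reply) {
            this.delta = delta;
            this.reply = reply;
        }
    }

    private final BlockingQueue<Msg> mailbox = new LinkedBlockingQueue<>();
    private int count = 0; // isolated state (Sid's own pile of sand)

    CounterActor() {
        Thread t = new Thread(this::loop, "counter-actor");
        t.setDaemon(true);
        t.start();
    }

    private void loop() {
        try {
            while (true) {
                Msg m = mailbox.take(); // messages are processed one at a time
                count += m.delta;
                if (m.reply != null) {
                    m.reply.complete(count);
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // interruption stops the actor
        }
    }

    void add(int delta) {
        mailbox.add(new Msg(delta, null));
    }

    CompletableFuture<Integer> total() {
        CompletableFuture<Integer> f = new CompletableFuture<>();
        mailbox.add(new Msg(0, f));
        return f;
    }
}
```

No lock guards count because no second thread can reach it; callers only ever see values handed back through the reply future.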

Page 57: Concurrency and Multithreading Demistified - Reversim Summit 2014

Isolated Mutability

•JavaScript workers
•Ruby/NodeJS multi-process
•Actors (Scala, Erlang)
•Agents (Clojure)
•Goroutines/channels (Go)

Page 58: Concurrency and Multithreading Demistified - Reversim Summit 2014

[Figure: several actors, each holding its own isolated mutable state]

Page 59: Concurrency and Multithreading Demistified - Reversim Summit 2014
Page 60: Concurrency and Multithreading Demistified - Reversim Summit 2014

Building a Monitoring System

•Monitoring system (e.g. Nagios)
•~100k monitors
•Running periodically
•Each one has a state
•Consumers are able to query the state
•Some monitors may affect other monitors' state

Page 61: Concurrency and Multithreading Demistified - Reversim Summit 2014

Components

•MonitorActor (two actors)
•HostActor
•MonitorCache
•SchedulerActor
•QueryActor
•UpdateActor

Page 62: Concurrency and Multithreading Demistified - Reversim Summit 2014

Monitor Actors

•MonitorStateActor (MSA)
•Always ready to be queried
•State updated by messages from the MRA
•Stateless

•MonitorRecalculateActor (MRA)
•May be recalculating and not responsive
•Stateful
•Supervises the MSA

Page 63: Concurrency and Multithreading Demistified - Reversim Summit 2014

MonitorsCache

•An immutable cache that holds all actor refs
•Single view of the world
•Used by the SchedulerActor and the QueryActor
•May have several objects managed by STM

Page 64: Concurrency and Multithreading Demistified - Reversim Summit 2014

[Figure: a monitor actor pair, MSA (status/state) and MRA, together with the Query Actor, the Scheduler and the MonitorsCache]

Page 65: Concurrency and Multithreading Demistified - Reversim Summit 2014

Further Reading

•Java Concurrency in Practice / Brian Goetz
•Effective Akka / Jamie Allen
•Clojure High Performance Programming / Shantanu Kumar
•Programming Concurrency on the JVM: Mastering Synchronization, STM, and Actors / Venkat Subramaniam

Page 66: Concurrency and Multithreading Demistified - Reversim Summit 2014

Thanks + Q&A + Contact Me

© Copyright Performize-IT LTD.

http://il.linkedin.com/in/haimyadid


www.performize-it.com

blog.performize-it.com

https://github.com/lifey

@lifeyx