Groovy concurrency

42
Groovy Concurrency Alex Miller Revelytix

description

Using Groovy? Got lots of stuff to do at the same time? Then you need to take a look at GPars (“Jeepers!”), a library providing support for concurrency and parallelism in Groovy. GPars brings powerful concurrency models from other languages to Groovy and makes them easy to use with custom DSLs: - Actors (Erlang and Scala) - Dataflow (Io) - Fork/join (Java) - Agent (Clojure agents) In addition to this support, GPars integrates with standard Groovy frameworks like Grails and Griffon. Background, comparisons to other languages, and motivating examples will be given for the major GPars features.

Transcript of Groovy concurrency

Page 1: Groovy concurrency

Groovy ConcurrencyAlex MillerRevelytix

Page 2: Groovy concurrency

• Groovy library - http://gpars.codehaus.org/

• Open-source

• Great team: Václav Pech, Dierk König, Alex Tkachman, Russel Winder, Paul King

• Integration with Griffon, Grails, etc

GParsJeepers!

Page 3: Groovy concurrency

Concurrency Patterns

• Asynchronous processing

• Computational processing

• Managing mutable state

• Actor messaging

• Agents

• Dataflow variables

Page 4: Groovy concurrency

Asynchronous Processing

Improve perceived performance by pushing I/O-bound work onto a different thread.

Page 5: Groovy concurrency

Serialized Processingmain

Retrieve user1 tweets

Retrieve user2 tweets

Twitter

Process tweets

Twitter

Page 6: Groovy concurrency

Asynchronous Processing

main background 1

Request user1 tweets

Process tweets

background 2

Retrieve user1 tweets Retrieve

user2tweets

Request user2 tweets

Page 7: Groovy concurrency

GParsExecutorsPool• Leverages java.util.concurrent.Executor

• void execute(Runnable command)

• Decouple task submission from task execution

• GParsExecutorsPool.withPool() blocks add:

• async() - takes closure, returns Future

• callAsync() - same, but takes args

Page 8: Groovy concurrency

Future

• Future represents the result of an asynchronous computation

• Poll for results, block till available, etc

• Cancel, check for cancellation

Page 9: Groovy concurrency

GParsExecutorsPool.withPool { def retrieveTweets = { query -> recentTweets(api, query) }

trends.each { retrieveTweets.callAsync(it.query) }}

trends.each { retrieveTweets(api, it.query)}

Page 10: Groovy concurrency

Computational Processing

Use multiple cores to reduce the elapsed time of some cpu-bound work.

Page 11: Groovy concurrency

Executor PoolsSame pools used for asynch are also good for transaction processing. Leverage more cores to process the work.

T1

Task 1

Task 2

Task 3

Task 4

Task 5

Task 6

T1

Task 1

Task 4

T2

Task 2

Task 5

T3

Task 3

Task 6

Page 12: Groovy concurrency

Work pools and Queues

takeput

queue

Page 13: Groovy concurrency

Fork/Join Pools

• DSL for new JSR 166y construct (JDK 7)

• DSL for ParallelArray over fork/join

• DSL for map/reduce and functional calls

Page 14: Groovy concurrency

Fork/Join Pools

takeput

queue

takeput

queue

takeput

queue

Page 15: Groovy concurrency

Fork/Join vs Executors• Executors (GParsExecutorsPool) tuned for:

• small number of threads (4-16)

• medium-grained I/O-bound or CPU-bound tasks

• Fork/join (GParsPool) tuned for:

• larger number of threads (16-64)

• fine-grained computational tasks, possibly with dependencies

Page 16: Groovy concurrency

Divide and Conquer

Task: Find max value in an array

Page 17: Groovy concurrency

Divide and Conquer

Page 18: Groovy concurrency

Divide and Conquer

Page 19: Groovy concurrency

ParallelArray

• Most business problems don’t really look like divide-and-conquer

• Common approach for working in parallel on a big data set is to use an array where each thread operates on independent chunks of the array.

• ParallelArray implements this over fork/join.

Page 20: Groovy concurrency

ParallelArrayOrder2/15

Order5/1

Order1/15

Order3/15

Order2/1

Order4/1

Starting ordersw/due dates

Filter only overdue Order2/15

Order1/15

Order2/1

Map to days overdue

Avg

45 3015

30 days

Page 21: Groovy concurrency

ParallelArray usage

def isLate = { order -> order.dueDate > new Date() }def daysOverdue = { order -> order.daysOverdue() }

GParsPool.withPool { def data = createOrders() .filter(isLate) .map(daysOverdue)

println("# overdue = " + data.size()) println("avg overdue by = " + (data.sum() / data.size()))}

Page 22: Groovy concurrency

Managing Mutable State

Page 23: Groovy concurrency

The Java

concurrency

bible

Page 24: Groovy concurrency

• It’s the mutable state, stupid.

• Make fields final unless they need to be mutable.

• Immutable objects are automatically thread-safe.

• Encapsulation makes it practical to manage the complexity.

• Guard each mutable variable with a lock.

• Guard all variables in an invariant with the same lock.

• Hold locks for the duration of compound actions.

• A program that accesses a mutable variable from multiple threads without synchronization is a broken program.

• Don’t rely on clever reasoning about why you don’t need to synchronize.

• Include thread safety in the design process - or explicitly document that your class is not thread-safe.

• Document your synchronization policy.

JCIP

Page 25: Groovy concurrency

• It’s the mutable state, stupid.

• Make fields final unless they need to be mutable.

• Immutable objects are automatically thread-safe.

• Encapsulation makes it practical to manage the complexity.

• Guard each mutable variable with a lock.

• Guard all variables in an invariant with the same lock.

• Hold locks for the duration of compound actions.

• A program that accesses a mutable variable from multiple threads without synchronization is a broken program.

• Don’t rely on clever reasoning about why you don’t need to synchronize.

• Include thread safety in the design process - or explicitly document that your class is not thread-safe.

• Document your synchronization policy.

JCIP

Page 26: Groovy concurrency

The problem with shared state concurrency...

Page 27: Groovy concurrency

The problem with shared state concurrency...

is the shared state.

Page 28: Groovy concurrency

Actors

• No shared state

• Lightweight processes

• Asynchronous, non-blocking message-passing

• Buffer messages in a mailbox

• Receive messages with pattern matching

• Concurrency model popular in Erlang, Scala, etc

Page 29: Groovy concurrency

Actors

Page 30: Groovy concurrency

start

Actors

Page 31: Groovy concurrency

start

send

Actors

Page 32: Groovy concurrency

receive

Actors

Page 33: Groovy concurrency

Actor Example

class Player extends AbstractPooledActor { String name def random = new Random()

void act() { loop { react { // player replies with a random move reply Move.values()[random.nextInt(Move.values().length)] } } }}

Page 34: Groovy concurrency

Agent• Like Clojure Agents

• Imagine wrapping an actor around mutable state...

• Messages are:

• functions to apply to (internal) state OR

• just a new value

• Can always safely retrieve value of agent

Page 35: Groovy concurrency

def stuff = new Agent<List>([])

// thread 1stuff.send( {it.add(“pizza”)} )

// thread 2 stuff.send( {it.add(“nachos”)} )

println stuff.val

Simple example...• Use generics to specify data type of wrapped data

• Pass initial value on construction

• Send function to modify state

• Read value by accessing val

Page 36: Groovy concurrency

class BankAccount extends Agent<Long> { def BankAccount() { super(0) } private def deposit(long deposit) { data += deposit } private def withdraw(long deposit) { data -= deposit }}

final BankAccount acct = new BankAccount()

final Thread atm2 = Thread.start { acct << { withdraw 200 }}

final Thread atm1 = Thread.start { acct << { deposit 500 }}

[atm1,atm2]*.join()println "Final balance: ${ acct.val }"

Page 37: Groovy concurrency

Dataflow Variables

• A variable that computes its value when its’ inputs are available

• Value can be set only once

• Data flows safely between variables

• Ordering falls out automatically

• Deadlocks are deterministic

Page 38: Groovy concurrency

• Logical tasks

• Scheduled over a thread pool, like actors

• Communicate with dataflow variables

Dataflow Tasks

Page 39: Groovy concurrency

final def x = new DataFlowVariable()final def y = new DataFlowVariable()final def z = new DataFlowVariable()

task { z << x.val + y.valprintln “Result: ${z.val}”

}

task {x << 10

}

task {y << 5

}

Dataflow example

Page 40: Groovy concurrency

• Bind handlers - can be registered on a dataflow variable to execute on bind

• Dataflow streams - thread-safe unbound blocking queue to pass data between tasks

• Dataflow operators - additional abstraction, formalizing inputs/outputs

Dataflow support

Page 41: Groovy concurrency

Sir Not-Appearing-In-This-Talk

• Caching - memoize

• CSP

• GPU processing

Page 42: Groovy concurrency

Alex Miller

Twitter: @puredanger

Blog: http://tech.puredanger.com

Code:

http://github.com/puredanger/gpars-examples

Conference: Strange Loop

http://strangeloop2010.com