Clojure's take on concurrency
Yoav Rubin
About me
• Software engineer in IBM Research, Haifa
  – Worked on
    • From large scale products to small scale research projects
  – Domains
    • Software tools
    • Development environments
    • Simplified programming
  – Technologies
    • Frontend engineering
    • Java, Clojure
• Lecturer for the course “Functional programming on the JVM” at Haifa University
{:name "Yoav Rubin", :email [email protected], :blog http://yoavrubin.blogspot.com, :twitter @yoavrubin}
Agenda
• The problem of concurrency
• Reference types
• Pendings
Why is concurrency a problem?
Mutability
What is to mutate
• What x = x+1 actually is:
  – LOAD R10 X
  – ADDI R10 1
  – STORE R10 X
What is to mutate
• Thread 1: x = x+1
  – LOAD R10 X
  – ADDI R10 1
  – STORE R10 X
• Thread 2: x = x+5
  – LOAD R10 X
  – ADDI R10 5
  – STORE R10 X
What will happen?
What is to mutate
• Thread 1: x = x+1
  – LOAD R10 X
  – ADDI R10 1
  – STORE R10 X
• Thread 2: x = x+5
  – LOAD R10 X
  – ADDI R10 5
  – STORE R10 X
x is increased by 1 !!! (thread 1 loaded x before thread 2 stored it, and thread 1’s STORE landed last)
What is to mutate
• Thread 1: x = x+1
  – LOAD R10 X
  – ADDI R10 1
  – STORE R10 X
• Thread 2: x = x+5
  – LOAD R10 X
  – ADDI R10 5
  – STORE R10 X
x is increased by 5 !!! (thread 2 loaded x before thread 1 stored it, and thread 2’s STORE landed last)
What is to mutate
• Thread 1: x = x+1
  – LOAD R10 X
  – ADDI R10 1
  – STORE R10 X
• Thread 2: x = x+5
  – LOAD R10 X
  – ADDI R10 5
  – STORE R10 X
x is increased by 6 (the correct result, when one thread finishes before the other starts)
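The interleavings above can be replayed by hand. The following is only a single-threaded sketch, not real concurrency: r1 and r2 are hypothetical names standing in for each thread's private copy of register R10, and the function names are made up for illustration.

```clojure
(defn lost-update
  "Thread 1 loads x, thread 2 then runs completely, thread 1 finishes last."
  [x]
  (let [r1 x          ; thread 1: LOAD R10 X
        r2 x          ; thread 2: LOAD R10 X
        r2 (+ r2 5)   ; thread 2: ADDI R10 5
        x  r2         ; thread 2: STORE R10 X
        r1 (+ r1 1)   ; thread 1: ADDI R10 1
        x  r1]        ; thread 1: STORE R10 X (overwrites thread 2's write)
    x))

(defn one-after-the-other
  "Thread 1 runs to completion before thread 2 starts."
  [x]
  (let [r1 x, r1 (+ r1 1), x r1    ; thread 1: LOAD, ADDI, STORE
        r2 x, r2 (+ r2 5), x r2]   ; thread 2 sees the updated x
    x))

(lost-update 10)          ;=> 11
(one-after-the-other 10)  ;=> 16
```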
Getting to the right result
• The first two cases introduced a race condition
  – Threads racing to perform a write to the same place in memory
• Can be prevented with a critical section
Critical section
• A marker that does not allow a thread to enter a code segment as long as another thread is there
Critical section
• It is up to the developer to define it
  – Using locks
• Need to get the lock of the critical section before entering it
• Need to release the lock of the critical section after finishing with it
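As a rough illustration of a lock-guarded critical section, the sketch below uses Clojure's `locking` macro, which acquires an object's monitor on entry and releases it on exit. The function name count-with-lock and the choice of a volatile cell as the unprotected mutable state are assumptions made for this example.

```clojure
(defn count-with-lock
  "Starts `workers` threads, each adding 1 to a shared cell `per-worker` times."
  [workers per-worker]
  (let [lock    (Object.)          ; the lock of the critical section
        counter (volatile! 0)      ; mutable cell; vswap! alone is not atomic
        add!    (fn [n]
                  (locking lock    ; get the lock, release it when done
                    (vswap! counter + n)))
        tasks   (doall (repeatedly workers
                                   #(future (dotimes [_ per-worker] (add! 1)))))]
    (run! deref tasks)             ; wait for every worker to finish
    @counter))

(count-with-lock 8 1000) ;=> 8000
```

Without the `locking` form the read-modify-write inside `vswap!` could interleave exactly as in the LOAD / ADDI / STORE slides.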
The trouble with locks
• Introduce a trade-off between improving performance and reducing complexity
• More complexity => more bugs
• Concurrency bugs are:
  – Harder to find
  – Harder to replicate
  – Harder to debug
  – Harder to solve
The trouble with locks
• To properly use locks we need to have a complete understanding of everything that happens in the program
  – Rarely possible, and when it is, only for top individuals
  – Hardly scalable
The trouble with locks
• If the entire program is locked, there’s no complexity related to lock management
  – But we suffer from poorer performance due to no concurrency
• If nothing is locked => it is up to Murphy
Managing locks
• What to lock
• When to lock
  – What’s the right time for a specific lock
  – What’s the right order for a series of locks
• When to unlock
  – The right time for a specific lock
  – The right order for a series of locks
What to lock?
• Pessimistic approach – any accessed value, both read and write
• Optimistic approach – any value we try to write to
  – What happens if a read value is used in future writes?
    • We cannot trust writes that are based on an unlocked read
When to lock?
• Grab the lock as soon as possible
  – Prevents others from taking it
• Postpone the locking as much as possible
  – Less effect on the rest of the threads
When to unlock?
• The first release defines the end of the critical section
• Release lock(x) after writing to x
• Release lock(x,y,z) after the writes to x, y, z
Grabbing several locks
• In what order?
  – Ordered vs unordered
• What to do if we can’t grab them all
  – Keep what we hold and retry
  – Release what we have and restart
Unordered + keeping the locks
Thread 1:
• Need locks A and B
• (grab A)
• Wait till B is unlocked
Thread 2:
• Need locks A and B
• (grab B)
• Wait till A is unlocked
Deadlock!!!
Unordered + release the locks
Thread 1:
• Need locks A and B
• (grab A)
• (can’t get B)
• (free A)
Thread 2:
• Need locks A and B
• (grab B)
• (can’t get A)
• (free B)
Both threads keep repeating these steps without progress: livelock!!!
Ordered
• Need to decide on strict order
• Need to enforce it throughout the software
• Need to enforce it on components that interact with the software
• Need to adapt to the order that was used in other components
• Need to update all of the places when there’s a change that affects the order
  – e.g., in case of refactoring
    • Both code structure and elements’ names
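A minimal sketch of the ordered approach, assuming the strict order is "lowest :id first". The account map layout and the names account, transfer! and opposite-transfers are hypothetical and exist only for this example.

```clojure
(defn account [id balance]
  {:id id :lock (Object.) :balance (volatile! balance)})

(defn transfer! [from to amount]
  ;; Always lock the account with the smaller :id first, whichever way the
  ;; money flows, so two opposite transfers cannot wait on each other.
  (let [[first-acc second-acc] (sort-by :id [from to])]
    (locking (:lock first-acc)
      (locking (:lock second-acc)
        (vswap! (:balance from) - amount)
        (vswap! (:balance to) + amount)))))

(defn opposite-transfers
  "Runs n transfers a->b and n transfers b->a on two threads."
  [n]
  (let [a  (account 1 100)
        b  (account 2 100)
        t1 (future (dotimes [_ n] (transfer! a b 1)))
        t2 (future (dotimes [_ n] (transfer! b a 1)))]
    @t1
    @t2
    [@(:balance a) @(:balance b)]))

(opposite-transfers 1000) ;=> [100 100]
```

The cost listed in the bullets is visible even here: every piece of code that touches two accounts has to know about, and follow, the same ordering rule.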
Who grabs the lock
• Need to prioritize the locking order
  – Need to update the priority based on the application’s state
• Otherwise we may cause starvation
  – Thread A waits for a lock on X, other threads keep on grabbing that lock before thread A succeeds
Debugging concurrent software may introduce heisenbugs
Writing correct concurrent software is very complicated
Complexity causes bugs
Known unknowns
Writing correct concurrent software is always harder than you think
The delta between how hard it is and how hard you think it is turns into bugs that are almost impossible to solve
Unknown unknowns
Why does it happen
• Locks have the same abstraction level as types have in assembly
  – That is, none
• Types are used to allow correct interpretation of the areas in the memory
  – Semantic aspect of the software
• Locks are used to allow correct access to areas in the memory
  – Syntactic aspect of the software
• Lower level constructs mixed with a higher level language
What’s the solution
• Types allow defining semantic interpretation of memory areas
  – Each access to a memory area has to pass through the type information
• Need to find a mechanism that would define concurrency semantics for areas in the memory
  – So each access to the memory area would pass through the concurrency semantics information
What’s the solution
• Add another level of indirection
• Manage changes based on concurrency semantics
• Reference types
[Diagram: the element = concurrency semantics -> type info -> memory]
Reference types
[Diagram: symbol -> concurrency semantics -> type info -> memory (as opposed to symbol -> type info -> memory)]
What happens when changing?
[Diagram: the symbol’s concurrency semantics is pointed at new type info and other memory; the previous type info and memory may be reclaimed by the GC]
Clojure epochal model
[Diagram: a symbol that has concurrency semantics moves through State 1 -> State 2 -> State 3, each step made by a function]
• State: the value of an identity at a given time
• State can be changed by applying a function to an identity
Reference types
• Providing concurrency semantics as part of the language
  – The developer needs to decide what’s the right concurrency semantics of the element
    • Just like deciding what’s the type of the element
• When combined with immutability, it almost eliminates the risk caused by concurrency
Declaring the semantics
as opposed to
implementing it (using locks)
Concurrency semantics
• The change is to be performed on:
  – The current thread (synchronous)
  – Another thread (asynchronous)
• A change in the element’s state can be:
  – Visible to other threads (shared)
  – Not visible to other threads (isolated)
• A change in the element’s state can be:
  – Coordinated with changes at other elements
  – Not coordinated with changes at other elements
Concurrency semantics
Reference type | Change visibility | Coordinated | Execution
var            | isolated          | no          | synchronous
atom           | shared            | no          | synchronous
ref            | shared            | yes         | synchronous
agent          | shared            | no          | asynchronous
Some combinations of these properties have no meaning.
Agent
• A value that can be shared between threads
• The change is not coordinated with other elements
• Execution is performed in an asynchronous manner
  – By a different thread
Agent
• Creation:
  – (agent <value>)
  – (def a (agent <value>))
• Reading
  – (deref <the-agent>)
  – @<the-agent>
Agent - activation
• Activation:
  – (send a-name func args)
    • To be executed on a predefined, fixed-size thread pool
  – (send-off a-name func args)
    • For blocking / heavy functions; runs on a separate thread pool that grows as needed
• send and send-off return immediately
  – The return value is the agent
Agent - activation
• Agents are aware of transactions
• An agent can be activated within a transaction
  – send or send-off within dosync
  – The agents wait for the transaction to succeed before activating
    • To prevent multiple executions due to retries
Agent - waiting
• Agents’ actions are performed in an asynchronous fashion
  – We may reach a point in the program where we need their updated value
• We need to wait for the actions to complete
  – (await a+), for one or more agents
    • Though it may block forever
    • Returns nil
  – (await-for millis a+)
    • Waits for at most the given number of milliseconds
    • Returns logical false in case the return is due to the timeout, logical true otherwise
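A small sketch of sending actions to an agent and waiting for them. The agent names total and slow and the sleep durations are arbitrary choices for this example.

```clojure
(def total (agent 0))

(send total + 5)                       ; returns the agent immediately
(send-off total (fn [v]
                  (Thread/sleep 50)    ; stands in for blocking work
                  (* v 2)))

(await total)                          ; blocks until both actions have run
@total ;=> 10

(def slow (agent :idle))
(send-off slow (fn [_] (Thread/sleep 300) :done))

;; Gives up after 10 ms, long before the action finishes.
(def finished-in-time (await-for 10 slow))
finished-in-time ;=> false
```

Actions sent to one agent from the same thread run one at a time, in the order they were sent, which is why the result is (0 + 5) * 2.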
Error handling
• Agents are executed in a different thread than the one that created them
• In case of an error, the agent is in a FAILURE state
• Any further send would fail, reporting that same error
• Can be restarted by
  – (restart-agent <the-agent> new-state)
Error handling
• It is possible to set an error handling function on an agent
• The function is activated in case of an error
• (set-error-handler! <the-agent> <er-fn>)
  – The error handling function receives two arguments
    • The agent
    • The exception
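The failure and restart cycle could look roughly like the sketch below. The names ag and failures are made up for this example, and the polling loop is just one simple way to wait until the failure has been recorded (await would throw on a failed agent).

```clojure
(def failures (atom []))
(def ag (agent 10))

;; The handler receives the agent and the exception.
(set-error-handler! ag (fn [the-agent ex]
                         (swap! failures conj (class ex))))

(send ag (fn [v] (/ v 0)))             ; throws ArithmeticException on the agent's thread

(while (nil? (agent-error ag))         ; poll until the agent is marked as failed
  (Thread/sleep 10))

(def send-rejected?
  (try (send ag inc) false
       (catch RuntimeException _ true)))
send-rejected? ;=> true

(restart-agent ag 0)                   ; clears the failure, new state is 0
(send ag inc)
(await ag)
@ag ;=> 1
```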
Var
• A var’s value is visible in all threads
• We can change its value, but the change is visible only in the changing thread
• Use ‘def’ to create a var
• (var <the-var-name>) returns the var
  – Or use the reader macro #'<the-var-name>
    • #'a ;=> #'theNS/a
Var
• (def ^:dynamic a 8) to create a var that is re-bindable
• To rebind a var
  – The common way:
    • (binding [binding-pairs] <expression>)
  – Use set! within binding to re-bind the var to a new value
Var
• The much less used way to rebind a var
  – (with-bindings* {binding-map} <function> <args…>)
    • Binding-map pairs each var with its new value (var => newVal)
    • That’s where the reader macro #' becomes handy
  – (with-bindings {binding-map} <expression>)
Var
• It is also possible to change the root value of a var
  – The root value is the value exposed to all the threads
• (alter-var-root the-var f <args…>)
  – Note that the var’s value is the first argument to f
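The var operations above can be tried out with a sketch like this one. The var *level* and the helper current-level are hypothetical names; a raw java.lang.Thread is used on purpose, because it starts without the caller's thread-local bindings.

```clojure
(def ^:dynamic *level* 8)
(defn current-level [] *level*)

;; Rebinding is visible only on the current thread, for the extent of the body.
(def in-binding
  (binding [*level* 1]
    (set! *level* (inc *level*))       ; set! is allowed inside binding
    (current-level)))
in-binding ;=> 2

;; Another thread still sees the root value while the binding is in effect.
(def other-thread-saw (promise))
(binding [*level* 1]
  (doto (Thread. #(deliver other-thread-saw (current-level)))
    .start
    .join))
@other-thread-saw ;=> 8

;; The less common form takes a map keyed by the var itself.
(def with-map (with-bindings {#'*level* 5} (current-level)))
with-map ;=> 5

;; Changing the root value affects every thread; f receives the old value first.
(alter-var-root #'*level* + 2)
(current-level) ;=> 10
```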
Atom
• An atom’s value is shared between threads
• A change in an atom’s value is shared between threads
• The change is not coordinated with other Atoms
• The change is atomic – a single point in time
• Execution is synchronous
Atom
• Creation
  – (atom <value>)
  – (def a (atom <value>))
• Reading an atom’s value
  – (deref <the-atom>)
  – @<the-atom>
Atom
• (swap! atm func args)
  – The first argument of func is the pre-change value of the atom
    • A new value is created based on the function
• (reset! atm val)
  – Change the atom’s value to val
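A short sketch of swap! and reset!, including several threads updating one atom. The names count-with-atom and settings are hypothetical.

```clojure
(defn count-with-atom
  "Starts `workers` threads, each incrementing a shared atom `per-worker` times."
  [workers per-worker]
  (let [hits  (atom 0)
        tasks (doall (repeatedly workers
                                 #(future (dotimes [_ per-worker]
                                            (swap! hits inc)))))]
    (run! deref tasks)                 ; wait for every worker to finish
    @hits))

(count-with-atom 8 1000) ;=> 8000

(def settings (atom {:retries 1}))
(swap! settings update :retries + 2)   ;=> {:retries 3}
(reset! settings {})                   ;=> {}
```

No lock appears in the code: the function passed to sw! is only a description of how to get from the old value to the new one, and swap! returns the new value synchronously.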
Ref
• A ref’s value can be shared between threads
• The change can be coordinated with other refs
  – It is always performed within a transaction, which can be executed on several refs
• Execution is synchronous
Ref
• Creation:
  – (ref <value>)
  – (def a (ref <value>))
• Reading
  – (deref <the-ref>)
  – @<the-ref>
Ref
• The modification of the ref is done using
  – (alter <the-ref> func args)
    • The first argument passed to func is the ref’s current value
  – (ref-set <the-ref> v)
• Using only the above will not work !!!!
Ref
• Need to execute the commands within a transaction
• Use (dosync <expr…>)
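A minimal sketch of a coordinated change to two refs. The names checking, savings and move! are made up for this example.

```clojure
(def checking (ref 100))
(def savings  (ref 0))

(defn move! [from to amount]
  (dosync                              ; both alters commit together or not at all
    (alter from - amount)
    (alter to + amount)))

(move! checking savings 30)
[@checking @savings] ;=> [70 30]

;; Outside a transaction the same call is refused.
(def outside-tx
  (try (ref-set checking 0) :changed
       (catch IllegalStateException _ :rejected)))
outside-tx ;=> :rejected
```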
Transaction
• Transactions maintain the ACID properties:
  – Atomic
    • The change happens at a single point in time, for all the participating values
      – Or it fails entirely
  – Consistent
    • At any given point the consistency rules are valid
      – It is possible to add such rules
  – Isolated
    • Any change done within a transaction is not visible to an outside viewer during the execution of the transaction
      – No side effects
  – Durable
    • Once the transaction succeeds, its effects are not susceptible to system failures
Transactions and side-effects
• Transactions may be retried
• Do not perform side-effects in the body of alter / swap!
  – Any i/o, db call …
Software Transactional Memory (STM)
• Clojure uses an STM to update refs
  – Atoms are updated outside the STM, with a compare-and-set that is retried on conflict
• STM maintains the ACI properties
  – As it runs in memory, there is no writing to disk, hence no durability
• Clojure’s STM uses the MVCC algorithm
  – Multi-version concurrency control
  – Used within commercial DBs, such as Oracle’s
How the update works
• No assignment in the developer’s code
• The developer provides a function
  – How to create a new value based on the old value
• The update is managed by the system
  – There are locks behind the scenes
• The update functionality is just one of the things that can be provided by the developer
  – More things can be added
Validation
• It is possible to provide a validator for a ref / atom / var / agent
  – At creation (ref / atom / agent): (<elem> initial-val :validator fn)
  – Later, for any of them: (set-validator! <elem> fn)
• The validation function accepts one argument, which is the new value
  – Returns either true or false
• If the validation function fails, the change is rejected with an exception; within a transaction, the transaction fails
  – No retry
  – An atom’s update is not a transaction, but a failed validation rejects it in the same way
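A brief sketch of validators on an atom and on a ref. The names balance and stock and the particular rules (non-negative, strictly positive) are assumptions for this example.

```clojure
;; Validator supplied at creation time.
(def balance (atom 10 :validator #(>= % 0)))

(def atom-rejected?
  (try (swap! balance - 50) false
       (catch IllegalStateException _ true)))
atom-rejected? ;=> true
@balance       ;=> 10   (the old value is kept)

;; Validator attached later.
(def stock (ref 5))
(set-validator! stock pos?)

(def tx-result
  (try (dosync (alter stock - 5)) :committed
       (catch IllegalStateException _ :rejected)))
tx-result ;=> :rejected
@stock    ;=> 5
```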
Observing changes
• It is possible to add a function that would be invoked upon a change in an element
  – Var / Atom / Ref / Agent
• (add-watch <elem> <key> <watch-fn>)
  – <elem>: the var / atom / ref / agent
  – <key>: a unique identifier of the watch-fn
  – <watch-fn>: a function that accepts 4 arguments
    • <key> - the key used when the fn was attached to the elem
    • <elem> - the changed element
    • <old-val> - the old value of the element
    • <new-val> - the new value of the element
Observing changes
• Within the watch function:
  – Do not deref the element to get its value
    • It may be different from both the old and new value
  – Ignore the key
• Use the key when removing the watch
  – (remove-watch <elem> <key>)
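A small sketch of watching an atom. The names temperature and changes and the key :log are arbitrary choices for this example.

```clojure
(def temperature (atom 20))
(def changes (atom []))

(add-watch temperature :log
           (fn [key elem old-val new-val]
             ;; Use the values passed in rather than derefing elem.
             (swap! changes conj [old-val new-val])))

(reset! temperature 25)
(swap! temperature inc)

(remove-watch temperature :log)        ; the key identifies which watch to remove
(reset! temperature 0)                 ; no longer recorded

@changes ;=> [[20 25] [25 26]]
```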
Pendings
What are pendings
• A result of a calculation
• To be used later
• Who provides the calculation
• When to start it
What are pendings
• A box that contains a result of a computation
• Future
  – The computation is defined upon initialization
  – Starts when the future is defined
• Delay
  – The computation is defined upon initialization
  – Starts when somebody asks for the result of the computation
• Promise
  – The computation is NOT defined upon initialization
  – It is up to someone who can access the promise to provide it
Future / delay
• An asynchronous computation
• Creation:
  – (future <form>)
  – (def ftr (future <form>))
• Reading
  – (deref ftr)
  – @ftr
• Reading a future / delay is a blocking operation
Future / delay
• When to use
  – For starting long computations that will be needed later
    • DB call
    • Service over HTTP
    • …
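The difference in start time between a future and a delay can be seen in a sketch like this. The names started, d and f, the returned numbers and the sleep duration are arbitrary.

```clojure
(def started (atom []))                ; records when each body begins

(def d (delay  (swap! started conj :delay)  42))
(def f (future (swap! started conj :future) (Thread/sleep 50) 7))

(def delay-ran-early? (realized? d))   ; nobody has asked for the delay yet
delay-ran-early? ;=> false

@f ;=> 7    (blocks until the future's thread is done)
@d ;=> 42   (the delay's body runs now, on the calling thread)
@d ;=> 42   (cached; the body is not run again)

@started ;=> [:future :delay]
```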
Promise
• A promise is a “box” that holds a data element
  – Not a computation
• The “box” can be filled once, and then its value can be read
  – Following attempts to “fill” the box would fail silently
Promise
• Creation
  – (promise)
  – (def p (promise))
• Reading
  – (deref p)
  – @p
• Setting the value
  – (deliver p <the-val>)
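A minimal sketch of a promise being filled from another thread. The names p, first-read and second-delivery and the delivered values are made up for this example.

```clojure
(def p (promise))
(def filled-at-start? (realized? p))
filled-at-start? ;=> false

(future (Thread/sleep 50)
        (deliver p :ready))            ; someone else fills the box

(def first-read @p)                    ; blocks until the value is delivered
first-read ;=> :ready

(def second-delivery (deliver p :again))
second-delivery ;=> nil   (the box is already full, so nothing changes)
@p ;=> :ready

;; deref with a timeout avoids blocking forever on an empty promise.
(deref (promise) 10 :timed-out) ;=> :timed-out
```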
That’s all for today