Ruby Concurrency
or
Concurrency Hell
How I stopped worrying about it
Egor Hamaliy
Agenda
1. Concurrency and Parallelism
2. General Concepts
3. Models
• Actors
• Mutexes/Locks
• STM
• CSP
• Futures/Promises
• …
Models
• Coroutines
• Evented IO
• Process calculi
• Petri nets
• ...
GENERAL CONCEPTS
Concurrency/Parallelism
OS mechanism
Scheduling
Communication
Part #1
Concurrency vs Parallelism
What's the difference?
Not all programmers agree on the meaning of the
terms 'parallelism' and 'concurrency'. They may
define them in different ways or not distinguish them at all.
Rob Pike
Concurrency is about dealing with lots of things at
once.
Parallelism is about doing lots of things at once.
http://blog.golang.org/concurrency-is-not-parallelism
Rob Pike
Execution
How are things executed?
• Process
• Thread
• Green Thread
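A minimal sketch of the difference between the first two execution units, assuming a Unix-like platform where `fork` is available. The method name `thread_and_process_demo` is made up for illustration: a thread mutates the parent's memory, while a forked process only mutates its own copy.

```ruby
# Hypothetical helper: compares what a thread and a child process can see.
def thread_and_process_demo
  value = 0
  Thread.new { value += 1 }.join   # a thread shares the parent's memory
  pid = fork { value += 100 }      # a child process changes only its own copy
  Process.wait(pid)                # wait for the child to exit
  value                            # the child's change is not visible here
end

puts thread_and_process_demo
```

The parent only observes the thread's increment, which is why processes need an explicit communication mechanism.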
OS Primitives
Scheduling
How are things scheduled?
Preemptive
Cooperative
Communication
How do the executing things not trip over each other?
Communication is always HARD
Models
Threads/Mutexes
Transactional Memory
Processes & IPC
CSP
Evented/Coroutines
Actors
Part #2
Model                | Execution         | Scheduling  | Communication                  | Concurrent/Parallel | Implementation
Mutexes              | Threads           | Preemptive  | Shared memory (locks)          | C/P                 | Mutex
Transactional Memory | Threads           | Preemptive  | Shared memory (commit/abort)   | C/P                 | Clojure STM
Processes & IPC      | Processes         | Preemptive  | Shared memory/Message passing  | C/P                 | Resque/Forking
CSP                  | Threads/Processes | Preemptive  | Message passing (channels)     | C/P                 | Golang
Actors               | Threads           | Preemptive  | Message passing (mailboxes)    | C/P                 | Erlang/Celluloid
Futures & Promises   | Threads           | Cooperative | Message passing (itself)       | C/P                 | Oz/Celluloid
Coroutines           | 1 process/thread  | Cooperative | Message passing                | C                   | Fibers
Evented              | 1 process/thread  | Cooperative | Shared memory                  | C                   | Eventmachine
Atomicity problems
Mutex
Pros
• No need to worry about scheduling (preemptive)
• Commonly used
• Wide language support
Cons
• Scheduling overhead (context switching)
• Synchronization/locking issues
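As a rough illustration of the lock-based model, here is a sketch using Ruby's built-in `Mutex`. The method name `count_with_mutex` and its parameters are invented for this example. The read-modify-write on the counter is wrapped in one critical section so that preemption cannot interleave it.

```ruby
# Hypothetical helper: several threads increment one shared counter.
def count_with_mutex(thread_count, increments)
  counter = 0
  lock = Mutex.new
  threads = Array.new(thread_count) do
    Thread.new do
      increments.times do
        # Read, increment and write happen as one critical section.
        lock.synchronize { counter += 1 }
      end
    end
  end
  threads.each(&:join)
  counter
end

puts count_with_mutex(4, 1_000)
```

Every increment pays for acquiring the lock, which is the synchronization cost listed under Cons.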
Transactional Memory
http://en.wikipedia.org/wiki/Software_transactional_memory
"After completing an entire transaction, [a thread] verifies that other
threads have not made changes to memory that it accessed in the
past... [The changes are validated and,] if validation is successful, made
permanent, [which] is called a commit."
"[A transaction] may also abort at any time."
Thread1:
atomic {
- read variable
- increment variable
- write variable
}
Thread2:
atomic {
- read variable
- increment variable
# about to write, but Thread1 has already written the variable...
# notices that Thread1 changed the data, so ROLLS BACK and retries
- write variable
}
“Don't wait on a lock, just check when we're ready to commit”
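Ruby has no built-in STM, so the following is only a toy sketch of the "check at commit time" idea. The class `VersionedRef` and the method `count_optimistically` are hypothetical names, not part of any library. A transaction works on a snapshot and commits only if nobody else committed in the meantime, otherwise it rolls back and retries.

```ruby
# Hypothetical VersionedRef: a toy optimistic cell, not a real STM library.
class VersionedRef
  def initialize(value)
    @value = value
    @version = 0
    @commit_lock = Mutex.new # guards only the short validate-and-write step
  end

  # Returns a consistent [value, version] snapshot.
  def read
    @commit_lock.synchronize { [@value, @version] }
  end

  # Runs the block on a snapshot; retries if another thread committed first.
  def atomically
    loop do
      snapshot, seen_version = read
      new_value = yield(snapshot)
      committed = @commit_lock.synchronize do
        if @version == seen_version
          @value = new_value
          @version += 1
          true
        else
          false # someone else committed: discard our result and retry
        end
      end
      return new_value if committed
    end
  end
end

# Hypothetical helper: several threads increment one VersionedRef.
def count_optimistically(thread_count, increments)
  ref = VersionedRef.new(0)
  threads = Array.new(thread_count) do
    Thread.new { increments.times { ref.atomically { |v| v + 1 } } }
  end
  threads.each(&:join)
  ref.read.first
end

puts count_optimistically(4, 250)
```

The block passed to `atomically` may run more than once, which is why transactions must not perform operations that cannot be undone.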
Pros
• Increased concurrency
• No thread needs to wait for access to a resource
• Smaller scope that needs synchronizing - modifying disjoint parts of a
data structure
STM's benefits
http://www.haskell.org/haskellwiki/Software_transactional_memory
Cons
• Aborting transactions
• Places limitations on the behavior of transactions - they cannot perform
any operation that cannot be undone, including most I/O.
IPC
Methods of IPC
• Pipes
• Shared memory
• Message queues
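To make the first method in the list concrete, here is a small sketch of a pipe between a parent and a forked child, assuming a Unix-like platform. The method name `square_in_child` is invented for illustration. No memory is shared, so the result has to be serialized and sent through the pipe.

```ruby
# Hypothetical helper: computes n * n in a child process and reads it back.
def square_in_child(n)
  reader, writer = IO.pipe
  pid = fork do
    reader.close                 # the child only writes
    writer.write((n * n).to_s)   # serialize the result as text
    writer.close
  end
  writer.close                   # the parent keeps only the read end
  result = reader.read           # blocks until the child closes its write end
  reader.close
  Process.wait(pid)
  Integer(result)
end

puts square_in_child(12)
```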
How do we handle atomicity? Don't share memory.
How to communicate?
Pros
• Can't corrupt data when data is not shared.
• No locking.
• Easier to scale horizontally (adding nodes).
Cons
• Can't communicate over shared memory
• Slower to spawn a new process
• More memory overhead.
• Scaling horizontally is expensive.
Communicating Sequential Processes
CSP
Pros
• Uses message passing and channels heavily, as an alternative to
locks
Cons
• Handling very big messages or a lot of messages
(unbounded buffers)
• Messaging is essentially a copy of shared data
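Ruby has no native channels, but a bounded `SizedQueue` can stand in for one in a sketch. The method name `sum_over_channel` and the `:done` sentinel are assumptions made for this example. With a capacity of 1 the producer blocks until the consumer takes each item, which approximates (but is not identical to) a CSP rendezvous.

```ruby
# Hypothetical helper: a producer thread sends numbers over a bounded channel.
def sum_over_channel(numbers)
  channel = SizedQueue.new(1)    # bounded: push blocks while the slot is full
  producer = Thread.new do
    numbers.each { |n| channel.push(n) }
    channel.push(:done)          # sentinel marking the end of the stream
  end
  total = 0
  while (message = channel.pop) != :done
    total += message             # the consumer owns `total`, no lock needed
  end
  producer.join
  total
end

puts sum_over_channel([1, 2, 3, 4])
```

Bounding the queue is one way to avoid the unbounded-buffer problem listed under Cons.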
Actors
Atomicity? Conflict? Every actor has its own address space.
Don't share memory. Communicate via mailboxes.
Comparison with CSP
• CSP processes are anonymous, while actors have
identities.
• Message-passing in actor systems is fundamentally
asynchronous (CSP traditionally uses synchronous
messaging: "rendezvous")
• CSP uses explicit channels for message passing,
whereas actor systems transmit messages to named
destination actors.
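The points above can be sketched with a hand-rolled actor built on a thread and a `Queue` mailbox. `CounterActor` and its methods are hypothetical and far simpler than Erlang processes or Celluloid actors. Senders address a specific actor object, sends are asynchronous, and only the actor's own thread touches its state.

```ruby
# Hypothetical CounterActor: private state plus a mailbox, nothing shared.
class CounterActor
  def initialize
    @mailbox = Queue.new
    @count = 0                       # touched only by the actor's thread
    @thread = Thread.new { run }
  end

  # Asynchronous send: enqueue the message and return immediately.
  def tell(message)
    @mailbox << message
    self
  end

  # Request/reply: the answer comes back on a per-request reply queue.
  def ask_count
    reply = Queue.new
    @mailbox << [:get, reply]
    reply.pop
  end

  def stop
    @mailbox << :stop
    @thread.join
  end

  private

  # Processes one message at a time, in arrival order.
  def run
    loop do
      case (message = @mailbox.pop)
      when :increment then @count += 1
      when :stop then break
      when Array then message.last << @count if message.first == :get
      end
    end
  end
end

actor = CounterActor.new
3.times { actor.tell(:increment) }
puts actor.ask_count
actor.stop
```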
Pros
• Uses message passing (mailboxes) heavily
• No shared state (avoid locks, easier to scale)
• Easier to maintain the code
Cons
• Doesn't fit as well when shared state is required
• Handling very big messages, or a lot of messages
• Messaging is essentially a copy of shared data
Fibers
Fibers are coroutines:
Cooperative! Handing execution rights between one
another, saving local state.
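A short sketch of that hand-off using Ruby's core `Fiber` class. The method names `fibonacci_fiber` and `first_fibonacci` are made up for illustration. Each `Fiber.yield` gives control back to the caller, and the locals `a` and `b` survive until the next `resume`.

```ruby
# Hypothetical helper: a fiber that produces Fibonacci numbers on demand.
def fibonacci_fiber
  Fiber.new do
    a, b = 0, 1
    loop do
      Fiber.yield a        # hand control back to the caller, keeping a and b
      a, b = b, a + b
    end
  end
end

# Hypothetical helper: resumes the fiber `count` times and collects the values.
def first_fibonacci(count)
  fiber = fibonacci_fiber
  Array.new(count) { fiber.resume }
end

p first_fibonacci(8)
```

Nothing runs between two `resume` calls, so no lock is needed around `a` and `b`.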
A Curious Course on Coroutines and Concurrency:
David Beazley (https://twitter.com/dabeaz) writing an operating system
with only coroutines.
http://dabeaz.com/coroutines/
No Threads, Evented style, just cooperative scheduling of coroutines...
Possible use cases:
http://stackoverflow.com/questions/303760/what-are-use-cases-for-a-coroutine
Pros
• Expressive state: state based computations much easier
to understand and implement
• No need for locks (cooperative scheduling)
• Scales vertically (add more cpu)
Cons
• Single thread: Harder to parallelize/scale horizontally
(use more cores, add more nodes)
• Constrained to have all the components work together
symbiotically
Eventmachine
Examples
• C10k problem
• EventMachine in Ruby
• Twisted in Python
• Redis's event loop
• Apache vs Nginx
• Node.js vs the world
“Evented servers are really good for very light requests, but
if you have a long-running request, it falls down on its face”
Technically valid, but in practice not necessarily true.
Reactor:
• wait for event (Reactor job)
• dispatch "Ready-to-Read" event to user handler (Reactor job)
• read data (user handler job)
• process data ( user handler job)
Proactor:
• wait for event (Proactor job)
• read data (now Proactor job)
• dispatch "Read-Completed" event to user handler (Proactor job)
• process data (user handler job)
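The Reactor steps above can be sketched with `IO.select` from Ruby's core library. `MiniReactor` and `collect_with_reactor` are hypothetical names, and this is not how EventMachine is implemented internally. The loop only waits and dispatches "ready to read" events, while reading and processing stay in the user handler.

```ruby
# Hypothetical MiniReactor: waits for readiness and dispatches to handlers.
class MiniReactor
  def initialize
    @handlers = {}                         # IO object => user handler
  end

  def register(io, &handler)
    @handlers[io] = handler
  end

  def run
    until @handlers.empty?
      ready, = IO.select(@handlers.keys)   # wait for event (reactor job)
      ready.each do |io|
        begin
          @handlers[io].call(io)           # dispatch "ready to read" (reactor job)
        rescue EOFError
          @handlers.delete(io)             # writer closed: stop watching this IO
          io.close
        end
      end
    end
  end
end

# Hypothetical helper: pushes messages through a pipe and collects them.
def collect_with_reactor(messages)
  reader, writer = IO.pipe
  received = []
  reactor = MiniReactor.new
  reactor.register(reader) do |io|
    received << io.read_nonblock(4096)     # read and process (user handler job)
  end
  messages.each { |message| writer.write(message) }
  writer.close
  reactor.run
  received.join
end

puts collect_with_reactor(["hello ", "world"])
```

A Proactor would differ in that the loop itself performs the read and hands the completed data to the handler.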
Pros
• Avoid polling. CPU bound vs IO bound
• Expanding your horizons (very different paradigms)
• Scales well vs spawning many threads
Cons
• If you block the event loop, everything goes bad
• Program flow is "spaghetti"-ish
• Callback Hell
• Hard to debug, you lose "the stack"
Sidenote: C10M
http://highscalability.com/blog/2013/5/13/the-secret-to-10-million-concurrent-connections-the-kernel-i.html
http://c10m.robertgraham.com/p/manifesto.html
http://c10m.robertgraham.com/2013/02/wimpy-cores-and-scale.html
Conclusion
USE THE BEST TOOL FOR THE JOB
Questions?