
Multithreaded Programming

ECEN5043 Software Engineering of

Multiprogram Systems

University of Colorado

Lectures 5 & 6

revised 1/29/2007 ECEN5043 SW Eng of Multiprogram Systems, University of Colorado


The Essence of Multiple Threads

Two or more processes that work together to perform a task

  Each process is a sequential program

  One thread of control per process

Communicate using shared variables

Need to synchronize with each other, in one of two ways

  Mutual exclusion

  Condition synchronization


Opportunities & Challenges

What kinds of processes to use

How many

How they should interact

Key to developing a correct program is to ensure the process interaction is properly synchronized


Focus

Programs in most common languages

  Explicit concurrency, communication, & synchronization

  Specify the actions of each process and how they communicate & synchronize

Asynchronous process execution

Shared memory

Single CPU and operating system


Multiprocessing monkey wrench

The solutions we will address this semester will presume a single CPU and therefore the concurrent processes share coherent memory

A multiprocessor environment with shared memory introduces cache and memory consistency problems and overhead to manage it.

A distributed-memory multiprocessor/multicomputer/network environment has additional issues of latency, bandwidth, etc.

We focus on the first of these cases, the single CPU, this semester.


Recall

A process is a sequential program that has its own thread of control when executed

A concurrent program contains multiple processes, so it has multiple threads of control

Multithreaded usually means a program contains more processes than there are processors to execute them

A multithreaded software system manages multiple independent activities


Why write as multithreaded?

To be cool (wrong reason)

Sometimes, it is easier to organize the code and data as a collection of processes than as a single huge sequential program

Each process can be scheduled and executed independently

Other applications can continue to execute “in the background”


Many applications, 5 basic paradigms

Iterative parallelism

Recursive parallelism

Producers and consumers (pipelines)

Clients and servers

Interacting peers


Iterative parallelism

Example?

Several, often identical processes

Each contains one or more loops

Therefore each process is iterative

They work together to solve a single problem

Communicate and synchronize using shared variables

Independent computations – disjoint write sets
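As a minimal sketch of iterative parallelism with disjoint write sets, the Python below starts several identical worker threads, each looping over its own indices of a shared result list. The function name parallel_square and the strided index split are assumptions made for illustration, not part of the lecture.

```python
import threading

def parallel_square(values, workers=4):
    """Square every element; worker w writes only indices w, w+workers, ..."""
    result = [0] * len(values)

    def work(w):
        # Each worker is iterative and touches a disjoint set of result slots,
        # so no mutual exclusion is needed for the writes.
        for i in range(w, len(values), workers):
            result[i] = values[i] * values[i]

    threads = [threading.Thread(target=work, args=(w,)) for w in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()          # the only synchronization: wait for all workers
    return result

print(parallel_square([1, 2, 3, 4, 5]))
```

Because no two workers write the same slot, every interleaving of the workers gives the same result.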


Recursive parallelism

One or more independent recursive procedures

Recursion is the dual of iteration

Procedure calls are independent – each works on different parts of the shared data

Often used in imperative languages for

  Divide and conquer algorithms

  Backtracking algorithms (e.g. tree-traversal)

Used to solve combinatorial problems such as sorting, scheduling, and game playing

If too many recursive procedures, we prune.


Producers and consumers

One-way communication between processes

Often organized into a pipeline through which info flows

Each process is a filter that consumes the output of its predecessor and produces output for its successor

That is, a producer-process computes and outputs a stream of results

Sometimes implemented with a shared bounded buffer as the pipe, e.g. Unix stdin and stdout

Synchronization primitives: flags, semaphores, monitors
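The pipeline idea can be sketched in Python with queue.Queue as the shared bounded buffer. The names run_pipeline, filter_stage, and the DONE sentinel are assumptions of this sketch; the queue handles the blocking on full and empty internally.

```python
import queue
import threading

DONE = object()  # end-of-stream marker (an assumption of this sketch)

def producer(out_q, items):
    for item in items:
        out_q.put(item)          # blocks while the bounded buffer is full
    out_q.put(DONE)

def filter_stage(in_q, out_q, keep):
    # A filter: consumes its predecessor's output, produces for its successor.
    while True:
        item = in_q.get()        # blocks while the buffer is empty
        if item is DONE:
            out_q.put(DONE)
            return
        if keep(item):
            out_q.put(item)

def run_pipeline(items, keep, capacity=2):
    q1 = queue.Queue(maxsize=capacity)
    q2 = queue.Queue(maxsize=capacity)
    threads = [threading.Thread(target=producer, args=(q1, items)),
               threading.Thread(target=filter_stage, args=(q1, q2, keep))]
    for t in threads:
        t.start()
    results = []
    while True:                  # the calling thread is the final consumer
        item = q2.get()
        if item is DONE:
            break
        results.append(item)
    for t in threads:
        t.join()
    return results

print(run_pipeline(range(10), lambda x: x % 2 == 0))
```

Each stage runs at its own rate; the small capacity forces the producer to wait when the filter falls behind.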


Clients and servers

Dominant interaction pattern in distributed systems (see next semester)

Client process requests a service & waits for reply

Server waits for requests; then acts upon them.

Server can be implemented

  By a single process that handles one client process at a time

  Multithreaded to service requests concurrently

Concurrent programming generalizations of procedures and procedure calls


Interacting peers

Occurs in distributed programs

Several processes that execute the same code and exchange messages to accomplish a task

Used to implement

  Distributed parallel programs including distributed versions of iterative parallelism

  Decentralized decision making


Summary

Concurrent programming paradigms on a single processor

  Iterative parallelism

  Recursive parallelism

  Producers and consumers

    No analog in sequential programs because producers and consumers are, by definition, independent processes with their own threads and their own rates of progress


Shared-Variable Programming

Frowned on in sequential programs, although convenient

Absolutely necessary in concurrent programs

  Must communicate to work together


Need to communicate

Communication fosters need for synchronization

  Mutual exclusion – need to not access shared data at the same time

  Condition synchronization – one needs to wait for another


Some terms

State – values of the program variables at a point in time, both explicit and implicit. Each process in a program executes independently and, as it executes, examines and alters the program state.

Atomic actions -- A process executes sequential statements. Each statement is implemented at the machine level by one or more atomic actions that indivisibly examine or change program state.

Concurrent program execution interleaves sequences of atomic actions. A history is a trace of a particular interleaving.


Terms -- continued

The next atomic action in any ONE of the processes could be the next one in a history. So there are many ways actions can be interleaved and conditional statements allow even this to vary.

The role of synchronization is to constrain the possible histories to those that are desirable.

Mutual exclusion combines atomic actions into sequences of actions called critical sections where the entire section appears to be atomic.
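To get a feel for how many histories there can be, here is a small counting sketch. It assumes n processes that each execute exactly m atomic actions with no conditionals, in which case the number of interleavings is (n*m)! / (m!)^n. The function names are hypothetical.

```python
from math import factorial

def count_histories(n, m):
    """Interleavings of n processes, each executing m atomic actions in order."""
    return factorial(n * m) // factorial(m) ** n

def count_by_enumeration(remaining):
    """Cross-check: pick which process supplies the next atomic action."""
    if all(r == 0 for r in remaining):
        return 1
    total = 0
    for p, r in enumerate(remaining):
        if r > 0:
            rest = remaining[:p] + (r - 1,) + remaining[p + 1:]
            total += count_by_enumeration(rest)
    return total

print(count_histories(3, 2))             # formula
print(count_by_enumeration((2, 2, 2)))   # brute-force count of the same case
```

Even three processes with two actions each already allow 90 histories, which is why testing a handful of runs says little about the rest.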


Terms – continued further

Property of a program is an attribute that is true of every possible history.

  Safety – never enters a bad state

  Liveness – the program eventually enters a good state


How can we verify?

How do we demonstrate a program satisfies a property?

  A dynamic test considers just one possible history

    Limited number of tests unlikely to demonstrate the absence of bad histories

  Operational reasoning -- exhaustive case analysis

  Assertional reasoning – abstract analysis

    Atomic actions are predicate transformers


Assertional Reasoning

Use assertions to characterize sets of states

Allows a compact representation of states and their transformations

More on this later in the course


Warning

We must be wary of dynamic testing alone

  It can reveal only the presence of errors, not their absence.

Concurrent programs are difficult to test & debug

  Difficult (impossible) to stop all processes at once in order to examine their state

  Each execution in general will produce a different history


Example 1a -- Pattern in a File

Find all instances of a pattern in filesomething. Consider

string line;
read a line of input from stdin into line;
while (!EOF) {
    look for pattern in line;
    if (pattern is in line)
        write line;
    read next line of input;
}


Example 1b -- concurrent & correct?

string line;
read a line of input from stdin into line;
while (!EOF) {
    co look for pattern in line;
       if (pattern is in line)
           write line;
    // read next line of input into line;
    oc;
}


Example 1c -- different variables

string line1, line2;
read a line of input from stdin into line1;
while (!EOF) {
    co look for pattern in line1;
       if (pattern is in line1)
           write line1;
    // read next line of input into line2;
    oc;
}


Example 1d - copy the line

string line1, line2;
read a line of input from stdin into line1;
while (!EOF) {
    co look for pattern in line1;
       if (pattern is in line1)
           write line1;
    // read next line of input into line2;
    oc;
    line1 = line2;
}


Co inside while vs. while inside co??

Possible to get the loop inside the co brackets so that the multi-process creation only occurs once?

Yes. Put a while loop inside each of the two processes.


Both processes inside co brackets

string buffer; bool done = false;

co
process 1: find patterns
    string line1;
    while (true) {
        wait for buffer to be full or done to be true;
        if (done) break;
        line1 = buffer;
        signal buffer is empty;
        look for pattern in line1;
        if (pattern is in line1)
            write line1;
    }

process 2: read new lines
    string line2;
    while (true) {
        read next line of input into line2;
        if (EOF) { done = true; break; }
        wait for buffer to be empty;
        buffer = line2;
        signal that buffer is full;
    }
oc;
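One possible realization of the two-process version, sketched in Python with a threading.Condition standing in for the "wait" and "signal" steps. The names grep_lines, state, and the full flag are assumptions of this sketch. One deliberate difference from the slide: the pattern finder drains a full buffer before it honors done, so the last line is not lost if the reader hits EOF first.

```python
import threading

def grep_lines(lines, pattern):
    """Reader and pattern-finder threads share a one-line buffer."""
    cond = threading.Condition()
    state = {"buffer": None, "full": False, "done": False}
    matches = []

    def find_patterns():                     # process 1 on the slide
        while True:
            with cond:
                cond.wait_for(lambda: state["full"] or state["done"])
                if not state["full"]:        # done, and nothing left to consume
                    return
                line1 = state["buffer"]
                state["full"] = False
                cond.notify_all()            # signal buffer is empty
            if pattern in line1:             # search outside the lock
                matches.append(line1)

    def read_lines():                        # process 2 on the slide
        for line2 in lines:
            with cond:
                cond.wait_for(lambda: not state["full"])
                state["buffer"] = line2
                state["full"] = True
                cond.notify_all()            # signal buffer is full
        with cond:
            state["done"] = True             # end of input
            cond.notify_all()

    threads = [threading.Thread(target=find_patterns),
               threading.Thread(target=read_lines)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return matches

print(grep_lines(["a cat", "dog", "cat nap"], "cat"))
```

The threads are created once, and the loop lives inside each of them, which is the point of moving the while inside the co.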


Synchronization

Required for correct answers whenever processes both read and write shared variables.

Sometimes groups of instructions must be treated as if atomic -- critical sections

Technique of double checking before updating a shared variable is useful (even though it sounds strange)

Example of double checking -- next


Example 2 -- sequential

Find the maximum value in an array

int m = 0;
for [ i = 0 to n-1 ] {
    if (a[i] > m)
        m = a[i];
}

If we try to examine every array element in parallel, all processes will try to update m but the final value will be the value assigned by the last process that updates m.


Example 2b - concurrent w/ doublecheck

OK to do comparisons in parallel because they are read-only actions

But -- necessary to ensure that when the program terminates, m is the maximum :-)

int m = 0;
co [i = 0 to n-1]
    if (a[i] > m)
        < if (a[i] > m)    # recheck only if above check true
              m = a[i]; >
oc;

Angle brackets indicate atomic operation.
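A minimal Python sketch of the double-check idea, with a threading.Lock playing the role of the angle brackets. The name parallel_max and the one-thread-per-element layout are assumptions for illustration; like the slide, it starts m at 0 and so assumes positive values.

```python
import threading

def parallel_max(a):
    m = [0]                        # shared maximum, boxed so threads can update it
    lock = threading.Lock()

    def check(i):
        if a[i] > m[0]:            # first check: read-only, done without the lock
            with lock:             # atomic section, as in < ... >
                if a[i] > m[0]:    # recheck: m may have grown in the meantime
                    m[0] = a[i]

    threads = [threading.Thread(target=check, args=(i,)) for i in range(len(a))]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return m[0]

print(parallel_max([3, 9, 2, 7]))
```

The unlocked first check keeps most threads from ever contending for the lock; the recheck inside the lock is what makes the final value correct.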


Why synchronize?

If processes do not interact, all interleavings are acceptable.

If processes do interact, only some interleavings are acceptable.

Role of synchronization: prevent unacceptable interleavings

  Combine fine-grain atomic actions into coarse-grained composite actions (we call this ....what?)

  Delay process execution until program state satisfies some predicate


Notation for synchronization

General:

<await (condition) statement-sequence;>

Mutual exclusion:

<statement-sequence>

Conditional synchronization only:

<await (condition);>

When the condition can be evaluated atomically, this can be implemented as a busy-waiting loop: while (not condition);

(note the ending empty statement, i.e. semicolon)
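The three forms of the notation can be approximated in Python with one threading.Condition guarding all shared state. The class AtomicState and its methods atomically and await_then are hypothetical names for this sketch, and it blocks rather than busy-waits.

```python
import threading

class AtomicState:
    """One condition variable guards all shared state in this sketch."""

    def __init__(self):
        self._cond = threading.Condition()

    def atomically(self, action):
        # <statement-sequence>: run without interleaving, then wake waiters.
        with self._cond:
            result = action()
            self._cond.notify_all()
            return result

    def await_then(self, condition, action=lambda: None):
        # <await (condition) statement-sequence;>
        # With the default action this is <await (condition);>.
        with self._cond:
            self._cond.wait_for(condition)
            result = action()
            self._cond.notify_all()
            return result

shared = {"count": 0}
state = AtomicState()

def bump():
    shared["count"] += 1

workers = [threading.Thread(target=state.atomically, args=(bump,))
           for _ in range(3)]
for w in workers:
    w.start()
state.await_then(lambda: shared["count"] == 3)   # delay until all three bumps ran
for w in workers:
    w.join()
print(shared["count"])
```

Every atomic action notifies, because any change to shared state might make some waiting condition true.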


Unconditional atomic action

  does not contain a delay condition

  can execute immediately as long as it executes atomically (not interleaved)

  examples:

    individual machine instructions

    expressions we place in angle brackets

    await statements where guard condition is constant true or is omitted


Conditional atomic action - await statement with a guard condition

If condition is false in a given process, it can only become true by the action of other processes.

How long will the process wait if it has a conditional atomic action?

Locks and Barriers


How to implement synchronization

To implement mutual exclusion

  Implement atomic actions in software using locks to protect critical sections

  Needed in most concurrent programs

To implement conditional synchronization

  Implement synchronization point that all processes must reach before any process is allowed to proceed -- barrier

  Needed in many parallel programs -- why?
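A small barrier illustration using Python's threading.Barrier. The function run_phases and the phase log are assumptions of this sketch; the point is that no worker starts phase 1 until every worker has finished phase 0.

```python
import threading

def run_phases(workers=3, phases=2):
    barrier = threading.Barrier(workers)   # reusable across phases
    log = []
    log_lock = threading.Lock()            # mutual exclusion for the shared log

    def work():
        for phase in range(phases):
            with log_lock:
                log.append(phase)          # this worker's share of the phase
            barrier.wait()                 # nobody proceeds until all arrive

    threads = [threading.Thread(target=work) for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return log

print(run_phases())
```

Iterative parallel programs often need this because each iteration reads results that other processes wrote in the previous one.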


Bad states, Good state

Mutual exclusion -- at most one process at a time is executing its critical section

  its bad state is one in which two processes are in their critical section

Absence of Deadlock (“livelock”) -- If 2 or more processes are trying to enter their critical sections, at least one will succeed.

  its bad state is one in which all the processes are waiting to enter but none is able to do so

two more on next slide


Bad states -- continued

Absence of Unnecessary Delay -- If a process is trying to enter its c.s. and the other processes are executing their noncritical sections or have terminated, the first process is not prevented from entering its c.s.

  Bad state is one in which the one process that wants to enter cannot do so, even though no other process is in the c.s.

Eventual entry -- process that is attempting to enter its c.s. will eventually succeed.

  liveness property, depends on scheduling policy


Logical property of mutual exclusion

When process1 is in its c.s., set property1 true.

Similarly, for process2 where property2 is true.

Bad state is where property1 and property2 are both true at the same time

Therefore want every state to satisfy the negation of the bad state --

  mutex: NOT(property1 AND property2)

Needs to be a global invariant

  True in the initial state and after each event that affects property1 or property2

  <await (!property2) property1 = true>


Coarse-grain solution

bool property1 = false, property2 = false;

COMMENT: mutex: NOT(property1 AND property2) -- global invariant

process process1 {
    while (true) {
        <await (!property2) property1 = true;>
        critical section;
        property1 = false;
        noncritical section;
    }
}

process process2 {
    while (true) {
        <await (!property1) property2 = true;>
        critical section;
        property2 = false;
        noncritical section;
    }
}


Does it avoid the problems?

Deadlock: if each process were blocked in its entry protocol, then both property1 and property2 would have to be true. Both are false at this point in the code.

Unnecessary delay: One process blocks only if the other one is in its c.s.

Liveness -- see next slide


Liveness guaranteed?

Liveness property -- process trying to enter its critical section eventually is able to do so

If process1 trying to enter but cannot, then

  property2 is true;

  therefore process2 is in its c.s., which eventually exits, making property2 false;

  allows process1’s guard to become true

If process1 still not allowed entry, it’s because the scheduler is unfair or because process2 again gains entry -- (happens infinitely often?)

Strongly-fair scheduler required, not likely.


Three “spin lock” solutions

A “spin lock” solution uses busy-waiting

  Ensure mutual exclusion, are deadlock free, and avoid unnecessary delay

  Require a strongly fair scheduler to ensure eventual entry

  Do not control the order in which delayed processes enter their c.s.’s when >= 2 try

Three fair solutions to the critical section problem

  Tie-breaker algorithm

  Ticket algorithm

  Bakery algorithm


Tie-Breaker

In a typical solution, when processes are attempting to enter their c.s., there is no control over which will succeed.

To make it fair, processes should take turns

Peterson’s algorithm uses an additional variable to indicate which process was last to enter its c.s.

Consider the “coarse-grained” program but ... implement the conditional atomic actions in the entry protocol using only simple variables and sequential statements.


Tie-breaker implementation

Could implement the await statement by first looping until its condition is true and then executing the assignment. (Sound familiar?)

But this pair of events is not executed atomically -- does not support mutual exclusion.

If reversed, deadlock can result. (Remember?)

Let last be an integer variable to indicate which was last to start executing its entry protocol.

If both are trying (property1 and property2 are true), the last to start its entry protocol delays.
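A two-process tie-breaker sketch in Python. The list flag stands in for property1 and property2, and last is the tie-breaking variable; peterson_demo and the shared counter are assumptions added to exercise the lock. The sketch relies on CPython executing these simple reads and writes one at a time in program order; on real hardware the same algorithm needs memory fences, so treat this as an illustration, not a portable lock.

```python
import threading
import time

def peterson_demo(iterations=200):
    flag = [False, False]     # flag[0], flag[1] play property1, property2
    last = [0]                # which process was last to start its entry protocol
    counter = [0]             # shared data protected by the critical section

    def process(me):
        other = 1 - me
        for _ in range(iterations):
            flag[me] = True                           # entry protocol
            last[0] = me
            while flag[other] and last[0] == me:      # the last to start delays
                time.sleep(0)                         # yield while busy-waiting
            tmp = counter[0]                          # critical section:
            counter[0] = tmp + 1                      #   non-atomic read, then write
            flag[me] = False                          # exit protocol

    threads = [threading.Thread(target=process, args=(p,)) for p in (0, 1)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter[0]

print(peterson_demo())
```

Setting the flag before writing last is the order that matters: the reversed order admits a history in which both processes enter.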


Tie-breaker Implementation for n

If there are n processes, the entry protocol in each process consists of a loop that iterates thru n-1 stages.

If we can ensure at most one process at a time is allowed to get thru all n-1 stages, then at most one at a time can be in its critical section.


n-process tie-breaker algorithm

See handout (also in Notes half of this slide)

This is quite complex and hard to understand.

But ...

livelock free

avoids unnecessary delay

ensures eventual entry

(A process delays only if some other process is ahead of it in the entry protocol. Every process eventually exits its critical section)


Ticket Algorithm

Based on the idea of drawing tickets (numbers) and then waiting turns

Needs a number-dispenser and a display indicating which number customer is being served

If one processor, customers are served one at a time in order of arrival

(If the ticket algorithm runs for a very long time, incrementing a counter will cause arithmetic overflow.)


int number = 1, next = 1, turn[1:n] = ([n] 0);

process CS[i = 1 to n] {
    while (true) {
        turn[i] = FetchAndAdd(number, 1);    /* entry */
        while (turn[i] != next) skip;
        critical section;
        next = next + 1;                     /* exit protocol */
        noncritical section;
    }
}

What is the global invariant?
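The ticket algorithm above can be sketched in Python as follows. Python has no FetchAndAdd instruction, so this sketch emulates it with a small internal lock, which is exactly the "additional critical section" a machine without the instruction would need. The class TicketLock and the function ticket_demo are hypothetical names.

```python
import threading
import time

class TicketLock:
    def __init__(self):
        self._number = 1                  # next ticket to hand out
        self._next = 1                    # ticket currently being served
        self._faa = threading.Lock()      # emulates the FetchAndAdd instruction

    def _fetch_and_add(self):
        with self._faa:
            ticket = self._number
            self._number += 1
            return ticket

    def acquire(self):
        my_turn = self._fetch_and_add()   # entry protocol: draw a ticket
        while my_turn != self._next:      # spin until our number comes up
            time.sleep(0)

    def release(self):
        self._next += 1                   # exit protocol: only the holder writes next

def ticket_demo(n=4, iterations=50):
    lock = TicketLock()
    counter = [0]

    def process():
        for _ in range(iterations):
            lock.acquire()
            tmp = counter[0]              # critical section
            counter[0] = tmp + 1
            lock.release()

    threads = [threading.Thread(target=process) for _ in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter[0]

print(ticket_demo())
```

Python integers do not overflow, so the counter-overflow concern from the previous slide does not show up here; in a fixed-width language it would.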


Bakery Algorithm

Downside of the ticket algorithm:

  Without FetchAndAdd, require an additional critical section and ITS solution might not be fair.

Bakery algorithm

  fair and does not require any special machine instructions

  Ticket: customer draws unique # and waits for its number to become next

  This: Customers check with each other rather than with a central next-counter to decide on order of service.


To enter its c.s., process CS[i] sets turn[i] to one more than the maximum of all the current values in turn.

Then CS[i] waits until turn[i] is the smallest nonzero value in turn.

What is the global invariant in words (not predicate logic notation)?


Bakery algorithm -- coarse-grain version

int turn[1:n] = ([n] 0);
process CS[i = 1 to n] {
    while (true) {
        < turn[i] = max(turn[1:n]) + 1; >
        for [j = 1 to n such-that j != i]
            <await (turn[j] == 0 or turn[i] < turn[j]); >
        critical section;
        turn[i] = 0;
        noncritical section;
    }
}


Bakery algorithm -- practicality?

The coarse-grain version cannot be implemented directly on contemporary machines

The assignment to turn[i] requires computing the maximum of n values

The await statement references a shared variable (turn[j]) twice.

These could be made atomic by using another c.s. protocol such as the tie-breaker algorithm (inefficient)

What to do?


Initial (wrong) attempts

When n processes need to synchronize, often useful first to develop a two-process solution and then to generalize that solution.

Consider entry protocol for CS1:

    turn1 = turn2 + 1;
    while (turn2 != 0 and turn1 > turn2) skip;

Flip the 1’s and 2’s to get the corresp. for CS2

Is this a solution? What’s the problem?

The assignments and the while loop guards are not implemented atomically. So?


Does the gallant approach work?

If both turn1 and turn2 are 1, let one of the processes proceed and have the other delay. (For example, strengthen the delay loop in CS2 to be turn2 >= turn1.)

Still possible for both to enter their c.s. because of a race condition.

Avoid the race condition: have each process set its value of turn to 1 (or any nonzero value) at the start of the entry protocol. Then it examines the other’s value of turn and resets its own:

    turn1 = 1; turn1 = turn2 + 1;
    while (turn2 != 0 and turn1 > turn2) skip;


Working but not symmetric

One process cannot now exit its while loop until the other has finished setting its value of turn if it is in the middle of doing so.

Who has precedence?

These entry protocols are not quite symmetric. Rewrite them, but first:

Let (a, b) and (c, d) be pairs of integers and define the greater-than relation between such pairs as follows:

(a, b) > (c, d) == true if a > c or if a == c and b > d

                == false otherwise
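The pair ordering defined above is ordinary lexicographic order. As a quick check, this sketch spells the definition out and compares it with Python's built-in tuple comparison; pair_greater is a hypothetical helper name.

```python
def pair_greater(p, q):
    """(a, b) > (c, d) exactly as defined on the slide."""
    (a, b), (c, d) = p, q
    return a > c or (a == c and b > d)

# Compare against Python's built-in tuple ordering on a small grid of pairs.
pairs = [(a, b) for a in range(3) for b in range(3)]
agrees = all(pair_greater(p, q) == (p > q) for p in pairs for q in pairs)
print(agrees)
```

In the bakery setting the first component is the turn value and the second is the process index, so two processes with equal turn values are still strictly ordered.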


Symmetric --> easy to generalize

Rewrite turn1 > turn2 in CS1 as

    (turn1, 1) > (turn2, 2)

Rewrite turn2 >= turn1 in CS2 as

    (turn2, 2) > (turn1, 1)

n-process bakery algorithm -- next slide


n-process bakery algorithm

int turn[1:n] = ([n] 0);
process CS[i = 1 to n] {
    while (true) {
        turn[i] = 1; turn[i] = max(turn[1:n]) + 1;
        for [j = 1 to n such that j != i]
            while (turn[j] != 0 and (turn[i], i) > (turn[j], j)) skip;
        critical section;
        turn[i] = 0;
        noncritical section;
    }
}
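A direct transcription of the n-process bakery algorithm into Python, as a sketch only. The function bakery_demo and the shared counter are assumptions added to exercise the entry and exit protocols, and indices run from 0 rather than 1. It relies on CPython performing each list read and write one at a time in program order; it is not meant as a production lock.

```python
import threading
import time

def bakery_demo(n=3, iterations=50):
    turn = [0] * n            # turn[i] == 0 means process i is not trying to enter
    counter = [0]             # shared data protected by the critical section

    def process(i):
        for _ in range(iterations):
            turn[i] = 1                        # announce: choosing a number
            turn[i] = max(turn) + 1            # take a number above all seen
            for j in range(n):
                if j == i:
                    continue
                # Delay while j is trying and (turn[i], i) > (turn[j], j).
                while turn[j] != 0 and (turn[i], i) > (turn[j], j):
                    time.sleep(0)              # yield while busy-waiting
            tmp = counter[0]                   # critical section
            counter[0] = tmp + 1
            turn[i] = 0                        # exit protocol

    threads = [threading.Thread(target=process, args=(i,)) for i in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter[0]

print(bakery_demo())
```

The pair comparison uses Python's tuple ordering, which matches the greater-than relation on pairs defined earlier, so ties in turn are broken by process index.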


Some interesting points re bakery

Devised by Leslie Lamport in 1974 and improved in 1979

More intuitive than earlier critical section solutions

Allows processes to enter in essentially FIFO order