
PROGRAMMAZIONE CONCORRENTE E DISTRIBUITA

Course organization

The theory lectures will roughly follow this outline:

First part: Foundations (4 weeks):
◦ module-1.1: Basic Concepts;
◦ module-1.2: The Concurrent Programming Abstraction;
◦ module-1.3: The Critical Section Problem;
◦ module-1.4: Verification of Concurrent Systems;
◦ module-1.5: Basic Synchronization Mechanisms & Constructs;

Second part: Design and development of concurrent programs (3 weeks):
◦ module-2.1: Elements of Concurrent Programs Modelling and Design;
◦ module-2.2: Programming Models, Languages and Machines for Building Concurrent Programs;
◦ module-2.3: Concurrent Algorithms;

Third part: Distributed Programming (4 weeks):
◦ module-3.1: Introduction to Distributed Programming;
◦ module-3.2: Channel-based Communication;
◦ module-3.3: Space-based Communication;
◦ module-3.4: Distributed Algorithms;

Fourth part: Reactive and Real-time Systems (2 weeks):
◦ module-4.1: Introduction to Reactive and Real-Time Systems;
◦ module-4.2: Programming languages and technologies;

while the lab practice will follow this one:

Concurrent programming:
◦ module-lab-1.1: Concurrent Programming in Java – Introduction;
◦ module-lab-1.2: Thread Safety;
◦ module-lab-1.3: Basic Building Blocks for Synchronization;

Elements of Concurrent Program Design:
◦ module-lab-2.1: Structuring Programs in Tasks;
◦ module-lab-2.2: GUI Frameworks and Concurrency;

Distributed programming:
◦ module-lab-3.1: Middleware for Distributed Computing;
◦ module-lab-3.2: Case study: Web Services;
◦ module-lab-3.3: Case study: JavaSpaces / Triple Spaces;

Real-time & reactive system programming:
◦ module-lab-4.1: Java Real-time.


Finally, the exam will be structured as follows:

written part: a set of exercises and questions about the theory;

practice part:
◦ a programming assignment assigned during the course, to be done in the lab and at home:
▪ concurrent programming + distributed programming;
▪ report + code;
◦ groups are allowed:
▪ max 2 people;
▪ common code is allowed, but individual reports;

colloquium: discussion of the written and practice parts;

for 30L: oral report/discussion about a scientific article.

Module 1.1 – Introduction

“Concurrency is concerned with the fundamental aspects of systems of multiple, simultaneously active computing agents, that interact with one another”:

systems with multiple activities or processes whose execution overlaps in time; activities can have some kind of dependencies, therefore they can interact.

Concurrent programming means building programs in which multiple computational activities overlap in time and typically interact in some way. A concurrent program is a finite set of sequential programs (processes) that can be executed in parallel (overlapped in time). The execution of a concurrent program is called a concurrent computation.

It is a common mistake to confuse concurrent and parallel programming:

in parallel programming the execution of programs overlaps in time by running on separate physical processors;

in concurrent programming the execution of programs overlaps in time without necessarily running on separate physical processors, by sharing for instance the same processor.

Distributed programming arises when processors are distributed over a network, so there is no shared memory.

Why concurrency and concurrent programming?

Performance improvement:
◦ increased application throughput by exploiting parallel hardware;
◦ increased application responsiveness by optimizing the interplay among CPU and I/O activities;
◦ speedup is the quantitative measure of the performance gain;
◦ Amdahl's law bounds the maximum speedup achievable when parallelizing a system.
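The speedup and Amdahl's law formulas appeared as figures in the original notes; as a reconstruction, the standard definitions they refer to are (with p the parallelizable fraction of the work and N the number of processors):

```latex
% Speedup on N processors: ratio between sequential and parallel execution time
\[
S(N) = \frac{T_{seq}}{T_{par}(N)}
\]
% Amdahl's law: the serial fraction (1 - p) bounds the achievable speedup
\[
S(N) \le \frac{1}{(1 - p) + p/N},
\qquad
\lim_{N \to \infty} S(N) = \frac{1}{1 - p}
\]
```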

Concurrency as a tool for software design and construction:
◦ rethinking the way in which we solve problems, so new basic algorithms & data structures;
◦ rethinking the way in which we design and build systems, so new levels of abstraction;
◦ affecting the full engineering spectrum, so modelling, design, implementation, verification, testing.

A process is an abstract/general concept; it is the basic unit of a concurrent system, a single thread of control: processes are meant to execute completely asynchronously with respect to each other. The basic kinds of interaction among processes are:

cooperation, when interactions are both expected and wanted:
◦ communication, concerning the need to realise an information flow among processes, typically realised in terms of messages;
◦ synchronization, concerning the explicit definition or presence of temporal relationships or dependencies among processes and among actions of distinct processes;

contention/competition, when interactions are expected and necessary but not wanted; it typically concerns the need to coordinate the access by multiple processes to shared resources:
◦ mutual exclusion, ruling the access to shared resources by distinct processes;
◦ critical sections, ruling the concurrent execution of blocks of actions by distinct processes;

interference, referring to interactions which are neither expected nor wanted, producing bad effects only when the ratio among the process speeds assumes specific values (time-dependent errors):
◦ race condition, whenever two or more processes concurrently access and update shared resources, and the result of each update depends on the specific order in which the processes access them;
◦ related to two main types of programming errors:
▪ bad management of expected interactions;
▪ presence of spurious interactions not expected in the problem.

It is quite common to confuse synchronization with mutual exclusion:

synchronization defines a timing relationship among processes, which includes actions happening at the same time, or happening at the same relative rate, or simply some action having to occur before another (precedence relationships);

mutual exclusion defines a restriction on access to shared data, so mutual exclusion is meaningless if no shared data is involved!

relationships:
◦ mutual exclusion typically requires some form of implicit synchronization, blocking some actions while waiting for other actions to complete;
◦ synchronization does not necessarily require any kind of shared data, nor mutual exclusion!

One way or another, the concept of synchronization is tied to some notion of time, whether discrete (marked by computational steps) or continuous: we may want to require that an action be executed only after another one completes (discrete time), or at a precise instant, or after a certain amount of time following a given event (continuous time).

Interferences and errors in concurrent programs can lead to critical situations for the concurrent system as a whole:

deadlock, a situation wherein two or more competing actions (processes) are waiting for each other to finish, and thus neither ever does. Such actions typically concern the release of a locked shared resource or the receipt of a temporal signal or a message;

starvation, wherein a process is blocked in an infinite wait:
◦ resource starvation = the process is perpetually denied access to the resources it needs;
◦ without those resources, the program can never finish its task;

livelock, similar to a deadlock, except that the states of the processes involved in the livelock constantly change with respect to one another, with none of them making progress.

There is a significant difference between the definitions of "critical section" and "mutual exclusion": a critical section does not necessarily imply the presence of a shared resource, while mutual exclusion does. Critical sections are blocks of (possibly different) statements belonging to different processes that we want to be executed one at a time. Note also that starvation does not necessarily imply deadlock: in a deadlock all the interacting processes are blocked waiting for one another, while starvation may affect even a single process that keeps waiting while the others proceed normally with their execution.

Module 1.2 – The concurrent programming abstraction

A model for concurrent program execution could be:

modelling each process as a sequence of atomic actions (executed to completion without the possibility of interleaving), each action corresponding to the atomic execution of a statement;

modelling the execution of a concurrent program as a sequence of actions obtained by arbitrarily interleaving the actions (atomic statements) from the processes;

a scenario is an execution sequence that can occur as a result of the interleaving.

Given the model, the execution of a concurrent program can be formally represented by states and transitions between states:

the state is defined by a tuple consisting of:
◦ one element for each process, which is a statement from that process (the one pointed to by the control pointer of that process);
◦ one element for each global or local variable, which is a value of the same type as that variable;

there is a transition between two states s1 and s2 if executing a statement in state s1 changes the state to s2.

The transitions between the states of a concurrent program are non-deterministic: in the absence of constraints, synchronization or anything else controlling the execution flows of the interacting processes, there is no way to know which control pointer will be chosen to execute its statement (and hence which process will advance in its execution).

The state diagram is a graph containing all the reachable states of the program:

scenarios are represented by directed paths through the state diagram from the initial state;

cycles represent the possibility of infinite computations in a finite graph.


State-explosion problem: the number of states grows very quickly with the number of processes (the original notes show the state diagrams for 2 and for 3 processes).

Synchronization and mutual exclusion mechanisms have the effect of pruning the state graph: they identify the states that can no longer be reached once the coordination mechanisms are introduced. In practice, every synchronization and/or mutual exclusion instruction adds constraints to the problem that reduce the number of states in the diagram.

The atomicity of the actions exploited by the processes involved in the interaction is crucial to determine the possible scenarios and to verify if among them there is any wrong one:

atomic increment: every interleaving yields the correct value n=2;

non-atomic increment: only 2 scenarios out of 6 have the correct value for n (n=2).
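A minimal Java sketch of the non-atomic case (illustrative, not taken from the notes): the shared update is split into a load and a store, so an unlucky interleaving loses one of the two increments.

```java
// Two threads perform a non-atomic increment on a shared variable:
// each does a load followed by a store, so the interleaving
// load-load-store-store leaves n = 1 instead of 2.
public class NonAtomicIncrement {
    static int n = 0;

    public static void main(String[] args) throws InterruptedException {
        Runnable increment = () -> {
            int temp = n;   // load the current value
            n = temp + 1;   // store: may overwrite the other thread's update
        };
        Thread p = new Thread(increment);
        Thread q = new Thread(increment);
        p.start(); q.start();
        p.join(); q.join();
        System.out.println("n = " + n); // 2 in most runs, but 1 is possible
    }
}
```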

But the notion of atomicity can be referred not only to actions, but also to data structures:

a data object is defined atomic if it can be in a finite number of states equal to the number of values that it can assume;

in the case of ADTs (or, more generally, data objects) it is possible to identify two basic types of states, internal and external:
◦ the internal state is meaningful for who defines the data object (class);
◦ the external state is meaningful for who uses the data object;
◦ the correspondence between internal and external states is partial:
▪ there exist internal states which have no corresponding external state;
▪ internal states which have a corresponding external state are defined consistent.

Then, the execution of an operation on a (non-atomic) ADT can go through states that are not consistent:

this is not a problem in the case of sequential programming, thanks to information hiding; conversely, it is a problem in the case of concurrent programming, because a process may happen to work on an object while the object is in an inconsistent state.

In the OO setting, inconsistent states are not a problem thanks to information hiding: the state of an object is not visible from the outside, except when a method is invoked that returns the object's current state. For the external user the invoked method is atomic, since its computational flow is suspended, handed over to the passive object, and resumed only when the operation completes: from the caller's point of view the method call is atomic.

In a concurrent system, instead, the method-invocation logic breaks down: when I interact with another entity I do not (in general) hand it my execution flow and suspend myself, but I keep executing; therefore, if the operation performed as a result of the interaction is not truly atomic (at the machine level), there is no guarantee of obtaining the desired behaviour. Since the method-call paradigm is no longer adequate, in a concurrent setting there is no guarantee that the state of the agents at the moment of the interaction is always consistent.

We've created a model of the system (arbitrary interleaving) in which a kind of global entity executes the concurrent program by arbitrarily interleaving statements (to ease analysis), but in reality a computer system does not have a global state, so is this a good model? Is it valid for real concurrent computing systems?

Arbitrary interleaving means that we ignore time in our analysis of concurrent programs, focusing only on the partial orders induced by action sequences a1, a2, etc. This makes concurrent programs amenable to formal analysis, which is necessary to ensure the correctness of concurrent programs.

Two or more operations executed simultaneously (we may assume they are truly simultaneous if we have 2 processors) can be viewed as executed sequentially if they have no dependencies on one another: in that case we can equally well think of one being executed first and then the other, obtaining the same result (a confluent LTS, if you like). The problem therefore arises with dependencies, where the model is no longer so good: when dependencies come into play, locking mechanisms (mutexes, semaphores, critical sections) are introduced precisely to preserve the integrity of the model, which would otherwise be compromised.

Checking correctness for sequential programs means testing based on specified input and expecting some related specified output. Concurrent programming is a new (challenging) perspective because the same input can give different outputs, depending on the scenario:

some scenarios may give the correct output while others do not;

each time you run the program you will likely get a different scenario;

so we need a different approach, i.e. one based on model checking and the verification of properties of interest:

safety properties, for which the property must always be true (it must be true in every state of every computation). They are expressed as invariants of a computation, typically used to specify that "bad things" never happen:
◦ mutual exclusion, no more than one process is ever present in a critical region;
◦ no deadlock, no process is ever delayed awaiting an event that cannot occur;

liveness properties, for which the property must eventually become true (in every computation there is some state in which it is true). Typically used to specify that "good things" eventually happen:
◦ no starvation, a process eventually gets the resource it needs;
◦ no dormancy, a waiting process is eventually awakened;
◦ reliable communication, a message sent by one process to another will eventually be received.


The fairness property is a liveness property which holds that something good happens infinitely often, e.g. a process is activated infinitely often during an application execution, each process gets a fair turn, or an action that can be executed will eventually be executed. There are different types of fairness:

unconditional fairness, if every (unconditional) atomic action that is eligible is eventually executed;

weak & strong fairness, stronger assumptions than unconditional fairness, concerning the selection of conditional atomic actions.

Does this algorithm necessarily halt?

Yes, if we assume only fair scenarios: in that case an execution of q1 must be included in every scenario.

Module 1.3 – The critical section problem

The critical section problem definition involves:

N processes, each executing in an infinite loop (or not) a sequence of statements that can be divided into 2 subsequences: the critical section (CS) and the non-critical section (NCS);

each critical section is typically a sequence of statements that access some shared object.

Our task is to design entry and exit protocols that satisfy the following properties:

mutual exclusion, because statements from the critical sections of two or more processes must not be interleaved;

freedom from deadlock, so that if some processes are trying to enter their critical sections, then one of them must eventually succeed;

freedom from individual starvation, to ensure that if any process tries to enter its critical section, then that process must eventually succeed.

Any proposed solution must also satisfy the progress property (weakly fair scenarios) for the CS: once a process starts to execute the statements of its critical section, it must eventually finish. Note that the NCS may loop; we don't care about that: it's not a matter of synchronization.

Now we focus on a purely algorithmic solution, without using high-level mechanisms but solely basic atomic statements.


Let’s consider a first possible solution:

The statement await turn = 1 is an implementation-independent notation for a statement that waits until the condition turn = 1 becomes true (it can be implemented by a loop that does nothing until the condition is true). For studying correctness, we can reduce the number of states without altering the semantics of the program by removing unessential statements: whatever statements are executed in the CS and in the NCS are totally irrelevant to the correctness of the synchronization algorithm, so they can be removed:
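The figure with this first attempt is omitted in the transcript; a hedged Java reconstruction of the idea (a shared turn variable and busy-waiting awaits) could look like this:

```java
// Hedged reconstruction of the first attempt: a shared 'turn' variable,
// each process busy-waits until it is its own turn (await turn = me),
// enters the critical section, then passes the turn to the other process.
public class FirstAttempt {
    static volatile int turn = 1;

    static void process(int me, int other) {
        for (int i = 0; i < 5; i++) {
            // non-critical section
            while (turn != me) { /* await turn = me (busy wait) */ }
            System.out.println("process " + me + " in its critical section");
            turn = other;  // post-protocol: give the turn to the other process
        }
    }

    public static void main(String[] args) {
        new Thread(() -> process(1, 2)).start();
        new Thread(() -> process(2, 1)).start();
    }
}
```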

To check the correctness we have to check that the three properties hold:

mutual exclusion is satisfied: there are no scenarios including states of the kind <p2,q2,_>;

freedom from deadlock is satisfied: if some processes are trying to enter the CS (executing the await), then one eventually succeeds;

freedom from starvation is not satisfied: a process that wants to enter its CS (pre-protocol) can starve while waiting for a process that has engaged an infinite loop inside its NCS (where the progress property does not hold).

We introduce separate want variables intended to indicate when a process is in its CS:

from which, by removing irrelevant statements, we obtain:


By constructing the state diagram it is simple to see that the mutual exclusion property is not satisfied, because the state <p3,q3,true,true> is reachable: the problem is due to the non-atomicity of the pre-protocol and post-protocol.

The previous problem can be solved by considering the await statement as part of the CS and moving the assignment before the await:

The mutual exclusion problem is now solved; however, the solution suffers from possible deadlocks: both processes may be trying to enter their CS but neither will ever succeed.

Deadlock occurs when both processes simultaneously insist on entering their CS: we can solve the problem by requiring a process to give up its intention to enter its CS if it discovers that it is contending with the other process:

This way the deadlock is solved, but there is still the possibility of starvation, in the case of a perfect interleaving in the execution of the while loops.

The problem can be solved with a simple variation of this last algorithm, requiring that the right to insist on entering (instead of the right to enter) is explicitly passed between the processes by means of the variable turn. This is Dekker's algorithm:


Dekker's algorithm is correct: it satisfies mutual exclusion, and it is free from deadlock and starvation.

Dekker's algorithm works on any architecture providing just load and store as atomic statements. Actually, the CS problem and mutual exclusion problems in general can be greatly simplified by exploiting more complex atomic statements directly provided by the concurrent machine: the main examples are the test-and-set, exchange, fetch-and-add and compare-and-swap operators.

The test-and-set operator is an atomic statement defined as the execution of the two following statements with no possibility of interleaving: test-and-set(x,r) means < r := x, x := 1 > (atomically). This way the CS problem can be solved by an easier algorithm:
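As an illustration (not the notes' pseudo-code), in Java an AtomicBoolean offers a test-and-set-like primitive via getAndSet, so a simple spin lock can be sketched as follows:

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Spin lock based on a test-and-set-like primitive: getAndSet(true)
// atomically returns the old value and writes 'true', mirroring
// test-and-set(x, r) = < r := x, x := 1 >.
class TestAndSetLock {
    private final AtomicBoolean locked = new AtomicBoolean(false);

    void lock() {
        // keep testing-and-setting until the previous value was false,
        // i.e. until we are the ones who flipped it from free to taken
        while (locked.getAndSet(true)) { /* busy wait */ }
    }

    void unlock() {
        locked.set(false);  // release: the next test-and-set will succeed
    }
}
```

A process would simply wrap its critical section between lock() and unlock().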

Are we sure, though, that this mechanism guarantees freedom from starvation? Only if we assume weakly fair scenarios; otherwise a process might never get to perform its test-and-set. Note also that the weakly-fair assumption is not that strong anyway: saying that sooner or later a process gets through is not much, because we know neither when nor within how long (no upper bound).

By exploiting compound atomic statements we can easily realize a basic lock mechanism with two stages/atomic operations: acquiring the lock and releasing the lock.

Atomic statements in Java can be implemented using synchronized blocks on the same shared lock object:
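A minimal sketch of this idea (illustrative names): statements wrapped in synchronized blocks on the same shared lock object cannot interleave with each other.

```java
// Blocks synchronized on the same shared object execute under mutual
// exclusion, so the read-modify-write below behaves as an atomic statement
// with respect to every other block guarded by the same lock.
class SharedCounter {
    private final Object lock = new Object(); // the shared lock object
    private int n = 0;

    void increment() {
        synchronized (lock) {  // at most one thread at a time in this block
            n = n + 1;
        }
    }

    int value() {
        synchronized (lock) {
            return n;
        }
    }
}
```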


An alternative solution to the CS problem for N processes introduces a ticket to establish the turn of a process:

The ticket algorithm can be implemented directly on machines that have an instruction like fetch-and-add: the bakery algorithm is a version of the ticket algorithm working also on base machines (introduced by Leslie Lamport in 1974, more complex).
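A hedged Java sketch of the ticket idea, using AtomicInteger.getAndIncrement as the fetch-and-add instruction (names are illustrative):

```java
import java.util.concurrent.atomic.AtomicInteger;

// Ticket lock: getAndIncrement plays the role of fetch-and-add, handing out
// increasing ticket numbers; a process enters its critical section when the
// "now serving" counter reaches its ticket, so entry happens in FIFO order.
class TicketLock {
    private final AtomicInteger nextTicket = new AtomicInteger(0);
    private final AtomicInteger nowServing = new AtomicInteger(0);

    void lock() {
        int myTicket = nextTicket.getAndIncrement();       // take a ticket
        while (nowServing.get() != myTicket) { /* busy wait for my turn */ }
    }

    void unlock() {
        nowServing.incrementAndGet();                      // serve the next one
    }
}
```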

Module 1.4 – Basics on verification of concurrent programs

There are two principal (classes of) formal techniques to verify concurrent programs:

model checking, where verification is done by generating one by one all the states of the system and by checking the properties state by state (it can be automated by model-checker tools);

inductive proofs of invariants, in which invariant properties are proved by induction over the states of the system (it can be automated by tools called deductive systems).

With propositional calculus, correctness properties are expressed as logic formulae that must be true in order to verify the property in some state of the system: in our case propositions are about the values of the variables and of the control pointers during an execution of a concurrent program and each label of a statement of a process will be used as an atomic proposition whose interpretation is "the control pointer of that process is currently at that label".

Processes and systems change their state over the time, and then also the interpretation of formulae about their state can change over the time so we need a formal language/calculus that would take this aspect into the account. A temporal logic is a formal logic obtained by adding temporal operators to propositional or predicate logic: we adopt Linear Temporal Logic (LTL) to express properties that must be true (at a state) for every possible scenario (linear/discrete model of time).

LTL is based on two basic temporal operators, always and eventually:

box (or always) temporal operator, ▢A:
◦ the formula ▢A is true in a state i of a computation if and only if the formula A is true in all states j with j >= i;
◦ the always operator can then be used to specify safety properties, because it specifies what must always be true;
◦ mutual exclusion property: ▢⌝(s1 /\ s2);


diamond (or eventually) temporal operator, ◇A:
◦ the formula ◇A is true in a state i of a computation if and only if the formula A is true in some state j with j >= i;
◦ the eventually operator is used to specify liveness properties, because it specifies something that must eventually become true;
◦ progress property for one shot (no loops): s1 → ◇s2;
◦ progress property for loops: ▢(s1 → ◇s2).

To express the liveness property of a process that wants to enter its critical section it is not enough to write that the pre-protocol state implies "eventually the CS state": with pre-protocol, CS and post-protocol inside a loop, the property would be satisfied even if the process enters its critical section in just one iteration out of the infinitely many possible ones (and perhaps fails to do so at the next one). What we want is the liveness property to hold at every iteration (in every state): the implication must therefore be prefixed by the always operator (whatever state you are in, it must hold that if you are in the pre-protocol you will eventually enter the CS).

Basic properties of these operators are:

reflexivity:
◦ ▢A → A;
◦ A → ◇A;

duality:
◦ ⌝▢A = ◇⌝A;
◦ ⌝◇A = ▢⌝A.

Always and eventually are unary operators. An example of a useful and frequently used binary operator is until:

Until operator, A U B:
◦ A U B is true in a state si if and only if B is true in some state sj, j >= i, and A is true in all states sk, i <= k < j;
◦ that is, eventually B becomes true and A is true until that happens;

Weak-Until operator, A W B:
◦ like the Until operator, but formula B is not required to eventually become true. If it does not, A must remain true indefinitely;
◦ A W B = as long as B is false, A must be true.

Weak-Until is mostly used when freedom from starvation is too weak a property for the system of interest: in the scenario < tryP, tryQ, CSq, tryQ, CSq, ..., tryQ, CSq, CSp > the property expressed by the LTL formula ◇CSp is true, but it's not enough! We need the k-bounded-overtaking property:

from the time a process p attempts to enter its critical section, another process can enter at most k times before p does;

the property can be expressed by the weak until operator W, i.e. with 1-bounded-overtaking we have, in LTL notation: tryP → ⌝CSq W CSq W ⌝CSq W CSp.

Model checking is a strategy based on exhaustively searching the entire state space of a system and verifying whether certain properties are satisfied: if the system satisfies the property, the model checker generates a confirmation response, otherwise it produces a trace (counterexample) useful to identify bugs. The big problem of the model-checking technique is the size of the state space: how can we manage graphs of millions of states? Is it feasible?

SPIN is a widely used model-checker used in both academic research and industrial software development and PROMELA is the language that is used in Spin to write concurrent programs. JPF is a recent model-checker specialized for the verification of programs written in Java: it is a special JVM executing programs theoretically along all possible scenarios (execution paths), checking for property violations (deadlocks, uncaught exceptions, etc.) and if it finds an error JPF reports the whole execution that leads to it.

Invariants can be proved using induction over the states of all the computations. To prove that A is an invariant:

prove that A is true in the initial state (the base case);

assume that A is true in a generic state S (inductive hypothesis);

prove that A is true in all the possible states following S (inductive step).

Note that safety properties are easier to verify:

a safety property must be true in all states, so it is sufficient to find a single state that does not satisfy the property to settle the verification (states can be checked one by one);

a liveness property claims that a state satisfying the property will inevitably occur, so it is not sufficient to check states one by one: it is necessary to check all possible scenarios (which requires more complex theory and software techniques).

Module 1.5 – Processes coordination with semaphores

Semaphores are a very simple but powerful general-purpose construct which makes it possible to solve almost any mutual exclusion and synchronisation problem. They are a primitive data type provided by the concurrent machine:

S.V is an integer >= 0; S.L is a set of process ids; it can be initialised with:
◦ a value k >= 0 for S.V;
◦ the empty set {} for S.L;

it provides two basic atomic operations:
◦ wait(S) (also called P(S), from Dijkstra's original choice);
◦ signal(S) (also called V(S), from Dijkstra's original choice).

A semaphore S must satisfy the following invariants: S.V >= 0 /\ S.V = k + #signal(S) - #wait(S), where k is the initial value of the integer component of the semaphore, #signal(S) is the number of signal(S) statements that have been executed, and #wait(S) is the number of wait(S) statements that have been completely executed (a process that is blocked while executing wait(S) is not considered to have successfully executed the statement).

Semaphores can be one of three types:

Mutex (or binary semaphores), semaphores whose integer component can take only two values, 0 and 1 (the name derives from their typical use for implementing mutual exclusion);

General (or counting semaphores), semaphores whose integer component can take any value >= 0;

Event semaphores, initialised with 0 (used for synchronisation purpose).

Using a semaphore as a lock, the solution of the critical section problem for two processes is trivial:

(abbreviated code)

(blocked state for P and Q labelled as p1B and q1B)
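As a concrete counterpart (a sketch, not the notes' pseudo-code), the same scheme can be written with java.util.concurrent.Semaphore: acquire() plays the role of wait(S) and release() the role of signal(S).

```java
import java.util.concurrent.Semaphore;

// A binary semaphore initialised to 1 used as a lock around the critical
// section: both processes run the same protocol, and mutual exclusion on
// the shared variable is guaranteed.
public class CriticalSectionDemo {
    static final Semaphore s = new Semaphore(1);
    static int shared = 0;

    static void process() {
        for (int i = 0; i < 1000; i++) {
            try {
                s.acquire();          // wait(S): pre-protocol
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
            try {
                shared++;             // critical section
            } finally {
                s.release();          // signal(S): post-protocol
            }
            // non-critical section
        }
    }

    public static void main(String[] args) throws InterruptedException {
        Thread p = new Thread(CriticalSectionDemo::process);
        Thread q = new Thread(CriticalSectionDemo::process);
        p.start(); q.start();
        p.join(); q.join();
        System.out.println("shared = " + shared); // always 2000
    }
}
```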

The same solution also applies for N processes, but there is no longer freedom from starvation.

But if we only considered weakly-fair scenarios, wouldn't we be guaranteed that sooner or later each process's current action gets its turn? Actually no: choosing among a pool of actions belonging to different processes is not the same as choosing which process to give the floor to. The first choice concerns the notion of fairness, while the second involves the scheduler: at the level of the state diagram we cannot manipulate it (according to the interleaving model we are using).

Semaphores also provide a basic mechanism to synchronise processes, solving (partial) order-of-execution problems:


The producer-consumer problem is an example of an order-of-execution problem with two types of processes:

producers execute a statement produce to create a data element and then send this element to the consumer process;

consumers, upon receipt of a data element from a producer process, execute a statement consume with the data element as a parameter.

When a data element must be sent from one process to another, the communication can be:

synchronous, if the communication cannot take place until both the producer and consumer are ready to do so;

asynchronous, if the communication channel itself has some capacity for storing data elements.

The asynchronous case needs the introduction of a proper buffer in which to store and retrieve data (a shared data structure with mutable state, read by consumers and written by producers):

if there is an infinite buffer, there is only one interaction that must be synchronised: the consumer must not attempt a take operation from an empty buffer (notEmpty is called a resource semaphore):

The invariant for this case is notEmpty.V = #buffer (actually true only if p2+p3 and q1+q2 are considered atomic);

if the buffer is bounded, there is also another interaction that must be synchronised: the producer must not attempt an append operation on a buffer which is full (notEmpty and notFull are an example of split semaphores):

The invariant now is notEmpty.V + notFull.V = N.

Note that the invariant of the "split semaphores" holds only at the beginning of each loop iteration, not necessarily in the states inside the loop: in particular, it does not hold between the wait() and the signal().


As a generalisation of the previous case, we consider the shared use of a non-atomic data structure (a buffer in this case), hence with non-atomic operations, introducing a mutex to also guarantee mutual exclusion:
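A hedged Java sketch of this scheme (illustrative names): the split semaphores count elements and free slots, and a mutex protects the non-atomic operations on the underlying queue.

```java
import java.util.ArrayDeque;
import java.util.Queue;
import java.util.concurrent.Semaphore;

// Bounded buffer with split semaphores (notEmpty.V + notFull.V = N at the
// beginning of each iteration) plus a mutex guarding the non-atomic queue.
class BoundedBuffer<T> {
    private final Queue<T> buffer = new ArrayDeque<>();
    private final Semaphore notEmpty = new Semaphore(0);   // elements available
    private final Semaphore notFull;                        // free slots available
    private final Semaphore mutex = new Semaphore(1);       // guards the queue

    BoundedBuffer(int capacity) {
        notFull = new Semaphore(capacity);
    }

    void put(T item) throws InterruptedException {
        notFull.acquire();         // wait for a free slot
        mutex.acquire();
        buffer.add(item);          // non-atomic operation, under mutex
        mutex.release();
        notEmpty.release();        // one more element available
    }

    T take() throws InterruptedException {
        notEmpty.acquire();        // wait for an element
        mutex.acquire();
        T item = buffer.remove();  // non-atomic operation, under mutex
        mutex.release();
        notFull.release();         // one more free slot
        return item;
    }
}
```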

There is another possible classification of semaphores, related to their ability to ensure liveness (the invariant properties remain the same):

strong semaphores;
weak semaphores;
busy-wait semaphores.

In a strong semaphore S.L is not a set but a queue:

this way starvation is impossible for any number N of processes! The earlier definition of semaphores we talked about was for weak semaphores, where S.L is a set and starvation is avoided only in 2-process scenarios.

Busy-wait semaphores do not have the S.L field:

this way we lose freedom from starvation even with only 2 processes!

Busy-wait semaphores are appropriate in a multi-processor system when the waiting process has its own processor and is not wasting CPU time that could be used for other computation.

In the Dining Philosophers problem the challenge is to design pre- and post-protocols to ensure the usual properties (mutual exclusion, freedom from deadlock, freedom from starvation), considering that a philosopher can eat only if he/she has two forks (and trying to preserve performance in cases with no contention).

When introducing locking mechanisms, or synchronization techniques in general, keep in mind that the implemented policies must not affect the performance of the system in situations where there is no contention. The part of the system that does not require interaction management (because there are no interactions, or because for some reason they need not be handled) must keep working as if no interaction-protection mechanism had been introduced.

Let's try a first solution where each fork is modelled as a semaphore, so performing a wait(S) means taking a fork, while performing a signal(S) amounts to releasing the fork previously acquired:

this way we achieve mutual exclusion but not freedom from deadlock (and hence not freedom from starvation either). Deadlock happens if every philosopher picks up the first fork from the same side (all left or all right).

A first solution to the deadlock problem can be found by observing that at most N-1 philosophers should be allowed to try to eat at the same time. Introducing a turn system with N-1 tickets available we ensure all the above properties:

Another solution can be found by observing that, given the total order among the identifiers of the forks (i.e. with N philosophers, forks numbered from 0 to N-1), the last philosopher was picking the forks in the opposite order with respect to the other philosophers (first fork N-1 and then fork 0, while the others pick up the lower-index fork first): the solution is therefore to always pick the forks in the same order (i.e. from the lower-index one to the higher). This is a general solution that works regardless of the number of resources and the number of interacting processes, because it makes it impossible to have a circular wait-for dependency among processes, which is a necessary condition for deadlock.
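A hedged Java sketch of the resource-ordering solution (illustrative, with 5 philosophers): each fork is a binary semaphore and every philosopher always acquires the lower-numbered fork first, so no circular wait can arise.

```java
import java.util.concurrent.Semaphore;

// Dining philosophers with a global ordering on forks: always acquiring the
// lower-indexed fork first makes a circular wait-for chain impossible.
public class DiningPhilosophers {
    static final int N = 5;
    static final Semaphore[] forks = new Semaphore[N];

    static void philosopher(int i) throws InterruptedException {
        int left = i, right = (i + 1) % N;
        int first = Math.min(left, right);   // lower-indexed fork first
        int second = Math.max(left, right);
        for (int meal = 0; meal < 3; meal++) {
            // think
            forks[first].acquire();
            forks[second].acquire();
            System.out.println("philosopher " + i + " is eating");
            forks[second].release();
            forks[first].release();
        }
    }

    public static void main(String[] args) {
        for (int i = 0; i < N; i++) forks[i] = new Semaphore(1);
        for (int i = 0; i < N; i++) {
            final int id = i;
            new Thread(() -> {
                try { philosopher(id); } catch (InterruptedException e) { }
            }).start();
        }
    }
}
```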

In the readers/writers problem we divide the processes into two classes:

readers, which are required to exclude writers but not other readers;

writers, which are required to exclude both readers and other writers.

The problem is an abstraction of access to databases (or any kind of shared resource): there is no danger in having processes read the data concurrently, but writing or modifying the data must be done under mutual exclusion to ensure its consistency.

Solutions must satisfy these invariants:


First we look at an over-constrained solution to be sure not to do this :)

This way we are serialising access for readers too, not only for writers (loss of performance even when no contention takes place)!

The solution uses another semaphore (functioning as a lock) dedicated to readers for the update of a common data structure, that is, the number of readers currently accessing the resource (simultaneously):

This solution exploits a couple of simple observations:

only the first reader that wants to access the resource needs to acquire its lock; those arriving later (if any) access it directly, because if the first one managed to take the lock it means that no writer was accessing the resource (and none will, precisely because the lock has been taken);

moreover, only the last process that finishes accessing the resource must release the lock, because if any of the earlier ones released it we could end up with a writer and some readers on the resource at the same time.
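A hedged Java sketch of this scheme (illustrative names): rw excludes writers (and the whole group of readers), while mutex protects the reader counter, so only the first reader acquires rw and only the last one releases it.

```java
import java.util.concurrent.Semaphore;

// Classic readers/writers pre/post-protocols with semaphores.
class ReadersWriters {
    private final Semaphore rw = new Semaphore(1);      // resource lock
    private final Semaphore mutex = new Semaphore(1);   // guards nReaders
    private int nReaders = 0;

    void startRead() throws InterruptedException {
        mutex.acquire();
        nReaders++;
        if (nReaders == 1) rw.acquire();  // first reader locks out writers
        mutex.release();
    }

    void endRead() throws InterruptedException {
        mutex.acquire();
        nReaders--;
        if (nReaders == 0) rw.release();  // last reader lets writers in
        mutex.release();
    }

    void startWrite() throws InterruptedException {
        rw.acquire();   // a writer excludes readers and other writers
    }

    void endWrite() {
        rw.release();
    }
}
```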


Module 1.6 – Processes coordination with monitors

Semaphores are a powerful construct but very low level and hard to use in complex concurrent programs, so we look for higher-level abstractions: monitors.

A monitor is a concurrent programming data structure encapsulating the synchronisation and mutual exclusion policy for accessing a resource (data structure). It is a generalisation of the object notion in OOP: classes encapsulating data + operations + synchronisation (or mutual exclusion) policies.

Monitors have an intrinsic (implicit) mutual exclusion property: their procedures (aka methods in Java) are executed at most one at a time. This way the programmer need not use other explicit mechanisms for synchronization (like <wait()> and <signal()> primitives): if procedures of the same monitor are called by more than one process, the implementation itself ensures that they are executed under mutual exclusion (and atomically w.r.t. one another); conversely, if methods from different monitors are called, their executions can be interleaved.

Pay attention to the fact that no queues are associated with a monitor, so processes suspended due to the monitor's mutual exclusion property are subject to starvation!

To allow explicit pure synchronization, monitors exploit condition variables: primitive data types that can be used to suspend and resume processes inside a monitor, representing conditions (events) on the monitor state that processes wait for and that eventually become satisfied. Each condition variable has a FIFO queue of blocked processes managed by means of two basic operations:

<waitC(cond)> suspends the execution of the performing process and releases the lock on the monitor;

<signalC(cond)> awakes the first process in the blocked-processes queue, giving it the lock.

Note that the release of the monitor's lock by <waitC(cond)> is fundamental to correctly manage suspended processes: if they did not release the lock, no other process could enter the monitor, so no one could perform a <signalC(cond)> and the suspended processes would be stuck forever!

To better manage wait and signal policies, monitors usually provide other primitives:

<emptyC(cond)> checks whether the queue associated with the variable <cond> is empty;
<signalAll(cond)> awakes all the processes suspended in <cond>'s queue in one shot;
<wait(cond, rank)> gives priorities to waiting processes: lower values mean higher priority;
<minrank(cond)> returns the <rank> value of the first process in the queue (i.e. the minimum one – the first to be resumed).
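As an illustration of the monitor concept in Java (a sketch, not code from the notes): synchronized methods give the implicit mutual exclusion of monitor entries, while wait()/notifyAll() play the role of waitC/signalAll; since Java follows a signal-and-continue discipline, conditions are re-tested in a while loop.

```java
// A bounded buffer written in monitor style: mutual exclusion is implicit
// (synchronized methods), and the two conditions "not full" / "not empty"
// are waited for and signalled through wait()/notifyAll().
class MonitorBoundedBuffer<T> {
    private final Object[] items;
    private int head = 0, tail = 0, count = 0;

    MonitorBoundedBuffer(int capacity) { items = new Object[capacity]; }

    synchronized void put(T item) throws InterruptedException {
        while (count == items.length) wait();   // wait until not full
        items[tail] = item;
        tail = (tail + 1) % items.length;
        count++;
        notifyAll();                             // signal "not empty"
    }

    @SuppressWarnings("unchecked")
    synchronized T take() throws InterruptedException {
        while (count == 0) wait();               // wait until not empty
        T item = (T) items[head];
        items[head] = null;
        head = (head + 1) % items.length;
        count--;
        notifyAll();                             // signal "not full"
        return item;
    }
}
```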


Using monitors we can easily implement a semaphore, with two alternative (but equivalent) implementations:
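One of the two alternatives, sketched in Java monitor style (an assumption about what the omitted figure shows, using the built-in wait/notify in place of explicit condition variables):

```java
// A (weak) counting semaphore implemented as a monitor: the monitor's
// implicit lock protects the counter, and the condition "value > 0" is
// waited for and signalled through wait()/notify().
class MonitorSemaphore {
    private int value;  // plays the role of S.V

    MonitorSemaphore(int initial) { value = initial; }

    synchronized void semWait() throws InterruptedException {   // wait(S) / P(S)
        while (value == 0) wait();  // suspend until a signal makes value > 0
        value--;
    }

    synchronized void semSignal() {                              // signal(S) / V(S)
        value++;
        notify();                   // wake up one suspended process, if any
    }
}
```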

To avoid misunderstandings, always keep in mind that monitors and semaphores are not the same thing and do not behave in the same way!

The last remark talks about the "signaling semantics", but what does this mean?

When a process executes a signal primitive, one suspended process must be chosen for awakening: this way we could have two processes performing a monitor entry (method) at the same time, the one performing the signal and the one resumed by that signal! Should we force programmers to make signal always the last statement performed inside every monitor entry? Not at all...

We need to define a proper signaling discipline (semantics) to ensure that a monitor's entry is always executed by exactly one thread. Given S = precedence of the signalling processes, W = precedence of the waiting processes and E = precedence of processes blocked on an entry, we have:

Signal and Continue:
◦ the signaller continues and the signalled process executes at some later time (such as when the signaller process ends the execution of the monitor's entry);
◦ E < W < S, so the condition must be re-tested;
◦ non-preemptive semantics;


Signal and Wait:
◦ the signalled process executes now and the signaller waits, eventually competing with the other processes waiting to enter the monitor;
◦ E = S < W, no need to re-test the condition;
◦ preemptive semantics;

Signal and Urgent Wait:
◦ like Signal and Wait, but the signaller has priority over the processes waiting for the lock;
◦ E < S < W;
◦ it is the classic solution for monitors.

All the coordination problems previously seen can be solved with monitors; for instance, here is a pseudo-code implementation solving the readers/writers problem in two different ways:

Now, let's introduce a brand new coordination problem that best mimics a complex resource-allocation problem, the barber's shop:

it consists of a waiting room with unlimited seats (we suppose this just to ease problem analysis) and a barber room with one barber chair;

there are N customers and one barber; customers alternate between growing hair and getting a haircut, while the barber alternates between sleeping and cutting hair;

if there are no customers to be served, the barber sleeps; if there is a customer but the barber is busy with another one, then the customer waits until the barber is free; if the barber is asleep, the customer must wake up the barber.

Given these rules, we can abstract from the problem details, finding that the customers are clients requesting some kind of service, the barber is the server process capable of carrying out such tasks, and the barber shop is the monitor responsible for client-client and client-server coordination. This kind of coordination needs multiple rendez-vous:

the barber has to wait for a customer to arrive and a customer has to wait for the barber to be available;

the customer needs to wait until the barber has finished giving him a haircut (we model this with the barber opening the exit door);

before closing the door, the barber needs to wait until the customer has left the shop (although the barber may have opened the door, the client may still be in the barber's chair, preventing other customers from getting their haircut).

Another well-known coordination problem, whose solution schema can also be applied to other kinds of problems, is the disk-scheduling problem: we have several concurrent processes requesting access to the disk content, and this access needs to be made sequential to prevent data loss. In doing so we must minimise the disk-access time, that is, seek time (to move the mechanical arm to the right cylinder) + rotational latency (to find the right sector):

FCFS (First Come First Served) scheduling:
◦ requests are served in FIFO order;
◦ fairness, but poor seek time;

SSTF (Shortest-Seek-Time-First) scheduling:
◦ serving first the requests with the lowest seek time from the current head position;
◦ possible starvation;

SCAN scheduling (elevator algorithm):
◦ the arm moves forward and backward, serving the requests it encounters;
◦ no starvation;

C-SCAN scheduling:
◦ like SCAN but serving requests only along one direction;

LOOK and C-LOOK scheduling:
◦ like SCAN and C-SCAN but constraining the movement of the arm to the cylinders with pending requests.


A possible solution uses a monitor functioning as a scheduler, separated from the resource to be controlled (the disk): this way all users of the shared resource must follow the same protocol to ensure the correctness of concurrent accesses.

Following this approach and using, for instance, the C-SCAN strategy, we can identify the problem invariants, given: position = current head position (-1 means not being accessed), C = requests for cylinders >= current head position (so scanned in the current direction) and N = requests for cylinders < current head position (so scanned in the opposite direction).

Using two condition variables c and n for C and N

In such a solution all the processes must follow the required protocol for requesting the disk, then using and releasing it, because if any process fails to follow this protocol, the scheduling is defeated!

So we have two possible alternative solutions that do not constrain processes to follow a pre-defined protocol:

a Disk Interface monitor can be used, encapsulating both the scheduler and the disk access;

usage of nested monitors.

To end this section dedicated to the monitor abstraction, we note that just as a monitor can easily implement a semaphore (as we saw at the beginning), a proper usage of semaphores can mimic a monitor according to the desired signalling discipline:

In particular we have:

one semaphore mutex for mutual exclusion;

for each condition variable, a semaphore <conSem> and a counter <condCount> keeping track of the number of processes suspended on the variable.

Module 2.1 – Concurrent programs design

To model a concurrent system we need to follow a proper methodology, avoiding direct implementation, which often quickly ends in buggy code:

in order to exploit concurrency whenever possible, we must identify task and data decomposition. Tasks are sequences of instructions corresponding to some piece of work to be carried out as the responsibility of some kind of agent;

then we need to find their dependencies, so as to set up proper coordination policies avoiding interferences;

finally, we must choose a suitable concurrent architecture, mapping tasks onto active entities and modelling shared data structures as passive monitor-like entities.

The first step can be carried out following two different decomposition patterns, choosing the one that best fits the natural problem definition:

we talk about a task decomposition pattern if the problem can be naturally decomposed into a collection of independent or nearly independent tasks:
◦ tasks must be sufficiently independent so that managing dependencies takes only a small fraction of the program's overall execution time;
◦ in this case tasks are identified first, then we analyse how data can be decomposed given such a task decomposition;

otherwise we have a data decomposition pattern when the problem's data can be easily decomposed into units over which it is possible to operate relatively independently:
◦ in this case task decomposition follows data decomposition, because data units are identified first, then the tasks related to those units can be defined.

When problem decomposition is done we must analyse dependencies (following the second design step), grouping and ordering tasks in order to avoid possible race conditions due to temporal or data dependencies. Conceptual classes help us to carry out this kind of analysis, providing a logical and abstract view of the solution:

result parallelism means designing the system around the data structure yielded as the ultimate result:
◦ we get parallelism by computing all the elements of the result simultaneously;
◦ (strictly related to the data-decomposition pattern!);
◦ each agent is assigned to produce one piece of the result (working in parallel);
◦ proper shared data structures have to be designed to wrap the result under construction, encapsulating mutex and synchronization issues;
◦ think about the components of a house (front, rear, side walls, roof, ...): proceed by building all the components simultaneously, assembling them as they are completed;

specialist parallelism amounts to designing the system around an ensemble of specialists connected into a logical network (with some implicitly defined topology):
◦ each agent is assigned to perform one specified kind of task (and only that one);
◦ they all work in parallel to carry out the tasks for which they are useful;
◦ coordination means are used to support specialist communication and coordination (message boxes, blackboards, event services, ...);
◦ (opposite approach with respect to result parallelism!);
◦ from the house example, this means identifying the separate skills required to build the house (surveyors, excavators, roofers, ...), then assembling a construction crew in which each skill is represented by a separate specialist worker: they all start simultaneously (but initially most workers will have to wait around);

agenda parallelism means designing the system around a particular agenda of tasks and then assigning many workers to each step (which can be carried out in parallel):
◦ each agent is assigned to help out with the current item on the agenda;
◦ every agent is capable of doing everything, thus performing different jobs depending on the current task;
◦ shared data structures are designed to manage the data consumed and produced by agenda activities;
◦ back to the house: we assemble a work team of generalists, each capable of performing any construction step, so that the team is set to work stage by stage following the agenda.

The last step in the design methodology is the choice of a suitable architecture to bridge the gap between the conceptual classes and the implementation:

Master-Workers architecture;
◦ Fork-Join variant;

Filter-Pipeline;

Announcer-Listeners;
◦ Event-Based Coordination pattern (a sub-pattern of Announcer-Listeners).

In the Master-Workers pattern the architecture is composed of a master agent and a possibly dynamic set of worker agents, interacting through a proper coordination medium functioning as a bag of tasks (see the sketch after this list):

the master agent decomposes the global task into subtasks, assigns the subtasks by inserting their descriptions into the bag, and finally collects the task results;

worker agents get the tasks to do from the bag, execute them and communicate the results (not necessarily by putting them back in the bag);

the bag of tasks resource is typically implemented as a blackboard or a bounded buffer.
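A hedged Java sketch of the pattern (illustrative names): the bag of tasks is a blocking queue, workers repeatedly take a subtask, execute it and send the result back, and the master decomposes the job and collects the results.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Master-Workers: the master fills the bag with subtasks and drains the
// result queue; each worker loops taking a task and producing its result.
public class MasterWorkers {
    public static void main(String[] args) throws InterruptedException {
        final int nTasks = 10, nWorkers = 3;
        BlockingQueue<Integer> bag = new LinkedBlockingQueue<>();      // bag of tasks
        BlockingQueue<Integer> results = new LinkedBlockingQueue<>();  // results

        for (int w = 0; w < nWorkers; w++) {
            Thread worker = new Thread(() -> {
                try {
                    while (true) {
                        int task = bag.take();        // get a task from the bag
                        results.put(task * task);     // execute it, send the result
                    }
                } catch (InterruptedException e) { /* worker terminates */ }
            });
            worker.setDaemon(true);  // let the JVM exit once the master is done
            worker.start();
        }

        // master: decompose the global task into subtasks
        for (int i = 1; i <= nTasks; i++) bag.put(i);

        // master: collect the results
        int sum = 0;
        for (int i = 0; i < nTasks; i++) sum += results.take();
        System.out.println("sum of squares = " + sum);  // 385 for tasks 1..10
    }
}
```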

The Fork-Join variant applies the Master-Workers architecture recursively: workers are themselves masters, producing tasks and collecting results. This is an effective approach to implement a concurrent "divide-and-conquer" strategy (e.g. the merge-sort problem).
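A minimal sketch with Java's fork/join framework (an illustration of the variant, not code from the notes): a RecursiveTask splits itself, forks one half, computes the other and joins the results.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

// Fork-Join style divide-and-conquer: each task acts as a master for its
// two halves, forking one, computing the other and joining the results.
public class SumTask extends RecursiveTask<Long> {
    private static final int THRESHOLD = 1_000;
    private final long[] data;
    private final int from, to;

    SumTask(long[] data, int from, int to) {
        this.data = data; this.from = from; this.to = to;
    }

    @Override
    protected Long compute() {
        if (to - from <= THRESHOLD) {           // small enough: solve directly
            long sum = 0;
            for (int i = from; i < to; i++) sum += data[i];
            return sum;
        }
        int mid = (from + to) / 2;
        SumTask left = new SumTask(data, from, mid);
        SumTask right = new SumTask(data, mid, to);
        left.fork();                             // spawn the left half
        long rightResult = right.compute();      // work on the right half
        return left.join() + rightResult;        // collect and combine
    }

    public static void main(String[] args) {
        long[] data = new long[1_000_000];
        for (int i = 0; i < data.length; i++) data[i] = i;
        long sum = new ForkJoinPool().invoke(new SumTask(data, 0, data.length));
        System.out.println("sum = " + sum);
    }
}
```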

The Filter-Pipeline architecture is composed of a linear chain of agents interacting through pipe, bounded buffer or channel resources (e.g. image processing):

the generator agent is the one starting the chain, creating the data to be processed by the pipeline;

a filter agent is an intermediate agent of the chain, consuming input information from one pipe and producing information into another pipe;

the sink agent is the one terminating the chain, collecting the results.

Finally, in the Announcer-Listeners pattern the architecture is composed of an announcer agent and a dynamic set of listener agents, interacting through an event service (the parallelisation of event-driven applications is achieved by parallelising the execution of event handlers):

the announcer agent notifies the occurrence of events on the event service;

listener agents register on the event service so as to be notified of the occurrence of the events of interest to them;

the event-service resource decouples the announcer-listener interaction, collecting and dispatching events.

From what we have learnt in this chapter we can re-think the MVC pattern in a concurrent system, exploiting a useful mapping of concepts:

control is encapsulated in agents (the active components of the program), which embed the control logic of the system;

the model is a passive shared data structure manipulated concurrently by agents, thus encapsulating mutex properties (including coordination media);

the view is another passive shared entity observed and used by agents, which can also gain access to model resources (for instance to get the state to display).


Module 2.2 – Visual formalisms for concurrent systems

Having a formalism to describe a concurrent system's behaviour is useful not just for human readability but also because formal specification means formal analysis. In this module we will consider 3 different visual formalisms:

Statecharts (also known from UML as State Diagrams);
Petri nets (Viroli's ones!);
Activity Diagrams.

Statecharts were introduced for modelling complex reactive systems, i.e. mainly event-driven systems that continuously have to react to external and internal stimuli, with the main objective of introducing a way of describing reactive behaviour formally and rigorously enough to be simulated and analyzed. Beyond the general agreement that states and events are a rather natural medium for describing the dynamic behaviour of complex systems, there is the fact that finite state machines don't scale well with complexity: to be useful, a state/event approach must be modular, hierarchical and well-structured.

To reach some kind of scalability the Statechart formalism provides two main features:

hierarchy, clustering states together into groups at different levels of abstraction, promoting a zoom-like capability;

orthogonality, modelling independence and concurrency among states.

Clustering enforces an XOR decomposition between states, increasing abstraction in a bottom-up approach: since β takes the system to B from either A or C, we can cluster the latter two into a new super-state D which captures their common properties (XOR because to be in state D one must be either in A or in C, and not in both).

If we read the two schemas above in the opposite direction we get the idea of refinement: from the super-state D we step down one level of abstraction, looking inside it for states A and C (zoom in).

Shifting to the orthogonality feature mentioned above, we have AND decomposition, capturing the fact that a certain system may be split into sub-systems whose composition of states defines the overall system's global state.


This way we can give our schema the idea of concurrent and independent activity as well as of synchronization needs: note that the same label leads from state F to G in the right sub-state D and from state B to state C in the left sub-state A.

To better model coordinating behaviours we also have two further features: actions and activities. The former are atomic actions which ideally take zero time and must be attached to a transition (as a label), promoting the idea that such a transition has some kind of effect on the other components of the overall system; activities are groups of actions, to be carried out while entering and exiting a state, that can take a non-zero time to complete.

Let's now look at Petri nets: their major use is to model concurrent control flows interacting through common states and sharing common resources. We can see a Petri net as composed of two dimensions: places and transitions model the static part of the system, i.e. which relevant states and actions we have in our model; tokens give dynamism to such states, representing the possible run-time behaviours.

Saying that a transition can fire if all its input places have enough tokens, we see the inherent non-determinism of the formalism: at every moment we only know which transitions can possibly fire (because they are enabled by tokens), but we do not know which one will actually fire.

In a Petri net we have two important properties other than non-determinism:

asynchrony. There is no inherent measure of time or of the flow of time, so the only important property of time, from a logical point of view, is in defining a partial ordering of the occurrence of events;

locality. In a complex system composed of independent, asynchronously operating sub-parts, each part can be modelled by a Petri net, and the enabling and firing of transitions are then affected by, and in turn affect, only local changes in the marking of the Petri net.


With a Petri net we can model almost every concurrent problem or mechanism we can think of:

(Figures omitted: Petri net fragments modelling "action b before action a", "actions a and b are synchronous", "semaphore wait()" and "semaphore signal()".)

In truth we need one more thing to prove that Petri nets can model every concurrent problem (and that they are a Turing-equivalent formalism!): inhibitor arcs. They represent a zero-test pre-condition: the attached place must have 0 tokens in order to enable the transition.

Petri nets can be formally described, making them amenable to model checking and to any kind of formal analysis of properties of interest, such as liveness and safety. The structure of a Petri net can be formally described as a tuple C = {P, T, I, O}:

P is a set of places; T is a set of transitions; the input function I defines the set of input places for each transition tj; the output function O defines the set of output places for each transition tj (we don't consider inhibitor arcs for the sake of simplicity, but it's just a matter of adding another term to the tuple).

A marking is an assignment of tokens to the places in the net: it can be formally specified as a function µ : P → N that associates with every place the number of tokens it contains.
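
To make the tuple and the firing rule concrete, here is a minimal, hypothetical Java sketch (the names PetriNetSketch, enabled and fire are ours, not from the course material), assuming the simple case in which every arc has weight 1:

// Hypothetical sketch of C = {P, T, I, O} plus a marking µ : P -> N.
// A transition is enabled when every input place holds at least one token;
// firing consumes one token per input place and produces one per output place.
import java.util.*;

public class PetriNetSketch {
    final Map<String, Integer> marking = new HashMap<>();        // µ : P -> N
    final Map<String, List<String>> inputs = new HashMap<>();    // input function I
    final Map<String, List<String>> outputs = new HashMap<>();   // output function O

    boolean enabled(String t) {
        return inputs.get(t).stream().allMatch(p -> marking.getOrDefault(p, 0) > 0);
    }

    void fire(String t) {
        if (!enabled(t)) throw new IllegalStateException(t + " is not enabled");
        for (String p : inputs.get(t))  marking.merge(p, -1, Integer::sum);   // consume tokens
        for (String p : outputs.get(t)) marking.merge(p,  1, Integer::sum);   // produce tokens
    }

    public static void main(String[] args) {
        PetriNetSketch net = new PetriNetSketch();
        // a binary semaphore: place "sem" with one token guards transition "wait"
        net.marking.put("sem", 1);
        net.inputs.put("wait", List.of("sem"));
        net.outputs.put("wait", List.of("cs"));   // "cs": being in the critical section
        net.fire("wait");
        System.out.println(net.marking);          // sem=0, cs=1 (map order may vary)
    }
}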

To conclude the discussion of Petri nets, we present some useful properties they may have:

we say that a marking Ω' is reachable from Ω if it is immediately reachable from Ω (a single transition firing leads from Ω to Ω') or if it is reachable from some marking Ω'' which is immediately reachable from Ω;

safe nets are Petri nets in which no more than one token can ever be in any place of the net at the same time;

a bounded (or k-bounded) net is one in which the number of tokens in any place is bounded by k (boundedness);

a Petri net is conservative if the number of tokens in the net is conserved (think for instance of using tokens to represent resources);

a transition is dead in a marking if there is no sequence of transition firings that can enable it (related to deadlock situations);

a transition is live if it is potentially firable in all reachable markings.

To end this module we finally present UML Activity Diagrams: they are dynamic diagrams that show the activities and the events that cause an object to be in a particular state. In a statechart the focus is on the states linked by transitions, while in activity diagrams we detail the transitions (activities) that relate different states.

The diagram on the left shows a fork after activity1: this indicates that activity2 and activity3 occur at the same time. After activity2 there is a branch: it describes which activities will take place based on a set of conditions. All branches at some point are followed by a merge, indicating the end of the conditional behaviour started by that branch. After the merge, all of the parallel activities must be combined by a join before transitioning into the final activity state.

A swimlane is a way to group activities performed by the same actor on an activity diagram, or to group activities belonging to a single thread.


Module 2.3 – Concurrent machines and languages.

To describe/specify a concurrent program we need concurrent programming languages, enabling programmers to write programs as sets of instructions to be executed concurrently; to execute a concurrent program we need a concurrent machine (which can be abstract), designed to handle the execution of multiple sequential processes by exploiting multiple processors (physical or virtual).

Flynn's taxonomy is a categorization of all computing systems according to the number of instruction streams and data streams:

Single Instruction - Single Data (SISD): Von Neumann model, single-processor computers;

Single Instruction - Multiple Data (SIMD): a single instruction stream concurrently broadcast to multiple processors, each with its own data stream (e.g. image processing);

Multiple Instruction - Single Data (MISD);

Multiple Instruction - Multiple Data (MIMD): each processor has its own stream of instructions operating on its own data:
◦ shared memory: all processes (processors) share a single address space and communicate with each other by writing and reading shared variables:
▪ SMP (UMA): all processors share a connection to a common memory and access all memory locations at equal speed;
▪ NUMA (Non-Uniform Memory Access): the memory is shared, but some blocks of memory may be physically more closely associated with some processors than with others;
◦ distributed memory: each process (processor) has its own address space and communicates with other processes by message passing (sending and receiving messages).

An abstract machine (or abstract processor) is an entity that can execute the instructions of a specific source programming language. It can be realized on top of a lower-level processor, which can be physical or abstract and has its own programming language:

Interpreter - an LO program that simulates PS on PO, interpreting LS. It is very flexible, but also very inefficient;

Compiler - the processor PS is completely virtual, without an interpreter: LS is translated into a functionally equivalent program, written (compiled) in LO, so as to run directly on PO. It achieves high efficiency, but compilation is more resource-consuming and the result is less portable;

Virtual Machine - an abstract processor PI sits between PS and PO, executing programs written in a language LI. LS is translated into LI and executed on an interpreter of LI running directly on PO. PI extends the functionalities of the physical machine PO so as to ease the translation of the source language and, at the same time, the portability of the language across different POs.

A concurrent machine provides support for the execution of concurrent programs and basic mechanisms for:

multiprogramming - the set of mechanisms that make it possible to create new virtual processors and allocate them to the physical processors by means of scheduling algorithms;

virtual processor generation and management;

synchronization and communication - two different typologies of mechanisms, related to two different architectural models for concurrent machines:
◦ shared memory model - presence of a shared memory among the virtual processors (as in multi-threaded programming);
◦ message passing model - every virtual processor has its own memory and no memory is shared among processors, so every communication and interaction among processors is realized through message passing;

access control to resources.

When shifting from sequential languages to concurrent ones, the approach can be one of three types:

sequential language + library with concurrent primitives (as in C + PThreads);

language designed for concurrency (Concurrent Pascal, ADA, Erlang);

hybrid approach - sequential paradigm extended with native support for concurrency (Java) + libraries and patterns based on the basic mechanisms (java.util.concurrent).

Among the first basic language notations for expressing concurrency (UNIX/POSIX) there were:

fork primitive - behaviour similar to a procedure invocation, with the difference that a new process is created and activated to execute the procedure (bifurcation of the program control flow):
◦ input parameter: the procedure to be executed;
◦ output parameter: the unique identifier of the process created;
◦ the new process (child) is executed asynchronously with respect to the generating process (parent) and to the already existing processes;

join primitive - it detects when a process created by a fork has terminated and synchronizes the current control flow with that event (join of independent control flows):
◦ input parameter: the identifier of the process to wait for.

This framework is general and flexible, so it can be used to build any kind of concurrent application, but the level of abstraction is too low and thus error-prone: it is hard to tell from the code which processes are active at a specific point of the program.
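
As a rough Java analogue (our own sketch, not part of the original notes), Thread.start plays the role of the fork primitive and Thread.join the role of the join primitive:

// "fork": create and activate a new process (thread) executing a procedure;
// "join": synchronize the current control flow with the child's termination.
public class ForkJoinSketch {
    public static void main(String[] args) throws InterruptedException {
        Thread child = new Thread(() -> System.out.println("child running"));
        child.start();                                // the child runs asynchronously w.r.t. the parent
        System.out.println("parent keeps running");
        child.join();                                 // wait for the child to terminate
        System.out.println("child has terminated");
    }
}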

The COBEGIN - COEND construct was proposed by Dijkstra to provide a discipline for concurrent programming, forcing the programmer to follow a specific scheme when structuring concurrent programs. Concurrency is expressed in blocks:

instructions S1, S2, ..., Sn are executed in parallel;

an instruction Si can be as complex as a full program (it can include nested cobegin/coend);

the process executing a cobegin (parent) creates as many processes (children) as the number of instructions in the body and suspends its execution until all of them have terminated.

Now we have a stronger discipline in structuring a concurrent program, so it is more readable, but on the other side we lose flexibility: how do we create N concurrent processes, where N is known only at runtime?
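
As an illustration (our own sketch, not from the notes), the cobegin/coend discipline can be rendered in plain Java by spawning one thread per instruction and having the parent wait for all of them; since the array of bodies can be built at runtime, it also shows how a library-based approach handles the case in which N is known only at runtime:

import java.util.ArrayList;
import java.util.List;

public class CobeginSketch {
    public static void main(String[] args) throws InterruptedException {
        Runnable[] body = {                          // S1, S2, ..., Sn (could be built at runtime)
            () -> System.out.println("S1"),
            () -> System.out.println("S2"),
            () -> System.out.println("S3")
        };
        List<Thread> children = new ArrayList<>();
        for (Runnable s : body) {                    // "cobegin": one child per instruction
            Thread t = new Thread(s);
            t.start();
            children.add(t);
        }
        for (Thread t : children) t.join();          // "coend": parent suspends until all terminate
        System.out.println("all children terminated");
    }
}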

Concurrent Pascal was the first language to introduce processes and monitors as first-class constructs: it was an imperative/procedural language enriched with the process abstraction.

ADA was introduced both for programming in-the-large, being high-level, structured, with OO elements (and strong typing), and for concurrent programming, to develop critical and real-time systems:

processes in ADA are called tasks and represent activities that can be executed concurrently inside an ADA program;

the task definition includes:
◦ the task type declaration;
◦ the task entry declarations for inter-task communication;
◦ the task body implementation, specified as a procedure.

Erlang is a functional language providing native support for concurrent programming, based on processes and asynchronous communication among processes through message passing. It is executed on top of the BEAM concurrent virtual machine, which offers a completely abstract/virtual notion of process (not related to OS processes or OS threads); hence we obtain extremely efficient process management (hundreds of thousands of processes can be created on a single host).

An Erlang program describes a series of functions (operators too are functions) and uses pattern matching to determine which function to execute upon a call. Modules are used to package functions together.


Erlang offers tuples and lists as primitive data structures (besides atomic data structures such as symbols, constants, numbers and strings): tuples support pattern matching and lists are handled with the H|T (head|tail) notation.

A process is a computational activity whose behaviour is given by some specific function. The spawn primitive is provided to launch a process, returning its PID and taking as arguments the function's module, the function name and its parameters. Remember that processes are logical entities: the BEAM maps logical processes onto physical threads.

Processes can communicate solely through message passing, so each process has a mailbox where incoming messages are enqueued: messages are sent with the ! operator and received with the receive construct, which can specify a proper pattern to match the messages of interest.

Java has been the first "mainstream" language providing native support for concurrent programming, but with a "conservative approach": the language is still purely OO, with no explicit construct for defining processes (threads), but some keywords and mechanisms for concurrency were introduced (synchronized blocks, the wait/notify mechanism).


The abstract notion of process is implemented as a thread, with a direct mapping onto OS support for threads:

a thread is defined by a single control flow, sharing memory with all the other threads (hence private stack, common heap);

each Java program contains at least one thread, corresponding to the execution of the main method in the main class;

a thread object can be instantiated and "spawned" by invoking its start method, which begins the execution of the process (a minimal sketch follows).
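
Here is a minimal sketch of this approach (our own example, not from the notes), showing a second thread besides the main one and a synchronized block as one of the native mechanisms:

public class JavaThreadSketch {
    private static int counter = 0;
    private static final Object lock = new Object();

    public static void main(String[] args) throws InterruptedException {
        Thread t = new Thread(() -> {
            synchronized (lock) {        // synchronized block: mutual exclusion on the shared counter
                counter++;
            }
        });
        t.start();                       // "spawn": begins the execution of the new control flow
        t.join();                        // wait for its termination
        System.out.println(counter);     // prints 1
    }
}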

C and C++ gain multithreaded programming capabilities thanks to the PThreads (POSIX threads) standard:

the abstract notion of process is implemented as a thread;

differently from Java, the process body is specified by means of a procedure;

the standard defines just the interface/specification, not the implementation (which depends on the specific OS).

Some of the primitives described in the PThreads specification are (from pthread.h):

among the data types:
◦ pthread_t is the thread identifier data type;
◦ pthread_attr_t is the data structure for specifying thread attributes;

among the main functions:
◦ thread creation (fork): pthread_create(pthread_t* tid, pthread_attr_t* attr, void* (*func)(void*), void* arg), with pthread_attr_init(pthread_attr_t*) for attribute initialization;
◦ thread termination: pthread_exit(void* value);
◦ thread join: int pthread_join(pthread_t thread, void** value_ptr).

In the research landscape the Actor-based model was proposed in the context of AI: the Actor is the unique abstraction. Actors are autonomous entities, possibly distributed on different machines, executing concurrently and communicating through asynchronous message passing (no shared memory; every actor has a mailbox, as in Erlang, which is in fact based on the actor model). An Actor can only communicate with Actors to which it is connected: such connections can be implemented as a direct physical attachment, a memory address, or even a network address (or an email address).

To understand how to use an actor we must know what it can do in reply to a message received from some other actor (a minimal sketch follows the list):

with the send primitive it can send a finite number of messages to other Actors - it is to concurrent programming what procedure invocation is to sequential programming;

with the create primitive it can spawn a finite number of new Actors - it is to concurrent programming what procedure abstraction is to sequential programming;

with the become primitive it can designate the new behaviour to be used for the next message it receives (replacing behaviour).
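
A minimal, hypothetical Java sketch of these three primitives follows (our own names and structure, not from the notes), assuming one mailbox per actor implemented with a BlockingQueue and one thread per actor:

import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.function.Consumer;

class Actor {
    private final BlockingQueue<Object> mailbox = new LinkedBlockingQueue<>();
    private volatile Consumer<Object> behaviour;

    // "create": constructing an Actor spawns an autonomous control flow that drains its mailbox
    Actor(Consumer<Object> initialBehaviour) {
        this.behaviour = initialBehaviour;
        Thread worker = new Thread(() -> {
            try {
                while (true) behaviour.accept(mailbox.take());   // process one message at a time
            } catch (InterruptedException e) { /* actor terminated */ }
        });
        worker.setDaemon(true);
        worker.start();
    }

    // "send": asynchronous message passing, the sender never blocks
    void send(Object msg) { mailbox.offer(msg); }

    // "become": replace the behaviour used for the next messages received
    void become(Consumer<Object> newBehaviour) { this.behaviour = newBehaviour; }

    public static void main(String[] args) throws InterruptedException {
        Actor printer = new Actor(msg -> System.out.println("got: " + msg));
        printer.send("hello");
        printer.send("world");
        Thread.sleep(100);   // give the daemon actor thread time to drain its mailbox
    }
}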
