574 Lecture 4


Transcript of 574 Lecture 4


Lecture 4: Parallel Programming Models


Parallel Programming Models

• Data parallelism / Task parallelism
• Explicit parallelism / Implicit parallelism
• Shared memory / Distributed memory
• Other programming paradigms
  • Object-oriented
  • Functional and logic


    Parallel Programming Models

    Data Parallelism

Parallel programs that emphasize concurrent execution of the same task on different data elements (data-parallel programs).

• Most programs for scalable parallel computers are data parallel in nature.

    Task Parallelism

Parallel programs that emphasize the concurrent execution of different tasks on the same or different data.

• Used for modularity reasons.
• Parallel programs structured as a task-parallel composition of data-parallel components are common (see the sketch below).
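As a concrete illustration (not from the slides), the minimal C/OpenMP sketch below shows both styles: a parallel for loop applies the same operation to different array elements (data parallelism), while two OpenMP sections run different tasks concurrently (task parallelism). The function name and array size are illustrative assumptions.

#include <stdio.h>

#define N 1000

/* Data parallelism: every thread performs the SAME operation
   on a DIFFERENT subset of the array elements. */
void scale_array(float *a, float s) {
    #pragma omp parallel for
    for (int i = 0; i < N; i++)
        a[i] *= s;
}

int main(void) {
    static float a[N], b[N];
    for (int i = 0; i < N; i++) { a[i] = i; b[i] = i; }

    /* Task parallelism: two DIFFERENT tasks run concurrently,
       here expressed as OpenMP sections. */
    #pragma omp parallel sections
    {
        #pragma omp section
        scale_array(a, 2.0f);   /* task 1 */

        #pragma omp section
        scale_array(b, 0.5f);   /* task 2 */
    }

    printf("a[1] = %f, b[1] = %f\n", a[1], b[1]);
    return 0;
}

Compile with an OpenMP-capable compiler, e.g. gcc -fopenmp.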

     


    Parallel Programming Models

[Figure: data parallelism vs. task parallelism.]


    Parallel Programming Models

    Explicit Parallelism

    "he programmer specifies directly the acti#ities of the multipleconcurrent $threads of control% that form a parallel

    computation. • Pro#ide the programmer &ith more control o#er program beha#ior

    and hence can be used to achie#e higher performance.

    Implicit Parallelism

    "he programmer pro#ides high-le#el specification of programbeha#ior.

    't is then the responsibility of the compiler or library toimplement this parallelism efficiently and correctly.


    Parallel Programming Models

    Shared Memory 

    "he programmers task is to specify the acti#ities of a set ofprocesses that communicate by reading and &riting shared memory.

    •  Advantage: the programmer need not be concerned &ith data-distributionissues.• Disadvantage: performance implementations may be difficult on computers

    that lack hard&are support for shared memory and race conditions tend toarise more easily

    Distributed Memory 

Processes have only local memory and must use some other mechanism (e.g. message passing or remote procedure call) to exchange information.

• Advantage: programmers have explicit control over data distribution and communication.


    Shared vs Distributed Memory

[Figure: in shared memory, all processors (P) access a single Memory over a shared Bus; in distributed memory, each processor (P) has its own local memory (M) and the processors communicate over a Network.]


    Parallel Programming Models

    Parallel Programming Tools:

Parallel Virtual Machine (PVM)
• Distributed memory, explicit parallelism

Message-Passing Interface (MPI)
• Distributed memory, explicit parallelism

Threads
• Shared memory, explicit parallelism

OpenMP
• Shared memory, explicit parallelism

High-Performance Fortran (HPF)
• Implicit parallelism

Parallelizing Compilers
• Implicit parallelism


Parallel Programming Models

Message Passing Model

Used on distributed-memory MIMD architectures.

Multiple processes execute in parallel, asynchronously.

• Process creation may be static or dynamic.

Processes communicate by using send and receive primitives.


Parallel Programming Models

Blocking send: waits until all data is received.

Non-blocking send: continues execution after placing the data in the buffer.

Blocking receive: if data is not ready, waits until it arrives.

Non-blocking receive: reserves a buffer and continues execution; a later wait operation copies the data into memory once it is ready. (See the MPI sketch below.)
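The slides state these primitives generically; as a hedged illustration, here is how they map onto MPI, one of the tools listed earlier. The tag values and message contents are arbitrary; run with at least two processes.

#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv) {
    int rank, data = 42, recv_data;
    MPI_Request req;
    MPI_Status status;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {
        /* Blocking send: returns only when the buffer may be reused. */
        MPI_Send(&data, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);

        /* Non-blocking send: returns immediately; MPI_Wait completes it. */
        MPI_Isend(&data, 1, MPI_INT, 1, 1, MPI_COMM_WORLD, &req);
        /* ...computation can overlap the communication here... */
        MPI_Wait(&req, &status);
    } else if (rank == 1) {
        /* Blocking receive: waits until the message has arrived. */
        MPI_Recv(&recv_data, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, &status);

        /* Non-blocking receive: post the buffer, continue, then wait. */
        MPI_Irecv(&recv_data, 1, MPI_INT, 0, 1, MPI_COMM_WORLD, &req);
        /* ...computation can overlap the communication here... */
        MPI_Wait(&req, &status);
        printf("rank 1 received %d\n", recv_data);
    }

    MPI_Finalize();
    return 0;
}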


Parallel Programming Models

Synchronous message-passing: sender and receiver processes are synchronized.

• Blocking send, blocking receive.

Asynchronous message-passing: no synchronization between sender and receiver processes.

• Large buffers are required. As buffer size is finite, the sender may eventually block.
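The slide names no particular library; mapping this onto MPI (an assumption on my part), synchronous message passing corresponds to MPI_Ssend, which completes only once the matching receive has started:

#include <mpi.h>

int main(int argc, char **argv) {
    int rank, x = 7;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {
        /* Synchronous send: completes only after the matching receive
           has started, so the two processes rendezvous here. */
        MPI_Ssend(&x, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
    } else if (rank == 1) {
        MPI_Recv(&x, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
    }

    /* A plain MPI_Send, by contrast, may buffer the message and return
       early, but can block once buffer space runs out -- the finite-
       buffer caveat noted above. */
    MPI_Finalize();
    return 0;
}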


Parallel Programming Models

Advantages of the message-passing model:

• Programs are highly portable.
• Provides the programmer with explicit control over the location of data in memory.

Disadvantage of the message-passing model:

• The programmer is required to pay attention to such details as the placement of memory and the ordering of communication.


    Parallel Programming Models

Factors that influence the performance of the message-passing model (see the cost model below):

• Bandwidth
• Latency
• Ability to overlap communication with computation
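These factors are commonly combined into the simple first-order cost model below; this is a standard textbook approximation, not something stated on the slide. Sending a message of $n$ bytes costs roughly

T(n) \approx \alpha + \frac{n}{\beta}

where $\alpha$ is the per-message latency and $\beta$ the bandwidth. Overlapping communication with computation hides part of $T(n)$ behind useful work.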


    Parallel Programming Models

Example: Pi calculation

\pi = \int_0^1 f(x)\,dx = \int_0^1 \frac{4}{1+x^2}\,dx \approx w \sum_{i=1}^{n} f(x_i)

where f(x) = 4/(1+x^2), n = 10, w = 1/n, and x_i = w(i - 0.5).

[Figure: f(x) over [0, 1], sampled at the midpoints x_i of the n subintervals.]


    Parallel Programming Models

Sequential Code

#include <stdio.h>

#define f(x) (4.0/(1.0+x*x))

int main() {
    int n, i;
    float w, x, sum, pi;

    printf("n?\n");
    scanf("%d", &n);

    w = 1.0/n;               /* width of each subinterval */
    sum = 0.0;
    for (i = 1; i <= n; i++) {
        x = w*(i-0.5);       /* midpoint of the i-th subinterval */
        sum += f(x);
    }
    pi = w*sum;
    printf("%f\n", pi);
    return 0;
}



    Parallel Programming Models

    Parallel PVM program

Master:

• Creates workers
• Sends initial values to workers (see the sketch below)
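The PVM listing itself did not survive the transcript. What follows is a minimal sketch of what such a master could look like, assuming a worker executable named piworker and a simple pack/send/receive protocol; all identifiers, tags, and counts are illustrative, not the original code.

#include <stdio.h>
#include <pvm3.h>

#define NWORKERS 4

int main(void) {
    int tids[NWORKERS], n = 10, i;
    float partial, pi = 0.0;

    /* Create the workers. */
    pvm_spawn("piworker", NULL, PvmTaskDefault, "", NWORKERS, tids);

    /* Send initial values (n and a worker index) to each worker. */
    for (i = 0; i < NWORKERS; i++) {
        pvm_initsend(PvmDataDefault);
        pvm_pkint(&n, 1, 1);
        pvm_pkint(&i, 1, 1);
        pvm_send(tids[i], 0);
    }

    /* Collect and accumulate the partial sums. */
    for (i = 0; i < NWORKERS; i++) {
        pvm_recv(-1, 1);             /* from any worker, tag 1 */
        pvm_upkfloat(&partial, 1, 1);
        pi += partial;
    }

    printf("pi = %f\n", pi);
    pvm_exit();
    return 0;
}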


    Parallel Virtual Machine (PVM)

    Data Distribution

[Figure: two views of f(x) over [0, 1] with midpoints x_i, illustrating how the n subintervals are distributed among the workers.]


    Parallel Programming Models

    SPMD Parallel PVM program

Master:

• Creates workers
• Sends initial values to workers (see the sketch below)
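In the SPMD (Single Program, Multiple Data) variant, every process runs the same executable. A common PVM idiom for this (sketched here; again, not the original listing) is to use pvm_parent() to decide at run time whether a process is the master or a spawned worker:

#include <pvm3.h>

int main(void) {
    /* pvm_parent() returns PvmNoParent in the process that was started
       by hand: it plays the master and spawns the rest. Every spawned
       copy of the same program sees its parent's tid and acts as a
       worker instead. */
    if (pvm_parent() == PvmNoParent) {
        /* master: spawn workers, send initial values, collect results */
    } else {
        /* worker: receive initial values, compute, send a partial sum */
    }
    pvm_exit();
    return 0;
}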


    Parallel Programming Models

    Shared Memory Model

Used on shared-memory MIMD architectures.

The program consists of many independent threads.

Concurrently executing threads all share a single common address space.

Threads can exchange information by reading and writing memory using normal variable-assignment operations.


    Parallel Programming Models

    Memory Coherence Problem

    "o ensure that the latest #alue of a #ariable updated in one thread

    is used &hen that same #ariable is accessed in another thread.

    ?ard&are support and compiler support are re0uired

    ;ache-coherency protocol

    "hread 4 "hread 8

    @


    Parallel Programming Models

    Distributed Shared Memory (DSM) Systems

Implement the shared memory model on distributed-memory MIMD architectures.

Concurrently executing threads all share a single common address space.

Threads can exchange information by reading and writing memory using normal variable-assignment operations.

Use a message-passing layer as the means for communicating updated values throughout the system.


    Parallel Programming Models

    Synchronization operations in Shared Memory Model

• Monitors
• Locks
• Critical sections
• Condition variables
• Semaphores
• Barriers


    PThreads

In the UNIX environment, a thread:

• Exists within a process and uses the process's resources
• Has its own independent flow of control
• Duplicates only the essential resources it needs to be independently schedulable
• May share the process's resources with other threads
• Dies if the parent process dies
• Is "lightweight" because most of the overhead has already been accomplished through the creation of its process (see the sketch below)
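A minimal Pthreads sketch of this life cycle (the thread count and messages are illustrative):

#include <pthread.h>
#include <stdio.h>

/* Each thread has its own flow of control but shares the
   process's address space with the other threads. */
void *work(void *arg) {
    int id = *(int *)arg;
    printf("thread %d running\n", id);
    return NULL;
}

int main(void) {
    pthread_t t[4];
    int ids[4];

    for (int i = 0; i < 4; i++) {
        ids[i] = i;
        pthread_create(&t[i], NULL, work, &ids[i]);
    }
    for (int i = 0; i < 4; i++)
        pthread_join(t[i], NULL);   /* wait for each thread to finish */
    return 0;
}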


    PThreads

Because threads within the same process share resources:

• Changes made by one thread to shared system resources will be seen by all other threads.
• Two pointers having the same value point to the same data. (Both points are illustrated in the sketch below.)