Ppt Parallel Processing


Transcript of Ppt Parallel Processing

Page 1: Ppt Parallel Processing

Parallel Processing

By: Bela Desai Guided by: Anita Patel

Page 2: Ppt Parallel Processing

What is parallel processing?

The simultaneous use of more than one CPU to execute a program is called parallel processing.

It works on the principle that large problems can often be divided into smaller ones, which are then solved concurrently, i.e., in parallel.

Page 3: Ppt Parallel Processing

Serial Computation

Traditionally, a problem is broken into a discrete series of instructions that are executed one after another on a single processor; only one instruction may execute at any moment in time.

Page 4: Ppt Parallel Processing

Parallel Computation

The problem is broken into discrete parts that can be solved concurrently; instructions from each part execute simultaneously on different processors.

Page 5: Ppt Parallel Processing

Flynn’s Classification

SISD: Single Instruction, Single Data

SIMD: Single Instruction, Multiple Data

MISD: Multiple Instruction, Single Data

MIMD: Multiple Instruction, Multiple Data

Page 6: Ppt Parallel Processing

Single Instruction, Single Data (SISD): A serial (non-parallel) computer; a single instruction stream operates on a single data stream. Examples: most PCs, single-CPU workstations and mainframes.

Page 7: Ppt Parallel Processing

Single Instruction, Multiple Data (SIMD): All processing units execute the same instruction at any given clock cycle, but each unit operates on a different data element. Examples: processor arrays, vector pipelines, and GPUs.

Page 8: Ppt Parallel Processing

Multiple Instruction, Single Data (MISD): Each processing unit operates on the same data stream with an independent instruction stream. Few practical machines of this class have been built.

Page 9: Ppt Parallel Processing

Multiple Instruction, Multiple Data (MIMD): Every processor may execute a different instruction stream on a different data stream. Examples: most current supercomputers, clusters, and multi-core PCs.

Page 10: Ppt Parallel Processing

Types Of Parallel Processing

Bit-level parallelism
Instruction-level parallelism
Data parallelism
Task parallelism

Page 11: Ppt Parallel Processing

Bit-level parallelism

Historically, speed-up in computer architecture was driven by doubling the computer word size, the amount of information the processor can manipulate per cycle.

Increasing the word size reduces the number of instructions the processor must execute to perform an operation on variables whose sizes are greater than the length of the word.
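For instance, an operation on values wider than the machine word must be split across several instructions. A minimal C sketch of the idea, assuming a 32-bit word machine adding two 64-bit integers (the function name and the explicit half-word split are illustrative, not what a compiler necessarily emits):

    #include <stdint.h>

    /* On a 32-bit word machine, a 64-bit addition takes two add instructions:
       low halves first, then high halves plus the carry out of the low half. */
    uint64_t add64_using_32bit_words(uint32_t a_lo, uint32_t a_hi,
                                     uint32_t b_lo, uint32_t b_hi) {
        uint32_t lo = a_lo + b_lo;               /* first add               */
        uint32_t carry = (lo < a_lo) ? 1u : 0u;  /* carry from the low half */
        uint32_t hi = a_hi + b_hi + carry;       /* second add              */
        return ((uint64_t)hi << 32) | lo;
    }

    /* On a 64-bit word machine the same operation is a single add:
       uint64_t sum = a + b;                                          */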

Page 12: Ppt Parallel Processing

Instruction-level parallelism

Instructions can be re-ordered and combined into groups which are then executed in parallel without changing the result.

A processor with an N-stage pipeline can have up to N different instructions at different stages of completion.

The canonical example of a pipelined processor is a RISC processor with five stages: instruction fetch, decode, execute, memory access, and write back. The Pentium 4 processor had a 35-stage pipeline.
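A small hypothetical C illustration: the first three statements below have no dependences on one another, so a pipelined or superscalar processor may overlap or reorder their execution, while the final statement must wait for all three results.

    /* Hypothetical function: a, b and c are mutually independent and may be
       computed in parallel; the return value depends on all three. */
    int combine(int x, int y, int u, int v, int p, int q) {
        int a = x * y;        /* independent */
        int b = u + v;        /* independent */
        int c = p - q;        /* independent */
        return a + b + c;     /* must wait for a, b and c */
    }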

Page 13: Ppt Parallel Processing

Data parallelism

A loop that generates Fibonacci numbers (in C):

    int prev2 = 0, prev1 = 1, cur = 1;
    do {
        cur = prev1 + prev2;   /* each new value uses the two previous values */
        prev2 = prev1;
        prev1 = cur;
    } while (cur < 10);

Each iteration of this loop depends on the results of the previous one, so its iterations cannot be executed in parallel; data parallelism is possible only when the iterations are independent of one another.

Page 14: Ppt Parallel Processing

Task parallelism

Entirely different calculations can be performed on either the same or different sets of data. This contrasts with data parallelism, where the same calculation is performed on the same or different sets of data. Task parallelism does not usually scale with the size of a problem.
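A minimal sketch using POSIX threads (introduced later in this presentation): the two threads perform entirely different calculations, a sum and a maximum, over the same data set. The function names and the data are illustrative only.

    #include <pthread.h>
    #include <stdio.h>

    static int data[4] = {3, 1, 4, 1};   /* the shared data set */
    static long sum;                     /* result of task 1    */
    static int max;                      /* result of task 2    */

    static void *compute_sum(void *arg) {        /* task 1: sum      */
        (void)arg;
        sum = 0;
        for (int i = 0; i < 4; i++) sum += data[i];
        return NULL;
    }

    static void *compute_max(void *arg) {        /* task 2: maximum  */
        (void)arg;
        max = data[0];
        for (int i = 1; i < 4; i++) if (data[i] > max) max = data[i];
        return NULL;
    }

    int main(void) {
        pthread_t t1, t2;
        pthread_create(&t1, NULL, compute_sum, NULL);  /* different tasks  */
        pthread_create(&t2, NULL, compute_max, NULL);  /* run concurrently */
        pthread_join(t1, NULL);
        pthread_join(t2, NULL);
        printf("sum=%ld max=%d\n", sum, max);
        return 0;
    }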

Page 15: Ppt Parallel Processing

Shared Memory

Page 16: Ppt Parallel Processing

Multiple processors can operate independently but share the same memory resources.

Changes in a memory location effected by one processor are visible to all other processors.

Shared memory machines can be divided into two main classes based upon memory access times: UMA and NUMA

Page 17: Ppt Parallel Processing

Uniform Memory Access (UMA): Most commonly represented today by Symmetric Multiprocessor (SMP) machines.

Identical processors.

Equal access and access times to memory.

Sometimes called CC-UMA - Cache Coherent UMA.

Page 18: Ppt Parallel Processing

Non-Uniform Memory Access (NUMA): Often made by physically linking two or more SMPs.

One SMP can directly access the memory of another SMP.

Not all processors have equal access time to all memories.

Memory access across the link is slower.

If cache coherency is maintained, it may also be called CC-NUMA (Cache Coherent NUMA).

Page 19: Ppt Parallel Processing

Distributed Memory

Page 20: Ppt Parallel Processing

Hybrid Distributed-Shared Memory

Page 21: Ppt Parallel Processing

Parallel Programming Models

Shared Memory Model
Thread Model
Message Passing Model
Data Parallel Model
Hybrid

Page 22: Ppt Parallel Processing

Shared memory model

In the shared-memory programming model, tasks share a common address space, which they read and write asynchronously.

An advantage of this model from the programmer's point of view is that the notion of data "ownership" is lacking, so there is no need to specify explicitly the communication of data between tasks. Program development can often be simplified.

An important disadvantage in terms of performance is that it becomes more difficult to understand and manage data locality.
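A small sketch of the model with POSIX threads, assuming a shared counter: both tasks read and write the same variable through the common address space, so no explicit data transfer is coded, only synchronization.

    #include <pthread.h>
    #include <stdio.h>

    static long counter = 0;   /* lives in the address space shared by all threads */
    static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

    static void *add_many(void *arg) {
        (void)arg;
        for (int i = 0; i < 100000; i++) {
            pthread_mutex_lock(&lock);     /* coordinate asynchronous writers */
            counter++;
            pthread_mutex_unlock(&lock);
        }
        return NULL;
    }

    int main(void) {
        pthread_t t1, t2;
        pthread_create(&t1, NULL, add_many, NULL);
        pthread_create(&t2, NULL, add_many, NULL);
        pthread_join(t1, NULL);
        pthread_join(t2, NULL);
        /* No send/receive of data was needed: both tasks just used `counter`. */
        printf("counter = %ld\n", counter);
        return 0;
    }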

Page 23: Ppt Parallel Processing

Thread Model

A type of shared-memory programming in which a single process can have multiple, concurrent execution paths (threads). Two common implementations are POSIX Threads and OpenMP, described next.

Page 24: Ppt Parallel Processing

POSIX thread

Library based; requires parallel coding.

Specified by the IEEE POSIX 1003.1c standard (1995).

C language only.

Commonly referred to as Pthreads.

Most hardware vendors now offer Pthreads in addition to their proprietary threads implementations.

Very explicit parallelism; requires significant programmer attention to detail.
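A minimal sketch of this explicit, library based style (the worker function is hypothetical):

    #include <pthread.h>
    #include <stdio.h>

    /* Thread start routine: creation, argument passing and joining are all
       spelled out explicitly by the programmer. */
    static void *worker(void *arg) {
        printf("hello from thread %ld\n", (long)arg);
        return NULL;
    }

    int main(void) {
        pthread_t threads[4];
        for (long i = 0; i < 4; i++)
            pthread_create(&threads[i], NULL, worker, (void *)i);
        for (int i = 0; i < 4; i++)
            pthread_join(threads[i], NULL);   /* wait for every thread */
        return 0;
    }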

Page 25: Ppt Parallel Processing

OpenMP

Compiler directive based; can use serial code.

Jointly defined and endorsed by a group of major computer hardware and software vendors. The OpenMP Fortran API was released October 28, 1997. The C/C++ API was released in late 1998.

Portable / multi-platform, including Unix and Windows NT platforms.

Available in C/C++ and Fortran implementations.

Can be very easy and simple to use; provides for "incremental parallelism".

Page 26: Ppt Parallel Processing

Message Passing Model

A set of tasks that use their own local memory during computation. Multiple tasks can reside on the same physical machine as well as across an arbitrary number of machines.

Tasks exchange data through communications by sending and receiving messages.

Data transfer usually requires cooperative operations to be performed by each process. For example, a send operation must have a matching receive operation.
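MPI (the Message Passing Interface) is a widely used implementation of this model. A minimal C sketch of a matching send/receive pair between two processes (launched with, for example, mpirun -np 2):

    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv) {
        int rank, value;
        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        if (rank == 0) {
            value = 42;
            /* The send on rank 0 must be matched by a receive on rank 1. */
            MPI_Send(&value, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
        } else if (rank == 1) {
            MPI_Recv(&value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            printf("rank 1 received %d from rank 0\n", value);
        }

        MPI_Finalize();
        return 0;
    }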

Page 27: Ppt Parallel Processing

Message Passing Model

Page 28: Ppt Parallel Processing

Data Parallel Model

Most of the parallel work focuses on performing operations on a data set. The data set is typically organized into a common structure, such as an array or cube.

A set of tasks works collectively on the same data structure; however, each task works on a different partition of it.

Tasks perform the same operation on their partition of work, for example, "add 4 to every array element".
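A sketch of the "add 4 to every array element" example using OpenMP: the iteration space of the array is split into partitions, one per thread, and every thread applies the same operation to its own partition.

    #include <stdio.h>

    #define N 1000

    int main(void) {
        int a[N];
        for (int i = 0; i < N; i++) a[i] = i;   /* build the data set */

        /* Same operation on every element; each thread handles its own
           partition of the index range.                               */
        #pragma omp parallel for
        for (int i = 0; i < N; i++)
            a[i] += 4;

        printf("a[0]=%d a[%d]=%d\n", a[0], N - 1, a[N - 1]);
        return 0;
    }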

Page 29: Ppt Parallel Processing

Data Parallel Model

Page 30: Ppt Parallel Processing

Hybrid

1. Single Program Multiple Data (SPMD)

2. Multiple Program Multiple Data (MPMD)

Page 31: Ppt Parallel Processing

Conclusion

With the help of parallel processing, highly complicated scientific problems that are otherwise extremely difficult to solve can be handled effectively. Parallel computing is best suited to tasks that involve a large number of calculations, have time constraints, and can be divided into a number of smaller tasks.