Discussion 05
CS-421 Parallel Processing BE (CIS) Batch 2005-06 Discussion-05
Granularity

Definition # 1
The level at which work is done in parallel, or the task size, in a parallel-processing environment.

Examples
- Job / Program Level: the highest level of parallelism; conducted among programs through multiprogramming/timesharing and multiprocessing. Coarsest granularity.
- Task / Procedure Level: conducted among tasks of a common program (problem), e.g. multithreading.
- Interinstruction Level: conducted among instructions through superscalar techniques.
- Intrainstruction Level: conducted among the different phases of an instruction through pipelining. Finest granularity.
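The task/procedure level above can be sketched in a few lines of Python. This is a minimal illustration only, assuming a toy workload (summing 1 to 100) split into four independent tasks of one program:

```python
from concurrent.futures import ThreadPoolExecutor

def partial_sum(chunk):
    # Each task computes the sum of its own chunk independently.
    return sum(chunk)

data = list(range(1, 101))
chunks = [data[i:i + 25] for i in range(0, 100, 25)]

# Task/procedure-level parallelism: four tasks of one program run concurrently.
with ThreadPoolExecutor(max_workers=4) as pool:
    total = sum(pool.map(partial_sum, chunks))

print(total)  # 5050
```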
Definition # 2
Granularity = (Time spent on computation) / (Time spent on communication)
Fine-Grained Applications
- Low granularity, i.e. more communication and less computation
- Less opportunity for performance enhancement
- Facilitates load balancing

Coarse-Grained Applications
- High granularity, i.e. a large number of instructions between synchronization and communication points
- More opportunity for performance enhancement
- Harder to balance load
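Definition # 2 can be made concrete with a small sketch; the timing values below are hypothetical, chosen only to contrast the two regimes:

```python
def granularity(t_comp, t_comm):
    # Definition # 2: granularity = computation time / communication time.
    return t_comp / t_comm

# Hypothetical timings in arbitrary units, for illustration only.
fine = granularity(t_comp=2.0, t_comm=8.0)     # ratio < 1: fine-grained
coarse = granularity(t_comp=80.0, t_comm=2.0)  # ratio >> 1: coarse-grained

print(fine, coarse)  # 0.25 40.0
```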
*****
Multiple Issue Architectures These architectures are able to execute multiple instructions in one clock cycle (i.e. performance beyond just
pipelining). An N-way or N-issue architecture can achieve an ideal CPI of 1/N.
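The ideal-CPI claim is simple arithmetic; here it is sketched for a hypothetical 4-issue machine:

```python
def ideal_cpi(issue_width):
    # An N-issue machine can complete N instructions per cycle,
    # so its ideal CPI (cycles per instruction) is 1/N.
    return 1 / issue_width

print(ideal_cpi(4))  # 0.25, i.e. an ideal IPC of 4
```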
There are two major methods of implementing a multiple issue processor:
- Static multiple issue
- Dynamic multiple issue

Static Multiple Issue Architecture
The scheduling of instructions into issue slots is done by the compiler. We can think of the instructions issued in a given clock cycle as forming an instruction packet.
It is useful to think of the issue packet as a single instruction allowing several operations in predefined fields. This was the reason behind the original name for this approach: Very Long Instruction Word
(VLIW) architecture.
Intel has its own name for this technique, EPIC (Explicitly Parallel Instruction Computing), used in the Itanium series.
If it is not possible to find operations that can be done at the same time for all functional units, then the instruction may contain a NOP in the group of fields for unneeded units.
Because most instruction words contain some NOPs, VLIW programs tend to be very long. The VLIW architecture requires the compiler to be very knowledgeable about the implementation details of the target computer, and may require a program to be recompiled if it is moved to a different implementation of the same architecture.
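The NOP-padding idea can be sketched as follows; the three-slot machine and the instruction encoding are invented for illustration, not taken from any real VLIW ISA:

```python
NOP = ("nop",)
SLOTS = ("alu", "fpu", "mem")  # a hypothetical 3-issue VLIW machine

def build_packet(ops):
    # Static scheduling: the "compiler" places each operation in the slot
    # for its functional unit; slots for unneeded units are filled with NOPs.
    packet = {slot: NOP for slot in SLOTS}
    for unit, op in ops:
        packet[unit] = op
    return tuple(packet[slot] for slot in SLOTS)

# Only an ALU op and a memory op are available this cycle,
# so the FPU field of the packet is padded with a NOP.
pkt = build_packet([("alu", ("add", "r1", "r2", "r3")),
                    ("mem", ("load", "r4", "0(r5)"))])
print(pkt)  # the middle (fpu) slot holds a NOP
```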
Dynamic Multiple Issue Architecture
Also known as superscalar processors.
The processor (rather than the compiler) decides whether zero, one, or more instructions can be issued in
a given clock cycle.
Support from compilers is even more crucial for the performance of superscalars, because a superscalar processor can only look at a small window of the program. A good compiler schedules code in a way that facilitates scheduling decisions by the processor.
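The small-window issue decision can be sketched roughly; the window format and the in-order, two-wide issue policy below are simplifying assumptions, not a model of any particular superscalar:

```python
def issue_this_cycle(window, width=2):
    # Dynamic multiple issue: the processor scans a small window and issues
    # up to `width` instructions, stopping at the first instruction whose
    # source registers depend on a result produced earlier in the same
    # cycle (a read-after-write hazard).
    issued, written = [], set()
    for dest, srcs in window:
        if len(issued) == width:
            break
        if any(s in written for s in srcs):
            break  # in-order issue stops at the first dependent instruction
        issued.append((dest, srcs))
        written.add(dest)
    return issued

# r3 depends on r1, which is produced this cycle, so only one instruction issues.
window = [("r1", ("r2", "r0")),   # add r1, r2, r0
          ("r3", ("r1", "r4")),   # add r3, r1, r4  (RAW hazard on r1)
          ("r5", ("r6", "r7"))]   # add r5, r6, r7
print(len(issue_this_cycle(window)))  # 1
```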
******