PART 1 - hmvelms.org SCIENCE… · PART 4: Pipelining : An overlapped Parallelism, Principles of...
PART 1: Paradigms of Computing: Synchronous – Vector/Array, SIMD, Systolic; Asynchronous – MIMD, Reduction Paradigm; Hardware taxonomy: Flynn’s classification; Software taxonomy: Kung’s taxonomy, SPMD.
PART 2: Parallel Computing Models; Parallelism in Uniprocessor Systems: Trends in Parallel Processing, Basic Uniprocessor Architecture, Parallel Processing Mechanism.
PART 3: Parallel Computer Structures: Pipeline Computers, Array Computers, Multiprocessor Systems; Architectural Classification Schemes: Multiplicity of Instruction-Data Streams, Serial versus Parallel Processing, Parallelism versus Pipelining.
PART 4: Pipelining: An Overlapped Parallelism, Principles of Linear Pipelining, Classification of Pipeline Processors, General Pipelines and Reservation Tables.
Hardware taxonomy: Flynn’s classification
SISD
SIMD
MISD
MIMD
Systolic array
Flynn's taxonomy is a classification of computer architectures proposed by Michael J. Flynn in 1966.
Two types of information flow into a processor: instructions and data.
The instruction stream is defined as the sequence of instructions performed by the processing unit.
The data stream is defined as data traffic exchanged between the memory and the processing unit.
Single instruction stream, single data stream (SISD)
Single instruction stream, multiple data streams (SIMD)
Multiple instruction streams, single data stream (MISD)
Multiple instruction streams, multiple data streams (MIMD)
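As a rough illustration, the four class names can be read off from the number of instruction streams and data streams. The function name `flynn_class` and its integer arguments are hypothetical and not part of the slides.

```python
def flynn_class(instruction_streams: int, data_streams: int) -> str:
    """Return the Flynn class name for the given stream counts (illustrative only)."""
    if instruction_streams < 1 or data_streams < 1:
        raise ValueError("stream counts must be at least 1")
    # "S" for a single stream, "M" for multiple streams.
    instruction_letter = "S" if instruction_streams == 1 else "M"
    data_letter = "S" if data_streams == 1 else "M"
    return f"{instruction_letter}I{data_letter}D"
```

For example, `flynn_class(1, 8)` describes one instruction stream driving eight data streams.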
SISD has one processing element, which has access to a single program and data storage.
It loads an instruction and the corresponding data and executes the instruction.
The result is stored back in the data storage. SISD has a single processor, and data is stored in a single memory.
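The load, execute, and store cycle described for SISD can be sketched as a tiny interpreter. This is only a model under stated assumptions: the name `run_sisd`, the tuple instruction format `(op, src1, src2, dst)`, and the two operations are hypothetical.

```python
def run_sisd(program, memory):
    """Execute one instruction stream against one data memory, one step at a time."""
    operations = {"add": lambda x, y: x + y, "mul": lambda x, y: x * y}
    for op, src1, src2, dst in program:
        # Load the instruction's operands from the single data memory and execute.
        result = operations[op](memory[src1], memory[src2])
        # Store the result back in the same data memory.
        memory[dst] = result
    return memory


memory = run_sisd([("add", 0, 1, 2), ("mul", 2, 2, 3)], [3, 4, 0, 0])
```

Each instruction finishes before the next one starts, which is the sequential behaviour the SISD class describes.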
[Figure: SISD organization. Labels: control unit, processing element (PE), main memory (M), instruction stream (IS), data stream (DS).]
ADVANTAGES OF SISD
Fast sequential execution.
No extra resources required.
DISADVANTAGES OF SISD
Slow execution of large instructions.
In MISD there are multiple processing elements, each of which has a private program memory.
Each processing element obtains the same data element from the data memory and loads an instruction from its private program memory.
The different instructions are then executed in parallel.
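The MISD idea of several different instructions acting on the same data element can be modelled in a few lines. The name `run_misd` and the sample instructions are hypothetical, and the list comprehension runs sequentially, so this shows only the semantics, not parallel hardware.

```python
def run_misd(private_instructions, data_element):
    """Apply each PE's own instruction to the same data element."""
    # Each entry stands for the instruction held in one PE's private program memory.
    return [instruction(data_element) for instruction in private_instructions]


results = run_misd([lambda x: x + 1, lambda x: x * 2, lambda x: -x], 5)
```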
In SIMD, a single instruction is applied to different data simultaneously.
SIMD machines have more than one processing element (PE).
General characteristics of SIMD computers are:
− They distribute processing over a large amount of hardware.
− They operate concurrently on many different data elements.
− They perform the same computation on all data elements.
A SIMD instruction operates on all loaded data in a single operation, processing multiple data elements at the same time.
An application can gain a performance boost if SIMD techniques can be utilized.
Not everything is suitable for SIMD processing, and not all parts of an application need to be SIMD accelerated to realize significant improvements.
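The SIMD programming model, one instruction applied across many data elements, can be sketched as follows. The name `simd_apply` and the lane layout are hypothetical; a plain Python loop gives none of the hardware speedup and only shows which work is shared across lanes.

```python
def simd_apply(instruction, *operand_lanes):
    """Apply the same instruction to every lane of the given operands."""
    # zip pairs up lane 0 of each operand, then lane 1, and so on.
    return [instruction(*lane) for lane in zip(*operand_lanes)]


sums = simd_apply(lambda x, y: x + y, [1, 2, 3, 4], [10, 20, 30, 40])
```

Code that needs a different operation per element does not fit this shape, which is one reason not everything is suitable for SIMD processing.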
MIMD has multiple processing elements, each of which has a separate instruction and data memory.
Each processing element loads a separate instruction and a separate data element.
It applies the instruction to the data element and stores a possible result back into the data storage.
The processing elements work asynchronously with respect to each other.
MIMD machines distribute processing over a number of independent processors.
− They share resources among the component processors.
− Each processor operates independently.
− Each processor runs its own program.
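A minimal sketch of the MIMD model uses one thread per processor, each running its own program on its own data. The name `run_mimd` and the `(function, data)` task format are hypothetical, and the threads only stand in for independent processors.

```python
import threading


def run_mimd(tasks):
    """Run each (function, data) pair on its own thread; return results in task order."""
    results = [None] * len(tasks)

    def worker(index, function, data):
        # Each worker runs its own program on its own data, independently of the others.
        results[index] = function(data)

    threads = [
        threading.Thread(target=worker, args=(index, function, data))
        for index, (function, data) in enumerate(tasks)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results


results = run_mimd([(sum, [1, 2, 3]), (max, [4, 9, 2]), (sorted, [3, 1, 2])])
```

The workers finish in no guaranteed order, which mirrors the asynchronous operation of MIMD processing elements; the shared `results` list plays the role of a shared resource.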
Systolic arrays are arrays of processors that are connected to a small number of nearest neighbours in a mesh-like topology.
Processors perform a sequence of operations on data that flows between them.
Generally the operations are the same in each processor, with each processor performing an operation on a data item and then passing it on to its neighbour.
Systolic arrays use a large number of PEs arranged in a well-organized structure.
In a hexagonal array, each PE has a simple function and communicates with neighbour PEs in a pipelined fashion.
Example: multiplication of two 3-by-3 matrices A and B.
Each circle represents a PE that has three inputs and three outputs.
The input and output values move through the PEs at every clock pulse.
Starting from the current state, the input values a11 and b11 and the output value c11 arrive at the same processing element (PE) after two clock pulses.
Once all these values have arrived, the PE computes a new value for c11 by performing the following operation:
c11 = c11 + a11 * b11.
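The per-PE update c = c + a * b can be simulated clock pulse by clock pulse. The sketch below is a simplification and an assumption, not the hexagonal design of the slides: it uses a square orthogonal grid in which each c value stays in its PE while A values move right and B values move down. The name `systolic_matmul` and the register layout are hypothetical.

```python
def systolic_matmul(a, b):
    """Multiply two n-by-n matrices on a simulated n-by-n grid of PEs."""
    n = len(a)
    c = [[0] * n for _ in range(n)]            # accumulator held in each PE
    a_reg = [[None] * n for _ in range(n)]     # A value currently in each PE
    b_reg = [[None] * n for _ in range(n)]     # B value currently in each PE
    for t in range(3 * n - 2):                 # clock pulses until the last values meet
        for i in range(n):
            # Shift row i of the A registers one PE to the right.
            for j in range(n - 1, 0, -1):
                a_reg[i][j] = a_reg[i][j - 1]
            k = t - i                          # row i is delayed by i pulses (skew)
            a_reg[i][0] = a[i][k] if 0 <= k < n else None
        for j in range(n):
            # Shift column j of the B registers one PE down.
            for i in range(n - 1, 0, -1):
                b_reg[i][j] = b_reg[i - 1][j]
            k = t - j                          # column j is delayed by j pulses (skew)
            b_reg[0][j] = b[k][j] if 0 <= k < n else None
        for i in range(n):
            for j in range(n):
                # Every PE performs the same multiply-accumulate when both inputs are present.
                if a_reg[i][j] is not None and b_reg[i][j] is not None:
                    c[i][j] += a_reg[i][j] * b_reg[i][j]
    return c


product = systolic_matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]])
```

The input skew makes a[i][k] and b[k][j] reach PE (i, j) on the same pulse, so each PE only ever talks to its nearest neighbours.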
LINEAR ARRAY
ORTHOGONAL SYSTOLIC ARRAY
HEXAGONAL SYSTOLIC ARRAY
TRIANGULAR ARRAY
In a linear array, processing elements are arranged in one dimension.
Each PE is interconnected with its nearest neighbours only.
Linear systolic arrays differ in the number of data flows.
Applications of linear systolic arrays are:
− Matrix-vector multiplication
− One-dimensional convolution
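Matrix-vector multiplication on a linear array can be sketched with one PE per output element. This is one possible arrangement, assumed here for illustration: the vector x flows from PE to PE while each PE keeps its own running sum. The name `linear_systolic_matvec` is hypothetical.

```python
def linear_systolic_matvec(a, x):
    """Compute y = A x on a simulated one-dimensional array with one PE per row of A."""
    n_rows, n_cols = len(a), len(x)
    y = [0] * n_rows                 # running sum held in each PE
    x_reg = [None] * n_rows          # x value currently in each PE
    for t in range(n_rows + n_cols - 1):
        # Pass each x value to the next PE in the line.
        for i in range(n_rows - 1, 0, -1):
            x_reg[i] = x_reg[i - 1]
        x_reg[0] = x[t] if t < n_cols else None
        for i in range(n_rows):
            if x_reg[i] is not None:
                j = t - i            # PE i is holding x[j] on this pulse
                y[i] += a[i][j] * x_reg[i]
    return y


y = linear_systolic_matvec([[1, 2], [3, 4]], [5, 6])
```

Only a single data flow (x) moves between PEs in this variant; other linear designs also move the partial sums, which is the sense in which linear arrays differ in the number of data flows.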
In an orthogonal systolic array, PEs are arranged in a 2D grid.
Each PE is interconnected with its nearest neighbours in each direction.
Orthogonal systolic arrays differ in the number and direction of data flows.
In a hexagonal systolic array, PEs are arranged in a two-dimensional grid.
PEs are connected to their nearest neighbours, and the interconnections have hexagonal symmetry.
Mapping the matrix-matrix multiplication algorithm results in a hexagonal array.
In a triangular array, PEs are arranged in a triangular form.
The triangular array differs from the linear array.
Triangular arrays are used for two algorithms:
− Gaussian elimination
− Decomposition
Network
Synchrony
Modularity
Regularity
Locality
Extensibility
Pipelinability
ADVANTAGES OF SYSTOLIC ARRAYS
High speed and low cost.
Simple I/O subsystem.
Regularity and modular design.
Local interconnections.
High degree of pipelining.
Highly synchronized multiprocessing.
DISADVANTAGES OF SYSTOLIC ARRAYS
Expensive.
Highly specialized, custom hardware is required, often application specific.
Not widely implemented.
Limited code base of programs and algorithms.
APPLICATIONS OF SYSTOLIC ARRAYS
Matrix inversion and decomposition.
Polynomial evaluation.
Systolic arrays for matrix multiplication.
Image processing.
Systolic lattice filters used for speech and seismic signal processing.
Artificial neural networks.
Advanced Computer Architecture: Parallelism, Scalability, Programmability, by Kai Hwang.