Post on 17-Dec-2014
Pipelining and Co-processor
What is Pipelining
In simple words, pipelining means starting the execution of the second instruction before the first has completed.
Overview
Pipelining is widely used in modern processors.
Pipelining improves system performance in terms of throughput.
Pipelined organization requires sophisticated compilation techniques.
Basic Concept
Faster Execution
Multi Tasking
Making the Execution of Programs Faster
Use faster circuit technology to build the processor and the main memory.
Arrange the hardware so that more than one operation can be performed at the same time.
In the latter way, the number of operations performed per second is increased even though the elapsed time needed to perform any one operation is not changed.
Traditional Pipeline Concept
A, B, C, and D each have one load of clothes to wash, dry, and fold.
“Washer” takes 30 minutes
“Dryer” takes 40 minutes
“Folder” takes 20 minutes
Laundry Example
Sequential laundry takes 6 hours for 4 loads
If they learned pipelining, how long would laundry take?
Figure: Sequential laundry timeline from 6 PM to midnight. Loads A to D run one after another, each taking 30 + 40 + 20 minutes.
Pipelined laundry takes 3.5 hours for 4 loads
Figure: Pipelined laundry timeline (task order A to D against time, starting at 6 PM). Successive segments last 30, 40, 40, 40, 40, and 20 minutes.
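The two laundry totals above can be checked with a short calculation. The following is a minimal Python sketch, not part of the original slides: the function names are illustrative, and it assumes each machine handles one load at a time and that a finished load may wait between machines.

```python
def sequential_minutes(stage_minutes, loads):
    """Total time when each load finishes every stage before the next load starts."""
    return loads * sum(stage_minutes)


def pipelined_minutes(stage_minutes, loads):
    """Total time when a load enters a stage as soon as that stage is free.

    Assumes one load per machine at a time and unlimited waiting space
    between machines (an assumption of this sketch).
    """
    stage_free_at = [0] * len(stage_minutes)  # when each machine becomes free
    for _ in range(loads):
        ready = 0  # when this load is ready for its next stage
        for stage, minutes in enumerate(stage_minutes):
            start = max(ready, stage_free_at[stage])
            ready = start + minutes
            stage_free_at[stage] = ready
    return stage_free_at[-1]


laundry = [30, 40, 20]  # washer, dryer, folder (minutes)
print(sequential_minutes(laundry, 4) / 60)  # 6.0 hours
print(pipelined_minutes(laundry, 4) / 60)   # 3.5 hours
```

The dryer is the slowest stage, so once the pipeline is full a load completes every 40 minutes.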
Use the Idea of Pipelining in a Computer
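Figure: Basic idea of instruction pipelining, with two steps per instruction: fetch (F) and execution (E).
(a) Sequential execution: F1, E1, F2, E2, F3, E3 follow one another in time for instructions I1, I2, I3.
(c) Pipelined execution: F2 overlaps E1 and F3 overlaps E2, so the three instructions finish in clock cycles 1 to 4.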
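The overlap of fetch and execution can also be tabulated. This is a minimal Python sketch under ideal assumptions (every stage takes exactly one clock cycle and nothing stalls); `pipeline_schedule` is an illustrative name, not something from the slides.

```python
def pipeline_schedule(num_instructions, stages=("F", "E")):
    """Return {instruction label: {stage: clock cycle}} for an ideal pipeline.

    Assumes every stage takes exactly one clock cycle and nothing stalls.
    """
    schedule = {}
    for i in range(num_instructions):
        # Instruction i enters the first stage in cycle i + 1 and moves
        # one stage forward in each following cycle.
        schedule[f"I{i + 1}"] = {stage: i + 1 + s for s, stage in enumerate(stages)}
    return schedule


for name, cycles in pipeline_schedule(3).items():
    print(name, cycles)
# I1 {'F': 1, 'E': 2}
# I2 {'F': 2, 'E': 3}
# I3 {'F': 3, 'E': 4}
```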
Role of Cache Memory
Each pipeline stage is expected to complete in one clock cycle.
The clock period should be long enough to let the slowest pipeline stage complete.
Faster stages can only wait for the slowest one to complete.
Since main memory is very slow compared to execution, the pipeline would be almost useless if each instruction had to be fetched from main memory.
Fortunately, we have cache.
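To see why the cache matters, compare the average fetch time with and without it. This minimal sketch uses the common relation "hit time plus miss rate times miss penalty"; the cycle counts and the miss rate are made-up illustrative numbers, not figures from the slides.

```python
def average_fetch_cycles(hit_cycles, miss_rate, miss_penalty_cycles):
    """Average cycles per instruction fetch: hit time + miss rate * miss penalty."""
    return hit_cycles + miss_rate * miss_penalty_cycles


# Hypothetical numbers: a 1-cycle cache hit and 10 extra cycles on a miss.
with_cache = average_fetch_cycles(hit_cycles=1, miss_rate=0.05, miss_penalty_cycles=10)
# Without a cache, every fetch pays the main-memory penalty.
without_cache = average_fetch_cycles(hit_cycles=1, miss_rate=1.0, miss_penalty_cycles=10)
print(with_cache, without_cache)
```

With these assumed numbers the fetch stage fits in roughly one and a half cycles on average when a cache is present, instead of eleven.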
Pipeline Performance
The potential increase in performance resulting from pipelining is proportional to the number of pipeline stages.
However, this increase would be achieved only if all pipeline stages require the same time to complete, and there is no interruption throughout program execution.
Unfortunately, these conditions do not hold in practice.
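The ideal case can be quantified. Assuming equal one-cycle stages and no stalls, n instructions on a k-stage pipeline need k + n - 1 cycles instead of k × n. The sketch below (the function name is illustrative) shows the speedup approaching the number of stages as n grows.

```python
def ideal_speedup(num_stages, num_instructions):
    """Speedup of an ideal pipeline over unpipelined execution.

    Assumes equal one-cycle stages and no stalls, so the pipeline needs
    num_stages + num_instructions - 1 cycles in total.
    """
    unpipelined_cycles = num_stages * num_instructions
    pipelined_cycles = num_stages + num_instructions - 1
    return unpipelined_cycles / pipelined_cycles


for n in (1, 10, 1000):
    print(n, round(ideal_speedup(4, n), 2))
# 1 1.0
# 10 3.08
# 1000 3.99
```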
Figure 8.4. Pipeline stall caused by a cache miss in F2.
(a) Instruction execution steps in successive clock cycles (instructions I1 to I3, clock cycles 1 to 9).
(b) Function performed by each processor stage in successive clock cycles.
Stages: F: Fetch, D: Decode, E: Execute, W: Write.
F2 occupies the fetch stage for four cycles, so the Decode, Execute, and Write stages each sit idle for three cycles, and W3 completes in cycle 9.
The idle periods are called stalls, or bubbles.
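The stall in Figure 8.4 can be reproduced with a small in-order model. This is a minimal Python sketch with illustrative names: it assumes a four-stage pipeline in which only the fetch time varies, and an instruction enters a stage as soon as both it and the stage are ready.

```python
def simulate(fetch_cycles, stages=("F", "D", "E", "W")):
    """Return {stage: [(instruction, first_cycle, last_cycle), ...]}.

    fetch_cycles[i] is how many cycles the fetch of instruction i + 1 takes;
    every other stage takes one cycle. Instructions proceed in order.
    """
    stage_free = {s: 1 for s in stages}  # first cycle in which each stage is free
    usage = {s: [] for s in stages}
    for i, fetch in enumerate(fetch_cycles, start=1):
        ready = 1  # first cycle in which this instruction can enter a stage
        for s in stages:
            length = fetch if s == "F" else 1
            first = max(ready, stage_free[s])
            last = first + length - 1
            usage[s].append((f"I{i}", first, last))
            stage_free[s] = last + 1
            ready = last + 1
    return usage


def idle_cycles(intervals):
    """Count idle cycles between a stage's first and last use (the bubbles)."""
    busy = sum(last - first + 1 for _, first, last in intervals)
    span = intervals[-1][2] - intervals[0][1] + 1
    return span - busy


usage = simulate([1, 4, 1])  # F2 misses in the cache and takes 4 cycles
print(usage["W"][-1][2])  # 9
print({s: idle_cycles(u) for s, u in usage.items()})
# {'F': 0, 'D': 3, 'E': 3, 'W': 3}
```

With no miss (`simulate([1, 1, 1])`) the same three instructions finish in cycle 6, so the miss costs three cycles.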
Figure 8.5. Effect of a Load instruction on pipeline timing.
Instructions I1 to I5 over clock cycles 1 to 7. I2 is the Load instruction, Load X(R1), R2, and needs an extra memory-access step (M2) between E2 and W2.
This is an example of a structural hazard.
Again, pipelining does not result in individual instructions being executed faster; rather, it is the throughput that increases.
Throughput is measured by the rate at which instruction execution is completed.
Pipeline stalls degrade pipeline performance.
We need to identify all hazards that may cause the pipeline to stall and find ways to minimize their impact.
Pipeline Hazards
There are situations, called hazards, that prevent the next instruction in the instruction stream from executing during its designated cycle.
There are three classes of hazards:
Structural hazard: resource conflicts when the hardware cannot support all possible combinations of instructions simultaneously.
Data hazard: an instruction depends on the result of a previous instruction.
Branch hazard: instructions that change the PC.
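A data hazard of the kind described above can be spotted by comparing the registers each instruction reads with the registers recently written. The following is a minimal Python sketch: the `Instr` record, the sample program, and the `window` parameter are all hypothetical, and the right window size depends on the actual pipeline.

```python
from collections import namedtuple

# Hypothetical instruction record: one destination register, a tuple of sources.
Instr = namedtuple("Instr", "text dest sources")


def find_data_hazards(program, window=1):
    """Return (producer, consumer, register) triples for read-after-write pairs.

    A pair is reported when an instruction reads a register written by one of
    the `window` instructions just before it. The value 1 is only a placeholder.
    """
    hazards = []
    for j, consumer in enumerate(program):
        for i in range(max(0, j - window), j):
            producer = program[i]
            if producer.dest is not None and producer.dest in consumer.sources:
                hazards.append((producer.text, consumer.text, producer.dest))
    return hazards


# Illustrative program; the destination register is written last.
program = [
    Instr("Mul R2, R3, R4", dest="R4", sources=("R2", "R3")),
    Instr("Add R5, R4, R6", dest="R6", sources=("R5", "R4")),
    Instr("Sub R1, R2, R7", dest="R7", sources=("R1", "R2")),
]
print(find_data_hazards(program))
# [('Mul R2, R3, R4', 'Add R5, R4, R6', 'R4')]
```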
Pipeline Stall
When a hazard prevents an instruction step from proceeding, the processor pauses that step until the hazard is resolved.
Pipeline stalls slow the execution of an instruction, but do not prevent it from executing correctly.
CO-PROCESSORS
WHAT IS A CO-PROCESSOR
A co-processor is a processor used to supplement the functions of the primary processor.
First seen on mainframe computers.
Accelerates system performance.
HISTORY OF CO-PROCESSOR
Co-processors for floating-point arithmetic first appeared in desktop computers in the 1970s.
Coprocessors became common in the 1980s and into the early 1990s.
Early 8-bit and 16-bit processors used software to carry out floating-point arithmetic operations.
Math co-processors were a popular purchase for users of computer-aided design (CAD) software and for scientific and engineering calculations.
OPERATIONS PERFORMED BY COPROCESSOR
Floating-point arithmetic
Graphics and signal processing
String processing
Encryption
Coprocessors are unable to fetch code from memory, so they work under the control of the main processor.
Architecture of 8087
INTEL 8087
Numeric processor.
Packaged in a 40-pin ceramic DIP.
Available in 5 MHz, 8 MHz, and 10 MHz versions compatible with the 8086, 8088, 80186, and 80188.
It adds 68 new instructions to the instruction set of the 8086.
How it works
8087 instructions may be interleaved with 8086 instructions in the program. The 8087 instructions are identified in the instruction stream and passed to the 8087 for execution; once the execution cycle completes, the result is available to the CPU.
Operation of 8087 does not require any software support from the system software or operating system.
Architecture of 8087
Two major sections:
1) Control unit
2) Numeric Execution unit
Control Unit
Functions:
It interfaces the coprocessor to the microprocessor system data bus.
It monitors the instruction stream. If the instruction is an ESCape (coprocessor) instruction, the coprocessor executes it; if not, the microprocessor executes it.
It receives and decodes instructions, reads and writes memory operands, and executes the 8087 control instructions.
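Recognizing an ESCape instruction comes down to a check on the opcode byte: in the 8086 family the ESC opcodes are 0xD8 through 0xDF, that is, bytes whose top five bits are 11011. A minimal Python sketch of that check follows; the function name is illustrative, and prefixes such as WAIT are ignored.

```python
def is_esc_opcode(opcode_byte):
    """True if the byte is an ESC (coprocessor) opcode, 0xD8 through 0xDF."""
    return (opcode_byte & 0xF8) == 0xD8


print(is_esc_opcode(0xD9))  # True  (an ESC opcode)
print(is_esc_opcode(0x90))  # False (NOP)
```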
Numeric Execution Unit (NEU)
Functions:
Executes all the numeric processor instructions.
It has a stack of eight 80-bit registers that holds the operands for arithmetic instructions and the results.
Instructions either address data in specific stack data registers or use a push and pop mechanism to store and retrieve data.
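The two ways of reaching data in the register stack, push/pop and addressing a specific register relative to the top, can be modeled in a few lines. This is a toy Python sketch, not the 8087's actual implementation: the class and method names are illustrative, Python floats stand in for the 80-bit values, and tag bits and stack-overflow checks are left out.

```python
class RegisterStack:
    """Toy model of an eight-register stack addressed relative to its top."""

    SIZE = 8

    def __init__(self):
        self.regs = [0.0] * self.SIZE
        self.top = 0  # index of the register that is currently ST(0)

    def push(self, value):
        # A push moves the top pointer down by one (wrapping) and stores there.
        self.top = (self.top - 1) % self.SIZE
        self.regs[self.top] = value

    def pop(self):
        # A pop returns ST(0) and moves the top pointer up by one (wrapping).
        value = self.regs[self.top]
        self.top = (self.top + 1) % self.SIZE
        return value

    def st(self, i):
        """Read ST(i), the i-th register counted from the top of the stack."""
        return self.regs[(self.top + i) % self.SIZE]


stack = RegisterStack()
stack.push(1.5)
stack.push(2.5)
print(stack.st(0), stack.st(1))   # 2.5 1.5
print(stack.pop() + stack.pop())  # 4.0
```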
Control Word Register of 8087
Coprocessor Control Instructions
The coprocessor has control instructions for initialization, exception handling, and task switching.
All control instructions have two forms.
FINIT/FNINIT
Performs a reset (initialize) operation on the arithmetic coprocessor.
When reset or initialized, the coprocessor operates with projective closure (unsigned infinity), rounds to nearest or even, and uses extended precision.
It also sets register 0 as the top of the stack.
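The reset state described above corresponds to a few bit fields in the control word. The sketch below decodes them in Python. The field positions follow the commonly documented x87 control word layout (precision in bits 8 and 9, rounding in bits 10 and 11, infinity control in bit 12); the sample word 0x03FF is chosen here to encode that reset state and should be treated as illustrative.

```python
PRECISION = {0b00: "24-bit (single)", 0b01: "reserved",
             0b10: "53-bit (double)", 0b11: "64-bit (extended)"}
ROUNDING = {0b00: "nearest or even", 0b01: "down",
            0b10: "up", 0b11: "truncate"}


def decode_control_word(word):
    """Split a 16-bit control word into the fields discussed above."""
    return {
        "exception_masks": word & 0x3F,                # bits 0-5
        "precision": PRECISION[(word >> 8) & 0b11],    # bits 8-9
        "rounding": ROUNDING[(word >> 10) & 0b11],     # bits 10-11
        "infinity": "affine" if (word >> 12) & 1 else "projective",  # bit 12
    }


print(decode_control_word(0x03FF))
```

For the sample word the decoded fields are extended precision, round to nearest or even, and projective infinity, with all six exception masks set.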
FSETPM
Changes the coprocessor to the protected addressing mode.
Used when the microprocessor is operated in protected mode.
Protected mode can only be exited by a hardware reset or, in the 80386 through Pentium 4, with a change to the control register.
FLDCW
Loads the control register with the word addressed by the operand.
FSTCW
Stores the control register into the word-sized memory operand.
FSTSW AX
Copies the contents of the status register to the AX register.
Not available on the 8087.
FCLEX
Clears the error flags in the status register and also the busy flag.
Graphics Coprocessor
A high-speed display adapter that is dedicated to graphics operations such as line drawing and plotting.
A coprocessor used to accelerate the display of graphics, significantly speeding up the updating of images on a screen and freeing the CPU to take care of other tasks.
A graphics coprocessor may be incorporated into a graphics accelerator, or may be part of a separate subsystem. Also called a graphics processor.