An Instruction Set and Microarchitecture for Instruction Level Distributed Processing

Post on 20-Mar-2016



(Ho-Seop Kim and James E. Smith)

Haiying Qu, Electrical and Computer Engineering, University of Alberta

Introduction 1

ILP: Instruction Level Parallelism. Achieved significant performance gains.

ILDP: Instruction Level Distributed Processing. Technology trend.

Introduction 2

Proposed microarchitecture:

Short pipelines

Distributed processing elements: in-order instruction processing within each element enables out-of-order execution overall

Strand: a chain of dependent instructions

Accumulator: inter-instruction communication

Instruction Set

64 general-purpose registers (GPRs): R0-R63, used as source or destination

8 accumulators: A0-A7

Dead accumulator

Load/store Instruction

One accumulator value, one GPR, one parcel

Ai <- mem(Aj)
Ai <- mem(Rj)
mem(Ai) <- Rj
mem(Rj) <- Ai

Register Instruction

Operation: accumulator and GPR/immediate
Result: accumulator or GPR

Ai <- Ai op Rj
Ai <- Ai op immed
Ai <- Rj op immed
Rj <- Ai
Rj <- Ai op immed

Branch/jump Instruction

Conditional branch: compare Ai with 0 or a GPR (all usual predicates)
Program counter (P)
Indirect jump: Ai or GPR
Return address: GPR

P <- P + immed; Ai pred Rj
P <- P + immed; Ai pred 0
P <- Ai
P <- Rj
P <- Ai; Rj <- P++
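To make the load/store, register, and branch forms listed above concrete, here is a minimal Python sketch of an interpreter for a subset of them. The tuple encoding, the operation and predicate names, and the choice to count P in whole instructions are all assumptions made for illustration; the slides do not specify an encoding.

```python
import operator

# Hypothetical operation and predicate tables; the slides only say "op" and "pred".
OPS = {"add": operator.add, "sub": operator.sub, "and": operator.and_}
PREDS = {"eq": operator.eq, "ne": operator.ne, "lt": operator.lt}


def run(program, gpr, mem, max_steps=1000):
    """Execute tuple-encoded instructions; return (acc, gpr, mem)."""
    acc = [0] * 8      # accumulators A0-A7
    pc = 0             # P, counted in instructions (an assumption)
    for _ in range(max_steps):
        if pc >= len(program):
            break
        ins = program[pc]
        kind = ins[0]
        next_pc = pc + 1
        if kind == "load_r":          # Ai <- mem(Rj)
            _, i, j = ins
            acc[i] = mem[gpr[j]]
        elif kind == "load_a":        # Ai <- mem(Aj)
            _, i, j = ins
            acc[i] = mem[acc[j]]
        elif kind == "store_at_r":    # mem(Rj) <- Ai
            _, j, i = ins
            mem[gpr[j]] = acc[i]
        elif kind == "store_at_a":    # mem(Ai) <- Rj
            _, i, j = ins
            mem[acc[i]] = gpr[j]
        elif kind == "op_r":          # Ai <- Ai op Rj
            _, op, i, j = ins
            acc[i] = OPS[op](acc[i], gpr[j])
        elif kind == "op_i":          # Ai <- Ai op immed
            _, op, i, imm = ins
            acc[i] = OPS[op](acc[i], imm)
        elif kind == "start_i":       # Ai <- Rj op immed
            _, op, i, j, imm = ins
            acc[i] = OPS[op](gpr[j], imm)
        elif kind == "copy":          # Rj <- Ai
            _, j, i = ins
            gpr[j] = acc[i]
        elif kind == "br_zero":       # P <- P + immed; Ai pred 0
            _, pred, i, imm = ins
            if PREDS[pred](acc[i], 0):
                next_pc = pc + imm
        else:
            raise ValueError(f"unknown instruction: {ins!r}")
        pc = next_pc
    return acc, gpr, mem


def demo():
    """Sum three memory words with a small loop (illustrative data only)."""
    gpr = [0] * 64                    # R0-R63
    gpr[1], gpr[2] = 100, 3           # R1 = pointer, R2 = count, R3 = running sum
    mem = {100: 5, 101: 7, 102: 9}
    program = [
        ("load_r", 0, 1),             # A0 <- mem(R1)
        ("op_r", "add", 0, 3),        # A0 <- A0 + R3
        ("copy", 3, 0),               # R3 <- A0
        ("start_i", "add", 1, 1, 1),  # A1 <- R1 + 1
        ("copy", 1, 1),               # R1 <- A1
        ("start_i", "sub", 2, 2, 1),  # A2 <- R2 - 1
        ("copy", 2, 2),               # R2 <- A2
        ("br_zero", "ne", 2, -7),     # loop back to instruction 0 while A2 != 0
    ]
    return run(program, gpr, mem)
```

In the demo, each loop iteration uses three accumulators for three short dependence chains (sum, pointer, counter), and values that must survive across iterations are copied back to GPRs.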

Example Code

Strand

Figure 3. Types of values and associated registers

Strand ends

Two strands intersect: copy one to GPR

Output is a static global register

New strand

Figure 4. Issue timing
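The strand annotations above can be illustrated with a small Python sketch that groups an instruction sequence into strands. It assumes that a new strand begins whenever an instruction writes an accumulator without reading it (for example Ai <- mem(Rj) or Ai <- Rj op immed); that rule and the tuple layout are assumptions for illustration, not something the slides state.

```python
def split_strands(instructions):
    """Group instructions into strands.

    Each instruction is a tuple (acc, reads_acc, text), where acc is the
    accumulator number it uses and reads_acc says whether it reads that
    accumulator as a source. Returns a list of strands (lists of text).
    """
    strands = []
    open_strand = {}   # accumulator number -> index of its current strand
    for acc, reads_acc, text in instructions:
        if not reads_acc or acc not in open_strand:
            # Assumed rule: writing an accumulator without reading it
            # ends the old strand on that accumulator and starts a new one.
            strands.append([])
            open_strand[acc] = len(strands) - 1
        strands[open_strand[acc]].append(text)
    return strands


# Illustrative input only; not the example code from the slides.
EXAMPLE = [
    (0, False, "A0 <- mem(R1)"),
    (0, True, "A0 <- A0 + R2"),
    (1, False, "A1 <- R3 + 8"),
    (0, True, "R4 <- A0"),
    (1, True, "A1 <- mem(A1)"),
    (0, False, "A0 <- R4 - 1"),
]
```

Here the last instruction reuses A0 without reading it, so it opens a third strand even though the accumulator number matches the first one.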

Stages

Fetch: 4 words (over 4 instructions)
Parceling: break into individual instructions
Renaming: GPRs
Steering: into FIFOs according to the accumulators
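The steering stage can be sketched in Python as follows. The slides say only that instructions are steered into FIFOs according to their accumulators; the policy used here (a new strand goes to the shortest FIFO, and later instructions on the same accumulator follow it) and the tuple layout are assumptions for illustration.

```python
from collections import deque


def steer(instructions, num_fifos=4):
    """Steer (acc, starts_strand, text) tuples into per-PE FIFOs."""
    fifos = [deque() for _ in range(num_fifos)]
    acc_to_fifo = {}   # accumulator number -> FIFO holding its current strand
    for acc, starts_strand, text in instructions:
        if starts_strand or acc not in acc_to_fifo:
            # Assumed policy: pick the shortest FIFO, lowest index on ties.
            acc_to_fifo[acc] = min(range(num_fifos), key=lambda k: len(fifos[k]))
        fifos[acc_to_fifo[acc]].append(text)
    return fifos


# Illustrative input: two interleaved strands on A0 and A1.
STREAM = [
    (0, True, "A0 <- mem(R1)"),
    (1, True, "A1 <- R2 + 4"),
    (0, False, "A0 <- A0 + R3"),
    (1, False, "A1 <- A1 - 1"),
    (0, False, "R3 <- A0"),
]
```

Because every instruction of a strand lands in the same FIFO, each FIFO can issue in order while different FIFOs proceed independently.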

Figure 5 ILDP Processor Block Diagram

Some Concepts

PE: Processing Element
IR: Issue Register (single reservation station)
ICN: Interconnection Network

Figure 6 Microarchitecture

Table 1 Complexity Comparison

Note: the ILDP figures are based on one PE

Table 2 Benchmark Program Properties

Evaluation 1

Figure 7 Types of register values
Figure 8 Average strand length

Evaluation 2

Figure 9 Strand end
Figure 10 Instruction size

Evaluation 3

Figure 11 Cumulative strand re-use
Figure 12 IPC

Evaluation 4

Figure 13 Global register rename map read/write bandwidth

Table 3 Simulator Configurations

Discussion