An Instruction Set and Microarchitecture for Instruction Level Distributed Processing


Page 1

An Instruction Set and Microarchitecture for Instruction Level Distributed Processing

(Ho-Seop Kim and James E. Smith)

Haiying Qu, Electrical and Computer Engineering, University of Alberta

Page 2

Introduction 1

ILP: Instruction Level Parallelism
    Achieved significant performance gains

ILDP: Instruction Level Distributed Processing
    Technology trend

Page 3

Introduction 2

Proposed microarchitecture
    Short pipelines
    Distributed processing elements: each processes its instructions in order; together they enable out-of-order execution

Strand: a chain of dependent instructions

Accumulator: inter-instruction communication within a strand

Page 4

Instruction Set

64 general-purpose registers (GPRs): R0-R63
    Source or destination

8 accumulators: A0-A7
    Dead accumulator

Page 5

Load/store Instruction

One accumulator value
One GPR
One parcel

Ai <- mem(Aj)
Ai <- mem(Rj)
mem(Ai) <- Rj
mem(Rj) <- Ai

Page 6

Register Instruction

Operation: accumulator and GPR/immediate
Result: accumulator or GPR

Ai <- Ai op Rj
Ai <- Ai op immed
Ai <- Rj op immed
Rj <- Ai
Rj <- Ai op immed
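The load/store and register forms listed on the slides can be modeled in a few lines. The Python sketch below is illustrative only: the `State` class, the function names, the dictionary-backed memory, and the `demo` sequence are assumptions made for this example and are not taken from the slides. It covers a subset of the listed forms.

```python
import operator
from dataclasses import dataclass, field


@dataclass
class State:
    """Hypothetical machine state for the sketch."""
    acc: list = field(default_factory=lambda: [0] * 8)    # accumulators A0-A7
    gpr: list = field(default_factory=lambda: [0] * 64)   # registers R0-R63
    mem: dict = field(default_factory=dict)               # address -> value (assumed model)


def load_via_gpr(s, i, j):
    """Ai <- mem(Rj)"""
    s.acc[i] = s.mem.get(s.gpr[j], 0)


def store_via_gpr(s, i, j):
    """mem(Rj) <- Ai"""
    s.mem[s.gpr[j]] = s.acc[i]


def acc_op_gpr(s, i, j, op):
    """Ai <- Ai op Rj"""
    s.acc[i] = op(s.acc[i], s.gpr[j])


def acc_op_imm(s, i, imm, op):
    """Ai <- Ai op immed"""
    s.acc[i] = op(s.acc[i], imm)


def copy_acc_to_gpr(s, i, j):
    """Rj <- Ai"""
    s.gpr[j] = s.acc[i]


def demo():
    """Run one short strand that lives entirely in A0, then publish it to R3."""
    s = State()
    s.gpr[1], s.gpr[2] = 100, 5
    s.mem[100] = 7
    load_via_gpr(s, 0, 1)                # A0 <- mem(R1)   -> 7
    acc_op_gpr(s, 0, 2, operator.add)    # A0 <- A0 + R2   -> 12
    acc_op_imm(s, 0, 3, operator.mul)    # A0 <- A0 * 3    -> 36
    copy_acc_to_gpr(s, 0, 3)             # R3 <- A0
    return s
```

In `demo`, every intermediate value stays in accumulator A0 and only the final result is copied to a GPR, which mirrors the accumulator-plus-GPR operand pattern shown in the instruction forms.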

Page 7

Branch/jump Instruction

Conditional branch: compare Ai with 0 or a GPR (all usual predicates)
Program counter (P)
Indirect jump: Ai or GPR
Return address: GPR

P <- P + immed; Ai pred Rj
P <- P + immed; Ai pred 0
P <- Ai
P <- Rj
P <- Ai; Rj <- P++
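As a rough illustration of the branch and jump forms, the sketch below computes the next program counter. Several details are assumptions not stated on the slide: the PC counts whole instructions, a not-taken branch falls through to `pc + 1`, the offset is relative to the branch's own PC, and the predicate names in `PREDICATES` are made up for this example.

```python
import operator

# Hypothetical predicate names standing in for "all usual predicates".
PREDICATES = {
    "eq": operator.eq, "ne": operator.ne,
    "lt": operator.lt, "le": operator.le,
    "gt": operator.gt, "ge": operator.ge,
}


def cond_branch(pc, imm, pred, acc_value, other=0):
    """P <- P + immed if (Ai pred other), where other is 0 or a GPR value.

    Falls through to pc + 1 when the predicate is false (assumed encoding).
    """
    if PREDICATES[pred](acc_value, other):
        return pc + imm
    return pc + 1


def indirect_jump(target):
    """P <- Ai  or  P <- Rj: the target comes from an accumulator or a GPR."""
    return target


def jump_and_link(pc, target):
    """P <- Ai; Rj <- P++ : returns (new_pc, return_address_for_Rj)."""
    return target, pc + 1
```

For example, `cond_branch(10, 5, "lt", -1)` takes the branch because the accumulator value is below zero, while `jump_and_link(20, 100)` yields the new PC together with the return address that would be written to the GPR.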

Page 8

Example Code

Page 9

Strand

Figure 3. Types of values and associated registers

Page 10

Strand Ends

Two strands intersect: copy one to a GPR
Output is a static global register
New strand

Figure 4. Issue timing

Page 11

Stages

Fetch: 4 words, which may hold more than 4 instructions
Parceling: break the fetched words into individual instructions
Renaming: GPRs
Steering: into FIFOs according to the accumulators
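The steering stage can be pictured as a small routing table from accumulator number to FIFO. The sketch below is a simplified model under stated assumptions: the `(name, acc, starts_strand)` tuple format, the default of four FIFOs, and the "least-occupied FIFO" choice for a new strand are all hypothetical and are not specified on the slide.

```python
from collections import deque


def steer(instructions, num_fifos=4):
    """Route instructions into FIFOs by accumulator number.

    instructions: iterable of (name, acc, starts_strand) tuples (assumed format).
    Instructions that continue a strand follow their accumulator's FIFO;
    a strand-starting instruction is given a FIFO by the assumed policy below.
    """
    fifos = [deque() for _ in range(num_fifos)]
    acc_to_fifo = {}  # current accumulator -> FIFO index mapping
    for name, acc, starts_strand in instructions:
        if starts_strand or acc not in acc_to_fifo:
            # Assumed policy: pick the least-occupied FIFO, lowest index on ties.
            acc_to_fifo[acc] = min(range(num_fifos), key=lambda k: len(fifos[k]))
        fifos[acc_to_fifo[acc]].append(name)
    return fifos


program = [
    ("i0", 0, True),   # starts a strand in A0
    ("i1", 1, True),   # starts a strand in A1
    ("i2", 0, False),  # continues the A0 strand
    ("i3", 1, False),  # continues the A1 strand
    ("i4", 0, True),   # A0 is reused for a new strand
]
queues = [list(q) for q in steer(program, num_fifos=2)]
```

Because instructions of one strand share a FIFO, each FIFO can be drained in order while different FIFOs proceed independently, which is the in-order-within, out-of-order-across behavior described in the introduction slides.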

Page 12

Figure 5. ILDP processor block diagram

Page 13

Some Concepts

PE: Processing Element
IR: Issue Register (a single reservation station)
ICN: Interconnection Network

Page 14

Figure 6. Microarchitecture

Page 15

Table 1. Complexity comparison

Note: the ILDP values are for a single PE.

Page 16

Table 2. Benchmark program properties

Page 17

Evaluation 1

Figure 7. Types of register values
Figure 8. Average strand length

Page 18

Evaluation 2

Figure 9. Strand end
Figure 10. Instruction size

Page 19

Evaluation 3

Figure 11. Cumulative strand re-use
Figure 12. IPC

Page 20

Evaluation 4

Figure 13. Global register rename map read/write bandwidth

Page 21

Table 3. Simulator configurations

Page 22

Discussion