U.maheswaran Presentation

CGRA WITH FLOATING POINT SUPPORT FOR SIGNED ALU

OPERATIONS MAPPED CGRA FOR FLOATING POINT OPERATIONS

U.Maheswaran.,M.E.[A.E]

PG Scholar,M.N.M.Jain Engineering college,

Chennai,Tamilnadu,India

P.Venugopal M.E.[A.E],M.B.A.

Asst.Professor,Dept.of ECE,

M.N.M.Jain Engineering college,


Submitted by

CONTENTS:

Introduction
Problem Definition
Technical Background
Proposed Idea
Design Methodology
Design Aspects
Results
Conclusions
Queries

INTRODUCTION

Why reconfigurable computing?

What are CGRAs?

Why not FPGAs for complex applications?

Reconfigurable computing & Mapping applications?

PROBLEM DEFINITION

FPGAs are fine-grained architectures [bit-level granularity], so flexibility is reduced and complexity increases.

Hence CGRAs come into the picture. A CGRA has word/nibble-level granularity.

But CGRA applications are domain specific, and a typical CGRA supports only integer arithmetic.

We propose a new architecture supporting both integer and floating point operations.

TECHNICAL BACKGROUND

In the current scenario, systems with reconfigurable logic modules have a great impact on many technical applications.

FPGAs are widely used in many technical domains to implement complex algorithms. But FPGAs offer less flexibility and lower efficiency because of their fine-grained architecture.

CONTINUED..

If we use such a fine-grained architecture for complex algorithms, flexibility has to be sacrificed and the system becomes more complex.

Coarse-grained architectures have coarser granularity: the resource entities (hardware/problem) are divided into larger grains.

CONTINUED..

Hence, using a CGRA for complex algorithms preserves flexibility.

But typical CGRAs come without a floating point unit, and they are domain specific.

To overcome these barriers, heuristic mapping functions are used, so that a floating point unit can be dynamically created by the mapping algorithm.

CONTINUED..

Hence this floating point enabled CGRA can be used for complex applications involving floating point arithmetic.

E.g.: DSP filter design, Graphics accelerators, and many multimedia applications.

Thus the hardware flexibility of the system is improved by using high-performance hardware, and programming flexibility is achieved through the mapping algorithms.

PROPOSED IDEA

The real challenge before us is the grain size of the reconfigurable device.

The aim is to group the basic units of the reconfigurable device around a data bus of a particular width, and thereby coarsen its granularity.

EXISTING MODEL - FPGA

Existing FPGAs support algorithms based on integer arithmetic.

The FUs can execute common word-level operations.

The register file of each computing module is local to that module.

There is no shared bus communication among reconfigurable computing modules.

MAPPED CGRA WITH FLOATING POINT SUPPORT

CONTINUED..

The target architecture consists of a reconfigurable computing module (RCM).

The RCM executes loop kernel code segments.

A general-purpose processor is present for controlling the RCM.

These units are connected with a shared bus.

Each PE can be dynamically reconfigured to perform an arithmetic/logic operation.

DESIGN FLOW

DATA PATH & CONTROL PATH DESIGN

An ASIC is typically implemented as an FSMD (finite state machine with datapath).

It consists of a control path and a data path.

Control path design: generation of a set of control signals [the control word] at every clock cycle.

Data path design: execution of the computational tasks described by the control word.
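The control path / data path split can be sketched in software. The example below is only an illustration, not the design from these slides: it assumes a small FSMD that computes a GCD by repeated subtraction, and the names ControlWord, DataPath, control_path and run_gcd are hypothetical. Each loop iteration stands for one clock cycle: the control path reads the state and the status flags and emits a control word, and the data path performs the register transfer that the control word selects.

```python
from dataclasses import dataclass


@dataclass
class ControlWord:
    """One set of control signals, issued once per clock cycle."""
    load_inputs: bool = False  # latch the operands into registers a and b
    sub_a: bool = False        # a <- a - b
    sub_b: bool = False        # b <- b - a
    done: bool = False         # register a holds the result


class DataPath:
    """Registers plus the operations selected by the control word."""

    def __init__(self):
        self.a = 0
        self.b = 0

    def step(self, cw, x, y):
        if cw.load_inputs:
            self.a, self.b = x, y
        elif cw.sub_a:
            self.a -= self.b
        elif cw.sub_b:
            self.b -= self.a

    def status(self):
        # Status flags fed back to the control path.
        return self.a == self.b, self.a > self.b


def control_path(state, equal, greater):
    """Return (control word, next state) for the current state and flags."""
    if state == "LOAD":
        return ControlWord(load_inputs=True), "COMPARE"
    if state == "COMPARE":
        if equal:
            return ControlWord(done=True), "DONE"
        if greater:
            return ControlWord(sub_a=True), "COMPARE"
        return ControlWord(sub_b=True), "COMPARE"
    return ControlWord(done=True), "DONE"


def run_gcd(x, y):
    """Clock the FSMD until it signals done; x and y must be positive."""
    dp, state, cycles = DataPath(), "LOAD", 0
    while True:
        equal, greater = dp.status()
        cw, state = control_path(state, equal, greater)
        dp.step(cw, x, y)
        cycles += 1
        if cw.done:
            return dp.a, cycles


if __name__ == "__main__":
    print(run_gcd(12, 18))
```

In hardware the control path would be a state register with next-state logic and the data path a set of registers, an ALU and multiplexers; the sketch only mirrors that division of work.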

RECONFIGURING THE TARGET ARCHITECTURE

The mapped CGRA contains a co-processor [KCPSM3 Pico-Blaze] on the host FPGA, used to reconfigure the grain size of the FPGA.

KCPSM3 [Constant(K) Coded Programmable State Machine] is a free soft processor core from Xilinx for use in their FPGAs.

CONTINUED..

Xilinx documents the Pico-Blaze as requiring just 96 FPGA slices.

It runs the kernel in a loop and reconfigures the CLBs into the required PEs.

Reconfiguration details are stored in configurable caches.

The floating point adder unit is then synthesized on the mapped CGRA and addition is performed.

DESIGN ASPECTS

RESULTING PE OF MAPPED CGRA

FLOATING POINT OPERATIONS WITH MAPPED CGRA

A pair of PEs is used for floating point operations. One PE computes the mantissa and the other handles the exponent.
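As a rough illustration of how integer-only PEs could cooperate on a floating point addition, the sketch below splits the work into an exponent step and a mantissa step. Everything in it is an assumption made for the example: the number format (a signed integer mantissa m and an integer exponent e with value m * 2**e, 24-bit mantissa), the function names, and the exact division of work between the two PEs. It is not IEEE 754: there is no rounding (shifted-out bits are truncated), no NaN or infinity, and no exponent range check.

```python
import math

MANT_BITS = 24  # assumed mantissa width for this sketch


def shift_right(m, n):
    """Shift a signed mantissa right by n bits, truncating toward zero."""
    magnitude = abs(m) >> n
    return magnitude if m >= 0 else -magnitude


def normalize(m, e):
    """Bring |m| into [2**(MANT_BITS-1), 2**MANT_BITS), adjusting e to match."""
    if m == 0:
        return 0, 0
    while abs(m) >= 1 << MANT_BITS:
        m, e = shift_right(m, 1), e + 1
    while abs(m) < 1 << (MANT_BITS - 1):
        m, e = m * 2, e - 1
    return m, e


def encode(x):
    """Convert a Python float to the (mantissa, exponent) pair used here."""
    frac, exp = math.frexp(x)  # x == frac * 2**exp, 0.5 <= |frac| < 1
    return normalize(int(frac * (1 << MANT_BITS)), exp - MANT_BITS)


def decode(value):
    m, e = value
    return math.ldexp(m, e)


def exponent_pe(ea, eb):
    """Exponent PE: result exponent and the alignment shift for each operand."""
    e = max(ea, eb)
    return e, e - ea, e - eb


def mantissa_pe(ma, mb, shift_a, shift_b):
    """Mantissa PE: align both mantissas, then add them as signed integers."""
    return shift_right(ma, shift_a) + shift_right(mb, shift_b)


def fp_add(a, b):
    (ma, ea), (mb, eb) = a, b
    if ma == 0:
        return b
    if mb == 0:
        return a
    e, shift_a, shift_b = exponent_pe(ea, eb)
    m = mantissa_pe(ma, mb, shift_a, shift_b)
    return normalize(m, e)


if __name__ == "__main__":
    print(decode(fp_add(encode(1.5), encode(2.25))))
```

Only integer compare, subtract, shift and add are used, which is why such an operation can in principle be mapped onto PEs whose ALUs support signed integer arithmetic.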

Steiner tree routing is preferred for faster routing performance. After ILP/QEA, heuristic approaches are followed to increase performance.

CONTINUED

Thus, each operation in a loop body is spatially mapped to a dedicated PE.

The main advantage of spatial mapping is that each PE may not need reconfiguration during execution of a loop because of its fixed functionality.

However, it has a disadvantage that spreading all operations of the loop body over the reconfigurable array may require a very large array size.
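The trade-off of spatial mapping can be shown with a very small sketch. It assumes a hypothetical rows x cols PE array and a simple row-major placement; the function name spatial_map and the operation names are made up for the example, and a real mapper would also consider routing and PE pairs for floating point operations.

```python
def spatial_map(ops, rows, cols):
    """Give every loop-body operation its own PE, in row-major order.

    Returns a dict from operation name to (row, column). Raises ValueError
    when the loop body has more operations than the array has PEs, which is
    the array-size drawback of spatial mapping.
    """
    capacity = rows * cols
    if len(ops) > capacity:
        raise ValueError(
            f"{len(ops)} operations do not fit on {capacity} PEs"
        )
    return {op: divmod(index, cols) for index, op in enumerate(ops)}


if __name__ == "__main__":
    # Hypothetical loop body with three operations on a 2 x 2 array.
    print(spatial_map(["mul0", "add0", "sub0"], rows=2, cols=2))
```

Because the placement is fixed for the whole loop, no PE changes function between iterations; the cost is that the array must have at least as many PEs as the loop body has operations.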

CONTINUED..

The operations that a PE (or a pair of PEs) in our CGRA can execute are classified into three groups.

1) Arithmetic/logical operations: A PE can execute ALU operations in one clock cycle.

2) Multiply/divide/load/store operations: These operations are executed by dedicated functional resources located outside the PE array in several clock cycles.

3) Floating-point operations: A pair of PEs can execute floating-point operations, taking several clock cycles.
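The three groups can be written down as a small lookup that a mapper might consult. This is a sketch under assumptions: the opcode names, the OpGroup enum and the helper functions are hypothetical, and only the facts stated above are encoded (one PE and one cycle for ALU operations, resources outside the array for multiply/divide/load/store, a pair of PEs for floating point). Cycle counts for the multi-cycle groups are not given in the slides, so none are assumed here.

```python
from enum import Enum


class OpGroup(Enum):
    ALU = "arithmetic/logical"
    SHARED = "multiply/divide/load/store"
    FLOAT = "floating-point"


# Hypothetical opcode names, chosen only for illustration.
GROUP_OF = {
    "add": OpGroup.ALU, "sub": OpGroup.ALU, "and": OpGroup.ALU, "or": OpGroup.ALU,
    "mul": OpGroup.SHARED, "div": OpGroup.SHARED,
    "load": OpGroup.SHARED, "store": OpGroup.SHARED,
    "fadd": OpGroup.FLOAT, "fsub": OpGroup.FLOAT,
}

# PEs inside the array that one operation of each group occupies.
PES_NEEDED = {OpGroup.ALU: 1, OpGroup.SHARED: 0, OpGroup.FLOAT: 2}

# Only ALU operations are stated to finish in a single clock cycle.
SINGLE_CYCLE = {OpGroup.ALU: True, OpGroup.SHARED: False, OpGroup.FLOAT: False}


def classify(opcode):
    """Return the group of an opcode; unknown opcodes raise KeyError."""
    return GROUP_OF[opcode]


def pes_required(opcodes):
    """Total PEs occupied inside the array by a list of operations."""
    return sum(PES_NEEDED[classify(op)] for op in opcodes)


if __name__ == "__main__":
    body = ["load", "fadd", "add", "mul", "store"]
    print(pes_required(body))
```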

CONCLUSION

Thus increased performance and flexibility are achieved in both programming and hardware by this mapping process over a reconfigurable device.

A faster, more flexible reconfigurable hardware mapped in this way to support floating point operations can draw good attention in the embedded systems industry.


Queries

?

AUTHORS

P.Venugopal M.E.[A.E],M.B.A.

Asst.Professor,Dept.of ECE,



[email protected] +91-9444420128

U.Maheswaran.,M.E.[A.E]

PG Scholar,



[email protected] +91-9944215357
