Uic Montone Thesis
-
Upload
usrdresd -
Category
Technology
-
view
196 -
download
0
Transcript of Uic Montone Thesis
Time-driven reconfiguration-aware
floorplacer
BY
Alessio Montone
Thesis committee:
S. Dutt (chair), A. Khokhar, M.D. Santambrogio
UIC Thesis Defense: 05/08/2008
2
Rationale and InnovationRationale and Innovation
Problem statementGiven a reconfigurable architecture, find an on-chip position for each functional unit
Innovative contribution: taking into accountTarget Device HeterogeneityTarget Device reconfiguation capabilities Inter-FU Communication
3
AimsAims
Considering the area assignment problem tailored for reconfigurable architectures, providea formalization of the problem, andan approach (in 3 algorithms) for solving
4
OutlineOutline
IntroductionFloorplacementThe Proposed ApproachExperimental ResultsComparison with the state of the artConclusions and Future WorksQuestions
INTRODUCTIONINTRODUCTION
5
Reconfigurable Architectures - IReconfigurable Architectures - I
On FPGAsReconfigurable DevicesHeterogeneousReconfiguration Limits
Different types of Reconfigurable Architectures:
TotalPartial (Static)Partial (Dynamic)
6
Reconfigurable Architectures - IReconfigurable Architectures - I
Total
7
Reconfigurable Architectures - IReconfigurable Architectures - I
Total
Partial (Static)
8
Reconfigurable Architectures - IIReconfigurable Architectures - II
Partial Dynamic
9
Area Assignment ProblemArea Assignment Problem
Let consider a Reconfigurable ArchitectureGiven a scheduled task graph (TG) of the application
Node: Reconfigurable Functional Unit (RFU) [*], A netlist obtained after post synthesis and technology mapping (i.e., before placement and routing)
Aim: find an area assignment for each RFU
10
[*] K. Bazargan, R. Kastner, M.S.: 3-d floorplanning: Simulated annealing and greedy placement methods for reconfigurable computing systems. IEEE Rapid Systems Prototyping (1999)
Related Works - IRelated Works - I
[*] introduced the concept of 3D floorplanning for reconfigurable systems
SA in order to solve HW/SW codesign problemFor each task choose between
HW implementationSW implementation
LimitsNo device limits consideredNo communicationinfrastructure
11
[*] K. Bazargan, R. Kastner, M.S.: 3-d floorplanning: Simulated annealing and greedy placement methods for reconfigurable computing systems. IEEE Rapid Systems Prototyping (1999)
Related Works - IIRelated Works - II
[*] is the state of art in 3D floorplanningSimulated Annealing over Transitive Closure GraphTakes into account device reconfiguration limits
LimitsNo heterogeneity consideredHigh overhead communicationinfrastructure solution [**]
12
[*] Ping-Hung Yuh, Chia-Lin Yang, Yao-Wen Chang, Hsin-Lung Chen: Temporal Floorplanning Using 3D-subTCG, Design Automation Conference, 2004
[**] S. P. Fekete, E. Kohler, and J. Teich: Optimal FPGA Module Placement with Temporal Precedence Constraints, Proc. DATE, 2001.
FLOORPLACEMENTFLOORPLACEMENT
13
Floorplanning vs. PlacementFloorplanning vs. PlacementCharacteristic Floorplanning Placement
# items <100 >10.000
Items (for FPGAs) IP-Core Slice, CLB
Aim Find a position for each item
obj. function depends
mainly on Area mainly on Wirelength
Constraints Items can be positioned everywhere
There is a set of possible positions
14
PlacementFloor plan
Floorplacement - IFloorplacement - I
Hierarchical Approach (Floorplanning + Partitioning)
15
S. N. Adya, I. L. Markov, Fixed-outline Floorplanning: Enabling Hierarchical Design, IEEE Transaction on VLSI System, 2002
S. N. Adya, S. Chaturvedi, J. A. Roy, David A. Papa, I. L. Markov: Unification of Partitioning, Placement and Floorplanning , IEEE Intl. Conf. on CAD, 2004
Floorplacement - IIFloorplacement - II
Reconfigurable Functional Unit (RFU)A netlist obtained after post synthesis and technology mapping (i.e., before placement and routing)
Reconfigurable Region (RR)
16
Floorplacement - IIIFloorplacement - III
Resource Aware (i.e., not all positions are feasible)
Device heterogeneity
Device Reconfigurationcapabilities
17
THE PROPOSED THE PROPOSED APPROACHAPPROACH
18
Proposed Problem DefinitionProposed Problem Definition
19
AimDefine RRsFor each task find
find a RRA position inside RR
Objective FunctionMin. Fragmentation
ConstraintsCommunication issuesDevice limits
Target Devices: Xilinx Virtex 4 - 5Target architecture based on EAPR design flow
Target Architecture and DevicesTarget Architecture and Devices
20
InputInput
21
Proposed Approach: overviewProposed Approach: overview
22
11stst Algorithm: Partitioning into Algorithm: Partitioning into RRRR
Aim: identify the RRs and associate each RFU to one RRHow: partitioning the TG minimizing resource requirement variance of the RRs (moving and swapping nodes)
23
Resource of type t required by RFU n, at static photo p
22ndnd Algorithm:TFiRR - I Algorithm:TFiRR - I
24
Temporal Floorplacement inside RR (TFiRR)Aim: for each RR find a set of feasible width-height pairsHow: floorplacing RFUs inside corresponding RR Assumption: RFUs’ height = height of the RR they belong to
Pseudo Code:
22ndnd Algorithm:TFiRR - II Algorithm:TFiRR - II
25
Let consider an iteration:
22ndnd Algorithm:TFiRR - III Algorithm:TFiRR - III
26
Let consider an iteration:
22ndnd Algorithm:TFiRR - IV Algorithm:TFiRR - IV
27
Let consider an iteration:
22ndnd Algorithm:TFiRR - V Algorithm:TFiRR - V
28
Let consider an iteration:
22ndnd Algorithm:TFiRR - VI Algorithm:TFiRR - VI
29
22ndnd Algorithm: TFiRR – Example Algorithm: TFiRR – Example
30
33rdrd Algorithm: RR floorplacement - I Algorithm: RR floorplacement - I
Simulated AnnealingObjective Function
Data Structure4 Constraint Lists (one per row)
31
33rdrd Algorithm: RR floorplacement - II Algorithm: RR floorplacement - II
Simulated Annealing: movesSwap two RRsMove one RRsSpan over one more rowUn-Span over one less row
After each move packing is performed(i.e., the floorplacement is compressed)
32
33rdrd Algorithm: RR floorplacement – Example - I Algorithm: RR floorplacement – Example - I
33
33rdrd Algorithm: RR floorplacement – Example - II Algorithm: RR floorplacement – Example - II
34
ImplementationImplementation
Three simulated annealers written in C++ STL
35
Output ExamplesOutput Examples
TFiRR
RR Floorplacement
36
EXPERIMENTAL RESULTSEXPERIMENTAL RESULTS
37
Identification of the number of RRs Identification of the number of RRs
38
Partitioning’s impact on TFiRRPartitioning’s impact on TFiRR
TFiRR on Partitioned
TG
TFiRR on TG
Execution Time 125ms 114ms 4m54s
Width (normalized)
1.00 1.19 1.04
39
Increasing the number of RFUs decreases the possibility to pick up the right one
Partitioning is a precondition of the 3rd algorithm in order to better exploit FPGA’s area (2D Floorplacement)
Tests performed directly floorplacing RFUs Execution time about 100 ms (100K iterations)
Floorplacement – Success Floorplacement – Success RateRate
40
Floorplacement – Aspect RatioFloorplacement – Aspect Ratio
41
Tests performed directly floorplacing RFUs
COMPARISON WITH THE COMPARISON WITH THE STATE OF THE ARTSTATE OF THE ART
42
State of the artState of the art
43
Authors Comm. Infrastructure
ResourceAware
Reconfiguration Aware
Device LimitsAware
Bazargan et al. No No Yes No
Yuh et al. Limited, w/ High Overhead
No Yes Yes
Singhal et al. No No Yes No
Feng et al. No Yes No No
NotesNotes
The comparsion is performed with respect to the description given by Yuh et al in [*]
Yuh’s approach does not support– Multiple Resources– Existence of a Static side
In order perform the comparison– the case study has been chosen in order to avoid
multiple resource limitation– Yuh’s approach has been extended to support a
static side
44
[*] Ping-Hung Yuh, Chia-Lin Yang, Yao-Wen Chang, Hsin-Lung Chen: Temporal Floorplanning Using 3D-subTCG, Design Automation Conference, 2004
The Case StudyThe Case Study
A Reconfigurable Architecture (for Biomedical Purpose) on XC5VLX30T 1. Collecting data from sensor2. Elaborating them3. Sending to a host computer thorough the net
45
The Proposed SolutionThe Proposed Solution
It is a reconfigurable architecture with 2 static photos
46
Area assignments comparisonArea assignments comparison
47
Proposedsolution
Yuh’s solution
Communication performances comparisonCommunication performances comparison
Considering 100 M samples, 32 bit each, at 75 MHz.
The entire data are transferred by– The Proposed Approach in 2.6 seconds– Yuh’s approach in 416.0 seconds
48
Conclusions - IConclusions - I
An algorithm for the identification of area constraint for reconfigurable architectures has been introduced
Novelties: taking into account– Target device heterogeneity– Target device reconfiguration capabilities– Communication issues
49
Conclusions - IIConclusions - II
Results have been published– A. Montone, M.D. Santambrogio, D. Sciuto,
A Design Workflow for the Identification of Area Constraints in Dynamic Reconfigurable Systems, IEEE International Symposium on Electronic Design, Test and Applications (DELTA), 2008
– A. Montone, M.D. Santambrogio, Area Constraint Evaluation for FPGAs, The Syndicated Q1-2008, A technical newsletter for FPGA, ASIC Verification and DSP Designers, Synplicity Incorporation
Under revision:– A Reconfiguration-aware Floorplacer for FPGAs,
IEEE Field Programmable Logic (FPL), 2008
50
Future WorksFuture Works
Take into consideration IOBs and inter modules communications
Partitioning considering clock regions
51
QuestionsQuestions
52