Torino, Italy – June 25, 2013
NASA/ESA Conference on Adaptive Hardware and Systems (AHS-2013)
R. Cattaneo, C. Pilato, M. Mastinu, M.D. Santambrogio
Politecnico di Milano – Dip. di Elettronica, Informazione e Bioingegneria
O. Kadlcek, O. Pell
Maxeler Technologies Ltd., London, UK
Runtime Adaptation on Dataflow HPC Platforms
Christian Pilato – Politecnico di Milano 2
Context Definition
The portion of the application that needs to be accelerated is usually implemented in hardware
Resource limitations can become a bottleneck
In some contexts, the HPC application should be able to adapt to its environment
Partial dynamic reconfiguration is a well-known technique for changing behavior at run time while reusing the same logic across different tasks
Reconfigurable Computing
“Reconfigurable computing is intended to fill the gap between hardware and software, achieving potentially much higher performance than software, while maintaining a higher level of flexibility than hardware”
(K. Compton and S. Hauck, Reconfigurable Computing: A Survey of Systems and Software, 2002)
Reasons Behind
Some applications require performance that cannot be achieved in software
Some applications must be flexible, modifiable, and adaptable; traditional hardware cannot provide this
Reconfigurable computing platforms can be altered after deployment, yielding a high-performance device able to meet resource, adaptability, and reliability constraints
Maxeler Architecture
•Maxeler systems are based on the interaction between a CPU and an FPGA
•Maxeler exploits FPGAs only as devices devoted to hardware acceleration
Why not try to enhance the flexibility and performance of Maxeler platforms by exploiting some intrinsic characteristics of FPGAs?
Objectives
Rationale
Dynamic Partial Reconfiguration can be applied to cope with problems such as a lack of available resources and the need for system adaptability and reliability
Maxeler architectures are very efficient for computation, but they do not support Dynamic Partial Reconfiguration
Goals
Design a new tool flow that supports Dynamic Partial Reconfiguration in Maxeler architectures to offer adaptivity in the HPC domain
Canny edge detector
Reconfiguration in FPGAs
Useful Definitions
Full Bitstream
Reconfigurable partitions
Reconfigurable modules
Partial Bitstream
Configurations
[Figure labels: FPGA, full bitstream]
Maxeler Architecture
Example application
[Figure labels: Manager, SLiC]
MaxCompiler flow
MaxIDE
Java compilation
VHDL
BIT file
Java runtime
Preliminary Considerations
Hierarchical design vs. flat design
NGDBuild, MAP, PAR, and BitGen are run once per configuration
A PXML file is needed to drive the process
Proposed Approach
Focusing on Kernels instead of the Manager
Kernels in the same Reconfigurable Block must have the same characteristics;
In every Configuration, exactly one Kernel must be assigned to each Reconfigurable Block;
The same Kernel cannot be placed in two different Reconfigurable Blocks.
Preserving the structure of the MaxCompiler/Xilinx tool flow as much as possible
Hiding the details from the designer
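The three placement rules above can be checked mechanically before a build starts. The sketch below is a minimal Python model and not part of MaxCompiler: the `Kernel` class, its `interface` field (a stand-in for the "characteristics" the slide leaves unspecified), the block names, and `check_configurations` are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Kernel:
    name: str
    # Hypothetical stand-in for the "characteristics" a block requires,
    # e.g. the kernel's stream interface.
    interface: tuple


def check_configurations(blocks, configurations):
    """Return a list of rule violations (empty if all rules hold).

    blocks: names of the Reconfigurable Blocks.
    configurations: list of dicts mapping block name -> Kernel.
    """
    errors = []
    block_interface = {}  # block -> interface of the first kernel seen there
    kernel_block = {}     # kernel name -> block it was first placed in

    for idx, config in enumerate(configurations):
        # Rule 2: exactly one kernel per block in every configuration.
        if set(config) != set(blocks):
            errors.append(f"configuration {idx}: blocks {sorted(config)} "
                          f"do not match {sorted(blocks)}")
        for block, kernel in config.items():
            # Rule 1: kernels sharing a block share the same characteristics.
            expected = block_interface.setdefault(block, kernel.interface)
            if kernel.interface != expected:
                errors.append(f"{kernel.name}: interface differs from other "
                              f"kernels in {block}")
            # Rule 3: a kernel may not appear in two different blocks.
            placed = kernel_block.setdefault(kernel.name, block)
            if placed != block:
                errors.append(f"{kernel.name}: placed in both {placed} "
                              f"and {block}")
    return errors


io = ("in:uint8", "out:uint8")  # assumed interface description
f1, f2, f3, f4 = (Kernel(f"filter{i}", io) for i in range(1, 5))
config_a = {"RB0": f1, "RB1": f2}
config_b = {"RB0": f3, "RB1": f4}
print(check_configurations(["RB0", "RB1"], [config_a, config_b]))
```

Using a dictionary per configuration makes "at most one kernel per block" hold by construction, so only the missing-block case needs an explicit check.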
Reconfiguration on Kernels
User interface: DFE code
PRManager Main
...
Configuration A = ...
Configuration B = ...
build(A,B)
• Reconfigurable Block = Reconfigurable Partition
• Kernel = Reconfigurable Module
Considerations
User interface: Host code
max_reconfig_partial_bitstream
Case Study: Edge Detection
Canny edge detection is applied to a video
There are two Reconfigurable Blocks and a total of four filters
each filter represents a Reconfigurable Module
Initially, the first two filters are applied
Then, the device is partially reconfigured and the other two filters are applied
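The switch from the first pair of filters to the second can be viewed as a per-block difference between two configurations. The sketch below only computes which blocks would need a partial bitstream; it does not call the Maxeler host API, and the block and filter names (`RB0`, `RB1`, `filter1` to `filter4`) are hypothetical because the slides do not name them.

```python
def blocks_to_reconfigure(current, target):
    """Return {block: kernel} for every block whose kernel changes."""
    return {block: kernel
            for block, kernel in target.items()
            if current.get(block) != kernel}


# Hypothetical names: the slides do not name the blocks or the filters.
first_pair = {"RB0": "filter1", "RB1": "filter2"}
second_pair = {"RB0": "filter3", "RB1": "filter4"}

loaded = dict(first_pair)  # state after the initial full bitstream is loaded
changes = blocks_to_reconfigure(loaded, second_pair)
for block, kernel in sorted(changes.items()):
    # A real host program would load the partial bitstream for this block here.
    print(f"reconfigure {block} -> {kernel}")
    loaded[block] = kernel
```

In this case study both blocks change, so two partial reconfigurations are needed; a block whose kernel is the same in both configurations would be skipped.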
MaxWorkstation
The targeted platform is a MaxWorkstation
It contains an Intel i7 870 quad-core CPU with 16 GB of RAM
The Intel CPU is connected to the DFE via PCI Express
The DFE is a MAX3 board (Xilinx Virtex-6) with 24 GB of RAM
Experimental Results
The methodology is applied to a video taken from “Mission Impossible”
It is combined with a set of compiler extensions for automatic code generation of the kernels
The details are totally hidden from the designer
[VIDEO]
Conclusions and Future Work
The proposed approach integrates Partial Dynamic Reconfiguration into a dataflow architecture
The process is totally transparent to the designer
Future work will focus on the current limitations:
Reconfigurable Area constraints can be specified only as multiples of clock regions
During the partial reconfiguration of some Reconfigurable Blocks, all the Kernels are held in reset
Questions?
Implementation: design flow
The build process is divided into four main stages
First build stage
• When the build process starts, MaxDC, XST and NGCBuild are run for each Reconfigurable Block and for the static part independently;
• The result of this first stage is a large number of netlist files.
Second build stage
• The second stage consists of running NGDBuild, MAP, PAR, pr_verify, and BitGen for each Configuration
• The PXML file is generated automatically
• The static part is implemented only in the first Configuration
• Each reconfigurable module is implemented only the first time it appears in a Configuration
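The reuse rules of this stage (static part implemented once, each reconfigurable module implemented only at its first appearance) can be illustrated with a small planner. This is a sketch under assumed data structures, not the actual tool flow: `plan_implementation` and the block and kernel names are hypothetical.

```python
def plan_implementation(configurations):
    """For each configuration, list what is implemented and what is reused.

    configurations: list of dicts mapping block name -> kernel name.
    """
    seen = set()  # (block, kernel) pairs already implemented
    plan = []
    for idx, config in enumerate(configurations):
        step = {"implement": [], "reuse": []}
        # The static part is implemented only in the first configuration.
        step["implement" if idx == 0 else "reuse"].append("static")
        for block, kernel in sorted(config.items()):
            module = (block, kernel)
            if module in seen:
                step["reuse"].append(module)
            else:
                step["implement"].append(module)
                seen.add(module)
        plan.append(step)
    return plan


# Hypothetical configurations sharing one module in RB1.
configs = [
    {"RB0": "filter1", "RB1": "filter2"},
    {"RB0": "filter3", "RB1": "filter2"},
]
for idx, step in enumerate(plan_implementation(configs)):
    print(idx, step)
```

In the second configuration only the new module in `RB0` is implemented; the static part and the module already built for `RB1` are reused.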
Final stage
• Once the full bitstream and all the partial ones have been generated, they are encapsulated in the .max file
• The first Configuration passed to the build method is chosen as the “default” Configuration
• This means that its full bitstream will be loaded into the CFPGA when the program starts
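As a rough illustration of what the final stage gathers, the sketch below picks the default Configuration and lists the partial bitstreams to bundle. It assumes one partial bitstream per distinct (block, kernel) pair, which the slides do not state explicitly; `bitstreams_to_package` and all names in the example are hypothetical.

```python
def bitstreams_to_package(configurations):
    """Return (default_index, partial_modules) for the packaged design.

    Assumption: one partial bitstream per distinct (block, kernel) pair.
    configurations: list of dicts mapping block name -> kernel name,
    in the order they are passed to the build step.
    """
    if not configurations:
        raise ValueError("at least one configuration is required")
    default_index = 0  # the first configuration passed to build is the default
    partials = sorted({(block, kernel)
                       for config in configurations
                       for block, kernel in config.items()})
    return default_index, partials


# Hypothetical configurations A and B, as in build(A,B).
config_a = {"RB0": "filter1", "RB1": "filter2"}
config_b = {"RB0": "filter3", "RB1": "filter4"}
default_index, partials = bitstreams_to_package([config_a, config_b])
print("default configuration:", default_index)
print("partial bitstreams:", partials)
```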