Exploring Alternative 3D FPGA Architectures



    EXPLORING ALTERNATIVE 3D FPGA ARCHITECTURES: DESIGN METHODOLOGY AND CAD TOOL SUPPORT

    K. Siozios, K. Sotiriadis, V. F. Pavlidis†, and D. Soudris

    Dept. of Electrical and Computer Engineering, Democritus University of Thrace, Greece

    email: {ksiop, kostsot, dsoudris}@ee.duth.gr

    ABSTRACT

    This paper introduces a software-supported methodology for exploring and evaluating 3D FPGA architectures. Two new CAD tools are developed: (i) 3DPRO, for placement and routing on 3D FPGAs, and (ii) 3DPower, for power/energy estimation on such architectures. We focus our exploration mainly on the total number of layers and the number of vertical interconnects (vias). The efficiency of the proposed architecture is evaluated through an exhaustive exploration of via connections under the Energy×Delay Product criterion. Experimental results on the 20 largest MCNC benchmarks demonstrate the effectiveness of our solution. For 3D architectures with 4 layers and two scenarios of fabricated via density (30% and 70%), we achieve average decreases in delay, wire length, and energy consumption of up to 18%, 17%, and 31%, respectively, compared to 2D FPGAs. We also achieve high utilization of the via links.

    1. INTRODUCTION

    In the real estate market, an often-stated truism is that as land becomes more expensive, there is a tendency to build upward rather than outward. This idea has some resonance in the domain of ICs, where die sizes are limited by yield and performance constraints. 3D integration can mitigate many of these limitations. For example, a considerable reduction in the number and length of the global wires can be achieved [2]. This decrease results, in turn, in performance enhancements and decreased power consumption for 3D ICs as compared to 2D circuits.

    Recently, many research groups from academia [4, 5, 6, 7, 8], industry [9], and research institutes [1] have invested significant effort in designing and manufacturing applications in 3D technologies. Several companies [9] develop 3D ICs for commercial purposes by wafer stacking, where the distance between the layers is mainly determined by the wafer thickness. Note that the existing industrial research primarily concerns the manufacturing and fabrication processes rather than the development of tools to support the design of emerging 3D technologies.

    This paper is part of the 03ED593 research project, implemented within the framework of the "Reinforcement Program of Human Research Manpower" (PENED) and co-financed by National and Community Funds (75% from E.U.-European Social Fund and 25% from the Greek Ministry of Development-General Secretariat of Research and Technology).

    † Dept. of Electrical and Computer Engineering, University of Rochester, USA

    email: [email protected]

    Although 3D integration promises considerable benefits, several challenges need to be addressed. Among others, design space exploration is essential to build high-performance and low-energy architectures that exploit all of the advantages offered by 3D integration. In addition, CAD tools that facilitate the design of 3D circuits are required. To date there are only a few academic CAD tools [4, 6] for mapping applications onto 3D FPGA technologies, and there is no complete CAD flow to promote the commercialization of this potent design paradigm. Furthermore, there is no commercial CAD tool for realizing applications on 3D FPGAs comparable to the standalone tools and/or design flows available for 2D technologies (e.g., those provided by Cadence, Mentor Graphics, and Xilinx). Consequently, there is a significant need to develop algorithms and software tools that exploit the advantages of the third dimension and solve time-consuming and complex tasks, such as floorplanning, placement, and routing (P&R), for 3D FPGAs.

    In [6], a P&R approach for island-style 3D FPGA architectures is described. Partitioning-based placement and simulated-annealing-based refinement tools are used, which target the reduction of the interconnection length. The authors report gains in wire length compared to 2D architectures without, however, considering the wire power consumption and delay. Hence, these tools (PR3D) cannot be used for exploring alternative 3D architectures.

    In [4], a similar P&R approach for 3D FPGAs is described. The reconfigurable architecture consists of multiple stacked functional layers, while the communication among layers is realized by using 3D Switch Boxes (SBs). A tool named TPR was developed for P&R in such devices. Although TPR is one of the first attempts in academia to develop tools for 3D FPGAs, it suffers from many limitations. The target architecture used by this tool initially assumes an unlimited number of vias, while TPR aims at minimizing this number. However, such a scenario is not realistic, since the total number and the spatial distribution of vias are important problems that need to be addressed. In addition, this tool cannot estimate other important design parameters, such as the power/energy consumption.

    In this paper, a software-supported design methodology for exploring several parameters of 3D FPGAs is introduced. We evaluate a number of cost factors, such as delay, energy consumption, and total wire length, over a plethora of 3D architectures. Then, we perform exploration for different numbers and locations of the vias that connect circuits within the 3D FPGA. To the best of our knowledge, this is the first time that a software-supported approach for exploring and evaluating 3D FPGAs with different numbers of vias is presented. Using the 20 largest MCNC benchmarks [11], we demonstrate the effectiveness of our methodology.

    1-4244-1060-6/07/$25.00 ©2007 IEEE.

    Authorized licensed use limited to: EPFL LAUSANNE. Downloaded on January 25, 2010 at 09:51 from IEEE Xplore. Restrictions apply.

    2. THE PROPOSED 3D FPGA ARCHITECTURE AND TOOL FLOW

    To realize the interlayer vias, some conventional 2D SBs must be extended with connections to the other layers of the 3D FPGA. Although the SBs used here are based on the pattern found in the Xilinx XC-4000 FPGA architecture, the results are applicable to any other SB pattern found in the literature. Different SB topologies use different numbers of pass transistors, leading to different interconnection delay and power consumption values. For example, in a 2D SB an incoming routing track can be connected to three other wires (Fs = 3). Similarly, in a 3D SB an incoming routing track can be connected to five other tracks (Fs = 5). In the first case the SB is formed by 6 transistors, while the 3D approach requires 10 transistors. As we target FPGAs, power consumption is one of the most important parameters to reduce and, therefore, selecting the appropriate connectivity across the layers of the 3D device is essential for efficient designs. Also, a large number of vias occupies a large portion of the silicon area, from which active circuits and interconnects must be excluded. Furthermore, the effect of the distribution and length of these vias on the performance and power consumption of 3D FPGAs needs to be addressed.

    The proposed 3D architecture can be constructed by stacking a number of identical individual 2D layers, with communication provided by interlayer vias among vertically adjacent SBs. Hence, the SBs are extended to the third dimension, while the structure of the individual logic blocks remains unchanged.

    Based on the number of interconnections required for the successful implementation of an application onto FPGAs, the nets can be routed using various channel segments to enhance both the delay/power efficiency and the resource utilization. For all of the simulation/evaluation experiments presented in this work, we use a multi-segment routing architecture for the horizontal tracks similar to the one that appears in the Xilinx Virtex devices (composed of routing segments of lengths L1, L2, and L6 and long lines, distributed in each channel as 8%, 20%, 60%, and 12%, respectively). For the vias we use segment tracks of length L1.
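As a rough illustration of the segment mix described above, the following sketch splits a channel of a given width among L1, L2, L6, and long-line tracks in the stated 8%/20%/60%/12% proportions. The function name `split_channel`, the largest-remainder rounding rule, and the example channel width are assumptions made here for illustration; the paper does not state how fractional track counts are handled.

```python
# Segment mix quoted in the text, as integer percentages per channel.
SEGMENT_SHARE_PCT = {"L1": 8, "L2": 20, "L6": 60, "long": 12}


def split_channel(width, share_pct=SEGMENT_SHARE_PCT):
    """Split `width` tracks among segment types.

    Rounding uses the largest-remainder rule, which is an assumption of
    this sketch and not something the paper specifies.
    """
    # Work in hundredths of a track so that all arithmetic stays integer.
    scaled = {name: width * pct for name, pct in share_pct.items()}
    tracks = {name: value // 100 for name, value in scaled.items()}
    leftover = width - sum(tracks.values())
    # Hand the remaining tracks to the types with the largest remainders.
    by_remainder = sorted(scaled, key=lambda n: scaled[n] % 100, reverse=True)
    for name in by_remainder[:leftover]:
        tracks[name] += 1
    return tracks


if __name__ == "__main__":
    # Hypothetical channel width, chosen only as an example.
    print(split_channel(50))
```

With a width that the percentages divide evenly, no rounding is needed; for other widths the rounding rule decides which segment types receive the extra tracks.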

    To model the vertical wires, we assume that each via is electrically equivalent to a horizontal routing track of the same length. This means that the vertical tracks of our 3D FPGA have the same delay and power values as the horizontal segments of length L1. This assumption is based on the fabrication process [5], where interlayer vias with a length of 5 µm to 10 µm are feasible. For such technologies, the delay of the wires dominates the delay of the transistors (similarly to 2D architectures).

    Two new software tools are developed to support the proposed exploration/evaluation procedure for 3D architectures. These tools are integrated into the existing MEANDER design flow [12] (Figure 1).

    Figure 1: The MEANDER framework for 2D/3D FPGAs. (Flow diagram with an existing and a proposed design-flow branch; recoverable stages: application description in HDL, synthesis, technology mapping, bitstream generation.)

    The 3D branch adopts some existing CAD tools from the 2D toolset [12, 13], which do not need to be adapted for the three-dimensional topology. Only the tools related to the P&R and power estimation tasks must be replaced by new tools, because these tasks depend on the particular traits of 3D FPGAs. More specifically, we develop a 3D Placement and Routing Optimizer (3DPRO). We also develop 3DPower, a novel tool to model and estimate the power/energy consumption in 3D architectures. To the best of our knowledge, this toolset is the first complete framework in academia for mapping applications onto 3D FPGAs, starting from a high-level (HDL) description of the application and ending with the generation of the configuration file. More details about the 3D framework can be found in [10].

    3. EXPLORATION AND COMPARISON RESULTS

    We performed a qualitative comparison between 3DPRO and TPR, the only publicly available tool for P&R on 3D FPGAs (Table 1). The comparison shows that 3DPRO performs architecture exploration over a significantly larger number of parameters than TPR.

    The effectiveness of the proposed methodology is exhibited by exploring several 3D architectures for various parameters. We performed the exploration under the following assumptions: (i) the total number of layers is equal to four, (ii) the percentage of vertical interconnects per layer ranges from 0% (i.e., a conventional 2D FPGA) to 100% (the TPR solution), (iii) the spatial location (x, y, z) of each via per layer remains invariant, (iv) a via connection between adjacent layers (with length L1) is electrically equivalent to an L1 wire formed on the 2D FPGA plane, (v) the via width is W = 4 in any layer, (vi) the hardware resources of each layer are identical (i.e., an identical number of Basic Logic Elements (BLEs) in the different layers), and (vii) the applications are implemented onto the smallest number of BLEs per FPGA layer onto which they can be mapped.

    Table 1: Qualitative comparison between TPR and 3DPRO

    Feature                                                 TPR                          3DPRO
    Architecture exploration                                Yes                          Yes
    Measure delay                                           Yes                          Yes
    Measure wire length                                     Yes                          Yes
    Measure power                                           No                           Yes
    Supported switch boxes                                  Subset, Wilton, Universal    Designer specified
    Heterogeneous interconnect (simultaneously 2D/3D SBs)   No                           Yes
    Vias exploration                                        No                           Yes
    Belongs to complete framework                           No                           Yes
    Full custom 3D interconnections                         No                           Yes

    Given the size of a layer and the number of 3D SBs available per layer, the placement pattern of the 3D SBs is derived as follows: after a 3D SB is first assigned to a location (x, y, z) of a certain layer, the neighboring 3D SBs are uniformly assigned to the locations (x+r+1, y, z), where r is derived from the layer size and the number of available 3D SBs. Here, r indicates the number of 2D SBs between two neighboring 3D SBs.
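The uniform spacing rule above can be sketched for a single row of SBs as follows. This is only a one-dimensional illustration along x for a fixed (y, z); the function names, the starting offset `x0`, and the example row length are hypothetical, and the value of r is taken as an input rather than derived, since its formula is not reproduced here.

```python
def place_3d_sbs(row_length, r, x0=0):
    """Return the x-coordinates of the 3D SBs along one row.

    Consecutive 3D SBs sit at x, x + r + 1, x + 2(r + 1), ..., so that
    exactly r 2D SBs lie between two neighbouring 3D SBs.
    """
    if r < 0:
        raise ValueError("r must be non-negative")
    return list(range(x0, row_length, r + 1))


def row_pattern(row_length, r, x0=0):
    """Render one row as text: '3' marks a 3D SB and '2' marks a 2D SB."""
    chosen = set(place_3d_sbs(row_length, r, x0))
    return "".join("3" if x in chosen else "2" for x in range(row_length))


if __name__ == "__main__":
    # Hypothetical row of 10 SBs with r = 2.
    print(place_3d_sbs(10, 2))
    print(row_pattern(10, 2))
```

Setting r = 0 makes every SB in the row a 3D SB, which corresponds to the 100% case in the exploration.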

    To evaluate our methodology, we performed an exhaustive exploration with the 20 largest MCNC benchmarks. The results are summarized in Figure 4. The horizontal axis corresponds to the percentage of via connections in each layer of the 3D FPGA (which is identical to the percentage of 3D SBs in an FPGA layer), while the vertical axis shows the normalized value of the Energy×Delay Product (EDP). These points correspond to Pareto points showing all of the possible solutions. We normalize the results to the EDP value of a conventional (i.e., 2D) FPGA. Depending on the designer's requirements, curves similar to those in Figure 4 can be derived with, for instance, the energy consumption or the performance as the optimization parameter of the system.
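The normalization described above can be written down as a minimal sketch. The function name and the numeric inputs are hypothetical and in arbitrary units; only the definition (the EDP of a 3D configuration divided by the EDP of the 2D baseline) follows the text.

```python
def normalized_edp(energy_3d, delay_3d, energy_2d, delay_2d):
    """Energy x Delay Product of a 3D configuration relative to the 2D baseline.

    A result below 1.0 means the 3D configuration has a lower EDP than
    the conventional 2D FPGA.
    """
    if energy_2d <= 0 or delay_2d <= 0:
        raise ValueError("2D baseline energy and delay must be positive")
    return (energy_3d * delay_3d) / (energy_2d * delay_2d)


if __name__ == "__main__":
    # Hypothetical numbers, chosen only to exercise the function.
    print(normalized_edp(energy_3d=7.0, delay_3d=8.0,
                         energy_2d=10.0, delay_2d=10.0))
```

Replacing the product by the energy or the delay alone would give the alternative single-metric curves mentioned in the text.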


    Several conclusions can be drawn from the diagram of Figure 4. First, as we increase the number of layers, the applications are realized with a smaller delay for critical nets and a lower energy consumption in 3D FPGAs. Second, the developed P&R tools provide promising results for architectures in which only a percentage of the SBs forms 3D via connections. More specifically, for the three-layer solution, the EDP value increases as we increase the percentage of 3D SBs per layer. Similarly, the EDP curve for four-layer devices has two local minima, at 30% and 70% of 3D SBs.
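Locating local minima on such a curve is straightforward once the sampled points are available. The sketch below shows one way to do it; the curve values are invented for illustration and merely shaped to have minima at 30% and 70%, so they are not the measured data behind Figure 4.

```python
def local_minima(xs, ys):
    """Return (x, y) pairs where y is strictly lower than both neighbours."""
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    return [
        (xs[i], ys[i])
        for i in range(1, len(ys) - 1)
        if ys[i] < ys[i - 1] and ys[i] < ys[i + 1]
    ]


if __name__ == "__main__":
    via_pct = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]
    # Hypothetical normalized EDP samples, NOT values read from the paper.
    edp = [0.95, 0.90, 0.85, 0.90, 0.92, 0.88, 0.84, 0.89, 0.93, 0.96]
    print(local_minima(via_pct, edp))
```

End points are never reported as minima here, which is a design choice of this sketch; a designer interested in the 10% or 100% configurations would compare those separately.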

    Choosing the 3D architectures at the two local EDP minima of Figure 4, we performed a detailed exploration in terms of the delay, the wire length, and the energy requirement for the benchmarks shown in Table 2. We compare the conventional 2D architecture with 3D FPGA architectures consisting of 4 layers in which 30% and 70% of the SBs of each layer form 3D connections. The corresponding reductions in delay, wire length, and energy consumption are 16%, 17%, and 30% for the 30% case and 18%, 15%, and 31% for the 70% case. Indeed, the wire length reduction due to 3D integration results in remarkable improvements in delay and energy consumption.
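A percentage reduction of this kind can be computed per benchmark and then averaged, as in the sketch below. It uses the first two numeric values of the bigkey, clma, and des rows of Table 2, under the assumption that these are the 2D delay and the delay with 30% vias; that column reading, the function names, and the choice of averaging per-benchmark percentages are assumptions of this example, and three rows are not expected to reproduce the reported 20-benchmark averages.

```python
def percent_reduction(baseline, value):
    """Reduction of `value` relative to `baseline`, in percent."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return 100.0 * (baseline - value) / baseline


def mean_reduction(pairs):
    """Average the per-benchmark reductions over (baseline, value) pairs."""
    reductions = [percent_reduction(b, v) for b, v in pairs]
    return sum(reductions) / len(reductions)


# (2D delay, delay with 30% vias) for bigkey, clma and des, read from
# Table 2 under the column assumption stated in the lead-in.
DELAY_2D_VS_30 = [(10.8, 6.14), (63.2, 31.3), (14.7, 8.74)]

if __name__ == "__main__":
    for base, val in DELAY_2D_VS_30:
        print(round(percent_reduction(base, val), 1))
    print(round(mean_reduction(DELAY_2D_VS_30), 1))
```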

    Furthermore, the columns of Table 2 with 100% vias give the calculated values of delay, wire length, and energy consumption that correspond to the 3D architectures of [4]. It can be seen that these average values are similar to those of the explored 3D FPGA architectures (i.e., 30% and 70% vias). Specifically, a decrease of up to 70% in the number of vertical interconnects is observed. This last point is very important, because we achieve the same improvements while employing fewer vias.

    For a 3D system, a smaller number of vias means (i) lower fabrication costs and (ii) a larger useful silicon area in each layer (a via contact occupies much more silicon area than a simple metal contact).

    Figure 4: Average EDP over the 20 largest MCNC benchmarks for different numbers of layers and vias. (Horizontal axis: percentage of fabricated vias, 10% to 100%; vertical axis: normalized EDP.)

    Figure 5: Vertical interconnect utilization. (Horizontal axis: percentage of fabricated vias, 10% to 100%; curves for 2, 3, and 4 layers.)

    The utilization degree of the fabricated vias is shown in Figure 5. We can infer that, for a given number of layers, the number of actually utilized vertical interconnects deviates from the average utilization degree by only a small fraction. For 2, 3, and 4 layers, the corresponding average values are 2.31%, 3.58%, and 4.98%, while the largest deviations from these averages are 0.44%, 0.45%, and 2.41%, respectively. More specifically, given a certain number of layers, the via utilization degree remains almost invariant, i.e., it is relatively independent of the percentage of vias per layer. Figure 5 shows that the utilized via links of the 4-layer architectures with 30% and 70% fabricated vias are 4.63% and 4%, respectively, which means that both architectures utilize almost the same number of vias.
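The two statistics quoted above (the average utilization and the largest deviation from it) can be obtained from a set of per-configuration utilization samples as sketched below. The function name and the sample values are hypothetical; the paper's per-configuration data are only available as a plot.

```python
def utilization_stats(samples):
    """Return (mean, largest absolute deviation from the mean)."""
    if not samples:
        raise ValueError("at least one sample is required")
    mean = sum(samples) / len(samples)
    max_dev = max(abs(s - mean) for s in samples)
    return mean, max_dev


if __name__ == "__main__":
    # Hypothetical utilization percentages for one layer count,
    # one value per fabricated-via scenario.
    mean, max_dev = utilization_stats([4.0, 5.0, 6.0])
    print(mean, max_dev)
```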

    4. CONCLUSIONS

    A systematic methodology for exploring alternative 3D FPGA architectures is presented. This methodology is software-supported by two new tools, namely 3DPRO and 3DPower, which belong to the first complete 3D FPGA design framework in academia. Comparison results indicate improvements of up to 18% in delay, 17% in wire length, and 31% in energy consumption for the proposed 3D FPGAs as compared to existing 2D FPGAs.

    5. ACKNOWLEDGMENTS

    The authors acknowledge the support of Prof. K. Bazargan and H. Mogal (Univ. of Minnesota) regarding specific parts of the TPR tool.

    6. REFERENCES

    [1] Eric Beyne, "The Rise of the 3rd Dimension for System Integration", in Proc. of 8th EPTC, 2006.

    [2] J. W. Joyner et al., "Impact of Three-Dimensional Architectures on Interconnects in Gigascale Integration", IEEE Trans. on VLSI, Vol. 9, No. 6, pp. 922-927, Dec. 2001.

    [3] Kara Poon et al., "A Flexible Power Model for FPGA's", in 12th Int. Conf. FPL, 2002.

    [4] Cristinel Ababei et al., "Placement and Routing in 3D Integrated Circuits", IEEE Design and Test, Vol. 22, No. 6, pp. 520-531, Nov-Dec 2005.

    [5] R. Reif et al., "Fabrication Technologies for Three-Dimensional Integrated Circuits", in ISQED, pp. 33-37, 2002.

    [6] Shamik Das et al., "Technology, Performance, and Computer Aided Design of Three Dimensional Integrated Circuits", Int. Symp. Physical Design, pp. 108-115, 2004.

    [7] Arifur Rahman et al., "Wiring Requirement and Three-Dimensional Integration Technology for Field Programmable Gate Arrays", IEEE Trans. on VLSI, Vol. 11, No. 1, pp. 44-54, Feb. 2003.

    [8] V. F. Pavlidis and E. G. Friedman, "Interconnect Delay Minimization through Interlayer Via Placement in 3-D ICs", in Proc. of Great Lakes Symp. on VLSI, pp. 20-25, 2005.

    [9] 3D IC Industry Summary, available at "http://www.tezzaron.com/technology/3D%201C%20Summary.htm".

    [10] K. Siozios et al., "A Software-Supported Methodology for Designing High-Performance 3D FPGAs", in Proc. of 15th IFIP VLSI-SoC, 2007.

    [11] S. Yang, "Logic Synthesis and Optimization Benchmarks, Version 3.0", Technical Report, 1991.

    [12] http://vlsi.ee.duth.gr/amdrel

    [13] K. Siozios et al., "An Integrated Framework for Architecture Level Exploration of Reconfigurable Platform", in 15th FPL, pp. 658-661, 2005.

    Table 2: Comparison results for the MCNC benchmarks: implementation in 2D and 3D FPGA architectures (with 30%, 70%, and 100% via links, 4 layers, and minimal Energy×Delay Product).

                Delay                        Wire length                        Energy
    Benchmark   2D     30%    70%    100%    2D      30%     70%     100%       2D     30%    70%    100%
    bigkey      10.8   6.14   6.50   9.41    59.03   50.57   51.56   49.65      13.6   10.2   10.3   10.0
    clma        63.2   31.3   29.1   31.1    379.42  287.23  283.19  283.44     72.6   45.0   45.0   44.5
    des         14.7   8.74   9.87   8.67    94.07   54.90   53.94   55.03      22.6   13.0   12.9   13.0
    diffeq      15.3   16.7   11.3   18.0    43.48   36.97   45.65   36.04      24.3   15.1   12.4   11.9
    dsip        8.19   5.25   5.80   6.38    53.70   39.87   39.53   38.90      13.3   7.28   7.27   7.15
    elliptic    26.2   24.3   22.8   25.6    116.14  93.54   111.04  96.04      20.1   12.9   13.3   13.3
    ex1010      25.3   18.8   20.3   20.6    181.30  167.22  164.05  162.82     18.5   13.1   13.0   12.7
    ex5p        10.5   10.7   10.4   10.6    42.53   37.17   38.19   36.95      5.45   4.29   4.77   4.14
    frisc       31.6   30.7   32.6   32.3    122.70  110.30  109.19  108.89     35.6   25.1   25.9   26.4
    misex3      10.9   12.0   11.1   10.1    48.83   39.08   39.25   37.94      8.37   5.72   5.98   5.64
    pdc         27.4   27.4   24.6   26.8    257.77  226.94  222.49  228.05     25.7   23.5   19.9   19.2
    s298        26.0   27.3   21.1   21.9    62.12   57.94   60.14   57.82      14.8   10.4   10.0   10.2
    s38417      31.6   34.7   29.3   31.8    376.48  230.61  259.94  239.13     53.2   42.1   43.2   43.1
    s38584      25.7   18.6   19.4   19.8    225.13  198.58  192.20  191.35     43.4   30.0   30.4   31.1
    seq         15.6   12.7   14.8   11.6    64.36   52.12   56.85   50.69      9.84   7.29   8.92   7.15
    spla        21.4   18.3   20.8   18.0    169.22  127.94  127.92  127.96     15.9   12.1   13.2   12.6

    tcRf nQor I PI 1 1, 7I l471 7 70 7A44 7TX4 7A64 T77 0 w5 l1 OI I
