1 OpenSoC Fabric An open source, parameterized, network generation tool Farzad Fatollahi-Fard, David...
-
Upload
benedict-burns -
Category
Documents
-
view
222 -
download
0
Transcript of 1 OpenSoC Fabric An open source, parameterized, network generation tool Farzad Fatollahi-Fard, David...
1
OpenSoC FabricAn open source, parameterized, network generation toolFarzad Fatollahi-Fard, David Donofrio, George Michelogiannakis, John Shalf
SoC for HPC Programmatic MeetingAugust 25-26, 2014. Denver, CO.
CoDEx
2
State of the Art in SoCWhat technology is being used to build SoCs? What can we leverage?
OpenSoC OverviewAn open source network generator using a new HDL - Chisel
OpenSoC Code Deep Dive and DemoOpenSoC module walk through and network output
MotivationPower, parallelism, and data movement drive the need for SoC
Conclusion and Future WorkNew features for OpenSoC
1
2
3
4
5
3
Power: The New Design Constraint
‣ Power densities have ceased to increase
‣ No power efficiency increase with smaller transistors
Trends beginning in 2004 are continuing…
4
‣ We have come to the end of clock frequency scaling
‣ Moore’s Law is alive and well
• Now seeing core count increasing
Power: The New Design ConstraintOn-chip parallelism increasing to maintain performance increases…
Parallelism increasingNERSC Trends
Franklin Hopper EdisonCori
(NERSC 8)
Core Count
4 24 48 (logical) >60
Clock Rate
2.3GHz 2.1GHz 2.4 GHz ~1.5GHz
Memory 8GB 32GB 64GB64-128GB
+On package
Peak Perf .352 PF 1.288 PF 2.57 PF > 3 TF
6
Hierarchical Power CostsData movement is the dominant power cost
120 pJ
2000 pJ
250 pJ
~2500 pJ
100 pJ
6 pJ
Cost to move data off chip to a neighboring node
Cost to move data off chip into DRAM
Cost to move off-chip, but stay within the package (SMP)
Cost to move data 20 mm on chip
Typical cost of a single floating point operation
Cost to move data 1 mm on-chip
7
State of the Art in SoCWhat technology is being used to build SoCs? What can we leverage?
OpenSoC OverviewAn open source network generator using a new HDL - Chisel
OpenSoC Code Deep Dive and DemoOpenSoC module walk through and network output
MotivationPower, parallelism, and data movement drive the need for SoC
Conclusion and Future WorkNew features for OpenSoC
1
2
3
4
5
Current Hardware Challenges
‣ Power is now limiting factor for leading-edge chips
• Moore’s Law continues… but now reliant to more exotic technologies such as 3D transistors, FinFET etc.
• Design Validation / Verification dominating development costs
‣ Solution: Smaller is Better
• Simpler, 5 to 9 stage pipeline cores
• Parallel is key to energy efficiency: CV2F
• Large arrays of small simple cores are easier to verify, more resilient
‣ Vibrant commodity market in IP components
Embracing embedded designs
MemoryDRAM
DRAM
Building an SoC from IP Logic BlocksIt’s Legos with a some extra integration and verification cost
Processor Core (ARM, Tensilica, MIPS deriv)With extra “options” like DP FPU, ECC
OpenSoC Fabric (on-chip network)(currently proprietary ARM or Arteris)
DDR memory controller(Denali/Cadence, SiCreations)+ Phy & Programmable PLL
PCIe Gen3 Root complex
Integrated FLASH Controller
10GigE or IB DDR 4x Channel
MemControl
MemControl
PC
Ie
FLAS
H
Contro
l
IB o
r G
igE
IB o
r G
igE
IO
SoC – What’s out there?
‣ Cost driven
• CPUs integrated with IO and Gfx to reduce BOM cost, decrease total PCB area
• Die power density / pin constraint driven
‣ Homogeneous cores
‣ Simple Networks
• Most SoCs rely on ring or cross-bar interconnect
• Increasing core count will drive need for more complex topologies
• KNL present in 2016 NERSC machine will have mesh
Some common features
12
What Interconnect Provides the Best Power / Performance Ratio?What tools exist to answer this question?
SoC - Interconnect ExamplesSome common topologies
The Importance of Network TopologyNetwork topology can greatly influence application performance
An analysis of on-chip interconnection networks for large-scale chip multiprocessorsACM Transactions on computer architecture and code optimization (TACO), April 2010
The Importance of NetworksNetworks consume a large fraction of total chip power...
Clock distribution10%
Routers and links26%
10-port RF9%
IMEM and DMEM20%
Dual FPMACs
34%
A 5-GHz Mesh Interconnect for a Teraflops Processor.IEEE Micro. 2007
What tools exist for SoC research
‣ Software models
• Fast to create, but plagued by long runtimes as system size increases
‣ Hardware emulation
• Fast, accurate evaluate that scales with system size but suffers from long development time
What tools do we have to evaluate large, complex networks of cores?
A complexity-effective architecture for accelerating full-system multiprocessor simulations using FPGAs. FPGA 2008
17
Software Models
‣ Booksim
• Cycle-accurate
• Verified against RTL
• Few thousand cycles per second
‣ Garnet
• Event driven
• Simulation speed limits designs to 100’s of cores
C++ based on-chip network simulators
Booksim ISPASS 2013
GARNET ISPASS 2009
18
Hardware Models
‣ Stanford open-source NoC router
• Verilog
• Precise but long simulation times
‣ Connect network generation
• Bluespec
• FPGA Optimized
HDL network generators and implementations
CONNECT: fast flexible FPGA-tuned networks-on-chip. CARL 2012
19
State of the Art in SoCWhat technology is being used to build SoCs? What can we leverage?
OpenSoC OverviewAn open source network generator using a new HDL - Chisel
OpenSoC Code Deep Dive and DemoOpenSoC module walk through and network output
MotivationPower, parallelism, and data movement drive the need for SoC
Conclusion and Future WorkNew features for OpenSoC
1
2
3
4
5
Chisel: A New Hardware DSL
‣ Chisel provides both software and hardware models from the same codebase
‣ Object-oriented hardware development
• Allows definition of structs and other high-level constructs
‣ Powerful libraries and components ready to use
‣ Working processors fabricated using chisel
Using Scala to construct Verilog and C++ descriptions
21
Recent Chisel DesignsChisel created cores successfully boot Linux
Clock test site
SRAM test site
DCDC test site
Processor Site
Raven core – 28nm
22
Chisel Overview
‣ Not “Scala to Gates”
‣ Describe hardware functionality
‣ Chisel creates graph representation
• Flattened
‣ Each node translated to Verilog or C++
How does Chisel work?
24
OpenSoC Fabric‣ Part of the CoDEx tool suite, written in
Chisel
‣ Dimensions, topology, VCs all configurable
‣ Fast functional C++ model for functional validation
• SystemC ready
‣ Verilog based description for FPGA or ASIC
• Synthesis path enables accurate power / energy modeling
‣ AXI Based endpoints
• Ready for ARM integration
An open source, flexible, parameterized, NoC generator
OpenSoC FabricAn open source, flexible, parameterized, NoC generator
26
OpenSoC: Current Status
‣ Available now:
• 2-D mesh or Flattened Butterfly network of arbitrary size
• Wormhole routing
‣ In Development
• Virtual Channels
• AXI Interface
Projected v1.0 release date of October 1st
27
State of the Art in SoCWhat technology is being used to build SoCs? What can we leverage?
OpenSoC OverviewAn open source network generator using a new HDL - Chisel
OpenSoC Code Deep Dive and DemoOpenSoC module walk through and network output
MotivationPower, parallelism, and data movement drive the need for SoC
Conclusion and Future WorkNew features for OpenSoC
1
2
3
4
5
28
OpenSoC Code ExamplesWalkthrough of Switch and Full Network Tester
29
State of the Art in SoCWhat technology is being used to build SoCs? What can we leverage?
OpenSoC OverviewAn open source network generator using a new HDL - Chisel
OpenSoC Code Deep Dive and DemoOpenSoC module walk through and network output
MotivationPower, parallelism, and data movement drive the need for SoC
Conclusion and Future WorkNew features for OpenSoC
1
2
3
4
5
30
Future additions‣ Photonics and circuit switched networks
‣ Integrated NIC model
‣ More diverse topologies and routing functions
‣ Validation against RTL and other simulators
‣ Standardized (AXI) interfaces at the endpoints
‣ More powerful synthetic traffic and trace replay support
‣ Power modeling in the C++ model
Towards a full set of features
31
Acknowledgements
‣ UCB Chisel
‣ US Dept of Energy
‣ Ke Wen
‣ Columbia LRL
‣ John Bachan
‣ Dan Burke
‣ BWRC
32
More Informationhttp://opensocfabric.org