Embedded Systems Group (http://www.cse.iitd.ac.in/esproject)
Department of Computer Science & Engineering
Indian Institute of Technology Delhi
June 11, 2002
Ph.D. Research Plan Presentation
Anup Gangwar
Slide 2, Research Plan Presentation, June 11, 2002, http://www.cse.iitd.ac.in/esproject
Presentation Outline
Introduction and motivation
Specialization opportunities in VLIW processors
Methodology
Validation framework (supporting tools required)
Work plan
Status of work
References
Slide 3
Introduction
Why customize architectures?
  General-purpose computing domain vs. embedded domain
  Customization leads to cheaper design solutions
Architectural choices for exploiting instruction-level parallelism (ILP)
  Superscalar processors
    Try to extract ILP at run time, so the hardware is complex
    Limited clock speeds and high power dissipation
    Not suited to embedded applications
  VLIW processors
    Compiler has a lot of knowledge about the hardware
    Compiler extracts ILP statically, so the hardware is simpler
    Possible to attain higher clock speeds
Slide 4
Introduction - Problems with VLIW Processors
Complex compiler required for extracting ILP
Adequate hardware support needed for compiler controlled execution
Code size expansion due to explicit NOPs if
  the application does not contain enough parallelism
  the compiler is not able to extract parallelism from the application
Need for good instruction encoding and NOP compression schemes
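To make the code-size concern concrete, here is a minimal sketch that counts the explicit NOPs in a small 4-issue schedule. The schedule and operation names are made up for illustration and do not come from the presentation.

```python
def nop_fraction(schedule):
    """Return the fraction of issue slots that hold an explicit NOP."""
    total = sum(len(word) for word in schedule)
    nops = sum(op == "NOP" for word in schedule for op in word)
    return nops / total


# Hypothetical 4-issue schedule: one list per long instruction word (one per cycle).
schedule = [
    ["ADD", "NOP", "FMUL", "NOP"],
    ["LD",  "NOP", "NOP",  "NOP"],
    ["SUB", "ADD", "NOP",  "BR"],
]
print(nop_fraction(schedule))  # 6 of 12 slots are NOPs -> 0.5
```

With an uncompressed encoding, half of the instruction memory in this example would hold NOPs.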
Slide 6
Specialization Opportunities -> FUs
Slide 7
Specialization Opportunities -> FUs (contd...)
Functional Unit Types
  MISO: Multiple Input Single Output
  MIMO: Multiple Input Multiple Output
  MIMO with LD/ST: MIMOs with memory interaction
  Rigid or flexible I/O timeshapes

NAME             Inputs and Sources          Outputs and Dests.          I/O Policy
MISO             Multiple (Regfile)          Single (Regfile)            Flexible or Rigid
MIMO             Multiple (Regfile)          Multiple (Regfile)          Flexible or Rigid
MIMO with LD/ST  Multiple (Regfile or Mem.)  Multiple (Regfile or Mem.)  Flexible or Rigid for Reg. and block LD/ST for mem.
Slide 8
Specialization Opportunities -> Reg. File
Single register file organization doesn’t scale well
  Area grows as N^3
  Delay grows as N^(3/2)
  Power grows as N^3
  where N is the number of functional units connected to the register file
Clustered VLIW architectures are the solution
  Each FU can read from/write to only a subset of registers
  Data copying may increase execution latency
  Powerful application analysis required to overcome the above-mentioned problems
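The growth rates above suggest why clustering pays off. The sketch below applies them to a single register file and to a clustered organization. It rests on assumptions that are not in the slides: constants of proportionality are set to 1, FUs are split evenly across clusters, and the cost of inter-cluster copies and of the interconnect is ignored.

```python
def single_rf_cost(n_fus):
    """Relative area, delay and power of one register file serving n_fus FUs."""
    return {"area": n_fus ** 3, "delay": n_fus ** 1.5, "power": n_fus ** 3}


def clustered_rf_cost(n_fus, n_clusters):
    """Apply the same trends per cluster (assumption: even split of FUs).

    Area and power add up over the clusters; delay is that of one
    smaller register file.
    """
    if n_fus % n_clusters:
        raise ValueError("FUs must divide evenly among clusters")
    per_cluster = single_rf_cost(n_fus // n_clusters)
    return {
        "area": n_clusters * per_cluster["area"],
        "delay": per_cluster["delay"],
        "power": n_clusters * per_cluster["power"],
    }


mono = single_rf_cost(16)
clus = clustered_rf_cost(16, 4)
print(mono["area"] / clus["area"])              # 4096 / 256 -> 16.0
print(round(mono["delay"] / clus["delay"], 6))  # 64 / 8 -> 8.0
```

Under these simplifications, splitting 16 FUs into 4 clusters cuts register file area and power by a factor of 16 and delay by a factor of 8. The price is the data copying noted above, which this model leaves out.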
Slide 9
Specialization Opportunities -> Reg. File (contd...)
A Clustered VLIW Architecture
Slide 10
Specialization Opportunities -> Interconnect
Clustering FUs together requires deciding the interconnection network (ICN)
  between different clusters
  between clusters and memory
Analysis of data access patterns required for evaluating cost-performance tradeoffs
Current ASIP vendors do not offer customizable interconnects
Slide 11
Specialization Opportunities -> Encoding
Instruction encoding/decoding scheme affects
  Code size
  Object code compatibility
  Branch misprediction penalty
  Hardware cost
  Address specification in code size
Each UniOp is equivalent to a RISC/CISC instruction
Figure: a MultiOp made up of four UniOps (UniOp | UniOp | UniOp | UniOp)
Slide 12
Specialization Opportunities -> Encoding (contd...)
Figure: NOPs in a MultiOp
  Slot:  IALU.0  IALU.1  FALU.0  BU.0
  Op:    ADD     NOP     FMUL    NOP
Figure: VLIW Processor Pipeline with Instruction Decompressor
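One simple way to remove such NOPs from stored code is a per-MultiOp presence mask: store one bit per slot plus only the real operations, and let the decompressor re-insert the NOPs. The sketch below illustrates that idea on the MultiOp shown above. It is an assumed scheme for illustration, not necessarily the one used in this work.

```python
SLOTS = ["IALU.0", "IALU.1", "FALU.0", "BU.0"]


def compress(multiop):
    """Return (mask, ops): mask bit i is set when slot i holds a real operation."""
    mask = 0
    ops = []
    for i, op in enumerate(multiop):
        if op != "NOP":
            mask |= 1 << i
            ops.append(op)
    return mask, ops


def decompress(mask, ops, width=len(SLOTS)):
    """Rebuild the full-width MultiOp by re-inserting NOPs in empty slots."""
    remaining = iter(ops)
    return [next(remaining) if mask & (1 << i) else "NOP" for i in range(width)]


mask, ops = compress(["ADD", "NOP", "FMUL", "NOP"])
print(format(mask, "04b"), ops)  # 0101 ['ADD', 'FMUL']
print(decompress(mask, ops))     # ['ADD', 'NOP', 'FMUL', 'NOP']
```

Here `decompress` plays the role of the instruction decompressor stage in the pipeline figure: the functional units still see a full-width MultiOp, while memory holds only the mask and two operations.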
Slide 13
Specialization Opportunities -> Summary
Slide 15
Existing Methodologies -> Simulation Driven
Slide 16
VLIW ASIP Synthesis Methodology
Figure: methodology flow with the following blocks
  Task Set and Constraints
  Architecture Description
  Architecture Design Space Exploration
  Application Parameter Extraction
  Retargetable Compiler
  Instruction Encoding Specialization
  Validation (Simulation with encoded instructions)
  Architecture Description (Output to synthesizer)
Slide 18
Validation Framework -> Trimaran
Figure: Trimaran tool flow with blocks C Program, IMPACT, Bridge Code, ELCOR, ELCOR IR, HMDES Machine Description, Simulator Generator, Generated Simulator (Statistics)
IMPACT
  • ANSI C parsing
  • Code profiling
  • Classical machine-independent optimizations
  • Block formation
ELCOR
  • Machine-dependent code optimizations
  • Code scheduling
  • Register allocation
Simulator Generator
  • ELCOR IR to low-level C files
  • HPL-PD virtual machine
  • Cache simulation
  • Performance statistics
Generated Simulator (Statistics)
  • Compute and stall cycles
  • Cache stats
  • Spill code info
Slide 19
Validation Framework -> Trimaran (contd...)
Figure: simulator generation flow with the following blocks
  REBEL
  HMDES
  Code Processor
  Low-level C files
  C libraries
  Emulation Library
  Native Compiler
  Executable for the host platform
Slide 20
Validation Framework -> Retargetable Assembler
Figure: retargetable assembler flow with the following blocks
  Instruction Encoding Description
  Toolkit Generator
  Generated Assembler
  Assembly Instructions
  Object Code
  To Simulator (for simulation with encoded instructions)
Slide 22
Work Plan -> Interconnect/RF/FU Specialization
Initially model the interconnect problem as an integer linear program (ILP) and later move to other solutions
Code selection problem in compilers is similar to identifying compute-intensive parts for AFUs
The number and type of FUs have not been properly explored
RF clustering problem has not been dealt with elsewhere
Jacome et al.
  Deal with Interconnect/RF/FU specialization simultaneously
  Operation chaining is not considered
Slide 23
Work Plan -> Encoding/Decoding Specialization
Goal is to be able to generate encoding schemes automatically
Work of Shail Aditya et al.
  Basically a parameterized encoding scheme
  Techniques aimed specifically at the HPL-PD architecture
  Does not address dynamic code size minimization
  Encoding template is fixed, so exploration is limited to the template design space
Various encoding templates need to be explored; the template itself may also be derived from the application
Dynamic code size minimization needs to be considered
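The sketch below contrasts the two size metrics, assuming that "dynamic code size" means the bits fetched at run time, i.e. each block's encoded size weighted by its execution count. That reading, the block names, and all numbers are assumptions for illustration.

```python
def static_code_size(blocks):
    """Total encoded size in bits, each block counted once."""
    return sum(b["bits"] for b in blocks)


def dynamic_code_size(blocks):
    """Bits fetched at run time: block size weighted by execution count."""
    return sum(b["bits"] * b["exec_count"] for b in blocks)


# Hypothetical profile: sizes and execution counts are made up.
blocks = [
    {"name": "init", "bits": 512, "exec_count": 1},
    {"name": "loop", "bits": 128, "exec_count": 1000},
]
print(static_code_size(blocks))   # 640
print(dynamic_code_size(blocks))  # 128512
```

In this example the loop is only 20% of the static size but dominates the dynamic size, so an encoding tuned for static size alone could spend its savings in the wrong place.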
Slide 25
Work Status -> Specialized FUs in Trimaran
Modeling MISOs
  Model as external function calls
  Replace the calls in Trimaran bridge code with the AFU op
  Model new AFU in MDES with the required ops
  Introduce the semantics in the simulator op definitions file
Modeling MIMOs
  Model as external function calls returning void
  Replace the calls in Trimaran bridge code with the AFU op
  Explicitly reserve registers in C code for returning values
  Introduce operation semantics in the simulator op definitions file
Slide 26
Work Status -> Specialized FUs in Trimaran (contd...)
Modeling MIMOs with LD/ST
  Model as regular MIMOs
  Memory interaction with block LD/ST at beginning and end of execute cycles
Additionally
  Possible to impose register file constraints
  Various I/O timeshapes, rigid or flexible
  Possible to introduce pipelined functional units
Slide 27
Work Status -> Instruction Enc. in Trimaran
Slide 28
Work Status -> Instruction Enc. in Trimaran (contd...)
New Jersey Machine Code Toolkit (NJMC)
  Deals with bits at symbolic level
  Can be used to write assemblers/disassemblers
  Specification in SLED (Specification Language for Encoding/Decoding)
Model instruction decompressor in HMDES
Instrument ELCOR to generate assembly code
Encoding is done using procedures generated by NJMC
Problems with NJMC
  VLIW instructions need to be broken up into 32-bit tokens
  Encoded instructions must end on an 8-bit boundary
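The two constraints can be pictured with a small sketch that packs the fields of one long instruction into a bit string, pads it to an 8-bit boundary, and cuts it into tokens of at most 32 bits. This is plain Python and does not use the NJMC API. The field layout (5-bit opcode, three 4-bit register fields, four slots) is hypothetical.

```python
def pack_fields(fields):
    """Concatenate (value, width) pairs into a bit string, first field first."""
    bits = ""
    for value, width in fields:
        if not 0 <= value < (1 << width):
            raise ValueError(f"value {value} does not fit in {width} bits")
        bits += format(value, f"0{width}b")
    return bits


def to_tokens(bits, token_bits=32, align_bits=8):
    """Zero-pad to an align_bits boundary, then cut into token_bits pieces."""
    bits += "0" * (-len(bits) % align_bits)
    return [bits[i:i + token_bits] for i in range(0, len(bits), token_bits)]


# Hypothetical 4-slot MultiOp: each UniOp has a 5-bit opcode and three
# 4-bit register fields (17 bits per UniOp, 68 bits in total).
uniop = [(3, 5), (1, 4), (2, 4), (7, 4)]
multiop_bits = pack_fields(uniop * 4)
tokens = to_tokens(multiop_bits)
print(len(multiop_bits), [len(t) for t in tokens])  # 68 [32, 32, 8]
```

The 68-bit instruction gains 4 padding bits to reach 72 and is then emitted as two full tokens and one 8-bit tail, so field boundaries no longer line up with token boundaries.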
Slide 29
Work Status -> Code Gen. for Clustered ASIPs
ELCOR
  Disadvantages
    ELCOR is heavily oriented towards the HPL-PD architecture
    Does not support clustered VLIW architectures
  Advantages
    Strong optimizing compiler
    Rich library to deal with the IR
IMPACT compiler system offers another choice for building a backend
Feasibility study being carried out to fix a particular direction of work
Slide 31
References
Bhuvan Middha, Varun Raj, Anup Gangwar, M. Balakrishnan, Anshul Kumar and Paolo Ienne, “A Trimaran based framework for exploring design space of VLIW ASIPs with coarse grain FUs”, ISSS-2002.
Anup Gangwar, M. Balakrishnan and Anshul Kumar, “A framework for studying the effect of VLIW processor instruction encoding and decoding schemes”, Mini Project, Dept. of CSE.
M. Jacome and G. de Veciana, “Design challenges for new application specific processors”, IEEE Design and Test of Computers-2000.
B. Ramakrishna Rau and Michael S. Schlansker, “Embedded computer architecture and automation”, IEEE Computer-2001.
Michael S. Schlansker and B. Ramakrishna Rau, “EPIC: An architecture for instruction-level parallel processors”, HPCA-2000.
N. G. Busa, A. van der Werf and M. Bekooij, “Scheduling coarse grain operations for VLIW processors”, ASPDAC-1998.
Shail Aditya, Scott A. Mahlke and B. Ramakrishna Rau, “Code size minimization and retargetable assembly for custom EPIC and VLIW processors”, ISSS-1999.