Compilers for embedded systems: Why are compilers an issue?

Click here to load reader

download Compilers for embedded systems: Why are compilers an issue?

of 20

  • date post

    30-Dec-2015
  • Category

    Documents

  • view

    36
  • download

    2

Embed Size (px)

description

Compilers for embedded systems: Why are compilers an issue?. Many reports about low efficiency of standard compilers Special features of embedded processors have to be exploited. High levels of optimization more important than compilation speed. - PowerPoint PPT Presentation

Transcript of Compilers for embedded systems: Why are compilers an issue?

Course Embedded Systems 2003Many reports about low efficiency of standard compilers
Special features of embedded processors have to be exploited.
High levels of optimization more important than compilation speed.
Compilers can help to reduce the energy consumption (energy optimization).
Compilers could help to meet real-time constraints.
Less legacy problems than for PCs.
There is a large variety of instruction sets.
Design space exploration for optimized processors makes sense
- * -
Assembler
(DSP)
Assembler
(µController)
C-Code
(µController)
C-Code
(DSP)
Compilation for multimedia processors
Compilation for network processors
- * -
Reducing Power Supply Voltage
optimization for high performance?
b += *c;
low-energy consumption if memories are at stand-by mode
Reduced energy if more values are kept in registers
- * -
Model of Tiwari (Dissertation, Princeton 1996):
Cost of instructions and of transitions between instructions;
Does not separate out the cost of memory access
Model of Simunic, de Micheli (DAC 99):
Model based on data sheets; does not require measurements.
Does not take transitions into account.
Russell, Jacome (ICCD, 1998): based on precise measurement for two fixed configurations;
cannot predict effect of changes to memory architecture.
Lee (LCTES 2001): detailed analysis of the effect pipeline stages; does not include multi-cycle operations and stalls
Dedicated energy models.
Data
Memory
Instruction
Memory
Instr
IAddr
Data
VDD
ALU
Multi-
plier
Barrel
Shifter
- * -
Ecpu_instr =
MinCostCPU(Opcodei) +
FUCost(Instri-1,Instri)
w: number of ones;
, ß: determined through experiments
Emem_instr =
MinCostMem(InstrMem,Word_widthi) +
Emem_data =
- * -
Results
It is not important, which address bit is set to ‘1’
The number of ‚1‘s in the address bus is irrelevant
The cost of flipping a bit on the address bus is independent of the bit position.
It is not important, which data bit is set to ‘1’
The number of ‚1‘s on the data bus has a minor effect (3%)
- * -
Energy-aware scheduling
Standard compiler optimizations with energy as a cost function
E.g.: Register pipelining:
R2:=a[0];
begin
- * -
(Average) Speed
2
4
8
2
4
5
Speed
years
access to DRAM ~ 500 instructions
penalty for cache miss soon same as for page fault in Atlas
[P. Machanik: Approaches to Addressing the Memory Wall, TR Nov. 2002, U. Brisbane]
2x every 2 years
- * -
Predictability: For satisfying timing constraints in hard real-time systems, predictability is the most important concern;
pre run-time scheduling is often the only practical means of providing predictability in a complex system [Xu, Parnas]
Time-triggered, statically scheduled operating systems
What about memory accesses?
Improve the average case behavior
Use “non-deterministic“ cache replacement algorithms
Scratch-pad/tightly coupled memory based predictability
- * -
Address space
scratch pad memory
Which segment (array, loop, etc.) to be stored in SPM?
Gain gi and size si for each segment i.
Maximise gain G = gi, respecting constraint K si.
Static memory allocation:
Solution: knapsack algorithm.
Multi_sort (mix of sort algorithms)
Cycles
as a function of the SPM size
Parameters different from previous slide
- * -
Hardware-support for block-copying
The DMA unit was modeled in VHDL, simulated, synthesized. Unit only makes up 4% of the processor chip.
The unit can be put to sleep when it is unused.
Code size reductions of up to 23% for a 256 byte SPM were determined using the DMA unit instead of the dynamic approach that uses processor instructions for copying.
DMA
MEMORY
CPU