Strengths and weaknesses of LLVM as a code...

41
www.cea.fr Strengths and weaknesses of LLVM as a code generator for an application-specific processor Ivan Llopard [email protected] CEA/LIALP April 4, 2013

Transcript of Strengths and weaknesses of LLVM as a code...

Page 1: Strengths and weaknesses of LLVM as a code …compilfr.ens-lyon.fr/wp-content/uploads/2013/04/ivan.pdfStrengths and weaknesses of LLVM as a code generator for an application-speci

&

www.cea.fr

Strengths and weaknesses ofLLVM as a code generator for an

application-specific processor

Ivan [email protected]

CEA/LIALP

April 4, 2013

Page 2: Strengths and weaknesses of LLVM as a code …compilfr.ens-lyon.fr/wp-content/uploads/2013/04/ivan.pdfStrengths and weaknesses of LLVM as a code generator for an application-speci

Cliquez pour modifier lestyle du titre

© CEA. All rights reserved&

Outline

1 Context

2 Mephisto

3 MAC

4 Indexed register file

5 Instruction scheduling and bundling

6 Data memory granularity

7 Binary code compaction

8 Conclusions

Strengths and weaknesses of LLVM as a code generator for an application-specific processor | DACLE Division | April 4, 2013 | 2

Page 3: Strengths and weaknesses of LLVM as a code …compilfr.ens-lyon.fr/wp-content/uploads/2013/04/ivan.pdfStrengths and weaknesses of LLVM as a code generator for an application-speci

Cliquez pour modifier lestyle du titre

© CEA. All rights reserved&

Context

Toolchain development for an application-specific processor,Mephisto

Integrated in a Network on Chip architecture for digital basebandprocessing

18 high-performance ASICs and 5 Mephistos

Software Defined Radio (SDR) system

Implementation of Long Time Evoluation (LTE) specificationData flow representation of the application

Strengths and weaknesses of LLVM as a code generator for an application-specific processor | DACLE Division | April 4, 2013 | 3

Page 4: Strengths and weaknesses of LLVM as a code …compilfr.ens-lyon.fr/wp-content/uploads/2013/04/ivan.pdfStrengths and weaknesses of LLVM as a code generator for an application-speci

Cliquez pour modifier lestyle du titre

© CEA. All rights reserved&

Outline

1 Context

2 Mephisto

3 MAC

4 Indexed register file

5 Instruction scheduling and bundling

6 Data memory granularity

7 Binary code compaction

8 Conclusions

Strengths and weaknesses of LLVM as a code generator for an application-specific processor | DACLE Division | April 4, 2013 | 4

Page 5: Strengths and weaknesses of LLVM as a code …compilfr.ens-lyon.fr/wp-content/uploads/2013/04/ivan.pdfStrengths and weaknesses of LLVM as a code generator for an application-speci

Cliquez pour modifier lestyle du titre

© CEA. All rights reserved&

Mephisto

Orthogonal VLIW architecture

Low power consumption and high throughput

50mW @ 3,2GOps

Instruction 265 bits wide

Compacted by a higher level instruction cache (64 bits)

Designed to handle the arithmetic of complex numbers (1 complexoperation per cycle)

No instruction decoder (fine grain control over the resources)

Accumulation-based operations with saturated fixed pointcomputations

Indexed register addressing

Clustered register files

Address generation support (limited communication with dataregisters)

Strengths and weaknesses of LLVM as a code generator for an application-specific processor | DACLE Division | April 4, 2013 | 5

Page 6: Strengths and weaknesses of LLVM as a code …compilfr.ens-lyon.fr/wp-content/uploads/2013/04/ivan.pdfStrengths and weaknesses of LLVM as a code generator for an application-speci

Cliquez pour modifier lestyle du titre

© CEA. All rights reserved&

Overview of Mephisto

The NoC controller loads the main program into Mephisto’s memoryand enable its execution

Consume and produce data through input and output FIFOs

Address generation

Register bank

& memories

FIFO In FIFO Out

Configuration

Operators

Instruction

memory

Strengths and weaknesses of LLVM as a code generator for an application-specific processor | DACLE Division | April 4, 2013 | 6

Page 7: Strengths and weaknesses of LLVM as a code …compilfr.ens-lyon.fr/wp-content/uploads/2013/04/ivan.pdfStrengths and weaknesses of LLVM as a code generator for an application-speci

Cliquez pour modifier lestyle du titre

© CEA. All rights reserved&

Mephisto: I-Path

64 bits Instruction RAM

Instruction as an array of indexes(bundle)Address sequences for the addressgenerators

Indexed profile tables

uLoader

Cache initializationUpdate cache or profile entries

Profile tables (265 bits)

MAC, FIFO, Auxiliary operators,Address generators, etc.

Delayed indexation

Hardware loop support

No call instructions

uLoader

IRAM

ICache

Profiles

MACs AG AuxF

Strengths and weaknesses of LLVM as a code generator for an application-specific processor | DACLE Division | April 4, 2013 | 7

Page 8: Strengths and weaknesses of LLVM as a code …compilfr.ens-lyon.fr/wp-content/uploads/2013/04/ivan.pdfStrengths and weaknesses of LLVM as a code generator for an application-speci

Cliquez pour modifier lestyle du titre

© CEA. All rights reserved&

Mephisto: Data-Path

Output register for everyfunctional unit

Multiply-and-Accumulateoperators (MAC)

Data

1 Indexed register file (RA)with 2 output ports: 2x16bits2 Data RAMs with aminimum addressable unit of4 bytes

Input and output FIFOs

Auxiliary functions

2 CORDIC implementations1 Divisor1 Compare & Select

Strengths and weaknesses of LLVM as a code generator for an application-specific processor | DACLE Division | April 4, 2013 | 8

Page 9: Strengths and weaknesses of LLVM as a code …compilfr.ens-lyon.fr/wp-content/uploads/2013/04/ivan.pdfStrengths and weaknesses of LLVM as a code generator for an application-speci

Cliquez pour modifier lestyle du titre

© CEA. All rights reserved&

Outline

1 Context

2 Mephisto

3 MAC

4 Indexed register file

5 Instruction scheduling and bundling

6 Data memory granularity

7 Binary code compaction

8 Conclusions

Strengths and weaknesses of LLVM as a code generator for an application-specific processor | DACLE Division | April 4, 2013 | 9

Page 10: Strengths and weaknesses of LLVM as a code …compilfr.ens-lyon.fr/wp-content/uploads/2013/04/ivan.pdfStrengths and weaknesses of LLVM as a code generator for an application-speci

Cliquez pour modifier lestyle du titre

© CEA. All rights reserved&

MAC functional structure

Five-stage pipeline

Positive or negative accumulations

40-bits Accumulators

Arithmetical right shift and 16/31 bits saturation

Strengths and weaknesses of LLVM as a code generator for an application-specific processor | DACLE Division | April 4, 2013 | 10

Page 11: Strengths and weaknesses of LLVM as a code …compilfr.ens-lyon.fr/wp-content/uploads/2013/04/ivan.pdfStrengths and weaknesses of LLVM as a code generator for an application-speci

Cliquez pour modifier lestyle du titre

© CEA. All rights reserved&

MAC: Complex operation

The MAC is able to perform half complex operations. Combiningthem, it can perform one complex operation per cycle.

Examples

“Half” complex multiplication� �load ra, [$1] ; load first output port

load rb, [$2] ; load second output port

; compute acc0 = Im{ra} × Re{rb}+ Re{ra} × Im{rb}mula acc0 , ra.h , rb.l , ra.l , rb.h

asrsat16 mout.h , acc0 , 10 ; saturate and shift

store mout.h , R:[$0] ; store to the register array� �Accumulation� �; compute acc0 += Im{ra} × Re{rb}+ Re{ra} × Im{rb}mula.add acc0 , ra.h , rb.l , ra.l , rb.h� �

Strengths and weaknesses of LLVM as a code generator for an application-specific processor | DACLE Division | April 4, 2013 | 11

Page 12: Strengths and weaknesses of LLVM as a code …compilfr.ens-lyon.fr/wp-content/uploads/2013/04/ivan.pdfStrengths and weaknesses of LLVM as a code generator for an application-speci

Cliquez pour modifier lestyle du titre

© CEA. All rights reserved&

Integration of Mephisto in LLVM

General rule: keep a reasonable amount of intrinsic functions

Multiple memory spaces

Address space specification is supported (e.g. global-shared in PTX,fs-gs in X86, etc.)

Register type

Complex is an aggregateBuilt-in vector types: Makes pattern matching easier

Non-spillable accumulators: cannot recover its original value becauseof saturation semantics

Intra and inter-MAC accumulator copies are illegal

Additions cannot be systematically promoted

Strengths and weaknesses of LLVM as a code generator for an application-specific processor | DACLE Division | April 4, 2013 | 12

Page 13: Strengths and weaknesses of LLVM as a code …compilfr.ens-lyon.fr/wp-content/uploads/2013/04/ivan.pdfStrengths and weaknesses of LLVM as a code generator for an application-speci

Cliquez pour modifier lestyle du titre

© CEA. All rights reserved&

Integration of Mephisto in LLVM

General rule: keep a reasonable amount of intrinsic functions

Multiple memory spaces

Address space specification is supported (e.g. global-shared in PTX,fs-gs in X86, etc.)

Register type

Complex is an aggregateBuilt-in vector types: Makes pattern matching easier

Non-spillable accumulators: cannot recover its original value becauseof saturation semantics

Intra and inter-MAC accumulator copies are illegal

Additions cannot be systematically promoted

Strengths and weaknesses of LLVM as a code generator for an application-specific processor | DACLE Division | April 4, 2013 | 12

Page 14: Strengths and weaknesses of LLVM as a code …compilfr.ens-lyon.fr/wp-content/uploads/2013/04/ivan.pdfStrengths and weaknesses of LLVM as a code generator for an application-speci

Cliquez pour modifier lestyle du titre

© CEA. All rights reserved&

Integration of Mephisto in LLVM

General rule: keep a reasonable amount of intrinsic functions

Multiple memory spaces

Address space specification is supported (e.g. global-shared in PTX,fs-gs in X86, etc.)

Register type

Complex is an aggregateBuilt-in vector types: Makes pattern matching easier

Non-spillable accumulators: cannot recover its original value becauseof saturation semantics

Intra and inter-MAC accumulator copies are illegal

Additions cannot be systematically promoted

Strengths and weaknesses of LLVM as a code generator for an application-specific processor | DACLE Division | April 4, 2013 | 12

Page 15: Strengths and weaknesses of LLVM as a code …compilfr.ens-lyon.fr/wp-content/uploads/2013/04/ivan.pdfStrengths and weaknesses of LLVM as a code generator for an application-speci

Cliquez pour modifier lestyle du titre

© CEA. All rights reserved&

Integration of Mephisto in LLVM

General rule: keep a reasonable amount of intrinsic functions

Multiple memory spaces

Address space specification is supported (e.g. global-shared in PTX,fs-gs in X86, etc.)

Register type

Complex is an aggregateBuilt-in vector types: Makes pattern matching easier

Non-spillable accumulators: cannot recover its original value becauseof saturation semantics

Intra and inter-MAC accumulator copies are illegal

Additions cannot be systematically promoted

Strengths and weaknesses of LLVM as a code generator for an application-specific processor | DACLE Division | April 4, 2013 | 12

Page 16: Strengths and weaknesses of LLVM as a code …compilfr.ens-lyon.fr/wp-content/uploads/2013/04/ivan.pdfStrengths and weaknesses of LLVM as a code generator for an application-speci

Cliquez pour modifier lestyle du titre

© CEA. All rights reserved&

Code generation for MAC

MAC clustering

Consider both MACs as being twodifferent clusters with its own registerbank (accumulators)The DAG is composed only by accoperations and φ-nodesAssign a cluster to each operationInstruction cloner and coalescer need towork together to avoid inter andintra-cluster copies

Accumulators cannot be stored into main memory: memorypromotion

Spilling in register allocation: Force rematerialization

Lack of saturation semantics: provide intrinsic functions

vacc0

φ-vacc1 φ-vacc2

vacc3 vacc4

vacc5 vacc6

Page 17: Strengths and weaknesses of LLVM as a code …compilfr.ens-lyon.fr/wp-content/uploads/2013/04/ivan.pdfStrengths and weaknesses of LLVM as a code generator for an application-speci

Cliquez pour modifier lestyle du titre

© CEA. All rights reserved&

Outline

1 Context

2 Mephisto

3 MAC

4 Indexed register file

5 Instruction scheduling and bundling

6 Data memory granularity

7 Binary code compaction

8 Conclusions

Strengths and weaknesses of LLVM as a code generator for an application-specific processor | DACLE Division | April 4, 2013 | 14

Page 18: Strengths and weaknesses of LLVM as a code …compilfr.ens-lyon.fr/wp-content/uploads/2013/04/ivan.pdfStrengths and weaknesses of LLVM as a code generator for an application-speci

Cliquez pour modifier lestyle du titre

© CEA. All rights reserved&

Address generation system

Five address generators (AG) containing a bank of 8 pointer registers each one

Support for modulo additions

Static sequence of addresses

Direct and indirect addressing modes

The executable memory space is not addressable

Pointer registers cannot be stored into memory (non-spillable)

No communication path between AGs

Strengths and weaknesses of LLVM as a code generator for an application-specific processor | DACLE Division | April 4, 2013 | 15

Page 19: Strengths and weaknesses of LLVM as a code …compilfr.ens-lyon.fr/wp-content/uploads/2013/04/ivan.pdfStrengths and weaknesses of LLVM as a code generator for an application-speci

Cliquez pour modifier lestyle du titre

© CEA. All rights reserved&

Address generators in LLVM

Considering a array mapped into the register file space, addresscomputations for loads and stores cannot not be data related� �reg[index] = reg[index] + reg[index -1];

index ++;� �

LLVM simplifies similar induction variables� �reg[new_index] = reg[index] + reg[index -1];

new_index ++; index ++;� �

Strengths and weaknesses of LLVM as a code generator for an application-specific processor | DACLE Division | April 4, 2013 | 16

Page 20: Strengths and weaknesses of LLVM as a code …compilfr.ens-lyon.fr/wp-content/uploads/2013/04/ivan.pdfStrengths and weaknesses of LLVM as a code generator for an application-speci

Cliquez pour modifier lestyle du titre

© CEA. All rights reserved&

Address generators in LLVM

Considering a array mapped into the register file space, addresscomputations for loads and stores cannot not be data related� �reg[index] = reg[index] + reg[index -1];

index ++;� �LLVM simplifies similar induction variables� �reg[new_index] = reg[index] + reg[index -1];

new_index ++; index ++;� �

Strengths and weaknesses of LLVM as a code generator for an application-specific processor | DACLE Division | April 4, 2013 | 16

Page 21: Strengths and weaknesses of LLVM as a code …compilfr.ens-lyon.fr/wp-content/uploads/2013/04/ivan.pdfStrengths and weaknesses of LLVM as a code generator for an application-speci

Cliquez pour modifier lestyle du titre

© CEA. All rights reserved&

Address generators in LLVM

Example� �for (int index =0; index <16; index ++) {

reg[index] = readf();

reg[index +16] = readf();

reg[index] = reg[index] + reg[index + 16]

}� �

index09

inc8 arrayidx

store1 store2 load

add

arrayidx2

store3

Leads to instruction cloning as in the MAC clusterization solution

Impossible if it finds instruction with unknown side-effects

Different load instructions following the output port they work on!

Output port reutilization versus register pointer clusterization

Strengths and weaknesses of LLVM as a code generator for an application-specific processor | DACLE Division | April 4, 2013 | 17

Page 22: Strengths and weaknesses of LLVM as a code …compilfr.ens-lyon.fr/wp-content/uploads/2013/04/ivan.pdfStrengths and weaknesses of LLVM as a code generator for an application-speci

Cliquez pour modifier lestyle du titre

© CEA. All rights reserved&

Address generators in LLVM

Example� �for (int index =0; index <16; index ++) {

reg[index] = readf();

reg[index +16] = readf();

reg[index] = reg[index] + reg[index + 16]

}� �index09

inc8 arrayidx

store1 store2 load

add

arrayidx2

store3

Leads to instruction cloning as in the MAC clusterization solution

Impossible if it finds instruction with unknown side-effects

Different load instructions following the output port they work on!

Output port reutilization versus register pointer clusterization

Strengths and weaknesses of LLVM as a code generator for an application-specific processor | DACLE Division | April 4, 2013 | 17

Page 23: Strengths and weaknesses of LLVM as a code …compilfr.ens-lyon.fr/wp-content/uploads/2013/04/ivan.pdfStrengths and weaknesses of LLVM as a code generator for an application-speci

Cliquez pour modifier lestyle du titre

© CEA. All rights reserved&

Address generators in LLVM

Example� �for (int index =0; index <16; index ++) {

reg[index] = readf();

reg[index +16] = readf();

reg[index] = reg[index] + reg[index + 16]

}� �index09

inc8 arrayidx

store1 store2 load

add

arrayidx2

store3

Leads to instruction cloning as in the MAC clusterization solution

Impossible if it finds instruction with unknown side-effects

Different load instructions following the output port they work on!

Output port reutilization versus register pointer clusterization

Strengths and weaknesses of LLVM as a code generator for an application-specific processor | DACLE Division | April 4, 2013 | 17

Page 24: Strengths and weaknesses of LLVM as a code …compilfr.ens-lyon.fr/wp-content/uploads/2013/04/ivan.pdfStrengths and weaknesses of LLVM as a code generator for an application-speci

Cliquez pour modifier lestyle du titre

© CEA. All rights reserved&

Strategies for instruction cloning

1 Find a path relating two conflicting nodes and clone it

2 Find the SCCs of the graph, reduce them (DAG) and find the meetof two conflicting nodes to start cloning from there

Strengths and weaknesses of LLVM as a code generator for an application-specific processor | DACLE Division | April 4, 2013 | 18

Page 25: Strengths and weaknesses of LLVM as a code …compilfr.ens-lyon.fr/wp-content/uploads/2013/04/ivan.pdfStrengths and weaknesses of LLVM as a code generator for an application-speci

Cliquez pour modifier lestyle du titre

© CEA. All rights reserved&

Strategies for instruction cloning

1 Find a path relating two conflicting nodes and clone it

2 Find the SCCs of the graph, reduce them (DAG) and find the meetof two conflicting nodes to start cloning from there

Strengths and weaknesses of LLVM as a code generator for an application-specific processor | DACLE Division | April 4, 2013 | 18

Page 26: Strengths and weaknesses of LLVM as a code …compilfr.ens-lyon.fr/wp-content/uploads/2013/04/ivan.pdfStrengths and weaknesses of LLVM as a code generator for an application-speci

Cliquez pour modifier lestyle du titre

© CEA. All rights reserved&

Outline

1 Context

2 Mephisto

3 MAC

4 Indexed register file

5 Instruction scheduling and bundling

6 Data memory granularity

7 Binary code compaction

8 Conclusions

Strengths and weaknesses of LLVM as a code generator for an application-specific processor | DACLE Division | April 4, 2013 | 19

Page 27: Strengths and weaknesses of LLVM as a code …compilfr.ens-lyon.fr/wp-content/uploads/2013/04/ivan.pdfStrengths and weaknesses of LLVM as a code generator for an application-speci

Cliquez pour modifier lestyle du titre

© CEA. All rights reserved&

Instruction scheduling and bundling

Since version 3.1, LLVM has arepresentation for the instructionbundle

Transparent integration

New iterator to get instructions insidebundles

New flag to mark bundled instructionsisInsideBundle

...

isInsideBundle

...

Insn 1

Insn 2

Insn n

Insn n+1

Insn n+2

Insn n+m

Strengths and weaknesses of LLVM as a code generator for an application-specific processor | DACLE Division | April 4, 2013 | 20

Page 28: Strengths and weaknesses of LLVM as a code …compilfr.ens-lyon.fr/wp-content/uploads/2013/04/ivan.pdfStrengths and weaknesses of LLVM as a code generator for an application-speci

Cliquez pour modifier lestyle du titre

© CEA. All rights reserved&

Instruction scheduling and bundling

Live intervals are computed taking bundles into account� �%vreg1 = opA %vreg0

%vreg3 = opB %vreg2 <kill >� �No live interval interference between %vreg1 and %vreg2

Not all back-end passes are aware of bundles

Finalized bundles

Replace bundle header for the special instruction opcode BUNDLEMove instruction flags from inside the bundle to the header

Strengths and weaknesses of LLVM as a code generator for an application-specific processor | DACLE Division | April 4, 2013 | 21

Page 29: Strengths and weaknesses of LLVM as a code …compilfr.ens-lyon.fr/wp-content/uploads/2013/04/ivan.pdfStrengths and weaknesses of LLVM as a code generator for an application-speci

Cliquez pour modifier lestyle du titre

© CEA. All rights reserved&

Instruction bundles in Mephisto

Instruction itineraries: Pipeline specification of operators (resourceusage and latencies)

Automatic generation of a DFA to check for valid instruction packets

Overlapped instructions. Bundled iff the constant field is the same inboth instructions� �load ra, [$1]

add agra1 , agra0 , $1� �Parallel post-increment. Bundled iff the same address register(agra0) is used in both instructions� �load ra, [agra0]

add agra0 , agra0 , $1� �Usage of multiple resources at the first stage

DFA is not enough!

Strengths and weaknesses of LLVM as a code generator for an application-specific processor | DACLE Division | April 4, 2013 | 22

Page 30: Strengths and weaknesses of LLVM as a code …compilfr.ens-lyon.fr/wp-content/uploads/2013/04/ivan.pdfStrengths and weaknesses of LLVM as a code generator for an application-speci

Cliquez pour modifier lestyle du titre

© CEA. All rights reserved&

Instruction bundles in Mephisto

Instruction itineraries: Pipeline specification of operators (resourceusage and latencies)

Automatic generation of a DFA to check for valid instruction packets

Overlapped instructions. Bundled iff the constant field is the same inboth instructions� �load ra, [$1]

add agra1 , agra0 , $1� �Parallel post-increment. Bundled iff the same address register(agra0) is used in both instructions� �load ra, [agra0]

add agra0 , agra0 , $1� �Usage of multiple resources at the first stage

DFA is not enough!

Strengths and weaknesses of LLVM as a code generator for an application-specific processor | DACLE Division | April 4, 2013 | 22

Page 31: Strengths and weaknesses of LLVM as a code …compilfr.ens-lyon.fr/wp-content/uploads/2013/04/ivan.pdfStrengths and weaknesses of LLVM as a code generator for an application-speci

Cliquez pour modifier lestyle du titre

© CEA. All rights reserved&

Instruction Itineraries

Non-interlocked processor

Resource aware list scheduler (VLIWscheduler in LLVM)

Instruction expansions after DAGserializer/scheduler

Pre-RA scheduler for coarse grainscheduling

Shorten liveness of accumulators inbasic block scopes

Post-RA scheduler for fine grainscheduling

Fix data and resource hazards andproduce final bundles

IR code

Instruction selection

DAG serializer/scheduler

Pre-RA scheduler

Register allocation

Post-RA scheduler

Strengths and weaknesses of LLVM as a code generator for an application-specific processor | DACLE Division | April 4, 2013 | 23

Page 32: Strengths and weaknesses of LLVM as a code …compilfr.ens-lyon.fr/wp-content/uploads/2013/04/ivan.pdfStrengths and weaknesses of LLVM as a code generator for an application-speci

Cliquez pour modifier lestyle du titre

© CEA. All rights reserved&

Outline

1 Context

2 Mephisto

3 MAC

4 Indexed register file

5 Instruction scheduling and bundling

6 Data memory granularity

7 Binary code compaction

8 Conclusions

Strengths and weaknesses of LLVM as a code generator for an application-specific processor | DACLE Division | April 4, 2013 | 24

Page 33: Strengths and weaknesses of LLVM as a code …compilfr.ens-lyon.fr/wp-content/uploads/2013/04/ivan.pdfStrengths and weaknesses of LLVM as a code generator for an application-speci

Cliquez pour modifier lestyle du titre

© CEA. All rights reserved&

Data memory granularity

The minimum addressable unit in Mephisto is 4 bytes (complex)

Doesn’t match the general assumption in LLVM: Byte-addressablemachines

The IR provides a good abstraction for address calculations:getelementptr� �%arrayidx = getelementptr inbounds [16 x <2 x i16 >]* %

pilote , i16 0, i16 %index� �GEP lowering in SDAG construction for instruction selection

This abstraction may be broken at different levels

Illegal casts coming down from C code (bitcasts)� �int a = 10;

char *ina = (char *)&a;� �Optimization passes (Loop Strength Reduction)

Strengths and weaknesses of LLVM as a code generator for an application-specific processor | DACLE Division | April 4, 2013 | 25

Page 34: Strengths and weaknesses of LLVM as a code …compilfr.ens-lyon.fr/wp-content/uploads/2013/04/ivan.pdfStrengths and weaknesses of LLVM as a code generator for an application-speci

Cliquez pour modifier lestyle du titre

© CEA. All rights reserved&

Data memory granularity

The minimum addressable unit in Mephisto is 4 bytes (complex)

Doesn’t match the general assumption in LLVM: Byte-addressablemachines

The IR provides a good abstraction for address calculations:getelementptr� �%arrayidx = getelementptr inbounds [16 x <2 x i16 >]* %

pilote , i16 0, i16 %index� �GEP lowering in SDAG construction for instruction selection

This abstraction may be broken at different levels

Illegal casts coming down from C code (bitcasts)� �int a = 10;

char *ina = (char *)&a;� �Optimization passes (Loop Strength Reduction)

Strengths and weaknesses of LLVM as a code generator for an application-specific processor | DACLE Division | April 4, 2013 | 25

Page 35: Strengths and weaknesses of LLVM as a code …compilfr.ens-lyon.fr/wp-content/uploads/2013/04/ivan.pdfStrengths and weaknesses of LLVM as a code generator for an application-speci

Cliquez pour modifier lestyle du titre

© CEA. All rights reserved&

Data memory granularity

The minimum addressable unit in Mephisto is 4 bytes (complex)

Doesn’t match the general assumption in LLVM: Byte-addressablemachines

The IR provides a good abstraction for address calculations:getelementptr� �%arrayidx = getelementptr inbounds [16 x <2 x i16 >]* %

pilote , i16 0, i16 %index� �GEP lowering in SDAG construction for instruction selection

This abstraction may be broken at different levels

Illegal casts coming down from C code (bitcasts)� �int a = 10;

char *ina = (char *)&a;� �Optimization passes (Loop Strength Reduction)

Strengths and weaknesses of LLVM as a code generator for an application-specific processor | DACLE Division | April 4, 2013 | 25

Page 36: Strengths and weaknesses of LLVM as a code …compilfr.ens-lyon.fr/wp-content/uploads/2013/04/ivan.pdfStrengths and weaknesses of LLVM as a code generator for an application-speci

Cliquez pour modifier lestyle du titre

© CEA. All rights reserved&

Outline

1 Context

2 Mephisto

3 MAC

4 Indexed register file

5 Instruction scheduling and bundling

6 Data memory granularity

7 Binary code compaction

8 Conclusions

Strengths and weaknesses of LLVM as a code generator for an application-specific processor | DACLE Division | April 4, 2013 | 26

Page 37: Strengths and weaknesses of LLVM as a code …compilfr.ens-lyon.fr/wp-content/uploads/2013/04/ivan.pdfStrengths and weaknesses of LLVM as a code generator for an application-speci

Cliquez pour modifier lestyle du titre

© CEA. All rights reserved&

Binary code compaction

Delayed execution of slots in theinstruction packet

Binary compaction for anefficient cache utilization

Allow bundling of anti and truedata dependencies

Hazard recognizer complexity

Factorization of profile entries

In-advance page replacement

uLoader

IRAM

ICache

MACs AGwAGra/b

2 7

Strengths and weaknesses of LLVM as a code generator for an application-specific processor | DACLE Division | April 4, 2013 | 27

Page 38: Strengths and weaknesses of LLVM as a code …compilfr.ens-lyon.fr/wp-content/uploads/2013/04/ivan.pdfStrengths and weaknesses of LLVM as a code generator for an application-speci

Cliquez pour modifier lestyle du titre

© CEA. All rights reserved&

Conclusions

+ Built-in representation for instruction bundles

+ Automatic validity checks on bundles

+ Memory space specification

- Support for saturation semantics and fixed point computations

- Word addressable machines

- Data-flow transformations to split calculations

Translation to accumulation form of arithmetic operations?

Scheduling of delayed bundles?

Binary code compaction strategies (cache aware)?

Strengths and weaknesses of LLVM as a code generator for an application-specific processor | DACLE Division | April 4, 2013 | 28

Page 39: Strengths and weaknesses of LLVM as a code …compilfr.ens-lyon.fr/wp-content/uploads/2013/04/ivan.pdfStrengths and weaknesses of LLVM as a code generator for an application-speci

Centre de Grenoble17 rue des Martyrs

38054 Grenoble Cedex

Centre de SaclayNano­Innov PC  172

91191 Gif sur Yvette Cedex

Thanks!

Page 40: Strengths and weaknesses of LLVM as a code …compilfr.ens-lyon.fr/wp-content/uploads/2013/04/ivan.pdfStrengths and weaknesses of LLVM as a code generator for an application-speci

Cliquez pour modifier lestyle du titre

© CEA. All rights reserved&

Programming interface

Additional semantics for

SaturationInput & output FIFO

Complex type: LLVM built-in vector

Exposed accumulators: long long� �int2 D1, D2, R1;

long long acc0 = 0, acc1 = 0;

for (int i=0; i<400; i++) {

D1 = readf ();

D2 = readf ();

// conj(D1) * D2

acc0 += D1.x * D2.x + D1.y * D2.y;

acc1 += D1.x * D2.y + D1.y * D2.x;

}

R1.x = asrsat(acc0 , 18);

R1.y = asrsat(acc1 , 18);

writef(R1);� �Strengths and weaknesses of LLVM as a code generator for an application-specific processor | DACLE Division | April 4, 2013 | 30

Page 41: Strengths and weaknesses of LLVM as a code …compilfr.ens-lyon.fr/wp-content/uploads/2013/04/ivan.pdfStrengths and weaknesses of LLVM as a code generator for an application-speci

Cliquez pour modifier lestyle du titre

© CEA. All rights reserved&

Hardware loops

It hides loop counter evolution and boundary checks

It depends on target caractheristics

Up to 4 nesting levelsCannot exit the loop before termination

Target-independent hardware loop builder

High-level (IR) trip count computation using induction variableanalysis (ScalarEvolution in LLVM)

New loop intrinsics: enterloop/endloop� �entry:

call void @llvm.meph.enter.loop(i16 199)

br label %for.body

for.body:

...

%1 = call i16 @llvm.meph.end.loop ()

%2 = icmp slt i16 %1, 0

br i1 %2, label %for.end , label %for.body� �Strengths and weaknesses of LLVM as a code generator for an application-specific processor | DACLE Division | April 4, 2013 | 31