Working with custom ISA extensions in RISC-V
Felipe Torrezan, IAR Systems
Agenda
• IAR Embedded Workbench for RISC-V
  – Compiler
• Extending RISC-V with Custom Instructions
  – Inline assembler
  – Custom instructions
• Demo
IAR Embedded Workbench for RISC-V
Device support for RISC-V
Standard Extensions
M Integer Multiplication and Division
A Atomic Instructions
F Single-Precision Floating-Point
D Double-Precision Floating-Point
C Compressed Instructions
32-bit ISA
RV32I Integer Instruction Set
RV32E Base Integer Instruction Set (embedded)
Vendor Supported Cores
Andes A25, D25F, N22, N25, N25F
CloudBEAR BM-310s
Microchip Mi-V_RV32IMA[F]_L1_(AHB/AXI)
SiFive E20, E21, E24, E31, E34, E76
Syntacore SCR1
Custom Extensions
X Non-Standard Instructions
Compiler
• Proprietary design based on 36 years of experience
• Built on a platform shared across targets, which handles global optimizations and other target-independent work
• Target-specific backend for architecture-specific adaptations and optimizations
RISC-V specifics
• Primary focus will be on adding standard extensions
• Initial prioritization was on code size
• It now also delivers substantial improvements in execution speed
Extensively tested
Commercial test suites
• Plum-Hall Validation test suite
• Perennial EC++VS
• Dinkum C++ Proofer
In-house developed test suite
• >500,000 lines of C/C++ test code, run multiple times across:
  – Processor modes
  – Memory models
  – Optimization levels
Language standards
• ISO/IEC 14882:2017 (C++17)
• ISO/IEC 9899:2018 (C18)
• ANSI X3.159-1989 (C89)
• IEEE 754 standard for floating-point arithmetic
IAR C/C++ Compiler optimizations
• Multiple optimization levels for code size and execution speed
• Option to maximize speed with no size constraints
• Balance between size and speed by setting different optimizations for different parts of the code
• Major features of the optimizer can be controlled individually
• Multi-file compilation allows the optimizer to operate on a larger set of code
• The linker can remove unused code
Extending RISC-V with Custom Instructions
• The RISC-V ISA (Instruction Set Architecture) is designed in a modular way
• This means that the ISA has several groups of instructions (ISA extensions) that can be enabled or disabled as needed
• One of the groups (“X”) is special: it has no predefined instructions
• Designers can add any instruction they need for the application that they want to accelerate
• This is a very powerful feature, as it does not break any software compatibility
• In IAR Embedded Workbench we solved it with an assembler directive
Inline assembler
The IAR C/C++ Compiler for RISC-V provides several ways to access low-level resources
• Modules written entirely in assembly
• Intrinsic functions
• Inline assembler
Inline assembler example
    int Add(int term1, int term2)
    {
        int sum;
        asm("add %0, %1, %2"
            : "=r"(sum)
            : "r"(term1), "r"(term2));
        return sum;
    }
Intrinsic functions examples <intrinsics.h>
    __enable_interrupt(void);
    __disable_interrupt(void);
    __read_csr(unsigned int _reg);
    __write_csr(unsigned int _reg, unsigned int _value);
    __set_bits_csr(unsigned int _reg, unsigned int _value);
    __wait_for_interrupt(void);
    __return_address(void);
Format              Name   Semantics
ADD rd, rs1, rs2    Add    rd = sx(rs1) + sx(rs2)

rd: integer register destination
rsN: integer register source N
sx(reg): signed 32-bit integer
Custom instructions
32-bit RISC-V instruction formats
RISC-V base instruction formats showing immediate variants (bit 31 on the left, bit 0 on the right)

Format             31..25       24..20  19..15  14..12  11..7        6..0
Register/Register  funct7       rs2     rs1     funct3  rd           opcode
Immediate          imm[11:0]            rs1     funct3  rd           opcode
Upper Immediate    imm[31:12]                           rd           opcode
Store              imm[11:5]    rs2     rs1     funct3  imm[4:0]     opcode
Branch             imm[12|10:5] rs2     rs1     funct3  imm[4:1|11]  opcode
Jump               imm[20|10:1|11|19:12]                rd           opcode

Source: https://riscv.org/specifications/
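To make the Register/Register layout concrete, here is a minimal sketch in portable C that packs the six fields into a 32-bit word. The function name `encode_r` is hypothetical and only serves the illustration; the field positions are those of the standard R-type format.

```c
#include <stdint.h>

/* Hypothetical helper: pack the R-type fields into one instruction word.
   Each field is masked to its width, then shifted to its bit position:
   funct7 [31:25], rs2 [24:20], rs1 [19:15], funct3 [14:12], rd [11:7], opcode [6:0]. */
uint32_t encode_r(uint32_t opcode, uint32_t funct3, uint32_t funct7,
                  uint32_t rd, uint32_t rs1, uint32_t rs2)
{
    return ((funct7 & 0x7Fu) << 25) |
           ((rs2    & 0x1Fu) << 20) |
           ((rs1    & 0x1Fu) << 15) |
           ((funct3 & 0x07u) << 12) |
           ((rd     & 0x1Fu) << 7)  |
            (opcode & 0x7Fu);
}
```

For instance, `encode_r(0x33, 0x7, 0x0, 10, 11, 12)` yields 0x00C5F533, the encoding of `and a0, a1, a2` (a0, a1 and a2 are registers x10, x11 and x12).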
Custom instructions
• The .insn directive generates custom instructions which are not directly supported by the assembler
• The .insn directive can be used in inline assembly code in programs written in C and C++
Intrinsic-like macro example
    /* Generates AND r,r,r */
    #define __insn_example(lhs, rhs) ({ \
        int __lhs = (lhs), __rhs = (rhs), __res; \
        asm (".insn r 0x33, 0x7, 0x0, %0, %1, %2" \
             : "=r" (__res) \
             : "r" (__lhs), "r" (__rhs)); \
        __res; \
    })
Intrinsic-like function example
    long __insn_example(int lhs, int rhs) {
        long res;
        /* Generates AND r,r,r */
        asm (".insn r 0x33, 0x7, 0x0, %0, %1, %2"
             : "=r" (res)
             : "r" (lhs), "r" (rhs));
        return res;
    }
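One way to keep code that uses such a wrapper testable on a development host is to guard the inline assembly and fall back to plain C elsewhere. The sketch below is an illustration under assumptions, not IAR's recommended pattern: the name `insn_and` is hypothetical, and it assumes the toolchain predefines `__riscv` when targeting RISC-V and accepts the `__asm` spelling.

```c
/* Hypothetical wrapper around the AND encoding used in the examples above.
   On a RISC-V target (assumed to be signalled by the __riscv macro) the
   instruction is emitted through .insn; on any other target the function
   computes the same result in plain C, so callers can be unit-tested. */
long insn_and(long lhs, long rhs)
{
#if defined(__riscv)
    long res;
    __asm(".insn r 0x33, 0x7, 0x0, %0, %1, %2"
          : "=r" (res)
          : "r" (lhs), "r" (rhs));
    return res;
#else
    return lhs & rhs;   /* reference behaviour of AND */
#endif
}
```

The same shape could wrap a genuinely custom instruction, provided a C reference model of its behaviour is available for the fallback branch.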
Custom instructions
• The .insn directive can generate instructions in every supported RISC-V instruction format (see the .insn directives list)
• Some of the instructions allow relaxations at link time
• These relaxations are controlled by the .option norelax directive
.insn directives
    .insn r   op7, f3, f7, rd, rs1, rs2
    .insn r4  op7, f3, f2, rd, rs1, rs2, rs3
    .insn i   op7, f3, rd, rs1, expr
    .insn i   op7, f3, rd, expr(rs1)
    .insn sb  op7, f3, rd, rs1, expr
    .insn sb  op7, f3, rd, expr(rs1)
    .insn b   op7, f3, rd, rs1, expr
    .insn u   op7, f3, rd, expr
    .insn uj  op2, rd, expr
    .insn cr  op2, f4, rd, rs1
    .insn ci  op2, f2, rd, expr
    .insn ciw op2, f3, rd’, expr
    .insn ca  op2, f6, f2, rd’, rs2’
    .insn cb  op2, f3, rs1’, expr
    .insn cj  op2, f3, expr
    .insn cs  op2, f3, rs1’, rs2’, expr
* Please refer to the RISC-V ISA specification sections 2.3 and 12.2 for details on bit-layout

op7: unsigned immediate for 7-bit opcode
op2: unsigned immediate for 2-bit opcode
fN: unsigned immediate for function code
rd, rsN: register field, integer (x0-x31) or FP (f0-f31)
rd’, rsN’: compact instruction register field, integer (x8-x15) or FP (f8-f15)
expr: immediate expression
Custom instructions
Built-in constants are available when generating a custom instruction

Opcode  Constant
0x0     C0
0x1     C1
0x2     C2
0x03    LOAD
0x07    LOAD_FP
0x0B    CUSTOM_0
0x0F    MISC_MEM
0x13    OP_IMM
0x17    AUIPC
0x1B    OP_IMM_32
0x23    STORE
0x27    STORE_FP
0x2B    CUSTOM_1
0x2F    AMO
0x33    OP
0x37    LUI
0x3B    OP_32
0x43    MADD
0x47    MSUB
0x4B    NMSUB
0x4F    NMADD
0x53    OP_FP
0x5B    CUSTOM_2
0x63    BRANCH
0x67    JALR
0x6F    JAL
0x73    SYSTEM
0x7B    CUSTOM_3
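The four CUSTOM_n entries are the opcodes of interest for custom extensions. As a small illustration, the sketch below mirrors their values from the table above in a C enum and classifies a 32-bit instruction word by its low seven bits. The `OPC_` prefix and the function `custom_slot` are hypothetical names chosen for this example; they are not part of any IAR header.

```c
#include <stdint.h>

/* Values copied from the built-in constants table; the OPC_ names are
   illustrative only. */
enum {
    OPC_CUSTOM_0 = 0x0B,
    OPC_CUSTOM_1 = 0x2B,
    OPC_CUSTOM_2 = 0x5B,
    OPC_CUSTOM_3 = 0x7B
};

/* Return 0..3 when the word uses one of the custom opcodes, -1 otherwise.
   The opcode of a 32-bit instruction occupies bits 6..0. */
int custom_slot(uint32_t insn)
{
    switch (insn & 0x7Fu) {
    case OPC_CUSTOM_0: return 0;
    case OPC_CUSTOM_1: return 1;
    case OPC_CUSTOM_2: return 2;
    case OPC_CUSTOM_3: return 3;
    default:           return -1;
    }
}
```

A word such as 0x0405B513 (opcode 0x13, OP_IMM) returns -1, while any word whose low seven bits are 0x0B returns 0.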
Examples
    /* equivalent to sltiu a0, a1, 0x40 */
    .insn i 0x13, 0x3, a0, a1, 0x40
    /* equivalent to sb a0, 4(a1) */
    .insn s 0x23, 0, a0, 4(a1)
    /* also equivalent to sb a0, 4(a1) */
    .insn s STORE, 0, a0, 4(a1)

Disassembly
    asm (".insn i 0x13, 0x3, a0, a1, 0x40");
    404000A8  0405B513  sltiu a0, a1, 0x40
    asm (".insn s 0x23, 0, a0, 4(a1)");
    404000AC  00B50223  sb a1, 4(a0)
    asm (".insn s STORE, 0, a0, 4(a1)");
    404000B0  00B50223  sb a1, 4(a0)
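The instruction words in the listing can be cross-checked by hand with two small encoders. The sketch below uses hypothetical helper names (`encode_i`, `encode_s`) and the standard I-type and S-type field positions. It shows that 0x0405B513 is `sltiu a0, a1, 0x40`, and that 0x00B50223 has a0 (x10) in the rs1 base field and a1 (x11) in the rs2 source field, which is how the disassembler prints it. A store of a0 to 4(a1) would encode as 0x00A58223 instead.

```c
#include <stdint.h>

/* Hypothetical helper: I-type word with a 12-bit signed immediate in bits 31..20. */
uint32_t encode_i(uint32_t opcode, uint32_t funct3,
                  uint32_t rd, uint32_t rs1, int32_t imm)
{
    uint32_t uimm = (uint32_t)imm & 0xFFFu;   /* two's-complement 12-bit field */
    return (uimm << 20) |
           ((rs1    & 0x1Fu) << 15) |
           ((funct3 & 0x07u) << 12) |
           ((rd     & 0x1Fu) << 7)  |
            (opcode & 0x7Fu);
}

/* Hypothetical helper: S-type word; the immediate is split into
   imm[11:5] (bits 31..25) and imm[4:0] (bits 11..7). */
uint32_t encode_s(uint32_t opcode, uint32_t funct3,
                  uint32_t rs1, uint32_t rs2, int32_t imm)
{
    uint32_t uimm = (uint32_t)imm & 0xFFFu;
    return ((uimm >> 5)      << 25) |
           ((rs2    & 0x1Fu) << 20) |
           ((rs1    & 0x1Fu) << 15) |
           ((funct3 & 0x07u) << 12) |
           ((uimm   & 0x1Fu) << 7)  |
            (opcode & 0x7Fu);
}
```

Here `encode_i(0x13, 3, 10, 11, 0x40)` gives 0x0405B513, `encode_s(0x23, 0, 10, 11, 4)` gives 0x00B50223, and `encode_s(0x23, 0, 11, 10, 4)` gives 0x00A58223.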
Demonstration
Demo details
• IAR Embedded Workbench for RISC-V
• IAR I-jet debug probe
  – ADA-MIPI20-RISCV12
• Digilent Arty A7-100T FPGA board
  – Rev. E
  – SiFive E21 core
  – SCIE (SiFive Custom Instruction Extension)
  – Comes with 2 custom instructions
Custom instructions in the demo (1/4)
R-type instruction, 0x0B opcode, function code f3 = 0
SCIE custom instruction in a C/C++ program
    #define __scie_min(a, b) ({ \
        long __a = (a), __b = (b), __res; \
        asm (".insn r 0x0B, 0, 0, %0, %1, %2" \
             : "=r" (__res) \
             : "r" (__a), "r" (__b)); \
        __res; \
    … /* __scie_min(a, b) */
    int min_val = __scie_min(42, -11);

Disassembly
    int min_val = __scie_min(42, -11);
    4040003A  02A00793  li12 a5, 0x2A
    4040003E  52D5      c.li t0, -0xB
    40400040  0057858B  <custom-0> ;Unknown 32-bit instr.
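Because the disassembler cannot name a custom instruction, it can help to split the word into its R-type fields by hand. The following sketch (the `r_fields` struct and `decode_r` are hypothetical names for this illustration) does that for a 32-bit word such as 0x0057858B.

```c
#include <stdint.h>

/* Hypothetical container for the fields of an R-type instruction. */
struct r_fields {
    uint32_t opcode, rd, funct3, rs1, rs2, funct7;
};

/* Extract each field by shifting it down to bit 0 and masking its width. */
struct r_fields decode_r(uint32_t insn)
{
    struct r_fields f;
    f.opcode =  insn        & 0x7Fu;
    f.rd     = (insn >> 7)  & 0x1Fu;
    f.funct3 = (insn >> 12) & 0x07u;
    f.rs1    = (insn >> 15) & 0x1Fu;
    f.rs2    = (insn >> 20) & 0x1Fu;
    f.funct7 = (insn >> 25) & 0x7Fu;
    return f;
}
```

For 0x0057858B this gives opcode 0x0B, funct3 0, funct7 0, rd x11, rs1 x15 (a5) and rs2 x5 (t0). The two source registers are the ones loaded by the `li12` and `c.li` instructions just before it in the listing.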
Custom instructions in the demo (2/4)
Built-in constant CUSTOM_0 used for mapping the instruction opcode
SCIE custom instruction in a C/C++ program
    #define __scie_max(a, b) ({ \
        long __a = (a), __b = (b), __res; \
        asm (".insn r CUSTOM_0, 1, 0, %0, %1, %2" \
             : "=r" (__res) \
             : "r" (__a), "r" (__b)); \
        __res; \
    … /* __scie_max(a, b) */
    int max_val = __scie_max(84, -22);

Disassembly
    int max_val = __scie_max(84, -22);
    4040003E  05400393  li12 t2, 0x54
    40400042  5829      c.li a6, -0x16
    40400044  0103960B  <custom-0> ;Unknown 32-bit instr.
Custom instructions in the demo (3/4)
Built-in constant CUSTOM_1 used for mapping the instruction opcode
SCIE custom instruction in a C/C++ program
    #define __scie_mini(a, b) ({ \
        long __a = (a), __res; \
        asm (".insn i CUSTOM_1, 0, %0, %1, %2" \
             : "=r" (__res) \
             : "r" (__a), "I" (b)); \
        __res; \
    … /* __scie_mini(a, b) */
    int min_val_imm = __scie_mini(126, -33);

Disassembly
    int min_val_imm = __scie_mini(126, -33);
    40400048  07E00E13  li12 t3, 0x7E
    4040004C  FDFE06AB  <custom-1> ;Unknown 32-bit instr.
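In the immediate variant, the second operand travels inside the instruction word rather than in a register. The sketch below assumes the standard I-type layout, where bits 31..20 hold a 12-bit two's-complement immediate. It checks whether a value fits that field and recovers the immediate from an encoded word. Both function names are hypothetical.

```c
#include <stdint.h>

/* Returns 1 when the value fits the signed 12-bit immediate of an I-type
   instruction (-2048..2047), 0 otherwise. */
int fits_simm12(long value)
{
    return value >= -2048 && value <= 2047;
}

/* Recover the signed immediate from bits 31..20 of an I-type word. */
int32_t decode_i_imm(uint32_t insn)
{
    uint32_t field = insn >> 20;                 /* 12-bit raw field */
    /* Bit 11 of the field is the sign bit: subtract 2^12 when it is set. */
    return (field & 0x800u) ? (int32_t)field - 4096 : (int32_t)field;
}
```

Applied to 0xFDFE06AB from the listing above, the raw field is 0xFDF and `decode_i_imm` returns -33, the constant passed to `__scie_mini`. A constant outside the 12-bit range would first have to be loaded into a register and used with the register form of the instruction.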
Custom instructions in the demo (4/4)
I-type instruction, 0x2B opcode, function code f3 = 1
SCIE custom instruction in a C/C++ program
    #define __scie_maxi(a, b) ({ \
        long __a = (a), __res; \
        asm (".insn i 0x2B, 1, %0, %1, %2" \
             : "=r" (__res) \
             : "r" (__a), "I" (b)); \
        __res; \
    … /* __scie_maxi(a, b) */
    int max_val_imm = __scie_maxi(210, -55);

Disassembly
    int max_val_imm = __scie_maxi(210, -55);
    40400050  0D200393  li12 t2, 0xD2
    40400054  FC93982B  <custom-1> ;Unknown 32-bit instr.
    ;I-type instr.
Summary
• One of the major benefits of using RISC-V is its flexibility: OEMs and SoC vendors can design custom cores
• Adding instructions provides full flexibility for innovation and differentiation
• Custom ISA extensions are easily enabled through the .insn directive in IAR EWRISCV, without compromising code quality or performance