Object Code Emission & llvm-mc
Bruno Cardoso Lopes
bruno.cardoso@gmail.com
LLVM Developers Meeting, 2009
Cupertino, CA
Introduction
• Motivation
• Background
• Actual Code Emission
• Object Code Emission
• llvm-mc
Motivation
• Known path: Compiler -> .s -> Assembler -> .o -> Linker
• Object code path: Compiler -> .o -> Linker
Motivation
• Why direct object code emission?
• Bypass the external assembler.
• Speed up compile time.
Background
• Current code emission:
• Asm printers
• JIT engine
Asm Printer
• AsmPrinter
• Instructions are described in .td files.
• An auto-generated method is used to print instructions.
Asm Printer
    void X86AsmPrinter::printMCInst(const MCInst *MI) {
      if (MAI->getAssemblerDialect() == 0)
        X86ATTInstPrinter(O, *MAI).printInstruction(MI);
      else
        X86IntelInstPrinter(O, *MAI).printInstruction(MI);
    }
• printInstruction() is the auto-generated method.
JIT
• JIT emits binary code.
• Blobs are emitted to memory by a target-specific code emitter class.
JIT
• The code is emitted per function.
• Each target emitter (X86 Emitter, PPC Emitter, ...) is a MachineFunctionPass.
JIT
• Only PPC has an auto-generated code emitter.
    ...
    class Emitter : public MachineFunctionPass {
      ...
      bool runOnMachineFunction(MachineFunction &MF);
      void emitInstruction(const MachineInstr &MI,
                           const TargetInstrDesc *Desc);
    ...
MachineCodeEmitter
• The actual binary code emission is done by calls to the MachineCodeEmitter.
    void ...emitInstruction(const MachineInstr &MI, ...) {
      // Emit the lock opcode prefix as needed.
      if (Desc->TSFlags & X86II::LOCK)
        MCE.emitByte(0xF0);   // MCE is the MachineCodeEmitter
      ...
JITCodeEmitter
• JIT code emission is implemented in the JITCodeEmitter.
• A specialization of MCE.
• Implements methods to actually write to memory:
    emitByte(..)
    emitULEB128Bytes(..)
    emitDWordLE(...)
    emitAlignment(...)
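The write-to-memory methods listed above can be pictured with a small sketch. This is not LLVM's JITCodeEmitter: the class name `BufferEmitter`, the `bytes()` accessor and the `std::vector` backing store are assumptions made for illustration, and `emitDWordLE` is treated here as writing a 64-bit value. Only the method names come from the slide.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a memory-writing emitter. Method names follow the
// slide; the class itself and its std::vector storage are illustrative only.
class BufferEmitter {
  std::vector<uint8_t> Buf;

public:
  // Append a single byte.
  void emitByte(uint8_t B) { Buf.push_back(B); }

  // Append a 64-bit value, least significant byte first (assumed width).
  void emitDWordLE(uint64_t W) {
    for (int Shift = 0; Shift < 64; Shift += 8)
      emitByte(static_cast<uint8_t>(W >> Shift));
  }

  // Append an unsigned LEB128 value: 7 payload bits per byte, with the high
  // bit set on every byte except the last.
  void emitULEB128Bytes(uint64_t Value) {
    do {
      uint8_t Byte = Value & 0x7f;
      Value >>= 7;
      if (Value != 0)
        Byte |= 0x80;
      emitByte(Byte);
    } while (Value != 0);
  }

  // Pad with zero bytes until the buffer size is a multiple of Alignment.
  void emitAlignment(unsigned Alignment) {
    assert(Alignment != 0 && (Alignment & (Alignment - 1)) == 0 &&
           "alignment must be a power of two");
    while (Buf.size() % Alignment != 0)
      emitByte(0);
  }

  const std::vector<uint8_t> &bytes() const { return Buf; }
};
```

A target emitter would call `emitByte(0xF0)` for a lock prefix exactly as in the previous slide; the emitter decides where the byte ends up.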
Object Code
• Object code support is implemented following the same scheme.
• It specializes the MCE, as the JIT does.
• The MCE is an instance of ObjectCodeEmitter.
Object Code
• The specific formats (e.g. ELF) are specializations of ObjectCodeEmitter.
• Class diagram: ELFCodeEmitter, COFFCodeEmitter, ... derive from ObjectCodeEmitter.
Object Code
• Blobs of code and data are written to BinaryObjects.
• High-level abstraction of "Sections" or "Segments".
    class ELFSection : public BinaryObject {
    public:
      ...
ELFCodeEmitter
• Handles constant pools and jump tables.
• Each binary format places them in a different section.
• Translates generic target relocations into ELF-specific ones, e.g. llvm::reloc_absolute_word -> R_X86_64_32.
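The relocation translation above amounts to a lookup from a format-neutral kind to an ELF type number. The sketch below assumes a hypothetical `GenericReloc` enum and a hypothetical function `toELFRelocation`; neither is LLVM's actual definition. The numeric ELF values are the ones assigned by the x86-64 psABI.

```cpp
#include <cassert>
#include <cstdint>

// Illustrative generic relocation kinds. The names echo the slide's
// reloc_absolute_word, but this enum is an assumption, not LLVM's own.
enum class GenericReloc {
  PCRelWord,
  AbsoluteWord,
  AbsoluteWordSExt,
  AbsoluteDWord
};

// ELF relocation type numbers for x86-64, as assigned by the psABI.
enum : uint32_t {
  R_X86_64_NONE = 0,
  R_X86_64_64 = 1,
  R_X86_64_PC32 = 2,
  R_X86_64_32 = 10,
  R_X86_64_32S = 11
};

// Map a generic relocation kind to its ELF x86-64 counterpart.
uint32_t toELFRelocation(GenericReloc Kind) {
  switch (Kind) {
  case GenericReloc::PCRelWord:
    return R_X86_64_PC32;
  case GenericReloc::AbsoluteWord:
    return R_X86_64_32;
  case GenericReloc::AbsoluteWordSExt:
    return R_X86_64_32S;
  case GenericReloc::AbsoluteDWord:
    return R_X86_64_64;
  }
  assert(false && "unknown relocation kind");
  return R_X86_64_NONE;
}
```

Keeping the generic kinds separate from the ELF numbers is what lets another object format reuse the same target emitter with its own table.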
ELFCodeEmitter
• The ELFCodeEmitter emits code to BinaryObjects.
• Call chain: llvm: MCE.emitByte(0xF0) -> ELFCodeEmitter::emitByte(..) -> BinaryObject::emitByte(..) -> .text
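The call chain above is plain virtual dispatch followed by forwarding. A minimal sketch, with simplified stand-ins: `SketchBinaryObject`, `SketchEmitter`, `SketchELFEmitter` and `startSection` are hypothetical names, not the LLVM classes or methods.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Simplified stand-in for a BinaryObject: a named, growable blob of bytes.
class SketchBinaryObject {
  std::string Name;
  std::vector<uint8_t> Data;

public:
  explicit SketchBinaryObject(std::string N) : Name(std::move(N)) {}
  void emitByte(uint8_t B) { Data.push_back(B); }
  const std::string &getName() const { return Name; }
  const std::vector<uint8_t> &getData() const { return Data; }
};

// The interface the code generator talks to (the "MCE" on the slide).
class SketchEmitter {
public:
  virtual ~SketchEmitter() = default;
  virtual void emitByte(uint8_t B) = 0;
};

// Illustrative ELF emitter: forwards every byte to the current section.
class SketchELFEmitter : public SketchEmitter {
  SketchBinaryObject *Current = nullptr;

public:
  // Select the section that receives subsequent bytes.
  void startSection(SketchBinaryObject &S) { Current = &S; }

  void emitByte(uint8_t B) override {
    assert(Current && "no section selected");
    Current->emitByte(B);
  }
};
```

The target code only sees `SketchEmitter`, so the same `emitByte(0xF0)` call lands in `.text` here and would land in executable memory under a JIT emitter.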
ELFWriter
• Emits the symbol table, string table, header and relocations into binary objects.
• Dumps the binary objects to the final file.
Limitations
• Inline assembly is not handled by the emitters.
• Handling it requires an assembly parser.
• Solution: llvm-mc.
llvm-mc
• Machine code driver.
• Current playground for an assembly parser, assembler and disassembler.
llvm-mc
• Goals:
• Extract all info from .td files.
• Auto-generate an assembler, disassembler and code generator.
• Integrate the assembler into the compiler.
llvm-mc
• Goals:
• At least ~20% speedup at “-O0 -g”.
• Share as much of the binary writers' code base as possible among the different formats.
![Page 23: Bruno Cardoso Lopes bruno.cardoso@gmail.com LLVM Developers ...](https://reader034.fdocuments.us/reader034/viewer/2022042723/5868e42b1a28ab39568bde92/html5/thumbnails/23.jpg)
llvm-mc
• In progress:
• Parse an assembly file and dump the lexer tokens (-as-lex).
  Input:
    .data
    .ascii "hello"
  Output with -as-lex:
    identifier: .data
    EndOfStatement
    identifier: .ascii
    string: "hello"
    EndOfStatement
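To make the token dump concrete, here is a toy lexer that reproduces the output above for this input. It is a sketch under simplifying assumptions, not llvm-mc's lexer: the function name `lexAsm`, the helper `isIdentChar`, the set of identifier characters and the `other:` fallback token are all made up for illustration; only the `identifier`, `string` and `EndOfStatement` labels come from the slide.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Characters accepted inside an identifier in this toy lexer (an assumption).
static bool isIdentChar(char C) {
  return std::isalnum(static_cast<unsigned char>(C)) || C == '.' ||
         C == '_' || C == '$';
}

// Split assembly source into printable token descriptions.
std::vector<std::string> lexAsm(const std::string &Src) {
  std::vector<std::string> Tokens;
  std::size_t I = 0;
  while (I < Src.size()) {
    char C = Src[I];
    if (C == '\n') {
      // A newline terminates the current statement.
      Tokens.push_back("EndOfStatement");
      ++I;
    } else if (C == ' ' || C == '\t') {
      ++I; // Skip horizontal whitespace.
    } else if (C == '"') {
      // String literal: everything up to the closing quote (no escapes).
      std::size_t End = Src.find('"', I + 1);
      if (End == std::string::npos)
        End = Src.size() - 1; // Unterminated: take the rest of the input.
      Tokens.push_back("string: " + Src.substr(I, End - I + 1));
      I = End + 1;
    } else if (isIdentChar(C)) {
      std::size_t Start = I;
      while (I < Src.size() && isIdentChar(Src[I]))
        ++I;
      Tokens.push_back("identifier: " + Src.substr(Start, I - Start));
    } else {
      Tokens.push_back(std::string("other: ") + C);
      ++I;
    }
  }
  assert(I == Src.size() && "lexer must consume the whole input");
  return Tokens;
}
```

Calling `lexAsm(".data\n.ascii \"hello\"\n")` yields the five tokens listed on the slide, in the same order.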
llvm-mc
• In progress:
• Parse and assemble a .s file, emitting asm again or object code.
    $ llvm-mc -assemble -output-asm-variant=0 -show-encoding x86.s
        .section __TEXT,__text,regular,pure_instructions
        subb %al, %al     # encoding: [0x28,0xc0]
        addl $24, %eax    # encoding: [0x83,0xc0,0x18]
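The two encodings shown by -show-encoding can be derived by hand from the x86 ModRM layout (mod in bits 7-6, reg in bits 5-3, r/m in bits 2-0). The helper names below (`modRM`, `encodeSubRR8`, `encodeAddRI8`) are hypothetical and cover only these two register-direct forms; this is a worked example, not how llvm-mc's encoder is structured.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Build a ModRM byte from its three fields.
uint8_t modRM(unsigned Mod, unsigned Reg, unsigned RM) {
  assert(Mod < 4 && Reg < 8 && RM < 8 && "ModRM field out of range");
  return static_cast<uint8_t>((Mod << 6) | (Reg << 3) | RM);
}

// Hardware register numbers: AL and EAX both encode as 0.
constexpr unsigned AL = 0;
constexpr unsigned EAX = 0;

// subb %src, %dst: opcode 0x28 (SUB r/m8, r8) with mod=11 (register direct).
std::vector<uint8_t> encodeSubRR8(unsigned Src, unsigned Dst) {
  return {0x28, modRM(3, Src, Dst)};
}

// addl $imm8, %dst: opcode 0x83 with /0 in the reg field (ADD r/m32, imm8),
// followed by the one-byte immediate.
std::vector<uint8_t> encodeAddRI8(int8_t Imm, unsigned Dst) {
  return {0x83, modRM(3, 0, Dst), static_cast<uint8_t>(Imm)};
}
```

With both operands equal to register 0, the ModRM byte is 0b11000000 = 0xc0, which is why it appears in both encodings; the trailing 0x18 is simply 24 in hexadecimal.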
llvm-mc
• In progress:
• A complete assembler: includes relaxation phases, which allow late optimizations.
• Example: Jump instruction encoding on x86.
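The jump example can be sketched as follows: every jump starts in its short form (`EB rel8`, 2 bytes) and is grown to the near form (`E9 rel32`, 5 bytes) only if its displacement does not fit in a signed byte. Growing one jump can push another out of range, so the loop repeats until nothing changes. The `Fragment` struct and the `layout`/`relax` functions are assumptions for illustration, not llvm-mc's data structures.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// One piece of code: either raw bytes of a fixed size, or a jump whose
// encoded size depends on the distance to its target fragment.
struct Fragment {
  bool IsJump;
  std::size_t Size;   // current size in bytes
  std::size_t Target; // index of the target fragment (jumps only)
};

constexpr std::size_t ShortJmpSize = 2; // EB rel8
constexpr std::size_t NearJmpSize = 5;  // E9 rel32

// Compute the start offset of each fragment, plus one past-the-end entry.
std::vector<std::int64_t> layout(const std::vector<Fragment> &Frags) {
  std::vector<std::int64_t> Offsets(Frags.size() + 1, 0);
  for (std::size_t I = 0; I < Frags.size(); ++I)
    Offsets[I + 1] = Offsets[I] + static_cast<std::int64_t>(Frags[I].Size);
  return Offsets;
}

// Grow short jumps that cannot reach their target; repeat until stable.
// Returns how many jumps had to be relaxed.
unsigned relax(std::vector<Fragment> &Frags) {
  unsigned Relaxed = 0;
  bool Changed = true;
  while (Changed) {
    Changed = false;
    std::vector<std::int64_t> Offsets = layout(Frags);
    for (std::size_t I = 0; I < Frags.size(); ++I) {
      Fragment &F = Frags[I];
      if (!F.IsJump || F.Size != ShortJmpSize)
        continue;
      assert(F.Target < Frags.size() && "jump target out of range");
      // x86 displacements are relative to the end of the jump instruction.
      std::int64_t Disp = Offsets[F.Target] - Offsets[I + 1];
      if (Disp < INT8_MIN || Disp > INT8_MAX) {
        F.Size = NearJmpSize;
        ++Relaxed;
        Changed = true;
      }
    }
  }
  return Relaxed;
}
```

Because sizes only ever grow, the loop terminates after at most one pass per jump.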
llvm-mc
• In progress:
• Interactive disassembler: makes it easier to write regression tests for instruction encoding.
    $ llvm-mc -disassemble
    74 22                  (user input)
    1 instruction:
    74 22 je 34
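The session above decodes a short conditional jump: opcodes 0x70 to 0x7F are `Jcc rel8`, the low four bits select the condition, and the second byte is a signed displacement (0x22 is 34 in decimal). A minimal decoder for just that instruction family, with hypothetical function names `conditionName` and `disassembleShortJcc`, might look like this:

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Mnemonic for a short conditional jump opcode (0x70-0x7F).
const char *conditionName(uint8_t Opcode) {
  static const char *const Names[16] = {
      "jo", "jno", "jb", "jae", "je", "jne", "jbe", "ja",
      "js", "jns", "jp", "jnp", "jl", "jge", "jle", "jg"};
  assert((Opcode & 0xF0) == 0x70 && "not a short Jcc opcode");
  return Names[Opcode & 0x0F];
}

// Decode a two-byte Jcc rel8; returns an empty string for anything else.
std::string disassembleShortJcc(const std::vector<uint8_t> &Bytes) {
  if (Bytes.size() != 2 || (Bytes[0] & 0xF0) != 0x70)
    return "";
  // The displacement byte is signed.
  int Disp = static_cast<int8_t>(Bytes[1]);
  return std::string(conditionName(Bytes[0])) + " " + std::to_string(Disp);
}
```

A regression test for the encoder can then feed bytes in and compare the text that comes out, which is the workflow the slide describes.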
llvm-mc
• Architecture
• The asm parser emits code through a generic streamer, MCStreamer.
• The streamer is specialized to emit asm or object code.
llvm-mc
• Diagram: AsmParser.Run() -> EmitInstruction() -> MCStreamer
  • MCAsmStreamer -> .s
  • MCMachOStreamer -> EncodeInstruction() -> MachObjectWriter() -> WriteObject() -> .o
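The streamer design can be illustrated with a stripped-down interface that has one textual and one binary specialization. All names prefixed with `Sketch`, plus `runParser`, are hypothetical; instructions carry a pre-computed encoding here so the example stays small, whereas a real object streamer would encode them itself.

```cpp
#include <cassert>
#include <cstdint>
#include <sstream>
#include <string>
#include <vector>

// A deliberately tiny "instruction": its text and its pre-computed encoding.
struct SketchInst {
  std::string Text;
  std::vector<uint8_t> Encoding;
};

// Generic streamer interface: the parser only ever talks to this.
class SketchStreamer {
public:
  virtual ~SketchStreamer() = default;
  virtual void EmitInstruction(const SketchInst &Inst) = 0;
};

// Specialization that prints textual assembly (the ".s" path).
class SketchAsmStreamer : public SketchStreamer {
  std::ostringstream OS;

public:
  void EmitInstruction(const SketchInst &Inst) override {
    OS << '\t' << Inst.Text << '\n';
  }
  std::string str() const { return OS.str(); }
};

// Specialization that collects encoded bytes (the ".o" path).
class SketchObjectStreamer : public SketchStreamer {
  std::vector<uint8_t> Bytes;

public:
  void EmitInstruction(const SketchInst &Inst) override {
    assert(!Inst.Encoding.empty() && "instruction has no encoding");
    Bytes.insert(Bytes.end(), Inst.Encoding.begin(), Inst.Encoding.end());
  }
  const std::vector<uint8_t> &bytes() const { return Bytes; }
};

// Stand-in for the parser: it drives whichever streamer it is given.
void runParser(const std::vector<SketchInst> &Program, SketchStreamer &Out) {
  for (const SketchInst &Inst : Program)
    Out.EmitInstruction(Inst);
}
```

The same `runParser` call produces assembly text or object bytes depending only on which streamer is passed in, which is the point of routing everything through one interface.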
llvm-mc
• Current limitations
• Quite new and experimental.
• Needs a lot of cleanup and refactoring.
• Hardcoded for MachO.
llvm-mc
• Current limitations
• ELF emission is not integrated into the llvm-mc architecture.
• ELF-specific assembly parsing is not implemented.
• The assembly printer is not entirely converted to use MCAsmStreamer.
llvm-mc
• MCStreamer future:
• Support other binary formats.
• New specializations for: JIT, DWARF EH and debug info.
llvm-mc
• MCStreamer future:
• JIT and asm printers will eventually be merged into only one “emitter”.
• “-S” could generate “verbose assembly” by default (loop depth, encoding info, ...)
Future Design
• Printing .s
• Diagram components: CodeGen, Code Emitter, MCStreamer (specialized as MCAsmStreamer); output: .s
Future Design
• JIT
• Diagram components: CodeGen, Code Emitter, MCStreamer (specialized as MCJITStreamer); output: memory
Future Design
• .o writing
• Diagram components: CodeGen, Code Emitter, MCStreamer (specialized as MCObjectStreamer), AssemblerBackend (ELF, MachO, ...); output: .o
Future Design
• Inline asm for .o file writing
• Diagram components: CodeGen, Code Emitter, AsmParser, MCStreamer (specialized as MCObjectStreamer), AssemblerBackend (ELF, MachO, ...); output: .o
Future Design
• AssemblerBackend
• Diagram components: Layout, Relaxation, object writers (MCMachOWriter, MCELFWriter, ...); output: .o