Comp215: Real World Stories
Mack Joyner, Dan S. Wallach (Rice University)
Copyright 2016, Mack Joyner, Dan S. Wallach. All rights reserved.
Building Executable Code
Java World: source file -> Parser -> IR -> Optimizer -> Code Gen -> Java bytecode
C/C++ World: source file -> Parser -> IR -> Optimizer -> Code Gen -> assembly -> Assembler -> object file -> Linker -> executable file
Diagram labels: machine independent; machine dependent
Designed by software engineer
Hardware designers and software engineers should work together
What happens when they don’t…
What’s an Assembler?
Assembler
Transforms low level instructions into machine code
  Architecture specification typically defines low level instructions
Places machine code in an .obj file
  Produces 1 object file for each source file
  Organizes binary code in proper object file sections
Example: ADD R1, R2, R3 (low level instruction)
  ADD sums the values in R1 and R2 and stores the result in R3.
  R1, R2: register operands (src regs)
  R3: register result (dst reg)
Binary Encoding for ADD R1, R2, R3 (16 bits encode the low level instruction):
  0 1 1 | 0 0 1 | 0 1 0 | 0 0 1 1 0 1 1
  R3    | R1    | R2    | Opcode: unique id for ADD
Register numbers in binary: 000 = 0, 001 = 1, 010 = 2, 011 = 3, 100 = 4, 101 = 5, 110 = 6, 111 = 7
How many registers does this architecture likely have? 8
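The field packing on this slide can be sketched in Python. This is a minimal sketch, assuming the 16-bit layout read off the bit pattern above (3-bit dst, 3-bit src1, 3-bit src2, 7-bit opcode). The name encode_rrr is hypothetical and does not come from the course material.

```python
# Hypothetical helper: pack one three-register instruction into 16 bits.
# Assumed layout, read off the slide's bit pattern:
#   dst (3 bits) | src1 (3 bits) | src2 (3 bits) | opcode (7 bits)
ADD_OPCODE = 0b0011011  # the low 7 bits shown on the slide for ADD


def encode_rrr(opcode: int, src1: int, src2: int, dst: int) -> int:
    """Return the 16-bit encoding of '<op> R<src1>, R<src2>, R<dst>'."""
    for reg in (src1, src2, dst):
        if not 0 <= reg <= 7:
            raise ValueError("3-bit register fields only address R0..R7")
    if not 0 <= opcode < (1 << 7):
        raise ValueError("opcode must fit in 7 bits")
    return (dst << 13) | (src1 << 10) | (src2 << 7) | opcode


word = encode_rrr(ADD_OPCODE, src1=1, src2=2, dst=3)  # ADD R1, R2, R3
print(format(word, "016b"))  # 0110010100011011
```

Three bits per register field give 2^3 = 8 distinct register numbers, which is why the answer to the question above is 8.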
Recall: Memory Hierarchy
“Computer Architecture: A Quantitative Approach, Fifth Edition” by John Hennessy and David Patterson
Binary Encodings
Hardware designers supply architecture specification
  List of low level instructions
  Binary encoding for each instruction
Software engineers store mapping
  Map name to opcode only (static part)
  ADD -> 0x1b (hex)
PSEUDO CODE:
  create_encoding_combos() {
    hmap(ADD) = 0x1b;
    hmap(ADDU) = 0x1c;
    …
  }
Binary Encoding for ADD R1, R2, R3:
  0 1 1 | 0 0 1 | 0 1 0 | 0 0 1 1 0 1 1
  R3    | R1    | R2    | opcode 0x1b
Binary Encoding for ADDU R1, R2, R3 (ADDU: ADD unsigned numbers, non-negative):
  0 1 1 | 0 0 1 | 0 1 0 | 0 0 1 1 1 0 0
  R3    | R1    | R2    | opcode 0x1c
There can be several hundred of these mappings
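The name-to-opcode mapping can be sketched as a small Python table plus a lookup. This is a minimal sketch: the two opcodes are the ones on the slide, while the textual syntax accepted by the regular expression and the function name assemble are assumptions made for illustration.

```python
import re

# Static part of the mapping: mnemonic -> opcode (values from the slide).
OPCODES = {"ADD": 0x1B, "ADDU": 0x1C}

# Assumed textual form: "<MNEMONIC> R<src1>, R<src2>, R<dst>".
_INS = re.compile(r"^\s*(\w+)\s+R([0-7])\s*,\s*R([0-7])\s*,\s*R([0-7])\s*$")


def assemble(line: str) -> int:
    """Look up the opcode by name, then pack the register operands around it."""
    match = _INS.match(line)
    if match is None:
        raise ValueError(f"cannot parse instruction: {line!r}")
    name = match.group(1)
    src1, src2, dst = (int(g) for g in match.groups()[1:])
    if name not in OPCODES:
        raise KeyError(f"no encoding stored for {name}")
    # Dynamic part: register fields; static part: the stored opcode.
    return (dst << 13) | (src1 << 10) | (src2 << 7) | OPCODES[name]


print(format(assemble("ADD R1, R2, R3"), "016b"))   # 0110010100011011
print(format(assemble("ADDU R1, R2, R3"), "016b"))  # 0110010100011100
```

Only the opcode is stored per instruction name; the operand bits are computed for each instruction as it is assembled.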
Problem: Incomplete Binary Encoding Set
Hardware designers supply partially complete architecture specification
  Start with some low level instructions, add more later
  May need to change binary encoding for instruction
Instruction layout (32 bits):
  Bit 31              Bit 25                Bit 16                          Bit 0
  | t t t | d d | w w | represent operands | represent instruction opcode |
  Type: B: 000, H: 001, W: 010
  Dist: 1PT: 00, 2PT: 01, 4PT: 10
  Wb: N: 00, S: 01, A: 10
SW ENG PSEUDO CODE (field values are written in binary):
  create_encoding_combos() {
    hmap(STB_1PT_N) = (000 << 29) + (00 << 27) + (00 << 25) + opcode;
    hmap(STH_1PT_N) = (001 << 29) + (00 << 27) + (00 << 25) + opcode;
    hmap(STW_1PT_N) = (010 << 29) + (00 << 27) + (00 << 25) + opcode;
    …
    hmap(STW_1PT_A) = (010 << 29) + (00 << 27) + (10 << 25) + opcode;
    hmap(STB_2PT_N) = (000 << 29) + (01 << 27) + (00 << 25) + opcode;
    …
    hmap(STB_4PT_A) = (000 << 29) + (10 << 27) + (10 << 25) + opcode;
    hmap(STH_4PT_A) = (001 << 29) + (10 << 27) + (10 << 25) + opcode;
    hmap(STW_4PT_A) = (010 << 29) + (10 << 27) + (10 << 25) + opcode;
    …
  }
Annotations:
  STB_1PT_N, STH_1PT_N, …: instruction in architecture specification
  29, 27, 25: shift amount for bits based on current spec
  (010 << 29) is equivalent to: 0100 0000 0000 0000 0000 0000 0000 0000
Hardware Designer: Let’s add the BU, HU, WU types and the 8PT distribution. No problem right?
Dealing with Binary Encoding Changes
Software Engineer’s design should anticipate binary encoding changes.
  Minimize work for adding new instructions
Instruction layout (32 bits):
  Bit 31              Bit 25                Bit 16                          Bit 0
  | t t t | d d | w w | represent operands | represent instruction opcode |
  Type: B: 000, H: 001, W: 010, BU: 011, HU: 100, WU: 101
  Dist: 1PT: 00, 2PT: 01, 4PT: 10, 8PT: 11
  Wb: N: 00, S: 01, A: 10
SW ENG PSEUDO CODE:
  create_encoding_combos(name, opcode) {
    for each type
      for each dist
        for each wb {
          ins_name = name + type + "_" + dist + "_" + wb;   // string concat
          ins_enc = ([type] << 29) + ([dist] << 27) + ([wb] << 25) + opcode;
          hmap(ins_name) = ins_enc;
        }
  }
  [type] is the binary encoding for type
  Encoding is calculated in one place, reducing likelihood of errors
Hardware Designer: Let’s add the 16PT distribution. No problem right?
Hardware Designer: We can just use a bit in the opcode for the distribution!
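The loop-generated mapping can be sketched in Python as follows. This is a minimal sketch under assumptions: the field tables copy the slide's encodings, while the opcode value 0x2A and the base name "ST" are made up for illustration and are not taken from any real specification.

```python
# Field encodings from the slide; adding a variant means adding one dict entry.
TYPES = {"B": 0b000, "H": 0b001, "W": 0b010, "BU": 0b011, "HU": 0b100, "WU": 0b101}
DISTS = {"1PT": 0b00, "2PT": 0b01, "4PT": 0b10, "8PT": 0b11}
WBS = {"N": 0b00, "S": 0b01, "A": 0b10}


def create_encoding_combos(name: str, opcode: int) -> dict:
    """Generate every <name><type>_<dist>_<wb> instruction and its encoding."""
    hmap = {}
    for type_name, type_bits in TYPES.items():
        for dist_name, dist_bits in DISTS.items():
            for wb_name, wb_bits in WBS.items():
                ins_name = f"{name}{type_name}_{dist_name}_{wb_name}"
                # The encoding is computed in exactly one place.
                hmap[ins_name] = (
                    (type_bits << 29) + (dist_bits << 27) + (wb_bits << 25) + opcode
                )
    return hmap


ST_OPCODE = 0x2A  # made-up opcode value, for illustration only
hmap = create_encoding_combos("ST", ST_OPCODE)
print(len(hmap))  # 72 instructions from 6 types x 4 distributions x 3 wb modes
print(format(hmap["STW_1PT_A"], "032b"))
```

With this shape, the earlier request (BU, HU, WU and 8PT) touches only the tables, not the encoding expression.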
Dealing with Binary Encoding Changes
Instruction layout (32 bits), with the distribution field split into d1 d0 (upper bits) and d2 (inside the opcode):
  Bit 31                Bit 25                Bit 16                                       Bit 0
  | t t t | d1 d0 | w w | represent operands | represent instruction opcode, d2 at bit 4 |
  Type: B: 000, H: 001, W: 010, BU: 011, HU: 100, WU: 101
  Dist: 1PT: 000, 2PT: 001, 4PT: 010, 8PT: 011, 16PT: 100
  Wb: N: 00, S: 01, A: 10
  Bit 4 for all opcodes must be 0
What do we need to do in the encoding to account for additional distribution bit?
SW ENG PSEUDO CODE:
  create_encoding_combos(name, opcode) {
    for each type
      for each dist
        for each wb {
          ins_name = name + type + "_" + dist + "_" + wb;
          ins_enc = ([type] << 29) + (([dist] & 11) << 27) + ([wb] << 25) + opcode;   // [dist] & 11 gets d1,d0 bits
          if (dist.equals("16PT")) ins_enc = ins_enc + (1 << 4);                      // add d2
          hmap(ins_name) = ins_enc;
        }
  }
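The split distribution field can be sketched in Python. This is a minimal sketch, not the slide's code: instead of special-casing the name 16PT, it derives d2 from the third bit of the distribution code, which is a substitution chosen here so that the table stays the only place that knows about individual distributions. The function name encode and the opcode 0x0A are made up for illustration.

```python
TYPES = {"B": 0b000, "H": 0b001, "W": 0b010, "BU": 0b011, "HU": 0b100, "WU": 0b101}
DISTS = {"1PT": 0b000, "2PT": 0b001, "4PT": 0b010, "8PT": 0b011, "16PT": 0b100}
WBS = {"N": 0b00, "S": 0b01, "A": 0b10}

D2_BIT = 4  # position of the extra distribution bit inside the opcode field


def encode(type_name: str, dist_name: str, wb_name: str, opcode: int) -> int:
    """Encode one instruction with the distribution split across two fields."""
    if opcode & (1 << D2_BIT):
        raise ValueError("bit 4 of every opcode must be 0; it now carries d2")
    dist = DISTS[dist_name]
    d1_d0 = dist & 0b11  # low two bits stay at bits 28..27
    d2 = dist >> 2       # third bit moves to bit 4
    return (
        (TYPES[type_name] << 29)
        + (d1_d0 << 27)
        + (WBS[wb_name] << 25)
        + (d2 << D2_BIT)
        + opcode
    )


OPCODE = 0x0A  # made-up opcode with bit 4 clear, for illustration only
print(hex(encode("W", "16PT", "A", OPCODE)))  # 0x4400001a
print(hex(encode("W", "8PT", "N", OPCODE)))   # 0x5800000a
```

The explicit check on bit 4 turns the slide's rule ("Bit 4 for all opcodes must be 0") into an error at table-building time rather than a silently wrong encoding.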
Verify Correctness
C/C++ World: source file -> Parser -> IR -> Optimizer -> Code Gen -> assembly -> Assembler -> object file -> Linker -> executable file
Disassembler: object file -> Disassembler -> assembly
Disassembler
Transforms machine code into low level instructions
  Architecture specification typically defines low level instructions
Reverse of the assembler
  Can be used for assembly and disassembly correctness testing
Example:
  Assembler: ADD R1, R2, R3 -> 0 1 1 0 0 1 0 1 0 0 0 1 1 0 1 1
  Disassembler: 0 1 1 0 0 1 0 1 0 0 0 1 1 0 1 1 -> ADD R1, R2, R3
  Binary Encoding for ADD R1, R2, R3:
    0 1 1 | 0 0 1 | 0 1 0 | 0 0 1 1 0 1 1
    R3    | R1    | R2    | Opcode: unique id for ADD
Register numbers in binary: 000 = 0, 001 = 1, 010 = 2, 011 = 3, 100 = 4, 101 = 5, 110 = 6, 111 = 7
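The round-trip idea (assemble, disassemble, compare) can be sketched in Python. This is a minimal sketch, assuming the 16-bit layout from the ADD example (3-bit dst, 3-bit src1, 3-bit src2, 7-bit opcode) and the two opcodes shown earlier in the deck; the function names are hypothetical.

```python
import itertools

OPCODES = {"ADD": 0x1B, "ADDU": 0x1C}
MNEMONICS = {code: name for name, code in OPCODES.items()}  # reverse table


def assemble(name: str, src1: int, src2: int, dst: int) -> int:
    """Pack '<name> R<src1>, R<src2>, R<dst>' into a 16-bit word."""
    return (dst << 13) | (src1 << 10) | (src2 << 7) | OPCODES[name]


def disassemble(word: int) -> str:
    """Unpack a 16-bit word back into its textual form."""
    opcode = word & 0x7F
    src2 = (word >> 7) & 0b111
    src1 = (word >> 10) & 0b111
    dst = (word >> 13) & 0b111
    return f"{MNEMONICS[opcode]} R{src1}, R{src2}, R{dst}"


print(disassemble(0b0110010100011011))  # ADD R1, R2, R3

# Correctness test: every instruction must survive assemble -> disassemble.
for name in OPCODES:
    for src1, src2, dst in itertools.product(range(8), repeat=3):
        text = f"{name} R{src1}, R{src2}, R{dst}"
        assert disassemble(assemble(name, src1, src2, dst)) == text
```

Because the disassembler is the reverse of the assembler, a mismatch in the loop points at a bug in one of the two, or at an encoding table that has drifted from the specification.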
Assembly Instructions
HW Designer: Let’s have both 2-byte and 4-byte instructions
  Assembly instructions w/ cst16 operand are 4-byte instructions
Operand Info
  cst16 = 16-bit constant (2 bytes)
  src1, src2, dst = registers (R0 – R7)
Assembly and encoding (1st byte, 2nd byte):
  MOV cst16, R0       d d d s2 s2 s2 s1 s1 s1 0 0 1 1 0 1 1
  ADD cst16, R0, R1   d d d s2 s2 s2 s1 s1 s1 1 1 1 1 0 1 1
  MOV R1, R2          d d d s2 s2 s2 s1 s1 s1 0 0 0 1 0 1 0
  SUB cst16, R2, R5   d d d s2 s2 s2 s1 s1 s1 1 1 1 0 1 0 1
  CMP R5, R2          d d d s2 s2 s2 s1 s1 s1 1 0 1 1 0 1 1
  B R5                d d d s2 s2 s2 s1 s1 s1 0 1 1 0 0 1 0
Bytes are numbered 4 3 2 1 on the slide; the cst16 operand supplies the extra 2 bytes of a 4-byte instruction.
Problem: Disassemble the Object File
Assembler generated the object file
  Instruction set has 2-byte and 4-byte instructions
Object file size is important
  Disassembler aborts if it reads past the end of file
  Example: Current file position is 8, file size is 10, read request is 4 bytes
Problem: Disassemble the Object File
You have 5 disassembly commands at your disposal
  get_pos() - returns current position (initial position is 0)
  file_size() - returns the file size (always non-negative)
  read_bytes(num_bytes) - read and disassemble num_bytes bytes
  get_bit(bit_pos) - returns requested bit’s value (bit_pos=0 is first bit from current position)
  exit() - exit the disassembly function
Disassembler aborts if it reads past the end of file
  read_bytes(num_bytes), get_bit(bit_pos) may cause abort
Undefined behavior: reading only 2-bytes of a 4-byte instruction
Problem: Disassemble the Object File
Operand Info
  cst16 = 16-bit constant (2 bytes)
  src1, src2, dst = registers (R0 – R7)
Assembly and encoding:
  MOV cst16, R0       0 0 0 0 0 0 0 0 0 0 0 1 1 0 1 1
  ADD cst16, R0, R1   1 0 1 0 0 0 0 0 0 1 1 1 1 0 1 1
  MOV R1, R2          0 0 1 0 0 0 0 0 0 0 0 0 1 0 1 0
  SUB cst16, R2, R5   0 1 0 0 0 0 0 0 1 1 1 1 0 1 0 1
  CMP R5, R2          1 0 1 0 1 0 0 0 0 1 0 1 1 0 1 1
  B R5                0 0 0 0 1 0 1 0 1 0 1 1 0 0 1 0
How do you write a disassemble function that ensures success?
SW ENG PSEUDO CODE:
  disassemble_ins() {
  }
Functions: file_size, get_pos, get_bit, exit, read_bytes
Solution: Disassemble the Object File
How do you write a disassemble function that ensures success?
SW ENG PSEUDO CODE:
  disassemble_ins() {
    if (file_size() == get_pos()) exit();
    if (get_bit(5) == 0) read_bytes(4);
    else read_bytes(2);
    disassemble_ins();
  }
Functions: file_size, get_pos, get_bit, exit, read_bytes
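The solution can be exercised against a toy object file in Python. This is a minimal sketch under stated assumptions: the ObjectFile class is a hypothetical stand-in for the five commands, bit_pos=0 is assumed to be the most significant bit of the byte at the current position, and the test bytes are made up so that bit 5 follows the slide's rule (0 means a 4-byte instruction). The loop replaces the slide's recursion, with returning from the function playing the role of exit().

```python
class ObjectFile:
    """Toy stand-in for the slide's commands (hypothetical implementation)."""

    def __init__(self, data: bytes):
        self._data = data
        self._pos = 0     # initial position is 0
        self.chunks = []  # what read_bytes consumed, kept for inspection

    def get_pos(self) -> int:
        return self._pos

    def file_size(self) -> int:
        return len(self._data)

    def get_bit(self, bit_pos: int) -> int:
        # Assumption: bit_pos=0 is the most significant bit of the current byte.
        byte_index = self._pos + bit_pos // 8
        if byte_index >= len(self._data):
            raise EOFError("abort: read past end of file")
        return (self._data[byte_index] >> (7 - bit_pos % 8)) & 1

    def read_bytes(self, num_bytes: int) -> None:
        if self._pos + num_bytes > len(self._data):
            raise EOFError("abort: read past end of file")
        self.chunks.append(self._data[self._pos:self._pos + num_bytes])
        self._pos += num_bytes


def disassemble_all(obj: ObjectFile) -> None:
    """Loop form of disassemble_ins(): stop exactly at end of file."""
    while obj.get_pos() != obj.file_size():
        obj.read_bytes(4 if obj.get_bit(5) == 0 else 2)


# Made-up bytes: 0x04 sets bit 5 (2-byte instruction), 0x00 clears it (4-byte).
good = ObjectFile(bytes([0x00, 0x1B, 0x12, 0x34, 0x04, 0x0A, 0x00, 0x7B, 0x00, 0x05]))
disassemble_all(good)
print([len(chunk) for chunk in good.chunks])  # [4, 2, 4]
```

The end-of-file check runs before every read, so a well-formed file never triggers the abort; a truncated file (position 8, size 10, 4-byte read) still does, which matches the slide's example.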