Comp215: Real World Stories

Mack Joyner, Dan S. Wallach (Rice University)

Copyright 2016, Mack Joyner, Dan S. Wallach. All rights reserved.

Building Executable Code

Java World:

source file -> Parser -> IR -> Optimizer -> Code Gen -> Java bytecode

C/C++ World:

source file -> Parser -> IR -> Optimizer -> Code Gen -> assembly -> Assembler -> object file -> Linker -> executable file

machine independent

machine dependent

Designed by software engineer

Hardware designers and software engineers should work together

What happens when they don’t…

What’s an Assembler?

Assembler

Transforms low level instructions into machine code

Architecture specification typically defines low level instructions

Places machine code in an .obj file

Produces 1 object file for each source file

Organizes binary code in proper object file sections

Low level instruction: ADD R1, R2, R3

ADD sums the values in R1 and R2 and stores the result in R3. R1 and R2 are the register operands (src regs); R3 is the register result (dst reg).

ADD R1, R2, R3 -> Assembler -> 0 1 1 0 0 1 0 1 0 0 0 1 1 0 1 1

Binary encoding for ADD R1, R2, R3 (16 bits encode the low level instruction):

0 1 1    0 0 1    0 1 0    0 0 1 1 0 1 1
R3       R1       R2       Opcode: unique ID for ADD

Register numbers in binary:

000 = 0, 001 = 1, 010 = 2, 011 = 3, 100 = 4, 101 = 5, 110 = 6, 111 = 7

How many registers does this architecture likely have? 8
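The field layout above can be sketched in Python. This is only an illustration: the helper name `encode_rrr` and the field widths are assumptions read off the 16-bit example (3 bits per register field, 7 bits for the opcode), not taken from an actual architecture specification.

```python
# Field widths read off the slide's 16-bit example (an assumption, not a spec).
REG_BITS = 3      # each register field is 3 bits wide
OPCODE_BITS = 7   # what remains of the 16 bits after three register fields

def encode_rrr(opcode, src1, src2, dst):
    """Pack dst | src1 | src2 | opcode into one 16-bit word (hypothetical helper)."""
    for reg in (src1, src2, dst):
        if not 0 <= reg < 2 ** REG_BITS:
            raise ValueError(f"register R{reg} does not fit in {REG_BITS} bits")
    if not 0 <= opcode < 2 ** OPCODE_BITS:
        raise ValueError("opcode does not fit in the opcode field")
    src2_shift = OPCODE_BITS                # 7
    src1_shift = src2_shift + REG_BITS      # 10
    dst_shift = src1_shift + REG_BITS       # 13
    return (dst << dst_shift) | (src1 << src1_shift) | (src2 << src2_shift) | opcode

ADD_OPCODE = 0b0011011  # the low 7 bits of the slide's example

word = encode_rrr(ADD_OPCODE, src1=1, src2=2, dst=3)
print(format(word, "016b"))  # 0110010100011011
print(2 ** REG_BITS)         # 8 registers can be named with a 3-bit field
```

A 3-bit register field can name 2^3 = 8 registers, which is where the answer 8 comes from.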

Recall: Memory Hierarchy

“Computer Architecture: A Quantitative Approach, Fifth Edition” by John Hennessy and David Patterson

Binary Encodings

Hardware designers supply architecture specification

    List of low level instructions

    Binary encoding for each instruction

Software engineers store mapping

    Map name to opcode only (static part)

    ADD -> 0x1b (hex)

PSEUDO CODE:

create_encoding_combos() {
    hmap(ADD) = 0x1b;
    hmap(ADDU) = 0x1c;    // ADDU: ADD unsigned numbers (non-negative)
    …
}

There can be several hundred of these mappings

Binary encoding for ADD R1, R2, R3:

0 1 1    0 0 1    0 1 0    0 0 1 1 0 1 1
R3       R1       R2       opcode (0x1b)

Binary encoding for ADDU R1, R2, R3:

0 1 1    0 0 1    0 1 0    0 0 1 1 1 0 0
R3       R1       R2       opcode (0x1c)

Problem: Incomplete Binary Encoding Set

Hardware designers supply partially complete architecture specification

    Start with some low level instructions, add more later

    May need to change binary encoding for instruction

Bit layout (bit 31 down to bit 0):

    Bits 31-29: t t t (Type)
    Bits 28-27: d d (Dist)
    Bits 26-25: w w (Wb)
    The slide labels bit 25 and bit 16 as the boundaries of the operand field; the low bits down to bit 0 represent the instruction opcode.

Type: B: 000, H: 001, W: 010

Dist: 1PT: 00, 2PT: 01, 4PT: 10

Wb: N: 00, S: 01, A: 10

SW ENG PSEUDO CODE (field values are written in binary):

create_encoding_combos() {
    hmap(STB_1PT_N) = (000 << 29) + (00 << 27) + (00 << 25) + opcode;
    hmap(STH_1PT_N) = (001 << 29) + (00 << 27) + (00 << 25) + opcode;
    hmap(STW_1PT_N) = (010 << 29) + (00 << 27) + (00 << 25) + opcode;
    …
    hmap(STW_1PT_A) = (010 << 29) + (00 << 27) + (10 << 25) + opcode;
    hmap(STB_2PT_N) = (000 << 29) + (01 << 27) + (00 << 25) + opcode;
    …
    hmap(STB_4PT_A) = (000 << 29) + (10 << 27) + (10 << 25) + opcode;
    hmap(STH_4PT_A) = (001 << 29) + (10 << 27) + (10 << 25) + opcode;
    hmap(STW_4PT_A) = (010 << 29) + (10 << 27) + (10 << 25) + opcode;
    …
}

Each name, such as STB_1PT_N, is an instruction in the architecture specification.

(010 << 29) is equivalent to: 0100 0000 0000 0000 0000 0000 0000 0000

29, 27, and 25 are the shift amounts for the bits based on the current spec.

Hardware Designer: Let’s add the BU, HU, WU types and the 8PT distribution. No problem, right?

SW ENGINEER

Dealing with Binary Encoding Changes

Software Engineer’s design should anticipate binary encoding changes.

Minimize work for adding new instructions

Bit layout (bit 31 down to bit 0):

    Bits 31-29: t t t (Type)
    Bits 28-27: d d (Dist)
    Bits 26-25: w w (Wb)
    The slide labels bit 25 and bit 16 as the boundaries of the operand field; the low bits down to bit 0 represent the instruction opcode.

Type: B: 000, H: 001, W: 010, BU: 011, HU: 100, WU: 101

Dist: 1PT: 00, 2PT: 01, 4PT: 10, 8PT: 11

Wb: N: 00, S: 01, A: 10

SW ENG PSEUDO CODE:

create_encoding_combos(name, opcode) {
    for each type
        for each dist
            for each wb {
                // String concat
                ins_name = name + type + "_" + dist + "_" + wb;
                // [type] is the binary encoding for type
                ins_enc = ([type] << 29) + ([dist] << 27) + ([wb] << 25) + opcode;
                hmap(ins_name) = ins_enc;
            }
}

Encoding is calculated in one place, reducing likelihood of errors

Hardware Designer: Let’s add the 16PT distribution. No problem, right?

Hardware Designer: We can just use a bit in the opcode for the distribution!

Garbage Collection

HW Designer: Sorry
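The nested-loop generator above can be sketched in Python as follows. The table names, the base name "ST", and the opcode value 0x2A are hypothetical placeholders chosen for illustration; the field encodings and shift amounts are the ones listed on the slide before the 16PT request.

```python
# Field encodings from the slide (before 16PT was requested).
TYPE_BITS = {"B": 0b000, "H": 0b001, "W": 0b010, "BU": 0b011, "HU": 0b100, "WU": 0b101}
DIST_BITS = {"1PT": 0b00, "2PT": 0b01, "4PT": 0b10, "8PT": 0b11}
WB_BITS = {"N": 0b00, "S": 0b01, "A": 0b10}

def create_encoding_combos(name, opcode):
    """Build every name -> encoding entry for one base instruction."""
    hmap = {}
    for type_name, type_bits in TYPE_BITS.items():
        for dist_name, dist_bits in DIST_BITS.items():
            for wb_name, wb_bits in WB_BITS.items():
                ins_name = f"{name}{type_name}_{dist_name}_{wb_name}"
                # The encoding is computed in exactly one place.
                hmap[ins_name] = (type_bits << 29) | (dist_bits << 27) | (wb_bits << 25) | opcode
    return hmap

# 0x2A is a made-up opcode; the slides do not give the real one.
store_map = create_encoding_combos("ST", 0x2A)
print(len(store_map))                  # 72 entries = 6 types * 4 dists * 3 wb modes
print(hex(store_map["STW_1PT_N"]))     # 0x4000002a
```

Adding BU, HU, WU, or 8PT then means adding one table entry each, instead of writing dozens of new `hmap(...)` lines by hand.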

Dealing with Binary Encoding Changes

Software Engineer’s design should anticipate binary encoding changes.

Minimize work for adding new instructions

Bit layout (bit 31 down to bit 0):

    Bits 31-29: t t t (Type)
    Bits 28-27: d1 d0 (low two Dist bits)
    Bits 26-25: w w (Wb)
    The slide labels bit 25 and bit 16 as the boundaries of the operand field; the low bits down to bit 0 represent the instruction opcode.
    Bit 4: d2 (third Dist bit, placed inside the opcode field)

Type: B: 000, H: 001, W: 010, BU: 011, HU: 100, WU: 101

Dist: 1PT: 000, 2PT: 001, 4PT: 010, 8PT: 011, 16PT: 100

Wb: N: 00, S: 01, A: 10

Bit 4 for all opcodes must be 0

What do we need to do in the encoding to account for the additional distribution bit?

SW ENG PSEUDO CODE:

create_encoding_combos(name, opcode) {
    for each type
        for each dist
            for each wb {
                ins_name = name + type + "_" + dist + "_" + wb;
                // ([dist] & 11) gets the d1,d0 bits; the parentheses matter
                // because << binds tighter than &
                ins_enc = ([type] << 29) + (([dist] & 11) << 27) + ([wb] << 25) + opcode;
                // add d2
                if (dist.equals("16PT"))
                    ins_enc = ins_enc + (1 << 4);
                hmap(ins_name) = ins_enc;
            }
}
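The split distribution field can be sketched in Python as below. The function name `encode_store` and the opcode 0x2A are hypothetical. One deliberate difference from the slide's pseudocode: instead of special-casing the name "16PT", this sketch derives d2 from the third bit of the Dist encoding, on the assumption that any later distribution with that bit set should be handled the same way.

```python
# Three-bit Dist encodings from the slide after 16PT was added.
DIST_BITS = {"1PT": 0b000, "2PT": 0b001, "4PT": 0b010, "8PT": 0b011, "16PT": 0b100}

def encode_store(type_bits, dist_bits, wb_bits, opcode):
    """Pack the fields, placing d1,d0 at bits 28-27 and d2 at bit 4."""
    if opcode & (1 << 4):
        # Bit 4 is now borrowed for d2, so no opcode may use it.
        raise ValueError("bit 4 of every opcode must be 0")
    d1_d0 = dist_bits & 0b11          # low two Dist bits
    d2 = (dist_bits >> 2) & 0b1       # third Dist bit
    return (type_bits << 29) | (d1_d0 << 27) | (wb_bits << 25) | (d2 << 4) | opcode

# Type B (000), 16PT, Wb A (10), made-up opcode 0x2A.
print(hex(encode_store(0b000, DIST_BITS["16PT"], 0b10, 0x2A)))  # 0x400003a
```

The opcode check makes the "bit 4 must be 0" rule fail loudly at table-build time rather than silently producing a wrong encoding.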

Verify Correctness

C/C++ World:

source file -> Parser -> IR -> Optimizer -> Code Gen -> assembly -> Assembler -> object file -> Linker -> executable file

object file -> Disassembler -> assembly

Disassembler

Transforms machine code into low level instructions

Architecture specification typically defines low level instructions

Reverse of the assembler

Can be used for assembly and disassembly correctness testing

ADD R1, R2, R3 -> Assembler -> 0 1 1 0 0 1 0 1 0 0 0 1 1 0 1 1 -> Disassembler -> ADD R1, R2, R3

Binary encoding for ADD R1, R2, R3:

0 1 1    0 0 1    0 1 0    0 0 1 1 0 1 1
R3       R1       R2       Opcode: unique ID for ADD

Register numbers in binary:

000 = 0, 001 = 1, 010 = 2, 011 = 3, 100 = 4, 101 = 5, 110 = 6, 111 = 7
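A round-trip check of this kind can be sketched in Python. The functions `assemble` and `disassemble` are hypothetical helpers for the 16-bit register format shown above (dst, src1, src2, then a 7-bit opcode); only the two opcodes given in the slides, ADD = 0x1b and ADDU = 0x1c, are included.

```python
OPCODES = {"ADD": 0x1B, "ADDU": 0x1C}            # name -> opcode, from the slides
MNEMONICS = {op: name for name, op in OPCODES.items()}

def assemble(line):
    """Turn 'ADD R1, R2, R3' (src1, src2, dst) into a 16-bit word."""
    mnemonic, operands = line.split(None, 1)
    src1, src2, dst = (int(tok.strip().lstrip("R")) for tok in operands.split(","))
    return (dst << 13) | (src1 << 10) | (src2 << 7) | OPCODES[mnemonic]

def disassemble(word):
    """Reverse of assemble: extract each field and print the instruction."""
    opcode = word & 0x7F
    src2 = (word >> 7) & 0x7
    src1 = (word >> 10) & 0x7
    dst = (word >> 13) & 0x7
    return f"{MNEMONICS[opcode]} R{src1}, R{src2}, R{dst}"

# Correctness test: disassembling the assembler's output must give back the input.
for name in OPCODES:
    for a in range(8):
        for b in range(8):
            for c in range(8):
                text = f"{name} R{a}, R{b}, R{c}"
                assert disassemble(assemble(text)) == text

print(disassemble(0b0110010100011011))  # ADD R1, R2, R3
```

Because the two tools are written from the same specification but in opposite directions, a mismatch in the loop points to a field that one of them places or reads incorrectly.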

Assembly Instructions

HW Designer: Let’s have both 2-byte and 4-byte instructions

Assembly instructions w/ cst16 operand are 4-byte instructions

Operand Info

    cst16 = 16-bit constant (2 bytes)

    src1, src2, dst = registers (R0 – R7)

Each 16-bit pattern below spans the 1st and 2nd bytes. A 4-byte instruction (bytes 4 3 2 1) also carries the cst16 operand.

Assembly              dst      src2        src1        opcode
MOV cst16, R0         d d d    s2 s2 s2    s1 s1 s1    0 0 1 1 0 1 1
ADD cst16, R0, R1     d d d    s2 s2 s2    s1 s1 s1    0 0 0 1 0 1 0
MOV R1, R2            d d d    s2 s2 s2    s1 s1 s1    1 1 1 0 1 0 1
SUB cst16, R2, R5     d d d    s2 s2 s2    s1 s1 s1    1 0 1 1 0 1 1
CMP R5, R2            d d d    s2 s2 s2    s1 s1 s1    0 1 1 0 0 1 0
B R5                  d d d    s2 s2 s2    s1 s1 s1    1 1 1 1 0 1 1

Problem: Disassemble the Object File

Assembler generated the object file

Instruction set has 2-byte and 4-byte instructions

Object file size is important

Disassembler aborts if it reads past the end of file

Example: Current file position is 8, file size is 10, read request is 4 bytes
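The example above works out as a simple bounds check. The sketch below is a minimal Python illustration; `read_would_abort` is a hypothetical helper name, not one of the disassembler's commands.

```python
def read_would_abort(pos, size, num_bytes):
    """True if reading num_bytes starting at pos would run past the end of the file."""
    return pos + num_bytes > size

# Position 8 in a 10-byte file: only 2 bytes remain.
print(read_would_abort(8, 10, 4))  # True: a 4-byte read needs bytes 8..11
print(read_would_abort(8, 10, 2))  # False: a 2-byte read ends exactly at the file end
```

So the disassembler has to know an instruction's length before it reads it.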

Problem: Disassemble the Object File

You have 5 disassembly commands at your disposal

    get_pos() - returns current position (initial position is 0)

    file_size() - returns the file size (always non-negative)

    read_bytes(num_bytes) - read and disassemble num_bytes bytes

    get_bit(bit_pos) - returns requested bit’s value (bit_pos=0 is first bit from current position)

    exit() - exit the disassembly function

Disassembler aborts if it reads past the end of file

read_bytes(num_bytes), get_bit(bit_pos) may cause abort

Undefined behavior: reading only 2 bytes of a 4-byte instruction

Problem: Disassemble the Object File

Operand Info

    cst16 = 16-bit constant (2 bytes)

    src1, src2, dst = registers (R0 – R7)

Assembly              dst      src2     src1     opcode
MOV cst16, R0         0 0 0    0 0 0    0 0 0    0 0 1 1 0 1 1
ADD cst16, R0, R1     0 0 1    0 0 0    0 0 0    0 0 0 1 0 1 0
MOV R1, R2            0 1 0    0 0 0    0 0 1    1 1 1 0 1 0 1
SUB cst16, R2, R5     1 0 1    0 1 0    0 0 0    1 0 1 1 0 1 1
CMP R5, R2            0 0 0    0 1 0    1 0 1    0 1 1 0 0 1 0
B R5                  1 0 1    0 0 0    0 0 0    1 1 1 1 0 1 1

How do you write a disassemble function that ensures success?

SW ENG PSEUDO CODE:

disassemble_ins() {

}

Functions: file_size, get_pos, get_bit, exit, read_bytes

Solution: Disassemble the Object File

How do you write a disassemble function that ensures success?

SW ENG PSEUDO CODE:

disassemble_ins() {
    // Stop at the end of the file, before any read can abort.
    if (file_size() == get_pos())
        exit();
    // In the encodings shown, bit 5 is 0 for every instruction with a
    // cst16 operand (4 bytes) and 1 for every 2-byte instruction.
    if (get_bit(5) == 0)
        read_bytes(4);
    else
        read_bytes(2);
    disassemble_ins();
}

Functions: file_size, get_pos, get_bit, exit, read_bytes
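The solution can be tried out against a small simulated object file. The sketch below is Python and rests on several assumptions that the slides do not state: the class `ObjectFile` is a hypothetical stand-in for the five commands, each 16-bit instruction word is stored low byte first with its cst16 (if any) right after it, `get_bit` counts bits starting from the least significant bit of the byte at the current position, and the constant 0x1234 is a placeholder. The pairing of bit patterns with instructions is read off the register fields. A loop replaces the recursion and returns instead of calling exit().

```python
import struct

class ObjectFile:
    """Hypothetical stand-in for the five disassembly commands."""
    def __init__(self, data):
        self.data = data
        self.pos = 0                      # initial position is 0

    def get_pos(self):
        return self.pos

    def file_size(self):
        return len(self.data)

    def get_bit(self, bit_pos):
        index = self.pos + bit_pos // 8
        if index >= len(self.data):
            raise EOFError("abort: read past the end of file")
        return (self.data[index] >> (bit_pos % 8)) & 1

    def read_bytes(self, num_bytes):
        if self.pos + num_bytes > len(self.data):
            raise EOFError("abort: read past the end of file")
        chunk = self.data[self.pos:self.pos + num_bytes]
        self.pos += num_bytes
        return chunk

def disassemble(obj):
    """Walk the file one instruction at a time; return each instruction's length."""
    lengths = []
    while obj.get_pos() != obj.file_size():
        num_bytes = 4 if obj.get_bit(5) == 0 else 2
        obj.read_bytes(num_bytes)
        lengths.append(num_bytes)
    return lengths

def build_object_file(words, cst16=0x1234):
    """Lay out instruction words, appending a constant where bit 5 is 0."""
    out = bytearray()
    for word in words:
        out += struct.pack("<H", word)
        if (word >> 5) & 1 == 0:
            out += struct.pack("<H", cst16)
    return bytes(out)

# The six 16-bit patterns from the slide, in assembly order.
PROGRAM = [0x001B,  # MOV cst16, R0
           0x200A,  # ADD cst16, R0, R1
           0x40F5,  # MOV R1, R2
           0xA85B,  # SUB cst16, R2, R5
           0x0AB2,  # CMP R5, R2
           0xA07B]  # B R5

obj = ObjectFile(build_object_file(PROGRAM))
print(disassemble(obj))  # [4, 4, 2, 4, 2, 2]
```

Checking the position before every `get_bit` call is what guarantees success: in a well-formed file the length decision is made from bytes that are known to exist, so neither `get_bit` nor `read_bytes` can run past the end.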