Computer architecture
Lecture 4: Processor instruction list
Piotr Bilski
Execution of program
• The processor executes machine instructions (after decoding them)
• The programmer writes a program in a symbolic low-level or high-level language
• During compilation the symbolic language is translated into machine language instructions
Elements of the machine instructions
• Operation code
• Argument references (operation input data)
• Result reference (if needed)
• Reference to the next instruction
Instruction format (16 bits): bits 0-3 hold the operation code, bits 4-15 hold the argument references
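A minimal sketch of packing and unpacking such a 16-bit instruction word. It assumes that bits 0-3 in the slide's numbering are the most significant bits; the function names encode and decode are illustrative only.

```python
OPCODE_BITS = 4     # bits 0-3 of the word (assumed to be the most significant)
ADDRESS_BITS = 12   # bits 4-15 of the word


def encode(opcode: int, address: int) -> int:
    """Pack an opcode and an address into one 16-bit instruction word."""
    if not 0 <= opcode < (1 << OPCODE_BITS):
        raise ValueError("opcode does not fit in 4 bits")
    if not 0 <= address < (1 << ADDRESS_BITS):
        raise ValueError("address does not fit in 12 bits")
    return (opcode << ADDRESS_BITS) | address


def decode(word: int) -> tuple[int, int]:
    """Split a 16-bit instruction word into (opcode, address)."""
    return word >> ADDRESS_BITS, word & ((1 << ADDRESS_BITS) - 1)


word = encode(0b0010, 0x201)
print(f"{word:016b}")   # 0010001000000001
print(decode(word))     # (2, 513)
```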
Arguments and results are stored in:
• Memory (main, cache, virtual)
• Processor registers (accumulator, general purpose registers)
• Input/output devices (hard drive, printer)
Instruction types
• Data processing (logical and arithmetic operations)
• Data storage (instructions related to the memory access)
• Data transmission (input/output operations)
• Control (result testing, non-sequential code execution – jumps, branches)
Relation between the symbolic and machine instructions
x = x + c;
LOAD 1001
ADD 1002
STORE 1001
Memory address 1001 holds x and address 1002 holds c; the addition is performed in the ALU
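The translation of x = x + c can be traced with a small simulation of a one-accumulator machine. This is only a sketch: the function run and the initial values of x and c are assumptions made for illustration.

```python
def run(program, memory):
    """Execute (mnemonic, address) pairs on a machine with one accumulator."""
    acc = 0
    for op, addr in program:
        if op == "LOAD":        # accumulator <- memory[addr]
            acc = memory[addr]
        elif op == "ADD":       # accumulator <- accumulator + memory[addr]
            acc += memory[addr]
        elif op == "STORE":     # memory[addr] <- accumulator
            memory[addr] = acc
        else:
            raise ValueError(f"unknown instruction {op}")
    return memory


memory = {1001: 5, 1002: 3}     # x = 5, c = 3 (illustrative values)
program = [("LOAD", 1001), ("ADD", 1002), ("STORE", 1001)]
run(program, memory)
print(memory[1001])             # 8
```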
Number of the addresses in the instruction
Programs computing Y=(A-B)/(C+D*E):

3 addresses
Instruction    Action
SUB Y,A,B      Y ← A-B
MPY T,D,E      T ← D*E
ADD T,T,C      T ← T+C
DIV Y,Y,T      Y ← Y/T

2 addresses
Instruction    Action
MOVE Y,A       Y ← A
SUB Y,B        Y ← Y-B
MOVE T,D       T ← D
MPY T,E        T ← T*E
ADD T,C        T ← T+C
DIV Y,T        Y ← Y/T

1 address
Instruction    Action
LOAD D         AC ← D
MPY E          AC ← AC*E
ADD C          AC ← AC+C
STOR Y         Y ← AC
LOAD A         AC ← A
SUB B          AC ← AC-B
DIV Y          AC ← AC/Y
STOR Y         Y ← AC
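As a rough check that the address formats are equivalent, the sketch below runs a three-address and a one-address program for Y=(A-B)/(C+D*E) and compares the results. The interpreters, the input values, and the use of Y as a temporary in the one-address program (STOR Y, then LOAD A, SUB B, DIV Y, STOR Y) are assumptions of this example.

```python
OPS = {
    "ADD": lambda a, b: a + b,
    "SUB": lambda a, b: a - b,
    "MPY": lambda a, b: a * b,
    "DIV": lambda a, b: a / b,
}


def run_three_address(program, mem):
    """Each instruction is (op, dest, src1, src2): dest <- src1 op src2."""
    for op, dest, src1, src2 in program:
        mem[dest] = OPS[op](mem[src1], mem[src2])
    return mem


def run_one_address(program, mem):
    """Each instruction is (op, addr); the accumulator AC is implicit."""
    acc = 0
    for op, addr in program:
        if op == "LOAD":
            acc = mem[addr]
        elif op == "STOR":
            mem[addr] = acc
        else:
            acc = OPS[op](acc, mem[addr])
    return mem


values = {"A": 20, "B": 4, "C": 2, "D": 3, "E": 2}   # illustrative inputs

three = run_three_address(
    [("SUB", "Y", "A", "B"), ("MPY", "T", "D", "E"),
     ("ADD", "T", "T", "C"), ("DIV", "Y", "Y", "T")],
    dict(values),
)
one = run_one_address(
    [("LOAD", "D"), ("MPY", "E"), ("ADD", "C"), ("STOR", "Y"),
     ("LOAD", "A"), ("SUB", "B"), ("DIV", "Y"), ("STOR", "Y")],
    dict(values),
)
print(three["Y"], one["Y"])   # 2.0 2.0
```

The one-address version needs twice as many instructions, but each instruction is shorter because it names only one operand.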
Number of the addresses in the instruction (cont.)
• Three addresses: ADD a,b,c
• Two addresses: MOVE a,b; ADD a,c
• One address: LOAD b; ADD c; STOR a
Each sequence computes a = b + c
Instruction list design problems
• How many (and which) operations should the processor execute?
• Which data types (arguments, results)?
• Which instruction format (length, number of addresses)?
• How many (and which) registers?
• Which addressing modes?
Operands
• Addresses (unsigned integers)
• Numbers (numerical data) – fixed and floating point precision, decimal
• Characters (ASCII / IRA, EBCDIC codes etc.)
• Logical data (single bits)
Computer as the data storage
• Multi-byte data can be stored in memory in little-endian, big-endian, or bi-endian order
• The storage models differ in the order of the bytes in memory; for example, the hexadecimal number 76859432 can be written in two ways:
Address    Big endian    Little endian
263        76            32
264        85            94
265        94            85
266        32            76
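The two byte orders can be reproduced with Python's built-in int.to_bytes. This is a small sketch; the base address 263 is taken from the example above and the variable names are illustrative.

```python
value = 0x76859432

big = value.to_bytes(4, byteorder="big")        # most significant byte first
little = value.to_bytes(4, byteorder="little")  # least significant byte first

print(big.hex(" "))      # 76 85 94 32
print(little.hex(" "))   # 32 94 85 76

# Show which byte lands at which address, starting at address 263.
for offset in range(4):
    print(263 + offset, f"{big[offset]:02x}", f"{little[offset]:02x}")

# Reading the bytes back with the matching order restores the number.
restored = int.from_bytes(little, byteorder="little")
```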
Little and big endian
Big endian
• Easy to sort character sequences (strings)
• Allows printing ASCII characters without any conversion
• Integers and characters are stored in the same order
• Used in: Sun SPARC, RISC processors, Motorola 680x0
Little endian
• Easy to convert a longer number to a shorter one
• Arithmetic operations are easier to execute
• Used in: Intel 80x86, Pentium, Alpha
Bi-endian
• Understands both standards
• Used in: PowerPC
Examples of little and big endian in the file types
Big endian:
• Adobe Photoshop
• IMG (GEM Raster)
• JPEG
• MacPaint
• SGI (Silicon Graphics)
• Sun Raster
Little endian:
• BMP (Windows, OS/2 Bitmaps)
• GIF
• PCX (PC Paintbrush)
• TGA (Targa)
• Microsoft RTF (Rich Text Format)
Bi-endian:
• Microsoft RIFF (.WAV & .AVI)
• TIFF
• XWD (X Window Dump)
Pentium data types
• Data sizes are multiples of a byte (byte: 1 B, word: 2 B, double word: 4 B, etc.)
• Floating point formats comply with the IEEE 754 standard
• Data do not have to be stored at evenly aligned addresses
• Unsigned integers (8, 16, 32, 64 bits), e.g. addresses
• Signed integers (8, 16, 32, 64 bits), two's complement representation
• Floating point numbers (single, double, and extended double precision)
Pentium data types (cont.)
• Generic (any content 16, 32 or 64 bits long)
• Unpacked decimal number binary representation (one digit in a byte)
• Packed decimal number binary representation (two digits in a byte)
• Pointer (32-bit address)
• Bit field
• Byte string
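The difference between the unpacked and packed decimal representations can be sketched as follows. The helper names are hypothetical, the sketch handles only non-negative numbers, and it writes the most significant digit first for readability, which is not necessarily the byte order used in memory.

```python
def to_unpacked_bcd(n: int) -> bytes:
    """One decimal digit per byte."""
    return bytes(int(d) for d in str(n))


def to_packed_bcd(n: int) -> bytes:
    """Two decimal digits per byte (high nibble, low nibble)."""
    digits = str(n)
    if len(digits) % 2:            # pad to an even number of digits
        digits = "0" + digits
    return bytes(
        (int(digits[i]) << 4) | int(digits[i + 1])
        for i in range(0, len(digits), 2)
    )


print(to_unpacked_bcd(1234).hex(" "))   # 01 02 03 04
print(to_packed_bcd(1234).hex(" "))     # 12 34
```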
PowerPC data types
• Data 8, 16, 32, 64 bits long
• Alignment of data addresses to an even byte is not required (though sometimes used)
• PowerPC is bi-endian
• Stored: unsigned and signed numbers (byte (8b), half-word (16b), word (32b), double word (64b)), floating point numbers (IEEE 754), byte strings (up to 128 B)
Operation classification
• Data transfer (STORE, LOAD, SET, PUSH, POP)
• Arithmetic (ADD, SUB, NEG, INC, MULT)
• Logical (AND, OR, NOT, TEST, SHIFT, ROTATE)
• Control passing (JUMP, HALT, EXEC)
• Input/output (READ, WRITE)
• Conversion (TRANS, CONV)
Data transfer
• Aim: to move data from one location to another
• Requires: determining the memory location (virtual address?), checking the cache memory, issuing the read/write operation
• Example instructions: LOAD, STORE (in short, long, half-word versions etc.)
Logical operations
• Operands are treated as bit strings
• The most popular operations: AND, OR, XOR, NOT
• Bit strings can be treated as masks:

      A1 = 10100101
  AND A2 = 11110000
         = 10100000

      A1 = 10100101
  XOR A2 = 11111111
         = 01011010
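The two mask examples can be checked directly with Python's bitwise operators. The OR line is an extra illustration that is not in the slide.

```python
a1 = 0b10100101

kept_high = a1 & 0b11110000   # AND with a mask keeps only the selected bits
inverted = a1 ^ 0b11111111    # XOR with all ones inverts every bit
set_low = a1 | 0b00001111     # OR with a mask sets the selected bits

print(f"{kept_high:08b}")     # 10100000
print(f"{inverted:08b}")      # 01011010
print(f"{set_low:08b}")       # 10101111
```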
Logical operations (cont.)
• Logical shifting
• Arithmetic shifting
[Figure: logical and arithmetic shift diagrams]
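A sketch of the two kinds of shift on 8-bit values: a logical shift fills the vacated position with 0, while an arithmetic right shift repeats the sign bit so that the sign of a two's complement number is preserved. The 8-bit width and the function names are assumptions of this example.

```python
WIDTH = 8
MASK = (1 << WIDTH) - 1


def logical_shift_right(x: int, n: int = 1) -> int:
    """Shift right, filling vacated high bits with 0."""
    return (x & MASK) >> n


def logical_shift_left(x: int, n: int = 1) -> int:
    """Shift left, filling vacated low bits with 0; bits shifted out are lost."""
    return (x << n) & MASK


def arithmetic_shift_right(x: int, n: int = 1) -> int:
    """Shift right, filling vacated high bits with copies of the sign bit."""
    x &= MASK
    shifted = x >> n
    if x >> (WIDTH - 1):                          # sign bit set
        shifted |= (MASK << (WIDTH - n)) & MASK   # replicate the sign bit
    return shifted


x = 0b10100110
print(f"{logical_shift_right(x):08b}")      # 01010011
print(f"{arithmetic_shift_right(x):08b}")   # 11010011
print(f"{logical_shift_left(x):08b}")       # 01001100
```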
Changing execution order
• Control instructions change the order in which instructions are executed
• They include jumps, procedure calls, and repeated execution of an operation in a loop
• Control passing can be conditional or unconditional
Conditional branches
• First method: a multiple-bit condition code stores properties of the result of the last operation (for example its sign, overflow, or zero result), and the jump is executed depending on these bits
• Second method: the jump condition is embedded in the jump instruction itself
• A jump can go in both directions (forward and backward)
Branch example
351
352
353 SUB X, Y
354 BRZ 373
........
372 BR 353
373
........
395 Rest of the code
396
BRZ – jump if the result is zero
BR – jump unconditionally
The condition code set by the SUB operation determines whether BRZ jumps
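The control flow of this example can be traced with a toy interpreter. This is a sketch under several assumptions: SUB X, Y is read as X ← X - Y, the elided instructions (the dotted lines) are treated as no-ops, a HALT is placed at address 373, and the starting values of X and Y are made up.

```python
def run(program, regs, start, max_steps=1000):
    """Run from address `start` until HALT; unlisted addresses act as NOP."""
    pc, zero, sub_count = start, False, 0
    for _ in range(max_steps):
        op, *args = program.get(pc, ("NOP",))
        if op == "SUB":                  # X <- X - Y, remember if result is zero
            regs[args[0]] -= regs[args[1]]
            zero = regs[args[0]] == 0
            sub_count += 1
            pc += 1
        elif op == "BRZ":                # conditional jump on the zero flag
            pc = args[0] if zero else pc + 1
        elif op == "BR":                 # unconditional jump
            pc = args[0]
        elif op == "HALT":
            return regs, sub_count
        else:                            # NOP
            pc += 1
    raise RuntimeError("step limit reached")


program = {
    353: ("SUB", "X", "Y"),
    354: ("BRZ", 373),
    372: ("BR", 353),
    373: ("HALT",),
}
regs, subs = run(program, {"X": 12, "Y": 4}, start=353)
print(regs["X"], subs)   # 0 3
```

BR 353 jumps backward and closes the loop; BRZ 373 jumps forward and leaves it once the subtraction yields zero.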
Procedures
• They are isolated modules in the source code
• Using them makes the code more flexible
• Require two instructions: call and return
• The same procedure can be called many times from different locations
• Procedures can be nested
Procedure and return location
• A procedure can be called from multiple locations in the program
• Nesting of calls is possible
• Calling the procedure requires storing the return address:
– In a register
– At the beginning of the called procedure
– On the stack (the best option, allows nested and recursive procedures)
Procedure call
Stack
• It is a separate memory area for storing data, organized as a LIFO structure
• In many processors a dedicated register works as the stack pointer (for example, Motorola 68000)
• Main stack operations: PUSH, POP
Example of the stack implementation
[Figure: stack in memory with the stack pointer and the end of stack marked, and the PUSH and POP operations]
Working with stack
• Operation: a+b-(c/d)
• The same operation in reverse Polish notation: ab+cd/-

Step      Stack contents (top first)
push a    a
push b    b, a
+         a+b
push c    c, a+b
push d    d, c, a+b
/         c/d, a+b
-         a+b-c/d
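The evaluation of the reverse Polish expression with a stack can be sketched as below. The function eval_rpn and the numeric values bound to a, b, c, d are assumptions made for illustration; each token is a single character.

```python
def eval_rpn(tokens, values):
    """Evaluate a reverse Polish expression using a stack (a Python list)."""
    ops = {
        "+": lambda x, y: x + y,
        "-": lambda x, y: x - y,
        "*": lambda x, y: x * y,
        "/": lambda x, y: x / y,
    }
    stack = []
    for tok in tokens:
        if tok in ops:
            right = stack.pop()          # top of the stack is the right operand
            left = stack.pop()
            stack.append(ops[tok](left, right))
        else:
            stack.append(values[tok])    # operand: PUSH its value
    if len(stack) != 1:
        raise ValueError("malformed expression")
    return stack.pop()


result = eval_rpn("ab+cd/-", {"a": 7, "b": 5, "c": 8, "d": 2})
print(result)   # 8.0, i.e. 7 + 5 - (8 / 2)
```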
Stack frame
• The set of procedure parameters together with the return address
• Allows nested procedure calls by storing input and output parameters on the stack
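Stack frame illustration
[Figure: stack contents in two situations. While procedure A is running, the stack holds one frame with the parameters x1 and x2, the return point, and the previous frame pointer. After procedure A calls B, a second frame with y1, y2, its return point, and its previous frame pointer is placed on top. FP marks the current frame and SP the top of the stack.]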
Stack frame in Pentium processor
• Used by the ENTER and CALL commands
• The ENTER command supports compilers in implementing nested procedures
• The LEAVE command restores the previous stack status
• The frame pointer is stored in the EBP register, the stack pointer in the ESP register
• Example of the frame setup performed when a procedure is entered (the equivalent of ENTER):
PUSH EBP
MOV EBP, ESP
SUB ESP, space_in_memory
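The effect of this three-instruction sequence on ESP and EBP can be followed in a small model. It is only a sketch: the class name, the starting stack address, the return address value, and the 4-byte slot size are assumptions, and the teardown mirrors what LEAVE does (MOV ESP, EBP followed by POP EBP).

```python
class Cpu:
    """Toy model of ESP/EBP with a stack that grows toward lower addresses."""

    def __init__(self, stack_top=0x1000):
        self.esp = stack_top
        self.ebp = 0
        self.mem = {}

    def push(self, value):
        self.esp -= 4
        self.mem[self.esp] = value

    def pop(self):
        value = self.mem[self.esp]
        self.esp += 4
        return value

    def set_up_frame(self, local_bytes):
        self.push(self.ebp)        # PUSH EBP
        self.ebp = self.esp        # MOV EBP, ESP
        self.esp -= local_bytes    # SUB ESP, space_in_memory

    def tear_down_frame(self):
        self.esp = self.ebp        # MOV ESP, EBP
        self.ebp = self.pop()      # POP EBP


cpu = Cpu()
cpu.push(0x4010)                   # hypothetical return address on the stack
cpu.set_up_frame(8)                # reserve 8 bytes for local data
print(hex(cpu.ebp), hex(cpu.esp))  # 0xff8 0xff0
cpu.tear_down_frame()
return_address = cpu.pop()
print(hex(return_address), hex(cpu.esp))  # 0x4010 0x1000
```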
MMX instructions
• Introduced in 1996 in the Pentium processors
• The first version comprised 57 SIMD instructions
• Used to execute operations on integer numbers
• Purpose: multimedia applications (computer games, graphics and sound processing)
• MMX uses four new data types: packed byte, packed word, packed double word, packed quadword
MMX instructions examples
• Arithmetic: PADD, PMUL, PMADD
• Logical: PAND, PANDN, POR, PXOR
• Comparison: PCMPEQ, PCMPGT
• Conversion: PUNPCKH, PUNPCKL
• All instructions have suffixes that determine which data type is used in the operation: B, W, D, Q
Additional MMX registers
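• Eight 64-bit registers, MM0 to MM7
• For backward compatibility, older software sees the MMX registers as the floating point registers
[Figure: layout of a 64-bit MMX register, bits 63 to 0. As packed bytes: the eighth byte occupies bits 63-56, the seventh byte follows, down to the first byte in bits 7-0. As packed words: the fourth word is at the top, and so on.]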
Exemplary MMX operation
MMX arithmetics
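• Saturation instead of overflow

Ordinary addition:
    1111 0000 0000 0000
  + 0011 0000 0000 0000
  1 0010 0000 0000 0000   overflow: the sum does not fit in 16 bits

Saturating addition:
    1111 0000 0000 0000
  + 0011 0000 0000 0000
  1 0010 0000 0000 0000
    1111 1111 1111 1111   saturation: the result is set to the largest 16-bit value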
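The same 16-bit addition in both modes, as a short sketch. The function names are illustrative and only the unsigned case is modelled.

```python
def add_wraparound(a: int, b: int, bits: int = 16) -> int:
    """Ordinary addition: the carry out of the top bit is discarded."""
    return (a + b) & ((1 << bits) - 1)


def add_saturate_unsigned(a: int, b: int, bits: int = 16) -> int:
    """Saturating addition: results above the maximum are clamped to it."""
    return min(a + b, (1 << bits) - 1)


a, b = 0b1111_0000_0000_0000, 0b0011_0000_0000_0000
print(f"{add_wraparound(a, b):016b}")          # 0010000000000000
print(f"{add_saturate_unsigned(a, b):016b}")   # 1111111111111111
```

For pixel or audio data the clamped value is usually the more useful answer: a very bright pixel stays bright instead of wrapping around to a dark one.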
Why should we use MMX?
Operation                                       Acceleration*
Echo effect                                     5.9
Matrix transposition                            2
Arithmetic and logical operations on vectors    6
Fractal drawing (2D)                            1.5
Bilinear texture mapping (3D)                   7
Median filter                                   3.8
Haar transform 2x2                              2.2
Calculating L1 norm                             3.3
3D transformation                               3.1
* compared to C code using the traditional architecture
SSE instructions
• Introduced in 1999 (Pentium III)
• 70 new instructions for floating point operations
• Eight additional 128-bit registers, addressed directly: XMM0 – XMM7 (plus the control register MXCSR)
• Every register stores four 32-bit floating point numbers
SSE (cont.)
• New data type: 4-element vector of single precision floating point numbers
• Operations can be packed (PS – on all elements of the vector) or scalar (SS – only on the first element)
• Example:
xmm0 = [X1 X2 X3 X4] xmm1 = [Y1 Y2 Y3 Y4]
ADDPS(xmm0,xmm1) =
[X1+Y1 X2+Y2 X3+Y3 X4+Y4]
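The difference between a packed and a scalar operation can be modelled with plain Python lists. This is a sketch only: the functions addps and addss are stand-ins named after the instructions, and Python floats are 64-bit rather than the 32-bit values held in XMM registers.

```python
def addps(dst, src):
    """Packed add: every element of dst is added to the matching element of src."""
    return [x + y for x, y in zip(dst, src)]


def addss(dst, src):
    """Scalar add: only the first elements are added, the rest of dst is kept."""
    return [dst[0] + src[0]] + list(dst[1:])


xmm0 = [1.0, 2.0, 3.0, 4.0]       # X1..X4 (illustrative values)
xmm1 = [10.0, 20.0, 30.0, 40.0]   # Y1..Y4

print(addps(xmm0, xmm1))   # [11.0, 22.0, 33.0, 44.0]
print(addss(xmm0, xmm1))   # [11.0, 2.0, 3.0, 4.0]
```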
3DNow! Instructions
• Introduced in 1998 by the AMD corporation
• Provide a set of 21 new SIMD instructions for floating point calculations
• Used in multimedia applications (high resolution graphics, computer games, CAD/CAM)
• Extensions exist: Enhanced 3DNow!, 3DNow! Professional
SSE2 instructions
• Introduced in 2001 (Intel Pentium 4, Athlon 64, Sempron 754, Transmeta Efficeon)
• Set of 144 additional instructions, supported by sixteen 128-bit registers (XMM0 – XMM15)
• Performs operations on 64-bit floating point numbers (x87 coprocessors work with 80-bit numbers) and on integers held in the 128-bit registers
Next Sets of Instructions
• SSE3 (Prescott New Instructions) – 13 new instructions, including complex number arithmetic (since 2004, Pentium 4 Prescott, Athlon 64 E)
• SSSE3 (Supplemental Streaming SIMD Extensions 3) – 16 new instructions operating on integers (since 2005 Xeon, Intel Core 2, AMD Phenom)
• SSE4 – 54 new instructions in two groups (47 and 7), including integer instructions that modify the EFLAGS register (new!), implemented in Intel Core 2, Celeron Conroe, Penryn
Next Sets of Instructions (cont.)
• SSE5 – planned by AMD for 2009 as a competitor to Intel's SSE4. Finally replaced by three groups: XOP, FMA4, CVT16 (AVX compatible), implemented in Bulldozer processors in 2011. Instructions can have as many as 4 arguments!
• AVX (Advanced Vector Extensions) – implemented by Intel in 2011: 16 new 256-bit registers (YMM0-YMM15) + 19 instructions working exclusively on these registers
Assembler
• Low level programming language
• Uses both instructions and symbolic pointers to data
• Every processor has its own assembler
Example of the assembly program
Program: L = I + J + K

Machine language:
101   0010 0010 0000 0001
102   0001 0010 0000 0010
103   0001 0010 0000 0011
104   0011 0010 0000 0100
201   0000 0000 0000 0010
202   0000 0000 0000 0011
203   0000 0000 0000 0100
204   0000 0000 0000 0000

Symbolic program:
101   LDA 201
102   ADD 202
103   ADD 203
104   STA 204
201   DAT 2
202   DAT 3
203   DAT 4
204   DAT 0

Assembler program:
FORMUL   LDA I
         ADD J
         ADD K
         STA L
I        DATA 2
J        DATA 3
K        DATA 4
L        DATA 0
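The step from the assembler program to the machine language can be sketched as a tiny two-pass assembler. The opcode values (LDA = 0010, ADD = 0001, STA = 0011) are read off the machine language listing; treating the addresses as hexadecimal, placing code at 101 and data at 201, and the function assemble itself are assumptions of this example.

```python
OPCODES = {"LDA": 0b0010, "ADD": 0b0001, "STA": 0b0011}

# (label, mnemonic, operand) for each line of the assembler program
SOURCE = [
    ("FORMUL", "LDA", "I"),
    (None, "ADD", "J"),
    (None, "ADD", "K"),
    (None, "STA", "L"),
    ("I", "DATA", 2),
    ("J", "DATA", 3),
    ("K", "DATA", 4),
    ("L", "DATA", 0),
]


def assemble(source, code_origin=0x101, data_origin=0x201):
    """Return {address: 16-bit word} for the given source lines."""
    # Pass 1: assign an address to every line and record the labels.
    symbols, placed = {}, []
    code_addr, data_addr = code_origin, data_origin
    for label, op, operand in source:
        if op == "DATA":
            addr, data_addr = data_addr, data_addr + 1
        else:
            addr, code_addr = code_addr, code_addr + 1
        if label:
            symbols[label] = addr
        placed.append((addr, op, operand))
    # Pass 2: replace mnemonics and labels with numbers.
    image = {}
    for addr, op, operand in placed:
        if op == "DATA":
            image[addr] = operand
        else:
            image[addr] = (OPCODES[op] << 12) | symbols[operand]
    return image


image = assemble(SOURCE)
for addr, word in image.items():
    print(f"{addr:03X} {word:016b}")
```

Two passes are used so that a label can be referenced before the line that defines it has been seen.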