Integer Arithmetic - SKKU
Transcript of Integer Arithmetic - SKKU
SWE3005: Introduction to Computer Architectures, Fall 2019, Jinkyu Jeong ([email protected])
Integer Arithmetic
Jinkyu Jeong ([email protected])Computer Systems Laboratory
Sungkyunkwan Universityhttp://csl.skku.edu
SWE3005: Introduction to Computer Architectures, Fall 2019, Jinkyu Jeong ([email protected]) 2
Where Are We?
• Abstraction Layers in Modern Systems
Algorithm
Gates/Register-Transfer Level (RTL)
Application
Instruction Set Architecture (ISA)
Operating System/Virtual Machine
Microarchitecture
Devices
Programming Language
Circuits
Physics
Scope ofthis course
(HW/SW Interface) – Ch. 2
HW
SW
Ch. 3, 4 and 5
* Sources: CS 252 lecture notes from Prof. Kubiatowicz (UC Berkeley). Coursera lecture notes for HW/SW Interface from Profs. Borriello and Ceze (Univ. of Washington).
SWE3005: Introduction to Computer Architectures, Fall 2019, Jinkyu Jeong ([email protected]) 3
Introduction
• Bits are just bits– No inherent meaning: conventions define relationship
between bits and numbers– n-bit binary numbers: 0 ~ 2n-1 decimal numbers
• Of course it gets more complicated– Numbers are finite (overflow)– Negative numbers– Fractions and real numbers– Precision and accuracy– Error propagation, ...
SWE3005: Introduction to Computer Architectures, Fall 2019, Jinkyu Jeong ([email protected]) 4
Arithmetic for Computers: An Overview
• Operations on integers– Textbook: P&H 3.1-3.4 (for both 4th and 5th Editions)– Addition and subtraction– Multiplication and division– Dealing with overflow
• Floating-point real numbers– Textbook: P&H 3.5– Representation and operations
SWE3005: Introduction to Computer Architectures, Fall 2019, Jinkyu Jeong ([email protected])
Integer Addition & Subtraction
SWE3005: Introduction to Computer Architectures, Fall 2019, Jinkyu Jeong ([email protected]) 6
Integer Addition• Example: 7 + 6
• Overflow if result out of range– Adding +ve and –ve operands, no overflow– Adding two +ve operands• Overflow if result sign is 1
– Adding two –ve operands• Overflow if result sign is 0
SWE3005: Introduction to Computer Architectures, Fall 2019, Jinkyu Jeong ([email protected]) 7
Integer Subtraction
• Add negation of second operand
• Example: 7 – 6 = 7 + (–6)+7: 0000 0000 … 0000 0111–6: 1111 1111 … 1111 1010+1: 0000 0000 … 0000 0001
• Overflow if result out of range– Subtracting two +ve or two –ve operands, no overflow– Subtracting +ve from –ve operand• Overflow if result sign is 0
– Subtracting –ve from +ve operand• Overflow if result sign is 1
SWE3005: Introduction to Computer Architectures, Fall 2019, Jinkyu Jeong ([email protected]) 8
Simple Adder (n-bit ripple-carry adder)
6 ICE3003: Computer Architecture | Spring 2012 | Jin-Soo Kim ([email protected])
Simple Adder � n-bit ripple-carry adder
SWE3005: Introduction to Computer Architectures, Fall 2019, Jinkyu Jeong ([email protected]) 9
Dealing with Overflow
• Some languages (e.g., C) ignore overflow– Use MIPS addu, addui, subu instructions
• Other languages (e.g., Ada, Fortran) require raising an exception– Use MIPS add, addi, sub instructions– On overflow, invoke exception handler• Save PC in exception program counter (EPC) register• Jump to predefined handler address• mfc0 (move from coprocessor reg) instruction can retrieve EPC
value, to return after corrective action
SWE3005: Introduction to Computer Architectures, Fall 2019, Jinkyu Jeong ([email protected]) 10
Arithmetic for Multimedia
• Graphics and media processing – Operates on vectors of 8-bit and 16-bit data– Use 64-bit adder, with partitioned carry chain• Operate on 8×8-bit, 4×16-bit, or 2×32-bit vectors
– SIMD (single-instruction, multiple-data)
• Saturating operations– On overflow, result is largest representable value• c.f. 2s-complement modulo arithmetic
– E.g., clipping in audio, saturation in video
SWE3005: Introduction to Computer Architectures, Fall 2019, Jinkyu Jeong ([email protected])
Integer Multiplication
SWE3005: Introduction to Computer Architectures, Fall 2019, Jinkyu Jeong ([email protected]) 12
Multiplication
• Start with long-multiplication approach
1000× 1001
10000000 0000 1000 1001000
Length of product is the sum of operand lengths
multiplicand
multiplier
product
SWE3005: Introduction to Computer Architectures, Fall 2019, Jinkyu Jeong ([email protected]) 13
Multiplication Hardware
Initially 0
SWE3005: Introduction to Computer Architectures, Fall 2019, Jinkyu Jeong ([email protected]) 14
Multiplication Hardware: Example
• Initially
1000× 1001
10000000 0000 1000 1001000
multiplicand
multiplier
product
8 bits
8 bits
4 bits
00001000
1001
00000000
SWE3005: Introduction to Computer Architectures, Fall 2019, Jinkyu Jeong ([email protected]) 15
Multiplication Hardware: Example
• Iteration 1: after addition
1000× 1001
10000000 0000 1000 1001000
multiplicand
multiplier
product
8 bits
8 bits
4 bits
00001000
1001
00001000Write = 1
add
00001000
00000000
100001000
SWE3005: Introduction to Computer Architectures, Fall 2019, Jinkyu Jeong ([email protected]) 16
Multiplication Hardware: Example
• Iteration 1: after shift
1000× 1001
10000000 0000 1000 1001000
multiplicand
multiplier
product
8 bits
8 bits
4 bits
00010000
0100
00001000
Shift
Shift
SWE3005: Introduction to Computer Architectures, Fall 2019, Jinkyu Jeong ([email protected]) 17
Multiplication Hardware: Example
• Iteration 2 begins
1000× 1001
100000000000 1000 1001000
multiplicand
multiplier
product
8 bits
8 bits
4 bits
00010000
0100
00001000Write = 0
add
00010000
00001000
00011000
SWE3005: Introduction to Computer Architectures, Fall 2019, Jinkyu Jeong ([email protected]) 18
Multiplication Hardware: Example
• Iteration 2 after shift
1000× 1001
100000000000 1000 1001000
multiplicand
multiplier
product
8 bits
8 bits
4 bits
00100000
0010
00001000
SWE3005: Introduction to Computer Architectures, Fall 2019, Jinkyu Jeong ([email protected]) 19
Multiplication Hardware: Example
• Iteration 3
1000× 1001
10000000 00001000 1001000
multiplicand
multiplier
product
8 bits
8 bits
4 bits
01000000
0001
00001000Write = 0Write = 0
add
00100000
00001000
00101000
SWE3005: Introduction to Computer Architectures, Fall 2019, Jinkyu Jeong ([email protected]) 20
Multiplication Hardware: Example
• Iteration 4
1000× 1001
10000000 0000 10001001000
multiplicand
multiplier
product
8 bits
8 bits
4 bits
10000000
0000
01001000Write = 1
add
01000000
00001000
01001000
SWE3005: Introduction to Computer Architectures, Fall 2019, Jinkyu Jeong ([email protected]) 21
Optimized Multiplier• Perform steps in parallel: add/shift– Product register: left half is 0 and right-half is multiplier
• One cycle per partial-product addition– That’s ok, if frequency of multiplications is low
000…000 multiplier
SWE3005: Introduction to Computer Architectures, Fall 2019, Jinkyu Jeong ([email protected]) 22
Optimized Multiplier: Example
• Initially
1000× 1001
10000000 0000 1000 1001000
multiplicand
multiplier
product
1000
0000 1001
4 bits
8 bits
SWE3005: Introduction to Computer Architectures, Fall 2019, Jinkyu Jeong ([email protected]) 23
Optimized Multiplier: Example
• Iteration 1 before shift
1000× 1001
10000000 0000 1000 1001000
multiplicand
multiplier
product
1000
1000 1001
4 bits
8 bits
0000
write = 1
1000
SWE3005: Introduction to Computer Architectures, Fall 2019, Jinkyu Jeong ([email protected]) 24
Optimized Multiplier: Example
• Iteration 1 after shift
1000× 1001
10000000 0000 1000 1001000
multiplicand
multiplier
product
1000
0100 0100
4 bits
8 bits
0000
write = 1
1000
SWE3005: Introduction to Computer Architectures, Fall 2019, Jinkyu Jeong ([email protected]) 25
Optimized Multiplier: Example
• Iteration 2
1000× 1001
10000000 0000 1000 1001000
multiplicand
multiplier
product
1000
0010 0010
4 bits
8 bits
0100
write = 0
1100
SWE3005: Introduction to Computer Architectures, Fall 2019, Jinkyu Jeong ([email protected]) 26
Optimized Multiplier: Example
• Iteration 3
1000× 1001
10000000 0000 1000 1001000
multiplicand
multiplier
product
1000
0001 0001
4 bits
8 bits
0010
write = 0
1010
SWE3005: Introduction to Computer Architectures, Fall 2019, Jinkyu Jeong ([email protected]) 27
Optimized Multiplier: Example
• Iteration 4
1000× 1001
10000000 0000 1000 1001000
multiplicand
multiplier
product
1000
0100 1000
4 bits
8 bits
0001
write = 1
1001
SWE3005: Introduction to Computer Architectures, Fall 2019, Jinkyu Jeong ([email protected]) 28
Faster Multiplier• Uses multiple adders– Cost/performance tradeoff
• Can be pipelined– Several multiplication performed in parallel
SWE3005: Introduction to Computer Architectures, Fall 2019, Jinkyu Jeong ([email protected]) 29
MIPS Multiplication
• Two 32-bit registers for product– HI: most-significant 32 bits– LO: least-significant 32-bits
• Instructions– mult rs, rt / multu rs, rt• 64-bit product in HI/LO
– mfhi rd / mflo rd• Move from HI/LO to rd• Can test HI value to see if product overflows 32 bits
– mul rd, rs, rt• A pseudo instruction• Least-significant 32 bits of product –> rd
SWE3005: Introduction to Computer Architectures, Fall 2019, Jinkyu Jeong ([email protected])
Integer Division
SWE3005: Introduction to Computer Architectures, Fall 2019, Jinkyu Jeong ([email protected]) 31
Division
• Check for 0 divisor• Long division approach– If divisor ≤ dividend bits• 1 bit in quotient, subtract
– Otherwise• 0 bit in quotient, bring down next dividend bit
• Restoring division– Do the subtract, and if remainder goes < 0, add
divisor back
• Signed division– Divide using absolute values– Adjust sign of quotient and remainder as
required
10011000 1001010
-100010101 1010-1000
10
n-bit operands yield n-bitquotient and remainder
quotient
dividend
remainder
divisor
SWE3005: Introduction to Computer Architectures, Fall 2019, Jinkyu Jeong ([email protected]) 32
Division Hardware
Initially dividend
Initially divisor in left half
SWE3005: Introduction to Computer Architectures, Fall 2019, Jinkyu Jeong ([email protected]) 33
Division Hardware: Example
• InitiallyInitially divisor in
left half
10011000 1001010
-100010101 1010-1000
10
quotient
dividend
remainder
divisor
10000000
01001010
0000
SWE3005: Introduction to Computer Architectures, Fall 2019, Jinkyu Jeong ([email protected]) 34
Division Hardware: Example
• Iteration 1-1 Initially divisor in
left half
10011000 1001010
-100010101 1010-1000
10
quotient
dividend
remainder
divisor
10000000
11001010
0000sub
10000000
01001010
11001010
SWE3005: Introduction to Computer Architectures, Fall 2019, Jinkyu Jeong ([email protected]) 35
Division Hardware: Example
• Iteration 1-2b Initially divisor in
left half
10011000 1001010
-100010101 1010-1000
10
quotient
dividend
remainder
divisor
10000000
11001010
0000add
10000000
11001010
01001010
negative
shift with 0
SWE3005: Introduction to Computer Architectures, Fall 2019, Jinkyu Jeong ([email protected]) 36
Division Hardware: Example
• Iteration 1-3 Initially divisor in
left half
10011000 1001010
-100010101 1010-1000
10
quotient
dividend
remainder
divisor
01000000
01001010
0000
shift right
SWE3005: Introduction to Computer Architectures, Fall 2019, Jinkyu Jeong ([email protected]) 37
Division Hardware: Example
• Iteration 2-1 Initially divisor in
left half
10011000 1001010
-100010101 1010-1000
10
quotient
dividend
remainder
divisor
01000000
00001010
0000
01000000
01001010
00001010
sub
SWE3005: Introduction to Computer Architectures, Fall 2019, Jinkyu Jeong ([email protected]) 38
Division Hardware: Example
• Iteration 2-2a Initially divisor in
left half
10011000 1001010
-100010101 1010-1000
10
quotient
dividend
remainder
divisor
01000000
00001010
0001
positive
shift left with 1
SWE3005: Introduction to Computer Architectures, Fall 2019, Jinkyu Jeong ([email protected]) 39
Division Hardware: Example
• Iteration 2-3Initially divisor in
left half
10011000 1001010
-100010101 1010-1000
10
quotient
dividend
remainder
divisor
00100000
00001010
0001
shift right
SWE3005: Introduction to Computer Architectures, Fall 2019, Jinkyu Jeong ([email protected]) 40
Optimized Divider
• One cycle per partial-remainder subtraction
• Looks a lot like a multiplier!– Same hardware can be used for both
SWE3005: Introduction to Computer Architectures, Fall 2019, Jinkyu Jeong ([email protected]) 41
Faster Division
• Can’t use parallel hardware as in multiplier– Subtraction is conditional on sign of remainder
• Faster dividers (e.g. SRT devision) generate multiple quotient bits per step– Still require multiple steps
SWE3005: Introduction to Computer Architectures, Fall 2019, Jinkyu Jeong ([email protected]) 42
MIPS Division
• Use HI/LO registers for result– HI: 32-bit remainder– LO: 32-bit quotient
• Instructions– div rs, rt / divu rs, rt– No overflow or divide-by-0 checking• Software must perform checks if required
– Use mfhi, mflo to access result