Floating Point Arithmetic
• The goal of floating point representation is to represent a large range of numbers
• Important Terms: given the number -123.154 x 10^5
– Sign = negative
– Mantissa = 123.154
– Exponent = 5
IEEE Binary Floating-Point Representation
Storage of Floating Point Binary Numbers
(Short Real or Single Precision Format)
Bit 31 | Bits 30-23 | Bits 22-0
1 | 11111111 | 11111111111111111111111
Sign | Exponent | Mantissa
Long Real (double precision, 64 bits): 1 bit for sign, 11 bits for exponent, 52 bits for mantissa
Storage Components
• The Sign
– The sign is positive (a 0 bit) or negative (a 1 bit).
• The Mantissa (Significand)
– The bits to the right of the binary point are the mantissa, or significand.
– The numeral to the left of the binary point is ALWAYS 1 (normalized notation).
• The Exponent
– The exponent can be either positive or negative. The stored exponent is biased by +127.
The Significand (Positional Notation)
The Significand Must be Normalized
• 1234.567 = 1.234567 x 10^3
• Numbers are normalized by moving the decimal point so that only one digit appears to the left of the decimal point.
• 1101.101 = 1.101101 x 2^3 (exponent = 3)
• 0.00101 = 1.01 x 2^-3 (exponent = -3)
• Note that the leading 1 is omitted from storage
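The two binary examples above can be checked with a short Python sketch. Python is used here only for illustration; the helper name `normalize` is an assumption, not part of the slides. It relies on `math.frexp`, which returns a fraction in [0.5, 1), so the result is shifted by one place to get the 1.xxx form.

```python
import math

def normalize(value):
    """Return (significand, exponent) with 1.0 <= |significand| < 2.0."""
    if value == 0:
        raise ValueError("zero has no normalized form")
    # math.frexp gives value = m * 2**e with 0.5 <= |m| < 1,
    # so move the binary point one more place to reach 1.xxx.
    m, e = math.frexp(value)
    return m * 2, e - 1

# 1101.101b = 13.625 and 0.00101b = 0.15625
print(normalize(13.625))   # (1.703125, 3); 1.703125 is 1.101101b
print(normalize(0.15625))  # (1.25, -3);    1.25 is 1.01b
```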
IEEE Bit Representation
The Exponent is Biased by +127
Exponent Encoding
• Exponent encoding uses a bias of 127. To get the encoding, take the exponent and add 127 to it.
• If exponent is -1, then exponent field = -1 + 127 = 126 = 7Eh
• If exponent is 10, then exponent field = 10 + 127 = 137 = 89h
• Smallest allowed exponent is -126, largest allowed exponent is +127. This leaves the encodings 00h and FFh unused for normal numbers.
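A minimal sketch of the bias rule above, assuming the single precision bias of 127; the function name `exponent_field` is hypothetical.

```python
BIAS = 127  # single precision bias, as stated on the slide

def exponent_field(exponent):
    """Encode a true exponent as the stored 8-bit exponent field."""
    if not -126 <= exponent <= 127:
        raise ValueError("outside the range allowed for normal numbers")
    return exponent + BIAS

for e in (-1, 10, -126, 127):
    field = exponent_field(e)
    print(f"{e:5d} -> {field:3d} = {field:02X}h = {field:08b}")
```

The two extremes map to 01h and FEh, which is why 00h and FFh remain free for special numbers.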
BR 6/00
Floating Point Encoding
• The number of bits allocated for the exponent determines the maximum and minimum floating point magnitudes (range): 1.0 x 2^-max (small number) to 1.0 x 2^+max (large number)
• The number of bits allocated for the significand determines the precision of the floating point number
• The sign needs only one bit (negative: 1, positive: 0)
Convert Floating Point Binary Format to Decimal
1 10000001 01000000000000000000000
• What is the number shown?
• Sign bit = 1, so negative.
• Exponent field = 81h = 129. Actual exponent = Exponent field - 127 = 129 - 127 = 2.
• Number is: -1.(01000...000) x 2^2
= -(1 + 0 x 2^-1 + 1 x 2^-2 + 0 x 2^-3 + ... + 0) x 4
= -(1 + 0 + 0.25 + 0 + ... + 0) x 4
= -1.25 x 4
= -5.0
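The decoding steps above can be sketched in Python as follows. The function name `decode_single` is an assumption for illustration, and the sketch deliberately handles only normal numbers (exponent field 1 to 254).

```python
def decode_single(bits):
    """Decode a 32-bit pattern given as a string of 0s and 1s (spaces allowed)."""
    bits = bits.replace(" ", "")
    if len(bits) != 32:
        raise ValueError("expected exactly 32 bits")
    sign = int(bits[0])
    exp_field = int(bits[1:9], 2)
    frac_field = int(bits[9:], 2)
    if exp_field in (0, 255):
        raise ValueError("zero, denormal, infinity or NaN: not handled in this sketch")
    significand = 1 + frac_field / 2 ** 23   # restore the hidden leading 1
    value = significand * 2.0 ** (exp_field - 127)
    return -value if sign else value

print(decode_single("1 10000001 01000000000000000000000"))  # -5.0
```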
Convert FP Decimal to binary encoding
What is the number -28.75 in Single Precision Floating Point?
1. Ignore the sign, convert integer and fractional part to binary representation first:
a. 28 = 1Ch = 0001 1100
b. .75 = .5 + .25 = 2^-1 + 2^-2 = .11
-28.75 in binary is -00011100.11 (ignore leading zeros)
2. Now NORMALIZE the number to the format 1.mmmm x 2^exp
Normalize by shifting. Each shift right adds one to the exponent; each shift left subtracts one from the exponent:
-11100.11 x 2^0 = -1110.011 x 2^1
= -111.0011 x 2^2 = -11.10011 x 2^3 = -1.110011 x 2^4
Convert Decimal FP to binary encoding (cont)
Normalized number is: -1.110011 x 2^4
Sign bit = 1
Significand field = 110011000...000
Exponent field = 4 + 127 = 131 = 83h = 1000 0011
Complete 32-bit number is: 1 10000011 110011000...000 (sign, exponent, mantissa)
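One way to cross-check the hand conversion is to let Python's `struct` module produce the single precision encoding and split it into the three fields. This is an illustrative sketch; the helper name `single_fields` is an assumption.

```python
import struct

def single_fields(value):
    """Return 'sign exponent mantissa' bit strings for a single precision value."""
    (word,) = struct.unpack(">I", struct.pack(">f", value))
    sign = word >> 31
    exponent = (word >> 23) & 0xFF
    mantissa = word & 0x7FFFFF
    return f"{sign:01b} {exponent:08b} {mantissa:023b}"

print(single_fields(-28.75))  # 1 10000011 11001100000000000000000
```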
Algorithm for converting fractional decimal to Binary
• An algorithm for converting any fractional decimal number to its binary representation is successive multiplication by two (which shifts the value left). It determines bits from MSB to LSB.
a. Multiply the fraction by 2.
b. If the number >= 1.0, then the current bit = 1, else the current bit = 0.
c. Take the fractional part of the number and go to 'a'. Continue until the fractional number is 0 or the desired precision is reached.
• Example: Convert .5625 to binary
.5625 x 2 = 1.125 (>= 1.0, so MSB bit = '1')
.125 x 2 = .25 (< 1.0, so bit = '0')
.25 x 2 = .5 (< 1.0, so bit = '0')
.5 x 2 = 1.0 (>= 1.0, so bit = '1'), finished.
.5625 = .1001b
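The successive multiplication algorithm can be sketched in Python as follows. The function name and the `max_bits` cutoff (standing in for "desired precision") are assumptions for illustration.

```python
def fraction_to_binary(fraction, max_bits=23):
    """Convert a fraction in [0, 1) to a binary string, MSB first."""
    if not 0 <= fraction < 1:
        raise ValueError("expected a value in [0, 1)")
    bits = []
    while fraction != 0 and len(bits) < max_bits:
        fraction *= 2                 # multiply the fraction by 2
        if fraction >= 1.0:
            bits.append("1")          # current bit = 1
            fraction -= 1.0           # keep only the fractional part
        else:
            bits.append("0")          # current bit = 0
    return "." + "".join(bits)

print(fraction_to_binary(0.5625))           # .1001
print(fraction_to_binary(0.1, max_bits=8))  # .00011001
```

The second call shows why the precision limit matters: 0.1 has no finite binary expansion, so the loop would otherwise run until the stored value is exhausted.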
Overflow/Underflow, Double Precision
• Overflow in floating point means producing a number whose magnitude is too big to represent; underflow means producing one whose magnitude is too small.
– Depends on exponent size
– Min/max exponents of 2^-126 to 2^+127 correspond to roughly 10^-38 to 10^+38.
• To increase the range, the number of bits in the exponent field must be increased.
• Double precision numbers are 64 bits: 1 sign bit, 11 bits for exponent, 52 bits for significand.
• Extra bits in the significand give more precision, not extended range.
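The single precision limits quoted above can be reproduced with a small Python sketch. Python floats are doubles, so the helper `to_single` (a hypothetical name) rounds through the 4-byte format using `struct`; in CPython, packing a value that is too large for that format raises `OverflowError`.

```python
import struct

MIN_NORMAL = 2.0 ** -126                       # smallest normal magnitude
MAX_NORMAL = (2 - 2.0 ** -23) * 2.0 ** 127     # largest finite magnitude

def to_single(x):
    """Round a Python float (a double) to single precision and back."""
    return struct.unpack(">f", struct.pack(">f", x))[0]

print(MIN_NORMAL)          # roughly 1.18e-38
print(MAX_NORMAL)          # roughly 3.40e+38
print(to_single(1e-50))    # underflow: too small for single precision, gives 0.0
try:
    to_single(1e39)        # overflow: too large for single precision
except OverflowError as err:
    print("overflow:", err)
print(1e39)                # still fine as a double, thanks to the 11-bit exponent
```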
Special Numbers
• Min/max exponents are 2^-126 to 2^+127. This corresponds to exponent field values of 1 to 254.
• The exponent field values 0 and 255 are reserved for special numbers. Special numbers are zero, +/- infinity, and NaN (not a number).
• Zero is represented by ALL FIELDS = 0 (setting only the sign bit gives negative zero).
• +/- Infinity is Exponent field = 255 = FFh, significand = 0. +/- Infinity is produced by any nonzero number divided by 0.
• NaN (Not A Number) is Exponent field = 255 = FFh, significand = nonzero. NaN is produced by invalid operations like zero divided by zero, or infinity - infinity.
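The reserved encodings can be inspected with a short Python sketch. The helper name `exp_and_fraction` is an assumption. Note that Python itself raises `ZeroDivisionError` for `1.0 / 0.0` rather than returning infinity, so the sketch builds the special values with `float("inf")` and infinity minus infinity instead.

```python
import struct

def exp_and_fraction(value):
    """Return (exponent field, significand field) of the single precision encoding."""
    (word,) = struct.unpack(">I", struct.pack(">f", value))
    return (word >> 23) & 0xFF, word & 0x7FFFFF

inf = float("inf")
nan = inf - inf            # an invalid operation, so the result is NaN

print(exp_and_fraction(0.0))    # (0, 0)
print(exp_and_fraction(inf))    # (255, 0)
print(exp_and_fraction(-inf))   # (255, 0); only the sign bit differs
exp, frac = exp_and_fraction(nan)
print(exp, frac != 0)           # 255 True
```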
Comments on IEEE Format
• Sign bit is placed in the MSB for a reason: a quick test of the MSB sorts floating point numbers by sign.
• If the sign bits are the same, then extracting and comparing the exponent fields can be used to sort floating point numbers. A larger exponent field means a larger magnitude, since the 'bias' encoding is used.
• Nearly all microprocessors that support floating point use the IEEE 754 standard. Only a few supercomputers still use different formats.
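The sorting property can be illustrated in Python: for positive values, ordering by the raw 32-bit pattern (treated as an unsigned integer) matches ordering by numeric value. This is a sketch; the helper name `bits` and the sample values are assumptions.

```python
import struct

def bits(value):
    """Raw single precision bit pattern as an unsigned 32-bit integer."""
    return struct.unpack(">I", struct.pack(">f", value))[0]

values = [3.5, 0.001, 1e10, 1.0, 2.0, 0.75]   # all positive
by_value = sorted(values)
by_bits = sorted(values, key=bits)
print(by_value == by_bits)   # True

# A negative number has its MSB set, so a single bit test separates the signs.
print(bits(-1.0) >> 31, bits(1.0) >> 31)   # 1 0
```

For two negative numbers the comparison is reversed, since a larger magnitude means a smaller value.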
Assigning Storage for Large Numbers
• Dd (define doubleword) – 4-byte storage; a real number stored as a doubleword is called a short real.
– Dd 12345.678
– Dd +1.5E+02
– Dd 2.56E+38 ;largest positive exponent
– Dd 3.3455E-39 ;largest negative exponent
• Dq (define quadword) – 8-byte storage; long real number (double in C, C++ and Visual)
– Dq 2.56E+307 ;largest exponent
JM 11/02
Floating Point Architecture (8087 Coprocessor)
• So far we have only dealt with integers
• The 8087 was the math coprocessor for the original PC.
• With the 486, the FPU (floating point unit) became part of the CPU chip.
• We will only look at the instruction set of the original 8087 chip.
• Handles both integer and floating point calculations.
Floating Point Registers
80-bit Registers: ST(0) = ST, ST(1), ST(2), ST(3), ST(4), ST(5), ST(6), ST(7)
32-bit Registers: Instruction Pointer, Operand Pointer
16-bit Registers: Control Word, Status Word, Tag Word
Floating Point Unit (Coprocessor)
• Data Registers
– 8 individually addressable 80-bit registers (ST(0), ST(1), ST(2)…ST(7))
– Arranged in stack format
– ST(0) = ST -> top of stack
• Control Registers
– 3 16-bit registers (control, status, tag)
– 2 32-bit registers (instruction pointer, operand pointer)
Floating Point Data Register Stack
Transfer of Data
• Data must be in memory to be sent to the coprocessor (not in a CPU register)
• The coprocessor loads the number from memory into its register stack, performs an arithmetic operation, stores the result in memory, and signals the CPU that it has finished.
Instruction Formats
• Begins with the letter F (to distinguish from CPU instructions)
• 2nd letter
– B: binary coded decimal operand
– I: binary integer operand
– neither: assume real number format
• Examples
– FBLD – load BCD number
– FILD – load integer number
– FMUL – real number multiply
• Cannot use CPU registers (such as AX, BX) as operands
Floating Point Operations
• Add – Add source to destination
• Sub – Subtract source from destination
• Subr – Subtract destination from source
• Mul – Multiply source by destination
• Div – Divide destination by source
• Divr – Divide source by destination
Basic Arithmetic Instructions
Instruction Form | Mnemonic Form | Operands (Dest, Source) | Example
Classical Stack | Fop | {ST(1), ST} | FADD
Classical Stack, Extra Pop | FopP | {ST(1), ST} | FSUBP
Register | Fop | ST(n), ST or ST, ST(n) | FMUL ST(1),ST and FDIV ST,ST(3)
Register, Pop | FopP | ST(n), ST | FADDP ST(2),ST
Real Memory | Fop | {ST}, memReal | FDIVR
Integer Memory | FIop | {ST}, memInt | FISUBR hours
Instruction Forms
• Classical stack
– No explicit operands needed (ST is the source; ST(1) is the destination)
– FADD ; ST(1) = ST(1) + ST; pop ST
– FSUB ; ST(1) = ST(1) - ST; pop ST
Before FADD: ST = 100.0, ST(1) = 20.0
After FADD: ST = 120.0
Instruction Forms
• Register
– Uses coprocessor registers as ordinary operands (one must be ST)
FADD st, st(1) ;st = st + st(1)
FDIVR st, st(3) ;st = st(3) / st
FMUL st(2), st ;st(2) = st(2) * st
Instruction Forms
• Register Pop
– Identical to register except ST is popped at the end
– FADDP st(1), st ; ST(1) = ST(1) + ST
; pop ST
; the old ST(1) becomes ST(0)
Before: ST = 200.0, ST(1) = 32.0
Intermediate: ST = 200.0, ST(1) = 232.0
After: ST = 232.0
Instruction Forms
• Real Memory and Integer Memory
– Have an implied first operand, ST
– Second operand, explicit, is an integer or real memory operand
– FADD Myreal_op ;st = st + myreal_op
– FIADD MyInteger_op ;st = st + myinteger_op
Initialize Instruction: finit
• Finit – initialize floating point processor
– Should come first in code
– Clears registers
Load Instructions: fld, fild
• Fld – load a real memory operand into ST(0)
• Fild – load an integer memory operand into ST(0)
.data
op1 dd 6.0 ;floating point value
op2 dw 3 ;integer value
.code
finit
fld op1 ;ST = 6.0, ST(1) = ??
fild op2 ;integer load: ST = 3.0, ST(1) = 6.0
Store Instructions: fst, fstp
• fst mem_location
– (Float store)
– Store value in ST into memory
• fstp mem_location
– (Float store, and pop)
– Store value in ST(0) into memory and then pop stack
Reverse Polish Notation (operands are keyed in before their operators)
Evaluating a postfix expression: 6 2 * 5 +
– When reading an operand from input
• push it on the stack
– When reading an operator from input
• pop the two operands located at the top of the stack
• perform the selected operation on the operands
• push the result back on the stack.
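The evaluation rules above can be sketched as a small Python function. The name `eval_postfix` and the four supported operators are assumptions for illustration; tokens are assumed to be separated by spaces.

```python
import operator

OPS = {"+": operator.add, "-": operator.sub,
       "*": operator.mul, "/": operator.truediv}

def eval_postfix(expression):
    """Evaluate a space-separated postfix (RPN) expression."""
    stack = []
    for token in expression.split():
        if token in OPS:
            right = stack.pop()          # operand on top of the stack
            left = stack.pop()           # operand just below it
            stack.append(OPS[token](left, right))
        else:
            stack.append(float(token))   # operand: push it
    if len(stack) != 1:
        raise ValueError("malformed expression")
    return stack[0]

print(eval_postfix("6 2 * 5 +"))  # 17.0
```

The FPU register stack works the same way: loads push operands, and the classical-stack arithmetic instructions combine the top two entries and leave one result.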
TITLE FPU Expression Evaluation (Expr.asm)

; Implementation of the following expression:
; (6.0 * 2.0) + (4.5 * 3.2)
; FPU instructions used.
; Last update: 10/8/01

INCLUDE Irvine32.inc ; 32-bit Protected mode program.

.data
array REAL4 6.0, 2.0, 4.5, 3.2
dotProduct REAL4 ?

.code
main PROC
finit ; initialize FPU
fld array ; push 6.0 onto the stack
fmul array+4 ; ST(0) = 6.0 * 2.0
fld array+8 ; push 4.5 onto the stack
fmul array+12 ; ST(0) = 4.5 * 3.2
fadd ; ST(0) = ST(0) + ST(1)
fstp dotProduct ; pop stack into memory operand
exit
main ENDP
END main
Register Stack Example
Instruction | Register Stack
fld op1 | ST = 6.0
fld op2 | ST = 2.0, ST(1) = 6.0
fmul | ST = 12.0
fld op3 | ST = 5.0, ST(1) = 12.0
fsub | ST = 7.0
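The trace above can be reproduced with a toy Python model of the register stack. This is only a sketch of the classical-stack behavior described in these slides, not an emulation of the real FPU: the class `FpuStack` and its methods are hypothetical, values are ordinary Python floats rather than 80-bit reals, and the operand values 6.0, 2.0 and 5.0 are taken from the table.

```python
class FpuStack:
    """Toy model: regs[0] plays the role of ST(0), regs[1] of ST(1), and so on."""

    def __init__(self):
        self.regs = []

    def fld(self, value):
        if len(self.regs) == 8:
            raise OverflowError("all eight registers are in use")
        self.regs.insert(0, float(value))    # push: the new value becomes ST(0)

    def _classical(self, op):
        st0 = self.regs.pop(0)               # pop ST; old ST(1) is now ST(0)
        self.regs[0] = op(self.regs[0], st0) # result replaces the old ST(1)

    def fmul(self):
        self._classical(lambda st1, st0: st1 * st0)

    def fsub(self):
        self._classical(lambda st1, st0: st1 - st0)

fpu = FpuStack()
fpu.fld(6.0)     # ST = 6.0
fpu.fld(2.0)     # ST = 2.0, ST(1) = 6.0
fpu.fmul()       # ST = 12.0
fpu.fld(5.0)     # ST = 5.0, ST(1) = 12.0
fpu.fsub()       # ST = 7.0
print(fpu.regs)  # [7.0]
```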
Other Instructions
• fmul ;st(1) = st(1) * st(0), pop
• fdiv ;st(1) = st(1) / st(0), pop
• fdivr ;st(1) = st(0) / st(1), pop
• fsqrt ;st(0) = square root(st(0))
• fsin ;st(0) = sine(st(0))
• fcos ;st(0) = cosine(st(0))