Computer Science 210 Computer Organization Floating Point Representation.

18
Computer Science 210 Computer Organization Floating Point Representation

Transcript of Computer Science 210 Computer Organization Floating Point Representation.

Page 1: Computer Science 210 Computer Organization Floating Point Representation.

Computer Science 210Computer Organization

Floating Point Representation

Page 2: Computer Science 210 Computer Organization Floating Point Representation.

Real Numbers

Format 1: <whole part>.<fractional part>

Examples: 0.25, 3.1415 …

Format 2 (normalized form): <digit>.<fractional part> × <exponent>

Example: 2.5 × 10-1

In mathematics, infinite range and infinite precision (“uncountably infinite”)

Page 3: Computer Science 210 Computer Organization Floating Point Representation.

math.pi

>>> import math

>>> math.pi3.141592653589793

>>> print(math.pi)3.14159265359

>>> print("%.50f" % math.pi)3.14159265358979311599796346854418516159057617187500

Looks like about 48 places of precision (in base10)

Page 4: Computer Science 210 Computer Organization Floating Point Representation.

IEEE StandardSingle precision: 32 bits

Double precision: 64 bits

2.5 × 10-1

Reserve some bits for the significand (the digits to the left of ×) and some for the exponent (the stuff to the right of ×)

Double precision uses 53 bits for the significand, 11 bits for the exponent, and one sign bit

Approximate double precision range is 10-308 to 10308

Page 5: Computer Science 210 Computer Organization Floating Point Representation.

IEEE Single Precision Format

32 bits

Roughly (-1)S x F x 2E

F is related to the significandE is related to the exponent

Rough rangeSmall fractions 2 x 10-38

Large fractions 2 x 1038

S Exponent Significand 1 8 23

Page 6: Computer Science 210 Computer Organization Floating Point Representation.

Fractions in Binary

In general, 2-N = 1/2N

0.12 = 1 × 2-1 = 1 × ½ = 0.510

0.012 = 1 × 2-2 = 1 × ¼ = 0.2510

0.112 = 1 × ½ + 1 × ¼ = 0.7510

Page 7: Computer Science 210 Computer Organization Floating Point Representation.

Decimal to Binary Conversion (Whole Numbers)

While N > 0 doSet N to N/2 (whole part)Record the remainder (1 or 0)

Set A to remainders in reverse order

Page 8: Computer Science 210 Computer Organization Floating Point Representation.

Decimal to Binary - Example

• Example: Convert 32410 to binary N Rem N Rem324162 0 5 0 81 0 2 1 40 1 1 0 20 0 0 1 10 0

• 32410 = 1010001002

Page 9: Computer Science 210 Computer Organization Floating Point Representation.

Decimal to Binary - Fractions

•While N > 0 (or enough bits) do

Set N to N*2 (whole part)Record the whole number part (1 or

0)Set N to fraction part

Set bits to sequence of whole number parts (in order obtained)

Page 10: Computer Science 210 Computer Organization Floating Point Representation.

Decimal fraction to binary - Example

• Example: Convert .6562510 to binary N Whole Part .65625

1.31250 1 0.6250 0 1.250 1 0.50 0 1.0 1 .6562510 = .101012

Page 11: Computer Science 210 Computer Organization Floating Point Representation.

Decimal fraction to binary - Example

• Example: Convert .4510 to binary N Whole Part .45

0.9 0 1.8 1 1.6 1 1.2 1 0.4 0 0.8 0 1.6 1 .4510 = .011100110011…2

Page 12: Computer Science 210 Computer Organization Floating Point Representation.

Round-Off Errors>>> 0.10.1

>>> print("%.48f" % 0.1)0.100000000000000005551115123125782702118158340454

>>> print("%.48f" % 0.25)0.250000000000000000000000000000000000000000000000

>>> print("%.48f" % 0.3)0.299999999999999988897769753748434595763683319092

Caused by conversion of decimal fractions to binary

Page 13: Computer Science 210 Computer Organization Floating Point Representation.

Scientific Notation - Decimal

• Number Normalized

Scientific 0.000000001 1.0 x 10-9

5,326,043,000 5.326043 x 109

Page 14: Computer Science 210 Computer Organization Floating Point Representation.

Floating Point

• IEEE Single Precision Standard (32 bits)

• Roughly (-1)S x F x 2E

– F is related to significand– E is related to exponent

• Rough range– Small fractions 2 x 10-38

– Large fractions 2 x 1038

S Exponent Significand 1 8 23

Page 15: Computer Science 210 Computer Organization Floating Point Representation.

Floating Point – Exponent Field

• This comes before significand for sorting purposes• With 8 bit exponent range would be –128 to 127• Note: -1 would be 11111111 and with simple

sorting would appear largest.• For this reason, we take the exponent, add 127 and

represent this as unsigned. This is called bias 127.• Then exponent field 11111111 (255) would

represent 255 - 127 = 128. Also 00000000 (0) would represent 0 - 127 = -127.

• Range of exponents is -127 to 128

Page 16: Computer Science 210 Computer Organization Floating Point Representation.

Floating Point – Significand

• Normalized form:1.1011… x 2E

• Hidden bit trick:Since the bit to left of binary point is

always 1, why store it?

• We don’t.

• Number = (-1)S x (1+Significand) x 2E-127

Page 17: Computer Science 210 Computer Organization Floating Point Representation.

Floating Point Example: Convert 312.875 to IEEE

Step 1. Convert to binary: 100111000.111

Step 2. Normalize: 1.00111000111 x 28

Step 3. Compute biased exponent in binary: 8 + 127 = 135 10000111

Step 4. Write the floating point representation:

0 10000111 00111000111000000000000

or 439C7000 in hexadecimal

Page 18: Computer Science 210 Computer Organization Floating Point Representation.

Floating Point Example: Convert IEEE 11000000101000… to decimal

Step 1. Sign bit is 1; so number is negative

Step 2. Exponent field is 10000001 or 129; so actualexponent is 2

Step 3. Significand is 010000…; so 1 + Significand is1.0100…

Step 4. Number = (-1)S x (1+Significand) x 2E-127

= (-1)1 x (1.010) x 22

= -101

= -5