3.1.3 Real Numbers and Normalised Floating-point Numbers


3.1.3 REAL NUMBERS AND NORMALISED FLOATING-POINT NUMBERS

Note: The leftmost 1 is in the 9th bit position, so in an 8-bit register the 9th bit is simply discarded. You are then left with 00000010, which is 2 in denary.

For example, subtracting 5 from 15 is really adding -5 to 15, but this is hidden by the two's-complement representation.
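The subtraction above can be sketched in Python. This is a minimal illustration that assumes an 8-bit register; the names `twos_complement` and `subtract` are made up for this example.

```python
BITS = 8
MASK = (1 << BITS) - 1  # 0b11111111: keeps only the lowest 8 bits


def twos_complement(value: int) -> int:
    """Return the 8-bit two's-complement pattern that represents -value."""
    return (~value + 1) & MASK


def subtract(a: int, b: int) -> int:
    """Compute a - b by adding the two's complement of b.

    The final mask discards the 9th bit (the carry out of the register).
    """
    return (a + twos_complement(b)) & MASK


minus_five = twos_complement(5)
print(format(minus_five, "08b"))        # 11111011  (-5)
print(format(15 + minus_five, "09b"))   # 100001010 (9 bits before the carry is dropped)
print(format(subtract(15, 5), "08b"))   # 00001010  (10 in denary)
```

The 9-bit intermediate result shows the extra leading 1 that an 8-bit register cannot hold.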

THE FORMAT OF BINARY FLOATING-POINT REAL NUMBERS NOTE: SYLLABUS STARTS HERE

A binary floating point number may consist of 2, 3 or 4 bytes; however the only ones you need to worry about are the 2 byte (16 bit) variety. The first 10 bits are the Mantissa; the last 6 bits are the exponent.

Just like the denary floating point representation, a binary floating point number will have a mantissa and an exponent, though as you are dealing with binary (base 2) the exponent is a power of 2 rather than a power of 10.

NOTE: After the most significant bit, place the decimal point (in the mantissa only)

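For example, 100.1 starts with a 1, so it is negative. Taking the two's complement gives 011.1, which is 3.5, so the number is -3.5 in denary.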

Convert binary floating-point real numbers into denary

There are several stages to take when working out a floating point number in binary. In fact it is much like a disco dance routine, known on this page as the Noorgat Dance (you won't be tested on the name, but it should help you to remember):
1. Sign - find the sign of the mantissa (make a note of this)
2. Slide - find the value of the exponent and whether it is positive or negative
3. Bounce - move the decimal point the distance the exponent asks, left for a negative exponent, right for a positive one
4. Flip - if the mantissa is negative, perform two's complement on it
5. Swim - starting at the decimal point, work out the values of the mantissa, going left, then right. Now make sure you refer back to the sign you recorded on the sign move.
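The five steps can be cross-checked with a short Python sketch. It is not the dance itself: it reads the mantissa and the exponent directly as two's-complement numbers (which covers sign, flip and swim in one go) and then multiplies by a power of 2 (the bounce). The function names are illustrative, and the bit strings are assumed to be passed without the decimal point.

```python
from fractions import Fraction


def from_twos_complement(bits: str) -> int:
    """Read a bit string as a signed two's-complement integer."""
    value = int(bits, 2)
    if bits[0] == "1":              # leading 1 means the number is negative
        value -= 1 << len(bits)
    return value


def decode(mantissa: str, exponent: str) -> Fraction:
    """Decode a floating point number given as two bit strings.

    The decimal point of the mantissa is assumed to sit after its first bit,
    so a 10-bit mantissa has 9 fractional bits.
    """
    m = from_twos_complement(mantissa)
    e = from_twos_complement(exponent)
    fractional_bits = len(mantissa) - 1
    return Fraction(m, 1 << fractional_bits) * Fraction(2) ** e


# Mantissa 0.001101000 with exponent 000110
print(float(decode("0001101000", "000110")))   # 13.0
```

Exact fractions are used so that the check itself does not introduce rounding errors.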

Example: binary floating point worked example

Let's try it out. We are given the following 16-bit floating point number, with 10 bits for the mantissa and 6 bits for the exponent. Remember the decimal point is between the first and second most significant bits:

0.100000000 000001

The first action we need to perform is the sign: find out the sign of the mantissa.

It is 0, so the mantissa is positive.

The second step in the Noorgat dance is the slide: we need to find the value of the exponent, that is, the last 6 bits of the number: 000001 = +1.

So we know that the exponent is positive one and we will have to move the decimal point one place to the right.

The third step in the Noorgat dance is the bounce, that is, moving the decimal point of the mantissa the number of positions specified by the slide, which was one position to the right. Like so:

0.100000000 -> 01.00000000

The fourth step is the optional flip. Check back to the sign stage and see if the mantissa is negative. It isn't? Oh well, you can skip past this stage then, as we only flip the number if the mantissa is negative.

The fifth and final step is the swim. Taking the mantissa on its own we can now work out the value of the floating point number. Start at the decimal point and label each digit to the left 1, 2, 4, 8 and so on. Then label each digit on the right 1/2, 1/4, 1/8 and so on.

Voila! The answer is 1.

Work out the denary for the following, using 10 bits for the mantissa and 6 bits for the exponent:

EX1) 0.001101000 000110

Answer:
1. Sign: the mantissa starts with a zero, therefore it is a positive number.
2. Slide: work out the value of the exponent: 000110 = +6
3. Bounce: we need to move the decimal point in the mantissa. In this case the exponent was positive, so we need to move the decimal point 6 places to the right: 0.001101000 -> 0001101.000
4. Flip: as the number isn't negative we don't need to do this.
5. Swim: work out the value on the left hand side and right hand side of the decimal point: 1+4+8 = +13
FINISHED!

EX2) 0 101000000 111111

Answer:
1. Sign: the mantissa starts with a zero, therefore it is a positive number.
2. Slide: work out the value of the exponent: 111111 starts with a one, therefore it is a negative number. Its two's complement is 000001, so the exponent is -1.
3. Bounce: we need to move the decimal point in the mantissa. In this case the exponent was negative, so we need to move the decimal point 1 place to the left: 0.101000000 -> 0.0101000000
4. Flip: as the mantissa isn't negative we don't need to do this.
5. Swim: work out the value on the left hand side and right hand side of the decimal point: 1/4 + 1/16 = +0.3125
FINISHED!

EX3) 1 011111010 000101

Answer:
1. Sign: the mantissa starts with a one, therefore it is a negative number.
2. Slide: work out the value of the exponent: 000101 = +5
3. Bounce: we need to move the decimal point in the mantissa. In this case the exponent was positive, so we need to move the decimal point 5 places to the right: 1.011111010 -> 101111.1010
4. Flip: the mantissa is negative as noted in step one, so we need to convert this number: 101111.1010 -> 010000.0110
5. Swim: work out the value on the left hand side and right hand side of the decimal point: 16 + 1/4 + 1/8 = 16.375, so the answer is -16.375. Remember the number was negative!
FINISHED!

EX4) 1 101000000 111101

Answer:
1. Sign: the mantissa starts with a one, therefore it is a negative number.
2. Slide: work out the value of the exponent: 111101 starts with a one, therefore it is a negative number. Its two's complement is 000011, so the exponent is -3.
3. Bounce: we need to move the decimal point in the mantissa. In this case the exponent was negative, so we need to move the decimal point 3 places to the left. Watch carefully! 1.101000000 -> 1.111101000000
   Note that we placed extra ones on the front of the number. If the exponent were negative and the mantissa positive, we would add extra zeros on the front: 0.01 * 2^-3 = 0.00001. If both are negative, placing zeros in front of the mantissa would make it positive! Therefore we need to add extra ones to keep the mantissa negative. With the flip we'll lose these 'extra' ones.
4. Flip: the mantissa is negative as noted in step one, so we need to convert this number: 1.111101000000 -> 0.000011000000
5. Swim: work out the value on the left hand side and right hand side of the decimal point: 1/32 + 1/64 = 0.046875, so the answer is -0.046875. Remember the number was negative!
FINISHED!

EX5) 1 111111010 000011

Answer:
1. Sign: the mantissa starts with a one, therefore it is a negative number.
2. Slide: work out the value of the exponent: 000011 = +3
3. Bounce: we need to move the decimal point in the mantissa. In this case the exponent was positive, so we need to move the decimal point 3 places to the right: 1.111111010 -> 1111.111010
4. Flip: the mantissa is negative as noted in step one, so we need to convert this number: 1111.111010 -> 0000.000110
5. Swim: work out the value on the left hand side and right hand side of the decimal point: 1/16 + 1/32 = 0.09375, so the answer is -0.09375. Remember the number was negative!
FINISHED!

CONVERTING DENARY INTO BINARY FLOATING-POINT

You might also be asked to convert a denary number into its binary floating point equivalent:
1. Work out the binary equivalent
2. Work out how far to move the binary point (y)
3. Set the exponent to be the reverse of the number of places you moved the binary point (-y)
4. Pad the number with extra bits

Example: denary to binary floating point

If we are asked to convert the denary number 39.75 into binary floating point, we first need to find out the binary equivalent:

128  64  32  16   8   4   2   1  .  1/2 1/4 1/8
  0   0   1   0   0   1   1   1  .   1   1   0

How far do we need to move the binary point to the left so that the number is normalised?

0 0 . 1 0 0 1 1 1 1 1 0 (6 places to the left)

So to get our binary point back to where it started, we need to move it 6 places to the right. 6 now becomes your exponent:

0.100111110 | 000110

If you want to check your answer, convert the number above back into denary. You get 39.75!

Work out the binary floating point for the following, using 10 bits for the mantissa and 6 bits for the exponent:

EXAMPLE 1: 67

Answer:

128  64  32  16   8   4   2   1  .  1/2 1/4 1/8
  0   1   0   0   0   0   1   1  .   0   0   0

How far do we need to move the binary point to the left so that the number is normalised?

0 . 1 0 0 0 0 1 1 0 0 0 (7 places to the left)

To get the front to be normalised we must move the binary point 7 places (moving it 6 places would have made the number negative!):

0.100001100 | 000111

EXAMPLE 2: 23.25

Answer:

128  64  32  16   8   4   2   1  .  1/2 1/4 1/8
  0   0   0   1   0   1   1   1  .   0   1   0

How far do we need to move the binary point to the left so that the number is normalised?

0 0 0 . 1 0 1 1 1 0 1 0 (5 places to the left)

To get the front to be normalised we must move the binary point 5 places (moving it 4 places would have made the number negative!):

0.101110100 | 000101

EXAMPLE 3: 123.80

Answer:

The whole part 123 is 01111011. The fractional part 0.80 does not terminate in binary: 0.80 = 0.110011001100... (recurring). So:

123.80 = 01111011.110011001100...

How far do we need to move the binary point to the left so that the number is normalised?

0 . 1 1 1 1 0 1 1 1 1 0 0 1 1 ... (7 places to the left)

To get the front to be normalised we must move the binary point 7 places. But this needs more than 10 bits for the mantissa, so we have to drop the extra bits, losing accuracy!

0.111101111 | 000111

The stored value is 1111011.11 = 123.75, not 123.80.

EXAMPLE 4: -513

Answer:

First write +513 in binary:

1024 512 256 128  64  32  16   8   4   2   1
   0   1   0   0   0   0   0   0   0   0   1

Convert this into its negative form using the flipping rule (two's complement):

1024 512 256 128  64  32  16   8   4   2   1
   1   0   1   1   1   1   1   1   1   1   1

How far do we need to move the binary point to the left so that the number is normalised?

1 . 0 1 1 1 1 1 1 1 1 1 (10 places to the left)

To get the front to be normalised we must move the binary point 10 places:

1.011111111 | 001010

Notice that we have had to drop the last one, as it would not have fitted into 10 bits for the mantissa. This means that the number stored is only 10111111110.0. Converting this into denary: its two's complement is 01000000010.0 = 514, so the stored value is -514, not -513.

You'll look at errors using floating point numbers very soon.
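The conversion steps can be sketched in Python as follows. This is a minimal illustration, not a definitive implementation: the function names are made up, bits that do not fit in the mantissa are simply dropped (as in the examples above), and there is no check that the exponent fits in 6 bits.

```python
from fractions import Fraction
import math


def to_bits(value: int, width: int) -> str:
    """Two's-complement bit string of `value` in `width` bits."""
    return format(value & ((1 << width) - 1), f"0{width}b")


def encode(number, mantissa_bits: int = 10, exponent_bits: int = 6):
    """Return (mantissa, exponent) bit strings for a normalised number.

    `number` may be an int or a decimal string such as "123.80".
    Assumption: the exponent is not range-checked in this sketch.
    """
    x = Fraction(number)
    if x == 0:
        return "0" * mantissa_bits, "0" * exponent_bits
    exponent = 0
    # Move the binary point left until the value fits in the mantissa.
    while x >= 1 or x < -1:
        x /= 2
        exponent += 1
    # Move the binary point right until the first two bits differ
    # (0.1... for positive numbers, 1.0... for negative numbers).
    while Fraction(-1, 2) <= x < Fraction(1, 2):
        x *= 2
        exponent -= 1
    fractional_bits = mantissa_bits - 1
    # Flooring drops any bits that do not fit (truncation).
    mantissa = math.floor(x * (1 << fractional_bits))
    return to_bits(mantissa, mantissa_bits), to_bits(exponent, exponent_bits)


print(encode("39.75"))   # ('0100111110', '000110')
print(encode(-513))      # ('1011111111', '001010')
```

Passing decimal strings rather than Python floats keeps values such as 123.80 exact before they are truncated to fit the mantissa.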

When you have a 16-bit number where the mantissa is 10 bits and the exponent is 6 bits:

The largest positive number will be: Mantissa: 0.111111111 Exponent: 011111
The smallest positive number will be: Mantissa: 0.000000001 Exponent: 100000
The largest negative number (the one furthest from zero) will be: Mantissa: 1.000000000 Exponent: 011111
The smallest negative number (the one closest to zero) will be: Mantissa: 1.111111111 Exponent: 100000
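To see what these four bit patterns are worth in denary, the sketch below evaluates each one. The helper `value` and the dictionary `extremes` are illustrative names, and the patterns are written without the decimal point.

```python
from fractions import Fraction


def value(mantissa: str, exponent: str) -> Fraction:
    """Exact value of a floating point number given as two bit strings."""
    def signed(bits: str) -> int:
        n = int(bits, 2)
        return n - (1 << len(bits)) if bits[0] == "1" else n

    fractional_bits = len(mantissa) - 1  # decimal point sits after the first bit
    return Fraction(signed(mantissa), 1 << fractional_bits) * Fraction(2) ** signed(exponent)


extremes = {
    "largest positive": ("0111111111", "011111"),
    "smallest positive": ("0000000001", "100000"),
    "largest negative": ("1000000000", "011111"),
    "smallest negative": ("1111111111", "100000"),
}

for name, (mantissa, exponent) in extremes.items():
    print(name, float(value(mantissa, exponent)))
```

The largest positive value works out to 511 x 2^22 = 2,143,289,344, the largest negative to -2^31, and the two smallest values to +2^-41 and -2^-41.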

NORMALISATION OF FLOATING-POINT NUMBERS

When storing numbers we need to use the space we are given in the most efficient way. With a fixed number of bits, a normalised representation of a number will display the number to the greatest accuracy possible. In summary, normalised numbers:
- Give only one representation of a number
- Save space
- Give the most accurate representation of a number in a given number of bits

As a rule of thumb: when dealing with floating point numbers in binary you must make sure that the first two bits are different. That is, the mantissa must start with:

0.1 or 1.0

And most definitely NOT: 1.1 0.0

Let's look at an example. Taking a binary floating point number:

We can see that the number does not start with two different bits, so it is not normalised. To normalise it we need to move the decimal point one position to the right, and to keep the same value as the unnormalised number we need to change the exponent accordingly. A movement of one place right in the mantissa must be compensated by an exponent that moves the decimal point one place left, so we subtract one from the current exponent.

To make sure you have normalised it correctly, check that the new number has the same value as the original.

Let's try a more complicated example:

To get the mantissa normalised we need to move the decimal point two places to the right. To maintain the same value as the original floating point number we need to adjust the exponent to be two smaller.

Now check that the new normalised value has the same value as the original.

NOTE: Make sure that normalising a number does not change the sign bit, e.g. 0.0001 should go to 0.100 and NOT 1.000

Summary: Normalising numbers
1. Normalise the left hand side (mantissa).
2. Record the number of bounces (places moved to the right) it has taken to normalise.
3. Work out the exponent of the normalised number by using: original exponent - bounces

- Normalised numbers start with 2 bits that are different
- Make sure that your normalisation does not change the sign of the mantissa
- Normalisation provides the maximum precision for a given number of bits
- Normalisation makes sure there is only one representation for each number
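The summary above can be sketched as a small Python function. It is an illustration under stated assumptions: `normalise` is a made-up name, the bit strings are passed without the decimal point, and there is no check that the new exponent still fits in its field.

```python
def normalise(mantissa: str, exponent: str) -> tuple[str, str]:
    """Normalise a floating point number given as two bit strings.

    Each bounce moves the decimal point one place to the right: the bit after
    the sign bit is dropped, a 0 is padded on the end, and 1 is subtracted
    from the exponent.
    """
    width = len(exponent)
    exp = int(exponent, 2)
    if exponent[0] == "1":            # negative exponent in two's complement
        exp -= 1 << width
    if set(mantissa) == {"0"}:        # zero has no normalised form
        return mantissa, exponent
    while mantissa[0] == mantissa[1]:   # first two bits must end up different
        mantissa = mantissa[0] + mantissa[2:] + "0"
        exp -= 1
    return mantissa, format(exp & ((1 << width) - 1), f"0{width}b")


print(normalise("0010000000", "111111"))   # ('0100000000', '111110')
```

Keeping the sign bit in place and dropping the bit after it is what stops normalisation from changing the sign of the mantissa.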

Exercise: Normalisation Questions

Are the following numbers normalised?

EXAMPLE 1: 0.010000000 111111
Answer: No, as it starts with 0.0

EXAMPLE 2: 0.111111000 111111
Answer: Yes, as it starts with 0.1

EXAMPLE 3: 1.100000010 111111
Answer: No, as it starts with 1.1

Normalise the following numbers:

EXAMPLE 1: 0 010000000 111111

Answer:
1. 0.010000000 111111 -> 00.10000000 111111
2. One place to the right
3. 111111 - 1 = -1 - 1 = -2. Since +2 is 000010, -2 is 111110.
Result: 00.10000000 111110 = 0.100000000 111110

EXAMPLE 2: 0 001101000 000110

Answer:
1. 0.001101000 000110 -> 000.1101000 000110
2. Two places to the right
3. 000110 - 2 = 6 - 2 = 4 = 000100 (+4)
Result: 000.1101000 000100 = 0.110100000 000100

EXAMPLE 3: 1 111111010 000011

Answer:
1. 1.111111010 000011 -> 1111111.010 000011
2. Six places to the right
3. 000011 - 6 = 3 - 6 = -3 = 111101 (-3)
Result: 1111111.010 111101 = 1.010000000 111101

REASONS FOR NORMALISATION

LIMITS OF FLOATING-POINT REPRESENTATION (effects of changing allocation of bits to mantissa and exponent)

Precision

When using floating point numbers you have to balance the range and the precision of numbers. That is, whether you want to have a very large range of values or you want a number that is very precise, down to a large number of decimal places. This means that you always have to weigh up how many bits should be used for the mantissa and how many should be used for the exponent. In summary:
- If you want a very precise number, use more bits for the mantissa and fewer for the exponent, as this will allow for more decimal places.
- If you want a large range of numbers, use more bits for the exponent and fewer for the mantissa.
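The trade-off can be made concrete with a small sketch that compares different ways of splitting 16 bits. The function name and the particular splits are chosen only for illustration; both the mantissa and the exponent are assumed to be two's-complement fields.

```python
from fractions import Fraction


def largest_positive(mantissa_bits: int, exponent_bits: int) -> Fraction:
    """Largest positive value: mantissa 0.111...1 with the largest exponent."""
    fractional_bits = mantissa_bits - 1
    max_mantissa = Fraction((1 << fractional_bits) - 1, 1 << fractional_bits)
    max_exponent = (1 << (exponent_bits - 1)) - 1
    return max_mantissa * 2 ** max_exponent


# Three ways of splitting 16 bits between mantissa and exponent.
for mantissa_bits, exponent_bits in [(12, 4), (10, 6), (8, 8)]:
    print(
        f"{mantissa_bits}-bit mantissa, {exponent_bits}-bit exponent:",
        f"{mantissa_bits - 1} fractional bits of precision,",
        f"largest value about {float(largest_positive(mantissa_bits, exponent_bits)):.3g}",
    )
```

Moving bits from the mantissa to the exponent makes the largest value grow very quickly, while the number of fractional bits available for precision shrinks.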

OVERFLOW

When the result of a sum is too large to be represented by your number system, you might run out of space to represent it and end up storing a much smaller number.

Try and show 99,999,999,999,999,999,999 in 12 bit FP

UNDERFLOW

When a number or the result of an equation is too small, you might not have enough digits in your mantissa and exponent to show it. In the following example the number would register as 0.

Try and show 0.0000000000000000000000000001 in 12 bit FP
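A rough way to test for overflow and underflow is to work out the exponent a normalised number would need and compare it with what the exponent field can hold. The sketch below assumes the 16-bit format used earlier (6-bit two's-complement exponent, so -32 to +31), because the split of the 12-bit format is not given here. It only looks at the size of the number and ignores its sign, so the single edge case of -2^31 is reported as overflow even though it fits; `classify` is an illustrative name.

```python
from fractions import Fraction

# Assumption: 6-bit two's-complement exponent, as in the 16-bit format above.
MIN_EXPONENT, MAX_EXPONENT = -32, 31


def classify(number) -> str:
    """Return 'overflow', 'underflow' or 'ok' for a normalised representation."""
    x = abs(Fraction(number))   # sign ignored: only the magnitude matters here
    if x == 0:
        return "ok"
    exponent = 0
    while x >= 1:                 # too big: move the binary point left
        x /= 2
        exponent += 1
    while x < Fraction(1, 2):     # too small: move the binary point right
        x *= 2
        exponent -= 1
    if exponent > MAX_EXPONENT:
        return "overflow"
    if exponent < MIN_EXPONENT:
        return "underflow"
    return "ok"


print(classify(99_999_999_999_999_999_999))              # overflow
print(classify("0.0000000000000000000000000001"))        # underflow
```

Both test numbers from the text fall well outside the range of a 6-bit exponent, so a format with an even smaller exponent field could not hold them either.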

TRUNCATION

Why can computers not represent real numbers such as √2 or π exactly, but only approximately?

These numbers have infinitely many digits, but floating-point cannot represent so many digits, and truncation will occur.

ROUNDING ERRORS IN BINARY REPRESENTATIONS

When we try to represent some numbers, sometimes we can't within the space we have been given, for example trying to write down 1/3 = 0.33333333...; you see what I mean? With floating point numbers you can't always get perfect precision and sometimes we suffer errors.

Feed this equation into Google:

999999999999999 - 999999999999998

The browser will perform a floating point calculation and give you the answer of 0!

So, recognising that we can have rounding errors with floating point numbers, we'll take a look at the different errors that might be caused. Suppose we want to represent 23.27 in binary with a 10-bit mantissa: the closest we get is 23.25.
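The 23.27 example can be checked with a short sketch. It assumes a 10-bit mantissa (9 fractional bits) and an exponent of 5, which is what normalising 23.27 requires, and it drops the bits that do not fit; `stored_value` is an illustrative name.

```python
from fractions import Fraction
import math


def stored_value(number, exponent: int, mantissa_bits: int = 10) -> Fraction:
    """Value actually stored when bits beyond the mantissa are dropped."""
    fractional_bits = mantissa_bits - 1
    # With this exponent, stored values are multiples of 2^(exponent - fractional_bits).
    scale = Fraction(2) ** (fractional_bits - exponent)
    return Fraction(math.floor(Fraction(number) * scale)) / scale


target = Fraction("23.27")
stored = stored_value(target, exponent=5)
print(float(stored))            # 23.25
print(float(target - stored))   # 0.02
```

With an exponent of 5 the stored values are 1/16 apart, so 23.27 falls between 23.25 and 23.3125 and the error is 0.02.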
