Topic 2 – Introduction to Computer Codes. Computer Codes A code is a systematic use of a given set...

38
Topic 2 – Introduction to Computer Codes

Transcript of Topic 2 – Introduction to Computer Codes. Computer Codes A code is a systematic use of a given set...

Topic 2 – Introduction to Computer Codes

Computer Codes A code is a systematic use of a given

set of symbols for representing information.

As an example, a traffic light uses a code to signal what you need to do… Red = Stop Yellow = Caution; about to go to red Green = Go

We will look at three important types of codes: numeric, character, and error detection & correction.

Fixed Point Numbers A fixed point number is used to represent

either signed integers or signed fractions. In either case, sign magnitude, 2’s complement, or 1’s complement systems can be used to represent the signed value.

A fixed point integer has an implied radix (decimal) point to the right of the least significant bit.

A fixed point fraction has an implied radix point between the sign bit and the most significant magnitude bit.

Fixed Point Numbers. . .n-1 n-2 2 1 0

sign bitimplied

binary point

.

magnitude representation

. . .n-1 n-2 2 1 0

sign bit

impliedbinary point

.

magnitude representation

(top) fixed point integer(bottom) fixed point fraction

Excess Representations An excess-K representation of a code C is

formed by adding the value K to each code word of C.

This sort of code is used frequently in the representation of the exponents of floating point numbers so that the smallest exponent value will be represented by all zeros.

As an example, let’s compute the excess-8 code for 4-bit 2’s complement numbers (note all we have to do is add 8 = (1000)2 to each code).

Excess-8 4-bit 2’s Complement Code

Decimal 2’s Complement

Excess-8

Decimal 2’s Complement

Excess-8

+7 0111 1111 -1 1111 0111

+6 0110 1110 -2 1110 0110

+5 0101 1101 -3 1101 0101

+4 0100 1100 -4 1100 0100

+3 0011 1011 -5 1011 0011

+2 0010 1010 -6 1010 0010

+1 0001 1001 -7 1001 0001

0 0000 1000 -8 1000 0000

Floating-Point Numbers Floating-point numbers are similar to

numbers written in scientific notation. In general, the floating-point form of a number N is written as:

N = M x rE

where M is the mantissa, a fixed-point number containing the significant digits of N, and E is the exponent, a fixed-point integer.

Mantissa Encoding In such a system, the mantissa and the

exponent are generally coded separately. The radix is implied.

The mantissa M is usually coded in sign-magnitude, usually as a fraction. It is usually written as:

M = (SM.an-1an-2…a-m)rsm

The sign bit, SM is usually 0 for a positive number and 1 for a negative number.

Exponent Encoding The exponent E is often encoded in excess-K

two’s complement notation. This representation of a number is formed by

adding a bias of K to the 2’s complement integer value of the exponent.

For binary floating-point numbers, K is usually selected to be 2e-1 where e is the number of bits in the exponent.

Therefore,-2e-1 <= E < 2e-1

0 <= E + 2e-1 < 2e

Exponent Encoding By examining this expression, we can see

that the biased value of E is a number that ranges from 0 (at its most negative value) to 2e–1 (at its most positive value).

The excess-K form of E can be written as:E = (be-1be-2…b0)excess-K

The sign of E is indicated (in this form) by the bit be-1, which will be 0 if E is negative and 1 if E is positive.

Normalization Note that more than one combination of

mantissa and exponent can represent the same number.M = (0.11010101)2 x 24

= (0.011010101)2 x 25

= (0.0011010101)2 x 26

In a digital system, it is useful to have one unique representation for each number. A floating-point number is normalized if the exponent is adjusted so that the mantissa has a nonzero value in its most significant digit.

Floating-Point Formats Floating-point formats used in computers

often differ in the number of bits used to represent the mantissa and the exponent, and the method of coding for each.

Most systems use a system where the sign is stored in the leftmost bit, followed by the exponent and then the mantissa.

Typically, floating-point numbers are stored in one-word or two-word formats.

Floating-Point Formats A one-word format:

A two-word format:

Mantissa M(most significant part)

Exponent ESM

Mantissa MExponent ESM

Mantissa M(least significant part)

IEEE Floating-Point Standards The Institute for Electrical and

Electronic Engineers (IEEE) has defined a set of floating point standards. The single-precision IEEE standard

calls for 32 bits, 1 sign bit, 23 mantissa bits, 8 exponent bits, and an exponent bias of 127.

The single-precision IEEE standard calls for 64 bits, 1 sign bit, 52 mantissa bits, 11 exponent bits, and an exponent bias of 1023.

Binary Coding Decimal (BCD) The binary coded decimal (BCD) code is

used for representing the decimal digits 0 through 9.

It is a weighted code which means each bit position in the code have a fixed numerical weight associated with it. The digit represented by a code word is found by summing the weighted bits.

BCD uses four bits, with the weights equal to those of a 4-bit binary integer (BCD is sometimes called a 8-4-2-1 code).

BCD Code Words

Digit BCD Code

Digit BCD Code

0 0000 5 0101

1 0001 6 0110

2 0010 7 0111

3 0011 8 1000

4 0100 9 1001

Use of BCD Codes BCD codes are used to encode numbers for

output to numerical displays (such as seven-segment displays).

They are also used to represent numbers in processors which perform decimal arithmetic directly (instead of binary arithmetic).

As an example, let’s encode the decimal number N=(9750)10 in BCD.

9 -> 1001, 7 -> 0111, 5 -> 0101, 0 -> 0000Therefore (9750)10 = (1001011101010000)BCD.

ASCII The most widely used code for

representing characters is ASCII. ASCII is a 7-bit code, frequently used

with an 8th bit for error detection (more about that in a bit).

A complete ASCII table is located in your textbook, or all around the web.

As an example, let’s encode ASCIIcode in ASCII.

ASCII EncodingCharacter ASCII

(binary)ASCII (hex)

A 1000001 41

S 1010011 53

C 1000011 43

I 1001001 49

I 1001001 49

c 1100011 63

o 1101111 6F

d 1100100 64

e 1100101 65

Gray Codes A Gray code is defined as a code

where two consecutive codewords differ in only one bit.

The distance between two code words is defined as the number of bits in which the two words differ. Therefore, a Gray code has a distance of one.

Let’s define a Gray code for the decimal numbers 0 through 15.

Gray Code ExampleDecimal

Binary Gray Decimal

Binary Gray

0 0000 0000 8 1000 1100

1 0001 0001 9 1001 1101

2 0010 0011 10 1010 1111

3 0011 0010 11 1011 1110

4 0100 0110 12 1100 1010

5 0101 0111 13 1101 1011

6 0110 0101 14 1110 1001

7 0111 0100 15 1111 1000

Codes and Weights An error in binary data is defined as an

incorrect value in one or more bits. A single error is an incorrect value in one bit and a multiple error is one or more bits being incorrect.

Errors may be caused by hardware failure, external interference (noise), or other unwanted events.

Certain types of codes allow the detection and sometimes the correction of errors.

Terminology C will refer to a code. I and J will refer to n-bit binary

codewords. The weight of I, w(I), will be defined

as the number of bits of I equal to 1. The distance between I and J, d(I, J)

is equal to the number of bit positions in which I and J differ.

Terminology Example I = (01101100) J = (11000100)

w(I) = 4 w(J) = 3 d(I, J) = 3

Error Detection and Correction Codes If the distance between any two code words in

C is >= dmin, the code is said to have minimum distance dmin.

For a given dmin, at least dmin errors are needed to transform one valid code word into another.

If there are fewer than dmin errors, a detectable noncode word results. If this noncode word is closer to one valid codeword than any other, the original code word can be deduced and the error corrected.

Error Detectionand Correction Codes In general, a code provides t error

correction plus detection of s additional errors if and only if the following inequality is satisfied…

2t + s + 1 <= dmin

Error Detection and Correction

2t + s + 1 <= dmin

Examining this inequality shows: A single-error detection code (s=1,

t=0) requires a minimum distance of 2. A single-error correction code (s=0,

t=1) requires a minimum distance of 3. A single-error correction and double-

error detection (s=t=1) requires a minimum distance of 4.

Parity Codes Parity codes are formed from a code C

by concatenating (operator |) a parity bit, P to each code word of C.

In an odd-parity code, the parity bit is specified to be either 0 or 1 as necessary for w(P|C) to be odd.

In an even-parity code, the parity bit is specified to be either 0 or 1 as necessary for w(P|C) to be even.

Information BitsP

Parity Code Example Concatenate a parity bit to the ASCII

code for the characters 0, X, and = to produce both odd-parity and even-parity codes.Character ASCII Odd-Parity

ASCIIEven-Parity ASCII

0 0110000 10110000 00110000

X 1011000 01011000 11011000

= 0111100 10111100 00111100

Parity Code Effectiveness Error detection on a parity code is easily

accomplished by checking to see if a codeword has the correct parity.

For example, if the parity of an odd-parity codeword is actually even, an error has occurred in this codeword.

Parity codes are minimum-distance-2 codes and thus can detect single errors. Unfortunately, errors in an even number of bits will not change the parity and are therefore not detectable using a parity code.

Two-out-of-Five Codes A two-out-of-five code is an error

detection code having exactly two bits equal to 1 and three bits equal to 0.

Error detection is accomplished by counting the number of ones in a code word. An error has occurred if this number is not equal to two.

These codes permit the detection of single errors and multiple errors in adjacent bits.

Two-out-of-Five Code Example As an example, let’s look at a two-

out-of-five code for decimal digits.

Digit Codeword

Digit Codeword

0 00011 5 01010

1 00101 6 10010

2 01001 7 01100

3 10001 8 10100

4 00110 9 11000

Hamming Codes A more complex error detection and

correction system of codes was introduced by Richard Hamming in 1950. Hamming codes are similar to an extension of parity codes in that multiple parity, or check bits are used.

Each check bit is defined over a subset of the information bits in a codeword. These subsets overlap such that each information bit is in at least two subsets.

Error Detection and Correction with Hamming Codes The error detection and correction properties of

a Hamming code are determined by the number of check bits used and how the check bits are defined over the information bits.

The minimum distance dmin is equal to the weight of the minimum-weight nonzero code word (the number of ones in the codeword with the fewest).

Hamming codes are complex in their formation and analysis. We will look at two different codes with different properties, but not discuss them in too much depth.

Hamming Code Example #1 This code has dmin=3, so it can provide

single error correction.Information Word

Hamming Code

Information Word

Hamming Code

Information Word

Hamming Code

0000 0000000 0110 0110011 1100 1100001

0001 0001011 0111 0111000 1101 1101010

0010 0010101 1000 1000111 1110 1110100

0011 0011110 1001 1001100 1111 1111111

0100 0100110 1010 1010010

0101 0101101 1011 1011001

Hamming Code Example #1 Here, let’s assume we have a single error in

the codeword 0100110 so that we read it as 1100110.

Only the codeword in which the error occurred has distance 1 from the invalid word. So, the detection of the error word is equivalent to correcting the error, since the only possible single error that could have produced this error word is the codeword 0100110.

So, this code offers single error correction.

Hamming Code Example #2 This code has dmin=4, so it can provide single

error correction and double error detection.Information Word

Hamming Code

Information Word

Hamming Code

Information Word

Hamming Code

0000 00000000 0110 01100011 1100 11001001

0001 00011011 0111 01111000 1101 11010010

0010 00101101 1000 10000111 1110 11100100

0011 00110110 1001 10011100 1111 11111111

0100 01001110 1010 10101010

0101 01010101 1011 10110001

Summary You should now understand:

Fixed-point and floating-point number representation

BCD and ASCII character codes Gray codes and excess-k codes Simple error detection and error

correction codes, including parity codes, 2-out-of-5 codes, and Hamming codes