Csa 02

Computer Architecture and Organization

Chapter 2: The Central Processing Unit

Arithmetic & Logic Unit
• Does the calculations
• Everything else in the computer is there to service this unit
• Handles integers
• May handle floating-point (real) numbers
• May be a separate FPU (maths co-processor)
• May be an on-chip FPU (486DX and later)

2

ALU Inputs and Outputs

3

Integer Representation
• Only have 0 & 1 to represent everything
• Positive numbers stored in binary
  – e.g. 41 = 00101001
• No minus sign
• Representation of non-negative numbers is straightforward

4

Sign-Magnitude
• Leftmost bit (MSB) is the sign bit
• 0 means positive
• 1 means negative
• +18 = 00010010
• -18 = 10010010

5

Drawbacks/problems

• Need to consider both sign and magnitude in arithmetic

• Two representations of zero (+0 and -0)

6

Two’s Complement
• MSB is the sign bit
• For positive numbers
  – Sign bit is 0
  – Zero is treated as positive
• For negative numbers
  – Sign bit is 1

7

Benefits

• One representation of zero
• Arithmetic works easily
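As a minimal sketch of the two encodings above, the following Python helpers build sign-magnitude and two's complement bit patterns and decode a two's complement pattern. The function names and the default 8-bit width are illustrative assumptions, not part of the slides.

```python
def to_sign_magnitude(value: int, bits: int = 8) -> str:
    """Sign bit followed by the magnitude in bits - 1 positions."""
    if abs(value) >= 1 << (bits - 1):
        raise ValueError(f"{value} does not fit in {bits} bits")
    sign = "1" if value < 0 else "0"
    return sign + format(abs(value), f"0{bits - 1}b")


def to_twos_complement(value: int, bits: int = 8) -> str:
    """Two's complement pattern of value in the given width."""
    if not -(1 << (bits - 1)) <= value < (1 << (bits - 1)):
        raise ValueError(f"{value} does not fit in {bits} bits")
    # Masking a negative Python int yields its two's complement pattern.
    return format(value & ((1 << bits) - 1), f"0{bits}b")


def from_twos_complement(pattern: str) -> int:
    """Interpret a bit string as a two's complement integer."""
    raw = int(pattern, 2)
    # A set sign bit means the pattern stands for raw - 2**bits.
    return raw - (1 << len(pattern)) if pattern[0] == "1" else raw
```

Under these assumptions, to_sign_magnitude(-18) gives 10010010, while to_twos_complement(-18) gives 11101110.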

8

Conversion between different lengths
• An n-bit integer has to be stored in an m-bit register, where m > n
• Sign-magnitude
  – Move the sign bit to the new leftmost position
  – Fill the remaining new positions with zeros
• Two’s complement
  – Move the sign bit to the new leftmost position
  – Fill the remaining new positions with copies of the sign bit
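The two extension rules can be sketched on bit strings as follows. The function names are hypothetical and the patterns are plain Python strings of 0s and 1s.

```python
def extend_sign_magnitude(pattern: str, new_bits: int) -> str:
    """Keep the sign bit leftmost and pad the magnitude with zeros."""
    sign, magnitude = pattern[0], pattern[1:]
    return sign + magnitude.rjust(new_bits - 1, "0")


def extend_twos_complement(pattern: str, new_bits: int) -> str:
    """Pad on the left with copies of the sign bit (sign extension)."""
    return pattern.rjust(new_bits, pattern[0])
```

For example, extending the 8-bit two's complement pattern 11101110 (-18) to 16 bits gives 1111111111101110, which still represents -18.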

9

Integer Arithmetic - Negation
• Sign-magnitude
  – Invert the sign bit
• 2’s complement
  – Take the Boolean complement of each bit, including the sign bit
  – Add one to the result

10

Negation Special Case
• Case 1: negation of 0 is 0
• Case 2: negation of -128 is -128 (in 8 bits)
• Unavoidable, because +128 cannot be represented in 8-bit two’s complement
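A small sketch of two's complement negation (complement every bit, then add one) makes both special cases visible. The function name is illustrative.

```python
def negate(pattern: str) -> str:
    """Two's complement negation of a bit string of any width."""
    bits = len(pattern)
    inverted = "".join("1" if bit == "0" else "0" for bit in pattern)
    # Add one; a carry out of the most significant bit is discarded.
    total = (int(inverted, 2) + 1) % (1 << bits)
    return format(total, f"0{bits}b")
```

Here negate("00000000") returns 00000000 and negate("10000000") returns 10000000, which are the two special cases for 8 bits.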

11

Range of Numbers
• 8-bit 2’s complement
  – +127 = 01111111 = 2^7 - 1
  – -128 = 10000000 = -2^7
• 16-bit 2’s complement
  – +32767 = 01111111 11111111 = 2^15 - 1
  – -32768 = 10000000 00000000 = -2^15

12

Addition
• Normal binary addition
• If the result is positive, we get a positive number in 2’s complement
• If the result is negative, we get a negative number in 2’s complement
• Any carry out of the most significant bit is ignored
• If the result is larger than the word size being used can hold, the condition is called overflow
• When overflow occurs, the ALU must signal it so that the result is not used

13

Overflow rule
• If two numbers are added, and they are both positive or both negative, then overflow occurs if and only if the result has the opposite sign.
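The addition procedure and the overflow rule can be sketched together. The function name and the tuple it returns are assumptions made for illustration; the operands are ordinary Python integers that already fit in the chosen width.

```python
def add_twos_complement(a: int, b: int, bits: int = 8):
    """Add two signed values in a fixed width; return (result, overflow)."""
    mask = (1 << bits) - 1
    sign_bit = 1 << (bits - 1)
    # Ordinary binary addition; the carry out of the MSB is ignored.
    result = ((a & mask) + (b & mask)) & mask
    sign_a, sign_b, sign_r = a & sign_bit, b & sign_bit, result & sign_bit
    # Overflow rule: operands share a sign and the result has the opposite one.
    overflow = (sign_a == sign_b) and (sign_r != sign_a)
    signed = result - (1 << bits) if sign_r else result
    return signed, overflow
```

With 4-bit operands, add_twos_complement(5, 4, 4) returns (-7, True): the pattern 1001 has the wrong sign, so the result must not be used.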

14

Subtraction
• Rule: take the two’s complement of the subtrahend and add it to the minuend
  – i.e. a - b = a + (-b)
• So we only need addition and complement circuits
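A minimal sketch of subtraction built only from complement and add operations is shown below. The function name is illustrative, and overflow detection is deliberately left out to keep the example short.

```python
def subtract(a: int, b: int, bits: int = 8) -> int:
    """Compute a - b as a + (-b) in a fixed-width two's complement word."""
    mask = (1 << bits) - 1
    # Two's complement of the subtrahend: invert every bit, then add one.
    negated_b = ((~b & mask) + 1) & mask
    result = ((a & mask) + negated_b) & mask
    # Reinterpret the bit pattern as a signed value.
    return result - (1 << bits) if result >> (bits - 1) else result
```

In 4 bits, subtract(2, 7, 4) adds 0010 and 1001 to give 1011, that is -5.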

15

Hardware for Addition and Subtraction

16

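Multiplication
• Complex
• Work out partial product for each digit
• Take care with place value (column)
• Add partial products
• Unsigned multiplication in hardware
  – Multiplier is held in register Q, multiplicand in register M
  – Register A and carry bit C are set to 0
  – Control logic reads the bits of Q one at a time, starting from the least significant bit
  – If the bit is 1, M is added to A and the result is stored in A, with any carry in C
  – C, A and Q are then shifted right one bit
  – If the bit is 0, only the shift is performed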
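The register-level procedure can be sketched as a shift-and-add loop. The variable names m, q, a and c mirror the registers M, Q, A and C; the function name and the default 4-bit width are assumptions.

```python
def multiply_unsigned(multiplicand: int, multiplier: int, bits: int = 4) -> int:
    """Shift-and-add multiplication of two unsigned values of the given width."""
    mask = (1 << bits) - 1
    m, q = multiplicand & mask, multiplier & mask
    a, c = 0, 0
    for _ in range(bits):
        if q & 1:
            # Q0 is 1: add M to A, keeping the carry out in C.
            total = a + m
            c, a = total >> bits, total & mask
        # Shift C, A, Q right one bit as a single long register.
        q = (q >> 1) | ((a & 1) << (bits - 1))
        a = (a >> 1) | (c << (bits - 1))
        c = 0
    # The double-length product is held in A (high half) and Q (low half).
    return (a << bits) | q
```

For instance, multiply_unsigned(0b1011, 0b1101) leaves 1000 in A and 1111 in Q, giving 10001111 (143).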

17

Unsigned Binary Multiplication

18

Execution of Example

19

Flowchart for Unsigned Binary Multiplication

20

Multiplying Negative Numbers
• The unsigned algorithm does not work for negative numbers
• Solution 1
  – Convert to positive
  – Multiply as unsigned integers
  – If the signs were different, negate the answer
• Solution 2
  – Booth’s algorithm
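The slides name Booth's algorithm without listing its steps, so the following is a sketch of the commonly taught form rather than a transcription of the figure. It inspects the bit pair (Q0, Q-1), subtracts or adds the multiplicand when the pair is 10 or 01, and then performs an arithmetic right shift of A, Q, Q-1. The function name is hypothetical, and the most negative multiplicand (-8 in 4 bits) is not handled.

```python
def booth_multiply(multiplicand: int, multiplier: int, bits: int = 4) -> int:
    """Booth's algorithm on two's complement operands of the given width."""
    mask = (1 << bits) - 1
    m = multiplicand & mask
    a, q, q_minus_1 = 0, multiplier & mask, 0
    for _ in range(bits):
        pair = (q & 1, q_minus_1)
        if pair == (1, 0):
            a = (a - m) & mask      # start of a run of 1s: A = A - M
        elif pair == (0, 1):
            a = (a + m) & mask      # end of a run of 1s: A = A + M
        # Arithmetic shift right of A, Q, Q-1 (the sign bit of A is preserved).
        q_minus_1 = q & 1
        q = (q >> 1) | ((a & 1) << (bits - 1))
        a = (a >> 1) | (a & (1 << (bits - 1)))
    product = (a << bits) | q
    # Interpret the double-length result as a signed value.
    if product >> (2 * bits - 1):
        product -= 1 << (2 * bits)
    return product
```

With 4-bit operands, booth_multiply(7, 3) yields 21 and booth_multiply(-7, 3) yields -21.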

21

Booth’s Algorithm

22

Example of Booth’s Algorithm

23

Division
• More complex than multiplication
• Negative numbers are really bad!
• Based on long division

24

Division of Unsigned Binary Integers
[Figure: long division of the dividend 10010011 by the divisor 1011, giving the quotient 00001101 and the remainder 100, with the partial remainders shown at each step]
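Long division in binary can be sketched as a shift-and-subtract loop with restoration. This is one common way to realise it and is not necessarily the exact flowchart on the next slide; the function name is illustrative, with a and q standing for the A and Q registers.

```python
def divide_unsigned(dividend: int, divisor: int, bits: int = 8):
    """Restoring division; returns (quotient, remainder)."""
    if divisor == 0:
        raise ZeroDivisionError("divisor must be nonzero")
    mask = (1 << bits) - 1
    a, q = 0, dividend & mask
    for _ in range(bits):
        # Shift A, Q left one bit as a single register.
        a = (a << 1) | (q >> (bits - 1))
        q = (q << 1) & mask
        a -= divisor
        if a < 0:
            a += divisor    # subtraction failed: restore A, quotient bit stays 0
        else:
            q |= 1          # subtraction succeeded: quotient bit is 1
    return q, a
```

Dividing 10010011 (147) by 1011 (11) with this sketch gives quotient 00001101 (13) and remainder 100 (4).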

25

Flowchart for Unsigned Binary Division

26

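2’s complement Division
• Load the divisor into the M register and the dividend into the A, Q registers (sign-extended to fill both)
• Shift A, Q left 1 bit position
• If M and A have the same sign, perform A = A - M; otherwise perform A = A + M
• The operation is successful if the sign of A is the same before and after the operation
  – If the operation is successful or A = 0, set Q0 = 1
  – If the operation is unsuccessful and A is not equal to 0, set Q0 = 0 and restore the previous value of A
• Repeat the shift and add/subtract steps as many times as there are bit positions in Q
• The remainder is in A; the quotient is in Q if the divisor and dividend have the same sign, otherwise the quotient is the two’s complement of Q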

27

[Figure: worked example of two’s complement division with M = 0011]

28

Real Numbers
• Numbers with fractions
• Could be done in pure binary
  – 1001.1010 = 2^3 + 2^0 + 2^-1 + 2^-3 = 9.625
• Where is the binary point?
• Fixed?
  – Very limited
• Moving?
  – How do you show where it is?
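To check the pure-binary example, a short sketch can sum the positional weights on each side of the binary point. The function name is illustrative.

```python
def binary_to_decimal(text: str) -> float:
    """Evaluate a binary numeral with an optional fractional part."""
    whole, _, fraction = text.partition(".")
    value = float(int(whole, 2)) if whole else 0.0
    # Bits after the point carry weights 2^-1, 2^-2, and so on.
    for position, bit in enumerate(fraction, start=1):
        if bit == "1":
            value += 2.0 ** -position
    return value
```

binary_to_decimal("1001.1010") evaluates 8 + 1 + 0.5 + 0.125 and returns 9.625.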

29

Floating Point

• +/- .significand x 2^exponent
• Radix point is to the right of the MSB of the significand
• Biased representation: a fixed value, called the bias, is subtracted from the exponent field to get the true exponent value
• Normalized
  – MSB of the significand is nonzero

30

Floating Point Examples

31

• First bit is the sign bit
• First bit of the significand is always 1 (no need to store it)
• 127 is subtracted from the stored exponent field to get the true exponent value

32

33

Expressible Numbers
• Using 32 bits
• Negative overflow
• Negative underflow
• Zero
• Positive underflow
• Positive overflow

34

• In 32 bits
  – 1 bit: sign
  – 8 bits: exponent
  – 23 bits: significand
• Increasing the number of exponent bits increases the range
• But only a fixed number of values can be expressed, so a wider range reduces density and precision
• The only way to increase both range and precision is to use more bits
• So most computers offer two formats
  – Single precision (32 bits)
  – Double precision (64 bits)

35

IEEE 754
• Standard for floating-point storage
• Developed to facilitate the portability of programs from one processor to another
• Defines 32-bit single and 64-bit double formats
• 8-bit and 11-bit exponents respectively
• Extended formats also (both significand and exponent)

36

IEEE 754 Formats

37

• The following number uses the IEEE 32-bit floating-point format. What is the equivalent decimal value?
  – 1 10000011 11000000000000000000000
• Express the following numbers in IEEE 32-bit floating-point format:
  – -1.5
    Ans: 1 01111111 10000000000000000000000
  – 384
    Ans: 0 10000111 10000000000000000000000
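Exercises like these can be checked with Python's standard struct module, which packs and unpacks IEEE 754 single-precision values. The helper names below are illustrative, and the spaced "sign exponent fraction" layout is chosen only to match the slide.

```python
import struct


def decode_single(fields: str) -> float:
    """Decode a 32-bit pattern written as 'sign exponent fraction'."""
    pattern = fields.replace(" ", "")
    if len(pattern) != 32:
        raise ValueError("expected 32 bits")
    (value,) = struct.unpack(">f", int(pattern, 2).to_bytes(4, "big"))
    return value


def encode_single(value: float) -> str:
    """Encode a float as 'sign exponent fraction' in single precision."""
    raw = int.from_bytes(struct.pack(">f", value), "big")
    bits = format(raw, "032b")
    return f"{bits[0]} {bits[1:9]} {bits[9:]}"
```

For the first exercise, the exponent field 10000011 is 131, so the true exponent is 131 - 127 = 4 and the value is -1.11 (binary) x 2^4; decode_single("1 10000011 11000000000000000000000") returns -28.0.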

38

Floating point Arithmetic
• For addition and subtraction, both operands must have the same exponent value
• This requires shifting the radix point on one of the operands
• Multiplication and division are more straightforward
• A floating-point operation may produce one of these conditions
  – Exponent overflow: a positive exponent exceeds the maximum possible exponent value
  – Exponent underflow: a negative exponent is less than the minimum possible exponent value
  – Significand underflow: in the process of aligning significands, digits may flow off the right end of the significand
  – Significand overflow: the addition of two significands of the same sign may result in a carry out of the most significant bit

39

FP Arithmetic Addition and subtraction
• Addition and subtraction are more complex than multiplication and division, because of the need for alignment
• Four basic phases of the algorithm
  – Check for zeros
  – Align significands (adjusting exponents)
  – Add or subtract significands
  – Normalize result

41

FP Addition & Subtraction Flowchart

42

FP Arithmetic Multiplication and Division
• Check for zero
• Add/subtract exponents
• Multiply/divide significands
• Normalize
• Round

43

Floating Point Multiplication

44

Floating Point Division

45

Instruction Sets: Characteristics and Functions

What is an Instruction Set?

• The complete collection of instructions that the processor can execute

• Machine code
• Binary

47

Elements of an Instruction
• Operation code (opcode)
  – Specifies the operation to be performed
  – Specified by a binary code called the opcode
• Source operand reference
  – One or more source operands
  – Operands that are inputs for the operation
• Result operand reference
  – Put the answer here
• Next instruction reference
  – Where to fetch the next instruction

48

Source and result operands

• Main memory (or virtual memory or cache)
• CPU register
• I/O device

49

Instruction Cycle State Diagram

50

Instruction Representation

• In machine code each instruction has a unique bit pattern

• For human consumption a symbolic representation is used
  – e.g. ADD, SUB, LOAD
• Opcodes are represented by abbreviations called mnemonics
• Operands can also be represented in this way
  – ADD R,Y

51

Simple Instruction Format

52

Instruction Types
• Data processing
• Data storage (main memory)
• Data movement (I/O)
• Program flow control

53

Number of Addresses (a)
• 3 addresses
  – Operand 1, Operand 2, Result
  – Does not change the value of either operand
  – a = b + c;
  – Needs very long words to hold everything

54

Number of Addresses (b)
• 2 addresses
  – One address does double duty
    • as operand and result
  – a = a + b
  – Reduces length of instruction

55

Number of Addresses (c)
• 1 address
  – Implicit second address
  – Usually a register (accumulator)
  – Common on early machines

57

Number of Addresses (d)
• 0 (zero) addresses
  – All addresses implicit
  – Uses a stack
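A toy stack machine shows how a = b + c looks when arithmetic instructions carry no addresses. The mnemonics PUSH, POP, ADD and MUL and the tuple encoding are hypothetical; only PUSH and POP name a memory location, while ADD and MUL take both operands from the stack implicitly.

```python
def run_stack_program(program, memory):
    """Execute a list of instruction tuples on a simple stack machine."""
    stack = []
    for op, *args in program:
        if op == "PUSH":
            stack.append(memory[args[0]])       # memory -> top of stack
        elif op == "POP":
            memory[args[0]] = stack.pop()       # top of stack -> memory
        elif op in ("ADD", "MUL"):
            right, left = stack.pop(), stack.pop()
            stack.append(left + right if op == "ADD" else left * right)
        else:
            raise ValueError(f"unknown operation {op}")
    return memory


# a = b + c on the stack machine
PROGRAM = [("PUSH", "b"), ("PUSH", "c"), ("ADD",), ("POP", "a")]
```

The same statement takes one instruction on a 3-address machine but four here, which illustrates the trade-off between instruction length and instruction count.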

58

How Many Addresses
• More addresses
  – More complex instructions
  – More registers
    • Inter-register operations are quicker
  – Fewer instructions per program
• Fewer addresses
  – Less complex instructions
  – Shorter instructions to store
  – More instructions per program
  – Longer execution time

59

Instruction set Design
• Operation repertoire
  – How many operations?
  – What can they do?
  – How complex are they?
• Data types
• Instruction formats
  – Length of opcode field
  – Number of addresses
• Registers
  – Number of CPU registers used
• Addressing modes

60

Types of Operand
• Addresses
• Numbers
  – Numbers used in computers differ from those used in ordinary maths
  – Computer numbers are limited (in magnitude and precision)
  – Integer, floating point and decimal
• Characters
  – ASCII and IRA
  – EBCDIC, used by IBM
• Logical data
  – Bits or flags

61

x86 Data Types
• 8-bit byte
• 16-bit word
• 32-bit double word
• 64-bit quad word
• 128-bit double quad word
• Data accessed across a 32-bit bus in units of double words, read at addresses divisible by 4
• Little endian
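Little-endian storage means the least significant byte sits at the lowest address. A short sketch using Python's built-in int.to_bytes shows the byte order; the function name is illustrative.

```python
def little_endian_bytes(value: int, size: int = 4) -> list:
    """Bytes of value in increasing address order, as two-digit hex strings."""
    return [f"{byte:02X}" for byte in value.to_bytes(size, "little")]
```

The double word 0x12345678 is therefore stored as the byte sequence 78, 56, 34, 12.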

62

x86 Numeric Data Formats

63

SIMD Data Types
• Packed byte and packed byte integer
  – Bytes packed into a 64-bit quadword or 128-bit double quadword
• Packed word and packed word integer
  – 16-bit words packed into a 64-bit quadword or 128-bit double quadword
• Packed doubleword and packed doubleword integer
  – 32-bit doublewords packed into a 64-bit quadword or 128-bit double quadword
• Packed quadword and packed quadword integer
  – Two 64-bit quadwords packed into a 128-bit double quadword
• Packed single-precision floating-point and packed double-precision floating-point
  – Four 32-bit floating-point or two 64-bit floating-point values packed into a 128-bit double quadword

64

ARM Data Types
• 8 (byte), 16 (halfword), 32 (word) bits
• Halfword accesses should be halfword aligned and word accesses should be word aligned
• Nonaligned access
  – Default
    • Address treated as truncated
    • Bits[1:0] treated as zero for word accesses
    • Bit[0] treated as zero for halfword accesses
  – Data abort signal indicates an alignment fault for an attempted unaligned access
• All data types support both unsigned integer and two’s-complement signed interpretations
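The default truncation behaviour amounts to clearing the low address bits. The sketch below models only that address arithmetic, not any real ARM interface, and the function name is hypothetical.

```python
def truncated_address(address: int, access_bytes: int) -> int:
    """Address actually used for a byte (1), halfword (2) or word (4) access."""
    if access_bytes not in (1, 2, 4):
        raise ValueError("access size must be 1, 2 or 4 bytes")
    # Clearing the low bits forces the address down to an aligned boundary:
    # bits[1:0] for a word access, bit[0] for a halfword access.
    return address & ~(access_bytes - 1)
```

A word access to 0x1003 is treated as an access to 0x1000, and a halfword access to the same address as an access to 0x1002.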

65

• Majority of ARM processors do not provide floating-point hardware
  – Saves power and area
  – Floating-point arithmetic implemented in software
• Optional floating-point coprocessor
  – Single- and double-precision IEEE 754 floating-point data types

66

ARM Endian Support
• E-bit in system control register
• Under program control

67

Types of Operation
• Data Transfer
• Arithmetic
• Logical
• Conversion
• I/O
• System Control
• Transfer of Control

68

Data Transfer
• Specify
  – Source
  – Destination
  – Length of data
• May be different instructions for different movements
  – e.g. IBM 370
• Or one instruction and different addresses
  – e.g. VAX

69

Arithmetic

71

LOGICAL

72

Shift and Rotate Operations

73

Conversion

74

Input/output
• May be specific instructions
• May be done using data movement instructions (memory mapped)
• May be done by a separate controller (DMA)

75

Systems Control
• Executed only in a special (privileged) processor state
• Used to access control registers
• For operating system use

76

NEED FOR CONTROL INSTRUCTIONS
• A group of instructions needs to be executed repeatedly
• Decision making: execute code only when a condition is satisfied
• Breaking tasks into smaller pieces

77

Transfer of Control
• Branch or jump instruction
  – Conditional
  – Unconditional
• Consider subtraction
  – BRP X - Branch to location X if result is positive.
  – BRN X - Branch to location X if result is negative.
  – BRZ X - Branch to location X if result is zero.
  – BRO X - Branch to location X if overflow occurs.
• BRE R1, R2, X
  – Branch to X if contents of R1 = contents of R2.

79

• Skip
  – Skip the next instruction
  – e.g. increment and skip if zero
  – ISZ Register1

81

Procedure call
• Advantages
  – Code reuse
  – Efficient use of storage
• Two basic instructions
  – Call: branches from the present location to the procedure
  – Return: returns from the procedure to the place from which it was called

82

Nested Procedure Calls

83

Use of Stack

84

Stack Frame Growth Using Sample Procedures P and Q

85

x86 status flags

86

x86 operation types

87

ARM operation types
• Load and store instructions
• Branch instructions
• Data processing instructions
• Multiply instructions
• Parallel addition and subtraction instructions
  – Image processing applications
• Status register access instructions
  – N, Z, C, V flags

91

Unusual aspects of ARM
• All instructions include a condition code
  – Not only branch instructions
• All data processing instructions include an S bit
  – Specifies whether the instruction updates the condition flags

92

Instruction Sets: Addressing Modes and Formats

Addressing Modes
• Immediate
• Direct
• Indirect
• Register
• Register Indirect
• Displacement
• Stack

94

• All architectures provide more than one of these addressing modes
• How can the processor determine which addressing mode is being used?
  – One or more bits in the instruction format can be used as a mode field
  – The value of the mode field determines which addressing mode is to be used

95

Immediate Addressing

96

Immediate Addressing
• Operand is part of instruction
• Operand = address field
• e.g. ADD 5
  – Add 5 to contents of accumulator
  – 5 is operand
• No memory reference to fetch data
• Fast
• Limited range

97

Direct addressing

98

Direct Addressing
• Address field contains address of operand
• Effective address (EA) = address field (A)
• e.g. ADD A
  – Add contents of cell A to accumulator
  – Look in memory at address A for operand
• Single memory reference to access data
• No additional calculations to work out effective address
• Limited address space

99

Indirect Addressing

100

Indirect Addressing
• Address field refers to the address of a word in memory, which in turn contains the full-length address of the operand
• EA = (A)
  – Look in A, find address (A) and look there for operand
• e.g. ADD (A)
  – Add contents of cell pointed to by contents of A to accumulator
• Large address space
• Multiple memory accesses to find operand
• Hence slower

101

Register Addressing

102

Register Addressing
• Operand is held in the register named in the address field
• EA = R
• Advantages
  – Only a small address field is needed, so instructions are shorter
  – No memory access is needed, so less time
• Disadvantage
  – Limited number of registers available

103

Register Indirect Addressing

104

Register Indirect Addressing
• Similar to indirect addressing
• EA = (R)
• Operand is in the memory cell pointed to by the contents of register R
• Large address space
• One fewer memory access than indirect addressing

105

Displacement Addressing

106

Displacement Addressing
• Combines direct and register indirect addressing
• EA = A + (R)
• Address field holds two values
  – A = base value
  – R = register that holds displacement
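The effective-address rules for the modes covered so far can be compared side by side in a small sketch. Memory and the register file are modelled as Python dictionaries, and the function name, mode strings and sample contents are all hypothetical.

```python
def fetch_operand(mode, memory, registers, a=None, r=None):
    """Return the operand selected by an addressing mode.

    a is the address field A; r names the register R.
    """
    if mode == "immediate":
        return a                            # operand = A
    if mode == "direct":
        return memory[a]                    # EA = A
    if mode == "indirect":
        return memory[memory[a]]            # EA = (A)
    if mode == "register":
        return registers[r]                 # EA = R
    if mode == "register_indirect":
        return memory[registers[r]]         # EA = (R)
    if mode == "displacement":
        return memory[a + registers[r]]     # EA = A + (R)
    raise ValueError(f"unknown mode {mode}")


MEMORY = {100: 200, 200: 42, 205: 7}
REGISTERS = {"R1": 200, "R2": 5}
```

With these sample contents, direct addressing with A = 100 fetches 200, while indirect addressing with the same A follows the pointer and fetches 42 at the cost of a second memory access.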

107

Relative Addressing
• A version of displacement addressing
• R = program counter, PC
• EA = A + (PC)

108

Base-Register Addressing
• A holds displacement
• R contains a main memory address

109

Indexed Addressing
• A = base
• R = displacement
• EA = A + (R)
• Good for accessing arrays

110

Combinations
• R value is incremented or decremented automatically: auto indexing
• Post index: indexing is performed after the indirection
  – EA = (A) + (R)
  – Address is fetched and indexed with the register value
• Pre index: indexing is performed before the indirection
  – EA = (A + (R))

111

Stack Addressing
• Operand is on top of stack
• Stack pointer

112

x86 Addressing Modes
• Virtual or effective address is an offset into a segment
  – Segment starting address plus effective address gives linear address
  – This goes through page translation, if paging is enabled, to get the physical address
• Addressing modes available
  – Immediate
  – Register
  – Displacement
  – Base with displacement
  – Scaled index with displacement
  – Base with index and displacement
  – Base with scaled index and displacement
  – Relative
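The most general of these modes, base with scaled index and displacement, can be sketched as a single sum. The function names are hypothetical, addresses are assumed to wrap at 32 bits, and the scale factor is restricted to 1, 2, 4 or 8 as on x86.

```python
def x86_effective_address(base=0, index=0, scale=1, displacement=0):
    """EA = base + index * scale + displacement, wrapped to 32 bits."""
    if scale not in (1, 2, 4, 8):
        raise ValueError("scale must be 1, 2, 4 or 8")
    return (base + index * scale + displacement) & 0xFFFFFFFF


def linear_address(segment_base, effective_address):
    """Segment starting address plus effective address, before any paging."""
    return (segment_base + effective_address) & 0xFFFFFFFF
```

Setting unused components to their defaults gives the simpler modes; for example, base 0x1000, index 3, scale 4 and displacement 8 produce the effective address 0x1014.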

113

x86 Addressing Mode Calculation

114

ARM Addressing Modes: Load/Store
• Only instructions that reference memory
• Base register plus offset
• Offset
  – Offset added to or subtracted from base register contents to form the memory address
• Preindex
  – Memory address is formed as for offset addressing
  – Memory address also written back to base register
• Postindex
  – Memory address is base register value
  – Offset added to or subtracted from base register value
  – Result written back to base register
• Base register acts as index register for preindex and postindex addressing
• Offset is either an immediate value in the instruction or another register
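The three indexing methods differ only in which address is used and whether the base register is updated. The following sketch models a load with dictionaries standing in for memory and registers; the function name and mode strings are illustrative and do not correspond to ARM assembler syntax.

```python
def arm_load(memory, registers, base, offset, mode):
    """Model a load using offset, preindex or postindex addressing."""
    if mode == "offset":
        address = registers[base] + offset      # base register unchanged
    elif mode == "preindex":
        address = registers[base] + offset
        registers[base] = address               # address written back to base
    elif mode == "postindex":
        address = registers[base]               # access uses the old base value
        registers[base] += offset               # then base is updated
    else:
        raise ValueError(f"unknown mode {mode}")
    return memory[address]
```

Starting with a base register value of 0x200 and an offset of 0xC, offset and preindex addressing both read from 0x20C, while postindex addressing reads from 0x200; preindex and postindex both leave 0x20C in the base register.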

115

ARM Indexing Methods

116

ARM Data Processing Instruction Addressing & Branch Instructions
• Data processing
  – Register addressing
  – Or mixture of register and immediate addressing
• Branch
  – Immediate

117

ARM Load/Store Multiple Addressing
• Load/store a subset of the general-purpose registers
• Sequential range of memory addresses
• Increment after, increment before, decrement after, and decrement before
• Base register specifies main memory address
• Incrementing or decrementing starts before or after the first memory access

118

ARM Load/Store Multiple Addressing Diagram

119

Instruction Formats

• Layout of bits in an instruction
• Includes opcode
• Includes (implicit or explicit) operand(s)
• Usually more than one instruction format in an instruction set

120

Instruction Length
• Affected by and affects:
  – Memory size
  – Memory organization
  – Bus structure
  – CPU complexity
  – CPU speed
• Trade-off between a powerful instruction repertoire and saving space

121

Allocation of Bits
• Number of addressing modes
• Number of operands
• Register versus memory
• Number of register sets
• Address range
• Address granularity

122

Assembly language
• Machines store and understand binary instructions
• E.g. N = I + J + K, with I, J and K initialized to 2, 3 and 4

123

Improvements
• Use hexadecimal rather than binary
  – Code as series of lines
    • Hex address and memory address
  – Need to translate automatically using program
• Add symbolic names or mnemonics for instructions
• Three fields per line
  – Location address
  – Three letter opcode
  – If memory reference: address
• Need more complex translation program

125

• First field (address) now symbolic
• Memory references in third field now symbolic
• Now have assembly language and need an assembler to translate
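What an assembler has to do for the N = I + J + K example can be sketched as a two-pass translation. Everything specific here is an assumption made for illustration: the three-letter mnemonics LDA, ADD, STA and DAT, the 4-bit opcode values, the 16-bit word with a 12-bit address field, and the load origin 0x101.

```python
# Hypothetical 4-bit opcodes for a one-address accumulator machine.
OPCODES = {"STA": 0x1, "LDA": 0x2, "ADD": 0x3}


def assemble(lines, origin=0x101):
    """Translate (label, opcode, operand) lines into 16-bit words."""
    # Pass 1: give every label the address of the line it appears on.
    symbols = {}
    for offset, (label, _, _) in enumerate(lines):
        if label:
            symbols[label] = origin + offset
    # Pass 2: replace mnemonics and symbolic addresses with numbers.
    words = []
    for _, opcode, operand in lines:
        if opcode == "DAT":
            words.append(int(operand))          # data word, stored as is
        else:
            words.append((OPCODES[opcode] << 12) | symbols[operand])
    return words


PROGRAM = [
    ("", "LDA", "I"),
    ("", "ADD", "J"),
    ("", "ADD", "K"),
    ("", "STA", "N"),
    ("I", "DAT", "2"),
    ("J", "DAT", "3"),
    ("K", "DAT", "4"),
    ("N", "DAT", "0"),
]
```

Two passes are used because the instructions refer to the labels I, J, K and N before the lines that define them have been seen.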

126
