Csa 02

Computer Architecture and Organization Chapter 2 The Central Processing Unit


Page 1: Csa 02

Computer Architecture and Organization

Chapter 2: The Central Processing Unit

Page 2: Csa 02

Arithmetic & Logic Unit
• Does the calculations
• Everything else in the computer is there to service this unit
• Handles integers
• May handle floating-point (real) numbers
• May be a separate FPU (maths co-processor)
• May be an on-chip FPU (486DX onward)

2

Page 3: Csa 02

ALU Inputs and Outputs

3

Page 4: Csa 02

Integer Representation
• Only have 0 and 1 to represent everything
• Positive numbers are stored in binary
  – e.g. 41 = 00101001
• There is no minus sign
• Representing non-negative numbers is straightforward

4

Page 5: Csa 02

Sign-Magnitude
• Leftmost bit (MSB) is the sign bit
• 0 means positive
• 1 means negative
• +18 = 00010010
• -18 = 10010010

5

Page 6: Csa 02

Drawbacks/problems

• Need to consider both sign and magnitude in arithmetic

• Two representations of zero (+0 and -0)

6

Page 7: Csa 02

Two’s Complement
• MSB is the sign bit
• For positive numbers
  – Sign bit is 0
  – Zero is treated as positive
• For negative numbers
  – Sign bit is 1

7

Page 8: Csa 02

Benefits

• One representation of zero
• Arithmetic works easily

8

Page 9: Csa 02

Conversion between different lengths
• An n-bit integer has to be stored in an m-bit register, where m > n
• Sign-magnitude
  – Move the sign bit to the new leftmost position
  – Fill the remaining new bits with zeros
• Two’s complement
  – Move the sign bit to the new leftmost position
  – Fill the remaining new bits with copies of the sign bit (sign extension)
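The two extension rules above can be sketched in Python. This is a minimal illustration, not part of the original slides; the function names are hypothetical, and values are assumed to be passed as raw n-bit patterns.

```python
def extend_sign_magnitude(value: int, n: int, m: int) -> int:
    """Widen an n-bit sign-magnitude pattern to m bits (m > n)."""
    sign = (value >> (n - 1)) & 1
    magnitude = value & ((1 << (n - 1)) - 1)
    # Sign bit moves to the new leftmost position; the gap is zero-filled.
    return (sign << (m - 1)) | magnitude


def extend_twos_complement(value: int, n: int, m: int) -> int:
    """Widen an n-bit two's complement pattern to m bits (m > n)."""
    sign = (value >> (n - 1)) & 1
    if sign:
        # Fill the new high-order bits with copies of the sign bit.
        value |= ((1 << (m - n)) - 1) << n
    return value


# -18 in 8-bit sign-magnitude and in 8-bit two's complement, widened to 16 bits
print(f"{extend_sign_magnitude(0b10010010, 8, 16):016b}")
print(f"{extend_twos_complement(0b11101110, 8, 16):016b}")
```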

9

Page 10: Csa 02

Integer Arithmetic - Negation
• Sign-magnitude
  – Invert the sign bit
• Two’s complement
  – Take the Boolean complement of each bit, including the sign bit
  – Add one to the result

10

Page 11: Csa 02

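Negation Special Cases
• Case 1: negation of 0 is 0 (the carry out of the MSB is ignored)
• Case 2: in 8 bits, negation of -128 gives -128 again, because +128 cannot be represented
• This anomaly is unavoidable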
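A short Python sketch of two's complement negation (complement every bit, then add one) makes both special cases visible. The function name and the default 8-bit width are assumptions made for this illustration.

```python
def negate_twos_complement(value: int, bits: int = 8) -> int:
    """Negate a bit pattern: invert every bit, add one, discard the carry."""
    mask = (1 << bits) - 1
    return ((~value & mask) + 1) & mask


print(f"{negate_twos_complement(0b00010010):08b}")  # +18 -> -18
print(f"{negate_twos_complement(0b00000000):08b}")  # 0 stays 0
print(f"{negate_twos_complement(0b10000000):08b}")  # -128 stays -128
```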

11

Page 12: Csa 02

Range of Numbers
• 8-bit two’s complement
  – +127 = 01111111 = 2^7 - 1
  – -128 = 10000000 = -2^7
• 16-bit two’s complement
  – +32767 = 01111111 11111111 = 2^15 - 1
  – -32768 = 10000000 00000000 = -2^15

12

Page 13: Csa 02

Addition
• Normal binary addition
• If the result is positive, we get a positive number in two’s complement form
• If the result is negative, we get a negative number in two’s complement form
• A carry out of the most significant bit is ignored
• If the result is larger than can be held in the word size being used, the condition is called overflow
• When overflow occurs, the ALU must signal it so that the result is not used

13

Page 14: Csa 02

Overflow rule
• If two numbers are added and they are both positive or both negative, then overflow occurs if and only if the result has the opposite sign
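The overflow rule can be checked with a few lines of Python. This is a sketch under the assumption that operands arrive as raw bit patterns in the range 0 to 2^bits - 1; the function name is hypothetical.

```python
def add_twos_complement(a: int, b: int, bits: int = 8):
    """Add two bit patterns; return (result pattern, overflow flag)."""
    mask = (1 << bits) - 1
    sign = 1 << (bits - 1)
    result = (a + b) & mask  # the carry out of the MSB is ignored
    # Overflow: operands share a sign and the result has the opposite sign.
    overflow = (a & sign) == (b & sign) and (result & sign) != (a & sign)
    return result, overflow


print(add_twos_complement(0b01111111, 0b00000001))  # 127 + 1
print(add_twos_complement(0b00000101, 0b11111001))  # 5 + (-7)
```

Subtraction needs no extra machinery here: negating the subtrahend and calling the same routine follows the rule a - b = a + (-b).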

14

Page 15: Csa 02

Subtraction
• Rule: take the two’s complement of the subtrahend and add it to the minuend
  – i.e. a - b = a + (-b)

• So we only need addition and complement circuits

15

Page 16: Csa 02

Hardware for Addition and Subtraction

16

Page 17: Csa 02

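Multiplication
• Complex
• Work out a partial product for each digit
• Take care with place value (column)
• Add the partial products
• Hardware implementation
  – Multiplier is held in register Q, multiplicand in register M
  – Register A and the carry bit C are initialized to 0
  – Control logic reads the bits of Q one at a time, starting from the least significant bit
  – If the bit is 1, M is added to A, the sum is stored in A and the carry in C; then C, A and Q are shifted right one bit
  – If the bit is 0, only the shift is performed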
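The register-level procedure above can be simulated in Python. This is a minimal sketch, assuming a 4-bit word by default; the variable names follow the registers on the slide, and the function name is hypothetical.

```python
def multiply_unsigned(multiplicand: int, multiplier: int, bits: int = 4) -> int:
    """Shift-and-add multiplication using registers M, Q, A and carry C."""
    mask = (1 << bits) - 1
    m, q = multiplicand & mask, multiplier & mask
    a, c = 0, 0
    for _ in range(bits):
        if q & 1:
            # Q0 = 1: add M to A; the carry out of A goes to C.
            total = a + m
            c, a = total >> bits, total & mask
        # Shift C, A, Q right one bit as a single combined register.
        q = (q >> 1) | ((a & 1) << (bits - 1))
        a = (a >> 1) | (c << (bits - 1))
        c = 0
    return (a << bits) | q  # the product is held in A (high) and Q (low)


print(f"{multiply_unsigned(0b1011, 0b1101):08b}")  # 11 x 13
```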

17

Page 18: Csa 02

Unsigned Binary Multiplication

18

Page 19: Csa 02

Execution of Example

19

Page 20: Csa 02

Flowchart for Unsigned Binary Multiplication

20

Page 21: Csa 02

Multiplying Negative Numbers
• The unsigned algorithm does not work for negative numbers
• Solution 1
  – Convert to positive if required
  – Multiply as unsigned integers
  – If the signs were different, negate the answer
• Solution 2
  – Booth’s algorithm
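The slides name Booth's algorithm without spelling it out, so the following Python sketch follows the usual textbook formulation with registers A, Q, M and an extra bit Q-1. The function name and the 4-bit default are assumptions, and the sketch assumes the multiplicand is not the most negative value for the chosen width.

```python
def booth_multiply(multiplicand: int, multiplier: int, bits: int = 4) -> int:
    """Multiply two signed integers that each fit in `bits` bits."""
    mask = (1 << bits) - 1
    sign = 1 << (bits - 1)
    m = multiplicand & mask
    a, q, q_minus_1 = 0, multiplier & mask, 0
    for _ in range(bits):
        pair = (q & 1, q_minus_1)
        if pair == (1, 0):
            a = (a - m) & mask  # start of a run of 1s: subtract M
        elif pair == (0, 1):
            a = (a + m) & mask  # end of a run of 1s: add M
        # Arithmetic shift right of A, Q, Q-1 (the sign of A is preserved).
        q_minus_1 = q & 1
        q = (q >> 1) | ((a & 1) << (bits - 1))
        a = (a >> 1) | (a & sign)
    product = (a << bits) | q
    # Interpret the 2*bits-wide pattern as a signed value.
    if product & (1 << (2 * bits - 1)):
        product -= 1 << (2 * bits)
    return product


print(booth_multiply(7, 3), booth_multiply(-3, 7), booth_multiply(-5, -3))
```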

21

Page 22: Csa 02

Booth’s Algorithm

22

Page 23: Csa 02

Example of Booth’s Algorithm

23

Page 24: Csa 02

Division
• More complex than multiplication
• Negative numbers are especially awkward
• Based on long division

24

Page 25: Csa 02

Division of Unsigned Binary Integers

Dividend 10010011 (147) divided by divisor 1011 (11):

       00001101    Quotient
1011 ) 10010011    Dividend
        1011
       001110      Partial remainder
         1011
         001111    Partial remainder
           1011
            100    Remainder
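The long division above can be reproduced with a shift-and-subtract loop. This Python sketch is an illustration only: it compares before subtracting rather than subtracting and restoring, and the function name and 8-bit default are assumptions.

```python
def divide_unsigned(dividend: int, divisor: int, bits: int = 8):
    """Return (quotient, remainder) using bit-serial long division."""
    if divisor == 0:
        raise ZeroDivisionError("divisor must be non-zero")
    mask = (1 << bits) - 1
    a, q = 0, dividend & mask  # A holds the partial remainder, Q the dividend
    for _ in range(bits):
        # Shift A, Q left one bit: the top bit of Q moves into A.
        a = (a << 1) | (q >> (bits - 1))
        q = (q << 1) & mask
        if a >= divisor:
            a -= divisor  # the divisor fits: subtract it
            q |= 1        # and record a 1 in the quotient
    return q, a


quotient, remainder = divide_unsigned(0b10010011, 0b1011)
print(f"{quotient:08b} remainder {remainder:b}")
```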

25

Page 26: Csa 02

Flowchart for Unsigned Binary Division

26

Page 27: Csa 02

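Two’s Complement Division
• Load the divisor into the M register and the dividend into the A, Q registers
• Shift A, Q left 1 bit position
• If M and A have the same sign, perform A = A - M; otherwise perform A = A + M
• The operation is successful if the sign of A is the same before and after the operation
  – If the operation is successful or A = 0, set Q0 = 1
  – If the operation is unsuccessful and A is not 0, set Q0 = 0 and restore the previous value of A
• Repeat the shift and add/subtract steps as many times as there are bit positions in Q
• The remainder is in A; the quotient is in Q if the divisor and dividend have the same sign, otherwise it is the two’s complement of Q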

27

Page 28: Csa 02

M = 0011

28

Page 29: Csa 02

Real Numbers
• Numbers with fractions
• Could be done in pure binary
  – 1001.1010 = 2^3 + 2^0 + 2^-1 + 2^-3 = 9.625
• Where is the binary point?
• Fixed?
  – Very limited
• Moving?
  – How do you show where it is?
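The pure-binary example can be checked by summing powers of two on either side of the binary point. A small Python sketch, with a hypothetical function name:

```python
def binary_to_decimal(text: str) -> float:
    """Evaluate a binary string such as '1001.1010' with a fixed binary point."""
    whole, _, fraction = text.partition(".")
    value = float(int(whole, 2)) if whole else 0.0
    for position, bit in enumerate(fraction, start=1):
        if bit == "1":
            value += 2.0 ** -position  # bits right of the point weigh 2^-1, 2^-2, ...
    return value


print(binary_to_decimal("1001.1010"))
```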

29

Page 30: Csa 02

Floating Point

• +/- .significand x 2^exponent
• Radix point is at the right of the MSB
• Biased representation: a fixed value, called the bias, is subtracted from the exponent field to get the true exponent value
• Normalized
  – MSB of the significand is nonzero

30

Page 31: Csa 02

Floating Point Examples

31

Page 32: Csa 02

• First bit is the sign bit
• First bit of the significand is always 1, so there is no need to store it
• 127 is added to the true exponent to give the stored (biased) value

32

Page 33: Csa 02

33

• Using 32 bits
• Negative overflow
• Negative underflow
• Zero
• Positive underflow
• Positive overflow

Page 34: Csa 02

Expressible Numbers

34

Page 35: Csa 02

• In 32 bits
  – 8 bits for the exponent
  – 23 bits for the significand
• Increasing the number of exponent bits increases the range
• But only a fixed number of distinct values can be expressed, so a larger range reduces the density of values and hence the precision
• The only way to increase both range and precision is to use more bits
• So most computers offer two formats
  – Single precision (32 bits)
  – Double precision (64 bits)

35

Page 36: Csa 02

IEEE 754
• Standard for floating-point storage
• Developed to facilitate the portability of programs from one processor to another
• Defines 32-bit single and 64-bit double formats
• 8-bit and 11-bit exponents respectively
• Extended formats also defined (for both significand and exponent)

36

Page 37: Csa 02

IEEE 754 Formats

37

Page 38: Csa 02

• The following number uses the IEEE 32-bit floating-point format. What is the equivalent decimal value?
  – 1 10000011 11000000000000000000000
• Express the following numbers in IEEE 32-bit floating-point format:
  – -1.5
    Ans: 1 01111111 10000000000000000000000
  – 384
    Ans: 0 10000111 10000000000000000000000
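These exercises can be checked with Python's standard `struct` module, which packs and unpacks IEEE 754 single-precision values. The helper names below are hypothetical; the spacing of the output (sign, 8-bit exponent, 23-bit fraction) mirrors the slide.

```python
import struct


def decode_single(bits: str) -> float:
    """Convert a 32-character bit string (spaces allowed) to its float value."""
    raw = int(bits.replace(" ", ""), 2)
    return struct.unpack(">f", raw.to_bytes(4, "big"))[0]


def encode_single(value: float) -> str:
    """Return 'sign exponent fraction' for the single-precision encoding."""
    raw = int.from_bytes(struct.pack(">f", value), "big")
    pattern = f"{raw:032b}"
    return f"{pattern[0]} {pattern[1:9]} {pattern[9:]}"


# Exponent field 10000011 = 131, so the true exponent is 131 - 127 = 4;
# the significand is 1.11 (binary) = 1.75, giving -1.75 x 2^4.
print(decode_single("1 10000011 11000000000000000000000"))
print(encode_single(-1.5))
print(encode_single(384.0))
```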

38

Page 39: Csa 02

Floating point Arithmetic
• For addition and subtraction, both operands must have the same exponent value
• This may require shifting the radix point of one of the operands
• Multiplication and division are more straightforward
• A floating-point operation may produce one of these conditions
  – Exponent overflow: a positive exponent exceeds the maximum possible exponent value
  – Exponent underflow: a negative exponent is less than the minimum possible exponent value
  – Significand underflow: in the process of aligning significands, digits may flow off the right end of the significand
  – Significand overflow: the addition of two significands of the same sign may result in a carry out of the most significant bit

39

Page 40: Csa 02

40

Page 41: Csa 02

FP Arithmetic: Addition and Subtraction
• Addition and subtraction are more complex than multiplication and division because of the need for alignment
• Four basic phases of the algorithm
  – Check for zeros
  – Align significands (adjusting exponents)
  – Add or subtract significands
  – Normalize result

41

Page 42: Csa 02

FP Addition & Subtraction Flowchart

42

Page 43: Csa 02

FP Arithmetic: Multiplication and Division
• Check for zero
• Add/subtract exponents
• Multiply/divide significands
• Normalize
• Round

43

Page 44: Csa 02

Floating Point Multiplication

44

Page 45: Csa 02

Floating Point Division

45

Page 46: Csa 02

Instruction Sets: Characteristics and Functions

Page 47: Csa 02

What is an Instruction Set?

• The complete collection of instructions that the processor can execute

• Machine code
• Binary

47

Page 48: Csa 02

Elements of an Instruction
• Operation code (opcode)
  – Specifies the operation to be performed
  – Expressed as a binary code
• Source operand reference
  – One or more source operands
  – Operands that are inputs for the operation
• Result operand reference
  – Put the answer here
• Next instruction reference
  – Where to fetch the next instruction

48

Page 49: Csa 02

Source and result operands

• Main memory (or virtual memory or cache)
• CPU register
• I/O device

49

Page 50: Csa 02

Instruction Cycle State Diagram

50

Page 51: Csa 02

Instruction Representation

• In machine code each instruction has a unique bit pattern

• For human consumption a symbolic representation is used
  – e.g. ADD, SUB, LOAD
• Opcodes are represented by abbreviations called mnemonics
• Operands can also be represented in this way
  – ADD R,Y

51

Page 52: Csa 02

Simple Instruction Format

52

Page 53: Csa 02

Instruction Types
• Data processing
• Data storage (main memory)
• Data movement (I/O)
• Program flow control

53

Page 54: Csa 02

Number of Addresses (a)
• 3 addresses
  – Operand 1, Operand 2, Result
  – No operand value needs to be overwritten
  – a = b + c;
  – Needs very long words to hold everything

54

Page 55: Csa 02

Number of Addresses (b)
• 2 addresses
  – One address does double duty
    • as operand and result
  – a = a + b
  – Reduces length of instruction

55

Page 56: Csa 02

56

Page 57: Csa 02

Number of Addresses (c)
• 1 address
  – Implicit second address
  – Usually a register (accumulator)
  – Common on early machines

57

Page 58: Csa 02

Number of Addresses (d)
• 0 (zero) addresses
  – All addresses implicit
  – Uses a stack
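To make the zero-address idea concrete, here is a small Python sketch of a stack machine computing a = b + c. The instruction names PUSH, POP and ADD and the dictionary used as memory are assumptions for this illustration. Note that PUSH and POP still name a memory location; only ADD takes its operands implicitly from the stack.

```python
def run_stack_program(program, memory):
    """Execute a list of (operation, *operands) tuples on a simple stack."""
    stack = []
    for operation, *operands in program:
        if operation == "PUSH":
            stack.append(memory[operands[0]])   # copy a memory cell onto the stack
        elif operation == "POP":
            memory[operands[0]] = stack.pop()   # store the top of stack to memory
        elif operation == "ADD":
            right, left = stack.pop(), stack.pop()
            stack.append(left + right)          # operands and result are implicit
        else:
            raise ValueError(f"unknown operation: {operation}")
    return memory


memory = {"a": 0, "b": 2, "c": 3}
program = [("PUSH", "b"), ("PUSH", "c"), ("ADD",), ("POP", "a")]
run_stack_program(program, memory)
print(memory["a"])
```

The same statement takes one instruction on a three-address machine and four here, which is the trade-off between instruction length and instruction count.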

58

Page 59: Csa 02

How Many Addresses
• More addresses
  – More complex instructions
  – More registers
    • Inter-register operations are quicker
  – Fewer instructions per program
• Fewer addresses
  – Less complex instructions
  – Shorter instructions to store
  – More instructions per program
  – Longer execution time

59

Page 60: Csa 02

Instruction Set Design
• Operation repertoire
  – How many operations?
  – What can they do?
  – How complex are they?
• Data types
• Instruction formats
  – Length of opcode field
  – Number of addresses
• Registers
  – Number of CPU registers used
• Addressing modes

60

Page 61: Csa 02

Types of Operand
• Addresses
• Numbers
  – Numbers used in computers differ from those used in ordinary mathematics
  – The latter are unlimited; computer numbers are limited in magnitude and precision
  – Integer, floating point and decimal
• Characters
  – ASCII and IRA
  – EBCDIC used on IBM machines
• Logical data
  – Bits or flags

61

Page 62: Csa 02

x86 Data Types
• 8-bit byte
• 16-bit word
• 32-bit doubleword
• 64-bit quadword
• 128-bit double quadword
• Data is accessed across a 32-bit bus in units of doublewords read at addresses divisible by 4
• Little endian
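Little-endian storage means the least significant byte of a multi-byte value sits at the lowest address. The standard `struct` module can show the byte order explicitly; the value 0x12345678 is just an example chosen for this sketch.

```python
import struct

value = 0x12345678  # a 32-bit doubleword

little = struct.pack("<I", value)  # little endian, as on x86
big = struct.pack(">I", value)     # big endian, for comparison

# Bytes are listed from the lowest address upward.
print(little.hex(" "))
print(big.hex(" "))
```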

62

Page 63: Csa 02

x86 Numeric Data Formats

63

Page 64: Csa 02

SIMD Data Types
• Packed byte and packed byte integer
  – Bytes packed into a 64-bit quadword or 128-bit double quadword
• Packed word and packed word integer
  – 16-bit words packed into a 64-bit quadword or 128-bit double quadword
• Packed doubleword and packed doubleword integer
  – 32-bit doublewords packed into a 64-bit quadword or 128-bit double quadword
• Packed quadword and packed quadword integer
  – Two 64-bit quadwords packed into a 128-bit double quadword
• Packed single-precision floating-point and packed double-precision floating-point
  – Four 32-bit floating-point or two 64-bit floating-point values packed into a 128-bit double quadword

64

Page 65: Csa 02

ARM Data Types
• 8 (byte), 16 (halfword), 32 (word) bits
• Halfword accesses should be halfword aligned and word accesses should be word aligned
• Nonaligned access
  – Default
    • Address treated as truncated
    • Bits[1:0] treated as zero for word
    • Bit[0] treated as zero for halfword
  – Data abort signal indicates alignment fault for attempting unaligned access
• All data types support both unsigned integer and twos-complement signed integer interpretations

65

Page 66: Csa 02

• Majority of ARM processors do not provide floating-point hardware
  – Saves power and area
  – Floating-point arithmetic is implemented in software
• Optional floating-point coprocessor
  – Single- and double-precision IEEE 754 floating-point data types

66

Page 67: Csa 02

ARM Endian Support
• E-bit in system control register
• Under program control

67

Page 68: Csa 02

Types of Operation
• Data transfer
• Arithmetic
• Logical
• Conversion
• I/O
• System control
• Transfer of control

68

Page 69: Csa 02

Data Transfer
• Specify
  – Source
  – Destination
  – Length of data
• May be different instructions for different movements
  – e.g. IBM 370
• Or one instruction and different addresses
  – e.g. VAX

69

Page 70: Csa 02

70

Page 71: Csa 02

Arithmetic

71

Page 72: Csa 02

LOGICAL

72

Page 73: Csa 02

Shift and Rotate Operations

73

Page 74: Csa 02

Conversion

74

Page 75: Csa 02

Input/output
• May be specific instructions
• May be done using data movement instructions (memory mapped)
• May be done by a separate controller (DMA)

75

Page 76: Csa 02

Systems Control
• Can be executed only when the processor is in a special (privileged) state
• Used to access control registers
• For operating system use

76

Page 77: Csa 02

Need for Control Instructions
• A group of instructions may need to be executed repeatedly
• Decision making: executing code only when a condition is satisfied
• Breaking tasks into smaller pieces

77

Page 78: Csa 02

78

Page 79: Csa 02

Transfer of Control
• Branch or jump instruction
  – Conditional
  – Unconditional
• Consider a subtraction
  – BRP X: branch to location X if result is positive
  – BRN X: branch to location X if result is negative
  – BRZ X: branch to location X if result is zero
  – BRO X: branch to location X if overflow occurs
• BRE R1, R2, X: branch to X if contents of R1 = contents of R2

79

Page 80: Csa 02

80

Page 81: Csa 02

• Skip
  – Skip the next instruction
  – e.g. increment and skip if zero
  – ISZ Register1

81

Page 82: Csa 02

Procedure call
• Advantages
  – Code reuse
  – Efficient use of storage
• Two basic instructions
  – Call: branches from the present location to the procedure
  – Return: returns from the procedure to the place from which it was called

82

Page 83: Csa 02

Nested Procedure Calls

83

Page 84: Csa 02

Use of Stack

84

Page 85: Csa 02

Stack Frame Growth Using Sample Procedures P and Q

85

Page 86: Csa 02

x86 status flags

86

Page 87: Csa 02

x86 operation types

87

Page 88: Csa 02

88

Page 89: Csa 02

89

Page 90: Csa 02

90

Page 91: Csa 02

ARM operation types
• Load and store instructions
• Branch instructions
• Data-processing instructions
• Multiply instructions
• Parallel addition and subtraction instructions
  – Image processing applications
• Status register access instructions
  – N, Z, C, V

91

Page 92: Csa 02

Unusual aspects of ARM
• All instructions include a condition code field
  – Not only branch instructions
• All data processing instructions include an S bit
  – Indicates whether the instruction updates the condition flags

92

Page 93: Csa 02

Instruction Sets: Addressing Modes and Formats

Page 94: Csa 02

Addressing Modes
• Immediate
• Direct
• Indirect
• Register
• Register indirect
• Displacement
• Stack

94

Page 95: Csa 02

• Virtually all architectures provide more than one of these addressing modes
• How can the processor determine which addressing mode is being used?
  – One or more bits in the instruction format can be used as a mode field
  – The value of the mode field determines which addressing mode is to be used

95

Page 96: Csa 02

Immediate Addressing

96

Page 97: Csa 02

Immediate Addressing
• Operand is part of the instruction
• Operand = address field
• e.g. ADD 5
  – Add 5 to contents of accumulator
  – 5 is the operand
• No memory reference to fetch data
• Fast
• Limited range

97

Page 98: Csa 02

Direct addressing

98

Page 99: Csa 02

Direct Addressing
• Address field contains the address of the operand
• Effective address (EA) = address field (A)
• e.g. ADD A
  – Add contents of cell A to accumulator
  – Look in memory at address A for the operand
• Single memory reference to access data
• No additional calculations to work out the effective address
• Limited address space

99

Page 100: Csa 02

Indirect Addressing

100

Page 101: Csa 02

Indirect Addressing
• Address field refers to the address of a word in memory, which in turn contains the full-length address of the operand
• EA = (A)
  – Look in A, find address (A) and look there for the operand
• e.g. ADD (A)
  – Add contents of the cell pointed to by the contents of A to the accumulator
• Large address space
• Multiple memory accesses to find the operand
• Hence slower

101

Page 102: Csa 02

Register Addressing

102

Page 103: Csa 02

Register Addressing
• Operand is held in the register named in the address field
• EA = R
• Advantages
  – Only a small address field is needed, so instructions are shorter and take less time
• Disadvantage
  – Limited number of registers available

103

Page 104: Csa 02

Register Indirect Addressing

104

Page 105: Csa 02

Register Indirect Addressing
• Analogous to indirect addressing
• EA = (R)
• Operand is in the memory cell pointed to by the contents of register R
• Large address space
• One fewer memory access than indirect addressing

105

Page 106: Csa 02

Displacement Addressing

106

Page 107: Csa 02

Displacement Addressing
• Combines direct and register indirect addressing
• EA = A + (R)
• Address field holds two values
  – A = base value
  – R = register that holds displacement
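The effective-address formulas for the modes covered so far can be put side by side in a short Python sketch. Memory and the register file are modelled as dictionaries, and the mode names, function name and sample contents are all assumptions made for this illustration.

```python
def fetch_operand(mode, memory, registers, a=None, r=None):
    """Return the operand selected by an addressing mode.

    a is the address field value, r is a register name.
    """
    if mode == "immediate":
        return a                         # operand = address field
    if mode == "direct":
        return memory[a]                 # EA = A
    if mode == "indirect":
        return memory[memory[a]]         # EA = (A)
    if mode == "register":
        return registers[r]              # EA = R
    if mode == "register_indirect":
        return memory[registers[r]]      # EA = (R)
    if mode == "displacement":
        return memory[a + registers[r]]  # EA = A + (R)
    raise ValueError(f"unknown addressing mode: {mode}")


memory = {7: 55, 10: 20, 17: 77, 20: 99}
registers = {"R1": 7}
for mode in ("immediate", "direct", "indirect", "displacement"):
    print(mode, fetch_operand(mode, memory, registers, a=10, r="R1"))
```

Counting the dictionary lookups on `memory` in each branch gives the number of memory references the mode costs: none for immediate and register, one for direct and register indirect, two for indirect.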

107

Page 108: Csa 02

Relative Addressing
• A version of displacement addressing
• R = program counter, PC
• EA = A + (PC)

108

Page 109: Csa 02

Base-Register Addressing
• A holds displacement
• R contains a main memory address

109

Page 110: Csa 02

Indexed Addressing
• A = base
• R = displacement
• EA = A + (R)
• Good for accessing arrays

110

Page 111: Csa 02

Combinations
• Auto indexing: the value in R is incremented or decremented automatically
• Post index: indexing is performed after the indirection
  – EA = (A) + (R)
  – The address is fetched and then indexed with the register value
• Pre index: indexing is performed before the indirection
  – EA = (A + (R))

111

Page 112: Csa 02

Stack Addressing
• Operand is on top of the stack
• Stack pointer

112

Page 113: Csa 02

x86 Addressing Modes
• Virtual or effective address
  – Segment starting address plus effective address gives linear address
  – This goes through page translation, if paging is enabled, to get the physical address
• Addressing modes available
  – Immediate
  – Register
  – Displacement
  – Base with displacement
  – Scaled index with displacement
  – Base with index and displacement
  – Base with scaled index and displacement
  – Relative

113

Page 114: Csa 02

x86 Addressing Mode Calculation

114

Page 115: Csa 02

ARM Addressing Modes: Load/Store

• Load and store are the only instructions that reference memory
• Memory is addressed through a base register plus offset
• Offset
  – Offset added to or subtracted from base register contents to form the memory address
• Preindex
  – Memory address is formed as for offset addressing
  – Memory address is also written back to the base register
• Postindex
  – Memory address is the base register value
  – Offset is then added or subtracted
  – Result is written back to the base register
• Base register acts as an index register for preindex and postindex addressing
• Offset is either an immediate value in the instruction or another register
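The difference between the three ARM indexing methods lies in which address is used and whether the base register is updated. A Python sketch, with registers modelled as a dictionary; the function name and mode strings are hypothetical.

```python
def arm_load_store_address(mode, registers, base, offset):
    """Return the memory address used, updating the base register if required."""
    if mode == "offset":
        return registers[base] + offset  # base register is left unchanged
    if mode == "preindex":
        registers[base] += offset        # address is written back first
        return registers[base]
    if mode == "postindex":
        address = registers[base]        # access uses the old base value
        registers[base] += offset        # then the result is written back
        return address
    raise ValueError(f"unknown indexing mode: {mode}")


registers = {"r1": 0x200}
for mode in ("offset", "preindex", "postindex"):
    address = arm_load_store_address(mode, registers, "r1", 0xC)
    print(mode, hex(address), hex(registers["r1"]))
```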

115

Page 116: Csa 02

ARM Indexing Methods

116

Page 117: Csa 02

ARM Data Processing Instruction Addressing & Branch Instructions

• Data processing
  – Register addressing
  – Or a mixture of register and immediate addressing
• Branch
  – Immediate

117

Page 118: Csa 02

ARM Load/Store Multiple Addressing
• Load/store a subset of the general-purpose registers
• Sequential range of memory addresses
• Increment after, increment before, decrement after, and decrement before
• Base register specifies the main memory address
• Incrementing or decrementing starts before or after the first memory access

118

Page 119: Csa 02

ARM Load/Store Multiple Addressing Diagram

119

Page 120: Csa 02

Instruction Formats

• Layout of bits in an instruction
• Includes opcode
• Includes (implicit or explicit) operand(s)
• Usually more than one instruction format in an instruction set

120

Page 121: Csa 02

Instruction Length
• Affected by and affects:
  – Memory size
  – Memory organization
  – Bus structure
  – CPU complexity
  – CPU speed
• Trade-off between a powerful instruction repertoire and saving space

121

Page 122: Csa 02

Allocation of Bits
• Number of addressing modes
• Number of operands
• Register versus memory
• Number of register sets
• Address range
• Address granularity

122

Page 123: Csa 02

Assembly language
• Machines store and understand binary instructions
• E.g. N = I + J + K, with I, J and K initialized to 2, 3 and 4

123

Page 124: Csa 02

124

Page 125: Csa 02

Improvements
• Use hexadecimal rather than binary
  – Code as a series of lines
    • Hex address and memory address
  – Need to translate automatically using a program
• Add symbolic names or mnemonics for instructions
• Three fields per line
  – Location address
  – Three-letter opcode
  – If memory reference: address
• Need a more complex translation program

125

Page 126: Csa 02

• First field (address) now symbolic
• Memory references in third field now symbolic
• Now have assembly language and need an assembler to translate
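What such an assembler does can be sketched as a two-pass translation in Python, using the N = I + J + K example. Everything specific here is an assumption for illustration: the mnemonics LDA, ADD, STA, HLT and the DAT directive, their 4-bit opcode values, the 16-bit word layout (4-bit opcode plus 12-bit address), and the starting address 0x101.

```python
# Hypothetical 4-bit opcodes, chosen only for this sketch.
OPCODES = {"STA": 0x1, "LDA": 0x2, "ADD": 0x3, "HLT": 0xF}


def assemble(source: str, origin: int = 0x101) -> dict:
    """Translate symbolic lines into {address: 16-bit word}."""
    symbols, parsed = {}, []
    address = origin
    # Pass 1: record the address of every symbolic location (first field).
    for line in source.strip().splitlines():
        fields = line.split()
        if fields[0] not in OPCODES and fields[0] != "DAT":
            symbols[fields[0]] = address
            fields = fields[1:]
        parsed.append((address, fields))
        address += 1
    # Pass 2: replace mnemonics and symbolic references with numbers.
    words = {}
    for address, (op, *operand) in parsed:
        if op == "DAT":
            words[address] = int(operand[0])  # a data word with an initial value
        else:
            target = symbols[operand[0]] if operand else 0
            words[address] = (OPCODES[op] << 12) | target
    return words


PROGRAM = """
FORMUL LDA I
       ADD J
       ADD K
       STA N
       HLT
I      DAT 2
J      DAT 3
K      DAT 4
N      DAT 0
"""

words = assemble(PROGRAM)
for address, word in words.items():
    print(f"{address:03X}  {word:04X}")
```

Two passes are used because a line may refer to a symbol, such as N, that is only defined further down the program.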

126