Csa 02
Computer Architecture and Organization
Chapter 2: The Central Processing Unit
Arithmetic & Logic Unit
• Does the calculations
• Everything else in the computer is there to service this unit
• Handles integers
• May handle floating-point (real) numbers
• May be a separate FPU (maths co-processor)
• May be an on-chip FPU (486DX onwards)
2
ALU Inputs and Outputs
3
Integer Representation
• Only have 0 & 1 to represent everything
• Positive numbers are stored in binary
– e.g. 41 = 00101001
• No minus sign
• Representation of non-negative numbers is straightforward
4
Sign-Magnitude
• Leftmost bit (MSB) is the sign bit
• 0 means positive
• 1 means negative
• +18 = 00010010
• -18 = 10010010
5
Drawbacks/problems
• Need to consider both sign and magnitude in arithmetic
• Two representations of zero (+0 and -0)
6
Two’s Complement
• MSB is the sign bit
• For positive numbers
– Sign bit 0
– Zero is treated as positive
• For negative numbers
– Sign bit 1
7
Benefits
• One representation of zero
• Arithmetic works easily
8
Conversion between different lengths
• An n-bit integer has to be stored in an m-bit register, where m > n
• In sign-magnitude representation
– Move the sign bit to the new leftmost position
– Fill the remaining new positions with zeros
• In two’s complement
– Move the sign bit to the new leftmost position
– Fill the remaining new positions with copies of the sign bit
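The two widening rules above can be sketched in a few lines. This is a minimal illustration that works on bit strings; the function names are made up for the example and do not come from the slides.

```python
def extend_sign_magnitude(bits: str, m: int) -> str:
    """Widen an n-bit sign-magnitude pattern to m bits."""
    sign, magnitude = bits[0], bits[1:]
    # The sign bit moves to the new leftmost position; the gap is zero-filled.
    return sign + magnitude.rjust(m - 1, "0")


def extend_twos_complement(bits: str, m: int) -> str:
    """Widen an n-bit two's complement pattern to m bits (sign extension)."""
    # Every new position is filled with a copy of the sign bit.
    return bits.rjust(m, bits[0])


print(extend_sign_magnitude("10010010", 16))   # -18 in sign-magnitude
print(extend_twos_complement("11101110", 16))  # -18 in two's complement
```

Both calls keep the value -18 while doubling the word length; only the fill rule differs.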
9
Integer Arithmetic - Negation
• Sign magnitude
– Invert the sign bit
• 2’s complement
– Take the Boolean complement of each bit, including the sign bit
– Add one to the result
10
Negation Special Case
• Case 1: negation of 0 is 0 (the carry out of the MSB is ignored)
• Case 2: negation of -128 is -128 in 8 bits, because +128 cannot be represented
• Unavoidable
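A short sketch of two's complement negation, including both special cases. It assumes values are held as unsigned bit patterns of a fixed width; the helper names are illustrative only.

```python
def negate(pattern: int, bits: int = 8) -> int:
    """Negate a two's complement bit pattern: complement every bit, then add one."""
    mask = (1 << bits) - 1
    inverted = ~pattern & mask
    return (inverted + 1) & mask  # a carry out of the MSB is discarded


def to_signed(pattern: int, bits: int = 8) -> int:
    """Interpret a bit pattern as a signed two's complement value."""
    return pattern - (1 << bits) if pattern >> (bits - 1) else pattern


print(format(negate(0b00010010), "08b"))  # +18 becomes -18
print(to_signed(negate(0b10000000)))      # -128 negates to itself
```

The second call shows why the -128 case cannot be avoided: the range has one more negative value than positive values.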
11
Range of Numbers
• 8-bit 2’s complement
– +127 = 01111111 = 2^7 - 1
– -128 = 10000000 = -2^7
• 16-bit 2’s complement
– +32767 = 01111111 11111111 = 2^15 - 1
– -32768 = 10000000 00000000 = -2^15
12
Addition
• Normal binary addition
• If the result is positive, we get a positive number in 2’s complement
• If the result is negative, we get a negative number in 2’s complement
• A carry out of the most significant bit is ignored
• If the result is too large for the word size being used, the condition is called overflow
• When overflow occurs, the ALU must signal it so that the result is not used
13
Overflow rule
• If two numbers are added, and they are both positive or both negative, then overflow occurs if and only if the result has the opposite sign.
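The overflow rule translates directly into a sign-bit comparison. The sketch below assumes operands are passed as bit patterns of a fixed width; `add_with_overflow` is a name chosen for this example.

```python
def add_with_overflow(a: int, b: int, bits: int = 8):
    """Add two two's complement bit patterns; return (result, overflow flag)."""
    mask = (1 << bits) - 1
    sign = 1 << (bits - 1)
    result = (a + b) & mask  # carry out of the MSB is ignored
    # Overflow: operands share a sign and the result has the opposite sign.
    overflow = (a & sign) == (b & sign) and (result & sign) != (a & sign)
    return result, overflow


print(add_with_overflow(0b0101, 0b0100, bits=4))  # 5 + 4 does not fit in 4 bits
print(add_with_overflow(0b1100, 0b0100, bits=4))  # -4 + 4 = 0, carry ignored
```

Note that a carry out of the MSB and overflow are different events: the second call produces a carry but no overflow.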
14
Subtraction
• Rule: take the two’s complement of the subtrahend and add it to the minuend
– i.e. a - b = a + (-b)
• So we only need addition and complement circuits
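As a small illustration of the rule a - b = a + (-b), the sketch below builds subtraction from a complement step and an addition step only. Operands are assumed to be fixed-width bit patterns, and the function name is hypothetical.

```python
def subtract(a: int, b: int, bits: int = 8) -> int:
    """Compute a - b on two's complement bit patterns using only complement and add."""
    mask = (1 << bits) - 1
    negated_b = (~b + 1) & mask    # two's complement of the subtrahend
    return (a + negated_b) & mask  # ordinary addition, carry out ignored


print(format(subtract(0b0010, 0b0111, bits=4), "04b"))  # 2 - 7 = -5
print(format(subtract(0b0101, 0b0010, bits=4), "04b"))  # 5 - 2 = 3
```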
15
Hardware for Addition and Subtraction
16
Multiplication
• Complex
• Work out a partial product for each digit
• Take care with place value (column)
• Add the partial products
• Hardware multiplication
– Multiplier in register Q, multiplicand in register M
– Register A and carry bit C are initially 0
– Control logic reads the bits of Q one at a time; if the bit is 1, M is added to A and the result is stored in A, with any carry in C
– C, A and Q are then shifted right one bit; if the bit was 0, only the shift is performed
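The register-level procedure above can be simulated directly. This is a sketch under the assumption of n-bit registers A, Q and M plus a one-bit carry C; the function name is illustrative.

```python
def multiply_unsigned(multiplicand: int, multiplier: int, bits: int = 4) -> int:
    """Shift-and-add multiplication using registers C, A, Q and M."""
    mask = (1 << bits) - 1
    m, q = multiplicand & mask, multiplier & mask
    a, c = 0, 0
    for _ in range(bits):
        if q & 1:                      # Q0 = 1: add M to A, carry goes to C
            total = a + m
            c, a = total >> bits, total & mask
        # Shift C, A, Q right by one bit as a single combined register.
        q = (q >> 1) | ((a & 1) << (bits - 1))
        a = (a >> 1) | (c << (bits - 1))
        c = 0
    return (a << bits) | q             # the 2n-bit product sits in A, Q


print(format(multiply_unsigned(0b1011, 0b1101), "08b"))  # 11 x 13 = 143
```

The product of two n-bit numbers needs up to 2n bits, which is why the result is read from A and Q together.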
17
Unsigned Binary Multiplication
18
Execution of Example
19
Flowchart for Unsigned Binary Multiplication
20
Multiplying Negative Numbers
• The unsigned algorithm does not work for negative numbers
• Solution 1
– Convert to positive
– Multiply as unsigned integers
– If the signs were different, negate the answer
• Solution 2
– Booth’s algorithm
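A compact sketch of Booth's algorithm as it is usually presented, with registers A, Q, M and an extra bit Q-1. The register width and function name are assumptions for the example, and the most negative multiplicand (for instance -8 in 4 bits) is outside what this simple version handles.

```python
def booth_multiply(multiplicand: int, multiplier: int, bits: int = 4) -> int:
    """Signed multiplication of two n-bit two's complement values (Booth's algorithm)."""
    mask = (1 << bits) - 1
    sign = 1 << (bits - 1)
    m, q = multiplicand & mask, multiplier & mask
    a, q_minus_1 = 0, 0
    for _ in range(bits):
        pair = (q & 1, q_minus_1)
        if pair == (1, 0):             # start of a run of 1s: A = A - M
            a = (a - m) & mask
        elif pair == (0, 1):           # end of a run of 1s: A = A + M
            a = (a + m) & mask
        # Arithmetic shift right of A, Q, Q-1 (the sign bit of A is preserved).
        q_minus_1 = q & 1
        q = (q >> 1) | ((a & 1) << (bits - 1))
        a = (a >> 1) | (a & sign)
    product = (a << bits) | q
    # Interpret the 2n-bit pattern as a signed value.
    if product >> (2 * bits - 1):
        product -= 1 << (2 * bits)
    return product


print(booth_multiply(7, 3))    # 21
print(booth_multiply(-7, 3))   # -21
```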
21
Booth’s Algorithm
22
Example of Booth’s Algorithm
23
Division
• More complex than multiplication
• Negative numbers are really bad!
• Based on long division
24
Division of Unsigned Binary Integers
                00001101   Quotient
Divisor  1011 ) 10010011   Dividend
                 1011
                001110     Partial remainder
                  1011
                  001111   Partial remainder
                    1011
                     100   Remainder
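The long-division example (10010011 divided by 1011) can be reproduced with a shift-and-subtract loop. This is a sketch of restoring division using registers A, Q and M; the slides' flowchart is not available here, so the details are an assumption based on the usual formulation.

```python
def divide_unsigned(dividend: int, divisor: int, bits: int = 8):
    """Restoring division: returns (quotient, remainder)."""
    if divisor == 0:
        raise ZeroDivisionError("divisor must be nonzero")
    mask = (1 << bits) - 1
    a, q, m = 0, dividend & mask, divisor
    for _ in range(bits):
        # Shift A, Q left one bit; the MSB of Q moves into A.
        a = (a << 1) | (q >> (bits - 1))
        q = (q << 1) & mask
        a -= m                 # trial subtraction
        if a < 0:
            a += m             # unsuccessful: restore A, quotient bit stays 0
        else:
            q |= 1             # successful: quotient bit is 1
    return q, a


quotient, remainder = divide_unsigned(0b10010011, 0b1011)
print(format(quotient, "08b"), format(remainder, "b"))
```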
25
Flowchart for Unsigned Binary Division
26
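2’s complement Division
• Load the divisor into the M register and the dividend into the A, Q registers
• Shift A, Q left 1 bit position
• If M and A have the same sign, perform A = A - M; otherwise perform A = A + M
• The operation is successful if the sign of A is the same before and after the operation
– If the operation is successful or A = 0, set Q0 = 1
– If the operation is unsuccessful and A is not 0, set Q0 = 0 and restore the previous value of A
• Repeat the shift, add/subtract and test steps once for each bit position in Q
• The remainder is in A; the quotient is in Q if divisor and dividend have the same sign, otherwise the quotient is the 2’s complement of Q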
27
M = 0011
28
Real Numbers
• Numbers with fractions
• Could be done in pure binary
– 1001.1010 = 2^3 + 2^0 + 2^-1 + 2^-3 = 9.625
• Where is the binary point?
• Fixed?
– Very limited
• Moving?
– How do you show where it is?
29
Floating Point
• +/- .significand x 2^exponent
• Radix point is at the right of the MSB of the significand
• Biased representation (a fixed value called the bias is subtracted from the exponent field to get the true exponent value)
• Normalized
– MSB of the significand is nonzero
30
Floating Point Examples
31
• First bit is the sign bit
• First bit of the significand is always 1 (no need to store it)
• 127 is added to the true exponent to form the stored exponent; subtracting 127 from the stored field recovers the true exponent
32
33
Expressible Numbers
• Using 32 bits
• Negative overflow
• Negative underflow
• Zero
• Positive underflow
• Positive overflow
34
• In 32 bits
– 8 bits for the exponent
– 23 bits for the significand
• Increasing the number of exponent bits increases the range
• But the number of distinct values stays fixed, so a wider range means lower density and precision
• The only way to increase both range and precision is to use more bits
• So most computers offer two formats
– Single precision (32 bits)
– Double precision (64 bits)
35
IEEE 754
• Standard for floating-point storage
• Developed to facilitate the portability of programs from one processor to another
• Defines 32-bit single and 64-bit double formats
• 8-bit and 11-bit exponents respectively
• Extended formats also defined (both significand and exponent)
36
IEEE 754 Formats
37
• The following number uses the IEEE 32-bit floating-point format. What is the equivalent decimal value?
1 10000011 11000000000000000000000
• Express the following numbers in IEEE 32-bit floating-point format:
• -1.5
• Ans: 1 01111111 10000000000000000000000
• 384
• Ans: 0 10000111 10000000000000000000000
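The exercise can be checked with a few lines of Python. This sketch decodes a normalized single-precision pattern by hand (sign, biased exponent, implied leading 1) and uses the standard `struct` module for encoding; the helper names are made up for the example.

```python
import struct


def decode_single(bit_string: str) -> float:
    """Decode a normalized IEEE 754 single written as 'S EEEEEEEE FFF...F'."""
    bits = bit_string.replace(" ", "")
    sign = int(bits[0])
    stored_exponent = int(bits[1:9], 2)
    fraction = int(bits[9:], 2)
    significand = 1 + fraction / 2**23          # implied leading 1
    return (-1) ** sign * significand * 2.0 ** (stored_exponent - 127)


def encode_single(value: float) -> str:
    """Encode a float as an IEEE 754 single, returned as 'S EEEEEEEE FFF...F'."""
    (raw,) = struct.unpack(">I", struct.pack(">f", value))
    bits = f"{raw:032b}"
    return f"{bits[0]} {bits[1:9]} {bits[9:]}"


print(decode_single("1 10000011 11000000000000000000000"))  # -28.0
print(encode_single(-1.5))
print(encode_single(384.0))
```

For 384 the reasoning is 384 = 1.1 (binary) x 2^8, so the stored exponent is 8 + 127 = 135 = 10000111 and the stored fraction begins with a 1.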
38
Floating point Arithmetic
• For addition and subtraction, both operands must have the same exponent value
• This requires shifting the radix point of one of the operands
• Multiplication and division are more straightforward
• A floating-point operation may produce one of these conditions
– Exponent overflow: a positive exponent exceeds the maximum possible exponent value
– Exponent underflow: a negative exponent is less than the minimum possible exponent value
– Significand underflow: in the process of aligning significands, digits may flow off the right end of the significand
– Significand overflow: the addition of two significands of the same sign may result in a carry out of the most significant bit
39
40
FP Arithmetic Addition and subtraction
• Addition and subtraction are more complex than multiplication and division, because of the need for alignment
• Four basic phases of the algorithm
– Check for zeros
– Align significands (adjusting exponents)
– Add or subtract significands
– Normalize result
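The four phases can be illustrated with a toy model. The sketch below is not IEEE 754: it assumes a number is a pair (significand, exponent) with value significand x 2^exponent, a signed integer significand of `P` bits, and truncation instead of rounding. All names are hypothetical.

```python
P = 8  # significand width in bits (an assumption for this sketch)


def shift_right(s: int, n: int) -> int:
    """Shift the magnitude right by n bits, keeping the sign (low bits are lost)."""
    return s >> n if s >= 0 else -((-s) >> n)


def fp_add(x, y):
    """Add two (significand, exponent) pairs and return a normalized pair."""
    (sx, ex), (sy, ey) = x, y
    # Phase 1: check for zeros.
    if sx == 0:
        return y
    if sy == 0:
        return x
    # Phase 2: align significands by raising the smaller exponent.
    if ex < ey:
        sx, ex, sy, ey = sy, ey, sx, ex
    sy = shift_right(sy, ex - ey)
    # Phase 3: add the signed significands.
    s, e = sx + sy, ex
    if s == 0:
        return (0, 0)
    # Phase 4: normalize so that 2**(P-1) <= |s| < 2**P.
    while abs(s) >= 1 << P:            # significand overflow: shift right
        s, e = shift_right(s, 1), e + 1
    while abs(s) < 1 << (P - 1):       # leading zeros: shift left
        s, e = s << 1, e - 1
    return (s, e)


print(fp_add((0b11000000, -6), (0b10000000, -7)))  # 3.0 + 1.0
```

Subtraction reuses the same routine with the sign of the second significand flipped, for example `fp_add(x, (-sy, ey))`.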
41
FP Addition & Subtraction Flowchart
42
FP Arithmetic Multiplication and Division
• Check for zero
• Add/subtract exponents
• Multiply/divide significands
• Normalize
• Round
43
Floating Point Multiplication
44
Floating Point Division
45
Instruction Sets: Characteristics and Functions
What is an Instruction Set?
• The complete collection of instructions that the processor can execute
• Machine code
• Binary
47
Elements of an Instruction
• Operation code (opcode)
– Specifies the operation to be performed
– Specified by a binary code called the opcode
• Source operand reference
– One or more source operands
– Operands that are inputs for the operation
• Result operand reference
– Put the answer here
• Next instruction reference
– Where to fetch the next instruction
48
Source and result operands
• Main memory (or virtual memory or cache)
• CPU register
• I/O device
49
Instruction Cycle State Diagram
50
Instruction Representation
• In machine code each instruction has a unique bit pattern
• For human consumption a symbolic representation is used
– e.g. ADD, SUB, LOAD
• Opcodes are represented by abbreviations called mnemonics
• Operands can also be represented in this way
– ADD R,Y
51
Simple Instruction Format
52
Instruction Types
• Data processing
• Data storage (main memory)
• Data movement (I/O)
• Program flow control
53
Number of Addresses (a)
• 3 addresses
– Operand 1, Operand 2, Result
– Does not change the value of either operand
– a = b + c;
– Needs very long words to hold everything
54
Number of Addresses (b)
• 2 addresses
– One address does double duty as operand and result
– a = a + b
– Reduces length of instruction
55
56
Number of Addresses (c)
• 1 address
– Implicit second address
– Usually a register (accumulator)
– Common on early machines
57
Number of Addresses (d)
• 0 (zero) addresses
– All addresses implicit
– Uses a stack
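To make the zero-address idea concrete, here is a small sketch of a stack machine evaluating a = b + c. The instruction names (PUSH, POP, ADD, MUL) and the dictionary used as memory are assumptions for illustration; only PUSH and POP name a memory location, while the arithmetic instructions take their operands implicitly from the stack.

```python
def run_stack_program(program, memory):
    """Execute a list of stack-machine instructions against a memory dict."""
    stack = []
    for instruction in program:
        op, *operand = instruction.split()
        if op == "PUSH":                       # memory -> top of stack
            stack.append(memory[operand[0]])
        elif op == "POP":                      # top of stack -> memory
            memory[operand[0]] = stack.pop()
        elif op in ("ADD", "MUL"):             # zero-address arithmetic
            right, left = stack.pop(), stack.pop()
            stack.append(left + right if op == "ADD" else left * right)
        else:
            raise ValueError(f"unknown instruction: {instruction}")
    return memory


memory = {"a": 0, "b": 2, "c": 3}
run_stack_program(["PUSH b", "PUSH c", "ADD", "POP a"], memory)
print(memory["a"])  # 5
```

The same statement takes one instruction on a 3-address machine and four here, which is the trade-off between instruction length and instruction count.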
58
How Many Addresses
• More addresses
– More complex instructions
– More registers, so inter-register operations are quicker
– Fewer instructions per program
• Fewer addresses
– Less complex instructions
– Shorter instructions to store
– More instructions per program
– Longer execution time
59
Instruction set Design
• Operation repertoire
– How many operations?
– What can they do?
– How complex are they?
• Data types
• Instruction formats
– Length of opcode field
– Number of addresses
• Registers
– Number of CPU registers used
• Addressing modes
60
Types of Operand
• Addresses
• Numbers
– Numbers used in ordinary maths and numbers stored in computers differ
– The latter are limited in magnitude and precision
– Integer, floating point and decimal
• Characters
– ASCII and IRA
– EBCDIC used in IBM
• Logical Data
– Bits or flags
61
x86 Data Types
• 8-bit byte
• 16-bit word
• 32-bit double word
• 64-bit quad word
• 128-bit double quad word
• Data accessed across a 32-bit bus in units of double words read at addresses divisible by 4
• Little endian
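Little endian means the least significant byte is stored at the lowest address. A quick sketch using Python's built-in `int.to_bytes`; the value 0x12345678 is just an example.

```python
def byte_layout(value: int, byteorder: str) -> str:
    """Return the bytes of a 32-bit double word in memory order, lowest address first."""
    return value.to_bytes(4, byteorder).hex(" ")


print(byte_layout(0x12345678, "little"))  # 78 56 34 12
print(byte_layout(0x12345678, "big"))     # 12 34 56 78
```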
62
x86 Numeric Data Formats
63
SIMD Data Types
• Packed byte and packed byte integer
– Bytes packed into a 64-bit quadword or 128-bit double quadword
• Packed word and packed word integer
– 16-bit words packed into a 64-bit quadword or 128-bit double quadword
• Packed doubleword and packed doubleword integer
– 32-bit doublewords packed into a 64-bit quadword or 128-bit double quadword
• Packed quadword and packed quadword integer
– Two 64-bit quadwords packed into a 128-bit double quadword
• Packed single-precision floating-point and packed double-precision floating-point
– Four 32-bit floating-point or two 64-bit floating-point values packed into a 128-bit double quadword
64
ARM Data Types
• 8 (byte), 16 (halfword), 32 (word) bits
• Halfword accesses should be halfword aligned and word accesses should be word aligned
• Nonaligned access
– Default: the address is treated as truncated, with bits[1:0] treated as zero for a word access and bit[0] treated as zero for a halfword access
– Data abort signal indicates an alignment fault on an attempted unaligned access
• All data types support both unsigned integer and twos-complement signed interpretations
65
• The majority of ARM processors do not provide floating-point hardware
– Saves power and area
– Floating-point arithmetic is implemented in software
• Optional floating-point coprocessor
– Single- and double-precision IEEE 754 floating-point data types
66
ARM Endian Support
• E-bit in system control register
• Under program control
67
Types of Operation
• Data Transfer
• Arithmetic
• Logical
• Conversion
• I/O
• System Control
• Transfer of Control
68
Data Transfer
• Specify
– Source
– Destination
– Length of data
• May be different instructions for different movements
– e.g. IBM 370
• Or one instruction and different addresses
– e.g. VAX
69
70
Arithmetic
71
LOGICAL
72
Shift and Rotate Operations
73
Conversion
74
Input/output
• May be specific instructions
• May be done using data movement instructions (memory mapped)
• May be done by a separate controller (DMA)
75
Systems Control
• Executed only in a special (privileged) state
• Used to access control registers
• For operating system use
76
NEED FOR CONTROL INSTRUCTIONS
• A group of instructions may need to be executed repeatedly
• Decision making: execute code only when a condition is satisfied
• Breaking tasks into smaller pieces
77
78
Transfer of Control
• Branch or Jump instruction
– Conditional
– Unconditional
• Consider subtraction
– BRP X - Branch to location X if result is positive
– BRN X - Branch to location X if result is negative
– BRZ X - Branch to location X if result is zero
– BRO X - Branch to location X if overflow occurs
• BRE R1, R2, X – Branch to X if contents of R1 = contents of R2
79
80
• Skip
– Skip the next instruction
– e.g. increment and skip if zero
– ISZ Register1
81
Procedure call
• Advantages
– Code reuse
– Efficient use of storage
• Two basic instructions
– Call: branches from the present location to the procedure
– Return: returns from the procedure to the place from which it was called
82
Nested Procedure Calls
83
Use of Stack
84
Stack Frame Growth Using Sample Procedures P and Q
85
x86 status flags
86
x86 operation types
87
88
89
90
ARM operation types
• Load and store instructions
• Branch instructions
• Data processing instructions
• Multiply instructions
• Parallel addition and subtraction instructions
– Image processing applications
• Status register access instructions
– N, Z, C, V
91
Unusual aspects of ARM
• All instructions include a condition code
– Not only branch instructions
• All data processing instructions include an S bit
– Indicates whether the instruction updates the condition flags
92
Instruction Sets: Addressing Modes and Formats
Addressing Modes
• Immediate
• Direct
• Indirect
• Register
• Register Indirect
• Displacement
• Stack
94
• All architectures provide more than one of these addressing modes
• How can the processor determine which addressing mode is being used?
– One or more bits in the instruction format can be used as a mode field
– The value of the mode field determines which addressing mode is to be used
95
Immediate Addressing
96
Immediate Addressing
• Operand is part of the instruction
• Operand = address field
• e.g. ADD 5
– Add 5 to contents of accumulator
– 5 is the operand
• No memory reference to fetch data
• Fast
• Limited range
97
Direct addressing
98
Direct Addressing
• Address field contains the address of the operand
• Effective address (EA) = address field (A)
• e.g. ADD A
– Add contents of cell A to accumulator
– Look in memory at address A for operand
• Single memory reference to access data
• No additional calculations to work out the effective address
• Limited address space
99
Indirect Addressing
100
Indirect Addressing
• Address field refers to the address of a word in memory, which in turn contains the full-length address of the operand
• EA = (A)
– Look in A, find address (A) and look there for the operand
• e.g. ADD (A)
– Add contents of the cell pointed to by the contents of A to the accumulator
• Large address space
• Multiple memory accesses to find the operand
• Hence slower
101
Register Addressing
102
Register Addressing
• Operand is held in the register named in the address field
• EA = R
• Limited number of registers
• Advantages
– Only a small address field is needed, so instructions are shorter
– No memory reference is needed, so less execution time
• Disadvantage: only a small number of registers is available
103
Register Indirect Addressing
104
Register Indirect Addressing
• Similar to indirect addressing
• EA = (R)
• Operand is in the memory cell pointed to by the contents of register R
• Large address space
• One fewer memory access than indirect addressing
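The five modes covered so far differ only in how the operand is located. A minimal sketch, using dictionaries as a stand-in for memory and the register file; the mode names, addresses and values are invented for the example.

```python
def fetch_operand(mode: str, field, memory: dict, registers: dict):
    """Return the operand selected by an addressing mode and an address field."""
    if mode == "immediate":            # operand = address field
        return field
    if mode == "direct":               # EA = A
        return memory[field]
    if mode == "indirect":             # EA = (A): two memory references
        return memory[memory[field]]
    if mode == "register":             # EA = R: no memory reference
        return registers[field]
    if mode == "register_indirect":    # EA = (R): one memory reference
        return memory[registers[field]]
    raise ValueError(f"unknown addressing mode: {mode}")


memory = {100: 500, 200: 7, 500: 42}
registers = {"R1": 200}
for mode, field in [("immediate", 5), ("direct", 100), ("indirect", 100),
                    ("register", "R1"), ("register_indirect", "R1")]:
    print(mode, fetch_operand(mode, field, memory, registers))
```

Counting the dictionary lookups on `memory` in each branch gives the number of memory references the mode costs.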
105
Displacement Addressing
106
Displacement Addressing
• Combines the capabilities of direct and register indirect addressing
• EA = A + (R)
• The instruction has two address fields
– A = base value
– R = register that holds the displacement
107
Relative Addressing
• A version of displacement addressing
• R = program counter, PC
• EA = A + (PC)
108
Base-Register Addressing
• A holds the displacement
• R contains a main memory address
109
Indexed Addressing
• A = base
• R = displacement
• EA = A + (R)
• Good for accessing arrays
110
Combinations
• R value is incremented or decremented automatically: auto indexing
• Post index: indexing is performed after the indirection
– EA = (A) + (R)
– Address is fetched and then indexed with the register value
• Pre index: indexing is performed before the indirection
– EA = (A + (R))
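The three formulas differ only in where the memory lookup happens. A sketch with a dictionary as memory; the addresses and register name are made up for the example.

```python
def displacement_ea(a: int, r: str, registers: dict) -> int:
    """EA = A + (R)"""
    return a + registers[r]


def postindex_ea(a: int, r: str, registers: dict, memory: dict) -> int:
    """EA = (A) + (R): fetch the address stored at A, then add the index."""
    return memory[a] + registers[r]


def preindex_ea(a: int, r: str, registers: dict, memory: dict) -> int:
    """EA = (A + (R)): add the index first, then fetch the address stored there."""
    return memory[a + registers[r]]


memory = {1000: 3000, 1004: 5000}
registers = {"X": 4}
print(displacement_ea(1000, "X", registers))       # 1004
print(postindex_ea(1000, "X", registers, memory))  # 3004
print(preindex_ea(1000, "X", registers, memory))   # 5000
```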
111
Stack Addressing
• Operand is (implicitly) on top of the stack
• A stack pointer register holds the address of the top of the stack
112
x86 Addressing Modes
• Virtual or effective address is an offset into a segment
– Segment starting address plus effective address gives the linear address
– This goes through page translation, if paging is enabled, to get the physical address
• Addressing modes available
– Immediate
– Register
– Displacement
– Base
– Base with displacement
– Scaled index with displacement
– Base with index and displacement
– Base with scaled index and displacement
– Relative
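Most of the memory modes listed are special cases of one formula: base + index x scale + displacement. The sketch below assumes 32-bit addresses and the x86 scale factors 1, 2, 4 and 8; the function names and sample values are illustrative.

```python
ADDRESS_MASK = 0xFFFFFFFF  # 32-bit address arithmetic (assumption for this sketch)


def effective_address(base: int = 0, index: int = 0, scale: int = 1,
                      displacement: int = 0) -> int:
    """EA = base + index * scale + displacement."""
    if scale not in (1, 2, 4, 8):
        raise ValueError("scale must be 1, 2, 4 or 8")
    return (base + index * scale + displacement) & ADDRESS_MASK


def linear_address(segment_start: int, ea: int) -> int:
    """Segment starting address plus effective address gives the linear address."""
    return (segment_start + ea) & ADDRESS_MASK


ea = effective_address(base=0x1000, index=3, scale=4, displacement=8)
print(hex(ea), hex(linear_address(0x20000, ea)))
```

Leaving out arguments gives the simpler modes: only `displacement` is displacement mode, `base` plus `displacement` is base with displacement, and so on.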
113
x86 Addressing Mode Calculation
114
ARM Addressing Modes: Load/Store
• Load and store are the only instructions that reference memory
• Memory address is formed from a base register plus an offset
• Offset
– Offset added to or subtracted from base register contents to form the memory address
• Preindex
– Memory address is formed as for offset addressing
– Memory address is also written back to the base register
• Postindex
– Memory address is the base register value
– Offset is added or subtracted
– Result is written back to the base register
• Base register acts as an index register for preindex and postindex addressing
• Offset is either an immediate value in the instruction or another register
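The three ARM indexing methods differ in which address is used for the access and whether the base register is updated. A sketch that returns both values; the function names and the sample base 0x200 with offset 0xC are chosen for illustration.

```python
def offset_addressing(base: int, offset: int):
    """Return (memory address used, base register value afterwards)."""
    return base + offset, base           # base register is left unchanged


def preindex_addressing(base: int, offset: int):
    address = base + offset
    return address, address              # address is written back to the base


def postindex_addressing(base: int, offset: int):
    return base, base + offset           # access at base, then update the base


for method in (offset_addressing, preindex_addressing, postindex_addressing):
    address, new_base = method(0x200, 0xC)
    print(method.__name__, hex(address), hex(new_base))
```

A negative `offset` models the "subtracted from" case.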
115
ARM Indexing Methods
116
ARM Data Processing Instruction Addressing & Branch Instructions
• Data processing
– Register addressing
– Or a mixture of register and immediate addressing
• Branch
– Immediate
117
ARM Load/Store Multiple Addressing
• Load/store a subset of the general-purpose registers
• Sequential range of memory addresses
• Increment after, increment before, decrement after, and decrement before
• Base register specifies a main memory address
• Incrementing or decrementing starts before or after the first memory access
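The four variants can be compared by listing the word addresses each one touches. This sketch assumes 4-byte words and that the lowest-numbered register always goes to the lowest address; the function name and the base value 0x20C are illustrative.

```python
def block_addresses(base: int, count: int, mode: str) -> list:
    """Addresses used to transfer `count` registers, in ascending order."""
    if mode == "IA":      # increment after: first access at the base address
        start = base
    elif mode == "IB":    # increment before: first access one word above base
        start = base + 4
    elif mode == "DA":    # decrement after: last (highest) access at the base
        start = base - 4 * (count - 1)
    elif mode == "DB":    # decrement before: highest access one word below base
        start = base - 4 * count
    else:
        raise ValueError(f"unknown mode: {mode}")
    return [start + 4 * i for i in range(count)]


for mode in ("IA", "IB", "DA", "DB"):
    print(mode, [hex(a) for a in block_addresses(0x20C, 3, mode)])
```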
118
ARM Load/Store Multiple Addressing Diagram
119
Instruction Formats
• Layout of bits in an instruction
• Includes opcode
• Includes (implicit or explicit) operand(s)
• Usually more than one instruction format in an instruction set
120
Instruction Length
• Affected by and affects:
– Memory size
– Memory organization
– Bus structure
– CPU complexity
– CPU speed
• Trade-off between a powerful instruction repertoire and saving space
121
Allocation of Bits
• Number of addressing modes
• Number of operands
• Register versus memory
• Number of register sets
• Address range
• Address granularity
122
Assembly language
• Machines store and understand binary instructions
• E.g. N = I + J + K, with I, J and K initialized to 2, 3 and 4
123
124
Improvements
• Use hexadecimal rather than binary
– Code as a series of lines, each with a hex address and the hex value stored at that address
– Need to translate automatically using a program
• Add symbolic names or mnemonics for instructions
• Three fields per line
– Location address
– Three-letter opcode
– If memory reference: address
• Need a more complex translation program
125
• First field (address) now symbolic
• Memory references in third field now symbolic
• Now have assembly language and need an assembler to translate
126