Spr 2015, Jan 26... ELEC 5200-001/6200-001 Lecture 3 1 ELEC 5200-001/6200-001 Computer Architecture...
-
Upload
crystal-black -
Category
Documents
-
view
224 -
download
0
Transcript of Spr 2015, Jan 26... ELEC 5200-001/6200-001 Lecture 3 1 ELEC 5200-001/6200-001 Computer Architecture...
Spr 2015, Jan 26 . . .Spr 2015, Jan 26 . . . ELEC 5200-001/6200-001 Lecture 3ELEC 5200-001/6200-001 Lecture 3 11
ELEC 5200-001/6200-001ELEC 5200-001/6200-001Computer Architecture and DesignComputer Architecture and Design
Spring 2015Spring 2015
Instruction Set ArchitectureInstruction Set Architecture(Chapter 2)(Chapter 2)
Vishwani D. AgrawalVishwani D. AgrawalJames J. Danaher ProfessorJames J. Danaher Professor
Department of Electrical and Computer EngineeringDepartment of Electrical and Computer EngineeringAuburn University, Auburn, AL 36849Auburn University, Auburn, AL 36849
http://www.eng.auburn.edu/[email protected]
Spr 2015, Jan 26 . . .Spr 2015, Jan 26 . . . ELEC 5200-001/6200-001 Lecture 3ELEC 5200-001/6200-001 Lecture 3 22
Designing a ComputerDesigning a Computer
Control
Datapath MemoryCentral Processing
Unit (CPU)or “processor”
Input
Output
FIVE PIECES OF HARDWARE
Spr 2015, Jan 26 . . .Spr 2015, Jan 26 . . . ELEC 5200-001/6200-001 Lecture 3ELEC 5200-001/6200-001 Lecture 3 33
Start by Defining ISAStart by Defining ISAWhat is instruction set architecture (ISA)?What is instruction set architecture (ISA)?
ISAISA– Defines registersDefines registers– Defines data transfer modes (instructions) between Defines data transfer modes (instructions) between
registers, memory and I/Oregisters, memory and I/O– There should be There should be sufficient sufficient instructions to efficiently instructions to efficiently
translate any program for machine processingtranslate any program for machine processing
Next, define instruction set format – binary Next, define instruction set format – binary representation used by the hardwarerepresentation used by the hardware– Variable-length vs. fixed-length instructionsVariable-length vs. fixed-length instructions
Spr 2015, Jan 26 . . .Spr 2015, Jan 26 . . . ELEC 5200-001/6200-001 Lecture 3ELEC 5200-001/6200-001 Lecture 3 44
Types of ISATypes of ISAComplex instruction set computer (CISC)Complex instruction set computer (CISC)– Many instructions (several hundreds)Many instructions (several hundreds)– An instruction takes many cycles to executeAn instruction takes many cycles to execute– Example: Intel PentiumExample: Intel Pentium
Reduced instruction set computer (RISC)Reduced instruction set computer (RISC)– Small set of instructions (typically 32)Small set of instructions (typically 32)– Simple instructions, each executes in one clock Simple instructions, each executes in one clock
cycle – cycle – REALLY? Well, almost.REALLY? Well, almost.– Effective use of pipeliningEffective use of pipelining– Example: ARMExample: ARM
On Two Types of ISAOn Two Types of ISA
Brad Smith, “ARM and Intel Battle over the Brad Smith, “ARM and Intel Battle over the Mobile Chip’s Future,” Computer, vol. 41, Mobile Chip’s Future,” Computer, vol. 41, no. 5, pp. 15-18, May 2008.no. 5, pp. 15-18, May 2008.
Compare 3Ps:Compare 3Ps:PerformancePerformance
Power consumptionPower consumption
PricePrice
Spr 2015, Jan 26 . . .Spr 2015, Jan 26 . . . ELEC 5200-001/6200-001 Lecture 3ELEC 5200-001/6200-001 Lecture 3 55
Spr 2015, Jan 26 . . .Spr 2015, Jan 26 . . . ELEC 5200-001/6200-001 Lecture 3ELEC 5200-001/6200-001 Lecture 3 66
Pipelining of RISC InstructionsPipelining of RISC Instructions
FetchInstruction
DecodeOpcode
FetchOperands
ExecuteOperation
StoreResult
Although an instruction takes five clock cycles,one instruction can be completed every cycle.
Spr 2015, Jan 26 . . .Spr 2015, Jan 26 . . . ELEC 5200-001/6200-001 Lecture 3ELEC 5200-001/6200-001 Lecture 3 77
Growth of ProcessorsGrowth of ProcessorsLanguage of the MachineLanguage of the MachineWe’ll be working with the We’ll be working with the MIPS instruction set MIPS instruction set architecturearchitecture– similar to other similar to other
architectures architectures developed since the developed since the 1980's1980's
– Almost 100 million Almost 100 million MIPS processors MIPS processors manufactured in 2002manufactured in 2002
– used by NEC, Nintendo, used by NEC, Nintendo, Cisco, Silicon Cisco, Silicon Graphics, Sony, …Graphics, Sony, …
1400
1300
1200
1100
1000
900
800
700
600
500
400
300
200
100
01998 2000 2001 20021999
Other
SPARC
Hitachi SH
PowerPC
Motorola 68K
MIPS
IA-32
ARM
2004 © Morgan Kaufman Publishers
Spr 2015, Jan 26 . . .Spr 2015, Jan 26 . . . ELEC 5200-001/6200-001 Lecture 3ELEC 5200-001/6200-001 Lecture 3 88
MIPS Instruction Set (RISC)MIPS Instruction Set (RISC)Instructions execute simple functions.Instructions execute simple functions.Maintain regularity of format – each Maintain regularity of format – each instruction is one word, contains instruction is one word, contains opcodeopcode and and argumentsarguments..Minimize memory accesses – whenever Minimize memory accesses – whenever possible use registers as arguments.possible use registers as arguments.Three types of instructions:Three types of instructions:
Register (R)-type – only registers as arguments.Register (R)-type – only registers as arguments.Immediate (I)-type – arguments are registers and Immediate (I)-type – arguments are registers and numbers (constants or memory addresses).numbers (constants or memory addresses).Jump (J)-type – argument is an address.Jump (J)-type – argument is an address.
Spr 2015, Jan 26 . . .Spr 2015, Jan 26 . . . ELEC 5200-001/6200-001 Lecture 3ELEC 5200-001/6200-001 Lecture 3 99
MIPS Arithmetic InstructionsMIPS Arithmetic InstructionsAll instructions have 3 operandsAll instructions have 3 operands
Operand order is fixed (destination first)Operand order is fixed (destination first)
Example:Example:
C code: C code: a = b + c;
MIPS ‘code’:MIPS ‘code’: add a, b, c
“The natural number of operands for an operation like addition is “The natural number of operands for an operation like addition is three… requiring every instruction to have exactly three three… requiring every instruction to have exactly three operands conforms to the philosophy of keeping the hardware operands conforms to the philosophy of keeping the hardware simple”simple”
2004 © Morgan Kaufman Publishers
Spr 2015, Jan 26 . . .Spr 2015, Jan 26 . . . ELEC 5200-001/6200-001 Lecture 3ELEC 5200-001/6200-001 Lecture 3 1010
Arithmetic Instr. (Continued)Arithmetic Instr. (Continued)Design Principle: simplicity favors regularity. Design Principle: simplicity favors regularity.
Of course this complicates some things...Of course this complicates some things...
C code:C code: a = b + c + d;
MIPS code:MIPS code: add a, b, cadd a, a, d
Operands must be registers Operands must be registers (why?) (why?) Remember von Remember von Neumann bottleneck.Neumann bottleneck.
32 registers provided32 registers provided
Each register contains 32 bitsEach register contains 32 bits2004 © Morgan Kaufman Publishers
Spr 2015, Jan 26 . . .Spr 2015, Jan 26 . . . ELEC 5200-001/6200-001 Lecture 3ELEC 5200-001/6200-001 Lecture 3 1111
Registers vs. MemoryRegisters vs. Memory
Processor I/O
Control
Datapath
Memory
Input
Output
Arithmetic instructions operands must be registersArithmetic instructions operands must be registers32 registers provided32 registers provided
Compiler associates variables with registers.Compiler associates variables with registers.What about programs with lots of variables? What about programs with lots of variables? Must use Must use memory.memory.
2004 © Morgan Kaufman Publishers
Spr 2015, Jan 26 . . .Spr 2015, Jan 26 . . . ELEC 5200-001/6200-001 Lecture 3ELEC 5200-001/6200-001 Lecture 3 1212
Memory OrganizationMemory OrganizationViewed as a large, single-dimension array, with an Viewed as a large, single-dimension array, with an address.address.
A memory address is an index into the array.A memory address is an index into the array.
"Byte addressing" means that the index points to a byte "Byte addressing" means that the index points to a byte of memory.of memory.
2004 © Morgan Kaufman Publishers
32 bit word
32 bit word
32 bit word
32 bit word
.
.
.
8 bits of data 8 bits of data 8 bits of data 8 bits of data
8 bits of data 8 bits of data 8 bits of data 8 bits of data
8 bits of data 8 bits of data
8 bits of data
8 bits of data
8 bits of data
8 bits of data 8 bits of data
8 bits of data
8 bits of data
8 bits of data
8 bits of data
8 bits of data
Byte 0 byte 1 byte 2 byte 3
byte 4 byte 10
Spr 2015, Jan 26 . . .Spr 2015, Jan 26 . . . ELEC 5200-001/6200-001 Lecture 3ELEC 5200-001/6200-001 Lecture 3 1313
Memory OrganizationMemory OrganizationBytes are nice, but most data items use larger "words"Bytes are nice, but most data items use larger "words"
For MIPS, a word contains 32 bits or 4 bytes.For MIPS, a word contains 32 bits or 4 bytes.
word addressesword addresses
223232 bytes with addresses from 0 to 2 bytes with addresses from 0 to 232 32 – 1 – 1
223030 words with addresses 0, 4, 8, ... 2 words with addresses 0, 4, 8, ... 232 32 – 4 – 4
Words are alignedWords are alignedi.e., what are the least 2 significant bits of a word i.e., what are the least 2 significant bits of a word
address?address?
...
Registers hold 32 bits of data
Use 32 bit address
2004 © Morgan Kaufman Publishers
0
4
8
12
.
32 bits of data
32 bits of data
32 bits of data
32 bits of data
32 bits of data
Spr 2015, Jan 26 . . .Spr 2015, Jan 26 . . . ELEC 5200-001/6200-001 Lecture 3ELEC 5200-001/6200-001 Lecture 3 1414
InstructionsInstructionsLoad and store instructionsLoad and store instructionsExample:Example:
C code:C code: A[12] = h + A[8];
MIPS code:MIPS code: lw $t0, 32($s3) #addr of A in reg s3add $t0, $s2, $t0 #h in reg s2sw $t0, 48($s3)
Can refer to registers by name (e.g., $s2, $t2) instead of numberCan refer to registers by name (e.g., $s2, $t2) instead of numberStore word has destination lastStore word has destination lastRemember arithmetic operands are registers, not memory!Remember arithmetic operands are registers, not memory!
Can’t write: Can’t write: add 48($s3), $s2, 32($s3)
2004 © Morgan Kaufman Publishers
Spr 2015, Jan 26 . . .Spr 2015, Jan 26 . . . ELEC 5200-001/6200-001 Lecture 3ELEC 5200-001/6200-001 Lecture 3 1515
Our First ExampleOur First ExampleCan we figure out the code of subroutine?Can we figure out the code of subroutine?
Initially, k is in reg 5; base address of v is in reg 4; Initially, k is in reg 5; base address of v is in reg 4; return addr is in reg 31return addr is in reg 31
swap(int v[], int k);{ int temp;
temp = v[k]v[k] = v[k+1];v[k+1] = temp;
}
swap:sll $2, $5, 2add $2, $4, $2lw $15, 0($2)lw $16, 4($2)sw $16, 0($2)sw $15, 4($2)jr $31
2004 © Morgan Kaufman Publishers
Spr 2015, Jan 26 . . .Spr 2015, Jan 26 . . . ELEC 5200-001/6200-001 Lecture 3ELEC 5200-001/6200-001 Lecture 3 1616
What Happens?What Happens?
When the program reaches “call swap” statement:When the program reaches “call swap” statement:– Jump to swap routineJump to swap routine
Registers 4 and 5 contain the arguments (register convention)Registers 4 and 5 contain the arguments (register convention)
Register 31 contains the return address (register convention)Register 31 contains the return address (register convention)
– Swap two words in memorySwap two words in memory– Jump back to return address to continue rest of the programJump back to return address to continue rest of the program
.
.call swap...
return address
Spr 2015, Jan 26 . . .Spr 2015, Jan 26 . . . ELEC 5200-001/6200-001 Lecture 3ELEC 5200-001/6200-001 Lecture 3 1717
Memory and RegistersMemory and Registers
Word 0
Word 1
Word 2
v[0] (Word n)
048
12 .
4n . . .
4n+4k.
v[1] (Word n+1)
Register 0
Register 1
Register 2
Register 3
Register 4
Register 31
Register 5
v[k] (Word n+k)
4n
k
Memory
byte addr.
Ret. addr.v[k+1] (Word n+k+1)
.
.
Spr 2015, Jan 26 . . .Spr 2015, Jan 26 . . . ELEC 5200-001/6200-001 Lecture 3ELEC 5200-001/6200-001 Lecture 3 1818
Our First ExampleOur First ExampleNow figure out the code:Now figure out the code:
swap(int v[], int k);{ int temp;
temp = v[k]v[k] = v[k+1];v[k+1] = temp;
}
swap:sll $2, $5, 2add $2, $4, $2lw $15, 0($2)lw $16, 4($2)sw $16, 0($2)sw $15, 4($2)jr $31
2004 © Morgan Kaufman Publishers
Spr 2015, Jan 26 . . .Spr 2015, Jan 26 . . . ELEC 5200-001/6200-001 Lecture 3ELEC 5200-001/6200-001 Lecture 3 1919
So Far We’ve LearnedSo Far We’ve LearnedMIPSMIPS
— loading words but addressing bytes— loading words but addressing bytes— arithmetic on registers only— arithmetic on registers only
InstructionInstruction MeaningMeaning
add $s1, $s2, $s3 $s1 = $s2 + $s3sub $s1, $s2, $s3 $s1 = $s2 – $s3lw $s1, 100($s2) $s1 = Memory[$s2+100]
sw $s1, 100($s2) Memory[$s2+100] = $s1
2004 © Morgan Kaufman Publishers
Spr 2015, Jan 26 . . .Spr 2015, Jan 26 . . . ELEC 5200-001/6200-001 Lecture 3ELEC 5200-001/6200-001 Lecture 3 2020
Instructions, like registers and words of data, are also 32 bits Instructions, like registers and words of data, are also 32 bits longlong
– Example: Example: add $t1, $s1, $s2– registers are numbered, registers are numbered, $t1=8, $s1=17, $s2=18
Instruction Format:Instruction Format:
000000 10001 10010 01000 00000 100000
opcode rs rt rd shamt funct
Can you guess what the field names stand for?Can you guess what the field names stand for?
Machine LanguageMachine Language
2004 © Morgan Kaufman Publishers
Spr 2015, Jan 26 . . .Spr 2015, Jan 26 . . . ELEC 5200-001/6200-001 Lecture 3ELEC 5200-001/6200-001 Lecture 3 2121
Violating Regularity for a Good CauseViolating Regularity for a Good Cause
GrandCentralStation
TimesSquare
Spr 2015, Jan 26 . . .Spr 2015, Jan 26 . . . ELEC 5200-001/6200-001 Lecture 3ELEC 5200-001/6200-001 Lecture 3 2222
Consider the load-word and store-word instructions,Consider the load-word and store-word instructions,– What would the regularity principle have us do?What would the regularity principle have us do?– New principle: Good design demands a compromiseNew principle: Good design demands a compromise
Introduce a new type of instruction formatIntroduce a new type of instruction format– I-type for data transfer instructionsI-type for data transfer instructions– other format was R-type for registerother format was R-type for register
Example: Example: lw $t0, 32($s2)lw $t0, 32($s2)
35 18 9 32
opcode rs rt 16 bit number
Where's the compromise?Where's the compromise?
Machine LanguageMachine Language
2004 © Morgan Kaufman Publishers
Spr 2015, Jan 26 . . .Spr 2015, Jan 26 . . . ELEC 5200-001/6200-001 Lecture 3ELEC 5200-001/6200-001 Lecture 3 2323
Instructions are bitsInstructions are bits
Programs are stored in memory Programs are stored in memory to be read or written just like datato be read or written just like data
Fetch and Execute CyclesFetch and Execute CyclesInstructions are fetched and put into a special registerInstructions are fetched and put into a special register
Opcode bits in the register "control" the subsequent actionsOpcode bits in the register "control" the subsequent actions
Fetch the “next” instruction and continueFetch the “next” instruction and continue
Processor Memorymemory for data, programs,
compilers, editors, etc.
Stored Program ConceptStored Program Concept
2004 © Morgan Kaufman Publishers
Spr 2015, Jan 26 . . .Spr 2015, Jan 26 . . . ELEC 5200-001/6200-001 Lecture 3ELEC 5200-001/6200-001 Lecture 3 2424
Decision making instructionsDecision making instructions– alter the control flow,alter the control flow,– i.e., change the "next" instruction to be executedi.e., change the "next" instruction to be executed
MIPS conditional branch instructions:MIPS conditional branch instructions:
bne $t0, $t1, Label beq $t0, $t1, Label
Example:Example: if (i==j) h = i + j;
bne $s0, $s1, Labeladd $s3, $s0, $s1
Label: ....
ControlControl
2004 © Morgan Kaufman Publishers
Spr 2015, Jan 26 . . .Spr 2015, Jan 26 . . . ELEC 5200-001/6200-001 Lecture 3ELEC 5200-001/6200-001 Lecture 3 2525
MIPS unconditional branch instructions:MIPS unconditional branch instructions:j label
Example:Example:if (i!=j) beq $s4, $s5, Lab1 h=i+j; add $s3, $s4, $s5else j Lab2 h=i-j; Lab1:sub $s3, $s4, $s5
Lab2:...
Can you build a simpleCan you build a simple for for looploop??
ControlControl
2004 © Morgan Kaufman Publishers
Spr 2015, Jan 26 . . .Spr 2015, Jan 26 . . . ELEC 5200-001/6200-001 Lecture 3ELEC 5200-001/6200-001 Lecture 3 2626
So Far We’ve LearnedSo Far We’ve LearnedInstructionInstruction MeaningMeaningadd $s1,$s2,$s3 $s1 = $s2 + $s3sub $s1,$s2,$s3 $s1 = $s2 – $s3lw $s1,100($s2) $s1 = Memory[$s2+100] sw $s1,100($s2) Memory[$s2+100] = $s1bne $s4,$s5,Label Next instr. is at Label if
$s4 ≠ $s5beq $s4,$s5,Label Next instr. is at Label if
$s4 = $s5j Label Next instr. is at Label
Formats:Formats:
op rs rt rd shamt funct
op rs rt 16 bit address
op 26 bit address
R
I
J2004 © Morgan Kaufman Publishers
Spr 2015, Jan 26 . . .Spr 2015, Jan 26 . . . ELEC 5200-001/6200-001 Lecture 3ELEC 5200-001/6200-001 Lecture 3 2727
Three Ways to Jump: j, jr, jalThree Ways to Jump: j, jr, jal
jj instrinstr # jump to machine instruction # jump to machine instruction instrinstr
(unconditional jump)(unconditional jump)
jrjr $ra$ra # jump to address in register ra# jump to address in register ra
(used by calee to go back to (used by calee to go back to caller)caller)
jaljal addraddr # set $ra = PC+4 and go to # set $ra = PC+4 and go to addraddr
(jump and link; used to jump to (jump and link; used to jump to a a procedure) procedure)
Spr 2015, Jan 26 . . .Spr 2015, Jan 26 . . . ELEC 5200-001/6200-001 Lecture 3ELEC 5200-001/6200-001 Lecture 3 2828
We have: beq, bne, what about Branch-if-less-than?We have: beq, bne, what about Branch-if-less-than?
New instruction:New instruction:if $s1 < $s2 then
$t0 = 1slt $t0, $s1, $s2 else
$t0 = 0
Can use this instruction to build new “pseudoinstruction”Can use this instruction to build new “pseudoinstruction”
blt $s1, $s2, Label
Note that the assembler needs a register to do this,Note that the assembler needs a register to do this,— there are policy of use conventions for registers— there are policy of use conventions for registers
Control FlowControl Flow
2004 © Morgan Kaufman Publishers
Spr 2015, Jan 26 . . .Spr 2015, Jan 26 . . . ELEC 5200-001/6200-001 Lecture 3ELEC 5200-001/6200-001 Lecture 3 2929
PseudoinstructionsPseudoinstructions
bltblt $s1, $s2, reladdr$s1, $s2, reladdrAssembler converts to:Assembler converts to: sltslt $1, $s1, $s2$1, $s1, $s2
bnebne $1, $zero, reladdr$1, $zero, reladdr
Other pseudoinstructions: bgt, ble, bge, li, moveOther pseudoinstructions: bgt, ble, bge, li, moveNot implemented in hardwareNot implemented in hardwareAssembler expands pseudoinstructions into Assembler expands pseudoinstructions into machine instructionsmachine instructionsRegister 1, called $at, is reserved for converting Register 1, called $at, is reserved for converting pseudoinstructions into machine code.pseudoinstructions into machine code.
Spr 2015, Jan 26 . . .Spr 2015, Jan 26 . . . ELEC 5200-001/6200-001 Lecture 3ELEC 5200-001/6200-001 Lecture 3 3030
Policy of Register Usage Policy of Register Usage (Conventions)(Conventions)
Name Register number Usage
$zero 0 the constant value 0
$v0-$v1 2-3 values for results and expression evaluation
$a0-$a3 4-7 arguments
$t0-$t7 8-15 temporaries
$s0-$s7 16-23 saved
$t8-$t9 24-25 more temporaries
$gp 28 global pointer
$sp 29 stack pointer
$fp 30 frame pointer
$ra 31 return address
Register 1 ($at) reserved for assembler, 26-27 for operating system2004 © Morgan Kaufman Publishers
Spr 2015, Jan 26 . . .Spr 2015, Jan 26 . . . ELEC 5200-001/6200-001 Lecture 3ELEC 5200-001/6200-001 Lecture 3 3131
Small constants are used quite frequently (50% of operands) Small constants are used quite frequently (50% of operands) e.g., e.g., A = A + 5;A = A + 5;
B = B + 1;B = B + 1;C = C – 18;C = C – 18;
Solutions? Why not?Solutions? Why not?– put 'typical constants' in memory and load them. put 'typical constants' in memory and load them. – create hard-wired registers (like $zero) for constants like one.create hard-wired registers (like $zero) for constants like one.
MIPS Instructions:MIPS Instructions: addi $29, $29, 4
slti $8, $18, 10andi $29, $29, 6ori $29, $29, 4
Design Principle: Make the common case fast. Design Principle: Make the common case fast. Which format?Which format?
ConstantsConstants
2004 © Morgan Kaufman Publishers
Spr 2015, Jan 26 . . .Spr 2015, Jan 26 . . . ELEC 5200-001/6200-001 Lecture 3ELEC 5200-001/6200-001 Lecture 3 3232
We'd like to be able to load a 32 bit constant into a registerWe'd like to be able to load a 32 bit constant into a register
Must use two instructions, new "load upper immediate" instructionMust use two instructions, new "load upper immediate" instruction
lui $t0, 1010101010101010
Then must get the lower order bits right, i.e.,Then must get the lower order bits right, i.e.,ori $t0, $t0, 1010101010101010
1010101010101010 0000000000000000
0000000000000000 1010101010101010
1010101010101010 1010101010101010
ori
1010101010101010 0000000000000000
filled with zeros
How About Larger Constants?How About Larger Constants?
2004 © Morgan Kaufman Publishers
Spr 2015, Jan 26 . . .Spr 2015, Jan 26 . . . ELEC 5200-001/6200-001 Lecture 3ELEC 5200-001/6200-001 Lecture 3 3333
Assembly provides convenient symbolic representationAssembly provides convenient symbolic representationmuch easier than writing down numbersmuch easier than writing down numbers
e.g., destination firste.g., destination first
Machine language is the underlying realityMachine language is the underlying realitye.g., destination is no longer firste.g., destination is no longer first
Assembly can provide 'pseudoinstructions'Assembly can provide 'pseudoinstructions'e.g., “move $t0, $t1” exists only in Assembly e.g., “move $t0, $t1” exists only in Assembly
implemented using “add $t0, $t1, $zero” implemented using “add $t0, $t1, $zero”
When considering performance you should count real When considering performance you should count real instructions and clock cyclesinstructions and clock cycles
Assembly Language vs. Assembly Language vs. Machine LanguageMachine Language
2004 © Morgan Kaufman Publishers
Spr 2015, Jan 26 . . .Spr 2015, Jan 26 . . . ELEC 5200-001/6200-001 Lecture 3ELEC 5200-001/6200-001 Lecture 3 3434
simple instructions, all 32 bits widesimple instructions, all 32 bits wide
very structured, no unnecessary baggagevery structured, no unnecessary baggage
only three instruction formatsonly three instruction formats
rely on compiler to achieve performancerely on compiler to achieve performance
op rs rt rd shamt funct
op rs rt 16 bit address
op 26 bit address
R
I
J
Overview of MIPSOverview of MIPS
2004 © Morgan Kaufman Publishers
Spr 2015, Jan 26 . . .Spr 2015, Jan 26 . . . ELEC 5200-001/6200-001 Lecture 3ELEC 5200-001/6200-001 Lecture 3 3535
Instructions:Instructions:
bne $t4, $t5, Labelbne $t4, $t5, Label Next instruction is at Label Next instruction is at Label
ifif $t4 $t4 ≠≠ $t5 $t5
beq $t4, $t5, Labelbeq $t4, $t5, Label Next instruction is at Label Next instruction is at Label
ifif $t4 = $t5$t4 = $t5
j Labelj Label Next instruction is at LabelNext instruction is at Label Formats:Formats:
op rs rt 16 bit rel. address
op 26 bit absolute address
I
J
Addresses in Branches and JumpsAddresses in Branches and Jumps
2004 © Morgan Kaufman Publishers
Spr 2015, Jan 26 . . .Spr 2015, Jan 26 . . . ELEC 5200-001/6200-001 Lecture 3ELEC 5200-001/6200-001 Lecture 3 3636
Instructions:Instructions:bne $t4,$t5,Labelbne $t4,$t5,Label Next instruction is at Label if $t4 Next instruction is at Label if $t4 ≠≠ $t5$t5beq $t4,$t5,Labelbeq $t4,$t5,Label Next instruction is at Label if $t4 = $t5Next instruction is at Label if $t4 = $t5
Formats:Formats: – 215 to 215 – 1 ~ ±32 Kwords±32 Kwords
Relative addressingRelative addressing 226 = 64 64 Mwords Mwords– with respect to PC (program counter)with respect to PC (program counter)– most branches are local (principle of locality)most branches are local (principle of locality)
Jump instruction just uses high order bits of PC Jump instruction just uses high order bits of PC – address boundaries of 256 MBytes (maximum jump 64 Mwords)address boundaries of 256 MBytes (maximum jump 64 Mwords)
Addresses in BranchesAddresses in Branches
2004 © Morgan Kaufman Publishers
op rs rt 16 bit address
op 26 bit address
Spr 2015, Jan 26 . . .Spr 2015, Jan 26 . . . ELEC 5200-001/6200-001 Lecture 3ELEC 5200-001/6200-001 Lecture 3 3737
Example: Loop in C (p. 74)Example: Loop in C (p. 74)
while ( save[i] == k )while ( save[i] == k )
i += 1;i += 1;
Given a value for k, set i to the index of Given a value for k, set i to the index of element in array save [ ] that does not equal element in array save [ ] that does not equal k.k.
Spr 2015, Jan 26 . . .Spr 2015, Jan 26 . . . ELEC 5200-001/6200-001 Lecture 3ELEC 5200-001/6200-001 Lecture 3 3838
MIPS Code for While LoopMIPS Code for While LoopCompiler assigns variables to registers:
$s3 (reg 19) ← i initially 0$s5 (reg 21) ← k$s6 (reg 22) ← memory address where save [ ] begins
Then generates the following assembly code:
Loop: sll $t1, $s3, 2 # Temp reg $t1 = 4 * iadd $t1, $t1, $s6 # $t1 = address of save[i]lw $t0, 0($t1) # Temp reg $t0 = save[i]bne $t0, $s5, Exit # go to Exit if save[i] ≠ kaddi $s3, $s3, 1 # i = i + 1j Loop # go to Loop
Exit:
Spr 2015, Jan 26 . . .Spr 2015, Jan 26 . . . ELEC 5200-001/6200-001 Lecture 3ELEC 5200-001/6200-001 Lecture 3 3939
Machine Code and Mem. AdressesMachine Code and Mem. Adresses
00 00 1919 99 22 00
00 99 2222 99 00 3232
3535 99 88 00
55 88 2121 Exit = +2Exit = +2
88 1919 1919 11
22 Loop = 20000 (memory word address)Loop = 20000 (memory word address)
. . . . .. . . . .
sll
add
lw
bne
addi
j
80000
80004
80008
80012
80016
80020
80024
Memory Machine codeByte addr. Bits 31-26| 25-21 | 20-16 | 15-11 | 10 – 6 | 5 – 0 |
Note: $t0 ≡ Reg 8, $t1 ≡ Reg 9, $s3 ≡ Reg 19, $s5 ≡ Reg 21, $s6 ≡ Reg 22temp temp i k save
Spr 2015, Jan 26 . . .Spr 2015, Jan 26 . . . ELEC 5200-001/6200-001 Lecture 3ELEC 5200-001/6200-001 Lecture 3 4040
Finding Branch Address Finding Branch Address ExitExit
Exit = +2Exit = +2 is a 16 bit integer in bne instruction is a 16 bit integer in bne instruction 000101 01000 10101 000101 01000 10101 0000000000000010 0000000000000010 = 2= 2$PC = 80016 is the byte address of the next instruction$PC = 80016 is the byte address of the next instruction 00000000000000010011100010010000 00000000000000010011100010010000 = 80016= 80016Multiply bne argument by 4 (convert to byte address)Multiply bne argument by 4 (convert to byte address) 0000000000001000 0000000000001000 = 8= 8$PC $PC ← ← $PC + 8$PC + 800000000000000010011100010011000 00000000000000010011100010011000 = 80024= 80024Thus, Thus, ExitExit is memory byte address 80024. is memory byte address 80024.
Spr 2015, Jan 26 . . .Spr 2015, Jan 26 . . . ELEC 5200-001/6200-001 Lecture 3ELEC 5200-001/6200-001 Lecture 3 4141
Finding Jump Address Finding Jump Address LoopLoopJJ 2000020000
000010000010 00000000000100111000100000 = 20000 00000000000100111000100000 = 20000
$PC = 80024, when jump is being executed$PC = 80024, when jump is being executed
000000000000000000010011100010011000 = 800240000000000010011100010011000 = 80024
Multiply J argument by 4 (convert to byte address)Multiply J argument by 4 (convert to byte address)
0000000000010011100010000000 = 800000000000000010011100010000000 = 80000
Insert four leading bits from $PCInsert four leading bits from $PC
000000000000000000010011100010000000 = 800000000000000010011100010000000 = 80000
Thus, Thus, LoopLoop is memory byte address 80000. is memory byte address 80000.
Spr 2015, Jan 26 . . .Spr 2015, Jan 26 . . . ELEC 5200-001/6200-001 Lecture 3ELEC 5200-001/6200-001 Lecture 3 4242
Summary: MIPS Registers and Summary: MIPS Registers and MemoryMemory
$s0-$s7, $t0-$t9, $zero, Fast locations for data. In MIPS, data must be in registers to perform
32 registers $a0-$a3, $v0-$v1, $gp, arithmetic. MIPS register $zero always equals 0. Register $at is
$fp, $sp, $ra, $at reserved for the assembler to handle large constants.
Memory[0], Accessed only by data transfer instructions. MIPS uses byte
230 memoryMemory[4], ..., addresses, so sequential words differ by 4. Memory holds data
words Memory[4294967292] structures, such as arrays, and spilled registers, such as those
saved on procedure calls.
Spr 2015, Jan 26 . . .Spr 2015, Jan 26 . . . ELEC 5200-001/6200-001 Lecture 3ELEC 5200-001/6200-001 Lecture 3 4343
Summary: MIPS InstructionsSummary: MIPS InstructionsMIPS assembly language
Category Instruction Example Meaning Commentsadd add $s1, $s2, $s3 $s1 = $s2 + $s3 Three operands; data in registers
Arithmetic subtract sub $s1, $s2, $s3 $s1 = $s2 - $s3 Three operands; data in registers
add immediate addi $s1, $s2, 100 $s1 = $s2 + 100 Used to add constants
load w ord lw $s1, 100($s2) $s1 = Memory[$s2 + 100]Word from memory to register
store w ord sw $s1, 100($s2) Memory[$s2 + 100] = $s1 Word from register to memory
Data transfer load byte lb $s1, 100($s2) $s1 = Memory[$s2 + 100]Byte from memory to register
store byte sb $s1, 100($s2) Memory[$s2 + 100] = $s1 Byte from register to memoryload upper immediate
lui $s1, 100 $s1 = 100 * 216 Loads constant in upper 16 bits
branch on equal beq $s1, $s2, 25 if ($s1 == $s2) go to PC + 4 + 100
Equal test; PC-relative branch
Conditional
branch on not equal bne $s1, $s2, 25 if ($s1 != $s2) go to PC + 4 + 100
Not equal test; PC-relative
branch set on less than slt $s1, $s2, $s3 if ($s2 < $s3) $s1 = 1; else $s1 = 0
Compare less than; for beq, bne
set less than immediate
slti $s1, $s2, 100 if ($s2 < 100) $s1 = 1; else $s1 = 0
Compare less than constant
jump j 2500 go to 10000 Jump to target address
Uncondi- jump register jr $ra go to $ra For sw itch, procedure return
tional jump jump and link jal 2500 $ra = PC + 4; go to 10000 For procedure call
Spr 2015, Jan 26 . . .Spr 2015, Jan 26 . . . ELEC 5200-001/6200-001 Lecture 3ELEC 5200-001/6200-001 Lecture 3 4444
Byte Halfword Word
Registers
Memory
Memory
Word
Memory
Word
Register
Register
1. Immediate addressing
2. Register addressing
3. Base addressing
4. PC-relative addressing
5. Pseudodirect addressing
op rs rt
op rs rt
op rs rt
op
op
rs rt
Address
Address
Address
rd . . . funct
Immediate
PC
PC
+
+
2004 © Morgan Kaufman Publishers
Addressing ModesExample
addi
add
lw, sw
beq, bne
j
Spr 2015, Jan 26 . . .Spr 2015, Jan 26 . . . ELEC 5200-001/6200-001 Lecture 3ELEC 5200-001/6200-001 Lecture 3 4545
Design alternative:Design alternative:
– provide more powerful operationsprovide more powerful operations
– goal is to reduce number of instructions executedgoal is to reduce number of instructions executed
– danger is a slower cycle time and/or a higher CPIdanger is a slower cycle time and/or a higher CPI
Let’s look (briefly) at IA-32Let’s look (briefly) at IA-32
Alternative ArchitecturesAlternative Architectures
–“The path toward operation complexity is thus fraught with peril.
To avoid these problems, designers have moved toward simpler
instructions”
2004 © Morgan Kaufman Publishers
Spr 2015, Jan 26 . . .Spr 2015, Jan 26 . . . ELEC 5200-001/6200-001 Lecture 3ELEC 5200-001/6200-001 Lecture 3 4646
IA–32 (a.k.a. x86)IA–32 (a.k.a. x86)1978:1978: The Intel 8086 is announced (16 bit architecture)The Intel 8086 is announced (16 bit architecture)1980:1980: The 8087 floating point coprocessor is addedThe 8087 floating point coprocessor is added1982:1982: The 80286 increases address space to 24 bits, The 80286 increases address space to 24 bits,
+instructions+instructions1985:1985: The 80386 extends to 32 bits, new addressing modesThe 80386 extends to 32 bits, new addressing modes1989-1995:1989-1995: The 80486, Pentium, Pentium Pro add a few The 80486, Pentium, Pentium Pro add a few instructionsinstructions
(mostly designed for higher performance)(mostly designed for higher performance)1997: 1997: 57 new “MMX” instructions are added, Pentium II57 new “MMX” instructions are added, Pentium II1999: 1999: The Pentium III added another 70 instructions (SSE – The Pentium III added another 70 instructions (SSE –
streaming SIMD extensions)streaming SIMD extensions)2001: 2001: Another 144 instructions (SSE2)Another 144 instructions (SSE2)2003: 2003: AMD extends the architecture to increase address space to AMD extends the architecture to increase address space to
64 bits, widens all registers to 64 bits and makes other 64 bits, widens all registers to 64 bits and makes other changes (AMD64)changes (AMD64)
2004: 2004: Intel capitulates and embraces AMD64 (calls it EM64T) and Intel capitulates and embraces AMD64 (calls it EM64T) and adds more media extensionsadds more media extensions
““This history illustrates the impact of the “golden handcuffs” of compatibility:This history illustrates the impact of the “golden handcuffs” of compatibility:““adding new features as someone might add clothing to a packed bag”adding new features as someone might add clothing to a packed bag”“an architecture that is difficult to explain and impossible to love”“an architecture that is difficult to explain and impossible to love” 2004 © Morgan Kaufman Publishers
Spr 2015, Jan 26 . . .Spr 2015, Jan 26 . . . ELEC 5200-001/6200-001 Lecture 3ELEC 5200-001/6200-001 Lecture 3 4747
IA-32 OverviewIA-32 OverviewComplexity:Complexity:– Instructions from 1 to 17 bytes longInstructions from 1 to 17 bytes long– one operand must act as both a source and destinationone operand must act as both a source and destination– one operand can come from memoryone operand can come from memory– complex addressing modescomplex addressing modes
e.g., “base or scaled index with 8 or 32 bit displacement”e.g., “base or scaled index with 8 or 32 bit displacement”
Saving grace:Saving grace:– the most frequently used instructions are not too difficult to buildthe most frequently used instructions are not too difficult to build– compilers avoid the portions of the architecture that are slowcompilers avoid the portions of the architecture that are slow
““what the x86 lacks in style is made up in quantity, what the x86 lacks in style is made up in quantity, making it beautiful from the right perspective”making it beautiful from the right perspective”
2004 © Morgan Kaufman Publishers
Spr 2015, Jan 26 . . .Spr 2015, Jan 26 . . . ELEC 5200-001/6200-001 Lecture 3ELEC 5200-001/6200-001 Lecture 3 4848
IA-32 RegistersIA-32 RegistersRegisters in the 32-bit subset that originated with 80386Registers in the 32-bit subset that originated with 80386
GPR 0
GPR 1
GPR 2
GPR 3
GPR 4
GPR 5
GPR 6
GPR 7
Code segment pointer
Stack segment pointer (top of stack)
Data segment pointer 0
Data segment pointer 1
Data segment pointer 2
Data segment pointer 3
Instruction pointer (PC)
Condition codes
Use
031Name
EAX
ECX
EDX
EBX
ESP
EBP
ESI
EDI
CS
SS
DS
ES
FS
GS
EIP
EFLAGS 2004 © Morgan Kaufman Publishers
Eight generalpurpose registers
Spr 2015, Jan 26 . . .Spr 2015, Jan 26 . . . ELEC 5200-001/6200-001 Lecture 3ELEC 5200-001/6200-001 Lecture 3 4949
IA-32 Register RestrictionsIA-32 Register Restrictions
Fourteen major registers.Fourteen major registers.
Eight 32-bit general purpose registers.Eight 32-bit general purpose registers.ESP or EBP cannot contain memory address.ESP or EBP cannot contain memory address.
ESP cannot contain displacement from base ESP cannot contain displacement from base address.address.
. . .. . .
See Figure 2.38, page 154 (Fifth Edition).See Figure 2.38, page 154 (Fifth Edition).
2004 © Morgan Kaufman Publishers
Spr 2015, Jan 26 . . .Spr 2015, Jan 26 . . . ELEC 5200-001/6200-001 Lecture 3ELEC 5200-001/6200-001 Lecture 3 5050
IA-32 Typical InstructionsIA-32 Typical InstructionsFour major types of integer instructions:Four major types of integer instructions:– Data movement including move, push, popData movement including move, push, pop– Arithmetic and logical (destination register or memory)Arithmetic and logical (destination register or memory)– Control flow (use of condition codes / flags )Control flow (use of condition codes / flags )– String instructions, including string move and string compareString instructions, including string move and string compare
2004 © Morgan Kaufman Publishers
Some IA-32 InstructionsSome IA-32 Instructions
PUSHPUSH 5-bit opcode, 3-bit register operand5-bit opcode, 3-bit register operand
JEJE 4-bit opcode, 4-bit condition, 8-bit jump offset4-bit opcode, 4-bit condition, 8-bit jump offset
Spr 2015, Jan 26 . . .Spr 2015, Jan 26 . . . ELEC 5200-001/6200-001 Lecture 3ELEC 5200-001/6200-001 Lecture 3 5151
5-b | 3-b
4-b | 4-b | 8-b
Some IA-32 InstructionsSome IA-32 Instructions
MOVMOV 6-bit opcode, 8-bit register/mode*, 8-bit offset6-bit opcode, 8-bit register/mode*, 8-bit offset
XORXOR 8-bit opcode, 8-bit reg/mode*, 8-bit base, 8-b index8-bit opcode, 8-bit reg/mode*, 8-bit base, 8-b index
Spr 2015, Jan 26 . . .Spr 2015, Jan 26 . . . ELEC 5200-001/6200-001 Lecture 3ELEC 5200-001/6200-001 Lecture 3 5252
6-b |d|w| 8-b | 8-b
bit indicates move to or from memory
bit indicates byte or double word operation
8-b | 8-b | 8-b | 8-b
*8-bit register/mode: See Figure 2.42, page 158 (Fifth Edition).
Some IA-32 InstructionsSome IA-32 Instructions
ADDADD 4-bit opcode, 3-bit register, 32-bit immediate4-bit opcode, 3-bit register, 32-bit immediate
TESTTEST 7-bit opcode, 8-bit reg/mode, 32-bit immediate7-bit opcode, 8-bit reg/mode, 32-bit immediate
Spr 2015, Jan 26 . . .Spr 2015, Jan 26 . . . ELEC 5200-001/6200-001 Lecture 3ELEC 5200-001/6200-001 Lecture 3 5353
4-b | 3-b |w| 32-b
7-b |w| 8-b | 32-b
Spr 2015, Jan 26 . . .Spr 2015, Jan 26 . . . ELEC 5200-001/6200-001 Lecture 3ELEC 5200-001/6200-001 Lecture 3 5454
Additional ReferencesAdditional ReferencesIA-32, IA-64 (CISC)IA-32, IA-64 (CISC)
A. S. Tanenbaum, A. S. Tanenbaum, Structured Computer Organization, Fifth Structured Computer Organization, Fifth EditionEdition, Upper Saddle River, New Jersey: Pearson Prentice-, Upper Saddle River, New Jersey: Pearson Prentice-Hall, 2006, Chapter 5.Hall, 2006, Chapter 5.
ARM (RISC)ARM (RISC)D. Seal, D. Seal, ARM Architecture Reference Manual, Second EditionARM Architecture Reference Manual, Second Edition, , Addison-Wesley Professional, 2000.Addison-Wesley Professional, 2000.
SPARC (Scalable Processor Architecture)SPARC (Scalable Processor Architecture)
PowerPCPowerPCV. C. Hamacher, Z. G. Vranesic and S. G. Zaky, Computer V. C. Hamacher, Z. G. Vranesic and S. G. Zaky, Computer Organization, Fourth Edition, New York: McGraw-Hill, 1996.Organization, Fourth Edition, New York: McGraw-Hill, 1996.
Instruction ComplexityInstruction Complexity
Spr 2015, Jan 26 . . .Spr 2015, Jan 26 . . . ELEC 5200-001/6200-001 Lecture 3ELEC 5200-001/6200-001 Lecture 3 5555
Increasing instruction complexity
Pro
gra
m s
ize
in m
ach
ine
inst
ruct
ion
s (P
)
Av.
exe
cutio
n ti
me
per
inst
ruct
ion
(T
)
PT
P×T
URISC: The Other ExtremeURISC: The Other ExtremeInstruction set has a single instruction:Instruction set has a single instruction:
label:label: uriscurisc dest, src1, targetdest, src1, target
Subtract operand 1 from operand 2, replace operand 2 with the Subtract operand 1 from operand 2, replace operand 2 with the result, and jump to target address if the result is negative.result, and jump to target address if the result is negative.
See, B. Parhami, See, B. Parhami, Computer Architecture, from Computer Architecture, from Microprocessors to SupercomputersMicroprocessors to Supercomputers, New York: Oxford, , New York: Oxford, 2005, pp. 151-153.2005, pp. 151-153.
Spr 2015, Jan 26 . . .Spr 2015, Jan 26 . . . ELEC 5200-001/6200-001 Lecture 3ELEC 5200-001/6200-001 Lecture 3 5656
Spr 2015, Jan 26 . . .Spr 2015, Jan 26 . . . ELEC 5200-001/6200-001 Lecture 3ELEC 5200-001/6200-001 Lecture 3 5757
Instruction complexity is only one variableInstruction complexity is only one variable– lower instruction count vs. higher CPI / lower clock rate – lower instruction count vs. higher CPI / lower clock rate –
we will see performance measures laterwe will see performance measures later
Design Principles:Design Principles:– simplicity favors regularitysimplicity favors regularity
– smaller is fastersmaller is faster
– good design demands compromisegood design demands compromise
– make the common case fastmake the common case fast
Instruction set architectureInstruction set architecture– a very important abstraction indeed!a very important abstraction indeed!
Links to some instruction sets – next slide.Links to some instruction sets – next slide.
SummarySummary
2004 © Morgan Kaufman Publishers
Some Instruction SetsSome Instruction SetsMIPSMIPShttp://www.d.umn.edu/~gshute/mips/MIPS.html
ARMARMhttp://simplemachines.it/doc/arm_inst.pdf
IA32/64IA32/64http://brokenthorn.com/Resources/OSDevX86.html
PowerPCPowerPChttp://pds.twi.tudelft.nl/vakken/in101/labcourse/instruction-set/
SPARCSPARChttp://www.cs.unm.edu/~maccabe/classes/341/labman/node9
Spr 2015, Jan 26 . . .Spr 2015, Jan 26 . . . ELEC 5200-001/6200-001 Lecture 3ELEC 5200-001/6200-001 Lecture 3 5858
Preview: Project – to be assignedPreview: Project – to be assigned
Part 1 – Design an instruction set for a 16-bit Part 1 – Design an instruction set for a 16-bit processor.processor.
The ISA may contain no more than 16 unique instructions. However, you may have multiple formats for a given type of instruction, if necessary.
Of the 16 instructions, at least one instruction should make your processor HALT.
The ISA is to support 16-bit data words only. (No byte operands.) All operands are to be 16-bit signed integers (2’s complement). Each instruction must be encoded using one 16-bit word.
Spr 2015, Jan 26 . . .Spr 2015, Jan 26 . . . ELEC 5200-001/6200-001 Lecture 3ELEC 5200-001/6200-001 Lecture 3 5959
Project Preview (Cont.)Project Preview (Cont.)The ISA is to support linear addressing of 1K, 16-bit words memory. The memory is to be word-addressable only - not byte-addressable.
The ISA should contain appropriate numbers and types of user-programmable registers to support it. Since this is a small processor, the hardware does not necessarily need to support dedicated registers for stack pointer, frame pointer, etc.
The ISA must “support” C Programming Language constructs.
Control flow structures: “if-else” structures, “while” loops, “for” loops.
Functions (call and return).
Spr 2015, Jan 26 . . .Spr 2015, Jan 26 . . . ELEC 5200-001/6200-001 Lecture 3ELEC 5200-001/6200-001 Lecture 3 6060