Chapter 2: Instructions: Language of the Computer · Instruction Set Architecture –the portion of...
![Page 1: Chapter 2: Instructions: Language of the Computer · Instruction Set Architecture –the portion of the machine visible to the assembly level programmer or to the compiler writer](https://reader031.fdocuments.us/reader031/viewer/2022022118/5cc36a9988c993ac648c824c/html5/thumbnails/1.jpg)
Chapter 2: Instructions: Language of the Computer
CSCE 212 Introduction to Computer Architecture, Spring 2019 https://passlab.github.io/CSCE212/
Department of Computer Science and Engineering, Yonghong Yan
[email protected]://cse.sc.edu/~yanyh
![Page 2: Chapter 2: Instructions: Language of the Computer · Instruction Set Architecture –the portion of the machine visible to the assembly level programmer or to the compiler writer](https://reader031.fdocuments.us/reader031/viewer/2022022118/5cc36a9988c993ac648c824c/html5/thumbnails/2.jpg)
Chapter 2: Instructions: Language of the Computer
• Lecture 07
  – 2.1 Introduction
  – 2.2 Operations of the Computer Hardware
  – 2.3 Operands of the Computer Hardware
  – 2.4 Signed and Unsigned Numbers
  – 2.5 Representing Instructions in the Computer
• Lecture 08
  – 2.6 Logical Operations
  – 2.7 Instructions for Making Decisions
  – A.4–A.6: Loading, Memory and Procedure Call Convention
  – 2.8 Supporting Procedures in Computer Hardware
  – 2.9 Communicating with People
• Lecture 09
  – 2.10 MIPS Addressing for 32-Bit Immediates and Addresses
  – 2.11 Parallelism and Instructions: Synchronization
  – 2.12 Translating and Starting a Program
• We covered in Appendix A and C Basics
  – 2.13 A C Sort Example to Put It All Together
  – 2.14 Arrays versus Pointers
  – 2.15 Advanced Material: Compiling C and Interpreting Java
• Lecture 10
  – 2.16 Real Stuff: ARMv7 (32-bit) Instructions
  – 2.17 Real Stuff: x86 Instructions
  – 2.18 Real Stuff: ARMv8 (64-bit) Instructions
  – 2.19 Fallacies and Pitfalls
  – 2.20 Concluding Remarks
  – 2.21 Historical Perspective and Further Reading
![Page 3: Chapter 2: Instructions: Language of the Computer · Instruction Set Architecture –the portion of the machine visible to the assembly level programmer or to the compiler writer](https://reader031.fdocuments.us/reader031/viewer/2022022118/5cc36a9988c993ac648c824c/html5/thumbnails/3.jpg)
Review of Lecture 06
![Page 4: Chapter 2: Instructions: Language of the Computer · Instruction Set Architecture –the portion of the machine visible to the assembly level programmer or to the compiler writer](https://reader031.fdocuments.us/reader031/viewer/2022022118/5cc36a9988c993ac648c824c/html5/thumbnails/4.jpg)
Review: C Basics
#include <stdio.h>
int main(int argc, char* argv[])
{
    /* print a greeting */
    printf("Good evening!\n");
    return 0;
}
![Page 5: Chapter 2: Instructions: Language of the Computer · Instruction Set Architecture –the portion of the machine visible to the assembly level programmer or to the compiler writer](https://reader031.fdocuments.us/reader031/viewer/2022022118/5cc36a9988c993ac648c824c/html5/thumbnails/5.jpg)
C Variable and Pointer
& = address of
* = contents at
![Page 6: Chapter 2: Instructions: Language of the Computer · Instruction Set Architecture –the portion of the machine visible to the assembly level programmer or to the compiler writer](https://reader031.fdocuments.us/reader031/viewer/2022022118/5cc36a9988c993ac648c824c/html5/thumbnails/6.jpg)
C Pointer and Memory
& = address of
* = contents at
![Page 7: Chapter 2: Instructions: Language of the Computer · Instruction Set Architecture –the portion of the machine visible to the assembly level programmer or to the compiler writer](https://reader031.fdocuments.us/reader031/viewer/2022022118/5cc36a9988c993ac648c824c/html5/thumbnails/7.jpg)
Example 2: swap_2
void swap_2(int *a, int *b)
{
    int temp;
    temp = *a;
    *a = *b;
    *b = temp;
}
void call_swap_2(void)
{
    int x = 3;
    int y = 4;
    swap_2(&x, &y);
    /* values of x and y ? */
}
Q: Let x = 3, y = 4. After swap_2(&x, &y), x = ? y = ?
A1: x = 3; y = 4;
A2: x = 4; y = 3;
![Page 8: Chapter 2: Instructions: Language of the Computer · Instruction Set Architecture –the portion of the machine visible to the assembly level programmer or to the compiler writer](https://reader031.fdocuments.us/reader031/viewer/2022022118/5cc36a9988c993ac648c824c/html5/thumbnails/8.jpg)
Arrays
• Adjacent memory locations storing the same type of data
• int a[6]; means space for six integers
• a is the name of the array's base address
  – 0x0C
![Page 9: Chapter 2: Instructions: Language of the Computer · Instruction Set Architecture –the portion of the machine visible to the assembly level programmer or to the compiler writer](https://reader031.fdocuments.us/reader031/viewer/2022022118/5cc36a9988c993ac648c824c/html5/thumbnails/9.jpg)
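Address of Array Elements
• int a[6];
• a is the name of the array's base address
  – 0x0C
  – E.g. &a[2]: 0x0C + 2*4 = 0x14
• By itself, a is also the address of the first integer
  – *a and a[0] mean the same thing
• The address of a is not stored in memory: the compiler inserts code to compute it when it appears
• &a[i]: base address of a + i * sizeof(int), as a byte-address calculation (in C source, the expression a + i already scales i by sizeof(int))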
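The address rule above can be checked with a small C sketch. It assumes a flat, byte-addressed memory in which converting a pointer to `uintptr_t` yields its byte address (true on common platforms, but implementation-defined in C). `element_address` is a hypothetical helper name, not part of the lecture.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Byte address of a[i], computed by hand the way the slide does:
   base address + i * sizeof(int). */
uintptr_t element_address(const int *base, size_t i)
{
    assert(base != NULL);
    return (uintptr_t)base + i * sizeof(int);
}
```

For `int a[6]`, `element_address(a, 2)` should agree with `(uintptr_t)&a[2]`, which is what the compiler computes for `&a[2]`.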
![Page 10: Chapter 2: Instructions: Language of the Computer · Instruction Set Architecture –the portion of the machine visible to the assembly level programmer or to the compiler writer](https://reader031.fdocuments.us/reader031/viewer/2022022118/5cc36a9988c993ac648c824c/html5/thumbnails/10.jpg)
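C Stores Arrays in Memory in Row-Major Order
int A[3][4];
8 6 5 4
2 1 9 7
3 6 4 2
Offset of A[1][2]: sizeof(int) * (1 * 4 + 2) = 4 * 6 = 24
Address of element A[1][2]: A + offset (from A to A[1][2]) = A + sizeof(int) * (1 * 4 + 2) = A + 4 * 6 = A + 24, where A is the base byte address and sizeof(int) is 4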
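A minimal C sketch of the row-major offset calculation above. `row_major_offset` is a hypothetical helper. It returns a byte offset and uses `sizeof(int)` rather than a hard-coded 4, so the result is 24 only where `int` is 4 bytes.

```c
#include <assert.h>
#include <stddef.h>

/* Byte offset of A[row][col] from the start of an int array that has
   `cols` columns and is stored row by row (row-major order). */
size_t row_major_offset(size_t row, size_t col, size_t cols)
{
    assert(col < cols);
    return sizeof(int) * (row * cols + col);
}
```

For `int A[3][4]`, `row_major_offset(1, 2, 4)` skips one full row of 4 elements plus 2 elements, i.e. 6 elements in total.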
![Page 11: Chapter 2: Instructions: Language of the Computer · Instruction Set Architecture –the portion of the machine visible to the assembly level programmer or to the compiler writer](https://reader031.fdocuments.us/reader031/viewer/2022022118/5cc36a9988c993ac648c824c/html5/thumbnails/11.jpg)
End of Review of Lecture 06
![Page 12: Chapter 2: Instructions: Language of the Computer · Instruction Set Architecture –the portion of the machine visible to the assembly level programmer or to the compiler writer](https://reader031.fdocuments.us/reader031/viewer/2022022118/5cc36a9988c993ac648c824c/html5/thumbnails/12.jpg)
![Page 13: Chapter 2: Instructions: Language of the Computer · Instruction Set Architecture –the portion of the machine visible to the assembly level programmer or to the compiler writer](https://reader031.fdocuments.us/reader031/viewer/2022022118/5cc36a9988c993ac648c824c/html5/thumbnails/13.jpg)
MIPS and X86_64 Assembly Example
§2.1 Introduction
![Page 14: Chapter 2: Instructions: Language of the Computer · Instruction Set Architecture –the portion of the machine visible to the assembly level programmer or to the compiler writer](https://reader031.fdocuments.us/reader031/viewer/2022022118/5cc36a9988c993ac648c824c/html5/thumbnails/14.jpg)
Instruction Set
• The repertoire of instructions of a computer
• Different computers have different instruction sets
  – But with many aspects in common
• Early computers had very simple instruction sets
  – Simplified implementation
• Many modern computers also have simple instruction sets
![Page 15: Chapter 2: Instructions: Language of the Computer · Instruction Set Architecture –the portion of the machine visible to the assembly level programmer or to the compiler writer](https://reader031.fdocuments.us/reader031/viewer/2022022118/5cc36a9988c993ac648c824c/html5/thumbnails/15.jpg)
Instruction Set Architecture: the Interface between Hardware and Software
Instruction Set Architecture
  – The portion of the machine visible to the assembly-level programmer or to the compiler writer
  – To use the hardware of a computer, we must speak its language
  – The words of a computer language are called instructions, and its vocabulary is called an instruction set
Layers (top to bottom): software, instruction set, hardware

| Instr. # | Operation + Operands |
|---|---|
| i | movl -4(%ebp), %eax |
| (i+1) | addl %eax, (%edx) |
| (i+2) | cmpl 8(%ebp), %eax |
| (i+3) | jl L5 |
| | L5: |
![Page 16: Chapter 2: Instructions: Language of the Computer · Instruction Set Architecture –the portion of the machine visible to the assembly level programmer or to the compiler writer](https://reader031.fdocuments.us/reader031/viewer/2022022118/5cc36a9988c993ac648c824c/html5/thumbnails/16.jpg)
RISC vs. CISC
• Design "philosophies" for ISAs: RISC vs. CISC
  – CISC = Complex Instruction Set Computer
    • X86, X86_64 (Intel and AMD, mainstream desktop/laptop/server)
    • Modern X86 processors are still RISC-like internally: they translate instructions into simpler micro-operations
  – RISC = Reduced Instruction Set Computer
    • ARM: smartphone/pad
    • RISC-V: free ISA, closer to MIPS than other ISAs; the same textbook is available in a RISC-V version
    • Others: Power, SPARC, etc.
• Tradeoff:

$$\text{CPU Time} = \frac{\text{Instructions}}{\text{Program}} \times \frac{\text{Clock cycles}}{\text{Instruction}} \times \frac{\text{Seconds}}{\text{Clock cycle}}$$

• RISC:
  – Small instruction set
    • Easier for compilers
  – Limit each instruction to (at most):
    • three register accesses,
    • one memory access,
    • one ALU operation
    • => facilitates parallel instruction execution (ILP)
  – Load-store machine: minimize off-chip access
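The CPU time equation can be tried out with a short C sketch. The function name and the sample numbers in the usage note are illustrative assumptions, not measurements from the lecture. The clock cycle time (seconds per clock cycle) is passed as its reciprocal, the clock rate in Hz.

```c
#include <assert.h>

/* CPU time = (instructions / program) x (clock cycles / instruction)
              x (seconds / clock cycle).
   Seconds per clock cycle is 1 / clock_rate_hz. */
double cpu_time_seconds(double instruction_count, double cpi, double clock_rate_hz)
{
    assert(clock_rate_hz > 0.0);
    return instruction_count * cpi / clock_rate_hz;
}
```

With these made-up inputs, a program of 1e9 instructions at a CPI of 2.0 on a 2 GHz clock takes 1 second. A design with more instructions but a lower CPI can come out at the same time, which is the RISC vs. CISC tradeoff.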
![Page 17: Chapter 2: Instructions: Language of the Computer · Instruction Set Architecture –the portion of the machine visible to the assembly level programmer or to the compiler writer](https://reader031.fdocuments.us/reader031/viewer/2022022118/5cc36a9988c993ac648c824c/html5/thumbnails/17.jpg)
The MIPS Instruction Set
• Used as the example throughout the book
• Stanford MIPS commercialized by MIPS Technologies (www.mips.com)
• Large share of embedded core market
  – Applications in consumer electronics, network/storage equipment, cameras, printers, …
• Typical of many modern ISAs
  – See MIPS Reference Data tear-out card, and Appendixes B and E
• Other Instruction Set Architectures:
  – X86 and X86_64: Intel and AMD, mainstream desktop/laptop/server
  – ARM: smartphone/pad
  – RISC-V: emerging and free ISA, closer to MIPS than other ISAs
    • The same textbook in RISC-V version
  – Others: Power, SPARC, etc.
![Page 18: Chapter 2: Instructions: Language of the Computer · Instruction Set Architecture –the portion of the machine visible to the assembly level programmer or to the compiler writer](https://reader031.fdocuments.us/reader031/viewer/2022022118/5cc36a9988c993ac648c824c/html5/thumbnails/18.jpg)
Arithmetic Operations
• Add and subtract, three operands
  – Two sources and one destination
add a, b, c # a gets b + c
• All arithmetic operations have this form
• Design Principle 1: Simplicity favours regularity
  – Regularity makes implementation simpler
  – Simplicity enables higher performance at lower cost
§2.2 Operations of the Computer Hardware
![Page 19: Chapter 2: Instructions: Language of the Computer · Instruction Set Architecture –the portion of the machine visible to the assembly level programmer or to the compiler writer](https://reader031.fdocuments.us/reader031/viewer/2022022118/5cc36a9988c993ac648c824c/html5/thumbnails/19.jpg)
Arithmetic Example
• C code:
f = (g + h) - (i + j);
• Compiled MIPS code:
add t0, g, h # temp t0 = g + h
add t1, i, j # temp t1 = i + j
sub f, t0, t1 # f = t0 - t1
![Page 20: Chapter 2: Instructions: Language of the Computer · Instruction Set Architecture –the portion of the machine visible to the assembly level programmer or to the compiler writer](https://reader031.fdocuments.us/reader031/viewer/2022022118/5cc36a9988c993ac648c824c/html5/thumbnails/20.jpg)
Registers in CPU
• Registers are small, very fast storage locations inside the CPU.
• Data and instructions need to be loaded into registers in order to be processed.
§2.3 Operands of the Computer Hardware
![Page 21: Chapter 2: Instructions: Language of the Computer · Instruction Set Architecture –the portion of the machine visible to the assembly level programmer or to the compiler writer](https://reader031.fdocuments.us/reader031/viewer/2022022118/5cc36a9988c993ac648c824c/html5/thumbnails/21.jpg)
Register Operands
• Arithmetic instructions use register operands
• MIPS has a 32 × 32-bit register file
  – Use for frequently accessed data
  – Numbered 0 to 31
  – 32-bit data called a "word"
• Assembler names
  – $t0, $t1, …, $t9 for temporary values
  – $s0, $s1, …, $s7 for saved variables
• Design Principle 2: Smaller is faster
  – cf. main memory: millions of locations
![Page 22: Chapter 2: Instructions: Language of the Computer · Instruction Set Architecture –the portion of the machine visible to the assembly level programmer or to the compiler writer](https://reader031.fdocuments.us/reader031/viewer/2022022118/5cc36a9988c993ac648c824c/html5/thumbnails/22.jpg)
Register Operand Example
• C code:
f = (g + h) - (i + j);
  – f, …, j in $s0, …, $s4
• Compiled MIPS code:
add $t0, $s1, $s2 # register $t0 contains g + h
add $t1, $s3, $s4 # register $t1 contains i + j
sub $s0, $t0, $t1 # $s0 gets $t0 – $t1, which is (g + h) – (i + j)
![Page 23: Chapter 2: Instructions: Language of the Computer · Instruction Set Architecture –the portion of the machine visible to the assembly level programmer or to the compiler writer](https://reader031.fdocuments.us/reader031/viewer/2022022118/5cc36a9988c993ac648c824c/html5/thumbnails/23.jpg)
Memory Operands
• Main memory used for composite data
  – Arrays, structures, dynamic data
• To apply arithmetic operations
  – Load values from memory into registers
  – Store result from register to memory
• Memory is byte addressed
  – Each address identifies an 8-bit byte
• Words are aligned in memory
  – Address must be a multiple of 4
• MIPS is Big Endian
  – Most-significant byte at least address of a word
  – cf. Little Endian: least-significant byte at least address
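Byte order can be observed from C. This is a minimal sketch with hypothetical function names. `host_is_big_endian` reports the byte order of whatever machine runs the code, which on typical desktops is little-endian rather than the big-endian order the slide describes for MIPS. `load_word_big_endian` assembles a word the big-endian way regardless of the host.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Returns 1 if the host stores the most-significant byte of a word
   at the lowest address (big-endian), 0 otherwise. */
int host_is_big_endian(void)
{
    uint32_t word = 0x0A0B0C0Du;
    unsigned char bytes[sizeof word];
    memcpy(bytes, &word, sizeof word);
    return bytes[0] == 0x0A;
}

/* Build a 32-bit word from 4 consecutive bytes in big-endian order:
   p[0] is the most-significant byte. */
uint32_t load_word_big_endian(const unsigned char *p)
{
    assert(p != NULL);
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}
```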
![Page 24: Chapter 2: Instructions: Language of the Computer · Instruction Set Architecture –the portion of the machine visible to the assembly level programmer or to the compiler writer](https://reader031.fdocuments.us/reader031/viewer/2022022118/5cc36a9988c993ac648c824c/html5/thumbnails/24.jpg)
Memory Operand Example 1
• C code: int A[N];
g = h + A[8];
  – g in $s1, h in $s2, base address of A in $s3
• Compiled MIPS code:
  – Index 8 requires offset of 32; A[8] is a right-value reference → load
    • 4 bytes per word
lw $t0, 32($s3) # load word: 32 is the offset, $s3 is the base register
add $s1, $s2, $t0
![Page 25: Chapter 2: Instructions: Language of the Computer · Instruction Set Architecture –the portion of the machine visible to the assembly level programmer or to the compiler writer](https://reader031.fdocuments.us/reader031/viewer/2022022118/5cc36a9988c993ac648c824c/html5/thumbnails/25.jpg)
Memory Operand Example 2
• C code: int A[N];
A[12] = h + A[8];
  – h in $s2, base address of A in $s3
• Compiled MIPS code:
  – Index 8 requires offset of 32 and index 12 requires offset of 48; A[8] is a right-value (load), A[12] is a left-value (store)
lw $t0, 32($s3) # load word
add $t0, $s2, $t0
sw $t0, 48($s3) # store word
![Page 26: Chapter 2: Instructions: Language of the Computer · Instruction Set Architecture –the portion of the machine visible to the assembly level programmer or to the compiler writer](https://reader031.fdocuments.us/reader031/viewer/2022022118/5cc36a9988c993ac648c824c/html5/thumbnails/26.jpg)
Registers vs. Memory
• Registers are faster to access than memory
• Operating on memory data requires loads and stores
  – More instructions to be executed
• Compiler must use registers for variables as much as possible
  – Only spill to memory for less frequently used variables
  – Register optimization is important!
![Page 27: Chapter 2: Instructions: Language of the Computer · Instruction Set Architecture –the portion of the machine visible to the assembly level programmer or to the compiler writer](https://reader031.fdocuments.us/reader031/viewer/2022022118/5cc36a9988c993ac648c824c/html5/thumbnails/27.jpg)
Immediate Operands
• Constant data specified in an instruction
addi $s3, $s3, 4
• No subtract immediate instruction
  – Just use a negative constant
addi $s2, $s1, -1
• Design Principle 3: Make the common case fast
  – Small constants are common
  – Immediate operand avoids a load instruction
![Page 28: Chapter 2: Instructions: Language of the Computer · Instruction Set Architecture –the portion of the machine visible to the assembly level programmer or to the compiler writer](https://reader031.fdocuments.us/reader031/viewer/2022022118/5cc36a9988c993ac648c824c/html5/thumbnails/28.jpg)
The Constant Zero
• MIPS register 0 ($zero) is the constant 0
  – Cannot be overwritten
• Useful for common operations
  – E.g., move between registers
add $t2, $s1, $zero
![Page 29: Chapter 2: Instructions: Language of the Computer · Instruction Set Architecture –the portion of the machine visible to the assembly level programmer or to the compiler writer](https://reader031.fdocuments.us/reader031/viewer/2022022118/5cc36a9988c993ac648c824c/html5/thumbnails/29.jpg)
Unsigned Binary Integers
• Given an n-bit number

$$x = x_{n-1}2^{n-1} + x_{n-2}2^{n-2} + \cdots + x_1 2^1 + x_0 2^0$$

• Range: 0 to $+2^n - 1$
• Example
  – 0000 0000 0000 0000 0000 0000 0000 1011₂ = 0 + … + 1×2³ + 0×2² + 1×2¹ + 1×2⁰ = 0 + … + 8 + 0 + 2 + 1 = 11₁₀
• Using 32 bits
  – 0 to +4,294,967,295
§2.4 Signed and Unsigned Numbers
![Page 30: Chapter 2: Instructions: Language of the Computer · Instruction Set Architecture –the portion of the machine visible to the assembly level programmer or to the compiler writer](https://reader031.fdocuments.us/reader031/viewer/2022022118/5cc36a9988c993ac648c824c/html5/thumbnails/30.jpg)
2s-Complement Signed Integers
• Given an n-bit number

$$x = -x_{n-1}2^{n-1} + x_{n-2}2^{n-2} + \cdots + x_1 2^1 + x_0 2^0$$

• Range: $-2^{n-1}$ to $+2^{n-1} - 1$
• Example
  – 1111 1111 1111 1111 1111 1111 1111 1100₂ = –1×2³¹ + 1×2³⁰ + … + 1×2² + 0×2¹ + 0×2⁰ = –2,147,483,648 + 2,147,483,644 = –4₁₀
• Using 32 bits
  – –2,147,483,648 to +2,147,483,647
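The unsigned and 2s-complement readings of a bit pattern differ only in the sign of the top bit's weight. The C sketch below evaluates both weighted sums bit by bit. The function names are hypothetical, and the result is held in a 64-bit integer so that every 32-bit value fits.

```c
#include <assert.h>
#include <stdint.h>

/* Value of the low n bits of `bits` read as an unsigned integer:
   the sum of x_i * 2^i for i = 0 .. n-1. */
int64_t unsigned_value(uint32_t bits, unsigned n)
{
    assert(n >= 1 && n <= 32);
    int64_t value = 0;
    for (unsigned i = 0; i < n; i++)
        value += (int64_t)((bits >> i) & 1u) << i;
    return value;
}

/* Same bits read as 2s-complement: bit n-1 has weight -2^(n-1),
   all lower bits keep their positive weights. */
int64_t twos_complement_value(uint32_t bits, unsigned n)
{
    assert(n >= 1 && n <= 32);
    int64_t value = -((int64_t)((bits >> (n - 1)) & 1u) << (n - 1));
    for (unsigned i = 0; i + 1 < n; i++)
        value += (int64_t)((bits >> i) & 1u) << i;
    return value;
}
```

For the pattern 1111 … 1100₂ the signed reading gives –4, matching the example, while the unsigned reading of the same bits gives 4,294,967,292.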
![Page 31: Chapter 2: Instructions: Language of the Computer · Instruction Set Architecture –the portion of the machine visible to the assembly level programmer or to the compiler writer](https://reader031.fdocuments.us/reader031/viewer/2022022118/5cc36a9988c993ac648c824c/html5/thumbnails/31.jpg)
2s-Complement Signed Integers
• Bit 31 is sign bit
  – 1 for negative numbers
  – 0 for non-negative numbers
• $+2^{n-1}$ can't be represented
  – 1000…0 is negative now
• Non-negative numbers have the same unsigned and 2s-complement representation
• Some specific numbers
  – 0: 0000 0000 … 0000
  – –1: 1111 1111 … 1111
  – Most-negative: 1000 0000 … 0000, which is –2,147,483,648
  – Most-positive: 0111 1111 … 1111, which is 2,147,483,647

$$x = -x_{n-1}2^{n-1} + x_{n-2}2^{n-2} + \cdots + x_1 2^1 + x_0 2^0$$
![Page 32: Chapter 2: Instructions: Language of the Computer · Instruction Set Architecture –the portion of the machine visible to the assembly level programmer or to the compiler writer](https://reader031.fdocuments.us/reader031/viewer/2022022118/5cc36a9988c993ac648c824c/html5/thumbnails/32.jpg)
Signed Negation
• Complement and add 1
  – Complement means 1 → 0, 0 → 1

$$x + \bar{x} = 1111\ldots111_2 = -1$$

$$\bar{x} + 1 = -x$$

• Example: negate +2
  – +2 = 0000 0000 … 0010₂
  – –2 = complement of (+2), plus 1 = 1111 1111 … 1101₂ + 1 = 1111 1111 … 1110₂
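The complement-and-add-1 rule can be sketched in C on raw 32-bit patterns. Unsigned arithmetic is used on purpose, because it wraps modulo 2³² and so avoids signed-overflow pitfalls. Both function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Negate a 32-bit 2s-complement bit pattern: complement every bit, then add 1. */
uint32_t negate_bits(uint32_t x)
{
    return (uint32_t)(~x + 1u);
}

/* Checks the identities from the slide for one value of x:
   x + ~x is all ones (the pattern for -1), so ~x + 1 behaves as -x. */
void check_negation_identity(uint32_t x)
{
    assert((uint32_t)(x + ~x) == 0xFFFFFFFFu);
    assert((uint32_t)(x + negate_bits(x)) == 0u);
}
```

`negate_bits(2)` yields the pattern 1111 1111 … 1110₂ (0xFFFFFFFE), as in the example.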
![Page 33: Chapter 2: Instructions: Language of the Computer · Instruction Set Architecture –the portion of the machine visible to the assembly level programmer or to the compiler writer](https://reader031.fdocuments.us/reader031/viewer/2022022118/5cc36a9988c993ac648c824c/html5/thumbnails/33.jpg)
Sign Extension
• Representing a number using more bits
  – E.g. short a = -5; int b = a;
  – Preserve the numeric value
• Replicate the sign bit to the left
  – cf. unsigned values: extend with 0s
• Examples: 8-bit to 16-bit
  – +2: 0000 0010 => 0000 0000 0000 0010
  – –2: 1111 1110 => 1111 1111 1111 1110
• In MIPS instruction set
  – addi: extend immediate value
  – lb, lh: extend loaded byte/halfword
  – beq, bne: extend the displacement
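A minimal C sketch of sign extension versus zero extension on raw bit patterns. The function names and the `from_bits` parameter are illustrative assumptions. The sketch extends to 32 bits; the slide's 8-bit to 16-bit examples are the low 16 bits of the same results.

```c
#include <assert.h>
#include <stdint.h>

/* Sign-extend the low `from_bits` bits of `value` to 32 bits by
   replicating the sign bit into all higher positions. */
uint32_t sign_extend(uint32_t value, unsigned from_bits)
{
    assert(from_bits >= 1 && from_bits <= 32);
    if (from_bits == 32)
        return value;
    uint32_t low_mask = (1u << from_bits) - 1u;
    uint32_t sign_bit = 1u << (from_bits - 1);
    value &= low_mask;
    return (value & sign_bit) ? (value | ~low_mask) : value;
}

/* Zero-extend: used for unsigned values, the upper bits become 0. */
uint32_t zero_extend(uint32_t value, unsigned from_bits)
{
    assert(from_bits >= 1 && from_bits <= 32);
    return (from_bits == 32) ? value : (value & ((1u << from_bits) - 1u));
}
```

In C itself, `short a = -5; int b = a;` performs the same sign extension implicitly, so `b` is still –5.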
![Page 34: Chapter 2: Instructions: Language of the Computer · Instruction Set Architecture –the portion of the machine visible to the assembly level programmer or to the compiler writer](https://reader031.fdocuments.us/reader031/viewer/2022022118/5cc36a9988c993ac648c824c/html5/thumbnails/34.jpg)
Representing Instructions
• Instructions are encoded in binary
  – Called machine code
• MIPS instructions
  – Encoded as 32-bit instruction words
  – Small number of formats encoding operation code (opcode), register numbers, …
  – Regularity!
• Register numbers (total 32 registers) mapping convention
  – $t0 – $t7 are reg's $8 – $15
  – $t8 – $t9 are reg's $24 – $25
  – $s0 – $s7 are reg's $16 – $23
§2.5 Representing Instructions in the Computer
![Page 35: Chapter 2: Instructions: Language of the Computer · Instruction Set Architecture –the portion of the machine visible to the assembly level programmer or to the compiler writer](https://reader031.fdocuments.us/reader031/viewer/2022022118/5cc36a9988c993ac648c824c/html5/thumbnails/35.jpg)
MIPS R-format Instructions

| op | rs | rt | rd | shamt | funct |
|---|---|---|---|---|---|
| 6 bits | 5 bits | 5 bits | 5 bits | 5 bits | 6 bits |

• Instruction fields
  – op: operation code (opcode)
  – rs: first source register number
  – rt: second source register number
  – rd: destination register number
  – shamt: shift amount (00000 for now)
  – funct: function code (extends opcode)
![Page 36: Chapter 2: Instructions: Language of the Computer · Instruction Set Architecture –the portion of the machine visible to the assembly level programmer or to the compiler writer](https://reader031.fdocuments.us/reader031/viewer/2022022118/5cc36a9988c993ac648c824c/html5/thumbnails/36.jpg)
R-format Example
add $t0, $s1, $s2

| op | rs | rt | rd | shamt | funct |
|---|---|---|---|---|---|
| 6 bits | 5 bits | 5 bits | 5 bits | 5 bits | 6 bits |
| special | $s1 | $s2 | $t0 | 0 | add |
| 0 | 17 | 18 | 8 | 0 | 32 |
| 000000 | 10001 | 10010 | 01000 | 00000 | 100000 |

0000 0010 0011 0010 0100 0000 0010 0000₂ = 02324020₁₆
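The field packing in this example can be reproduced with shifts and ORs. The C sketch below uses a hypothetical `encode_r_format` helper. The field widths and the values 0, 17, 18, 8, 0, 32 are the ones from the table above.

```c
#include <assert.h>
#include <stdint.h>

/* Pack the six R-format fields into one 32-bit instruction word:
   op[31:26] rs[25:21] rt[20:16] rd[15:11] shamt[10:6] funct[5:0]. */
uint32_t encode_r_format(unsigned op, unsigned rs, unsigned rt,
                         unsigned rd, unsigned shamt, unsigned funct)
{
    assert(op < 64 && funct < 64);                        /* 6-bit fields */
    assert(rs < 32 && rt < 32 && rd < 32 && shamt < 32);  /* 5-bit fields */
    return ((uint32_t)op << 26) | ((uint32_t)rs << 21) |
           ((uint32_t)rt << 16) | ((uint32_t)rd << 11) |
           ((uint32_t)shamt << 6) | (uint32_t)funct;
}
```

`encode_r_format(0, 17, 18, 8, 0, 32)` gives 0x02324020, the machine word for `add $t0, $s1, $s2`.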
![Page 37: Chapter 2: Instructions: Language of the Computer · Instruction Set Architecture –the portion of the machine visible to the assembly level programmer or to the compiler writer](https://reader031.fdocuments.us/reader031/viewer/2022022118/5cc36a9988c993ac648c824c/html5/thumbnails/37.jpg)
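Hexadecimal
• Base 16
  – Compact representation of bit strings
  – 4 bits per hex digit

| Hex | Binary | Hex | Binary | Hex | Binary | Hex | Binary |
|---|---|---|---|---|---|---|---|
| 0 | 0000 | 4 | 0100 | 8 | 1000 | c | 1100 |
| 1 | 0001 | 5 | 0101 | 9 | 1001 | d | 1101 |
| 2 | 0010 | 6 | 0110 | a | 1010 | e | 1110 |
| 3 | 0011 | 7 | 0111 | b | 1011 | f | 1111 |

• Example: eca8 6420
  – 1110 1100 1010 1000 0110 0100 0010 0000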
![Page 38: Chapter 2: Instructions: Language of the Computer · Instruction Set Architecture –the portion of the machine visible to the assembly level programmer or to the compiler writer](https://reader031.fdocuments.us/reader031/viewer/2022022118/5cc36a9988c993ac648c824c/html5/thumbnails/38.jpg)
MIPS I-format Instructions

| op | rs | rt | constant or address |
|---|---|---|---|
| 6 bits | 5 bits | 5 bits | 16 bits |

• Immediate arithmetic and load/store instructions
  – rt: destination or source register number
  – Constant: $-2^{15}$ to $+2^{15} - 1$
  – Address: offset added to base address in rs
• Design Principle 4: Good design demands good compromises
  – Different formats complicate decoding, but allow 32-bit instructions uniformly
  – Keep formats as similar as possible
• Example: g = h + A[8];
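As a sketch of the I-format layout, the C function below packs op, rs, rt and a 16-bit signed constant. `encode_i_format` is a hypothetical helper. The opcode values used in the usage note (35 for lw, 8 for addi) come from the MIPS reference data, not from this slide.

```c
#include <assert.h>
#include <stdint.h>

/* Pack the I-format fields into one 32-bit instruction word:
   op[31:26] rs[25:21] rt[20:16] constant-or-address[15:0]. */
uint32_t encode_i_format(unsigned op, unsigned rs, unsigned rt, int32_t immediate)
{
    assert(op < 64 && rs < 32 && rt < 32);
    assert(immediate >= -32768 && immediate <= 32767);  /* -2^15 .. 2^15 - 1 */
    return ((uint32_t)op << 26) | ((uint32_t)rs << 21) |
           ((uint32_t)rt << 16) | ((uint32_t)immediate & 0xFFFFu);
}
```

The load for A[8] in `g = h + A[8];`, that is `lw $t0, 32($s3)`, becomes `encode_i_format(35, 19, 8, 32)`, with $s3 as register 19 and $t0 as register 8. A negative constant such as the –1 in `addi $s2, $s1, -1` is stored as its 16-bit 2s-complement pattern.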
![Page 39: Chapter 2: Instructions: Language of the Computer · Instruction Set Architecture –the portion of the machine visible to the assembly level programmer or to the compiler writer](https://reader031.fdocuments.us/reader031/viewer/2022022118/5cc36a9988c993ac648c824c/html5/thumbnails/39.jpg)
MIPS Instruction Encoding
• Textbook example on pages 84–85
![Page 40: Chapter 2: Instructions: Language of the Computer · Instruction Set Architecture –the portion of the machine visible to the assembly level programmer or to the compiler writer](https://reader031.fdocuments.us/reader031/viewer/2022022118/5cc36a9988c993ac648c824c/html5/thumbnails/40.jpg)
Stored Program Computers
The BIG Picture
• Instructions represented in binary, just like data
• Instructions and data stored in memory
• Programs can operate on programs
  – e.g., compilers, linkers, …
• Binary compatibility allows compiled programs to work on different computers
  – Standardized ISAs
![Page 41: Chapter 2: Instructions: Language of the Computer · Instruction Set Architecture –the portion of the machine visible to the assembly level programmer or to the compiler writer](https://reader031.fdocuments.us/reader031/viewer/2022022118/5cc36a9988c993ac648c824c/html5/thumbnails/41.jpg)
End of Lecture 07
![Page 42: Chapter 2: Instructions: Language of the Computer · Instruction Set Architecture –the portion of the machine visible to the assembly level programmer or to the compiler writer](https://reader031.fdocuments.us/reader031/viewer/2022022118/5cc36a9988c993ac648c824c/html5/thumbnails/42.jpg)
![Page 43: Chapter 2: Instructions: Language of the Computer · Instruction Set Architecture –the portion of the machine visible to the assembly level programmer or to the compiler writer](https://reader031.fdocuments.us/reader031/viewer/2022022118/5cc36a9988c993ac648c824c/html5/thumbnails/43.jpg)
Review of Lecture 07
![Page 44: Chapter 2: Instructions: Language of the Computer · Instruction Set Architecture –the portion of the machine visible to the assembly level programmer or to the compiler writer](https://reader031.fdocuments.us/reader031/viewer/2022022118/5cc36a9988c993ac648c824c/html5/thumbnails/44.jpg)
Review: Instruction Set Architecture: the Interface between Hardware and Software
Instruction Set Architecture
  – The portion of the machine visible to the assembly-level programmer or to the compiler writer
  – To use the hardware of a computer, we must speak its language
  – The words of a computer language are called instructions, and its vocabulary is called an instruction set
Layers (top to bottom): software, instruction set, hardware

| Instr. # | Operation + Operands |
|---|---|
| i | movl -4(%ebp), %eax |
| (i+1) | addl %eax, (%edx) |
| (i+2) | cmpl 8(%ebp), %eax |
| (i+3) | jl L5 |
| | L5: |
![Page 45: Chapter 2: Instructions: Language of the Computer · Instruction Set Architecture –the portion of the machine visible to the assembly level programmer or to the compiler writer](https://reader031.fdocuments.us/reader031/viewer/2022022118/5cc36a9988c993ac648c824c/html5/thumbnails/45.jpg)
Arithmetic-Logic Instructions (add, sub, addi, and, or, shift left|right, etc.)
• C code:
f = (g + h) - (i + j);
  – f, …, j in $s0, …, $s4
• Compiled MIPS code (R-type, i.e. registers as operands):
add $t0, $s1, $s2 # register $t0 contains g + h
add $t1, $s3, $s4 # register $t1 contains i + j
sub $s0, $t0, $t1 # $s0 gets $t0 – $t1, which is (g + h) – (i + j)
• I-type (immediate as one of the operands):
addi $s3, $s3, 4
![Page 46: Chapter 2: Instructions: Language of the Computer · Instruction Set Architecture –the portion of the machine visible to the assembly level programmer or to the compiler writer](https://reader031.fdocuments.us/reader031/viewer/2022022118/5cc36a9988c993ac648c824c/html5/thumbnails/46.jpg)
Memory Load/Store Instructions: lw and sw
• C code: int A[N];
A[12] = h + A[8];
  – h in $s2, base address of A in $s3
• Compiled MIPS code:
  – Index 8 requires offset of 32 and index 12 requires offset of 48; A[8] is a right-value (load), A[12] is a left-value (store)
lw $t0, 32($s3) # load word
add $t0, $s2, $t0
sw $t0, 48($s3) # store word
![Page 50: Chapter 2: Instructions: Language of the Computer · Instruction Set Architecture –the portion of the machine visible to the assembly level programmer or to the compiler writer](https://reader031.fdocuments.us/reader031/viewer/2022022118/5cc36a9988c993ac648c824c/html5/thumbnails/50.jpg)
Instruction Encoding: R-format, MIPS 32-bit instruction word, 32 registers
add $t0, $s1, $s2

| op | rs | rt | rd | shamt | funct |
|---|---|---|---|---|---|
| 6 bits | 5 bits | 5 bits | 5 bits | 5 bits | 6 bits |
| special | $s1 | $s2 | $t0 | 0 | add |
| 0 | 17 | 18 | 8 | 0 | 32 |
| 000000 | 10001 | 10010 | 01000 | 00000 | 100000 |

0000 0010 0011 0010 0100 0000 0010 0000₂ = 02324020₁₆
![Page 53: Chapter 2: Instructions: Language of the Computer · Instruction Set Architecture –the portion of the machine visible to the assembly level programmer or to the compiler writer](https://reader031.fdocuments.us/reader031/viewer/2022022118/5cc36a9988c993ac648c824c/html5/thumbnails/53.jpg)
End of Review of Lecture 07
![Page 54: Chapter 2: Instructions: Language of the Computer · Instruction Set Architecture –the portion of the machine visible to the assembly level programmer or to the compiler writer](https://reader031.fdocuments.us/reader031/viewer/2022022118/5cc36a9988c993ac648c824c/html5/thumbnails/54.jpg)
![Page 55: Chapter 2: Instructions: Language of the Computer · Instruction Set Architecture –the portion of the machine visible to the assembly level programmer or to the compiler writer](https://reader031.fdocuments.us/reader031/viewer/2022022118/5cc36a9988c993ac648c824c/html5/thumbnails/55.jpg)
Three Classes of Instructions We Will Focus On:
1. Arithmetic-logic instructions
  – add, sub, addi, and, or, shift left|right, etc.
2. Memory load and store instructions
  – lw and sw: load/store word
  – lb and sb: load/store byte
3. Control transfer instructions
  – Conditional branch: bne, beq
  – Unconditional jump: j
  – Procedure call and return: jal and jr
![Page 56: Chapter 2: Instructions: Language of the Computer · Instruction Set Architecture –the portion of the machine visible to the assembly level programmer or to the compiler writer](https://reader031.fdocuments.us/reader031/viewer/2022022118/5cc36a9988c993ac648c824c/html5/thumbnails/56.jpg)
Logical Operations
§2.6 Logical Operations
• Instructions for bitwise manipulation
Operation      C     Java    MIPS
Shift left     <<    <<      sll
Shift right    >>    >>>     srl
Bitwise AND    &     &       and, andi
Bitwise OR     |     |       or, ori
Bitwise NOT    ~     ~       nor
• Useful for extracting and inserting groups of bits in a word
![Page 57: Chapter 2: Instructions: Language of the Computer · Instruction Set Architecture –the portion of the machine visible to the assembly level programmer or to the compiler writer](https://reader031.fdocuments.us/reader031/viewer/2022022118/5cc36a9988c993ac648c824c/html5/thumbnails/57.jpg)
Shift Operations
op       rs       rt       rd       shamt    funct
6 bits   5 bits   5 bits   5 bits   5 bits   6 bits
• shamt: how many positions to shift
• Shift left logical
  – Shift left and fill with 0 bits
  – sll by i bits multiplies by 2^i
  – E.g. int a = b << 2; // a = b * 4 (2^2)
• Shift right logical
  – Shift right and fill with 0 bits
  – srl by i bits divides by 2^i (unsigned only)
  – E.g. int a = b >> 2; // a = b / 4 (2^2), for non-negative b
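As a rough sketch of the two logical shifts, the C helpers below model sll and srl on unsigned 32-bit values and then use a shift plus a mask to pull the shamt field out of an R-format word. The helper names are hypothetical, and the bit position of shamt (bits 10..6) follows from the field widths listed above.

```c
#include <stdint.h>

/* sll: shift left logical; vacated low bits are filled with 0. */
uint32_t sll(uint32_t value, unsigned shamt)
{
    return value << (shamt & 31u);   /* shamt is a 5-bit field, so 0..31 */
}

/* srl: shift right logical; vacated high bits are filled with 0.
   Using an unsigned type guarantees zero fill in C. */
uint32_t srl(uint32_t value, unsigned shamt)
{
    return value >> (shamt & 31u);
}

/* Extract the 5-bit shamt field (bits 10..6) of an R-format word:
   shift it down to bit 0, then mask off everything above 5 bits. */
uint32_t shamt_field(uint32_t instruction)
{
    return srl(instruction, 6) & 0x1Fu;
}
```

For example, sll(5, 2) multiplies 5 by 4, and srl(20, 2) divides 20 by 4.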
![Page 58: Chapter 2: Instructions: Language of the Computer · Instruction Set Architecture –the portion of the machine visible to the assembly level programmer or to the compiler writer](https://reader031.fdocuments.us/reader031/viewer/2022022118/5cc36a9988c993ac648c824c/html5/thumbnails/58.jpg)
AND Operations
• Useful to mask bits in a word
  – Select some bits, clear others to 0
and $t0, $t1, $t2
$t2   0000 0000 0000 0000 0000 1101 1100 0000
$t1   0000 0000 0000 0000 0011 1100 0000 0000
$t0   0000 0000 0000 0000 0000 1100 0000 0000
![Page 59: Chapter 2: Instructions: Language of the Computer · Instruction Set Architecture –the portion of the machine visible to the assembly level programmer or to the compiler writer](https://reader031.fdocuments.us/reader031/viewer/2022022118/5cc36a9988c993ac648c824c/html5/thumbnails/59.jpg)
OR Operations
• Useful to include bits in a word
  – Set some bits to 1, leave others unchanged
or $t0, $t1, $t2
$t2   0000 0000 0000 0000 0000 1101 1100 0000
$t1   0000 0000 0000 0000 0011 1100 0000 0000
$t0   0000 0000 0000 0000 0011 1101 1100 0000
![Page 60: Chapter 2: Instructions: Language of the Computer · Instruction Set Architecture –the portion of the machine visible to the assembly level programmer or to the compiler writer](https://reader031.fdocuments.us/reader031/viewer/2022022118/5cc36a9988c993ac648c824c/html5/thumbnails/60.jpg)
NOT Operations
• Useful to invert bits in a word
  – Change 0 to 1, and 1 to 0
• MIPS has NOR 3-operand instruction
  – a NOR b == NOT (a OR b)
nor $t0, $t1, $zero
$t1   0000 0000 0000 0000 0011 1100 0000 0000
$t0   1111 1111 1111 1111 1100 0011 1111 1111
Register 0: always read as zero
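The AND, OR, and NOR slides can be checked with a few lines of C. This is a minimal sketch with hypothetical function names; the constants 0x00003C00 ($t1) and 0x00000DC0 ($t2) are simply the bit patterns from the slides written in hexadecimal.

```c
#include <stdint.h>

/* and: keep only the bits that are 1 in both operands (masking). */
uint32_t mips_and(uint32_t a, uint32_t b) { return a & b; }

/* or: set every bit that is 1 in either operand. */
uint32_t mips_or(uint32_t a, uint32_t b) { return a | b; }

/* nor: NOT (a OR b). */
uint32_t mips_nor(uint32_t a, uint32_t b) { return ~(a | b); }

/* NOT is NOR with a zero operand, as in nor $t0, $t1, $zero. */
uint32_t mips_not(uint32_t a) { return mips_nor(a, 0u); }
```

With t1 = 0x00003C00 and t2 = 0x00000DC0, the three results match the $t0 rows shown on the slides: 0x00000C00 for AND, 0x00003DC0 for OR, and 0xFFFFC3FF for NOT of t1.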
![Page 61: Chapter 2: Instructions: Language of the Computer · Instruction Set Architecture –the portion of the machine visible to the assembly level programmer or to the compiler writer](https://reader031.fdocuments.us/reader031/viewer/2022022118/5cc36a9988c993ac648c824c/html5/thumbnails/61.jpg)
Conditional Branch and Unconditional Jump
§2.7 Instructions for Making Decisions
Conditional Branch
• Branch to a labeled instruction if a condition is true
  – Otherwise, continue sequentially
  – Label is the symbolic representation of the memory address of an instruction.
• beq rs, rt, L1
  – if (rs == rt) branch to instruction labeled L1;
• bne rs, rt, L1
  – if (rs != rt) branch to instruction labeled L1;
Unconditional Jump
• j L1
  – unconditional jump to instruction labeled L1
![Page 62: Chapter 2: Instructions: Language of the Computer · Instruction Set Architecture –the portion of the machine visible to the assembly level programmer or to the compiler writer](https://reader031.fdocuments.us/reader031/viewer/2022022118/5cc36a9988c993ac648c824c/html5/thumbnails/62.jpg)
Compiling If Statements
• C code:
if (i == j) f = g + h;
else f = g - h;
  – f, g, … in $s0, $s1, …
• Compiled MIPS code:
      bne $s3, $s4, Else
      add $s0, $s1, $s2
      j   Exit
Else: sub $s0, $s1, $s2
Exit: …
  – Assembler calculates addresses
![Page 63: Chapter 2: Instructions: Language of the Computer · Instruction Set Architecture –the portion of the machine visible to the assembly level programmer or to the compiler writer](https://reader031.fdocuments.us/reader031/viewer/2022022118/5cc36a9988c993ac648c824c/html5/thumbnails/63.jpg)
Compiling Loop Statements
• C code:
while (save[i] == k) i += 1;
  – i in $s3, k in $s5, address of save in $s6
• Compiled MIPS code:
Loop: sll  $t1, $s3, 2       # $t1 = i * 4 (byte offset)
      add  $t1, $t1, $s6     # $t1 = base + offset = address of save[i]
      lw   $t0, 0($t1)       # $t0 = save[i]
      bne  $t0, $s5, Exit    # exit the loop if save[i] != k
      addi $s3, $s3, 1       # i = i + 1
      j    Loop              # go back to Loop
Exit: …
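To make the correspondence between the while loop and its branch/jump form easier to see, here is a sketch in C that uses goto in the same places the MIPS code uses bne and j. The function name run_loop and the local variable names are hypothetical; the sketch assumes a 4-byte int, as on MIPS32.

```c
/* The shift by 2 below is only correct when int is 4 bytes wide. */
_Static_assert(sizeof(int) == 4, "sketch assumes 4-byte int");

/* Returns the final value of i after: while (save[i] == k) i += 1; */
int run_loop(const int save[], int i, int k)
{
    const char *base = (const char *)save;   /* plays the role of $s6 */
    int offset;                              /* plays the role of $t1 */
    int element;                             /* plays the role of $t0 */

loop:
    offset = i << 2;                           /* sll: byte offset = i * 4 */
    element = *(const int *)(base + offset);   /* add + lw: load save[i] */
    if (element != k) goto done;               /* bne: leave when save[i] != k */
    i = i + 1;                                 /* addi */
    goto loop;                                 /* j Loop */
done:
    return i;
}
```

With save = {7, 7, 7, 3}, i = 0 and k = 7, the loop stops at index 3, the first element that differs from k.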
![Page 64: Chapter 2: Instructions: Language of the Computer · Instruction Set Architecture –the portion of the machine visible to the assembly level programmer or to the compiler writer](https://reader031.fdocuments.us/reader031/viewer/2022022118/5cc36a9988c993ac648c824c/html5/thumbnails/64.jpg)
More Conditional Operations
• Set result to 1 if a condition is true
  – Otherwise, set to 0
• slt rd, rs, rt
  – if (rs < rt) rd = 1; else rd = 0;
• slti rt, rs, constant
  – if (rs < constant) rt = 1; else rt = 0;
• Use in combination with beq, bne
slt $t0, $s1, $s2    # if ($s1 < $s2)
bne $t0, $zero, L    # branch to L
![Page 65: Chapter 2: Instructions: Language of the Computer · Instruction Set Architecture –the portion of the machine visible to the assembly level programmer or to the compiler writer](https://reader031.fdocuments.us/reader031/viewer/2022022118/5cc36a9988c993ac648c824c/html5/thumbnails/65.jpg)
Branch Instruction Design
• Why not blt, bge, etc?
• Hardware for <, ≥, … slower than =, ≠
  – Combining with branch involves more work per instruction, requiring a slower clock
  – All instructions penalized!
• beq and bne are the common case
• This is a good design compromise
![Page 66: Chapter 2: Instructions: Language of the Computer · Instruction Set Architecture –the portion of the machine visible to the assembly level programmer or to the compiler writer](https://reader031.fdocuments.us/reader031/viewer/2022022118/5cc36a9988c993ac648c824c/html5/thumbnails/66.jpg)
Signed vs. Unsigned
• Signed comparison: slt, slti
• Unsigned comparison: sltu, sltiu
• Example
  – $s0 = 1111 1111 1111 1111 1111 1111 1111 1111
  – $s1 = 0000 0000 0000 0000 0000 0000 0000 0001
  – slt $t0, $s0, $s1 # signed
    • –1 < +1 ⇒ $t0 = 1
  – sltu $t0, $s0, $s1 # unsigned
    • +4,294,967,295 > +1 ⇒ $t0 = 0
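The same bit pattern compares differently depending on whether it is read as signed or unsigned. The following C sketch models slt and sltu with hypothetical helper functions. It assumes the usual two's complement conversion from uint32_t to int32_t, which mainstream compilers provide.

```c
#include <stdint.h>

/* slt: compare the two words as signed two's complement integers.
   The cast reinterprets 0xFFFFFFFF as -1 on two's complement targets. */
uint32_t slt(uint32_t rs, uint32_t rt)
{
    return ((int32_t)rs < (int32_t)rt) ? 1u : 0u;
}

/* sltu: compare the same bit patterns as unsigned integers. */
uint32_t sltu(uint32_t rs, uint32_t rt)
{
    return (rs < rt) ? 1u : 0u;
}
```

Calling slt(0xFFFFFFFF, 1) yields 1 because –1 < +1, while sltu(0xFFFFFFFF, 1) yields 0 because 4,294,967,295 > 1.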
![Page 67: Chapter 2: Instructions: Language of the Computer · Instruction Set Architecture –the portion of the machine visible to the assembly level programmer or to the compiler writer](https://reader031.fdocuments.us/reader031/viewer/2022022118/5cc36a9988c993ac648c824c/html5/thumbnails/67.jpg)
Memory layout of a program (process) and hardware support for function calls
![Page 68: Chapter 2: Instructions: Language of the Computer · Instruction Set Architecture –the portion of the machine visible to the assembly level programmer or to the compiler writer](https://reader031.fdocuments.us/reader031/viewer/2022022118/5cc36a9988c993ac648c824c/html5/thumbnails/68.jpg)
./a.out: Loading a File for Execution
§A.4 Loading
• Steps:
  – It reads the executable's header to determine the size of the text and data segments.
  – It creates a new address space for the program.
  – It copies instructions and data from the executable into the new address space.
  – It copies arguments passed to the program onto the stack.
  – It initializes the machine registers.
    • In general, most registers are cleared, but the stack pointer must be assigned the address of the first free stack location (see Section A.5).
  – It jumps to a start-up routine that copies the program's arguments from the stack to registers and calls the program's main routine.
    • When the main routine returns, the start-up routine terminates the program with the exit system call.
![Page 69: Chapter 2: Instructions: Language of the Computer · Instruction Set Architecture –the portion of the machine visible to the assembly level programmer or to the compiler writer](https://reader031.fdocuments.us/reader031/viewer/2022022118/5cc36a9988c993ac648c824c/html5/thumbnails/69.jpg)
Process Memory Layout
• Text: program code
• Static data: global variables
  – e.g., static variables in C, constant arrays and strings
  – $gp initialized to address allowing ±offsets into this segment
• Dynamic data: heap
  – E.g., malloc in C, new in Java
• Stack: automatic storage
![Page 70: Chapter 2: Instructions: Language of the Computer · Instruction Set Architecture –the portion of the machine visible to the assembly level programmer or to the compiler writer](https://reader031.fdocuments.us/reader031/viewer/2022022118/5cc36a9988c993ac648c824c/html5/thumbnails/70.jpg)
Program Counter (PC)
• A register to hold the address of the current instruction being executed.
  – A better name: instruction address register.
[Figure label: relative address]
![Page 71: Chapter 2: Instructions: Language of the Computer · Instruction Set Architecture –the portion of the machine visible to the assembly level programmer or to the compiler writer](https://reader031.fdocuments.us/reader031/viewer/2022022118/5cc36a9988c993ac648c824c/html5/thumbnails/71.jpg)
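Linux Process Memory in 32-bit System (4 GB space)
§A.5 Memory Usage
• Code (machine instructions) → text segment
• Static variables → data or BSS segment
• Function variables → stack (i, A[100] and B)
  – A is an array; the memory for its 100 int elements is in the stack
  – B is a variable that stores a memory address; B itself is stored in the stack, but the memory B points to is in the heap (100 int elements)
• Dynamically allocated memory using malloc or C++ "new" → heap (B[100]); this memory persists across function calls
#include <stdio.h>
#include <stdlib.h>   /* malloc, free */

static char *gonzo = "God's own prototype";   /* initialized pointer: data segment */
static char *userName;                        /* uninitialized pointer: BSS segment */

int main(int argc, char *argv[]) {
    int i;                                       /* stack */
    int A[100];                                  /* stack */
    int *B = (int *)malloc(sizeof(int) * 100);   /* B on stack, its 100 ints on heap */

    for (i = 0; i < 100; i++) {
        A[i] = i * i;
        B[i] = A[i] * 20;
        printf("A[i]: %d, B[i]: %d\n", A[i], B[i]);
    }
    free(B);   /* heap memory must be released explicitly */
    return 0;
}
• The stack size is limited. If the limit is 8 MB, "int A[10000000]" (about 40 MB of int elements) won't work.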
![Page 72: Chapter 2: Instructions: Language of the Computer · Instruction Set Architecture –the portion of the machine visible to the assembly level programmer or to the compiler writer](https://reader031.fdocuments.us/reader031/viewer/2022022118/5cc36a9988c993ac648c824c/html5/thumbnails/72.jpg)
Procedure Calling
§2.8 Supporting Procedures in Computer Hardware
§A.6 Procedure Call Convention
• Steps required
  – Place parameters in registers
  – Transfer control to procedure
  – Acquire storage for procedure
  – Perform procedure's operations
  – Place result in register for caller
  – Return to place of call
![Page 73: Chapter 2: Instructions: Language of the Computer · Instruction Set Architecture –the portion of the machine visible to the assembly level programmer or to the compiler writer](https://reader031.fdocuments.us/reader031/viewer/2022022118/5cc36a9988c993ac648c824c/html5/thumbnails/73.jpg)
Sum Example: sum_full.c
https://passlab.github.io/CSCE212/exercises/sum
![Page 74: Chapter 2: Instructions: Language of the Computer · Instruction Set Architecture –the portion of the machine visible to the assembly level programmer or to the compiler writer](https://reader031.fdocuments.us/reader031/viewer/2022022118/5cc36a9988c993ac648c824c/html5/thumbnails/74.jpg)
Sum Example: sum_full_mips.s
https://passlab.github.io/CSCE212/exercises/sum
• Arguments for sum call
• Memory address of sum entry
• Store return address in $31 and transfer control to sum
• Return to caller
![Page 75: Chapter 2: Instructions: Language of the Computer · Instruction Set Architecture –the portion of the machine visible to the assembly level programmer or to the compiler writer](https://reader031.fdocuments.us/reader031/viewer/2022022118/5cc36a9988c993ac648c824c/html5/thumbnails/75.jpg)
Procedure Call Instructions
• Procedure call: jump and link
jal ProcedureLabel
  – Address of following instruction put in $ra
  – Jumps to target address
• Procedure return: jump register
jr $ra
  – Copies $ra to program counter
  – Can also be used for computed jumps
    • e.g., for case/switch statements
![Page 76: Chapter 2: Instructions: Language of the Computer · Instruction Set Architecture –the portion of the machine visible to the assembly level programmer or to the compiler writer](https://reader031.fdocuments.us/reader031/viewer/2022022118/5cc36a9988c993ac648c824c/html5/thumbnails/76.jpg)
Register Usage
• $a0 – $a3: arguments (reg's 4 – 7)
• $v0, $v1: result values (reg's 2 and 3)
• $t0 – $t9: temporaries
  – Can be overwritten by callee
• $s0 – $s7: saved
  – Must be saved/restored by callee
• $gp: global pointer for static data (reg 28)
• $sp: stack pointer (reg 29)
• $fp: frame pointer (reg 30)
• $ra: return address (reg 31)
![Page 77: Chapter 2: Instructions: Language of the Computer · Instruction Set Architecture –the portion of the machine visible to the assembly level programmer or to the compiler writer](https://reader031.fdocuments.us/reader031/viewer/2022022118/5cc36a9988c993ac648c824c/html5/thumbnails/77.jpg)
Stack Memory Used for Function Calls
• The stack is a Last-In-First-Out (LIFO) data structure that stores the info of each function on the call path
  – main() calls foo(), foo() calls bar(), bar() calls tar()
  – Call: push the function onto the stack top
  – Return: pop the function from the top
• Stack frame, function frame, activation record
  – The memory that holds the info for each function call
[Figure: call stack holding main(), foo(), bar(), tar(); push and pop happen at the top]
![Page 78: Chapter 2: Instructions: Language of the Computer · Instruction Set Architecture –the portion of the machine visible to the assembly level programmer or to the compiler writer](https://reader031.fdocuments.us/reader031/viewer/2022022118/5cc36a9988c993ac648c824c/html5/thumbnails/78.jpg)
Stack Frame (Activation Record) of a Function Call
• Information:
  – Parameters
  – Local variables
  – Return address
  – Location to put return value when function exits
  – Control link to the caller's activation record
  – Saved registers
  – Temporary variables and intermediate results
  – (not always) Access link to the function's static parent
• Frame pointer (fp register): the starting address of AR
• Stack pointer (sp register): the ending address of AR
![Page 79: Chapter 2: Instructions: Language of the Computer · Instruction Set Architecture –the portion of the machine visible to the assembly level programmer or to the compiler writer](https://reader031.fdocuments.us/reader031/viewer/2022022118/5cc36a9988c993ac648c824c/html5/thumbnails/79.jpg)
Leaf Procedure Example
• Leaf procedure
  – A procedure that does not call other procedures
    • Think of procedure calls as a tree
• C code:
int leaf_example (int g, int h, int i, int j)
{
    int f;
    f = (g + h) - (i + j);
    return f;
}
  – Arguments g, …, j in $a0, …, $a3
  – f in $s0 (hence, need to save $s0 on stack)
  – Result in $v0
![Page 80: Chapter 2: Instructions: Language of the Computer · Instruction Set Architecture –the portion of the machine visible to the assembly level programmer or to the compiler writer](https://reader031.fdocuments.us/reader031/viewer/2022022118/5cc36a9988c993ac648c824c/html5/thumbnails/80.jpg)
Leaf Procedure Example
• MIPS code:
leaf_example:
    addi $sp, $sp, -4      # push: make room for 1 item
    sw   $s0, 0($sp)       # save $s0 on stack
    add  $t0, $a0, $a1     # procedure body: $t0 = g + h
    add  $t1, $a2, $a3     # $t1 = i + j
    sub  $s0, $t0, $t1     # f = (g + h) - (i + j)
    add  $v0, $s0, $zero   # result
    lw   $s0, 0($sp)       # restore $s0
    addi $sp, $sp, 4       # pop: release the item
    jr   $ra               # return
![Page 81: Chapter 2: Instructions: Language of the Computer · Instruction Set Architecture –the portion of the machine visible to the assembly level programmer or to the compiler writer](https://reader031.fdocuments.us/reader031/viewer/2022022118/5cc36a9988c993ac648c824c/html5/thumbnails/81.jpg)
Non-Leaf Procedures
• Procedures that call other procedures
• For nested call, caller needs to save on the stack:
  – Its return address
  – Any arguments and temporaries needed after the call
• Restore from the stack after the call
![Page 82: Chapter 2: Instructions: Language of the Computer · Instruction Set Architecture –the portion of the machine visible to the assembly level programmer or to the compiler writer](https://reader031.fdocuments.us/reader031/viewer/2022022118/5cc36a9988c993ac648c824c/html5/thumbnails/82.jpg)
Non-Leaf Procedure Example
• C code:
int fact (int n)
{
    if (n < 1) return 1;
    else return n * fact(n - 1);
}
  – Argument n in $a0
  – Result in $v0
![Page 83: Chapter 2: Instructions: Language of the Computer · Instruction Set Architecture –the portion of the machine visible to the assembly level programmer or to the compiler writer](https://reader031.fdocuments.us/reader031/viewer/2022022118/5cc36a9988c993ac648c824c/html5/thumbnails/83.jpg)
Non-Leaf Procedure Example
• MIPS code:
fact:
    addi $sp, $sp, -8      # push: adjust stack for 2 items
    sw   $ra, 4($sp)       # save return address
    sw   $a0, 0($sp)       # save argument n
    slti $t0, $a0, 1       # test for n < 1
    beq  $t0, $zero, L1    # if n >= 1, go to L1
    addi $v0, $zero, 1     # if n < 1, result is 1
    addi $sp, $sp, 8       # pop 2 items from stack
    jr   $ra               # and return
L1: addi $a0, $a0, -1      # else decrement n
    jal  fact              # recursive call
    lw   $a0, 0($sp)       # restore original n
    lw   $ra, 4($sp)       # and return address
    addi $sp, $sp, 8       # pop 2 items from stack
    mul  $v0, $a0, $v0     # multiply to get result
    jr   $ra               # and return
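One way to see what the pushes and pops accomplish is to write the factorial with an explicit stack instead of recursion. The sketch below is only loosely modeled on the MIPS code: it saves each value of n (the role of sw $a0) and multiplies as the values are popped (the role of lw $a0 followed by mul), but it does not model return addresses. The function name, the stack size, and the -1 error value for inputs above 12 are all assumptions made for this illustration.

```c
#define STACK_WORDS 16   /* enough for n <= 12, the largest n whose factorial fits in 32 bits */

/* Factorial using an explicit stack of saved arguments. */
int fact_explicit_stack(int n)
{
    int stack[STACK_WORDS];
    int sp = STACK_WORDS;   /* stack grows toward lower indices, like $sp */
    int result;

    if (n > 12) {
        return -1;          /* hypothetical error value: result would overflow int */
    }

    /* "Call" phase: push n, then continue with n - 1 until n < 1. */
    while (n >= 1) {
        sp = sp - 1;        /* like addi $sp, $sp, -4 for one word */
        stack[sp] = n;      /* like sw $a0, 0($sp) */
        n = n - 1;
    }

    result = 1;             /* base case: n < 1 returns 1 */

    /* "Return" phase: pop each saved n, innermost call first, and multiply. */
    while (sp < STACK_WORDS) {
        result = stack[sp] * result;   /* like lw $a0, 0($sp); mul $v0, $a0, $v0 */
        sp = sp + 1;                   /* like addi $sp, $sp, 4 */
    }
    return result;
}
```

The stack is deepest exactly when the base case is reached, which matches the point where the recursive MIPS version has pushed one frame per pending call.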
![Page 84: Chapter 2: Instructions: Language of the Computer · Instruction Set Architecture –the portion of the machine visible to the assembly level programmer or to the compiler writer](https://reader031.fdocuments.us/reader031/viewer/2022022118/5cc36a9988c993ac648c824c/html5/thumbnails/84.jpg)
Reading After Class
• Read and understand the examples in 2.8 and A.6
![Page 85: Chapter 2: Instructions: Language of the Computer · Instruction Set Architecture –the portion of the machine visible to the assembly level programmer or to the compiler writer](https://reader031.fdocuments.us/reader031/viewer/2022022118/5cc36a9988c993ac648c824c/html5/thumbnails/85.jpg)
Character Data
§2.9 Communicating with People
• Byte-encoded character sets
  – ASCII: 128 characters
    • 95 graphic, 33 control
    • char v = 'a'; // store byte-size character a in variable v
  – Latin-1: 256 characters
    • ASCII, +96 more graphic characters
• Unicode: 32-bit character set
  – Used in Java, C++ wide characters, …
  – Most of the world's alphabets, plus symbols
  – UTF-8, UTF-16: variable-length encodings
![Page 86: Chapter 2: Instructions: Language of the Computer · Instruction Set Architecture –the portion of the machine visible to the assembly level programmer or to the compiler writer](https://reader031.fdocuments.us/reader031/viewer/2022022118/5cc36a9988c993ac648c824c/html5/thumbnails/86.jpg)
Byte/Halfword Operations
• Could use bitwise operations
• MIPS byte/halfword load/store to a 32-bit register
  – String processing is a common case
lb rt, offset(rs)     lh rt, offset(rs)
  – Sign extend to 32 bits in rt
lbu rt, offset(rs)    lhu rt, offset(rs)
  – Zero extend to 32 bits in rt
sb rt, offset(rs)     sh rt, offset(rs)
  – Store just rightmost byte/halfword
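The difference between sign extension and zero extension is easiest to see on a byte whose top bit is set. The following C sketch models lb, lbu, and sb on a plain byte array. The function names and the memory/address parameters are hypothetical, and the sign-extending cast assumes the usual two's complement behavior of mainstream compilers.

```c
#include <stdint.h>

/* lb: load one byte and sign-extend it to 32 bits. */
uint32_t load_byte_signed(const uint8_t *memory, uint32_t address)
{
    return (uint32_t)(int32_t)(int8_t)memory[address];
}

/* lbu: load one byte and zero-extend it to 32 bits. */
uint32_t load_byte_unsigned(const uint8_t *memory, uint32_t address)
{
    return (uint32_t)memory[address];
}

/* sb: store only the rightmost byte of the register value. */
void store_byte(uint8_t *memory, uint32_t address, uint32_t reg)
{
    memory[address] = (uint8_t)(reg & 0xFFu);
}
```

Storing the register value 0x12345680 with store_byte writes only 0x80. Loading that byte back gives 0xFFFFFF80 with the signed load and 0x00000080 with the unsigned load.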
![Page 87: Chapter 2: Instructions: Language of the Computer · Instruction Set Architecture –the portion of the machine visible to the assembly level programmer or to the compiler writer](https://reader031.fdocuments.us/reader031/viewer/2022022118/5cc36a9988c993ac648c824c/html5/thumbnails/87.jpg)
String Copy Example
• C code (naïve):
  – Null-terminated string
void strcpy (char x[], char y[])
{
    int i;
    i = 0;
    while ((x[i] = y[i]) != '\0')
        i += 1;
}
  – Addresses of x, y in $a0, $a1
  – i in $s0
![Page 88: Chapter 2: Instructions: Language of the Computer · Instruction Set Architecture –the portion of the machine visible to the assembly level programmer or to the compiler writer](https://reader031.fdocuments.us/reader031/viewer/2022022118/5cc36a9988c993ac648c824c/html5/thumbnails/88.jpg)
String Copy Example
• MIPS code:
strcpy:
    addi $sp, $sp, -4       # adjust stack for 1 item
    sw   $s0, 0($sp)        # save $s0
    add  $s0, $zero, $zero  # i = 0
L1: add  $t1, $s0, $a1      # addr of y[i] in $t1
    lbu  $t2, 0($t1)        # $t2 = y[i]
    add  $t3, $s0, $a0      # addr of x[i] in $t3
    sb   $t2, 0($t3)        # x[i] = y[i]
    beq  $t2, $zero, L2     # exit loop if y[i] == 0
    addi $s0, $s0, 1        # i = i + 1
    j    L1                 # next iteration of loop
L2: lw   $s0, 0($sp)        # restore saved $s0
    addi $sp, $sp, 4        # pop 1 item from stack
    jr   $ra                # and return
![Page 89: Chapter 2: Instructions: Language of the Computer · Instruction Set Architecture –the portion of the machine visible to the assembly level programmer or to the compiler writer](https://reader031.fdocuments.us/reader031/viewer/2022022118/5cc36a9988c993ac648c824c/html5/thumbnails/89.jpg)
End of Lecture 8
![Page 91: Chapter 2: Instructions: Language of the Computer · Instruction Set Architecture –the portion of the machine visible to the assembly level programmer or to the compiler writer](https://reader031.fdocuments.us/reader031/viewer/2022022118/5cc36a9988c993ac648c824c/html5/thumbnails/91.jpg)
Review of Lecture 8
IMPORTANT: Read and understand each assembly instruction and code in the examples
![Page 104: Chapter 2: Instructions: Language of the Computer · Instruction Set Architecture –the portion of the machine visible to the assembly level programmer or to the compiler writer](https://reader031.fdocuments.us/reader031/viewer/2022022118/5cc36a9988c993ac648c824c/html5/thumbnails/104.jpg)
32-bit Constants
§2.10 MIPS Addressing for 32-Bit Immediates and Addresses
op       rs       rt       constant or address
6 bits   5 bits   5 bits   16 bits
• Most constants are small
  – The 32-bit I-format instruction word includes a 16-bit field for the constant
  – A 16-bit immediate is usually sufficient
• For the occasional 32-bit constant
lui rt, constant
  – Copies the 16-bit constant to the left (upper) 16 bits of rt
  – Clears the right 16 bits of rt to 0
![Page 105: Chapter 2: Instructions: Language of the Computer · Instruction Set Architecture –the portion of the machine visible to the assembly level programmer or to the compiler writer](https://reader031.fdocuments.us/reader031/viewer/2022022118/5cc36a9988c993ac648c824c/html5/thumbnails/105.jpg)
Loading a 32-Bit Constant
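A 32-bit constant is typically built in two steps: lui fills the upper 16 bits, and an ori then fills the lower 16 bits. The C sketch below models that pair with hypothetical helper functions. The sample values 61 and 2304 are an assumption chosen for illustration; together they form 4,000,000 (0x003D0900).

```c
#include <stdint.h>

/* lui: place the 16-bit constant in the upper half and clear the lower half. */
uint32_t lui(uint16_t constant)
{
    return (uint32_t)constant << 16;
}

/* ori: OR the zero-extended 16-bit constant into the register value. */
uint32_t ori(uint32_t rs, uint16_t constant)
{
    return rs | (uint32_t)constant;
}

/* Build any 32-bit value from its two 16-bit halves, lui first, then ori. */
uint32_t load_32bit_constant(uint32_t value)
{
    uint32_t reg = lui((uint16_t)(value >> 16));     /* upper 16 bits */
    return ori(reg, (uint16_t)(value & 0xFFFFu));    /* lower 16 bits */
}
```

Here lui(61) produces 0x003D0000, and ori of that value with 2304 produces 4,000,000. ori is used rather than addi because ori zero-extends its immediate, so a lower half with its top bit set does not disturb the upper half.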
![Page 106: Chapter 2: Instructions: Language of the Computer · Instruction Set Architecture –the portion of the machine visible to the assembly level programmer or to the compiler writer](https://reader031.fdocuments.us/reader031/viewer/2022022118/5cc36a9988c993ac648c824c/html5/thumbnails/106.jpg)
Branch Addressing
op       rs       rt       constant or address as offset
6 bits   5 bits   5 bits   16 bits
• Branch instructions (I-format) specify
  – Opcode, two registers, target address
• Most branch targets are near branch
  – Forward or backward
• PC-relative addressing
  – Target address = PC + offset × 4
  – PC already incremented by 4 by this time
![Page 107: Chapter 2: Instructions: Language of the Computer · Instruction Set Architecture –the portion of the machine visible to the assembly level programmer or to the compiler writer](https://reader031.fdocuments.us/reader031/viewer/2022022118/5cc36a9988c993ac648c824c/html5/thumbnails/107.jpg)
Jump Addressing
op       address
6 bits   26 bits
• Jump (j and jal) targets could be anywhere in text segment
  – Encode full address in instruction
• (Pseudo) Direct jump addressing
  – Target address = PC[31…28] : (address × 4)
![Page 108: Chapter 2: Instructions: Language of the Computer · Instruction Set Architecture –the portion of the machine visible to the assembly level programmer or to the compiler writer](https://reader031.fdocuments.us/reader031/viewer/2022022118/5cc36a9988c993ac648c824c/html5/thumbnails/108.jpg)
Target Addressing Example
• Loop code from the earlier example
  – Assume Loop is at location 80000

| Instruction | Address | Encoded fields |
|---|---|---|
| Loop: sll $t1, $s3, 2 | 80000 | 0, 0, 19, 9, 4, 0 |
| add $t1, $t1, $s6 | 80004 | 0, 9, 22, 9, 0, 32 |
| lw $t0, 0($t1) | 80008 | 35, 9, 8, 0 |
| bne $t0, $s5, Exit | 80012 | 5, 8, 21, 2 |
| addi $s3, $s3, 1 | 80016 | 8, 19, 19, 1 |
| j Loop | 80020 | 2, 20000 |
| Exit: … | 80024 | |

• bne to Exit: the PC has already been incremented to 80016, so target = PC + offset × 4 = 80016 + 2 × 4 = 80024
• j Loop: target = address × 4 = 20000 × 4 = 80000
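The field values in this example can be packed into 32-bit instruction words with shifts and ORs. The sketch below assumes the standard MIPS R-, I-, and J-format layouts (6/5/5/5/5/6, 6/5/5/16, and 6/26 bits); the helper names are hypothetical.

```c
#include <stdint.h>

/* R-format: op(6) rs(5) rt(5) rd(5) shamt(5) funct(6) */
uint32_t encode_r(uint32_t op, uint32_t rs, uint32_t rt,
                  uint32_t rd, uint32_t shamt, uint32_t funct) {
    return ((op & 0x3Fu) << 26) | ((rs & 0x1Fu) << 21) | ((rt & 0x1Fu) << 16) |
           ((rd & 0x1Fu) << 11) | ((shamt & 0x1Fu) << 6) | (funct & 0x3Fu);
}

/* I-format: op(6) rs(5) rt(5) constant or address(16) */
uint32_t encode_i(uint32_t op, uint32_t rs, uint32_t rt, uint32_t imm16) {
    return ((op & 0x3Fu) << 26) | ((rs & 0x1Fu) << 21) |
           ((rt & 0x1Fu) << 16) | (imm16 & 0xFFFFu);
}

/* J-format: op(6) address(26) */
uint32_t encode_j(uint32_t op, uint32_t address26) {
    return ((op & 0x3Fu) << 26) | (address26 & 0x03FFFFFFu);
}
```

For instance, the first row (0, 0, 19, 9, 4, 0) packs to 0x00134900 and the j Loop row (2, 20000) packs to 0x08004E20.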
![Page 109: Chapter 2: Instructions: Language of the Computer · Instruction Set Architecture –the portion of the machine visible to the assembly level programmer or to the compiler writer](https://reader031.fdocuments.us/reader031/viewer/2022022118/5cc36a9988c993ac648c824c/html5/thumbnails/109.jpg)
Branching Far Away
• If the branch target is too far to encode with a 16-bit offset, the assembler rewrites the code
• Example
        beq $s0, $s1, L1
            ↓
        bne $s0, $s1, L2
        j   L1
    L2: …
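Whether a branch needs this rewrite comes down to a range check on the word offset. A minimal sketch, assuming the offset is a signed 16-bit word count relative to the address after the branch (the function name is hypothetical, and a real assembler may apply further rules):

```c
#include <stdbool.h>
#include <stdint.h>

/* True if target can be reached from a branch at branch_pc
   using a signed 16-bit word offset. */
bool branch_in_range(uint32_t branch_pc, uint32_t target) {
    int64_t delta = (int64_t)target - ((int64_t)branch_pc + 4);
    if (delta % 4 != 0) {
        return false;               /* targets must be word aligned */
    }
    int64_t words = delta / 4;
    return words >= INT16_MIN && words <= INT16_MAX;
}
```

When the check fails, the inverted branch plus j sequence shown on the slide is one way to reach the distant label.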
![Page 110: Chapter 2: Instructions: Language of the Computer · Instruction Set Architecture –the portion of the machine visible to the assembly level programmer or to the compiler writer](https://reader031.fdocuments.us/reader031/viewer/2022022118/5cc36a9988c993ac648c824c/html5/thumbnails/110.jpg)
C Sort Example
§2.13 A C Sort Example to Put It All Together
• Illustrates use of assembly instructions for a C bubble sort function
• Swap procedure (leaf)
    void swap(int v[], int k)
    {
      int temp;
      temp = v[k];
      v[k] = v[k+1];
      v[k+1] = temp;
    }
  – v in $a0, k in $a1, temp in $t0
![Page 111: Chapter 2: Instructions: Language of the Computer · Instruction Set Architecture –the portion of the machine visible to the assembly level programmer or to the compiler writer](https://reader031.fdocuments.us/reader031/viewer/2022022118/5cc36a9988c993ac648c824c/html5/thumbnails/111.jpg)
The Procedure Swap
• 4 bytes per element
    swap: sll $t1, $a1, 2     # $t1 = k * 4
          add $t1, $a0, $t1   # $t1 = v + (k * 4) (address of v[k])
          lw  $t0, 0($t1)     # $t0 (temp) = v[k]
          lw  $t2, 4($t1)     # $t2 = v[k+1]
          sw  $t2, 0($t1)     # v[k] = $t2 (v[k+1])
          sw  $t0, 4($t1)     # v[k+1] = $t0 (temp)
          jr  $ra             # return to calling routine
Note:
1. Array references (v[k] and v[k+1]) are translated to lw or sw depending on whether the reference is a read or a write (r-value or l-value).
2. v[k] = v[k+1] is translated to two instructions: an lw and an sw.
![Page 112: Chapter 2: Instructions: Language of the Computer · Instruction Set Architecture –the portion of the machine visible to the assembly level programmer or to the compiler writer](https://reader031.fdocuments.us/reader031/viewer/2022022118/5cc36a9988c993ac648c824c/html5/thumbnails/112.jpg)
The Sort Procedure in C
• Bubble sort
  – Double nested loop
• Non-leaf (calls swap)
    void sort(int v[], int n)
    {
      int i, j;
      for (i = 0; i < n; i += 1) {
        for (j = i - 1; j >= 0 && v[j] > v[j + 1]; j -= 1) {
          swap(v, j);
        }
      }
    }
  – v in $a0, n in $a1, i in $s0, j in $s1
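To try the example on a host machine before translating it to MIPS, the swap and sort procedures from these slides can be collected into one compilable unit. The is_sorted helper is a hypothetical addition for checking the result and is not part of the slides.

```c
#include <stdbool.h>

/* Leaf procedure: exchange v[k] and v[k+1]. */
void swap(int v[], int k) {
    int temp = v[k];
    v[k] = v[k + 1];
    v[k + 1] = temp;
}

/* Non-leaf procedure: sort v[0..n-1] in ascending order. */
void sort(int v[], int n) {
    int i, j;
    for (i = 0; i < n; i += 1) {
        /* move v[i] down until the element before it is no larger */
        for (j = i - 1; j >= 0 && v[j] > v[j + 1]; j -= 1) {
            swap(v, j);
        }
    }
}

/* Hypothetical helper: true if v[0..n-1] is in ascending order. */
bool is_sorted(const int v[], int n) {
    for (int i = 1; i < n; i += 1) {
        if (v[i - 1] > v[i]) {
            return false;
        }
    }
    return true;
}
```

Calling sort on an array such as {5, 2, 9, 1, 3} with n = 5 should leave it in ascending order.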
![Page 113: Chapter 2: Instructions: Language of the Computer · Instruction Set Architecture –the portion of the machine visible to the assembly level programmer or to the compiler writer](https://reader031.fdocuments.us/reader031/viewer/2022022118/5cc36a9988c993ac648c824c/html5/thumbnails/113.jpg)
![Page 114: Chapter 2: Instructions: Language of the Computer · Instruction Set Architecture –the portion of the machine visible to the assembly level programmer or to the compiler writer](https://reader031.fdocuments.us/reader031/viewer/2022022118/5cc36a9988c993ac648c824c/html5/thumbnails/114.jpg)
![Page 115: Chapter 2: Instructions: Language of the Computer · Instruction Set Architecture –the portion of the machine visible to the assembly level programmer or to the compiler writer](https://reader031.fdocuments.us/reader031/viewer/2022022118/5cc36a9988c993ac648c824c/html5/thumbnails/115.jpg)
Effect of Compiler Optimization
[Figure: four bar charts comparing optimization levels none, O1, O2, and O3: relative performance, clock cycles, instruction count, and CPI. Compiled with gcc for Pentium 4 under Linux.]
![Page 116: Chapter 2: Instructions: Language of the Computer · Instruction Set Architecture –the portion of the machine visible to the assembly level programmer or to the compiler writer](https://reader031.fdocuments.us/reader031/viewer/2022022118/5cc36a9988c993ac648c824c/html5/thumbnails/116.jpg)
Effect of Language and Algorithm
[Figure: three bar charts across C/none, C/O1, C/O2, C/O3, Java/int, and Java/JIT: Bubblesort relative performance, Quicksort relative performance, and Quicksort vs. Bubblesort speedup.]
![Page 117: Chapter 2: Instructions: Language of the Computer · Instruction Set Architecture –the portion of the machine visible to the assembly level programmer or to the compiler writer](https://reader031.fdocuments.us/reader031/viewer/2022022118/5cc36a9988c993ac648c824c/html5/thumbnails/117.jpg)
Lessons Learnt
• Instruction count and CPI are not good performance indicators in isolation
• Compiler optimizations are sensitive to the algorithm
  – High-level optimization → better performance
  – O2 is generally the default optimization level
• Java/JIT compiled code is significantly faster than JVM interpreted code
  – Comparable to optimized C in some cases
• Nothing can fix a dumb algorithm!
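The first lesson follows from the classic CPU time relation, execution time = instruction count × CPI / clock rate: neither factor alone determines the result. A minimal sketch, with a hypothetical function name and made-up numbers in the note below:

```c
/* Execution time in seconds from instruction count, average cycles
   per instruction, and clock rate in Hz. */
double cpu_time_seconds(double instruction_count, double cpi, double clock_rate_hz) {
    return instruction_count * cpi / clock_rate_hz;
}
```

As an invented illustration, on a 1 GHz clock a program executing 1.5 billion instructions at CPI 1.0 takes less time than one executing 1 billion instructions at CPI 2.0, even though it executes more instructions.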
![Page 118: Chapter 2: Instructions: Language of the Computer · Instruction Set Architecture –the portion of the machine visible to the assembly level programmer or to the compiler writer](https://reader031.fdocuments.us/reader031/viewer/2022022118/5cc36a9988c993ac648c824c/html5/thumbnails/118.jpg)
Arrays vs. Pointers
§2.14 Arrays versus Pointers
• Array indexing involves
  – Multiplying the index by the element size
  – Adding to the array base address
• Pointers correspond directly to memory addresses
  – Can avoid indexing complexity
![Page 119: Chapter 2: Instructions: Language of the Computer · Instruction Set Architecture –the portion of the machine visible to the assembly level programmer or to the compiler writer](https://reader031.fdocuments.us/reader031/viewer/2022022118/5cc36a9988c993ac648c824c/html5/thumbnails/119.jpg)
Example: Clearing an Array

Array version:
    void clear1(int array[], int size) {
      int i;
      for (i = 0; i < size; i += 1)
        array[i] = 0;
    }

           move $t0, $zero        # i = 0
    loop1: sll  $t1, $t0, 2       # $t1 = i * 4
           add  $t2, $a0, $t1     # $t2 = &array[i]
           sw   $zero, 0($t2)     # array[i] = 0
           addi $t0, $t0, 1       # i = i + 1
           slt  $t3, $t0, $a1     # $t3 = (i < size)
           bne  $t3, $zero, loop1 # if (…) goto loop1

Pointer version:
    void clear2(int *array, int size) {
      int *p;
      for (p = &array[0]; p < &array[size]; p = p + 1)
        *p = 0;
    }

           move $t0, $a0          # p = &array[0]
           sll  $t1, $a1, 2       # $t1 = size * 4
           add  $t2, $a0, $t1     # $t2 = &array[size]
    loop2: sw   $zero, 0($t0)     # Memory[p] = 0
           addi $t0, $t0, 4       # p = p + 4
           slt  $t3, $t0, $t2     # $t3 = (p < &array[size])
           bne  $t3, $zero, loop2 # if (…) goto loop2

• 6 instructions per iteration (array version) vs. 4 instructions per iteration (pointer version)
• If we use p != &array[size] as the loop condition, we do not need slt in the loop
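The two versions, plus the variant with the p != &array[size] condition mentioned in the note, can be compiled and compared on a host machine. This is a sketch: clear3 is a hypothetical name for the variant, and all three assume size is not negative.

```c
/* Array version: index i is scaled and added to the base on every iteration. */
void clear1(int array[], int size) {
    int i;
    for (i = 0; i < size; i += 1)
        array[i] = 0;
}

/* Pointer version: p steps through memory directly. */
void clear2(int *array, int size) {
    int *p;
    for (p = &array[0]; p < &array[size]; p = p + 1)
        *p = 0;
}

/* Variant using != as the loop test, so the comparison maps to a
   single bne in MIPS instead of slt followed by bne. */
void clear3(int *array, int size) {
    int *end = array + size;
    for (int *p = array; p != end; p = p + 1)
        *p = 0;
}
```

All three leave the same array contents; they differ only in how the loop is expressed.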
![Page 120: Chapter 2: Instructions: Language of the Computer · Instruction Set Architecture –the portion of the machine visible to the assembly level programmer or to the compiler writer](https://reader031.fdocuments.us/reader031/viewer/2022022118/5cc36a9988c993ac648c824c/html5/thumbnails/120.jpg)
Comparison of Array vs. Ptr
• Multiply "strength reduced" to shift
• Array version requires the shift to be inside the loop
  – Part of the index calculation for incremented i
  – cf. incrementing the pointer
• Compiler can achieve the same effect as manual use of pointers
  – Induction variable elimination
  – Better to make the program clearer and safer
![Page 121: Chapter 2: Instructions: Language of the Computer · Instruction Set Architecture –the portion of the machine visible to the assembly level programmer or to the compiler writer](https://reader031.fdocuments.us/reader031/viewer/2022022118/5cc36a9988c993ac648c824c/html5/thumbnails/121.jpg)
End of Lecture 09
![Page 122: Chapter 2: Instructions: Language of the Computer · Instruction Set Architecture –the portion of the machine visible to the assembly level programmer or to the compiler writer](https://reader031.fdocuments.us/reader031/viewer/2022022118/5cc36a9988c993ac648c824c/html5/thumbnails/122.jpg)
Review of Lecture 09
![Page 123: Chapter 2: Instructions: Language of the Computer · Instruction Set Architecture –the portion of the machine visible to the assembly level programmer or to the compiler writer](https://reader031.fdocuments.us/reader031/viewer/2022022118/5cc36a9988c993ac648c824c/html5/thumbnails/123.jpg)
Review: Target Addressing Example
• Loop code from the earlier example
  – Assume Loop is at location 80000

| Instruction | Address | Encoded fields |
|---|---|---|
| Loop: sll $t1, $s3, 2 | 80000 | 0, 0, 19, 9, 4, 0 |
| add $t1, $t1, $s6 | 80004 | 0, 9, 22, 9, 0, 32 |
| lw $t0, 0($t1) | 80008 | 35, 9, 8, 0 |
| bne $t0, $s5, Exit | 80012 | 5, 8, 21, 2 |
| addi $s3, $s3, 1 | 80016 | 8, 19, 19, 1 |
| j Loop | 80020 | 2, 20000 |
| Exit: … | 80024 | |

• bne to Exit: the PC has already been incremented to 80016, so target = PC + offset × 4 = 80016 + 2 × 4 = 80024
• j Loop: target = address × 4 = 20000 × 4 = 80000
![Page 124: Chapter 2: Instructions: Language of the Computer · Instruction Set Architecture –the portion of the machine visible to the assembly level programmer or to the compiler writer](https://reader031.fdocuments.us/reader031/viewer/2022022118/5cc36a9988c993ac648c824c/html5/thumbnails/124.jpg)
Example: C Sort, and Array vs Pointer
• IMPORTANT: Read and understand each assembly instruction and the code in the examples
![Page 125: Chapter 2: Instructions: Language of the Computer · Instruction Set Architecture –the portion of the machine visible to the assembly level programmer or to the compiler writer](https://reader031.fdocuments.us/reader031/viewer/2022022118/5cc36a9988c993ac648c824c/html5/thumbnails/125.jpg)
End of Review of Lecture 09
![Page 126: Chapter 2: Instructions: Language of the Computer · Instruction Set Architecture –the portion of the machine visible to the assembly level programmer or to the compiler writer](https://reader031.fdocuments.us/reader031/viewer/2022022118/5cc36a9988c993ac648c824c/html5/thumbnails/126.jpg)
Chapter 2: Instructions: Language of the Computer
• Lecture 07
  – 2.1 Introduction
  – 2.2 Operations of the Computer Hardware
  – 2.3 Operands of the Computer Hardware
  – 2.4 Signed and Unsigned Numbers
  – 2.5 Representing Instructions in the Computer
• Lecture 08
  – 2.6 Logical Operations
  – 2.7 Instructions for Making Decisions
  – A.4 – A.6: Loading, Memory and Procedure Call Convention
  – 2.8 Supporting Procedures in Computer Hardware
  – 2.9 Communicating with People
• Lecture 09
  – 2.10 MIPS Addressing for 32-Bit Immediates and Addresses
  – 2.11 Parallelism and Instructions: Synchronization
  – 2.12 Translating and Starting a Program
    • We covered in Appendix A and C Basics
  – 2.13 A C Sort Example to Put It All Together
  – 2.14 Arrays versus Pointers
  – 2.15 Advanced Material: Compiling C and Interpreting Java
• Lecture 10
  – 2.16 Real Stuff: ARMv7 (32-bit) Instructions
  – 2.17 Real Stuff: x86 Instructions
  – 2.18 Real Stuff: ARMv8 (64-bit) Instructions
  – 2.19 Fallacies and Pitfalls
  – 2.20 Concluding Remarks
  – 2.21 Historical Perspective and Further Reading
  – Introduction of MARS MIPS assembler and simulator
![Page 127: Chapter 2: Instructions: Language of the Computer · Instruction Set Architecture –the portion of the machine visible to the assembly level programmer or to the compiler writer](https://reader031.fdocuments.us/reader031/viewer/2022022118/5cc36a9988c993ac648c824c/html5/thumbnails/127.jpg)
ARM & MIPS Similarities
§2.16 Real Stuff: ARM Instructions
• ARM: the most popular embedded core
• Similar basic set of instructions to MIPS

| | ARM | MIPS |
|---|---|---|
| Date announced | 1985 | 1985 |
| Instruction size | 32 bits | 32 bits |
| Address space | 32-bit flat | 32-bit flat |
| Data alignment | Aligned | Aligned |
| Data addressing modes | 9 | 3 |
| Registers | 15 × 32-bit | 31 × 32-bit |
| Input/output | Memory mapped | Memory mapped |
![Page 128: Chapter 2: Instructions: Language of the Computer · Instruction Set Architecture –the portion of the machine visible to the assembly level programmer or to the compiler writer](https://reader031.fdocuments.us/reader031/viewer/2022022118/5cc36a9988c993ac648c824c/html5/thumbnails/128.jpg)
Compare and Branch in ARM
• Uses condition codes for the result of an arithmetic/logical instruction
  – Negative, zero, carry, overflow
  – Compare instructions set condition codes without keeping the result
• Each instruction can be conditional
  – Top 4 bits of instruction word: condition value
  – Can avoid branches over single instructions
![Page 129: Chapter 2: Instructions: Language of the Computer · Instruction Set Architecture –the portion of the machine visible to the assembly level programmer or to the compiler writer](https://reader031.fdocuments.us/reader031/viewer/2022022118/5cc36a9988c993ac648c824c/html5/thumbnails/129.jpg)
ARM vs MIPS
![Page 130: Chapter 2: Instructions: Language of the Computer · Instruction Set Architecture –the portion of the machine visible to the assembly level programmer or to the compiler writer](https://reader031.fdocuments.us/reader031/viewer/2022022118/5cc36a9988c993ac648c824c/html5/thumbnails/130.jpg)
Instruction Encoding
![Page 131: Chapter 2: Instructions: Language of the Computer · Instruction Set Architecture –the portion of the machine visible to the assembly level programmer or to the compiler writer](https://reader031.fdocuments.us/reader031/viewer/2022022118/5cc36a9988c993ac648c824c/html5/thumbnails/131.jpg)
The Intel x86 ISA
§2.17 Real Stuff: x86 Instructions
• Evolution with backward compatibility
  – 8080 (1974): 8-bit microprocessor
    • Accumulator, plus 3 index-register pairs
  – 8086 (1978): 16-bit extension to 8080
    • Complex instruction set (CISC)
  – 8087 (1980): floating-point coprocessor
    • Adds FP instructions and register stack
  – 80286 (1982): 24-bit addresses, MMU
    • Segmented memory mapping and protection
  – 80386 (1985): 32-bit extension (now IA-32)
    • Additional addressing modes and operations
    • Paged memory mapping as well as segments
![Page 132: Chapter 2: Instructions: Language of the Computer · Instruction Set Architecture –the portion of the machine visible to the assembly level programmer or to the compiler writer](https://reader031.fdocuments.us/reader031/viewer/2022022118/5cc36a9988c993ac648c824c/html5/thumbnails/132.jpg)
The Intel x86 ISA
• Further evolution…
  – i486 (1989): pipelined, on-chip caches and FPU
    • Compatible competitors: AMD, Cyrix, …
  – Pentium (1993): superscalar, 64-bit datapath
    • Later versions added MMX (Multi-Media eXtension) instructions
    • The infamous FDIV bug
  – Pentium Pro (1995), Pentium II (1997)
    • New microarchitecture (see Colwell, The Pentium Chronicles)
  – Pentium III (1999)
    • Added SSE (Streaming SIMD Extensions) and associated registers
  – Pentium 4 (2001)
    • New microarchitecture
    • Added SSE2 instructions
![Page 133: Chapter 2: Instructions: Language of the Computer · Instruction Set Architecture –the portion of the machine visible to the assembly level programmer or to the compiler writer](https://reader031.fdocuments.us/reader031/viewer/2022022118/5cc36a9988c993ac648c824c/html5/thumbnails/133.jpg)
The Intel x86 ISA
• And further…
  – AMD64 (2003): extended architecture to 64 bits
  – EM64T: Extended Memory 64 Technology (2004)
    • AMD64 adopted by Intel (with refinements)
    • Added SSE3 instructions
  – Intel Core (2006)
    • Added SSE4 instructions, virtual machine support
  – AMD64 (announced 2007): SSE5 instructions
    • Intel declined to follow, instead…
  – Advanced Vector Extension (announced 2008)
    • Longer SSE registers, more instructions
• If Intel didn't extend with compatibility, its competitors would!
  – Technical elegance ≠ market success
![Page 134: Chapter 2: Instructions: Language of the Computer · Instruction Set Architecture –the portion of the machine visible to the assembly level programmer or to the compiler writer](https://reader031.fdocuments.us/reader031/viewer/2022022118/5cc36a9988c993ac648c824c/html5/thumbnails/134.jpg)
Basic x86 Registers
![Page 135: Chapter 2: Instructions: Language of the Computer · Instruction Set Architecture –the portion of the machine visible to the assembly level programmer or to the compiler writer](https://reader031.fdocuments.us/reader031/viewer/2022022118/5cc36a9988c993ac648c824c/html5/thumbnails/135.jpg)
Basic x86 Addressing Modes
• Two operands per instruction

| Source/dest operand | Second source operand |
|---|---|
| Register | Register |
| Register | Immediate |
| Register | Memory |
| Memory | Register |
| Memory | Immediate |

• Memory addressing modes
  – Address in register
  – Address = Rbase + displacement
  – Address = Rbase + 2^scale × Rindex (scale = 0, 1, 2, or 3)
  – Address = Rbase + 2^scale × Rindex + displacement
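The most general memory addressing mode, base plus scaled index plus displacement, covers the others as special cases (index 0, displacement 0, or both). A minimal C sketch of the address arithmetic, assuming 32-bit addresses that wrap around; the function name is hypothetical.

```c
#include <stdint.h>

/* Effective address = base + 2^scale * index + displacement,
   with scale restricted to 0, 1, 2, or 3. */
uint32_t effective_address(uint32_t base, uint32_t index,
                           unsigned scale, int32_t displacement) {
    uint32_t scaled_index = index << (scale & 3u);  /* multiply by 1, 2, 4, or 8 */
    return base + scaled_index + (uint32_t)displacement;
}
```

With scale = 2 the index is multiplied by 4, which matches stepping through an array of 32-bit words.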
![Page 136: Chapter 2: Instructions: Language of the Computer · Instruction Set Architecture –the portion of the machine visible to the assembly level programmer or to the compiler writer](https://reader031.fdocuments.us/reader031/viewer/2022022118/5cc36a9988c993ac648c824c/html5/thumbnails/136.jpg)
x86 Instruction Encoding
• Variable length encoding
  – Postfix bytes specify addressing mode
  – Prefix bytes modify operation
    • Operand length, repetition, locking, …
![Page 137: Chapter 2: Instructions: Language of the Computer · Instruction Set Architecture –the portion of the machine visible to the assembly level programmer or to the compiler writer](https://reader031.fdocuments.us/reader031/viewer/2022022118/5cc36a9988c993ac648c824c/html5/thumbnails/137.jpg)
Implementing IA-32
• Complex instruction set makes implementation difficult
  – Hardware translates instructions to simpler microoperations
    • Simple instructions: 1–1
    • Complex instructions: 1–many
  – Microengine similar to RISC
  – Market share makes this economically viable
• Comparable performance to RISC
  – Compilers avoid complex instructions
![Page 138: Chapter 2: Instructions: Language of the Computer · Instruction Set Architecture –the portion of the machine visible to the assembly level programmer or to the compiler writer](https://reader031.fdocuments.us/reader031/viewer/2022022118/5cc36a9988c993ac648c824c/html5/thumbnails/138.jpg)
Concluding Remarks
§2.20 Concluding Remarks
• Design principles
  1. Simplicity favors regularity
  2. Smaller is faster
  3. Make the common case fast
  4. Good design demands good compromises
• Layers of software/hardware
  – Compiler, assembler, hardware
• MIPS: typical of RISC ISAs
  – cf. x86
![Page 139: Chapter 2: Instructions: Language of the Computer · Instruction Set Architecture –the portion of the machine visible to the assembly level programmer or to the compiler writer](https://reader031.fdocuments.us/reader031/viewer/2022022118/5cc36a9988c993ac648c824c/html5/thumbnails/139.jpg)
Concluding Remarks
• Measure MIPS instruction executions in benchmark programs
  – Consider making the common case fast
  – Consider compromises

| Instruction class | MIPS examples | SPEC2006 Int | SPEC2006 FP |
|---|---|---|---|
| Arithmetic | add, sub, addi | 16% | 48% |
| Data transfer | lw, sw, lb, lbu, lh, lhu, sb, lui | 35% | 36% |
| Logical | and, or, nor, andi, ori, sll, srl | 12% | 4% |
| Cond. branch | beq, bne, slt, slti, sltiu | 34% | 8% |
| Jump | j, jr, jal | 2% | 0% |
![Page 140: Chapter 2: Instructions: Language of the Computer · Instruction Set Architecture –the portion of the machine visible to the assembly level programmer or to the compiler writer](https://reader031.fdocuments.us/reader031/viewer/2022022118/5cc36a9988c993ac648c824c/html5/thumbnails/140.jpg)
Three Classes of Instructions
1. Arithmetic-logic instructions
  – add, sub, addi, and, or, shift left | right, etc.
2. Memory load and store instructions
  – lw and sw: load/store word
  – lb and sb: load/store byte
3. Control transfer instructions
  – Conditional branch: bne, beq
  – Unconditional jump: j
  – Procedure call and return: jal and jr
![Page 141: Chapter 2: Instructions: Language of the Computer · Instruction Set Architecture –the portion of the machine visible to the assembly level programmer or to the compiler writer](https://reader031.fdocuments.us/reader031/viewer/2022022118/5cc36a9988c993ac648c824c/html5/thumbnails/141.jpg)
MIPS Simulator
• MARS (MIPS Assembler and Runtime Simulator)
  – http://courses.missouristate.edu/KenVollmar/MARS/index.htm
  – https://courses.missouristate.edu/KenVollmar/mars/tutorial.htm
![Page 142: Chapter 2: Instructions: Language of the Computer · Instruction Set Architecture –the portion of the machine visible to the assembly level programmer or to the compiler writer](https://reader031.fdocuments.us/reader031/viewer/2022022118/5cc36a9988c993ac648c824c/html5/thumbnails/142.jpg)
Write Assembly Code in MARS
https://courses.missouristate.edu/KenVollmar/mars/tutorial.htm
• .data segment
• .text segment
• row-major.asm example
![Page 143: Chapter 2: Instructions: Language of the Computer · Instruction Set Architecture –the portion of the machine visible to the assembly level programmer or to the compiler writer](https://reader031.fdocuments.us/reader031/viewer/2022022118/5cc36a9988c993ac648c824c/html5/thumbnails/143.jpg)
row-major.asm: Offset and Address for a Row-Major 2-Dimensional Array
• Slide 49 of https://passlab.github.io/CSCE212/notes/lecture06_CBasic.pdf
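The offset calculation for a row-major 2-dimensional array can be sketched in C as below. This illustrates the general formula rather than the contents of row-major.asm itself; the function and parameter names are hypothetical.

```c
#include <stddef.h>

/* Byte offset of element [row][col] in a row-major array
   with num_cols columns and elem_size bytes per element. */
size_t row_major_offset(size_t row, size_t col, size_t num_cols, size_t elem_size) {
    return (row * num_cols + col) * elem_size;
}
```

The element's address is the array base address plus this offset, which is the same layout C uses for a declaration such as int m[4][16].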
![Page 144: Chapter 2: Instructions: Language of the Computer · Instruction Set Architecture –the portion of the machine visible to the assembly level programmer or to the compiler writer](https://reader031.fdocuments.us/reader031/viewer/2022022118/5cc36a9988c993ac648c824c/html5/thumbnails/144.jpg)
Multiplication in MIPS
• Multiplication: Textbook 3.3, page 188
  – The MIPS multiply unit contains two 32-bit registers called hi and lo
    • They are not general-purpose registers
  – The 64-bit product of two 32-bit operands is stored in the hi (upper 32 bits) and lo (lower 32 bits) registers
• Need mfhi and mflo to copy the values in hi and lo to general-purpose registers
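How a product is split across hi and lo can be modeled in C with a 64-bit intermediate. This is a sketch of a signed multiply; the HiLo type and the mult function are hypothetical names for illustration.

```c
#include <stdint.h>

/* Models the hi/lo register pair of the MIPS multiply unit. */
typedef struct {
    uint32_t hi;   /* upper 32 bits of the product, read with mfhi */
    uint32_t lo;   /* lower 32 bits of the product, read with mflo */
} HiLo;

/* Signed 32-bit by 32-bit multiply producing a 64-bit product. */
HiLo mult(int32_t a, int32_t b) {
    int64_t product = (int64_t)a * (int64_t)b;
    uint64_t bits = (uint64_t)product;
    HiLo result = { (uint32_t)(bits >> 32), (uint32_t)(bits & 0xFFFFFFFFu) };
    return result;
}
```

When the product fits in 32 bits, as in an array index calculation, only lo is needed.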
![Page 145: Chapter 2: Instructions: Language of the Computer · Instruction Set Architecture –the portion of the machine visible to the assembly level programmer or to the compiler writer](https://reader031.fdocuments.us/reader031/viewer/2022022118/5cc36a9988c993ac648c824c/html5/thumbnails/145.jpg)
row-major.asm Example
![Page 146: Chapter 2: Instructions: Language of the Computer · Instruction Set Architecture –the portion of the machine visible to the assembly level programmer or to the compiler writer](https://reader031.fdocuments.us/reader031/viewer/2022022118/5cc36a9988c993ac648c824c/html5/thumbnails/146.jpg)
row-major.asm Example in MARS: Edit Window and To Assembly
![Page 147: Chapter 2: Instructions: Language of the Computer · Instruction Set Architecture –the portion of the machine visible to the assembly level programmer or to the compiler writer](https://reader031.fdocuments.us/reader031/viewer/2022022118/5cc36a9988c993ac648c824c/html5/thumbnails/147.jpg)
row-major.asm Example in MARS: Assembly Window
![Page 148: Chapter 2: Instructions: Language of the Computer · Instruction Set Architecture –the portion of the machine visible to the assembly level programmer or to the compiler writer](https://reader031.fdocuments.us/reader031/viewer/2022022118/5cc36a9988c993ac648c824c/html5/thumbnails/148.jpg)
row-major.asm Example in MARS: After Execution
![Page 149: Chapter 2: Instructions: Language of the Computer · Instruction Set Architecture –the portion of the machine visible to the assembly level programmer or to the compiler writer](https://reader031.fdocuments.us/reader031/viewer/2022022118/5cc36a9988c993ac648c824c/html5/thumbnails/149.jpg)
Download and Try Examples
• MARS Tutorial: https://courses.missouristate.edu/KenVollmar/mars/tutorial.htm
  – Fibonacci.asm
  – row-major.asm
  – column-major.asm
• Rewrite the textbook examples to make them run
  – C sort
  – Array vs pointer
![Page 150: Chapter 2: Instructions: Language of the Computer · Instruction Set Architecture –the portion of the machine visible to the assembly level programmer or to the compiler writer](https://reader031.fdocuments.us/reader031/viewer/2022022118/5cc36a9988c993ac648c824c/html5/thumbnails/150.jpg)
End of Chapter 02
![Page 151: Chapter 2: Instructions: Language of the Computer · Instruction Set Architecture –the portion of the machine visible to the assembly level programmer or to the compiler writer](https://reader031.fdocuments.us/reader031/viewer/2022022118/5cc36a9988c993ac648c824c/html5/thumbnails/151.jpg)
Other slides of the chapter
![Page 152: Chapter 2: Instructions: Language of the Computer · Instruction Set Architecture –the portion of the machine visible to the assembly level programmer or to the compiler writer](https://reader031.fdocuments.us/reader031/viewer/2022022118/5cc36a9988c993ac648c824c/html5/thumbnails/152.jpg)
Synchronization
§2.11 Parallelism and Instructions: Synchronization
• Two processors sharing an area of memory
  – P1 writes, then P2 reads
  – Data race if P1 and P2 don't synchronize
    • Result depends on the order of accesses
• Hardware support required
  – Atomic read/write memory operation
  – No other access to the location allowed between the read and write
• Could be a single instruction
  – E.g., atomic swap of register ↔ memory
  – Or an atomic pair of instructions
![Page 153: Chapter 2: Instructions: Language of the Computer · Instruction Set Architecture –the portion of the machine visible to the assembly level programmer or to the compiler writer](https://reader031.fdocuments.us/reader031/viewer/2022022118/5cc36a9988c993ac648c824c/html5/thumbnails/153.jpg)
Synchronization in MIPS
• Load linked: ll rt, offset(rs)
• Store conditional: sc rt, offset(rs)
  – Succeeds if the location has not changed since the ll
    • Returns 1 in rt
  – Fails if the location has changed
    • Returns 0 in rt
• Example: atomic swap (to test/set lock variable)
    try: add $t0, $zero, $s4    # copy exchange value
         ll  $t1, 0($s1)        # load linked
         sc  $t0, 0($s1)        # store conditional
         beq $t0, $zero, try    # branch if store fails
         add $s4, $zero, $t1    # put loaded value in $s4
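The retry structure of the ll/sc swap has a rough analogue in C11 atomics. The sketch below uses a compare-and-exchange loop, which is a different primitive from ll/sc and is shown only as an analogy; the function name is hypothetical. C11 also provides atomic_exchange, which performs the swap in a single call.

```c
#include <stdatomic.h>

/* Atomically store new_value in *location and return the previous contents.
   The loop retries when another thread changed the location in between,
   much as the MIPS code branches back to try when sc fails. */
int atomic_swap(atomic_int *location, int new_value) {
    int old = atomic_load(location);
    while (!atomic_compare_exchange_weak(location, &old, new_value)) {
        /* on failure, old now holds the current contents; try again */
    }
    return old;
}
```

Used on a lock variable, a return value of 0 would mean the lock was free and is now held by the caller.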
![Page 154: Chapter 2: Instructions: Language of the Computer · Instruction Set Architecture –the portion of the machine visible to the assembly level programmer or to the compiler writer](https://reader031.fdocuments.us/reader031/viewer/2022022118/5cc36a9988c993ac648c824c/html5/thumbnails/154.jpg)
Translation and Startup
§2.12 Translating and Starting a Program
[Figure: translation and startup flow. Annotations: many compilers produce object modules directly; static linking.]
![Page 155: Chapter 2: Instructions: Language of the Computer · Instruction Set Architecture –the portion of the machine visible to the assembly level programmer or to the compiler writer](https://reader031.fdocuments.us/reader031/viewer/2022022118/5cc36a9988c993ac648c824c/html5/thumbnails/155.jpg)
Assembler Pseudoinstructions
• Most assembler instructions represent machine instructions one-to-one
• Pseudoinstructions: figments of the assembler's imagination
    move $t0, $t1     →  add $t0, $zero, $t1
    blt  $t0, $t1, L  →  slt $at, $t0, $t1
                         bne $at, $zero, L
  – $at (register 1): assembler temporary
![Page 156: Chapter 2: Instructions: Language of the Computer · Instruction Set Architecture –the portion of the machine visible to the assembly level programmer or to the compiler writer](https://reader031.fdocuments.us/reader031/viewer/2022022118/5cc36a9988c993ac648c824c/html5/thumbnails/156.jpg)
Producing an Object Module
• Assembler (or compiler) translates the program into machine instructions
• Provides information for building a complete program from the pieces
  – Header: describes the contents of the object module
  – Text segment: translated instructions
  – Static data segment: data allocated for the life of the program
  – Relocation info: for contents that depend on the absolute location of the loaded program
  – Symbol table: global definitions and external refs
  – Debug info: for associating with source code
![Page 157: Chapter 2: Instructions: Language of the Computer · Instruction Set Architecture –the portion of the machine visible to the assembly level programmer or to the compiler writer](https://reader031.fdocuments.us/reader031/viewer/2022022118/5cc36a9988c993ac648c824c/html5/thumbnails/157.jpg)
Linking Object Modules
• Produces an executable image
  1. Merges segments
  2. Resolves labels (determines their addresses)
  3. Patches location-dependent and external refs
• Could leave location dependencies for fixing by a relocating loader
  – But with virtual memory, no need to do this
  – Program can be loaded into an absolute location in virtual memory space
![Page 158: Chapter 2: Instructions: Language of the Computer · Instruction Set Architecture –the portion of the machine visible to the assembly level programmer or to the compiler writer](https://reader031.fdocuments.us/reader031/viewer/2022022118/5cc36a9988c993ac648c824c/html5/thumbnails/158.jpg)
Loading a Program
• Load from image file on disk into memory
  1. Read header to determine segment sizes
  2. Create virtual address space
  3. Copy text and initialized data into memory
     • Or set page table entries so they can be faulted in
  4. Set up arguments on stack
  5. Initialize registers (including $sp, $fp, $gp)
  6. Jump to startup routine
     • Copies arguments to $a0, … and calls main
     • When main returns, do exit syscall
![Page 159: Chapter 2: Instructions: Language of the Computer · Instruction Set Architecture –the portion of the machine visible to the assembly level programmer or to the compiler writer](https://reader031.fdocuments.us/reader031/viewer/2022022118/5cc36a9988c993ac648c824c/html5/thumbnails/159.jpg)
Dynamic Linking
• Only link/load a library procedure when it is called
  – Requires procedure code to be relocatable
  – Avoids image bloat caused by static linking of all (transitively) referenced libraries
  – Automatically picks up new library versions
![Page 160: Chapter 2: Instructions: Language of the Computer · Instruction Set Architecture –the portion of the machine visible to the assembly level programmer or to the compiler writer](https://reader031.fdocuments.us/reader031/viewer/2022022118/5cc36a9988c993ac648c824c/html5/thumbnails/160.jpg)
Lazy Linkage
[Figure: lazy procedure linkage. Labeled parts: indirection table; stub (loads routine ID, jumps to linker/loader); linker/loader code; dynamically mapped code.]
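The parts labeled in the lazy linkage figure can be mimicked with a function-pointer table in C. This is only a toy model under stated assumptions: real dynamic linkers patch tables set up by the toolchain, while here real_square stands in for the dynamically mapped code, resolve stands in for the linker/loader, and every name is hypothetical.

```c
#include <stddef.h>

typedef int (*routine_fn)(int);

/* Stands in for library code that would be mapped in on demand. */
static int real_square(int x) { return x * x; }

static int resolve_count = 0;        /* how many times the "linker" ran */

static int square_stub(int x);       /* forward declaration for the table */

/* Indirection table: the single slot initially points at the stub. */
static routine_fn indirection_table[1] = { square_stub };

/* Stands in for the linker/loader: look up the routine by ID. */
static routine_fn resolve(size_t routine_id) {
    (void)routine_id;                /* only one routine in this toy model */
    resolve_count += 1;
    return real_square;
}

/* Stub: on the first call, resolve the routine, patch the table, then call it. */
static int square_stub(int x) {
    indirection_table[0] = resolve(0);
    return indirection_table[0](x);
}

/* All calls go through the table, so later calls skip the stub. */
int call_square(int x) { return indirection_table[0](x); }

int times_resolved(void) { return resolve_count; }
```

Only the first call pays the resolution cost; after the table is patched, calls go straight to the routine.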
![Page 161: Chapter 2: Instructions: Language of the Computer · Instruction Set Architecture –the portion of the machine visible to the assembly level programmer or to the compiler writer](https://reader031.fdocuments.us/reader031/viewer/2022022118/5cc36a9988c993ac648c824c/html5/thumbnails/161.jpg)
Starting Java Applications
[Figure: starting Java applications. Annotations: simple portable instruction set for the JVM; interprets bytecodes; compiles bytecodes of "hot" methods into native code for host machine.]
![Page 162: Chapter 2: Instructions: Language of the Computer · Instruction Set Architecture –the portion of the machine visible to the assembly level programmer or to the compiler writer](https://reader031.fdocuments.us/reader031/viewer/2022022118/5cc36a9988c993ac648c824c/html5/thumbnails/162.jpg)
ARMv8 Instructions
§2.18 Real Stuff: ARMv8 (64-bit) Instructions
• In moving to 64-bit, ARM did a complete overhaul
• ARMv8 resembles MIPS
  – Changes from v7:
    • No conditional execution field
    • Immediate field is a 12-bit constant
    • Dropped load/store multiple
    • PC is no longer a GPR
    • GPR set expanded to 32
    • Addressing modes work for all word sizes
    • Divide instruction
    • Branch if equal/branch if not equal instructions
![Page 163: Chapter 2: Instructions: Language of the Computer · Instruction Set Architecture –the portion of the machine visible to the assembly level programmer or to the compiler writer](https://reader031.fdocuments.us/reader031/viewer/2022022118/5cc36a9988c993ac648c824c/html5/thumbnails/163.jpg)
Fallacies
§2.19 Fallacies and Pitfalls
• Powerful instruction ⇒ higher performance
  – Fewer instructions required
  – But complex instructions are hard to implement
    • May slow down all instructions, including simple ones
  – Compilers are good at making fast code from simple instructions
• Use assembly code for high performance
  – But modern compilers are better at dealing with modern processors
  – More lines of code ⇒ more errors and less productivity
![Page 164: Chapter 2: Instructions: Language of the Computer · Instruction Set Architecture –the portion of the machine visible to the assembly level programmer or to the compiler writer](https://reader031.fdocuments.us/reader031/viewer/2022022118/5cc36a9988c993ac648c824c/html5/thumbnails/164.jpg)
Fallacies
• Backward compatibility ⇒ instruction set doesn't change
  – But they do accrete more instructions
[Figure: x86 instruction set]
![Page 165: Chapter 2: Instructions: Language of the Computer · Instruction Set Architecture –the portion of the machine visible to the assembly level programmer or to the compiler writer](https://reader031.fdocuments.us/reader031/viewer/2022022118/5cc36a9988c993ac648c824c/html5/thumbnails/165.jpg)
Pitfalls
• Sequential words are not at sequential addresses
  – Increment by 4, not by 1!
• Keeping a pointer to an automatic variable after the procedure returns
  – e.g., passing a pointer back via an argument
  – Pointer becomes invalid when the stack is popped
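The first pitfall can be checked directly in C: adjacent 32-bit words sit 4 bytes apart, so byte addresses step by 4. A small sketch with hypothetical function names:

```c
#include <stddef.h>
#include <stdint.h>

/* Byte address of the index-th 32-bit word of an array starting at base. */
uint32_t word_address(uint32_t base, uint32_t index) {
    return base + 4u * index;   /* step by 4 bytes per word, not by 1 */
}

/* Distance in bytes between two adjacent 32-bit words in a C array. */
size_t word_stride_bytes(void) {
    int32_t words[2] = {0, 0};
    return (size_t)((const char *)&words[1] - (const char *)&words[0]);
}
```

C pointer arithmetic hides this scaling (p + 1 advances by one element), whereas in MIPS assembly the programmer adds 4 explicitly.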