CS 61C: Great Ideas in Computer Architecture CALL ...cs61c/sp16/lec/12/... · – Acts as a defense...
CS 61C: Great Ideas in Computer Architecture
CALL continued (Linking and Loading)
Instructors: Nicholas Weaver & Vladimir Stojanovic
http://inst.eecs.Berkeley.edu/~cs61c/sp16
Where Are We Now?
Linker (1/3)
• Input: Object code files, information tables (e.g., foo.o, libc.o for MIPS)
• Output: Executable code (e.g., a.out for MIPS)
• Combines several object (.o) files into a single executable (“linking”)
• Enable separate compilation of files
  – Changes to one file do not require recompilation of the whole program
    • Windows 7 source was >40M lines of code!
  – Old name “Link Editor” from editing the “links” in jump and link instructions
Linker (2/3)
[Figure: .o file 1 (text 1, data 1, info 1) and .o file 2 (text 2, data 2, info 2) feed into the Linker, which produces a.out containing relocated text 1, relocated text 2, relocated data 1, relocated data 2]
Linker (3/3)
• Step 1: Take text segment from each .o file and put them together
• Step 2: Take data segment from each .o file, put them together, and concatenate this onto end of text segments
• Step 3: Resolve references
  – Go through Relocation Table; handle each entry
  – That is, fill in all absolute addresses
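Steps 1 and 2 can be sketched in a few lines of Python. This is only an illustration: the module sizes are hypothetical, and it assumes, as this slide does, that the data segments directly follow the text segments.

```python
# Hypothetical segment sizes in bytes; real values come from each .o file.
modules = [
    {"name": "prog.o", "text_size": 0x60, "data_size": 0x24},
    {"name": "libc.o", "text_size": 0x400, "data_size": 0x100},
]

def lay_out(modules, base=0):
    """Return {(module name, segment): start address} for a text-then-data layout."""
    layout = {}
    addr = base
    # Step 1: place the text segments back to back.
    for m in modules:
        layout[(m["name"], "text")] = addr
        addr += m["text_size"]
    # Step 2: concatenate the data segments onto the end of the text.
    for m in modules:
        layout[(m["name"], "data")] = addr
        addr += m["data_size"]
    return layout

layout = lay_out(modules)
for (name, seg), start in layout.items():
    print(f"{name:8s} {seg:5s} starts at 0x{start:08x}")
```

Once every segment has a start address, Step 3 amounts to adding that start address to each label's offset within its module.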
Four Types of Addresses
• PC-Relative Addressing (beq, bne) – never relocate
• Absolute Function Address (j, jal) – always relocate
• External Function Reference (usually jal) – always relocate
• Static Data Reference (often lui and ori) – always relocate
Absolute Addresses in MIPS
• Which instructions need relocation editing?
  – J-format: jump, jump and link
    j/jal xxxxx
  – Loads and stores to variables in static area, relative to global pointer
    lw/sw $gp $x address
  – What about conditional branches?
    beq/bne $rs $rt address
  – PC-relative addressing preserved even if code moves
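To see why jumps need editing and branches do not, the sketch below computes both address fields before and after a module is moved. The addresses and the move distance are hypothetical; the field formulas are the standard MIPS ones (word address for j/jal, signed word offset from the next instruction for beq/bne).

```python
def j_target_field(target_addr):
    """26-bit field of j/jal: low 28 bits of the target byte address, divided by 4."""
    return (target_addr >> 2) & 0x03FFFFFF

def branch_offset(branch_addr, target_addr):
    """16-bit field of beq/bne: word distance from the instruction after the branch."""
    return (target_addr - (branch_addr + 4)) >> 2

delta = 0x1000  # hypothetical distance the linker moves this module

# A jump to a label at 0x00400120: the field changes when the code moves.
j_before = j_target_field(0x00400120)
j_after = j_target_field(0x00400120 + delta)

# A branch at 0x00400100 to the same label: both addresses move together.
b_before = branch_offset(0x00400100, 0x00400120)
b_after = branch_offset(0x00400100 + delta, 0x00400120 + delta)

print(hex(j_before), hex(j_after))  # the two fields differ
print(b_before, b_after)            # the two offsets are equal
```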
Resolving References (1/2)
• Linker assumes first word of first text segment is at address 0x04000000.
  – (More later when we study “virtual memory”)
• Linker knows:
  – length of each text and data segment
  – ordering of text and data segments
• Linker calculates:
  – absolute address of each label to be jumped to (internal or external) and each piece of data being referenced
Resolving References (2/2)
• To resolve references:
  – search for reference (data or label) in all “user” symbol tables
  – if not found, search library files (for example, for printf)
  – once absolute address is determined, fill in the machine code appropriately
• Output of linker: executable file containing text and data (plus header)
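The search order can be sketched as a lookup over dictionaries. The table contents borrow addresses from this lecture's prog.o / libc.o example; the function name `resolve` and the error message are assumptions made for illustration.

```python
# "User" symbol tables, one per object file supplied by the programmer.
user_tables = [
    {"main": 0x00000000, "loop": 0x00000018, "str": 0x10000430},
]
# Symbols exported by library files such as libc.
library_table = {"printf": 0x000003B0}

def resolve(name):
    """Return the absolute address of `name`, searching user tables before libraries."""
    for table in user_tables:
        if name in table:
            return table[name]
    if name in library_table:
        return library_table[name]
    raise KeyError(f"undefined reference to {name}")

print(hex(resolve("str")))
print(hex(resolve("printf")))
```

A symbol found in neither place is an error, which is the situation a real linker reports as an undefined reference.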
Where Are We Now?
Loader Basics
• Input: Executable Code (e.g., a.out for MIPS)
• Output: (program is run)
• Executable files are stored on disk
• When one is run, loader’s job is to load it into memory and start it running
• In reality, loader is the operating system (OS)
  – loading is one of the OS tasks
  – And these days, the loader actually does a lot of the linking
Loader … what does it do?
• Reads executable file’s header to determine size of text and data segments
• Creates new address space for program large enough to hold text and data segments, along with a stack segment
• Copies instructions and data from executable file into the new address space
• Copies arguments passed to the program onto the stack
• Initializes machine registers
  – Most registers cleared, but stack pointer assigned address of 1st free stack location
• Jumps to start-up routine that copies program’s arguments from stack to registers & sets the PC
  – If main routine returns, start-up routine terminates program with the exit system call
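The steps above can be modelled with a toy loader. Everything here is an assumption made for illustration: the executable is a plain Python dict rather than a real file format, the stack size is arbitrary, and memory is a flat byte array with text at address 0.

```python
def load(executable, args):
    """Toy model of the loader's steps; returns (memory, registers, pc)."""
    header = executable["header"]
    text_size, data_size = header["text_size"], header["data_size"]
    stack_size = 64  # hypothetical stack segment size in bytes

    # New address space large enough for text, data, and a stack segment.
    memory = bytearray(text_size + data_size + stack_size)

    # Copy instructions and data from the executable into the address space.
    memory[0:text_size] = executable["text"]
    memory[text_size:text_size + data_size] = executable["data"]

    # Copy the program's arguments onto the stack (top of memory, growing down).
    sp = len(memory)
    for arg in reversed(args):
        raw = arg.encode() + b"\0"
        sp -= len(raw)
        memory[sp:sp + len(raw)] = raw

    # Clear the registers, then set the stack pointer ($29).
    registers = [0] * 32
    registers[29] = sp

    pc = header["entry"]  # address of the start-up routine
    return memory, registers, pc

exe = {
    "header": {"text_size": 8, "data_size": 4, "entry": 0},
    "text": bytes(8),
    "data": b"hi\0\0",
}
memory, registers, pc = load(exe, ["prog", "x"])
print(len(memory), registers[29], pc)
```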
Clicker/Peer Instruction
At what point in process are all the machine code bits determined for the following assembly instructions:
1) addu $6, $7, $8
2) jal fprintf
A: 1) & 2) After compilation
B: 1) After compilation, 2) After assembly
C: 1) After assembly, 2) After linking
D: 1) After compilation, 2) After linking
E: 1) After compilation, 2) After loading
Example: C ⇒ Asm ⇒ Obj ⇒ Exe ⇒ Run
C Program Source Code: prog.c
#include <stdio.h>
int main (int argc, char *argv[]) {
  int i, sum = 0;
  for (i = 0; i <= 100; i++)
    sum = sum + i * i;
  printf ("The sum of sq from 0 .. 100 is %d\n", sum);
}
“printf” lives in “libc”
Compilation: MAL
  .text
  .align 2
  .globl main
main:
  subu $sp,$sp,32
  sw $ra, 20($sp)
  sd $a0, 32($sp)
  sw $0, 24($sp)
  sw $0, 28($sp)
loop:
  lw $t6, 28($sp)
  mul $t7, $t6,$t6
  lw $t8, 24($sp)
  addu $t9,$t8,$t7
  sw $t9, 24($sp)
  addu $t0, $t6, 1
  sw $t0, 28($sp)
  ble $t0,100, loop
  la $a0, str
  lw $a1, 24($sp)
  jal printf
  move $v0, $0
  lw $ra, 20($sp)
  addiu $sp,$sp,32
  jr $ra
  .data
  .align 0
str:
  .asciiz "The sum of sq from 0 .. 100 is %d\n"
Where are the 7 pseudo-instructions?
Compilation: MAL
The 7 pseudo-instructions (underlined on the slide) are the ones the assembler rewrites in the next step:
  subu $sp,$sp,32
  sd $a0, 32($sp)
  mul $t7, $t6,$t6
  addu $t0, $t6, 1
  ble $t0,100, loop
  la $a0, str
  move $v0, $0
Assembly step 1: Remove pseudoinstructions, assign addresses
00 addiu $29,$29,-32
04 sw $31,20($29)
08 sw $4, 32($29)
0c sw $5, 36($29)
10 sw $0, 24($29)
14 sw $0, 28($29)
18 lw $14, 28($29)
1c multu $14, $14
20 mflo $15
24 lw $24, 24($29)
28 addu $25,$24,$15
2c sw $25, 24($29)
30 addiu $8,$14, 1
34 sw $8,28($29)
38 slti $1,$8, 101
3c bne $1,$0, loop
40 lui $4, l.str
44 ori $4,$4,r.str
48 lw $5,24($29)
4c jal printf
50 add $2, $0, $0
54 lw $31,20($29)
58 addiu $29,$29,32
5c jr $31
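A few of these rewrites can be expressed as a small expansion function. This is a sketch covering only four pseudo-instructions, with the expansions copied from the listing above; the function name `expand` and its string-based interface are assumptions, not how a real assembler is structured.

```python
def expand(op, *args):
    """Expand a few MAL pseudo-instructions into real (TAL) instructions."""
    if op == "move":   # move rd, rs
        rd, rs = args
        return [f"add {rd}, $0, {rs}"]
    if op == "mul":    # mul rd, rs, rt
        rd, rs, rt = args
        return [f"multu {rs}, {rt}", f"mflo {rd}"]
    if op == "ble":    # ble rs, imm, label: rs <= imm is the same as rs < imm + 1
        rs, imm, label = args
        return [f"slti $1, {rs}, {int(imm) + 1}", f"bne $1, $0, {label}"]
    if op == "la":     # la rd, label: upper and lower halves filled in at link time
        rd, label = args
        return [f"lui {rd}, l.{label}", f"ori {rd}, {rd}, r.{label}"]
    return [f"{op} {', '.join(args)}"]  # already a real instruction

for line in expand("ble", "$8", "100", "loop"):
    print(line)
```

Register $1 ($at) is the assembler temporary, which is why it is free to hold the comparison result.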
Assembly step 2: Create relocation table and symbol table
• Symbol Table
  Label   address (in module)   type
  main:   0x00000000            global text
  loop:   0x00000018            local text
  str:    0x00000000            local data
• Relocation Information
  Address      Instr. type   Dependency
  0x00000040   lui           l.str
  0x00000044   ori           r.str
  0x0000004c   jal           printf
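Both tables fall out of a single scan over the instruction list. The sketch below uses a shortened, hypothetical excerpt of a text segment (so its addresses differ from the tables above), and it assumes every lui, ori, and jal operand in the excerpt is symbolic.

```python
# (label or None, instruction) pairs: a shortened, hypothetical text segment.
text = [
    ("main", "addiu $29,$29,-32"),
    ("loop", "lw $14,28($29)"),
    (None,   "bne $1,$0,loop"),
    (None,   "lui $4,l.str"),
    (None,   "ori $4,$4,r.str"),
    (None,   "jal printf"),
]

symbol_table = {}
relocation_table = []
for index, (label, instr) in enumerate(text):
    address = 4 * index  # every MIPS instruction is 4 bytes long
    if label is not None:
        symbol_table[label] = address
    op, _, operands = instr.partition(" ")
    dependency = operands.split(",")[-1]
    # Absolute references need the linker: record where they are and what they need.
    if op in ("lui", "ori", "jal"):
        relocation_table.append((address, op, dependency))

print(symbol_table)
for entry in relocation_table:
    print(entry)
```

The bne is deliberately left out of the relocation table: its target is PC-relative and local, so the assembler can resolve it without the linker.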
Assembly step 3: Resolve local PC-relative labels
00 addiu $29,$29,-32
04 sw $31,20($29)
08 sw $4, 32($29)
0c sw $5, 36($29)
10 sw $0, 24($29)
14 sw $0, 28($29)
18 lw $14, 28($29)
1c multu $14, $14
20 mflo $15
24 lw $24, 24($29)
28 addu $25,$24,$15
2c sw $25, 24($29)
30 addiu $8,$14, 1
34 sw $8,28($29)
38 slti $1,$8, 101
3c bne $1,$0, -10
40 lui $4, l.str
44 ori $4,$4,r.str
48 lw $5,24($29)
4c jal printf
50 add $2, $0, $0
54 lw $31,20($29)
58 addiu $29,$29,32
5c jr $31
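The -10 in the bne comes from the symbol table: the offset is counted in words, starting from the instruction after the branch. A short check, using the addresses from this listing:

```python
symbols = {"loop": 0x18}  # from the symbol table
bne_addr = 0x3C           # address of the bne in the listing

# Word distance from the instruction after the branch (PC + 4) to the label.
offset = (symbols["loop"] - (bne_addr + 4)) // 4
field = offset & 0xFFFF   # 16-bit two's complement immediate

print(offset)             # -10
print(f"{field:016b}")    # 1111111111110110
```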
Assembly step 4
• Generate object (.o) file:
  – Output binary representation for
    • text segment (instructions)
    • data segment (data)
    • symbol and relocation tables
  – Using dummy “placeholders” for unresolved absolute and external references
Text segment in object file
0x000000 00100111101111011111111111100000
0x000004 10101111101111110000000000010100
0x000008 10101111101001000000000000100000
0x00000c 10101111101001010000000000100100
0x000010 10101111101000000000000000011000
0x000014 10101111101000000000000000011100
0x000018 10001111101011100000000000011100
0x00001c 10001111101110000000000000011000
0x000020 00000001110011100000000000011001
0x000024 00100101110010000000000000000001
0x000028 00101001000000010000000001100101
0x00002c 10101111101010000000000000011100
0x000030 00000000000000000111100000010010
0x000034 00000011000011111100100000100001
0x000038 00010100001000001111111111110111
0x00003c 10101111101110010000000000011000
0x000040 00111100000001000000000000000000
0x000044 10001111101001010000000000000000
0x000048 00001100000100000000000011101100
0x00004c 00100100000000000000000000000000
0x000050 10001111101111110000000000010100
0x000054 00100111101111010000000000100000
0x000058 00000011111000000000000000001000
0x00005c 00000000000000000001000000100001
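Each word is an instruction's fields packed together. As a check, the first word can be rebuilt from addiu $29,$29,-32 with the standard I-format layout (6-bit opcode, 5-bit rs, 5-bit rt, 16-bit immediate); the helper name `encode_i` is an assumption made for this sketch.

```python
def encode_i(opcode, rs, rt, imm):
    """Pack an I-format instruction: opcode | rs | rt | 16-bit immediate."""
    return (opcode << 26) | (rs << 21) | (rt << 16) | (imm & 0xFFFF)

ADDIU = 0b001001  # MIPS opcode for addiu

word = encode_i(ADDIU, rs=29, rt=29, imm=-32)  # addiu $29,$29,-32
print(f"{word:032b}")  # 00100111101111011111111111100000
```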
Link step 1: combine prog.o, libc.o
• Merge text/data segments
• Create absolute memory addresses
• Modify & merge symbol and relocation tables
• Symbol Table
  Label     Address
  main:     0x00000000
  loop:     0x00000018
  str:      0x10000430
  printf:   0x000003b0
  …
• Relocation Information
  Address      Instr. Type   Dependency
  0x00000040   lui           l.str
  0x00000044   ori           r.str
  0x0000004c   jal           printf
  …
Link step 2:
• Edit Addresses in relocation table
• (shown in TAL for clarity, but done in binary)
00 addiu $29,$29,-32
04 sw $31,20($29)
08 sw $4, 32($29)
0c sw $5, 36($29)
10 sw $0, 24($29)
14 sw $0, 28($29)
18 lw $14, 28($29)
1c multu $14, $14
20 mflo $15
24 lw $24, 24($29)
28 addu $25,$24,$15
2c sw $25, 24($29)
30 addiu $8,$14, 1
34 sw $8,28($29)
38 slti $1,$8, 101
3c bne $1,$0, -10
40 lui $4, 4096
44 ori $4,$4,1072
48 lw $5,24($29)
4c jal 812
50 add $2, $0, $0
54 lw $31,20($29)
58 addiu $29,$29,32
5c jr $31
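The lui and ori immediates are the two 16-bit halves of the address of str from the merged symbol table. A short check of that arithmetic:

```python
str_addr = 0x10000430      # address of str after linking

upper = str_addr >> 16     # immediate for lui (l.str)
lower = str_addr & 0xFFFF  # immediate for ori (r.str)

print(upper, lower)        # 4096 1072

# lui places `upper` in the top half of $4; ori then fills in the bottom half.
rebuilt = (upper << 16) | lower
print(hex(rebuilt))        # 0x10000430
```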
Link step 3:
• Output executable of merged modules
  – Single text (instruction) segment
  – Single data segment
  – Header detailing size of each segment
• NOTE:
  – The preceding example was a much simplified version of how ELF and other standard formats work, meant only to demonstrate the basic principles.
Static vs Dynamically linked libraries
• What we’ve described is the traditional way: statically-linked approach
  – The library is now part of the executable, so if the library updates, we don’t get the fix (have to recompile if we have source)
  – It includes the entire library even if not all of it will be used
  – Executable is self-contained
• An alternative is dynamically linked libraries (DLL), common on Windows & UNIX platforms
Dynamically linked libraries
• Space/time issues
  + Storing a program requires less disk space
  + Sending a program requires less time
  + Executing two programs requires less memory (if they share a library)
  – At runtime, there’s time overhead to do link
• Upgrades
  + Replacing one file (libXYZ.so) upgrades every program that uses library “XYZ”
  – Having the executable isn’t enough anymore
Overall, dynamic linking adds quite a bit of complexity to the compiler, linker, and operating system. However, it provides many benefits that often outweigh these.
en.wikipedia.org/wiki/Dynamic_linking
Dynamically linked libraries
• The prevailing approach to dynamic linking uses machine code as the “lowest common denominator”
  – The linker does not use information about how the program or library was compiled (i.e., what compiler or language)
  – This can be described as “linking at the machine code level”
  – This isn’t the only way to do it...
• Also these days will randomize layout (Address Space Layout Randomization)
  – Acts as a defense to make exploiting C memory errors substantially harder, as modern exploitation requires jumping to pieces of existing code (“Return oriented programming”) to counter another defense (marking heap & stack unexecutable, so attacker can’t write code into just anywhere in memory).
Update Your Linux Systems!!!
• The GNU glibc has a catastrophically bad bug
  – A stack overflow in getaddrinfo()
    • Function that turns "DNS name" into "IP address"
    • CVE-2015-7547
      – "Common Vulnerabilities and Exposures"
• If "bad guy" can make your program look up a name of their choosing…
  – And their bad name has a particularly long reply...
• With static linking, there would be a need to recompile and update hundreds of different programs
• With dynamic linking, "just" need to update the operating system
In Conclusion…
• Compiler converts a single HLL file into a single assembly language file.
• Assembler removes pseudo-instructions, converts what it can to machine language, and creates a checklist for the linker (relocation table). A .s file becomes a .o file.
  – Does 2 passes to resolve addresses, handling internal forward references
• Linker combines several .o files and resolves absolute addresses.
  – Enables separate compilation, libraries that need not be compiled, and resolves remaining addresses
• Loader loads executable into memory and begins execution.