CS 61C: Great Ideas in Computer Architecture CALL ...cs61c/sp16/lec/12/... · – Acts as a defense...

Post on 20-Jun-2020


CS 61C: Great Ideas in Computer Architecture

CALL continued (Linking and Loading)

Instructors: Nicholas Weaver & Vladimir Stojanovic

http://inst.eecs.Berkeley.edu/~cs61c/sp16

Where Are We Now?

Linker (1/3)

• Input: Object code files, information tables (e.g., foo.o, libc.o for MIPS)
• Output: Executable code (e.g., a.out for MIPS)
• Combines several object (.o) files into a single executable ("linking")
• Enable separate compilation of files
– Changes to one file do not require recompilation of the whole program
  • Windows 7 source was > 40 M lines of code!
– Old name "Link Editor" from editing the "links" in jump and link instructions

Linker (2/3)

[Figure: .o file 1 (text 1, data 1, info 1) and .o file 2 (text 2, data 2, info 2) feed into the Linker, which produces a.out containing relocated text 1, relocated text 2, relocated data 1, relocated data 2.]

Linker (3/3)

• Step 1: Take text segment from each .o file and put them together
• Step 2: Take data segment from each .o file, put them together, and concatenate this onto end of text segments
• Step 3: Resolve references
– Go through Relocation Table; handle each entry
– That is, fill in all absolute addresses
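Steps 1 and 2 above can be sketched as a small layout calculation. This is a toy model rather than how a real linker is organized: the `ObjectFile` class, its field names, the base address of 0, and the segment sizes are all assumptions made for illustration.

```python
from dataclasses import dataclass

@dataclass
class ObjectFile:
    # Hypothetical record: only the sizes matter for layout.
    name: str
    text_size: int  # bytes of instructions
    data_size: int  # bytes of static data

def layout(objs, base=0):
    """Place every text segment first, then every data segment.

    Returns {(file name, segment): start address}.
    """
    addr = base
    placement = {}
    for o in objs:                      # Step 1: text segments together
        placement[(o.name, "text")] = addr
        addr += o.text_size
    for o in objs:                      # Step 2: data segments after all text
        placement[(o.name, "data")] = addr
        addr += o.data_size
    return placement

# Made-up sizes, chosen only to make the arithmetic easy to follow.
objs = [ObjectFile("prog.o", 0x60, 0x24), ObjectFile("libc.o", 0x400, 0x100)]
placement = layout(objs)
```

Once every segment has a start address, Step 3 amounts to adding that start address to each label's offset within its module and writing the result into the instructions listed in the relocation table.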

Four Types of Addresses

• PC-Relative Addressing (beq, bne) – never relocate
• Absolute Function Address (j, jal) – always relocate
• External Function Reference (usually jal) – always relocate
• Static Data Reference (often lui and ori) – always relocate

Absolute Addresses in MIPS

• Which instructions need relocation editing?
– J-format: jump, jump and link
    j/jal xxxxx
– Loads and stores to variables in static area, relative to global pointer
    lw/sw $gp $x address
– What about conditional branches?
    beq/bne $rs $rt address
– PC-relative addressing preserved even if code moves
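As a rough illustration of what "relocation editing" means for a J-format instruction, the sketch below fills in the 26-bit target field of a `jal`. The helper name `patch_jump` and the target address `0x004003B0` are assumptions chosen for the example, not values taken from the slides.

```python
def patch_jump(instr, target_addr):
    """Return a j/jal word with its 26-bit target field filled in.

    J-format stores the word address (byte address >> 2) in the low 26 bits;
    the top 6 bits hold the opcode and are left untouched.
    """
    opcode_bits = instr & 0xFC000000
    target_field = (target_addr >> 2) & 0x03FFFFFF
    return opcode_bits | target_field

JAL_PLACEHOLDER = 0x0C000000   # jal (opcode 3) with an all-zero target field
patched = patch_jump(JAL_PLACEHOLDER, 0x004003B0)  # assumed final address
```

A `beq` or `bne` needs no such patch, because its immediate is a distance from the branch itself and that distance does not change when the whole text segment moves.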

Resolving References (1/2)

• Linker assumes first word of first text segment is at address 0x04000000.
– (More later when we study "virtual memory")
• Linker knows:
– length of each text and data segment
– ordering of text and data segments
• Linker calculates:
– absolute address of each label to be jumped to (internal or external) and each piece of data being referenced

Resolving References (2/2)

• To resolve references:
– search for reference (data or label) in all "user" symbol tables
– if not found, search library files (for example, for printf)
– once absolute address is determined, fill in the machine code appropriately
• Output of linker: executable file containing text and data (plus header)
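The search order described above can be sketched as a two-stage lookup. The function name `resolve` and the use of plain dictionaries as symbol tables are assumptions for illustration; the addresses are the ones that appear in the worked example later in this lecture.

```python
def resolve(symbol, user_tables, library_tables):
    """Look a symbol up in the user symbol tables first, then in libraries."""
    for table in user_tables:
        if symbol in table:
            return table[symbol]
    for table in library_tables:
        if symbol in table:
            return table[symbol]
    # Neither the program nor any library defines the symbol.
    raise KeyError(f"undefined reference to {symbol!r}")

user_tables = [{"main": 0x00000000, "loop": 0x00000018, "str": 0x10000430}]
library_tables = [{"printf": 0x000003B0}]

printf_addr = resolve("printf", user_tables, library_tables)
```

Searching user tables before libraries means a definition in the program's own object files takes precedence in this sketch.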

Where Are We Now?

Loader Basics

• Input: Executable code (e.g., a.out for MIPS)
• Output: (program is run)
• Executable files are stored on disk
• When one is run, loader's job is to load it into memory and start it running
• In reality, loader is the operating system (OS)
– loading is one of the OS tasks
– And these days, the loader actually does a lot of the linking

Loader ... what does it do?

• Reads executable file's header to determine size of text and data segments
• Creates new address space for program large enough to hold text and data segments, along with a stack segment
• Copies instructions and data from executable file into the new address space
• Copies arguments passed to the program onto the stack
• Initializes machine registers
– Most registers cleared, but stack pointer assigned address of 1st free stack location
• Jumps to start-up routine that copies program's arguments from stack to registers & sets the PC
– If main routine returns, start-up routine terminates program with the exit system call

Clicker/Peer Instruction

At what point in process are all the machine code bits determined for the following assembly instructions:
1) addu $6, $7, $8
2) jal fprintf

A: 1) & 2) After compilation
B: 1) After compilation, 2) After assembly
C: 1) After assembly, 2) After linking
D: 1) After compilation, 2) After linking
E: 1) After compilation, 2) After loading

Example: C ⇒ Asm ⇒ Obj ⇒ Exe ⇒ Run

C Program Source Code: prog.c

#include <stdio.h>
int main (int argc, char *argv[]) {
  int i, sum = 0;
  for (i = 0; i <= 100; i++)
    sum = sum + i * i;
  printf ("The sum of sq from 0 .. 100 is %d\n", sum);
}

"printf" lives in "libc"

Compilation: MAL

.text
.align 2
.globl main
main:
  subu $sp,$sp,32
  sw $ra,20($sp)
  sd $a0,32($sp)
  sw $0,24($sp)
  sw $0,28($sp)
loop:
  lw $t6,28($sp)
  mul $t7,$t6,$t6
  lw $t8,24($sp)
  addu $t9,$t8,$t7
  sw $t9,24($sp)
  addu $t0,$t6,1
  sw $t0,28($sp)
  ble $t0,100,loop
  la $a0,str
  lw $a1,24($sp)
  jal printf
  move $v0,$0
  lw $ra,20($sp)
  addiu $sp,$sp,32
  jr $ra
.data
.align 0
str:
  .asciiz "The sum of sq from 0 .. 100 is %d\n"

Where are 7 pseudo-instructions?

Compilation: MAL (7 pseudo-instructions)

The same listing is shown again with the 7 pseudo-instructions underlined. The underlining does not survive in this transcript; judging by the expansions in Assembly step 1, the seven are:

  subu $sp,$sp,32
  sd $a0,32($sp)
  mul $t7,$t6,$t6
  addu $t0,$t6,1
  ble $t0,100,loop
  la $a0,str
  move $v0,$0

Assembly step 1:

Remove pseudoinstructions, assign addresses

00 addiu $29,$29,-32
04 sw $31,20($29)
08 sw $4,32($29)
0c sw $5,36($29)
10 sw $0,24($29)
14 sw $0,28($29)
18 lw $14,28($29)
1c multu $14,$14
20 mflo $15
24 lw $24,24($29)
28 addu $25,$24,$15
2c sw $25,24($29)
30 addiu $8,$14,1
34 sw $8,28($29)
38 slti $1,$8,101
3c bne $1,$0,loop
40 lui $4,l.str
44 ori $4,$4,r.str
48 lw $5,24($29)
4c jal printf
50 add $2,$0,$0
54 lw $31,20($29)
58 addiu $29,$29,32
5c jr $31

Assembly step 2

Create relocation table and symbol table

• Symbol Table
  Label   Address (in module)   Type
  main:   0x00000000            global text
  loop:   0x00000018            local text
  str:    0x00000000            local data

• Relocation Information
  Address      Instr. type   Dependency
  0x00000040   lui           l.str
  0x00000044   ori           r.str
  0x0000004c   jal           printf

Assembly step 3

Resolve local PC-relative labels

00 addiu $29,$29,-32
04 sw $31,20($29)
08 sw $4,32($29)
0c sw $5,36($29)
10 sw $0,24($29)
14 sw $0,28($29)
18 lw $14,28($29)
1c multu $14,$14
20 mflo $15
24 lw $24,24($29)
28 addu $25,$24,$15
2c sw $25,24($29)
30 addiu $8,$14,1
34 sw $8,28($29)
38 slti $1,$8,101
3c bne $1,$0,-10
40 lui $4,l.str
44 ori $4,$4,r.str
48 lw $5,24($29)
4c jal printf
50 add $2,$0,$0
54 lw $31,20($29)
58 addiu $29,$29,32
5c jr $31
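The -10 in the `bne` at address 0x3c follows from the MIPS branch rule that the offset is counted in words from the instruction after the branch. A minimal sketch of that arithmetic, with `branch_offset` being a name invented for this example:

```python
def branch_offset(branch_addr, target_addr):
    """Word offset stored in a beq/bne: relative to the instruction at PC + 4."""
    return (target_addr - (branch_addr + 4)) // 4

# bne sits at 0x3c, the label loop is at 0x18 (see the symbol table).
offset = branch_offset(0x3C, 0x18)

# The 16-bit two's-complement pattern that goes into the immediate field.
imm16 = offset & 0xFFFF
```

Both addresses are offsets within the same text segment, so the difference stays the same wherever the linker later places the segment. That is why this reference can be resolved by the assembler and never appears in the relocation table.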

Assembly step 4

• Generate object (.o) file:
– Output binary representation for
  • text segment (instructions)
  • data segment (data)
  • symbol and relocation tables
– Using dummy "placeholders" for unresolved absolute and external references

Text segment in object file

0x000000 00100111101111011111111111100000
0x000004 10101111101111110000000000010100
0x000008 10101111101001000000000000100000
0x00000c 10101111101001010000000000100100
0x000010 10101111101000000000000000011000
0x000014 10101111101000000000000000011100
0x000018 10001111101011100000000000011100
0x00001c 10001111101110000000000000011000
0x000020 00000001110011100000000000011001
0x000024 00100101110010000000000000000001
0x000028 00101001000000010000000001100101
0x00002c 10101111101010000000000000011100
0x000030 00000000000000000111100000010010
0x000034 00000011000011111100100000100001
0x000038 00010100001000001111111111110111
0x00003c 10101111101110010000000000011000
0x000040 00111100000001000000000000000000
0x000044 10001111101001010000000000000000
0x000048 00001100000100000000000011101100
0x00004c 00100100000000000000000000000000
0x000050 10001111101111110000000000010100
0x000054 00100111101111010000000000100000
0x000058 00000011111000000000000000001000
0x00005c 00000000000000000001000000100001
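To read one of these words back, the 32 bits can be split into the standard MIPS I-format fields (6-bit opcode, 5-bit rs, 5-bit rt, 16-bit signed immediate). The sketch below decodes the first word of the dump; `decode_i_type` is a helper name made up for this example.

```python
def decode_i_type(word):
    """Split a 32-bit MIPS I-format word into (opcode, rs, rt, signed imm)."""
    opcode = (word >> 26) & 0x3F
    rs = (word >> 21) & 0x1F
    rt = (word >> 16) & 0x1F
    imm = word & 0xFFFF
    if imm & 0x8000:          # sign-extend the 16-bit immediate
        imm -= 0x10000
    return opcode, rs, rt, imm

# First word of the text segment, at 0x000000.
word = int("00100111101111011111111111100000", 2)
fields = decode_i_type(word)
```

Opcode 9 is `addiu`, so these fields correspond to `addiu $29,$29,-32`, the first instruction of the listing.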

Link step 1: combine prog.o, libc.o

• Merge text/data segments
• Create absolute memory addresses
• Modify & merge symbol and relocation tables
• Symbol Table
  Label     Address
  main:     0x00000000
  loop:     0x00000018
  str:      0x10000430
  printf:   0x000003b0
  …
• Relocation Information
  Address      Instr. Type   Dependency
  0x00000040   lui           l.str
  0x00000044   ori           r.str
  0x0000004c   jal           printf
  …

Link step 2:

• Edit addresses in relocation table
• (shown in TAL for clarity, but done in binary)

00 addiu $29,$29,-32
04 sw $31,20($29)
08 sw $4,32($29)
0c sw $5,36($29)
10 sw $0,24($29)
14 sw $0,28($29)
18 lw $14,28($29)
1c multu $14,$14
20 mflo $15
24 lw $24,24($29)
28 addu $25,$24,$15
2c sw $25,24($29)
30 addiu $8,$14,1
34 sw $8,28($29)
38 slti $1,$8,101
3c bne $1,$0,-10
40 lui $4,4096
44 ori $4,$4,1072
48 lw $5,24($29)
4c jal 812
50 add $2,$0,$0
54 lw $31,20($29)
58 addiu $29,$29,32
5c jr $31
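The immediates 4096 and 1072 in the `lui`/`ori` pair are the upper and lower 16 bits of the address of `str` from the merged symbol table (0x10000430). A short sketch of that split, where `split_address` is a name chosen for this example:

```python
def split_address(addr):
    """Split a 32-bit address into the halves used by a lui/ori pair."""
    upper = (addr >> 16) & 0xFFFF   # l.str: loaded by lui into bits 31..16
    lower = addr & 0xFFFF           # r.str: OR-ed into bits 15..0 by ori
    return upper, lower

upper, lower = split_address(0x10000430)   # address of str after linking

# lui followed by ori rebuilds the full address.
rebuilt = (upper << 16) | lower
```

Because `ori` zero-extends its immediate, the two halves can be taken directly from the address with no adjustment to the upper half.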

Link step 3:

• Output executable of merged modules
– Single text (instruction) segment
– Single data segment
– Header detailing size of each segment
• NOTE:
– The preceding example was a much simplified version of how ELF and other standard formats work, meant only to demonstrate the basic principles.

Static vs Dynamically linked libraries

• What we've described is the traditional way: statically-linked approach
– The library is now part of the executable, so if the library updates, we don't get the fix (have to recompile if we have source)
– It includes the entire library even if not all of it will be used
– Executable is self-contained
• An alternative is dynamically linked libraries (DLL), common on Windows & UNIX platforms

Dynamically linked libraries

• Space/time issues
+ Storing a program requires less disk space
+ Sending a program requires less time
+ Executing two programs requires less memory (if they share a library)
– At runtime, there's time overhead to do link
• Upgrades
+ Replacing one file (libXYZ.so) upgrades every program that uses library "XYZ"
– Having the executable isn't enough anymore

Overall, dynamic linking adds quite a bit of complexity to the compiler, linker, and operating system. However, it provides many benefits that often outweigh these.

en.wikipedia.org/wiki/Dynamic_linking

Dynamically linked libraries

• The prevailing approach to dynamic linking uses machine code as the "lowest common denominator"
– The linker does not use information about how the program or library was compiled (i.e., what compiler or language)
– This can be described as "linking at the machine code level"
– This isn't the only way to do it...
• Also these days will randomize layout (Address Space Layout Randomization)
– Acts as a defense to make exploiting C memory errors substantially harder, as modern exploitation requires jumping to pieces of existing code ("Return oriented programming") to counter another defense (marking heap & stack unexecutable, so attacker can't write code into just anywhere in memory).

Update Your Linux Systems!!!

• The GNU glibc has a catastrophically bad bug
– A stack overflow in getaddrinfo()
  • Function that turns "DNS name" into "IP address"
  • CVE-2015-7547
    – "Common Vulnerabilities and Exposures"
• If "bad guy" can make your program look up a name of their choosing...
– And their bad name has a particularly long reply...
• With static linking, there would be a need to recompile and update hundreds of different programs
• With dynamic linking, "just" need to update the operating system

In Conclusion...

• Compiler converts a single HLL file into a single assembly language file.
• Assembler removes pseudo-instructions, converts what it can to machine language, and creates a checklist for the linker (relocation table). A .s file becomes a .o file.
– Does 2 passes to resolve addresses, handling internal forward references
• Linker combines several .o files and resolves absolute addresses.
– Enables separate compilation, libraries that need not be compiled, and resolves remaining addresses
• Loader loads executable into memory and begins execution.