CS 61C: Great Ideas in Computer Architecture
C Arrays, Strings and Memory Management
Instructors: Nicholas Weaver & Vladimir Stojanovic
http://inst.eecs.Berkeley.edu/~cs61c/sp16


Reminder! C Arrays are Very Primitive

• An array in C does not know its own length, and its bounds are not checked!
  – Consequence: We can accidentally access off the end of an array
  – Consequence: We must pass the array and its size to any procedure that is going to manipulate it
• Segmentation faults and bus errors:
  – These are VERY difficult to find; be careful! (You’ll learn how to debug these in lab)
  – But also “fun” to exploit:
    • “Stack overflow exploit”, maliciously write off the end of an array on the stack
    • “Heap overflow exploit”, maliciously write off the end of an array on the heap
• If you write programs in C, you will write code that has array-bounds errors!


Use Defined Constants

• Array size n; want to access from 0 to n-1, so you should use counter AND utilize a variable for declaration & incrementation
  – Bad pattern

    int i, ar[10];
    for(i = 0; i < 10; i++){ ... }

  – Better pattern

    const int ARRAY_SIZE = 10;
    int i, a[ARRAY_SIZE];
    for(i = 0; i < ARRAY_SIZE; i++){ ... }

• SINGLE SOURCE OF TRUTH
  – You’re utilizing indirection and avoiding maintaining two copies of the number 10
  – DRY: “Don’t Repeat Yourself”
  – And don’t forget the < rather than <=: When Nick took 60c, he lost a day to a “segfault in a malloc called by printf on large inputs”: Had a <= rather than a < in a single array initialization!
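
The single-source-of-truth pattern can be sketched as below. This is only an illustration: the helper name fill_and_sum is made up, and it uses an enum constant rather than const int because an enum constant is a true compile-time constant in C, so the array is an ordinary fixed-size array.

```c
#include <assert.h>

/* Single source of truth for the array length. */
enum { ARRAY_SIZE = 10 };

/* Hypothetical helper: stores 0..ARRAY_SIZE-1 in an array and returns the sum. */
int fill_and_sum(void)
{
    int a[ARRAY_SIZE];
    int sum = 0;

    /* '<' (not '<=') keeps i within 0..ARRAY_SIZE-1. */
    for (int i = 0; i < ARRAY_SIZE; i++) {
        a[i] = i;
    }
    for (int i = 0; i < ARRAY_SIZE; i++) {
        sum += a[i];
    }

    /* Sanity check: 0 + 1 + ... + (n-1) == n(n-1)/2 */
    assert(sum == ARRAY_SIZE * (ARRAY_SIZE - 1) / 2);
    return sum;
}
```

Changing ARRAY_SIZE in one place resizes the array and both loops together.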


When Arrays Go Bad: Heartbleed

• In TLS encryption, messages have a length…
  – And get copied into memory before being processed
• One message was “Echo Me back the following data, it’s this long...”
  – But the (different) echo length wasn’t checked to make sure it wasn’t too big...
• So you send a small request that says “read back a lot of data”
  – And thus get web requests with auth cookies and other bits of data from random bits of memory…

[Figure: memory contents around the request: M 5 HB L=5000 107:Oul7;GET / HTTP/1.1\r\nHost: www.mydomain.com\r\nCookie: login=117kf9012oeu\r\nUser-Agent: Mozilla….]
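
The missing check can be sketched roughly as follows. This is not OpenSSL's code: echo_payload and its parameters are hypothetical names chosen for illustration, and the only point is that the length claimed by the sender is compared against the bytes actually received before anything is copied.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical echo handler. Copies the payload into out only if the
   claimed length fits both the bytes actually received and the output
   buffer. Returns the number of bytes copied, or -1 on a bad length. */
long echo_payload(const unsigned char *payload, size_t received_len,
                  size_t claimed_len, unsigned char *out, size_t out_cap)
{
    assert(payload != NULL && out != NULL);

    /* The check Heartbleed lacked: refuse to read past the real data. */
    if (claimed_len > received_len || claimed_len > out_cap) {
        return -1;
    }
    memcpy(out, payload, claimed_len);
    return (long)claimed_len;
}
```

With this check, a 5-byte request that claims to be 5000 bytes long is rejected instead of echoing back neighboring memory.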


Pointing to Different Size Objects

• Modern machines are “byte-addressable”
  – Hardware’s memory composed of 8-bit storage cells, each has a unique address
• Type declaration tells compiler how many bytes to fetch on each access through pointer
  – E.g., 32-bit integer stored in 4 consecutive 8-bit bytes
• But we actually want “Byte alignment”
  – Some processors will not allow you to address 32b values without being on 4 byte boundaries
  – Others will just be very slow if you try to access “unaligned” memory.

[Figure: byte addresses 42 through 59, with int *x pointing to a 32-bit integer stored in four bytes, short *y pointing to a 16-bit short stored in two bytes, and char *z pointing to an 8-bit character stored in one byte]


sizeof() operator

• sizeof(type) returns number of bytes in object
  – But number of bits in a byte is not standardized
    • In olden times, when dragons roamed the earth, bytes could be 5, 6, 7, 9 bits long
• By definition, sizeof(char)==1
  – C does not play well with Unicode (unlike Python), so no char c = '💩'
• Can take sizeof(arg), or sizeof(structtype)
  – Structure types get padded to ensure structures are also aligned
• We’ll see more of sizeof when we look at dynamic memory management
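
Structure padding can be observed with a small sketch like the one below. The struct Mixed and both helper names are invented for illustration; the exact sizes depend on the compiler and target, so the code only reports them rather than assuming particular values.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical struct: a char followed by an int usually forces padding
   so that value lands on an int-aligned address. */
struct Mixed {
    char tag;
    int  value;
};

/* Returns how many padding bytes the compiler added to struct Mixed. */
size_t mixed_padding(void)
{
    size_t members = sizeof(char) + sizeof(int);
    assert(sizeof(struct Mixed) >= members);
    return sizeof(struct Mixed) - members;
}

/* Byte offset of value inside the struct, as placed by the compiler. */
size_t value_offset(void)
{
    return offsetof(struct Mixed, value);
}
```

On many machines with 4-byte int, one would expect 3 padding bytes and an offset of 4, but that is a property of the platform, not of the C language.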


Pointer Arithmetic

pointer + number    pointer - number
e.g., pointer + 1 adds 1 something to a pointer

    char *p;
    char a;
    char b;

    p = &a;
    p += 1;

Adds 1*sizeof(char) to the memory address

    int *p;
    int a;
    int b;

    p = &a;
    p += 1;

Adds 1*sizeof(int) to the memory address

In each, p now points to b (Assuming compiler doesn’t reorder variables in memory. Never code like this!!!!)

Pointer arithmetic should be used cautiously (and by Nick’s standard, “cautious” == Almost NEVER!)
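
The scaling by sizeof can be checked with a sketch like this one. It deliberately uses two elements of one array instead of separate variables a and b, since only array elements are guaranteed to be adjacent; the function names are made up for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Returns how many bytes p + 1 lies beyond p for an int pointer. */
size_t int_stride(void)
{
    int cells[2] = { 0, 0 };
    int *p = &cells[0];
    int *q = p + 1;                 /* now points to cells[1] */
    assert(q == &cells[1]);
    /* Viewing both addresses as char * measures the gap in bytes. */
    return (size_t)((char *)q - (char *)p);
}

/* Same measurement for a char pointer. */
size_t char_stride(void)
{
    char cells[2] = { 0, 0 };
    char *p = &cells[0];
    char *q = p + 1;
    assert(q == &cells[1]);
    return (size_t)(q - p);
}
```

int_stride() returns sizeof(int) and char_stride() returns 1, matching the two annotations above.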


Arrays and Pointers

• Array ≈ pointer to the initial (0th) array element

    a[i] ≡ *(a+i)

• An array is passed to a function as a pointer
  – The array size is lost!
• Usually bad style to interchange arrays and pointers
  – Avoid pointer arithmetic!

Passing arrays:

    int foo(int array[],          /* Really int *array */
            unsigned int size)    /* Must explicitly pass the size */
    {
        … array[size - 1] …
    }

    int main(void)
    {
        int a[10], b[5];
        … foo(a, 10) … foo(b, 5) …
    }


Arrays and Pointers

    int foo(int array[],
            unsigned int size)
    {
        …
        printf("%zu\n", sizeof(array));   /* What does this print? 4 */
    }

    int main(void)
    {
        int a[10], b[5];
        … foo(a, 10) … foo(b, 5) …
        printf("%zu\n", sizeof(a));       /* What does this print? 40 */
    }

... because array is really a pointer (and a pointer’s size is architecture dependent, but likely to be 4 or 8 on modern 32-64 bit machines!)
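
A compilable sketch of the same effect follows. COUNT_OF, size_seen_by_callee, and decay_demo are hypothetical names; the parameter is written as int * because that is what int array[] means in a parameter list, and it avoids compiler warnings about sizeof on an array parameter.

```c
#include <assert.h>
#include <stddef.h>

/* Element count of a real array; only valid where the array type is visible. */
#define COUNT_OF(arr) (sizeof(arr) / sizeof((arr)[0]))

/* Inside the callee, sizeof sees only the pointer, not the array. */
size_t size_seen_by_callee(int *array)
{
    return sizeof(array);
}

/* Returns 1 if the caller sees the whole array and the callee only a pointer. */
int decay_demo(void)
{
    int a[10];
    a[0] = 0;
    size_t in_caller = sizeof(a);                 /* 10 * sizeof(int) */
    size_t in_callee = size_seen_by_callee(a);    /* sizeof(int *) */
    assert(COUNT_OF(a) == 10);
    return in_caller == 10 * sizeof(int) && in_callee == sizeof(int *);
}
```

The COUNT_OF idiom works only in the scope where the array was declared, which is why the size still has to be passed to functions explicitly.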


Arrays and Pointers

    int i;
    int array[10];

    for (i = 0; i < 10; i++)
    {
        array[i] = …;
    }

    int *p;
    int array[10];

    for (p = array; p < &array[10]; p++)
    {
        *p = …;
    }

These code sequences have the same effect!

But the former is much more readable.


Clickers/Peer Instruction Time

    int x[] = { 2, 4, 6, 8, 10 };
    int *p = x;
    int **pp = &p;
    (*pp)++;
    (*(*pp))++;
    printf("%d\n", *p);

Result is:
A: 2
B: 3
C: 4
D: 5
E: None of the above


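Clickers/Peer Instruction Time

    int x[] = { 2, 4, 6, 8, 10 };
    int *p = x;            /* P points to the start of X (2) */
    int **pp = &p;         /* PP points to P */
    (*pp)++;               /* Increments P to point to 2nd element (4) */
    (*(*pp))++;            /* Increments 2nd element by 1 (5) */
    printf("%d\n", *p);

Result is:
A: 2
B: 3
C: 4
D: 5
E: None of the above

Answer: D. P ends up pointing at the 2nd element, which has been incremented from 4 to 5.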


Administrivia

• hw0 - edX is due on Sunday 1/31
• hw0 mini bio is due in lab next week, turn in to your lab TA
• MT1 will be Thursday, 2/25 from 6-8 PM
• MT2 will be Monday, 4/4 from 7-9 PM
  – Email William and Fred if you have conflicts (except for 16B, that is currently being resolved)


61C Instructor in the News…

• Nick presented at the Enigma conference:
  – “The Golden Age of Bulk Surveillance”
  – https://www.youtube.com/watch?v=zqnKdGnzoh0


Concise strlen()

    int strlen(char *s)
    {
        char *p = s;
        while (*p++)
            ; /* Null body of while */
        return (p - s - 1);
    }

What happens if there is no zero character at end of string?
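
Without a terminating zero, the loop above keeps reading past the end of the buffer, which is undefined behavior. One common defense is to bound the scan. The sketch below is a hypothetical bounded_strlen, similar in spirit to POSIX strnlen, and assumes the caller knows an upper limit on the buffer size.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical bounded length: stops at the terminator or after max_len
   bytes, whichever comes first. */
size_t bounded_strlen(const char *s, size_t max_len)
{
    assert(s != NULL);
    size_t n = 0;
    while (n < max_len && s[n] != '\0') {
        n++;
    }
    return n;
}
```

A return value equal to max_len signals that no terminator was found within the limit.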


Point past end of array?

• Array size n; want to access from 0 to n-1, but test for exit by comparing to address one element past the array

    int ar[10], *p, *q, sum = 0;
    ...
    p = &ar[0]; q = &ar[10];
    while (p != q)
        /* sum = sum + *p; p = p + 1; */
        sum += *p++;

  – Is this legal?
• C defines that one element past end of array must be a valid address, i.e., not cause an error
  – BUT DO NOT DO THIS: This is ONLY valid for arrays declared in this manner, NOT arrays declared using dynamic allocation (malloc)!


Valid Pointer Arithmetic

• Add an integer to a pointer
• Subtract an integer from a pointer
• Subtract 2 pointers (in the same array)
• Compare pointers (<, <=, ==, !=, >, >=)
• Compare pointer to NULL (indicates that the pointer points to nothing)

Everything else is illegal since it makes no sense:
• adding two pointers
• multiplying pointers
• subtracting a pointer from an integer
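
Three of the valid operations appear together in the sketch below: pointer plus integer, pointer comparison, and pointer minus pointer. The function index_of is a made-up example, not something from the course, and ordinary indexing would usually be the more readable choice.

```c
#include <assert.h>
#include <stddef.h>

/* Returns the index of the first element equal to target, or -1 if absent.
   The index is computed by subtracting two pointers into the same array. */
ptrdiff_t index_of(const int *arr, size_t n, int target)
{
    assert(arr != NULL);
    const int *end = arr + n;                 /* pointer + integer */
    for (const int *p = arr; p < end; p++) {  /* pointer comparison */
        if (*p == target) {
            return p - arr;                   /* pointer - pointer */
        }
    }
    return -1;
}
```

The difference of two pointers has type ptrdiff_t and is measured in elements, not bytes.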


Arguments in main()

• To get arguments to the main function, use:
  – int main(int argc, char *argv[])
• What does this mean?
  – argc contains the number of strings on the command line (the executable counts as one, plus one for each argument). Here argc is 2:
        unix% sort myFile
  – argv is a pointer to an array containing the arguments as strings


Example

• foo hello 87 "bar baz"
• argc = 4 /* number arguments */
• argv[0] = "foo", argv[1] = "hello", argv[2] = "87", argv[3] = "bar baz"
  – Array of pointers to strings
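
Walking the argument array might look like the sketch below. print_args is a hypothetical helper, written as a separate function so it can be called from main(argc, argv) or exercised with a hand-built array.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: prints each argument and returns the total number
   of characters across all of them. */
size_t print_args(int argc, char *argv[])
{
    size_t total = 0;
    for (int i = 0; i < argc; i++) {
        assert(argv[i] != NULL);
        printf("argv[%d] = \"%s\"\n", i, argv[i]);
        total += strlen(argv[i]);
    }
    return total;
}
```

From a real program this would be called as print_args(argc, argv) inside main.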


C Memory Management

• How does the C compiler determine where to put all the variables in machine’s memory?
• How to create dynamically sized objects?
• To simplify discussion, we assume one program runs at a time, with access to all of memory.
• Later, we’ll discuss virtual memory, which lets multiple programs all run at same time, each thinking they own all of memory.


C Memory Management

• Program’s address space contains 4 regions:
  – stack: local variables inside functions, grows downward
  – heap: space requested for dynamic data via malloc(); resizes dynamically, grows upward
  – static data: variables declared outside functions, does not grow or shrink. Loaded when program starts, can be modified.
  – code: loaded when program starts, does not change

[Figure: memory address space (32 bits assumed here), with the stack near ~FFFF FFFF hex, then heap, static data, and code near ~0000 0000 hex]


Where are Variables Allocated?

• If declared outside a function, allocated in “static” storage
• If declared inside function, allocated on the “stack” and freed when function returns
  – main() is treated like a function

    int myGlobal;
    main() {
        int myTemp;
    }


The Stack

• Every time a function is called, a new frame is allocated on the stack
• Stack frame includes:
  – Return address (who called me?)
  – Arguments
  – Space for local variables
• Stack frames use contiguous blocks of memory; stack pointer indicates start of stack frame
• When function ends, stack pointer moves up; frees memory for future stack frames
• We’ll cover details later for MIPS processor

    fooA() { fooB(); }
    fooB() { fooC(); }
    fooC() { fooD(); }

[Figure: stack frames for fooA, fooB, fooC, and fooD, with the stack pointer marking the most recent frame]


Stack Animation

• Last In, First Out (LIFO) data structure

    main ()
    { a(0); }
    void a (int m)
    { b(1); }
    void b (int n)
    { c(2); }
    void c (int o)
    { d(3); }
    void d (int p)
    { }

[Figure: animation of the stack; the stack pointer moves as each call adds a frame. Stack grows down.]


Managing the Heap

C supports functions for heap management:

• malloc() allocate a block of uninitialized memory
• calloc() allocate a block of zeroed memory
• free() free previously allocated block of memory
• realloc() change size of previously allocated block
  – careful: it might move!
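
Because realloc() may move the block, the usual idiom is to keep the old pointer until the call succeeds. The sketch below shows one way to do that; grow_ints is a hypothetical helper, not a standard function.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: resizes *buf to hold new_count ints. Returns 1 on
   success, 0 on failure (in which case *buf is still valid and unchanged). */
int grow_ints(int **buf, size_t new_count)
{
    assert(buf != NULL);

    /* Reject a zero size and any byte count that would overflow size_t. */
    if (new_count == 0 || new_count > SIZE_MAX / sizeof **buf) {
        return 0;
    }

    /* On failure realloc returns NULL and leaves the original block
       allocated, so only overwrite *buf once the call has succeeded. */
    int *tmp = realloc(*buf, new_count * sizeof **buf);
    if (tmp == NULL) {
        return 0;
    }
    *buf = tmp;   /* the block may have moved; old addresses are now stale */
    return 1;
}
```

Any other pointers into the old block must be recomputed after a successful call, since the data may now live at a different address.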


Malloc()

• void *malloc(size_t n):
  – Allocate a block of uninitialized memory
  – NOTE: Subsequent calls probably will not yield adjacent blocks
  – n is an integer, indicating size of requested memory block in bytes
  – size_t is an unsigned integer type big enough to “count” memory bytes
  – Returns void* pointer to block; NULL return indicates no more memory
  – Additional control information (including size) stored in the heap for each allocated block.
• Examples:

    int *ip;
    ip = (int *) malloc(sizeof(int));

    typedef struct { … } TreeNode;
    TreeNode *tp = (TreeNode *) malloc(sizeof(TreeNode));

  – sizeof returns size of given type in bytes, produces more portable code
  – “Cast” operation, changes type of a variable. Here changes (void *) to (int *)


Managing the Heap

• void free(void *p):
  – Releases memory allocated by malloc()
  – p is pointer containing the address originally returned by malloc()

    int *ip;
    ip = (int *) malloc(sizeof(int));
    ... .. ..
    free((void*) ip); /* Can you free(ip) after ip++ ? */

    typedef struct { … } TreeNode;
    TreeNode *tp = (TreeNode *) malloc(sizeof(TreeNode));
    ... .. ..
    free((void *) tp);

  – When insufficient free memory, malloc() returns NULL pointer; Check for it!

    if ((ip = (int *) malloc(sizeof(int))) == NULL) {
        printf("\nMemory is FULL\n");
        exit(1); /* Crash and burn! */
    }

  – When you free memory, you must be sure that you pass the original address returned from malloc() to free(); Otherwise, system exception (or worse)!
  – And never use that memory again “use after free”
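
One defensive habit that follows from these rules is to clear a pointer right after freeing it. The sketch below pairs a checked allocation with such a release; make_ints and release_ints are hypothetical helpers written for this example.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: allocates n ints, all set to fill. Returns NULL if
   n is 0, the byte count would overflow, or malloc fails. */
int *make_ints(size_t n, int fill)
{
    if (n == 0 || n > SIZE_MAX / sizeof(int)) {
        return NULL;
    }
    int *p = malloc(n * sizeof *p);
    if (p == NULL) {
        return NULL;            /* caller must check for this */
    }
    for (size_t i = 0; i < n; i++) {
        p[i] = fill;
    }
    return p;
}

/* Frees *pp and sets it to NULL, so an accidental later use or second
   release is easier to catch (free(NULL) is defined to do nothing). */
void release_ints(int **pp)
{
    assert(pp != NULL);
    free(*pp);
    *pp = NULL;
}
```

This does not help with other copies of the same pointer, which still dangle after the free.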


Using Dynamic Memory

    typedef struct node {
        int key;
        struct node *left;
        struct node *right;
    } Node;

    Node *root = NULL;

    Node *create_node(int key, Node *left, Node *right)
    {
        Node *np;
        if ( (np = (Node*) malloc(sizeof(Node))) == NULL)
        { printf("Memory exhausted!\n"); exit(1); }
        else
        { np->key = key;
          np->left = left;
          np->right = right;
          return np;
        }
    }

    void insert(int key, Node **tree)
    {
        if ( (*tree) == NULL)
        { (*tree) = create_node(key, NULL, NULL); return; }

        if (key <= (*tree)->key)
            insert(key, &((*tree)->left));
        else
            insert(key, &((*tree)->right));
    }

    insert(10, &root);
    insert(16, &root);
    insert(5, &root);
    insert(11, &root);

[Figure: resulting tree. Root points to the node with Key=10; its Left child is Key=5 and its Right child is Key=16; Key=11 is the Left child of Key=16.]
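
The slide builds the tree but never releases it. A self-contained sketch of lookup and cleanup follows; tree_insert, tree_contains, and tree_free are hypothetical names, and tree_insert is an iterative variant of the slide's insert that reports allocation failure instead of exiting.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct node {
    int key;
    struct node *left;
    struct node *right;
} Node;

/* Same ordering rule as the slide (keys <= parent go left).
   Returns 1 on success, 0 if malloc fails. */
int tree_insert(Node **tree, int key)
{
    assert(tree != NULL);
    while (*tree != NULL) {
        tree = (key <= (*tree)->key) ? &(*tree)->left : &(*tree)->right;
    }
    Node *np = malloc(sizeof *np);
    if (np == NULL) {
        return 0;
    }
    np->key = key;
    np->left = NULL;
    np->right = NULL;
    *tree = np;
    return 1;
}

/* Returns 1 if key is present, 0 otherwise. */
int tree_contains(const Node *tree, int key)
{
    while (tree != NULL) {
        if (key == tree->key) {
            return 1;
        }
        tree = (key < tree->key) ? tree->left : tree->right;
    }
    return 0;
}

/* Post-order free: children first, then the node itself, so no freed
   node is ever dereferenced. Returns the number of nodes released. */
size_t tree_free(Node *tree)
{
    if (tree == NULL) {
        return 0;
    }
    size_t n = tree_free(tree->left) + tree_free(tree->right);
    free(tree);
    return n + 1;
}
```

After tree_free(root) the caller should set root to NULL, since every node it pointed to is gone.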


Observations

• Code, Static storage are easy: they never grow or shrink
• Stack space is relatively easy: stack frames are created and destroyed in last-in, first-out (LIFO) order
• Managing the heap is tricky: memory can be allocated / deallocated at any time
  – If you forget to deallocate memory: “Memory Leak”
    • Your program will eventually run out of memory
  – If you call free twice on the same memory: “Double Free”
    • Possible crash or exploitable vulnerability
  – If you use data after calling free: “Use after free”
    • Possible crash or exploitable vulnerability


Clickers/Peer Instruction!

    int x = 2;
    int result;

    int foo(int n)
    { int y;
      if (n <= 0) { printf("End case!\n"); return 0; }
      else
      { y = n + foo(n-x);
        return y;
      }
    }
    result = foo(10);

Right after the printf executes but before the return 0, how many copies of x and y are there allocated in memory?

A: #x = 1, #y = 1
B: #x = 1, #y = 5
C: #x = 5, #y = 1
D: #x = 1, #y = 6
E: #x = 6, #y = 6


Stack at the time of the printf: foo(10), foo(8), foo(6), foo(4), foo(2), foo(0)

x is a single global in static storage, and each of the six active frames holds its own y, so #x = 1 and #y = 6 (answer D).
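
The frame count can be confirmed by instrumenting the function. In the sketch below, live_frames, max_frames, and deepest_recursion are additions for illustration only and are not part of the original question; the printf is omitted so the function stays silent.

```c
#include <assert.h>

static int x = 2;            /* one copy, in static storage */
static int live_frames = 0;  /* hypothetical instrumentation */
static int max_frames = 0;   /* deepest nesting seen so far */

int foo(int n)
{
    int y;                   /* one copy per active call, on the stack */
    live_frames++;
    if (live_frames > max_frames) {
        max_frames = live_frames;
    }

    if (n <= 0) {
        y = 0;               /* end case */
    } else {
        y = n + foo(n - x);
    }
    live_frames--;
    return y;
}

/* Runs foo(start) and reports the largest number of simultaneous frames. */
int deepest_recursion(int start)
{
    live_frames = 0;
    max_frames = 0;
    (void)foo(start);
    assert(live_frames == 0);
    return max_frames;
}
```

For a start value of 10 the deepest nesting is six frames, one for each of foo(10) down to foo(0).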


And In Conclusion, …

• C has three main memory segments in which to allocate data:
  – Static Data: Variables outside functions
  – Stack: Variables local to function
  – Heap: Objects explicitly malloc-ed/free-d.
• Heap data is biggest source of bugs in C code
