Chapter 14/17 Functions & Activation Records Recursion.

34
Chapter 14/17 Functions & Activation Records Recursion

Transcript of Chapter 14/17 Functions & Activation Records Recursion.

Page 1: Chapter 14/17 Functions & Activation Records Recursion.

Chapter 14/17

Functions & Activation RecordsRecursion

Page 2: Chapter 14/17 Functions & Activation Records Recursion.

2College of CharlestonDr. Anderson, Computer Science

CS 250Comp. Org. & Assembly

Function

Smaller, simpler, subcomponent of program Provides abstraction

– hide low-level details– give high-level structure to program,

easier to understand overall program flow– enables separable, independent development

C functions– zero or multiple arguments passed in– single result returned (optional)– return value is always the same type– by convention, only one function named main (this defines the initial PC)

In other languages, called procedures, subroutines, ...

Page 3: Chapter 14/17 Functions & Activation Records Recursion.

3College of CharlestonDr. Anderson, Computer Science

CS 250Comp. Org. & Assembly

Functions in C

A function consists of a– Declaration (a.k.a prototype)

includes return value, function name, and the order and data type of all arguments – names of argument are optional)

– Calls – Definition

Names of variables do not need to match prototype, but must match order/type Defines functionality (source code) and returns control (and a value) to caller May produce side-effects

Declaration int Factorial (int n); Call a = x + Factorial (f + g); Definition

int Factorial(int n){ int i; int result = 1; for (i = 1; i <= n; i++) result *= i; return result;}

Page 4: Chapter 14/17 Functions & Activation Records Recursion.

4College of CharlestonDr. Anderson, Computer Science

CS 250Comp. Org. & Assembly

Why Declaration? Since function definition also includes return and argument types, why

is declaration needed? Use might be seen before definition.

– Compiler needs to know return and arg types and number of arguments. Definition might be in a different file, written by a different

programmer.– include a "header" file with function declarations only– compile separately, link together to make executable

What if we need to pass more parameters than we have registers?!? All the compiler needs to know

– (1) what symbol to call the function (its name), – (2) the type of the return value (for proper usage/intermediate storage)– (3) how to set up the stack to call the function (arguments)

This allows the compiler to inserted the necessary code for function activation when it is called

Page 5: Chapter 14/17 Functions & Activation Records Recursion.

5College of CharlestonDr. Anderson, Computer Science

CS 250Comp. Org. & Assembly

Implementing a Function Call: Overview Making a function call involves three basic steps

– The parameters of the call are passed from the caller to the callee– The callee does its task– A return value is returned to the caller

Functions in C are caller-independent– Every function must be callable from any other function

Activation record– information about each function, including arguments and local variables– also includes bookkeeping information

values that must always be saved by the caller before doing the JSR– arguments and bookeeping info pushed on run-time stack by caller– callee responsible for other uses of stack (such as local variables)– popped of the stack by caller after getting return value

R5 (the frame pointer) is the “bottom” of the stack as far as the callee function definition is aware. – It should not modify anything existing on the stack before its execution!

Page 6: Chapter 14/17 Functions & Activation Records Recursion.

6College of CharlestonDr. Anderson, Computer Science

CS 250Comp. Org. & Assembly

Implementing a Function Call: Overview

Calling function

allocate memory for activation record (push)copy arguments into stack arguments

call function

get result from stack (pop)deallocate memory for activation record (pop)

Called functionallocate space for return value [bookkeeping] (push)store mandatory callee save registers [bookkeeping] (push)allocate local variables (push)

execute code

put result in return value spacedeallocate local variables (pop) load callee save registers (pop)return

Page 7: Chapter 14/17 Functions & Activation Records Recursion.

7College of CharlestonDr. Anderson, Computer Science

CS 250Comp. Org. & Assembly

Activation Record int funName(int a, int b)

{ int w, x, y; . . . return y;}

Name Type Offset Scope

abwxy

intintintintint

450-1-2

funNamefunNamefunNamefunNamefunName

yxw

dynamic linkreturn addressreturn value

ab

bookkeeping

locals

args

R5

R6

Page 8: Chapter 14/17 Functions & Activation Records Recursion.

8College of CharlestonDr. Anderson, Computer Science

CS 250Comp. Org. & Assembly

Activation Record Bookkeeping

Return value– space for value returned by function– allocated even if function does not return a value

Return address– save pointer to next instruction in calling function– convenient location to store R7 to protect it from change in case

another function (JSR) is called Dynamic link

– caller’s frame pointer (basically a caller save of its own frame pointer R5)– used to “pop” this activation record from stack

In the LC-3 the format for an activation record includes only (and always) these three words pushed in that order!

Page 9: Chapter 14/17 Functions & Activation Records Recursion.

9College of CharlestonDr. Anderson, Computer Science

CS 250Comp. Org. & Assembly

Run-Time Stack

Caller

Memory

R6

Callee

Memory

R6

Caller

Memory

Caller

Before call During call After call

R5

R5

R6R5

Page 10: Chapter 14/17 Functions & Activation Records Recursion.

10College of CharlestonDr. Anderson, Computer Science

CS 250Comp. Org. & Assembly

Example Function Call

int Callee (int q, int r) { int k, m; ... return k;}

int Caller (int a) { int w = 25; w = Callee (w,10); return w;}

wdyn linkret addrret vala

25

xFD00

R5R6

Page 11: Chapter 14/17 Functions & Activation Records Recursion.

11College of CharlestonDr. Anderson, Computer Science

CS 250Comp. Org. & Assembly

The Function Call: Preparation to Jump

int Callee (int q, int r) … int Caller (int a) … … w = Callee(w, 10);… ; push second arg

AND R0, R0, #0ADD R0, R0, #10ADD R6, R6, #-1STR R0, R6, #0; push first argumentLDR R0, R5, #0ADD R6, R6, #-1STR R0, R6, #0

JSR CALLEE ; Jump!Note: Caller needs to know number and type of arguments,doesn't know about local variables.

qrwdyn linkret addrret vala

251025

xFD00

R5R6

New R6

Page 12: Chapter 14/17 Functions & Activation Records Recursion.

12College of CharlestonDr. Anderson, Computer Science

CS 250Comp. Org. & Assembly

Starting the Callee Function ; push space for return value

ADD R6, R6, #-1; push return addressADD R6, R6, #-1STR R7, R6, #0; push dyn link (caller’s R5)ADD R6, R6, #-1STR R5, R6, #0; set new frame pointerADD R5, R6, #-1; allocate space for localsADD R6, R6, #-2

… execute body of function

mkDyn lnk (R5)ret addrret valueqrwdyn linkret addrret vala

xFCFBx3100

251025

xFD00

R5

R6

new R5

new R6

Note: Callee makes room for bookkeeping variables (standard format) and its local variables (as necessary)

Page 13: Chapter 14/17 Functions & Activation Records Recursion.

13College of CharlestonDr. Anderson, Computer Science

CS 250Comp. Org. & Assembly

Ending the Callee Function

return k;

; copy k into return valueLDR R0, R5, #0STR R0, R5, #3; pop local variablesADD R6, R5, #1; pop dynamic link (into R5)LDR R5, R6, #0ADD R6, R6, #1; pop return addr (into R7)LDR R7, R6, #0ADD R6, R6, #1; return control to callerRET

mkdyn linkret addrret valqrwdyn linkret addrret vala

-43217xFCFBx310021725 1025

xFD00

R6R5

new R6

new R5

Page 14: Chapter 14/17 Functions & Activation Records Recursion.

14College of CharlestonDr. Anderson, Computer Science

CS 250Comp. Org. & Assembly

Resuming the Caller Function

w = Callee(w,10);

; JSR Callee ; LAST COMMAND

; load return value (top of stack)LDR R0, R6, #0; perform assignment of return valueSTR R0, R5, #0; pop return value and argumentsADD R6, R6, #3

… Continue caller code

ret valqrwdyn linkret addrret vala

21725 10217

xFD00

R6

R5new R6

Page 15: Chapter 14/17 Functions & Activation Records Recursion.

15College of CharlestonDr. Anderson, Computer Science

CS 250Comp. Org. & Assembly

Summary of LC-3 Function Call Implementation

1. Caller pushes arguments (last to first).2. Caller invokes subroutine (JSR).3. Callee allocates return value, pushes R7 and R5.4. Callee allocates space for local variables (first to last).5. Callee executes function code.6. Callee stores result into return value slot.7. Callee pops local vars, pops R5, pops R7.8. Callee returns (JMP R7).9. Caller loads return value and pops arguments.10. Caller resumes computation…

Page 16: Chapter 14/17 Functions & Activation Records Recursion.

16College of CharlestonDr. Anderson, Computer Science

CS 250Comp. Org. & Assembly

x0014x000Ax0001

Group Problem

main() { int x = 1, y = 10, z = 20; … z = foo(x, y); …}

int foo(int a, int b) { int i, j; … i = j + b; … return i;}

Show the assembly code for this instruction.

Assume JSR FOO ends up at address x3C00

zyx

dyn linkret addrret val

xFD00

R5

R6

xFCFFxFCFExFCFDxFCFC

Page 17: Chapter 14/17 Functions & Activation Records Recursion.

17College of CharlestonDr. Anderson, Computer Science

CS 250Comp. Org. & Assembly

What is Recursion?

A recursive function is one that solves its task by calling itself on smaller pieces of data.– Similar to recurrence function in mathematics.– Like iteration -- can be used interchangeably;

sometimes recursion results in a simpler solution. Standard example: Fibonacci numbers

– The n-th Fibonacci number is the sum of the previous two Fibonacci numbers.

– F(n) = F(n – 1) + F(n – 2) where F(1) = F(0) = 1

int Fibonacci(int n){ if ((n == 0) || (n == 1))

return 1; else

return Fibonacci(n-1) + Fibonacci(n-2);}

Page 18: Chapter 14/17 Functions & Activation Records Recursion.

18College of CharlestonDr. Anderson, Computer Science

CS 250Comp. Org. & Assembly

Activation Records Whenever Fibonacci is invoked, a new activation record is pushed onto

the stack.

Fib(1)

R6

Fib(2)

Fib(3)

main

main calls Fibonacci(3)

Fibonacci(3) calls Fibonacci(2)

Fibonacci(2) calls Fibonacci(1)

R6Fib(3)

main

R6Fib(2)

Fib(3)

main

Page 19: Chapter 14/17 Functions & Activation Records Recursion.

19College of CharlestonDr. Anderson, Computer Science

CS 250Comp. Org. & Assembly

Activation Records (cont.)

Fibonacci(1) returns,Fibonacci(2) calls

Fibonacci(0)

Fibonacci(2) returns,Fibonacci(3) calls

Fibonacci(1)

Fibonacci(3)returns

R6main

R6Fib(1)

Fib(3)

main

Fib(0)

R6

Fib(2)

Fib(3)

main

Page 20: Chapter 14/17 Functions & Activation Records Recursion.

20College of CharlestonDr. Anderson, Computer Science

CS 250Comp. Org. & Assembly

Tracing the Function Calls

If we are debugging this program, we might want to trace all the calls of Fibonacci.– Note: A trace will also contain the arguments

passed into the function.

For Fibonacci(3), a trace looks like: Fibonacci(3)

Fibonacci(2) Fibonacci(1) Fibonacci(0) Fibonacci(1)

What would trace of Fibonacci(4) look like?

Page 21: Chapter 14/17 Functions & Activation Records Recursion.

21College of CharlestonDr. Anderson, Computer Science

CS 250Comp. Org. & Assembly

Fibonacci: LC-3 Code

Activation Record

tempdynamic link

return addressreturn value

n

bookkeeping

arg

Compiler generatestemporary variable to holdresult of first Fibonacci call.

local

Page 22: Chapter 14/17 Functions & Activation Records Recursion.

22College of CharlestonDr. Anderson, Computer Science

CS 250Comp. Org. & Assembly

In Summary: The Stack

Since our program usually starts at a low memory address and grows upward, we start the stack at a high memory address and work downward.

Purposes– Temporary storage of variables– Temporary storage of program addresses– Communication with subroutines

Push variables on stack Jump to subroutine Clean stack Return

Page 23: Chapter 14/17 Functions & Activation Records Recursion.

23College of CharlestonDr. Anderson, Computer Science

CS 250Comp. Org. & Assembly

Parameter passing on the stack If we use registers to pass our parameters:

– Limit of 8 parameters to/from any subroutine.– We use up registers so they are not available to our program.

So, instead we push the parameters onto the stack.– Parameters are passed on the stack– Return values can be provided in registers (such as R0) or on the stack.– Generally, only R6 should be changed by a subroutine.

Other registers that are changed should must be callee saved/restored. Subroutines should be transparent

Both the subroutine and the main program must know how many parameters are being passed!– In C we would use a prototype: int power (int number, int exponent);

In assembly, you must take care of this yourself. After you return from a subroutine, you must also clear the stack.

– Clean up your mess!

Page 24: Chapter 14/17 Functions & Activation Records Recursion.

24College of CharlestonDr. Anderson, Computer Science

CS 250Comp. Org. & Assembly

Characteristics of good subroutines

Generality – can be called with any arguments– Passing arguments on the stack does this.

Transparency – you have to leave the registers like you found them, except R6.– Registers must be callee saved.

Readability – well documented. Re-entrant – subroutine can call itself if necessary

– Store all information relevant to specific execution to non-fixed memory locations The stack!

– This includes temporary callee storage of register values!

Page 25: Chapter 14/17 Functions & Activation Records Recursion.

25College of CharlestonDr. Anderson, Computer Science

CS 250Comp. Org. & Assembly

Know how to…

Push parameters onto the stack Access parameters on the stack using base + offset addressing mode Draw the stack to keep track of subroutine execution

– Parameters– Return address

Clean the stack after a subroutine call

Page 26: Chapter 14/17 Functions & Activation Records Recursion.

26College of CharlestonDr. Anderson, Computer Science

CS 250Comp. Org. & Assembly

String Library Code– Implementation of C/C++ function gets

No way to specify limit on number of characters to read

– Similar problems with other C/C++ functions strcpy,scanf, fscanf, sscanf, when given %s conversion specification cin (as commonly used in C++)

/* Get string from stdin */char *gets(char *dest){ int c = getc(); char *p = dest; while (c != EOF && c != '\n') { *p++ = c; c = getc(); } *p = '\0'; return dest;}

Page 27: Chapter 14/17 Functions & Activation Records Recursion.

27College of CharlestonDr. Anderson, Computer Science

CS 250Comp. Org. & Assembly

Vulnerable Buffer Code

int main(){ printf("Type a string:"); echo(); return 0;}

/* Echo Line */void echo(){ char buf[4]; /* Way too small! */ gets(buf); puts(buf);}

Page 28: Chapter 14/17 Functions & Activation Records Recursion.

28College of CharlestonDr. Anderson, Computer Science

CS 250Comp. Org. & Assembly

Buffer Overflow Executions

unix>./bufdemoType a string:123123

unix>./bufdemoType a string:12345Segmentation Fault

unix>./bufdemoType a string:12345678Segmentation Fault

Page 29: Chapter 14/17 Functions & Activation Records Recursion.

29College of CharlestonDr. Anderson, Computer Science

CS 250Comp. Org. & Assembly

Malicious Use of Buffer Overflow

– Input string contains byte representation of executable code– Overwrite return address with address of buffer– When bar() executes ret, will jump to exploit code

void bar() { char buf[64]; gets(buf); ... }

void foo(){ bar(); ...}

Stack after call to gets()

B

returnaddress

A

foo stack frame

bar stack frameB exploitcode

pad

data written

bygets()

Page 30: Chapter 14/17 Functions & Activation Records Recursion.

30College of CharlestonDr. Anderson, Computer Science

CS 250Comp. Org. & Assembly

Exploits Based on Buffer Overflows Buffer overflow bugs allow remote machines to execute arbitrary code

on victim machines. Internet worm

– Early versions of the finger server (fingerd) used gets() to read the argument sent by the client:

finger [email protected]– Worm attacked fingerd server by sending phony argument:

finger “exploit-code padding new-return-address” exploit code: executed a root shell on the victim machine with a direct TCP

connection to the attacker. IM War

– AOL exploited existing buffer overflow bug in AIM clients– exploit code: returned 4-byte signature (the bytes at some location in the

AIM client) to server. – When Microsoft changed code to match signature, AOL changed signature

location.

Page 31: Chapter 14/17 Functions & Activation Records Recursion.

31College of CharlestonDr. Anderson, Computer Science

CS 250Comp. Org. & Assembly

Code Red Worm

History– June 18, 2001. Microsoft announces buffer overflow vulnerability in IIS

Internet server– July 19, 2001. over 250,000 machines infected by new virus in 9 hours– White house must change its IP address. Pentagon shut down public

WWW servers for day When We Set Up CS:APP Web Site

– Received strings of formGET /default.ida?

NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN....NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN%u9090%u6858%ucbd3%u7801%u9090%u6858%ucbd3%u7801%u9090%u6858%ucbd3%u7801%u9090%u9090%u8190%u00c3%u0003%u8b00%u531b%u53ff%u0078%u0000%u00=a

HTTP/1.0" 400 325 "-" "-"

Page 32: Chapter 14/17 Functions & Activation Records Recursion.

32College of CharlestonDr. Anderson, Computer Science

CS 250Comp. Org. & Assembly

Code Red Exploit Code– Starts 100 threads running– Spread self

Generate random IP addresses & send attack string

Between 1st & 19th of month– Attack www.whitehouse.gov

Send 98,304 packets; sleep for 4-1/2 hours; repeat

– Denial of service attack Between 21st & 27th of month

– Deface server’s home page After waiting 2 hours

Page 33: Chapter 14/17 Functions & Activation Records Recursion.

33College of CharlestonDr. Anderson, Computer Science

CS 250Comp. Org. & Assembly

Avoiding Overflow Vulnerability

Use Library Routines that Limit String Lengths– fgets instead of gets– strncpy instead of strcpy– Don’t use scanf with %s conversion specification

Use fgets to read the string

/* Echo Line */void echo(){ char buf[4]; /* Way too small! */ fgets(buf, 4, stdin); puts(buf);}

Page 34: Chapter 14/17 Functions & Activation Records Recursion.

34College of CharlestonDr. Anderson, Computer Science

CS 250Comp. Org. & Assembly

Practice problems

14.2, 14.4, 14.9, 14.10, 14.15 (good!) The convention in LC-3 assembly is that all registers are callee-saved

except for R5 (the frame pointer) R6 (the stack pointer) and R7 (the return link).

– Why is R5 not callee-saved?– Why is R6 not callee-saved?– Why is R7 not callee-saved?

Is it true that any problem that can be solved recursively can be solved iteratively using a stack data structure? Why or why not?