CPS110: Intro to processes
Landon Cox
OS Complexity
  Lines of code
    XP: 40 million
    Linux 2.6: 6 million (mostly driver code)
  Sources of complexity
    Multiple instruction streams (processes)
    Multiple interrupt sources (I/O, timers, faults)
  How can we keep everything straight?
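Dealing with complexity
  Program decomposition
    Functions
    OO: Classes, types

  Before decomposition:

    #include <cmath>
    #include <iostream>
    using namespace std;

    double input, output;   // globals shared by every version below

    int main () {
      cout << "input: ";
      cin >> input;
      output = sqrt (input);
      output = pow (output, 3);
      cout << output << endl;
    }

  After decomposition (same headers and globals as above):

    void getInput ();
    void computeResult ();
    void printOutput ();

    int main () {
      getInput ();
      computeResult ();
      printOutput ();
    }

    void getInput () {
      cout << "input: ";
      cin >> input;
    }

    void computeResult () {
      output = sqrt (input);
      output = pow (output, 3);
    }

    void printOutput () {
      cout << output << endl;
    }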
Intro to processes
  Decompose activities into separate tasks
  Allow them to run in parallel
    “Independently” (what does this mean?)
    “without dependencies” …
  Key OS abstraction: processes
    Run independently of each other
    Don’t have to know about others
Intro to processes
  Remember, for any area of OS, ask
    What interface does the hardware provide?
    What interface does the OS provide?
  What is physical reality?
    Single computer (CPU + memory)
    Execute instructions from many programs
  What does an application see?
    Each app thinks it has its own CPU + memory
Hardware, OS interfaces
  [Figure: layered diagram. Hardware (Memory, CPU) at the bottom, the OS above it, and Applications on top: Job 1, Job 2, and Job 3, each labeled “CPU, Mem”.]
What is a process?
  Informal
    A program in execution
    Running code + things it can read/write
    Process ≠ program
  Formal
    ≥ 1 threads in their own address space
    (soon threads will share an address space)
Course administration
  Project 0 due soon (+), groups due today (+)
    Post questions to the blackboard message board
    Once I have your group + you have your CS account, you can submit
  Project 1 will be out next Wednesday
    Will be due in about a month
  Discussion sections
    F 2:50 – 4:05
  Any other questions?
Parts of a process
  Thread
    Sequence of executing instructions
    Active: does things
  Address space
    Data the process uses as it runs
    Passive: acted upon by threads
  Identity, security context, open resources, etc…
Play analogy
  Process is like a play performance
  Program is like the play’s script
  [Figure labels: Threads, Address space]
  What are the threads?
  What is the address space?
What is in the address space?
  Program code
    Instructions, also called “text”
  Data segment
    Global variables, static variables
    Heap (where “new” memory comes from)
  Stack
    Where local variables are stored
Review of the stack
  Each stack frame contains a function’s
    Local variables
    Parameters
    Return address
    Saved values of calling function’s registers
  The stack enables recursion
Example stack
  Code (the figure marks addresses 0x8048347, 0x8048354, 0x8048361, 0x804838c in this region)
    void C () { A (0); }
    void B () { C (); }
    void A (int tmp) { if (tmp) B (); }
    int main () { A (1); return 0; }
  Stack (memory runs from 0xffffffff down to 0x0; SP moves with each call)
    main: const1=1, const2=0
    A:    tmp=1, RA=0x804838c
    B:    RA=0x8048361
    C:    const=0, RA=0x8048354
    A:    tmp=0, RA=0x8048347
The stack and recursion
  Code (the figure marks addresses 0x8048361, 0x804838c in this region)
    void A (int bnd) { if (bnd) A (bnd-1); }
    int main () { A (3); return 0; }
  Stack (memory runs from 0xffffffff down to 0x0; SP moves with each call)
    main: const1=3, const2=0
    A:    bnd=3, RA=0x804838c
    A:    bnd=2, RA=0x8048361
    A:    bnd=1, RA=0x8048361
    A:    bnd=0, RA=0x8048361
  How can recursion go wrong?
    Can overflow the stack …
    Keep adding frame after frame
The stack and buffer overflows
  Code (the figure marks addresses 0x8048361, 0x804838c in this region)
    void cap (char* b) {
      for (int i=0; b[i]!='\0'; i++) b[i]+=32;
    }
    int main (char* arg) {
      char wrd[4];
      strcpy (wrd, arg);   // destination first: copies arg into wrd with no length check
      cap (wrd);
      return 0;
    }
  Stack (memory runs from 0xffffffff down to 0x0)
    main: const2=0, wrd[3], wrd[2], wrd[1], wrd[0] (wrd starts at 0x00234)
    cap:  b=0x00234, RA=0x804838c
  What can go wrong?
    Can overflow wrd variable …
    Overwrite cap’s RA
What is missing?
  What process state isn’t in the address space?
    Registers
      Program counter (PC)
      General purpose registers
    Review 104 for more details
Multiple threads in an addr space
  Several actors on a single set
  Sometimes they interact (speak, dance)
  Sometimes they are apart (different scenes)
Private vs global thread state
  What state is private to each thread?
    PC (where actor is in his/her script)
    Stack, SP (actor’s mindset)
  What state is shared?
    Global variables, heap (props on set)
    Code (like lines of a play)
Looking ahead: concurrency
  Concurrency
    Having multiple threads active at one time
    Thread is the unit of concurrency
  Primary topics
    How threads cooperate on a single task
    How multiple threads can share the CPU
  Subject of Project 1
Looking ahead: address spaces
  Address space
    Unit of “state partitioning”
  Primary topics
    Many addr spaces sharing physical memory
    Efficiency
    Safety (protection)
  Subject of Project 2
Thread independence
  Ideal decomposition of tasks:
    Tasks are completely independent
    Remember our earlier definition of independence
  Is such a pure abstraction really feasible?
    Word saves a pdf, starts acroread, which reads the pdf?
    Running mp3 player, while compiling 110 project?
  Sharing creates dependencies
    Software resource (file, address space)
    Hardware resource (CPU, monitor, keyboard)
True thread independence
  What would pure independence actually look like?
    (system with no shared software, hardware resources)
    Multiple computer systems
    Each running non-interacting programs
    Technically still share the power grid …
  “Pure” independence is infeasible
    Tension between software dependencies, “features”
  Key question: is the thread abstraction still useful?
    Easier to have one thread with multiple responsibilities?
Consider a web server
  One processor (say)
  Multiple disks
  Tasks
    Receives multiple, simultaneous requests
    Reads web pages from disk
    Returns on-disk files to requester
Web server (single thread)
  Option 1: could handle requests serially
  Easy to program, but painfully slow (why?)
  [Figure: timeline for Client 1, Client 2, and the web server (WS). Labels: R1 arrives, Receive R1, R2 arrives, Disk request 1a, 1a completes, R1 completes, Receive R2.]
Web server (event-driven)
  Option 2: use asynchronous I/O
  Fast, but hard to program (why?)
  [Figure: timeline for Client 1, Client 2, the web server (WS), and the Disk. Labels: R1 arrives, Receive R1, Disk request 1a, Start 1a, R2 arrives, Receive R2, Finish 1a, 1a completes, R1 completes.]
Web server (multi-threaded)
  Option 3: assign one thread per request
  Where is each request’s state stored?
  [Figure: timeline for Client 1, Client 2, and two web server threads (WS1, WS2). Labels: R1 arrives, Receive R1, R2 arrives, Disk request 1a, 1a completes, R1 completes, Receive R2.]
Threads are useful
  The thread abstraction cannot provide total independence
  But it is still a useful abstraction!
  Threads make concurrent programming easier
    Thread system manages sharing the CPU (unlike in event-driven case)
    Apps can encapsulate task state w/i a thread (e.g. web request state)
Where are threads used?
  When a resource is slow, don’t want to wait on it
  Windowing system
    One thread per window, waiting for window input
    What is slow?
      Human input, mouse, keyboard
  Network file/web/DB server
    One thread per incoming request
    What is slow?
      Network, disk, remote user (e.g. ATM bank customer)
Where are threads used?
  When a resource is slow, don’t want to wait on it
  Operating system kernel
    One thread waits for keyboard input
    One thread waits for mouse input
    One thread writes to the display
    One thread writes to the printer
    One thread receives data from the network card
    One thread per disk
    …
  Just about everything except the CPU is slow
Cooperating threads
  Assume each thread has its own CPU
    We will relax this assumption later
  CPUs run at unpredictable speeds
    Source of non-determinism
  [Figure: Thread A, Thread B, and Thread C each run on their own CPU; the three CPUs share one Memory.]
Non-determinism and ordering
  [Figure: events from Thread A, Thread B, and Thread C laid out along a single Time axis, forming a global ordering.]
  Why do we care about the global ordering?
    Might have dependencies between events
    Different orderings can produce different results
  Why is this ordering unpredictable?
    Can’t predict how fast processors will run
Non-determinism example 1
  Thread A: cout << "ABC";
  Thread B: cout << "123";
  Possible outputs?
    "A1BC23", "ABC123", …
  Impossible outputs? Why?
    "321CBA", "B12C3A", …
  What is shared between threads?
    Screen, maybe the output buffer
Non-determinism example 2
  y=10;
  Thread A: int x = y+1;
  Thread B: y = y*2;
  Possible results?
    A goes first: x = 11 and y = 20
    B goes first: y = 20 and x = 21
  What is shared between threads?
    Variable y
Non-determinism example 3
  x=0;
  Thread A: x = 1;
  Thread B: x = 2;
  Possible results?
    B goes first: x = 1
    A goes first: x = 2
  Is x = 3 possible?
Example 3, continued
  What if “x = <int>;” is implemented as
    x := x & 0
    x := x | <int>
  Consider this schedule (Thread A assigns 1, Thread B assigns 2)
    Thread A: x := x & 0
    Thread B: x := x & 0
    Thread B: x := x | 2
    Thread A: x := x | 1
Atomic operations
  Must know what operations are atomic before we can reason about cooperation
  Atomic
    Indivisible
    Happens without interruption
  Between start and end of atomic action
    No events from other threads can occur
Review of examples
  Print example (ABC, 123)
    What did we assume was atomic?
    What if “print” is atomic?
    What if printing a char was not atomic?
  Arithmetic example (x=y+1, y=y*2)
    What did we assume was atomic?
Atomicity in practice
  On most machines
    Memory assignment/reference is atomic
      E.g.: a=1, a=b
    Many other instructions are not atomic
      E.g.: double-precision floating point store (often involves two memory operations)
Virtual/physical interfaces
  [Figure: layered diagram with Hardware at the bottom, the OS above it, and Applications on top.]
  If you don’t have atomic operations, you can’t make one.
Another example
  Two threads (A and B)
    A tries to increment i
    B tries to decrement i
  Thread A: i = 0; while (i < 10) { i++; } print "A done."
  Thread B: i = 0; while (i > -10) { i--; } print "B done."
Example continued
  Who wins?
  Does someone have to win?
  Thread A: i = 0; while (i < 10) { i++; } print "A done."
  Thread B: i = 0; while (i > -10) { i--; } print "B done."
Example continued
  Will it go on forever if both threads
    Start at about the same time
    And execute at exactly the same speed?
  Yes, if each C statement is atomic.
  Thread A: i = 0; while (i < 10) { i++; } print "A done."
  Thread B: i = 0; while (i > -10) { i--; } print "B done."
Example continued
  What if i++/i-- are not atomic?
    tmp := i+1
    i := tmp
    (tmp is private to A and B)
Example continued
  Non-atomic i++/i--
  If A starts ½ statement ahead, B can win
  How?
    Thread A: tmpA := i + 1   // tmpA == 1
    Thread B: tmpB := i - 1   // tmpB == -1
    Thread A: i := tmpA       // i == 1
    Thread B: i := tmpB       // i == -1
Example continued
  Non-atomic i++/i--
  If A starts ½ statement ahead, B can win
  How?
  Do you need to worry about this?
    Yes!!! No matter how unlikely
Debugging non-determinism
  Requires worst-case reasoning
    Eliminate all ways for program to break
  Debugging is hard
    Can’t test all possible interleavings
    Bugs may only happen sometimes
  Heisenbug
    Re-running program may make the bug disappear
    Doesn’t mean it isn’t still there!
Constraining concurrency
  Synchronization
    Controlling thread interleavings
  Some events are independent
    No shared state
    Relative order of these events doesn’t matter
  Other events are dependent
    Output of one can be input to another
    Their order can affect program results
Goals of synchronization
  1. All interleavings must give correct result
    Correct concurrent program
    Works no matter how fast threads run
    Important for your projects!
  2. Constrain program as little as possible
    Why?
      Constraints slow program down
      Constraints create complexity
Conclusion
  Next class: more cooperation
    “How do actors interact on stage?”
  After next week
    Should be able to start Project 1
  Review C++/STL this weekend
  Remember to send me your groups!