6.894: Distributed Operating System Engineering

Lecturers:

Frans Kaashoek

Robert Morris

TA:

Jinyang Li

www.pdos.lcs.mit.edu/6.894


Operating System

• Software that turns silicon into something useful
– Provides applications with a programming interface
– Manages hardware resources on behalf of applications

Distributed Operating System

• The holy grail: transparency
– Provide applications with a virtual machine consisting of many processors distributed around the network
• Distributed OS engineering is difficult:
– Failures
– High degree of concurrency
– Long latencies
– New classes of security attacks

Client/Server Architecture

• A modular architecture to structure distributed systems
– Clients request services from servers
– Clients and servers communicate with messages
– Servers are typically trusted
• Other architectures
– Peer-to-peer (decentralized)
– Single address space

6.894 topics

• Client-server components
– Remote procedure call, threads, address spaces, etc.
• Storage
– File systems, transactions
• Security
– Confidentiality, authentication, etc.

• Scalable servers

6.894 is an advanced 6.033

• Perform actual systems research
– Perform a research project
– Study recent research papers
• Design systems for real workloads
– New abstractions, protocols, data structures, algorithms, etc.
• Build a real system (lab)
– Real enough that you can use it

Internet video-on-demand server

• A running example to study the issues and give an overview of 6.894
• Requirements:
– Low- and high-quality video
– Many users, spread around the Internet
– Last-mile bandwidth may be low
– Access control

Client and server structure

Client() {
    fd = connect("server");
    write(fd, "video.mpg");      /* request the video by name */
    while (!eof(fd)) {
        read(fd, buf);           /* blocks until the next block arrives */
        display(buf);
    }
}

Server() {
    while (1) {
        cfd = accept();
        read(cfd, name);
        fd = open(name);
        while (!eof(fd)) {
            read(fd, block);
            write(cfd, block);
        }
        close(cfd);
        close(fd);
    }
}

Performance “analysis”

• Server capacity:
– Network (100 Mbit/s)
– Disk (20 Mbyte/s)

• Obtained performance: one client stream

• Server is limited by software structure

• If a video is 200 Kbit/s, the server hardware should be able to support far more than one client.
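As a back-of-the-envelope check of the numbers above, the sketch below computes how many 200 Kbit/s streams each resource could carry in principle. It assumes decimal units (1 Mbit = 10^6 bits, 1 Mbyte = 8 × 10^6 bits) and ignores protocol overhead and disk seeks. The function and variable names are illustrative and do not come from the slides.

```python
def max_streams(capacity_bits_per_s, stream_bits_per_s):
    """Upper bound on the number of concurrent streams a resource can carry."""
    return capacity_bits_per_s // stream_bits_per_s

NETWORK = 100 * 10**6      # 100 Mbit/s network link
DISK = 20 * 8 * 10**6      # 20 Mbyte/s disk, expressed in bits per second
STREAM = 200 * 10**3       # 200 Kbit/s per video stream

network_limit = max_streams(NETWORK, STREAM)
disk_limit = max_streams(DISK, STREAM)
# The slower resource bounds the whole server.
hardware_limit = min(network_limit, disk_limit)
print(network_limit, disk_limit, hardware_limit)
```

Under these assumptions the network is the first hardware limit, which is the gap the one-stream software structure leaves unused.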

Better single-server performance

• Goal: run at server’s hardware speed
– Disk or network should be bottleneck
• Method:
– Pipeline blocks of each request
– Multiplex requests from multiple clients
• Two implementation approaches:
– Multithreaded server
– Asynchronous I/O

Multithreaded server

server() {
    while (1) {
        cfd = accept();
        read(cfd, name);
        fd = open(name);
        while (!eof(fd)) {
            read(fd, block);
            write(cfd, block);
        }
        close(cfd);
        close(fd);
    }
}

for (i = 0; i < 10; i++)
    fork(server);

• When waiting for I/O, thread scheduler runs another thread

• All shared data must be protected by locks

• Release locks when blocking
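A minimal runnable sketch of the multithreaded structure, written in Python rather than the slide's pseudocode. Several assumptions are made for illustration: videos live in an in-memory dictionary instead of on disk, the request name arrives in a single `recv`, and the names `serve_forever`, `start_server`, `fetch`, and `stats` are hypothetical. The shared counter is updated under a lock, and the lock is never held across a blocking call.

```python
import socket
import threading

stats_lock = threading.Lock()
stats = {"served": 0}                 # shared by all worker threads

def serve_forever(listener, files):
    """One worker thread: accept a client, read a name, send the data back."""
    while True:
        conn, _ = listener.accept()   # blocks; other threads keep running
        with conn:
            # Simplification: assume the whole name arrives in one recv.
            name = conn.recv(1024).decode().strip()
            conn.sendall(files.get(name, b""))   # blocking write
            with stats_lock:          # lock held only around the update
                stats["served"] += 1

def start_server(files, workers=10):
    """Start `workers` threads that all accept on one listening socket."""
    listener = socket.socket()
    listener.bind(("127.0.0.1", 0))   # port 0: let the OS pick a free port
    listener.listen()
    for _ in range(workers):
        threading.Thread(target=serve_forever, args=(listener, files),
                         daemon=True).start()
    return listener

def fetch(port, name):
    """Client side: request `name` and read until the server closes."""
    with socket.create_connection(("127.0.0.1", port)) as s:
        s.sendall(name.encode() + b"\n")
        chunks = []
        while True:
            chunk = s.recv(4096)
            if not chunk:
                break
            chunks.append(chunk)
        return b"".join(chunks)
```

A caller would start the server with a dictionary of named byte strings, read the chosen port from `listener.getsockname()[1]`, and pass it to `fetch`.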

Asynchronous I/O

struct callback {
    bool (*is_ready)();
    void (*cb)(void *arg);
    void *arg;
};

main() {
    while (1) {
        for (c = each callback) {
            if (c->is_ready())
                c->cb(c->arg);
        }
    }
}

• Code is structured as a collection of handlers
• Handlers are nonblocking
• Create new handlers for blocking operations
• When operation completes, call handler
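The callback loop can be sketched in a few lines of Python. This is an illustration under one stated assumption: callbacks are one-shot, meaning they are removed from the list once they fire, so a handler that wants to continue must register a new callback. The slides do not say whether callbacks persist, and the `EventLoop` class and its method names are hypothetical.

```python
from collections import namedtuple

# Mirrors the slide's struct: a readiness test, a handler, and its argument.
Callback = namedtuple("Callback", ["is_ready", "cb", "arg"])

class EventLoop:
    def __init__(self):
        self.callbacks = []

    def add(self, is_ready, cb, arg=None):
        """Register a handler to run once is_ready() returns true."""
        self.callbacks.append(Callback(is_ready, cb, arg))

    def run_once(self):
        """One pass over the list: fire ready handlers, keep the rest."""
        current, self.callbacks = self.callbacks, []
        for c in current:
            if c.is_ready():
                c.cb(c.arg)               # handler must not block
            else:
                self.callbacks.append(c)  # still waiting; check next pass

    def run(self):
        """Loop until no callbacks remain."""
        while self.callbacks:
            self.run_once()
```

Handlers may call `add` while the loop is running; new callbacks are examined on the next pass.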

Asynchronous server

init() {
    on_accept(accept_cb);
}

accept_cb() {
    cfd = accept();
    on_readable(cfd, name_cb);
}

on_readable(fd, fn) {
    c = new callback(test_readable, fn, fd);
    add c to callback list;
}

name_cb(cfd) {
    read(cfd, name);
    fd = open(name);
    on_readable(fd, read_cb);
}

read_cb(cfd, fd) {
    read(fd, block);
    on_writeable(cfd, write_cb);
}

write_cb(cfd, fd) {
    write(cfd, block);
    on_readable(fd, read_cb);
}
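The same handler chain can be approximated with Python's standard `selectors` module. This sketch departs from the slide in ways that should be read as assumptions: video data comes from an in-memory dictionary, so there is no separate disk-readable callback (regular files cannot usefully be polled this way on many systems), the request name is assumed to arrive in one `recv`, and partial sends are handled by keeping the unsent remainder. All function names other than the library calls are hypothetical.

```python
import io
import selectors
import socket

sel = selectors.DefaultSelector()
BLOCK = 4096

def on_accept(listener, files):
    """Register the listening socket; accept_cb runs when a client connects."""
    sel.register(listener, selectors.EVENT_READ,
                 lambda: accept_cb(listener, files))

def accept_cb(listener, files):
    cfd, _ = listener.accept()
    cfd.setblocking(False)
    sel.register(cfd, selectors.EVENT_READ, lambda: name_cb(cfd, files))

def name_cb(cfd, files):
    # Simplification: assume the whole name arrives in one recv.
    name = cfd.recv(1024).decode().strip()
    src = io.BytesIO(files.get(name, b""))
    state = {"pending": b""}          # per-connection state, passed explicitly
    sel.modify(cfd, selectors.EVENT_WRITE,
               lambda: write_cb(cfd, src, state))

def write_cb(cfd, src, state):
    block = state["pending"] or src.read(BLOCK)
    if not block:                     # end of data: finish this client
        sel.unregister(cfd)
        cfd.close()
        return
    sent = cfd.send(block)            # may send only part of the block
    state["pending"] = block[sent:]

def run_once(timeout=0.1):
    """Wait for ready sockets and call the handler stored with each."""
    for key, _ in sel.select(timeout):
        key.data()
```

A driver would call `on_accept` on a listening socket and then call `run_once` repeatedly; no handler blocks, so one thread serves every client.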

Multithreaded vs. Async

Multithreaded:
• Hard to program
– Locking code
– Need to know what blocks
• Coordination explicit
• State stored on thread’s stack
– Memory allocation implicit
• Context switch may be expensive
• Multiprocessors

Asynchronous:
• Hard to program
– Callback code
– Need to know what blocks
• Coordination implicit
• State passed around explicitly
– Memory allocation explicit
• Lightweight context switch
• Uniprocessors

Coordination example

• Threaded server:
– Thread for network interface
– Interrupt wakes up network thread
– Buffer shared between server threads and network thread, protected by locks and condition variables
• Asynchronous I/O:
– Poll for packets
    • How often to poll?
– Or, interrupt generates an event
    • Be careful: disable interrupts when manipulating callback queue
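For the threaded case, the shared buffer between the network thread and the server threads might look like the following. It is a sketch using Python's `threading.Condition`; the `PacketBuffer` class, its capacity, and its method names are assumptions made for illustration.

```python
import threading
from collections import deque

class PacketBuffer:
    """Bounded buffer shared by the network thread and server threads."""

    def __init__(self, capacity=8):
        self.items = deque()
        self.capacity = capacity
        self.cond = threading.Condition()   # a lock plus a wait queue

    def put(self, packet):
        """Called by the network thread when a packet arrives."""
        with self.cond:
            while len(self.items) >= self.capacity:
                self.cond.wait()            # releases the lock while blocked
            self.items.append(packet)
            self.cond.notify_all()          # wake any waiting consumer

    def get(self):
        """Called by a server thread; blocks until a packet is available."""
        with self.cond:
            while not self.items:
                self.cond.wait()
            packet = self.items.popleft()
            self.cond.notify_all()          # wake a producer waiting for space
            return packet
```

The `while` loops around `wait()` recheck the condition after every wakeup, which is the standard way to use a condition variable.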

Scheduling: polling vs. interrupts

• Maintain peak performance under heavy load
– Interrupt model can lead to livelock
• Solution:
– Use interrupts under low load (good latency)
– Use polling under heavy load (good throughput)
    • Polling is typically more efficient than interrupts
– Fits naturally into asynchronous I/O model
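One way to picture the switch between the two modes is the small simulation below. It is not a driver and does not come from the slides. It assumes a scheme in which the first packet of a burst raises an interrupt, the handler turns interrupts off, and a polling loop drains the queue in bounded batches before turning interrupts back on. The `Receiver` class and the `quota` parameter are hypothetical.

```python
from collections import deque

class Receiver:
    """Simulated packet receiver that mixes interrupts and polling."""

    def __init__(self, quota=4):
        self.queue = deque()
        self.interrupts_enabled = True
        self.quota = quota              # max packets handled per poll
        self.interrupts_taken = 0
        self.processed = []

    def packet_arrives(self, pkt):
        """Called by the simulated network interface."""
        self.queue.append(pkt)
        if self.interrupts_enabled:
            self.interrupts_taken += 1
            self.interrupts_enabled = False   # switch to polling mode

    def poll(self):
        """Called from the main loop; does nothing while idle."""
        if self.interrupts_enabled:
            return
        for _ in range(self.quota):
            if not self.queue:
                break
            self.processed.append(self.queue.popleft())
        if not self.queue:
            self.interrupts_enabled = True    # idle again: back to interrupts
```

Under a burst, only the first packet costs an interrupt. The rest are picked up by polling, so time spent handling interrupts cannot starve packet processing.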

Other design issues

• Disk scheduling
– Elevator algorithm
• Memory management
– File system buffer cache
• Address spaces (VM management)
– Fault isolate different servers
• Efficient local communication?
• Efficient transfers between disk and networks
– Avoid copies
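As an illustration of the elevator algorithm mentioned under disk scheduling, the sketch below orders pending block numbers so that the head sweeps in one direction, then reverses to serve what is left. It is a simplified model that assumes all requests are known up front; the function name and parameters are illustrative.

```python
def elevator_order(requests, head, direction=1):
    """Return the order in which a sweeping disk head serves the requests.

    direction is +1 to start moving toward higher block numbers,
    -1 to start moving toward lower ones.
    """
    ahead = sorted(r for r in requests if (r - head) * direction >= 0)
    behind = sorted(r for r in requests if (r - head) * direction < 0)
    if direction > 0:
        # Sweep up through the requests ahead, then come back down.
        return ahead + behind[::-1]
    # Sweep down through the requests ahead, then come back up.
    return ahead[::-1] + behind

order = elevator_order([98, 183, 37, 122, 14, 124, 65, 67], head=53)
print(order)
```

Compared with serving requests in arrival order, the sweep avoids moving the head back and forth across the disk.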

More than one processor

• Problem: single machine may not scale to enough clients

• Solutions:
– Multiprocessors
    • Helps when CPU is bottleneck
– Server clusters
    • Helps when bandwidth between server and backbone is high
– Distributed server clusters
    • Helps when bandwidth between client and distant server is low

Clusters

• Naming transparency
– Server cluster transparent to client?
• Server selection
– Metrics: CPU load, presence of data
• Consistency
– Partition data
• Availability
– More processors can decrease reliability
– Replicate data (makes consistency more difficult)
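The server-selection metrics can be combined in many ways. The sketch below shows one possible policy, chosen here purely for illustration: prefer servers that already hold the requested video, and among those pick the one with the lowest CPU load. The record layout and the `select_server` name are assumptions.

```python
def select_server(servers, video):
    """Pick a server name: data presence first, then lowest CPU load."""
    best = min(servers,
               key=lambda s: (video not in s["videos"], s["load"]))
    return best["name"]

cluster = [
    {"name": "s1", "load": 0.20, "videos": {"a.mpg"}},
    {"name": "s2", "load": 0.70, "videos": {"a.mpg", "b.mpg"}},
    {"name": "s3", "load": 0.10, "videos": set()},
]
choice_a = select_server(cluster, "a.mpg")
choice_b = select_server(cluster, "b.mpg")
choice_c = select_server(cluster, "c.mpg")
print(choice_a, choice_b, choice_c)
```

When no server holds the video, the policy falls back to the least-loaded machine, which would then have to fetch the data from elsewhere.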

Distributed clusters

• Replication policies
• Data distribution
• Consistency
• Network monitoring and modeling
• Global load balancing

Tradeoff between accuracy, latency, and network load

Making it secure: access control

• Redo the design: don’t add security on afterward
– Firewalls: insecure and break many things
• CPU cycles are an issue
– A secure HTTP server can do about 10-20 connections a second
• Pulls in other global issues
– Name to key binding
– Key management infrastructure

Example summary

• Pipelining of disk and network requests
– Need a lot of sophisticated software infrastructure
• Replication for reliability and performance
– Need sophisticated protocols
• Difficult: We did it for one application
– What if data changes rapidly?
– Lack of abstractions!

6.894 lab: real systems

• Multi-finger (due next week)
– Asynchronous I/O
• HTTP proxy
– High-performance proxy
– Cache, consistency, etc.
• Open-ended file system project
– Research
