6.894: Distributed Operating System Engineering
Lecturers:
Frans Kaashoek
Robert Morris
TA:
Jinyang Li
www.pdos.lcs.mit.edu/6.894
Operating System
• Software that turns silicon into something useful
– Provides applications with a programming interface
– Manages hardware resources on behalf of applications
Distributed Operating System
• The holy grail: transparency
– Provide applications with a virtual machine consisting of many processors distributed around the network
• Distributed OS engineering is difficult:
– Failures
– High degree of concurrency
– Long latencies
– New classes of security attacks
Client/Server Architecture
• A modular architecture to structure distributed systems
– Clients request services from servers
– Clients and servers communicate with messages
– Servers are typically trusted
• Other architectures
– Peer-to-peer (decentralized)
– Single address space
6.894 topics
• Client-server components
– Remote procedure call, threads, address spaces, etc.
• Storage
– File systems, transactions
• Security
– Confidentiality, authentication, etc.
• Scalable servers
6.894 is an advanced 6.033
• Perform actual systems research
– Perform a research project
– Study recent research papers
• Design systems for real workloads
– New abstractions, protocols, data structures, algorithms, etc.
• Build a real system (lab)
– Real enough that you can use it
Internet video-on-demand server
• Example used to study the issues and give an overview of 6.894
• Requirements:
– Low- and high-quality video
– Many users, spread around the Internet
– Last-mile bandwidth may be low
– Access control
Client and server structure
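Client() {
  fd = connect("server");
  write(fd, "video.mpg");      // request a video by name
  while (!eof(fd)) {
    read(fd, buf);             // receive the next block
    display(buf);
  }
}

Server() {
  while (1) {
    cfd = accept();            // wait for the next client
    read(cfd, name);           // name of the requested file
    fd = open(name);
    while (!eof(fd)) {
      read(fd, block);         // server blocks on the disk
      write(cfd, block);       // server blocks on the network
    }
    close(cfd);
    close(fd);
  }
}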
Performance “analysis”
• Server capacity:
– Network (100 Mbit/s)
– Disk (20 Mbyte/s)
• Obtained performance: one client stream
• Server is limited by software structure
• If a video is 200 Kbit/s, server should be able to support more than one client.
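The gap between hardware capacity and "one client stream" can be checked with a back-of-the-envelope calculation. The Python sketch below assumes decimal units (1 Mbit = 1,000,000 bits, 1 Mbyte = 8 Mbit) and ignores protocol and disk-seek overhead; the function and constant names are illustrative, not from the lecture.

```python
def max_streams(capacity_bits_per_s: int, stream_bits_per_s: int) -> int:
    """Upper bound on concurrent streams that one resource can carry."""
    return capacity_bits_per_s // stream_bits_per_s


NETWORK_BPS = 100 * 1_000_000      # 100 Mbit/s network link
DISK_BPS = 20 * 1_000_000 * 8      # 20 Mbyte/s disk, expressed in bits/s
STREAM_BPS = 200 * 1_000           # one 200 Kbit/s video stream

network_limit = max_streams(NETWORK_BPS, STREAM_BPS)
disk_limit = max_streams(DISK_BPS, STREAM_BPS)
hardware_limit = min(network_limit, disk_limit)   # the tighter resource wins

print(network_limit, disk_limit, hardware_limit)
```

Under these assumptions the network saturates first, at roughly 500 streams, so a server that handles one client at a time leaves almost all of the hardware idle.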
Better single-server performance
• Goal: run at server’s hardware speed
– Disk or network should be bottleneck
• Method:
– Pipeline blocks of each request
– Multiplex requests from multiple clients
• Two implementation approaches:
– Multithreaded server
– Asynchronous I/O
Multithreaded server
server() {
  while (1) {
    cfd = accept();
    read(cfd, name);
    fd = open(name);
    while (!eof(fd)) {
      read(fd, block);
      write(cfd, block);
    }
    close(cfd);
    close(fd);
  }
}

for (i = 0; i < 10; i++)
  fork(server);                // start 10 server threads

• When waiting for I/O, thread scheduler runs another thread
• All shared data must be protected by locks
• Release locks when blocking
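The three rules above can be illustrated with a small Python sketch. It is not the lecture's server: requests come from an in-memory queue instead of accept(), blocking I/O is simulated with a short sleep, and the name run_threaded_server is made up for this example.

```python
import queue
import threading
import time


def run_threaded_server(requests, num_threads=10):
    """Serve every request with a fixed pool of server threads."""
    pending = queue.Queue()
    for name in requests:
        pending.put(name)

    served = []                        # shared data
    served_lock = threading.Lock()     # protects `served`

    def server():
        while True:
            try:
                name = pending.get_nowait()   # stands in for accept() + read name
            except queue.Empty:
                return
            time.sleep(0.01)           # simulated blocking I/O, no lock held here
            with served_lock:          # hold the lock only around shared state
                served.append(name)

    threads = [threading.Thread(target=server) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return served
```

While one thread sleeps in its simulated I/O, the others keep running, and because the lock is never held across the blocking call, a slow request does not stall the rest.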
Asynchronous I/O
struct callback {
  bool (*is_ready)();
  void (*handler)(void *arg);
  void *arg;
};

main() {
  while (1) {
    for (c = each callback) {
      if (c->is_ready())
        c->handler(c->arg);
    }
  }
}
• Code is structured as a collection of handlers
• Handlers are nonblocking
• Create new handlers for blocking operations
• When operation completes, call handler
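A minimal Python sketch of the same callback loop follows. It makes two assumptions that the slide leaves open: callbacks are one-shot (removed before their handler runs), and the loop stops when no callbacks remain. The class names are illustrative.

```python
from dataclasses import dataclass
from typing import Any, Callable, List


@dataclass
class Callback:
    is_ready: Callable[[], bool]
    handler: Callable[[Any], None]
    arg: Any


class EventLoop:
    def __init__(self) -> None:
        self.callbacks: List[Callback] = []

    def add(self, is_ready, handler, arg=None) -> None:
        self.callbacks.append(Callback(is_ready, handler, arg))

    def run_once(self) -> int:
        """One pass over the callback list; returns how many handlers fired."""
        fired = 0
        for c in list(self.callbacks):    # copy: handlers may add callbacks
            if c.is_ready():
                self.callbacks.remove(c)  # assumption: callbacks are one-shot
                c.handler(c.arg)
                fired += 1
        return fired

    def run(self) -> None:
        while self.callbacks:             # busy-polls; a real loop would block
            self.run_once()


log = []
data = ["block-1"]
loop = EventLoop()


def on_data(_arg):
    block = data.pop()
    log.append(f"read {block}")
    # The "blocking" write becomes a new handler instead of a blocking call.
    loop.add(lambda: True, lambda b: log.append(f"wrote {b}"), block)


loop.add(lambda: bool(data), on_data)
loop.run()
```

Each handler returns quickly and hands the next step to a newly registered callback, which is the structure the bullets describe.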
Asynchronous server
init() {
  on_accept(accept_cb);
}

accept_cb() {
  cfd = accept();
  on_readable(cfd, name_cb);
}

on_readable(fd, fn) {
  c = new callback(test_readable, fn, fd);
  add c to callback list;
}

name_cb(cfd) {
  read(cfd, name);
  fd = open(name);
  on_readable(fd, read_cb);
}

read_cb(cfd, fd) {
  read(fd, block);
  on_writable(cfd, write_cb);   // wait until the client socket can take the block
}

write_cb(cfd, fd) {
  write(cfd, block);
  on_readable(fd, read_cb);     // then read the next block from the file
}
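For comparison, here is a rough Python sketch of the same handler chain built on the standard selectors module. It simplifies heavily: the "disk" is an in-memory dictionary (so only the socket is asynchronous), the file name is assumed to arrive in a single read, one connection is served over a socket pair, and the helper names serve, run, and demo are invented for this example.

```python
import selectors
import socket
import threading

FILES = {b"video.mpg": b"v" * 10_000}   # hypothetical stand-in for the disk


def serve(conn, sel):
    """Register the callbacks that stream one requested file to conn."""
    conn.setblocking(False)
    state = {"data": b""}

    def name_cb():
        name = conn.recv(1024)          # assumes the name arrives in one read
        state["data"] = FILES.get(name, b"")
        sel.modify(conn, selectors.EVENT_WRITE, write_cb)

    def write_cb():
        if state["data"]:
            sent = conn.send(state["data"][:4096])   # may send fewer bytes
            state["data"] = state["data"][sent:]
        else:                           # everything sent: finish this client
            sel.unregister(conn)
            conn.close()

    sel.register(conn, selectors.EVENT_READ, name_cb)


def run(sel):
    """Event loop: wait for ready sockets and call their handlers."""
    while sel.get_map():
        for key, _mask in sel.select():
            key.data()


def demo():
    client, server_side = socket.socketpair()
    sel = selectors.DefaultSelector()
    serve(server_side, sel)
    loop = threading.Thread(target=run, args=(sel,))
    loop.start()
    client.sendall(b"video.mpg")
    received = b""
    while True:
        chunk = client.recv(4096)
        if not chunk:
            break
        received += chunk
    loop.join()
    client.close()
    sel.close()
    return received
```

The thread in demo() exists only so the test client and the event loop can run in one process; the server side itself is single-threaded and never blocks inside a handler.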
Multithreaded vs. Async
Multithreaded:
• Hard to program
– Locking code
– Need to know what blocks
• Coordination explicit
• State stored on thread’s stack
– Memory allocation implicit
• Context switch may be expensive
• Multiprocessors
Async:
• Hard to program
– Callback code
– Need to know what blocks
• Coordination implicit
• State passed around explicitly
– Memory allocation explicit
• Lightweight context switch
• Uniprocessors
Coordination example
• Threaded server:
– Thread for network interface
– Interrupt wakes up network thread
– Buffer shared between server threads and network thread, protected by locks and condition variables
• Asynchronous I/O
– Poll for packets
  • How often to poll?
– Or, interrupt generates an event
  • Be careful: disable interrupts when manipulating callback queue.
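The threaded side of this example, a buffer shared between the network thread and the server threads, is the classic bounded-buffer pattern. The Python sketch below shows one way to write it with a lock and two condition variables; the class name and capacity parameter are assumptions for illustration.

```python
import threading
from collections import deque


class PacketBuffer:
    """Bounded buffer shared by a network thread and server threads."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self.items = deque()
        self.lock = threading.Lock()
        # Both condition variables share the one lock that guards `items`.
        self.not_empty = threading.Condition(self.lock)
        self.not_full = threading.Condition(self.lock)

    def put(self, packet) -> None:
        """Called by the network thread when a packet arrives."""
        with self.not_full:
            while len(self.items) >= self.capacity:
                self.not_full.wait()      # releases the lock while waiting
            self.items.append(packet)
            self.not_empty.notify()

    def get(self):
        """Called by a server thread; blocks until a packet is available."""
        with self.not_empty:
            while not self.items:
                self.not_empty.wait()
            packet = self.items.popleft()
            self.not_full.notify()
            return packet
```

Waiting on a condition variable releases the lock, which is how "release locks when blocking" is honored here.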
Scheduling: polling vs. interrupts
• Maintain peak performance under heavy load
– Interrupt model can lead to livelock
• Solution:
– Use interrupts under low load (good latency)
– Use polling under heavy load (good throughput)
  • Polling is typically more efficient than interrupts
– Fits naturally into asynchronous I/O model
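One way to picture the hybrid scheme is as a mode switch driven by how much work is queued. The Python sketch below is only a model of that decision, not driver code: the two thresholds and the use of queue depth as the load signal are assumptions made for illustration.

```python
class HybridReceiver:
    """Chooses between interrupt-driven and polled packet reception."""

    def __init__(self, high_water: int = 8, low_water: int = 1) -> None:
        self.high_water = high_water   # hypothetical threshold: start polling
        self.low_water = low_water     # hypothetical threshold: resume interrupts
        self.mode = "interrupt"

    def update(self, queue_depth: int) -> str:
        """Pick the mode for the next interval given the current backlog."""
        if self.mode == "interrupt" and queue_depth >= self.high_water:
            self.mode = "poll"         # heavy load: stop taking interrupts
        elif self.mode == "poll" and queue_depth <= self.low_water:
            self.mode = "interrupt"    # load has drained: favor latency again
        return self.mode
```

Using two thresholds rather than one keeps the receiver from flipping modes on every packet when the load hovers near the boundary.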
Other design issues
• Disk scheduling
– Elevator algorithm
• Memory management
– File system buffer cache
• Address spaces (VM management)
– Fault isolate different servers
• Efficient local communication?
• Efficient transfers between disk and networks
– Avoid copies
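As an illustration of the disk-scheduling item, the Python sketch below orders pending requests the way an elevator scheduler would. It implements the variant that reverses at the last pending request rather than at the edge of the disk (often called LOOK); the function names and the sample request list are illustrative.

```python
def elevator_order(requests, head, moving_up=True):
    """Order pending block requests as an elevator-style scheduler would."""
    if moving_up:
        first = sorted(r for r in requests if r >= head)
        second = sorted((r for r in requests if r < head), reverse=True)
    else:
        first = sorted((r for r in requests if r <= head), reverse=True)
        second = sorted(r for r in requests if r > head)
    return first + second


def total_seek(order, head):
    """Total head movement needed to service requests in the given order."""
    distance = 0
    for r in order:
        distance += abs(r - head)
        head = r
    return distance


pending = [98, 183, 37, 122, 14, 124, 65, 67]
order = elevator_order(pending, head=53)
print(order, total_seek(order, 53), total_seek(pending, 53))
```

For this sample queue, one sweep up and one sweep down moves the head 299 blocks, compared with 640 when requests are served in arrival order.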
More than one processor
• Problem: single machine may not scale to enough clients
• Solutions:
– Multiprocessors
  • Helps when CPU is bottleneck
– Server clusters
  • Helps when bandwidth between server and backbone is high
– Distributed server clusters
  • Helps when bandwidth between client and distant server is low
Clusters
• Naming transparency
– Server cluster transparent to client?
• Server selection
– Metrics: CPU load, presence of data
• Consistency
– Partition data
• Availability
– More processors can decrease reliability
– Replicate data (makes consistency more difficult)
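The server-selection metrics above can be combined in many ways. The Python sketch below shows one possible policy, chosen here purely for illustration: prefer servers that already hold the requested video, and break ties by lowest CPU load. The Server type and its fields are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Server:
    name: str
    cpu_load: float                      # 0.0 (idle) to 1.0 (saturated)
    videos: set = field(default_factory=set)


def select_server(servers, video):
    """Prefer servers that hold the video, then the least-loaded one."""
    if not servers:
        raise ValueError("no servers available")
    # False sorts before True, so servers holding the data come first.
    return min(servers, key=lambda s: (video not in s.videos, s.cpu_load))


cluster = [
    Server("a", 0.2, {"other.mpg"}),
    Server("b", 0.1),
    Server("c", 0.5, {"video.mpg"}),
    Server("d", 0.3, {"video.mpg"}),
]
chosen = select_server(cluster, "video.mpg")
fallback = select_server(cluster, "missing.mpg")
```

With this policy a lightly loaded server without the data loses to a busier one that has it, which trades CPU balance for fewer data transfers; other weightings are equally plausible.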
Distributed clusters
• Replication policies
• Data distribution
• Consistency
• Network monitoring and modeling
• Global load balancing
Tradeoff between accuracy, latency, and network load
Making it secure: access control
• Redo design: don’t add security on afterward
– Firewalls: insecure and break many things
• CPU cycles are an issue
– A secure HTTP server can do about 10-20 connections a second
• Pulls in other global issues
– Name to key binding
– Key management infrastructure
Example summary
• Pipelining of disk and network requests
– Need a lot of sophisticated software infrastructure
• Replication for reliability and performance
– Need sophisticated protocols
• Difficult: We did it for one application
– What if data changes rapidly?
– Lack of abstractions!
6.894 lab: real systems
• Multi-finger (due next week)
– Asynchronous I/O
• HTTP proxy
– High-performance proxy
– Cache, consistency, etc.
• Open-ended file system project
– Research