Distributed Systems 2006 Course Summary About Exam.

Course Summary

Summary of
– Fundamental Concepts
– Communication
– Remote Procedure Call
– Client/Server Computing
– Web Services
– Distributed Time and Commit
– Static and Dynamic Membership
– Group Communication
– Virtual Synchrony
– Retrofitting Reliability

"Executive summary" style
– One slide per topic...


Aim of the course

The main aim of the course is to introduce fundamental concepts and techniques for distributed systems.

The course gives students the prerequisites to analyse, design, and implement distributed systems.


Fundamental Concepts

Definitions
– Process: a program running on a computer
– Message: arbitrary-sized data used to communicate between processes
– Network: a collection of computers interconnected by hardware that supports message passing
– Distributed system: processes that coordinate actions through message exchange, usually over a network
– Protocol: an algorithm by which processes cooperate through message exchange

Motivation
– Resource sharing, collaboration, ...
– Need for performance, availability, fault tolerance, ...

Characteristics
– Concurrent processes
– Message exchange
– No global clock
– Independent failures
  • Reliability: components of reliability, types of failures

Models
– Real-world networks
– Asynchronous model
– Synchronous model

[Figure: layered stack with the labels Application, Middleware, Operating System, Communication System]


Communication

Open Systems Interconnection (OSI) stack
– Messages travel conceptually between layers on the same level

OSI Details
– Physical layer: electrical and physical means for communicating (e.g., parts of Ethernet)
– Data link layer: transit of data across physical link
– Network layer: logical network layout
– Transport layer: segments data
– Session layer: coordinates request and response sessions between applications
– Presentation layer: encoding and decoding of application data
– Application layer

Networking hardware
– Ethernet, routers, switches/bridges

Protocols
– The end-to-end argument
– Addressing
– IP, UDP, TCP, HTTP


Remote Procedure Call

Remote Procedure Call (RPC)
– Mask distributed communication using transparent mechanism akin to local procedure calls
– But new kinds of exceptions
– Handling of complex data

RPC protocols
– Basic, assuming no failures
– "Reliable" RPC
  • Message loss, disorder, duplication
  • Machine failure

RPC semantics
– At most once
– Exactly once (impossible)
– At least once

Java RMI
– Basic mechanisms
– RMI wire protocol

Threading and RPC
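The difference between at-least-once and at-most-once semantics can be illustrated with a small sketch. It assumes the client retransmits with the same request ID and the server caches replies by ID; the names `AtMostOnceServer`, `call_with_retries`, and `withdraw` are hypothetical and not taken from the course material or from Java RMI.

```python
class AtMostOnceServer:
    """Executes each request ID at most once by caching replies."""

    def __init__(self, handler):
        self.handler = handler   # the actual remote procedure
        self.replies = {}        # request_id -> cached reply
        self.executions = 0      # how many times the handler really ran

    def handle(self, request_id, *args):
        # A retransmitted request is answered from the cache, not re-executed.
        if request_id in self.replies:
            return self.replies[request_id]
        self.executions += 1
        reply = self.handler(*args)
        self.replies[request_id] = reply
        return reply


def call_with_retries(server, request_id, args, lost_replies=1):
    """Client side: resend the same request until a reply gets through.

    The first `lost_replies` replies are treated as lost (a simulated fault).
    """
    attempts = 0
    while True:
        attempts += 1
        reply = server.handle(request_id, *args)
        if attempts > lost_replies:
            return reply, attempts


balance = {"value": 100}

def withdraw(amount):
    # A non-idempotent operation: running it twice would be wrong.
    balance["value"] -= amount
    return balance["value"]


server = AtMostOnceServer(withdraw)
reply, attempts = call_with_retries(server, "req-1", (30,))
print(reply, attempts, server.executions)
```

Without the reply cache, the retransmission would run `withdraw` a second time, which is the at-least-once behaviour.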


Client/Server Computing

Characteristics
– Clients use resources of server
– Little communication among clients
– Client to server ratio high

Central concepts
– Caching: stale/consistent, incoherent/coherent
– Stateless/stateful

Archetypical examples
– Distributed file systems
  • Mimicking local file system
  • SUN NFS as stateless example
– Transactional databases
  • Transaction model: ACID
  • Serialization
  • Concurrency control

[Figure labels: Application, Data Cache, Buffer Pool, File Store, Primary server, Database]


Web Services

Web architecture
– Applications, proxies, servers

Relevant web technologies
– XML
  • Well-formedness, validity, ...
– HTTP
  • GET, HEAD, POST, PUT, ...

Web services
– Interoperability as main motivation
– RPC via
  • WSDL: interface description via service definition and service binding
  • SOAP: defining XML messages
  • HTTP: transport protocol

Performance
– XML vs binary encodings
– FastWS as a potential speed-up

Reliability
– Web security
– WS_RELIABILITY (including WS_TRANSACTIONS) via intermediaries
– Unbreakable stream connections
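As a rough illustration of "SOAP: defining XML messages" and of XML well-formedness, the sketch below builds a SOAP 1.1 style envelope with Python's standard library and checks whether a string parses as XML. The operation name `GetQuote` and its `symbol` parameter are made up for the example; a real service would take them from its WSDL description, and the HTTP transport step is left out.

```python
import xml.etree.ElementTree as ET

# Namespace of the SOAP 1.1 envelope.
SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"


def build_request(operation, params):
    """Wrap an operation call and its parameters in a SOAP envelope."""
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    op = ET.SubElement(body, operation)
    for name, value in params.items():
        ET.SubElement(op, name).text = str(value)
    return ET.tostring(envelope, encoding="unicode")


def is_well_formed(text):
    """True if the text parses as XML (well-formedness only, not validity)."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


request = build_request("GetQuote", {"symbol": "IBM"})  # hypothetical operation
print(request)
print(is_well_formed(request), is_well_formed("<a><b></a>"))
```

The verbosity of the printed message, compared with a few bytes for a binary encoding of the same call, is the performance concern noted on the slide.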


Distributed Time and Commit

Distributed time
– No concept of global time possible
  • (Can come close with GPS time, though)
  • Need logical clocks in many protocols
– Lamport’s happens-before relation
  • Can order locally for process
  • Send(a) happens-before receive(a)
  • Consistent cuts: "now"
– Supporting logical clocks
  • Logical timestamp: increment integer and send it with messages
  • Vector timestamps: increment integer per process
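The two kinds of logical clocks listed above can be sketched in a few lines. This is a minimal illustration that assumes processes are numbered 0..n-1 and that timestamps travel with messages; the class and function names are chosen for the example only.

```python
class LamportClock:
    """Single integer: increment on each event, take the max on receive."""

    def __init__(self):
        self.time = 0

    def tick(self):
        self.time += 1
        return self.time

    def send(self):
        # Sending is an event; the returned value is attached to the message.
        return self.tick()

    def receive(self, msg_time):
        self.time = max(self.time, msg_time) + 1
        return self.time


class VectorClock:
    """One integer per process; entry `pid` counts this process's own events."""

    def __init__(self, pid, n):
        self.pid = pid
        self.v = [0] * n

    def tick(self):
        self.v[self.pid] += 1
        return list(self.v)

    def send(self):
        return self.tick()

    def receive(self, msg_v):
        # Component-wise maximum, then count the receive event itself.
        self.v = [max(a, b) for a, b in zip(self.v, msg_v)]
        self.v[self.pid] += 1
        return list(self.v)


def happens_before(a, b):
    """Vector timestamp a precedes b: a <= b component-wise and a != b."""
    return a != b and all(x <= y for x, y in zip(a, b))


def concurrent(a, b):
    return a != b and not happens_before(a, b) and not happens_before(b, a)


p, q = LamportClock(), LamportClock()
t_send = p.send()
t_recv = q.receive(t_send)

vp, vq = VectorClock(0, 2), VectorClock(1, 2)
e_local = vq.tick()          # a local event at process 1
e_send = vp.send()           # process 0 sends a message
e_recv = vq.receive(e_send)  # process 1 receives it
print(t_send, t_recv, e_local, e_send, e_recv)
```

A Lamport timestamp only guarantees that a send is numbered below its receive; vector timestamps additionally let a process tell that `e_local` and `e_send` are concurrent.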

Distributed commit
– Two-phase commit (2PC)
  • 1) Transaction manager asks data managers for vote
  • 2) Commit if all ready, abort if not
  • Garbage collection, handling failure
– Three-phase commit (3PC)
  • 2PC has "bad state": transaction manager + process fail just after process learns commit; what should others do?
  • Add extra "prepared to commit" state: all processes may learn outcome of vote before actual commit
  • Need accurate failure detection
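The two phases of 2PC can be sketched as follows. This is a failure-free illustration only: there is no logging, timeout handling, or garbage collection, and the names `Participant` and `two_phase_commit` are assumptions made for the example.

```python
class Participant:
    """A data manager that votes in phase 1 and obeys the decision in phase 2."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.state = "initial"

    def prepare(self):
        # Vote: "ready" means the participant promises it is able to commit.
        self.state = "ready" if self.can_commit else "aborted"
        return self.can_commit

    def commit(self):
        self.state = "committed"

    def abort(self):
        self.state = "aborted"


def two_phase_commit(participants):
    """Transaction manager: collect votes, then broadcast the outcome."""
    # Phase 1: ask every data manager for its vote.
    votes = [p.prepare() for p in participants]
    decision = "commit" if all(votes) else "abort"
    # Phase 2: commit only if all voted ready, otherwise abort everywhere.
    for p in participants:
        if decision == "commit":
            p.commit()
        else:
            p.abort()
    return decision


ok_group = [Participant("a"), Participant("b"), Participant("c")]
bad_group = [Participant("a"), Participant("b", can_commit=False)]
print(two_phase_commit(ok_group), two_phase_commit(bad_group))
```

The "bad state" mentioned on the slide arises between the two phases: a participant in state `ready` cannot decide on its own if the manager disappears.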


Static and Dynamic Membership

Which processes are available in a distributed system?
– Static membership
  • Resolve liveness of potential list of members on a per-operation basis
  • Slow, simple
– Dynamic membership
  • Use group membership protocol to track members in the form of views
  • Performant, complex

Static: quorum update and read
  • A quorum read/write should intersect prior quorum write in at least one process to ensure that latest value is read/written
  • Qr + Qw > N, Qw + Qw > N
  • Uses a 2PC protocol for update
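The quorum conditions can be checked with a short worked example. The sketch below assumes versioned replicas and deliberately picks disjoint-looking replica subsets to show that they still overlap; the 2PC step used for real updates is omitted, and all names are hypothetical.

```python
def valid_quorums(n, qr, qw):
    """Read/write and write/write quorums must overlap in at least one replica."""
    return qr + qw > n and qw + qw > n


class Replica:
    def __init__(self):
        self.version = 0
        self.value = None


def quorum_write(replicas, qw, value):
    # Assumption for the sketch: the first qw replicas answered.
    targets = replicas[:qw]
    # The write quorum overlaps the previous write quorum, so the highest
    # version seen here is the latest one.
    version = max(r.version for r in targets) + 1
    for r in targets:
        r.version, r.value = version, value
    return version


def quorum_read(replicas, qr):
    # Assumption for the sketch: the last qr replicas answered.
    targets = replicas[-qr:]
    newest = max(targets, key=lambda r: r.version)
    return newest.value, newest.version


N, QR, QW = 5, 3, 3
replicas = [Replica() for _ in range(N)]
quorum_write(replicas, QW, "x")
quorum_write(replicas, QW, "y")
print(valid_quorums(N, QR, QW), quorum_read(replicas, QR))
```

With N = 5 and Qr = Qw = 3, the write touches replicas 0 to 2 and the read touches replicas 2 to 4, so replica 2 carries the latest version to the reader. With Qr = 2 the condition Qr + Qw > N fails and the read could miss the write entirely.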

Dynamic: Group Membership Service (GMS)
– Tracks a set of processes: join, leave, shun (accept false positives)
– GMS service itself needs to be fault tolerant
  • Designate leader (oldest process) running 2PC protocol for view change
  • Run 3PC if leader fails
– JGroups as an example of a Java toolkit supporting a GMS


Group Communication

Multicast primitive
– Deliver message to members of a process group

Unreliable multicast
– E.g., using UDP multicast

Reliable multicast
– Non-uniform, failure-atomic
  • Use acks to guide retransmits
  • Use GMS for failure handling
– Dynamically uniform, failure-atomic
  • Additional round for ensuring all have message
– Flush
  • Ensure all members have seen same messages when view changes

Ordered multicast
– fbcast/FIFO
  • Number multicasts, handle failure
– cbcast
  • Deliver in causal order, use vector timestamps
– abcast
  • E.g., using a token to designate a sequencer of messages
– gbcast
  • "abcast + flush"
– Safety vs speed

Programming with multicast

[Figure: Server A, Server B, Server C]
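The cbcast idea of delivering in causal order with vector timestamps can be sketched from the receiver's side. The sketch assumes each multicast carries a vector that counts multicasts per sender, and that a message is held back until every multicast it causally depends on has been delivered; failure handling and view changes are left out, and the class name is hypothetical.

```python
class CausalReceiver:
    """Delays multicasts until their causal predecessors have been delivered."""

    def __init__(self, n):
        self.delivered = [0] * n  # multicasts delivered so far, per sender
        self.pending = []         # (sender, timestamp, payload) held back
        self.log = []             # payloads in delivery order

    def _deliverable(self, sender, ts):
        # Next multicast from this sender, and nothing it depends on is missing.
        return (ts[sender] == self.delivered[sender] + 1 and
                all(ts[k] <= self.delivered[k]
                    for k in range(len(ts)) if k != sender))

    def receive(self, sender, ts, payload):
        self.pending.append((sender, ts, payload))
        progress = True
        while progress:
            progress = False
            for msg in list(self.pending):
                s, t, p = msg
                if self._deliverable(s, t):
                    self.pending.remove(msg)
                    self.delivered[s] += 1
                    self.log.append(p)
                    progress = True


# Process 0 multicasts m1; process 1 delivers m1 and then multicasts m2.
# Process 2 happens to receive m2 first.
r = CausalReceiver(3)
r.receive(1, [1, 1, 0], "m2")  # held back: depends on m1 from process 0
r.receive(0, [1, 0, 0], "m1")  # delivered, which then releases m2
print(r.log)
```

An fbcast receiver would only check the first condition (the per-sender sequence number), which is why it is cheaper but gives a weaker ordering.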


Virtual Synchrony

Elements of virtual synchrony
– Process groups with identical views and ranking
– Ordered, reliable multicast
– View-synchronous delivery
– Gap-freedom
– State transfer when joining

Main insights
– Execution looks synchronous/state machine-like, but free to optimize
– Correlated failures less likely

Uses of virtual synchrony
– Election: choose oldest process as leader
– Consensus: leader multicasts question, all reply, leader chooses (compare FLP)
– Consistent snapshot: start multicast, done multicast; use cbcast
– Replicated data: read, update, lock operations; abcast vs cbcast
– Load-balancing: group decides vs client decides
– Fault tolerance: e.g., primary-backup group of servers
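Because all members see identical, ranked views, election needs no extra messages: every process applies the same rule to the same view. The sketch below assumes a view is a list of process names ordered oldest first; the function names are hypothetical and do not correspond to the JGroups API.

```python
def leader(view):
    """Every member computes the same leader: the oldest process in the view."""
    return view[0]


def next_view(view, failed=(), joined=()):
    """Drop failed members and append joiners, preserving the age ranking."""
    survivors = [p for p in view if p not in failed]
    return survivors + [p for p in joined if p not in survivors]


view1 = ["p1", "p2", "p3"]
view2 = next_view(view1, failed={"p1"}, joined=["p4"])
print(leader(view1), view2, leader(view2))
```

The same rule gives primary-backup fault tolerance: when the primary leaves the view, the next-oldest member takes over.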


Retrofitting Reliability

Toolkit approach
– Provide reliability tools via a toolkit directly to programmers
– JGroups supporting virtual synchrony as example
– Mixed experiences
  • Tricky to work with
  • E.g., JGroups hidden in JBoss

Wrappers
– Transparent wrapping of reliability tools
– Wrapping services, data, communication channels

Examples
– CORBA fault tolerance
– Unbreakable TCP
– Reliable Distributed Shared Memory


Meta summary

(not the complete picture, though...)

Group Membership

Fault-tolerant multicast

Ordered multicast

Toolkits and wrappers

Robust Web Services

2PC and 3PC

Communication


About Exam

Curriculum
Exam questions
Trivial advice
Where and when


Curriculum

[Birman, 2005]
– Chapters 1, 2, 3, 4, 5, 8, 9, 10, 12, 13, 14
– Chapter 15 except Sections 15.2 and 15.3
– Chapter 16 except Section 16.2.5
– Chapter 18 except Section 18.2
– Chapter 20

Slides from the lectures
Solutions to hand-ins A1 to A5


Exam Questions

1. Fundamental Concepts
2. Communication
3. Remote Procedure Call
4. Client/Server Computing
5. Web Services
6. Distributed Time and Commit
7. Static and Dynamic Membership
8. Group Communication
9. Virtual Synchrony
10. Retrofitting Reliability


Exam

Exam form
– Oral
– No preparation
– 13-grade scale
– Internal censor (Jonas Thomsen)

Process at exam
– Pick question
– Present what you know about the question (for ca. 12 minutes)
– Discuss with examiner and censor
– Max 20 minutes altogether


Trivial advice – before the exam

Read and practice
Create outlines for each exam question
– (The summaries given here are not necessarily good ones…)
– Map literature to questions
– Practice exam questions in groups
  • Pick questions, hold “test exams”
  • Fidelity, fix-points

Go watch other exams if possible


Trivial advice – during the exam

Pick question, don’t panic, look at your outline, put it away

Use whiteboard as much as possible
– Structures your presentation and helps the subsequent evaluation

Use transparencies of diagrams (max 2) for figures

Listen to (and think about) questions from examiner and censor before answering

Don’t drink fizzy drinks


Trivial advice – after the exam

Unacceptable (00, 03, 5)
– Unable to present particular question
– Very limited knowledge and operational skills in topic
– Unable to distinguish relevant and irrelevant
– Insecure use of knowledge and skills on known problems

Acceptable (6)
– Able to present particular question
– Knowledge and operational skills in topic
– Understanding of essence of course

Mean (7, 8, 9)
– Good presentation of particular question
– Able to cover (and argue for) most relevant parts of particular questions
– Good knowledge and operational skills in topic

Very good (10, 11, 13)
– Very good presentation of particular question
– Extensive, thorough, and secure knowledge and operational skills in topic
– Able to distinguish relevant from irrelevant
– Able to combine concepts, methods, information from different parts of the topic


Where and when

Shannon-157, Finlandsgade 24 A
Week 13
– Tuesday 2006-03-28 to Friday 2006-03-31
– Starting at 9:00
– Exam lists are at "Grædemuren" in Ny Munkegade

http://ereception.katrinebjerg.net/ereception/browser.jsp?s=shannon%20157


Questions?
