Distributed Systems 2006 Course Summary About Exam.

Distributed Systems 2006

Course Summary

About Exam

Course Summary

  Summary of
    – Fundamental Concepts
    – Communication
    – Remote Procedure Call
    – Client/Server Computing
    – Web Services
    – Distributed Time and Commit
    – Static and Dynamic Membership
    – Group Communication
    – Virtual Synchrony
    – Retrofitting Reliability

  “Executive summary” style
    – One slide per topic...

Aim of the course

  The main aim of the course is to introduce fundamental concepts and techniques for distributed systems

  The course gives the students prerequisites to analyse, design, and implement distributed systems.

Fundamental Concepts

  Definitions
    – Process: a program running on a computer
    – Message: arbitrary-sized data used to communicate between processes
    – Network: a collection of computers interconnected by hardware that supports message passing
    – Distributed system: processes that coordinate actions through message exchange, usually over a network
    – Protocol: an algorithm by which processes cooperate through message exchange

  Motivation
    – Resource sharing, collaboration, ...
    – Need for performance, availability, fault tolerance, ...

  Characteristics
    – Concurrent processes
    – Message exchange
    – No global clock
    – Independent failures
      • Reliability: components of reliability, types of failures

  Models
    – Real-world networks
    – Asynchronous model
    – Synchronous model

  Open Systems Interconnection (OSI) stack

  [Figure: layers Application, Middleware, Operating System, Communication System]

Communication

  Open Systems Interconnection (OSI) stack
    – Messages travel conceptually between layers on the same level

  OSI Details
    – Physical layer: electrical and physical means for communicating (e.g., parts of Ethernet)
    – Data link layer: transit of data across physical link
    – Network layer: logical network layout
    – Transport layer: segments data
    – Session layer: coordinates request and response sessions between applications
    – Presentation layer: encoding and decoding of application data
    – Application layer

  Networking hardware
    – Ethernet, routers, switches/bridges

  Protocols
    – The end-to-end argument
    – Addressing
    – IP, UDP, TCP, HTTP

Remote Procedure Call

  Remote Procedure Call (RPC)
    – Mask distributed communication using a transparent mechanism akin to local procedure calls
    – But new kinds of exceptions
    – Handling of complex data

  RPC protocols
    – Basic, assuming no failures
    – “Reliable” RPC
      • Message loss, disorder, duplication
      • Machine failure

  RPC semantics
    – At most once
    – Exactly once (impossible)
    – At least once

  Java RMI
    – Basic mechanisms
    – RMI wire protocol

  Threading and RPC
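
The RPC semantics listed above can be illustrated with a small sketch. It is not taken from the course material: `AtMostOnceServer` and `call_with_retries` are hypothetical names, and message loss is simulated in-process rather than over a network. The client retransmits under the same request id (at-least-once delivery of the request), while the server filters duplicates so the handler runs at most once per request id.

```python
class AtMostOnceServer:
    """Runs the handler at most once per request id; duplicates get the cached reply."""

    def __init__(self, handler):
        self.handler = handler
        self.replies = {}       # request id -> reply already computed
        self.executions = 0     # how many times the handler actually ran

    def handle(self, request_id, payload):
        if request_id in self.replies:
            # Duplicate request (a retransmission): do not execute again.
            return self.replies[request_id]
        self.executions += 1
        reply = self.handler(payload)
        self.replies[request_id] = reply
        return reply


def call_with_retries(server, request_id, payload, drop_first_n_replies=0, max_attempts=5):
    """Client side: resend the same request id until a reply gets through.

    drop_first_n_replies simulates replies lost by the network (an assumption
    made for this sketch only).
    """
    for attempt in range(max_attempts):
        reply = server.handle(request_id, payload)
        if attempt < drop_first_n_replies:
            continue  # pretend the reply was lost, so the client times out and retries
        return reply
    raise TimeoutError("no reply after %d attempts" % max_attempts)
```

Without the reply cache, the same retry loop would give at-least-once behaviour: the handler would run once per retransmission.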

Client/Server Computing

  Characteristics
    – Clients use resources of server
    – Little communication among clients
    – Client to server ratio high

  Central concepts
    – Caching: stale/consistent, incoherent/coherent
    – Stateless/stateful

  Archetypical examples
    – Distributed file systems
      • Mimicking local file system
      • SUN NFS as stateless example
    – Transactional databases
      • Transaction model: ACID
      • Serialization
      • Concurrency control

  [Figure: labels Application, Data Cache, Buffer Pool, File Store, Primary server, Database]

Web Services

  Web architecture
    – Applications, proxies, servers

  Relevant web technologies
    – XML
      • Well-formedness, validity, ...
    – HTTP
      • GET, HEAD, POST, PUT, ...

  Web services
    – Interoperability as main motivation
    – RPC via
      • WSDL: interface description via service definition and service binding
      • SOAP: defining XML messages
      • HTTP: transport protocol

  Performance
    – XML vs binary encodings
    – FastWS as a potential speed-up

  Reliability
    – Web security
    – WS_RELIABILITY (including WS_TRANSACTIONS) via intermediaries
    – Unbreakable stream connections

Distributed Time and Commit

  Distributed time
    – No concept of global time possible
      • (Can come close with GPS time, though)
      • Need logical clocks in many protocols
    – Lamport’s happens-before relation
      • Can order locally for process
      • Send(a) happens-before receive(a)
      • Consistent cuts: “now”
    – Supporting logical clocks
      • Logical timestamp: increment integer and send it with messages
      • Vector timestamps: increment integer per process
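
As a minimal sketch of the two kinds of logical clocks named above (class and function names are illustrative, not from the course material): a Lamport clock keeps one integer per process, a vector clock keeps one integer per process in the group, and only vector timestamps let us tell concurrent events apart.

```python
class LamportClock:
    """Logical timestamp: a single integer, incremented on every event."""

    def __init__(self):
        self.time = 0

    def tick(self):
        self.time += 1
        return self.time

    def send(self):
        # The returned timestamp travels with the message.
        return self.tick()

    def receive(self, msg_time):
        # Jump past the sender's timestamp so send(a) is ordered before receive(a).
        self.time = max(self.time, msg_time) + 1
        return self.time


class VectorClock:
    """Vector timestamp: one counter per process, indexed by process id."""

    def __init__(self, pid, n):
        self.pid = pid
        self.v = [0] * n

    def tick(self):
        self.v[self.pid] += 1
        return list(self.v)

    def send(self):
        return self.tick()

    def receive(self, other):
        # Component-wise maximum, then count the receive event itself.
        self.v = [max(a, b) for a, b in zip(self.v, other)]
        self.v[self.pid] += 1
        return list(self.v)


def happened_before(a, b):
    """True if vector timestamp a is causally before b."""
    return all(x <= y for x, y in zip(a, b)) and a != b
```

Two events are concurrent when `happened_before` is false in both directions, which a single Lamport integer cannot express.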

  Distributed commit
    – Two-phase commit (2PC)
      • 1) Transaction manager asks data managers for vote
      • 2) Commit if all ready, abort if not
      • Garbage collection, handling failure
    – Three-phase commit (3PC)
      • 2PC has a “bad state”: the transaction manager and a process both fail just after that process learns of the commit. What should the others do?
      • Add an extra “prepared to commit” state, so that all processes may learn the outcome of the vote before the actual commit
      • Need accurate failure detection
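
The two phases of 2PC can be sketched as follows. This is a failure-free, in-process illustration only; `DataManager` and `two_phase_commit` are hypothetical names, and the garbage collection and failure handling mentioned on the slide are deliberately left out.

```python
class DataManager:
    """A participant that votes in phase 1 and applies the decision in phase 2."""

    def __init__(self, name, will_vote_ready=True):
        self.name = name
        self.will_vote_ready = will_vote_ready
        self.state = "initial"

    def prepare(self):
        # Phase 1: vote. A "ready" vote is a promise to commit if asked to.
        self.state = "ready" if self.will_vote_ready else "aborted"
        return self.will_vote_ready

    def commit(self):
        self.state = "committed"

    def abort(self):
        self.state = "aborted"


def two_phase_commit(managers):
    """Transaction manager: collect all votes, then broadcast the decision."""
    # Phase 1: ask every data manager for its vote.
    votes = [m.prepare() for m in managers]
    # Phase 2: commit only if all voted ready, otherwise abort everywhere.
    decision = "commit" if all(votes) else "abort"
    for m in managers:
        if decision == "commit":
            m.commit()
        else:
            m.abort()
    return decision
```

The “bad state” arises between the two phases: a participant in state `ready` cannot decide on its own if the transaction manager disappears, which is the gap 3PC tries to close.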

Static and Dynamic Membership

  Which processes are available in a distributed system?
    – Static membership
      • Resolve liveness of potential list of members on a per-operation basis
      • Slow, simple
    – Dynamic membership
      • Use group membership protocol to track members in the form of views
      • Performant, complex

  Static: quorum update and read
    • A quorum read/write should intersect the prior quorum write in at least one process to ensure that the latest value is read/written
    • Qr + Qw > N, Qw + Qw > N
    • Uses a 2PC protocol for update
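
A small sketch can check why the two inequalities above guarantee intersection. The function names are illustrative; the brute-force check enumerates every possible pair of quorums among N processes, so it is only meant for small N.

```python
from itertools import combinations


def valid_quorum_sizes(n, qr, qw):
    """The two conditions from the slide: Qr + Qw > N and Qw + Qw > N."""
    return qr + qw > n and qw + qw > n


def quorums_always_intersect(n, size_a, size_b):
    """Brute force: does every quorum of size_a share a process with every quorum of size_b?"""
    procs = range(n)
    return all(
        set(a) & set(b)
        for a in combinations(procs, size_a)
        for b in combinations(procs, size_b)
    )
```

For example, with N = 5, Qr = 2 and Qw = 4 satisfy both conditions, while Qr = 2 and Qw = 3 do not: a read of processes {0, 1} can miss a write to {2, 3, 4}.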

  Dynamic: Group Membership Service (GMS)
    – Tracks a set of processes: join, leave, shun (accept false positives)
    – GMS service itself needs to be fault tolerant
      • Designate leader (oldest process) running 2PC protocol for view change
      • Run 3PC if leader fails
    – JGroups as an example of a Java toolkit supporting a GMS

Group Communication

  Multicast primitive
    – Deliver message to members of a process group

  Unreliable multicast
    – E.g., using UDP multicast

  Reliable multicast
    – Non-uniform, failure-atomic
      • Use acks to guide retransmits
      • Use GMS for failure handling
    – Dynamically uniform, failure-atomic
      • Additional round for ensuring all have message
    – Flush
      • Ensure all members have seen same messages when view changes

  Ordered multicast
    – fbcast/FIFO
      • Number multicasts, handle failure
    – cbcast
      • Deliver in causal order, use vector timestamps
    – abcast
      • E.g., using a token to designate a sequencer of messages
    – gbcast
      • “abcast + flush”
    – Safety vs speed

  Programming with multicast

  [Figure: Server A, Server B, Server C]
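
The cbcast bullet (“deliver in causal order, use vector timestamps”) can be sketched with the usual delivery rule: a multicast from sender s with timestamp vt is deliverable once it is the next multicast expected from s and every multicast it causally depends on has already been delivered. `CausalReceiver` is a hypothetical name, and the sketch covers only the receive side of one process, with no failures or view changes.

```python
class CausalReceiver:
    """Receive side of a causally ordered multicast for one group member."""

    def __init__(self, n):
        self.delivered = [0] * n   # delivered[k]: multicasts from process k delivered so far
        self.pending = []          # received but not yet deliverable
        self.log = []              # payloads in delivery order

    def _deliverable(self, sender, vt):
        # Next multicast from the sender, and nothing it depends on is missing.
        return vt[sender] == self.delivered[sender] + 1 and all(
            vt[k] <= self.delivered[k] for k in range(len(vt)) if k != sender
        )

    def receive(self, sender, vt, payload):
        """Buffer the message, then deliver everything that has become deliverable."""
        self.pending.append((sender, vt, payload))
        progress = True
        while progress:
            progress = False
            for msg in list(self.pending):
                s, t, p = msg
                if self._deliverable(s, t):
                    self.pending.remove(msg)
                    self.delivered[s] += 1
                    self.log.append(p)
                    progress = True
```

If a reply with timestamp [1, 1, 0] arrives before the multicast [1, 0, 0] it responds to, the reply is held back until the earlier message has been delivered.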

Virtual Synchrony

  Elements of virtual synchrony
    – Process groups with identical views and ranking
    – Ordered, reliable multicast
    – View-synchronous delivery
    – Gap-freedom
    – State transfer when joining

  Main insights
    – Execution looks synchronous/state machine-like, but free to optimize
    – Correlated failures less likely

  Uses of virtual synchrony
    – Election: choose oldest process as leader
    – Consensus: leader multicasts question, all reply, leader chooses (compare FLP)
    – Consistent snapshot: start multicast, done multicast; use cbcast
    – Replicated data: read, update, lock operations; abcast vs cbcast
    – Load-balancing: group decides vs client decides
    – Fault tolerance: e.g., primary-backup group of servers
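
The election bullet above can be sketched under one assumption: a view is represented as a list of process names ranked oldest first, and every member sees the same view. With identical views, each member can compute the leader locally without exchanging extra messages. The function names are illustrative.

```python
def leader(view):
    """The oldest member (rank 0) of the current view is the leader."""
    return view[0]


def next_view(view, failed):
    """New view after the membership service excludes the failed members.

    The relative ranking of the survivors is preserved.
    """
    return [p for p in view if p not in failed]
```

When the leader fails, the next-oldest process takes over as soon as the new view is installed, and all surviving members agree on that choice because they install the same view.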

Retrofitting Reliability

  Toolkit approach
    – Provide reliability tools via a toolkit directly to programmers
    – JGroups supporting virtual synchrony as example
    – Mixed experiences
      • Tricky to work with
      • E.g., JGroups hidden in JBoss

  Wrappers
    – Transparent wrapping of reliability tools
    – Wrapping services, data, communication channels

  Examples
    – CORBA fault tolerance
    – Unbreakable TCP
    – Reliable Distributed Shared Memory

Meta summary

  (not the complete picture, though...)

  [Figure: diagram relating Group Membership, Fault-tolerant multicast, Ordered multicast, Toolkits and wrappers, Robust Web Services, 2PC and 3PC, and Communication]

About Exam

  Curriculum
  Exam questions
  Trivial advice
  Where and when

Curriculum

  [Birman, 2005]
    – Chapters 1, 2, 3, 4, 5, 8, 9, 10, 12, 13, 14
    – Chapter 15 except Sections 15.2 and 15.3
    – Chapter 16 except Section 16.2.5
    – Chapter 18 except Section 18.2
    – Chapter 20

  Slides from the lectures
  Solutions to hand-ins A1 to A5

Exam Questions

1. Fundamental Concepts
2. Communication
3. Remote Procedure Call
4. Client/Server Computing
5. Web Services
6. Distributed Time and Commit
7. Static and Dynamic Membership
8. Group Communication
9. Virtual Synchrony
10. Retrofitting Reliability

Exam

  Exam form
    – Oral
    – No preparation
    – 13-grade scale
    – Internal censor (Jonas Thomsen)

  Process at exam
    – Pick question
    – Present what you know about the question (for ca. 12 minutes)
    – Discuss with examiner and censor
    – Max 20 minutes altogether

Trivial advice – before the exam

  Read and practice
  Create outlines for each exam question
    – (The summaries given here are not necessarily good ones…)
    – Map literature to questions
    – Practice exam questions in groups
      • Pick questions, hold “test exams”
      • Fidelity, fix-points

  Go watch other exams if possible

Trivial advice – during the exam

  Pick question, don’t panic, look at your outline, put it away

  Use whiteboard as much as possible
    – Structures your presentation and helps the subsequent evaluation

  Use transparencies of diagrams (max 2) for figures

  Listen to (and think about) questions from examiner and censor before answering

  Don’t drink fizzy drinks

Trivial advice – after the exam

  Unacceptable (00, 03, 5)
    – Unable to present particular question
    – Very limited knowledge and operational skills in topic
    – Unable to distinguish relevant from irrelevant
    – Insecure use of knowledge and skills on known problems

  Acceptable (6)
    – Able to present particular question
    – Knowledge and operational skills in topic
    – Understanding of essence of course

  Average (7, 8, 9)
    – Good presentation of particular question
    – Able to cover (and argue for) most relevant parts of particular question
    – Good knowledge and operational skills in topic

  Very good (10, 11, 13)
    – Very good presentation of particular question
    – Extensive, thorough, and secure knowledge and operational skills in topic
    – Able to distinguish relevant from irrelevant
    – Able to combine concepts, methods, information from different parts of the topic

Where and when

  Shannon-157, Finlandsgade 24 A
  Week 13
    – Tuesday 2006-03-28 to Friday 2006-03-31
    – Starting at 9:00
    – Exam lists are at “Grædemuren” in Ny Munkegade

Questions?