ICS362 – Distributed Systems Dr. Ken Cosh Week 1.

ICS362 – Distributed Systems

Dr. Ken Cosh

Week 1

Course Description

This course provides an introduction to the basic issues in the design and implementation of distributed systems. Topics include communication, processes, naming, synchronisation, consistency and replication, fault tolerance and security.

Course Objectives

On completion of this course students will be able to:

– 3.1 Discuss key elements to consider when managing Distributed Systems, such as security, fault tolerance, consistency and replication.

– 3.2 Compare differences between different Object Based Systems, File Systems, Web Based Systems and Co-ordination Based Systems.

References

1) (Compulsory) Distributed Systems: Principles and Paradigms, 2nd Edition, Andrew S. Tanenbaum & Maarten van Steen, 2007.

2) Distributed Systems: Concepts and Design, 4th Edition, George Coulouris, Jean Dollimore, Tim Kindberg, 2005.

Topics

Introduction
Architectures
Processes
Communication
Naming
Synchronisation
Consistency / Replication
Fault Tolerance
Security
Example Systems

Assessment

1. Quizzes and Presentations - 30%
2. Midterm exam - 30%
3. Final exam - 40%

Course Info.

Mon / Wed 12:30-14:00
Room PC319
Office Hours: By Appointment

NOTE: Plagiarism = 0.

What is a Distributed System?

“A distributed system is a collection of independent computers that appears to its users as a single coherent system.” (Tanenbaum)

“Hardware or software components located at networked computers communicate and coordinate their actions only by passing messages” (Coulouris)

Key Features

Components that are autonomous
Users think they are dealing with a single system
This requires some collaboration

Note: The challenges involved are independent of the type of computers used.

Characteristics of DS

How it works is hidden from the user
Interaction is consistent & uniform
Scalability
Continuously available, even if some parts are out of order

Layered Architecture

Commonly implemented through layers & middleware

[Figure: Applications A, B and C sit on top of the Distributed System Layer (Middleware), which spans Local OS 1, Local OS 2, Local OS 3 and Local OS 4; the local operating systems are connected by the Network.]

Goals

Make Resources Available
Hide the fact that resources are distributed
– Distribution Transparency
Be Open
Be Scalable

Make Resources Available

E.g. Printers, storage facilities, data, files, webpages, networks etc.
– For economic reasons
– For collaboration reasons
– To create virtual organisations

This produces challenges
– Security
– Privacy

Distribution Transparency

An important goal of distributed systems is to hide the fact that processes / resources are physically distributed, enabling users to use the system without worrying about where the resources are.

• Access Transparency
• Location Transparency
• Migration Transparency
• Relocation Transparency
• Replication Transparency
• Concurrency Transparency
• Failure Transparency

Access Transparency

Different Resources may represent data in different formats, but this shouldn’t be an issue for the user.

– A user on an Intel workstation sending data to a Sun SPARC machine shouldn’t be concerned that Intel orders its bytes in little endian format (low order bytes first) while SPARC uses big endian format (high order bytes first).

Different file naming formats should also not be of concern to the user, e.g. whether path components are separated by ‘/’ or ‘\’.
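
A minimal sketch of how middleware might hide byte order, using Python's standard struct module. The value and the helper names send_u32 and recv_u32 are made up for illustration; the idea is simply that both sides agree on one wire format (here network byte order) so neither needs to know the other's native layout.

```python
import struct

value = 0x12345678  # hypothetical 32-bit integer to transmit

# The same integer laid out in the two byte orders.
little = struct.pack("<I", value)  # little endian: low-order byte first
big = struct.pack(">I", value)     # big endian: high-order byte first


def send_u32(n: int) -> bytes:
    """Encode n in network byte order, whatever the sending host uses."""
    return struct.pack("!I", n)


def recv_u32(payload: bytes) -> int:
    """Decode a 4-byte payload that was encoded in network byte order."""
    return struct.unpack("!I", payload)[0]


round_trip = recv_u32(send_u32(value))
```

Because the conversion lives in the send/receive helpers, application code on either machine only ever sees ordinary integers.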

Location Transparency

Location Transparency means that the physical position of a resource is hidden from the user. This is normally achieved through naming, where only logical names are used, e.g.

– http://cis.payap.ac.th/index.php

Where is it (physically)? Has it always been there?

Migration / Relocation Transparency

In the previous web address, you have no idea whether index.php has always been on the cis.payap.ac.th server, or when it might have moved there. If resources can be moved without affecting the way the resource is accessed, then migration transparency is provided. If that movement occurs while the resource is being accessed, then relocation transparency is provided. Consider moving around using a wireless laptop.
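
One way to picture location and migration transparency is a name service that maps logical names to physical addresses. The sketch below is purely illustrative: the NameService class, the logical name "cis/index" and both addresses are hypothetical, not part of any real system.

```python
class NameService:
    """Hypothetical registry mapping logical names to physical addresses."""

    def __init__(self):
        self._table = {}

    def register(self, name, address):
        # Registering an existing name again models the resource moving.
        self._table[name] = address

    def resolve(self, name):
        return self._table[name]


def fetch(names, logical_name):
    # The client only ever mentions the logical name, never the address.
    return f"GET from {names.resolve(logical_name)}"


names = NameService()
names.register("cis/index", "10.0.0.5:80")  # made-up address
before = fetch(names, "cis/index")

names.register("cis/index", "10.0.9.7:80")  # the resource migrates
after = fetch(names, "cis/index")
```

The client call is identical before and after the move; only the registry entry changed, which is the essence of migration transparency.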

Replication Transparency

The efficiency of distributed systems can be improved greatly by locating replicas (copies) of a resource physically closer to a user. Replication transparency enables the system to do this without the user knowing they are using a replica.

Concurrency Transparency

A goal of distributed systems is often sharing of resources between users. These users may wish to access or even update the same data at the same time (concurrently). An important challenge when designing distributed systems is how to deal with concurrent accesses.

– How to maintain consistency when different users use the same resource in different ways.
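
As a small, single-machine illustration of the consistency problem, the sketch below has several threads update one shared balance. The Account class and the numbers are assumptions made for the example; a lock is one simple way to stop concurrent read-modify-write updates from overwriting each other, and real distributed systems need heavier machinery than this.

```python
import threading


class Account:
    """Hypothetical shared resource updated by many users at once."""

    def __init__(self, balance):
        self.balance = balance
        self._lock = threading.Lock()

    def deposit(self, amount):
        # The lock makes the read-modify-write below a single indivisible step.
        with self._lock:
            current = self.balance
            self.balance = current + amount


def run_deposits(account, n_threads=8, per_thread=1000):
    def worker():
        for _ in range(per_thread):
            account.deposit(1)

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return account.balance


final_balance = run_deposits(Account(0))
```

Without the lock, two threads could read the same old balance and one deposit would be lost; with it, every deposit is counted.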

Failure Transparency

“You know you have one when the crash of a computer you’ve never heard of stops you from getting any work done!”

Failure Transparency tries to mask failures such as this.

It is difficult to distinguish between a resource that has failed and a resource which is performing badly (slowly).

– Consider opening a webpage - is it dead or painfully slow, how long should the browser wait?
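
The "dead or just slow?" question is usually answered with a timeout, which can only ever give a guess. The sketch below is a simplified illustration: fast_server and slow_server are stand-ins for remote calls, and the timeout value is arbitrary.

```python
import concurrent.futures
import time


def call_with_timeout(request, timeout_s):
    """Run request in a worker thread and give up waiting after timeout_s."""
    pool = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    future = pool.submit(request)
    try:
        return ("ok", future.result(timeout=timeout_s))
    except concurrent.futures.TimeoutError:
        # We cannot tell a crashed server from a slow one; we only suspect.
        return ("suspected-failed", None)
    finally:
        pool.shutdown(wait=False)


def fast_server():
    return "page"


def slow_server():
    time.sleep(1.0)  # simulated slow (but not dead) server
    return "page"


fast_result = call_with_timeout(fast_server, 0.2)
slow_result = call_with_timeout(slow_server, 0.2)
```

Note that slow_server would eventually have answered; the client simply stopped waiting. Choosing the timeout is a trade-off between false suspicions and long stalls.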

Complete Transparency?

Complete Transparency isn’t always completely necessary.
– E.g. daily newspaper arriving at 7am regardless of location in the world.

Nor is it always possible.
– Physics behind signal transmission.

Openness

A further goal of distributed systems is openness - that any resource conforms to a set of open standards. Doing so enables different parts of the system to make use of required services.

This is normally achieved through modules which offer services which are specified through interfaces, using a standard IDL (Interface Definition Language).

The IDL specifies the syntax of the services. Harder to specify is the semantics of what the services actually do.
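
The gap between syntax and semantics can be sketched with an abstract interface. Python's abc module is used here as a stand-in for an IDL, and FileStore and both implementations are hypothetical: each satisfies exactly the same interface, yet they behave differently.

```python
from abc import ABC, abstractmethod


class FileStore(ABC):
    """Hypothetical service interface: names, parameters and types only."""

    @abstractmethod
    def read(self, name: str) -> bytes: ...

    @abstractmethod
    def write(self, name: str, data: bytes) -> None: ...


class OverwritingStore(FileStore):
    def __init__(self):
        self._files = {}

    def read(self, name):
        return self._files[name]

    def write(self, name, data):
        self._files[name] = data  # a write replaces the old contents


class AppendingStore(FileStore):
    def __init__(self):
        self._files = {}

    def read(self, name):
        return self._files[name]

    def write(self, name, data):
        # Same signature, different meaning: a write appends.
        self._files[name] = self._files.get(name, b"") + data


def write_twice(store: FileStore) -> bytes:
    store.write("log", b"a")
    store.write("log", b"b")
    return store.read("log")


overwritten = write_twice(OverwritingStore())
appended = write_twice(AppendingStore())
```

Nothing in the interface says which behaviour is intended, so a client written against one implementation may break on the other even though both "conform".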

Openness

Distributed Systems should be complete and neutral, and in doing so should be interoperable and portable;

– Interoperability refers to how well two different systems (possibly from different manufacturers) can co-exist, making use of each other’s services.

– Portability refers to whether an application written for system A can run, without modification, on system B.

Openness

Another feature of open systems is flexibility. Systems should be flexible to enable users to specialise their interactions without affecting other users or components.

Flexibility is often achieved through designing systems as a collection of small, replaceable or adaptable components.

Scalability

A further goal of Distributed Systems is that they should be scalable - that is, that they can grow;

– Scalable by size; more users or resources can be added to the system.
– Scalable by location; resources and users may be physically distant.
– Scalable by administration; the system remains easy to manage as it grows.

Scalability

One problem often encountered when dealing with scalability is dealing with centralisation.

– Centralised services
– Centralised data
– Centralised algorithms

Imagine how the internet would work if there were only a single DNS table, and every address resolution request had to be directed through that one computer.

Scalability

Another problem affecting scalability concerns whether synchronous communication is actually possible.

– Many existing systems were designed for synchronous communication.

The laws of physics (including the speed of light) limit the speed of communication between physically distant resources.

– This leaves a ‘client’ blocked until a reply is sent back.

Scalability & Administration

What happens when a system needs to scale across multiple, independent administrative domains?

– Conflicting policies on Resource Usage, Management and Security

Solving Scalability (briefly & currently)

Hiding Communication Latencies
– Essentially asynchronous communication. Not waiting for a reply, instead creating a special handler (thread) to complete previous requests.

Distribution
– Splitting a component into smaller parts – e.g. DNS, splits .com, .th, .edu etc.

Replication
– For example caching. A copy of the data closer to the request.
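
A rough sketch of hiding communication latency with a handler, using a thread pool from Python's standard library. remote_call and on_reply are hypothetical names, and the 0.2 second sleep merely stands in for network delay: the client issues all requests without waiting, and the handler processes each reply as it arrives.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def remote_call(name):
    time.sleep(0.2)  # stands in for network latency
    return f"reply from {name}"


handled = []


def on_reply(future):
    # Handler invoked when a reply arrives, not by the code that sent it.
    handled.append(future.result())


start = time.perf_counter()
with ThreadPoolExecutor(max_workers=3) as pool:
    for name in ("a", "b", "c"):
        # submit() returns immediately; the client is not blocked here.
        pool.submit(remote_call, name).add_done_callback(on_reply)
elapsed = time.perf_counter() - start
```

Issued one after another synchronously, the three calls would take roughly 0.6 seconds; overlapped they take roughly as long as the slowest single call.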

Replication & Scalability

Replication can have a negative effect on Scalability
– Consistency Problems
– How big a problem is this?
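
The consistency problem can be seen with even a trivial cache. In this illustrative sketch, Origin, CachingReplica, the "price" key and the 30 second time-to-live are all made up, and the clock is passed in explicitly so that the behaviour is easy to follow.

```python
class Origin:
    """Hypothetical authoritative copy of the data."""

    def __init__(self):
        self.data = {"price": 10}


class CachingReplica:
    """Hypothetical replica that serves cached values for ttl_s seconds."""

    def __init__(self, origin, ttl_s):
        self.origin = origin
        self.ttl_s = ttl_s
        self._cache = {}  # key -> (value, time fetched)

    def read(self, key, now):
        entry = self._cache.get(key)
        if entry is not None and now - entry[1] < self.ttl_s:
            return entry[0]  # served locally, possibly stale
        value = self.origin.data[key]
        self._cache[key] = (value, now)
        return value


origin = Origin()
replica = CachingReplica(origin, ttl_s=30)

first = replica.read("price", now=0)   # fetched from the origin
origin.data["price"] = 12              # the origin is updated
stale = replica.read("price", now=10)  # cache still valid: old value
fresh = replica.read("price", now=40)  # cache expired: new value
```

A shorter time-to-live reduces staleness but sends more requests back to the origin, which is exactly the scalability cost replication was meant to avoid.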

Complexity

Clearly designing a DS is a complex task. Some common false assumptions adding to complexity:

– The network is reliable
– The network is secure
– The network is homogeneous
– The topology doesn’t change
– Latency is zero
– Bandwidth is infinite
– Transport cost is zero
– There is one administrator

Examples of DS

Distributed Computing Systems
– Cluster Computing
– Grid Computing

Distributed Information Systems
– Transaction Processing Systems
– Enterprise Application Integration

Distributed Pervasive Systems

Distributed Computing Systems

For high performance computing tasks.

When the price/performance ratio of PCs and workstations improved, it became financially & technically attractive to build supercomputers by hooking up a collection of simple computers on a high speed network.

Cluster Computing

Homogeneous hardware
Master node handles allocation of tasks and user interface
E.g. Beowulf Linux clusters

Grid Computing

Heterogeneous Hardware
– No assumptions about hardware, OS, Networks, Administrative domains, security policies

Resources from different organisations are brought together to allow collaboration – essentially realising a virtual organisation.
– Towards Service Oriented Architectures

Distributed Information Systems

These arose when Business Information Systems moved into a networked environment.
– Sharing data between functional units
– Sharing functionality both internally and externally

Transaction Processing Systems

Consider a transaction as an operation on a database.
– Handled through Remote Procedure Calls (RPCs)

Each transaction should have 4 characteristics (ACID)
– Atomic
– Consistent
– Isolated
– Durable

ACID

Atomic
– Either the whole transaction happens, or none of it.

Consistent
– Certain invariants must remain true – e.g. the total amount of money in a bank must remain the same before and after internal transfers (even if momentarily during the transaction this isn’t true).

Isolated
– Two concurrently running transactions should not interfere with each other.

Durable
– Once a transaction commits, its changes are permanent; there is no going back.
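
Atomicity and the bank-total invariant can be tried out on a single machine with Python's built-in sqlite3 module. This is only a local sketch under assumed names (the accounts table, alice, bob and the transfer helper are invented for the example); a distributed transaction needs a commit protocol on top of this.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "name TEXT PRIMARY KEY, balance INTEGER CHECK (balance >= 0))"
)
conn.executemany(
    "INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)]
)
conn.commit()


def transfer(conn, src, dst, amount):
    """Move amount from src to dst as one all-or-nothing transaction."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        # The debit would overdraw src, so the earlier credit is undone too.
        return False


def balances(conn):
    return dict(conn.execute("SELECT name, balance FROM accounts"))


ok = transfer(conn, "alice", "bob", 30)
after_ok = balances(conn)
failed = transfer(conn, "alice", "bob", 500)
after_failed = balances(conn)
```

In the failed transfer the credit to bob is applied first and then rolled back when the debit violates the constraint, so the total stays at 150 throughout.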

Enterprise Application Integration

Applications are built on top of databases – separated from the databases.
– So these applications may need to communicate with each other.

Which leads to different communication middleware
– RPC
– Remote Method Invocations (RMI)

Distributed Pervasive Systems

Thus far systems have been ‘stable’, i.e. relatively permanent fixed nodes with high quality connections.

– Pervasive systems integrate mobile / embedded computing devices.

Small, battery-powered, mobile, wirelessly connected nodes which blend into their environment.

– Nodes should be able to discover local services and react accordingly

E.g. Home Systems, Electronic Health Care Systems, Sensor Networks