Software Engineering of Distributed Systems University of Colorado Boulder ECEN5053.
August 30, 2002 University of Colorado ECEN5053 Software Engineering of Distributed Systems Week 1 Introduction
Course Logistics
Introductions
http://ece.colorado.edu/~swengctf
http://ece.colorado.edu/~swengctf/distributed
Format
Calendar
Exams -- final exam only
Homework -- in teams of 2 to 3
Phone number for late arrival
Contact information
Text web site: www.cdk3.net -- see key pts.
Outline for this session
Definition of distributed systems
Purposes
Demands/challenges
Hardware concepts
Software concepts
An example model
Definition of a Distributed System
A distributed system is a collection of independent computers that appears to its users as a single coherent system. -- Andrew Tanenbaum
A distributed system is one in which components located at networked computers communicate and coordinate their actions only by passing messages. -- Coulouris et al. (your text)
  concurrency of components
  lack of a global clock
  independent failures of components
Alternative definition of a distributed system
“You know you have one when the crash of a computer you’ve never heard of stops you from getting any work done.” -- Leslie Lamport
If true, implied characteristics?
Computer heterogeneity & the user
Communication paths from user’s perspective
User interaction with system from various locations
User interaction with applications
Scalability
Availability
Addition or temporary removal of certain components
Examples?
The Internet -- not quite there; some Internet applications more so than others
For some applications, the user must be very aware of which computer is being accessed
And what else?
Timeline of what had to happen first
1945: mainframes
~1985: powerful microprocessors
       high-speed networks
Necessary Developments
Take a historical view
1945 - 1985
  Computers are large & expensive
  Most organizations had only a few
  Lacked a way to connect them; operated independently from one another
By mid-80’s ...
  Powerful microprocessors with power of a then-contemporary mainframe
  High-speed networks!
Result: Easy to combine large numbers of computers via a high-speed network.
Purposes -- what problems are solved?
Easily connect users to remote resources
Share resources with remote users in a controlled way
Hide the fact that the resources are physically distributed over a network -- transparency
Should be an open system
  Offers services by standard rules that describe the syntax and semantics of those services
Should be scalable
  size, geography, and administration
Purpose 1: Access and sharing remotely
Why share?
  economics
  ease of collaboration -- virtual organizations
  ease of info exchange
  commerce
Connectivity and sharing lead to security issues
  Currently, inadequate protection
Purpose 2: Transparency
Transparency   Description -- hide:
Access         differences in data representation & how a resource is accessed
Location       where a resource is located
Migration      that a resource may move to another location
Relocation     that a resource may be moved while in use
Replication    that a resource is replicated
Concurrency    that a resource may be shared by competing users
Failure        failure and recovery of a resource
Persistence    whether a sw resource is in memory or on disk
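As a rough illustration of location and migration transparency from the table above, the sketch below lets a client use a logical resource name while a registry maps that name to whichever host currently holds the resource. The `Registry` class, the resource name, and the host names are hypothetical, invented only for this example.

```python
class Registry:
    """Maps logical resource names to their current (hypothetical) hosts."""

    def __init__(self):
        self._locations = {}

    def register(self, name, host):
        self._locations[name] = host

    def migrate(self, name, new_host):
        # The resource moves; clients keep using the same logical name.
        self._locations[name] = new_host

    def resolve(self, name):
        return self._locations[name]


def fetch(registry, name):
    """Client-side access: the caller never names a host."""
    host = registry.resolve(name)
    return f"contents of {name} served by {host}"


registry = Registry()
registry.register("course/syllabus", "host-a.example")
before = fetch(registry, "course/syllabus")
registry.migrate("course/syllabus", "host-b.example")
after = fetch(registry, "course/syllabus")
print(before)
print(after)
```

The client code is identical before and after the move; only the registry entry changes.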
Degree of Transparency
Hiding all distribution aspects is not always a good idea
  Sometimes desirable to remain fixed
  Messages between processes that are thousands of miles apart will take hundreds of milliseconds
Trade-off between high degree of transparency and performance -- why?
The degree of desirable transparency should be considered in context with other issues such as performance and cost
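One way to think about the trade-off is with a little arithmetic. If remote access is fully hidden, a program may read remote records one at a time as though they were local, paying a round trip for each. The numbers below (a 100 ms round trip and 50 records) are assumptions chosen for illustration, and the batched case ignores transfer time.

```python
ROUND_TRIP_MS = 100   # assumed wide-area round-trip time
ITEMS = 50            # assumed number of remote records a program reads


def transparent_cost_ms(items, round_trip_ms):
    # Each access looks like a local read but costs one round trip.
    return items * round_trip_ms


def batched_cost_ms(round_trip_ms):
    # The program knows the data is remote and asks for everything at once.
    return round_trip_ms


print(transparent_cost_ms(ITEMS, ROUND_TRIP_MS))  # 5000
print(batched_cost_ms(ROUND_TRIP_MS))             # 100
```

Under these assumptions, exposing the distribution to the program saves a factor of 50 in waiting time.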
Purpose 3: Openness
Offers services according to standard rules describing syntax and semantics of the services.
  Rules are formalized in protocols
  Services generally specified through interfaces using Interface Definition Language (IDL)
    specify syntax only
    natural language used to describe semantics
  allows arbitrary process that needs an interface to talk to another process that provides it
  proper interfaces are complete and neutral
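The split between syntax and semantics can be sketched in Python, used here only as a stand-in for a real IDL. The `FileService` interface and both providers are hypothetical. The method signature fixes the syntax, while the semantics appear only as prose in a docstring, so a type checker cannot notice that the second provider changes the data.

```python
from typing import Protocol


class FileService(Protocol):
    """Hypothetical interface: the signature fixes the syntax."""

    def read(self, name: str) -> bytes:
        """Semantics live in prose: return the full contents of `name`."""
        ...


class InMemoryFiles:
    def __init__(self, files: dict):
        self._files = files

    def read(self, name: str) -> bytes:
        return self._files[name]


class UppercaseFiles:
    """A second, independently written provider of the same interface."""

    def __init__(self, files: dict):
        self._files = files

    def read(self, name: str) -> bytes:
        return self._files[name].upper()


def show(service: FileService, name: str) -> str:
    # The client depends only on the interface, not on a provider.
    return service.read(name).decode("ascii")


files = {"notes.txt": b"week 1"}
print(show(InMemoryFiles(files), "notes.txt"))
print(show(UppercaseFiles(files), "notes.txt"))
```

Both providers match the interface syntactically, which is all the signature can guarantee.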
Goals of Openness
Interoperability and portability
  completeness and neutrality are prerequisites
Flexible
  easy to configure the system out of different components from different developers
  easy to add new components without impact
  easy to replace existing ones without impact
  i.e. extensible
  easier said than done
Purpose 4: Flexibility -- Policy and Mechanism
System must be organized as a collection of relatively small and easily replaceable or adaptable components
Need for change: component does not provide optimal policy for a specific user or app
  Example: differing caching policies
Need to be able to separate policy & mechanism
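The caching example can be sketched as follows. The `Cache` class is the mechanism (storing entries and enforcing a capacity), and the eviction function passed to it is the policy. All names and the two policies are hypothetical choices made for this illustration.

```python
from collections import OrderedDict


def evict_oldest(entries):
    """Policy: drop the entry that was inserted or refreshed least recently."""
    return next(iter(entries))


def evict_largest(entries):
    """Policy: drop the entry with the largest value."""
    return max(entries, key=lambda key: len(entries[key]))


class Cache:
    """Mechanism: stores entries and enforces a capacity limit."""

    def __init__(self, capacity, choose_victim):
        self.capacity = capacity
        self.choose_victim = choose_victim  # pluggable policy
        self.entries = OrderedDict()

    def put(self, key, value):
        if key in self.entries:
            del self.entries[key]
        elif len(self.entries) >= self.capacity:
            del self.entries[self.choose_victim(self.entries)]
        self.entries[key] = value


def fill(cache):
    cache.put("a", "x" * 10)
    cache.put("b", "x" * 500)
    cache.put("c", "x" * 20)   # forces one eviction when capacity is 2
    return sorted(cache.entries)


print(fill(Cache(2, evict_oldest)))   # ['b', 'c']
print(fill(Cache(2, evict_largest)))  # ['a', 'c']
```

Swapping the policy changes which entry survives without touching the cache code itself.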
Purpose 5: Scalability Challenges -- Size
Size
  Limitations of centralized services, data, and algorithms -- become bottleneck
  Unlimited processing power and storage cannot overcome communication limitations
  Decentralization introduces some kinds of uncertainty
Purpose 6: Scalability Challenges -- Geography
Existing distributed systems designed for LANs are based on synchronous communication
Communication in WANs is inherently unreliable and almost always point-to-point
  LANs provide reliable comm based on broadcasting -- WAN needs special location services
Centralized components prevent geographic scale
Purpose 7: Scalability Challenges -- Administration
How to scale across multiple independent administrative domains
Conflicting policies
  usage (payment)
  management
  security
    protect against malice from the new domains
    protect against malice from the distributed system -- e.g. downloaded programs
Scaling Techniques
Scalability problems appear as performance ones
hide communication latencies
  avoid waiting for responses as much as possible
  i.e. construct the requestor to use asynchronous comm as much as possible
  reduce overall communication
distribution -- spreading component parts across the system, e.g. DNS (see next slide)
replication across the distributed system
  increases availability (helps hide latency)
  helps balance the load between components
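Hiding latency with asynchronous communication can be sketched with Python's `asyncio`. The `remote_call` coroutine is a stand-in that only sleeps, and the 50 ms delay is an assumed figure; no real network is involved.

```python
import asyncio
import time

DELAY_S = 0.05  # assumed latency of one remote request


async def remote_call(name):
    # Stand-in for a network request; only sleeps.
    await asyncio.sleep(DELAY_S)
    return f"reply:{name}"


async def sequential(names):
    # Wait for each reply before sending the next request.
    return [await remote_call(name) for name in names]


async def overlapped(names):
    # Issue all requests first, then collect the replies.
    return await asyncio.gather(*(remote_call(name) for name in names))


def timed(coro):
    start = time.perf_counter()
    result = asyncio.run(coro)
    return result, time.perf_counter() - start


names = ["a", "b", "c", "d"]
seq_result, seq_time = timed(sequential(names))
par_result, par_time = timed(overlapped(names))
print(seq_result == list(par_result), seq_time > par_time)
```

Both versions return the same replies; the overlapped one waits roughly one delay instead of four because the requests are in flight at the same time.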
Example: Dividing DNS name space into zones
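[Figure: DNS name space divided into zones Z1, Z2, and Z3. Top-level domains are grouped as generic (int, com, mil, org, gov, edu, ...) and countries; colorado and its subdomains cs, ece, ... appear as the example branch.]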
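The idea of distributing a name space across zones can be sketched as a lookup that walks from the root zone through delegations, so that no single server holds every name. The zone table and the address below are made up for illustration and do not describe the real DNS data or protocol.

```python
# Each zone holds only its own records plus delegations to child zones.
# Zone contents and the address are made up for illustration.
ZONES = {
    ".": {"delegations": {"edu": "edu"}, "records": {}},
    "edu": {"delegations": {"colorado.edu": "colorado.edu"}, "records": {}},
    "colorado.edu": {
        "delegations": {},
        "records": {"ece.colorado.edu": "192.0.2.10"},
    },
}


def resolve(name):
    """Walk from the root zone, following delegations toward `name`."""
    zone, path = ".", ["."]
    while True:
        data = ZONES[zone]
        if name in data["records"]:
            return data["records"][name], path
        for suffix, child in data["delegations"].items():
            if name == suffix or name.endswith("." + suffix):
                zone = child
                path.append(child)
                break
        else:
            # No record and no delegation covers this name.
            raise KeyError(name)


address, visited = resolve("ece.colorado.edu")
print(address, visited)
```

Each step consults only one zone, which is what lets the work be spread over many servers.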
Outline
Definition
Purposes
Demands/challenges
Hardware concepts
Software concepts
An example model
Hardware Concepts
Introduction to
  how distributed systems can be organized
  how they are interconnected
  how they communicate
Memory     Interconnection
Shared     bus-based
Private    bus-based
Shared     switch-based
Private    switch-based
Shared Memory & Private Memory
Multiprocessors (not multicomputers)
  Single physical address space shared by all CPUs
  CPU A writes 37 to address 1000
  CPU B then reads from address 1000 and gets 37
  e.g., multiple processors on a board with shared memory
Multicomputers
  Every machine has its own private memory
  CPU A writes 37 to its address 1000
  CPU B reads from its address 1000 and gets whatever happens to be there; not affected by the other write
  For example, PCs connected by a network
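The address-1000 example can be mimicked in a few lines, with Python dictionaries standing in for memories. This is only a model of the addressing difference, not of real hardware; the initial value 0 is an arbitrary assumption.

```python
# Multiprocessor: one address space that both CPUs refer to.
shared_memory = {1000: 0}
cpu_a_view = shared_memory
cpu_b_view = shared_memory
cpu_a_view[1000] = 37
multiprocessor_read = cpu_b_view[1000]

# Multicomputer: each machine has its own memory at "address 1000".
machine_a_memory = {1000: 0}
machine_b_memory = {1000: 0}
machine_a_memory[1000] = 37
multicomputer_read = machine_b_memory[1000]

print(multiprocessor_read)  # 37
print(multicomputer_read)   # 0, machine B's memory is unaffected
```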
Bus-based & Switch-based
Bus architecture of the interconnection network
  single network, backplane, bus, cable or other medium that connects all the machines
  For example, cable television
Switched architecture
  Individual wires from machine to machine with many different wiring patterns in use
  Msgs move along wires with an explicit switching decision made at each step to route the message along one of the outgoing wires.
  e.g., worldwide public telephone system
Divide & conquer -- select and explain
Performance Impacts
  bus, shared memory
  switched, shared memory
  not quite shared memory
  homogeneous multicomputers
    private memory, bus-based network
    private memory, switch-based network
  heterogeneous multicomputer systems
Performance Impacts--bus, shared memory
Bus-based multiprocessor, shared memory
  Coherent memory
  Bus contention
  If cache memory for each CPU has a high hit rate, bus traffic drops dramatically but introduces serious problem -- what is it?
  Caching and memory coherence is an issue for distributed systems
  Limited scalability
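The coherence issue named above can be demonstrated with a toy model in which each CPU keeps a private cache and nothing invalidates other caches on a write. The `CPU` class is hypothetical and deliberately omits any coherence protocol.

```python
class CPU:
    """Toy CPU with a private cache and no coherence protocol."""

    def __init__(self, memory):
        self.memory = memory
        self.cache = {}

    def read(self, address):
        if address not in self.cache:          # miss: go over the bus
            self.cache[address] = self.memory[address]
        return self.cache[address]             # hit: no bus traffic

    def write(self, address, value):
        # Write-through to memory, but other caches are not told.
        self.cache[address] = value
        self.memory[address] = value


memory = {1000: 1}
cpu_a, cpu_b = CPU(memory), CPU(memory)
cpu_b.read(1000)            # B caches the old value
cpu_a.write(1000, 37)       # A updates memory
stale = cpu_b.read(1000)    # B still sees its cached copy
print(memory[1000], stale)  # 37 1
```

The cache hit that saves bus traffic is the same hit that returns the out-of-date value.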
Performance impacts -- switched, shared memory
1. Divide memory into modules; connect them to CPUs with a matrix of switches called a crossbar switch
   Allows multiple CPUs to access shared memory simultaneously
   One still has to wait if both want to access same module
2. Network of switches to route any input to any output
   May be several switching stages in-between
   Need extremely fast switching to reduce latency = $$
Performance impacts--not quite shared memory
Reduce cost of switching with hierarchical system
  SOME memory associated with each CPU (not shared)
  Access to own local memory is quick
  Accessing anybody else’s memory is available but slower
NUMA - “Non Uniform Memory Access”
  better average access times than switched nw’s
  what’s the problem?
Performance impacts-- homogeneous multicomputers (SANs)
System of individual computers. Therefore...
  Each CPU has direct connection to its own local memory
  Challenges surround communication between the CPUs
  Traffic volume will be orders of magnitude lower than when interconnection network is also used for CPU-to-memory traffic
Performance impacts - private memory, bus-based network (SANs)
Processors connected thru shared multiaccess network such as Fast Ethernet
Limited scalability -- performance degrades with 25-100 nodes depending on amt of communication
Performance impacts - private memory, switch-based network (SANs)
Messages are routed through an interconnection network instead of broadcast as in bus-based
Interconnection networks vary
  Grid -- suitable to 2-dimensional problems
  Hypercube -- n-dimensional cube
MPPs - massively parallel processors (1000’s)
  high-performance proprietary interconnection network designed for low latency, high bandwidth
COWs - clusters of workstations
  Std wkstns connected by off-the-shelf communication components; no special measures for high bandwidth or reliability --> ??
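For the hypercube topology mentioned above, a short sketch shows how the wiring follows from node labels: in an n-dimensional cube with nodes labeled by n-bit numbers, two nodes are directly connected when their labels differ in exactly one bit. The function names are hypothetical.

```python
def hypercube_neighbors(node, dimensions):
    """Neighbors differ from `node` in exactly one bit of the label."""
    return [node ^ (1 << bit) for bit in range(dimensions)]


def hops(src, dst):
    """Minimum hop count: the number of differing label bits."""
    return bin(src ^ dst).count("1")


print(hypercube_neighbors(0b000, 3))  # [1, 2, 4]
print(hops(0b000, 0b111))             # 3
```

A message therefore needs at most n hops in an n-dimensional cube, one per bit that must be flipped.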
Performance impacts - heterogeneous multicomputer systems
Most distributed systems are these
Computers are heterogeneous w.r.t. processor type, memory size, I/O bandwidth, etc.
Interconnection networks can be heterogeneous, too
Many large-scale heterogeneous multicomputers lack a global system view
  cannot assume same performance or services are available everywhere
THEREFORE sophisticated software is needed
  shield application developers from what is going on at hardware level (transparency)
Software Concepts
Distributed systems software
  acts as resource manager(s) for the underlying hardware
  Hide intricacies and heterogeneity of underlying hardware
The issues that this software faces are the core of distributed systems principles we will study this semester
When is a distributed system not a distributed system?
Distributed operating system:
  Not intended to handle a collection of independent computers
Network operating system:
  Does not provide a view of a single coherent system
“true” distributed system
  Goal: scalability and openness of network o.s. and transparency and ease of use of distributed o.s.
  Additional layer called middleware
Various middleware models (paradigms)
A particular paradigm is a set of decisions about how to describe distribution and communication
  Distributed file systems
  Remote procedure calls
  Distributed objects
  Distributed documents
See table
Sample Paradigms
Paradigm: distribution / communication
Distributed file system: distribution transparency supported for traditional files
Remote procedure calls: network transparency allows a process to call a procedure on a remote machine
Distributed objects: method invocation; the interface implementation on the process’ machine translates the invocation into a message sent to the remote object; the reply message becomes the return value
Distributed documents: information organized as documents; each document is somewhere in the world
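The distributed-objects row can be sketched as a client-side proxy that turns a method call into a request message and turns the reply message back into a return value. Everything here is a hypothetical illustration: the `Account` object, the JSON message format, and the `server_handle` function are invented, and the "network" is just a direct function call.

```python
import json


class Account:
    """The 'remote' object; it lives behind the message boundary."""

    def __init__(self):
        self.balance = 0

    def deposit(self, amount):
        self.balance += amount
        return self.balance


def server_handle(target, request_bytes):
    """Unpack a request message, invoke the method, pack the reply."""
    request = json.loads(request_bytes)
    result = getattr(target, request["method"])(*request["args"])
    return json.dumps({"result": result}).encode()


class Proxy:
    """Client-side interface implementation: turns calls into messages."""

    def __init__(self, send):
        self._send = send  # function that delivers bytes and returns a reply

    def __getattr__(self, method):
        def invoke(*args):
            message = json.dumps({"method": method, "args": args}).encode()
            reply = self._send(message)
            return json.loads(reply)["result"]
        return invoke


remote_account = Account()
account = Proxy(lambda message: server_handle(remote_account, message))
print(account.deposit(25))  # 25
print(account.deposit(10))  # 35
```

To the caller, `account.deposit(25)` reads like a local method call; a remote procedure call hides the message exchange in the same way, only without the object.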
Each paradigm must address these issues:
Communication
Processes & their synchronization
Processes & their interaction
Naming
Consistency and replication
Fault tolerance
Security
Software Engineering of Distributed Systems
Requirements specification of these issues in distributed systems -- how to recognize, analyze, specify, trace, and manage
Design -- how to choose, represent, and verify
Implementation -- tools, language support
Testing -- static and dynamic