Page 1

Self-* Networks of Unmanned Vehicles

CSE 597c, Fall 2006

Introduction to Self-* Systems
Sept. 14, 2006

Bhuvan Urgaonkar

Page 2

Definition

Self-* systems: the “*” is a wildcard, as in a regular expression
Self-tuning, self-configuring, self-healing, self-stabilizing, …

Autonomic computing [IBM]: inspired by the autonomic nervous system of living organisms
In humans and other vertebrates, the part of the nervous system that regulates the involuntary activity of the heart, intestines, and glands

Page 3

Some History

What do you think the first self-* system was?

The wind/water mill?

Emergence of (semi-)autonomous systems starting with the Industrial Revolution
Steam engine, printing press, car, …
Could carry out certain tasks without human intervention
Development of feedback-control theory and signal processing

Thermostat (Albert Butz of the Thermo-Electric Regulator Co., Minneapolis, 1885)

Cruise control

Early 20th century onwards: major advances in engineering and the emergence of computing
Now you could program a mechanical/electrical/… system

More complex autonomous systems

Page 4

More History

Artificial Intelligence
Make a machine/computer do what a (smart/able) human can do
Learn the way a human does
Sometimes easy, very often not!

The Turing Test
A computer that can pose as a human passes the Turing test
A definition of self-*-ness?

Would imitating human behavior alone be enough?

Page 5

Complexity of Modern Systems

Computer systems grew in complexity
Other systems did as well, but let's talk about CS
NYTimes: “All science is computer science”
Complex h/w and s/w, distributed systems, heterogeneity, …
Unlike WWII-era machinery, these systems can't be run by a layperson handed a manual!
IBM's DB2 database server has about 80 parameters!
Modern systems operate under highly dynamic conditions
Operation based on human intervention is often infeasible:
Error-prone
Slow
Expensive
…

Page 6

Operating Environments that Prohibit Human Participation

Robots or machines operating in mines, under the ocean, in volcanic areas, …
They must “take care” of themselves

Page 7

Defining Self-*-ness

The Turing test doesn't quite capture self-*-ness
Sometimes we want better than what even the smartest/fastest human can do!
Not quite the same as the original AI goal, and not a superset of it
Some intersection, but also some orthogonal requirements

Page 8

Outline
Motivation and history
Examples
Self-* networks/distributed systems
Relevant areas/useful techniques
Summary

Page 9

Example 1: General-purpose Operating Systems

CPU scheduling and memory management
The first computers did batch processing of jobs
A human would schedule the jobs

Multiprogramming then emerged
Dynamically changing set of processes
Interleaving of computation and I/O
Response-time-sensitive processes such as editors
The CPU scheduler had to adapt to these dynamics

Self-tuning behavior was desired (see the sketch below)
The same holds for the memory manager
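To make the scheduler's self-tuning concrete, here is a minimal sketch (my illustration, not from the lecture) of a multilevel-feedback-queue-style policy: a process that burns its entire time slice is treated as CPU-bound and demoted, while one that blocks early (e.g., an editor waiting on keystrokes) is promoted. The level count, quanta, and function names are illustrative.

```python
# Hypothetical sketch of a multilevel-feedback-queue-style priority adjustment.
# A process that burns its whole quantum is likely CPU-bound -> demote it;
# a process that blocks early is likely interactive -> promote it.

NUM_LEVELS = 3             # 0 = highest priority, 2 = lowest
QUANTUM_MS = [10, 20, 40]  # longer slices for lower-priority (CPU-bound) work

def adjust_priority(level: int, ran_ms: int, quantum_ms: int) -> int:
    """Return the new queue level after a process runs (or blocks)."""
    if ran_ms >= quantum_ms:                 # used the full slice: CPU-bound
        return min(level + 1, NUM_LEVELS - 1)
    return max(level - 1, 0)                 # blocked early: interactive

# Example: an editor that blocks after 1 ms climbs back to level 0,
# while a number-cruncher drifts down to level 2.
print(adjust_priority(1, 1, QUANTUM_MS[1]))   # -> 0
print(adjust_priority(1, 20, QUANTUM_MS[1]))  # -> 2
```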

Page 10

Self-tuning Systems

[Figure: the external environment (including inputs) drives the system components, which produce the system output (e.g., performance); a feedback path carries the output back into the system]

Keep output within desired bounds even when the external environment is changing
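As a concrete illustration of the loop sketched above, here is a toy proportional controller, assuming a made-up response-time model: it observes the output, compares it to a target, and adjusts an internal knob (server count) as the external load changes. It is a sketch of the general idea, not any particular system.

```python
# Toy proportional feedback controller keeping response time near a target
# by adjusting capacity (number of servers). All numbers are illustrative.

TARGET_MS = 100.0   # desired response time
GAIN = 0.02         # proportional gain: how aggressively to react to error

def measured_response_ms(load_rps: float, servers: int) -> float:
    # Stand-in for the real system: more load -> slower, more servers -> faster.
    return 1000.0 * load_rps / (servers * 50.0)

servers = 4
for load in [100, 150, 220, 300]:                   # external environment changes
    rt = measured_response_ms(load, servers)        # observe the output
    error = rt - TARGET_MS                          # feedback signal
    servers = max(1, servers + round(GAIN * error)) # tune internal state
    print(f"load={load} rps  response={rt:.0f} ms  -> servers={servers}")
```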

Page 11

Example 2: Mission-critical Operating Systems

OSes running on spacecraft
The system had to discover errors and recover on its own

Self-healing systems
Initial/simple solutions: a high degree of redundancy
Introduce redundancy to deal with failures
Implement mechanisms to quickly discover failures
OK for a spacecraft, but not for a more “down-to-earth” system
Could be very expensive
How can a system self-heal without excessive redundancy?

Later: software became very complex
S/w failures became a far more serious problem than h/w failures!
Software engineering, programming languages

Page 12

Self-healing Systems

Keep output within reasonable bounds even when internal components fail

What's different from a self-tuning system?
Failures are internal events; changes in the operating environment are external events
Note: failures might be induced by external events

[Figure: the same feedback loop as before (external environment → system → output → feedback), but now with a component failure inside the system]
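A minimal, hypothetical sketch of the self-healing loop described on this slide: a monitor uses health checks (the feedback path) to detect failed components and pulls replacements from a small pool of spares so the output stays within bounds. Component names and the failure probability are made up.

```python
import random

# Toy self-healing loop: detect failed workers and replace them from spares,
# so the service keeps meeting its capacity target. Purely illustrative.

CAPACITY_TARGET = 3          # workers needed to keep output within bounds
workers = ["w1", "w2", "w3"]
spares = ["s1", "s2"]

def is_healthy(worker: str) -> bool:
    # Stand-in for a real health check; here a worker fails 20% of the time.
    return random.random() > 0.2

for tick in range(5):
    workers = [w for w in workers if is_healthy(w)]   # detect failures
    while len(workers) < CAPACITY_TARGET and spares:  # heal: pull in spares
        workers.append(spares.pop())
    status = "OK" if len(workers) >= CAPACITY_TARGET else "DEGRADED"
    print(f"tick {tick}: {len(workers)} workers -> {status}")
```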

Page 13

Self-Stabilization

[Figure: state space with green = good states, blue = bad states]
Guaranteed to return to a good state eventually, on its own
Related to fault tolerance

How?
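One classic answer to the “How?” question is Dijkstra's K-state self-stabilizing token ring (added here as an illustration; it is not on the slide): whatever state the counters start in, the ring converges on its own to exactly one circulating privilege.

```python
# Sketch of Dijkstra's K-state self-stabilizing mutual exclusion on a ring.
# n processes hold counters x[0..n-1] with K > n. From ANY initial state the
# system converges to exactly one "privileged" process at a time.

n, K = 4, 5
x = [3, 1, 4, 1]   # arbitrary (possibly illegitimate) starting state

def privileged(i):
    if i == 0:
        return x[0] == x[n - 1]
    return x[i] != x[i - 1]

for step in range(12):
    holders = [i for i in range(n) if privileged(i)]
    print(f"step {step}: x={x} privileged={holders}")
    i = holders[0]                     # let one privileged process move
    if i == 0:
        x[0] = (x[0] + 1) % K
    else:
        x[i] = x[i - 1]
```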

Page 14

Classification of Self-* Systems

Self-tuning: performance
Self-healing: failure handling
Self-stabilizing: convergence

Is this a good classification?
Note: not necessarily a non-intersecting classification

Page 15

Defining Self-*-ness (contd.)

First, define it for each member of our classification

Page 16

Quantifying Self-tunability

How good is the system at meeting performance targets under dynamic operating conditions?
E.g., can the system ensure that response-time degradation is always at most proportional to the increase in request arrivals?
Note: the system can change its internal state (e.g., increase its capacity dynamically) to achieve its goal
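One hedged way to turn the question above into a number (my formulation, not the lecture's): take the worst-case ratio of relative response-time growth to relative load growth over a set of measurements; a score of at most 1 means degradation stayed at most proportional to the load increase.

```python
# Hypothetical "self-tunability score": worst-case ratio of relative
# response-time growth to relative load growth. <= 1.0 means degradation
# stayed at most proportional to the load increase.

def tunability_score(samples):
    """samples: list of (load_rps, response_ms), ordered by increasing load."""
    worst = 0.0
    base_load, base_rt = samples[0]
    for load, rt in samples[1:]:
        rel_load = (load - base_load) / base_load
        rel_rt = (rt - base_rt) / base_rt
        if rel_load > 0:
            worst = max(worst, rel_rt / rel_load)
    return worst

# Load doubles (+100%) while response time rises 60%: score 0.6 -> acceptable.
print(tunability_score([(100, 50), (150, 65), (200, 80)]))
```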

Page 17

Quantifying the Goodness of a Self-healing System

How good is the system at maintaining functionality under failures?
E.g. 1: Can the system continue functioning even after N failures?
E.g. 2: Can the system continue to offer the same response time even after N failures?
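A standard numeric companion to “can the system survive N failures?”: with k independent replicas, each up with probability p, the service survives as long as at least one replica is up, with probability 1 - (1 - p)^k. The values below are illustrative.

```python
# Probability that a service with k independent replicas stays up, assuming
# each replica is available with probability p (values are illustrative).

def service_availability(p: float, k: int) -> float:
    return 1.0 - (1.0 - p) ** k

for k in (1, 2, 3):
    print(f"{k} replica(s), p=0.99 -> availability {service_availability(0.99, k):.6f}")
# 1 replica: 0.99, 2 replicas: 0.9999, 3 replicas: 0.999999
```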

Page 18

Quantifying the Goodness of a Self-stabilizing System

How long does it take the system to return to a good state after a perturbation?

Page 19

Defining Self-*-ness (contd.)

One approach: define a vector whose individual elements characterize self-tunability, goodness of self-healing, and self-stabilization

E.g., <ST=excellent, SH=poor, SS=good>

Conflicting goals!
E.g., maintaining performance might require fewer components; dealing with failures might require redundancy

Need to understand what is more important
Context dependent: the relative importance of the various self-* properties varies across systems
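One hypothetical way to make the vector and its context-dependence concrete (not from the slides): score each self-* property and combine the scores with context-specific weights, so the same system can look good in one deployment and poor in another.

```python
# Hypothetical combination of a self-* vector with context-dependent weights.
scores = {"ST": 0.9, "SH": 0.3, "SS": 0.7}   # self-tuning, self-healing, self-stabilizing

contexts = {
    "web cache":     {"ST": 0.6, "SH": 0.2, "SS": 0.2},  # performance matters most
    "spacecraft OS": {"ST": 0.1, "SH": 0.7, "SS": 0.2},  # surviving failures matters most
}

for name, weights in contexts.items():
    overall = sum(weights[k] * scores[k] for k in scores)
    print(f"{name}: overall self-*-ness = {overall:.2f}")
# Same system, different verdicts: ~0.74 for the cache, ~0.44 for the spacecraft OS.
```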

Page 20

Outline
Motivation and history
Examples
Self-* networks/distributed systems
Relevant areas/useful techniques
Summary

Page 21

Distributed Systems

How do things change?

Cons: problems associated with a distributed system
Data consistency
Larger communication delays
Heterogeneity
More failures, and more kinds of failures
…

Pros:
More sources of redundancy might mean better self-healing
More resources might mean more options to self-tune
Any more?

Page 22

Example 3: Networking: TCP/IP

Simple AIMD-based congestion control
Decentralized, implemented only at the end points
Has worked pretty well!
Scaled to the current Internet
I consider TCP a good self-tuning protocol (see the AIMD sketch below)

What about link failures and how IP handles them?
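To make the AIMD mention concrete, here is a minimal sketch of additive-increase/multiplicative-decrease as used in TCP congestion avoidance: the congestion window grows by one segment per loss-free round trip and is halved when a loss is detected. It omits slow start, timeouts, and everything else in the real protocol.

```python
# Minimal AIMD (additive increase, multiplicative decrease) sketch, as used by
# TCP congestion avoidance. Not the full protocol: no slow start, timeouts, etc.

cwnd = 4.0   # congestion window, in segments

def on_rtt(loss_detected: bool) -> None:
    """Update the window once per round trip."""
    global cwnd
    if loss_detected:
        cwnd = max(1.0, cwnd / 2.0)    # multiplicative decrease
    else:
        cwnd += 1.0                    # additive increase

# Simulate a few RTTs: loss in rounds 5 and 9.
for rtt in range(1, 11):
    on_rtt(loss_detected=rtt in (5, 9))
    print(f"RTT {rtt}: cwnd = {cwnd:.1f}")
```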

Page 23

Example 4: Enterprise/Utility Computing

Varying workloads, complex applications
Human management is infeasible and error-prone

How to manage resources to maximize revenue while meeting client requirements?
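A toy illustration of the resource-management question (all application names, minimums, and revenue figures are invented): first satisfy each client's minimum requirement, then hand remaining servers greedily to the application with the highest revenue per extra server.

```python
# Toy utility-computing allocator: satisfy each client's minimum first, then
# give remaining servers to whoever earns the most revenue per extra server.
# All applications, minimums, and revenue figures are hypothetical.

apps = {             # app -> (minimum servers, revenue per extra server)
    "web-store":  (2, 120),
    "analytics":  (1, 80),
    "batch-jobs": (1, 30),
}
TOTAL_SERVERS = 8

alloc = {name: need for name, (need, _) in apps.items()}   # meet requirements
spare = TOTAL_SERVERS - sum(alloc.values())

while spare > 0:
    # Greedy step: the next server goes to the highest marginal-revenue app.
    best = max(apps, key=lambda a: apps[a][1])
    alloc[best] += 1
    spare -= 1

print(alloc)   # -> {'web-store': 6, 'analytics': 1, 'batch-jobs': 1}
```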

Page 24

Example 5: Search Engine: Google

Web content is highly dynamic
Self-tuning:
How good is the search engine at keeping up with changes in Web content?

Self-healing:
Thousands of servers and disks in their data centers, failures every few hours!
Does google.com keep working despite these failures?
How much human intervention does this need?
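A back-of-the-envelope check of the “failures every few hours” claim, using assumed numbers (the fleet size and per-server MTBF are guesses, not figures from the slide): the aggregate failure rate scales with the number of machines.

```python
# Back-of-the-envelope: expected time between server failures in a large fleet,
# assuming independent failures. Fleet size and per-server MTBF are guesses.

SERVERS = 10_000
MTBF_HOURS = 3 * 365 * 24        # assume each server fails about once in 3 years

fleet_failures_per_hour = SERVERS / MTBF_HOURS
hours_between_failures = 1 / fleet_failures_per_hour
print(f"~{fleet_failures_per_hour:.2f} failures/hour, "
      f"i.e. one every ~{hours_between_failures:.1f} hours")
# -> roughly one failure every 2.6 hours
```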

Page 25

Outline
Motivation and history
Examples
Self-* networks/distributed systems
Relevant areas/useful techniques
Summary

Page 26

Relevant areas/useful techniques

Multi-criteria optimization techniques (economics)
Analytical modeling (e.g., to infer the resource needs of an application)
Measurement techniques
Feedback-control theory (reactive)
Statistical techniques for prediction and learning (reactive + proactive)
Biological, ecological, and social networks
How do termites with pinhead-sized brains build air-conditioned colonies?

Theoretical CS: online algorithms, approximation algorithms
Distributed computing
Systems issues
Efficient and bug-free software, prototyping, simulation, experiment design

Page 27

Outline
Motivation and history
Examples
Self-* networks/distributed systems
Relevant areas/useful techniques
Summary

Page 28

Summary: Key Principles

Keep it simple, silly!

Occam’s razor
E.g., partial automation vs. complete automation

Understand and define system goals clearly
Which self-* properties are essential, and which are not?

Understand system properties, operating environments

One size may not fit all
Measurements
Prediction, classification, learning, feedback control

Design for agility (assuming online operation)
Efficient algorithms and systems mechanisms