Physical Programming: Beyond Mere Logic

Bran Selic, Rational Software Canada, [email protected]


Page 1: Physical Programming:  Beyond Mere Logic

Bran Selic
Rational Software Canada
[email protected]

Physical Programming: Beyond Mere Logic

Page 2: Physical Programming:  Beyond Mere Logic

What I am Hoping For

E
THEORY AND PRACTICE OF SOFTWARE

Page 3: Physical Programming:  Beyond Mere Logic

The Ideal and the Real

By focussing on the imperfect world of physical reality we may miss the essence

Software seems much closer to the “ideal” world

Page 4: Physical Programming:  Beyond Mere Logic

The Software World

Fundamental design principle: separate program logic from the underlying implementation technology
  • separation of concerns
  • software portability

[Figure: Program Logic; HL Programming Languages; Computing Environment & Technology]

Page 5: Physical Programming:  Beyond Mere Logic

The Real-Time Software World

Key question: How long will it take?
The quantitative characteristics of the computing environment encroach upon the purity of the logic
  • software design involves engineering tradeoffs

[Figure: HL Programming Languages; Program Logic; Computing Environment & Technology]

Page 6: Physical Programming:  Beyond Mere Logic

A Simple Programming Application

Traverse a transactions log database and print all transactions pertaining to a specific account

    open (DB);
    for i := 1 to DB.size do
        record := read (DB);
        if (record.acctNo = myAccount) then
            print (record);
    enddo;
    close (DB);

[Figure: DB, Printer, CPU]

Page 7: Physical Programming:  Beyond Mere Logic

Porting to a Distributed Environment

Can it really be this simple?

Original program:

    open (DB);
    for i := 1 to DB.size do
        record := read (DB);
        if (record.acctNo = myAccount) then
            print (record);
    enddo;
    close (DB);

Distributed version:

    RPC_open (DB);
    for i := 1 to DB.size do
        record := RPC_read (DB);
        if (record.acctNo = myAccount) then
            print (record);
    enddo;
    RPC_close (DB);

[Figure: Printer and CPU connected through a Network to replicated DB servers, each a DB with its own CPU]

Page 8: Physical Programming:  Beyond Mere Logic

Some (Unstated!) Assumptions

The CPU and database are fast enough for the needs of the application
  • e.g., random access database hardware
The CPU and database fail as a unit
  • i.e., no need to contend with failures of the database
Communication is reliable
  • order preserving
  • exactly once semantics
A system never has anything more important to do than what it is doing at the moment

Page 9: Physical Programming:  Beyond Mere Logic

Partial Failures

Distributed systems can exhibit partial failures
  • fault tolerance: ability to recover from partial failures
Issue: failure recovery strategy
  • fault detection
  • failure recovery
  • fault diagnosis
Issue: how do other sites detect that a site has failed?
  • (apparent) lack of activity/response
  • how do we distinguish between a failed site and a lost message?
      • Timeout is the only general mechanism available
  • how long do we wait?
      • Tradeoff between responsiveness and degree of certainty
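The timeout tradeoff can be illustrated with a minimal Python sketch. The `TimeoutDetector` class, its method names, the site name, and the timeout values are illustrative assumptions, not part of the original slides; time is passed in explicitly so the example is deterministic.

```python
from dataclasses import dataclass, field


@dataclass
class TimeoutDetector:
    """Suspects a site once it has been silent for longer than `timeout` seconds."""

    timeout: float
    last_heard: dict = field(default_factory=dict)

    def heartbeat(self, site: str, now: float) -> None:
        # Record when the latest message from `site` arrived.
        self.last_heard[site] = now

    def suspects(self, site: str, now: float) -> bool:
        # Silence beyond the timeout is treated as failure. The detector cannot
        # tell a crashed site from a slow one or from a lost message.
        last = self.last_heard.get(site)
        return last is None or (now - last) > self.timeout


impatient = TimeoutDetector(timeout=1.0)  # responsive, but less certain
patient = TimeoutDetector(timeout=5.0)    # more certain, but slower to react
for detector in (impatient, patient):
    detector.heartbeat("db1", now=0.0)

# At t = 2.0 the site is merely slow: only the short timeout raises an alarm.
print(impatient.suspects("db1", now=2.0))  # True
print(patient.suspects("db1", now=2.0))    # False
```

A short timeout reacts quickly but risks declaring a live site dead; a long one is more certain but delays recovery.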

Page 10: Physical Programming:  Beyond Mere Logic

A More Realistic Distribution Scenario

Dealing with partial failures

    DB := locate_database (Network) exception abort;
    RPC_open (DB) exception do
        DB := locate_database (Network) exception abort;
    enddo;
    for i := 1 to DB.size do
        record := RPC_read (DB) exception do
            DB := locate_database (Network) exception abort;
            for j := 1 to (i-1) do
                RPC_read (DB) exception abort;
            enddo;
            retry;
        enddo;
        if (record.acctNo = myAccount) then
            print (record);
    enddo;
    RPC_close (DB);

Most of the code is in the exception handlers!
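The recovery logic of this slide can be sketched in runnable Python. Everything here is a stand-in chosen for illustration: `Replica` is an in-memory substitute for a remote database server, `SiteDown` is a hypothetical error type, and `locate` plays the role of `locate_database`. It is a sketch under those assumptions, not the author's implementation.

```python
class SiteDown(Exception):
    """Raised by a replica that has failed (hypothetical error type)."""


class Replica:
    """In-memory stand-in for a remote DB server with a sequential read cursor."""

    def __init__(self, records, fail_at=None):
        self.records = records
        self.fail_at = fail_at  # cursor position at which this replica crashes
        self.cursor = 0

    def open(self):
        self.cursor = 0

    def read(self):
        if self.fail_at is not None and self.cursor == self.fail_at:
            raise SiteDown()
        record = self.records[self.cursor]
        self.cursor += 1
        return record


def locate(pool):
    """Return the next replica, opened; running out of replicas means abort."""
    db = next(pool, None)
    if db is None:
        raise RuntimeError("no replica available")
    db.open()
    return db


def scan(replicas, account):
    """Collect records for `account`, failing over to another replica on error."""
    pool = iter(replicas)
    db = locate(pool)
    matches = []
    i = 0
    while i < len(db.records):
        try:
            record = db.read()
        except SiteDown:
            db = locate(pool)
            for _ in range(i):  # skip the records already processed
                db.read()       # a failure here propagates, i.e. aborts
            continue            # retry the read that failed
        if record["acctNo"] == account:
            matches.append(record)
        i += 1
    return matches


log = [{"acctNo": 1, "amt": 10}, {"acctNo": 2, "amt": 5}, {"acctNo": 1, "amt": 7}]
print(scan([Replica(log, fail_at=1), Replica(log)], account=1))
```

Even in this small form, the failover branch is about as long as the main path it protects.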

Page 11: Physical Programming:  Beyond Mere Logic

Asynchronous Events and Fault Tolerance

Partial system failures are only one kind of event that may need to be handled in the course of execution of a distributed program
Others:
  • high-priority situations (e.g., imminent deadlines)
  • aborts
These events are often unpredictable
  • may occur at any point in the execution of a program
  • fault tolerance requires that whenever they occur and whatever they are, we need to deal with them

Page 12: Physical Programming:  Beyond Mere Logic

Revisiting An Old Assumption

Is the traditional “main path” focussed programming style appropriate when exceptions are the rule?

[Figure: main path Step N, Step N+1, Step N+2; exceptions raised along the path lead to Handler A_N, Handler A_N+1, and Handler B]

Page 13: Physical Programming:  Beyond Mere Logic

Asynchronous Event Handling

This is nicely captured by the state-event matrix of finite state machines

[Figure: state-event matrix with Step N, Step N+1, Step N+2 along one axis and Event A, Event B, etc., Event S along the other; cells hold handlers such as Handler A_N, Handler A_N+1, Handler A_N+2, and Handler B]

Page 14: Physical Programming:  Beyond Mere Logic

A Conclusion

In an event-driven and deadline-based application, a state machine-based programming model may be more appropriate than the traditional algorithmic (“main path”) programming model
The environment strikes back
  • the program logic is strongly affected by the environment
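A state-event matrix can be written down directly as a lookup table. The sketch below is a minimal Python illustration; the states, events, and action labels are hypothetical names chosen to echo the database example, not something defined in the slides.

```python
from enum import Enum, auto


class State(Enum):
    OPENING = auto()
    READING = auto()
    CLOSED = auto()


class Event(Enum):
    REPLY = auto()
    SITE_FAILED = auto()
    ABORT = auto()


# One cell per (state, event) pair: the next state and a label for the handler.
MATRIX = {
    (State.OPENING, Event.REPLY): (State.READING, "start reading"),
    (State.OPENING, Event.SITE_FAILED): (State.OPENING, "locate another replica"),
    (State.OPENING, Event.ABORT): (State.CLOSED, "give up"),
    (State.READING, Event.REPLY): (State.READING, "process record"),
    (State.READING, Event.SITE_FAILED): (State.OPENING, "locate another replica"),
    (State.READING, Event.ABORT): (State.CLOSED, "close and give up"),
}


def dispatch(state, event):
    """Look up the cell for (state, event); CLOSED has no cells and ignores events."""
    return MATRIX.get((state, event), (state, "ignore"))


def run(events, state=State.OPENING):
    """Feed a sequence of events through the matrix and record the actions taken."""
    trace = []
    for event in events:
        state, action = dispatch(state, event)
        trace.append(action)
    return state, trace


final, trace = run([Event.REPLY, Event.REPLY, Event.SITE_FAILED, Event.REPLY, Event.ABORT])
print(final, trace)
```

Because every event has a cell in every live state, no event is an "exception" to the main path; the table makes missing cases visible.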

Page 15: Physical Programming:  Beyond Mere Logic

Communication Media Failures

Message loss
  • due to hardware failures
  • due to software failures (e.g., buffer overflow)
Message reordering
  • due to different paths
  • due to variable delays (e.g., due to variable message lengths)
  • retransmission due to fault-tolerant protocols
Message duplication
  • due to faulty hardware
  • retransmission due to fault-tolerant protocols
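Reordering and duplication are commonly masked with sequence numbers. The following Python sketch shows one such receiver; the class name and the numbering scheme (consecutive integers starting at 0) are assumptions made for the example.

```python
class InOrderReceiver:
    """Delivers payloads in sequence-number order and drops duplicates."""

    def __init__(self):
        self.next_seq = 0   # sequence number expected next
        self.pending = {}   # early arrivals, keyed by sequence number

    def receive(self, seq, payload):
        """Accept one message; return the payloads that became deliverable."""
        if seq < self.next_seq or seq in self.pending:
            return []  # duplicate of something already delivered or buffered
        self.pending[seq] = payload
        delivered = []
        while self.next_seq in self.pending:
            delivered.append(self.pending.pop(self.next_seq))
            self.next_seq += 1
        return delivered


receiver = InOrderReceiver()
print(receiver.receive(1, "b"))  # [] since message 0 has not arrived yet
print(receiver.receive(0, "a"))  # ['a', 'b']
print(receiver.receive(1, "b"))  # [] since this is a duplicate
```

Message loss is not handled here: a lost message leaves a gap that only a timeout and retransmission can close, and retransmission is itself a source of duplicates and reordering.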

Page 16: Physical Programming:  Beyond Mere Logic

Transmission Delays

Possibility of out of date status information

[Figure: an observer queries a Processing Site (“State?”) that switches between on and off; the reply is “on”]

Page 17: Physical Programming:  Beyond Mere Logic

Relativistic Effects

Relativistic effects: different observers see different event orderings (due to different and variable transmission delays)

[Figure: message sequence over time among clientA, notifier1, notifier2, and clientB, with events E1 and E2]
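A small Python calculation makes the effect concrete. The send times and delays below are made-up numbers, and `observed_order` is a hypothetical helper; only the names E1, E2, notifier1, notifier2, clientA, and clientB come from the slide.

```python
def observed_order(events, delay_to_observer):
    """Return event names in the order one observer receives them.

    events: iterable of (name, source, sent_at)
    delay_to_observer: transmission delay from each source to this observer
    """
    arrivals = sorted(
        (sent_at + delay_to_observer[source], name)
        for name, source, sent_at in events
    )
    return [name for _, name in arrivals]


# E1 is sent before E2 (assumed times, arbitrary units).
events = [("E1", "notifier1", 0.0), ("E2", "notifier2", 1.0)]

# Assumed delays: each client is "close" to a different notifier.
delays_to_client_a = {"notifier1": 1.0, "notifier2": 5.0}
delays_to_client_b = {"notifier1": 5.0, "notifier2": 1.0}

print(observed_order(events, delays_to_client_a))  # ['E1', 'E2']
print(observed_order(events, delays_to_client_b))  # ['E2', 'E1']
```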

Page 18: Physical Programming:  Beyond Mere Logic

Distribution Transparencies

Providing supporting layers of functionality that shield the application from the undesirable effects of distribution
  • e.g., reliable communication protocols

[Figure: client and server on separate Processing Sites, each above a Reliable Comm Service, connected by a Communications Medium]

Page 19: Physical Programming:  Beyond Mere Logic

Impossibility Result No. 1

It is not possible to guarantee that agreement can be reached in finite time over an asynchronous communication medium, if the medium is lossy or one of the distributed sites can fail

Fischer, M., N. Lynch, and M. Paterson, “Impossibility of Distributed Consensus with One Faulty Process” Journal of the ACM, (32, 2) April 1985.

Page 20: Physical Programming:  Beyond Mere Logic

Impossibility Result No. 2

Even when communication is fully reliable, it is not possible to guarantee common knowledge if communication delays are unbounded

Halpern, J.Y, and Moses, Y., “Knowledge and common knowledge in a distributed environment” Journal of the ACM, (37, 3) 1990.

Page 21: Physical Programming:  Beyond Mere Logic

The “End-To-End” Argument

Transparency mechanisms are intended to protect the application from observing the undesirable effects of distribution
  • Most transparency types require distributed agreement!
The end-to-end argument [Saltzer et al.]:
  • if transparency cannot be guaranteed, the application is not really shielded from the effects of distribution
  • the overhead of introducing transparency mechanisms may not be justified

Page 22: Physical Programming:  Beyond Mere Logic

Stepping Back...

Most distribution problems are a consequence of the encroachment of the physical world into the pliable and limitless “logical” world of software
  • the problem is fundamental (e.g., the end-to-end argument)

Traditional Programming = Logic
Physical Programming = Logic + Physics

like traditional engineers, software designers must take into account the raw material out of which they spin their logic

finite resources, finite delays, finite reliability...

Page 23: Physical Programming:  Beyond Mere Logic

Quality of Service Concepts

The physical characteristics of software can be specified using the general notion of Quality of Service (QoS):
  • a specification of how well a service is (to be) performed
  • e.g., throughput, capacity, response time
  • usually a quantitative measure
QoS specifications are two sided:
  • offered QoS: the QoS that is offered to clients
  • required QoS: the QoS required by a client

Page 24: Physical Programming:  Beyond Mere Logic

Resources and Quality of Service

Resource: an element whose functional capacity is limited, directly or indirectly, by the finite capacities of the underlying physical computing environment
The services of a resource are characterized by one or more QoS attributes
  • capacity, reliability, availability, response time, etc.

[Figure: a Client places a Resource Demand on a Resource through service S1; the client's RequiredQoS and the resource's OfferedQoS are related by the constraint {RequiredQoS OfferedQoS}]

Page 25: Physical Programming:  Beyond Mere Logic

Simple Example

Concurrent tasks accessing a monitor with known response time characteristics

[Figure: Client1 and Client2 each call access ( ) on myMonitor. Required QoS: {Deadline = 5 ms} and {Deadline = 3 ms}. Offered QoS: {MaxExecutionTime = 4 ms}]
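The numbers on this slide can be checked with a few lines of Python. The blocking model is an assumption added for illustration: access to the monitor is mutually exclusive and non-preemptive, and each client has at most one call outstanding, so a caller may wait for every other client once before its own call runs. The function names are hypothetical.

```python
def worst_case_response_ms(max_execution_ms: float, n_clients: int = 1) -> float:
    # Assumed model: the caller waits for up to (n_clients - 1) other calls,
    # each bounded by max_execution_ms, and then executes its own call.
    return n_clients * max_execution_ms


def meets_deadline(deadline_ms: float, max_execution_ms: float, n_clients: int = 1) -> bool:
    # Required QoS (deadline) is met when the worst-case response fits inside it.
    return worst_case_response_ms(max_execution_ms, n_clients) <= deadline_ms


MAX_EXECUTION_MS = 4.0  # offered QoS of myMonitor

for deadline in (5.0, 3.0):  # the two required deadlines on the slide
    alone = meets_deadline(deadline, MAX_EXECUTION_MS)
    contended = meets_deadline(deadline, MAX_EXECUTION_MS, n_clients=2)
    print(f"deadline {deadline} ms: alone={alone}, with contention={contended}")
```

Under these assumptions the 3 ms deadline cannot be met even without contention, and the 5 ms deadline is met only if the other client never holds the monitor at the same time.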

Page 26: Physical Programming:  Beyond Mere Logic

Types and “Physical” Types

The purpose of types is to tell us about the externally relevant properties of software components so that we can validate whether they are being used appropriately
Physical types: type specifications that incorporate QoS characteristics
Answer two key engineering questions:
  • can this component support the “load” intended for it?
  • what does this component require to support its offered QoS?

Page 27: Physical Programming:  Beyond Mere Logic

Physical Type Example

A semaphore type:

    class Semaphore {
        {heap = 10 bytes}    -- required QoS
        {CPU 5 MIPS}         -- required QoS
        get() {proc 0.4*CPU us; stack = 4 bytes};
        rel() {proc 0.4*CPU us; stack = 4 bytes};
    }

Usage:

    mySema : Semaphore;
    mySema.get() {proc 3 us}    -- req. QoS
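One way to picture a physical type in an ordinary language is as a type whose declaration carries its QoS figures alongside its operations. The Python sketch below is an illustration only: the class and function names are hypothetical, and the expression `0.4*CPU us` is read literally as 0.4 times the MIPS rating, which is an assumption about the slide's notation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OperationQoS:
    proc_us: float    # offered worst-case processing time in microseconds
    stack_bytes: int  # stack the operation needs


@dataclass(frozen=True)
class SemaphoreType:
    heap_bytes: int   # required QoS: heap
    cpu_mips: float   # required QoS: processor rating
    get: OperationQoS
    rel: OperationQoS


def make_semaphore_type(cpu_mips: float = 5.0) -> SemaphoreType:
    # Assumption: "0.4*CPU us" is taken literally as 0.4 times the MIPS value.
    op = OperationQoS(proc_us=0.4 * cpu_mips, stack_bytes=4)
    return SemaphoreType(heap_bytes=10, cpu_mips=cpu_mips, get=op, rel=op)


def call_is_admissible(offered: OperationQoS, required_proc_us: float) -> bool:
    # A usage site is acceptable when the offered bound fits the requirement.
    return offered.proc_us <= required_proc_us


sema = make_semaphore_type()
print(call_is_admissible(sema.get, required_proc_us=3.0))
```

The check mirrors the usage line on the slide: the caller states a required processing bound and the type's offered bound is compared against it.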

Page 28: Physical Programming:  Beyond Mere Logic

Violation of Encapsulation?

Aren’t the offered QoS characteristics a consequence of the implementation?
Not necessarily...
The offered QoS characteristics can and should be defined independently of the implementation
  • the “worst-case” numbers of traditional engineering
The contractual obligations that the component designer is willing to assume

Page 29: Physical Programming:  Beyond Mere Logic

Physical Type Checking

Can physical types be statically checked?
The good news: Yes, they can (in most cases)
The bad news: typically requires complex analysis methods (queueing network analysis, schedulability analysis, etc.)
  • but then, model checking and theorem proving are not simple either
Some issues:
  • Typically, QoS-based analyses cannot be done incrementally -- the full system context is required
      • but then, the same holds for many formal verification methods
  • Each type of QoS (e.g., bandwidth, CPU performance) combines differently
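The remark that each type of QoS combines differently can be shown with three familiar composition rules, sketched in Python. The rules are standard textbook ones rather than anything stated in the slides, the function names and sample figures are hypothetical, and the reliability rule assumes independent failures.

```python
from math import prod


def path_bandwidth(link_mbit_s):
    # Bandwidth along a path is limited by its slowest link.
    return min(link_mbit_s)


def path_delay(hop_delays_ms):
    # Delays along a path add up.
    return sum(hop_delays_ms)


def series_reliability(component_reliabilities):
    # Components in series all have to work; assumes independent failures.
    return prod(component_reliabilities)


print(path_bandwidth([100, 70, 1000]))   # 70
print(path_delay([2, 3, 5]))             # 10
print(series_reliability([0.99, 0.99]))
```

A checker for physical types therefore needs a separate combination rule per QoS attribute; a single generic "add up the numbers" rule would be wrong for most of them.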

Page 30: Physical Programming:  Beyond Mere Logic

Required QoS

Like all guarantees, the offered QoS is contingent on the component getting what it needs to do its job
There are two distinct dimensions to this:
  • the peer dimension
  • the layering dimension

[Figure: Client uses ResourceA through service S1, and ResourceA uses ResourceB through service S2; underneath are the CPU and the Physical Processor]

Page 31: Physical Programming:  Beyond Mere Logic

Logical Viewpoint

Example: logical view of aircraft simulator software

[Figure: INSTRUCTOR STATION, AIRFRAME, GROUND MODEL, ATMOSPHERE MODEL, ENGINES, CONTROL SURFACES, PILOT CONTROLS]

Page 32: Physical Programming:  Beyond Mere Logic

Engineering (Realization) Viewpoint

The realization of a specific set of logical components using facilities of the run-time environment

[Figure: two Processors connected by an Ethernet LAN; OS processes, each with a stack; TCP/IP sockets]

Page 33: Physical Programming:  Beyond Mere Logic

Viewpoints and Mappings

[Figure: the Logical Viewpoint (INSTRUCTOR STATION, AIRFRAME, GROUND MODEL, ATMOSPHERE MODEL, ENGINES, CONTROL SURFACES, PILOT CONTROLS) linked by realization mappings to the Engineering Viewpoint (Processors, Ethernet LAN, OS processes with stacks, TCP/IP sockets)]

Page 34: Physical Programming:  Beyond Mere Logic

The Engineering Viewpoint

The engineering viewpoint represents the “raw material” out of which we construct the logical viewpoint
  • the quality of the outcome is only as good as the quality of the ingredients that are put in
  • as in all true engineering, the quantitative aspects of the logical model are often crucial (How long will it take? How much will be required?…)

Page 35: Physical Programming:  Beyond Mere Logic

Distributed Systems Dilemma

Dilemma: How can we account for the engineering characteristics of the system without prematurely and possibly unnecessarily committing to a specific technology?
Proposed solution: Include in the logical model a generic (technology-neutral) specification of the required/expected characteristics of the engineering environment

Page 36: Physical Programming:  Beyond Mere Logic

Viewpoint Separation

Required Environment: a technology-neutral environment specification required by the logical elements of a model

[Figure: the Logical Viewpoint rests on a Required Environment, which can be realized by Engineering Viewpoint (alternative A), made of UNIX Processes, or by Engineering Viewpoint (alternative B), made of WinNT Processes]

Page 37: Physical Programming:  Beyond Mere Logic

Required Environment Specifications

What a logical component needs in order to perform its function according to spec

[Figure: realization mapping from the logical element (client) Airframe to the engineering elements (resources) CPU and LAN. Required QoS values: CPU: 3 MIPs, Bandw.: 70 Mbit/s, Mem: 2 MB. Offered QoS values: 20 MB, 3 MIPs, 100 Mbit/s]
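The comparison implied by this slide can be sketched in Python using its own figures. The dictionary keys and the `shortfalls` helper are hypothetical, and the rule "offered must be at least required" is an assumption that suits capacity-like attributes such as these.

```python
# Required QoS of the Airframe element and offered QoS of the CPU and LAN,
# using the values shown on the slide.
required = {"cpu_mips": 3, "bandwidth_mbit_s": 70, "memory_mb": 2}
offered = {"cpu_mips": 3, "bandwidth_mbit_s": 100, "memory_mb": 20}


def shortfalls(required, offered):
    """Return {attribute: (needed, available)} for every unmet requirement."""
    return {
        name: (needed, offered.get(name, 0))
        for name, needed in required.items()
        if offered.get(name, 0) < needed
    }


print(shortfalls(required, offered))  # {} means the mapping is feasible
print(shortfalls(required, {**offered, "bandwidth_mbit_s": 50}))
```

With the slide's numbers every requirement is covered, although the CPU has no headroom: 3 MIPs are required and exactly 3 MIPs are offered.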

Page 38: Physical Programming:  Beyond Mere Logic

Required Environment Partitions

Logical elements often share common QoS requirements

[Figure: the aircraft simulator components (INSTRUCTOR STATION, AIRFRAME, GROUND MODEL, ATMOSPHERE MODEL, ENGINES, CONTROL SURFACES, PILOT CONTROLS) grouped into QoS domains (e.g., failure unit, uniform comm properties)]

Page 39: Physical Programming:  Beyond Mere Logic

QoS Domains

Specify a domain in which certain QoS values apply throughout:
  • failure characteristics (failure modes, availability, reliability)
  • CPU speeds
  • communications characteristics (delay, throughput, capacity)
  • etc.
The QoS values of a domain can be compared against those of a concrete engineering environment to see if a given environment is adequate for a specific model

Page 40: Physical Programming:  Beyond Mere Logic

“Physical” Programming

The notions of QoS and QoS domains enable the design of distributed systems that properly account for the effects of distribution and other non-transparent physical phenomena, while allowing for a high degree of portability and technology independence
They are also the basis for formal verification of realization mappings
  • {required QoS QoS of the proposed engineering environment}
May also be used to automatically synthesize engineering environments that satisfy a given QoS specification of a logical model
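As a toy illustration of synthesis, the Python sketch below places logical elements on processors with a first-fit heuristic so that no processor's CPU capacity is exceeded. The heuristic, the element demands, and the processor names are all assumptions made for the example; real synthesis would have to consider many QoS attributes at once, and first-fit can miss placements that an exhaustive search would find.

```python
def first_fit(demands, capacities):
    """Assign each element to the first processor with enough spare capacity.

    demands: {element: required MIPS}; capacities: {processor: offered MIPS}.
    Returns {element: processor}, or None if this heuristic finds no placement.
    """
    remaining = dict(capacities)
    placement = {}
    # Place the most demanding elements first.
    for element, need in sorted(demands.items(), key=lambda item: -item[1]):
        host = next((p for p, free in remaining.items() if free >= need), None)
        if host is None:
            return None
        remaining[host] -= need
        placement[element] = host
    return placement


# Hypothetical CPU demands (MIPS) for three simulator components.
demands = {"airframe": 3, "engines": 2, "ground_model": 1}

print(first_fit(demands, {"p1": 3, "p2": 3}))
print(first_fit(demands, {"p1": 3}))  # None: one 3 MIPS processor is not enough
```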

Page 41: Physical Programming:  Beyond Mere Logic

Conclusions and an Appeal...

The physical aspects of software will not go away
  • ignoring them can be perilous
  • especially when working with distributed systems

most interesting software systems of the future will be distributed and will have stringent dependability requirements (“cannot reboot the Internet”)

What is needed is a proper theoretical framework for dealing with physical types

The QoS framework described here is currently being incorporated into a profile of UML for real-time applications