Object-Oriented Real-Time Distributed Computing 2nd IEEE

32
Towards Predictable CORBA-based Web-Services Andreas Polze, Jan Richling, Janek Schwarz and Miroslaw Malek Department of Computer Science Humboldt University of Berlin AP 5/99 2nd IEEE International Symposium on Object-Oriented Real-Time Distributed Computing

Transcript of Object-Oriented Real-Time Distributed Computing 2nd IEEE

Page 1: Object-Oriented Real-Time Distributed Computing 2nd IEEE

Towards Predictable CORBA-basedWeb-Services

Andreas Polze, Jan Richling, Janek Schwarz and Miroslaw MalekDepartment of Computer Science

Humboldt University of Berlin

AP 5/99

2nd IEEE International Symposium onObject-Oriented Real-Time Distributed Computing

Page 2: Object-Oriented Real-Time Distributed Computing 2nd IEEE

Towards Predictable CORBA-basedWeb-Services

Composite Objects:Building Blocks for Predictable CORBA-based ServicesReal-time & Call Admission:

- Producer/Consumer/Viewer- Scheduling Server for predictable execution

Fault-tolerance: - Observer/Observable Objects: FT Netscape- Broker+Group Communication: FT Maze

Conclusions & Future Work- FT-DIO: Distributed I/O-Framework

Page 3: Object-Oriented Real-Time Distributed Computing 2nd IEEE

Web-Service- fault-tolerant- real-time- replicated

Composite Object

Web-Service- fault-tolerant- real-time- replicated

Composite Object

Communication Middleware

CORBA

Observer

Client(Browser)

- fault-tolerant- ObservableObject- backup

Client(Browser)

- fault-tolerant- ObservableObject- primary

Endsystem Architecture for Predictable Web-Services

Web-Service- fault-tolerant- real-time- replicated

Composite Object

• Observer/Observable Objects for fault-tolerant clients

• Composite Objects/Consensus for responsive Web-services

Page 4: Object-Oriented Real-Time Distributed Computing 2nd IEEE

CORBAinterface

RT servicemethodsRT comm

CORBAinterface

RT servicemethodsRT comm

CORBAinterface

RT servicemethodsRT comm

consensus object(Composite Object)

broker object(transparent)

CORBA client

replicatedrequest

RT communicationfor consensusprotocols (e.g., voting)

Responsive Services

- Fault-tolerance: redundancy in time and space

- Real-time: guarantees from underlying OS (Mach)

- Method invocation as unit of replication/scheduling

Responsive Service

implemented byreplicated, distributedserver objects

CORE/SONiC execution model

Page 5: Object-Oriented Real-Time Distributed Computing 2nd IEEE

Responsive Computing with CORBA:Mismatch of system assumptions

knowledge about implementation details:- resource usage,- timing behavior,- scheduling policies

0 complete

CORBA MARSCORE/SONiC

Problem:

Solutions:

1) "Realtime CORBA": quality-of-service guarantees for CORBA by extending specification

2) "Responsive Services": based on CORE/SONiC and CORBA connected by Composite Objects

-> Composite Objects for predictable integration of CORBA with FT RT computing

Page 6: Object-Oriented Real-Time Distributed Computing 2nd IEEE

CORBA and Real-Time Computing:

1. NONINTERFERENCE:- we should create an environment in which general purpose

computing and real time computing will not burden each other.

2. INTEROPERABILITY:- the services exported by general purpose computing objects

and by real time computing objects can be utilized by each other.

3. ADAPTIVE ABSTRACTION:- lower level information and scheduling actions needed by real

time computing is available for real time objects but transparentto non-real time objects.

Standard CORBA is not sufficient.Modifications to ORB implementations not desirable.

-> Composite Objects

Page 7: Object-Oriented Real-Time Distributed Computing 2nd IEEE

DCE DCOM CORBA

Composite Obect

RTFT

Security

Composite Objects: filtering bridgepredictable integration of RT and non-RT (CORBA) functionality

ideally: multiprocessor to separateRT & non-RT tasks

• use standard scheduling techniques:- RMA for RT tasks- interactive scheduling/aging for

non-RT tasks

now: simulate multiprocessor onuniprocessor

• vertical firewall: time slicing / Scheduling Server

• horizontal firewall: assign different priority levels to RT/non-RT tasks

• Composite Objects provide functions for assignment of priorities to methods and for registration with Scheduling Server

Page 8: Object-Oriented Real-Time Distributed Computing 2nd IEEE

Replicate object’s data: RT & non-RT part

Management of data transfer/mirroring between RT & non-RT part of theComposite Object:

• shadow variables for continous data• buffers+flush+exceptions+timeouts for discrete data (depth of buffer is parameter)• variables/buffers with different priorities

Memory managment

• memory locking for RT data -> overloaded versions of new/malloc• paging for non-RT data

Page 9: Object-Oriented Real-Time Distributed Computing 2nd IEEE

CPU Firewalls

vertical

Scheduling Server: restricting CORBA in its CPU usage

• no changes to Object Request Broker required• without changes to Mach OS kernel (user space server)• similar work exists for rtLinux, Solaris (URsched)

SchedulingServer

interactivetasks

soft RTtasks

time

priority31

kerneltasks

time

priority

RT-priority levels

non-RT priority levels

horizontal

Page 10: Object-Oriented Real-Time Distributed Computing 2nd IEEE

Scheduling S

erver Concept

high priority server manipulates client thread’s priority

fixed priority scheduling policy

Scheduling

Server

Client

tasks

task list

AB

C

Earliest D

eadline First (E

DF

)R

ate Monotonic S

cheduling (RM

S)

Scheduling S

erver implem

ents:

deadline

task control port

handoff scheduling - hints to the OS

’ scheduler

and ensures interactive availab

ility !

computing power

number of

background processes

experiments

stability of rtLinux version undervarying background loads

Page 11: Object-Oriented Real-Time Distributed Computing 2nd IEEE

Scheduling Server: overhead and stability

clie

nt ta

sk’s

per

form

ance

(M

FLO

PS

)

rem

aini

ng s

yste

m’s

per

form

ance

(M

FLO

PS

)

remaining system’s performance

client task’s performance

100 experiments per given CPU percentage

MF

LOP

S

experiments

number of background processes

MF

LOP

S

experiments

number of background processes

• implementation based on MachOS (NeXTSTEP), HP PA-RISC

• little impact of varying background/ disk I/O loads

• overhead less than 10%, typical 5%

overhead

stability: background load

stability: disk I/O

Page 12: Object-Oriented Real-Time Distributed Computing 2nd IEEE

Composite Object

CORBA-threads

public NRT data

shared RT/NRTvariables

RT-threads

public RT data

shared RT/NRTvariables

consistency protocolpipe, rt-fifo

pipe, rt-fifo

handler thread handler thread

write write

read read

Communication inside a Composite Object

- replicated data: shared Real Time / Non-Real Time variables- weakly consistent memory management

- handler thread implements data mirroring:• periodically, programmable update rate• event-triggered

Page 13: Object-Oriented Real-Time Distributed Computing 2nd IEEE

CORBAclient

CORBAmethod

RT-method

datasource

pipe

pipeT7

T3

T2T1T0

T6 T5T4

Mach IPCCORBA IIOP

Composite Object´s Overhead

0

40

80

120

160

0 100 200 300 400

t in

ms.

number of experiments

’T7 - T0’’T6 - T1’’T5 - T2’

Composite Object - overhead

Scenario:

Environment:

- Composite Object´s host: HP 715/50, NeXTSTEP 3.3, ILU 2.0 alpha 12- CORBA client: SparcStation 2, Solaris 2.6, OOC OmniBroker 2.02

Stable timing behaviour inside Composite Object.Communication latency increased by 1ms.

Observations:

A.P./M.M. 4/98

Page 14: Object-Oriented Real-Time Distributed Computing 2nd IEEE

Restricting the ORB:call admission via Scheduling Server

• Object Request Broker is restricted in its CPU usage:- independent of load/number of clients, calls, objects

• tradeoff between predictability and communication latency• no changes neither to ORB nor OS kernel necessary

Version B: (contd.)Viewer is CORBA client

Call Admission via Scheduling Server - Communication via pipes

Period: 50ms, CORBA: 10ms quantum Period: 80ms, CORBA: 10ms quantuminitial - no Scheduling ServerVariance: (1 client) 0.072025

(2 clients) 0.091340 (5 clients) 0.146772

Variance: (1 client) 0.077117 (2 clients) 0.077483 (5 clients) 0.088174

Variance: (1 client) 0.074085 (2 clients) 0.056264 (5 clients) 0.059500

80

120

160

200

0 50 100 150 200 250 300 350 400 450 500

’1_Client’’2_Clients’’5_Clients’

t in ms.

number of executions

80

120

160

200

0 50 100 150 200 250 300 350 400 450 500

’1_Client’’2_Clients’’5_Clients’

t in ms.

number of executions

80

120

160

200

0 50 100 150 200 250 300 350 400 450 500

’1_Client’’2_Clients’’5_Clients’

t in ms.

number of executions

Page 15: Object-Oriented Real-Time Distributed Computing 2nd IEEE

Software Fault-tolerance - Fault Model

Omission faultCrash fault

Timing fault

Byzantine fault

Computation fault

Membership protocols

System diagnosis / Voting protocols

Byzantine agreement

Page 16: Object-Oriented Real-Time Distributed Computing 2nd IEEE

Observer/Observable Object - Software Fault-tolerance

• Recover-and-Retry / primary-backup approach

• Toleration of crash-faults (program, processing node)

• Observer monitors application/wrapper via "alive"-messages

• ObservableObject encapsulates standard UNIX apps.

• Detection of an application’s status / checkpointing:

- stdin/stout- UNIX signals: kill (pid, 0)- IPC: pipes, shared memory, messages- X11 mechanisms: properties- program specific API, i.e., CORBA interface

• Exposition of backup on failover;backup instance becomes primary

• Start of a new backup instance

Page 17: Object-Oriented Real-Time Distributed Computing 2nd IEEE

Manager

Observer

WrapperApp: primary(netscape)

WrapperApp: backup(netscape)

machineboundaries

createPrimary()

createBackup()

alive ?

alive ?

XDisplay

X11

CORBACommu-nication

Fault-tolerant Netscape - Communication Structure

Helper

timerevent

ObservableFactory

ObservableObject

ObservableObject

ObservableFactory

Page 18: Object-Oriented Real-Time Distributed Computing 2nd IEEE

IDL interfaces - Observer / ObservableObjects

Observer:

typedef short Id;

interface ObservableObject;

interface ObservableFactory;

interface ObserverBasics {

exception BadExec { string why; };

void connect( in ObservableObject ref,

in Id id, in Id fid ) raises (BadExec);

void connectFactory( ObservableFactory ref,

in Id id, ) raises (BadExec);

void disconnect( in ObservableObject ref,

in Id id, in Id fid ) raises (BadExec);

};

interface ObserverManager {

exception ManagerOnly { string why; };

exception badExec { string why; };

void start( in short m_id )

raises (ManagerOnly, BadExec);

void stop( in short m_id )

raises (ManagerOnly, BadExec);

};

ObservableObject:

interface ObservableObject {

short state();

short id();

void bePrimary( in short o_id );

void shutdown( in short o_id );

};

ObservableFactory:

# include "ObservableObject.idl"

interface ObservableFactory:

ObservableObject {

void create_primary( in short o_id );

void create_backup( in short o_id );

};

Page 19: Object-Oriented Real-Time Distributed Computing 2nd IEEE

Fault-tolerant Netscape - Fault detection mechanism

• X properties for monitoring / controlling Netscape Navigator- _MOZILLA_URL, _MOZILLA_LOCK,

- _MOZILLA_COMMAND, _MOZILLA_RESPONSE

• Window ID obtained through Xlib-functionsXQueryTree() and XmuClientWindow()problem: atomicity

• periodic checkpointing: store current URL in file

• access to non-existent X property results in X error-> wrapper detects application crash (Netscape Navigator)-> Observer activates backup as new primary

(loading of current URL)-> Observer starts new backup instance

• invalid CORBA reference to ObservableFactory indicatescrash of computer

Page 20: Object-Oriented Real-Time Distributed Computing 2nd IEEE

Fault-tolerant Netscape - Screendump

Page 21: Object-Oriented Real-Time Distributed Computing 2nd IEEE

Execution of fault-tolerant labyrinth search

FTmaze

• 4 nodes

• right-hand first search rule

• load partitioning scheme

Generate Run

Kill Node Reset

Page 22: Object-Oriented Real-Time Distributed Computing 2nd IEEE

Composite Object

Composite Object

Composite Object

Composite Object

alive-messages

Java Frontend

Client

Architecture of a Responsive Service( FTMaze )

• task generator• display

• Consensus for group membership• Automatic reconfiguration on crash faults• Graceful degradation• Static load partitioning scheme

task

search engine

search engine

search engine

search engine

partial solutions

task

Consensus

periodic

IIOP

IIOP

Mach IPC

Page 23: Object-Oriented Real-Time Distributed Computing 2nd IEEE

The Unstoppable Robots -a Fault-tolerant Real-time application

• Fault-tolerance: consensus protocol (voting) among replicated controllers• Real-time: robots use up fuel with constant rate;

moves have to be computed with 4 Hz frequency

Page 24: Object-Oriented Real-Time Distributed Computing 2nd IEEE

controller

NeXTSTEP

1st World Display

simulationenvironment

NeXTSTEP

controller

NeXTSTEP

controller

NeXTSTEP

gateway #2Composite Objectcontroller

NeXTSTEP

Java-basedcontroller

SolarisWindowsNT

2nd WorldJava-based Display

gateway #1Composite Object

NeXTSTEP

gateway #3Composite Object

NeXTSTEP

gateway #4Composite Object

rtLinux

kheperarobot

consensusalgorithm

robotcontrol

CORBAIIOP

CORBAIIOP

CORBAIIOP

SolarisWindowsNT

Page 25: Object-Oriented Real-Time Distributed Computing 2nd IEEE

Khepera-Robot

CORBA

• Composite Object is critical: CORBA’s varying communicationlatency must not disturb rtLinux driver task

• rtLinux driver task performs trajectory generation and outputsfine-grained motion commands with frequency of 800Hz

• simulation sends coarse-grained commands with frequency of 4Hzvia CORBA

Composite Object

trajectory generationdriver task

rtLinux

Page 26: Object-Oriented Real-Time Distributed Computing 2nd IEEE

Java Applet showing Unstoppable Robots Simulation

Virtual Joystick for one robotJava Controller

Web-interface to Unstoppable Robots

• Open interfaces must notdisturb timing behavior ofreal-time application

• Web-clients may createinacceptable high loads

• read accesses are easy:

Composite Object implementscaching and call admission

• write accesses may modify RT-data

• RT-app must kept stable even if deadlines aremissed

Recovery Blocks: RT-part of Composite Objectimplements save fallback-method

Page 27: Object-Oriented Real-Time Distributed Computing 2nd IEEE

Java applet

controlalgorithm

CompositeObject fallback

procedure

soft realtimesimulation

deadline

NRT exec RT exec

IIOP Mach IPC

Analytic Redundancy - RT fallback procedure

• concept similar to recovery blocks, mutli-version programming

Page 28: Object-Oriented Real-Time Distributed Computing 2nd IEEE

Unstoppable Robots -CORBA interactions with Java-based external controller

Unstoppable RobotsSoft RT SimulationNeXTSTEP event loop

Composite ObjectFallback procedure

Composite ObjectCORBA communicationCall admissionScheduling Server

Java-basedController

pipes

unloadedWindows NTsystem runningexternal controller

periods of robots simulation

0

50

100

150

200

050 100 150 200 250 300 350 400 450 500

number of experiments

CORBA partReal time part

t in msec

0

50

100

150

200

0 50 100 150 200 250 300 350 400 450 500

number of experiments

Real time partCORBA part

t in msec

loadedWindows NTsystem runningexternal controller

Page 29: Object-Oriented Real-Time Distributed Computing 2nd IEEE

legacy application legacy application legacy application

Group Communication Protocol

Voting

Fault-tolerant Distributed Input/Output (FT-DIO)

• stream-based communication(sockets + select())

• Group communication protocol(Totem, Horus, ISIS)

• RPC-style communication(CORBA + Broker)

Future:

Page 30: Object-Oriented Real-Time Distributed Computing 2nd IEEE

Conclusions

Composite Objects allow for predictable integration ofCORBA and responsive computing with small overhead

Scheduling Server provides basis for true call admissionwithout changes to CORBA (ORB) and OS kernel

Concepts can be transformed onto many COTS platforms:-> rtLinux, Solaris, Windows NT

Data replication and weak memory consistency are keyconcepts for decoupling CORBA and responsive computing

Open Web interfaces for leagcy real-time applications arefeasible using Composite Objects technology

Page 31: Object-Oriented Real-Time Distributed Computing 2nd IEEE

Demo Applications - Lessons learned

• Observer/ObservableObject:generic interfaces for fault-tolerance based on CORBA-> have been successfully reused

• only small subset of CORBA functionality needed forimplementation of responsive services

(architectural approach applicable to minimal/embedded CORBA?)

• Composite Objects decouple CORBA andreal-time consensus protocols

• Java language binding allows Web-access to responsive services

Page 32: Object-Oriented Real-Time Distributed Computing 2nd IEEE

Group, Topics, and Contact Info• Jan Richling, Ph.D. student

"Real Time and CORBA: Experiments with Composite Objects" (in german)

• Janek Schwarz, M.S. student"Fault-tolerance techniques for CORBA --Experiments, Measurements, Evaluation" (in german)

• Oliver Freitag, Mario Schröer, M.S. students"Automatic Generation of responsive CORBA-services" (in german)

• Martin Meyka, Thomas Lehmann, M.S. students,"Business Object Framework --Security- and Consistency-protocols for replicated CORBA-Objects" (in german)

• Robots on the Web:http://www.informatik.hu-berlin.de/~apolze/rescue

• Contact: Dr. Andreas PolzeDepartment of Computer ScienceHumboldt University of Berlin10099 Berlin, Germany [email protected]

Responsive CORBA Unified Environment ( RESCUE )