
ARMADA Middleware and Communication Services

T. ABDELZAHER, M. BJORKLUND, S. DAWSON, W.-C. FENG, F. JAHANIAN, S. JOHNSON, P. MARRON, A. MEHRA, T. MITTON, A. SHAIKH, K. SHIN, Z. WANG, H. ZOU

Real-Time Computing Laboratory

University of Michigan

Presented by Guoliang Xing

Agenda

Introduction
RTCAST Group Comm. Service
Real-Time Channel Architecture
Platforms
RTPB Replication Service
Evaluation Tools

Target Applications

Embedded fault-tolerant applications

Industrial and manufacturing systems

Distributed multimedia

Air traffic control

Key Challenges

Timely delivery of services with end-to-end real-time constraints
Dependability of services in the presence of hardware/software failures
Scalability of computation and communication resources
Exploitation of open systems and emerging standards in operating systems and communication services

ARMADA Architecture

Applications

Middleware Services

Evaluation Tools

API

Real-Time Channels

Microkernel

RTCAST

Multicast communication and group management in a timely fashion, in the presence of faults

Group Communication

Reliable message delivery
Agreement on group membership
Failure detection and handling
Consistency:
  Atomicity: either everybody gets the message or nobody gets it
  Global order

Real-time Group Comm.

A late message means failure
Atomic, ordered message delivery in a timely fashion
Immediate message delivery without compromising the above

Achieve reliability, atomicity, RT

Reliability: each member either receives a multicast message m or crashes before receiving m

Atomicity: correct members receive all messages, and in the same order

Time-bounded multicast: each member either receives each multicast m in total order within T time units or crashes during T before receiving m

RTCAST - Architecture

Real-time Process Groups API

Clock Synchronization

Virtual Network Interface

Unicast Datagram Communication

Admission Control and Schedulability Analysis

Group Membership Service

Timed Atomic Multicast

System Model

Assumptions:
  each processor has its own unique identifier
  a path exists between any two processors
  communication delay is bounded (in the absence of failures)
  clocks are synchronized
Failures:
  processors may suffer performance or crash failures
  messages may suffer performance or omission failures

Agreement on membership

All members have the same membership view at group initialization time

For each membership update U which changes membership view from V to V’, U is delivered atomically (in order) to all members in V U V’ within T time units

Steady-state operation

Token ring: ensures order
A processor sends messages only while it holds the token
Upon receiving the token, a processor:
  sends multicast messages within the maximum token hold time
  sends a heartbeat, which serves as the token, to its successor
Upon receiving a multicast message, a processor:
  delivers it to the application in sequence
  crashes if a message omission is detected
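
The steady-state rules above can be sketched as a small simulation. This is an illustrative sketch, not the RTCAST implementation: the names `Processor`, `TokenRing`, `rotate` and `lost` are hypothetical, the token hold time is approximated by a per-hold message count, and an omission is detected only when a later message arrives with an unexpected sequence number.

```python
from dataclasses import dataclass, field


@dataclass
class Processor:
    pid: int
    outbox: list = field(default_factory=list)     # messages waiting for the token
    delivered: list = field(default_factory=list)  # (seq, sender, payload), in order
    next_seq: int = 0                              # next sequence number expected
    crashed: bool = False

    def receive(self, seq, sender, payload):
        """Deliver in sequence; on a detected omission, crash (fail-stop)."""
        if self.crashed:
            return
        if seq != self.next_seq:
            self.crashed = True  # a gap means an earlier message was missed
            return
        self.delivered.append((seq, sender, payload))
        self.next_seq += 1


class TokenRing:
    def __init__(self, pids, max_msgs_per_hold):
        self.members = [Processor(pid) for pid in pids]
        # Assumption: a message count stands in for the maximum token hold time.
        self.max_msgs_per_hold = max_msgs_per_hold
        self.seq = 0  # global sequence number, conceptually carried with the token

    def rotate(self, lost=frozenset()):
        """One full token rotation; `lost` holds (seq, receiver_pid) omissions to inject."""
        for holder in self.members:
            if holder.crashed:
                continue
            # Only the token holder sends, which serialises all multicasts.
            batch = holder.outbox[:self.max_msgs_per_hold]
            holder.outbox = holder.outbox[self.max_msgs_per_hold:]
            for payload in batch:
                for p in self.members:
                    if (self.seq, p.pid) not in lost:
                        p.receive(self.seq, holder.pid, payload)
                self.seq += 1
            # Handing the token to the successor doubles as the heartbeat.


ring = TokenRing([1, 2, 3, 4], max_msgs_per_hold=2)
ring.members[0].outbox = ["a", "b", "c"]
ring.rotate()
print(ring.members[3].delivered)
```

Because only the token holder may send, every surviving member sees the same sequence of messages without any extra ordering protocol.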

Steady-state operation – contd.

[Figure: animation frames of a four-processor token ring (processors 1 to 4); in every frame each processor shows the view 1 2 3 4]

Handle faults

[Figure: animation frames of the four-processor ring handling a fault; every view reads 1 2 3 4 until the final frame, where three of the four views read 1 3 4 and one still reads 1 2 3 4]

Membership Changes

Processor crashes:
  Each processor checks the heartbeats from members when its turn comes
  It sends a membership update multicast
Joins:
  The joining processor sends a join request to some processor, which multicasts a membership change message
  The joining processor checks the consistency of the membership views sent in ACKs
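
A minimal sketch of the crash-handling step, assuming a member counts as crashed when its last heartbeat is older than one token rotation bound. The function names, the timestamp dictionary and the timeout rule are illustrative assumptions, not taken from the slides.

```python
def detect_crashed(view, last_heartbeat, now, p_token):
    """Return members whose last heartbeat is older than `p_token` time units.

    Assumption: one missed token rotation is enough to declare a crash.
    """
    return sorted(
        pid for pid in view
        if now - last_heartbeat.get(pid, float("-inf")) > p_token
    )


def apply_update(view, crashed):
    """Membership view after removing the crashed members."""
    gone = set(crashed)
    return [pid for pid in view if pid not in gone]


view = [1, 2, 3, 4]
last_heartbeat = {1: 9.0, 2: 4.0, 3: 9.5, 4: 9.8}  # hypothetical timestamps
crashed = detect_crashed(view, last_heartbeat, now=10.0, p_token=3.0)
print(crashed, apply_update(view, crashed))
```

In the protocol the new view would then be sent as a membership update multicast, so that all surviving members adopt it in the same order.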

Token Rotation Period

Ptoken – token rotation time
Ti – maximum token hold time at any processor
n – number of processes
dmax – maximum communication delay
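
The slide's own formula did not survive in this transcript. As a rough illustration only, the sketch below computes a simple worst-case rotation bound under the assumption that each processor holds the token for at most its hold time and each token hand-off takes at most dmax. The function `rotation_bound` is hypothetical and is not claimed to be the expression from the original slide.

```python
def rotation_bound(hold_times, d_max):
    """Upper bound on one full token rotation.

    Assumption: each processor holds the token for at most its entry in
    `hold_times`, and each of the len(hold_times) hand-offs takes at most d_max.
    """
    n = len(hold_times)
    return sum(hold_times) + n * d_max


print(rotation_bound([2.0, 2.0, 2.0, 2.0], d_max=0.5))
```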

Admission Control

Goal: Only admit affordable messages

Assumptions:
  Each sender can transmit messages for up to Tj units of time within P
  Time elapsed between the send and delivery is bounded by Δ

Admission Control – Contd.

Real-time message: maximum transmission time Ci, period Pi, deadline di

Sufficient Schedulability Condition:

Implementation

Agenda

Introduction
RTCAST Group Comm. Service
Real-Time Comm. Architecture
Platforms
RTPB Replication Service
Evaluation Tools
Conclusion

RT Channel Architecture

RT Comm. Architecture – Contd.

Real-time channel: unicast virtual connection between two hosts with a bounded end-to-end delay guarantee
RTC API:
Clip: endpoint with QoS parameters
RTCOP: signaling and resource reservation
QoS model & admission control:

RTC API

RTCOP – Contd.

Real-Time Connection Ordination Protocol: distributed end-to-end signaling
Request and reply handlers: manage signaling state and interface to admission control
Comm. module: reliably forwards signaling messages
Signaling connection is non-real-time but reliable

RTCOP

Resource scheduling

Resource scheduling – Contd.

QoS-sensitive CPU scheduling:
  Each message must be sent within its deadline
  Comm. handlers are scheduled with the EDF policy
Resource reservation:
  Associate each comm. handler with a budget
Policing:
Link bandwidth allocation:
  Dynamic-priority-based link scheduler
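
EDF scheduling with per-handler budgets can be sketched as follows. This is a simplified, non-preemptive illustration in which all messages are ready at time 0; the names `Message`, `edf_schedule` and `budgets` are hypothetical and do not reflect the ARMADA code.

```python
import heapq
from dataclasses import dataclass, field


@dataclass(order=True)
class Message:
    deadline: float                      # only the deadline decides priority
    handler: str = field(compare=False)  # comm. handler the message belongs to
    cost: float = field(compare=False)   # CPU time needed to send it


def edf_schedule(messages, budgets):
    """Send messages earliest-deadline-first, charging each handler's budget.

    Returns (sent, policed): `sent` lists (handler, finish_time, met_deadline);
    `policed` lists messages held back because their handler's budget ran out.
    """
    remaining = dict(budgets)
    heap = list(messages)
    heapq.heapify(heap)
    now, sent, policed = 0.0, [], []
    while heap:
        msg = heapq.heappop(heap)        # earliest deadline first
        if remaining.get(msg.handler, 0.0) < msg.cost:
            policed.append(msg)          # over budget: must not delay the others
            continue
        remaining[msg.handler] -= msg.cost
        now += msg.cost
        sent.append((msg.handler, now, now <= msg.deadline))
    return sent, policed


msgs = [Message(10.0, "A", 3.0), Message(4.0, "B", 2.0),
        Message(6.0, "A", 3.0), Message(8.0, "B", 2.0)]
sent, policed = edf_schedule(msgs, budgets={"A": 6.0, "B": 2.0})
print(sent)
print(policed)
```

The budget check is what provides isolation in this sketch: a handler that exceeds its reservation is held back instead of pushing other handlers past their deadlines.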

Resource Scheduling – contd.

[Figure labels: Msg, Clips, pkt, Link, EDF Scheduler (twice), Budget]

Traffic isolation in RTC

Agenda

Introduction
RTCAST Group Comm. Service
Real-Time Comm. Architecture
Platforms
RTPB Replication Service
Evaluation Tools
Conclusion

Platforms

Microkernel
x-kernel: co-located server

UDP/IP

RTPB Architecture

Many RT applications can tolerate minor inconsistencies in replicated state

Backup maintains a less current copy of primary

Distance between the primary and backup data is bounded within a time window
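
The bounded-window idea can be illustrated with a lazy synchronization sketch: the backup is refreshed only when its staleness would otherwise reach the window. The class `PrimaryBackup` and its methods are hypothetical, and staleness is assumed to be measured from the oldest primary write the backup has not yet seen.

```python
class PrimaryBackup:
    def __init__(self, window):
        self.window = window       # maximum time the backup may lag the primary
        self.primary_value = None
        self.backup_value = None
        self.stale_since = None    # time of the oldest write the backup has not seen

    def write(self, value, now):
        """Client write: only the primary is updated immediately."""
        self.primary_value = value
        if self.stale_since is None:
            self.stale_since = now

    def staleness(self, now):
        return 0.0 if self.stale_since is None else now - self.stale_since

    def sync(self):
        """Propagate the primary's current value to the backup."""
        self.backup_value = self.primary_value
        self.stale_since = None

    def tick(self, now, margin=0.0):
        """Sync only when waiting longer would risk exceeding the window."""
        if self.staleness(now) + margin >= self.window:
            self.sync()
            return True
        return False


rt = PrimaryBackup(window=5.0)
rt.write("v1", now=0.0)
print(rt.tick(now=2.0), rt.backup_value)   # still within the window, no sync yet
rt.write("v2", now=3.0)
print(rt.tick(now=5.0), rt.backup_value)   # bound reached, backup refreshed
```

Intermediate values such as "v1" never reach the backup, which is the kind of minor inconsistency the slide says many real-time applications can tolerate.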

Evaluation Tools - ORCHESTRA

A distributed protocol is viewed as an abstraction layer through which participants communicate by exchanging messages

A probe/fault injection (PFI) layer is inserted between any two consecutive layers in a protocol stack.

The PFI layer can delay, drop, reorder, duplicate, and modify messages, and can introduce spontaneous messages
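
A minimal sketch of a probe/fault-injection layer placed between two protocol layers. It is not the ORCHESTRA API: the class `PFILayer`, its parameters and its log format are hypothetical, and only dropping, duplicating, reordering and spontaneous injection are modelled (delay and modification are left out).

```python
import random


class PFILayer:
    """Sits between an upper and a lower layer and perturbs outgoing messages."""

    def __init__(self, send_down, drop_p=0.0, dup_p=0.0, reorder=False, seed=0):
        self.send_down = send_down     # callable that hands a message to the layer below
        self.drop_p, self.dup_p, self.reorder = drop_p, dup_p, reorder
        self.rng = random.Random(seed)
        self.held = None               # one message held back to force reordering
        self.log = []                  # probe: record of every action taken

    def send(self, msg):
        if self.rng.random() < self.drop_p:
            self.log.append(("drop", msg))
            return
        if self.reorder and self.held is None:
            self.held = msg            # released after the next message passes
            self.log.append(("hold", msg))
            return
        self._forward(msg)
        if self.held is not None:
            held, self.held = self.held, None
            self._forward(held)

    def _forward(self, msg):
        self.log.append(("forward", msg))
        self.send_down(msg)
        if self.rng.random() < self.dup_p:
            self.log.append(("duplicate", msg))
            self.send_down(msg)

    def inject(self, msg):
        """Introduce a spontaneous message that the upper layer never sent."""
        self.log.append(("inject", msg))
        self.send_down(msg)


wire = []                              # stands in for the lower layer
pfi = PFILayer(wire.append, reorder=True)
for m in ["m1", "m2", "m3", "m4"]:
    pfi.send(m)
pfi.inject("ghost")
print(wire)
```

Because the layer only wraps the call into the layer below, the protocol under test needs no changes, which is the point of inserting it between two consecutive layers.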

Conclusions

Middleware services for fault-tolerant group communication
Real-time communication services
Validation tools

Questions?