Wide-Area Service Composition: Evaluation of Availability and Scalability
Bhaskaran Raman, SAHARA, EECS, U.C. Berkeley
[Figure: example service compositions across providers. Labels: Provider Q, Provider R, text-to-audio, cellular phone, email repository; Provider A, Provider B, video-on-demand server, transcoder, thin client.]
Problem Statement and Goals
Goals
  – Performance: Choose set of service instances
  – Availability: Detect and handle failures quickly
  – Scalability: Internet-scale operation
Problem Statement
  – Path could stretch across
    • multiple service providers
    • multiple network domains
  – Inter-domain Internet paths:
    • Poor availability [Labovitz’99]
    • Poor time-to-recovery [Labovitz’00]
  – Take advantage of service replicas
[Figure: replicated instances of Provider A and Provider B. Labels: video-on-demand server, transcoder, thin client.]
Related Work
  – TACC: composition within cluster
  – Web-server choice: SPAND, Harvest
  – Routing around failures: Tapestry, RON
We address: wide-area network performance and failure issues for long-lived composed sessions
Is “quick” failure detection possible?
• What is a “failure” on an Internet path?
  – Outage periods happen for varying durations
• Study outage periods using traces
  – 12 pairs of hosts
    • Berkeley, Stanford, UIUC, UNSW (Aus), TU-Berlin (Germany)
    • Results could be skewed due to Internet2 backbone?
  – Periodic UDP heart-beat, every 300 ms
  – Study “gaps” between receive-times
• Results:
  – Short outage (1.2-1.8 sec) ⇒ Long outage (> 30 sec)
    • Sometimes this is true over 50% of the time
  – False-positives are rare:
    • O(once an hour) at most
  – Similar results with ping-based study using ping-servers
  – Take away: okay to react to short outage periods, by switching service-level path
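A minimal sketch of the gap analysis described on this slide, assuming receive timestamps in seconds. The 300 ms heart-beat period, the 1.8 sec timeout, and the 30 sec long-outage threshold come from the slide; the function names, the synthetic trace, and the reading of a "false positive" as a timed-out gap that ends before 30 sec are assumptions made for illustration, not the authors' tooling.

```python
from typing import List, Tuple

HEARTBEAT_PERIOD_S = 0.3  # from the slide: one UDP heart-beat every 300 ms
TIMEOUT_S = 1.8           # upper end of the 1.2-1.8 sec range on the slide
LONG_OUTAGE_S = 30.0      # "long outage" threshold on the slide


def find_gaps(receive_times: List[float],
              timeout_s: float = TIMEOUT_S) -> List[Tuple[float, float]]:
    """Return (start_time, gap_length) for each receive gap longer than timeout_s."""
    gaps = []
    for prev, cur in zip(receive_times, receive_times[1:]):
        gap = cur - prev
        if gap > timeout_s:
            gaps.append((prev, gap))
    return gaps


def classify(gaps: List[Tuple[float, float]],
             long_outage_s: float = LONG_OUTAGE_S):
    """Split timed-out gaps into long outages and shorter ones.

    Assumption: a timed-out gap that ends before long_outage_s is counted
    as a false positive of the timeout.
    """
    long_outages = [g for g in gaps if g[1] > long_outage_s]
    false_positives = [g for g in gaps if g[1] <= long_outage_s]
    return long_outages, false_positives


# Synthetic trace (hypothetical): one 2.1 sec gap and one 36.7 sec gap.
trace = [0.0, 0.3, 0.6, 0.9, 3.0, 3.3, 40.0, 40.3]
gaps = find_gaps(trace)
long_outages, false_positives = classify(gaps)
print(len(gaps), len(long_outages), len(false_positives))
```

Lowering `timeout_s` toward 1.2 sec reacts faster but would flag more short gaps, which is the trade-off the trace study is meant to quantify.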
UDP-based keep-alive stream
HB destination   HB source   Total time (hh:mm:ss)   Num. false positives   Num. failures
Berkeley         UNSW        130:48:45               135                    55
UNSW             Berkeley    130:51:45               9                      8
Berkeley         TU-Berlin   130:49:46               27                     8
TU-Berlin        Berkeley    130:50:11               174                    8
TU-Berlin        UNSW        130:48:11               218                    7
UNSW             TU-Berlin   130:46:38               24                     5
Berkeley         Stanford    124:21:55               258                    7
Stanford         Berkeley    124:21:19               2                      6
Stanford         UIUC        89:53:17                4                      1
UIUC             Stanford    76:39:10                74                     1
Berkeley         UIUC        89:54:11                6                      5
UIUC             Berkeley    76:39:40                3                      5

Acknowledgements: Mary Baker, Mema Roussopoulos, Jayant Mysore, Roberto Barnes, Venkatesh Pranesh, Vijaykumar Krishnaswamy, Holger Karl, Yun-Shen Chang, Sebastien Ardon, Binh Thai
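The per-pair false-positive rates implied by the keep-alive table can be worked out directly. The sketch below assumes the column order given in the table header (false positives before failures) and that "Total time" is in hours:minutes:seconds; the helper names are hypothetical.

```python
def hours(hms: str) -> float:
    """Convert an 'h:mm:ss' duration string to fractional hours."""
    h, m, s = (int(x) for x in hms.split(":"))
    return h + m / 60 + s / 3600


# (HB destination, HB source, total time, false positives, failures),
# copied from the keep-alive table.
ROWS = [
    ("Berkeley", "UNSW", "130:48:45", 135, 55),
    ("UNSW", "Berkeley", "130:51:45", 9, 8),
    ("Berkeley", "TU-Berlin", "130:49:46", 27, 8),
    ("TU-Berlin", "Berkeley", "130:50:11", 174, 8),
    ("TU-Berlin", "UNSW", "130:48:11", 218, 7),
    ("UNSW", "TU-Berlin", "130:46:38", 24, 5),
    ("Berkeley", "Stanford", "124:21:55", 258, 7),
    ("Stanford", "Berkeley", "124:21:19", 2, 6),
    ("Stanford", "UIUC", "89:53:17", 4, 1),
    ("UIUC", "Stanford", "76:39:10", 74, 1),
    ("Berkeley", "UIUC", "89:54:11", 6, 5),
    ("UIUC", "Berkeley", "76:39:40", 3, 5),
]


def false_positives_per_hour(row) -> float:
    """False positives divided by the trace length in hours."""
    return row[3] / hours(row[2])


rates = {(dest, src): false_positives_per_hour((dest, src, t, fp, fail))
         for dest, src, t, fp, fail in ROWS}
for pair, rate in sorted(rates.items(), key=lambda kv: -kv[1]):
    print(pair, round(rate, 2))
```

Under these assumptions every pair stays within a small constant factor of one false positive per hour.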
Architecture
[Figure: layered architecture spanning the Internet between a source and a destination]
  – Application plane: Composed services
  – Logical platform: Peering relations, Overlay network
  – Hardware platform: Service clusters
Service cluster: compute cluster capable of running services
Peering: exchange perf. info.
Functionalities at the Cluster-Manager
  – Finding Overlay Entry/Exit
  – Location of Service Replicas
  – Service-Level Path Creation, Maintenance, and Recovery
  – Link-State Propagation
  – At-least-once UDP
  – Perf. Meas.
  – Liveness Detection
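To illustrate what service-level path creation over the overlay could look like, here is a minimal sketch that picks one replica per service so that the end-to-end overlay cost is lowest. It is not the authors' algorithm: the slides only state that cluster-managers exchange performance information and create, maintain, and recover service-level paths. The graph, node names, costs, and function names below are all hypothetical.

```python
import heapq
from typing import Dict, List, Tuple

Graph = Dict[str, Dict[str, float]]  # overlay node -> {neighbor: link cost}


def dijkstra(graph: Graph, start: str) -> Dict[str, float]:
    """Shortest overlay distance from start to every reachable node."""
    dist = {start: 0.0}
    heap = [(0.0, start)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, w in graph.get(u, {}).items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


def best_service_path(graph: Graph, source: str, dest: str,
                      replicas: List[List[str]]) -> Tuple[float, List[str]]:
    """Choose one hosting node per service (in order) minimizing total cost.

    replicas[i] lists the overlay nodes hosting the i-th service in the chain.
    """
    dist = {u: dijkstra(graph, u) for u in graph}
    # best[n] = (cost, chosen nodes) of the cheapest partial path ending at n
    best = {source: (0.0, [source])}
    for layer in list(replicas) + [[dest]]:
        nxt = {}
        for node in layer:
            candidates = [(cost + dist[prev][node], chosen + [node])
                          for prev, (cost, chosen) in best.items()
                          if node in dist[prev]]
            if candidates:
                nxt[node] = min(candidates)
        best = nxt
    if dest not in best:
        raise ValueError("no service-level path available")
    return best[dest]


# Hypothetical 4-node overlay; one service (say, a transcoder) replicated at a and b.
overlay = {
    "src": {"a": 10, "b": 30},
    "a": {"src": 10, "b": 25, "dst": 50},
    "b": {"src": 30, "a": 25, "dst": 10},
    "dst": {"a": 50, "b": 10},
}
cost, path = best_service_path(overlay, "src", "dst", [["a", "b"]])
print(cost, path)
```

Re-running `best_service_path` with a failed node removed from `replicas` gives the alternate path a recovery step would switch to.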
Evaluation
• What is the effect of recovery mechanism on application?
  – Text-to-Speech application
  – Two possible places of failure
    • 20-node overlay network
    • One service instance for each service
    • Deterministic failure for 10 sec during session
    • Metric: gap between arrival of successive audio packets at the client
• What is the scaling bottleneck?
  – Parameter: #client sessions across peering clusters
    • Measure of instantaneous load when failure occurs
  – 5000 client sessions in 20-node overlay network
  – Deterministic failure of 12 different links (12 data-points in graph)
  – Metric: average time-to-recovery
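[Figure: text-to-speech session setup. Components: text source, text-to-audio service, end-client. Path segments: leg-1, leg-2, with numbered markers 1 and 2. Legend: request-response protocol; data (text, or RTP audio); keep-alive soft-state refresh; application soft-state (for restart on failure).]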
Recovery of Application Session: CDF of gaps > 100 ms
[Figure: CDF of gaps > 100 ms; annotations below]
  – Recovery time: 822 ms (quicker than leg-2 due to buffer at text-to-audio service)
  – Recovery time: 2963 ms
  – Recovery time: 10,000 ms
  – Jump at 350-400 ms: due to synch. text-to-audio processing (impl. artefact)
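The metric behind this plot, a CDF over gaps between successive audio packets that exceed 100 ms, can be sketched as follows. The 100 ms cut-off is from the slide; the function name and the arrival times are hypothetical, and timestamps are assumed to be in milliseconds.

```python
from typing import List, Tuple


def gap_cdf(arrival_times_ms: List[float],
            min_gap_ms: float = 100.0) -> List[Tuple[float, float]]:
    """Empirical CDF of inter-arrival gaps larger than min_gap_ms.

    Returns (gap, fraction of retained gaps <= gap) pairs in ascending order.
    """
    gaps = sorted(b - a
                  for a, b in zip(arrival_times_ms, arrival_times_ms[1:])
                  if b - a > min_gap_ms)
    n = len(gaps)
    return [(g, (i + 1) / n) for i, g in enumerate(gaps)]


# Hypothetical arrivals: steady 20 ms spacing with a 360 ms and a 2980 ms gap.
arrivals = [0, 20, 40, 400, 420, 3400, 3420]
print(gap_cdf(arrivals))
```

The largest gap in such a trace is what the slide reports as the recovery time seen by the client.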
Average Time-to-Recovery vs. Instantaneous Load
• Two services in each path
• Two replicas per service
• Each data-point is a separate run
[Figure: average time-to-recovery vs. instantaneous load; annotations below]
  – End-to-End recovery algorithm
  – High variance due to varying path length
  – Load: 1,480 paths on failed link
  – Avg. path recovery time: 614 ms
Results: Discussion
• Recovery after failure (leg-2): 2,963 = 1,800 + O(700) + O(450)
  – 1,800 ms: timeout to conclude failure
  – 700 ms: signaling to setup alternate path
  – 450 ms: recovery of application soft-state: re-process current sentence
• Without recovery algorithm: takes as long as failure duration
• O(3 sec) recovery
  – Can be completely masked with buffering
  – Interactive apps: still much better than without recovery
• Quick recovery possible since failure information does not have to propagate across network
• 12th data point (instantaneous load of 1,480) stresses emulator limits
  – 1,480 translates to about 700 simul. paths per cluster-manager
  – In comparison, our text-to-speech implementation can support O(15) clients per machine
• Other scaling limits? Link-state floods? Graph computation?
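The leg-2 recovery breakdown is simple arithmetic, worked through here as a small sketch. The three component values, the measured 2,963 ms, and the 10 sec failure duration are taken from the slides; since two of the components are order-of-magnitude estimates, the sum is only expected to land near the measurement. Variable and function names are illustrative.

```python
# Component estimates (ms) listed on the slide for a leg-2 failure.
COMPONENTS_MS = {
    "timeout to conclude failure": 1800,
    "signaling to set up alternate path": 700,
    "recovery of application soft-state": 450,
}
MEASURED_LEG2_RECOVERY_MS = 2963  # measured recovery time from the slides
FAILURE_DURATION_MS = 10_000      # deterministic 10 sec failure in the experiment


def recovery_budget(components=COMPONENTS_MS):
    """Return the summed estimate and each component's share of it."""
    total = sum(components.values())
    shares = {name: ms / total for name, ms in components.items()}
    return total, shares


total_ms, shares = recovery_budget()
residual_ms = MEASURED_LEG2_RECOVERY_MS - total_ms
print(f"estimated {total_ms} ms, measured {MEASURED_LEG2_RECOVERY_MS} ms, "
      f"residual {residual_ms} ms")
for name, share in shares.items():
    print(f"{name}: {share:.0%}")
print(f"outage seen without recovery: {FAILURE_DURATION_MS} ms")
```

The detection timeout accounts for roughly three fifths of the estimate, so the choice of timeout is the main lever on recovery time.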
Summary
• Service Composition: flexible service creation
• We address performance, availability, scalability
• Initial analysis: Failure detection -- meaningful to timeout in O(1.2-1.8 sec)
• Design: Overlay network of service clusters
• Evaluation: results so far
  – Good recovery time for real-time applications: O(3 sec)
  – Good scalability -- minimal additional provisioning for cluster managers
• Ongoing work:
  – Overlay topology issues: how many nodes, peering
  – Stability issues
Feedback, Questions?
Presentation made using VMWare
Emulation Testbed
[Figure: Nodes 1-4, each running App and Lib, connected through an emulator that applies per-link rules: rule for 1→2, rule for 1→3, rule for 3→4, rule for 4→3.]
Operational limits of emulator: 20,000 pkts/sec, for up to 500 byte pkts, 1.5GHz Pentium-4
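As a rough worked figure, the stated emulator limits translate into an aggregate throughput ceiling. The packet rate and packet size are from the slide; treating every packet as a full 500 bytes is an assumption that gives an upper bound, and the function name is illustrative.

```python
MAX_PACKETS_PER_SEC = 20_000  # emulator limit from the slide
MAX_PACKET_BYTES = 500        # "up to 500 byte pkts" from the slide


def emulator_ceiling_mbit_per_sec(pps: int = MAX_PACKETS_PER_SEC,
                                  packet_bytes: int = MAX_PACKET_BYTES) -> float:
    """Upper bound on emulated traffic, assuming every packet is full-sized."""
    return pps * packet_bytes * 8 / 1_000_000


print(emulator_ceiling_mbit_per_sec())
```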