Mendosus: A SAN-Based Fault Injection Test-Bed for
Construction of Highly Available Network Services
Xiaoyan Li, Richard Martin, Kiran Nagaraja,
Thu D. Nguyen and Bin Zhang
Dept. of Computer Science, Rutgers University
http://www.panic-lab.rutgers.edu
Talk Outline
  Motivation
  Design
  Implementation
  Benchmarks
  Case Studies
  Related Work
  Future Work
Motivation
Ubiquitous network access leads to exponential growth in network services
Availability is one key challenge
  Networked systems comprise large numbers of heterogeneous components
  Faults are not uncommon
  Complex interactions between components
  Examples of costly failures: eBay, Britannica
Currently difficult to assess service availability
  How to analyze the impact of failures?
  How to set up an appropriate test-bed?
Mendosus
Goal: provide infrastructure for service designers to assess the availability of network services
Overview:
  Provide flexible infrastructure to accurately model a variety of networking systems from the application's point of view
  Run the application in real time and inject faults to assess the application's behavior
Two key components:
  Real-time emulation of a variety of interconnects
  General fault injection infrastructure
Vision
Map available resources to emulated network
Design
Mendosus Architecture
[Figure: architecture diagram. Recoverable labels: Applications, User Level, Kernel, Emulator Module (Latency, Routing, Fault Inclusion), Mendosus daemon, Central Controller, Network State, Events, Fast & Reliable SAN]
Design Decisions
Central controller
  Advantage: consistent network and fault information
  Disadvantage: limits scalability
    Not involved in network emulation, so it should still scale well to targeted system sizes (thousands or tens of thousands of components)
Entire network state is maintained at each end node
  Advantage: performance
  Disadvantage: limits scalability
    Only maintain state for LAN
Emulation module embedded within kernel
  Advantage: no modifications to application code
  Disadvantage: more difficult to modify and extend
Functional Components
Topology Maintenance
Fault Injection
Emulation
Topology Maintenance
Specification: simple ns-2-like topology scripts
  Specify available resources
Central controller manages topology
  Initializes original topology on each node
  Consistent view
Real-time topology changes
  Specified as scripted events
Controller monitors network connectivity
  Detects partitions
Fault Injection
Every network component can have a fault profile
  Switches, hubs, NICs, links, end nodes
Fault specification: trace files or theoretical distributions
  Exponential, Weibull, constant
Simulate fail-stop components
  MTTR: constant or following a distribution
  E.g. unplugging, port shutdown
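A fault profile of this kind can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the Mendosus profile format: the function names, the dictionary layout of `link_profile`, and all numeric values are hypothetical. It draws time-to-failure and time-to-repair from a constant, exponential, or Weibull distribution and produces a fail/repair schedule for one fail-stop component.

```python
import random

def sample_duration(rng, dist, **params):
    """Draw one duration in seconds from a named distribution."""
    if dist == "constant":
        return params["value"]
    if dist == "exponential":
        # expovariate takes the rate, i.e. 1 / mean
        return rng.expovariate(1.0 / params["mean"])
    if dist == "weibull":
        # weibullvariate(alpha, beta): alpha is the scale, beta the shape
        return rng.weibullvariate(params["scale"], params["shape"])
    raise ValueError(f"unknown distribution: {dist}")

def fault_schedule(rng, horizon, ttf, ttr):
    """Return (fail_time, repair_time) pairs for one fail-stop component.

    ttf and ttr are dicts describing the time-to-failure and
    time-to-repair distributions (hypothetical layout).
    """
    events, now = [], 0.0
    while True:
        now += sample_duration(rng, **ttf)      # component runs until it fails
        if now >= horizon:
            return events
        repair = min(now + sample_duration(rng, **ttr), horizon)
        events.append((now, repair))            # component is down in [now, repair)
        now = repair

# Hypothetical profile: exponential failures (mean 10 min), constant 30 s repair.
rng = random.Random(42)
link_profile = {"ttf": {"dist": "exponential", "mean": 600.0},
                "ttr": {"dist": "constant", "value": 30.0}}
schedule = fault_schedule(rng, horizon=3600.0, **link_profile)
```

A trace file would replace the sampling step with recorded timestamps; the schedule format could stay the same.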
Emulation
Completely distributed
  Every node has enough network state
Emulation messaging sequence
  Application initiates communication
  Routing – determine route
  Fault Inclusion – effect of injected faults
  Latency – corresponding to route taken
We do not implement the innards of network components
  Switching
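The messaging sequence above (routing, then fault inclusion, then latency) can be illustrated with a small user-level sketch. The real emulator module lives in the kernel; the function `emulate_send`, the route table, and the per-component latencies below are hypothetical names and values chosen only to show the order of the three stages.

```python
def emulate_send(src, dst, routes, failed, latency_us):
    """Return the emulated one-way delay in microseconds, or None if dropped.

    routes:     {(src, dst): [component, ...]} precomputed from network state
    failed:     set of components currently marked faulty
    latency_us: {component: delay contributed by that component}
    """
    # Routing: determine the sequence of components the packet traverses.
    route = routes.get((src, dst))
    if route is None:
        return None                      # no route, e.g. a partition
    # Fault inclusion: any failed component on the route drops the packet.
    if any(component in failed for component in route):
        return None
    # Latency: delay corresponding to the route taken.
    return sum(latency_us[component] for component in route)

# Hypothetical two-node topology behind one switch.
routes = {("A", "B"): ["nicA", "link1", "sw1", "link2", "nicB"]}
latency_us = {"nicA": 5, "link1": 1, "sw1": 10, "link2": 1, "nicB": 5}

delay_ok = emulate_send("A", "B", routes, set(), latency_us)
delay_faulty = emulate_send("A", "B", routes, {"sw1"}, latency_us)
```

Because every node holds the network state, each stage is a local lookup and needs no message to the central controller.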
Implementation
Ethernet LAN Emulation
Routing
  Emulate computation of Ethernet spanning tree
    Controller chooses root of tree
    Emulator on each node computes identical spanning tree
  Reconfiguration performed periodically (every 2 secs)
Broadcast & Multicast
  Emulate using sequence of unicasts
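One way every node could arrive at an identical tree from the same root is a deterministic breadth-first search over the working links. The sketch below uses that swapped-in technique for illustration; it is not the IEEE spanning tree protocol with bridge IDs and port costs, and the slides do not say which algorithm Mendosus uses. The topology and the helper names are hypothetical.

```python
from collections import deque

def spanning_tree(links, root):
    """Return {node: parent} for a BFS tree over undirected links."""
    adj = {}
    for u, v in links:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    parent = {root: None}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        # Sorted neighbor order makes the result independent of link order,
        # so every node computes the same tree.
        for nbr in sorted(adj.get(node, ())):
            if nbr not in parent:
                parent[nbr] = node
                queue.append(nbr)
    return parent

def tree_path(parent, src, dst):
    """Return the path from src to dst along the tree (both must be in it)."""
    up, node = [], src
    while node is not None:              # walk from src up to the root
        up.append(node)
        node = parent[node]
    index = {n: i for i, n in enumerate(up)}
    down, node = [], dst
    while node not in index:             # walk from dst up to a shared ancestor
        down.append(node)
        node = parent[node]
    return up[:index[node] + 1] + down[::-1]

# Hypothetical topology: three switches in a cycle, two end nodes.
links = [("sw1", "sw2"), ("sw2", "sw3"), ("sw1", "sw3"),
         ("A", "sw2"), ("B", "sw3")]
parent = spanning_tree(links, root="sw1")
path = tree_path(parent, "A", "B")
```

In this example the link between `sw2` and `sw3` is left out of the tree, so traffic from `A` to `B` passes through the root.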
Ethernet LAN Emulation - Faults
Network partitions
  Controller monitors connectivity
  Multiple roots - one for each partition
NIC fail-over
  Multiple interfaces using IP aliasing support in Linux
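Partition handling can be pictured as a connected-components computation followed by choosing one root per component. The following is a minimal sketch, assuming the controller sees the set of working links; the rule of picking the lowest-named switch as root is a hypothetical choice, since the slides do not state how roots are selected.

```python
def partitions(nodes, links):
    """Return the connected components of the network as sorted lists."""
    adj = {n: set() for n in nodes}
    for u, v in links:
        adj[u].add(v)
        adj[v].add(u)
    seen, parts = set(), []
    for start in sorted(adj):
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:                      # depth-first walk of one component
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(adj[node] - component)
        seen |= component
        parts.append(sorted(component))
    return parts

def choose_roots(parts, switches):
    """Pick one root per partition: the lowest-named switch (hypothetical rule)."""
    return [min((n for n in part if n in switches), default=part[0])
            for part in parts]

# Hypothetical topology: A - sw1 - sw2 - B, with the sw1-sw2 link failed.
nodes = {"A", "B", "sw1", "sw2"}
working_links = [("A", "sw1"), ("sw2", "B")]
parts = partitions(nodes, working_links)
roots = choose_roots(parts, switches={"sw1", "sw2"})
```

With the inter-switch link restored, the same code yields a single partition and a single root.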
Emulation completeness…
| Feature | Ethernet | Emulated Ethernet |
| --- | --- | --- |
| P-to-P | Yes | Yes |
| Broadcast | Hardware | Software (multiple unicast) |
| Multicast | Hardware | Software (Broadcast w/ filters) |
| Layer 3, 4 services, e.g. VLAN, IGMP | Some advanced switches | Not implemented |
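The two software rows of the table can be illustrated as follows: a broadcast becomes one unicast per other node, and a multicast becomes such a broadcast with a group-membership filter. This is a sketch only; the callback-based interface and the `memberships` table are hypothetical, and the slides do not say where Mendosus applies the filter.

```python
def broadcast(sender, nodes, send_unicast):
    """Emulate a broadcast as a sequence of unicasts to every other node."""
    for node in sorted(nodes):
        if node != sender:
            send_unicast(sender, node)

def multicast(sender, group, nodes, memberships, deliver):
    """Emulate multicast as a software broadcast plus a membership filter."""
    def filtered(src, dst):
        # Only nodes subscribed to the group see the packet.
        if group in memberships.get(dst, ()):
            deliver(src, dst)
    broadcast(sender, nodes, filtered)

# Hypothetical three-node LAN where only B has joined group "g1".
nodes = {"A", "B", "C"}
memberships = {"B": {"g1"}}

sent = []
broadcast("A", nodes, lambda src, dst: sent.append((src, dst)))

delivered = []
multicast("A", "g1", nodes, memberships,
          lambda src, dst: delivered.append((src, dst)))
```

The cost of this approach grows with the number of nodes, which is presumably what the software broadcast scaling benchmark measures.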
Micro-benchmarks
Emulation Limits
| Network | No. of Switches in Topology | Throughput (MB/sec) | RTT (usec) |
| --- | --- | --- | --- |
| Fast Ethernet | 1 | 11.8 | 88.9 |
| Gigabit Ethernet | 0 | 66.0 | 130.0 |
| Emulator | 1 | 79.6 | 53.4 |
| Emulator | 8 | 79.1 | 54.8 |
Software Broadcast Scaling
Fault View Convergence
Case Studies
Group Membership
Test protocol behavior under faults
  Subtle interactions in distributed protocols
Three Round Membership algorithm
  Robust against multiple node failures, packet drops and network partitions
  Two modes of operation: normal and FCM
Membership Observations
[Figure: nodes A, B, C and D with link L, and a timeline of injected events 1 to 5]
1. NIC failure at B
2. Link L down
3. NIC at B recovers
4. Packet drops at A
5. Link L up
Multi-Level Switched Network
Large enterprise LANs have multiple layers of network components
  Access, core and aggregation switches
How to evaluate availability vs. cost vs. complexity?
Study service availability with increased redundancy
  Faults following exponential distributions
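As a rough analytic cross-check for such a study, steady-state availability can be estimated from mean time to failure and mean time to repair. The sketch below uses the standard formulas for independent components and assumes instantaneous fail-over; the MTTF and MTTR values are hypothetical, and the emulated results may differ because real services do not fail over instantly.

```python
def availability(mttf, mttr):
    """Steady-state availability of one component: MTTF / (MTTF + MTTR)."""
    return mttf / (mttf + mttr)

def parallel(avails):
    """Availability of redundant components: up if at least one is up."""
    down = 1.0
    for a in avails:
        down *= 1.0 - a
    return 1.0 - down

def series(avails):
    """Availability of a chain of components: up only if all are up."""
    up = 1.0
    for a in avails:
        up *= a
    return up

# Hypothetical numbers: each switch fails every 990 h on average, 10 h repair.
switch = availability(mttf=990.0, mttr=10.0)     # about 0.99
core_pair = parallel([switch, switch])           # two redundant core switches
path = series([switch, core_pair])               # access switch, then core pair
```

Here adding a second core switch raises the core layer from about 0.99 to about 0.9999, while the single access switch remains the limiting component.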
Enterprise LAN
Availability Vs Redundancy
Related Work
Network emulation
  Distributed emulation
    Emulab [Utah], DelayLine
  Centralized emulation
    NISTNET, Lancaster emulator
Fault injection
  Script-based probing and fault injection
    Orchestra, DOCTOR
  Correlated faults
    Loki [UIUC]
Simulation
  NS-2, REAL [Cornell], SSFNet, x-sim [Arizona]
Future Work
Extend Mendosus to emulate other networks
  WAN: build in performance dynamics model
  Wireless LAN: realistic fault and performance models
Support pluggable modules within network components which add functionality and additional failures
  Intelligent routing protocols (e.g. HSRP)
  Dynamic DNS, RR DNS
Summary
Test-bed for service designers to systematically analyze network and protocol design against failures
Results show that real-time emulation is feasible given capability of current SAN networks
Demonstrated the flexibility and usefulness of Mendosus through 2 case studies
Another step towards building highly available services…