Challenges and Chances in Network Reliability
CHALLENGES AND CHANCES IN NETWORK RELIABILITY
Zhaobo Zhang
Huawei Technologies (USA)
2014-09-11
Outline
Background of IP Network System Reliability
Causes of unreliable network
Potential Directions
Background
Fast-growing numbers of computers and mobile devices; ISPs (regional, backbone); IXPs
Primary source of information sharing & communication
Various applications: data, voice, video conferencing, P2P
High demands: QoS, reliability, efficiency
Scale labels from the slide graphic: Billions, Millions, Thousands, Hundreds
2010 Internet
The Opte Project by Barrett Lyon
Seeks to make an accurate representation of the Internet using visual graphics.
Network System reliability
Metrics
  Quality of service: connectivity, E2E delay, E2E packet loss rate
  Network topology, service level agreement
  Availability = MTBF / (MTBF + MTTR)
    MTBF: Mean Time Between Failures; MTTR: Mean Time To Repair
    e.g. 99.999% availability means an annual downtime of about 5.26 minutes
Verification
  Through fault insertion tests and field data
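The availability formula above can be checked with a few lines of Python. This is a minimal sketch: the function names and the example MTBF/MTTR values are illustrative assumptions, not figures from the talk.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes in a non-leap year


def availability(mtbf_hours: float, mttr_hours: float) -> float:
    """Steady-state availability = MTBF / (MTBF + MTTR)."""
    return mtbf_hours / (mtbf_hours + mttr_hours)


def annual_downtime_minutes(avail: float) -> float:
    """Expected downtime per year implied by an availability figure."""
    return (1.0 - avail) * MINUTES_PER_YEAR


if __name__ == "__main__":
    # Hypothetical device: fails on average every 10,000 h, repaired in 1 h.
    a = availability(10_000, 1)
    print(f"availability = {a:.6f}")
    print(f"five nines downtime = {annual_downtime_minutes(0.99999):.2f} min/year")
```

Working backwards from a target such as "five nines" gives the downtime budget that MTBF and MTTR together have to meet.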
Causes of unreliable network
IP connectivity errors: unstable transmission, throughput overflow, delay, network security threats, IP resource management
Network mis-configuration: network topology loop, non-optimal path, duplex mismatch, protocol unawareness
Software: version/patch conflict; logic mis-configuration; device driver bugs
Environment: cable/fiber cut, device damage; electrical noise, power outage
Hardware: power/clock, logic aging, RAM failure, soft error
Potential Direction 1
Reliability-aware hardware design
  Redundancy: RAM, link, NPU, board
  Built-in smart logic
    Monitor misbehavior (e.g. delay increase) and raise early alerts
    Monitor traffic; balance traffic and heat to slow aging; auto-reroute to avoid defective logic
Figure labels: NPU (four blocks), RAM (two blocks), Smart. Orange colors are spares.
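The "monitor misbehavior, early alert" idea can be illustrated in software, even though the slide describes built-in hardware logic. The sketch below is an assumption-laden illustration: the class name, the window size, and the 1.5x threshold are hypothetical and not taken from the talk.

```python
from collections import deque
from statistics import mean


class DelayMonitor:
    """Flags a sample when it exceeds the recent average by a set factor."""

    def __init__(self, window: int = 5, factor: float = 1.5):
        self.samples = deque(maxlen=window)  # recent non-alerting delays
        self.factor = factor

    def observe(self, delay_us: float) -> bool:
        """Return True if this sample should raise an early alert."""
        full = len(self.samples) == self.samples.maxlen
        alert = full and delay_us > self.factor * mean(self.samples)
        if not alert:
            # Only fold non-alerting samples into the baseline so that a
            # burst of bad samples does not inflate the threshold.
            self.samples.append(delay_us)
        return alert


if __name__ == "__main__":
    monitor = DelayMonitor(window=3, factor=1.5)
    for delay in [10, 10, 10, 11, 20, 10]:
        print(delay, monitor.observe(delay))
```

A real design would likely combine several signals (delay, error counters, temperature) before rerouting traffic to a spare; a single threshold is only the simplest case.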
Potential Direction 2
Data mining & automated process
  Learn from historical data to provide guidance for current/next-generation design, verification introduction, and debug
Data sources: R&D, CM, O&M
• Failure cases • Test & component stats
• Field-return data • Field failure cases
• Design spec • Verification list • Fault database • FIT result • FMEA
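One simple form of learning from failure history is to rank root causes by frequency, so that design and verification effort goes to the largest contributors first. The sketch below uses made-up records; the record layout, the function name, and every data point are hypothetical, with cause categories borrowed from the "Causes of unreliable network" slide.

```python
from collections import Counter

# Hypothetical failure records: (data source, root-cause category).
RECORDS = [
    ("O&M", "hardware"),
    ("O&M", "mis-configuration"),
    ("CM", "hardware"),
    ("R&D", "software"),
    ("O&M", "hardware"),
    ("O&M", "environment"),
]


def rank_causes(records):
    """Return (cause, count, share) tuples, most frequent cause first."""
    counts = Counter(cause for _source, cause in records)
    total = sum(counts.values())
    return [(cause, n, n / total) for cause, n in counts.most_common()]


if __name__ == "__main__":
    for cause, n, share in rank_causes(RECORDS):
        print(f"{cause:18s} {n:2d}  {share:.0%}")
```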
Potential Direction 3
Big data, big network, big infrastructure, BIG power
Power consumption control
  Low-power design
  Dynamic control: sleep mode, turning off SerDes and MAC
Thermal control
  Heat is an enemy of devices: as a rule of thumb, every 10 degrees Celsius of temperature rise roughly doubles the speed of chemical reactions.
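The 10-degree rule of thumb translates into a simple exponential factor. The sketch below only encodes that rule; the function name is an assumption, and real aging models (for example Arrhenius-based ones) depend on the failure mechanism and its activation energy, so the numbers are indicative at best.

```python
def reaction_rate_factor(delta_t_celsius: float, doubling_step: float = 10.0) -> float:
    """Relative reaction rate after a temperature rise, per the 10-degree rule of thumb."""
    return 2.0 ** (delta_t_celsius / doubling_step)


if __name__ == "__main__":
    for rise in (0, 10, 20, 30):
        print(f"+{rise} C -> x{reaction_rate_factor(rise):.0f}")
```

Under this rule a 30-degree rise corresponds to roughly an eightfold speed-up, which is why thermal control appears next to power control on the slide.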
Wikipedia: I know everything!
Google: I have everything!
Facebook: I know everybody!
Internet: Without me you are all nothing!
Electricity: Keep talking, bitches.
2% of global energy usage
Potential Direction 4
Fault-tolerant control layer design/testing
SDN & OpenFlow
  Decouple network control and forwarding functions
  Directly programmable network control
  The controller performs design validation as part of configuring the network, and that validation helps eliminate manual errors
Figure: SDN architecture with three layers. Application Layer (Business Applications), SDN Control Layer (Network Services), Infrastructure Layer.
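To make the controller-side design validation concrete, the sketch below checks a proposed configuration for two of the mis-configurations listed earlier: a topology loop and a duplex mismatch. It is a standalone illustration under stated assumptions, not the API of any SDN controller: the data layout (`links` as device pairs, `duplex` keyed by `(device, neighbor)`) and all function names are hypothetical. It also treats any cycle as an error, which is a simplification, since real networks often keep redundant links on purpose.

```python
def has_loop(links):
    """Detect a cycle in an undirected topology using union-find."""
    parent = {}

    def find(node):
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]  # path halving
            node = parent[node]
        return node

    for a, b in links:
        root_a, root_b = find(a), find(b)
        if root_a == root_b:
            return True  # a and b were already connected: this link closes a loop
        parent[root_a] = root_b
    return False


def duplex_mismatches(duplex, links):
    """Return links whose two ends are configured with different duplex modes."""
    return [(a, b) for a, b in links if duplex[(a, b)] != duplex[(b, a)]]


def validate(duplex, links):
    """Collect human-readable problems; an empty list means the checks passed."""
    problems = []
    if has_loop(links):
        problems.append("topology loop")
    problems += [f"duplex mismatch on {a}-{b}" for a, b in duplex_mismatches(duplex, links)]
    return problems


if __name__ == "__main__":
    links = [("s1", "s2"), ("s2", "s3")]
    duplex = {
        ("s1", "s2"): "full", ("s2", "s1"): "full",
        ("s2", "s3"): "full", ("s3", "s2"): "half",
    }
    print(validate(duplex, links))
```

Running checks like these before a configuration is pushed is one way a controller can catch manual errors that would otherwise only show up in the field.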
Thanks!