Rethinking Network Control and Management
David A. Maltz [email protected]
Context for Network Control and Management
Many different network environments
Access, backbone networks
Data-center networks, enterprise/campus
Many different technologies
Longest-prefix routing, label switching, circuit switching
IP, Ethernet, MPLS, optical circuits
Outsourcing of responsibility into the network
Middle-boxes: firewalls, network monitoring, …
Many different policies
Routing, reachability, transit, traffic engineering, robustness
AT&T/CMU Study of 31 Production Networks
Provider & enterprise networks (10-1200 routers)
Many different routing designs
Packet filters, multiple OSPF instances, multiple ASs
[Figure: lines in config file (axis 0 to 2000) plotted by router ID]
Fundamental Problem: Wrong Abstractions
Management Plane
• Figure out what is happening in network
• Decide how to change it
Control Plane
• Multiple routing processes on each router
• Each router with different configuration program
• Huge number of control knobs: metrics, ACLs, policy
Data Plane
• Distributed routers
• Forwarding, filtering, queueing
• Based on FIB or labels
[Figure: three-plane diagram; labels: Shell scripts, Traffic Eng, Databases, Planning tools, SNMP, netflow, modems, Configs, OSPF, BGP, Link metrics, Routing policies, Packet filters, FIB]
Inside a Single Network
Data Plane
• Distributed routers
• Forwarding, filtering, queueing
• Based on FIB or labels
Management Plane
• Figure out what is happening in network
• Decide how to change it
Control Plane
• Multiple routing processes on each router
• Each router with different configuration program
• Huge number of control knobs: metrics, ACLs, policy
[Figure: three-plane diagram; labels: Shell scripts, Traffic Eng, Databases, Planning tools, SNMP, netflow, modems, Configs, OSPF, BGP, Link metrics, Routing policies, Packet filters, FIB]
State everywhere!
• Dynamic state in FIBs
• Configured state in settings, policies, packet filters
• Programmed state in magic constants, timers
• Many dependencies between bits of state
State updated in uncoordinated, decentralized way!
Logic everywhere!
• Path Computation built into routing protocols
• Routing Policy distributed across the routers
• Packet Filters placed by tools in Management Plane
No way to arbitrate inconsistencies between logic
Control Plane: The Key Leverage Point
Great Potential: control plane determines the behavior of the network
Reaction to events, reachability, services
Great Opportunities
Each network (administrative domain) has its own control plane
A radical clean-slate control plane can be deployed
– Agnostic to user data format: IPv4/v6, Ethernet, circuit
– No changes to end-system software
Control plane is the nexus of network evolution
– Changing the control plane logic can smooth transitions in network technologies and architectures
An Alternative: The 4D Architecture
Key principles
Network-level objectives
Network-wide views
Direct control
Corollaries
Predictable behavior (including overload threshold)
Zero device-specific or manual configuration
Data plane support for network-wide view
Define objectives in terms of organizationally salient entities
Good Abstractions Reduce Complexity
All decision making logic lifted out of control plane
Eliminates duplicate logic in management plane
Dissemination plane provides robust communication to/from data plane switches
[Figure: Management Plane / Control Plane / Data Plane compared with Decision Plane / Dissemination / Data Plane; labels: Configs; FIBs, ACLs]
Overview of the 4D Architecture
Decision Plane:
All management logic implemented on centralized servers making all decisions
Decision Elements use views to compute data plane state that meets objectives, then directly write this state to routers
[Figure: the four planes (Decision, Dissemination, Discovery, Data), annotated with network-level objectives, direct control, and network-wide views]
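To make the decision-plane idea concrete, here is a minimal sketch of a Decision Element that takes a network-wide view, computes forwarding state, and writes it to every router. It is not the prototype's logic: the names `shortest_path_fib`, `decide`, and `write_state`, the dictionary-based view, and the choice of plain least-cost routing as the only objective are all assumptions made for illustration.

```python
import heapq

def shortest_path_fib(adj, src):
    """Dijkstra from src; returns {destination: first hop on a least-cost path}."""
    best = {src: 0}
    fib = {}
    done = set()
    heap = [(0, src, src)]  # (distance, node, first hop used to reach node)
    while heap:
        dist, node, hop = heapq.heappop(heap)
        if node in done:
            continue  # stale heap entry for a node that is already settled
        done.add(node)
        if node != src:
            fib[node] = hop
        for nbr, cost in adj[node].items():
            new_dist = dist + cost
            if nbr not in done and new_dist < best.get(nbr, float("inf")):
                best[nbr] = new_dist
                first_hop = nbr if node == src else hop
                heapq.heappush(heap, (new_dist, nbr, first_hop))
    return fib

def decide(view):
    """view (hypothetical format): {(router_a, router_b): cost}, bidirectional links."""
    adj = {}
    for (a, b), cost in view.items():
        adj.setdefault(a, {})[b] = cost
        adj.setdefault(b, {})[a] = cost
    return {router: shortest_path_fib(adj, router) for router in adj}

def write_state(routers, fibs):
    """Direct control: overwrite each router's FIB with the computed one."""
    for name, fib in fibs.items():
        routers[name]["fib"] = fib

view = {("A", "B"): 1, ("B", "C"): 1, ("A", "C"): 5}
routers = {name: {"fib": {}} for name in "ABC"}
write_state(routers, decide(view))
print(routers["A"]["fib"])
```

The routers hold no routing logic of their own in this sketch; when the view changes, the Decision Element recomputes and rewrites the tables.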
Concerns and Challenges
Distributed Systems issues
How will communication between routers and DEs survive failures in the network?
Latency means DE’s view of network is behind reality. Will the control loop be stable?
What is the overhead to/from the DEs?
What happens in a network partition?
Networking issues
Does the 4D simplify control and management?
Can we create logic to meet multiple objectives?
Evaluation of the 4D Prototype
Evaluated using Emulab (www.emulab.net)
Linux PCs used as routers (650 – 800MHz)
Tested on 9 enterprise network topologies (10-100 routers each)
Example network with 49 switches and 5 DEs
Performance of the 4D Prototype
Trivial prototype has performance comparable to well-tuned production networks
Recovers from single link failure in < 300 ms
< 1 s response considered “excellent”
Faster forwarding reconvergence possible
Survives failure of master Decision Element
New DE takes control within 1 s
No disruption unless second fault occurs
Gracefully handles complete network partitions
Less than 1.5 s of outage
Future Work
Scalability
Evaluate over 1-10K switches, 10-100K routes
Networks with backbone-like propagation delays
Structuring decision logic
Arbitrate among multiple, potentially competing objectives
Unify control when some logic takes longer than others
Protocol improvements
Better dissemination and discovery planes
Deployment in today’s networks
Data center, enterprise, campus, backbone (RCP)
Future Work
Expand relationships with security
Securing the infrastructure
Using 4D as mechanism for monitoring/quarantine
Formulate models that establish bounds of 4D
Scale, latency, stability, failure models, objectives
Generate evidence to support/refute principles
Themes of Network Control & Management
Holistic Design
Many different technologies – a few common problems
Find the right abstractions: exploit commonality
Clean Slate
How much autonomy do routers/switches need?
New principles for controlling networks
Separate networking issues from distributed system issues
Leverage Network Structure
Many different types of networks exist - each with different objectives and topologies
Recent Publications
G. Xie, J. Zhan, D. A. Maltz, H. Zhang, A. Greenberg, G. Hjalmtysson, J. Rexford, “On Static Reachability Analysis of IP Networks,” IEEE INFOCOM 2005, Orlando, FL, March 2005.
J. Rexford, A. Greenberg, G. Hjalmtysson, D. A. Maltz, A. Myers, G. Xie, J. Zhan, H. Zhang, “Network-Wide Decision Making: Toward a Wafer-Thin Control Plane,” Proceedings of ACM HotNets-III, San Diego, CA, November 2004.
D. A. Maltz, J. Zhan, G. Xie, G. Hjalmtysson, A. Greenberg, H. Zhang, “Routing Design in Operational Networks: A Look from the Inside,” Proceedings of the 2004 Conference on Applications, Technologies, Architectures, and Protocols for Computer Communications (ACM SIGCOMM 2004), Portland, Oregon, 2004.
D. A. Maltz, J. Zhan, G. Xie, H. Zhang, G. Hjalmtysson, A. Greenberg, J. Rexford, “Structure Preserving Anonymization of Router Configuration Data,” Proceedings of ACM/Usenix Internet Measurement Conference (IMC 2004), Sicily, Italy, 2004.
A Clean-slate Design
What are the fundamental causes of network problems?
How to secure the network and protect the infrastructure?
What functionality needs to be distributed – what can be centralized?
How to reduce/simplify the software in networks?
What would a “RISC” router look like?
How to leverage technology trends?
CPU and link-speed growing faster than # of switches
Three Principles for Network Control & Management
Network-level Objectives:
Express goals explicitly
Security policies, QoS, egress point selection
Do not bury goals in box-specific configuration
[Figure: management logic holding the reachability matrix and traffic engineering rules]
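As an illustration of stating a goal explicitly rather than burying it in box-specific configuration, the sketch below writes a reachability matrix once as data and derives per-destination packet filters from it. The group names, subnets, the `ALLOWED` set, and the helpers `permitted` and `filters_for` are hypothetical, and the assumption that hosts in the same group may always talk to each other is ours, not the slides'.

```python
import ipaddress

# Hypothetical groups and address blocks.
SUBNETS = {
    "engineering": ipaddress.ip_network("10.1.0.0/16"),
    "finance": ipaddress.ip_network("10.2.0.0/16"),
    "servers": ipaddress.ip_network("10.3.0.0/16"),
}

# The network-level objective: which (source, destination) groups may communicate.
ALLOWED = {("engineering", "servers"), ("finance", "servers")}

def group_of(addr):
    """Return the group whose subnet contains addr, or None if there is none."""
    ip = ipaddress.ip_address(addr)
    for name, net in SUBNETS.items():
        if ip in net:
            return name
    return None

def permitted(src, dst):
    """Check one flow against the objective (same-group traffic assumed allowed)."""
    s, d = group_of(src), group_of(dst)
    if s is None or d is None:
        return False
    return s == d or (s, d) in ALLOWED

def filters_for(dst_group):
    """Derive the filter entries to install in front of dst_group."""
    rules = []
    for src in sorted(SUBNETS):
        ok = src == dst_group or (src, dst_group) in ALLOWED
        action = "permit" if ok else "deny"
        rules.append((action, str(SUBNETS[src]), str(SUBNETS[dst_group])))
    return rules

print(permitted("10.1.2.3", "10.3.0.9"))
print(filters_for("finance"))
```

Changing the policy means editing `ALLOWED`; the filters are regenerated rather than hand-edited on each box.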
Three Principles for Network Control & Management
Network-wide Views:
Design network to provide timely, accurate info
Topology, traffic, resource limitations
Give logic the inputs it needs
[Figure: management logic (reachability matrix, traffic engineering rules) reading state info from the network]
Three Principles for Network Control & Management
Direct Control:
Allow logic to directly set forwarding state
FIB entries, packet filters, queuing parameters
The logic computes the desired network state, so let it implement that state
[Figure: management logic (reachability matrix, traffic engineering rules) reading state info from the network and writing state to it]
Overview of the 4D Architecture
Dissemination Plane:
Provides a robust communication channel to each router – and robustness is the only goal!
May run over same links as user data, but logically separate and independently controlled
Overview of the 4D Architecture
Discovery Plane:
Each router discovers its own resources and its local environment
E.g., the identity of its immediate neighbors
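A minimal sketch of how per-router discovery results could be assembled into a network-wide view follows. The report format and the `build_view` helper are hypothetical, and the rule that a link counts only when both ends report each other is an assumption added here to guard against one-way or stale reports.

```python
def build_view(reports):
    """reports: {router: set of neighbors that router discovered locally}.

    Returns the set of links confirmed by both endpoints.
    """
    links = set()
    for router, neighbors in reports.items():
        for nbr in neighbors:
            # Keep the link only if the neighbor also reports this router.
            if router in reports.get(nbr, set()):
                links.add(frozenset((router, nbr)))
    return links

reports = {
    "A": {"B", "C"},  # A hears both B and C
    "B": {"A"},
    "C": set(),       # C has not discovered A yet
}
view = build_view(reports)
print(sorted(sorted(link) for link in view))
```

Each router only needs to know its own resources and immediate neighbors; assembling the global picture is left to whoever collects the reports.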
Overview of the 4D Architecture
Data Plane:
Spatially distributed routers/switches
Can deploy with today’s technology
Looking at ways to unify forwarding paradigms across technologies
Fundamental Problem: Conflation of Issues
Ideal case: all routing information flooded to all routers inside network
Robustness achieved via flooding
Reality: routing information filtered and aggregated extensively
Route filtering used to implement security and resource policies
Route aggregation used to achieve scalability
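For readers unfamiliar with route aggregation, the short sketch below shows the basic operation using Python's standard `ipaddress` module: adjacent prefixes collapse into one covering prefix, so fewer routes need to be advertised. The prefixes are made up for illustration.

```python
import ipaddress

# Four adjacent /24 routes plus one unrelated route (illustrative values).
routes = [
    ipaddress.ip_network("10.0.0.0/24"),
    ipaddress.ip_network("10.0.1.0/24"),
    ipaddress.ip_network("10.0.2.0/24"),
    ipaddress.ip_network("10.0.3.0/24"),
    ipaddress.ip_network("192.168.5.0/24"),
]

# collapse_addresses merges contiguous networks into the fewest covering prefixes.
aggregated = list(ipaddress.collapse_addresses(routes))
print([str(net) for net in aggregated])
```

The routers that receive only the aggregate no longer see which of the four /24s is actually reachable, which is the loss of information the slide is pointing at.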
4D Separates Distributed Computing Issues from Networking Issues
Distributed computing issues → protocols and network architecture
Overhead
Resiliency
Scalability
Networking issues → management logic
Traffic engineering and service provisioning
Egress point selection
Reachability control (VPNs)
Precomputation of backup paths
4D Can Leverage Network Structure
Decision plane logic can be specialized for structure of each physical network
Distributed protocols must be prepared for arbitrary topology graphs
4D enables network logic specialized differently for access and for backbone
E.g., creating aggregation tree in access network
Advantages
Faster route computations
Retain flexibility to evolve network as needed
Support transition to 100x100 architecture
The Feasibility of the 4D Architecture
We designed and built a prototype of the 4D Architecture
4D Architecture permits many designs – prototype is a single, simple design point
Decision plane
Contains logic to simultaneously compute routes and enforce reachability matrix
Multiple Decision Elements per network, using simple election protocol to pick master
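The slides do not say which election protocol the prototype uses. As one possible sketch, the rule below picks the live Decision Element with the smallest identifier, where "live" means a heartbeat was seen recently. The heartbeat table, the millisecond timestamps, the 1000 ms timeout, and the `elect_master` name are all assumptions.

```python
def elect_master(last_heartbeat_ms, now_ms, timeout_ms=1000):
    """Return the lowest-ID Decision Element heard from within timeout_ms, or None."""
    live = [
        de for de, seen in last_heartbeat_ms.items()
        if now_ms - seen <= timeout_ms
    ]
    return min(live) if live else None

heartbeats = {"de1": 10000, "de2": 10200, "de3": 10100}
print(elect_master(heartbeats, now_ms=10500))  # all three are live
print(elect_master(heartbeats, now_ms=11150))  # de1 and de3 have timed out
```

Because every Decision Element applies the same deterministic rule to the same heartbeat data, they agree on the master without extra negotiation; disagreement is possible only while their views of the heartbeats differ.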
Dissemination plane
Uses source routes to direct control messages
Extremely simple & robust
Quickly route around failed data links, even multiple failures
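A minimal sketch of source-routed control messages is shown below: the sender lists every hop, each switch simply forwards to the next listed hop, and a failed link is handled by trying another listed path. The function names, the set-of-links representation, and the idea that the sender holds an ordered list of candidate routes are assumptions; the slides do not describe how the prototype chooses alternatives.

```python
def route_is_usable(route, up_links):
    """Walk the explicit hop list; fail as soon as one hop's link is down."""
    for here, nxt in zip(route, route[1:]):
        if frozenset((here, nxt)) not in up_links:
            return False
    return True

def pick_source_route(candidate_routes, up_links):
    """Return the first candidate route whose links are all up, or None."""
    for route in candidate_routes:
        if route_is_usable(route, up_links):
            return route
    return None

# The S1-S3 link has failed; the DE still reaches S3 through S2.
up_links = {
    frozenset(("DE", "S1")),
    frozenset(("DE", "S2")),
    frozenset(("S2", "S3")),
}
candidates = [["DE", "S1", "S3"], ["DE", "S2", "S3"]]
print(pick_source_route(candidates, up_links))
```

Switches along the path need no routing table for control traffic in this scheme, which is part of why the approach can stay simple.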