A New Software Architecture for Core Internet Routers · 2015. 8. 23. · Core Routers are built as...
A New Software Architecture for Core Internet Routers
Robert Broberg
September 16, 2011
Disclaimers and Credits
• This is research and no product plans are implied by any of this work.
• r3.cis.upenn.edu
• Early and continued support from www.vu.nl
• A large team has generated this work and I am just one of many spokespersons for them.
  – Any mistakes in this talk are mine.
Agenda
Overview of the evolution of Core router design
A sampling of SW problems encountered during evolution
An approach to resolving SW problems and continued evolution
Core Router Evolution
• WAN interconnects of mainframes over telecommunication infrastructure
• LAN/WAN interconnects
  – CORE routers (1+1 architectures)
  – Leased telco lines for customers
  – Dialup aggregation
• As CORE routers evolved, the old ones migrated to support edge connects
• Telco becomes a client of the IP network
The demand for increased network system performance/scale is relentless...
• Moore's law: x2 per 18 months
• DRAM access rate: x1.1 per 18 months
• Silicon speed: x1.5 per 18 months
• Router capacity: x2.9 per 18 months
• Internet traffic: "2x/year"
[Chart: growth curves over 2004 to 2014, with one linear axis (0 to 1200) and one logarithmic axis (1 to 10000)]
Growth driven by increased user demand
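The per-18-month factors can be put on the same footing as the "2x/year" traffic figure by annualising them. A minimal sketch, assuming each factor compounds smoothly over time; the function name `annualised` is ours, not from the talk.

```python
# Convert "x F every 18 months" growth factors into per-year factors,
# assuming smooth compounding: yearly = F ** (12 / 18).
RATES_PER_18_MONTHS = {
    "Moore's law": 2.0,
    "DRAM access rate": 1.1,
    "Silicon speed": 1.5,
    "Router capacity": 2.9,
}


def annualised(factor_per_18_months: float) -> float:
    """Return the equivalent growth factor over 12 months."""
    return factor_per_18_months ** (12 / 18)


for name, factor in RATES_PER_18_MONTHS.items():
    print(f"{name}: x{annualised(factor):.2f} per year")
```

Under that assumption router capacity works out to roughly x2 per year, in line with the traffic figure, while Moore's law alone gives about x1.6 per year and DRAM access rate well under x1.1.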
System Scaling Problems
Some of the reasons SW problems were encountered
• Routers started as tightly coupled embedded systems
  – speeds and feeds were the game with features
• CPUs + NPUs + very aware programmers led the game
• Evolution was very fast
  – Business customers
    • leased lines and frame relay
  – Mid 1990s 64kbit dialup starts
  – Core bandwidth doubling every year
• As IP customer populations grew feature demands increased
• Model of SW delivery not conducive to resilience of rapid feature deployment
Intent / Goals
– build an application-unaware fault-tolerant distributed system for routers
– always on (200 msec failover of apps)
– allow for insertion of new features with no impact to existing operations
– support +/- 1 versioning of key applications with zero packet loss
– versioning to allow for live feature testing
Fault Tolerant Routing
Motivations
• We must be able to do better than 1+1
  – Low confidence in 1+1 as only tested when actually upgrading/downgrading/crashing
• Want 100% confidence in new code
  – Despite lab time, rollout often uncovers showstoppers
  – Rollback can be very disruptive
• Aiming for sub-200ms 'outages'
  – Want to be able to recover before VOIP calls notice
Core Routers are built as Clusters but act as a single virtual machine
• Multiple line cards, potentially with various types of interfaces, use NPUs to route/switch amongst themselves via a data plane (switch fabric)
• A separate control plane controls all NPUs, programming switching tables and managing interface state and routing protocols along with environmental conditions
  – Control plane CPUs are typically generic and ride the commodity curve
• The systems are heterogeneous and large
  – Current Cisco CRS3 deployments switch 128tb, have ~150 x86 CPUs for the control plane along with ~1 terabyte of memory, and scale higher
Virtualization/Voting/BGP
• BGP state is tied to TCP connection state
  – loopback interfaces
• Process Placement
• Versioning
• Leader election
• HW virtualization
  – e.g. NPU virtualization???
Approach taken
• Abstraction layers chosen to isolate applications
  – applications (e.g. protocols) isolated with wrappers
    • application transparent checkpointing!!!!
• FTSS used to store state
• SHIM used as wrapper
  – model to allow for voting
• Optimize, optimize, optimize
  – experiment and prototype
• ORCM used for process placement
• Protocols isolated by a shim layer
  – multiple versions called siblings
• 2 levels of operation chosen
  – no use seen for hypervisor
  – user mode for apps; kernel; abstraction layer via SHIM + FTSS
Protocol Virtualization
• Existing protocol code largely untouched
• Can run N siblings
  – Can be different versions of the protocol being virtualized
  – Allows full testing of new code, with seamless switchover and switch back
• Currently we run one virtualization wrapper
  – Protected by storing state into FTSS
  – Can be restarted, thus upgradeable
  – Designed to know as little about the protocol as possible
    • Treats most of it as a 'bag of bits'
• 'Run anywhere' – no RP/LC assumptions
  – We don't care what you call the compute resources
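The sibling idea can be sketched in a few lines. This is an illustration under stated assumptions, not the talk's implementation: the `Shim` class, its method names, and the way it compares sibling output are all hypothetical, and the siblings are stand-in functions rather than real protocol code.

```python
from typing import Callable, Dict, List

# A sibling is any callable that turns an opaque message into routes.
Sibling = Callable[[bytes], List[str]]


class Shim:
    """Hypothetical wrapper: feeds every message to all siblings, forwards
    only the lead sibling's routes, and lets the lead be switched."""

    def __init__(self, siblings: Dict[str, Sibling], lead: str) -> None:
        self.siblings = siblings
        self.lead = lead
        self.disagreements: List[bytes] = []

    def handle(self, message: bytes) -> List[str]:
        # The shim does not parse the message; it is a bag of bits here.
        results = {name: run(message) for name, run in self.siblings.items()}
        lead_routes = results[self.lead]
        if any(routes != lead_routes for routes in results.values()):
            self.disagreements.append(message)  # recorded for live testing
        return lead_routes

    def switch_lead(self, name: str) -> None:
        if name not in self.siblings:
            raise KeyError(name)
        self.lead = name


def bgp_v1(msg: bytes) -> List[str]:
    return [msg.decode()]


def bgp_v2(msg: bytes) -> List[str]:
    return [msg.decode().lower()]  # stand-in for changed behaviour in new code


shim = Shim({"v1": bgp_v1, "v2": bgp_v2}, lead="v1")
old = shim.handle(b"ROUTE 10.0.0.0/8")
shim.switch_lead("v2")          # switchover; switch_lead("v1") switches back
new = shim.handle(b"ROUTE 192.168.0.0/16")
```

Because every sibling sees every message, the new version is exercised on live input before it ever becomes the lead.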
CRS utilisation
[Diagram: two RPs and four LCs]
• The CRS contains many CPUs which we treat as compute nodes in a cluster
• If a node fails the others take up its workload
• No data is lost on a failure, and the software adapts to re-establish redundancy
CRS utilisation
[Diagram: two RPs and four LCs, plus an external blade server]
• External resources can be added to the system to add redundancy or compute power
Placement of Components
• Each compute node runs FTSS and ORCM
  – both are started by 'qn' (system process monitor)
• FTSS stores routing data redundantly across all the systems in the router
• ORCM manages routing processes and distributes them around the router
  – constraints can be applied via configuration
• FTSS can run on other nodes to make use of memory if desired.
[Diagram: an RP and an LC, each running FTSS and ORCM]
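Constraint-driven placement of the kind described above could look roughly like this. A minimal sketch only: the `place` function, the least-loaded policy, and the constraint format (a set of permitted nodes per process) are assumptions for illustration and say nothing about how ORCM actually decides.

```python
from typing import Dict, List, Set


def place(processes: Dict[str, Set[str]], nodes: List[str]) -> Dict[str, str]:
    """Assign each process to the least-loaded node its constraint allows.

    `processes` maps a process name to the set of permitted nodes;
    an empty set means "run anywhere".
    """
    load = {node: 0 for node in nodes}
    placement: Dict[str, str] = {}
    for name in sorted(processes):
        allowed = [n for n in nodes
                   if not processes[name] or n in processes[name]]
        if not allowed:
            raise ValueError(f"no node satisfies constraints for {name}")
        # Ties are broken by node name so the result is deterministic.
        target = min(allowed, key=lambda n: (load[n], n))
        placement[name] = target
        load[target] += 1
    return placement


constraints = {
    "bgp-1": set(),              # run anywhere
    "bgp-2": set(),
    "isis-1": {"RP0", "RP1"},    # hypothetical configured constraint
    "rib": {"RP1"},
}
before = place(constraints, ["LC0", "LC1", "RP0", "RP1"])
# Simulate losing LC0 by placing again on the surviving nodes.
after = place(constraints, ["LC1", "RP0", "RP1"])
```

Re-running the whole placement after a failure is the simplest thing to show; a real system would more likely move only the processes that were on the failed node.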
BGP Virtualisation
[Diagram components: FTSS; ORCM; distributed dataplane; RIB; BGP virtualisation service (shim); reliable TCP endpoint; several BGP siblings, one of them a new BGP version]
Virtualisation Layer recovery
[Diagram components: FTSS; ORCM; distributed dataplane; RIB; BGP virtualisation service (shim); reliable TCP endpoint; three BGP siblings; a new shim]
IS-IS Virtualisation
[Diagram components: ORCM; distributed dataplane; RIB; IS-IS virtualisation service (shim); IS-IS L2 receiver; three IS-IS siblings]
Fault Tolerant State Storage
• Distributed Hash Table with intelligent placement of data
• You can decide how much replication – 2, 3, 4, N copies.
• More copies – more memory & slower write times.
• Fewer copies – fewer simultaneous failures tolerated
• Virtual Nodes – able to balance memory usage to space on compute node
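One common way to get configurable replication and virtual nodes in a DHT is consistent hashing. The sketch below uses that technique as an assumption; the talk does not say FTSS works this way, and the `Ring` class and its parameters are hypothetical.

```python
import bisect
import hashlib
from typing import Dict, List, Tuple


def _hash(data: bytes) -> int:
    return int.from_bytes(hashlib.sha256(data).digest()[:8], "big")


class Ring:
    def __init__(self, vnodes_per_node: Dict[str, int]) -> None:
        # More virtual nodes means a larger share of keys, so a node with
        # more free memory can be given more of them.
        self.points: List[Tuple[int, str]] = sorted(
            (_hash(f"{node}#{i}".encode()), node)
            for node, count in vnodes_per_node.items()
            for i in range(count)
        )
        self.node_count = len({node for _, node in self.points})

    def owners(self, key: bytes, copies: int) -> List[str]:
        """Return `copies` distinct nodes, walking the ring from the key."""
        if copies > self.node_count:
            raise ValueError("more copies requested than nodes available")
        start = bisect.bisect(self.points, (_hash(key), ""))
        found: List[str] = []
        for offset in range(len(self.points)):
            node = self.points[(start + offset) % len(self.points)][1]
            if node not in found:
                found.append(node)
                if len(found) == copies:
                    break
        return found


ring = Ring({"RP0": 8, "RP1": 8, "LC0": 4, "LC1": 4, "LC2": 4})
two = ring.owners(b"10.0.0.4", copies=2)
three = ring.owners(b"10.0.0.4", copies=3)
```

Raising `copies` only extends the owner list, so the first two owners stay the same when a third copy is added.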
FTSS distributed storage
[Diagram: FTSS instances on RP0, RP1, LC0, LC1 and LC2]
Some data – stored redundantly in 2 places
FTSS: losing a node
[Diagram: the FTSS instances, one of which has been lost]
Missing data is replicated from the predecessor
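Recovery after a node loss can be illustrated with a toy store. This is a sketch under assumptions: the `Cluster` class, its byte-sum hash, and the "copy from any survivor" repair step are ours, chosen for brevity, and differ from the predecessor-based scheme the slide names.

```python
from typing import Dict, List, Set


class Cluster:
    """Toy store: each key lives on `copies` consecutive nodes of a ring."""

    def __init__(self, nodes: List[str], copies: int = 2) -> None:
        self.nodes = list(nodes)
        self.copies = copies
        self.data: Dict[str, Dict[str, bytes]] = {n: {} for n in nodes}

    def _owners(self, key: str) -> List[str]:
        start = sum(key.encode()) % len(self.nodes)  # simple stable hash
        return [self.nodes[(start + i) % len(self.nodes)]
                for i in range(self.copies)]

    def put(self, key: str, value: bytes) -> None:
        for node in self._owners(key):
            self.data[node][key] = value

    def holders(self, key: str) -> Set[str]:
        return {n for n, kv in self.data.items() if key in kv}

    def fail(self, node: str) -> None:
        """Drop a node, then re-copy every key to its current owners."""
        self.data.pop(node)
        self.nodes.remove(node)
        survivors = {k: v for kv in self.data.values() for k, v in kv.items()}
        for key, value in survivors.items():
            for owner in self._owners(key):
                self.data[owner].setdefault(key, value)


cluster = Cluster(["RP0", "RP1", "LC0", "LC1", "LC2"], copies=2)
keys = [f"prefix-{i}" for i in range(20)]
for k in keys:
    cluster.put(k, k.encode())
cluster.fail("LC0")
```

With two copies, any single failure leaves at least one copy of every key, which is what lets the repair step restore redundancy without data loss. The toy does not clean up stale copies left on nodes that are no longer owners.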
DHT tuples
Key
• Binary data
• Unique in DHT
Value
• Binary data
Link
• Unique set of binary data items
• Optimizations for use as a list of keys
DHT provides optimised routines for:
• fast parallel store and deletion of multiple tuples
• fast update of multiple links within a tuple
• operations directly using the link list for storing related data
• fast parallel recovery of multiple, possibly inter-linked, KVL tuples
Copies of the tuples are stored on multiple nodes for redundancy
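A key/value/link (KVL) tuple and the bulk operations listed above can be sketched as a plain data structure. The names `KVL`, `TupleStore` and its methods are hypothetical, the example keys are made up, and this single-process stand-in models neither replication nor parallelism.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, List, Set


@dataclass
class KVL:
    key: bytes                                       # binary, unique in the DHT
    value: bytes                                     # binary payload
    links: Set[bytes] = field(default_factory=set)   # unique binary items


class TupleStore:
    """Single-process stand-in for the DHT."""

    def __init__(self) -> None:
        self._tuples: Dict[bytes, KVL] = {}

    def store_many(self, items: Iterable[KVL]) -> None:
        for item in items:
            self._tuples[item.key] = item

    def delete_many(self, keys: Iterable[bytes]) -> None:
        for key in keys:
            self._tuples.pop(key, None)

    def update_links(self, key: bytes, add: Iterable[bytes] = (),
                     remove: Iterable[bytes] = ()) -> None:
        links = self._tuples[key].links
        links.difference_update(remove)
        links.update(add)

    def follow(self, key: bytes) -> List[KVL]:
        """Treat the links as a list of keys and fetch the tuples they name."""
        return [self._tuples[k] for k in sorted(self._tuples[key].links)
                if k in self._tuples]


store = TupleStore()
store.store_many([
    KVL(b"peer:10.0.0.4", b"", {b"ann:1", b"ann:2"}),
    KVL(b"ann:1", b"ASPATH 1 + attrs"),
    KVL(b"ann:2", b"ASPATH 2 + attrs"),
])
store.update_links(b"peer:10.0.0.4", add=[b"ann:3"], remove=[b"ann:1"])
values = [t.value for t in store.follow(b"peer:10.0.0.4")]
```

Using the link set as a list of keys is what allows a group of related tuples to be recovered together from one starting key.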
DHT use in BGP processing
BGP shim operations (in order):
• Receive incoming BGP messages
• Acknowledge TCP
• Create minimal message set
• Hand to BGP siblings; routes produced
• Pass routes from lead sibling to RIB
Data held in the DHT:
• Unprocessed messages – early redundant store to permit fast acknowledgement of incoming BGP messages
• Attributes + NLRI – minimal set of incoming BGP data
• RIB prefixes – data store for re-syncing with RIBs on restart
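The ordering that matters in the shim operations above is "store redundantly, then acknowledge". A minimal sketch of that flow, assuming a made-up `prefix|attributes` message format and plain dictionaries standing in for the DHT; `BgpShimSketch` and all of its fields are hypothetical.

```python
from typing import Callable, Dict, List

Sibling = Callable[[bytes], List[str]]


class BgpShimSketch:
    def __init__(self, siblings: Dict[str, Sibling], lead: str) -> None:
        self.siblings = siblings
        self.lead = lead
        self.unprocessed: Dict[int, bytes] = {}    # stand-in for the DHT
        self.minimal_set: Dict[bytes, bytes] = {}  # prefix -> latest attrs
        self.rib_prefixes: List[str] = []
        self.acked: List[int] = []
        self._seq = 0

    def receive(self, message: bytes) -> int:
        """Store first, acknowledge second, so an acked message is never lost."""
        self._seq += 1
        self.unprocessed[self._seq] = message
        self.acked.append(self._seq)               # TCP ack would go out here
        return self._seq

    def process(self) -> None:
        for seq in sorted(self.unprocessed):
            message = self.unprocessed[seq]
            prefix, _, attrs = message.partition(b"|")
            self.minimal_set[prefix] = attrs       # newer replaces older
            results = {n: run(message) for n, run in self.siblings.items()}
            for route in results[self.lead]:       # only the lead feeds the RIB
                if route not in self.rib_prefixes:
                    self.rib_prefixes.append(route)
            del self.unprocessed[seq]


def sibling(msg: bytes) -> List[str]:
    return [msg.partition(b"|")[0].decode()]


shim = BgpShimSketch({"a": sibling, "b": sibling}, lead="a")
shim.receive(b"10.0.0.0/8|ASPATH 1")
shim.receive(b"10.0.0.0/8|ASPATH 2")
pending_before = len(shim.unprocessed)
shim.process()
```

Acknowledging before the siblings have run keeps the peer's TCP session moving, and the stored copy is what a restarted shim would replay.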
BGP data in DHT (I)
• Unprocessed incoming messages: 126, 127, 128, ..
• Source peers: 10.0.0.4, 192.168.22.5, 4.1.0.77, ..
• Announcements from peers, minimal set: ASPATH 1 + attrs, ASPATH 2 + attrs, ASPATH 3 + attrs, ..
• Data within links: NLRI + peer 1, NLRI + peer 2, NLRI + peer 3, ..
[Figure labels: Tuples; Tuples]
BGP data in DHT (II)
• Siblings (links): 1, 2, 3, ..
• RIB prefixes (tuples): 10.0.0.4, 19.1.22.5, 4.1.0.77, ..
DHT use in IS-IS processing
IS-IS shim operations (in order):
• Receive incoming IS-IS frames
• Create minimal message set
• Hand to IS-IS siblings; routes produced
• Pass routes from lead sibling to RIB
Data held in the DHT:
• LSPs – minimal set of incoming IS-IS frames
• RIB prefixes – data store for resyncing with RIBs on restart
Multipath IGP/EGP demo