White Box - listes.renater.fr
White Box
16th November 2018
Journée Réseaux Régionaux
Xavier JEANNIN
Maxime Wisslé
Frédéric Loui
1
• Disaggregation trend
– Network Operating System and hardware
– Junos on ‘kind of generic hardware’: OCX1100 (now end of life)
• The idea of running a NOS on unbranded hardware has emerged
– Even programmable hardware
– Reduce Total Cost of Ownership
• The architectures of data centers, regional networks and telecom carriers
are becoming very similar
• A large scope of use cases
– Routing/switching: from data center to telecom carrier
– Advanced network features: telemetry, analytics
2
Introduction
3
Proprietary design
• Business model
– Hardware design Proprietary
– Proprietary NOS (embedded)
– Hardware maintenance
– NOS maintenance
• Dependence on one vendor
4
White-box Design
• Chipsets are the same as in proprietary devices
• Performance and features depend on the forwarding chipset
(switching ASIC)
– Trident, Trident 2, Tomahawk, Qumran, Jericho
– Juniper/Cisco use the same chipsets
– ...
• Network Operating System (NOS)
– Commercial and open source
5
Network Operating Systems: vendor or project, NOS, licence, comments
• DELL, OS9 or OS10 (FreeBSD)
– Commercial
– Linux + Quagga + BGP EVPN (VXLAN?)
• Cumulus Networks
– Commercial
– Unix + CLI; data center
• IP Infusion, OcNOS
– Commercial
– Carrier experience
– Many carrier features: MPLS
• Pluribus Networks, NetVisor
– Commercial
– SDN solution, but with the controller embedded in each box
– Specific carrier features: data analytics, service chaining and VNF functions
• Open Network Linux
– Open / free
– https://opennetlinux.org/
• Barefoot Networks, on the Tofino chipset
– Open, and possibly commercial
– Certainly the future
– Built on a P4-programmable chipset: telemetry, DDoS mitigation, load balancers, …
• Pica8, PicOS
– Commercial
– Hybrid networking: OpenFlow agent, with native L2 and L3 features
• Big Switch Networks, SwitchLight
– Commercial
– SDN solution with a non-distributed controller
• Canonical, Snappy Ubuntu Core
– Commercial
• Software for Open Networking in the Cloud (SONiC)
– Open
– Microsoft and co-contributors to OCP
– Cloud-oriented, for Azure
• OpenSwitch
– Open
– Focuses on few functions but is very fast
• SnapRoute, FlexSwitch
– Open
• Open Network Foundation, Atrium SDN Distribution
– Open
– CORD use case
• Open Compute Project (http://www.opencompute.org/projects/networking/)
– Open
– Umbrella ("mother") project that hosts: Campus Branch Wireless (CBW), Open Network Install Environment (ONIE), Open Network Linux (ONL), SONiC, Switch Abstraction Interface (SAI)
• Routing and switching
– Cloud and data center (SURFsara), Campus, GIX (London Internet
Exchange [LINX]), Telco Operator …
• Analytics, Telemetry - Pluribus
• Service chaining and Network functions - Pluribus
• SDN architecture: Big Switch, …
• Data manipulation
– Security/filtering, deduplication, load balancing, packet slicing, time
stamping and protocol stripping.
• Optical white box ???
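The data manipulation functions listed above (deduplication, packet slicing, time stamping) can be illustrated with a minimal software sketch. It only shows the logic; on a white box these operations would run in the forwarding chipset. The names `slice_packet`, `Deduplicator` and `process`, the 64-byte snap length and the window size are all assumptions made for this example.

```python
import hashlib
import time
from collections import OrderedDict


def slice_packet(packet: bytes, snap_len: int = 64) -> bytes:
    """Packet slicing: keep only the first snap_len bytes (typically headers)."""
    return packet[:snap_len]


class Deduplicator:
    """Flag packets whose digest was seen among the last `capacity` packets."""

    def __init__(self, capacity: int = 1024) -> None:
        self.capacity = capacity
        self._seen = OrderedDict()  # digest -> None, oldest entry first

    def is_duplicate(self, packet: bytes) -> bool:
        digest = hashlib.sha256(packet).digest()
        if digest in self._seen:
            return True
        self._seen[digest] = None
        if len(self._seen) > self.capacity:
            self._seen.popitem(last=False)  # evict the oldest digest
        return False


def process(packets, dedup: Deduplicator, snap_len: int = 64):
    """Return (timestamp_ns, sliced_packet) for every non-duplicate packet."""
    out = []
    for pkt in packets:
        if dedup.is_duplicate(pkt):
            continue  # deduplication: drop the repeated copy
        out.append((time.monotonic_ns(), slice_packet(pkt, snap_len)))
    return out
```

A monitoring tool fed by such a pipeline receives each packet once, truncated and time-stamped, which reduces the load on the analysis servers.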
6
Scope of white box 1/2
• Programmable data plane: P4 language
– Based on PISA architecture [Protocol-Independent Switch Architecture], FPGA, Open vSwitch
– Tofino chipset from Barefoot Networks (https://www.barefootnetworks.com)
• Applications
– Security: In-Network DDoS Detection
– Performance: Layer 4 Load Balancer
– Monitoring: Advanced Network Telemetry
– Analytics
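As an illustration of the Layer 4 load balancer application, here is a minimal Python sketch of the per-flow decision such a data-plane program makes: hash the 5-tuple and pick a backend, so that every packet of a flow reaches the same server. It is not P4 code. The function names, the CRC32 hash and the addresses are assumptions chosen for the example.

```python
import socket
import struct
import zlib


def flow_key(src_ip: str, dst_ip: str, proto: int,
             src_port: int, dst_port: int) -> bytes:
    """Serialize the IPv4 5-tuple into a fixed 13-byte key."""
    return (socket.inet_aton(src_ip) + socket.inet_aton(dst_ip)
            + struct.pack("!BHH", proto, src_port, dst_port))


def pick_backend(key: bytes, backends: list) -> str:
    """Map a flow key to one backend; the same key always gives the same backend."""
    index = zlib.crc32(key) % len(backends)
    return backends[index]


if __name__ == "__main__":
    # Hypothetical server pool and flow (documentation address ranges).
    pool = ["10.0.0.1", "10.0.0.2", "10.0.0.3"]
    key = flow_key("192.0.2.1", "198.51.100.7", 6, 40000, 443)
    print(pick_backend(key, pool))
```

A plain modulo is the simplest choice; it remaps most flows when the pool size changes, which is why production designs usually add a connection table or consistent hashing.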
7
Scope of white box 2/2
https://p4.org/
8
Are white boxes able to scale
and
can they do the job?
9
10
White-box architecture
• Arista 7050X switch architecture
[Figure labels: TCAM, additional memory]
11
Chipset architecture
12
Chipset characteristics
• Buffer and memory
– Packet queue length
• Limit on the number of routes in the FIB
– For instance, Trident2 is limited to around 200,000 FIB routes
• Traditionally not scalable
• Several cores
• Scalable?
– Depends on the number of ports
– Depends on the network services
– We do not know the answer
• “Probably”, it should work with few ports (4 ports)?
• Can address some use cases?
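The FIB limit quoted above lends itself to a quick sanity check: does the route count of a use case fit in the chipset, with some margin left for growth? The sketch below assumes the approximate 200,000-route figure from the slide. The 20% headroom and the example route counts are illustrative assumptions, not measurements.

```python
TRIDENT2_FIB_LIMIT = 200_000  # approximate figure quoted on the slide


def fits_in_fib(route_count: int,
                fib_limit: int = TRIDENT2_FIB_LIMIT,
                headroom: float = 0.2) -> bool:
    """True if route_count fits while keeping a `headroom` fraction of the FIB free."""
    if not 0.0 <= headroom < 1.0:
        raise ValueError("headroom must be in [0, 1)")
    return route_count <= fib_limit * (1.0 - headroom)


if __name__ == "__main__":
    # Illustrative route counts (assumptions): a use case with a few
    # thousand routes versus a table far larger than the chipset limit.
    for name, routes in [("few thousand routes", 5_000),
                         ("very large table", 700_000)]:
        verdict = "fits" if fits_in_fib(routes) else "does not fit"
        print(f"{name}: {routes} routes, {verdict}")
```

This is why use cases with small routing tables are natural first candidates for this class of chipset.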
13
Why not use an x86 server?
[Figure labels: NIC, NIC, NIC]
• Crucial to check the way the NOS is written
– Features available
– Performance and scalability
• Manage route updates, number of routes, …
– Reliability
– Monolithic vs. several subsystems (daemons)
– Relationship between FIB and RIB
– Time sharing
– Memory sharing
– Watchdog
– Security: routing engine protection
14
NOS: points to consider
• Linux-based Universal NOS Installer
• Combination of boot loader and light OS
• Pre-installed on White-Box
• NOS Install/Uninstall
• NOS image host
– DHCP/Web server
– Licence server
• Serial access required
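To make the "NOS image host" step concrete: an installer environment of this kind typically looks for an image on a web server, trying the most specific file name first and falling back to a generic one. The sketch below builds such a candidate list. The naming pattern is modelled on ONIE's default installer names but simplified, so the exact names and order used by a given ONIE release should be checked against the ONIE documentation. The server address, vendor and machine strings are hypothetical.

```python
def candidate_installer_urls(server: str, arch: str, vendor: str,
                             machine: str, revision: str) -> list:
    """Return installer URLs to try, from most specific to most generic."""
    names = [
        f"onie-installer-{arch}-{vendor}_{machine}-r{revision}",
        f"onie-installer-{arch}-{vendor}_{machine}",
        f"onie-installer-{vendor}_{machine}",
        f"onie-installer-{arch}",
        "onie-installer",  # generic fallback name
    ]
    return [f"http://{server}/{name}" for name in names]


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    for url in candidate_installer_urls("192.0.2.10", "x86_64",
                                        "examplevendor", "examplebox", "0"):
        print(url)
```

In practice this means one web server can host images for several hardware models: publishing a file under a machine-specific name serves that model only, while the generic name serves every box that finds nothing more specific.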
15
ONIE: Open Network Install Environment
• Being more independent in the choice of hardware;
• Reducing the total cost of ownership of network devices;
• Possibly benefiting from advanced features;
– Traffic Engineering, PCE/SDN, Telemetry, Analytics, load
balancer, security (DDoS mitigation) ...
• Lots of use cases for the R&E community
– Campus, data center / Cloud, …
• Being able to benefit from network features and services
specific to education and research;
– LHCONE traffic
– …
• Data Plane Programmability for research
16
White-box stakes
• Can white boxes deliver the same services that NRENs provide to their end users?
• Is this technology mature enough for NREN reliability requirements?
• What is missing in an open network operating system before going into
production?
– Manageability, security of the open network operating system, documentation,
maintenance model
– …
• What are the potential first use cases for such technology?
– Campus, metropolitan network;
– Security features (DDoS mitigation, …);
– Scientific experiments
– …
• What transition model could be put in place in order to go into
production?
• What is the total cost of ownership of such technology?
17
Conditions for adopting the white-box model?
• NREN backbone
• Regional network
• Campus network
• Science project
• Global Internet eXchange (GIX)
• Cloud Fabric
18
Use cases
• PE for NREN Backbone
– 72 PoPs
– Inter-AS services
• Requirement
– Lot of network features
– VPN services (L2/L3)
– Advanced L2/L3 features
– CISCO (ASR 9000) and Juniper (MX 2010) interoperability
19
NREN backbone
• Our regional network is based on carrier Ethernet
• G.8032 ring, ERP, E-Line, E-LAN, E-Tree, OAM
20
Access Network / Regional Network
• Our education and research community is used to running campus networks; these campuses could be good candidates for white boxes
21
Campus
• Can be isolated in our backbone (VRF or VRF-light) and
forwarded by a set of white boxes
• The traffic can be heavy but the route number is quite small
(several thousands)
22
Science project
• NRENs commonly run Internet eXchange Points
• Exchange point can be used for providing connectivity
between academic cloud and also with commercial cloud
23
Global Internet eXchange (GIX) and distributed GIX
• Both RENATER and the research & education community
run data centers ranging from small to large;
• Technologies:
– IP Fabric + EVPN/VXLAN or …
– Others ….
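For readers new to the VXLAN overlay mentioned above, the encapsulation adds an 8-byte header carrying a 24-bit VXLAN Network Identifier (VNI), as defined in RFC 7348. The sketch below builds and parses that header with Python's standard `struct` module. The function names are chosen for this example, and the outer UDP/IP headers and the EVPN control plane are out of scope.

```python
import struct

VXLAN_FLAG_VNI_VALID = 0x08  # "I" flag: the VNI field is valid (RFC 7348)


def build_vxlan_header(vni: int) -> bytes:
    """Build the 8-byte VXLAN header for a 24-bit VNI."""
    if not 0 <= vni < 2 ** 24:
        raise ValueError("VNI must fit in 24 bits")
    # Word 1: 8 bits of flags + 24 reserved bits.
    # Word 2: 24 bits of VNI + 8 reserved bits.
    return struct.pack("!II", VXLAN_FLAG_VNI_VALID << 24, vni << 8)


def parse_vni(header: bytes) -> int:
    """Extract the VNI from a VXLAN header, checking the 'I' flag."""
    flags_word, vni_word = struct.unpack("!II", header[:8])
    if not (flags_word >> 24) & VXLAN_FLAG_VNI_VALID:
        raise ValueError("VNI flag not set")
    return vni_word >> 8


if __name__ == "__main__":
    hdr = build_vxlan_header(5000)
    print(hdr.hex(), parse_vni(hdr))
```

The 24-bit VNI is what lets a fabric carry about 16 million isolated segments, compared with 4094 usable VLAN IDs.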
24
Cloud fabric
• Objectives
– Check white-box usage in different contexts
(use cases)
• Features and bugs
– Are all elements available for production (network operation)?
• Monitoring
• Management (TACACS, routing engine protection, …)
• Software support provided by the NOS provider
• Documentation
– Business model
• Licence model
• TCO
25
A first white-box project
26
Testbed
• Hardware: DELL S4048-ON 1
– 48 × 10G interfaces
– 4 × 40G interfaces
– Trident II
• Very versatile chipset
• Open Network Install Environment
– (ONIE) 2
– Network Operating System (NOS)
• First NOS tested: MPLS-capable
• Support for many NOSes
– NetVisor
– Cumulus
– …
27
Test environment
[Figure label: DELL S4048]
• Hardware issue from the data center world
• No small machine available
• Management
– Authentication
• TACACS, Radius, …
– Monitoring
• sFlow, port mirroring, NTP, SMTP, DHCP
– …
• Routing
– VPN (L3VPN, L2VPN, ...)
– VLAN
– VXLAN
– IS-IS (Single-topology, no Multi-topology)
– BGP, MPLS, LDP
– Load balancing
– ….
• White_Box_PoC_result_table.docx
28
First results
• This NOS can perhaps be used in specific use cases: GIX,
Campus, …
– Maturity issues but encouraging prospects
• We currently use it for our new GIX design in Paris
– Still have to solve the problem of the single (non-redundant) routing engine …
– …
29
Next step
• Operation, security, documentation and a maintenance model
are mandatory
• The transition model is the key point
– Innovate in order to overcome what is missing
– Find the appropriate use case
• To start, train your team and gain experience
– Use the experience available in the R&E community
• Big data centers already use white boxes
30
Conclusion 1/3
• Do not expect better than what you have
– Getting the same will be a great result
– New usages/features will be a breakthrough
• White boxes will not replace our Juniper or Cisco boxes in a
first stage
– Instead, move specific services to white boxes: GIX, LHCONE, …
• Traditional vendors will not fail to react
– The price of legacy vendor boxes is already decreasing a lot
31
Conclusion 2/3
• The network landscape is changing deeply so a long
term approach is necessary
– Linux did not replace Solaris in one year /
Rome was not built in a day
• Irreversible path
– For White Box
– For data plane programming
32
Conclusion 3/3
• New hardware dedicated to telcos is
coming: Qumran, Jericho
– Internet routes: additional TCAM
– Increased buffer size
– Scalability
• Other use cases: telemetry, security
• Network programmability
– P4 / Barefoot
– A perfect tool for computing research
33
Perspectives
• GEANT project in GN4ph3, Jan. 2019 – Dec. 2020
– White Box
• Study the best NOS for Education and Research
• Maybe an open source NOS
– Data Plane Programming
• First use case: telemetry/monitoring
• Looking for your use cases that can be implemented in pilot
production
– RENATER will certainly propose GIX and possibly study your
use cases
– GRNET will propose an IP fabric for Data Center with an overlay
network (VXLAN/EVPN or …) with OpenStack orchestration
34
Perspectives