Progress on SDN, OVS and dynamic circuits
Ramiro Voicu, California Institute of Technology
LHCOPN-LHCONE meeting, Amsterdam, October 28-29, 2015
LHC Collisions at 13 TeV
Image source: CMS
Complex Workflow: the Flow Patterns Have Increased in Scale and Complexity, even at the start of LHC Run2
[Figure: WLCG Dashboard transfer throughput for CMS, ATLAS, ALICE and LHCb, June 2015 to October 2015; marked levels: 15 GB/s and 20 GB/s]
PhEDEx and Dynamic Circuits
Using dynamic circuits in PhEDEx allows for more deterministic workflows, useful for co-scheduling CPU with data movement.
Integrating circuit awareness into the FileDownload agent:
• Application is backend agnostic; no modifications to the PhEDEx DB
• All control logic is in the FileDownload agent
• Transparent for all other PhEDEx instances
[Diagram: T2_ANSE_Amsterdam (sandy01-ams, wood1-ams) and T2_ANSE_Geneva (sandy01-gva, hermes2), linked by a shared path and a high-speed WAN circuit]
[Plots: PhEDEx throughput on a shared path (with 5 Gbps of UDP cross traffic) and PhEDEx throughput on a dedicated path; y-axis: PhEDEx rate (MB/s); 1h moving average]
Seamless switchover, no interruption of service
FDT sustained rates: ~1500 MB/s
Average over 24 hrs: ~1360 MB/s
Integrating NSI Circuits with PhEDEx: Control Logic
1. PhEDEx requests a circuit between sites A and B; waits for confirmation
2. Wrapper gets a vector of source and destination IPs of all servers involved in the transfer, via an SRM plugin
3. Wrapper passes this information to the OF controller
4. PhEDEx receives the confirmation of the circuit, informs the OF controller that a circuit has been established between the two sites
5. OF controller adds routing information in the OF switches that directs all traffic on the subnet to the circuit
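The five steps above can be sketched as a small simulation. This is only an illustration under assumptions: the `OFController` class, `request_circuit`, and `run_transfer_setup` are hypothetical names, the NSI request is faked as an immediate confirmation, and rules are recorded per IP pair rather than per subnet for brevity.

```python
from dataclasses import dataclass, field


@dataclass
class OFController:
    """Stand-in for the OpenFlow controller (hypothetical interface)."""
    pending: dict = field(default_factory=dict)    # site pair -> server IP lists
    installed: list = field(default_factory=list)  # rules pushed to OF switches

    def register_endpoints(self, key, src_ips, dst_ips):
        # Step 3: the wrapper hands over the IPs involved in the transfer.
        self.pending[key] = (list(src_ips), list(dst_ips))

    def circuit_established(self, key, circuit_id):
        # Steps 4-5: once the circuit is confirmed, record one rule per
        # source/destination pair that steers the traffic onto the circuit.
        src_ips, dst_ips = self.pending.pop(key)
        for src in src_ips:
            for dst in dst_ips:
                self.installed.append((src, dst, circuit_id))


def request_circuit(site_a, site_b):
    """Step 1 placeholder: pretend the circuit request is always confirmed."""
    return f"circuit-{site_a}-{site_b}"


def run_transfer_setup(controller, site_a, site_b, src_ips, dst_ips):
    key = (site_a, site_b)
    circuit_id = request_circuit(site_a, site_b)           # step 1
    controller.register_endpoints(key, src_ips, dst_ips)   # steps 2-3
    controller.circuit_established(key, circuit_id)        # steps 4-5
    return circuit_id
```

In a real deployment the confirmation would arrive asynchronously; the sketch collapses that wait into a direct call to keep the ordering of the steps visible.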
Advanced Network Services for Experiments (ANSE)
Integrating PhEDEx with Dynamic Circuits for CMS
• Building on the AutoGOLE fabric of Open Exchange Points and NSI standard virtual circuits
• OSCARS + NSI circuits are used to create WAN paths with reserved bandwidth across the AutoGOLE fabric
• OF flow-matching is done on specific subnets to route only the desired traffic
• OpenFlow also can be used to select paths outside the fabric
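Subnet-based flow matching can be illustrated with Python's standard `ipaddress` module. The sketch below is an assumption-laden example, not the ANSE code: the two subnets and the `use_circuit` helper are hypothetical, and a real OpenFlow rule would express the same match in the switch flow table.

```python
import ipaddress

# Hypothetical subnets holding the storage servers at the two sites.
CIRCUIT_SUBNETS = [
    (ipaddress.ip_network("10.10.1.0/24"), ipaddress.ip_network("10.20.1.0/24")),
]


def use_circuit(src_ip, dst_ip, subnet_pairs=CIRCUIT_SUBNETS):
    """Return True if a flow between src_ip and dst_ip belongs on the circuit."""
    src = ipaddress.ip_address(src_ip)
    dst = ipaddress.ip_address(dst_ip)
    for net_a, net_b in subnet_pairs:
        # Match traffic in either direction between the two subnets.
        if (src in net_a and dst in net_b) or (src in net_b and dst in net_a):
            return True
    return False
```

Traffic to any address outside the listed subnets falls through and stays on the default path, which is the "route only the desired traffic" behaviour.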
Lapatadescu, Wildish, Mughal, Bunn, Legrand, Newman
ANSE: 1st real life application integration of NSI and the AutoGOLE fabric with PhEDEx for CMS
Open vSwitch (OVS)
• “Open vSwitch is a production quality, multilayer virtual switch”
• OpenFlow protocol support (1.3)
• Kernel and user space forwarding engines
• Runs on all major Linux distributions used in HEP
• NIC bonding
• Fine grained QoS support
• Ingress qdisc, HFSC, HTB
• Used in a variety of hardware platforms (e.g. Pica8) and software appliances like Mininet
• Interoperates with OpenStack
• OVN (Open Virtual Network): Virtualized network “implemented on top of a tunnel-based (VXLAN, NVGRE, IPsec, etc.) overlay network”
Using OVS for end-host orchestration
Integrating PhEDEx with Dynamic Circuits for CMS
• Standard OpenFlow (or OVSDB) protocol for end-host network orchestration (no need for custom SB protocol)
• Simple procedure to migrate to OVS on the end-host
• SDN controller not required in the initial deployment phase
• Host type (storage, compute) dynamically discovered using OF identification string
• Use SDN controller to create an overlay network from circuit end-point (Border Router) to the storage
SDN QoS: Traffic Shaping with Open vSwitch (OVS)
• OVS 2.3.1 with stock RH 6.x kernel
• OVS bridged interface achieved same performance as hardware (10 Gbps)
• Egress rate-limit
• Based on Linux kernel:
  • HTB (Hierarchical Token Bucket)
  • HFSC (Hierarchical Fair-Service Curve)
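The idea behind token-bucket rate limiting can be shown with a minimal single-level bucket. This is a simplified sketch, not the kernel's HTB (which adds a class hierarchy and borrowing between classes); the `TokenBucket` class and its parameters are illustrative assumptions.

```python
class TokenBucket:
    """Single-level token bucket: tokens are bytes, refilled at a fixed rate."""

    def __init__(self, rate_bps, burst_bytes):
        self.rate = rate_bps / 8.0       # refill rate in bytes per second
        self.capacity = burst_bytes      # maximum burst size in bytes
        self.tokens = float(burst_bytes)
        self.last = 0.0                  # time of the last refill, in seconds

    def allow(self, now, packet_bytes):
        """Return True if a packet of this size may be sent at time `now`."""
        elapsed = now - self.last
        self.last = now
        # Refill for the elapsed time, never exceeding the burst capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        if packet_bytes <= self.tokens:
            self.tokens -= packet_bytes
            return True
        return False
```

A shaper built on this would queue packets that are not yet allowed instead of dropping them; the bucket only decides when enough credit has accumulated.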
Traffic Shaping with Open vSwitch (OVS)
WAN tests over NSI
• OVS 2.4 with stock kernel
• NSI circuit Caltech -> UMICH (~60 ms)
• Very stable up to 7.5 Gbps
• Fairly good shaping above 8 Gbps (small instabilities)
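To put the WAN numbers in context, the bandwidth-delay product gives the amount of data in flight on the path. The sketch below assumes the ~60 ms figure is a round-trip time, which the slide does not state; `bdp_bytes` is an illustrative helper.

```python
def bdp_bytes(rate_bps, rtt_ms):
    """Bandwidth-delay product in bytes: rate (bit/s) x RTT (ms) / 8000."""
    return rate_bps * rtt_ms / 8000.0


for gbps in (7.5, 8.0, 10.0):
    megabytes = bdp_bytes(gbps * 1e9, 60) / 1e6
    print(f"{gbps} Gbps x 60 ms -> {megabytes:.2f} MB in flight")
```

Under that assumption, a single TCP stream needs socket buffers of roughly this size to keep the path full at the shaped rate.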
OVS benefits
• Standard OpenFlow (and/or OVSDB) end-host orchestration
• QoS SDN orchestration in non-OpenFlow clusters
• OVS works with stock SL/CentOS/RH 6.x kernel used in HEP; works out-of-the-box on SL7/CC7
• OVS bridged interface achieved the same performance as the hardware (10 Gbps)
• No CPU overhead when OVS does traffic shaping on the physical port
• Traffic shaping (egress) of outgoing flows may help performance in cases where the upstream switch (or ToR) has smaller buffers
https://indico.cern.ch/event/376098/contribution/24/material/slides/1.pdf
ODL controlling OVS via OVSDB and OF
ODL Controller
• OVSDB – Open vSwitch Database Management
• RFC 7047
• Support as SB protocol in major SDN controllers
• Used to create the virtual bridges
• Virtual bridges can use standard OF to speak with the controller
• Normal routing if the controller is down
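OVSDB management messages are JSON-RPC requests, so they are easy to inspect. The sketch below only builds two request bodies defined by RFC 7047 (`list_dbs` and a read-only `transact` with a `select` operation); the `ovsdb_request` helper is a hypothetical name, and the transport to the OVSDB server is not shown.

```python
import json


def ovsdb_request(method, params, request_id):
    """Serialize an OVSDB JSON-RPC request as a JSON string."""
    return json.dumps({"method": method, "params": params, "id": request_id})


# Ask the server which databases it hosts.
list_dbs = ovsdb_request("list_dbs", [], 0)

# Read the names of the existing bridges from the Open_vSwitch database.
list_bridges = ovsdb_request(
    "transact",
    [
        "Open_vSwitch",
        {"op": "select", "table": "Bridge", "where": [], "columns": ["name"]},
    ],
    1,
)
```

Creating a bridge uses the same `transact` method with `insert` and `mutate` operations; a controller such as ODL issues these on the operator's behalf.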
OpenFlow topology discovery with non-OpenFlow islands
ODL Controller
Non OF switch
• Same topology used for QoS tests
• OVS instances on the two hosts controlled by a standard ODL (BE)
• Non-OpenFlow switch in the middle
Topology seen by the controller is missing the link between the OVS instances
OpenFlow topology discovery with non-OpenFlow islands
ODL Controller
Non OF switch
Use the same EtherType 0x88cc (LLDP)
Instead of the bridge-filtered multicast MAC (01:80:C2:00:00:0E), use a normal multicast MAC address (01:23:00:00:00:01)
Unofficially known as OpenFlow Discovery Protocol (OFDP)
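The change described above amounts to building an ordinary LLDP frame with a different destination MAC. The sketch below is a minimal illustration: the EtherType and destination address come from the slide, the TLV layout follows standard LLDP, and the helper names, chassis ID, and port ID values are assumptions for the example.

```python
import struct

LLDP_ETHERTYPE = 0x88CC
OFDP_DST_MAC = bytes.fromhex("012300000001")  # 01:23:00:00:00:01


def tlv(tlv_type, value):
    """Encode one LLDP TLV: 7-bit type, 9-bit length, then the value."""
    return struct.pack("!H", (tlv_type << 9) | len(value)) + value


def build_discovery_frame(src_mac, chassis_id, port_id, ttl=120):
    """Build an Ethernet frame carrying a minimal LLDPDU."""
    payload = (
        tlv(1, b"\x07" + chassis_id)      # Chassis ID, subtype 7 (locally assigned)
        + tlv(2, b"\x07" + port_id)       # Port ID, subtype 7 (locally assigned)
        + tlv(3, struct.pack("!H", ttl))  # Time To Live, in seconds
        + tlv(0, b"")                     # End of LLDPDU
    )
    header = OFDP_DST_MAC + src_mac + struct.pack("!H", LLDP_ETHERTYPE)
    return header + payload
```

Because the destination is no longer in the link-local range that 802.1D bridges filter, a non-OpenFlow switch floods the frame like any other multicast, and the controller sees it arrive at the far OVS instance.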
OpenFlow islands over WAN & NSI circuit
[Diagram labels: ODL @ Caltech; OF island 1; OF island 2; OF island 3 (UMICH); NSI circuit Caltech - UMICH; DTN/FDT server @ UMICH; DTN/FDT server @ Caltech]
A. Mughal, I.Legrand
Possible architecture with a “single” controller
• Simplified topology management using a “single” controller
• The distributed SDN controller is a redundant setup for HA. The controllers share the same view of the network (support for redundancy embedded in ODL and ONOS)
• OVS can be used to identify the flows right at the edges
• Can also be used for QoS right at the edge
• Standard (and widely supported) protocols and software components for control
• Seamless topology discovery even with non-OF devices and/or NSI (L2) circuits in the middle
[Diagram label: NSI Circuit]
100G DTN TESTS
DTN: 100G network tests
Network topology: servers used in tests: Sc-100G-1 and Sc100G-2
• Two identical Haswell servers
• E5-2697 v3
• X10DRi motherboard
• 128GB DDR4 RAM
• Mellanox VPI NICs
• Qlogic NICs
• CentOS 7.1
• Dell Z9100 100GE switch
• 100G CR4 cables from Elpeus for switch connections
• 100G CR4 cables from Mellanox for back to back connections
100G TCP tests
2 TCP streams: stable 94 Gbps
1 TCP stream: stable 68 Gbps
Line rate (100G) with 4+ TCP streams
Client # numactl --physcpubind=20 --localalloc java -jar fdt.jar -c 1.1.1.2 -nettest -P 1 -p 7000
Server # numactl --physcpubind=20 --localalloc java -jar fdt.jar -p 7000
CPU: Single Core - 100% utilization
Pacific Research Platform (DTN Test)
Disk to Disk Transfers (actual files on disk)
UCSD → Caltech = Avg 36 Gbps, peaks of 40 Gbps
UCLA → Caltech = Avg 35 Gbps, peaks of 36 Gbps
UCSC → Caltech = Avg 10 Gbps
Caltech – FDT server = 4x PCIe Gen3 NVMe drives
UCSD – FIONA = Zpool (16x SSD drives, Intel 530, 120G)
UCLA – FIONA = Zpool (16x SSD drives, Intel 530, 120G)
Several issues (mostly disk I/O related):
NVMe drives:
• Zpool performance tuning, poor performance compared to individual drives
• Software RAID, sync hangs, system crashes
Dual DTN Test – Disk performance
Peaks of 5.7 GBytes/s
Reading from disk at both UCSD and UCLA
The disk pool at each of the two sites: Zpool of 16 Intel 530 SSD drives
Receiver side at Caltech: 8 Intel D3700 NVMe drives were used (4 each for UCSD and UCLA)
Traffic peaked at 5.7 GB/sec
At this rate with TCP, CPU cores were fully utilized.
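Since these slides quote disk rates in GB/s and network rates in Gbps, a one-line conversion helps when comparing them. The helper below is an illustrative sketch that assumes decimal prefixes and 8 bits per byte.

```python
def gbytes_to_gbits(gbytes_per_s):
    """Convert a rate in GB/s to Gbps (decimal prefixes, 8 bits per byte)."""
    return gbytes_per_s * 8


print(f"5.7 GB/s = {gbytes_to_gbits(5.7):.1f} Gbps")
```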
Several issues (mostly disk I/O related):
NVMe drives:
• Zpool performance tuning, poor performance compared to individual drives
• Software RAID, sync hangs, occasional system hangs
THANK YOU!
QUESTIONS?
EXTRA SLIDES
SDN Multipath OpenDaylight Demonstrations at Supercomputing 2014
• 100 Gbit links between Brocade and Extreme switches at Caltech, iCAIR and Vanderbilt booths
• 40 Gbit links from many booth hosts to switches
• Single ODL/Multipath Controller operating in “reactive” mode
  • For matching packets: Controller writes flow rules into switches, with a variety of path selection strategies
  • Unmatched packets “punted” to Controller by switch
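Reactive mode can be pictured as a table miss followed by a rule installation. The sketch below is a toy model under assumptions: `Switch` and `ReactiveController` are hypothetical classes, a match is reduced to a (source, destination) pair, and round-robin stands in for the path selection strategies used in the demonstration.

```python
class Switch:
    """Toy OpenFlow switch with an exact-match flow table."""

    def __init__(self, controller):
        self.flow_table = {}  # (src, dst) -> output port
        self.controller = controller

    def handle_packet(self, src, dst):
        match = (src, dst)
        if match not in self.flow_table:
            # Table miss: punt to the controller, which writes a rule back.
            self.controller.packet_in(self, match)
        return self.flow_table[match]


class ReactiveController:
    """Toy controller that installs one rule per punted packet."""

    def __init__(self, paths):
        self.paths = paths    # candidate output ports (hypothetical)
        self.next_path = 0
        self.packet_ins = 0

    def packet_in(self, switch, match):
        # Round-robin is a placeholder for a real path selection strategy.
        self.packet_ins += 1
        port = self.paths[self.next_path % len(self.paths)]
        self.next_path += 1
        switch.flow_table[match] = port
```

Only the first packet of each flow reaches the controller; later packets hit the installed rule and are forwarded by the switch alone.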
Demonstrated:
• Successful, high speed, flow path calculation, selection and writing
• OF switch support from vendors
• Resilience against changing net topologies [at layer 1 or 2]
• Monitoring and control
[Figure: SC14 demo: Caltech/iCAIR/Vanderbilt OF links]
Focused Technical Workshop Demo 2015: SDN-Driven Multipath Circuits
- Hardened OESS and OSCARS installations at Caltech, Umich, AmLight
- Updated Dell switch firmware to operate stably with OpenFlow
- Dynamic circuit paths under SDN control
- Prelude to the ANSE architecture: SDN load-balanced, moderated flows
A. Mughal, J. Bezerra
Caltech, Michigan, FIU, ANSP and Rio, with network partners: Internet2, CENIC, Merit, FLR, AmLight, ANSP and Rio in Brazil