MED: The Monitor-Emulator-Debugger for Software-Defined ...weixu/Krvdro9c/...MED functions MED:a...

Post on 06-Aug-2021

1 views 0 download

Transcript of MED: The Monitor-Emulator-Debugger for Software-Defined ...weixu/Krvdro9c/...MED functions MED:a...

MED:TheMonitor-Emulator-DebuggerforSoftware-DefinedNetworks

Quanquan Zhi andWeiXuInstitute for Interdisciplinary Information Sciences

Tsinghua University

Software-Defined Networks (SDN):promises and challenges

• SDN will simplify future network design and operation

• Bugs are common─ Controller─ Switch software─ Race conditions

• Network Ops -> SystemsDevOps─ Command line -> programs─ Lacking of tools─ Fast, repeatable

Monitor-Emulator-Debugger:A debug / testing tool forSDNDevOps

• A software Debugger─ fast, repeatable, automated tools

─ addresses concurrency bugs

• Tightly coupled with physical network

- Automatic physical network sync

MED architecture overview

Monitor Emulator Debugger

App

Controlmessages

App App

Controller

RealSDN

MED Agent (Monitor) MED(Emulator)

VirtualSDN

OVS

OVS

OVS

Datapackets

Packet Tracer

Loop and Reachability Checker

Table Checker

Race Conditions Detector

Debugger Controller

Debugger

• Snapshot (initialization)─ Physical network topology (LLDP)─ Initial forwarding table states

• Capture SDN state changes over time─ Openflowmessages to/from the SDN controller─ E.g. packets-in,packets-out,ruleinstallation/removal,andportsup/downevents

• Sample data packets─ Essential for replay/testing

The monitor

The emulator: key ideas• Thekeychallenge

─ Emulating a blackbox controller from physical SDN

• Solution─ Replay all Openflowmessages captured=>settoatime

• Question:In what order?App

Controlmessages

App App

Controller

Statemessages

RealSDN

Emulator Controller

VirtualSDN

OVS

OVS

OVS

Replayedmessages

DebuggerController

App App

Injectmessages

The emulator: operation• Online Operation- Trackingmode

• Offline Operation─ “Time Travel”

Initial setup

Set_to_current Trackingstate

Set_to_stable Specified state

Set_to_nondeterministic(t)State1 State2 StateN

Replay

Online

Offline

The emulator: offline operations• Set to a stable state at any time

• Emulateallpossibleorderingforconcurrentevents

Initial setup

Set_to_current Trackingstate

Set_to_stable Specified state

Set_to_nondeterministic(t)State1 State2 StateN

Replay

Online

Offline

The debugger

• A controller that injects messages into the replayedmessage stream

• “Apps” built on top of the emulator─ Settoaspecifictime─ Anexternalcontrollerinterface

• Example debugger apps─ Packettracer

─ Loopandreachabilitychecker

─ Forwarding tablechecker

─ Raceconditionsdetector

Emulator Controller

Replayedmessages

VirtualSDN

OVS

OVS

OVS

Exampledebuggerapp 1:PacketTracer(PT)

DebuggerController

PT

TO_CONTROLLERReplay:Packet_Out

Packet_InFlow_Status_Request

Flow_status_reply

PacketmatchesNormalEntryPacketmatchesTO_CONTROLLER

Outputs:1. A packet’s entire path through the network2. Whichforwardingruleisusedoneach hop

Exampledebuggerapp 2:LoopandReachabilityChecker(LRC)

DebuggerController

PT

LRC

Asserts:

• The packet forwarding has no loop

-- AND --

• The packet reaches the destination

• Worksonlineoroffline

Exampledebuggerapp 3:RaceConditionDetector(RCD)

Asserts:

• In ANY possible concurrent state, there is no loopor blackhole

Initial setup

Set_to_nondeterministic(t)State1 State2 StateN…

Offline

• Expensive?Cantriviallyruninparallelwithmultipleemulators

DebuggerController

PT

LRC

RCD

Exampledebuggerapp 4:TableChecker(TC)

Asserts:• The forwarding tables on physical switches are thesame as those in the emulator

Forwardingrules

Flow table

OpenFlow Switch

SDN

Forwardingrules

Flow table

OVS

EmulatorTable

Checker

Installrules

DebuggerController

PT

LRC

RCD

TC

Evaluation

• Performance- Emulatorinitialization

- PacketTracing (PT)performance

• Case studies- Bugsonphysicalswitchsoftware

- Racecondition analysis

Experimentsetup

• 20switchesnetwork,typicalDCNtopology─ Pica8P-3298

─ 30,000OpenFlow total(~1,500rulesperswitch)

Initial setup performance

Discoverphysicaltopo+setupemulator

topo

Dumpallflowtablesfromswitches

InstallallflowtablesentriestoEmulator

(30Krules)

4.9sec 0.54sec 12.2sec

State changed during the setup? Redo until done.

PacketTracing(PT) performance

• Random routing

• Performanceoftracingpathswithdifferentlengths

#hops 2 4 6 8 10

% oftestdata 10.6% 13.2% 57.9% 16.2% 2.1%

Timetaken(ms) 0.626 1.536 2.828 3.532 5.001

Realworldbuginswitch software

Pica8switchflowtable:

MEDOVSflowtable:

BuginPicOS-OVS2.3

“AGREportisinjectingARPrequestpacketsbacktothesameport.TheexpectedresultsistoforwardallpacketsexcepttheGREport.”

http://www.pica8.com/document/v2.3/html/release-notes-for-picos-2.3

Non-deterministicstatesinthenetworkduetoconcurrentmessages

Controller

• Whichswitchprocessedthemessagefirst?─ Sometimeswedonotknow

─ Canbeok,butcanmeanproblems

Raceconditionexample

r:in_port=1->Port2

r:in_port=1->Port3r:in_port=3->Port1

Should we enforce the ordering?

Are we enforcing them correctly?

[1] XinJin,Hongqiang HarryLiu,RohanGandhi,Srikanth Kandula,Ratul Mahajan,MingZhang, JenniferRexford,RogerWattenhofer, DynamicSchedulingofNetworkUpdates, SIGCOMM, 2014

AB

C

Racecondition detectorexample(cont’d)

Conclusion

• A step bring in the software testing / debugging tools toSDN• Fast, reproducible

• Single step tracingwith packets

• Debugging concurrencyproblems

• Emulates physical network

• Evaluation on an SDN with 20-switches

Wei Xu <weixu@tsinghua.edu.cn>

Backup slides

MEDfunctions

MED: ausefultooltodebugproblemsinSDN

• Createanemulatorthatcanbesettothenetworkstateatanygivenpointoftime

• Tracetheforwardingpathsandtheflowtableentriesusedalongthepath, foreachindividualdatapackets

• CaptureandfindthecauseofcommonSDNproblems:Loop,Reachability failure and RaceConditions

Performance:insertingrules