![Page 1: Passive realtimedatacenter fault detection and localization · Load balanced traffic simplifies fault handling •Evenly loaded paths means per path performance is similar if no errors.](https://reader033.fdocuments.us/reader033/viewer/2022050112/5f49d75ebe3dca43bf345af8/html5/thumbnails/1.jpg)
Passive realtime datacenter fault detection and localization
Arjun Roy, James Hongyi Zeng*, Jasmeet Bagga*, and Alex C. Snoeren
University of California, San Diego
Facebook*
1
![Page 2: Passive realtimedatacenter fault detection and localization · Load balanced traffic simplifies fault handling •Evenly loaded paths means per path performance is similar if no errors.](https://reader033.fdocuments.us/reader033/viewer/2022050112/5f49d75ebe3dca43bf345af8/html5/thumbnails/2.jpg)
“It would be nice if we could figure out which link was causing these retransmits.”
- Ranjeeth Dasineni, Facebook (paraphrased)
2
![Page 3: Passive realtimedatacenter fault detection and localization · Load balanced traffic simplifies fault handling •Evenly loaded paths means per path performance is similar if no errors.](https://reader033.fdocuments.us/reader033/viewer/2022050112/5f49d75ebe3dca43bf345af8/html5/thumbnails/3.jpg)
Contemporary datacenter network
However: faults may be partial/intermittent.
3
![Page 4: Passive realtimedatacenter fault detection and localization · Load balanced traffic simplifies fault handling •Evenly loaded paths means per path performance is similar if no errors.](https://reader033.fdocuments.us/reader033/viewer/2022050112/5f49d75ebe3dca43bf345af8/html5/thumbnails/4.jpg)
Partial faults: A few examples
• NetPilot (SIGCOMM 2012): Frame check error, unequal ECMP hashing, etc.
  Wu, Xin, et al. "Netpilot: automating datacenter network failure mitigation." ACM SIGCOMM Computer Communication Review 42.4 (2012): 419-430.
• Everflow (SIGCOMM 2015): TCAM bit errors, silent packet drops.
  Zhu, Yibo, et al. "Packet-Level Telemetry in Large Datacenter Networks." SIGCOMM, 2015.
• Pingmesh (SIGCOMM 2015): “fiber FCS … errors, switching ASIC defects, switch fabric flaw, switch software bug, NIC configuration issue, network congestions, etc. We have seen all these types of issues in our production networks.”
  Guo, Chuanxiong, et al. "Pingmesh: A Large-Scale System for Data Center Network Latency Measurement and Analysis." SIGCOMM, 2015.
4
![Page 5: Passive realtimedatacenter fault detection and localization · Load balanced traffic simplifies fault handling •Evenly loaded paths means per path performance is similar if no errors.](https://reader033.fdocuments.us/reader033/viewer/2022050112/5f49d75ebe3dca43bf345af8/html5/thumbnails/5.jpg)
Vast body of prior work (just a small sample…)
• Application instrumentation: various production systems
• Active probing: Pingmesh (SIGCOMM ’15), NetNORAD (Facebook), ATPG (CoNEXT ’12), Everflow (SIGCOMM ’15)
• Machine learning: NetPoirot (SIGCOMM ’16)
• Graph algorithms: Gestalt (USENIX ATC ’14), SCORE (NSDI ’05)
• Path tracing: Everflow (SIGCOMM ’15), NetNORAD (Facebook), NetSight (NSDI ’14), Tiny Packet Programs (SIGCOMM ’14)
• Network instrumentation: FlowRadar (NSDI ’16), Planck (SIGCOMM ’14), NetPilot (SIGCOMM ’12)
5
![Page 6: Passive realtimedatacenter fault detection and localization · Load balanced traffic simplifies fault handling •Evenly loaded paths means per path performance is similar if no errors.](https://reader033.fdocuments.us/reader033/viewer/2022050112/5f49d75ebe3dca43bf345af8/html5/thumbnails/6.jpg)
We exploit: highly regular load balanced traffic
Source rack traffic magnitude
Destination rack traffic magnitude
6
Arjun Roy, Hongyi Zeng, Jasmeet Bagga, George Porter, and Alex C. Snoeren. Inside the Social Network's (Datacenter) Network. ACM SIGCOMM '15, London, England.
![Page 7: Passive realtimedatacenter fault detection and localization · Load balanced traffic simplifies fault handling •Evenly loaded paths means per path performance is similar if no errors.](https://reader033.fdocuments.us/reader033/viewer/2022050112/5f49d75ebe3dca43bf345af8/html5/thumbnails/7.jpg)
Load balanced traffic simplifies fault handling
• Evenly loaded paths mean per-path performance is similar if there are no errors.
• Network faults lead to outlier paths.
• If a flow's network path is known, we can correlate flow performance with path.
• Approach allows us to find and localize faults:
  • In an application-agnostic manner
  • Incurring no additional probing overhead
  • More rapidly than prior published work
7
![Page 8: Passive realtimedatacenter fault detection and localization · Load balanced traffic simplifies fault handling •Evenly loaded paths means per path performance is similar if no errors.](https://reader033.fdocuments.us/reader033/viewer/2022050112/5f49d75ebe3dca43bf345af8/html5/thumbnails/8.jpg)
Facebook datacenter topology
8
Alexey Andreyev. Introducing data center fabric, the next-generation Facebook data center network. https://code.facebook.com/posts/360346274145943/introducing-data-center-fabric-the-next-generation-facebook-data-center-network/
![Page 9: Passive realtimedatacenter fault detection and localization · Load balanced traffic simplifies fault handling •Evenly loaded paths means per path performance is similar if no errors.](https://reader033.fdocuments.us/reader033/viewer/2022050112/5f49d75ebe3dca43bf345af8/html5/thumbnails/9.jpg)
Finding path information at Facebook
[Figure: fabric topology from source host to destination host through ToR, Agg, and Core switches]
9
![Page 15: Passive realtimedatacenter fault detection and localization · Load balanced traffic simplifies fault handling •Evenly loaded paths means per path performance is similar if no errors.](https://reader033.fdocuments.us/reader033/viewer/2022050112/5f49d75ebe3dca43bf345af8/html5/thumbnails/15.jpg)
Finding path information at Facebook
[Figure: path from source host to destination host through ToR, Agg, and Core switches]
15
![Page 17: Passive realtimedatacenter fault detection and localization · Load balanced traffic simplifies fault handling •Evenly loaded paths means per path performance is similar if no errors.](https://reader033.fdocuments.us/reader033/viewer/2022050112/5f49d75ebe3dca43bf345af8/html5/thumbnails/17.jpg)
Finding path information at Facebook
[Figure: path from source host to destination host through ToR, Agg, and Core switches]
Solution: aggregation switch marks packets based on core downlink traversed.
17
![Page 18: Passive realtimedatacenter fault detection and localization · Load balanced traffic simplifies fault handling •Evenly loaded paths means per path performance is similar if no errors.](https://reader033.fdocuments.us/reader033/viewer/2022050112/5f49d75ebe3dca43bf345af8/html5/thumbnails/18.jpg)
How do we use path information?
• In principle: can compare flow performance by path.
  1. Combinatorial disaster: O(10,000) paths from single host to remote racks.
  2. No localization: doesn’t tell us which link/switch is at fault.
• But: for this traffic pattern, ECMP routing gives us even bytes/link.
• Solution: Just compare links!
Create “Equivalence Sets”: sets of links handling similar load and exhibiting similar performance, in the absence of faults.
18
Equivalence sets:
1. Reduce the number of comparisons needed.
2. Pinpoint the fault to a specific location.
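The grouping idea can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' implementation: the link names, the tier labels, and the rule "links fanning out from the same switch at the same tier form one set" are assumptions made for the example.

```python
from collections import defaultdict


def build_equivalence_sets(links):
    """Group link names by (tier, switch); each group is one equivalence set."""
    sets = defaultdict(list)
    for name, tier, switch in links:
        sets[(tier, switch)].append(name)
    return dict(sets)


def pairwise_comparisons(n):
    """Number of unordered pairs among n items."""
    return n * (n - 1) // 2


# Hypothetical links: (link name, tier, switch the links fan out from).
LINKS = [
    ("tor1->agg1", "tor-agg", "tor1"),
    ("tor1->agg2", "tor-agg", "tor1"),
    ("tor1->agg3", "tor-agg", "tor1"),
    ("tor1->agg4", "tor-agg", "tor1"),
    ("agg1->core1", "agg-core", "agg1"),
    ("agg1->core2", "agg-core", "agg1"),
]

if __name__ == "__main__":
    for key, members in build_equivalence_sets(LINKS).items():
        print(key, members, pairwise_comparisons(len(members)))
```

Comparing within a small set keeps the number of comparisons low, and an outlier within a set names a specific link rather than a whole path.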
![Page 19: Passive realtimedatacenter fault detection and localization · Load balanced traffic simplifies fault handling •Evenly loaded paths means per path performance is similar if no errors.](https://reader033.fdocuments.us/reader033/viewer/2022050112/5f49d75ebe3dca43bf345af8/html5/thumbnails/19.jpg)
Equivalence sets in Facebook topology
[Figure: source host, ToR, pod Agg switches, and Core switches]
Equivalence set: 4 uplinks from each ToR to pod Agg layer
…each has close to identical performance distribution in absence of errors
19
![Page 20: Passive realtimedatacenter fault detection and localization · Load balanced traffic simplifies fault handling •Evenly loaded paths means per path performance is similar if no errors.](https://reader033.fdocuments.us/reader033/viewer/2022050112/5f49d75ebe3dca43bf345af8/html5/thumbnails/20.jpg)
Equivalence sets in Facebook topology
[Figure: source host, ToR, pod Agg switches, and Core switches]
Equivalence set: N uplinks from pod Agg layer to core layer
…each has close to identical performance distribution in absence of errors
20
![Page 21: Passive realtimedatacenter fault detection and localization · Load balanced traffic simplifies fault handling •Evenly loaded paths means per path performance is similar if no errors.](https://reader033.fdocuments.us/reader033/viewer/2022050112/5f49d75ebe3dca43bf345af8/html5/thumbnails/21.jpg)
Outlier analysis with application-agnostic metrics
Hosts already track metrics for congestion control or performance monitoring:
• TCP congestion window: Affected by packet loss.
• TCP retransmits: Affected by packet loss.
• Smoothed round-trip time: Affected by latency spikes.
• System call latency: Affected by packet loss.
Caveat: Can be difficult to determine if an effect is due to a faulty link, overloaded hosts, application variance, etc.
With equivalence-set-based grouping, we can compare distributions by link.
Only link faults cause variance between links.
21
![Page 22: Passive realtimedatacenter fault detection and localization · Load balanced traffic simplifies fault handling •Evenly loaded paths means per path performance is similar if no errors.](https://reader033.fdocuments.us/reader033/viewer/2022050112/5f49d75ebe3dca43bf345af8/html5/thumbnails/22.jpg)
Demonstrating equivalence sets from Agg to ToR
[Figure: host attached to a ToR with four inbound links from Agg 1, Agg 2, Agg 3, and Agg 4]
(1) ToR marks packet DSCP per inbound link
(2) Host aggregates TCP metrics by link
(3a) We simulate error on this link:
(3b) Host drops 0.5% of packets traversing link
22
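Step (2) can be sketched as follows. This is a minimal sketch under stated assumptions: the DSCP-to-link table and the (DSCP, retransmit count) record format are hypothetical, and the slides do not specify how the host actually reads the DSCP mark or the TCP counters.

```python
from collections import defaultdict

# Hypothetical mapping from the DSCP value set by the ToR to the inbound Agg link.
DSCP_TO_LINK = {1: "agg1", 2: "agg2", 3: "agg3", 4: "agg4"}


def aggregate_by_link(flow_records):
    """Bucket per-flow retransmit counts by the link named in the DSCP mark.

    flow_records is an iterable of (dscp, retransmits) pairs, one per flow.
    Flows carrying an unknown mark are skipped.
    """
    per_link = defaultdict(list)
    for dscp, retransmits in flow_records:
        link = DSCP_TO_LINK.get(dscp)
        if link is not None:
            per_link[link].append(retransmits)
    return dict(per_link)


if __name__ == "__main__":
    records = [(1, 0), (2, 3), (1, 1), (9, 5)]
    print(aggregate_by_link(records))
```

The result is one sample of per-flow retransmit counts per link, which is the shape of input an outlier test across the equivalence set needs.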
![Page 23: Passive realtimedatacenter fault detection and localization · Load balanced traffic simplifies fault handling •Evenly loaded paths means per path performance is similar if no errors.](https://reader033.fdocuments.us/reader033/viewer/2022050112/5f49d75ebe3dca43bf345af8/html5/thumbnails/23.jpg)
TCP congestion window in Agg to ToR equivalence set
Cache server
23
![Page 24: Passive realtimedatacenter fault detection and localization · Load balanced traffic simplifies fault handling •Evenly loaded paths means per path performance is similar if no errors.](https://reader033.fdocuments.us/reader033/viewer/2022050112/5f49d75ebe3dca43bf345af8/html5/thumbnails/24.jpg)
Congestion window signal is application agnostic
Cache server / Web server
24
![Page 25: Passive realtimedatacenter fault detection and localization · Load balanced traffic simplifies fault handling •Evenly loaded paths means per path performance is similar if no errors.](https://reader033.fdocuments.us/reader033/viewer/2022050112/5f49d75ebe3dca43bf345af8/html5/thumbnails/25.jpg)
We use: TCP retransmits in our work
Cache server / Web server
25
![Page 26: Passive realtimedatacenter fault detection and localization · Load balanced traffic simplifies fault handling •Evenly loaded paths means per path performance is similar if no errors.](https://reader033.fdocuments.us/reader033/viewer/2022050112/5f49d75ebe3dca43bf345af8/html5/thumbnails/26.jpg)
Detecting faults in production
• Monitored traffic through pod aggregation switch.
  1. No faults injected.
  2. Collected TCP metric data on 30 web server hosts.
  3. Equivalence set: four linecards connecting to core layer (each linecard has equal share of uplinks).
• On January 25th, a single linecard had a software fault.
  1. Linecard controller software hung.
  2. BGP routes timed out, production traffic through linecard routed away.
  3. A few minutes later, NetNORAD flagged unresponsive linecard.
26
![Page 27: Passive realtimedatacenter fault detection and localization · Load balanced traffic simplifies fault handling •Evenly loaded paths means per path performance is similar if no errors.](https://reader033.fdocuments.us/reader033/viewer/2022050112/5f49d75ebe3dca43bf345af8/html5/thumbnails/27.jpg)
Fault visible to our approach in 30 seconds
27
![Page 28: Passive realtimedatacenter fault detection and localization · Load balanced traffic simplifies fault handling •Evenly loaded paths means per path performance is similar if no errors.](https://reader033.fdocuments.us/reader033/viewer/2022050112/5f49d75ebe3dca43bf345af8/html5/thumbnails/28.jpg)
Classifying faulty links
• “Does this link have more retransmits per flow than the other links?”
• “Do two distributions have the same mean, or is one greater?”
28
Classifier: compare each link to other links with one sample Student’s T-Test.
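A host-side classifier along these lines might look like the sketch below. It is an assumption-laden illustration: it takes the mean over all other links' flows as the reference value for the one-sample test, and it flags a link when the t-statistic exceeds a hypothetical fixed threshold (`t_critical`) instead of computing an exact p-value.

```python
import math
import statistics


def one_sample_t(sample, mu0):
    """t = (mean - mu0) / (s / sqrt(n)) for a one-sample Student's t-test."""
    n = len(sample)
    mean = statistics.fmean(sample)
    sd = statistics.stdev(sample)
    if sd == 0:
        # Degenerate sample: no spread, so only the sign of the gap matters.
        return 0.0 if mean == mu0 else math.copysign(math.inf, mean - mu0)
    return (mean - mu0) / (sd / math.sqrt(n))


def flag_links(per_link, t_critical=2.0):
    """Return links whose per-flow retransmits sit well above the other links.

    t_critical is a hypothetical one-sided threshold, not a value from the talk.
    """
    flagged = []
    for link, sample in per_link.items():
        others = [x for name, s in per_link.items() if name != link for x in s]
        mu0 = statistics.fmean(others)
        if one_sample_t(sample, mu0) > t_critical:
            flagged.append(link)
    return flagged


if __name__ == "__main__":
    per_link = {
        "agg1": [0, 1, 0, 0, 1, 0, 0, 0],
        "agg2": [0, 0, 1, 0, 0, 1, 0, 0],
        "agg3": [0, 0, 0, 1, 0, 0, 1, 0],
        "agg4": [2, 3, 1, 4, 2, 3, 2, 3],  # link with injected loss
    }
    print(flag_links(per_link))
```

Only a positive t-statistic is treated as suspicious here, since the question being asked is whether a link has more retransmits than its peers.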
![Page 29: Passive realtimedatacenter fault detection and localization · Load balanced traffic simplifies fault handling •Evenly loaded paths means per path performance is similar if no errors.](https://reader033.fdocuments.us/reader033/viewer/2022050112/5f49d75ebe3dca43bf345af8/html5/thumbnails/29.jpg)
Online fault monitoring with T-Test alone
• In principle: can set up a system that uses end host T-Test result to tell us which network links are faulty.
• However: by itself this is susceptible to false positives.
• Can’t afford false positives in network with O(10,000) links!
29
![Page 30: Passive realtimedatacenter fault detection and localization · Load balanced traffic simplifies fault handling •Evenly loaded paths means per path performance is similar if no errors.](https://reader033.fdocuments.us/reader033/viewer/2022050112/5f49d75ebe3dca43bf345af8/html5/thumbnails/30.jpg)
Accounting for false positives
• However, two characteristics aid us:
  1. Per-host false positives evenly distributed per link over time.
  2. Datacenter has a plethora of hosts for which this is true.
• Thus, we’re not trying to see if a given link is marked faulty by hosts.
• Instead, we once again perform outlier analysis.
  1. “Are all the links being marked faulty by hosts at similar rates?”
  2. “Are hosts flagging a particular subset of links as faulty at higher rates?”
30
Chi-squared test: determines if any links are outliers.
P-Value ≈ 1: “Yes, all the links are being marked faulty by hosts at similar rates.”
P-Value ≈ 0: “No, a subset has a comparatively high percentage of hosts claiming fault.”
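The second-stage check can be illustrated with a goodness-of-fit test against a uniform expectation. This is a sketch under assumptions, not the system from the talk: the input is a hypothetical count of hosts flagging each link, and the p-value is computed with a simple series for the regularized incomplete gamma function so that the example needs only the standard library.

```python
import math


def chi2_sf(stat, df):
    """P(X >= stat) for a chi-squared variable with df degrees of freedom.

    Uses the power series for the regularized lower incomplete gamma function;
    adequate for a sketch, not tuned for extreme arguments.
    """
    if stat <= 0:
        return 1.0
    a, x = df / 2.0, stat / 2.0
    term = 1.0 / a
    total = term
    n = 1
    while term > total * 1e-15:
        term *= x / (a + n)
        total += term
        n += 1
    lower = total * math.exp(-x + a * math.log(x) - math.lgamma(a))
    return max(0.0, 1.0 - lower)


def uniformity_p_value(fault_counts):
    """P-value for 'every link is flagged by hosts at the same rate'."""
    counts = list(fault_counts.values())
    expected = sum(counts) / len(counts)
    stat = sum((c - expected) ** 2 / expected for c in counts)
    return chi2_sf(stat, len(counts) - 1)


if __name__ == "__main__":
    # Hypothetical number of hosts flagging each link in one time window.
    print(uniformity_p_value({"l1": 5, "l2": 6, "l3": 4, "l4": 5}))
    print(uniformity_p_value({"l1": 5, "l2": 6, "l3": 4, "l4": 45}))
```

With evenly spread false positives the p-value stays near 1; when many hosts agree on one link it collapses toward 0, matching the two cases on the slide.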
![Page 31: Passive realtimedatacenter fault detection and localization · Load balanced traffic simplifies fault handling •Evenly loaded paths means per path performance is similar if no errors.](https://reader033.fdocuments.us/reader033/viewer/2022050112/5f49d75ebe3dca43bf345af8/html5/thumbnails/31.jpg)
Evaluation in the datacenter
• Small detection surface; did not detect any ‘organic’ partial faults.
• Approach: inject ‘simulated’ faults to evaluate approach.
• Induced a variety of fault scenarios to challenge our system.
31
![Page 32: Passive realtimedatacenter fault detection and localization · Load balanced traffic simplifies fault handling •Evenly loaded paths means per path performance is similar if no errors.](https://reader033.fdocuments.us/reader033/viewer/2022050112/5f49d75ebe3dca43bf345af8/html5/thumbnails/32.jpg)
Evaluation in the datacenter: fault scenarios
• Minuscule faults: faults can have very low drop rates.
• Concurrent faults: multiple faults can occur simultaneously.
• Masked faults: larger fault can mask effect of minuscule fault.
• Correlated faults: hardware fault can impact multiple nearby links, confounding outlier analysis.
32
![Page 34: Passive realtimedatacenter fault detection and localization · Load balanced traffic simplifies fault handling •Evenly loaded paths means per path performance is similar if no errors.](https://reader033.fdocuments.us/reader033/viewer/2022050112/5f49d75ebe3dca43bf345af8/html5/thumbnails/34.jpg)
Finding minuscule faults: experiment setup
[Figure: hosts, ToR, Agg, and Core switches; one Agg switch with uplinks to Core 1, Core 2, Core 3, …, Core N]
34
![Page 36: Passive realtimedatacenter fault detection and localization · Load balanced traffic simplifies fault handling •Evenly loaded paths means per path performance is similar if no errors.](https://reader033.fdocuments.us/reader033/viewer/2022050112/5f49d75ebe3dca43bf345af8/html5/thumbnails/36.jpg)
Finding minuscule faults: experiment setup
[Figure: hosts, ToR, Agg, and Core switches; one Agg switch with uplinks to Core 1, Core 2, Core 3, …, Core N]
Equivalence set: N uplinks from pod Agg layer to core layer
36
![Page 37: Passive realtimedatacenter fault detection and localization · Load balanced traffic simplifies fault handling •Evenly loaded paths means per path performance is similar if no errors.](https://reader033.fdocuments.us/reader033/viewer/2022050112/5f49d75ebe3dca43bf345af8/html5/thumbnails/37.jpg)
Finding minuscule faults: experiment setup
[Figure: hosts, ToR, Agg, and Core switches; one Agg switch with uplinks to Core 1, Core 2, Core 3, …, Core N]
Partial fault induced on single Core to Agg downlink.
37
![Page 38: Passive realtimedatacenter fault detection and localization · Load balanced traffic simplifies fault handling •Evenly loaded paths means per path performance is similar if no errors.](https://reader033.fdocuments.us/reader033/viewer/2022050112/5f49d75ebe3dca43bf345af8/html5/thumbnails/38.jpg)
Fault detection rate vs drop rate
38
![Page 39: Passive realtimedatacenter fault detection and localization · Load balanced traffic simplifies fault handling •Evenly loaded paths means per path performance is similar if no errors.](https://reader033.fdocuments.us/reader033/viewer/2022050112/5f49d75ebe3dca43bf345af8/html5/thumbnails/39.jpg)
Minuscule faults: choosing between detection speed and sensitivity
39
![Page 45: Passive realtimedatacenter fault detection and localization · Load balanced traffic simplifies fault handling •Evenly loaded paths means per path performance is similar if no errors.](https://reader033.fdocuments.us/reader033/viewer/2022050112/5f49d75ebe3dca43bf345af8/html5/thumbnails/45.jpg)
“It would be nice if we could figure out which link was causing these retransmits.”
- Ranjeeth Dasineni, Facebook (paraphrased)
45
![Page 46: Passive realtimedatacenter fault detection and localization · Load balanced traffic simplifies fault handling •Evenly loaded paths means per path performance is similar if no errors.](https://reader033.fdocuments.us/reader033/viewer/2022050112/5f49d75ebe3dca43bf345af8/html5/thumbnails/46.jpg)
46
![Page 47: Passive realtimedatacenter fault detection and localization · Load balanced traffic simplifies fault handling •Evenly loaded paths means per path performance is similar if no errors.](https://reader033.fdocuments.us/reader033/viewer/2022050112/5f49d75ebe3dca43bf345af8/html5/thumbnails/47.jpg)
Interpreting the T-Test
1. T-Statistic: “Does this link have more or fewer retransmits than average?”
  • Positive T-statistic means larger than average.
  • Negative T-statistic means smaller than average.
2. P-Value: “Is the difference in mean big enough to concern us?”
  • Close to 0 means this link could be an outlier.
  • Close to 1 means we are not concerned.
47
![Page 48: Passive realtimedatacenter fault detection and localization · Load balanced traffic simplifies fault handling •Evenly loaded paths means per path performance is similar if no errors.](https://reader033.fdocuments.us/reader033/viewer/2022050112/5f49d75ebe3dca43bf345af8/html5/thumbnails/48.jpg)
Interpreting the T-Test
P-value ≈ 0, t-stat > 0
P-value ≈ 1, t-stat ≈ 0
48