Post on 23-Jan-2018
Tech Talk: Give Me the Bad News Straight: Why Models Are a Broken Approach to Alerting
David B. Martin
DevOps: Agile Ops
CA Technologies APM Product Manager
DO5T41T
#CAWorld
© 2015 CA. ALL RIGHTS RESERVED. @CAWORLD #CAWORLD
Give Me the Bad News Straight: Why Models Are a Broken Approach to Alerting
The industry standard approach to automatic alerts is to create models from baselining application latencies. But when something goes wrong, is it because something is really broken or because the model was incorrect? Training the model to avoid mistakes is complex and time-intensive. CA Application Performance Management (CA APM) 10 replaces the whole approach with a brand new one: react to changes in application stability as they occur. Outliers are automatically ignored, while tremors in latency register progressively bigger values for the intensity of an event, a little like the Richter scale for earthquakes. Join the discussion and learn how CA APM transforms automatic alerting.
David B. Martin, CA Technologies Product Manager
For Informational Purposes Only: Terms of this Presentation
© 2015 CA. All rights reserved. All trademarks referenced herein belong to their respective companies.
The content provided in this CA World 2015 presentation is intended for informational purposes only and does not form any type of warranty. The information provided by a CA partner and/or CA customer has not been reviewed for accuracy by CA.
Agenda
WHY MODELS ARE FAILING
A BRIEF HISTORY OF APM ALERTING
CA TECHNOLOGIES DIFFERENTIAL ANALYSIS
MODELS ARE MADE TO BE BROKEN
DATA-DRIVEN DIVE INTO AUTOMATIC ALERTING MODELS
SHEWHART SAVES THE DAY
Keeping my promise!
§ I will begin this session by making a detailed, data-centric case for why CA Technologies' new differential analysis feature is a superior, market-leading approach to automatic alerting.
§ No, I will not then pull a rabbit out of a hat. 'Cuz this ain't magic, people… even if it looks like magic.
§ “Any sufficiently advanced technology is indistinguishable from magic.” (A. C. Clarke)
What was CA's last answer?
§ In the early 90s, Wily implemented Holt's Linear Exponential Smoothing (HLES) to calculate baselines for metrics.
§ Baselines were fooled by regular production events; many were more about regular patterns in load than about maintenance events. Seasonality debuted to address this.
§ This led to rules, and rules engines, to handle the edge cases that seasonality does not address (e.g. “+3 std dev from baseline” to deaden the sensitivity of triggers).
And what are our competitors doing?
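For readers who have not met HLES before, here is a minimal sketch of a Holt-style baseline combined with a "+3 std dev from baseline" trigger. It is an illustration only, not the Wily or CA implementation: the function names, the smoothing constants `alpha` and `beta`, and the way the level and trend are initialized are all assumptions made for this example.

```python
import math

def holt_baseline(values, alpha=0.3, beta=0.1):
    """One-step-ahead forecasts from Holt's linear exponential smoothing.

    alpha and beta are illustrative smoothing constants, not product defaults.
    """
    if len(values) < 2:
        raise ValueError("need at least two observations")
    level = values[0]
    trend = values[1] - values[0]
    forecasts = [values[0]]  # no forecast exists for the first point; reuse it
    for x in values[1:]:
        forecasts.append(level + trend)  # forecast made before seeing x
        new_level = alpha * x + (1 - alpha) * (level + trend)
        trend = beta * (new_level - level) + (1 - beta) * trend
        level = new_level
    return forecasts

def breaches(values, forecasts, k=3.0):
    """Indexes where a value sits more than k std devs above its forecast."""
    residuals = [v - f for v, f in zip(values, forecasts)]
    mean = sum(residuals) / len(residuals)
    std = math.sqrt(sum((r - mean) ** 2 for r in residuals) / len(residuals))
    return [i for i, r in enumerate(residuals) if r > k * std]

# A flat 500 ms signal with one 2000 ms spike at index 20.
latency = [500] * 30
latency[20] = 2000
print(breaches(latency, holt_baseline(latency)))
```

Note that the spike also drags the baseline upward for several intervals afterwards, which is one way a model can be "fooled" by an event it has just seen.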
What's the problem with the state of the art?
§ As the following slides will explain, seasonal baselines miss problems that you don't want to miss.
§ Inevitably, they also report too often.
§ When they do, you have to write rules to resolve the issue with your issues.
§ Now you've failed to find the automatic alerting grail.
§ It may actually be more efficient to go back to writing static thresholds for your key components.
Or, a good reason for teaching you some interesting math.
[Chart: Average Response Time with +1 Std Dev, +2 Std Dev and +3 Std Dev bands; y-axis from 440 to 620]
This is a stable application response time, with bands of standard deviation. Most baselines are fancy forms of standard deviation that take into account things like seasonality.
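The bands on a chart like this can be computed in a few lines. The sketch below uses the population standard deviation over a fixed sample window; the function name and the choice of population (rather than sample) standard deviation are assumptions for illustration, and no seasonality is modeled.

```python
import statistics

def deviation_bands(samples, max_k=3):
    """Return the mean and the +1 .. +max_k standard deviation bands."""
    mean = statistics.fmean(samples)
    std = statistics.pstdev(samples)  # population std dev of the window
    return mean, [mean + k * std for k in range(1, max_k + 1)]

mean, bands = deviation_bands([490, 510, 490, 510])
print(mean, bands)
```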
[Chart: response time with a single outlier spike; y-axis from 0 to 1800]
An outlier… What to do? If it's in a seasonal window, it has to be a bigger outlier, but the problem of “To Alert or Not to Alert” remains the same.
You must either send an alert for this single spike or write a rule to say that the spike has to be “so big” before you care (which is usually done with a manually written rule like “> 3 std dev”).
“Mr. Ops won't even put down his sandwich for a single failed transaction.”
[Chart: response time with a sustained spike; y-axis from 0 to 2500]
What about the situation of a sustained spike?
Supposedly, seasonality cancels out the normal operations. But how many of you have apps in which a single user logs in and starts running expensive (e.g. reporting) transactions?
The traditional approach again has to decide: when to alert? If app users log in at irregular intervals and perform this type of transaction, then triggering alerts on their normal (non-seasonal) activity amounts to “cat alerts > /dev/null”.
But how long do you wait then? Once again, a decision you have to make and configure for each of your apps.
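The "how long do you wait?" decision usually ends up as a hand-tuned rule of the following shape. This is a hypothetical sketch of such a rule, not a CA APM feature: `threshold` and `min_consecutive` are exactly the per-app knobs the slide complains about.

```python
def first_alert_index(values, threshold, min_consecutive):
    """Index at which a 'stay above threshold for N samples' rule fires, or None."""
    run = 0
    for i, v in enumerate(values):
        run = run + 1 if v > threshold else 0  # any dip resets the wait
        if run >= min_consecutive:
            return i
    return None

# One isolated spike at index 5, then a sustained spike from index 11 onward.
latency = [500] * 5 + [2000] + [500] * 5 + [2000] * 10
for wait in (1, 3, 20):
    print(wait, first_alert_index(latency, threshold=1500, min_consecutive=wait))
```

With a wait of one sample the rule fires on the isolated spike; with a longer wait it ignores the spike but reports the sustained event late; with too long a wait it never fires at all.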
[Chart: response time with a sustained change in level; y-axis from 0 to 3000]
Better hope that sustained, normal changes in response time are seasonal when they happen… If not, you must write rules!
And if you write rules, you might accidentally deaden the threshold to actual problems. Dang, gum!
Our Hero: Walter Shewhart
§ In the 1920s, Walter Shewhart et al. worked on quality control for buried telephone lines.
§ Shewhart observed that while every line displays variation, some lines occasionally display uncontrolled variation. Like a seismometer, there are normal fluctuations and then there are earthquakes.
§ Shewhart invented control charts and the Western Electric Rules to identify uncontrolled variance, earning himself the title: “Father of Statistical Quality Control.”
Translation please!
§ Shewhart taught us to favor real-time observation over mathematical models of a signal's behavior.
§ We still baseline the signal, but the Western Electric Rules define the situations in which the signal should be considered to be in a bad state, rather than a simple delta from the baseline model.
§ Shewhart's method of characterizing the quality of a signal mirrors the behavior of a human observer.
Trust us, you will understand this math.
Shewhart's Western Electric Rules
Straight off Wikipedia…
The canonical Western Electric Rules use plain, old standard deviation as their real-time measure. Each rule identifies a pattern in the signal:
Rule #1 – A statistically interesting outlier.
Rule #2 – Two somewhat interesting outliers out of three measurements.
Rule #3 – Four smaller outliers out of five measurements.
Rule #4 – Many small outliers over many measurements.
This much we flat out stole from math history!
See Comments to the right
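The four patterns above can be sketched in code using the commonly quoted thresholds: one point beyond 3 std dev, two of three beyond 2 std dev, four of five beyond 1 std dev, and a long run on one side of the mean. Those exact numbers are not stated on the slide and are filled in here as assumptions; published statements of Rule #4 use a run of either eight or nine points, so `run_length` is a parameter. The function names are hypothetical.

```python
def _count_beyond(z, end, window, limit, side):
    """Count points in the last `window` samples ending at `end` that lie
    more than `limit` std devs from the mean on the given side (+1 or -1)."""
    start = max(0, end - window + 1)
    return sum(1 for s in z[start:end + 1] if side * s > limit)

def western_electric_breaches(values, mean, std, run_length=8):
    """Return {index: [rule numbers]} for each sample that completes a pattern."""
    if std <= 0:
        raise ValueError("std must be positive")
    z = [(v - mean) / std for v in values]
    found = {}
    for i in range(len(z)):
        rules = set()
        for side in (1, -1):  # patterns must sit on one side of the mean
            if side * z[i] > 3:
                rules.add(1)
            if _count_beyond(z, i, 3, 2, side) >= 2:
                rules.add(2)
            if _count_beyond(z, i, 5, 1, side) >= 4:
                rules.add(3)
            if _count_beyond(z, i, run_length, 0, side) >= run_length:
                rules.add(4)
        if rules:
            found[i] = sorted(rules)
    return found

# Baseline of 500 ms with a std dev of 10 ms.
print(western_electric_breaches([500, 535, 500], mean=500, std=10))
print(western_electric_breaches([505] * 8, mean=500, std=10))
```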
CA Technologies Innovation
§ Western Electric Rules are brilliant for real-time analysis of both telephone signals and application signals.
§ A single rule breach, however, is too dull a blade for slicing through this tough problem.
§ By assigning weights to each rule breach, keeping a running sum and aging out old breaches, we can produce a single, normalized value for variance intensity.
CA APM 10 has several patents pending.
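The idea of weights, a running sum and aging can be pictured with a small sketch. The slides do not give CA APM's actual weights, window or normalization, so everything below (the `RULE_WEIGHTS` values, the fixed-size window used for aging, and the class name) is a hypothetical illustration of the general idea, not the product's algorithm.

```python
from collections import deque

# Hypothetical weights: rarer, larger excursions count for more.
RULE_WEIGHTS = {1: 4, 2: 3, 3: 2, 4: 1}

class VarianceIntensity:
    """Running, normalized score built from weighted rule breaches."""

    def __init__(self, window=10, weights=RULE_WEIGHTS):
        self.weights = weights
        self.recent = deque(maxlen=window)  # oldest scores age out automatically
        # Largest possible sum: every rule breached at every interval in the window.
        self.max_score = window * sum(weights.values())

    def update(self, rules_breached):
        """Record the rules breached in this interval; return intensity in [0, 1]."""
        self.recent.append(sum(self.weights[r] for r in rules_breached))
        return sum(self.recent) / self.max_score

vi = VarianceIntensity(window=4)
print(vi.update([1]))                            # a lone outlier barely registers
print([vi.update([]) for _ in range(4)])         # and ages out again
print([vi.update([1, 2]) for _ in range(4)])     # a sustained event keeps climbing
```

In this sketch a single breach produces a small value that decays to zero, while repeated breaches produce progressively larger values, which matches the earthquake-scale behavior described in the abstract.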
In a busy system, there are always varying levels of stability.
In this picture, can you tell which signals are least stable?
This signal experienced an outlier, but it didn't turn blue.
A single rule breach isn't enough for “Pete to put down his sandwich.”
In this case, the change in stability was sustained over about forty minutes.
What happened? Click to find out…
This application experienced a remarkable degradation in performance over a forty-minute period.
Both the old approach and our new one would alert here, but CA's alert would happen early in the event and trigger trace collection automatically.
The old approach might not have let an operator know for thirty minutes or more, depending on the rules they configured.
Triage is a battlefield medicine term: where are the wounded soldiers?
CA's approach means identifying chronic problems as well as acute ones. Which of these lines are more stable, but still having chronic stability events at regular intervals?
Differential Analysis Default Configuration
CA TECHNOLOGIES TEAM PEGASUS
Clockwise from left: Prashant Pathak, Mark LoSacco, Weini Yu, Prasanna Ram Venkatachalam, Naresh Chippada, Carey Feldstein, Paul Callahan and Sai Krishna Rayanapati. [not pictured: me]
Recommended Sessions
SESSION # | TITLE | DATE/TIME
DO5X189S | How to Achieve a Customer-Centric View in an Omni-Channel World | 11/18/2015 at 1:00 pm
DO5X194S | Monitor Microservices, Containers, Cloud Foundry and Node with CA Application Performance Management | 11/18/2015 at 4:30 pm
DO5X193S | Customize CA Application Performance Management with Tips for Using the CA Application Performance Management Open APIs | 11/19/2015 at 4:30 pm
Must See Demos
Application Performance Management and DevOps, featuring APM use in preproduction scenarios
Application Performance Management Theater 5
Application Performance Management, Modern Monitoring, featuring the new APM Team Center
Application Performance Management Theater 5
Ensuring a “5 star” mobile app experience with CA Mobile App Analytics
Mobile App Analytics Theater 5
Unified Monitoring: APM Integrations including UIM
Application Performance Management Theater 5
Follow On Conversations At…
Smart Bar: Application Performance Management Theater 5
Tech Talks: Application Performance Management Theater 5
Question and Answer: DAVID.B.MARTIN@CA.COM