Monitorama 2013 Keynote

Post on 05-Dec-2014

More Than Monitoring
#monitoringr++

Neil Gunther

Performance Dynamics

Monitorama Keynote
Boston, March 28 2013

© 2013 Performance Dynamics, More Than Monitoring, March 30, 2013

Let’s Get Calibrated about Data

Outline

1 Let’s Get Calibrated about Data

2 Potted History of Monitoring

3 Performance Visualization Basics

4 Monitored Data are Time Series

5 Performance Visualization in R

6 Possible Hacks

Guerrilla Mantra: All data is wrong by definition

Measurement is a process, not math.

All data contains measurement errors.

How big are they and can you tolerate them?

Treating data as divine is a sin.

Guerrilla Mantra: VAMOOS your data doubts

Visualize

Analyze

Modelize

Over and Over until

Satisfied

Guerrilla Mantra: There are only 3 performance metrics

1 Time, e.g., cpu_ticks

2 Rate (inverse time), e.g., httpGets/s

3 Number or count, e.g., RSS

Watch Out for Patterns

I mean that in a bad way. Your brain can’t help itself.

Potted History of Monitoring

Old Adage: “Nothing New in Computer Science”

Mainframes didn’t need real-time monitoring. Batch processing.

How You Programmed It

Later ... the interface improved

CTSS (Compatible Time-Sharing System) developed in 1961 at MIT on IBM 7094.
Compatible meant compatibility with the standard IBM batch processing O/S.

Multics Instrumentation c.1965

Multics was a multiuser O/S following CTSS time-share.

The Implementation

“a rough measure of response time for a time-sharing console user, an exponential average of the number of users in the highest priority scheduling queue is continuously maintained. An integrator, $L$, initially zero, is updated periodically by the formula

$L \leftarrow L \times m + N_q$

where $N_q$ is the measured length of the scheduling queue at the instant of update, and $m$ is an exponential damping constant”

This equation is an iterative form of an exponentially damped moving average. In modern terminology, it’s a data smoother.
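A minimal sketch in C of the update rule quoted above, with a normalized variant for comparison. The function names, the use of doubles, and the (1 - m) weighting in the second function are assumptions for illustration; none of this is taken from the Multics sources.

```c
#include <assert.h>

/* One update of the integrator quoted above: L <- L*m + Nq.
 * For a constant queue length Nq and 0 <= m < 1 this settles at Nq/(1 - m). */
double multics_update(double L, double m, double nq)
{
    assert(m >= 0.0 && m < 1.0);
    return L * m + nq;
}

/* Normalized variant (an assumption for illustration): weighting the new
 * sample by (1 - m) makes a constant Nq its own steady state. */
double damped_average(double avg, double m, double nq)
{
    assert(m >= 0.0 && m < 1.0);
    return avg * m + nq * (1.0 - m);
}
```

Starting from zero and feeding a constant queue length of 3 with m = 0.5, repeated calls to multics_update approach 6, while damped_average approaches 3.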

The Lesson

“experience with Multics, and earlier with CTSS, shows that building permanent instrumentation into key supervisor modules is well worth the effort, since the cost of maintaining well-organized instrumentation is low, and the payoff is very high.”

You know this better as ...

Linux load average

58 extern unsigned long avenrun[ ]; /* Load averages */
59
60 #define FSHIFT 11 /* nr of bits of precision */
61 #define FIXED_1 (1<<FSHIFT) /* 1.0 as fixed-point */
62 #define LOAD_FREQ (5*HZ) /* 5 sec intervals */
63 #define EXP_1 1884 /* 1/exp(5sec/1min) as fixed-pt */
64 #define EXP_5 2014 /* 1/exp(5sec/5min) */
65 #define EXP_15 2037 /* 1/exp(5sec/15min) */
66
67 #define CALC_LOAD(load,exp,n) \
68 load *= exp; \
69 load += n*(FIXED_1-exp); \
70 load >>= FSHIFT;

Lines 67–70 are identical to the 1965 Multics formula.
See Chap. 4 of my Perl::PDQ book for the details.
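The macro can be exercised outside the kernel. The sketch below rewrites lines 60–70 as a plain C function. The name calc_load, the exp_fp parameter name, and the unsigned long types are assumptions for illustration. It also assumes the run-queue count n is passed in the same fixed-point scale as the load, that is, multiplied by FIXED_1.

```c
#include <assert.h>

#define FSHIFT  11                 /* bits of fractional precision */
#define FIXED_1 (1UL << FSHIFT)    /* 1.0 in fixed point (2048) */
#define EXP_1   1884UL             /* 1-minute damping factor from line 63 */

/* Function form of the CALC_LOAD macro: one 5-second update step.
 * load and n are both fixed-point values scaled by FIXED_1. */
unsigned long calc_load(unsigned long load, unsigned long exp_fp, unsigned long n)
{
    assert(exp_fp <= FIXED_1);
    load *= exp_fp;                  /* damp the previous value */
    load += n * (FIXED_1 - exp_fp);  /* blend in the new sample */
    return load >> FSHIFT;           /* rescale back to fixed point */
}
```

With two runnable tasks (n = 2 * FIXED_1) and a load starting at zero, the first step gives 328, about 0.16 in decimal, and repeated steps climb toward 2 * FIXED_1. Integer truncation in the final shift means the value can stall slightly below that limit.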

UNIX load average

Unix at Bell Labs c.1970

CTSS begat Multics begat Unics begat Unix

Get it?

Then Came Screens 9:40

Note the mouse in her right hand.

Unix top: A Legacy App

Green ASCII characters on black background

Desktop GUI c.1995

Lots of colored spaghetti

Static Charts on the Web c.2000

Load average over a 24 hr period, with the 1, 5, and 15 min LAs as green, blue, and red time series (which is completely redundant, BTW).

As informative as watching a ticker chart on Wall Street

Browser-based Dashboards

Interminable strip charts are not good for your brain.

Performance Visualization Basics

The Central Challenge

Find the best cognitive impedance match

between the digital computer and the neural computer

Cognitive Circuitry is Largely Unknown

PerfViz is an N-dimensional problem

Brain is trapped in (3 + 1)-dimensions

No 5-fold rotational symmetry

Physicists have all the fun with SciViz

Time dimension becomes animation sequence

Your Brain is Easily Fooled

All cognition is computation

Your brain is a differential analyzer

Difference errors produce perceptual illusions

Monitored Data are Time Series

Gothic graphs can hurt your brain (Bad Z value)

There’s a Whole Science of Color

Pastel Colors on White

[Figure: Sandy Bridge 16 VPU Throughput. LIO/s versus t-Index for test1.HTT.Turb, test2.Turbo, test3.HTT, and test4.AllOff, drawn on a white background.]

Pastel Colors on Black

[Figure: Sandy Bridge 16 VPU Throughput. LIO/s versus t-Index for test1.HTT.Turb, test2.Turbo, test3.HTT, and test4.AllOff, drawn on a black background.]

Pastel Colors on Neutral Gray

[Figure: Sandy Bridge 16 VPU Throughput. LIO/s versus t-Index for test1.HTT.Turb, test2.Turbo, test3.HTT, and test4.AllOff, drawn on a neutral gray background.]

Coordinated Colors on Neutral Gray

[Figure: Sandy Bridge 16 VPU Throughput. LIO/s versus t-Index for test1.HTT.Turb, test2.Turbo, test3.HTT, and test4.AllOff, drawn in coordinated colors on a neutral gray background.]

Time Series Can Reveal Data Correlations 9:50

[Figure: four stacked time-series panels (CPU%, Mem%, ioWait%, LdAvg-1) versus Time for server.p.65, 2012-05-03 to 2012-05-04.]

But Data Doesn’t Tell All: Monitored Server Consumption

[Figure: Monitored Server Consumption. Capacity (U%) versus Time (m:s), showing Uavg data and Umax data against the server saturation level.]

Beyond Data: Effective Server Consumption

[Figure: Lookahead Server Consumption. Capacity (U%) versus Time (m:s), showing Uavg data, Umax data, and Ueff predicted, with the effective max consumption and server saturation levels marked.]

Performance Visualization in R

Choose Your Cognitive Z in R

[Figure: a scatterplot matrix of mpg, disp, drat, and wt, alongside a 3D Scatterplot of wt, disp, and mpg.]

Enhanced Plots in R

[Figure: four panels plotting Xp versus p: Raw bench data, Data smoother, USL fit, and USL fit + CI bands.]

Chernoff Faces in R

Example (using R)

library(TeachingDemos)
# 18*10 = 180 uniform random values in 12 rows, so 15 columns
faces2(matrix(runif(18*10), nrow=12), main='Random Faces')

Kiviat and Radar Charts in R

[Figure: Correlation Radar, a radial plot of named correlation values (Alp12Mn, AvrROE, DivToP, and others) with radial ticks at -0.5, 0, 0.5, and 1.]

Example (using R)

require(plotrix)
corelations <- c(1:97)
corelation.names <- names(corelations) <- c("Alp12Mn","AvrROE", "DivToP", "GrowAPS", "GrowAsst", "GrowBPS", "GrowCFPS",...
corelations <- c(0.223, 0.1884, -0.131, 0.1287, 0.0307,...
par(ps=6)
radial.plot(corelations, labels=corelation.names, rp.type="p", main="Correlation Radar", radial.lim=c(-1,1), line.col="blue")

Treemaps in R

[Figure: two treemaps titled GDAT: Top 100 Websites, each with a color scale from -8e+09 to 8e+09. One labels tiles by category (Search/portal, Retail, Software, Media/news, Social network, and others); the other labels tiles by site (Google, Yahoo!, Microsoft, Facebook, YouTube, Wikipedia, and others).]

Example (using R)

library(portfolio)
bbc <- read.csv("nielsen100-2010.csv")
map.market(id=seq(1:100), area=bbc$uniqueAudience, group=bbc$categoryBBC, color=bbc$totalVisits, main="GDAT: Top 100 Websites")

There is another treemap pkg on CRAN

Heatmap of Multiple Servers in Time

Barry in 2D

[Figure: barycentric diagrams. Three-component examples: p1 = p2 = p3 = 1/3, and p1 = 0.6, p2 = 0.1, p3 = 0.3. Four-component examples: p1 = .4, p2 = .25, p3 = .1, p4 = .25, and p1 = .8, p2 = .1, p3 = .05, p4 = .05.]

Barycentric coordinate system for %CPU = %user + %sys + %idle
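As a rough sketch of the mapping behind such a plot, three fractions that sum to 1 can be placed inside an equilateral triangle by weighting its vertices. The vertex layout below (user at the origin, sys at (1, 0), idle at the apex) and all names are assumptions for illustration, not the layout used on the slide.

```c
#include <assert.h>

typedef struct { double x, y; } point2d;

/* Height of a unit-side equilateral triangle: sqrt(3)/2. */
#define TRI_HEIGHT 0.8660254037844386

/* Map CPU fractions (user + sys + idle == 1) to a point in the triangle
 * with vertices user=(0,0), sys=(1,0), idle=(0.5, TRI_HEIGHT). */
point2d barycentric_to_xy(double user, double sys, double idle)
{
    double sum = user + sys + idle;
    assert(sum > 0.999999 && sum < 1.000001);
    point2d p;
    p.x = sys * 1.0 + idle * 0.5;   /* the user vertex contributes 0 to x */
    p.y = idle * TRI_HEIGHT;        /* only the idle vertex has height */
    return p;
}
```

Equal thirds land at the centroid (0.5, about 0.289), and a fully idle CPU lands on the apex. Because the three percentages always sum to 100, one 2D point carries the whole %CPU breakdown.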

Barry in 3D: Tukey-like Rotations

Tukey trumps Tufte

Barycentric coordinate system for %BW = %unicast + %multicast + %broadcast + %idle

Possible Hacks

Interactive and Streaming in R

R derives from S at Bell Labs (home of Unix) c.1975, 1980, 1988

R scripting language

console interface: > (x^(k-1)*exp(-x/s))/(gamma(k)*s^k)

cf. Mathematica document paradigm: $\dfrac{x^{k-1}\, e^{-x/\theta}}{\Gamma(k)\, \theta^{k}}$

No fonts, no symbolic computation

More recent focus is on enabling:

Better IDE integration, e.g., RStudio

Browser-based interaction, e.g., Shiny

Streaming data acquisition, e.g., R plus Hadoop, but ...

R interpreter is single-threaded

Needs a full app stack b/w data and R engine

Revolution Analytics is in this space

Plenty of room for innovative development

Some Ideas for Tomorrow

1 Lots of opportunities

2 Couple simple statistical analysis to monitored data

3 Display the errors in monitored data

4 Replace the black background in Graphite

5 Apply ColorBrewer to Graphite

6 Apply effective capacity consumption to your monitored data

7 Replace strip charts with animation

WARNING: Common sense is the pitfall of all performance analysis

Modelizing GitHub Growth

Since I didn’t discuss the modeling part of VAMOOS ...

Donnie Berkholz of redmonk.com wrote on his Jan 21, 2013 blog that GitHub will reach:

4 million users near Aug 2013

5 million users near Dec 2013

That’s based on a log-linear model.
I claim it’s a log-log model and therefore:

4 million users around Oct 2013

5 million users around Apr 2014
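For readers less familiar with the two model forms, they can be written as follows. Here N(t) is the number of users at time t, and a, b, α, β are fitted constants. These symbols are notation assumed for illustration; no fitted values are given in the talk.

```latex
\begin{align*}
\text{log-linear:}\quad & \ln N(t) = a + b\,t
  && \Longleftrightarrow\quad N(t) = e^{a}\, e^{b t} \quad\text{(exponential growth)} \\
\text{log-log:}\quad & \ln N(t) = \alpha + \beta \ln t
  && \Longleftrightarrow\quad N(t) = e^{\alpha}\, t^{\beta} \quad\text{(power-law growth)}
\end{align*}
```

A power law typically extrapolates more slowly than an exponential fitted to the same history, which is consistent with the log-log model placing each milestone later.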

Performance Dynamics Company
Castro Valley, California
www.perfdynamics.com
perfdynamics.blogspot.com
twitter.com/DrQz
Facebook
njgunther@perfdynamics.com
OFF: +1-510-537-5758