Deconstructing SPECweb99

Erich Nahum

IBM T.J. Watson Research Center
www.research.ibm.com/people/n/nahum
nahum@us.ibm.com


Talk Overview

• Workload Generators

• SPECweb99

• Methodology

• Results

• Summary and Conclusions


Why Workload Generators?

• Allows stress-testing and bug-finding
• Gives us some idea of server capacity
• Allows us a scientific process to compare approaches
  – e.g., server models, gigabit adaptors, OS implementations
• Assumption is that a difference in the testbed translates to some difference in the real world
• Allows the performance debugging cycle

[Figure: The Performance Debugging Cycle, with stages Measure, Reproduce, Find Problem, Fix and/or improve]


How Does Workload Generation Work?

• Many clients, one server
  – match asymmetry of Internet
• Server is populated with some kind of synthetic content
• Simulated clients produce requests for server
• Master process to control clients, aggregate results
• Goal is to measure server
  – not the client or network
• Must be robust to conditions
  – e.g., if server keeps sending 404 Not Found, will clients notice?

[Figure: clients send Requests to the server, which returns Responses]
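To make the robustness point concrete, a harness can tally status codes instead of only counting completed requests. This is a minimal sketch, not taken from any generator discussed in this talk: `summarize_responses` and its `max_error_rate` parameter are hypothetical names, and treating every 4xx/5xx status as an error is an assumption.

```python
from collections import Counter


def summarize_responses(status_codes, max_error_rate=0.01):
    """Tally HTTP status codes and flag a run whose error share is too high.

    Assumption for this sketch: any 4xx or 5xx status counts as an error.
    """
    counts = Counter(status_codes)
    total = sum(counts.values())
    errors = sum(n for code, n in counts.items() if code >= 400)
    error_rate = errors / total if total else 0.0
    return {
        "total": total,
        "errors": errors,
        "error_rate": error_rate,
        # An empty run is never considered valid.
        "valid": total > 0 and error_rate <= max_error_rate,
    }
```

A client that only measures how many responses arrive per second would score a server answering everything with 404 very well; a check like this lets the master process reject such a run.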


Problems with Workload Generators

• Only as good as our understanding of the traffic
• Traffic may change over time
  – generators must too
• May not be representative
  – e.g., are file size distributions from IBM.com similar to mine?
• May be ignoring important factors
  – e.g., browser behavior, WAN conditions, modem connectivity

• Still, useful for diagnosing and treating problems


What Server Workload Generators Exist?

• Many. In order of publication:
  – WebStone (SGI)
  – SPECweb96 (SPEC)
  – Scalable Client (Rice Univ.)
  – SURGE (Boston Univ.)
  – httperf (HP Labs)
  – SPECweb99 (SPEC)
  – TPC-W (TPC)
  – WaspClient (IBM)
  – WAGON (IBM)
• Not to mention those for proxies (e.g. polygraph)
• Focus of this talk: SPECweb99


Why SPECweb99?

• Has become the de facto standard used in industry:
  – 141 submissions in 3 years on the SPEC web site
  – Hardware: Compaq, Dell, Fujitsu, HP, IBM, Sun
  – OSes: AIX, HPUX, Linux, Solaris, Windows NT
  – Servers: Apache, IIS, Netscape, Tux, Zeus
• Used within corporations for performance, testing, and marketing
  – E.g., within IBM, used by AIX, Linux, and 390 groups
• Raises the question: how realistic is it?


Server Workload Characterization

• Over the years, many observations have been made about Web server behavior:
  – Request methods
  – Response codes
  – Document popularity
  – Document sizes
  – Transfer sizes
  – Protocol use
  – Inter-arrival times

How well does SPECweb99 capture these characteristics?


History: SPECweb96

• SPEC: Standard Performance Evaluation Corporation
  – Non-profit group with many benchmarks (CPU, FS)
  – Pay for membership, get source code
• First attempt to get somewhat representative
  – Based on logs from NCSA, HP, Hal Computers
• 4 classes of files:

  Percentage  Size
  35.00       0-1 KB
  50.00       1-10 KB
  14.00       10-100 KB
   1.00       100 KB – 1 MB

• Poisson distribution within each class
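The class mix in the table can be sketched as a weighted draw. This is an illustration only, not SPEC's code: the names are hypothetical, 1 KB is assumed to be 1024 bytes, and the Poisson selection within each class is deliberately left out.

```python
import random

# (share of requests, low bytes, high bytes) per class, from the table.
# Assumption: 1 KB = 1024 bytes.
FILE_CLASSES = [
    (0.35, 0, 1024),
    (0.50, 1024, 10 * 1024),
    (0.14, 10 * 1024, 100 * 1024),
    (0.01, 100 * 1024, 1024 * 1024),
]


def pick_class(rng):
    """Return the index (0-3) of a file class drawn with the table's weights."""
    weights = [share for share, _, _ in FILE_CLASSES]
    return rng.choices(range(len(FILE_CLASSES)), weights=weights, k=1)[0]


def class_shares(num_requests, seed=0):
    """Draw num_requests classes and return the observed share of each."""
    rng = random.Random(seed)
    counts = [0] * len(FILE_CLASSES)
    for _ in range(num_requests):
        counts[pick_class(rng)] += 1
    return [c / num_requests for c in counts]
```

With enough draws, `class_shares` should approach the 35/50/14/1 split; the largest class is hit rarely but accounts for the largest files.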


SPECweb96 (cont)

• Notion of scaling versus load:
  – number of directories in the data set doubles as expected throughput quadruples (sqrt(throughput/5)*10)
  – requests spread evenly across all application directories
• Process-based WG
• Clients talk to master via RPCs
• Does only GETs, no keep-alive

www.spec.org/osg/web96
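The scaling expression on this slide can be checked directly. The function name below is hypothetical, and the slide does not state the unit of throughput, so it is left abstract here.

```python
import math


def specweb96_directories(throughput):
    """Directory count per the slide's expression sqrt(throughput/5)*10."""
    if throughput < 0:
        raise ValueError("throughput must be non-negative")
    return math.sqrt(throughput / 5) * 10
```

Because of the square root, quadrupling the expected throughput doubles the directory count, which is the "doubles as throughput quadruples" behavior described above.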


Evolution: SPECweb99

• In response to people "gaming" benchmark, now includes rules:
  – IP maximum segment lifetime (MSL) must be at least 60 seconds
  – Link-layer maximum transmission unit (MTU) must not be larger than 1460 bytes (Ethernet frame size)
  – Dynamic content may not be cached
    • not clear that this is followed
  – Servers must log requests.
    • W3C common log format is sufficient but not mandatory.
  – Resulting workload must be within 10% of target.
  – Error rate must be below 1%.
• Metric has changed:
  – now "number of simultaneous conforming connections": rate of a connection must be greater than 320 Kbps
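The numeric rules above (320 Kbps per connection, workload within 10% of target, error rate below 1%) can be expressed as simple checks. This is a sketch under stated assumptions, not the benchmark's own validation code: the function names are hypothetical, and "Kbps" is assumed to mean 1000 bits per second.

```python
MIN_CONN_RATE_BPS = 320_000  # assumption: 1 Kbps = 1000 bits per second


def conforming_connections(conn_rates_bps):
    """Count connections whose measured rate exceeds 320 Kbps."""
    return sum(1 for rate in conn_rates_bps if rate > MIN_CONN_RATE_BPS)


def run_is_valid(achieved_load, target_load, errors, total_requests):
    """Apply the two run rules: within 10% of target, error rate below 1%."""
    if total_requests <= 0:
        return False
    within_target = abs(achieved_load - target_load) <= 0.10 * target_load
    error_rate = errors / total_requests
    return within_target and error_rate < 0.01
```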


SPECweb99 (cont)

• Directory size has changed:
  (25 + ((400000/122000) * simultaneous conns) / 5.0)
• Improved HTTP 1.0/1.1 support:
  – Keep-alive requests (client closes after N requests)
  – Cookies
• Back-end notion of user demographics
  – Used for ad rotation
  – Request includes user_id and last_ad
• Request breakdown:
  – 70.00 % static GET
  – 12.45 % dynamic GET
  – 12.60 % dynamic GET with custom ad rotation
  – 04.80 % dynamic POST
  – 00.15 % dynamic GET calling CGI code
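The directory expression and the request breakdown can be sketched as follows. All names are hypothetical, and the parenthesization of the directory expression, read here as 25 + ((400000/122000) * conns) / 5.0, is an assumption about the slide's intent.

```python
import random

# Percent of requests per type, from the request breakdown above.
REQUEST_MIX = {
    "static GET": 70.00,
    "dynamic GET": 12.45,
    "dynamic GET with custom ad rotation": 12.60,
    "dynamic POST": 4.80,
    "dynamic GET calling CGI code": 0.15,
}


def specweb99_directories(simultaneous_conns):
    """Directory count, assuming 25 + ((400000/122000) * conns) / 5.0."""
    return 25 + ((400000 / 122000) * simultaneous_conns) / 5.0


def sample_requests(num_requests, seed=0):
    """Draw request types with the breakdown's weights."""
    rng = random.Random(seed)
    kinds = list(REQUEST_MIX)
    weights = [REQUEST_MIX[kind] for kind in kinds]
    return rng.choices(kinds, weights=weights, k=num_requests)
```

Unlike the SPECweb96 square-root rule, this expression grows linearly with the number of simultaneous connections.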


SPECweb99 (cont)

• Other breakdowns:
  – 30 % HTTP 1.0 with no keep-alive or persistence
  – 70 % HTTP 1.1 with keep-alive to "model" persistence
  – still has 4 classes of file size with Poisson distribution
  – supports Zipf popularity
• Client implementation details:
  – Master-client communication uses sockets
  – Code includes sample Perl code for CGI
  – Client configurable to use threads or processes
• Much more info on setup, debugging, tuning
• All results posted to web page,
  – including configuration & back end code

www.spec.org/osg/web99


Methodology

• Take a log from a large-scale SPECweb99 run
• Take a number of available server logs
• For each characteristic discussed in the literature:
  – Show what SPECweb99 does
  – Compare to results from the literature
  – Compare to results from a set of sample server logs
  – Render judgment on how well SPECweb99 does


Sample Logs for Illustration

Name              Description                      Period                   Hits           Bytes       Clients    URLs
Chess 1997        Kasparov-Deep Blue Event Site    2 weeks in May 1997      1,586,667      14,171,711  256,382    2,293
Olympics 1998     Nagano 1998 Olympics Event Site  2 days in Feb 1998       5,800,000      10,515,507  80,921     30,465
IBM 1998          Corporate Presence               1 day in June 1998       11,485,600     54,697,108  86,0211    15,788
World Cup 1998    Sporting Event Site              31 days in Jun-Jul 1998  1,111,970,278  54,697,108  2,240,639  89,997
Dept. Store 2000  Online Shopping                  12 days in June 2000     13,169,361     54,697,108  86,021     15,788
IBM 2001          Corporate Presence               1 day in Feb 2001        12,445,739     28,804,852  319,698    42,874

We’ll use statistics generated from these logs as examples.


Talk Overview

• Workload Generators

• SPECweb99

• Methodology

• Results

• Summary and Conclusions


Request Methods

• AW96, AW00, PQ00, KR01: majority are GETs, few POSTs
• SPECweb99: No HEAD requests, too many POSTs

Percentage of requests by method:

Method  Chess 1997  Olymp. 1998  IBM 1998  W. Cup 1998  Dept. 2000  IBM 2001  SPECweb99
GET     92.18       99.37        99.91     99.75        99.42       97.54     95.06
HEAD    03.18       00.30        00.08     00.23        00.45       02.09     00.00
POST    00.01       00.04        00.02     00.01        00.01       00.20     04.93
Other   noise       noise        noise     noise        noise       noise     noise


Response Codes

• AW96, AW00, PQ00, KR01: Most are 200s, next 304’s
• SPECweb99 doesn’t capture anything but 200 OK

Percentage of responses by code:

Response Code     Chess 1997  Olymp. 1998  IBM 1998  W. Cup 1998  Dept. 2000  IBM 2001  SPECweb99
200 OK            85.32       76.02        75.28     79.46        86.80       67.73     100.00
206 Partial Cont  00.00       00.00        00.00     00.06        00.00       00.00     00.00
302 Found         00.05       00.05        01.18     00.56        00.56       15.11     00.00
304 Not Modified  13.73       23.25        22.84     19.75        12.40       16.26     00.00
403 Forbidden     00.01       00.02        00.01     00.00        00.02       00.01     00.00
404 Not Found     00.55       00.64        00.65     00.70        00.18       00.79     00.00


Resource Popularity

• p(r) = C/r^alpha (alpha = 1 true Zipf; others "Zipf-like")
• Consistent with CBC95, AW96, CB96, PQ00, KR01
• SPECweb99 does a good job here with alpha = 1
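The popularity law p(r) = C/r^alpha can be made concrete by computing the normalizing constant C for a finite set of resources. This is a minimal sketch; the function name and the choice to normalize over ranks 1..N are assumptions for illustration.

```python
def zipf_probabilities(num_resources, alpha=1.0):
    """Return p(r) = C / r**alpha for ranks r = 1..num_resources.

    C is chosen so that the probabilities sum to 1.
    """
    weights = [1.0 / (rank ** alpha) for rank in range(1, num_resources + 1)]
    c = 1.0 / sum(weights)
    return [c * w for w in weights]
```

With alpha = 1 the most popular resource is requested twice as often as the second and three times as often as the third, which is the "true Zipf" case the slide refers to.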


Resource (File) Sizes

• Lognormal body, consistent with results from AW96, CB96, KR01.
• SPECweb99 curve is sparse, 4 distinct regions


Tails of the File Size

• AW96, CB96: sizes have Pareto tail; Downey01: sizes are lognormal.
• SPECweb99 tail only goes to 900 KB (vs 10 MB for others)


Response (Transfer) Sizes

• Lognormal body, consistent with CBC95, AW96, CB96, KR01
• SPECweb99 doesn’t capture zero-byte transfers (304’s)


Transfer Sizes w/o 304’s

• When 304’s removed, SPECweb99 much closer


Tails of the Transfer Size

• SPECweb99 tail is neither lognormal nor Pareto
• Again, max transfer is only 900 KB


Inter-Arrival Times

• Literature gives exponential distribution for session arrivals
• KR01: Request inter-arrivals are Pareto
• Here we look at request inter-arrivals


Tails of Inter-Arrival Times

• SPECweb99 has Pareto tail
• Not all others do, but may be due to truncation
  – (e.g. log duration of only one day)
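Tail plots of this kind are typically built from the gaps between request timestamps and their empirical complementary CDF, Pr[X > x]. The sketch below shows one way to compute both from a list of log timestamps; the function names are hypothetical and this is not the tooling used for the talk.

```python
def inter_arrival_times(timestamps):
    """Return the gaps between successive request timestamps."""
    ordered = sorted(timestamps)
    return [later - earlier for earlier, later in zip(ordered, ordered[1:])]


def empirical_ccdf(samples):
    """Return (x, Pr[X > x]) pairs, one per distinct sample value."""
    ordered = sorted(samples)
    n = len(ordered)
    points = []
    for i, x in enumerate(ordered):
        # Emit one point per distinct value, at its last occurrence.
        if i + 1 < n and ordered[i + 1] == x:
            continue
        points.append((x, (n - (i + 1)) / n))
    return points
```

Plotted on log-log axes, a Pareto tail shows up as a roughly straight line. A short log caps the largest observable gap, which is one way the truncation mentioned above can hide a heavy tail.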


HTTP Version

• Over time, more and more requests are served using 1.1
• But SPECweb99's HTTP 1.1 share is much higher than in any other log
• Literature doesn’t look at this, so no judgments

Percentage of requests by protocol version:

Protocol Version  Chess 1997  Olymp. 1998  IBM 1998  W. Cup 1998  Dept. 2000  IBM 2001  SPECweb99
HTTP 1.0          95.30       78.56        77.22     78.62        51.13       51.08     30.00
HTTP 1.1          00.00       20.92        18.43     21.35        48.82       48.30     70.00
Unclear           04.70       00.05        04.34     00.02        00.05       00.06     00.00


Summary and Conclusions

• SPECweb99 has a mixed record depending on characteristic:
  – Methods: OK
  – Response codes: bad
  – Document popularity: good
  – File sizes: OK to bad
  – Transfer sizes: bad
  – Inter-arrival times: good
• Main problems are:
  – Needs to capture conditional GETs with If-Modified-Since (IMS) for 304’s
  – Better file size distribution (smoother, larger)
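For readers unfamiliar with conditional GETs: the client sends an If-Modified-Since date, and the server answers 304 Not Modified with no body when the resource has not changed since then. The sketch below shows only that server-side decision; the function name is hypothetical and real servers handle more cases (ETags, malformed dates, clock skew).

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


def status_for_conditional_get(last_modified, if_modified_since=None):
    """Return 304 if the resource is unchanged since the IMS date, else 200.

    last_modified must be a timezone-aware datetime.
    """
    if if_modified_since is None:
        return 200  # plain GET: always a full response
    try:
        since = parsedate_to_datetime(if_modified_since)
    except (TypeError, ValueError):
        return 200  # unparseable header: fall back to a full response
    # HTTP dates have one-second resolution.
    if last_modified.replace(microsecond=0) <= since:
        return 304
    return 200
```

A workload generator that never sends If-Modified-Since will only ever exercise the 200 path, which matches the response-code table where SPECweb99 shows 100% 200 OK.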


Future Work

• Several possibilities for future work:
  – Compare logs with SURGE
  – More detail on HTTP 1.1 (requires better workload characterization, e.g. packet traces)
  – Dynamic content (e.g., TPC-W) (again, requires workload characterization)
• Latter 2 will not be easy due to privacy, competitive concerns


Probability

• Graph shows 3 distributions with average = 2.
• Note that average ≠ median in some cases!
• Different distributions have different “weight” in tail.
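The gap between average and median can be computed in closed form. The transcript does not say which three distributions the graph shows, so the exponential, lognormal, and Pareto cases below are an assumption chosen for illustration, as are the function names and parameter values.

```python
import math


def exponential_median(mean):
    """Median of an exponential distribution with the given mean."""
    return mean * math.log(2)


def lognormal_median(mean, sigma):
    """Median of a lognormal with the given mean and log-scale sigma.

    mean = exp(mu + sigma^2 / 2), so mu = ln(mean) - sigma^2 / 2,
    and the median is exp(mu).
    """
    return math.exp(math.log(mean) - sigma ** 2 / 2)


def pareto_median(mean, shape):
    """Median of a Pareto with the given mean and shape (shape > 1).

    mean = shape * k / (shape - 1) fixes the scale k;
    the median is k * 2**(1 / shape).
    """
    if shape <= 1:
        raise ValueError("the mean is finite only for shape > 1")
    k = mean * (shape - 1) / shape
    return k * 2 ** (1 / shape)
```

For a mean of 2, all three medians fall below the mean, because the long right tail pulls the average up.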


Important Distributions

Some Frequently-Seen Distributions:

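• Normal (mean mu, standard deviation sigma):
  $f(x) = \frac{e^{-(x-\mu)^2 / (2\sigma^2)}}{\sigma\sqrt{2\pi}}$

• Lognormal (x >= 0; sigma > 0):
  $f(x) = \frac{e^{-(\ln(x)-\mu)^2 / (2\sigma^2)}}{x\sigma\sqrt{2\pi}}$

• Exponential (x >= 0):
  $f(x) = \lambda e^{-\lambda x}$

• Pareto (x >= k, shape a, scale k):
  $f(x) = a k^a / x^{(a+1)}$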


Probability Refresher

• Lots of variability in workloads
  – Use probability distributions to express
  – Want to consider many factors
• Some terminology/jargon:
  – Mean: average of samples
  – Median: half are bigger, half are smaller
  – Percentiles: dump samples into N bins (median is 50th percentile number)
• Heavy-tailed:
  – As x -> infinity, $\Pr[X > x] \sim c\,x^{-a}$


Session Inter-Arrivals

• Inter-arrival time between successive requests
  – "Think time"
  – difference between user requests vs. ALL requests
  – partly depends on definition of boundary
• CB96: variability across multiple timescales, "self-similarity", average load very different from peak or heavy load
• SCJO01: log-normal, 90% less than 1 minute.
• AW96: independent and exponentially distributed
• KR01: session arrivals follow Poisson distribution, but requests follow Pareto with a=1.5
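A generator that wants Pareto request inter-arrivals with a = 1.5, as reported by KR01, can draw them by inverse-transform sampling. This is a sketch only: the function names are hypothetical, and the scale k = 1.0 is an arbitrary assumption since the slide gives only the shape.

```python
import random


def pareto_sample(rng, shape=1.5, scale=1.0):
    """Draw one Pareto variate by inverting Pr[X > x] = (scale / x)**shape."""
    u = 1.0 - rng.random()  # uniform in (0, 1], avoids division by zero
    return scale / (u ** (1.0 / shape))


def pareto_samples(count, shape=1.5, scale=1.0, seed=0):
    """Draw count variates from a seeded generator for reproducibility."""
    rng = random.Random(seed)
    return [pareto_sample(rng, shape, scale) for _ in range(count)]
```

Every draw is at least `scale`, and occasional very large gaps appear; with a = 1.5 the mean is finite but the variance is not, which is what makes such traffic bursty.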


Protocol Support

• IBM.com 2001 logs:
  – Show roughly 53% of client requests are 1.1
• KA01 study:
  – 92% of servers claim to support 1.1 (as of Sep 00)
  – Only 31% actually do; most fail to comply with spec
• SCJO01 show:
  – Avg 6.5 requests per persistent connection
  – 65% have 2 connections per page, rest more.
  – 40-50% of objects downloaded by persistent connections

Appears that we are in the middle of a slow transition to 1.1


WebStone

• The original workload generator from SGI in 1995
• Process-based workload generator, implemented in C
• Clients talk to master via sockets
• Configurable: # client machines, # client processes, run time
• Measured several metrics: avg + max connect time, response time, throughput rate (bits/sec), # pages, # files
• 1.0 only does GETs, CGI support added in 2.0
• Static requests, 5 different file sizes:

  Percentage  Size
  35.00       500 B
  50.00       5 KB
  14.00       50 KB
   0.90       500 KB
   0.10       5 MB

www.mindcraft.com/webstone
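The WebStone mix above implies a mean transfer size and shows how much of the byte volume comes from the rarest file. The sketch below works this out; the names are hypothetical and decimal units (1 KB = 1000 B, 1 MB = 1,000,000 B) are an assumption.

```python
# (share of requests, size in bytes), from the WebStone table.
# Assumption: decimal units, 1 KB = 1000 B and 1 MB = 1,000,000 B.
WEBSTONE_FILES = [
    (0.35, 500),
    (0.50, 5_000),
    (0.14, 50_000),
    (0.009, 500_000),
    (0.001, 5_000_000),
]


def mean_request_size(files):
    """Expected bytes per request under the given mix."""
    total_share = sum(share for share, _ in files)
    return sum(share * size for share, size in files) / total_share


def byte_share_of_largest(files):
    """Fraction of all transferred bytes contributed by the largest file."""
    total_bytes = sum(share * size for share, size in files)
    share, size = max(files, key=lambda entry: entry[1])
    return share * size / total_bytes
```

Under these assumptions the 5 MB file is requested only 0.1% of the time yet accounts for roughly a quarter of the bytes, a small-scale version of the heavy-tail effect discussed in the results.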


SURGE

• Scalable URL Reference GEnerator
  – Barford & Crovella at Boston University CS Dept.
• Much more worried about representativeness, captures:
  – server file size distributions
  – request size distribution
  – relative file popularity
  – embedded file references
  – temporal locality of reference
  – idle periods ("think times") of users

• Process/thread based WG


SURGE (cont)

• Notion of “user-equivalent”:
  – statistical model of a user
  – active “off” time (between URLs),
  – inactive “off” time (between pages)
• Captures various levels of burstiness
• Not validated, shows that load generated is different than SPECweb96 and has more burstiness in terms of CPU and # active connections

www.cs.wisc.edu/~pb


S-Client

• Almost all workload generators are closed-loop:
  – client submits a request, waits for server, maybe thinks for some time, repeat as necessary
• Problem with the closed-loop approach:
  – client can't generate requests faster than the server can respond
  – limits the generated load to the capacity of the server
  – in the real world, arrivals don’t depend on server state
    • i.e., real users have no idea about load on the server when they click on a site, although successive clicks may have this property
  – in particular, can't overload the server
• s-client tries to be open-loop:
  – by generating connections at a particular rate
  – independent of server load/capacity
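The difference between the two approaches can be seen with a back-of-the-envelope model. This is an illustrative simplification, not S-Client's code: the function names are hypothetical, and each closed-loop client is assumed to have exactly one request outstanding at a time.

```python
def closed_loop_request_rate(num_clients, response_time, think_time=0.0):
    """Requests/sec offered when each client waits for a reply before
    sending its next request (one outstanding request per client)."""
    return num_clients / (response_time + think_time)


def open_loop_request_rate(target_rate, response_time):
    """Requests/sec offered when arrivals ignore server state.

    response_time is accepted only to make the contrast explicit;
    it has no effect on the offered load.
    """
    return target_rate
```

In the closed-loop model, a server that slows from 10 ms to 100 ms per response sees the offered load from 10 clients fall by a factor of ten, so the generator backs off exactly when the server is struggling. The open-loop rate stays put, which is what allows it to drive a server into overload.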


S-Client (cont)

• How is s-client open-loop?
  – connecting asynchronously at a particular rate
  – using non-blocking connect() socket call
• Connect complete within a particular time?
  – if yes, continue normally.
  – if not, socket is closed and new connect initiated.
• Other details:
  – uses single-address-space event-driven model like Flash
  – calls select() on large numbers of file descriptors
  – can generate large loads
• Problems:
  – client capacity is still limited by active FDs
  – “arrival” is a TCP connect, not an HTTP request

www.cs.rice.edu/CS/Systems/Web-measurement
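The connect-with-deadline step described on this slide can be sketched with a non-blocking socket and select(). This is a rough illustration, not S-Client's implementation: the function name is hypothetical, it handles a single socket rather than many, and it assumes POSIX semantics for non-blocking connect.

```python
import errno
import select
import socket


def connect_with_deadline(host, port, timeout_s):
    """Start a non-blocking connect; return the socket if it completes
    within timeout_s seconds, otherwise close it and return None."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setblocking(False)
    err = sock.connect_ex((host, port))
    if err not in (0, errno.EINPROGRESS, errno.EWOULDBLOCK):
        sock.close()
        return None
    # A connecting socket becomes writable once the handshake finishes or fails.
    _, writable, _ = select.select([], [sock], [], timeout_s)
    if not writable or sock.getsockopt(socket.SOL_SOCKET, socket.SO_ERROR) != 0:
        sock.close()  # deadline missed or connect failed: abandon this attempt
        return None
    return sock
```

A caller would invoke this at the target connection rate and simply start a fresh attempt whenever `None` comes back, so a slow server does not throttle the rate at which new connections are offered.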


TPC-W

• Transaction Processing Performance Council (TPC)
  – More known for database workloads like TPC-D
  – Metrics include dollars/transaction (unlike SPEC)
  – Provides specification, not source
  – Meant to capture a large e-commerce site
• Models online bookstore
  – web serving, searching, browsing, shopping carts
  – online transaction processing (OLTP)
  – decision support (DSS)
  – secure purchasing (SSL), best sellers, new products
  – customer registration, administrative updates
• Has notion of scaling per user
  – 5 MB of DB tables per user
  – 1 KB per shopping item, 25 KB per item in static images
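The per-user and per-item scaling figures above translate into a quick data-size estimate. This is arithmetic on the slide's numbers only, not the TPC-W specification's sizing rules; the function name, the dictionary keys, and the use of binary units (1 KB = 1024 B) are assumptions.

```python
KB = 1024       # assumption: binary units
MB = 1024 * KB


def tpcw_data_estimate(num_users, num_items):
    """Rough data footprint from the slide's per-user and per-item figures."""
    return {
        "db_tables_bytes": num_users * 5 * MB,       # 5 MB of DB tables per user
        "item_records_bytes": num_items * 1 * KB,    # 1 KB per shopping item
        "static_images_bytes": num_items * 25 * KB,  # 25 KB of images per item
    }
```

Even at modest user counts the per-user database tables dominate the footprint, which is consistent with the large setups described for this benchmark.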


TPC-W (cont)

• Remote browser emulator (RBE)
  – emulates a single user
  – send HTTP request, parse, wait for thinking, repeat
• Metrics:
  – WIPS: shopping
  – WIPSb: browsing
  – WIPSo: ordering
• Setups tend to be very large:
  – multiple image servers, application servers, load balancer
  – DB back end (typically SMP)
  – Example: IBM 12-way SMP w/DB2, 9 PCs w/IIS: $1M

www.tpc.org/tpcw