Web Proxy Traces

CACHING CHARACTERISTICS OF INTERNET AND INTRANET WEB PROXY TRACES

Arthur Goldberg, Ilya Pevzner, Robert Buff
Computer Science Department
Courant Institute of Mathematical Science
New York University
{artg, pevzner, buff}@cs.nyu.edu
www.cs.nyu.edu/artg, www.cs.nyu.edu/phd_students/{pevzner, buff}

This paper studies the caching characteristics of HTTP requests and responses that pass through production Web proxies. We evaluate caching opportunities and problems. Traces with 5.9 million entries from a large Internet Service Provider (ISP) and 2.0 million entries from an Intranet firewall are studied. We find maximum cache hit rate opportunities of about 40% for an ISP and 70% for an Intranet. Cache size needs and document residence times are also examined.

1. Introduction

Caching proxies [LUOT94] play an important role in IP networks, enhancing security, conserving bandwidth and potentially improving performance by reducing response time (e.g. [ABRA95] and [HAMI98]). Designing and deploying proxies is challenging work because proxies are subject to complex inputs which can be summarized as a request stream and associated responses. In this paper we characterize realistic proxy inputs to help people who design proxies and people who architect and operate networks with proxies.

2. Proxy Data Sources

We have analyzed proxy traces collected in a large ISP and a large intranet. We now discuss these environments.

ISP

Prodigy Internet, the world's 5th largest ISP with approximately 450,000 users, maintains a United States network of eight proxy servers. When a user logs onto Prodigy their browser is configured to use proxy.prodigy.net as a proxy. A custom DNS server resolves the domain name proxy.prodigy.net to the IP address of one of the eight proxies. A user remains connected to the same proxy for the duration of their session.
The resolution alternates among the proxies in a round-robin fashion. Thus, each proxy services users from the entire United States.

A typical proxy server supports about 500 unique clients (averaged over a 5-minute period) with an average load of 30 requests per second during peak hours, or approximately 10^6 requests per day. Between the fall of 1996 and now (summer 1998) the proxies have been implemented with Netscape 2.5 on IBM RS/6000 systems running AIX 4.1 and equipped with 256 MB of RAM and three 4 GB disks. Each proxy has been configured with a cache of 5.5 GB spread over the 3 disks. Other relevant cache configuration parameters [NETS97] have been set as follows:

max-uncheck [1] = 21600 seconds (6 hours)
lm-factor [2] = 0.1
term-percent [3] = 80%

We analyzed traces from the proxy proxy3.ykt.prodigy.net located in Yorktown, New York. We examined traces collected between June 8 and 13, 1998. This log has very few entries for Tuesday, June 9, suggesting that the proxy server was down or being maintained on that day. The log consisted of 5.9 million entries, or about 983,300 entries per day. The traces were in Netscape Extended-2 log format [NETS97] with a record for each request-response pair containing the following fields, among others:

[1] max-uncheck is the maximum time allowed between consecutive up-to-date checks.

[2] lm-factor is used to estimate the duration for which the document will remain unchanged, which is known as the expiration time. The estimated expiration time is given by the time elapsed since the last modification multiplied by lm-factor. Note that this only guesses how long a document might remain up-to-date.

[3] If the client interrupts a response when a document has been only partly retrieved from the server, then the proxy attempts to complete the retrieval if at least term-percent of the document has already been retrieved. Otherwise, the proxy closes the server connection and removes the partial file.
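
To make the three cache parameters concrete, the following is a minimal Python sketch of how they could be applied, using the values quoted above. It is an illustration under stated assumptions, not the Netscape proxy's actual implementation: the class and function names are hypothetical, and capping the estimated expiration time at max-uncheck is our reading of footnote 1 rather than something the configuration documentation is quoted as saying here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CacheConfig:
    """Parameter values quoted in the text; field names are illustrative."""
    max_uncheck: float = 21600.0  # seconds (6 hours)
    lm_factor: float = 0.1        # fraction of the document's age
    term_percent: float = 80.0    # percent of the document


def estimated_expiration(seconds_since_modified: float, cfg: CacheConfig) -> float:
    """Footnote 2: time elapsed since last modification times lm-factor."""
    return seconds_since_modified * cfg.lm_factor


def freshness_lifetime(seconds_since_modified: float, cfg: CacheConfig) -> float:
    """Seconds a cached copy is served without an up-to-date check.

    ASSUMPTION: max-uncheck acts as an upper bound on the estimate,
    since it is the maximum time allowed between consecutive checks.
    """
    return min(estimated_expiration(seconds_since_modified, cfg), cfg.max_uncheck)


def should_complete_retrieval(bytes_received: int, total_bytes: int,
                              cfg: CacheConfig) -> bool:
    """Footnote 3: finish an interrupted fetch only if at least
    term-percent of the document has already been retrieved."""
    return bytes_received * 100 >= cfg.term_percent * total_bytes


if __name__ == "__main__":
    cfg = CacheConfig()
    ten_days = 10 * 24 * 3600
    print(estimated_expiration(ten_days, cfg))
    print(freshness_lifetime(ten_days, cfg))
    print(should_complete_retrieval(800, 1000, cfg))
```

Under these assumptions, a document last modified ten days ago gets an estimated expiration time of about one day, which the 6-hour max-uncheck setting then limits, while a document modified an hour ago is rechecked after about six minutes.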
