Copyright © 2005 John Wiley & Sons, Ltd.
Int. J. Network Mgmt 2005; 15: 89–102. Published online 26 January 2005 in Wiley InterScience (www.interscience.wiley.com). DOI: 10.1002/nem.546

Understanding and reducing web delays

Kevin Curran*,† and Connor Duffy

Over the years the number of web users has increased dramatically, unfortunately leading to the inherent problem of congestion. This can affect each user's surfing experience. This paper investigates download times associated with a web request, identifies where delays occur, and provides guidelines which can be followed by web developers to enable a faster and more efficient download and service for their users. Copyright © 2005 John Wiley & Sons, Ltd.

Kevin Curran (BSc, PhD) is a lecturer at the University of Ulster in Northern Ireland. He has written over 100 academic research papers on areas such as distributed computing, emerging trends within wireless ad-hoc networks, dynamic protocol stacks, and mobile systems.

Connor Duffy is an IT consultant in Londonderry, Northern Ireland. He graduated from the University of Ulster in 2003. His areas of expertise include networking, streaming media and Internet congestion.

*Correspondence to: Kevin Curran, School of Computing and Intelligent Systems, University of Ulster, Magee Campus, Northern Ireland BT47 3QL, UK.
†E-mail: [email protected]

Introduction

Internet statistics from NUA [1] imply, with an 'educated guess', that there are now over 605 million users on the Internet worldwide. If only 5% of these users were online at any one time, that still means there are 30 million people online. This will clearly lead to great delays, which are a major factor affecting the performance of a site and ultimately a site's success [2]. This mostly affects small companies on the net, because most of the major players have already established a name for themselves and delays most likely would not put off their customers. This is why it is so important for the smaller companies to make every effort to ensure their site is of a high quality and download times are kept to a minimum.

When a user launches a browser and requests an action to be performed, the browser interprets the request. It sends information to the appropriate site server where the requested information is stored, and this site server sends the information back. The Internet is actually a packet-switching network which sends requests via packets of data (datagrams). Each packet contains the IP addresses of the sender and receiver, and the information being requested. Any one request can consist of more than one packet, because each packet is of a fixed size and some requests need more than this; such a request must be broken up into the appropriate number of packets. The route taken to obtain the requested information depends on the sender's geographical location and that of the receiver. If there are a lot of packets along a certain route they will be queued, or will find a different route, until they reach their destination [3]. The destination cannot send any information until all the associated packets have been received.


When all the packets have been received, the destination sends the requested information back in packets via routers to the sender. Again the same problem arises in relation to routes and bandwidth [2]. The Transmission Control Protocol (TCP) is a communication protocol which enables two hosts to establish a connection and exchange streams of data [4]. It sends information to a client/server and must receive a response or it will send the information again. The client TCP breaks down the information into smaller packets and numbers them; the server TCP then reconstructs them so the original data can be viewed. When a user types a URL into the address box and presses 'go', a DNS look-up is performed. When the DNS look-up is complete, the client connects to the server via a TCP connection. The request is broken down into equal-sized packets. Each packet, containing the IP address of the destination server, is then passed through the Internet via routers. Each router resolves the next router in line using a routing algorithm. This process is repeated by every router until the destination server is reached with the request. The server then responds to the requesting client in the same way until the request has been fulfilled [5]. Between the server and the client is the most likely place for congestion to occur. This is why it is so important that the size of the information the user has requested should be as small as possible.
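As an illustration of the stages described above, the following minimal sketch (not from the paper; the target host is a placeholder) uses only the Python standard library to time the DNS look-up, the TCP connection and the page transfer separately:

    # Sketch: time the DNS look-up, TCP connect and transfer of one page.
    import socket
    import time
    import http.client

    host = "www.example.com"   # hypothetical target site

    t0 = time.perf_counter()
    ip = socket.gethostbyname(host)            # DNS look-up
    t1 = time.perf_counter()

    conn = http.client.HTTPConnection(ip, 80, timeout=10)
    conn.connect()                             # TCP three-way handshake
    t2 = time.perf_counter()

    conn.request("GET", "/", headers={"Host": host})
    body = conn.getresponse().read()           # server response, delivered in packets
    t3 = time.perf_counter()
    conn.close()

    print(f"DNS look-up : {t1 - t0:.3f} s")
    print(f"TCP connect : {t2 - t1:.3f} s")
    print(f"Transfer    : {t3 - t2:.3f} s ({len(body)} bytes)")

Run against different servers, the three timings give a rough picture of where the delay in a single request is being spent.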

When placing images on the Internet it is important that they are first optimized.

—Image Compression—

When placing images on the Internet it is important that they are first optimized. This does two things: it compresses the image and ensures its colours are web safe (can be displayed properly). Compressing an image removes redundant data from the image. Without compression, images remain large and are quite often 'bottlenecks', taking too long to download. For example, if a company is selling products online and has not compressed its images, the extra download time required can alienate a user and cause the company that owns the website great problems. There needs to be a trade-off between quality and download time if the company is to succeed. This can also indirectly affect other web users who may not even be visiting this website, as large images may cause congestion along the channels their packets are travelling.
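The kind of optimisation described here can be done with any image tool. As an illustrative sketch (assuming the Pillow library is installed and using a hypothetical file name), re-saving a JPEG at a lower quality setting and comparing sizes shows the trade-off directly:

    # Sketch: compress a JPEG and report the saving.
    from PIL import Image
    import os

    original = "product.jpg"        # hypothetical image file
    compressed = "product_opt.jpg"

    img = Image.open(original)
    # Re-save with a lower JPEG quality and optimised Huffman tables;
    # the quality value is the quality/size trade-off discussed above.
    img.save(compressed, "JPEG", quality=70, optimize=True)

    before = os.path.getsize(original)
    after = os.path.getsize(compressed)
    print(f"{before} bytes -> {after} bytes "
          f"({100 * (before - after) / before:.1f}% smaller)")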

—Caching and its Architectures—

Caching is a technique used to store popular documents closer to the user. It uses algorithms to predict users' needs for specific documents and stores the documents it deems important. Caching can occur anywhere within a network: on the user's computer, at an ISP or at a server. Many companies also use web proxy caches to serve frequently accessed pages to their employees. This can reduce the bandwidth the company requires, which in turn can lower costs [6]. Another example of caching can be seen at Google, one of the world's most popular search engines, which uses caching to provide its users with the documents they need much faster than without caching (sometimes in under half a second). Web-cache performance is directly proportional to the size of the client community [2]: the larger the client community, the greater the possibility of cached data being requested and therefore the better the cache's performance. The main drawback with caching is that in most cases cached documents may never be viewed again. Caching a document can also cause other problems, as most documents on the Internet change over time when they are updated. Static and dynamic caching are two technologies which use caching to combat this problem in order to reduce download time and congestion. Static caching stores the content of a webpage which does not change; there is no need to repeatedly request the same information over and over again, as this just wastes time. This is an excellent approach to combat congestion. Dynamic caching is slightly different. It determines whether the content of a page has changed by checking for updates. If the contents have changed, it will then store the updated version [7]. This can unfortunately lead to congestion, and so is possibly not a very good approach, as it requires checks to be performed on the source of the data to establish whether or not an update is necessary. If used properly, these two technologies can work together to reduce latency and congestion. Prefetching is an intelligent technique used to reduce perceived congestion. It tries to predict the next page or document a user might want to access [8]. For example, if a user is on a page with many links, the prefetching algorithm will predict that the user may want to view associated links within that page. The prefetcher will then request the pages it predicts the user will try to access next and store them until the user does actually request them. It will then display the page significantly faster than if the user had requested the page without prefetching. The only real drawback is that if the user does not request the pages the prefetching algorithm retrieves, congestion may have been caused needlessly.
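The validation step behind dynamic caching can be illustrated with an HTTP conditional request. The sketch below (not the authors' code; the URL is a placeholder) re-fetches a page only if its ETag has changed, otherwise the cached copy is reused:

    # Sketch: conditional GET, the check a dynamic cache performs on the origin server.
    import urllib.request
    import urllib.error

    url = "http://www.example.com/"   # hypothetical page

    first = urllib.request.urlopen(url)
    cached_body = first.read()
    etag = first.headers.get("ETag")          # validator returned with the first response

    req = urllib.request.Request(url)
    if etag:
        req.add_header("If-None-Match", etag)  # ask for the page only if it changed
    try:
        fresh = urllib.request.urlopen(req)
        cached_body = fresh.read()             # content changed: refresh the cache
        print("Page changed, cache updated")
    except urllib.error.HTTPError as e:
        if e.code == 304:                      # Not Modified: serve the cached copy
            print("Page unchanged, served from cache")
        else:
            raise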

Evaluation of Page Download Delay

The next stage of this study was to investigate web delays. This paper analyses the performance of 35 sites on the Internet (10 US sites, 10 UK sites, 10 other worldwide sites and 5 Northern Ireland sites) using methodical tests. The University has a 10 Mbps connection to the UK Universities SuperJANET network. Additionally, the sites used had to be sites that were not likely to be offline at any stage, as the results would not be reliable if the sites were offline. The tests were conducted between the hours of 4 am and 6 am, when the network load was as light as could be expected. The paper investigates the effects of Internet congestion by analysing site download time, site content and other technologies used by 35 websites worldwide. Analysing these sites involved timing each site's homepage download from initial request until the transfer was complete. The reason for choosing only the homepage was that this should be representative of the whole site. Three types of request are used:

(1) Normal Request. No information has been requested from the site before, therefore the computer will have no files stored from the site. (The cached files and histories were cleared hourly.)

(2) Using Pop-up Filter. This blocks any pop-up windows, such as advertisements and other links, which may hinder the performance of the requested webpage, and thus should decrease the download time of the site. The software used was FilterGate 4.0.

(3) Cached Request. After the normal request the computer will have cached files from the site. This should mean the computer will process this request faster by displaying the site's static content faster. However, any dynamic content will have to be retrieved from the server.

This was done hourly for a week. The next part of the research involved analysing the content of the sites. This meant examining each site's content, from images to text. The images were checked for size and use of compression. All the images were compressed, using Adobe Photoshop 7.01, where possible, and compared to the original image to see if appropriate compression had been used. A new size was only recorded if the size was decreased and the quality of the image was not reduced. All the results were recorded and analysed to identify why delays were occurring and how delays could be avoided. Additional information was also gathered, using NeoTrace Pro 3.25, about the location of the server where each site is stored.
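The authors do not publish their measurement scripts, but the style of test described here can be sketched as follows (the site list and log file name are placeholders): each homepage is fetched in full and the elapsed time logged, so that hourly averages can be computed later.

    # Sketch: time full homepage downloads and append the results to a log.
    import time
    import urllib.request
    import csv

    sites = ["http://www.google.com", "http://www.bbc.co.uk"]  # subset only

    with open("download_times.csv", "a", newline="") as log:
        writer = csv.writer(log)
        for url in sites:
            start = time.perf_counter()
            with urllib.request.urlopen(url, timeout=30) as response:
                size = len(response.read())        # full transfer, as in the tests
            elapsed = time.perf_counter() - start
            writer.writerow([time.strftime("%Y-%m-%d %H:%M"), url, size, elapsed])
            print(f"{url}: {elapsed:.2f} s for {size} bytes")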

Below is a subset of the sites this research used in testing. Examples of sites in the USA were ABC (www.abc.com), AOL (www.aol.com), and Google (www.google.com). Sites in the UK included the BBC (www.bbc.co.uk), Dabs (www.dabs.com) and Streets Online (www.streetsonline.co.uk). Other worldwide sites included Australia (www.csu.edu.au), China (www.edu.cn) and Japan (www.nikkei.co.jp). Local sites included Kainos (www.kainos.co.uk), Singularity (www.singularity.co.uk) and Ulsterbus (www.ulsterbus.co.uk).

Tests were carried out hourly over a period of 7 days. Each of the 35 websites was tested using a normal request (no cache or pop-up filter), with the cache enabled, and with the pop-up filter. One of the most important points was to carry out the tests in a consistent manner. The computer's cache was checked to make sure it only worked as and when it was required, and the pop-up filter was turned on and off as appropriate; this means that all the results are accurate. All the information for the 35 sites was recorded. The results have been broken down into categories to help examine the main findings more closely. The categories are: 1. Normal Downloads versus Web Cache Downloads, 2. Normal Downloads versus Pop-up Filter Downloads, 3. Download Time versus Image Size, 4. Download Time versus Total Site Size, 5. Original Image versus Compressed Image, and 6. Total Site Size before Compression versus after Compression. Table 1 is a cross-reference for all the graphs used to display the results.

—Normal versus Web Cache Downloads—

This section looks at the performance of web-cached downloads against normal downloads to see what effects, if any, a web cache has on the download time. This will give an insight into caching and help to identify design techniques for web developers.

Figure 1 displays the average of the normal downloads against the average of the web-cache downloads for the UK. The majority of the sites have significantly reduced their download times; however, two of the sites have not. These are www.8over8.com (local) and www.dabs.com (UK). On analysis of the content of these sites it is not immediately obvious why they have not performed well. They both have a smaller number of images, a smaller total size of images and a smaller HTML file size than some other sites which have performed better.


Figure 1. UK—Normal download versus web cache. Download time ---, web cache time --- (download time in secs, per site)

Table 1. Reference key for results

WebSite                  Abbreviation    WebSite                   Abbreviation
8over8.com               8ov             abc.com                   abc
amd.com                  amd             americanmail.com          ame
aol.com                  aol             bahn.de                   bah
bbc.co.uk                bbc             bet365.com                bet
buyagift.co.uk           buy             clarendonmanor.co.uk      cla
csu.edu.au               csu             dabs.com                  dab
danskirsk.dk             dan             edu.cn                    edu
google.com               goo             intel.com                 int
kainos.co.uk             kai             kelkoo.co.uk              kel
louvre.fr                lou             martins-seafresh.co.uk    mar
mlbapparel.com           mlb             navyseals.com             nav
nic.it                   nic             nikkei.co.jp              nik
questionmarket.com       que             sevengatesdesigns.com     sev
singularity.co.uk        sin             streetsonline.co.uk       str
ulsterbus.co.uk          uls             unam.mx                   una
yahoo.com                yah             wtr.ru                    wtr

8over8 actually has a low normal download time and its web cache time is also low. Dabs, however, was slightly higher than most of the other sites. This may be due to the fact that it has 56 images on the site, which must take longer to process.

The results for the Rest of the World (Figure 2) are much the same as those found for the UK: the web cache appears to improve the download time. Only three of these twenty sites have not performed as well as the others. These were www.edu.cn, www.intel.com and www.nikkei.co.jp, and this is only because their original download times were quite high. Interestingly, the sites which performed the best from both the UK and the Rest of the World when using the web cache were sites which had mostly images and very small HTML files with respect to the other sites. These sites were www.clarendonmanor.co.uk, www.google.com, www.nic.it, www.sevengatesdesigns.com and www.wtr.ru (Table 2).

It was also interesting to see that the fastest download for the web cache was www.clarendonmanor.co.uk, which had the largest size of images. This site also has no text to display; its HTML contains only tags to display the images. The other three sites have some text to display, but not a lot when compared to the other websites used in this paper. Of these five sites, www.google.com is the slowest to display when cached. This could be because its HTML file is slightly larger than all but that of www.sevengatesdesigns.com, and also because the code required to carry out the search for the search engine is a bit more complex. These findings do seem to indicate that images play a major role in download times.

—Normal Download versus Pop-up Filter—

This section looks at how the use of a pop-up filter can affect the download time of a site.

The results here were very interesting (Figure 3). The pop-up filter had no visible effect on decreasing the download time. It did, however, noticeably increase the download time for www.buyagift.co.uk.


Figure 2. Rest of the World—Normal download versus web cache (download time in secs, per site)

Table 2. Best performance using cache

Website                  No. of images   Size of images (kb)   HTML file (kb)   Cached time (s)
clarendonmanor.co.uk     6               82.5                  2.93             0.17
google.com               1               8.35                  3.67             1.65
nic.it                   6               21.3                  2.4              0.55
sevengatesdesigns.com    5               38.9                  4.1              1.35
wtr.ru                   3               20.5                  1.8              0.88

This is perhaps because this site was displayed in frames, and the pop-up filter may have treated the HTML documents being displayed in these frames as pop-up windows. The results from the worldwide sites were the same: the pop-up filter does not have much effect on download time, and it even increased the download time for one of the sites.

—Download Time versus Image Size—

Here we examine the effect of image size on download time. This is important as it will indicate the importance of image use on sites. All images were of the JPEG format.

Figure 4 displays all the results for download time against image size. It is apparent from these results that download time is affected by image size; there is a positive-gradient curve which represents this. Figure 5 displays the total site size against download time. Again there is a positive-gradient curve present, which indicates that as site content increases so too does download time. There are also a few anomalies present, but again this was expected; these anomalies are exactly the same anomalies that were identified in Figure 4 and have already been shown to be insignificant.

—Original Image versus Compressed Image—

This section looks at the effect image compression has on the total image content of the sites investigated. This is a very important section as it aims to highlight the importance of image compression, and it should provide details about its use.

Figure 6 displays the results for the UK sites of original image sizes for each site against compressed image sizes.


Figure 3. UK—Normal download versus pop-up filter. Download time ---, Pop-up filter time --- (download time in secs, per site)

Figure 4. Download time (secs) versus image size (kb)

Figure 5. Total site size (kb) versus download time (secs)

There is an obvious pattern emerging: the larger the image content of each site, the greater the compression achieved. Five sites here have revealed the use of good image compression. These are www.bbc.co.uk, www.bet365.com, www.dabs.com, www.kainos.co.uk and www.singularity.co.uk. The trend was also similar for the worldwide sites. Out of those twenty sites, four performed well. These were www.google.com, www.louvre.fr, www.unisa.ac.za and www.yahoo.com.

—Site Content before Compression versus after Compression—

This section looks again at the effects of image compression. This time the whole content of each site has been assessed to see how the compressed images would change the site size. This is significant as it has already been shown that download time is affected by both the image size and the overall site content.

Figure 7 shows the total site content, including images and HTML files, for the UK before and after the compression of the images. These results further show that the greater the site content, the higher the compression ratio achieved. This means that, out of this sample of fifteen sites from the UK, seven have performed well in their use of compression on images. The trend was the same for the worldwide sites.

Evaluation of Results

To evaluate the research carried out and the results found by this paper, the overall performance of each site in the categories mentioned must be taken into consideration. All the results seem to indicate the importance of image size and compression for download time. Appropriate categories have been chosen to evaluate the results and identify the best and worst sites.


Figure 6. UK—original image versus compressed image (kb). Original image ---, compressed image ---

Figure 7. UK—Site before versus after compression (kb). Site before compression ---, site after compression ---

—Best Performing Sites—

Three categories will be used to identify the best sites. These categories are Download Times (Normal), Cacheability, and Image Compression. The best sites will be those which have performed well in at least two of these categories, because sites may achieve good results in one category and not in others. The download times for all the samples vary from 3 seconds to 25 seconds. In order to decide which sites were the best for download time, criteria had to be set: 0–5 seconds = Excellent, 5–10 seconds = Good, 10–12 seconds = Average, and 12–25 seconds = Poor. For a site to be classed as one of the best it must fit into the excellent range (0–5 seconds). Table 3 displays the download times for all the sites.
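Applied in code, the banding looks like the following sketch (the sample values are taken from Table 3; boundary cases are assigned to the lower band):

    # Sketch: the paper's download-time quality bands.
    def rate(seconds: float) -> str:
        """Map an average download time to the quality bands defined above."""
        if seconds <= 5:
            return "Excellent"
        if seconds <= 10:
            return "Good"
        if seconds <= 12:
            return "Average"
        return "Poor"

    samples = {"google.com": 2.97, "bbc.co.uk": 4.35,
               "dabs.com": 7.35, "nikkei.co.jp": 23.80}   # averages from Table 3
    for site, t in samples.items():
        print(f"{site:15s} {t:5.2f} s  {rate(t)}")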

Nine sites were identified as excellent. These were www.8over8.com, www.aol.com, www.bbc.co.uk, www.google.com, www.louvre.fr, www.nic.it, www.yahoo.com, www.sevengatesdesigns.com and www.singularity.co.uk. Another important measure was caching. Many users will revisit the same sites time and time again, and computers can store files from various sites in their internal cache for re-use when a user requests information from the same location. This paper discovered that a web cache can notably decrease the download time for many sites. Five sites were identified that performed very well when a cache was used. These sites were www.clarendonmanor.co.uk, www.wtr.ru, www.google.com, www.nic.it and www.sevengatesdesigns.com.

An important point is that sites cannot be judged simply by their size, since their content differs; they can, however, be judged on their use of image compression. The use of image compression and its effects on overall site content have been analysed. The sites which performed the best are those which achieved a low overall difference in site size when comparing size before compression with that after compression. Twelve sites were identified that fit into this category. They are: www.8over8.com, www.bbc.co.uk, www.bet365.com, www.com3.org, www.dabs.com, www.edu.cn, www.google.com, www.kainos.co.uk, www.louvre.fr, www.singularity.co.uk, www.unisa.ac.za and www.yahoo.com.

The overall best sites are those which appear in two or more of the above categories. Therefore the best sites from this sample are www.8over8.com, www.bbc.co.uk, www.google.com, www.louvre.fr, www.nic.it, www.singularity.co.uk, www.sevengatesdesigns.com and www.yahoo.com. This means eight sites out of the thirty-five in this sample, only 23%, are well developed with download times and users in mind. The overall best site was www.google.com. It is obvious that the key here is simplicity: Google has one small image and a small HTML file to display the site. The techniques used by these sites are apparent: there was good use of compression.


Table 3. Download times for all sample sites

Website                  Download time (s)    Website                   Download time (s)
8over8.com               3.621142857          kainos.co.uk              5.961571429
abc.com                  8.403285714          kelkoo.co.uk              7.060571429
amd.com                  7.301                louvre.fr                 4.932714286
americanmail.com         9.160285714          martins-seafresh.co.uk    10.175
aol.com                  4.442285714          mlbapparel.com            6.767714286
bahn.de                  14.41757143          navyseals.com             13.77457143
bbc.co.uk                4.354571429          nic.it                    3.604857143
bet365.com               5.897571429          nikkei.co.jp              23.79514286
buyagift.co.uk           7.729571429          questionmarket.com        12.77242857
clarendonmanor.co.uk     6.113857143          sevengatesdesigns.com     4.487
com3.org                 5.822857143          singularity.co.uk         4.539857143
countrybookshop.co.uk    7.867428571          streetsonline.co.uk       9.838142857
csu.edu.au               15.48971429          ulsterbus.co.uk           8.131142857
dabs.com                 7.349428571          unam.mx                   15.49042857
danskirsk.dk             7.648571429          unisa.ac.za               11.75942857
intel.com                24.45642857          wtr.ru                    9.843142857
google.com               2.970428571          yahoo.com                 3.363571429

The achievable compression depends on the actual image being used: a large image with lots of different colours will not compress as well as a smaller image with a small range of colours. These sites also had no bulky special effects and small HTML files.

—Worst Performing Sites—

The sites on which caching had the poorest effect were those which did not significantly improve on their original download times and also had a normal download time outside the excellent criteria. These were www.dabs.com, www.edu.cn, www.intel.com and www.nikkei.co.jp.

Again these sites will not be judged on size but on their use of image compression. There were eight sites which fell into this category for poor use of compression. These were www.bahn.de, www.countrybookshop.co.uk, www.martins-seafresh.co.uk, www.navyseals.com, www.nic.it, www.nikkei.co.jp and www.wtr.ru. The overall worst sites are those which fall into two or more of the above categories, i.e. www.bahn.de, www.edu.cn, www.intel.com, www.navyseals.com, www.nikkei.co.jp and www.unam.mx. This means six sites out of thirty-five (17%) from this sample are badly developed. The main characteristics of the worst sites were large images and bad code; some included special effects which increased size and slowed performance. Another important point to realise at this stage is that although a site may be strong in one area it may be weak in others. For example, www.edu.cn appears in the best category for Image Compression but it appears in two of the worst categories, for Download Time and Cacheability.

If any code has been written using automatic coding software it should be checked for redundant tags and other excess code, which increase site size and therefore download time.

—The Effect of HTML Editors on Page Size—

It is known that inefficient coding can result in a delay in the client-side reconstruction of a site. This is true of HTML. Many developers nowadays use time-saving software which can write HTML code for them, such as Dreamweaver and FrontPage; alternatively they can use Notepad, where they have to write the code themselves. A question which must be asked is how efficient these automatic web development software products are in comparison to manual coding. It has been discovered that there are differences in the amount and size of code each of these produces. For example, we produced various web pages using these three pieces of software and discovered that there was a difference in the file sizes of each: Notepad produced the smallest file, followed by Dreamweaver and then FrontPage (Table 4).

This means that both Dreamweaver and FrontPage produced redundant code and resulted in larger files than were needed. Despite the fact that this research was only carried out on a small scale, the results are notable. If more research were carried out on larger sites developed with these types of software products, it may be found that the problem is of a greater order of magnitude (i.e. larger sites mean larger discrepancies and more redundant code). This area is significantly important now as people and technology are moving away from desktop interfaces and towards mobile devices such as PDAs.

Table 4. HTML Editors

                Site 1       Site 2       Site 3
Notepad         268 bytes    669 bytes    1.48 kb
Dreamweaver     548 bytes    998 bytes    1.67 kb
FrontPage       640 bytes    1.9 kb       1.73 kb

Page Design Guidelines

This section details guidelines, identified from the findings of this paper, for web developers on how to optimize websites for transfer over the Internet. Following these guidelines will result in smaller, more efficient file content, which will both reduce the bandwidth required to transfer a site from a server to a client and also reduce reconstruction time. This will therefore reduce the download time observed by the user. The guidelines are as follows:

• All images, irrespective of original size, should be compressed. Image simplicity is also important: avoid using too many colours. Using a small number of colours will result in a much higher compression than an image with a large range of colours. There are trade-offs between quality and speed; this is up to the developer's own judgement.

• Keep content static where possible, as dynamic content will cause a delay while a cache retrieves the dynamic data from the site's server.

• Use technologies which will decrease the download time and the effort required for client-side reconstruction. We tested how DHTML compared in size to a Flash file for a website's introduction. It was discovered that using DHTML code instead of inserting a Flash file into a website decreased the size of the data required to produce exactly the same effect (Table 5, shown after this list).

• The DHTML approach reduces the effort required in developing a complicated Flash file and also lowers the size, therefore reducing the bandwidth required for transferring the file and thus decreasing download time. Also, because it is written in HTML, it will start immediately and does not require the user's browser to load a Flash file.

• Avoid the use of background images where possible and opt for hex colour codes instead. Background images increase site size and take longer to reconstruct than hex codes.

• Locate the server close to the target market. A host server may offer inexpensive domain hosting, but this may mean it takes longer for a site to reach its targeted users.

• If any code has been written using automatic coding software it should be checked for redundant tags and other excess code, which increase site size and therefore download time (a rough check is sketched below).

Table 5. DHTML versus Flash

DHTML                      Flash & HTML
Size = 2.98 kb             Flash file size = 4.31 kb
                           HTML file size = 0.94 kb
                           Total of two files = 5.25 kb
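As a rough illustration of the last guideline, the sketch below (a heuristic, not a validator) estimates how much of a generated HTML file is collapsible whitespace or empty attributes:

    # Sketch: estimate removable bytes in an HTML file produced by an editor.
    import re
    import sys

    def estimate_waste(path: str) -> None:
        with open(path, encoding="utf-8", errors="ignore") as f:
            html = f.read()
        collapsed = re.sub(r"\s+", " ", html)            # collapse runs of whitespace
        collapsed = re.sub(r'\s+\w+=""', "", collapsed)  # drop empty attributes
        saved = len(html) - len(collapsed)
        print(f"{path}: {len(html)} bytes, roughly {saved} bytes removable "
              f"({100 * saved / len(html):.1f}%)")

    if __name__ == "__main__":
        estimate_waste(sys.argv[1])   # e.g. a page exported from an HTML editor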

Conclusion

It was apparent from the results obtained from this research that delays in Internet traffic and congestion can often be attributed to poorly developed sites. There is a direct correlation between site size and download time, thus web developers must be careful when choosing images to place on websites: the findings of this paper identified that image size has a considerable effect on a site's download time. Image size is therefore an important contributing factor to Internet traffic congestion. This paper has shown that 66% of the sites sampled either did not use compression tools at all, or did not use them properly. This was only a random sample of a small number of sites; if further research were carried out, it might be discovered that this problem is worse. An interesting and important finding from the research was the fact that web caches store images more readily than text. Another interesting finding was that pop-up filters have no effect on download time. This means Internet browsers can display multiple windows which have no effect on one another.

References

1. NUA, Internet Trends and Statistics, 'How Many Online?' http://www.nua.ie/surveys/how_many_online/
2. Saiedian M, Naeem M. Understanding and reducing web delays. IEEE Computer Journal, December 2001; 34(12).
3. Idaho State University, Certified Internet Webmaster, 'History of the Internet—Brief Timeline', http://www.homelandweb.com/ciw/history.htm
4. Xilinx, Transmission Control Protocol/Internet Protocol (TCP/IP), http://www.xilinx.com/esp/optical/net_tech/tcp.htm
5. Franklin, Curt, How Routers Work, HowStuffWorks.com, http://www.howstuffworks.com/router.htm
6. Caching Tutorial for Web Authors, http://www.web-caching.com/mnot_tutorial/intro.html
7. Dynamic Caching, foedero.com, http://www.foedero.com/dynamicCaching.html
8. Fan L, Jacobson Q, Cao P, Lin W. Web prefetching between low-bandwidth clients and proxies: potential and performance. Proceedings of the 1999 ACM SIGMETRICS International Conference on Measurement and Modelling of Computer Systems, Atlanta, Georgia, USA; 178–187.





Appendix A. Images and Compression Details for Selected Sites

www.8over8.com

Image    Image size (kb)    Compressed image (kb)    Compression ratio
1        6.15               2.99                     75%
2        0.26               0.26                     0%
3        0.26               0.26                     0%
4        0.34               0.34                     0%
5        0.24               0.24                     0%
6        0.29               0.29                     0%
7        0.23               0.23                     0%
8        4.81               1.97                     75%
9        12.1               10.7                     75%
10       5.32               3.42                     0%
11       0.66               0.66                     0%
12       2.59               2.59                     0%
13       7.46               4.05                     75%
14       8.44               5.42                     75%
15       0.29               0.29                     0%
Total    49.4               33.75

Bandwidth saving % = 31.66

www.abc.com

Image    Image size (kb)    Compressed image (kb)    Compression ratio
1        15.5               11.3                     75%
2        1.21               1.21                     0%
3        1.45               1.45                     0%
4        1.45               1.45                     0%
5        4.14               2.48                     75%
6        3.37               1.57                     75%
7        3.37               1.33                     75%
8        3.44               2.37                     75%
9        0.41               0.41                     0%
10       0.81               0.81                     0%
11       0.49               0.49                     0%
12       0.13               0.13                     0%
13       0.14               0.14                     0%
14       0.32               0.32                     0%
15       0.17               0.17                     0%
Total    35.9               25.74

Bandwidth saving % = 28.29


Appendix B. Download Times for Selected Sites (4 am–6 am)

www.8over8.com

Category        Mon      Tues     Wed      Thu      Fri      Sat      Sun      Average
Normal          4.345    3.891    3.76     4.23     3.05     3.121    2.951    3.621142857
Web cache       3.175    3.632    3.985    4.121    3.854    2.957    2.721    3.492142857
Pop-up filter   3.211    3.931    4.12     3.875    3.921    3.121    2.913    3.584571429

www.abc.com

Category        Mon      Tues     Wed      Thu      Fri      Sat      Sun      Average
Normal          8.469    9.651    7.2      9.754    7.184    9.211    7.354    8.403285714
Web cache       5.809    4.395    4.567    4.256    5.124    6.749    4.921    5.117285714
Pop-up filter   7.525    8.141    7.643    9.121    6.901    7.143    6.031    7.500714286

www.amd.com

Category        Mon      Tues     Wed      Thu      Fri      Sat      Sun      Average
Normal          5.308    10.856   7.649    6.143    6.959    7.891    6.301    7.301
Web cache       3.345    3.143    3.366    4.751    4        3.911    3.658    3.739142857
Pop-up filter   6.103    8.961    7.832    6.517    6.631    5.376    5.978    6.771142857

www.americanmail.com

Category        Mon      Tues     Wed      Thu      Fri      Sat      Sun      Average
Normal          7.453    8.152    10.231   8.954    9.325    10.354   9.653    9.160285714
Web cache       2.502    2.091    3.836    4.056    3.666    4.106    3.345    3.371714286
Pop-up filter   7.891    7.964    8.635    10.721   9.364    10.906   8.972    9.207571429

www.aol.com

Category        Mon      Tues     Wed      Thu      Fri      Sat      Sun      Average
Normal          4.797    5.132    4.235    3.96     4.897    3.874    4.201    4.442285714
Web cache       4.146    4.115    3.214    2.611    3.147    3.136    3.541    3.415714286
Pop-up filter   4.535    4.753    4.351    3.75     5.132    4.135    4.324    4.425714286

www.bahn.de

Category        Mon      Tues     Wed      Thu      Fri      Sat      Sun      Average
Normal          13.251   16.324   14.3     15.743   13.439   14.5     13.366   14.41757143
Web cache       2.223    3.743    3.876    3.505    3.485    5.013    3.769    3.659142857
Pop-up filter   14.231   15.943   13.959   15.432   14.657   14.267   14.311   14.68571429

www.bbc.co.uk

Category        Mon      Tues     Wed      Thu      Fri      Sat      Sun      Average
Normal          2.54     3.61     3.15     5.43     4.29     6.12     5.342    4.354571429
Web cache       1.773    1.592    2.134    3.895    3.176    4.375    3.478    2.917571429
Pop-up filter   3.012    4.12     4.236    3.716    4.363    4.142    5.211    4.114285714


Appendix C. Grouped Details for Selected Sites

Performance

Website                   Location of server   Normal download (avg)   Web cache (avg)   Pop-up filter (avg)
8over8.com                London               3.621142857             3.492142857       3.584571429
abc.com                   Seattle              8.403285714             5.117285714       7.500714286
amd.com                   London               7.301                   3.739142857       6.771142857
americanmail.com          New York             9.160285714             3.371714286       9.207571429
aol.com                   Chicago              4.442285714             3.415714286       4.425714286
bahn.de                   London               14.41757143             3.659142857       14.68571429
bbc.co.uk                 London               4.354571429             2.917571429       4.114285714
bet365.com                London               5.897571429             3.733285714       6.55
buyagift.co.uk            London               7.729571429             5.554142857       10.50371429
clarendonmanor.co.uk      London               6.113857143             0.170142857       6.427285714
com3.org                  London               5.822857143             3.1               5.833571429
countrybookshop.co.uk     London               7.867428571             4.419285714       7.162142857
csu.edu.au                Sydney               15.48971429             6.690857143       15.22385714
dabs.com                  Norway               7.349428571             6.883             7.417857143
danskirsk.dk              Denmark              7.648571429             2.683857143       7.412428571
edu.cn                    Beijing              15.61828571             11.54842857       13.88885714
google.com                California           2.970428571             1.647142857       2.940857143
intel.com                 Los Angeles          24.45642857             15.28042857       23.421
kainos.co.uk              London               5.961571429             3.778             5.938714286
kelkoo.co.uk              Brussels             7.060571429             2.969             7.156857143
louvre.fr                 Paris                4.932714286             3.553428571       5.234428571
martins-seafresh.co.uk    Isle of Man          10.175                  6.533428571       10.22914286
mlbapparel.com            Chicago              6.767714286             3.117428571       6.962
navyseals.com             New York             13.77457143             4.827             12.97785714
nic.it                    Rome                 3.604857143             0.554714286       3.550428571
nikkei.co.jp              Tokyo                23.79514286             12.10342857       22.32442857
questionmarket.com        Toronto              12.77242857             2.601285714       12.78628571
sevengatesdesigns.com     Dublin               4.487                   1.353571429       4.821285714
singularity.co.uk         London               4.539857143             2.864857143       4.306428571
streetsonline.co.uk       London               9.838142857             3.359142857       10.61257143
ulsterbus.co.uk           London               8.131142857             5.526             7.215428571


Performance

Website                   No. of images   Size of images (kb)   New size (kb)   Compressed by %   Text in HTML (kb)   Site kb before compression
8over8.com                15              49.4                  33.758          31.66396761       12.9                62.3
abc.com                   15              35.9                  25.743          28.29247911       54.3                90.2
amd.com                   18              58.6                  32.623          44.32935154       47.3                104.9
americanmail.com          16              46.2                  26.757          42.08441558       24.8                71.0
aol.com                   18              43.7                  30.549          30.09382151       23.3                67
bahn.de                   27              163                   68.8266         57.88953947       59                  222
bbc.co.uk                 17              24                    20.315          15.45280506       34.1                58.1
bet365.com                13              25.9                  19.707          23.74337345       24.4                50.3
buyagift.co.uk            29              53.1                  38.159          28.13747646       30.2                83.3
clarendonmanor.co.uk      6               82.5                  58.265          29.67410984       2.93                85.43
com3.org                  10              47.9                  35.526          25.85775106       17.7                65.6
countrybookshop.co.uk     31              175                   124.619         28.78914286       56                  231
csu.edu.au                29              113                   88.704          21.50088496       12                  125
dabs.com                  56              50.3                  47.393          5.779324056       37.3                87.6
danskirsk.dk              13              53.9                  41.186          23.58812616       30                  83.9
edu.cn                    26              27.4                  24.319          11.01394123       40.7                68.1
google.com                1               8.35                  8.004           4.143712575       3.67                12.2
intel.com                 22              27.8                  21.879          21.29573006       50.7                78.5
kainos.co.uk              27              33                    30.592          7.296969697       29.8                62.8
kelkoo.co.uk              15              33.5                  29.343          12.40895522       18.3                51.8
louvre.fr                 7               23.9                  21.191          11.44588383       6.09                30.8
martins-seafresh.co.uk    9               75.8                  31.685          58.19920844       11.9                87.8
mlbapparel.com            26              75.1                  61.54           17.85904965       14.2                89.3
navyseals.com             26              103                   65.72           36.19417476       29.5                132.5
nic.it                    6               21.3                  11.539          45.673258         2.4                 23.7
nikkei.co.jp              26              97.6                  57.21           41.29297075       84.7                182.3
questionmarket.com        8               19                    13.068          31.29337539       4.6                 23.6
sevengatesdesigns.com     5               38.9                  25.471          34.42070031       4.1                 43
singularity.co.uk         12              29.5                  25.79           12.57627119       19.4                48.9
streetsonline.co.uk       37              47.4                  37.876          20.092827         59.3                106.7
ulsterbus.co.uk           20              114                   102.656         9.950877193       33                  147