Evaluation of Data and Request Distribution Policies in Clustered Servers


Transcript of Evaluation of Data and Request Distribution Policies in Clustered Servers

Page 1: Evaluation of Data and Request Distribution Policies in Clustered Servers

Evaluation of Data and Request Distribution Policies in Clustered Servers

Adnan Khaleel and A. L. Narasimha Reddy

Texas A&M University

adnan,[email protected]

Page 2: Evaluation of Data and Request Distribution Policies in Clustered Servers

2

Introduction

Internet use has skyrocketed: 74 MB/month in ’92, several gigabytes/hour today

The trend can be expected to grow in coming years

Increasing load has placed burdens on hardware and software beyond their original designs

Page 3: Evaluation of Data and Request Distribution Policies in Clustered Servers

3

Introduction (cont’d)

Clustered servers are a viable solution

Page 4: Evaluation of Data and Request Distribution Policies in Clustered Servers

4

Issues in Clustered Servers

Need to present a single server image
– DNS aliasing, magic routers, etc.

Multiplicity in back-end servers:
– How should data be organized on the back-ends?
– How should incoming requests be distributed amongst the back-end servers?

Page 5: Evaluation of Data and Request Distribution Policies in Clustered Servers

5

Issues in Clustered Servers (cont’d)

Data Organization: Disk Mirroring
– Identical data maintained on all back-end servers
– Every machine is able to service requests without having to access files on other machines
– Several redundant machines present, giving good system reliability
– Disadvantages:
  – Inefficient use of disk space
  – Data cached on several nodes simultaneously

Page 6: Evaluation of Data and Request Distribution Policies in Clustered Servers

6

Issues in Clustered Servers (cont’d)

Data Organization (cont’d): Disk Striping
– Borrowed from network file servers
– Entire data space is divided over all the back-end servers
– A portion of a file may reside on several machines
– Improves reliability through parity protection
– For large file accesses, automatic load distribution (see the sketch below)
– Better access times
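A minimal sketch of how striping yields automatic load distribution; the stripe unit size and server count are illustrative assumptions, not values from the talk:

```python
STRIPE_UNIT = 64 * 1024  # bytes per stripe unit (assumed for illustration)
NUM_SERVERS = 4          # back-end servers in the cluster

def server_for_offset(offset: int) -> int:
    """Map a byte offset in the striped data space to a back-end server.

    Consecutive stripe units land on consecutive servers, so a large
    sequential access touches every back-end and the load spreads itself.
    """
    return (offset // STRIPE_UNIT) % NUM_SERVERS

# A 1 MB sequential read is serviced by all four back-ends:
servers = {server_for_offset(off) for off in range(0, 1024 * 1024, STRIPE_UNIT)}
assert servers == {0, 1, 2, 3}
```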

Page 7: Evaluation of Data and Request Distribution Policies in Clustered Servers

7

Issues in Clustered Servers (cont’d)

Locality
– Taking advantage of files already cached in a back-end server’s memory
– For a clustered server system: requests accessing the same data should be sent to the same set of servers

Page 8: Evaluation of Data and Request Distribution Policies in Clustered Servers

8

Issues in Clustered Servers (cont’d)

Distribution vs. Locality?
– Load-balanced system: distribute requests evenly among the back-end servers
– Maximize locality: improve hit rate and response time

Current studies focus on only one aspect and ignore the other

Page 9: Evaluation of Data and Request Distribution Policies in Clustered Servers

9

Request Distribution Schemes

Round Robin Request Distribution

Page 10: Evaluation of Data and Request Distribution Policies in Clustered Servers

10

Request Distribution Schemes (cont’d)

Round Robin Request Distribution (cont’d)
– Requests are distributed in a sequential manner (see the sketch below)
– Results in ideal distribution
– Does not take server loading into account
  – Weighted Round Robin
  – Two-Tier Round Robin
– Cache hits are purely coincidental
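A minimal sketch of plain round-robin dispatch (server names are placeholders); it makes plain why any cache hit is coincidental, since neither load nor file identity influences the choice:

```python
import itertools

class RoundRobinDispatcher:
    """Front-end that hands requests to back-ends strictly in turn."""

    def __init__(self, servers):
        self._cycle = itertools.cycle(servers)

    def dispatch(self, request):
        # The request content is ignored entirely: no load or locality input.
        return next(self._cycle)

dispatcher = RoundRobinDispatcher(["be0", "be1", "be2", "be3"])
print([dispatcher.dispatch(f"/file{i}.html") for i in range(6)])
# ['be0', 'be1', 'be2', 'be3', 'be0', 'be1']
```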

Page 11: Evaluation of Data and Request Distribution Policies in Clustered Servers

11

Request Distribution Schemes (cont’d)

Round Robin Request Distribution (cont’d)
– Every back-end server has to cache the entire content of the server
  – Unnecessary duplication of files in cache
  – Inefficient use of cache space
– Back-ends may see different queuing times due to uneven hit rates

Page 12: Evaluation of Data and Request Distribution Policies in Clustered Servers

12

Request Distribution Schemes (cont’d)

File Based Request Distribution

Page 13: Evaluation of Data and Request Distribution Policies in Clustered Servers

13

Request Distribution Schemes (cont’d)

File Based Request Distribution (cont’d)
– Locality-based distribution
– Partition the file-space and assign a partition to each back-end server (see the sketch below)
– Advantages:
  – Does not suffer from duplicated data in cache
  – Based on access patterns, can yield high hit rates
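One simple way to realize a static file-space partition is to hash the requested path; hashing is our illustrative choice here, since the talk only requires some fixed file-to-server mapping:

```python
import hashlib

SERVERS = ["be0", "be1", "be2", "be3"]  # placeholder back-end names

def backend_for_file(path: str) -> str:
    """Fixed file-space partition: every request for the same file goes to
    the same back-end, so each cache holds only its own partition."""
    digest = hashlib.md5(path.encode()).digest()
    return SERVERS[digest[0] % len(SERVERS)]

print(backend_for_file("/index.html"))  # always the same back-end for this file
```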

Page 14: Evaluation of Data and Request Distribution Policies in Clustered Servers

14

Request Distribution Schemes (cont’d)

File Based Request Distribution (cont’d)
– Disadvantages:
  – How to determine the file-space partitioning?
    – Difficult to partition so that requests load the back-ends evenly
    – Dependent on client access patterns; no one partitioning scheme can satisfy all cases
    – Some files will always be requested more than others
  – Locality is the primary concern; distribution is ignored
    – The hope is that the partitioning achieves the distribution

Page 15: Evaluation of Data and Request Distribution Policies in Clustered Servers

15

Request Distribution Schemes (cont’d)

Client Based Request Distribution

Page 16: Evaluation of Data and Request Distribution Policies in Clustered Servers

16

Request Distribution Schemes (cont’d)

Client Based Request Distribution (cont’d)
– Also locality-based
– Partition the client-space and assign a partition to each back-end server (see the sketch below)
– Advantages and disadvantages similar to file-based:
  – Difficult to find an ideal partitioning scheme
  – Ignores distribution
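The analogous sketch for a static client-space partition; the mechanism in the talk is DNS-based, so hashing the client address is only an illustration of a fixed client-to-server mapping:

```python
import ipaddress

SERVERS = ["be0", "be1", "be2", "be3"]  # placeholder back-end names

def backend_for_client(client_ip: str) -> str:
    """Fixed client-space partition: all requests from one client hit the
    same back-end, so locality comes from clients re-reading their files."""
    return SERVERS[int(ipaddress.ip_address(client_ip)) % len(SERVERS)]

print(backend_for_client("192.0.2.17"))  # same back-end on every request
```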

Page 17: Evaluation of Data and Request Distribution Policies in Clustered Servers

17

Request Distribution Schemes (cont’d)

Client Based Request Distribution (cont’d)
– Slightly modified from the DNS used in the Internet
  – Allows flexibility in the client-to-server mapping
– TTL set during the first resolution
– On expiration, the client is expected to re-resolve the name
– Possibly different TTLs could be used for different workload characteristics
– However, clients ignore the TTL
– Hence a STATIC scheme

Page 18: Evaluation of Data and Request Distribution Policies in Clustered Servers

18

Request Distribution Schemes (cont’d)

Locality Aware Request Distribution (LARD) [5]
– Broadly based on the file-based scheme
– Addresses the issue of load balancing
– Each file is assigned a dynamic set of servers instead of just one server

Page 19: Evaluation of Data and Request Distribution Policies in Clustered Servers

19

Request Distribution Schemes (cont’d)

LARD (cont’d)
– Technique (see the sketch below):
  – On the first request for a file, assign the least loaded back-end
  – On subsequent requests for the same file:
    • Determine the max/min loaded servers in the assigned set
    • If (max loaded server > high threshold, OR a server exists in the cluster with load < low threshold), then add the new least loaded server to the set and assign it to service the request
    • Else assign the min loaded server in the set to service the request
    • If any server in the set has been inactive for longer than time T, remove it from the set
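A minimal sketch of the steps above. The thresholds, the inactivity timeout, and the load bookkeeping are illustrative assumptions; a real front-end would track load from its open connections:

```python
import time

T_HIGH, T_LOW = 8, 2   # load thresholds (assumed; the talk gives no values)
T_INACTIVE = 60.0      # seconds of inactivity before a server leaves a file's set

load = {"be0": 0, "be1": 0, "be2": 0, "be3": 0}  # outstanding requests per back-end
server_set: dict = {}                            # file -> {server: last use time}

def lard_dispatch(filename: str) -> str:
    now = time.time()
    assigned = server_set.setdefault(filename, {})

    # Remove servers inactive for this file longer than T_INACTIVE.
    for s in [s for s, t in assigned.items() if now - t > T_INACTIVE]:
        del assigned[s]

    cluster_least = min(load, key=load.get)
    if not assigned:
        target = cluster_least          # first request: least loaded back-end
    else:
        set_max = max(assigned, key=load.get)
        set_min = min(assigned, key=load.get)
        if load[set_max] > T_HIGH or load[cluster_least] < T_LOW:
            target = cluster_least      # imbalance: grow the set
        else:
            target = set_min            # otherwise stay within the set
    assigned[target] = now
    return target
```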

Page 20: Evaluation of Data and Request Distribution Policies in Clustered Servers

20

Request Distribution Schemes (cont’d)

LARD (cont’d)
– File-space partitioning is done on the fly
– Disadvantages:
  – A large amount of processing needs to be performed by the front-end
  – A large amount of memory is needed to maintain information on each individual file
  – Possible bottleneck as the system is scaled

Page 21: Evaluation of Data and Request Distribution Policies in Clustered Servers

21

Request Distribution Schemes (cont’d)

Dynamic Client Based Request Distribution
– Based on the premise that file reuse among clients is high
– Static client-based distribution is completely ignorant of server loads
– We propose a modification to the static client-based distribution that makes it actively modify the distribution based on back-end loads

Page 22: Evaluation of Data and Request Distribution Policies in Clustered Servers

22

Request Distribution Schemes (cont’d)

Dynamic Client Based (cont’d)
– Uses a time-to-live (TTL) for server mappings within the cluster; the TTL is continuously variable (see the sketch below)
– In heavily loaded systems:
  – RR-type distribution is preferable as queue times predominate
  – TTL values should be small
– In lightly loaded systems:
  – TTL values should be large in order to maximize the benefits of locality
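A minimal sketch of one way to vary the TTL with load. The talk states only the direction (heavy load, small TTL; light load, large TTL); the bounds and the linear interpolation are our assumptions:

```python
TTL_MIN, TTL_MAX = 1.0, 300.0  # seconds; illustrative bounds only

def ttl_for_load(utilization: float) -> float:
    """Map back-end utilization (0.0 idle .. 1.0 saturated) to a mapping TTL.

    High utilization -> short TTL, so mappings expire fast and behavior
    approaches round-robin; low utilization -> long TTL, preserving locality.
    """
    utilization = max(0.0, min(1.0, utilization))
    return TTL_MAX - utilization * (TTL_MAX - TTL_MIN)

print(ttl_for_load(0.9))  # ~31 s: near round-robin under heavy load
print(ttl_for_load(0.1))  # ~270 s: sticky mappings under light load
```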

Page 23: Evaluation of Data and Request Distribution Policies in Clustered Servers

23

Request Distribution Schemes (cont’d)

Dynamic Client Based (cont’d)
– On TTL expiration, assign the client partition to the least loaded back-end server in the cluster
  – If more than one server has the same low load, choose randomly from that set (see the sketch below)
– Allows a server using an IPRP [4] type protocol to redirect a client to another server if it aids load balancing
  – Unlike DNS, clients cannot avoid this mechanism
  – Hence a DYNAMIC scheme
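The reassignment rule on this slide, sketched directly; server names and loads are placeholders:

```python
import random

def reassign_partition(loads: dict) -> str:
    """On TTL expiry, pick the least loaded back-end for the client
    partition, breaking ties randomly as the slide specifies."""
    low = min(loads.values())
    return random.choice([s for s, l in loads.items() if l == low])

print(reassign_partition({"be0": 3, "be1": 1, "be2": 1, "be3": 5}))  # be1 or be2
```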

Page 24: Evaluation of Data and Request Distribution Policies in Clustered Servers

24

Request Distribution Schemes (cont’d)

Dynamic Client Based (cont’d)
– The trend in server load is essential to determine whether the TTL is to be increased or decreased
– Need to average out the requests to smooth out transient activity
– Moving Window Averaging Scheme (see the sketch below)
  – Only requests that arrive within the window period actively contribute towards the load calculation
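A minimal sketch of the moving-window averaging scheme; the window length is an assumed parameter:

```python
from collections import deque

class MovingWindowLoad:
    """Load estimate in which only requests inside the trailing window count."""

    def __init__(self, window: float = 10.0):  # window length in seconds (assumed)
        self.window = window
        self.arrivals = deque()                # request timestamps, oldest first

    def record(self, now: float) -> None:
        self.arrivals.append(now)

    def load(self, now: float) -> float:
        # Evict requests that have slid out of the window, then average.
        while self.arrivals and self.arrivals[0] < now - self.window:
            self.arrivals.popleft()
        return len(self.arrivals) / self.window  # requests/second over the window

w = MovingWindowLoad()
for t in (0.5, 1.0, 1.2, 9.0, 12.0):
    w.record(t)
print(w.load(12.0))  # only the arrivals at 9.0 and 12.0 count -> 0.2 req/s
```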

Page 25: Evaluation of Data and Request Distribution Policies in Clustered Servers

25

Simulation Model

Trace-driven simulation model
– Based on CSIM [8]
– Modelled an IBM OS/2 system for various hardware parameters
– Several parameters could be modified:
  – number of servers, memory size, CPU capacity in MIPS (50), disk access times, network communication time per packet, data organization (disk mirror or stripe)

Page 26: Evaluation of Data and Request Distribution Policies in Clustered Servers

26

Simulation Model (cont’d)

In both disk mirroring and disk striping, data is cached at the request-servicing nodes

In disk striping, data is also cached at the disk-end nodes

Page 27: Evaluation of Data and Request Distribution Policies in Clustered Servers

27

Simulation Model (cont’d)

Traces
– Representative of two arenas where clustered servers are currently used:
  – World Wide Web (WWW) servers
  – Network File System (NFS) servers

Page 28: Evaluation of Data and Request Distribution Policies in Clustered Servers

28

Simulation Model (cont’d)

WEB Trace
– ClarkNet WWW server, an ISP for the metro Baltimore - Washington DC area
– Collected over a period of two weeks
– Original trace had 3 million records
– Weeded out non-HTTP-related records such as CGI and ftp
– Resulting trace had 1.4 million records
– Over 90,000 clients
– Over 24,000 files with a total occupancy of slightly under 100 MBytes

Page 29: Evaluation of Data and Request Distribution Policies in Clustered Servers

29

Simulation Model (cont’d)

WEB Trace (cont’d)
– Records had timestamps with 1-second resolution
  – Did not accurately represent the real manner of request arrivals
  – Requests that arrived in the same second were augmented with a randomly generated microsecond extension (see the sketch below)
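A minimal sketch of that fix-up; generating a uniform sub-second offset at microsecond granularity is our reading of "randomly generated microsecond extension":

```python
import random

def spread_within_second(timestamps):
    """Give records that share a 1-second timestamp a random sub-second
    extension so simulated arrivals are not bunched at second boundaries."""
    return sorted(t + random.randrange(1_000_000) / 1_000_000 for t in timestamps)

print(spread_within_second([770123456, 770123456, 770123457]))
```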

Page 30: Evaluation of Data and Request Distribution Policies in Clustered Servers

30

Simulation Model (cont’d)

NFS Trace
– Obtained from an Auspex [9] file server at UC Berkeley
– Consists of post-client-cache misses
– Collected over a period of one week
– Had 231 clients and over 68,000 files with a total occupancy of 1,292 MBytes

Page 31: Evaluation of Data and Request Distribution Policies in Clustered Servers

31

Simulation Model (cont’d)

NFS Trace (cont’d)
– Original trace had a large amount of backup data at night and over weekends; only daytime records were used in the simulation
– Records had timestamps with microsecond resolution

The cache was allowed to WARM UP prior to any measurements being made

Page 32: Evaluation of Data and Request Distribution Policies in Clustered Servers

32

Results - Effects of Memory Size

NFS Trace, Disk Stripe
– Increased memory = increased cache space

[Figure: response time (secs) vs. memory size (32-512 MBytes) for RR, FB, LD, and CB(d). Response time for 4 back-end servers.]

Page 33: Evaluation of Data and Request Distribution Policies in Clustered Servers

33

Results - Effects of Memory Size

NFS Trace, Disk Stripe
– FB is better at extracting locality
– RR hits are purely probabilistic

[Figure: cache-hit ratio (%) vs. memory size (32-512 MBytes) for RR, FB, LD, and CB(d). Cache-hit ratio for 4 back-end servers.]

Page 34: Evaluation of Data and Request Distribution Policies in Clustered Servers

34

Results - Effects of Memory Size

WEB Trace, Disk Stripe
– WEB trace has a smaller working set
– Increase in memory has less of an effect

[Figure: response time (secs) vs. memory size (32-512 MBytes), grouped by scheme (RR, FB, LD, CB(d)). Response time for 4 back-end servers.]

Page 35: Evaluation of Data and Request Distribution Policies in Clustered Servers

35

Results - Effects of Memory Size

WEB Trace, Disk Stripe
– Extremely high hit rates, even at 32 MBytes
– FB is able to extract maximum locality
– Distribution scheme has less of an effect on response time
– Load distribution was acceptable for all schemes; best RR, worst FB

[Figure: cache hit ratio (%) vs. memory size (32-512 MBytes) for RR, FB, LD, and CB(d). Cache hit rates for a 4 back-end system.]

Page 36: Evaluation of Data and Request Distribution Policies in Clustered Servers

36

Results - Effects of Memory Size

WEB Trace, Disk Mirror
– Very similar to DS
– With smaller memory, hit rates are slightly lower as there is no disk-end caching

[Figure: response time (secs) vs. memory size (32-512 MBytes), disk stripe vs. disk mirror panels. Disk stripe vs. disk mirror.]

Page 37: Evaluation of Data and Request Distribution Policies in Clustered Servers

37

Results - Scalability Performance

NFS Trace, Disk Stripe
– RR shows the least benefit
  – Due to probabilistic cache hits

[Figure: response time vs. number of servers (4, 8, 16), grouped by scheme (RR, FB, LD, CB(d)). Effect of the number of servers on response time (128MB memory).]

Page 38: Evaluation of Data and Request Distribution Policies in Clustered Servers

38

Results - Scalability Performance

NFS Trace, Disk Stripe
– ROUND ROBIN:
  – Drop in hit rates with more servers
  – Less “probabilistic” locality

[Figure: cache hit ratio (%) vs. memory size (32-512 MBytes) for four, eight, and sixteen back-ends. Cache hit rate vs. memory size and number of back-end servers.]

Page 39: Evaluation of Data and Request Distribution Policies in Clustered Servers

39

Results - Scalability Performance

NFS Trace, Disk Mirror
– RR performance worsens with more servers
– All other schemes perform similarly to disk striping

[Figure: response time (secs) vs. number of servers (4, 8, 16), grouped by scheme (RR, FB, LD, CB(d)). Effect of the number of servers on response time (128MB).]

Page 40: Evaluation of Data and Request Distribution Policies in Clustered Servers

40

Results - Scalability Performance

NFS Trace, Disk Mirror
– For RR, lower hit rates with more servers; lower response time
– For RR, disk-end caching offers better hit rates in disk striping than in disk mirroring

[Figure: cache hit ratio (local + remote) vs. number of servers (4, 8, 16), disk stripe vs. disk mirror panels. Cache hit rates for RR under disk striping vs. mirroring (128MB).]

Page 41: Evaluation of Data and Request Distribution Policies in Clustered Servers

41

Results - Effects of Memory Size

NFS Trace, Disk Mirror
– Similar effect of more memory
– Stagnation of hit rates in FB; DM does better than DS due to caching of data at the disk end
– RR exhibits better hit rates with DS than with DM: greater variety of files in cache

[Figure: block hit ratio (%) vs. memory size (32-512 MBytes) for DS RR, DM RR, DS FB, and DM FB. Cache hit rates with disk mirror and disk striping.]

Page 42: Evaluation of Data and Request Distribution Policies in Clustered Servers

42

Results - Disk Stripe Vs Disk Mirror

Implicit distribution of load in disk striping produces low disk queues

[Figure: queue time vs. memory size (32-512 MBytes) for server queue and disk queue, disk stripe vs. disk mirror panels. Queueing time in disk stripe and disk mirror; NFS trace with a 4 back-end system.]

Page 43: Evaluation of Data and Request Distribution Policies in Clustered Servers

43

Conclusion & Future Work

RR: ideal distribution, but poor response times due to the probabilistic nature of cache hit rates

File-based: the best at extracting locality, but complete disregard of server loads and poor load distribution

LARD: similar to FB but with better load distribution

For the WEB trace, cache hit rates were so high that distribution did not play a role in determining response time

Page 44: Evaluation of Data and Request Distribution Policies in Clustered Servers

44

Conclusion & Future Work

Dynamic CB addressed the server-load ignorance of static CB: better distribution on the NFS trace, better hit rates on the WEB trace

Disk striping distributed requests over several servers, relieving disk queues but increasing server queues

Future work:
– Currently evaluating a flexible caching approach with round-robin distribution that can exploit the file-based caching methodology
– Throughput comparisons of the various policies
– Impact of faster processors
– Impact of dynamically generated web page content