CS6320 – Performance more details


Page 1: CS6320 – Performance more details

CS6320 – Performance more details

L. Grewe

1

Page 2: CS6320 – Performance more details

System Architecture

[Diagram: three-tier system architecture — Client → Web Server (Tier 1) → Application Server (Tier 2) → Database Server / DMS (Tier 3)]

Page 3: CS6320 – Performance more details

Performance Desires and Approaches

• Improving performance and reliability to provide
  – Higher throughput
  – Lower latency (i.e., response time)
  – Increased availability

• Some Approaches
  – Scaling/Replication
    • How performance, redundancy, and reliability are related to scalability
  – Load balancing
  – Web caching

3

Page 4: CS6320 – Performance more details

Where to Apply Scalability

• To the network
• To individual servers
• Make sure the network has capacity before scaling by adding servers

4

Page 5: CS6320 – Performance more details

An example… but first, a Hardware Review

• Firewall – Restricts traffic based on rules and can “protect” the internal network from intruders
• Router – Directs traffic to a destination based on the “best” path; can communicate between subnets
• Switch – Provides a fast connection between multiple servers on the same subnet
• Load Balancer – Takes incoming requests for one “virtual” server and redirects them to multiple “real” servers

Page 6: CS6320 – Performance more details

Switch: Connecting More than 2 Machines

Page 7: CS6320 – Performance more details

Case Study: Retail eBusiness

7

[Diagram: initial design — Internet → data circuit → Router → Switch → Web Server and Database Server]

This is the initial design.

PROBLEM: the site is growing and has too many users – performance is inadequate.

Page 8: CS6320 – Performance more details

Solution - Scaling

8

[Diagram: scaled design — Internet → data circuit → Router → Switch → multiple replicated Web Servers and two Database Servers]

Scaling through replication of systems.

Page 9: CS6320 – Performance more details

Initial Redesign

9

[Diagram: initial redesign — Internet → firewall router → load balancer → Web Servers → application firewall → Database Servers, connected through switches (Catalyst) on VLANs 2–4]

Scaling mostly the web servers.

Problem: there is still only one entrance through the firewall for clients – a bottleneck.

Page 10: CS6320 – Performance more details

The Redesign Again:

10

[Diagram: the redesign again — two “connected” paths from the Internet (a primary and a redundant connection), each with a firewall router (primary and backup), load balancers, Web Servers, firewalls (primary and backup), and Database Servers, on Catalyst switches spanning VLANs 1–4]

Last design: still a bottleneck coming in on one path. Here we split into 2 “connected” paths.

Page 11: CS6320 – Performance more details

Performance, Redundancy, and Scalability

• Scale for performance
• But what about redundancy? The site going down.

11

Page 12: CS6320 – Performance more details

How to get rid of Single Points of Failure (SPOF):

12

[Diagram: the redesigned site replicated at two geographic locations, California and New York; each location has a primary and a redundant connection, primary and backup firewall routers, load balancers, primary and backup firewalls, and Catalyst switches on VLANs 1–4]

Problem: in the last design, if services to the single geographical network go down… the site is down.

Answer: replicate in different geographic locations.

Page 13: CS6320 – Performance more details

Scaling Servers: Out or Up

• Scale Out (Horizontal) … we saw this in the previous design
  – Multiple servers
  – Add more servers to scale
  – Most commonly done with web servers

• Scale Up (Vertical)
  – Fewer, larger servers to add more internal resources
  – Add more processors, memory, and disk space
  – Most commonly done with database servers

13

Page 14: CS6320 – Performance more details

Some Approaches to Scalability

• Approaches
  – Farming
  – Cloning
  – RACS
  – Partitioning
  – RAPS
• Load balancing
• Web caching

14

Page 15: CS6320 – Performance more details

Farming

• Farm – the collection of all the servers, applications, and data at a particular site.
  – Farms have many specialized services (i.e., directory, security, http, mail, database, etc.)

15

This is about the HW scaling

Page 16: CS6320 – Performance more details

Simple Web Farm

Page 17: CS6320 – Performance more details

Cloning

• A service can be cloned on many replica nodes, each having the same software and data.

• Cloning offers both scalability and availability.
  – If one is overloaded, a load-balancing system can be used to allocate the work among the duplicates.
  – If one fails, the other can continue to offer service.

17

This is about Service / SW replication

Page 18: CS6320 – Performance more details

Two Clone Design Styles

18

• Shared-Nothing is simpler to implement and scales I/O bandwidth as the site grows.

• Shared-Disk design is more economical for large or update-intensive databases.

Page 19: CS6320 – Performance more details

Reliable Array of Cloned Services (RACS)

• RACS (Reliable Array of Cloned Services) – a collection of clones for a particular service
  – shared-nothing RACS
    • each clone duplicates the storage locally
    • updates should be applied to all clones’ storage
  – shared-disk RACS (cluster)
    • all the clones share a common storage manager
    • the storage server should be fault-tolerant
    • subtle algorithms are needed to manage updates (cache invalidation, lock managers, etc.)

19

Page 20: CS6320 – Performance more details

Clones and RACS

• can be used for read-mostly applications with low consistency requirements
  – i.e., Web servers, file servers, security servers…

• requirements of cloned services:
  – automatic replication of software and data to new clones
  – automatic request routing to load balance the work
  – route around failures
  – recognize repaired and new nodes

20

Page 21: CS6320 – Performance more details

Some definitions - Partitions and Packs

21

• Data Objects (mailboxes, database records, business objects, …) are partitioned among storage and server nodes.
• For availability, the storage elements may be served by a pack of servers.

Page 22: CS6320 – Performance more details

Partition

• grows a service by
  – duplicating the hardware and software
  – dividing the data among the nodes (by object), e.g., mail servers by mailboxes
• should be transparent to the application
  – requests to a partitioned service are routed to the partition with the relevant data (see the sketch below)
• does not improve availability
  – the data is stored in only one place
  – partitions are implemented as a pack of two or more nodes that provide access to the storage

22
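As a concrete illustration of routing requests to the partition that holds the relevant data, here is a minimal Python sketch. The mailbox-to-node mapping by hash and the node names are assumptions for illustration; real partitioned services often use directory lookups or consistent hashing instead.

```python
import hashlib

# Hypothetical partition map: each node owns a slice of the mailbox space.
PARTITION_NODES = ["mail-node-0", "mail-node-1", "mail-node-2", "mail-node-3"]

def node_for_mailbox(mailbox: str) -> str:
    """Route a request to the partition (node) that stores this mailbox."""
    digest = hashlib.md5(mailbox.encode("utf-8")).hexdigest()
    index = int(digest, 16) % len(PARTITION_NODES)
    return PARTITION_NODES[index]

if __name__ == "__main__":
    for mbox in ["alice@example.com", "bob@example.com", "carol@example.com"]:
        print(mbox, "->", node_for_mailbox(mbox))
```

Because the data for a mailbox lives in only one partition, this routing gives scalability but not availability on its own, which is why partitions are usually backed by a pack of nodes.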

Page 23: CS6320 – Performance more details

Taxonomy of Scalability Designs

23

Page 24: CS6320 – Performance more details

Reliable Array of Partitioned Services (RAPS)

• RAPS (Reliable Array of Partitioned Services)
  – nodes that support a packed-partitioned service
  – shared-nothing RAPS, shared-disk RAPS

• Update-intensive and large database applications are better served by routing requests to servers dedicated to serving a partition of the data (RAPS).

24

Page 25: CS6320 – Performance more details

Some Approaches to Scalability

• Approaches
  – Farming
  – Cloning
  – RACS
  – Partitioning
  – RAPS
• Load balancing
• Web caching

25

Page 26: CS6320 – Performance more details

Load Balancing / Sharing

26

Page 27: CS6320 – Performance more details

Load Management

• Balancing loads (load balancer) can operate at different OSI layers– Round-robin DNS– Layer-4 (Transport layer, e.g. TCP) switches– Layer-7 (Application layer) switches

Page 28: CS6320 – Performance more details

The 7 OSI (Open System Interconnection) Layers (a model of a network)

Page 29: CS6320 – Performance more details

Load Balancing Strategies

• Flat architecture– DNS rotation, switch based, MagicRouter

• Hierarchical architecture • Locality-Aware Request Distribution

29

Page 30: CS6320 – Performance more details

DNS Rotation - Round Robin Cluster

30

Page 31: CS6320 – Performance more details

Flat Architecture - DNS Rotation

• DNS rotates IP addresses of a Web site (see the sketch below)
  – treat all nodes equally
• Pros:
  – A simple clustering strategy
• Cons:
  – Client-side IP caching: load imbalance, connection to a down node
• Hot-standby machine (failover)
  – expensive, inefficient
• Switching products
  – Cisco, Foundry Networks, and F5 Labs
  – Cluster servers by one IP
  – Distribute workload (load balancing)
  – Failure detection
• Problem
  – Not sufficient for dynamic content

31
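A minimal sketch of how round-robin DNS rotation spreads clients across nodes: each lookup returns the address list rotated by one position, so successive clients start with a different server. The server IPs and hostname are made up for illustration; a real deployment configures this in the DNS server itself rather than in application code.

```python
from itertools import count

# Hypothetical pool of web-server addresses behind one site name.
SERVER_IPS = ["192.0.2.10", "192.0.2.11", "192.0.2.12"]
_lookup_counter = count()

def resolve_round_robin(hostname: str) -> list[str]:
    """Return the address list rotated by one position per lookup,
    so successive clients try a different first server."""
    shift = next(_lookup_counter) % len(SERVER_IPS)
    return SERVER_IPS[shift:] + SERVER_IPS[:shift]

if __name__ == "__main__":
    for _ in range(4):
        print(resolve_round_robin("www.example.com"))
```

Note how the cons above show up even in this sketch: if a client caches the first answer, it keeps hitting the same (possibly failed) server regardless of later rotations.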

Page 32: CS6320 – Performance more details

Load Balance Idea 2: Switch-based Cluster

32

Page 33: CS6320 – Performance more details

Flat Architecture - Switch Based

• Switching products– Cluster servers by one IP– Distribute workload (load balancing)

• i.e. round-robin

– Failure detection– Cisco, Foundry Networks, and F5Labs

• Problem– Not sufficient for dynamic content

33
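A rough sketch of the switch-based idea: one virtual front end that round-robins connections across real servers and skips servers that fail a health check. The addresses and the TCP-connect health probe are assumptions for illustration; real layer-4 switches implement this in hardware or firmware, not in Python.

```python
import socket

# Hypothetical "real" servers behind one virtual IP.
REAL_SERVERS = [("10.0.0.11", 80), ("10.0.0.12", 80), ("10.0.0.13", 80)]
_next = 0

def is_healthy(addr, timeout=0.5) -> bool:
    """Very simple failure detection: can we open a TCP connection?"""
    try:
        with socket.create_connection(addr, timeout=timeout):
            return True
    except OSError:
        return False

def pick_server():
    """Round-robin over the real servers, skipping ones that fail the check."""
    global _next
    for _ in range(len(REAL_SERVERS)):
        candidate = REAL_SERVERS[_next % len(REAL_SERVERS)]
        _next += 1
        if is_healthy(candidate):
            return candidate
    raise RuntimeError("no healthy servers")
```

Unlike DNS rotation, the decision is made per connection on the server side, so a failed node can be taken out of rotation immediately.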

Page 34: CS6320 – Performance more details

Problems with DNS or Switch Load Balancing

• Problems
  – Not sufficient for dynamic content
  – Adding/removing nodes can be involved
    • Manual configuration required
  – Limited load balancing in the switch
  – Simple algorithms do not consider current loads

34

Page 35: CS6320 – Performance more details

Load Sharing Strategies

• Flat architecture– DNS rotation, switch based, MagicRouter

• Hierarchical architecture • Locality-Aware Request Distribution

35

Page 36: CS6320 – Performance more details

Hierarchical Architecture

• Master/slave architecture
• Two levels
  – Level I
    • Master: static and dynamic content
  – Level II
    • Slave: only dynamic

36

Page 37: CS6320 – Performance more details

Hierarchical Architecture

37

M/S Architecture

Page 38: CS6320 – Performance more details

Hierarchical Architecture

38

Page 39: CS6320 – Performance more details

Hierarchical Architecture

• Benefits (see the sketch below)
  – Better failover support
    • The master restarts the job if a slave fails
  – Separate dynamic and static content
    • Resource-intensive jobs (CGI scripts) are run by slaves
    • The master can return static results quickly

39
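A toy sketch of the master/slave split described above: the master answers static requests directly and hands dynamic (resource-intensive) requests to a slave, retrying on another slave if one fails. The request classification, slave names, and failure model are invented for illustration.

```python
import random

SLAVES = ["slave-1", "slave-2", "slave-3"]          # hypothetical dynamic-content workers
STATIC_FILES = {"/index.html": "<html>home</html>"}  # content the master serves itself

def run_on_slave(slave: str, request: str) -> str:
    # Stand-in for dispatching a CGI-style job to a slave node;
    # fails randomly to exercise the master's failover path.
    if random.random() < 0.2:
        raise ConnectionError(f"{slave} did not respond")
    return f"dynamic result for {request} from {slave}"

def handle(request: str) -> str:
    """Master: serve static content directly, delegate dynamic work to slaves,
    restarting the job on another slave if one fails."""
    if request in STATIC_FILES:
        return STATIC_FILES[request]
    for slave in SLAVES:
        try:
            return run_on_slave(slave, request)
        except ConnectionError:
            continue  # failover: the master retries the job on the next slave
    raise RuntimeError("all slaves failed")

if __name__ == "__main__":
    print(handle("/index.html"))
    print(handle("/cgi-bin/report"))
```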

Page 40: CS6320 – Performance more details

Locality-Aware Request Distribution

• Content-based distribution (see the sketch below)
  – Improved hit rates
  – Increased secondary storage
  – Specialized back-end servers

• Architecture
  – Front-end: distributes requests
  – Back-end: processes requests

40
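A minimal sketch of locality-aware (content-based) request distribution: the front end routes requests for the same URL to the same back end so each back end's cache stays hot, assigning new URLs to the least-loaded node. The back-end names and load counters are assumptions; the actual LARD policy also re-assigns content when a back end becomes overloaded, which this sketch omits.

```python
# Hypothetical back-end pool; the front end keeps a URL -> back-end assignment
# so repeated requests for the same content hit the same (warm) cache.
BACK_ENDS = ["backend-a", "backend-b", "backend-c"]
assignment: dict[str, str] = {}
load = {b: 0 for b in BACK_ENDS}

def route(url: str) -> str:
    if url not in assignment:
        # First time we see this URL: send it to the least-loaded back end.
        assignment[url] = min(BACK_ENDS, key=lambda b: load[b])
    target = assignment[url]
    load[target] += 1
    return target

if __name__ == "__main__":
    for u in ["/a", "/b", "/a", "/c", "/a"]:
        print(u, "->", route(u))
```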

Page 41: CS6320 – Performance more details

Load Sharing Strategies

• Flat architecture– DNS rotation, switch based, MagicRouter

• Hierarchical architecture • Locality-Aware Request Distribution

41

Page 42: CS6320 – Performance more details

Locality-Aware Request Distribution

42

Naïve Strategy

Page 43: CS6320 – Performance more details

Some Approaches to Scalability

• Approaches
  – Farming
  – Cloning
  – RACS
  – Partitioning
  – RAPS
• Load balancing
• Web caching

43

Page 44: CS6320 – Performance more details

Web Caching

44

Page 45: CS6320 – Performance more details

Web Proxy

• Intermediate between clients and Web servers
• It is used to implement firewalls
• To improve performance: proxy caching

45

[Diagram: Client (browser) — Proxy with caching — Web server]

Page 46: CS6320 – Performance more details

Web Architecture

• Client (browser), Proxy, Web server

46

[Diagram: Client (browser), Proxy, Firewall, Web server]

Page 47: CS6320 – Performance more details

Web Caching not only at Proxy Servers

• Caching popular objects is one way to improve Web performance.

• Web caching at clients, proxies, and servers.

47

[Diagram: caches at the Client (browser), the Proxy, and the Web server]

Page 48: CS6320 – Performance more details

Advantages of Web Caching

• Reduces bandwidth consumption (decreases network traffic)
• Reduces access latency in the case of a cache hit
• Reduces the workload of the Web server
• Enhances the robustness of the Web service
• Usage history collected by a proxy cache can be used to determine usage patterns and allow the use of different cache replacement and prefetching policies.

48

Page 49: CS6320 – Performance more details

Disadvantages of Web Caching

• Stale data can be serviced due to the lack of proper updating

• Latency may increase in the case of a cache miss

• A single proxy cache is always a bottleneck.
• A single proxy is a single point of failure.
• Client-side and proxy caches reduce the hits on the original server.

49

Page 50: CS6320 – Performance more details

Web Caching Issues

• Cache replacement
• Prefetching
• Cache coherency
• Dynamic data caching

50

Page 51: CS6320 – Performance more details

Cache Replacement

• Characteristics of Web objects
  – different sizes, accessing costs, access patterns

• Traditional replacement policies do not work well
  – LRU (Least Recently Used), LFU (Least Frequently Used), FIFO (First In First Out), etc.

• There are replacement policies designed for Web objects:
  – key-based
  – cost-based

51

Page 52: CS6320 – Performance more details

Caching -Two Replacement Schemes

• Key-based replacement policies (see the sketch below):
  – Size: evicts the largest objects
  – LRU-MIN: evicts the least recently used object among the ones with the largest log(size)
  – Lowest Latency First: evicts the object with the lowest download latency

• Cost-based replacement policies
  – Cost function of factors such as last access time, cache entry time, transfer time cost, and so on
  – Least Normalized Cost Replacement: based on the access frequency, the transfer time cost, and the size
  – Server-assisted scheme: based on fetching cost, size, next request time, and cache prices during request intervals

52
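A small sketch in the spirit of the key-based "Size" policy above: when the cache exceeds its capacity, evict the largest objects first. The capacity and object model are assumptions for illustration; a cost-based policy would replace the eviction key with a computed cost function.

```python
# Minimal sketch of a size-based ("evict the largest object first") web cache.
class SizeBasedCache:
    def __init__(self, capacity_bytes: int):
        self.capacity = capacity_bytes
        self.store: dict[str, bytes] = {}

    def get(self, url: str):
        return self.store.get(url)

    def put(self, url: str, body: bytes) -> None:
        self.store[url] = body
        # Evict the largest cached objects until we fit within capacity again.
        while sum(len(v) for v in self.store.values()) > self.capacity:
            largest = max(self.store, key=lambda k: len(self.store[k]))
            del self.store[largest]

cache = SizeBasedCache(capacity_bytes=1000)
cache.put("/small", b"x" * 100)
cache.put("/big", b"x" * 900)
cache.put("/medium", b"x" * 300)   # forces eviction of "/big", the largest object
print(sorted(cache.store))          # -> ['/medium', '/small']
```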

Page 53: CS6320 – Performance more details

Caching -Prefetching

• The benefit from caching is limited.
  – Maximum cache hit rate: no more than 40–50%
  – To increase the hit rate, anticipate future document requests and prefetch the documents into caches

• Documents to prefetch
  – considered popular at servers
  – predicted to be accessed by the user soon, based on the access pattern

• It can reduce client latency at the expense of increasing the network traffic.

53

Page 54: CS6320 – Performance more details

Cache Coherence

• Caches may provide users with stale documents.
• HTTP commands for cache coherence
  – GET: retrieves a document given its URL
  – Conditional GET: a GET combined with the If-Modified-Since header (see the sketch below)
  – Pragma: no-cache: this header indicates that the object should be reloaded from the server
  – Last-Modified: returned with every GET response; indicates the last modification time of the document
• Two possible semantics
  – Strong cache consistency
  – Weak cache consistency

54
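A short sketch of a conditional GET as described above, using Python's standard library. The URL and the saved Last-Modified value are placeholders; a server that honors the header replies 304 Not Modified when the cached copy is still current, and 200 with a new body (and a new Last-Modified) otherwise.

```python
import urllib.request
from urllib.error import HTTPError

url = "http://example.com/index.html"                    # placeholder URL
cached_last_modified = "Tue, 01 Jan 2019 00:00:00 GMT"   # value saved from a prior GET

request = urllib.request.Request(url)
request.add_header("If-Modified-Since", cached_last_modified)

try:
    with urllib.request.urlopen(request) as response:
        # 200 OK: the document changed; refresh the cached copy.
        body = response.read()
        cached_last_modified = response.headers.get("Last-Modified", cached_last_modified)
        print("updated,", len(body), "bytes")
except HTTPError as err:
    if err.code == 304:
        print("Not Modified: the cached copy is still valid")
    else:
        raise
```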

Page 55: CS6320 – Performance more details

Strong Cache Consistency

• Client validation (polling-every-time)
  – the client sends an If-Modified-Since header with each access of the resource
  – the server responds with a Not Modified message if the resource has not changed

• Server invalidation
  – whenever a resource changes, the server sends an invalidation to all clients that potentially cached the resource
  – the server must keep track of which clients to notify
  – the server may send invalidations to clients who are no longer caching the resource

55

Page 56: CS6320 – Performance more details

Weak Cache Consistency

• Adaptive TTL (time-to-live) (see the sketch below)
  – adjust a TTL based on a lifetime (age): if a file has not been modified for a long time, it tends to stay unchanged
  – this approach can be shown to keep the probability of stale documents within reasonable bounds (< 5%)
  – most proxy servers use this mechanism
  – no strong guarantee as to document staleness

• Piggyback Invalidation
  – Piggyback Cache Validation (PCV): whenever a client communicates with a server, it piggybacks a list of cached, but potentially stale, resources from that server for validation
  – Piggyback Server Invalidation (PSI): a server piggybacks, on a reply to a client, the list of resources that have changed since the last access by the client
  – If access intervals are small, then PSI is good; but if the gaps are long, then PCV is good

56
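A tiny sketch of the adaptive-TTL heuristic: the longer a document has gone unmodified, the longer it may be cached. The 10% factor and the minimum/maximum bounds are assumptions for illustration; proxies typically apply a similar age-fraction rule with configurable limits.

```python
import time

def adaptive_ttl(last_modified_epoch: float,
                 factor: float = 0.1,       # fraction of the document's age (assumed)
                 min_ttl: float = 60,       # at least one minute (assumed)
                 max_ttl: float = 86400) -> float:   # at most one day (assumed)
    """TTL grows with the document's age: long-unmodified files are
    assumed likely to stay unchanged, so they may be cached longer."""
    age = max(0.0, time.time() - last_modified_epoch)
    return min(max_ttl, max(min_ttl, factor * age))

# A document last modified 30 days ago gets a long TTL (capped at one day here).
print(adaptive_ttl(time.time() - 30 * 86400))
```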

Page 57: CS6320 – Performance more details

Dynamic Data Caching

• Non-cacheable data – authenticated data, server dynamically generated data, etc.
  – how to make more data cacheable
  – how to reduce the latency to access non-cacheable data

• Active Cache
  – allows servers to supply cache applets to be attached to documents
  – the cache applets are invoked upon cache hits to finish the necessary processing without contacting the server
  – bandwidth savings at the expense of CPU costs
  – due to significant CPU overhead, user access latencies are much larger than without caching dynamic objects

57

Page 58: CS6320 – Performance more details

Dynamic Data Caching

• Web server accelerator
  – resides in front of one or more Web servers
  – provides an API which allows applications to explicitly add, delete, and update cached data
  – the API allows static/dynamic data to be cached
  – an example: the official Web site for the Olympic Winter Games
    • whenever new content became available, updated Web pages reflecting these changes were made available quickly
    • Data Update Propagation (DUP, IBM Watson) is used to improve performance

58

Page 59: CS6320 – Performance more details

Dynamic Data Caching

• Data Update Propagation (DUP) (see the sketch below)
  – maintains data-dependence information between cached objects and the underlying data that affect their values
  – upon any change to underlying data, determines which cached objects are affected by the change
  – such affected cached objects are then either invalidated or updated
  – With DUP: about a 100% cache hit rate at the 1998 Olympic Winter Games official Web site
  – Without DUP: an 80% cache hit rate at the 1996 Olympic Games official Web site

59
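A minimal sketch of the DUP idea as described above: keep a dependence map from underlying data items to the cached objects built from them, and invalidate exactly those objects when an item changes. The data model (item keys, page keys) is invented for illustration; a full DUP implementation could regenerate the affected objects instead of invalidating them.

```python
from collections import defaultdict

# Dependence information: underlying data item -> cached objects built from it.
dependents = defaultdict(set)
cache = {}

def cache_page(page_key: str, html: str, depends_on: list[str]) -> None:
    cache[page_key] = html
    for item in depends_on:
        dependents[item].add(page_key)

def on_underlying_update(item: str) -> None:
    """When an underlying data item changes, invalidate the cached objects
    whose values depend on it."""
    for page_key in dependents.pop(item, set()):
        cache.pop(page_key, None)

cache_page("/medals/overall", "<html>medal table</html>",
           depends_on=["results:hockey", "results:skiing"])
cache_page("/schedule", "<html>schedule</html>", depends_on=["schedule:day1"])
on_underlying_update("results:hockey")   # invalidates only /medals/overall
print(sorted(cache))                      # -> ['/schedule']
```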

Page 60: CS6320 – Performance more details

Towards Large-Scale Systems … and the need for clustering

• Large-scale systems (think Yahoo!, YouTube, eBay, Amazon, Google)…

60

Page 61: CS6320 – Performance more details

One Large Scale Need – High Availability

• High availability is a major driving requirement behind large-scale system design. Basically, it means the system is available (and responding) a high percentage of the time.
  – Uptime: typically measured in nines; traditional infrastructure systems such as the phone system aim for four or five nines (“four nines” implies 0.9999 uptime, or about 60 seconds of downtime per week).

61

Page 62: CS6320 – Performance more details

High Availability – how to measure

• Mean-time-between-failures (MTBF)
• Mean-time-to-repair (MTTR)
• uptime = (MTBF – MTTR) / MTBF
• yield = queries completed / queries offered
• harvest = data available / complete data
• DQ Principle: data per query × queries per second → constant (total data delivered) (see the sketch below)
  – System-level physical bottleneck: total I/O bandwidth (disk or network)
  – Optimization goal is to minimize the utilization of the bottleneck resource
  – Fault tolerance: trade-off between D and Q

Graceful Degradation is a goal
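A small sketch that computes these availability metrics from hypothetical numbers, just to make the formulas concrete. All of the input values are invented for illustration.

```python
# Hypothetical measurements for one service over a period.
mtbf_hours = 720.0        # mean time between failures
mttr_hours = 0.5          # mean time to repair
queries_offered = 1_000_000
queries_completed = 992_000
data_available_fraction = 0.98   # fraction of the complete data set reachable

uptime = (mtbf_hours - mttr_hours) / mtbf_hours
yield_ = queries_completed / queries_offered     # "yield" is a Python keyword, hence yield_
harvest = data_available_fraction

# DQ: data per query * queries per second is roughly constant for a given
# system, so it can be used to compare designs and plan capacity.
queries_per_second = 1200.0
data_per_query = 0.8      # e.g., fraction of partitions consulted per query (assumed)
dq = data_per_query * queries_per_second

print(f"uptime={uptime:.5f} yield={yield_:.3f} harvest={harvest:.2f} DQ={dq:.0f}")
```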

Page 63: CS6320 – Performance more details

Using High Availability Metrics to Compare Replication vs. Partitioning

• Replication of data on 2 nodes
  – 1 failure: 100% harvest (D), 50% yield (Q)

• Partitioning of data across 2 nodes
  – 1 failure: 50% harvest (D), 100% yield (Q)

See the sketch below for a toy calculation.

63
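A toy calculation of the comparison above, assuming a two-node cluster where a query normally either consults all partitions (as in a search-style service) or any one replica. The model is deliberately simplified and the node count is an assumption.

```python
def replication_after_one_failure(nodes: int = 2):
    # All data is on every node, so harvest stays 100%;
    # capacity drops with the lost node, so yield falls proportionally.
    harvest = 1.0
    yield_ = (nodes - 1) / nodes
    return harvest, yield_

def partitioning_after_one_failure(nodes: int = 2):
    # Each node holds a distinct slice; queries still complete (yield 100%)
    # but only see the surviving fraction of the data (harvest drops).
    harvest = (nodes - 1) / nodes
    yield_ = 1.0
    return harvest, yield_

print("replication:", replication_after_one_failure())    # (1.0, 0.5)
print("partitioning:", partitioning_after_one_failure())   # (0.5, 1.0)
```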

Page 64: CS6320 – Performance more details

Cluster Example

• A smaller to mid-sized cluster example.
• Large examples like Amazon have nodes numbering in the thousands.

Page 65: CS6320 – Performance more details

Some Tips

• Get the basics right. Start with a professional data center and layer-7 switches, and use symmetry to simplify analysis and management.

• Decide on your availability metrics. Everyone should agree on the goals and how to measure them daily. Remember that harvest and yield are more useful than just uptime.

• Focus on MTTR at least as much as MTBF. Repair time is easier to affect for an evolving system and has just as much impact.

• Understand load redirection during faults. Data replication is insufficient for preserving uptime under faults; you also need excess DQ.

• Graceful degradation is a critical part of a high-availability strategy. Intelligent admission control and dynamic database reduction are the key tools for implementing the strategy.

• Use DQ analysis on all upgrades. Evaluate all proposed upgrades ahead of time, and do capacity planning.

• Automate upgrades as much as possible. Develop a mostly automatic upgrade method, such as rolling upgrades. Using a staging area will reduce downtime, but be sure to have a fast, simple way to revert to the old version.