
Clustering: Transparent Replication, Load Balancing, and Failover
Building Scalable and Highly Available E-Commerce Applications with Borland AppServer

January 2000

Salil [email protected]

Technology Tower Suite 1000191 Coronado Ave
San Carlos CA 94070-2805
(650) 551-1600 voice
http://www.customware.com/



1 Table of Contents

1 TABLE OF CONTENTS
2 OVERVIEW
3 A MACRO ARCHITECTURE FOR MULTI-TIER APPLICATIONS
4 CLIENT-TIER BOTTLENECKS
5 CLIENT-TO-WEB APPLICATION (WAN) COMMUNICATION BOTTLENECKS
6 BOTTLENECKS/FAILURES IN WEB SERVERS AND SERVLET ENGINES
    6.1 WHAT IS IP LOAD BALANCING?
    6.2 WHO NEEDS IP LOAD BALANCING?
    6.3 WHY IS IP LOAD BALANCING OFTEN NOT ADEQUATE BY ITSELF?
    6.4 LOAD BALANCING AT SESSION INITIATION TIME
    6.5 LOAD BALANCING THROUGHOUT A SESSION’S LIFETIME
        6.5.1 Redesigning Your Servlets
        6.5.2 Using Centralized Servlet Session State Managers
        6.5.3 Replicating Servlet Session State
        6.5.4 Factoring Out State from Servlets
        6.5.5 Placing “Stringified” Object References into Cookies
7 BOTTLENECKS/FAILURES IN EJB OBJECTS AND CONTAINERS
    7.1 A JNDI- AND COSNAMING-COMPLIANT NAMING SERVICE WITH SUPPORT FOR CLUSTERING
        7.1.1 Pluggable Backing Stores and Support for Namespace Replication
        7.1.2 Clustering Features of the Naming Service
        7.1.3 High Availability of the Naming Service Itself
8 THE BAS SOLUTION FOR REPLICATING EJBS AND EJB CONTAINERS
    8.1 CLUSTERING STATELESS SESSION BEANS
    8.2 CLUSTERING ENTITY BEANS
    8.3 CLUSTERING STATEFUL SESSION BEANS
    8.4 THE IMPORTANCE OF DISTRIBUTED TRANSACTIONS
9 THE BAS SOLUTION FOR MANAGING REPLICATED AND LOAD BALANCED SERVICES
    9.1 APPCENTER
    9.2 LOAD BALANCING
    9.3 FAULT TOLERANCE
10 CONCLUSIONS
11 APPENDIX 1: HTTP- AND IP-LEVEL LOAD BALANCING PRODUCTS
    11.1 APPLIANCES (DEDICATED ROUTER-LIKE DEVICES)
        11.1.1 Radware Web Server Director Pro+ http://www.radware.co.il/
        11.1.2 F5 Networks Big/ip http://www.radware.co.il/
        11.1.3 Coyote Point Equalizer E250 http://www.coyotepoint.com/
        11.1.4 HydraWeb Hydra5000 http://www.hydraweb.com/
        11.1.5 IPivot Intelligent Broker 4000 http://www.ipivot.com/
    11.2 LOAD BALANCING SOFTWARE
        11.2.1 Resonate Central Dispatch http://www.resonate.com/
    11.3 SWITCHES
        11.3.1 Alteon ACEdirector 2 http://www.alteonwebsystems.com/
        11.3.2 HolonTech HyperFlow 2 http://www.holontech.com/
    11.4 HOSTING & CACHING SERVICES
        11.4.1 Akamai FreeFlow http://www.akamai.com/
12 ABOUT THE AUTHOR


2 Overview

Websites have evolved from serving static pages to hosting full-fledged applications. This has required IT staff to evaluate and deploy application servers instead of, or in addition to, web servers. Among these, application servers such as Borland AppServer (BAS) v4.0, which are based on the Java 2 Enterprise Edition (J2EE) standards (including Enterprise JavaBeans (EJB), Servlets, JavaServer Pages (JSP), and Java Database Connectivity (JDBC)), have become the architecture of choice among users.

As the applications hosted by such application servers become increasingly large and complex, increasingly important to their users, and thus increasingly important to the organizations hosting them, it becomes imperative that the applications (a) scale and (b) resist failure.

Many application servers attempt to provide scalability and high availability to applications via features known as Replication, Load Balancing, and Failover, commonly referred to together as Clustering.

Clustering makes applications scalable and highly available because:

• Work requests from a sea of clients can be distributed among a set of software resources (EJB containers & EJB objects, Servlet engines & Servlets, JSP engines & JSPs) and hardware resources (machines) using algorithms of your choosing.
• Each work request can be routed to the least loaded resource if so desired.
• Requests can be routed away from a failed resource to a different one, possibly containing replicas of objects that were in the failed resource.
• Both software resources and hardware resources can be “brought on-line” or “taken off-line” without bringing the entire application down.
• Each node in the cluster can be an arbitrary combination of resources, giving the administrator complete flexibility in configuring the application for scalability and fault tolerance.
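As a rough illustration of the routing behavior listed above, the following minimal Java sketch picks the least loaded resource and skips any resource that has failed or been taken off-line. The class names (ClusterNode, LeastLoadedRouter) and the load metric (a count of active requests) are assumptions made for this example; they are not part of any BAS API.

```java
import java.util.ArrayList;
import java.util.List;

/** One clustered resource, for example an EJB container on some machine (hypothetical). */
class ClusterNode {
    final String name;
    int activeRequests;      // current load on this node
    boolean online = true;   // false once the node has failed or been taken off-line

    ClusterNode(String name, int activeRequests) {
        this.name = name;
        this.activeRequests = activeRequests;
    }
}

class LeastLoadedRouter {
    private final List<ClusterNode> nodes = new ArrayList<>();

    void add(ClusterNode node) { nodes.add(node); }

    /** Returns the on-line node with the fewest active requests. */
    ClusterNode route() {
        ClusterNode best = null;
        for (ClusterNode n : nodes) {
            if (n.online && (best == null || n.activeRequests < best.activeRequests)) {
                best = n;
            }
        }
        if (best == null) {
            throw new IllegalStateException("no on-line node available");
        }
        best.activeRequests++;   // the chosen node now carries one more request
        return best;
    }
}

class LeastLoadedDemo {
    public static void main(String[] args) {
        LeastLoadedRouter router = new LeastLoadedRouter();
        ClusterNode a = new ClusterNode("nodeA", 5);
        ClusterNode b = new ClusterNode("nodeB", 2);
        router.add(a);
        router.add(b);
        System.out.println(router.route().name);  // nodeB, which carries less load
        b.online = false;                          // simulate a failure of nodeB
        System.out.println(router.route().name);  // nodeA, since nodeB is skipped
    }
}
```

Taking a node off-line here only flips a flag, so the remaining nodes keep serving requests while the application as a whole stays up.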

However, even application servers that all conform to the same set of standards (J2EE) differ greatly in how well they support clustering. Many application servers do not support the features in the above list at all. Some support some features but not all and, furthermore, have arbitrary limitations that hinder real-world deployments.

This white paper explains in detail how Borland AppServer delivers one of the best and most complete clustering solutions on the market, which includes replication, load balancing, and failover.

It discusses in depth the techniques used to achieve these goals, and compares them to those used by the competition. Most importantly, it demonstrates how BAS4 is used to deploy applications that perform and scale from a few clients to hundreds of thousands.

3 A Macro Architecture for Multi-tier Applications

Clustering features such as Replication, Load Balancing, and Failover are an important part of building scalable and reliable applications. Good application design is equally important.

Let’s establish a reference macro architecture for multi-tier applications employing J2EE technologies, as shown in Figure 1. This will allow us to discuss the challenges in making such applications meet expectations in terms of performance, scalability, and reliability, including:

• Client-tier bottlenecks
• WAN communication bottlenecks
• Bottlenecks/failures in web servers and servlet engines, and the BAS solution for replicating web servers and servlet engines
• Bottlenecks/failures in the application server / EJB container
• The BAS solution for replicating beans and containers

The discussion will highlight how the clustering features of Borland AppServer (BAS) can be elegantly used to build scalable applications.

Even while examining the architecture at a high level, we quickly see that the architecture often contains at least four tiers.

[Figure 1 – A Macro Architecture for Multi-tier Web Applications. Elements shown: clients (Java Application, Non-Java Application, Java Applet, JavaScript, HTML); Web Server (Servlets, JSPs); Application Server (EJBs); Database.]

Standalone Java or non-Java applications, or Java applets running in a web browser, can access business components (beans) directly across WANs.

Very thin clients such as HTML/JavaScript clients access services through a web server; they are usually serviced by web-server extensions such as Servlets and JSPs, which are replacements for older, less efficient mechanisms such as CGI.

Each tier, and each interconnect between any two tiers, is a potential performance bottleneck.

Often, even more tiers are inserted between the ones shown above, to perform connection concentration, load balancing, proxying, firewall traversal, etc.

4 Client-Tier Bottlenecks

Processing bottlenecks in the client tier occur if the client platforms are underpowered (slow CPUs, not enough memory, not enough disk/swap space) with respect to the kind of processing that is happening in that tier. Simply running Java inside a browser, or running large standalone applications, can easily make underpowered machines unusable.

This happens if your application’s presentation layer is complex (e.g., lots of screens composed of lots of widgets), if you perform extensive input validation in the GUI tier, or if your client tier contains a significant amount of business logic, which you might have put there because you wanted to make the server tiers leaner and more scalable.

To put things in perspective, however, this is the least severe of the potential bottlenecks in the reference architecture, and is thus the least of the concerns addressed by this paper.

Problem: Client-tier processing is slow.

Solution: Make the client tier thinner (do less work in the client tier). For example:

• Use HTML / DHTML / JavaScript instead of Java applets
• Move some logic to Servlets or EJBs

Partly because of the problems described above (but mostly because of the problems described in the next section), many architects of web-based applications currently opt for thinner, HTML/JavaScript-based clients.

5 Client-to-Web Application (WAN) Communication Bottlenecks

Depending on the situation, clients across WANs may experience a wide range of bandwidths, from 28.8 Kbps to 1 Mbps+; clients at the lower end of that spectrum often experience problems with web applications designed for higher bandwidths.

Java applet clients that need to be downloaded at client-session startup time are a common source of bottlenecks in this area, especially when client platforms are behind a slow WAN link (less than 100 Kbps), for typical applet sizes. As mentioned in the previous section, this portion of the problem is often solved by using HTML/JavaScript for the WAN-client tier.


Problem: Java applets take too long to download.

Solutions: Don’t use Java applets; use HTML/DHTML/JavaScript + Servlets.

Make Java applets smaller.

Consider asking your end-users to install something on their machines (e.g., a jar file, a plug-in, a DLL, etc.)

Sometimes the problem is not with client bandwidth, but with the server infrastructure’s ability to adequately support the bandwidth demands caused by numerous clients. In this situation, upgrading the server’s bandwidth to the WAN (upgrading from T1 to T3 to OC1 to OC3, or switching ISPs) is a real option.

Problem: WAN Communication bottleneck is due to server-side network infrastructure

Solutions: Upgrade server-side connection to the WAN (e.g., T1 → T3, OC1 → OC3)

Replicate and distribute (perhaps geographically) your servers. This creates a new set of issues, which we will soon discuss.

Push more processing (logic & data) to the client. This can create a new set of issues, some of which were raised in the discussion on client-tier bottlenecks.

A potentially more serious problem is posed by client applications or applets that are “chatty”, i.e., they communicate frequently with the server across the WAN, sometimes for fine-grained actions such as populating screens, validating fields, or determining what to do next after each user interaction.

The problem described above is compounded if the interactions are non-trivial, i.e., they pass around large sets of data.

These problems must be addressed by making careful design choices: deciding where data primarily resides, where it is copied, where it is cached, etc.

Problem: Chatty Clients

Solutions: Careful [re]design and [re]partitioning of tiers. Consider:

• When should clients use references to objects in the middle tier?
• When should the middle tier pass a copy of objects’ state to clients? (Optionally, using “pass by value” features).


6 Bottlenecks/Failures in Web Servers and Servlet Engines

Millions of web clients hitting a single web server clearly make that web server a bottleneck. To solve this problem, large web sites run multiple web servers, on multiple machines, often with replicated copies of the content.

Load balancing across web servers is done with a variety of techniques, ranging from simplistic ones such as round-robin DNS, to special web-server load-balancing software such as Central Dispatch and its competitors, to IP-level load-balancing products such as Cisco’s Local Director and its competitors.

Problems: A large number of hits to the web-server tier makes it a bottleneck.

The web server becomes a single point of failure.

Solution: Replicate the web servers and load balance across them

Problem: Which mechanism to use to load balance across multiple web servers?

Solutions: Round-robin DNS (simple).

Web-server load-balancing software (e.g., Resonate Central Dispatch; see Appendix 1).

IP-level load-balancing devices (e.g., Cisco Local Director; see Appendix 1).

6.1 What is IP Load Balancing?

Don’t be underwhelmed by the name “IP-level Load Balancing” or just “IP Load Balancing.” The term does not begin to describe the functionality or feature set offered by a strong IPLB product. The press coined the phrase in early 1997 and it stuck. PC Magazine published the first product review / test on IP Load Balancing in February of 1997 in an article titled “Web Server Load Balancing”. Since then the term IP Load Balancing has become the de facto industry standard used to describe products that generally intercept IP packets and intelligently distribute load amongst an array of devices sitting logically behind them.

IP Load Balancing uses a Virtual IP address (VIP) to represent the logical content set located on one to n servers or devices. Servers can be added to or removed from the farm dynamically, in a fashion transparent to the client. IPLB devices can operate locally or globally. Local solutions allow a single site to dynamically scale resources on demand. Global solutions can represent an entire company’s data resources through single or multiple URLs and can load balance between sites using advanced measuring tools, which make redirection decisions based on a combination of latency, router hops, and/or dynamic load between the sites.
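The idea of a single virtual address fronting a changing farm of servers can be sketched in a few lines of Java. This is only a toy model under stated assumptions: the class VirtualIpPool, the addresses, and the plain round-robin policy are all hypothetical, and a real IPLB device works on IP packets rather than on strings.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.atomic.AtomicInteger;

/** Toy model of a server farm hidden behind one Virtual IP address (hypothetical). */
class VirtualIpPool {
    private final String virtualIp;
    private final List<String> servers = new CopyOnWriteArrayList<>();
    private final AtomicInteger next = new AtomicInteger();

    VirtualIpPool(String virtualIp) { this.virtualIp = virtualIp; }

    String virtualIp() { return virtualIp; }

    // Servers can join or leave the farm at any time; clients only ever see the VIP.
    void addServer(String address) { servers.add(address); }
    void removeServer(String address) { servers.remove(address); }

    /** Picks the real server for the next connection, in round-robin order. */
    String pick() {
        Object[] snapshot = servers.toArray();   // stable view even if the farm changes
        if (snapshot.length == 0) {
            throw new IllegalStateException("server farm is empty");
        }
        int index = Math.floorMod(next.getAndIncrement(), snapshot.length);
        return (String) snapshot[index];
    }
}

class VirtualIpDemo {
    public static void main(String[] args) {
        VirtualIpPool pool = new VirtualIpPool("192.0.2.10");
        pool.addServer("10.0.0.1");
        pool.addServer("10.0.0.2");
        System.out.println(pool.pick());   // 10.0.0.1
        System.out.println(pool.pick());   // 10.0.0.2
        pool.removeServer("10.0.0.1");     // take one server out of the farm
        System.out.println(pool.pick());   // 10.0.0.2
    }
}
```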


6.2 Who needs IP Load Balancing?

As soon as client traffic becomes overwhelming for a single device, a device farm can be built. Device farms are inexpensive to build, and with an IP Load Balancer, a heterogeneous mix of hardware and software can be retained, protecting legacy equipment and thus return on investment.

6.3 Why is IP Load Balancing Often Not Adequate by Itself?

Functionally, the major value of IPLB devices is that they keep the front-ends of sites available. For sites that consist mostly of static web pages or simple CGI scripts or equivalent, these devices might be all that is required.

IP Load Balancers do not do an adequate job of intelligently handling load balancing and failover across stateful services. Servlets, JSPs, and EJBs are all, in the general case, stateful. After a client initiates a session with one such object, if each subsequent hit from that client is load-balanced willy-nilly across various objects/machines/servers without consideration of that state, the load balancing breaks the application.

Indeed, what works so well for static web pages cannot work for objects with state. The application server’s clustering features must be used in conjunction with these low-level IP load balancers to build scalable & highly available J2EE applications.

An overview of IP Load Balancing solutions is provided in Appendix 1.

6.4 Load Balancing at Session Initiation Time

The following workaround for dealing with session state is applicable in a majority of situations: instead of load-balancing every hit from the client, load balancing takes place only at session initiation time. Thereafter, all hits for a particular client session are handled by the same set of software and hardware resources (particular JSP instances, Servlet instances, EJB containers and objects, and machines).

Problem: Static content (like HTML and simple CGI scripts or equivalent) is easy to replicate and load-balance over. Servlets & JSPs with state (e.g., shopping carts) interfere with simple load balancing approaches.

Solution: No need to load-balance every web hit; it is often adequate to load balance only at session initiation time.

For example, a web client would visit a URL such as the following to begin using a particular application:

http://www.mybroker.com/trading/start

It is at this point that the server performs load balancing (perhaps in conjunction with client authentication and other initialization), and directs the client toward other URLs that are not load-balanced, such as:

http://trading72.mybroker.com/servlet/tradingScreen?acct=xyz…
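The redirect step can be sketched as follows. This is a minimal illustration, not the mechanism of any particular product: the class SessionInitiationBalancer, the host name trading71.mybroker.com, and the round-robin choice of host are assumptions made for the example.

```java
import java.util.Arrays;
import java.util.List;

/** Chooses a host once per session and builds the URL the client is redirected to. */
class SessionInitiationBalancer {
    private final List<String> hosts;
    private int next = 0;

    SessionInitiationBalancer(List<String> hosts) { this.hosts = hosts; }

    /**
     * Called once, when the client visits the start URL. Every later hit of the
     * session goes straight to the returned host and is not load-balanced again.
     * A real application would also authenticate the client and URL-encode the account.
     */
    synchronized String startSession(String account) {
        String host = hosts.get(next);
        next = (next + 1) % hosts.size();
        return "http://" + host + "/servlet/tradingScreen?acct=" + account;
    }

    public static void main(String[] args) {
        SessionInitiationBalancer balancer = new SessionInitiationBalancer(
                Arrays.asList("trading71.mybroker.com", "trading72.mybroker.com"));
        System.out.println(balancer.startSession("abc"));
        // http://trading71.mybroker.com/servlet/tradingScreen?acct=abc
        System.out.println(balancer.startSession("xyz"));
        // http://trading72.mybroker.com/servlet/tradingScreen?acct=xyz
    }
}
```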



Periodically and/or every time the client performs certain important or permanent operations, the session state is flushed to permanent storage. In the event that there is a failure of the session (due to failure of any of the objects, servers, or machines involved), any transient session state that has not been checkpointed is lost, and clients are directed back to the login / session initiation page where they may resume the session from its most recent saved state.

For perspective, note that at most online trading sites, transient session state is indeed lost in case of session failure. For example, a client confirming the placement of an order is an important permanent operation whose effects are made recoverable, but client interactions in between such recoverable operations build up transient session state, which is indeed lost in the event of failures.

This approach is low-tech, does not depend on any particular load balancing features in the application server, and can be used easily in the presence of IPLB devices.
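The difference between checkpointed and transient session state can be made concrete with a small sketch. The class CheckpointedSession is hypothetical, and a plain in-memory map stands in for permanent storage such as a database.

```java
import java.util.HashMap;
import java.util.Map;

/** Session whose state survives a failure only up to the last checkpoint (hypothetical). */
class CheckpointedSession {
    private final Map<String, String> durableStore;   // stands in for permanent storage
    private final Map<String, String> transientState = new HashMap<>();

    CheckpointedSession(Map<String, String> durableStore) {
        this.durableStore = durableStore;
        transientState.putAll(durableStore);   // resume from the most recent saved state
    }

    void put(String key, String value) { transientState.put(key, value); }

    String get(String key) { return transientState.get(key); }

    /** Called periodically and after important operations such as confirming an order. */
    void checkpoint() {
        durableStore.clear();
        durableStore.putAll(transientState);
    }
}

class CheckpointDemo {
    public static void main(String[] args) {
        Map<String, String> storage = new HashMap<>();
        CheckpointedSession session = new CheckpointedSession(storage);
        session.put("confirmedOrder", "100 ACME");
        session.checkpoint();                      // order confirmation is made recoverable
        session.put("draftOrder", "50 XYZ");       // transient, not yet checkpointed

        // Simulate a session failure: the client logs in again and gets a new session.
        CheckpointedSession resumed = new CheckpointedSession(storage);
        System.out.println(resumed.get("confirmedOrder"));  // 100 ACME
        System.out.println(resumed.get("draftOrder"));      // null
    }
}
```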

6.5 Load Balancing throughout a Session’s Lifetime

You may decide that the above solution is inadequate for the applications you are architecting. You might require that client hits be load-balanced throughout the lifetime of the session, not just at session initiation time. Client sessions in your application may be long-lived, and you may not want a particular client pinned to a certain set of resources for the lifetime of the session. Or the loss of transient session state may not be acceptable.

6.5.1 Redesigning Your Servlets

In such cases, it is possible to redesign your Servlets such that your Servlet state is flushed periodically (i.e., more often than at session initiation or termination), and clients can use different Servlet instances at different times, even for the same session.

6.5.2 Using Centralized Servlet Session State Managers

Another option is to use Servlet engines that provide a Servlet session state manager, which allows potentially each web hit to use a different Servlet instance in a different Servlet engine instance, with the proper session state always available via the session state manager.
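A minimal sketch of this arrangement is shown below, assuming a shopping cart as the session state. The names SessionStateManager and StatelessServletEngine are hypothetical, and an in-process map stands in for what would in practice be a separate, shared service.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;

/** Central holder of all session state, shared by every servlet engine (hypothetical). */
class SessionStateManager {
    private final Map<String, List<String>> carts = new ConcurrentHashMap<>();

    List<String> cartFor(String sessionId) {
        return carts.computeIfAbsent(sessionId, id -> new CopyOnWriteArrayList<>());
    }
}

/** A servlet engine that keeps no session state of its own. */
class StatelessServletEngine {
    private final SessionStateManager manager;

    StatelessServletEngine(SessionStateManager manager) { this.manager = manager; }

    /** Handles one web hit: adds an item to the session's cart and returns the cart size. */
    int addToCart(String sessionId, String item) {
        List<String> cart = manager.cartFor(sessionId);   // one manager access per hit
        cart.add(item);
        return cart.size();
    }
}

class SessionManagerDemo {
    public static void main(String[] args) {
        SessionStateManager manager = new SessionStateManager();
        StatelessServletEngine engine1 = new StatelessServletEngine(manager);
        StatelessServletEngine engine2 = new StatelessServletEngine(manager);
        System.out.println(engine1.addToCart("session-1", "book"));  // 1
        System.out.println(engine2.addToCart("session-1", "pen"));   // 2, same cart via engine2
    }
}
```

Note that every hit goes through the manager, which is exactly why such a component can become a point of contention as traffic grows.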

6.5.3 Replicating Servlet Session State

The Servlet session state manager becomes a bottleneck in such centralized configurations, because it is hit by the Servlet engines potentially for every client request. Another approach used by application servers involves replicating Servlet session state across multiple nodes. Any change to one copy must be propagated to the secondary copy or copies. Generally, the replication for such session state is in-memory, and not disk- or database-based. It is thus still susceptible to failures, and not suitable for persistent components.

Achieving an acceptable balance of performance vs. scalability and safety can also pose a challenge in such session state replication. If the number of copies is high, then replication requires so much communication amongst the replicas that performance decreases after two or three nodes.

Therefore, we do not recommend the use of such Servlet session state replication. The Servlet engine of Borland AppServer does not perform such replication. As described later in this paper, BAS instead concentrates on providing solid load balancing, replication, and failover support at the EJB container level, a superior approach.


Even application server products that do support Servlet session state replication caution against its use. For example, the BEA white paper Achieving Scalability and High Availability for E-Commerce and Other Web Applications, after touting in-memory replication of Servlet session state, goes on to say:

In general, the cluster is far more efficient when a particular client’s requests can be serviced by a single server, and only rerouted in the event of a failure. Since there are many more concurrent clients than servers in a typical web cluster, the load can just as effectively balanced [sic] without scattering a particular client’s requests across several servers.

Problem: Cannot load-balance only at session initiation time. Must load balance more frequently (every web hit or every few web hits).

Solution: Redesign your servlets such that state is flushed periodically, and a web client can use a different servlet instance [in a different servlet engine].

Some products provide a central web session state manager so that each web hit can use a different session object on a different machine. Use with caution, or do not use at all.

6.5.4 Factoring Out State from Servlets

Even for applications making heavy use of Servlets, it is possible to factor all state management out of the Servlets.

Servlets can be used simply as a conduit for bringing a client’s invocation into the EJB container. All session state management takes place in stateful session beans.

Such refactoring makes even more sense when you consider the following additional factors:

• Servlets are a “poor man’s version” of stateful session beans. In fact, it was revealed in a discussion on the EJB-INTEREST list at java.sun.com that, had the Servlet API specifications been standardized after the EJB specifications were concrete, Servlets would have been specified to be just another kind of stateful session bean (“a web session bean”).
• High-end application server vendors such as Inprise have invested significant engineering into making EJB containers scalable, replicable, highly available, and fault tolerant.
• Some application server products such as Borland AppServer allow application deployers to collocate the servlet engine and the EJB container within one Java Virtual Machine (JVM). This allows such refactoring to happen without degrading performance, and it often improves performance because session state moves closer to the data (entity beans) that the session manipulates.

Servlets can also make a potentially nonstandard use of entity beans to store session state. Although entity beans are meant to represent persistent objects, not session state, it is sometimes reasonable to bend the rules and use entity beans as a mechanism for making servlet session state persistent, and thus accessible from any servlet engine.


6.5.5 Placing “Stringified” Object References into Cookies

Furthermore, because CORBA and RMI-over-IIOP form the underpinnings of Borland AppServer, we can take advantage of certain CORBA features to make J2EE applications more scalable.

In particular, instead of relying on Servlet state management APIs, it is possible to “stringify” the object references of session beans and place them in a cookie (using standard HTTP Cookie management APIs). Cookies are small identifiers that servers create and return to clients, and that clients transparently send back to the server with each request. This technique frees clients to use any Servlet instance on any server machine for each hit. In fact, the application deployer can now use an IPLB device to load balance each hit. The servlets in the web application only need to pull the session bean references out of the cookie, destringify them, and invoke methods on them.

To make these operations fast, Inprise has highly optimized and tuned the CORBA object_to_string() and string_to_object() operations in VisiBroker. Application server products that are built on proprietary protocols do not offer standard calls to make string representations of their session object references.

Problem: How to completely avoid storing state in servlets

Solution: Move the state to EJBs, and “stringify” the object references of those EJBs and place them in HTTP Cookies. Requires support for CORBA stringification & destringification in the application server product.
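The round trip from object reference to cookie value and back can be illustrated without an ORB. In the sketch below, BeanRef and RefCookies are hypothetical stand-ins: a real deployment would call the CORBA ORB operations object_to_string() and string_to_object(), and the string format used here (host, port, and object key, Base64-encoded) is invented for the example and is not the CORBA IOR format.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

/** Hypothetical stand-in for a remote session bean reference. */
class BeanRef {
    final String host;       // assumed to be a host name without ':' characters
    final int port;
    final String objectKey;

    BeanRef(String host, int port, String objectKey) {
        this.host = host;
        this.port = port;
        this.objectKey = objectKey;
    }
}

class RefCookies {
    /** Plays the role of object_to_string(): yields text that is safe as a cookie value. */
    static String stringify(BeanRef ref) {
        String raw = ref.host + ":" + ref.port + "/" + ref.objectKey;
        return Base64.getUrlEncoder().withoutPadding()
                .encodeToString(raw.getBytes(StandardCharsets.UTF_8));
    }

    /** Plays the role of string_to_object(): rebuilds the reference from the cookie value. */
    static BeanRef destringify(String cookieValue) {
        String raw = new String(Base64.getUrlDecoder().decode(cookieValue), StandardCharsets.UTF_8);
        int colon = raw.indexOf(':');
        int slash = raw.indexOf('/', colon);
        return new BeanRef(raw.substring(0, colon),
                Integer.parseInt(raw.substring(colon + 1, slash)),
                raw.substring(slash + 1));
    }

    public static void main(String[] args) {
        BeanRef cart = new BeanRef("ejb3.example.com", 14000, "CartSession/42");
        String cookieValue = stringify(cart);            // sent to the browser in a cookie
        BeanRef restored = destringify(cookieValue);     // done by whichever servlet gets the hit
        System.out.println(restored.host + ":" + restored.port + "/" + restored.objectKey);
        // ejb3.example.com:14000/CartSession/42
    }
}
```

Because the cookie carries everything needed to reach the session bean, the servlet that handles a given hit needs no local session state at all.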

7 Bottlenecks/Failures in EJB objects and containers

How well an application server product supports the clustering of EJBs directly defines how good a product it is. We begin by discussing how a replicated naming service with built-in support for clustering forms the foundation of this set of features in Borland AppServer. We leave the important topic of state management of replicated objects until the following section.

7.1 A JNDI- and CosNaming-compliant Naming Service with Support for Clustering

With the VisiBroker and Borland AppServer products, Inprise offers one cohesive naming service with support for both the JNDI API (as required by the EJB specifications) and the CosNaming API (as required by the CORBA specifications), and support for federation, replication, load balancing (with custom criteria), and failover.

7.1.1 Pluggable Backing Stores and Support for Namespace Replication

To keep naming data persistent, the naming service can use a choice of pluggable data stores, including Inprise’s all-Java JDataStore, Oracle, Sybase, any JDBC resource, or LDAP. The Naming Service itself can be fully replicated; that is, multiple copies of the Naming Service server can be started on multiple machines for the same namespace, and all copies can simultaneously use the same backing store.


7.1.2 Clustering Features of the Naming Service

The Clustering features of the Naming Service allow applications to associate multiple object references with a single name, using standard JNDI or CosNaming APIs. This can happen completely automatically if the administrator deploys multiple containers (possibly on multiple machines) that contain an overlapping set of beans.

When clients look up such a name, the Naming Service can load balance over the set of object references associated with that name, to distribute the load over them.

The load-balancing algorithm is specified at cluster creation time. In addition to the built-in load-balancing algorithms, application developers can code other load-balancing algorithms.

If one of the objects in a cluster fails, the BAS runtime can automatically substitute the object reference of one of the replicas in the cluster. Depending on the state management needs of the object being failed over, certain other actions are also performed automatically; these are discussed in subsequent sections.
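The behavior described in this section (several references bound under one name, a load-balancing algorithm fixed at cluster creation, and lookups that skip failed replicas) can be modeled with a short sketch. ClusteredNaming and LoadBalancer are hypothetical names for this illustration only; they are not the JNDI, CosNaming, or BAS interfaces, and references are represented as plain strings.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Pluggable criterion for choosing among the references bound to one name. */
interface LoadBalancer {
    int choose(List<String> refs, int lookupCount);
}

/** Toy naming service in which one name maps to a cluster of references. */
class ClusteredNaming {
    private final Map<String, List<String>> bindings = new HashMap<>();
    private final Map<String, Integer> lookups = new HashMap<>();
    private final LoadBalancer balancer;

    /** The load-balancing algorithm is fixed when the cluster is created. */
    ClusteredNaming(LoadBalancer balancer) { this.balancer = balancer; }

    /** Binding the same name again adds a replica instead of replacing the first one. */
    void bind(String name, String ref) {
        bindings.computeIfAbsent(name, n -> new ArrayList<>()).add(ref);
    }

    /** Removes a failed replica so that later lookups are served by the survivors. */
    void unbind(String name, String ref) {
        List<String> refs = bindings.get(name);
        if (refs != null) {
            refs.remove(ref);
        }
    }

    String lookup(String name) {
        List<String> refs = bindings.get(name);
        if (refs == null || refs.isEmpty()) {
            throw new IllegalStateException("name not bound: " + name);
        }
        int count = lookups.merge(name, 1, Integer::sum) - 1;   // lookups made before this one
        return refs.get(balancer.choose(refs, count));
    }
}

class ClusteredNamingDemo {
    public static void main(String[] args) {
        LoadBalancer roundRobin = (refs, n) -> n % refs.size();   // a custom algorithm plugs in here
        ClusteredNaming naming = new ClusteredNaming(roundRobin);
        naming.bind("CartHome", "containerA");
        naming.bind("CartHome", "containerB");
        System.out.println(naming.lookup("CartHome"));   // containerA
        System.out.println(naming.lookup("CartHome"));   // containerB
        naming.unbind("CartHome", "containerA");         // containerA has failed
        System.out.println(naming.lookup("CartHome"));   // containerB
    }
}
```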

7.1.3 High Availability of the Naming Service Itself

For high availability of the Naming Service, copies of the Naming Service servers can be run in a Master/Slave mode. In other words, two naming servers run at the same time, with one of them in standby mode. Both the Master and the Slave support the same namespace, and work off a common persistent backing store.

While both naming servers are active, the Master is always the preferred choice for clients using the Naming Service. In the event that the Master terminates unexpectedly, the Slave server takes over. The switch from the Master to the Slave is seamless and transparent to the clients.

In the meantime, administrators can take whatever remedial actions are required to revive the failed Master server. If the Master server comes back, Naming Service requests from new clients automatically go to the Master server again. Clients that have switched and bound to the Slave server continue to use the Slave server. If the Slave fails, however, those clients automatically switch back to using the Master.
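The client-side switching rules described above can be written down as a small state machine. This sketch only models the decision logic; the classes NamingServer and NamingClient are hypothetical, and the real switchover relies on the protocols named in the next paragraph rather than on a boolean flag.

```java
/** A naming server that is either up or down (hypothetical model). */
class NamingServer {
    final String name;
    boolean up = true;

    NamingServer(String name) { this.name = name; }
}

/** Models which naming server a single client uses over time. */
class NamingClient {
    private final NamingServer master;
    private final NamingServer slave;
    private NamingServer bound;   // server this client is currently bound to, if any

    NamingClient(NamingServer master, NamingServer slave) {
        this.master = master;
        this.slave = slave;
    }

    /** Returns the server that will handle this client's next naming request. */
    NamingServer resolve() {
        if (bound != null && bound.up) {
            return bound;                 // an already bound client stays where it is
        }
        if (master.up) {
            bound = master;               // the Master is the preferred choice
        } else if (slave.up) {
            bound = slave;                // otherwise fail over to the Slave
        } else {
            throw new IllegalStateException("no naming server available");
        }
        return bound;
    }
}

class NamingFailoverDemo {
    public static void main(String[] args) {
        NamingServer master = new NamingServer("master");
        NamingServer slave = new NamingServer("slave");
        NamingClient oldClient = new NamingClient(master, slave);

        System.out.println(oldClient.resolve().name);   // master
        master.up = false;                              // Master terminates unexpectedly
        System.out.println(oldClient.resolve().name);   // slave
        master.up = true;                               // Master is revived
        System.out.println(oldClient.resolve().name);   // slave (stays bound)
        System.out.println(new NamingClient(master, slave).resolve().name);   // master (new client)
        slave.up = false;                               // Slave fails
        System.out.println(oldClient.resolve().name);   // master
    }
}
```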

The transparent switchovers are accomplished using a combination of bootstrapping protocols that are part of the CORBA Interoperable Naming Service (INS) specifications, and Inprise’s own VisiBroker Smart Agent technology.

8 The BAS Solution for Replicating EJBs and EJB Containers

The previous section explained how a replicated, highly available Naming Service with support for clustering allows clients to transparently fail over (a) across naming services, and (b) across multiple object references in the same cluster.

What about state management though? We must consider all three kinds of EJBs: stateless session beans, stateful session beans, and entity beans.

8.1 Clustering Stateless Session Beans

This is the simplest of all cases, and in fact, almost trivial. By definition, stateless session beans are not allowed to maintain application state on behalf of a client. Indeed, the EJB specification has formalized the abstraction of a stateless session bean because stateless services are likely to be more scalable than stateful services.

Stateless session beans of the same type in multiple containers are failover replicas of each other. When clients are using stateless session beans in one container, upon failure of those beans (or the container housing them), the clients will transparently fail over to beans of the same type in other containers.

Since the beans are stateless, it does not matter to the client which stateless bean gets used. Stateless session beans of the same type are effectively interchangeable with each other. There are no state management issues during failover.
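
As a rough illustration of why stateless failover needs no state management, the sketch below retries a call on the next replica when one is unavailable. It is a toy model, not BAS code: `TaxCalculatorBean`, `ContainerDown` and `invoke_with_failover` are hypothetical names introduced only for this example.

```python
class ContainerDown(Exception):
    """Raised when a container (or the bean inside it) is unavailable."""


class TaxCalculatorBean:
    """Hypothetical stateless bean: the result depends only on the arguments."""
    def __init__(self, container_name):
        self.container_name = container_name
        self.available = True

    def tax(self, amount_cents, rate_percent):
        if not self.available:
            raise ContainerDown(self.container_name)
        return amount_cents * rate_percent // 100


def invoke_with_failover(replicas, method_name, *args):
    """Try each replica in turn; any one of them can serve a stateless call."""
    for bean in replicas:
        try:
            return getattr(bean, method_name)(*args)
        except ContainerDown:
            continue  # nothing to recover: the bean held no client state
    raise ContainerDown("all replicas are unavailable")


replicas = [TaxCalculatorBean("container-1"), TaxCalculatorBean("container-2")]
before = invoke_with_failover(replicas, "tax", 2000, 8)
replicas[0].available = False            # container-1 fails
after = invoke_with_failover(replicas, "tax", 2000, 8)
print(before, after)                     # same answer from either replica
```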

More sophisticated approaches are required for Entity Beans and Stateful Session Beans.

8.2 Clustering Entity Beans

Entity Beans represent data, of course, and are generally persisted to databases. Therefore, unlike stateless session beans, an entity bean is not generally interchangeable with another entity bean of the same type. The fact that the data of entity beans is readily available via a database, however, can be used by application servers to accomplish failover relatively easily.

Before allowing clients to fail over from entity beans in one container to entity beans of the same type in a different container, the application server must “load” the replica with the proper state, from the database. (Whether the bean has been coded to use Bean Managed Persistence or Container Managed Persistence is not a factor.)

What about intermediate states that the old entity bean(s) might have been in, when they failed? These changes to the entity bean state may not yet have been written to the database. How does the application server determine what the “correct state” of a failed entity bean is? The answer lies in the fact that a transaction was in progress when the failure of the entity bean occurred. The “correct state” of the failed entity bean is the state at the end of the last successful transaction. Conveniently, this state is in the database.

Therefore, during the failover, the “current transaction” will fail, as it should; because the original entity bean has become unavailable, the transaction manager will presume rollback (as per the OTS/JTS specification) and roll back the current transaction.

Subsequent transactions will automatically fail over to use an entity bean in another container. Before this happens, a copy of the proper entity bean will be instantiated in the other container and the last consistent state (the state after the last successful transaction) for that bean will be loaded into it.
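
The sequence above (commit, failure mid-transaction, reload of the last committed state) can be sketched as follows. This is an illustrative model under simplifying assumptions, not the BAS container logic; `Database` and `AccountEntity` are hypothetical stand-ins, and the "rollback" is modelled simply by discarding the in-memory instance.

```python
class Database:
    """Hypothetical stand-in for the database: holds only committed state."""
    def __init__(self):
        self.rows = {}

    def commit(self, key, state):
        self.rows[key] = dict(state)

    def load(self, key):
        return dict(self.rows[key])


class AccountEntity:
    """Hypothetical entity bean instance living in one container."""
    def __init__(self, key, db):
        self.key = key
        self.db = db
        self.state = db.load(key)   # "load" the replica with the committed state

    def deposit(self, amount):
        self.state["balance"] += amount   # in-memory change, not yet committed

    def commit(self):
        self.db.commit(self.key, self.state)


db = Database()
db.commit("acct-1", {"balance": 100})

bean_c1 = AccountEntity("acct-1", db)     # instance in container 1
bean_c1.deposit(50)
bean_c1.commit()                          # transaction 1 succeeds

bean_c1.deposit(25)                       # transaction 2 is in progress ...
del bean_c1                               # ... when container 1 fails; the
                                          # uncommitted change is discarded

bean_c2 = AccountEntity("acct-1", db)     # replica instantiated in container 2
print(bean_c2.state["balance"])           # state after the last successful transaction
```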

8.3 Clustering Stateful Session Beans

Stateful session beans can pose the greatest challenge to cluster. This is because they are neither stateless (like stateless session beans are), nor is a consistent snapshot of their state guaranteed to be in the database (like it is for entity beans). For this reason, application server products vary greatly in their level of support for clustering stateful session beans. Many products do not allow the clustering of stateful session beans at all.

Borland AppServer provides very good support for clustering stateful session beans. Part of the solution is Inprise’s JDataStore, an all-Java relational database with a Type 4 JDBC driver, which allows the JDataStore database to be either local or remote.

Borland AppServer offers a “Session Storage Service” based on the all-Java JDataStore. Stateful session beans are periodically passivated (i.e., the state of the stateful session beans is periodically written) into this Session Storage Service. The period is configurable by the administrator. No special coding is required on the part of the author of the stateful session bean to take advantage of this facility.

Furthermore, the Session Storage Service can be configured to run (a) “in-process” to a particular EJB container, or (b) “out-of-process” (perhaps even on a separate machine), and shared by multiple EJB containers potentially running on several different machines.

In configuration (b), it can be used to fail over stateful session beans from one container to another, along with their state. Just as replicas of failed entity beans can be “loaded” with the proper state retrieved from the database, replicas of failed stateful session beans can be loaded with the proper state retrieved from the Session Storage Service.

As an example scenario, assume that we have two Containers running replicated Stateful Session Beans, and the Session Storage Service running as a separate process. Let's say a client creates a shopping cart (a Stateful SB) in Container 1. The shopping cart will be automatically passivated every 5 seconds (a tunable parameter). Now suppose that the client puts a book into the shopping cart, and then thinks for a while. Within 5 seconds, the shopping cart will be stored persistently in JDataStore. Now, suppose that Container 1 crashes. The shopping cart session bean will automatically fail over to Container 2. Container 2 will notice that it does not have the state for the user's shopping cart loaded, and it will load the state from the Session Storage Service (running as a separate process). The Session Storage Service can be put on a reliable machine, and because it contains no user code, it is less likely to crash than servlet engines or EJB containers. The client will now continue using the shopping cart with its contents intact.
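
The shopping cart scenario can be simulated with a simple clock-driven model. This is a sketch under stated assumptions rather than the BAS mechanism: `SessionStorage`, `Container`, `cart` and `tick` are hypothetical names, time is passed in explicitly instead of using a real timer, and the cart is just a list of item names.

```python
class SessionStorage:
    """Hypothetical stand-in for the out-of-process Session Storage Service."""
    def __init__(self):
        self.saved = {}

    def store(self, session_id, state):
        self.saved[session_id] = list(state)

    def load(self, session_id):
        return list(self.saved.get(session_id, []))


class Container:
    """Hypothetical EJB container holding shopping carts in memory."""
    def __init__(self, storage, passivation_interval=5):
        self.storage = storage
        self.interval = passivation_interval   # seconds, tunable
        self.carts = {}
        self.last_passivation = 0

    def cart(self, session_id):
        # On a miss (e.g. after failover), load state from the storage service.
        if session_id not in self.carts:
            self.carts[session_id] = self.storage.load(session_id)
        return self.carts[session_id]

    def tick(self, now):
        # Called with the current time; writes all carts once per interval.
        if now - self.last_passivation >= self.interval:
            for session_id, items in self.carts.items():
                self.storage.store(session_id, items)
            self.last_passivation = now


storage = SessionStorage()
c1, c2 = Container(storage), Container(storage)

c1.cart("user-42").append("book")   # t = 1s: client adds a book in Container 1
c1.tick(now=5)                      # t = 5s: periodic passivation runs
del c1                              # t = 7s: Container 1 crashes

recovered = c2.cart("user-42")      # Container 2 loads the last stored state
print(recovered)
```

In this toy model, anything added to the cart after the most recent passivation would not be in the storage service at the time of the crash, which is why the passivation period is worth tuning.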

8.4 The Importance of Distributed Transactions

It is important that an application server support distributed transactions (two-phase commit), because they can become necessary in failover situations.

Imagine that a client is invoking a (stateless or stateful) session bean, S1, in container C1, which is in turn invoking entity beans E1 and E2, both in container C2. S1 involves both E1 and E2 in certain transactions that it must perform on behalf of its client. Now imagine that container C2 fails. Suppose that replicas of E1 and E2 do exist, in containers C3 and C4, each running on different machines. S1 can indeed fail over to use those E1 and E2 replicas, but because E1 and E2 now live in different containers and must still participate in the same transaction, the failover situation has created the need for distributed transactions.

Because Borland AppServer does support distributed transactions, it does allow for such situations if the administrator wishes, and automatically performs distributed transactions when necessary.
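
For readers unfamiliar with two-phase commit, the sketch below shows the basic idea in the E1/E2 scenario: every participant must vote yes in the prepare phase before any of them commits. It is a deliberately simplified illustration (no logging, timeouts, or recovery) and is not the BAS transaction manager; `Participant` and `two_phase_commit` are hypothetical names.

```python
class Participant:
    """Hypothetical resource taking part in a distributed transaction,
    e.g. entity bean E1 in container C3 or E2 in container C4."""
    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.outcome = None

    def prepare(self):
        return self.can_commit      # phase 1: vote yes or no

    def commit(self):
        self.outcome = "committed"  # phase 2, reached only if all voted yes

    def rollback(self):
        self.outcome = "rolled back"


def two_phase_commit(participants):
    """Commit only if every participant votes yes; otherwise roll all back."""
    if all(p.prepare() for p in participants):
        for p in participants:
            p.commit()
        return True
    for p in participants:
        p.rollback()
    return False


e1, e2 = Participant("E1@C3"), Participant("E2@C4")
ok = two_phase_commit([e1, e2])
print(ok, e1.outcome, e2.outcome)

e1b, e2b = Participant("E1@C3"), Participant("E2@C4", can_commit=False)
ok_b = two_phase_commit([e1b, e2b])
print(ok_b, e1b.outcome, e2b.outcome)
```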

Most other application server products support the Java Transaction Service (JTS) and the related JTA APIs, but do not in fact support distributed transactions. They therefore are forced to impose arbitrary limitations in their failover support, lest situations that need distributed transactions arise in their environment.

9 The BAS Solution for Managing Replicated and Load Balanced Services

9.1 AppCenter

Inprise’s AppCenter product is a distributed application management solution that works in conjunction with BAS. It ensures that the correct number of BAS and EJB container instances are available, according to the application’s fault tolerance and load balancing requirements. This is done on a per-application basis by using a unique model-based architecture that allows developers and managers to define important characteristics of the application, such as its load balancing and fault tolerance needs. While AppCenter is a separate product, it is often sold in conjunction with BAS in order to provide a complete, managed solution.

9.2 Load Balancing

While it is very important to distribute the client transaction load evenly between the currently available services, it is equally important to ensure that the correct number of services are actually available in order to meet the current client needs. It is, for example, pointless to evenly distribute the client load between two EJB containers if they are both already overloaded. What is needed at this point is for another EJB container to be started on a different host, so that it can handle some of the extra load. AppCenter can start and stop extra services, such as BAS instances and EJB containers, depending on current load conditions, guaranteeing that an adequate number of services are always available.
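
The two-overloaded-containers argument comes down to simple arithmetic, sketched below. The functions `containers_needed` and `scaling_decision` and the capacity figures are assumptions made for illustration; they do not describe how AppCenter actually measures load or decides when to start services.

```python
import math


def containers_needed(current_load, capacity_per_container):
    """Smallest number of containers that keeps each one at or below capacity."""
    if capacity_per_container <= 0:
        raise ValueError("capacity_per_container must be positive")
    return max(1, math.ceil(current_load / capacity_per_container))


def scaling_decision(running, current_load, capacity_per_container):
    """Return how many containers to start (positive) or stop (negative)."""
    return containers_needed(current_load, capacity_per_container) - running


# Two containers, each assumed to handle 100 requests/sec, facing 250 requests/sec:
# even a perfectly even split leaves each at 125, so one more must be started.
to_start = scaling_decision(running=2, current_load=250, capacity_per_container=100)
print(to_start)
```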

9.3 Fault Tolerance

At a macro level, fault tolerance can be seen as the guaranteeing of service availability. AppCenter can periodically monitor the current status of BAS instances, EJB containers and EJBs in order to determine if they are currently working correctly. AppCenter can also be configured to restart services upon failure, or if restart is impossible (for example, the host computer has gone down), to start additional services on other hosts. Failover rules can be entered into AppCenter via the modeling interface.
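
The restart-or-relocate rule described above might be expressed along these lines. This is only a sketch of the decision logic: `plan_recovery`, the host names, and the tuple it returns are hypothetical and are not AppCenter's rule format or API.

```python
def plan_recovery(service, failed_host, host_is_up, spare_hosts):
    """Restart in place if the host is still up; otherwise pick another host."""
    if host_is_up(failed_host):
        return ("restart", service, failed_host)
    for host in spare_hosts:
        if host_is_up(host):
            return ("start", service, host)
    return ("alert", service, None)   # nothing left to try automatically


# Hypothetical host status table; in practice this would come from monitoring.
up = {"host-a": True, "host-b": False, "host-c": True}

plan_1 = plan_recovery("ejb-container-1", "host-a", up.get, ["host-c"])
plan_2 = plan_recovery("ejb-container-2", "host-b", up.get, ["host-a", "host-c"])
print(plan_1)
print(plan_2)
```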

10 Conclusions

This paper has reviewed common issues related to clustering (replication, load balancing, high availability and failover) of static web resources such as HTML pages and simple CGI scripts, servlets and JSPs, and the more complex enterprise beans. It has shown that Borland AppServer provides a robust set of features that solve these problems well, without the arbitrary limitations that competitors’ solutions often contain.

BAS 4 supports heterogeneous clustering where any number of beans can be clustered across any number of Application Servers. Heterogeneous clustering imposes no restrictions on the types of beans or drivers that can appear on the various Application Servers within the cluster.

This is in stark contrast to the clustering approach of most other application servers, which allow only homogeneous clustering: they impose restrictions on the way that beans can be clustered across servers. In these products, each Application Server in the cluster must contain the exact bean and JDBC driver configuration that every other Application Server in the cluster contains. In fact, each server must have the exact same EJB services running and they must even be started in the exact same order in order to create a cluster. In real-world situations, it is very reasonable to expect an application server in a cluster located in London to write data to a database in London, and another server in the cluster in New York to write data to a database in New York. This scenario is just not possible using the clustering features of many application servers. Some products have the further restriction that unless the servers in New York and London are located on the same LAN and can be accessed via IP multicast, they cannot be clustered at all.

With Borland AppServer, Inprise leverages the years of experience in building scalable middleware such as the award-winning VisiBroker (the most deployed CORBA product), which included robust implementations of load balancing, high availability and failover. With Borland AppServer, these features have been taken to the “next level” with stronger integration with enterprise naming service and special attention to the clustering of servlets, JSP, and EJBs.

Borland AppServer delivers scalability and high availability for your E-commerce applications, without limits.

11 Appendix 1: HTTP- and IP-Level Load Balancing Products

Although HTTP- and IP-level load balancing products are as diverse as the vendors that make them, they fall into the following basic categories:

• Appliances (dedicated router-like devices). Examples include Radware’s Web Server Director, F5 Networks’ Big/ip, Coyote Point’s Equalizer, HydraWeb’s Hydra5000, and IPivot’s Intelligent Broker 4000.

• Intelligent Switches (LAN switches with programmable functionality). Examples include Alteon’s ACEdirector 2 and HolonTech’s HyperFlow 2.

• Load Balancing Software. Examples include Resonate’s Central Dispatch for Sun, NT, and AIX.

• Hosting and Caching Services. Examples include Akamai’s FreeFlow.

The text below refers to “the Network World Test,” which is documented at http://www.nwfusion.com/reviews/0614rev.html

11.1 Appliances (dedicated router-like devices)

Appliances are load balancing and Internet traffic management applications packaged with more-or-less standard hardware and sold as a dedicated device. By far, the majority of load balancing and traffic management products are sold in this form. This paper therefore spends more time discussing these than the other categories.

An appliance is generally the product form of choice for most network managers because the routers and other network components they are already using are of this form. Appliances are dedicated computer systems that provide specific functions. They are engineered to be easy to install and operate.

If you were to look inside such an appliance, you would see a computer system that resembled a personal computer: a Pentium processor (or even an entire PC motherboard), 100MB or so of RAM, and two or more network interface cards (NICs). In contrast to a PC, the network appliance runs a specialized, pared-down operating system (obviously not Windows). The appliance may not even have a disk drive, but may instead use non-volatile Flash RAM to store the applications and the configuration data. Compared to a PC, the network device’s hardware and operating system are simpler, and for that reason, the appliance typically has higher performance and is more reliable than an otherwise comparable PC or workstation application.

Because the software and hardware of an appliance are installed and tested at the factory, the performance can be accurately characterized, and the configuration problems that typify PCs can be avoided.

The only limitations on the throughput of appliances are the throughput of conventional LAN links (100 Mbit/sec) and the throughput of existing NICs. For the biggest web sites with very high capacity Internet links, these may be significant limitations. For web sites in the T-1 to T-3 range (2 to 45 Mbits/sec), appliance throughput is typically more than adequate; most web sites fall at the low end of this range.

11.1.1 Radware Web Server Director Pro+ http://www.radware.co.il/

Radware’s Web Server Director Pro won second place in the Network World tests. It is one of the most scalable and feature-rich products, in all categories. It is easy to configure and manage. You can configure Web Server Director as a router, using two interfaces to pass all packets to and from a secure network, or as a server, redirecting connections to web servers. If you need to scale beyond its capacity as a router, you can configure Web Server Director like a software load balancer. The GUI lets you configure NAT and several other load-balancing options. The product also allows testing of response time for remote web site balancing. Web Server Director can provide tight security for itself and the cluster of servers it services in router mode, but not when it is acting as a server.

11.1.2 F5 Networks Big/ip http://www.radware.co.il/

Strong security distinguishes F5 Networks’ Big/ip High Availability+ Single Controller 2.0.1. This Unix-based device works very well as a router and includes an extensive packet filter. It can be hard to configure for someone with limited Unix experience, but for routine maintenance and configurations, it is not too laborious. Unfortunately, Big/ip offers limited tools for monitoring web servers and provides no trending data. Big/ip has only two network interfaces, one internal and one external. This could limit its throughput when configured as a router, causing performance problems in high-bandwidth situations, such as with a T-3 connection.

11.1.3 Coyote Point Equalizer E250 http://www.coyotepoint.com/

The Equalizer E250 is a basic web server load balancer which is simple to operate and does not require any training. It is easy to install and manage, but you won’t find many advanced features and its interface is a bit plain. Like Big/ip, Equalizer E250 has only two network interfaces, which can potentially be a performance problem. Equalizer features a web-based management utility that is fast but plain, providing all the necessary configuration options for load balancing. Equalizer offers innovative data tracking and plotting of historical statistics, which gives you a good idea of how traffic is spiking and when servers are being overloaded.

11.1.4 HydraWeb Hydra5000 http://www.hydraweb.com/

Hydra5000 offers strong performance and tight security, but the product lacks a GUI and a web browser interface. It overcomes Big/ip’s and Equalizer E250’s limitation of only two network interfaces with its four-port router. HydraWeb also offers an optional global load-balancing management tool, which provides enterprise scalability, site-level resiliency, traffic prioritization and disaster recovery. Like Big/ip, Hydra5000 requires you to configure its load balancer from a Unix command line, but the product will soon also have a GUI.

11.1.5 IPivot Intelligent Broker 4000 http://www.ipivot.com/

IPivot’s Intelligent Broker 4000 has a lot of useful features, but according to Network World, it lacks the polish of the leaders. As with Radware’s Web Server Director, you can configure Intelligent Broker 4000 like a router, using its single interface for internal and external connections, and switch to server mode to enhance performance. Intelligent Broker requires a fair knowledge of Unix commands for configuration and maintenance. It does provide a web-based interface, which is adequate for day-to-day management and helps you add servers to the cluster. Like Web Server Director, Intelligent Broker can provide tight security for itself and the cluster of servers it services in router mode, but not when it is acting as a server.

11.2 Load Balancing Software

Just as network people prefer appliances, software people prefer software. Software solutions can have lower initial cost, since they can be installed on existing hardware.

Because the solution has not been packaged as an appliance, it can potentially execute more complex logic while making load-balancing and traffic management decisions. Additional smarts (new algorithms, etc.) can be added post-installation, as modules or upgrades.

Software solutions can be modified and adapted, which is both a benefit and a liability. The flexibility is an advantage if it allows the product to do something it wasn’t exactly designed for, but it is a disadvantage if it leads to reliability or cost-of-ownership problems.

Software load balancers accept connection requests and then hand the connections over to the web server chosen in the balancing scheme. In this way, software load balancers avoid having to examine each packet to make load-balancing decisions. Thus, for the same processing power, a software load balancer is often able to handle roughly twice the web service requests of an Appliance or Switch. (Appliances and Switches try to compensate by having more processing power on board.)

11.2.1 Resonate Central Dispatch http://www.resonate.com/

In the Network World test, Resonate’s Central Dispatch was the fastest product under the greatest load, and won the “Blue Ribbon” award. Central Dispatch v2.2.1b is easy to install and manage. Its security features were weak, however, and did not include Network Address Translation (NAT) at the time of the test. Configuring Central Dispatch is easy with Resonate’s web-based Java GUI. Because Central Dispatch has agents running on each server, the product allows control of balancing based on server performance. For example, you can shift load based on open connections, CPU speed and CPU utilization.

11.3 Switches

Switching devices are typically much more expensive than the appliances or software described above. They typically have higher throughput than appliances or software, and they can include custom silicon parts that permit more complex load balancing and traffic management decisions to be made, based, for example, on the content of the messages.

Switch-based load balancers scale well. The switches either put web servers on their own switched ports or cluster the servers on hubs connected to the switch, with multiple connections to the WAN interface.

Neither of the two switches mentioned below supports remote web site load balancing, while all the other products mentioned above do.

11.3.1 Alteon ACEdirector 2 http://www.alteonwebsystems.com/

Alteon’s ACEdirector 2 won third place in the Network World test. ACEdirector has Layer 3 switching and NAT capabilities, but is missing the “global” load balancing capability that many of the appliances have. ACEdirector 2 has eight ports for servers, and you can add another switch if you need to support more connections. ACEdirector 2’s web-based interface is well-designed and intuitive, although it could be a bit more responsive. One of the few drawbacks of ACEdirector is that it does not provide server performance history.

11.3.2 HolonTech HyperFlow 2 http://www.holontech.com/

HyperFlow 2, which is a 16-port load-balancing switch, does everything reasonably well, but needs refining. The quality of the documentation is poor. HyperFlow’s configuration and management tools are almost as intuitive as those of ACEdirector 2. Not only does HyperFlow provide NAT, but it also has controls to secure the unit from all unauthorized protocols and addresses. In this respect, HyperFlow is like Unix platforms, but it is configured using a browser-based GUI.

11.4 Hosting & Caching Services

Hosting and caching services are an interesting outgrowth of Internet computing, strangely reminiscent of the mainframe-era service bureaus that gave companies an alternative to owning and operating their own computers.

Hosting services are the more traditional kind of services; caching services are newer and more interesting.

Hosting services come in a variety of forms. They all have one thing in common: computing facilities with fat connections to the Internet. Rather than hosting your content on a web server in your own facilities, you host all or part of it at a hosting service. The hosting services provide a network-friendly environment, including high-bandwidth Internet connections, convenient rack mounting for computers and network devices, and convenient cabling and reconfiguration. Most hosting services also provide reliable electrical power, multiple redundant connections to the Internet, and other forms of disaster protection.

Caching services take advantage of the fact that over 75% of most web pages contain embedded objects (images etc.) that do not change often. They provide easy-to-use methods for specifying which portions of your content should be cached by the service. The web server at your facility is then relieved from serving these portions of content. The caching service caches and serves it in a far more efficient manner than your web site could.

The groundbreaking example of such a caching service is Akamai’s FreeFlow service.

11.4.1 Akamai FreeFlow http://www.akamai.com/

For a web site deployer, adopting FreeFlow is non-intrusive. Companies select content to be served by Akamai with an easy-to-use software utility called Launcher. Simply put, Launcher tags objects within a Web page that are to be served over the Akamai network. When customers request those objects, the requests are sent to one of the servers in Akamai’s network, generally the one closest to the requestor, where those objects are cached.

In more detail, when a user requests a page containing objects tagged to be served by Akamai, his or her browser automatically points to an Akamai server rather than to the customer's central site. Based on Akamai's real-time network map, FreeFlow directs requests to the Akamai server best able to satisfy each request, resulting in better performance and reliability than relying just on the central site. This process, which uses standard Internet protocols, is transparent to all browsers and does not require any plug-ins or user configuration. Akamai has developed algorithms to disperse content to the servers from the central site in a way that they claim guarantees that no server is ever overloaded by requests. As the number of requests for a document increases, so does the number of servers containing copies.

FreeFlow's sophisticated algorithms generate a unique map of current Internet traffic conditions, the loads of all Akamai servers worldwide, and the locations of Internet users. Akamai's global map is constantly updated (as frequently as once per second, the company claims), ensuring that FreeFlow instantly responds to Internet outages and congestion.

12 About The Author

This white paper was written by CustomWare at Inprise’s invitation to review their BAS technology.

Salil Deshpande is President & Chief Technical Officer of CustomWare, a company that provides training & consulting on Enterprise Java Beans (EJB) and other Enterprise Java and J2EE technologies, including CORBA & IIOP. CustomWare is an authorized training and consulting partner for most of the enterprise Java vendors, such as Borland/Inprise, BEA/WebLogic, IBM, Sun Microsystems, Persistence Software, and many others. Prior to CustomWare, Salil was the President of a training & consulting company focusing on CORBA, which was acquired by CORBA vendor Visigenic Software. Salil received an MS in EE/CS from Stanford University in 1991 and a BS in EE/CS from Cornell University in 1989.
