IBM Systems Magazine

8/8/2019 IBM Systems Magazine


For the purposes of this article, we'll focus on the first purpose, as most users of HACMP use

the product for availability purposes. One of the obvious benefits of HACMP is that it

provides availability during planned system outages. As most systems administrators know,

planned outages (thankfully) come much more frequently than unplanned outages. Planned

outages include outages taken for systems software updates, application updates/upgrades

and firmware upgrades. As partition mobility will soon provide the ability to account for

planned outages, the necessity of HACMP moving forward will be primarily for unplanned

outages.

HACMP is certainly not for everybody. While it's a mature, proven product, even the most

experienced system administrators will tell you that configuring and maintaining HACMP

clusters isn't for the faint of heart. While much cheaper than competing products such as

VERITAS Cluster Server (also available for AIX and System p), HACMP still comes with

a cost, much of which actually lies outside of the licensing and associated software costs. This

cost includes the funds necessary to train IT staff in HACMP, probable consulting costs

incurred during cluster installation and configuration and other related maintenance costs.

Further, HACMP is only really necessary in environments that must have continuous

availability. When deciding whether to use HACMP, you must consider the cost of deploying

it versus the cost of having your systems down for four to eight hours while your hardware is

being fixed. Other important considerations include the actual cost of failure to your

environment, and what applications must be highly available. The bottom line is, if you can

afford the downtime, then you don't really need HACMP. If your application absolutely

can't afford to be down, then you may not be able to live without it. (We should also note

that many applications, including Oracle, also provide availability solutions at the application

layer.)
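
The cost comparison described above can be sketched as simple arithmetic. This is only an illustration: the function names and every figure below are hypothetical, and a real assessment would also weigh training, consulting and maintenance costs.

```python
def downtime_cost(outages_per_year, hours_per_outage, cost_per_hour):
    """Expected yearly cost of unplanned downtime without a cluster."""
    return outages_per_year * hours_per_outage * cost_per_hour


def hacmp_worth_it(yearly_cluster_cost, outages_per_year, hours_per_outage, cost_per_hour):
    """True when the expected downtime cost exceeds the yearly cost of the cluster."""
    expected = downtime_cost(outages_per_year, hours_per_outage, cost_per_hour)
    return expected > yearly_cluster_cost


# Hypothetical figures, for illustration only.
low_impact = hacmp_worth_it(
    yearly_cluster_cost=100_000, outages_per_year=1, hours_per_outage=8, cost_per_hour=2_000
)
high_impact = hacmp_worth_it(
    yearly_cluster_cost=100_000, outages_per_year=1, hours_per_outage=4, cost_per_hour=50_000
)
print(low_impact, high_impact)
```

With these made-up numbers, an application that loses little per hour doesn't justify the cluster, while one that loses a great deal per hour does, even with a shorter outage.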

How does HACMP work?

While HACMP can support up to 32 nodes (8 for Linux), the vast majority of configurations

are two-node clusters, in which one node functions as the failover node for the primary node.

In HACMP lingo, that means one active and one standby node are running, both using the

same shared disk. See Figure 1.

This figure illustrates a two-node IBM AIX HACMP environment running Oracle, consisting

of an active and a standby server. Mutual takeover configurations, where both nodes are

running applications and backing each other up, aren't as common, though they certainly also

work well. When configuring your cluster, you must account for and plan your applications,

cluster topology, network connectivity, shared storage devices, shared LVM components,

resource groups, cluster event processing and ultimately the clients themselves.

Each resource is defined as being part of a resource group, which is then configured to have

relationships with its nodes. Depending on this relationship, resources can be defined in four

different ways: cascading, cascading without fallback, rotating or concurrent access. When

the primary server goes down because of a failover event, the HACMP software on the

standby system recognizes this event and starts to take action, usually taking over the service

IP address of the primary server. The HACMP software will also mount the shared

filesystem on the standby system and start up its applications.
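
The takeover sequence just described can be modeled with a toy sketch. It is not HACMP code or configuration: the classes, function names, IP address and paths are all hypothetical, and it only mirrors the three steps named above (take over the service IP address, mount the shared filesystem, start the applications).

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ResourceGroup:
    """Hypothetical bundle of resources that moves between nodes as a unit."""
    service_ip: str
    filesystems: List[str]
    applications: List[str]


@dataclass
class Node:
    name: str
    up: bool = True
    service_ip: Optional[str] = None
    mounted: List[str] = field(default_factory=list)
    running: List[str] = field(default_factory=list)


def acquire(node, group):
    """Bring the resource group online on a node."""
    node.service_ip = group.service_ip          # take over the service IP address
    node.mounted = list(group.filesystems)      # mount the shared filesystems
    node.running = list(group.applications)     # start the applications


def release(node):
    """Mark the node as no longer holding any resources."""
    node.service_ip = None
    node.mounted = []
    node.running = []


def fail_over(primary, standby, group):
    """Move the group to the standby, but only if the primary is seen as down."""
    if primary.up:
        return False
    release(primary)
    acquire(standby, group)
    return True


group = ResourceGroup("10.0.0.50", ["/oradata"], ["oracle"])
primary, standby = Node("nodeA"), Node("nodeB")
acquire(primary, group)

primary.up = False                               # simulated failure of the primary
moved = fail_over(primary, standby, group)
print(moved, standby.service_ip, standby.mounted, standby.running)
```

Because clients connect to the service IP address rather than to a particular node, they reach the application on whichever node currently holds the resource group.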

In a typical cascade relationship, the standby server remains operational until the HACMP

software on the standby system recognizes that the primary system is operational and falls


back. In this relationship, one would plan for when they want to run the application again on

the primary server. While you can theoretically have both hosts function as logical partitions

on one System p frame, that would defeat the purpose of having hardware availability, so

let's assume these partitions are on two separate frames. It's also important to note that in the

event that the standby server is configured to back up several primary servers, the failover

node must be configured to be able to service all available workloads. Note that because the

disk is shared, HACMP doesn't provide for disk availability. Your storage subsystem must

provide for that level of redundancy.
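
The sizing point for a standby that backs up several primary servers can be expressed as a simple check. The sketch below assumes the worst case, in which every primary fails at once. The function name, node names and capacity unit (CPU cores, for example) are hypothetical.

```python
def standby_can_cover(standby_capacity, primary_workloads):
    """Return True if the standby can host every primary's workload at once.

    Capacity and workloads use the same hypothetical unit, such as CPU cores.
    """
    return sum(primary_workloads.values()) <= standby_capacity


# Hypothetical workloads of three primary servers backed by one standby.
workloads = {"db_node": 8, "app_node": 4, "web_node": 2}

print(standby_can_cover(16, workloads))
print(standby_can_cover(12, workloads))
```

Here the combined workload is 14 units, so a 16-unit standby is sufficient and a 12-unit standby is not.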

Testing your HACMP cluster is one of the most important components of an HACMP deployment.

Before deploying HACMP in production, every possible scenario should be tested to ensure

that the cluster works the way it was designed to work. When validating that testing, it's not

enough for your UNIX administrators to say that it works. Functional applications teams must

be part of the HACMP validation process or else it really hasn't been tested adequately. The

purpose of HACMP isn't so much to ensure that filesystems or processes have started on

another box, but to ensure that every application you've identified continues to work in the event

of a failover, without any manual intervention.
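
One way to hold a test plan to that standard is to record, for each failure scenario, whether the administrators and the application team both signed off and whether any manual step was needed. The record layout, scenario names and function name below are hypothetical, not part of HACMP.

```python
def cluster_validated(results):
    """Accept the cluster only if every scenario passed for both teams, hands-off."""
    return all(
        r["admin_ok"] and r["app_team_ok"] and not r["manual_steps"]
        for r in results
    )


# Hypothetical test results, for illustration only.
results = [
    {"scenario": "primary node power loss", "admin_ok": True,
     "app_team_ok": True, "manual_steps": False},
    {"scenario": "network adapter failure", "admin_ok": True,
     "app_team_ok": False, "manual_steps": False},
]

failed = [r["scenario"] for r in results if not cluster_validated([r])]
print(cluster_validated(results), failed)
```

In this example the cluster isn't accepted, because the application team did not confirm the second scenario even though the administrators did.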

What about HACMP V5.4?

The most important enhancements of HACMP V5.4 include non-disruptive startup and fast

failure detection. With non-disruptive startup, one doesn't need to take down the application

when installing, upgrading and performing maintenance on HACMP. It's difficult to configure

anything in a production environment, so this was IBM's answer to providing flexibility

around service-level agreements. Fast failure detection lets users detect node failures much

more quickly than ever before.

Other improvements include:

New smart assists for DB2, Oracle and WebSphere, and a two-node assist to create a cluster

WebSMIT is now easier to configure and also provides a GUI

Improved cluster verification tools

Reintroduction of forced stop to help avoid resource conflicts by putting resource groups into an unmanaged state

HACMP/XD GLVM Multi-Link feature for improved data mirroring protection and performance

Concurrent mode access for simultaneous applications execution at a local site

Support for Linux on System p for the first time

What HACMP Can Do For You

What does all this mean for you? Clearly, IBM's recent POWER6 innovations show the

importance that IBM places on availability. At the same time, IBM continues to

innovate and enhance HACMP, its flagship HA product. If you need help installing and

configuring your cluster, contact your IBM Business Partner or IBM, which provides its own

High Availability Services, a fee-based service offering. Your Business Partner can also

usually sell you this offering at a discounted price.

As a user of System p systems, it's important that you understand the purpose behind the new

features and also how best to implement HA in your environment. Understand also that


HACMP is not fault tolerance. It's a step down from that, as it will take a few moments for

the standby server to start up. HA may not be for everyone, but it's an essential part of any

mission-critical application running on a System p platform. If you can't live with your

systems being down, don't leave home without it.

IBM Systems Magazine is a trademark of International Business Machines Corporation. The

editorial content of IBM Systems Magazine is placed on this website by MSP TechMedia

under license from International Business Machines Corporation.

2010 MSP Communications, Inc. All rights reserved.
