IBM Systems Magazine

  • 8/8/2019 IBM Systems Magazine

    For the purposes of this article, we'll focus on the first purpose, as most users of HACMP use

    the product for availability purposes. One of the obvious benefits of HACMP is that it

    provides availability during planned system outages. As most systems administrators know,

    planned outages (thankfully) come much more frequently than unplanned outages. Planned

    outages include outages taken for systems software updates, application updates/upgrades

    and firmware upgrades. As partition mobility will soon provide the ability to account for

    planned outages, the necessity of HACMP moving forward will be primarily for unplanned

    outages.

    HACMP is certainly not for everybody. While it's a mature, proven product, even the most

    experienced system administrators will tell you that configuring and maintaining HACMP

    clusters isn't for the faint of heart. While much cheaper than competing products such as

    VERITAS Cluster Server (also available for AIX and System p), HACMP still comes with

    a cost, much of which actually lies outside of the licensing and associated software costs. This

    cost includes the funds necessary to train IT staff in HACMP, probable consulting costs

    incurred during cluster installation and configuration and other related maintenance costs.

    Further, HACMP is only really necessary in environments that must have continuous

    availability. When deciding whether to use HACMP, you must consider the cost of deploying

    it versus the cost of having your systems down for four to eight hours while your hardware is being fixed. Other important considerations include the actual cost of failure to your

    environment, and what applications must be highly available. The bottom line is, if you can

    afford the downtime, then you don't really need HACMP. If your application absolutely

    can't afford to be down, then you may not be able to live without it. (We should also note

    that many applications, including Oracle, also provide availability solutions at the application

    layer.)
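
    The cost comparison described above can be made concrete with a rough calculation. The
    sketch below is only an illustration: the function names and every dollar figure are
    hypothetical placeholders, and a real assessment would also weigh costs that are harder to
    quantify, such as lost customer confidence.

```python
# All figures below are hypothetical placeholders, not data from the article.

def expected_downtime_cost(outages_per_year: float,
                           hours_per_outage: float,
                           cost_per_hour: float) -> float:
    """Expected yearly cost of unplanned downtime without a cluster."""
    return outages_per_year * hours_per_outage * cost_per_hour


def cluster_pays_off(downtime_cost: float, cluster_cost_per_year: float) -> bool:
    """True when the avoided downtime costs more than running the cluster."""
    return downtime_cost > cluster_cost_per_year


if __name__ == "__main__":
    # One hardware failure a year, six hours to repair (the midpoint of the
    # four-to-eight-hour window mentioned above), at an assumed $20,000/hour.
    downtime = expected_downtime_cost(1, 6, 20_000)
    # Assumed yearly cost of licensing, training, consulting and maintenance.
    cluster = 80_000
    print(downtime, cluster_pays_off(downtime, cluster))
```

    With these assumed inputs the downtime cost exceeds the cluster cost, which would argue for
    deploying HACMP. With a lower hourly cost of failure, the same comparison can just as
    easily come out the other way.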

    How does HACMP work?

    While HACMP can support up to 32 nodes (8 for Linux), the vast majority of configurations

    are two-node clusters, in which one node functions as the failover node for the primary node. In HACMP lingo, that means one active and one standby node are running, both using the

    same shared disk. See Figure 1.

    This figure illustrates a two-node IBM AIX HACMP environment running Oracle, consisting

    of an active and a standby server. Mutual takeover configurations, where both nodes are

    running applications and backing each other up, aren't as common, though they certainly also

    work well. When configuring your cluster, you must account and plan for your applications,

    cluster topology, network connectivity, shared storage devices, shared LVM components,

    resource groups, cluster event processing and ultimately the clients themselves.

    Each resource is defined as part of a resource group, which is then configured to have a relationship with its nodes. Depending on this relationship, resources can be defined in four

    different ways: cascading, cascading without fallback, rotating or concurrent access. When

    the primary server goes down because of a failover event, the HACMP software on the

    standby system recognizes this event and starts to take action, usually taking over the service

    IP address of the primary server. The HACMP software will also mount the shared

    filesystem on the standby system and start up its applications.
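
    The takeover sequence described above (service IP address, then shared filesystem, then
    applications) can be pictured with a toy model. This is not HACMP code and uses no HACMP
    interfaces: the ResourceGroup and Node classes, the take_over function and the sample
    address, mount point and application name are all hypothetical, chosen only to show the
    order of the steps.

```python
from dataclasses import dataclass, field


@dataclass
class ResourceGroup:
    """Hypothetical stand-in for the resources that move together."""
    service_ip: str
    filesystems: list[str]
    applications: list[str]


@dataclass
class Node:
    """Hypothetical cluster node that tracks what it currently holds."""
    name: str
    service_ips: set[str] = field(default_factory=set)
    mounted: set[str] = field(default_factory=set)
    running: set[str] = field(default_factory=set)


def take_over(standby: Node, group: ResourceGroup) -> list[str]:
    """Apply the takeover steps in order and return a log of what was done."""
    log = []
    # Step 1: the standby node takes over the service IP address.
    standby.service_ips.add(group.service_ip)
    log.append(f"{standby.name}: acquired service IP {group.service_ip}")
    # Step 2: the shared filesystems are mounted on the standby node.
    for fs in group.filesystems:
        standby.mounted.add(fs)
        log.append(f"{standby.name}: mounted {fs}")
    # Step 3: the applications are started on the standby node.
    for app in group.applications:
        standby.running.add(app)
        log.append(f"{standby.name}: started {app}")
    return log


if __name__ == "__main__":
    group = ResourceGroup("192.0.2.10", ["/oradata"], ["oracle"])
    for line in take_over(Node("nodeB"), group):
        print(line)
```

    The ordering matters in this model for the same reason it does in a real cluster: an
    application cannot start until the data it needs is mounted.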

    In a typical cascade relationship, the standby server remains operational until the HACMP

    software on the standby system recognizes that the primary system is operational and falls

    back. In this relationship, one would plan for when they want to run the application again on

    the primary server. While you can theoretically have both hosts function as logical partitions

    on one System p frame, that would defeat the purpose of having hardware availability, so

    let's assume these partitions are on two separate frames. It's also important to note that in the

    event that the standby server is configured to back up several primary servers, the failover

    node must be configured to be able to service all available workloads. Note that because the

    disk is shared, HACMP doesn't provide for disk availability. Your storage subsystem must

    provide for that level of redundancy.
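
    The sizing requirement for a standby node that backs up several primary servers can be
    checked with simple arithmetic. The sketch below assumes the worst case, in which every
    primary fails over at once, and uses processor count and memory as the only sizing
    metrics. The Workload class, the standby_shortfall function and all numbers are
    hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    """Peak resource needs of one primary server's workload (assumed metrics)."""
    name: str
    cpus: float
    memory_gb: float


def standby_shortfall(standby_cpus, standby_memory_gb, workloads):
    """Return (cpu_shortfall, memory_shortfall) for the standby node.

    Both values are zero when the standby can host every workload at once.
    """
    need_cpus = sum(w.cpus for w in workloads)
    need_mem = sum(w.memory_gb for w in workloads)
    return (max(0.0, need_cpus - standby_cpus),
            max(0.0, need_mem - standby_memory_gb))


if __name__ == "__main__":
    workloads = [Workload("database", 4, 32), Workload("app server", 2, 16)]
    # A standby with 8 processors and 32 GB has enough CPU but too little memory.
    print(standby_shortfall(8, 32, workloads))
```

    A non-zero shortfall means the standby node could not carry all of its primaries at the
    same time and needs more capacity before the cluster goes into production.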

    Testing your HACMP cluster is one of the most important components of an HACMP deployment.

    Before deploying HACMP in production, every possible scenario should be tested to ensure

    that the cluster works the way it was designed to work. When validating that testing, it's not

    enough for your UNIX administrators to say that it works. Functional applications teams must

    be part of the HACMP validation process or else it really hasn't been tested adequately. The

    purpose of HACMP isn't so much to ensure that filesystems or processes have started on

    another box, but that every application you've identified continues to work in the event

    of a failover, without any manual intervention.
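
    One way to keep application teams involved is to record their acceptance checks next to
    the system-level ones and run them all after each test failover. The sketch below is a
    hypothetical harness, not an HACMP tool: the validate_failover function and the check
    names are illustrative, and the lambdas are stubs standing in for real checks.

```python
from typing import Callable


def validate_failover(checks: dict[str, Callable[[], bool]]) -> list[str]:
    """Run every named check and return the names of those that failed.

    A check that raises is treated as a failure rather than aborting the run.
    """
    failed = []
    for name, check in checks.items():
        try:
            ok = check()
        except Exception:
            ok = False
        if not ok:
            failed.append(name)
    return failed


if __name__ == "__main__":
    # Stub checks: the first is a system-level check, the others are the kind
    # of application-level checks an application team would supply.
    checks = {
        "filesystem mounted": lambda: True,
        "database accepts a test query": lambda: True,
        "order entry screen loads for an end user": lambda: False,
    }
    print(validate_failover(checks))
```

    In this stubbed run the system-level check passes while an application-level check
    fails, which is exactly the situation that an administrators-only validation would miss.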

    What about HACMP V5.4?

    The most important enhancements of HACMP V5.4 include non-disruptive startup and fast

    failure detection. With non-disruptive startup, one doesn't need to take down the application

    when installing, upgrading or maintaining HACMP. It's difficult to configure

    anything in a production environment, so this was IBM's answer to providing flexibility

    around service-level agreements. Fast failure detection lets users detect node failures much

    more quickly than ever before.

    Other improvements include:

    • New smart assists for DB2, Oracle and WebSphere, and a two-node assist to create a cluster

    • WebSMIT is now easier to configure and also provides a GUI

    • Improved cluster verification tools

    • Reintroduction of forced stop to help avoid resource conflicts by putting resource groups into an unmanaged state

    • HACMP/XD GLVM Multi-Link feature for improved data mirroring protection and performance

    • Concurrent mode access for simultaneous applications execution at a local site

    • Support for Linux on System p for the first time

    What HACMP Can Do For You

    What does all this mean for you? Clearly, IBM's recent POWER6 innovations show the

    importance that IBM places on availability. At the same time, IBM continues to

    innovate and enhance HACMP, its flagship HA product. If you need help installing and

    configuring your cluster, contact your IBM Business Partner or IBM, which provides its own

    High Availability Services, a fee-based service offering. Your Business Partner can also

    usually sell you this offering at a discounted price.

    As a user of System p systems, it's important that you understand the purpose behind the new

    features and also how best to implement HA in your environment. Understand also that

    HACMP is not fault tolerance. It's a step down from that, as it will take a few moments for

    the standby server to start up. HA may not be for everyone, but it's an essential part of any

    mission-critical application running on a System p platform. If you can't live with your

    systems being down, don't leave home without it.

    IBM Systems Magazine is a trademark of International Business Machines Corporation. The

    editorial content of IBM Systems Magazine is placed on this website by MSP TechMedia

    under license from International Business Machines Corporation.

    © 2010 MSP Communications, Inc. All rights reserved.