Hacmp | IBM AIX PowerHA Introduction | basics | Demo PPT

Welcome to HACMP Introduction Demo Class Email: [email protected] Call us: +91 8099776681

Transcript of Hacmp | IBM AIX PowerHA Introduction | basics | Demo PPT

Page 1: Hacmp | IBM AIX PowerHA Introduction | basics | Demo PPT

Welcome to HACMP Introduction Demo Class

Email: [email protected] Call us: +91 8099776681

Page 2: Unit objectives

www.kerneltraining.com

After completing this unit, you should be able to:

Define High Availability and explain why it is needed

List the key considerations when designing and implementing a high availability cluster

Outline the features and benefits of HACMP for AIX

Describe the components of an HACMP for AIX cluster

Explain how HACMP for AIX operates in typical cases

Page 3: High Availability and HACMP concepts

After completing this topic, you should be able to:

Define High Availability

Recognize that eliminating single points of failure (SPOFs) is part of the HACMP implementation process

Outline the features and benefits of HACMP for AIX

Describe the HACMP concepts of topology and resources

Give examples of topology components and resources

Provide a brief description of the software and hardware components of a typical HACMP cluster


Page 4: So, what is High Availability?

High Availability characteristics:

The masking or elimination of both planned and unplanned downtime

The elimination of single points of failure (SPOFs)

Fault resilience and system hardening

No specialized hardware requirement

[Diagram: a client connected over a WAN to a production node/LPAR and a standby node/LPAR, with workload fallover between them.]

Page 5: Eliminating single points of failure

Cluster object      Eliminated as a single point of failure by:
Node                Using multiple nodes
Power source        Using multiple circuits or uninterruptible power supplies
Network adapter     Using redundant network adapters
Network             Using multiple networks to connect nodes
TCP/IP subsystem    Using non-IP networks to connect adjoining nodes and clients
Disk adapter        Using redundant disk adapters or multipath hardware
Disk                Using multiple disks with mirroring or RAID
Application         Adding a node for takeover; configuring an application monitor
VIO Server          Implementing dual VIO Servers
Site                Adding an additional site

The fundamental goal of (successful) cluster design is the elimination of single points of failure (SPOFs).
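The table above lends itself to a mechanical check. The following is a minimal sketch, assuming a hypothetical inventory dictionary that maps each kind of cluster object to the instances present. The function name `find_spofs`, the key names, and the sample inventory are illustrative assumptions and are not part of HACMP.

```python
# Hypothetical component kinds, loosely following the rows of the table above.
REDUNDANT_KINDS = (
    "node", "power_source", "network_adapter", "network",
    "disk_adapter", "disk", "vio_server", "site",
)


def find_spofs(inventory):
    """Return the component kinds that have fewer than two instances."""
    return [kind for kind in REDUNDANT_KINDS if len(inventory.get(kind, [])) < 2]


# Illustrative inventory with made-up names: only one network and one site.
example = {
    "node": ["nodea", "nodeb"],
    "power_source": ["circuit1", "ups1"],
    "network_adapter": ["en0", "en1"],
    "network": ["net_ether_01"],
    "disk_adapter": ["fcs0", "fcs1"],
    "disk": ["hdisk3", "hdisk4"],
    "vio_server": ["vios1", "vios2"],
    "site": ["site1"],
}
print(find_spofs(example))  # ['network', 'site']
```

Counting instances is only a first pass; redundant components that share a power circuit or a switch can still form a single point of failure.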

Page 6: High availability clusters (HACMP base)

System p and AIX RAS features include:
  Application and Partition Mobility
  First Failure Data Capture (FFDC)
  Dynamic CPU Deallocation
  Flexible Service Processor
  Redundant Power and Cooling
  Error Correction Checking Memory
  Hot Swap Adapters
  Dynamic Kernel
  Journaled Filesystem
  Redundant Data Paths

  Dual Disk Adapters (MPIO)
  Data Mirroring and/or Striping
  Hot Swap / Hot Spare Storage
  Redundant Power/Cooling for Storage Arrays

With High Availability Clustering (HACMP):
  Protection against node and OS failure with redundant nodes
  Protection against NIC failure with redundant network adapters
  Protection against network failure with redundant networks
  Self-healing clusters with application monitoring
  Protection against site failure (typically limited by SAN infrastructure), or no distance limitations with HACMP/XD

Page 7: What about site failure?

Limited distance (LVM mirroring and SAN): HACMP for AIX

Extended distance: Geographic Clustering Solution (that is, HACMP/XD)
  Distance unlimited
  Application, disk, and network independent
  Automated site failover and reintegration
  A single cluster across two sites
  Get more details in HACMP System Administration III (AU620)

[Diagram: two sites, Toronto and Brussels, with data replication via Metro Mirror/PPRC, GLVM, or GeoRM.]

Page 8: IBM's HA solution for AIX

HACMP for AIX characteristics:
  Stands for High Availability Cluster Multi-Processing
  Is based on cluster technology (RSCT)
  Provides two environments (which can coexist):
    Serial (High Availability): the process of ensuring that an application is available for use through the use of serially accessible shared data and duplicated resources
    Parallel (Cluster Multiprocessing): concurrent access to shared data

Page 9: Fundamental HACMP concepts

Topology: Physical “networking centric” components

Resources: Entities that are being made highly available

Resource group: A collection of resources, which HACMP controls as a single unit
  A given resource can appear in at most one resource group
  Resource group policies:
    Startup policy: determines which node the resource group is activated on
    Fallover policy: determines the target when there is a failure
    Fallback policy: determines fallback behavior

Customization: The process of augmenting HACMP, typically by implementing scripts
  Minimum: application start and stop scripts
  Optional:
    Application monitoring scripts (highly recommended!)
    Event customization: notification, pre- and post-event scripts, recovery scripts, user-defined events, time until warning (config_too_long timeout)
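The rule that a resource can belong to at most one resource group is easy to check before a configuration is entered. The following is a minimal sketch, assuming resource groups are described as a plain dictionary of group name to resource names. The function name and the sample configuration are hypothetical, and the duplicate entry is deliberate.

```python
def find_shared_resources(resource_groups):
    """Return {resource: [groups]} for resources listed in more than one group."""
    owners = {}
    for group, resources in resource_groups.items():
        for resource in resources:
            owners.setdefault(resource, []).append(group)
    return {res: groups for res, groups in owners.items() if len(groups) > 1}


# Made-up configuration: the volume group "dbvg" is listed twice on purpose.
groups = {
    "dbrg": ["dbvg", "database"],
    "httprg": ["httpvg", "webserv", "dbvg"],
}
print(find_shared_resources(groups))  # {'dbvg': ['dbrg', 'httprg']}
```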

Page 10: A highly available cluster

[Diagram: two nodes, each running clstrmgr, attached to shared storage; a resource group falls over from one node to the other.]

Fundamental concepts

A cluster comprises physical components (topology) and logical components (resource groups and resources).

Page 11: HACMP's topology components (1 of 2)

[Diagram labels: node, IP network, communication interface, non-IP network, communication device.]

The topology components consist of a cluster, nodes, and the technology that connects them together.

Page 12: HACMP's topology components (2 of 2)

[Diagram labels: servers and a PC on Ethernet/EtherChannel; non-IP links between servers (heartbeat on disk, RS232/422); IBM RS/6000 systems attached over Fibre Channel through a SAN to DS8000 and DS4000 storage.]

Node
  Any-to-any, including LPARs
  Minimum number of physical adapters for redundancy must be considered

Networking
  Ethernet
    Physical and virtual
    EtherChannel
  Non-IP
    Heartbeat on disk, RS-232, target-mode SCSI

Shared storage
  Physical
    SCSI or Fibre Channel
  Virtual SCSI
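The reason for pairing an IP network with a non-IP network can be sketched in a few lines. This is a simplified illustration, not the algorithm that RSCT topology services actually uses. The function name and the returned strings are assumptions made for the example.

```python
def classify_peer(ip_heartbeat_ok, non_ip_heartbeat_ok):
    """Guess the state of a peer node from heartbeats on two independent paths."""
    if ip_heartbeat_ok and non_ip_heartbeat_ok:
        return "peer up"
    if non_ip_heartbeat_ok:
        # Only the IP path is silent, so the node itself is still alive.
        return "peer up, IP path failed"
    if ip_heartbeat_ok:
        return "peer up, non-IP path failed"
    # Both independent paths are silent.
    return "peer presumed down"
```

With only an IP network, the second case could not be told apart from a node failure, which is why the TCP/IP subsystem counts as a single point of failure unless a non-IP path exists.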

Page 13: What is HACMP?

An application which:
  Controls where resource groups run
  Monitors and reacts to events
  Provides tools for cluster-wide configuration and synchronization
  Relies on other AIX subsystems (ODM, LVM, RSCT, TCP/IP, SRC, and so on)

[Diagram labels: Cluster Manager subsystem (clstrmgrES) with its topology manager, resource manager, event manager, and SNMP manager; RSCT (topsvcs, grpsvcs, RMC subsystems); snmpd; clinfoES; clcomdES; clstat.]

Page 14: Additional features of HACMP

HACMP is shipped with utilities to simplify configuration, monitoring, customization, and cluster administration.

[Diagram labels: OLPW, SMIT via web, Configuration Assistant, C-SPOC, DARE, clstrmgrES, SNMP, verification, auto tests, Tivoli integration, application monitoring.]

Page 15: Some assembly required

HACMP can be used out of the box; however, some assembly is required.

Minimum:
  Application start/stop/monitor scripts

Optional:
  Customized pre/post-event scripts
  Reaction to events:
    Error notification methods
    User-defined events (UDEs)
    Cluster state change

HACMP's flexibility allows for complex customization in order to meet availability goals.
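A custom application monitor is usually a small script whose exit status tells the cluster whether the application is healthy. The sketch below is written in Python for illustration only. It assumes the convention that exit status 0 means healthy and nonzero means failed, a PID file written by the application's start script, and a POSIX system. The function name is hypothetical, and in practice such monitors are often shell scripts.

```python
import os


def monitor_exit_code(pid_file):
    """Return 0 if the process recorded in pid_file exists, 1 otherwise."""
    try:
        with open(pid_file) as handle:
            pid = int(handle.read().strip())
        os.kill(pid, 0)  # signal 0 checks for existence without signalling
    except PermissionError:
        return 0  # the process exists but belongs to another user
    except (OSError, ValueError):
        return 1  # missing file, malformed content, or no such process
    return 0
```

A wrapper script would pass the result to `sys.exit()`. A process that exists is not necessarily serving requests, so a production monitor would typically also probe the application itself.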

Page 16: Let’s review

1. Which of the following items are examples of topology components in HACMP? (Select all that apply.)
  a. Node
  b. Network
  c. Service IP label
  d. Hard disk drive

2. True or False? All nodes in an HACMP cluster must have roughly equivalent performance characteristics.

3. Which of the following is a characteristic of high availability?
  a. High availability always requires specially designed hardware components.
  b. High availability solutions always require manual intervention to ensure recovery following fallover.
  c. High availability solutions never require customization.
  d. High availability solutions use redundant standard equipment (no specialized hardware).

4. True or False? A thorough design and detailed planning is required for all high availability solutions.

Page 17: Let’s review solutions

1. Which of the following items are examples of topology components in HACMP? (Select all that apply.)
  a. Node
  b. Network
  c. Service IP label
  d. Hard disk drive
  Answer: a and b

2. True or False? All nodes in an HACMP cluster must have roughly equivalent performance characteristics.
  Answer: False

3. Which of the following is a characteristic of high availability?
  a. High availability always requires specially designed hardware components.
  b. High availability solutions always require manual intervention to ensure recovery following fallover.
  c. High availability solutions never require customization.
  d. High availability solutions use redundant standard equipment (no specialized hardware).
  Answer: d

4. True or False? A thorough design and detailed planning is required for all high availability solutions.
  Answer: True

Page 18: What does HACMP do?

After completing this topic, you should be able to:

Describe the failures that HACMP detects directly

Provide an overview of the standby and takeover cluster configuration options in HACMP

Describe some of the considerations and limits of an HACMP cluster

Page 19: Just what does HACMP do?

HACMP functions:
  Monitors the states of nodes, networks, network adapters, and devices
  Strives to keep resource groups highly available
  Optionally, monitors the state of the applications, and can be customized to react to every possible failure

Page 20: What happens when something fails?

How the cluster responds to a failure depends on what has failed, what the resource group's fallover policy is, and whether there are any resource group dependencies:

Typically, another equivalent component takes over the duties of the failed component (for example, another node takes over from a failed node).

Page 21: What happens when a problem is fixed?

How the cluster responds to the recovery of a failed component depends on what has recovered, what the resource group's fallback policy is, and the resource group dependencies:

Typically, administrators need to indicate or confirm that the fixed component is approved for use.

Some components are integrated automatically; for instance, when a communication interface recovers.

Page 22: Standby (active/passive) with fallback

One node is primary

RG can be configured to come online on the primary or any node

[Diagram: placement of resource group A on nodes USA and UK as node USA fails, USA returns, node UK fails, and UK returns; one transition is labeled "(no change)".]

Page 23: Standby (active/passive) without fallback

Eliminates another outage

Reduces downtime

[Diagram: placement of resource group A on nodes USA and UK as USA fails, USA returns, UK fails, and UK returns.]
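The difference between the two standby configurations can be made concrete by replaying the same sequence of events with and without fallback. The following is a minimal sketch under simplifying assumptions: one resource group, a fixed node priority list, and fallover to the highest-priority surviving node. The function name and event encoding are hypothetical and do not reflect how the cluster manager is implemented.

```python
def simulate(priority, events, fallback):
    """Replay (node, "fails" | "returns") events; return the host after each one."""
    up = set(priority)
    host = priority[0]
    history = []
    for node, change in events:
        if change == "fails":
            up.discard(node)
            if host == node:
                # Fallover: move to the highest-priority node that is still up.
                host = next((n for n in priority if n in up), None)
        else:
            up.add(node)
            higher = host is not None and priority.index(node) < priority.index(host)
            if host is None or (fallback and higher):
                # Restart on the returning node, or fall back to it.
                host = node
        history.append(host)
    return history


events = [("USA", "fails"), ("USA", "returns"), ("UK", "fails"), ("UK", "returns")]
print(simulate(["USA", "UK"], events, fallback=True))   # ['UK', 'USA', 'USA', 'USA']
print(simulate(["USA", "UK"], events, fallback=False))  # ['UK', 'UK', 'USA', 'USA']
```

With fallback, the failure of USA moves the resource group twice (to UK, then back to USA). Without fallback it moves once, which is the outage that the second configuration eliminates.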

Page 24: Mutual takeover: Active/Active

Very common

No one node/LPAR is left idle

[Diagram: resource groups A and B spread across two nodes, shown before and after node UK fails; the return transitions are labeled "(with Fallback)".]
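Mutual takeover amounts to giving two resource groups opposite node priority lists. The following is a minimal sketch, assuming each group simply runs on its highest-priority node that is up. The function name `place` and the data layout are illustrative assumptions.

```python
def place(groups, up_nodes):
    """Map each resource group to its highest-priority node that is up (or None)."""
    return {
        name: next((node for node in priority if node in up_nodes), None)
        for name, priority in groups.items()
    }


# Two groups with mirrored priorities, as in a two-node mutual takeover cluster.
groups = {"A": ["USA", "UK"], "B": ["UK", "USA"]}
print(place(groups, {"USA", "UK"}))  # {'A': 'USA', 'B': 'UK'}
print(place(groups, {"USA"}))        # {'A': 'USA', 'B': 'USA'}
```

The second call shows the sizing consequence of this design: the surviving node must be able to carry both workloads at once.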

Page 25: Concurrent: Multiple active nodes

USA, Germany, and UK are all running Application A, each using a separate IP address.

If nodes fail, the application remains continuously available as long as there are surviving nodes to run on.

Fixed nodes resume running their copy of the application.

Application must be designed to run simultaneously on multiple nodes.

This has the potential for essentially zero downtime.

Page 26: Points to ponder

Resource groups:
  Must be serviced by at least two nodes
  Can have different policies
  Can be migrated (manually or automatically) to rebalance loads

Clusters:
  Must have at least one IP network and one non-IP network
  Need not have any shared storage
  Can have any combination of supported nodes *
  Can be split across two sites
    Might or might not require replicating data (HACMP/XD)

Applications:
  Can be restarted via monitoring
  Must be manageable via scripts (start/restart and stop)

* Application performance requirements and other operational issues almost certainly impose practical constraints on the size and complexity of a given cluster.
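Two of the rules above are simple enough to check mechanically against a planned design. The following is a minimal sketch, assuming a hypothetical description in which resource groups map to node lists and networks map to the strings "ip" or "non-ip". None of these names come from HACMP itself.

```python
def check_cluster(resource_groups, networks):
    """Return a list of rule violations for a planned cluster description."""
    problems = []
    for name, nodes in resource_groups.items():
        if len(set(nodes)) < 2:
            problems.append(f"resource group {name} is serviced by fewer than two nodes")
    kinds = set(networks.values())
    if "ip" not in kinds:
        problems.append("no IP network defined")
    if "non-ip" not in kinds:
        problems.append("no non-IP network defined")
    return problems


# Made-up plan: httprg has a single node and there is no non-IP network.
plan_groups = {"dbrg": ["nodea", "nodeb"], "httprg": ["nodeb"]}
plan_networks = {"net_ether_01": "ip"}
for problem in check_cluster(plan_groups, plan_networks):
    print(problem)
```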

Page 27: Other considerations for HACMP

Design, planning, testing

Focus on service and availability

Apply appropriate risk analysis

Disciplined system administration practices

Documented operational procedures

[Diagram labels: high availability, continuous operation, continuous availability; systems management, people, data, hardware, software, environment, networking.]

Page 28: Things HACMP does not do

Back-up and restoration

Time synchronization

Application-specific configuration

System administration tasks unique to each node

Page 29: When is HACMP not the correct solution?

Zero downtime required
  Maybe a fault-tolerant system is the correct choice.
  Availability 7x24x365; HACMP occasionally needs to be shut down for maintenance.
  Life-critical environments.

Security issues
  Too little security
    Many people can change the environment.
  Too much security
    C2 and B1 environments might not allow HACMP to function as designed.

Unstable environments
  HACMP cannot make an unstable and poorly managed environment stable.
  HACMP tends to reduce the availability of poorly managed systems.

Page 30: What do we plan to achieve this week?

Your mission this week is to build a two-node mutual takeover highly available cluster using two previously separate AIX systems, each of which has an application which needs to be made highly available.

[Diagram: applications A and B on the two nodes.]

Page 31: Overview of the implementation process

Plan and configure AIX
  Elimination of single points of failure
  Storage (adapters, LVM volume group, filesystem)
  Networks (IP interfaces, /etc/hosts, non-IP networks, and devices)
  Application start and stop scripts

Install the HACMP filesets (Note: versions 5.3 and earlier require a reboot!)

Configure the HACMP environment
  Topology
    Cluster, node names, HACMP IP and non-IP networks
  Resources and resource groups:
    Identify name, nodes, policies
    Resources: application server, service label, VG, filesystem

Synchronize, then start HACMP

Note: If using two nodes and one application, “Configure the HACMP environment” can be done in one step.

Page 32: Hints to get started

• Draw a diagram.
• Use (online) planning sheets.
• Focus on eliminating SPOFs.
• Always factor in a non-IP network.
• Ensure that you have multipath access to shared storage devices.
• Document a test plan.
• Test the cluster carefully.
• Be methodical.

[Example planning diagram: HACMP cluster for the ABC company. Two nodes serve the user community over a public network and are also connected by a tmssa network and a serial network.]

Node name = nodea
  Resource group = dbrg
  Applications = database
  Resources = cascading A-B
  Priority = 1,2
  CWOF = yes
  Label = a_tmssa, Device = /dev/tmssa1
  Label = a_tty, Device = /dev/tty1
  rootvg: RAID 1, 9.1 GB

Node name = nodeb
  Resource group = httprg
  Applications = http
  Resources = cascading B-A
  Priority = 2,1
  CWOF = yes
  Label = b_tmssa, Device = /dev/tmssa2
  Label = a_tty, Device = /dev/tty1
  rootvg: RAID 1, 9.1 GB

Resource group databaserg contains:
  Volume group = dbvg (RAID 5, 100 GB)
  hdisk3, hdisk4, hdisk5, hdisk6, hdisk7
  Major # = 51
  JFS log = dblvlog
  Logical volume = dblv1, dblv2
  FS mount point = /db, /dbdata

Resource group httprg contains:
  Volume group = httpvg (RAID 1, 9 GB)
  hdisk2, hdisk8
  Major # = 50
  JFS log = httplvlog
  Logical volume = httplv
  FS mount point = /http

Node B     IP label     IP address       Netmask
Service    webserv      192.168.9.5      255.255.255.0
Boot       nodebboot    192.168.9.6      255.255.255.0
Standby    nodebstand   192.168.254.3    255.255.255.0

Node A     IP label     IP address       Netmask
Service    database     192.168.9.3      255.255.255.0
Boot       nodeaboot    192.168.9.4      255.255.255.0
Standby    nodeastand   192.168.254.3    255.255.255.0
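Planning sheets like the address tables above are worth checking mechanically before any configuration is entered. The following is a minimal sketch that looks for IP addresses assigned to more than one label. The function name is hypothetical, and the input is simply the label and address columns as listed on this page.

```python
from collections import defaultdict


def duplicate_addresses(worksheet):
    """Return {address: [labels]} for addresses assigned to more than one label."""
    labels_by_address = defaultdict(list)
    for label, address in worksheet:
        labels_by_address[address].append(label)
    return {addr: labels for addr, labels in labels_by_address.items() if len(labels) > 1}


# (IP label, IP address) pairs as listed in the address tables on this page.
worksheet = [
    ("webserv", "192.168.9.5"),
    ("nodebboot", "192.168.9.6"),
    ("nodebstand", "192.168.254.3"),
    ("database", "192.168.9.3"),
    ("nodeaboot", "192.168.9.4"),
    ("nodeastand", "192.168.254.3"),
]
print(duplicate_addresses(worksheet))  # {'192.168.254.3': ['nodebstand', 'nodeastand']}
```

A check of this kind catches transcription slips in a worksheet early, when they are still cheap to correct.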

Page 33: Sources of HACMP information

HACMP manuals come with the product:
  cluster.doc.en_US.es.html
  cluster.doc.en_US.es.pdf

HACMP documentation is also available online:
  http://www.ibm.com/servers/eserver/pseries/library/hacmp_docs.html

Release notes contain important information about the version release:
  /usr/es/sbin/cluster/release_notes

Sales manual: http://www.ibm.com/common/ssi

IBM courses:
  HACMP Admin. I: Planning and Implementation (AU540/AU54)
  HACMP Admin. II: Admin. and Problem Determination (AU610/AU61)
  HACMP Administration III: Virtualization and Disaster Recovery (AU620/AU62)
  HACMP V5 Internals (AU60)

IBM Web site: http://www-03.ibm.com/systems/p/ha/

Non-IBM sources (not endorsed by IBM but probably worth a look):
  http://lpar.co.uk
  http://portal.explico.de/
  http://www.matilda.com/hacmp/
  http://groups.yahoo.com/group/hacmp/

Page 34: Checkpoint

1. True or False? Resource groups can be moved from node to node.

2. True or False? HACMP/XD is a complete solution for building geographically distributed clusters.

3. Which of the following capabilities does HACMP not provide? (Select all that apply.)
  a. Time synchronization
  b. Automatic recovery from node and network adapter failure
  c. System administration tasks unique to each node; back-up and restoration
  d. Fallover of just a single resource group

4. True or False? All nodes in a resource group must have equivalent performance characteristics.

Page 35: Checkpoint solutions

1. True or False? Resource groups can be moved from node to node.
  Answer: True

2. True or False? HACMP/XD is a complete solution for building geographically distributed clusters.

3. Which of the following capabilities does HACMP not provide? (Select all that apply.)
  a. Time synchronization
  b. Automatic recovery from node and network adapter failure
  c. System administration tasks unique to each node; back-up and restoration
  d. Fallover of just a single resource group
  Answer: a and c

4. True or False? All nodes in a resource group must have equivalent performance characteristics.
  Answer: False

Page 36: Unit summary

Having completed this unit, you should be able to:

Define high availability and explain why it is needed

Outline the various options for implementing high availability

List the key considerations when designing and implementing a high availability cluster

Outline the features and benefits of HACMP for AIX

Describe the components of an HACMP for AIX cluster

Explain how HACMP for AIX operates in typical cases

Page 37: Questions?

Page 38: Thank you

Thank you for attending the Demo of HACMP

Email: [email protected] Call us: +91 8099776681