
Course Overview

Veritas Cluster Server

Learning Document for HA Concepts &

Veritas Cluster Server

By

Enterprise Services

Wipro Infotech Delhi

Confidentiality

This document is being submitted to Adobe Pvt. Ltd. by Wipro Infotech, with the explicit understanding that the contents would not be divulged to any third party without prior written consent from Wipro Infotech.

VERITAS Cluster Server

This topic provides an overview of the key concepts, features, and benefits of VERITAS Cluster Server (VCS).

An Overview of VCS

VCS is an architecture-independent, availability management solution focused on proactive management of service groups, or application services. It is equally applicable in simple shared disk, shared nothing, or SAN configurations of up to 32 nodes and compatible with single node, parallel, and distributed applications. Cascading and multi-directional application failover is supported, and application services can also be manually migrated to alternate nodes for maintenance purposes. VCS provides a comprehensive availability management solution designed to minimize both planned and unplanned downtime.

Designed with a modular and extensible architecture to make it easy to install, configure, and modify, VCS can be used to enhance the availability of any application service with its fully automated, application-level fault detection, isolation and recovery. All fault monitors, implemented in software, are themselves monitored and can be automatically restarted in the event of a monitor process failure. Monitored service groups and resources can either be restarted locally or migrated to another node and restarted. A service group may include an unlimited number of resources. Various off-the-shelf agents are available from VERITAS to monitor specific applications such as file services, RDBMS and enterprise resource planning, or the product can be customized to monitor any hardware component or software-based service. An SNMP agent allows VCS to generate SNMP traps so that resource state changes can be communicated to any SNMP-based management tool such as HP OpenView, CA Unicenter, Tivoli TME, and others. Although applicable to any application service that requires higher availability, VCS is most often deployed in mission-critical enterprise environments such as file serving, database, and enterprise resource planning (ERP).

The Industry's Most Scalable Availability Management Solution

Conventional cluster products rely on inefficient, point-to-point fault management and heartbeat mechanisms that do not scale well to large cluster configurations. To ensure scalability, VCS leverages a unique internode communication mechanism, called ClusterStat, that supports global atomic broadcast across a very low latency transport. This internode communication protocol is faster, more reliable, and significantly more scalable than the protocols in any of today's existing cluster products. In addition, all fault management has been multi-threaded to speed recovery in large configurations, and efficient multi-level fault management ensures very low overhead in configurations that may include thousands of managed resources. VCS supports 32 nodes today, but VERITAS expects this product to support hundreds of nodes in the future.

Other features which support very large configurations include a Cluster Registry that is based on a single configuration file auto-replicated between all nodes, support for an unlimited number of service groups and a scalable, Java-based management GUI. A syntax checker built into the Cluster Registry minimizes operator error during configuration, and the registry supports dependency definitions between managed resources. During recovery, resources may either be started in parallel to speed recovery or according to the defined dependency hierarchy. An auto discovery capability automatically recognizes new nodes as they are added to the cluster and replicates the registry to them. Through the use of a scrollable, Microsoft Explorer-like management interface, the VCS Cluster Server Manager (CSM) can easily provide a comprehensive view of the status of all service groups in a single cluster with the ability to drill down for more detailed information or to perform administrative tasks with the click of a mouse button. It can also manage multiple clusters, if so configured, across up to 32 nodes in a SAN configuration from a single management console. VCS' ability to scale efficiently and manageably sets it apart from other availability management products on the market today.
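To illustrate the idea of dependency-ordered startup described above (this is a conceptual sketch in Python, not VCS's actual engine, and the resource names are invented for illustration), a service group's resources can be brought online in topological order of their dependencies:

    from graphlib import TopologicalSorter  # Python 3.9+

    # Hypothetical service group: each resource lists the resources it requires
    # (the required resources must be online before the dependent one is started).
    dependencies = {
        "listener":  {"database"},
        "database":  {"mount"},
        "mount":     {"volume", "ip"},
        "volume":    {"diskgroup"},
        "ip":        {"nic"},
        "diskgroup": set(),
        "nic":       set(),
    }

    # A valid bring-online order: every resource appears after its requirements.
    start_order = list(TopologicalSorter(dependencies).static_order())
    print("start order:", start_order)

    # Offline (shutdown) order is simply the reverse of the start order.
    print("stop order:", list(reversed(start_order)))

Resources with no unmet dependencies could equally be started in parallel, which is the trade-off between parallel and dependency-ordered recovery described above.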

As enterprises move to SAN architectures, the scalability of cluster management software will play a key role in efficiently leveraging large, centralized disk stores. More scalable software will allow more nodes to share centralized storage, thus optimizing the use of storage and minimizing availability management and administrative costs. It will also provide for a much better long-term growth path, allowing more nodes and disk arrays to be added to accommodate even very rapid business expansion over a period of years.

VERITAS SANPoint Foundation Suite HA

This topic provides an overview of the key concepts, features, and benefits of VERITAS SANPoint Foundation Suite HA (SPFSHA).

Overview of SPFSHA

SPFSHA extends VERITAS File System and VERITAS Volume Manager to support shared data in a SAN environment. Using SANPoint Foundation Suite HA, multiple servers can access shared storage and files, transparently to the applications and concurrently with each other. SANPoint Foundation Suite HA incorporates VERITAS Cluster Server to provide cluster failover capabilities as well as internode communications across the servers.

Features and Benefits of SPFSHA

SANPoint Foundation Suite HA makes shared storage possible and practical for a wide variety of applications.

Failover is faster on highly available configurations if a shared file system remains running even during a single server failure.

Web serving gains manageability and scalability by accessing a common set of files for content serving on a site. In the event of a server failure, applications can redistribute load by reassigning network addresses.

Workflow applications with large files, such as video production and CAD, can eliminate network traffic and data copying for improved performance and easier manageability.

A backup process running on a separate server can access shared storage directly, reducing the impact of backups on production systems and networks.

Transparent access to shared files

Using SANPoint Foundation Suite HA, multiple servers can mount and access the same file system on shared media. No modifications to existing applications are required.

File system integrity in a shared environment

SANPoint Foundation Suite HA ensures the integrity of the shared file system by controlling access to the file system structure using the global lock manager. It also manages cache coherence and locking, so that systems accessing shared file systems always see the most current information.

Faster failover for high availability environments

SANPoint Foundation Suite HA includes the robust application-level failover capabilities of VERITAS Cluster Server. Application failover is very fast, as another server can start a failed server's application without having to restart the file system.

Cluster-wide management of SAN data

SANPoint Foundation Suite HA simplifies the management of shared data, with clusterwide logical device naming, volume and file system operations.

Course Overview

System availability continues to receive wide attention as many organizations grow their critical business applications on Local Area Networks (LANs). The primary reason to address availability issues is the cost of downtime. You can establish an annual cost of downtime for every system and measure the benefits obtained by solving the problems that cause a system to fail. You can then select among the various available options to improve server uptime, based upon a reasonable cost and effort as well as a reasonable return on your investment.

Course Objectives

The overall goal of this learning experience is to provide a basic understanding of the concepts related to HA. This course will build the foundation on which to base more advanced courses on VERITAS HA products. During this course you will:

Define the general concept of high availability.

Identify HA storage management solutions at the disk level, such as hardware Redundant Array of Independent Disks (RAID) and volume management software.

Describe the concept of clustering and investigate common clustering configurations.

Identify HA methods at the network level, such as redundant network connections and redundant networks.

Describe VERITAS HA products.

Lessons

Defining High Availability

What is High Availability?

Describe the concept of high availability.

The Need for High Availability

Identify the need for increased data availability in today's computer environments.

Types of Faults and Failures

Identify different types of faults and failures that can occur.

High Availability vs. Disaster Planning

Differentiate between the goals and functions of high availability and disaster planning.

High Availability vs. Fault Tolerance

Differentiate between the goals and functions of high availability and fault tolerant availability methods.

High Availability Planning

Identify guidelines to consider when planning a high availability solution.

The Layered Approach to Availability

Describe the concept of the layered availability approach.

Online Storage Management

General RAID Levels

Describe the various RAID levels.

Software RAID vs. Hardware RAID

Identify the advantages and disadvantages of hardware and software RAID.

Defining a Volume

Describe volumes and identify the advantages of using them.

VERITAS Volume Management: Virtual Objects

Describe the relationships between the virtual objects in VERITAS Volume Manager.

VERITAS Volume Management: Volume Layouts

Identify the volume layouts that are available in VERITAS Volume Manager.

VERITAS Volume Management: Hot Relocation

Describe the hot relocation process.

High Availability Clustering

Fault Resilient Clustering Concepts

Describe the general characteristics of fault resilient HA solutions.

Asymmetric 1 to 1 Configurations

Describe an asymmetric 1 to 1 configuration.

Symmetric 1 to 1 Configurations

Describe a symmetric 1 to 1 configuration.

N to 1 Clustering

Describe a traditional N to 1 networked cluster configuration.

N to 1 SAN Clustering

Describe clustering techniques in a Storage Area Network environment.

Failover Granularity in Clusters

Describe how resource and service groups enable application-level failover.

Highly Available Networks

Networking Overview

Describe general network components, concepts, and common topologies.

Public Network Failures

Describe failures that may affect the public service network.

Heartbeat Network Failures

Describe challenges to maintaining proper heartbeat communication between nodes in a cluster.

Redundant Networks and Network Connections

Describe how to configure redundant networks and network connections.

VERITAS Comprehensive Availability Solutions

VERITAS Comprehensive Availability

Identify the role VERITAS software components play in an overall high availability solution.

VERITAS Volume Manager

Provide an overview of the key concepts, features, and benefits of VERITAS Volume Manager.

VERITAS Storage Replicator

Provide an overview of the key concepts, features, and benefits of VERITAS Storage Replicator.

VERITAS NetBackup

Provide an overview of the key concepts, features, and benefits of VERITAS NetBackup.

VERITAS Cluster Server

Provide an overview of the key concepts, features, and benefits of VERITAS Cluster Server.

VERITAS SANPoint Foundation Suite HA

Provide an overview of the key concepts, features, and benefits of VERITAS SANPoint Foundation Suite HA.

What is High Availability?

You design a system, utilizing software and hardware components and implementing appropriate procedures, to satisfy the basic functional requirements of your organization. This system functions properly assuming that no faults or failures occur. However, whenever a fault or failure occurs that requires some type of maintenance operation, an outage is observed by your users. An HA solution enables you to design, implement, and deploy software and hardware components that satisfy your functional requirements and provide sufficient redundancy to mask faults and failures from your users. This topic describes the general concept of HA solutions.

Defining HA

HA is defined as the ability of a system to perform its function without interruption for an extended period of time. HA can be accomplished through special HA software and the implementation of redundant system and network hardware components. In a properly designed HA system, all of the possible failure modes for critical applications, network connections, and data storage have been identified and the recovery times have been analyzed. Therefore, you can determine how long the system will be down for any given failure. You can scale an HA system to an appropriate level so that in the event of a fault or failure, the system can recover to a known, consistent state in an acceptable period of time.

Availability Statistics

System availability is expressed as a measure of the period of time that the system is functioning normally. This involves the determination of the various component failures to factor into the overall rate of system failure. It is important to note that there is a distinction between component failure statistics and system failure statistics. The basic availability equation is used to determine the availability of a specific system component:

Availability = MTBF / (MTBF + MTTR)

Where MTBF is the mean time between failures and MTTR is the mean time to repair.

MTBF

MTBF = Total actual operating time/Total number of failures

The MTBF is an expected future performance based on the past performance of a system component. If the component is new, there is no historical data to base the MTBF upon. When determining the MTBF of new hardware components, you should obtain these statistics from the particular vendor. However, these statistics may be inflated or have been calculated using a high standard deviation.

MTTR

The MTTR is an average amount of time that it takes to repair a component, based upon actual statistical data. When calculating the MTTR, you can consider only the amount of on-site time that it takes to recover the component from the time when it failed. You can also calculate the MTTR including factors such as unavailability, response time, and travel time, in addition to on-site repair time. Many aspects of MTTRs are out of your control. For example, you may need to replace a specific part of a server. If this part is not currently in your stock, you will have to purchase the replacement component from the vendor or some other source and rely solely upon their ability to deliver the part in a short amount of time.
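As a minimal worked example of the availability equation (the operating history figures below are purely illustrative), MTBF and MTTR can be derived from recorded data and combined as follows:

    # Illustrative figures only; substitute your own operating history.
    total_operating_hours = 8760.0      # one year of actual operating time
    number_of_failures = 2
    total_repair_hours = 6.0            # cumulative time spent repairing

    mtbf = total_operating_hours / number_of_failures   # mean time between failures
    mttr = total_repair_hours / number_of_failures      # mean time to repair

    availability = mtbf / (mtbf + mttr)
    print(f"MTBF = {mtbf:.1f} h, MTTR = {mttr:.1f} h")
    print(f"Availability = {availability:.5f} ({availability * 100:.3f}%)")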

System Availability

As stated earlier, to calculate the availability of a system, you must take into account the availability of the individual system components such as servers, disks, I/O cards, etc.

The more hardware the system features, the more likely the system is to fail. This is where having many components of a single type affects the overall availability of a system.

For example, suppose a new disk has a quoted manufacturer's MTBF of 600,000 hours, which indicates that a disk would be expected to fail once in about 70 years. This MTBF is calculated rather than based on actual failures. In addition, this MTBF value considers only the disk mechanism itself. If you factor in the power supply, controller, and fans, the MTBF becomes about 150,000 hours, or about 17 years. If your system utilizes 500 disks, the individual failure rates add up, so the aggregate MTBF for 500 disks is 150,000 hours divided by 500, or only 300 hours. This means that the system would fail about 30 times a year due to disk failure alone. The best way to reduce the frequency and duration of failures that affect the system is to employ a properly designed HA solution.
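The disk arithmetic above can be reproduced in a few lines; the values simply restate the figures in the example:

    HOURS_PER_YEAR = 8760

    subsystem_mtbf_hours = 150_000   # disk mechanism plus power supply, controller, fans
    number_of_disks = 500

    # Failure rates add, so the aggregate MTBF divides by the number of disks.
    aggregate_mtbf = subsystem_mtbf_hours / number_of_disks
    failures_per_year = HOURS_PER_YEAR / aggregate_mtbf

    print(f"Aggregate MTBF for {number_of_disks} disks: {aggregate_mtbf:.0f} hours")
    print(f"Expected disk-related failures per year: {failures_per_year:.0f}")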

The Rule of the Nines

Availability is often measured by the "rule of the nines".

Percentage Uptime      Percentage Downtime    Downtime Per Year       Downtime Per Week
98%                    2%                     7.3 days                3 hours, 22 minutes
99% (2 nines)          1%                     3.65 days               1 hour, 41 minutes
99.8%                  0.2%                   17 hours, 13 minutes    20 minutes, 10 seconds
99.9% (3 nines)        0.1%                   8 hours, 45 minutes     10 minutes, 5 seconds
99.99% (4 nines)       0.01%                  52.5 minutes            1 minute
99.999% (5 nines)      0.001%                 5.25 minutes            6 seconds
99.9999% (6 nines)     0.0001%                31.5 seconds            0.6 seconds
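The downtime columns in the table follow directly from the uptime percentage. A short sketch of the conversion:

    MINUTES_PER_YEAR = 365 * 24 * 60
    MINUTES_PER_WEEK = 7 * 24 * 60

    def downtime(uptime_percent):
        """Return (minutes of downtime per year, per week) for a given uptime %."""
        down_fraction = 1 - uptime_percent / 100
        return down_fraction * MINUTES_PER_YEAR, down_fraction * MINUTES_PER_WEEK

    for pct in (98, 99, 99.8, 99.9, 99.99, 99.999, 99.9999):
        per_year, per_week = downtime(pct)
        print(f"{pct}% uptime: {per_year:.1f} min/year, {per_week:.2f} min/week")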

For most environments, 99% availability is adequate. This level of availability results in less than 2 hours of downtime a week. It is important to consider when this downtime takes place. For example, if a typical business system is down on a Sunday between 3 A.M. and 4:30 A.M., this is more acceptable than if the system is down on a Tuesday afternoon between 2 P.M. and 3:30 P.M. It is also important to consider when 100% availability is required. For example, suppose that a brokerage house performs all stock transactions between 9 A.M. and 4 P.M. on weekdays. If the system is designed for 99% availability, it is crucial that you ensure that no system downtime occurs during those critical business hours.

HA Requirements

There is a trade-off in costs and benefits for various degrees of availability. When designing a system with HA requirements, the initial requirements often include:

System availability at all times with no perceived loss of service

No loss of data at any time

Maintenance and upgrade activities do not interfere with operational service

Without being properly informed of the total costs and consequences of implementing a system that satisfies these requirements, it is natural to want an HA solution to satisfy these lofty goals. 100% data availability is an ideal concept, but the implementation of this solution results in very high monetary, performance, and complexity costs. As you move from lower to higher degrees of availability, the costs can increase dramatically. In most environments, a step from one level to the next (for example, from 99% to 99.9%), increases costs 5 to 10 times.

It is ultimately the responsibility of an HA system designer to determine:

The degree of availability that is actually required by the users, as opposed to what they might like to have

The technological alternatives that can be used to meet these requirements

All the costs involved: not only monetary costs, but also performance degradation and system complexity

The High Availability Equation

One way to look at high availability is to view it as a simple equation. Any effective HA system must both reduce the time required to recover from a fault and simplify management of the system, so that you can scale and grow your environment.

Time to Recovery

Most enterprise environments feature a wide range of systems, ranging from online e-commerce systems to less-critical human resources (HR) systems. It is important to analyze the required recovery times of the various systems in your enterprise by performing a business impact analysis. Currently, there is a lot of work being done in this area by organizations in the analyst community such as Gartner Group, Meta Group, and International Data Corporation (IDC), among others. Typically, you can break the systems in an enterprise down into five basic levels based upon their time to recovery requirements:

Safety critical

Mission critical

Business critical

Task critical

Task non-critical

Examples of safety critical applications include systems that manage a nuclear reactor or maintain patients' heartbeats at a hospital. The other end of the spectrum is task non-critical systems, such as an HR system, which can probably withstand an extended outage without significant impact on the overall enterprise.

Levels of Availability

It may be acceptable for a task non-critical system to have a recovery time in terms of days or tens of hours. For these systems, basic availability, such as a traditional offline tape backup, is sufficient. If you lose your HR system, you can simply recover it from a secondary copy of the data from tape and bring the system back online in a number of hours. If the recovery process takes a day or two, the downtime will not significantly impact users.

For business and mission critical systems, you should use a different availability approach. For example, rather than restoring from an offline copy, you can recover from an online copy of the data. You can utilize technology such as replication, snapshots, and mirroring to reduce the time to recovery to tens of minutes up to a couple of hours. For even more critical systems, you can reduce the recovery time to minutes or seconds by using clustering.

There is a wide range of data availability possible. However, this range can be divided into four common levels of availability:

Basic Availability

A basic availability environment requires no specific planning for downtime. Backups might be taken to protect data, but the time required to restore the data can be quite extensive in this environment. Basic availability can be adequate for many applications, but if downtime causes any significant costs, you should consider a higher level of availability. Task non-critical systems would probably feature a basic availability solution.

Increased Availability

This level of availability is achieved by employing RAID (redundant array of independent disks) technology to provide online data protection in addition to the advantages of basic availability. RAID is an array of disks in which redundant data is stored in different places on multiple disks. RAID technology is described in detail in the "RAID Basics" section of this course. A task critical system might employ an increased availability solution.

High Availability

In an HA architecture, hardware and software failures may occur. However, the intent is to mask the failure from the user and to reduce the time needed to recover from that failure down to several minutes or less. It is important to note that HA solutions are not fault tolerant. It is possible for all of the systems in an HA configuration to fail. The goal of an HA strategy is to recover as soon as possible from a system failure, rather than ensuring that a failure never occurs. In a simple example of an HA system, two independent servers are logically connected to form a cluster. One server stores a copy of every component of the other system. If a failure occurs on the primary server, files or services can be taken over by the secondary server. In addition to masking the failure, HA methods enable you to significantly reduce recovery times to a matter of minutes in the event of a major system failure. Typically, business critical and mission critical systems would need to use an HA solution.

Continuous Availability

The most advanced level of availability is continuous availability (CA). CA is defined as an environment explicitly designed to eliminate all computer downtime, both unplanned and planned. Today, CA environments approach 99.999% availability, or less than 5 minutes of downtime per year. However, it is important to note that the costs for CA systems can range into the millions of dollars. Examples of industries that most often utilize continuous availability solutions include air-traffic control and stock-floor trading systems.

Advanced CA architectures usually feature proprietary, large, hardware-based fault tolerant host machines. In a fault tolerant system, hardware is designed to perform self-checking diagnostics and all of the main hardware components are physically duplicated. Self-checking resides on each major hardware component and detects and isolates failures instantly. This ensures that erroneous data cannot corrupt other system areas. In fact, some diagnostics built into specific CA architectures often automatically detect problems before they lead to failures, and initiate service instantaneously should a component fail. Component duplication enables normal processing to continue even in the event of a hardware failure, with no performance degradation. Safety critical systems would require a CA solution.

Simplified Management

In a typical data center environment, you may have a number of servers that have different operating systems: Solaris, HP, Windows NT, and Windows 2000. The system might feature a number of network connections as well, such as traditional Ethernet or SCSI connections, fibre-type connections, or storage area networking (SAN). There are also various types of storage devices in the system.

Today's enterprise is a very heterogeneous environment. In addition, almost every environment is growing at tremendous speeds. This requires more disk storage, different types of storage, more systems, applications, networks, etc. How do you manage all of this? The second part of the high availability equation is simplifying management.

It is important for the enterprise to feature an infrastructure that provides the scalability required by future demands. In addition, you need to implement a solution that enables you to perform automated tasks, virtualization, and consolidation across all systems in the enterprise, no matter the platform or operating system.

The Need For High Availability

Historically, only a select number of applications were considered critical enough to require an HA solution. In the past several years, the cost of systems has been significantly reduced and many new technologies have emerged in the business landscape, such as fibre-based Storage Area Networks (SANs). Modern applications have improved user productivity and increased the speed of business transactions. Modern businesses are much more dependent on the availability of their computer systems. This topic identifies the need for increased data availability in today's computer environments.

To make a business successful, employees and customers need to have access to their data and the services to work with that data. In today's E-commerce environment, customer expectations require round-the-clock data availability. The maintenance of corporate data and access to the data is a business necessity. Critical applications and services include:

Database servers

File servers and filesystems

Web servers

Enterprise Resource Planning (ERP)

Application servers

There are many different reasons for implementing an HA solution. Typically, there are two situations that an HA solution is designed to address:

The system crashes due to an unforeseen fault or failure.

The system is brought down intentionally for system maintenance and upgrade.

Originally, it was the utility companies that led the way toward more available systems and applications. Now, global business and E-commerce are having a significant impact on the definition of acceptable system availability.

Data must be available round-the-clock. Regular business hours do not exist in our contemporary global marketplace. For example, an Internet service organization must account for customers arriving at their site at any hour of any day.

In addition, most modern organizations depend on networking technologies. More and more business-critical data is available through networks. Access to corporate information and shared knowledge has significantly improved productivity and communication. However, this reliance on network solutions has also helped to create a need for an HA solution to ensure that the network is resilient to failures.

These new requirements are creating greater demands on the corporate information technology (IT) infrastructure. In the past, it was acceptable to expect 99% system availability. This would equate to about 3.5 days of downtime per year. However, the growth of E-commerce, greater demands for customer service, an increased dependence on network solutions, and a competitive global market have contributed to a need for high availability. When you consider the new costs of downtime, 99% system availability is no longer acceptable.

The Costs of Downtime

Before you can analyze the costs of an HA solution, you should consider the cost of not implementing such a solution. For example, in the highly competitive world of Web-based brokerage houses, one hour of downtime can cost a firm an estimated $6.5 million. Gartner Group and Dataquest studies indicate that in 2000, downtime cost United States firms over $4.6 billion.

Industry          Business Operation                 Average Downtime Cost Per Hour
Financial         Brokerage operations               $6.45M
Financial         Credit card/sales authorization    $2.6M
Media             Pay-per-view TV                    $150K
Retail            Home shopping (TV)                 $113K
Retail            Home catalog sales                 $90K
Transportation    Airline reservations               $89.5K
Media             Telephone ticket sales             $69K
Transportation    Package shipping                   $28K
Financial         ATM fees                           $14.5K

These numbers only represent direct monetary losses. They don't include less obvious losses, such as lost opportunities or customers moving their business to a competitor. Downtime can adversely affect your corporate image in the industry as well. Competitors may discover this loss and spread the news through the corporate community. Today, many companies find themselves relying on their systems to provide data continually to facilitate employee productivity, improve corporate image, and better serve their customers.

Types of Faults and Failures

Before learning about HA solutions that can be used to recover from a fault or failure, it is useful to explore faults and failures in more detail. This topic identifies different types of faults and failures that can occur. A distinction can be made between faults and failures: faults are often defined as non-compliances within the system that may or may not be externally visible to the end user, whereas failures are those faults that are externally visible. Within this course, the terms fault and failure are used interchangeably.

Defining a Failure

A failure is a deviation from the expected behavior of the system. In other words, if the system is specified to exhibit a certain functionality, and in the process of execution the system produces a discernibly different functionality, a failure has occurred. Functionality is typically delivered from the system by running a procedure to execute the logic contained in software that runs in a hardware environment containing client and server machines, networks, data storage, and other peripherals. Failures can occur in any of these software procedures or the hardware in a system.

Failures can be classified as either:

Reproducible

A prescribed set of actions leads to the observance of the failure in a predictable manner.

Hard reproducible failures occur identically on every execution with the same input.

Soft reproducible failures might occur with a certain probability on identical executions.

Nonreproducible

The appearance of the failure is random, or is linked to a root cause outside of the environment for which the system was engineered.

HA solutions are useful in dealing with soft reproducible and non-reproducible failures, but less effective with hard reproducible failures.

Types of Possible Nonreproducible Failures

There are several different types of failures:

Physical Hardware Failure

Although the industry has come a long way in increasing the MTBF rates for individual hardware packaging and mechanical components, hardware is still vulnerable to faults. Hardware failures are typically non-reproducible. For example, a hard drive crashes or a tape library breaks. The most common examples of hardware failures include:

System memory or CPU failures

Some contemporary computer systems have the ability to reconfigure a failed component without requiring a reboot of the system. This capability helps increase data availability in the event of CPU or memory failures.

Backplane failure

Backplanes, or motherboards, are the large circuit boards that contain sockets for expansion cards and provide the general pathway for all data in a computer system. These components rarely fail, but they can fail in some circumstances.

In addition to the expansion sockets, active backplanes also contain logical circuitry that performs CPU operations.

Passive backplanes contain almost no computing circuitry. Usually, the CPU is inserted on an additional card in the passive backplane. Passive backplanes enable you to repair failed components or upgrade to new components easily.

Disk failure

Disks are very prone to failures because of the high rotation speed, low tolerances, and possible problems with the controller boards or cables.

Tape device failure

Tape devices have similar characteristics to disks, such as high speeds and low tolerances, and are also failure-prone. In addition, tape devices are repeatedly stopping and starting. These actions may strain or overheat the motor and lead to motor failure.

Fan failure

Fans can also fail. If the cooling system fails, the effects may not be immediately visible, but over time excessive heat can cause a system to act unpredictably or fail at an undesirable point in the future.

Power supply failure

Power supplies often have the worst MTBF of all components in a system. They can fail instantly or over time. The gradual failure of a power supply can cause intermittent failures or unpredictable behavior in other components. Failures in power supplies are caused by excessive switching, varying voltage levels, or other stress-inducing factors.

Network Interface Card (NIC) failure

NICs are expansion boards inserted into a computer so the computer can be connected to a network. If a NIC fails, network connectivity is lost. It may be difficult to detect a NIC failure. A simple method used to detect these failures is to initiate some network traffic, and then use a command to display the packet count. If the packet count does not increase, it is likely that the NIC has failed. Redundant NICs should be used to avoid any loss of network connectivity due to the failure of a single NIC.
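As a hedged illustration of that packet-count check, the following sketch assumes a Linux-style /proc/net/dev counter file and an illustrative interface name; on other platforms the same idea applies to the interface counters reported by netstat -i:

    import time

    def rx_packets(interface):
        """Read the received-packet counter for an interface from /proc/net/dev."""
        with open("/proc/net/dev") as f:
            for line in f:
                if line.strip().startswith(interface + ":"):
                    fields = line.split(":", 1)[1].split()
                    return int(fields[1])   # second field is received packets
        raise ValueError(f"interface {interface!r} not found")

    def nic_looks_dead(interface, interval=5):
        """Return True if the packet count fails to increase over the interval."""
        before = rx_packets(interface)
        time.sleep(interval)        # generate or wait for some network traffic here
        after = rx_packets(interface)
        return after <= before

    if __name__ == "__main__":
        print("eth0 suspect:", nic_looks_dead("eth0"))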

Environmental Failures

Failures can not only be caused by internal system components, but also by environmental forces beyond your control. Such environmental failures include:

Power fluctuations or outages

The most common external source of system failures is power outages. Things to consider in determining the probability of power outages should include, but not be limited to, the history of local utility companies providing uninterrupted service, the history of brownouts due to high temperatures in your area, and your proximity to major power sources.

Cooling system failure

The environmental cooling system can fail. This would cause massive overheating of some of your crucial system components. You should analyze your facilities' environmental control system for the likelihood of failure.

Structural failure

Structural failures can range from the complete collapse of the building's support structure, to the structural failure of a single computer rack or cabinet.

Natural disasters

Natural disasters are occurrences such as fires, floods, earthquakes, typhoons, or hurricanes. Considerations when identifying your organization's susceptibility to natural disasters can include geographic location, the topography of the land, or the history of natural disasters in the local area.

Human Error or Acts of Terrorism

The causes of system failures are not limited to natural causes; failures can also result from human error or acts of terrorism.

Human error

A failure can result from an operator or administrator issuing an inappropriate command or an individual disrupting the system by accidentally tripping over a cable or unplugging a power supply.

Acts of terrorism

Unfortunately, in the contemporary computing world, there are many examples of terrorism: sabotage, vandalism, arson, robbery, vehicle crashes, hazardous waste, civil disorder, war, or malicious computer crimes. Human threats are difficult to identify, but you might consider such things as your proximity to major highways that might transport combustible or otherwise hazardous materials, the implementation of virus protection software, or the degree of employees' access to your computer facilities.

Network Failures

Networks are susceptible to failures in every component within the network. Network failures include physical failures that can take place in many network-specific hardware components, such as switches, network cables, or NICs (network interface cards). In addition to these physical components, networks feature complex configurations and service information that, if misconfigured or not authenticated properly, can lead to countless failures that can bring down a network.

Database Failures

A database system typically features a large source of data and many sub applications to use to extract specific information from this data based on specified conditions. Failures can occur at any level in a database system, ranging from a catastrophic failure of the main database engine to a temporary hang in the client-side application. Web servers often interact with back-end database servers and therefore all the possible failures that can occur in a database application can adversely affect a Web server as well.

High Availability vs. Disaster Planning

This topic differentiates between the goals and functions of high availability and disaster planning.

Disaster Recovery

The ability to recover from a natural disaster, such as fire, flood, or earthquake, in a short time is called disaster recovery. The results of these disasters include physical damage to systems, and loss of data, telecommunication, power and work space. Recovery time might be as short as minutes or hours, or as long as days or weeks. Frequently, recovery time is directly related to how quickly a system can be accessed, the data and applications loaded, and telecommunications restored. Redundancy is usually provided by a duplicate system at a different, geographically remote site.

The need for disaster recovery solutions and services is increasing rapidly. The costs after a disaster become quite large, and the need to restore access to systems and applications becomes very important. Two important issues associated with disaster recovery are the replication of data and the currency of the data. The replication of data to an alternate site is affected by distance and speed of the links. The slower the replication method, the more data will be lost in case of disaster. The impact of a disaster on the organization must be assessed along with the cost of providing for disaster recovery.

Comparing Disaster Recovery and HA

Disaster Planning                                                High Availability
Critical                                                         Urgent
Testing and planning for a theoretical event                     Optimizing and responding to current events
Offsite/offline data storage                                     Everything online
Notification of loss by event; possible advance notification     Notification of loss by users

HA involves system optimization and the ability to respond to current events. HA is an urgent requirement for most organizations. Disaster planning and recovery should be a critical concern to most organizations. Disaster planning requires continual testing and refinement. Opportunities to conduct real world drills are scarce or non-existent. In most cases, testing your disaster recovery plan is more of a theoretical exercise than a real world experience. HA is much more of a day-to-day operation and therefore, more organizations neglect disaster planning in favor of HA. A properly developed disaster recovery plan should involve offsite and offline storage for recent copies of your data. HA strives to keep all data available at all times. In the event of a disaster, you will often know about the loss of data before your users will. In an HA environment, users will often inform you in the event of the loss of data or any system downtime.

HA and DR Together

When defining a disaster recovery plan, your top priority is your mission critical applications. Mission critical applications are required to be available at all times. While backup and recovery technology ensures data protection, recovery methods are often not fast enough to handle the recovery of data used by mission critical applications. HA methods such as replication and clustering can help to ensure immediate recovery whenever a disaster strikes.

This example illustrates a plan that addresses HA and disaster planning. By implementing a configuration with cluster management and replication concerns, you can effectively maintain and protect your end-users and information. You can manage clusters and move applications running at a primary site to a secondary site, while maintaining access to critical information through the continuous replication of data between sites. Clustering and replication are covered in more detail later in this course.

Disaster Planning Using Storage Level Data Replication

Storage level data replication is a popular choice for disaster planning. However, replication solutions must ensure that your data is replicated with full integrity. Replicated data should be consistent, up-to date, and ready to use at a moment's notice, while also being transparent to the application. The replication of data should also be seamless, such that the application data can be sent from one primary site to multiple secondary sites for greater protection. Replication products should not rely on any dedicated networks or vendor specific storage hardware platforms, in order to offer better protection against a single point of failure and offer greater flexibility for change and growth. Clustering with replication offers mission critical applications the optimal mechanism for immediate recovery. If both the machine and the disks fail, recovery can occur in such a way that the application can fail over to another machine in the cluster using the replicated data.

High Availability vs. Fault Tolerance

This topic differentiates between the goals and functions of high availability and fault tolerant availability methods. A system described as fault tolerant contains multiple hardware components that function concurrently, replicating all of the I/O. This type of system protects against hardware failures by incorporating redundant hardware components in a single system. Fault tolerant systems can cost as much as ten times more than solutions that are highly available but not fault tolerant.

Defining Fault Tolerance

Fault tolerance extends the definition of high availability. This term is used for systems that can tolerate nearly any type of possible fault without going down. This is a solution used by industries like power companies and telephone companies.

Fault tolerance guarantees 99.9999% availability, or approximately 30 seconds of downtime per year.

Fault tolerant systems are very expensive because of the way they are designed.

They include complete hardware redundancy with no single point of failure from a hardware perspective. The only situations that can cause downtime in a fault tolerant system are software or application failures, or a catastrophic environmental disaster. Hardware redundancy is necessary, but not sufficient, for fault tolerance.

A fault tolerant system must also feature some sort of redundancy management. For example, a system may provide redundant hardware components to ensure that at least one result is correct in the presence of a fault. If a user must somehow examine the results and select the correct one, then the only fault tolerance is performed by the user. However, if the system selects the correct redundant result for the user, then the system is not only redundant, but also fault tolerant.

Fault tolerant systems cannot run on typical configurations because their specialized applications must communicate directly with the hardware, sometimes for each transaction.

Although 99.9999% availability is appealing, the costs of this solution are well beyond the affordability of most companies.

Characteristics of a Fault Tolerant System

It is important to note that fault tolerant solutions and HA solutions are two very different concepts. A fault tolerant system:

Is not impacted in the event of a fault or failure.

Features no loss of access.

Enables immediate and transparent recovery.

Includes replacement, or spare, hardware components that are on-line and running in sync with the primary system.

Is expensive.

Is limited in scalability.

Does not use off-the-shelf hardware and software. The hardware usually has very specific software hooks, and applications need to be written to a specific API of the operating system.

Requires a specially modified operating environment.

Features inherent redundancy management.

Fault Tolerance Processes

Fault tolerance involves the following actions in the event of a fault or failure:

Detection

The system determines that a fault or failure has occurred.

Diagnosis

The system identifies the precise subsystem or system component that failed, and determines the immediate cause of the fault.

Containment

The system prevents the propagation of faults from their origin to a point in the system where the fault can have any effect on the service to the user.

Masking

The system ensures that the correct output is passed to the user in spite of the failed component.

Compensation

It may be necessary for the system to provide a proper response to compensate for the output of the faulty component.

Repair

The system removes the fault from the system or recovers the system. In well-designed fault tolerant systems, faults are contained before they propagate to the extent that the delivery of system service is affected. This leaves a portion of the system unusable because of residual faults. If subsequent faults occur, the system may be unable to cope because of this loss of resources, unless these resources are reclaimed through a recovery process which ensures that no faults remain in system resources or in the system state.

High Availability Planning

Determining an organization's availability requirements and architecting a system to meet them is a complicated process. This topic identifies guidelines to consider when planning a high availability solution.

Guidelines When Planning an HA Implementation

When you are planning your HA system, you need to consider many different factors. Because every environment is unique and every business has different needs, it is difficult to create an all-encompassing checklist for planning an HA system. This list addresses the most important guidelines to consider:

Determine the cost of downtime. It is difficult to estimate the cost to an organization if the system goes down. In general, the consequences of a serious system failure will vary depending on the characteristics of the specific business. The investment in a high availability solution should match the cost and risk of unavailability.

Understand the recovery point and recovery time.

It is important for you to determine the point in your system's operations at which recovery is necessary and how much time may elapse between the point of failure and recovery. The recovery point is more significant in data-centric operations where any loss of data is unacceptable. Recovery time is most important in transaction-centric environments. For example, a simple diagram can illustrate how the estimated recovery times and recovery points relate in CA, electronic vaulting, off-site storage, and standard HA scenarios.

Protect appropriate system components. When you design your high availability solution, you should allocate more money to protect specific system components. When determining the specific system components to protect, select the components that would have the most impact in the event of a failure, are most likely to fail, or are the most expensive to replace.

Focus on the areas that can have a significant, negative impact on the ability to keep an application and your organization up and running if they fail.

You should consider which components are most likely to fail, because these will have the most harmful effect on the MTBF values.

Protect components that may be expensive to replace in the event of failure.

Isolate and eliminate any single points of failure (SPOFs).

A SPOF is any system component that will cause downtime if it fails. It is important to investigate the path of execution in your system and identify all the weak links in the chain. If one link breaks, the whole system fails, no matter how well constructed the rest of the system is. You should walk through the whole process from your servers and disk storage, to the applications, through the network, and to the client systems. Common SPOFs are listed below; a short numeric sketch of the effect of redundancy follows the list:

the computer system

Clustering software can be used to link several systems that can each run each other's applications in time of failure of the primary system.

disks

Disk mirroring or disk array technology can be used to protect data.

host adapters and cables

Host adapter failures can be protected against with operating system features and redundant host adapters.

networks

Networking has many hardware components; each could be a SPOF. The key to eliminating failures within the network is understanding the topologies being used, understanding the failure points within those topologies, and removing these failure points from the network. There are many hardware and software products which provide increased network availability.

electrical power

Uninterruptible Power Supplies (UPSs) and/or multiple power sources can protect against electrical power failures.
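To show numerically why eliminating SPOFs matters, here is a small sketch (with assumed component availabilities, not measured figures) comparing a chain of single points of failure with the same path after redundancy is added:

    # Illustrative availabilities for components in a service path (assumed values).
    server, disk, nic, network_switch = 0.999, 0.9995, 0.9998, 0.9997

    def series(*components):
        """A chain of single points of failure: all must be up."""
        result = 1.0
        for a in components:
            result *= a
        return result

    def redundant_pair(a):
        """Two redundant components: the pair is down only if both are down."""
        return 1 - (1 - a) ** 2

    print(f"All components as SPOFs: {series(server, disk, nic, network_switch):.5f}")
    print(f"With redundant NICs:     {series(server, disk, redundant_pair(nic), network_switch):.5f}")
    print(f"Fully redundant path:    {series(*map(redundant_pair, (server, disk, nic, network_switch))):.7f}")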

Ensure the security of the system.

Prevent data corruption and unauthorized access to your system. Security is an issue that is often overlooked in discussions of HA management, because it does not immediately reduce the impact of failure. However, it is important to any HA solution. The management center must be secured, so that only authorized personnel have access to it. The management systems, or applications, also need to support some type of user authentication, such as userIDs and passwords. Secure transactions between the applications and the system components are available through Remote Procedure Calls (RPC), or some other protocol. Secure communications should be implemented whenever possible in an HA configuration.

Centralize similar applications and services on large servers.

It should be noted that this is not a steadfast rule; sometimes many small machines running single instances of databases or single applications can be a more appropriate configuration. In general, by consolidating similar applications and services on centralized large servers, you can significantly reduce the complexity of your system, the number of backups that are required, and the number of components that can fail.

Automate repetitive tasks.

You can significantly reduce the number of hours required for hands-on operations by automating the tasks that are standard and repetitive. In addition, automation reduces the number of possible faults due to human error, such as mis-typed commands or accidental file deletion. You can also update and maintain consistent policies and procedures in a single centralized location.

Perform a thorough test initially and perform additional tests on a regular basis once the system is up and running. Before you deploy your HA solution, you should perform a thorough test that investigates every level in your system, from hardware component faults to network failures. The testing environment should mimic the eventual system environment as closely as possible: the same hardware, software, services, networks, configurations, loads on the system, and users.

It is also important to perform tests on a regular basis once the system is up and running. Systems and environments are constantly changing. The only way to ensure that the system can react to failures appropriately at any given point in time is to test the system throughout its life cycle.

Account for future growth. It is important that any HA solution account for scalability. All data systems will expand with time. It is much less expensive and easier to manage this growth if it is planned for early in development. For example, it is much easier and cost-effective to add disks to large servers with many empty slots than it would be to purchase additional servers.

Document policies and procedures. While you are initially planning your HA system and while your system is being implemented, it is important to document every policy or procedure that you develop. This documentation can serve as an official archive of system information, a source for any troubleshooting actions that may be required in the future, and it can ensure that other individuals can access vital system information in the event that you are unavailable.

This documentation can be in a variety of formats:

HTML

This is probably the most common choice. This format is extremely portable and can be read in any browser. The major consideration with this format is to ensure that you make relative references to the servers rather than hard links to particular URLs. HTML may also take up a little more room than other formats.

PDF

Adobe PDF documents are very compressed and platform-independent. The only major drawback is the limited ability to edit the documents in their native format.

Word processor

This format is the easiest to manage; however, access to an appropriate reader may be an issue. It is not as portable as other formats.

Paper documentation (hard copies)

Soft copies of your documentation are much easier to update than hard copies. However, you may want to print a limited number of copies of your documentation once in a while to refer to quickly, in case of a complete or extended system outage.

Select the appropriate software and hardware.

You should select the appropriate software to maximize data availability for your organization. There are many considerations when selecting this software. Your data management software should feature capabilities for clustering, load-balancing, application-level recovery, intelligent system and application monitoring, centralized management, and you should select software that will be easy to troubleshoot through mature customer/technical support and consulting organizations. You should always arrange for on-site consulting to help you implement your HA solution. It is also a good idea to take advantage of other resources such as the product documentation, user groups and news groups, the software company's Web-site, and classroom training on the products.

There is a direct correlation between the reliability of your hardware and your overall system reliability. It is important that you obtain appropriate reliability data from hardware vendors, such as mean time between failure figures that are proven and realistic. There are several other hardware considerations in addition to reliability, such as ease of repair, ease of access, cost, compatibility, and storage capacity. It is also a good idea to purchase spare hardware for components that may be more prone to fail than others.

Do not overcomplicate the system.

This is a very important guideline to consider when designing an HA solution, and is half of the availability equation. There are many points in any system at which failures can occur. You should always try to keep the design simple. For example, you should eliminate any extraneous system components, maintain servers that are running only a single application or service, and choose a naming convention throughout your system that is easy to remember and organize.

Reduce planned downtime.

Downtime is best defined as the period of time in which a user is unable to perform tasks in an efficient and timely manner because of poor system performance or system failure. In data centers worldwide, considerable attention and investment have gone into ensuring redundancy and high availability of hardware system components, the vessels that process and hold corporate data.

In a study published by the IEEE (Institute of Electrical and Electronics Engineers), hardware failures are the cause of only 10% of total system downtime. As much as 30% of all downtime is pre-scheduled, and most of this time is required because system tools do not permit online administration of systems. Another 40% of downtime is due to software errors. Some of these errors are as simple as a database running out of space on disk and stopping its operations as a result. Any comprehensive HA solution has to be able to deliver application and information availability regardless of the cause of downtime.

Examples of planned downtime include those times when the system is shut down to add hardware, upgrade the operating system, rearrange or repartition disk space, or clean up log files and memory. If you implement an effective HA strategy, you can significantly reduce the amount of planned downtime. For example, you can provide for backups, maintenance, and upgrades while the system is up and running. You can also reduce the time required to perform the tasks that can only be done while the system is down.

Balance the cost of the availability solution with the rewards. The cost of purchasing, implementing, and managing the HA solution should be consistent with the operational loss you wish to prevent. Achieve an appropriate trade-off between the cost and the rewards of an HA system. The relationship between cost and return on investment in HA systems can be viewed as a curve that illustrates the law of diminishing returns: as you move from a less expensive, simple solution to more advanced solutions, the costs increase dramatically.

The Layered Approach to High Availability

This topic describes the layered availability approach and introduces the concepts and terminology involved in the availability issues posed by each layer. The layers include the application layer; the storage management layer, which enables you to manage logical, or virtual, storage volumes; the storage network infrastructure layer, which features components such as hubs, switches, and Fibre Channel connectivity; and the data storage layer, which contains tape libraries, intelligent disk arrays, and other storage devices. The concepts and terminology introduced in this topic are covered in greater detail in other topics throughout the course.

To simplify the management of a complicated system, you can break the system down into four basic layers:

Application layer

Storage management layer

Storage network infrastructure layer

Data storage layer

To reduce recovery time, you need to determine the level of service that each layer must deliver to the others. You can also simplify management by logically organizing the resources in each layer.

Application Layer

The application layer is the direct interface between the system and the client machines, such as a database, an e-mail system, or a custom application. HA solutions feature functionality that transparently provides continuous service or access to applications in the event of a fault or failure. Throughout this course, it is important to view your system from an application-based viewpoint. In other words, no matter what components, structure, policies, and procedures are implemented in your HA solution, the most important consideration at any time is to minimize the impact of a fault or failure on the users' ability to access data through the application or service. HA issues involved in this layer include clustering, application-level failovers, simplified management of large server farms, common availability management, and replication of data to multiple sites.

Storage Management Layer

The storage management layer refers to the method by which the server manages the storage devices or disks. This management is performed by the building blocks of an HA solution: volume management and a journaling filesystem.

Volume Management

Often, the first step taken towards increasing a system's availability is to enable software-based redundancy of disks, or software RAID. Software RAID defines a logical volume. A volume is a logical object on which filesystems are written or to which databases write their data. Software RAID is often packaged with volume management software.

Journaling Filesystem

A file system is a collection of directories organized into a structure that enables you to locate and store files. All information processed is eventually stored in a file system. When a system or server fails, the file system structure can be left in an inconsistent state; without journaling, recovering it may require a lengthy consistency check or a restore from tape backup. A journaling filesystem journals the changes to the file system structure (and occasionally the data). If the system crashes and is rebooted, the journal is replayed to ensure the correctness of the file system structure. Data recovery is dependent upon the specific application. For example, recovery of an Oracle database would require the use of Oracle log files.
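
The following minimal sketch illustrates the journaling idea only in the abstract (it is not how any particular file system implements it): intended changes to the structure are logged before they are applied, so a replay after a crash restores a consistent structure.

    # Minimal sketch of the journaling (write-ahead logging) idea for
    # file system structure changes. Illustrative only; not a real file system.

    journal = []      # intent log: every change is recorded here first
    directory = {}    # the "on-disk" structure being protected

    def journaled_update(name, entry):
        journal.append((name, entry))   # 1. log the intended change
        directory[name] = entry         # 2. apply it to the structure

    def replay(log, structure):
        # After a crash, re-apply every logged change. Re-applying a change
        # that already made it to disk is harmless, so the result is consistent.
        for name, entry in log:
            structure[name] = entry

    journaled_update("report.txt", "inode 42")
    replay(journal, directory)          # safe to run after every reboot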

Storage Network Infrastructure Layer

This layer refers to storage network connectivity and is becoming more and more of a concern to the modern enterprise. Originally, most environments simply connected a server to a storage device through a SCSI connection. Now, organizations are using more advanced network connection technologies such as Fibre Channel and storage area networks (SANs). Rather than viewing this layer simply as a server connecting to a piece of storage, you should consider multiple paths between servers and storage. You need to investigate the possibility of implementing some form of network redundancy to ensure that if you lose an access route between the system and storage, another access path is available.

Data Storage Layer

In addition to ensuring application availability, managing storage effectively, and maintaining network connectivity, there are data availability concerns in the storage pool itself. In this layer, you can enable online, dynamic reconfiguration of the storage pool. You need to account for growth and scalability: no matter how many disk arrays you have, you will inevitably require more in the future. You should also consider the capacity management aspects of your storage devices and determine how to optimize storage space across common disk hardware.

General RAID Levels

RAID is an array of disks in which redundant data is stored in different places on multiple disks. The redundant information enables regeneration of user data in the event that one of the disks in the array, or the access path to it, fails. By placing data on multiple disks, I/O operations can overlap in a balanced way, improving performance. RAID also increases fault tolerance and the effective mean time between data loss. RAID employs disk striping, the partitioning of each drive's storage space into units; the stripes of all the disks are interleaved and addressed in order. In this topic, you learn about the various RAID levels.

General RAID Levels

There are five basic RAID levels that are commonly recognized. In addition, there are several less common RAID levels that are variations on these five, as well as several common RAID combinations. The most appropriate RAID configuration for a specific filesystem or database tablespace must be determined based on data access patterns and an appropriate trade-off between cost and performance.

RAID-0 (striping)

This RAID level features disk striping but no redundancy of data. In this configuration, a collection of data is divided into small chunks that are written to separate disks in the array. This RAID level supplies performance acceleration at no increased storage cost, because individual disks can perform concurrent write operations. RAID-0 offers no increase in data availability; in fact, if implemented by itself, RAID-0 decreases overall data availability, because for the stripe to function, every disk in the array must be functioning. Any failure of an individual disk in the stripe results in the inability to perform any read or write operations on the entire stripe. RAID-0 is an option for applications requiring high bandwidth such as video production and editing, image editing, or pre-press applications.
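
As a minimal sketch of how striping spreads data, the code below maps a logical chunk number onto a disk and an offset in round-robin order; the chunk size and number of disks are arbitrary values chosen for the example.

    # Minimal sketch: RAID-0 placement of logical chunks across a stripe.
    # The chunk size and disk count are arbitrary example values.

    NUM_DISKS = 4
    CHUNK_SIZE = 64 * 1024      # 64 KB stripe unit

    def locate(logical_chunk: int) -> tuple:
        disk = logical_chunk % NUM_DISKS                  # round-robin across disks
        offset = (logical_chunk // NUM_DISKS) * CHUNK_SIZE
        return disk, offset

    for chunk in range(6):
        print("chunk", chunk, "->", locate(chunk))        # chunks 0-3 hit disks 0-3, then wrap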

RAID-1 (mirroring)

RAID-1 requires at least double the disk capacity of RAID-0. In RAID-1, the data is replicated on a separate disk, or multiple disks; no disk striping occurs. Every block on one disk is copied block-for-block to a separate disk that acts as a peer and is kept completely in sync with the original disk. In the event of an individual disk failure, the other disk maintains operation without any service interruption. RAID-1 provides the highest performance for redundant storage, because it does not require read-modify-write cycles to update data, and because multiple copies of data can be used to accelerate read-intensive applications. However, resyncing or creating a new RAID-1 copy requires time and a significant amount of I/O, so one disadvantage of RAID-1 is that write performance may suffer. RAID-1 also requires 100% additional disk capacity for each mirror copy, so another major disadvantage is cost. This RAID level is recommended for applications requiring increased availability such as accounting, payroll, or other financial applications.
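
A minimal sketch of the mirroring behavior described above follows: every write is duplicated to each copy, while reads can be spread across the copies. The two in-memory "mirrors" are purely illustrative.

    # Minimal sketch: RAID-1 duplicates every write and can serve reads
    # from either copy. Illustrative only.

    import itertools

    mirrors = [dict(), dict()]                  # two mirror copies
    next_copy = itertools.cycle(range(len(mirrors)))

    def mirrored_write(block: int, data: bytes) -> None:
        for copy in mirrors:                    # the write cost is paid on every copy
            copy[block] = data

    def mirrored_read(block: int) -> bytes:
        return mirrors[next(next_copy)][block]  # alternate reads across the copies

    mirrored_write(7, b"payroll record")
    print(mirrored_read(7), mirrored_read(7))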

RAID-2 (Hamming encoding)

RAID-2 features disk striping. This RAID level detects errors that occur and determines which part is in error by using error checking and correcting (ECC) information. RAID-2 detects 2-bit errors and corrects 1-bit errors on the fly. Each data disk has its Hamming Code ECC information recorded on ECC disks. On read operations, the ECC code verifies data or corrects single disk errors. You need a high ratio of ECC disks to data disks with smaller word sizes. It has no clear advantages over RAID-3, and is not used in practice.

RAID-3 (byte striped across a group of disks)

RAID-3 uses disk striping in a parallel fashion, with each virtual disk block distributed across all the disks in the array except for one that stores the parity check. The parity disk permits the regeneration and rebuilding of data in the event of a disk failure. In RAID-3, the stripe depth of an N+1 array is equal to 1/N of the virtual block size, and each disk drive must be on its own separate I/O channel. For example, if the virtual block size for a 4+1 set is 512 bytes, then the stripe depth is 128 bytes (512/4). The RAID volume can only process one disk I/O at a time, and all I/O operations access all disks, because the bytes are distributed across multiple disks (parallel transfer). For this reason, RAID-3 is best for applications that are single-stream and bandwidth-oriented. It is not a good choice for a database server, because databases tend to read and write smaller blocks. RAID-3 is likely to perform significantly better in a controller-based implementation.
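
The stripe-depth relationship described above is simple enough to state as a one-line calculation; the minimal sketch below reproduces the 4+1 example.

    # Stripe depth of an N+1 RAID-3 array: virtual block size divided by N.

    def stripe_depth(virtual_block_size: int, data_disks: int) -> float:
        return virtual_block_size / data_disks

    print(stripe_depth(512, 4))   # 128.0 bytes for the 4+1 example above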

RAID-4 (dedicated parity disk)

RAID-4 uses large stripes and dedicates one drive to storing parity information. RAID-4 is very similar to RAID-3. The major difference is that where a RAID-3 array has equal stripe and logical block sizes, RAID-4 arrays implement variable stripe sizes: in RAID-4, the stripe depth is an integer multiple of the virtual block size, which means that multiple virtual blocks can be placed within a single stripe. You can read records from any single drive, which enables you to take advantage of overlapped I/O for read operations. Because all write operations have to update the parity drive, however, no I/O overlapping is possible for writes. RAID-4 offers no advantage over RAID-5. As with RAID-3, a RAID-4 implementation is well suited to systems performing large file transfers; it does not perform well in applications that require small file writes at high I/O rates.

RAID-5 (block striped across a group of disks)

RAID-5 removes a possible bottleneck on the parity drive by rotating parity across all drives in the set. RAID-5 requires at least three, and usually five, disks for the array. All read and write operations can be overlapped. RAID-5 stores parity information but not redundant data. Recovery from a RAID-5 disk failure requires a complete read of all the disks in the stripe; the recovery process can be time-consuming, and system performance suffers during recovery. This is the most complex and versatile of the basic RAID architectures. RAID-5 is best suited for file and application servers, database servers in a data warehousing environment, Web servers, and e-mail servers.

The performance overhead for writes can be substantial in a RAID-5 configuration, because a write can involve much more than simply writing to a data block. A write can involve reading the old data and parity, computing the new parity, and writing the new data and parity.
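
The following minimal sketch shows the arithmetic behind that read-modify-write cycle for a single small write, using XOR parity. The block contents are made-up values, and the sketch ignores real controller details such as caching.

    # Minimal sketch: RAID-5 small-write (read-modify-write) parity update.
    # Block contents are made-up values; real controllers add caching, etc.

    def xor_blocks(a: bytes, b: bytes) -> bytes:
        return bytes(x ^ y for x, y in zip(a, b))

    def small_write(old_data: bytes, old_parity: bytes, new_data: bytes):
        # new parity = old parity XOR old data XOR new data
        new_parity = xor_blocks(xor_blocks(old_parity, old_data), new_data)
        return new_data, new_parity   # two reads (old data, old parity) + two writes

    old_data   = bytes([0x0F] * 8)
    old_parity = bytes([0xAA] * 8)
    new_data   = bytes([0xF0] * 8)
    print(small_write(old_data, old_parity, new_data))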

RAID Level Variations

RAID-6

RAID-6 is similar to RAID-5, but with additional independently computed check data. It includes a second parity scheme that is distributed across different drives and offers very high fault-tolerance. Currently, there are very few commercial examples of RAID-6.

RAID-7

RAID-7 includes a real-time embedded operating system as a controller, caching data through a high-speed bus, and other characteristics of a stand-alone computer. This RAID level is not common.

RAID Combinations

RAID-01 (mirrored stripes)

RAID-01 is a mirrored pair (RAID-1) made from two RAID-0 stripe sets; it is configured by creating two RAID-0 sets and mirroring them with RAID-1. If you lose a drive on one side of a RAID-01 array and then lose another drive on the other side before the first side is recovered, you suffer complete data loss. It is also important to note that in the event of a single disk failure, all drives in the surviving mirror are involved in rebuilding the entire damaged stripe set. Performance is severely degraded during recovery unless the RAID subsystem allows the priority of recovery to be adjusted; however, shifting the priority toward production lengthens recovery time and increases the risk of the kind of catastrophic data loss mentioned earlier.

Example of RAID01 Failure

In this example, if Disks A and D fail, all the disks are unavailable.

RAID-10 (striped mirrors)

RAID-10 is a stripe set made up of a number of mirrored pairs. Only the loss of both drives in the same mirrored pair can result in any data loss, and the loss of that particular drive is 1/Nth as likely as the loss of some drive on the opposite mirror in RAID-01. Recovery involves only the replacement drive and its mirror, so the rest of the array performs at 100% capacity during recovery. Because only the single drive needs recovery bandwidth, the requirements during recovery are lower and recovery takes far less time, reducing the risk of catastrophic data loss. The performance of RAID-10 and RAID-01 is identical, but they have different levels of data integrity.

Example of RAID10 Failure

In this example, Disk A fails first and all the other disks remain available. If Disk D then fails, only the data on disks A and D is offline.
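
The difference between the two failure examples can be checked with a short sketch. The six-disk layouts below (two three-disk stripes mirrored to each other for RAID-01, and three mirrored pairs striped together for RAID-10) are assumptions chosen to match the examples above; the functions report only whether all data is still reachable.

    # Minimal sketch: data reachability after disk failures in RAID-01 vs RAID-10.
    # The six-disk layouts are illustrative assumptions matching the examples above.

    RAID01_STRIPES = [{"A", "B", "C"}, {"D", "E", "F"}]   # two stripe sets, mirrored
    RAID10_PAIRS   = [{"A", "D"}, {"B", "E"}, {"C", "F"}] # mirrored pairs, striped

    def raid01_all_data_available(failed: set) -> bool:
        # All data survives only if at least one whole stripe set is intact.
        return any(stripe.isdisjoint(failed) for stripe in RAID01_STRIPES)

    def raid10_all_data_available(failed: set) -> bool:
        # All data survives as long as every mirrored pair keeps one good disk.
        return all(not pair.issubset(failed) for pair in RAID10_PAIRS)

    for failed in ({"A", "E"}, {"A", "D"}):
        print(sorted(failed),
              "RAID-01:", raid01_all_data_available(failed),
              "RAID-10:", raid10_all_data_available(failed))
    # A and E failed: RAID-01 loses access to everything, RAID-10 loses nothing.
    # A and D failed: RAID-01 loses everything; RAID-10 loses only the A/D pair.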

RAID-53

RAID-53 offers an array of stripes in which each stripe is a RAID-3 array of disks. It offers higher performance than RAID-3, but at much higher cost, and requires at least five drives.

Software RAID vs. Hardware RAID

The basic characteristics and configurations of the RAID levels are the same in both software and hardware RAID. The main difference is the point at which the disk management operations occur. This topic identifies the advantages and disadvantages of software and hardware RAID.

Hardware RAID (Controller-Based)

In hardware RAID, the management operations required to implement the RAID disk array occur within the disk array itself. The host system does not perform the operations, but an interface program runs on the host system that enables you to monitor the disk management operations. A hardware RAID configuration presents a logical unit (LUN) that can be monitored regardless of the operating system of the host system. It is often a safe assumption that the disks are managed properly, no matter what the RAID level, within a hardware RAID configuration. A hardware RAID system is basically a specialized, single-purpose system featuring a controller that does nothing but aggregate storage disks, stripe and mirror data across these disks, and calculate parity.

The advantages of using hardware RAID over software RAID:

Increased performance on the host system

Performance is increased because the disk management operations are off-loaded onto the disk array. For example, in a mirrored controller-based configuration, the host would need to pass only one write request through the disk driver and across the I/O bus, where the controller would decompose it into two separate writes.

Enhanced features

Hardware RAID manufacturers often add enhanced functionality to their hardware. Such enhancements include additional internal memory in the disk array, and the abilities to replicate data over a WAN, share specific disks between multiple host systems, and lock out other hosts while a single host is accessing a disk. Enterprise class hardware RAID systems also often include redundant power supplies and cooling fans.

Efficiency

Hardware RAID systems tend to be very efficient because they feature hardware that is concerned only with performing RAID operations. The RAID controller does not have to concern itself with graphical user interfaces (GUIs) and other aspects of a general-purpose operating system.

The disadvantages of using hardware RAID over software RAID:

Dependence on one RAID hardware vendor

Every RAID manufacturer uses a different management interface, and once you familiarize yourself with one, it can be difficult to switch to a different vendor.

Inability to combine disks from different arrays into a single array

This can create another single point of failure (SPOF) in the system.

Difficulty resizing LUNs

In most cases, once a LUN is full, you cannot simply increase the size of the LUN to accommodate new data. You have to destroy the original LUN, create another larger LUN, and then restore the original data to the new LUN.

Hardware limits on the number and size of LUNs

Often, RAID vendors will enforce some hardware limits that might limit your ability to configure your system for optimal performance.

Cost

Hardware RAID is more expensive than software RAID.

No inter-box protection

A specific RAID controller has no visibility to other RAID boxes or storage devices.

Software RAID (Host-Based)

Rather than utilizing a dedicated hardware controller to perform the various management operations required to implement a RAID array, software RAID performs these operations on the host system processor using special software. Disk array management is a relatively low-level activity that is performed underneath the other applications that run on the host system; therefore, software RAID is usually implemented at the operating system level. Software RAID is supported on Windows NT and 2000 platforms, as well as on the majority of UNIX platforms. The output of software RAID is a logical volume. A volume is a logical object on which file systems are written or to which databases write their data.

Advantages of using software RAID over hardware RAID:

Cost

If you are already running an operating system that supports software RAID, you have no additional costs for controller hardware. However, you may be required to add more system memory.

Simplicity

You are not required to install, configure, or manage a hardware RAID controller.

Flexibility in hardware

By moving the management operations off the hardware, you gain more flexibility in selecting appropriate hardware. In fact, you can use a wide range of online storage, such as just a bunch of disks (JBOD), enterprise RAID arrays, and smaller RAID systems.

Flexibility in disk configuration

Software RAID implementations can build RAID objects from partitions of disks rather than being restricted to whole disks, so they can use a disk pool to meet a diverse set of performance and availability requirements. For instance, you might create a small, high-performance striped file system by using only a few cylinders on a very large number of drives, and use the remaining space on those same drives for concatenated, mirrored, or RAID-5 volumes with different I/O characteristics.

Increased redundancy

A duplexed RAID-1 array can sometimes be implemented in software RAID, but not in hardware RAID, depending on the controller. Building redundant layouts using disks with separate connections to the host can enhance availability, eliminating the single points of failure introduced by non-redundant host connections.

The disadvantages of using software RAID over hardware RAID:

Performance

The most significant drawback of software RAID is that it provides lower overall system performance than hardware RAID. Cycles are taken from the CPU of the host system to manage the RAID array. In reality, the impact of these operations is not that excessive for simple RAID levels like RAID-1. However, the impact on performance can be substantial, particularly with any RAID levels that involve striping with parity, such as RAID-5.

Boot volume limitations

The operating system cannot boot from the RAID array, because the operating system must be running before the software RAID array can be enabled. A separate partition needs to be created for the operating system. This segments the system capacity, lowers performance, and increases the time required to boot the system.

RAID level limitations

Software RAID is usually limited to RAID-0, RAID-1, RAID-5, RAID-01, and RAID-10.

Advanced feature support

Software RAID normally does not include support for the advanced features that may be available to hardware RAID arrays.

Operating system (OS) compatibility issues

Generally, if you enable software RAID by using a particular operating system, only that particular operating system can access that array. This creates problems with multiple-OS environments.

Software compatibility issues

Some software utilities, such as partitioning and formatting utilities, may have conflicts with software RAID arrays.

Reliability

Implementing software RAID introduces the possibility of software bugs that might compromise the integrity and reliability of the array.

Combining Software and Hardware RAID

You should not treat software and hardware RAID as a distinct either/or choice. Host-based volume management features all the advantages of software RAID and complements hardware RAID systems. By combining hardware and software RAID, you can realize the best features of both solutions: the off-loaded processing, reduced I/O transfer requirements, and redundant components of most hardware RAID subsystems, coupled with the configuration flexibility added by software-based RAID. Combining hardware and software RAID solutions offers several key benefits:

Increased availability

Many hardware RAID solutions retain single points of failure (SPOFs), allowing data to become unavailable if a non-disk component of the array fails. When software RAID is used to build configurations that incorporate hardware RAID units in separate arrays, many of these vulnerabilities can be eliminated.

Increased performance

A single hardware RAID controller may present a bottleneck to data access because of limited array bus and host-to-array bandwidth, as well as CPU cycles needed for parity calculations. Efficient controller-based algorithms can be combined with multiple host connections and supplementary software RAID processing to increase bandwidth and throughput.

Improved manageability

The limited set of configuration options and the static configuration utilities for hardware RAID subsystems may make initial setup seem simpler than setting up a software RAID configuration. However, after running the system, the configuration may need to be modified to reflect the actual I/O pattern of the applications. With a controller-based setup, this is usually achieved by backing up the data, reconfiguring the array, and reloading the data. This requires interruption of data access. The on-line reconfiguration capabilities of most software RAID solutions can be used to enhance the performance monitoring, tuning, and reconfiguration of hardware RAID, simplifying administration while increasing uptime and performance.

Defining a Volume

The basis for any volume management solution is a volume. This topic defines a volume and identifies the advantages of using volumes to manage storage.

What Is a Volume?

Volumes enable an application to view a number of disks as a single logical unit, regardless of the physical location of the disks. A volume has the performance, reliability, and other attributes of its individual components. Each volume records and retrieves data from one or more physical disks. Volumes are accessed by file systems, databases, or other applications in the same way that physical disks are accessed. Volumes are also composed of other virtual objects that are used to change the volume configuration; volumes and their virtual components are collectively called virtual objects. Volumes can be used to perform administrative tasks on disks without interrupting applications and users.

Advantages of Volumes

There are several advantages to using volumes:

Ability to combine RAID levels

Volumes enable you to combine any number of different RAID levels. For example, if the most important consideration is cost, you might implement a RAID-5 solution; alternatively, if you require very high performance, you might use striped mirrors.

Scalability

Virtual volumes also offer the flexibility to grow the storage capacity without disrupting the system. Instead of taking the server off-line or physically moving data from point A to point B, you can simply add more storage to the volume.

Increased performance and failure tolerance

You can combine enterprise RAID and JBOD and your system will feature the advantages of both. You can take advantage of a hardware controller and the flexibility of host-based volume management.

VERITAS Volume Management: Virtual Objects

There are several basic methods to manage online storage to increase data availability. Before you can understand the specific principles involved in each of these methods, it is important to define some of the basic virtual objects and their relationships to each other. This topic provides an overview of VERITAS Volume Manager (VxVM) and describes the relationships between the various VxVM objects.

Overview of VxVM

VxVM provides easy-to-use online disk storage management for computing environments. Traditional disk storage management often requires that systems be taken offline at a major inconvenience to users. VxVM provides the tools to improve performance and ensure data availability and integrity. VxVM also enables you to dynamically configure disk storage while the system is active.

The connection between physical objects and VxVM objects is made when you place a physical disk under VxVM control. VxVM creates virtual objects and makes logical connections between the objects. The virtual objects are then used by VxVM to perform storage management tasks. VxVM objects include:

VxVM disks

When you place a physical disk under VxVM control, a VxVM disk is assigned to the physical disk. Each VxVM disk corresponds to at least one physical disk. A VxVM disk typically includes a public region where user data is stored, and a private region where VxVM internal configuration information is stored.

Disk groups

A disk group is a collection of VxVM disks. You group disks into disk groups for management purposes, such as to hold the data for a specific application or set of applications. For example, data for accounting applications can be organized in a disk group called "acctdg".

A disk group configuration is a set of records with detailed information about related VxVM objects, their attributes, and their connections. Disk groups are configured by the system administrator and represent management and configuration boundaries. You can create additional disk groups as necessary. Disk groups allow you to group disks into logical collections. Disk groups enable high availability, because a disk group and its components can be moved as a unit from one host system to another. Disk drives can be shared by two or more hosts, but accessed by only one host at a time. If one host crashes, the other host can take over the failed host's disk drives, as well as its disk groups.
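
A minimal sketch of that ownership model follows: a disk group is treated as a unit that exactly one host owns at a time and that can be handed to another host on failure. The group, disk, and host names are hypothetical; in VxVM itself this corresponds to deporting and importing a disk group.

    # Minimal sketch: a disk group as a unit of ownership that moves between
    # hosts as a whole. Group, disk, and host names are hypothetical examples.

    disk_groups = {
        "acctdg": {"disks": ["acctdg01", "acctdg02"], "owner": "hostA"},
    }

    def take_over(group: str, new_owner: str) -> None:
        # Only one host may own (import) a disk group at a time, so ownership
        # is simply reassigned as a unit along with all of the group's disks.
        disk_groups[group]["owner"] = new_owner

    take_over("acctdg", "hostB")     # hostB takes over after hostA fails
    print(disk_groups["acctdg"])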

Subdisks

A subdisk is a set of contiguous disk blocks. VxVM allocates disk space by dividing a VxVM disk into one or more subdisks. Each subdisk represents a specific portion of a VxVM disk, which is mapped to a specific region of a physical disk. A VxVM disk can contain multiple subdisks, but subdisks cannot overlap or share the same portions of a VxVM disk.
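
As an illustration of the contiguous, non-overlapping rule, the following minimal sketch models subdisks as (starting block, length) extents on one VM disk and checks that no two extents overlap. The subdisk names and sizes are made up for the example.

    # Minimal sketch: subdisks as (start_block, length) extents on a single
    # VxVM disk, plus a check that no two extents overlap.
    # Subdisk names and sizes are hypothetical examples.

    subdisks = {
        "acctdg01-01": (0,    2048),   # blocks 0..2047
        "acctdg01-02": (2048, 4096),   # blocks 2048..6143
        "acctdg01-03": (6144, 1024),   # blocks 6144..7167
    }

    def overlaps(a, b) -> bool:
        (a_start, a_len), (b_start, b_len) = a, b
        return a_start < b_start + b_len and b_start < a_start + a_len

    extents = list(subdisks.values())
    ok = all(not overlaps(extents[i], extents[j])
             for i in range(len(extents))
             for j in range(i + 1, len(extents)))
    print("subdisks non-overlapping:", ok)    # True for the layout above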

Plexes (mirrors)

VxVM uses subdisks to build virtual objects called plexes (or mirrors). A plex consists of one or more subdisks located on one or more physical disks. To organize data on the subdisks to form a plex, use one of the following methods:

Concatenation

Striping (RAID-0)

Mirroring (RAID-1)

Striping with