Storage - DS8K HA Best Practices_v1.9

  • 8/8/2019 Storage - DS8K HA Best Practices_v1.9

    Copyright 2007 IBM Corporation. All rights reserved.

    Recommended Best Practices Considerations for High Availability on IBM System Storage DS8000 and DS6000 and IBM TotalStorage ESS

    Prepared by: Cam-Thuy Do and John Sing

    IBM High Availability Center of Competency
    October 2007

    IBM Systems and Technology Group

    Copyright IBM Corporation 2007. All rights reserved. Version 1.8

    Disclaimers

    Copyright 2007 by International Business Machines Corporation.

    No part of this document may be reproduced or transmitted in any form without written permission from IBM Corporation.

    Product data has been reviewed for accuracy as of the date of initial publication. Product data is subject to change without notice. This information could include technical inaccuracies or typographical errors. IBM may make improvements and/or changes in the product(s) and/or program(s) at any time without notice.

    Any statements regarding IBM's future direction and intent are subject to change or withdrawal without notice, and represent goals and objectives only.

    References in this document to IBM products, programs, or services do not imply that IBM intends to make such products, programs or services available in all countries in which IBM operates or does business. Any reference to an IBM Program Product in this document is not intended to state or imply that only that program product may be used. Any functionally equivalent program that does not infringe IBM's intellectual property rights may be used instead. It is the user's responsibility to evaluate and verify the operation of any non-IBM product, program or service.

    THE INFORMATION PROVIDED IN THIS DOCUMENT IS DISTRIBUTED "AS IS" WITHOUT ANY WARRANTY, EITHER EXPRESS OR IMPLIED. IBM EXPRESSLY DISCLAIMS ANY WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE OR NONINFRINGEMENT. IBM shall have no responsibility to update this information. IBM products are warranted according to the terms and conditions of the agreements (e.g., IBM Customer Agreement, Statement of Limited Warranty, International Program License Agreement, etc.) under which they are provided. IBM is not responsible for the performance or interoperability of any non-IBM products discussed herein.

    The provision of the information contained herein is not intended to, and does not, grant any right or license under any IBM patents or copyrights. Inquiries regarding patent or copyright licenses should be made, in writing, to:

    IBM Director of Licensing
    IBM Corporation
    North Castle Drive
    Armonk, NY 10504-1785
    U.S.A.

    Trademarks

    IBM, IBM eServer, IBM logo, e-business logo, CICS, DB2, MQ, ESCON, Enterprise Storage Server, GDPS, IMS, MVS, OS/390, Parallel Sysplex, Redbook, Resource Link, S/390, System z9, iSeries, pSeries, xSeries, OS/400, i5/OS, System Storage, TotalStorage, VM/ESA, VSE/ESA, WebSphere, z/OS, z/VM, z/VSE, and zSeries are trademarks or registered trademarks of International Business Machines Corp. in the United States, other countries, or both.

    Linux is a registered trademark of Linus Torvalds in the United States, other countries, or both.

    Microsoft is a registered trademark of Microsoft Corporation in the United States, other countries, or both.

    UNIX is a registered trademark of The Open Group in the United States, other countries, or both.

    IBM System Storage Enterprise Disk

    DS6000: New Standard in Pricing and Packaging
    DS8000: New Standard in Functionality, Performance, TCO
    ESS 750 / 800

    This document provides a summary of recommended High Availability best practice considerations for the DS8000, DS6000, and Enterprise Storage Server disk subsystems.

    The reader is assumed to have a baseline understanding of the concepts and facilities of these products.

    System Storage Enterprise Disk Practices

    Configuration

    RAID 5 spreads data across multiple disk drives using parity (P) and spares, thus providing redundancy (e.g. a 6+P+S array consists of six data drives, one parity drive and one spare)
    - Use RAID 5 when the desire is to use less storage, at the expense of a longer rebuild time if a drive fails

    RAID 10 stripes half the disk drives while the other half of the array mirrors the first set of disk drives
    - Use RAID 10 when the desire is for highest performance and/or lower rebuild time
    - At the expense of requiring a larger amount of raw storage
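
The storage trade-off between the two RAID levels can be illustrated with a small calculation. This is a minimal sketch, assuming an eight-drive array in both cases (6+P+S for RAID 5 as in the example above, and a hypothetical 4+4 layout with no spare for RAID 10); the function names and the 146 GB drive size are illustrative only.

```python
def usable_drives(layout: str, drives_per_array: int = 8) -> int:
    """Return how many drives' worth of capacity hold user data in one array."""
    if layout == "raid5":
        # 6+P+S: one drive's worth of parity and one spare out of eight
        return drives_per_array - 2
    if layout == "raid10":
        # Assumed 4+4 layout: half the drives mirror the other half, no spare
        return drives_per_array // 2
    raise ValueError(f"unknown layout: {layout}")


def usable_capacity_gb(layout: str, drive_gb: float, drives_per_array: int = 8) -> float:
    """Usable capacity of one array for a given drive size."""
    return usable_drives(layout, drives_per_array) * drive_gb


if __name__ == "__main__":
    for layout in ("raid5", "raid10"):
        print(layout, usable_capacity_gb(layout, drive_gb=146))
```

Under these assumptions the RAID 10 array yields two thirds of the usable capacity of the RAID 5 array from the same raw drives, which is the "larger amount of raw storage" cost noted above.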

    Exploit available hardware options
    - Server & Storage fail-over/fall-back in Metro Mirror environment
    - Concurrent Maintenance
    - Minimize single-frame DS8300 purchases, as the first expansion frame upgrade is disruptive
    - Distribute host connections across multiple physical adapters on the DS8000
    - Verify all host paths are available before upgrading software
    - Logical Partitioning (LPAR) capability to distribute workloads

    Multiple Redundant Management Control Consoles

    Uninterruptible Power Supply

    Earthquake Resistant Kit (where applicable)

    Consider IBM Standby Capacity on Demand (Standby CoD) offering for capacity planning

    Enable Call Home and Remote Support

    Monitor the storage subsystem status
    - E-mail notification for a serviceable event
    - Simple Network Management Protocol (SNMP) notification
    - Service Information Message (SIM) notification (zSeries)
    - Review the event log of the DS8000

    Maintain Currency
    - Create a regular maintenance window for storage and SAN
    - Install firmware updates as recommended
    - Understand what fixes/upgrades are in a firmware update
    - Integrate into Change Control Management
    - Consider installing first on less critical systems, prior to production
    - Maintain supported combinations of Host Adapter Driver
    - Subscribe to MySupport: http://www.ibm.com/support/mySupport

    Concurrent Maintenance
    - Perform Concurrent Maintenance operations on the storage subsystem during times of low activity
    - Microcode upgrades will be performed by IBM support personnel

    Host Based Monitors and Alerts
    - GDPS/PPRC HyperSwap Manager
    - GDPS/PPRC HyperSwap Monitors & Alerts
    - TPC-R

    Host Based Collection Facilities
    - z/OS LOGREC

    Host Based High Availability Options for Data
    - DFSMS Dataset Name separation

    Host Connections: provide multiple paths from each host to the storage
    - MPIO or Subsystem Device Driver (SDD) for Open Systems
    - Dynamic Path Selection (DPS) and Dynamic Path Reconnect (DPR) for z/OS
    - Distribute paths across multiple physical adapters on the DS8000

    System i
    - DSCLI commands executed through i5/OS interface
    - Copy Services for System i Toolkit
    - Combination of iSeries Navigator and 5250 interface
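
The host connection guidance (multiple paths per host, spread across physical adapters, all available) lends itself to an automated check. The following is a minimal sketch, assuming path information has already been collected from the multipath driver into simple records; the `HostPath` record, the `path_problems` function, and the adapter names are hypothetical and do not correspond to any SDD or MPIO interface.

```python
from collections import defaultdict
from typing import NamedTuple


class HostPath(NamedTuple):
    host: str
    adapter: str   # physical host adapter on the storage unit (hypothetical label)
    online: bool


def path_problems(paths, min_adapters=2):
    """Return human-readable findings for hosts whose paths are offline
    or not spread across at least `min_adapters` physical adapters."""
    by_host = defaultdict(list)
    for p in paths:
        by_host[p.host].append(p)

    problems = []
    for host, host_paths in sorted(by_host.items()):
        offline = sorted(p.adapter for p in host_paths if not p.online)
        if offline:
            problems.append(f"{host}: offline path(s) via {', '.join(offline)}")
        adapters = {p.adapter for p in host_paths if p.online}
        if len(adapters) < min_adapters:
            problems.append(f"{host}: online paths use only {len(adapters)} adapter(s)")
    return problems


if __name__ == "__main__":
    sample = [
        HostPath("hostA", "HA1", True),
        HostPath("hostA", "HA2", True),
        HostPath("hostB", "HA1", True),
        HostPath("hostB", "HA1", True),
    ]
    for finding in path_problems(sample):
        print(finding)
```

An empty result would be the expected state before starting a software upgrade.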

    Duplicate Storage Subsystems in Campus or Same Data Center Floor
    - Can use Metro Mirror for data redundancy, to enable quick Re-IPL
    - Requires automation software such as TPC-R or GDPS
    - IBM Softek TDMF to move data around in real time
    - Can perform local Site Switch before maintenance actions to reduce impact of human errors and reduce impact to production

    Know the following IBM System Storage web sites

    IBM System Storage support web site
    - Starting point for IBM System Storage hardware and software support
    - Includes links to subscription services to sign up for email alerts
    - Includes links to product docs, contact information, fix search engine

    IBM System Storage Interoperation Center
    http://www-01.ibm.com/servers/storage/support/config/ess/index.jsp

    Fibre Channel host bus adapter firmware and driver level matrix site

    System Storage Enterprise Disk Practices: Advanced Copy Functions Overview for Availability

    Point in Time Copy (FlashCopy)

    Minimize application / database downtime required to make local point in time copies for:

    - Production backup, data cloning, data warehouse, test and development

    Disk subsystem microcode creates internal copy of data (FlashCopy)

    - Copy initialization of multiple terabytes of data can be accomplished in seconds

    Remote Mirroring (Metro Mirror, Global Mirror, z/OS Global Mirror)

    Create real-time, continuously updated remote copies of disk subsystem data

    - Campus, metropolitan, or geographically distant site

    Data suitable for High Availability fast failover and failback

    Supports large amounts of data, at the terabyte level

    Disk subsystem microcode mirrors volumes/LUNs to remote disk subsystem

    - Synchronous capability (Metro Mirror)

    - Asynchronous capability (Global Mirror)

    System Storage Enterprise Disk Practices: Point in Time internal Data Replication

    Fast Time-Zero internal data replication capability (FlashCopy)

    Create internal copies of data for backup, cloning, data mining, etc.

    Physical configuration

    Assure sufficient target disk space allocated

    Usage practices:

    Plan databases/applications to be in hot backup mode or quiesce to maintain data integrity

    Back up internal volume/LUN required for:

    Operating System catalogs, etc.

    Database/application metadata

    System Storage Enterprise Disk Practices: Metro Mirror - Synchronous Data Replication

    Applicability:

    General: provide synchronous data replication of disk subsystem at volume / LUN level

    System z: In combination with GDPS HyperSwap, provides foundation for removal of Parallel Sysplex disk subsystem single point of failure

    Physical configuration, link and infrastructure planning

    Must perform initial and ongoing analysis of write workload to determine sufficient SAN/WAN/telecom infrastructure bandwidth
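
As a rough illustration of the write workload analysis, the sketch below converts measured write throughput samples into a link bandwidth figure. It is a simplified starting point under stated assumptions: sizing is driven by the peak sample (for a synchronous mirror, host writes wait on the link), the 20% headroom is an arbitrary placeholder, and protocol overhead and compression are ignored. The function name and parameters are hypothetical.

```python
def required_link_mbps(write_mb_per_s_samples, headroom=0.2):
    """Estimate replication link bandwidth in Mbit/s from write-rate samples in MB/s.

    Sizes for the peak observed write rate plus a fractional headroom.
    """
    if not write_mb_per_s_samples:
        raise ValueError("need at least one write-rate sample")
    peak_mb_per_s = max(write_mb_per_s_samples)
    # 1 MB/s corresponds to 8 Mbit/s; protocol overhead is not modeled here
    return peak_mb_per_s * 8 * (1 + headroom)


if __name__ == "__main__":
    samples = [10, 25, 50, 30]  # hypothetical write MB/s measurements
    print(f"{required_link_mbps(samples):.0f} Mbit/s")
```

Repeating the calculation as new samples are collected matches the "initial and ongoing" nature of the analysis.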

    Automation

    Plan for highly automated operational control of mirroring to mask complexity and support reliability, repeatability, testability

    Testing and testing resource expectations

    Plan to provide Tertiary Copy storage at remote site

    - For every production TB to be mirrored, ideally 2x that TB at remote site

    - To provide additional storage for ongoing testing environment, resync protection, golden copy, problem determination, validation

    System Storage Enterprise Disk Practices: Global Mirror - Asynchronous Data Replication

    Applicability: IBM Global Mirror is usually chosen when

    Open Systems or mix of z/OS and Open asynchronous replication of volumes/LUNs is desired, and when reduced bandwidth is a necessity

    Link and infrastructure planning

    Must perform initial and ongoing analysis of write workload to determine sufficient SAN/WAN/telecom infrastructure bandwidth

    Similar speed and throughput characteristics on source and target volumes can provide optimum performance

    Automation

    Plan for highly automated operational control of mirroring to mask complexity and support reliability, repeatability, testability

    Availability and Testing

    Plan to provide sufficient Tertiary Copy storage at remote site

    - For every production TB to be mirrored, ideally 3x that TB at remote site

    - To provide storage for ongoing testing environment, resync protection, golden copy, problem determination, validation

    System Storage Enterprise Disk Practices: z/OS Global Mirror (XRC) - Asynchronous Data Replication

    Applicability of IBM z/OS Global Mirror (XRC):

    General: z/OS Global Mirror is usually chosen when:

    - Only z/OS data requires asynchronous data replication, or when heterogeneous z/OS disk vendors are required.

    Physical configuration, link and infrastructure planning

    Must perform initial and ongoing analysis of write workload to determine sufficient SAN/WAN/telecom infrastructure bandwidth

    Similar speed and throughput characteristics on source and target volumes can provide optimum performance

    Plan to provide sufficient System z cycles at remote site for System Data Mover

    Automation

    Plan for highly automated operational control of mirroring to mask complexity and support reliability, repeatability, testability

    Availability and Testing

    Plan to provide sufficient Tertiary Copy storage at remote site

    - For every production TB to be mirrored, ideally 2x that TB at remote site

    - To provide ongoing testing environment for setup, validation, problem determination
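
The remote-site sizing guidance on the three mirroring slides (ideally 2x for Metro Mirror, 3x for Global Mirror, 2x for z/OS Global Mirror) reduces to simple arithmetic. The sketch below only encodes those stated multipliers; the dictionary keys and function name are hypothetical labels chosen for illustration.

```python
# Ideal remote-site capacity per production TB, as stated on the slides
REMOTE_MULTIPLIER = {
    "metro_mirror": 2,
    "global_mirror": 3,
    "zos_global_mirror": 2,
}


def remote_capacity_tb(production_tb, technique):
    """Ideal remote-site capacity (TB), including Tertiary Copy storage."""
    return production_tb * REMOTE_MULTIPLIER[technique]


if __name__ == "__main__":
    for name in REMOTE_MULTIPLIER:
        print(name, remote_capacity_tb(10, name), "TB")
```

For example, mirroring 10 TB of production data with Global Mirror would ideally call for 30 TB at the remote site.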

    System Storage Enterprise Disk Practices: Three site replication

    When to use 3 site

    Three site replication is used when the requirement is to combine a zero data loss RPO, using local Metro Mirror, with out of region recovery (asynchronous).

    Pre-requisites: Three site replication is affordable and justifiable to the business when:

    Data Center strategy and implementation is already well under way towards Active-Active or Planned Workload Rotation for two site

    Pre-requisite: Two site configuration already includes ongoing:

    Automated failover/failback

    Full Tertiary Copy capability for testing, problem determination, validation, automation

    Ongoing WAN / bandwidth / workload Capacity Planning

    System Storage Enterprise Disk Practices: Management of Replication

    Plan for highly automated disk mirroring environment

    Provides foundation for Reliability, Repeatability, Scalability, Testability

    Recommendations for automation software:

    System z environment: GDPS

    Mixed open platform: GDOC

    General disk mirroring mgmt: TPC for Replication

    System Storage Enterprise Disk Practices: Resources

    System Storage Business Continuity Solutions website
    http://www-03.ibm.com/servers/storage/solutions/business_continuity/index.html

    System Storage Technology Center
    http://www-03.ibm.com/system/storage/

    Storage Education
    http://www-03.ibm.com/systems/education/cust/crossprod/custcp.html

    System Storage Interoperation Center
    http://www-01.ibm.com/systems/support/storage/config/ssic/index.jsp

    System Storage Services
    http://www-03.ibm.com/systems/storage/services/index.html

    Redbooks/Redpapers
    http://www.redbooks.ibm.com/redbooks.nsf/portals/Storage
    - The IBM TotalStorage DS8000 Series: Concepts and Architecture (SG24-6452-00)
    - IBM System Storage Business Continuity Solutions Overview (SG24-6684-01)
    - IBM System Storage DS8000 Series: Copy Services with IBM System z (SG24-6787-02)
    - IBM System Storage DS8000 Series: Copy Services in Open Environments (SG24-6788-02)
    - IBM System Storage Solutions Handbook (SG24-5250-06)

    White papers
    - IBM Storage Infrastructure for Business Continuity Solution
    - Global Mirror Technical Whitepaper

    Data Corruption Solutions

    System Storage Enterprise Disk Practices: Data Corruption

    Logical data corruption protection must be designed at the operational and application level

    Best practices procedures are:

    Sufficient point in time disk copies of data

    - To provide adequate known restart points

    Supplemented by operational procedures at the database/application level

    Tools include (but are not limited to):

    FlashCopy Point in Time Copy

    Software: zCDP for DB2 (z/OS 1.8 + DB2 9)

    - Eliminates need for DB2 Backup Windows via DB2 BACKUP Utility

    - No interruption to DB2 processing to take backups

    - DFSMShsm maintains up to 50 backup versions across disk & tape

    - DB2 RESTORE Utility granularity: System, Volume, DB Table

    Future: zCDP for Storage. IBM Statement of Direction (SOD) on providing CDP function for all z/OS data.
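
The value of keeping "sufficient point in time disk copies" is that, once corruption is detected, there is a known restart point that predates it. The following minimal sketch shows that selection step only; it assumes the timestamps of retained copies are available as a list, and the function name is hypothetical. It does not model how any particular tool catalogs its copies.

```python
from bisect import bisect_left
from datetime import datetime


def latest_restart_point(copy_times, corruption_time):
    """Return the most recent copy taken strictly before corruption_time,
    or None if every retained copy is at or after the corruption."""
    ordered = sorted(copy_times)
    # Index of the first copy taken at or after the corruption
    idx = bisect_left(ordered, corruption_time)
    return ordered[idx - 1] if idx > 0 else None


if __name__ == "__main__":
    copies = [
        datetime(2007, 10, 1, 0, 0),
        datetime(2007, 10, 1, 6, 0),
        datetime(2007, 10, 1, 12, 0),
    ]
    print(latest_restart_point(copies, datetime(2007, 10, 1, 9, 30)))
```

The more frequent the retained copies, the smaller the gap between the chosen restart point and the moment corruption began.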

    Supplemental Information

    FlashCopy: Local Point in Time Data Replication to improve data availability

    Copy data command issued: copy is immediately available

    Read and write to both source and copy possible

    Optional background copy

    When copy is complete, relationship between source and target ends

    [Figure: timeline of a FlashCopy relationship between Source and Target, with reads and writes against both]
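
The behavior described on this slide (copy usable immediately, both sides readable and writable, optional background copy) can be illustrated with a toy copy-on-write model. This is a conceptual sketch only and is not a description of how the disk subsystem microcode implements FlashCopy; the class and method names are hypothetical.

```python
class PointInTimeCopy:
    """Toy copy-on-write model of a point-in-time copy of a block volume."""

    def __init__(self, source):
        self.source = source   # live source volume, a list of blocks
        self.target = {}       # blocks physically present on the target so far
        # The logical copy is usable immediately: nothing has been copied yet

    def write_source(self, block, data):
        # Preserve the time-zero contents on the target before overwriting
        if block not in self.target:
            self.target[block] = self.source[block]
        self.source[block] = data

    def write_target(self, block, data):
        # Writes to the copy never touch the source
        self.target[block] = data

    def read_target(self, block):
        # Blocks not yet copied are still served from the unchanged source
        return self.target.get(block, self.source[block])

    def background_copy(self):
        """Copy all remaining blocks; returns True when the target is complete."""
        for block, data in enumerate(self.source):
            self.target.setdefault(block, data)
        return len(self.target) == len(self.source)


if __name__ == "__main__":
    volume = ["a", "b", "c"]
    pit = PointInTimeCopy(volume)
    pit.write_source(0, "A")
    print(volume[0], pit.read_target(0))
```

In this model, once `background_copy` returns True the target no longer depends on the source, which corresponds to the point at which the relationship can end.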

    FlashCopy Use Cases

    - Production backup
      Regain information from an older level of data
      Re-establish production in case of any server errors

    - Data backup
      Create backups with the shortest possible application outage

    - Data mining
      Avoid performance impacts on the production system

    - Test system
      Test new applications with real production data

    - Moving and migrating data
      Move a consistent data set from one host to another with a minimum of downtime for the host application

    Metro Mirror: synchronous replication of data between two storage subsystems to improve data availability

    [Figure: scalable data integrity across three layers: server cluster, storage network, storage mirroring]

    Synchronous data mirroring (up to 300 km)

    Superior performance
    - Low internal Metro Mirror overhead (at zero distance, DS8000 additional overhead is 0.38 ms)
    - Optimized protocol exchange
    - Each 100 km adds 1 ms
    - Plus switch/channel extender overheads
    - Generally fewer links required than the competition

    Platform environment
    - System z: GDPS/PPRC, GDPS/PPRC HyperSwap Manager
    - System p: AIX HACMP/XD + Metro Mirror
    - System i: High Availability Business Partner software; ASR Toolkit
    - Geographically Dispersed Open Clusters (GDOC) for Unix, Linux and Windows
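
The overhead figures quoted for Metro Mirror (0.38 ms at zero distance on DS8000, plus 1 ms per 100 km, plus switch or channel extender overhead) can be combined into a rough estimate of added write latency. This is a minimal sketch under those stated figures; the function name and the `extender_overhead_ms` parameter are hypothetical, and the extender value must come from the actual equipment.

```python
def metro_mirror_added_latency_ms(
    distance_km,
    base_overhead_ms=0.38,     # DS8000 overhead at zero distance, per the slide
    ms_per_100_km=1.0,         # distance penalty, per the slide
    extender_overhead_ms=0.0,  # switch/channel extender overhead (site-specific)
):
    """Estimate the extra latency added to each write by synchronous mirroring."""
    if not 0 <= distance_km <= 300:
        raise ValueError("the slide quotes synchronous mirroring up to 300 km")
    distance_penalty = (distance_km / 100.0) * ms_per_100_km
    return base_overhead_ms + distance_penalty + extender_overhead_ms


if __name__ == "__main__":
    for km in (0, 50, 100, 300):
        print(km, "km:", round(metro_mirror_added_latency_ms(km), 2), "ms")
```

At 100 km, for instance, the estimate is about 1.38 ms of added latency per write before any extender overhead.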

    GDPS/PPRC HyperSwap Manager and Metro Mirror

    Extends Parallel Sysplex availability to z/OS DS8000, DS6000, ESS disk subsystems
    - Eliminates disk subsystem as single point of failure in a z/OS Parallel Sysplex
    - Masks primary disk subsystem failures by transparently switching to use secondary disks (Unplanned HyperSwap)
    - Provides ability to perform disk maintenance without requiring applications to be quiesced (Planned HyperSwap)
    - Delivered as IGS Services offering

    Technical concept:
    - Planned or unplanned HyperSwap will dynamically substitute DS8000, DS6000, or ESS Metro Mirror secondary for primary device
    - No operator interaction: GDPS-managed
    - Can swap large number of volumes, fast
    - Includes volumes with Sysres, page DS, catalogs
    - Non-disruptive: applications keep using same device addresses

    [Figure: applications reach the primary (P) and secondary (S) volumes through UCBs; P and S are linked by Metro Mirror]
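
The key idea in the technical concept, that applications keep using the same device addresses while the device behind each address changes, can be shown with a toy model. This sketch is purely conceptual and does not represent z/OS UCB structures or GDPS logic; the `DevicePair` class, the `hyperswap` function, and the device names are hypothetical.

```python
class DevicePair:
    """A device address backed by a Metro Mirror primary/secondary pair."""

    def __init__(self, address, primary, secondary):
        self.address = address    # address the application keeps using
        self.active = primary     # device currently serving I/O
        self.standby = secondary  # mirrored device

    def write(self, data):
        # The application addresses the pair; I/O goes to whichever device is active
        return f"{self.address} -> {self.active}: {data}"


def hyperswap(pairs):
    """Swap roles for every pair; the addresses seen by applications do not change."""
    for pair in pairs:
        pair.active, pair.standby = pair.standby, pair.active


if __name__ == "__main__":
    pairs = [DevicePair("1000", "P-vol0", "S-vol0"), DevicePair("1001", "P-vol1", "S-vol1")]
    print(pairs[0].write("x"))
    hyperswap(pairs)
    print(pairs[0].write("x"))
```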

    Global Mirror: Asynchronous data replication between two storage subsystems to improve data availability at global distance

    [Figure: Global Mirror configuration. Labels: primary hosts, SAN, A (primary, native performance), Global Copy (transmission performance), B (secondary), FlashCopy (consistent data), remote hosts]

    Two site, unlimited global distance

    Complete and consistent data mirroring

    Consistency groups
    - Across z/OS and Open Systems data
    - Across up to 16 subsystems

    Currency can be configured to as little as 3 to 5 seconds behind host I/O

    Native application performance

    Platform environment
    - System z: GDPS/GM
    - Geographically Dispersed Open Clusters (GDOC) for Unix, Linux and Windows

    z/OS Global Mirror (XRC): Asynchronous data replication between two storage subsystems to improve data availability at global distance, using System z MIPS

    Productivity tool that integrates management of XRC and FlashCopy

    Premium performance & scalability

    Data moved by System Data Mover (SDM) address space(s) running on System z

    Supports heterogeneous disk subsystems

    GDPS/XRC runs in the SDM location
    - Manages availability of SDM Sysplex
    - Performs fully automated site failover
    - Single point of control for multiple / coupled System Data Movers

    Supports zSeries and zSeries Linux data

    Over 200 installations worldwide

    XRC manages secondary consistency
    - Across any number of primary subsystems
    - All writes time-stamped and sorted before committed to secondary devices
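
The statement that writes are time-stamped and sorted before being committed can be illustrated with a simplified ordering step. The sketch below merges per-primary write streams by timestamp and, as an assumption made for this illustration, only releases writes up to the earliest "latest timestamp" seen across the primaries, so that no primary is applied ahead of the others. It is not a description of the System Data Mover's actual algorithm; the function name and record layout are hypothetical.

```python
import heapq


def consistent_apply_order(writes_by_primary):
    """Return the writes that can be committed to the secondary, in timestamp order.

    writes_by_primary maps a primary subsystem name to a list of
    (timestamp, volume, data) tuples already sorted by timestamp.
    """
    streams = list(writes_by_primary.values())
    if not streams or any(not s for s in streams):
        # Without data from every primary there is no safe cutoff yet
        return []
    # Do not commit past the point that every primary has reported up to
    cutoff = min(stream[-1][0] for stream in streams)
    merged = heapq.merge(*streams)
    return [w for w in merged if w[0] <= cutoff]


if __name__ == "__main__":
    pending = {
        "P1": [(1, "vol1", "a"), (4, "vol1", "b")],
        "P2": [(2, "vol2", "c"), (3, "vol2", "d")],
    }
    for write in consistent_apply_order(pending):
        print(write)
```

In the sample data the write stamped 4 is held back, because the second primary has only reported writes up to timestamp 3.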

    [Figure: GDPS/XRC configuration. Labels: production systems, primary disk subsystems, SDM systems, journals, secondary disk subsystems]

    Metro/Global Mirror: IBM three site recovery

    Ability to switch production to any site
    - Planned/Unplanned Outage
    - Minimal Data Movement

    Protection from local site disaster
    - Metro Mirror (Sync PPRC)
    - GDPS/MGM with HyperSwap locally

    Protection from regional disaster
    - Global Mirror (Async PPRC) Regional C
    - Minimal Data Loss (3-5 seconds)

    Resynchronize any site with incremental changes only

    Managed by GDPS/MGM or TPC-R

    [Figure: three-site configuration; legible labels: Metro Mirror, Global Mirror, Backup]

    IBM TotalStorage Productivity Center for Replication (TPC-R)

    [Figure: TPC-R packaging. TPC-R V3.1: FlashCopy; Metro Mirror, Global Mirror; Session Management; Consistency Groups; Replication Monitor; Copy Device Interface. TPC-R V3.1 Two-Site BC V3.1: basic function plus High Availability, DR Management. Also shown: 3rd Party Storage; GUI / CLI / API; ESS 800, DS6K, DS8K, SVC]

    Enable the configuration of complex replication environments, provide feedback on the state of their operations, and make changes easy to accomplish

    Provide Common Interface
    - Single point of control
    - Single set of commands and session states

    Build on copy services functions to provide our customers a DR solution

    Dynamically monitor Metro Mirror and maintain write order data consistency

    Hide differing hardware technologies and unique Copy Service function implementations

    Automate Metro/Global Mirror Incremental Resync function