
Building a Dynamic Data Center with Multi-Site DR using VMware® Infrastructure 3

September 10-13, 2007

Contents

Dynamic Data Center Lab Overview
    Lab Architecture and Setup
Creating Resource Pools
    Overview of Resource Pools
    Understanding CPU and Memory Limits, Reservations and Shares
    Overview of the Share System
    Objectives for This Section
    Part 1: Create Resource Pools
    Part 2: Verify Resource Pool Functionality
VMware VMotion
    Overview of VMware VMotion
    Objectives for this Section
    Part 1: Prepare for Migration
    Part 2: Check VMotion Requirements of the ESX Server System
    Part 3: Perform a VMotion Migration of a Virtual Machine
VMware DRS (Distributed Resource Scheduler)
    Overview of VMware DRS
    Objectives for This Section
    Enable Cluster for VMware DRS
    Configure DRS Automation Migration Settings
    Add ESX Server Hosts to the VMware DRS Cluster
    Configure VMware DRS Affinity Rule
    Configure VMware DRS Anti-Affinity Rule
    Work with Your Lab Partners for This Lab
VMware HA (High Availability)
    Overview of VMware HA
    Objectives for This Section
    Part 1: Prepare for the Lab
    Part 2: Modify Cluster to Add VMware HA Functionality
    Part 3: Verify VMware HA Functionality
Virtual Infrastructure Disaster Recovery Demonstration
    Overview of Virtual Infrastructure DR
    Objectives for This Demonstration
    Section 1: Review Initial Setup for Demonstration
Appendix A
    Pod Design
Appendix B
    DR Configuration
Appendix C
    DR References
Appendix D
    Introduction to EMC MirrorView/A
    Theory of Operation for the MirrorView/A Process
    MirrorView/A Configuration
Appendix E
    Overview for the Disaster Recovery Demonstration
    Objectives for this Section
    DR Demonstration Overview
    Part 1: Preconfiguration for the DR Environment
    Part 2: Simulating a Disastrous Event
    Part 3: Disaster Recovery
    Part 4: Graceful Failback
Appendix F
    DR Setup Notes
Notes

Instructors

Zack Zimmerman

David Day

Stephen Sorota

Denise Glasscock

Luis Herrera

Fredrick Rynger


Dynamic Data Center Lab Overview

This Dynamic Data Center Lab gives hands-on experience with the setup, use, and configuration of resource pools, VMware VMotion, VMware HA (High Availability), VMware DRS (Distributed Resource Scheduler), and disaster recovery, to help transform a traditional static and inflexible computing environment into a more dynamic datacenter.

Lab Architecture and Setup

• Each lab participant sits next to a lab partner. You and your partner share the laptop in front of you, and work within the virtual machine on that laptop specifically set up for this lab.

• At specific points in the lab, each team works with the team next to them to perform certain functions. For this reason, please don't jump too far ahead.

• Each participant should refer to the one-page cheat sheet for important information regarding seat number, server information, passwords, and other details needed to participate in the lab.

• To start the lab, click the VMware Virtual Infrastructure Client icon on the screen in front of you. A login page appears. Use the information appropriate to your station (provided in the session and on the one-page cheat sheet) to log into your VirtualCenter Server.


Creating Resource Pools

Overview of Resource Pools

Resource pools are used to hierarchically partition available CPU and memory resources. Resource pools allow you to manage and control aggregate CPU and memory resources as a single computer resource, either on a standalone ESX Server host or on a cluster of ESX Server hosts. Each resource pool has a parent (the host, the cluster, or another resource pool) and can contain child resource pools to further subdivide resources. Furthermore, each resource pool always has shares, reservations, and limits associated with it.

Understanding CPU and Memory Limits, Reservations and Shares

CPU Limit

Amount of CPU specified as upper limit for this virtual machine. By default, no limit is specified and Unlimited is displayed. Measured in MHz.

Memory Limit

Amount of memory specified as upper limit for this virtual machine. By default, no limit is specified and Unlimited is displayed. Measured in MB.

CPU Reservation

Amount of CPU cycles reserved for this virtual machine, measured in MHz.

Memory Reservation

Amount of memory reserved for this virtual machine, measured in MB.

CPU Shares

More shares means that this virtual machine wins competitions for CPU time more often.

Memory Shares

More shares means that this virtual machine wins competitions for memory more often.


Overview of the Share System

Each virtual machine is entitled to resources in proportion to its specified shares, bounded by its reservation and limit. A virtual machine with twice as many shares as another is entitled to twice as many resources.

[Figure: three virtual machines and the fraction of resources each is guaranteed, shown on four lines labeled Number of Shares, Change number of shares, Power on VM, and Power off VM.]

• On the first line, where “Number of Shares” is listed, each virtual machine is guaranteed one-third of the available resources if there is contention.

• On the second line, where “Change number of shares” is listed, Virtual Machine B is guaranteed three-fifths of the available resources if there is contention.

• On the third line, where “Power on VM” is listed, Virtual Machine B is guaranteed one-half of the available resources if there is contention.

• On the last line, where “Power off VM” is listed, Virtual Machine B is guaranteed three-fifths of the available resources if there is contention.
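The fractions in these bullets follow directly from the share arithmetic: under contention, each powered-on virtual machine is entitled to its shares divided by the sum of the shares of all powered-on virtual machines competing in the same pool. The short Python sketch below reproduces the four scenarios; the absolute share values (1000 and 3000) are illustrative assumptions, and only their ratios matter.

    def entitlement(shares):
        """Guaranteed fraction of a contended resource for each powered-on VM."""
        total = sum(shares.values())
        return {vm: s / total for vm, s in shares.items()}

    # Line 1 -- three VMs with equal shares: each is guaranteed one-third.
    print(entitlement({"A": 1000, "B": 1000, "C": 1000}))

    # Line 2 -- triple B's shares: B is guaranteed 3000/5000 = three-fifths.
    print(entitlement({"A": 1000, "B": 3000, "C": 1000}))

    # Line 3 -- power on a fourth VM D: B drops to 3000/6000 = one-half.
    print(entitlement({"A": 1000, "B": 3000, "C": 1000, "D": 1000}))

    # Line 4 -- power D back off: B returns to three-fifths.
    print(entitlement({"A": 1000, "B": 3000, "C": 1000}))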

Objectives for This Section

• Demonstrate resource pools on standalone hosts.
• Build two resource pools, and assign resource policies to them.
• See the resource pools' impact on resource allocation.

Part 1: Create Resource Pools

1. Using the VI Client, log into the VirtualCenter host as vcuserX.

2. Create a resource pool named Test and Dev. To do this:

a. Right-click your assigned ESX Server machine in the inventory (10.91.49.x).
b. Select New Resource Pool from the menu.


c. Create a resource pool with the properties listed in the table below.

Property                Setting
Name                    Test and Dev
CPU Resource Shares     Low


d. Click OK.

3. Verify that the CPU Resource Shares value is set to Low by selecting the Summary tab of the Test and Dev resource pool.


4. Create a resource pool named Production. To do this:

a. Right-click your ESX Server machine in the inventory.
b. Choose New Resource Pool from the menu.
c. Give the resource pool the following properties:

Property                Setting
Name                    Production
CPU Resource Shares     High

d. Click OK.

5. Verify that the CPU Resource Shares value has changed by viewing the Summary tab of the Production resource pool.

6. Drag and drop one virtual machine into each resource pool.


7. Force both virtual machines to run on the same CPU (this is a logical CPU if hyperthreading is enabled, a CPU core on a dual- or multi-core CPU, or a physical CPU). To do this for each virtual machine:

a. Right-click the virtual machine.
b. Choose Edit Settings from the menu.
c. Select the Resources tab.
d. Select Advanced CPU.
e. In the Scheduling Affinity section, select Run on processor(s).


f. Select CPU1. Make sure the same CPU is selected for each virtual machine. Avoid CPU 0, as this is used by the service console.
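The lab performs Part 1 entirely through the VI Client, but the same configuration can also be scripted against the VirtualCenter API. The sketch below is only an illustration of the objects involved (ResourceConfigSpec, SharesInfo, cpuAffinity): it uses the modern pyVmomi Python bindings rather than the VI3-era tooling of this lab, and the server address, credentials, host address, and VM name are placeholders, not values from the lab sheet.

    import ssl
    from pyVim.connect import SmartConnect, Disconnect
    from pyVmomi import vim

    def find(content, vimtype, name):
        """Return the first managed object of the given type with the given name."""
        view = content.viewManager.CreateContainerView(content.rootFolder, [vimtype], True)
        try:
            return next(obj for obj in view.view if obj.name == name)
        finally:
            view.DestroyView()

    def pool_spec(cpu_level):
        """ResourceConfigSpec with the given CPU share level and normal memory shares."""
        def alloc(level):
            return vim.ResourceAllocationInfo(
                reservation=0, limit=-1, expandableReservation=True,
                shares=vim.SharesInfo(level=level, shares=0))
        return vim.ResourceConfigSpec(cpuAllocation=alloc(cpu_level),
                                      memoryAllocation=alloc('normal'))

    ctx = ssl._create_unverified_context()              # lab-style self-signed certificates
    si = SmartConnect(host='vcserver.example.local', user='vcuserX',
                      pwd='password', sslContext=ctx)   # placeholder credentials
    content = si.RetrieveContent()

    host = find(content, vim.HostSystem, '10.91.49.x')  # your assigned ESX Server host
    root = host.parent.resourcePool                     # the host's root resource pool

    # Steps 2-5: create the two resource pools with Low and High CPU shares.
    test_dev = root.CreateResourcePool(name='Test and Dev', spec=pool_spec('low'))
    production = root.CreateResourcePool(name='Production', spec=pool_spec('high'))

    # Step 6: move a virtual machine into one of the pools (placeholder VM name).
    vm = find(content, vim.VirtualMachine, 'pXsXvm1')
    production.MoveIntoResourcePool(list=[vm])

    # Step 7: pin the VM to CPU 1 (avoid CPU 0, which the service console uses).
    spec = vim.vm.ConfigSpec(cpuAffinity=vim.vm.AffinityInfo(affinitySet=[1]))
    vm.ReconfigVM_Task(spec=spec)

    Disconnect(si)

Later sketches in this guide reuse the si, content, and find() helpers defined here rather than repeating the connection boilerplate.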

Part 2: Verify Resource Pool Functionality

1. Log into each virtual machine through its console, using the user name and password provided. To do this, click the virtual machine and choose Console from the top menu.

2. Access each virtual machine's console and run cpubusy.vbs (a rough sketch of what this script does appears at the end of this part). To do this, change the directory to c:\labfiles and enter the following command:

cscript cpubusy.vbs

3. Let the scripts run for a few seconds, then compare the performance of the script in each virtual machine.


4. Return to the inventory and edit the settings of the Production resource pool. Experiment with custom values for its CPU resource shares and compare performance between the two virtual machines.

5. When you are finished, clean up.

a. End the cpubusy.vbs script (press Ctrl-C).
b. Set the resource pools' CPU Resource Shares back to Normal.
c. Modify the virtual machines' Scheduling Affinity to No affinity (by right-clicking the virtual machine, then selecting Edit Settings > Resources > Advanced CPU).
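The cpubusy.vbs script itself is not reproduced in this guide. Conceptually it is just a tight floating-point loop that repeatedly reports how long a fixed amount of work took, so that two virtual machines pinned to the same CPU reveal their relative share entitlement. A rough Python stand-in (an assumption about the script's behavior, not its actual source) would be:

    import math
    import time

    # Repeatedly time a fixed chunk of floating-point work. Under CPU contention,
    # the virtual machine with more shares finishes each chunk in less time.
    while True:
        start = time.time()
        for _ in range(2_000_000):
            math.sin(0.5) * math.cos(0.5)
        print("finished one work unit in %.2f seconds" % (time.time() - start))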


VMware VMotion

Overview of VMware VMotion

VMware VMotion enables the live migration of running virtual machines from one physical server to another with zero downtime, continuous service availability, and complete transaction integrity. VMotion gives users the ability to perform planned hardware maintenance without downtime, as well as the ability to proactively migrate virtual machines away from failing or underperforming servers.

Objectives for this Section

• Ensure VMotion requirements are met.
• Perform actual migrations with VMotion.

Part 1: Prepare for Migration

NOTE: This part should be performed by each ESX Server team.

1. Select a virtual machine on your assigned ESX Server machine to be migrated using VMotion. Preferably, select one that is powered on. Record the name of the selected virtual machine here: ____________________________

2. Verify that the virtual machine you selected meets the VMotion requirements. The virtual machine should have no active connection to an internal virtual switch. This requirement prevents the virtual machine from being migrated accidentally when services running within it require connectivity to a resource available only through that internal switch. To verify this, look at the Network section in the virtual machine's Summary tab.

3. The virtual machine should have no active connection to a local CD-ROM or floppy image. This requirement is in place to prevent the virtual machine from being migrated accidentally and experiencing sudden loss of connectivity to that CD-ROM or floppy image. To verify this:

a. Right-click the virtual machine in the inventory.
b. Choose Edit Settings from the menu.
c. Select the floppy drive and CD/DVD drive one at a time and make sure that the Connected box is unchecked.

4. The virtual machine should not currently have a CD-ROM or floppy image located in a local datastore defined. This requirement is in place to prevent the virtual machine from being migrated accidentally and experiencing sudden loss of connectivity to that CD-ROM or floppy image. To verify this:

a. Right-click the virtual machine in the inventory.
b. Choose Edit Settings from the menu.
c. Select Floppy Drive 1 from the Hardware list.
d. If Use existing floppy image in datastore is selected, make sure that the image specified is not in a local datastore.
e. If a local image is specified, disallow using an existing floppy image in a datastore by selecting another option, such as Host Device.
f. Select CD/DVD Drive 1 from the Hardware list.
g. If Datastore ISO file is selected, make sure that the image specified is not in a local datastore.
h. If a local image is specified, disallow using the datastore ISO file by selecting another option, such as Host Device.

5. The virtual machine should not be mapped to a SAN LUN that cannot be seen by the destination ESX Server system. This requirement is in place to prevent the virtual machine from being migrated accidentally and experiencing sudden loss of connectivity to storage it requires. (You will verify this in the next section).

6. The virtual machine should not have CPU affinity set (to verify this, right-click on the virtual machine, and choose Edit Settings > Resources > Advanced CPU. Set the Mode to Any).

7. The virtual machine is not clustered (using MSCS) with another virtual machine.

8. Once the virtual machine has satisfied all VMotion requirements, then power on the virtual machine, if necessary.


Part 2: Check VMotion Requirements of the ESX Server System

NOTE: This part should be performed by each ESX Server team.

1. Verify that both ESX Server machines in your datacenter have compatible CPUs. To do this, look at the Processor information in each ESX Server’s Summary tab by clicking on one of the servers in your cluster.

2. Use the new host resources topology map to verify that both ESX Server machines are connected to the same network and to the VMFS datastore containing the virtual machine's files. The host resources topology map simplifies the overall checking required to ensure a successful VMotion. To use it:

a. Click the PodXCluster icon.
b. Click the Maps panel at the far right.
c. In the left pane, check the boxes next to the ESX Server machines assigned to your pod. Uncheck all other boxes.
d. Choose Host Resources from the Map Relationships drop-down menu on the right side of the page.
e. Under the Host Options, select Host To Network and Host To Datastore.
f. Select Apply Relationships.
g. Verify that both ESX Server machines in the datacenter see the same datastore and network to which the virtual machine is connected.


3. Verify that the virtual switch being used for VMotion is on a Gigabit Ethernet backplane, by selecting Networking under the Hardware listing of the Configuration tab.


4. Verify that the VMkernel port for VMotion has already been configured.

a. For the virtual switch associated with the VMkernel, click Properties.
b. Select VMkernel and click Edit.
c. Ensure that VMotion is enabled.

Part 3: Perform a VMotion Migration of a Virtual Machine

NOTE: This part should be performed by each ESX Server team.

1. Use VMotion to migrate the selected virtual machine to the other ESX Server machine in the datacenter. To do this:

a. In the inventory, select the virtual machine to migrate.
b. Launch this virtual machine's console so you can see it running. To do this, right-click the virtual machine and choose Open Console from the menu.


c. Log into the guest operating system of this virtual machine, if you have not already done so.

d. Open a command prompt to display the virtual machine’s IP and MAC address (ipconfig /all). Note the addresses here: ________________________________

e. Issue a constant ping to the other ESX Server machine assigned to your pod. Enter the command:

ping 10.91.49.x -t

NOTE: The -t is important.

f. In the inventory, right-click the virtual machine to migrate.
g. Choose Migrate from the menu.

h. The Migrate Virtual Machine wizard appears. Use the following values when prompted for information by the wizard.

Migrate Virtual Machine Prompts      Values

Destination Host                     Select the other ESX Server machine assigned to your pod.

NOTE: At this point, the VMotion requirements are validated. If validation does not succeed, an error message is generated and you are unable to continue with the migration until the problem is resolved.

Resource Pool                        None exists, so just click Next.

Migration Priority                   High priority.


2. Monitor the progress of the VMotion using the Recent Tasks pane at the bottom of the page. The VMotion migration occurs quickly.

3. View the virtual machine's Summary tab to verify that it is now running on the other ESX Server system.

4. From the virtual machine’s console, check the virtual machine’s IP address and MAC address once more (using the ipconfig /all command). Did these addresses change? ____________________

To complete this section, migrate all virtual machines to the odd-numbered ESX Server host located in your pod. Then power off all of the virtual machines.
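For reference, the migration driven through the Migrate wizard above corresponds to a single MigrateVM_Task call in the VirtualCenter API. This pyVmomi sketch is not part of the lab procedure; it assumes the si, content, and find() helpers from the resource pool sketch earlier, and the VM and host names are placeholders.

    from pyVim.task import WaitForTask
    from pyVmomi import vim

    vm = find(content, vim.VirtualMachine, 'pXsXvm1')      # the VM selected in Part 1
    dest = find(content, vim.HostSystem, '10.91.49.y')     # the other host in your pod

    # Equivalent of the wizard: keep the current resource pool, migrate at high priority.
    task = vm.MigrateVM_Task(pool=None, host=dest,
                             priority=vim.VirtualMachine.MovePriority.highPriority)
    WaitForTask(task)
    print(vm.runtime.host.name)   # should now report the destination host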


VMware DRS (Distributed Resource Scheduler)

Overview of VMware DRS

DRS extends the concept of a CPU scheduler to span physical servers. One of its main goals is to balance the load of virtual machines across all hosts in the cluster. DRS also tries to ensure that resource policies are enforced appropriately, and it respects any placement constraints that exist. These constraints include the anti-affinity and affinity rules, as well as the set of VMotion compatibility constraints (processor, SAN, and LAN requirements) that should already be familiar to you from the last section.

To achieve these goals, DRS uses two techniques: dynamic balancing and initial placement. With dynamic balancing, DRS monitors key metrics associated with virtual machines, pools, and hosts. This information, in addition to the associated resource policies, is used to deliver the entitled resources to virtual machines and pools. In doing so, DRS not only rebalances the load but also adjusts for resource entitlement (resource entitlement is a term that encapsulates all the information about the resource pools that an administrator sets up in terms of shares, reservations, and limits).

DRS re-evaluates system load every few minutes. It also re-evaluates system load when hosts are added or removed, and when changes are made to the resource pool or virtual machine settings. The result of this is a prioritized list of recommendations. In manual mode, the user has the choice of applying or not applying these recommendations. In fully-automated mode, DRS automatically applies the recommendations.

When DRS wants to make a change for dynamic rebalancing of virtual machines, DRS uses VMotion.

Initial placement is a simplified form of dynamic balancing. With initial placement, DRS acts when a virtual machine is being powered on or resumed. When a virtual machine is powered on in a resource pool, DRS automatically determines its resource entitlement and picks the best host for it to reside on—unless, of course, its automation level is set to manual. In this case, it presents a prioritized list of recommendations.

Objectives for This Section

• Enable a cluster for DRS.
• Add ESX Server hosts and resource pools to a cluster group.
• Configure automation settings for DRS.
• Add ESX Server hosts to a DRS cluster (both teams).
• Automatically place and balance a new virtual machine across the DRS hosts (create a new virtual machine via template).
• Stimulate DRS to eliminate resource bottlenecks (both teams).
• Trigger DRS migrations by generating resource load (both teams).
• Use affinity and anti-affinity rules (both teams).
• Demonstrate maintenance mode.
• Add additional ESX Server hosts to increase available cluster resources to meet demand.


Enable Cluster for VMware DRS

1. Right-click on your cluster labeled PodXCluster (where X is the pod number) and choose Edit Settings.

2. Enable the cluster for VMware DRS and HA by selecting the appropriate checkboxes.

3. Select VMware HA virtual machine options. Select Allow Virtual Machines to violate availability constraints. This is important later.

4. After enabling the cluster for VMware DRS and HA, take a moment to select the cluster and look at the summary information in the right-hand pane. Note that the cluster is now enabled for both VMware HA and VMware DRS. In addition, notice that there is a DRS resource distribution chart.


Configure DRS Automation Migration Settings

1. Right-click on the desired cluster and choose Edit Settings.

2. Select VMware DRS.

3. Under Automation Level, select Fully Automated.

4. Slide the migration threshold all the way to the right so it ends up at the aggressive setting, and click OK.
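The cluster settings configured in the last two procedures map onto a single ReconfigureComputeResource_Task call in the VirtualCenter API. The pyVmomi sketch below is an illustration only (it reuses the helpers from the earlier sketch, and the cluster name is a placeholder); the GUI's migration-threshold slider corresponds to the drsConfig.vmotionRate property, which is left at its default here.

    from pyVmomi import vim

    cluster = find(content, vim.ClusterComputeResource, 'PodXCluster')   # placeholder name

    spec = vim.cluster.ConfigSpecEx(
        # Enable DRS in fully automated mode.
        drsConfig=vim.cluster.DrsConfigInfo(enabled=True,
                                            defaultVmBehavior='fullyAutomated'),
        # Enable VMware HA; disabling admission control is the API equivalent of
        # "Allow Virtual Machines to violate availability constraints".
        dasConfig=vim.cluster.DasConfigInfo(enabled=True,
                                            admissionControlEnabled=False))

    cluster.ReconfigureComputeResource_Task(spec=spec, modify=True)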


Add ESX Server Hosts to the VMware DRS Cluster

Now you need to add ESX Server hosts to the DRS cluster.

1. Team A should drag their assigned ESX Server host into the DRS cluster.

a. At the Choose Destination Resource Pool screen, keep the default of Put all of this host’s virtual machines in the cluster’s root resource pool, and click Next.

b. Select Finish.

2. Team B should repeat the steps above and drag their assigned ESX Server host into the DRS cluster.

3. After both teams have added their VMware ESX Server hosts, select the DRS Cluster and look at the summary in the right-hand pane. Notice that there are now CPU and memory resources that represent the cumulative total of the two ESX Server hosts that belong to the VMware DRS cluster.


Configure VMware DRS Affinity Rule

Now you need to configure a DRS affinity rule to keep the two virtual machines together. This is used, for instance, when two virtual machines benefit in terms of performance if they are located on the same ESX Server host. The first virtual machine is called pXsXvm1 and the second is pXsXvm2. In this example, pXsXvm1 is a virtual machine with a web server application and pXsXvm2 is a virtual machine that has a SQL database. These two machines benefit from being able to communicate with one another via the ESX Server virtual switch directly, instead of having to communicate across a physical wire.

1. Right-click on the PodXCluster and choose Edit Settings.


2. Select Rules and click Add.

3. For the name, type Keep Web and DB Together and click Add. Select the appropriate web server and database server, and click OK.


4. Click OK again to add the rule.

5. Click OK one more time to finish.


Configure VMware DRS Anti-Affinity Rule

Now you need to configure a DRS anti-affinity rule to keep two virtual machines apart from one another. This is useful when two virtual machines should not be located on the same ESX Server host. In this example, pXsXvm3 is a virtual machine with a high I/O enterprise email application, and pXsXvm4 is a virtual machine that has a high I/O database application. Given their similar workload characteristics, it is desirable to keep these two virtual machines apart for maximum performance.

1. Right-click on the PodXCluster and choose Edit Settings.

2. Select Rules and click Add.

3. For the name, type Keep Mail and DB Apart, then click Add. Select the appropriate virtual machines (pXsXvm3 and pXsXvm4) and click OK.


4. Click OK again to add the rule.

5. Click OK one more time to finish.

6. Power on all virtual machines.

a. In each assigned pod, select each ESX Server machine, then select the Virtual Machines tab. Note that while originally all virtual machines were on a single host, they are now evenly distributed across the servers.


b. Open the consoles to the virtual machines and launch the cpubusy scripts as before. Check that the rules are enforced.
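The two rules created above are represented in the API as an AffinityRuleSpec and an AntiAffinityRuleSpec inside the cluster configuration. A pyVmomi sketch (placeholder VM names, helpers reused from the earlier sketches; not part of the lab procedure):

    from pyVmomi import vim

    cluster = find(content, vim.ClusterComputeResource, 'PodXCluster')
    web, db = (find(content, vim.VirtualMachine, n) for n in ('pXsXvm1', 'pXsXvm2'))
    mail, db2 = (find(content, vim.VirtualMachine, n) for n in ('pXsXvm3', 'pXsXvm4'))

    spec = vim.cluster.ConfigSpecEx(rulesSpec=[
        # Keep the web server and its database on the same host.
        vim.cluster.RuleSpec(operation='add', info=vim.cluster.AffinityRuleSpec(
            name='Keep Web and DB Together', enabled=True, vm=[web, db])),
        # Keep the two high-I/O workloads on different hosts.
        vim.cluster.RuleSpec(operation='add', info=vim.cluster.AntiAffinityRuleSpec(
            name='Keep Mail and DB Apart', enabled=True, vm=[mail, db2])),
    ])
    cluster.ReconfigureComputeResource_Task(spec=spec, modify=True)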


Planned Downtime

If you are required to make changes to your hardware, a DRS configuration can be useful. By using maintenance mode, virtual machines are automatically evacuated to another ESX Server machine, allowing you to power down a server for scheduled maintenance without experiencing downtime.

NOTE: If there are virtual machines that cannot be migrated, you must power off those virtual machines manually before entering maintenance mode.

Before starting this lab, remove the anti-affinity rule you created earlier.

1. Right-click the PodXCluster and choose Edit Settings.

2. Select Rules and Keep Mail and DB Apart. Click Remove.

3. Click OK to finish.

If left in place, the anti-affinity rule prevents the host from entering maintenance mode, because evacuating one host would force both virtual machines onto the same remaining host.


Work with Your Lab Partners for This Lab

1. Select the lowest-numbered ESX Server machine, ensuring there are some running virtual machines on the server.

2. Right-click the server on which you are working, and choose Enter Maintenance Mode.

3. Watch the event viewer, and note the virtual machines that will be migrated to the other server in the cluster.

4. When the host has successfully entered maintenance mode, right-click the host, and choose Exit Maintenance Mode.

Because the DRS cluster is in fully automated mode, some virtual machines may be migrated back to the ESX Server host.
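Maintenance mode can also be driven through the API; with DRS in fully automated mode, entering maintenance mode is what triggers the automatic evacuation. A pyVmomi sketch (placeholder host name, helpers reused from the earlier sketches):

    from pyVim.task import WaitForTask
    from pyVmomi import vim

    host = find(content, vim.HostSystem, '10.91.49.x')   # lowest-numbered host in the pod

    # DRS (fully automated) migrates the running VMs off the host while this task runs.
    WaitForTask(host.EnterMaintenanceMode_Task(timeout=0))
    print([vm.name for vm in host.vm])                   # evacuated VMs no longer register here

    WaitForTask(host.ExitMaintenanceMode_Task(timeout=0))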


VMware HA (High Availability)

Overview of VMware HA

VMware HA is a fully integrated, scalable virtual machine availability solution that works with VMotion, DRS, and VirtualCenter.

VMware HA provides higher availability to virtual machines by automatically failing over virtual machines when their VMware ESX Server host has gone down. VMware HA allows virtual machines to be more resilient to hardware failures, but is not meant to replace application-level clustering. VMware HA helps mitigate the effects of hardware downtime within virtual infrastructure.

Application clustering provides application- and operating-system-level monitoring.

VMware HA clustering allows dissimilar hardware, is easier to configure, is often cheaper, and is available for any workload.

Objectives for This Section

• Demonstrate the VMware HA feature of VirtualCenter 2.0.
• Add VMware HA functionality to an existing cluster.
• Cause VMware HA to restart virtual machines following the simulated failure of a physical server.

Part 1: Prepare for the Lab

NOTE: This lab must be performed with another student team.

Record the student team here: _______________________________

Each ESX Server machine should have at least two virtual machines running. If necessary, use VMotion to migrate virtual machines.

Part 2: Modify Cluster to Add VMware HA Functionality

NOTE: All students in the cluster team must work together for this part.

1. Using the VI Client, log into the VirtualCenter Server as the Administrator user.


2. There is currently a DRS cluster named PodXCluster. Confirm that this cluster is enabled for VMware HA functionality. To do this:

a. Right-click the selected DRS cluster in the inventory.
b. Choose Edit Settings from the menu.
c. The Cluster Settings window appears and you see the General section. Check the Enable VMware HA box.
d. Notice that the left pane now allows you to view and change VMware HA settings.
e. Click OK.

3. Monitor the Recent Tasks pane. Notice that VMware HA is being configured on each of the ESX Server hosts in the VMware HA/DRS cluster.

Part 3: Verify VMware HA Functionality

NOTE: All students in the cluster team MUST work together for this part.

1. To verify that the VMware HA cluster works properly, right-click your highest numbered ESX Server machine, and choose Reboot.


2. As the server reboots, note the virtual machine list on the remaining ESX Server host. Look at the events of your VMware HA/DRS cluster.

a. Select PodXCluster in the inventory.
b. Click the Tasks & Events tab.
c. Click Events.
d. Click Show all entries, and choose Show Cluster Entries.
e. Notice the entries made by VMware HA when the host failure was detected.

3. Verify that the virtual machines that were running on the failed host have been moved to the other host in the cluster.
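To double-check the failover result without opening each virtual machine, a short pyVmomi loop (helpers reused from the earlier sketches) can print which host every VM in the cluster is now running on:

    from pyVmomi import vim

    cluster = find(content, vim.ClusterComputeResource, 'PodXCluster')
    for host in cluster.host:
        running = [vm.name for vm in host.vm
                   if vm.runtime.powerState == vim.VirtualMachine.PowerState.poweredOn]
        # After the HA restart, the rebooted host should report no running VMs.
        print(host.name, running)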


Virtual Infrastructure Disaster Recovery Demonstration

Overview of Virtual Infrastructure DR

Virtual Infrastructure has inherent capabilities that make it an ideal platform for disaster recovery (DR). Virtual machines are encapsulated into a set of files that allows hardware independence, thereby removing much of the hardware expense and complexity of traditional DR solutions.

VI disaster recovery provides even higher availability to virtual machines by introducing a remote virtual infrastructure to fail over virtual machines when the primary datacenter is unavailable, either due to disaster or planned maintenance.

Feature and benefit summary:

• Encapsulation: All specific virtual machine components are contained in a set of files, which enables simple replication of system state, applications, and data. (VMFS3 now puts all necessary files for a virtual machine in a single directory.)

• Hardware Independence: Allows you to re-allocate old equipment or buy new equipment. Eliminates the need for identical hardware at the primary and secondary sites. Eliminates the bare metal restore process during recovery. Eliminates system compatibility issues between the hardware and operating system at the DR site. Recovery is therefore much more reliable.

• Consolidation: Overprovision the DR site with test and batch workloads. Instantly repurpose the DR site at disaster time by shutting off test jobs and starting up recovery virtual machines. No wasted resources. No system reconfiguration time necessary.

• Snapshots and VLAN Support: Enables simple, reliable, and non-disruptive tests without extra hardware, operating-system, or application configuration.

• Virtual Machines Boot from Shared Storage: Frees system disks from the host, making them available to array- and network-based replication tools, just like data. Eliminates the need to maintain operating-system and application patches and upgrades separately at the source and destination sites.

• Resource Pooling: Frees system disks from the host, making them available to array- and network-based replication tools, just like data. Eliminates the need to maintain operating-system and application patches and upgrades separately at the source and destination sites.

Objectives for This Demonstration

• Demonstrate one method of failing over virtual machines from a local datacenter to a remote datacenter.
• Use LUN replication technologies to minimize downtime and maintenance complexity.
• Demonstrate the process for recovering virtual machines to virtual infrastructure residing at a remote datacenter.
• Demonstrate the process for failing back the virtual machines to the primary datacenter.
• Refer to Appendices B, C, D, E, and F for background and details on the demonstration procedure.

Section 1: Review Initial Setup for Demonstration

[Diagram: Virtual Infrastructure inventory for the DR demonstration, showing LUN replication between the Primary DC and the Remote DC.]


• Management – One VirtualCenter instance for each datacenter.
• Servers – Two ESX Server hosts, one for each datacenter.
• Virtual Machines – Two virtual machines with all files (vmx, vmdk, nvram, vswp, and so on) residing on the same VMFS volume.
• Replication – Use LUN replication technologies to minimize downtime and maintenance complexity.
• Storage – One LUN at each DC (mirrored to a remote location using SAN replication; refer to the diagram above).
• Network – DHCP registrations that pre-register virtual machine MAC addresses to predetermined IPs and DNS names.


Appendix A

Pod Design

Each pod is made up of two servers.

Each server is connected to shared storage.

Each server is connected to the VirtualCenter server.

Each server is running four virtual machines.

Both servers are set up to be within a cluster or resource group.


The CPU busy script is used to simulate activity in the virtual machines, to show the effect of shares on performance, and to drive DRS.


Appendix B

DR Configuration

[Diagram: DR configuration.]


Appendix C

DR References

For the disaster recovery demonstration, EMC Clariion and MirrorView technologies are used for storage and replication. However, many storage vendors support similar functionality. For more information on VI3 compatible storage arrays, refer to the links below:

• VMware Compatibility Guide for ESX Server 3.0: http://www.vmware.com/resources/techresources/457

• For additional hardware compatibility information:

http://www.vmware.com/vmtn/resources/cat/119

• For community supported hardware: http://www.vmware.com/resources/communitysupport/


Appendix D

Introduction to EMC MirrorView/A

EMC® MirrorView™ software provides a means to periodically update remote copies of production data. The software application keeps a point-in-time (MirrorView/A) copy, maintaining a periodic copy image of a logical unit (LU) at separate locations. This provides for disaster recovery; one image continues if a serious accident or natural disaster disables the other. It uses FC and IP devices to provide replication over long distances (thousands of miles).

The production image (the one mirrored) is called the primary image; the copy image is called the secondary image. MirrorView/A supports one remote image, which resides on a storage system. The primary image receives I/O from a host called the production host; the secondary image is maintained by a separate storage system that can be a standalone storage system or can be connected to its own computer system. Both storage systems are managed by the same management station, which can promote the secondary image if the primary image becomes inaccessible. After initial synchronization, the remote site always has a consistent copy of the data.

Using MirrorView/A to implement data replication offers several advantages:

• Provision for disaster recovery with minimal overhead

Provision for disaster recovery is the major benefit of MirrorView/A mirroring. Destruction of data at the primary site could cripple or ruin an organization. MirrorView/A lets data-processing operations resume with minimal overhead, and enables a quicker recovery by creating and maintaining a copy of the data on another storage system.

MirrorView/A is transparent to hosts and their applications. Host applications do not know that a LUN is mirrored and the effect on performance is minimal. MirrorView/A is not host-based; therefore it uses no host I/O or CPU resources. The processing for mirroring is performed on the storage system. MirrorView/A uses asynchronous writes, which means that secondary systems are periodically updated.

• CLARiiON MirrorView/A environment

MirrorView/A operates in a high-availability environment, leveraging the dual-SP design of CLARiiON systems. If one SP fails, MirrorView/A running on the other SP controls and maintains the mirrored LUNs. If the host is able to fail over I/O to the remaining SP, then mirroring continues as normal. The high-availability features of RAID protect against disk failure, and mirrors are resilient to an SP failure in the primary or secondary storage system.

• Bi-directional mirroring

The terms primary and secondary apply to a storage system only in the context of a particular mirror. A single storage system may be primary (that is, hold the primary image) for some mirrors and secondary (that is, hold the secondary image) for others. This enables bi-directional mirroring.



• Integration with EMC SnapView™ software

EMC SnapView allows users to create a snapshot copy of an active LUN at any point in time; however, this should be done only when the secondary image is synchronized and an update is not in process. The snapshot copy is a consistent image that can serve for other application purposes while I/O continues to the source LUN. The secondary image is not viewable to any hosts, but you can use SnapView in conjunction with MirrorView/A to create a snapshot of a secondary image on a secondary storage system to perform data verification and run parallel processes.

• Replication over long distances

MirrorView/A uses FC-to-IP devices to provide replication over long distances (hundreds to thousands of miles).

Theory of Operation for the MirrorView/A Process

The MirrorView/A option is used primarily for long-distance mirroring for disaster recovery purposes. Data on the target volumes is always one replication cycle behind the data on the source volumes. What is important is that MirrorView/A provides a consistent point-in-time view of the production data at the target site.

Figure 1 illustrates the MirrorView/A process.

Figure 1. MirrorView/A Process [figure not reproduced; it shows the source and target LUNs, the gold copy, and the delta across five numbered steps]

MirrorView/A Configuration

Figure 2 illustrates the architecture of a customer's system environment.


Figure 2. Customer Configuration

Table 1 shows the definition of MirrorView/A groups.

Table 1. Customer Configuration

LUN-Level Consistency Group: Vmfs_dr_san1

Primary LUN
    Storage System Name: CLARiiON CX3-20 SN:
    LUN ID: 4000 (LUN_000)
    Size: 19.75 GB
    Storage Group Name: ESX Server

Secondary Image
    Storage System Name: CLARiiON CX3-20 SN:
    LUN ID: 4005 (LUN_001)

Total Link Bandwidth: OC-48
Number of Mirrors: 04
Amount of Data: 1.6 TB


Appendix E

Overview for the Disaster Recovery Demonstration

The following sections contain the VI-specific procedures for performing disaster recovery for the purpose of demonstration. Note that Appendix F describes the procedures performed on the storage array and replication software.

Objectives for this Section

• Demonstrate how to use VI3 for disaster recovery in a simplified scenario.
• Show how to automate the DR procedure with the VI Perl Toolkit.
• Explain the steps required for a manual disaster recovery procedure (optional).

This demonstration consists of four parts:

1. Preconfiguring the DR environment.
2. Simulating a datacenter outage.
3. Automatic disaster recovery.
4. Graceful manual failback.

DR Demonstration Overview

This demonstration includes two datacenters named DR Primary and DR Secondary. The DR primary datacenter initially acts as the source, and DR secondary acts as the target. During the failback demonstration, the DR secondary datacenter fails back to the DR primary datacenter.

Each of the datacenters has its own VirtualCenter instance, named VC1 for the DR primary and VC2 for the DR secondary datacenter. It is important to understand that there are many other configurations and approaches to performing Virtual Infrastructure disaster recovery. To learn more about VMware products and services supporting DR, refer to the VMware Site Recovery Manager materials distributed during VMworld 07.

Each datacenter has its own ESX Server standalone host managed independently by its associated VC; that is, VC1 manages the ESX Server host in the primary datacenter, and VC2 manages the ESX Server host in the secondary datacenter. For the sake of simplicity, our demonstration uses just one ESX Server host in each of the datacenters. You can easily extend the procedure shown here to several ESX Server hosts, clusters, and resource pools.

Each datacenter has its own storage arrays. This lab uses EMC Clariion CX 300 for shared storage and EMC MirrorView for LUN replication, but a similar procedure can be followed for any of the storage arrays listed in the VMware HCL.


The VMFS LUNs that store the virtual machines running in each datacenter are being replicated (mirrored) between the storage arrays sitting in the DR primary and DR secondary datacenters.

Note the following EMC MirrorView terminology:

• Primary image: The production LUN that is mirrored.
• Secondary image: A mirror of the production LUN, contained on another Clariion.
• Mirror: Made up of one primary image and one secondary image.
• Mirror synchronization: A process by which a secondary image is made identical (block by block) to the primary image.
• Mirror promote: An operation in which the primary and secondary roles are reversed.
• Consistency group: A group of mirrors treated as one atomic unit.

There are two primary images and two secondary images as part of two mirrors: one mirror replicates the production LUN in the primary datacenter to the secondary datacenter, and the second mirror works the other way around.

In case of disaster there are two restart scenarios, from the storage point of view:

• Unplanned failover (disaster restart) – An unexpected outage forces a move to a secondary site: for example, a power outage, primary server failure, or primary storage array failure. The demonstration uses this to demonstrate a typical disaster restart.

• Planned failover (graceful failover) – As part of a maintenance window, the server or the storage array is shut down in an orderly fashion to perform hardware upgrades, network maintenance, or a disaster recovery drill. Virtual machines are shut down cleanly and restarted at the DR primary datacenter. This demonstration illustrates this scenario with a graceful failback.

Networking and access control in this lab is as follows:

• Each datacenter has its own independent IP addressing schema, using DHCP reservations to associate virtual machines' MAC addresses with IP addresses. In other words, each virtual machine has a fixed MAC address but, depending on which datacenter it is running in, potentially a different IP address and hostname.

• Each datacenter has its own Active Directory server.

• There is no public IP addressing. (To fail over public services offered from the datacenter, we would need to consider a combination of the following techniques for global IP load balancing: global NATing, round-robin DNS, VLANs, BGP and IGP, or Route Health Injection.)


Part 1: Preconfiguration for the DR Environment

1. Register the virtual machines to the primary VirtualCenter. Make sure that the ESX Server host in the primary datacenter is registered to the primary VirtualCenter. (Do this before promoting the LUNs to the secondary datacenter.)


2. Modify the ESX Server host’s advanced settings. From the host’s Configuration tab, choose Advanced Settings. Set a zero (0) value for the LVM.DisallowSnapshotLun and LVM.EnableResignature. Setting these parameters to zero ensures that the VMFS label information is preserved on the replica, which is critical in order to bring up the target virtual machines as soon as possible in case of disaster.


3. Rescan the LUNs.

Click the Configuration tab, and select Storage Adapters from the Hardware section on the left-hand side. At the top of the new pane that appears, click Rescan.


4. Register the ESX Server hosts to the remote DC and VirtualCenter. Register the secondary ESX Server hosts in the DR secondary datacenter and VirtualCenter (this has already been done in the demonstration). Make sure that the ESX Server hosts registered to each VirtualCenter are in different datacenters (their objects within VirtualCenter cannot be in the same namespace).

5. Modify the advanced settings on the secondary ESX Server host. Repeat step 2 on the ESX Server host in the remote datacenter: click the host's Configuration tab and choose Advanced Settings, then set a zero (0) value for both the LVM.DisallowSnapshotLun and LVM.EnableResignature settings. Setting these parameters to zero ensures that the VMFS label information is preserved on the replica, which is critical in order to bring up the target virtual machines as soon as possible in case of disaster.


6. Rescan the LUNs.

Repeat step 3 for the remote datacenter. Click the Configuration tab, select Storage Adapters from the Hardware section on the left-hand side. At the top of the new pane that appears, click Rescan.


7. Verify that the LUN snapshots are seen by the remote host. You should be able to see the VMFS volumes with labels prefixed by snap. In particular, you should see the source datastore with a snap- prefix on its label.

The LUN has been detected by the physical host and is now shown with a snap-prefix.
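You can also confirm this from the service console: datastore labels appear as symbolic links under /vmfs/volumes, so a quick check is

ls -l /vmfs/volumes/ | grep snap

which lists any datastore whose label carries the snap prefix described in step 7.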


8. Verify that the datastore data has been replicated.

Part 2: Simulating a Disastrous Event

1. Save the current data and verify the contents of the datastores. A picture of the class is uploaded to DR_WinVM1 to show current saved data. Verify that the picture is saved on DR_WinVM1.

2. Power off the ESX Server host in the primary datacenter.

3. Promote the secondary mirror.

From EMC Navisphere, the LUN has been promoted from the primary datacenter to the secondary datacenter. In this scenario, the primary servers and primary storage arrays are down, so in the case of Clariion MirrorView we need to perform a forced promotion.


Part 3: Disaster Recovery

1. Show the initial state of the disaster recovery secondary datacenter (the mirrored LUNs are not yet shown).

2. Before starting the virtual machines, double-check that the viperltoolkit dr.pl Perl script is installed on the servers where VirtualCenter has been installed. Explain what the script is going to do.

3. Initiate an ESX Server SAN rescan from the secondary server.

4. Run dr.pl to start the virtual machines. This Perl script, when run in the secondary datacenter, automatically:

• Registers the virtual machines in the secondary VirtualCenter.
• Powers on the virtual machines.

To run the dr.pl Perl script, open a session and execute the following command in the command window:

C:\Perl\viperltoolkit\dr\dr.pl --service_url https://vcserver/sdk --username <username> --password <password>

The script goes through each of the ESX Server hosts and registers all the virtual machines that it finds on the SAN datastores associated with each ESX Server host. The script assumes that you provide:

• A sensible naming convention – The virtual machine names should follow a convention like <VMFS_NAME>xy, where xy is a number. The match that you supply should be san_vmfs_convention ‘VMFS_NAME’.

• A stagger factor – Used to control the level of parallelism of certain tasks (currently only the PowerOnVM_Task).

This applies a best practice for DR but also takes out a lot of the manual tasks one would expect to do when in DR mode, including registering virtual machines from datastores, powering them on, and answering the UUID question.

To stagger, power on the virtual machines in batches of four, sleep for 15 seconds, answer the UUID question for the first batch, and then power on a second batch of four.

In this demonstration, we simply power them on sequentially, wait 10 seconds, answer, and power on the next virtual machine in the list.
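For reference, a minimal service console sketch of the same staggering idea (not the dr.pl script itself) is shown below; the datastore pattern is a placeholder, and the UUID question still has to be answered for each virtual machine, for example in the VI Client:

# Register and power on every VM found on the replicated datastores,
# pausing after each batch of four (placeholder datastore pattern).
BATCH=0
for VMX in /vmfs/volumes/snap-*/*/*.vmx; do
    vmware-cmd -s register "$VMX"    # add the VM to this host's inventory
    vmware-cmd "$VMX" start          # power it on
    BATCH=$((BATCH+1))
    if [ $((BATCH % 4)) -eq 0 ]; then
        sleep 15                     # stagger the load between batches
    fi
done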

Part 4: Graceful Failback

In a real-life scenario and in case of an outage, the primary Clariion might be unreachable and the mirror inconsistent; in other words, the mirror relationship is invalid. You need to follow these steps:

1. Recreate the mirrors, this time mirroring from secondary back to primary.


2. Wait for reverse synchronization. The primary image needs to be brought back into sync.

3. Revert the virtual machines back to the primary site. This can be done at the most convenient time for the organization and can follow the same procedures as a graceful failback.

This demonstration simulates a full and manual graceful failback.

1. Shut down the virtual machines in the secondary datacenter. This leaves the primary and secondary mirrors in a consistent state.

2. Start the DR primary ESX Server host that was off.

3. Promote the secondary mirrors. Note that the mirrors simply switch roles in this case.

4. Initiate the ESX Server SAN rescan on the primary server.

If you choose to fail the replica LUNs back to their original location, you need to rescan the appropriate ESX Server host. Remember to change the values of LVM.EnableResignature and LVM.DisallowSnapshotLun to the appropriate values.

5. Locate the original LUNs.

From the Storage section of the ESX Server host’s Configuration page, locate the original LUNs that have just failed back. Select Browse Datastore, as shown below.
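As an alternative to browsing in the VI Client, the configuration files can be located from the service console (the datastore label is a placeholder):

find /vmfs/volumes/<vmfslabel> -name '*.vmx'

This lists every virtual machine configuration file on the failed-back datastore, which is also handy input for the vmware-cmd registration command shown in a later note.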


6. Register the virtual machines on the primary ESX Server hosts by browsing the directory containing the configuration file (the file with the .vmx extension) and adding the virtual machine to the inventory. From VirtualCenter, click the Configuration tab. Select Storage (SCSI, SAN & NFS) from the left-hand pane. Double-click Datastores to open the datastore browser.

7. Navigate to the .vmx file of the virtual machines. Right-click the folder and select Add to Inventory.

Re-enabling the virtual machine by returning it to inventory


NOTE: You could also use the following command in a terminal window: vmware-cmd -s register /vmfs/volumes/<vmfslabel>/<VMname>/<VMname>.vmx

8. Select Keep to retain its original Universally Unique Identifier (UUID), as shown in the figure below.


Retaining the original UUID for the virtual machine

9. Power on the virtual machines. Verify that they have IP connectivity, that the application is running, and that the pictures of the lab participants can be retrieved.
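A quick power-state check can also be made from the service console before testing the application itself (the path is a placeholder, as in the earlier note):

vmware-cmd /vmfs/volumes/<vmfslabel>/<VMname>/<VMname>.vmx getstate

The command should report that the virtual machine is on; IP connectivity and the uploaded pictures can then be verified from the guest.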


Appendix F: DR Setup Notes

In this lab, attendees have read-only access to datastores.

The setup should include:

• One primary VirtualCenter Server, and at least two hosts and two primary LUNs.
• One secondary site with a secondary VirtualCenter, with two secondary LUNs of the same size as the primary LUNs. LUNs in the primary datacenter should be zoned for primary host HBAs. LUNs in the secondary datacenter should be zoned for secondary host HBAs.
• Navisphere and MirrorView. Several virtual machines should be created on two primary LUNs, running and set for DHCP (a SQL server or some other application).
• DHCP server configured on the VirtualCenter Server at each site, and virtual machines pre-registered on the DHCP server to get pre-ordained IP addresses (with a different IP on each).

Create a Remote Mirror for Each LUN to Be Recovered

1. From the Storage tab of the Enterprise Storage dialog box, navigate to the icon for the storage system for which you want to create a remote mirror, and select Create Mirror from the MirrorView menu option. In the Create Mirror dialog box, you can change the following mirror parameters:

• Mirror Type — Specifies the mirror type as synchronous or asynchronous. If only one type of mirror is available to create, this field displays only the available type.
• Name — Once you create an asynchronous mirror, you cannot change its name.
• Description
• Minimum Required Images — Specifies the number of secondary images required for the mirror to remain in the active state. If the number of secondary images is not met, the mirror goes into the attention state (the mirror continues to operate). This state indicates that one of the secondary images has an issue that requires attention.
• Primary Storage System LUN to be Mirrored — You can select the eligible LUN on the selected primary storage system to mirror.

2. Save the changes.

Add a Secondary Image

1. From the Storage tab of the Enterprise Storage dialog box, right-click the icon for a remote mirror, and select Add Secondary Image.

2. In the dialog box, you can do the following:

• Select the secondary storage system on which the secondary mirror image is to reside.
• Select the LUN on the specified secondary storage system that comprises the secondary mirror image. The application’s choice is checked in the Select Secondary Mirror Image LUN list. To choose the LUN yourself, clear Auto Select. In the Select Secondary Mirror Image LUN list, select the LUN to comprise the secondary mirror image.

You can also view or modify the following advanced parameters:

• Initial Sync Required — Lets you perform a full synchronization on the newly added secondary mirror image by accepting this option. If there is no pre-existing data (including disk formatting and signature data) on the primary image, it is not necessary to synchronize the secondary image when it is added. Use this no-synchronization option only if you are sure that both LUNs are identical.

• Recovery Policy — Specifies the policy for recovering the secondary mirror image after a failure.

• Automatic — Specifies that recovery automatically resumes as soon as the primary image determines that the secondary mirror image is once again accessible.

• Manual — Specifies that the administrator must explicitly start a synchronization operation to recover the secondary mirror image.

• Synchronization Rate — Specifies a relative value (low, medium, or high) for the synchronization write delay parameter for the secondary mirror image.

• Update Type — Specifies how frequently you want the update to occur:
• Manual Update indicates that you must explicitly update the image.
• Start of Last Update indicates the time period (in minutes) from the beginning of the previous update to schedule the next update. The update must complete before the next one can start.
• End of Last Update indicates the time period (in minutes) from the end of the previous update to schedule the next update.

3. Click OK to add the secondary image and close the dialog box. The application places an icon for the secondary image under the remote mirror image icon in the storage tree. After adding a secondary image, an initial synchronization automatically starts. The entire LUN is copied in the initial synchronization to ensure that you have a complete copy of the mirror.

Synchronize a Secondary Image

From the Storage tab of the Enterprise Storage dialog box, navigate to a secondary mirror image, and select Synchronize.

The application starts the synchronization process and reports whether or not the operation was successful.

Create an Array-level Consistency Group

1. From the Storage tab of the Enterprise Storage dialog box, navigate to the icon for the storage system for which you want to create an array-level consistency group, and select Create Group from the Mirror menu option.

2. In the Create Group dialog box, you can change the following group parameters:


• Name — Specifies a valid name for the group.
• Description — Specifies a more detailed description of the group being created.
• Available Remote Mirrors — Lists available asynchronous mirrors that have one secondary image and that are not currently part of any other array-level consistency group. The Mirrors column lists the name of each mirror. The Remote Subsystem column lists the name of the remote storage system to which this mirror is connected.
• Selected Remote Mirrors — Lists the mirrors that you selected from the Available Remote Mirrors list and moved to the Selected Remote Mirrors list using the arrow button.

3. You can also view or modify the following advanced parameters:

• Recovery Policy — Specifies the policy for recovering the secondary mirror image after a failure.
• Automatic indicates that recovery resumes automatically as soon as the primary image determines that the secondary mirror image is once again accessible.
• Manual indicates that the administrator must explicitly start a synchronization operation to recover the secondary mirror image.
• Update Type — Specifies how frequently you want the update to occur.
• Manual Update indicates that you must explicitly update the image.
• Start of Last Update indicates the time period (in minutes) from the beginning of the previous update to schedule the next update.
• End of Last Update indicates the time period (in minutes) from the end of the previous update to schedule the next update.

When a group begins an update, each mirror within the group begins updating and ends whenever its individual update finishes. The group update completes when the last mirror completes.

4. Apply the changes.

Promote the Secondary Image

If the primary image and secondary image can communicate with each other, when the secondary image is promoted, the former primary image is demoted to a secondary image. To promote a secondary image, the following conditions must be true:

• The storage system hosting the secondary mirror image must currently be managed by Navisphere Manager.

• The state of the secondary image to be promoted must be either consistent or synchronized.

CAUTION! Promoting a secondary image causes a loss of data written to the primary image after the start of the last completed update. If any updates have been made to the primary image since that time, a full resynchronization of the mirror is required after the promotion.

If an update is currently active (that is, transferring data), the promote operation is not allowed; allow the update to complete and the image to transition into the synchronized state, then perform the promote operation. Once a primary image is demoted (from the promotion of the secondary image), the former primary image is no longer accessible by the primary host.

Before promoting a secondary image to a primary image in a failure situation, perform the following steps:

1. Verify that the storage system that failed is not the master of the domain. If it is, assign another storage system to be the master.

2. If the existing primary image is accessible, remove the primary image from any storage groups before promoting the secondary image to avoid I/O and therefore inconsistent data.

3. Ensure that no I/O is occurring in the asynchronous mirror, either generated from a host or by an update in progress.

4. Promote a secondary image to a primary image. From the Storage tab of the Enterprise Storage dialog box, navigate to a secondary mirror image, and select Promote. If a secondary image is being updated and you attempt to promote it, the promote operation fails; therefore, you must admin fracture the secondary image or wait for the update to complete before performing the promote operation.

5. If the secondary image is not in sync (and the primary array is not accessible), then continuing with the promote operation results in a mirror that requires a full resynchronization. A dialog box appears and asks you how to continue. The options include:

• Local Only Promote — Promotes the secondary image without adding the old primary as a secondary image of the promoted mirror. This does not affect access to the existing primary image and creates a second mirror, with the former secondary image becoming the primary image of the new mirror.
• Force Promote — Performs the promote operation even if the new secondary image requires a full resynchronization or cannot be contacted. If the existing old primary image can be contacted, the promote operation converts it to an image of the new mirror. It does not check for connectivity or whether or not the promoted mirror will be in the synchronized state.
• Cancel — Cancels the promote operation.

6. Confirm whether or not you want to continue the promote operation. The current primary image, if accessible, is demoted, so that it is now a secondary mirror image for the remote mirror.

7. Add the newly promoted image to a storage group (on the newly promoted array).


Notes
