White Paper

Abstract

This white paper provides a detailed description of the technical aspects and benefits of deploying VMware vSphere™ versions 4 and 5 on Symmetrix VMAX™ family storage arrays using Virtual Provisioning™. This paper also includes an overview of the latest Symmetrix® Virtual Provisioning features and best practice recommendations for deploying virtual machines onto Symmetrix thin devices.

November 2012

IMPLEMENTING EMC SYMMETRIX VIRTUAL PROVISIONING WITH VMWARE VSPHERE


Copyright © 2012 EMC Corporation. All Rights Reserved. EMC believes the information in this publication is accurate as of its publication date. The information is subject to change without notice. The information in this publication is provided “as is.” EMC Corporation makes no representations or warranties of any kind with respect to the information in this publication, and specifically disclaims implied warranties of merchantability or fitness for a particular purpose. Use, copying, and distribution of any EMC software described in this publication requires an applicable software license. For the most up-to-date listing of EMC product names, see EMC Corporation Trademarks on EMC.com. VMware, ESXi, vMotion, VMware vCenter, and VMware vSphere are registered trademarks or trademarks of VMware, Inc. in the United States and/or other jurisdictions. All other trademarks used herein are the property of their respective owners. Part Number h6813.5


Table of Contents

Executive summary
    Terminology
EMC Symmetrix Virtual Provisioning
    Overview
    Thin pool write balancing
    Virtual Provisioning space reclamation
    WRITE SAME support
    UNMAP support
    FAST VP
    Virtual LUN VP Mobility
        Rebinding thin devices
    Persistent pre-allocation
5876 Virtual Provisioning enhancements
    Thin devices as gatekeepers
    VLUN Migration Enhancements
    FAST VP Enhancements
    TimeFinder/VP Snap
Thin device compression
    Migration and thin device compression
    FAST VP and thin device compression
Thin device sizing
Guest OS allocation considerations
Considerations for VMware vSphere on thin devices
    Device visibility and access
    Virtual Machine File System on thin devices
    VMFS block allocation and reuse algorithms
        All vSphere versions and Enginuity 5876.82 and earlier
        vSphere 5.0 U1 with Enginuity 5876.159.102 and later
        Reclaiming space on VMFS volumes without Dead Space Reclamation support
    Impact of guest operating system activities on thin pool utilization
    Virtual disk allocation mechanisms
        “Zeroedthick” allocation format
        “Thin” allocation format
        “Eagerzeroedthick” allocation format
        Virtual Disk Type Considerations
    Migrating virtual machines
        Changing virtual disk formats using Storage vMotion
    Creating new virtual disks
    Virtual machine templates
    VMware DRS and Virtual Provisioning
Nondisruptive expansion of VMFS datastores on thin devices
    VMFS Volume Grow
Thin pool space reclamation in VMware environments
    Virtual Provisioning space reclamation in VMware vSphere environments
    Considerations for space reclamation in VMware environments
Remote migration to thin devices
Local and remote replication between thin devices
Local replication between thin and traditional, non-thin devices
Thin pool management
    Capacity monitoring
    Viewing thin pool details with the Virtual Storage Integrator
    Pro-active VMFS-capacity exhaustion avoidance with Storage DRS
    Exhaustion of oversubscribed pools
        Behavior of ESX(i) without Out of Space error support
        Behavior of ESX(i) with Out of Space error support
Conclusion
References


Executive summary

One of the biggest challenges facing storage administrators is provisioning storage for new applications. Administrators typically allocate space based on the anticipated future growth of applications. This is done to mitigate recurring operational tasks, such as incrementally increasing storage allocations or adding discrete blocks of storage as existing space is consumed. This approach results in more physical storage being allocated to the application than is needed for a significant amount of time, and at a higher cost than necessary. The overprovisioning of physical storage also leads to increased power, cooling, and floor space requirements. Even with the most careful planning, it may be necessary to provision additional storage in the future, which could potentially require an application outage.

A second layer of storage overprovisioning happens when server and application administrators overallocate storage for their environment. The operating system sees the space as completely allocated, but internally only a fraction of the allocated space is actually used.

EMC® Symmetrix Virtual Provisioning™ (referred to as Virtual Provisioning throughout the rest of this document) can address both of these issues. Virtual Provisioning allows more storage to be presented to an application than is physically available. More importantly, Virtual Provisioning allocates physical storage only when the storage is actually written to, or when purposely pre-allocated by an administrator. This allows more flexibility in predicting future growth, reduces the initial costs of provisioning storage to an application, and can obviate the inherent waste in over-allocation of space and the administrative management of subsequent storage allocations.

EMC Symmetrix VMAX storage arrays continue to provide increased storage utilization and optimization, enhanced capabilities, and greater interoperability and security. The implementation of Virtual Provisioning, generally known in the industry as “thin provisioning,” for these arrays enables organizations to improve ease of use, enhance performance, and increase capacity utilization for certain applications and workloads. It is important to note that the recently introduced Symmetrix VMAXe/10k storage array is designed and configured for a 100 percent virtually provisioned environment.

VMware’s vSphere virtualization suite also offers vStorage thin provisioning. While Symmetrix® Virtual Provisioning occurs at the array level, VMware thin provisioning occurs at the virtual disk level. The host-level thin provisioning feature gives VMware vSphere™ administrators the ability to allocate virtual disk files as “thick” or “thin.” A thin virtual disk lets a virtual machine on a VMware® ESXi® host provision the entire space required for the disk’s current and future activities, but at first commits only as much storage space as the disk needs for its initial operation.


This white paper will discuss how to best leverage Symmetrix Virtual Provisioning in a VMware vSphere environment. The integration and intelligence between vSphere and Virtual Provisioning has been greatly enhanced in vSphere 5 and those improvements will be explained. This paper addresses the considerations and best practices for deploying VMware vSphere versions 4 and 5 environments on virtually provisioned devices.

An understanding of the principles presented here will allow the reader to deploy and utilize VMware vSphere environments with Virtual Provisioning in the most effective manner.

IMPORTANT: This document assumes an Enginuity level of at least 5875.231 and therefore older revisions of Enginuity may not have all of the features/functionality discussed. Please consult Enginuity release documentation for details. Enhancements included in 5876 and later will be called out and noted when appropriate. Recommendations and best practices typically still apply for previous releases. When a recommendation is different for an older revision of Enginuity this difference will be pointed out.

Terminology

Table 1. Basic Symmetrix array terms

Device: A logical unit of storage defined within a Symmetrix array. In this paper this style of device is also referred to as a traditional device or a non-thin device.

Device Capacity: The actual storage capacity of a device.

Device Extent: A quantum of logically contiguous blocks of storage.

Host Accessible Device: A device that is exported for host use.

Internal Device: A device used for internal functions of the array.

Metavolume: An aggregation of host-accessible devices, presented to the host as a single device.

Pool: A collection of internal devices used for some specific purpose.

Table 2. Virtual Provisioning terms

Thin Device: A host-accessible device that has no storage directly associated with it.

Data Device: An internal device that provides storage capacity to be used by thin devices.

Thin Device Extent: The minimum quantum of storage that must be mapped at a time to a thin device.

Thin Pool: A collection of data devices that provides storage capacity for thin devices. Note that a thin pool can exist without any storage associated with it.

Thin Pool Capacity: The sum of the capacities of the member data devices.

Bind: The process by which one or more thin devices are associated with a thin pool.

Unbind: The process by which a thin device is disassociated from a given thin pool. When unbound, all previously allocated extents are freed and returned to the thin pool.

Rebind: The process by which a currently bound thin device is associated with a different thin pool. When rebound, all previously allocated extents remain on the original thin pool, but all new extents are allocated from the new pool.

Enabled Data Device: A data device belonging to a thin pool from which capacity is allocated for thin devices. This state is under user control.

Disabled Data Device: A data device belonging to a thin pool from which capacity cannot be allocated for thin devices. This state is under user control.

Activated Data Device: Reads and writes can be done from allocated and unallocated space on activated data devices.

Deactivated Data Device: Reads and writes can be done from already allocated space on deactivated data devices. No new allocations can be done on deactivated data devices.

Data Device Drain: The process of moving allocated extents off of one data device and onto another enabled data device in the same thin pool.

Thin Pool Enabled Capacity: The sum of the capacities of enabled data devices belonging to a thin pool.

Thin Pool Allocated Capacity: A subset of thin pool enabled capacity that has been allocated to thin devices bound to that thin pool.

Thin Device Written Capacity: The capacity on a thin device that was written to by a host. In most implementations this will be a large percentage of the thin device allocated capacity.

Thin Device Subscribed Capacity: The total amount of space that the thin device would use from the pool if the thin device were completely allocated.

Oversubscribed Thin Pool: A thin pool whose thin pool enabled capacity is less than the sum of the reported sizes of the thin devices using the pool.

Thin Device Allocation Limit: The capacity limit that a thin device is entitled to withdraw from a thin pool, which may be equal to or less than the thin device subscribed capacity.

Thin Device Subscription Ratio: The ratio between the thin device subscribed capacity and the thin pool enabled capacity, expressed as a percentage.

Thin Device Allocation Ratio: The ratio between the thin device allocated capacity and the thin device subscribed capacity, expressed as a percentage.

Thin Device Utilization Ratio: The ratio between the thin device written capacity and the thin device allocated capacity, expressed as a percentage.

Thin Pool Subscription Ratio: The ratio between the sum of the thin device subscribed capacities of all bound thin devices and the associated thin pool enabled capacity, expressed as a percentage.

Thin Pool Allocation Ratio: The ratio between the thin pool allocated capacity and the thin pool enabled capacity, expressed as a percentage.

Pre-allocating: Allocating a range of extents from the thin pool to the thin device at the time the thin device is bound. Sometimes used to reduce the operational impact of allocating extents or to eliminate the potential for a thin device to run out of available extents.


EMC Symmetrix Virtual Provisioning

Virtual Provisioning is the ability to present an application with more capacity than is physically allocated. The physical storage is then allocated to the application “on demand” from a shared pool of capacity as it is needed. Virtual Provisioning increases capacity utilization, improves performance, and simplifies storage management when increasing capacity.

Figure 1. Symmetrix Virtual Provisioning

Overview

Symmetrix thin devices are logical devices that can be used in many of the same ways that Symmetrix devices have traditionally been used. Unlike standard Symmetrix devices, thin devices do not need to have physical storage completely allocated at the time the device is created and presented to a host. A thin device is not usable until it has been bound to a shared storage pool known as a thin pool. The thin pool is comprised of special devices known as data devices that provide the actual physical storage to support the thin device allocations.
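For illustration, creating a thin device and binding it to a thin pool can be done with Solutions Enabler symconfigure operations along the following lines. This is a sketch only: the Symmetrix serial number (1234), device number (0ABC), size, and pool name (VP_Pool) are placeholders, and the exact syntax varies by Solutions Enabler release.

# Create a thin device (TDEV) -- the count and size are placeholder values
symconfigure -sid 1234 -cmd "create dev count=1, size=4602 cyl, emulation=FBA, config=TDEV;" commit

# Bind the new thin device to the thin pool so it becomes usable
symconfigure -sid 1234 -cmd "bind tdev 0ABC to pool VP_Pool;" commit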

When a write is performed to part of any thin device for which physical storage has not yet been allocated, the Symmetrix allocates physical storage from the thin pool for that portion of the thin device only. The Symmetrix operating environment, Enginuity, satisfies the requirement by providing a unit of physical storage from the thin pool called a thin device extent. This approach reduces the amount of storage that is actually consumed.

The thin device extent is the minimum amount of physical storage that can be reserved at a time for the dedicated use of a thin device. An entire thin device extent is physically allocated to the thin device at the time the thin storage allocation is made as a result of a host write operation. A round-robin mechanism is used to balance the allocation of data device extents across all of the data devices in the thin pool that are enabled and that have remaining unused capacity. The thin device extent size is twelve 64 KB tracks (768 KB). Note that the initial bind of a thin device to a pool causes one thin device extent to be allocated per thin device. If the thin device is a metavolume, then one thin device extent is allocated per metavolume member, so a four-member thin metavolume would cause four extents (3,072 KB) to be allocated when the device is initially bound to a thin pool.

When a read is performed on a thin device, the data being read is retrieved from the appropriate data device in the thin pool. If a read is performed against an unallocated portion of the thin device, zeros are returned.

When more physical data storage is required to service existing or future thin devices, such as when a thin pool is running out of physical space, data devices can be added to existing thin pools online. Users can rebalance extents over the thin pool when any new data devices are added. This feature is discussed in detail in the next section. New thin devices can also be created and bound to existing thin pools.

When data devices are added to a thin pool they can be in an "enabled" or "disabled" state. In order for the data device to be used for thin extent allocation, however, it needs to be "enabled". Conversely, for it to be removed from the thin pool, it needs to be in a "disabled" state. Active data devices can be disabled, which will cause any allocated extents to be drained and reallocated to the other enabled devices in the pool. They can then be removed from the pool when the drain operation has completed.
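As a hedged sketch of these operations (the device ranges, pool name, and serial number are placeholders; verify the syntax against your Solutions Enabler release):

# Add sixteen data devices to the pool in the enabled state
symconfigure -sid 1234 -cmd "add dev 0C00:0C0F to pool VP_Pool, type=thin, member_state=ENABLE;" commit

# Disable one data device; its allocated extents drain to the remaining enabled members
symconfigure -sid 1234 -cmd "disable dev 0C00 in pool VP_Pool, type=thin;" commit

Once the drain triggered by the second command completes, the data device can be removed from the pool.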

Figure 2 depicts the relationships between thin devices and their associated thin pools. There are nine devices associated with thin Pool A and three thin devices associated with thin Pool B.

Figure 2. Thin devices and thin pools containing data devices

The way thin extents are allocated across the data devices results in a form of striping in the thin pool. The more data devices that are in the thin pool, the wider the striping will be, and therefore the greater the number of devices available to service I/O.

Thin pool write balancing

Whenever writes occur on a thin device, the actual data gets written to the data devices within the thin pool to which the thin device is bound. Enginuity intelligently spreads out the data equally on all of the data devices within the thin pool, so at any given moment the used capacity of each data device is expected to be uniform across the pool.

Prior to the availability of automated thin pool write balancing, the pool became unbalanced as soon as new data devices were added to it. If data devices were added in groups smaller in number than the devices previously in the thin pool, performance decreased: as soon as the original data devices became full, any new data written to the pool was striped over fewer data devices. The recommendation was therefore to add a large number of data devices, typically equal to or greater than the number of devices used when the pool was originally created. This maintained expected performance levels, but led to an increase in assigned but unused storage, since in most situations the thin pool did not need to be expanded by a factor of two or more to fulfill the increased storage requirements.


Automated thin pool write balancing alleviates both of these problems. It is now possible to add any number of data devices and still have a pool that is balanced for capacity without sacrificing performance.

Thin pool write balancing works by normalizing the used capacity levels of data devices within a thin pool. A background optimization process scans the used capacity levels of the data devices within a pool and performs movements of groups from the most utilized devices to the least. The write balancing threshold is initially set to 10% but can be altered to between 1% and 50%. The write balancing threshold means the following: first, that there can be up to a 10 percent difference between the most used and least used data devices after a rebalance operation completes; and second, that if the thin pool is within this range from the initiation of the balancing command, the balancing operation will not start.

The basic steps taken by thin pool write balancing are as follows:

• Initialization – the target resources are locked [1] so that no other process can try to optimize them

• Scan – the pool is scanned for utilization statistics

• Move – the group movements are initiated between devices

• Wait – the task monitors the background data movements until the next period begins

The balancing procedure starts by calculating the minimum, maximum, and mean used-capacity values of the data devices. Enginuity then automatically selects a source/target data device pair from the thin pool to bring the source data device’s utilization down and the target data device’s utilization up toward the mean. Once the data movements have finished, the process restarts by scanning the pool again. Data devices that are currently a source or a target in the pool balance operation are shown with a state of “Balancing.” This state can be verified by viewing the data devices in the target thin pool in either Unisphere for VMAX or Solutions Enabler. While a data device is in this state, it cannot be disabled or drained. The “Balancing” state is transient, as Enginuity periodically picks different devices to accomplish the pool balancing operation.

While write balancing is in progress, data can be moved from “deactivated” data devices, but data will never be moved to “deactivated” data devices. Solutions Enabler reports the state “Balancing” in the pool state; once the task has completed, the pool state reverts to the default state of “Enabled.” The maximum number of concurrent devices that can participate in a pool rebalance is configurable and can be set anywhere from two devices to the entire pool.
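A rebalance is initiated through symconfigure, and the pool can then be watched for the “Balancing” state. The pool name and serial number below are placeholders, and the syntax should be verified against your Solutions Enabler release:

# Start a capacity rebalance across the data devices in the pool
symconfigure -sid 1234 -cmd "start balancing on pool VP_Pool;" commit

# Monitor pool and data device state (look for "Balancing")
symcfg show -pool VP_Pool -thin -detail -sid 1234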

For more detailed information on how to perform thin pool write balancing, please refer to the technical note Best Practices for Fast, Simple Capacity Allocation with EMC Symmetrix Virtual Provisioning.

[1] Only the back-end data devices are locked. I/O to thin devices and configuration changes to thin devices are not affected or prevented.


Virtual Provisioning space reclamation

Virtual Provisioning space reclamation provides the ability to free, or “de-allocate”, storage extents found to contain all zeros. This feature is an extension of the existing Virtual Provisioning space de-allocation mechanism. Previous versions of Enginuity and Solutions Enabler allowed for reclaiming allocated, reserved but unused, thin device space from a thin pool. Administrators now have the ability to reclaim both allocated/unwritten extents as well as extents entirely filled with host-written zeros. The space reclamation process is nondisruptive and can be executed with the targeted thin device ready and read/write to operating systems and applications.

Starting the space reclamation process spawns a back-end disk director (DA) task that examines the allocated thin device extents on specified thin devices. As discussed earlier, a thin device extent is 768 KB (or 12 tracks) in size and is the default unit of storage at which allocations occur. For each allocated extent, all 12 tracks will be brought into Symmetrix cache and examined to see if they contain all zero data. If the entire extent contains all zero data, the extent will be de-allocated and added back into the pool, making it available for a new extent allocation operation. An extent that contains any non-zero data is not reclaimed.
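As a hedged example, a reclaim can be started with a symconfigure free operation similar to the following sketch; the device number, cylinder range, and serial number are placeholders, and the exact keywords may vary by Solutions Enabler release:

# Scan the device's allocated extents and free those that are unwritten or all-zero
symconfigure -sid 1234 -cmd "free tdev 0ABC start_cyl=0 end_cyl=last_cyl type=reclaim;" commit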

WRITE SAME support

Enginuity 5875 and later provide support for the WRITE SAME command (0x93), which instructs a Symmetrix VMAX to write the same block of data to a specified number of sequential logical blocks in a single request. In the absence of support for this command, a host needs to issue multiple write requests to the device. In the case of a Symmetrix thin device, the WRITE SAME SCSI command allows an additional bit to be set, called the UNMAP bit. This bit allows the tracks identified by the WRITE SAME operation to be de-allocated and returned to the pool. If the WRITE SAME SCSI command (with the UNMAP bit set) covers only a subset of the 12 tracks in an extent, those tracks are marked “Never Written by Host” (NWBH). The extent is therefore not returned to the pool; however, those tracks do not have to be read from disk, copied for TimeFinder/Snap, TimeFinder/Clone, or rebuilds, or impact SRDF by consuming bandwidth. Supported operating systems such as VMware ESXi (versions 4.1 and later) use this command to zero out (write contiguous zeros to) a Symmetrix LUN. It is important to note that VMware ESXi only makes use of WRITE SAME without the UNMAP bit set, and therefore does not de-allocate blocks when issuing it.
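For example, creating an eagerzeroedthick virtual disk from the ESXi shell zeroes the disk in its entirety; on ESXi 4.1 and later with hardware acceleration enabled, that zeroing is offloaded to the array as WRITE SAME rather than issued as a stream of host writes. The datastore and path below are placeholders:

# Create a 20 GB virtual disk that is fully zeroed at creation time
vmkfstools -c 20G -d eagerzeroedthick /vmfs/volumes/Datastore1/VM1/VM1.vmdk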

UNMAP support

Enginuity 5875 and later provide support for the UNMAP SCSI command (0x42), which advises a target device that a range of blocks is no longer needed. If the range covers a full Virtual Provisioning extent, Enginuity can return that extent to the pool. If, on the other hand, the range covers a subset of the 12 tracks in an extent, those tracks are marked NWBH. In this case the extent is not returned to the pool; however, those tracks do not have to be read from disk, copied for TimeFinder/Snap, TimeFinder/Clone, or rebuilds, or impact SRDF by consuming bandwidth.

IMPORTANT: Even though Enginuity 5875 supports the use of the SCSI UNMAP command, UNMAP with vSphere is NOT supported unless the Symmetrix array is running Enginuity 5876.159.102 or later.
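On a supported combination (vSphere 5.0 U1 with Enginuity 5876.159.102 or later), dead space on a VMFS volume can be reclaimed manually from the ESXi shell with vmkfstools. The datastore name below is a placeholder; the numeric argument is the percentage of free space the operation may temporarily consume with its balloon file:

# Run from the root of the datastore to issue UNMAP for up to 60% of free space
cd /vmfs/volumes/Datastore1
vmkfstools -y 60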

FAST VP

FAST VP maximizes the benefits of in-the-box tiered storage by placing the right thin data extents on the right tier at the right time, balancing cost against performance requirements. FAST VP allows a storage administrator to decide how much SATA/FC/EFD capacity is given to a particular application; the system then automatically places the busiest thin data extents on the performance tier and the least busy thin data extents on the capacity tier. The administrator’s input criteria are assembled into FAST policies. The FAST VP system uses this policy information to perform extent data movement operations within two or three disk tiers in the Symmetrix VMAX and VMAXe arrays. Because the unit of analysis and movement is the thin extent, this sub-LUN optimization is extremely powerful and efficient. FAST VP, made available in Enginuity 5875, is an evolution of the existing FAST and EMC Optimizer technology.

There are two components of FAST VP – the FAST controller and the Enginuity microcode. The microcode is responsible for the collection of performance statistics, at both the LUN and sub-LUN level. The FAST controller is responsible for analyzing performance data collected by the microcode.

The resulting data analysis generates a data movement policy that contains promotion and demotion thresholds for each tier included in a FAST policy.

The microcode applies this movement policy to all thin devices under FAST VP control to determine the appropriate tier for the data and will generate and execute movement requests to relocate thin extents to the appropriate tier.


Figure 3. FAST VP components

FAST VP requires three control objects — storage groups, FAST policies, and thin tiers.

• Storage groups are a logical collection of Symmetrix volumes, typically associated with an application, which are to be managed together.

• FAST policies contain a set of tier usage rules that can be applied on one or more storage groups.

• Thin tiers contain between one and four thin storage pools of a matching drive technology (EFD, FC, or SATA) and a RAID protection type.

The storage group definitions are shared between FAST and Auto-provisioning Groups. However, a Symmetrix device may only belong to one storage group that is under FAST control.

A FAST VP policy groups between one and three tiers and assigns an upper usage limit for each thin tier. The upper limit specifies how much allocated capacity of the thin devices in an associated storage group can reside on that particular thin tier.

The upper capacity usage limit for each thin tier is specified as a percentage of the allocated capacity of the thin devices in the associated storage group. The usage limit for each tier must be between 1 percent and 100 percent. When combined, the upper usage limit for all Symmetrix tiers in the policy must total at least 100 percent, but may be greater than 100 percent.

A thin tier contains at least one thin storage pool from the Symmetrix but can include up to four. If more than one thin pool is contained in a thin tier, the thin pools must be of the same drive technology type and RAID protection.
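For orientation only, the general shape of defining a tier, a policy, and an association with the symtier and symfast commands is sketched below. All names are placeholders and the option spellings are indicative rather than authoritative; consult the Solutions Enabler documentation for the exact syntax of your release:

# Define a thin (VP) tier backed by an FC thin pool
symtier -sid 1234 create -name FC_Tier -vp -technology FC -tgt_raid1
symtier -sid 1234 add -tier_name FC_Tier -pool FC_Pool

# Create a policy with a usage limit for that tier, then associate a storage group
symfast -sid 1234 create -fp -name Gold -tier_name FC_Tier -max_sg_percent 100
symfast -sid 1234 associate -sg App_SG -fp_name Gold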

Data movement

The metrics collected at the sub-LUN level for thin devices under FAST VP control contain measurements that allow FAST VP to make separate data movement requests for each 7,680 KB unit of storage that makes up the thin device. This unit of storage, consisting of 10 contiguous thin device extents, is known as an extent group. Activity bitmaps are maintained at the level of an extent group to track the I/Os contributing to the performance metrics.

Data movements take place by moving the allocated track groups of each FAST VP sub-extent from one thin pool to another. When the relocation completes, the track groups from the original thin pool are de-allocated.

All of the best practices and considerations outlined in this document for creating and managing thin pools fully apply to any thin pool that will be assigned to a FAST VP tier. The actual configuration of FAST VP is beyond the scope of this document. For further details on FAST VP, refer to the white paper Implementing Fully Automated Storage Tiering with Virtual Pools (FAST VP) for EMC Symmetrix VMAX Series Arrays; for a specific VMware and FAST VP integration discussion, refer to the technical note Storage Tiering for VMware Environments Deployed on EMC Symmetrix VMAX with Enginuity 5876.

Virtual LUN VP Mobility

Virtual LUN technology offers manual control of data mobility between storage tiers within a Symmetrix VMAX array. Virtual LUN data movement can nondisruptively change the drive type (capacity, rotational speed) and the protection method (RAID scheme) of Symmetrix logical volumes. Enginuity 5875 brings an additional enhancement to this feature by allowing thin device movement from one thin pool to another thin pool.

Open system thin devices can be migrated to alternative thin pools for capacity-based reasons, or to a different RAID scheme or drive type for performance-based reasons. Virtual LUN data movement is transparent to hosts and applications and does not affect local and remote disaster recovery protection during or after the data movement.

Virtual LUN VP migrations are session-based – each session may contain multiple devices to be migrated at the same time. There may also be multiple concurrent migration sessions. At the time of initial execution, a migration session name is specified and this session name is subsequently used for monitoring and managing the migration. While an entire thin device is specified for migration, only thin device extents that are allocated will be relocated. Thin device extents that have been allocated, but not written to (for example, pre-allocated tracks), are relocated but will not cause any actual data to be copied. New extent allocations that occur as a result of a host write to the thin device during the migration will be satisfied from the migration target pool.
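A session-based migration with symmigrate might look like the following sketch; the session name, storage group, pool, and serial number are placeholders, and the options should be verified for your Solutions Enabler release:

# Migrate the allocated extents of a storage group's thin devices to SATA_Pool
symmigrate -sid 1234 -name mig1 establish -sg App_SG -tgt_pool -pool SATA_Pool

# Monitor progress by session name, then remove the completed session
symmigrate -sid 1234 -name mig1 query
symmigrate -sid 1234 -name mig1 terminate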

Rebinding thin devices

Previously, if a user needed to change the binding of a thin device to a different pool, it had to be unbound and then rebound to the new pool, which could result in data loss or unavailability if not manually protected. This functionality allows a user to rebind a thin device to a new pool without moving or losing any data. Rebind simply changes the thin device's current binding to a new pool. Rebind will not move any existing data to the new pool, but all new allocations to the thin device will go to the new pool after the rebind action completes.
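For example, a rebind is a single symconfigure operation (the device and pool names are placeholders):

# Point future allocations at NewPool; existing extents stay where they are
symconfigure -sid 1234 -cmd "rebind tdev 0ABC to pool NewPool;" commit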

Persistent pre-allocation

During allocation of a thin device, the allocation can be marked as “persistent”. Allocations marked as persistent are unaffected by a standard reclaim operation and by any Clone, Snap, or SRDF copy operations. The presence of persistently allocated tracks (None, Some, or All) is reported by the symcfg list -tdev -detail command. This is the preferred method of pre-allocating a thin device and should be used if pre-allocation is desired. It avoids unintentional de-allocation from, for example, an end user’s execution of the ESXi 5 implementation of UNMAP.
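As a hedged sketch, persistent pre-allocation of an entire device and its verification look roughly as follows; the device number, cylinder range, and serial number are placeholders, and the allocate_type keyword may differ by release:

# Pre-allocate the full device persistently so reclaim and UNMAP leave it intact
symconfigure -sid 1234 -cmd "start allocate on tdev 0ABC start_cyl=0 end_cyl=last_cyl allocate_type=persistent;" commit

# Report per-device allocation details, including persistent tracks
symcfg list -tdev -detail -sid 1234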

5876 Virtual Provisioning enhancements

Solutions Enabler version 7.4 and Enginuity 5876 introduce a number of new features and enhancements to Symmetrix Virtual Provisioning, notably TimeFinder/VP Snap. It is important to note that Enginuity 5876 is only available on the Symmetrix VMAX family arrays. A quick summary of the Virtual Provisioning enhancements is provided below.

Thin devices as gatekeepers

Enginuity 5876 and Solutions Enabler v7.4 allow thin devices to function as gatekeepers. These devices do not need to be bound to a pool nor is there any benefit to doing so. Sizing requirements do not differ from thick gatekeepers so the recommended size remains at 3 MB.

VLUN Migration Enhancements

5876 allows the migration of allocations from one pool to another for a user-specified set of devices. Previously, a VLUN migration of a thin device moved all of its allocated extents, from all pools, to the destination pool. The user can now specify which pool should have its extents relocated, leaving any allocations in other pools in place.

FAST VP Enhancements

Enginuity 5876 offers the following new features for FAST VP:

• FAST VP and SRDF coordination: With Solutions Enabler v7.4 and Enginuity 5876, users can set FAST VP RDF coordination on an association to instruct FAST VP to factor the R1 device statistics into the move decisions that are made on the R2 device.

• Thin pool allocations: All thin devices managed by FAST VP can have new allocations placed in any pool in the policy, rather than only in the pool to which the device is bound. This is called Allocate by FAST Policy and is a system-wide setting. The system will choose the best pool to use when allocating new extents. If the physical capacity of the bound pool is exhausted, for instance, a write will not fail; instead it will be redirected and allocated to another pool in the policy (if available).

• Storage group re-associate: With Solutions Enabler v7.4, users can re-associate a storage group to a new policy. With a re-association, all current attributes set on the association propagate automatically to the new association.

TimeFinder/VP Snap

This new TimeFinder technology adds the ability to share capacity allocation for track groups originating from the same source volume. TimeFinder VP Snap is easy to manage and combines the benefits of both full copy clone and space saving snap technologies.

TimeFinder VP Snap provides the ability for multiple clone VSE (Virtual Space Efficient) sessions to target thin devices and share extent allocations within the thin pool as shown in Figure 4. This reduces the space needed for the storage of saved tracks.

Figure 4. TimeFinder/VP Snap

VP Snap sessions copy data from the source device to the target device only if triggered by a host write operation. Read I/Os to protected tracks on the target device do not result in data being copied.

For a single activated VP Snap session on a source device, the target represents a single point-in-time copy of the source. Copied data resides on allocations in the thin pool. For example, if tracks 100, 200, and 300 are written on the source device, the point-in-time data for each track resides in a unique allocation in the thin pool.

When there is a second VP Snap session from the same source device to a different target, the allocations can be shared. For example, if there is a write I/O to tracks 1100, 1200, and 1300 on the source device, the data is new to both targets' point-in-time, and the point-in-time data can be saved in a single set of allocations that is shared by both target devices.

However, if there is another write I/O to tracks 100, 200, or 300 on the source device, since the data is new for only the second session's point-in-time, when the tracks are copied to the second target device, the point-in-time data is put into a set of allocations that are uniquely owned by the second target device. In other words, the allocations for these tracks cannot be shared.

If more VP Snap sessions are added to the same source device, data is copied to the targets based on whether the source data is new with respect to the point-in-time of each copy. When data is copied to more than one target, only a single shared copy resides in the thin pool.

If there is a write I/O to one or more of the tracks stored in a shared allocation, the affected allocation for that target will be split off from the shared group because the data is now different than the data for the other targets that are using that allocation. The new data of the written target will be stored in a separate allocation while the shared allocation will still contain the data of the other targets.

When VP Snap sessions are terminated, the target device is removed from any shared allocations that were part of the session, and any non-shared allocations for that device are de-allocated. When all but one of the VP Snap sessions are terminated, the last remaining session uses the same space in the thin pool, but it is no longer a shared allocation. Upon termination of the last session, the space is de-allocated.
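Operationally, VP Snap sessions are driven through the TimeFinder/Clone interface using the VSE attribute. The following is a minimal sketch, assuming a device group DgName whose logical device names DEV001 (source) and TGT001 (thin target) are placeholders; confirm the exact options against the TimeFinder documentation for your release:

# Create and then activate a space-efficient (VSE) clone session
symclone -sid 1234 -g DgName create -vse DEV001 sym ld TGT001
symclone -sid 1234 -g DgName activate DEV001 sym ld TGT001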

Refer to the document Implementing TimeFinder VP Snap for Local Replication for more information on VP Snap.

Thin device compression

With Enginuity 5876.159.102 and Solutions Enabler v7.5, data within a thin device can be compressed and decompressed to save space.

Important: VP compression is intended for data that is rarely and lightly accessed and that is expected to stay that way for months or years. In the event that compressed data pools receive a significant I/O workload, a system-wide performance impact can be expected. VP compression is limited to a maximum of 10 TB of compressed data per engine. Manual compression should therefore be used with discretion.

It is important to note that users should leverage compression only if the hosted application can sustain a slightly elevated response time when reading from, or writing to, compressed data. If the application is sensitive to heightened response times, it is recommended not to compress the data. Preferably, data compression should be used primarily for data that is completely inactive and is expected to remain that way for an extended period of time. If a thin device (one that hosts a VMFS volume or is in use as an RDM) is presently compressed and an imminent I/O workload is anticipated that will require significant decompression to commit new writes, it is suggested to manually decompress the volume first, thereby avoiding the decompression latency penalty.

To compress a thin device, the entire bound pool must be enabled for compression. This can be achieved by setting the compression capability on a given pool to “enabled” in Unisphere or through the Solutions Enabler CLI. An example of this process with Unisphere can be seen in Figure 5. When compression is no longer desired for allocations in a pool, the compression capability on a pool can be disabled.

Figure 5. Enabling VP Compression on a thin pool

Note that once a pool is enabled for compression, a quick background process will run to reserve storage in the pool that will be used to temporarily decompress data (called the Decompressed Read Queue or DRQ). When compression is disabled, the reserved storage in the pool will be returned to the pool only after all compressed allocations are decompressed. Disabling the compression capability on a pool will not automatically cause compressed allocations to be decompressed. This is an action that must be executed by the user. Solutions Enabler or Unisphere will report the current state of the compression capability on a pool, the amount of compressed allocations in the pool, and which thin devices have compressed allocations. Figure 6 shows a thin device with a compression ratio of 21% in Unisphere.

Figure 6. Viewing thin device-specific compression information with EMC Unisphere
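The same information is available from the Solutions Enabler command line. Both commands below are standard queries (the serial number is a placeholder); with Solutions Enabler v7.5 they also surface the compression details discussed above:

# Pool-level view, including compression state and compressed allocations
symcfg list -pool -thin -detail -sid 1234

# Device-level view, including per-device compression information
symcfg list -tdev -detail -sid 1234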

As of version 1.5 of Unisphere for VMAX, the ability to manually compress a device has not yet been included. Therefore, device compression must be initiated using Solutions Enabler CLI commands. There are a few different mechanisms offered by Solutions Enabler to do this:

To compress/decompress a single device:

symdev compress <device ID> -sid <Symmetrix SN>

symdev uncompress <device ID> -sid <Symmetrix SN>

To compress/decompress all valid devices in a device group:

symdg compress -g <device group name>

symdg uncompress -g <device group name>

To compress/decompress all valid devices in a composite group:

symcg compress -cg <composite group name>

symcg uncompress -cg <composite group name>

To compress/decompress all valid devices in a storage group:

symsg compress -sg <storage group name> -sid <Symmetrix SN>

symsg uncompress -sg <storage group name> -sid <Symmetrix SN>

All compression or decompression operations listed above can be halted by appending the -stop flag to the original command, for example:

symdev compress <device ID> -sid <Symmetrix SN> -stop

Stopping a decompression or compression operation will not reverse any decompressions or compressions that have already been committed by the terminated operation. The opposing command must be run to reverse any compressions or decompressions already committed.

When a user chooses to compress a device, only allocations that have been written to will be compressed. Allocations that have not been written to, or that are entirely zero-filled, will be reclaimed during compression if the tracks are not persistently allocated. In effect, the compression process also performs a zero-space reclamation operation while analyzing for compressible allocations. This means that after a decompression the allocated track count will not match the original allocated track count for the TDEV. If this behavior is undesirable, the user can make use of the persistent allocation feature; the allocated tracks will then be maintained and no compression will take place on persistently allocated tracks. If allocations are persistent and a reclaim of space is in fact desired, the persistent attribute on existing allocations may be cleared to allow compression and reclamation to occur.

If the device has allocations on multiple pools, only the allocations on compression-enabled pools will be compressed.

Note that compression of allocations for a thin device is a low-priority background task. Once the compression request is accepted, the thin device will have a low-priority background task associated with it that performs the compression. While this task is running, no other background task, like allocate or reclaim, can be run against the thin device until it completes or is terminated by the user.

A read of a compressed track will temporarily decompress the track into the DRQ maintained in the pool. The space in the DRQ is controlled by a least-recently-used algorithm, ensuring that a track can always be decompressed and that the most recently utilized tracks remain available in decompressed form. Writes to compressed allocations will always cause decompression.

Migration and thin device compression

Table 3 describes the behavior of Enginuity as it migrates thin devices with compressed allocations.

Table 3. Compression behavior with data migration operations

Source compression enabled, target compression enabled: The compressed tracks are migrated to the target as compressed tracks. The target pool's utilized space is increased by the compressed size while its free space is decreased by the compressed size. The source pool's utilized space is decreased by the compressed size and its free space is increased by the compressed size.

Source compression enabled, target compression disabled: The compressed tracks are migrated to the target as decompressed (allocated) tracks. The target pool's utilized space is increased by the decompressed (allocated) size while its free space is decreased by the decompressed size. The source pool's utilized space is decreased by the compressed size and its free space is increased by the compressed size.

Source compression disabled, target compression enabled: The decompressed (allocated) tracks are migrated to the target pool as decompressed (allocated) tracks. The target pool's utilized space is increased by the decompressed (allocated) size while its free space is decreased by the decompressed (allocated) size. The source pool's utilized space is decreased by the decompressed (allocated) size and its free space is increased by the decompressed (allocated) size.

Source compression disabled, target compression disabled: The decompressed (allocated) tracks are migrated to the target pool. The target pool's utilized space is increased by the decompressed (allocated) size while its free space is decreased by the decompressed (allocated) size. The source pool's utilized space is decreased by the decompressed (allocated) size and its free space is increased by the decompressed (allocated) size.

FAST VP and thin device compression

If data has not been accessed over time, it may be compressed automatically when the device is under FAST VP management and FAST VP supports compression on the association. The FAST engine can treat compression as a sub-tier, for instance demoting from SATA to SATA-compressed (within the same actual tier), and likewise promoting from SATA-compressed to SATA or a higher tier.

Thin device sizing

The maximum size of a thin device on a Symmetrix VMAX is approximately 240 GB. If a larger size is needed, then a metavolume comprised of thin devices can be created. For environments using 5874 and earlier, it is recommended that thin metavolumes be concatenated rather than striped, since concatenated metavolumes support fast expansion: new metavolume members can easily be added to an existing concatenated metavolume. This functionality is required when the provisioned thin device has been completely consumed by the host and further storage allocations are required. Beginning with Enginuity 5875, striped metavolumes can be expanded to larger sizes online. Therefore, EMC recommends that customers utilize striped thin metavolumes with the 5875 Enginuity level and later.
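For illustration, forming a striped thin metavolume with symconfigure looks roughly like the following sketch. The device numbers are placeholders and the stripe_size syntax can vary by Solutions Enabler release:

# Form a striped metavolume headed by 0ABC, then add three more members
symconfigure -sid 1234 -cmd "form meta from dev 0ABC, config=striped, stripe_size=1 cyl; add dev 0ABD:0ABF to meta 0ABC;" commit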

Concatenated metavolumes using standard volumes have the drawback of being limited by the performance of a single metavolume member. In the case of a concatenated metavolume comprised of thin devices, however, each member device is typically spread across the entire underlying thin pool, thus somewhat eliminating that drawback.

However, there are several conditions in which the performance of concatenated thin metavolumes can be limited. Specifically:

1. Enginuity allows one outstanding write per thin device per path with Synchronous SRDF®. With concatenated metavolumes, this could cause a performance problem by limiting the concurrency of writes. This limit will not affect striped metavolumes in the same way because of the small size of the metavolume stripe (one cylinder or 1,920 blocks).

2. Symmetrix Enginuity has a logical volume write-pending limit to prevent one volume from monopolizing writeable cache. Because each metavolume member gets a small percentage of cache, a striped metavolume is likely to offer more writeable cache to the metavolume.

If a change in metavolume configuration is desired, Solutions Enabler 7.3 with Enginuity 5875.198 and later support changing a concatenated thin metavolume into a striped thin metavolume while protecting the data using a thin or thick BCV device that is persistently allocated [2]. Prior to this, protected metavolume conversion was not allowed. For details on how to perform this operation, please refer to the Unisphere for VMAX or Solutions Enabler documentation.

[2] Solutions Enabler 7.3 introduces the ability to persistently pre-allocate thin devices. This allows any pre-allocation of a thin device to be marked as “persistent”. Persistent allocations are henceforth unaffected by a standard reclaim operation and by any Clone, Snap, or SRDF copy operations.

Guest OS allocation considerations

Host file system and volume management routines behave in subtly different ways. These behaviors are usually invisible to end users, but the introduction of thin devices puts them in a new perspective. For example, a UNIX file system might show as empty when a “df -k” command is executed, yet many blocks may have been written to the thin device in support of file system metadata such as inodes. From a Virtual Provisioning point of view, this thin device could be using a substantial number of thin extents even though the space shows as available to the operating system and logical volume manager. While this kind of preformatting activity diminishes the value of Virtual Provisioning in one area, it does not negate the other values that Virtual Provisioning brings, such as the performance gains due to wide striping or the inherent flexibility that the Virtual Provisioning paradigm offers.

In addition, in VMware vSphere environments, the guest operating system, file systems, and applications running in virtual machines can each behave differently when installed or run on a virtually provisioned device. An educated choice of file system and logical volume manager options in the guest operating system (such as NTFS formatting options in Windows) can yield the maximum value from the Virtual Provisioning infrastructure.

Considerations for VMware vSphere on thin devices

There are certain considerations that VMware administrators must be aware of when deploying a VMware vSphere environment that employs Virtual Provisioning. How the various VMware vSphere features work with Virtual Provisioning depends on the feature, its usage, and the guest operating system running in the VMware vSphere environment.

Device visibility and access

Bound thin devices appear like any other SCSI-attached device to VMware vSphere3. A thin device can be used to create a VMware file system, or assigned exclusively to a virtual machine as a raw disk mapping (RDM). VMware vSphere 5, unlike previous versions, has the ability to detect whether or not the device is thin. ESXi will identify thin provisioned LUNs using the “thin provisioning enabled” (TPE) bit from a SCSI READ CAPACITY (16) response.
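As a quick check of how ESXi 5 reports a given device (a sketch; the naa identifier is a placeholder), the device properties can be listed from the command line, and on a Symmetrix thin device the output should include a Thin Provisioning Status of “yes”:

esxcli storage core device list -d naa.<naa>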

One thing to remember is that when a thin device is presented as an RDM (in either physical or virtual compatibility mode) to a virtual machine, the VMware kernel does not play a direct role in the I/O to the thin device from the guest operating system. In this configuration, the considerations for using thin devices are no different from those for physical servers running the same operating system and applications.

3 Thin devices not bound to a thin pool are presented as not-ready to the host, although the SCSI INQUIRY command to the device succeeds. The VMware ESXi kernel therefore cannot determine the correct status of an unbound device. If the ESXi kernel performs a read on an unbound thin device, zeros are returned. Similarly, any attempt to create a datastore on an unbound thin device will fail. EMC therefore strongly discourages customers from presenting unbound thin devices to VMware ESXi hosts (with the exception of thin devices in use as gatekeepers).

Virtual Machine File System on thin devices

The Virtual Machine File System (VMFS) has some interesting and valuable behavioral characteristics when utilized in a Virtual Provisioning environment. First, a minimal number of thin extents are allocated from the thin pool when a VMFS is created on virtually provisioned devices. The amount of storage required to store the file system metadata is a fixed size plus an additional increment which is a function of the size of the VMFS datastore.

VMFS-3 volumes will require approximately 573 MB of capacity to be reserved on the VMFS volume plus an additional 1 MB for each 100 GB of the volume. For example, the metadata consumption for two different VMFS-3 volumes (one 100 GB and the other 200 GB) on two different thin devices is shown in Figure 7. The 100 GB VMFS-3 volume consumes and reserves 574 MB of available capacity and the 200 GB VMFS-3 volume consumes and reserves 575 MB.

Figure 7. Metadata area reserved on the VMFS-3 volume

VMFS-5 behavior is very similar when scaled, but requires slightly more reserved space than VMFS-3. VMFS-5 volumes will require approximately 970 MB of capacity to be reserved on the VMFS volume plus an additional 1 MB for each 100 GB of the volume. For example, the metadata consumption for two different VMFS-5 volumes (one 100 GB and the other 200 GB) on two different thin devices is shown in Figure 8. The 100 GB VMFS-5 volume consumes and reserves 971 MB of available capacity and the 200 GB VMFS-5 volume consumes and reserves 972 MB.


Figure 8. Metadata area reserved on the VMFS-5 volume

While capacity on the order of hundreds of megabytes is reserved on the VMFS volume for metadata, far less is actually written to the Symmetrix thin device when the volume is created.

Figure 9 is a printout from Solutions Enabler that shows the tracks actually written due solely to the write activity generated by the formatting of the two thin devices hosting the VMFS-3 volumes shown in Figure 7. It can be seen from Figure 9 that only 489 tracks were written to the thin device hosting the 100 GB VMFS volume and 497 tracks to the thin device hosting the 200 GB VMFS volume. One track equals 64 KB, so this translates to 30.5 MB and 31.0 MB respectively.

Figure 9. Allocation from a thin pool in response to VMFS-3 format activity

Figure 10 is a printout from Solutions Enabler that shows the tracks actually written due solely to the write activity generated by the formatting of the two thin devices hosting the VMFS-5 volumes shown in Figure 8. It can be seen from Figure 10 that only 436 tracks were written to the thin device hosting the 100 GB VMFS volume and 444 tracks to the thin device hosting the 200 GB VMFS volume. At 64 KB per track, this translates to 27.25 MB and 27.75 MB respectively.


Figure 10. Allocation from a thin pool in response to VMFS-5 format activity

Comparing Figure 9 and Figure 10 shows that the initial creation and formatting of a VMFS-5 volume writes less data to the thin device than that of a VMFS-3 volume. Both VMFS versions, however, require the same additional capacity of 8 tracks (512 KB) per 100 GB of total volume capacity.

It is important to note that the numbers shown in this section may not be identical in every environment, as they can vary slightly depending on factors such as block size. They can, however, be used as a guide to anticipate behavior for a given vSphere environment.

In conclusion, it can be established that ESXi does not write all of its metadata to disk on creation of a VMFS; the VMFS formats and uses the reserved metadata area only as the need arises.

VMFS block allocation and reuse algorithms

The ESXi kernel uses a very efficient mechanism to allocate and reuse space on VMFS volumes. Understanding this behavior, and how it is affected by virtual disk types and guest write operations, is crucial knowledge that enables VMware administrators to properly size and manage their storage.

This behavior changes slightly depending on the version of Enginuity and vSphere, as the latest versions introduced a feature called Dead Space Reclamation that allows previously used, but since freed, storage (known as “dead space”) to be reclaimed.

This section has three subsections:

1. VMFS block allocation and reuse with all versions of vSphere and Enginuity 5876.82 and earlier (Dead Space Reclamation not supported).

2. VMFS block allocation and reuse with vSphere 5.0 U1 and Enginuity 5876.159.102 and later (Dead Space Reclamation supported).

3. Reclaiming space from Symmetrix thin devices for vSphere/Enginuity combinations that do not support Dead Space Reclamation.


All vSphere versions and Enginuity 5876.82 and earlier

Figure 11 shows the results from a series of tests conducted to determine the behavior of the VMFS when virtual disks are created and deleted and when files within those virtual disks are created or deleted. The graph shows the used capacity in GB of the VMFS volume as vSphere sees it and the actual GBs allocated on the thin pool after the operation.

Figure 11. VMFS allocation mechanisms without Dead Space Reclamation


It can be seen from Figure 11 that the VMFS utilization drops when a virtual disk is deleted while the percent utilization of the thin pool does not change. However, from that point on the VMFS reuses the freed blocks as new files are created and deleted on the file system. The VMFS thus exhibits a behavior that is still beneficial when used with virtually provisioned devices.

vSphere 5.0 U1 with Enginuity 5876.159.102 and later

Referred to by VMware as Dead Space Reclamation, ESXi 5.0 U1 introduced support for the SCSI UNMAP command (opcode 0x42), which requests the de-allocation of extents on a thin device. Dead Space Reclamation offers the ability to reclaim blocks of a thin-provisioned LUN on the array via a VMware-provided command line utility.

It is important to note that the Symmetrix only supports this functionality with Enginuity 5876.159.102 and higher. Customers running Enginuity 5876.82 or lower must therefore upgrade Enginuity to leverage this functionality, or use the workaround described in the next section, Reclaiming space on VMFS volumes without Dead Space Reclamation support.

A Dead Space Reclamation should be executed after a virtual disk is deleted or is migrated to a different datastore. As shown in the previous section, when virtual machines were migrated from a datastore, the blocks used by the virtual machine prior to the migration were still reported as “in use” by the array, as shown in Figure 11. With this new VMware vStorage API for Array Integration primitive, the Symmetrix is informed that the blocks are no longer in use, resulting in more accurate reporting of disk space consumption and a much more efficient use of resources.

To confirm from the ESXi host that the Enginuity level is correct and that UNMAP is in fact supported on a LUN, the following command can be issued to query the device:

esxcli storage core device vaai status get -d <naa>

Example output:

naa.<naa>
   VAAI Plugin Name: VMW_VAAI_SYMM
   ATS Status: supported
   Clone Status: supported
   Zero Status: supported
   Delete Status: supported

A device displaying a Delete Status of supported is capable of receiving SCSI UNMAP commands; in other words, the Symmetrix is indeed running an Enginuity level that supports UNMAP from an ESXi host.

In order to reclaim space, ESXi 5.0 Update 1 includes an updated version of vmkfstools that provides an option (-y) to issue the UNMAP operation. This command should be issued after navigating to the root directory of the VMFS volume residing on the device:


cd /vmfs/volumes/<volume-name>

vmkfstools -y <percentage of deleted blocks to reclaim>

The percentage of deleted blocks to reclaim can be set anywhere from 1% to 99%.

This command creates a temporary “balloon” file at the top level of the datastore. This file can be as large as the aggregate size of the blocks being reclaimed, as seen in Figure 12. If the reclaim operation is interrupted, this temporary file may not be deleted automatically and may need to be removed manually.

Figure 12. SSH session to ESXi server to reclaim dead space with vmkfstools

It is recommended not to use too large a percentage in the vmkfstools reclaim command, as the resulting balloon file temporarily consumes and reserves space on the datastore. If a new virtual disk needs to be deployed or an existing one needs to grow, a large enough temporary balloon file could block these actions. Therefore, use a reclaim percentage closer to 60% and avoid the maximum of 99%.
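For example, to reclaim 60 percent of the dead space on a hypothetical datastore named VP_Datastore, the sequence would be:

# Navigate to the root of the VMFS volume, then reclaim 60% of the dead space
cd /vmfs/volumes/VP_Datastore
vmkfstools -y 60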

In situations where a large amount of space is to be reclaimed, it is possible to run into errors like the following:

Could not set file .vmfsBalloonCluQlx length to 3.7 TB (Cannot allocate memory).

This indicates that the VMFS heap size is set too low and needs to be increased. Refer to VMware KB article 1004424 for information on changing the VMFS heap size. Note that changing this setting requires a reboot of the ESXi host to take effect.
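As a sketch, assuming the VMFS3.MaxHeapSizeMB advanced setting discussed in that KB article, the heap size can be inspected and raised from the ESXi 5 command line as follows (the value shown is illustrative; consult the KB article for supported values):

# View the current VMFS heap size setting
esxcli system settings advanced list -o /VMFS3/MaxHeapSizeMB

# Raise the heap size (an ESXi host reboot is required for it to take effect)
esxcli system settings advanced set -o /VMFS3/MaxHeapSizeMB -i 256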

There is no practical way of knowing how long a vmkfstools -y operation will take to complete. It can run for a few minutes or even a few hours depending on the size of the datastore, the amount of dead space, and the available resources on the array to execute the operation. Reclamation on the Symmetrix is a low-priority task (so that it does not negatively affect host I/O performance), so a heavy workload on the device4, pool, or the array itself can throttle the process and slow the reclamation.

There are some caveats with UNMAP where certain device types or configurations may cause the UNMAP command to be blocked or not valid for the entire device. These are:

1. Not supported on devices in use by Open Replicator for Symmetrix (ORS)

2. Not supported with devices in use by RecoverPoint

3. Not supported for encapsulated FTS devices

4. Can be partially blocked by thin devices that have persistent allocations. The VMAX accepts the UNMAP command but respects persistent allocations. If the specified UNMAP range has persistent and non-persistent allocations, the VMAX will de-allocate the non-persistent allocations and leave the persistent allocations.

5. Not supported with virtual devices (TimeFinder/Snap targets)

6. Not supported with devices currently involved in duplicate write sessions (this is a temporary state; these sessions are used during meta reconfiguration and expansion)

7. Not supported with TimeFinder/VP Snap targets

UNMAP to a thin SRDF device is accepted and supported on the R1 and is propagated to its remote replicas (R21, R2) as long as the links are active and the remote arrays are also running 5876.159.102.

Figure 13 shows the results from a series of tests conducted to determine the behavior of the VMFS when virtual disks are created and deleted and when files within those virtual disks are created or deleted. This time, a Dead Space Reclamation operation is executed after each delete operation. The graph shows the used capacity in GB of the VMFS volume as vSphere sees it and the actual GBs allocated on the thin pool after the operation.

4 An in-progress VAAI XCOPY session to the device being reclaimed (due to a Storage vMotion or clone operation) can cause the device to exceed its pending-operation limits, throttling the UNMAP session to substantially reduced rates until the threshold has been cleared.


Figure 13. VMFS allocation mechanisms with Dead Space Reclamation

It can be seen from Figure 13 that the VMFS utilization drops when the virtual disks are deleted and vmkfstools is run, but not when files within them are deleted, no matter the type of virtual disk. Due to Dead Space Reclamation (UNMAP), the percent utilization of the thin pool now drops when virtual disks containing actual data, or eagerzeroedthick virtual disks, are deleted. Also of note, the VMFS volume reuses the freed blocks as new files are created and deleted on the file system. With the addition of Dead Space Reclamation, VMFS exhibits a behavior that is exceptionally beneficial when used with virtually provisioned devices.


Reclaiming space on VMFS volumes without Dead Space Reclamation support

As described earlier, in versions of vSphere prior to 5.0 U1, or on Symmetrix arrays running Enginuity 5876.82 or earlier, thin device allocations cannot be reclaimed with the vmkfstools command. If an ESXi and/or Enginuity upgrade is not possible, the following process is a workaround5 to remove these allocations quickly and simply:

1. Log into the vSphere Client and ensure that the EMC Virtual Storage Integrator Storage Viewer plugin is installed.

2. Synchronize VSI. To make sure VSI has the most up-to-date information concerning thin pool allocations, it is important to synchronize the local database supporting VSI Storage Viewer. In the vSphere Client, navigate to Home > Solutions and Applications > EMC and click the Symmetrix Arrays panel. Click the “Sync All Arrays” link at the top of the panel as shown in Figure 14.

Figure 14. Synchronizing VSI information

3. Before space can be reclaimed, the previously used allocations must be overwritten with contiguous zeroes to allow the Symmetrix to be able to reclaim them. The simplest method of achieving this is by deploying an eagerzeroedthick virtual disk. When possible, the ESXi kernel will re-use previously allocated blocks before writing to new ones, which therefore makes this the most efficient method of zeroing out previously written segments. While users could simply create a virtual disk as large as possible, it is a quicker and far more efficient process if the user knows approximately how much space can be reclaimed so they can limit the size of the temporary virtual disk and the duration of this process. The maximum reclaimable space can be approximated with the use of the Storage Viewer feature of VSI. To find the reclaimable space:

1. Select a host with access to the VMFS volume that is to be reclaimed and click on the EMC VSI tab to open the VSI Storage Viewer.

5 Since this process requires support for zero space reclamation, it can only be performed on Symmetrix VMAX arrays running 5874.207 or higher.


2. Select the VMFS volume in the “Datastores” panel and then choose the “Storage Pools” button to see thin pool allocations of the thin device.

3. Find the value of the “Free” column for the selected VMFS volume in the Datastores list.

4. Identify the value of the “Free” capacity in the storage pool storage details screen of the thin device itself. Note that although there may be multiple storage pools listed, only the summary at the top contains the relevant information.

The third step shows the total free capacity of the VMFS datastore as vCenter observes it (space is freed when VMs are deleted or moved). The fourth step shows the free space of the thin device as the Symmetrix sees it (space is not freed as VMs are deleted or moved). The difference in these two numbers is the approximate amount of space that can be reclaimed and therefore should be the size of the eagerzeroedthick virtual disk that is deployed to zero out the VMFS volume dead space. The locations of the values are noted by red arrows in Figure 15.

Figure 15. Identifying reclaimable space with VSI Storage Viewer

4. In the example shown in Figure 15, the reported free space on the VMFS volume is 98.80 GB and the reported free space on the thin device is 66.74 GB. This means that approximately 32.06 GB was at one point in use on the VMFS volume but has since been deleted (that is, 32.06 GB of dead space). Therefore, up to 32.06 GB of space can be freed on the thin device.

NOTE: Because this is merely a workaround, the efficiency of this process is not always 100%. ESXi may not always perfectly re-use previously deleted space before creating new allocations; the ability to re-use blocks depends on a variety of factors (VMFS fragmentation, for example). The best practice therefore is to size the eagerzeroedthick virtual disk somewhat larger than the calculated free space to account for any discrepancy. In this case, to make up for any variation, the eagerzeroedthick virtual disk that will be deployed to zero the VMFS volume will be rounded up in size from 32.06 GB to 40 GB. This increases the likelihood of reclaiming the maximum amount of space. In some situations the virtual disk might need to be increased by 100% or more to maximize reclaimed space.

5. At this point the temporary virtual disk can be deployed. The method of deployment (see step 6) is entirely at the discretion of the user. All that matters is the following:

a. The size of the virtual disk is adequately large.

b. The virtual disk is of type eagerzeroedthick.

c. The virtual disk is deployed on the correct VMFS volume.

6. It is important to note that this temporary virtual disk does not have to be associated with a virtual machine, and it can and should be deleted immediately after creation. The vSphere Client does not offer a mechanism to create virtual disks that are not associated with a virtual machine, so when using the client the virtual disk must be temporarily added to a virtual machine, or a new temporary virtual machine must be created. However, VMware CLI tools such as vmkfstools can create a virtual disk that is not attached to any virtual machine, with the added benefit of being scriptable, as sketched below. These options are shown in Figure 16.
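A minimal sketch of the scriptable approach follows; the datastore path and file name are hypothetical, and the 40 GB size is taken from the example above:

# Create a 40 GB eagerzeroedthick virtual disk directly on the datastore (no VM required)
vmkfstools -c 40G -d eagerzeroedthick /vmfs/volumes/VP_Datastore/reclaim-temp.vmdk

# Delete the temporary virtual disk once creation (and therefore zeroing) completes
vmkfstools -U /vmfs/volumes/VP_Datastore/reclaim-temp.vmdk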


Figure 16. Virtual disk creation options

7. Whichever method is chosen, the creation of the eagerzeroedthick virtual disk will zero out most, if not all, of the dead space on the thin device. For detailed information on what the virtual disk deployment process is doing, refer to the section “Eagerzeroedthick” allocation format later in this paper.

8. The virtual disk (or virtual machine if a new one was created) should be deleted immediately after completion.

9. These newly written contiguous zeroes can be detected by the Symmetrix and reclaimed with either Unisphere for VMAX or the Solutions Enabler CLI. Unisphere is shown in Figure 17.

WARNING: If a storage administrator has purposely allocated space on this TDEV, this process will de-allocate it unless the allocations are persistent. Furthermore, the zeroes of any eagerzeroedthick virtual disks on that VMFS volume will be reclaimed. Removing these zeroes and allocations could impact performance for that VMFS, so it is important to understand what other activity is present on the datastore before executing a reclaim. Consequently, before running a reclaim, use Storage vMotion to move any eagerzeroedthick virtual machines whose space should not be reclaimed off of the VMFS, and move them back after the reclamation process has completed.


Figure 17. Reclaiming zeroes with Unisphere for VMAX

The reclamation process is not instantaneous, as the entire device must first be scanned for thin extents that are filled with contiguous zeroes. For smaller devices this process may take minutes; for large devices, perhaps hours. Furthermore, the reclaim process is a low-priority task on the Symmetrix directors, so it may slow down if the directors are under a heavy load from higher-priority tasks such as servicing significant host I/O.
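For reference, a hedged sketch of initiating such a reclaim from the Solutions Enabler CLI is shown below. The device and array numbers are hypothetical, and the exact free/reclaim syntax varies by Solutions Enabler release, so it should be confirmed in the Solutions Enabler Controls Product Guide:

# Start a zero-space reclaim on thin device 007D (syntax varies by release)
symconfigure -sid 1234 -cmd "free thin dev 007D start_cyl=0 end_cyl=last_cyl type=reclaim;" commit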

Impact of guest operating system activities on thin pool utilization

As discussed in the previous section, a small number of thin extents are allocated in response to the creation of a new virtual machine. However, the thin pool utilization rapidly grows as the user performs various activities in the virtual machine. For example, the installation of an operating system in the virtual machine triggers write I/Os to previously uninitialized blocks of data. These I/Os in turn result in allocation of new thin extents in the thin pool associated with the thin device.

The amount of storage used by the virtual machine depends on the behavior of the operating system, the logical volume manager, the file systems, and the applications running inside the virtual machine. Poor allocation and reuse of blocks freed from deleted files can quickly result in thin pool allocation equal to the size of the virtual disks presented to the virtual machine. Nevertheless, the thin pool allocation in support of a virtual machine can never exceed the size of its virtual disks. Therefore, users should consider guest operating system and application behavior when configuring virtual disks for new virtual machines.


A number of additional documents listed in the References section discuss the behavior of various operating systems, file systems, and applications on physical servers in a Virtual Provisioning environment. Since the file system behavior does not change inside a virtual machine compared to a physical machine, readers should consult these documents to understand the anticipated behavior of the virtual machines in a VMware vSphere environment.

Virtual disk allocation mechanisms

VMware vSphere offers multiple ways of formatting virtual disks and has integrated these options into VMware vCenter. For new virtual machine creation, only the eagerzeroedthick, zeroedthick, and thin formats are offered as options. To create clones of a VMDK file with non-default allocation schemes (that is, mechanisms other than thin, zeroedthick, or eagerzeroedthick) for use with other virtualization platforms, VMware Converter or the CLI utility vmkfstools must be used.

All of the allocation mechanisms that the VMware kernel provides for virtual disks are listed in Table 4.

Table 4. Allocation policies when creating new virtual disks on a VMware datastore

Allocation mechanism (virtual disk format) and the corresponding VMware kernel behavior:

zeroedthick: All space is allocated at creation but is not initialized with zeros. However, the allocated space is wiped clean of any previous contents of the physical media before any new reads or writes. This is the default policy when creating new virtual disks and is referred to as “Thick Provision Lazy Zeroed” in the vSphere Client.

eagerzeroedthick: All of the space is allocated and all of the blocks are initialized with zeros at creation. This option performs a write to every block of the virtual disk and hence results in equivalent storage use in the thin pool. This is referred to as “Thick Provision Eager Zeroed” in the vSphere Client.

thin: No space is reserved on the VMware VMFS on creation of the virtual disk. The space is allocated and zeroed on demand.

rdm: The virtual disk created with this mechanism is a mapping file that contains pointers to the blocks of the SCSI disk it is mapping. The SCSI INQUIRY information of the physical media is virtualized. This format is commonly known as the “virtual compatibility mode raw disk mapping.”

rdmp: This format is similar to the rdm format discussed above, except that the SCSI INQUIRY information of the physical media is not virtualized. This format is commonly known as the “physical compatibility mode raw disk mapping.”

raw: This mechanism can be used to address all SCSI devices supported by the kernel except for SCSI disks.


2gbsparse: The virtual disk created using this format is broken into multiple sparsely allocated extents (if needed), with each extent no more than 2 GB in size. A sparse disk cannot be powered on on an ESXi host unless it is first re-imported in a compatible format, such as zeroedthick, eagerzeroedthick, or thin, with vmkfstools.

monosparse: A monolithic sparse disk. A disk in this format can be used with other VMware products. This format is only applicable as a destination format in a clone operation, and is not usable for disk creation.

monoflat: A monolithic flat disk. A disk in this format can be used with other VMware products. This format is only applicable as a destination format in a clone operation, and is not usable for disk creation.

“Zeroedthick” allocation format

When creating, cloning, or converting virtual disks in the vSphere Client, the default option is called "Thick Provision Lazy Zeroed" in vSphere 5 and simply “Thick” in vSphere 4; both are the "zeroedthick" format. In this allocation scheme, the storage required for the virtual disk is reserved in the datastore, but the VMware kernel does not initialize all the blocks. The blocks are initialized by the guest operating system as writes to previously uninitialized blocks are performed. If the guest operating system attempts to read blocks it has not previously written, the VMFS returns zeros, even in cases where information from a previous allocation (data "deleted" on the host, but not de-allocated on the thin pool) is present. The VMFS will never present stale data to the guest operating system when the virtual disk is created using the "zeroedthick" format.

Since the VMFS volume reports the virtual disk as fully allocated, the risk of oversubscribing is reduced: oversubscription occurs only at the Symmetrix layer, not at both the VMware layer and the Symmetrix layer. The virtual disks will not require more space on the VMFS volume, as their reserved size is static with this allocation mechanism; more space is needed only if additional virtual disks are added. Therefore, the only free capacity that must be monitored diligently is the free capacity on the thin pool itself. This is shown in Figure 18, which displays a single 10 GB virtual disk that uses the “zeroedthick” allocation method. This virtual disk currently has only 1.6 GB of data written from the guest OS. Regardless, the datastore browser reports the virtual disk as consuming the full 10 GB.


Figure 18. “Zeroedthick” virtual disk allocation size as seen in a VMFS datastore browser

However, since the VMware kernel does not actually initialize unused blocks, the full 10 GB is not consumed on the thin device. In the example, the virtual disk resides on thin device 007D and, as seen in Figure 19, only consumes about 1.6 GB of space on it.

Figure 19. “Zeroedthick” virtual disk allocation on Symmetrix thin devices

“Thin” allocation format

Like zeroedthick, the "thin" allocation mechanism is also Virtual Provisioning-friendly but, as explained in this section, should be used with caution in conjunction with Symmetrix Virtual Provisioning. "Thin" virtual disks increase the efficiency of storage utilization for virtualization environments by using only the amount of underlying storage needed for the virtual disk, exactly like zeroedthick. Unlike zeroedthick, however, thin virtual disks do not reserve space on the VMFS volume, allowing more virtual disks per VMFS. Upon initial provisioning, the virtual disk is provided with an allocation equal to one VMFS block worth of storage from the datastore. As that space is filled, additional chunks of storage in multiples of the VMFS block size are allocated for the virtual disk, so that the underlying storage demand grows as the disk's size increases.

It is important to note that this allocation size depends on the block size of the target VMFS. For VMFS-3, the default block size is 1 MB, but it can also be 2, 4, or 8 MB. Therefore, if the VMFS block size is 8 MB, the initial allocation of a new "thin" virtual disk will be 8 MB, and it will subsequently be increased in 8 MB chunks. For VMFS-5, the block size is not configurable and is therefore, with one exception, always 1 MB. The sole exception is a VMFS-5 volume upgraded from VMFS-3; in this case, the block size is inherited from the original VMFS volume and could therefore be 1, 2, 4, or 8 MB.
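As a brief illustration (the path and size are hypothetical), a thin virtual disk can also be created from the CLI; it initially consumes only a single VMFS-block-sized allocation on the datastore:

# Create a 10 GB thin virtual disk; space is allocated in VMFS-block-sized chunks as the guest writes
vmkfstools -c 10G -d thin /vmfs/volumes/VP_Datastore/example-vm/example-thin.vmdk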

A single 10 GB virtual disk (with only 1.6 GB of actual data on it) that uses the thin allocation method is shown in Figure 20. The datastore browser reports the virtual disk as consuming only 1.6 GB on the volume.

Figure 20. Thin virtual disk allocation size as seen in a VMFS datastore browser

As is the case with the "zeroedthick" allocation format, the VMware kernel does not actually initialize unused blocks for "thin" virtual disks, so the full 10 GB is neither reserved on the VMFS nor consumed on the thin device. The virtual disk presented in Figure 20 resides on thin device 007D and, as seen in Figure 21, consumes only about 1.6 GB of space on it.


Figure 21. Thin virtual disk allocation on Symmetrix thin devices

“Eagerzeroedthick” allocation format

With the "eagerzeroedthick" allocation mechanism (referred to as "Thick Provision Eager Zeroed" in the vSphere Client in vSphere 5), space required for the virtual disk is completely allocated and written at creation time. This leads to a full reservation of space on the VMFS datastore and on the underlying Symmetrix device. Accordingly, it takes longer to create disks in this format than other types of disks.

A single 10 GB virtual disk (with only 1.6 GB of actual data on it) that uses the “eagerzeroedthick” allocation method is shown in Figure 22. The datastore browser reports the virtual disk as consuming the entire 10 GB on the volume.

Figure 22. “Eagerzeroedthick” virtual disk allocation size as seen in a VMFS datastore browser

Unlike with “zeroedthick” and “thin,” the VMware kernel initializes all the unused blocks, so the full 10 GB is reserved on the VMFS and consumed on the thin device.


The “eagerzeroedthick” virtual disk resides on thin device 0484 and, as seen in Figure 23, consumes 10 GB of space on it.

Figure 23. “Eagerzeroedthick” virtual disk allocation on Symmetrix thin devices

“Eagerzeroedthick” with ESXi 4.1+ and Enginuity 5875+

VMware vSphere 4.1/5.x offers a variety of VMware vStorage APIs for Array Integration (VAAI) that provide the capability to offload specific storage operations to the EMC Symmetrix VMAX to increase both overall system performance and efficiency. Symmetrix VMAX with Enginuity 5875 supports the following VMware Storage APIs: Full Copy, Block Zero, and Hardware-Assisted Locking.

The Block Zero primitive delivers hardware-accelerated zero initialization, greatly reducing the time required for common I/O tasks such as creating new virtual disks. This feature is especially beneficial when creating fault-tolerant (FT)-enabled virtual machines, which make use of the eagerzeroedthick virtual disk. Having the array complete the bulk zeroing of a virtual disk instead of the host speeds up the standard initialization process by an order of magnitude.

Without the Block Zero primitive, virtual disk creation is not complete until the host has finished the zeroing process, which for a very large disk can take a long time. The Block Zero primitive, which uses the WRITE SAME SCSI command (opcode 0x93), enables the disk array to return control to the requesting service as though the zeros had already been written, and then finish zeroing out those blocks in the background on the array.

This primitive has a profound effect on Symmetrix Virtual Provisioning when deploying "eagerzeroedthick" virtual disks. When this feature is enabled, combined with Symmetrix Virtual Provisioning, the typical host zeroing that occurs when deploying disks with this allocation mechanism is offloaded to the array. Furthermore, the Symmetrix will not write the zeros to disk but will instead simply mark the tracks “Never Written By Host” and allocate the respective tracks in the thin pool. If desired, these allocated but unwritten tracks can be de-allocated using Unisphere for VMAX or Solutions Enabler. Refer to the Solutions Enabler Controls Product Guide or the Unisphere for VMAX online help for more detail.


Block Zero is enabled by default both on a Symmetrix VMAX running Enginuity 5875 or later and on a properly licensed ESXi 4.1+ host. Block Zero can be disabled or enabled on an ESXi host through the vSphere Client or command line utilities by altering the DataMover.HardwareAcceleratedInit setting in the ESXi host advanced settings under DataMover, as shown in Figure 24.
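For example, on an ESXi 5 host the same setting can be inspected and toggled from the command line (a value of 0 disables Block Zero; 1 enables it):

# View the current hardware-accelerated init (Block Zero) setting
esxcli system settings advanced list -o /DataMover/HardwareAcceleratedInit

# Enable (1) or disable (0) hardware-accelerated Block Zero
esxcli system settings advanced set -o /DataMover/HardwareAcceleratedInit -i 1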

Figure 24. Enabling hardware-accelerated Block Zero

Virtual Disk Type Considerations

In general, before vSphere 4.1, EMC recommended using "zeroedthick" instead of "thin" virtual disks when using Symmetrix Virtual Provisioning. The reason "thin on thin" was not always recommended is that thin provisioning on two separate layers (host and array) increases the risk of out-of-space conditions for virtual machines. vSphere 4.1 (and, even more so, vSphere 5) in conjunction with the latest releases of Enginuity integrates these layers better than ever. As a result, thin on thin is now acceptable in far more situations, but it is important to remember that risks remain in doing so.

For this reason, EMC recommends "zeroedthick" (for better reliability as only the array must be monitored) or "eagerzeroedthick" (for absolute reliability as all space is completely reserved) for mission critical virtual machines.

The relative performance, protection against capacity exhaustion, and efficiency of the three types of virtual disks are shown in Figure 25.


Figure 25. Virtual disk allocation mechanism considerations


It is important to remember that using "thin" rather than "zeroedthick" virtual disks does not provide increased space savings on the physical disks. As previously discussed, since "zeroedthick" only writes data written by the guest OS, it does not consume more space on the Symmetrix thin pool than a similarly sized "thin" virtual disk. The primary difference between "zeroedthick" and "thin" virtual disks is not a question of space, but a question of the quantity and capacity of the Symmetrix thin devices presented to the ESXi server. Due to the architecture of Virtual Provisioning, binding more thin devices to a thin pool, or increasing their size, does not take up more physical disk space than fewer or smaller thin devices. Whether 10 small thin devices are presented to an ESXi host or one large thin device, the storage usage on the Symmetrix is essentially the same. Therefore, packing more virtual machines into smaller or fewer thin devices is not necessarily more space efficient.

Prior to ESXi 5, a single-extent VMFS volume was limited to approximately 2 TB, which led customers to use "thin on thin" to keep the number of devices presented to a host minimal as a way to ease management; "thin" virtual disks allow for fewer presented Symmetrix devices and a higher virtual machine density. With the larger allowable single-extent VMFS volume size in ESXi 5 (up to 64 TB), the need to add more thin devices is reduced, as larger, more appropriately sized thin devices can be presented, used, and expanded as necessary.

Furthermore, the hardware-assisted locking primitive offered by VAAI, introduced in ESX(i) 4.1, makes larger devices more practical. It provides a more granular means of protecting the VMFS metadata than the SCSI reservations used previously. Hardware-assisted locking leverages a storage array atomic test-and-set capability to enable a fine-grained, block-level locking mechanism. Simple operations such as moving a virtual machine, starting it, creating a new virtual machine from a template, taking snapshots, or even stopping a virtual machine result in VMFS having to allocate or return storage to or from the shared free-space pool. Although VMFS use of SCSI reservations to lock the LUN does not often degrade performance, hardware-assisted locking provides a much more efficient means of avoiding retries when acquiring a lock.

With ESXi 5, customers therefore do not need to make virtual disks "thin" in order to fit a large number of them on a single volume, as the datastore is no longer constrained by the 2 TB minus 512 byte limit.

However, as previously mentioned, VMware and EMC have recently improved the practicality of using thin virtual disks with Symmetrix Virtual Provisioning. This has been achieved through:

1. Refining vCenter and array reporting while also widening the breadth of direct Symmetrix integration

2. Removing the slight performance overhead that existed with thin VMDKs, which was caused by zeroing-on-demand and by intermittent expansion of the virtual disk as new blocks were written by the guest OS; these penalties have been significantly diminished by Block Zero and Hardware-Assisted Locking (ATS), respectively.


3. Introducing Storage Distributed Resource Scheduler (Storage DRS) in vSphere 5, which further reduces the risk of running out of space on a VMFS volume, as it can be configured to migrate virtual machines off a datastore when that datastore reaches a user-specified percent-full threshold. This all but eliminates the risk of running out of space on VMFS volumes, the only assumption being that there is available capacity elsewhere to move the virtual machines to.

These improvements can make "thin on thin" a much more viable option, especially in vSphere 5. Essentially, it depends on how risk-averse an organization is and on the importance and priority of the application running in a virtual machine. If virtual machine density and storage efficiency are valued above the added protection provided by a thick virtual disk (or if the possible risk of an out-of-space condition is simply acceptable), "thin on thin" may be used. If "thin on thin" is used, alerts at both the vCenter level and the array level should be configured.

Migrating virtual machines

VMware Storage vMotion is a component of VMware vSphere that provides an intuitive interface for live migration of virtual machine disk and configuration files within or across storage arrays, with no downtime or disruption in service. Storage vMotion relocates virtual machine files from one shared storage location to another with continuous service availability and complete transactional integrity.

Storage vMotion can be used to perform a variety of different functions, including:

• Migrating a VM between datastores on the same array

• Migrating a VM between datastores on different arrays

• Changing the virtual disk format

The Migrate Virtual Machine wizard gives the option to change the format of the virtual disk during migration as shown in Figure 26. As previously discussed, EMC recommends the use of the “Thick Provision Lazy Zeroed” format when using Symmetrix Virtual Provisioning due to the risk of surpassing subscription limits when using the “thin” format on a virtually provisioned Symmetrix device. It is important to note that the “Thick Provision Lazy Zeroed” option is the “zeroedthick” format, not “eagerzeroedthick”. The “thin” format may be used on traditional, non-thin Symmetrix devices, but care needs to be taken to monitor free space on the VMFS datastore as the “thin” disks grow.


Figure 26. Selecting virtual disk format in Migrate Virtual Machine wizard

Changing virtual disk formats using Storage vMotion

Many times it is necessary to modify the allocation format of existing virtual disks to satisfy new storage requirements. When migrating between thin and thick storage environments, the ability to change a disk's format is especially important. VMware provides a built-in mechanism to nondisruptively convert virtual disk formats through Storage vMotion, using the Migrate Virtual Machine wizard. As discussed in the previous section, Storage vMotion is typically used to move virtual machines from one datastore to another and, if desired, change the virtual disk allocation mechanism. In the current release of vSphere 5.x, Storage vMotion does not allow a virtual machine's target VMware file system to be the same as the source. Consequently, in order for the virtual disk format to be changed, the location must change as well: an administrator must migrate the virtual machine from the source volume to a different target volume and then migrate it back to the original datastore using the desired virtual disk allocation format.
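Where a brief outage is acceptable, vmkfstools offers an offline alternative: cloning a powered-off virtual machine's disk to a new file in the desired format. A minimal sketch with hypothetical paths:

# Clone a source disk to a new thin-format copy (the owning VM must be powered off)
vmkfstools -i /vmfs/volumes/Source_DS/vm1/vm1.vmdk /vmfs/volumes/Source_DS/vm1/vm1-thin.vmdk -d thin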

Creating new virtual disks

The Create a Disk wizard provided by the vSphere Client is not impacted by the use of virtually provisioned devices, because a VMFS datastore on thin devices appears to the wizard like any other datastore. As can be seen in Figure 27, there are three options for choosing the allocation mechanism of a new virtual disk:

• Thick Provision Lazy Zeroed. This creates the virtual disk using the “zeroedthick” mechanism and is the recommended method for use with Symmetrix Virtual Provisioning.

• Thick Provision Eager Zeroed. This creates the virtual disk using the “eagerzeroedthick” mechanism and is not typically recommended for use with Symmetrix Virtual Provisioning.


• Thin Provision. This creates the virtual disk using the “thin” mechanism. EMC does not promote the use of this format for Symmetrix Virtual Provisioning without careful capacity monitoring.

Figure 27. Disk format options when creating a new virtual disk

Virtual machine templates

The functionality provided by the use of virtual machine templates makes creating and expanding virtual environments simple and efficient, which is highly desirable in large infrastructures. A template is a golden image created from an already configured virtual machine. A template usually includes a specified operating system, a base set of applications, and configuration of virtual hardware. Virtual machines can be turned into or deployed from a template with great ease. Both of these processes have implications when it comes to Symmetrix Virtual Provisioning.

In the vSphere 5.x client, the wizard to clone a virtual machine to a template offers the user four options on how to format the disk as displayed in Figure 28.


Figure 28. Creating templates from existing virtual machines

As previously discussed, when using Symmetrix Virtual Provisioning the “thin” format is not always recommended when creating a new virtual machine. When creating or cloning to a template, however, the “thin” format is recommended, because templates, unlike virtual machines, never grow in size. If this option is selected, it is recommended that a separate datastore be dedicated to templates, as the contents of such a datastore tend to be relatively static. Consequently, in the worst-case scenario, if the capacity of the thin pool supporting the template datastore is reached or exceeded, the only impact is to the process of creating new templates; existing templates can still be used to provision new virtual machines. This behavior, although inconvenient, has minimal impact on the production environment6.

An alternative to the Clone to Template process is the in-place conversion of an existing virtual machine into a template. This process does not modify the virtual disks associated with the virtual machine. The only modification that occurs is to the configuration file. Therefore, an in-place conversion to a template does not unnecessarily increase allocations from the thin pool.

Based on these observations, EMC recommends the use of virtually provisioned devices for templates. A dedicated template datastore on a thin device should be used in conjunction with the “thin” option of the Clone to Template wizard. Alternatively, the in-place conversion of virtual machines on thin devices can be used.

VMware DRS and Virtual Provisioning

EMC Symmetrix thin devices behave like any other SCSI disk attached to the VMware ESXi server kernel. If a thin device is presented to all nodes of a VMware Distributed Resource Scheduler (DRS) cluster group, vCenter allows live migration of viable virtual machines on the thin device from one node of the cluster to another using VMware vMotion.

6 Using the “Thick Provision Lazy Zeroed” format (zeroedthick) for templates is not recommended, as it simply wastes space on VMFS volumes with allocated but unused tracks. This is especially important when virtual machine templates have large virtual disks.

Automatic migrations of virtual machines occur if the “automatic” policy is enabled for the DRS cluster group. vCenter will make migration recommendations for virtual machines if the VMware DRS cluster group automation level is set to Partially Automated or Manual. These behaviors are no different from those of any other VMware DRS cluster using shared storage on non-thin devices.

VMware DRS controls and manages the placement of virtual machines by analyzing several types of metrics (memory, CPU usage, and so on). With vSphere 5.0, storage capacity and latency have been added to this mix. Storage DRS delivers the aforementioned DRS benefits, such as resource aggregation, automated initial placement, and bottleneck avoidance, to the storage environment. Users can group and manage similar datastores as a single load-balanced storage resource called a datastore cluster, or pod. Storage DRS makes VMDK placement and migration recommendations to avoid I/O and space-utilization bottlenecks on the datastores in the cluster.

The important thing to note about Storage DRS and Symmetrix Virtual Provisioning is that VMware highly recommends that they only be used together when a vStorage API for Storage Awareness (VASA) provider is configured for the array. This API allows the VASA storage provider to inform vCenter and ESXi of the capabilities, topology, and state of the storage, while also allowing users to see this information from the vSphere Client. Since Virtual Provisioning operates differently than traditional, non-thin storage, VMware requires this additional insight into the storage when using Storage DRS with array-based thin provisioning.

A VASA Provider has been added to the array provider functionality in SMI-S version 4.3.1+. More information on the installation and configuration of the Symmetrix VASA provider can be found in the white paper Using VMware vStorage API for Storage Awareness with Symmetrix Storage on Powerlink.

Nondisruptive expansion of VMFS datastores on thin devices

In an ideal world, thin devices presented to a VMware vSphere environment should be configured to address the storage requirements for the lifetime of the infrastructure if possible. This approach is viable since thin devices do not initially require equivalent physical storage on the back end; physical capacity can be added to the thin pool as the environment grows. Therefore, a properly configured and sized thin device should never need expansion. Nevertheless, it is possible to expand a thin device nondisruptively while maintaining existing data on the device. To exploit this functionality, the thin devices need to be originally presented as metavolumes or converted to metavolumes.

• With Enginuity 5874 and Solutions Enabler 7.1 and earlier, these metavolumes must be concatenated.


• With Enginuity 5875 and Solutions Enabler 7.2 and later, striped metavolumes can be expanded. In previous versions of Enginuity, thin devices needed to be unmasked and unmapped from the host to undergo a conversion to a concatenated metavolume. Enginuity 5875 and later allows non-metadevices to be converted to concatenated metadevices online and presented to the host.

In vSphere, VMFS Volume Grow allows administrators to take advantage of expanded thin metavolumes without resorting to using multiple extents.

VMFS Volume Grow

VMFS Volume Grow offers a way to increase the size of a datastore that resides on a VMFS volume. It complements the dynamic LUN expansion capability of Symmetrix storage arrays: if a metavolume is increased in size, VMFS Volume Grow enables the VMFS volume extent to dynamically increase in size as well. With VMFS Volume Grow, the existing partition table is changed and extended to include the additional capacity of the underlying LUN, instead of creating a second partition. The process of increasing the size of the VMFS volume is integrated into the vCenter GUI. Provided that additional capacity is available on the existing LUN, the VMFS volume can be expanded dynamically up to the approximately 64 TB7 limit per LUN. For VMFS volumes that already span multiple extents, VMFS Volume Grow can be used to grow each of those extents up to approximately 64 TB as well. It is highly recommended to avoid using multiple extents, as doing so adds unnecessary complexity to managing the storage back-end information of a given multi-extent VMFS datastore.

Thin metavolumes work similarly to traditional, non-thin metavolumes and as a result use the same processes (form meta, dissolve meta, convert meta) to be configured and altered. The following caveats, however, should be noted:

• The following stipulations apply to all types of thin metavolumes:

– Members must be unmasked, unmapped, and unbound before being added to a thin metavolume.

– Only thin devices can be used to create a thin metavolume; mixing thin and traditional, non-thin devices in a metavolume is not allowed.

• The following apply specifically to creating and expanding concatenated thin metavolumes:

– If using Solutions Enabler 7.1 and Enginuity 5874 and earlier, bound thin devices may be converted to concatenated thin metavolumes, but they cannot be mapped and masked to a host. Any data on the thin device is preserved through the conversion, but since the device is unmapped and unmasked, this is an offline operation. Beginning with Solutions Enabler 7.2 and Enginuity 5875, a thin device can remain bound and mapped/masked to the host during the conversion, allowing for an entirely nondisruptive operation.

7 64 TB is the new limit for VMFS size in ESXi 5.0, for VMFS-5 only. For VMFS-3 the limit is still 2 TB.


– Concatenated thin metavolumes can be expanded nondisruptively regardless of the Solutions Enabler version.

• The following apply specifically to creating and expanding striped thin metavolumes:

– For a striped metavolume, all members must have the same device size.

– Only unbound thin devices can be converted to striped thin metavolumes. The striped thin metavolume must first be formed; then it can be bound and presented to a host.

– If using Solutions Enabler 7.1 and Enginuity 5874 and earlier, bound striped thin metavolumes cannot be expanded online; they must be unbound, unmapped, and unmasked. Since this process causes complete destruction of the data stored on the metavolume, any data on the striped thin metavolume must be migrated off before attempting to expand it, to prevent data loss. Beginning with Solutions Enabler 7.2 and Enginuity 5875, with the assistance of a BCV, a striped thin metavolume can be expanded with no impact to host access or to the data stored on the metavolume.

As an example, Figure 29 shows the process of expanding a concatenated metavolume by adding additional thin devices. The existing data is preserved during the expansion operation.

Figure 29. Expanding a concatenated metavolume using Solutions Enabler 7.0
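The sketch below shows SYMCLI equivalents of these expansion operations. It is a minimal sketch only: the SID, device numbers, and BCV meta head are hypothetical, and the exact syntax should be verified against the Solutions Enabler Array Controls CLI guide for the release in use.

    # Hypothetical SID and device numbers; adjust for your environment.
    # Expand concatenated thin metavolume 0F00 with two unbound,
    # unmapped thin members (data is preserved, as in Figure 29):
    symconfigure -sid 1234 -cmd \
        "add dev 0F01:0F02 to meta 0F00;" commit

    # With Solutions Enabler 7.2 / Enginuity 5875, a striped thin meta
    # can be expanded online by protecting its data with a BCV meta:
    symconfigure -sid 1234 -cmd \
        "add dev 0F03:0F04 to meta 0F00, protect_data=TRUE, bcv_meta_head=0E00;" commit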

After the expansion is complete and a rescan of the ESXi host is performed, VMware vSphere recognizes the additional storage available on the newly expanded thin device. Using VMFS Volume Grow, the VMFS volume can then be expanded to occupy the additional storage without using multiple extents. A command-line sketch of the rescan-and-grow sequence follows.
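On ESXi 5.x, the rescan, and the grow itself, can also be performed from the ESXi command line. This is a sketch under assumptions: the NAA identifier and partition number are placeholders, and depending on the build the partition may first need to be extended (for example with partedUtil), a step the vSphere Client wizard performs automatically.

    # Rescan all adapters so the host sees the new device size:
    esxcli storage core adapter rescan --all

    # Grow the VMFS volume into the added capacity; the device path is
    # passed twice (the extent to grow and the device providing the
    # space). The NAA id below is a placeholder.
    vmkfstools --growfs \
        "/vmfs/devices/disks/naa.600009700001926012345330303046:1" \
        "/vmfs/devices/disks/naa.600009700001926012345330303046:1"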


In vSphere Client, as seen in Figure 30, the datastore vSphere_Expand_DS is full. The figure also shows that the underlying device currently has no extra storage available for VMFS expansion: the datastore consumes approximately 251 GB on a thin metavolume that is 256 GB in size.

Figure 30. VMFS datastore before being grown using VMFS Volume Grow

After expanding the thin metavolume using storage array utilities (SYMCLI or Unisphere for VMAX), the SCSI bus on one of the ESXi servers accessing the datastore should be rescanned. Once the rescan has been executed, the VMFS datastore can be grown. To do so, in vSphere Client, navigate to the datastores listing on any ESXi server with access and select the datastore that needs to be expanded. As shown in Figure 31, clicking the Properties link opens the new window shown in Figure 32.


Figure 31. VMFS properties in vSphere Client

In the Datastore Properties pop-up window (Figure 32), click the Increase button and, in the subsequent window, select the same device on which the datastore currently resides.

Figure 32. Initiating VMFS Volume Grow in vSphere Client

Figure 33 summarizes the disk layout of the VMFS volume. The figure shows that the datastore is currently 256 GB in size and that the underlying device now offers an additional 128 GB of free space.


Figure 33. Confirming available free space in the VMFS Volume Grow wizard

The next step of the wizard offers the option to use only part of the newly available space. For this example, all of the additional space will be merged into the VMFS volume. This is achieved by selecting the Maximize capacity checkbox as shown in Figure 34.

Figure 34. Choosing how much additional capacity for the VMFS Volume Grow operation

The final result is shown in Figure 35. The datastore vSphere_Expand_DS is now 384 GB in size with 132 GB of free space.


Figure 35. Newly expanded VMFS volume


Thin pool space reclamation in VMware environments

In a VMware environment, the zeroing out of virtual disks can lead to unnecessary physical storage allocation on Symmetrix thin pools, a situation that is inherently detrimental to the space-saving benefit of Symmetrix Virtual Provisioning. Introduced in Enginuity 5874 SR1 for the Symmetrix VMAX is the ability to perform space reclamation on thin devices hosting, for example, zeroed-out virtual disks. In VMware vSphere 5.x, there are situations where performing space reclamation on thin devices can be helpful.

Virtual Provisioning space reclamation in VMware vSphere environments

In VMware vSphere 5.x, the ability to choose virtual disk allocation options allows administrators to deploy virtual machines using the “thin”, “zeroedthick”, or “eagerzeroedthick” mechanisms. If a new virtual machine is created with the “eagerzeroedthick” allocation mechanism and is placed on a thin device, the creation results in a complete allocation. Space reclamation can be used in this instance to reclaim the consumed space on the thin pool. However, if the virtual machine is targeted for use with VMware Fault Tolerance, the zeros should not be reclaimed. VMware fully allocates the virtual disks of Fault Tolerance machines purposefully to reduce latency; the zeros are not meant to be removed, and reclaiming them could lead to issues with the Fault Tolerance virtual machine.
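To illustrate the full allocation described above, the ESXi shell command below creates an “eagerzeroedthick” virtual disk; the datastore and file paths are placeholders.

    # Create a 20 GB eagerzeroedthick virtual disk. Writing zeros to
    # every block fully allocates the corresponding thin pool extents,
    # which space reclamation can later return to the pool.
    vmkfstools -c 20G -d eagerzeroedthick \
        /vmfs/volumes/vSphere_Expand_DS/testvm/testvm.vmdk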

Considerations for space reclamation in VMware environments

For vSphere 5.x, there are important considerations that should be evaluated before performing reclaims. These are:

• If a thin device stores multiple virtual machines, a space reclamation function cannot be performed on just one of the virtual disks. It is currently not possible to determine the specific blocks each virtual disk consumes on the thin device and therefore it is not possible to limit a reclaim to a single virtual disk out of many on a given VMware file system8. Consequently, a reclaim should only be performed on a thin device if all virtual machines hosted by the VMFS volume on that thin device can and should be reclaimed. In particular, in vSphere environments, virtual machines that are set up in a Fault Tolerance configuration should not share Symmetrix thin devices (or the VMFS datastores that are hosted on them) with other virtual machines. This will reduce the risk of reclaiming the zeros from a Fault Tolerance virtual machine.

• Zeroing by guest operating systems must be taken into account before performing space reclamation. If the virtual machines’ operating system or its hosted applications zero-out files for a particular reason, any implications of removing those zeros should be considered before reclaiming them.

8 EMC is actively working with VMware on the vStorage initiative that may allow, in the future, for space reclamation to be performed on a single virtual disk.


• The use of Virtual Storage Integrator is highly recommended to easily map virtual machines and their corresponding VMFS volumes to the correct underlying Symmetrix thin device. Double-checking storage mapping information with Virtual Storage Integrator9 eliminates the possibility of performing space reclamation on an incorrect thin device.

Remote migration to thin devices

Beginning with Enginuity 5875, it is possible to use SRDF to migrate data from standard devices in another Symmetrix array to thin devices in a VMAX running Enginuity 5875. Refer to the Enginuity release notes for full interoperability and support information.

If the source array is running Enginuity 5773 or later, the RAs on the target VMAX perform zero data detection on the fly on a per-track basis. On-the-fly zero detection prevents large amounts of contiguous zeros from causing unnecessary allocation in the target thin pool. Zero data detection for SRDF is supported only in adaptive copy mode. Zero data detection is accomplished by examining each track of data as it arrives to determine whether it meets one of the two conditions below (a SYMCLI sketch of such a migration follows the list):

1. Was the never written by host (NWBH) indicator received for the track?

2. Does the track contain all zero data?

Beginning with the Symmetrix VMAX and Enginuity 5874, the NWBH indicator is maintained in the metadata stored in cache for every track on every drive in the array. When this indicator is set for a track, it signifies that a host has never written any data to the track. If Enginuity receives a read request for a track with the NWBH indicator set, it can skip the read I/O to disk and simply return all zeros to the host. If the track is part of a device participating in a local or remote replication session, the NWBH indicator is set on the corresponding track of the target device.

• If the RA determines, by scanning the incoming track’s worth of data, that the track contains all zeros, it records this in the track’s metadata.

• If the 12 tracks that comprise a thin provisioning extent all contain zero data, that extent is reclaimed immediately, maintaining the thin nature of the target device.
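The following is a minimal SYMCLI sketch of such a migration, assuming a pre-built RDF device group named mig_dg that pairs the source standard devices with thin R2 devices on the target VMAX; the group name is hypothetical.

    # Zero data detection for SRDF requires adaptive copy mode;
    # set adaptive copy disk mode on the hypothetical group:
    symrdf -g mig_dg set mode acp_disk

    # Start the copy; the target RAs discard all-zero tracks on the fly:
    symrdf -g mig_dg establish

    # Monitor progress until the pairs reach the Synchronized state:
    symrdf -g mig_dg query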

If SRDF is not an option for an array-based migration, or if the source array is running Enginuity 5671, the Virtual Provisioning space reclamation feature introduced in Enginuity 5874 can be used after the migration completes to reclaim thin provisioning extents that contain all zero data.

9 For more information on the Virtual Storage Integrator please refer to the product guides listed in the “References” section at the end of this document.


VMware Storage vMotion can also be used to migrate to thin devices, either within an array or between separate arrays. Refer to the section Migrating virtual machines for an in-depth discussion of this process.

Local and remote replication between thin devices

Users can perform “thin to thin” replication with Symmetrix thin devices by using standard TimeFinder®, SRDF, and Open Replicator operations.

In addition, thin devices can be used as control devices for hot and cold pull and cold push Open Replicator copy operations. If a push operation is done using a thin device as the source, zeros will be sent for any regions of the thin device that have not been allocated, or that have been allocated but have not been written to.

Open Replicator can also be used to copy data from a traditional, non-thin device to a thin device. If a pull or push operation is initiated from a traditional, non-thin device that targets a thin device, then a portion of the target thin device, equal in size to the reported size of the source volume, will become allocated.
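A hedged sketch of an Open Replicator hot pull into thin control devices is shown below. The pairs file format and session flags are illustrative only and should be checked against the Solutions Enabler documentation for the release in use.

    # pairs.txt lists one control/remote device pair per line, for
    # example (illustrative format only):
    #   symdev=000194901234:0F00 wwn=60060480000190101234533030334142

    # Create, activate, and monitor a hot pull copy session:
    symrcopy create -file pairs.txt -copy -pull -hot
    symrcopy activate -file pairs.txt
    symrcopy query -file pairs.txt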

Multi-virtual snap is now supported with thin devices and TimeFinder/Snap10. This means that as many as 128 snapshots can be created from a single thin volume. The target of a TimeFinder/Snap operation is always a virtual device (VDEV). Though not technically a thin device, VDEVs are cache-only devices that use pooled storage in a similar manner.

TimeFinder/Clone now supports the creation of clone sessions between thin source devices and larger thin target devices in Enginuity 5874.

Note: The size in this case refers to the amount of host-perceived capacity rather than allocated capacity for the thin devices. Allocated capacity following the clone copy will be the same for both the source and target devices.
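A minimal sketch of such a thin-to-thin clone session using a device pairs file follows; the SID and device numbers are placeholders.

    # pairs.txt pairs each thin source with an equal-sized or larger
    # thin target, one pair per line, e.g.:
    #   0F00 0F10

    # Create and activate the clone session with full copy:
    symclone -sid 1234 -file pairs.txt create -copy
    symclone -sid 1234 -file pairs.txt activate
    symclone -sid 1234 -file pairs.txt query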

Local replication between thin and traditional, non-thin devices

With the Enginuity 5874 Q4 service release, TimeFinder/Clone enables users to replicate standard volumes to thin volumes efficiently, ensuring that only written tracks are copied, which reduces capacity requirements and TCO. Thin volumes can also be replicated to traditional, non-thin volumes, further improving mobility into and out of thin pools.

When a clone copy is made between a traditional, non-thin source device and a thin target device, source device tracks that have never been written to by a host are not copied to the target volume. Following the clone copy, any thin device extents that were allocated on the clone target that contain all zeros can be reclaimed and added back to the target thin device’s pool. This is accomplished using Virtual Provisioning space reclamation.

10 TimeFinder/Snap is not supported on the VMAXe/VMAX 10K platform.


Thin pool management

When storage is provisioned from a thin pool to support multiple thin devices, there is usually more "virtual" storage provisioned to hosts than is supported by the underlying data devices. This condition is known as oversubscription, and it is one of the main reasons for using Virtual Provisioning. However, applications using a thin pool may grow rapidly and request more storage capacity from the thin pool than is actually available. This section discusses the steps necessary to avoid out-of-space conditions.

Capacity monitoring

There are several ways to monitor the capacity consumption of Symmetrix thin pools. The Solutions Enabler symcfg monitor command can be used to monitor pool utilization, and the symcfg show -pool command displays the current allocations. There are also event thresholds that can be monitored through the SYMAPI event daemon, thresholds that can be set with Unisphere for VMAX as shown in Figure 36, and SNMP traps that can be sent for monitoring by EMC Ionix™ ControlCenter® or any data center management product.

Figure 36. Setting Pool Utilization Thresholds in Unisphere for VMAX
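Alongside the Unisphere thresholds shown in Figure 36, the symcfg commands mentioned above lend themselves to simple scripted polling. The sketch below is hypothetical: the label matched by awk is an assumption about the symcfg output format, which varies between Solutions Enabler versions and must be adapted.

    #!/bin/sh
    # Hypothetical thin pool utilization check; parsing is illustrative
    # and assumes an integer percentage in the symcfg output.
    SID=1234
    POOL=FC_Pool
    THRESHOLD=80

    # Extract the pool's allocated percentage (matched label is an
    # assumption about the output format):
    PCT=$(symcfg -sid "$SID" show -pool "$POOL" -thin -detail |
          awk '/Pool Utilization/ {gsub("%",""); print $NF}')

    if [ "${PCT:-0}" -ge "$THRESHOLD" ]; then
        echo "WARNING: pool $POOL is ${PCT}% allocated (>= ${THRESHOLD}%)"
    fi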

VMFS volume capacity can be monitored through the use of a vCenter alarm which is shown in Figure 37.


Figure 37. Datastore capacity monitoring in vCenter

System administrators and storage administrators must put processes in place to monitor thin pool capacity and ensure that the pools do not fill up. Thin pools can be dynamically expanded to include more physical capacity without application impact. For DMX arrays running Enginuity 5773, or VMAX arrays running an Enginuity level earlier than 5874 Service Release 1, data devices should be added in groups well before the thin pool approaches a full condition. If devices are added individually, hot spots can be created on the disks, since much of the write activity is suddenly directed to the newly added data device(s) while the other data devices in the pool are full. With the introduction of thin pool write balancing in Enginuity 5874 SR1, data devices can be added individually without introducing hot spots in the pool, though it is best to add them when activity levels on the pool are low to allow the balancing to take place without conflict. A symconfigure sketch of a pool expansion follows.
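The following is a minimal symconfigure sketch of such a pool expansion; the SID, device range, and pool name are placeholders.

    # Add four pre-created data devices to the thin pool and enable
    # them; with Enginuity 5874 SR1 and later, write balancing spreads
    # extents across the new members over time:
    symconfigure -sid 1234 -cmd \
        "add dev 0D00:0D03 to pool FC_Pool, type=thin, member_state=ENABLE;" commit

    # Verify the new enabled capacity:
    symcfg -sid 1234 show -pool FC_Pool -thin -detail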

Viewing thin pool details with the Virtual Storage Integrator

The Storage Viewer feature of Virtual Storage Integrator 5.x offers the ability to view the details of thin pools backing Symmetrix devices presented to the VMware environment11.

When the Storage Pools button is selected, the view will show the storage pool information for each LUN of that datastore. For each LUN, the following fields are shown:

• Array — Shows the serial number of the array on which the LUN resides. If a friendly name is configured for the array, it appears in place of the serial number.

• Device — Shows the device name for the LUN.

11 A connection to a remote or local instance of Solutions Enabler with access to the correct array must be configured to allow this functionality. Consult VSI documentation on Powerlink for more information.


• Capacity pie chart — Displays a graphic depicting the total capacity of the LUN, along with the free and used space.

• Storage Pools — In most cases a LUN is configured from a single storage pool. However, with FAST VP or after a rebinding operation, portions of a LUN can be moved to other storage pools which can better accommodate its performance needs. For that reason, the number of storage pools is displayed, as well as the details for each storage pool. The details for a particular storage pool can be collapsed by selecting the toggle arrow on the heading for that storage pool. The following details are presented for each storage pool:

– Name — The heading bar shows the name of the storage pool.

– Description — Shows the description that the user assigned to the storage pool when it was created.

– Capacity pie chart — Displays a graphic depicting the total physical capacity of the storage pool, along with the free and used space.

– State — Shows the state of the storage pool as Enabled or Disabled.

– RAID — Shows the RAID type of the physical devices backing the thin pool, such as RAID_1 or RAID_5.

– Drive Type — Shows the type of drives used in the storage pool.

– Used By LUN — Shows the total space consumed by this LUN in the storage pool.

– Subscribed — Shows the total capacity of all thin devices bound to the pool, and the ratio (as a percentage) of that capacity to the total enabled physical capacity of the thin pool.

– Max subscription — Shows the maximum subscription percentage allowed by the storage pool, if configured.

An example of this is shown in Figure 38. In this case, a virtual machine is selected in the vCenter inventory and its single virtual disk resides on a VMFS volume on a Symmetrix thin device. The device shown is of type RDF1+TDEV and is bound to a single pool. This screen can be reached in several ways: Figure 38 shows accessing the thin pool details by clicking on a virtual machine, but it can also be reached by clicking the EMC VSI tab after selecting a datastore or an ESXi host in the vSphere Client inventory. Figure 39 shows a VMFS volume residing on a Symmetrix thin device bound to two different pools. For more information, refer to the EMC VSI for VMware vSphere: Storage Viewer Product Guide, available for download on Powerlink.


Figure 38. Viewing thin pool details with the Virtual Storage Integrator 5.0


Figure 39. Viewing pool information for a Symmetrix thin device bound to multiple pools with Virtual Storage Integrator 5

Proactive VMFS capacity exhaustion avoidance with Storage DRS

In addition to analyzing I/O metrics of datastores and performing Storage vMotion operations to maintain balanced latencies, Storage DRS also offers VMFS capacity monitoring. Administrators can group datastores into what VMware calls “datastore clusters” and apply capacity thresholds to that cluster. When that threshold is exceeded on a given VMFS volume, Storage DRS will appropriately move virtual machines to another VMFS volume in the cluster. This is especially useful when “thin” virtual disks are used on Symmetrix thin devices where the risk of running out of space is higher.

Storage DRS does not help prevent situations where the actual physical capacity of the Symmetrix thin pool is exhausted. Storage DRS only helps to prevent out-of-space conditions on a VMFS volume caused by virtual disk growth.

By default, the threshold is set to 80% but can be modified to any percentage between 50 and 100. For example, if a threshold of 80% is configured and a VMFS volume in that datastore cluster reaches 80% full, Storage DRS will try to move virtual machines from that datastore to other datastores in the cluster to bring the used capacity of the source datastore back below 80%. Furthermore, if Storage DRS determines that the difference in utilization between VMFS volumes in the cluster is less than 5%, no migration occurs, since little would be gained by it. Both the threshold and the variance can be configured in the Storage DRS settings wizard as shown in Figure 40.

Storage DRS has two modes: automatic and manual. If Storage DRS is configured to run in automatic mode, it moves virtual machines to other datastores in the datastore cluster when the threshold is surpassed, without user intervention or approval. If Storage DRS is set to manual, the administrator is given virtual machine migration recommendations when thresholds are reached and can either execute or dismiss them.

Figure 40. Storage DRS configuration options

Exhaustion of oversubscribed pools

A VMFS datastore that is deployed on a Symmetrix thin device can detect only the logical size of the thin device. For example, if the array reports the thin device as 2 TB in size while the thin pool behind it provides only 1 TB, the datastore still considers the LUN to be 2 TB. As the datastore fills up, it cannot determine whether the actual amount of physical space is still sufficient for its needs.
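The degree of oversubscription is visible from SYMCLI; the commands below are believed valid for Solutions Enabler 7.x, but the flags should be verified for the release in use, and the SID and pool name are placeholders.

    # Host-perceived versus allocated capacity for thin devices:
    symcfg -sid 1234 list -tdev -gb

    # Pool-level subscription and allocation details:
    symcfg -sid 1234 show -pool FC_Pool -thin -detail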


Before ESXi 5.0, running out of space on a thin pool could cause data loss or corruption on the affected virtual machines. With the introduction of ESXi 5.0 and Enginuity 5875, a new vStorage API for Thin Provisioning, sometimes called “Stun & Resume”, was added. This enables the storage array to notify the ESXi kernel that the thin pool capacity has been exhausted, so that any affected virtual machines requiring more space can be paused to prevent the aforementioned possible data loss and/or corruption.

Behavior of ESX(i) without Out of Space error support

If a VMware environment is running a version of ESX(i) prior to version 5, or the Enginuity level on the target array is lower than 5875, the out-of-space error (Stun & Resume) is not supported. In this case, if a thin pool is oversubscribed and has no available space for new extent allocation, different behaviors can be observed depending on the activity that caused the thin pool capacity to be exceeded. If the capacity is exceeded while a new virtual machine is being deployed through the vCenter wizard, the error message shown in Figure 41 is posted.


Figure 41. Error message displayed by vSphere Client on thin pool full conditions

The behavior of virtual machines residing on a VMFS volume when the thin pool is exhausted depends on a number of factors including the guest operating system, the application running in the virtual machine, and also the format that was utilized to create the virtual disks. The virtual machines behave no differently from the same configuration running in a physical environment. In general the following comments can be made for VMware vSphere environments:

• Virtual machines configured with virtual disks using the “eagerzeroedthick” format continue to operate without any disruption.

• Virtual machines that do not require additional storage allocations continue to operate normally.

• If any virtual machine in the environment is impacted due to lack of additional storage, other virtual machines continue to operate normally as long as those machines do not require additional storage.


• Some of the virtual machines may need to be restarted after additional storage is added to the thin pool. In this case, if the virtual machine hosts an ACID-compliant application (such as relational databases), the application performs a recovery process to achieve a consistent point in time after a restart.

• The VMkernel continues to be responsive as long as the ESXi server is installed on a device with sufficient free storage for critical processes.

Behavior of ESX(i) with Out of Space error support

With the vStorage API for Thin Provisioning introduced in ESXi 5, the host integrates with the physical storage and becomes aware of underlying thin devices and their space usage. Using this level of array thin provisioning integration, an ESXi host can be alerted to space consumption on thin pools in order to avoid data loss and/or data corruption.

When a thin pool hosting VMFS volumes exhausts its capacity, vCenter issues an out-of-space error. The error works in-band, through SCSI sense codes issued directly from the array12: the array returns a tailored response to the ESXi host13, informing it that the pool is full. Consequently, any virtual machines requiring more space are paused or "stunned" until more capacity is added. Furthermore, any Storage vMotion, Clone, or Deploy from Template operation with that device (or any device sharing that pool) as a target will fail with an out-of-space error.
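When diagnosing a suspected out-of-space event, the sense data noted in footnote 13 can be searched for in the vmkernel log from the ESXi shell. The exact message wording differs between ESXi builds, so the pattern below is only a sketch.

    # Look for the Thin Provisioning out-of-space sense data
    # (Sense Key 0x7, ASC 0x27, ASCQ 0x7) in the vmkernel log:
    grep -i "0x7 0x27 0x7" /var/log/vmkernel.log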

IMPORTANT: The out-of-space "stun" behavior prevents data loss or corruption, but by design it also causes virtual machines to be temporarily suspended. Therefore, any applications running on affected virtual machines will not be accessible until additional capacity has been added and the virtual machines have been manually resumed. Even though data unavailability may be preferable to data loss or corruption, it is not an acceptable situation for most production environments, and this feature should be thought of only as a last resort. Ideally, this situation is never reached; it can be prevented through proper and diligent monitoring of thin pool capacity.

Once a virtual machine is paused, additional capacity must be added to the thin pool to which the thin device is bound. After this has been successfully completed, the VMware administrator must manually resume each paused virtual machine affected by the out-of-space condition, as seen in Figure 42. Any virtual machine that does not require additional space (or that uses "eagerzeroedthick" virtual disks) continues to operate normally.

12 Enginuity 5875 or later only.
13 The response is a device status of CHECK CONDITION with Sense Key = 7h, ASC = 27h, ASCQ = 7h.


Figure 42. Resuming a paused virtual machine after an out-of-space condition


Conclusion

EMC Symmetrix Virtual Provisioning provides a simple, non-invasive, and economical way to provide storage for VMware vSphere 5.x environments. The following scenarios are optimal for the use of Virtual Provisioning with VMware vSphere 5:

• Provisioned virtual machines that use the “zeroedthick” (referred to as “thick”) allocation mechanism

• Creation of virtual machine templates on virtually provisioned devices using the “thin” allocation mechanism with the Clone to Template wizard

• Systems where over-allocation of storage is typical

• Systems where rapid growth is expected over time but downtime is limited

The following are situations where Virtual Provisioning may not be optimal:

• Provisioned virtual machines that use the “eagerzeroedthick” allocation mechanism

• VMware vSphere environments that cannot tolerate an occasional response time increase of approximately one millisecond due to writes to uninitialized blocks

• Virtual machines in which data is deleted and re-created rapidly and the underlying data management structure does not allow for reuse of the deleted space

References

The following documents can be found on EMC’s Powerlink website:

White papers

• New Features in EMC Enginuity 5875 for Symmetrix Open Systems Environments

• New Features in EMC Enginuity 5876 for Symmetrix Open Systems Environments

• Storage Tiering for VMware Environments Deployed on EMC Symmetrix VMAX with Enginuity 5876

• Implementing VASA with Symmetrix Storage Arrays

• Using VMware vSphere VAAI with EMC Symmetrix

Technical notes

• Best Practices for Fast, Simple Capacity Allocation with EMC Symmetrix Virtual Provisioning

• Implementing Fully Automated Storage Tiering with Virtual Pools (FAST VP) for EMC Symmetrix VMAX Series Arrays


Feature sheet

• Symmetrix Virtual Provisioning Feature Specification

Product documentation

• EMC Solutions Enabler Symmetrix Array Management CLI Product Guide

• EMC Solutions Enabler Symmetrix Array Controls CLI Product Guide

• EMC Solutions Enabler Symmetrix CLI Command Reference

• VSI for VMware vSphere: Storage Viewer Version Product Guide

• VSI for VMware vSphere: Path Management Version Product Guide

• VSI for VMware vSphere: Symmetrix SRA Utilities Version Product Guide