RETHINKING STORAGE FOR VIRTUALIZED INFRASTRUCTURES
Alexey Polkovnikov(EMCCAe, EMCDSA, EMCISA)Senior System ArchitectACCESS Europe GmbH
2014 EMC Proven Professional Knowledge Sharing 2
Table of Contents
Introduction
Traditional storage systems: File Systems, LUNs
Virtualized infrastructure storage challenge
Storage response to the virtualized infrastructure
VVols overview: core idea, history, current state
VVol concept history
Core Concepts
How granularity is making a difference
Snapshots
Clones
Deduplication
Replication
VVols mechanics basics
Out-of-band VVols management
Interacting parties
Binding a Virtual Volume to a Protocol Endpoint
Creating a Virtual Volume
Storage Policy-Based Management
Software-Defined Storage with VVols
Summary
Appendix A
Table of Figures
Figure 1 - vSphere Storage APIs structure
Figure 2 - VVol concept history
Figure 3 - Protocol Endpoint's role
Figure 4 - Storage Container's role
Figure 5 - VVol-level snapshots
Figure 6 - VVol-level cloning
Figure 7 - VVol-level deduplication
Figure 8 - VVol-level replication
Figure 9 - Basic VVol interaction parties
Figure 10 - Service-oriented provisioning with SPBM
Figure 11 - SPBM connects requirements with resources
Disclaimer: The views, processes, or methodologies published in this article are those of the
author. They do not necessarily reflect EMC Corporation’s views, processes, or methodologies.
Introduction
This article presents the cutting-edge concept of "VM-granular storage". I will keep things simple and avoid deep technical detail in order to stay focused on the underlying meaning of this modern storage paradigm, which will likely dominate the IT industry in the near future. Use cases demonstrate the infrastructure benefits of the new paradigm.
Traditional storage systems: File Systems, LUNs
Traditional storage systems mainly use LUNs and file systems as their provisioning and management units: LUNs for block-level access, file systems for file-level access, and both for unified storage. In protocol terms, for iSCSI and FC the unit of granularity is the LUN; for NFS and CIFS it is the file.
Virtualized infrastructure storage challenge
Infrastructure has changed dramatically, becoming increasingly virtualized. Today, a virtual server is the default deployment option (over a physical one) in many enterprise deployment policies. Industry reports and estimates indicate that virtual server deployments surpassed physical server deployments within the last 4-5 years.
So what? Infrastructure virtualization is not a secret (and neither is storage virtualization). The point is that traditional storage arrays, very effective when working with physical servers, are less so when the servers are virtualized.
In a traditional (physical) infrastructure, storage arrays are smart enough to understand a single host's I/O pattern when a LUN is dedicated to one host, or when the LUN is shared but the hosts can be tracked by some identifier (e.g. a World Wide Name).
Caching techniques that improve the array's external performance can then be used effectively to pre-fetch data from the internal drives. Note: flash-based drives are still too expensive to hold a significant share of an array's capacity, so arrays still contain a lot of mechanical hard drives.
These performance improvements fall short with virtualized hosts: I/O from different VMs arrives interleaved over the same path, and the storage cannot tell the streams apart (an effect also known as the "I/O blender").
Data protection, copying, and movement features are also greatly impacted when the same LUNs are shared by numerous virtualized hosts; LUN-level locking becomes a significant issue in such cases.
Storage response to the virtualized infrastructure
Of course, the limitations and drawbacks of traditional storage in virtualized infrastructures have already been addressed by storage and virtualization vendors. A trend in the storage industry called "VM-aware storage" makes it possible for the storage and the hypervisor to talk to each other. This helps the storage handle VM workloads more effectively, and lets hypervisors offload operations to the storage level, avoiding unnecessary hypervisor-level data movement and inefficient "software" operations.
VM-aware storage is a good practical step forward in storage/virtualization integration. However, it still does not give the virtualization side and the storage side a common language.
This is where the emerging concept of VM-granular storage comes into play: a rethinking of storage arrays that takes the virtualized infrastructure seriously and makes VMs first-class citizens of the storage system.
VVols overview: core idea, history, current state
As VMware is currently the clear virtualization leader (according to Gartner's 2013 market research; see Appendix A for details), this article describes VMware's approach to VM-granular storage: VVols (Virtual Volumes, VM Volumes). At the time of writing, VVols exists only as a "pre-technology": technology previews have been shown, but there are no express commitments from the vendor. Very likely, however, this is the future of storage for the virtualized world.
The core idea of VVols is to make storage serve virtualization infrastructures at the level of virtualization's native concepts, such as the virtual disk (VMDK), instead of traditional storage management concepts or effective-yet-workaround constructs like VMFS datastores.
This changes the existing storage paradigm of standard LUNs and file systems as the universal underlying units; Virtual Volumes would generally be based on resource pools rather than on LUNs and file systems. This aligns storage-side management with the reality on the virtualized host side.
VVol concept history
VVols technology is not easy to deliver in a single piece; VMware and its storage partners presented it as technology previews and updates at VMworld 2011, 2012, and 2013 (see Appendix A for details).
Successful implementation of such a holistic, deep concept requires a staged approach, as well as support from the storage vendors' side of the equation.
Currently, communication between VMware vSphere and storage is achieved through a set of
interfaces, called VMware vSphere Storage APIs shown in Figure 1.
Figure 1: vSphere Storage APIs structure
I will touch on the two vSphere Storage APIs that relate to this topic: the array integration API and the storage awareness API.
vSphere API for Array Integration (VAAI) was introduced by VMware several years ago. VAAI's purpose was to offload some operations from ESX hypervisor servers to the storage arrays (where the array vendor supports the feature through a vendor plug-in for VMware vSphere). This greatly improved the performance of hardware-offloaded operations such as cloning, copying, and zeroing. When the storage supports offloading, the VMware vSphere VMkernel Data Mover skips its "software data movement" and uses "hardware data movement" for tasks such as inter-datastore VMDK copies inside the same array.
After that, VMware vSphere APIs for Storage Awareness (VASA) appeared. VASA aims to give the virtual environment insight into the storage system. What kind of insight? Information that enables vSphere-based monitoring, troubleshooting, and provisioning. This was a step toward making the virtual infrastructure more storage-aware, and it continued the shift toward VM-based concepts. However, these interfaces were reporting-only; no "active storage management" was possible.
(Figure 1 groups the vSphere Storage APIs into four families: Storage Awareness, Array Integration, Multi-pathing, and Data Protection.)
Finally, VASA 2.0 will introduce full support for the new concept called "Virtual Volumes". VASA will be extended with powerful basic abstractions: Storage Containers, Virtual Volumes (VVols), and Protocol Endpoints. Storage Policy-Based Management (SPBM) will be introduced to enable creating policies on the virtualization side (vSphere) that map to the Storage Capability Profiles exposed by the storage, and to manage the storage based on those policies. When will VVols be released? There are no commitments from VMware or its storage partners; however, according to the update from VMworld 2013, the technology is expected in 2014-2015.
Figure 2: VVol concept history. The timeline runs from vSphere 4.1 (VAAI, 2010) through vSphere 5.0 (VASA 1.0 and a VAAI update, 2011), vSphere 5.1 (2012), and vSphere 5.5 (VAAI update, 2013), to the expected VASA 2.0 with VVols (2014/15); on the storage side, VASA/VAAI support has come from EMC, IBM, NetApp, Dell, and HP, all of which are expected to support VASA 2.0/VVols.
Core Concepts
The VVols architecture defines a model consisting of the following main concepts:
1. Storage Containers – entities that host the Virtual Volumes (VVols can be created only inside such containers). Storage Containers are managed by the storage administrator.
2. Virtual Volumes (VVols) – virtual disks that reside inside the Storage Containers and are controlled by the virtualization side. The virtualization administrator creates them using the vSphere Client; ESXi hosts direct I/O to them through the Protocol Endpoints on behalf of the virtual machines running on those hypervisors.
3. Protocol Endpoints (PEs) – protocol-dependent connection points through which VVol I/O flows. This is a powerful I/O-demultiplexing concept that lets the numerous VVols residing on a single array be accessed by ESX hosts through an on-demand data path. To the virtualization side, a PE appears either as a LUN (for block protocols such as iSCSI) or as a mount point (for NFS).
Storage Containers and Protocol Endpoints are orthogonal concepts in this model. A Storage Container's role is to be a logical grouping entity for VVols (for example, different Storage Containers can serve different tenants: departments or business units of an enterprise, service provider customers, etc.). A PE, on the other hand, can be seen as a fault domain; it is not meant to serve VVol/host isolation purposes.
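To make the demultiplexing role concrete, here is a minimal sketch (hypothetical Python; the class and method names are illustrative and do not come from the actual VASA specification): many VVols are bound to one PE, and the array routes each in-band request by the VVol identifier it carries.

```python
# Sketch of the Protocol Endpoint's demultiplexing role (hypothetical;
# illustrative names only, not a real VASA or array API).

class ProtocolEndpoint:
    def __init__(self, pe_id):
        self.pe_id = pe_id    # appears to the host as a LUN or an NFS mount point
        self.bound = {}       # vvol_id -> backing volume

    def bind(self, vvol_id, backing):
        """The storage system binds a VVol to this PE (set up out-of-band)."""
        self.bound[vvol_id] = backing

    def submit_io(self, vvol_id, payload):
        """In-band I/O: the PE routes each request by its VVol identifier."""
        self.bound[vvol_id].append(payload)

# A single PE carries I/O for many VVols belonging to different VMs.
pe = ProtocolEndpoint("PE-1")
vm1_disk, vm2_disk = [], []
pe.bind("vvol-vm1-disk0", vm1_disk)
pe.bind("vvol-vm2-disk0", vm2_disk)
pe.submit_io("vvol-vm1-disk0", "write-A")
pe.submit_io("vvol-vm2-disk0", "write-B")
```

The point of the sketch is that the data path stays a single, scalable connection point while the per-VVol identity is preserved end to end.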
Figure 3: Protocol Endpoint's role
(The figure shows VMs on two hypervisor hosts whose I/O to multiple VVols on the storage array is demultiplexed through a single Protocol Endpoint.)
Figure 4: Storage Container's role
In terms of numbers, a single storage system could support a few million Virtual Volumes residing in a few thousand Storage Containers, with all of these VVols accessible through just a pair of Protocol Endpoints (for example, one for file access and one for block access).
How granularity is making a difference
Let's consider some examples of what VM-level storage granularity and hypervisor/storage integration mean in practice for data services. Snapshot, cloning, replication, and deduplication use cases are presented below.
Snapshots
Having separate VVols for each VM's disks makes it possible to do snapshots in a smart way:
1. Only the disks of the VMs in question are copied (not the whole datastore or LUN).
2. The operation itself is offloaded to the storage, which makes it far more efficient than "software" snapshots that involve the ESX server.
3. Retention policies and schedules are managed on a per-VM basis.
Figure 5: VVol-level snapshots
VM configuration data and swap, which are also stored on separate VVols (not shown here, for simplicity), are not snapped, so no unnecessary data gets into the snapshot.
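The per-VM scheduling and retention point can be sketched as follows (hypothetical Python; the VVol class and its policy field are illustrative assumptions, not a vSphere or array API):

```python
# Sketch of per-VVol snapshot retention (hypothetical; illustrative names only).
import copy

class VVol:
    def __init__(self, name, keep_snapshots):
        self.name = name
        self.keep_snapshots = keep_snapshots  # per-VVol retention policy
        self.blocks = []
        self.snapshots = []

    def snapshot(self):
        """Array-side snapshot of just this VVol, honoring its own policy."""
        self.snapshots.append(copy.deepcopy(self.blocks))
        # Per-VVol retention: keep only the newest N snapshots.
        self.snapshots = self.snapshots[-self.keep_snapshots:]

# Only this VM's disk is snapped; its schedule and retention are its own.
db_disk = VVol("db-vm-disk0", keep_snapshots=2)
for block in ("a", "b", "c"):
    db_disk.blocks.append(block)
    db_disk.snapshot()
```

Each VVol carries its own policy, so one VM's aggressive snapshot schedule never drags along the disks of its datastore neighbors.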
Clones
Much like snapshots, VM disk cloning benefits from having separate manageable VVols corresponding to the disks. When a VM and its disks need to be cloned, the operation is hardware-offloaded (and optimized on the storage side), and only the disks of the cloned VM are copied.
Figure 6: VVol-level cloning
VM configuration data and swap, which are also stored on separate VVols (not shown here, for simplicity), are not cloned, so no unnecessary data gets copied.
Deduplication
Virtual volumes for different VMs have different space-efficiency and performance requirements. With VVols, each volume can be deduplicated separately, as its own deduplication domain, or skipped entirely so that performance-sensitive volumes pay no deduplication penalty.
Figure 7: VVol-level deduplication
Replication
Replication also benefits from this level of granularity: there is no need to replicate everything within a LUN, or to copy things like swap files.
Figure 8: VVol-level replication
VM swap, which is stored on a separate VVol (not shown here, for simplicity), is not replicated, so no unnecessary data crosses the wire.
VVols mechanics basics
In order to better understand the idea of VVols in action, let’s look at some VVols mechanics
basics.
Out-of-band VVols management
As opposed to in-band management, which travels the data path (the path the I/O takes), all VVols-related management operations, including binding, go out-of-band through the VASA interface exposed to vCenter Server and the ESXi servers.
The VASA interface is implemented by the VASA Provider, the part of the storage firmware responsible for the VVols-related functionality on the storage side. The VASA Provider is exposed as a web service for the virtualization side's storage-awareness requests.
Interacting parties
Let's take a closer look at the basic interacting parties in VVol technology.
Figure 9: Basic VVol interaction parties
The VASA Provider is the management interface (the control-path entry point) for the virtualization side. It communicates with ESXi hosts and vCenter Servers to enable "active storage management" from the virtualization side.
The data path (where I/O flows) runs between the ESXi host and the storage. To address a specific VVol, I/O is demultiplexed by the Protocol Endpoints assigned at VVol binding. VVols can be accessed through block and file PEs, using SCSI or NFS.
Binding a Virtual Volume to a Protocol Endpoint
How are Protocol Endpoints bound to Virtual Volumes? ESXi or vCenter Server asks the storage system (the VASA Provider, to be specific) to perform a Bind operation for the Virtual Volume. The storage system replies with the identifier of the Protocol Endpoint that will be used for I/O to that Virtual Volume; which PE to return is up to the storage system.
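The handshake can be sketched like this (hypothetical Python; a real bind is a VASA operation, and the round-robin PE choice here is just one possible array-side policy):

```python
# Sketch of the VVol bind handshake (hypothetical; not the real VASA API).

class StorageSystem:
    def __init__(self, protocol_endpoints):
        self.pes = protocol_endpoints  # PE identifiers the array can hand out
        self.bindings = {}             # vvol_id -> pe_id

    def bind(self, vvol_id):
        """Reply with the PE the host must use for I/O to this VVol.

        Which PE to return is entirely up to the storage system; a simple
        round-robin choice stands in for a real placement policy here.
        """
        pe_id = self.pes[len(self.bindings) % len(self.pes)]
        self.bindings[vvol_id] = pe_id
        return pe_id

array = StorageSystem(["PE-block-1", "PE-block-2"])
pe_for_disk = array.bind("vvol-vm1-disk0")   # host now directs I/O via this PE
```

After the reply, the host opens the data path through the returned PE; the bind itself never touches the data path.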
Creating a Virtual Volume
Virtual Volumes are created by ESXi hosts or vCenter Server using the VASA API; the operation is out-of-band. Creating a new VVol requires selecting an existing Storage Container where the VVol will reside. This selection step is actually one of the most promising parts of the VVols story. Storage provisioning, the VVols way, is based on two concepts touched on earlier: the Storage Capability Profile and the Policy Profile. These two concepts are the enablers of Storage Policy-Based Management.
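A stub of the out-of-band creation flow might look like this (hypothetical Python; the VasaProvider class is a stand-in for the vendor's VASA web service, and all names here are illustrative assumptions):

```python
# Stub of out-of-band VVol creation (hypothetical; not the real VASA API).

class VasaProvider:
    """Stand-in for the VASA Provider in the storage firmware."""
    def __init__(self, container_names):
        self.containers = {name: {} for name in container_names}

    def create_vvol(self, container, vvol_name, size_gb):
        # A VVol can only be created inside an existing Storage Container.
        if container not in self.containers:
            raise ValueError("unknown Storage Container: " + container)
        self.containers[container][vvol_name] = {"size_gb": size_gb}
        return vvol_name

# vCenter/ESXi talk to the provider over the control path; no I/O is involved.
provider = VasaProvider(["gold-container"])
provider.create_vvol("gold-container", "vvol-vm1-disk0", size_gb=40)
```

Note that creation and binding are separate steps: a VVol exists in its container before any Protocol Endpoint is assigned for I/O.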
Storage Policy-Based Management
The goal of Storage Policy-Based Management (SPBM) is to change how storage provisioning has historically worked: the storage admin and the virtualization admin first discuss the application requirements; the storage admin then creates and configures storage pools and exposes the LUNs to the hypervisor host; after that, the VMs use the storage resources allocated.
SPBM-style storage provisioning is much more "service-oriented":
1. "Storage Service Providers" publish their offerings in a "Storage Service Catalog". Service Level Agreements (SLAs) for the different types of "Storage Services" are clearly stated and are to be guaranteed by the "Storage Service Provider".
2. A "Storage Consumer" states its requirements for a "Storage Service" as a Service Level Objective (SLO).
3. An automated mechanism maps the consumer requirements to the provider offerings.
The Storage Capability Profile and the Policy Profile are used to formulate the offerings and requirements statements mentioned above. The profile definitions are:
Storage Capability Profiles – definitions used to publish a set of specific Storage Container capabilities, covering primary storage aspects such as technology, storage efficiency, and recovery. These are the quantified elements of the SLA, for example: "availability" = 7, "replication" = true, "thin" = false, "maxRPO" = 15 minutes. These profiles come from the storage side.
Policy Profiles – collections of policies that define the target ranges of capability values comprising the SLO (quantified as well). Policy Profiles are the VM/application-side means of stating storage provisioning requirements.
Instances of these two profile types are matched against each other to map the VM/application requirements to the available storage resources.
So who does what? The storage admin creates Storage Containers that advertise one or more capability profiles describing the capabilities of the storage resources the containers are built on (i.e. storage pools). The VM admin assigns policies to the VMs so that the available storage resources are provisioned optimally for each VM.
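The matching step can be sketched as follows (hypothetical Python; the profile fields reuse the quantified SLA elements mentioned above, but the function and field names are illustrative, not an SPBM API):

```python
# Sketch of SPBM-style profile matching (hypothetical; illustrative names).

# Storage Capability Profiles: what each Storage Container advertises.
capability_profiles = {
    "gold-container":   {"availability": 7, "replication": True,  "maxRPO_min": 15},
    "bronze-container": {"availability": 5, "replication": False, "maxRPO_min": 60},
}

def compliant_containers(policy):
    """Return the containers whose capabilities satisfy every requirement."""
    matches = []
    for container, caps in capability_profiles.items():
        ok = (caps["availability"] >= policy["min_availability"]
              and (caps["replication"] or not policy["needs_replication"])
              and caps["maxRPO_min"] <= policy["max_rpo_min"])
        if ok:
            matches.append(container)
    return matches

# A database VM's Policy Profile (its SLO): high availability, replicated.
db_policy = {"min_availability": 6, "needs_replication": True, "max_rpo_min": 30}
# Only the gold container is compliant for this policy.
```

In a real SPBM flow this compliance check is what drives both initial placement and ongoing alarming when a container falls out of compliance.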
Figure 10: Service-oriented provisioning with SPBM
(The figure shows the storage service side, where the storage admin publishes Storage Containers with capability profiles ("I have this"), and the storage consumer side, where the virtualization admin attaches policies to VMs ("I need this").)
Figure 11: SPBM connects requirements with resources
(The figure shows SPBM in vSphere matching VM policies against the profiles of Storage Containers backed by the array's storage pools.)
SPBM is part of a broader strategy, Policy-Based Management, which makes it possible to manage different types of resources, including compute, network, and storage, in order to make provisioning decisions and dynamically reallocate resources. SPBM and Virtual Volumes also enable VMware's approach to Software-Defined Storage.
Software-Defined Storage with VVols
Software-Defined Storage (SDS) is another big trend in the storage industry. As with other software-defined approaches (Software-Defined Networking, the Software-Defined Datacenter, etc.), the main idea of SDS is to make the hardware as standard, compliant, and commoditized as possible, while controlling and provisioning the resources from a software control plane.
The software control plane in this case overarches different pools built from resources with the same SLA. Each hardware unit still owns the hardware-offloaded operations: those that require performance and an optimal implementation aware of the hardware internals.
Virtual Volumes and PBM make it possible to implement such a control plane off-the-box (the "box" being a storage array). Service-oriented, SPBM-based storage management and provisioning is a great way to unify the capabilities of heterogeneous storage systems and to select the storage resources (no matter which storage system owns them) that optimally fit the application requirements.
The other part of the SDS story is automation. Storage automation is one of the pillars of modern virtualized and cloud infrastructures. Since provisioning a virtual machine is largely a matter of provisioning its storage, the operation should be automated. Policy-Based Management with Virtual Volumes is a great approach to control-plane automation: it not only uses capabilities and policies to make placement decisions, but also supports alarming and reallocation when the advertised SLAs are not being met.
Please use the links in Appendix A to get more information on VMware’s strategy for SDS.
Summary
Virtualized infrastructures are becoming standard in the world of modern IT. The storage part of the
infrastructure should handle the virtualized hosts’ needs “natively” in order to effectively perform in
the new reality. Virtual Volumes technology is an emerging response to the deep underlying need for alignment between the storage and the rest of the infrastructure.
There is a significant chance that we'll see VVols across storage systems from all major vendors. There is also a great chance that this technology, backed by the storage industry, will become a keystone of Software-Defined Storage implementations.
Appendix A
1. Gartner Magic Quadrant for x86 Server Virtualization Infrastructure:
http://www.gartner.com/technology/reprints.do?id=1-1GJA88J&ct=130628
2. eWeek - Enterprises Thinking Virtualization First, IDC Says: http://www.eweek.com/c/a/IT-
Infrastructure/Enterprises-Thinking-Virtualization-First-IDC-Says-896006/
3. IDC Worldwide Quarterly Server Virtualization Tracker:
http://www.idc.com/tracker/showproductinfo.jsp?prod_id=39
4. 451 Research's TheInfoPro service reports that average x86 server virtualization levels
have reached 51%: http://www.serverwatch.com/server-trends/survey-51-of-x86-servers-
now-virtualized.html
5. The I/O Blender - PureStorage Blog: http://www.purestorage.com/blog/the-io-blende/
6. VMware vSphere Storage APIs – Array Integration (VAAI) Whitepaper:
http://www.vmware.com/resources/techresources/10337
7. VMworld 2013 - VM-aware Storage for the Software Defined Datacenter:
http://www.vmworld.com/docs/DOC-8671
8. VMworld 2013: VVol update with EMC VNX:
http://www.youtube.com/watch?v=Wnjf230LYTA
9. vSphere VVol and EMC VMAX Tech Preview - VMworld 2012:
http://www.youtube.com/watch?v=yngHLnanq3s
10. vSphere VVol and EMC VPLEX Tech Preview - VMworld 2012:
http://www.youtube.com/watch?v=cx4f9pe4jpA
11. VMware vSphere Blog - Virtual Volumes (VVols) Tech Preview:
http://blogs.vmware.com/vsphere/2012/10/virtual-volumes-vvols-tech-preview-with-
video.html
12. VMworld 2011: VSP3205 - VMware vStorage APIs for VM and Application Granular Data
Management: http://www.youtube.com/watch?v=elttTnltgLI
13. VMware Storage Futures by Cormac Hogan (VMUG):
http://blogs.vmware.com/vsphere/2013/05/vmware-storage-futures-video-courtesy-of-vmug-
italia.html
14. Tintri's Brandon Salmon demonstrates how Tintri VMstore supports VMware VVols:
http://www.youtube.com/watch?v=sPEw59JKLu4
15. EMC ViPR Software-Defined Storage:
http://www.youtube.com/playlist?list=PLbssOJyyvHuW1TAxMzVd8PF4aQPZk5Mb6
16. VMware Office of the CTO - VMware’s Strategy for Software-Defined Storage:
http://cto.vmware.com/vmwares-strategy-for-software-defined-storage/
17. Chuck's Blog (Chuck Hollis, Chief Strategist, VMware SAS BU) - The VMware View Of
Software Defined Storage: http://chucksblog.emc.com/chucks_blog/2013/08/the-vmware-
view-of-software-defined-storage.html
18. VMware vSphere Blog, What is Software Defined Storage? A VMware TMM Perspective -
https://blogs.vmware.com/vsphere/2012/11/what-is-software-defined-storage-a-vmware-
tmm-perspective.html
EMC believes the information in this publication is accurate as of its publication date. The
information is subject to change without notice.
THE INFORMATION IN THIS PUBLICATION IS PROVIDED “AS IS.” EMC CORPORATION
MAKES NO REPRESENTATIONS OR WARRANTIES OF ANY KIND WITH RESPECT TO THE
INFORMATION IN THIS PUBLICATION, AND SPECIFICALLY DISCLAIMS IMPLIED
WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
Use, copying, and distribution of any EMC software described in this publication requires an
applicable software license.