STO1297BE
Stretched Clusters or VMware Site Recovery Manager? We Say Both!
Jeff Hunter, VMware, @jhuntervmware
GS Khalsa, VMware, @gurusimran
#VMworld #STO1297BE
VMworld 2017 Content: Not for publication or distribution
Disclaimer
• This presentation may contain product features that are currently under development.
• This overview of new technology represents no commitment from VMware to deliver these features in any generally available product.
• Features are subject to change, and must not be included in contracts, purchase orders, or sales agreements of any kind.
• Technical feasibility and market demand will affect final delivery.
• Pricing and packaging for any new technologies or features discussed or presented have not been determined.
What are VMware Metro Storage Clusters - vMSC
– Applications are protected against localized site failures
– Proactive live migrations of applications for continuous availability
– Automated recovery of workloads after host failures
– Single Management console for all workloads
[Diagram: a stretched vSphere cluster on stretched storage spans Site A (active) and Site B (active), less than 100 km apart, under one vCenter; HA restart and live migration operate across the two sites.]
What you need for an Active-Active datacenter model
• Stretched Storage solution
– Stretched L2 Network
– Storage clustering solution that supports distributed data mirroring
– Read/write access to the same volumes from both sites
– Some tie-break mechanism to avoid split-brain
– Examples: VMware vSAN, EMC VPLEX, IBM SVC, NetApp MetroCluster
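The tie-break requirement in the list above can be illustrated with a small majority-vote sketch. It assumes three voters (the two data sites plus a tie-breaker) with hypothetical names "A", "B", and "T"; the products named above each implement their own quorum logic, which this does not attempt to reproduce.

```python
# Minimal majority-vote sketch. The voter names "A", "B", and "T" (tie-breaker)
# are hypothetical and do not correspond to any vendor's implementation.
VOTERS = frozenset({"A", "B", "T"})


def writable_partition(partitions):
    """Return the partition holding a strict majority of votes, or None.

    `partitions` is a list of sets; each set holds the voters that can still
    reach each other. Only a strict majority keeps read/write access, so two
    isolated halves can never both accept writes (no split-brain).
    """
    seen = set()
    for group in partitions:
        if not group <= VOTERS or seen & group:
            raise ValueError("partitions must be disjoint subsets of VOTERS")
        seen |= group
    for group in partitions:
        if len(group) * 2 > len(VOTERS):
            return set(group)
    return None


# Inter-site link down and the tie-breaker sides with site A: A keeps the volumes.
print(writable_partition([{"A", "T"}, {"B"}]))
# Tie-breaker unreachable but the sites still see each other: both keep running.
print(writable_partition([{"A", "B"}, {"T"}]))
# Everything isolated: nobody has a majority, so nobody writes.
print(writable_partition([{"A"}, {"B"}, {"T"}]))
```

When the inter-site link fails and both sites can still reach the tie-breaker, the tie-breaker has to pick one side; in this sketch that choice is expressed by which partition "T" is placed in.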
[Diagram: stretched volumes span Site 1 and Site 2; each site has storage controllers in front of its backend arrays.]
Keep in mind
• HA
• DRS
• SDRS
• Maintenance/Management
• Cost
• Complexity
VMware HCI addresses typical challenges of Stretched Clusters
Typical Stretched Cluster
• Complex configuration and management
• Expensive to deploy and operate
• Creates silos of specialized hardware
[Diagram: separate compute and storage tiers at each site, with vSphere plus separate storage and replication management.]
vSAN Stretched Cluster
• Simplifies management: policy-based, app-level
• Lowers TCO: server economics
• Eliminates silos: your choice of hardware
[Diagram: compute with server-attached storage at each site, running vSphere and vSAN.]
vSAN stretched cluster spans two sites
• Standard x86 hardware
• Up to 15 hosts per site
• 10Gbps recommended
• L2 recommended
– L3 supported
• 5ms or less RTT required
• No multicast in vSAN 6.6.x
• Automated failover using HA
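The limits in the list above lend themselves to a simple pre-deployment check. The sketch below is only an illustration: the `SiteLinkPlan` class and `check_plan` function are hypothetical names, and the thresholds are copied from the bullets, not read from any VMware tool.

```python
from dataclasses import dataclass

MAX_HOSTS_PER_SITE = 15    # "Up to 15 hosts per site"
MAX_INTERSITE_RTT_MS = 5   # "5ms or less RTT required"
RECOMMENDED_GBPS = 10      # "10Gbps recommended"


@dataclass
class SiteLinkPlan:
    """Hypothetical description of a planned two-site stretched cluster."""
    hosts_site_a: int
    hosts_site_b: int
    rtt_ms: float
    bandwidth_gbps: float


def check_plan(plan):
    """Return (errors, warnings): hard limits are errors, recommendations are warnings."""
    errors, warnings = [], []
    for name, hosts in (("A", plan.hosts_site_a), ("B", plan.hosts_site_b)):
        if hosts > MAX_HOSTS_PER_SITE:
            errors.append(f"site {name}: {hosts} hosts exceeds {MAX_HOSTS_PER_SITE}")
    if plan.rtt_ms > MAX_INTERSITE_RTT_MS:
        errors.append(f"inter-site RTT {plan.rtt_ms} ms exceeds {MAX_INTERSITE_RTT_MS} ms")
    if plan.bandwidth_gbps < RECOMMENDED_GBPS:
        warnings.append(f"{plan.bandwidth_gbps} Gbps is below the recommended {RECOMMENDED_GBPS} Gbps")
    return errors, warnings


errors, warnings = check_plan(SiteLinkPlan(8, 8, rtt_ms=3.2, bandwidth_gbps=10))
print(errors, warnings)
```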
[Diagram: vSphere with vSAN spanning two sites, with a third site for the witness.]
vSAN stretched cluster witness host deployed at third site
• ESXi virtual appliance
• Deployed from OVA
• Stores metadata only
• Third site latency requirements
– 1-10 hosts/site <200ms RTT
– 11-15 hosts/site <100ms RTT
• 100Mbps, L3 to main sites
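The two latency tiers above can be written as a small lookup. This is a sketch with hypothetical function names; the numbers are taken directly from the bullets.

```python
def witness_rtt_limit_ms(hosts_per_site):
    """Return the witness-link RTT limit (ms) for a given number of hosts per site."""
    if not 1 <= hosts_per_site <= 15:
        raise ValueError("the slide covers 1 to 15 hosts per site")
    # 1-10 hosts/site: <200 ms RTT; 11-15 hosts/site: <100 ms RTT
    return 200 if hosts_per_site <= 10 else 100


def witness_link_ok(hosts_per_site, rtt_ms):
    """True if the measured RTT to the witness site is under the limit."""
    return rtt_ms < witness_rtt_limit_ms(hosts_per_site)


print(witness_link_ok(4, 150))   # small sites tolerate a slower witness link
print(witness_link_ok(12, 150))  # larger sites need a faster one
```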
[Diagram: vSphere with vSAN spanning two main sites linked at 5 ms RTT over 10GbE; the witness host runs as a virtual ESXi (vESXi) VM at the third site.]
Demo – Configuring a vSAN Stretched Cluster is easy
Stretched cluster component placement
[Diagram: a VM and its VMDK, with cache and capacity devices at the Preferred Site and the Secondary Site and a witness at the Tertiary Site.]
Stretched cluster local failure protection
• Redundancy against host failure and site failure
• If site fails, vSAN maintains local redundancy in surviving site
• No change in stretched cluster configuration steps
• Minimized I/O traffic across sites
– Local reads, local resyncs
– Single inter-site writes
New in vSAN 6.6
Stretched Cluster Local Failure Protection – RAID-5
[Diagram: a VM and its VMDK with RAID-5 local failure protection at the Preferred Site and Secondary Site, and a witness at the Tertiary Site.]
New in vSAN 6.6
Stretched Cluster Local Failure Protection – RAID-1
[Diagram: a VM and its VMDK with RAID-1 local failure protection at the Preferred Site and Secondary Site, and a witness at the Tertiary Site.]
New in vSAN 6.6
vSAN stretched cluster
[Diagram: eight VMs running across the Preferred Site and Secondary Site, with the witness at the Tertiary Site.]
Network partition or site failure
[Diagram: Preferred Site and Secondary Site with four VMs each and the witness at the Tertiary Site; X marks show the lost site or network links.]
vSphere HA restarts VMs at the other site
[Diagram: with the failure (X) still present, HA restart brings the affected VMs up at the other site; witness at the Tertiary Site.]
Inter-site network disconnected
[Diagram: X on the link between the Preferred Site and Secondary Site; eight VMs across the two sites; witness at the Tertiary Site.]
vSphere HA powers off VMs at Secondary Site
[Diagram: inter-site link still down (X); HA powers off the VMs at the Secondary Site.]
vSphere HA restarts VMs at Preferred Site
[Diagram: inter-site link still down (X); HA restarts those VMs at the Preferred Site.]
Witness host network disconnected
[Diagram: X on the network link to the witness at the Tertiary Site; eight VMs across the Preferred Site and Secondary Site.]
Witness host leaves cluster, VMs continue to run
[Diagram: the witness host leaves the cluster (X); the eight VMs keep running across the Preferred Site and Secondary Site.]
Witness host is offline, VMs continue to run
[Diagram: witness at the Tertiary Site is offline (X); the eight VMs keep running across the Preferred Site and Secondary Site.]
Changing the witness host is easy
New in vSAN 6.6
vSAN stretched cluster
• Easy and less expensive to deploy
• Lowers complexity with standard x86 server hardware
• Provides local and cross-site availability
However…
• Main data sites must be no more than 5 ms RTT apart
• Large disasters, power grid failure, etc. can take both sites offline
Stretched clusters require a disaster recovery solution
What is Site Recovery Manager?
• Centralized recovery plans for thousands of VMs
• Non-disruptive recovery testing
• Automated DR workflows
• Integrated with the VMware product stack
• Eliminates complexity and risk of manual processes
• Enables fast and highly predictable RTOs
• Provides policy-driven DR control for any virtualized app
[Diagram: a Production Site and a Recovery Site, each with servers, vSphere, vCenter Server, and Site Recovery Manager; data moves between the sites by array-based replication or vSphere Replication.]
Transforms Management of Recovery And Migration Plans
From Complex Runbooks…
❖ Weeks or months to set up recovery plans
❖ Unstructured and manual, which makes them error-prone
❖ Quickly fall out of sync with infrastructure changes
…to Simple Recovery Plans
✓ Simple setup in minutes
✓ Software-defined workflows eliminate errors
✓ Simple to update and keep in sync with changes
SRM Demo
What is vSphere Replication?
• Per-VM, host-based replication
• Network-efficient by replicating only changed data
• Included with vSphere Essentials Plus and higher editions
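The idea of replicating only changed data can be shown with a toy block-level diff. This is a generic illustration, not a description of how vSphere Replication tracks changes internally; the block size and function names are hypothetical.

```python
BLOCK_SIZE = 4  # bytes; deliberately tiny so the example is easy to follow


def split_blocks(data, size=BLOCK_SIZE):
    """Cut a byte string into fixed-size blocks (the last one may be shorter)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def changed_blocks(old, new):
    """Return {block_index: new_bytes} for every block that differs or is new."""
    old_blocks, new_blocks = split_blocks(old), split_blocks(new)
    delta = {}
    for i, block in enumerate(new_blocks):
        if i >= len(old_blocks) or old_blocks[i] != block:
            delta[i] = block
    return delta


def apply_delta(replica, delta, new_length):
    """Apply a delta to the replica and trim it to the source's new length."""
    blocks = split_blocks(replica)
    for i, block in delta.items():
        while len(blocks) <= i:
            blocks.append(b"")
        blocks[i] = block
    return b"".join(blocks)[:new_length]


source_v1 = b"AAAABBBBCCCC"
source_v2 = b"AAAAXXXXCCCC"
delta = changed_blocks(source_v1, source_v2)
print(delta)  # only the middle block needs to cross the network
print(apply_delta(source_v1, delta, len(source_v2)) == source_v2)
```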
[Diagram: vSphere Replication, managed through vCenter Server, copies VMs (OS, app, data) between SAN and vSAN storage; only changes are replicated.]
vSphere Replication Features and Benefits
• Easy virtual appliance deployment
– Minimal time investment, no hardware procurement
• Integration with vSphere Web Client
– Ease of administration and monitoring
• Protect any VM regardless of OS and apps
– One solution reduces complexity and cost
• Flexible recovery point objective (RPO) policies – 5 mins to 24 hours
– Supports a wide variety of business requirements
• Compatible with vSAN, SAN, NAS, local storage
– One solution reduces complexity and cost
• Quick recovery for individual VMs
– Reduces downtime, minimizes resource requirements
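The RPO range quoted above (5 minutes to 24 hours) can be expressed as a simple policy check. The function names below are hypothetical and the sketch only encodes the range stated on the slide.

```python
from datetime import timedelta

MIN_RPO = timedelta(minutes=5)   # lower bound quoted on the slide
MAX_RPO = timedelta(hours=24)    # upper bound quoted on the slide


def validate_rpo(rpo):
    """Return the RPO unchanged if it falls inside the supported range."""
    if not MIN_RPO <= rpo <= MAX_RPO:
        raise ValueError(f"RPO {rpo} is outside {MIN_RPO} to {MAX_RPO}")
    return rpo


def rpo_met(last_sync_age, rpo):
    """True if the newest replica is no older than the configured RPO."""
    return last_sync_age <= validate_rpo(rpo)


print(rpo_met(timedelta(minutes=12), timedelta(minutes=15)))
print(rpo_met(timedelta(hours=2), timedelta(hours=1)))
```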
• End-to-end network compression
– Further reduces bandwidth requirements
• Network traffic isolation
– Control bandwidth, improve performance, security
• Windows VSS and Linux file system quiescing
– Increased reliability when recovering VMs
[Diagram: separate management and replication networks across the LAN and WAN.]
SRM or vMSC?
• What are the Benefits and Challenges?
Challenges…
… with vMSC
• No Orchestration
• No non-disruptive testability
• DRS and HA are not site aware
• Availability of a single vCenter managing both sites
… with SRM
• No option for moving VMs without downtime
• Requires multiple vCenters
• Previously, no integration with stretched clusters
Benefits…
… with vMSC
• Requires a single vCenter
• Zero downtime disaster avoidance
• No need to change IP addresses
• Zero RPO with potential RTO of minutes
… with SRM
• Orchestrated Recovery
• Non-disruptive Testing
• No requirement for stretched L2
SRM + Stretched Storage
• The best of both
• Stretched storage with SRM
• Orchestrate cross-VC vMotion
• Unified plan for DR, migrations
• Zero-downtime migrations for planned maintenance, disaster avoidance
• Ability to non-disruptively test
• Enhanced reliability
• Lower RTO
[Diagram: Site A and Site B, each with vSphere, vCenter, and Site Recovery Manager, share stretched storage.]
SRM integration with Active-Active storage solutions
Stretched Storage: active-active datacenters with SRM
[Diagram: Site 1 and Site 2 each run a vSphere cluster of ESXi hosts with their own vCenter and SRM; Volume A has full R/W access at both sites over stretched networks.]
Stretched Storage, scenario 1: local host failures in one site
[Diagram: Site 1 and Site 2 vSphere clusters of ESXi hosts, each with vCenter and SRM, sharing Volume A with full R/W access at both sites over stretched networks.]
• HA handles local host failures
Stretched Storage, scenario 2: disaster avoidance at one site
[Diagram: Site 1 and Site 2 vSphere clusters of ESXi hosts, each with vCenter and SRM, sharing Volume A with full R/W access at both sites over stretched networks.]
• Execute SRM Planned Migration with vMotion BEFORE disaster
• SRM invokes vMotion according to the VM priorities and dependencies in the Recovery Plan
SRM gives you the “easy button” for handling planned downtime in Active-Active datacenters
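Ordering VMs by priority and dependencies, as described for this scenario, is essentially a priority-aware topological sort. The sketch below is a simplified model, not SRM's actual scheduler: the function name and VM names are hypothetical, lower numbers are assumed to mean higher priority, and priority is only used to choose among VMs whose dependencies are already satisfied.

```python
import heapq


def migration_order(priorities, depends_on):
    """Return VM names in an order that respects dependencies, then priority.

    priorities: {vm: int}, lower number moves earlier (an assumption here).
    depends_on: {vm: set of VMs that must be moved before it}.
    """
    remaining = {vm: set(depends_on.get(vm, ())) for vm in priorities}
    for vm, deps in remaining.items():
        unknown = deps - priorities.keys()
        if unknown:
            raise ValueError(f"{vm} depends on unknown VMs: {sorted(unknown)}")

    # Start with every VM that has nothing to wait for.
    ready = [(priorities[vm], vm) for vm, deps in remaining.items() if not deps]
    heapq.heapify(ready)

    order = []
    while ready:
        _, vm = heapq.heappop(ready)
        order.append(vm)
        # Release any VM whose last outstanding dependency was just moved.
        for other, deps in remaining.items():
            if vm in deps:
                deps.remove(vm)
                if not deps:
                    heapq.heappush(ready, (priorities[other], other))

    if len(order) != len(priorities):
        raise ValueError("dependency cycle detected")
    return order


plan = migration_order(
    priorities={"db": 1, "app": 2, "web": 3, "batch": 1},
    depends_on={"app": {"db"}, "web": {"app"}},
)
print(plan)
```

A dependency always wins over priority in this model: a priority-1 VM that depends on a priority-2 VM is still moved second.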
Stretched Storage, scenario 3: faster recovery from unplanned failures
[Diagram: Site 1 and Site 2 vSphere clusters of ESXi hosts, each with vCenter and SRM, sharing Volume A with full R/W access at both sites over stretched networks.]
• Use SRM’s test capability to prepare for site-wide failures
• In a site failure scenario, execute the SRM Recovery Plan at the other active site
• Can be automatically triggered by an external system
SRM enables a reliable, testable, low-RTO solution for handling unplanned failures in Active-Active datacenters
Stretched Storage Planned Migration
vMotion in Stretched Storage Recovery Plan
Stretched Storage in DR Mode
High availability with stretched cluster, automated DR with SRM
[Diagram: a vSAN stretched cluster spans the Preferred Site and Secondary Site and runs the VMs along with vCenter (VC), vSphere Replication (VR), and SRM; the Tertiary Site runs its own VC, VR, and SRM on vSAN and hosts the witness host (WH); Site Recovery Manager links the stretched cluster to the Tertiary Site.]
Demo – vSAN Stretched Cluster and SRM
Questions