What's New in Performance in VMware vSphere™ 5.0

TECHNICAL WHITE PAPER
Introduction
VMware vSphere™ 5.0 ("vSphere"), the best VMware® solution for building cloud infrastructures, pushes further ahead in performance and scalability. vSphere 5.0 enables higher consolidation ratios with unequaled performance. It supports the build-out of private and hybrid clouds at even lower operational costs than before.
The following are some of the performance highlights:
VMware vCenter™ Server scalability
• Faster high-availability (HA) configuration times
• Faster failover rates – 60% more virtual machines within the same time
• Lower management operational latencies
• Higher management operations throughput (Ops/min)

Compute
• 32-way vCPU scalability
• 1TB memory support

Storage
• vSphere® Storage I/O Control (Storage I/O Control) now supports NFS – Sets storage quality of service priorities per virtual machine for better access to storage resources for high-priority applications

Network
• vSphere® Network I/O Control (Network I/O Control) – Gives a higher granularity of network load balancing

vSphere vMotion®
• vSphere vMotion® (vMotion) – Multi-network adaptor enablement that contributes to an even faster vMotion
• vSphere Storage vMotion® (Storage vMotion) – Fast, live storage migration with I/O mirroring
VMware vCenter Server and Scalability Enhancements

Virtual Machines per Cluster
Even with the increased number of supported virtual machines per cluster, vSphere 5.0 delivers a higher throughput (Ops/min) of management operations, up to 120% higher, depending on the intensity of the operations. Examples of management operations include virtual machine power-on, virtual machine power-off, virtual machine migrate, virtual machine register, virtual machine unregister, create folder, and so on. Operation latency improvements range from 25% to 75%, depending on the type of operation and the background load.
HA (High Availability)
Significant advances have been made in vSphere 5.0 to reduce configuration time and improve failover performance. A 32-host, 3,200-virtual machine, HA-enabled cluster can be configured 9x faster in vSphere 5.0. In the same amount of time as with previous editions, vSphere 5.0 can fail over 60% more virtual machines. The minimal recovery time, from failure to the first virtual machine restart, has been improved by 44.4%. The average virtual machine failover time has improved by 14%.

In addition, in vSphere 5.0, the default CPU slot size, that is, the spare CPU capacity reserved for each HA-protected virtual machine, is smaller than its value in vSphere 4.1, to enable a higher consolidation ratio. As a result, Distributed Power Management (DPM) might be able to put more hosts into standby mode when cluster utilization is low, leading to more power savings.
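As a back-of-the-envelope illustration of why a smaller slot size raises the consolidation ratio: the number of failover slots a cluster can offer is roughly its spare CPU capacity divided by the per-VM slot size. The numbers below are invented for the example and are not vSphere defaults.

```python
def failover_slots(cluster_cpu_mhz, slot_size_mhz):
    """Slots available = total spare CPU capacity / per-VM slot size."""
    return cluster_cpu_mhz // slot_size_mhz

cluster_capacity = 64_000  # 64 GHz of spare CPU across the cluster (example value)

# A smaller slot size yields more HA failover slots from the same capacity,
# so more VMs can be protected and more hosts can be parked by DPM.
print(failover_slots(cluster_capacity, 256))  # larger, vSphere 4.1-style slot
print(failover_slots(cluster_capacity, 32))   # smaller, vSphere 5.0-style slot
```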
Figure 1. vSphere 5.0 High Availability (virtual machines from the failed server are restarted on the operating servers in the resource pool)
CPU Enhancements

32-vCPU performance and scalability
VMware vSphere 5.0 can now support up to 32 vCPUs. Using a large number of vCPUs in a virtual machine can potentially help large-scale, mission-critical, Tier 1 applications achieve higher performance and throughput and/or faster runtime for certain applications.

The VMware Performance Engineering lab conducted a variety of experiments, including commercial Tier 1 and HPC (high-performance computing) applications. Performance was observed close to native, 92% to 97%, as virtualized applications scaled to 32 vCPUs.
Intel SMT-related CPU scheduler enhancements
Intel SMT architecture exposes two hardware contexts from a single core. The benefit of utilizing two hardware contexts ranges from 10% to 30% in improved application performance, depending on the workload. In vSphere 5.0, the VMware® ESXi™ CPU scheduler's policy is tuned for this type of architecture to balance between maximum throughput and fairness between virtual machines. In addition to the performance optimizations made around SMT CPU scheduling in vSphere 4.1 (running the VMware ESXi hypervisor architecture), the SMT scheduler has been further enhanced in VMware ESXi 5.0 to ensure high efficiency and performance for mission-critical applications.
Virtual NUMA (vNUMA)
Virtual NUMA (vNUMA) exposes the host NUMA topology to the guest operating system, enabling NUMA-aware guest operating systems and applications to make the most efficient use of the underlying hardware's NUMA architecture.

Virtual NUMA, which requires virtual hardware version 8, can provide significant performance benefits for virtualized operating systems and applications that feature NUMA optimization.
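As a rough illustration, the virtual hardware version requirement corresponds to a setting in the virtual machine's .vmx file. The tuning option shown below is given for illustration only; verify the exact option names against the vSphere 5 documentation for your release.

```
# .vmx fragment: vNUMA requires virtual hardware version 8.
virtualHW.version = "8"

# Illustrative tuning knob controlling how many vCPUs share one virtual
# NUMA node; treat the name and value as an assumption to verify.
numa.vcpu.maxPerVirtualNode = "8"
```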
Storage Enhancements

NFS support for Storage I/O Control
vSphere 5.0 Storage I/O Control now supports NFS and network-attached storage (NAS)–based shares on shared datastores. It now provides the following benefits for NFS:

• Dynamically regulates multiple virtual machines' access to shared I/O resources, based on disk shares assigned to the virtual machines.
• Helps isolate performance of latency-sensitive applications that employ smaller (<8KB) random I/O requests. This has been shown to increase performance by as much as 20%.
• Redistributes unutilized resources to those virtual machines that need them, in proportion to the virtual machines' disk shares. This results in a fair allocation of storage resources without any loss in utilization.
• Limits performance fluctuations of a critical workload to a small range during periods of I/O congestion. This results in up to an 11% performance benefit compared to that in an unmanaged scenario without Storage I/O Control.

Storage I/O Control provides a dynamic control mechanism for managing virtual machines' access to I/O resources, Fibre Channel (FC), iSCSI or NFS, in a cluster. It delivers the same performance benefits to NFS datastores as it does to already supported FC or iSCSI datastores. Tests have shown that with the right balance of workloads, Storage I/O Control might improve performance of critical applications by as much as 10%, with a latency improvement per I/O operation of as much as 33%.
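The share-proportional behavior described above can be sketched as a simple allocator. This is a toy model, not the actual ESXi mechanism: each virtual machine receives I/O capacity in proportion to its disk shares, and capacity a VM does not demand is redistributed, still share-proportionally, among the VMs that are not yet satisfied.

```python
def allocate_iops(total_iops, shares, demand):
    """Distribute IOPS in proportion to disk shares, then redistribute
    any capacity a VM does not demand to the VMs that still need it."""
    alloc = {vm: 0.0 for vm in shares}
    active = set(shares)
    remaining = float(total_iops)
    while active and remaining > 1e-9:
        total_shares = sum(shares[vm] for vm in active)
        satisfied = set()
        granted = 0.0
        for vm in active:
            fair = remaining * shares[vm] / total_shares
            take = min(fair, demand[vm] - alloc[vm])
            alloc[vm] += take
            granted += take
            if alloc[vm] >= demand[vm] - 1e-9:
                satisfied.add(vm)
        remaining -= granted
        active -= satisfied
        if not satisfied:  # nobody capped by demand: all capacity was handed out
            break
    return alloc
```

For example, with 4,000 IOPS of capacity, a VM holding half the shares but demanding more than its fair share absorbs the capacity left unused by a lightly loaded VM, in proportion to shares.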
Figure 2. vSphere 5.0 Storage I/O Control (online store, Microsoft Exchange, data mining and print server virtual machines during high I/O from a non-critical application, shown with and without Storage I/O Control)
Memory Management Enhancements

1TB vMem support
In vSphere 5.0, virtual machines can now support vMem up to 1TB. With the new 1TB vMem support, Tier 1 applications consuming large amounts of memory (>255GB), such as in-memory databases, can now be virtualized. Experimental measurements with memory-intensive workloads for virtual machines with up to 1TB vMem demonstrate similar performance when compared to identical physical configurations.

SSD Swap Cache
vSphere 5.0 can be configured to enable VMware ESXi host swapping to a solid-state disk (SSD). In low host-memory-available states (high memory usage), where guest ballooning, transparent page sharing (TPS) and memory compression have not been sufficient to reclaim the needed host memory, hypervisor swapping is used as the last resort to reclaim memory from the virtual machines. vSphere employs three methods to address the limitations of hypervisor swapping and improve its performance:

• Randomly selecting the virtual machine physical pages to be swapped out. This helps mitigate the impact of VMware ESXi pathologically interacting with the guest operating system's memory management heuristics. This has been enabled since early releases of VMware ESX®.
• Memory compression of the virtual machine pages that are targeted by VMware ESXi to be swapped out. This feature, introduced in vSphere 4.1, reduces the number of memory pages that must be swapped out to the disk, while reclaiming host memory effectively at the same time, thereby benefiting application performance.
• vSphere 5.0 now enables users to configure a swap cache on the SSD. VMware ESXi 5.0 will then use this swap cache to store the swapped-out pages instead of sending them to the regular and slower hypervisor swap file on the disk. Upon the next access to a page in the swap cache, the page will be retrieved quickly from the cache and then removed from the swap cache to free up space. Because SSD read latencies are an order of magnitude faster than typical disk access latencies, this significantly reduces swap-in latencies and greatly improves application performance in high memory overcommitment scenarios.
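The swap-cache behavior in the last bullet can be modeled in a few lines. This is a toy sketch, not the ESXi implementation: swapped-out pages land in the SSD cache while space allows, and a page read back from the cache is immediately evicted to free space for future swap-outs.

```python
class SwapCache:
    """Toy model of the SSD swap cache: swapped-out pages go to the SSD
    tier when space allows, otherwise to the regular swap file on disk;
    a page read back from the SSD cache is removed to free space."""

    def __init__(self, capacity_pages):
        self.capacity = capacity_pages
        self.ssd = {}   # page_id -> contents (fast tier)
        self.disk = {}  # page_id -> contents (slow tier)

    def swap_out(self, page_id, contents):
        if len(self.ssd) < self.capacity:
            self.ssd[page_id] = contents
        else:
            self.disk[page_id] = contents

    def swap_in(self, page_id):
        if page_id in self.ssd:
            # Fast path: read from SSD, then evict to free cache space.
            return self.ssd.pop(page_id), "ssd"
        return self.disk.pop(page_id), "disk"
```

Evicting a page on swap-in keeps the limited SSD space available for the pages the hypervisor needs to swap out next, which matches the behavior described above.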
Figure 3. SSD Swap Cache (main memory, page table and flash-based swap cache)
Network Enhancements

Network I/O Control
Network I/O Control (NIOC) enables the allocation of network bandwidth to resource pools. In vSphere 5.0, users can create new resource pools to associate with port groups and specify 802.1p tags, whereas in vSphere 4.1, only predefined resource pools could be used for bandwidth allocation. Different virtual machines can now be in different resource pools and be given a higher or lower share of bandwidth than the others. The predefined resource groups include vMotion, NFS, iSCSI, vSphere® Fault Tolerance (Fault Tolerance), virtual machine, and management.

splitRxMode
vSphere 5.0 introduces a new way of doing network packet receive processing in the VMkernel. In previous releases of vSphere, all receive packet processing for a network queue was done in a single context, which might be shared among various virtual machines. In cases where there is a higher density of virtual machines per network queue, it is possible for the context to become resource constrained. Multicast is one such workload where multiple receiver virtual machines share one network queue. vSphere 5.0 includes a new mechanism to split the cost of receive packet processing across multiple contexts. SplitRxMode can specify, for each virtual machine, whether receive packet processing should be in the network queue context or in a separate context.
This mode, called splitRxMode, can be enabled on a per-vNIC basis by editing the VMX file of the virtual machine with ethernetX.emuRxMode = "1" for the Ethernet device.

NOTE: This configuration is available only with vmxnet3. Once the option is enabled, the packet processing for the virtual machine is handled by a separate context. This improves multicast performance significantly. However, there is an overhead associated with splitting the cost among various PCPUs. This cost can be attributed to higher CPU utilization per packet processed, so VMware recommends enabling this option only for those workloads, such as multicast, that are known to benefit from this feature.
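For example, a VMX fragment enabling the option for the first vNIC might look like this ("ethernet0" is an example device name; use the index of the vmxnet3 vNIC whose receive processing should run in a separate context):

```
# Per-vNIC setting in the virtual machine's .vmx file (vmxnet3 only).
ethernet0.emuRxMode = "1"
```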
VMware performance labs conducted a performance study for multiple receivers with the sender transmitting multicast packets at 16,000 packets per second. As each additional receiver was added, there was an increase in packets handled by the networking context. After the load of the context reached its maximum, there were significant packet losses. Enabling splitRxMode on these virtual machines decreased the load on the context, thereby increasing the number of multicast receivers handled without significant drops. Figure 4 shows that there is no noticeable difference in loss rate for the setup until 24 virtual machines are powered on. With more than 24 virtual machines powered on, there is an almost 10-25% packet loss. Enabling splitRxMode for the virtual machines reduces the loss rate to less than 0.01%.
Figure 4. Multicast Performance (percent packet loss versus number of receivers, default versus splitRxMode)
vSphere DirectPath I/O
vSphere® DirectPath I/O (DirectPath I/O) enables guest operating systems to directly access network devices. DirectPath I/O for vSphere 5.0 has been enhanced to allow vMotion of a virtual machine containing DirectPath I/O network adaptors on the Cisco Unified Computing System (UCS) platform. On such platforms, with proper setup and ports available for passthrough, vSphere 5.0 can transition the device from one that is paravirtualized to one that is directly accessed, and the other way around. While both modes can sustain high throughput (beyond 10Gbps), direct access can additionally save CPU cycles in workloads with high packet rates (for example, approximately greater than 50,000 packets per second).

Because DirectPath I/O does not support some vSphere features, such as memory overcommitment and NIOC, VMware recommends using DirectPath I/O only for workloads with very high packet rates, where saving CPU cycles can be critical to achieving the desired performance.
Host Power Management Enhancements
VMware® Host Power Management (VMware HPM) in vSphere 5.0 provides power savings at the host level. With vSphere 5.0, the default power management policy is "balanced." The balanced policy uses an algorithm that exploits a processor's P-states. Workloads operating under low CPU loads can expect to see some power savings (depending entirely on the workload's CPU usage characteristics) with minimal performance loss (dependent on application CPU usage characteristics). VMware strongly advises performance-sensitive users to conduct controlled experiments to observe the power savings and performance of their application with vSphere 5.0. If the performance loss is unacceptable, switch VMware ESX to the "high performance" mode. Most workloads operating under low loads can see up to 5% in power savings when using balanced mode.
VMware vMotion

Multi-network adaptor enablement for vMotion
The new performance enhancements in vSphere 5.0 enable vMotion to effectively saturate a 10GbE network adaptor's bandwidth during the migration, significantly reducing vMotion transfer times. In addition, VMware ESX 5.0 adds a multi-network adaptor feature that enables users to employ multiple network adaptors for vMotion. The VMkernel will transparently load-balance the vMotion traffic over all the vMotion-enabled vmknics in an effort to saturate all of the connections. In fact, even when there is a single vMotion, the VMkernel uses all the available network adaptors to spread the vMotion traffic.
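The transparent spreading of a single vMotion stream across all vMotion-enabled NICs can be pictured as round-robin distribution of the migration stream's chunks. This is a rough model for intuition only; the actual VMkernel balancing policy is not documented here.

```python
from itertools import cycle

def spread_vmotion_traffic(chunks, nics):
    """Round-robin chunks of a migration stream across all
    vMotion-enabled NICs, so every link carries a share of the load
    even for a single migration."""
    assignment = {nic: [] for nic in nics}
    for nic, chunk in zip(cycle(nics), chunks):
        assignment[nic].append(chunk)
    return assignment
```

With two NICs and six chunks, each NIC carries three chunks, which is why a single large-memory VM migration can still use the aggregate bandwidth of both links.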
Figure 5 summarizes the vMotion performance enhancements in vSphere 5.0 over vSphere 4.1. The test environment modeled both single-instance SQL Server virtual machine (a virtual machine configured with four vCPUs and 16GB memory, running an OLTP workload) and multiple-instance SQL Server virtual machine (each virtual machine configured with four vCPUs and 16GB memory, running an OLTP workload) deployment scenarios for the following configurations:
1. vSphere 4.1: Source/Destination hosts configured with a single 10GbE port for vMotion
2. vSphere 5.0: Source/Destination hosts configured with a single 10GbE port for vMotion
3. vSphere 5.0: Source/Destination hosts configured with two 10GbE ports for vMotion
Figure 5. vMotion Performance in vSphere 4.1 and vSphere 5.0 (time in seconds for one and two virtual machines: vSphere 4.1, vSphere 5, and vSphere 5 with 2 NICs)
The figure clearly shows the enhancements made in vSphere 5.0 to reduce the total elapsed time in both single-virtual machine and multiple-instance SQL Server virtual machine deployment scenarios. The average drop in vMotion transfer time was a little over 30% in vSphere 5.0 when using a single 10GbE network adaptor. When using two 10GbE network adaptors for vMotion (enabled by the new multi-network adaptor feature in vSphere 5.0), the total migration time was reduced dramatically: a factor of 2.3x improvement over the vSphere 4.1 result. The figure also illustrates that the multi-network adaptor feature transparently load-balances the vMotion traffic onto multiple network adaptors, even in the case where a single virtual machine is subjected to vMotion. This feature can be especially handy when a virtual machine is configured with a large amount of memory.
Metro vMotion
vSphere 5.0 introduces a new latency-aware "Metro vMotion" feature that provides better performance over long-latency networks and also increases the round-trip latency limit for vMotion networks from 5 milliseconds to 10 milliseconds. Previously, vMotion was supported only on networks with round-trip latencies of up to 5 milliseconds.
VMware Storage vMotion (Live Migration) Enhancements
vSphere 5.0 now supports live migration for storage with the use of I/O mirroring. In previous releases, live migration moved memory and device state only for a virtual machine, limiting migration to hosts with identical shared storage. Live storage migration overcomes this limitation by enabling the movement of virtual disks spanning volumes and distance. This enables greater virtual machine mobility, zero-downtime maintenance and upgrades of storage elements, and automatic storage load balancing.

VMware has been able to greatly reduce the time it takes for the storage to migrate, starting with the use of snapshots in Virtual Infrastructure 3.5, moving to dirty block tracking in vSphere 4.0, and now to I/O mirroring in vSphere 5.0.

Live storage migration has the following advantages:

• Zero-downtime maintenance – Enables customers to move virtual machines on and off storage volumes, upgrade storage arrays, perform file system upgrades, and service hardware without powering down virtual machines.
• Manual and automatic storage load balancing – Customers can continue to manually load-balance their vSphere clusters to improve storage performance, or they can take advantage of automatic storage load balancing (vSphere® Storage DRS), which is now available in vSphere 5.0.
• Increased virtual machine mobility – Virtual machines are no longer pinned to the storage array they are instantiated on. Live migration works by copying the memory and device state of a virtual machine from one host to another with negligible virtual machine downtime.
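The I/O mirroring approach can be sketched as follows. This is a toy model, not the ESXi implementation: a single bulk pass copies blocks in the background, while any guest write issued during the migration is applied to both source and destination disks, so mirrored blocks never need a dirty-block re-copy pass.

```python
class MirroredMigration:
    """Toy model of live storage migration with I/O mirroring: one bulk
    copy pass moves existing blocks while guest writes issued during the
    copy are mirrored to both source and destination."""

    def __init__(self, source_blocks):
        self.src = dict(source_blocks)
        self.dst = {}
        self.copied = set()

    def copy_next(self, block_id):
        """One step of the background bulk copy."""
        self.dst[block_id] = self.src[block_id]
        self.copied.add(block_id)

    def guest_write(self, block_id, data):
        """Guest I/O during migration is mirrored to both disks."""
        self.src[block_id] = data
        self.dst[block_id] = data
        self.copied.add(block_id)  # a mirrored block needs no further copy
```

Contrast this with dirty block tracking (vSphere 4.0), where writes during the copy dirty their blocks and force additional copy passes; mirroring keeps the destination in sync continuously, so the migration converges in a single pass.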
Figure 6. vSphere 5.0 Storage vMotion
Tier 1 Application Performance

Microsoft Exchange Server 2010
Microsoft Exchange Server 2010 significantly benefits from the superior performance and scalability provided by vSphere 5.0. Microsoft Exchange Load Generator 2010 benchmark results indicate that vSphere 5.0 improves Microsoft Exchange transaction response times and vMotion and Storage vMotion migration times when compared with the previous vSphere release.

The following are examples of performance and scalability gains observed in comparison to vSphere 4.1:

• 6% reduction in SendMail 95th-percentile transaction latencies with a four-vCPU Exchange Mailbox Server virtual machine
• 13% reduction in SendMail 95th-percentile transaction latencies with an eight-vCPU Exchange Mailbox Server virtual machine
• 33% reduction in vMotion migration time for a four-vCPU Exchange Mailbox Server virtual machine
• 11% reduction in Storage vMotion migration time for a 350GB Exchange database VMDK
Zimbra Mail Server Performance
The full-featured, robust and open-source Zimbra Collaboration Suite (ZCS) includes email, calendaring, and collaboration software. Zimbra is an important part of the VMware cloud computing portfolio, as software as a service (SaaS) and as a prepackaged virtual appliance for SMBs.

For vSphere 5.0, virtualized ZCS performs within 95% of native performance, scaling up to eight vCPUs. Up to eight 4-vCPU ZCS virtual machines supporting 32,000 mail users were scaled out on a single host with just a 10% increase in SendMail latencies compared to a single virtual machine.

In recent 4-vCPU experiments, ZCS had less than half the CPU consumption of Exchange 2010 with the same number of users, and mail user provisioning time for ZCS was 10x faster.
Figure 7. Zimbra CPU Efficiency over Exchange 2010 (% CPU utilization: Zimbra MTA+LDAP versus Exchange 2010 CAS+HUB, Zimbra database versus Exchange 2010 indexing, and mailbox)
VMware, Inc. 3401 Hillview Avenue Palo Alto CA 94304 USA Tel 877-486-9273 Fax 650-427-5001 www.vmware.com
Copyright © 2011 VMware, Inc. All rights reserved. This product is protected by U.S. and international copyright and intellectual property laws. VMware products are covered by one or more patents listed at http://www.vmware.com/go/patents. VMware is a registered trademark or trademark of VMware, Inc. in the United States and/or other jurisdictions. All other marks and names mentioned herein may be trademarks of their respective companies. Item No: VMW-WP-PERFORMANCE-USLET-101
Summary
VMware innovations continue to ensure that VMware vSphere 5.0 pushes the envelope of performance and scalability. The numerous performance enhancements in vSphere 5.0 enable organizations to get even more out of their virtual infrastructure and further reinforce the role of VMware as industry leader in virtualization.
vSphere 5.0 represents advances in performance, to ensure that even the most resource-intensive applications, such as large databases and Microsoft Exchange email systems, can run on private, hybrid and public clouds powered by vSphere.
References (www.vmware.com)
• Understanding Memory Resource Management in VMware vSphere 5
• Managing Performance Variance of Applications Using Storage I/O Control in VMware vSphere 5
• Performance Best Practices Guide for VMware vSphere 5
• Zimbra Mail Performance on VMware vSphere 5
• Host Power Management in VMware vSphere 5