Black-box and Gray-box Strategies for Virtual Machine Migration
University of Massachusetts, Amherst • Department of Computer Science
Timothy Wood, Prashant Shenoy, Arun Venkataramani, and Mazin Yousif*
University of Massachusetts Amherst
*Intel, Portland
Enterprise Data Centers
Data centers are composed of:
Large clusters of servers
Network attached storage devices
Multiple applications per server
Shared hosting environment
Multi-tier, may span multiple servers
Allocates resources to meet Service Level Agreements (SLAs)
Virtualization increasingly common
Benefits of Virtualization
Run multiple applications on one server
Each application runs in its own virtual machine
Maintains isolation
Provides security
Rapidly adjust resource allocations
CPU priority, memory allocation
VM migration
“Transparent” to application
No downtime, but incurs overhead
How can we use virtualization to more efficiently utilize data center resources?
Data Center Workloads
Web applications see highly dynamic workloads
Multi-time-scale variations
Transient spikes and flash crowds
[Figures: workload traces. Axes: arrivals per min vs. time (days); request rate (req/min) vs. time (hrs)]
How can we provision resources to meet these changing demands?
Provisioning Methods
Hotspots form if resource demand exceeds provisioned capacity
Static over-provisioning
Allocate for peak load
Wastes resources
Not suitable for dynamic workloads
Difficult to predict peak resource requirements
Dynamic provisioning
Adjust based on workload
Often done manually
Becoming easier with virtualization
Problem Statement
How can we automatically detect and eliminate hotspots in data center environments?
Use VM migration and dynamic resource allocation!
Outline
Introduction & Motivation
System Overview
When? How much? And Where to?
Implementation and Evaluation
Conclusions
Research Challenges
Sandpiper: automatically detect and mitigate hotspots through virtual machine migration
When to migrate?
Where to move to?
How much of each resource to allocate?
How much information needed to make decisions?
A migratory bird
Sandpiper Architecture
Nucleus
Monitor resources
Report to control plane
One per server
Control Plane
Centralized server
Hotspot Detector: detect when a hotspot occurs
Profiling Engine: decide how much to allocate
Migration Manager: determine where to migrate
[Figure: Sandpiper architecture. A Nucleus runs alongside VM 1, VM 2, … on each physical machine (PM 1 … PM N); the Control Plane contains the Hotspot Detector, Profiling Engine, and Migration Manager.]
PM = Physical Machine, VM = Virtual Machine
Black-Box and Gray-Box
Black-box: only data from outside the VM
Completely OS and application agnostic
Gray-box: access to OS stats and application logs
Request level data can improve detection and profiling
Not always feasible – customer may control OS
[Figure: gray box exposing application logs and OS statistics; black box marked "???"]
Is black-box sufficient?
What do we gain from gray-box data?
Outline
Introduction & Motivation
System Overview
When? How much? And Where to?
Implementation and Evaluation
Conclusions
Black-box Monitoring
Xen uses a “Driver Domain”
Special VM with network and disk drivers
Nucleus runs here
CPU: scheduler statistics
Network: Linux device information
Memory: detect swapping from disk I/O
Only know when performance is poor
[Figure: hypervisor hosting the Driver Domain, which runs the Nucleus, alongside a VM]
Hotspot Detection – When?
Resource Thresholds
Potential hotspot if utilization exceeds threshold
Only trigger for sustained overload
Must be overloaded for k out of n measurements
Autoregressive Time Series Model
Use historical data to predict future values
Minimize impact of transient spikes
[Figures: three utilization-over-time plots, labeled "Not overloaded" and "Hotspot Detected!"]
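The two detection rules on this slide can be combined in a few lines. The following is a minimal sketch, not Sandpiper's actual code: the class and function names are hypothetical, the values `k=3` and `n=5` are illustrative assumptions, and the AR(1) form (mean plus a lag-1 correlation term) is one common way to realize an "autoregressive time series model". The 0.75 threshold mirrors the 75% CPU trigger used in the evaluation slides.

```python
from collections import deque
from statistics import mean


def ar1_predict(history):
    """Predict the next value with a simple AR(1) model fitted to `history`.

    Assumed form: next = mu + phi * (last - mu), where phi is the
    lag-1 autocorrelation estimated from the same window.
    """
    mu = mean(history)
    num = sum((history[i] - mu) * (history[i + 1] - mu)
              for i in range(len(history) - 1))
    den = sum((x - mu) ** 2 for x in history)
    phi = num / den if den else 0.0
    return mu + phi * (history[-1] - mu)


class HotspotDetector:
    """Flags a hotspot only for sustained, predicted-to-continue overload."""

    def __init__(self, threshold=0.75, k=3, n=5):
        self.threshold = threshold
        self.k = k
        self.window = deque(maxlen=n)  # the n most recent measurements

    def observe(self, utilization):
        """Record one measurement; return True if a hotspot is flagged."""
        self.window.append(utilization)
        over = sum(u > self.threshold for u in self.window)
        if over < self.k:
            return False  # fewer than k of the last n samples are overloaded
        # Also require the predicted next value to exceed the threshold.
        return ar1_predict(list(self.window)) > self.threshold
```

A single spike such as 0.2, 0.9, 0.2 never reaches k overloaded samples, so it is ignored, while a run of samples above the threshold is flagged once k of them have accumulated.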
Resource Profiling – How much?
How much of each resource to give a VM
Create distribution from time series
Provision to meet peaks of recent workload
What to do if utilization is at 100%?
Gray-box
Request level knowledge can help
Can use application models to determine requirements
[Figure: utilization profile built from historical data; probability vs. % utilization]
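As a rough illustration of the black-box profiling step, the sketch below builds a utilization distribution from a time series and reads a peak estimate off its tail. The function names, the 10-point bin width, and the choice of the 95th percentile as "peak" are assumptions made for this example; the slides only say to provision for the peaks of the recent workload.

```python
import math


def utilization_profile(samples, bin_width=10):
    """Histogram of % utilization samples as {bin_start: probability}."""
    counts = {}
    for s in samples:
        # Put 100% into the top bin instead of opening a new one.
        b = min(int(s // bin_width) * bin_width, 100 - bin_width)
        counts[b] = counts.get(b, 0) + 1
    total = len(samples)
    return {b: c / total for b, c in sorted(counts.items())}


def peak_demand(samples, percentile=95):
    """Nearest-rank percentile of the samples, used as the peak estimate."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(percentile * len(ordered) / 100))
    return ordered[rank - 1]
```

Note that when a resource is already at 100%, the observed samples only give a lower bound on true demand, which is the gap the gray-box, request-level information on this slide is meant to fill.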
Determining Placement – Where to?
Migrate VMs from overloaded to underloaded servers
Use Volume to find most loaded servers
Captures load on multiple resource dimensions
Highly loaded servers are targeted first
Migrations incur overhead
Migration cost determined by RAM
Migrate the VM with highest Volume/RAM ratio
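$$\text{Volume} = \frac{1}{1-\text{cpu}} \cdot \frac{1}{1-\text{net}} \cdot \frac{1}{1-\text{mem}}$$
[Figure: Volume illustrated on cpu, net, and mem axes]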
Maximize the amount of load transferred while minimizing the overhead of migrations
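The Volume metric and the Volume/RAM selection rule translate directly into code. This is a small sketch under stated assumptions: utilizations are fractions in [0, 1), the 0.99 clamp is added here only to avoid dividing by zero at full utilization, and the dictionary keys (`cpu`, `net`, `mem`, `ram_mb`) are hypothetical names chosen for the example.

```python
def volume(cpu, net, mem):
    """Volume = 1/(1-cpu) * 1/(1-net) * 1/(1-mem) for fractional utilizations."""
    cap = 0.99  # assumption: clamp so 100% utilization does not divide by zero
    cpu, net, mem = (min(u, cap) for u in (cpu, net, mem))
    return 1.0 / ((1 - cpu) * (1 - net) * (1 - mem))


def pick_vm_to_migrate(vms):
    """Return the VM with the highest Volume/RAM ratio.

    A high ratio means a lot of load is moved per megabyte of memory
    that the migration has to copy.
    """
    return max(
        vms,
        key=lambda vm: volume(vm["cpu"], vm["net"], vm["mem"]) / vm["ram_mb"],
    )
```

For two VMs with identical load, the one with the smaller memory footprint is chosen, since it transfers the same load at lower migration cost.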
Placement Algorithm
First try migrations
Displace VMs from high Volume servers
Use Volume/RAM to minimize overhead
Don’t create new hotspots!
What if high average load in system?
Swap if necessary
Swap a high Volume VM for a low Volume one
Requires 3 migrations
Can’t support both at once
[Figure: two panels, "Migration" and "Swap", showing VMs (VM1 to VM5) on PM1, PM2, and a spare server]
Swaps increase the number of hotspots we can resolve
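The migration phase of the placement algorithm can be sketched as a greedy loop. This is an illustrative simplification, not the published algorithm: it assumes each VM's demand is a fraction of a host's capacity, uses one shared threshold for all resources, and covers only plain migrations (the swap fallback is left out). All names and the data layout are hypothetical.

```python
THRESHOLD = 0.75  # assumption: one threshold shared by all resources
RESOURCES = ("cpu", "net", "mem")


def load(server):
    """Per-resource utilization of a server: the sum of its VMs' demands."""
    return {r: sum(vm[r] for vm in server["vms"]) for r in RESOURCES}


def volume_of(util):
    """Volume of anything with cpu/net/mem fractions (clamped below 1)."""
    v = 1.0
    for r in RESOURCES:
        v /= 1 - min(util[r], 0.99)
    return v


def is_hotspot(server):
    return any(u > THRESHOLD for u in load(server).values())


def fits(server, vm):
    """True if adding `vm` keeps every resource at or under the threshold."""
    util = load(server)
    return all(util[r] + vm[r] <= THRESHOLD for r in RESOURCES)


def plan_migrations(servers):
    """Greedy sketch; returns a list of (vm, source, destination) names."""
    plan = []
    # Visit the highest-Volume servers first.
    for src in sorted(servers, key=lambda s: volume_of(load(s)), reverse=True):
        while is_hotspot(src):
            # Try VMs in decreasing Volume/RAM order.
            candidates = sorted(src["vms"],
                                key=lambda vm: volume_of(vm) / vm["ram_mb"],
                                reverse=True)
            moved = False
            for vm in candidates:
                # Prefer the least loaded destination that will not overload.
                targets = sorted((s for s in servers if s is not src),
                                 key=lambda s: volume_of(load(s)))
                dst = next((s for s in targets if fits(s, vm)), None)
                if dst is not None:
                    src["vms"].remove(vm)
                    dst["vms"].append(vm)
                    plan.append((vm["name"], src["name"], dst["name"]))
                    moved = True
                    break
            if not moved:
                break  # no feasible migration; a swap would be needed here
    return plan
```

The `fits` check is what enforces "don't create new hotspots": a VM is only placed where every resource stays under the threshold after the move.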
Outline
Introduction & Motivation
System Overview
When? How much? And Where to?
Implementation and Evaluation
Conclusions
Implementation
Use Xen 3.0.2-3 virtualization software
Testbed of twenty 2.4 GHz P4 servers
Apache 2.0.54, PHP 4.3.10, MySQL 4.0.24
Synthetic PHP applications
RUBiS – multi-tier eBay-like web application
Migration Effectiveness
3 physical servers, 5 virtual machines
VMs serve CPU intensive PHP scripts
Migration triggered when CPU usage exceeds 75%
Sandpiper detects and responds to 3 hotspots
[Figure: stacked CPU usage on PM 1, PM 2, and PM 3]
Memory Hotspots
Virtual machine runs SpecJBB benchmark
Memory utilization increases over time
Black-box increases by 32 MB if page-swapping observed
Gray-box maintains 32 MB free
Significantly reduces page-swapping
[Figure: RAM (MB) vs. time (sec) for black-box and gray-box]
Gray-box can improve application performance by proactively increasing allocation
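The difference between the two memory policies in this experiment can be made concrete with a short sketch. The function names are hypothetical, and the sketch assumes allocations only grow; the slides give the 32 MB figures but do not describe shrinking behavior.

```python
STEP_MB = 32  # from the slide: 32 MB increment / 32 MB free target


def black_box_allocation(current_mb, swapping_observed):
    """Reactive: grow by 32 MB only after swapping is seen from outside the VM."""
    return current_mb + STEP_MB if swapping_observed else current_mb


def gray_box_allocation(current_mb, used_mb):
    """Proactive: grow so that at least 32 MB stays free, using OS-level stats."""
    return max(current_mb, used_mb + STEP_MB)
```

With 256 MB allocated and 240 MB in use, the gray-box rule raises the allocation to 272 MB before any swapping occurs, whereas the black-box rule waits until swapping has already been observed.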
Data Center Prototype
16 server cluster runs realistic data center applications on 35 virtual machines
6 servers (14 VMs) become simultaneously overloaded
4 CPU hotspots and 2 network hotspots
Sandpiper eliminates all hotspots in four minutes
Uses 7 migrations and 2 swaps
Despite migration overhead, VMs see fewer periods of overload
[Figure: number of hotspots over time, Static vs. Sandpiper]
[Figure: time (intervals) for the "Overloaded" and "Sustained" categories, Static vs. Sandpiper]
Related Work
Menasce and Bennani 2006
Single server resource management
VIOLIN and Virtuoso
Use virtualization for dynamic resource control in grid computing environments
Shirako
Migration used to meet resource policies determined by application owners
VMware Distributed Resource Scheduler
Automatically migrates VMs to ensure they receive their resource quota
Summary
Virtual Machine migration is a viable tool for dynamic data center provisioning
Sandpiper can rapidly detect and eliminate hotspots while treating each VM as a black-box
Gray-Box information can improve performance in some scenarios
Proactive memory allocations
Future work
Improved black-box memory monitoring
Support for replicated services
Thank you
http://lass.cs.umass.edu
Stability During Overload
Predict future usage
Will not migrate if destination could become overloaded
Each set of migrations must eliminate a hotspot
Algorithm only performs bounded number of migrations
[Figure: measured and predicted utilization vs. time (sec) for PM1 and PM2]
Sandpiper Overhead
CPU/mem same as monitoring tools (1%)
Network bandwidth negligible
Placement algorithm completes in less than 10 seconds for up to 750 VMs
Can distribute computation if necessary
Gray v. Black - Apache
Load spikes on 2 web servers cause CPU saturation
Black-box underestimates each VM’s requirement
Does not know how much more to allocate
Requires 3 sequential migrations to resolve hotspot
Gray-box correctly judges resource requirements by using application logs
Initiates 2 migrations in parallel
Eliminates hotspot 60% faster
[Figures: Web Server Response Time; Migrations]