Understanding Performance Monitoring in VMware VI3 (based on VMworld 2007 – TA64)


Page 1: Understanding Performance Monitoring in VMware VI3 (based on VMworld 2007 – TA64)

Understanding Performance Monitoring in VMware VI3 (based on VMworld 2007 – TA64)

Page 2: Understanding Performance Monitoring in VMware VI3 (based on VMworld 2007 – TA64)

Sources of Performance Problems

Bottom Line: virtual machines share physical resources

Over-commitment of resources can lead to poor performance

Imbalances in the system can also lead to slower-than-expected performance

Configuration issues can also contribute to poor performance

Page 3: Understanding Performance Monitoring in VMware VI3 (based on VMworld 2007 – TA64)

Tools for Performance Monitoring/Analysis

VirtualCenter client (VI Client): per-host and per-cluster statistics

Esxtop: per-host statistics

SDK: allows users to collect only the statistics they want

All tools use same mechanism to retrieve data (special vmkernel calls)

Page 4: Understanding Performance Monitoring in VMware VI3 (based on VMworld 2007 – TA64)

VI Client screenshot

Real-time vs. Historical

Rollup Stats type

Object

Counter type

Chart Type

Page 5: Understanding Performance Monitoring in VMware VI3 (based on VMworld 2007 – TA64)

Real-time vs. Historical stats

VirtualCenter stores statistics at different granularities

Time Interval            Data frequency   Number of samples
Past Hour (real-time)    20 seconds       180
Past Day                 5 minutes        288
Past Week                15 minutes       672
Past Month               1 hour           720
Past Year                1 day            365

Samples are “rolled up” (averaged) for next time interval
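The rollup behaviour can be illustrated with a short Python sketch. It assumes a rollup is a plain arithmetic mean over consecutive samples; the function names `roll_up` and `covered_seconds` and the sample values are hypothetical and are not VirtualCenter's implementation.

```python
from statistics import mean

# (interval name, seconds between samples, samples kept), taken from the table
INTERVALS = [
    ("Past Hour", 20, 180),
    ("Past Day", 5 * 60, 288),
    ("Past Week", 15 * 60, 672),
    ("Past Month", 60 * 60, 720),
    ("Past Year", 24 * 60 * 60, 365),
]

def roll_up(samples, factor):
    """Average each full run of `factor` consecutive samples into one coarser sample."""
    last_start = len(samples) - factor + 1
    return [mean(samples[i:i + factor]) for i in range(0, last_start, factor)]

def covered_seconds(period_s, count):
    """Total time span covered by `count` samples taken every `period_s` seconds."""
    return period_s * count

# Fifteen 20-second samples make one 5-minute sample (300 / 20 = 15).
realtime = [10.0] * 15 + [40.0] * 15   # hypothetical CPU usage percentages
five_minute = roll_up(realtime, 15)
```

One consequence of averaging is that short spikes visible in the real-time view are flattened in the coarser intervals.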

Page 6: Understanding Performance Monitoring in VMware VI3 (based on VMworld 2007 – TA64)

Stacked vs. Line charts

Line

Each instance shown separately

Stacked

Graphs are stacked on top of each other

Only applies to certain kinds of charts, e.g.:

Breakdown of Host CPU MHz by Virtual Machine

Breakdown of Virtual Machine CPU by VCPU

Page 7: Understanding Performance Monitoring in VMware VI3 (based on VMworld 2007 – TA64)

Esxtop

Per-host statistics

Shows CPU/Memory/Disk/Network on separate screens

Sampling frequency (refresh every 5s by default)

Batch mode (can look at data offline with perfmon)

Host information

Per-VM/world information
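Batch-mode output is CSV, so it can also be post-processed with a script instead of perfmon. The sketch below assumes a capture made with something like `esxtop -b -d 5 -n 12 > stats.csv`; the column names in `SAMPLE_CSV` are simplified placeholders (real files use long perfmon-style counter names and many more columns), and `column_averages` is a hypothetical helper.

```python
import csv
import io
from statistics import mean

# Hypothetical, heavily simplified excerpt of a batch-mode capture.
SAMPLE_CSV = """timestamp,pcpu0_used_pct,pcpu1_used_pct
09:00:00,82.5,91.0
09:00:05,79.5,95.0
"""

def column_averages(csv_text):
    """Return the mean of every numeric column, keyed by column name."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    names = [name for name in rows[0] if name != "timestamp"]
    return {name: mean(float(row[name]) for row in rows) for name in names}

averages = column_averages(SAMPLE_CSV)
```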

Page 8: Understanding Performance Monitoring in VMware VI3 (based on VMworld 2007 – TA64)

SDK

Use the VIM API to access statistics relevant to a particular user

Can only access statistics that are exported by the VIM API (and thus are accessible via esxtop/VI client)

Page 9: Understanding Performance Monitoring in VMware VI3 (based on VMworld 2007 – TA64)

Shares example

Change shares for VM

Dynamic reallocation

Add VM, overcommit

Graceful degradation

Remove VM

Exploit extra resources
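The four steps in the shares example can be worked through numerically. This is a minimal sketch of proportional-share allocation, assuming every VM wants as much CPU as it can get and that no reservations or limits are set; the `allocate` function, the VM names, and the 3000 MHz capacity are made up for illustration.

```python
def allocate(capacity_mhz, shares):
    """Split capacity among VMs in proportion to their shares."""
    total = sum(shares.values())
    return {vm: capacity_mhz * s / total for vm, s in shares.items()}

CAPACITY_MHZ = 3000  # hypothetical host capacity

# Starting point: three VMs with equal shares.
equal = allocate(CAPACITY_MHZ, {"A": 1000, "B": 1000, "C": 1000})

# Change shares for a VM: A's allocation grows, B and C shrink (dynamic reallocation).
boosted = allocate(CAPACITY_MHZ, {"A": 2000, "B": 1000, "C": 1000})

# Add a VM: everyone gets proportionally less (graceful degradation).
crowded = allocate(CAPACITY_MHZ, {"A": 2000, "B": 1000, "C": 1000, "D": 1000})

# Remove a VM: the remaining VMs absorb the freed capacity.
relaxed = allocate(CAPACITY_MHZ, {"A": 2000, "B": 1000})
```

Shares only set relative priority, so the absolute allocation of a VM depends on which other VMs are competing at the time.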

Page 10: Understanding Performance Monitoring in VMware VI3 (based on VMworld 2007 – TA64)

ESX CPU Scheduling

World states (simplified view):

ready = ready-to-run but no physical CPU free

run = currently active and running

wait = blocked on I/O

Multi-CPU Virtual Machines => gang scheduling

Co-run (latency to get vCPUs running)

Co-stop (time in “stopped” state)

Page 11: Understanding Performance Monitoring in VMware VI3 (based on VMworld 2007 – TA64)

ESX, VirtualCenter, and Resource pools

Resource Pool extends proportional-share scheduling to groups of hosts (a cluster)

VirtualCenter can VMotion VMs to provide resource balance (DRS)

Page 12: Understanding Performance Monitoring in VMware VI3 (based on VMworld 2007 – TA64)

Tools for monitoring CPU performance: VI Client

Basic stuff

CPU usage (percent)

CPU ready time (but ready time by itself can be misleading)

Advanced stuff

CPU wait time: time spent blocked on IO

CPU extra time: time given to virtual machine over reservation

CPU guaranteed: min CPU for virtual machine

Cluster-level statistics

Percent of entitled resources delivered

Utilization percent

Effective CPU resources: MHz for cluster

Page 13: Understanding Performance Monitoring in VMware VI3 (based on VMworld 2007 – TA64)

VI Client CPU screenshot

Note CPU milliseconds and percent are on the same chart but use different axes
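Because the two axes use different units, it can help to convert a millisecond value into a percentage of the sampling interval. The sketch below assumes the millisecond counter is a sum accumulated over one sampling interval (20 s for real-time charts) for a single vCPU; `ready_percent` is a hypothetical helper name.

```python
REALTIME_INTERVAL_S = 20  # real-time sampling interval from the statistics table

def ready_percent(ready_ms, interval_s=REALTIME_INTERVAL_S):
    """Express accumulated ready time as a percentage of the sampling interval."""
    return 100.0 * ready_ms / (interval_s * 1000.0)

# 2000 ms of ready time in a 20 s sample means the vCPU waited 10% of the time.
example = ready_percent(2000)
```

For historical charts the same formula applies with the longer interval, for example 300 s for the past-day view.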

Page 14: Understanding Performance Monitoring in VMware VI3 (based on VMworld 2007 – TA64)

Cluster-level information in the VI Client

Utilization % describes available capacity on hosts (here: CPU usage low, memory usage medium)

% Entitled resources delivered: best if all 90-100+.

Page 15: Understanding Performance Monitoring in VMware VI3 (based on VMworld 2007 – TA64)

CPU performance analysis: esxtop

PCPU(%): CPU utilization

Per-group stats breakdown

%USED: Utilization

%RDY: Ready Time

%TWAIT: Wait and idling time

Co-Scheduling stats (multi-CPU Virtual Machines)

%CRUN: Co-run state

%CSTOP: Co-stop state

NMEM: number of members (worlds) in the group; each member can consume 100% (expand to see breakdown)

Affinity

HTSharing

Page 16: Understanding Performance Monitoring in VMware VI3 (based on VMworld 2007 – TA64)

esxtop CPU screenshot

2-CPU box, but 3 active VMs (high %used)

High %rdy + high %used can imply CPU overcommitment

Page 17: Understanding Performance Monitoring in VMware VI3 (based on VMworld 2007 – TA64)

VI Client and Ready Time

Ready time < used time

Used time

Ready time ~ used time

Used time ~ ready time: may signal contention. However, might not be overcommitted due to workload variability

In this example, we have periods of activity and idle periods: CPU isn’t overcommitted all the time
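One way to tell sustained contention from workload variability is to count how many samples actually show ready time comparable to used time. The following sketch is only a heuristic: the `ratio` and `min_used_ms` thresholds are arbitrary assumptions, and `contended_fraction` and the sample data are hypothetical.

```python
def contended_fraction(samples, ratio=0.8, min_used_ms=1000):
    """Fraction of (used_ms, ready_ms) samples where ready is comparable to used.

    Samples with little CPU use count as uncontended, so idle periods
    lower the result instead of being skipped.
    """
    flagged = [
        used >= min_used_ms and ready >= ratio * used
        for used, ready in samples
    ]
    return sum(flagged) / len(flagged)

# Two busy samples with ready ~ used, followed by two idle samples.
history = [(8000, 7500), (9000, 9500), (200, 50), (100, 20)]
fraction = contended_fraction(history)
```

A value near 1.0 would suggest the host is overcommitted most of the time, while a value such as 0.5 matches the slide's case of alternating busy and idle periods.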

Page 18: Understanding Performance Monitoring in VMware VI3 (based on VMworld 2007 – TA64)

Memory

ESX must balance memory usage for all worlds

Virtual machines, Service Console, and vmkernel consume memory

Page sharing to reduce memory footprint of Virtual Machines

Ballooning to relieve memory pressure in a graceful way

Host swapping to relieve memory pressure when ballooning insufficient

ESX allows overcommitment of memory

Sum of configured memory sizes of virtual machines can be greater than physical memory if working sets fit

VC adds the ability to create resource pools based on memory usage
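The overcommitment condition above can be written as a simple check. This sketch deliberately ignores virtualization overhead and the memory used by the Service Console and vmkernel, and it treats active memory as a stand-in for the working set; the function names and all figures are hypothetical.

```python
def overcommit_ratio(configured_mb, physical_mb):
    """Sum of configured VM memory divided by host physical memory."""
    return sum(configured_mb) / physical_mb

def working_sets_fit(active_mb, physical_mb):
    """True when the combined working sets fit in physical memory."""
    return sum(active_mb) <= physical_mb

HOST_MB = 8192                     # hypothetical host memory
configured = [4096, 4096, 2048]    # configured sizes of three VMs
active = [1500, 2000, 1000]        # estimated working sets

ratio = overcommit_ratio(configured, HOST_MB)   # above 1.0 means overcommitted
fits = working_sets_fit(active, HOST_MB)
```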

Page 19: Understanding Performance Monitoring in VMware VI3 (based on VMworld 2007 – TA64)

VI Client

Main page shows “consumed” memory (formerly “active” memory)

Performance charts show important statistics for virtual machines

Consumed memory

Granted memory

Ballooned memory

Shared memory

Swapped memory

Swap in

Swap out

Page 20: Understanding Performance Monitoring in VMware VI3 (based on VMworld 2007 – TA64)

Virtual Machine Memory Metrics

Metric             Description
Active Memory      Physical pages touched recently by a virtual machine
Memory usage       Active memory / configured memory
Consumed Memory    Machine memory mapped to a virtual machine, not including shared & overhead memory
Granted Memory     Physical pages allocated to a virtual machine. May be less than configured memory.
Shared Memory      Physical pages shared with other virtual machines
Ballooned Memory   Physical memory ballooned from a virtual machine
Swapped Memory     Physical pages swapped from a virtual machine by the vmkernel (swap in and swap out are cumulative)
Overhead Memory    Machine pages used for virtualization
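A few of these metrics combine into derived values. The sketch below computes memory usage as active / configured, as defined above, and adds ballooned and swapped memory into a single "reclaimed" figure; that second value is an illustrative convenience, not a VI Client counter, and the `VmMemory` class and its numbers are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class VmMemory:
    configured_mb: float
    active_mb: float
    ballooned_mb: float = 0.0
    swapped_mb: float = 0.0

    def usage_percent(self):
        """Memory usage: active memory as a percentage of configured memory."""
        return 100.0 * self.active_mb / self.configured_mb

    def reclaimed_mb(self):
        """Memory taken back from the VM by ballooning and host swapping."""
        return self.ballooned_mb + self.swapped_mb

vm = VmMemory(configured_mb=2048, active_mb=512, ballooned_mb=256, swapped_mb=64)
```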

Page 21: Understanding Performance Monitoring in VMware VI3 (based on VMworld 2007 – TA64)

VI Client: VM list summary

Host CPU: avg. CPU utilization for Virtual Machine

Host Memory: consumed memory for Virtual Machine

Guest Memory: active memory for guest

Page 22: Understanding Performance Monitoring in VMware VI3 (based on VMworld 2007 – TA64)

Esxtop for memory: Host information

PMEM: Total physical memory breakdown

VMKMEM: Memory managed by vmkernel

COSMEM: Service Console memory breakdown

SWAP: Swap breakdown

MEMCTL: Balloon information

Page 23: Understanding Performance Monitoring in VMware VI3 (based on VMworld 2007 – TA64)

VI Client: Memory example for Virtual Machine

Balloon & target

Swap in

Swap out

Swap usage

Active memory

Consumed & granted

Increase in swap activity

No swap activity

Page 24: Understanding Performance Monitoring in VMware VI3 (based on VMworld 2007 – TA64)

Disk

Disk performance is dependent on many factors:

Filesystem performance

Disk subsystem configuration (SAN, NAS, iSCSI, local disk)

Disk caching

Disk formats (thick, sparse, thin)

ESX is tuned for Virtual Machine I/O

VMFS clustered filesystem => keeping consistency imposes some overheads

Page 25: Understanding Performance Monitoring in VMware VI3 (based on VMworld 2007 – TA64)

ESX Storage Stack

Different latencies for local disk vs. SAN (caching, switches, etc.)

Queuing within kernel and in hardware

Page 26: Understanding Performance Monitoring in VMware VI3 (based on VMworld 2007 – TA64)

VI Client disk statistics

Mostly coarse-grained statistics

Disk bandwidth

Disk read rate, disk write rate (KB/s)

Disk usage: sum of read BW and write BW (KB/s)

Disk operations during sampling interval (20s for real-time)

Disk read requests, disk write requests, commands issued

Bus resets, command aborts

Per-LUN and aggregated statistics
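Two small conversions follow from these definitions: disk usage is the sum of the read and write rates, and an operation count collected over a sampling interval becomes a rate when divided by the interval length. The helper names and the figures in this sketch are hypothetical.

```python
REALTIME_INTERVAL_S = 20  # real-time sampling interval

def disk_usage_kbps(read_kbps, write_kbps):
    """Disk usage as the sum of read and write bandwidth (KB/s)."""
    return read_kbps + write_kbps

def per_second(count, interval_s=REALTIME_INTERVAL_S):
    """Turn a count accumulated over one sampling interval into a rate."""
    return count / interval_s

usage = disk_usage_kbps(1200, 800)   # KB/s
iops = per_second(3000)              # commands issued in one 20 s sample
```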

Page 27: Understanding Performance Monitoring in VMware VI3 (based on VMworld 2007 – TA64)

Esxtop Disk Statistics

Aggregated statistics like VI Client

READS/s, WRITES/s, MBREAD/s, MBWRITE/s

Latency statistics

KAVG/cmd, DAVG/cmd, GAVG/cmd

Queuing information

Adapter (AQLEN), LUN (LQLEN), vmkernel (QUED)

K: Kernel, D: Device, G: Guest
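The three latency counters are related: to a good approximation, the latency seen by the guest (GAVG) is the time spent in the vmkernel (KAVG) plus the time spent in the device (DAVG). The sketch below uses that relationship to show where latency is accumulating; the function names and sample values are hypothetical.

```python
def guest_latency_ms(kavg_ms, davg_ms):
    """Approximate guest-visible latency as kernel latency plus device latency."""
    return kavg_ms + davg_ms

def dominant_component(kavg_ms, davg_ms):
    """Report whether the kernel or the device contributes more latency."""
    return "kernel" if kavg_ms > davg_ms else "device"

# Hypothetical sample: most of the delay comes from the storage device.
gavg = guest_latency_ms(kavg_ms=0.5, davg_ms=12.0)
where = dominant_component(kavg_ms=0.5, davg_ms=12.0)
```

A high KAVG usually points at queuing inside the vmkernel, while a high DAVG points at the array, fabric, or disks.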

Page 28: Understanding Performance Monitoring in VMware VI3 (based on VMworld 2007 – TA64)

Disk performance example: VI Client

Throughput with cache (good)

Throughput w/o cache (bad)

Page 29: Understanding Performance Monitoring in VMware VI3 (based on VMworld 2007 – TA64)

Disk performance example: esxtop

Latency seems high

After enabling cache, latency much better

Page 30: Understanding Performance Monitoring in VMware VI3 (based on VMworld 2007 – TA64)

Network

Guest to vmkernel

Address space switch

Virtual interrupts

Virtual I/O stack

Packet copy

Packet routing

Diagram labels: Guest OS, TCP/IP, virtual driver, Virtual Device, Virtual I/O stack, physical driver

Page 31: Understanding Performance Monitoring in VMware VI3 (based on VMworld 2007 – TA64)

VI Client Networking Statistics

Mostly high-level statistics

Bandwidth

KBps transmitted, received

Network usage (KBps): sum of TX, RX over all NICs

Operations/s

Network packets received during sampling interval (real-time: 20s)

Network packets transmitted during sampling interval

Per-adapter and aggregated statistics

Per VM Stacked Graphs

Page 32: Understanding Performance Monitoring in VMware VI3 (based on VMworld 2007 – TA64)

Esxtop Networking Statistics

Bandwidth

Receive (MbRX/s), Transmit (MbTX/s)

Operations/s

Receive (PKTRX/s), Transmit (PKTTX/s)

Configuration info

Duplex (FDUPLX), speed (SPEED)

Errors

Packets dropped during transmit (%DRPTX), receive (%DRPRX)

Page 33: Understanding Performance Monitoring in VMware VI3 (based on VMworld 2007 – TA64)

Esxtop network output

Setup A (10/100):

Setup B (GigE):

Physical configuration

Performance

Page 34: Understanding Performance Monitoring in VMware VI3 (based on VMworld 2007 – TA64)

Approaching Performance Issues

Make sure it is an apples-to-apples comparison

Check guest tools & guest processes

Check host configurations & host processes

Check VirtualCenter client for resource issues

Check esxtop for obvious resource issues

Examine log files for errors

If no suspects, run microbenchmarks (e.g., Iometer, netperf) to narrow scope

Once you have suspects, check relevant configurations

If all else fails…contact VMware
