9 Troubleshooting
ESX Server System Management II
Module 9
Troubleshooting ESX Server
• Prevention
• Likely problems
• Responding to issues
For ESX Server 2.0.1 2003-11-17
Copyright © 2003 VMware, Inc. All rights reserved.
ESX Server troubleshooting philosophy
• Most ESX Server problems are caused by
  • Hardware problems
  • Misconfigurations
  • Inadequate planning
• An ounce of prevention
  • Aggressively validate hardware
  • Plan and review deployment
  • Develop and apply good data-center policies
• A pound of cure
  • Learn common symptoms, faults, fixes
Avoiding problems before they occur
• Aggressively validate hardware
  • Run memtest86 for 72 hours before deployment
  • Install a dummy OS on hardware
  • Check installed items against supported hardware list
    http://www.vmware.com/pdf/esx2_IO_guide.pdf
    http://www.vmware.com/pdf/esx2_SAN_guide.pdf
• Plan the deployment
  • Allocate enough resources to Service Console
• Develop datacenter policies
Service Console resource problems
Resource: problems caused by shortage
• Service Console RAM
  • Poor Remote Console performance
  • Poor MUI performance
• Service Console swap
  • Randomly killed processes (especially MUI)
  • Inability to start new VMs
• Service Console disk
  • Underuse of template technique
  • Full file systems, causing
    • Inability to launch MUI
    • Incompletely written logs
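A swap shortage can be spotted before processes start getting killed. The following is a minimal sketch, assuming the Service Console exposes a SwapFree: line (in kB) in /proc/meminfo as ordinary Linux systems do. The function names and the threshold are hypothetical and not part of ESX Server.

```shell
# swap_free_kb [FILE]: print the SwapFree value (kB) from a meminfo-style file.
# Assumption: FILE contains a line of the form "SwapFree:  <number> kB".
swap_free_kb() {
    awk '$1 == "SwapFree:" { print $2 }' "${1:-/proc/meminfo}"
}

# swap_is_low THRESHOLD_KB [FILE]: succeed when free swap is below THRESHOLD_KB.
# THRESHOLD_KB is a hypothetical figure chosen by the administrator.
swap_is_low() {
    free_kb=$(swap_free_kb "${2:-/proc/meminfo}")
    [ -n "$free_kb" ] && [ "$free_kb" -lt "$1" ]
}
```

For example, `swap_is_low 16384 && echo "Service Console swap is nearly exhausted"` could be run by hand or from a periodic job.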
Datacenter policy problems
Policy issue: problems caused by lack of policy
• Root password too widely known
  • Inappropriately timed maintenance
  • VMs get created as root
  • Root privilege used casually, worsening impact of operator error
  • Audit trail is obscured
  • Difficult to change root password
• Root should not own VMs
  • Encourages casual use of root login
  • Each VM should be owned by a named individual or group
• Passwords should not be shared
  • Audit trail is obscured
  • Greater odds of individuals working at cross-purposes
Installation issues
• VMkernel can only see supported devices
  • Only devices with drivers in /usr/lib/vmware/vmkmod
• Installation OS is uniprocessor, does not see IOAPIC
  • If devices are not seen, activate APIC manually at the boot prompt:
    boot: esx apic
• Be sure to detach external storage when doing the ESX Server install
• Some hardware models require resetting PCI slots in BIOS when cards change
When to involve VMware support
• Always let support know when…
  • The VMkernel panics (the “Purple Screen of Death”)
  • A virtual machine crashes, leaving behind a monitor core dump in its home directory
• Whenever you contact support about a VM problem
  • Find that VM’s world number in its monitor log
  • Look in the VMkernel log for references to that world number
  • Run the /usr/bin/vm-support script and include the resulting file
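Once the world number is known, pulling the matching VMkernel log lines can be sketched as below. The helper name is hypothetical, and matching the number as a whole word is an assumption about how the log mentions worlds; the exact line format may differ.

```shell
# world_refs WORLD LOGFILE: print the numbered lines of LOGFILE that contain
# WORLD as a whole word (so world 96 does not match 196 or 960).
# Assumption: the log writes the world number as a separate token.
world_refs() {
    grep -n -w -- "$1" "$2"
}
```

For example, `world_refs 96 /var/log/vmkernel` would list candidate lines for world 96. Since any standalone 96 matches, the output still needs a quick read before it goes into a support request.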
The Purple Screen of Death
• Displayed on ESX Server’s video monitor in the event of VMkernel panic
    VMware ESX Server [Release.1.5.1$Name: build-2173 $]
    SPIN count exceeded - probable deadlock
    gate=0x0 cr2=0x40017000 frame=0x801bc8 cr3=0x141c200 cr4=0x6f0
    eax=0 ebx=0 ecx=0 edx=0
    ebp=801d40 esi=0 edi=0
    CPU 0 96 console: cpu 1 93 idle1: cpu 2 94 idle2: cpu 3 95 idle3:
    [0x43457e]SP_WaitLockIRQ+0xd6(0xe9d58c, 0x0, 0xe5c248)
    [0x46c827]pci_request_regions+0x1af(0xe7bdf0, 0x61, 0x8f40e0)
    [0x4187bf]IDTDoInterrupt+0x1af(0x61, 0x801e01, 0x1)
    [0x41898a]IDT_HandleInterrupt+0x4e(0x801e28, 0x801e58, 0xe5c248)
    [0x416cca]HostHandleInterrupt+0x26(0x801e28, 0x439a0038, 0xffff0038)
    [0x45d8a3]HostEntry+0x83(0xeac968, 0x801e88, 0x41bb9d)
    [0x472eda]scsi_try_bus_reset+0x36(0xeac968, 0xc, 0x0)
    [0x471ca5]scsi_build_commandblocks+0xc71(0xe9d4f0, 0xea3010, 0xea9220)
    [0x438d85]SCSIResetCommandInt+0x65(0x105, 0xea9220, 0x801f6c)
    [0x438b14]SCSIExecuteCommandInt+0x48(0x105, 0xea9220, 0x801f6c)
    VMK uptime: 0:02:44:27.64
    Dumping VMkernel core and log ... Done.
    Waiting for debugger... (world 96)
    Debugger is listening on serial port ...
Most frequent types of PSODs
• Machine check exception
  • A general hardware problem
  • VMware Support can help pinpoint the failing subsystem
• NMI ECC or Parity Error
  • Specifically memory failures
  • VMware Support can help pinpoint the failing bank
In the event of PSOD
• Copy down the screen display, screen-grab it, or take a photo
• If the machine had been running in a steady state, with running VMs
  • Check for environmental factors
    • Especially room temperature
  • Check for detached external devices
• If the machine had been recently rebooted
  • Check for hardware configuration changes
The vm-support script
• Gathers all support-relevant information, bundles it for delivery to VMware Support
• To run:
    # cd {writable directory with disk space}
    # vm-support
• Attach resulting esx-date.id.tgz file to support request
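The "writable directory with disk space" precondition can be checked before the script is launched. This is a rough sketch only: the helper names and the MIN_KB threshold are hypothetical, since the slides do not say how much space the bundle needs.

```shell
# avail_kb: read `df -kP` output on stdin and print the Available column (KB)
# of the first file system listed.
avail_kb() {
    awk 'NR == 2 { print $4 }'
}

# run_vm_support DIR MIN_KB: run vm-support from DIR, but only if DIR is a
# writable directory with at least MIN_KB kilobytes free.
# MIN_KB is a hypothetical threshold; the slides do not state a figure.
run_vm_support() {
    dir=$1
    min_kb=$2
    if [ ! -d "$dir" ] || [ ! -w "$dir" ]; then
        echo "$dir is not a writable directory" >&2
        return 1
    fi
    free=$(df -kP "$dir" | avail_kb)
    if [ "${free:-0}" -lt "$min_kb" ]; then
        echo "only ${free:-0} KB free in $dir" >&2
        return 1
    fi
    # Run in a subshell so the caller's working directory is unchanged.
    (cd "$dir" && vm-support)
}
```

A call such as `run_vm_support /tmp 102400` refuses to start when less than about 100 MB is free, so the bundle is not cut short by a full file system.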
Key ESX Server logs
/var/log
messages vmkernel vmkwarning
Service Console errors, boot failures
VMkernel actions, including world creations
informational messages, not likely to indicate problems
/home/user/vmware/vmname
vmware.log Monitor log for this VM only
Some possible ESX Server problems
• Can’t start a VM
• Can’t connect to MUI
• Can’t connect to Remote Console
• VM bluescreens or hangs
• Remote Console performance problems
• Application performance problems
Problem: Can’t start a VM
• Possible causes:
  • Wrong permissions on virtual disks or config file
  • Virtual disks are not in a VMFS
  • Physical addresses for virtual disks may no longer be valid
    e.g., vmhba0:1:2:0 is now vmhba1:1:2:0
    • Fix: Use VMFS names!
  • Not enough memory in the system
  • Not enough unreserved VMkernel swap
  • Service Console’s hostname has no associated IP address
  • Virtual disks are corrupt or in COW format
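The hostname cause is quick to rule out. The sketch below checks only a hosts-format file (by default /etc/hosts) and does not query DNS, so a failure here is a hint and not proof. The function name is hypothetical.

```shell
# host_has_ip NAME [HOSTS_FILE]: succeed if NAME appears as a host name or
# alias on an address line of a hosts-format file (default /etc/hosts).
# Assumption: name resolution on the Service Console relies on this file.
host_has_ip() {
    awk -v name="$1" '
        { sub(/#.*/, "") }    # drop comments
        { for (i = 2; i <= NF; i++) if ($i == name) found = 1 }
        END { exit !found }
    ' "${2:-/etc/hosts}"
}
```

For example, `host_has_ip "$(hostname)" || echo "hostname has no entry in /etc/hosts"`.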
Problem: Can’t connect to MUI
• Possible causes:
  • Loss of IP connectivity
  • Wrong DNS name or IP address for ESX Server
  • Service Console root file system may be full
    • Use df -k to check
  • Service Console may have run out of swap
    • Linux may kill processes if this happens
    • To check for presence of MUI server:
      ps -ef | grep httpd
    • To manually restart MUI server:
      /etc/rc.d/rc3.d/S91httpd.vmware start
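The two local checks on this slide (full root file system, missing httpd process) can be combined into one pass. This is a sketch under stated assumptions: the helper names are hypothetical, and the -P option of df is used only to keep each file system on a single output line.

```shell
# use_pct: read `df -kP` output on stdin and print the Use% figure
# (without the % sign) of the first file system listed.
use_pct() {
    awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# has_httpd: read a process listing on stdin; succeed if it mentions httpd.
# The [h] bracket stops grep from matching its own command line.
has_httpd() {
    grep '[h]ttpd' > /dev/null
}

# mui_check: print a short diagnosis for an unreachable MUI.
mui_check() {
    pct=$(df -kP / | use_pct)
    [ "${pct:-0}" -lt 100 ] || echo "root file system is full"
    ps -ef | has_httpd || echo "no httpd process found; MUI server may need a restart"
}
```

If `mui_check` reports a missing httpd process, the manual restart command from the slide applies. A full root file system should be cleared first, or the restart is likely to fail again.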
Problem: Can’t connect to Remote Console
• Possible causes:
  • Loss of IP connectivity
  • Wrong DNS name or IP address for ESX Server
  • NIC duplex or speed mismatch with Ethernet switch
  • Service Console root file system may be full
    • Use df -k to check
  • Remote Console may be running on a non-default port number
    • Check /etc/xinetd.d/vmware-authd for port number
    • Remember to specify port number in client if not 902
      esx.company.com 8092 /home/fred/vmware/a/a.cfg
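Reading the configured port out of the xinetd service file can be sketched as follows. It assumes the file uses the usual xinetd "attribute = value" syntax with a port attribute; the function name is hypothetical.

```shell
# authd_port [FILE]: print the value of the "port" attribute in an xinetd
# service file (default /etc/xinetd.d/vmware-authd).
# Assumption: the attribute is written as "port = <number>" on its own line.
authd_port() {
    awk -F= '$1 ~ /^[ \t]*port[ \t]*$/ { gsub(/[ \t]/, "", $2); print $2 }' \
        "${1:-/etc/xinetd.d/vmware-authd}"
}
```

If `authd_port` prints anything other than 902, that number has to be given to the Remote Console client along with the server name.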
Problem: VM bluescreens or hangs
• Troubleshoot the issue just as on physical hardware
• Possible causes:
  • Problems with applications running in the guest OS
  • Hardware problems in the ESX Server
  • Bugs found in the VMkernel
    • VMware technical support will post a patch
Remote Console performance problems
• Possible causes:
  • Service Console may be swapping
  • NIC duplex or speed mismatch with Ethernet switch
  • Bit depth of virtual machine may need to be lower for this network path
    • Try to operate Linux virtual machines without graphics
  • Windows 2000 media detection feature
    • Start the VM with the Virtual CD-ROM disconnected
  • User expectations
    • Remote Console is not a replacement for Windows Terminal Services or Citrix Metaframe
Application performance problems
• Distinguish between user perception and actual performance issues
  • A machine can “feel slow” interactively while still delivering good transactions per second
• Possible causes
  • Name resolution issues
  • vmnic duplex or speed mismatch with Ethernet switch
  • The VM may need more of a limiting resource
    • CPU, memory, or disk bandwidth
Most frequent ESX Server support issues
• Diagnosing failures due to hardware
• Questions related to SANs and HBA failover
  • Supported configurations, how-to
• Configuring speed and duplex settings for NICs
• Assessing performance of virtual machines
• Allocating devices to Service Console and VMkernel
  • vmkpcidivy
• Linux questions and issues not specific to ESX Server
ESX Server System Management II
Module 9
Questions?