Configuration Guide: CloudVision version 2017.2.0
Chapter 21

Troubleshooting and Health Checks

If you encounter an issue when using the CloudVision appliance, check to see if there are troubleshooting steps for the issue:

• "Troubleshooting" on page 318
• "System Recovery" on page 321
• "Health Checks" on page 323
• "Resource Checks" on page 324
21.1 Troubleshooting

The following table lists the troubleshooting procedures for known issues.

Issue: HBase Master and Tomcat show as NOT RUNNING under the following conditions:
• At the end of a shell-based installation or an ISO-based installation
• After running cvpi status all
Potential Cause: The NTP and DNS servers entered during installation are not reachable.
Solution: Check that the NTP and DNS servers specified during the installation are reachable. Fix the reachability of the NTP and DNS servers, then reboot the CVP VM from the console using the sudo init 6 command. If the problem persists after the reboot, delete the VM and then re-install it.
Potential Cause: HBase is corrupted.
Solution: Try creating configlets in a test container (without devices), and check whether they are created.

Issue: CVP behavior seems affected immediately after a reboot following an unplanned power failure or an unclean shutdown. Running cvpi status all shows HBase still not running after 15 minutes.
Potential Cause: There are multiple potential causes.
Solution: See "CVP Behavior Change Following Powercycle" on page 319 for details on the potential causes and for troubleshooting steps.

Issue: After upgrading CVP, runtime exceptions occur on basic operations (for example, adding devices to the inventory).
Potential Cause: Your browser has cached items for the previous version of CVP that are not valid for the new version of CVP.
Solution: Clear your browser's cache, cookies, and hosted app data. Then refresh the browser and try again.

Issue: After installing CVP, the cvpi start all command fails with messages about invalid DNS names.
Potential Cause: The CVP host names specified are not fully qualified domain names (FQDNs).
Solution: A re-installation is required.
• Shell-based: FQDNs have to be entered when prompted for CVP host names.
• ISO-based: The CVP host names specified in the cvp.yaml file must be FQDNs.

Issue: CVP redirects you to a URL that you do not have access rights to view. The URL and message are:
• URL: http://<your cvp>/web/unAuthorised
• Message: "You do not have sufficient privileges to access the specified URL. Please contact your administrator."
Potential Cause: If you access CVP using https and a self-signed certificate, the certificate may have expired but is still cached by your browser.
Solution: Clear your browser's cache, cookies, hosted app data, and content licenses.

Issue: Running cvpi status all shows just the cvp-frontend and/or cvp-backend components as NOT RUNNING or FAILED.
Solution: On the primary node, execute the following commands: cvpi watchdog off, cvpi stop cvp, and then cvpi start cvp, cvpi watchdog on (which may take 5-10 minutes to execute). If that doesn't result in all services showing as running, see "System Recovery" on page 321 to resolve the issue.

Issue: The installation process ends with some CVP services failing to start. Running the cvpi status all command shows some CVP services with the status NOT RUNNING.
Solution: On the primary node, execute the following commands: cvpi watchdog off, cvpi stop all, and then cvpi start all, cvpi watchdog on (which may take 5-10 minutes to execute). If that doesn't result in all services showing as running, see "System Recovery" on page 321 to resolve the issue.

Issue: In a multi-node cluster, Zookeeper and Hazelcast exceptions occur.
Potential Cause: There may be issues with the quality of network connectivity between nodes.
Solution: Check both the connectivity between nodes (using ping) and the quality of that connectivity (for example, using ping -f). Ensure network connectivity and a 100% pass rate on that connectivity.
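The connectivity check above reduces to confirming 0% packet loss in the ping summary line. A minimal sketch of that check follows; the helper name and the canned summary line are illustrative, not CVP tooling:

```shell
# Hypothetical helper: succeed only when a ping summary line reports
# exactly "0% packet loss" (the leading space avoids matching "10%", "20%", ...).
no_packet_loss() {
  echo "$1" | grep -q ' 0% packet loss'
}

summary="100 packets transmitted, 100 received, 0% packet loss, time 812ms"
if no_packet_loss "$summary"; then
  echo "network healthy"
else
  echo "packet loss detected - check inter-node links"
fi
```

In practice the summary line would come from running, for example, `ping -c 100 <peer>` against each of the other nodes.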
21.1.1 CVP Behavior Change Following Powercycle

You may encounter an unexpected change in the behavior of CVP immediately after a reboot that was performed following an unplanned power failure or an unclean shutdown of CVP.

For information on the potential causes and details on the troubleshooting steps, see:

• "Potential Causes"
• "Confirming the Cause"
• "Troubleshooting Procedure" on page 320

21.1.1.1 Potential Causes

The potential causes for CVP behavior changes in this situation include:

• Lease recovery on the WAL file fails after a power cycle. See https://issues.apache.org/jira/browse/HDFS-7342 for details.
• The lease on the WAL file cannot be released because blocks are being replicated in Hadoop.
• A combination of the previous two items.
21.1.1.2 Confirming the Cause

The objective of this task is to confirm that the cause is a WAL file lease recovery failure after a powercycle, or a failure to release the WAL file due to blocks being replicated in Hadoop. Confirming the cause is a simple process that involves reviewing the /cvpi/hbase/logs/hbase-cvp-master-<fqdn>.log file.

To confirm the cause, complete the following steps:

Step 1 Open the following log file on the primary or secondary node:

/cvpi/hbase/logs/hbase-cvp-master-<fqdn>.log

Step 2 Go to the last exception in the log (it should be near the end of the log, and should have been generated within the last 3 minutes of logging activity captured in the log).
The following entries continue the troubleshooting table in Section 21.1.

Issue: Cannot log in to CVP. The system is not synchronizing with the NTP servers.
Potential Cause: Nodes are not synchronizing with the NTP server, which has led to a clock skew between the nodes that is larger than CVP components allow.
Solution: Run ntpstat on all nodes. Output from all nodes must say:

synchronised to NTP server (...) at ...

1. Run service ntpd restart.
2. Then wait a few seconds.
3. Check ntpstat.
4. If time is still not synchronized, run:
   service ntpd stop
   ntpdate <hostname or IP of an ntpserver>
   service ntpd start
5. Check ntpstat again.

Issue: I/O slowness issues.
Potential Cause: The disk I/O throughput is at an unhealthy level (too low).
Solution: Use the cvpi resources command to find out whether the disk I/O throughput is at a healthy or an unhealthy level. The disk I/O throughput reported in the command output is measured by the virtual machine. (See "Running Health Checks" on page 323 for an example of the output of the cvpi resources command.)
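The ntpstat check can be scripted. The sketch below only parses ntpstat-style output rather than invoking ntpstat itself, and the helper name is illustrative:

```shell
# Hypothetical helper: succeed only when ntpstat-style output reports
# synchronisation, mirroring the manual check described above.
ntp_synced() {
  echo "$1" | grep -q '^synchronised to NTP server'
}

out="synchronised to NTP server (10.1.1.1) at stratum 3"
if ntp_synced "$out"; then
  echo "NTP OK"
else
  echo "NTP not synchronised - restart ntpd and re-check ntpstat"
fi
```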
Step 3 Make sure that the exception in the log file is the same as the exception shown in the "Exception found in log" table below.
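Steps 1-3 can be sketched as a small helper. The function name is illustrative; in practice you would point it at /cvpi/hbase/logs/hbase-cvp-master-<fqdn>.log:

```shell
# Hypothetical helper: print the last log line mentioning an exception,
# so it can be compared against the expected AlreadyBeingCreatedException.
last_exception() {
  grep 'Exception' "$1" | tail -n 1
}

# Usage on a CVP node (path from the steps above):
# last_exception /cvpi/hbase/logs/hbase-cvp-master-$(hostname -f).log
```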
21.1.1.3 Troubleshooting Procedure

This procedure provides the troubleshooting steps for situations that meet the conditions specified in the table above.

Pre-requisites:

Make sure that you have confirmed the cause (see "Confirming the Cause" on page 319).

Complete the following steps to resolve the issue:

Step 1 Use the cvpi watchdog off command to disable the watchdog.
Step 2 Wait 15 minutes for Hadoop to finish replicating blocks.
Step 3 Use the cvpi start hbase command to start HBase.
Step 4 Use the cvpi status hbase command to verify that HBase is running.
Step 5 Do one of the following:
• If HBase is running, use the cvpi watchdog on command to re-enable the watchdog, and then wait for services to come up.
• If HBase is not running, go to system recovery to resolve the issue (see "System Recovery" on page 321).
Related topics:
• "System Recovery" on page 321
• "Health Checks" on page 323
• "Resource Checks" on page 324
Exception found in log:

org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.protocol.AlreadyBeingCreatedException): DIR* NameSystem.internalReleaseLease: Failed to release lease for file /hbase/MasterProcWALs/state-00000000000000000<number>.log. Committed blocks are waiting to be minimally replicated. Try again later.
21.2 System Recovery

System recovery should be used only when the CVP cluster has become unusable and other steps, such as performing cvpi watchdog off, cvpi stop all, and then cvpi start all, cvpi watchdog on, have failed. For example: situations in which, regardless of restarts, cvpi status all continues to show some components as having a status of UNHEALTHY or NOT RUNNING.

If a GUI-based backup was saved while the system was healthy, it is possible to redeploy the CVP cluster, restore the backup, and be at the same state within CVP as when the backup was taken. Creating backups on a regular basis is recommended, and is described in "Creating a Backup" on page 299.

There are two ways to completely recover a CVP cluster:

• "VM Redeployment"
• "CVP Re-Install without VM Redeployment"

Note: A good backup is required to proceed with either of these system recoveries.
21.2.1 VM Redeployment

Complete these steps:

Step 1 Delete all the CVP VMs.
Step 2 Redeploy the VMs using the appropriate deployment procedures.
Step 3 Issue a cvpi status all command to ensure all components are running.
Step 4 Log in to the CVP GUI as 'cvpadmin/cvpadmin' to set the cvpadmin password.
Step 5 From the Backup & Restore tab on the Settings page, restore from the backup using the procedures in "Importing a Backup" on page 301 and "Restoring Data" on page 302.
21.2.2 CVP Re-Install without VM Redeployment

Complete these steps:

Step 1 Run cvpReInstall from the Linux shell of the primary node. This may take 15 minutes to complete.

[root@cvp99 ~]# cvpReInstall
Log directory is /tmp/cvpReinstall_17_02_23_01_59_48
Existing /cvpi/cvp-config.yaml will be backed up here
...
Complete
CVP configuration not backed up, please use cvpShell to setup the cluster
CVP Re-install complete, you can now configure the cluster
Step 2 Re-configure using the procedure in "Shell-based Configuration" on page 97. Log in to the Linux shell of each node as 'cvpadmin' (or 'su cvpadmin').
Step 3 Issue a cvpi status all command to ensure all components are running.
Step 4 Log in to the CVP GUI as 'cvpadmin/cvpadmin' to set the cvpadmin password.
Step 5 From the Backup & Restore tab on the Settings page, restore from the backup using the procedures in "Importing a Backup" on page 301 and "Restoring Data" on page 302.
Related topics:
• "Health Checks" on page 323
• "Resource Checks" on page 324
• "Troubleshooting" on page 318
21.3 Health Checks

The following table lists the different types of CVP health checks you can run, including the steps used to run each check and the expected result for each check.

21.3.1 Running Health Checks

Run the cvpi resources command to execute a health check on disk bandwidth. The output of the command indicates whether the disk bandwidth is at a healthy or an unhealthy level. The threshold for healthy disk bandwidth is 20 MB/s.

The possible health statuses are:

• Healthy - Disk bandwidth above 20 MB/s
• Unhealthy - Disk bandwidth at or below 20 MB/s

The output is color coded to make it easy to interpret: green indicates a healthy level and red indicates an unhealthy level (see the example below).
Component: Network connectivity
Steps to Use: ping -f across all nodes.
Expected Result: No packet loss; the network is healthy.

Component: HBase
Steps to Use: echo list | /cvpi/hbase/bin/hbase shell | grep -A 2 'row('
Expected Result: Prints an array of tables in HBase created by CVP. HBase and the underlying infrastructure work.

Component: All daemons running on all nodes (bypassing cvpi status all)
Steps to Use: On all nodes: su - cvp -c "/cvpi/jdk/bin/jps"
Expected Result: On primary and secondary nodes, 9 processes including jps:
• 3149 HMaster
• 2931 NameNode
• 2797 QuorumPeerMain
• 12113 Bootstrap
• 3040 DFSZKFailoverController
• 2828 JournalNode
• 11840 HRegionServer
• 12332 Jps
• 2824 DataNode
On the tertiary node, 6 processes:
• 2434 JournalNode
• 4256 HRegionServer
• 2396 QuorumPeerMain
• 2432 DataNode
• 4546 Jps
• 8243 Bootstrap

Component: Check time is in sync between nodes
Steps to Use: On all nodes, run date +%s.
Expected Result: UTC time should be within a few seconds of each other (typically less than one second). Up to 10 seconds is allowable.

Component: I/O slowness issues
Steps to Use: Use the cvpi resources command to find out whether the disk I/O throughput is at a healthy or an unhealthy level (too low). The disk I/O throughput reported in the command output is measured by the virtual machine.
Expected Result: See "Running Health Checks" on page 323 for an example of the output of the cvpi resources command.
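The date +%s comparison can be scripted. A minimal sketch under the 10-second allowance follows; the helper name is illustrative, and in practice the timestamps would be gathered from each node (for example, over ssh):

```shell
# Hypothetical helper: given the `date +%s` output collected from each
# node, report whether the spread is within the 10-second allowance.
check_skew() {
  max=$1; min=$1
  for t in "$@"; do
    if [ "$t" -gt "$max" ]; then max=$t; fi
    if [ "$t" -lt "$min" ]; then min=$t; fi
  done
  if [ $((max - min)) -le 10 ]; then
    echo "in sync (skew $((max - min))s)"
  else
    echo "clock skew $((max - min))s exceeds the 10s allowance"
  fi
}

# Example with timestamps from three nodes:
check_skew 1487810000 1487810001 1487810009   # prints: in sync (skew 9s)
```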
Example

This example shows output of the cvpi resources command. In this example, the disk bandwidth status is healthy (above the 20 MB/s threshold).

Figure 21-1 Example output of the cvpi resources command
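The healthy/unhealthy classification applied by cvpi resources can be expressed as a simple threshold check. This sketch is illustrative only; the function name and sample values are not part of CVP:

```shell
# Hypothetical classifier mirroring the documented 20 MB/s threshold:
# above 20 MB/s is healthy; at or below 20 MB/s is unhealthy.
classify_bandwidth() {
  # $1 = measured disk bandwidth in MB/s (integer)
  if [ "$1" -gt 20 ]; then
    echo "healthy"
  else
    echo "unhealthy"
  fi
}

classify_bandwidth 48   # prints: healthy
classify_bandwidth 20   # prints: unhealthy (the threshold is "above 20")
```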
Related topics:
• "Resource Checks"
• "Troubleshooting" on page 318
• "Health Checks" on page 323
21.4 Resource Checks

CloudVision Portal (CVP) enables you to run resource checks on CVP node VMs. You can run checks to determine the current data disk size of VMs that you have upgraded to CVP version 2017.2.0, and to determine the current memory allocation for each CVP node VM.

Performing these resource checks is important to ensure that the CVP node VMs in your deployment have the recommended data disk size and memory allocation for using the Telemetry feature. If the resource checks show that the CVP node VM data disk size or memory allocation (RAM) is below the recommended levels, you can increase the data disk size and memory allocation.

These procedures provide detailed instructions on how to perform the resource checks and, if needed, how to increase the CVP node VM data disk size and CVP node VM memory allocation:

• "Running CVP node VM Resource Checks"
• "Increasing Disk Size of VMs Upgraded to CVP Version 2017.2.0" on page 325
• "Increasing CVP Node VM Memory Allocation" on page 327

21.4.1 Running CVP node VM Resource Checks

CloudVision Portal (CVP) enables you to quickly and easily check the current resources of the primary, secondary, and tertiary nodes of a cluster by running a single command: the cvpi resources command.

Use this command to check the following CVP node VM resources:

• Memory allocation
• Data disk size (storage capacity)
• Disk throughput (in MB per second)
• Number of CPUs

Complete the following steps to run the CVP node VM resource check:

Step 1 Log in to one of the CVP nodes as root.
Step 2 Execute the cvpi resources command.

The output shows the current resources for each CVP node VM (see Figure 21-2).

• If the total size of sdb1 (or vdb1) is approximately 120 GB or less, you can increase the disk size to 1 TB (see "Increasing Disk Size of VMs Upgraded to CVP Version 2017.2.0").
• If the memory allocation is the default of 16 GB, you can increase the RAM allocation (see "Increasing CVP Node VM Memory Allocation").

Figure 21-2 Using the cvpi resources command to run CVP node VM resource checks
21.4.2 Increasing Disk Size of VMs Upgraded to CVP Version 2017.2.0

If you upgraded any CVP node VMs running an older version of CVP to version 2017.2.0, you may need to increase the size of the data disk of the VMs so that the data disks have the 1 TB disk image that is used on current CVP node VMs.

CVP node VM data disks that you upgraded to version 2017.2.0 may still have the original disk image (120 GB data image), because the standard upgrade procedure did not upgrade the data disk image. The standard upgrade procedure updated only the root disk, which contains the CentOS image along with RPMs for CVPI, CVP, and Telemetry.

Note: It is recommended that each CVP node have 1 TB of disk space reserved for enabling CVP Telemetry. If the CVP nodes in your current environment do not have the recommended reserved disk space of 1 TB, complete the procedure below for increasing the disk size of CVP node VMs.

Pre-requisites:

Before you begin the procedure, make sure that you:

• Have upgraded to version 2017.2.0. You cannot increase the data disk size until you have completed the upgrade to version 2017.2.0 (see "Upgrading CloudVision Portal (CVP)" on page 304).
• Have performed the resource check to verify that the CVP node VMs have the data disk size image of previous CVP versions (approximately 120 GB or less). See "Running CVP node VM Resource Checks" on page 324.
• Perform a GUI-based backup of the CVP system and copy the backup to a safe location (a location off of the CVP node VMs). The CVP GUI enables you to create a backup you can use to restore CVP data (see "Using the GUI to Backup and Restore Data" on page 298).
Procedure

Complete the following steps to increase the data disk size:

Step 1 Turn off the cvpi service by executing the systemctl stop cvpi command on all nodes in the cluster. (For a single-node installation, run this command on the node.)

Step 2 Run cvpi -v=3 stop all on the primary node.

Step 3 Perform a graceful power-off of all VMs.

Note: You do not need to unregister and re-register VMs from the vSphere Client, or undefine and redefine VMs from the KVM hypervisor.

Step 4 Do the following to increase the size of the data disk to 1 TB using the hypervisor:

• ESX: Using the vSphere client, do the following (see Figure 21-3 for an example):
  a. Select the Virtual Hardware tab, and then select hard disk 2.
  b. Change the setting from 120 GB to 1 TB.
  c. Click OK.
• KVM: Use the qemu-img resize command to resize the data disk from 120 GB to 1 TB. Be sure to select disk2.qcow2.

Figure 21-3 Using vSphere to increase data disk size

Step 5 Power on all CVP node VMs and wait for all services to start.

Step 6 Use the cvpi status all command to verify that all the cvpi services are running.
Step 7 Run the /cvpi/tools/diskResize.py command on the primary node. (Do not run this command on the secondary and tertiary nodes.)
Step 8 Run the df -h /data command on all nodes to verify that the data partition has increased to approximately 1 TB.
Step 9 Wait for all services to start.
Step 10 Use the cvpi -v=3 status all command to verify the status of services.
Step 11 Use systemctl status cvpi to ensure that the cvpi service is running.
Related topics:
• "Increasing CVP Node VM Memory Allocation"
• "Running CVP node VM Resource Checks" on page 324
21.4.3 Increasing CVP Node VM Memory Allocation

If the CVP Open Virtual Appliance (OVA) template currently specifies the default of 16 GB of memory allocated for the CVP node VMs in the CVP cluster, you need to increase the RAM to ensure that the CVP node VMs have adequate memory allocated for using the Telemetry feature.

Note: It is recommended that CVP node VMs have 32 GB of RAM allocated for deployments in which Telemetry is enabled.

You can perform a rolling modification to increase the RAM allocation of every node in the cluster. If you want to keep the service up and available while you are performing the rolling modification, make sure that you perform the procedure on only one CVP node VM at a time.

Once you have completed the procedure on a node, repeat the procedure on another node in the cluster. You must complete the procedure once for every node in the cluster.

Pre-requisites:

Before you begin the procedure, make sure that you:

• Have performed the resource check to verify that the CVP node VMs have the default RAM allocation of 16 GB (see "Running CVP node VM Resource Checks" on page 324).
• Perform a GUI-based backup of the CVP system and copy the backup to a safe location (a location off of the CVP node VMs). The CVP GUI enables you to create a backup you can use to restore CVP data (see "Using the GUI to Backup and Restore Data" on page 298).

Procedure

Complete the following steps to increase the RAM allocation of the CVP node VMs:

Step 1 Log in to a CVP node of the cluster as the cvp user.
Step 2 Using the cvpi status cvp command from the shell, make sure that all nodes in the cluster are operational.

Step 3 Using the vSphere client, shut down one CVP node VM by selecting the node in the left pane and then clicking the Shut down the virtual machine option.
Step 4 On the CVP node VM, increase the memory allocation to 32 GB by right-clicking the node icon and then choosing Edit Settings.

The Virtual Machine Properties dialog appears.

Step 5 Do the following to increase the memory allocation for the CVP node VM:

• Using the Memory Size option, click the up arrow to increase the size to 32 GB.
• Click the OK button.

The memory allocation for the CVP node VM is changed to 32 GB. The page refreshes, showing options to power on the VM or continue making edits to the VM properties.
Step 6 Click the Power on the virtual machine option.

Step 7 Wait for the cluster to reform.

Step 8 Once the cluster is reformed, repeat step 1 through step 7, one node at a time, on each of the remaining CVP node VMs in the cluster.

Related topics:
• "Troubleshooting" on page 318
• "System Recovery" on page 321
• "Health Checks" on page 323
318 Configuration Guide CloudVision version 201720
Troubleshooting Chapter 21 Troubleshooting and Health Checks
211 TroubleshootingThe following table lists the troubleshooting procedures for known issues
Issue Potential Cause Solution
HBase Master and Tomcat showas NOT RUNNING under thefollowing conditions
bull At the end of a shell-basedinstallation or an ISO-basedinstallation
bull After running cvpi status all
The input NTP and DNS serversare not reachable
Check to see if the NTP and DNS servers specifiedduring the installation are reachable
Fix reachability of NTP and DNS servers and rebootthe CVP VM from the console using the sudo init 6command
If after the reboot the problem persists delete the VMand then re-install it
The Hbase is corrupted Try creating configlets in a test container (withoutdevices) Check to see if they are created
CVP behavior seems affectedimmediately after a rebootfollowing an unplanned powerfailure or an unclean shutdown
Using cvpi staus all showshbase still not running after 15minutes
There are multiple potential causes See ldquoCVP Behavior Change Following Powercyclerdquoon page 319 for details on the potential causes and for troubleshooting steps
After upgrading CVP RunTimeexceptions occur on basicoperations (for example addingdevices to the inventory)
Your browser has cached itemsfor the previous version of CVPthat are not valid for the newversion of CVP
Clear your browserrsquos cache cookies and hosted appdata Then refresh the browser and try again
After installing CVP the cvpi start all command fails withmessages about invalid DNSnames
The CVP host names specifiedare not fully qualified domainnames (FQDN)
A re-installation is required
bull Shell-based FQDNs have to be entered whenprompted for CVP host names
bull ISO-based The CVP host names specified in thecvpyaml file must be FQDNs
CVP redirects you to a URL thatyou do not have access rights toview The URL and message are
bull URL httpltyour cvpgtwebunAuthorised
bull Message ldquoYou do not havesufficient privileges to accessthe specified URL Pleasecontact your administratorrdquo
If you access CVP using httpsand a self-signed certificate thecertificate may have expired butis still cached by your browser
Clear your browserrsquos cache cookies hosted app dataand content licenses
Using cvpi staus all showsjust the cvp-frontend and orcvp-backend components as NOTRUNNING or FAILED
On the primary node execute the following commands cvpi watchdog off cvpi stop cvp and then cvpi start cvp cvpi watchdog on (which may take 5-10minutes to execute) If that doesnt result in all services showing as running see ldquoSystemRecoveryrdquo on page 321 to resolve the issue
Installation process ends withsome CVP services failing to start
Using the cvpi status all command some CVP services have the status of NOTRUNNING
On the primary node execute the following commands cvpi watchdog off cvpi stop all and then cvpi start all cvpi watchdog on (which may take 5-10minutes to execute) If that doesnt result in all services showing as running see ldquoSystemRecoveryrdquo on page 321 to resolve the issue
In a multi-node cluster Zookeeperand Hazelcast exceptions occur
There may be issues withnetwork connectivity qualitybetween nodes
Check both the connectivity between nodes (usingping) as well as the quality of the connectivity betweennodes (for example using ping -f)
Ensure the network connectivity and 100 pass rate ofthat connectivity
Chapter 21 Troubleshooting and Health Checks Troubleshooting
Configuration Guide CloudVision version 201720 319
2111 CVP Behavior Change Following Powercycle
You may encounter an unexpected change in the behavior of CVP immediately after a reboot that wasperformed following an unplanned power failure or an unclean shutdown of CVP
For information on the potential causes and details on the troubleshooting steps see
bull ldquoPotential Causesrdquo
bull ldquoConfirming the Causerdquo
bull ldquoTroubleshooting Procedurerdquo on page 320
21111 Potential Causes
The potential causes for CVP behavior changes in this situation include
bull Lease recovery on WAL file fails after power cycleSee httpsissuesapacheorgjirabrowseHDFS-7342 for details
bull Lease on WAL file cannot be released because blocks are replicated in Hadoop
bull Combination of the previous 2 items
21112 Confirming the Cause
The objective of this task is to confirm that the cause is a WAL file lease recovery failure afterpowercycle or a failure to release the WAL file due to blocks being replicated in Hadoop Confirmingthe cause is a simple process that involves reviewing thecvpihbaselogshbase-cvp-master-ltfqdngtlog file
To confirm the cause complete the following steps
Step 1 Open the following log file on primary or secondary node
cvpihbaselogshbase-cvp-master-ltfqdngtlog
Step 2 Go to the last exception in the log (it should be near the end of the log and should have beengenerated within the last 3 minutes of logging activity captured in the log)
Cannot login to CVP
+
System is not synchronizing withntp servers
Nodes are not synchornizing withthe ntpserver and that has leadto a clock skew between thenodes which is more thanallowed by CVP components
Run ntpstat on all nodes Output from all nodesmust say
synchronised to NTP server () at hellip
1 Run service ntpd restart
2 Then wait a few seconds
3 Check ntpstat
4 If time is still not synchronized run
service ntpd stop ntpdate lthostname or IP of an ntpservergt service ntpd start
5 Check ntpstat again
IO slowness issues The disk IO throughput is at anunhealthy level (too low)
Use the cvpi resources command to find outwhether the disk IO throughput is at a healthy level orunhealthy level The disk IO throughput reported inthe command output is measured by the VirtualMachine (See ldquoRunning Health Checksrdquo on page 323for an example of the output of the cvpi resources command)
Issue Potential Cause Solution
320 Configuration Guide CloudVision version 201720
Troubleshooting Chapter 21 Troubleshooting and Health Checks
Step 3 Make sure that the exception in the log file is the same as the exception shown in this table
21113 Troubleshooting Procedure
This procedure provides the troubleshooting steps for situations that meet the conditions specified inthe table above
Pre-requisites
Make sure that you have confirmed the cause (see ldquoConfirming the Causerdquo on page 319)
Complete the following steps to resolve the issue
Step 1 Use the cvpi watchdog off command to disable watchdog
Step 2 Wait 15 minutes for Hadoop to finish replicating blocks
Step 3 Use the cvpi start hbase command to start hbase
Step 4 Use the cvpi status hbase command to verify that hbase is running
Step 5 Do one of the following
bull If hbase is running use the cvpi watchdog on command to re-enable watchdog and thenwait for services to come up
bull If hbase is not running go to system recovery to resolve the issue (see ldquoSystem Recoveryrdquoon page 321)
Related topics
bull ldquoSystem Recoveryrdquo on page 321
bull ldquoHealth Checksrdquo on page 323
bull ldquoResource Checksrdquo on page 324
Exception found in log
orgapachehadoopipcRemoteException(orgapachehadoophdfsprotocolAlreadyBeingCreatedException) DIR NameSysteminternalReleaseLease Failed to release lease for file hbaseMasterProcWALsstate-00000000000000000ltnumbergtlogCommitted blocks are waiting to be minimally replicated Try again later
Chapter 21 Troubleshooting and Health Checks System Recovery
Configuration Guide CloudVision version 201720 321
212 System RecoverySystem recovery should be used only when the CVP cluster has become unusable and other stepssuch as performing a cvpi watchdog off cvpi stop all and then cvpi start all cvpi watchdog on have failed For example situations in which regardless of restarts a cvpi status allcontinues to show some components as having a status of UNHEALTHY or NOT RUNNING
If a GUI-based backup has been saved while the system was healthy it is possible to redeploy the CVPcluster restore the backup and be at the same state within CVP as when the backup was takenCreating backups on a regular basis is recommended and described in ldquoCreating a Backuprdquo onpage 299
There are two ways to completely recover a CVP cluster
bull ldquoVM Redeploymentrdquo
bull ldquoCVP Re-Install without VM Redeploymentrdquo
Note A good backup is required to proceed with either of these system recoveries
2121 VM Redeployment
Complete these steps
Step 1 Delete all the CVP VMs
Step 2 Redeploy the VMs using the procedures in
Step 3 Issue a ldquocvpi status allrdquo command to ensure all components are running
Step 4 Login to the CVP GUI as lsquocvpadmincvpadminrsquo to set the cvpadmin password
Step 5 From the Backup amp Restore tab on the Setting page restore from the backup using theprocedures in ldquoImporting a Backuprdquo on page 301 and ldquoRestoring Datardquo on page 302
2122 CVP Re-Install without VM Redeployment
Complete these steps
Step 1 Run lsquocvpReInstall from the Linux shell of the primary node This may take 15 minutes tocomplete[rootcvp99 ~] cvpReInstall0Log directory is tmpcvpReinstall_17_02_23_01_59_48Existing cvpicvp-configyaml will be backed up herehelliphellipComplete
CVP configuration not backed up please use cvpShell to setup the cluster
CVP Re-install complete you can now configure the cluster
322 Configuration Guide CloudVision version 201720
System Recovery Chapter 21 Troubleshooting and Health Checks
Step 2 Re-configure using the procedure in ldquoShell-based Configurationrdquo on page 97 Log into theLinux shell of each node as lsquocvpadminrsquo or lsquosu cvpadminrsquo
Step 3 Issue a cvpi status all command to ensure all components are running
Step 4 Login to the CVP GUI as lsquocvpadmincvpadminrsquo to set the cvpadmin password
Step 5 From the Backup amp Restore tab on the Setting page restore from the backup using theprocedures in ldquoImporting a Backuprdquo on page 301 and ldquoRestoring Datardquo on page 302
Related topics:

• "Health Checks" on page 323
• "Resource Checks" on page 324
• "Troubleshooting" on page 318
21.3 Health Checks

The following table lists the different types of CVP health checks you can run, including the steps used to run each check and the expected result for each check.
21.3.1 Running Health Checks

Run the cvpi resources command to execute a health check on disk bandwidth. The output of the command indicates whether the disk bandwidth is at a healthy or unhealthy level. The threshold for healthy disk bandwidth is 20 MB/s.

The possible health statuses are:

• Healthy: disk bandwidth above 20 MB/s
• Unhealthy: disk bandwidth at or below 20 MB/s

The output is color coded to make it easy to interpret: green indicates a healthy level and red indicates an unhealthy level (see the example below).
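As an illustration of what such a threshold check involves (this is a sketch, not the actual cvpi implementation), the script below times a small sequential write and classifies the result against the 20 MB/s threshold. The probe file path and size are arbitrary choices for this example.

```shell
#!/bin/sh
# Time a small sequential write and classify it against the 20 MB/s
# threshold. Illustrative probe only; not what cvpi itself runs.
probe="${TMPDIR:-/tmp}/bw_probe.$$"
size_mb=8

start=$(date +%s%N)                      # nanoseconds since epoch (GNU date)
dd if=/dev/zero of="$probe" bs=1M count="$size_mb" 2>/dev/null
sync
end=$(date +%s%N)
rm -f "$probe"

elapsed_ms=$(( (end - start) / 1000000 ))
[ "$elapsed_ms" -lt 1 ] && elapsed_ms=1  # avoid divide-by-zero
bw_mbs=$(( size_mb * 1000 / elapsed_ms ))

if [ "$bw_mbs" -gt 20 ]; then
  verdict="Healthy"
else
  verdict="Unhealthy"
fi
echo "$verdict - disk bandwidth ${bw_mbs} MB/s (threshold 20 MB/s)"
```

Because the write lands in the page cache before sync, the number is only a rough proxy for sustained disk bandwidth, which is all a quick health probe needs.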
Component: Network connectivity
Steps to use: Run ping -f across all nodes.
Expected result: No packet loss; the network is healthy.

Component: HBase
Steps to use: echo "list" | /cvpi/hbase/bin/hbase shell | grep -A 2 "row("
Expected result: Prints an array of the tables in HBase created by CVP; HBase and the underlying infrastructure work.

Component: All daemons running on all nodes (bypassing cvpi status all)
Steps to use: On all nodes, run su - cvp -c "/cvpi/jdk/bin/jps".
Expected result: On the primary and secondary nodes, 9 processes including jps:
3149 HMaster
2931 NameNode
2797 QuorumPeerMain
12113 Bootstrap
3040 DFSZKFailoverController
2828 JournalNode
11840 HRegionServer
12332 Jps
2824 DataNode
On the tertiary node, 6 processes:
2434 JournalNode
4256 HRegionServer
2396 QuorumPeerMain
2432 DataNode
4546 Jps
8243 Bootstrap

Component: Time in sync between nodes
Steps to use: On all nodes, run date +%s.
Expected result: UTC time should be within a few seconds of each other (typically less than one second). Up to 10 seconds is allowable.

Component: I/O slowness issues (disk I/O throughput at an unhealthy level, i.e. too low)
Steps to use: Use the cvpi resources command to find out whether the disk I/O throughput is at a healthy or unhealthy level. The disk I/O throughput reported in the command output is measured by the virtual machine.
Expected result: See "Running Health Checks" on page 323 for an example of the output of the cvpi resources command.
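The daemon and clock rows of the table can be scripted. In the sketch below, the jps listing and the per-node timestamps are fabricated samples (real values would come from running the table's commands on each node); the required-daemon list follows the table above.

```shell
#!/bin/sh
# Fabricated jps output standing in for: su - cvp -c "/cvpi/jdk/bin/jps"
jps_out="3149 HMaster
2931 NameNode
2797 QuorumPeerMain
12113 Bootstrap
3040 DFSZKFailoverController
2828 JournalNode
11840 HRegionServer
12332 Jps
2824 DataNode"

# Check that every daemon expected on a primary/secondary node is present.
missing=0
for d in HMaster NameNode QuorumPeerMain Bootstrap DFSZKFailoverController \
         JournalNode HRegionServer Jps DataNode; do
  echo "$jps_out" | grep -qw "$d" || { echo "MISSING: $d"; missing=1; }
done
[ "$missing" -eq 0 ] && echo "All expected daemons present"

# Fabricated per-node `date +%s` samples; skew must stay within 10 seconds.
min=""; max=""
for t in 1700000000 1700000003 1700000001; do
  { [ -z "$min" ] || [ "$t" -lt "$min" ]; } && min=$t
  { [ -z "$max" ] || [ "$t" -gt "$max" ]; } && max=$t
done
skew=$(( max - min ))
if [ "$skew" -le 10 ]; then
  echo "Clock skew ${skew}s: within the 10-second allowance"
else
  echo "Clock skew ${skew}s: exceeds the 10-second allowance"
fi
```

Note that the check keys on daemon names rather than PIDs, since the PIDs in the table are examples and vary per boot.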
Example

This example shows output of the cvpi resources command. In this example, the disk bandwidth status is healthy (above the 20 MB/s threshold).

Figure 21-1 Example output of the cvpi resources command
Related topics:

• "Resource Checks"
• "Troubleshooting" on page 318
• "Health Checks" on page 323
21.4 Resource Checks

CloudVision Portal (CVP) enables you to run resource checks on CVP node VMs. You can run checks to determine the current data disk size of VMs that you have upgraded to CVP version 2017.2.0, and to determine the current memory allocation for each CVP node VM.

Performing these resource checks is important to ensure that the CVP node VMs in your deployment have the recommended data disk size and memory allocation for using the Telemetry feature. If the resource checks show that the CVP node VM data disk size or memory allocation (RAM) is below the recommended levels, you can increase the data disk size and memory allocation.

These procedures provide detailed instructions on how to perform the resource checks and, if needed, how to increase the CVP node VM data disk size and CVP node VM memory allocation:
• "Running CVP node VM Resource Checks"
• "Increasing Disk Size of VMs Upgraded to CVP Version 2017.2.0" on page 325
• "Increasing CVP Node VM Memory Allocation" on page 327
21.4.1 Running CVP node VM Resource Checks

CloudVision Portal (CVP) enables you to quickly and easily check the current resources of the primary, secondary, and tertiary nodes of a cluster by running a single command: the cvpi resources command.
Use this command to check the following CVP node VM resources:

• Memory allocation
• Data disk size (storage capacity)
• Disk throughput (in MB per second)
• Number of CPUs
Complete the following steps to run the CVP node VM resource check:

Step 1 Log in to one of the CVP nodes as root.
Step 2 Execute the cvpi resources command.

The output shows the current resources for each CVP node VM (see Figure 21-2).

• If the total size of sdb1 (or vdb1) is approximately 120G or less, you can increase the disk size to 1TB (see "Increasing Disk Size of VMs Upgraded to CVP Version 2017.2.0").
• If the memory allocation is the default of 16GB, you can increase the RAM allocation (see "Increasing CVP Node VM Memory Allocation").

Figure 21-2 Using the cvpi resources command to run CVP node VM resource checks
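As a sketch of the interpretation step above, the check below classifies fabricated resource figures against the two thresholds just described; the sample values are invented for illustration, not taken from a real cvpi resources run.

```shell
#!/bin/sh
# Fabricated sample values; on a real node these come from `cvpi resources`.
data_disk_gb=120
mem_gb=16

if [ "$data_disk_gb" -le 120 ]; then
  disk_action="grow"
  echo "Data disk ${data_disk_gb}G: original image, can be grown to 1TB"
else
  disk_action="none"
  echo "Data disk ${data_disk_gb}G: already at the larger image"
fi

if [ "$mem_gb" -le 16 ]; then
  mem_action="increase"
  echo "RAM ${mem_gb}GB: increase to 32GB before enabling Telemetry"
else
  mem_action="none"
  echo "RAM ${mem_gb}GB: at or above the recommended allocation"
fi
```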
21.4.2 Increasing Disk Size of VMs Upgraded to CVP Version 2017.2.0

If you already upgraded any CVP node VMs running an older version of CVP to version 2017.2.0, you may need to increase the size of the data disk of the VMs so that the data disks have the 1TB disk image that is used on current CVP node VMs.

CVP node VM data disks that you upgraded to version 2017.2.0 may still have the original disk image (120GB data image), because the standard upgrade procedure did not upgrade the data disk image. The standard upgrade procedure updated only the root disk, which contains the CentOS image along with the RPMs for CVPI, CVP, and Telemetry.

Note: It is recommended that each CVP node have 1TB of disk space reserved for enabling CVP Telemetry. If the CVP nodes in your current environment do not have the recommended reserved disk space of 1TB, complete the procedure below for increasing the disk size of CVP node VMs.
Pre-requisites

Before you begin the procedure, make sure that you:

• Have upgraded to version 2017.2.0. You cannot increase the data disk size until you have completed the upgrade to version 2017.2.0 (see "Upgrading CloudVision Portal (CVP)" on page 304).
• Have performed the resource check to verify that the CVP node VMs have the data disk size image of previous CVP versions (approximately 120GB or less). See "Running CVP node VM Resource Checks" on page 324.
• Have performed a GUI-based backup of the CVP system and copied the backup to a safe location (a location off of the CVP node VMs). The CVP GUI enables you to create a backup you can use to restore CVP data (see "Using the GUI to Backup and Restore Data" on page 298).
Procedure

Complete the following steps to increase the data disk size:

Step 1 Turn off the cvpi service by executing the systemctl stop cvpi command on all nodes in the cluster. (For a single-node installation, run this command on the node.)
Step 2 Run cvpi -v=3 stop all on the primary node.
Step 3 Perform a graceful power-off of all VMs.

Note: You do not need to unregister and re-register VMs from the vSphere Client, or undefine and redefine VMs from the KVM hypervisor.

Step 4 Increase the size of the data disk to 1TB using the hypervisor:

• ESX: Using the vSphere client, do the following (see Figure 21-3 for an example):
  a. Select the Virtual Hardware tab, and then select hard disk 2.
  b. Change the setting from 120GB to 1TB.
  c. Click OK.
• KVM: Use the qemu-img resize command to resize the data disk from 120GB to 1TB. Be sure to select disk2.qcow2.

Figure 21-3 Using vSphere to increase data disk size

Step 5 Power on all CVP node VMs and wait for all services to start.
Step 6 Use the cvpi status all command to verify that all the cvpi services are running.
Step 7 Run the /cvpi/tools/diskResize.py command on the primary node. (Do not run this command on the secondary and tertiary nodes.)
Step 8 Run the df -h /data command on all nodes to verify that /data has increased to approximately 1TB.
Step 9 Wait for all services to start.
Step 10 Use the cvpi -v=3 status all command to verify the status of services.
Step 11 Use the systemctl status cvpi command to ensure that the cvpi service is running.
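Step 8's verification can be scripted. The df output below is a fabricated sample (device name and sizes are invented); a real check would parse df -h /data on each node.

```shell
#!/bin/sh
# Fabricated `df -h /data` output; on a real node: df_out=$(df -h /data)
df_out="Filesystem      Size  Used Avail Use% Mounted on
/dev/sdb1      1008G   72G  886G   8% /data"

# Field 2 of the data row is the filesystem size.
size=$(printf '%s\n' "$df_out" | awk 'NR==2 {print $2}')

# "Approximately 1TB" here means a size reported in T, or in the
# 900G-1099G range; df -h rounds, so exact 1024G is not expected.
case "$size" in
  *T|9??G|10??G) verdict="ok";  echo "/data is $size: approximately 1TB" ;;
  *)             verdict="bad"; echo "/data is $size: resize did not take effect" ;;
esac
```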
Related topics:

• "Increasing CVP Node VM Memory Allocation"
• "Running CVP node VM Resource Checks" on page 324
21.4.3 Increasing CVP Node VM Memory Allocation

If the CVP Open Virtual Appliance (OVA) template currently specifies the default of 16GB of memory allocated for the CVP node VMs in the CVP cluster, you need to increase the RAM to ensure that the CVP node VMs have adequate memory allocated for using the Telemetry feature.

Note: It is recommended that CVP node VMs have 32GB of RAM allocated for deployments in which Telemetry is enabled.

You can perform a rolling modification to increase the RAM allocation of every node in the cluster. If you want to keep the service up and available while you are performing the rolling modification, make sure that you perform the procedure on only one CVP node VM at a time.

Once you have completed the procedure on a node, repeat the procedure on another node in the cluster. You must complete the procedure once for every node in the cluster.
Pre-requisites

Before you begin the procedure, make sure that you:

• Have performed the resource check to verify that the CVP node VMs have the default RAM allocation of 16GB (see "Running CVP node VM Resource Checks" on page 324).
• Have performed a GUI-based backup of the CVP system and copied the backup to a safe location (a location off of the CVP node VMs). The CVP GUI enables you to create a backup you can use to restore CVP data (see "Using the GUI to Backup and Restore Data" on page 298).
Procedure

Complete the following steps to increase the RAM allocation of the CVP node VMs:

Step 1 Log in to a CVP node of the cluster as the cvp user.
Step 2 Using the cvpi status cvp shell command, make sure that all nodes in the cluster are operational.
Step 3 Using the vSphere client, shut down one CVP node VM by selecting the node in the left pane and then clicking the Shut down the virtual machine option.
Step 4 On the CVP node VM, begin increasing the memory allocation to 32GB by right-clicking the node icon and then choosing Edit Settings.

The Virtual Machine Properties dialog appears.

Step 5 Do the following to increase the memory allocation for the CVP node VM:

• Using the Memory Size option, click the up arrow to increase the size to 32GB.
• Click the OK button.

The memory allocation for the CVP node VM is changed to 32GB. The page refreshes, showing options to power on the VM or continue making edits to the VM properties.
Step 6 Click the Power on the virtual machine option.
Step 7 Wait for the cluster to reform.
Step 8 Once the cluster is reformed, repeat step 1 through step 7, one node at a time, on each of the remaining CVP node VMs in the cluster.

Related topics:

• "Troubleshooting" on page 318
• "System Recovery" on page 321
• "Health Checks" on page 323
21.1.1 CVP Behavior Change Following Powercycle

You may encounter an unexpected change in the behavior of CVP immediately after a reboot that was performed following an unplanned power failure or an unclean shutdown of CVP.

For information on the potential causes and details on the troubleshooting steps, see:

• "Potential Causes"
• "Confirming the Cause"
• "Troubleshooting Procedure" on page 320
21.1.1.1 Potential Causes

The potential causes for CVP behavior changes in this situation include:

• Lease recovery on the WAL file fails after the power cycle. See https://issues.apache.org/jira/browse/HDFS-7342 for details.
• The lease on the WAL file cannot be released because blocks are being replicated in Hadoop.
• A combination of the previous two items.
21.1.1.2 Confirming the Cause

The objective of this task is to confirm that the cause is a WAL file lease recovery failure after the powercycle, or a failure to release the WAL file due to blocks being replicated in Hadoop. Confirming the cause is a simple process that involves reviewing the /cvpi/hbase/logs/hbase-cvp-master-<fqdn>.log file.

To confirm the cause, complete the following steps:

Step 1 Open the following log file on the primary or secondary node:

/cvpi/hbase/logs/hbase-cvp-master-<fqdn>.log

Step 2 Go to the last exception in the log. (It should be near the end of the log, and should have been generated within the last 3 minutes of logging activity captured in the log.)
Troubleshooting table (continued):

Issue: Cannot log in to CVP.
Potential cause: The system is not synchronizing with NTP servers. The nodes are not synchronizing with the NTP server, and that has led to a clock skew between the nodes that is larger than allowed by CVP components.
Solution: Run ntpstat on all nodes. Output from all nodes must say "synchronised to NTP server (...) at ...". If it does not:
1. Run service ntpd restart.
2. Then wait a few seconds.
3. Check ntpstat.
4. If time is still not synchronized, run:
service ntpd stop
ntpdate <hostname or IP of an NTP server>
service ntpd start
5. Check ntpstat again.

Issue: I/O slowness issues.
Potential cause: The disk I/O throughput is at an unhealthy level (too low).
Solution: Use the cvpi resources command to find out whether the disk I/O throughput is at a healthy or unhealthy level. The disk I/O throughput reported in the command output is measured by the virtual machine. (See "Running Health Checks" on page 323 for an example of the output of the cvpi resources command.)
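A script watching for the healthy ntpstat result could key on the leading word of its output, as the sketch below does. The output string here is a fabricated sample; a real check would capture out=$(ntpstat) on each node.

```shell
#!/bin/sh
# Fabricated ntpstat output; on a real node: out=$(ntpstat)
out="synchronised to NTP server (192.0.2.10) at stratum 3"

case "$out" in
  synchronised*)   ntp_state="ok";      echo "NTP sync OK" ;;
  unsynchronised*) ntp_state="bad";     echo "Not synchronised: restart ntpd and re-check" ;;
  *)               ntp_state="unknown"; echo "Unexpected ntpstat output" ;;
esac
```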
320 Configuration Guide CloudVision version 201720
Troubleshooting Chapter 21 Troubleshooting and Health Checks
Step 3 Make sure that the exception in the log file is the same as the exception shown under "Exception found in log" below.
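Steps 1 through 3 can be approximated with grep. The log excerpt below is fabricated for illustration; on a real node the input would be the /cvpi/hbase/logs/hbase-cvp-master-<fqdn>.log file.

```shell
#!/bin/sh
# Fabricated log excerpt standing in for the hbase-cvp-master log.
log="2017-02-23 01:55:01,120 INFO  master.HMaster: Master started
2017-02-23 01:58:30,455 ERROR ipc.RemoteException: AlreadyBeingCreatedException: Failed to release lease
2017-02-23 01:58:31,002 INFO  master.HMaster: retrying"

# Pull the last exception line, as Step 2 directs.
last_exc=$(printf '%s\n' "$log" | grep -i 'exception' | tail -n 1)
echo "$last_exc"

case "$last_exc" in
  *AlreadyBeingCreatedException*) echo "Matches the WAL lease failure signature" ;;
  *) echo "Different exception: this procedure does not apply" ;;
esac
```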
21.1.1.3 Troubleshooting Procedure

This procedure provides the troubleshooting steps for situations that meet the conditions specified in the table above.

Pre-requisites

Make sure that you have confirmed the cause (see "Confirming the Cause" on page 319).

Complete the following steps to resolve the issue:

Step 1 Use the cvpi watchdog off command to disable the watchdog.
Step 2 Wait 15 minutes for Hadoop to finish replicating blocks.
Step 3 Use the cvpi start hbase command to start HBase.
Step 4 Use the cvpi status hbase command to verify that HBase is running.
Step 5 Do one of the following:

• If HBase is running, use the cvpi watchdog on command to re-enable the watchdog, and then wait for services to come up.
• If HBase is not running, go to system recovery to resolve the issue (see "System Recovery" on page 321).
Related topics:

• "System Recovery" on page 321
• "Health Checks" on page 323
• "Resource Checks" on page 324
Exception found in log:

org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.protocol.AlreadyBeingCreatedException): DIR* NameSystem.internalReleaseLease: Failed to release lease for file /hbase/MasterProcWALs/state-00000000000000000<number>.log. Committed blocks are waiting to be minimally replicated. Try again later.
212 System RecoverySystem recovery should be used only when the CVP cluster has become unusable and other stepssuch as performing a cvpi watchdog off cvpi stop all and then cvpi start all cvpi watchdog on have failed For example situations in which regardless of restarts a cvpi status allcontinues to show some components as having a status of UNHEALTHY or NOT RUNNING
If a GUI-based backup has been saved while the system was healthy it is possible to redeploy the CVPcluster restore the backup and be at the same state within CVP as when the backup was takenCreating backups on a regular basis is recommended and described in ldquoCreating a Backuprdquo onpage 299
There are two ways to completely recover a CVP cluster
bull ldquoVM Redeploymentrdquo
bull ldquoCVP Re-Install without VM Redeploymentrdquo
Note A good backup is required to proceed with either of these system recoveries
2121 VM Redeployment
Complete these steps
Step 1 Delete all the CVP VMs
Step 2 Redeploy the VMs using the procedures in
Step 3 Issue a ldquocvpi status allrdquo command to ensure all components are running
Step 4 Login to the CVP GUI as lsquocvpadmincvpadminrsquo to set the cvpadmin password
Step 5 From the Backup amp Restore tab on the Setting page restore from the backup using theprocedures in ldquoImporting a Backuprdquo on page 301 and ldquoRestoring Datardquo on page 302
2122 CVP Re-Install without VM Redeployment
Complete these steps
Step 1 Run lsquocvpReInstall from the Linux shell of the primary node This may take 15 minutes tocomplete[rootcvp99 ~] cvpReInstall0Log directory is tmpcvpReinstall_17_02_23_01_59_48Existing cvpicvp-configyaml will be backed up herehelliphellipComplete
CVP configuration not backed up please use cvpShell to setup the cluster
CVP Re-install complete you can now configure the cluster
322 Configuration Guide CloudVision version 201720
System Recovery Chapter 21 Troubleshooting and Health Checks
Step 2 Re-configure using the procedure in ldquoShell-based Configurationrdquo on page 97 Log into theLinux shell of each node as lsquocvpadminrsquo or lsquosu cvpadminrsquo
Step 3 Issue a cvpi status all command to ensure all components are running
Step 4 Login to the CVP GUI as lsquocvpadmincvpadminrsquo to set the cvpadmin password
Step 5 From the Backup amp Restore tab on the Setting page restore from the backup using theprocedures in ldquoImporting a Backuprdquo on page 301 and ldquoRestoring Datardquo on page 302
Related topics
bull ldquoHealth Checksrdquo on page 323
bull ldquoResource Checksrdquo on page 324
bull ldquoTroubleshootingrdquo on page 318
Chapter 21 Troubleshooting and Health Checks Health Checks
Configuration Guide CloudVision version 201720 323
213 Health ChecksThe following table lists the different types of CVP health checks you can run including the steps touse to run each check and the expected result for each check
2131 Running Health Checks
Run the cvpi resources command to execute a health check on disk bandwidth The output of thecommand indicates whether the disk bandwidth is at a healthy level or unhealthy level The thresholdfor healthy disk bandwith is 20MBS
The possible health statuses are
bull Healthy - Disk bandwidth above 20MBs
bull Unhealthy - Disk bandwidth at or below 20MBs
The output is color coded to make it easy to interpret the output Green indicates a healthy leveland red indicates an unhealthy level (see the example below)
Component Steps to Use Expected Result
Network connectivity ping -f across all nodes No packet loss network is healthy
HBase echo list | cvpihbasebinhbase shell |grep -A 2 row(
Prints an array of tables in Hbase created by CVPHbase and the underlying infrastructure works
All daemons running on allnodes bypass cvpi status all
On all nodes
su - cvp -c ldquocvpijdkbinjpsrdquo
On primary and secondary nodes 9 processesincluding jps
bull 3149 HMasterbull 2931 NameNodebull 2797 QuorumPeerMainbull 12113 Bootstrapbull 3040 DFSZKFailoverControllerbull 2828 JournalNodebull 11840 HRegionServerbull 12332 Jpsbull 2824 DataNode
On tertiary 6 processes
bull 2434 JournalNodebull 4256 HRegionServerbull 2396 QuorumPeerMainbull 2432 DataNodebull 4546 Jpsbull 8243 Bootstrap
Check time is in syncbetween nodes
On all nodes run ldquodate +srdquo UTC time should be within a few seconds of each other(typically less than one second) Up to 10 seconds isallowable
IO slowness issues The disk IO throughput is at anunhealthy level (too low)
Use the cvpi resources command to find outwhether the disk IO throughput is at a healthy level orunhealthy level The disk IO throughput reported inthe command output is measured by the VirtualMachine
See ldquoRunning Health Checksrdquo on page 323 for anexample of the output of the cvpi resources command
324 Configuration Guide CloudVision version 201720
Resource Checks Chapter 21 Troubleshooting and Health Checks
Example
This example shows output of the cvpi resources command In this example the disk bandwidthstatus is healthy (above the 20MBs threshold)
Figure 21-1 Example output of cvpi resources command
Related topics
bull ldquoResource Checksrdquo
bull ldquoTroubleshootingrdquo on page 318
bull ldquoHealth Checksrdquo on page 323
214 Resource ChecksCloudVision Portal (CVP) enables you to run resource checks on CVP node VMs You can run checksto determine the current data disk size of VMs that you have upgraded to CVP version 201720 andto determine the current memory allocation for each CVP node VM
Performing these resource checks is important to ensure that the CVP node VMs in your deploymenthave the recommended data disk size and memory allocation for using the Telemetry feature If theresource checks show that the CVP node VM data disk size or memory allocation (RAM) are below therecommended levels you can increase the data disk size and memory allocation
These procedures provide detailed instructions on how to perform the resource checks and if neededhow to increase the CVP node VM data disk size and CVP node VM memory allocation
bull ldquoRunning CVP node VM Resource Checksrdquo
bull ldquoIncreasing Disk Size of VMs Upgraded to CVP Version 201720rdquo on page 325
bull ldquoIncreasing CVP Node VM Memory Allocationrdquo on page 327
2141 Running CVP node VM Resource Checks
CloudVision Portal (CVP) enables you to quickly and easily check the current resources of the primarysecondary and tertiary nodes of a cluster by running a single command The command you use is thecvpi resources command
Use this command to check the following CVP node VM resources
bull Memory allocation
bull Data disk size (storage capacity)
bull Disk throughput (in MB per second)
bull Number of CPUs
Complete the following steps to run the CVP node VM resource check
Step 1 Login to one of the CVP nodes as root
Chapter 21 Troubleshooting and Health Checks Resource Checks
Configuration Guide CloudVision version 201720 325
Step 2 Execute the cvpi resources command
The output shows the current resources for each CVP node VM (see Figure 21-2)
bull If the total size of sdb1 (or vdb1) is approximately 120G or less you can increase the disksize to 1TB (see ldquoIncreasing Disk Size of VMs Upgraded to CVP Version 201720rdquo)
bull If the memory allocation is the default of 16GB you can increase the RAM memoryallocation (see ldquoIncreasing CVP Node VM Memory Allocationrdquo)
Figure 21-2 Using the cvpi resource command to run CVP node VM resource checks
2142 Increasing Disk Size of VMs Upgraded to CVP Version 201720
If you already upgraded any CVP node VMs running an older version of CVP to version 201720 youmay need to increase the size of the data disk of the VMs so that the data disks have the 1TB diskimage that is used on current CVP node VMs
CVP node VM data disks that you upgraded to version 201720 may still have the original disk image(120GB data image) because the standard upgrade procedure did not upgrade the data disk imageThe standard upgrade procedure updated only the root disk which contains the Centos image alongwith rpms for CVPI CVP and Telemetry
Note It is recommended that each CVP node have 1TB of disk space reserved for enabling CVP TelemetryIf the CVP nodes in your current environment do not have the recommended reserved disk space of1TB complete the procedure below for increasing the disk size of CVP node VMs
Pre-requisites
Before you begin the procedure make sure that you
bull Have upgraded to version 201720 You cannot increase the data disk size until you havecompleted the upgrade to version 201720 (see ldquoUpgrading CloudVision Portal (CVP)rdquo onpage 304)
bull Have performed the resource check to verify that the CVP node VMs have the data disk size imageof previous CVP versions (approximately 120GB or less) See ldquoRunning CVP node VM ResourceChecksrdquo on page 324
bull Make sure that you perform a GUI-based backup of the CVP system and copy the backup to a safelocation (a location off of the CVP node VMs) The CVP GUI enables you to create a backup youcan use to restore CVP data (see ldquoUsing the GUI to Backup and Restore Datardquo on page 298)
326 Configuration Guide CloudVision version 201720
Resource Checks Chapter 21 Troubleshooting and Health Checks
Procedure
Complete the following steps to increase the data disk size
Step 1 Turn off cvpi service by executing the systemctl stop cvpi command on all nodes in thecluster (For a single-node installation run this command on the node)
Step 2 Run the cvpi -v=3 stop all on the primary node
Step 3 Perform a graceful power-off of all VMs
Note You do not need to unregister and re-register VMs from vSphere Client or undefine and redefine VMsfrom kvm hypervisor
Step 4 Do the following to increase the size of the data disk to 1TB using the hypervisor
bull ESX Using vSphere client do the following (see Figure 21-3 for an example)a Select the Virtual Hardware tab and then select hard disk 2b Change the setting from 120GB to 1TBc Click OK
bull KVM Use the qemu-img resize command to resize the data disk from 120GB to 1TB Besure to select disk2qcow2
Figure 21-3 Using vSphere to increase data disk size
Step 5 Power on all CVP node VMs and wait for all services to start
Step 6 Use the cvpi status all command to verify that all the cvpi services are running
Chapter 21 Troubleshooting and Health Checks Resource Checks
Configuration Guide CloudVision version 201720 327
Step 7 Run the cvpitoolsdiskResizepy command on the primary node (Do not run thiscommand on the secondary and tertiary nodes)
Step 8 Run the df -h data command on all nodes to verify that the data is increased toapproximately 1TB
Step 9 Wait for all services to start
Step 10 Use the cvpi -v=3 status all command to verify the status of services
Step 11 Use the systemctl status cvpi to ensure that cvpi service is running
Related topics
bull ldquoIncreasing CVP Node VM Memory Allocationrdquo
bull ldquoRunning CVP node VM Resource Checksrdquo on page 324
2143 Increasing CVP Node VM Memory Allocation
If the CVP Open Virtual Appliance (OVA) template currently specifies the default of 16GB of memoryallocated for the CVP node VMs in the CVP cluster you need to increase the RAM to ensure that theCVP node VMs have adequate memory allocated for using the Telemetry feature
Note It is recommended that CVP node VMs have 32GB of RAM allocated for deployments in whichTelemetry is enabled
You can perform a rolling modification to increase the RAM allocation of every node in the cluster Ifyou want to keep the service up and available while you are performing the rolling modification makesure that you perform the procedure on only one CVP node VM at a time
Once you have completed the procedure on a node you repeat the procedure on another node in thecluster You must complete the procedure once for every node in the cluster
Pre-requisites
Before you begin the procedure make sure that you
bull Have performed the resource check to verify that the CVP node VMs have the default RAMmemory allocation of 16GB (see ldquoRunning CVP node VM Resource Checksrdquo on page 324)
bull Make sure that you perform a GUI-based backup of the CVP system and copy the backup to a safelocation (a location off of the CVP node VMs) The CVP GUI enables you to create a backup youcan use to restore CVP data (see ldquoRunning CVP node VM Resource Checksrdquo on page 324)
Procedure
Complete the following steps to increase the RAM memory allocation of the CVP node VMs
Step 1 Login to a CVP node of the cluster as cvp user
328 Configuration Guide CloudVision version 201720
Resource Checks Chapter 21 Troubleshooting and Health Checks
Step 2 Using the cvpi status cvp shell command make sure that all nodes in the cluster areoperational
Step 3 Using vSphere client shutdown one CVP node VM by selecting the node in the left pane andthen click the Shut down the virtual machine option
Chapter 21 Troubleshooting and Health Checks Resource Checks
Configuration Guide CloudVision version 201720 329
Step 4 On the CVP node VM increase the memory allocation to 32GB by right-clicking the node iconand then choose Edit Settings
The Virtual Machine Properties dialog appears
Step 5 Do the following to increase the memory allocation for the CVP node VM
bull Using the Memory Size option click the up arrow to increase the size to 32GB
bull Click the OK button
The memory allocation for the CVP node VM is changed to 32GB The page refreshesshowing options to power on the VM or continue making edits to the VM properties
330 Configuration Guide CloudVision version 201720
Resource Checks Chapter 21 Troubleshooting and Health Checks
Step 6 Click the Power on the virtual machine option
Step 7 Wait for the cluster to reform
Step 8 Once the cluster is reformed repeat step 1 through step 7 one node at a time on each of theremaining CVP node VMs in the cluster
Related topics
bull ldquoTroubleshootingrdquo on page 318
bull ldquoSystem Recoveryrdquo on page 321
bull ldquoHealth Checksrdquo on page 323
320 Configuration Guide CloudVision version 201720
Troubleshooting Chapter 21 Troubleshooting and Health Checks
Step 3 Make sure that the exception in the log file is the same as the exception shown in this table
21113 Troubleshooting Procedure
This procedure provides the troubleshooting steps for situations that meet the conditions specified inthe table above
Pre-requisites
Make sure that you have confirmed the cause (see ldquoConfirming the Causerdquo on page 319)
Complete the following steps to resolve the issue
Step 1 Use the cvpi watchdog off command to disable watchdog
Step 2 Wait 15 minutes for Hadoop to finish replicating blocks
Step 3 Use the cvpi start hbase command to start hbase
Step 4 Use the cvpi status hbase command to verify that hbase is running
Step 5 Do one of the following
bull If hbase is running use the cvpi watchdog on command to re-enable watchdog and thenwait for services to come up
bull If hbase is not running go to system recovery to resolve the issue (see ldquoSystem Recoveryrdquoon page 321)
Related topics
bull ldquoSystem Recoveryrdquo on page 321
bull ldquoHealth Checksrdquo on page 323
bull ldquoResource Checksrdquo on page 324
Exception found in log
orgapachehadoopipcRemoteException(orgapachehadoophdfsprotocolAlreadyBeingCreatedException) DIR NameSysteminternalReleaseLease Failed to release lease for file hbaseMasterProcWALsstate-00000000000000000ltnumbergtlogCommitted blocks are waiting to be minimally replicated Try again later
Chapter 21 Troubleshooting and Health Checks System Recovery
Configuration Guide CloudVision version 201720 321
212 System RecoverySystem recovery should be used only when the CVP cluster has become unusable and other stepssuch as performing a cvpi watchdog off cvpi stop all and then cvpi start all cvpi watchdog on have failed For example situations in which regardless of restarts a cvpi status allcontinues to show some components as having a status of UNHEALTHY or NOT RUNNING
If a GUI-based backup has been saved while the system was healthy it is possible to redeploy the CVPcluster restore the backup and be at the same state within CVP as when the backup was takenCreating backups on a regular basis is recommended and described in ldquoCreating a Backuprdquo onpage 299
There are two ways to completely recover a CVP cluster
bull ldquoVM Redeploymentrdquo
bull ldquoCVP Re-Install without VM Redeploymentrdquo
Note A good backup is required to proceed with either of these system recoveries
21.2.1 VM Redeployment

Complete these steps:

Step 1 Delete all the CVP VMs.

Step 2 Redeploy the VMs using the deployment procedures.

Step 3 Issue a cvpi status all command to ensure all components are running.

Step 4 Log in to the CVP GUI as cvpadmin/cvpadmin to set the cvpadmin password.

Step 5 From the Backup & Restore tab on the Settings page, restore from the backup using the procedures in "Importing a Backup" on page 301 and "Restoring Data" on page 302.
21.2.2 CVP Re-Install without VM Redeployment

Complete these steps:

Step 1 Run cvpReInstall from the Linux shell of the primary node. This may take 15 minutes to complete:

[root@cvp99 ~]# cvpReInstall
Log directory is /tmp/cvpReinstall_17_02_23_01_59_48
Existing /cvpi/cvp-config.yaml will be backed up here
...
...
Complete
CVP configuration not backed up, please use cvpShell to setup the cluster
CVP Re-install complete, you can now configure the cluster
Step 2 Re-configure using the procedure in "Shell-based Configuration" on page 97. Log in to the Linux shell of each node as cvpadmin, or use su cvpadmin.

Step 3 Issue a cvpi status all command to ensure all components are running.

Step 4 Log in to the CVP GUI as cvpadmin/cvpadmin to set the cvpadmin password.

Step 5 From the Backup & Restore tab on the Settings page, restore from the backup using the procedures in "Importing a Backup" on page 301 and "Restoring Data" on page 302.
Related topics:

• "Health Checks" on page 323
• "Resource Checks" on page 324
• "Troubleshooting" on page 318
21.3 Health Checks

The following table lists the different types of CVP health checks you can run, including the steps to use to run each check and the expected result for each check.
21.3.1 Running Health Checks

Run the cvpi resources command to execute a health check on disk bandwidth. The output of the command indicates whether the disk bandwidth is at a healthy or unhealthy level. The threshold for healthy disk bandwidth is 20 MB/s.

The possible health statuses are:

• Healthy - disk bandwidth above 20 MB/s
• Unhealthy - disk bandwidth at or below 20 MB/s

The output is color coded to make it easy to interpret: green indicates a healthy level and red indicates an unhealthy level (see the example below).
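The threshold rule above can be expressed as a tiny helper. This is a hedged sketch: bandwidth_status is an illustrative name, not a CVP command; it only encodes the healthy/unhealthy rule stated above for whole-number MB/s readings.

```shell
# Hedged helper encoding the rule above: above 20 MB/s is Healthy,
# at or below 20 MB/s is Unhealthy. Not a CVP tool; illustration only.
bandwidth_status() {
  # $1 = measured disk bandwidth in whole MB/s
  if [ "$1" -gt 20 ]; then
    echo "Healthy"
  else
    echo "Unhealthy"
  fi
}
```

Note that exactly 20 MB/s is classified as Unhealthy, matching the "at or below" wording above.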
Component: Network connectivity
Steps to use: Run ping -f across all nodes.
Expected result: No packet loss; the network is healthy.

Component: HBase
Steps to use: echo "list" | /cvpi/hbase/bin/hbase shell | grep -A 2 "row("
Expected result: Prints an array of the tables in HBase created by CVP; HBase and the underlying infrastructure work.

Component: All daemons running on all nodes (bypassing cvpi status all)
Steps to use: On all nodes: su - cvp -c "/cvpi/jdk/bin/jps"
Expected result: On the primary and secondary nodes, 9 processes including jps, for example:
• 3149 HMaster
• 2931 NameNode
• 2797 QuorumPeerMain
• 12113 Bootstrap
• 3040 DFSZKFailoverController
• 2828 JournalNode
• 11840 HRegionServer
• 12332 Jps
• 2824 DataNode
On the tertiary node, 6 processes, for example:
• 2434 JournalNode
• 4256 HRegionServer
• 2396 QuorumPeerMain
• 2432 DataNode
• 4546 Jps
• 8243 Bootstrap

Component: Time in sync between nodes
Steps to use: On all nodes, run date +%s.
Expected result: UTC time should be within a few seconds of each other (typically less than one second). Up to 10 seconds is allowable.

Component: I/O slowness issues (disk I/O throughput at an unhealthy level, that is, too low)
Steps to use: Use the cvpi resources command to find out whether the disk I/O throughput is at a healthy or unhealthy level. The disk I/O throughput reported in the command output is measured by the virtual machine.
Expected result: See "Running Health Checks" on page 323 for an example of the output of the cvpi resources command.
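The time-sync check above compares date +%s readings collected from each node. The comparison itself can be sketched as a pure-bash helper; max_drift is an illustrative name, not a CVP command, and it assumes you have already gathered one epoch-second reading per node.

```shell
# Hedged sketch of the time-sync comparison: given `date +%s` readings,
# one per node, report the spread in seconds. A spread above 10 seconds
# exceeds the allowable drift described above. Illustration only.
max_drift() {
  local min=$1 max=$1 t
  for t in "$@"; do
    [ "$t" -lt "$min" ] && min=$t   # track the earliest reading
    [ "$t" -gt "$max" ] && max=$t   # track the latest reading
  done
  echo $((max - min))
}
```

For example, max_drift 1487812000 1487812003 1487812001 prints 3, which is within the allowable 10-second window.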
Example

This example shows the output of the cvpi resources command. In this example, the disk bandwidth status is healthy (above the 20 MB/s threshold).
Figure 21-1 Example output of cvpi resources command
Related topics:

• "Resource Checks" on page 324
• "Troubleshooting" on page 318
• "Health Checks" on page 323
21.4 Resource Checks

CloudVision Portal (CVP) enables you to run resource checks on CVP node VMs. You can run checks to determine the current data disk size of VMs that you have upgraded to CVP version 2017.2.0, and to determine the current memory allocation for each CVP node VM.
Performing these resource checks is important to ensure that the CVP node VMs in your deployment have the recommended data disk size and memory allocation for using the Telemetry feature. If the resource checks show that the CVP node VM data disk size or memory allocation (RAM) is below the recommended level, you can increase the data disk size and memory allocation.

These procedures provide detailed instructions on how to perform the resource checks and, if needed, how to increase the CVP node VM data disk size and CVP node VM memory allocation:

• "Running CVP node VM Resource Checks"
• "Increasing Disk Size of VMs Upgraded to CVP Version 2017.2.0" on page 325
• "Increasing CVP Node VM Memory Allocation" on page 327
21.4.1 Running CVP node VM Resource Checks

CloudVision Portal (CVP) enables you to quickly and easily check the current resources of the primary, secondary, and tertiary nodes of a cluster by running a single command: the cvpi resources command.

Use this command to check the following CVP node VM resources:

• Memory allocation
• Data disk size (storage capacity)
• Disk throughput (in MB per second)
• Number of CPUs

Complete the following steps to run the CVP node VM resource check:
Step 1 Log in to one of the CVP nodes as root.
Step 2 Execute the cvpi resources command.

The output shows the current resources for each CVP node VM (see Figure 21-2).

• If the total size of sdb1 (or vdb1) is approximately 120 GB or less, you can increase the disk size to 1 TB (see "Increasing Disk Size of VMs Upgraded to CVP Version 2017.2.0").

• If the memory allocation is the default of 16 GB, you can increase the RAM allocation (see "Increasing CVP Node VM Memory Allocation").

Figure 21-2 Using the cvpi resources command to run CVP node VM resource checks
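The two decision rules above can be captured as small predicates. This is a hedged sketch with illustrative names (sizes are whole GB); it is not part of the cvpi tooling, just the "120 GB or less" and "16 GB default" tests written out.

```shell
# Hedged sketch of the two checks above (illustrative names, sizes in GB):
# a data disk at or below ~120 GB should be grown to 1 TB, and a VM still
# at (or below) the 16 GB memory default should be raised to 32 GB.
needs_disk_resize() { [ "$1" -le 120 ]; }
needs_more_memory() { [ "$1" -le 16 ]; }
```

For example, needs_disk_resize 120 succeeds (resize recommended) while needs_disk_resize 1000 fails (the 1 TB disk is already in place).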
21.4.2 Increasing Disk Size of VMs Upgraded to CVP Version 2017.2.0

If you already upgraded any CVP node VMs running an older version of CVP to version 2017.2.0, you may need to increase the size of the data disk of the VMs so that the data disks have the 1 TB disk image that is used on current CVP node VMs.

CVP node VM data disks that you upgraded to version 2017.2.0 may still have the original disk image (120 GB data image), because the standard upgrade procedure did not upgrade the data disk image. The standard upgrade procedure updated only the root disk, which contains the CentOS image along with the RPMs for CVPI, CVP, and Telemetry.

Note: It is recommended that each CVP node have 1 TB of disk space reserved for enabling CVP Telemetry. If the CVP nodes in your current environment do not have the recommended reserved disk space of 1 TB, complete the procedure below for increasing the disk size of CVP node VMs.
Prerequisites

Before you begin the procedure, make sure that you:

• Have upgraded to version 2017.2.0. You cannot increase the data disk size until you have completed the upgrade to version 2017.2.0 (see "Upgrading CloudVision Portal (CVP)" on page 304).

• Have performed the resource check to verify that the CVP node VMs have the data disk size image of previous CVP versions (approximately 120 GB or less). See "Running CVP node VM Resource Checks" on page 324.

• Have performed a GUI-based backup of the CVP system and copied the backup to a safe location (a location off of the CVP node VMs). The CVP GUI enables you to create a backup you can use to restore CVP data (see "Using the GUI to Backup and Restore Data" on page 298).
Procedure

Complete the following steps to increase the data disk size:

Step 1 Turn off the cvpi service by executing the systemctl stop cvpi command on all nodes in the cluster. (For a single-node installation, run this command on the node.)

Step 2 Run cvpi -v=3 stop all on the primary node.

Step 3 Perform a graceful power-off of all VMs.

Note: You do not need to unregister and re-register VMs from the vSphere client, or undefine and redefine VMs from the KVM hypervisor.

Step 4 Do the following to increase the size of the data disk to 1 TB using the hypervisor:

• ESX: Using the vSphere client, do the following (see Figure 21-3 for an example):
a. Select the Virtual Hardware tab, and then select hard disk 2.
b. Change the setting from 120 GB to 1 TB.
c. Click OK.

• KVM: Use the qemu-img resize command to resize the data disk from 120 GB to 1 TB. Be sure to select disk2.qcow2.

Figure 21-3 Using vSphere to increase data disk size
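The KVM path in Step 4 amounts to a single qemu-img invocation. The sketch below is hedged: the image path in the example is an assumption (adjust it to wherever your VM's disk2.qcow2 lives), and the DRY_RUN switch is a convenience of this sketch, not a qemu-img feature.

```shell
# Hedged sketch of Step 4 on KVM: build the qemu-img resize command for
# the data disk (disk2.qcow2). DRY_RUN=1 prints the command instead of
# running it, since resizing requires a real qcow2 image.
resize_data_disk() {
  local img=$1 new_size=${2:-1T}     # qemu-img accepts K/M/G/T suffixes
  local cmd="qemu-img resize $img $new_size"
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$cmd"
  else
    $cmd
  fi
}

# Example with an assumed image path:
#   resize_data_disk /var/lib/libvirt/images/cvp-node1/disk2.qcow2
```

Be sure the VM is powered off (Step 3) before running the real command; growing a disk image under a running guest is unsafe.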
Step 5 Power on all CVP node VMs and wait for all services to start.

Step 6 Use the cvpi status all command to verify that all the cvpi services are running.
Step 7 Run the /cvpi/tools/diskResize.py command on the primary node. (Do not run this command on the secondary and tertiary nodes.)

Step 8 Run the df -h /data command on all nodes to verify that the data partition is increased to approximately 1 TB.

Step 9 Wait for all services to start.

Step 10 Use the cvpi -v=3 status all command to verify the status of services.

Step 11 Use the systemctl status cvpi command to ensure that the cvpi service is running.
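Step 8's verification can be scripted by inspecting the size column of df -h /data. This is a hedged sketch: data_size_ok is an illustrative name, and for testability it takes the size string as an argument rather than running df itself.

```shell
# Hedged sketch of the Step 8 check: a data partition that df -h reports
# in terabytes (e.g. "1.1T") has been resized; a gigabyte-scale report
# (e.g. "120G") has not. Coarse on purpose; illustration, not a CVP tool.
data_size_ok() {
  case "$1" in
    *T) return 0 ;;   # terabyte-scale: the resize took effect
    *)  return 1 ;;
  esac
}

# In practice you might feed it the live value, for example:
#   data_size_ok "$(df -h /data | awk 'NR==2 {print $2}')"
```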
Related topics:

• "Increasing CVP Node VM Memory Allocation"
• "Running CVP node VM Resource Checks" on page 324
21.4.3 Increasing CVP Node VM Memory Allocation

If the CVP Open Virtual Appliance (OVA) template currently specifies the default of 16 GB of memory allocated for the CVP node VMs in the CVP cluster, you need to increase the RAM to ensure that the CVP node VMs have adequate memory allocated for using the Telemetry feature.

Note: It is recommended that CVP node VMs have 32 GB of RAM allocated for deployments in which Telemetry is enabled.

You can perform a rolling modification to increase the RAM allocation of every node in the cluster. If you want to keep the service up and available while you are performing the rolling modification, make sure that you perform the procedure on only one CVP node VM at a time.

Once you have completed the procedure on a node, repeat the procedure on another node in the cluster. You must complete the procedure once for every node in the cluster.
Prerequisites

Before you begin the procedure, make sure that you:

• Have performed the resource check to verify that the CVP node VMs have the default RAM allocation of 16 GB (see "Running CVP node VM Resource Checks" on page 324).

• Have performed a GUI-based backup of the CVP system and copied the backup to a safe location (a location off of the CVP node VMs). The CVP GUI enables you to create a backup you can use to restore CVP data (see "Using the GUI to Backup and Restore Data" on page 298).
Procedure

Complete the following steps to increase the RAM allocation of the CVP node VMs:

Step 1 Log in to a CVP node of the cluster as the cvp user.
Step 2 Using the cvpi status cvp command from the shell, make sure that all nodes in the cluster are operational.

Step 3 Using the vSphere client, shut down one CVP node VM by selecting the node in the left pane and then clicking the Shut down the virtual machine option.
Step 4 On the CVP node VM, increase the memory allocation to 32 GB by right-clicking the node icon and choosing Edit Settings.

The Virtual Machine Properties dialog appears.

Step 5 Do the following to increase the memory allocation for the CVP node VM:

• Using the Memory Size option, click the up arrow to increase the size to 32 GB.

• Click the OK button.

The memory allocation for the CVP node VM is changed to 32 GB. The page refreshes, showing options to power on the VM or continue making edits to the VM properties.
Step 6 Click the Power on the virtual machine option.

Step 7 Wait for the cluster to reform.

Step 8 Once the cluster is reformed, repeat Step 1 through Step 7, one node at a time, on each of the remaining CVP node VMs in the cluster.
Related topics:

• "Troubleshooting" on page 318
• "System Recovery" on page 321
• "Health Checks" on page 323
Chapter 21 Troubleshooting and Health Checks System Recovery
Configuration Guide CloudVision version 201720 321
212 System RecoverySystem recovery should be used only when the CVP cluster has become unusable and other stepssuch as performing a cvpi watchdog off cvpi stop all and then cvpi start all cvpi watchdog on have failed For example situations in which regardless of restarts a cvpi status allcontinues to show some components as having a status of UNHEALTHY or NOT RUNNING
If a GUI-based backup has been saved while the system was healthy it is possible to redeploy the CVPcluster restore the backup and be at the same state within CVP as when the backup was takenCreating backups on a regular basis is recommended and described in ldquoCreating a Backuprdquo onpage 299
There are two ways to completely recover a CVP cluster
bull ldquoVM Redeploymentrdquo
bull ldquoCVP Re-Install without VM Redeploymentrdquo
Note A good backup is required to proceed with either of these system recoveries
2121 VM Redeployment
Complete these steps
Step 1 Delete all the CVP VMs
Step 2 Redeploy the VMs using the procedures in
Step 3 Issue a ldquocvpi status allrdquo command to ensure all components are running
Step 4 Login to the CVP GUI as lsquocvpadmincvpadminrsquo to set the cvpadmin password
Step 5 From the Backup amp Restore tab on the Setting page restore from the backup using theprocedures in ldquoImporting a Backuprdquo on page 301 and ldquoRestoring Datardquo on page 302
2122 CVP Re-Install without VM Redeployment
Complete these steps
Step 1 Run lsquocvpReInstall from the Linux shell of the primary node This may take 15 minutes tocomplete[rootcvp99 ~] cvpReInstall0Log directory is tmpcvpReinstall_17_02_23_01_59_48Existing cvpicvp-configyaml will be backed up herehelliphellipComplete
CVP configuration not backed up please use cvpShell to setup the cluster
CVP Re-install complete you can now configure the cluster
322 Configuration Guide CloudVision version 201720
System Recovery Chapter 21 Troubleshooting and Health Checks
Step 2 Re-configure using the procedure in ldquoShell-based Configurationrdquo on page 97 Log into theLinux shell of each node as lsquocvpadminrsquo or lsquosu cvpadminrsquo
Step 3 Issue a cvpi status all command to ensure all components are running
Step 4 Login to the CVP GUI as lsquocvpadmincvpadminrsquo to set the cvpadmin password
Step 5 From the Backup amp Restore tab on the Setting page restore from the backup using theprocedures in ldquoImporting a Backuprdquo on page 301 and ldquoRestoring Datardquo on page 302
Related topics
bull ldquoHealth Checksrdquo on page 323
bull ldquoResource Checksrdquo on page 324
bull ldquoTroubleshootingrdquo on page 318
Chapter 21 Troubleshooting and Health Checks Health Checks
Configuration Guide CloudVision version 201720 323
213 Health ChecksThe following table lists the different types of CVP health checks you can run including the steps touse to run each check and the expected result for each check
2131 Running Health Checks
Run the cvpi resources command to execute a health check on disk bandwidth The output of thecommand indicates whether the disk bandwidth is at a healthy level or unhealthy level The thresholdfor healthy disk bandwith is 20MBS
The possible health statuses are
bull Healthy - Disk bandwidth above 20MBs
bull Unhealthy - Disk bandwidth at or below 20MBs
The output is color coded to make it easy to interpret the output Green indicates a healthy leveland red indicates an unhealthy level (see the example below)
Component Steps to Use Expected Result
Network connectivity ping -f across all nodes No packet loss network is healthy
HBase echo list | cvpihbasebinhbase shell |grep -A 2 row(
Prints an array of tables in Hbase created by CVPHbase and the underlying infrastructure works
All daemons running on allnodes bypass cvpi status all
On all nodes
su - cvp -c ldquocvpijdkbinjpsrdquo
On primary and secondary nodes 9 processesincluding jps
bull 3149 HMasterbull 2931 NameNodebull 2797 QuorumPeerMainbull 12113 Bootstrapbull 3040 DFSZKFailoverControllerbull 2828 JournalNodebull 11840 HRegionServerbull 12332 Jpsbull 2824 DataNode
On tertiary 6 processes
bull 2434 JournalNodebull 4256 HRegionServerbull 2396 QuorumPeerMainbull 2432 DataNodebull 4546 Jpsbull 8243 Bootstrap
Check time is in syncbetween nodes
On all nodes run ldquodate +srdquo UTC time should be within a few seconds of each other(typically less than one second) Up to 10 seconds isallowable
IO slowness issues The disk IO throughput is at anunhealthy level (too low)
Use the cvpi resources command to find outwhether the disk IO throughput is at a healthy level orunhealthy level The disk IO throughput reported inthe command output is measured by the VirtualMachine
See ldquoRunning Health Checksrdquo on page 323 for anexample of the output of the cvpi resources command
324 Configuration Guide CloudVision version 201720
Resource Checks Chapter 21 Troubleshooting and Health Checks
Example
This example shows output of the cvpi resources command In this example the disk bandwidthstatus is healthy (above the 20MBs threshold)
Figure 21-1 Example output of cvpi resources command
Related topics
bull ldquoResource Checksrdquo
bull ldquoTroubleshootingrdquo on page 318
bull ldquoHealth Checksrdquo on page 323
214 Resource ChecksCloudVision Portal (CVP) enables you to run resource checks on CVP node VMs You can run checksto determine the current data disk size of VMs that you have upgraded to CVP version 201720 andto determine the current memory allocation for each CVP node VM
Performing these resource checks is important to ensure that the CVP node VMs in your deploymenthave the recommended data disk size and memory allocation for using the Telemetry feature If theresource checks show that the CVP node VM data disk size or memory allocation (RAM) are below therecommended levels you can increase the data disk size and memory allocation
These procedures provide detailed instructions on how to perform the resource checks and if neededhow to increase the CVP node VM data disk size and CVP node VM memory allocation
bull ldquoRunning CVP node VM Resource Checksrdquo
bull ldquoIncreasing Disk Size of VMs Upgraded to CVP Version 201720rdquo on page 325
bull ldquoIncreasing CVP Node VM Memory Allocationrdquo on page 327
2141 Running CVP node VM Resource Checks
CloudVision Portal (CVP) enables you to quickly and easily check the current resources of the primarysecondary and tertiary nodes of a cluster by running a single command The command you use is thecvpi resources command
Use this command to check the following CVP node VM resources
bull Memory allocation
bull Data disk size (storage capacity)
bull Disk throughput (in MB per second)
bull Number of CPUs
Complete the following steps to run the CVP node VM resource check
Step 1 Login to one of the CVP nodes as root
Chapter 21 Troubleshooting and Health Checks Resource Checks
Configuration Guide CloudVision version 201720 325
Step 2 Execute the cvpi resources command
The output shows the current resources for each CVP node VM (see Figure 21-2)
bull If the total size of sdb1 (or vdb1) is approximately 120G or less you can increase the disksize to 1TB (see ldquoIncreasing Disk Size of VMs Upgraded to CVP Version 201720rdquo)
bull If the memory allocation is the default of 16GB you can increase the RAM memoryallocation (see ldquoIncreasing CVP Node VM Memory Allocationrdquo)
Figure 21-2 Using the cvpi resource command to run CVP node VM resource checks
2142 Increasing Disk Size of VMs Upgraded to CVP Version 201720
If you already upgraded any CVP node VMs running an older version of CVP to version 201720 youmay need to increase the size of the data disk of the VMs so that the data disks have the 1TB diskimage that is used on current CVP node VMs
CVP node VM data disks that you upgraded to version 201720 may still have the original disk image(120GB data image) because the standard upgrade procedure did not upgrade the data disk imageThe standard upgrade procedure updated only the root disk which contains the Centos image alongwith rpms for CVPI CVP and Telemetry
Note It is recommended that each CVP node have 1TB of disk space reserved for enabling CVP TelemetryIf the CVP nodes in your current environment do not have the recommended reserved disk space of1TB complete the procedure below for increasing the disk size of CVP node VMs
Pre-requisites
Before you begin the procedure make sure that you
bull Have upgraded to version 201720 You cannot increase the data disk size until you havecompleted the upgrade to version 201720 (see ldquoUpgrading CloudVision Portal (CVP)rdquo onpage 304)
bull Have performed the resource check to verify that the CVP node VMs have the data disk size imageof previous CVP versions (approximately 120GB or less) See ldquoRunning CVP node VM ResourceChecksrdquo on page 324
bull Make sure that you perform a GUI-based backup of the CVP system and copy the backup to a safelocation (a location off of the CVP node VMs) The CVP GUI enables you to create a backup youcan use to restore CVP data (see ldquoUsing the GUI to Backup and Restore Datardquo on page 298)
326 Configuration Guide CloudVision version 201720
Resource Checks Chapter 21 Troubleshooting and Health Checks
Procedure
Complete the following steps to increase the data disk size
Step 1 Turn off cvpi service by executing the systemctl stop cvpi command on all nodes in thecluster (For a single-node installation run this command on the node)
Step 2 Run the cvpi -v=3 stop all on the primary node
Step 3 Perform a graceful power-off of all VMs
Note You do not need to unregister and re-register VMs from vSphere Client or undefine and redefine VMsfrom kvm hypervisor
Step 4 Do the following to increase the size of the data disk to 1TB using the hypervisor
bull ESX Using vSphere client do the following (see Figure 21-3 for an example)a Select the Virtual Hardware tab and then select hard disk 2b Change the setting from 120GB to 1TBc Click OK
bull KVM Use the qemu-img resize command to resize the data disk from 120GB to 1TB Besure to select disk2qcow2
Figure 21-3 Using vSphere to increase data disk size
Step 5 Power on all CVP node VMs and wait for all services to start
Step 6 Use the cvpi status all command to verify that all the cvpi services are running
Chapter 21 Troubleshooting and Health Checks Resource Checks
Configuration Guide CloudVision version 201720 327
Step 7 Run the cvpitoolsdiskResizepy command on the primary node (Do not run thiscommand on the secondary and tertiary nodes)
Step 8 Run the df -h data command on all nodes to verify that the data is increased toapproximately 1TB
Step 9 Wait for all services to start
Step 10 Use the cvpi -v=3 status all command to verify the status of services
Step 11 Use the systemctl status cvpi to ensure that cvpi service is running
Related topics
bull ldquoIncreasing CVP Node VM Memory Allocationrdquo
bull ldquoRunning CVP node VM Resource Checksrdquo on page 324
2143 Increasing CVP Node VM Memory Allocation
If the CVP Open Virtual Appliance (OVA) template currently specifies the default of 16GB of memoryallocated for the CVP node VMs in the CVP cluster you need to increase the RAM to ensure that theCVP node VMs have adequate memory allocated for using the Telemetry feature
Note It is recommended that CVP node VMs have 32GB of RAM allocated for deployments in whichTelemetry is enabled
You can perform a rolling modification to increase the RAM allocation of every node in the cluster Ifyou want to keep the service up and available while you are performing the rolling modification makesure that you perform the procedure on only one CVP node VM at a time
Once you have completed the procedure on a node you repeat the procedure on another node in thecluster You must complete the procedure once for every node in the cluster
Pre-requisites
Before you begin the procedure make sure that you
bull Have performed the resource check to verify that the CVP node VMs have the default RAMmemory allocation of 16GB (see ldquoRunning CVP node VM Resource Checksrdquo on page 324)
bull Make sure that you perform a GUI-based backup of the CVP system and copy the backup to a safelocation (a location off of the CVP node VMs) The CVP GUI enables you to create a backup youcan use to restore CVP data (see ldquoRunning CVP node VM Resource Checksrdquo on page 324)
Procedure
Complete the following steps to increase the RAM memory allocation of the CVP node VMs
Step 1 Login to a CVP node of the cluster as cvp user
328 Configuration Guide CloudVision version 201720
Resource Checks Chapter 21 Troubleshooting and Health Checks
Step 2 Using the cvpi status cvp shell command make sure that all nodes in the cluster areoperational
Step 3 Using vSphere client shutdown one CVP node VM by selecting the node in the left pane andthen click the Shut down the virtual machine option
Chapter 21 Troubleshooting and Health Checks Resource Checks
Configuration Guide CloudVision version 201720 329
Step 4 On the CVP node VM increase the memory allocation to 32GB by right-clicking the node iconand then choose Edit Settings
The Virtual Machine Properties dialog appears
Step 5 Do the following to increase the memory allocation for the CVP node VM
bull Using the Memory Size option click the up arrow to increase the size to 32GB
bull Click the OK button
The memory allocation for the CVP node VM is changed to 32GB The page refreshesshowing options to power on the VM or continue making edits to the VM properties
330 Configuration Guide CloudVision version 201720
Resource Checks Chapter 21 Troubleshooting and Health Checks
Step 6 Click the Power on the virtual machine option
Step 7 Wait for the cluster to reform
Step 8 Once the cluster is reformed repeat step 1 through step 7 one node at a time on each of theremaining CVP node VMs in the cluster
Related topics
bull ldquoTroubleshootingrdquo on page 318
bull ldquoSystem Recoveryrdquo on page 321
bull ldquoHealth Checksrdquo on page 323
322 Configuration Guide CloudVision version 201720
System Recovery Chapter 21 Troubleshooting and Health Checks
Step 2 Re-configure using the procedure in ldquoShell-based Configurationrdquo on page 97 Log into theLinux shell of each node as lsquocvpadminrsquo or lsquosu cvpadminrsquo
Step 3 Issue a cvpi status all command to ensure all components are running
Step 4 Login to the CVP GUI as lsquocvpadmincvpadminrsquo to set the cvpadmin password
Step 5 From the Backup amp Restore tab on the Setting page restore from the backup using theprocedures in ldquoImporting a Backuprdquo on page 301 and ldquoRestoring Datardquo on page 302
Related topics
bull ldquoHealth Checksrdquo on page 323
bull ldquoResource Checksrdquo on page 324
bull ldquoTroubleshootingrdquo on page 318
Chapter 21 Troubleshooting and Health Checks Health Checks
Configuration Guide CloudVision version 201720 323
213 Health ChecksThe following table lists the different types of CVP health checks you can run including the steps touse to run each check and the expected result for each check
2131 Running Health Checks
Run the cvpi resources command to execute a health check on disk bandwidth The output of thecommand indicates whether the disk bandwidth is at a healthy level or unhealthy level The thresholdfor healthy disk bandwith is 20MBS
The possible health statuses are
bull Healthy - Disk bandwidth above 20MBs
bull Unhealthy - Disk bandwidth at or below 20MBs
The output is color coded to make it easy to interpret the output Green indicates a healthy leveland red indicates an unhealthy level (see the example below)
Component Steps to Use Expected Result
Network connectivity ping -f across all nodes No packet loss network is healthy
HBase echo list | cvpihbasebinhbase shell |grep -A 2 row(
Prints an array of tables in Hbase created by CVPHbase and the underlying infrastructure works
All daemons running on allnodes bypass cvpi status all
On all nodes
su - cvp -c ldquocvpijdkbinjpsrdquo
On primary and secondary nodes 9 processesincluding jps
bull 3149 HMasterbull 2931 NameNodebull 2797 QuorumPeerMainbull 12113 Bootstrapbull 3040 DFSZKFailoverControllerbull 2828 JournalNodebull 11840 HRegionServerbull 12332 Jpsbull 2824 DataNode
On tertiary 6 processes
bull 2434 JournalNodebull 4256 HRegionServerbull 2396 QuorumPeerMainbull 2432 DataNodebull 4546 Jpsbull 8243 Bootstrap
Check time is in syncbetween nodes
On all nodes run ldquodate +srdquo UTC time should be within a few seconds of each other(typically less than one second) Up to 10 seconds isallowable
IO slowness issues The disk IO throughput is at anunhealthy level (too low)
Use the cvpi resources command to find outwhether the disk IO throughput is at a healthy level orunhealthy level The disk IO throughput reported inthe command output is measured by the VirtualMachine
See ldquoRunning Health Checksrdquo on page 323 for anexample of the output of the cvpi resources command
324 Configuration Guide CloudVision version 201720
Resource Checks Chapter 21 Troubleshooting and Health Checks
Example
This example shows output of the cvpi resources command In this example the disk bandwidthstatus is healthy (above the 20MBs threshold)
Figure 21-1 Example output of cvpi resources command
Related topics
bull ldquoResource Checksrdquo
bull ldquoTroubleshootingrdquo on page 318
bull ldquoHealth Checksrdquo on page 323
214 Resource ChecksCloudVision Portal (CVP) enables you to run resource checks on CVP node VMs You can run checksto determine the current data disk size of VMs that you have upgraded to CVP version 201720 andto determine the current memory allocation for each CVP node VM
Performing these resource checks is important to ensure that the CVP node VMs in your deploymenthave the recommended data disk size and memory allocation for using the Telemetry feature If theresource checks show that the CVP node VM data disk size or memory allocation (RAM) are below therecommended levels you can increase the data disk size and memory allocation
These procedures provide detailed instructions on how to perform the resource checks and if neededhow to increase the CVP node VM data disk size and CVP node VM memory allocation
bull ldquoRunning CVP node VM Resource Checksrdquo
bull ldquoIncreasing Disk Size of VMs Upgraded to CVP Version 201720rdquo on page 325
bull ldquoIncreasing CVP Node VM Memory Allocationrdquo on page 327
2141 Running CVP node VM Resource Checks
CloudVision Portal (CVP) enables you to quickly and easily check the current resources of the primarysecondary and tertiary nodes of a cluster by running a single command The command you use is thecvpi resources command
Use this command to check the following CVP node VM resources
bull Memory allocation
bull Data disk size (storage capacity)
bull Disk throughput (in MB per second)
bull Number of CPUs
Complete the following steps to run the CVP node VM resource check
Step 1 Login to one of the CVP nodes as root
Step 2 Execute the cvpi resources command.

The output shows the current resources for each CVP node VM (see Figure 21-2).

• If the total size of sdb1 (or vdb1) is approximately 120GB or less, you can increase the disk size to 1TB (see “Increasing Disk Size of VMs Upgraded to CVP Version 2017.2.0”).
• If the memory allocation is the default of 16GB, you can increase the RAM allocation (see “Increasing CVP Node VM Memory Allocation”).

Figure 21-2: Using the cvpi resources command to run CVP node VM resource checks
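As a sketch, the two checks above can be automated by scanning the command output for the old defaults. The sample below is a simplified, hypothetical stand-in for the cvpi resources output, so the awk fields would need adjusting before use on a real node:

```shell
# Simplified, hypothetical stand-in for cvpi resources output; the real
# layout differs, so adjust the awk field positions before using this.
sample='node1 memory 16GB
node1 datadisk 120GB
node2 memory 32GB
node2 datadisk 1TB'

# Flag nodes still at the pre-2017.2.0 defaults (16GB RAM, 120GB data disk).
echo "$sample" | awk '
  $2 == "memory"   && $3 == "16GB"  { print $1 ": increase RAM to 32GB" }
  $2 == "datadisk" && $3 == "120GB" { print $1 ": increase data disk to 1TB" }'
```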
21.4.2 Increasing Disk Size of VMs Upgraded to CVP Version 2017.2.0

If you already upgraded any CVP node VMs running an older version of CVP to version 2017.2.0, you may need to increase the size of the data disk of the VMs so that the data disks have the 1TB disk image that is used on current CVP node VMs.

CVP node VM data disks that you upgraded to version 2017.2.0 may still have the original disk image (120GB data image), because the standard upgrade procedure did not upgrade the data disk image. The standard upgrade procedure updated only the root disk, which contains the CentOS image along with RPMs for CVPI, CVP, and Telemetry.

Note: It is recommended that each CVP node have 1TB of disk space reserved for enabling CVP Telemetry. If the CVP nodes in your current environment do not have the recommended reserved disk space of 1TB, complete the procedure below for increasing the disk size of CVP node VMs.

Pre-requisites

Before you begin the procedure, make sure that you:

• Have upgraded to version 2017.2.0. You cannot increase the data disk size until you have completed the upgrade to version 2017.2.0 (see “Upgrading CloudVision Portal (CVP)” on page 304).
• Have performed the resource check to verify that the CVP node VMs have the data disk size image of previous CVP versions (approximately 120GB or less). See “Running CVP node VM Resource Checks” on page 324.
• Have performed a GUI-based backup of the CVP system and copied the backup to a safe location (a location off of the CVP node VMs). The CVP GUI enables you to create a backup you can use to restore CVP data (see “Using the GUI to Backup and Restore Data” on page 298).
Procedure

Complete the following steps to increase the data disk size:

Step 1 Turn off the cvpi service by executing the systemctl stop cvpi command on all nodes in the cluster. (For a single-node installation, run this command on the node.)

Step 2 Run the cvpi -v=3 stop all command on the primary node.

Step 3 Perform a graceful power-off of all VMs.

Note: You do not need to unregister and re-register VMs from the vSphere client, or undefine and redefine VMs from the KVM hypervisor.

Step 4 Do the following to increase the size of the data disk to 1TB using the hypervisor:

• ESX: Using the vSphere client, do the following (see Figure 21-3 for an example):
  a. Select the Virtual Hardware tab, and then select hard disk 2.
  b. Change the setting from 120GB to 1TB.
  c. Click OK.
• KVM: Use the qemu-img resize command to resize the data disk from 120GB to 1TB. Be sure to select disk2.qcow2.

Figure 21-3: Using vSphere to increase data disk size

Step 5 Power on all CVP node VMs and wait for all services to start.

Step 6 Use the cvpi status all command to verify that all the cvpi services are running.
Step 7 Run the /cvpi/tools/diskResize.py command on the primary node. (Do not run this command on the secondary and tertiary nodes.)

Step 8 Run the df -h /data command on all nodes to verify that the data partition is increased to approximately 1TB.

Step 9 Wait for all services to start.

Step 10 Use the cvpi -v=3 status all command to verify the status of services.

Step 11 Use the systemctl status cvpi command to ensure that the cvpi service is running.
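The Step 8 verification can be scripted. This sketch only parses a df-style size field; the helper and its 900G cutoff are my own assumptions, chosen because df -h may report a 1TB partition as roughly 1008G:

```shell
# Hypothetical helper: decide from a df -h size field (e.g. "120G",
# "1008G", "1.0T") whether the data partition looks resized to ~1TB.
data_disk_resized() {
  case "$1" in
    *T) echo "resized ($1)" ;;                       # terabyte-scale: done
    *G) n=${1%G}; n=${n%.*}                          # strip unit and decimals
        if [ "$n" -ge 900 ]; then echo "resized ($1)"
        else echo "still small ($1)"; fi ;;
    *)  echo "unrecognized size: $1" ;;
  esac
}

data_disk_resized 120G    # prints "still small (120G)"
data_disk_resized 1008G   # prints "resized (1008G)"
```

On a real node the field would come from the actual mount, e.g. df -h /data piped through awk.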
Related topics:

• “Increasing CVP Node VM Memory Allocation”
• “Running CVP node VM Resource Checks” on page 324
21.4.3 Increasing CVP Node VM Memory Allocation

If the CVP Open Virtual Appliance (OVA) template currently specifies the default of 16GB of memory allocated for the CVP node VMs in the CVP cluster, you need to increase the RAM to ensure that the CVP node VMs have adequate memory allocated for using the Telemetry feature.

Note: It is recommended that CVP node VMs have 32GB of RAM allocated for deployments in which Telemetry is enabled.

You can perform a rolling modification to increase the RAM allocation of every node in the cluster. If you want to keep the service up and available while you are performing the rolling modification, make sure that you perform the procedure on only one CVP node VM at a time.

Once you have completed the procedure on a node, repeat the procedure on another node in the cluster. You must complete the procedure once for every node in the cluster.

Pre-requisites

Before you begin the procedure, make sure that you:

• Have performed the resource check to verify that the CVP node VMs have the default RAM allocation of 16GB (see “Running CVP node VM Resource Checks” on page 324).
• Have performed a GUI-based backup of the CVP system and copied the backup to a safe location (a location off of the CVP node VMs). The CVP GUI enables you to create a backup you can use to restore CVP data (see “Using the GUI to Backup and Restore Data” on page 298).

Procedure

Complete the following steps to increase the RAM allocation of the CVP node VMs:

Step 1 Log in to a CVP node of the cluster as the cvp user.
Step 2 Using the cvpi status cvp shell command, make sure that all nodes in the cluster are operational.

Step 3 Using the vSphere client, shut down one CVP node VM by selecting the node in the left pane and then clicking the Shut down the virtual machine option.
Step 4 On the CVP node VM, increase the memory allocation to 32GB by right-clicking the node icon and then choosing Edit Settings.

The Virtual Machine Properties dialog appears.

Step 5 Do the following to increase the memory allocation for the CVP node VM:

• Using the Memory Size option, click the up arrow to increase the size to 32GB.
• Click the OK button.

The memory allocation for the CVP node VM is changed to 32GB. The page refreshes, showing options to power on the VM or continue making edits to the VM properties.
Step 6 Click the Power on the virtual machine option.

Step 7 Wait for the cluster to reform.

Step 8 Once the cluster is reformed, repeat step 1 through step 7, one node at a time, on each of the remaining CVP node VMs in the cluster.
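The one-node-at-a-time discipline can be outlined as a loop. The node names below are made up, and the echo stands in for the shut-down/edit/power-on/wait sequence, which on a real deployment is done through the vSphere client:

```shell
# Outline of the rolling RAM increase; node names are hypothetical, and
# the echo is a placeholder for Steps 2-7 (shut down, raise RAM to 32GB,
# power on, wait for the cluster to reform) performed on one node at a time.
for node in cvp-node-1 cvp-node-2 cvp-node-3; do
  echo "$node: shut down, raise RAM to 32GB, power on, wait for cluster to reform"
done
```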
Related topics:

• “Troubleshooting” on page 318
• “System Recovery” on page 321
• “Health Checks” on page 323
21.3 Health Checks

The following table lists the different types of CVP health checks you can run, including the steps to use to run each check and the expected result for each check.

21.3.1 Running Health Checks

Run the cvpi resources command to execute a health check on disk bandwidth. The output of the command indicates whether the disk bandwidth is at a healthy level or an unhealthy level. The threshold for healthy disk bandwidth is 20 MB/s.

The possible health statuses are:

• Healthy - disk bandwidth above 20 MB/s
• Unhealthy - disk bandwidth at or below 20 MB/s

The output is color coded to make it easy to interpret: green indicates a healthy level and red indicates an unhealthy level (see the example below).
Component: Network connectivity
Steps to use: Run ping -f across all nodes.
Expected result: No packet loss; the network is healthy.

Component: HBase
Steps to use: echo list | /cvpi/hbase/bin/hbase shell | grep -A 2 "row("
Expected result: Prints an array of tables in HBase created by CVP; HBase and the underlying infrastructure work.

Component: All daemons running on all nodes (bypasses cvpi status all)
Steps to use: On all nodes, run su - cvp -c "/cvpi/jdk/bin/jps"
Expected result: On the primary and secondary nodes, 9 processes, including jps:
• 3149 HMaster
• 2931 NameNode
• 2797 QuorumPeerMain
• 12113 Bootstrap
• 3040 DFSZKFailoverController
• 2828 JournalNode
• 11840 HRegionServer
• 12332 Jps
• 2824 DataNode
On the tertiary node, 6 processes:
• 2434 JournalNode
• 4256 HRegionServer
• 2396 QuorumPeerMain
• 2432 DataNode
• 4546 Jps
• 8243 Bootstrap

Component: Time in sync between nodes
Steps to use: On all nodes, run date +%s.
Expected result: UTC time should be within a few seconds of each other (typically less than one second). Up to 10 seconds is allowable.

Component: I/O slowness issues (the disk I/O throughput is at an unhealthy level, i.e., too low)
Steps to use: Use the cvpi resources command to find out whether the disk I/O throughput is at a healthy level or an unhealthy level. The disk I/O throughput reported in the command output is measured by the virtual machine.
Expected result: See “Running Health Checks” on page 323 for an example of the output of the cvpi resources command.
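The time-sync check in the table can be done mechanically. In this sketch the two epoch values are hard-coded stand-ins for date +%s captured on two nodes; the 10-second allowance is from the table above:

```shell
# Compare epoch seconds from two nodes (hard-coded stand-ins here; on real
# nodes capture each value with: date +%s). Up to 10 seconds of skew is
# allowable per the health-check table.
t_node1=1500000000
t_node2=1500000003

skew=$((t_node1 - t_node2))
if [ "$skew" -lt 0 ]; then skew=$((-skew)); fi    # absolute value

if [ "$skew" -le 10 ]; then
  echo "in sync (skew ${skew}s)"     # prints "in sync (skew 3s)" for these values
else
  echo "out of sync (skew ${skew}s)"
fi
```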
324 Configuration Guide CloudVision version 201720
Resource Checks Chapter 21 Troubleshooting and Health Checks
Example
This example shows output of the cvpi resources command In this example the disk bandwidthstatus is healthy (above the 20MBs threshold)
Figure 21-1 Example output of cvpi resources command
Related topics
bull ldquoResource Checksrdquo
bull ldquoTroubleshootingrdquo on page 318
bull ldquoHealth Checksrdquo on page 323
214 Resource ChecksCloudVision Portal (CVP) enables you to run resource checks on CVP node VMs You can run checksto determine the current data disk size of VMs that you have upgraded to CVP version 201720 andto determine the current memory allocation for each CVP node VM
Performing these resource checks is important to ensure that the CVP node VMs in your deploymenthave the recommended data disk size and memory allocation for using the Telemetry feature If theresource checks show that the CVP node VM data disk size or memory allocation (RAM) are below therecommended levels you can increase the data disk size and memory allocation
These procedures provide detailed instructions on how to perform the resource checks and if neededhow to increase the CVP node VM data disk size and CVP node VM memory allocation
bull ldquoRunning CVP node VM Resource Checksrdquo
bull ldquoIncreasing Disk Size of VMs Upgraded to CVP Version 201720rdquo on page 325
bull ldquoIncreasing CVP Node VM Memory Allocationrdquo on page 327
2141 Running CVP node VM Resource Checks
CloudVision Portal (CVP) enables you to quickly and easily check the current resources of the primarysecondary and tertiary nodes of a cluster by running a single command The command you use is thecvpi resources command
Use this command to check the following CVP node VM resources
bull Memory allocation
bull Data disk size (storage capacity)
bull Disk throughput (in MB per second)
bull Number of CPUs
Complete the following steps to run the CVP node VM resource check
Step 1 Login to one of the CVP nodes as root
Chapter 21 Troubleshooting and Health Checks Resource Checks
Configuration Guide CloudVision version 201720 325
Step 2 Execute the cvpi resources command
The output shows the current resources for each CVP node VM (see Figure 21-2)
bull If the total size of sdb1 (or vdb1) is approximately 120G or less you can increase the disksize to 1TB (see ldquoIncreasing Disk Size of VMs Upgraded to CVP Version 201720rdquo)
bull If the memory allocation is the default of 16GB you can increase the RAM memoryallocation (see ldquoIncreasing CVP Node VM Memory Allocationrdquo)
Figure 21-2 Using the cvpi resource command to run CVP node VM resource checks
2142 Increasing Disk Size of VMs Upgraded to CVP Version 201720
If you already upgraded any CVP node VMs running an older version of CVP to version 201720 youmay need to increase the size of the data disk of the VMs so that the data disks have the 1TB diskimage that is used on current CVP node VMs
CVP node VM data disks that you upgraded to version 201720 may still have the original disk image(120GB data image) because the standard upgrade procedure did not upgrade the data disk imageThe standard upgrade procedure updated only the root disk which contains the Centos image alongwith rpms for CVPI CVP and Telemetry
Note It is recommended that each CVP node have 1TB of disk space reserved for enabling CVP TelemetryIf the CVP nodes in your current environment do not have the recommended reserved disk space of1TB complete the procedure below for increasing the disk size of CVP node VMs
Pre-requisites
Before you begin the procedure make sure that you
bull Have upgraded to version 201720 You cannot increase the data disk size until you havecompleted the upgrade to version 201720 (see ldquoUpgrading CloudVision Portal (CVP)rdquo onpage 304)
bull Have performed the resource check to verify that the CVP node VMs have the data disk size imageof previous CVP versions (approximately 120GB or less) See ldquoRunning CVP node VM ResourceChecksrdquo on page 324
bull Make sure that you perform a GUI-based backup of the CVP system and copy the backup to a safelocation (a location off of the CVP node VMs) The CVP GUI enables you to create a backup youcan use to restore CVP data (see ldquoUsing the GUI to Backup and Restore Datardquo on page 298)
326 Configuration Guide CloudVision version 201720
Resource Checks Chapter 21 Troubleshooting and Health Checks
Procedure
Complete the following steps to increase the data disk size
Step 1 Turn off cvpi service by executing the systemctl stop cvpi command on all nodes in thecluster (For a single-node installation run this command on the node)
Step 2 Run the cvpi -v=3 stop all on the primary node
Step 3 Perform a graceful power-off of all VMs
Note You do not need to unregister and re-register VMs from vSphere Client or undefine and redefine VMsfrom kvm hypervisor
Step 4 Do the following to increase the size of the data disk to 1TB using the hypervisor
bull ESX Using vSphere client do the following (see Figure 21-3 for an example)a Select the Virtual Hardware tab and then select hard disk 2b Change the setting from 120GB to 1TBc Click OK
bull KVM Use the qemu-img resize command to resize the data disk from 120GB to 1TB Besure to select disk2qcow2
Figure 21-3 Using vSphere to increase data disk size
Step 5 Power on all CVP node VMs and wait for all services to start
Step 6 Use the cvpi status all command to verify that all the cvpi services are running
Chapter 21 Troubleshooting and Health Checks Resource Checks
Configuration Guide CloudVision version 201720 327
Step 7 Run the cvpitoolsdiskResizepy command on the primary node (Do not run thiscommand on the secondary and tertiary nodes)
Step 8 Run the df -h data command on all nodes to verify that the data is increased toapproximately 1TB
Step 9 Wait for all services to start
Step 10 Use the cvpi -v=3 status all command to verify the status of services
Step 11 Use the systemctl status cvpi to ensure that cvpi service is running
Related topics
bull ldquoIncreasing CVP Node VM Memory Allocationrdquo
bull ldquoRunning CVP node VM Resource Checksrdquo on page 324
2143 Increasing CVP Node VM Memory Allocation
If the CVP Open Virtual Appliance (OVA) template currently specifies the default of 16GB of memoryallocated for the CVP node VMs in the CVP cluster you need to increase the RAM to ensure that theCVP node VMs have adequate memory allocated for using the Telemetry feature
Note It is recommended that CVP node VMs have 32GB of RAM allocated for deployments in whichTelemetry is enabled
You can perform a rolling modification to increase the RAM allocation of every node in the cluster Ifyou want to keep the service up and available while you are performing the rolling modification makesure that you perform the procedure on only one CVP node VM at a time
Once you have completed the procedure on a node you repeat the procedure on another node in thecluster You must complete the procedure once for every node in the cluster
Pre-requisites
Before you begin the procedure make sure that you
bull Have performed the resource check to verify that the CVP node VMs have the default RAMmemory allocation of 16GB (see ldquoRunning CVP node VM Resource Checksrdquo on page 324)
bull Make sure that you perform a GUI-based backup of the CVP system and copy the backup to a safelocation (a location off of the CVP node VMs) The CVP GUI enables you to create a backup youcan use to restore CVP data (see ldquoRunning CVP node VM Resource Checksrdquo on page 324)
Procedure
Complete the following steps to increase the RAM memory allocation of the CVP node VMs
Step 1 Login to a CVP node of the cluster as cvp user
328 Configuration Guide CloudVision version 201720
Resource Checks Chapter 21 Troubleshooting and Health Checks
Step 2 Using the cvpi status cvp shell command make sure that all nodes in the cluster areoperational
Step 3 Using vSphere client shutdown one CVP node VM by selecting the node in the left pane andthen click the Shut down the virtual machine option
Chapter 21 Troubleshooting and Health Checks Resource Checks
Configuration Guide CloudVision version 201720 329
Step 4 On the CVP node VM increase the memory allocation to 32GB by right-clicking the node iconand then choose Edit Settings
The Virtual Machine Properties dialog appears
Step 5 Do the following to increase the memory allocation for the CVP node VM
bull Using the Memory Size option click the up arrow to increase the size to 32GB
bull Click the OK button
The memory allocation for the CVP node VM is changed to 32GB The page refreshesshowing options to power on the VM or continue making edits to the VM properties
330 Configuration Guide CloudVision version 201720
Resource Checks Chapter 21 Troubleshooting and Health Checks
Step 6 Click the Power on the virtual machine option
Step 7 Wait for the cluster to reform
Step 8 Once the cluster is reformed repeat step 1 through step 7 one node at a time on each of theremaining CVP node VMs in the cluster
Related topics
bull ldquoTroubleshootingrdquo on page 318
bull ldquoSystem Recoveryrdquo on page 321
bull ldquoHealth Checksrdquo on page 323
324 Configuration Guide CloudVision version 201720
Resource Checks Chapter 21 Troubleshooting and Health Checks
Example
This example shows output of the cvpi resources command In this example the disk bandwidthstatus is healthy (above the 20MBs threshold)
Figure 21-1 Example output of cvpi resources command
Related topics
bull ldquoResource Checksrdquo
bull ldquoTroubleshootingrdquo on page 318
bull ldquoHealth Checksrdquo on page 323
214 Resource ChecksCloudVision Portal (CVP) enables you to run resource checks on CVP node VMs You can run checksto determine the current data disk size of VMs that you have upgraded to CVP version 201720 andto determine the current memory allocation for each CVP node VM
Performing these resource checks is important to ensure that the CVP node VMs in your deploymenthave the recommended data disk size and memory allocation for using the Telemetry feature If theresource checks show that the CVP node VM data disk size or memory allocation (RAM) are below therecommended levels you can increase the data disk size and memory allocation
These procedures provide detailed instructions on how to perform the resource checks and if neededhow to increase the CVP node VM data disk size and CVP node VM memory allocation
bull ldquoRunning CVP node VM Resource Checksrdquo
bull ldquoIncreasing Disk Size of VMs Upgraded to CVP Version 201720rdquo on page 325
bull ldquoIncreasing CVP Node VM Memory Allocationrdquo on page 327
2141 Running CVP node VM Resource Checks
CloudVision Portal (CVP) enables you to quickly and easily check the current resources of the primarysecondary and tertiary nodes of a cluster by running a single command The command you use is thecvpi resources command
Use this command to check the following CVP node VM resources
bull Memory allocation
bull Data disk size (storage capacity)
bull Disk throughput (in MB per second)
bull Number of CPUs
Complete the following steps to run the CVP node VM resource check
Step 1 Login to one of the CVP nodes as root
Chapter 21 Troubleshooting and Health Checks Resource Checks
Configuration Guide CloudVision version 201720 325
Step 2 Execute the cvpi resources command
The output shows the current resources for each CVP node VM (see Figure 21-2)
bull If the total size of sdb1 (or vdb1) is approximately 120G or less you can increase the disksize to 1TB (see ldquoIncreasing Disk Size of VMs Upgraded to CVP Version 201720rdquo)
bull If the memory allocation is the default of 16GB you can increase the RAM memoryallocation (see ldquoIncreasing CVP Node VM Memory Allocationrdquo)
Figure 21-2 Using the cvpi resource command to run CVP node VM resource checks
2142 Increasing Disk Size of VMs Upgraded to CVP Version 201720
If you already upgraded any CVP node VMs running an older version of CVP to version 201720 youmay need to increase the size of the data disk of the VMs so that the data disks have the 1TB diskimage that is used on current CVP node VMs
CVP node VM data disks that you upgraded to version 201720 may still have the original disk image(120GB data image) because the standard upgrade procedure did not upgrade the data disk imageThe standard upgrade procedure updated only the root disk which contains the Centos image alongwith rpms for CVPI CVP and Telemetry
Note It is recommended that each CVP node have 1TB of disk space reserved for enabling CVP TelemetryIf the CVP nodes in your current environment do not have the recommended reserved disk space of1TB complete the procedure below for increasing the disk size of CVP node VMs
Pre-requisites
Before you begin the procedure make sure that you
bull Have upgraded to version 201720 You cannot increase the data disk size until you havecompleted the upgrade to version 201720 (see ldquoUpgrading CloudVision Portal (CVP)rdquo onpage 304)
bull Have performed the resource check to verify that the CVP node VMs have the data disk size imageof previous CVP versions (approximately 120GB or less) See ldquoRunning CVP node VM ResourceChecksrdquo on page 324
bull Make sure that you perform a GUI-based backup of the CVP system and copy the backup to a safelocation (a location off of the CVP node VMs) The CVP GUI enables you to create a backup youcan use to restore CVP data (see ldquoUsing the GUI to Backup and Restore Datardquo on page 298)
326 Configuration Guide CloudVision version 201720
Resource Checks Chapter 21 Troubleshooting and Health Checks
Procedure
Complete the following steps to increase the data disk size
Step 1 Turn off cvpi service by executing the systemctl stop cvpi command on all nodes in thecluster (For a single-node installation run this command on the node)
Step 2 Run the cvpi -v=3 stop all on the primary node
Step 3 Perform a graceful power-off of all VMs
Note You do not need to unregister and re-register VMs from vSphere Client or undefine and redefine VMsfrom kvm hypervisor
Step 4 Do the following to increase the size of the data disk to 1TB using the hypervisor
bull ESX Using vSphere client do the following (see Figure 21-3 for an example)a Select the Virtual Hardware tab and then select hard disk 2b Change the setting from 120GB to 1TBc Click OK
bull KVM Use the qemu-img resize command to resize the data disk from 120GB to 1TB Besure to select disk2qcow2
Figure 21-3 Using vSphere to increase data disk size
Step 5 Power on all CVP node VMs and wait for all services to start
Step 6 Use the cvpi status all command to verify that all the cvpi services are running
Chapter 21 Troubleshooting and Health Checks Resource Checks
Configuration Guide CloudVision version 201720 327
Step 7 Run the cvpitoolsdiskResizepy command on the primary node (Do not run thiscommand on the secondary and tertiary nodes)
Step 8 Run the df -h data command on all nodes to verify that the data is increased toapproximately 1TB
Step 9 Wait for all services to start
Step 10 Use the cvpi -v=3 status all command to verify the status of services
Step 11 Use the systemctl status cvpi to ensure that cvpi service is running
Related topics
bull ldquoIncreasing CVP Node VM Memory Allocationrdquo
bull ldquoRunning CVP node VM Resource Checksrdquo on page 324
2143 Increasing CVP Node VM Memory Allocation
If the CVP Open Virtual Appliance (OVA) template currently specifies the default of 16GB of memoryallocated for the CVP node VMs in the CVP cluster you need to increase the RAM to ensure that theCVP node VMs have adequate memory allocated for using the Telemetry feature
Note It is recommended that CVP node VMs have 32GB of RAM allocated for deployments in whichTelemetry is enabled
You can perform a rolling modification to increase the RAM allocation of every node in the cluster Ifyou want to keep the service up and available while you are performing the rolling modification makesure that you perform the procedure on only one CVP node VM at a time
Once you have completed the procedure on a node you repeat the procedure on another node in thecluster You must complete the procedure once for every node in the cluster
Pre-requisites
Before you begin the procedure make sure that you
bull Have performed the resource check to verify that the CVP node VMs have the default RAMmemory allocation of 16GB (see ldquoRunning CVP node VM Resource Checksrdquo on page 324)
bull Make sure that you perform a GUI-based backup of the CVP system and copy the backup to a safelocation (a location off of the CVP node VMs) The CVP GUI enables you to create a backup youcan use to restore CVP data (see ldquoRunning CVP node VM Resource Checksrdquo on page 324)
Procedure
Complete the following steps to increase the RAM memory allocation of the CVP node VMs
Step 1 Login to a CVP node of the cluster as cvp user
328 Configuration Guide CloudVision version 201720
Resource Checks Chapter 21 Troubleshooting and Health Checks
Step 2 Using the cvpi status cvp shell command make sure that all nodes in the cluster areoperational
Step 3 Using vSphere client shutdown one CVP node VM by selecting the node in the left pane andthen click the Shut down the virtual machine option
Chapter 21 Troubleshooting and Health Checks Resource Checks
Configuration Guide CloudVision version 201720 329
Step 4 On the CVP node VM increase the memory allocation to 32GB by right-clicking the node iconand then choose Edit Settings
The Virtual Machine Properties dialog appears
Step 5 Do the following to increase the memory allocation for the CVP node VM
bull Using the Memory Size option click the up arrow to increase the size to 32GB
bull Click the OK button
The memory allocation for the CVP node VM is changed to 32GB The page refreshesshowing options to power on the VM or continue making edits to the VM properties
330 Configuration Guide CloudVision version 201720
Resource Checks Chapter 21 Troubleshooting and Health Checks
Step 6 Click the Power on the virtual machine option
Step 7 Wait for the cluster to reform
Step 8 Once the cluster is reformed repeat step 1 through step 7 one node at a time on each of theremaining CVP node VMs in the cluster
Related topics
bull ldquoTroubleshootingrdquo on page 318
bull ldquoSystem Recoveryrdquo on page 321
bull ldquoHealth Checksrdquo on page 323
Chapter 21 Troubleshooting and Health Checks Resource Checks
Configuration Guide CloudVision version 201720 325
Step 2 Execute the cvpi resources command
The output shows the current resources for each CVP node VM (see Figure 21-2)
bull If the total size of sdb1 (or vdb1) is approximately 120G or less you can increase the disksize to 1TB (see ldquoIncreasing Disk Size of VMs Upgraded to CVP Version 201720rdquo)
bull If the memory allocation is the default of 16GB you can increase the RAM memoryallocation (see ldquoIncreasing CVP Node VM Memory Allocationrdquo)
Figure 21-2 Using the cvpi resource command to run CVP node VM resource checks
2142 Increasing Disk Size of VMs Upgraded to CVP Version 201720
If you already upgraded any CVP node VMs running an older version of CVP to version 201720 youmay need to increase the size of the data disk of the VMs so that the data disks have the 1TB diskimage that is used on current CVP node VMs
CVP node VM data disks that you upgraded to version 201720 may still have the original disk image(120GB data image) because the standard upgrade procedure did not upgrade the data disk imageThe standard upgrade procedure updated only the root disk which contains the Centos image alongwith rpms for CVPI CVP and Telemetry
Note It is recommended that each CVP node have 1TB of disk space reserved for enabling CVP TelemetryIf the CVP nodes in your current environment do not have the recommended reserved disk space of1TB complete the procedure below for increasing the disk size of CVP node VMs
Pre-requisites
Before you begin the procedure make sure that you
bull Have upgraded to version 201720 You cannot increase the data disk size until you havecompleted the upgrade to version 201720 (see ldquoUpgrading CloudVision Portal (CVP)rdquo onpage 304)
bull Have performed the resource check to verify that the CVP node VMs have the data disk size imageof previous CVP versions (approximately 120GB or less) See ldquoRunning CVP node VM ResourceChecksrdquo on page 324
bull Make sure that you perform a GUI-based backup of the CVP system and copy the backup to a safelocation (a location off of the CVP node VMs) The CVP GUI enables you to create a backup youcan use to restore CVP data (see ldquoUsing the GUI to Backup and Restore Datardquo on page 298)
326 Configuration Guide CloudVision version 201720
Resource Checks Chapter 21 Troubleshooting and Health Checks
Procedure
Complete the following steps to increase the data disk size
Step 1 Turn off cvpi service by executing the systemctl stop cvpi command on all nodes in thecluster (For a single-node installation run this command on the node)
Step 2 Run the cvpi -v=3 stop all on the primary node
Step 3 Perform a graceful power-off of all VMs
Note You do not need to unregister and re-register VMs from vSphere Client or undefine and redefine VMsfrom kvm hypervisor
Step 4 Do the following to increase the size of the data disk to 1TB using the hypervisor
bull ESX Using vSphere client do the following (see Figure 21-3 for an example)a Select the Virtual Hardware tab and then select hard disk 2b Change the setting from 120GB to 1TBc Click OK
bull KVM Use the qemu-img resize command to resize the data disk from 120GB to 1TB Besure to select disk2qcow2
Figure 21-3 Using vSphere to increase data disk size
Step 5 Power on all CVP node VMs and wait for all services to start
Step 6 Use the cvpi status all command to verify that all the cvpi services are running
Chapter 21 Troubleshooting and Health Checks Resource Checks
Configuration Guide CloudVision version 201720 327
Step 7 Run the cvpitoolsdiskResizepy command on the primary node (Do not run thiscommand on the secondary and tertiary nodes)
Step 8 Run the df -h data command on all nodes to verify that the data is increased toapproximately 1TB
Step 9 Wait for all services to start
Step 10 Use the cvpi -v=3 status all command to verify the status of services
Step 11 Use the systemctl status cvpi to ensure that cvpi service is running
Related topics
bull ldquoIncreasing CVP Node VM Memory Allocationrdquo
bull ldquoRunning CVP node VM Resource Checksrdquo on page 324
21.4.3 Increasing CVP Node VM Memory Allocation

If the CVP Open Virtual Appliance (OVA) template currently specifies the default of 16GB of memory allocated for the CVP node VMs in the CVP cluster, you need to increase the RAM to ensure that the CVP node VMs have adequate memory allocated for using the Telemetry feature.

Note It is recommended that CVP node VMs have 32GB of RAM allocated for deployments in which Telemetry is enabled.

You can perform a rolling modification to increase the RAM allocation of every node in the cluster. If you want to keep the service up and available while you are performing the rolling modification, make sure that you perform the procedure on only one CVP node VM at a time.

Once you have completed the procedure on a node, repeat it on another node in the cluster. You must complete the procedure once for every node in the cluster.
Prerequisites

Before you begin the procedure, make sure that you:

• Have performed the resource check to verify that the CVP node VMs have the default RAM memory allocation of 16GB (see "Running CVP node VM Resource Checks" on page 324).

• Have performed a GUI-based backup of the CVP system and copied the backup to a safe location (a location off of the CVP node VMs). The CVP GUI enables you to create a backup you can use to restore CVP data (see "Running CVP node VM Resource Checks" on page 324).
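The first prerequisite, confirming the node's current allocation, can also be checked from the node's shell. This sketch assumes a procps-style free(1) in the node's Linux userland; it is not a CVP tool.

```shell
#!/bin/sh
# Sketch: report installed RAM in GB on a CVP node, to confirm whether the
# VM still has the 16GB default before the change (and 32GB afterwards).
# ASSUMPTION: procps-style free(1); falls back to 0 when unavailable.

installed_ram_gb() {
    if command -v free >/dev/null 2>&1; then
        free -g | awk '/^Mem:/ {print $2}'
    else
        echo 0
    fi
}

if [ "$(installed_ram_gb)" -lt 32 ]; then
    echo "RAM is below the 32GB recommended for Telemetry; plan the rolling upgrade"
fi
```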
Procedure
Complete the following steps to increase the RAM memory allocation of the CVP node VMs.

Step 1 Log in to a CVP node of the cluster as the cvp user.
Step 2 Using the cvpi status cvp shell command, make sure that all nodes in the cluster are operational.

Step 3 Using the vSphere client, shut down one CVP node VM by selecting the node in the left pane and then clicking the Shut down the virtual machine option.
Step 4 On the CVP node VM, increase the memory allocation to 32GB by right-clicking the node icon and then choosing Edit Settings.

The Virtual Machine Properties dialog appears.

Step 5 Do the following to increase the memory allocation for the CVP node VM:

• Using the Memory Size option, click the up arrow to increase the size to 32GB.

• Click the OK button.

The memory allocation for the CVP node VM is changed to 32GB. The page refreshes, showing options to power on the VM or continue making edits to the VM properties.
Step 6 Click the Power on the virtual machine option.

Step 7 Wait for the cluster to reform.

Step 8 Once the cluster is reformed, repeat step 1 through step 7, one node at a time, on each of the remaining CVP node VMs in the cluster.
Related topics:

• "Troubleshooting" on page 318

• "System Recovery" on page 321

• "Health Checks" on page 323