Hi all,
I have two VMs that keep raising a vSphere HA virtual machine monitoring error.
It is recurring because as soon as I reset the alarm to green, it turns red again.
I have the following recurring errors in the fdm.log:
2019-10-28T13:18:34.779Z verbose fdm[404CB70] [Originator@6876 sub=Policy] [VmOperationsManager::ReportVmMetricsResult] reset vm /vmfs/volumes/5b7d31cc-e7ec791b-6587-0017a4770820/YYYYYYYYYYYYY.vmx: false
2019-10-28T13:18:34.783Z verbose fdm[404CB70] [Originator@6876 sub=Policy] [VmOperationsManager::ReportVmMetricsResult] reset vm /vmfs/volumes/5cb1f615-60e54c97-8233-0017a4770438/XXXXXXXX.vmx: false
2019-10-28T13:18:34.783Z verbose fdm[404CB70] [Originator@6876 sub=Hal] No stats listeners! Nothing to do!
2019-10-28T13:18:36.380Z info fdm[3F89B70] [Originator@6876 sub=Cluster opID=SWI-3ab50c2a] [ClusterManagerImpl::LogState] hostId=host-261 state=Slave master=host-47194 isolated=false host-list-version=276 config-version=26355 vm-metadata-version=86995 slv-mst-tdiff-sec=0
2019-10-28T13:18:37.780Z verbose fdm[404CB70] [Originator@6876 sub=Policy] [MonitorHeartbeatState::MoveToResetStateIfTimersExpired] VM /vmfs/volumes/5cb1f615-60e54c97-8233-0017a4770438/XXXXXXXXXX.vmx is going to reset state because of GOS crash.Reset no: 1 out of Max allowed reset count: 3
2019-10-28T13:18:37.780Z verbose fdm[404CB70] [Originator@6876 sub=Policy] [MonitorHeartbeatState::MoveToResetStateIfTimersExpired] VM /vmfs/volumes/5b7d31cc-e7ec791b-6587-0017a4770820/YYYYYYYYYYYY.vmx is going to reset state because of GOS crash.Reset no: 1 out of Max allowed reset count: 3
2019-10-28T13:18:37.780Z info fdm[404CB70] [Originator@6876 sub=Policy] [VmOperationsManager::PerformCheckIoStats] Checking io stats on a list of 2 VMs.
2019-10-28T13:18:34.779Z verbose fdm[404CB70] [Originator@6876 sub=Policy] [VmOperationsManager::CompleteCheckVmMetrics] IO metrics value is 2 at 0 for metric 196608
2019-10-28T13:18:34.779Z verbose fdm[404CB70] [Originator@6876 sub=Policy] [VmOperationsManager::CompleteCheckVmMetrics] IO metrics value is 58 at 1 for metric 196608
2019-10-28T13:18:34.779Z verbose fdm[404CB70] [Originator@6876 sub=Policy] [VmOperationsManager::CompleteCheckVmMetrics] IO metrics value is 109 at 2 for metric 196608
2019-10-28T13:18:34.779Z verbose fdm[404CB70] [Originator@6876 sub=Policy] [VmOperationsManager::CompleteCheckVmMetrics] IO metrics value is 1 at 3 for metric 196608
2019-10-28T13:18:34.779Z verbose fdm[404CB70] [Originator@6876 sub=Policy] [VmOperationsManager::CompleteCheckVmMetrics] IO metrics value is 2 at 4 for metric 196608
2019-10-28T13:18:34.779Z verbose fdm[404CB70] [Originator@6876 sub=Policy] [VmOperationsManager::CompleteCheckVmMetrics] IO metrics value is 3 at 5 for metric 196608
2019-10-28T13:18:34.779Z verbose fdm[404CB70] [Originator@6876 sub=Policy] [VmOperationsManager::CompleteCheckVmMetrics] IO metrics value is 0 at 0 for metric 589826
2019-10-28T13:18:34.779Z verbose fdm[404CB70] [Originator@6876 sub=Policy] [VmOperationsManager::CompleteCheckVmMetrics] IO metrics value is 0 at 1 for metric 589826
2019-10-28T13:18:34.779Z verbose fdm[404CB70] [Originator@6876 sub=Policy] [VmOperationsManager::CompleteCheckVmMetrics] IO metrics value is 0 at 2 for metric 589826
2019-10-28T13:18:34.779Z verbose fdm[404CB70] [Originator@6876 sub=Policy] [VmOperationsManager::CompleteCheckVmMetrics] IO metrics value is 0 at 3 for metric 589826
2019-10-28T13:18:34.779Z verbose fdm[404CB70] [Originator@6876 sub=Policy] [VmOperationsManager::CompleteCheckVmMetrics] IO metrics value is 0 at 4 for metric 589826
2019-10-28T13:18:34.779Z verbose fdm[404CB70] [Originator@6876 sub=Policy] [VmOperationsManager::CompleteCheckVmMetrics] IO metrics value is 0 at 5 for metric 589826
2019-10-28T13:18:34.779Z verbose fdm[404CB70] [Originator@6876 sub=Policy] [VmOperationsManager::CompleteCheckVmMetrics] IO metrics value is 0 at 0 for metric 589827
2019-10-28T13:18:34.779Z verbose fdm[404CB70] [Originator@6876 sub=Policy] [VmOperationsManager::CompleteCheckVmMetrics] IO metrics value is 0 at 1 for metric 589827
2019-10-28T13:18:34.779Z verbose fdm[404CB70] [Originator@6876 sub=Policy] [VmOperationsManager::CompleteCheckVmMetrics] IO metrics value is 0 at 2 for metric 589827
2019-10-28T13:18:34.779Z verbose fdm[404CB70] [Originator@6876 sub=Policy] [VmOperationsManager::CompleteCheckVmMetrics] IO metrics value is 0 at 3 for metric 589827
2019-10-28T13:18:34.779Z verbose fdm[404CB70] [Originator@6876 sub=Policy] [VmOperationsManager::CompleteCheckVmMetrics] IO metrics value is 0 at 4 for metric 589827
2019-10-28T13:18:34.779Z verbose fdm[404CB70] [Originator@6876 sub=Policy] [VmOperationsManager::CompleteCheckVmMetrics] IO metrics value is 0 at 5 for metric 589827
2019-10-28T13:18:34.779Z verbose fdm[404CB70] [Originator@6876 sub=Policy] [VmOperationsManager::CompleteCheckVmMetrics] IO metrics value is 23 at 0 for metric 589827
2019-10-28T13:18:34.779Z verbose fdm[404CB70] [Originator@6876 sub=Policy] [VmOperationsManager::CompleteCheckVmMetrics] IO metrics value is 24 at 1 for metric 589827
2019-10-28T13:18:34.779Z verbose fdm[404CB70] [Originator@6876 sub=Policy] [VmOperationsManager::CompleteCheckVmMetrics] IO metrics value is 23 at 2 for metric 589827
2019-10-28T13:18:34.779Z verbose fdm[404CB70] [Originator@6876 sub=Policy] [VmOperationsManager::CompleteCheckVmMetrics] IO metrics value is 19 at 3 for metric 589827
2019-10-28T13:18:34.779Z verbose fdm[404CB70] [Originator@6876 sub=Policy] [VmOperationsManager::CompleteCheckVmMetrics] IO metrics value is 23 at 4 for metric 589827
2019-10-28T13:18:34.779Z verbose fdm[404CB70] [Originator@6876 sub=Policy] [VmOperationsManager::CompleteCheckVmMetrics] IO metrics value is 20 at 5 for metric 589827
2019-10-28T13:18:34.779Z verbose fdm[404CB70] [Originator@6876 sub=Policy] [VmOperationsManager::CompleteCheckVmMetrics] IO metrics value is 0 at 0 for metric 589826
2019-10-28T13:18:34.779Z verbose fdm[404CB70] [Originator@6876 sub=Policy] [VmOperationsManager::CompleteCheckVmMetrics] IO metrics value is 0 at 1 for metric 589826
2019-10-28T13:18:34.779Z verbose fdm[404CB70] [Originator@6876 sub=Policy] [VmOperationsManager::CompleteCheckVmMetrics] IO metrics value is 0 at 2 for metric 589826
2019-10-28T13:18:34.779Z verbose fdm[404CB70] [Originator@6876 sub=Policy] [VmOperationsManager::CompleteCheckVmMetrics] IO metrics value is 0 at 3 for metric 589826
2019-10-28T13:18:34.779Z verbose fdm[404CB70] [Originator@6876 sub=Policy] [VmOperationsManager::CompleteCheckVmMetrics] IO metrics value is 0 at 4 for metric 589826
2019-10-28T13:18:34.779Z verbose fdm[404CB70] [Originator@6876 sub=Policy] [VmOperationsManager::CompleteCheckVmMetrics] IO metrics value is 0 at 5 for metric 589826
2019-10-28T13:18:34.779Z verbose fdm[404CB70] [Originator@6876 sub=Policy] [VmOperationsManager::CompleteCheckVmMetrics] IO metrics value is 0 at 0 for metric 589826
From what I have seen, the VMs did indeed crash (confirmed in the Windows Server event logs); however, they have rebooted and are working correctly.
The crash was attributed to ntoskrnl.exe:
ntoskrnl.exe ntoskrnl.exe+3ed39c fffff800`0181b000 fffff800`01df8000 0x005dd000 0x5d803c60 9/17/2019 2:52:32 AM
VMware Tools is up to date on both VMs; however, the hardware version is ESXi 5.0 and later (VM version 8).
I am on ESXi 6.5 U1, and I know that it could be updated to a higher version.
Yet the ESXi host still reports these errors every minute or so.
I know the trivial answer is to migrate the VMs off that host and see whether the error is still there. However, I would like to troubleshoot this properly, as this is a recurring theme and the issue is likely to reappear, so I need to know what causes it.
I would be very happy if you could point me to further steps on how to troubleshoot this!
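In case it helps with reproducing what I am looking at, here is a minimal Python sketch of how the reset-related entries could be pulled out of fdm.log and grouped per VM. It is only an assumption-laden helper: the regular expression is derived solely from the two MoveToResetStateIfTimersExpired lines quoted above, the function name `collect_resets` is my own, and other builds may word the message differently.

```python
import re
from collections import defaultdict

# Pattern inferred only from the MoveToResetStateIfTimersExpired lines in this post;
# other ESXi/FDM builds may format the message differently (assumption).
RESET_RE = re.compile(
    r"^(?P<ts>\S+)\s.*MoveToResetStateIfTimersExpired\] VM (?P<vmx>.+?\.vmx) "
    r"is going to reset state because of (?P<reason>.+?)\.\s*"
    r"Reset no: (?P<num>\d+) out of Max allowed reset count: (?P<max>\d+)"
)


def collect_resets(lines):
    """Group reset-state log entries by .vmx path.

    Returns {vmx_path: [(timestamp, reason, reset_no, max_resets), ...]}.
    """
    resets = defaultdict(list)
    for line in lines:
        match = RESET_RE.search(line)
        if match:
            resets[match.group("vmx")].append(
                (
                    match.group("ts"),
                    match.group("reason"),
                    int(match.group("num")),
                    int(match.group("max")),
                )
            )
    return dict(resets)


# The two reset lines from the excerpt above, copied verbatim.
SAMPLE = [
    "2019-10-28T13:18:37.780Z verbose fdm[404CB70] [Originator@6876 sub=Policy] "
    "[MonitorHeartbeatState::MoveToResetStateIfTimersExpired] VM "
    "/vmfs/volumes/5cb1f615-60e54c97-8233-0017a4770438/XXXXXXXXXX.vmx is going to "
    "reset state because of GOS crash.Reset no: 1 out of Max allowed reset count: 3",
    "2019-10-28T13:18:37.780Z verbose fdm[404CB70] [Originator@6876 sub=Policy] "
    "[MonitorHeartbeatState::MoveToResetStateIfTimersExpired] VM "
    "/vmfs/volumes/5b7d31cc-e7ec791b-6587-0017a4770820/YYYYYYYYYYYY.vmx is going to "
    "reset state because of GOS crash.Reset no: 1 out of Max allowed reset count: 3",
]

for vmx, entries in collect_resets(SAMPLE).items():
    for ts, reason, num, max_resets in entries:
        print(f"{ts}  {vmx}  reason={reason}  reset {num}/{max_resets}")
```

To run it against a real log, the same function can be fed an open file handle (for example a copy of fdm.log taken from the host) instead of `SAMPLE`. Comparing the timestamps per VM should show whether the entries really repeat every minute or so.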