PROBLEM:
Unable to collect PDH data
All PDH tests to a specific device start reporting "Unable to collect PDH data". Other regular WMI test monitoring to the device is not impacted.
RESOLUTION:
* Perform a manual test discovery. In our experience, this often rectifies the issue. Navigate to Administration --> Devices, find the device in question, click its Tests link, select Create New Standard Tests, and follow the rest of the workflow.
* Run probewmitests.pl manually from the DGE/DGEX monitoring the device, as documented in https://helpdesk.kaseya.com/hc/en-gb/articles/229044448-Troubleshooting-WMI-properties-for-Windows-monitoring. This performs the same discovery as the step above, but runs directly on the DGE/DGEX, and may likewise rectify the issue.
* Stop and start the Traverse WMI Query Daemon component. This is not expected to resolve the issue, but may be worth a try before performing the next step.
* If none of the above helps, reboot the DGE/DGEX server. Since this is a machine-to-machine connection, a reboot is the only solution we are aware of, because it forces the connection to be "lost". Upon restart, a new connection will be established and monitoring of the PDH counters should resume.
DETAILS:
Traverse can monitor PDH (Performance Data Helper) counters, also referred to as Performance Counter tests. An example is Exchange 2013 monitoring. Traverse places these in the broader category of WMI tests, together with the regular WMI (Windows Management Instrumentation) performance counters (such as Win32_PerfRawData_PerfDisk_LogicalDisk). A device in Traverse can have both sets of tests configured.
Though Traverse categorizes both test types as WMI, there is a fundamental difference in how these counters are queried. This has to do with how the underlying OS establishes and manages connections. Traverse relies on the OS capabilities and uses the high-level APIs to query the counter values.
Connections to query PDH counters are machine specific; connections to query WMI counters are query or session specific and not machine specific.
When the target system responds with "Unable to collect PDH data", it usually indicates a disruption to the machine-specific connection. Unfortunately, it is not clear from the Microsoft documentation how such a connection may be torn down and reestablished.
How do I identify if a test is a PDH test or a WMI test?
Review the "WMI Property" field in the test properties. Note that a WMI property uses : as the separator, and the property name will almost always start with Win32_.
For a PDH test, the value would be of the syntax:
\Counter\Property
An example is \MSExchangeIS\RPC Averaged latency
For a WMI test, the value would be of the syntax:
\Counter:Property:Instance
An example is \Win32_PerfRawData_PerfOS_Processor:PercentProcessorTime:Name="0"
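The check described above can be sketched as a small script, which may help when reviewing many tests at once. This is only an illustrative heuristic based on the two syntaxes shown in this article, not part of Traverse; the function name classify_wmi_property is hypothetical, and the "WMI Property" value is assumed to have been copied out of the test properties by hand.

```python
def classify_wmi_property(value: str) -> str:
    """Guess whether a "WMI Property" value belongs to a PDH or a WMI test.

    Heuristic only, derived from the two syntaxes described in this article:
      PDH: \\Counter\\Property
      WMI: \\Counter:Property:Instance (name almost always starts with Win32_)
    """
    # Drop surrounding whitespace and the leading backslash both syntaxes share.
    text = value.strip().lstrip("\\")
    if not text:
        return "unknown"
    # WMI class names almost always start with Win32_.
    if text.startswith("Win32_"):
        return "WMI"
    # A remaining backslash separates counter and property, as in a PDH path.
    if "\\" in text:
        return "PDH"
    # Colon separators without a second path element suggest a WMI property.
    if ":" in text:
        return "WMI"
    return "unknown"


if __name__ == "__main__":
    samples = [
        r"\MSExchangeIS\RPC Averaged latency",
        r'\Win32_PerfRawData_PerfOS_Processor:PercentProcessorTime:Name="0"',
    ]
    for sample in samples:
        print(f"{classify_wmi_property(sample):8} {sample}")
```

The Win32_ prefix is checked first because it is the strongest hint given above; the backslash check comes before the colon check so that a PDH path whose instance name happens to contain a colon is not mistaken for a WMI property. Anything matching neither pattern is reported as unknown and should be reviewed manually.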
APPLIES TO:
All versions of Traverse