
Stale Results

If a triangle with an exclamation point inside it is displayed next to your results, the results are stale: they have not been updated recently. This is usually caused by one of the following:

a) The BVE, DGE and DGEx server times are not synchronized.

b) A table in the MySQL database needs repairing, or the database needs to be optimized.

c) The DGEx is temporarily unable to connect to the upstream DGE.

d) Too many tests have been provisioned on a single DGE.

[Image: Stale_Results.png]

 

Please check the following:

* Confirm that the correct time zone is selected under ADMINISTRATION >> PREFERENCES.

* Check the "Work Units Processed" and "Writer Queue size" tests for the DGE monitoring the device with 'Stale Results'. If the 'Writer Queue Size (AggregatedDataDbWriter0)' test is high compared to its historical values, this points to a database repair or optimize being required.

* Check SUPERUSER >> HEALTH

* If you see any time offsets in SUPERUSER >> HEALTH, please correct them as per Traverse Component Health Page Shows Non-Zero Time Offsets.

 

Check whether the BVE or the DGE(x) that is monitoring the devices with 'Stale Results' requires a database repair:

>cd C:\Program Files (x86)\Traverse\utils

db_optimize.pl --info                              (Note: run the '.pl' version of db_optimize)

Scroll to the very top of the output. If you see the message below, the database requires a repair:

"Table 'XXXXX' is marked as crashed and last (automatic?) repair failed"

If the above message has scrolled out of view because too many lines were output, you can redirect the output to a file:

>db_optimize.pl --info >> output.txt
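
Once the output is in a file, searching it for the crashed-table message can be scripted. The following is a minimal Python sketch, not part of Traverse. It assumes the output was saved as output.txt as above and that the message matches the wording quoted in this article; the function names are hypothetical.

```python
import re

# Pattern taken from the message quoted in this article. The exact wording
# printed by db_optimize.pl may differ between versions (assumption).
CRASHED = re.compile(r"Table '([^']+)' is marked as crashed")


def find_crashed_tables(text):
    """Return the table names reported as crashed, in order, without duplicates."""
    tables = []
    for name in CRASHED.findall(text):
        if name not in tables:
            tables.append(name)
    return tables


def scan_file(path="output.txt"):
    """Read a saved db_optimize report and return the crashed table names."""
    # errors="replace" keeps the scan going if the console code page
    # produced bytes that are not valid UTF-8.
    with open(path, encoding="utf-8", errors="replace") as handle:
        return find_crashed_tables(handle.read())
```

Calling scan_file() from the directory that contains output.txt returns an empty list when no table is reported as crashed; any names returned indicate that a repair is required.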

 

* If a repair is required, please follow How Do I Repair An Error With A Traverse Database. Bear in mind that this may take several hours to complete. It is advised to tail the database_repair.log in a separate command prompt so you can confirm it's running.

>cd C:\Program Files (x86)\Traverse\utils

db_repair.cmd

tail -f database_repair.log            (<TraverseHome>\logs)

* If a repair was carried out, an optimize will then be required. Even if a repair was not required, running db_optimize is advisable, especially if the "Writer Queue size" test on the DGE is high, as this indicates a non-optimized database. The optimize completes immediately if it was not required; otherwise it can also take several hours. Tip: see How Do I Optimize The DGE Database.

>utils\db_optimize.pl --info

utils\db_optimize.pl --run

You can tail the database_init.log in a separate command prompt, to confirm the optimize is running.

>tail -f database_init.log            (<TraverseHome>\logs)

 

Once the above has been completed, it will take a while for the system to catch up on cached results. Select a couple of tests with stale results and note the timestamp of the most recent result (STATUS >> DEVICES >> select the blue device name hyperlink). Revisit the test 10 minutes later and confirm that the timestamp is advancing until it reaches local time, or is within one polling interval of the current time.
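
The catch-up check described above amounts to comparing how far behind the latest result is at two points in time. A minimal Python sketch of that arithmetic follows. The timestamps and the 5-minute polling interval are made-up values for illustration, and the function names are hypothetical rather than anything provided by Traverse.

```python
from datetime import datetime, timedelta


def result_lag(latest_result, observed_at):
    """How far the most recent result is behind the time it was observed."""
    return observed_at - latest_result


def is_catching_up(first_check, second_check):
    """Each check is a (latest_result, observed_at) pair of datetimes.

    Returns True when the lag at the second check is smaller than at the first.
    """
    return result_lag(*second_check) < result_lag(*first_check)


def is_current(latest_result, observed_at, polling_interval):
    """True once the latest result is within one polling interval of the observation time."""
    return result_lag(latest_result, observed_at) <= polling_interval


# Hypothetical readings for one test, taken 10 minutes apart.
first = (datetime(2017, 3, 9, 4, 0), datetime(2017, 3, 9, 9, 0))     # 5 h behind
second = (datetime(2017, 3, 9, 4, 40), datetime(2017, 3, 9, 9, 10))  # 4 h 30 min behind

print(is_catching_up(first, second))              # lag shrank between the two checks
print(is_current(*second, timedelta(minutes=5)))  # assumed 5-minute polling interval
```

If the lag is not shrinking between checks, the DGE is not catching up and the remaining causes below should be investigated.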

 

d) Too many tests have been provisioned on a single DGE

If the above does not remedy the situation, there may be too many tests assigned to a single DGE. Even if the soft/hard test limit has not been breached, the DGE may be polling so many tests that it cannot write test results in real time. Please check that you meet the minimum hardware requirements as per System Requirements (On Premise). Also see Deployment Considerations for the currently recommended number of tests per DGE.

1) Check the audit.log to see if a large number of tests were added in the run-up to the stale results.

2) Check the monitor/error log for the log entries below (or the WMI equivalent). If no tests were added recently, we would expect to see these entries in the days before the stale results were encountered.

2017-03-09 04:53:45,911 m.MonitorServer[Monitor Server]: (WARN ) Warning: too many elements on workQueue for Monitor[snmp]]: 175663 elements with 46479 testConfigs
2017-03-09 04:54:45,915 m.MonitorServer[Monitor Server]: (WARN ) Warning: too many elements on workQueue for Monitor[snmp]]: 175603 elements with 46479 testConfigs
2017-03-09 04:55:45,919 m.MonitorServer[Monitor Server]: (WARN ) Warning: too many elements on workQueue for Monitor[snmp]]: 180150 elements with 46479 testConfigs
2017-03-09 04:56:45,923 m.MonitorServer[Monitor Server]: (WARN ) Warning: too many elements on workQueue for Monitor[snmp]]: 180110 elements with 46479 testConfigs
2017-03-09 04:57:45,928 m.MonitorServer[Monitor Server]: (WARN ) Warning: too many elements on workQueue for Monitor[snmp]]: 180103 elements with 46479 testConfigs
2017-03-09 04:58:45,932 m.MonitorServer[Monitor Server]: (WARN ) Warning: too many elements on workQueue for Monitor[snmp]]: 180153 elements with 46479 testConfigs

3) If you suspect that there are too many tests assigned to a DGE, suspend a large number of tests and confirm that the results start catching up.

4) Ultimately, you may wish to migrate the devices on a DGEx to a new DGEx on a different DGE, as per Migrating/Moving Devices From One DGE To Another or Move Devices From One DGE(-X) To Another DGEx.
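
The workQueue warnings referred to in step 2 can be summarized with a short script when the log is long. The Python sketch below is an illustration only. Its regular expression is derived from the sample log lines in this article, so other Traverse versions or monitor types (such as WMI) may need a different pattern, and the function names are hypothetical.

```python
import re
from datetime import datetime

# Pattern derived from the sample lines in this article (assumption: other
# versions or monitor types may format the warning differently).
WARNING = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) .*"
    r"too many elements on workQueue for Monitor\[([^\]]+)\]+: "
    r"(\d+) elements with (\d+) testConfigs"
)


def parse_warnings(lines):
    """Return (timestamp, monitor, elements, test_configs) for each warning line."""
    rows = []
    for line in lines:
        match = WARNING.search(line)
        if match:
            when = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S,%f")
            rows.append((when, match.group(2), int(match.group(3)), int(match.group(4))))
    return rows


def summarize(rows):
    """Per monitor: warning count, first, last and peak queue size, last testConfigs."""
    summary = {}
    for _when, monitor, elements, test_configs in rows:
        entry = summary.setdefault(
            monitor, {"count": 0, "first": elements, "peak": elements}
        )
        entry["count"] += 1
        entry["last"] = elements
        entry["peak"] = max(entry["peak"], elements)
        entry["test_configs"] = test_configs
    return summary


# Two of the sample lines quoted in step 2.
sample = [
    "2017-03-09 04:53:45,911 m.MonitorServer[Monitor Server]: (WARN ) Warning: too many elements on workQueue for Monitor[snmp]]: 175663 elements with 46479 testConfigs",
    "2017-03-09 04:55:45,919 m.MonitorServer[Monitor Server]: (WARN ) Warning: too many elements on workQueue for Monitor[snmp]]: 180150 elements with 46479 testConfigs",
]
print(summarize(parse_warnings(sample)))
```

A queue size that stays high or keeps growing across consecutive warnings, as in the sample above, is consistent with the DGE being unable to keep up with the number of tests assigned to it.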

 

 
