Troubleshooting: All Traverse component health status entries critical

This symptom is most likely caused by a failure within the 'Correlation & Summary Engine' (CSE) or the 'Internal Communication Bus' (a.k.a. Java Message Service or JMS) that prevents processing of the periodic heartbeat updates sent by each Traverse component on every Traverse server. Note that components other than JMS and CSE may continue to function correctly even though their heartbeats are not updated on the 'Superuser->Health' page.

In addition, while in this state the Event Manager (Status->Events) might not be updated.

To allow us to determine the cause of the failed heartbeat status updates, kindly gather the following data prior to restarting ANY Traverse components.

Where 'TRAVERSE_HOME' is the installation directory for Traverse, kindly forward:

  • a zip archive of the folder 'TRAVERSE_HOME\logs' from the BVE server 
  • a screenshot of the 'Superuser->Health' page showing the heartbeat statuses and their last update times
  • a screenshot of the Traverse Service Controller launched on a Windows BVE, or the output of the Linux command 'service traverse status'
  • a stack dump of the CSE component taken from a Windows command prompt on the BVE (launched using 'Run as Administrator'):
    cd TRAVERSE_HOME
    apps\jre\bin\java -classpath webapp\WEB-INF\lib\traverse-7.0.jar com.zyrion.traverse.utils.FullThreadDump localhost:7696 > cse_stack.dump

  • a heap dump of the CSE component created according to the instructions outlined in Generating a heap dump using JConsole (via TCP port 7696)
  • a stack dump of the JMS component taken from a Windows command prompt on the BVE (launched using 'Run as Administrator'):
    cd TRAVERSE_HOME
    apps\jre\bin\java -classpath webapp\WEB-INF\lib\traverse-7.0.jar com.zyrion.traverse.utils.FullThreadDump localhost:7697 > jms_stack.dump
  • a heap dump of the JMS component created according to the instructions outlined in Generating a heap dump using JConsole (via TCP port 7697)
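
On a Linux BVE, the thread dumps and the log archive listed above could be gathered in one pass. The following is only a minimal sketch under stated assumptions: the function names, the JAVA_BIN and TRAVERSE_VERSION variables, and the output file layout are hypothetical, the paths are the forward-slash form of the Windows paths shown above, and tar is swapped in for zip. The heap dumps still have to be taken with JConsole.

```shell
#!/bin/sh
# Sketch only: gathers the CSE/JMS thread dumps and a log archive on a Linux BVE.
# JAVA_BIN and TRAVERSE_VERSION are hypothetical helper variables, not Traverse settings.
JAVA_BIN="${JAVA_BIN:-apps/jre/bin/java}"
TRAVERSE_VERSION="${TRAVERSE_VERSION:-7.0}"

# Run the FullThreadDump utility against one component port and save the output.
# Must be called from inside the Traverse installation directory.
thread_dump() {
    port="$1"
    outfile="$2"
    "$JAVA_BIN" -classpath "webapp/WEB-INF/lib/traverse-${TRAVERSE_VERSION}.jar" \
        com.zyrion.traverse.utils.FullThreadDump "localhost:${port}" > "$outfile"
}

# collect_diagnostics <traverse_home> <absolute_output_dir>
# The body runs in a subshell so the caller's working directory is left unchanged.
collect_diagnostics() (
    home="$1"
    out="$2"
    mkdir -p "$out" || exit 1
    cd "$home" || exit 1
    thread_dump 7696 "$out/cse_stack.dump"   # CSE port, as given in this article
    thread_dump 7697 "$out/jms_stack.dump"   # JMS port, as given in this article
    tar -czf "$out/traverse_logs.tar.gz" logs
)
```

For example, with a hypothetical installation path: collect_diagnostics /opt/traverse /tmp/traverse_diag. Run it before restarting any Traverse component, and pass an absolute output directory, because the function changes into the installation directory before writing files.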

 

After capturing the diagnostic information above, restart the CSE component to temporarily correct the issue.  

Should the issue persist:

  • Stop all Traverse components on the BVE
  • Delete the folder 'TRAVERSE_HOME\database\jms\broker' on the BVE (it will be re-created when the JMS component starts)
  • Start all Traverse components on the BVE
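
On a Linux BVE, the three steps above might be scripted roughly as follows. This is a sketch under assumptions, not a documented procedure: the article only shows 'service traverse status', so the 'stop' and 'start' actions are assumed to exist, and the function name and the SERVICE_CMD variable are hypothetical.

```shell
#!/bin/sh
# Sketch only: stop Traverse, delete the JMS broker store, start Traverse again.
# ASSUMPTION: 'service traverse stop' and 'service traverse start' are available;
# the article itself only mentions 'service traverse status'.
SERVICE_CMD="${SERVICE_CMD:-service}"

# reset_jms_broker <traverse_home>
reset_jms_broker() {
    home="$1"
    # Refuse to run without a valid installation directory, so the rm below
    # can never resolve to a path directly under "/".
    if [ -z "$home" ] || [ ! -d "$home" ]; then
        echo "Traverse installation directory not found: $home" >&2
        return 1
    fi
    "$SERVICE_CMD" traverse stop || return 1
    # The broker folder is re-created when the JMS component starts.
    rm -rf "$home/database/jms/broker"
    "$SERVICE_CMD" traverse start
}
```

For example, with a hypothetical installation path: reset_jms_broker /opt/traverse. Run it only after the diagnostic data has been captured.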

3 Comments

  • Piyush Mehta

    To generate a thread dump for the web application component on the BVE, follow the same instructions as above but use port 7691.

    From a DOS command prompt (launch it using Run as Administrator): 
    cd <TRAVERSE_HOME> 
    apps\jre\bin\java -classpath webapp\WEB-INF\lib\traverse-7.0.jar com.zyrion.traverse.utils.FullThreadDump localhost:7691 > web.stack_dump

  • Piyush Mehta

    To generate a thread dump for the Correlation & Summary Engine (CSE) on a Linux server, run the following from a shell:

        sudo -s
        cd <TRAVERSE_HOME>
        apps/jre/bin/java -classpath webapp/WEB-INF/lib/traverse-7.0.jar com.zyrion.traverse.utils.FullThreadDump localhost:7696 > /tmp/cse.stack_dump

    /tmp/cse.stack_dump is the thread/stack dump.

     

  • Nathan Sanders

    If you are running version 9.0, replace 7.0 with 9.0 wherever it appears in the commands. For example, ...WEB-INF/lib/traverse-7.0.jar com.zyrion.traverse... becomes ...WEB-INF/lib/traverse-9.0.jar com.zyrion.traverse...
