
Traverse Component Health page shows non-zero time offsets

For a properly configured Traverse system, the Component Health page (Superuser->Health) should show a Time Offset of zero for all Traverse servers. For each server (DGE or DGE extension) with a nonzero Time Offset, whether positive or negative:

 

  • Stop all the Traverse components.
  • Configure the system clock to synchronize with an accurate source (e.g. NTP, or the ESXi host via VMware Tools).
  • Ensure that the clock shows the current time before proceeding. Note that in some cases synchronization may take a few minutes to correct the clock.
  • If the 'Time Offset' value displayed on the Component Health page was positive, wait at least that amount of time before restarting the Traverse components.
  • If the 'Time Offset' value was negative, the Traverse components may be started immediately.
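
The waiting rule in the last two steps can be sketched as a small helper. This is only an illustration: the function name `restart_delay_seconds` is hypothetical, and it assumes the Time Offset has been read off the Component Health page by hand and expressed in seconds.

```python
def restart_delay_seconds(time_offset_seconds: float) -> float:
    """Return how long to wait before restarting the Traverse components.

    Positive offset: wait at least that long.
    Zero or negative offset: no wait is needed.
    """
    return max(0.0, float(time_offset_seconds))


if __name__ == "__main__":
    # Hypothetical offsets, purely for illustration.
    for offset in (90, 0, -45):
        print(offset, "->", restart_delay_seconds(offset))
```

The returned value is a minimum; waiting longer than the offset does no harm under the rule above.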

1 Comment

    Rob Arends

    We have a tip for anyone who wants an alert when the timesync of the Traverse servers is off.

     

    The info on the Traverse Component Health page is contained in the MySQL DB liveeventsdb on the BVE.

    So you can craft a sql_value/mysql test to do this.

     

    In the liveeventsdb, the componentId defines the various parts of Traverse.

    These are the valid "IDs" [up to the first space, the rest is my description]

    BVE_API_ID   - BVE
    CSE_ID       - Summary engine / Live events
    DC_ID        - Data Collector (DGE Extension)
    DGE_ID       - Data Gathering Engine
    MSG_HNDLR_ID - Message Server
    RDC_ID       - Remote Distribution Client (filesync) [everything except BVE]
    RDS_ID       - Remote Distribution Server (filesync) [BVE]
    WEB_APP_ID   - Web App

     

    Below I have 'selected' DGE and DGE Extension as separate tests,
    but you can combine them by adjusting the SQL select.

    e.g. change "=" to "in" and provide a list in brackets.
    -> All DGE and DGE Extensions:
    replace "where componentId=\'DC_ID\' " with "where componentId in (\'DC_ID\', \'DGE_ID\') "
    or
    -> All of Traverse:
    replace "where componentId=\'DC_ID\' " with "where componentId in (\'RDS_ID\', \'RDC_ID\') "

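As a rough sketch of the "=" versus "in" substitution described above, the filter can be generated from a list of IDs. The helper name `component_filter` and the two tuples are hypothetical; only the component IDs themselves come from the list in this comment.

```python
# Component IDs taken from the list above.
DGE_AND_EXTENSIONS = ("DC_ID", "DGE_ID")
# Filesync server (BVE) plus filesync clients (everything except BVE).
ALL_SERVERS = ("RDS_ID", "RDC_ID")


def component_filter(component_ids):
    """Build a 'where componentId ...' fragment for one or more IDs."""
    ids = list(component_ids)
    if not ids:
        raise ValueError("at least one componentId is required")
    if len(ids) == 1:
        return "where componentId='%s'" % ids[0]
    # Join with commas between items only, so there is no trailing comma.
    quoted = ", ".join("'%s'" % cid for cid in ids)
    return "where componentId in (%s)" % quoted


if __name__ == "__main__":
    print(component_filter(["DC_ID"]))
    print(component_filter(DGE_AND_EXTENSIONS))
    # second line printed: where componentId in ('DC_ID', 'DGE_ID')
```

When pasting the result into a bveCLI.pl command, the single quotes still need to be written as \' in the same way as in the commands in this comment.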
     

    * Substitute {bve_ip}, {user} and {pass} with the values for your environment.
    * Substitute "BVE" with the devicename of your BVE.
    * Substitute the actionname of None with one valid for your environment - you can do this afterwards via the GUI.
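    The " >= 1 " is the threshold in seconds. (Note that the query divides the difference by 60000; if the timestamps are stored in milliseconds, that makes the threshold 1 minute rather than 1 second.)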

    bveCLI.pl --host={bve_ip} --username={user} --password={pass} --exec 'test.create "actionname=None", "criticalthreshold=1", "database=liveeventsdb", "devicename=BVE", "driver=org.gjt.mm.mysql.Driver", "interval=60s", "loginname=emerald", "password=mysql", "port=7663", "query=select count(*) from liveeventsdb.ComponentStatus where componentId=\'DC_ID\' and ABS((pingTimeStamp-receivedTimeStamp)/60000) >= 1;", "subtype=mysql", "testname=DGEx Time Sync needs checking", "testtype=sql_value", "thresholdtype=1", "units=", "warningthreshold=1"'

     

    bveCLI.pl --host={bve_ip} --username={user} --password={pass} --exec 'test.create "actionname=None", "criticalthreshold=1", "database=liveeventsdb", "devicename=BVE", "driver=org.gjt.mm.mysql.Driver", "interval=60s", "loginname=emerald", "password=mysql", "port=7663", "query=select count(*) from liveeventsdb.ComponentStatus where componentId=\'DGE_ID\' and ABS((pingTimeStamp-receivedTimeStamp)/60000) >= 1;", "subtype=mysql", "testname=DGE Time Sync needs checking", "testtype=sql_value", "thresholdtype=1", "units=", "warningthreshold=1"'


    The test only provides a count of the matching servers exceeding the threshold, but at least you know to look into your timesync.
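
To see what the count represents, the query can be tried against made-up rows. This is only a sketch: it uses Python's built-in sqlite3 as a stand-in for the MySQL liveeventsdb, the table has just the three columns named in the query, and the sample timestamps are invented.

```python
import sqlite3

# Same condition as in the tests above, with DGE and DGE Extension combined.
QUERY = (
    "select count(*) from ComponentStatus "
    "where componentId in ('DC_ID', 'DGE_ID') "
    "and ABS((pingTimeStamp - receivedTimeStamp) / 60000) >= 1"
)

# Invented rows: (componentId, pingTimeStamp, receivedTimeStamp).
SAMPLE_ROWS = [
    ("DGE_ID", 1_000_000, 1_000_000),      # no difference: not counted
    ("DC_ID", 1_000_000, 1_090_000),       # differs by 90000: counted
    ("DC_ID", 1_000_000, 1_030_000),       # differs by 30000: not counted
    ("WEB_APP_ID", 1_000_000, 2_000_000),  # other component: filtered out
]


def count_out_of_sync(rows):
    """Load the rows into an in-memory table and run QUERY against them."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "create table ComponentStatus ("
            "componentId text, pingTimeStamp integer, receivedTimeStamp integer)"
        )
        conn.executemany("insert into ComponentStatus values (?, ?, ?)", rows)
        return conn.execute(QUERY).fetchone()[0]
    finally:
        conn.close()


if __name__ == "__main__":
    print(count_out_of_sync(SAMPLE_ROWS))  # prints 1
```

A row is counted once the two timestamps differ by 60000 or more in either direction, so the test value is the number of components past that difference, not the size of the offset itself.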

    Hope that helps someone else.

     

    Rob.
