Traverse High Availability Options

The Traverse architecture relies on OS- and server-level high-availability options, which may allow it to fit into an organization's existing IT processes.
 
A large degree of fault tolerance is built into Traverse because the Data Gathering Engines (DGEs) are distributed and only the metadata is centralized in the Business Visibility Engine (BVE). A new machine can therefore take over from a failed DGE simply by connecting to the BVE with the same identity (name) as the failed DGE. For high availability, the typically recommended solution is as follows:
  • At regular intervals, save the BVE and DGE databases to another server or servers (depending on how much redundancy you want). The BVE database (configuration data) can be saved more often at first; as your system matures and configuration changes become less frequent, you can fall back to every 24 hours.
  • If any component (DGE or BVE) fails, the spare machine is enabled with the identity of the failed machine. Most customers are comfortable with a 5-10 minute interval before monitoring resumes.
Much of the work needed for such a cutover can be scripted to reduce the time required. Once the cutover is done, your historical data will be out of date by at most the age of the most recent backup (24 hours with a daily backup), but the monitoring outage can be as little as 5 minutes (1-2 polling intervals).
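The backup-and-cutover routine described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Traverse's own tooling: it assumes the BVE or DGE database has already been exported to a plain directory (for example by your backup script), and every name in it (`snapshot`, `latest_snapshot`, `restore_latest`, `snapshot_age`, the `backup-<timestamp>` naming) is hypothetical. Stopping and starting Traverse and assigning the failed machine's identity are left out because they depend on your installation.

```python
import shutil
from datetime import datetime, timedelta, timezone
from pathlib import Path
from typing import Optional

# Hypothetical naming scheme: backup-YYYYMMDDTHHMMSSZ (sorts chronologically).
PREFIX = "backup-"
STAMP = "%Y%m%dT%H%M%SZ"


def snapshot(export_dir: Path, standby_dir: Path,
             now: Optional[datetime] = None) -> Path:
    """Copy an exported database directory to a timestamped folder on the standby."""
    now = now or datetime.now(timezone.utc)
    target = standby_dir / f"{PREFIX}{now.strftime(STAMP)}"
    shutil.copytree(export_dir, target)  # creates standby_dir if it is missing
    return target


def latest_snapshot(standby_dir: Path) -> Optional[Path]:
    """Return the newest snapshot folder, or None if there is none yet."""
    snapshots = sorted(p for p in standby_dir.glob(f"{PREFIX}*") if p.is_dir())
    return snapshots[-1] if snapshots else None


def restore_latest(standby_dir: Path, live_dir: Path) -> Path:
    """Replace live_dir with the newest snapshot; run this before starting the spare."""
    snap = latest_snapshot(standby_dir)
    if snap is None:
        raise FileNotFoundError(f"no snapshots found in {standby_dir}")
    if live_dir.exists():
        shutil.rmtree(live_dir)
    shutil.copytree(snap, live_dir)
    return snap


def snapshot_age(snap: Path, now: datetime) -> timedelta:
    """How stale the restored data is: the gap between the snapshot and 'now'."""
    taken = datetime.strptime(snap.name[len(PREFIX):], STAMP)
    return now - taken.replace(tzinfo=timezone.utc)
```

Run `snapshot` from a scheduler at whatever interval you choose, and call `restore_latest` on the spare machine as the first step of a cutover. `snapshot_age` reports how far out of date the restored historical data will be.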

5 Comments

  • Chris McBride

    The whitepaper helps, but is pretty much what I thought was the case.  Level-3 HA is custom and expensive.  Level-2 HA is manual, somewhat difficult to recover from, has outage windows, and possible lost data, along with additional license costs.  Level-1 is not an option.


  • Rob Arends

    We deployed Level-2 HA. We found the built-in backup script did not include every piece required to restore, so we wrapped the script in our own creation that picks up the missing info (e.g., custom XML) and replicates it every 6 hours to another server. The second server has Traverse installed but not running. A number of steps are required to restore the data so that Traverse can run as a replacement for the original failed instance. Our script does all this, including handling VIP migration between servers. It is long and detailed, but the end result is functional.

    It is disappointing that this level of HA, given you have to buy the standby licences, is not included in the base product.

    Rob.

  • Piyush Mehta

    Hello Chris, Rob,

    Thank you for your input. Your concerns and suggestions have been brought to the attention of the Account Manager.

    Chris - let us know if Support can assist in any way on the ticket.

    Regards,

    Piyush

  • Ben Smeele

    Hi Piyush,

    Has there been any development on this, or is the only HA option effectively a backup? We need an active-active cluster setup so we can ensure uptime, particularly of the BVE, as it's a single point of failure.

    Regards,

    Ben Smeele

  • Rob Arends

    Me too, Me too - don't hold your breath.