QUESTION:
How Do I Repair a Database Error (Corruption, Table Crash)?
SOLUTION:
(this article applies to the DGE and to the BVE)
In the event of a database error (see the example entries from error.log below) caused by an improper shutdown or power failure, running out of disk space, or improper access by external applications, a database repair may be required.
Examples:
2009-05-18 11:17:30,438 message.RemoveExpiredMessagesFromLiveDb
[ThreadPool[PooledCommandRunner$PooledCommandRunnerHelper]]:
(ERROR) Failed to execute command: Table '.\liveeventsdb\duplicatemessageinfo'
is marked as crashed and should be repaired
2009-09-21 12:27:09,109 monitor.AggregatedDataDbWriter[aggregation-writer-Manager]:
(ERROR) Error writing objects: Incorrect key file for table
'.\aggregateddatadb\aggregationdatascheme2.MYI'; try to repair it
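To find out which tables are affected, error.log can be scanned for the two messages shown above. The following Python sketch is only an illustration and is not part of Traverse: the function name find_corrupt_tables is hypothetical, and the patterns are assumed from the two example entries, so other corruption messages would need their own patterns. Because a message can wrap across lines in the log, whitespace is collapsed before matching.

```python
import re

# Patterns assumed from the two example log entries above; each captures
# the quoted table or index path named in the message.
CORRUPTION_PATTERNS = [
    re.compile(r"Table '([^']+)' is marked as crashed"),
    re.compile(r"Incorrect key file for table '([^']+)'"),
]


def find_corrupt_tables(log_text):
    """Return a sorted list of table paths named in corruption errors."""
    # Messages may be wrapped over several lines, so join everything
    # into a single space-separated string before searching.
    flat = " ".join(log_text.split())
    found = set()
    for pattern in CORRUPTION_PATTERNS:
        found.update(pattern.findall(flat))
    return sorted(found)
```

Calling find_corrupt_tables() on the contents of error.log returns each affected table path once, however many times the error was logged.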
On Windows platforms, anti-virus software (McAfee, Symantec, etc.) can cause similar corruption through "on-access scan" of the database tables/files while information is being written to them. If anti-virus software is installed on the Traverse server, it should be configured to exclude the TRAVERSE_HOME\database directories. Similarly, scheduled backup tasks should be configured to skip the database\mysql directory.
Note: If a DGE is unable to write to its local database, you may notice missing performance data when you drill down into a test being monitored from that DGE.
To perform a database repair, first ensure that the free space on the disk is at least equal to the size of the largest file under 'TRAVERSE_HOME\database\mysql', then follow these steps:
* Stop the Traverse components using Start -> All Programs -> Traverse -> Stop Traverse Components on Windows or "etc/traverse.init stop" from a shell prompt on Linux/Solaris.
* Zip the TRAVERSE_HOME\logs directory so that the logs are available for later analysis.
* Perform the database repair using 'Start -> All Programs -> Traverse -> Database Management -> DGE Database Repair' on Windows (the equivalent script on Windows is 'db_repair.cmd'). On Linux/Solaris run 'utils/db_repair.sh' from a shell prompt.
* Start the 'Performance and Event Database' component using 'Start -> All Programs -> Launch Traverse Service Controller -> Start Performance and Event Database' on Windows or 'etc/db_optimize.pl --run' from a shell prompt on Linux/Solaris.
* Perform database optimization using 'Start -> All Programs -> Traverse -> Database Management -> DGE Database Optimization' on Windows. On Linux/Solaris run 'utils/db_optimize.pl --run' from a shell prompt.
* Start the Traverse components using 'Start -> All Programs -> Traverse -> Start Traverse Components' on Windows or 'etc/traverse.init start' from a shell prompt on Linux/Solaris.
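The free-space requirement stated before these steps can be checked with a short script before the repair is started. This is a minimal Python sketch under stated assumptions, not a Traverse utility: the function names are hypothetical, and the directory passed in is assumed to be the database\mysql directory under TRAVERSE_HOME on the disk where the repair will run.

```python
import os
import shutil


def largest_file_size(directory):
    """Return the size in bytes of the largest file under directory (0 if none)."""
    largest = 0
    for root, _dirs, files in os.walk(directory):
        for name in files:
            path = os.path.join(root, name)
            try:
                largest = max(largest, os.path.getsize(path))
            except OSError:
                # File disappeared or is unreadable; skip it.
                continue
    return largest


def has_room_for_repair(db_dir):
    """Compare free disk space with the largest database file.

    Returns (ok, needed_bytes, free_bytes), where ok is True when the
    free space on the disk holding db_dir is at least needed_bytes.
    """
    needed = largest_file_size(db_dir)
    free = shutil.disk_usage(db_dir).free
    return free >= needed, needed, free
```

For example, has_room_for_repair() could be called with the full path to the database\mysql directory; if ok is False, free up disk space before running the repair.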
Once the components are up, please check database.log and error.log files to ensure that the errors are no longer being recorded.
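This final check can be narrowed to entries logged after the restart, so that errors from before the repair are not mistaken for new ones. The Python sketch below is illustrative only: the function name errors_since is hypothetical, and it assumes every log entry starts with a timestamp in the form shown in the examples above (such as 2009-05-18 11:17:30,438), with continuation lines belonging to the preceding entry.

```python
import re
from datetime import datetime

# Assumed entry prefix, based on the example log entries in this article.
ENTRY_START = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),\d{3} ")


def errors_since(log_lines, since):
    """Return (ERROR) entries whose timestamp is at or after `since`.

    Each returned entry is a single string with its wrapped lines joined.
    """
    entries = []
    current = None  # (timestamp, list of lines) for the entry being read
    for line in log_lines:
        match = ENTRY_START.match(line)
        if match:
            if current:
                entries.append(current)
            stamp = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S")
            current = (stamp, [line.strip()])
        elif current:
            # Continuation of a wrapped entry.
            current[1].append(line.strip())
    if current:
        entries.append(current)
    return [
        " ".join(lines)
        for stamp, lines in entries
        if stamp >= since and any("(ERROR)" in part for part in lines)
    ]
```

Passing the lines of error.log together with the time at which the components were restarted should return an empty list if the repair was successful.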