SUMMARY
An archive drive may no longer be recognized after a rescan because of conflicts with other jobs running at the same time.
ISSUE
- Archive drive is not recognized after rescanning
- The uarchive log reports that the drive is not responding, and running 'hotplug scan' also shows the drive as not responding
0 (3750): Apr 01 15:03:49 : [LOG2] rpcutil.c:154: Executing command: /usr/bp/bin/uarchive-scripts/scan.sh
0 (3750): Apr 01 15:03:58 : [LOG2] Command exit code: 0
0 (3750): Apr 01 15:03:58 : [LOG2] Command stderr (diagnostic output):
0 (3750): Scanning for hotplug devices...
0 (3750): device "sdb" not removable
0 (3750): device "sdc" not responding
0 (3750): device "sda" not removable

# hotplug scan
device "sdc" not responding
device "sdb" not removable
device "sda" not removable
current removable devices:

kernel: __ratelimit: 19 callbacks suppressed
kernel: sd 3:0:0:0: [sdc] Unhandled error code
kernel: sd 3:0:0:0: [sdc] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
kernel: sd 3:0:0:0: [sdc] CDB: Read(10): 28 00 00 00 00 00 00 00 20 00
kernel: __ratelimit: 19 callbacks suppressed
kernel: Buffer I/O error on device sdc, logical block 0
kernel: Buffer I/O error on device sdc, logical block 1
kernel: Buffer I/O error on device sdc, logical block 2
kernel: Buffer I/O error on device sdc, logical block 3
kernel: sd 3:0:0:0: [sdc] Unhandled error code
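
To pick out only the affected device from output like the above, a small filter can help. This is a minimal sketch: the function name not_responding_devices is hypothetical, and it assumes the 'device "NAME" not responding' wording shown in the sample output.

```shell
# Hypothetical helper: reads scan or log output on stdin and prints the name
# of each device reported as "not responding", one per line.
not_responding_devices() {
    sed -n 's/.*device "\([^"]*\)" not responding.*/\1/p'
}
```

Since the uarchive log labels the scan output as stderr, one possible invocation is: hotplug scan 2>&1 | not_responding_devices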
RESOLUTION
Check when the "disk_mon" job is scheduled in cron. If "disk_mon" attempts to read the drive while the drive is being scanned or removed, reschedule "disk_mon" to a time when it is less likely to conflict.
Note: You must restart the cron service after making any changes.
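
To see only the "disk_mon" schedule instead of reading through the full crontab, a filter along these lines could be used. It is a sketch only: the function name show_disk_mon_schedule is hypothetical, and it assumes the job is the disk_monitoring.py entry that writes /var/tmp/disk_mon.out.

```shell
# Hypothetical helper: reads crontab text on stdin and prints the five
# schedule fields (minute hour day-of-month month day-of-week) of every
# uncommented entry that runs disk_monitoring.py.
show_disk_mon_schedule() {
    awk '/disk_monitoring\.py/ && $1 !~ /^#/ {print $1, $2, $3, $4, $5}'
}
```

Example use: crontab -l | show_disk_mon_schedule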
# crontab -l
0 12 * * * /usr/bp/bin/disk_cleanup.sh
* * * * * /usr/bp/bin/cmc_host_statistics -1 >> /usr/bp/logs.dir/cmc_host_statistics.log 2>&1
0,15,30,45 * * * * /usr/bp/bin/cleanupHungBpsyncProcesses
0 15 * * * /usr/bp/bin/adaptiveDedupRatios -i >/dev/null 2>&1
0 * * * * /usr/bp/bin/failure_report.php >/dev/null 2>&1
30 6 * * * /usr/bp/bin/processSpaceAlert.php >/dev/null 2>&1
0 9 * * * php /var/www/html/recoveryconsole/bpl/reports/replication_report.php -f >/dev/null 2>&1
0,30 * * * * /bin/cp /var/tmp/disk_mon.out /var/tmp/disk_mon.out.bak; /usr/bp/bin/disk_monitoring.py -s -c >/var/tmp/disk_mon.out 2>&1
0 0 * * 0 /usr/bp/bin/ana_snmptrap_sender.py -a -c >/var/tmp/ana_snmptrap_sender.out 2>&1
0 20 * * * /usr/bp/bin/inventory_sync >/dev/null 2>&1
45 * * * * /usr/bp/bin/summary_counts 7 >/dev/null 2>&1
*/15 * * * * /usr/bp/bin/sync_summary.php -r >/dev/null 2>&1
0 12 * * * php /var/www/html/recoveryconsole/bpl/gfslite.php >/dev/null 2>&1
*/5 * * * * /etc/init.d/bp_rcscript start

In the above crontab, various other jobs are set to run at the top of the hour. By changing "disk_mon" to run at 15 and 45 minutes after the hour, we avoid possible conflicts with those jobs:
15,45 * * * * /bin/cp /var/tmp/disk_mon.out /var/tmp/disk_mon.out.bak; /usr/bp/bin/disk_monitoring.py -s -c >/var/tmp/disk_mon.out 2>&1

Once you have made the change, restart cron:
service crond restart
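
The change to the "disk_mon" entry can be made by hand in the crontab, or it could be scripted. The following is a sketch under stated assumptions: the function name reschedule_disk_mon is hypothetical, and it assumes the existing entry starts with exactly "0,30 " as in the listing above. Any other schedule is left untouched.

```shell
# Hypothetical helper: reads crontab text on stdin and writes it to stdout
# with the minute field of the disk_monitoring.py entry changed from 0,30
# to 15,45. All other lines pass through unchanged.
reschedule_disk_mon() {
    sed '/disk_monitoring\.py/ s/^0,30 /15,45 /'
}
```

One way to apply it is to write the result to a scratch file (for example /tmp/crontab.new, a hypothetical path) with crontab -l | reschedule_disk_mon > /tmp/crontab.new, review the file, install it with crontab /tmp/crontab.new, and then restart cron as described above.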