How to correct jukebox problems
Introduction
There are many reasons why a jukebox may not respond to NetWorker commands. Often NetWorker returns an error message during a backup operation.
Why does a jukebox become unresponsive or return unexpected results to NetWorker requests or commands?
Here are three common reasons for jukebox problems:
Manual intervention: Most often, a jukebox no longer responds to backup requests because of operator intervention. Manually moving, unloading or loading tapes causes a discrepancy between NetWorker's information on the status of the jukebox and the actual physical status of the jukebox. On jukeboxes that support barcodes and/or element status, a special feature lets the jukebox know if the door is opened. Since the jukebox has no way of knowing what the operator did while the door was open, it goes into a state which reflects that uncertainty, and in many cases does not function properly until that uncertainty is removed.
If a tape has been manually loaded into or unloaded from a drive, try returning the tape to its original position, then retry the NetWorker operation. Often a reset, re-inventory and/or power cycle is necessary (described below).
Corrupt nsrjb.res file: Another possible cause is corruption of the jukebox resource file, nsrjb.res. An open nsrjb.res file may be corrupted during a system crash.
Hardware or software problems: Other hardware and software problems may also corrupt nsrjb.res or cause the jukebox to become unresponsive or return unexpected results in response to NetWorker requests.
The following steps are ordered by simplicity, risk, and time required. Complete each step before proceeding to the next. Before you begin, check that all required NetWorker patches have been installed, and verify proper SCSI bus and jukebox driver installation using suitable tools and utilities.
Verify jukebox configuration
Using the NetWorker Administrator's GUI, go to the Jukeboxes window. Verify the following data fields against the hardware:
Available slots: This field should be a range of slots, e.g. 2-10, as opposed to a single number.
Control port: This field should match the entry used for control port in such commands as pscinfo, jbexercise, etc.
Devices:
Verify that the proper device names are being used, and that the named devices are physically in the jukebox:
- On EXB-210s the first device listed in this field should be the top device in the jukebox.
- On EXB-60s and 120s the device names should be listed in order of physical location in the jukebox, from left to right.
Physical slots: This field should be listed as 1,10, as opposed to the 1-10 form used in Available slots.
Verify that model and number of devices match the physical jukebox.
If any of the above fields are incorrect, make corrections before continuing.
Volumes: Verify that the volumes in this field are physically in the jukebox by using the following command:
"nsrjb -Cv" (Provides a list of volumes by slot; this should agree with the physical inventory.)
If these lists do not agree, your jukebox is confused; proceed with the subsequent troubleshooting steps.
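The two slot-field formats described above are easy to mix up, so a quick format check can help before comparing them against the hardware. The following is a minimal sketch only: the function names is_slot_range and is_slot_pair are hypothetical helpers, not NetWorker commands, and the requirement that the first slot not exceed the last in a range is an assumption.

```shell
#!/bin/sh
# Sketch: sanity-check slot values copied from the Jukeboxes window.
# is_slot_range and is_slot_pair are hypothetical helpers, not NetWorker tools.

# Available slots should be a range such as 2-10, not a single number.
# Assumption: the first slot must not be greater than the last.
is_slot_range() {
    case "$1" in
        *[!0-9-]* | -* | *- | *-*-* ) return 1 ;;  # stray characters or malformed range
        *-* ) ;;                                   # looks like first-last
        * ) return 1 ;;                            # single number or empty
    esac
    first=${1%-*}
    last=${1#*-}
    [ "$first" -le "$last" ]
}

# Physical slots should be a comma pair such as 1,10.
is_slot_pair() {
    case "$1" in
        *[!0-9,]* | ,* | *, | *,*,* ) return 1 ;;  # stray characters or malformed pair
        *,* ) return 0 ;;
        * ) return 1 ;;
    esac
}

# Usage:
#   is_slot_range "2-10" && echo "Available slots format looks right"
#   is_slot_pair  "1,10" && echo "Physical slots format looks right"
```

These checks cover the format only; whether the values match the physical jukebox still has to be confirmed by inspection.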
Re-initialize element status for jukeboxes with barcodes and/or element status
This section may be skipped for jukeboxes that do not support barcodes and/or element-status.
Opening the door may confuse jukeboxes that support barcodes and/or element-status.
If the jukebox door is manually opened, the jukebox immediately marks all the slots as questionable. Consequently, the jukebox status will not reflect physical changes made by a user, such as removing or replacing a tape. To address this condition, the jukebox needs to check all of the slots and read the barcodes (if enabled) to verify the status of each slot.
To reset element status, use one of the following sets of commands:
For NetWorker 4.1 and previous versions:
"nsrjb -HE -v" (Resets the jukebox hardware and element status).
"pscinfo -i /dev/sjid1u1" (Check the status)
For NetWorker 4.2 and later versions:
"nsrjb -HE -v" (Resets the jukebox hardware and element status).
"inquire" (Check the status)
If an error message is returned, try the commands again. The second try may correct the problem. If errors continue, you should analyze the errors and correct them before proceeding.
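The "try the commands again" advice above can be sketched as a small wrapper that runs a command and repeats it once on failure. The name try_twice is a hypothetical helper introduced for illustration; only the nsrjb and inquire invocations in the usage comment come from this document.

```shell
#!/bin/sh
# Sketch: run a command, and if it fails, run it one more time.
# try_twice is a hypothetical helper name, not part of NetWorker.
try_twice() {
    "$@" && return 0
    echo "first attempt failed, retrying: $*" >&2
    "$@"    # the exit status of the second attempt is returned
}

# Usage (NetWorker 4.2 and later, per the commands listed above):
#   try_twice nsrjb -HE -v && inquire
```

If the second attempt also fails, the wrapper returns a non-zero status, which is the point at which the errors should be analyzed before going further.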
Inventory jukebox
For many simple problems, it may only be necessary to re-inventory the contents of the jukebox.
Use the following command to re-inventory the jukebox: nsrjb -I -v
A successful inventory indicates the jukebox problem is resolved. Monitor subsequent backups and jukebox operations to ensure no other problems occur.
If the re-inventory step does not resolve your jukebox problem, continue with the subsequent troubleshooting steps.
Reset jukebox hardware
Some jukebox problems require a hardware reset.
Use the following command: nsrjb -H -v
You may need to follow the hardware reset with the inventory operation listed above. A power cycle may also be necessary to fully reset the jukebox.
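The escalation described so far (inventory first, then a hardware reset followed by another inventory) might be scripted roughly as follows. This is a sketch under assumptions: recover_jukebox is a hypothetical function name, and the NSRJB variable is a hypothetical override that lets the flow be rehearsed against a stub instead of the real nsrjb. It also assumes nsrjb returns a non-zero exit status on failure.

```shell
#!/bin/sh
# Sketch: inventory, and fall back to hardware reset plus inventory.
# NSRJB is a hypothetical override for testing; it defaults to nsrjb.
NSRJB=${NSRJB:-nsrjb}

recover_jukebox() {
    # Step 1: a plain re-inventory fixes many simple problems.
    if "$NSRJB" -I -v; then
        echo "inventory succeeded"
        return 0
    fi
    # Step 2: reset the hardware, then inventory again.
    echo "inventory failed; resetting jukebox hardware"
    if "$NSRJB" -H -v && "$NSRJB" -I -v; then
        echo "inventory succeeded after hardware reset"
        return 0
    fi
    echo "still failing; a power cycle may be needed" >&2
    return 1
}

# Usage:
#   recover_jukebox || echo "continue with the next troubleshooting step"
```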
Fixing the confused jukebox resource file
The following procedure corrects jukebox confusion by erasing the existing inventory stored in the NetWorker resource file, and then forcing the system to create a new inventory.
Log in to the NetWorker server as root.
Start the NetWorker Administrator's GUI.
"networker -s <server> -x" (NetWorker version 4.0.2.x)
"nwadmin -s server" (NetWorker version 4.1.x)
Select the Jukeboxes window.
- Select Admin - Jukeboxes (NetWorker version 4.0.2.x)
- Select Media - Jukeboxes (NetWorker version 4.1.x)
- Select View - Details
Clear the contents of the Loaded Slots and Loaded Volumes fields, then press the Change button.
At the command prompt, type:
"nsrjb -HE -v" (the "E" option resets element status)
"nsrjb -Iv" (the "I" option takes an inventory of the jukebox)
NetWorker and the jukebox should now be synchronized.
Completely reconfiguring the jukebox
If going through the steps listed above does not resolve your jukebox problem, the resource file, nsrjb.res, may be corrupt.
Try one of the following suggestions to address a potentially corrupted nsrjb.res file.
Recover a previous version of the nsrjb.res file from a backup. You may need to recover more than one version of nsrjb.res, depending on when the corruption actually occurred.
Recreate the nsrjb.res file.
- Shut down all NetWorker daemons by typing nsr_shutdown -a from the command prompt.
- Type mv /nsr/res/nsrjb.res /nsr/res/nsrjb.res.old to rename the current nsrjb.res file to nsrjb.res.old.
- Start the NetWorker daemons by typing nsrd at the command prompt.
- Run jb_config from the command prompt to configure the jukebox.
- Re-enter the jukebox enabler, if necessary.
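The rename step in the list above can be made a little safer with a guard against overwriting an earlier nsrjb.res.old. The sketch below covers that step only. The function name set_aside_nsrjb_res and the RES_DIR override are hypothetical, and the overwrite guard is an addition for illustration rather than part of the documented procedure.

```shell
#!/bin/sh
# Sketch: set the current nsrjb.res aside without clobbering an older copy.
# RES_DIR is a hypothetical override so the function can be tried on a
# scratch directory; it defaults to the path given in the procedure.
RES_DIR=${RES_DIR:-/nsr/res}

set_aside_nsrjb_res() {
    src="$RES_DIR/nsrjb.res"
    dst="$RES_DIR/nsrjb.res.old"
    [ -f "$src" ] || { echo "missing: $src" >&2; return 1; }
    # Guard added for this sketch: keep any existing .old copy intact.
    [ ! -e "$dst" ] || { echo "already exists: $dst" >&2; return 1; }
    mv "$src" "$dst"
}

# Full sequence from the list above, run as root:
#   nsr_shutdown -a
#   set_aside_nsrjb_res
#   nsrd
#   jb_config
```

Keeping the old file makes it possible to compare it later with a version recovered from backup.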
PSCINFO man page reference
PSCINFO(8) MAINTENANCE COMMANDS PSCINFO(8)
NAME
pscinfo - NetWorker autochanger utility program
SYNOPSIS
pscinfo -c {INQUIRY | SENSE} device
pscinfo -i device
pscinfo -e device
DESCRIPTION
pscinfo performs a few diagnostic functions with autochangers. A system administrator may find pscinfo useful for initializing or reporting the status of such a device independent of the use of NetWorker.
Device is the name of the device (/dev/sjidNu1 or /dev/pscN; see psc(8)).
OPTIONS
-c COMMAND
Special SCSI commands can be sent to the changer device. The supported keywords are summarized below:
INQUIRY displays the device type, product code, product ID, and product revision strings.
SENSE enables the changer to report on its operating mode parameters. The element address assignment page is reported.
-i The Initialize Element Status command requests the changer to check all elements for the presence of a data cartridge. The changer stores this information in its cartridge inventory. Note that not all autochanger devices support this command.
-e The Read Element Status command requests that the changer report the status of its internal elements.
Note that not all autochanger devices support this command.
SEE ALSO
psc(8), nsrjb(8).