Subject: [Bacula-users] Problems with disk-based backup

Hi, I'm having trouble with Bacula failing to perform automatic backups. The issue seems to be that the storage daemon is not automatically labelling a new volume. The setup looks like this:

The Director lives on an Intel server running Ubuntu 10.04.2 LTS; the installed version of Bacula is 5.0.1, from the packages for this release.

The Storage daemons live on a pair of Netgear ReadyNAS NV+ units (sparc-based NAS boxes), using 5.0.1 compiled natively on a ReadyNAS box. The NV+ runs Linux, and I believe this is a re-packaged Debian Sarge.

We have File daemons (version 5.0.3) deployed on a range of Windows 2003, 2000 and Ubuntu server boxes.

We do a full backup each Saturday morning, then each weeknight we write an incremental. The director config has custom schedules so that alternate weeks are written to different storage daemons.

The problems seemed to start occurring when the volume names rolled over from Incr-0099 to Incr-0100 (that is the first instance I can find of this issue, anyway). I don't know if this is coincidence or not. When the problem occurred, we got this in the director's logs:

12-Apr 22:00 rm-bac-1-dir JobId 173: Start Backup JobId 173, Job=rov-impac-1-tshome.2011-04-12_22.00.00_53
12-Apr 22:00 rm-bac-1-dir JobId 173: Using Device "rm-nas-1-rov-impac-1"
12-Apr 21:53 rov-impac-1-fd JobId 173: DIR and FD clocks differ by -381 seconds, FD automatically compensating.
12-Apr 21:51 JobId 173: Job rov-impac-1-tshome.2011-04-12_22.00.00_53 is waiting. Cannot find any appendable volumes.
Please use the "label" command to create a new Volume for:
Storage: "rm-nas-1-rov-impac-1" (/c/bacula/rov-impac-1)
Pool: Incremental
Media type: File1
12-Apr 22:05 rm-bac-1-dir JobId 173: Created new Volume "Incr-0101" in catalogue.
12-Apr 21:56 JobId 173: Warning: mount.c:221 Open device "rm-nas-1-rov-impac-1" (/c/bacula/rov-impac-1) Volume "Incr-0101" failed: ERR=dev.c:548 Could not open: /c/bacula/rov-impac-1/Incr-0101, ERR=No such file or directory

There don't seem to be any permission or related issues preventing the volume being created. After re-starting the SD it is able to create new volumes (for a while, anyhow).
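
For anyone wanting to repeat that check, here is a minimal sketch of the kind of test I mean, assuming Python is available on the box and the script is run as the same user as the SD. The function names are hypothetical and this is not part of Bacula; it only pulls the path out of a "Could not open" warning like the one above and reports whether the containing directory exists and is writable.

```python
import os
import re

# Pattern for the path and error text in a "Could not open" warning,
# based only on the single log line quoted above (an assumption about the format).
COULD_NOT_OPEN = re.compile(r"Could not open: (?P<path>\S+), ERR=(?P<err>.+)$")


def parse_open_failure(line):
    """Return (volume_path, error_text) from a 'Could not open' line, or None."""
    match = COULD_NOT_OPEN.search(line.strip())
    if match is None:
        return None
    return match.group("path"), match.group("err")


def check_volume_dir(volume_path):
    """Report basic facts about the directory that should hold the volume file."""
    directory = os.path.dirname(volume_path)
    return {
        "directory": directory,
        "dir_exists": os.path.isdir(directory),
        # Creating a file needs write and search (execute) permission on the directory.
        "dir_writable": os.access(directory, os.W_OK | os.X_OK),
        "volume_exists": os.path.exists(volume_path),
    }


if __name__ == "__main__":
    sample = (
        '12-Apr 21:56 JobId 173: Warning: mount.c:221 Open device '
        '"rm-nas-1-rov-impac-1" (/c/bacula/rov-impac-1) Volume "Incr-0101" failed: '
        'ERR=dev.c:548 Could not open: /c/bacula/rov-impac-1/Incr-0101, '
        'ERR=No such file or directory'
    )
    parsed = parse_open_failure(sample)
    if parsed is not None:
        path, err = parsed
        print(path, "->", err)
        for key, value in check_volume_dir(path).items():
            print(f"  {key}: {value}")
```

A result with dir_exists and dir_writable both true but volume_exists false would be consistent with what I see: the directory is fine, but the volume file was never created.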

I can either manually issue a label command in bconsole, or restart the SD and DIR, and it will all work again for a few days. However, neither of those is really a proper solution.

If anyone can help, I would very much appreciate it.

I have attached config files for the director, and one of the storage daemons. Please let me know if there is more detail I can provide.

I will note that I have allowed up to 10 simultaneous jobs in the director - I recall reading that this was not recommended, however I was unsure if that recommendation was current, and frankly, the ability to run multiple jobs concurrently is quite desirable to use our network efficiently. If anyone thinks this might be the issue, I can try running with concurrency set to 1.

If you're interested, there's a lot more detail about the project here:

Any other comments on what I have set up would be appreciated - I think I've set up something reasonable, but independent sanity checks would be great.

My suspicion is that the issue is on the storage daemon. I've now set both storage daemons to produce debug-level output (using -v -dt -d 200), which I have redirected to a file, so I hope to have more info in future.

Regards, Philip.

