Technical Exhaustion

Tech tips from the weary

Purging a backup volume in Bacula

August 15, 2008 | 2:49 pm

Quick and dirty - this should never be needed, but in the rare event that your auto-rotation cycle fails, or you’re just using a volume for testing and want to blank it and start from scratch, enter the following at the Bacula console:

purge jobs volume

After a short warning you can choose which pool and then which volume you want to purge.

Categories
Bacula

Removing and re-adding a disk in gmirror

| 12:43 pm

Running software RAID-1 using gmirror under FreeBSD can save some headaches.

Sometimes a disk genuinely fails, leaving you running on the remaining disk; sometimes you just get a ‘blip’ during a busy period on the server, which causes gmirror to drop the disk from the array.

This happened today, and dmesg shows:

GEOM_MIRROR: Request failed (error=5). ad4[READ(offset=48103390720, length=16384 )]
GEOM_MIRROR: Device gm0: provider ad4 disconnected.

Basically, my SATA disk ad4 had a bit of a read error. As this has happened before I’m not overly concerned. I don’t know, however, whether this is a physical disk problem or just gmirror erring on the side of caution and aggressively removing the disk from the array. A full S.M.A.R.T. report in due course may shed some light. If this had been a write error I would be a bit more concerned. Nonetheless, gmirror caught it and removed the disk from the array as faulty.
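
If you want to spot these events without reading the whole of dmesg by eye, something like the following might do. It is only a sketch: the function name disconnected_providers is my own invention, and it assumes the kernel messages look exactly like the two lines quoted above.

```shell
#!/bin/sh
# Print "<mirror> <provider>" for every provider gmirror has disconnected.
# Reads kernel messages on stdin, e.g.: dmesg | disconnected_providers
# Assumes lines of the form:
#   GEOM_MIRROR: Device gm0: provider ad4 disconnected.
disconnected_providers() {
    sed -n 's/^GEOM_MIRROR: Device \([^:]*\): provider \([^ ]*\) disconnected\..*$/\1 \2/p'
}
```

Fed the two dmesg lines above, this prints the mirror and provider names from the disconnect message and ignores the read-error line.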

We can confirm using the gmirror status and gmirror list commands:

guru# gmirror status
      Name    Status  Components
mirror/gm0  DEGRADED  ad6

guru# gmirror list
Geom name: gm0
State: DEGRADED
Components: 2
Balance: round-robin
Slice: 4096
Flags: NONE
GenID: 6
SyncID: 1
ID: 2879715010
Providers:
1. Name: mirror/gm0
Mediasize: 79999999488 (75G)
Sectorsize: 512
Mode: r5w5e6
Consumers:
1. Name: ad6
Mediasize: 80026361856 (75G)
Sectorsize: 512
Mode: r1w1e1
State: ACTIVE
Priority: 0
Flags: DIRTY
GenID: 6
SyncID: 1
ID: 1663191490

Which tells us the mirror device gm0 is degraded and only SATA disk ad6 is still playing.
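
The same check can be scripted, say from cron. This is a minimal sketch rather than a tested monitoring tool: degraded_mirrors is a made-up name, and it assumes the three-column gmirror status layout shown above.

```shell
#!/bin/sh
# Print the name and status of every mirror whose status is not COMPLETE.
# Reads `gmirror status` output on stdin, e.g.: gmirror status | degraded_mirrors
# The header line and component continuation lines are skipped because
# their first field does not start with "mirror/".
degraded_mirrors() {
    awk '$1 ~ /^mirror\// && $2 != "COMPLETE" { print $1, $2 }'
}
```

One caveat from the output later in this post: after a gmirror forget, a mirror with a single remaining disk reports COMPLETE, so an empty result here does not prove the array still has redundancy.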

In an actual disk failure situation, at this point we’d bring the server down and replace the failed ad4 device. If the hardware supports it you could also hot-swap, but I’m not sure how the FreeBSD kernel would handle that.

We don’t need to do that in this instance, as I’m fairly sure this is just a blip. So I’ll just re-add the disk to the array and see if it rebuilds.

To re-add the disk (after physically replacing it, if necessary) we first have to tell gmirror to ‘forget’ the components of the array that have failed.

guru# gmirror forget gm0

This forgets the broken members of the array ‘gm0’.

We can confirm using the gmirror status command. The mirror now reports COMPLETE because the only component it still knows about is ad6:

guru# gmirror status
      Name    Status  Components
mirror/gm0  COMPLETE  ad6

Now we can add (or re-add!) the ‘new’ disk and watch it rebuild!

guru# gmirror insert gm0 ad4
guru# gmirror status
      Name    Status  Components
mirror/gm0  DEGRADED  ad6
                      ad4 (0%)

After a while…

guru# gmirror status
      Name    Status  Components
mirror/gm0  DEGRADED  ad6
                      ad4 (64%)

And finally…

guru# gmirror status
      Name    Status  Components
mirror/gm0  COMPLETE  ad6
                      ad4
guru#
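
The forget and insert steps above can be collected into a small helper. Treat this as a sketch: readd_disk and the DRY_RUN variable are names I have made up for illustration, and the only gmirror subcommands it runs are the two used in this post.

```shell
#!/bin/sh
# Re-add a dropped disk to a gmirror array: forget the stale components,
# then insert the disk so it rebuilds.
# With DRY_RUN=1 the commands are printed instead of being executed.
readd_disk() {
    mirror=$1
    disk=$2
    if [ -z "$mirror" ] || [ -z "$disk" ]; then
        echo "usage: readd_disk <mirror> <disk>" >&2
        return 2
    fi
    for cmd in "gmirror forget $mirror" "gmirror insert $mirror $disk"; do
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "$cmd"
        else
            # Unquoted on purpose so the string splits into command and arguments.
            $cmd || return 1
        fi
    done
}
```

Running it with DRY_RUN=1 first (for this post, readd_disk gm0 ad4) shows exactly what would be executed before anything touches the array.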

We can glean a nice summary from dmesg:

GEOM_MIRROR: Device gm0: provider ad4 detected.
GEOM_MIRROR: Device gm0: rebuilding provider ad4.
GEOM_MIRROR: Device gm0: rebuilding provider ad4 finished.
GEOM_MIRROR: Device gm0: provider ad4 activated.
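
Rather than re-running gmirror status by hand during the rebuild, the percentage can be pulled out of its output. Again only a sketch: rebuild_progress is a hypothetical name, and it assumes a rebuilding component is shown as "ad4 (64%)" as in the output above; other FreeBSD versions may format this differently.

```shell
#!/bin/sh
# Print "<component> <percent>" for each component that is rebuilding.
# Reads `gmirror status` output on stdin and prints nothing when no
# rebuild is in progress.
rebuild_progress() {
    awk '$NF ~ /^\([0-9]+%\)$/ {
        pct = $NF
        gsub(/[()%]/, "", pct)   # "(64%)" becomes "64"
        print $(NF - 1), pct
    }'
}
```

A polling loop could then wait for the rebuild to finish by repeatedly piping gmirror status into this function and sleeping for as long as it still prints something.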

Categories
FreeBSD

Bacula Design ‘Feature’

August 12, 2008 | 4:17 pm

I unwittingly stumbled across a reasonably well hidden Bacula design foible today.

I’ve noticed for some time that when doing test restores, or browsing the ‘most recent backup’ for a client, Bacula includes files that should not be there based on the timeline of the restore.

If some files are deleted after a full backup and a differential (or incremental!) backup is then taken, those deleted files are still restored when I restore the ‘latest backup’ for a client.

The expected behavior (in my opinion) would be for these files not to be included in a restore of the most recent backup as, at the point in time the latest backup was taken, the files were not on disk.
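
To make the difference concrete, here is a toy model of the behaviour. It is emphatically not how Bacula is implemented; it just assumes that a restore is built as the union of the file lists recorded by the full backup and each later backup, with nothing recording deletions. The function names restore_set and stale_files are mine.

```shell
#!/bin/sh
# Toy model: the restore set is the union of every backup's file list.
# Arguments are files containing one path per line.
restore_set() {
    LC_ALL=C sort -u "$@"
}

# Paths that would be restored even though they were not on disk when
# the last backup ran.
# $1: listing of what was on disk at the last backup
# remaining arguments: the backup file lists (full first, then the rest)
stale_files() {
    on_disk=$1
    shift
    tmp=$(mktemp) || return 1
    LC_ALL=C sort -u "$on_disk" > "$tmp"
    # comm -23 keeps lines that appear only in the first input (the restore set).
    restore_set "$@" | LC_ALL=C comm -23 - "$tmp"
    rm -f "$tmp"
}
```

If the full backup recorded a, b and c, then b was deleted and d created before the incremental, the restore set in this model is a, b, c and d, and stale_files reports b as the file that should not have come back.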

This is a known feature of Bacula and is documented as a sub-project to fix.

This has been underway since 2005; hopefully good progress is being made.

Unfortunately this means that any restores made expecting a ‘point in time’ snapshot of the system to be written back to disk will not behave as expected.

This may leave the system in an inconsistent or, at best, unknown state. It will also skew disk usage for a restore, using more space than expected, which may either cause problems depending on the target disk size or necessitate a spring clean post-restore to tidy things up again. This would be particularly painful for administrators with busy systems and lots of changes between incremental and differential backup windows.

This only occurs between full backups, so one workaround would be to perform a full backup each day, but this is often not a practical solution.

It’s some comfort that the data is at least backed up safely and can be retrieved in some form; however, this makes the restore process a little more complicated than it perhaps should be.

Categories
Bacula
Tags
backups, Bacula, FreeBSD
