Removing and re-adding a disk in gmirror
August 15, 2008 | 12:43 pm
Running software RAID-1 using gmirror under FreeBSD can save some headaches.
Sometimes a disk genuinely fails, leaving you running on the one remaining disk; sometimes you just get a ‘blip’ while the server is busy, which causes gmirror to drop the disk from the array.
This happened today, and dmesg shows:
GEOM_MIRROR: Request failed (error=5). ad4[READ(offset=48103390720, length=16384 )]
GEOM_MIRROR: Device gm0: provider ad4 disconnected.
Basically, my SATA disk ad4 had a read error (error=5 is EIO, an input/output error). As this has happened before I’m not overly concerned. I don’t know, however, whether this is a physical disk problem or just gmirror erring on the side of caution and aggressively removing the disk from the array. A full S.M.A.R.T. report in due course may shed some light. If this had been a write error I would be a bit more concerned. Either way, gmirror caught it and removed the disk from the array as faulty.
We can confirm this using the gmirror status and gmirror list commands:
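If you want to pull the dropped provider out of the kernel messages without reading them by eye, something like the following might do. It is only a sketch: the function name disconnected_providers is made up for illustration, and it assumes the messages look exactly like the two dmesg lines above. It reads from stdin, so it can be fed dmesg output.

```sh
#!/bin/sh
# Hypothetical helper (not part of gmirror): print "<device> <provider>"
# for every "provider X disconnected" message found on stdin.
# Assumes the GEOM_MIRROR message format shown in the dmesg excerpt above.
disconnected_providers() {
    sed -n 's/^GEOM_MIRROR: Device \([^:]*\): provider \([^ ]*\) disconnected.*/\1 \2/p'
}

# Usage on a live system:
#   dmesg | disconnected_providers
```

Fed the two lines above, this should print the mirror name followed by the dropped disk.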
guru# gmirror status
      Name    Status  Components
mirror/gm0  DEGRADED  ad6
guru# gmirror list
Geom name: gm0
State: DEGRADED
Components: 2
Balance: round-robin
Slice: 4096
Flags: NONE
GenID: 6
SyncID: 1
ID: 2879715010
Providers:
1. Name: mirror/gm0
   Mediasize: 79999999488 (75G)
   Sectorsize: 512
   Mode: r5w5e6
Consumers:
1. Name: ad6
   Mediasize: 80026361856 (75G)
   Sectorsize: 512
   Mode: r1w1e1
   State: ACTIVE
   Priority: 0
   Flags: DIRTY
   GenID: 6
   SyncID: 1
   ID: 1663191490
This tells us the mirror device gm0 is degraded and only SATA disk ad6 is still active.
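The same check can be scripted, for instance from cron, by filtering the gmirror status output. A minimal sketch, assuming the three-column layout shown above (Name, Status, Components); the function name degraded_mirrors is hypothetical, and it reads the status text from stdin.

```sh
#!/bin/sh
# Hypothetical helper: read `gmirror status` output on stdin and print the
# name of every mirror whose Status column is anything other than COMPLETE.
# Lines that list only an extra component (e.g. "ad4 (0%)") are skipped
# because their first field does not start with "mirror/".
degraded_mirrors() {
    awk '$1 ~ /^mirror\// && $2 != "COMPLETE" { print $1 }'
}

# Usage on a live system:
#   gmirror status | degraded_mirrors
```

An empty result means every mirror in the listing reported COMPLETE.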
In an actual disk failure situation, at this point we’d bring the server down and replace the failed ad4 device. If the hardware supports it you could also hot-swap the disk, but I’m not sure how the FreeBSD kernel would handle that.
We don’t need to do that in this instance, as I’m fairly sure this is just a blip. So I’ll just re-add the disk to the array and see if it rebuilds.
After replacing the disk (if necessary), we first have to tell gmirror to ‘forget’ the failed components of the array before we can re-add the disk.
guru# gmirror forget gm0
This forgets the broken members of the array ‘gm0’.
We can confirm with the gmirror status command:
guru# gmirror status
      Name    Status  Components
mirror/gm0  COMPLETE  ad6
Now we can add (or re-add!) the ‘new’ disk and watch it rebuild!
guru# gmirror insert gm0 ad4
guru# gmirror status
      Name    Status  Components
mirror/gm0  DEGRADED  ad6
                      ad4 (0%)
After a while…
guru# gmirror status
      Name    Status  Components
mirror/gm0  DEGRADED  ad6
                      ad4 (64%)
And finally…
guru# gmirror status
      Name    Status  Components
mirror/gm0  COMPLETE  ad6
                      ad4
guru#
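Rather than re-running gmirror status by hand, the rebuild percentage can be picked out of its output. This is only a sketch: rebuild_progress is a made-up name, and it assumes a rebuilding component is printed as "disk (NN%)" as in the listings above, which may differ on other FreeBSD releases.

```sh
#!/bin/sh
# Hypothetical helper: read `gmirror status` output on stdin and print
# "<component> <percent>" for every component that is still rebuilding.
# A rebuilding component is recognised by a trailing "(NN%)" field.
rebuild_progress() {
    awk 'NF >= 2 && $NF ~ /^\([0-9]+%\)$/ {
        pct = $NF
        gsub(/[()%]/, "", pct)   # strip the parentheses and percent sign
        print $(NF - 1), pct
    }'
}

# Usage on a live system:
#   gmirror status | rebuild_progress
```

Once the rebuild has finished no line carries a percentage, so the function prints nothing.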
We can glean a nice summary from dmesg:
GEOM_MIRROR: Device gm0: provider ad4 detected.
GEOM_MIRROR: Device gm0: rebuilding provider ad4.
GEOM_MIRROR: Device gm0: rebuilding provider ad4 finished.
GEOM_MIRROR: Device gm0: provider ad4 activated.
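For next time, the two commands used in this post (gmirror forget, then gmirror insert) could be wrapped in a small function. Treat this as a sketch under assumptions rather than a tested tool: readd_disk and the DRY_RUN variable are names invented here, and by default the function only prints the commands instead of running them.

```sh
#!/bin/sh
# Hypothetical wrapper around the forget/insert sequence described above.
# DRY_RUN defaults to 1, in which case the commands are only printed.
# Set DRY_RUN=0 to actually run them (needs root on a FreeBSD box).
readd_disk() {
    mirror=$1   # mirror name, e.g. gm0
    disk=$2     # provider to add back, e.g. ad4
    if [ -z "$mirror" ] || [ -z "$disk" ]; then
        echo "usage: readd_disk <mirror> <disk>" >&2
        return 2
    fi
    run=""
    if [ "${DRY_RUN:-1}" = "1" ]; then
        run="echo"
    fi
    # Drop the stale record of the failed member, then add the disk back.
    $run gmirror forget "$mirror" || return 1
    $run gmirror insert "$mirror" "$disk" || return 1
}

# Usage:
#   readd_disk gm0 ad4              # prints the two commands
#   DRY_RUN=0 readd_disk gm0 ad4    # runs them
```

Defaulting to a dry run is a deliberate choice here, since inserting the wrong disk would overwrite its contents during the rebuild.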





