Earlier this month, one of the hard drives backing the root filesystem on our storage server failed. Luckily for us, we use ZFS with a mirrored root pool (rpool), and disaster was averted. For those following along with this series of ZFS articles, here are the juicy technical details of a real-life ZFS rpool recovery.
The first step was to power down the machine, unplug the faulty drive, and put the new drive in its place. This can be done on a running machine if you don’t share our inherent distrust of hot-swap technologies. On the next startup, we changed the boot order in the system BIOS to boot from the drive on controller 12 (c12d0, the surviving half of the mirror). After booting from this “secondary” drive, the rpool looked like this:
sa@quasar:~# zpool status rpool
pool: rpool
state: DEGRADED
status: One or more devices could not be used because the label is missing or
invalid. Sufficient replicas exist for the pool to continue
functioning in a degraded state.
action: Replace the device using 'zpool replace'.
see: http://www.sun.com/msg/ZFS-8000-4J
scrub: none requested
config:
NAME         STATE     READ WRITE CKSUM
rpool        DEGRADED     0     0     0
  mirror     DEGRADED     0     0     0
    c11d0s0  UNAVAIL      0     0     0  corrupted data
    c12d0s0  ONLINE       0     0     0
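On a machine with more devices, the entries that need attention can be picked out of this output mechanically. The following is a minimal sketch, assuming the NAME/STATE/READ/WRITE/CKSUM config layout shown above; the function name unhealthy_devices is our own invention and not part of ZFS.

```shell
#!/bin/sh
# unhealthy_devices: read `zpool status` output on stdin and print the name
# and state of every config entry whose state is not ONLINE.
# Hypothetical helper; assumes the config table layout shown in this article.
unhealthy_devices() {
    awk '
        $1 == "NAME" && $2 == "STATE" { in_config = 1; next }  # table header
        $1 == "errors:"               { in_config = 0 }        # table ends
        in_config && NF >= 5 && $2 != "ONLINE" { print $1, $2 }
    '
}
```

It would be used as `zpool status rpool | unhealthy_devices`. Note that it also reports the pool and mirror rows when they are DEGRADED, since it does not try to tell leaf devices apart from their parents.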
There are a few preparatory steps for the new drive before we can make it available for ZFS to resilver. We started by detaching the unavailable block device from the rpool.
sa@quasar:~# zpool detach rpool c11d0s0
sa@quasar:~# zpool status rpool
pool: rpool
state: ONLINE
scrub: scrub in progress for 0h0m, 0.75% done, 0h15m to go
config:
NAME       STATE     READ WRITE CKSUM
rpool      ONLINE       0     0     0
  c12d0s0  ONLINE       0     0     0
errors: No known data errors
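The next step was to run fdisk and accept all the defaults. If this step is skipped, you will have difficulty booting from this drive later. Note that we pointed fdisk at the s0 slice; as the warning in the output says, fdisk normally expects the device that represents the entire disk (for example, /dev/rdsk/c11d0p0 on x86), which is likely why the run below ends with an error writing the master boot record.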
sa@quasar:~# fdisk /dev/rdsk/c11d0s0
WARNING: Device /dev/rdsk/c11d0s0:
The device does not appear to include absolute
sector 0 of the PHYSICAL disk (the normal location for an fdisk table).
Fdisk is normally used with the device that represents the entire fixed disk.
(For example, /dev/rdsk/c0d0p0 on x86 or /dev/rdsk/c0t5d0s2 on sparc).
Are you sure you want to continue? (y/n) y
No fdisk table exists. The default partition for the disk is:
a 100% "SOLARIS System" partition
Type "y" to accept the default partition, otherwise type "n" to edit the
partition table.
y
Warning: only -1 bytes written to clear backup VTOC at block 976751938!
Warning: only -1 bytes written to clear backup VTOC at block 976751940!
Warning: only -1 bytes written to clear backup VTOC at block 976751942!
Warning: only -1 bytes written to clear backup VTOC at block 976751944!
Warning: only -1 bytes written to clear backup VTOC at block 976751946!
fdisk: Error writing master boot record to /dev/rdsk/c11d0s0.
Then we printed the partition table from the good drive and wrote the same layout to the new replacement:
sa@quasar:~# prtvtoc /dev/rdsk/c12d0s0 | fmthard -s - /dev/rdsk/c11d0s0
fmthard: New volume table of contents now in place.
Finally, we were ready to attach this device back to the rpool:
sa@quasar:~# zpool attach -f rpool c12d0s0 c11d0s0
Please be sure to invoke installgrub(1M) to make 'c11d0s0' bootable.
Isn’t that nice? ZFS reminds us to install the GRUB boot loader on the new drive.
sa@quasar:~# installgrub /boot/grub/stage1 /boot/grub/stage2 /dev/rdsk/c11d0s0
stage1 written to partition 0 sector 0 (abs 16065)
stage2 written to partition 0, 271 sectors starting at 50 (abs 16115)
Now we just sit back and watch ZFS re-establish the mirror by resilvering data to the new drive at c11d0s0:
sa@quasar:~# zpool status rpool
pool: rpool
state: ONLINE
status: One or more devices is currently being resilvered. The pool will
continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
scrub: resilver in progress for 0h1m, 20.57% done, 0h4m to go
config:
NAME         STATE     READ WRITE CKSUM
rpool        ONLINE       0     0     0
  mirror     ONLINE       0     0     0
    c12d0s0  ONLINE       0     0     0
    c11d0s0  ONLINE       0     0     0  2.45G resilvered
errors: No known data errors
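Rather than re-running the status command by hand, the progress figure can be pulled out of the scrub line and polled. This is a rough sketch that assumes the exact "scrub: resilver in progress ... % done" wording shown above; other ZFS releases may format this line differently, and the function names resilver_percent and wait_for_resilver are hypothetical.

```shell
#!/bin/sh
# resilver_percent: read `zpool status` output on stdin and print the
# completion percentage from a "resilver in progress" scrub line.
# Prints nothing when no resilver is in progress.
resilver_percent() {
    sed -n 's/^ *scrub: resilver in progress.*, \([0-9.]*\)% done.*/\1/p'
}

# wait_for_resilver POOL [INTERVAL]: poll every INTERVAL seconds (default 60)
# and return once the status output no longer reports a resilver in progress.
wait_for_resilver() {
    pool=$1
    interval=${2:-60}
    while pct=$(zpool status "$pool" | resilver_percent) && [ -n "$pct" ]; do
        echo "$pool: resilver ${pct}% done"
        sleep "$interval"
    done
}
```

Calling `wait_for_resilver rpool 30` would print a progress line every 30 seconds until the resilver is no longer reported as in progress.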
A few minutes later, and all is well:
sa@quasar:~# zpool status rpool
pool: rpool
state: ONLINE
scrub: resilver completed after 0h5m with 0 errors on Fri Apr 23 10:33:06 2010
config:
NAME         STATE     READ WRITE CKSUM
rpool        ONLINE       0     0     0
  mirror     ONLINE       0     0     0
    c12d0s0  ONLINE       0     0     0
    c11d0s0  ONLINE       0     0     0  11.9G resilvered
errors: No known data errors
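For reference, the whole procedure can be condensed into a dry-run helper that only prints the commands used in this walkthrough and executes nothing. It is a sketch under a few assumptions: the function name print_rpool_replace_plan is hypothetical, the replacement drive is assumed to show up under the same device name as the failed one (as it did here), and fdisk is pointed at the whole-disk p0 device, following the hint in fdisk's own warning rather than the s0 slice we actually typed.

```shell
#!/bin/sh
# print_rpool_replace_plan POOL GOOD_SLICE NEW_SLICE
# Dry run: prints the commands from this walkthrough, executes nothing.
# Assumes the new drive reuses the failed drive's device name.
print_rpool_replace_plan() {
    pool=$1
    good=$2
    new=$3
    new_disk=${new%s*}p0   # c11d0s0 -> c11d0p0 (assumes x86 naming)
    cat <<EOF
zpool detach $pool $new
fdisk /dev/rdsk/$new_disk
prtvtoc /dev/rdsk/$good | fmthard -s - /dev/rdsk/$new
zpool attach -f $pool $good $new
installgrub /boot/grub/stage1 /boot/grub/stage2 /dev/rdsk/$new
zpool status $pool
EOF
}
```

Running `print_rpool_replace_plan rpool c12d0s0 c11d0s0` prints the six steps for our machine, which can then be reviewed and run one at a time.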