bugged sysinstall, bsdlabel, zfs, gmirror - recipe for disaster :)

Bartosz Stec admin at kkip.pl
Wed Sep 3 12:28:42 UTC 2008

Hello there!
Here's my story; hopefully some of you will avoid following in my 
footsteps and save yourselves some trouble :)

Yesterday I decided it was about time to test ZFS on my 
home server PC (i386 FreeBSD 7.1-pre). A couple of weeks ago I bought a 
new desktop PC (with SATA), so I had a bunch of PATA disks from the old one 
to use in the server. Lucky me - there were three 40GB HDDs - RAIDZ was on

So after thirty minutes I had a plan, and my server had 4 disks 
connected - one 20GB disk with the current system (ad1), and three 40GB 
disks to replace it (ad[023]).
The plan was simple:

    1. csup freebsd-stable
    2. follow the tuning guide for zfs, rebuild world and kernel, and
    complete the system upgrade
    3. Reboot in single user mode
    4. fdisk new disks with sysinstall using one big slice for every disk
    5. bsdlabel every new disk with sysinstall using: 1GB for /, 512MB
    for swap, and rest unused (for ZFS)
    6. gmirror label -n -v -b round-robin boot ad0s1a ad2s1a ad3s1a
    7. newfs /dev/mirror/boot
    8. mount /dev/mirror/boot /mnt && cd /mnt
    9. dump -h 0 -L -f - -C 32 / | restore rf -
    10. zpool create tank raidz ad0s1d ad2s1d ad3s1d
    11. zfs create new cool filesystems :)
    12. dump | restore old ufs2 filesystem to new cool zfs filesystems :)
    13. changing mount points from tank/foo to /foo
    14. edit the new fstab on the mirror: point the root mount at the
    "boot" mirror, add the new swaps, and remove the old swaps and all
    filesystems now placed on the zpool
    15. power off system, detach ad1 and power on new system in mixed
    gmirror - raidz environment. Yay!

Well... it almost worked. Sysinstall screwed it up. I was always too 
lazy to read man bsdlabel, which is why I had been using this "nice" tool 
for disk-related tasks. Such a mistake!
The problem with labels created with sysinstall is that it asks for a mount 
point for every partition in the slice. In my case that was unwanted 
behaviour, so on every disk I first created:

    a:    /  
    b:   swap
    c:   none
    d:  /foo 

Then I removed the mount points with the "M" key and saved the changes with "W". 
At this point everything seemed OK. So I added gmirror to loader.conf 
and ran "gmirror label -n -v -b round-robin boot ad0s1a ad2s1a ad3s1a". 
Still OK so far. Next step - kldload geom_mirror. And here came disaster! 
The system became unresponsive and hung after a while. Rebooting didn't help: 
as soon as the kernel loaded the gmirror module, the screen was flooded with

    WARNING: Expected rawoffset 0, found 63

and the system didn't boot. I only got it to start because the old drive, ad1, 
had no gmirror module in its loader.conf. So after rebooting I cleared the 
metadata on the providers and made several more attempts, but the results were 
always the same. Finally I found the explanation for this issue. The bsdlabel 
man page says:

    offset  The offset of the start of the partition from the beginning of
            the drive in sectors, or * to have bsdlabel calculate the correct
            offset to use (the end of the previous partition plus one,
            ignoring partition `c').  For partition `c', * will be interpreted
            as an offset of 0.  The first partition should start at offset 16,
            because the first 16 sectors are reserved for metadata.

So the proper labels for the disks should be (and now are):

    # /dev/ad0s1:
    8 partitions:
    #        size   offset    fstype   [fsize bsize bps/cpg]
      a:  2097152       16    4.2BSD        0     0     0
      b:  1048576  2097168      swap
      c: 78156162        0    unused        0     0         # "raw" part, don't edit
      d: 75010418  3145744    unused        0     0
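
The rule from the man page can be checked mechanically. Below is a minimal Python sketch, assuming the label text has the same column layout as the listing above (as printed by `bsdlabel`). The helper names `parse_label` and `bad_partitions` are hypothetical, not part of any FreeBSD tool, and the "bad" sample only reproduces the one fact reported here (partition a at offset 0).

```python
import re

# Hypothetical helper, not part of any FreeBSD tool: scans label text and
# reports partitions that start inside the reserved sectors.
RESERVED_SECTORS = 16  # per the bsdlabel man page excerpt quoted earlier

# Matches rows such as "  a:  2097152       16    4.2BSD ..."
ROW = re.compile(r"^\s*([a-h]):\s+(\d+)\s+(\d+)\s+(\S+)")

def parse_label(text):
    """Return a list of (letter, size, offset, fstype) tuples."""
    rows = []
    for line in text.splitlines():
        m = ROW.match(line)
        if m:
            rows.append((m.group(1), int(m.group(2)), int(m.group(3)), m.group(4)))
    return rows

def bad_partitions(text):
    """Letters of partitions (other than 'c') that start below sector 16."""
    return [letter for letter, _size, offset, _fs in parse_label(text)
            if letter != "c" and offset < RESERVED_SECTORS]

GOOD = """\
# /dev/ad0s1:
8 partitions:
#        size   offset    fstype   [fsize bsize bps/cpg]
  a:  2097152       16    4.2BSD        0     0     0
  b:  1048576  2097168      swap
  c: 78156162        0    unused        0     0
  d: 75010418  3145744    unused        0     0
"""

# Illustrative only: partition a at offset 0, as sysinstall left it.
BAD = """\
  a:  2097152        0    4.2BSD        0     0     0
  c: 78156162        0    unused        0     0
"""

print(bad_partitions(GOOD))
print(bad_partitions(BAD))
```

Partition c is skipped because it describes the whole slice and is expected to start at 0.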

The problem was that sysinstall had placed partition "a:" at offset 
0! This is what happens when you don't RTFM :) I assume this bug 
occurred because I created a mount point for root on ad[023]s1a, removed 
it afterwards, and then saved the label. It seems that neither the GEOM 
framework nor the bsdlabel manual expects this. I think it should be fixed somehow.
Fortunately, manually editing the labels with "bsdlabel -e" wasn't as hard as I 
expected. This is how I got everything back to normal:

      a:    1024M        *    4.2BSD        0     0     0
      b:     512M        *      swap
      c: 78156162        0    unused        0     0         # "raw" part, don't edit
      d:        *        *    unused        0     0
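
For anyone wondering how these entries turn into the sector numbers of the final label, here is a rough Python sketch of the arithmetic. It is a simplified model, not bsdlabel's actual code, and it assumes 512-byte sectors, a first partition starting at sector 16 (as the man page recommends), and a `*` size meaning "the rest of the slice". The names `to_sectors` and `resolve` are made up for illustration.

```python
SECTOR_BYTES = 512        # assumption: 512-byte sectors
FIRST_OFFSET = 16         # first 16 sectors are reserved, per the man page
SLICE_SECTORS = 78156162  # size of partition c in the label above

def to_sectors(size):
    """Convert a '1024M'-style size to sectors; plain digits are sectors."""
    units = {"K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}
    suffix = size[-1].upper()
    if suffix in units:
        return int(size[:-1]) * units[suffix] // SECTOR_BYTES
    return int(size)

def resolve(entries, slice_sectors=SLICE_SECTORS):
    """entries: list of (letter, size); a size of '*' takes the remaining space."""
    layout = []
    offset = FIRST_OFFSET
    for letter, size in entries:
        sectors = slice_sectors - offset if size == "*" else to_sectors(size)
        layout.append((letter, sectors, offset))
        offset += sectors  # the next partition starts right after this one
    return layout

layout = resolve([("a", "1024M"), ("b", "512M"), ("d", "*")])
for letter, sectors, offset in layout:
    print(f"{letter}: {sectors:>9} {offset:>9}")
```

Under these assumptions the result agrees with the numeric label shown earlier: a is 2097152 sectors at offset 16, b is 1048576 at 2097168, and d is 75010418 at 3145744.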

After that, gmirror stopped pissing me off, and I finished my plan, 
as shown below:

    # zpool status
      pool: tank
     state: ONLINE
     scrub: scrub completed with 0 errors on Wed Sep  3 10:10:07 2008

            NAME        STATE     READ WRITE CKSUM
            tank        ONLINE       0     0     0
              raidz1    ONLINE       0     0     0
                ad0s1d  ONLINE       0     0     0
                ad2s1d  ONLINE       0     0     0
                ad3s1d  ONLINE       0     0     0

    errors: No known data errors

    # gmirror status
           Name    Status  Components
    mirror/boot  COMPLETE  ad0s1a

Good luck with ZFS everyone! :) And RTFM ;)

