kern/151910: [zfs] booting from raidz/raidz2 on ciss(4) doesn't work

Emil Smolenski am at raisa.eu.org
Wed Nov 3 09:40:11 UTC 2010


>Number:         151910
>Category:       kern
>Synopsis:       [zfs] booting from raidz/raidz2 on ciss(4) doesn't work
>Confidential:   no
>Severity:       serious
>Priority:       low
>Responsible:    freebsd-bugs
>State:          open
>Quarter:        
>Keywords:       
>Date-Required:
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Wed Nov 03 09:40:10 UTC 2010
>Closed-Date:
>Last-Modified:
>Originator:     Emil Smolenski
>Release:        FreeBSD 8.1-RELEASE
>Organization:
>Environment:
FreeBSD  8.1-RELEASE FreeBSD 8.1-RELEASE #0: Mon Jul 19 02:36:49 UTC 2010     root at mason.cse.buffalo.edu:/usr/obj/usr/src/sys/GENERIC  amd64
>Description:
FreeBSD 8.1 installed using the Fixit# environment on an HP ProLiant 185 G5 with an HP SmartArray controller does not boot from a ZFS pool in a raidz or raidz2 setup. Mirror-based configurations work as expected.

There are 6 disks, each configured on the HP SmartArray as a single-disk RAID0 array, so there are 6 logical devices (da[0-5]).

System information gathered from Fixit# environment:

# dmesg
(...)
ciss0: <HP Smart Array P400> port 0xe800-0xe8ff mem 0xdef00000-0xdeffffff,0xdeeff000-0xdeefffff irq 35 at device 0.0 on pci4
ciss0: PERFORMANT Transport
ciss0: [ITHREAD]
(...)
da0 at ciss0 bus 0 scbus0 target 0 lun 0
da0: <COMPAQ RAID 0  VOLUME OK> Fixed Direct Access SCSI-5 device
da0: 135.168MB/s transfers
da0: Command Queueing enabled
da0: 1430767MB (2930211632 512 byte sectors: 255H 32S/T 65535C)
(...)
da1 at ciss0 bus 0 scbus0 target 1 lun 0
(...)
da2 at ciss0 bus 0 scbus0 target 2 lun 0
(...)
da3 at ciss0 bus 0 scbus0 target 3 lun 0
(...)
da4 at ciss0 bus 0 scbus0 target 4 lun 0
(...)
da5 at ciss0 bus 0 scbus0 target 5 lun 0
(...)

# diskinfo -v da0
da0
        512             # sectorsize
        1500268355584   # mediasize in bytes (1.4T)
        2930211632      # mediasize in sectors
        0               # stripesize
        0               # stripeoffset
        359094          # Cylinders according to firmware.
        255             # Heads according to firmware.
        32              # Sectors according to firmware.
        PAFGL0T9SXH13E  # Disk ident.
(...)

# camcontrol devlist -v
scbus0 on ciss0 bus 0:
<COMPAQ RAID 0  VOLUME OK>         at scbus0 target 0 lun 0 (da0,pass0)
<COMPAQ RAID 0  VOLUME OK>         at scbus0 target 1 lun 0 (da1,pass1)
<COMPAQ RAID 0  VOLUME OK>         at scbus0 target 2 lun 0 (da2,pass2)
<COMPAQ RAID 0  VOLUME OK>         at scbus0 target 3 lun 0 (da3,pass3)
<COMPAQ RAID 0  VOLUME OK>         at scbus0 target 4 lun 0 (da4,pass4)
<COMPAQ RAID 0  VOLUME OK>         at scbus0 target 5 lun 0 (da5,pass5)
scbus1 on ciss0 bus 32:
scbus-1 on xpt0 bus 0:
<>                                 at scbus-1 target -1 lun -1 (xpt0)

# pciconf -lv
(...)
ciss0 at pci0:4:0:0:       class=0x010400 card=0x3234103c chip=0x3230103c rev=0x04 hdr=0x00
    class      = mass storage
    subclass   = RAID
(...)

I've run several tests with different configurations. The setup common to all of them is as follows:

Each logical device (array) has a GPT scheme with the following partitions:

=>        34  2930211565  da0  GPT  (1.4T)
          34         128    1  freebsd-boot  (64K)
         162     2097152    2  freebsd-swap  (1.0G)
     2097314     4194304    3  freebsd-zfs  (2.0G)
     6291618  2923919981       - free -  (1.4T)
(...)

Each freebsd-zfs partition has a GPT label: gpt/test0, gpt/test1, and so on. The ZFS pool is created with the following options: canmount=off, checksum=fletcher4, atime=off, setuid=off. Some datasets additionally have compress=lzjb or compress=gzip set.
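
For reference, each array was prepared roughly along these lines. This is a minimal sketch only: the sizes and labels come from the output above, the pool shown is the 3-LD raidz case, and the exact command form (including the per-property "zfs set" calls) is an approximation of the procedure described, repeated for da1-da5 with labels test1-test5:

# gpart create -s gpt da0
# gpart add -t freebsd-boot -s 128 da0
# gpart add -t freebsd-swap -s 2097152 da0
# gpart add -t freebsd-zfs -s 4194304 -l test0 da0
# zpool create test raidz gpt/test0 gpt/test1 gpt/test2
# zfs set canmount=off test
# zfs set checksum=fletcher4 test
# zfs set atime=off test
# zfs set setuid=off test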

Boot code is installed on each disk as follows:
# gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 da0
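
For all six logical devices this can be done in one go with a simple sh loop (a minimal sketch using the device names listed above):

# for d in da0 da1 da2 da3 da4 da5; do gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 $d; done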

Results from the tests (LD = logical device):
1) + One logical device -> works
2) + Mirror built on 2 logical devices -> works
3) + Mirror built on 6 LDs -> works
4) - RAIDZ2 built on 3 LDs -> doesn't work
5) - RAIDZ2 built on 6 LDs -> doesn't work
6) - RAIDZ built on 3 LDs -> doesn't work
7) - RAIDZ built on 6 LDs -> doesn't work

8) + RAIDZ built on 3 USB sticks on the same machine -> works
9) + RAIDZ built on 3 LDs on another machine (aacdX devices) -> works

I encountered three different error messages, depending on where the boot code was built.

a) Error message encountered when using the boot code from the Fixit# media:

error 1 lba 32
error 1 lba 1
error 1 lba 32
error 1 lba 1
No ZFS pools located, can't boot

b) Error message encountered when using boot code that I built myself:

ZFS: i/o error - all block copies unavailable
ZFS: can't read MOS object directory
Can't find root filesystem - giving up
ZFS: unexpected object set type 0 
ZFS: unexpected object set type 0

FreeBSD/x86 boot
Default: test:/boot/kernel/kernel
boot:
ZFS: unexpected object set type 0

c) I also saw a message similar to the one shown above, but with a "ZFS: can't read MOS" message instead.

Example output from the "status" command at the "boot:" prompt (when error b appears):

boot: status pool: test
config:
          NAME STATE
          test ONLINE
            raidz1 ONLINE
              /dev/gpt/test0 ONLINE
              /dev/gpt/test1 ONLINE
              /dev/gpt/test2 ONLINE

So the boot code sees a healthy ZFS pool with all devices available, but it cannot boot from it.

More details on the _working_ configuration (2) (mirror on 2 LDs):

works# zpool list test
NAME   SIZE   USED  AVAIL    CAP  HEALTH  ALTROOT
test  1.98G   434M  1.56G    21%  ONLINE  -

works# zpool status test
  pool: test
 state: ONLINE
 scrub: none requested
config:

        NAME           STATE     READ WRITE CKSUM
        test           ONLINE       0     0     0
          mirror       ONLINE       0     0     0
            gpt/test0  ONLINE       0     0     0
            gpt/test1  ONLINE       0     0     0

errors: No known data errors

works# zdb -uuu test
Uberblock

        magic = 0000000000bab10c
        version = 14
        txg = 77
        guid_sum = 2401146990467298568
        timestamp = 1287650011 UTC = Thu Oct 21 08:33:31 2010
        rootbp = [L0 DMU objset] 400L/200P DVA[0]=<0:e90ca00:200> DVA[1]=<0:26037600:200> DVA[2]=<0:3e00ce00:200> fletcher4 lzjb LE contiguous birth=77 fill=169 cksum=b02658433:48b259d7157:f39e911b0c82:2286d07e38a6a3

More details on the _NOT working_ configuration (6) (raidz on 3 LDs):

doesntwork# zpool list test
NAME   SIZE   USED  AVAIL    CAP  HEALTH  ALTROOT
test  5.97G   655M  5.33G    10%  ONLINE  -

doesntwork# zpool status test
  pool: test
 state: ONLINE
 scrub: none requested
config:

        NAME           STATE     READ WRITE CKSUM
        test           ONLINE       0     0     0
          raidz1       ONLINE       0     0     0
            gpt/test0  ONLINE       0     0     0
            gpt/test1  ONLINE       0     0     0
            gpt/test2  ONLINE       0     0     0

errors: No known data errors

doesntwork# zdb -uuu test
Uberblock

        magic = 0000000000bab10c
        version = 14
        txg = 78
        guid_sum = 8302404134133891378
        timestamp = 1287656704 UTC = Thu Oct 21 10:25:04 2010
        rootbp = [L0 DMU objset] 400L/200P DVA[0]=<0:5b9a7000:400> DVA[1]=<0:a205d800:400> DVA[2]=<0:ea016000:400> fletcher4 lzjb LE contiguous birth=78 fill=159 cksum=a13038304:4232b1013de:dc9ba0c2751d:1f1144c425d7a4

See also a possibly identical issue reported on the freebsd-stable mailing list: http://lists.freebsd.org/pipermail/freebsd-stable/2010-October/059610.html

I also tried mm's mfsBSD with ZFS v15; a system installed from it does not boot either. The same happens with FreeBSD 8.0-RELEASE.

I can provide more details on this issue and am willing to test patches.
>How-To-Repeat:
1. Install a ZFS-only FreeBSD 8.1-RELEASE on an HP SmartArray controller, using raidz or raidz2 (see the sketch of the boot-relevant settings below).
2. Reboot.
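
A minimal sketch of the boot-relevant settings for such a ZFS-only installation, assuming the pool name "test" used in the examples above and the usual FreeBSD 8.x ZFS-root recipe (paths are relative to the installed system's root; the exact commands used originally are not recorded here):

# zpool set bootfs=test test
# echo 'zfs_load="YES"' >> /boot/loader.conf
# echo 'vfs.root.mountfrom="zfs:test"' >> /boot/loader.conf
# echo 'zfs_enable="YES"' >> /etc/rc.conf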
>Fix:


>Release-Note:
>Audit-Trail:
>Unformatted:

