kern/160777: RAID-Z3 causes fatal hang upon scrub/import on 9.0-BETA2/amd64

Sat Sep 17 01:30:13 UTC 2011

>Number:         160777
>Category:       kern
>Synopsis:       RAID-Z3 causes fatal hang upon scrub/import on 9.0-BETA2/amd64
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    freebsd-bugs
>State:          open
>Quarter:        
>Keywords:       
>Date-Required:
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Sat Sep 17 01:30:11 UTC 2011
>Closed-Date:
>Last-Modified:
>Originator:     Charlie
>Release:        9.0-BETA2/amd64
>Organization:
none
>Environment:
FreeBSD  9.0-BETA2 FreeBSD 9.0-BETA2 #0: Wed Aug 31 18:07:44 UTC 2011     root at farrell.cse.buffalo.edu:/usr/obj/usr/src/sys/GENERIC  amd64

>Description:
RAID-Z3 causes fatal hang upon scrub/import on 9.0-BETA2/amd64.

By fatal hang, I mean: (1) the hard drive LEDs freeze in a static state of on or off (rather than flashing to indicate drive activity) and stay there; (2) the console no longer responds to any keypress events such as space bar or Control-Alt-F2; (3) the system entirely stops responding to pings.

I first noticed this when I ran "zdb pool" while a "zpool scrub pool" was in progress, and the system crashed.  I had thought "zdb pool" would be a read-only operation that would just give me some interesting metadata to page through.  Rest assured, while narrowing down what was faulty here I didn't touch that command with a ten-foot pole (although in the case I confirmed to be working properly, RAID-Z2, "zdb pool" didn't cause a problem).  My guess is that "zdb pool" consumed too much memory and so the machine crashed.  This happened during the same boot in which I had created the array.

The first time I attempted "zpool import pool" after the initial creation, I could see all drives being accessed for about a minute (positive activity), and then the system fatally stalled as described above.  I tried "zpool scrub -s pool", and was only able to see the data at all by running "zpool export pool && zpool import -o readonly=on pool".  When I then tried importing it read-write again, it stalled again.  An unclean dismount is not required to trigger this: when I repeated the problem with a freshly created zpool (after a proper "zpool destroy" of the old one), I found that the "zpool import" or "zpool scrub" alone triggered the fatal stall.

I sincerely hope this is helpful.  I've switched to RAID-Z2 for now, unfortunately.  I would be able to do much more rigorous testing of ZFS.  If this problem is confirmed and fixed by 9.0, I can offer to help uncover more bugs using a kernel with debugging enabled.  In the meantime I need to move forward.
>How-To-Repeat:
zpool create -O checksum=sha256 -O compression=gzip-9 pool raidz3 gpt/foo*.eli

zfs create -o checksum=sha256 -o compression=gzip-9 -o copies=3 pool/pond

zpool scrub pool
# or:
zpool export pool && zpool import pool

(Either of these seems to trigger the fatal stall described above.)

The following conditions may or may not be relevant; I don't have the resources or time to check.  (1) The drives are 3TB each; (2) I partitioned each drive using GPT with one large labelled partition that has 99% of the capacity allocated to it; (3) I am using geli on the large partition.  If these factors seem to be the cause, note that when I create a RAID-Z2 pool instead of RAID-Z3, there is no problem at all.  I can also confirm that the entirety of each drive is accessible, since I did a full dd over the entire drive (partition sector, metadata and all), so it is not a matter of the kernel not seeing the drive size properly.  In any case I would expect a graceful error from the kernel instead of this kind of stall.  I haven't attempted to investigate the stall itself, for example by kernel debugging, but the problem is so reproducible that I suspect that might not be necessary.
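
For anyone trying to reproduce this, the per-disk layout described above (GPT, one labelled partition, geli on top) could be sketched roughly as follows.  This is only an illustration under assumptions: the device name, label, partition size, partition type and the absence of any geli options are all hypothetical and not taken from the report.  The function only prints the commands; it does not run them.

```shell
#!/bin/sh
# Sketch only: print commands that would prepare one disk in the layout
# described above.  Nothing is executed.  All arguments are hypothetical.

prep_disk_cmds() {
    disk=$1    # assumed device name, e.g. ada0
    label=$2   # GPT label, e.g. foo0 (gives /dev/gpt/foo0, then /dev/gpt/foo0.eli)
    size=$3    # assumed partition size, e.g. 2700G; the report only says "99%"
    echo "gpart create -s gpt $disk"
    # freebsd-zfs is an assumed partition type; the report does not state it.
    echo "gpart add -t freebsd-zfs -l $label -s $size $disk"
    # geli options (cipher, key length, sector size) are unknown and omitted.
    echo "geli init /dev/gpt/$label"
    echo "geli attach /dev/gpt/$label"
}
```

For example, "prep_disk_cmds ada0 foo0 2700G" prints four commands, the last of which would attach the provider that appears as gpt/foo0.eli in the zpool create line above.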
>Fix:
Unknown.  I can confirm that if I use RAID-Z2 and run many "zpool import" and "zpool export" commands back to back, as well as "zpool scrub", there is no problem at all.
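
The back-to-back export/import test mentioned above could be scripted along these lines.  This is a minimal sketch, not the exact procedure used: the helper names run and stress_pool, the DRY_RUN switch and the cycle count are hypothetical.  With DRY_RUN left at its default of 1 the commands are only printed, which is the safe choice on a RAID-Z3 pool given the stall described in this report.

```shell
#!/bin/sh
# Sketch only: repeated export/import cycles followed by a scrub.
# DRY_RUN=1 (default) prints each command instead of executing it.

run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

stress_pool() {
    pool=$1     # pool name, e.g. pool
    cycles=$2   # number of export/import cycles (arbitrary choice)
    i=0
    while [ "$i" -lt "$cycles" ]; do
        run zpool export "$pool" || return 1
        run zpool import "$pool" || return 1
        i=$((i + 1))
    done
    run zpool scrub "$pool"
}
```

Calling "stress_pool pool 10" prints ten export/import pairs and one scrub; setting DRY_RUN=0 first would actually execute them.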

>Release-Note:
>Audit-Trail:
>Unformatted: