ZFS hangs with 8.2-release

Dan Pritts danno at internet2.edu
Thu Dec 15 15:42:22 UTC 2011


Hi all, this is a follow-up to my notes from last week.

Short answer: I have followed most or all of the list's suggestions and 
I still get crashes when scrubbing.  In fact, it is now reliably 
crashing in under 10 minutes.

Does anyone have any other suggestions?  Are the ZFS devs here, and 
would crash dumps be useful?


Below are my responses to specific things that folks suggested.


> do a memory test
My colleague reminded me that we have run a memory test in the last 
month or two, since we started troubleshooting this: 24 hours with 
memtest86+, with no errors reported.  FWIW, this system was stable 
running Solaris for several years.

> Recommendations to upgrade to 8.2-STABLE, and then polite explanations 
> after I did it wrong
We've upgraded to 8.2-STABLE and applied the 1-line patch suggested by 
Adam McDougall.

> FreeBSD netflow3.internet2.edu 8.2-STABLE FreeBSD 8.2-STABLE #1: Mon 
> Dec 12 15:45:06 UTC 2011     
> root at netflow3.internet2.edu:/usr/obj/usr/src/sys/GENERIC  amd64

There were also many recommendations from Adam McDougall, which resulted 
in the following /boot/loader.conf.  I also tried removing all of the zfs 
and vm lines; it still crashes.

I think that something in here is causing the lockups, though: with the 
empty loader.conf it reboots instead of locking up.

> verbose_loading="YES"
> rootdev="disk16s1a"
>
> #I have 16G of Ram
>
> vfs.zfs.prefetch_disable=1
> vfs.zfs.txg.timeout="5"
> vfs.zfs.arc_min="512M"
> vfs.zfs.arc_max="4G"
> vm.kmem_size="32G"
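
As an illustration only, the sizes above can be compared against the 16G 
of RAM with a minimal sketch like the one below.  It was not run on this 
system.  The helper names parse_size and parse_loader_conf are 
hypothetical, and it assumes the K/M/G/T suffixes are powers of 1024 and 
that there are no inline comments after values.

```python
import re

# Multipliers for size suffixes (assumption: powers of 1024).
SUFFIXES = {"K": 1024, "M": 1024**2, "G": 1024**3, "T": 1024**4}


def parse_size(value):
    """Convert a string such as '512M' or '4G' to bytes; bare integers pass through."""
    m = re.fullmatch(r"(\d+)([KMGT]?)", value.strip().upper())
    if m is None:
        raise ValueError("not a size: %r" % value)
    number, suffix = m.groups()
    return int(number) * SUFFIXES.get(suffix, 1)


def parse_loader_conf(text):
    """Return {name: value} for name=value lines, skipping blanks and # comments."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        name, _, value = line.partition("=")
        settings[name.strip()] = value.strip().strip('"')
    return settings


# The loader.conf quoted above, without the "> " quoting.
CONF = '''
verbose_loading="YES"
rootdev="disk16s1a"
#I have 16G of Ram
vfs.zfs.prefetch_disable=1
vfs.zfs.txg.timeout="5"
vfs.zfs.arc_min="512M"
vfs.zfs.arc_max="4G"
vm.kmem_size="32G"
'''

RAM = parse_size("16G")
settings = parse_loader_conf(CONF)

# Show each size tunable in bytes and as a multiple of physical RAM.
for name in ("vfs.zfs.arc_min", "vfs.zfs.arc_max", "vm.kmem_size"):
    size = parse_size(settings[name])
    print("%s = %d bytes (%.2fx RAM)" % (name, size, size / RAM))
```

This only reports the ratios; it makes no judgement about which values 
are appropriate.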


Specifics from Adam:
>>
>> - In my experience running with prefetch disabled is a significant 
>> impact to speed, once you are comfortable with doing some performance 
>> testing I would evaluate that and decide for yourself about "some 
>> discussion suggests that the prefetch sucks"
Just to confirm, is there any STABILITY reason not to disable 
prefetch?  The notes I saw suggested that prefetch hurt stability.

>> - Be wary of using dedupe in v28, it seems to have a huge performance 
>> drag when working with files that were written while dedupe was 
>> enabled; I won't comment more on that except to suggest not adding 
>> that variable to your issue
Good to know.  Dedupe is not appropriate for our data set anyway.

>> - These comments mostly relate to speed, but I had to give the ARC 
>> enough room to work without deadlocking the system so they may help 
>> you there.
By "enough room to work", do you mean something along the lines of 2-4G, 
as suggested above?

thanks!
danno
-- 
Dan Pritts, Sr. Systems Engineer
Internet2
office: +1-734-352-4953 | mobile: +1-734-834-7224



More information about the freebsd-fs mailing list