ZFS hangs with 8.2-release
Dan Pritts
danno at internet2.edu
Thu Dec 15 15:42:22 UTC 2011
Hi all, this is a follow-up to my notes from last week.
Short answer: I have followed most or all of the list's suggestions and
I still get crashes when scrubbing. In fact, it is now reliably
crashing after less than 10 minutes.
Does anyone have any other suggestions? Are the ZFS devs here, and
would crash dumps be useful?
Below are my responses to specific things that folks suggested.
> do a memory test
My colleague reminded me that we ran a memory test within the last month
or two, since we started troubleshooting this: 24 hours with memtest86+
with no errors reported. FWIW this system was stable running Solaris
for several years.
> Recommendations to upgrade to 8.2-STABLE and then polite explanations
> after I did it wrong
We've upgraded to 8.2-STABLE and applied the 1-line patch suggested by
Adam McDougall.
> FreeBSD netflow3.internet2.edu 8.2-STABLE FreeBSD 8.2-STABLE #1: Mon
> Dec 12 15:45:06 UTC 2011
> root at netflow3.internet2.edu:/usr/obj/usr/src/sys/GENERIC amd64
There were also many recommendations from Adam McDougall, which resulted
in the following /boot/loader.conf. I also tried removing all of the zfs
and vm lines; the scrub still crashes.
I think that something in here is causing the lockups, though: with the
empty loader.conf the machine reboots instead of locking up.
> verbose_loading="YES"
> rootdev="disk16s1a"
>
> #I have 16G of Ram
>
> vfs.zfs.prefetch_disable=1
> vfs.zfs.txg.timeout="5"
> vfs.zfs.arc_min="512M"
> vfs.zfs.arc_max="4G"
> vm.kmem_size="32G"
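To put the sizes above next to the 16G of RAM noted in the comment, here
is a minimal Python sketch. It assumes the K/M/G/T suffixes are powers of
1024, and the helper names (parse_size, parse_loader_conf, summarize) are
hypothetical, made up for illustration. It only reports each value as a
fraction of RAM and does not judge whether a setting is right.

```python
import re

# Assumption: loader.conf size suffixes are binary (K = 1024, M = 1024**2, ...).
_UNITS = {"": 1, "K": 1024, "M": 1024**2, "G": 1024**3, "T": 1024**4}


def parse_size(text):
    """Convert a value such as "4G" or "512M" to a number of bytes."""
    m = re.fullmatch(r"(\d+)([KMGT]?)", text.strip().upper())
    if m is None:
        raise ValueError(f"not a size: {text!r}")
    return int(m.group(1)) * _UNITS[m.group(2)]


def parse_loader_conf(text):
    """Return {name: raw value} for name=value lines, ignoring # comments."""
    tunables = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if "=" not in line:
            continue
        name, value = line.split("=", 1)
        tunables[name.strip()] = value.strip().strip('"')
    return tunables


def summarize(conf_text, ram_bytes):
    """Map each memory tunable to (bytes, fraction of physical RAM)."""
    tunables = parse_loader_conf(conf_text)
    result = {}
    for name in ("vfs.zfs.arc_min", "vfs.zfs.arc_max", "vm.kmem_size"):
        if name in tunables:
            size = parse_size(tunables[name])
            result[name] = (size, size / ram_bytes)
    return result


# The zfs and vm lines from the loader.conf quoted above.
LOADER_CONF = '''
vfs.zfs.prefetch_disable=1
vfs.zfs.txg.timeout="5"
vfs.zfs.arc_min="512M"
vfs.zfs.arc_max="4G"
vm.kmem_size="32G"
'''

PHYSICAL_RAM = parse_size("16G")  # taken from the "#I have 16G of Ram" comment

if __name__ == "__main__":
    for name, (size, ratio) in summarize(LOADER_CONF, PHYSICAL_RAM).items():
        print(f"{name}: {size} bytes, {ratio:.2f}x physical RAM")
```

With the values above, vm.kmem_size comes out at twice physical RAM and
vfs.zfs.arc_max at a quarter of it.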
Specifics from Adam:
>>
>> - In my experience running with prefetch disabled is a significant
>> impact to speed, once you are comfortable with doing some performance
>> testing I would evaluate that and decide for yourself about "some
>> discussion suggests that the prefetch sucks"
Just to confirm, is there any STABILITY reason not to disable
prefetch? The notes I saw suggested that it hurt stability.
>> - Be wary of using dedupe in v28, it seems to have a huge performance
>> drag when working with files that were written while dedupe was
>> enabled; I won't comment more on that except to suggest not adding
>> that variable to your issue
Good to know. Not appropriate for our data set anyway.
>> - These comments mostly relate to speed, but I had to give the ARC
>> enough room to work without deadlocking the system so they may help
>> you there.
Does "enough room to work" mean along the lines of 2-4G, as suggested above?
thanks!
danno
--
Dan Pritts, Sr. Systems Engineer
Internet2
office: +1-734-352-4953 | mobile: +1-734-834-7224