Non-responsive 8.0-RC1 (now 8.0-STABLE)
Arnaud Houdelette
arnaud.houdelette at tzim.net
Sun Dec 6 09:54:22 UTC 2009
Peter Jeremy wrote:
> On 2009-Nov-30 19:13:30 +1100, Peter Jeremy <peter at server.vk2pj.dyndns.org> wrote:
>
>> On 2009-Nov-29 08:56:55 +0100, Thomas Backman <serenity at exscape.org> wrote:
>>
>>> On Nov 28, 2009, at 10:22 PM, Peter Jeremy wrote:
>>>
>>>
>>>> My main server is running 8.0/amd64 from between RC1 and RC2 and I've
>>>> recently had a couple of long-duration hangs on it during which time
>>>> processes doing I/O will stop responding.
>>>>
> ...
>
>> It actually "hung" again just after I sent the original mail. This
>> time I managed to get console access and could check the kernel state.
>> This showed that a number of processes were blocked on ZFS locks.
>> The most commonly reported state was 'tx->tx_quiesce_done_cv)'.
>>
>
> I've upgraded to 8-STABLE from 30-Nov and the problem is still present,
> even after disabling the boinc processes.
>
> This seems to leave race conditions inside ZFS as the only option.
>
> Has anyone else seen anything like this?
>
>
I have had the same issue since I upgraded to 8.0-RELEASE. It happens during
high I/O load such as a buildworld. Since I run top in an ssh session,
I can say that just before the hang the [zfskern] process shows high CPU usage
and global system usage is 99%. Sometimes I can get back to normal by
interrupting the build with Ctrl-C. Sometimes I can't. If enabled, the watchdog
kicks in and the machine reboots (otherwise, I just lose ssh control over it).
The machine is low on memory (512MB), with the same tuning I used on 7.2
(ARC reduced to 60M, device cache to 5M), which gave me a stable machine there.
I have enabled crash dumps. I can investigate if somebody gives me pointers on
where to look.
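
For reference, tuning of this kind is normally set in /boot/loader.conf.
A minimal sketch, assuming the standard tunables vfs.zfs.arc_max and
vfs.zfs.vdev.cache.size are the ones meant (the values are those quoted
above; the actual file was not posted):

```
# /boot/loader.conf (sketch; tunable names are assumed, values from the text)
vfs.zfs.arc_max="60M"          # cap the ZFS ARC at 60 MB
vfs.zfs.vdev.cache.size="5M"   # per-device (vdev) cache limited to 5 MB
```

The effective values can be checked after boot with
"sysctl vfs.zfs.arc_max vfs.zfs.vdev.cache.size".
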
Arnaud