Non-responsive 8.0-RC1
Peter Jeremy
peterjeremy at acm.org
Sat Nov 28 21:22:31 UTC 2009
My main server is running 8.0/amd64 built from sources between RC1 and
RC2, and I've recently had a couple of long-duration hangs on it during
which processes doing I/O stop responding.
The first time, it stopped responding for about 25 minutes and then
spontaneously corrected itself. I was logged in remotely the whole
time and Ctrl-T was responding throughout (claiming the process was
'runnable'). I tried starting a second session - which got as far
as reporting the SSH banner I have configured and then did nothing.
The second time lasted about 5 minutes.
I can't find anything in any log files or dmesg. 'vmstat -m' output
looks sensible. Unfortunately, I didn't have access to the console
on either occasion.
The system is a dual-core Athlon with the base OS (root/usr/var) on
UFS and the remaining filesystems on ZFS. It's running SCHED_ULE.
It runs a pair of BOINC processes in the background. The first time,
it should have been otherwise unused apart from a mairix (mail
indexing tool) process that I'd just started. The second time, it
would have been running a buildkernel.
Based on it managing to report the ssh banner (which is stored in
/etc) but not getting to a shell prompt (my home directory is ZFS),
my initial suspicion was ZFS but it occurs to me that it could be
a priority-inversion problem with the BOINC processes.
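
One way to narrow this down would be to log write latency on a UFS
directory and a ZFS directory side by side, so that after the next hang
the log shows which filesystem stalled and for how long. The following is
only a minimal sketch: the function names, the 1 second threshold and the
10 second interval are assumptions made for illustration, not anything
from the system described above. The probes run one after another, so a
stall on one directory also delays the probe of the next one; the
per-directory timings still show where the time went.

```python
import os
import time
import uuid


def time_write(directory):
    """Return seconds taken to create, fsync and delete a small file."""
    name = os.path.join(directory, ".ioprobe-" + uuid.uuid4().hex)
    start = time.monotonic()
    fd = os.open(name, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    try:
        os.write(fd, b"probe\n")
        os.fsync(fd)  # force the data to stable storage, not just the cache
    finally:
        os.close(fd)
        os.unlink(name)
    return time.monotonic() - start


def probe(directories, threshold=1.0):
    """Time one write per directory; return {directory: (elapsed, is_slow)}."""
    results = {}
    for directory in directories:
        elapsed = time_write(directory)
        results[directory] = (elapsed, elapsed > threshold)
    return results


def monitor(directories, interval=10.0, rounds=None):
    """Print one timestamped line per directory; loop forever if rounds is None."""
    done = 0
    while rounds is None or done < rounds:
        stamp = time.strftime("%Y-%m-%d %H:%M:%S")
        for directory, (elapsed, slow) in probe(directories).items():
            flag = " SLOW" if slow else ""
            print(f"{stamp} {directory} {elapsed:.3f}s{flag}", flush=True)
        done += 1
        time.sleep(interval)
```

Calling something like monitor(["/var/tmp", os.path.expanduser("~")])
with its output redirected to a file on UFS would cover both cases here,
assuming /var/tmp is on UFS and the home directory is on ZFS as described
above. If only the ZFS line shows a multi-minute delay, that points at
ZFS; if both do, the scheduler or priority-inversion theory looks more
likely.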
Can anyone suggest where to go looking for a cause?
--
Peter Jeremy