6.2-STABLE deadlock?

Oleg Derevenetz oleg at vsi.ru
Tue Apr 24 19:23:36 UTC 2007


Quoting LI Xin <delphij at delphij.net>:

> Kostik Belousov wrote:
> > On Mon, Apr 23, 2007 at 03:56:32AM +0100, Adrian Wontroba wrote:
> >> On Tue, Mar 13, 2007 at 02:08:48PM +0000, Adrian Wontroba wrote:
> >>> At work, amongst my stable of old computers running FreeBSD, I have a
> >>> Fujitsu M800 - a 4 Xeon SMP processor with 4 GB of memory. This
> >>> primarily runs Nagios and a small and lightly used MySQL database, along
> >>> with a few inbound FTP transfers per minute. It has a Mylex card based
> >>> disc subsystem, ruling out crash dumps.
> >>>
> >>> At some point during 5.5-STABLE this machine started to occasionally
> >>> hang ...
> >> Another 6-STABLE (cvsupped on 27/03/07) example, with diagnostics taken
> >> rather sooner after the hang.  Processes with wmesg=ufs feature often in
> >> the ps output.
> >>
> >> http://www.stade.co.uk/crash1/
> > 
> > I would suspect the mlx controller. There are several processes (for
> > instance, 988, 50918) waiting for completion of a block read, and the
> > processes in the "ufs" state are the result of the lock cascade, IMHO.
> 
> I'm not very sure if this is specific to one disk controller.  Actually
> I got some occasional reports about similar hangs on amd64 6.2-RELEASE
> (slightly patched version) in which most processes were stuck in the
> 'ufs' state under very light load; the box was equipped with amr(4) RAID.
> 
> I was not able to reproduce the problem in my lab, though; it's still
> unknown how to trigger the livelock :-(  I still need to investigate
> further on their production system.

I reported a similar issue for FreeBSD 6.2 in the audit trail for kern/104406:

http://www.freebsd.org/cgi/query-pr.cgi?pr=104406&cat=

and there should be a thread related to this. Briefly, I suspect that this is
related to the nullfs filesystems on my server: after I cvsupped to FreeBSD
6.2-STABLE with Daichi's unionfs-related patches and replaced the
nullfs-mounted filesystems with unionfs-mounted ones (done on 10.03.07), the
problem went away (or so it seems, at least).


More information about the freebsd-stable mailing list