kern/104406: [ufs] Processes get stuck in "ufs" state under persistent CPU load
Kris Kennaway
kris at obsecurity.org
Wed Jun 6 20:33:40 UTC 2007
On Tue, Jun 05, 2007 at 08:50:10PM +0000, Jeffrey D. Wheelhouse wrote:
> The following reply was made to PR kern/104406; it has been noted by GNATS.
>
> From: "Jeffrey D. Wheelhouse" <jdw at wheelhouse.org>
> To: bug-followup at FreeBSD.org
> Cc:
> Subject: Re: kern/104406: [ufs] Processes get stuck in "ufs" state under persistent
> CPU load
> Date: Tue, 05 Jun 2007 16:26:26 -0400
>
> I believe we have also experienced this bug (or a very similar one) on
> our 8-core amd64 systems under 6.2-RELEASE-p4.
>
> In our case, "top" shows that the system is 100% CPU utilized, with the
> vast majority of it as "system" time. (Ordinarily the system
>
> In the last case, we ended up with about 200 Apache processes that
> looked like this in ps:
>
> UID PID PPID CPU PRI NI VSZ RSS MWCHAN STAT TT TIME COMMAND
> 25000 27121 26860 1977 -4 5 146324 33732 ufs DN ?? 0:03.75 httpd
> 25000 27147 37257 1994 -4 5 153748 29280 ufs DN ?? 0:03.72 httpd
> 25000 27157 36912 1805 -4 5 150756 26592 ufs DN ?? 0:02.91 httpd
> 25000 27224 27030 1845 -4 5 137536 24804 ufs DN ?? 0:01.25 httpd
> 25000 27274 26794 1829 -4 5 148140 35416 ufs DN ?? 0:02.90 httpd
>
> Once a process gets "stuck" in WCHAN ufs, it's blocked indefinitely, as
> described here, or at least so slow as to be indistinguishable from
> stuck. (Typical wait channels for our httpds are accept or kqread, as
> one would expect.)
>
> Each process in this state counts against the load average, so we often
> see load averages north of 200 when this is occurring. (Typical load
> average is below 2.)
>
> Kill enough processes (or possibly enough to hit the "right" process)
> and everything picks up again right where it left off.
>
> I also have no idea how to debug this.
See the Developers' Handbook.
Kris