lock up in 6.2 (procs massively stuck in Giant)

Wed May 13 16:52:32 UTC 2009

On Wednesday 13 May 2009 11:41:22 am pluknet wrote:
> 2009/5/13 John Baldwin <jhb at freebsd.org>:
> > On Wednesday 13 May 2009 2:40:33 am pluknet wrote:
> >> 2009/5/13 pluknet <pluknet at gmail.com>:
> >> > 2009/5/13 John Baldwin <jhb at freebsd.org>:
> >> >> On Tuesday 12 May 2009 4:59:19 pm pluknet wrote:
> >> >>> Hi.
> >> >>>
> >> >>> From just another box (not from the first two mentioned earlier)
> >> >>> with a similar locking issue. If it would make sense, since there are
> >> >>> possibly a bit different conditions.
> >> >>> clock proc here is on swi4, I hope it's a non-important difference.
> >> >>>
> >> >>>    18     0     0     0  LL     *Giant    0xd0a6b140 [swi4: clock sio]
> >> >>> db> bt 18
> >> >>
> >> >> Ok, this is a known issue in 6.x.  It is fixed in 6.4.
> >> >>
> >>
> >> Looking at the face of kern_timeout.c I suspect that was fixed in r181012.
> >
> > No, this particular issue is fixed by a change to sched_4bsd.c in r179975.
> >
> 
> Gah.. We are constrained to use the ULE scheduler on 6.x (yes, I know that
> "it's known to be broken (c)"), since we have had very bad interactivity
> with 4BSD under our workload. Ok, that's just another reason to move to 7.x.

Hmmm, I would have thought ULE wouldn't have suffered from this bug.  The
problem on 4BSD arose when softclock blocked on Giant while the thread that
held Giant was on a run queue and pinned to a specific CPU, and another
userland thread was already running on that CPU.  The userland thread would
never yield the CPU so long as it kept busy, since the round-robin timeout
would never run.
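
The circular dependency described above can be illustrated with a small toy
model.  This is only a sketch under simplifying assumptions, not code from
sched_4bsd.c: it models a single CPU, treats one loop iteration as one
scheduling quantum, and the names simulate, softclock_wants_giant and
holder_work_left are hypothetical.

```python
# Toy model only: every name here is illustrative, not FreeBSD kernel code.

def simulate(ticks, softclock_wants_giant):
    """Return (history, giant_held) for one CPU after `ticks` quanta.

    "user" is a busy userland thread that never sleeps or yields.
    "holder" holds Giant, is runnable, and is pinned to this CPU.
    The round-robin switch is assumed to be driven by softclock, so it
    cannot fire while softclock is blocked on Giant.
    """
    running, waiting = "user", "holder"
    giant_held = True
    holder_work_left = 2  # quanta the holder needs before it drops Giant
    history = []
    for _ in range(ticks):
        history.append(running)
        if running == "holder" and giant_held:
            holder_work_left -= 1
            if holder_work_left == 0:
                giant_held = False  # holder finished and released Giant
        # softclock is stuck only if it needs Giant while Giant is held.
        softclock_blocked = softclock_wants_giant and giant_held
        if not softclock_blocked:
            # The round-robin timeout runs and forces a context switch.
            running, waiting = waiting, running
    return history, giant_held
```

With softclock_wants_giant=True the model never leaves the "user" thread and
Giant is never released, which is the lockup pattern; with False the two
threads alternate and the holder eventually drops Giant.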

-- 
John Baldwin