random FreeBSD panics
John Baldwin
jhb at freebsd.org
Mon Mar 29 18:28:53 UTC 2010
On Monday 29 March 2010 1:30:38 pm Jeremy Chadwick wrote:
> On Mon, Mar 29, 2010 at 05:01:02PM +0000, Masoom Shaikh wrote:
> > On Sun, Mar 28, 2010 at 5:38 PM, Ivan Voras <ivoras at freebsd.org> wrote:
> > > On 28 March 2010 16:42, Masoom Shaikh <masoom.shaikh at gmail.com> wrote:
> > >
> > >> Let's assume this is a h/w problem; then how can other OSes overcome
> > >> this? Is there a way to make FreeBSD ignore this as well, even if it
> > >> results in a reasonable performance penalty?
> > >
> > > Very probably, if only we could detect where the problem is.
> > > Try adding "options PRINTF_BUFR_SIZE=128" to the kernel
> >
> > this option is already there
>
> The key word in Ivan's phrase is "less mangled". Neither using nor
> increasing PRINTF_BUFR_SIZE solves the problem of interspersed console
> output. I've been ranting/raving about this problem for years now; it
> truly looks like a mutex lock issue (or lack of such lock), but I've
> been told numerous times that isn't the case.
>
> To developers: what incentives would help get this issue well-needed
> attention? This problem makes kernel debugging, panic analysis, and
> other console-oriented viewing basically impossible.
I was recently going to look at it. The somewhat drastic approach I was going
to take was to add a simple serializing lock around trap_fatal() and a few
other places that do similar block prints (e.g. mca_log()). One of the issues
with fixing this in printf itself is that you'd probably want to
serialize complete lines of text on a per-thread basis. You would want to be
able to accumulate this line of text across multiple calls to printf (think of
it as line-buffering ala stdio). However, some folks may be nervous about
printf not printing things immediately.
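
To make the line-buffering idea concrete, here is a rough userland sketch, not kernel code and not a proposed patch. Every name in it (lb_printf, lb_flush, console_out, LB_SIZE) is made up for illustration, the atomic_flag spin loop merely stands in for whatever kernel lock would be used, and a memory buffer stands in for the console. Each thread accumulates text across calls and only takes the lock to emit a finished line:

```c
#include <assert.h>
#include <stdarg.h>
#include <stdatomic.h>
#include <stdio.h>
#include <string.h>

#define LB_SIZE 128	/* hypothetical; echoes PRINTF_BUFR_SIZE=128 */

/* Stand-in for the console: completed lines are appended here. */
char console_out[1024];
size_t console_len;

/* Stand-in for a kernel spin mutex guarding the console. */
atomic_flag console_lock = ATOMIC_FLAG_INIT;

/* Per-thread partial line, accumulated across calls. */
static _Thread_local char lb_buf[LB_SIZE];
static _Thread_local size_t lb_len;

/* Emit the calling thread's pending text as one uninterrupted unit. */
void
lb_flush(void)
{
	assert(lb_len <= sizeof(lb_buf));
	if (lb_len == 0)
		return;
	while (atomic_flag_test_and_set_explicit(&console_lock,
	    memory_order_acquire))
		;	/* spin until the console is free */
	/* Text that would overflow the demo console is silently dropped. */
	if (console_len + lb_len < sizeof(console_out)) {
		memcpy(console_out + console_len, lb_buf, lb_len);
		console_len += lb_len;
		console_out[console_len] = '\0';
	}
	atomic_flag_clear_explicit(&console_lock, memory_order_release);
	lb_len = 0;
}

/* printf-like front end: buffers until '\n' or until the buffer fills. */
int
lb_printf(const char *fmt, ...)
{
	char tmp[LB_SIZE];
	va_list ap;
	int n;

	va_start(ap, fmt);
	n = vsnprintf(tmp, sizeof(tmp), fmt, ap);
	va_end(ap);
	if (n < 0)
		return (n);
	if ((size_t)n >= sizeof(tmp))
		n = (int)sizeof(tmp) - 1;	/* output was truncated */
	for (int i = 0; i < n; i++) {
		lb_buf[lb_len++] = tmp[i];
		if (tmp[i] == '\n' || lb_len == sizeof(lb_buf))
			lb_flush();
	}
	return (n);
}
```

With this, lb_printf("cpu%d: ", 1) followed by lb_printf("fatal trap %d\n", 12) reaches the console as a single line. It also shows the cost: text without a trailing newline sits in the per-thread buffer until someone calls lb_flush(), which is exactly the "not printing things immediately" worry.
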
The other issue is that lots of code assumes it can call printf from anywhere
and everywhere. Mostly this just means that if you add locking and line-
buffering to printf(9) you have to be very careful to make sure it works in
odd places. Probably a lot of this could be solved by deferring things like
trap_fatal() until panic() has already been called (which is bde's preferred
solution I think).
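
For comparison, the simpler serializing-lock approach for block prints might look roughly like the userland sketch below. Again this is only an illustration under assumptions: blockprint_enter(), blockprint_exit(), con_printf() and fatal_report() are invented names, the message text is not the real trap_fatal() output, and an atomic_flag spin loop stands in for a kernel spin mutex. The individual prints stay unlocked and immediate; only the multi-line report is bracketed:

```c
#include <assert.h>
#include <stdarg.h>
#include <stdatomic.h>
#include <stdio.h>
#include <string.h>

/* Stand-in for the console device. */
char con_buf[1024];
size_t con_len;

/* Stand-in for a kernel spin mutex; the name is hypothetical. */
atomic_flag blockprint_lock = ATOMIC_FLAG_INIT;

void
blockprint_enter(void)
{
	while (atomic_flag_test_and_set_explicit(&blockprint_lock,
	    memory_order_acquire))
		;	/* spin until no other CPU is mid-report */
}

void
blockprint_exit(void)
{
	atomic_flag_clear_explicit(&blockprint_lock, memory_order_release);
}

/* Unlocked, immediate print, playing the role of plain printf(9). */
int
con_printf(const char *fmt, ...)
{
	char tmp[128];
	va_list ap;
	int n;

	va_start(ap, fmt);
	n = vsnprintf(tmp, sizeof(tmp), fmt, ap);
	va_end(ap);
	if (n < 0)
		return (n);
	if ((size_t)n >= sizeof(tmp))
		n = (int)sizeof(tmp) - 1;	/* output was truncated */
	assert(con_len < sizeof(con_buf));
	/* Text that would overflow the demo console is silently dropped. */
	if (con_len + (size_t)n < sizeof(con_buf)) {
		memcpy(con_buf + con_len, tmp, (size_t)n + 1);	/* with NUL */
		con_len += (size_t)n;
	}
	return (n);
}

/* Hypothetical stand-in for a multi-line report such as trap_fatal(). */
void
fatal_report(int cpu, int trapno, unsigned long addr)
{
	blockprint_enter();
	con_printf("Fatal trap %d on cpu %d\n", trapno, cpu);
	con_printf("fault virtual address = 0x%lx\n", addr);
	blockprint_exit();
}
```

Note that the sketch ignores the hard part: if a fault happens while the lock is already held by the same CPU (say, a trap inside the report itself), the spin loop never returns. Any real version would need to deal with that recursion, which is one of the "odd places" mentioned above.
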
--
John Baldwin