PLEASE TEST: IPI deadlock avoidance patch
dwhite at gumbysoft.com
Thu Aug 26 11:18:34 PDT 2004
On Thu, 26 Aug 2004, Craig Boston wrote:
> On Sun, Aug 22, 2004 at 12:05:39PM -0700, Doug White wrote:
> > If you have a reasonably fast i386 or amd64 multiprocessor and/or
> > hyperthreading machine and are experiencing reproducible hangs during -j
> > buildworlds and other highly parallel operations, please try this patch:
> Just a follow-up to my off-list message and another data point, with
> this patch I no longer get deadlocks, however I now get random data
Okay, for those of you experiencing the data corruption issue, I need to
know the following:
. cvsup date & time for the affected kernel(s)
. branch you're tracking
. revision of src/sys/kern/kern_lock.c - I'm checking for a specific set
of commits here
. reproduction case - applications involved and detailed description of
the operation(s) involved.
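For the kern_lock.c revision, something along these lines may save some digging. It is only a sketch: rcs_revision is a made-up helper name, and it assumes the source tree lives in /usr/src (override with SRC) and that the file still carries its expanded $FreeBSD$ tag. The cvsup date and branch still have to come from your own records.

```shell
#!/bin/sh
# Sketch only: rcs_revision is a hypothetical helper, not a system tool.
# It prints the revision number found in a file's expanded $FreeBSD$ tag,
# or "unknown" if the file is unreadable or carries no such tag.
rcs_revision() {
    if [ ! -r "$1" ]; then
        echo "unknown"
        return 0
    fi
    # Tag layout assumed: $FreeBSD: <path>,v <revision> <date> <time> <author> ...
    rev=$(sed -n 's/.*\$FreeBSD: [^ ]*,v \([0-9.]*\) .*/\1/p' "$1" | head -n 1)
    echo "${rev:-unknown}"
}

# Assumption: sources are in /usr/src unless SRC says otherwise.
SRC=${SRC:-/usr/src}

echo "running kernel:       $(uname -v)"
echo "kern_lock.c revision: $(rcs_revision "$SRC/sys/kern/kern_lock.c")"
```

If your tree is elsewhere, run it as SRC=/path/to/src sh script.sh and paste the two lines of output into your report.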
It would also be nice if you could set up a serial console and attempt to
break into the debugger with an NMI, if your system is so equipped. You'll
want to set these sysctls beforehand:
That should prevent the usual suspects from disrupting your entry to ddb.
This usually works for me for getting into ddb in the IPI deadlock
If you are tracking RELENG_5, be aware the patch is NOT committed there,
and cvsup will happily obliterate the changed files on next run. So be
sure to reapply the patch after cvsup until the patch is merged, which
should be Real Soon Now.
> Disabling the second processor or falling back to an older kernel (one
> from before the IPI hangs started) both fix the problem.
My guess here is that another change, one that was masked by the IPI
problems, is causing this, and getting SMP usable again has brought
it into the light.
Doug White | FreeBSD: The Power to Serve
dwhite at gumbysoft.com | www.FreeBSD.org