KDB entry on NMI

Konstantin Belousov kostikbel at gmail.com
Sat Jul 19 18:29:20 UTC 2014

On Sat, Jul 19, 2014 at 10:58:18AM -0700, Marcel Moolenaar wrote:
> On Jul 18, 2014, at 9:07 AM, Konstantin Belousov <kostikbel at gmail.com> wrote:
> > It was mentioned somewhere recently that a typical BIOS today configures
> > NMI delivery on hardware events as broadcast.  When I developed
> > the dmar(4) busdma backend, I indeed hit this problem, and wrote a
> > prototype which avoids starting ddb on all cores.  Instead, the patch
> > implements a custom spinlock which allows only one core to win; the other
> > cores ignore the NMI by spinning on the lock.
> > 
> > The issue which I see on at least two different machines with different
> > Intel chipsets is that the NMI is somehow sticky, i.e. it is re-delivered
> > after the handler executes iret.  I am not sure what the problem is,
> > whether the hardware needs some ACK, or there is a bug in the code.
> > 
> > Anyway, even on a two-core machine, having both cores simultaneously
> > enter NMI makes the use of ddb impossible, so I believe the patch is
> > an improvement.  I took measures to ensure that reboot from the ddb
> > prompt works.
> > 
> > Thoughts?
> One may call kdb_enter on different CPUs at the same time and it's
> also possible to call panic on multiple CPUs at the same time (but
> we serialize panic() right now). What if we let kdb_enter et al. deal
> with concurrency, instead of doing it specifically for NMIs?
Then, on an 80-thread machine I get 80 ddb sessions on an NMI broadcast,
like now.  With your proposal, it will be somewhat better, since the
sessions are serialized, so I can do the reboot from the first one.

Still, I hope to understand what I am missing to stop the NMI from
being delivered in a loop.  Then, having only one ddb entry would mean
that I should return only once.
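
To make the "only one core wins" idea concrete, here is a minimal
userland sketch using C11 atomics.  It is not the patch itself: the
names nmi_owner, nmi_try_claim(), nmi_wait_release() and nmi_release()
are hypothetical, and a real handler would run in NMI context with
kernel primitives instead of <stdatomic.h>.

```c
#include <stdatomic.h>
#include <stdbool.h>

#define	NMI_NO_OWNER	(-1)

/* Hypothetical: id of the CPU that owns the ddb session, if any. */
static atomic_int nmi_owner = NMI_NO_OWNER;

/*
 * Called by every CPU that receives the broadcast NMI.  Exactly one
 * caller swaps NMI_NO_OWNER for its own id and gets true back; that
 * CPU enters the debugger.  All other callers get false.
 */
bool
nmi_try_claim(int cpu)
{
	int expected = NMI_NO_OWNER;

	return (atomic_compare_exchange_strong(&nmi_owner, &expected, cpu));
}

/* Losing CPUs ignore the NMI by spinning until the winner is done. */
void
nmi_wait_release(void)
{
	while (atomic_load(&nmi_owner) != NMI_NO_OWNER)
		;	/* spin */
}

/* Only the current owner can release; other callers have no effect. */
void
nmi_release(int cpu)
{
	int expected = cpu;

	atomic_compare_exchange_strong(&nmi_owner, &expected, NMI_NO_OWNER);
}
```

Each CPU would call nmi_try_claim() on NMI entry; the winner runs the
debugger and then calls nmi_release(), while the losers sit in
nmi_wait_release() and return from the handler afterwards.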

> Also: we may want to do something else than going to the debugger
> when we see an NMI. More complexity in the NMI handler and specific
> to entering the debugger seems to move us away from doing other
> things more easily.
I agree there.

> Aside: I've always wanted to have the ability to have the kernel
> debugger switch to a different CPU so that you can create DDB
> commands that dump hardware resources like TLBs, etc. To support
> this, you want the KDB layer to have good CPU handling, which
> possibly makes it also a good place to handle concurrent entry
> into the debugger from different CPUs.

Me too.  I have another half-finished patch which does this; it allows
migrating ddb from one cpu to another.  It worked by signalling the
destination cpu that it should activate, while the source cpu starts
spinning.  I do not remember exactly which problems were left unresolved.
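
The handoff described above can be sketched roughly as follows, again
in userland C11 and not taken from the half-finished patch.  The names
kdb_active_cpu, kdb_is_active(), kdb_migrate() and kdb_park() are
hypothetical, and the sketch assumes CPU 0 initially holds the session.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical: id of the CPU currently running the debugger loop. */
static atomic_int kdb_active_cpu = 0;

/* True if the given CPU is the one that should run the debugger. */
bool
kdb_is_active(int cpu)
{
	return (atomic_load(&kdb_active_cpu) == cpu);
}

/*
 * Hand the session from src to dst.  Only the currently active CPU
 * can migrate; a request from any other CPU fails and changes nothing.
 */
bool
kdb_migrate(int src, int dst)
{
	int expected = src;

	return (atomic_compare_exchange_strong(&kdb_active_cpu, &expected,
	    dst));
}

/*
 * Parked CPUs, including the source after a successful migrate, spin
 * here until the session is handed to them.
 */
void
kdb_park(int cpu)
{
	while (!kdb_is_active(cpu))
		;	/* spin */
}
```

The source cpu would call kdb_migrate() and then kdb_park() on itself;
the destination leaves kdb_park() as soon as it sees its own id and
starts the debugger loop with access to its CPU-local state.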

I needed this because some state is CPU-local, cannot be accessed
from other cores, and is not saved in the pcb.  I definitely looked at
the EFER and MISC_FEATURES MSRs, and the local APIC state.