_mtx_lock_spin: obsolete historic handling of kdb_active and panicstr?

Wed Oct 17 12:07:03 UTC 2012

On Wed, Oct 17, 2012 at 4:20 AM, Andriy Gapon <avg at freebsd.org> wrote:
>
> _mtx_lock_spin has the following check in its retry loop:
> if (i < 60000000 || kdb_active || panicstr != NULL)
>         DELAY(1);
> else
>         _mtx_lock_spin_failed(m);
>
[snip analysis]
>
> So I'd like to propose to remove those checks altogether.  Or perhaps to
> "reverse" them and immediately induce a (possibly secondary) panic if we ever
> get to that wait loop and kdb_active || panicstr != NULL.

The panicstr can clearly be removed.  I think there can be race
conditions with entering kdb and taking a spinlock, because the
spinlock acquire will block interrupts.  I don't remember if we always
NMI for kdb enter or if that's configurable.  The old code was clearer
(or maybe I'm just remembering an Isilon hack); looking at
stop_cpus_hard() I don't see that it uses an NMI.  So a CPU can block
interrupts, then if it sees kdb_active it will spin until we leave
kdb, rather than panic.  Of course this would only be relevant if the
CPU it's trying to acquire is already held; otherwise it should find
the lock unowned and this isn't relevant.  And if the lock is owned by
the thread entering kdb, that would be a real panic, not a recoverable
kdb entry.

So I think maybe the kdb_active check is also not helpful after all.

Cheers,
matthew