svn commit: r225372 - head/sys/kern
brde at optusnet.com.au
Tue Oct 4 16:52:18 UTC 2011
On Tue, 4 Oct 2011, Attilio Rao wrote:
> 2011/10/3 Bruce Evans <brde at optusnet.com.au>:
>> On Mon, 26 Sep 2011, Attilio Rao wrote:
>>> 2011/9/4 Bruce Evans <brde at optusnet.com.au>:
>>>> On Sun, 4 Sep 2011, Attilio Rao wrote:
>>>>> Also please notice that intr enable/disable happens in the wrong way
>>>>> as it is done via the MD (x86 specific likely) interface. This is
>>>>> wrong for 2 reasons:
>>>> No, intr_disable() is MI. Â It is also used by witness. Â disable_intr()
>>>> is the corresponding x86 interface that you may be thinking of. Â The MI
>>>> interface intr_disable() was introduced to avoid the MD'ness of
>>> I was a bit surprised to verify that you are right but
>>> spinlock_enter() has the big difference besides disable_intr() of also
>>> explicitly disabling preemption via critical_enter() which some
>>> codepath can trigger without even noticing it.
>>> This means it is more safer in presence of PREEMPTION option on and
>>> thus should be preferred to the normal intr_disable(), in particular
>>> for convoluted codepaths.
>> I think this is another implementation detail which shouldn't be depended
>> on. Â Spinlocks may or may not need either interrupts disabled or a critical
>> section to work. Â Now I'm a little surprised to remember that they use a
>> critical section. Â This is to prevent context switching. Â It is useful
>> behaviour, but not strictly necessary.
>> Since disabling interrupts also prevents context switching (excep by buggy
>> trap handlers including NMI), it is safe to use hard interrupt disabling
>> instead of critical_enter() to prevent context switching. Â This is safe
>> because code that has interrupts disabled cannot wander off into other
>> code that doesn't understand this and does context switching! (unless it
>> is broken). Â But for preventing context switching, critical_enter() is
>> better now that it doesn't hard-disable interrupts internally.
> This is not entirely correct, infact you may have preemption even with
> interrupts disabled by calling code that schedule threads. This is why
> spinlock_enter() disables interrupts _and_ preemption altogether.
> Look for example at hardclock() and its internal magic (this is what I
> meant, earlier, with "non convoluted codepaths").
That is a bug in -current. As I said, only broken code can wander off
into other code that doesn't understand the caller's context. This is
one of the things that prevents hardclock() being a non-broken fast
interrupt handler. hardclock() wants to call scheduling code, but non-
broken fast interrupt handlers can't do that.
>> By un-inlining (un-macroizing) mtx_lock_spin(), but inlining
>> critical_enter(), I get the same number of function calls but much smaller
>> code since it is the tiny critical_enter() function and not the big
>> mtx_lock_spin() one that is inlined.
> I'm not entirely sure I follow.
> In -CURRENT, right now mtx_lock_spin() just yields
> _mtx_lock_spin_flags() which is not inlined.
That is the debugging version. <sys/mutex.h> is obfuscated as follows:
- mtx_lock_spin(m) is mtx_lock_spin_flags((m), 0)
- if LOCK_DEBUG > 0 or defined(MUTEX_NOINLINE)
mtx_lock_spin_flags((m), 0) is _mtx_lock_spin_flags((m), curthread, (0),
This gives the version that you described.
if LOCK_DEBUG > 0
LOCK_FILE is __FILE__ and LOCK_LINE is __LINE__
LOCK_FILE is NULL and LOCK_LINE is 0
mtx_lock_spin_flags((m), 0) is __mtx_lock_spin((m), curthread, (0),
This gives the version that I described.
__mtx_lock_spin() is passed a dummy LOCK_FILE and LOCK_LINE although
it doesn't use them in this case. It is convoluted so that it can
be used both in this case and in the non-inlined case, where it is
expanded in _mtx_lock_spin_flags(), where it is passed __FILE__
and __LINE__ iff LOCK_DEBUG > 0. But this reuse is not so good since
it gives a further obfuscations:
__mtx_lock_spin() takes flags args but doesn't have `flags' in its
name like some other mtx functions.
In the non-inlined case:
mtx_lock_spin(...) is _mtx_lock_spin_flags() as described above
_mtx_lock_spin_flags(...) invokes __mtx_lock_spin(...) as desc. above
__mtx_lock_spin(...) invokes _mtx_lock_spin(...)
The macro obfuscations end at this point -- _mtx_lock_spin() is
always a function.
_mtx_lock_spin(), like __mtx_lock_spin(), takes flags args but
doesn't have `flags' in its name.
> So the improvement here is just to have inlined critical_enter()? How
> this can lead to smaller code?
This should be obvious now. Another detail is that my mtx_lock_spin()
takes only 1 arg (this is the normal API). It doesn't need to support
passing (td, opt, file, line). critical_enter() takes no args at all
(td = curthread is implicit for it, as it is for mtx_lock_spin()). So
mtx_lock_spin(mp) calls expand to slightly more object code than
critical_enter() calls. In -current for the inlined case, mtx_lock_spin()
expands to critical_enter() plus about 40 bytes (on i386) for the atomic
op on the mutex followed by the _mtx_lock_function call.
>> The complications are mainly in critical_exit():
>> - in my version, when td_critnest is decremented to 0, a MD function is
>> Â called to "unpend" any pending interrupts that have accumulated while
>> Â in the critical region. Â Since mtx_lock_spin() doesn't hard-disable
>> Â interrupts, they may occur when a spinlock is held. Â Fast interrupts
>> Â proceed. Â Others are blocked until critical_exit() unpends them.
>> Â This includes software interrupts. Â The only really complicated part
>> Â is letting fast interrupts proceed. Â Fast interrupt handlers cannot
>> Â use or be blocked by any normal locking, since they don't respect
>> Â normal spinlocks. Â So for example, hardclock() cannot be a fast interrupt
>> Â handler.
> I don't think this is a good idea.
> We hardly rely on interrupts disabling during spinlock helding in
> order to get them be used by fast handlers and then avoid deadlocks
> with code running in interrupt/kernel context.
I think you mean "We strongly rely on...". -current certainly relies on
this. IMO this is mostly a bug. There is the implementation detail that
spinlocks hard-disable interrupts to avoid a deadlock problem in the UP
case. Too much code has come to depend on this IMO.
>>> [register_t intr_disable() interface not being quite NMI]
>>> I mostly agree, I think we should have an MD specified type to replace
>>> register_t for this (it could alias it, if it is suitable, but this
>>> interface smells a lot like x86-centric).
>> Really vax-centric. Â spl "levels" are from vax or earlier CPUs. Â x86
>> doesn't really have levels (the AT PIC has masks and precedences. Â The
>> precedences correspond to levels are but rarely depended on or programmed
>> specifically). Â alpha and sparc seem to have levels much closer to
>> With only levels, even an 8-bit interface for the level is enough (255
>> levels should be enough for anyone). Â With masks, even a 64-bit interface
>> for the mask might not be enough. Â When masks were mapped to levels
>> for FreeBSD on i386, the largish set of possible mask values was mapped
>> into < 8 standard levels (tty, net, bio, etc). Â Related encoding of
>> MD details as cookies would probably work well enough in general.
> For cookie you just mean a void * ptr?
Either a pointer, or an integer that indexes a table of pointers, or an
an integer that encodes all the info in the integer's bits, possibly
with an identity encoding.
More information about the svn-src-all