Understanding the FreeBSD locking mechanism
Yubin Ruan
ablacktshirt at gmail.com
Wed Apr 12 02:32:29 UTC 2017
On 2017-04-12 07:11, Chris Torek wrote:
>> The difference between the "ithread" and "interrupt filter" things
>> is that ithread has its own thread context, while interrupt handling
>> through interrupt filter shares the same kernel stack.
>
> Right -- though rather than "the same" I would just say "shares
> a stack", i.e., we're not concerned with *whose* stack and/or
> thread we're borrowing, just that we have one borrowed.
>
>> So, for ithread, we should use MTX_DEF, which doesn't disable
>> interrupts, and for "interrupt filter", we should use MTX_SPIN, which
>> disables interrupts.
>
> Right.
>
>> What really confuses me is that I don't really see how owning an
>> "independent" thread context (i.e., ithread) makes a thread run in the
>> "top-half" and how sharing the same kernel stack makes a thread run in
>> the "bottom-half".
>
> It's not that it *makes* it run that way, it's that it *allows* it
> to run that way -- and then the scheduler *does* run it that way.
>
>> I did read your long explanation in the previous mail. For the non-SMP
>> case, the "top-half/bottom-half" model works well and I understand how
>> the *code* path/*data* path things go. But I still cannot fully
>> understand the model for the SMP case.
>
> It's fundamentally fairly tricky, but we start with that same first
> notion:
>
> * If you have your own state (i.e., stack), you can be suspended
> (stopped in the scheduler, giving the CPU to other threads):
> *your* (private) state is preserved on *your* (private) stack.
>
> * If you have borrowed someone else's state, anything that suspends
> you, suspends them too. Since this may deadlock, you are not
> allowed to do it at all.
Clear. How can I distinguish these two conditions? I mean, how do I tell
whether I am using my own state/stack or borrowing someone else's?
> Once we block interrupts locally (as for MTX_SPIN, or
> automatically inside a filter style or "bottom half" interrupt),
> we are in a special state: we may not take *any* MTX_DEF locks at
> all (the kernel should panic if we do).
>
> This in turn means that data structures are protected *either* by
> a spin mutex *or* by a default (non-spin) mutex, never both. So
> if you need to touch a spin-mutex data structure from thread-y
> ("top half") code, you obtain the spin mutex, and now no interrupts
> will occur *on this CPU*, and as a key side effect, you won't move
> *off* this CPU either. If an interrupt occurs on another CPU and
> it goes to take the spin lock that protects that data, it loops
> at that point, not switching tasks, waiting for the MTX_SPIN mutex
> to be released:
>
>             CPU 1               |            CPU 2
>     ----------------------------|-----------------------------
>     func() {                    | ... code not involving mtx
>         mtx_lock_spin(&mtx);    | ...
>         do some work            | mtx_lock_spin(&mtx); /* loops */
>              .                  |   [stuck]
>              .                  |   [stuck]
>              .                  |   [stuck]
>         mtx_unlock_spin(&mtx);  |   [unstuck]
>         ...                     | do some work
>
> If an interrupt occurs on CPU 2, and that interrupt-handling code
> wants to touch the data protected by the spin lock, that code
> obtains the spin lock as usual. Meanwhile the interrupt *cannot*
> occur on CPU 1, as holding the spin lock has blocked interrupts.
> So the code path on CPU 2 blocks -- looping in mtx_lock_spin(),
> not giving CPU 2 over to the scheduler -- for as long as CPU 1
> holds the spin lock. The corresponding code path is already
> blocked on CPU 1, the same way it was back in the non-SMP, single-
> CPU days.
Things become clearer now. Thanks for your reply.
If I understand correctly, which kind of lock to use depends on
which interrupt-handling model (i.e., "interrupt filter" or "ithread")
the code runs under. Before taking a lock, I must know which model I am
in; otherwise the interrupt-handling code might deadlock or trigger a
kernel panic. The problem is, how can I tell which model I am
in? I am not so clear about the thread model and scheduling
code of FreeBSD...
> This means it is unwise to hold spin locks for long periods. In
> fact, if CPU 2 waits too long in that [stuck] section, it will
> panic, on the assumption that CPU 1 has done something terrible
> and the system is now hung.
>
> This is also what gives rise to the constraint that you must take
> MTX_SPIN locks "inside" any outer MTX_DEF locks.
What do you mean by "must take MTX_SPIN locks 'inside' any outer
MTX_DEF locks"?
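One possible reading of "inside", sketched in userspace: when both kinds of lock are needed, the sleepable (MTX_DEF-style) lock is acquired first and the spin lock is acquired and released while the outer lock is still held. This is an illustration under assumptions, not kernel code: a pthread mutex plays the MTX_DEF role, a C11 atomic_flag plays the MTX_SPIN role, and the names (update_both, sleep_lock, spin_flag) are hypothetical. In the kernel the corresponding calls would be mtx_lock(), mtx_lock_spin(), mtx_unlock_spin(), mtx_unlock(), in that order.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

static pthread_mutex_t sleep_lock = PTHREAD_MUTEX_INITIALIZER; /* plays the MTX_DEF role  */
static atomic_flag spin_flag = ATOMIC_FLAG_INIT;               /* plays the MTX_SPIN role */

static int sleepable_state;   /* protected by sleep_lock only */
static int spin_state;        /* protected by spin_flag only  */

/* Update both pieces of state, nesting the spin lock inside the sleepable lock. */
int update_both(int value)
{
    int result;
    int rc;

    rc = pthread_mutex_lock(&sleep_lock);    /* outer: may block, so take it first */
    assert(rc == 0);
    (void)rc;
    sleepable_state = value;

    while (atomic_flag_test_and_set_explicit(&spin_flag, memory_order_acquire))
        ;                                    /* inner: taken while holding the outer lock */
    spin_state = sleepable_state + 1;
    result = spin_state;
    atomic_flag_clear_explicit(&spin_flag, memory_order_release);  /* release inner first */

    pthread_mutex_unlock(&sleep_lock);       /* release outer last */
    return result;
}
```

The reverse nesting, trying to take the sleepable lock while the spin lock is held, is the case the quoted text says is not allowed: with interrupts blocked, no MTX_DEF lock may be taken at all.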
Regards,
Yubin Ruan