proposed smp_rendezvous change

Attilio Rao attilio at freebsd.org
Mon May 16 22:22:39 UTC 2011


2011/5/16 Max Laier <max at love2party.net>:
> On Monday 16 May 2011 16:46:03 John Baldwin wrote:
>> On Monday, May 16, 2011 4:30:44 pm Max Laier wrote:
>> > On Monday 16 May 2011 14:21:27 John Baldwin wrote:
>> > > Yes, we need to fix that.  Humm, it doesn't preempt when you do a
>> > > critical_exit() though?  Or do you use a hand-rolled critical exit that
>> > > doesn't do a deferred preemption?
>> >
>> > Right now I just did a manual td_critnest++/--, but I guess ...
>>
>> Ah, ok, so you would "lose" a preemption.  That's not really ideal.
>>
>> > > Actually, I'm curious how the spin unlock inside the IPI could yield
>> > > the CPU.  Oh, is rmlock doing a wakeup inside the IPI handler?  I
>> > > guess that is ok as long as the critical_exit() just defers the
>> > > preemption to the end of the IPI handler.
>> >
>> > ... the earliest point where it is safe to preempt is after doing the
>> >
>> >    atomic_add_int(&smp_rv_waiters[2], 1);
>> >
>> > so that we can start other IPIs again.  However, since we don't accept
>> > new IPIs until we signal EOI in the MD code (on amd64), this might still
>> > not be a good place to do the yield?!?
>>
>> Hmm, yeah, you would want to do the EOI before you yield.  However, we
>> could actually move the EOI up before calling the MI code so long as we
>> leave interrupts disabled for the duration of the handler (which we do).
>>
>> > The spin unlock boils down to a critical_exit() and unless we did a
>> > critical_enter() at some point during the redenvouz setup, we will
>> > yield() if we owepreempt.  I'm not quite sure how that can happen, but
>> > it seems like there is a path that allows the scheduler to set it from a
>> > foreign CPU.
>>
>> No, it is only set on curthread by curthread.  This is actually my main
>> question.  I've no idea how this could happen unless the rmlock code is
>> actually triggering a wakeup or sched_add() in its rendezvous handler.
>>
>> I don't see anything in rm_cleanIPI() that would do that however.
>>
>> I wonder if your original issue was really fixed  just by the first patch
>> you had which fixed the race in smp_rendezvous()?
>
> I found the stack that lead me to this patch in the first place:
>
> #0  sched_switch (td=0xffffff011a970000, newtd=0xffffff006e6784b0,
> flags=4) at src/sys/kern/sched_ule.c:1939
> #1  0xffffffff80285c7f in mi_switch (flags=6, newtd=0x0) at
> src/sys/kern/kern_synch.c:475
> #2  0xffffffff802a2cb3 in critical_exit () at src/sys/kern/kern_switch.c:185
> #3  0xffffffff80465807 in spinlock_exit () at
> src/sys/amd64/amd64/machdep.c:1458
> #4  0xffffffff8027adea in rm_cleanIPI (arg=<value optimized out>) at
> src/sys/kern/kern_rmlock.c:180
> #5  0xffffffff802b9887 in smp_rendezvous_action () at
> src/sys/kern/subr_smp.c:402
> #6  0xffffffff8045e2a4 in Xrendezvous () at
> src/sys/amd64/amd64/apic_vector.S:235
> #7  0xffffffff802a2c6e in critical_exit () at src/sys/kern/kern_switch.c:179
> #8  0xffffffff804365ba in uma_zfree_arg (zone=0xffffff009ff4b5a0,
> item=0xffffff000f34cd40, udata=0xffffff000f34ce08) at
> src/sys/vm/uma_core.c:2370
> .
> .
> .
>
> and now that I look at it again, it is clear that critical_exit() just isn't
> interrupt safe.  I'm not sure how to fix that, yet ... but this:
>
>
>        if (td->td_critnest == 1) {
>                td->td_critnest = 0;
>                if (td->td_owepreempt) {
>                        td->td_critnest = 1;
>
> clearly doesn't work.

I'm sorry if I didn't reply to the whole rendezvous thread, but as
long as there is so many people taking care of it, I'll stay hidden in
my corner.

I just wanted to tell that I think you are misunderstanding what
critical section is supposed to do.

When an interrupt fires, it goes on the old "interrupt/kernel context"
which means it has not a context of his own. That is the reason why we
disable interrupts on spinlocks (or similar workaround for !x86
architectures) and this is why spinlocks are the only protection
usable in code that runs in interrupt context.

Preempting just means another thread will be scheduler in the middle
of another thread execution path.

This code is perfectly fine if you consider curthread won't be
descheduled while it is executing.

Attilio


-- 
Peace can only be achieved by understanding - A. Einstein


More information about the freebsd-current mailing list