svn commit: r355784 - in head/sys: compat/linuxkpi/common/src dev/dpaa kern mips/nlm sys

Mateusz Guzik mjguzik at gmail.com
Mon Dec 16 19:39:55 UTC 2019


The entire thing reads like a bug -- for each platform critical_exit
can venture into mi_switch with interrupts disabled. On amd64 it just
happens to do it after the count was updated, but still before they
got-reenabled. Other platform also do critical enter/exit dance on
each call instead of just on the 0<->1 transition of
md_spinlock_count.

iow each platform should be probably doing something like this instead:
diff --git a/sys/amd64/amd64/machdep.c b/sys/amd64/amd64/machdep.c
index 7aadd86e3cd6..c90fe8ef961d 100644
--- a/sys/amd64/amd64/machdep.c
+++ b/sys/amd64/amd64/machdep.c
@@ -1991,10 +1991,10 @@ spinlock_enter(void)

        td = curthread;
        if (td->td_md.md_spinlock_count == 0) {
+               critical_enter();
                flags = intr_disable();
                td->td_md.md_spinlock_count = 1;
                td->td_md.md_saved_flags = flags;
-               critical_enter();
        } else
                td->td_md.md_spinlock_count++;
 }
@@ -2009,8 +2009,8 @@ spinlock_exit(void)
        flags = td->td_md.md_saved_flags;
        td->td_md.md_spinlock_count--;
        if (td->td_md.md_spinlock_count == 0) {
-               critical_exit();
                intr_restore(flags);
+               critical_exit();
        }
 }



On 12/16/19, Jeff Roberson <jroberson at jroberson.net> wrote:
> On Mon, 16 Dec 2019, Ryan Libby wrote:
>
>> On Mon, Dec 16, 2019 at 7:30 AM Ed Maste <emaste at freebsd.org> wrote:
>>>
>>> On Sun, 15 Dec 2019 at 16:27, Jeff Roberson <jeff at freebsd.org> wrote:
>>>>
>>>> Author: jeff
>>>> Date: Sun Dec 15 21:26:50 2019
>>>> New Revision: 355784
>>>> URL: https://svnweb.freebsd.org/changeset/base/355784
>>>>
>>>> Log:
>>>>   schedlock 4/4
>>>
>>> FYI i386, arm, arm64, riscv fail to boot now, with "panic: invalid count
>>> 2"
>>>
>>> Boot logs:
>>> i386: https://ci.freebsd.org/job/FreeBSD-head-i386-test/7797/console
>>> arm:
>>> https://ci.freebsd.org/hwlab/job/FreeBSD-device-head-beaglebone-test/1317/artifact/device_tests/beaglebone.boot.log
>>> arm64:
>>> https://ci.freebsd.org/hwlab/job/FreeBSD-device-head-pinea64-test/1194/artifact/device_tests/pinea64.boot.log
>>> riscv:
>>> https://ci.freebsd.org/hwlab/job/FreeBSD-device-head-pinea64-test/1194/artifact/device_tests/pinea64.boot.log
>>>
>>> arm64 is:
>>>
>>> panic: invalid count 2
>>> cpuid = 0
>>> time = 1
>>> KDB: stack backtrace:
>>> db_trace_self() at db_trace_self_wrapper+0x28
>>> pc = 0xffff0000007359ec  lr = 0xffff000000106744
>>> sp = 0xffff000056b063c0  fp = 0xffff000056b065d0
>>>
>>> db_trace_self_wrapper() at vpanic+0x18c
>>> pc = 0xffff000000106744  lr = 0xffff000000408128
>>> sp = 0xffff000056b065e0  fp = 0xffff000056b06690
>>>
>>> vpanic() at panic+0x44
>>> pc = 0xffff000000408128  lr = 0xffff000000407ed8
>>> sp = 0xffff000056b066a0  fp = 0xffff000056b06720
>>>
>>> panic() at sched_switch+0x81c
>>> pc = 0xffff000000407ed8  lr = 0xffff000000434264
>>> sp = 0xffff000056b06730  fp = 0xffff000056b06810
>>>
>>> sched_switch() at mi_switch+0x170
>>> pc = 0xffff000000434264  lr = 0xffff000000413690
>>> sp = 0xffff000056b06820  fp = 0xffff000056b06840
>>>
>>> mi_switch() at cpu_idle+0xc8
>>> pc = 0xffff000000413690  lr = 0xffff0000007400a0
>>> sp = 0xffff000056b06850  fp = 0xffff000056b06860
>>>
>>> cpu_idle() at sched_idletd+0x380
>>> pc = 0xffff0000007400a0  lr = 0xffff000000436a90
>>> sp = 0xffff000056b06870  fp = 0xffff000056b06940
>>>
>>> sched_idletd() at fork_exit+0x7c
>>> pc = 0xffff000000436a90  lr = 0xffff0000003c7ba4
>>> sp = 0xffff000056b06950  fp = 0xffff000056b06980
>>>
>>> fork_exit() at fork_trampoline+0x10
>>> pc = 0xffff0000003c7ba4  lr = 0xffff0000007521ac
>>> sp = 0xffff000056b06990  fp = 0x0000000000000000
>>>
>>> KDB: enter: panic
>>> [ thread pid 11 tid 100003 ]
>>> Stopped at      0
>>> db>
>>
>> It looks like amd64 vs i386, riscv, etc are using different motifs in
>> spinlock_exit().  Perhaps we just need to rearrange them to drop the
>> spinlock count before critical_exit(), like in amd64.
>
> It took me a moment to see why but I believe you are right.  Interrupts
> being disabled would prevent a local preemption with the flags out of sync
> but critical_exit() might have owepreempt set so we will switch before
> updating the count.
>
> Jeff
>
>>
>> Ryan
>>
>


-- 
Mateusz Guzik <mjguzik gmail.com>


More information about the svn-src-head mailing list