[Bug 250866] lock inversion panic in sys/riscv/riscv/pmap.c:3887 on RISCV 1300123

bugzilla-noreply at freebsd.org bugzilla-noreply at freebsd.org
Wed Nov 4 16:05:56 UTC 2020


https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=250866

            Bug ID: 250866
           Summary: lock inversion panic in sys/riscv/riscv/pmap.c:3887 on
                    RISCV 1300123
           Product: Base System
           Version: CURRENT
          Hardware: riscv
                OS: Any
            Status: New
          Severity: Affects Only Me
          Priority: ---
         Component: kern
          Assignee: bugs at FreeBSD.org
          Reporter: dclarke at blastwave.org

While running a test image from Mitchell Horne provided at : 

    https://reviews.freebsd.org/D27045

I was able to use it just fine under qemu and with a ZFS zpool. 

However after a number of hours while building the various dependencies
I ran into the following panic : 

Kernel page fault with the following non-sleepable locks held:
exclusive sleep mutex pmap (pmap) r = 0 (0xffffffd1a3420760) locked @
/usr/src/sys/riscv/riscv/pmap.c:3887
exclusive rw pmap pv list (pmap pv list) r = 0 (0xffffffc00083ff40) locked @
/usr/src/sys/riscv/riscv/pmap.c:3862
shared rw pmap pv global (pmap pv global) r = 0 (0xffffffc00083fe00) locked @
/usr/src/sys/riscv/riscv/pmap.c:3860
exclusive rw vm object (vm object) r = 0 (0xffffffd1d0046b58) locked @
/usr/src/sys/vm/vnode_pager.c:1239
stack backtrace:  339 -16    -     0B  5424K swapin   1  23:30   4.58% kernel
#0 0xffffffc000302a18 at witness_debugger+0x6celect   0   0:01   2.85% miniruby
#1 0xffffffc000303ba6 at witness_warn+0x420
#2 0xffffffc000536ff6 at page_fault_handler+0x60
#3 0xffffffc000536cb0 at do_trap_supervisor+0x64
#4 0xffffffc000527288 at cpu_exception_handler_supervisor+0x68
#5 0xffffffc0004f9b70 at vm_object_page_collect_flush+0x10a
#6 0xffffffc0004f990e at vm_object_page_clean+0x15e
#7 0xffffffc00036a9c0 at vinactivef+0xae
#8 0xffffffc00036a1bc at vput_final+0x256
#9 0xffffffc00036a248 at vput+0x32
#10 0xffffffc00037da08 at vn_close1+0x13c
#11 0xffffffc00037c466 at vn_closefile+0x44
#12 0xffffffc00024a324 at _fdrop+0x16
#13 0xffffffc00024d258 at closef+0x1e4
#14 0xffffffc00024a7dc at closefp+0x82
#15 0xffffffc00024afb6 at kern_close+0x11e
#16 0xffffffc00024ae8c at sys_close+0xe
#17 0xffffffc00053734e at do_trap_user+0x23a
t[0] == 0xffffffd1ffd8d800
t[1] == 0x0000000000000030
t[2] == 0x0000000000000000
t[3] == 0xffffffc062409708
t[4] == 0x000000000000000f
t[5] == 0x0000000000000180
t[6] == 0xffffffd00b47ed00
s[0] == 0xffffffc062409660
s[1] == 0xffffffd0a2cff2e0
s[2] == 0xffffffd1a3420778
s[3] == 0xffffffd1f3710830
s[4] == 0xffffffd1a3420760
s[5] == 0xffffffc0624097e0
s[6] == 0x0000000000000000
s[7] == 0x0000000000001000
s[8] == 0xffffffc0624095e0
s[9] == 0xfffffffffffff000
s[10] == 0x0000000000000001
s[11] == 0xffffffc000586443
a[0] == 0x0000000045f9e000
a[1] == 0x0000000000000000
a[2] == 0xffffffd1b6600178
a[3] == 0x0000000080000000
a[4] == 0x0000000280000000
a[5] == 0x0000000000000000
a[6] == 0xffffffcf80000000
a[7] == 0xffffffc000586443
ra == 0xffffffc000533736
sp == 0xffffffc062409560
gp == 0x0000000000000008
tp == 0x0000000000000020
sepc == 0xffffffc0005337ba
sstatus == 0x8000000000006120
panic: Fatal page fault at 0xffffffc0005337ba: 0000000000000000
cpuid = 1
time = 1604493628
KDB: stack backtrace:
db_trace_self() at db_trace_self
db_trace_self_wrapper() at db_trace_self_wrapper+0x38
kdb_backtrace() at kdb_backtrace+0x2c
vpanic() at vpanic+0x146
panic() at panic+0x26
page_fault_handler() at page_fault_handler+0x17a
do_trap_supervisor() at do_trap_supervisor+0x64
cpu_exception_handler_supervisor() at cpu_exception_handler_supervisor+0x68
--- exception 13, tval = 0
pmap_remove_write() at pmap_remove_write+0x352
vm_object_page_collect_flush() at vm_object_page_collect_flush+0x10a
vm_object_page_clean() at vm_object_page_clean+0x15e
vinactivef() at vinactivef+0xae
vput_final() at vput_final+0x256
vput() at vput+0x32
vn_close1() at vn_close1+0x13c
vn_closefile() at vn_closefile+0x44
_fdrop() at _fdrop+0x16
closef() at closef+0x1e4
closefp() at closefp+0x82
kern_close() at kern_close+0x11e
sys_close() at sys_close+0xe
do_trap_user() at do_trap_user+0x23a
cpu_exception_handler_user() at cpu_exception_handler_user+0x72
--- exception 8, tval = 0
KDB: enter: panic
[ thread pid 11478 tid 101371 ]
Stopped at      kdb_enter+0x4c: sd      zero,0(a0)
db> where
Tracing pid 11478 tid 101371 td 0xffffffc0620dfb80
kdb_enter() at kdb_enter+0x4a
vpanic() at vpanic+0x164
panic() at panic+0x26
page_fault_handler() at page_fault_handler+0x17a
do_trap_supervisor() at do_trap_supervisor+0x64
cpu_exception_handler_supervisor() at cpu_exception_handler_supervisor+0x68
--- exception 13, tval = 0
pmap_remove_write() at pmap_remove_write+0x352
vm_object_page_collect_flush() at vm_object_page_collect_flush+0x10a
vm_object_page_clean() at vm_object_page_clean+0x15e
vinactivef() at vinactivef+0xae
vput_final() at vput_final+0x256
vput() at vput+0x32
vn_close1() at vn_close1+0x13c
vn_closefile() at vn_closefile+0x44
_fdrop() at _fdrop+0x16
closef() at closef+0x1e4
closefp() at closefp+0x82
kern_close() at kern_close+0x11e
sys_close() at sys_close+0xe
do_trap_user() at do_trap_user+0x23a
cpu_exception_handler_user() at cpu_exception_handler_user+0x72
--- exception 8, tval = 0
db> 

Could be a null pointer deref issue? 

Seeing tval is 0 could be null pointer not a locking issue. There is also
exception 8, tval = 0 and maybe that is userspace making a syscall. Not my
area of knowledge ... sorry.

this probably only happens on FreeBSD CURRENT with the witness options and
debugging options enabled. 

If I can reproduce it that I may be able to gather a dump next time. 


-- 
Dennis Clarke
RISC-V/SPARC/PPC/ARM/CISC
UNIX and Linux spoken
GreyBeard and suspenders optional

-- 
You are receiving this mail because:
You are the assignee for the bug.


More information about the freebsd-bugs mailing list