Instant panic while trying run ports-mgmt/poudriere
Lawrence Stewart
lstewart at room52.net
Fri Aug 28 04:36:24 UTC 2015
On 08/23/15 22:54, Konstantin Belousov wrote:
> On Sun, Aug 23, 2015 at 12:08:16PM +0300, Konstantin Belousov wrote:
>> On Sun, Aug 23, 2015 at 09:54:28AM +0300, Andriy Gapon wrote:
>>> On 12/08/2015 17:11, Lawrence Stewart wrote:
>>>> On 08/07/15 07:33, Pawel Pekala wrote:
>>>>> Hi K.,
>>>>>
>>>>> On 2015-08-06 12:33 -0700, "K. Macy" <kmacy at freebsd.org> wrote:
>>>>>> Is this still happening?
>>>>>
>>>>> Still crashes:
>>>>
>>>> +1 for me running r286617
>>>
>>> Here is another +1 with r286922.
>>> I can add a couple of bits of debugging data:
>>>
>>> (kgdb) fr 8
>>> #8 0xffffffff80639d60 in knote (list=0xfffff8019a733ea0,
>>> hint=2147483648, lockflags=<value optimized out>) at
>>> /usr/src/sys/kern/kern_event.c:1964
>>> 1964 } else if ((lockflags & KNF_NOKQLOCK) != 0) {
>>> (kgdb) p *list
>>> $2 = {kl_list = {slh_first = 0x0}, kl_lock = 0xffffffff8063a1e0
>>> <knlist_mtx_lock>, kl_unlock = 0xffffffff8063a200 <knlist_mtx_unlock>,
>>> kl_assert_locked = 0xffffffff8063a220 <knlist_mtx_assert_locked>,
>>> kl_assert_unlocked = 0xffffffff8063a240 <knlist_mtx_assert_unlocked>,
>>> kl_lockarg = 0xfffff8019a733bb0}
>>> (kgdb) disassemble
>>> Dump of assembler code for function knote:
>>> 0xffffffff80639d00 <knote+0>: push %rbp
>>> 0xffffffff80639d01 <knote+1>: mov %rsp,%rbp
>>> 0xffffffff80639d04 <knote+4>: push %r15
>>> 0xffffffff80639d06 <knote+6>: push %r14
>>> 0xffffffff80639d08 <knote+8>: push %r13
>>> 0xffffffff80639d0a <knote+10>: push %r12
>>> 0xffffffff80639d0c <knote+12>: push %rbx
>>> 0xffffffff80639d0d <knote+13>: sub $0x18,%rsp
>>> 0xffffffff80639d11 <knote+17>: mov %edx,%r12d
>>> 0xffffffff80639d14 <knote+20>: mov %rsi,-0x30(%rbp)
>>> 0xffffffff80639d18 <knote+24>: mov %rdi,%rbx
>>> 0xffffffff80639d1b <knote+27>: test %rbx,%rbx
>>> 0xffffffff80639d1e <knote+30>: je 0xffffffff80639ef6 <knote+502>
>>> 0xffffffff80639d24 <knote+36>: mov %r12d,%eax
>>> 0xffffffff80639d27 <knote+39>: and $0x1,%eax
>>> 0xffffffff80639d2a <knote+42>: mov %eax,-0x3c(%rbp)
>>> 0xffffffff80639d2d <knote+45>: mov 0x28(%rbx),%rdi
>>> 0xffffffff80639d31 <knote+49>: je 0xffffffff80639d38 <knote+56>
>>> 0xffffffff80639d33 <knote+51>: callq *0x18(%rbx)
>>> 0xffffffff80639d36 <knote+54>: jmp 0xffffffff80639d42 <knote+66>
>>> 0xffffffff80639d38 <knote+56>: callq *0x20(%rbx)
>>> 0xffffffff80639d3b <knote+59>: mov 0x28(%rbx),%rdi
>>> 0xffffffff80639d3f <knote+63>: callq *0x8(%rbx)
>>> 0xffffffff80639d42 <knote+66>: mov %rbx,-0x38(%rbp)
>>> 0xffffffff80639d46 <knote+70>: mov (%rbx),%rbx
>>> 0xffffffff80639d49 <knote+73>: test %rbx,%rbx
>>> 0xffffffff80639d4c <knote+76>: je 0xffffffff80639ee5 <knote+485>
>>> 0xffffffff80639d52 <knote+82>: and $0x2,%r12d
>>> 0xffffffff80639d56 <knote+86>: nopw %cs:0x0(%rax,%rax,1)
>>> 0xffffffff80639d60 <knote+96>: mov 0x28(%rbx),%r14
>>>
>>> Panic is in the last quoted instruction.
>>> And:
>>> (kgdb) i reg
>>> rax 0x246 582
>>> rbx 0xdeadc0dedeadc0de -2401050962867404578
>>> rcx 0x0 0
>>> rdx 0x12e 302
>>> rsi 0xffffffff80a26a5a -2136839590
>>> rdi 0xffffffff80e81b80 -2132272256
>>> rbp 0xfffffe02b7efea20 0xfffffe02b7efea20
>>> rsp 0xfffffe02b7efe9e0 0xfffffe02b7efe9e0
>>> r8 0xffffffff80a269ce -2136839730
>>> r9 0xffffffff80e82838 -2132269000
>>> r10 0x10000 65536
>>> r11 0xffffffff80fabd10 -2131051248
>>> r12 0x0 0
>>> r13 0xfffff801ff84a818 -8787511171048
>>> r14 0xfffff801ff84a800 -8787511171072
>>> r15 0xfffff8019a6974f0 -8789207452432
>>> rip 0xffffffff80639d60 0xffffffff80639d60 <knote+96>
>>> eflags 0x10286 66182
>>>
>>> I think that $rbx stands out here (this is a kernel with INVARIANTS).
>>>
>>> Looking at the code, is it possible that one of the calls from within
>>> the loop's body modifies the list? If that is so and provided that is a
>>> valid behavior, then maybe using SLIST_FOREACH_SAFE would help.
>>
>> This is first time a useful debugging data was posted.
>>
>> The 0x28 offset may indicate either kn_kq member access of the struct
>> knote, or kq_list of the struct kqueue.
>>
>> kl_list.slh_first of the list parameter is NULL, how would a list
>> iteration loop even start ? Can you look up the list argument value
>> from the previous frame (%rdi is overwritten, so debugger might be
>> confused) ?
>
> After looking at your data closely, I think you are right. The panic
> occurs when the exit1(9) does KNOTE_LOCKED(NOTE_EXIT). This is the
> only case in the tree where filter uses knlist_remove_inevent() to detach
> processed note, so indeed the slist is modified under the iterator.
>
> Below is the patch with the suggested change and unrelated cleanup of
> the uma(9) KPI use. Please test, everybody who has a panic with the
> backtrace pointing to the sys_exit().
Fixes the panic for me too, thanks Kostik.
Cheers,
Lawrence
More information about the freebsd-current
mailing list