Re: IPv6 panic (NULL * deref?) in nd6_ifnet_link_event
Date: Sat, 10 May 2025 19:49:52 UTC
On Sat, 10 May 2025, Kristof Provost wrote:
>
>
>> On 10 May 2025, at 21:32, Bjoern A. Zeeb <bzeeb-lists@lists.zabbadoz.net> wrote:
>>
>> Hi,
>>
>> main of the last days.
>>
>> Fatal trap 12: page fault while in kernel mode
>> cpuid = 2; apic id = 02
>> fault virtual address = 0x10
>> fault code = supervisor read data, page not present
>> instruction pointer = 0x20:0xffffffff80dbd769
>> stack pointer = 0x28:0xfffffe0106296d60
>> frame pointer = 0x28:0xfffffe0106296d70
>> code segment = base 0x0, limit 0xfffff, type 0x1b
>> = DPL 0, pres 1, long 1, def32 0, gran 1
>> processor eflags = interrupt enabled, resume, IOPL = 0
>> current process = 12 (swi6: task queue)
>> rdi: fffff8002f997800 rsi: 000000000000001c rdx: 0000000000000000
>> rcx: 0000000000010000 r8: 0000000000000001 r9: ffffffffffffffff
>> rax: 0000000000000000 rbx: fffff8002f997a18 rbp: fffffe0106296d70
>> r10: ffffffff81c4a1e8 r11: 0000000000000001 r12: fffff80001210700
>> r13: fffff80001210728 r14: fffff8002f997800 r15: 0000000000000001
>> trap number = 12
>> panic: page fault
>> cpuid = 2
>> time = 1746903751
>> KDB: stack backtrace:
>> db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe0106296a90
>> vpanic() at vpanic+0x136/frame 0xfffffe0106296bc0
>> panic() at panic+0x43/frame 0xfffffe0106296c20
>> trap_pfault() at trap_pfault+0x48d/frame 0xfffffe0106296c90
>> calltrap() at calltrap+0x8/frame 0xfffffe0106296c90
>> --- trap 0xc, rip = 0xffffffff80dbd769, rsp = 0xfffffe0106296d60, rbp = 0xfffffe0106296d70 ---
>> nd6_ifnet_link_event() at nd6_ifnet_link_event+0x39/frame 0xfffffe0106296d70
>> do_link_state_change() at do_link_state_change+0x1b1/frame 0xfffffe0106296dc0
>> taskqueue_run_locked() at taskqueue_run_locked+0x1c2/frame 0xfffffe0106296e40
>> taskqueue_run() at taskqueue_run+0x4d/frame 0xfffffe0106296e60
>> ithread_loop() at ithread_loop+0x266/frame 0xfffffe0106296ef0
>> fork_exit() at fork_exit+0x82/frame 0xfffffe0106296f30
>> fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe0106296f30
>> --- trap 0x25b01e6e, rip = 0x52db004fa566ef34, rsp = 0xcadb9a4f3d667734, rbp = 0xde5a00adbd42c69c ---
>> KDB: enter: panic
>>
>>
>> (gdb) l * nd6_ifnet_link_event+0x39
>> 0xffffffff80dbd769 is in nd6_ifnet_link_event (sys/netinet6/nd6_rtr.c:327).
>> 322 static void
>> 323 defrtr_ipv6_only_ipf_down(struct ifnet *ifp)
>> 324 {
>> 325
>> 326 IF_AFDATA_WLOCK(ifp);
>> 327 ND_IFINFO(ifp)->flags &= ~ND6_IFF_IPV6_ONLY;
>> 328 IF_AFDATA_WUNLOCK(ifp);
>> 329 }
>> 330 #endif /* EXPERIMENTAL */
>> 331
>>
> That may be a known issue. There’s something odd with teardown where we sometimes clean up af_data for INET6 and still try to send v6 traffic. I know of panics where there’s a fib6_lookup() that returns a route with no v6 af_data.
> I put a hack in the pfsense tree to make the panic less likely, but I don’t know what the root cause is.
This one likely came after the ifp was gone, or at least after
ND_IFINFO(ifp) had become NULL. The first would be a contract violation;
the second is likely a bad order/race against queuing. In both cases we
could avoid the panic with NULL checks (plus maybe a warning so we can
find the root cause)?
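
A minimal userland sketch of the NULL-check-plus-warning idea, not a patch
against the tree. Everything prefixed mock_/MOCK_ is hypothetical and only
loosely mirrors the kernel structures; the layout of mock_in6_ifextra and
the flag value are assumptions made for illustration. With two pointers
ahead of nd_ifinfo, a NULL per-AF pointer would give a read at offset 0x10
on LP64, which would be consistent with the fault address above, but that
is a guess.

```c
#include <stddef.h>
#include <stdio.h>

#define MOCK_AF_INET6		28	/* matches rsi = 0x1c in the trap frame */
#define MOCK_AF_MAX		64	/* hypothetical array size */
#define MOCK_IFF_IPV6_ONLY	0x200	/* hypothetical flag value */

struct mock_nd_ifinfo {
	unsigned int flags;
};

/* Assumed layout: two pointers precede nd_ifinfo. */
struct mock_in6_ifextra {
	void *in6_ifstat;
	void *icmp6_ifstat;
	struct mock_nd_ifinfo *nd_ifinfo;
};

struct mock_ifnet {
	const char *if_xname;
	void *if_afdata[MOCK_AF_MAX];
};

/* Return the ND info, or NULL if any link in the pointer chain is missing. */
static struct mock_nd_ifinfo *
mock_nd_ifinfo_get(struct mock_ifnet *ifp)
{
	struct mock_in6_ifextra *ext;

	if (ifp == NULL)
		return (NULL);
	ext = ifp->if_afdata[MOCK_AF_INET6];
	if (ext == NULL)
		return (NULL);
	return (ext->nd_ifinfo);
}

/* Link event handler: warn and bail out instead of dereferencing NULL. */
static int
mock_link_event(struct mock_ifnet *ifp)
{
	struct mock_nd_ifinfo *ndi;

	ndi = mock_nd_ifinfo_get(ifp);
	if (ndi == NULL) {
		fprintf(stderr, "%s: no ND ifinfo on %s, ignoring link event\n",
		    __func__, ifp != NULL ? ifp->if_xname : "(null)");
		return (-1);
	}
	ndi->flags &= ~MOCK_IFF_IPV6_ONLY;
	return (0);
}
```

The warning only makes the bad ordering visible; it would not fix whatever
frees or clears the per-AF data while a link event is still queued.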
--
Bjoern A. Zeeb r15:7