[Bug 284443] Hyper-V snapshot triggers SCSI errors

From: <bugzilla-noreply_at_freebsd.org>
Date: Mon, 10 Feb 2025 09:21:40 UTC
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=284443

--- Comment #3 from Michael <michael.adm@gmail.com> ---
Some more information on this issue.
The VM is backed up every night from 3:00 to about 3:10.
Sometimes immediately after the backup, sometimes a few minutes later, the
following panic occurs:

------
/var/log/messages
...
Fatal trap 12: page fault while in kernel mode                                  
cpuid = 3; apic id = 03                                                         
fault virtual address   = 0x10                                                  
fault code              = supervisor read data, page not present                
instruction pointer     = 0x20:0xffffffff80d0cf6f                               
stack pointer           = 0x28:0xfffffe001729fc20                               
frame pointer           = 0x28:0xfffffe001729fcb0                               
code segment            = base 0x0, limit 0xfffff, type 0x1b                    
                        = DPL 0, pres 1, long 1, def32 0, gran 1                
processor eflags        = interrupt enabled, resume, IOPL = 0                   
current process         = 0 (wg_tqg_3)                                          
rdi: 0000000000000000 rsi: fffffe001729f8c0 rdx: ffffffff83a13090               
rcx: 00000000ffffffff  r8: 0000000000000000  r9: ffffffff82af3910               
rax: 0000000000000000 rbx: fffff80061d60d70 rbp: fffffe001729fcb0               
r10: fffff800e042d800 r11: fffff800026cc740 r12: fffff80061d60d00               
r13: fffff800026cc740 r14: fffff80061d60d00 r15: 0000000014dd0a0a               
trap number             = 12                                                    
panic: page fault                                                               
cpuid = 3                                                                       
time = 1738031733                                                               
KDB: stack backtrace:                                                           
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe001729f910  
vpanic() at vpanic+0x136/frame 0xfffffe001729fa40                               
panic() at panic+0x43/frame 0xfffffe001729faa0                                  
trap_fatal() at trap_fatal+0x40b/frame 0xfffffe001729fb00                       
trap_pfault() at trap_pfault+0x46/frame 0xfffffe001729fb50                      
calltrap() at calltrap+0x8/frame 0xfffffe001729fb50                             
--- trap 0xc, rip = 0xffffffff80d0cf6f, rsp = 0xfffffe001729fc20, rbp = 0xfffffe001729fcb0 ---
ip_tryforward() at ip_tryforward+0x19f/frame 0xfffffe001729fcb0                 
ip_input() at ip_input+0x2ed/frame 0xfffffe001729fd10                           
netisr_dispatch_src() at netisr_dispatch_src+0x9f/frame 0xfffffe001729fd60      
wg_deliver_in() at wg_deliver_in+0x416/frame 0xfffffe001729fe40                 
gtaskqueue_run_locked() at gtaskqueue_run_locked+0x14e/frame 0xfffffe001729fec0 
gtaskqueue_thread_loop() at gtaskqueue_thread_loop+0xc2/frame 0xfffffe001729fef0
fork_exit() at fork_exit+0x7b/frame 0xfffffe001729ff30                          
fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe001729ff30               
--- trap 0, rip = 0, rsp = 0, rbp = 0 ---                                       
Uptime: 26d23h30m49s                                                            
Dumping 1848 out of 4038 MB:..1%..11%..21%..31%..41%..51%..61%..71%..81%..91%   
Dump complete                                                                   
Automatic reboot in 15 seconds - press a key on the console to abort            
...
-----
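
To compare the panic with the backup window, the "time =" and "Uptime:" values
from the log above can be converted to wall-clock time. The following is only a
minimal sketch: the helper names are made up for illustration, the results are
in UTC, and the time zone of the 3:00 backup schedule is not stated here, so the
local offset still has to be applied by hand.

```python
import re
from datetime import datetime, timedelta, timezone

# Hypothetical helper names; the input values are copied from the panic output.

def epoch_to_utc(epoch_seconds: int) -> datetime:
    """Convert the panic's 'time = N' value (seconds since the epoch) to UTC."""
    return datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)

def parse_uptime(text: str) -> timedelta:
    """Parse an uptime string such as '26d23h30m49s' into a timedelta."""
    match = re.fullmatch(r"(?:(\d+)d)?(?:(\d+)h)?(?:(\d+)m)?(?:(\d+)s)?", text)
    if not text or match is None:
        raise ValueError(f"unrecognized uptime string: {text!r}")
    days, hours, minutes, seconds = (int(g) if g else 0 for g in match.groups())
    return timedelta(days=days, hours=hours, minutes=minutes, seconds=seconds)

panic_at = epoch_to_utc(1738031733)                   # "time = 1738031733"
booted_at = panic_at - parse_uptime("26d23h30m49s")   # "Uptime: 26d23h30m49s"

print("panic (UTC):", panic_at.isoformat())   # 2025-01-28T02:35:33+00:00
print("boot  (UTC):", booted_at.isoformat())  # 2025-01-01T03:04:44+00:00
```

Whether these timestamps fall inside or just after the backup window depends on
the local time zone of the host, which is an open assumption in this sketch.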

This has happened twice within a few months.
The behavior has been observed on FreeBSD 13 and FreeBSD 14, and now on
FreeBSD 15.
The "Fatal trap" occurs when SR-IOV is enabled on the network cards and a
checkpoint or backup of this VM is taken. With SR-IOV disabled on the network
cards, this behavior has not been observed.

The message log for several days, including the fatal trap and the initial
launch, is in the attachment above.
