Re: panic: data abort in critical section or under mutex (was: Re: panic: Unknown kernel exception 0 esr_el1 2000000 (on 14-CURRENT/aarch64 Feb 28))

From: Ronald Klop <ronald-lists_at_klop.ws>
Date: Sun, 06 Mar 2022 22:22:42 UTC
Hi,

I did a binary search using kernels from artifact.ci.freebsd.org.

I suspect "rmlock: Micro-optimize read locking" as the cause.
https://cgit.freebsd.org/src/commit/?id=c84bb8cd771ce4bed58152e47a32dda470bef23a

And "rmlock: Add required compiler barriers to _rm_runlock()" as the fix.
https://cgit.freebsd.org/src/commit/?id=89ae8eb74e87ac19aa2d7abe4ba16bcccd32bb9f

So I probably just had a bad day.
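
For anyone who wants to repeat this kind of search, here is a minimal sketch of the procedure in Python. It is not the script I used; first_bad(), the build labels and the is_bad callback are hypothetical names for illustration. It assumes the oldest kernel in the list is good, the newest is bad, and a kernel that panics keeps panicking in every later build in the list, which only holds if the list ends before the fix commit.

```python
from typing import Callable, Sequence


def first_bad(builds: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Return the earliest build for which is_bad() is true.

    Assumption: builds are ordered oldest to newest, builds[0] is good,
    builds[-1] is bad, and every build after the first bad one is bad too.
    """
    lo, hi = 0, len(builds) - 1  # invariant: builds[lo] good, builds[hi] bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(builds[mid]):
            hi = mid  # the regression is at mid or earlier
        else:
            lo = mid  # the regression is after mid
    return builds[hi]


if __name__ == "__main__":
    # Hypothetical labels standing in for downloaded kernel snapshots.
    snapshots = [f"k{i}" for i in range(8)]
    # Stand-in for "boot this kernel and see whether it panics".
    print(first_bad(snapshots, lambda b: int(b[1:]) >= 5))
```

With an intermittent panic like this one, a "good" result is only as reliable as the time each kernel was left running under load, so is_bad should err on the side of longer test runs.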

Regards,
Ronald.

 
From: Ronald Klop <ronald-lists@klop.ws>
Date: Saturday, 5 March 2022 16:09
To: FreeBSD Current <freebsd-current@freebsd.org>
Subject: panic: data abort in critical section or under mutex (was: Re: panic: Unknown kernel exception 0 esr_el1 2000000 (on 14-CURRENT/aarch64 Feb 28))
> 
> Hi,
> 
> Another panic while building world/kernel. Different panic message and trace.
>  
>   x0:     1f5e152c32cc
>   x1: ffff0000b630a000 (g_ctx + b4c4a254)
>   x2:                1
>   x3:               2e
>   x4: ffffa000bb46d600
>   x5:                0
>   x6:                0
>   x7: ffff000000f05104 (has_pan + 0)
>   x8:                1
>   x9:         809c227c
>  x10:               bd
>  x11:               40
>  x12:                0
>  x13:                1
>  x14:         1782f000
>  x15:             1001
>  x16:         1782f003
>  x17:     1f5e957392f0
>  x18: ffff00010719e630 (next_index + 2cac528)
>  x19: ffff00010719e768 (next_index + 2cac660)
>  x20:                1
>  x21: ffff0000b630a000 (g_ctx + b4c4a254)
>  x22:                1
>  x23:         ffffffbf
>  x24: ffff00010719e758 (next_index + 2cac650)
>  x25: ffffa00026cdd160
>  x26:                1
>  x27: ffffa000bb46d600
>  x28: ffff00000092815a (do_execve.fexecv_proc_title + 5483)
>  x29: ffff00010719e630 (next_index + 2cac528)
>   sp: ffff00010719e630
>   lr: ffff00000053e890 (uiomove_faultflag + 128)
>  elr: ffff000000804f80 (byte_by_byte + 4)
> spsr:               45
>  far: ffff0000b630a000 (g_ctx + b4c4a254)
>  esr:         96000047
> panic: data abort in critical section or under mutex
> cpuid = 2
> time = 1646489189
> KDB: stack backtrace:
> db_trace_self() at db_trace_self
> db_trace_self_wrapper() at db_trace_self_wrapper+0x30
> vpanic() at vpanic+0x174
> panic() at panic+0x44
> data_abort() at data_abort+0x2a8
> handle_el1h_sync() at handle_el1h_sync+0x10
> --- exception, esr 0x96000047
> byte_by_byte() at byte_by_byte+0x4
> pipe_write() at pipe_write+0x668
> KDB: enter: panic
> [ thread pid 68336 tid 100593 ]
> Stopped at      kdb_enter+0x44: undefined       f901c11f
> db>
> 
> 
> 
> Regards,
> Ronald.
> 
> 
>  
> From: Ronald Klop <ronald-lists@klop.ws>
> Date: Saturday, 5 March 2022 12:16
> To: FreeBSD Current <freebsd-current@freebsd.org>
> Subject: panic: Unknown kernel exception 0 esr_el1 2000000 (on 14-CURRENT/aarch64 Feb 28)
>> 
>> Hi,
>> 
>> Repeated panics on 14-CURRENT/aarch64. This happens, for example, when the nightly backup starts.
>> # uname -a
>> FreeBSD rpi4 14.0-CURRENT FreeBSD 14.0-CURRENT #22 main-5f702d6d9a: Mon Feb 28 06:12:48 CET 2022     ronald@rpi4:/home/ronald/dev/obj/home/ronald/dev/freebsd/arm64.aarch64/sys/GENERIC-NODEBUG arm64
>> 
>> It was stable with all kernels up to and including "FreeBSD 14.0-CURRENT #21 main-e11ad014d1-dirty: Sat Feb  5 00:09:08 CET 2022".
>> 
>> It runs ZFS-on-root from a USB disk. No other FS is involved.
>> # gpart show
>> =>        40  1953458096  da0  GPT  (931G)
>>           40      102400    1  efi  (50M)
>>       102440     8388608    2  freebsd-swap  (4.0G)
>>      8491048  1944967088    3  freebsd-zfs  (927G)
>> 
>> 
>> Output on serial console:
>>   x0: ffffa000059c1380
>>   x1: ffffa000059b1600
>>   x2:                3
>>   x3: ffffa001862779a0
>>   x4:                0
>>   x5:    9438238792a1a
>>   x6:    d217e9df58308
>>   x7:               14
>>   x8: ffffa000059c1398
>>   x9:                1
>>  x10: ffffa000059b1600
>>  x11:                2
>>  x12:                1
>>  x13: f2557a42c5b0f240
>>  x14: 1013e6b85a8ecbe4
>>  x15:     24f981889f30
>>  x16: ffff4afedeb89cb8
>>  x17: fffffffffffffff2
>>  x18: ffff0000fe666800 (g_ctx + fcfa6a54)
>>  x19:                0
>>  x20: ffff0000fec41000 (g_ctx + fd581254)
>>  x21:                3
>>  x22: ffff0000419bb090 (g_ctx + 402fb2e4)
>>  x23: ffff000000c09bb7 (lockstat_enabled + 0)
>>  x24:              180
>>  x25: ffff000000c09000 (sdt_vfs_vop_vop_spare1_entry + 28)
>>  x26: ffff000000c09000 (sdt_vfs_vop_vop_spare1_entry + 28)
>>  x27: ffff000000c09000 (sdt_vfs_vop_vop_spare1_entry + 28)
>>  x28:                0
>>  x29: ffff0000fe666800 (g_ctx + fcfa6a54)
>>   sp: ffff0000fe666800
>>   lr: ffff00000154ca38 (zio_dva_throttle + 13c)
>>  elr: ffff00000154ca80 (zio_dva_throttle + 184)
>> spsr:         20000045
>>  far:     1f24979f8000
>> panic: Unknown kernel exception 0 esr_el1 2000000
>> cpuid = 2
>> time = 1646433952
>> KDB: stack backtrace:
>> db_trace_self() at db_trace_self
>> db_trace_self_wrapper() at db_trace_self_wrapper+0x30
>> vpanic() at vpanic+0x174
>> panic() at panic+0x44
>> do_el1h_sync() at do_el1h_sync+0x184
>> handle_el1h_sync() at handle_el1h_sync+0x10
>> --- exception, esr 0x2000000
>> zio_dva_throttle() at zio_dva_throttle+0x184
>> zio_execute() at zio_execute+0x58
>> KDB: enter: panic
>> [ thread pid 0 tid 100128 ]
>> Stopped at      kdb_enter+0x44: undefined       f901c11f
>> db>
>> 
>> 
>> I'm going to build a newer kernel to see if the problem persists. I can keep the current kernel to reproduce this if needed.
>> 
>> Regards,
>> Ronald.
>
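
As a reading aid for the two esr values in the quoted panics (0x96000047 and 0x2000000), here is a small Python sketch that splits an AArch64 ESR_ELx value into its architectural fields. decode_esr() is a hypothetical helper written for this note, not something from the FreeBSD tree; the bit positions follow the Arm architecture's ESR_ELx layout.

```python
def decode_esr(esr: int) -> dict:
    """Split an AArch64 ESR_ELx value into its top-level fields."""
    fields = {
        "ec": (esr >> 26) & 0x3F,  # exception class, bits [31:26]
        "il": (esr >> 25) & 0x1,   # instruction length bit, bit [25]
        "iss": esr & 0x1FFFFFF,    # instruction specific syndrome, bits [24:0]
    }
    # EC 0x24 / 0x25 are data aborts (from a lower EL / from the same EL).
    if fields["ec"] in (0x24, 0x25):
        fields["wnr"] = (fields["iss"] >> 6) & 0x1  # 1 = fault on a write
        fields["dfsc"] = fields["iss"] & 0x3F       # data fault status code
    return fields


if __name__ == "__main__":
    for value in (0x96000047, 0x2000000):
        decoded = decode_esr(value)
        print(hex(value), {name: hex(v) for name, v in decoded.items()})
```

For 0x96000047 this gives EC 0x25 (data abort taken at the same exception level), WnR set (a write) and DFSC 0x07 (translation fault, level 3), which matches the "data abort" panic at byte_by_byte. For 0x2000000 the EC is 0, the "unknown reason" class, which is what the "Unknown kernel exception 0" message reports.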