Re: panic: vm_fault failed: %lx error 1 (from arm64::data_abort)

From: Klaus_Küchemann <maciphone2_at_googlemail.com>
Date: Sat, 07 Jan 2023 05:50:01 UTC
Well, just to confirm Mark's assessment that this is possibly not a kernel issue:

Apart from all the hills to climb to get it to boot (without any kernel patch or kernel revert),
I got it to boot on both an RPi4B ("B0T" device) and a CM4 ("C0T" device, AFAIK) with the latest 14-CURRENT from tonight.
On the CM4, though, I had to unload modules at boot (or, to make that persistent, disable devmatch),
so for Björn I would suggest unloading (all) modules from the loader as a first thing to try ...
<<<<
U-Boot 2022.10 (Jan 01 2023 - 06:34:49 +0000)

DRAM:  7.9 GiB
RPI Compute Module 4 (0xd03140)
….—(netboot) : ------
root@:~ # uname -a
FreeBSD  14.0-CURRENT FreeBSD 14.0-CURRENT #23 main-n259963-da303f5fd4ee: Sat Jan  7 01:16:32 CET 2023     root@fbsdr5pro:/usr/obj/usr/src/arm64.aarch64/sys/GENERIC-MMCCAM arm64
---

<<<<
U-Boot 2022.10 (Jan 01 2023 - 06:34:49 +0000)

DRAM:  7.9 GiB
RPI 4 Model B (0xd03114)
….—(netboot) : ------
root@:~ # uname -a
FreeBSD  14.0-CURRENT FreeBSD 14.0-CURRENT #23 main-n259963-da303f5fd4ee: Sat Jan  7 01:16:32 CET 2023     root@fbsdr5pro:/usr/obj/usr/src/arm64.aarch64/sys/GENERIC-MMCCAM arm64
---

Regards

K.

> On 05.01.2023 at 23:57, Bjoern A. Zeeb <bzeeb-lists@lists.zabbadoz.net> wrote:
> 
> On Thu, 5 Jan 2023, Bjoern A. Zeeb wrote:
> 
>> On Thu, 5 Jan 2023, Bjoern A. Zeeb wrote:
>> 
>> Hi,
>> 
>>> after an update, a machine with an unattended console (previous builds were from Dec 23) did not come back up.
>>> I have a few last lines.
>>> esr:   96000004
>>> panic: vm_fault failed: <addr> error 1
>>> cpuid = 0
>>> time = 1
>>> KDB: stack backtrace:
>>> ..
>>> data_abort()
>>> ..
>>> --- exception, esr 0x96000004
>>> thread_init()
>>> keg_alloc_slab()
>>> zone_import()
>>> cache_alloc()
>>> cache_alloc_retry()
>>> thread_alloc()
>>> fork()
>>> kproc_create()
>>> audit_worker_init()
>>> mi_startup()
>>> virtdone()
>> 
>> Follow-up, got a serial console hooked up and a kernel as of an hour ago
>> or so:
> 
> And as another data point:  6fd6a0e342fbfb8513ae56105cf0f85f55c6276e
> (Dec 23) still boots just fine; I did a rebuild with all the same
> local changes, same kernel modules loaded, ... same loader installed
> (not changed with the downgrade), same firmware, ...
> 
> I'll try to bisect over the next few days unless someone can spot any
> other commit I may have missed which could cause this.
> 
> /bz
> 
> 
>> ...
>> hostuuid: using 00000000-0000-0000-0000-000000000000
>> ULE: setup cpu 0
>> ULE: setup cpu 1
>> ULE: setup cpu 2
>> ULE: setup cpu 3
>> Fatal data abort:
>> x0: ffffa000008c1d80
>> x1:                0
>> x2:                2
>> x3:                3
>> x4:              203
>> x5:                0
>> x6: ffffffffffffffff
>> x7:             2001
>> x8: ffff000000ee5000 (dump_encrypted_write.buf + f54)
>> x9:                0
>> x10: ffffa00000845be0
>> x11:                2
>> x12: ffff00004041dd98 (fuse_mtx + 3c276888)
>> x13:   20000000000040
>> x14:           42c000
>> x15:                1
>> x16:                c
>> x17:            4082a
>> x18: ffff000000fcf6a0 (initstack + 36a0)
>> x19: ffff00004082b000 (fuse_mtx + 3c683af0)
>> x20:                0
>> x21: ffff00004082b000 (fuse_mtx + 3c683af0)
>> x22:                2
>> x23:                0
>> x24: ffff00004082d000 (fuse_mtx + 3c685af0)
>> x25:                0
>> x26: ffff000000c73000 (sdta_vfs_vop_vop_spare4_return1 + 18)
>> x27:                2
>> x28:                1
>> x29: ffff000000fcf6a0 (initstack + 36a0)
>> sp: ffff000000fcf6a0
>> lr: ffff0000004cdff8 (thread_init + 98)
>> elr: ffff0000004ce004 (thread_init + a4)
>> spsr:         600000c5
>> far:               40
>> esr:         96000004
>> panic: vm_fault failed: ffff0000004ce004 error 1
>> cpuid = 0
>> time = 1
>> KDB: stack backtrace:
>> db_trace_self() at db_trace_self
>> db_trace_self_wrapper() at db_trace_self_wrapper+0x30
>> vpanic() at vpanic+0x13c
>> panic() at panic+0x44
>> data_abort() at data_abort+0x308
>> handle_el1h_sync() at handle_el1h_sync+0x10
>> --- exception, esr 0x96000004
>> thread_init() at thread_init+0xa4
>> keg_alloc_slab() at keg_alloc_slab+0x24c
>> zone_import() at zone_import+0xe0
>> cache_alloc() at cache_alloc+0x32c
>> cache_alloc_retry() at cache_alloc_retry+0x2c
>> thread_alloc() at thread_alloc+0x38
>> fork1() at fork1+0x348
>> kproc_create() at kproc_create+0x78
>> audit_worker_init() at audit_worker_init+0x44
>> mi_startup() at mi_startup+0x200
>> virtdone() at virtdone+0x6c
>> KDB: enter: panic
>> [ thread pid 0 tid 100000 ]
>> Stopped at      kdb_enter+0x44: undefined       f900027f
>> db> show reg
>> spsr        0xf2000000600000c5
>> x0                        0x12
>> x1                         0xa
>> x2                           0
>> x3                         0xa
>> x4          0xffff0000007f5c10  generic_bs_w_4
>> x5                        0x50
>> x6          0xffff00000051244c  kvprintf+0x470
>> x7                        0xd5
>> x8                         0x1
>> x9          0x49a2d892bc05a0b1
>> x10         0xffff000000ebd000  null_gdb_dbgport+0x20
>> x11         0xfefefefefefefeff
>> x12         0xffff000000000a63  create_pagetables+0x3b
>> x13             0xfefefeff0100
>> x14                          0
>> x15                          0
>> x16                          0
>> x17                          0
>> x18         0xffff000000fcf310  initstack+0x3310
>> x19         0xffff000000f16000  kdb_why
>> x20         0xffff000000ee3f70  vpanic.buf
>> x21         0xffff000000ec0cc0  thread0_st
>> x22                          0
>> x23         0xffff000000ee4000  vpanic.buf+0x90
>> x24                        0x1
>> x25         0xffff000000fcfaa0  initstack+0x3aa0
>> x26         0xffff000000c73000  sdta_vfs_vop_vop_spare4_return1+0x18
>> x27                        0x2
>> x28                        0x1
>> x29         0xffff000000fcf310  initstack+0x3310
>> lr          0xffff00000050b0c4  kdb_enter+0x40
>> elr         0xffff00000050b0c8  kdb_enter+0x44
>> sp          0xffff000000fcf310  initstack+0x3310
>> kdb_enter+0x44: undefined       f900027f
>> 
>> 
>> 
> 
> -- 
> Bjoern A. Zeeb                                                     r15:7
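
For reference, the esr value in the dump above (96000004) can be unpacked by hand. Below is a minimal Python sketch, assuming the ARMv8-A ESR_EL1 layout (EC in bits 31:26, IL in bit 25, ISS in bits 24:0, and for data aborts DFSC in ISS bits 5:0 and WnR in ISS bit 6). The function and table names are made up for illustration and are not part of FreeBSD. Only the few encodings relevant here are listed.

```python
# Hypothetical helper (not a FreeBSD tool): decode an arm64 ESR_EL1 value.
# Only a handful of encodings are named; anything else is reported as unknown.

EC_NAMES = {
    0x24: "data abort from a lower exception level",
    0x25: "data abort taken without a change in exception level",
}

DFSC_NAMES = {
    0x04: "translation fault, level 0",
    0x05: "translation fault, level 1",
    0x06: "translation fault, level 2",
    0x07: "translation fault, level 3",
}


def decode_esr(esr):
    """Split an ESR_EL1 value into its fields (data-abort view of the ISS)."""
    ec = (esr >> 26) & 0x3F      # exception class, bits 31:26
    il = (esr >> 25) & 0x1       # instruction length bit, bit 25
    iss = esr & 0x1FFFFFF        # instruction specific syndrome, bits 24:0
    dfsc = iss & 0x3F            # data fault status code, ISS bits 5:0
    wnr = (iss >> 6) & 0x1       # write-not-read, ISS bit 6
    return {
        "ec": ec,
        "ec_name": EC_NAMES.get(ec, "unknown"),
        "il": il,
        "iss": iss,
        "dfsc": dfsc,
        "dfsc_name": DFSC_NAMES.get(dfsc, "unknown"),
        "access": "write" if wnr else "read",
    }


if __name__ == "__main__":
    for key, value in decode_esr(0x96000004).items():
        print(key, value)
```

Read this way, the value describes a data abort taken in the kernel itself, caused by a read that hit a level 0 translation fault. Together with far = 40, that would be consistent with a load through a NULL pointer plus a small offset in thread_init(), though the dump alone does not show which pointer it was.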




> On 06.01.2023 at 04:58, Mark Millard <marklmi@yahoo.com> wrote:
> 
> As a contrast, I've dd'd to microsd card media and booted:
> 
> FreeBSD-14.0-CURRENT-arm64-aarch64-RPI-20230101-231d75568f16-259905.img.xz <http://ftp3.freebsd.org/pub/FreeBSD/snapshots/ISO-IMAGES/14.0/FreeBSD-14.0-CURRENT-arm64-aarch64-RPI-20230101-231d75568f16-259905.img.xz>
> 
> 231d75568f16 is from:
> 
>   • Sat, 31 Dec 2022
>      . . .
>       • git: 231d75568f16 - main - Move INVLPG to pmap_quick_enter_page() from pmap_quick_remove_page(). Konstantin Belousov
> 
> So: the last listed for 2022-Dec-31.
> 
> I've booted on a couple of RPi4B's ("C0T" and "B0T" 8 GiByte
> ones as I remember). No boot crashes or such. You might want
> to test whether it crashes in your context. If it does not, then
> something more specific to your environment is involved.
> 
> I sometimes do rough/partial kernel "bisect" via materials from:
> 
> https://artifact.ci.freebsd.org/snapshot/main/?C=M&O=D
> 
> without having to build. For one, if I get a replication then my
> personal builds are not the source of whatever problem I'm
> looking into at the time. Otherwise . . .
> 
> 
> ===
> Mark Millard
> marklmi at yahoo.com
> 
>
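
The rough "bisect" over prebuilt snapshots that Mark describes comes down to a binary search over a date-ordered list of kernels. Below is a minimal Python sketch under stated assumptions: the list is ordered oldest to newest, the first entry is known to boot, the last is known to panic, and there is a single good-to-bad transition. The snapshot labels and function names are hypothetical placeholders, not real artifact names.

```python
# Hypothetical sketch of bisecting over prebuilt snapshots.
# Assumptions: revisions are ordered oldest -> newest, revisions[0] boots,
# revisions[-1] panics, and there is exactly one good -> bad transition.


def first_bad(revisions, is_bad):
    """Return the index of the first revision for which is_bad() is true."""
    lo, hi = 0, len(revisions) - 1   # lo is known good, hi is known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(revisions[mid]):
            hi = mid                 # the transition is at or before mid
        else:
            lo = mid                 # the transition is after mid
    return hi


def ask_user(revision):
    """Manual test step: boot the given kernel, then report the result."""
    answer = input("Did %s panic at boot? [y/n] " % revision)
    return answer.strip().lower().startswith("y")


# Usage (interactive), with made-up placeholder labels:
#   snapshots = ["snap-0", "snap-1", "snap-2", "snap-3"]
#   print(snapshots[first_bad(snapshots, ask_user)])
```

Each round halves the remaining range, so a window of a few dozen snapshots needs only a handful of test boots. If a snapshot kernel reproduces the panic, local build changes are ruled out as the cause.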