Re: ARM64 system error
- Reply: John F Carr : "Re: ARM64 system error"
- In reply to: John F Carr : "ARM64 system error"
Date: Wed, 03 Aug 2022 16:28:21 UTC
> On 31 Jul 2022, at 17:55, John F Carr <jfc@mit.edu> wrote:
>
> My OverDrive 1000 (Cortex A57) running CURRENT just crashed with the unhelpful message "panic: Unhandled System Error". Is there any way to get better information? The ESR value bf000000 translates to "system error with implementation-defined code 0" so that's not much use. The instruction associated with the interrupt can't fault ("subs w22, w22, #0x1") so it must be an asynchronous error. On other systems I've seen bits you can test or registers you can read to get details.
By my reading of the Cortex-A57 documentation [1], the ESR value indicates that the exception is attributable to the current core, is containable to a given code sequence, and is a decode error.
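For reference, here is a minimal sketch in C of how the ESR value splits into its architectural fields. The names (esr_fields, esr_decode, esr_is_serror) are made up for illustration and are not FreeBSD kernel interfaces. The field positions (EC in bits 31:26, IL in bit 25, ISS in bits 24:0, and for an SError the IDS flag in ISS bit 24) are the generic ARMv8-A layout; what the remaining 24 bits mean is implementation defined, so that part still has to be read off the table in [1].

```c
#include <stdint.h>

/* Exception class reported for an SError interrupt. */
#define ESR_EC_SERROR	0x2fu

/* Hypothetical container for the decoded fields; not a kernel type. */
struct esr_fields {
	uint32_t ec;		/* exception class, bits 31:26 */
	uint32_t il;		/* instruction length flag, bit 25 */
	uint32_t iss;		/* instruction specific syndrome, bits 24:0 */
	uint32_t ids;		/* SError only: ISS bit 24, syndrome is impl. defined */
	uint32_t impl_syndrome;	/* SError only: ISS bits 23:0, see the core's TRM */
};

/* Split a raw ESR_ELx value into its architectural fields. */
static struct esr_fields
esr_decode(uint32_t esr)
{
	struct esr_fields f;

	f.ec = (esr >> 26) & 0x3fu;
	f.il = (esr >> 25) & 0x1u;
	f.iss = esr & 0x01ffffffu;
	f.ids = (f.iss >> 24) & 0x1u;
	f.impl_syndrome = f.iss & 0x00ffffffu;
	return (f);
}

/* True if the exception class says this was an SError. */
static int
esr_is_serror(uint32_t esr)
{
	return (esr_decode(esr).ec == ESR_EC_SERROR);
}
```

Calling esr_decode(0xbf000000) gives EC 0x2f (SError), IL 1, IDS 1 and all of the low 24 bits clear, which matches the "implementation-defined code 0" reading in the original message.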
It’s likely caused by msk_phy_readreg accessing the PHY when the PHY doesn’t respond quickly enough.
Does an older kernel boot? If so, can you try bisecting to find which commit introduced the panic?
Andrew
[1] Bottom of https://developer.arm.com/documentation/ddi0488/h/system-control/aarch64-register-descriptions/exception-syndrome-register--el1-and-el3?lang=en
>
> x0: 0
> x1: ffff0000b55bd000 (crypto_dev + b3f34ec0)
> x2: 2880
> x3: 20
> x4: d3
> x5: 0
> x6: 100
> x7: ffff00011063daa0
> x8: ffff00000077218c (generic_bs_r_2 + 0)
> x9: 2880
> x10: ffff0000001ff9f4 (msk_phy_readreg + 84)
> x11: a0000045
> x12: 56000000
> x13: 5e4a6f28
> x14: ffff000000c4d038 (vnet_entry_ipport_stoprandom + 0)
> x15: ffffa000016b3000
> x16: 40ef9400
> x17: a
> x18: ffff0000b550e560 (crypto_dev + b3e86420)
> x19: ffff0000b57dc000 (crypto_dev + b4153ec0)
> x20: ffffa000029dc800
> x21: 2880
> x22: 3c4
> x23: 796d
> x24: ffffa000017f4100
> x25: ffff000000ad3da0 (miibus_readreg_desc + 0)
> x26: ffff000000bb6000 (vop_deallocate_desc + 28)
> x27: ffff000000e36980 (cc_cpu + 80)
> x28: ffff000000b1b828 (lock_class_mtx_sleep + 0)
> x29: ffff0000b550e670 (crypto_dev + b3e86530)
> sp: ffff0000b550e560
> lr: ffff0000001ff9f0 (msk_phy_readreg + 80)
> elr: ffff00000077806c (handle_el1h_irq + 8)
> spsr: a00002c5
> far: 0
> esr: bf000000
> panic: Unhandled System Error
> cpuid = 2
> time = 1659270153
> KDB: stack backtrace:
> db_trace_self() at db_trace_self
> db_trace_self_wrapper() at db_trace_self_wrapper+0x30
> vpanic() at vpanic+0x13c
> panic() at panic+0x44
> do_serror() at do_serror+0x40
> handle_serror() at handle_serror+0x38
> --- system error, esr 0xbf000000
> handle_el1h_irq() at handle_el1h_irq+0x8
> --- interrupt
> msk_phy_readreg() at msk_phy_readreg+0x84
> e1000phy_status() at e1000phy_status+0x114
> e1000phy_service() at e1000phy_service+0x420
> mii_tick() at mii_tick+0x50
> msk_tick() at msk_tick+0x44
> softclock_call_cc() at softclock_call_cc+0x128
> softclock_thread() at softclock_thread+0xc4
> fork_exit() at fork_exit+0x74
> fork_trampoline() at fork_trampoline+0x14
> KDB: enter: panic
> [ thread pid 2 tid 100026 ]
> Stopped at kdb_enter+0x44: undefined f907c27f
>