Re: FYI: aarch64 boot (HoneyComb): example crash during system checks (power-off/power-on form of reboot still fails)
- Reply: Mark Millard via arm : "Re: FYI: aarch64 boot (HoneyComb): example crash during system checks (power-off/power-on form of reboot still fails)"
- In reply to: Mark Millard via arm : "FYI: aarch64 boot (HoneyComb): example crash during system checks"
Date: Sun, 21 Nov 2021 19:36:08 UTC
On 2021-Nov-21, at 11:26, Mark Millard <marklmi@yahoo.com> wrote:
> Starting file system checks:
> /dev/gpt/CA72opt0EFI: 41 files, 242 MiB free (15469 clusters)
> FIXED
> /d x0: ffff000000e43ec8 (blocked_lock + 0)
> x1: ffff00013efa9f50
> x2: ffff00000090e39a (cam_status_table + 1d132)
> x3: deadc0d8
> x4: 0
> x5: ffff00000082a138 (data_abort + 0)
> x6: 5
> x7: 601
> x8: ffff000000e43ec8 (blocked_lock + 0)
> x9: deadc0de
> x10: 0
> x11: 3938700
> x12: 0
> x13: 8000
> x14: 1de
> x15: 81ce
> x16: 425b9080
> x17: 8000
> x18: ffff00013efa9f40
> x19: ffff000000e43ec8 (blocked_lock + 0)
> x20: ffffa0001a826000
> x21: 0
> x22: ffffa0001a826000
> x23: 0
> x24: ffff000000bed000 (queue_ops + 0)
> x25: 98967f
> x26: ffff000000e43ee0 (blocked_lock + 18)
> x27: 0
> x28: 114
> x29: ffff00013efa9f40
> sp: ffff00013efa9f40
> lr: ffff0000004b9028 (thread_lock_flags_ + c0)
> elr: ffff0000004b9028 (thread_lock_flags_ + c0)
> spsr: 2c5
> far: deadc178
> esr: 96000004
> timeout stopping cpus
> panic: data abort in critical section or under mutex
> cpuid = 5
> time = 1637492224
> KDB: stack backtrace:
> db_trace_self() at db_trace_self_wrapper+0x30
> pc = 0xffff000000807770 lr = 0xffff00000011d9ec
> sp = 0xffff00013efa9990 fp = 0xffff00013efa9b90
>
> db_trace_self_wrapper() at vpanic+0x188
> pc = 0xffff00000011d9ec lr = 0xffff0000004e1d10
> sp = 0xffff00013efa9ba0 fp = 0xffff00013efa9c00
>
> vpanic() at panic+0x44
> pc = 0xffff0000004e1d10 lr = 0xffff0000004e1b84
> sp = 0xffff00013efa9c10 fp = 0xffff00013efa9cc0
>
> panic() at data_abort+0x290
> pc = 0xffff0000004e1b84 lr = 0xffff00000082a3c8
> sp = 0xffff00013efa9cd0 fp = 0xffff00013efa9d50
>
> data_abort() at handle_el1h_sync+0x78
> pc = 0xffff00000082a3c8 lr = 0xffff00000080a078
> sp = 0xffff00013efa9d60 fp = 0xffff00013efa9eb0
>
> handle_el1h_sync() at thread_lock_flags_+0xbc
> pc = 0xffff00000080a078 lr = 0xffff0000004b9024
> sp = 0xffff00013efa9ec0 fp = 0xffff00013efa9f40
>
> thread_lock_flags_() at thread_lock_flags_+0xbc
> pc = 0xffff0000004b9024 lr = 0xffff0000004b9024
> sp = 0xffff00013efa9f50 fp = 0xffff00013efa9f60
>
> thread_lock_flags_() at sleepq_timeout+0x10
> pc = 0xffff0000004b9024 lr = 0xffff00000054b2a8
> sp = 0xffff00013efa9f70 fp = 0xffff00013efa9fb0
>
> sleepq_timeout() at softclock_call_cc+0x14c
> pc = 0xffff00000054b2a8 lr = 0xffff000000503134
> sp = 0xffff00013efa9fc0 fp = 0xffff00013efaa020
>
> softclock_call_cc() at callout_process+0x17c
> pc = 0xffff000000503134 lr = 0xffff000000502df0
> sp = 0xffff00013efaa030 fp = 0xffff00013efaa0a0
>
> callout_process() at handleevents+0x188
> pc = 0xffff000000502df0 lr = 0xffff00000045b42c
> sp = 0xffff00013efaa0b0 fp = 0xffff00013efaa100
>
> handleevents() at timercb+0x304
> pc = 0xffff00000045b42c lr = 0xffff00000045be7c
> sp = 0xffff00013efaa110 fp = 0xffff00013efaa170
>
> timercb() at arm_tmr_intr+0x5c
> pc = 0xffff00000045be7c lr = 0xffff0000007ff850
> sp = 0xffff00013efaa180 fp = 0xffff00013efaa1d0
>
> arm_tmr_intr() at intr_event_handle+0xac
> pc = 0xffff0000007ff850 lr = 0xffff000000493c54
> sp = 0xffff00013efaa1e0 fp = 0xffff00013efaa1e0
>
> intr_event_handle() at intr_isrc_dispatch+0x70
> pc = 0xffff000000493c54 lr = 0xffff0000007fb238
> sp = 0xffff00013efaa1f0 fp = 0xffff00013efaa230
>
> intr_isrc_dispatch() at arm_gic_v3_intr+0x11c
> pc = 0xffff0000007fb238 lr = 0xffff00000080ff34
> sp = 0xffff00013efaa240 fp = 0xffff00013efaa250
>
> arm_gic_v3_intr() at intr_irq_handler+0x7c
> pc = 0xffff00000080ff34 lr = 0xffff0000007faff0
> sp = 0xffff00013efaa260 fp = 0xffff00013efaa2b0
>
> intr_irq_handler() at handle_el1h_irq+0x74
> pc = 0xffff0000007faff0 lr = 0xffff00000080a140
> sp = 0xffff00013efaa2c0 fp = 0xffff00013efaa3f0
>
> handle_el1h_irq() at handle_el1h_sync+0x78
> pc = 0xffff00000080a140 lr = 0xffff00000080a078
> sp = 0xffff00013efaa400 fp = 0xffff00013efaa500
>
> handle_el1h_sync() at handle_el1h_sync+0x78
> pc = 0xffff00000080a078 lr = 0xffff00000080a078
> sp = 0xffff00013efaa510 fp = 0xffff00013efaa660
>
> handle_el1h_sync() at sched_switch+0x6a8
> pc = 0xffff00000080a078 lr = 0xffff0000005197fc
> sp = 0xffff00013efaa670 fp = 0xffff00013efaa6f0
>
> sched_switch() at sched_switch+0x6a8
> pc = 0xffff0000005197fc lr = 0xffff0000005197fc
> sp = 0xffff00013efaa700 fp = 0xffff00013efaa790
>
> sched_switch() at mi_switch+0xf4
> pc = 0xffff0000005197fc lr = 0xffff0000004f03a0
> sp = 0xffff00013efaa7a0 fp = 0xffff00013efaa7f0
>
> mi_switch() at sleepq_timedwait+0x28
> pc = 0xffff0000004f03a0 lr = 0xffff00000054bd0c
> sp = 0xffff00013efaa800 fp = 0xffff00013efaa830
>
> sleepq_timedwait() at _cv_timedwait_sbt+0x110
> pc = 0xffff00000054bd0c lr = 0xffff00000045e7b0
> sp = 0xffff00013efaa840 fp = 0xffff00013efaa850
>
> _cv_timedwait_sbt() at dbuf_evict_thread+0x410
> pc = 0xffff00000045e7b0 lr = 0xffff0000013ca59c
> sp = 0xffff00013efaa860 fp = 0xffff00013efaa8f0
>
> dbuf_evict_thread() at fork_exit+0x94
> pc = 0xffff0000013ca59c lr = 0xffff00000048fbf0
> sp = 0xffff00013efaa900 fp = 0xffff00013efaa950
>
> fork_exit() at fork_trampoline+0x10
> pc = 0xffff00000048fbf0 lr = 0xffff000000828ed8
> sp = 0xffff00013efaa960 fp = 0x0000000000000000
>
> KDB: enter: panic
> [ thread pid 26 tid 100194 ]
> Stopped at kdb_enter+0x48: undefined f906411f
>
> For reference:
>
> # uname -apKU
> FreeBSD CA72_16Gp_ZFS 13.0-STABLE FreeBSD 13.0-STABLE #13 stable/13-n248062-109330155000-dirty: Sat Nov 13 23:55:14 PST 2021 root@CA72_16Gp_ZFS:/usr/obj/BUILDs/13S-CA72-nodbg-clang/usr/13S-src/arm64.aarch64/sys/GENERIC-NODBG-CA72 arm64 aarch64 1300520 1300520
>
> It is a root-on-ZFS context on Optane media in the PCIe slot.
>
> I've no clue if this will repeat. I've never gotten
> this before.
The reboot attempt produced the following panic, this time
with zthr_procedure in the backtrace instead of dbuf_evict_thread.
Starting file system checks:
/dev/gpt/CA72opt0EFI: FILESYSTEM CLEAN; SKIPPING CHECKS
x0: ffff000000e43ec8 (blocked_lock + 0)
x1: ffff00013ef9ff90
x2: ffff00000090e39a (cam_status_table + 1d132)
x3: deadc0d8
x4: 0
x5: ffff00000082a138 (data_abort + 0)
x6: 9
x7: 601
x8: ffff000000e43ec8 (blocked_lock + 0)
x9: deadc0de
x10: 0
x11: 3938700
x12: 1
x13: 8000
x14: 1ee
x15: 81cd
x16: 425b9080
x17: 8000
x18: ffff00013ef9ff80
x19: ffff000000e43ec8 (blocked_lock + 0)
x20: ffffa0000c011000
x21: 0
x22: ffffa0000c011000
x23: 0
x24: ffff000000bed000 (queue_ops + 0)
x25: 98967f
x26: ffff000000e43ee0 (blocked_lock + 18)
x27: 0
x28: 114
x29: ffff00013ef9ff80
sp: ffff00013ef9ff80
lr: ffff0000004b9028 (thread_lock_flags_ + c0)
elr: ffff0000004b9028 (thread_lock_flags_ + c0)
spsr: 2c5
far: deadc178
esr: 96000004
timeout stopping cpus
panic: data abort in critical section or under mutex
cpuid = 9
time = 1637492224
KDB: stack backtrace:
db_trace_self() at db_trace_self_wrapper+0x30
pc = 0xffff000000807770 lr = 0xffff00000011d9ec
sp = 0xffff00013ef9f9d0 fp = 0xffff00013ef9fbd0
db_trace_self_wrapper() at vpanic+0x188
pc = 0xffff00000011d9ec lr = 0xffff0000004e1d10
sp = 0xffff00013ef9fbe0 fp = 0xffff00013ef9fc40
vpanic() at panic+0x44
pc = 0xffff0000004e1d10 lr = 0xffff0000004e1b84
sp = 0xffff00013ef9fc50 fp = 0xffff00013ef9fd00
panic() at data_abort+0x290
pc = 0xffff0000004e1b84 lr = 0xffff00000082a3c8
sp = 0xffff00013ef9fd10 fp = 0xffff00013ef9fd90
data_abort() at handle_el1h_sync+0x78
pc = 0xffff00000082a3c8 lr = 0xffff00000080a078
sp = 0xffff00013ef9fda0 fp = 0xffff00013ef9fef0
handle_el1h_sync() at thread_lock_flags_+0xbc
pc = 0xffff00000080a078 lr = 0xffff0000004b9024
sp = 0xffff00013ef9ff00 fp = 0xffff00013ef9ff80
thread_lock_flags_() at thread_lock_flags_+0xbc
pc = 0xffff0000004b9024 lr = 0xffff0000004b9024
sp = 0xffff00013ef9ff90 fp = 0xffff00013ef9ffa0
thread_lock_flags_() at sleepq_timeout+0x10
pc = 0xffff0000004b9024 lr = 0xffff00000054b2a8
sp = 0xffff00013ef9ffb0 fp = 0xffff00013ef9fff0
sleepq_timeout() at softclock_call_cc+0x14c
pc = 0xffff00000054b2a8 lr = 0xffff000000503134
sp = 0xffff00013efa0000 fp = 0xffff00013efa0060
softclock_call_cc() at callout_process+0x17c
pc = 0xffff000000503134 lr = 0xffff000000502df0
sp = 0xffff00013efa0070 fp = 0xffff00013efa00e0
callout_process() at handleevents+0x188
pc = 0xffff000000502df0 lr = 0xffff00000045b42c
sp = 0xffff00013efa00f0 fp = 0xffff00013efa0140
handleevents() at timercb+0x304
pc = 0xffff00000045b42c lr = 0xffff00000045be7c
sp = 0xffff00013efa0150 fp = 0xffff00013efa01b0
timercb() at arm_tmr_intr+0x5c
pc = 0xffff00000045be7c lr = 0xffff0000007ff850
sp = 0xffff00013efa01c0 fp = 0xffff00013efa0210
arm_tmr_intr() at intr_event_handle+0xac
pc = 0xffff0000007ff850 lr = 0xffff000000493c54
sp = 0xffff00013efa0220 fp = 0xffff00013efa0220
intr_event_handle() at intr_isrc_dispatch+0x70
pc = 0xffff000000493c54 lr = 0xffff0000007fb238
sp = 0xffff00013efa0230 fp = 0xffff00013efa0270
intr_isrc_dispatch() at arm_gic_v3_intr+0x11c
pc = 0xffff0000007fb238 lr = 0xffff00000080ff34
sp = 0xffff00013efa0280 fp = 0xffff00013efa0290
arm_gic_v3_intr() at intr_irq_handler+0x7c
pc = 0xffff00000080ff34 lr = 0xffff0000007faff0
sp = 0xffff00013efa02a0 fp = 0xffff00013efa02f0
intr_irq_handler() at handle_el1h_irq+0x74
pc = 0xffff0000007faff0 lr = 0xffff00000080a140
sp = 0xffff00013efa0300 fp = 0xffff00013efa0430
handle_el1h_irq() at handle_el1h_sync+0x78
pc = 0xffff00000080a140 lr = 0xffff00000080a078
sp = 0xffff00013efa0440 fp = 0xffff00013efa0540
handle_el1h_sync() at handle_el1h_sync+0x78
pc = 0xffff00000080a078 lr = 0xffff00000080a078
sp = 0xffff00013efa0550 fp = 0xffff00013efa06a0
handle_el1h_sync() at sched_switch+0x6a8
pc = 0xffff00000080a078 lr = 0xffff0000005197fc
sp = 0xffff00013efa06b0 fp = 0xffff00013efa0730
sched_switch() at sched_switch+0x6a8
pc = 0xffff0000005197fc lr = 0xffff0000005197fc
sp = 0xffff00013efa0740 fp = 0xffff00013efa07d0
sched_switch() at mi_switch+0xf4
pc = 0xffff0000005197fc lr = 0xffff0000004f03a0
sp = 0xffff00013efa07e0 fp = 0xffff00013efa0830
mi_switch() at sleepq_timedwait+0x28
pc = 0xffff0000004f03a0 lr = 0xffff00000054bd0c
sp = 0xffff00013efa0840 fp = 0xffff00013efa0870
sleepq_timedwait() at _cv_timedwait_sbt+0x110
pc = 0xffff00000054bd0c lr = 0xffff00000045e7b0
sp = 0xffff00013efa0880 fp = 0xffff00013efa0890
_cv_timedwait_sbt() at zthr_procedure+0x20c
pc = 0xffff00000045e7b0 lr = 0xffff0000014e2fb0
sp = 0xffff00013efa08a0 fp = 0xffff00013efa08f0
zthr_procedure() at fork_exit+0x94
pc = 0xffff0000014e2fb0 lr = 0xffff00000048fbf0
sp = 0xffff00013efa0900 fp = 0xffff00013efa0950
fork_exit() at fork_trampoline+0x10
pc = 0xffff00000048fbf0 lr = 0xffff000000828ed8
sp = 0xffff00013efa0960 fp = 0x0000000000000000
KDB: enter: panic
[ thread pid 26 tid 100192 ]
Stopped at kdb_enter+0x48: undefined f906411f
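For anyone reading the two register dumps: both report the same esr, far, x3 and x9 values, so a small decoding sketch may help. The Python below is only an illustration. The field layout (EC in bits 31:26, IL in bit 25, ISS in bits 24:0, with WnR in bit 6 and DFSC in bits 5:0 for data aborts) follows the Arm architecture definition of ESR_EL1. The function names and the small lookup tables are made up for this example and cover only the values seen here. Reading 0xdeadc0de as a marker value rather than a real pointer is an assumption on my part; the arithmetic is consistent with that reading but does not prove it.

```python
# Values copied from the register dumps above (identical in both panics).
ESR = 0x96000004
FAR = 0xDEADC178
X3 = 0xDEADC0D8
X9 = 0xDEADC0DE

# Hypothetical lookup tables: only a small subset of the architectural
# encodings, enough for the values in these dumps.
EC_NAMES = {
    0x24: "data abort from a lower exception level",
    0x25: "data abort without a change in exception level",
}
DFSC_NAMES = {
    0x04: "translation fault, level 0",
    0x05: "translation fault, level 1",
    0x06: "translation fault, level 2",
    0x07: "translation fault, level 3",
}


def decode_esr(esr):
    """Split an ESR_EL1 value into its EC, IL and ISS fields."""
    return {
        "ec": (esr >> 26) & 0x3F,  # exception class, bits [31:26]
        "il": (esr >> 25) & 0x1,   # instruction length, bit 25
        "iss": esr & 0x1FFFFFF,    # instruction specific syndrome, bits [24:0]
    }


def decode_data_abort_iss(iss):
    """Pull the write-not-read bit and fault status code out of a data abort ISS."""
    return {
        "wnr": (iss >> 6) & 0x1,   # 0 = read access, 1 = write access
        "dfsc": iss & 0x3F,        # data fault status code, bits [5:0]
    }


if __name__ == "__main__":
    fields = decode_esr(ESR)
    abort = decode_data_abort_iss(fields["iss"])
    print("EC   = %#x (%s)" % (fields["ec"], EC_NAMES.get(fields["ec"], "not in table")))
    print("DFSC = %#x (%s)" % (abort["dfsc"], DFSC_NAMES.get(abort["dfsc"], "not in table")))
    print("access =", "write" if abort["wnr"] else "read")
    # x3 equals x9 with its low three bits cleared.
    print("x9 & ~7 == x3:", (X9 & ~0x7) == X3)
    # The faulting address is a small offset above x3.
    print("far - x3 = %#x" % (FAR - X3))
```

On that reading, the abort is a read that faulted at a small offset above x3, and x3 is x9 (0xdeadc0de) with its low three bits cleared. Whether that offset corresponds to a field of a particular kernel structure is not something the dumps alone establish.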
Note:
All this started after a stress-based I/O hang-up test
that required a forced reboot.
===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went
away in early 2018-Mar)