Panic with mca trap
John Baldwin
jhb at freebsd.org
Thu Feb 3 13:57:07 UTC 2011
On Tuesday, February 01, 2011 11:58:12 am mdf at freebsd.org wrote:
> On a piece of hardware trying to verify basic build tests, we got an
> MCA exception that then panic'd the kernel due to WITNESS/INVARIANTS
> interaction.
>
> panic @ time 1296563157.510, thread 0xffffff0005540000: blockable
> sleep lock (sleep mutex) 128 @ /build/mnt/src/sys/vm/uma_core.c:1872
>
> Stack: --------------------------------------------------
> kernel:witness_checkorder+0x7a2
> kernel:_mtx_lock_flags+0x81
> kernel:uma_zalloc_arg+0x256
> kernel:malloc+0xc5
> kernel:mca_record_entry+0x30
> kernel:mca_scan+0xc9
> kernel:mca_intr+0x79
> kernel:trap+0x30b
> kernel:witness_checkorder+0x66
> kernel:_mtx_lock_spin_flags+0xa4
> kernel:witness_checkorder+0x2a8
> kernel:_mtx_lock_spin_flags+0xa4
> kernel:tdq_idled+0xe8
> kernel:sched_idletd+0x5b
> kernel:fork_exit+0x9b
>
> That's this bit of code in uma_zalloc_arg():
>
> #ifdef INVARIANTS
> ZONE_LOCK(zone);
> uma_dbg_alloc(zone, NULL, item);
> ZONE_UNLOCK(zone);
> #endif
>
>
> I don't know uma(9) well enough to know the best workaround. Clearly
> there are times we can be in uma_zalloc_arg() and taking a regular
> mutex is not acceptable. But what to do for the uma_dbg_free() call
> so it's happy, and whether to guard taking the ZONE lock with M_NOWAIT
> or td_critnest > 0 or both is outside my current knowledge.
>
> I don't expect we'll see this panic again any time soon, but it would
> be nice to fix the story for WITNESS of when an M_NOWAIT allocation
> can be done.

Actually, this is more my fault. The machine check happened while the
interrupted thread was already in a critical section (hence the WITNESS
complaint). However, it really isn't correct to be calling malloc() from an
arbitrary exception handler, especially one like MC# which can fire pretty
much anywhere. I think instead that we should use malloc() only when polling
the machine check banks, and keep a pre-allocated pool of structures for use
with MC# exceptions and CMC interrupts, replenishing that pool via an
asynchronous task.
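
Roughly, the pool idea could look like the userland sketch below. This is
not the actual mca code: struct rec, pool_get(), pool_refill(), POOL_TARGET
and refill_wanted are all made-up names for illustration, and the sketch is
single-threaded, whereas in the kernel the free list would need to be
protected by something that is safe to take from an exception handler.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical record type; stands in for whatever the handler logs. */
struct rec {
	struct rec *next;
	unsigned long status;
};

#define POOL_TARGET 4		/* assumed pool size, chosen arbitrarily */

static struct rec *pool_head;	/* free list of pre-allocated records */
static int pool_count;		/* number of records on the free list */
static int refill_wanted;	/* set by pool_get(), cleared by pool_refill() */

/*
 * Exception-context path: never calls malloc().  Returns NULL when the
 * pool is empty, in which case the caller has to drop the event.
 */
struct rec *
pool_get(void)
{
	struct rec *r = pool_head;

	if (r != NULL) {
		pool_head = r->next;
		r->next = NULL;
		pool_count--;
	}
	refill_wanted = 1;	/* ask the deferred task to top the pool up */
	return (r);
}

/*
 * Deferred path: meant to run from a context where allocating is safe.
 * Tops the pool back up to POOL_TARGET and returns how many were added.
 */
int
pool_refill(void)
{
	int added = 0;

	assert(pool_count >= 0);
	while (pool_count < POOL_TARGET) {
		struct rec *r = malloc(sizeof(*r));

		if (r == NULL)
			break;		/* try again on the next refill */
		r->status = 0;
		r->next = pool_head;
		pool_head = r;
		pool_count++;
		added++;
	}
	refill_wanted = 0;
	return (added);
}
```

The handler side would only ever call pool_get(), and whatever stands in
for the asynchronous task would call pool_refill() once refill_wanted is
set. An empty pool means a lost record rather than an allocation from the
exception handler.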
--
John Baldwin
More information about the freebsd-current
mailing list