[stable 9] panic on reboot: ipmi_wd_event()

Attilio Rao attilio at freebsd.org
Tue Jul 31 20:51:21 UTC 2012


On 7/31/12, John Baldwin <jhb at freebsd.org> wrote:
> On Thursday, July 19, 2012 7:58:14 pm Sean Bruno wrote:
>> Working on the Dell R420 today, got most of it working, even the
>> broadcom ethernet cards!  However, I get the following when I reboot the
>> system:
>>
>> Syncing disks, vnodes remaining...4 Sleeping thread (tid 100107, pid 9)
>> owns a non-sleepable lock
>> KDB: stack backtrace of thread 100107:
>> sched_switch() at sched_switch+0x19f
>> mi_switch() at mi_switch+0x208
>> sleepq_switch() at sleepq_switch+0xfc
>> sleepq_wait() at sleepq_wait+0x4d
>> _sleep() at _sleep+0x3f6
>> ipmi_submit_driver_request() at ipmi_submit_driver_request+0x97
>> ipmi_set_watchdog() at ipmi_set_watchdog+0xb1
>> ipmi_wd_event() at ipmi_wd_event+0x8f
>> kern_do_pat() at kern_do_pat+0x10f
>> sched_sync() at sched_sync+0x1ea
>> fork_exit() at fork_exit+0x135
>> fork_trampoline() at fork_trampoline+0xe
>
> Hmmm, the watchdog pat should probably happen without holding locks if
> possible.  This is related to the IPMI watchdog being special and wanting
> to schedule a thread to work.

The watchdog pat without the locks is not easy to do because we
register the watchdog callbacks in eventhandlers, which are indeed
locked (and you may also end up racing against watchdog detach, if you
don't use any lock at all).
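
To make the constraint concrete, here is a minimal userland sketch (pthreads, not kernel code) of one way a callback that runs under the handler-list lock can avoid sleeping: it only records the request and wakes a worker thread, which does the blocking work with no handler lock held. Every name in it (handler_lock, wd_event, wd_worker, watchdog_pat, wd_stop, slow_hw_request) is hypothetical and chosen for illustration. It is neither the FreeBSD eventhandler API nor the actual ipmi(4) code, and it is not a proposed fix.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <unistd.h>

/* Userland model only: all names here are hypothetical, not FreeBSD APIs. */

/* Models the lock held while registered watchdog callbacks are invoked. */
static pthread_mutex_t handler_lock = PTHREAD_MUTEX_INITIALIZER;

/* Protects the request state shared between the callback and the worker. */
static pthread_mutex_t req_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t req_cv = PTHREAD_COND_INITIALIZER;
static bool req_pending = false;
static bool shutting_down = false;
static unsigned req_timeout = 0;
static unsigned applied_timeout = 0;  /* last value "sent to the hardware" */
static unsigned applied_count = 0;    /* number of requests completed */

/* Stand-in for a slow, blocking request to the management controller. */
static void slow_hw_request(void)
{
    usleep(1000);
}

/* Worker thread: performs the blocking request with no handler lock held. */
static void *wd_worker(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&req_lock);
    for (;;) {
        while (!req_pending && !shutting_down)
            pthread_cond_wait(&req_cv, &req_lock);
        if (!req_pending)
            break;                    /* shutting down, nothing left to do */
        unsigned t = req_timeout;
        req_pending = false;
        pthread_mutex_unlock(&req_lock);
        slow_hw_request();            /* blocking is fine in this context */
        pthread_mutex_lock(&req_lock);
        applied_timeout = t;
        applied_count++;
    }
    pthread_mutex_unlock(&req_lock);
    return NULL;
}

/* Callback: invoked with handler_lock held, so it must never wait on hardware. */
static void wd_event(unsigned timeout)
{
    pthread_mutex_lock(&req_lock);    /* short critical section only */
    req_timeout = timeout;
    req_pending = true;
    pthread_cond_signal(&req_cv);
    pthread_mutex_unlock(&req_lock);
}

/* Models the pat path: callbacks run under the handler-list lock. */
static void watchdog_pat(unsigned timeout)
{
    pthread_mutex_lock(&handler_lock);
    wd_event(timeout);
    pthread_mutex_unlock(&handler_lock);
}

/* Models detach: let the worker drain any pending request, then join it. */
static void wd_stop(pthread_t worker)
{
    pthread_mutex_lock(&req_lock);
    shutting_down = true;
    pthread_cond_signal(&req_cv);
    pthread_mutex_unlock(&req_lock);
    int rc = pthread_join(worker, NULL);
    assert(rc == 0);
    (void)rc;
}
```

A caller would start wd_worker with pthread_create(), call watchdog_pat() as often as needed, and finish with wd_stop(). Note what the sketch gives up: the caller of watchdog_pat() no longer knows whether the hardware accepted the new timeout, and back-to-back pats can coalesce into a single request.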

There is a similar issue when you enter DDB or take a coredump, for
example, but that is somewhat collateral, given the "after-panic" nature
of the situation. We should seriously look into the requirements for
watchdog patting, and possibly for entering DDB, outline the correct
semantics, and refactor the code to follow them.

Attilio


-- 
Peace can only be achieved by understanding - A. Einstein

