[Bug 264549] HardenedBSD: panic: e1000/bridge: "sleep on wchan ... with sleeping prohibited"

From: <bugzilla-noreply_at_freebsd.org>
Date: Tue, 28 Nov 2023 10:33:57 UTC
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=264549

Zhenlei Huang <zlei@FreeBSD.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |mmacy@FreeBSD.org

--- Comment #2 from Zhenlei Huang <zlei@FreeBSD.org> ---
(In reply to mfbott from comment #0)
> I've solved this issue by replacing "pause" with "DELAY" inside the e1000 driver
> (see attachment).
> This certainly works for me, but could point to a deeper problem within anything
> bridge-related. (I couldn't reproduce this panic outside a bridge.)

IIUC this issue comes from calling _sleep() inside a section entered with
epoch_enter_preempt().

bridge_linkstate() enters the net epoch and then calls bstp_linkstate(), which
eventually reaches e1000_write_phy_reg_mdic() / pause() / _sleep().

As per EPOCH(9),
        EPOCH_PREEMPT
                     The epoch will allow preemption during sections. Only
                     non-sleepable locks may be acquired during a preemptible
                     epoch. The functions epoch_enter_preempt(),
                     epoch_exit_preempt(), and epoch_wait_preempt() must be
                     used in place of epoch_enter(), epoch_exit(), and
                     epoch_wait(), respectively.

it is wrong to sleep within the net epoch (a preemptible epoch).
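
To make the interaction concrete, here is a minimal userland sketch of it in
C. It is only a model, not kernel code: every model_* name is hypothetical,
and the single counter merely stands in for the per-thread "sleeping
prohibited" state that, as far as I understand, the kernel checks before
sleeping. A sleeping wait fails inside the modelled epoch section, while a
busy wait (the DELAY approach from comment #0) does not.

```c
/* Userland model only, not kernel code. All model_* names are hypothetical. */
#include <assert.h>
#include <stdbool.h>

/* Stands in for the per-thread "sleeping prohibited" counter. */
static int model_no_sleeping;

/* Entering a preemptible epoch section forbids sleeping. */
static void model_epoch_enter_preempt(void)
{
    model_no_sleeping++;
}

/* Leaving the section lifts the restriction again. */
static void model_epoch_exit_preempt(void)
{
    assert(model_no_sleeping > 0);
    model_no_sleeping--;
}

/* A sleeping wait, like pause(): false marks where the kernel would panic. */
static bool model_pause(void)
{
    return model_no_sleeping == 0;
}

/* A busy wait, like DELAY(): it never sleeps, so it is always allowed. */
static bool model_delay(void)
{
    return true;
}

/* Models a PHY register write that has to wait for the hardware. */
static bool model_phy_write(bool use_busy_wait)
{
    return use_busy_wait ? model_delay() : model_pause();
}

/* Models the link state path: the PHY write runs inside the epoch section. */
static bool model_bridge_linkstate(bool use_busy_wait)
{
    model_epoch_enter_preempt();
    bool ok = model_phy_write(use_busy_wait);
    model_epoch_exit_preempt();
    return ok;
}
```

In this model, model_phy_write(false) succeeds when called on its own, but
model_bridge_linkstate(false) reports failure because the same sleeping wait
now runs inside the epoch section. That matches the observation that the
panic could only be reproduced with a bridge.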


> There's possibly a better solution in which we allow "pause" under some
> circumstances (unlike my blanket replacement), but I haven't found one.

There's also a problem report by Jean-Sébastien in
https://reviews.freebsd.org/D14984 which has the same cause.

CC the author @Matt Macy
