Re: Request for testing - firmware crash, wpa, locking

From: Kevin Oberman <rkoberman_at_gmail.com>
Date: Mon, 03 Apr 2023 23:18:20 UTC
On Fri, Mar 31, 2023 at 1:09 PM Bjoern A. Zeeb <bz@freebsd.org> wrote:

> Hi,
>
> (1) Colin has fixed the panic (after the firmware crash) that so many
> people keep seeing.  This may mean that you can now (contrary to
> before) try a
>         service netif restart wlan0
> to recover from such a crash.  This change made it all the way to 13.2.
> I am still very pre-occupied with real life but I am hoping that I can
> get a possible fix tested and pushed during my Easter "holidays".
>
>
> (2) Along with enweiwu and cy the "startup problem" showing as
> "CTRL-EVENT-SCAN-FAILED" was debugged a bit more.  We've for now
> backed out the rc startup script change and restored the old behaviour
> of wpa_supplicant with additional logging.  This change is currently in
> main only but will most likely be MFCed to stable/13 in the next week.
> If you etcupdate (mergemaster) and pull that change in I would kindly
> ask you to turn on debugging for wpa_supplicant and check if you see
> any log lines including "(changed)" [beware, most should be "(no
> change)"] along with IFF_UP in the line.  If so please contact me.
> I still have a hypothesis that we may simply exploit a race in net80211
> there which will need better fixing.
>
>
> (3) if you are using iwlwifi (or rtw88) I just pushed some locking
> changes into main.  I would appreciate it if you could test and let me know
> if there are any new regressions (they do not fix the firmware crash
> from (1) yet!).
>
> Lots of health,
> /bz
>
> --
> Bjoern A. Zeeb                                                     r15:7
>
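
For the check requested in (2), here is a minimal sketch that scans a saved wpa_supplicant debug log for lines mentioning IFF_UP together with "(changed)". The exact log format is not shown in this thread, so plain substring matching is an assumption, and the function name find_changed_lines is just an illustrative choice.

```python
import sys


def find_changed_lines(lines):
    """Return lines that mention both IFF_UP and "(changed)".

    Assumption: both markers appear as plain substrings in the debug
    output. Lines with "(no change)" are the expected case and are
    skipped, since they do not contain "(changed)".
    """
    hits = []
    for line in lines:
        if "IFF_UP" in line and "(changed)" in line:
            hits.append(line.rstrip("\n"))
    return hits


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python3 script.py /path/to/saved/debug.log
    with open(sys.argv[1], errors="replace") as fh:
        for hit in find_changed_lines(fh):
            print(hit)
```

Any line it prints would be one of the cases worth reporting.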

Bjoern,

Just updated to the latest main and received a previously unseen lock
order reversal on my iwlwifi when the network came up. This did not prevent
the network from starting normally.

FreeBSD 14.0-CURRENT #9 main-n261962-41236539d8dd-dirty: Mon Apr  3 13:06:31 PDT 2023

lock order reversal: (sleepable after non-sleepable)
 1st 0xfffffe01466a0020 iwlwifi0_com_lo (iwlwifi0_com_lo, sleep mutex) @ /usr/src/sys/net80211/ieee80211_ioctl.c:3552
 2nd 0xffffffff81fa9ce0 rtnl cloner lock (rtnl cloner lock, sx) @ /usr/src/sys/netlink/route/iface.c:306
lock order iwlwifi0_com_lo -> rtnl cloner lock attempted at:
#0 0xffffffff80c61093 at witness_checkorder+0xbb3
#1 0xffffffff80bfb5b7 at _sx_slock_int+0x67
#2 0xffffffff80e58241 at dump_iface+0x501
#3 0xffffffff80e578cb at rtnl_handle_ifevent+0xab
#4 0xffffffff80d70e75 at ieee80211_notify_ifnet_change+0x65
#5 0xffffffff80d9c29f at ieee80211_start_locked+0x6f
#6 0xffffffff80d7fd56 at ieee80211_ioctl+0x356
#7 0xffffffff80d1d2d5 at ifhwioctl+0xe05
#8 0xffffffff80d1ecd5 at ifioctl+0x925
#9 0xffffffff80c66cee at kern_ioctl+0x1fe
#10 0xffffffff80c66a84 at sys_ioctl+0x154
#11 0xffffffff810e54f0 at amd64_syscall+0x140
#12 0xffffffff810b8b7b at fast_syscall_common+0xf8
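
In case it helps anyone comparing several of these reports, here is a minimal sketch that pulls the two lock entries out of a report like the one above. The field layout is inferred only from the two "1st"/"2nd" lines in this message, not from the witness(4) source, so the regular expression is an assumption. The unwrap helper handles a mail client breaking the line after the "@".

```python
import re

# Assumed layout: "<1st|2nd> <addr> <name> (<class>, <type>) @ <file>:<line>"
LOCK_RE = re.compile(
    r"^\s*(1st|2nd)\s+(0x[0-9a-f]+)\s+(.+?)\s+\((.+?),\s*(.+?)\)"
    r"\s+@\s+(\S+):(\d+)$"
)


def unwrap(lines):
    """Rejoin entries that were wrapped right after the '@'."""
    out = []
    for line in lines:
        line = line.rstrip()
        if out and out[-1].endswith("@"):
            out[-1] = out[-1] + " " + line.strip()
        else:
            out.append(line)
    return out


def parse_locks(text):
    """Return one dict per '1st'/'2nd' lock line found in text."""
    locks = []
    for line in unwrap(text.splitlines()):
        m = LOCK_RE.match(line)
        if m:
            order, addr, name, _cls, kind, path, lineno = m.groups()
            locks.append({"order": order, "addr": addr, "name": name,
                          "type": kind, "file": path, "line": int(lineno)})
    return locks


# The two lock lines from the report above, as wrapped in the mail.
REPORT = """\
 1st 0xfffffe01466a0020 iwlwifi0_com_lo (iwlwifi0_com_lo, sleep mutex) @
/usr/src/sys/net80211/ieee80211_ioctl.c:3552
 2nd 0xffffffff81fa9ce0 rtnl cloner lock (rtnl cloner lock, sx) @
/usr/src/sys/netlink/route/iface.c:306
"""

for lock in parse_locks(REPORT):
    print(lock["order"], lock["name"], lock["type"],
          f'{lock["file"]}:{lock["line"]}')
```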

I have done no further testing, but everything seems to be operating fine.

If there is further information I can provide, just let me know.

Thanks!
-- 
Kevin Oberman, Part time kid herder and retired Network Engineer
E-mail: rkoberman@gmail.com
PGP Fingerprint: D03FB98AFA78E3B78C1694B318AB39EF1B055683