Re: rtw88 (8822be, 8821ce) - if you have one of these please join in [was: ..]

From: Bjoern A. Zeeb <bz_at_freebsd.org>
Date: Sun, 29 Mar 2026 19:02:27 UTC
On Fri, 6 Mar 2026, Oleg Nauman wrote:

Hi Oleg, everyone else with one of the mentioned chipsets,

> On Thu, Mar 5, 2026 at 3:59 PM Bjoern A. Zeeb <bz@freebsd.org> wrote:
>>
>> In the last 48 hours I pushed the next round of LinuxKPI 802.11
>> changes which would be good to validate on a wide set of chipsets
>> and supported drivers.  One of them fixed a firmware crash, another
>> one re-enabled some code again which was silently not doing its things
>> due to Linux KPI changes (and surprisingly most of it worked anyway the
>> last months).
>
> Unfortunately I am observing regression so my
>
> rtw880@pci0:2:0:0:      class=0x028000 rev=0x00 hdr=0x00 vendor=0x10ec
> device=0xc821 subvendor=0x1a3b subdevice=0x3040
>    vendor     = 'Realtek Semiconductor Co., Ltd.'
>    device     = 'RTL8821CE 802.11ac PCIe Wireless Network Adapter'
>    class      = network
>
> can't connect to 2.4 Ghz access point again

So given that rtw88 8822be and 8821ce are mostly what is standing between
me and pushing further changes, I had a closer look this week.

There are two problems:

(a) switching from a (rejected) HW scan to a SW scan causes problems as
     it can race with the scan_to_auth state change due to the nature
     of the compat code and some extra locking assertions in net80211,
     which make it hard to get at least half of it race free from the start.
     I tried to work around all this only to also observe the next item.

(b) more so, rtw8821c_do_iqk() (for 8821ce) fails a lot of the time here,
     which leads to a *cough* 6s delay between starting mgd_prepare_tx()
     and getting the actual frame out.
     There are moments when it works, in 20-40/120-150 ms but I have not yet
     looked into the differences (what leads to one failing and one
     working).

     If you turn on rtw88 PHY debugging (0x8 on FreeBSD or Linux),
     you can see lines similar to these (slightly adjusted here currently).
     Where the counter is 300, the (300 * 20ms) timeout failures happened;
     the lines with rf_reg=0xabcde (n_iqk_fail(mask)==0x00000) are the
     ones where things seem to have worked.

[40558.969218] iqk counter=7   reload=0 do_iqk_cnt=154   n_iqk_fail(mask)=0x00000 rf_reg=0xabcde
[40586.004212] iqk counter=300 reload=0 do_iqk_cnt=155   n_iqk_fail(mask)=0x000ea rf_reg=0xaeaea
[41325.466166] iqk counter=6   reload=0 do_iqk_cnt=156   n_iqk_fail(mask)=0x00000 rf_reg=0xabcde
[41804.767148] iqk counter=300 reload=0 do_iqk_cnt=157   n_iqk_fail(mask)=0x000ea rf_reg=0xaeaea
[41826.729151] iqk counter=300 reload=0 do_iqk_cnt=158   n_iqk_fail(mask)=0x000ea rf_reg=0xaeaea
[42751.157067] iqk counter=6   reload=0 do_iqk_cnt=159   n_iqk_fail(mask)=0x00000 rf_reg=0xabcde
[42778.272059] iqk counter=300 reload=0 do_iqk_cnt=160   n_iqk_fail(mask)=0x000ea rf_reg=0xaeaea
[43758.862997] iqk counter=6   reload=0 do_iqk_cnt=161   n_iqk_fail(mask)=0x00000 rf_reg=0xabcde
[43786.057993] iqk counter=300 reload=0 do_iqk_cnt=162   n_iqk_fail(mask)=0x000ea rf_reg=0xaeaea
[44722.416953] iqk counter=6   reload=0 do_iqk_cnt=163   n_iqk_fail(mask)=0x00000 rf_reg=0xabcde
[45505.841872] iqk counter=300 reload=0 do_iqk_cnt=164   n_iqk_fail(mask)=0x000ea rf_reg=0xaeaea
[45528.128872] iqk counter=300 reload=0 do_iqk_cnt=165   n_iqk_fail(mask)=0x000ea rf_reg=0xaeaea
[46350.516810] iqk counter=6   reload=0 do_iqk_cnt=166   n_iqk_fail(mask)=0x00000 rf_reg=0xabcde
[46895.590359] iqk counter=300 reload=0 do_iqk_cnt=167   n_iqk_fail(mask)=0x000ea rf_reg=0xaeaea
[46916.526778] iqk counter=300 reload=0 do_iqk_cnt=168   n_iqk_fail(mask)=0x000ea rf_reg=0xaeaea
[48190.225691] iqk counter=6   reload=0 do_iqk_cnt=169   n_iqk_fail(mask)=0x00000 rf_reg=0xabcde
[48217.376689] iqk counter=300 reload=0 do_iqk_cnt=170   n_iqk_fail(mask)=0x000ea rf_reg=0xaeaea
[49406.787608] iqk counter=6   reload=0 do_iqk_cnt=171   n_iqk_fail(mask)=0x00000 rf_reg=0xabcde
[49433.853606] iqk counter=300 reload=0 do_iqk_cnt=172   n_iqk_fail(mask)=0x000ea rf_reg=0xaeaea
[50226.350550] iqk counter=7   reload=0 do_iqk_cnt=173   n_iqk_fail(mask)=0x00000 rf_reg=0xabcde
[51510.819466] iqk counter=300 reload=0 do_iqk_cnt=174   n_iqk_fail(mask)=0x000ea rf_reg=0xaeaea
[51532.908461] iqk counter=300 reload=0 do_iqk_cnt=175   n_iqk_fail(mask)=0x000ea rf_reg=0xaeaea
[52540.479405] iqk counter=6   reload=0 do_iqk_cnt=176   n_iqk_fail(mask)=0x00000 rf_reg=0xabcde
[52955.240359] iqk counter=300 reload=0 do_iqk_cnt=177   n_iqk_fail(mask)=0x000ea rf_reg=0xaeaea
[52976.144349] iqk counter=300 reload=0 do_iqk_cnt=178   n_iqk_fail(mask)=0x000ea rf_reg=0xaeaea
[53836.870326] iqk counter=7   reload=0 do_iqk_cnt=179   n_iqk_fail(mask)=0x00000 rf_reg=0xabcde
[54887.781217] iqk counter=300 reload=0 do_iqk_cnt=180   n_iqk_fail(mask)=0x000ea rf_reg=0xaeaea
[54908.598250] iqk counter=300 reload=0 do_iqk_cnt=181   n_iqk_fail(mask)=0x000ea rf_reg=0xaeaea
[55766.079171] iqk counter=6   reload=0 do_iqk_cnt=182   n_iqk_fail(mask)=0x00000 rf_reg=0xabcde
[56984.879080] iqk counter=300 reload=0 do_iqk_cnt=183   n_iqk_fail(mask)=0x000ea rf_reg=0xaeaea
[57005.691081] iqk counter=300 reload=0 do_iqk_cnt=184   n_iqk_fail(mask)=0x000ea rf_reg=0xaeaea
[57748.083033] iqk counter=6   reload=0 do_iqk_cnt=185   n_iqk_fail(mask)=0x00000 rf_reg=0xabcde
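
For anyone collecting the same output on their own hardware, a rough way to
tally the lines above is sketched below. It is only an illustration: the
regular expression assumes the exact (locally adjusted) line format shown in
this mail, the 20 ms per counter step is taken from the "300 * 20ms" remark
above, and the names parse_iqk(), summarize(), POLL_MS and SAMPLE are made up
for this example and are not part of rtw88.

```python
import re

# Matches the debug lines as shown in this mail. The exact format is an
# assumption, since the output was slightly adjusted locally.
IQK_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+iqk\s+counter=(?P<counter>\d+)\s+"
    r"reload=(?P<reload>\d+)\s+do_iqk_cnt=(?P<cnt>\d+)\s+"
    r"n_iqk_fail\(mask\)=(?P<mask>0x[0-9a-fA-F]+)\s+"
    r"rf_reg=(?P<rf_reg>0x[0-9a-fA-F]+)"
)

POLL_MS = 20  # assumed wait per counter step, from the "300 * 20ms" remark


def parse_iqk(lines):
    """Return one dict per recognised iqk line; other lines are skipped."""
    entries = []
    for line in lines:
        m = IQK_RE.search(line)
        if m is None:
            continue
        counter = int(m.group("counter"))
        entries.append({
            "ts": float(m.group("ts")),
            "do_iqk_cnt": int(m.group("cnt")),
            "counter": counter,
            # A non-zero fail mask is treated as a failed calibration.
            "failed": int(m.group("mask"), 16) != 0,
            "wait_ms": counter * POLL_MS,
        })
    return entries


def summarize(entries):
    """Count failed and successful runs and report the longest waits."""
    failed = [e for e in entries if e["failed"]]
    ok = [e for e in entries if not e["failed"]]
    return {
        "total": len(entries),
        "failed": len(failed),
        "ok": len(ok),
        "max_fail_wait_ms": max((e["wait_ms"] for e in failed), default=0),
        "max_ok_wait_ms": max((e["wait_ms"] for e in ok), default=0),
    }


# First three lines of the log above, used as sample input.
SAMPLE = """\
[40558.969218] iqk counter=7   reload=0 do_iqk_cnt=154   n_iqk_fail(mask)=0x00000 rf_reg=0xabcde
[40586.004212] iqk counter=300 reload=0 do_iqk_cnt=155   n_iqk_fail(mask)=0x000ea rf_reg=0xaeaea
[41325.466166] iqk counter=6   reload=0 do_iqk_cnt=156   n_iqk_fail(mask)=0x00000 rf_reg=0xabcde
"""

if __name__ == "__main__":
    print(summarize(parse_iqk(SAMPLE.splitlines())))
```

Feeding it the saved debug output (for example the relevant part of dmesg,
split into lines) instead of SAMPLE gives a quick failed/ok ratio that is
easy to compare between machines when replying to this thread.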


I will likely have to dive into this some more but I do not want it
to further block the other rtw88/rtw89(/mt76) updates at this point
as my other work for those seems to hold up and I'd love to get some
in in time before 15.1-R.

If anyone has the time/resources to help me debug this issue on
the 21ce and 22be chipsets, please reply here.  Otherwise it'll have to
wait and I'll add a note to the man page (if I do not forget) for now.


Lots of health,
/bz

-- 
Bjoern A. Zeeb                                                     r15:7