kern/165866: [ath] TX hangs, requiring a "scan" to properly reset the interface
Adrian Chadd
adrian at FreeBSD.org
Thu Mar 8 23:40:10 UTC 2012
>Number: 165866
>Category: kern
>Synopsis: [ath] TX hangs, requiring a "scan" to properly reset the interface
>Confidential: no
>Severity: non-critical
>Priority: medium
>Responsible: freebsd-bugs
>State: open
>Quarter:
>Keywords:
>Date-Required:
>Class: sw-bug
>Submitter-Id: current-users
>Arrival-Date: Thu Mar 08 23:40:10 UTC 2012
>Closed-Date:
>Last-Modified:
>Originator: Adrian Chadd
>Release: FreeBSD-HEAD
>Organization:
>Environment:
FreeBSD home-11bg-ap 10.0-CURRENT FreeBSD 10.0-CURRENT #18 r232400:232625M: Wed Dec 31 16:00:00 PST 1969 adrian at dummy:/home/adrian/work/freebsd/svn/obj/mipseb/mips.mipseb/usr/home/adrian/work/freebsd/svn/src/sys/TP-WN1043ND mips
>Description:
I've been seeing TX hangs during my tests.
Investigating showed that the TX queue would grow and busy buffers would stay busy.
E.g., from sysctl dev.ath.0.txagg=1:
HW TXQ 0: axq_depth=0, axq_aggr_depth=0
HW TXQ 1: axq_depth=184, axq_aggr_depth=0
HW TXQ 2: axq_depth=0, axq_aggr_depth=0
HW TXQ 3: axq_depth=0, axq_aggr_depth=0
HW TXQ 8: axq_depth=1, axq_aggr_depth=0
Busy: 14
Total TX buffers: 15; Total TX buffers busy: 1
This occurred even with a completely idle access point that only responded to probe requests, i.e., with no active associations.
The only way to flush things was a 'scan', which forcibly flushes the TX queue so that pending frames are either handled or deleted.
I then flipped on reset debugging (sysctl dev.ath.0.debug=0x20) and forced a scan whenever I saw this occur.
I also dumped the relevant registers when this occurred. I found that the TXDP for this queue was completely in the wrong place.
I also found that the TX descriptor list made no sense: there were incomplete and complete descriptor lists in the same TX queue, as well as NULL link pointers halfway through the list.
So, I figured something is splicing the list together incorrectly.
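As a rough illustration of that symptom, the check below walks a descriptor chain and compares what is reachable through the link pointers with the depth the queue claims to have. It is a userland sketch only: struct sim_desc, sim_chain_reachable() and sim_chain_stranded() are hypothetical names, not driver code, and a real descriptor holds a physical address in ds_link rather than a C pointer.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-in for a hardware TX descriptor. */
struct sim_desc {
	struct sim_desc *link;	/* next descriptor; NULL ends the chain */
};

/* Count the descriptors reachable from head by following link pointers. */
static size_t
sim_chain_reachable(const struct sim_desc *head)
{
	size_t n = 0;

	for (; head != NULL; head = head->link)
		n++;
	return (n);
}

/*
 * Given the depth the software queue believes it has, return how many
 * descriptors cannot be reached from head. A nonzero result means a
 * NULL link was hit before the end of the queue.
 */
static size_t
sim_chain_stranded(const struct sim_desc *head, size_t depth)
{
	size_t reachable = sim_chain_reachable(head);

	return (depth > reachable ? depth - reachable : 0);
}
```

For a four-entry queue whose second descriptor has a NULL link, sim_chain_stranded() reports two stranded descriptors, which is the kind of mismatch described above.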
>How-To-Repeat:
This kernel was compiled with TDMA support, so the ATH_BUF_BUSY flag would be set.
* set the AP up on a 2.4GHz channel;
* make sure there are lots of STAs and APs around;
* notice the high level of probe request traffic;
* wait.
>Fix:
This particular patch seems to quieten the issue: on DMA restart it sets axq_link from the last buffer in the queue rather than the first. I'm going to run this a bit more and see what happens.
Index: if_ath_tx.c
===================================================================
--- if_ath_tx.c (revision 232400)
+++ if_ath_tx.c (working copy)
@@ -623,19 +623,22 @@
 ath_txq_restart_dma(struct ath_softc *sc, struct ath_txq *txq)
 {
 	struct ath_hal *ah = sc->sc_ah;
-	struct ath_buf *bf;
+	struct ath_buf *bf, *bf_last;
 
 	ATH_TXQ_LOCK_ASSERT(txq);
 
 	/* This is always going to be cleared, empty or not */
 	txq->axq_flags &= ~ATH_TXQ_PUTPENDING;
 
+	/* XXX make this ATH_TXQ_FIRST */
 	bf = TAILQ_FIRST(&txq->axq_q);
+	bf_last = ATH_TXQ_LAST(txq, axq_q_s);
+
 	if (bf == NULL)
 		return;
 
 	ath_hal_puttxbuf(ah, txq->axq_qnum, bf->bf_daddr);
-	txq->axq_link = &bf->bf_lastds->ds_link;
+	txq->axq_link = &bf_last->bf_lastds->ds_link;
 	ath_hal_txstart(ah, txq->axq_qnum);
 }
 
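To show why the choice of buffer matters, here is a small userland model of the restart followed by one more frame being appended. It is a sketch under stated assumptions, not the driver: struct sim_buf, struct sim_txq, sim_restart_dma(), sim_append() and sim_hw_reachable() are hypothetical, each buffer is reduced to a single link field standing in for bf_lastds->ds_link, and the append step assumes the next frame is chained by writing through the saved link pointer.

```c
#include <stddef.h>

/* Hypothetical model: one buffer, with hw_link standing in for ds_link. */
struct sim_buf {
	struct sim_buf *hw_link;	/* next buffer as the hardware sees it */
};

/* Hypothetical model of a TX queue with a saved link pointer. */
struct sim_txq {
	struct sim_buf *first;		/* head of the queue */
	struct sim_buf *last;		/* tail of the queue */
	struct sim_buf **link;		/* where the next frame gets chained */
};

/*
 * Model of the restart. patched == 0 takes the link pointer from the
 * first buffer (old behaviour); patched != 0 takes it from the last.
 */
static void
sim_restart_dma(struct sim_txq *q, int patched)
{
	if (q->first == NULL)
		return;
	q->link = patched ? &q->last->hw_link : &q->first->hw_link;
}

/* Chain a new buffer through the saved link pointer, then advance it. */
static void
sim_append(struct sim_txq *q, struct sim_buf *nb)
{
	nb->hw_link = NULL;
	*q->link = nb;
	q->link = &nb->hw_link;
	q->last = nb;
}

/* Count the buffers the hardware can reach from the head of the queue. */
static size_t
sim_hw_reachable(const struct sim_txq *q)
{
	const struct sim_buf *b;
	size_t n = 0;

	for (b = q->first; b != NULL; b = b->hw_link)
		n++;
	return (n);
}
```

With three buffers already queued, restarting with patched == 0 and then appending one buffer overwrites the first buffer's link, so only two buffers remain reachable and the other two are stranded. With patched != 0 all four stay on the chain.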
>Release-Note:
>Audit-Trail:
>Unformatted: