kern/178477: [ath] missed beacon / soft reset in STA mode results in hardware error and DMA engine lockup

adrian chadd adrian at FreeBSD.org
Fri May 10 10:00:00 UTC 2013


>Number:         178477
>Category:       kern
>Synopsis:       [ath] missed beacon / soft reset in STA mode results in hardware error and DMA engine lockup
>Confidential:   no
>Severity:       non-critical
>Priority:       low
>Responsible:    freebsd-bugs
>State:          open
>Quarter:        
>Keywords:       
>Date-Required:
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Fri May 10 10:00:00 UTC 2013
>Closed-Date:
>Last-Modified:
>Originator:     adrian chadd
>Release:        -HEAD
>Organization:
>Environment:
>Description:
With my most recent changes to the TX DMA list handling in ath(4) (i.e., only writing a new TxDP entry for a queue for the first frame sent after a reset, then always using the holding descriptor and link pointer for subsequent frames), I've uncovered a rather annoying bug.

If a no-loss reset is done (i.e., a reset in which no packets are lost), the hardware ends up locking up.

This can be triggered in STA mode. AP mode doesn't seem to be affected (for now).

What's seen:

ath0: hardware error; resetting
ath0: 0x00000000 0x00000020 0x00000000, 0x00000000 0x00000000 0x00000000
ar5416StopDmaReceive: dma failed to stop in 10ms
AR_CR=0x00000024
AR_DIAG_SW=0x42000020

After this point, no combination of soft or hard chip resets unlocks the DMA engine.

When reset debugging is enabled, the queue looks like this:

ath0: ath_tx_stopdma: tx queue [3] 0, active=1, hwpending=1, flags 0x00000000, link 0x<ptr>

As far as I'm aware, the TX queue TxDP should never be 0x0 if it's active.

Anyway. This is easy to reproduce.
>How-To-Repeat:
* Insert AR5416 card
* Create STA vap
* Associate to AP
* Force a 'stuck beacon' no-loss reset - sysctl dev.ath.X.forcebstuck=1
* .. the next transmission will cause a hardware error.

>Fix:
Not sure yet. There aren't many things that can go wrong here:

* is there a frame on the TXQ that's actually already been freed?
* is the holding descriptor not being freed during a soft reset?
* .. and what about the link pointer? It should be set to NULL during reset; the DMA restart routine should then re-initialise it to point at the last descriptor of the last frame in the list, or to NULL if the list is empty.

Actually, I just hacked on the DMA restart code to ensure that the link pointer is initialised either to the last descriptor in the list or to NULL. That seems to have fixed it. So the reset path isn't freeing the holding descriptor or NULL'ing the axq_link pointer.

Fix that!
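
The link-pointer rule described above could be sketched roughly as follows. This is a standalone model under stated assumptions, not the actual ath(4) change: struct txdesc, struct txbuf, struct txq and txq_restart_link() are hypothetical names made up for illustration, and only axq_link is a name taken from this report.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins for the driver's structures. */
struct txdesc {
	uint32_t ds_link;	/* address of the next descriptor, 0 = end */
};

struct txbuf {
	struct txbuf  *next;	/* next frame on the software queue */
	struct txdesc *lastds;	/* last descriptor of this frame */
};

struct txq {
	struct txbuf *head;	/* first queued frame, NULL if empty */
	uint32_t     *axq_link;	/* where the next frame gets chained, or NULL */
};

/*
 * Recompute the link pointer from the software queue after a reset:
 * point it at the link field of the last descriptor of the last frame,
 * or set it to NULL when nothing is queued.
 */
static void
txq_restart_link(struct txq *q)
{
	struct txbuf *bf, *last = NULL;

	assert(q != NULL);
	for (bf = q->head; bf != NULL; bf = bf->next)
		last = bf;
	q->axq_link = (last != NULL) ? &last->lastds->ds_link : NULL;
}
```

The point of the sketch is that the link pointer is always derived from the current queue contents, so a stale value left over from before the reset cannot survive the restart.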

>Release-Note:
>Audit-Trail:
>Unformatted:

