kern/105348: ath device stops TX
meyer
eng at prowip.com.br
Thu Nov 9 22:11:30 UTC 2006
>Number: 105348
>Category: kern
>Synopsis: ath device stops TX
>Confidential: no
>Severity: critical
>Priority: high
>Responsible: freebsd-bugs
>State: open
>Quarter:
>Keywords:
>Date-Required:
>Class: sw-bug
>Submitter-Id: current-users
>Arrival-Date: Thu Nov 09 22:10:21 GMT 2006
>Closed-Date:
>Last-Modified:
>Originator: meyer
>Release: releng_6
>Organization:
prowip
>Environment:
FreeBSD ap-h.matik.com.br 6.2-PRERELEASE FreeBSD 6.2-PRERELEASE #0: Thu Nov 9 07:37:32 BRST 2006
>Description:
This concerns an ath device running in hostap mode. It happens with both ath_rate_onoe and ath_rate_sample; the rate module makes no difference.
The following is the typical sequence, which occurs daily:
Nov 8 12:53:53 ap-h kernel: ath2: discard oversize frame (ether type 5e4 flags 3 len 1522 > max 1514)
Nov 8 12:54:23 ap-h last message repeated 2 times
Nov 8 12:56:25 ap-h last message repeated 4 times
Nov 8 12:58:41 ap-h last message repeated 18 times
Nov 8 13:00:00 ap-h root: WIP: 135 estações conectadas.
Nov 8 13:06:54 ap-h kernel: ath2: device timeout
athstats shows something like the following. The most interesting counter is "tx stopped", because transmission really does stop while the card still receives: client upload traffic (rx) continues to get through.
1974355 data frames received
2199237 data frames transmit
32016 tx frames with an alternate rate
799516 long on-chip tx retries
31093 tx failed 'cuz too many retries
11M current transmit rate
86472 tx management frames
31 tx frames discarded prior to association
57093 tx stopped 'cuz no xmit buffer
64442 tx frames with no ack marked
1167515 tx frames with short preamble
11001938 rx failed 'cuz of bad CRC
14966831 rx failed 'cuz of PHY err
14890917 CCK timing
75914 CCK restart
489724 beacons transmitted
1673 periodic calibrations
13 rssi of last ack
23 avg recv rssi
-98 rx noise floor
64369 cabq frames transmitted
117 cabq xmit overflowed beacon interval
28922 switched default/rx antenna
Antenna profile:
[1] tx 1135988 rx 1099419
[2] tx 1117796 rx 1177433
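To follow how "tx stopped" grows over time, two athstats snapshots can be compared. The following is a minimal sketch, assuming the output has been saved as text in the format shown above; `parse_athstats` and `counter_delta` are hypothetical helper names, not part of athstats.

```python
import re

# A few counter lines copied from the athstats output in this report.
SAMPLE = """\
2199237 data frames transmit
31093 tx failed 'cuz too many retries
11M current transmit rate
57093 tx stopped 'cuz no xmit buffer
-98 rx noise floor
Antenna profile:
[1] tx 1135988 rx 1099419
"""

def parse_athstats(text):
    """Map each counter description to its leading integer value.

    Lines that do not start with a number (for example the antenna
    profile) are skipped. A trailing 'M' on the rate line is dropped.
    """
    counters = {}
    for line in text.splitlines():
        m = re.match(r"\s*(-?\d+)M?\s+(.+?)\s*$", line)
        if m:
            counters[m.group(2)] = int(m.group(1))
    return counters

def counter_delta(before, after):
    """Return the counters whose value changed between two snapshots."""
    return {
        name: after[name] - before[name]
        for name in after
        if name in before and after[name] != before[name]
    }

if __name__ == "__main__":
    stats = parse_athstats(SAMPLE)
    print(stats["tx stopped 'cuz no xmit buffer"])
```

Taking a snapshot every few minutes and logging `counter_delta` would show whether "tx stopped" rises steadily or jumps shortly before a hang.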
Here it makes no difference what I set ath_txbuf to (500, 1000, 5000, 10000 or 30000): the same thing always happens, two or three times a day.
Interestingly, when I boot exactly the same hardware with the same setup under 5.4-R, it is rock stable and the ath device does not hang.
I tried different ath cards and that makes no difference either.
The athstats "tx stopped" event always occurs but does not always cause a real hang. It seems the driver does not recover from the sporadic "tx stopped" events: the counter keeps climbing, and once it reaches a certain level the device hangs.
Also interesting: I can get the device back if I am there in time. An ifconfig down and up recovers it; if I am too late, I need a complete reboot.
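Until the driver is fixed, the manual recovery could be automated. The following is only a workaround sketch, not a fix: it assumes root privileges, the interface name ath2 from the log above, and that some outer loop (not shown) feeds it new syslog lines. `is_device_timeout` and `bounce_interface` are hypothetical names.

```python
import re
import subprocess

def is_device_timeout(line, ifname="ath2"):
    """True if a syslog line is the kernel's 'device timeout' for ifname."""
    pattern = r"kernel: %s: device timeout\s*$" % re.escape(ifname)
    return re.search(pattern, line) is not None

def bounce_interface(ifname="ath2", run=subprocess.run):
    """Run 'ifconfig <ifname> down' followed by 'ifconfig <ifname> up'.

    The 'run' parameter exists only so the commands can be checked
    without touching a real interface.
    """
    for state in ("down", "up"):
        run(["ifconfig", ifname, state], check=True)

if __name__ == "__main__":
    line = "Nov 8 13:06:54 ap-h kernel: ath2: device timeout"
    if is_device_timeout(line):
        # Print the commands instead of executing them.
        bounce_interface(run=lambda cmd, check: print(" ".join(cmd)))
```

Since the down/up only helps when done soon after the hang, such a check would have to run often; whether it reacts fast enough to avoid the reboot case is untested.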
No other system stats shed any light, but I have them; ask me if you want to know anything else.
I also suspected an if_bridge problem, but when I swap the ath device for a wi card the problem goes away. wi is stable too, on the same machine and in the same external environment as before.
Apparently the problem is related to how many stations are associated: 30 seems to be the magic number and 40 the point of certain failure. More precisely, the problem does not happen with fewer than 30 stations, it happens sometimes at around 30, and it happens once or more a day with over 30 stations associated.
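The relation to the station count can be checked against the syslog. The following is a rough sketch, assuming the log keeps the two line formats shown earlier (the periodic "WIP: N ..." station count and the kernel "device timeout" message); `stations_at_timeouts` is a hypothetical name.

```python
import re

# Periodic station-count line, e.g. "root: WIP: 135 ..."
STATIONS = re.compile(r"WIP: (\d+) ")
# Kernel timeout message for any ath unit
TIMEOUT = re.compile(r"kernel: ath\d+: device timeout")

def stations_at_timeouts(lines):
    """For each device timeout, return the last station count logged before it.

    The result holds None for a timeout seen before any station count.
    """
    last_count = None
    counts = []
    for line in lines:
        m = STATIONS.search(line)
        if m:
            last_count = int(m.group(1))
        elif TIMEOUT.search(line):
            counts.append(last_count)
    return counts

if __name__ == "__main__":
    log = [
        "Nov 8 13:00:00 ap-h root: WIP: 135 stations",
        "Nov 8 13:06:54 ap-h kernel: ath2: device timeout",
    ]
    print(stations_at_timeouts(log))  # [135]
```

Run over several days of logs, the resulting counts would show whether timeouts really cluster above 30 associated stations.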
>How-To-Repeat:
>Fix:
>Release-Note:
>Audit-Trail:
>Unformatted: