[Bug 219927] awg0 stops working after a long output under ssh
Tom Vijlbrief
tvijlbrief at gmail.com
Mon Jun 12 13:31:27 UTC 2017
Tested with TX_MAX_SEGS at 20 and that is also stable for me, so I added a
patch to the original bug report:
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=219927
The only downside I can see is a modest increase in kernel stack usage:
Index: sys/arm/allwinner/if_awg.c
===================================================================
--- sys/arm/allwinner/if_awg.c (revision 319826)
+++ sys/arm/allwinner/if_awg.c (working copy)
@@ -92,7 +92,7 @@
#define TX_SKIP(n, o) (((n) + (o)) & (TX_DESC_COUNT - 1))
#define RX_NEXT(n) (((n) + 1) & (RX_DESC_COUNT - 1))
-#define TX_MAX_SEGS 10
+#define TX_MAX_SEGS 20
#define SOFT_RST_RETRY 1000
#define MII_BUSY_RETRY 1000
@@ -419,14 +419,18 @@
sc->tx.buf_map[index].map, m, segs, &nsegs, BUS_DMA_NOWAIT);
if (error == EFBIG) {
m = m_collapse(m, M_NOWAIT, TX_MAX_SEGS);
- if (m == NULL)
+ if (m == NULL) {
+ device_printf(sc->miibus, "awg_setup_txbuf: m_collapse failed\n");
return (0);
+ }
*mp = m;
error = bus_dmamap_load_mbuf_sg(sc->tx.buf_tag,
sc->tx.buf_map[index].map, m, segs, &nsegs,
BUS_DMA_NOWAIT);
}
- if (error != 0)
+ if (error != 0) {
+ device_printf(sc->miibus, "awg_setup_txbuf: bus_dmamap_load_mbuf_sg failed\n");
return (0);
+ }
bus_dmamap_sync(sc->tx.buf_tag, sc->tx.buf_map[index].map,
BUS_DMASYNC_PREWRITE);
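
For a rough sense of the stack cost, here is a small sketch. It assumes
that the segs array passed to bus_dmamap_load_mbuf_sg is a local array of
TX_MAX_SEGS entries and that each entry is a 64-bit address plus a 64-bit
length, as I would expect on arm64. Both points are assumptions, not
something shown in the patch, and struct dma_segment and segs_stack_bytes
are made-up names for illustration only.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Stand-in for the DMA segment type: an address plus a length.
 * ASSUMPTION: both fields are 64-bit, as expected on arm64; the
 * real definition lives in the machine-dependent bus headers.
 */
struct dma_segment {
	uint64_t ds_addr;
	uint64_t ds_len;
};

/* Bytes taken by an on-stack segment array of max_segs entries. */
static size_t
segs_stack_bytes(size_t max_segs)
{
	return (max_segs * sizeof(struct dma_segment));
}
```

Under those assumptions, going from 10 to 20 segments grows the array from
160 to 320 bytes per call, and 40 segments would take 640 bytes.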
On Mon, 12 Jun 2017 at 10:47, Tom Vijlbrief <tvijlbrief at gmail.com> wrote:
>
>
> On Mon, 12 Jun 2017 at 09:59, Henri Hennebert <hlh at restart.be> wrote:
>
>> On 06/11/2017 17:54, Tom Vijlbrief wrote:
>> >
>> > On Sun, 11 Jun 2017 at 16:23, <bugzilla-noreply at freebsd.org
>> > <mailto:bugzilla-noreply at freebsd.org>> wrote:
>> >
>> > https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=219927
>> >
>> > Bug ID: 219927
>> > Summary: awg0 stops working after a long output under ssh
>> > Product: Base System
>> > Version: CURRENT
>> > Hardware: arm64
>> > OS: Any
>> > Status: New
>> > Severity: Affects Only Me
>> > Priority: ---
>> > Component: arm
>> > Assignee: freebsd-arm at FreeBSD.org
>> > Reporter: hlh at restart.be <mailto:hlh at restart.be>
>> >
>> > Environment: pine64+ 2GB
>> > FreeBSD norquay.restart.bel 12.0-CURRENT FreeBSD 12.0-CURRENT #0
>> > r318945M: Sat
>> > Jun 10 11:47:44 CEST 2017
>> > root at norquay.restart.bel:/usr/obj/usr/src/sys/NORQUAY arm64
>> >
>> > If I connect from a wireless computer (FreeBSD 11.1-PRERELEASE #0
>> > r318860) and
>> > run a command with a big output (eg `find /`) the awg0 stops working
>> > quickly
>> > (under 20 seconds of output).
>> >
>> > If I do the same with telnet from the same computer, the output runs
>> > much longer, but awg0 still stops working.
>> >
>> > If I do the same from a wired computer then I must run `find /` 2 or
>> > 3 times
>> > before awg0 stops working.
>> >
>> > I can rsync through ssh 12GB without problem in both directions
>> > (from and to
>> > the pine64 and the wireless computer).
>> >
>> > I have a `tcpdump -w ssh.data port 22`. (8.3 MB)
>> >
>> > I can connect with a serial console to the pine64 after awg0 stops
>> > working.
>> > ifconfig awg0 down
>> > ifconfig awg0 up
>> > doesn't restore the connectivity. I must reboot to restore
>> > connectivity.
>> >
>> >
>> > That's a coincidence, today I'm investigating the same issue.
>> >
>> > You could try increasing TX_MAX_SEGS in sys/arm/allwinner/if_awg.c
>> > line 95.
>> >
>> > I'm currently testing TX_MAX_SEGS set to 40 and no lock up yet....
>>
>> Bingo. Your solution solved the problem.
>>
>> Thanks a lot.
>>
>
> Good to hear!
>
> Increasing from 10 to 20 is probably sufficient. It is not clear to me
> what the adverse effects of too high a value are.
>
> The root cause is that the driver calls m_collapse with this limit, and
> the call fails. The TCP stack then resends the packet, m_collapse fails
> again, and so on.
>
>
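
To illustrate why retrying does not help, here is a toy model of the loop
described above. It is not driver code: tx_setup_ok, failed_attempts,
nfrags and min_frags are hypothetical names, and the model simply assumes
that a given packet can be collapsed to at best min_frags fragments.

```c
#include <stdbool.h>

/*
 * Toy model, not driver code. ASSUMPTION: a packet arrives as
 * nfrags fragments and collapsing can reduce it to at best
 * min_frags fragments (both numbers are hypothetical inputs).
 */
static bool
tx_setup_ok(int nfrags, int min_frags, int max_segs)
{
	if (nfrags <= max_segs)
		return (true);		/* fits without collapsing */
	return (min_frags <= max_segs);	/* collapse must reach the limit */
}

/* Count how many of `attempts` resends of the same packet fail. */
static int
failed_attempts(int nfrags, int min_frags, int max_segs, int attempts)
{
	int failures = 0;

	for (int i = 0; i < attempts; i++)
		if (!tx_setup_ok(nfrags, min_frags, max_segs))
			failures++;
	return (failures);
}
```

In this model a packet that cannot get below 15 fragments fails on every
attempt with a limit of 10 and never fails with a limit of 20, because
nothing about the packet changes between resends.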