[Bug 235031] [em] em0: poor NFS performance, strange behavior

Sun Jan 20 12:56:34 UTC 2019

On Sun, 20 Jan 2019, Martin Birgmeier wrote:

> Regarding duplex, ifconfig shows the following:
>
> [0]# ifconfig em0
> em0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
>         options=81249b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,LRO,WOL_MAGIC,VLAN_HWFILTER>
>         ether f0:de:f1:98:86:a9
>         inet 192.168.1.19 netmask 0xffffff00 broadcast 192.168.1.255
>         inet6 fe80::f2de:f1ff:fe98:86a9%em0 prefixlen 64 scopeid 0x1
>         inet6 fec0:0:0:4d42::13 prefixlen 64
>         inet6 fec0::4d42:f2de:f1ff:fe98:86a9 prefixlen 64 autoconf
>         inet6 2002:bc17:f381:4d42:f2de:f1ff:fe98:86a9 prefixlen 64 autoconf
>         media: Ethernet autoselect (1000baseT <full-duplex>)
>         status: active
>         nd6 options=23<PERFORMNUD,ACCEPT_RTADV,AUTO_LINKLOCAL>
> [0]#
>
> This seems to be o.k.

The media setting can't be trusted to have reached the hardware -- see my
previous reply.

But I thought that you said that you were using 100 Mbps (presumably with
autoselect).  The above shows autoselect giving 1 Gbps.

I checked that iflib_media_change() is not called for autoselect to 1 Gbps
here.  I also checked that it fails to stop the NIC if it is called, and
that it breaks the NIC's state after a few calls in the loop:

 	while :; do
 		./ifconfig em0 media 1000baseT mediaopt full-duplex
 		./ifconfig em0 media autoselect
 	done

provided ./ifconfig is on NFS.  This gives null changes disguised as
non-null changes so that iflib_media_change() is called.
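
A bounded variant of the loop above may be easier to use when collecting
logs.  This is only a sketch: the function name toggle_media, the IFCONFIG
variable and the default count of 50 are assumptions for illustration, not
part of the original test.

```shell
#!/bin/sh
# Hypothetical helper: alternate between a forced 1000baseT full-duplex
# setting and autoselect a fixed number of times.
# IFCONFIG (assumed name) selects the binary; it defaults to ./ifconfig
# as in the loop above.
toggle_media() {
	ifc=${IFCONFIG:-./ifconfig}
	ifn=${1:-em0}		# interface name
	count=${2:-50}		# number of iterations (assumed default)
	i=0
	while [ "$i" -lt "$count" ]; do
		"$ifc" "$ifn" media 1000baseT mediaopt full-duplex
		"$ifc" "$ifn" media autoselect
		i=$((i + 1))
	done
}
# Example use: toggle_media em0 50
```

Setting IFCONFIG=echo prints the commands instead of running them, which
allows a dry run on a machine where the interface should not be touched.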

Console output for this:

XX link state changed to down
XX Link state changed to up
XX link state changed to down
XX em0: TX(0) desc avail = 21, pidx = 34

Sometimes the queue indexes are corrupted and this message is printed.
Sometimes, but never in this output, this message is repeated many times
before the interface comes back up.  Actually, this doesn't always
occur between down and up, and when it is repeated the queue state is
avail = 1024, pidx = 0, and this state seems to be sticky unless ifconfig
somehow runs to generate another reinitialization.

XX Link state changed to up
XX link state changed to down
XX Link state changed to up
XX link state changed to down
XX Link state changed to up
XX link state changed to down
XX Link state changed to up
XX link state changed to down
XX Link state changed to up
XX em0: TX(0) desc avail = 1, pidx = 30
XX link state changed to down
XX Link state changed to up
XX link state changed to down
XX Link state changed to up
XX link state changed to down
XX Link state changed to up
XX link state changed to down
XX Link state changed to up
XX link state changed to down
XX Link state changed to up
XX link state changed to down
XX em0: TX(0) desc avail = 14, pidx = 33
XX Link state changed to up
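
Console output like the above can be tallied mechanically.  The following
is a rough sketch, assuming the log is saved to a file with or without the
"XX " quoting prefix; the function name summarize_console is made up for
illustration.

```shell
#!/bin/sh
# Hypothetical helper: count link transitions and extract the avail/pidx
# values from "TX(n) desc avail = A, pidx = P" lines read on stdin.
summarize_console() {
	awk '
	/[Ll]ink state changed to down/ { down++ }
	/[Ll]ink state changed to up/   { up++ }
	/TX\([0-9]+\) desc avail = / {
		line = $0
		sub(/^.*desc avail = /, "", line)	# leaves "A, pidx = P"
		avail = line + 0			# numeric prefix is A
		sub(/^.*pidx = /, "", line)		# leaves "P"
		pidx = line + 0
		printf "tx avail=%d pidx=%d\n", avail, pidx
		tx++
	}
	END { printf "down=%d up=%d tx_msgs=%d\n", down, up, tx }'
}
# Example use: summarize_console < console.log
```

A run of "tx avail=1024 pidx=0" lines in the summary would correspond to
the sticky state described above.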

IPv4 ping is broken most of the time while this loop is running.  Of course
ping should stop responding while the interface is down.  It rarely starts
when the interface comes back up.  Sometimes it starts with low latency,
but usually it starts with DUPs.  For about 50 iterations, the only ping
output was:

XX 64 bytes from 192.168.2.8: icmp_seq=619 ttl=64 time=0.158 ms
XX 64 bytes from 192.168.2.8: icmp_seq=619 ttl=64 time=3523.305 ms (DUP!)
XX 64 bytes from 192.168.2.8: icmp_seq=619 ttl=64 time=6696.247 ms (DUP!)
XX 64 bytes from 192.168.2.8: icmp_seq=619 ttl=64 time=9857.912 ms (DUP!)
XX 64 bytes from 192.168.2.8: icmp_seq=728 ttl=64 time=0.094 ms
XX 64 bytes from 192.168.2.8: icmp_seq=728 ttl=64 time=4154.124 ms (DUP!)
XX 64 bytes from 192.168.2.8: icmp_seq=728 ttl=64 time=7253.986 ms (DUP!)
XX 64 bytes from 192.168.2.8: icmp_seq=728 ttl=64 time=10367.938 ms (DUP!)
XX 64 bytes from 192.168.2.8: icmp_seq=728 ttl=64 time=13540.805 ms (DUP!)
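
The DUPs above can be grouped per sequence number to make the pattern
easier to see.  This is a sketch only, assuming ping output in the format
shown (optionally with the "XX " prefix); the function name summarize_ping
is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: for each icmp_seq seen on stdin, print the number of
# replies, the number marked (DUP!), and the largest reported time in ms.
summarize_ping() {
	awk '
	/icmp_seq=/ {
		seq = ""; ms = 0
		for (i = 1; i <= NF; i++) {
			if ($i ~ /^icmp_seq=/) seq = substr($i, 10)
			if ($i ~ /^time=/) ms = substr($i, 6) + 0
		}
		if (seq == "") next
		if (!(seq in n)) order[++cnt] = seq	# remember first-seen order
		n[seq]++
		if ($0 ~ /\(DUP!\)/) dup[seq]++
		if (ms > max[seq]) max[seq] = ms
	}
	END {
		for (k = 1; k <= cnt; k++) {
			s = order[k]
			printf "seq=%s replies=%d dups=%d max_ms=%.3f\n", s, n[s], dup[s] + 0, max[s]
		}
	}'
}
# Example use: summarize_ping < ping.log
```

For the nine lines of ping output above, this would report 4 replies (3
DUPs) for icmp_seq=619 and 5 replies (4 DUPs) for icmp_seq=728.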

Bruce