svn commit: r267935 - head/sys/dev/e1000 (with work around?)

Mike Tancsa mike@sentex.net
Sat Sep 13 00:52:45 UTC 2014


On 9/12/2014 7:33 PM, Rick Macklem wrote:
> I wrote:
>> The patches are in 10.1. I thought his report said 10.0 in the message.
>>
>> If Mike is running a recent stable/10 or releng/10.1, then it has been
>> patched for this and NFS should work with TSO enabled. If it doesn't,
>> then something else is broken.
> Oops, I looked and I see Mike was testing r270560 (which would have both
> the patches). I don't have an explanation for why TSO and 64K rsize, wsize
> would cause a hang, but it does appear the problem will exist in 10.1
> unless it gets resolved.
>
> Mike, one difference is that, even with the patches the driver will be
> copying the transmit mbuf list via m_defrag() to 32 MCLBYTE clusters
> when using 64K rsize, wsize.
> If you can reproduce the hang, you might want to look at how many mbuf
> clusters are allocated. If you've hit the limit, then I think that
> would explain it.
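
If I'm doing the math right, 64K split into 2K MCLBYTES clusters works
out to 32 clusters per NFS request, so I take it the limit in question
is kern.ipc.nmbclusters, i.e.

sysctl kern.ipc.nmbclusters

which should match the max in the netstat -m output below.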


I have been running the test for a few hours now with no lockups of the
NIC, so doing the NFS mount with -orsize=32768,wsize=32768 certainly
seems to work around the lockup. How do I check the mbuf clusters?

root@backup3:/usr/home/mdtancsa # vmstat -z | grep -i clu
mbuf_cluster:          2048, 760054,    4444,     370, 3088708,   0,   0
root@backup3:/usr/home/mdtancsa #
root@backup3:/usr/home/mdtancsa # netstat -m
3322/4028/7350 mbufs in use (current/cache/total)
2826/1988/4814/760054 mbuf clusters in use (current/cache/total/max)
2430/1618 mbuf+clusters out of packet secondary zone in use (current/cache)
0/4/4/380026 4k (page size) jumbo clusters in use (current/cache/total/max)
0/0/0/112600 9k jumbo clusters in use (current/cache/total/max)
0/0/0/63337 16k jumbo clusters in use (current/cache/total/max)
6482K/4999K/11481K bytes allocated to network (current/cache/total)
0/0/0 requests for mbufs denied (mbufs/clusters/mbuf+clusters)
0/0/0 requests for mbufs delayed (mbufs/clusters/mbuf+clusters)
0/0/0 requests for jumbo clusters delayed (4k/9k/16k)
0/0/0 requests for jumbo clusters denied (4k/9k/16k)
0 requests for sfbufs denied
0 requests for sfbufs delayed
0 requests for I/O initiated by sendfile
root@backup3:/usr/home/mdtancsa #
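
If hitting that limit is the theory, I can leave something like this
running the next time I try to reproduce the hang (just a rough sketch,
the 5 second interval picked arbitrarily):

#!/bin/sh
# log cluster usage against the configured max every few seconds so a
# limit hit around the time of the wedge shows up in the log
while true
do
        date
        netstat -m | grep 'mbuf clusters'
        vmstat -z | grep -i mbuf_cluster
        sleep 5
done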

Interface is RUNNING and ACTIVE
em1: hw tdh = 343, hw tdt = 838
em1: hw rdh = 512, hw rdt = 511
em1: Tx Queue Status = 1
em1: TX descriptors avail = 516
em1: Tx Descriptors avail failure = 1
em1: RX discarded packets = 0
em1: RX Next to Check = 512
em1: RX Next to Refresh = 511
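
(The em1 state above is the driver's debug dump; I believe it comes from
poking something like

sysctl dev.em.1.debug=1

and then reading the result out of dmesg.)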


I just tested on the other em NIC and I can wedge it as well:


em0: Watchdog timeout -- resetting
em0: Queue(0) tdh = 349, hw tdt = 176
em0: TX(0) desc avail = 173,Next TX to Clean = 349
em0: link state changed to DOWN
em0: link state changed to UP

So it does not seem limited to just certain em NICs.

em0@pci0:0:25:0:        class=0x020000 card=0x34ec8086 chip=0x10ef8086 rev=0x05 hdr=0x00
     vendor     = 'Intel Corporation'
     device     = '82578DM Gigabit Network Connection'
     class      = network
     subclass   = ethernet
     bar   [10] = type Memory, range 32, base 0xb1a00000, size 131072, enabled
     bar   [14] = type Memory, range 32, base 0xb1a25000, size 4096, enabled
     bar   [18] = type I/O Port, range 32, base 0x2040, size 32, enabled
     cap 01[c8] = powerspec 2  supports D0 D3  current D0
     cap 05[d0] = MSI supports 1 message, 64 bit enabled with 1 message
     cap 13[e0] = PCI Advanced Features: FLR TP
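
(That listing is from pciconf; something like

pciconf -lvbc em0

should reproduce it, or pciconf -lvbc piped through grep if the pciconf
on this branch does not take a device argument.)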


I can lock things up fairly quickly by running these two scripts across
an NFS mount.

#!/bin/sh

while true
do
  dd if=/dev/urandom ibs=64k count=1000 | pbzip2 -c -p3 > /mnt/test.bz2
  dd if=/dev/urandom ibs=63k count=1000 | pbzip2 -c -p3 > /mnt/test.bz2
  dd if=/dev/urandom ibs=66k count=1000 | pbzip2 -c -p3 > /mnt/test.bz2
done
root@backup3:/usr/home/mdtancsa # cat i3
#!/bin/sh

while true
do
dd if=/dev/zero of=/mnt/test2 bs=128k count=2000
sleep 10
done
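
The mount itself is nothing special; as far as I can tell it negotiates
64K rsize/wsize here by default, which is the case that wedges, and
remounting along the lines of

mount -t nfs -o rsize=32768,wsize=32768 <server>:/<export> /mnt

(with the real server and export path substituted for the placeholders)
is what has now been running cleanly for a few hours.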


	---Mike




-- 
-------------------
Mike Tancsa, tel +1 519 651 3400
Sentex Communications, mike@sentex.net
Providing Internet services since 1994 www.sentex.net
Cambridge, Ontario Canada   http://www.tancsa.com/

