FreeBSD 8.0 - network stack crashes?

Eirik Øverby ltning at anduin.net
Sat Nov 28 08:15:41 UTC 2009


Hi,

Gavin Atkinson wrote:
> On Tue, 2009-11-03 at 08:32 -0500, Weldon S Godfrey 3 wrote:
> > 
> > If memory serves me right, sometime around Yesterday, Gavin Atkinson told me:
> > 
> > Gavin, thank you A LOT for helping us with this; I have answered as much
> > as I can from the most recent crash below.  We did hit the mbuf maximum.
> > It is at 25K clusters, which is the default.  I have upped it to 32K
> > because a rather old article mentioned that as the top end, and since I
> > need to get into work I am not going to push it higher over a remote
> > console right now.  I have already set it to reboot next with 64K
> > clusters.  I already have kmem maxed to what is bootable (or at least was
> > at one time) in 8.0, 4GB; how high can I safely go?  This is an NFS
> > server running ZFS with sustained 5-minute averages of 120-200Mb/s,
> > acting as the store for a mail system.
> > 
> > > Some things that would be useful:
> > >
> > > - Does "arp -da" fix things?
> > 
> > no, it hangs like ssh, route add, etc
> > 
> > > - What's the output of "netstat -m" while the networking is broken?
> > Tue Nov  3 07:02:11 CST 2009
> > 36971/2033/39004 mbufs in use (current/cache/total)
> > 24869/731/25600/25600 mbuf clusters in use (current/cache/total/max)
> > 24314/731 mbuf+clusters out of packet secondary zone in use 
> > (current/cache)
> > 0/35/35/12800 4k (page size) jumbo clusters in use 
> > (current/cache/total/max)
> > 0/0/0/6400 9k jumbo clusters in use (current/cache/total/max)
> > 0/0/0/3200 16k jumbo clusters in use (current/cache/total/max)
> > 58980K/2110K/61091K bytes allocated to network (current/cache/total)
> > 0/201276/90662 requests for mbufs denied (mbufs/clusters/mbuf+clusters)
> > 0/0/0 requests for jumbo clusters denied (4k/9k/16k)
> > 0/0/0 sfbufs in use (current/peak/max)
> > 0 requests for sfbufs denied
> > 0 requests for sfbufs delayed
> > 0 requests for I/O initiated by sendfile
> > 0 calls to protocol drain routines
> 
> OK, at least we've figured out what is going wrong then.  As a
> workaround to get the machine to stay up longer, you should be able to
> set kern.ipc.nmbclusters=256000 in /boot/loader.conf - but hopefully we
> can resolve this soon.

I'll chip in with a report of exactly the same situation, and I'm on 8.0-RELEASE.
We've been struggling with this for some time; the box was last rebooted yesterday, and already last night it wedged again.  We're at a whopping
  kern.ipc.nmbclusters: 524288
and I've just doubled it once more, which means we're allocating 2GB to networking.
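
For reference, the workaround Gavin describes boils down to a single loader tunable; roughly what it looks like here after the latest doubling (the value is ours, and the 2GB figure is simply the ceiling implied by 1048576 clusters at 2kB each, not memory grabbed up front):

  # /boot/loader.conf
  # 1048576 clusters * 2kB per cluster = ~2GB upper limit for mbuf clusters
  kern.ipc.nmbclusters="1048576"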

Much like the original poster, we're seeing this on an amd64 storage server with a large ZFS array shared over NFS; the network interfaces are two em(4) ports combined in a lagg(4) interface (LACP).  Using either of the two em interfaces without lagg shows the same problem, just with lower performance.
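
For completeness, the lagg setup is the stock rc.conf style, roughly as below (the address and netmask are placeholders):

  # /etc/rc.conf
  ifconfig_em0="up"
  ifconfig_em1="up"
  cloned_interfaces="lagg0"
  ifconfig_lagg0="laggproto lacp laggport em0 laggport em1 192.0.2.10 netmask 255.255.255.0"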


> Firstly, what kernel was the above output from?  And what network card
> are you using?  In your initial post you mentioned testing both bce(4)
> and em(4) cards, be aware that em(4) had an issue that would cause
> exactly this issue, which was fixed with a commit on September 11th
> (r197093).  Make sure your kernel is from after that date if you are
> using em(4).  I guess it is also possible that bce(4) has the same
> issue, I'm not aware of any fixes to it recently.

We're on GENERIC (8.0-RELEASE).
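
For anyone wanting to check whether their own kernel postdates that commit, the build date embedded in the kernel version string is the quickest hint (though a later build date alone doesn't prove r197093 was merged to the branch in use):

  uname -v
  sysctl kern.version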


> So, from here, I think the best thing would be to just use the em(4) NIC
> and an up-to-date kernel, and see if you can reproduce the issue.

em(4) on 8.0-RELEASE still shows this problem.


> How important is this machine?  If em(4) works, are you able to help
> debug the issues with the bce(4) driver?

We have no bce(4), but since we see the problem on em(4) we can help debug there.  The server is important, but making it stable is more important.  See below the sig for some debug info.


/Eirik

Output from sysctl dev.em.0.debug=1 and dev.em.1.debug=1:

em0: Adapter hardware address = 0xffffff80003ac530 
em0: CTRL = 0x140248 RCTL = 0x8002 
em0: Packet buffer = Tx=20k Rx=12k 
em0: Flow control watermarks high = 10240 low = 8740
em0: tx_int_delay = 66, tx_abs_int_delay = 66
em0: rx_int_delay = 32, rx_abs_int_delay = 66
em0: fifo workaround = 0, fifo_reset_count = 0
em0: hw tdh = 92, hw tdt = 92
em0: hw rdh = 225, hw rdt = 224
em0: Num Tx descriptors avail = 256
em0: Tx Descriptors not avail1 = 0
em0: Tx Descriptors not avail2 = 0
em0: Std mbuf failed = 0
em0: Std mbuf cluster failed = 11001
em0: Driver dropped packets = 0
em0: Driver tx dma failure in encap = 0
em1: Adapter hardware address = 0xffffff80003be530 
em1: CTRL = 0x140248 RCTL = 0x8002 
em1: Packet buffer = Tx=20k Rx=12k 
em1: Flow control watermarks high = 10240 low = 8740
em1: tx_int_delay = 66, tx_abs_int_delay = 66
em1: rx_int_delay = 32, rx_abs_int_delay = 66
em1: fifo workaround = 0, fifo_reset_count = 0
em1: hw tdh = 165, hw tdt = 165
em1: hw rdh = 94, hw rdt = 93
em1: Num Tx descriptors avail = 256
em1: Tx Descriptors not avail1 = 0
em1: Tx Descriptors not avail2 = 0
em1: Std mbuf failed = 0
em1: Std mbuf cluster failed = 17765
em1: Driver dropped packets = 0
em1: Driver tx dma failure in encap = 0


Output from netstat -m (note that I have just doubled the mbuf cluster limit, so max is greater than total, and the box is currently working):

544916/3604/548520 mbufs in use (current/cache/total)
543903/3041/546944/1048576 mbuf clusters in use (current/cache/total/max)
543858/821 mbuf+clusters out of packet secondary zone in use (current/cache)
0/77/77/262144 4k (page size) jumbo clusters in use (current/cache/total/max)
0/0/0/131072 9k jumbo clusters in use (current/cache/total/max)
0/0/0/65536 16k jumbo clusters in use (current/cache/total/max)
1224035K/7291K/1231326K bytes allocated to network (current/cache/total)
0/58919/29431 requests for mbufs denied (mbufs/clusters/mbuf+clusters)
0/0/0 requests for jumbo clusters denied (4k/9k/16k)
0/0/0 sfbufs in use (current/peak/max)
0 requests for sfbufs denied
0 requests for sfbufs delayed
0 requests for I/O initiated by sendfile
0 calls to protocol drain routines
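
Finally, a rough sketch of the kind of watch loop one could run to catch the counters as they climb (interface names, interval and log path are just examples; the debug sysctl prints its counters via the kernel, so they show up in dmesg rather than on stdout):

  #!/bin/sh
  # log cluster usage/denials and trigger the em(4) debug dump every 5 minutes
  while sleep 300; do
      date
      netstat -m | grep -E 'mbuf clusters in use|denied'
      sysctl dev.em.0.debug=1 dev.em.1.debug=1 > /dev/null
  done >> /var/log/mbuf-watch.log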


