9.0-RC2 re(4) "no memory for jumbo buffers" issue
mandrews at bit0.com
Fri Dec 30 03:52:00 UTC 2011
On 11/28/2011 6:42 PM, YongHyeon PYUN wrote:
> On Mon, Nov 28, 2011 at 05:38:16PM -0500, Mike Andrews wrote:
>> On 11/27/11 8:39 PM, YongHyeon PYUN wrote:
>>> On Sat, Nov 26, 2011 at 04:05:58PM -0500, Mike Andrews wrote:
>>>> I have a Supermicro 5015A-H (Intel Atom 330) server with two Realtek
>>>> RTL8111C-GR gigabit NICs on it. As far as I can tell, these support
>>>> jumbo frames up to 7422 bytes. When running them at an MTU of 5000 on
>>> Actually the maximum size is 6KB for RTL8111C, not 7422.
>>> RTL8111C and newer PCIe based gigabit controllers no longer support
>>> scattering a jumbo frame into multiple RX buffers so a single RX
>>> buffer has to receive an entire jumbo frame. This adds more burden
>>> to system because it has to allocate a jumbo frame even when it
>>> receives a pure TCP ACK.
>> OK, that makes sense.
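
To make the buffer-size point above concrete, here is a minimal sketch that picks the smallest standard FreeBSD mbuf cluster able to hold one whole frame for a given MTU. The cluster sizes are the usual 2k/4k/9k/16k values; the function names are hypothetical, and the assumption that a driver picks the smallest fitting cluster is for illustration only and is not taken from the re(4) source.

```python
# Standard FreeBSD mbuf cluster sizes in bytes (2k, page-size 4k, 9k, 16k).
# The 4k entry assumes a platform with 4 KB pages.
CLUSTER_SIZES = [("2k", 2048), ("4k", 4096), ("9k", 9 * 1024), ("16k", 16 * 1024)]

ETHER_HDR_LEN = 14        # destination MAC + source MAC + ethertype
ETHER_VLAN_ENCAP_LEN = 4  # optional 802.1Q tag
ETHER_CRC_LEN = 4         # frame check sequence


def frame_len(mtu):
    """Worst-case on-wire frame size for a given MTU (VLAN tag included)."""
    return mtu + ETHER_HDR_LEN + ETHER_VLAN_ENCAP_LEN + ETHER_CRC_LEN


def smallest_cluster(mtu):
    """Return (name, size) of the smallest cluster that holds one whole frame.

    Hypothetical helper: models a controller that cannot scatter a frame
    across several RX buffers, so one buffer must fit the entire frame.
    """
    need = frame_len(mtu)
    for name, size in CLUSTER_SIZES:
        if size >= need:
            return name, size
    raise ValueError("frame of %d bytes does not fit in any cluster" % need)


if __name__ == "__main__":
    for mtu in (1500, 5000):
        print(mtu, frame_len(mtu), smallest_cluster(mtu))
```

With an MTU of 5000 the frame no longer fits in a 4k cluster, so every RX descriptor needs a 9k cluster regardless of how small the received packet is.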
>>>> FreeBSD 9.0-RC2, after a week or so of uptime, with fairly light network
>>>> activity, the interfaces die with "no memory for jumbo buffers" errors
>>>> on the console. Unloading and reloading the driver (via serial console)
>>>> doesn't help; only rebooting seems to clear it up.
>>> The jumbo code path is the same as normal MTU sized one so I think
>>> possibility of leaking mbufs in driver is very low. And the
>>> message "no memory for jumbo RX buffers" can only happen either
>>> when you up the interface again or interface restart triggered by
>>> watchdog timeout handler. I don't think you're seeing watchdog
>>> timeouts though.
>> I'm fairly certain the interface isn't changing state when this happens
>> -- it just kinda spontaneously happens after a week or two, with no
>> interface up/down transitions. I don't see any watchdog messages when
>> this happens.
> There is another code path that causes controller reinitialization.
> If you change the MTU or the offloading configuration (TSO, VLAN
> tagging, checksum offloading, etc.) it will reinitialize the controller.
> So did you happen to trigger one of these code paths during that week or two?
>>> When you see "no memory for jumbo RX buffers" message, did you
>>> check available mbuf pool?
>> Not yet, that's why I asked for debugging tips -- I'll do that the next
>> time this happens.
>>>> What's the best way to go about debugging this... which sysctl's should
>>>> I be looking at first? I have already tried raising kern.ipc.nmbjumbo9
>>>> to 16384 and it doesn't seem to help things... maybe prolonging it
>>>> slightly, but not by much. The problem is it takes a week or so to
>>>> reproduce the problem each time...
>>> My rough guess is that another subsystem is leaking mbufs, so the
>>> driver could not get more jumbo RX buffers from the system. For
>>> instance, r228016 would be worth trying on your box. I can't clearly
>>> explain why em(4) does not suffer from the issue, though.
>> I've just this morning built a kernel with that fix, so we'll see how
>> that goes.
OK, this just happened again with a 9.0-RC3 kernel rev r228247.
whitedog# ifconfig re0 down;ifconfig re0 up;ifconfig re1 down;ifconfig
re0: no memory for jumbo RX buffers
re1: no memory for jumbo RX buffers
whitedog# netstat -m
526/1829/2355 mbufs in use (current/cache/total)
0/1278/1278/25600 mbuf clusters in use (current/cache/total/max)
0/356 mbuf+clusters out of packet secondary zone in use (current/cache)
0/336/336/12800 4k (page size) jumbo clusters in use
512/385/897/6400 9k jumbo clusters in use (current/cache/total/max)
0/0/0/3200 16k jumbo clusters in use (current/cache/total/max)
4739K/7822K/12561K bytes allocated to network (current/cache/total)
0/0/0 requests for mbufs denied (mbufs/clusters/mbuf+clusters)
0/4560/0 requests for jumbo clusters denied (4k/9k/16k)
0/0/0 sfbufs in use (current/peak/max)
0 requests for sfbufs denied
0 requests for sfbufs delayed
0 requests for I/O initiated by sendfile
0 calls to protocol drain routines
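
For anyone collecting this output over several days, the following is a rough sketch of pulling the jumbo-cluster counters out of `netstat -m` text so they can be logged and compared over time. The function name `parse_netstat_m` and the regular expressions are my own assumptions, written against the output format shown above; other FreeBSD versions may print these lines differently.

```python
import re

# Lines copied from the netstat -m output in this message.
SAMPLE = """\
0/336/336/12800 4k (page size) jumbo clusters in use
512/385/897/6400 9k jumbo clusters in use (current/cache/total/max)
0/0/0/3200 16k jumbo clusters in use (current/cache/total/max)
0/4560/0 requests for jumbo clusters denied (4k/9k/16k)
"""

POOL_RE = re.compile(r"(\d+)/(\d+)/(\d+)/(\d+) (\d+k) .*jumbo clusters in use")
DENIED_RE = re.compile(r"(\d+)/(\d+)/(\d+) requests for jumbo clusters denied")


def parse_netstat_m(text):
    """Return (pools, denied) parsed from netstat -m style text.

    pools maps '4k'/'9k'/'16k' to current/cache/total/max counts;
    denied maps the same keys to the number of denied requests.
    """
    pools, denied = {}, {}
    for line in text.splitlines():
        line = line.strip()
        m = POOL_RE.match(line)
        if m:
            cur, cache, total, limit = (int(g) for g in m.group(1, 2, 3, 4))
            pools[m.group(5)] = {
                "current": cur, "cache": cache, "total": total, "max": limit,
            }
            continue
        m = DENIED_RE.match(line)
        if m:
            denied = dict(zip(("4k", "9k", "16k"), (int(g) for g in m.groups())))
    return pools, denied


if __name__ == "__main__":
    pools, denied = parse_netstat_m(SAMPLE)
    for size, counts in pools.items():
        print(size, counts, "denied:", denied.get(size, 0))
```

On the sample above this reports 4560 denied 9k requests while the 9k pool total (897) is still well below its max (6400), which is the detail that stands out in this capture.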