9.0-RC2 re(4) "no memory for jumbo buffers" issue
Mike Andrews
mandrews at bit0.com
Mon Jan 2 02:03:13 UTC 2012
On Fri, 30 Dec 2011, YongHyeon PYUN wrote:
> On Thu, Dec 29, 2011 at 10:51:25PM -0500, Mike Andrews wrote:
>> On 11/28/2011 6:42 PM, YongHyeon PYUN wrote:
>>> On Mon, Nov 28, 2011 at 05:38:16PM -0500, Mike Andrews wrote:
>>>> On 11/27/11 8:39 PM, YongHyeon PYUN wrote:
>>>>> On Sat, Nov 26, 2011 at 04:05:58PM -0500, Mike Andrews wrote:
>>>>>> I have a Supermicro 5015A-H (Intel Atom 330) server with two Realtek
>>>>>> RTL8111C-GR gigabit NICs on it. As far as I can tell, these support
>>>>>> jumbo frames up to 7422 bytes. When running them at an MTU of 5000 on
>>>>> Actually the maximum size is 6KB for RTL8111C, not 7422.
>>>>> The RTL8111C and newer PCIe-based gigabit controllers no longer
>>>>> support scattering a jumbo frame across multiple RX buffers, so a
>>>>> single RX buffer has to receive an entire jumbo frame. This puts
>>>>> more of a burden on the system, because it has to allocate a jumbo
>>>>> buffer even when it receives a pure TCP ACK.
>>>> OK, that makes sense.
>>>>
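As an aside, a quick way to watch that one-buffer-per-frame behaviour on a
live box is to keep an eye on the 9k jumbo zone while jumbo-MTU traffic is
flowing; a rough sketch, assuming the stock FreeBSD 9.x zone names:

    netstat -m | grep '9k jumbo'
    vmstat -z | grep -E 'ITEM|mbuf_jumbo_9k'

With the interface at MTU 5000, the allocation counters for that zone should
climb roughly in step with received frames, ACKs included.
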
>>>>>> FreeBSD 9.0-RC2, after a week or so of uptime, with fairly light network
>>>>>> activity, the interfaces die with "no memory for jumbo buffers" errors
>>>>>> on the console. Unloading and reloading the driver (via serial console)
>>>>>> doesn't help; only rebooting seems to clear it up.
>>>>>>
>>>>> The jumbo code path is the same as the normal-MTU one, so I think
>>>>> the possibility of leaking mbufs in the driver is very low. And the
>>>>> "no memory for jumbo RX buffers" message can only appear either
>>>>> when you up the interface again or when an interface restart is
>>>>> triggered by the watchdog timeout handler. I don't think you're
>>>>> seeing watchdog timeouts, though.
>>>> I'm fairly certain the interface isn't changing state when this happens
>>>> -- it just kinda spontaneously happens after a week or two, with no
>>>> interface up/down transitions. I don't see any watchdog messages when
>>>> this happens.
>>> There is another code path that causes controller reinitialization.
>>> If you change the MTU or the offloading configuration (TSO, VLAN
>>> tagging, checksum offloading, etc.), the driver will reinitialize the
>>> controller. Did you happen to trigger one of those code paths during
>>> that week or two?
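For reference, these are the sorts of commands that would take that
reinitialization path on a running interface (examples only; substitute the
right interface name):

    ifconfig re0 mtu 5000
    ifconfig re0 -tso
    ifconfig re0 -txcsum -rxcsum
    ifconfig re0 -vlanhwtag

If nothing like that was run during that window, this particular reinit path
can probably be ruled out.
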
>>>
>>>>> When you saw the "no memory for jumbo RX buffers" message, did you
>>>>> check the available mbuf pool?
>>>> Not yet, that's why I asked for debugging tips -- I'll do that the next
>>>> time this happens.
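The snapshot worth grabbing the moment the message shows up, before touching
the interface, is something like:

    netstat -m
    vmstat -z | grep -E 'ITEM|mbuf'

netstat -m gives the per-size cluster usage and the denied-request counters,
and vmstat -z shows the underlying UMA zones (mbuf, mbuf_cluster,
mbuf_jumbo_9k, and so on) together with their FAIL counts.
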
>>>>
>>>>>> What's the best way to go about debugging this... which sysctls should
>>>>>> I be looking at first? I have already tried raising kern.ipc.nmbjumbo9
>>>>>> to 16384 and it doesn't seem to help things... maybe prolonging it
>>>>>> slightly, but not by much. The problem is it takes a week or so to
>>>>>> reproduce the problem each time...
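For completeness, the knobs involved (values are just the ones already tried
here; kern.ipc.nmbjumbo9 is a loader tunable, and on 9.x it can usually also
be raised at runtime):

    sysctl kern.ipc.nmbjumbo9          # show the current 9k cluster limit
    sysctl kern.ipc.nmbjumbo9=16384    # raise it on a running system
    # or make it persistent in /boot/loader.conf:
    #   kern.ipc.nmbjumbo9="16384"

The 9k line in netstat -m (current/cache/total/max) shows how close the zone
actually gets to that limit.
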
>>>>>>
>>>>> My vague guess is that it could be related to another subsystem
>>>>> leaking mbufs, such that the driver was not able to get more jumbo
>>>>> RX buffers from the system. For instance, r228016 would be worth
>>>>> trying on your box. I can't clearly explain why em(4) does not
>>>>> suffer from the issue, though.
>>>> I've just this morning built a kernel with that fix, so we'll see how
>>>> that goes.
>>> Ok.
>>
>> OK, this just happened again with a 9.0-RC3 kernel rev r228247.
>>
>>
>> whitedog# ifconfig re0 down;ifconfig re0 up;ifconfig re1 down;ifconfig
>> re1 up
>> re0: no memory for jumbo RX buffers
>> re1: no memory for jumbo RX buffers
>
> Ah, sorry. I should have spotted this issue earlier.
> Try the attached patch and let me know whether it makes any difference.
>
>> whitedog# netstat -m
>> 526/1829/2355 mbufs in use (current/cache/total)
>> 0/1278/1278/25600 mbuf clusters in use (current/cache/total/max)
>> 0/356 mbuf+clusters out of packet secondary zone in use (current/cache)
>> 0/336/336/12800 4k (page size) jumbo clusters in use
>> (current/cache/total/max)
>> 512/385/897/6400 9k jumbo clusters in use (current/cache/total/max)
>> 0/0/0/3200 16k jumbo clusters in use (current/cache/total/max)
>> 4739K/7822K/12561K bytes allocated to network (current/cache/total)
>> 0/0/0 requests for mbufs denied (mbufs/clusters/mbuf+clusters)
>> 0/4560/0 requests for jumbo clusters denied (4k/9k/16k)
>> 0/0/0 sfbufs in use (current/peak/max)
>> 0 requests for sfbufs denied
>> 0 requests for sfbufs delayed
>> 0 requests for I/O initiated by sendfile
>> 0 calls to protocol drain routines
>
OK, well, the patch changes things... kind of :)

After putting a lot of stress on the network -- namely about three passes
of 'make buildworld buildkernel' over NFS/TCP with a 5000-byte MTU -- the
interface hangs again, but the symptoms are now different. First, there
are no console messages whatsoever, other than NFS timeouts -- even if
you ifconfig the interface down and up, which previously would generate
the 'no memory for jumbo RX buffers' message. That message no longer
appears, ever. Even weirder, the interface will revive itself on its own
after about 15 minutes or so, and will bounce up and down every few hours
for several minutes at a time. I don't have exact timings on the outages,
but I can get them if needed. The netstat -m numbers are not radically
out of line with the previous ones, except maybe the jumbo cluster
requests are higher (but that could just be relative to the number of
jumbo packets the box has seen):
515/1495/2010 mbufs in use (current/cache/total)
0/1272/1272/25600 mbuf clusters in use (current/cache/total/max)
0/640 mbuf+clusters out of packet secondary zone in use (current/cache)
0/282/282/12800 4k (page size) jumbo clusters in use
(current/cache/total/max)
514/682/1196/6400 9k jumbo clusters in use (current/cache/total/max)
0/0/0/3200 16k jumbo clusters in use (current/cache/total/max)
4755K/10183K/14938K bytes allocated to network (current/cache/total)
0/0/0 requests for mbufs denied (mbufs/clusters/mbuf+clusters)
0/7888/0 requests for jumbo clusters denied (4k/9k/16k)
0/0/0 sfbufs in use (current/peak/max)
0 requests for sfbufs denied
0 requests for sfbufs delayed
0 requests for I/O initiated by sendfile
0 calls to protocol drain routines
Anything I can pull out of sysctl to debug this further? Since it revives
itself eventually, I can live with it long enough to troubleshoot. It's
not a particularly critical machine at the moment.
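Since the outages clear on their own, one low-effort way to get exact timings
plus the mbuf state around them is to leave a small logging loop running; a
rough sketch (interface name, log path, and interval are arbitrary):

    #!/bin/sh
    # Log a timestamp, the link status, and the jumbo/denied lines from
    # netstat -m once a minute, so the start and end of each outage can
    # be lined up against the mbuf counters afterwards.
    while :; do
        date
        ifconfig re0 | grep 'status:'
        netstat -m | grep -E 'jumbo|denied'
        echo
        sleep 60
    done >> /var/log/re0-jumbo.log 2>&1

Correlating the denied-request counters with the outage windows should show
whether the hangs line up with 9k cluster exhaustion.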