Sudden mbuf demand increase and shortage under the load

Ivan Voras ivoras at freebsd.org
Mon Feb 15 13:05:24 UTC 2010


On 02/15/10 13:25, Maxim Sobolev wrote:
> Hi,
>
> Our company has a FreeBSD-based product that consists of numerous
> interconnected processes and does some high-PPS UDP processing
> (30-50K PPS is not uncommon). We are seeing some strange periodic

I have nothing very useful to help you with, but maybe you can determine
whether it's an em/igb issue by buying a cheap Realtek gigabit (re) card
and trying it out. Those can be bought for a few dollars now (e.g. from
D-Link and many others), and I can confirm that at least the one I tried
can carry around 50K pps, but not much more (I can tell you the exact
chip later today if you are interested).

> failures under load in several such systems, which usually manifest
> as IPC (even over unix domain sockets) suddenly either breaking down
> or pausing and recovering only some time later (like 5-10 minutes).
> The only sign of failure I managed to find was an increase in
> "requests for mbufs denied" in netstat -m, with the total number of
> mbuf clusters (nmbclusters) rising up to the limit.
>
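Counters like these can be polled and parsed programmatically while
waiting for the next episode. A minimal sketch; the sample line below
follows the usual FreeBSD `netstat -m` denial-counter format, but the
exact wording can vary by release, so treat the pattern as an
assumption to verify against your own output:

```python
import re

def parse_denied(netstat_m_output):
    """Extract the denial counters from `netstat -m` output. The line
    of interest typically looks like:
        123/45/6 requests for mbufs denied (mbufs/clusters/mbuf+clusters)
    Returns a (mbufs, clusters, mbuf_plus_clusters) tuple of ints, or
    None if no matching line is found."""
    for line in netstat_m_output.splitlines():
        m = re.match(r"\s*(\d+)/(\d+)/(\d+) requests for mbufs denied", line)
        if m:
            return tuple(int(x) for x in m.groups())
    return None

# Illustrative sample line, not real output from the affected systems:
sample = "4321/17/0 requests for mbufs denied (mbufs/clusters/mbuf+clusters)"
```

Logging these numbers once a second (e.g. from a cron-driven or looping
script that runs `netstat -m`) would timestamp exactly when the denials
start, which can then be correlated with traffic levels.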
> I have tried raising some network-related limits (most notably maxusers
> and nmbclusters), but it has not helped - the issue still happens to us
> from time to time. Below you can find output from netstat -m taken a
> few minutes after such a shortage period - you can see that the system
> has somehow allocated a huge amount of memory for the network (700MB),
> with only a tiny fraction of it actually in use. This is with
> kern.ipc.nmbclusters: 302400. Eventually the system reclaims all that
> memory and goes back to its normal use of 30-70MB.
>
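The ~700MB figure is roughly what the quoted cluster limit implies if
the pool is driven to exhaustion: a standard mbuf cluster is 2KB, and
each is typically paired with a 256-byte mbuf header (the usual FreeBSD
defaults, assumed here). A back-of-the-envelope check:

```python
NMBCLUSTERS = 302400     # kern.ipc.nmbclusters from the report above
CLUSTER_SIZE = 2048      # standard mbuf cluster size in bytes
MBUF_SIZE = 256          # MSIZE on FreeBSD, assumed default

# Worst case: every cluster allocated, each with an attached mbuf.
total_bytes = NMBCLUSTERS * (CLUSTER_SIZE + MBUF_SIZE)
print(total_bytes / 1e6)  # ~696.7 MB - close to the observed 700MB
```

So the 700MB is consistent with the pool simply being filled to its
configured ceiling, which supports the exhaustion hypothesis below
rather than an accounting bug.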
> This problem is killing us, so any suggestions are greatly appreciated.
> My current hypothesis is that due to some issue either in the network
> driver or in the network subsystem itself, the system goes insane and
> "eats" up all mbufs up to the nmbclusters limit. But since mbufs are
> shared between the network and local IPC, IPC goes down as well.
>
> We observe this issue on systems using both the em(4) and igb(4)
> drivers. I believe the two drivers share the same design, but I am not
> sure whether this is some kind of design flaw in the driver or part of
> a larger problem in the network subsystem.
>
> This happens on amd64 7.2-RELEASE and 7.3-PRERELEASE alike, with 8GB of
> memory. I have not tried upgrading to 8.0; this is a production system,
> so upgrading will not be easy. I am not aware of any differences that
> would suggest the problem will go away after an upgrade, but I can try
> it as a last resort.
>
> As I said, this is a very critical issue, so I can provide any
> additional debug information upon request. We are also prepared to pay
> somebody a reasonable amount of money to track down and resolve the
> issue.
>
> Regards,




More information about the freebsd-stable mailing list