svn commit: r242910 - in user/andre/tcp_workqueue/sys: kern sys

Alfred Perlstein bright at
Wed Nov 14 02:10:21 UTC 2012

Andre, do you think the variable "realmem" could be exported as
something like kmemsize?

Or maybe a function call to subr_param.c?

The reason I ask is that I would like to scale things like the default
number of SysV semaphores to something like 64 per 1GB of "realmem".
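
As a rough sketch of that scaling (the helper name, the floor parameter
and passing realmem in as a byte count are all hypothetical, not existing
kernel interfaces):

```c
#include <stdint.h>

#define GIB	(UINT64_C(1) << 30)

/*
 * Hypothetical helper: allow 64 SysV semaphores per full GB of
 * "realmem", but never go below a caller-supplied floor so that
 * small machines keep a usable default.
 */
static uint64_t
scaled_sysv_sems(uint64_t realmem_bytes, uint64_t floor)
{
	uint64_t scaled = (realmem_bytes / GIB) * 64;

	return (scaled > floor ? scaled : floor);
}
```

On the 256GB machine mentioned below, this would give 256 * 64 = 16384
semaphores.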


The reason I'm interested in this is that we just had a user run out
of SysV semaphores on a machine with 256GB of RAM.


On 11/12/12 2:49 AM, Andre Oppermann wrote:
> On 12.11.2012 09:47, Andre Oppermann wrote:
>> Author: andre
>> Date: Mon Nov 12 08:47:13 2012
>> New Revision: 242910
>> URL:
>> Log:
>>    Base the mbuf related limits on the available physical memory or
>>    kernel memory, whichever is lower.
> The commit message is a bit terse so I'm going to explain in more
> detail:
>
> The overall mbuf related memory limit must be set so that mbufs
> (and clusters of various sizes) can't exhaust physical RAM or KVM.
> I've chosen a limit of half the physical RAM or KVM (whichever is
> lower) as the baseline.  In any normal scenario we want to leave
> at least half of the physmem/kvm for other kernel functions and
> userspace to prevent it from swapping like hell.  Via a tunable
> it can be upped to at most 3/4 of physmem/kvm.
>
> Out of the overall mbuf memory limit, 2K clusters and 4K (page size)
> clusters get 1/4 each because these are the most
> heavily used mbuf sizes.  2K clusters are used for MTU 1500 ethernet
> inbound packets.  4K clusters are used whenever possible for sends
> on sockets and thus outbound packets.
>
> The larger cluster sizes of 9K and 16K are each limited to 1/6 of the
> overall mbuf memory limit.  Again, when jumbo MTUs are used these
> large clusters will end up only on the inbound path.  They are not
> used on outbound; there it's still 4K.  Yes, that will stay that
> way because otherwise we run into lots of complications in the
> stack.  And it really isn't a problem, so don't make a scene.
>
> Previously the normal mbufs (256B) weren't limited at all.  This
> is wrong as there are certain places in the kernel that on allocation
> failure of clusters try to piece together their packet from smaller
> mbufs.  The mbuf limit is the number of all other mbuf sizes together
> plus some more to allow for standalone mbufs (ACK for example) and
> to send off a copy of a cluster.  FYI: Every cluster eventually also
> has an mbuf associated with it.
>
> Unfortunately there isn't a way to set an overall limit for all
> mbuf memory together as UMA doesn't support such a limit.
>
> Let's work out a few examples on sizing:
> 1GB KVM:
>  512MB limit for mbufs
>  419,430 mbufs
>   65,536 2K mbuf clusters
>   32,768 4K mbuf clusters
>    9,709 9K mbuf clusters
>    5,461 16K mbuf clusters
>
> 16GB RAM:
>  8GB limit for mbufs
>  33,554,432 mbufs
>   1,048,576 2K mbuf clusters
>     524,288 4K mbuf clusters
>     155,344 9K mbuf clusters
>      87,381 16K mbuf clusters
>
> These defaults should be sufficient for even the most demanding
> network loads.  If you do run into these limits you probably know
> exactly what you are doing and you are expected to tune those
> values for your particular purpose.
>
> There is a side-issue with maxfiles as it relates to the maximum
> number of sockets that can be opened at the same time.  With today's
> web servers and proxy caches there may be some 100K or
> more sockets open.  Hence I've divorced maxfiles from maxusers as
> well.  There is, however, a relationship between maxfiles and the
> callout callwheel which has to be investigated some more to prevent
> ridiculous values from being chosen.
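
The sizing rules quoted above can be sketched in userland C to check the
example numbers.  This is not the committed code: the struct, the
function name and the tunable_bytes parameter are made up for
illustration, and it assumes the tunable is given in bytes and is simply
capped at 3/4 of the smaller of physmem and KVM.  The plain mbuf count is
left out because the message does not give its exact formula.

```c
#include <stdint.h>

#define GIB	(UINT64_C(1) << 30)

/* Hypothetical container for the computed limits. */
struct mbuf_limits {
	uint64_t maxmbufmem;	/* overall mbuf memory limit, bytes */
	uint64_t nclusters_2k;	/* 2K clusters: 1/4 of the limit */
	uint64_t nclusters_4k;	/* 4K (page size) clusters: 1/4 */
	uint64_t nclusters_9k;	/* 9K jumbo clusters: 1/6 */
	uint64_t nclusters_16k;	/* 16K jumbo clusters: 1/6 */
};

/*
 * Default limit is half of min(physmem, kvm).  A non-zero
 * tunable_bytes overrides the default but is capped at 3/4
 * (assumption: the real tunable may be expressed differently).
 */
static struct mbuf_limits
size_mbuf_limits(uint64_t physmem, uint64_t kvm, uint64_t tunable_bytes)
{
	struct mbuf_limits l;
	uint64_t base = physmem < kvm ? physmem : kvm;
	uint64_t cap = base / 4 * 3;

	l.maxmbufmem = base / 2;
	if (tunable_bytes != 0)
		l.maxmbufmem = tunable_bytes < cap ? tunable_bytes : cap;

	l.nclusters_2k = l.maxmbufmem / 4 / 2048;
	l.nclusters_4k = l.maxmbufmem / 4 / 4096;
	l.nclusters_9k = l.maxmbufmem / 6 / (9 * 1024);
	l.nclusters_16k = l.maxmbufmem / 6 / (16 * 1024);
	return (l);
}
```

With 1GB of KVM as the smaller input and no tunable set, this gives a
512MB limit and 65,536 / 32,768 / 9,709 / 5,461 clusters, matching the
first example.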

More information about the freebsd-current mailing list