nmbclusters: how do we want to fix this for 8.3 ?
Jack Vogel
jfvogel at gmail.com
Sat Mar 24 21:18:04 UTC 2012
This whole issue only came up on a system with 10G devices, and only igb
does anything like what you're talking about, not the devices/drivers found
on most low-end systems. So we are trading red herrings, it would seem.
I'm not opposed to economizing things in a sensible way; it was I who
brought the issue up, after all :)
Jack
On Sat, Mar 24, 2012 at 2:02 PM, Juli Mallett <jmallett at freebsd.org> wrote:
> On Sat, Mar 24, 2012 at 13:33, Jack Vogel <jfvogel at gmail.com> wrote:
> > On Sat, Mar 24, 2012 at 1:08 PM, John-Mark Gurney <jmg at funkthat.com> wrote:
> >> If we had some sort of tuning algorithm that would keep track of the
> >> current receive queue usage depth, and always keep enough mbufs on the
> >> queue to handle the largest expected burst of packets (either historical,
> >> or by looking at the largest TCP window size, etc.), this would both
> >> improve memory usage and in general reduce the number of required mbufs
> >> on the system... If you have fast processors, you might be able to get
> >> away with fewer mbufs since you can drain the receive queue faster, but
> >> on slower systems you would use more mbufs.
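A minimal sketch of the sort of policy John-Mark describes, written as a
standalone userland model rather than driver code; every name and constant
below is made up for illustration:

/*
 * Standalone model of burst-driven receive ring sizing.  Not driver
 * code; all names and constants here are hypothetical.
 */
#define RING_MIN        128     /* never shrink below this */
#define RING_MAX        4096    /* hard cap on descriptors/mbufs */
#define HISTORY         8       /* sampling intervals remembered */
#define SLACK_PCT       25      /* headroom above the observed peak */

struct rxq_model {
	unsigned burst[HISTORY];        /* deepest occupancy per interval */
	unsigned idx;
};

/* Called once per sampling interval with the deepest occupancy seen. */
static void
rxq_record_burst(struct rxq_model *q, unsigned depth)
{
	q->burst[q->idx % HISTORY] = depth;
	q->idx++;
}

/* Target ring size: largest recent burst plus some slack, clamped. */
static unsigned
rxq_target_size(const struct rxq_model *q)
{
	unsigned i, peak, target;

	peak = 0;
	for (i = 0; i < HISTORY; i++)
		if (q->burst[i] > peak)
			peak = q->burst[i];
	target = peak + (peak * SLACK_PCT) / 100;
	if (target < RING_MIN)
		target = RING_MIN;
	if (target > RING_MAX)
		target = RING_MAX;
	return (target);
}

The arithmetic is the easy part; the driver-side questions of when to
sample and when it is safe to apply a new target are where the corner
cases Juli mentions below come in.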
> >
> > These are the days when machines might have 64 GIGABYTES of main storage,
> > so having sufficient memory to run high-performance networking seems
> > little to ask.
>
> I think the suggestion is that this should be configurable. FreeBSD
> is also being used in production, on systems with <128MB of RAM, for
> networking-related tasks. And it works fine, more or less.
>
> >> This tuning would also fix the problem of interfaces not coming up,
> >> since at boot each interface might only allocate 128 or so mbufs, and
> >> then dynamically grow as necessary...
> >
> > You want modern, fast networked servers but only give them 128 mbufs?
> > Ya right, allocating memory takes time, so when you do this people will
> > whine about latency :)
>
> Allocating memory doesn't have to take much time. A multi-queue
> driver could steal mbufs from an underutilized queue. It could grow
> the number of descriptors based on load. Some of those things are
> hard to implement in the first place and harder to cover the corner
> cases of, but not all.
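As a rough illustration of why the allocation need not show up as latency
on the hot path, here is a userland sketch that defers ring growth to a
maintenance task; the structure names and thresholds are hypothetical, not
taken from any FreeBSD driver:

/*
 * Userland model of deferred ring growth.  The hot path only sets a
 * flag; a maintenance task (a callout or taskqueue in a real driver)
 * pays the allocation cost outside packet processing.
 */
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

struct rx_ring {
	void		**slots;	/* stand-in for descriptors/mbufs */
	size_t		size;		/* current ring size, assumed >= 1 */
	size_t		max_size;
	atomic_bool	want_grow;
};

/* Hot path: a cheap check and a flag, no allocation here. */
static inline void
rx_note_pressure(struct rx_ring *r, size_t occupancy)
{
	if (occupancy > (r->size * 3) / 4)	/* more than 75% full */
		atomic_store(&r->want_grow, true);
}

/* Maintenance path: actually grow the ring. */
static void
rx_maybe_grow(struct rx_ring *r)
{
	size_t newsize;
	void **p;

	if (!atomic_exchange(&r->want_grow, false))
		return;
	newsize = r->size * 2;
	if (newsize > r->max_size)
		newsize = r->max_size;
	if (newsize == r->size)
		return;
	p = realloc(r->slots, newsize * sizeof(*p));
	if (p == NULL)
		return;			/* on failure, keep the old ring */
	r->slots = p;
	r->size = newsize;
}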
>
> > When you start pumping 10G...40G...100G, the scale of the system
> > is different; thinking in terms of the old 10Mb or 100Mb days just
> > doesn't work.
>
> This is a red herring. Yes, some systems need to do 40/100G. They
> require special tuning. The default shouldn't assume that everyone is
> getting maximum pps. This seems an especially silly argument when much
> of the available silicon can't even keep up with maximum packet rates
> at minimum frame sizes, at 10G or even at 1G.
>
> But again, 1G NICs are the default now. Does every FreeBSD system
> with a 1G NIC have loads of memory? No. I have an Atheros system
> with 2 1G NICs and 256MB of RAM. It can't do anything at 1gbps. Not
> even drop packets. Why should its memory usage model be tuned for
> something it can't do?
>
> I'm not saying it should be impossible to allocate a bajillion
> gigaquads of memory to receive rings, I certainly do it myself all the
> time. But choosing defaults is a tricky thing, and systems that are
> "pumping 10G" need other tweaks anyway, whether that's enabling
> forwarding or something else, because they have to be configured for
> the task they are to do. If part of that is increasing the number of
> receive descriptors (as the Intel drivers already allow us to do;
> thanks, Jack) and the number of queues, is that such a bad thing? I
> really don't think it makes sense for my 8-core or 16-core system to
> come up with 8 or 16 queues *per interface*. Under heavy load, 8/N or
> 16/N queues, where N is the number of interfaces, makes more sense;
> 1 queue per port is *ideal* if a single core can handle the load of
> that interface.
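For what it's worth, that kind of per-driver tuning is already exposed
through loader tunables; the exact knob names vary by driver and release,
but on an igb(4) system it looks roughly like this (the values are
illustrative, not recommendations):

# /boot/loader.conf -- illustrative values only
hw.igb.rxd=4096               # receive descriptors per ring
hw.igb.txd=4096               # transmit descriptors per ring
hw.igb.num_queues=2           # cap queue count rather than one per core
kern.ipc.nmbclusters=262144   # raise the global mbuf cluster limit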
>
> > Sorry, but the direction is to scale everything, not pare back on the
> > network, IMHO.
>
> There is not just one direction. There is not just one point of
> scaling. Relatively new defaults do not necessarily have to be
> increased in the future. I mean, should a 1G NIC use 64 queues on a
> 64-core system that can do 100gbps @ 64 bytes on one core? It's
> actively harmful to performance. The answer to "what's the most
> sensible default?" is not "what does a system that just forwards
> packets need?" A system that just forwards packets already needs IPs
> configured and a sysctl set. If we make it easier to change the
> tuning of the system for that scenario, then nobody's going to care
> what our defaults are, or think us "slow" for them.
>
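For reference, the forwarding setup alluded to above really is just a
couple of knobs, along these lines:

# /etc/rc.conf -- enable forwarding at boot
gateway_enable="YES"            # IPv4: sets net.inet.ip.forwarding=1
ipv6_gateway_enable="YES"       # IPv6, if the box routes IPv6

# or immediately, without a reboot:
#   sysctl net.inet.ip.forwarding=1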