upd: 7.2->8.1 & many networks trouble & flowtable

Jeremy Chadwick freebsd at jdc.parodius.com
Wed Nov 24 17:20:13 UTC 2010


On Wed, Nov 24, 2010 at 08:21:12AM -0800, Li, Qing wrote:
> So before you conclude that all of the issues you are encountering fall
> within flow-table, I urge you to articulate the issues in more detail.

I agree that the OP needs to be more precise and provide verbose details
that can help troubleshoot the issue + work with you towards a
resolution.  However, please read everything I've written below.

> Also, once you disable flow-table through sysctl, what issues
> are you still running into?

Here's an example of where disabling the flowtable solved a user's
problem in October 2010:

http://forums.freebsd.org/showthread.php?t=18301

I don't know if this is the same person who posted here (I doubt it),
but it is still worth reviewing.

Additionally I remember 2 or 3 posts to mailing lists here discussing
how bgpd was taking up 100% CPU (or specifically an entire CPU core).
I'm not sure what people did to solve that problem, but one has to
wonder if flowtable was the cause and they simply didn't realise it.

> Yes, I personally consider the flow-table work to still be experimental.
> More work is being done as we speak. In addition, we are considering other 
> enhancements for the routing code.

I can't speak for the OP or his situation -- flowtable appears to "work
fine for me", but then again none of our RELENG_8 systems do routing nor
handle large numbers of routes (very simple single-IP or multi-IP
systems on two networks).

However, there are now two places where authors/maintainers of the
flowtable code have admitted there are bugs/issues or that the code is
"experimental": your above statement, and another from Kip Macy here
(circa January 2010):

http://daemonflux.blogspot.com/2010/01/updates.html

I'm forced to ask, purely from a principle standpoint: if this code is
considered experimental and/or potentially buggy, why was it enabled by
default?  Was this done because it needed more users testing it +
reporting problems with it?  If so, how are users supposed to know what
to report?

For example, I'm staring at net.inet.flowtable.stats right now, across 5
different systems, but it doesn't tell me anything as to whether or not
I should be disabling it.

What's a "normal" number for net.inet.flowtable.nmbflows?  For example,
on my home LAN system (NOT acting as a router/NAT; purely a standalone
box), net.inet.flowtable.nmbflows is at 50176.  This arbitrary number
means little to me, but may mean something to you.  I look at it and
think "that seems awfully high for something that has related tunables
that control numerous TCP and UDP expiry intervals", followed by "wait a
minute, what about our systems that use pf and have timeouts/expiries
set there?  Or via system tunables?  Who trumps who?"
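
For anyone who wants to compare these values across several machines, here
is a minimal sketch in Python.  It only parses the "name: value" lines that
sysctl(8) prints; it does not interpret them.  The helper names
(parse_sysctl, read_flowtable) are hypothetical, and the OID
net.inet.flowtable.enable is my assumption for the on/off knob, since it is
not named anywhere in this thread.

```python
import subprocess

# OIDs to query.  nmbflows is mentioned in this thread; "enable" is an
# assumption about the name of the on/off knob.
FLOWTABLE_OIDS = ("net.inet.flowtable.enable", "net.inet.flowtable.nmbflows")


def parse_sysctl(text):
    """Parse sysctl output of the form 'name: value' into a dict of strings."""
    values = {}
    for line in text.splitlines():
        name, sep, value = line.partition(": ")
        if sep:  # skip lines that are not 'name: value'
            values[name.strip()] = value.strip()
    return values


def read_flowtable(oids=FLOWTABLE_OIDS):
    """Run sysctl for the given OIDs on the local host and parse the result."""
    out = subprocess.run(
        ["sysctl", *oids], capture_output=True, text=True, check=True
    ).stdout
    return parse_sysctl(out)
```

Calling read_flowtable() on each box and diffing the resulting dicts would at
least show whether the systems differ, even if it says nothing about what a
"normal" value is.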

Furthermore, I don't quite understand what flowtable does; I can't find
any official FreeBSD documentation, whether via man -k or by looking
through /usr/share, that outlines the details of its functionality.  All I've
been able to discern is that it addresses issues associated with layer 2
<-> layer 3 correlation being limited to a single CPU/core induced by
GIANT, which means a multi-core system without flowtable doesn't scale
well TCP/UDP-wise.  Is this correct?  Maybe I should read the paper you
wrote.  :-)

Thanks for taking the time to answer my questions, Qing.  I appreciate
your work (sincerely), and ask the above with an open mind.

> From: Andrey Groshev [mailto:greenx at yartv.ru]
> Sent: Wed 11/24/2010 4:04 AM
> To: Li, Qing
> Cc: freebsd-stable at freebsd.org
> Subject: Re: upd: 7.2->8.1 & many networks trouble & flowtable
> 
> 24.11.2010 13:18, Li, Qing wrote:
> > I am the main author of this paper you referenced in your email.
> >   
> Hi! I know that you also worked on this. I mentioned Kip Macy because I
> found his statement regarding this issue.
> > The main discussion and focus of my paper was on the design and work done to separate L2 and L3 for both IPv4 and IPv6 to facilitate the elimination of GIANT lock in the networking subsystem, thus achieving high parallelism.
> >
> > This redesign of separately managing L2 ARP/ND6 and L3 routing tables already shows performance gains on multicore systems.
> >
> > The flow-table enhancement is just one other component, described towards the end of the paper. Yes, it is experimental and was discussed as such in the paper as well as on the mailing list.
> >   
> I.e., you also confirm that this feature is still experimental?
> > I did not know flow-table feature was enabled by default. I wouldn't have done so myself.
> >   
> Kip Macy added it to the generic kernel of head 2009-06-14 (vers. 1.526).
> And so it happened that when RELENG_8 appeared, it moved into the
> stable branch.
> > So help me understand you better: are you complaining about the general L2/L3 separation work, or are you angry about the flow-table enhancement in particular?
> >
> > cheers,
> >
> > -- Qing
> >
> >
> >   
> I understand the importance and necessity of the feature.
> I'll be glad when it actually does what it should.
> But in the current situation, this feature should not be enabled by
> default in the generic kernel of the stable branch.
> 
> Best regards,
> Andrey Groshev.

-- 
| Jeremy Chadwick                                   jdc at parodius.com |
| Parodius Networking                       http://www.parodius.com/ |
| UNIX Systems Administrator                  Mountain View, CA, USA |
| Making life hard for others since 1977.              PGP: 4BD6C0CB |
