svn commit: r191259 - head/sys/netinet

Mon Apr 20 06:44:35 UTC 2009

On Sunday 19 April 2009 23:05:55 Kip Macy wrote:
> On Sun, Apr 19, 2009 at 1:21 PM, Marko Zec <zec at freebsd.org> wrote:
> > On Sunday 19 April 2009 19:13:37 Kip Macy wrote:
> >> On Sun, Apr 19, 2009 at 3:18 AM, Andre Oppermann <andre at freebsd.org>
> >> wrote:
> >
> > ...
> >
> >> > I have another question on the flowtable:  What is the purpose of it?
> >> > All router vendors have learned a long time ago that route caching
> >> > (aka flow caching) doesn't work out on a router that carries the DFZ
> >> > (default free zone, currently ~280k prefixes).  The overhead of
> >> > managing the flow table and the high churn rate make it much more
> >> > expensive than a direct and already very efficient radix trie lookup.
> >> > Additionally a well connected DFZ router has some 1k prefix updates
> >> > per second.  More information can be found for example at Cisco here:
> >> >  http://www.cisco.com/en/US/tech/tk827/tk831/technologies_white_paper09186a00800a62d9.shtml
> >> > The same findings are also available from all
> >> > other major router vendors like Juniper, Foundry, etc.
> >> >
> >> > Let's examine the situations:
> >> >  a) internal router with only a few routes; The routing and ARP table
> >> >    are small, lookups are very fast and everything is hot in the CPU
> >> >    caches anyway.
> >> >  b) DFZ router with 280k routes; A small flow table has constant
> >> >    thrashing, becoming negative overhead only.  A large flow table
> >> >    has a high maintenance overhead, higher lookup times and still a
> >> >    significant amount of thrashing. The overhead of the flow table
> >> >    is equal to or higher than a direct routing table lookup.
> >> > Concluding that a flow table is never a win but a liability in any
> >> > realistic setting.
> >>
> >> You're assuming that a Cisco- / Juniper-class workload is
> >> representative of where FreeBSD is deployed. I agree that FreeBSD is
> >> sub-optimal for large routing environments for a whole host of other
> >> reasons. A better question is what are "typical" FreeBSD deployments,
> >> and how well would it work there. The flowtable needs to be sized to
> >> correspond to the number of flows; its utility rapidly diminishes as
> >> the number of collisions per bucket increases.
> >
> > ... which makes a flow cache a perfect DoS target in any environment, be
> > it a DFZ or enterprise router or an end host or whatever.
>
> Uhm, assuming that you don't put a limit on the number of flows
> allocated - which I do. When you hit the zone limit for flows you
> simply stop caching new flows.

... which means you fall back to the ordinary routing lookups, but only after 
you have wasted cycles to compute a hash and found out that it doesn't match 
anything in your cache -> in this case I would expect only a degradation in 
performance, not an improvement.
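
To make the miss path concrete, here is a minimal sketch of a capped flow
cache in front of a slower route lookup.  It is not the flowtable code from
this commit: every name in it (flow_lookup, slow_route_lookup, NBUCKETS,
FLOW_LIMIT) is hypothetical, the fixed pool only stands in for a zone limit,
and the "route lookup" is a placeholder.  What it shows is that once the
limit is reached, each new flow still pays for the hash and the bucket walk
before falling through to the ordinary lookup.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NBUCKETS   8u   /* hypothetical table size, must be a power of two */
#define FLOW_LIMIT 4u   /* hypothetical cap, standing in for a zone limit */

struct flow {
    uint32_t     dst;       /* key: destination address only, for brevity */
    int          nexthop;   /* cached result of the slow lookup */
    struct flow *next;      /* collision chain within one bucket */
};

struct flowcache {
    struct flow  *bucket[NBUCKETS];
    struct flow   pool[FLOW_LIMIT];  /* fixed pool: no growth past the cap */
    unsigned      used;
    unsigned long hits, misses, probes;
};

/* Simple integer mixer; any reasonable hash would do for the sketch. */
static uint32_t flow_hash(uint32_t dst)
{
    dst ^= dst >> 16;
    dst *= 0x45d9f3bu;
    dst ^= dst >> 16;
    return dst;
}

/* Placeholder for the real routing table (radix trie) lookup. */
static int slow_route_lookup(uint32_t dst)
{
    return (int)(dst & 0x3u);
}

static int flow_lookup(struct flowcache *fc, uint32_t dst)
{
    assert(fc != NULL);

    /* This cost is paid on every packet, hit or miss. */
    struct flow **head = &fc->bucket[flow_hash(dst) & (NBUCKETS - 1u)];

    for (struct flow *f = *head; f != NULL; f = f->next) {
        fc->probes++;               /* one chain entry examined */
        if (f->dst == dst) {
            fc->hits++;
            return f->nexthop;
        }
    }

    /* Miss: fall back to the ordinary lookup. */
    fc->misses++;
    int nh = slow_route_lookup(dst);

    /* Cache the flow only while below the cap; otherwise just return. */
    if (fc->used < FLOW_LIMIT) {
        struct flow *f = &fc->pool[fc->used++];
        f->dst = dst;
        f->nexthop = nh;
        f->next = *head;
        *head = f;
    }
    return nh;
}
```

With a zero-initialized struct flowcache, the hits, misses and probes
counters give a rough picture of how much work was spent in the cache
before the fallback lookup ran.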

> So the added overhead is simply the 
> extra cache misses up to the collision depth for the bucket. Are you
> two familiar with CAMs?

Not really, but I've heard anecdotes that Ciscos that were capped at 256K 
FIB entries in CAM had to fall back to lookups in software once the size of 
DFZ table exceeded the 256K figure - so everybody rushed to get rid 
of^H^H^H^H upgrade such hardware around 1.5 years ago in anticipation of DFZ 
table bloom.  

But it seems to me that CAM lookups are pretty resilient against DoSing by 
throwing malicious synthetic flows on them, whereas flow caches will melt 
down easily.
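
As a rough illustration of that concern, the toy model below counts how
many cache hits legitimate flows still get after an attacker has sent a
burst of unique synthetic flows into a cache that stops inserting at its
limit.  It is an assumption-laden sketch, not a measurement of any real
implementation: CACHE_CAP, the key ranges and all function names are made
up, and hashing is deliberately left out (a linear scan is used) so that
only the occupancy effect is visible.

```c
#include <stddef.h>
#include <stdint.h>

#define CACHE_CAP 1024u   /* hypothetical flow limit */

struct capped_cache {
    uint32_t key[CACHE_CAP];
    size_t   used;
};

/* Linear scan; a real cache would hash, this model only tracks occupancy. */
static int cache_contains(const struct capped_cache *c, uint32_t k)
{
    for (size_t i = 0; i < c->used; i++) {
        if (c->key[i] == k)
            return 1;
    }
    return 0;
}

/* Returns 1 on a hit, 0 on a miss.  New keys are stored only below the cap. */
static int cache_access(struct capped_cache *c, uint32_t k)
{
    if (cache_contains(c, k))
        return 1;
    if (c->used < CACHE_CAP)
        c->key[c->used++] = k;
    return 0;
}

/*
 * The attacker first sends n_attack unique flows, then n_legit legitimate
 * flows are each seen once per round.  Returns the number of legitimate
 * hits.  Assumes n_legit < 0x80000000 so the two key ranges never overlap.
 */
static unsigned long legit_hits_after_flood(unsigned n_attack,
                                            unsigned n_legit,
                                            unsigned rounds)
{
    struct capped_cache c = { .used = 0 };
    unsigned long hits = 0;

    for (unsigned i = 0; i < n_attack; i++)
        (void)cache_access(&c, 0x80000000u + i);   /* synthetic flows */

    for (unsigned r = 0; r < rounds; r++) {
        for (unsigned i = 0; i < n_legit; i++)
            hits += (unsigned long)cache_access(&c, i);
    }
    return hits;
}
```

In this model, once the synthetic flows have taken every slot, the
legitimate flows are never cached at all, so every one of their packets
takes the miss path described earlier.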

Marko