Route caching ?

Claudio Jeker cjeker at diehard.n-r-g.com
Wed Aug 22 15:44:40 PDT 2007


On Wed, Aug 22, 2007 at 06:13:19PM +0100, Bruce M. Simpson wrote:
> Claudio Jeker wrote:
> >Just because you believe that route caches are great doesn't mean it is
> >true. Show some real code and include benchmarks with various workloads
> >(e.g. a core router that is hit by many many many sessions).
> >  
> 
> It is a reasonable approach, for a uniprocessor design, to focus on 
> optimizing the route lookup as much as possible. Does this approach 
> scale to SMP, though? This is still a very much open question and from 
> what I have seen of the OpenBSD implementation, it only addresses the 
> uniprocessor case - again please correct me here if I have missed any 
> details.
> 

If it does not scale to SMP then you should rethink your design. But a
cache placed between the routing table and the actual lookup can be
exploited in much the same way as hash lists, unless you make your cache
a full table -- and I would not call such a full-view table a cache anymore.

> I believe the Linux dst cache is strongly tied to the IBM-patented 
> Read-Copy-Update algorithm based on what I've read about their LC-trie 
> implementation.
> 

LC-trie implementations are normally unable to do in-place updates, so you
need to recalculate the full thing from time to time and then just
replace the pointer to the root in an atomic op.
The main problem with LC-tries is that they are covered by
some patents -- at least I remember something like this.
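
To make the "rebuild, then swap the root pointer" idea concrete, here is a
minimal sketch in C11. It is not taken from any real LC-trie or kernel
code: struct fib, fib_build(), fib_publish() and fib_lookup_hop() are
hypothetical names, and the table contents are reduced to a single
placeholder field.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical read-only lookup structure standing in for a compiled trie. */
struct fib {
    int generation;   /* which rebuild produced this table */
    int default_hop;  /* placeholder for the real trie contents */
};

/* Root pointer: readers load it, the rebuilder replaces it atomically. */
static _Atomic(struct fib *) fib_root;

/* Build a complete new table off to the side; nothing is edited in place. */
static struct fib *fib_build(int generation, int default_hop)
{
    struct fib *f = malloc(sizeof(*f));

    if (f == NULL)
        abort();
    f->generation = generation;
    f->default_hop = default_hop;
    return f;
}

/* Publish the new table; return the old one so the caller can retire it. */
static struct fib *fib_publish(struct fib *new_fib)
{
    return atomic_exchange(&fib_root, new_fib);
}

/* Reader side: one atomic load, then plain reads on an immutable table. */
static int fib_lookup_hop(void)
{
    struct fib *f = atomic_load(&fib_root);

    assert(f != NULL);  /* a table must have been published first */
    return f->default_hop;
}
```

A caller would fib_build() a new table, fib_publish() it, and free the
returned old table only once no reader can still be using it. That
deferred free is the hard part, and this sketch leaves it out.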

> >Until now all caching solutions resulted in very bad performance on busy
> >boxes. Remember ip_fastforward or whatever it was called? Another example
> >is all the crappy L3 switches that burn down if the CAM (cache) is flooded.
> >  
> 
> I assume you are referring to NetBSD's flow-based IP forwarding cache, 
> which was implemented outside of the scope of SMP; spl-style interrupt 
> priority masking was still in use at that time.
> 

FreeBSD had the same code before Andre replaced it with a real solution
that does full route lookups.

> It is established that saturating content-addressable memory is going to 
> lead to the slow path being taken, however, that's the trade-off one 
> makes with these designs.
> 

Yes, the system is super fast when it is unloaded, but under load (e.g. an
attack) it enters a doom loop and kills itself. Honestly, such systems are
tuned for synthetic benchmarks and fail horribly in real life.

> >IMO it is better to make the route lookup faster and forget about caching.
> >  
> My concern is that you may be comparing apples with oranges here.
> 
> In the case of SMP, locking does become a consideration, and caches, if 
> carefully implemented, are one way of addressing this.
> 
> On the other hand, CPU affinity has been proposed as a limited solution, 
> however it depends how this is implemented - affinity for lookups, 
> forwarding, or both?
> 

I just wonder whether your little cache will not increase the lookup time
in the real world. Take a busy core router with, say, 250'000 sessions
coming from something like 10'000 to 25'000 unique networks.
These numbers are assumptions, but if you remember SQL Slammer you should
realize that hitting a set of 25'000 prefixes is not exaggerated.
Now assume you added a cache to each CPU holding something like 100 to
1000 entries (you don't want to make it too big). You will have
quite a few cache misses, and the cache will be under constant stress
because it is way too small. So you will hit the routing table a lot, and
that's your slow path. Optimizing the slowest path will give you the most
bang for the buck. Sure, you could blow up your cache until everything fits,
but as I said, for me that is no longer a cache but more an optimized FIB
and therefore a routing table.
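
The numbers above can be tried out with a small simulation. This is only a
sketch under stated assumptions: destinations are spread uniformly over
25'000 prefixes (roughly what a scanning worm produces; real traffic is
usually more skewed and would hit more often), the cache is direct-mapped
with 1000 slots, and cache_hit_rate() and xorshift32() are made-up helper
names. Under those assumptions the hit rate should land around
1000/25'000, i.e. a few percent, so nearly every packet still goes to the
routing table.

```c
#include <assert.h>
#include <stdint.h>

#define CACHE_SLOTS 1000u   /* assumed per-CPU cache size (top of 100..1000) */
#define PREFIXES    25000u  /* assumed number of active prefixes */

/* Small deterministic PRNG (xorshift32) so that runs are repeatable. */
static uint32_t xorshift32(uint32_t *state)
{
    uint32_t x = *state;

    x ^= x << 13;
    x ^= x >> 17;
    x ^= x << 5;
    *state = x;
    return x;
}

/*
 * Simulate `lookups` packets with uniformly distributed destination
 * prefixes going through a direct-mapped cache of CACHE_SLOTS entries.
 * Returns the fraction of lookups that were served from the cache.
 */
static double cache_hit_rate(uint32_t lookups, uint32_t seed)
{
    static uint32_t slot[CACHE_SLOTS];
    uint32_t state = seed ? seed : 1u;  /* xorshift state must be nonzero */
    uint32_t hits = 0;
    uint32_t i;

    assert(lookups > 0);
    for (i = 0; i < CACHE_SLOTS; i++)
        slot[i] = 0;                    /* 0 marks an empty slot */

    for (i = 0; i < lookups; i++) {
        uint32_t prefix = xorshift32(&state) % PREFIXES + 1u; /* 1..PREFIXES */
        uint32_t idx = prefix % CACHE_SLOTS;

        if (slot[idx] == prefix)
            hits++;                     /* fast path */
        else
            slot[idx] = prefix;         /* miss: full lookup, replace entry */
    }
    return (double)hits / (double)lookups;
}
```

Calling cache_hit_rate(1000000, 42) gives the simulated hit fraction;
changing CACHE_SLOTS and PREFIXES shows how large the cache has to get
before it stops being a cache and becomes a full table.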

> Perhaps there is something I am missing about how the OpenBSD 
> implementation deals with SMP, as I am not as familiar with their code 
> as FreeBSD's.
> 

Honestly, OpenBSD does not really care about fine-grained locked SMP at the
moment (we are still running under the big lock). Not that we want to stay
there, but we lack the manpower at the moment.

-- 
:wq Claudio


More information about the freebsd-net mailing list