multipath stuff

Qing Li qingli at speakeasy.net
Thu Apr 17 07:02:55 UTC 2008


	Hi Bruce,

> 
> First of all thanks for doing this. It would have been nice 
> to have had some advance warning though.
> 

	Hopefully it wasn't too much grief ...

> 
> I'm surprised XORP hasn't been mentioned here, so I'll 
> mention it now...
> 
> For what it's worth this should not make any operational changes to
> XORP's behaviour, it already plumbs routes to the FIB from its RIB
> with the next-hop field, and always parses the next-hop in any
> PF_ROUTE messages it sees. I can't speak for the others.
> 

	I was hoping that would be the case, and that should be
	how it's done in theory, no?

>
> Questions:
>  * So, does this RADIX_MPATH code originate from KAME?
>

	Yes, this work originated from the KAME project.

>
>  * If so, to what extent does it share heritage with the OpenBSD code?
>

	I don't know much about the OpenBSD code and never looked at that
	branch. What I do know is that the original code from KAME worked on
	all except FreeBSD. At least that was the case when I did the
	initial work in FreeBSD 5.4, and the original code had not changed
	for quite some time even before that. The bulk of the addition/change
	is actually in route.c. With the original code, the box would
	boot up and crash immediately. If my memory serves me right,
	the crash was due to address aliases not being handled properly.
	Also, in the original code, deletion did not really work and
	crashed often.
 
> 
> It will be useful as a baseline for other work, in particular
> removing the 32 (S,G) channel limitation from the multicast
> forwarding code -- with such a change it would be possible to
> move multicasting into the usual radix trie lookup by adding a
> new flag which says "flood this to all next-hops specified as
> AF_LINK sockaddrs".
> 

	That would be good. I'd like to learn more about what you have
	in mind and would be happy to help you out on that front.

>
> We really need to get ARP out of there now. :-) 
>

	Yes. I am going to be focused on getting this work revised
	and hope to wrap it up in a couple of weeks. Julian is helping me 
	out on the debugging and testing front.

>
> It would be nice if rt_mpath_matchgate() used the sa_dl_equal() macros 
> from if.c, it reads a bit quirky.
> 

	I will update the code. Thanks.
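
	For readers following along, here is a minimal sketch of the kind
	of link-layer address comparison being suggested. It does not
	reproduce the macro from if.c or rt_mpath_matchgate() itself.
	struct ll_addr and ll_addr_equal() are hypothetical stand-ins for
	struct sockaddr_dl and the real comparison, chosen so the example
	compiles outside the kernel.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for struct sockaddr_dl; NOT the kernel layout. */
struct ll_addr {
	uint8_t len;		/* total length of the structure in use */
	uint8_t alen;		/* length of the link-layer address */
	uint8_t addr[16];	/* link-layer address bytes */
};

/*
 * Return nonzero when both entries carry the same link-layer address.
 * Lengths are compared first so memcmp() never reads past either address.
 */
static int
ll_addr_equal(const struct ll_addr *a, const struct ll_addr *b)
{
	assert(a != NULL && b != NULL);
	if (a->len != b->len || a->alen != b->alen)
		return 0;
	if (a->alen > sizeof(a->addr))
		return 0;	/* malformed entry: claimed length too large */
	return memcmp(a->addr, b->addr, a->alen) == 0;
}
```

	Keeping the comparison in one helper means a gateway match reads as
	a single call instead of open-coded length and byte checks.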

>
> The questions you raise about ownership of FIB entries bear some 
> scrutiny. Microsoft, for example, are pretty strict about 
> only exposing the forwarding table to consumers which are willing 
> to play by all rules of the API. What they have could be termed
> a RIB in and of itself, you never get to interact with the forwarding
> tables directly outside of the TCPIP.SYS driver, except for read-only 
> access e.g. SNMP. There is locking capability in their API.
> 

	I am going to postpone this discussion until I get done
	with the new ARP work. However, one thing I do want to mention
	is that I doubt one can get the proper entries out of the table
	using SNMP, because I do know the forwarding table MIB is
	and has been broken for years. The table does not support
	these concepts with the proper table index.

>
> It really is worth looking at Linux, rtnetlink has an
> informational RFC, it uses a tag-length-value protocol which
> addresses a number of the issues blocking further progress in this
> area, and whilst we can't take the code, the design, and the idea,
> are not subject to the GPL -- particularly so given the
> informational RFC status.
> 

	Interesting. Is this the RFC you are referring to?

3549 Linux Netlink as an IP Services Protocol. J. Salim, H. Khosravi,
     A. Kleen, A. Kuznetsov. July 2003. (Format: TXT=72161 bytes) (Status:
     INFORMATIONAL)
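
	To illustrate the tag-length-value idea mentioned above, here is a
	minimal sketch of a generic TLV encoder and lookup. It is not the
	netlink wire format: the 4-byte header (2-byte type, 2-byte payload
	length, host byte order, no padding) and the names tlv_put() and
	tlv_find() are assumptions made for this example only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TLV_HDR_LEN 4	/* assumed: 2-byte type + 2-byte payload length */

/* Append one attribute at 'off'; return the new offset, or 0 if it won't fit. */
static size_t
tlv_put(uint8_t *buf, size_t cap, size_t off, uint16_t type,
    const void *val, uint16_t len)
{
	assert(buf != NULL);
	if (off > cap || cap - off < (size_t)TLV_HDR_LEN + len)
		return 0;
	memcpy(buf + off, &type, 2);
	memcpy(buf + off + 2, &len, 2);
	if (len > 0)
		memcpy(buf + off + TLV_HDR_LEN, val, len);
	return off + TLV_HDR_LEN + len;
}

/*
 * Find the first attribute of type 'want' in the first 'used' bytes.
 * Attributes of any other type are skipped using their length field.
 * Returns a pointer to the payload, or NULL if absent or truncated.
 */
static const uint8_t *
tlv_find(const uint8_t *buf, size_t used, uint16_t want, uint16_t *len_out)
{
	size_t off = 0;

	while (used - off >= TLV_HDR_LEN) {
		uint16_t type, len;

		memcpy(&type, buf + off, 2);
		memcpy(&len, buf + off + 2, 2);
		if (used - off - TLV_HDR_LEN < len)
			return NULL;	/* length runs past the buffer */
		if (type == want) {
			*len_out = len;
			return buf + off + TLV_HDR_LEN;
		}
		off += (size_t)TLV_HDR_LEN + len;
	}
	return NULL;
}
```

	Because every attribute carries its own length, a reader can step
	over types it does not recognise, which is what lets new attributes
	be added without breaking existing consumers.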


	-- Qing





More information about the cvs-src mailing list