cvs commit: src/sys/conf files options src/sys/net radix.c radix.h route.c route.h rtsock.c src/sys/netinet in_proto.c ip_output.c src/sys/netinet6 in6_proto.c in6_src.c nd6_nbr.c

Claudio Jeker claudio at openbsd.org
Tue Apr 15 23:06:59 UTC 2008


On Tue, Apr 15, 2008 at 05:53:12PM +0200, Andre Oppermann wrote:
> Qing Li wrote:
>> 	Hi Andre,
>>>>   is disallowed. For example,
>>>>             route add -net 192.103.54.0/24 10.9.44.1
>>>>             route add -net 192.103.54.0/24 10.9.44.2
>>>>     The second route insertion will trigger an error message of
>>>>   "add net 192.103.54.0/24: gateway 10.2.5.2: route already in table"
>>>
>>> Would it make sense to retain this behavior by default (POLA) and have 
>>> multi-path being enabled via sysctl like packet forwarding in general?
>>> Just adding the same route twice with different next-hops can lead to 
>>> very confusing situations for the users which are not used to multi-path.
>>>
>> 	I think that is possible. Were you thinking more along the
>> 	line of accidental route insertion ... Because users who
>> 	are not familiar with ecmp probably won't ever bother
>> 	with more than one route per destination. 
>
> If there is no error message when adding a second route it easily
> happens.  Due to hash based balancing some connections work and
> some do not.  Very confusing.
>
>>>>   "route: writing to routing socket: No such process"
>>>>   "delete net default: not in table"
>>> Can this be made more descriptive?  These messages are about as confusing 
>>> and non-descript as possible.  
>> 	We should fix the above error message in general.
>>> Not being aware of the multipath functionality I would pull out my last 
>>> hair trying to get rid of a route.
>>>
>> 	I think updating the manpage would be a necessary
>> 	next step.
>>> How does this behave with common routing daemons; Quagga/Zebra, OpenBGPD, 
>>> OpenOSPFD?  
>> 	Hmm... Good question, I haven't tried them but
>> 	I will.  Is this something you could help me
>> 	with ?
>
> I've chatted with Claudio Jeker (claudio at openbsd.org).  He's the author
> of OpenBGPD and OpenOSPFD plus some work on the OpenBSD multipath support.
>
> He says the implicit multipath doesn't work out right and is very difficult
> to manage from the routing daemons.  In OpenBSD they had to change it to
> explicitly mark multipath routes with the RTF_MPATH flag in the table,
> during creation and removal.
>
> The problem is that many daemons and programs (dhclient, ppp, ...) do not
> properly remove routes and simply re-add a new one with different
> parameters.  This obviously leads to chaos.
>
> In OpenBSD multipath one has to install a multipath route explicitly with
> the -mpath modifier to route(8) and for daemons with RTF_MPATH in the
> routing message.  Multipath routes also retain this flag during their
> lifetime.  If not set, the normal one-route-only behavior is kept.  This
> allows all non-mpath aware programs to continue to work.
>
> I think this is the model to follow.  Also for inter-BSD compatibility.
>

We made the same mistake in the initial commit; the result was unexpected
behaviour in many applications playing with the routing socket.
Tools like ppp(8), openvpn and many others (IIRC even zebra/quagga were
affected) do routing updates blindly.  First an RTM_ADD is tried, and on an
EEXIST they fall back to RTM_CHANGE. With multipath routing the EEXIST did
not happen, and so a stale route was suddenly around.
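
A rough sketch of that failure mode, written as a toy Python model rather
than real routing-socket code. The names RouteTable, rtm_add, rtm_change
and blind_update are made up for illustration; only the RTM_ADD / EEXIST /
RTM_CHANGE pattern comes from the description above.

```python
import errno


class RouteTable:
    """Toy routing table: prefix -> list of gateways (hypothetical model)."""

    def __init__(self, implicit_multipath):
        self.implicit_multipath = implicit_multipath
        self.routes = {}

    def rtm_add(self, prefix, gateway):
        gateways = self.routes.setdefault(prefix, [])
        # Classic behaviour: any existing route for the prefix is a conflict.
        if gateways and not self.implicit_multipath:
            raise OSError(errno.EEXIST, "route already in table")
        # Even with implicit multipath, an identical route is a conflict.
        if gateway in gateways:
            raise OSError(errno.EEXIST, "route already in table")
        gateways.append(gateway)

    def rtm_change(self, prefix, gateway):
        if prefix not in self.routes:
            raise OSError(errno.ESRCH, "not in table")
        self.routes[prefix] = [gateway]


def blind_update(table, prefix, gateway):
    """What many tools do: try an add, fall back to a change on EEXIST."""
    try:
        table.rtm_add(prefix, gateway)
    except OSError as e:
        if e.errno != errno.EEXIST:
            raise
        table.rtm_change(prefix, gateway)


classic = RouteTable(implicit_multipath=False)
blind_update(classic, "default", "10.0.0.1")
blind_update(classic, "default", "10.0.0.2")   # EEXIST, so the route is changed

implicit = RouteTable(implicit_multipath=True)
blind_update(implicit, "default", "10.0.0.1")
blind_update(implicit, "default", "10.0.0.2")  # no EEXIST, old route stays
```

In the classic table the second update replaces the gateway; in the
implicit-multipath table both gateways remain, and the first one is the
stale route.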

We then added the RTF_MPATH flag to retain the original behaviour unless
the flag is set. By doing that we did not need to change all 3rd party
tools. Later on we decided to keep the RTF_MPATH flag in the kernel to
identify multipath routes more easily. Additionally some userland tools
filter on or use the flag to handle the routes specially.
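
The explicit-flag behaviour can be sketched the same way. This is again a
toy model, not the kernel code: the MpathTable class is hypothetical, the
numeric value given to RTF_MPATH is illustrative only, and the exact
conflict rules in the real implementation may differ from the assumption
made here (a second route is accepted only if the request carries the flag
and the gateway differs).

```python
import errno

RTF_MPATH = 0x40000  # illustrative value only; see net/route.h for the real one


class MpathTable:
    """Toy table where multipath must be requested explicitly (hypothetical)."""

    def __init__(self):
        self.routes = {}  # prefix -> list of (gateway, flags)

    def rtm_add(self, prefix, gateway, flags=0):
        entries = self.routes.setdefault(prefix, [])
        if entries:
            same_gateway = any(gw == gateway for gw, _ in entries)
            # Without RTF_MPATH the old one-route-only behaviour is kept.
            if not (flags & RTF_MPATH) or same_gateway:
                raise OSError(errno.EEXIST, "route already in table")
        entries.append((gateway, flags))

    def multipath_prefixes(self):
        """Prefixes with at least one route carrying RTF_MPATH."""
        return sorted(p for p, entries in self.routes.items()
                      if any(f & RTF_MPATH for _, f in entries))


table = MpathTable()
table.rtm_add("192.103.54.0/24", "10.9.44.1")
try:
    table.rtm_add("192.103.54.0/24", "10.9.44.2")  # unaware tool: still EEXIST
    got_eexist = False
except OSError as e:
    got_eexist = e.errno == errno.EEXIST
table.rtm_add("192.103.54.0/24", "10.9.44.2", RTF_MPATH)  # explicit request
```

An unaware tool still gets its EEXIST and can fall back to RTM_CHANGE,
while a multipath-aware daemon opts in per route.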


>>> Do they have to be aware of the multipath functionality?  Will it confuse 
>>> them?
>>>
>> 	I don't believe these routing protocols necessarily
>> 	have to know about the multipath functionality.
>> 	The routing protocols should continue to function
>> 	wrt route insertion/deletion.
>
> It's easy to throw them into disarray as they do not expect routes to
> persist when they delete (one of) them.
>

Routing protocols must know about multipath routes as soon as they make
decisions based on gateway reachability. For example, the BGP redistribution
logic uses the gateway address, and OSPF has a similar behaviour.

>> 	You do bring up a good question about whether
>> 	we should associate ownership with a route entry
>> 	if multiple routing protocols are running
>> 	in parallel. Is this a common practice from your
>> 	experience ? And should we allow multiple routes
>> 	with the same next-hop but different owners in
>> 	the FIB ??
>
> Yes.  Let me explain.  There are two approaches here: The Quagga/Zebra
> approach where all routing protocol daemons communicate with a central
> daemon that is the single point of contact to the kernel.  The other
> approach is the OpenBGPD/OpenOSPFD approach where each daemon runs on
> its own (because most of the time there is little to no overlap) and
> does its own routing table manipulations.  The second approach is a
> bit tricky at the moment as the routing socket is not really intended
> for operating in this way and the daemons have to be aware of each
> other in certain ways.
>
> Ideally, and this is what Claudio says as well, we should end up with
> the following functionality:
>
>  - equal cost multipath where one prefix can have multiple next-hops.
>  - ecmp should be explicit with the RTF_MPATH flag.
>  - a hierarchy of multiple prefixes where the one with the highest
>    priority carries the traffic (possibly with ecmp).
>  - the hierarchy should have a number of precedence levels (interface
>    route, static route, IGP route, EGP route, other).
>  - within those precedence levels it should have further subdivision
>    to prefer OSPF over RIP in the IGP category for example.
>  - a change/delete applies to a specific precedence level if specified.
>  - routing socket filters on reading so that routing daemons can
>    select which precedence levels they want to track (IGP doesn't
>    have to track EGP route changes for example).
>
> With this functionality a number of independent but complementary routing
> daemons can work together in a useful and, more importantly, standardized
> way.
>
> The ospfd inserts a multipath for 10.0.0.0/8 via 192.168.1.1 and
> 192.168.1.2 with precedence 4.  The bgpd inserts a single route for
> 10.0.0.0/8 via 192.168.1.3 with precedence 8.  All traffic goes through
> 192.168.1.1 and .2.  If the ospfd removes both routes, .3 will become
> active right away.  Normally bgpd would have to notice the removal and
> then insert the new prefix.  If ospfd then wants to insert them again it
> has to remove or modify the route bgpd installed.  With precedence
> multipath these problems go away.
>

Yes, this is where we're heading right now. By having such routing
priorities many current issues can be solved in a very nice way.
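
A minimal sketch of how such priorities could select the active next-hops,
using the ospfd/bgpd example quoted above. It is a toy Python model: the
PriorityTable class and its methods are hypothetical, and it assumes that
a lower precedence number wins, as that example implies.

```python
from collections import defaultdict


class PriorityTable:
    """Toy table: prefix -> precedence level -> set of gateways."""

    def __init__(self):
        self.routes = defaultdict(lambda: defaultdict(set))

    def add(self, prefix, gateway, precedence):
        self.routes[prefix][precedence].add(gateway)

    def delete(self, prefix, gateway, precedence):
        # A delete only touches the precedence level it names.
        level = self.routes[prefix][precedence]
        level.discard(gateway)
        if not level:
            del self.routes[prefix][precedence]

    def active(self, prefix):
        """Gateways at the best (lowest-numbered) level carry the traffic."""
        levels = self.routes.get(prefix)
        if not levels:
            return []
        return sorted(levels[min(levels)])


fib = PriorityTable()
fib.add("10.0.0.0/8", "192.168.1.1", 4)   # ospfd, multipath
fib.add("10.0.0.0/8", "192.168.1.2", 4)   # ospfd, multipath
fib.add("10.0.0.0/8", "192.168.1.3", 8)   # bgpd, single route
before = fib.active("10.0.0.0/8")

fib.delete("10.0.0.0/8", "192.168.1.1", 4)
fib.delete("10.0.0.0/8", "192.168.1.2", 4)
after = fib.active("10.0.0.0/8")          # bgpd route takes over by itself
```

Neither daemon has to watch or modify the other's routes; the table picks
the active set whenever a level appears or disappears.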

>>> What about the other big missing piece; new-arp? ;-)  
>> 	That's on its way. Julian is helping me test the
>> 	patch and review the code etc.  I am still
>> 	debugging a locking/reference count issue and
>> 	I hope to make good progress in the coming week.
>
> May I have a look too before it goes into CVS?
>
> -- 
> Andre
>

-- 
:wq Claudio

