svn commit: r304436 - in head: . sys/netinet

Sat Aug 20 16:49:25 UTC 2016

On 20/08/16 17:36, Ryan Stone wrote:
> + adrian@, who prompted me to look at UDP in the first place
>
> I'm really not sure what my next step should be.  I'm willing to revert
> r304436, but I really don't want to revert r304437 because we've seen
> crashes in the real world due to the missing locking.  Unfortunately,
> reverting r304436 would mean that every UDP packet would incur the
> overhead of an additional rlock/runlock call, which is what I've been
> trying to avoid.  I don't see a particularly good path forward.

Your changes have the same motivations as my historical change to move 
group-level IP multicast lookups (and locking) out of the output path.

Source-specific multicast (SSM) requires support for reception filters 
on individual sockets, so I moved those multicast-specific input checks 
into the socket layer.

> - The if_addr_lock would appear to be an excellent candidate to be
> converted into an rmlock, but unfortunately we made the mistake of
> exposing the lock through the ifnet KPI.  Fixing that would require
> rototilling every single Ethernet/WiFi/etc driver in the tree.

Oops...

> - Providing a mechanism for ip_input() to signal to udp_input() that the
> packet was addressed to an L3 broadcast address would require
> rototilling the pr_input interface, and I'd have to carefully ensure
> that if anything might interpose itself between the two layers (IPSec?)
> that the flag would have to be passed through correctly.
>
> - mbuf flags are far too precious to allocate a new one for such a
> narrow use-case

Agree.

Then it may be that slipping the definition of M_BCAST to mean 'And IP 
identified that this is network-layer broadcast' is the most expedient 
solution.

A quick look around suggests you may be able to get away with it. You'd 
essentially be widening the definition of the existing M_BCAST flag.

Given the ultimate consumer of the mbuf in the case you are addressing 
in the code (bad pun) is udp_input(), that may be fine.

(We had to do something similar for ILNPv6, because of how IPv6 input 
works, so it allocates an unused bit from the IPv6 flow label.)

But it has to be qualified very carefully. If L2 is re-entered from IP 
(e.g. a netgraph IP-level hook), it may have unexpected side-effects. I 
have not given this the KScope treatment, just a quick fxr.watson.org.