Deferring inp_freemoptions() to an asynchronous task
bms at incunabulum.net
Mon Jan 9 15:41:22 UTC 2012
Sorry it's taken me so long to reply.
No objections in principle to your change, but this seems to point at a
more general issue with modern network controllers.
You've also stumbled on the behaviour specific to how BSD has
traditionally dealt with broadcast/multicast sockets. The pcbinfo
structure can't really be disentangled from this.
Of course, it doesn't help that we have historically required these
sockets to be bound to INADDR_ANY. It might be useful to break reception
out using a separate hash/tree, rather than walking all sockets as is
currently done, but legacy usage needs to be supported.
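To make the separate-hash idea concrete, here is a minimal user-space sketch, not kernel code. It assumes each socket has joined exactly one IPv4 group; struct mc_member, struct group_table, group_hash() and the table size are all hypothetical names and choices made up for illustration. Delivery walks one bucket instead of every socket.

```c
#include <stddef.h>
#include <stdint.h>

#define GROUP_HASH_SIZE 64      /* power of two; the size is an arbitrary choice */

/* Hypothetical stand-in for a socket that has joined one IPv4 group. */
struct mc_member {
    uint32_t group;             /* group address, host byte order */
    int sock_id;                /* illustrative socket identifier */
    struct mc_member *next;     /* next member in the same bucket */
};

struct group_table {
    struct mc_member *bucket[GROUP_HASH_SIZE];
};

/* Fold the 32-bit group address down to a bucket index. */
static unsigned group_hash(uint32_t group)
{
    group ^= group >> 16;
    group ^= group >> 8;
    return group & (GROUP_HASH_SIZE - 1);
}

/* Link a member at the head of its group's bucket. */
static void group_insert(struct group_table *t, struct mc_member *m)
{
    unsigned h = group_hash(m->group);

    m->next = t->bucket[h];
    t->bucket[h] = m;
}

/*
 * Call fn (if non-NULL) for every member of `group` and return how many
 * members matched. Only one bucket is walked, not the whole socket list.
 */
static int group_deliver(const struct group_table *t, uint32_t group,
    void (*fn)(int sock_id, void *arg), void *arg)
{
    const struct mc_member *m;
    int n = 0;

    for (m = t->bucket[group_hash(group)]; m != NULL; m = m->next) {
        if (m->group != group)
            continue;           /* hash collision with another group */
        if (fn != NULL)
            fn(m->sock_id, arg);
        n++;
    }
    return n;
}
```

Legacy sockets bound to INADDR_ANY without an explicit membership would still need the existing walk, so something like this could only supplement the current lookup.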
Interestingly enough, Microsoft has probably done something similar,
judging from things which appear in MSDN.
John Baldwin wrote:
> I have a workload at work where a particular device driver can take a while to
> update its MAC filter table when adding or removing multicast link-layer
> addresses. One of the ways I've tackled fixing this is to change
> inp_freemoptions() so that it does all of its actual work asynchronously in a
> separate task. Currently it does its work synchronously; however, it can be
> invoked while the associated protocol holds a write lock on its pcbinfo lock
> (e.g. from in_pcbdetach() called from udp_detach()). This stalls all packet
> reception for that protocol since received packets need a read lock on the
> pcbinfo to look up the socket associated with a given (ip, port) tuple.
There is often a delay between joining a group and the hash filter entry
actually being set up in the MAC, so these operations are effectively
asynchronous. Many apps seem to assume the operation is instantaneous rather
than deferred; they are probably being naive...
It is not surprising that the same holds for taking down the hash filter entry.
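The deferral John describes can be sketched in user-space C as follows. This is an illustration of the pattern only, not his patch: struct moptions, moptions_free_deferred(), moptions_drain() and the groups_released counter are hypothetical names, and a pthread mutex stands in for whatever lock would protect the deferred list. In the kernel, the drain step would presumably be run by a task scheduled through taskqueue(9).

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for a socket's multicast options. */
struct moptions {
    int nmemberships;           /* groups still joined */
    struct moptions *next;      /* link on the deferred list */
};

static pthread_mutex_t deferred_lock = PTHREAD_MUTEX_INITIALIZER;
static struct moptions *deferred_head;
static int groups_released;     /* counts simulated (slow) group leaves */

/*
 * Fast path: cheap enough to call while the protocol-wide lock is held.
 * It only links the object onto a list; no driver work happens here.
 */
static void moptions_free_deferred(struct moptions *mo)
{
    pthread_mutex_lock(&deferred_lock);
    mo->next = deferred_head;
    deferred_head = mo;
    pthread_mutex_unlock(&deferred_lock);
    /* A kernel version would schedule the worker task at this point. */
}

/*
 * Slow path: runs later from a worker context with no protocol lock held.
 * Returns the number of deferred objects processed.
 */
static int moptions_drain(void)
{
    struct moptions *list, *mo;
    int n = 0;

    /* Detach the whole list so the lock is held only briefly. */
    pthread_mutex_lock(&deferred_lock);
    list = deferred_head;
    deferred_head = NULL;
    pthread_mutex_unlock(&deferred_lock);

    while ((mo = list) != NULL) {
        list = mo->next;
        /* Stands in for the slow MAC filter updates in the driver. */
        groups_released += mo->nmemberships;
        mo->nmemberships = 0;
        n++;
    }
    return n;
}
```

The point of the split is that packet reception is blocked only for the list insertion, while the slow filter updates happen after the pcbinfo lock has been dropped.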