[TEST/REVIEW] ng_ipfw: node to glue together ipfw(4) andnetgraph(4)

Thu Jan 20 12:23:27 PST 2005

Brooks Davis wrote:
> 
> On Thu, Jan 20, 2005 at 04:45:53PM +0300, Gleb Smirnoff wrote:
> >   Julian,
> >
> > On Wed, Jan 19, 2005 at 01:32:35AM -0800, Julian Elischer wrote:
> > J> I'm not sure they do two different things..  Each represents a place to
> > J> send packets.
> > J> If each active divert socket number had a pointer to the module to which it
> > J> was attached then  you could divert to either in-kernel netgraph targets or
> > J> to userland socket based targets.  Currently of you divert to a divert
> > J> 'port number' and nothing is attached to it, the packet is dropped.
> > J> If a divert socket is attached to it, it is sent ot teh socket.
> > J> I would just suggest that is not a great leap of imagination that
> > J> attaching to a hook named 3245 would attach a netgrpah hook to the ipfw
> > J> code in the sam enamespace as the divert portnumber, and that a
> > J> subsequent attempt to attach a divert socket to that port number woild
> > J> fail. The packets diverted there would simply go to the netgraph hook
> > J> instead of going to a socket or being dropped.
> >
> > Well, I've considered this. We are going to have these negatives with this change:
> >
> > 1) require divert loaded/compiled, when we are going to work with a completely
> >    different thing.
> > 2) Acquire & drop lock on divert pcb info for each packet going into netgraph.
> > 3) Extensively hack divert_packet()... Let me explain. The place where we can tell
> > whether we have a socket diversion or a netgraph diversion, is at the very end
> > of divert_packet(). Before this place many things are done, which does not apply
> > to a netgraph diversion.
> > This hacking may bring bugs into divert infrastructure, and add extra CPU cycles
> > for case of netgraph forwarding. I think saving one keyword for ipfw2 doesn't
> > worth this hacks.
> 
> I think the code should be committed more or less as is.  I think the
> netgraph and divert features are relatively orthogonal.

Ok, I agree after looking more into it and reviewing Gebius code.

However there are some potentially serious problems that may emerge with
netgraph in the ipfw picture.  So far nothing in ipfw is causing a stall
or overly deep stack in or out of ipfw.  Divert queues to a socket and
dummynet queues to a queue.  There is a break and some other mechanism
takes it from there.  With ipfw_netgraph this changes and potentially
any number of netgraph nodes can run behind this ipfw_netgraph divert
while we are in ip_input() and ipfw_check_in() plus we hold the SX lock
on the pfil hook.  A processing through of the packet causes it to enter
ip_input() and ipfw_check_in() again while we are still in the first
one potentially holding locks.  This can repeat a number of time with
multiple ipfw netgraph diversion rules.

There are two ways of solving this.  One is to decouple the netgraph
from ipfw and to go through a netisr queue in the ipfw_netgraph node.
The other is to process the netgraph hooks while holding the stack
and to wait for the packet to come back and reintroduce it to ipfw
with the "again:" goto as it is done with 'tee'.  I don't know if this
is possible at all with netgraph.

I have sent my direct code comments to the patch in private email.

When the highlighted problems are solved I'm fine with it.

-- 
Andre