Question about bridging code

Wed Jul 9 12:36:30 PDT 2003

On Wed, 9 Jul 2003 kw3wong at engmail.uwaterloo.ca wrote:

> Hi guys,
> 
> My first attempts at hacking FreeBSD kernel code have not been very fruitful, so 
> I'm hoping someone with more experience and know-how might be able to point out 
> the mistakes that I'm making.
> 
> Firstly, let me explain what I'm trying to do. I'm currently working on a 
> University project that performs some type of transformation (compression, 
> security, string replacement, etc) on packets as they pass through the system. 
> The current setup has the FreeBSD machine configured as a router, and the 
> transformation is performed on packets that are routed. This is done via divert 
> sockets and everything is fine and dandy, we're getting great results from this 
> setup.
> 
> However, what we want to do next is to have the machine set up as an Ethernet 
> bridge instead, with the transformation performed on the bridged 
> packets. Unfortunately, as most of you probably know, divert sockets do not 
> work with bridges yet.
> 
> So I've been trying to add somewhat hack-ish support for divert sockets over 
> bridges. The concession that I'm making is that instead of diverting IP 
> packets, I'll be diverting Ethernet frames. In userspace my program will 
> reattach the Ethernet header to the packet before passing it back to 
> the divert socket. A second concession is that when I sendto the divert socket, 
> the sin_zero in the sockaddr must contain the source network adaptor name. All 
> these concessions are necessary (I think), as I would otherwise not know how to 
> output the data on an IP-less bridge. 
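
A minimal userland sketch of the two concessions described above: rebuilding
an Ethernet frame in front of a payload, and fitting an interface name into
the 8 bytes of sin_zero. The helper names frame_build and ifname_pack are
hypothetical and exist only for illustration. The sketch assumes the
interface name (for example "fxp0") is short enough to fit with its
terminating NUL.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define FRAME_ADDR_LEN 6   /* length of an Ethernet MAC address */
#define FRAME_HDR_LEN  14  /* dst + src + ethertype, same value as ETHER_HDR_LEN */
#define SIN_ZERO_LEN   8   /* sizeof(sin_zero) in struct sockaddr_in */

/*
 * Hypothetical helper: write an Ethernet header followed by the payload
 * into out. Returns the total frame length, or 0 if out is too small.
 */
size_t
frame_build(uint8_t *out, size_t outlen,
    const uint8_t dst[FRAME_ADDR_LEN], const uint8_t src[FRAME_ADDR_LEN],
    uint16_t ethertype, const uint8_t *payload, size_t paylen)
{
    if (outlen < FRAME_HDR_LEN || paylen > outlen - FRAME_HDR_LEN)
        return 0;
    memcpy(out, dst, FRAME_ADDR_LEN);
    memcpy(out + FRAME_ADDR_LEN, src, FRAME_ADDR_LEN);
    out[12] = (uint8_t)(ethertype >> 8);    /* ethertype is big-endian on the wire */
    out[13] = (uint8_t)(ethertype & 0xff);
    memcpy(out + FRAME_HDR_LEN, payload, paylen);
    return FRAME_HDR_LEN + paylen;
}

/*
 * Hypothetical helper: copy an interface name into a sin_zero-sized buffer,
 * zero-padded. Returns 0 on success, -1 if the name is empty or would not
 * leave room for the terminating NUL.
 */
int
ifname_pack(char zero[SIN_ZERO_LEN], const char *ifname)
{
    size_t n = strlen(ifname);

    if (n == 0 || n >= SIN_ZERO_LEN)
        return -1;
    memset(zero, 0, SIN_ZERO_LEN);
    memcpy(zero, ifname, n);
    return 0;
}
```

Rejecting names of 8 or more characters keeps the field NUL-terminated, so
the receiving side can treat it as an ordinary C string.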

The very simple way to do what you want is to use netgraph.

Look at the netgraph bridging example in /usr/share/examples/netgraph
and add a pair of netgraph sockets at the appropriate places.

You can intercept any packet at any place, much like with divert sockets.
You can also do pre-filtering using the ng_bpf node, which allows you to
do BPF filtering (see the ng_bpf man page).

You can do this all from the command line, and you will need to make only
minimal changes to your userland program.
Basically, get familiar with netgraph and you'll see that you have more
options than you can poke a stick at.
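
To give a rough idea of what a BPF pre-filter looks like, here is a sketch
of the classic four-instruction "IPv4 frames only" program, the same test
your kernel patch does by comparing ether_type with ETHERTYPE_IP. The
struct and the toy interpreter below are local stand-ins written for this
example so that it builds anywhere; they are not the ng_bpf API. On FreeBSD
the real types come from <net/bpf.h>, and how a program is loaded onto a
hook is described in the ng_bpf man page.

```c
#include <stddef.h>
#include <stdint.h>

/* Local stand-in with the same layout idea as struct bpf_insn. */
struct insn {
    uint16_t code;  /* opcode */
    uint8_t  jt;    /* relative jump if the test is true */
    uint8_t  jf;    /* relative jump if the test is false */
    uint32_t k;     /* immediate operand */
};

/* The three classic BPF opcodes this sketch needs. */
#define OP_LDH_ABS 0x28  /* BPF_LD  | BPF_H   | BPF_ABS: load 16 bits at offset k */
#define OP_JEQ_K   0x15  /* BPF_JMP | BPF_JEQ | BPF_K:   compare accumulator with k */
#define OP_RET_K   0x06  /* BPF_RET | BPF_K:             return k */

/* Accept the whole frame if the ethertype at offset 12 is 0x0800 (IPv4). */
const struct insn ipv4_only[] = {
    { OP_LDH_ABS, 0, 0, 12 },
    { OP_JEQ_K,   0, 1, 0x0800 },
    { OP_RET_K,   0, 0, 0xffffffffu },  /* match: keep everything */
    { OP_RET_K,   0, 0, 0 },            /* no match: keep nothing */
};

/*
 * Toy interpreter for illustration only. It understands just the three
 * opcodes above and returns 0 (reject) for anything else or for a load
 * beyond the end of the packet.
 */
uint32_t
toy_bpf_run(const struct insn *prog, size_t ninsn,
    const uint8_t *pkt, size_t len)
{
    uint32_t acc = 0;
    size_t pc = 0;

    while (pc < ninsn) {
        const struct insn *in = &prog[pc++];

        switch (in->code) {
        case OP_LDH_ABS:
            if ((size_t)in->k + 2 > len)
                return 0;
            acc = ((uint32_t)pkt[in->k] << 8) | pkt[in->k + 1];
            break;
        case OP_JEQ_K:
            /* Jumps are relative to the next instruction. */
            pc += (acc == in->k) ? in->jt : in->jf;
            break;
        case OP_RET_K:
            return in->k;
        default:
            return 0;
        }
    }
    return 0;
}
```

A nonzero return means "this frame matches"; with a filter like this in
front, only IPv4 frames would need to reach the userland program and
everything else could be bridged untouched.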

> 
> So here is what my code changes involved so far. BTW, I'm using FreeBSD 4.8
> 
> 1) Removed the check in ipfw_chk (ip_fw2.c) for whether it is layer2 or not. 
> This allows bridged packets to still match the ipfw2 divert rules.
> 
> 2) In bridge.c at function bdg_forward, after the ip_fw_chk_ptr (and after the 
> check for dummynet, around line 974), the following code fragment is added
> 
>     if (i != 0 && (i & IP_FW_PORT_DYNT_FLAG) == 0) {
>         struct mbuf *m;
> 
>         /* Need to determine whether this is an IP. If not just forward
>         */
>         if (ntohs(eh->ether_type) != ETHERTYPE_IP)
>             goto forward;
> 
>         if ( shared ) {
>             int j = min(m0->m_pkthdr.len + ETHER_HDR_LEN, max_protohdr) ;
> 
>             m0 = m_pullup(m0, j) ;
>             if (m0 == NULL)
>                 return NULL;
>         }
> 
>         if (shared == 0 && once ) { /* no need to copy */
>             m = m0 ;
>             m0 = NULL ; /* original is gone */
>         } else {
>             m = m_copypacket(m0, M_DONTWAIT);
>             if (m == NULL) {
>                 printf("bdg_forward: sorry, m_copypacket failed!\n");
>                 return m0 ; /* the original is still there... */
>             }
>         }
> 
>         if ( (void *)(eh + 1) == (void *)m->m_data) {
>             m->m_data -= ETHER_HDR_LEN ;
>             m->m_len += ETHER_HDR_LEN ;
>             m->m_pkthdr.len += ETHER_HDR_LEN ;
>             bdg_predict++;
>         } else {
>             M_PREPEND(m, ETHER_HDR_LEN, M_DONTWAIT);
>             if (m == NULL)
>             {
>                 printf("M_PREPEND failed\n");
>                 /* Should probably return original instead of NULL */
>                 /* return NULL; */
>                 return m0;
>             }
>             bcopy(&save_eh, mtod(m, struct ether_header *), ETHER_HDR_LEN);
>         }
> 
>         divert_packet(m, 1, i & 0xffff, args.divert_rule);
>         return NULL;
>     }
> 
> This allows me to divert the ethernet frames to userspace.
> 
> 
> 3) To allow me to inject Ethernet frames back into the system via divert 
> sockets, I've modified div_output so that it will call ether_output_frame. The 
> following are my changes to div_output, which are added before ip_output is 
> called:
> 
>     /*  rcvif is copied from sin_zero, and is required to be valid
>         for the current system to work
>     */
>     if (m->m_pkthdr.rcvif != NULL && BDG_USED(m->m_pkthdr.rcvif))
>     {
>         if (m->m_len < sizeof(struct ether_header)) {
>             /* XXX error in the caller. */
>             error = EINVAL;
>             goto cantsend;
>         }
>         
>         return ether_output_frame(m->m_pkthdr.rcvif, m);
>     }
> 
> 4) In userspace for testing purposes, I have a program that simply reads from 
> the divert socket, and writes back out to it - here's the core snippet of the 
> code.
> 
>     while (true)
>     {
>         sstBytes = ::recvfrom(nFD, kpucInPacket, sizeof(kpucInPacket), 0,
>             (struct sockaddr *) &SockAddr, &AddrLen);
> 
>         if (sstBytes == -1)
>             ::err(1, "recvfrom");   /* first argument is the exit status */
> 
>         ::bcopy(SockAddr.sin_zero, 
>             SockAddrSend.sin_zero, 
>             sizeof(SockAddr.sin_zero));
> 
>         int nSendBytes = ::sendto(nSendFD, (void*)kpucInPacket, sstBytes, 0,
>             (struct sockaddr *) &SockAddrSend, sizeof(SockAddrSend));
> 
>         if (nSendBytes != sstBytes)
>             ::err(1, "sendto");
>     }
> 
> 
> Now I understand I'm breaking lots of abstractions/layers, but I do plan to 
> clean that up a bit later. And I also understand that perhaps no one else in 
> the world needs this functionality - although I can see a couple of other 
> possible applications for it. 

Netgraph is a link-layer manipulation framework.
For link-layer stuff it works much better than divert.

(I'm not biased: Archie and I wrote both of them, for different reasons :-)

> 
> The changes do seem to work: I'm able to receive the Ethernet frame and also 
> reinject it via the divert sockets - ping, ftp, etc. all work over the bridge 
> when my test program is running. However, I'm finding that I'm losing/leaking 
> mbufs. sbdrop will complain and panic that the sb_cc doesn't match up with what 
> the mbuf chain says - usually the sb_cc will be larger by a couple of hundred 
> bytes. Furthermore, netstat -m will show that I have mbufs allocated to 
> socket names and addresses even after the termination of the diverting program. 
> This only seems to happen when I transfer a really large file (>100M) over ftp 
> at high speed (full line speed of a 100Mbps network). Ping and ftping small 
> files do not seem to cause the mbuf leakage.
> 
> So my question is, does anyone see where I might be losing the mbufs - are there 
> some mbufs that must be freed or not freed that I'm not aware of? I've never 
> worked on the FreeBSD kernel before, so I'm not 100% sure how to correctly 
> manage the mbufs. Any advice, tips, discussion, anything will be highly 
> appreciated! =) If anyone needs any more clarification/information, just ask 
> and I'll try my best to explain myself better.
> 
> Thanks!!
> Bernie

I haven't looked closely enough to spot your leak.
I'd just use netgraph
(use libnetgraph to do netgraph manipulations from your program).
