Question on SOCK_RAW, implement a bpf->other host tee

Robert Watson rwatson at freebsd.org
Sun Jul 18 14:38:52 PDT 2004


On Sat, 17 Jul 2004, Don Bowman wrote:

> I'm trying to implement a 'tee' which reads from bpf, and sends matching
> packets to another layer-2 adjacent host. 

Just FYI, I would normally do this by writing back out to BPF on the
interface you're "teeing" to, perhaps with some rewriting of the ethernet
layer header.  Since I don't have a big picture of what your application
is actually doing, I can't speak to the details there :-).

> I'm doing this with SOCK_RAW to try and write the packet back out. The
> 'sendto' passes, but i don't see a packet anywhere. 
> 
> Am i correct that i can hand an arbitrarily crafted IP packet into
> sendto, and the stack will write the ethernet header on, pick an
> interface, etc, based on the address in the sendto?

Hmm.  I'm not 100% sure that this is correct.  When you don't set
IP_HDRINCL, the raw IP output code will indeed do the route lookup on the
requested address, as well as insert an IP header that includes the
requested address.  However, when you use IP_HDRINCL to provide your own
IP header, I believe that the target address argument to sendto() will be
ignored, and the address in the IP header you provide used as the
destination instead.  What could well be happening is that you're
re-injecting the packet at the IP forwarding layer, and it is being
forwarded to the original destination IP (possibly localhost) and
processed as a dup packet.  I.e., you are making copies of the packet, but
they're not being sent where you think.  If you're sniffing packets
destined for the local IP, you might well be able to see the tee'd packets
by running tcpdump on lo0.
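
One quick way to check this theory is to print the destination address
that is actually in the header being handed to sendto().  The helper
below is only a sketch: the name header_dst() is made up for
illustration, and it assumes the buffer starts at the first byte of an
IPv4 header as it appeared on the wire.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Format the destination address stored in an IPv4 header as a dotted
 * quad.  With IP_HDRINCL set, this is the address the stack is suspected
 * to route on, rather than the sendto() argument.
 * 'iphdr' points at the first byte of the IP header; 'out' must hold at
 * least 16 bytes.  Returns 'out', or NULL if the buffer is not usable.
 */
static const char *
header_dst(const uint8_t *iphdr, size_t len, char *out, size_t outlen)
{
    assert(out != NULL && outlen >= 16);
    if (len < 20 || (iphdr[0] >> 4) != 4)
        return NULL;            /* too short, or not IPv4 */
    /* The destination address occupies bytes 16..19 of the header. */
    snprintf(out, outlen, "%u.%u.%u.%u",
        (unsigned)iphdr[16], (unsigned)iphdr[17],
        (unsigned)iphdr[18], (unsigned)iphdr[19]);
    return out;
}
```

Calling this from the pcap handler with (pkt + 14) and comparing the
result with the address given to -i would show whether the two differ.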

> I have swapped the ip_len, ip_off fields. 

Are you sure you need to do this?  I thought BPF/PCAP provided those
fields in network byte order already, in which case you shouldn't need to
touch these fields unless you need to adjust them.
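
To see what the captured header really contains, the fields can be read
byte by byte, which avoids any question of host byte order.  This is a
minimal sketch assuming the capture holds the header exactly as it was
on the wire; the function names are illustrative only, and it does not
settle which byte order the raw socket expects on a given release.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read a 16-bit big-endian (network order) value from two bytes. */
static uint16_t
read_be16(const uint8_t *p)
{
    return (uint16_t)((p[0] << 8) | p[1]);
}

/* Total length of the datagram: bytes 2..3 of the IPv4 header. */
static uint16_t
captured_ip_len(const uint8_t *iphdr, size_t caplen)
{
    assert(caplen >= 20);
    return read_be16(iphdr + 2);
}

/* Fragment offset in 8-byte units: low 13 bits of bytes 6..7. */
static uint16_t
captured_frag_off(const uint8_t *iphdr, size_t caplen)
{
    assert(caplen >= 20);
    return (uint16_t)(read_be16(iphdr + 6) & 0x1fff);
}
```

If captured_ip_len() already matches hdr->len - 14 for unfragmented
packets, the capture is in network byte order.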

Also, I notice that you don't appear to be initializing the 'sin_len'
field of struct sockaddr_in, which should be set to sizeof(to) when you
initialize the structure.
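
Something along these lines is what I have in mind.  It is a sketch
only: init_dest() and HAVE_SIN_LEN are names invented here, and the
preprocessor test is just there so the same code compiles on systems
whose sockaddr_in has no sin_len member.

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>

/* BSD-derived systems carry a length byte in struct sockaddr_in. */
#if defined(__FreeBSD__) || defined(__NetBSD__) || defined(__OpenBSD__) || \
    defined(__DragonFly__) || defined(__APPLE__)
#define HAVE_SIN_LEN 1
#endif

/* Fill in a destination sockaddr_in; 'addr' is in network byte order. */
static void
init_dest(struct sockaddr_in *to, struct in_addr addr)
{
    assert(to != NULL);
    memset(to, 0, sizeof(*to));
#ifdef HAVE_SIN_LEN
    to->sin_len = sizeof(*to);  /* the field missing from the program */
#endif
    to->sin_family = AF_INET;
    to->sin_addr = addr;        /* no further byte swapping needed */
}
```

Note that inet_aton() already stores the address in network byte order,
so it can be passed to init_dest() unchanged.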

Generally speaking, the notion of "tee" is a little poorly defined.  I
think what you want to do is what I suggest above -- use BPF to write the
packet out another interface, and do the link layer routing yourself,
rewriting packet header fields if you want to.  That is, basically write
the new target ethernet address into the BPF layer ethernet header, and
write it out on a BPF device attached to the network interface you're
teeing to.  This will leave all the IP layer stuff intact.  I've
previously written a BPF layer bridging application that uses this
technique to do filtered bridging and stateful TCP transformation in user
space.  You don't get the performance of the kernel, but it's a lot easier
to debug.  One caution: be careful of bridging loops and the like.
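
The header rewrite itself is small.  Here is a rough sketch, assuming a
plain Ethernet II frame with a 14-byte header and no VLAN tag;
retarget_frame() and the constants are illustrative names, not taken
from any existing program.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAC_LEN      6   /* bytes in an Ethernet address */
#define ETH_HDR_LEN 14   /* destination(6) + source(6) + ethertype(2) */

/*
 * Point an Ethernet frame at a new layer-2 destination, leaving the IP
 * packet behind the header untouched.  'new_src' may be NULL to keep
 * the original source address.  Returns 0 on success, or -1 if the
 * buffer is too short to contain an Ethernet header.
 */
static int
retarget_frame(uint8_t *frame, size_t len,
    const uint8_t new_dst[MAC_LEN], const uint8_t *new_src)
{
    assert(frame != NULL && new_dst != NULL);
    if (len < ETH_HDR_LEN)
        return -1;
    memcpy(frame, new_dst, MAC_LEN);                /* bytes 0..5 */
    if (new_src != NULL)
        memcpy(frame + MAC_LEN, new_src, MAC_LEN);  /* bytes 6..11 */
    return 0;
}
```

The buffer pcap passes to the handler is const, so copy the frame first,
rewrite the copy, and then write() the copy to a BPF descriptor that has
been bound to the output interface with BIOCSETIF.  See bpf(4) for how
the source address is handled on write (BIOCSHDRCMPLT).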

If you do mean to be changing the IP layer addresses, remember that you'll
need to recalculate header checksums for IP, TCP/UDP, etc.  If you change
any other IP layer fields, you may need to just recalculate the IP layer
checksum.
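
For reference, the IP header checksum is the usual 16-bit one's
complement sum over the header (RFC 1071).  The sketch below covers only
the IP header; TCP and UDP checksums also cover a pseudo-header and the
payload and are not shown.  The function names are illustrative.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * One's complement checksum over an IPv4 header of 'hlen' bytes.
 * Run over a header whose checksum field is zero, it returns the value
 * to store; run over a complete valid header, it returns 0.
 */
static uint16_t
ip_checksum(const uint8_t *hdr, size_t hlen)
{
    uint32_t sum = 0;
    size_t i;

    assert(hlen >= 20 && hlen % 4 == 0);
    for (i = 0; i < hlen; i += 2)
        sum += (uint32_t)((hdr[i] << 8) | hdr[i + 1]);
    while (sum >> 16)                   /* fold carries back in */
        sum = (sum & 0xffff) + (sum >> 16);
    return (uint16_t)~sum;
}

/* Recompute and store the checksum after editing header fields. */
static void
ip_fix_checksum(uint8_t *hdr)
{
    size_t hlen = (size_t)(hdr[0] & 0x0f) * 4;  /* IHL is in 32-bit words */
    uint16_t c;

    hdr[10] = 0;
    hdr[11] = 0;
    c = ip_checksum(hdr, hlen);
    hdr[10] = (uint8_t)(c >> 8);        /* stored in network byte order */
    hdr[11] = (uint8_t)(c & 0xff);
}
```

Call ip_fix_checksum() on the copy of the packet after the last change
to any IP header field and before writing it out.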

Robert N M Watson             FreeBSD Core Team, TrustedBSD Projects
robert at fledge.watson.org      Principal Research Scientist, McAfee Research




> 
> The program I have is below. This is on 4.7.
> The handler gets called, the packet there looks 
> correct, no error on any system call, yet no
> output :(
> 
> Suggestions?
> 
> /*
>  * Copyright 2004 Sandvine Incorporated. All rights reserved
>  */
> 
> #include <stdio.h>
> #include <unistd.h>
> #include <sys/types.h>
> #include <sys/socket.h>
> #include <netinet/in.h>
> #include <netinet/in_systm.h>
> #include <netinet/ip.h>
> #include <pcap.h>
> 
> void
> usage(const char *name)
> {
>     fprintf(stderr, "Usage: %s [-I input_interface] [-O output_interface] [-i output_ip(arp for mac)] [-v]\n", name);
>     exit(1);
> }
> 
> typedef struct
> {
>     int s;
>     struct in_addr output_ip;
> }
> context;
> 
> static int verbose;
> 
> static void 
> handler(unsigned char *ct,
>         const struct pcap_pkthdr *hdr,
> 	const unsigned char *pkt)
> {
>     struct ip *ip = (struct ip *)(pkt + 14);
>     context *ctxt = (context *)ct;
>     struct sockaddr_in to;
>     memset(&to,0,sizeof(to));
>     to.sin_family = AF_INET;
>     to.sin_addr = ctxt->output_ip;
>     if (verbose)
>     {
> 	fprintf(stderr, "Send %d byte packet\n", hdr->len);
>     }
>     ip->ip_len = htons(ip->ip_len);
>     ip->ip_off = htons(ip->ip_off);
>     if (sendto(ctxt->s,
> 	       ip,
> 	       hdr->len-14,
> 	       0,
> 	       (struct sockaddr *)&to,
> 	       sizeof(to)) != (hdr->len-14) )
>     {
> 	err(1, "sendto");
>     }
> }
> 
> static int
> doit(const char *input_interface,
>      const char *output_interface,
>      struct in_addr output_ip)
> {
>     char errbuf[PCAP_ERRBUF_SIZE];
>     pcap_t *in_d, *out_d;
>     context ctxt;
>     int on = 1;
>     struct bpf_program fp;
> 
>     in_d = pcap_open_live((char *)input_interface, 1600, 1, 20, errbuf);
>     if (in_d == 0)
>     {
> 	errx(1, "open of %s failed: %s", input_interface, errbuf);
>     }
> 
>     ctxt.output_ip.s_addr = htonl(output_ip.s_addr);
>     ctxt.s = socket(PF_INET, SOCK_RAW, IPPROTO_RAW);
>     if (ctxt.s < 0)
> 	errx(1, "can't open raw socket");
>     if (setsockopt(ctxt.s, IPPROTO_IP, IP_HDRINCL, (char *)&on, sizeof(on)) < 0)
>     {
> 	err(1,"setsockopt");
>     }
> 
>     memset(&fp,0,sizeof(fp));
>     if (pcap_compile(in_d, &fp, "ip", 0, 0xfffffff0) < 0)
>     {
> 	errx(1, "failed to compile: %s",pcap_geterr(in_d));
>     }
>     if (pcap_setfilter(in_d, &fp) < 0)
>     {
> 	errx(1, "failed to set filter");
>     }
> 
>     pcap_loop(in_d, -1, handler, (unsigned char *)&ctxt);
> }
> 
> int
> main(int argc, char *argv[])
> {
>     int ch;
>     char *input_interface = "ipfw0";
>     char *output_interface = "em2";
>     struct in_addr output_ip;
>     output_ip.s_addr = 0;
> 
>     while ((ch = getopt(argc, argv, "I:O:i:vh?")) != -1)
>     {
> 	switch (ch) 
> 	{
> 	    case 'I':
> 		input_interface = optarg;
> 		break;
> 	    case 'O':
> 		output_interface = optarg;
> 		break;
> 	    case 'i':
> 		if (inet_aton(optarg,&output_ip) < 0)
> 		{
> 		    errx(1, "unknown ip %s", optarg);
> 		}
> 		break;
> 	    case 'v':
> 		verbose = 1;
> 		break;
> 	    case 'h':
> 	    case '?':
> 	    default:
> 		usage(argv[0]);
> 	}
>     }
>     if (verbose)
> 	fprintf(stderr, "%s->%s(%s)\n", input_interface,output_interface,inet_ntoa(output_ip));
>     return doit(input_interface,output_interface,output_ip);
> }
> 