[FIX] dummynet breaks IP reassembly

Ruslan Ermilov ru at FreeBSD.org
Fri Feb 10 14:19:19 PST 2006


Hi Andre,

If I'm not mistaken, *you* converted ipfw to use pfil(9).
During the conversion, the following bug was introduced.

When forwarding fragmented packets through a dummynet pipe
(ip_input -> ip_forward -> ip_output -> pipe -> ip_output)
the last ip_output() in the chain that does the actual IP
delivery sets ip_id of all fragments to different values,
making it impossible to reassemble the packet at receive
side.

Example of forwarding a fragmented IP datagram from em0
to xl0:

# tcpdump -nvi em0 net 192.168.2
tcpdump: listening on em0, link-type EN10MB (Ethernet), capture size 96 bytes
00:04:35.170994 IP (tos 0x0, ttl  64, id 2186, offset 0, flags [+], proto: ICMP (1), length: 1500) 192.168.1.3 > 192.168.2.1: ICMP echo request, id 2819, seq 0, length 1480
00:04:35.171242 IP (tos 0x0, ttl  64, id 2186, offset 1480, flags [none], proto: ICMP (1), length: 548) 192.168.1.3 > 192.168.2.1: icmp

# tcpdump -nvi xl0 net 192.168.2
tcpdump: listening on xl0, link-type EN10MB (Ethernet), capture size 96 bytes
00:04:35.171028 IP (tos 0x0, ttl  63, id 10560, offset 0, flags [+], proto: ICMP (1), length: 1500) 192.168.1.3 > 192.168.2.1: ICMP echo request, id 2819, seq 0, length 1480
00:04:35.171266 IP (tos 0x0, ttl  63, id 10561, offset 1480, flags [none], proto: ICMP (1), length: 548) 192.168.1.3 > 192.168.2.1: icmp

Note that IDs were rewritten from the original 2186 and
worse, they are different.

In 4,x, the "flags" field (see the patch below) was used
to keep the IP_FORWARDING bit passed to ip_output() by
ip_forward().  This bit was kept in the dummynet packet
tag (in the "flags" field) and later passed to the second
ip_output() call, causing the latter to NOT touch the IP
header again.  This feature was lost, resulting in a bug.

Since dummynet never calls ip_output() with unfilled
header, it's safe to always call ip_output() from dummynet
with the IP_FORWARDING bit set, to indicate it's forwarded
from another ip_output() and so it shouldn't attempt to
modify the header.

Surprisingly, it "seemed" to work for packets exceeding
MTU and originating from the dummynetting host, mainly
because the packet wasn't fragmented while in dummynet.
Yet, the ip_id field was always incremented in my tests
(pipe with no bandwidth limitation), comparing it before
and after the dummynet processing.  Now, the ip_id is
always preserved.

# tcpdump -nvi em0 net 192.168.2
tcpdump: listening on em0, link-type EN10MB (Ethernet), capture size 96 bytes
00:12:04.654669 IP (tos 0x0, ttl  64, id 2192, offset 0, flags [+], proto: ICMP (1), length: 1500) 192.168.1.3 > 192.168.2.1: ICMP echo request, id 7939, seq 0, length 1480
00:12:04.654917 IP (tos 0x0, ttl  64, id 2192, offset 1480, flags [none], proto: ICMP (1), length: 548) 192.168.1.3 > 192.168.2.1: icmp

# tcpdump -nvi xl0 net 192.168.2
tcpdump: listening on xl0, link-type EN10MB (Ethernet), capture size 96 bytes
00:12:04.654703 IP (tos 0x0, ttl  63, id 2192, offset 0, flags [+], proto: ICMP (1), length: 1500) 192.168.1.3 > 192.168.2.1: ICMP echo request, id 7939, seq 0, length 1480
00:12:04.654939 IP (tos 0x0, ttl  63, id 2192, offset 1480, flags [none], proto: ICMP (1), length: 548) 192.168.1.3 > 192.168.2.1: icmp

(Note the ip_id is preserved when forwarding.)

I'm not sure about the IPv6 portion but it's consistent
with the normal forwarding path so I believe it's correct.
Comments?

%%%
Index: ip_dummynet.c
===================================================================
RCS file: /home/ncvs/src/sys/netinet/ip_dummynet.c,v
retrieving revision 1.98
diff -u -p -r1.98 ip_dummynet.c
--- ip_dummynet.c	3 Feb 2006 11:38:19 -0000	1.98
+++ ip_dummynet.c	10 Feb 2006 21:20:59 -0000
@@ -769,7 +769,7 @@ dummynet_send(struct mbuf *m)
 		pkt = dn_tag_get(m);
 		switch (pkt->dn_dir) {
 		case DN_TO_IP_OUT:
-			ip_output(m, NULL, NULL, pkt->flags, NULL, NULL);
+			ip_output(m, NULL, NULL, IP_FORWARDING, NULL, NULL);
 			break ;
 		  case DN_TO_IP_IN :
 			ip = mtod(m, struct ip *);
@@ -783,7 +783,7 @@ dummynet_send(struct mbuf *m)
 			break;
 
 		case DN_TO_IP6_OUT:
-			ip6_output(m, NULL, NULL, pkt->flags, NULL, NULL, NULL);
+			ip6_output(m, NULL, NULL, IPV6_FORWARDING, NULL, NULL, NULL);
 			break;
 #endif
 		case DN_TO_IFB_FWD:
@@ -1129,7 +1129,6 @@ locate_pipe(int pipe_nr)
  * ifp		the 'ifp' parameter from the caller.
  *		NULL in ip_input, destination interface in ip_output,
  * rule		matching rule, in case of multiple passes
- * flags	flags from the caller, only used in ip_output
  *
  */
 static int
@@ -1213,8 +1212,6 @@ dummynet_io(struct mbuf *m, int dir, str
     pkt->dn_dir = dir ;
 
     pkt->ifp = fwa->oif;
-    if (dir == DN_TO_IP_OUT || dir == DN_TO_IP6_OUT)
-	pkt->flags = fwa->flags;
 
     if (q->head == NULL)
 	q->head = m;
Index: ip_dummynet.h
===================================================================
RCS file: /home/ncvs/src/sys/netinet/ip_dummynet.h,v
retrieving revision 1.38
diff -u -p -r1.38 ip_dummynet.h
--- ip_dummynet.h	29 Nov 2005 00:11:01 -0000	1.38
+++ ip_dummynet.h	10 Feb 2006 21:13:45 -0000
@@ -130,7 +130,6 @@ struct dn_pkt_tag {
 
     dn_key output_time;		/* when the pkt is due for delivery	*/
     struct ifnet *ifp;		/* interface, for ip_output		*/
-    int flags ;			/* flags, for ip_output (IPv6 ?)	*/
     struct _ip6dn_args ip6opt;	/* XXX ipv6 options			*/
 };
 #endif /* _KERNEL */
Index: ip_fw.h
===================================================================
RCS file: /home/ncvs/src/sys/netinet/ip_fw.h,v
retrieving revision 1.103
diff -u -p -r1.103 ip_fw.h
--- ip_fw.h	13 Dec 2005 12:16:02 -0000	1.103
+++ ip_fw.h	10 Feb 2006 21:15:41 -0000
@@ -510,8 +510,6 @@ struct ip_fw_args {
 	struct ip_fw	*rule;		/* matching rule		*/
 	struct ether_header *eh;	/* for bridged packets		*/
 
-	int flags;			/* for dummynet			*/
-
 	struct ipfw_flow_id f_id;	/* grabbed from IP header	*/
 	u_int32_t	cookie;		/* a cookie depending on rule action */
 	struct inpcb	*inp;
%%%


Cheers,
-- 
Ruslan Ermilov
ru at FreeBSD.org
FreeBSD committer
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 187 bytes
Desc: not available
Url : http://lists.freebsd.org/pipermail/freebsd-net/attachments/20060211/e22a31e4/attachment.bin


More information about the freebsd-net mailing list