kern/60889 - zero IP id change issues in 5.2RC2
Andre Oppermann
andre at freebsd.org
Wed Jan 7 15:35:04 PST 2004
Andre Oppermann wrote:
>
> Richard,
>
> I've attached a patch that fixes the problem with FIN/ACK and one more
> case which got it wrong. I also fixed the host byte order problem in
> ip_output.c for the normal ip_id incrementor.
>
> Some comments and questions:
>
> 1. Do you think it is neccessary to do a htons() on the randomized
> ip_id too? I'd say yes if there is a case where it has to
> monotonically increase afterwards. Does it?
>
> 2. I have a Win2k machine but have check out how I can get tcp header
> compression to work with my Cisco AS5300 (if it doesn't do that by
> default). Will I see the problem when I do a download from a FreeBSD
> 5.2RC2 machine or do I have to use the Windoze as router sending
> packets upwards?`
>
> 3. There are indeed devices clearing the DF bit. For example Cisco
> is recommending this in it's trouble shooting section for broadband
> access via DSL/L2TP where the MTU is lower than 1500 because of the
> tunnelling overhead. Make a route-map to unset the DF bit and apply
> it to the incoming interface.
4. After reading the pf.conf man page from OpenBSD (where it talks about
scrubbing IP packets) I don't think it's a good idea to set the ip_id
to zero in the DF case. It seems to cause various kinds of problems
when for some reason DF is removed along the path (which it shouldn't
but whatever). I think it is clearly better to put a ip_id into every
packet no matter what. Although the ip_randomid() function doesn't
look really cheap to run... but then security comes at a cost.
5. Random ip_id is already a kernel option but it is *not* enabled by
default in 5.2RC2 GENERIC or -current. On the other hand the code
*does* zero the ip_id when DF is set in any case which is troublesome
as we just found out.
Updated fix attached. Same tab-brokenness due to c&p.
--
Andre
Index: ip_output.c
===================================================================
RCS file: /home/ncvs/src/sys/netinet/ip_output.c,v
retrieving revision 1.203
diff -u -p -r1.203 ip_output.c
--- ip_output.c 20 Nov 2003 20:07:38 -0000 1.203
+++ ip_output.c 7 Jan 2004 23:31:56 -0000
@@ -229,10 +229,10 @@ ip_output(struct mbuf *m0, struct mbuf *
/*
* Fill in IP header. If we are not allowing fragmentation,
- * then the ip_id field is meaningless, so send it as zero
- * to reduce information leakage. Otherwise, if we are not
- * randomizing ip_id, then don't bother to convert it to network
- * byte order -- it's just a nonce. Note that a 16-bit counter
+ * then the ip_id field is meaningless, but we don't set it
+ * to zero. Doing so causes various problems when devices along
+ * the path (routers, load balancers, firewalls, etc.) illegally
+ * disable DF on our packet. Note that a 16-bit counter
* will wrap around in less than 10 seconds at 100 Mbit/s on a
* medium with MTU 1500. See Steven M. Bellovin, "A Technique
* for Counting NATted Hosts", Proc. IMW'02, available at
@@ -241,17 +241,11 @@ ip_output(struct mbuf *m0, struct mbuf *
if ((flags & (IP_FORWARDING|IP_RAWOUTPUT)) == 0) {
ip->ip_v = IPVERSION;
ip->ip_hl = hlen >> 2;
- if ((ip->ip_off & IP_DF) == 0) {
- ip->ip_off = 0;
#ifdef RANDOM_IP_ID
- ip->ip_id = ip_randomid();
+ ip->ip_id = ip_randomid();
#else
- ip->ip_id = ip_id++;
+ ip->ip_id = htons(ip_id++);
#endif
- } else {
- ip->ip_off = IP_DF;
- ip->ip_id = 0;
- }
ipstat.ips_localout++;
} else {
hlen = ip->ip_hl << 2;
Index: tcp_subr.c
===================================================================
RCS file: /home/ncvs/src/sys/netinet/tcp_subr.c,v
retrieving revision 1.172
diff -u -p -r1.172 tcp_subr.c
--- tcp_subr.c 6 Jan 2004 23:29:46 -0000 1.172
+++ tcp_subr.c 7 Jan 2004 23:31:57 -0000
@@ -459,6 +459,8 @@ tcp_respond(tp, ipgen, th, m, ack, seq,
tlen += sizeof (struct tcpiphdr);
ip->ip_len = tlen;
ip->ip_ttl = ip_defttl;
+ if (path_mtu_discovery)
+ ip->ip_off |= IP_DF;
}
m->m_len = tlen;
m->m_pkthdr.len = tlen;
@@ -1733,6 +1735,8 @@ tcp_twrespond(struct tcptw *tw, struct s
m->m_pkthdr.csum_flags = CSUM_TCP;
m->m_pkthdr.csum_data = offsetof(struct tcphdr, th_sum);
ip->ip_len = m->m_pkthdr.len;
+ if (path_mtu_discovery)
+ ip->ip_off |= IP_DF;
error = ip_output(m, inp->inp_options, NULL,
(tw->tw_so_options & SO_DONTROUTE), NULL, inp);
}
> If these questions are answered I can prepare the final patch and
> commit it pending review and re@ approval.
>
> --
> Andre
>
> (Note: tabs converted to whitespace because of c&p)
>
> Index: ip_output.c
> ===================================================================
> RCS file: /home/ncvs/src/sys/netinet/ip_output.c,v
> retrieving revision 1.203
> diff -u -p -r1.203 ip_output.c
> --- ip_output.c 20 Nov 2003 20:07:38 -0000 1.203
> +++ ip_output.c 7 Jan 2004 22:43:54 -0000
> @@ -246,7 +246,7 @@ ip_output(struct mbuf *m0, struct mbuf *
> #ifdef RANDOM_IP_ID
> ip->ip_id = ip_randomid();
> #else
> - ip->ip_id = ip_id++;
> + ip->ip_id = htons(ip_id++);
> #endif
> } else {
> ip->ip_off = IP_DF;
> Index: tcp_subr.c
> ===================================================================
> RCS file: /home/ncvs/src/sys/netinet/tcp_subr.c,v
> retrieving revision 1.172
> diff -u -p -r1.172 tcp_subr.c
> --- tcp_subr.c 6 Jan 2004 23:29:46 -0000 1.172
> +++ tcp_subr.c 7 Jan 2004 22:43:55 -0000
> @@ -459,6 +459,8 @@ tcp_respond(tp, ipgen, th, m, ack, seq,
> tlen += sizeof (struct tcpiphdr);
> ip->ip_len = tlen;
> ip->ip_ttl = ip_defttl;
> + if (path_mtu_discovery)
> + ip->ip_off |= IP_DF;
> }
> m->m_len = tlen;
> m->m_pkthdr.len = tlen;
> @@ -1733,6 +1735,8 @@ tcp_twrespond(struct tcptw *tw, struct s
> m->m_pkthdr.csum_flags = CSUM_TCP;
> m->m_pkthdr.csum_data = offsetof(struct tcphdr, th_sum);
> ip->ip_len = m->m_pkthdr.len;
> + if (path_mtu_discovery)
> + ip->ip_off |= IP_DF;
> error = ip_output(m, inp->inp_options, NULL,
> (tw->tw_so_options & SO_DONTROUTE), NULL, inp);
> }
>
> Richard Wendland wrote:
> >
> > I've been asked (for freebsd-bugs) to open a discussion about PR
> > kern/60889 on freebsd-net, to decide if a recent change to IP should be
> > reversed before 5.2-RELEASE (scheduled this month) to give more time
> > for some serious issues and risks my PR has raised to be considered
> > and tested. My proposal is to revert to 5.1 behaviour for now.
> >
> > The recent change is to emit a zero fragmentation id when DF is set, with
> > the objective of improving privacy from external id sequence observation,
> > an issue raised by Steve Bellovin's paper in IMW'02.
> >
> > I've identified 4 problems with this change:
> >
> > 1) Even with path_mtu_discovery set, for some reason TCP emits FIN-ACK
> > without DF. This causes two problems for this change:
> >
> > a) Currently the change doesn't really meet its objective for TCP,
> > it just means FIN-ACK must be observed.
> >
> > b) Because now just one id is consumed per TCP connection on the
> > FIN-ACK, for most/many systems id becomes a close approximate
> > count of the number of TCP connections it has made, which external
> > observers can see, not possible before this change. To me this
> > seems more of a privacy issue for all FreeBSD users than the NAT
> > issue in Steve Bellovin's paper this change seeks to solve.
> >
> > 2) Linux made exactly this change some time ago, around Linux 2.4.4,
> > but was forced to back-out the change because (I think) of practical
> > connectivity issues related to the comment in include/net/ip.h
> > ip_select_ident() where it now implements an incrementing ip_id for DF:
> >
> > /* This is only to work around buggy Windows95/2000
> > * VJ compression implementations. If the ID field
> > * does not change, they drop every other packet in
> > * a TCP stream using header compression.
> > */
> >
> > I'm not aware that anyone checked that this buggy Windows95/2000 VJ
> > compression problem is no longer an issue in practice. I doubt that
> > the large web hosters who use FreeBSD would be best-pleased if they
> > ran into this for even just a few users - especially as there is no
> > config option to disable this change.
> >
> > 3) This change causes ip_id for non-DF to be output in native
> > byte order in ip_output.c. Unfortunately ip_id is still output in
> > Network Byte Order in ip_mroute.c and raw_ip.c, so this change risks
> > little-endian machines emitting the same fragmentation id at about
> > the same time from these different modules, rather than the usual
> > 64k cycle; creating a small but real risk of re-assembly errors.
> > [This isn't in my PR, I've only just noticed it.]
> >
> > I also have a suspicion that some middle-boxes (like HTTP load-balancers)
> > may clear DF without setting a fresh IP id - clearing DF would save them
> > the bother of routing ICMP fragmentation needed back to the source server.
> > If this is so this is another problem this change could show up which
> > may cause re-assembly errors. I know the bug is elsewhere, but it would
> > still become a practical problem for some FreeBSD users.
> >
> > So before going with this change I think four things need to be done:
> >
> > 1) TCP changed so FIN-ACK goes out with DF if path_mtu_discovery set.
> >
> > 2) Tests with Windows95/2000 TCP VJ compression (RFC1144) run.
> >
> > 3) ip_id should be emitted in the same byte order everywhere.
> >
> > 4) The change made a config option, so sites can disable it should
> > they run into problems, just as RANDOM_IP_ID is an option.
> >
> > This all seems too much to do for 5.2-RELEASE, and as I think the problems
> > and risks sufficiently serious, I propose reversing the change until this
> > can be done. We need a quick decision on this to get it into 5.2-RELEASE.
> >
> > The PR has some more detail and tcpdump output demonstrating the
> > issue:
> >
> > http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/60889
> >
> > The change to be reversed can be seen at:
> >
> > http://www.freebsd.org/cgi/cvsweb.cgi/src/sys/netinet/ip_output.c#rev1.189
> > http://www.freebsd.org/cgi/cvsweb.cgi/src/sys/netinet/ip_output.c.diff?r1=1.188&r2=1.189
> >
> > NB If anyone can easily run tests with Windows95/2000 TCP VJ compression that
> > would be great (using tcpdump to see if there is abnormal retransmission).
> >
> > NB2 If anyone knows why FIN-ACK goes out without DF that would be helpful.
> > I've quickly looked at the source, and I can't see why. It's emitted
> > as TCP moves from FIN_WAIT_n state to TIME_WAIT probably at line 3091 of
> > tcp_input.c with a call to tcp_output(). tcp_output() always appears to
> > set IP_DF at line 998 of tcp_output.c, if path_mtu_discovery is enabled.
> > A puzzle.
> >
> > Thanks to Boris Staeblow <bs at dva.in-berlin.de> and Tim Rylance
> > <tkr at puffball.demon.co.uk> for highlighting this change and helping me
> > diagnose the issues.
> >
> > Richard
> > --
> > Richard Wendland richard at wendland.org.uk
> > _______________________________________________
> > freebsd-net at freebsd.org mailing list
> > http://lists.freebsd.org/mailman/listinfo/freebsd-net
> > To unsubscribe, send any mail to "freebsd-net-unsubscribe at freebsd.org"
> _______________________________________________
> freebsd-net at freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-net
> To unsubscribe, send any mail to "freebsd-net-unsubscribe at freebsd.org"
More information about the freebsd-net
mailing list