[CFT/Review] net byte order for AF_INET

Gleb Smirnoff glebius at FreeBSD.org
Wed Oct 10 18:31:51 UTC 2012


  Luigi,

On Wed, Oct 10, 2012 at 02:46:37PM +0200, Luigi Rizzo wrote:
L> I am really grataful you are doing this. A few comments:
L> 
L> + as a strategy, i would probably suggest (something you mostly seem to do already)
L>   that arithmetic comparisons (even if just for equality) always use the HOST format,
L>   so that at some point one needs to change == to > or < this does not cause
L>   subtle bugs. This includes comparison with 0 (there is 1-2 instances where
L>   we do x == 0 when x is in NET format. This might confuse the reader.

Agreed. Converted these places to ntohs().

L>   I would almost suggest the same for | and & though the chances that
L>   one changes | onto something that is endian-sensitive are null.

I'd prefer to live these comparisons in net byte order, since I expect
that

	ip->ip_off & htons(IP_DF)

would be optimized by compiler to a constant comparison, while

	ntohs(ip->ip_off) & IP_DF

can't be optimized.

L> + i hope this change will be merged to STABLE as soon as we are confident
L>   that it is complete so we will have an easier time maintaining the code
L>   across branches.

Not me to decide that. This change is continuation of a change that brought
net byte order to pfil(9) hooks, thus merging should be made of both. Changing
byte order for pfil(9) hooks is ABI breakage which isn't allowed in stable/*.

L> Some comments inline (first one is probably a bug, the others
L> are just suggestions)
L> 
L> > Index: sys/netinet/raw_ip.c
L> > ===================================================================
L> > --- sys/netinet/raw_ip.c	(revision 241370)
L> > +++ sys/netinet/raw_ip.c	(working copy)
L> > @@ -292,6 +292,7 @@
L> >  	 * not modify the packet except for some
L> >  	 * byte order swaps.
L> >  	 */
L> > +	ip->ip_len = ntohs(ip->ip_len);
L> >  	ip->ip_len += off;
L> >  
L> >  	hash = INP_PCBHASH_RAW(proto, ip->ip_src.s_addr,
L> 
L> this seems wrong, perhaps you want
L> 
L> -  	ip->ip_len += off;
L> +	ip->ip_len = htons(ntohs(ip->ip_len) + off);

Maxim Dounin also noticed this. This was done intentionally, but since two
people are in doubt, I will look closer here. Historically BSD had supplied
host byte order in raw sockets, but I suppose, for head/ this is no longer
true for about a year.

L> > Index: sys/netinet/ip_input.c
L> > ===================================================================
L> > --- sys/netinet/ip_input.c	(revision 241370)
L> > +++ sys/netinet/ip_input.c	(working copy)
L> > @@ -914,22 +904,21 @@
L> >  	 * Adjust ip_len to not reflect header,
L> >  	 * convert offset of this to bytes.
L> >  	 */
L> > -	ip->ip_len -= hlen;
L> > -	if (ip->ip_off & IP_MF) {
L> > +	ip->ip_len = htons(ntohs(ip->ip_len) - hlen);
L> > +	if (ip->ip_off & htons(IP_MF)) {
L> >  		/*
L> >  		 * Make sure that fragments have a data length
L> >  		 * that's a non-zero multiple of 8 bytes.
L> >  		 */
L> > -		if (ip->ip_len == 0 || (ip->ip_len & 0x7) != 0) {
L> > +		if (ip->ip_len == 0 || (ip->ip_len & htons(0x7)) != 0) {
L> >  			IPSTAT_INC(ips_toosmall); /* XXX */
L> >  			goto dropfrag;
L> >  		}
L> >  		m->m_flags |= M_FRAG;
L> >  	} else
L> >  		m->m_flags &= ~M_FRAG;
L> > -	ip->ip_off <<= 3;
L> > +	ip->ip_off = htons(ntohs(ip->ip_off) << 3);
L> >  
L> > -
L> >  	/*
L> >  	 * Attempt reassembly; if it succeeds, proceed.
L> >  	 * ip_reass() will return a different mbuf.
L> 
L> + above i would rather use a temp variable ip_len = ntohs(ip->ip_len)
L>   and retain the existing expression in native format.

I'll look at that.

L> + in the chunk below, maybe it makes sense for readability to
L>   use a couple of temp variables for ip->ip_off and ip->ip_len
L>   and adjust the values in *ip only at the end ?

The chunk below is ip_reass() which is definitely not a modern
optimized function. If someone wants to optimize it, he should
start with something heavier than economize byte swapping operations.

Regarding readability: just adding ntohs() all over in place I'm
pretty sure that I did not break anything. Offloading values to
temporary variables may introduce errors, since we are dealing
with multiple packets here and we need carefully to update temporary
variables any time we switch to another packet. I'd prefer to
make the stack byte order swapping change as conservative as
possible, and not try to optimize ip_reass().

-- 
Totus tuus, Glebius.


More information about the freebsd-net mailing list