svn commit: r233684 - head/sys/x86/include

Fri Mar 30 12:12:01 UTC 2012

On Thu, 29 Mar 2012, Dimitry Andric wrote:

> Log:
>  Fix an issue introduced in sys/x86/include/endian.h with r232721.  In
>  that revision, the bswapXX_const() macros were renamed to bswapXX_gen().
>
>  Also, bswap64_gen() was implemented as two calls to bswap32(), and
>  similarly, bswap32_gen() as two calls to bswap16().  This mainly helps
>  our base gcc to produce more efficient assembly.
>
>  However, the arguments are not properly masked, which results in the
>  wrong value being calculated in some instances.  For example,
>  bswap32(0x12345678) returns 0x7c563412, and bswap64(0x123456789abcdef0)
>  returns 0xfcdefc9a7c563412.

No, bswap32() and bswap64() were properly masked in r232721, and not only
that, they were cast to the correct type, as required to prevent related
problems when they are used.  It was the next commit, rmumble, that broke
this, by using uncast expressions sub-expressions for bswap32_gen() and
bswap64_gen().

>  Fix this by appropriately masking the arguments to bswap16() in
>  bswap32_gen(), and to bswap32() in bswap64_gen().  This should also
>  silence warnings from clang.

Sorry, this is inappropriate.  It has no effect except to silence the
warnings from clang in a confusing way.

>  Submitted by:	jh
>
> Modified:
>  head/sys/x86/include/endian.h
>
> Modified: head/sys/x86/include/endian.h
> ==============================================================================
> --- head/sys/x86/include/endian.h	Thu Mar 29 23:30:17 2012	(r233683)
> +++ head/sys/x86/include/endian.h	Thu Mar 29 23:31:48 2012	(r233684)
> @@ -65,9 +65,9 @@
>
> #define	__bswap16_gen(x)	(__uint16_t)((x) << 8 | (x) >> 8)
> #define	__bswap32_gen(x)		\
> -	(((__uint32_t)__bswap16(x) << 16) | __bswap16((x) >> 16))
> +	(((__uint32_t)__bswap16((x) & 0xffff) << 16) | __bswap16((x) >> 16))
> #define	__bswap64_gen(x)		\
> -	(((__uint64_t)__bswap32(x) << 32) | __bswap32((x) >> 32))
> +	(((__uint64_t)__bswap32((x) & 0xffffffff) << 32) | __bswap32((x) >> 32))

This has no effect except to make it clear to clang (but not to humans)
that the implicit conversions do the right thing.  It would have worked
around the bug in a previous version and accidentally fixed the warning
in the same way as this version.

For example, consider bswap32(0x12345678).  This becomes __bswap32_gen(
0x12345678) after a null cast of the arg to __uint32_t.  (We don't cast
the result of __bswap32_gen(), and depend on minor magic (mainly on
32-bit ints) for this to be unnecessary.).  The next reduction, for the
correct version which has the above expression without the 0xfff mask, is:

     (((__uint32_t)__bswap16(0x12345678) << 16) | __bswap16(0x12345678 >> 16))

The __bswap16() macros work unsurprisingly (except for the conditional
operator in them!) and give constants of type uint16_t which I denote
by a US suffix.  The result of evaluating them is:

     (((__uint32_t)0x7856US << 16) | 0x3412US)

and the result is the unsurprising 0x78563412.

Note that for the 'gen' part, everything is explicitly cast, as required
for the result to be correct, and there is no reason for clang to warn,
and it doesn't.  However, for the `var' part, clang warns spuriously.
This is a bug in clang, since the `var' part is not even used.
0x12345678 >> 16 is 0x1234 which is representable by __uint16_t, so there
is no warning for it.  However, __bswap16(0x12345678) expands to
essentially:

 	1 ? __bswap16_gen(0x12345678) : __bswap16_var(0x12345678)

Here __bswap16_var() takes a __uint16_t arg, so an implicit value-losing
cast is required to call it.  clang warns about this although
__bswap16_var() is not actually called.

Masking with 0xffff fixes this accidentally by converting 0x12345678
to 0x5678.  Now it is representable as a __uint16_t, so clang doesn't
warn about downcasting it when not-calling __bswap16_var() on it.

A previous version broke the result by changing __bswap16() to
__bswap16_gen().  In the critical expression:

     (((__uint32_t)__bswap16_gen(0x12345678) << 16) | __bswap16_gen(0x12345678 >> 16)),

0x12345678 >> 16 has only 16 bits, so there are no problems with it.  But
in __bswap16_gen(0x12345678), the value has extra bits which the macro
is intentionally designed not to handle (callers are required to pass it
only 16 bits.  Its result is (0x12345678 << 8) | (0x12345678 >> 8) cast
to __uint16_t.  Compilers should warn about the overflow in this.  If
they just shift the bits out, the result is 0x34567800 | 0x00123456 =
0x34567c56 cast down = 0x7c56.  It's surprising that this has only 1
bit incorrect.

Masking with 0xffff fixes this by discarding the extra bits, which gives
the same value although a different type than the unbroken version
(the unbroken version casts the arg to __uint16_t before invoking
__bswap16_gen() although not before invoking __bswap16_var() since it
knows that the prototype will do an implicit cast for the latter and
it doesn't want to know about the clang bug.

>
> #ifdef __GNUCLIKE_BUILTIN_CONSTANT_P
> #define	__bswap16(x)				\
>

The correct fix for working around the clang bugs is as I said before:
cast the arg in the place where there is a problem:

% Index: endian.h
% ===================================================================
% RCS file: /home/ncvs/src/sys/x86/include/endian.h,v
% retrieving revision 1.7
% diff -u -2 -r1.7 endian.h
% --- endian.h	29 Mar 2012 23:31:48 -0000	1.7
% +++ endian.h	30 Mar 2012 11:20:37 -0000
% @@ -66,18 +66,18 @@
%  #define	__bswap16_gen(x)	(__uint16_t)((x) << 8 | (x) >> 8)
%  #define	__bswap32_gen(x)		\
% -	(((__uint32_t)__bswap16((x) & 0xffff) << 16) | __bswap16((x) >> 16))
% +	(((__uint32_t)__bswap16(x) << 16) | __bswap16((x) >> 16))
%  #define	__bswap64_gen(x)		\
% -	(((__uint64_t)__bswap32((x) & 0xffffffff) << 32) | __bswap32((x) >> 32))
% +	(((__uint64_t)__bswap32(x) << 32) | __bswap32((x) >> 32))

Remove accidental fix.

% 
%  #ifdef __GNUCLIKE_BUILTIN_CONSTANT_P
%  #define	__bswap16(x)				\
%  	((__uint16_t)(__builtin_constant_p(x) ?	\
% -	    __bswap16_gen((__uint16_t)(x)) : __bswap16_var(x)))
% +	    __bswap16_gen((__uint16_t)(x)) : __bswap16_var((__uint16_t)(x))))

Don't forget to fix __bswap16() too.  To see the problem without this fix,
try __bswap16(0x123456).  The arg is weird and has more than 16-bits, so
clang warns about downcasting it for not-calling __bswap16_var() on it.

Other weird cases are supposed to be fixed too -- the arg is always cast
down explicitly if it is weird and too wide to start.

%  #define	__bswap32(x)			\
%  	(__builtin_constant_p(x) ?	\
% -	    __bswap32_gen((__uint32_t)(x)) : __bswap32_var(x))
% +	    __bswap32_gen((__uint32_t)(x)) : __bswap32_var((__uint32_t)(x)))
%  #define	__bswap64(x)			\
%  	(__builtin_constant_p(x) ?	\
% -	    __bswap64_gen((__uint64_t)(x)) : __bswap64_var(x))
% +	    __bswap64_gen((__uint64_t)(x)) : __bswap64_var((__uint64_t)(x)))
%  #else
%  /* XXX these are broken for use in static initializers. */

I haven't lloked at tijl's larger rewrite yet.

Bruce