svn commit: r232491 - in head/sys: amd64/include i386/include
brde at optusnet.com.au
Tue Apr 10 01:45:28 UTC 2012
On Mon, 9 Apr 2012, David Schultz wrote:
> On Sun, Mar 04, 2012, Tijl Coosemans wrote:
>> Copy amd64 float.h to x86 and merge with i386 float.h. Replace
>> amd64/i386/pc98 float.h with stubs.
>> --- head/sys/amd64/include/float.h Sun Mar 4 12:52:48 2012 (r232490, copy source)
>> +++ head/sys/x86/include/float.h Sun Mar 4 14:00:32 2012 (r232491)
>> @@ -42,7 +42,11 @@ __END_DECLS
>> #define FLT_RADIX 2 /* b */
>> #define FLT_ROUNDS __flt_rounds()
>> #if __ISO_C_VISIBLE >= 1999
>> +#ifdef _LP64
>> #define FLT_EVAL_METHOD 0 /* no promotions */
>> +#else
>> +#define FLT_EVAL_METHOD (-1) /* i387 semantics are...interesting */
>> +#endif
>> #define DECIMAL_DIG 21 /* max precision in decimal digits */
> The implication of this code is that FLT_EVAL_METHOD depends on
> the size of a long, which it does not. Instead, it depends on
> whether SSE2 support is guaranteed to be present. If anything,
> the test should be something like #ifndef __i386__.
Actually, it depends on whether both SSE1 and SSE2 support are
guaranteed to be used. The i386 ifdef is wrong too (as is the old
fixed value for i386), since clang with SSE support breaks the abstract
i386 machine by actually using SSE; with gcc, this breakage is under
control of the option -mfpmath=unit which defaults to unit=i387.
Also, float_t and double_t must match FLT_EVAL_METHOD.
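The C99 rule behind that last point can be checked mechanically. The following is only a minimal sketch, assuming a C99 <float.h> and <math.h>; eval_types_consistent() is a made-up name, and comparing sizes is a weaker test than comparing types:

```c
#include <float.h>
#include <math.h>

/*
 * Hypothetical helper, not a libm interface.
 * Returns 1 if the sizes of float_t and double_t agree with what C99
 * requires for the current FLT_EVAL_METHOD, 0 if they disagree, and -1
 * if FLT_EVAL_METHOD is negative or otherwise implementation-defined,
 * in which case nothing can be checked.
 */
static int
eval_types_consistent(void)
{
    switch (FLT_EVAL_METHOD) {
    case 0:     /* float_t is float, double_t is double */
        return (sizeof(float_t) == sizeof(float) &&
            sizeof(double_t) == sizeof(double));
    case 1:     /* both are double */
        return (sizeof(float_t) == sizeof(double) &&
            sizeof(double_t) == sizeof(double));
    case 2:     /* both are long double */
        return (sizeof(float_t) == sizeof(long double) &&
            sizeof(double_t) == sizeof(long double));
    default:    /* indeterminate: the types are implementation-defined */
        return (-1);
    }
}
```

A result of 0 would indicate the kind of mismatch described here, where the types and the macro describe different evaluation methods.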
I use the following hack to work around the clang breakage in libm:
% Index: math.h
% RCS file: /home/ncvs/src/lib/msun/src/math.h,v
% retrieving revision 1.82
% diff -u -2 -r1.82 math.h
% --- math.h 12 Nov 2011 19:55:48 -0000 1.82
% +++ math.h 4 Jan 2012 05:09:51 -0000
% @@ -125,4 +130,10 @@
% : __signbitl(x))
% +#ifdef __SSE_MATH__
% +#define __float_t float
% +#ifdef __SSE2_MATH__
% +#define __double_t double
% typedef __double_t double_t;
% typedef __float_t float_t;
I forgot to hack on FLT_EVAL_METHOD similarly. The fixed value of (-1)
for i386 is sort of fail-safe, since it says that the evaluation method
is indeterminate, so the code must assume the worst. The normal i386
types for float_t and double_t are also sort of fail-safe, since they
are larger than necessary. They just cause pessimal code. So would
FLT_EVAL_METHOD = -1, and I only hacked on the types since my tests
only cover the pessimizations for the types.
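One thing "assume the worst" can mean in practice is that code which needs a result rounded to float or double cannot count on an ordinary assignment or cast to discard extra precision. A minimal sketch of forcing the rounding through a volatile store follows; the helper names are hypothetical, and this is an illustration of the idea rather than the code libm uses:

```c
#include <math.h>

/*
 * Hypothetical helpers: round a value that may carry extra precision
 * down to its nominal type by storing it through a volatile object.
 * When expressions are already evaluated in their own type, the stores
 * are redundant but harmless; that is the pessimization.
 */
static float
strip_to_float(double_t x)
{
    volatile float v = (float)x;    /* forces a real store as float */

    return (v);
}

static double
strip_to_double(double_t x)
{
    volatile double v = (double)x;  /* forces a real store as double */

    return (v);
}
```

With FLT_EVAL_METHOD = -1 the caller has to go through something like this everywhere, even when the compiler is in fact evaluating in the narrow types.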
Note that the compiler builtin __FLT_EVAL_METHOD__ is unusable, since its
value is almost always wrong. With gcc, it is wrong by default (2) but
is changed correctly to 0 by -mfpmath=sse. With clang, it is wrong
by default (0), but becomes correct with SSE1 and SSE2. With only SSE1,
there are even more possibilities for the float evaluation method, but
doubles must be evaluated using the i387 so FLT_EVAL_METHOD must remain
-1. Some cases:
- clang -march=athlon-xp. Athlon-XP only has SSE1, and clang evaluates
  float expressions using SSE1 but double expressions using i387. This
  matches float_t = float and double_t = long double given by the above.
  FLT_EVAL_METHOD = -1 remains correct.
- similarly for gcc -march=athlon-xp -mfpmath=sse.
- clang -march=athlon64. Athlon64 has both SSE1 and SSE2, and clang
  evaluates both float and double expressions using SSE*. This matches
  float_t = float and double_t = double given by the above.
  FLT_EVAL_METHOD = -1 is now wrong.
- similarly for gcc -march=athlon64 -mfpmath=sse. SSE* use can also be
  controlled by -msse (instead of -march), but -mfpmath doesn't
  distinguish between SSE1 and SSE2, so there seems to be no way to
  use SSE2 generally and SSE1 for FP without also using SSE2 for FP.
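The cases above suggest what a similar hack on FLT_EVAL_METHOD might look like. This is an untested sketch for x86 only, not what float.h does; SKETCH_FLT_EVAL_METHOD and sketch_flt_eval_method() are made-up names, and it assumes the compiler predefines __SSE_MATH__ and __SSE2_MATH__ exactly when it uses SSE for float and double arithmetic, respectively:

```c
/*
 * Hypothetical selection, keyed on the same macros as the math.h hack.
 * Only when both float and double expressions are evaluated using SSE
 * is there no extra precision, so only then is 0 right.  With i387
 * only, or with SSE1 only, something is still evaluated on the i387
 * and the fail-safe value of -1 is kept.
 */
#if defined(__SSE_MATH__) && defined(__SSE2_MATH__)
#define SKETCH_FLT_EVAL_METHOD  0       /* float and double both in SSE */
#else
#define SKETCH_FLT_EVAL_METHOD  (-1)    /* i387 is used for something */
#endif

/* Wrapper so the selected value can be inspected at run time. */
static int
sketch_flt_eval_method(void)
{
    return (SKETCH_FLT_EVAL_METHOD);
}
```

For the athlon-xp cases this keeps -1, and for the athlon64 cases it gives 0, provided the macros are defined as assumed.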