Bit twiddling question
Steve Kargl
sgk at troutmask.apl.washington.edu
Thu Mar 9 07:52:37 UTC 2017
On Thu, Mar 09, 2017 at 05:58:52PM +1100, Bruce Evans wrote:
> On Wed, 8 Mar 2017, Steve Kargl wrote:
>
> > Suppose I have a float 'x' that I know is in the
> > range 1 <= x <= 0x1p23 and I know that 'x' is
> > integral, e.g., x = 12.000. If I use GET_FLOAT_WORD
> > from math_private.h, then x=12.000 maps to ix=0x41400000.
> > Is there a bit twiddling method that I can apply to ix to
> > unambiguously determine if x is even of odd?
>
> I don't know of any good method.
Spent a day or so searching; hence, my question.
> > Yes, I know I can do
> >
> > float x;
> > int32_t ix;
> > ix = (int32_t)x;
> >
> > and then test (ix & 1). But, this does not generalize to
>
> This isn't bit twiddling, and is also slow.
Yes, I know it isn't bit twiddling, but it achieves want I
need. I was hoping that if a bit twiddling algorithm was
available (and was faster), then I could change my algorithm.
> > the case of long double on a ld128 architecture. That is,
> > if I have 1 <= x < 1xp112, then I would need to have
> >
> > long double x;
> > int128_t ix;
> > ix = (int128_t)x;
> >
> > and AFAICT sparc64 doesn't have an int128_t.
>
> If sparc64 has this, it would be even slower. Sparc64 emulates
> all 128-bit FP. This makes 128-bit sparc64 ~10-100 times slower
> than 64-bit sparc64 FP, and 100-1000 times slower than modern
> x86 64 and 8-bit FP. FP to integer conversions tend to be slower
> than pure FP, and are especially tricky for integers with more
> bits than FP mantissas.
>
> Consider bit twiddling to classifiy oddness of 2.0F (0x40000000 in
> bits) and 3.0F (0x40400000 in bits). The following method seems
> to be not so bad. Calculates that the unbiased exponent for 3.0F
> is 1. This means that the 1's bit is (1 << (23 - 1)) = 0x00400000
> where we see it for 3.0F but not for 2.0F. All bits to the right
> of this must be 0 for the value to be an integer. I don't know
> how you classified integers efficiently without already determining
> the position of the "point" and looking at these bits. There are
> complications for powers of 2 and the normalization bit being implicit.
Thanks. I'll look into the above to see if I can up with
something that does not require the cast.
To give a hint at what I have been working on, I have implementations
for sinpif(x), sinpi(x), cospif(x), and cospi(x). For sinpif(x) and
cospif(x) I have kernels that give correctly rounded results for
FLT_MIN <= x < 0x1p23 (at least on x86_64). I also have slightly
faster kernels that give max ULPs of about 0.5008. For sinpi(x) and
cospi(x), my kernels currently give about 1.25 ULP for the worse case
error if I accumulate the polynomials in double precision. If I
accumulate the polynomials in long double precision, then I get
correctly rounded results. To complete the set, I was hoping to
work out ld80 and ld128 versions. `ld128 is going to be problematic
due to the absense of int128_t.
I'll send you what I have in a few days.
--
Steve
20161221 https://www.youtube.com/watch?v=IbCHE-hONow
More information about the freebsd-numerics
mailing list