Bit twiddling question

Thu Mar 9 07:52:37 UTC 2017

On Thu, Mar 09, 2017 at 05:58:52PM +1100, Bruce Evans wrote:
> On Wed, 8 Mar 2017, Steve Kargl wrote:
> 
> > Suppose I have a float 'x' that I know is in the
> > range 1 <= x <= 0x1p23 and I know that 'x' is
> > integral, e.g., x = 12.000.  If I use GET_FLOAT_WORD
> > from math_private.h, then x=12.000 maps to ix=0x41400000.
> > Is there a bit twiddling method that I can apply to ix to
> > unambiguously determine if x is even of odd?
> 
> I don't know of any good method.

Spent a day or so searching; hence, my question.

> > Yes, I know I can do
> >
> > float x;
> > int32_t ix;
> > ix = (int32_t)x;
> >
> > and then test (ix & 1).  But, this does not generalize to
> 
> This isn't bit twiddling, and is also slow.

Yes, I know it isn't bit twiddling, but it achieves want I 
need.  I was hoping that if a bit twiddling algorithm was 
available (and was faster), then I could change my algorithm.

> > the case of long double on a ld128 architecture.  That is,
> > if I have 1 <= x < 1xp112, then I would need to have
> >
> > long double x;
> > int128_t ix;
> > ix = (int128_t)x;
> >
> > and AFAICT sparc64 doesn't have an int128_t.
> 
> If sparc64 has this, it would be even slower.  Sparc64 emulates
> all 128-bit FP.  This makes 128-bit sparc64 ~10-100 times slower
> than 64-bit sparc64 FP, and 100-1000 times slower than modern
> x86 64 and 8-bit FP.  FP to integer conversions tend to be slower
> than pure FP, and are especially tricky for integers with more
> bits than FP mantissas.
> 
> Consider bit twiddling to classifiy oddness of 2.0F (0x40000000 in
> bits) and 3.0F (0x40400000 in bits).  The following method seems
> to be not so bad.  Calculates that the unbiased exponent for 3.0F
> is 1.  This means that the 1's bit is (1 << (23 - 1)) = 0x00400000
> where we see it for 3.0F but not for 2.0F.  All bits to the right
> of this must be 0 for the value to be an integer.  I don't know
> how you classified integers efficiently without already determining
> the position of the "point" and looking at these bits.  There are
> complications for powers of 2 and the normalization bit being implicit.

Thanks.  I'll look into the above to see if I can up with
something that does not require the cast.  

To give a hint at what I have been working on, I have implementations
for sinpif(x), sinpi(x), cospif(x), and cospi(x).  For sinpif(x) and
cospif(x) I have kernels that give correctly rounded results for 
FLT_MIN <= x < 0x1p23 (at least on x86_64).   I also have slightly
faster kernels that give max ULPs of about 0.5008.  For sinpi(x) and
cospi(x), my kernels currently give about 1.25 ULP for the worse case
error if I accumulate the polynomials in double precision.  If I 
accumulate the polynomials in long double precision, then I get 
correctly rounded results.  To complete the set, I was hoping to 
work out ld80 and ld128 versions.  `ld128 is going to be problematic
due to the absense of int128_t.

I'll send you what I have in a few days.

-- 
Steve
20161221 https://www.youtube.com/watch?v=IbCHE-hONow