Implementation for coshl.

Mon Jun 10 16:29:28 UTC 2013

On Sun, Jun 09, 2013 at 10:58:34PM -0700, David Schultz wrote:
> Awesome, thanks for working on this.
> 
> On Mon, Jun 10, 2013, Bruce Evans wrote:
> > On Sun, 9 Jun 2013, Steve Kargl wrote:
> > >    Arch   |     Interval        | #calls | Time (s) | Max ULP | Compiler | Value
> > > -----------+---------------------+--------+----------+---------+----------+-------
> > >  i386 [1] | [    0.00:    0.35] |  100M  | 15.0198  | 0.58583 |  gcc     |  1
> > >  i386 [1] | [    0.35:   24.00] |  100M  | 15.1858  | 1.01504 |  gcc     |  2
> > >  i386 [1] | [   24.00:11356.52] |  100M  | 12.9591  | 0.51198 |  gcc     |  3
> > >  i386 [1] | [11356.52:11357.22] |  100M  | 13.3328  | 1.90988 |  gcc     |  4
> > > -----------+---------------------+--------+----------+---------+----------+-------
> > 
> > Quite large errors, unfortunately much the same as in double precision.
> [...]
> > > Bruce and I exchanged emails a long time ago about possible ways
> > > to reduce the ulp in this range by either computer exp(x) with
> > > extra precision or using a table with cosh(x) = cosh(x_i) * cosh(d)
> > > + sinh(x_i) * sinh(d) with d = x - x_i.  I tried the latter with
> > > disappointing results.
> 
> This is fine for now.  We might eventually want to improve the
> accuracy of the float, double, and long double implementations,
> but that ought to come later.  It's much easier to start with
> float and double when making improvements in the algorithm.

I do not have working test programs for float and double, which
will take a morning (or late night) to fix.  So, I do not have
a method for testing the either the current cosh[f] or any changes
that I might suggest.

> > > The former would require a refactoring of
> > > s_expl.c into a kernel __kernel_expl(x, hi, lo).  I have no plans on
> > > pursuing this at the this time.
> > 
> > But you need this soon for __ldexp_exp() and __ldexp_cexp(), which are
> > needed for hyperbolic functions (the large args already fixed for float
> > and double precision) and for cexp() and trig and hyperbolic complex
> > functions.  It is much easier to implement these using a kernel.  I do
> > this only for float precision.

I spent a couple of hours last night pulling __kernel_expl out
of expl().  The results were less then stellar, but Bruce's
email suggested the error in my method.  

> That's fine for now.  We need a better k_exp, too.  I believe
> Bruce was working on one.
> 
> I have a few comments on the code... mostly a subset of Bruce's
> comments:
> 
> - You can clear the sign bit on x after the Inf/NaN check.

OK.

> - I think the way you handle the tiny-x case is an improvement
>   over the double version.  However, float-to-int conversion is
>   very slow on x86, so you might instead try something like
>   RETURNI(1.0L + tiny), where tiny is volatile and e.g., 0x1p-1000.
>   That will also return the correct result in all rounding modes.

I originally had

                if (ix < BIAS - 33) {   /* |x| < 0x1p-33 */
                        /* cosh(+-0) = 1. */
                        if (x == 0)
                                return (one);
                        /* cosh(tiny) = one */
                        return (one + tiny * tiny);
                }

because C99 requires cosh(+-0) to be exact.  For |x| != the
return value will raise inexact.

> - Please check whether volatile hacks are needed to induce the
>   compiler to do the right thing.  I'm increasingly finding that
>   these are needed with clang.

I suppose I need to either look at the generated assembly or
add testing for signals to my test programs.  How do you go
about looking at this?

-- 
Steve