Implementation of half-cycle trigonometric functions

Mon Apr 24 23:55:39 UTC 2017

On Thu, Apr 13, 2017 at 10:12:48AM -0700, Steve Kargl wrote:
> On Sun, Apr 09, 2017 at 03:08:09PM -0700, Steve Kargl wrote:
> > Both IEEE-754 2008 and ISO/IEC TS 18661-4 define the half-cycle
> > trigonometric functions cospi, sinpi, and tanpi.  The attached
> > patch implements cospi[fl], sinpi[fl], and tanpi[fl].  Limited
> > testing of cospi and sinpi reveals a max ULP less than 0.89,
> > while tanpi is more problematic, with a max ULP less than 2.01
> > in the interval [0,0.5].  The algorithms used in these functions
> > are documented in {ks}_cospi.c, {ks}_sinpi.c, and s_tanpi.c.
> > 
> > Note 1.  ISO/IEC TS 18661-4 says these functions are guarded by
> > a predefined macro.  I have no idea or interest in what clang and
> > gcc do with regard to this macro.  I've put the functions behind
> > __BSD_VISIBLE.
> > 
> > Note 2.  I no longer have access to a system with ld128 and 
> > adequate support to compile and test the ld128 implementations
> > of these functions.  Given the almost complete lack of input from
> > others on improvements to libm, I doubt that anyone cares.  If 
> > someone does care, the ld128 files contain a number of FIXME comments,
> > and in particular, while the polynomial coefficients are given,
> > I did not update the polynomial algorithms to properly use the
> > coefficients.
> > 
> > The code is attached to the bug report.
> > 
> > https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=218514
> > 
> 
> While everyone is busy reviewing and testing the patch available
> in bugzilla, I suspect some may be wondering about the inverse
> half-cycle trigonometric functions.  I have worked out an algorithm
> for asinpi[fl] and have a working implementation of asinpif(x).
> It will take a couple of weeks (due to limited available time)
> before I can submit asinpi[fl], acospi[fl], and atanpi[fl], but
> work is in progress.
> 

I have what appear to be working versions of asinpi[fl].  It
was suggested elsewhere that using Estrin's method to
sum the polynomial approximations instead of Horner's method
may allow modern CPUs to better schedule the generated code.
I have implemented an Estrin-like method for sinpi[l]
and cospi[l], and indeed the generated code is faster on
my Intel Core 2 Duo, with only a slight degradation in the
observed max ULP.  I'll post new versions to bugzilla in
the near future.
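
For readers unfamiliar with the two schemes, here is a minimal sketch
of the difference for a generic degree-7 polynomial.  This is not the
code from the patch: the function names horner7 and estrin7, the
degree, and the coefficient layout (c[0] is the constant term) are
assumptions made only for illustration.

```c
/*
 * Horner's method: c[0] + x*(c[1] + x*(c[2] + ...)).
 * Every step depends on the previous one, so the multiply-adds
 * form a single serial dependency chain.
 */
static double
horner7(const double c[8], double x)
{
	double r = c[7];

	for (int i = 6; i >= 0; i--)
		r = r * x + c[i];
	return (r);
}

/*
 * Estrin-style evaluation of the same polynomial.  The four pair
 * terms p01..p67 do not depend on each other, so a superscalar CPU
 * can compute them concurrently; they are then combined using
 * x^2 and x^4.  The different operation order means the rounding
 * errors differ from Horner's, which may change the observed ULP.
 */
static double
estrin7(const double c[8], double x)
{
	double x2 = x * x;
	double x4 = x2 * x2;
	double p01 = c[0] + c[1] * x;
	double p23 = c[2] + c[3] * x;
	double p45 = c[4] + c[5] * x;
	double p67 = c[6] + c[7] * x;
	double lo = p01 + p23 * x2;	/* terms of degree 0..3 */
	double hi = p45 + p67 * x2;	/* terms of degree 4..7, less x^4 */

	return (lo + hi * x4);
}
```

Both functions compute the same polynomial in exact arithmetic; in
floating point they can differ in the last bits, which is consistent
with a small change in max ULP in exchange for a shorter critical path.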

-- 
Steve
20161221 https://www.youtube.com/watch?v=IbCHE-hONow