Implementation of expl()

Bruce Evans brde at optusnet.com.au
Mon Dec 10 03:22:17 PST 2007


On Sun, 9 Dec 2007, Steve Kargl wrote:

> On Sun, Dec 09, 2007 at 04:25:05PM -0500, David Schultz wrote:
>> On Fri, Nov 02, 2007, Steve Kargl wrote:
>>> With all the success of developing several other missing
>>> C99 math routines, I decided to tackle expl() (which I
>>> need to tackle tanhl()).
>>
>> Hmm, great, but where's the patch? :)  Maybe the mailing list
>> software ate it.
>
> This is the current version.  I need to revise how I computed
> the ploynomial coefficient.  In short, I mapped r in
> [-ln(2)/2:ln(2)/2] into the range x in [-1,1] for the Chebyshev
> interpolation, but I never scaled x back into r.  This is the
> reason why the lines "r = r * TOLN2;" exists.
>
> I don't remember if bde sent me comments on this code.  I sure
> he has plenty. :)

I don't think I commented about expl() before.  Sorry, it's further
from what I would like than the trig functions.

Here are some better polynomial approximations for it.

----------------------------------------------------------------------------
(1) 64-bit precision, approximating exp(), saves a term and should be more
     accurate.  Note the difference in the plots between the Chebyshev
     approximation and the final approximation.  The uglyness and extra
     error in the latter are due to rounding the very-high-precision
     coeffs used in the former to 64 bits.  The final approximation
     uses a Remez algorithm on rounded coeffs to try to choose the best
     set of rounded coeffs (!= the best set of coeffs, rounded).  Its
     plot is ugly because my algorithm is unstable for functions that
     aren't even, so I had to stop it after 1 significant iteration
     (adjusting the P3 term from 0xaaaaaaaaaaaaaaab.0p-66 to ...aa8.0p-66).
     If it worked properly then it would push all the extrema back to the
     top and bottom of the plot, and be slightly more accurate than the
     Chebyshev approximation.  Remez gains very little over Chebyshev in
     infinite precision (about a factor of 2), but can compensate for
     huge errors (a factor of 100[0]) due to unfortunate rounding of
     coeffs.
----------------------------------------------------------------------------

--- Simple Lagrange interpolation at Chebyshev points ---

5.975e-24 |'''''x'''''''''x'''''''''''x''''''''''''x''''''''''_x''''''_''"
           |     :        x             "            "         ::      :  :
           |_   : :       : _         " :          " :         : :     :: :
           |:   : x      :  :         :  :         :  :       :  :     :: :
           |:   " :      :  :        :   "        :   "       :  x    : x :
           |:   : :      "   :       _   :        :   :       "  :    : : :
           |:   : :      :   :       :    :       "   :       :  :    : : :
           |:   :  :     :   "       :    :       :    :      :  :    : : :
           |:   :  :    :    :      :     x       :    :     :    :   _ : :
           |:  :   :    :    :      :     :      :     "     :    :   : : :
           |:: :   x    _     :     _     :      :     :     :    :   : ::|
           `::`:```:````:`````:`````:``````:`````"`````:`````"````_```:`::`
           : : :   :    :     _     :      :     :      :    :    :   :  :|
           : : "   :   :      :    :       x     :      :    :    :  :   :|
           : : :    :  :      :    :       :    :       _   :     :  :   :|
           : : :    :  :       :   _        :   :       :   :      : :   :|
           : : :    :  _       :   :        :   "       :   _      : :   :|
           : x:     "  :       x  :         _   :        :  :      : :   :|
           : ::     : :        :  :         :  :         : :       _ "   :|
           :  :      ::         : x          : _         x :       : :   _|
           :  _      :_         _            _             _        :     |
-5.95e-24 x........."...........x............x...........x........._.....|
           -0.34658                                                 0.34658
--- Lagrange/Chebyshev with coeff adjusted to 64-bit precision ---

adjn n = 64 k = 3 j = 120 toter 0
--- rounded coeff ---
P0 =  1.000000000000000000000000000000000000000,		/*  0x8000000000000000.0p-63 */
P1 =  1.000000000000000000000000000000000000000,		/*  0x8000000000000000.0p-63 */
P2 =  0.5000000000000000000000000000000000000000,		/*  0x8000000000000000.0p-64 */
P3 =  0.1666666666666666666310000000000000000000,		/*  0xaaaaaaaaaaaaaaa8.0p-66 */
P4 =  0.04166666666666666666100000000000000000000,		/*  0xaaaaaaaaaaaaaaa9.0p-68 */
P5 =  0.008333333333333338066950000000000000000000,		/*  0x8888888888889e5d.0p-70 */
P6 =  0.001388888888888889569550000000000000000000,		/*  0xb60b60b60b60cf28.0p-73 */
P7 =  0.0001984126984124767297770000000000000000000,		/*  0xd00d00d00c013acd.0p-76 */
P8 =  0.00002480158730156209466100000000000000000000,		/*  0xd00d00d00c1851e1.0p-79 */
P9 =  0.000002755731927361941412700000000000000000000,		/*  0xb8ef1d304cd05f90.0p-82 */
P10 =  0.0000002755731927069719307580000000000000000000,		/*  0x93f27dbffa11c00d.0p-85 */
P11 =  0.00000002505205082467052704390000000000000000000,		/*  0xd7320ad84885e745.0p-89 */
P12 =  0.000000002087671071533106249630000000000000000000,		/*  0x8f76b2a8ea87f9b3.0p-92 */
P13 =  1.60924147227467723196999999999999999999 E-10,		/*  0xb0f01edf2f5fac20.0p-96 */
P14 =  1.14941936278242808142000000000000000000 E-11,		/*  0xca353f02fa6b7741.0p-100 */

7.252e-24 |''''''''''''''''_''''''''''''''''''''_""_'''''''''''''''''''''|
           |               " x                  x                         |
           |              x                    _     x        _"          |
           |                  "               x                 "         |
           --------------_---------------_xxx"--------x------_-------------
           |     _x      :     "       _"                        _        |
           |     :      :       _     x                x    _    :        |
           |    :  "    _            x                            :       |
           |    _       :        "__x                   x  _      :    x  |
           |    :   _  :                                 x_       x    :  |
           |    :      "                                          :   " : |
           |_  :     xx                                            :  : x |
           |:  :                                                   _ :  : x
           |:  x                                                     "  : :
           |:: :                                                    x   : :
           |:::                                                          ::
           : _:                                                          :|
           :  _                                                          :|
           :                                                             :|
           :                                                             x|
           :                                                              |
-2.79e-23 _..............................................................|
           -0.34658                                                 0.34658
adjn n = 64 k = 3 toter 0

----------------------------------------------------------------------------
(2) 113-bit precision, approximating exp().
----------------------------------------------------------------------------
--- Simple Lagrange interpolation at Chebyshev points ---

3.239e-38 x''''x'''''x''''''_''''''''x''''''''"''''''''_''''''"'''''"''''"
           :    :     :      :x       :        :       x:      :     :    |
           |:   :    ::      ::      : _      _ :      ::      ::    :   _|
           |"   :    _ :    : :      " :      : "      : :    : _    :   :|
           |:  : :   : :    :  :     : :      : :     :  :    : :   : :  :|
           |:  : :   : x    :  :     : :      : :     :  :    x :   : :  :|
           |:  : :   : :    "  :    :   :    :   :    :  "    : :   : :  :|
           | : : _   : :    :  "    :   :    :   :    "  :    : :   x : : |
           | : _ :  :  :    :  :    :   :    :   :    :  :    :  :  : _ : |
           | : : :  :  :    :  :    _   "    "   _    :  :    :  :  : : : |
           | : : :  :   :  :   :    :   :    :   :    :   :  :   :  : : : |
           ``x`:`:``:```:``:````:```:```:````:```:```:````:``:```:``:`:`x``
           | : : :  :   :  :    :   :   :    :   :   :    :  :   :  : : : |
           | : :  : x   :  :    :  :     :  :     :  :    :  :   x :  : : |
           | : :  : :   x  x    :  :     :  :     :  :    x  _   : :  : : |
           | ::   : :   :  :    "  :     :  :     :  "    :  :   : :   :: |
           |  :   : :   :  :    :  :     x  x     :  :    :  :   : :   :  |
           |  :   ::    : :     :  "     :  :     "  :    :  :    ::   :  |
           |  :   _:     ::      : :     :  :     : :      ::     ::   :  |
           |  :   ::     ::      ::       ::       ::      ::     :"   :  |
           |  :    :     :_      _:       ::       :_      ::     :    :  |
-3.13e-38 |..x....x....."........x.......xx......._......."".....x...._..|
           -0.34658                                                 0.34658
--- Lagrange/Chebyshev with coeff adjusted to 113-bit precision ---

adjn n = 113 k = 1 j = 120 toter 0
--- rounded coeff ---
P0 =  1.000000000000000000000000000000000000000,		/*  0x10000000000000000000000000000.0p-112 */
P1 =  1.000000000000000000000000000000000000000,		/*  0x10000000000000000000000000000.0p-112 */
P2 =  0.4999999999999999999999999999999999500000,		/*  0x1ffffffffffffffffffffffffffff.0p-114 */
P3 =  0.1666666666666666666666666666666666600000,		/*  0x15555555555555555555555555555.0p-115 */
P4 =  0.04166666666666666666666666666668415500000,		/*  0x155555555555555555555555560af.0p-117 */
P5 =  0.008333333333333333333333333333336938300000,		/*  0x11111111111111111111111111a6d.0p-119 */
P6 =  0.001388888888888888888888888886437923800000,		/*  0x16c16c16c16c16c16c16c15fa9389.0p-122 */
P7 =  0.0001984126984126984126984126980627689100000,		/*  0x1a01a01a01a01a01a01a0191e8217.0p-125 */
P8 =  0.00002480158730158730158730175735685579700000,		/*  0x1a01a01a01a01a01a01ad93230fd1.0p-128 */
P9 =  0.000002755731922398589065255750604381149600000,		/*  0x171de3a556c7338faac28603831fb.0p-131 */
P10 =  0.0000002755731922398589065187949851278033800000,		/*  0x127e4fb7789f5c72e69d53c0ac1d3.0p-134 */
P11 =  0.00000002505210838544171877444493611961585000000,		/*  0x1ae64567f544e38fdb412a0eb7de0.0p-138 */
P12 =  0.000000002087675698786810064988525091623603400000,		/*  0x11eed8eff8d8981cadc22e13e3610.0p-141 */
P13 =  1.60590438368216158655710381729946340000 E-10,		/*  0x16124613a86d09fa08f48ca8c4d5e.0p-145 */
P14 =  1.14707455977270929588227069467205860000 E-11,		/*  0x193974a8c07640267fec271f370a0.0p-149 */
P15 =  7.6471637318180850420027983995938208999 E-13,		/*  0x1ae7f3e733b16c573381f7ce23115.0p-153 */
P16 =  4.77947733504221136080232269675803310000 E-14,		/*  0x1ae7f3e773e8b2e2b7a23c9a525ff.0p-157 */
P17 =  2.81145725589067097361019490883767970000 E-15,		/*  0x1952c7706c73b79a1577d9ebcf121.0p-161 */
P18 =  1.56191903751316458500805670560905000000 E-16,		/*  0x168276d284fb449ca1ff48bfbb891.0p-165 */
P19 =  8.2206265778415557385251561200634072999 E-18,		/*  0x12f499f725c9f790053993c3f2b17.0p-169 */
P20 =  4.11616915419431123065482372671396870000 E-19,		/*  0x1e5f394471e23048e2a82e8f22b34.0p-174 */
P21 =  1.9600698694144411728713343986 E-20,		/*  0x1723f29b62196e8abacb3c5c40000.0p-178 */

9.874e-37 "''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''|
           |_                                                             |
           |                                                              |
           | _                                                            |
           | :                                                            |
           |  :                                                           |
           |  "x                                                          |
           |    "                                                         |
           |     _                                                        |
           |                                                              |
           |      _                                                       |
           |                                                              |
           |       "x_                                                    |
           |          x                                                   |
           |           x                                                  |
           |            _                                                _x
           |                                                         x_ x |
           |             "_                                    xx  _"  x  |
           |               """x                               "  "x       |
           |                   x                       x"x___"            |
           |                    "     __      __     _"                   |
         0 ,,,,,,,,,,,,,,,,,,,,,,"xxx",,"x__x",,"x__x,,,,,,,,,,,,,,,,,,,,,,
           -0.34658                                                 0.34658
adjn n = 113 k = 1 toter 0

----------------------------------------------------------------------------
(3) 64-bit precision, approximating nqexp() := x * (exp(x) + 1) / (exp(x) - 1)
     exp(x) can be recovered from nqexp(), and the latter has much better
     numerical behaviour.  E.g., it needs only 7 coeffs instead of 15.
     The Remez search completed and equalized all the extrema.  It's
     interesting that The Chevyshev approximation equalized the extrema
     for exp() but not for nqexp().
----------------------------------------------------------------------------
--- Simple Lagrange interpolation at Chebyshev points ---

2.327e-21 "''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''"
           :                                                              :
           :  _"                                                      "_  :
           :  ::                                                      ::  :
           :  ::                                                      ::  :
           :  : :       ""                                  ""       : :  :
           :  : :      x  _                                _  x      : :  :
           : :  x      :  :                                :  :      x  : :
           : :  :     :    :                              :    :     :  : :
           : :  :     _    "         __        __         "    _     :  : :
           : :   :    :             "  "x_  _x"  "             :    :   : :
           `:_```:```:``````x``````"``````""``````"``````x``````:```:```_:`
           |::   :   :            x                x            :   :   ::|
           |::   _   _       x   _                  _   x       _   _   ::|
           |::   :   :        _ _                    _ _        :   :   ::|
           |::   :  :          "                      "          :  :   ::|
           |:     : :                                            : :     :|
           |:     : x                                            x :     :|
           |:     x                                                x     :|
           |:      x                                              x      :|
           |:                                                            :|
-2.23e-21 |_............................................................_|
           -0.34658                                                 0.34658
--- Lagrange/Chebyshev with coeff adjusted to 64-bit precision ---

adjn n = 64 k = 13 toter 0
C-1 =  2.000000000000000000000000000000000000000,		/*  0x8000000000000000.0p-62 */
C0 =  0.1666666666666666660071100000000000000000,		/*  0xaaaaaaaaaaaaaa7a.0p-66 */
C1 = -0.002777777777777671015759280000000000000000,		/* -0xb60b60b60b5904a2.0p-72 */
C2 =  0.00006613756613181145420083840000000000000000,		/*  0x8ab355dfd4d5cfe3.0p-77 */
C3 = -0.000001653439010798086927524850000000000000000,		/* -0xddebbb58747e58c9.0p-83 */
C4 =  0.00000004175172569487852482302880000000000000000,		/*  0xb35282048125b16f.0p-88 */
C5 = -0.000000001045797728772232364397340000000000000000,		/* -0x8fbbbc85f0e61575.0p-93 */

1.275e-21 "''"''''''''"''''''''''''"x''''''''''x"''''''''''''"''''''''"''"
           :  :         x          _              _          x         :  :
           :  ::      " :          :  "        "  :          : "      ::  :
           :  :x      :  :        :                :        :  :      x:  :
           : : :     :   :        :    x      x    :        :   :     : : :
           : : :     :   x        "    :      :    "        x   :     : : :
           : : :     _   :        :     :    :     :        :   _     : : :
           : :  :    :   :       :      "    "      :       :   :    :  : :
           : "  :    :    :      :                  :      :    :    :  " :
           : :  :    :    :      x       x  x       x      :    :    :  : :
           |::  :   :     _      :        __        :      _     :   :  ::|
           `::``_```:`````:``````:``````````````````:``````:`````:```_``::`
           |::  :   :     :     :                    :     :     :   :  ::|
           |::  :   x      :    :                    :    :      x   :  ::|
           |:   :   :      :    "                    "    :      :   :   :|
           |:    :  :      :    :                    :    :      :  :    :|
           |:    : :       "   :                      :   "       : :    :|
           |:    : :       :   :                      :   :       : :    :|
           |:    : :        :  "                      "  :        : :    :|
           |:    " "        :  :                      :  :        " "    :|
           |_               " :                        : "               _|
-1.28e-21 |......_.........._"........................"_.........._......|
           -0.34658                                                 0.34658
adjn n = 64

----------------------------------------------------------------------------
(4) 113-bit precision, approximating nqexp().  The Remez search was stable
     but made negative progress after the C3 (?) term so I stopped it early.
     It had only almost equalized the extrema when stopped.
----------------------------------------------------------------------------
--- Simple Lagrange interpolation at Chebyshev points ---

4.567e-37 |''''''"''''''''''''''''''''''''''''''''''''''''''''''''"''''''|
           |  "   :                                                :   "  |
           |  :   :     _                                    _     :   :  |
           | ::  : :    :_                                  _:    : :  :: |
           | ::  : :    ::                                  ::    : :  :: |
           | x:  : "   : :      x                    x      : :   " :  :x |
           | ::  _ :   :  :    x                      x    :  :   : _  :: |
           | ::  : :   "  :    : "                  " :    :  "   : :  :: |
           | ::  : :   :  _   :        __    __        :   _  :   : :  :: |
           --:-:-:-:---:--:---x---x---_--"xx"--_---x---x---:--:---:-:-:-:--
           | : : :  :  :  :   :                        :   :  :  :  : : : |
           | : : :  : :    : :     x "          " x     : :    : :  : : : |
           | : ::   : :    : :      "            "      : :    : :   :: : |
           |:  ::   : :    x "                          " x    : :   ::  :|
           |:  ::   " "                                        " "   ::  :|
           |:  ::   : :     "                            "     : :   ::  :|
           |:  _:    :                                          :    :_  :|
           |:   :    _                                          _    :   :|
           |:   "                                                    "   :|
           |:                                                            :|
           |"                                                            "|
-5.88e-37 _.............................................................._
           -0.34658                                                 0.34658
--- Lagrange/Chebyshev with coeff adjusted to 113-bit precision ---

adjn n = 113 k = 3 j = 120 toter 0
--- rounded coeff ---
C-1 =  2.000000000000000000000000000000000000000,		/*  0x10000000000000000000000000000.0p-111 */
C0 =  0.1666666666666666666666666666666662000000,		/*  0x15555555555555555555555555542.0p-115 */
C1 = -0.002777777777777777777777777777553326200000,		/* -0x16c16c16c16c16c16c16c16b85140.0p-121 */
C2 =  0.00006613756613756613756613752850992610100000,		/*  0x11566abc011566abc0114a7e047ee.0p-126 */
C3 = -0.000001653439153439153439150292458494169500000,		/* -0x1bbd779334ef0aac6588d54fb28a1.0p-132 */
C4 =  0.00000004175351397573619780544017497215914800000,		/*  0x166a8f2bf70ebd9cab06da3dfcbe9.0p-137 */
C5 = -0.000000001056838027737493956831265550261534200000,		/* -0x122805d6442668558fc0ca2fcc585.0p-142 */
C6 =  2.67650730612754829770358952948658760000 E-11,		/*  0x1d6db2c4e01fe52c6eb7674cf86f8.0p-148 */
C7 = -6.7793605801133740547903998874808967000 E-13,		/* -0x17da4e1ebc3556f34306a709a13c6.0p-153 */
C8 =  1.71721130775699965239461146049180330000 E-14,		/*  0x1355864cf343fe6e71f6644193ab0.0p-158 */
C9 = -4.34912151080995129543227548203712180000 E-16,		/* -0x1f56b690b4b95dd505b8a8e6666b1.0p-164 */
C10 =  1.08203893915267000322982006799182099999 E-17,		/*  0x18f33b03a36ff97329d6990c65e9c.0p-169 */

3.741e-37 |''''''"''''''''''''''''''''''''''''''''''''''''''''''''"''''''|
           |      :    _                                      _    :      |
           |     ::    :       _       "      "       _       :    ::     |
           |     ::    :_     ":                      :"     _:   : :     |
           | _   x :  : :     ::      " "    " "      ::     : :  : x   _ |
           | :   : :  : :     : :     : :    : :     : :     : :  : :   : |
           | :"  : :  _ :    :  :     :  :  :  :     :  :    : x  _ :  ": |
           | ::  : "  :  :   :  x    :   :  :   :    x  :   :  :  : :  :: |
           | ::  : :  :  :   :  :    :   "  "   :    :  :   :  :  : :  :: |
           | ::  : :  :  :   "  :    x   :  :   x    :  "   :  :  : :  :: |
           | ::  : : :   :   :  :    :    ::    :    :  :   :   ::  :  :: |
           --::-:---::---x---:---:---:----""----:---:---:---x---::---:-::--
           | :: :   ::   :   :   :   :          :   :   :   :   ::   : :: |
           |: : :   :"   :  :    :  :            :  :    :  :   "_   : : :|
           |:  ::   "    :  :    _  :            :  _    :  :        ::  :|
           |:  ::         : :    :  :            :  :    : :         ::  :|
           |:  ::         : x    :  "            "  :    x :         ::  :|
           |:  :_         : :     : :            : :     : :         _:  :|
           |:  ::         _:      ::              ::      :_         ::  :|
           |:  :           :      _:              :_      :           :  :|
           |:  :           _       x              x       _           :  :|
-3.37e-37 _x.."......................................................"..x_
           -0.34658                                                 0.34658
adjn n = 113 k = 3 toter 0
---

Attempts to use 64-bit coeffs for the 113-bit approximation were
unsuccessful -- the error was about 5e-24, but about 2^-(113+5) =
3e-36 is needed.  The interval is just too wide for this to work.
Probably only the first couple of coeffs need to have full precision.

Bruce


More information about the freebsd-standards mailing list