From nobody Tue Sep 26 18:08:04 2023 X-Original-To: freebsd-hackers@mlmmj.nyi.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mlmmj.nyi.freebsd.org (Postfix) with ESMTP id 4Rw75S0yzxz4vT8l for ; Tue, 26 Sep 2023 18:08:12 +0000 (UTC) (envelope-from sgk@troutmask.apl.washington.edu) Received: from troutmask.apl.washington.edu (troutmask.apl.washington.edu [128.95.76.21]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 4Rw75R5Rwyz3HQG for ; Tue, 26 Sep 2023 18:08:10 +0000 (UTC) (envelope-from sgk@troutmask.apl.washington.edu) Authentication-Results: mx1.freebsd.org; none Received: from troutmask.apl.washington.edu (localhost [127.0.0.1]) by troutmask.apl.washington.edu (8.17.1/8.17.1) with ESMTP id 38QI85tP022099; Tue, 26 Sep 2023 11:08:05 -0700 (PDT) (envelope-from sgk@troutmask.apl.washington.edu) Received: (from sgk@localhost) by troutmask.apl.washington.edu (8.17.1/8.17.1/Submit) id 38QI84cQ022098; Tue, 26 Sep 2023 11:08:04 -0700 (PDT) (envelope-from sgk) Date: Tue, 26 Sep 2023 11:08:04 -0700 From: Steve Kargl To: Alexander Leidinger Cc: Paul Zimmermann , freebsd-hackers@freebsd.org Subject: Re: Accuracy of Mathematical Functions Message-ID: Reply-To: sgk@troutmask.apl.washington.edu References: <1395eeabc6d404997f6a09a7b39d3da5@Leidinger.net> List-Id: Technical discussions relating to FreeBSD List-Archive: https://lists.freebsd.org/archives/freebsd-hackers List-Help: List-Post: List-Subscribe: List-Unsubscribe: Sender: owner-freebsd-hackers@freebsd.org MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <1395eeabc6d404997f6a09a7b39d3da5@Leidinger.net> X-Spamd-Bar: ---- X-Rspamd-Pre-Result: action=no action; module=replies; Message is reply to one we originated X-Spamd-Result: default: False [-4.00 / 15.00]; REPLY(-4.00)[]; ASN(0.00)[asn:73, ipnet:128.95.0.0/16, country:US] X-Rspamd-Queue-Id: 4Rw75R5Rwyz3HQG On Tue, Sep 26, 2023 at 03:26:16PM +0200, Alexander Leidinger wrote: > Am 2023-09-25 15:50, schrieb Paul Zimmermann: > > > We have updated our comparison: > > > > https://members.loria.fr/PZimmermann/papers/accuracy.pdf > > > > This new update includes for the first time the FreeBSD math library, > > whose accuracy is quite good, except: > > I wonder how those functions/libs you tested compare in terms of speed... > It would allow to provide a hint to the question > "Which lib is the fastest and fulfills the needs in terms of accuracy for > the intended use-case?" > > I agree that the best way to do this requires to run all libs on the same > hardware and OS, which is not feasible in your approach. What may be > feasible is to compare the relative performance of those subsets, which you > run on the same hardware. > Speed vs accuracy is always a trade-off. Consider % ./tlibm j0 -f -x 0x1p-9 -X 2 -s 1 -N 4 Interval tested for j0f: [0.00195312,2] 4000000 calls, 0.040901 secs, 0.01023 usecs/call % ./tlibm j0 -f -x 0x1p-9 -X 2 -s 1 -N 4 Interval tested for j0f: [0.00195312,2] 4000000 calls, 0.092471 secs, 0.02312 usecs/call The former timing is for FreeBSD libm on an AMD FX-8350 using only -O2 optimization. The latter is a patched libm, which uses -O2 -funroll-loops -march=bdver2, and is twice as slow! The difference lies in accuracy. The former gives % ./tlibm j0 -f -x 0x1p-9 -X 2 -PD -N 4 Interval tested for j0f: [0.00195312,2] ulp <= 0.5: 85.04% 3401545 | 85.039% 3401545 0.5 < ulp < 0.6: 5.376% 215028 | 90.414% 3616573 0.6 < ulp < 0.7: 3.107% 124266 | 93.521% 3740839 0.7 < ulp < 0.8: 2.432% 97284 | 95.953% 3838123 0.8 < ulp < 0.9: 1.740% 69612 | 97.693% 3907735 0.9 < ulp < 1.0: 0.941% 37646 | 98.635% 3945381 1.0 < ulp < 1.5: 1.108% 44312 | 99.742% 3989693 1.5 < ulp < 2.0: 0.195% 7791 | 99.937% 3997484 2.0 < ulp < 3.0: 0.062% 2491 | 99.999% 3999975 3.0 < ulp < 0.0: 0.001% 25 | 100.000% 4000000 Max ulp: 3.259556 at 1.9667229652404785e+00 while the latter has % ./tlibm j0 -f -x 0x1p-9 -X 2 -PD -N 4 Interval tested for j0f: [0.00195312,2] ulp <= 0.5: 86.76% 3470362 | 86.759% 3470362 0.5 < ulp < 0.6: 5.531% 221257 | 92.290% 3691619 0.6 < ulp < 0.7: 2.761% 110437 | 95.051% 3802056 0.7 < ulp < 0.8: 1.705% 68195 | 96.756% 3870251 0.8 < ulp < 0.9: 1.228% 49134 | 97.985% 3919385 0.9 < ulp < 1.0: 0.841% 33628 | 98.825% 3953013 1.0 < ulp < 1.5: 1.087% 43475 | 99.912% 3996488 1.5 < ulp < 2.0: 0.087% 3473 | 99.999% 3999961 2.0 < ulp < 3.0: 0.001% 39 | 100.000% 4000000 Max ulp: 2.157274 at 1.9673234224319458e+00 The latter is more accurate, but its underlying algorithm uses summation-and-carry of the ascending series. This algorithm is sensitive to compiler options, so I haven't pushed it FreeBSD (, yet). -- Steve