Date: Tue, 26 Sep 2023 11:08:04 -0700
From: Steve Kargl <sgk@troutmask.apl.washington.edu>
To: Alexander Leidinger <Alexander@leidinger.net>
Cc: Paul Zimmermann <Paul.Zimmermann@inria.fr>, freebsd-hackers@freebsd.org
Subject: Re: Accuracy of Mathematical Functions

On Tue, Sep 26, 2023 at 03:26:16PM +0200, Alexander Leidinger wrote:
> On 2023-09-25 15:50, Paul Zimmermann wrote:
> 
> > We have updated our comparison:
> > 
> > https://members.loria.fr/PZimmermann/papers/accuracy.pdf
> > 
> > This new update includes for the first time the FreeBSD math library,
> > whose accuracy is quite good, except:
> 
> I wonder how the functions/libs you tested compare in terms of speed...
> That would give a hint toward answering the question
>   "Which lib is the fastest and fulfills the needs in terms of accuracy for
> the intended use-case?"
> 
> I agree that the best way to do this requires running all libs on the same
> hardware and OS, which is not feasible in your approach. What may be
> feasible is to compare the relative performance of those subsets that you
> run on the same hardware.
> 

Speed vs accuracy is always a trade-off.  Consider

% ./tlibm j0 -f -x 0x1p-9 -X 2 -s 1 -N 4
Interval tested for j0f: [0.00195312,2]
4000000 calls, 0.040901 secs, 0.01023 usecs/call

% ./tlibm j0 -f -x 0x1p-9 -X 2 -s 1 -N 4
Interval tested for j0f: [0.00195312,2]
4000000 calls, 0.092471 secs, 0.02312 usecs/call

The former timing is for FreeBSD libm on an AMD FX-8350 using
only -O2 optimization.  The latter is for a patched libm built with
-O2 -funroll-loops -march=bdver2, and it is more than twice as
slow!  The difference lies in accuracy.  The former gives

% ./tlibm j0 -f -x 0x1p-9 -X 2 -PD -N 4
Interval tested for j0f: [0.00195312,2]
       ulp <= 0.5:  85.04%   3401545 |  85.039%   3401545
0.5 <  ulp <  0.6:  5.376%    215028 |  90.414%   3616573
0.6 <  ulp <  0.7:  3.107%    124266 |  93.521%   3740839
0.7 <  ulp <  0.8:  2.432%     97284 |  95.953%   3838123
0.8 <  ulp <  0.9:  1.740%     69612 |  97.693%   3907735
0.9 <  ulp <  1.0:  0.941%     37646 |  98.635%   3945381
1.0 <  ulp <  1.5:  1.108%     44312 |  99.742%   3989693
1.5 <  ulp <  2.0:  0.195%      7791 |  99.937%   3997484
2.0 <  ulp <  3.0:  0.062%      2491 |  99.999%   3999975
3.0 <  ulp <  0.0:  0.001%        25 | 100.000%   4000000
Max ulp: 3.259556 at 1.9667229652404785e+00

while the latter has

% ./tlibm j0 -f -x 0x1p-9 -X 2 -PD -N 4
Interval tested for j0f: [0.00195312,2]
       ulp <= 0.5:  86.76%   3470362 |  86.759%   3470362
0.5 <  ulp <  0.6:  5.531%    221257 |  92.290%   3691619
0.6 <  ulp <  0.7:  2.761%    110437 |  95.051%   3802056
0.7 <  ulp <  0.8:  1.705%     68195 |  96.756%   3870251
0.8 <  ulp <  0.9:  1.228%     49134 |  97.985%   3919385
0.9 <  ulp <  1.0:  0.841%     33628 |  98.825%   3953013
1.0 <  ulp <  1.5:  1.087%     43475 |  99.912%   3996488
1.5 <  ulp <  2.0:  0.087%      3473 |  99.999%   3999961
2.0 <  ulp <  3.0:  0.001%        39 | 100.000%   4000000
Max ulp: 2.157274 at 1.9673234224319458e+00
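
For readers unfamiliar with the ulp columns above, here is a minimal
sketch of how an error in ulps can be computed for a float result.
It assumes a trusted double-precision reference value is available;
the function name ulp_error_f is hypothetical, and tlibm's own
definition of an ulp may differ in detail.

```c
#include <float.h>
#include <math.h>

/*
 * Hypothetical helper (not taken from tlibm): error of the float result
 * `got` against a higher-precision reference `ref`, measured in units of
 * the float ulp at `ref`.
 */
double
ulp_error_f(float got, double ref)
{
	int e;
	double ulp;

	if (ref == 0.0) {
		/* Assumption: measure against the smallest subnormal float. */
		return fabs((double)got) / ldexp(1.0, FLT_MIN_EXP - FLT_MANT_DIG);
	}

	/* ref = m * 2^e with 0.5 <= |m| < 1. */
	(void)frexp(ref, &e);

	/* Below the normal range the ulp stops shrinking. */
	if (e < FLT_MIN_EXP)
		e = FLT_MIN_EXP;

	/* float carries FLT_MANT_DIG (24) significand bits. */
	ulp = ldexp(1.0, e - FLT_MANT_DIG);

	return fabs((double)got - ref) / ulp;
}
```

A correctly rounded result has an error of at most 0.5 ulp under this
measure, which is what the first row of each table counts.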

The latter is more accurate, but its underlying algorithm
uses summation-and-carry of the ascending series.  This
algorithm is sensitive to compiler options, so I haven't
pushed it to FreeBSD (yet).
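
The message does not spell out "summation-and-carry".  One plausible
reading is a compensated (Kahan-style) sum of the ascending series
J0(x) = sum_{k>=0} (-1)^k (x^2/4)^k / (k!)^2.  The sketch below
illustrates that reading only; it is not the patched libm code, and
the name j0f_series_sketch is hypothetical.  It also hints at why
such code can depend on compiler options: flags that permit
reassociation of floating-point arithmetic (for example GCC's
-ffast-math) allow the compiler to simplify the carry term to zero.

```c
#include <math.h>

/*
 * Hypothetical illustration, intended for small |x| only: sum the
 * ascending series for J0 in float while carrying the rounding error
 * of each addition into the next one (Kahan summation).
 */
float
j0f_series_sketch(float x)
{
	float z = 0.25f * x * x;	/* (x/2)^2 */
	float term = 1.0f;		/* k = 0 term */
	float sum = 1.0f;
	float carry = 0.0f;		/* running rounding error */
	float y, t;
	int k;

	for (k = 1; k < 40; k++) {
		/* t_k = -t_{k-1} * z / k^2 */
		term *= -z / (float)(k * k);

		y = term - carry;	/* apply the previous carry */
		t = sum + y;		/* low bits of y are lost here */
		carry = (t - sum) - y;	/* recover what was lost */
		sum = t;

		/* Stop once terms no longer affect the float result. */
		if (fabsf(term) < 0x1p-30f * fabsf(sum))
			break;
	}
	return (sum);
}
```

The line computing carry is algebraically zero, so it survives only if
the compiler evaluates the float operations exactly as written.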

-- 
Steve