Undefined Behavior in lib/msun/src/e_pow.c (was Re: New math library from ARM)
Pedro Giffuni
pfg at FreeBSD.org
Mon Jan 14 15:05:04 UTC 2019
On 1/14/19 7:52 AM, Bruce Evans wrote:
> On Tue, 1 Jan 2019, Pedro Giffuni wrote:
>
>> On 31/12/2018 23:54, Steve Kargl wrote:
>>> On Mon, Dec 31, 2018 at 10:32:06PM -0500, Pedro Giffuni wrote:
>>>> Hmm ...
>>>>
>>>> Looking at the changes in musl's libc I found this issue which seems
>>>> real (although somewhat theoretical):
>>>>
>>>> https://git.musl-libc.org/cgit/musl/commit/src/math?id=688d3da0f1730daddbc954bbc2d27cc96ceee04c
>>>>
>>>>
>>>> Is the attached patch acceptable?
>>>>
>>>> Also, their code is bit different here:
>>>>
>>>> https://git.musl-libc.org/cgit/musl/commit/src/math?id=282b1cd26649d69de038111f5876853df6ddc345
>>>>
>>>>
>>>> but we may also have to check fmaf(-0x1.26524ep-54, -0x1.cb7868p+11,
>>>> 0x1.d10f5ep-29).
>>>>
>>>> Cheers,
>>>>
>>>> Pedro.
>>>>
>>>> Index: lib/msun/src/e_pow.c
>>>> ===================================================================
>>>> --- lib/msun/src/e_pow.c (revision 342665)
>>>> +++ lib/msun/src/e_pow.c (working copy)
>>>> @@ -130,6 +130,7 @@
>>>> if(hx<0) {
>>>> if(iy>=0x43400000) yisint = 2; /* even integer y */
>>>> else if(iy>=0x3ff00000) {
>>>> + uint32_t j; /* Avoid UB in bit operations below. */
>>>> k = (iy>>20)-0x3ff; /* exponent */
>>>> if(k>20) {
>>>> j = ly>>(52-k);
>
> I was away.
>
> I see you committed a better version (with a cast).
>
Yes, I did what seemed to be the least invasive change, but then the
problem with duct tape is that it may work well enough to forget about
the issue.
I haven't MFC'd it while you guys decided on a better solution.
> The correct fix is more like the changing the file scope i and j from
> int32_t
> to u[_]int32_t. j is mostly used as a temporary to hold pure bits.
> Its top
> bit is usually not set, so plain int32_t works for it. The only other
> place
> where its top bit can obviously be set is in an EXTRACT_WORDS(j,i,z).
> Then
> both j and i can have their top bit set. EXTRACT_WORDS() into h[xy] and
> l[xy] is also sloppy. There l[xy] is unsigned, but h[xy] is signed. The
> top bit in the high word is soon discarded using ix = hx&0x7fffffff, etc.
> Later, the sign is tested using hx<0 and this is the only excuse for hx
> being signed. fdlibm does this quite often. It is a manual optimization
> for ~30-40 year old compilers that couldn't turn (hx&8000000) != 0 into a
> sign test iff the sign test is optimal.
>
The duct-tape cast still looks good enough ;).
> Other style bugs in the above:
>
>>> I'll defer to Bruce on this. My only comments are
>>> 1) declarations belong at the top of the file where all declarations
>>> occur
>
> 1a) missing blank line after declaration.
>
>> Modern standards let you declare variables within blocks. For style
>> we do prefer to start the function with them but in this case we can't.
>
> K&R 1978 C allows declaring variables within blocks.
>
> You can't declare j again in function scope since it is already there
> with
> a different type.
>
>>> 2) j is already declared as int32_t
>> And it has to be a signed integer: we later check for negative values.
>
> I think that is only when j is misused later for EXTRACT_WORDS().
>
> 2a,2b) Redeclaring j with a different type in an inner block is an even
> larger style bug/obfuscation than declaring a new variable or redeclaring
> j with the same type there. Redeclaring shadows it and this bug is
> detected by Wshadow. Redeclaring it with a different type (as needed to
> work) is especially obscure shadowing.
>
>>> 3) uint32_t should be written as u_int32_t.
>>
>> No, uint32_t is the standard type, u_int32_t is a BSDism. Also, the
>> file consistently uses uint32_t.
>
> No, see previous replies. 3) is correct.
>
Yes, I acknowledged this. I was looking at the musl version which
replaced all the u_int32_t's.
>>> Are you sure that UB occurs? Or, is this an attempt to
>>> placate a static analysis tool that only see shifting of
>>> a signed type? Do you need to make a similar change to
>>> e_powf.c?
>>
>> The point of the Undefined Behavior is precisely that it doesn't
>> occur: we are lucky and the compiler behaves as it always has but the
>> standard will let a different compiler behave differently and then we
>> will have surprises.
>
> No, UB does occur, but is harmless. ly is unsigned and can have its top
> bit set. j=ly>>(52-k) shifts right a few bits so doesn't overflow on
> assignment to signed j. But then j<<(52-k) gives UB attempting to
> recover
> the top bit of ly gives UB in all cases where the top bit is nonzero.
>
> The else clause doesn't have this problem, since it shifts iy instead
> of ly
> and iy never has its top bit set.
>
> e_powf.c doesn't have this problem, since there is only 1 32-bit word for
> float precision and it is handled like the top word for double precision
> (only its mantissa bits are shifted).
>
> However, it is easier to use unsigned shifts than to see that signed
> shifts
> have defined behaviour. musl did well to only detect and/or fix the 1
> case
> where UB behaviour clearly occurs.
>
Thanks for the feedback!
For now I would prefer we focus more on whatever the ARM lib does better.
Pedro.
More information about the freebsd-numerics
mailing list