svn commit: r300965 - head/lib/libc/stdlib

Mon May 30 06:42:01 UTC 2016

On 30.05.2016 6:09, Bruce Evans wrote:
> On Sun, 29 May 2016, Conrad Meyer wrote:
> 
>> Does clang actually generate different code with this change?
> 
> It should, on exotic arches.
> 
>> On Sun, May 29, 2016 at 9:39 AM, Andrey A. Chernov <ache at freebsd.org>
>> wrote:
>>> Log:
>>>   Micro optimize: C standard guarantees that right shift for unsigned
>>> value
>>>   fills left bits with zero, and we have exact 32bit unsigned value
>>>   (uint32_t), so there is no reason to add "& 0x7fffffff" here.
> 
> Using uint32_t at all is an unportable pessimization.  On exotic arches
> that don't have native uint32_t registers, a theoretically perfect
> implementation wouldn't implement uint32_t.  This would expose the
> brokenness of broken code, so a practical implementation would emulate
> uint32_t so that the broken code would just run slower for arithmetic
> and much slower for locked memory operations.
> 
> uint32_t might be implemented not very slowly using 128-bit integer
> registers, or more slowly using the 53-bit mantissa part of an IEEE
> double precision floating point register.
> 
> If uint32_t is emulated, then the compiler is forced to act as if the
> code uses a longer type and does "& 0xffffffff" after every operation.
> Thes extra operations can only be combined sometimes.  More careful
> code can use a minimal number of this or similar "&" operations.  In
> checksum calculations, one "&" at the end is usually enough.
> 
>>> Modified: head/lib/libc/stdlib/random.c
> 
> random.c was mostly written before uint32_t was standard, so it used
> u_long and long.  Perhaps it wasn't careful enough with the "&"s to
> actually work unless u_long is precisely uint32_t and long is normal
> 2's complement with benign overflow.
> 
> Anyway, it was "fixed" (unimproved) using s/u_long/uint32_/ in most
> places where the API/ABI doesn't require longs (there is 1 dubious
> long left in a comment).  The correct fix is s/u_long/uint_fast32_t
> in most places and s/u_long/uint_least32_t/ in some places and then
> fix any missing "&"'s.  The "fast" and "least" types always exist,
> unlike the fixed-width types, and using them asks for time/space
> efficiency instead of emulated fixed-width.
> 
> On non-exotic arches, fast == least == fixed-width, so the correct
> substitution works as a quick fix even with missing "&"s.
> 
> It is not necessary to use the newfangled standard integer types to
> fix this here, since correct use of long types would work (they give
> 32 bits), but long is wasteful if it actually 64 bits or longer.
> 
> Even larger problems are looming with uintmax_t.  Any code that is
> careful enough to use it is likely to break or be bloated if it is
> expanded.  This is just like using u_long in old random().
> 
>>> ==============================================================================
>>>
>>> --- head/lib/libc/stdlib/random.c       Sun May 29 16:32:56
>>> 2016        (r300964)
>>> +++ head/lib/libc/stdlib/random.c       Sun May 29 16:39:28
>>> 2016        (r300965)
>>> @@ -430,7 +430,7 @@ random(void)
>>>                  */
>>>                 f = fptr; r = rptr;
>>>                 *f += *r;
>>> -               i = (*f >> 1) & 0x7fffffff;     /* chucking least
>>> random bit */
>>> +               i = *f >> 1;    /* chucking least random bit */
> 
> This gives an "&" to restore in the version with correct substitutions.
> 
> It also breaks the indentation.  (This file mostly indents comments to the
> right of code to column 40, but column 48 was used here and now column 32
> is used.)
> 
>>>                 if (++f >= end_ptr) {
>>>                         f = state;
>>>                         ++r;
> 
> Bruce
> 

I don't introduce uint32_t and int32_t here and don't have a slightest
idea of which types will be better to change them. F.e. *f += *r;
suppose unsigned 32bit overflow which don't naturally happens for large
types. Assigning uint32_t to some large type then clip it to smaller
after calculation - all of that can produce more code than save for
calculation itself.