svn commit: r300965 - head/lib/libc/stdlib

Tue May 31 05:53:14 UTC 2016

On Tue, 31 May 2016, Andrey Chernov wrote:

> On 31.05.2016 6:42, Bruce Evans wrote:
>>
>> Er, I already said which types are better -- [u]int_fast32_t here.
>
> [u]int_fast32_t have _at_least_ 32 bits. int32_t in the initial PRNG can
> be changed since does not overflow and involve several calculations, but
> uint_fast32_t is needed just for two operations:

I think you mean a native uint32_t is needed for 2 operations.

> *f += *r;
> i = (*f >> 1) & 0x7fffffff;

This takes 2 operations (add and shift) with native uint32_t.  It takes 4
logical operations (maybe more physically, or fewer after optimization)
with emulated uint32_t (add, mask to 32 bits (maybe move to another
register to do this), shift, mask to 32 bits).  When you write the final
mask explicitly, it is to 31 bits and optimizing this away is especially
easy in both cases.
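
A minimal sketch of the two variants, for comparison.  The function names
step_native() and step_fast() and the helper check_steps() are made up for
illustration and are not from the committed code; only the two quoted
statements are.

```c
#include <assert.h>
#include <stdint.h>

/* Native 32-bit arithmetic: the add wraps modulo 2^32 by itself. */
static uint32_t
step_native(uint32_t *f, const uint32_t *r)
{
	*f += *r;
	return ((*f >> 1) & 0x7fffffff);
}

/*
 * The same step through uint_fast32_t, which may be wider than 32 bits.
 * The sum has to be clipped to 32 bits by hand before the shift.
 */
static uint32_t
step_fast(uint32_t *f, const uint32_t *r)
{
	uint_fast32_t t;

	t = (uint_fast32_t)*f + *r;
	t &= 0xffffffff;		/* emulate 32-bit wraparound */
	*f = (uint32_t)t;
	return ((uint32_t)((t >> 1) & 0x7fffffff));	/* final 31-bit mask */
}

/* Usage check: both versions must agree, including on wraparound. */
static void
check_steps(void)
{
	uint32_t a = 0xfffffff0, b = 0xfffffff0, r = 0x20;

	assert(step_native(&a, &r) == step_fast(&b, &r));
	assert(a == b);
}
```

On a platform where uint_fast32_t is exactly 32 bits, the explicit 32-bit
mask in step_fast() is a no-op and a compiler can be expected to drop it.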

> We need to assign values from uint32_t to uint_fast32_t (since array
> size can't be changed),

FP code using double_t is similar.  Data in tables should normally be
in doubles, since double_t might be too much larger.  Data in function
parameters is almost always in doubles, since APIs are deficient and
don't even support double_t as an arg.  It is then best to assign to
a double_t variable: if you just use the double, then expressions
using it will promote it to double_t, but it is too easy to lose this
expansion too early.  This takes extra variables and a little more code
for the assignments, but the extra variables are optimized away in
cases where there is no expansion.
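
A rough sketch of that pattern, assuming a C99 <math.h> that provides
double_t.  The table coef[], the function poly_eval() and the helper
check_poly() are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <math.h>
#include <stddef.h>

/* Hypothetical coefficient table, stored as double to keep it small. */
static const double coef[] = { 1.0, 0.5, 0.25 };

/* Evaluate c0 + c1*x + c2*x^2; the API still takes and returns double. */
static double
poly_eval(double x)
{
	double_t xx, acc;	/* temporaries in the evaluation format */
	size_t i;

	xx = x;			/* widen once, explicitly */
	acc = 0;
	for (i = sizeof(coef) / sizeof(coef[0]); i-- > 0;)
		acc = acc * xx + coef[i];	/* Horner step */
	return ((double)acc);	/* narrow once, at the end */
}

/* Usage check with values that are exact in any binary FP format. */
static void
check_poly(void)
{
	assert(poly_eval(0.0) == 1.0);
	assert(poly_eval(2.0) == 3.0);
}
```

Where double_t is the same type as double, xx and acc are ordinary doubles
and the extra assignments cost nothing.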

> do this single operation fast and store them
> back into array of uint32_t. I doubt that much gain can come from it
> and even pessimization in some cases. Better let compiler do its job here.

It's never a pessimization if the compiler does its job.

It is good to practice this on a simple 2-step operation.  Think of a
multi-step operation where each step requires clipping to 32 bits.
Using uint32_t for the calculation is just a concise way of writing
"& 0xffffffff" after every step (even ones that don't need it).  It
is difficult and sometimes impossible for the compiler to optimize
away these masks across a large number of steps.  Sometimes this is
easy for the programmer.
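
As an illustration only, here is a made-up 4-step mixing function written
three ways.  The constants and the names mix_native(), mix_masked(),
mix_fast() and check_mix() are arbitrary and not taken from random(3).
The third version relies on the fact that add, multiply, left shift and
xor only carry information upward, so the low 32 bits of the result
depend only on the low 32 bits of the input; a single mask at the end is
then enough.  That reasoning would not hold if a step used a right shift
or a division.

```c
#include <assert.h>
#include <stdint.h>

/* Native uint32_t: every step is implicitly reduced modulo 2^32. */
static uint32_t
mix_native(uint32_t x)
{
	x += 0x9e3779b9;
	x *= 0x85ebca6b;
	x ^= x << 13;
	x += 0xc2b2ae35;
	return (x);
}

/* Possibly wider type, masking after every step as uint32_t would. */
static uint32_t
mix_masked(uint32_t x0)
{
	uint_fast32_t x = x0;

	x = (x + 0x9e3779b9) & 0xffffffff;
	x = (x * 0x85ebca6b) & 0xffffffff;
	x = (x ^ (x << 13)) & 0xffffffff;
	x = (x + 0xc2b2ae35) & 0xffffffff;
	return ((uint32_t)x);
}

/* Possibly wider type, with the masks hoisted to one mask at the end. */
static uint32_t
mix_fast(uint32_t x0)
{
	uint_fast32_t x = x0;

	x += 0x9e3779b9;
	x *= 0x85ebca6b;
	x ^= x << 13;
	x += 0xc2b2ae35;
	return ((uint32_t)(x & 0xffffffff));
}

/* Usage check: all three versions must agree on sample inputs. */
static void
check_mix(void)
{
	static const uint32_t in[] = { 0, 1, 0x7fffffff, 0xdeadbeef, 0xffffffff };
	unsigned i;

	for (i = 0; i < sizeof(in) / sizeof(in[0]); i++) {
		assert(mix_native(in[i]) == mix_masked(in[i]));
		assert(mix_native(in[i]) == mix_fast(in[i]));
	}
}
```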

Bruce