svn commit: r300956 - head/lib/libc/stdlib

Mon May 30 05:03:37 UTC 2016

On 30.05.2016 5:17, Bruce Evans wrote:
> On Sun, 29 May 2016, Andrey A. Chernov wrote:
> 
>> Log:
>>  1) Unifdef USE_WEAK_SEEDING since it is too obsolete to support and
>> makes
>>  reading hard.
> 
> Good.
> 
>>  2) Instead of doing range transformation in each and every function
>> here,
>>  do it single time directly in do_rand(). One "mod" operation overhead
>> is not
>>  a big deal, but the code looks nicer and possible future functions
>> additions
>>  or PRNG change do not miss range transformations neither have
>> unneeded ones.
> 
> The whole implementation is silly.  It is manually optimized for 1980's
> compilers.  More below.
> 
>>  3) Use POSIX argument types for visible functions (cosmetic).
> 
> Not sure I like type changes.
> 
>> Modified: head/lib/libc/stdlib/rand.c
>> ==============================================================================
>>
>> --- head/lib/libc/stdlib/rand.c    Sun May 29 12:21:54 2016    (r300955)
>> +++ head/lib/libc/stdlib/rand.c    Sun May 29 13:57:06 2016    (r300956)
>> @@ -48,14 +48,6 @@ __FBSDID("$FreeBSD$");
>> static int
>> do_rand(unsigned long *ctx)
>> {
>> -#ifdef  USE_WEAK_SEEDING
>> -/*
>> - * Historic implementation compatibility.
>> - * The random sequences do not vary much with the seed,
>> - * even with overflowing.
>> - */
>> -    return ((*ctx = *ctx * 1103515245 + 12345) % ((u_long)RAND_MAX +
>> 1));
> 
> This is a good implementation of a not very good LCG, made very bad by
> botching RAND_MAX.  The magic numbers except for RAND_MAX are copied
> from the example in the C90 spec.  I think they are good enough there.
> The comment in at least the C99 spec says "// RAND_MAX assumed to be
> 32767".  This means that these magic numbers were chosen to work with
> this value of RAND_MAX.  (unsigned) longs are used to give a period
> much longer than RAND_MAX and for technical reasons.  Taking the modulo
> to many fewer bits than the minimum of 32 for an unsigned long then
> disguises the linearity.  The BSD version almost completly breaks this
> on arches with 32 bit longs by taking the modulo to 31 bits (mod 32 bits
> would give complete breakage).  Arches with 64-bit longs accidentally
> work a bit better, by the coefficients are poorly chosen -- they should
> be 64 bits and the arithmetic 128 bits.
> 
>> -#else   /* !USE_WEAK_SEEDING */
>> /*
>>  * Compute x = (7^5 * x) mod (2^31 - 1)
>>  * without overflowing 31 bits:
> 
> These coefficients are probably better, but they are still basically
> 32-bit ones and thus not very good for more than a 15-bit RAND_MAX,
> and the details of the calculation are excessively optimized for 1980's
> compilers and 32-bit uintmax_t.
> 
> This can be written as x = (1687 * x) % 2147483647 (with some care about
> type sizes and signedness and overflow.  It then looks like an even worse
> LCG than the botched C90 one, at least with the botch making its internals
> more visible.  E.g., when x = 1, the first couple of iterations don't even
> involve the linear term in 31 bits.
> 
> Even 1980's compiler technology was not far from reducing the division
> to a multiplication.  The LCG expression would then reduce to
> (uintN_t)(A * x + B) where N is either 32 or 64.  Perhaps N needs to
> be 64 even with the small coeefficients, due to the divisor being large
> and not a power of 2.  But if we have 64-bit arithmetic, then we can
> choose much better coefficients than the C90 32-bit ones or the ACM
> barely 16-bit ones, and uses A * x + B directly, giving a 64-bit period,
> and have a chance of our 31-bit RAND_MAX finally working.
> 
>> @@ -66,48 +58,34 @@ do_rand(unsigned long *ctx)
>>  */
>>     long hi, lo, x;
>>
>> -    /* Must be in [1, 0x7ffffffe] range at this point. */
>> -    hi = *ctx / 127773;
>> -    lo = *ctx % 127773;
>> +    /* Transform to [1, 0x7ffffffe] range. */
>> +    x = (*ctx % 0x7ffffffe) + 1;
>> +    hi = x / 127773;
>> +    lo = x % 127773;
>>     x = 16807 * lo - 2836 * hi;
>>     if (x < 0)
>>         x += 0x7fffffff;
> 
> This does the division more magically but more slowly than the compiler
> would do.   It uses one division and one remainder, and doesn't use
> the newfangled (late 1980's) ldiv() function to explicitly try to
> reduce these to one hardware divrem operation.  But compilers can
> easily do this reduction.  I think compilers can't easily (or perhaps
> at all) reduce to an A * x + B expression.  It isn't clear if using
> signed long here makes things easier or harder for compilers.  The
> algorithm is special to avoid overflow with signed longs, but it the
> compiler might not understand this.  Then -fwrapv would inhibit it
> from doing much reduction, and -fno-wrapv is just complicated.
> 
>> -    *ctx = x;
>>     /* Transform to [0, 0x7ffffffd] range. */
>> -    return (x - 1);
>> -#endif  /* !USE_WEAK_SEEDING */
>> +    x--;
>> +    *ctx = x;
>> +    return (x);
>> }
>>
>>
>> int
>> -rand_r(unsigned int *ctx)
>> +rand_r(unsigned *ctx)
> 
> You didn't change the type, but fixed a style bug (the verbose spelling
> of "unsigned") :-).  It is interesting that the style bug is missing in
> POSIX.  I thought that standards mostly got this wrong.  Actually, POSIX
> almost never uses the verbose spelling.  In the a 2001 draft, it has just
> 2 instances of "unsigned int" and these are not in literal code.  It only
> has 260 lines matching "unsigned".  The plain unsigned type is harder to
> grep for and it occurs in surprisingly few prototypes (maybe 10-20).
> 
> C99 uses the verbose variant more, but has no functions except srand()
> taking an unsigned arg, so C99 ends up with only 6 instances of
> "unsigned int" on the same line and 4 of those are related to srand().
> 
> It is a smaller style bug to spell "unsigned" as itself.  "unsigned" is
> spelled u_int in KNF.  This is more descriptive in less space, and is
> closer to the newer standard uintN_t.  Man pages and headers must use
> a standard name, but the implementation should use u_int (but don't
> churn it to do this).
> 
>> {
>>     u_long val;
> 
> This part was already in KNF, so using "unsigned int" was not even
> consistently non-KNF.
> 
> The style bugs were not in the 4.4BSD-Lite* version.  It is simpler
> than possible.  Just 5 lines for the USE_WEAK_SEEDING version of
> rand(), and 6 for srand() (1 extra for K&R style).
> 
>> @@ -116,13 +94,9 @@ rand(void)
>> }
>>
>> void
>> -srand(u_int seed)
>> +srand(unsigned seed)
>> {
>>     next = seed;
>> -#ifndef USE_WEAK_SEEDING
>> -    /* Transform to [1, 0x7ffffffe] range. */
>> -    next = (next % 0x7ffffffe) + 1;
>> -#endif
>> }
> 
> Churning alread occurred (not counting the ifdef) :-(.  4.4BSD-Lite* uses
> the KNF spelling here.
> 
>> @@ -144,10 +118,6 @@ sranddev(void)
>>     mib[0] = CTL_KERN;
>>     mib[1] = KERN_ARND;
>>     sysctl(mib, 2, (void *)&next, &len, NULL, 0);
>> -#ifndef USE_WEAK_SEEDING
>> -    /* Transform to [1, 0x7ffffffe] range. */
>> -    next = (next % 0x7ffffffe) + 1;
>> -#endif
>> }
> 
> I think is right to remove this micro-optimization.  Adjustments can
> probably all be folded into the final expression.  The final expression
> is now so pessimized that an extra addition in it won't make it much
> worse.
> 
> Bruce
> 

Perhaps you can find some ideas, answers and PRNG comparison in the
original paper:
http://www.firstpr.com.au/dsp/rand31/p1192-park.pdf

While technically equal, having KNF arg types internally and exposing
POSIX ones via headers and docs as you suggest leading to harder eye
catching, which makes things even worse when POSIX decide to change arg
type.

clang -O uses single "idivl" calculating both quotient and reminder for
current code, but for ldiv(3) case call/storage/additional calcs
overhead will be added. ldiv(3) does not reduce anything, see stdlib/ldiv.c