svn commit: r328159 - head/sys/modules

Sat Jan 20 08:58:05 UTC 2018

On Fri, 19 Jan 2018, Don Lewis wrote:

> On 19 Jan, Conrad Meyer wrote:
>> On Fri, Jan 19, 2018 at 9:37 AM, Rodney W. Grimes
>> <freebsd at pdx.rh.cn85.dnsmgr.net> wrote:
>>> If you think in assembler it is easy to understand why this is UB,
>>> most (all) architectures Right Logic or Arithmetic Shift only accept an
>>> operand that is a size that can hold log2(wordsize).

The not-unused x86 arch is one that does this.  IIRC, some history of this
is:

- on the 8086, the shift count was taken mod 32.  16 bits was enough for
   anyone, and shifting left or right by 16 through 31 (but not by 32)
   shifted out all of the bits (in the unsigned case) to give 0.

- for the 80386, someone forgot why the 8086 took the count mod 32 instead
   of just 16, and kept using 32.  16 bits was not enough for anyone, and
   shifting left or right by 32 had no effect (even in the signed case?).

   C was standardized at much the same time as the 80386 came out, so
   shifting right by 32 was not required to work.  It gave undefined
   behaviour.  Optimizing compilers took advantage of the UB to give the
   same do-nothing behaviour as the hardware for shift counts of 32
   (or do-something-strange-and-undocumented for larger shift counts).
   Pessimizing compilers could have taken advantage of the UB to shift
   out all of the bits in the same way at runtime as at compile time, like
   some programmers expect.  This would pessimize the usual case (extra
   code would be needed to produce 0 at runtime when the shift
   count is >= 32).

- binary compatibility prevented anyone fixing this on 32-bit x86's

- modulo 32 is no good for 64-bit mode.  Either someone forgot about
   the 8086 again, or there is some binary compatibility problem that
   inhibited expanding 32 to 128 or "infinity".  (It certainly can't
   be "infinity" because even INT16_MAX is unreachable due to the
   shift count being limited to 255 by the old mistake^Woptimization
   of keeping it in %cl.)

- binary compatibility prevented fixing this on 64-bit x86's in 32-bit
   mode.
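
A minimal sketch of the two behaviours discussed in the list, written so
that neither function has UB.  The names shr_masked() and shr_full() are
made up for illustration.  The first assumes the count is taken mod 32, as
described above for 32-bit operands.  The second shows the extra runtime
test needed to shift out all of the bits:

```c
#include <stdint.h>

/*
 * Roughly what the hardware does for 32-bit operands: only the low
 * 5 bits of the count are used, so a count of 32 acts like 0.
 */
static uint32_t
shr_masked(uint32_t x, unsigned n)
{
	return (x >> (n & 31));
}

/*
 * What some programmers expect: counts >= 32 shift out every bit.
 * The comparison is the extra code that pessimizes the usual case.
 */
static uint32_t
shr_full(uint32_t x, unsigned n)
{
	return (n >= 32 ? 0 : x >> n);
}
```

With a count of 32, shr_masked() returns x unchanged and shr_full()
returns 0.  For counts below 32 the two agree.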

>> This is a logical right shift by a constant larger than the width of
>> the left operand.  As a result, it would be a constant zero in any
>> emitted machine code.  It is a bug in the C standard and a concession
>> to naive, non-optimizing compilers that this is considered UB.

This isn't a logical right shift, but it is what the hardware does.  It
is a feature in the C standard and a concession to smart, optimizing
compilers that this is UB.  UB allows the compiler to do anything,
including optimizing to do what the hardware does or pessimizing to
give logical shifts.

It is interesting that the behaviour is undefined even for unsigned
left operands.

UB is not strictly required.  The behaviour could also be implementation
defined or perhaps unspecified.  This makes little difference in practice.
It is unclear whether an implementation could then define the behaviour
as undefined again.

> Generating one answer when compiler knows that everything is constant
> and can figure out the "correct" value at compile time, but generating
> an entirely different answer when the shift value is still constant, but
> passed in as a function parameter and hides that information from the
> compiler so the result is generated at runtime sounds like a good way to
> introduce bugs.

My pre-C90 compiler does this for integer division.  C99 requires
incorrect rounding (round towards 0 instead of towards -infinity for
positive divisors), but my compiler does correct rounding for divisions
done at compile time and in software and whatever the hardware does
(usually incorrect) otherwise.  In C90, the rounding is implementation-
defined, so it can even be correct, but in practice it cannot be trusted.
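
For code that wants rounding towards -infinity no matter what the
compiler or hardware does, one possible sketch is below.  floor_div() is
a made-up name, and the code assumes C99 semantics for / and % (quotient
truncated towards 0, remainder with the sign of the dividend).  As with
plain division, b == 0 and INT_MIN / -1 are still undefined:

```c
/*
 * Divide rounding towards -infinity, built on C99's truncating '/'.
 */
static int
floor_div(int a, int b)
{
	int q = a / b;
	int r = a % b;

	/*
	 * A nonzero remainder whose sign differs from the divisor's
	 * means the exact quotient was negative and truncation rounded
	 * it up, so step down by one.
	 */
	if (r != 0 && ((r < 0) != (b < 0)))
		q--;
	return (q);
}
```

For example, -7 / 2 is -3 in C99, while floor_div(-7, 2) gives -4.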

Bruce