svn commit: r314087 - head/sys/x86/x86

Sun Feb 26 12:44:55 UTC 2017

On Sun, Feb 26, 2017 at 04:43:12AM +1100, Bruce Evans wrote:
> On Sat, 25 Feb 2017, Konstantin Belousov wrote:
> 
> > On Sat, Feb 25, 2017 at 02:17:23PM +1100, Bruce Evans wrote:
> >> On Fri, 24 Feb 2017, Konstantin Belousov wrote:
> >>
> >>> On Thu, Feb 23, 2017 at 06:33:43AM +1100, Bruce Evans wrote:
> >>>> On Wed, 22 Feb 2017, Konstantin Belousov wrote:
> >>>>
> >>>>> Log:
> >>>>>  More fixes for regression in r313898 on i386.
> >>>>>  Use long long constants where needed.
> >>>>
> >>>> The long long abomination is never needed, and is always a style bug.
> >>> I never saw any explanation behind this claim.  Esp. the first part
> >>> of it, WRT 'never needed'.
> >>
> >> I hope I wrote enough about this in log messages when I cleaned up the
> >> long longs 20 years ago :-).
> >>
> >> long long was a hack to work around intmax_t not existing and long being
> >> unexpandable in practice because it was used in ABIs.  It should have gone
> >> away when intmax_t was standardized.  Unfortunately, long long was
> >> standardised too.
> > That makes even less sense.  long long is a native compiler type,
> 
> It is only native since C99 broke C under pressure from misimplementations
> with long long.  In unbroken C, long is the longest type, and lots of code
> depended on this (the main correct use was casting integers to long for
> printing, and the main incorrect use was using long for almost everything
> while assuming that int is 16 or 32 bits and not using typedefs much).
> Correct implementations used an extended integer type (extended integer
> types were also nonstandard before C99, and making them longer than long
> also breaks C).
> 
> 4.4BSD used quad_t and required it to be not actually a quad int, but
> precisely 64 bits, and used long excessively (for example, pid_t was
> long on all arches).  This is essentially the long long mistake, with
> long long spelled better as quad_t, combined with similar mistakes
> from a previous generation where int was 16 bits.  Long was used
> excessively as a simple way to get integers with at least 32 bits,
> although BSD never supported systems with int smaller than 32 bits
> AFAIK, and 4.4BSD certainly didn't support such systems.
2.9BSD was a port to the PDP-11, AFAIK, with 16-bit ints.

> But BSD
> broke the type of at least pid_t by sprinkling longs.  In FreeBSD-1
> from Net/2, pid_t was short.  BSD wanted to expand PID_MAX from 30000,
> and did this wrong by expanding pid_t from short to long, just in time
> for this to be wrong in practice since 64-bit systems were becoming
> available so that long could easily be longer than int.
> 
> (Net)BSD for alpha wasn't burdened with ABIs requiring long to be 32
> bits, so it made long 64 bits and didn't need long long (except even
> plain long should be actually long, so it should be twice as wide as
> a register and thus 128 bits on alpha).  NetBSD cleaned up the
> sprinkling of longs at much the same time that 4.4BSD-Lite1 sprinkled
> them, to avoid getting silly sizes like 64 bits for pid_t and related
> compatibility problems.  I think NetBSD actually never imported 4.4BSD-
> Lite1, but later merged Lite1 or Lite2 and kept its typedefs instead
> of clobbering them with the long sprinkling.  FreeBSD was handicapped
> by the USL lawsuit.  It had to import Lite1.  It only fixed the long
> sprinkling much later by merging Lite2.  I think Lite2 got the
> better types from NetBSD.  This is summarised in the FreeBSD commit
> log using essentially "Splat!" :-(.
> 
> > while anything_t is a typename to provide MI fixed type.  long long was
> > obviously chosen to extend types without requiring a new keyword.
> 
> Abusing a standard keyword doesn't do much except ensure a syntax error
> if code with the extended type is compiled with a compiler that doesn't
> support the extension.  BSD's quad_t is in the application namespace.
> __quad_t would be better.
> 
> The errors are now being repeated with extensions to 128-bit integers.
> At least they are being spelled better as __int128_t instead of long
> long long or full doubling (long long long long = 128 bits).  C doesn't
> allow __int128_t even as an extended type unless intmax_t is at least
> 128 bits.  Compatibility, ABI and bloat problems prevent enlarging
> intmax_t from 64 bits to 128 bits on LP64 systems just like they prevented
> the misimplementations enlarging long from 32 bits on LP32 and L32P64
> systems.
> 
> >> It is "never needed" since anything that can be done with it can be done
> >> better using intmax_t or intN_t or int_fastN_t or int_leastN_t.  Except,
> >> there is no suffix for explicit intmax_t constants, so you would have to
> >> write such constants using INTMAX_C() or better a cast to intmax_t if
> >> the constant is not needed in a cpp expression.
> > If you replace long long with int there, the same logical structure of
> > sentences will hold.  Does it mean that 'int' is abomination since we
> > have int32_t which allows everything to be done better ?
> 
> No, since correct uses of int are possible.  int32_t allows almost
> everything to be done worse.  It should only be used in software and
> hardware ABIs (like pid_t and network software packet layouts in
> software, and memory-mapped device registers and network hardware
> packet layouts in hardware).  Using it asks for precisely 32 bits 2's
> complement with no padding bits at any cost.  Using it is unportable,
> but in practice the implementation has to emulate it if it is not a
> natural type for the CPU.  There aren't many CPUs like that any more.
> Ones with native 1's complement and no native support for 2's complement
> used to be more common.  The original alpha didn't have 8-bit loads and
> stores; I don't know if it had 32-bit ones.  Emulation just takes time
> for software, but for memory-mapped device registers it takes hardware
> support to avoid side effects from wider loads and stores.  Hardware
> tends to have this automatically -- even on i386, it is common to map
> 8-bit registers to 32-bit words in PCI space, and if 8-bit accesses are
> impossible then the hardware needs to be more careful with address
> selection to avoid using the 24 top bits in each word.
As is, the original alpha cannot implement C11, which, I believe, is
common knowledge.  Sure, an unusable implementation might take a global
lock for each memory access (or perhaps only for each byte and half-word
access) to emulate atomicity, but this is only a theoretical exercise.

> 
> Typedefs are hard to use, but unavoidable when an API specifies them.
> Then a constant ABI may also make the typedefs inefficient in space
> and time.
> 
> To avoid the space/time inefficiencies, int_least32_t and int_fast32_t
> should be used instead of int32_t.  These are also hard to use, and
> almost never used in practice.  It is easier to hard-code int32_t
> and assume that this is efficient in space and time.
> 
> But it is more correct to use plain int (except for ABIs).  In plain
> C and in POSIX before ~2007, plain int is essentially a convenient
> spelling of int_fast16_t.  POSIX changed this in ~2007 to require
> 32-bit ints, so int is now a convenient spelling of int_fast32_t,
> just like it has always been in BSD.
> 
> I just noticed some complications in the non-2's complement cases.  C
> and POSIX still support 1's complement and this weakens the fast
> and least types.  E.g., INT32_MIN is -2**31, but this is not
> representable using 32 bits except in the 2's complement case, and
> C doesn't require int_fast32_t or int_least32_t to be able to
> represent it.  This makes the signed fast and least types difficult to
> use correctly.  Code like the following is invalid:
> 
>  	struct sc {
>  		int_least32_t least_reg_image;
>  		...
>  	};
>  	int_fast32_t fast_reg_image;
>  	int32_t reg_image;
>  	...
>  	reg_image = bus_space_read_4(bst, bsh, off);	/* not quite correct */
>  	fast_reg_image = reg_image;	/* copy it for time efficiency */
>  	fast_reg_image = adjust(fast_reg_image);
>  	sc.least_reg_image = fast_reg_image;	/* pack it for space effic. */
> 
> because the space and time conversions, and the adjustment using the fast
> type might clobber the INT32_MIN bit except in the 2's complement case.
> 
> Of course, code accessing device registers would use unsigned types and
> automatically avoid the problem.  It was already an error to convert the
> uint32_t returned by bus_space_read_4() to int32_t, although this
> conversion would not clobber the INT32_MIN since int32_t is 2's complement.
> 
> POSIX code can now simply use int or u_int if it just needs 32 bits.
> Similarly for portable code that just needs 16 bits.  int is the only
> type which is easy to use, so it should be used if possible.  It is
> supposed to be as space/time efficient as possible, subject to the
> constraint that it is 16 or 32 bits.  The fast and least types only
> give one of these at a time and are harder to use since their rank is
> opaque so you have to know too much about them to know if other types
> promote to them or if they promote to int or long, etc.  Even unsigned
> and long are not so easy to use.  Unsigned must sometimes be used to
> get 2's complement behaviour or extra range, but using it gives sign
> extension problems.  long is too long for general use, and promotion
> to it doesn't occur automatically like it does for int.  long long
> is even less usable than long.
> 
> >>>> I don't like using explicit long constants either.  Here the number of bits
> >>>> in the register is fixed by the hardware at 64.  The number of bits in a
> >>>> long on amd64 and a long on i386 is only fixed by ABI because the ABI is
> >>>> broken for historical reasons.
> >>> I really cannot make any sense of this statement.
> >>
> >> To know that the ULL suffix is correct for 64-bit types, you have to know
> >> that long longs are 64 bits on all arches supported by the code.  Then
> >> to use this suffix, you have to hard-code this knowledge.  Then to read
> >> the code, the reader has to translate back to 64 bits.  The translations
> >> were easier 1 arch at a time.
> > And why it is bad ?  Same is true for int and long, which are often used
> > in MD code to provide specific word size.  In fact, there is not much for
> > a programmer to know: we and any other UNIX supports either ILP32 or LP64
> > for the given architecture.
> 
> I used basic types intentionally in i386 headers -- always use [u_]char,
> [u_]short and [u_]int or perhaps a vm type like vm_offset_t and never
> [u_]intN_t.  This doesn't work so well for longs or long longs.  I supported
> i386 with correctly sized longs -- twice as long as a register.  This
> gave I32L64P32.
> 
> Merging the x86 headers churned many of the shorter type declarations
> for [u_]intN_t, but had to be more careful with longs because these differ
> between amd64 and i386, and more careful with long longs because although
> these don't differ in size, long long is unnatural on amd64 so tends to
> cause warnings.  In general, it is unclear if a fixed-width type is used
> because it is related to the word size, an API or an ABI.  u_int might
> mean precisely 32 bits.  u_long might mean the word size; it is easier to
> write, but is broken for i386 with correctly-sized longs, so I removed
> most uses of it for the word size.  unsigned long might mean precisely
> 64 bits, but is harder to write than uint64_t except in literal suffixes,
> and has a logical type mismatch on amd64, so is rarely used.
> 
> >> Casting to uint64_t is clearer, but doesn't
> >> work in cpp expressions.  In cpp expressions, use UINT64_C().  Almost no
> >> one knows about it or uses this.  There are 5 examples of using it in /sys
> >> (3 in arm64 pte.h, 1 in powerpc pte.h, and 1 in mips xlr_machdep.c,
> >> where the use is unnecessary but interesting: it is ~UINT64_C(0)).  We
> >> used to have squillions of magic ~0's for the rman max limit.  This was
> >> spelled ~0U, ~0UL and perhaps even plain ~0.  Plain ~0 worked best except
> >> on unsupported 1's complement machines, since it normally gets sign extended
> >> to as many bits as necessary.  Now this is spelled RM_MAX_END, which is
> >> implemented non-magically using a cast: (~(rman_res_t)0).  Grepping for
> >> ~0[uU] in dev/* shows only 1 obvious unconverted place.
> > This clearly demonstrates why ULL/UL notation is superior to UINT64_C() or
> > any other obfuscation.
> 
> RM_MAX_END is an unobfuscation.
> 
> Unconditional use of ULL just asks for future unportabilities and
> compiler warnings now.  On amd64, long long is never needed since it is
> no longer than long, and compilers could reasonably complain about uses
> when they can't see that its use has no effect or when they are generating
> portability warnings.
> 
> >> The MTRR_* macros are in x86/specialreg.h, and are spelled without ULL
> >> suffixes.  I prefer the latter, but seem to remember bugs elsewhere
> >> caused by using expressions like ~FOO where FOO happens to be small.
> >> Actually the problems are mostly when FOO happens to be between
> >> INT_MAX+1U and UINT_MAX.  When FOO is small and has no suffix, e.g.,
> >> if it is 0, then its type is int and ~FOO has type int and sign-extends
> >> to 64 bits if necessary.  But if FOO is say 0x80000000, it has type u_int
> >> so ~FOO doesn't sign-extend.  (Decimal constants without a suffix never
> >> have an unsigned type and the hex constant here lets me write this number
> >> and automatically give it an unsigned type.  Normally this type is best.)
> >>
> >> Explicit type suffixes mainly hide these problems.  If FOO is 0x80000000ULL,
> >> then it has the correct type for ~FOO to work in expressions where everything
> >> has type unsigned long long, but in other expressions a cast might still
> >> be needed.
> > Yes, yet another (and most useful) reason to use ULL and ignore a FUD about it.
> 
> amd64 doesn't even have such expressions.  The other terms will be mostly
> uint64_t = plain u_long.  Mixing with ULL promotes everything to unsigned
> long long.  Everything works because the arch is LP64 and also bogusly LL64,
> but this is harder to understand than code using the 64-bit types throughout,
> that is, using ~(uint64_t)FOO.

Everything works because amd64 is LPLL64, but also because i386 is
ILP32LL64, so using ULL gives the desired outcome on both architectures.
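
To make the ~FOO case discussed above concrete, here is a minimal sketch.
It is illustrative only and not taken from the commit: FOO_PLAIN, FOO_ULL
and the mask_*() helpers are hypothetical names, and the sketch assumes a
32-bit int, as on both i386 and amd64.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Assumption of this sketch: int is 32 bits (true on i386 and amd64). */
static_assert(UINT_MAX == 0xffffffffU, "sketch assumes 32-bit int");

/* Hypothetical mask, chosen to lie between INT_MAX+1U and UINT_MAX. */
#define	FOO_PLAIN	0x80000000	/* hex, no suffix: unsigned int */
#define	FOO_ULL		0x80000000ULL	/* unsigned long long */

/* ~ is evaluated in unsigned int, then zero-extended: top 32 bits are 0. */
static uint64_t
mask_plain(void)
{
	return (~FOO_PLAIN);
}

/* ~ is evaluated in unsigned long long, so the top 32 bits are set. */
static uint64_t
mask_ull(void)
{
	return (~FOO_ULL);
}

/* ~ is evaluated in uint64_t, whichever basic type that happens to be. */
static uint64_t
mask_cast(void)
{
	return (~(uint64_t)FOO_PLAIN);
}

/* Small constant: type int, so ~0 is -1 and sign-extends to all ones. */
static uint64_t
mask_zero(void)
{
	return (~0);
}
```

Under that assumption mask_plain() yields 0x000000007fffffff, while
mask_ull() and mask_cast() both yield 0xffffffff7fffffff, and mask_zero()
yields all ones.  The suffix and the cast give the same value here; the
disagreement in this thread is only about which spelling is preferable.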

Again, after re-reading all the text above, I do not see anything wrong
with long long.  Except that you do not like it.