[CFR] mge driver / elf reloc

Bruce Evans brde at optusnet.com.au
Mon Jul 21 19:27:42 UTC 2014


On Mon, 21 Jul 2014, Warner Losh wrote:

> On Jul 21, 2014, at 10:25 AM, John-Mark Gurney <jmg at funkthat.com> wrote:
>
>> Warner Losh wrote this message on Mon, Jul 21, 2014 at 08:46 -0600:
>>>
>>> On Jul 20, 2014, at 5:10 PM, John-Mark Gurney <jmg at funkthat.com> wrote:
>>>
>>>> Tim Kientzle wrote this message on Sun, Jul 20, 2014 at 15:25 -0700:
>>>>> $ man 9 byteorder
>>>>>
>>>>> is most of what you want, lacking only some aliases to pick
>>>>> the correct macro for native byte order.
>>>>
>>>> Um, those doesn't help if you want native endian order?
>>>
>>> Ummm, yes they do. enc converts from native order. dec decodes to native byte
>>
>> No they don't.. If you want to read a value in memory that is native
>> endian order to native endian order (no conversion), they cannot be
>> used w/o using something like below…
>
> Missed the native to native. bcopy works, but is ugly, as you note below.

Indeed, the API is missing support for the easy case of host load/store.
But this is a feature.

He used memcpy(), not bcopy().  This is a case where using memcpy() instead
of bcopy() is not a style bug.  Using memcpy() is pretty.

>>> order. They are more general cases than the ntoh* functions that are more traditional
>>> since they also work on byte streams that may not be completely aligned when
>>> sitting in memory. Which is what you are asking for.
>>
>> So, you're saying that I now need to write code like:
>> #if LITTLE_ENDIAN /* or how ever this is spelled*/
>> 	var = le32enc(foo);
>> #else
>> 	var = be32enc(foo);
>> #endif
>>
>> If I want to read a arch native endian value?  No thank you…
>
> I’m not saying that at all.
>
>>>> Also, only the enc/dec functions are documented to work on non-aligned
>>>> address, so that doesn't help in most cases?
>>>
>>> They work on all addresses. They are even documented to work on any address:
>>>
>>>     The be16enc(), be16dec(), be32enc(), be32dec(), be64enc(), be64dec(),
>>>     le16enc(), le16dec(), le32enc(), le32dec(), le64enc(), and le64dec()
>>>     functions encode and decode integers to/from byte strings on any align-
>>>     ment in big/little endian format.
>>>
>>> So they are quite useful in general. Peeking under the covers at the implementation
>>> also shows they will work for any alignment, so I'm having trouble understanding
>>> where this objection is really coming from.
>>
>> There are places where you write code such as:
>> 	int i;
>> 	memcpy(&i, inp, sizeof i);
>> 	/* use i */
>>
>> In order to avoid alignment faults...  None of the functions in byteorder
>> do NO conversion of endian, or you have to know which endian you are but
>> that doesn't work on MI code...

This is good code.  memcpy() is likely to be optimized better than anything
in <sys/endian.h>, even with the optimizations in my previous reply.

Compilers do magic things with memcpy(), like turning it into a no-op
if the memory is already in a register and the register doesn't need
to be moved.  Sometimes the register needs to be moved but the move
can be, and is, to another register (perhaps in a different register
set, like XMM instead of a general register on x86).  Doing nothing
much for memcpy()s that are just for alignment is a special case of
this.  For the above, on all arches, the compiler should load the
int-sized memory at address inp into an int-sized register and then
rename the register to i (i should never hit memory unless it is changed
and the change needs to be stored).  The instructions that are used
for the load depend on what the compiler knows about the alignment of
inp, and the alignment requirements of the arch.  On arches without
strict alignment, like x86 without CR0_AC, there is nothing better
than doing a misaligned load *(int *)inp the same as you could do by
lying to the compiler.  On arches with strict alignment, there is
nothing better than loading 1 byte at a time and combining if the
alignment of inp is 1 or unknown.
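
A minimal sketch of the host-order load described above, assuming a
hosted C environment.  The name load_host_int is hypothetical and is
not part of any FreeBSD API; the assert() is only there for the
illustration.

```c
#include <assert.h>
#include <string.h>

/*
 * Load an int stored in host byte order at an address whose alignment
 * is unknown.  No byte swapping is done.
 */
static int
load_host_int(const void *inp)
{
	int i;

	assert(inp != NULL);
	/* Compilers with builtins enabled typically turn this into a load. */
	memcpy(&i, inp, sizeof(i));
	return (i);
}
```

Which instructions the compiler emits for the memcpy() depends on what
it knows about the alignment of inp and on the target, as described
above.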

This seems to make most of byteorder(9) a mistake.  Just about everything
can be done better using memcpy() and standard functions in byteorder(3),
except the (3) functions only go up to 32 bits and have 1980's spellings
[sl] for the integer sizes.  All alignment stuff can be done in host order
using memcpy().  Most byte swapping stuff can be done using ntohl().  The
direction for n/h can be confusing but might need fewer ifdefs than be/le.

le32enc done using more-standard APIs:
    first htole32() in a register.  Only in byteorder(9)
    then store to an aligned uint32_t temporary variable (optimized away)
    then memcpy() to final place

le32dec done using more-standard APIs:
    first memcpy() to aligned uint32_t temporary variable (optimized away)
    then load to a register
    then le32toh on a register.  Only in byteorder(9)
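
The two recipes above could be sketched roughly as follows.  The names
my_le32enc and my_le32dec are hypothetical, chosen to avoid clashing
with the real le32enc()/le32dec().  The header location is an
assumption: <sys/endian.h> on FreeBSD, <endian.h> on glibc-style
systems.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE		/* glibc: expose htole32()/le32toh() */
#endif
#include <assert.h>
#include <stdint.h>
#include <string.h>
#if defined(__FreeBSD__)
#include <sys/endian.h>
#else
#include <endian.h>		/* assumption: glibc-style location */
#endif

/* Encode: swap in a register if needed, then memcpy() to the final place. */
static void
my_le32enc(void *pp, uint32_t u)
{
	uint32_t tmp;

	assert(pp != NULL);
	tmp = htole32(u);		/* aligned temporary */
	memcpy(pp, &tmp, sizeof(tmp));	/* pp may be misaligned */
}

/* Decode: memcpy() to an aligned temporary, then convert to host order. */
static uint32_t
my_le32dec(const void *pp)
{
	uint32_t tmp;

	assert(pp != NULL);
	memcpy(&tmp, pp, sizeof(tmp));
	return (le32toh(tmp));
}
```

Whether the temporary is really optimized away depends on the compiler
and on builtins being enabled.
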

These take 1 line extra each (for the memcpy()).  You already have to do
this for n/h conversions.  E.g.:

htonl from register to misaligned memory (like le32enc):
    first htonl() in the register.  Standard in byteorder(3) and POSIX.
    then store to an aligned uint32_t temporary variable (optimized away)
    then memcpy() to final place
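
A sketch of that htonl() recipe, plus the matching load, using only
memcpy() and the byteorder(3) functions.  The names store_net32 and
load_net32 are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <arpa/inet.h>

/* Store a host-order value in network order at a possibly misaligned address. */
static void
store_net32(void *pp, uint32_t host)
{
	uint32_t tmp;

	assert(pp != NULL);
	tmp = htonl(host);		/* aligned temporary */
	memcpy(pp, &tmp, sizeof(tmp));
}

/* Load a network-order value from a possibly misaligned address. */
static uint32_t
load_net32(const void *pp)
{
	uint32_t tmp;

	assert(pp != NULL);
	memcpy(&tmp, pp, sizeof(tmp));
	return (ntohl(tmp));
}
```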

-fbuiltin is essential for optimizing memcpy(), so memcpy must be
spelled __builtin_memcpy in time-critical code if -fbuiltin might be
turned off.  -fbuiltin used to be only turned off in the kernel for
LINT, but this was broken 13 years ago by using -ffreestanding.  13
years is not long enough for LINT (conf/NOTES) to have caught up with
the change.  clang breaks this even more.  With gcc, you can use
-fbuiltin after -ffreestanding to turn builtins back on, but with clang
-ffreestanding apparently has precedence so -fbuiltin never turns
builtins back on.  Also, gcc's man page actually documents these options,
and somewhere in the gcc documentation there is a recommendation to
keep -fbuiltin off and only use selected builtins via __builtin_foo.
Turning all builtins back on originally only caused minor problems with
a few builtins like the one for printf.

Bruce


More information about the freebsd-arm mailing list