svn commit: r325765 - head/lib/libc/string
Bruce Evans
brde at optusnet.com.au
Thu Nov 16 07:15:48 UTC 2017
On Wed, 15 Nov 2017, Mark Millard wrote:
> Bruce Evans brde at optusnet.com.au wrote on
> Tue Nov 14 12:41:50 UTC 2017 :
>
>> - memcpy.3. It was already documented in the BUGS section that memcpy() is
>> implemented as bcopy() and therefore the "strings" "may overlap" [then
>> portability considerations due to this]. This is buggier than before:
>> - old bug: only the MI implementation of memcpy() clearly implements
>> memcpy() as bcopy(). The amd64, i386 and arm MD implementations
>> do the same.
I was confused between arm and mips and only fixed this in some places.
It is mips that does the same as amd64 and i386. arm has different MD code,
and the others have no md code.
> head/sys/arm/arm/support.S ( -r283366 ) has
>
> 394 ENTRY(bcopy)
> 395 /* switch the source and destination registers */
> 396 eor r0, r1, r0
> 397 eor r1, r0, r1
> 398 eor r0, r1, r0
> 399 EENTRY(memmove)
> 400 /* Do the buffers overlap? */
> 401 cmp r0, r1
> 402 RETeq /* Bail now if src/dst are the same */
> 403 subcc r3, r0, r1 /* if (dst > src) r3 = dst - src */
> 404 subcs r3, r1, r0 /* if (src > dsr) r3 = src - dst */
> 405 cmp r3, r2 /* if (r3 < len) we have an overlap */
> 406 bcc PIC_SYM(_C_LABEL(memcpy), PLT)
> . . .
I didn't look closely at the arm code, and just noticed more bugs in it.
Falling through like that breaks at least profiling. Only ENTRY() does
_PROF_PROLOGUE, so profiling of memmove() seems to be broken.
A less worse way to use EENTRY() is ENTRY() then 1 more more EENTRY()'s
with no code in between. Then the names are just aliases. Profiling
can't distinguish the aliases, but at least all entries get counted.
amd64 and i386 use ALTENTRY() which handles this as well as possible
(not very well -- like a tail call -- too much gets charged to the
tail function. sparc64 also has ALTENTRY() but not the complications
needed for it to work only as badly as a tail call.
It is a style bug for memmove() to exist in the kernel. Especially an MD
version of it. It is bad enough that libkern has an MI memmove(). This
is not the same as the libc/string one (another style bug), but just
implements memmove() by calling bcopy(). Profiling works right for this
unless it the function is over-optimized to a tail call when it works
like ALTENTRY().
> head/lib/libc/arm/string/memmove.S ( -r288373 ) has:
>
> 37 #ifndef _BCOPY
> 38 /* LINTSTUB: Func: void *memmove(void *, const void *, size_t) */
> 39 ENTRY(memmove)
> 40 #else
> 41 /* bcopy = memcpy/memmove with arguments reversed. */
> 42 /* LINTSTUB: Func: void bcopy(void *, void *, size_t) */
> 43 ENTRY(bcopy)
> 44 /* switch the source and destination registers */
> 45 eor r0, r1, r0
> 46 eor r1, r0, r1
> 47 eor r0, r1, r0
> 48 #endif
This is the correct way to do it -- separate object files with duplication
in source files avoided using ifdefs.
> 49 /* Do the buffers overlap? */
> 50 cmp r0, r1
> 51 it eq
> 52 RETeq /* Bail now if src/dst are the same */
> 53 ite cc
> 54 subcc r3, r0, r1 /* if (dst > src) r3 = dst - src */
> 55 subcs r3, r1, r0 /* if (src > dsr) r3 = src - dst */
> 56 cmp r3, r2 /* if (r3 < len) we have an overlap */
> 57 bcc PIC_SYM(_C_LABEL(memcpy), PLT)
> . . .
>
> It looks to me like bcopy and memmove call memcpy under
> some conditions, not the other way around. I did not
> notice any code in memcpy going back to bcopy (or
> memmove) in either area. (That might have lead to
> recursion as things are.)
The above only shows memmove() calling bcopy(), only in the kernel. This
would be invalid in userland since memmove() is Standard but bcopy() is
supposed to be usable by applications (unless __BSD_VISIBLE). The
separate object files in userland make this not a problem. Otherwise,
bcopy and memmove in the same object file would cause namespace problems
even if bcopy is implemented as a [tail] call or fall-through to memmove.
> ...
> Another issue is if compilers (clang, gcc) are well
> controlled for code substitutions to preserve
> supposed FreeBSD rules about where overlaps
> are well defined. The library code might not be all
> there is to it as far as behavior goes for
> compiled C/C++ source code.
Do you mean the namespace stuff?
> head/sys/arm64/arm64/ is similar for bcopy/memmove
> calling memcpy (but head/lib/libc/amd64/string/ is
> not):
It is reasonable, and possibly best, to implement the usual
non-overlapping case by a call to memcpy() and not try hard to
optimize the overlapping case (or jump far away to it to keep
caches cleaner for the usual case).
> head/sys/arm64/arm64/memmove.S ( -r307909 ) has:
> ...
> head/lib/libc/amd64/string/bcopy.S has:
> ...
Same organisation as for plain arm, so good for libc and not so good fot
the kernel.
>> mips is backwards for bcopy() and implements it as
>> memmove(). mips has 2 separate memcpy*.S files, one for plain arm
>> and one for xscale. The plain arm one is smaller than memmove.S,
>> so presumably doesn't support overlapped copies. The xscale one
>> is much larger, so has space for overlapped copies and many optimizations
>> that no longer work. I don't know what it does.
>
> The mix of "mips" and "arm"/"xscale" above confused me. Looking
> around this seems to be referencing head/lib/libc/arm/string/
> materials instead of head/lib/libc/mips/string/ or
> head/sys/mips/mips/ materials.
I meant arm here.
mips does have some MI string instructions in libc which I missed or
got confused before. Also, aarch64 has some which I missed because the
they are obfuscated by putting them in contrib instead of under libc.
mips implements all of bcopy, memcpy and memmove in bcopy.S, and seems
to be indeed like amd64 and i386 -- none of these arches bothers to
optimize memcpy by not supporting overlapping copies in it, exactly
as documented.
However, the documentation is still incorrect. Without -ffreestanding,
compilers can and do often implement memcpy inline. Then overlapping
copies are not supported. The BUGS section in memcpy.3 should just be
deleted.
Bruce
More information about the svn-src-head
mailing list