svn commit: r333266 - head/sys/amd64/amd64

Sat May 5 04:21:16 UTC 2018

On Fri, 4 May 2018, Warner Losh wrote:

> While i wholeheartedly agree, an earlier commit explained why in this case.
> It allows the compiler to inline for small copies.

[This should have been killfiled due to top-posting.]

No, it has nothing to do with that.  The compiler can inline either bcopy()
or memcpy(), and does so after related recent changes that implement bcopy()
using builtins.

What this does is work around bugs in the recent changes, and compiler bugs.

bcopy() handles overlapping copies, but memcpy() doesn't, so it is not just
a style bug to use memcpy() instead of bcopy().  However, if bcopy() is
exposed to the compiler using builtins
     (note that -ffreestanding prevents even memcpy() being known to
     the compiler unless memcpy() is spelled __builtin_memcpy(), and
     there are similar but larger complications for bcopy())
then the compiler can almost always see when overlap occurs and further
reduce bcopy() to __builtin_memcpy().  The full reduction should be:

     bcopy() -> __builtin_memmove() -> __builtin_memcpy() -> { either
       inline code or a call to __memcpy() }

but the actual reduction is:

     bcopy() -> { __builtin_memmove() for small sizes, else bcopy() (bug1) }
       -> { __builtin_memcpy() if inlined, else a call to memmove() (bugs2,3) },
 	 else bcopy()

where bug1 is an implementation bug (not telling the compiler about large
copies), bug2 is a compiler bug (forgetting that it reduced to *memcpy when
calling the library to handle large sizes), and bug3 is a compiler bug
(calling the library function memmove() which might be unrelated in the
freestanding case).

Fixing the implementation and compiler would optimize thousands of calls
to bcopy() without churning the source code, but I'm a bit worried about
security bugs from this.  The security bugs are larger for bzero().  When
bzero() was not exposed to the compiler as __builtin_memset(), it was
guaranteed to always zero memory so there was no need for explicit_bzero()
or explicit_memset() in the kernel, and using these was just a style bug,
as is using memset() to 0 instead of bzero().  Usually the compiler doesn't
have enough information to optimize away __builtin_memset().  But changing
thousands of bzero()s to __builtin_memset()s is likely to find a couple of
cases where the compiler does do this optimization and it breaks security.

Bruce