svn commit: r333266 - head/sys/amd64/amd64

Bruce Evans brde at optusnet.com.au
Sat May 5 05:28:50 UTC 2018


On Fri, 4 May 2018, Warner Losh wrote:

> On Fri, May 4, 2018 at 5:12 PM, Mateusz Guzik <mjguzik at gmail.com> wrote:
>
>> On Sat, May 5, 2018 at 12:58 AM, Steven Hartland <
>> steven.hartland at multiplay.co.uk> wrote:
>>
>>> Can we get the why in commit messages please?
>>>
>>> This sort of message doesn't provide anything more than can be obtained
>>> from reading the diff, which just leaves us wondering why?
>>>
>>> I’m sure there is a good reason, but without confirmation we’re just left
>>> guessing. The knock on to this is if some assumption that caused the why
>>> changes, anyone looking at this will not be able to make an informed
>>> decision that that was the case.
>>>
>> bcopy is an equivalent of memmove, i.e. it accepts overlapping buffers.
>> But if we know for a fact they don't overlap (like here), doing this over
>> memcpy (which does not accept such buffers) only puts avoidable
>> constraints on the optimizer.

Indeed, but clang already does adequate optimization for so many cases
(especially amd64), so these small changes are not much more than special
micro-optimizations for gcc on 32-bit arches.  I care about gcc and 32-bit
arches, but you don't.
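
To make the overlap distinction concrete, here is a minimal C sketch of
the two cases. The helper names shift_right() and copy_disjoint() are
made up for illustration and are not from the commit, and the assert()
is only a debug-time check of the memcpy() contract, not something the
kernel code does.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Overlapping case: shift the first len bytes of buf one position to
 * the right.  Source and destination share memory, so memmove() (or
 * bcopy(), with its src/dst arguments swapped) is required.
 * buf must have room for len + 1 bytes.
 */
static void
shift_right(char *buf, size_t len)
{
	memmove(buf + 1, buf, len);
}

/*
 * Disjoint case: the caller knows the buffers do not overlap, so
 * memcpy() is enough and the compiler is free to copy in any order.
 */
static void
copy_disjoint(void *dst, const void *src, size_t len)
{
	/* Illustrative check only: the two ranges must not intersect. */
	assert((uintptr_t)dst + len <= (uintptr_t)src ||
	    (uintptr_t)src + len <= (uintptr_t)dst);
	memcpy(dst, src, len);
}
```

Calling shift_right() on a buffer holding "abcde" leaves "aabcde";
doing the same copy with memcpy() would be undefined behaviour.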

> bcopy, in userland, is memmove. bcopy in the kernel has had a more
> complicated history. Today it's more like memmove, but at times in the
> history of BSD/Unix it's been more akin to memcpy with a companion ovbcopy
> used for overlapping copies. FreeBSD has almost always been more in the

I think (but don't know) that ovbcopy is a SYSVism and bcopy() always
handled overlapping copies in BSD.  It was not well documented that it
did, but with only 1 memory-copying function that function has to handle
overlapping copies or be even better documented to not handle them.

> 'bcopy is memmove' rather than the 'bcopy is memcpy' though some of the
> lower-tier architectures pulled in ovbcopy which we recently GC'd from
> NetBSD and/or OpenBSD.

In all of 4.4BSD /sys, ovbcopy is only referenced on 34 lines (almost half
in tags files), mostly to implement it on some arches:
- news3400, hp300, i386, luna68k: alias for bcopy
- sparc64: separate from bcopy.  bcopy seems to be like memcpy and doesn't
   handle overlapping copies.
- vax/inline/machpats.c: separate and too vaxish for me to understand (seems
   to be just a prologue)
- netiso/iso_pcb.c, net/slcompress.c, sparc/pmap.c, netinet/ip_output.c,
   netinet/ip_nroute.c: actually use it

The sparc64 and vax code is an indication that bcopy didn't always handle
overlapping copies in BSD.

> Plus there's been an irrational encouragement of
> using bcopy over mem* which has led to the current state of affairs.

You mean a rational encouragement.

> For the vast majority of uses, it hasn't really mattered much in the past.
> It hasn't shown up on radar.

It matters even less now.  Deciding if the copies overlap takes about 1
branch, and with modern branch prediction that often costs about 1 cycle.
The x86 library implementation wastes more like 50 cycles in other ways.
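
As a rough illustration of the "about 1 branch" point, here is a
simplified byte-at-a-time sketch in C.  sketch_memmove() is a
hypothetical name and this is not the x86 library code; real
implementations copy in larger units and have more setup around the
copy loops.

```c
#include <stddef.h>
#include <stdint.h>

static void *
sketch_memmove(void *dst, const void *src, size_t len)
{
	unsigned char *d = dst;
	const unsigned char *s = src;
	size_t i;

	/*
	 * The single overlap test.  If dst lies inside [src, src + len),
	 * a forward copy would overwrite source bytes before reading
	 * them.  The unsigned subtraction wraps when dst < src, so one
	 * comparison also covers that (safe) case.
	 */
	if ((uintptr_t)d - (uintptr_t)s >= len) {
		/* No harmful overlap: copy forwards. */
		for (i = 0; i < len; i++)
			d[i] = s[i];
	} else {
		/* dst overlaps the tail of src: copy backwards. */
		for (i = len; i > 0; i--)
			d[i - 1] = s[i - 1];
	}
	return (dst);
}
```

Everything apart from the one comparison is the same work a
memcpy()-style copy would do anyway.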

> However, as its optimization has moved into
> the compiler I'm guessing that's changed. It's that change of heart I think
> that is taking people by surprise.

I blame micro-benchmarks.  Amdahl's law applies and gives a limit of about
1% for the possible improvements from optimizing bcopy(), except in
micro-benchmarks.  That is even though the kernel spends a relatively
large amount of time in bcopy().  Userland might take 80% of the time,
the kernel 20%, and bcopy() 10% of the 20% = 2%.  After optimizing bcopy()
to be twice as fast (which is difficult), you have sped up applications
by 1% at most.
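
The arithmetic above can be checked with a small C helper.
amdahl_speedup() is a name chosen here for illustration; the 20% and
10% figures are the example numbers from the paragraph above, not
measurements.

```c
/*
 * Amdahl's law: overall speedup when a fraction p of the total run
 * time is made s times faster and the rest is unchanged.
 */
static double
amdahl_speedup(double p, double s)
{
	return (1.0 / ((1.0 - p) + p / s));
}
```

With bcopy() at 10% of a 20% kernel share, p = 0.20 * 0.10 = 0.02, and
amdahl_speedup(0.02, 2.0) is 1 / 0.99, or about 1.01: roughly a 1%
gain for the whole application.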

Bruce

