svn commit: r334702 - head/sys/sys

Jonathan T. Looney jtl at freebsd.org
Thu Jun 7 03:00:00 UTC 2018


On Wed, Jun 6, 2018 at 10:14 PM, Ravi Pokala <rpokala at freebsd.org> wrote:
>
> -----Original Message-----
> From: <owner-src-committers at freebsd.org> on behalf of Mateusz Guzik <mjguzik at gmail.com>
> Date: 2018-06-06, Wednesday at 09:01
> To: Ravi Pokala <rpokala at freebsd.org>
> Cc: Mateusz Guzik <mjg at freebsd.org>, src-committers <src-committers at freebsd.org>, <svn-src-all at freebsd.org>, <svn-src-head at freebsd.org>
> Subject: Re: svn commit: r334702 - head/sys/sys
>
> > On Wed, Jun 6, 2018 at 1:35 PM, Ravi Pokala <rpokala at freebsd.org> wrote:
> >
> >>> + * Passing the flag down requires malloc to blindly zero the entire object.
> >>> + * In practice a lot of the zeroing can be avoided if most of the object
> >>> + * gets explicitly initialized after the allocation. Letting the compiler
> >>> + * zero in place gives it the opportunity to take advantage of this state.
> >>
> >> This part, I still don't understand. :-(
> >
> > The call to bzero() is still for the full length passed in, so how does this help?
> >
> > bzero is:
> > #define bzero(buf, len) __builtin_memset((buf), 0, (len))
>
> I'm afraid that doesn't answer my question; you're passing the full length to __builtin_memset() too.


I believe the theory is that the compiler (remember, this is
__builtin_memset) can optimize away portions of the zeroing, or can
optimize zeroing for small sizes.

For example, imagine you do this:

    struct foo {
        uint32_t a;
        uint32_t b;
    };

    struct foo *
    alloc_foo(void)
    {
        struct foo *rv;

        rv = malloc(sizeof(*rv), M_TMP, M_WAITOK|M_ZERO);
        rv->a = 1;
        rv->b = 2;
        return (rv);
    }

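When M_ZERO is passed down, the zeroing happens inside malloc(), which has
no idea what the caller does next. When the zeroing is done in place at the
call site, the compiler can, in theory, be smart enough to know that the
entire structure is initialized, so it is not necessary to zero it.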

(I personally have not tested how well this works in practice. However,
this change theoretically lets the compiler be smarter and optimize away
unneeded work.)

At minimum, it should let the compiler replace calls to memset() (and the
loops there) with optimal instructions to zero the exact amount of memory
that needs to be initialized. (Again, I haven't personally tested how smart
the compilers we use are about producing optimal code in this situation.)
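
To make the idea concrete, here is a minimal userland sketch of the general
technique. It is not the code from the commit. The names x_alloc,
x_alloc_inline_zero, X_WAITOK and X_ZERO are hypothetical stand-ins, and the
macro relies on GNU C extensions (statement expressions and
__builtin_constant_p) as provided by GCC and Clang. When the size and flags
are compile-time constants and the zero flag is set, the wrapper strips the
flag and zeroes at the call site, where the compiler can see the stores that
follow.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical flag values for this sketch; not the kernel's. */
#define X_WAITOK 0x1
#define X_ZERO   0x2

/*
 * Stand-in for an out-of-line allocator that honors the zero flag itself.
 * In a real kernel this would live in another file, so the caller's
 * compiler could not see what happens inside it.
 */
void *
x_alloc(size_t size, int flags)
{
	void *p;

	assert((flags & ~(X_WAITOK | X_ZERO)) == 0);
	p = malloc(size);
	if (p != NULL && (flags & X_ZERO))
		memset(p, 0, size);
	return (p);
}

/*
 * If size and flags are compile-time constants and X_ZERO is set, strip
 * the flag and zero inline with __builtin_memset instead. Note that the
 * arguments are evaluated more than once, so they must be free of side
 * effects.
 */
#define x_alloc_inline_zero(size, flags) ({				\
	void *_p;							\
	if (__builtin_constant_p(size) &&				\
	    __builtin_constant_p(flags) && ((flags) & X_ZERO)) {	\
		_p = x_alloc((size), (flags) & ~X_ZERO);		\
		if (_p != NULL)						\
			__builtin_memset(_p, 0, (size));		\
	} else {							\
		_p = x_alloc((size), (flags));				\
	}								\
	_p;								\
})

struct foo {
	uint32_t a;
	uint32_t b;
	uint32_t c;
};

struct foo *
alloc_foo(void)
{
	struct foo *rv;

	rv = x_alloc_inline_zero(sizeof(*rv), X_WAITOK | X_ZERO);
	if (rv == NULL)
		return (NULL);
	rv->a = 1;
	rv->b = 2;
	/* c is never assigned, so it must still read back as zero. */
	return (rv);
}
```

Whether the compiler actually drops the redundant stores to a and b depends
on the compiler and the optimization level. Comparing the assembly from
"cc -O2 -S" with and without the wrapper would be one way to check.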

Jonathan

