svn commit: r334702 - head/sys/sys

Ravi Pokala rpokala at freebsd.org
Thu Jun 7 02:14:17 UTC 2018


-----Original Message-----
From: <owner-src-committers at freebsd.org> on behalf of Mateusz Guzik <mjguzik at gmail.com>
Date: 2018-06-06, Wednesday at 09:01
To: Ravi Pokala <rpokala at freebsd.org>
Cc: Mateusz Guzik <mjg at freebsd.org>, src-committers <src-committers at freebsd.org>, <svn-src-all at freebsd.org>, <svn-src-head at freebsd.org>
Subject: Re: svn commit: r334702 - head/sys/sys

> On Wed, Jun 6, 2018 at 1:35 PM, Ravi Pokala <rpokala at freebsd.org> wrote:
> 
>>> + * Passing the flag down requires malloc to blindly zero the entire object.
>>> + * In practice a lot of the zeroing can be avoided if most of the object
>>> + * gets explicitly initialized after the allocation. Letting the compiler
>>> + * zero in place gives it the opportunity to take advantage of this state.
>>
>> This part, I still don't understand. :-(
> 
> The call to bzero() is still for the full length passed in, so how does this help?
> 
> bzero is:
> #define bzero(buf, len) __builtin_memset((buf), 0, (len))

I'm afraid that doesn't answer my question; you're passing the full length to __builtin_memset() too.

>>> ...
>>> + *   _malloc_item = malloc(_size, type, (flags) &~ M_ZERO);
>>> + *   if (((flags) & M_WAITOK) != 0 || _malloc_item != NULL)
>>> + *           bzero(_malloc_item, _size);
>>> + *
>>> + * If the flag is set, the compiler knows the left side is always true,
>>> + * therefore the entire statement is true and the callsite is:
>> 
>> I think you mean "... the *right* side is always true ...", since the left side is the check for the flag being set. "If the flag is set, compiler knows (the check for the flag being set) is always true" is tautological.
> 
> It explains how __builtin_constant_p(flags) being true allows the compiler
> to optimize out the flags-based check.

The __builtin_constant_p()s are in the conditional *before* the one I'm asking about. The test for M_WAITOK on the left being true, will cause the NULL-check on the right to be skipped because of the short-circuit semantics of binary ||. But because M_WAITOK is set, the NULL-check will pass anyway... Ah, and that's what you meant by "therefore the entire statement is true".

I think the wording was throwing me; it might be clearer English to say something like "If the flag is set -- and we only get here if that can be determined at compile-time, because of __builtin_constant_p() -- then the entire statement is true. This skips the NULL-check, but it will always pass if the flag is set anyway."

> I don't understand why this particular use runs into so much confusion.
> Just above it there is a M_ZERO check relying on the same property and
> receiving no attention.

In that context, it's clearer what's the condition is:
- "size" must be constant at compile-time
- "flags" must be constant at compile-time
- "flags" must have M_ZERO set

>>> ...
>>> + * If the flag is not set, the compiler knows the left size is always false
>>> + * and the NULL check is needed, therefore the callsite is:
>> 
>> Same issue here.

And same answer here too.

>>> ...
>>>  #ifdef _KERNEL
>>>  #define      malloc(size, type, flags) ({                                    \
>> 
>> Now that I'm taking another look at this, I'm confused as to why the entire macro expansion is inside parentheses? (The braces make sense, since this is a block with local variables which need to be contained.)
> 
> It is to return the value (the last expression).

Yeah, Ben / Bruce / Conrad clarified that.

>>>       void *_malloc_item;                                             \
>>> @@ -193,7 +228,8 @@ void      *malloc(size_t size, struct malloc_type *type, in
>>>       if (__builtin_constant_p(size) && __builtin_constant_p(flags) &&\
>>>           ((flags) & M_ZERO) != 0) {                                  \
>>>               _malloc_item = malloc(_size, type, (flags) &~ M_ZERO);  \
>>> -             if (((flags) & M_WAITOK) != 0 || _malloc_item != NULL)  \
>>> +             if (((flags) & M_WAITOK) != 0 ||                        \
>>> +                 __predict_true(_malloc_item != NULL))               \
>>>                       bzero(_malloc_item, _size);                     \
>>>       } else {                                                        \
>>>               _malloc_item = malloc(_size, type, flags);              \
>> 
>> This confuses me too. If the constant-size/constant-flags/M_ZERO-is-set test fails, then it falls down to calling malloc(). Which we are in the middle of defining. So what does that expand to?
> 
> Expansion is not recursive, so this is an actual call to malloc. 

Ah, right. I swear I knew that at some point. :-)

> -- 
> Mateusz Guzik <mjguzik gmail.com <http://gmail.com>>

Thanks Mateusz,

-Ravi (rpokala@)




More information about the svn-src-head mailing list