svn commit: r316311 - in head: lib/libstand sys/boot/geli sys/boot/i386/gptboot sys/boot/i386/loader sys/boot/i386/zfsboot

Sat Apr 1 05:49:53 UTC 2017

On Fri, 31 Mar 2017, Brooks Davis wrote:

> On Fri, Mar 31, 2017 at 11:29:20AM -0700, John Baldwin wrote:
>> On Friday, March 31, 2017 09:04:51 AM Peter Grehan wrote:
>>>> So... can anyone provide a clue what's "explicit" (or different in any
>>>> way) between explicit_bzero() and normal bzero()?
>>>
>>>
>>> https://www.freebsd.org/cgi/man.cgi?query=explicit_bzero&sektion=3&manpath=FreeBSD+12-current
>>
>> It should be called 'bzero_now_I_mean_it()'
>>
>> (but then we would need some other function called anybody_want_a_peanut())
>
> It's sole purpose is to prevent the compiler from observing a pattern
> like:
>
> 	char a_secret_key[len];
> 	...
> 	bzero(a_secret_key, len);
> 	return;
>
> or
>
> 	char *a_secret_key = malloc(len);
> 	...
> 	bzero(a_secret_key, len);
> 	free(a_secret_key);
>
> And optimizing away bzero() because it knows what bzero() does and that
> nothing will ever access it as far as the C language is concerned..

Only broken compilers know what bzero() does.  It is not a Standard C
function, and libstand doesn't implement Standard C.

Most boot code is compiled with -ffreestanding to prevent similar but
valid optimizations of Standard C functions, but geli isn't.

Some versions of POSIX have bzero() under certain feature test macros,
and a compiler for such versions of POSIX could optimize bzero() if
it also understands the feature test macros.  I doubt that any compiler
understands this.  In FreeBSD, the feature test macro condition for
bzero() in <strings.h> is __BSD_VISIBLE__ || POSIX_VISIBLE <= 200112.
The compiler could reasonably optimize calls to bzero() if this condition
is satisfied in the caller's compilation unit.

bzero() (and most other functions) is not properly declared in <stand.h>.
<stand.h> just includes <string.h> and many other standard headers and
gets massive namespace pollution for unimplemented things and bogus
visibility ifdefs for things like bzero() that it does implement.
bzero() is standard in libstand, but it is not obviously the standard
one.

<stand.h> handles some functions better.  It declares getchar() and doesn't
include <stdio.h>.
   (All of its function declararions have style bugs.  Functions are
   declared as extern, as was necessary to support versions of K&R older
   than the one that BSD pretended to support 25 years ago.  Many of
   its declarations have bugs.  Many have named parameters in the
   application namespace.  Some have a style bug instead of this bug
   -- they don't name the parameters.  None use the technically correct
   method of naming parameters with underscores.)
It only has the malloc() family of functions as macros which expand to
private functions like Malloc().

> The moment you enable LTO all bets are off because it can pattern match
> the code for explicit_bzero(), realize that it is that same as bzero()
> and combine them.  Declaring a_secret_key volatile likely makes things
> work, but the C language is deficient in not providing a way to express
> something like explicit_bzero() sanely and reliable.

That is an invalid optimization even for bzero().  I doubt that LTO is
"smart" enough to even read the code of bzero() and reverse engineer this
to get its implementation in C.  bzero() is nonstandard so nothing can
be inferred from its name.  It might be a private version that takes
pointers to volatiles.  Then C semantics don't allow removing the memory
accesses.  LTO can determine if the pointers are volatiles given sufficient
annotations.  Debugging annotations would give some volatile types, but
I think it is insuccient to attach the types to variables.  But determining
the absence of volatile types is not enough for the optimization.  The
compiler has merely reverse-engineered the current implmentation, and
needs more annotations to determine that the absense of volatile types is
not just an implementation detail.

The compiler could read the documentation of libstand.  Unfortately, if
it does that it can know that its optimization of bzero() is valid
subject to the complications with the feature tests.  libstand(3) just
says that "String functions are available as documented in string(3)".
So the functions must be the standard ones for the applicable standard.
The compiler next has to read and understand string(3) better than its
authors.  Of course, it doesn't document the feature test macros.

bzero() is actually documented in bzero(3).  Of course, the feature test
macros aren't documented there either.  It is documented that POSIX
removed bzero() in 2008 and it is implicit that FreeBSD still has it.

explicit_bzero() is much the same as bzero(), except if the compiler
follows the chain of optimization down to bzero(3), and understand it,
then it will see that explicit_bzero() is a little different.
explicit_bzero()'s API is broken as designed, since it doesn't have
volatiles in it.  The memory accesses are magic, and volatile is
the only way to express this.  The difference between explicit_bzero()
and bzero() comes down to the clause in bzero() that disallows
compilers running their dead store optimization pass on it:

      The explicit_bzero() variant behaves the same, but will not be removed by
      a compiler's dead store optimization pass, making it useful for clearing
      sensitive memory such as a password.

This says that compilers can and should treat explicit_bzero() the same as
bzero(), except they are not allowed to remove it via their dead store
optimization pass.  They are still allowed to remove it via any other
type of optimization pass, provided they read and understand the
documentation up to this point and know that they can apply their
optimizations to bzero() because it is standard in libstand and libstand
defere to other standards (or just the bzero(3) man page) where it is
the Standard one known to the compiler (or close enough according to
its documentation).

It is still a hack to use bzero() for its side effects.  I just don't
like explicit_bzero()'s API, starting with its name.  "explicit" is
verbose and gives little information about what it does.  "byte" is
already abbreviated to "b" to avoid a verbose name.  I might use
vbzero() (volatile vbzero()), taking a volatile arg.  This prevents
all compiler optimizations except invalid LTO, and has clearer semantics
and other uses.  Strict volatile semantics probably requires doing all
the accesses 1 byte at a time and certainly in memory order, which is
not always what is wanted, so a more general vbzero() would look like
bus_space on memory-mapped devices.  Bus space has bad verbose names too.

Bruce