svn commit: r367813 - head/lib/libutil

Stefan Esser se at freebsd.org
Wed Nov 18 23:05:53 UTC 2020


Am 18.11.20 um 23:14 schrieb Jessica Clarke:
> On 18 Nov 2020, at 21:52, Stefan Esser <se at freebsd.org> wrote:
>> Am 18.11.20 um 22:15 schrieb Jessica Clarke:
>>> On 18 Nov 2020, at 19:44, Stefan Eßer <se at freebsd.org> wrote:
>>>> +		/*
>>>> +		 * Check for some other thread already having
>>>> +		 * set localbase - this should use atomic ops.
>>>> +		 * The amount of memory allocated above may leak,
>>>> +		 * if a parallel update in another thread is not
>>>> +		 * detected and the non-NULL pointer is overwritten.
>>>> +		 */
>>> Why was this committed with a known racy/leaky implementation?
>>
>> Because the alternatives that I offered for discussion were
>> less acceptable.
> 
> That has no bearing over whether this one is.

Three alternatives have been discussed:

1) static buffer of size MAXPATHLEN
2) dynamically allocated buffer (as committed)
3) dynamically allocated buffer filled by a constructor

1) has been rejected, since it adds 1 KB of bss to each program
that is linked with libutil, whether it uses getlocalbase() or
not.

3) has been rejected since it causes 1 getenv() and 2 sysctl()
calls when the program linked with libutil starts, independently
of whether getlocalbase() is used or not.

2) has been selected, since it is only called when needed and it
does not pre-allocate a large buffer.

Which other alternative do you want to suggest?

Did you have a look at the review announced on the -current list
and mentioned in the commit?

>>> What happens if I set the value with a sysctl and call this?
>>
>> You'll get the value set with sysctl, unless overridden by the
>> environment variable. There is a window of a few nano-seconds
>> where a thread executing in parallel on another core might be
>> able to set the localbase variable (between the test for NULL
>> in this function and the assignment to it). The value that will
>> be returned by either thread will be identical, so there is no
>> risk of corruption of the result.
> 
> But if I call getlocalbase, then set it via sysctl, then call
> getlocalbase again, I see the old value. If, however, I omit the first
> getlocalbase, I see the new value. This differs from how getenv/setenv
> of the value work, where you always see the up-to-date value. Maybe you
> think that's a feature, but it's something to watch out for and
> explicitly call out in the documentation.

Actual programs call getlocalbase() exactly once and create the
required pathes from the value returned.

It is possible to copy the value from the environment to a buffer,
but at added complexity and introducing another race condition.

The program itself might have modified its environment and then
use an inconsistent value when it calls getlocalbase() again
thereafter. But why would you want to do this - it seems to be
a complicated way lof foot-shooting to me.

I'm not a native speaker of English and not best qualified to
clearly express this in the man-page. Feel free to commit any
clarification or suggest an addition for me to commit.

> You also misunderstand all the subtleties of multithreading here. There
> are no acquire/release pairs so it is entirely legal for Thread 2 to
> read Thread 1's initialised value for localbase before the contents of
> it are visible (i.e. the pointer is initialised but the data is
> garbage).

Yes, and thread 2's value will be identical to the one from
thread 1, just in a different heap location, unless there is a
write to the sysctl variable in the kernel at just the same
time. And you cannot protect against this race and you'll get
either the old or new value.

> The `(volatile const char*)localbase` cast is also a complete waste of
> time. You probably meant to write `(const char * volatile)localbase`
> but even then that does nothing useful as the cast is too late. What
> you really were trying to write was
> `*(const char * volatile *)&localbase`, but you need proper atomics
> anyway for this to be safe.

I was not sure about the correct volatile declaration, I've got
to admit. It was in the review and I did not get any comments
regarding that specific modifier. The goal was to enforce the
access to the localbase pointer in memory after returning from
the sysctl to shorten the window.

Perhaps I should just have completely ignore any multi-threading
issues and accepted that an overlapping execution of this function
would allocate multiple sysctl destination buffers but loose all
references but one in the assignment to localbase.

It will not happen in practice, it just does not make sense to
call this function more than once in a program.

>> This unlikely case may actually leak a heap allocated string
>> of typically tens of bytes (but with negligible probability).
>>
>> But this really is a non-issue, since there should never be a
>> reason to invoke this function in a multi-threaded context.
> 
> Why not? There could easily be code out there calling getenv in a
> multi-threaded context so this is inadequate as a replacement. Yes it's
> inefficient but it's perfectly legal and imaginable.

Yes, calling getenv() might occur, but getlocalbase() is generally
called before configuration files are accessed, and the resulting
path is saved in the program. Other uses are possible, but this is
the recurring pattern.

And unless the program modifies its own environment at run-time,
all invocsations of getlocalbase() are guaranteed to return the
same result again and again.

> Also if malloc returns NULL I would quite like that to be an error for
> the function and not silently fall back on _PATH_LOCALBASE.

The function is specifically specified to always succeed. Else
there is extra code required at each invocation to check for
errors. Not being able to allocate the required buffer of at
most 1 KB will cause every non-trivial program to fail anyway.

Returning the compiled in default value will direct the program
to access files in /usr/local, which is an administrator controlled
path, even on systems with non-default localbase.

Returning NULL is possible in that case, but the only ways a
program could deal with this case is to either crash, sue the
default path, or ignore the accesses to files in localbase.

In either case it will not operate as intended, but I consider
directing accesses to the FreeBSD default location (or the
modified _PATH_LOCALBASE) to be a reasonable fall-back.

But this will not happen unless provoked (by not allowing the
program to allocate any memory from the heap, which will cause
it to fail as soon as any other library function or start-up
code tries to allocate heap-space).

I had put this code up for review and it has been discussed
before (and another implementation has been reverted after
several iterations of changes).

If there had been consensus to return NULL, I could have
committed that - it is a trivial change and ENOMEM would have
been set after a malloc failure.

I have written about this possibility and I had appreciated
if comments had been made on Phabricator before the commit.

Regards, STefan

-------------- next part --------------
A non-text attachment was scrubbed...
Name: OpenPGP_signature
Type: application/pgp-signature
Size: 495 bytes
Desc: OpenPGP digital signature
URL: <http://lists.freebsd.org/pipermail/svn-src-head/attachments/20201119/c54363de/attachment.sig>


More information about the svn-src-head mailing list