svn commit: r327495 - head/usr.sbin/rpcbind

Sat Feb 3 19:31:30 UTC 2018

On Sat, Feb 3, 2018 at 6:46 AM, Bruce Evans <brde at optusnet.com.au> wrote:
> On Tue, 2 Jan 2018, Conrad Meyer wrote:
>> ...
> Today I learned from the POSIX list that the pselect() function is designed
> to avoid this bug.  It takes a signal mask arg (like msleep() in the
> kernel).
>
> I haven't noticed any function related to poll() or kevent that takes a
> signal mask.

There is the similar function ppoll(), although it comes from Linux
rather than from POSIX.
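
For reference, a minimal sketch of the pselect() pattern Bruce describes:
the signal stays blocked everywhere except inside the wait itself, so it
cannot slip in between checking a flag and going to sleep.  The names
block_and_handle(), wait_fd_or_signal() and got_signal are made up for
illustration; none of this is rpcbind code.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <stddef.h>
#include <sys/select.h>
#include <unistd.h>     /* pipe()/write() for the usage described below */

static volatile sig_atomic_t got_signal;

static void
on_signal(int signo)
{
    got_signal = signo;     /* only sets a flag: no stdio, no exit() */
}

/*
 * Block signo and install the handler.  On success, *waitmask is the mask
 * to hand to pselect(): the previous mask with signo guaranteed unblocked.
 */
int
block_and_handle(int signo, sigset_t *waitmask)
{
    struct sigaction sa;
    sigset_t block;

    sa.sa_handler = on_signal;
    sa.sa_flags = 0;        /* no SA_RESTART: the wait should see EINTR */
    sigemptyset(&sa.sa_mask);
    if (sigaction(signo, &sa, NULL) == -1)
        return (-1);

    sigemptyset(&block);
    sigaddset(&block, signo);
    if (sigprocmask(SIG_BLOCK, &block, waitmask) == -1)
        return (-1);
    return (sigdelset(waitmask, signo));
}

/*
 * Wait until fd is readable or a signal arrives.  Returns 1 if readable,
 * 0 if interrupted by a signal, -1 on any other error.
 */
int
wait_fd_or_signal(int fd, const sigset_t *waitmask)
{
    fd_set rfds;
    int n;

    FD_ZERO(&rfds);
    FD_SET(fd, &rfds);
    /* The mask swap and the wait happen atomically inside pselect(). */
    n = pselect(fd + 1, &rfds, NULL, NULL, NULL, waitmask);
    if (n == -1)
        return (errno == EINTR ? 0 : -1);
    return (n > 0 ? 1 : -1);
}
```

A caller would create its descriptors (a pipe, say), call
block_and_handle(SIGTERM, &mask) once, and then loop on
wait_fd_or_signal(), checking got_signal whenever it returns 0.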

With kevent, you can simply mask all (or most) signals and watch for
them with an EVFILT_SIGNAL event.  Conversion to kevent is more
complicated, though.

> Programs broken by buggy conversion:
> ...
> - rpcbind(8).  This is not interactive and normally doesn't use ttys
>   which might block.  However, the -d flag makes it do fprintf() to
>   stderr.  This may block forever (e.g., for flow control), and you
>   would like to be able to kill it using a signal.  But write() will
>   restart.  rpcbind also uses plain signal() and doesn't know anything
>   about SA_RESTART.

This was not broken by conversion -- it was already broken in this
case.  If the signal delivery raced with an fprintf, we ended up
invoking the stdio atexit handlers via the exit(3) call in terminate(),
which of course encountered corrupted state.  Now it is broken in a
slightly different way, yes, if signal delivery races fprintf *and*
fprintf is blocked in flow control.  This might happen with a slow
serial line for stderr but seems extraordinarily unlikely otherwise.

> ...
> But this is too hard for most programs.  I think it is best to try to write
> safe signal handlers.  Unfortunately, APIs support this negatively.

Agreed.  The signal abstraction is pretty awful.  I think the safest
way to handle them is to block them entirely, then watch using
kqueue()/kevent().  That way you never have to deal with signal
context.  But that kind of conversion is more work.  You also have to
deal with EINTR or be ok with blocking signal handling indefinitely.
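
A rough sketch of that block-then-watch approach, assuming a BSD or macOS
system (kqueue does not exist elsewhere, hence the guard).  The names
open_signal_queue(), wait_for_signal() and HAVE_KQUEUE are hypothetical,
and this is not what rpcbind does today.

```c
#include <errno.h>
#include <signal.h>
#include <stddef.h>
#include <unistd.h>

#if defined(__FreeBSD__) || defined(__NetBSD__) || defined(__OpenBSD__) || \
    defined(__DragonFly__) || defined(__APPLE__)
#define HAVE_KQUEUE 1   /* hypothetical guard macro for this sketch */
#include <sys/types.h>
#include <sys/event.h>
#include <sys/time.h>

/*
 * Block signo so no handler ever runs, then register it with a new kqueue.
 * Returns the kqueue descriptor, or -1 on error.
 */
int
open_signal_queue(int signo)
{
    struct kevent kev;
    sigset_t block;
    int kq;

    sigemptyset(&block);
    sigaddset(&block, signo);
    if (sigprocmask(SIG_BLOCK, &block, NULL) == -1)
        return (-1);

    if ((kq = kqueue()) == -1)
        return (-1);
    EV_SET(&kev, signo, EVFILT_SIGNAL, EV_ADD, 0, 0, NULL);
    if (kevent(kq, &kev, 1, NULL, 0, NULL) == -1) {
        close(kq);
        return (-1);
    }
    return (kq);
}

/*
 * Sleep until a registered signal has been posted.  Returns its number,
 * or -1 on error.  Runs in ordinary context, so stdio and exit() are fine
 * in the caller afterwards.
 */
int
wait_for_signal(int kq)
{
    struct kevent kev;
    int n;

    do {
        n = kevent(kq, NULL, 0, &kev, 1, NULL);
    } while (n == -1 && errno == EINTR);    /* other, unblocked signals */
    return (n == 1 ? (int)kev.ident : -1);
}
#endif /* kqueue platforms */
```

In a real daemon the same kqueue would also carry the socket events, so
the main loop has a single wait point and termination is just another
event.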

> perror()
> is broken as designed since it uses stdio, so it is unsafe in signal
> handlers.  The err() family copies this bug.  Even *s*printf() is not
> required to be safe in signal handlers.  I would fix the latter first.

It does seem like the printf family of routines should be
signal-handler safe.  Unfortunately, they are implemented in terms of
the unsafe stdio streams :-(.

> Then
> try to change the API of warn() and warnx() to be safe.  err() can't be
> changed since it has to keep calling exit(), but it is easy to use the safe
> warn() followed by _exit() so that stdio is avoided, and important to know
> that it is avoided.

That sounds nice.  I'm on board with that.
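
To make the idea concrete, here is one possible shape for a stdio-free
warnx() followed by _exit().  This is only a sketch: safe_warnx_fd() and
die_from_handler() are invented names, not an existing or proposed libc
API, and the message is silently truncated to fit a fixed buffer.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/* Append src to dst at *pos, never writing past cap bytes. */
static void
append(char *dst, size_t cap, size_t *pos, const char *src)
{
    while (*src != '\0' && *pos < cap)
        dst[(*pos)++] = *src++;
}

/*
 * Write "prog: msg\n" to fd with a single write(2).  Uses no stdio and
 * no malloc, and preserves errno, so it can be called from a signal
 * handler.  Returns the result of write().
 */
ssize_t
safe_warnx_fd(int fd, const char *prog, const char *msg)
{
    char buf[256];
    size_t len = 0;
    int saved_errno = errno;
    ssize_t wr;

    /* Reserve the last byte for the trailing newline. */
    append(buf, sizeof(buf) - 1, &len, prog);
    append(buf, sizeof(buf) - 1, &len, ": ");
    append(buf, sizeof(buf) - 1, &len, msg);
    buf[len++] = '\n';

    wr = write(fd, buf, len);
    errno = saved_errno;
    return (wr);
}

/* Warn on stderr, then leave without running atexit or stdio handlers. */
void
die_from_handler(const char *prog, const char *msg, int status)
{
    (void)safe_warnx_fd(STDERR_FILENO, prog, msg);
    _exit(status);
}
```

The cost is visible in the signature: there is no format string, because
formatting is exactly the part that is not guaranteed to be safe.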

>> ... Non-functional change highlighted ...
>
> I think this is too specialized and complicated.
>
> ...
>
> 1 millisecond is usually too small, but 1 second seems reasonable for
> most cases.  The timeout is only used after rarely-lost races unless it
> is small.

Feel free to change it to 1s yourself, if you think it is important.

>> Modified: head/usr.sbin/rpcbind/rpcbind.c
>>
>> ==============================================================================
>> --- head/usr.sbin/rpcbind/rpcbind.c     Tue Jan  2 16:50:57 2018 (r327494)
>> +++ head/usr.sbin/rpcbind/rpcbind.c     Tue Jan  2 17:25:13 2018 (r327495)
>> ...
>> @@ -761,8 +774,13 @@ rbllist_add(rpcprog_t prog, rpcvers_t vers, struct net
>> static void
>> terminate(int signum)
>> {
>> +       char c = '\0';
>> +       ssize_t wr;
>>
>>         doterminate = signum;
>> +       wr = write(terminate_wfd, &c, 1);
>
>
> Not async-signal-safe.  Accesses in signal handlers to objects with
> static storage duration give undefined behaviour except for assignment to
> objects of type volatile sig_atomic_t, but the access to terminate_wfd is
> a read and the type is plain int.

The type can be changed to volatile sig_atomic_t if you believe plain
int will trigger nonsensical compiler behavior.  The value is
initialized once, before the signal handler is registered, so unless
the compiler does something totally insane it should be fine on all
architectures FreeBSD runs on.

>> +       if (wr < 1)
>> +               _exit(2);
>
>
> Best to not check for write errors, since the error handling of using
> _exit() is worse than none.  It loses stdio flushing to handle an
> almost-harmless error.  The main problem with keeping everything in a safe
> handler is that it is impossible to keep stdio flushing there and we would
> prefer to not lose the stdio flushing.

I don't necessarily agree.  If the write fails, we missed the signal
telling us to terminate the program and will never exit.  That said,
how would the write ever fail?
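
For anyone following along without the full diff, the pattern under
discussion looks roughly like this.  terminate(), doterminate and
terminate_wfd are from the commit; terminate_rfd,
install_terminate_handler() and wait_for_terminate() are names assumed
here for illustration, and the sketch uses the volatile sig_atomic_t type
mentioned above rather than plain int.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <poll.h>
#include <signal.h>
#include <stddef.h>
#include <unistd.h>

static volatile sig_atomic_t doterminate;
static volatile sig_atomic_t terminate_wfd = -1;
static int terminate_rfd = -1;      /* assumed name for the read end */

static void
terminate(int signum)
{
    char c = '\0';
    int saved_errno = errno;
    ssize_t wr;

    doterminate = signum;
    /* Wake the main loop; write(2) and _exit(2) are async-signal-safe. */
    wr = write((int)terminate_wfd, &c, 1);
    if (wr < 1)
        _exit(2);
    errno = saved_errno;
}

/* Create the pipe first, then register the handler.  Returns 0 or -1. */
int
install_terminate_handler(int signo)
{
    struct sigaction sa;
    int fds[2];

    if (pipe(fds) == -1)
        return (-1);
    terminate_rfd = fds[0];
    terminate_wfd = fds[1];     /* set before the handler can ever run */

    sa.sa_handler = terminate;
    sa.sa_flags = 0;
    sigemptyset(&sa.sa_mask);
    return (sigaction(signo, &sa, NULL));
}

/*
 * Main-loop side: sleep in poll() until the handler has fired, then
 * return the signal number.  Cleanup and exit(3) happen in the caller,
 * in normal context.
 */
int
wait_for_terminate(void)
{
    struct pollfd pfd;

    pfd.fd = terminate_rfd;
    pfd.events = POLLIN;
    while (doterminate == 0) {
        if (poll(&pfd, 1, -1) == -1 && errno != EINTR)
            return (-1);
    }
    return ((int)doterminate);
}
```

The byte in the pipe is what closes the race: if the signal lands just
before poll(), the read end is already readable and poll() returns at
once instead of sleeping.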

Conrad