cvs commit: src/lib/libc/gen syslog.c

Sun Oct 10 03:16:16 PDT 2004

On Sun, Oct 10, 2004 at 09:22:06AM +0200, Andrea Campi wrote:
A> I agree the desired degree of reliability of logging is
A> application-dependent, and that is sort of my point; I don't think
A> however that using a timeout is a good solution. Our syslog
A> implementation is thread-safe; forcing all applications to use timeouts
A> (inherently a nightmare in multithreaded apps) is IMHO silly.
A> 
A> In addition, this change accomplishes nothing. We try very hard to send
A> out a UDP packet that might well be lost in transit, since syslog is as
A> unreliable as UDP. As to why the UDP packet might be lost in transit or
A> why the remote syslog may lose packets I have no answer, but IMHO it's
A> not relevant: what IS relevant is whether syslog() can make any
A> guarantee of delivery.

Forget about UDP. syslog(3) logs to the local syslogd. The latter may
forward messages to another machine via UDP, but that is outside the
scope of this discussion.

A> Note Brian that you ask "why would syslogd lose messages if we're
A> getting ENOBUFS". What I ask is: why loop and try so hard locally when we
A> don't even know if anybody is listening? That is, the two failure cases
A> need not be simultaneous or related; but unless we close both, we can't
A> guarantee syslog() is reliable.

If no one is listening on the local socket, we would get ENOTCONN. Note
that syslog(3) has nothing to do with UDP.
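
To make the ENOTCONN case concrete, here is a minimal sketch. It is not
code from syslog.c: the helper name unconnected_send_errno is made up
for illustration, and it only covers the simplest situation, a local
datagram socket that was never connected to a peer. On FreeBSD and
Linux I would expect ENOTCONN here; other systems may report
EDESTADDRREQ instead.

```c
#include <errno.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Hypothetical helper: send one byte on a local datagram socket that
 * has no peer and return the resulting errno (0 if the send somehow
 * succeeded, or the socket() errno if the socket could not be created).
 */
static int
unconnected_send_errno(void)
{
	int fd, saved;

	fd = socket(AF_UNIX, SOCK_DGRAM, 0);
	if (fd < 0)
		return (errno);

	/* No connect() was done, so there is nobody to deliver to. */
	if (send(fd, "x", 1, 0) >= 0)
		saved = 0;
	else
		saved = errno;	/* save before close() can change it */

	close(fd);
	return (saved);
}
```

The point is only that "nobody is there" is reported to the sender
immediately as an error, and is a different condition from ENOBUFS.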

A> The scenario I'm thinking of that would be mostly affected by this is
A> admittedly complicated:
A> 
A>  - attacker needs a way to cause ENOBUFS condition on the target
A>    machine.
A>  - target machine runs an unrelated service; let's say for discussion's
A>    sake that it's not network-bound (e.g. an application server or a
A>    service gateway). To make things worse, let's say the app uses a
A>    multiprocess and multithread model.
A> 
A> In this scenario, as soon as the application needs to log anything,
A> threads start to get stuck in syslog() calls--maybe for milliseconds,
A> maybe for seconds, maybe forever. Those threads could be holding
A> application-level resources (say, a mutex), thus slowing down or
A> potentially blocking forever other threads.

1. Not forever.
2. It is a design error if a logging thread holds a mutex that can stop
the whole application.
3. If /var/run/log is overflowing, your machine is already slowed down
by the syslogd process and its I/O. Your application is already not
performing at its best.

Better to have consistent logs later for investigating that DoS. An
attacker may trigger the overflow intentionally to hide messages that
would have been logged had syslogd not been overflowed.

A> Gleb, you mentioned that a DoS'ed syslogd would mean an overloaded machine,
A> but the ENOBUFS machine is the client. Are you sure that an ENOBUFS case
A> must imply the client is so swamped with CPU and IO time that the above
A> scenario doesn't actually make things worse?

Again, the client and the server are the same machine. syslog(3) uses
only a local socket.

A> Above all however, how can you say "not forever"? What kind of guarantee
A> do you see that the application will eventually succeed in its send() call?
A> Sure, statistically it will succeed, but that is not good enough.

It will wait until syslogd has processed the other messages in the queue.

A> Note that I'm not advocating that "since it can fail, don't bother
A> retrying". The concept of trying again is fine--my only gripe is with
A> retrying an unbounded number of times.

In other words: "I suggest we leave open the possibility of losing
messages. Make it harder to DoS logging, but still possible."
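
To make the two positions concrete, here is a rough sketch of the retry
logic under discussion. It is not the code from the commit: the names
send_retry, send_fn and fake_send are hypothetical, and the transport
is passed in as a function pointer so the loop can be exercised without
a real socket. With max_tries == 0 it retries on ENOBUFS without limit;
with max_tries > 0 it gives up after that many attempts and the message
is lost.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <time.h>

/* Hypothetical send hook; stands in for send(2) on the log socket. */
typedef ssize_t (*send_fn)(const void *buf, size_t len, void *ctx);

/*
 * Retry while the send fails with ENOBUFS.
 * max_tries == 0: retry without limit.
 * max_tries  > 0: give up after that many failed attempts.
 * Returns 0 on success, -1 with errno set by the hook on failure.
 */
static int
send_retry(send_fn fn, void *ctx, const void *buf, size_t len,
    unsigned max_tries)
{
	const struct timespec pause = { 0, 1000 };	/* 1 microsecond */
	unsigned tries = 0;

	for (;;) {
		if (fn(buf, len, ctx) >= 0)
			return (0);
		if (errno != ENOBUFS)
			return (-1);	/* other errors are not retried here */
		tries++;
		if (max_tries != 0 && tries >= max_tries)
			return (-1);	/* bounded variant: message dropped */
		nanosleep(&pause, NULL);	/* let the receiver drain its queue */
	}
}

/* Fake transport for testing: fails with ENOBUFS *ctx times, then succeeds. */
static ssize_t
fake_send(const void *buf, size_t len, void *ctx)
{
	int *full = ctx;

	(void)buf;
	if (*full > 0) {
		(*full)--;
		errno = ENOBUFS;
		return (-1);
	}
	return ((ssize_t)len);
}
```

With the fake transport, the unbounded call returns only once the queue
has drained, while the bounded call returns -1 with errno still set to
ENOBUFS as soon as the limit is reached.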

A> What this all boils down to is:
A> 
A>  - syslog is an unreliable protocol by definition (see RFC3164 6.4);

Yes, the remote syslog protocol is. But not the syslog(3) API.

A>  - syslog() and family are unreliable functions (to the extent that they
A>    even return void);

The POSIX specification
  http://www.opengroup.org/onlinepubs/000095399/functions/syslog.html
does not say anything about reliability.

However, one can read the words "The syslog() function shall send a
message to an implementation-defined logging facility" as "the message
should be delivered to the local logging facility".

A>  - if the change stays, I think it should be documented in the syslog(3)
A>    man page;

Agreed.

A>  - I strongly object to MFC'ing it;
A>  - look into a better way to accomplish the goal.

To continue the argument, you need a test case showing that some test
application runs an order of magnitude slower, or even stops, when
an attacker floods syslogd.

-- 
Totus tuus, Glebius.
GLEBIUS-RIPN GLEB-RIPE