Causing a process switch to test a theory.

Steve Watt steve at Watt.COM
Sun Mar 20 22:55:36 PST 2005


In <423DE326.9000203 at digitalstratum.com>,
  matthew at digitalstratum.com wrote:
>I think I have found a possible bug in Apache's logging when using their 
>"reliable pipe" feature, but I'd like to test it prior to submitting a 
>bug report (or possibly a patch.)  Of course I posted a message on the 
>Apache development forum before posting here, but I have had no response 
>from that group.

And based on your description, I think I agree.

>My understanding of PIPE_BUF is that it is the largest amount of data 
>the kernel will guarantee to be atomic when writing to a pipe.  Thus if 
>more than one process is writing to the same pipe, and more than 
>PIPE_BUF bytes needs to be written, there is the chance of the data 
>being interleaved due to a context switch during write(), or between 
>multiple calls to write() in order to write all required data.

Yes, that is exactly the POSIX semantic for PIPE_BUF: a write of
PIPE_BUF bytes or fewer is atomic, and anything larger may be
interleaved with writes from other processes.  There are a lot of
tricky details in there that are not fully obvious, and it's been
long enough (6 years) that I've forgotten the specifics.  There are
some weird interactions between O_NONBLOCK and PIPE_BUF, but it looks
like some of them have been ironed out in recent versions of the
standard.

>I've been reading the Apache source code to try and determine if 
>PIPE_BUF is taken into consideration while logging entries to a pipe.  
>What appears to happen is that if a single log entry is more than 512 
>bytes, it is simply written to the pipe without regard to PIPE_BUF.

Then there is definitely a risk of interleaving.  This is basically a
race condition -- if you're lucky the log reader can cope, but that
depends greatly on the guts of the logger.

>In this situation (each child logging its own entries) it seems there 
>is the possibility that a child could be preempted during its call to 
>write() when trying to write more than PIPE_BUF bytes of log data.  What 
>I'd like to do is create a test where I would be making requests to 
>Apache that would cause log entries longer than PIPE_BUF in length, then 
>be able to show the interleaving of log entries due to the PIPE_BUF 
>limit being exceeded.

I would guess that the easiest way to run into this is to have lots of
processes write blocks larger than PIPE_BUF to the same pipe at the
same time.  (Are they all really writing to the exact same pipe?  If
not, no problem!)  An SMP box might be able to tickle this one better.

>Under the conditions such that cls->log_fd is a pipe (inherited from the 
>parent), len > PIPE_BUF, and there are multiple child processes all 
>logging entries with this code.

Assuming they're all writing to the same log_fd, then you might have
a problem.

>Knowing if Apache could possibly write interleaved logs when writing to 
>a pipe is critical to a program I'm developing which receives log 
>entries from Apache via a pipe.

That's another layer of indirection, though.  If all of the children
have separate pipes to the parent, and then the parent logs to your
program, all should be fine.

But at the kernel level, yes, writes longer than PIPE_BUF might get
interleaved.  The longer the write, the higher the probability, so
for your test, if you can generate, say, 10 KB writes over and over
from several processes at once, you can probably trip it.

-- 
Steve Watt KD6GGD  PP-ASEL-IA          ICBM: 121W 56' 57.8" / 37N 20' 14.9"
 Internet: steve @ Watt.COM                         Whois: SW32
   Free time?  There's no such thing.  It just comes in varying prices...


More information about the freebsd-hackers mailing list