sleep(3) sometimes too sleepy on FreeBSD 8.0?

Kostik Belousov kostikbel at gmail.com
Wed Feb 24 16:38:23 UTC 2010


On Wed, Feb 24, 2010 at 11:41:01PM +1100, John Marshall wrote:
> On Wed, 24 Feb 2010, 14:20 +0200, Kostik Belousov wrote:
> > On Wed, Feb 24, 2010 at 03:44:41AM -0800, Jeremy Chadwick wrote:
> > > On Wed, Feb 24, 2010 at 01:21:39PM +0200, Kostik Belousov wrote:
> > > > On Wed, Feb 24, 2010 at 06:53:59PM +1100, Peter Jeremy wrote:
> > > > > Updates following some off-line discussions and debugging with John on
> > > > > IRC.  I've cc'd gshapiro@ because the problem appears to be sendmail,
> > > > > rather than the FreeBSD kernel.
> > > > > 
> > > > > On 2010-Feb-23 12:35:22 +1100, John Marshall <john.marshall at riverwillow.com.au> wrote:
> > > > > >Environment: sendmail 8.14.4 on FreeBSD 8.0-RELEASE-p2
> > > > > 
> > > > > Note that this is stock ISC sendmail, not the sendmail in either the
> > > > > base system or the port.
> > > > > 
> > > > > >I posted about this in comp.mail.sendmail and was told...
> > > > > >
> > > > > >> sleep() should be one of these calls:
> > > > > >> 
> > > > > >>         if (njobs == 0 && WorkGrp[wgrp].wg_lowqintvl < MIN_SLEEP_TIME)
> > > > > >>                 sleep(MIN_SLEEP_TIME);
> > > > > >>         else if (WorkGrp[wgrp].wg_lowqintvl <= 0)
> > > > > >>                 sleep(QueueIntvl > 0 ? QueueIntvl : MIN_SLEEP_TIME);
> > > > > >>         else
> > > > > >>                 sleep(WorkGrp[wgrp].wg_lowqintvl);
> > > > > 
> > > > > Whilst it's true that the code calls sleep(), it's not calling
> > > > > sleep(3) in the FreeBSD libc.  Instead it's calling a sleep() defined
> > > > > in libsm/clock.c - which is a horrible maze of #ifdefs.
> > > > > 
> > > > > > John has pre-processed that code and the result is at:
> > > > > http://www.riverwillow.net.au/~john/sm/clock.preprocessed
> > > > > 
> > > > > At a quick look, the code is broken: sm_seteventm() generates a
> > > > > one-off timer using setitimer(2), which will send SIGALRM when it
> > > > > expires.  sm_releasesignal() then unblocks SIGALRM.  In theory, the
> > > > > SIGALRM could be delivered anywhere after the (!SmSleepDone) test and
> > > > > before pause() is called - in which case, the signal is lost and
> > > > > pause() will sleep forever.
> > > > > 
> > > > > On 2010-Feb-24 08:13:06 +1100, John Marshall <john.marshall at riverwillow.com.au> wrote:
> > > > > >My ktrace file was created with 'ktrace -g 48501'.  I have the result of
> > > > > >'kdump -R -p 48504' available at:
> > > > > >
> > > > > > <http://www.riverwillow.net.au/~john/8_0/rwsrv04_201002240725.kdump.gz>
> 
> > Regarding sigsuspend() returning EINTR without delivering any signal,
> > could it be that the sendmail process was debugged ?
> 
> No.  I didn't touch the process with anything this time.  There was no
> debugger in use on the system.  That was how I found the process first
> thing this morning so I sent off the kdump output.
> 
> The process stayed in the same state until I rebooted the system this
> afternoon to install a kernel with debug symbols and options.  I have
> done the same on the other two servers, so I can dig deeper for you next
> time.  I am running ktrace on the sendmail process group on all three
> servers waiting to catch the next one.  By the way, all three are i386
> with SMP.

Kernel debugging is not really needed at this stage.

I would be interested to know whether the latest RELENG_8 kernel still
shows sigsuspend(2) returning EINTR without a signal being delivered.

Our pause(3), as it stands, has two problems unrelated to the issue you
see. First, it uses the sigcompat(3) routines, which brings them into the
namespace whenever pause is used. Second, as a consequence of the first,
realtime signals are blocked during pause(3). While testing this patch,
I also noticed that kill(1) cannot send realtime signals to processes.

The usual race with pause() is still there; it cannot be solved.

diff --git a/bin/kill/kill.c b/bin/kill/kill.c
index bb9982e..8ee1d85 100644
--- a/bin/kill/kill.c
+++ b/bin/kill/kill.c
@@ -108,7 +108,7 @@ main(int argc, char *argv[])
 			numsig = strtol(*argv, &ep, 10);
 			if (!**argv || *ep)
 				errx(1, "illegal signal number: %s", *argv);
-			if (numsig < 0 || numsig >= sys_nsig)
+			if (numsig < 0)
 				nosig(*argv);
 		} else
 			nosig(*argv);
diff --git a/lib/libc/gen/pause.c b/lib/libc/gen/pause.c
index 00bf833..51706cf 100644
--- a/lib/libc/gen/pause.c
+++ b/lib/libc/gen/pause.c
@@ -33,8 +33,10 @@ static char sccsid[] = "@(#)pause.c	8.1 (Berkeley) 6/4/93";
 #include <sys/cdefs.h>
 __FBSDID("$FreeBSD$");
 
+#include "namespace.h"
 #include <signal.h>
 #include <unistd.h>
+#include "un-namespace.h"
 
 /*
  * Backwards compatible pause.
@@ -42,7 +44,11 @@ __FBSDID("$FreeBSD$");
 int
 __pause(void)
 {
-	return sigpause(sigblock(0L));
+	sigset_t oset;
+
+	if (_sigprocmask(SIG_BLOCK, NULL, &oset) == -1)
+		return (-1);
+	return (_sigsuspend(&oset));
 }
 __weak_reference(__pause, pause);
 __weak_reference(__pause, _pause);