svn commit: r360233 - in head: contrib/jemalloc . . . : This partially breaks a 2-socket 32-bit powerpc (old PowerMac G4) based on head -r360311

Mark Millard marklmi at yahoo.com
Mon May 4 18:16:24 UTC 2020


[This report just shows some material for the
sendmail SIGSEGV's, based on truss output.]

I've returned to using the modern jemalloc because
it seems to show problems more, after having
caught the earlier reported dhclient example under
the older jemalloc context. (Again: jemalloc may
be exposing a problem elsewhere.)

I had truss monitor sendmail, including following
child processes. A child process start and end
normally looks like:

  963: 3092.707439595 0.000033895 sigprocmask(SIG_SETMASK,{ SIGCHLD },0x0) = 0 (0x0)
 2432: 3092.708024959 0.000000000 <new process>
  963: 3092.708136462 0.000470115 fork()         = 2432 (0x980)
 2432: 3092.708441039 0.000033319 thr_self(0x50120000) = 0 (0x0)
. . .
 2432: 3092.717283761 0.000036008 sigaction(SIGQUIT,{ SIG_IGN SA_RESTART ss_t },{ SIG_IGN SA_RESTART ss_t }) = 0 (0x0)
 2432: 3092.717544288 0.000034352 sigprocmask(SIG_SETMASK,{ },0x0) = 0 (0x0)
 2432: 3092.717799894 0.000035768 close(0)       = 0 (0x0)
 2432: 3092.718174733 0.000103726 openat(AT_FDCWD,"/dev/null",O_RDONLY,00) = 0 (0x0)
 2432: 3092.718480437 0.000052091 openat(AT_FDCWD,"/dev/null",O_WRONLY,00) = 4 (0x4)
 2432: 3092.718778028 0.000037856 dup2(4,1)      = 1 (0x1)
 2432: 3092.719003051 0.000034255 dup2(4,2)      = 2 (0x2)
 2432: 3092.719225122 0.000033655 close(4)       = 0 (0x0)
 2432: 3092.719437735 0.000047626 fstat(0,{ mode=crw-rw-rw- ,inode=22,size=0,blksize=4096 }) = 0 (0x0)
 2432: 3092.719679274 0.000037400 fstat(1,{ mode=crw-rw-rw- ,inode=22,size=0,blksize=4096 }) = 0 (0x0)
 2432: 3092.719908859 0.000035816 fstat(2,{ mode=crw-rw-rw- ,inode=22,size=0,blksize=4096 }) = 0 (0x0)
 2432: 3092.720138299 0.000033727 clock_gettime(13,{ 1588570204.000000000 }) = 0 (0x0)
 2432: 3092.720658945 0.000035360 getpid()       = 2432 (0x980)
 2432: 3092.720931594 0.000048730 __sysctl("kern.proc.args.2432",4,0x0,0x0,0x508ae000,49) = 0 (0x0)
 2432: 3092.721572338 0.000062318 open(".",O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC,00) = 4 (0x4)
 2432: 3092.721962132 0.000037808 fcntl(4,F_ISUNIONSTACK,0x0) = 0 (0x0)
 2432: 3092.722323792 0.000098613 getdirentries(4,"\0\0\0\0\0\t\M-I\M^Q\0\0\0\0\0\0"...,4096,{ 0x0 }) = 144 (0x90)
 2432: 3092.722944875 0.000036944 getdirentries(4,0x50859000,4096,{ 0x200 }) = 0 (0x0)
 2432: 3092.723294461 0.000045250 close(4)       = 0 (0x0)
 2432: 3092.723576400 0.000041144 clock_gettime(13,{ 1588570204.000000000 }) = 0 (0x0)
 2432: 3092.723828718 0.000037928 setitimer(0,{ 0.000000, 0.000000 },0x0) = 0 (0x0)
 2432: 3092.724092245 0.000035815 sigprocmask(SIG_UNBLOCK,{ SIGALRM },{ }) = 0 (0x0)
 2432: 3092.724591527 0.000038024 getpid()       = 2432 (0x980)
 2432: 3092.724952323 0.000052955 setuid(0x0)    ERR#1 'Operation not permitted'
 2432: 3092.727852159 0.000042969 clock_gettime(4,{ 21633.960374942 }) = 0 (0x0)
 2432: 3092.728508193 0.000034327 clock_gettime(4,{ 21633.961033929 }) = 0 (0x0)
 2432: 3092.729146608 0.000036872 clock_gettime(4,{ 21633.961670903 }) = 0 (0x0)
 2432: 3092.729824967 0.000038360 clock_gettime(4,{ 21633.962349071 }) = 0 (0x0)
 2432: 3092.732435446 0.001634793 exit(0x0)     
 2432: 3092.732555855 0.001755202 process exit, rval = 0
  963: 3092.732638865 0.018571063 SIGNAL 20 (SIGCHLD) code=CLD_EXITED pid=2432 uid=25 status=0
  963: 3092.732822864 0.018755062 sigsuspend({ }) ERR#4 'Interrupted system call'
  963: 3092.733076525 0.000039968 sigprocmask(SIG_SETMASK,{ SIGCHLD },0x0) = 0 (0x0)
  963: 3092.733447788 0.000076601 wait4(-1,{ EXITED,val=0 },WNOHANG,0x0) = 2432 (0x980)
  963: 3092.733781410 0.000034783 wait4(-1,0xffffbe60,WNOHANG,0x0) ERR#10 'No child processes'
  963: 3092.734065366 0.000037328 sigreturn(0xffffbe90) EJUSTRETURN
  963: 3092.734263408 0.000033295 sigprocmask(SIG_BLOCK,0x0,{ }) = 0 (0x0)
(No activity in 963 for about 1800 seconds.)

But once it starts failing, the SIGSEGV happens just after the:

setuid(0x0)    ERR#1 'Operation not permitted'

For example:

. . .
 4745: 37293.335510778 0.000035023 clock_gettime(13,{ 1588604405.000000000 }) = 0 (0x0)
 4745: 37293.335754863 0.000037593 setitimer(0,{ 0.000000, 0.000000 },0x0) = 0 (0x0)
 4745: 37293.336010253 0.000036200 sigprocmask(SIG_UNBLOCK,{ SIGALRM },{ }) = 0 (0x0)
 4745: 37293.336488027 0.000033823 getpid()      = 4745 (0x1289)
 4745: 37293.336836797 0.000051995 setuid(0x0)   ERR#1 'Operation not permitted'
 4745: 37293.338022675 0.001237873 SIGNAL 11 (SIGSEGV) code=SEGV_MAPERR trapno=768 addr=0x506a11f0
 4745: 37293.339546520 0.002761718 process killed, signal = 11
  963: 37293.339627249 0.050797919 SIGNAL 20 (SIGCHLD) code=CLD_KILLED pid=4745 uid=25 status=11
  963: 37293.339794781 0.050965451 sigsuspend({ }) ERR#4 'Interrupted system call'
  963: 37293.340038313 0.000037544 sigprocmask(SIG_SETMASK,{ SIGCHLD },0x0) = 0 (0x0)
  963: 37293.340329951 0.000070215 wait4(-1,{ SIGNALED,sig=SIGSEGV },WNOHANG,0x0) = 4745 (0x1289)
  963: 37293.340634048 0.000034903 wait4(-1,0xffffbe60,WNOHANG,0x0) ERR#10 'No child processes'
  963: 37293.340901417 0.000037615 sigreturn(0xffffbe90) EJUSTRETURN
  963: 37293.341090122 0.000033319 sigprocmask(SIG_BLOCK,0x0,{ }) = 0 (0x0)
(Another about 1800 seconds.)

So it seems that sendmail's SIGSEGV always happens during the
winding down of the child process: getting ready to exit.

So far, it seems that once it starts happening, it happens
for each child process created after that.


===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went
away in early 2018-Mar)



More information about the svn-src-head mailing list