[Bug 233646] Flakey test case: bin.sh.builtins.functional_test.kill1
bugzilla-noreply at freebsd.org
Thu Dec 27 22:45:20 UTC 2018
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=233646
Jilles Tjoelker <jilles at FreeBSD.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|New |Open
--- Comment #3 from Jilles Tjoelker <jilles at FreeBSD.org> ---
In the text below, wait(2) means any wait system call; sh(1) uses wait3(),
which appears as wait4() in ktrace.
The test case is meant to verify that a job that has terminated and been
wait(2)ed for, but not wait(1)ed for, can be passed to kill(1) without error
(the command will do nothing). The part with the second background job, p2 and
wait is intended to wait for the first background job to terminate and be
wait(2)ed for, without taking excessive time or wait(1)ing for it (which would
make the %1 specification invalid). If the first background job is slow to
terminate, the kill command will actually signal it, but this is harmless. If
the first background job has terminated but the kernel has not yet returned it
via wait(2), the kill command will signal a zombie, which per POSIX succeeds
without doing anything.
I noticed that the problem reproduces quickly on head using a loop like
while sh builtins/kill1.0; do :; done
This happens with head's sh as well as with stable/11's sh. On stable/11, by
contrast, the loop can run for quite a while, both with stable/11's sh and with
head's sh built against stable/11.
Reproducing with ktrace -i seems hard, but reproducing with plain ktrace works.
The ktrace extract below seems to indicate that the kernel is at fault,
returning an [ESRCH] error for killing a zombie:
19837 sh CALL fork
19837 sh RET fork 19838/0x4d7e
19837 sh CALL wait4(0xffffffff,0x7fffffffe91c,0x1<WNOHANG>,0)
19837 sh RET wait4 0
19837 sh CALL fork
19837 sh RET fork 19839/0x4d7f
19837 sh CALL sigprocmask(SIG_BLOCK,0x7fffffffe820,0x7fffffffe810)
19837 sh RET sigprocmask 0
19837 sh CALL sigaction(SIGCHLD,0x7fffffffe850,0x7fffffffe830)
19837 sh RET sigaction 0
19837 sh CALL wait4(0xffffffff,0x7fffffffe80c,0x1<WNOHANG>,0)
19837 sh RET wait4 19839/0x4d7f
19837 sh CALL sigaction(SIGCHLD,0x7fffffffe830,0)
19837 sh RET sigaction 0
19837 sh CALL sigprocmask(SIG_SETMASK,0x7fffffffe810,0)
19837 sh RET sigprocmask 0
19837 sh CALL kill(0x4d7e,SIGTERM)
19837 sh RET kill -1 errno 3 No such process
Process ID 19838 (0x4d7e, the target of the kill() call) has not been returned
by a wait4() call, so it must either be still running or a zombie. In either
case, a kill() on it must succeed.
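The kill() target in the extract is given in hex, so the reasoning above can be
checked mechanically. The following is only a rough sketch in Python (chosen for
brevity): the function name unreaped_kill_targets is hypothetical, and the regular
expressions assume the exact line layout shown in the extract, not ktrace/kdump
output in general.

```python
import re

# Relevant lines copied from the ktrace extract in this comment.
TRACE = """\
19837 sh RET fork 19838/0x4d7e
19837 sh RET wait4 0
19837 sh RET fork 19839/0x4d7f
19837 sh RET wait4 19839/0x4d7f
19837 sh CALL kill(0x4d7e,SIGTERM)
19837 sh RET kill -1 errno 3 No such process
"""


def unreaped_kill_targets(trace):
    """Return kill() targets that were forked but never returned by wait4()."""
    forked, reaped, targets = set(), set(), []
    for line in trace.splitlines():
        m = re.search(r"RET\s+fork\s+(\d+)/", line)
        if m:
            forked.add(int(m.group(1)))
            continue
        # "RET wait4 0" (nothing reaped) has no "/0x..." part and is skipped.
        m = re.search(r"RET\s+wait4\s+(\d+)/", line)
        if m:
            reaped.add(int(m.group(1)))
            continue
        m = re.search(r"CALL\s+kill\((0x[0-9a-f]+),", line)
        if m:
            pid = int(m.group(1), 16)  # 0x4d7e == 19838
            if pid in forked and pid not in reaped:
                targets.append(pid)
    return targets


if __name__ == "__main__":
    print(unreaped_kill_targets(TRACE))
```

For this extract the only such target is the first child, which is the process
that kill() failed on with errno 3.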
It appears that there is no test that specifically verifies that killing a
zombie process succeeds.
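Such a test could look roughly like the sketch below. It is written in Python
for brevity and is not taken from the FreeBSD test suite; the function name
kill_zombie is hypothetical. It assumes os.waitid() with WNOWAIT is available,
which is used to wait until the child has exited without reaping it, so that
the child is known to be a zombie when kill() is called.

```python
import os
import signal


def kill_zombie():
    """Fork a child, let it become a zombie, signal it, then reap it.

    Returns True if kill() on the zombie succeeded, False on ESRCH.
    """
    pid = os.fork()
    if pid == 0:
        os._exit(0)  # child: terminate immediately

    # Block until the child has exited, but leave it unreaped (WNOWAIT),
    # so it is still a zombie afterwards.
    os.waitid(os.P_PID, pid, os.WEXITED | os.WNOWAIT)

    try:
        os.kill(pid, signal.SIGTERM)  # expected to succeed and do nothing
        ok = True
    except ProcessLookupError:  # ESRCH, the failure seen in the trace
        ok = False
    finally:
        os.waitpid(pid, 0)  # now reap the child
    return ok


if __name__ == "__main__":
    print("kill on zombie succeeded:", kill_zombie())
```

A real regression test would more likely be written in C against the test
framework already used in the tree; the sequence of calls (fork, waitid with
WNOWAIT, kill, waitpid) would be the same.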
--
You are receiving this mail because:
You are the assignee for the bug.