kern/78824: race condition close()ing and read()ing the same
socketpair on SMP.
Marc Olzheim
zlo at zlo.nu
Mon Mar 14 08:40:04 PST 2005
>Number: 78824
>Category: kern
>Synopsis: race condition close()ing and read()ing the same socketpair on SMP.
>Confidential: no
>Severity: serious
>Priority: medium
>Responsible: freebsd-bugs
>State: open
>Quarter:
>Keywords:
>Date-Required:
>Class: sw-bug
>Submitter-Id: current-users
>Arrival-Date: Mon Mar 14 16:40:02 GMT 2005
>Closed-Date:
>Last-Modified:
>Originator: Marc Olzheim, Sven Berkvens
>Release: FreeBSD 5.4-PRERELEASE i386
>Organization:
ilse media
>Environment:
System: FreeBSD rave.ilse.net 5.4-PRERELEASE FreeBSD 5.4-PRERELEASE #0: Thu Mar 10 15:43:26 CET 2005 root at rave.ilse.net:/usr/obj/usr/src/sys/SE3DEBUG i386
GENERIC + INVARIANTS + INVARIANT_SUPPORT + WITNESS + WITNESS_SKIPSPIN
>Description:
When read()ing from a socket while the other end is being
close()d at the same time, read() fails with errno == ENOTCONN,
instead of doing normal End-of-file handling.
References:
soisdisconnected() from
__FBSDID("$FreeBSD: src/sys/kern/uipc_socket2.c,v 1.137.2.5 2005/02/23 00:39:17 rwatson Exp $");
soreceive() from
__FBSDID("$FreeBSD: src/sys/kern/uipc_socket.c,v 1.208.2.17 2005/03/07 13:08:03 rwatson Exp $");
close() from
__FBSDID("$FreeBSD: src/sys/kern/kern_descrip.c,v 1.243.2.6 2005/03/03 22:27:32 jhb Exp $");
It seems as though soreceive() doesn't check for a lock on the
file descriptor, only on the socket buffer, allowing close() to
modify the socket's flags at the same time.
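For comparison, the end-of-file behaviour we expect can be shown without any race, in a single process. The sketch below is only an illustration: the helper name read_after_peer_close() and its -2 setup-failure return value are our own invention and are not part of socketpair2.c. It closes the writing end before reading, so the second read() should simply return 0.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of socketpair2.c): write one byte,
 * close the writing end, then read twice from the other end.
 * Returns the result of the second read(), which should be 0 (EOF),
 * or -2 if the setup itself failed.
 */
ssize_t
read_after_peer_close(void)
{
	int sock[2];
	char buf[1];
	ssize_t bytes;

	if (socketpair(PF_UNIX, SOCK_STREAM, 0, sock) == -1)
		return -2;
	if (write(sock[0], "A", 1) != 1)
	{
		close(sock[0]);
		close(sock[1]);
		return -2;
	}
	/* The peer is fully closed before any read() starts: no race. */
	close(sock[0]);

	/* First read() drains the single buffered byte. */
	bytes = read(sock[1], buf, 1);
	if (bytes == 1)
	{
		/* Second read() should report EOF, not an error. */
		bytes = read(sock[1], buf, 1);
	}
	close(sock[1]);
	return bytes;
}
```

The problem reported here is that the same second read() can return -1 with ENOTCONN when the close() on the other end happens concurrently on another CPU.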
>How-To-Repeat:
Since this is heavily timing dependent (it is a race condition),
it might not be easily reproduced. We can reproduce it by running
our code on the following hardware, with no other CPU-intensive
processes running:
hw.machine: i386
hw.model: Intel(R) Xeon(TM) CPU 3.06GHz
hw.ncpu: 4
hw.byteorder: 1234
hw.clockrate: 3065
kern.ostype: FreeBSD
kern.osrelease: 5.4-PRERELEASE
kern.osrevision: 199506
kern.version: FreeBSD 5.4-PRERELEASE #0: Thu Mar 10 15:43:26 CET 2005
root at rave.ilse.net:/usr/obj/usr/src/sys/SE3DEBUG
kern.clockrate: { hz = 100, tick = 10000, profhz = 1024, stathz = 128 }
kern.osreldate: 503105
kern.stackprot: 7
kern.ktrace.genio_size: 4096
kern.ktrace.request_pool: 100
kern.sched.name: 4BSD
kern.smp.maxcpus: 16
kern.smp.active: 1
kern.smp.disabled: 0
kern.smp.cpus: 4
kern.smp.forward_signal_enabled: 1
kern.smp.forward_roundrobin_enabled: 1
Here's the code. When run under ktrace on our machine, the problem
is reproduced:
rave:/tmp>echo 'ktrace -i ./socketpair2 < /dev/null' | sh
<Socket is not connected> (3,4) (i:33)
<Socket is not connected> (3,4) (i:48)
<Socket is not connected> (3,4) (i:67)
<Socket is not connected> (3,4) (i:99)
100
<Socket is not connected> (3,4) (i:131)
<Socket is not connected> (3,4) (i:141)
<Socket is not connected> (3,4) (i:144)
<Socket is not connected> (3,4) (i:159)
<Socket is not connected> (3,4) (i:169)
<Socket is not connected> (3,4) (i:176)
<Socket is not connected> (3,4) (i:183)
200
<Socket is not connected> (3,4) (i:213)
<Socket is not connected> (3,4) (i:226)
<Socket is not connected> (3,4) (i:234)
<Socket is not connected> (3,4) (i:254)
<Socket is not connected> (3,4) (i:282)
...
socketpair2.c:
/* socketpair2.c: - Marc Olzheim <zlo at zlo.nu>,
* Sven Berkvens <sven at berkvens.net>
*/
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/wait.h>
#include <errno.h>
#include <fcntl.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>	/* exit() */
#include <string.h>
#include <unistd.h>
int
main(int argc, char *argv[])
{
	int sock[2], i, j, wstat;
	char buf[1024];
	ssize_t bytes;
	pid_t newpid;

	if (1 != argc)
	{
		fprintf(stderr, "Usage: %s\n", argv[0]);
		return 1;
	}
	for (i = 0;;++i)
	{
		if (socketpair(PF_UNIX, SOCK_STREAM, 0, sock))
		{
			perror("socketpair()");
			return 1;
		}
		newpid = fork();
		if (-1 == newpid)
		{
			perror("fork()");
			return 1;
		}
		if (0 != newpid)
		{
			/* parent */
			close(sock[1]);
			if (write(sock[0], "A", 1) != 1)
				perror("write()");
			/* Suspend until the child has read the byte. */
			kill(getpid(), SIGSTOP);
			/* We hopefully get a time slice as soon as a
			 * SIGCONT is delivered.
			 */
			close(sock[0]);
		}
		else
		{
			/* child */
			close(sock[0]);
			bytes = read(sock[1], buf, 1);
			if (bytes != 1)
				perror("first read()");
			/* Tell the parent to continue and close its side of
			 * the socket.
			 */
			kill(getppid(), SIGCONT);
			/* Since only 1 byte is sent, this should
			 * produce EOF.
			 */
			bytes = read(sock[1], buf, 1);
			if (bytes == -1)
			{
				printf("<%s> (%d,%d) (i:%d)\n",
				    strerror(errno),
				    sock[0], sock[1], i);
				exit(1);
			}
			exit(0);
		}
		wait(&wstat);
		if (!(i % 100) && i)
			printf("%d\n", i);
	}
	return 0;
}
>Fix:
It's possible to catch the ENOTCONN and restart the read() to
read the EOF...
>Release-Note:
>Audit-Trail:
>Unformatted: