kern/100940: passing file descriptor over datagram UNIX domain
socket crashes kernel
Robert Watson
rwatson at FreeBSD.org
Mon Jul 31 17:45:32 UTC 2006
Synopsis: passing file descriptor over datagram UNIX domain socket crashes kernel
Responsible-Changed-From-To: freebsd-bugs->rwatson
Responsible-Changed-By: rwatson
Responsible-Changed-When: Mon Jul 31 17:27:10 UTC 2006
Responsible-Changed-Why:
Grab ownership of this PR, since I have a strong interest in the UNIX
domain socket code. The failure, as seen on a 7.x kernel with WITNESS
enabled, is:
tiger-1# ./fd_passing
Kernel page fault with the following non-sleepable locks held:
exclusive sleep mutex unp r = 0 (0xc0a56714) locked @ kern/uipc_usrreq.c:999
KDB: stack backtrace:
kdb_backtrace(1,c7193b04,c,c7164a20,e974fad4,...) at kdb_backtrace+0x29
witness_warn(5,0,c094452a) at witness_warn+0x192
trap(c7160008,c76d0028,28,0,c7497578,...) at trap+0x108
calltrap() at calltrap+0x5
--- trap 0xc, eip = 0xc06e38ff, esp = 0xe974fb1c, ebp = 0xe974fb44 ---
uipc_send(c76f4530,0,c73e0200,c7121160,c73e0500,c7164a20) at uipc_send+0xdb
sosend_generic(c76f4530,c7121160,e974fbe4,c73e0200,c73e0300,...) at sosend_generic+0x3e5
sosend(c76f4530,c7121160,e974fbe4,0,c73e0300,0,c7164a20) at sosend+0x3c
kern_sendit(c7164a20,3,e974fc5c,0,c73e0300,0) at kern_sendit+0x101
sendit(c7164a20,3,e974fc5c,0,c7121170,...) at sendit+0x87
sendmsg(c7164a20,e974fd04) at sendmsg+0x53
syscall(3b,3b,3b,bfbfed5c,bfbfed54,...) at syscall+0x256
Xint0x80_syscall() at Xint0x80_syscall+0x1f
--- syscall (28, FreeBSD ELF32, sendmsg), eip = 0x2812fab3, esp = 0xbfbfec1c, ebp = 0xbfbfece8 ---
Fatal trap 12: page fault while in kernel mode
cpuid = 2; apic id = 06
fault virtual address = 0x8
fault code = supervisor read, page not present
instruction pointer = 0x20:0xc06e38ff
stack pointer = 0x28:0xe974fb1c
frame pointer = 0x28:0xe974fb44
code segment = base 0x0, limit 0xfffff, type 0x1b
= DPL 0, pres 1, def32 1, gran 1
processor eflags = interrupt enabled, resume, IOPL = 0
current process = 901 (fd_passing)
[thread pid 901 tid 100075 ]
Stopped at uipc_send+0xdb: movl 0x8(%ecx),%edi
db> bt
Tracing pid 901 tid 100075 td 0xc7164a20
uipc_send(c76f4530,0,c73e0200,c7121160,c73e0500,c7164a20) at uipc_send+0xdb
sosend_generic(c76f4530,c7121160,e974fbe4,c73e0200,c73e0300,...) at sosend_generic+0x3e5
sosend(c76f4530,c7121160,e974fbe4,0,c73e0300,0,c7164a20) at sosend+0x3c
kern_sendit(c7164a20,3,e974fc5c,0,c73e0300,0) at kern_sendit+0x101
sendit(c7164a20,3,e974fc5c,0,c7121170,...) at sendit+0x87
sendmsg(c7164a20,e974fd04) at sendmsg+0x53
syscall(3b,3b,3b,bfbfed5c,bfbfed54,...) at syscall+0x256
Xint0x80_syscall() at Xint0x80_syscall+0x1f
--- syscall (28, FreeBSD ELF32, sendmsg), eip = 0x2812fab3, esp = 0xbfbfec1c, ebp = 0xbfbfece8 ---
db> show alllocks
Process 901 (fd_passing) thread 0xc7164a20 (100075)
exclusive sleep mutex unp r = 0 (0xc0a56714) locked @ kern/uipc_usrreq.c:999
995 if (vp != NULL)
996 vput(vp);
997 mtx_unlock(&Giant);
998 free(sa, M_SONAME);
999 UNP_LOCK();
1000 unp->unp_flags &= ~UNP_CONNECTING;
1001 return (error);
1002 }
1003
1004 static int
1005 unp_connect2(struct socket *so, struct socket *so2, int req)
1006 {
(gdb) l *0xc06e38ff
0xc06e38ff is in uipc_send (../../../kern/uipc_usrreq.c:609).
604 error = ENOTCONN;
605 break;
606 }
607 }
608 unp2 = unp->unp_conn;
609 so2 = unp2->unp_socket;
610 if (unp->unp_addr != NULL)
611 from = (struct sockaddr *)unp->unp_addr;
612 else
613 from = &sun_noname;
The problem appears to be that unp_connect() can return with the socket
disconnected: it drops the UNIX domain socket subsystem lock while
discarding the vnode reference for the remote socket, and the socket can
be disconnected in that window, before the send proceeds. uipc_send()
then loads a NULL unp->unp_conn at line 608 and dereferences it at line
609, which is consistent with the fault address of 0x8. Probably the
answer is to add a check for a NULL unp->unp_conn pointer and return an
appropriate error, as the connect() and send() cannot be performed
atomically. I will follow up with a patch.
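
To illustrate the kind of check described above, here is a minimal
userspace sketch. It is not the patch and not kernel code: the structures
are simplified stand-ins, model_dgram_send() is a hypothetical name, and
returning ENOTCONN is an assumption (the report only says "an appropriate
error"; ENOTCONN is what the nearby code at line 604 uses).

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct socket {
	int so_id;
};

struct unpcb {
	struct unpcb	*unp_conn;	/* peer pcb, or NULL if not connected */
	struct socket	*unp_socket;	/* back-pointer to the owning socket */
};

/*
 * Model of the step at lines 608-609 of the listing: fetch the peer and
 * its socket, but re-check unp_conn first, since the connection may have
 * gone away while the lock was dropped. ENOTCONN is an assumed choice.
 */
static int
model_dgram_send(struct unpcb *unp, struct socket **so2p)
{
	struct unpcb *unp2;

	assert(unp != NULL && so2p != NULL);

	unp2 = unp->unp_conn;
	if (unp2 == NULL)
		return (ENOTCONN);	/* peer vanished; do not dereference */

	*so2p = unp2->unp_socket;
	return (0);
}
```

With the check in place, a sender whose peer has vanished gets an error
back instead of following a NULL pointer; the caller still has to cope
with a send that fails after a connect that appeared to succeed.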
http://www.freebsd.org/cgi/query-pr.cgi?pr=100940