kern/103101: Locking race in tty.c causes frequent panics on SMP
Martin Blapp
mbr at FreeBSD.org
Sun Sep 10 09:40:22 PDT 2006
>Number: 103101
>Category: kern
>Synopsis: Locking race in tty.c causes frequent panics on SMP
>Confidential: no
>Severity: critical
>Priority: high
>Responsible: freebsd-bugs
>State: open
>Quarter:
>Keywords:
>Date-Required:
>Class: sw-bug
>Submitter-Id: current-users
>Arrival-Date: Sun Sep 10 16:40:20 GMT 2006
>Closed-Date:
>Last-Modified:
>Originator: Martin Blapp
>Release: FreeBSD 6.1-STABLE i386
>Organization:
ImproWare AG
>Environment:
>Description:
Normally a shared lock on proctree_lock is used to protect
tp->t_session. But this lock isn't used consistently everywhere,
so races like this one are possible. In ttymodem(), proctree_lock is
acquired too late. The patch below fixes this problem.
(kgdb) bt
#0 doadump () at pcpu.h:165
#1 0xc066355e in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:409
#2 0xc06638b5 in panic (fmt=0xc0891732 "%s") at /usr/src/sys/kern/kern_shutdown.c:565
#3 0xc085c6b6 in trap_fatal (frame=0xed6e4ab8, eva=4) at /usr/src/sys/i386/i386/trap.c:836
#4 0xc085c3bf in trap_pfault (frame=0xed6e4ab8, usermode=0, eva=4) at /usr/src/sys/i386/i386/trap.c:744
#5 0xc085bfb5 in trap (frame=
{tf_fs = 8, tf_es = 40, tf_ds = -1063714776, tf_edi = -1064042304, tf_esi = 0, tf_ebp = -311538944, tf_isp = -311538972,
tf_ebx = -967615488, tf_edx = -1063651212, tf_ecx = -941099136, tf_eax = 0, tf_trapno = 12, tf_err = 0, tf_eip = -1066845359, tf_cs = 32,
tf_eflags = 66194, tf_esp = -967615488, tf_ss = 0})
at /usr/src/sys/i386/i386/trap.c:434
#6 0xc0848bea in calltrap () at /usr/src/sys/i386/i386/exception.s:139
#7 0xc0693b51 in ttymodem (tp=0xc6535c00, flag=-1063651212) at /usr/src/sys/kern/tty.c:1659
#8 0xc0698362 in ptcclose (dev=0x0, flags=3, fmt=8192, td=0xc7e7f780) at linedisc.h:136
#9 0xc0638a6f in giant_close (dev=0xcb3c1100, fflag=3, devtype=8192, td=0xc7e7f780) at /usr/src/sys/kern/kern_conf.c:266
#10 0xc06162bf in devfs_close (ap=0xed6e4b7c) at /usr/src/sys/fs/devfs/devfs_vnops.c:287
#11 0xc086dc1c in VOP_CLOSE_APV (vop=0x0, a=0xc099f874) at vnode_if.c:426
#12 0xc06c87e2 in vn_close (vp=0xc9cdf660, flags=3, file_cred=0x0, td=0xc7e7f780) at vnode_if.h:227
#13 0xc06c974a in vn_closefile (fp=0xc6fc5438, td=0xc7e7f780) at /usr/src/sys/kern/vfs_vnops.c:865
#14 0xc06162e7 in devfs_close_f (fp=0xc6fc5438, td=0xc7e7f780) at /usr/src/sys/fs/devfs/devfs_vnops.c:297
#15 0xc0642cdc in fdrop_locked (fp=0xc6fc5438, td=0xc7e7f780) at file.h:295
#16 0xc0642c29 in fdrop (fp=0xc6fc5438, td=0xc7e7f780) at /usr/src/sys/kern/kern_descrip.c:2122
#17 0xc06411c7 in closef (fp=0xc6fc5438, td=0xc7e7f780) at /usr/src/sys/kern/kern_descrip.c:1942
#18 0xc063e329 in close (td=0xc7e7f780, uap=0x0) at /usr/src/sys/kern/kern_descrip.c:1007
In ttymodem() the current code correctly checks that tp->t_session isn't NULL, but
only takes the necessary proctree_lock afterwards. It then accesses a member of
tp->t_session, which may have become NULL in the meantime -> panic().
The only way to solve this for now is to protect t_session with exclusive locks
at the places where we modify it in tty_close() and ttioctl(), and shared locks
at places where we first test it and then access a member of it.
I know that this isn't a perfect solution; the tty subsystem definitely needs
proper locking and someone has to do it. But in the meantime, we need a
stable SMP FreeBSD.
I've made a more complete patch available at:
http://antispam.imp.ch/patches/patch-tty.t_pgrp.diff
>How-To-Repeat:
I haven't found a way to quickly reproduce this bug. We have seen these panics
on all SMP servers we run with FreeBSD 5/6, mostly under load after 2-3
days of uptime. An active serial console seems to trigger the bug more often,
but does not seem to be necessary.
>Fix:
--- sys/kern/tty.c Sun Nov 6 17:09:32 2005
+++ sys/kern/tty.c Sat Jul 8 08:29:07 2006
@@ -1654,8 +1668,8 @@
!ISSET(tp->t_cflag, CLOCAL)) {
SET(tp->t_state, TS_ZOMBIE);
CLR(tp->t_state, TS_CONNECTED);
+ sx_slock(&proctree_lock); /* XXX: protect t_session */
if (tp->t_session) {
- sx_slock(&proctree_lock);
if (tp->t_session->s_leader) {
struct proc *p;
@@ -1664,8 +1678,8 @@
psignal(p, SIGHUP);
PROC_UNLOCK(p);
}
- sx_sunlock(&proctree_lock);
}
+ sx_sunlock(&proctree_lock);
ttyflush(tp, FREAD | FWRITE);
return (0);
}
>Release-Note:
>Audit-Trail:
>Unformatted:
>System: SMP kernel on SMP systems.
The bug is present in RELENG_5, RELENG_6 and in HEAD. During my tests, I've
seen the panic on FreeBSD 5 as well as on FreeBSD 6.