kern/103101: Locking race in tty.c causes frequent panics on SMP

Martin Blapp mbr at FreeBSD.org
Sun Sep 10 09:40:22 PDT 2006


>Number:         103101
>Category:       kern
>Synopsis:       Locking race in tty.c causes frequent panics on SMP
>Confidential:   no
>Severity:       critical
>Priority:       high
>Responsible:    freebsd-bugs
>State:          open
>Quarter:        
>Keywords:       
>Date-Required:
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Sun Sep 10 16:40:20 GMT 2006
>Closed-Date:
>Last-Modified:
>Originator:     Martin Blapp
>Release:        FreeBSD 6.1-STABLE i386
>Organization:
ImproWare AG
>Environment:
>Description:

Normally a shared lock on proctree_lock is used to protect
tp->t_session. But this lock isn't used consistently everywhere,
which leaves room for races like this one. In this place, proctree_lock
is acquired too late. The patch below fixes this problem.

(kgdb) bt
#0  doadump () at pcpu.h:165
#1  0xc066355e in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:409
#2  0xc06638b5 in panic (fmt=0xc0891732 "%s") at /usr/src/sys/kern/kern_shutdown.c:565
#3  0xc085c6b6 in trap_fatal (frame=0xed6e4ab8, eva=4) at /usr/src/sys/i386/i386/trap.c:836
#4  0xc085c3bf in trap_pfault (frame=0xed6e4ab8, usermode=0, eva=4) at /usr/src/sys/i386/i386/trap.c:744
#5  0xc085bfb5 in trap (frame=
      {tf_fs = 8, tf_es = 40, tf_ds = -1063714776, tf_edi = -1064042304, tf_esi = 0, tf_ebp = -311538944, tf_isp = -311538972,
       tf_ebx = -967615488, tf_edx = -1063651212, tf_ecx = -941099136, tf_eax = 0, tf_trapno = 12, tf_err = 0, tf_eip = -1066845359,
       tf_cs = 32, tf_eflags = 66194, tf_esp = -967615488, tf_ss = 0})
    at /usr/src/sys/i386/i386/trap.c:434
#6  0xc0848bea in calltrap () at /usr/src/sys/i386/i386/exception.s:139
#7  0xc0693b51 in ttymodem (tp=0xc6535c00, flag=-1063651212) at /usr/src/sys/kern/tty.c:1659
#8  0xc0698362 in ptcclose (dev=0x0, flags=3, fmt=8192, td=0xc7e7f780) at linedisc.h:136
#9  0xc0638a6f in giant_close (dev=0xcb3c1100, fflag=3, devtype=8192, td=0xc7e7f780) at /usr/src/sys/kern/kern_conf.c:266
#10 0xc06162bf in devfs_close (ap=0xed6e4b7c) at /usr/src/sys/fs/devfs/devfs_vnops.c:287
#11 0xc086dc1c in VOP_CLOSE_APV (vop=0x0, a=0xc099f874) at vnode_if.c:426
#12 0xc06c87e2 in vn_close (vp=0xc9cdf660, flags=3, file_cred=0x0, td=0xc7e7f780) at vnode_if.h:227
#13 0xc06c974a in vn_closefile (fp=0xc6fc5438, td=0xc7e7f780) at /usr/src/sys/kern/vfs_vnops.c:865
#14 0xc06162e7 in devfs_close_f (fp=0xc6fc5438, td=0xc7e7f780) at /usr/src/sys/fs/devfs/devfs_vnops.c:297
#15 0xc0642cdc in fdrop_locked (fp=0xc6fc5438, td=0xc7e7f780) at file.h:295
#16 0xc0642c29 in fdrop (fp=0xc6fc5438, td=0xc7e7f780) at /usr/src/sys/kern/kern_descrip.c:2122
#17 0xc06411c7 in closef (fp=0xc6fc5438, td=0xc7e7f780) at /usr/src/sys/kern/kern_descrip.c:1942
#18 0xc063e329 in close (td=0xc7e7f780, uap=0x0) at /usr/src/sys/kern/kern_descrip.c:1007

In ttymodem() the current code correctly checks that tp->t_session isn't NULL, but
takes the necessary lock (proctree_lock) only afterwards. It then accesses a member of
tp->t_session, which may have become NULL in the meantime -> panic().
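
To make the window concrete, here is a minimal single-threaded sketch that
replays the interleaving step by step. It is an illustration only, not kernel
code: struct session_sketch, struct tty_sketch, s_leader_pid and the three
step functions are hypothetical stand-ins, and the action of the second CPU
is modelled as an ordinary function call.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures, not the real layouts. */
struct session_sketch {
	int s_leader_pid;		/* stands in for s_leader */
};

struct tty_sketch {
	struct session_sketch *t_session;
};

/* Step 1 of the unpatched ordering: t_session is tested without the lock. */
static bool
step_test(const struct tty_sketch *tp)
{
	assert(tp != NULL);
	return (tp->t_session != NULL);
}

/* What another CPU may do between the test and the use. */
static void
other_cpu_clears(struct tty_sketch *tp)
{
	assert(tp != NULL);
	tp->t_session = NULL;
}

/*
 * Step 2: the use. The sketch reports failure where the kernel would
 * dereference the NULL pointer and panic.
 */
static bool
step_use(const struct tty_sketch *tp, int *pid)
{
	assert(tp != NULL && pid != NULL);
	if (tp->t_session == NULL)
		return (false);
	*pid = tp->t_session->s_leader_pid;
	return (true);
}
```

Calling step_test(), then other_cpu_clears(), then step_use() on the same
tty_sketch shows the test succeeding while the later use fails. Taking the
lock between the two steps does not help, because the pointer has already
been tested.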

The only way to solve this for now is to protect t_session with exclusive locks
at the places where we modify it in tty_close() and ttioctl(), and shared locks
at places where we first test it and then access a member of it.

I know that this isn't a perfect solution; the tty subsystem definitely needs
proper locking and someone has to do it. But in the meantime, we need a
stable SMP FreeBSD.

I've made a more complete patch available at:

http://antispam.imp.ch/patches/patch-tty.t_pgrp.diff

>How-To-Repeat:

I haven't found a way to quickly reproduce this bug. We have seen these panics
on all SMP servers we run with FreeBSD 5/6, mostly under load after 2-3
days of uptime. An active serial console seems to trigger the bug more often,
but does not seem to be necessary.

>Fix:

--- sys/kern/tty.c	Sun Nov  6 17:09:32 2005
+++ sys/kern/tty.c	Sat Jul  8 08:29:07 2006
@@ -1654,8 +1668,8 @@
 		    !ISSET(tp->t_cflag, CLOCAL)) {
 			SET(tp->t_state, TS_ZOMBIE);
 			CLR(tp->t_state, TS_CONNECTED);
+			sx_slock(&proctree_lock);	/* XXX: protect t_session */
 			if (tp->t_session) {
-				sx_slock(&proctree_lock);
 				if (tp->t_session->s_leader) {
 					struct proc *p;
 
@@ -1664,8 +1678,8 @@
 					psignal(p, SIGHUP);
 					PROC_UNLOCK(p);
 				}
-				sx_sunlock(&proctree_lock);
 			}
+			sx_sunlock(&proctree_lock);
 			ttyflush(tp, FREAD | FWRITE);
 			return (0);
 		}
>Release-Note:
>Audit-Trail:
>Unformatted:
 >System:		SMP kernel on SMP systems.
 
 The bug is present in RELENG_5, RELENG_6 and in HEAD. During my tests, I've
 seen the panic on FreeBSD 5 as well as on FreeBSD 6.
 

