kern/103520: tty_open(), ptty_open() panic with empty struct tty

Martin Blapp mbr at FreeBSD.org
Sat Sep 23 07:50:21 PDT 2006


>Number:         103520
>Category:       kern
>Synopsis:       tty_open(), ptty_open() panic with empty struct tty
>Confidential:   no
>Severity:       critical
>Priority:       high
>Responsible:    freebsd-bugs
>State:          open
>Quarter:        
>Keywords:       
>Date-Required:
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Sat Sep 23 14:50:13 GMT 2006
>Closed-Date:
>Last-Modified:
>Originator:     Martin Blapp
>Release:        FreeBSD 6.0-STABLE i386
>Organization:
ImproWare AG
>Environment:
>Description:

Fatal trap 12: page fault while in kernel mode
cpuid = 1; apic id = 01
fault virtual address   = 0x0
fault code              = supervisor read, page not present
instruction pointer     = 0x20:0x0
stack pointer           = 0x28:0xe8dd8974
frame pointer           = 0x28:0xe8dd8988
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, def32 1, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 70635 (perl)
trap number             = 12
panic: page fault
cpuid = 1
KDB: stack backtrace:
kdb_backtrace(100,c6bf2c00,28,e8dd8934,c,...) at kdb_backtrace+0x29
panic(c08a9756,c08fd779,0,fffff,c6bfd89b,...) at panic+0x114
trap_fatal(e8dd8934,0,c6bf2c00,c71f6cb8,c,...) at trap_fatal+0x2ce
trap_pfault(e8dd8934,0,0) at trap_pfault+0x1d7
trap(c6710008,e8dd0028,c06c0028,c6bf2c00,c643ec00,...) at trap+0x2fd
calltrap() at calltrap+0x5
--- trap 0xc, eip = 0, esp = 0xe8dd8974, ebp = 0xe8dd8988 ---
(null)(c643ec88,0,0,c643ec10,c643ec00,...) at 0
ttwakeup(c643ec00,c643ec00,c6dd7980,c643ec00,e8dd89d8,...) at ttwakeup+0x65
ttymodem(c643ec00,1) at ttymodem+0x178
ptsopen(c6bf5200,3,2000,c6bf2c00) at ptsopen+0x99
giant_open(c6bf5200,3,2000,c6bf2c00) at giant_open+0x4f
devfs_open(e8dd8a64) at devfs_open+0x20f
VOP_OPEN_APV(c094ca20,e8dd8a64) at VOP_OPEN_APV+0x38
vn_open_cred(e8dd8bcc,e8dd8ccc,0,c6e43d00,1,...) at vn_open_cred+0x434
vn_open(e8dd8bcc,e8dd8ccc,0,1) at vn_open+0x1e
kern_open(c6bf2c00,bfbfe8b0,0,3,0,...) at kern_open+0xb6
open(c6bf2c00,e8dd8d04) at open+0x1a
syscall(808003b,3b,bfbf003b,2,0,...) at syscall+0x2bf
Xint0x80_syscall() at Xint0x80_syscall+0x1f
--- syscall (5, FreeBSD ELF32, open), eip = 0x28256aeb, esp = 0xbfbfe87c, ebp = 0xbfbfe8d8 ---

J. Porter Clark sent me some traces and dumps to analyze, but I was not able
to reproduce the panic until yesterday. I think I now know what is going on.

I have hunted down the same panic that Doug White tried to fix one year ago
in FreeBSD 5. I added some debug output to see what was happening:

ptsopen(): tty with state 131112 has refcnt 3 (ttyp9) dev refcount = 2
ptsopen(): tty with state 131112 has refcnt 2 (ttyp8) dev refcount = 2
ptsopen(): tty with state 0 has refcnt 1 (ttyp5) dev refcount = 2
ptsopen(): tty with state 131112 has refcnt 3 (ttyp9) dev refcount = 2
ptsopen(): tty with state 131112 has refcnt 2 (ttyp2) dev refcount = 2
ptsopen(): tty with state 0 has refcnt 1 (ttyp6) dev refcount = 2
ptsopen(): tty with state 0 has refcnt 1 (ttyp3) dev refcount = 2
ptsopen(): tty with state 0 has refcnt 1 (ttyp5) dev refcount = 2
ptsopen(): tty with state 131112 has refcnt 3 (ttyp0) dev refcount = 2
ptsopen(): tty with state 131112 has refcnt 2 (ttyp8) dev refcount = 2
ptsopen(): tty with state 0 has refcnt 1 (ttyp6) dev refcount = 2
ttyrel(): tty refcnt is now 0 (ttyp3)
ptsopen(): tty with state 0 has refcnt 1 (ttyp4) dev refcount = 2
ptsopen(): tty with state 0 has refcnt 0 (ttyp3) dev refcount = 2
panic()

This should never happen. ptsopen() is called with a tty whose refcount is 0,
which means ttyrel() has already freed our struct tty. Note also that I ran
both ttyrel() and ptsopen() with mtx_assert(&Giant, MA_OWNED), so a missing
Giant lock is not the cause.

The box recovers if refcount > 0 and state is set to 0, but all tty/devfs
operations are then blocked for 2-3 minutes.

In the meantime I wrote a perl script that crashes a box within 5-10 minutes
without significant load, just by doing tty operations while writing to
'closed' ttys at the same time.

In some cases devfs open does not check whether the tty is still in use. I
have added checks to ptcopen(), ptsopen() and ttyopen() that return ENXIO when
they are called with a refcount of zero. This should solve the ttwakeup()
panics we have in RELENG_5 and RELENG_6.


>How-To-Repeat:

The easiest way I've found to reproduce it is to log in to the target SMP
machine "host" from two different windows on some other machine
"remote" using ssh.

 In the first window:
 remote % ssh host
 <Scary banner>
 Password: <whatever>
 Last login: Sat Aug 26 12:02:07 2006

 In the second window, log in:
 remote % ssh host
 <Scary banner>
 Password: <whatever>
 Last login: Sat Aug 26 12:02:07 2006

 $ ls -l `tty`
 crw-------  1 jpc  tty    0, 142 Aug 26 18:58 /dev/ttyp1
 $ exit

 Now go back to the first window and write to the other
 terminal's revoked tty:

 $ echo hello > /dev/ttyp1

 Go to the second window and log in again, or try to:

 remote % ssh host
 <Scary banner>
 Password: <whatever>

 ...and that's as far as I get. Our host has panicked.	

>Fix:

--- sys/kern/tty.c	Sun Nov  6 16:09:32 2005
+++ sys/kern/tty.c	Sat Sep 23 13:16:51 2006
@@ -3101,6 +3101,12 @@
 	struct tty	*tp;
 
 	tp = dev->si_tty;
+
+	/* XXX It can happen that devfs_open calls us with tp->t_refcnt == 0 */
+	if (tp == NULL || tp->t_refcnt == 0) {
+		return (ENXIO);
+	}
+
 	s = spltty();
 	/*
 	 * We jump to this label after all non-interrupted sleeps to pick
--- sys/kern/tty_pty.c	Thu Mar 30 16:46:56 2006
+++ sys/kern/tty_pty.c	Sat Sep 23 13:17:36 2006
@@ -170,6 +170,12 @@
 		return(ENXIO);
 	pt = dev->si_drv1;
 	tp = dev->si_tty;
+
+	/* XXX It can happen that devfs_open calls us with tp->t_refcnt == 0 */
+	if (tp == NULL || tp->t_refcnt == 0) {
+		return (ENXIO);
+	}
+
 	if ((tp->t_state & TS_ISOPEN) == 0) {
 		ttyinitmode(tp, 1, 0);
 	} else if (tp->t_state & TS_XCLUDE && suser(td))
@@ -276,6 +282,12 @@
 	if (!dev->si_drv1)
 		return(ENXIO);
 	tp = dev->si_tty;
+
+	/* XXX It can happen that devfs_open calls us with tp->t_refcnt == 0 */
+	if (tp == NULL || tp->t_refcnt == 0) {
+		return (ENXIO);
+	}
+
 	if (tp->t_oproc)
 		return (EIO);
 	tp->t_timeout = -1;

>Release-Note:
>Audit-Trail:
>Unformatted:

