kern/109277: kernel ppp(4) botches clist reservation in RELENG_6

Dmitry Pryanishnikov dmitry at atlantis.dp.ua
Sun Feb 18 14:40:06 UTC 2007


>Number:         109277
>Category:       kern
>Synopsis:       kernel ppp(4) botches clist reservation in RELENG_6
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    freebsd-bugs
>State:          open
>Quarter:        
>Keywords:       
>Date-Required:
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Sun Feb 18 14:40:05 GMT 2007
>Closed-Date:
>Last-Modified:
>Originator:     Dmitry Pryanishnikov
>Release:        FreeBSD 6.2-STABLE i386
>Organization:
Atlantis ISP
>Environment:
System: FreeBSD homelynx.homenet 6.2-STABLE FreeBSD 6.2-STABLE #0: Sun Feb 18 05:55:06 EET 2007 root at homelynx.homenet:/usr/obj/usr/RELENG_6/src/sys/lynx i386
    Hardware: Intel D845EBG2 mainboard, Pentium(R) 4 CPU 2.80GHz,
    512 MB RAM with ECC check and correction enabled.
    The system is rock-stable when NOT using ppp(4).

>Description:
    Very rare (roughly once a month) spontaneous crashes occur while
    kernel ppp(4) and the system console are in active use at the same
    time. When the console is in X.org mode, the system just silently
    reboots; when it is in text mode, there is some chance of getting a
    valid crash dump. The most recent such crash was "panic: clist
    reservation botch" (see the cblock_alloc() function in
    /sys/kern/tty_subr.c) on RELENG_6 as of 1-Feb-2007. The backtrace was:
    
panic(c05f55c8,0,c04cd3ee,20,38,...) at 0xc049a8a4 = panic+0xa8
b_to_q(c37fd6a8,24,c36d6838,c36d6838,0,...) at 0xc04cd60e = b_to_q+0xce
pppasyncstart(c62bfc00,c36cd50c,0,c05f9daf,3e3) at 0xc0508ff4 = pppasyncstart+0x108
pppoutput(c36cd400,c37fd600,c39b7a70,c39debdc,0,...) at 0xc0506a36 = pppoutput+0x326
ip_output(c37fd600,0,d9bc79b8,0,0,c3a7e654) at 0xc0526ab4 = ip_output+0xa64
tcp_output(c3a81cb0) at 0xc052eee5 = tcp_output+0xe05
tcp_input(c37fde00,14,d9bc7b80,0,0,...) at 0xc052d467 = tcp_input+0x28df
ip_input(c37fde00,c37fde74,0,8c,c37fde00,...) at 0xc05248ad = ip_input+0x75d
div_send(c3a826f4,0,c37fde00,c6a27120,0,...) at 0xc079bc1b = div_send+0x17b
sosend(c3a826f4,c6a27120,d9bc7c40,c37fde00,0,0,c382c000) at 0xc04d1fd3 = sosend+0x5eb
kern_sendit(c382c000,3,d9bc7cbc,0,0,0) at 0xc04d71a4 = kern_sendit+0x104
sendit(c382c000,3,d9bc7cbc,0,bfbdebfc,...) at 0xc04d7077 = sendit+0x147
sendto(c382c000,d9bc7d04) at 0xc04d72d5 = sendto+0x4d
syscall(3b,3b,bfbe003b,1,8c,...) at 0xc05c62c7 = syscall+0x22f
Xint0x80_syscall() at 0xc05b495f = Xint0x80_syscall+0x1f

    I looked through the closed PRs and found kern/25632, which
    describes a similar problem (that one involved a RELENG_4 kernel
    interacting with the USB stack, but the result, a botched clist
    reservation, was the same). So kernel ppp apparently lacks proper
    locking around its clist operations in modern (at least RELENG_6)
    kernels.

>How-To-Repeat:
    I've shamelessly borrowed the cblock_alloc() recursion detection
    idea from kern/25632:

--- tty_subr.c.orig	Fri Jan  7 01:35:40 2005
+++ tty_subr.c	Sun Feb 18 14:37:29 2007
@@ -94,17 +94,30 @@
  * Remove a cblock from the cfreelist queue and return a pointer
  * to it.
  */
+static int someone_here = 0;
+#define N1MAX 100000
 static __inline struct cblock *
 cblock_alloc()
 {
 	struct cblock *cblockp;
+	int n1;
 
+	for (n1=0; n1<N1MAX; n1++)
+	    if (someone_here != 0) panic("cblock_alloc recursion a");
+	someone_here++;
+	for (n1=0; n1<N1MAX; n1++)
+	    if (someone_here != 1) panic("cblock_alloc recursion b");
 	cblockp = cfreelist;
 	if (cblockp == NULL)
 		panic("clist reservation botch");
 	cfreelist = cblockp->c_next;
 	cblockp->c_next = NULL;
 	cfreecount -= CBSIZE;
+	for (n1=0; n1<N1MAX; n1++)
+	    if (someone_here != 1) panic("cblock_alloc recursion c");
+	someone_here--;
+	for (n1=0; n1<N1MAX; n1++)
+	    if (someone_here != 0) panic("cblock_alloc recursion d");
 	return (cblockp);
 }
 
    With the kernel patched this way, I got the "cblock_alloc recursion a"
    panic almost immediately after setting up a ppp(4) connection and
    flood-pinging the remote peer with 'ping -f' while simultaneously
    hammering on the keyboard:
    
#11 0xc049abd3 in panic (fmt=0xc05f6208 "cblock_alloc recursion a")
    at /usr/RELENG_6/src/sys/kern/kern_shutdown.c:549
#12 0xc04cd7b7 in putc (chr=39, clistp=0xc36ef000)
    at /usr/RELENG_6/src/sys/kern/tty_subr.c:106
#13 0xc04c6b6b in ttyinput (c=39, tp=0xc36ef000)
    at /usr/RELENG_6/src/sys/kern/tty.c:657
#14 0xc05a81e9 in sckbdevent (thiskbd=0xc064f440, event=0, arg=0xc0667000)
    at linedisc.h:122
#15 0xc05974ed in atkbd_intr (kbd=0xc064f440, arg=0x0)
    at /usr/RELENG_6/src/sys/dev/atkbdc/atkbd.c:503
#16 0xc059860a in atkbdintr (arg=0xc1015000)
    at /usr/RELENG_6/src/sys/dev/atkbdc/atkbd_atkbdc.c:174
#17 0xc0487712 in ithread_execute_handlers (p=0xc36aa000, ie=0xc35afc00)
    at /usr/RELENG_6/src/sys/kern/kern_intr.c:682
#18 0xc0487836 in ithread_loop (arg=0xc36e2660)
    at /usr/RELENG_6/src/sys/kern/kern_intr.c:765
#19 0xc0486980 in fork_exit (callout=0xc04877d0 <ithread_loop>,
    arg=0xc36e2660, frame=0xd5633d38)
    at /usr/RELENG_6/src/sys/kern/kern_fork.c:821
#20 0xc05b551c in fork_trampoline ()
    at /usr/RELENG_6/src/sys/i386/i386/exception.s:208

    It looks like ppp(4) enters cblock_alloc() and gets preempted, and
    then ttyinput() re-enters cblock_alloc() via putc().

>Fix:
    I'm ready to provide further debugging information on this issue.
    Unfortunately, I'm not familiar enough with the locking concepts
    in modern FreeBSD kernels (and in the tty subsystem in particular)
    to write the fix myself.
>Release-Note:
>Audit-Trail:
>Unformatted:

