(9.2) panic under disk load (gam_server / knlist_remove_kq)

Konstantin Belousov kostikbel at gmail.com
Tue Jul 16 06:06:02 UTC 2013


On Mon, Jul 15, 2013 at 06:50:09PM +0200, Patrick Lamaiziere wrote:
> On Mon, 15 Jul 2013 16:26:47 +0200,
> Mateusz Guzik <mjguzik at gmail.com> wrote:
> 
> Hello,
> 
> > > > I'm seeing a panic while trying to build a poudriere repository.
> > > > 
> > > > As far as I can see, it always happens when gam_server is started
> > > > (i.e. xfce is running) and under disk load (poudriere bulk build).
> > > > (That is something new; the box was pretty stable.)
> > > > 
> > > > the complete crash dump (core.0.txt) is here:
> > > > http://user.lamaiziere.net/patrick/panic_gam_server.txt
> > > 
> > > With WITNESS and ASSERTION on, I see a warning that looks related:
> > > 
> > > Jul 14 16:23:29 roxette kernel: WARNING: destroying knlist w/
> > > knotes on it!
> > > 
> > > and the box panics just after this.
> > > 
> > 
> > Can you switch that printf to a panic and paste the backtrace?
> 
> Yes, the full core.txt:
> http://user.lamaiziere.net/patrick/panic_knlist_wknotes.txt 
> 
> panic: WARNING: destroying knlist w/ knotes on it!
> 
> Unread portion of the kernel message buffer:
> lock order reversal:
>  1st 0xfffffe00b678c098 ufs (ufs) @ /usr/src/sys/modules/nullfs/../../fs/nullfs/null_vnops.c:620
>  2nd 0xffffffff813ebda0 allproc (allproc) @ /usr/src/sys/kern/kern_descrip.c:2822
> KDB: stack backtrace:
> #0 0xffffffff8094bc26 at kdb_backtrace+0x66
> #1 0xffffffff809603ae at _witness_debugger+0x2e
> #2 0xffffffff80961a85 at witness_checkorder+0x865
> #3 0xffffffff8091b1ea at _sx_slock+0x5a
> #4 0xffffffff808d30ff at mountcheckdirs+0x3f
> #5 0xffffffff809a890f at dounmount+0x2df
> #6 0xffffffff809a913e at sys_unmount+0x3ce
> #7 0xffffffff80cec429 at amd64_syscall+0x2f9
> #8 0xffffffff80cd6d47 at Xfast_syscall+0xf7
> panic: WARNING: destroying knlist w/ knotes on it!
> 
> cpuid = 3
> KDB: stack backtrace:
> #0 0xffffffff8094bc26 at kdb_backtrace+0x66
> #1 0xffffffff80912da8 at panic+0x1d8
> #2 0xffffffff808db269 at knlist_destroy+0x39
> #3 0xffffffff809afd7e at destroy_vpollinfo+0x1e
> #4 0xffffffff809b13ef at vdropl+0x18f
> #5 0xffffffff809b404c at vputx+0xac
> #6 0xffffffff8299ce13 at null_reclaim+0x103
> #7 0xffffffff80d912eb at VOP_RECLAIM_APV+0xdb
> #8 0xffffffff809b20a2 at vgonel+0x112
> #9 0xffffffff809b4cd9 at vflush+0x2b9
> #10 0xffffffff8299bbb3 at nullfs_unmount+0x43
> #11 0xffffffff809a8982 at dounmount+0x352
> #12 0xffffffff809a913e at sys_unmount+0x3ce
> #13 0xffffffff80cec429 at amd64_syscall+0x2f9
> #14 0xffffffff80cd6d47 at Xfast_syscall+0xf7
> Uptime: 4m47s
> Dumping 915 out of 3544 MB:..2%..11%..21%..32%..41%..51%..62%..72%..81%..91%
> 
> #0  doadump (textdump=<value optimized out>) at pcpu.h:234
> 234	pcpu.h: No such file or directory.
> 	in pcpu.h
> (kgdb) #0  doadump (textdump=<value optimized out>) at pcpu.h:234
> #1  0xffffffff80913354 in kern_reboot (howto=260)
>     at /usr/src/sys/kern/kern_shutdown.c:449
> #2  0xffffffff80912d79 in panic (fmt=0x1 <Address 0x1 out of bounds>)
>     at /usr/src/sys/kern/kern_shutdown.c:637
> #3  0xffffffff808db269 in knlist_destroy (knl=<value optimized out>)
>     at /usr/src/sys/kern/kern_event.c:1961
> #4  0xffffffff809afd7e in destroy_vpollinfo (vi=0xfffffe007ffec690)
>     at /usr/src/sys/kern/vfs_subr.c:3583
> #5  0xffffffff809b13ef in vdropl (vp=0xfffffe00b678c000)
>     at /usr/src/sys/kern/vfs_subr.c:2530
> #6  0xffffffff809b404c in vputx (vp=0xfffffe00b678c000, func=2)
>     at /usr/src/sys/kern/vfs_subr.c:2358
> #7  0xffffffff8299ce13 in ?? ()
> #8  0xffffffff8299d510 in ?? ()
> #9  0xfffffe00000002ec in ?? ()
> #10 0xffffff81090b8750 in ?? ()
> #11 0x0000000000000246 in ?? ()
> #12 0xfffffe002af55000 in ?? ()
> #13 0xffffffff81576950 in w_locklistdata ()
> #14 0xffffffff81322ce0 in pmc___lock_failed ()
> #15 0xffffffff8299d8a0 in ?? ()
> #16 0xffffff81090b87b0 in ?? ()
> #17 0x0000000000000000 in ?? ()
> (kgdb) 

Hm, try this (mostly naive) patch.  If the kernel no longer panics for
you, check that gam_server is still operational.  If not, I have
something else to try.

diff --git a/sys/kern/vfs_subr.c b/sys/kern/vfs_subr.c
index e64f379..e2c2813 100644
--- a/sys/kern/vfs_subr.c
+++ b/sys/kern/vfs_subr.c
@@ -3455,6 +3455,8 @@ vfs_msync(struct mount *mp, int flags)
 static void
 destroy_vpollinfo(struct vpollinfo *vi)
 {
+
+	knlist_clear(&vi->vpi_selinfo.si_note, 1);
 	seldrain(&vi->vpi_selinfo);
 	knlist_destroy(&vi->vpi_selinfo.si_note);
 	mtx_destroy(&vi->vpi_lock);
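
To illustrate the ordering the patch relies on, here is a minimal userland
sketch, not kernel code.  All toy_* names and TOY_EV_EOF are hypothetical
stand-ins for the knlist/knote machinery; the only thing taken from the
report above is the warning string.  The assumption being modelled is that
clearing the list detaches every note (and flags it as end-of-file) before
the list itself is destroyed, so the destroy step finds nothing attached.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

#define TOY_EV_EOF 0x1	/* hypothetical flag, stands in for EV_EOF */

struct toy_knlist;

struct toy_knote {
	struct toy_knote *next;		/* link on the owning list */
	struct toy_knlist *list;	/* list we are attached to, or NULL */
	int flags;
};

struct toy_knlist {
	struct toy_knote *head;
};

/* Attach a note to the head of the list. */
static void
toy_knlist_add(struct toy_knlist *knl, struct toy_knote *kn)
{
	kn->flags = 0;
	kn->list = knl;
	kn->next = knl->head;
	knl->head = kn;
}

/* Detach every note and mark it so that its owner sees end-of-file. */
static void
toy_knlist_clear(struct toy_knlist *knl)
{
	struct toy_knote *kn;

	while ((kn = knl->head) != NULL) {
		knl->head = kn->next;
		kn->next = NULL;
		kn->list = NULL;
		kn->flags |= TOY_EV_EOF;
	}
	assert(knl->head == NULL);
}

/* Returns the number of notes still attached; nonzero is the warning case. */
static int
toy_knlist_destroy(struct toy_knlist *knl)
{
	struct toy_knote *kn;
	int left = 0;

	for (kn = knl->head; kn != NULL; kn = kn->next)
		left++;
	if (left != 0)
		fprintf(stderr,
		    "WARNING: destroying knlist w/ knotes on it!\n");
	knl->head = NULL;
	return (left);
}

/* Attach two notes, optionally clear, then destroy the list. */
static int
toy_demo(int clear_first)
{
	struct toy_knlist knl = { NULL };
	struct toy_knote a, b;

	toy_knlist_add(&knl, &a);
	toy_knlist_add(&knl, &b);
	if (clear_first)
		toy_knlist_clear(&knl);
	return (toy_knlist_destroy(&knl));
}
```

In this model toy_demo(0) destroys the list with two notes still attached
and prints the warning, leaving both notes pointing at a dead list, while
toy_demo(1) clears first and destroys an empty list.  Whether the real
notes end up in a state gam_server copes with is exactly what needs
checking after applying the patch.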

