Core Dump / panic sleeping thread

Konstantin Belousov kostikbel at gmail.com
Wed Mar 20 09:49:59 UTC 2013


On Tue, Mar 19, 2013 at 07:37:43PM -0400, Rick Macklem wrote:
> Andriy Gapon wrote:
> > on 19/03/2013 19:35 Jeremy Chadwick said the following:
> > > On Tue, Mar 19, 2013 at 06:18:06PM +0100, Michael Landin Hostbaek wrote:
> > [snip]
> > >> Unread portion of the kernel message buffer:
> > >> Sleeping thread (tid 100256, pid 85641) owns a non-sleepable lock
> > >> KDB: stack backtrace of thread 100256:
> > >> #0 0xffffffff808f2d46 at mi_switch+0x186
> > >> #1 0xffffffff8092bb52 at sleepq_wait+0x42
> > >> #2 0xffffffff808f34d6 at _sleep+0x376
> > >> #3 0xffffffff80b4f3ae at vm_object_page_remove+0x2ce
> > >> #4 0xffffffff80b5ac7d at vnode_pager_setsize+0x17d
> > >> #5 0xffffffff8082102c at nfscl_loadattrcache+0x2cc
> > >> #6 0xffffffff80818d37 at nfs_getattr+0x287
> > >> #7 0xffffffff8098f1c0 at vn_stat+0xb0
> > >> #8 0xffffffff809869d9 at kern_statat_vnhook+0xf9
> > >> #9 0xffffffff80986b55 at kern_statat+0x15
> > >> #10 0xffffffff80986c1a at sys_lstat+0x2a
> > >> #11 0xffffffff80bd7ae6 at amd64_syscall+0x546
> > >> #12 0xffffffff80bc3447 at Xfast_syscall+0xf7
> > >> panic: sleeping thread
> > >> cpuid = 0
> > >> KDB: stack backtrace:
> > >> #0 0xffffffff809208a6 at kdb_backtrace+0x66
> > >> #1 0xffffffff808ea8be at panic+0x1ce
> > >> #2 0xffffffff8092ed22 at propagate_priority+0x1d2
> > >> #3 0xffffffff8092fa4e at turnstile_wait+0x1be
> > >> #4 0xffffffff808d8d48 at _mtx_lock_sleep+0xd8
> > >> #5 0xffffffff80820fa4 at nfscl_loadattrcache+0x244
> > >> #6 0xffffffff8081758c at ncl_readrpc+0xac
> > >> #7 0xffffffff80824c45 at ncl_getpages+0x485
> > >> #8 0xffffffff80b5aa0c at vnode_pager_getpages+0x9c
> > >> #9 0xffffffff80b3fc93 at vm_fault_hold+0x673
> > >> #10 0xffffffff80b41cc3 at vm_fault+0x73
> > >> #11 0xffffffff80bd84b4 at trap_pfault+0x124
> > >> #12 0xffffffff80bd8c6c at trap+0x49c
> > >> #13 0xffffffff80bc315f at calltrap+0x8
> > [snip]
> > 
> > I think that the regular mutex which is acquired via NFSLOCKNODE() in
> > nfscl_loadattrcache() can not be held across vnode_pager_setsize.
> > I am not sure though when vap->va_size != np->n_size case is
> > triggered.
> > 
> Yep, I'd agree to that. The same bug is in the old NFS client and
> the new NFS client cribbed the code from there.
> 
> I have attached a simple patch that unlocks the mutex for the
> vnode_pager_setsize() call. Maybe you could test it?
> 
> Thanks for reporting this, rick
> ps: Hopefully "patch" can apply this patch (there have been
>     recent changes to this file, so the line#s could be off).
>     It should be easy to do manually if not. The change is
>     in nfscl_loadattrcache() in sys/fs/nfsclient/nfs_clport.c.
> 
> 
> > > You're going to need to provide the following details:
> > >
> > > 1. Contents of /etc/rc.conf
> > > 2. Contents of /etc/sysctl.conf (if modified)
> > > 3. Contents of /etc/fstab
> > > 4. ifconfig -a
> > > 5. OS used by the NFS server, and all configuration details
> > > pertaining to that system
> > >
> > > You may also be asked to upgrade to 9.1-STABLE, as there may be
> > > fixes for whatever this is in base/stable/9 that are not in
> > > -RELEASE, but this is speculative on my part.
> > >
> > I do not see a need for any of these.
> > 
> > --
> > Andriy Gapon

> --- fs/nfsclient/nfs_clport.c.savit	2013-03-19 18:37:33.000000000 -0400
> +++ fs/nfsclient/nfs_clport.c	2013-03-19 18:44:21.000000000 -0400
> @@ -444,7 +444,9 @@ nfscl_loadattrcache(struct vnode **vpp, 
>  				np->n_size = vap->va_size;
>  				np->n_flag |= NSIZECHANGED;
>  			}
> +			NFSUNLOCKNODE(np);
>  			vnode_pager_setsize(vp, np->n_size);
> +			NFSLOCKNODE(np);
>  		} else {
>  			np->n_size = vap->va_size;
>  		}

I do not like it. As I said in my previous response to Andriy,
I think that moving the vnode_pager_setsize() call to after the unlock
is better, since it reduces races with other threads seeing a half-done
attribute update or making attribute changes simultaneously.
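
To illustrate the ordering I mean, here is a rough userspace sketch. It
is not a patch against nfs_clport.c: every mock_* name, the struct
layout and the MOCK_NSIZECHANGED value are hypothetical stand-ins for
the node mutex, the nfsnode and vnode_pager_setsize(). The only point
is that the size change is recorded under the lock and the pager call,
which may sleep, is made once after the lock is dropped.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the nfsnode; not the real kernel layout. */
struct mock_node {
	bool		locked;		/* models the NFSLOCKNODE() mutex */
	uint64_t	n_size;		/* cached file size */
	int		n_flag;
};

#define	MOCK_NSIZECHANGED	0x1	/* illustrative value only */

/* Bookkeeping so a test can check where the pager call happened. */
static int setsize_calls_with_lock_held = 0;
static uint64_t last_setsize = 0;

static void
mock_lock(struct mock_node *np)
{
	assert(!np->locked);
	np->locked = true;
}

static void
mock_unlock(struct mock_node *np)
{
	assert(np->locked);
	np->locked = false;
}

/* Stands in for vnode_pager_setsize(), which may sleep. */
static void
mock_pager_setsize(struct mock_node *np, uint64_t size)
{
	if (np->locked)
		setsize_calls_with_lock_held++;
	last_setsize = size;
}

/* Update the cached attributes under the lock, resize after unlock. */
static void
mock_loadattrcache(struct mock_node *np, uint64_t va_size)
{
	bool setsize = false;
	uint64_t nsize = 0;

	mock_lock(np);
	if (va_size != np->n_size) {
		np->n_size = va_size;
		np->n_flag |= MOCK_NSIZECHANGED;
		/* Remember the new size; do not call the pager here. */
		setsize = true;
		nsize = va_size;
	}
	/* Any remaining attribute updates would still happen here. */
	mock_unlock(np);

	/* The possibly sleeping call runs with the mutex released. */
	if (setsize)
		mock_pager_setsize(np, nsize);
}
```

Calling mock_loadattrcache() on a node whose cached size differs from
va_size updates n_size and the flag in one locked section, and the
mock pager is entered only after the unlock, so there is no window in
the middle of the update where the lock is dropped and retaken.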