Hang in VOP_LOCK1_APV on 8-STABLE with NFS.

Rick Macklem rmacklem at uoguelph.ca
Fri Jan 7 19:37:26 UTC 2011

> Hi,
> OpenOffice hangs on NFS when I try to save a file or even when I try
> to open the save dialog in this case.
> $ 17:25:35 ronald at ronald [~]
> procstat -kk 85575
> 85575 100322 soffice.bin initial thread mi_switch+0x176
> sleepq_wait+0x3b __lockmgr_args+0x655 vop_stdlock+0x39
> VOP_LOCK1_APV+0x46
> _vn_lock+0x44 vget+0x67 vfs_hash_get+0xeb nfs_nget+0xa8
> nfs_lookup+0x65e
> VOP_LOOKUP_APV+0x40 lookup+0x48a namei+0x518 kern_statat_vnhook+0x82
> kern_statat+0x15 lstat+0x22 syscallenter+0x186 syscall+0x40
> 85575 100502 soffice.bin - mi_switch+0x176
> sleepq_catch_signals+0x309 sleepq_timedwait_sig+0x12 _sleep+0x1a0
> do_cv_wait+0x639 __umtx_op_cv_wait+0x51 syscallenter+0x186
> syscall+0x40
> Xfast_syscall+0xe2
> 85575 100576 soffice.bin - mi_switch+0x176
> sleepq_catch_signals+0x309 sleepq_timedwait_sig+0x12 _sleep+0x1a0
> do_cv_wait+0x639 __umtx_op_cv_wait+0x51 syscallenter+0x186
> syscall+0x40
> Xfast_syscall+0xe2
> 85575 100577 soffice.bin - mi_switch+0x176
> sleepq_catch_signals+0x309 sleepq_wait_sig+0xc _sleep+0x25d
> kern_accept+0x19c accept+0xfe syscallenter+0x186 syscall+0x40
> Xfast_syscall+0xe2
> 85575 100578 soffice.bin - mi_switch+0x176
> sleepq_catch_signals+0x309 sleepq_wait_sig+0xc _cv_wait_sig+0x10e
> seltdwait+0xed poll+0x457 syscallenter+0x186 syscall+0x40
> Xfast_syscall+0xe2
> 85575 100579 soffice.bin - mi_switch+0x176
> sleepq_catch_signals+0x309 sleepq_timedwait_sig+0x12
> _cv_timedwait_sig+0x11d seltdwait+0x79 poll+0x457 syscallenter+0x186
> syscall+0x40 Xfast_syscall+0xe2
> $ 17:25:35 ronald at ronald [~]
> uname -a
> FreeBSD ronald.office.base.nl 8.2-PRERELEASE FreeBSD 8.2-PRERELEASE #6:
> Mon Dec 27 23:49:30 CET 2010
> root at ronald.office.base.nl:/usr/obj/usr/src/sys/GENERIC amd64
I think all the above tells us is that the thread is waiting for
a vnode lock. The question then becomes "what is holding a lock
on that vnode and why?".
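As a rough illustration of how to pick such threads out of a larger
dump, here is a minimal sketch. It assumes the procstat output has been
saved to a file with one thread per line (the way procstat prints it,
not the way the mail got re-wrapped), and the function name
"lockwaiters" is just made up for this example:

```shell
# Sketch: list PID, TID and command of every thread whose kernel stack
# passes through the lock manager (__lockmgr_args), i.e. threads that
# are sleeping on a lock. Assumes one thread per input line.
# Reads the named files, or stdin when no file is given.
lockwaiters() {
    awk '/__lockmgr_args/ { print $1, $2, $3 }' "$@"
}
```

Something like "procstat -kk 85575 | lockwaiters" should then print only
the waiting thread. It only shows who is waiting, not who holds the lock.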

> It is not possible to exit or kill soffice.bin. I had a slightly
> different procstat stack before, but that was fixed a couple of days
> ago.

Yea, it will be in an uninterruptible sleep when waiting for a vnode lock.

> Any thoughts? Enabling local locks in NFS doesn't fix it.

Here are some things you could try:
1 - apply the attached patch. It fixes a known problem w.r.t. the
    client side of the krpc. Not likely to fix this, but I can hope:-)
2 - If #1 doesn't fix the problem:
    - before making it hang, start capturing packets via:
    # tcpdump -s 0 -w xxx host server
    - then make it hang, kill the above and
    # procstat -ka
    # ps axHlww
    and capture the output of both of these. Hopefully these 2 commands
    will indicate what is holding the vnode lock and maybe, why. The
    "xxx" file can be looked at in wireshark to see what/if any NFS
    traffic is happening.
    If you aren't comfortable looking at the above, you can email them
    to me and I'll take a stab at them someday.
3 - Try the experimental client to see if it behaves differently. The
    mount command is:
    # mount -t newnfs -o nfsv3,<the options you already use> server:/path /mntpath
    (This might identify if the regular client has an infrequently executed code
     path that forgets to unlock the vnode, since it uses a somewhat different RPC
     layer. The buffer cache handling etc are almost the same, but the RPC stuff is
     fairly different.)
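
For #2, a first pass over the saved "procstat -ka" output could be
sketched roughly as below. This is only a heuristic and an assumption
on my part: it lists threads that have "nfs" somewhere on their line
but are not themselves sleeping in the lock manager, on the theory that
one of them may be the lock holder. The function name "nfs_candidates"
and the file name "procstat.out" are made up for the example, and the
input must have one thread per line:

```shell
# Heuristic sketch: print PID, TID and command of threads whose line
# mentions nfs but which are NOT waiting in __lockmgr_args. These are
# only candidates for holding the vnode lock, not proof of it.
# Note: the match also hits command names such as nfsiod.
nfs_candidates() {
    awk '/nfs/ && !/__lockmgr_args/ { print $1, $2, $3 }' "$@"
}
```

Usage would be something like "procstat -ka > procstat.out" followed by
"nfs_candidates procstat.out". The full stacks of whatever it prints
still need to be read by hand.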

> The nfs server is an up-to-date Linux Debian 5 with kernel 2.6.26.
I'm afraid I can't blame Linux (at least not until we have more info;-).

> If more info is needed, I can easily reproduce this.

See above #2.

Good luck with it and let us know how it goes, rick
