[nfs] process locks in "bo_wwait" on 6.4

pluknet pluknet at gmail.com
Mon Jun 29 16:16:55 UTC 2009


2009/6/29 Kostik Belousov <kostikbel at gmail.com>:
> On Mon, Jun 29, 2009 at 05:18:03PM +0400, pluknet wrote:
>> 2009/6/29 Attilio Rao <attilio at freebsd.org>:
>> > 2009/6/29 pluknet <pluknet at gmail.com>:
>> >> 2009/6/29 Attilio Rao <attilio at freebsd.org>:
>> >>> 2009/6/29 pluknet <pluknet at gmail.com>:
>> >>>> 2009/6/26 pluknet <pluknet at gmail.com>:
>> >>>>> 2009/6/26 pluknet <pluknet at gmail.com>:
>> >>>>>> Hello.
>> >>>>>>
>> >>>>>> While building a module on an NFS-mounted /usr/src,
>> >>>>>> I got an unkillable process waiting forever in bo_wwait.
>> >>>>>
>> >>>>> Small note: the interface on the NFS server has had its MTU
>> >>>>> changed from 1500 to 1450. Can this be a source of the problem?
>> >>>>
>> >>>> This is 100% reproducible. Lock in the same place. Any hints?
>> >>>
>> >>> Can you also show the output of ps?
>> >>> A precise map of what the processes are doing would help.
>> >>> It would also be useful to print traces for the other threads,
>> >>> not only the stuck one.
>> >>>
>> >>
>> >> From another run:
>> >
>> > I'm unable to see who would be locking the buffer object in question.
>> > Do you have INVARIANT_SUPPORT/INVARIANTS on?
>>
>> Yes, I have both.
>>
>> > What revision of /usr/src/sys/kern/vfs_bio.c are you running with?
>> >
>>
>> As of 6.4-R: CVS rev 1.491.2.12.4.1 / SVN rev 183531.
>
> It seems that your MTU change causes NFS requests to never reach the
> network. bo_wwait is the state where a thread waits for all outstanding
> I/O on the bufobj to drain.
>
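
The wait described above can be modelled in user space roughly as
follows. This is a toy Python sketch, not the FreeBSD kernel code; the
class and method names are hypothetical and only illustrate "sleep until
the count of outstanding writes reaches zero".

```python
import threading


class OutstandingWrites:
    """Toy model of a per-object counter of in-flight writes.

    Hypothetical names; this is not how the kernel is implemented,
    only an illustration of the drain-wait idea.
    """

    def __init__(self):
        self._cond = threading.Condition()
        self._count = 0

    def start(self):
        # A write has been issued and has not completed yet.
        with self._cond:
            self._count += 1

    def done(self):
        # A write has completed; wake waiters once nothing is in flight.
        with self._cond:
            self._count -= 1
            if self._count == 0:
                self._cond.notify_all()

    def wait_drained(self, timeout=None):
        # Block until no writes are in flight.
        # Returns False if the timeout expires first.
        with self._cond:
            return self._cond.wait_for(lambda: self._count == 0, timeout)
```

With timeout=None, a write that never completes (for example because
its request never gets a reply) leaves the waiter blocked indefinitely,
which is the symptom reported in this thread.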

It appears that you are right. tcpdump shows the NFS client trying to
send UDP packets 1500 bytes long, which is larger than the server's new
MTU of 1450:

19:40:13.937085 IP (tos 0x0, ttl  64, id 4658, offset 0, flags [+],
proto: UDP (17), length: 1500) client.1662412076 > server.nfs: 1472
write fh 1145,216955/1372174 1779 (1779) bytes @ 0 <unstable>
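
The numbers in that trace fit simple IPv4 fragmentation arithmetic. The
sketch below is illustrative only: it assumes a 20-byte IPv4 header
without options and an 8-byte UDP header, the function name is
hypothetical, and the 2000-byte datagram in the usage note is a made-up
size, not one taken from the trace.

```python
IP_HEADER = 20   # assumed: IPv4 header without options
UDP_HEADER = 8


def fragment_lengths(udp_payload, mtu):
    """Return the IP total length of each fragment of one UDP datagram."""
    remaining = UDP_HEADER + udp_payload      # bytes carried after the IP header
    # Every fragment except the last must carry a multiple of 8 data bytes.
    per_fragment = (mtu - IP_HEADER) // 8 * 8
    lengths = []
    while remaining > 0:
        chunk = min(per_fragment, remaining)
        lengths.append(IP_HEADER + chunk)
        remaining -= chunk
    return lengths
```

For a hypothetical 2000-byte UDP payload, fragment_lengths(2000, 1500)
gives [1500, 548]: a 1500-byte first fragment carrying 1472 bytes of UDP
payload, matching the "length: 1500" and "1472" fields above.
fragment_lengths(2000, 1450) gives [1444, 604], the sizes a sender
fragmenting for a 1450-byte MTU would produce instead.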

While at it, I reverted the MTU on the NFS server back to 1500. After a
few seconds the "locked up" NFS client box continued building the module
as if there had never been any locking problem at all.
So I take this to be expected behavior.

-- 
wbr,
pluknet


More information about the freebsd-stable mailing list