Strange NFS problem implicating nfsuserd?
Graham Allan
allan at physics.umn.edu
Mon Jul 6 18:52:28 UTC 2015
On 7/1/2015 8:14 PM, Rick Macklem wrote:
> Graham Allan wrote:
>>
>> I was always able to get a failure within 10-60 minutes or so, so having
>> the nfsuserd cache timeout at 600 minutes seems like it should eliminate
>> any intermittent id lookup issues.
>>
> I'll take another look at nfsuserd.c. Maybe it does something stupid like
> getting the length of the argument wrong (trailing blank or null or something
> like that, that doesn't show up when it is printed out). All I can think of
> is a subtle bug in nfsuserd.c when the argument is specified.
>
>> I guess I could try...
>> (1) rpcdebug on the linux client, though I'm not sure which flags to
>> enable to log idmapping issues.
>> (2) watch nfsuserd with truss and look for different behaviors.
>> (3) capture NFS traffic, examine with wireshark
>>
> I'd try #3 if I were you and see if the owner and owner_group names look
> right.
>
> I'll post if I find anything in nfsuserd.c, rick
Thanks for indulging me, Rick. As you might have expected though, it's
time for me to follow up with my mea culpa that my problem
identification was entirely wrong. I knew none of it made sense, but
perhaps it's fate that I need to post something embarrassingly wrong to
find the true cause :-)
The reason things became stable when I altered the nfsuserd flags is
that I also stopped our configuration management system on the affected
systems so the flags wouldn't get reverted during testing. And of course
it was something else the configuration management was doing that was
responsible.
We've had a lot of workstation movement over the last few months, with
machines being moved to new buildings and new IP addresses, though the
hostnames remain the same. To try to address this, a periodic reload of
mountd was added: the list of permitted hostnames is in /etc/netgroup,
and it seems that mountd doesn't pick up changed DNS values for the
netgroup hosts without a HUP.
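
One way to avoid HUPing mountd on a fixed schedule would be to signal it
only when a netgroup host actually resolves to a different address. The
following is a minimal sketch of that idea, not something taken from our
setup: the state file path, the pid file path, and all function names are
assumptions for illustration, and the netgroup parsing only looks at the
host field of each (host,user,domain) triple.

```python
import json
import os
import re
import signal
import socket

# Matches one (host,user,domain) triple; fields may be empty.
TRIPLE = re.compile(r"\(([^,()]*),([^,()]*),([^,()]*)\)")


def netgroup_hosts(text):
    """Return the sorted, de-duplicated host fields found in netgroup text."""
    hosts = set()
    for host, _user, _domain in TRIPLE.findall(text):
        host = host.strip()
        # An empty field is a wildcard and "-" means "no host": skip both.
        if host and host != "-":
            hosts.add(host)
    return sorted(hosts)


def resolve_all(hosts, resolver=socket.gethostbyname_ex):
    """Map each hostname to a sorted list of addresses ([] if lookup fails)."""
    result = {}
    for host in hosts:
        try:
            result[host] = sorted(resolver(host)[2])
        except OSError:
            # Note: a transient DNS failure will look like a change.
            result[host] = []
    return result


def reload_if_changed(netgroup_path="/etc/netgroup",
                      state_path="/var/db/netgroup-addrs.json",  # assumed path
                      pid_path="/var/run/mountd.pid",            # assumed path
                      resolver=socket.gethostbyname_ex,
                      signal_fn=os.kill):
    """Send SIGHUP to mountd only if resolved addresses differ from last run.

    Returns True when a signal was sent, False otherwise.
    """
    with open(netgroup_path) as f:
        current = resolve_all(netgroup_hosts(f.read()), resolver)

    try:
        with open(state_path) as f:
            previous = json.load(f)
    except (OSError, ValueError):
        previous = None  # first run, or unreadable state file

    if previous == current:
        return False

    with open(state_path, "w") as f:
        json.dump(current, f)

    if previous is None:
        return False  # nothing to compare against yet; just record state

    with open(pid_path) as f:
        signal_fn(int(f.read().strip()), signal.SIGHUP)
    return True
```

Calling reload_if_changed() from cron in place of the unconditional reload
would still disturb I/O when an address really does change, but not on
every run.
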
I guess I never thought that reloading mountd could cause I/O
disruption, but the man page does of course allude to this when
discussing the "-S" flag. I've used lots of types of unix for a long
time; I never thought I needed to read the mountd man page! For now I
simply stopped doing any reloads, but I could probably start using that
flag instead...
Graham