NFS locks, rpcbind port = 0 failed? - try #2

Rick Romero rick at havokmon.com
Tue Oct 15 01:05:06 UTC 2013


  Quoting Rick Macklem <rmacklem at uoguelph.ca>:

> Rick Romero wrote:
>> This is a continuation of "9.1 VM nfs3 & locks over VPN" from
>> freebsd-questions - trying a
>> different angle; maybe it'll jostle someone's memory.  Don't mean to
>> cross-post, but as I pay more attention to the lists I'm reading,
>> this
>> seems to be the better list for NFS issues.
>>
>> I have a FreeBSD 9.2 VM at an offsite hosting company.  hostname
>> nl101vpn
>> OpenVPN is installed on it, routed not bridged mode.
>> I have multiple OSs installed on the local network. I'm already
>> exporting NFS
>> off 9.1 with working file locks.
>>
>> What I see -
>> export nfsv3 or nfsv4 from nl101vpn, mount on local FreeBSD or Linux
>> -
>> locks do not work.
>> export nfsv3 from any local system, mount on nl101vpn - locks work.
>> export nfsv3 from locally installed VM, mount on any local host or
>> nl101vpn
>> - locks work.  No OpenVPN installed on it though. This was to test if
>> virtio drivers might be causing the problem.
>>
>> I even ran a tcpdump to see if something was getting lost - both
>> sides
>> match, nothing is getting dropped
>>
>> nl101vpn - /var/log/messages:
>> Oct 14 12:21:01 nl101 kernel: NLM: failed to contact remote rpcbind,
>> stat =
>> 0, port = 0  (why port 0?)
>> Oct 14 12:23:02 nl101 last message repeated 109 times
>> Oct 14 12:25:48 nl101 last message repeated 177 times
>>
>> I tried binding rpcbind to the VPN interface, but that doesn't seem
>> to
>> work.  tcpdump shows no packets trying to leave the 'Internet'
>> interface.
>>
>> So I haven't exhausted every combination, or completely 100%
>> replicated
>> what's happening offsite, but it's getting pretty ridiculous now...
>> I'm
>> lost, and I need NFS locking to work.
>> Help :)
>
> For rpcbind to work, IP broadcast needs to work between the hosts
> and I suspect that the VPN doesn't support that.
>
> Without rpcbind, I don't think you can get rpc.lockd/rpc.statd
> to work, but I am not sure. (There are command line options for
> these daemons that allow you to set specific port #s, but I don't
> think that will fix the problem, since they still need rpcbind to
> tell them the port# for the remote machines.) These protocols were
> designed in the 1980s for use on a LAN.
>
> Now, nfsv4 shouldn't care less about rpcbind, rpc.lockd. NFSv4 locking
> is handled as a part of the NFSv4 protocol and always uses port #2049.
> I'd suggest you try NFSv4 again and make sure it is using NFSv4 and
> the mount has not fallen back to NFSv3. (For FreeBSD, specify "nfsv4"
> as a mount option. For Linux, specify "vers=4" as a mount option.)
> You can check what the mount is actually using via "nfsstat -m".
> If you assumed the locking for NFSv4 wasn't working because of these
> messages, that isn't the case. If you are using NFSv4 for all mounts,
> you don't need to run rpc.lockd at all (at least for FreeBSD, I'm
> not sure what the daemons do w.r.t. Linux).

  Hi Rick,

Yeah - I thought the VPN might pose a problem, but I can get locks from the
VM side (nl101vpn) via NFS3 back to the main site.  So it doesn't seem to
be an issue with the VPN. After that I created a local VM to ensure it
wasn't a virtio thing, and then upgraded the remote VM to 9.2 (to rule out
any funky custom options the host may have thrown into their 9.1
installer).  nada.
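
Locks working in one direction does not by itself show that the other direction is clean: the "failed to contact remote rpcbind" message is logged on nl101vpn, which suggests the server side is the one trying to reach the client's rpcbind. A minimal sketch for checking that from nl101vpn follows. The function names are hypothetical, the target host is whatever client address you pass on the command line, and it only tests TCP (rpcbind and lockd may also be using UDP, which this does not cover).

```python
import socket
import sys


def tcp_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to (host, port) succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, timed out, unreachable, or name lookup failed.
        return False


def check_rpc_ports(host, ports=(111, 2049)):
    """Map each port (rpcbind = 111, nfs = 2049) to its TCP reachability."""
    return {port: tcp_port_open(host, port) for port in ports}


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python3 check_ports.py <client-address>
    for port, ok in check_rpc_ports(sys.argv[1]).items():
        print(port, "open" if ok else "unreachable")
```

If port 111 on the client is unreachable from the server while the reverse direction works, that would point at routing or filtering on the VPN path rather than at lockd itself.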

So I'm re-trying with NFS4 - though my mount does show it was mounted nfs4
(from Linux) last time I tried:
nl101vpn:/first on /mnt type nfs4
(rw,relatime,vers=4,rsize=65536,wsize=65536,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,clientaddr=172.16.1.92,minorversion=0,local_lock=none,addr=10.9.8.6)
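
The option string is easier to read once split up. The sketch below parses the line above (copied verbatim) and pulls out the fields relevant to locking. The helper name is hypothetical. Per nfs(5) on Linux, local_lock=none means lock requests are sent to the server rather than handled on the client.

```python
def parse_mount_options(opts):
    """Split a '(k=v,flag,...)' mount option string into a dict.

    Bare flags such as 'rw' map to True.
    """
    result = {}
    for item in opts.strip().strip("()").split(","):
        if not item:
            continue
        key, sep, value = item.partition("=")
        result[key] = value if sep else True
    return result


# Option string copied from the mount output above.
MOUNT_OPTS = (
    "(rw,relatime,vers=4,rsize=65536,wsize=65536,namlen=255,hard,"
    "proto=tcp,timeo=600,retrans=2,sec=sys,clientaddr=172.16.1.92,"
    "minorversion=0,local_lock=none,addr=10.9.8.6)"
)

opts = parse_mount_options(MOUNT_OPTS)
for key in ("vers", "minorversion", "proto", "local_lock", "clientaddr", "addr"):
    print(key, "=", opts.get(key))
```

So the mount really is NFSv4.0 over TCP, with locks sent to the server, which matches the suggestion that rpc.lockd should not be involved for this mount.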

After trying again, still doesn't work.  Though now I noticed the error is
different.  I have a little perl script that I test with, and the line is:
flock(LOCKFILE, LOCK_SH) or die "Can't get shared lock on $lock_file: $!\n";
Before I would get 'permission denied' - which (IIRC) would also happen when
I forgot to run lockd or statd.
Now with NFS4 it says, 'Bad file descriptor'
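
One possible cause of 'Bad file descriptor', offered as an assumption since the open() call in the test script is not shown: Linux NFS clients emulate flock() with fcntl byte-range locks (see flock(2)), and with those a shared lock needs a descriptor opened for reading and an exclusive lock needs one opened for writing. Asking for LOCK_SH on a write-only handle could then fail with EBADF. The sketch below tries each open-mode and lock-type combination on a file and reports the result. The function name and the command-line usage are hypothetical.

```python
import errno
import fcntl
import os
import sys


def try_lock(path, open_flags, lock_op):
    """Open path with open_flags, attempt a non-blocking flock.

    Returns 'ok' on success, otherwise the errno name (e.g. 'EBADF').
    """
    fd = os.open(path, open_flags | os.O_CREAT, 0o644)
    try:
        fcntl.flock(fd, lock_op | fcntl.LOCK_NB)
        fcntl.flock(fd, fcntl.LOCK_UN)
        return "ok"
    except OSError as exc:
        return errno.errorcode.get(exc.errno, str(exc.errno))
    finally:
        os.close(fd)


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python3 locktest.py /mnt/some-file-on-the-nfs-mount
    modes = {"O_RDONLY": os.O_RDONLY, "O_WRONLY": os.O_WRONLY, "O_RDWR": os.O_RDWR}
    locks = {"LOCK_SH": fcntl.LOCK_SH, "LOCK_EX": fcntl.LOCK_EX}
    for mode_name, mode in modes.items():
        for lock_name, lock in locks.items():
            print(mode_name, lock_name, try_lock(sys.argv[1], mode, lock))
```

If only the mismatched combinations fail on the NFSv4 mount, the locking itself is probably working and the test script's open mode is the thing to change.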

After much more testing I've gotten a single Linux VM (but no other VMs on
the same host, or my other host, yet they're all from the same template) to
get a lock (on NFSv3).
Failed locks show in the logs:
NLM: failed to contact remote rpcbind, stat = 0, port = 0    (FreeBSD)
NLM: failed to contact remote rpcbind, stat = 7, port = 28416  (Linux)

I have 2 FreeBSD boxes, 9.1 and 7.2.  Neither seems to be relaying its
port?
The Linux ones that fail have a port, but apparently can't be contacted. Or
is 'port' in the log not really referring to a port number?
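
On the 'port' question, a bit of arithmetic is suggestive: 28416 is 0x6F00, and swapping its two bytes gives 0x006F = 111, the rpcbind port. One plausible reading, an assumption rather than something confirmed from the kernel source here, is that the log prints a port still in network byte order on a little-endian machine, so 28416 would be the rpcbind port itself and not a lockd port handed back by the client. The helper name below is hypothetical.

```python
import struct


def swap16(value):
    """Swap the two bytes of a 16-bit unsigned integer."""
    return struct.unpack("<H", struct.pack(">H", value))[0]


logged_port = 28416
print(hex(logged_port))     # 0x6f00
print(swap16(logged_port))  # 111
```

Under that reading, "stat = 7, port = 28416" would mean the rpcbind call itself returned an error, while "stat = 0, port = 0" would mean the call succeeded (0 is RPC_SUCCESS) but no port was returned for the lock service, which is what portmap reports when a program is not registered.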

The OpenVPN 'server' is another Linux VM - but on a different host than the
working Linux VM :P  None of the Linux VMs on that host can get a lock. 
How's THAT for weird? :)

I need some aspirin. 

Rick
 

