nfs lock failure/hang when using alias address for server from linux

Rick Macklem rmacklem at uoguelph.ca
Sun Aug 21 13:54:39 UTC 2011


John De wrote:
> Hi,
> 
> I have an nfs server running 9-current. Everything works as far
> as nfs i/o operations are concerned.
> 
> From another FreeBSD box, nfs locking works great to the server
> when addressed by both its real ip address and its aliased ip
> address.
> 
> From a Linux system:
> 
> Linux bb05d6403.unx.sas.com 2.6.32-131.0.15.el6.x86_64 #1 SMP Tue May 10 15:42:40 EDT 2011 x86_64 x86_64 x86_64 GNU/Linux
> 
> nfs locking works fine if the mount goes to the real ip address
> of the server. If, however, the server is mounted by using its
> aliased ip address, nfs i/o operations still work fine but file
> locking hangs.
> 
> On the server, the processes:
> 
> root 5995 0.0 0.0 14272 1920 ?? Ss 3:48PM 0:05.33 /usr/sbin/rpcbind -h 10.24.6.38 -h 172.1.1.2 -h 10.24.6.33 -h 10.24.6.34
> root 6021 0.0 0.0 12316 2364 ?? Ss 3:48PM 0:00.65 /usr/sbin/mountd -r -l -h 10.24.6.38 -h 172.1.1.2 -h 10.24.6.33 -h 10.24.6.34
> root 6048 0.0 0.0 10060 1864 ?? Ss 3:48PM 0:00.10 nfsd: master (nfsd)
> root 6049 0.0 0.0 10060 1368 ?? S 3:48PM 0:00.20 nfsd: server (nfsd)
> root 6074 0.0 0.0 274432 2084 ?? Is 3:48PM 0:00.03 /usr/sbin/rpc.statd -d -h 10.24.6.38 -h 172.1.1.2 -h 10.24.6.33 -h 10.24.6.34
> root 6099 0.0 0.0 14400 1780 ?? Ss 3:48PM 0:00.03 /usr/sbin/rpc.lockd -d 9 -h 10.24.6.38 -h 172.1.1.2 -h 10.24.6.33 -h 10.24.6.34
> 
> The server is accessed over UDP as well as TCP, hence the -h
> options for each address. NFSv4 is not enabled at this time. I have
> the debug output of statd and lockd going to /var/log via syslog, but
> nothing useful shows up.
> 
> The interface configuration:
> 
> bce0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
> options=c01bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,VLAN_HWTSO,LINKSTATE>
> ether 84:2b:2b:fd:a1:fc
> inet 10.24.6.38 netmask 0xffff0000 broadcast 10.24.255.255
> inet6 fe80::862b:2bff:fefd:a1fc%bce0 prefixlen 64 scopeid 0x1
> inet 10.24.6.33 netmask 0xffffffff broadcast 10.24.255.255
> inet 10.24.6.34 netmask 0xffffffff broadcast 10.24.255.255
> nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
> media: Ethernet autoselect (1000baseT <full-duplex>)
> status: active
> 
> Above, a mount to 10.24.6.38 works. A mount to either 10.24.6.33
> or 10.24.6.34 works for nfs i/o operations, but hangs for lock
> requests.
> 
> I'd like this to work so I can transition some volumes around to
> different servers.
> 
> Does anyone have any thoughts on the best way to debug this? I've
> looked at what I believe are the obvious areas. I'll probably start
> looking more closely at tcpdump next.
> 
I think you will probably need to capture packets and take a look.
(wireshark interprets the NFS stuff much better than tcpdump, although
 tcpdump is fine for the capture part)
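
As a rough illustration of the capture step, here is a minimal Python sketch that only builds a tcpdump command line. The client address 192.0.2.10 and the output file name are placeholders, not values from this thread, and the helper name is made up. The filter matches on the client alone so that any broadcasts it sends are kept in the capture.

```python
def capture_command(interface, client, outfile):
    """Build a tcpdump argv that records every packet to or from the client.

    The filter is not narrowed to the server's addresses, so packets the
    client sends to a broadcast address are captured too.
    """
    return [
        "tcpdump",
        "-n",             # do not resolve addresses to names
        "-i", interface,  # interface to listen on, e.g. bce0 on the server
        "-s", "0",        # capture whole packets, not just headers
        "-w", outfile,    # write raw packets to a file for wireshark
        f"host {client}",
    ]


if __name__ == "__main__":
    # 192.0.2.10 is a placeholder for the Linux client's address.
    print(" ".join(capture_command("bce0", "192.0.2.10", "nfslock.pcap")))
```

The resulting file can then be opened in wireshark for the decoding.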

A wild guess is that it will be something like:
- Linux client sends an IP broadcast (those Sun RPC protocols love to
  do that)
- FreeBSD server replies via main address and not alias
- Linux client doesn't handle reply that isn't from the address used
  for the mount. (You might poke around on the Linux side, in case there
  is some option or sysctl that affects what addresses their lockd can
  handle.)
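
If the guess above is right, the capture should show packets going to the client from a server address other than the one used for the mount. A minimal Python sketch of that check follows. The packet list is invented purely for illustration (it is not from a real capture), the client address 192.0.2.10 is a placeholder, and the names Packet and mismatched_replies are hypothetical.

```python
from collections import namedtuple

# Only the two fields needed for the check; a real capture has far more.
Packet = namedtuple("Packet", "src dst")


def mismatched_replies(packets, client, mount_addr, server_addrs):
    """Return packets sent to the client from a server address that is
    not the address the client mounted."""
    others = set(server_addrs) - {mount_addr}
    return [p for p in packets if p.dst == client and p.src in others]


if __name__ == "__main__":
    client = "192.0.2.10"  # placeholder for the Linux client
    server_addrs = ["10.24.6.38", "10.24.6.33", "10.24.6.34"]
    # Invented example traffic for a mount made via the alias 10.24.6.33.
    packets = [
        Packet(src=client, dst="10.24.6.33"),  # request to the alias
        Packet(src="10.24.6.38", dst=client),  # reply from the main address
        Packet(src="10.24.6.33", dst=client),  # reply from the alias
    ]
    for p in mismatched_replies(packets, client, "10.24.6.33", server_addrs):
        print("reply from unexpected address:", p.src)
```

An empty result on real data would argue against this particular guess.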

rick
> Thanks,
> John

