zfs/nfs/proftpd problem

Daniel Braniss danny at cs.huji.ac.il
Fri Feb 22 20:32:49 UTC 2013


> Daniel Braniss wrote:
> > after upgrading the 'ftp storage' from 8.3 to 9.1-stable, our ftp
> > server is stuck.
> > 
> > the old one (ProFTPD Version 1.3.2), which worked until the upgrade,
> > is stuck in nlmrcv:
> > ...
> > 10000 1213 992 0 44 0 7340 3692 nlmrcv D ?? 0:08.07 proftpd: ftp - crawl-66-249-73-193.googlebot.com: anonymous/googlebot at google.com: RETR 00690145.JPG (proftpd)
> > ...
> > 
> I suspect you know that this is waiting for a reply from some rpc.lockd.
> 
> > so we upgraded the ftp server too, to 9.1/ProFTPD Version 1.3.4b, and
> > this one is stuck in rpccwnd:
> > 10000 1197 984 0 20 0 32292 4792 rpccwnd D ?? 0:00.01 proftpd: ftp - mbpro.cs.huji.ac.il: anonymous/mozilla at example.com: LIST (proftpd)
> > 
> This one is stuck in the client side of UDP for the krpc, in the
> primitive congestion control stuff that is there.
maybe it's too primitive?

> 
> > 
> > any wise suggestions :-)
> > 
> Well, maybe not wise, but you may already be aware that NFS etc over
> UDP and the NLM are two of my favourite things (especially the NLM).
> 
> Basically, it appears to be having difficulties doing RPCs over UDP,
> at least for the NLM (rpc.lockd), suggesting some transport related
> issue.
> 
> First, make sure rpc.statd and rpc.lockd are running on the NFS server
> and all clients (or disable use of it via the "nolockd" mount option).
all are running both rpc.statd and rpc.lockd
> 
> You can also do a "netstat -s" and see if there is a non-zero count
> for "fragments dropped due to timeout" in the IP section. (This happens
> when your network fabric can't handle the burst of IP fragments
> generated by a large RPC message over UDP.)
> 

there are none on the client (the ftp server)
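
For anyone who wants to script that check, here is a minimal sketch in Python. It assumes the counter line in "netstat -s" reads either "fragments dropped after timeout" or "fragments dropped due to timeout" (the exact wording can differ between systems); the function name and the sample text are made up for illustration.

```python
import re

# Matches e.g. "    7 fragments dropped after timeout".
# Assumption: the wording is "after timeout" or "due to timeout".
_DROPPED = re.compile(
    r"^\s*(\d+)\s+fragments? dropped (?:after|due to) timeout",
    re.MULTILINE,
)

def fragments_dropped_on_timeout(netstat_text):
    """Return the fragment-timeout counter from `netstat -s` text.

    Returns None when no matching line is present, so the caller can
    tell "counter is zero" apart from "line not found".
    """
    match = _DROPPED.search(netstat_text)
    return int(match.group(1)) if match else None

# Made-up excerpt of an IP section, for illustration only.
SAMPLE = """\
ip:
\t1234 total packets received
\t12 fragments received
\t0 fragments dropped (dup or out of space)
\t7 fragments dropped after timeout
"""

dropped = fragments_dropped_on_timeout(SAMPLE)
if dropped is None:
    print("counter line not found")
elif dropped > 0:
    print(f"{dropped} fragments timed out; the network may be dropping UDP fragments")
else:
    print("no fragments dropped on timeout")
```

To use it on a live machine, feed it the captured output of "netstat -s" instead of SAMPLE.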

> Things you could try:
> - If you are using a udp mount for NFS...
>   - reduce your rsize and wsize (especially if "fragments dropped due
>     to timeout" is non-zero)
>   or
>   - switch to TCP
> 
> If you are not using udp mounts, then the NLM (rpc.lockd) is using
> UDP anyhow. If you don't need multiple NFS clients to see the file
> locks, add "nolockd" to your mount(s).
> 
> Beyond that, you'll need to capture packets and look at them in
> wireshark, to see what is going on.
> 
the mount is tcp.
I have been staring at the tcpdump and nothing sticks out, but it's been a
while since I looked at rpc traffic.

some facts:
it happens every time, with any ftp command: the process gets stuck in either
nlmrcv or rpccwnd, mostly the latter.
I will try to disable the lock stuff, but isn't that just avoiding the issue?
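
To put numbers on "mostly the latter", a rough sketch in Python that tallies the wait channels of the stuck processes. It assumes the 13-column "ps -l" layout seen in the listings above, with the wait channel in the 9th column and the command in the 13th; the function name and the header line in the sample are illustrative.

```python
from collections import Counter

# Assumption: column layout as in the ps listings quoted above
# (UID PID PPID CPU PRI NI VSZ RSS MWCHAN STAT TT TIME COMMAND).
WCHAN_COLUMN = 8
COMMAND_COLUMN = 12

def tally_wait_channels(ps_output, command_prefix="proftpd"):
    """Count wait channels of processes whose command starts with command_prefix."""
    counts = Counter()
    for line in ps_output.splitlines():
        # Split into at most 13 fields so the command keeps its spaces.
        fields = line.split(None, COMMAND_COLUMN)
        if len(fields) <= COMMAND_COLUMN:
            continue  # blank or truncated line
        if not fields[1].isdigit():
            continue  # header line (PID column is not a number)
        if fields[COMMAND_COLUMN].startswith(command_prefix):
            counts[fields[WCHAN_COLUMN]] += 1
    return counts

# The two process lines from this thread, under an illustrative header.
SAMPLE = """\
UID PID PPID CPU PRI NI VSZ RSS MWCHAN STAT TT TIME COMMAND
10000 1213 992 0 44 0 7340 3692 nlmrcv D ?? 0:08.07 proftpd: ftp - crawl-66-249-73-193.googlebot.com: RETR 00690145.JPG (proftpd)
10000 1197 984 0 20 0 32292 4792 rpccwnd D ?? 0:00.01 proftpd: ftp - mbpro.cs.huji.ac.il: LIST (proftpd)
"""

for wchan, count in tally_wait_channels(SAMPLE).most_common():
    print(wchan, count)
```

Run against repeated ps snapshots, this would show whether nlmrcv or rpccwnd dominates over time.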

> Good luck with it, rick
thanks,
	danny



