Trouble: NFS via TCP
olli at lurza.secnetix.de
Thu Nov 9 17:17:15 UTC 2006
I've got a very weird problem with NFS mounts on a RELENG_6
machine (a.k.a. 6.2-PRERELEASE, sources synced yesterday,
November 8th). It's an HP Proliant DL360 G4 (G4p to be
exact), but that shouldn't matter. I've been banging my
head on the table for several hours, but I can't find the
source of the problem. :-(
What I'm trying to do should be very simple: mounting an
NFS directory via TCP (instead of UDP, which is the default):
# mount_nfs -T -3 -R 3 -i -s -o ro 127.0.0.1:/localdisk /nfs/test
Symptom: As soon as I use the -T option (TCP) with the
mount command, it simply hangs forever. If I use the
intr/soft flags, I can Ctrl-C it after a while, and the
mount indeed appears in the output from "mount", but any
command that tries to access it (e.g. ls(1)) also hangs.
Even umount(8) hangs.
- UDP works perfectly fine. No problems at all.
- Other TCP connections besides NFS (e.g. ssh) work fine.
- IPF is present, but disabled (ipf -D).
- IPFW only contains the default "allow any to any" rule.
- The interface doesn't matter. Mounting from localhost
(via lo0) has the same problem as via a real NIC.
- I first observed the problem on RELENG_6 of 2006-10-19
(but it could be much older, because I haven't tried
NFS-via-TCP on this machine before). Then I updated
to 2006-11-08, no change.
- SMP or UP kernel doesn't make a difference.
- No special compiler flags, make.conf is empty.
- Kernel config is GENERIC with a few additions for more
shared memory and semaphores (so Squid and PostgreSQL
are happy) and some other unrelated details.
- No suspicious things in dmesg. Kernel prints nothing
during the mount attempts.
- Output from rpcinfo -p looks good.
- tcpdump shows that the TCP connection is immediately
shut down: After connecting successfully, it sends a
FIN, then reconnects, etc. ad infinitum. Meanwhile
vfs.nfs.reconnects increases slowly.
- On a different machine (different hardware, but same
RELENG_6 and very similar kernel config), the problem
does *NOT* occur. I compared sysctl variables relevant
to nfs, rpc and tcp, and they're all the same. Also,
rpcinfo -p is the same.
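For reference, the sysctl comparison between the two machines can be
sketched roughly as below. This is only an illustration, not necessarily
how the comparison was done: the function name sysctl_diff and the dump
file names are made up, and it assumes each machine's values were first
saved to a file, e.g. with "sysctl vfs.nfs net.inet.tcp > host.sysctl".

```shell
#!/bin/sh
# Sketch only: print the lines that differ between two saved sysctl dumps.
# Assumption: each dump was created beforehand on its machine, e.g.
#   sysctl vfs.nfs net.inet.tcp > broken.sysctl
# The name sysctl_diff and the file names are hypothetical.

sysctl_diff() {
    a_sorted=$(mktemp) || return 2
    b_sorted=$(mktemp) || return 2

    # comm(1) needs sorted input, so sort both dumps first.
    sort "$1" > "$a_sorted"
    sort "$2" > "$b_sorted"

    # comm -3 suppresses lines common to both files. What remains is
    # either unique to the first dump (no indent) or unique to the
    # second dump (indented by one tab).
    comm -3 "$a_sorted" "$b_sorted"

    rm -f "$a_sorted" "$b_sorted"
}
```

Calling "sysctl_diff broken.sysctl working.sysctl" prints nothing when
both dumps are identical, which matches what I saw for the nfs, rpc and
tcp variables.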
Now I'm running out of ideas ... Obviously there must be
something special with that machine, because it works fine
on a different machine, but I'm not able to find out what it is.
I even considered putting a few printf() calls into some
places in sys/nfsclient/nfs_socket.c to find out what's
going on, but I'm not sure if that makes sense and whether
it will give any useful results.
Any hints and ideas will be greatly appreciated.
Oliver Fromme, secnetix GmbH & Co. KG, Marktplatz 29, 85567 Grafing
Dienstleistungen mit Schwerpunkt FreeBSD: http://www.secnetix.de/bsd
Any opinions expressed in this message may be personal to the author
and may not necessarily reflect the opinions of secnetix in any way.
"And believe me, as a C++ programmer, I don't hesitate to question
the decisions of language designers. After a decent amount of C++
exposure, Python's flaws seem ridiculously small." -- Ville Vainio