nfsd server cache flooded, try to increase nfsrc_floodlevel

Harald Schmalzbauer h.schmalzbauer at omnilan.de
Fri Jul 25 07:42:05 UTC 2014


Regarding Rick Macklem's message of 25.07.2014 02:14 (localtime):
> Harald Schmalzbauer wrote:
>> Regarding Rick Macklem's message of 08.08.2013 14:20 (localtime):
>>> Lars Eggert wrote:
>>>> Hi,
>>>>
>>>> every few days or so, my -STABLE NFS server (v3 and v4) gets wedged
>>>> with a ton of messages about "nfsd server cache flooded, try to
>>>> increase nfsrc_floodlevel" in the log, and nfsstat shows TCPPeak at
>>>> 16385. It requires a reboot to unwedge, restarting the server does
>>>> not help.
>>>>
>>>> The clients are (mostly) six -CURRENT nfsv4 boxes that netboot from
>>>> the server and mount all drives from there.
>>>>
> Have you tried increasing vfs.nfsd.tcphighwater?
> This needs to be increased to increase the flood level above 16384.
>
> Garrett Wollman sets:
> vfs.nfsd.tcphighwater=100000
> vfs.nfsd.tcpcachetimeo=300
>
> or something like that, if I recall correctly.
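
For reference, the two settings quoted above could be made persistent
roughly as sketched below. This assumes they are placed in
/etc/sysctl.conf; the values are the ones recalled in the quoted message
("or something like that"), not tested recommendations for this setup.

```
# /etc/sysctl.conf fragment (a sketch; values as recalled in the quoted message)
# Per the quoted message, tcphighwater must be raised to lift the flood
# level above 16384.
vfs.nfsd.tcphighwater=100000
vfs.nfsd.tcpcachetimeo=300
```

The same name=value pairs should also be accepted by sysctl(8) on a
running system, which allows trying them without a reboot.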

Thank you for your help!

I have read about tuning these sysctls, but I am reluctant to alter them
individually, because I don't have hundreds of clients torturing a poor
server, nor any other poorly balanced setup.
I run into this problem with a single client, connected via a 1GbE (not
10 or 40GbE) link, talking to a modern server with 10G RAM, and this
environment forces me to reboot the storage server every second day.
IMHO such a setup shouldn't require manual tuning, and I consider this a
really urgent problem!
Whatever causes the server to lock up really needs to be fixed for the
next release; otherwise the shipped implementation of NFS is not really
suitable for production environments and needs a warning message when
enabled.
The impact of this failure forces admins to change the operating system
in order to get a core service back into operation.
The important point is that I am not suffering from reduced performance
or lags/delays; my server stops serving NFS completely, and only a
reboot resolves the situation.

Are there later modifications or other findings that are known to obsolete
your noopen.patch (http://people.freebsd.org/~rmacklem/noopen.patch)?

I'm testing it at the moment, but I am seeing other panics on the same
machine related to VFS locking, so test results won't be available very soon.

Thank you,

-Harry




More information about the freebsd-stable mailing list