NFS optimization

Eric Anderson anderson at centtech.com
Tue Apr 18 13:36:49 UTC 2006


David Gilbert wrote:
>>>>>> "Eric" == Eric Anderson <anderson at centtech.com> writes:
> 
> Eric> David Gilbert wrote:
> 
>>> Consider that if you are "out" of nfsd's, the penalty is increased
>>> latency for some small number of transactions that wait for an nfsd
>>> to become available.  Even if you have tonnes of nfsd processes,
>>> if disk is the limiting factor, more nfsd's won't speed things up.
> 
> 
> Eric> I have found that having too few can easily cause clients to
> Eric> block on nfs under peak usage times, so I tend to bump the
> Eric> number way up.  There's little to no harm in it.
> 
> I have never, ever seen this behaviour.  I'd go as far as to say that
> it shouldn't happen.  Not categorically, but NFS requests should be
> entirely independent... meaning the server shouldn't prefer one
> client's packets over another's; unless it is massively starved for
> nfsd's, the queue should be somewhat FIFO.

I'm not too surprised really.  If lots of nfs clients slam an nfs server 
simultaneously, all wanting data from different parts of the storage 
system, then it is very easy to stack up more requests than the nfsd's 
can handle if there are too few of them.  We have a different kind of 
nfs load here than any place I've seen, so that could account for the 
difference too.


> Eric> I usually look at my nfsd's, and see what the distribution of
> Eric> run time is on them.  I like to see at minimum a few (maybe 5%
> Eric> or so) with 0:00.00 runtime - which (to me) means that I had
> Eric> enough to service the queue, and a few extra that were bored.
> Eric> For my setup, this means typically between 256 and 512 nfsd's
> Eric> (with one server at 1024).
> 
> I have run incredibly busy NFS servers (20 to 40 disks, 16 to 20
> ethernet interfaces) with 100 (or more) busy diskless clients
> (a computation cluster), and I have never run more than 32 nfsd's.
> I've never found a performance advantage beyond a 1:1 ratio of
> nfsd's to disks.


Well, nfs client usage patterns strongly dictate nfs server load, so the 
quantity of clients is important, but how the clients use the data on 
the nfs server matters even more.

We have around 1000 busy nfs clients (all 3+GHz P4's), about 900 of them 
in a compute cluster (some diskless, some not), nearly everything is 
Gig-E.  My rule of thumb for nfsd's has come down to nfs clients / 4 = 
nfsd threads.
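On FreeBSD the thread count is set through the nfsd flags in /etc/rc.conf; 
a hypothetical config following that clients/4 rule of thumb for a server 
with ~1000 clients might look like this (values here are illustrative, not 
from my actual machines):

```shell
# /etc/rc.conf (FreeBSD) -- hypothetical values, assuming ~1000 clients
nfs_server_enable="YES"
# -t and -u serve TCP and UDP mounts; -n sets the number of nfsd threads.
# 1000 clients / 4 => 256 nfsd's, per the rule of thumb above.
nfs_server_flags="-u -t -n 256"
```

The stock default is only a handful of threads, which is exactly the kind 
of small pool that falls over under spiky load.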

With very fast disk subsystems, and lots of caching, and 'good' usage 
patterns, very few nfsd's would be needed.  If you have lots of usage 
spikes, and a *lot* of random reads/writes, coupled with a large number 
of clients, you can easily see the problem I mentioned above with a 
small number of nfsd's.  There have been a few threads on other 
mailing lists (freebsd-fs was one, I think) where other users reported 
the same issue, and merely bumping up the number of nfsd's got them 
past the problem.
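The run-time check I described earlier (looking for a few nfsd's with 
0:00.00 accumulated CPU time) can be done with something like the 
following sketch, assuming FreeBSD ps output:

```shell
# Show accumulated CPU time for each nfsd; the [n] trick keeps grep
# from matching its own command line.
ps -ax -o time,command | grep '[n]fsd'

# Count the idle spares: nfsd's that have never accumulated any CPU
# time.  If this comes back 0, every thread has been busy at some
# point and the pool is probably too small.
ps -ax -o time,command | grep '[n]fsd' | grep -c '^ *0:00\.00' || true
```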


Eric




-- 
------------------------------------------------------------------------
Eric Anderson        Sr. Systems Administrator        Centaur Technology
Anything that works is better than anything that doesn't.
------------------------------------------------------------------------


More information about the freebsd-isp mailing list