cvs commit: src/sys/nfsclient nfs_vnops.c

Sat Oct 14 19:37:35 PDT 2006

On Sun, 15 Oct 2006, Rink Springer wrote:

> On Sat, Oct 14, 2006 at 07:25:12AM +0000, Bruce Evans wrote:
>>   Log:
>>   Don't do null Setattr RPCs for VA_MARK_ATIME.  When we added the
>> ...
>>   This is the smallest and easiest to fix of several bugs that have
>>   increased the number of RPCs for kernel builds on nfs by more than
>>   100% since 2004-11-05.  The real-time increase depends on network
>> ...

> The code in RELENG_6 looks as if it would benefit from this change as
> well. Do you have a MFC planned in the nearby future?

Maybe after the larger performance bugs are fixed.  See a thread in
freebsd-fs.

It would also be good to reduce nfs RPCs generally and there seems to
be a lot more scope to do this for Access ones, at least by automatically
configuring cases when a large timeout is safe.

-current and RELENG_6 still have a bogus default of 2 for nfs_access_cache
in /etc/defaults/rc.conf, and rc.conf(5) still confusingly claims that
setting this to a value of 2-10 seconds will substantially reduce
network traffic.  In fact the system default is 60, and smaller nonzero
settings substantially increase network traffic relative to it.  A
setting of 2 is normally made by the rc system, and settings of 3-10
only reduce network traffic, not so substantially, by moving back
towards the system default.

ISTR a discussion of fixing this in rc.conf, but nothing seems to have been
committed.  The setting is confusing in the kernel too:
- when the Access cache was first implemented on 1998/11/13, the default
   setting for its timeout was 0 (caching off for safety).
- the first implementation lived for 2 days.  In the second implementation
   on 1998/11/15, use of the cache was made safer and the default setting
   for its timeout was changed to 2 (caching on, but a small timeout for
   safety).  The userland default apparently dates from this time.  The
   userland documentation does date from this time, but applies better to
   the previous version.
- on 1998/07/31, many RPCs were avoided by changing the Access RPC to
   also do a Getattr RPC so that Access RPCs don't need to be followed
   by Getattr ones, and the default timeout was changed to better match
   the Attribute cache (value NFS_MAXATTRTIMO = 60).  Apparently, userland
   still hasn't caught up with this change.

Now I'm slightly less confused about the difference between the Access
cache and the Attribute cache.  NFS_MAXATTRTIMO is for the Attribute
cache and we're abusing it for the Access cache.  It isn't clear that
this is safe, but see the 1998/07/31 commit (it seems to say indirectly
that the Access cache is less important so having a much smaller timeout
for it was silly).

The existence of the sysctl to control the timeout for the Access cache
seems to be another bug.  For the Attribute cache, there are per-mount
timeouts acregmax, acregmin, acdirmax and acdirmin which provide much
finer control (see mount_nfs(8)).  NFS_MAXATTRTIMO was originally just
the default for acregmax.  The interaction of the Access cache timeouts
with the Attribute cache ones is unclear and is unlikely to be good if
the latter vary a lot across mounts.

However, note that if the system is very active with certain tasks,
then the effect of the Access cache timeout is insubstantial compared
with the effect of flushing the cache on every open()/close().  E.g.,
CPUs are now quite fast and can compile many files per second, 10 say.
Most .c files have many #include files in common, so these #include files
get cycled through the Access cache at a rate of 10 per second.  Even
the small Access cache timeout of 2 seconds would only cycle them at
a rate of 0.5 per second.
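
The arithmetic above can be sketched as a toy model.  This is only an
illustration, not the kernel's logic: the function name refetch_rate and
its parameters are made up here, and it assumes a single commonly used
#include file that is accessed at a steady rate.

```python
def refetch_rate(accesses_per_second, timeout_seconds, flush_on_open):
    """Approximate Access RPCs per second for one frequently used file.

    Hypothetical model: not taken from the nfs client sources.
    """
    if flush_on_open or timeout_seconds == 0:
        # The cached entry is discarded on every open()/close() (or caching
        # is off), so each access costs one RPC whatever the timeout is.
        return accesses_per_second
    # Otherwise the entry is refetched at most once per timeout interval.
    return min(accesses_per_second, 1.0 / timeout_seconds)


# 10 compiles per second, each touching the same #include file.
flushed = refetch_rate(10, 2, flush_on_open=True)
timeout_2 = refetch_rate(10, 2, flush_on_open=False)
timeout_60 = refetch_rate(10, 60, flush_on_open=False)
print(flushed, timeout_2, round(timeout_60, 3))
```

Under these assumptions, flushing on every open gives 10 refetches per
second, against 0.5 per second for a timeout of 2 and about 0.017 per
second for a timeout of 60, so the flushing dominates either timeout.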

Bruce