[PATCH] Better handle NULL utimes() in the NFS client

Rick Macklem rmacklem at uoguelph.ca
Tue Jan 15 22:49:04 UTC 2013


Bruce Evans wrote:
> On Mon, 14 Jan 2013, Rick Macklem wrote:
> 
> > John Baldwin wrote:
> >> The NFS client tries to infer when an application has passed NULL
> >> to
> >> utimes()
> >> so that it can let the server set the timestamp rather than using a
> >> client-
> >> supplied timestamp. It does this by checking to see if the desired
> >> timestamp's second matches the current second. However, this breaks
> >> applications that are intentionally trying to set a specific
> >> timestamp
> >> within
> >> the current second. In addition, utimes() sets a flag to indicate
> >> if
> >> NULL was
> >> passed to utimes(). The patch below changes the NFS client to check
> >> this flag
> >> and only use the server-supplied time in that case:
> 
> It is certainly an error to not check VA_UTIMES_NULL at all. I think
> the flag (or the NULL pointer) cannot be passed to the server, so the
> best we can do for the VA_UTIMES_NULL case is read the current time on
> the client and pass it to the server. Upper layers have already read
> the current time, but have passed us VA_UTIMES_NULL so that we can
> tell
> that the pointer was originally null so that we can do the different
> permissions checks for this case.
> 
> >> Index: fs/nfsclient/nfs_clport.c
> >> ===================================================================
> >> --- fs/nfsclient/nfs_clport.c (revision 225511)
> >> +++ fs/nfsclient/nfs_clport.c (working copy)
> >> @@ -762,7 +762,7 @@
> >> *tl = newnfs_false;
> >> }
> >> if (vap->va_atime.tv_sec != VNOVAL) {
> >> - if (vap->va_atime.tv_sec != curtime.tv_sec) {
> >> + if (!(vap->va_vaflags & VA_UTIMES_NULL)) {
> >> NFSM_BUILD(tl, u_int32_t *, 3 * NFSX_UNSIGNED);
> >> *tl++ = txdr_unsigned(NFSV3SATTRTIME_TOCLIENT);
> >> txdr_nfsv3time(&vap->va_atime, tl);
> >> @@ -775,7 +775,7 @@
> >> *tl = txdr_unsigned(NFSV3SATTRTIME_DONTCHANGE);
> >> ...
> 
> Something mangled the patch so that it is hard to see what it does. It
> just uses the flag instead of guessing.
> 
> I can't see anything that does the different permissions check for
> the VA_UTIMES_NULL case, and testing shows that this case is just
> broken,
> at least for an old version of the old nfs client -- the same
> permissions
> are required for all cases, but write permission is supposed to be
> enough for the VA_UTIMES_NULL case (since write permission is
> sufficient
> for setting the mtime to the current time (plus epsilon) using
> write(2)
> and truncate(2). Setting the atime to the current time should require
> no more and no less than read permission, since it can be done using
> read(2), but utimes(NULL) requires write permission for that too).
> 
I did a quick test on a -current client/server and it seems to work ok.
The client uses SET_TIME_TO_SERVER and the server sets VA_UTIMES_NULL
for this case. At least it works for a UFS exported volume.

> > In the old days, a lot of NFS servers only stored times at a
> > resolution of 1sec, which I think is why the code had the habit
> > of comparing "seconds equal".
> 
> I think this is not the reason for the check here.
> 
> > If there is some app. out there
> > that sets "current time" via utimes(2) with a curent time argument
> > instead of a NULL argument would seem to be broken to me.
> > (It is conceivable that some app. did this to avoid clock
> > skew between the client and server, but I doubt it.)
> 
> Apps have no alternative to using the NULL arg if they have write
> permission
> to the file but don't own it.
> 
> Oops, on looking at the code I now think it _is_ possible to pass the
> request to set the current time on the server, since in the
> NFSV3SATTRTIME_TOSERVER case we just pass this case value and not
> any time value to the server, so the server has no option but to use
> its current time. It is not surprising that the permissions checks
> for this don't work right. I thought that the client was responsible
> for most permissions checks, but can't find many or the relevant one
> here. The NFSV3SATTRTIME_TOSERVER code on the server sets
> VA_UTIMES_NULL, so I would have thought that the permissions check on
> the server does the right thing.
> 
As noted above, it seems to work correctly for the new server in -current,
at least for UFS exports.

Normally a server will do permission checking for NFS RPCs. There is nothing
stopping a client from doing a check and returning an error, but traditionally
a server has not trusted a client to do so. (I'm not sure if adding a check
in the client is what jhb@ was referring to in his reply to this?)

> There are some large timestamping bugs nearby:
> 
> - the old nfs server code for NFSV3SATTRTIME_TOSERVER uses
> getnanotime()
> to read the current time. This violates the system's policy set by
> the vfs.timestamp precision in most cases, since using getnanotime()
> is the worst supported policy and is not the defaul.
> 
> The old nfs client uses the correct function to read the current
> time, vfs_timestamp(), in nfs_create(), but this is the only use of
> vfs_timestamp() in old nfs code. I think most cases use the server
> time and thus use the correct function iff the leaf server file
> system uses the correct function.
> 
> - the new nfs server code for NFSV3SATTRTIME_TOSERVER macro-izes all
> reads of the current time except 1 as NFSGETTIME(). This uses
> getmicrotime(), so it violates the system's policy in all cases,
> since using getmicrotime() is not a supported policy (using
> microtime() is supported). The 1 exception is a hard-coded
> getmicrotime() in fs/nfsclient/nfs_clport.c whose use is visible
> in the above patch. This one really didn't matter, because only the
> seconds part of curtime was used. It was just a micro-pessimization
> and style bug. The (not quite) correct way to get the seconds part
> is to use time_second, as is done in the old nfs client.
> (This way is not quite correct because there are some races and
> non-monotonicities reading the times. In the above check,
> vap->va_atime.tv_sec might have been read by a more precise clock
> than curtime.tv_sec. Then the check might give a false positive
> or negative. But the check is only a heuristic, and is inherently
> racy, so this doesn't rally matter.
> With the above pathcm the check becomes a different pessimization and
> style bug. The curtime variable becomes unused except for its
> incorrect initialization.
> 
In this case, after the patch is applied, curtime and getmicrotime() can
just be deleted (as you noted, above).

> New nfs code never uses the correct function vfs_timestamp().
This needs to be fixed. Until now, I would have had no idea what is the
correct interface. (When I did the port, I just used a call that seemed
to return what I wanted.;-)

Having said that, after reading what you wrote below, it is not obvious
to me what the correct fix is? (It seems to be a choice between microtime()
and vfs_timestamp()?)

> 
> Following the system pollcy for file timestamps causes some problems
> for utimes(NULL) too. Old versions hard-coded microtime(). Current
> versions use vfs_timestamp(). The latter is better, but tends to
> give different results than times(non_NULL), since few or no
> applications know anything about the system's policy. touch(1)
> probably should know, but doesn't. So the simple "touch foo" gives
> various results, depending:
> - touch(1) starts with gettimeofday(). This gives microseconds
> resolution and usually microseconds accuracy if its result is used.
> - touch then tries utimes(non_NULL) with the current time that it
> just read. This usually works, giving microseconds resolution,
> etc. This is OK, but often different from the system policy.
> - touch then tries utimes(NULL). If this works, then it follow the
> system policy.
> 
> Another problem is that not all file systems support nanoseconds
> resolutions, so not all system policies or utimes() requests can
> be honored.
> 
> I would usually prefer the system's policy to be enforced as far as
> possible. Thus if the system's policy is microseconds resolution,
> then times with nanoseconds resolution should be rounded down to the
> nearest microsecond. This case is most useful since utimes() cannot
> preserve times with more than microseconds resolution. Utilities like
> cp(1) blindly round the times given in nanoseconds by stat(2) to ones
> that can be written by utimes(2), so this often happens in an
> uncontrollable way anyway (POSIX is finally getting around to
> specifying
> permissible errors for unrepresentable resolutions). But sometimes I
> want utimes() to preserve times as well as possible.
> 
> Bruce


More information about the freebsd-fs mailing list