runtime nfs mount options for existing mounts

Sun Jan 30 03:47:09 UTC 2011

On Sat, 29 Jan 2011, Rick Macklem wrote:

>>> I can say that, if someone else came up with the syscall/VFS
>>> changes, I
>>> could easily implement a function in the NFS client that generates
>>> the
>>> name/value pairs like nmount() uses. (There is currently a function
>>> that
>>> basically does that for the old mount() and I think a slightly
>>> modified
>>> version of that would do it. However, I haven't actually tried
>>> it.:-)
>>
>> Old mount(2) doesn't do this. All mount(8) just use statfs(2).
>> statfs()
>> just returns the old mount flags and a couple of other broken things
>> (async and sync i/o counts) that are used mainly by mount -v.
>>
> All I was trying to say here was that the NFS client(s) already
> have a function that turns the flags, etc in the old mount args
> into the name/value pairs used by nmount() so that the old
> mount() still works.

If it the clients did that, then their parsing would be perfectly
horrible, but they do less than that -- essentially the opposite.
If they actually did that, then for old mount(2) they would start
with the args in convenient binary form (packed into struct nfs_args);
they would unpack this into 100 or so string options to match the
inconvenient nmount(2) form.  The options would then arrive back in
string form, as if from nmount(2), and have to be reparsed back into
binary form and packed into a convenient struct, which might be named
nfs_args but would now be kernel-only and need not be restricted by
the mount(2) ABI, so it could hold the binary form of string options
not supported by old ABI(s).

What the clients actually do is just pass struct nfs_args as a binary
blob attached to the string option "nfs_args".  (IIRC, mount_nfs(8)
used the same method until relatively recently when it was converted
to use the full awe fullness of nmount().  The simpler userland parsing
required by nmount() results in the current mount_nfs sources being
only 50% larger than in FreeBSD-5 (well, they still have all the old
parsing code, and code for new options, so as to support fallback to
mount(2)).)

Since struct nfs_args is still compatible with the old mount(2) ABI,
it cannot hold some of the newer options like "negnametimeo".  The
binary value of at least this newer option is held in a global variable.
So anything reporting the args back to userland would now have to
gather unstructured globals and and combine these with variables in
the nfs_args struct.

The conversion actually done by the nfs clients is essentially the opposite:
nmount() passes them options in inconvenient string form, or possibly
as a convenient already-parsed binary blob.  The binary blob is copied
to the nfs_args struct, which is checked for errors later.  String options
are converted into binary form in the nfs_args struct, except in cases like
"negnametimeo" where they don't fit.  The options parsing is of low quality.

> I believe that this function could return
> the mount options (set as flag bits in the nfs variant of the
> kernel mount structure, etc) as nmount() style name/value pairs
> if there was a way to get them to userland. (It could be done as
> yet another flag for nfssvc() if no other file system needs this
> capability.)

Well, the options could always have been returned in convenient binary
form in the nfs_args struct, except now they don't all fit, and there
were always ABI considerations which make the binary format not so good
in some cases (say a 32-bit app on a 64-bit kernel wanting to know the
nfs options).  name=value takes more conversions which are not done even
now:
- Case 1: old mount(8) utility supplied a binary blob.  The kernel just
   copied it to the kernel nfs_args struct and doesn't have many string
   options for it
- Case 2: nmount(2) supplied string options.  The kernel now has some
   options in string form, but only ones that weren't defaulted.

Examples of low-quality options parsing:
- from 4.4BSD-Lite mount_nfs.c.  Hmm, this is not to bad -- it has no atoi()
   or sscanf(), but just strtol() with common errors:

% 		case 'r':
% 			num = strtol(optarg, &p, 10);
% 			if (*p || num <= 0)
% 				errx(1, "illegal -r value -- %s", optarg);
% 			nfsargsp->rsize = num;
% 			nfsargsp->flags |= NFSMNT_RSIZE;
% 			break;

   This has mainly common overflow bugs:
   - strtol() returns long, so overflow may occur when it is blindly assigned
     to `num', which is int.  Even if overflow is benign, it breaks the
     subsequent check that num <= 0.
   - no checking of overflows avoided by by strtol(), except when longs are
     longer than ints these cause overflow in the assignment here.
   Input consisting only of whitespace is detected bogusly as num being 0.

- from FreeBSD-5.2 mount_nfs.c: it still uses strtol() in old code, but
   most new code uses atoi() like this:

% 			if (altflags & ALTF_ACREGMAX) {
% 				p = strstr(optarg, "acregmax=");
% 				if (p)
% 					nfsargsp->acregmax = atoi(p + 9);
% 			}

   atoi() is unusable in serious parsing, since among other design errors,
   it gives undefined behaviour on overflow (except in FreeBSD, this
   behaviour is defined as being the same undefined behaviour that occurs
   on blind assignment of strtol() to int as done in the previous example).

- from FreeBSD-current mount_nfs.c:

% 	if (findopt(iov, iovlen, "rsize", &opt, NULL) == 0) {
% 		if (opt == NULL) { 
% 			errx(1, "illegal rsize");
% 		}
% 		ret = sscanf(opt, "%d", &args.rsize);
% 		if (ret != 1 || args.rsize <= 0) {
% 			errx(1, "illegal wsize: %s", opt);
% 		}
% 		args.flags |= NFSMNT_RSIZE;
% 	}

   Now even the rsize parsing doesn't use strtol().  It uses sscanf(),
   whose behaviour on overflow is even more undefined than atoi()'s.
   Despite less error checking, the parsing is now more verbose, with
   the help of extra braces.  It takes such care with strings that it
   spells "rsize" as "wsize" in the error message.

   From FreeBSD-current nfsclients (the new one, but I think they use
   identical code which is almost identical to the userland code, and
   triplicates all its bugs):

% 	if (vfs_getopt(mp->mnt_optnew, "rsize", (void **)&opt, NULL) == 0) {
% 		if (opt == NULL) { 
% 			vfs_mount_error(mp, "illegal rsize");
% 			error = EINVAL;
% 			goto out;
% 		}
% 		ret = sscanf(opt, "%d", &args.rsize);
% 		if (ret != 1 || args.rsize <= 0) {
% 			vfs_mount_error(mp, "illegal wsize: %s",
% 			    opt);
% 			error = EINVAL;
% 			goto out;
% 		}
% 		args.flags |= NFSMNT_RSIZE;
% 	}

   Parsing of strings in the kernel is even more laborious than in
   userland.  sscanf() is unusable but used.  I was asleep when it was
   committed to the kernel (it was at first used only for parsing a
   couple of path/device names for booting).  The braces are needed
   now.  The vfs_mount_error() line is unnecessarily split, and triplicates
   the misspelling of "rsize" as "wsize".

% 	if ((argp->flags & NFSMNT_RSIZE) && argp->rsize > 0) {
% 		nmp->nm_rsize = argp->rsize;
% 		/* Round down to multiple of blocksize */
% 		nmp->nm_rsize &= ~(NFS_FABLKSIZE - 1);
% 		if (nmp->nm_rsize <= 0)
% 			nmp->nm_rsize = NFS_FABLKSIZE;
% 	}
% 	if (nmp->nm_rsize > maxio)
% 		nmp->nm_rsize = maxio;
% 	if (nmp->nm_rsize > MAXBSIZE)
% 		nmp->nm_rsize = MAXBSIZE;

   This is old code for range checking of rsize.  It is still useful
   and used once the value reaches the nfs_args struct.  Not much wrong
   with it, but altogether it gives about 50 lines of option parsing
   and range checking code for a single simple numeric option, with
   bugs in the range checking for overflow cases and redundant checks
   for the <= 0 case.  I would like to be able to write such code as
   nmp->nm_rsize = parsopt(stringval, minval, dfltval, maxval) in 1 place
   (actually table-driven).  But the range should also be checked in the
   kernel a userland utility did the parsing.  The kernel doesn't need
   to do any defaulting if userland does it.

Bruce