Fix MNAMELEN or reimplement struct statfs

Jilles Tjoelker jilles at stack.nl
Tue Apr 15 20:43:54 UTC 2014


On Tue, Apr 15, 2014 at 12:03:02PM -0700, Doug Ambrisko wrote:
> On Fri, Apr 11, 2014 at 08:21:46PM -0500, Bryan Drewery wrote:
> | On Tue, Nov 19 21:53:38 UTC 2013 Jilles Tjoelker wrote:
> | > On Mon, Nov 18, 2013 at 11:01:42AM -0800, Doug Ambrisko wrote:
> | >> On Sat, Nov 16, 2013 at 08:31:29PM +0200, Konstantin Belousov wrote:
> | >> | I think that struct mount should have a const char * field where the
> | >> | non-trimmed path is stored and used for match at unmount. f_mntonname
> | >> | truncation would be only unfortunate user interface glitch.

> | >> Note that we are not storing the path in mount structure so no structures
> | >> have changed which is nice since then we haven't introduced any real
> | >> ABI breakage.  So we could MFC this.  The match isn't critical since
> | >> umount will fall back to fsid and work.  One thing that might be good to
> | >> do is change umount to try to umount via fsid first and then do the
> | >> match if the fsid failed versus the other way round that it does now.

> | >> The problem I see is if someone tries to do things based on the parsed
> | >> output of mount/df then that will fail since the output is truncated.

> | > As noted in comments in sbin/umount/umount.c, the statfs() call is
> | > deliberately after the mount list checks because it may block forever
> | > for unresponsive NFS servers. It would be unfortunate if hung NFS
> | > filesystems would have to be forcibly unmounted by copy/pasting the fsid
> | > from 'mount -v'.

> | From a user perspective, I'd love to see this get committed and MFC'd.
> | It's very odd to have ENAMETOOLONG errors while traversing .zfs/snapshot.

> I have a new patch at:
> 	http://people.freebsd.org/~ambrisko/mount_bigger_2.patch
> that I tested against head.  This should be pretty close to commiting
> unless people find some issues with it.

In sys/kern/vfs_mount.c:
+		mp->mnt_path = malloc(strlen(fspath), M_MOUNT, M_WAITOK);
+		strlcpy((char *)mp->mnt_path, fspath, strlen(fspath));

This always strips the last byte off the fspath.

I like that this only touches the kernel, so it does not break anything
regarding mount/umount of filesystems with short paths, including
(NFS) filesystems that do not respond.

The patch does not enlarge f_mntfromname which may be a problem for
nullfs. It is certainly a step forwards for poudriere but [ENAMETOOLONG]
errors could still occur in more extreme situations.

> | However, this would make the situation worse for poudriere, which is
> | what this particular thread was started on. It does exactly what you
> | worry about, it parses mount(1) output and umounts all descendants for a
> | given path. We do the same thing at my work for our base build system,
> | and I believe FreeNAS is doing something like this. So it's not uncommon.

> | Or did the situation improve with the latest patch to show the full
> | mount path in mount(1)?

> Not yet but could be address in a subsequent enhancement.  The framework
> in the kernel to handle longer paths and track the full path name is
> there now.  Changes will need to be done in the structure passed to mount.
> This can be done if we implement a statvfs structure that can pass this
> data back.  Since we don't have statvfs really defined we can implement
> it to deal with longer paths.  Then mount can be updated to show the full
> path.

> | We could implement umount -r, but I'm not convinced that is adequate due
> | to unknown use of mount(1) output. We really need the real path exported
> | to userland somehow, and preferably to mount(1) by default to not break
> | scripts.

> | This may not be a clean solution, but couldn't we add another syscall,
> | say getfsmntpath(2), and have mount(1) use both statfs(2) and
> | getfsmntpath(2) to show a proper output?

> I think we can do this with statvfs to give full path names.

I agree that a new syscall or similar is needed in a later patch but
disagree that it should be statvfs(). To call statvfs(), the
filesystem's path must already be known and the filesystem (and all
parent filesystems) must respond. In fact, struct statvfs does not
include any pathnames at all.

If it is acceptable that only root can query the pathnames, a simple
sysctl can map f_fsid to mnt_path (and the extended f_mntfromname). This
can then be used together with statfs(2), fstatfs(2), getfsstat(2) or
getmntinfo(3) (but why would one use the latter?).

It is possible to extend getfsstat(2) (but not statfs(2)/fstatfs(2)) to
use an f_spare field to point to long pathnames stored elsewhere in the
buffer (using the flags argument to enable the new behaviour).

Simply increasing MNAMELEN would make struct statfs over 2 kilobytes
large, which would lead to a rather large result from getfsstat(2) on a
machine with many filesystems mounted.

-- 
Jilles Tjoelker


More information about the freebsd-hackers mailing list