Poul-Henning Kamp phk at phk.freebsd.dk
Sun Dec 5 07:09:28 PST 2004

Please help test this megapatch:


Any observations of filesystems behaving differently etc. are most welcome.

I intend to commit this before X-mas if at all possible.

The problem, briefly stated:

The old mount(2) interface ("omount") does not help filesystems much
with argument passing.  A single pointer is passed in along with a
set of flags of which some are magic to varying degrees.

The single pointer is usually used for a struct MYFS_args, which
contains what the filesystem needs.

Since these structs differ from filesystem to filesystem, we can not
reuse the userland mount code, but need a mount_msdos(8), mount_nwfs(8)
etc etc.

We ran out of flags, and rather than just postpone the trouble by
making the flags field wider, we (mux@ & I) created a new mount
system call ("nmount") which can pass lists of options and help
filesystems manage them.

How nmount(2) works

In userland, a list of options is collected.  Each option has a name,
which is an ASCII string, and a value, which can be anything:

        build_iovec(&iov, &iovlen, "fstype", "ufs", -1);
        build_iovec(&iov, &iovlen, "fspath", mntpath, -1);
        build_iovec(&iov, &iovlen, "from", dev, -1);
        build_iovec(&iov, &iovlen, "flags", &ufs_flags, sizeof ufs_flags);
        build_iovec(&iov, &iovlen, "export", &export, sizeof export);
        if (nmount(iov, iovlen, mntflags) < 0)
                err(1, "%s", dev);
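For readers who have not met build_iovec() before, here is a minimal userland sketch of what such a helper could look like.  It is an illustration under assumptions, not the code from the patch: the name sketch_build_iovec, its return value and its error handling are invented here.  The only things taken from the fragment above are the calling convention (each option becomes a name entry followed by a value entry) and the use of -1 as a length meaning "the value is a NUL-terminated string".

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/uio.h>

/*
 * Hypothetical helper: append one (name, value) pair to a growing
 * iovec array.  Each option occupies two consecutive entries.
 * A len of (size_t)-1 means "val is a NUL-terminated string".
 * Returns 0 on success, -1 if memory could not be allocated.
 */
static int
sketch_build_iovec(struct iovec **iov, int *iovlen, const char *name,
    void *val, size_t len)
{
	struct iovec *p;
	size_t namelen;
	char *n;
	int i = *iovlen;

	/* Entries always come in pairs, so the count must be even. */
	assert(i >= 0 && i % 2 == 0);

	/* Keep a private copy of the name, including the NUL. */
	namelen = strlen(name) + 1;
	n = malloc(namelen);
	if (n == NULL)
		return (-1);
	memcpy(n, name, namelen);

	/* Grow the array by two slots: one for name, one for value. */
	p = realloc(*iov, sizeof(*p) * (size_t)(i + 2));
	if (p == NULL) {
		free(n);
		return (-1);
	}

	if (len == (size_t)-1)
		len = (val != NULL) ? strlen(val) + 1 : 0;

	p[i].iov_base = n;
	p[i].iov_len = namelen;
	p[i + 1].iov_base = val;	/* value is referenced, not copied */
	p[i + 1].iov_len = len;

	*iov = p;
	*iovlen = i + 2;
	return (0);
}
```

Note that in this sketch the value is only referenced, so it has to stay valid until the array has been handed to the system call.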

On the kernel side, the options are copyin(9)'ed and arranged into a list.
A number of functions allow a filesystem to access elements in the list.
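
To make the "list plus accessor functions" idea concrete, here is a small userland model of such a lookup.  It is only a sketch: struct sketch_opt and sketch_getopt are hypothetical names chosen for this illustration and are not the kernel's data structures or functions.  It assumes nothing beyond what is said above, namely that options have a name and an opaque value of some length.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model of one mount option: a name and an opaque value. */
struct sketch_opt {
	struct sketch_opt *next;
	const char *name;
	void *value;
	size_t len;
};

/*
 * Look up an option by name.  On success store the value pointer and
 * length through buf/len (either may be NULL) and return 0.  Return
 * ENOENT if no option with that name is in the list.
 */
static int
sketch_getopt(const struct sketch_opt *head, const char *name,
    void **buf, size_t *len)
{
	const struct sketch_opt *o;

	assert(name != NULL);
	for (o = head; o != NULL; o = o->next) {
		if (strcmp(o->name, name) != 0)
			continue;
		if (buf != NULL)
			*buf = o->value;
		if (len != NULL)
			*len = o->len;
		return (0);
	}
	return (ENOENT);
}
```

A filesystem written against an interface of this shape asks for the options it understands by name and can treat a missing mandatory option as an error.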


If an old style mount call comes in, the filesystem offers a "vfs_cmount"
function which converts old-style arguments into a kernel call to nmount:

        error = copyin(data, &args, sizeof args);
        if (error)
                return (error);

        ma = mount_arg(ma, "from", args.fspec, -1);
        ma = mount_arg(ma, "export", &args.export, sizeof args.export);
        ma = mount_argf(ma, "uid", "%d", args.uid);
        ma = mount_argf(ma, "gid", "%d", args.gid);
        error = kernel_mount(ma, flags);
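
In the fragment above, numeric fields such as uid and gid are turned into ASCII strings with a printf-style format, while "export" is passed as a binary blob.  The formatting step can be modeled in userland roughly as follows.  This is a sketch under assumptions: sketch_argf is a hypothetical name, and it only shows the string formatting, not how the real mount_argf() attaches the result to the argument list.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical printf-style helper: format a value into a freshly
 * allocated, NUL-terminated ASCII string.  Returns NULL on failure.
 * The caller owns the result and must free() it.
 */
static char *
sketch_argf(const char *fmt, ...)
{
	va_list ap;
	char *s;
	int n;

	assert(fmt != NULL);

	/* First pass: find out how many characters are needed. */
	va_start(ap, fmt);
	n = vsnprintf(NULL, 0, fmt, ap);
	va_end(ap);
	if (n < 0)
		return (NULL);

	s = malloc((size_t)n + 1);
	if (s == NULL)
		return (NULL);

	/* Second pass: format into the buffer of exactly that size. */
	va_start(ap, fmt);
	vsnprintf(s, (size_t)n + 1, fmt, ap);
	va_end(ap);
	return (s);
}
```

Passing such values as strings means the receiving side needs no knowledge of the old struct MYFS_args layout for them.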

Root mounting

When we mount the root filesystem, we do so with an ASCII string of the
form "$filesystem:$something".  If the filesystem is able to parse
$something (passed in the mount argument "from"), the filesystem can be
used as the root filesystem (NB: there must be a /dev directory or things
go downhill really fast).
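
As an illustration of the "$filesystem:$something" form, the split at the first colon could be done along these lines.  This is a sketch only: sketch_split_rootspec is a hypothetical name, the policy of rejecting an empty filesystem name is an assumption made here, and the kernel's actual parsing may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Split "fstype:rest" at the first ':'.  Copies the filesystem name
 * into fstype (a buffer of fslen bytes) and returns a pointer to the
 * remainder, which would become the "from" mount argument.  Returns
 * NULL if there is no ':', the name is empty, or it does not fit.
 */
static const char *
sketch_split_rootspec(const char *spec, char *fstype, size_t fslen)
{
	const char *colon;
	size_t n;

	assert(spec != NULL && fstype != NULL);

	colon = strchr(spec, ':');
	if (colon == NULL)
		return (NULL);

	n = (size_t)(colon - spec);
	if (n == 0 || n >= fslen)
		return (NULL);

	memcpy(fstype, spec, n);
	fstype[n] = '\0';
	return (colon + 1);
}
```

With a string such as "ufs:/dev/ad0s1a" (an example value, not taken from the patch), this yields "ufs" as the filesystem and "/dev/ad0s1a" as the part the filesystem itself has to parse.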

There are no other requirements or special code needed in the filesystem.

In theory, we should be able to use an msdosfs as the root filesystem now.

(see other email about how root mounting works)

The Short Future Perspective

After some more testing I will commit my current megapatch and deal
with breakage as we find it.

The Long Future Perspective

A few filesystems need magic userland support, nfs, nwfs and similar
where userland does some of the network setup.  For these we will still
need a mount_blafs(8) program which knows about these things.

For all other filesystems my hope is that we can keep all mount options
as ascii strings so that a single mount(8) tool can work for all of
these filesystems.

Currently I have made the omount->nmount conversion as best I could
(refinements are more than welcome), and the kernel will now respond
to both the nmount and omount system calls.

Before the 6.0 freeze I want to convert userland to use nmount exclusively;
this will hopefully amount to getting rid of a number of mount_foofs(8)
programs.

After 6.0 branch I want to remove the omount compatibility in the kernel
so that 7.x is nmount exclusively.

So, the problem here is that I can not even test all these various
filesystems and I would be surprised if I can find the time to polish
each and every one of them, so I hope the various filesystem owners
and other interested hackers will step in and help me with the last
bit of this.

What needs to be done

	Release global mount flags which are really UFS/FFS private
	flags (MNT_SOFTDEP etc).

	More sharing is possible here I think.

	Root mount code is too magic, we don't really interpret the
	"from" string as far as I can tell, rather we rely on various
	magic stuff dropped by bootloader.  Would be nice if we DTRT
	so that one could point a kernel at any NFS server without
	bootloader support.

    All filesystems:
	Verify that things work.
	Tune the option names and error checking.
	Get rid of mount_foofs(8) if possible.

    All "single user" filesystems:
	There may be a "market" for some library routines to handle
	filesystems which are designed for single-user (msdosfs etc)
	rather than have the uid/gid/filemode/dirmode in three or four
	separate filesystems.

	Please help!

