[HEADSUP/TEST]: mount(2)/nmount(2) & rootfs mounting

Sun Dec 5 07:09:28 PST 2004

Please help test this megapatch:

	http://phk.freebsd.dk/patch/nmount.patch

Any observations of filesystems behaving differently etc are most
welcome.

I intend to commit this before X-mas if at all possible.

The problem, briefly stated:
----------------------------

The old mount(2) interface ("omount") does not help filesystems much
with argument passing.  A single pointer is passed in along with a
set of flags, some of which are magic to varying degrees.

The single pointer is usually used for a struct MYFS_args, which
contains what the filesystem needs.

Since these structs differ from filesystem to filesystem, we cannot
reuse the userland mount code, but need a mount_msdos(8), mount_nwfs(8)
and so on, one per filesystem.

We ran out of flags, and rather than just postpone the trouble by
making the flags field wider, we (mux@ & I) created a new mount
system call ("nmount") which can pass lists of options and help
filesystems manage them.

How nmount(2) works
-------------------

In userland, a list of options is collected.  Each option has a name,
which is an ASCII string, and a value, which can be anything:

        build_iovec(&iov, &iovlen, "fstype", "ufs", -1);
        build_iovec(&iov, &iovlen, "fspath", mntpath, -1);
        build_iovec(&iov, &iovlen, "from", dev, -1);
        build_iovec(&iov, &iovlen, "flags", &ufs_flags, sizeof ufs_flags);
        build_iovec(&iov, &iovlen, "export", &export, sizeof export);
        if (nmount(iov, iovlen, mntflags) < 0)
                err(1, "%s", dev);
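
For readers wondering what a helper like build_iovec() has to do, here
is a rough sketch.  It is an assumption based only on the calls above,
not the code from the patch: sketch_build_iovec() is a hypothetical
name, the name/value layout in alternating slots is my guess, and
reading a length of -1 as "NUL-terminated string" is inferred from how
the calls look.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/uio.h>

/*
 * Hypothetical stand-in for build_iovec(): append one name/value pair
 * to a growing iovec array.  A len of (size_t)-1 means "val is a
 * NUL-terminated string; count the terminator too".
 * Returns 0 on success, -1 if memory runs out.
 */
static int
sketch_build_iovec(struct iovec **iov, int *iovlen, const char *name,
    void *val, size_t len)
{
	struct iovec *p;
	char *namecopy;
	size_t namelen;
	int i;

	assert(iov != NULL && iovlen != NULL && name != NULL);

	/* Keep a private copy of the name, terminator included. */
	namelen = strlen(name) + 1;
	namecopy = malloc(namelen);
	if (namecopy == NULL)
		return (-1);
	memcpy(namecopy, name, namelen);

	/* Grow the array by two slots: one for the name, one for the value. */
	p = realloc(*iov, sizeof(*p) * ((size_t)*iovlen + 2));
	if (p == NULL) {
		free(namecopy);
		return (-1);
	}
	*iov = p;
	i = *iovlen;

	/* Even slots carry the option name. */
	p[i].iov_base = namecopy;
	p[i].iov_len = namelen;

	/* Odd slots carry the value; it is referenced, not copied. */
	if (len == (size_t)-1)
		len = (val != NULL) ? strlen(val) + 1 : 0;
	p[i + 1].iov_base = val;
	p[i + 1].iov_len = len;

	*iovlen = i + 2;
	return (0);
}
```

In this sketch the value buffers must stay alive until the array has
been handed to the system call, since only the names are copied.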

On the kernel side, the options are copyin(9)'ed and arranged into a list.
A number of functions allow a filesystem to access elements in the list.
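
To give a feel for what such accessors involve, here is a minimal
userland sketch of looking options up by name.  These are not the
kernel's actual functions: sketch_find_opt() and sketch_copy_opt() are
hypothetical names, and the layout (names in even slots, values in odd
slots, names stored with their terminating NUL) is an assumption made
for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/uio.h>

/*
 * Hypothetical accessor: scan name/value pairs and return the value
 * slot whose name matches, or NULL if the option was not supplied.
 */
static const struct iovec *
sketch_find_opt(const struct iovec *iov, int iovlen, const char *name)
{
	size_t namelen;
	int i;

	assert(iov != NULL && name != NULL);
	namelen = strlen(name) + 1;	/* names are stored with their NUL */
	for (i = 0; i + 1 < iovlen; i += 2) {
		if (iov[i].iov_len == namelen &&
		    memcmp(iov[i].iov_base, name, namelen) == 0)
			return (&iov[i + 1]);
	}
	return (NULL);
}

/*
 * Hypothetical accessor for fixed-size values: copy the option into
 * buf only if its length is exactly what the caller expects.
 * Returns 0 on success, -1 if the option is missing or mis-sized.
 */
static int
sketch_copy_opt(const struct iovec *iov, int iovlen, const char *name,
    void *buf, size_t len)
{
	const struct iovec *v;

	v = sketch_find_opt(iov, iovlen, name);
	if (v == NULL || v->iov_len != len)
		return (-1);
	memcpy(buf, v->iov_base, len);
	return (0);
}
```

Checking the length before copying is the point of the second helper:
a binary value of the wrong size is rejected instead of being
misinterpreted.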

Compatibility
-------------

If an old-style mount call comes in, the filesystem offers a "vfs_cmount"
function which converts old-style arguments into a kernel call to nmount:

        error = copyin(data, &args, sizeof args);
        if (error)
                return (error);

        ma = mount_arg(ma, "from", args.fspec, -1);
        ma = mount_arg(ma, "export", &args.export, sizeof args.export);
        ma = mount_argf(ma, "uid", "%d", args.uid);
        ma = mount_argf(ma, "gid", "%d", args.gid);
        [...]
        error = kernel_mount(ma, flags);
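
The printf-style conversion of struct fields into string options can be
sketched in userland as follows.  This is only an illustration of the
idea, not the kernel's mount_argf(): struct sketch_arg, sketch_argf()
and sketch_arg_get() are hypothetical names, and the fixed buffer sizes
and the linked-list representation are assumptions.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical option list: a singly linked chain of string pairs. */
struct sketch_arg {
	struct sketch_arg *next;
	char name[32];
	char value[64];
};

/*
 * Hypothetical analogue of mount_argf(): format a value printf-style
 * and push it on the list.  Returns the new list head.  On failure the
 * list is returned unchanged; real code would have to record the error.
 */
static struct sketch_arg *
sketch_argf(struct sketch_arg *head, const char *name, const char *fmt, ...)
{
	struct sketch_arg *a;
	va_list ap;
	int n;

	assert(name != NULL && fmt != NULL);
	if (strlen(name) >= sizeof(a->name))
		return (head);		/* name would not fit */
	a = calloc(1, sizeof(*a));
	if (a == NULL)
		return (head);
	strcpy(a->name, name);

	va_start(ap, fmt);
	n = vsnprintf(a->value, sizeof(a->value), fmt, ap);
	va_end(ap);
	if (n < 0 || (size_t)n >= sizeof(a->value)) {
		free(a);		/* formatting failed or was truncated */
		return (head);
	}
	a->next = head;
	return (a);
}

/* Look a name up; returns its string value, or NULL if absent. */
static const char *
sketch_arg_get(const struct sketch_arg *head, const char *name)
{
	for (; head != NULL; head = head->next)
		if (strcmp(head->name, name) == 0)
			return (head->value);
	return (NULL);
}
```

Returning the list head from every call is what makes the
"ma = ...(ma, ...)" chaining style above possible.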

Root mounting
-------------

When we mount the root filesystem, we do so with an ASCII string of the
form "$filesystem:$something".  If the filesystem is able to parse
$something (passed in the mount argument "from"), the filesystem can be
used as root filesystem (NB: there must be a /dev directory or things
go downhill really fast).
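
Splitting such a string is simple enough to sketch.  The helper below
is hypothetical (sketch_split_rootspec() is not a function from the
patch), and cutting at the first colon is my reading of the format
described above: everything before it names the filesystem, everything
after it is what the filesystem sees as "from".

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: split "fstype:from" at the first colon.
 * Copies the filesystem name into fstype and returns a pointer to the
 * remainder inside spec, or NULL if the string has no separator, has
 * an empty filesystem name, or the name does not fit in the buffer.
 */
static const char *
sketch_split_rootspec(const char *spec, char *fstype, size_t fstypelen)
{
	const char *colon;
	size_t n;

	assert(spec != NULL && fstype != NULL);
	colon = strchr(spec, ':');
	if (colon == NULL || colon == spec)
		return (NULL);		/* no separator or empty fstype */
	n = (size_t)(colon - spec);
	if (n >= fstypelen)
		return (NULL);		/* name would not fit */
	memcpy(fstype, spec, n);
	fstype[n] = '\0';
	return (colon + 1);		/* what the filesystem must parse */
}
```

With an illustrative input such as "ufs:/dev/ad0s1a" (my example, not
one from the patch), fstype receives "ufs" and the returned pointer
refers to "/dev/ad0s1a".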

There are no other requirements or special code needed in the filesystem.

In theory, we should be able to use an msdosfs as root filesystem now.

(see other email about how root mounting works)

The Short Future Perspective
----------------------------

After some more testing I will commit my current megapatch and deal
with breakage as we find it.

The Long Future Perspective
---------------------------

A few filesystems need magic userland support: nfs, nwfs and similar,
where userland does some of the network setup.  For these we will still
need a mount_blafs(8) program which knows about these things.

For all other filesystems my hope is that we can keep all mount options
as ASCII strings so that a single mount(8) tool can work for all of
these filesystems.

Currently I have made the omount->nmount conversion as best I could
(refinements are more than welcome), and the kernel will now respond
to both nmount and omount system calls.

Before the 6.0 freeze I want to convert userland to use nmount
exclusively.  This will hopefully amount to getting rid of a number of
mount_foofs(8) programs.

After the 6.0 branch I want to remove the omount compatibility in the
kernel so that 7.x uses nmount exclusively.

So, the problem here is that I cannot even test all these various
filesystems, and I would be surprised if I could find the time to polish
each and every one of them.  I hope the various filesystem owners
and other interested hackers will step in and help me with the last
bit of this.

What needs to be done
---------------------

    UFS/FFS:
	Release global mount flags which are really UFS/FFS private
	flags (MNT_SOFTDEP etc).

    NFS/NFSv4:
	More sharing is possible here I think.

	Root mount code is too magic: we don't really interpret the
	"from" string as far as I can tell; rather, we rely on various
	magic stuff dropped by the bootloader.  Would be nice if we DTRT
	so that one could point a kernel at any NFS server without
	bootloader support.

    All filesystems:
	Verify that things work.
	Tune the option names and error checking.
	Get rid of mount_foofs(8) if possible.
	Documentation.

    All "single user" filesystems:
	There may be a "market" for some library routines to handle
	filesystems which are designed for single-user use (msdosfs etc),
	rather than have the uid/gid/filemode/dirmode handling duplicated
	in three or four separate filesystems.

    Documentation:
	Please help!

*END*
-- 
Poul-Henning Kamp       | UNIX since Zilog Zeus 3.20
phk at FreeBSD.ORG         | TCP/IP since RFC 956
FreeBSD committer       | BSD since 4.3-tahoe
Never attribute to malice what can adequately be explained by incompetence.