[repost] multiple filesystems sharing/clobbering device vnode

Andriy Gapon avg at icyb.net.ua
Tue Feb 10 10:35:18 PST 2009

Unfortunately I wasn't able to devote enough time/thinking to this
issue, so I am cowardly resorting to just reminding about it.

-------- Original Message --------
Subject: multiple filesystems sharing/clobbering device vnode
Date: Sat, 01 Mar 2008 11:33:37 +0200
From: Andriy Gapon <avg at icyb.net.ua>
To: freebsd-arch at freebsd.org

First, a little demonstration suggested by Bruce Evans:
[I hope you will continue reading after reboot]
1. mount_cd9660 /dev/acd0 /mnt1
2. mount -r /dev/acd0 /mnt2 # -r is important
3. ls -l /mnt1

The issue can be laconically described as follows:
1. We do not disallow multiple RO mounts of the same device (which could
be done either on purpose or by accident).
2. All popular (on-disk) filesystems use/clobber the bufobj of the device's
vnode, even for RO mounts; some (ufs) do that even if the mount fails.
3. There are no provisions for such shared access: each filesystem
acts as if it were the exclusive owner of the vnode / its bufobj.

Small snippet of code that speaks for itself (the most interesting lines
are marked with XXX at the beginning):
int
g_vfs_open(struct vnode *vp, struct g_consumer **cpp, const char *fsname, int wr)
{
        struct g_geom *gp;
        struct g_provider *pp;
        struct g_consumer *cp;
        struct bufobj *bo;
        int vfslocked;
        int error;

        *cpp = NULL;
        pp = g_dev_getprovider(vp->v_rdev);
        if (pp == NULL)
                return (ENOENT);
        gp = g_new_geomf(&g_vfs_class, "%s.%s", fsname, pp->name);
        cp = g_new_consumer(gp);
        g_attach(cp, pp);
        error = g_access(cp, 1, wr, 1);
        if (error) {
                g_wither_geom(gp, ENXIO);
                return (error);
        }
        vfslocked = VFS_LOCK_GIANT(vp->v_mount);
        vnode_create_vobject(vp, pp->mediasize, curthread);
        *cpp = cp;
XXX     bo = &vp->v_bufobj;
XXX     bo->bo_ops = g_vfs_bufops;
XXX     bo->bo_private = cp;
XXX     bo->bo_bsize = pp->sectorsize;
        gp->softc = bo;

        return (error);
}

In addition to this, some filesystems (ufs) directly modify v_bufobj.
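
To make the clobbering concrete, here is a minimal userspace sketch of what
the XXX lines do when two filesystems open the same device vnode. It is only
a model, not kernel code: every model_* name is hypothetical, the structures
are cut-down stand-ins for struct vnode / struct bufobj / struct g_consumer,
and the 2048-byte sector size is an assumed value for a CD.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-ins (hypothetical); NOT the kernel definitions. */
struct model_consumer {
        const char *fsname;             /* which filesystem attached */
};

struct model_bufobj {
        void *bo_private;               /* consumer of the "owning" fs */
        long bo_bsize;                  /* sector size of the provider */
};

struct model_vnode {
        struct model_bufobj v_bufobj;   /* one bufobj per device vnode */
};

/*
 * Mirrors the XXX lines above: the shared bufobj is overwritten
 * unconditionally, with no check for an existing owner.
 */
static void
model_vfs_open(struct model_vnode *vp, struct model_consumer *cp,
    long sectorsize)
{
        struct model_bufobj *bo = &vp->v_bufobj;

        bo->bo_private = cp;
        bo->bo_bsize = sectorsize;
}

/* Returns 1 if the bufobj still points at the given mount's consumer. */
static int
model_bufobj_matches(const struct model_vnode *vp,
    const struct model_consumer *cp)
{
        return (vp->v_bufobj.bo_private == cp);
}

/* Two mounts of one device; returns 1 if the first one was clobbered. */
static int
model_demo_clobber(void)
{
        struct model_vnode dev = { { NULL, 0 } };
        struct model_consumer cd9660 = { "cd9660" };
        struct model_consumer ufs = { "ufs" };

        model_vfs_open(&dev, &cd9660, 2048);    /* mount_cd9660 ... /mnt1 */
        assert(model_bufobj_matches(&dev, &cd9660));
        model_vfs_open(&dev, &ufs, 2048);       /* mount -r ... /mnt2 */

        /* The first mount's I/O would now go through the wrong consumer. */
        return (!model_bufobj_matches(&dev, &cd9660));
}
```

In this model the second open silently replaces bo_private, so the first
filesystem keeps using a bufobj that now belongs to someone else. That
mirrors the three-step demonstration above, where the ls on /mnt1 comes
after the second mount.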

I've been pondering this issue for over a month now. I have some ideas,
but they are all wanting in one aspect or another.

I would like to hear ideas and opinions of the people on this list.

P.S. For those who didn't actually run the test, here's a hand-copied
excerpt from the stack trace:

Andriy Gapon
