RFC: enhancing the root mount logic
M. Warner Losh
imp at bsdimp.com
Mon Aug 23 23:13:16 UTC 2010
In message: <AFBE2FCA-30A6-4E1D-A964-AC4DC4C843EB at juniper.net>
Marcel Moolenaar <marcelm at juniper.net> writes:
: All,
:
: In embedded products, software is possibly installed as an image onto
: an actual storage device. This means that mounting the storage device
: as root is not enough to have a usable root file system. The rough
: draft below is an idea to enhance the root mount from having ad-hoc
: quirks to a well-defined and recursive mechanism that allows a wide
: range of use cases.
:
: The root mount logic is recursive as follows:
: 1. The kernel mounts devfs as root (as it is now).
: 2. The kernel will re-mount root by virtue of reading a file, called
: /.mount.conf, in the current root file system and following the
: directives in it. devfs synthesizes the contents of this file.
:
: At each iteration, the kernel will:
: 1. move the devfs mount from /dev in the old file system to /dev in
: the new file system.
: 2. As per the directives or unconditionally, the kernel will re-mount
: the old root file system under /.mount (or some other name) within
: the new file system.
:
: devfs will synthesize the contents of /.mount.conf as per the kernel
: configuration and tunables. The administrator (or install process)
: will create and populate /.mount.conf for all other cases.
:
: Directives in /.mount.conf are envisioned to be something like:
:
: {FS}:{MOUNTPOINT} e.g. ufs:/dev/da0
: a root mount alternative. The order of the alternatives in
: the file determines the priority.
:
: .ask
: a root mount alternative that asks the operator to specify
: what the root mount should be.
:
: .wait N e.g. .wait 5
: wait at most N seconds for a root mount alternative to
: succeed. If an alternative does not succeed within that
: time, move on to the next alternative.
:
: .onfail {panic|reboot|retry|continue}
: Tells the kernel what to do in case it can't successfully
: complete the root mount as directed to.
:
: The .wait directive works better (probably) if we have events that
: signify the arrival of a file system or device special file, so that
: we can wait for at most N seconds after the last event. This also
: allows us to wait for a separate interval between events.
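
A minimal sketch of how the directive syntax described above might be
parsed, written in Python purely for illustration (the proposal itself
targets the kernel). The names MountConf and parse_mount_conf, and the
defaults used when .wait or .onfail is absent, are assumptions and not
part of the proposal.

```python
from dataclasses import dataclass, field

# The four actions listed for .onfail in the proposal.
ONFAIL_ACTIONS = {"panic", "reboot", "retry", "continue"}


@dataclass
class MountConf:
    # Ordered root mount alternatives: ("ufs", "/dev/da0") or (".ask", None).
    alternatives: list = field(default_factory=list)
    wait: int = 0          # assumed default: do not wait
    onfail: str = "panic"  # assumed default: not stated in the proposal


def parse_mount_conf(text):
    """Parse the contents of a /.mount.conf file into a MountConf."""
    conf = MountConf()
    for raw in text.splitlines():
        parts = raw.split()
        if not parts:
            continue  # skip blank lines
        word = parts[0]
        if word == ".ask":
            conf.alternatives.append((".ask", None))
        elif word == ".wait":
            if len(parts) != 2:
                raise ValueError("usage: .wait N")
            conf.wait = int(parts[1])
        elif word == ".onfail":
            if len(parts) != 2 or parts[1] not in ONFAIL_ACTIONS:
                raise ValueError("bad .onfail line: " + raw.strip())
            conf.onfail = parts[1]
        elif word.startswith("."):
            raise ValueError("unknown directive: " + word)
        else:
            # {FS}:{MOUNTPOINT}; split on the first colon only.
            fs, sep, target = word.partition(":")
            if not (sep and fs and target):
                raise ValueError("bad alternative: " + raw.strip())
            conf.alternatives.append((fs, target))
    return conf
```

Keeping .ask in the same ordered list as the {FS}:{MOUNTPOINT} entries
preserves the priority order that the file is said to define.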
:
: As an example, consider:
:
: [devfs] /.mount.conf:
: ufs:/dev/da0
: .ask
: .wait 5
: .onfail panic
:
: [ufs:/dev/da0] /.mount.conf
: md0:/images/OS-image-1.0.iso
: unionfs:/jail/freebsd-8-stable
: .wait 0
: .onfail continue
:
: In the example, the kernel will mount devfs, read /.mount.conf and
: wait at most 5 seconds to mount the UFS on /dev/da0. If that fails,
: the kernel will ask (once) and panic in case of failure.
:
: If the UFS root mount succeeded, the kernel will re-mount devfs
: underneath /dev. Since this is the first non-devfs root file system,
: the kernel will not re-mount the old root under /.mount.
:
: Since there's a /.mount.conf on the UFS, the kernel will read it
: and repeat the process. First it'll try and mount the OS image
: in /images/OS-image-1.0.iso and if it's not present will try to
: mount some -stable 8 chroot using unionfs (not necessarily a
: real-world example here :-) If both fail, the kernel will
: continue booting using the current root file system. Assuming that
: the image is present, the kernel will re-mount root, move devfs
: underneath /dev in the MD root and remount ufs:/dev/da0 under
: /.mount in the MD root. This gives the following picture:
:
: / md0:[ufs:/dev/da0]/images/OS-image-1.0.iso
: /.mount ufs:/dev/da0
: /dev devfs
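
The walk-through above can be modelled with a small simulation. This is
only a sketch in Python of the recursion as described, not kernel code:
resolve_root, the configs mapping and the mountable set are hypothetical
names, and .ask and .wait are left out to keep it short.

```python
def resolve_root(configs, mountable):
    """Simulate the recursive root mount and return the final mount table.

    configs maps a root file system name to its /.mount.conf, given as
    {"alternatives": [...], "onfail": "..."}; mountable is the set of
    alternatives that would mount successfully.
    """
    root = "devfs"
    table = {"/": "devfs"}
    visited = {root}
    while root in configs:
        conf = configs[root]
        # The first alternative that mounts wins (file order is priority).
        chosen = next((a for a in conf["alternatives"] if a in mountable), None)
        if chosen is None:
            if conf["onfail"] == "continue":
                break  # keep booting from the current root
            raise RuntimeError("root mount failed: " + conf["onfail"])
        if chosen in visited:
            raise RuntimeError("mount loop at " + chosen)
        visited.add(chosen)
        old_root = table["/"]
        # devfs always moves to /dev in the new root.
        table = {"/": chosen, "/dev": "devfs"}
        # The initial devfs root is not kept under /.mount.
        if old_root != "devfs":
            table["/.mount"] = old_root
        root = chosen
    return table


configs = {
    "devfs": {"alternatives": ["ufs:/dev/da0"], "onfail": "panic"},
    "ufs:/dev/da0": {
        "alternatives": ["md0:/images/OS-image-1.0.iso",
                         "unionfs:/jail/freebsd-8-stable"],
        "onfail": "continue",
    },
}
final = resolve_root(configs, {"ufs:/dev/da0", "md0:/images/OS-image-1.0.iso"})
```

With the image present, final holds the three entries from the picture
above; with neither second-level alternative mountable, the UFS on
/dev/da0 stays as root because of .onfail continue.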
:
:
: Things not explicitly touched upon:
: o root mount options
: o directives to instruct the kernel what to run as the initial
: process to eliminate the rather ad-hoc hardcoding. E.g:
: .init /sbin/init
: .init /sbin/init.old
:
: Is this something that people feel is worth fleshing out and
: prototyping?
This sounds very interesting. If kept simple, I could see how this
would make my life a lot easier.

However, all this scripting sounds a bit like a very simple shell in
the kernel. What advantages are there to this approach vs having the
ability to run a simple shell script or executable and "pivot" the root
to a new location? And how do you emulate the mount_foo programs for
foo filesystems? Some of them do weird things that might not
translate well into the kernel...

As you can see, I'm torn about how I feel about the idea. For simple
cases, I think it is great, but as complexity builds, I become less
sure. What if that iso image was compressed? What if I had a
software RAID of disks or flash devices? What about crypto? I know I
can handle those cases in /bin/sh, but will each new one require more
code in the kernel? What would df and/or mount tell you about the
now-hidden file systems?

Warner