misc/118160: unable to mount / rw while booting 7.0-BETA3
brde at optusnet.com.au
Tue Nov 20 23:10:03 PST 2007
The following reply was made to PR misc/118160; it has been noted by GNATS.
From: Bruce Evans <brde at optusnet.com.au>
To: Yuri <yuri at tsoft.com>
Cc: freebsd-gnats-submit at freebsd.org, freebsd-bugs at freebsd.org
Subject: Re: misc/118160: unable to mount / rw while booting 7.0-BETA3
Date: Wed, 21 Nov 2007 18:02:10 +1100 (EST)
On Wed, 21 Nov 2007, Yuri wrote:
> After recompiling and reinstalling the current BETA3 my system has a reboot problem.
> While booting log says:
> Starting file system checks:
> <here goes the list of file systems that it reports, this is ok>
> mount: : Operation not permitted.
This is probably a secondary problem. You apparently have the root device
mounted on "" or something like that.
> Mounting root file system rw failed, startup aborted.
> /etc/rc: WARNING: $true is not set properly - see rc.conf(5)
Whatever caused this is probably the main problem.
> and system gets to single user mode.
> In single user mode / is read-only. And command 'mount -uw /' fails with Operation not permitted. I couldn't find the workaround so far.
Please keep line lengths below 80 in mail.
What does mount show for the root device?
> The major bug seems to be in the 'mount' system call. 'man mount' says that EPERM is returned if "The caller is neither the super-user nor the owner of dir." I am root.
This was broken in GEOM somewhere near g_vfs_access(). g_vfs_access()
returns EPERM for all errors involving exclusive access. This breaks
the documented behaviour of [n]mount() returning EBUSY for attempts to
mount the same device more than once (unless all mounts are r/o -- multiple
r/o mounts are broken differently, by allowing them and panicking on a
garbage bufobj pointer later).
You are apparently attempting to mount the same device twice (even though
-u specifies an already-mounted device, the kernel is apparently confused
about where it is mounted).
> The secondary problem is this printout: WARNING: $true is not set properly - see rc.conf(5)
> It shouldn't print $true
Fix this first.
> Another secondary problem is with man mount(2). Isn't it supposed to mention that setting securelevel also makes 'mount' return EPERM?
I think securelevels break a lot of man pages like that.
> So now I can reboot normally only choosing "single user mode" when I boot and running "mount -uw /" as a single user. And then continuing the boot process.
Yes, it makes some sense that mounting / r/w in the right place gets it
mounted r/w before other things mess it up. Don't forget to run fsck -p
manually before continuing.
I can now see a plausible way to reach the bad state:
- after booting, the root device is mounted on / r/o with no problems
- mistype a mount command or have $true generate a wrong mount command,
  so that the root device is mounted somewhere else (I don't know how
  it can be on "", but it could be on " " or on any valid pathname).
  If you preemptively mount it r/w, then this other mount will fail
  -- look in the logs for messages about this.
- now try to remount / r/w normally. This will fail due to the r/o mount
  not on /.
- if there is only 1 extra r/o mount of /, then the r/w mount should work
  after unmounting the extra. If there are several extras, then unmounting
  them in a certain order should give the bufobj panic.
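
The sequence above can be sketched with a small userland toy model.
This is not the GEOM code; it only assumes that each mount holds
read/write/exclusive counts on the device, that every mount (even r/o)
takes an exclusive count, and that a write/exclusive conflict is
refused with EPERM as described earlier. The names Provider, mount,
upgrade_rw and unmount are made up for illustration, and the bufobj
panic is not modeled.

```python
import errno


class Provider:
    """Toy stand-in for one disk device shared by several mounts."""

    def __init__(self):
        # mount point -> [read, write, exclusive] counts it holds
        self.counts = {}

    def access(self, who, dr, dw, de):
        """Adjust the counts held by `who`; return 0 or an errno value."""
        others = [c for name, c in self.counts.items() if name != who]
        other_w = sum(c[1] for c in others)
        other_e = sum(c[2] for c in others)
        # Exclusive requested while someone else writes, or write
        # requested while someone else holds exclusive: refused with
        # EPERM here, where mount(2) documents EBUSY.
        if (de > 0 and other_w > 0) or (dw > 0 and other_e > 0):
            return errno.EPERM
        c = self.counts.setdefault(who, [0, 0, 0])
        c[0] += dr
        c[1] += dw
        c[2] += de
        if c == [0, 0, 0]:
            del self.counts[who]
        return 0


def mount(p, where, rw):
    # Model assumption: every mount takes an exclusive count, even r/o.
    return p.access(where, 1, 1 if rw else 0, 1)


def upgrade_rw(p, where):
    # Model of "mount -uw": ask for one more write count.
    return p.access(where, 0, 1, 0)


def unmount(p, where):
    r, w, e = p.counts[where]
    return p.access(where, -r, -w, -e)


def scenario():
    """Root r/o, stray r/o mount, failed upgrade, unmount, upgrade."""
    p = Provider()
    return [
        mount(p, "/", rw=False),       # root mounted r/o at boot
        mount(p, "/stray", rw=False),  # second r/o mount is allowed
        upgrade_rw(p, "/"),            # refused: stray holds exclusive
        unmount(p, "/stray"),          # drop the extra mount
        upgrade_rw(p, "/"),            # now the upgrade goes through
    ]


def preemptive():
    """Root already r/w, so the stray mount is the one that fails."""
    p = Provider()
    return [mount(p, "/", rw=True), mount(p, "/stray", rw=False)]


if __name__ == "__main__":
    for codes in (scenario(), preemptive()):
        print([errno.errorcode.get(c, "ok") for c in codes])
```

In this model the third step of scenario() is the only one that fails,
and it fails with EPERM although nothing is wrong with the caller's
permissions, which is the confusing part.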
The EPERM instead of EBUSY error is very confusing. Another variation
on it is that after shutdown to single user mode (using "kill -TERM
1" or similar), and unmounting all devices except / and /dev, and
remounting / r/o, "fsck -p" and "fsck /" are broken due to problems
near g_access(). They fail with the now familiar error EPERM. Some
file systems have a hack to allow fsck to work after booting,
but it doesn't apply later.