mount(8) bug? rc.d/mountlate bug? bug in both?

Fri Dec 2 19:27:59 UTC 2011

Update below, inline.

> -----Original Message-----
> From: Devin Teske [mailto:devin.teske at fisglobal.com]
> Sent: Friday, December 02, 2011 2:09 AM
> To: 'Doug Barton'
> Cc: 'freebsd-rc at freebsd.org'; 'Ken Smith'; 'Parker-Smith'; 'Dave at FreeBSD.ORG';
> 'phk at freebsd.org'; 'Julian Elischer'; devin.teske at fisglobal.com
> Subject: RE: mount(8) bug? rc.d/mountlate bug? bug in both?
> 
> 
> 
> > -----Original Message-----
> > From: Doug Barton [mailto:dougb at FreeBSD.org]
> > Sent: Thursday, December 01, 2011 9:32 PM
> > To: Devin Teske
> > Cc: freebsd-rc at freebsd.org; Ken Smith; Parker-Smith; Dave at FreeBSD.ORG;
> > phk at freebsd.org; 'Julian Elischer'
> > Subject: Re: mount(8) bug? rc.d/mountlate bug? bug in both?
> >
> > Short answer, tag the mount(s) noauto in fstab, and mount them in
> > /etc/rc.local.
> >
> 
> That may be the simplest approach.
> 
> However, we're looking more for a solution that involves keeping the NFS
> mounts in fstab(5).
> 
> Reason: We are planning to upgrade hundreds of machines from FreeBSD-4.11
> (where fstab(5) NFS failures are non-fatal) to FreeBSD-8.1 (where they are
> fatal), and the process would go smoother if we didn't have to migrate the
> fstab(5) NFS entries in the manner you describe above.
> 
> Researching and testing further, it appears that the "bg" option is almost
> entirely what we want.
> 
> I was able to successfully achieve what I wanted with the following in
> fstab(5):
> 
> 	bogus:/bogus /bogus nfs tcp,rw,nosuid,bg,late 0 0
> 
> AND adding to /etc/hosts:
> 
> 	128.0.0.1	bogus
> 
> A timeout occurs attempting to connect to the system named "bogus" and, upon
> failure, mount_nfs(8) properly daemon(3)izes into the background,
> reattempting the mount every 60 seconds.
> 
> However, if I take out the /etc/hosts entry shown above (making the name
> unresolvable), mount_nfs(8) exits immediately with error status rather than
> honoring the "bg" option. Just because the host's address can't be resolved
> right now doesn't mean it won't be resolvable in 60 seconds, so the "bg"
> option should arguably apply when the name can't be resolved too, if not
> especially so.
> 
> I propose the following [UNTESTED] patch, which tries to make remote-errors
> non-fatal to a filesystem marked as "bg".
> 
> ========== BEGIN PATCH EXCERPT ==========
> --- sbin/mount_nfs/mount_nfs.c.orig     Fri Dec  2 00:46:44 2011
> +++ sbin/mount_nfs/mount_nfs.c  Fri Dec  2 01:20:38 2011
> @@ -803,9 +803,17 @@ getnfsargs(char *spec, struct iovec **io
>                 if (ret == TRYRET_SUCCESS)
>                         break;
> 
> -               /* Exit if all errors were local. */
> -               if (!remoteerr)
> -                       exit(1);
> +               if ((opflags & (BGRND | ISBGRND)) == BGRND) {
> +                       warnx("Cannot immediately mount %s:%s, backgrounding",
> +                           hostp, spec);
> +                       opflags |= ISBGRND;
> +                       if (daemon(0, 0) != 0)
> +                               err(1, "daemon");
> +               } else {
> +                       /* Exit if all errors were local. */
> +                       if (!remoteerr)
> +                               exit(1);
> +               }
> 
>                 /*
>                  * If retrycnt == 0, we are to keep retrying forever.
> @@ -814,13 +822,6 @@ getnfsargs(char *spec, struct iovec **io
>                 if (retrycnt != 0 && --retrycnt == 0)
>                         exit(1);
> 
> -               if ((opflags & (BGRND | ISBGRND)) == BGRND) {
> -                       warnx("Cannot immediately mount %s:%s, backgrounding",
> -                           hostp, spec);
> -                       opflags |= ISBGRND;
> -                       if (daemon(0, 0) != 0)
> -                               err(1, "daemon");
> -               }
>                 sleep(60);
>         }
>         freeaddrinfo(ai_nfs);
> ========== END PATCH EXCERPT ==========
> 

I'm still interested in feedback on the above patch, in particular on whether
we can reach consensus that the "bg" option should apply even when the hostname
can't be resolved, not just when the connection times out.
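
For readers who want the proposed semantics spelled out, here is a rough sh
sketch of "keep retrying no matter why the attempt failed". It is only a model
of the behavior being asked for, not code from mount_nfs(8); the names
retry_mount, attempt_cmd, max_tries and interval are made up for illustration.

```shell
#!/bin/sh
# Model of the proposed "bg" semantics: any failed attempt (timeout OR
# unresolvable name) is retried instead of being treated as fatal.
# All names here are illustrative; none are mount_nfs(8) options.
retry_mount()
{
	attempt_cmd=$1	# command performing ONE mount attempt
	max_tries=$2	# 0 = retry forever (like retrycnt == 0 in the patch)
	interval=$3	# seconds between attempts (mount_nfs sleeps 60)
	tries=0

	until $attempt_cmd; do
		tries=$((tries + 1))
		# Give up once the retry budget is used, unless it is unlimited.
		if [ "$max_tries" -ne 0 ] && [ "$tries" -ge "$max_tries" ]; then
			return 1
		fi
		sleep "$interval"
	done
	return 0
}
```

Assuming the fstab(5) entry from the earlier message, something like
`retry_mount "mount /bogus" 0 60 &` would keep trying from the background
while boot continues.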

However, for the immediate remedy, we've generated the following [TESTED] patch
(which merely allows /etc/rc.d/mountlate to be disabled via rc.conf(5) --
achieving the goal of making it impossible for ANY network-based filesystems to
drop the system into single-user mode on boot)...

NOTE: Patch is up-to-date, generated against HEAD, cvsup'd on December 2nd, 2011
(today).

========== BEGIN PATCH EXCERPT ==========

--- etc/defaults/rc.conf.orig   Fri Dec  2 11:21:08 2011
+++ etc/defaults/rc.conf        Fri Dec  2 11:22:31 2011
@@ -93,6 +93,7 @@
 netfs_types="nfs:NFS oldnfs:OLDNFS smbfs:SMB portalfs:PORTAL nwfs:NWFS" # Net filesystems.
 extra_netfs_types="NO" # List of network extra filesystem types for delayed
                        # mount at startup (or NO).
+mountlate_enable="YES" # Mount critical late/remaining filesystems in fstab(5)

 ##############################################################
 ###  Network configuration sub-section  ######################
--- etc/rc.d/mountlate.orig     Fri Dec  2 11:19:51 2011
+++ etc/rc.d/mountlate  Fri Dec  2 11:20:28 2011
@@ -11,6 +11,7 @@
 . /etc/rc.subr

 name="mountlate"
+rcvar="`set_rcvar`"
 start_cmd="mountlate_start"
 stop_cmd=":"

========== END PATCH EXCERPT ==========

NOTE: After applying the above patch, we simply add ``mountlate_enable="NO"'' to
rc.conf(5) -- and voila! NFS/SMBFS/other network filesystems are no longer
critical to booting into multi-user mode (if they fail, they fail; the system
still arrives in multi-user mode, where we can service it remotely if
necessary).
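
For anyone unfamiliar with how an rcvar gates a script, the effect can be
sketched roughly as below. This is a simplified stand-in, not the real rc.subr
code: is_enabled only imitates what checkyesno does, and run_mountlate is a
hypothetical wrapper standing in for the check that run_rc_command performs
once rcvar is set.

```shell
#!/bin/sh
# Simplified imitation of rc.subr's checkyesno; NOT the real implementation.
is_enabled()
{
	# $1 is the NAME of an rc.conf(5) variable, e.g. mountlate_enable.
	eval _value=\"\$$1\"
	case "$_value" in
	[Yy][Ee][Ss]|[Tt][Rr][Uu][Ee]|[Oo][Nn]|1)
		return 0 ;;
	*)
		return 1 ;;
	esac
}

# Hypothetical wrapper: run the start command only when the knob is on.
run_mountlate()
{
	if is_enabled mountlate_enable; then
		echo "mountlate: would attempt the late mounts"
	else
		echo "mountlate: skipped (mountlate_enable is not YES)"
	fi
}

# What rc.conf(5) would contain on our machines after the patch.
mountlate_enable="NO"
run_mountlate
```

With the default of "YES" from etc/defaults/rc.conf, behavior is unchanged for
everyone who does not set the variable.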

Regarding the above patch, it seems like a good idea to apply this to HEAD. Can
anyone think of any reason that the above patch is a bad idea for the HEAD of
FreeBSD? Or, on the flip side, does anybody else second the idea that this
should be applied?
-- 
Devin


> 
> Other comments on responses to initial situation below.
> 
> 
> > More detailed analysis of your situation follows.
> >
> > On 12/01/2011 20:34, Devin Teske wrote:
> > > Hi -RC@, Julian, Poul, and Ken,
> > >
> > > We need your help on FreeBSD-8.1!
> > >
> > > Please read the following dossier on our issue with simply
> > > attempting to add a single NFS mount to fstab(5) ***without*** the
> > > side-effect of rebooting into single-user mode should that mount
> > > fail for ANY reason during boot.
> > >
> > > FULL-DISCLOSURE: We've already tried marking the filesystem as "late"
> > > and/or "bg" to no avail. We've traced the problem down to a possible
> > > bug in either mount(8) or the `/etc/rc.d/mountlate' boot-script. Need
> > > confirmation that this is a bug, OR a work-around to eliminate the
> > > numerous edge-cases where we can reliably cause the system to boot
> > > into single-user mode.
> >
> > I don't think it's a bug in either.
> 
> You may be right. More below.
> 
> 
> >
> > > Corollary:
> > >
> > > Having a workstation 3000+ miles away in India reboot into
> > > single-user mode simply because of a momentary network hiccup (or
> > > any other situation that could cause failure of the NFS mount) at
> > > boot is what we're trying to avoid. That is, we want to keep a
> > > physically distant system from becoming permanently unresponsive,
> > > which would require significant expenditure/effort to rectify.
> >
> > If you're administrating a remote system it's reasonable to assume
> > that you have a serial console on it. That said, we're certainly not
> > *trying* to break stuff willy-nilly.
> 
> No serial console enabled on the workstations/desktops.
> 
> 
> >
> > > Possible Bug:
> > >
> > > As the system is booting, /etc/rc.d/mountcritremote attempts to
> > > mount the filesystem. It fails. This is OK (because mountcritremote
> > > does not return FAILURE status -- it returns SUCCESS and boot proceeds
> > > as expected).
> > >
> > > Later, /etc/rc.d/mountlate runs and attempts to mount it again. It
> > > fails again except this time mountlate calls "stop_boot" after the
> > > failure (dropping us to single-user mode).
> >
> > This is the expected/desired behavior.
> 
> Fair enough.
> 
> 
> >
> > > The "possible bug" comes into play in reading /etc/rc.d/mountlate
> > > and finding out just how exactly it determines that it should have
> > > been mounting this filesystem in the first place.
> > >
> > > mountlate calls "/sbin/mount -d -a -l" to determine if there are any
> > > "late" filesystems to mount.
> >
> > No. The -a in there means that it's looking for *all* unmounted file
> > systems, including those marked "late."
> 
> Correct.
> 
> 
> >
> > In fact a key reason for the division between mountcritremote and
> > mountlate is that circumstances on the system may have changed to
> > allow things that failed the first time to be mounted, *in addition
> > to* specifically marking certain entries "late" because we know that
> > they cannot succeed until later in the boot.
> >
> 
> Right.
> 
> 
> > > The filesystem is NOT marked as "late", but "/sbin/mount -d -a -l"
> > > will still report it because it's not yet mounted.
> >
> > Right-O.
> >
> 
> All good.
> --
> Devin
