nosh init system

Sun Feb 10 16:31:08 UTC 2019

In message <43C091FC-18ED-49DF-A488-784DC2329D52 at gmail.com>, Enji Cooper
writes:
> On Feb 9, 2019, at 20:20, Rodney W. Grimes <freebsd-rwg at pdx.rh.cn85.dnsmgr.net> wrote:
>
> >> In message <CAG6CVpWXOA6r_aJcefxQBu2QZxprf1ZpDoTb4eb2JSwWsE2m+g at mail.gmail.com>,
> >> Conrad Meyer writes:
> >>> Hi Cy,
> >>> 
> >>>> On Sat, Feb 9, 2019 at 3:35 PM Cy Schubert <Cy.Schubert at cschubert.com> wrote:
> >>>> I don't see what's so "incredibly fragile" about rc(8). That's not to
> >>>> say there aren't better solutions, like SMF.
> >>> 
> >>> Maybe "incredibly" as a choice of adjective is inappropriate.  I think
> >>> we (you, me, and ngie@) can all agree it is somewhat fragile, and
> >>> there are things SMF/systemd/nosh get right that rc(8) does not
> >>> (today).  Anyway, your next paragraph goes on to be a good start at
> >>> describing some of rc's fragility.  :-)
> >>> 
> >>>> Where rc(8) falls down is that any port's or customer's (user of FreeBSD)
> >>>> rc script could fail, hosing the boot or, worse, hosing the system*. Where a
> >>>> solution like SMF solves the problem is that should a service which
> >>>> other services depend on fail, only that branch of the startup tree
> >>>> would fail.
> >>> 
> >>> Right; that's a great example.
> >>> 
> >>>> In that scenario, if a service fails but sshd starts, a
> >>>> sysadmin would still be able to log in remotely to resolve the problem.
> >>>> So in this regard rc(8) is at a disadvantage.
> >>>> 
> >>>> We could address the above paragraph by starting sshd earlier during
> >>>> boot thereby allowing the opportunity to fix remotely.
> >>> 
> >>> I don't think that is really sufficient without substantially
> >>> modifying init+rc to be closer to something like systemd or SMF,
> >>> anyway.  And then we'd rather just have something like SMF :-).
> >> 
> >> I'd rather see SMF but a number felt a CDDL licensed init was 
> >> unacceptable -- except for the fact that SMF doesn't replace init.
> >> 
> >>> 
> >>> As soon as *any* rc service fails to start (signal, non-zero exit,
> >>> stop_boot), rc(8) exits non-zero, causing init(8) to go to single
> >>> user.  All service state is thrown away with rc(8) exit, but any rc.d
> >>> "services" that managed to start before boot failed are not
> >>> terminated.  Even if an admin manages to log in and fix the
> >>> configuration, re-starting rc(8) restarts the runcom process from
> >>> scratch, as if nothing had already been done, without first stopping
> >>> anything that was already running.  The only safe, reproducible way to
> >>> re-start rc(8) is to fully reboot the system.
> > 
> > It -should- be safe to restart rc, as rc scripts should check to
> > see if the item they are being requested to start is already running,
> > rc scripts that fail to have this check are defective and should be
> > fixed.  You should be able to invoke /etc/rc.d/foo start as many
> > times as you want in a row and only get 1 instance of foo, with the
> > other starts returning "foo already running".  Same with stop.
>
> I'm not sure if Conrad is referring to the Isilon way of restarting
> services. If so, the Isilon parallel start process would effectively wipe
> the slate clean and restart everything if interrupted, which (because of
> the nature of cleanvar, etc.) would wipe out any and all pidfiles,
> resulting in a weird set of services which fail to start on the next run
> through.
>
> In short, I think the fact that Isilon didn't mount tmpfs to /var/run was
> begging for pain, as it's a directory one should only set up once at boot.

Regardless of whether they use tmpfs or not, services should be 
constructed in such a manner that they still work if the customer 
chooses not to use tmpfs.
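
As a rough illustration only: a start action that does not rely on
/var/run having been wiped at boot could check whether the pidfile
names a live process and treat anything else as stale. This is a
sketch, not the rc.subr(8) code. The function names is_running and
start_once are made up for this example, and it assumes the service
runs in the foreground so that $! is its pid.

```shell
#!/bin/sh
# Hypothetical helpers for illustration; not part of rc.subr(8).

# is_running PIDFILE: succeed only if PIDFILE holds the pid of a process
# we can signal. A missing, empty, or non-numeric pidfile counts as stale.
# Caveat: kill -0 cannot tell whether a recycled pid belongs to some
# unrelated process, and it fails on processes we lack permission to signal.
is_running() {
    pidfile=$1
    [ -r "$pidfile" ] || return 1
    pid=$(cat "$pidfile" 2>/dev/null)
    case $pid in
        ''|*[!0-9]*) return 1 ;;
    esac
    kill -0 "$pid" 2>/dev/null
}

# start_once NAME PIDFILE COMMAND...: start COMMAND in the background
# unless PIDFILE already records a live instance.
start_once() {
    name=$1
    pidfile=$2
    shift 2
    if is_running "$pidfile"; then
        echo "$name already running"
        return 0
    fi
    rm -f "$pidfile"    # discard a stale pidfile, e.g. one left on a non-tmpfs /var/run
    "$@" </dev/null >/dev/null 2>&1 &
    echo $! > "$pidfile"
    echo "$name started"
}
```

Calling start_once repeatedly with the same pidfile then yields one
instance, and a leftover pidfile from an earlier boot does not block
the start.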

This also goes for those who mount /usr separately, like I do (which has 
saved my bacon as recently as a couple of weeks ago). A change made to 
one of the RC scripts assumed /usr was on rootfs. (When I raised the 
issue the reply was "you should have /usr on / anyway.") My point is 
that we assume our way of setting up a server is the only way, and we 
bulldoze. In reality, FreeBSD systems, and commercial UNIX systems 
before them, were set up in a variety of ways. It's only since Linux 
became so popular that it has been assumed that one size fits all.

These are two examples of why this approach doesn't work. POLA is 
painful.

>
> That being said, there are other pseudo services that aren't necessarily
> idempotent. If they run twice, the second run could result in breakage to
> other dependent services run after them.

Cleanvar, being the focus of much of our discussion, should be able to 
determine that it has already run.

I'm purposely not discussing implementation details.

-- 
Cheers,
Cy Schubert <Cy.Schubert at cschubert.com>
FreeBSD UNIX:  <cy at FreeBSD.org>   Web:  http://www.FreeBSD.org

	The need of the many outweighs the greed of the few.