(Circumvented) insta-panic from "pkg upgrade" stable/11 @r308090

David Wolfskill david at catwhisker.org
Sun Oct 30 18:21:02 UTC 2016


Summary: I've worked around this -- at least, for now -- but a process
I've been using every Sunday since July 2015 on a pair of machines
suddenly failed this morning (on just one of the machines).

For background (if you're interested):
* <http://www.catwhisker.org/~david/FreeBSD/upgrade.html>
* <http://www.catwhisker.org/~david/FreeBSD/convert_i386_amd64.html>
* <http://www.catwhisker.org/~david/FreeBSD/history/>

So... this morning, the update from:

FreeBSD albert.catwhisker.org 11.0-STABLE FreeBSD 11.0-STABLE #95  r307797M/307819:1100505: Sun Oct 23 03:52:44 PDT 2016     root@freebeast.catwhisker.org:/common/S1/obj/usr/src/sys/ALBERT  amd64

to:

FreeBSD albert.catwhisker.org 11.0-STABLE FreeBSD 11.0-STABLE #102  r308090M/308101:1100506: Sun Oct 30 04:09:05 PDT 2016     root@freebeast.catwhisker.org:/common/S1/obj/usr/src/sys/ALBERT  amd64

Just Worked -- as usual.  I rebooted both "production" machines,
logged in, fired up tmux on each, rotated my typescript files, then
fired up script and ran the csh command alias I use on both machines
to update the installed ports (from the locally-built packages that
reside on my build machine; the production machines access them via
NFS).

For one of the machines ("bats"), things Just Worked (again).

For the other ("albert"), I lost contact.  Eventually (after I actually
got up and went to the room where the machines are), I found that it had
rebooted.

Further experimentation showed that in the command sequence:

mount -u -w / && \
mount -u -w /usr && \
( cd /etc/mail && make stop-mta ) && \
service dovecot stop && \
service apache24 stop && \
pkg upgrade

it got through "service apache24 stop" OK, but when I issued "pkg
upgrade", the screen blanked and the machine started rebooting.

On reboot, the /var file system (UFS2+soft updates) showed the
typescript files from before the above efforts -- not even the
"rotation" (mentioned above) was reflected.  (The typescript files in
question reside in /var/tmp on the machine.)  Oh: and the initial "fsck
-p" for /var indicated that fsck needed to be re-run (so when I booted
to single-user mode, I did just that).

There was no hint in the logs of why the reboot (panic?) occurred.

One point that may be at issue is that for bats (where things still
worked), I manually mount the package repository from the build machine
to bats:/mnt, while for albert (where things failed), I have depended on
autofs to handle the mounting as needed (since I need albert to run
autofs anyway, and bats does not).  E.g.:

bats(11.0-S)[1] cat /usr/local/etc/pkg/repos/custom.conf 
custom: {
        # url: file:///net/freebeast/tank/poudriere/poudriere/data/packages/11amd64-ports-home
        url: file:///mnt
        enabled: yes,
}
bats(11.0-S)[2] 

vs.:

albert(11.0-S)[10] cat /usr/local/etc/pkg/repos/custom.conf 
custom: {
        url: file:///net/freebeast/tank/poudriere/poudriere/data/packages/11amd64-ports-home
        enabled: yes,
}
albert(11.0-S)[11] 


In the process of finally(!) getting albert's "pkg upgrade" working,
I did two things differently:

* I did not run under tmux.  I can't imagine that this contributed, but
  I cite it for completeness.

* Prior to invoking "pkg upgrade", I issued
  "ls /net/freebeast/tank/poudriere/poudriere/data/packages/11amd64-ports-home"
  (and got a sane result), so the mount was satisfied prior to "pkg
  upgrade" being run.
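
The second item above (touching the repository path so the mount is
satisfied before "pkg upgrade" starts) could be sketched as a small sh
function.  This is only an illustration under stated assumptions: the
name "premount", its retry count, and its delay are hypothetical and are
not taken from the alias described earlier, and a successful "ls" is
merely taken as a sign that the directory is reachable.

```sh
#!/bin/sh
# Hypothetical helper (not from the original alias): list a directory
# until the listing succeeds, so that an automounter such as autofs has
# a chance to mount it before the caller goes on to use it.
# usage: premount DIR [TRIES] [DELAY_SECONDS]
premount() {
    dir=$1
    tries=${2:-5}
    delay=${3:-2}
    n=0
    while [ "$n" -lt "$tries" ]; do
        # A successful listing is taken to mean the directory is reachable.
        if ls "$dir" >/dev/null 2>&1; then
            return 0
        fi
        n=$((n + 1))
        # Pause only if another attempt remains.
        if [ "$n" -lt "$tries" ]; then
            sleep "$delay"
        fi
    done
    return 1
}

# Possible use in the command sequence, in place of a bare "pkg upgrade":
#   premount /net/freebeast/tank/poudriere/poudriere/data/packages/11amd64-ports-home && pkg upgrade
```

With something like this, "pkg upgrade" would only be attempted once the
listing has succeeded; whether that actually avoids the reboot is not
established by anything above.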

I note, too, that one of the times I logged in to albert, the login
seemed to "hang" for a while.  When I hit ^T (several times), it
was apparent that the process was trying to use autofs to mount my
home directory (from the FreeNAS box, "grundoon")... and that effort
timed out -- I ended up with the whine about inability to find my
home directory.  And then when I logged out & back in again,
/net/grundoon/mnt/tank/homedirs showed up 3 times in the output of
"df".

So perhaps there's something involving autofs and timing... though
that doesn't seem like much to go on.
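
For anyone trying to catch the repeated "df" entries described above, a
rough sketch of a check follows.  The function name "dup_mounts" is
hypothetical, and it assumes df-style output (one header line, mount
point in the last column, no whitespace in mount points).

```sh
#!/bin/sh
# Hypothetical helper: read df-style output on stdin and print every
# "Mounted on" value that occurs more than once, preceded by its count.
# Assumes one header line and mount points without whitespace.
dup_mounts() {
    awk 'NR > 1 { seen[$NF]++ }
         END { for (m in seen) if (seen[m] > 1) print seen[m], m }'
}

# Possible use:
#   df | dup_mounts
```

An empty result would mean no mount point is listed more than once.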

Peace,
david
-- 
David H. Wolfskill				david at catwhisker.org
Those who would murder in the name of God or prophet are blasphemous cowards.

See http://www.catwhisker.org/~david/publickey.gpg for my public key.

More information about the freebsd-stable mailing list