svn commit: r296428 - head/sys/boot/common

Warner Losh imp at bsdimp.com
Mon Mar 7 17:05:00 UTC 2016


On Mon, Mar 7, 2016 at 9:51 AM, Konstantin Belousov <kostikbel at gmail.com> wrote:

> On Mon, Mar 07, 2016 at 09:28:13AM -0700, Warner Losh wrote:
> > On Mon, Mar 7, 2016 at 8:52 AM, Konstantin Belousov <kostikbel at gmail.com> wrote:
> >
> > > On Mon, Mar 07, 2016 at 08:39:47AM -0700, Ian Lepore wrote:
> > > > Is there no way to prevent the panic other than making the unwind data
> > > > be present?  Why can't the kernel be fixed to cope with the missing
> > > > data in some gentler way during a transition period?  Perhaps
> > > > valid-but-fake data could be generated if necessary?  Being unable to
> > > > get a stack traceback through a loaded module would be a small price
> > > > to pay for trouble-free upgrades.
> > >
> > > It is practically impossible to recover from a partially loaded
> > > object-file module.  The loader workaround currently affects only HEAD,
> > > and since the MFC was done, 10.3 should be safe.  We have always required
> > > the latest stable for the jump to the next major branch.
> > >
> > > What could be done is demoting the panics (there are several, besides
> > > the one which was triggered) to a message and refusing to load the
> > > affected module. OTOH, if the reaction were a message and not a panic,
> > > it would definitely go ignored for quite some time.
> > >
> >
> > The new loader could also pass in some version or cookie in the metadata
> > that says it is the new one. The kernel could examine this and issue a
> > warning, on amd64 / i386, that module linking may be incomplete and you'll
> > need to upgrade your /boot/loader if you encounter a crash.
> This is absolutely useless kernel bloat.  The kernel should provide an
> execution environment for user programs, and not lecture users about proper
> system configuration.


On the other hand, the kernel and the boot loader have a protocol they
both implement. When one side implements it wrong, the other side should
detect it if it is easy to do so.
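
To make the cookie idea concrete, here is a minimal sketch in C. It is only an
illustration under stated assumptions: the tag HYP_MD_LOADER_ABI, the
struct hyp_md_entry layout, and the version threshold are all hypothetical and
do not correspond to any existing loader metadata. The point is just that an
old loader never sets the tag, so its absence is what the kernel would warn
about.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical metadata tag and version; not real loader metadata values. */
#define HYP_MD_LOADER_ABI   0x7001u
#define HYP_LOADER_ABI_MIN  2u  /* assumed: first loader that loads all needed sections */

/* Hypothetical, simplified metadata record passed from loader to kernel. */
struct hyp_md_entry {
    uint32_t tag;
    uint32_t value;
};

/* Return the ABI version the loader advertised, or 0 if it never set the tag. */
static uint32_t
hyp_loader_abi(const struct hyp_md_entry *md, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (md[i].tag == HYP_MD_LOADER_ABI)
            return md[i].value;
    }
    return 0;
}

/*
 * Return 1 if the loader is new enough, otherwise print a warning and
 * return 0.  A kernel would use its own printf; stderr stands in here.
 */
static int
hyp_check_loader(const struct hyp_md_entry *md, size_t n)
{
    uint32_t abi = hyp_loader_abi(md, n);

    if (abi >= HYP_LOADER_ABI_MIN)
        return 1;
    fprintf(stderr, "WARNING: boot loader ABI %u is older than %u; "
        "module linking may be incomplete, upgrade /boot/loader\n",
        (unsigned)abi, (unsigned)HYP_LOADER_ABI_MIN);
    return 0;
}
```

The check is a single pass over metadata the kernel already walks at boot, so
the cost is small; whether it is worth carrying at all is the point under
debate above.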


> > Could the kernel detect that a .eh_frame module was loaded and ignore it
> > in "safe mode"? Perhaps combined with the new boot-loader cookie, this
> > would be an automatic way to not mysteriously crash.
> Why should the kernel ignore a loaded .eh_frame? I do not see any use for
> the other part of the suggestion at all. To clarify, the kernel panicked
> because some (required but currently not utilized) part of the binary module
> was not loaded.


Not ignore eh_frame, just modules that have eh_frame and potentially bad
relocations. Or, you could pre-scan the relocations and only fail when the
module actually has them.  But if you make the linker panics into warnings
instead, then that would likely also be OK.
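
A rough sketch of the pre-scan idea, in C, under loud assumptions: the
struct hyp_section bookkeeping and hyp_prescan_module() are hypothetical
stand-ins, not the kernel linker's actual data structures or functions. It
walks the sections once before any relocation is applied and refuses the
module, with a message instead of a panic, only when a section that has
relocations against it was never loaded.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical, simplified view of one section of a preloaded module. */
struct hyp_section {
    const char *name;   /* section name, e.g. ".eh_frame" */
    int loaded;         /* nonzero if the loader placed it in memory */
    size_t nrelocs;     /* relocations that need to patch this section */
};

/*
 * Pre-scan before relocating anything.  Return 0 if the module can be
 * linked, or -1 (after printing a warning) if any section with pending
 * relocations was not loaded.  Nothing has been modified at this point,
 * so refusing the module leaves no partially relocated state behind.
 */
static int
hyp_prescan_module(const char *modname, const struct hyp_section *sec,
    size_t nsec)
{
    int bad = 0;

    for (size_t i = 0; i < nsec; i++) {
        if (sec[i].nrelocs > 0 && !sec[i].loaded) {
            fprintf(stderr, "%s: section %s has %zu relocation(s) but "
                "was not loaded; refusing module (old /boot/loader?)\n",
                modname, sec[i].name, sec[i].nrelocs);
            bad = 1;
        }
    }
    return bad ? -1 : 0;
}
```

A module whose unloaded sections carry no relocations passes the scan, which
is the "only fail when the module actually has them" case.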

> > Alternatively, is there a switch to clang 3.8 that says "Don't generate
> > the new relocation, use the old one instead" which would also be safe and
> > allow a less-bumpy transition?
> >
> > Finally, would the partially loaded module stop at the first bad
> > relocation, or would it do them all and just skip the bad ones? Is the
> > data from this relocation used all the time, or just when we're doing a
> > stack unwind for an exception or a backtrace?
> Practically, we could ignore those relocations and still load the module,
> but this is only because we know what the scope of the relocations is.
> In an arbitrary situation where the same missing relocation target is
> detected, the loader cannot know whether it is safe or not.
>

True. However, this is a well-known case.


> The problem is fixed and does not deserve nuking all the computers in
> the world, which is what some of the other suggestions for handling it
> amount to.  Most of the suggestions go to an extreme that is not
> deserved.
>
> What could be useful, as I noted already, is to demote the panics in the
> kernel linker to warnings.  I intended to work on this.


That would fit the bill for what I care about here. Normally, we load
the new kernel with the new boot loader in my company's upgrade process.
There are times, however, when we'll wind up loading the new kernel with
the old boot loader (but more commonly vice-versa). Having some indication
of the error would be quite useful in this scenario so we know we need to
do something else.

Warner


More information about the svn-src-head mailing list