Relayd crashing Kernel on 10.1
matthew at freebsd.org
Wed Nov 26 09:27:53 UTC 2014
On 11/26/14 08:43, Kolontai Andrej wrote:
> Hello @all,
> I'm new to this list and hope this is the right place to ask.
> We are using FreeBSD for our Firewalls and are actually happy with it. Since recently we use relayd (installed via pkg) to do some load balancing stuff. On a freshly installed machine running 10.0-RELEASE everything worked fine.
> On Monday, I tried to upgrade to 10.1-RELEASE using freebsd-update as described in the handbook chapter 24. At first everything looked good but relayd wouldn't come up:
> "Nov 24 10:50:48 flutters relayd: fatal: cannot add rule: Operation not supported by device
> Nov 24 10:50:48 flutters relayd: lost child: pfe exited abnormally"
> When I tried to start it with /usr/local/etc/rc.d/relayd start the kernel panicked. I had to roll back the update (which worked fine). However, I was able to reproduce this behavior on a virtual machine.
> My guess is it happens here:
> #7 0xffffffff81a37954 in pfr_detach_table (kt=0x0)
> at /usr/src_10.1.0/sys/modules/pf/../../netpfil/pf/pf_table.c:2047
> The corresponding code is:
> pfr_detach_table(struct pfr_ktable *kt)
> {
>     KASSERT(kt->pfrkt_refcnt[PFR_REFCNT_RULE] > 0, ("%s: refcount %d\n",
>         __func__, kt->pfrkt_refcnt[PFR_REFCNT_RULE]));
>     if (!--kt->pfrkt_refcnt[PFR_REFCNT_RULE])
>         pfr_setflags_ktable(kt, kt->pfrkt_flags&~PFR_TFLAG_REFERENCED);
> }
> From what I know about C programming: kt is not supposed to be 0x0.
> My guess was that some data structure has changed between 10.0 and 10.1 kernels. So a recompile of relayd should fix that. It did. I compiled it from the ports and it worked.
> Now my question is: did I do something wrong? Maybe compiling is the preferred method over using binaries? I'm still trying to figure out what the "stay-out-of-trouble"-mode on FreeBSD is (like yum -y update). But I'd rather use the binaries.
> Best regards
What you have discovered here is clearly a bug. It seems that relayd is
poking around in kernel internals in a way that makes it particularly
fragile when the kernel is updated. That, or else somebody slipped up
and 10.1-RELEASE has not preserved KBI compatibility with 10.0-RELEASE.
Either way, it's a bug that needs to be addressed. Could you report
this via Bugzilla please?
I'd start by classifying this as a kernel bug, since userland processes
really aren't meant to be able to cause panics.
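To make the failure mode concrete, here is a minimal userland sketch of the
kind of check that keeps a bad argument from turning into a panic. It is
only an illustration: struct ktable, the constants and
detach_table_checked() are hypothetical stand-ins I made up for this mail,
not the pf code and not a proposed patch.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the pf table structures; not the real kernel types. */
#define REFCNT_RULE      0
#define REFCNT_MAX       2
#define TFLAG_REFERENCED 0x1u

struct ktable {
	int      refcnt[REFCNT_MAX];
	unsigned flags;
};

/*
 * Drop one rule reference from a table.
 * Returns 0 on success, -1 if there is nothing to detach
 * (NULL table or no outstanding rule reference).
 */
int
detach_table_checked(struct ktable *kt)
{
	/* Reject the bad input instead of dereferencing it. */
	if (kt == NULL || kt->refcnt[REFCNT_RULE] <= 0)
		return (-1);

	/* Last reference gone: clear the "referenced" flag. */
	if (--kt->refcnt[REFCNT_RULE] == 0)
		kt->flags &= ~TFLAG_REFERENCED;
	return (0);
}
```

With this sketch, detach_table_checked(NULL) returns -1 rather than
faulting, which is the behaviour one would want when an earlier step has
failed and left no table to detach.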
This does mean that relayd from the FreeBSD pkg repositories is
essentially only usable on 10.0 until the end of January when 10.0 goes
out of support -- the package builder uses the earliest supported
version from each major branch for building, as there is meant to be
forwards ABI and KBI compatibility within each major branch. There are
a few other programs like that, all of which tend to have a far too
intimate knowledge of various kernel internals. lsof is a case in
point. As you say, a work-around is to build the package locally.
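
As a rough illustration of why a binary built against one release's headers
can misbehave on another, consider what happens when a structure shared with
the kernel gains a field. The two structs below are hypothetical (they are
not the real pf structures, and I am not claiming this is the exact change
between 10.0 and 10.1); they only show how an inserted field shifts
everything after it.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical layout the tool was compiled against (older headers). */
struct rule_v1 {
	uint32_t nr;
	uint32_t flags;
	char     table[32];
};

/* Hypothetical layout in a newer kernel: one field inserted mid-struct. */
struct rule_v2 {
	uint32_t nr;
	uint32_t extra;		/* new field shifts every later member */
	uint32_t flags;
	char     table[32];
};

size_t
table_offset_v1(void)
{
	return (offsetof(struct rule_v1, table));
}

size_t
table_offset_v2(void)
{
	return (offsetof(struct rule_v2, table));
}

/*
 * Interpret a buffer filled in using the v1 layout through the v2 layout,
 * as a newer kernel would, and copy out the table name it ends up seeing.
 */
void
table_seen_by_v2(const struct rule_v1 *old, char out[32])
{
	unsigned char buf[sizeof(struct rule_v2)] = { 0 };

	memcpy(buf, old, sizeof(*old));
	memcpy(out, buf + offsetof(struct rule_v2, table), 32);
	out[31] = '\0';
}
```

In this sketch a table name of "webservers" stored through the v1 layout is
read back as "ervers" through the v2 layout, so a lookup by name would find
nothing. Rebuilding the program against the new headers makes both sides
agree on the layout again, which is consistent with the port build working.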