RELENG_5 PAE panic

Frank McConnell fmc at reanimators.org
Sat Aug 6 20:51:59 GMT 2005


dpk wrote:
> On Thu, 4 Aug 2005, Frank McConnell wrote:
>> Further debugging led me to the conclusion that the problem is in
>> pmap_protect(), in src/sys/i386/i386/pmap.c; and has to do with a
[...]
>> Then I checked the cvs logs, and saw rev 1.524, which looks like what
>> I was thinking about as a fix, so I'm giving it a spin on top of

> FWIW, on a server we have which was panicking quite frequently, performing
> the above mentioned modification seems to have resolved the issue. The
> server has been repeatedly building kernels while having another process
> run the server out of RAM. Before, this would cause it to panic with one
> of 2 (maybe 3) messages in well under an hour. Now it's been going for 24
> hours straight without even a stray bus error.

Great!  I'd looked at the stack trace you mentioned in your initial
report and really was not sure that you were seeing the same problem.

I have two ways to provoke the failure: starting named (a modified
BIND 8 which loads blackhole lists for a total memory footprint of
somewhere in excess of 900MB), which has provoked the panic in all
but one attempt; and "make buildkernel" which will usually provoke the
panic some ways in.

So I applied this fix to one system that was running RELENG_5 from
early this week and it was able to do both, running "make buildkernel"
repeatedly (for kicks, alternating between building a kernel based on
GENERIC and building one based on PAE) for a couple hours before the
sysadmin took it back to the co-lo.
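
For anyone who wants to repeat the buildkernel part of that test, a
rough sketch of such an alternating loop follows.  The script itself
and the SRCDIR, ROUNDS and DRY_RUN variables are illustrative
assumptions, not what was actually run on that system; only the
GENERIC and PAE kernel configurations come from the description above.

```shell
#!/bin/sh
# Sketch: alternate "make buildkernel" between GENERIC and PAE.
# SRCDIR, ROUNDS and DRY_RUN are hypothetical knobs for this example.
SRCDIR=${SRCDIR:-/usr/src}
ROUNDS=${ROUNDS:-4}
DRY_RUN=${DRY_RUN:-1}   # 1 = only print the commands, 0 = really build

# Even-numbered rounds build GENERIC, odd-numbered rounds build PAE.
kernconf_for() {
    if [ $(( $1 % 2 )) -eq 0 ]; then
        echo GENERIC
    else
        echo PAE
    fi
}

# Run (or, in dry-run mode, just print) one kernel build.
run_build() {
    if [ "$DRY_RUN" -eq 1 ]; then
        echo "cd $SRCDIR && make buildkernel KERNCONF=$1"
    else
        (cd "$SRCDIR" && make buildkernel KERNCONF="$1")
    fi
}

i=0
while [ "$i" -lt "$ROUNDS" ]; do
    run_build "$(kernconf_for "$i")" || exit 1
    i=$((i + 1))
done
```

With DRY_RUN left at 1 this only prints the commands it would run;
set DRY_RUN=0 and raise ROUNDS to keep a machine busy for a few hours.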

I have also applied it to another system that was running 5.4-RELEASE
(and missing 2GB of its RAM without PAE).  They're both running named
without error now.

> This appears to resolve i386/84563, and I believe it should resolve
> related bugs kern/82846 (identical panic) and i386/84306.
>
> The specific fix Frank has mentioned is this:
>
> http://www.freebsd.org/cgi/cvsweb.cgi/src/sys/i386/i386/pmap.c.diff?r1=1.523&r2=1.524&f=h
>
> committed by jhb and submitted by Greg Taleck.
>
> Even though this pmap.c change was applied to a later version than
> distributed with FreeBSD 5.4, the modifications still apply.

Correct.  I applied exactly that two-line change to pmap_protect() by
hand.

I'd like to see this fixed in RELENG_5, and if possible and
appropriate in RELENG_5_4, because the unpatched code will break on
i386 systems with RAM above 4GB that need PAE to see all that RAM.
What do I need to do to get this to happen: send a PR, and/or write to
re@ and/or security-officer@?  I may be able to set some computers up
for testing if that would be helpful, but will have to check with the
sysadmin to see what his deployment schedule looks like.

-Frank McConnell


More information about the freebsd-stable mailing list