RELENG_5 PAE panic

dpk dpk at dpk.net
Sat Aug 6 18:04:26 GMT 2005


On Thu, 4 Aug 2005, Frank McConnell wrote:

> Further debugging led me to the conclusion that the problem is in
> pmap_protect(), in src/sys/i386/i386/pmap.c; and has to do with a
> 32-bit-truncated pt_entry_t being passed to PHYS_TO_VM_PAGE().
> (pt_entry_t is 64 bits if the kernel is built with PAE.)  This caused
> a page fault in vm_page_flag_set() which left the thread deadlocked
> while holding vm_page_queue_mtx and in turn led to a panic when
> another thread tried to acquire vm_page_queue_mtx.
>
> Then I checked the cvs logs, and saw rev 1.524, which looks like what
> I was thinking about as a fix, so I'm giving it a spin on top of
> earlier-this-week's RELENG_5.  Thus far I'll say that with that change
> my usual way of provoking the problem hasn't, yet.
>
> I'm going to try to get this PC put back into co-lo where it can
> get some production-like testing this weekend.  It'd be nice to get
> this fix MFC'd to RELENG_5 too.
>
> -Frank McConnell

FWIW, on a server of ours that was panicking quite frequently, applying
the above-mentioned modification seems to have resolved the issue. The
server has been repeatedly building kernels while another process runs
it out of RAM. Before, this would cause it to panic with one of 2
(maybe 3) messages in well under an hour. Now it has been going for 24
hours straight without even a stray bus error.

This appears to resolve i386/84563, and I believe it should resolve
related bugs kern/82846 (identical panic) and i386/84306.

The specific fix Frank has mentioned is this:

http://www.freebsd.org/cgi/cvsweb.cgi/src/sys/i386/i386/pmap.c.diff?r1=1.523&r2=1.524&f=h

committed by jhb and submitted by Greg Taleck.

Although this pmap.c change was committed against a later revision than
the one distributed with FreeBSD 5.4, the modifications still apply.
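
For anyone trying to picture the truncation Frank describes, here is a
minimal userland sketch. It is not the kernel code and not the diff from
rev 1.524. The type name, the mask value and both helper functions are
hypothetical stand-ins chosen for illustration. It only shows how dropping
the upper 32 bits of a 64-bit page table entry yields a different physical
frame address once the page lives above 4 GB.

```c
#include <stdint.h>

/*
 * Hypothetical stand-ins for illustration only; these are NOT the
 * definitions from src/sys/i386.
 */
typedef uint64_t sketch_pte_t;                  /* a PAE entry is 64 bits wide */
#define SKETCH_PG_FRAME 0x000ffffffffff000ULL   /* assumed frame-address mask */

/* Physical frame address taken from the full 64-bit entry. */
static uint64_t
sketch_frame_full(sketch_pte_t pte)
{
	return (pte & SKETCH_PG_FRAME);
}

/*
 * Physical frame address after the entry has been squeezed through a
 * 32-bit variable first: the upper 32 bits are lost before masking.
 */
static uint64_t
sketch_frame_truncated(sketch_pte_t pte)
{
	uint32_t low = (uint32_t)pte;   /* the truncation */

	return ((uint64_t)low & SKETCH_PG_FRAME);
}
```

With an entry of 0x0000000123456067 (a frame just above 4 GB plus some
flag bits), sketch_frame_full() gives 0x123456000 while
sketch_frame_truncated() gives 0x23456000, i.e. a different page. For
entries whose frame is below 4 GB the two agree, which would fit with the
problem only showing up on PAE machines under memory pressure.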


More information about the freebsd-stable mailing list