[Backtrace] 4.9 and 5.1-RELEASE occasionly panic on RAM > 4GB
without PAE (long)
=?gb2312?B?WGluIExJL8Du9s4=?=
delphij at frontfree.net
Wed Dec 17 16:40:54 PST 2003
Hello,
One of my friend is operating servers on a big ICP in China. They found that
if they run FreeBSD 4.9 or FreeBSD 5.1-RELEASE on their server, which has
more than 4GB RAM installed, then they will occasionaly have the server
panic'ed, or even rebooted without any reason.
The server hardware is: P4-Xeon-2.4G*2, 4G RAM(DDR266), 146G*2 on asr(4)
RAID-1. The kernel is not quite different from GENERIC, except that we have
AUTO_EOI1 enabled. When PAE is enabled, then the kernel will not recognize
the asr(4) device, and thus would refuse to boot from it, so we have it
disabled.
This problem afffects 4.9-RELEASE and 5.1-RELEASE. As the ICP is a
conservative user, they will not run -STABLE, nor -CURRENT. I have turned
the debugging symbols on their server and grabbed the following backtraces.
Also, I have found how to reproduce this problem on a machine running with >
4GB memory without a PAE configuration, which, in my opinion, should not
cause a kernel panic because it's just a resource hungry program and all it
will do should, at most, hang the system, or even being killed directly by
the operating system.
-----------------------------------------------------------------------
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>
int main()
{
char* s;
int p;
int d;
d = 4096;
for(;d>0;d--)
{
p=fork();
if(p==0)
{
s=malloc(1024*1024);
if(s==NULL)
printf("Pid %d malloc(1024*1024) failed\n",getpid());
else
memset(s,0,1024*1024);
while(1);
}
else if(p==-1)
printf("Fork() %d failed\n",d);
else
continue;
}
return 0;
}
-----------------------------------------------------------------------
Unfortuantelly, this will cause panic after a short while. We got these
backtrace:
gdb /usr/src/sys/compile/DELPHIJ/kernel.debug -k vmcore.2
GNU gdb 4.18 (FreeBSD)
Copyright 1998 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain
conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "i386-unknown-freebsd"...Deprecated bfd_read
called at
/usr/src/gnu/usr.bin/binutils/gdb/../../../../contrib/gdb/gdb/dbxread.c line
2627 in elfstab_build_psymtabs
Deprecated bfd_read called at
/usr/src/gnu/usr.bin/binutils/gdb/../../../../contrib/gdb/gdb/dbxread.c line
933 in fill_symbuf
SMP 4 cpus
IdlePTD at phsyical address 0x003af000
initial pcb at physical address 0x00313fe0
panicstr: pmap_new_proc: u_map allocation failed
panic messages:
---
panic: pmap_new_proc: u_map allocation failed
mp_lock = 02000001; cpuid = 2; lapic.id = 01000000
boot() called on cpu#2
syncing disks... 3
done
Uptime: 2h44m38s
dumping to dev #da/0x20001, offset 263168
dump 3967 3966 3965..................
.....................................
.....................................
...............10 9 8 7 6 5 4 3 2 1 0
---
#0 dumpsys () at ../../kern/kern_shutdown.c:487
487 if (dumping++) {
(kgdb) bt
#0 dumpsys () at ../../kern/kern_shutdown.c:487
#1 0xc0185beb in boot (howto=256) at ../../kern/kern_shutdown.c:316
#2 0xc018605d in panic (fmt=0xc02e15e0 "pmap_new_proc: u_map allocation
failed") at ../../kern/kern_shutdown.c:595
#3 0xc028d575 in pmap_new_proc (p=0xffb5d1e0) at ../../i386/i386/pmap.c:919
#4 0xc0239944 in vm_fork (p1=0xfbfe2ba0, p2=0xffb5d1e0, flags=20) at
../../vm/vm_glue.c:240
#5 0xc017fa85 in fork1 (p1=0xfbfe2ba0, flags=20, procp=0xfc0fef24) at ../..
/kern/kern_fork.c:517
#6 0xc017f1fa in fork (p=0xfbfe2ba0, uap=0xfc0fef80) at
../../kern/kern_fork.c:100
#7 0xc02916e1 in syscall2 (frame={tf_fs = 47, tf_es = 47, tf_ds = 47,
tf_edi = -1077937296, tf_esi = -1077937304, tf_ebp = -1077937392,
tf_isp = -66064428, tf_ebx = 1, tf_edx = 0, tf_ecx = -1077937066,
tf_eax = 2, tf_trapno = 12, tf_err = 2, tf_eip = 1208488192,
tf_cs = 31, tf_eflags = 514, tf_esp = -1077937420, tf_ss = 47}) at
../../i386/i386/trap.c:1175
#8 0xc027e09b in Xint0x80_syscall ()
#9 0x8048506 in ?? ()
(kgdb) up
#1 0xc0185beb in boot (howto=256) at ../../kern/kern_shutdown.c:316
316 dumpsys();
(kgdb)
#2 0xc018605d in panic (fmt=0xc02e15e0 "pmap_new_proc: u_map allocation
failed") at ../../kern/kern_shutdown.c:595
595 boot(bootopt);
(kgdb)
#3 0xc028d575 in pmap_new_proc (p=0xffb5d1e0) at ../../i386/i386/pmap.c:919
919 panic("pmap_new_proc: u_map allocation
failed");
(kgdb) l
914
915 /* get a kernel virtual address for the UPAGES for this proc
*/
916 if ((up = (vm_offset_t) p->p_addr) == 0) {
917 up = kmem_alloc_nofault(kernel_map, UPAGES *
PAGE_SIZE);
918 if (up == 0)
919 panic("pmap_new_proc: u_map allocation
failed");
920 p->p_addr = (struct user *) up;
921 }
922
923 for(i = 0; i < UPAGES; i++) {
(kgdb) p up
$1 = 0
(kgdb) p {struct proc *}p
$2 = (struct proc *) 0xffb5d040
(kgdb) p {struct proc}p
$3 = {p_procq = {tqe_next = 0xffb5d040, tqe_prev = 0x0}, p_list = {le_next =
0xffb5d380, le_prev = 0xc032ac38}, p_cred = 0xd2898620,
p_fd = 0xd289e900, p_stats = 0x0, p_limit = 0xd275e700, p_upages_obj =
0xffbf9e60, p_procsig = 0xd289dac0, p_flag = 6, p_stat = 1 '\001',
p_pad1 = "\000\000", p_pid = 5832, p_hash = {le_next = 0xfd8641a0, le_prev
= 0xd2198b20}, p_pglist = {le_next = 0xffb5d380,
le_prev = 0xfbfe2bdc}, p_pptr = 0xfbfe2ba0, p_sibling = {le_next =
0xffb5d380, le_prev = 0xfbfe2bf0}, p_children = {lh_first = 0x0},
p_ithandle = {callout = 0x0}, p_oppid = 0, p_dupfd = 0, p_vmspace =
0xffb239c0, p_estcpu = 7, p_cpticks = 0, p_pctcpu = 0, p_wchan = 0x0,
p_wmesg = 0x0, p_swtime = 0, p_slptime = 0, p_realtimer = {it_interval =
{tv_sec = 0, tv_usec = 0}, it_value = {tv_sec = 0, tv_usec = 0}},
p_runtime = 0, p_uu = 0, p_su = 0, p_iu = 0, p_uticks = 0, p_sticks = 0,
p_iticks = 0, p_traceflag = 0, p_tracep = 0x0, p_siglist = {
__bits = {0, 0, 0, 0}}, p_textvp = 0xfc0310c0, p_lock = 0 '\000',
p_oncpu = 0 '\000', p_lastcpu = 0 '\000', p_rqindex = 0 '\000',
p_locks = 0, p_simple_locks = 0, p_stops = 0, p_stype = 0, p_step = 0
'\000', p_pfsflags = 0 '\000', p_pad3 = "\000", p_retval = {0, 0},
p_sigiolst = {slh_first = 0x0}, p_sigparent = 20, p_oldsigmask = {__bits =
{0, 0, 0, 0}}, p_sig = 0, p_code = 0, p_klist = {
slh_first = 0x0}, p_sigmask = {__bits = {0, 0, 0, 0}}, p_sigstk = {ss_sp
= 0x0, ss_size = 0, ss_flags = 4}, p_priority = 50 '2',
p_usrpri = 50 '2', p_nice = 0 '\000', p_comm = "fork_and_malloc_", p_pgrp
= 0xd2740f20, p_sysent = 0xc02f13e0, p_rtprio = {type = 1,
prio = 0}, p_prison = 0x0, p_args = 0xd2770b40, p_addr = 0x0, p_md =
{md_regs = 0x0}, p_xstat = 0, p_acflag = 0, p_ru = 0x0,
p_nthreads = 0, p_aioinfo = 0x0, p_wakeup = 0, p_peers = 0x0, p_leader =
0xffb5d1e0, p_asleep = {as_priority = 0, as_timo = 0},
p_emuldata = 0x0, p_fdtol = 0x0}
(kgdb)
-----------------------------------------------------------------------
On another machine, we have tested the following CGI script. When having
heavy load, it will cause panic, too:
-----------------------------------------------------------------------
#!/bin/sh
printf "content-type:text/html\n\n"
printf "<table bordercolor=a0a0a0 border=1 cellpadding=1 cellspacing=1>\n"
printf "<tr>\n"
printf "<td colspan=8 align=center>`/sbin/ifconfig|grep broadcast|head -n
1|perl -pi -e 's/ +/\t/g'|cut -f3|perl -pi -e 's/\n//g'`</td>\n"
printf "</tr>\n"
printf "<tr>\n"
printf "<td align=center width=100>Load</td>\n"
printf "<td align=center width=100>Connections</td>\n"
printf "<td align=center width=100>Processes</td>\n"
printf "<td align=center width=100>Open Files</td>\n"
printf "<td align=center width=300>Processor(User%% System%% Free%%)</td>\n"
printf "<td align=center width=100>Active Memory</td>\n"
printf "<td align=center width=100>Swap</td>\n"
printf "</tr>\n"
printf "<tr>\n"
printf "<td align=right>`w w|head -n 1|perl -pi -e 's/^.*, //g'|perl -pi -e
's/\n/ /g'`</td>\n"
printf "<td align=right>`netstat -an|wc|cut -c1-8|perl -pi -e 's/ //g'|perl
-pi -e 's/\n/ /g'`</td>\n"
printf "<td align=right>`ps ax|wc|cut -c1-8|perl -pi -e 's/ //g'|perl -pi -e
's/\n/ /g'`</td>\n"
printf "<td align=right>`fstat|wc|cut -c1-8|perl -pi -e 's/ //g'|perl -pi -e
's/\n/ /g'`</td>\n"
printf "<td align=right>`vmstat|tail -n 1|perl -pi -e 's/ +/\t/g'|cut
-f18-|perl -pi -e 's/\t/%% /g'|perl -pi -e 's/\n/%%/g'`</td>\n"
printf "<td align=right>`top -n|head -n 4|tail -n 1|perl -pi -e 's/
+/\t/g'|cut -f2|perl -pi -e 's/\n/ /g'`</td>\n"
printf "<td align=right>`top -n|head -n 5|tail -n 1|perl -pi -e 's/
+/\t/g'|cut -f4|perl -pi -e 's/\n/ /g'`</td>\n"
printf "</tr>\n"
printf "</table>\n"
-----------------------------------------------------------------------
To simulate the situation, we have used ab:
/usr/local/apache/bin/ab -n 16384000 -c 1024
And we got this:
-----------------------------------------------------------------------
gdb /usr/src/sys/compile/DELPHIJ/kernel.debug -k vmcore.1
GNU gdb 4.18 (FreeBSD)
Copyright 1998 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain
conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "i386-unknown-freebsd"...Deprecated bfd_read
called at
/usr/src/gnu/usr.bin/binutils/gdb/../../../../contrib/gdb/gdb/dbxread.c line
2627 in elfstab_build_psymtabs
Deprecated bfd_read called at
/usr/src/gnu/usr.bin/binutils/gdb/../../../../contrib/gdb/gdb/dbxread.c line
933 in fill_symbuf
SMP 4 cpus
IdlePTD at phsyical address 0x003af000
initial pcb at physical address 0x00313fe0
panicstr: pmap_new_proc: u_map allocation failed
panic messages:
---
panic: pmap_new_proc: u_map allocation failed
mp_lock = 01000001; cpuid = 1; lapic.id = 06000000
boot() called on cpu#1
syncing disks... 1
done
Uptime: 1d0h36m40s
dumping to dev #da/0x20001, offset 263168
dump 3967 3966 3965..................
.....................................
.....................................
...............10 9 8 7 6 5 4 3 2 1 0
---
#0 dumpsys () at ../../kern/kern_shutdown.c:487
487 if (dumping++) {
(kgdb) bt
#0 dumpsys () at ../../kern/kern_shutdown.c:487
#1 0xc0185beb in boot (howto=256) at ../../kern/kern_shutdown.c:316
#2 0xc018605d in panic (fmt=0xc02e15e0 "pmap_new_proc: u_map allocation
failed") at ../../kern/kern_shutdown.c:595
#3 0xc028d575 in pmap_new_proc (p=0xffa7f680) at ../../i386/i386/pmap.c:919
#4 0xc0239944 in vm_fork (p1=0xfed67780, p2=0xffa7f680, flags=20) at
../../vm/vm_glue.c:240
#5 0xc017fa85 in fork1 (p1=0xfed67780, flags=20, procp=0xfed6ef24) at ../..
/kern/kern_fork.c:517
#6 0xc017f1fa in fork (p=0xfed67780, uap=0xfed6ef80) at
../../kern/kern_fork.c:100
#7 0xc02916e1 in syscall2 (frame={tf_fs = 47, tf_es = 47, tf_ds = 47,
tf_edi = 134960572, tf_esi = 135016448, tf_ebp = -1077938448,
tf_isp = -19468332, tf_ebx = 134959244, tf_edx = 4, tf_ecx =
-1077938376, tf_eax = 2, tf_trapno = 22, tf_err = 2, tf_eip = 134672192,
tf_cs = 31, tf_eflags = 582, tf_esp = -1077938476, tf_ss = 47}) at
../../i386/i386/trap.c:1175
#8 0xc027e09b in Xint0x80_syscall ()
#9 0x804b34f in ?? ()
#10 0x804d915 in ?? ()
#11 0x804d403 in ?? ()
#12 0x804d17e in ?? ()
#13 0x804b4fb in ?? ()
#14 0x804ac40 in ?? ()
#15 0x80526be in ?? ()
#16 0x805262a in ?? ()
#17 0x804813e in ?? ()
(kgdb) up
#1 0xc0185beb in boot (howto=256) at ../../kern/kern_shutdown.c:316
316 dumpsys();
(kgdb)
#2 0xc018605d in panic (fmt=0xc02e15e0 "pmap_new_proc: u_map allocation
failed") at ../../kern/kern_shutdown.c:595
595 boot(bootopt);
(kgdb)
#3 0xc028d575 in pmap_new_proc (p=0xffa7f680) at ../../i386/i386/pmap.c:919
919 panic("pmap_new_proc: u_map allocation
failed");
(kgdb) l
914
915 /* get a kernel virtual address for the UPAGES for this proc
*/
916 if ((up = (vm_offset_t) p->p_addr) == 0) {
917 up = kmem_alloc_nofault(kernel_map, UPAGES *
PAGE_SIZE);
918 if (up == 0)
919 panic("pmap_new_proc: u_map allocation
failed");
920 p->p_addr = (struct user *) up;
921 }
922
923 for(i = 0; i < UPAGES; i++) {
(kgdb) p p
$1 = (struct proc *) 0xffa7f680
(kgdb) p {struct proc *}p
$2 = (struct proc *) 0xffa7f4e0
(kgdb) p {struct proc}p
$3 = {p_procq = {tqe_next = 0xffa7f4e0, tqe_prev = 0x0}, p_list = {le_next =
0xffa7f820, le_prev = 0xc032ac38}, p_cred = 0xd3cb9f20,
p_fd = 0xd3c95d00, p_stats = 0x0, p_limit = 0xd3c9e100, p_upages_obj =
0xff64de04, p_procsig = 0xd3ca7400, p_flag = 4, p_stat = 1 '\001',
p_pad1 = "\000\000", p_pid = 26450, p_hash = {le_next = 0xfd812dc0,
le_prev = 0xd2198d48}, p_pglist = {le_next = 0xff18c3c0,
le_prev = 0xfed677bc}, p_pptr = 0xfed67780, p_sibling = {le_next = 0x0,
le_prev = 0xfed677d0}, p_children = {lh_first = 0x0},
p_ithandle = {callout = 0x0}, p_oppid = 0, p_dupfd = 0, p_vmspace =
0xffae2500, p_estcpu = 184, p_cpticks = 0, p_pctcpu = 0, p_wchan = 0x0,
p_wmesg = 0x0, p_swtime = 0, p_slptime = 0, p_realtimer = {it_interval =
{tv_sec = 0, tv_usec = 0}, it_value = {tv_sec = 0, tv_usec = 0}},
p_runtime = 0, p_uu = 0, p_su = 0, p_iu = 0, p_uticks = 0, p_sticks = 0,
p_iticks = 0, p_traceflag = 0, p_tracep = 0x0, p_siglist = {
__bits = {0, 0, 0, 0}}, p_textvp = 0xf9b5c740, p_lock = 0 '\000',
p_oncpu = 0 '\000', p_lastcpu = 0 '\000', p_rqindex = 0 '\000',
p_locks = 0, p_simple_locks = 0, p_stops = 0, p_stype = 0, p_step = 0
'\000', p_pfsflags = 0 '\000', p_pad3 = "\000", p_retval = {0, 0},
p_sigiolst = {slh_first = 0x0}, p_sigparent = 20, p_oldsigmask = {__bits =
{0, 0, 0, 0}}, p_sig = 0, p_code = 0, p_klist = {
slh_first = 0x0}, p_sigmask = {__bits = {0, 0, 0, 0}}, p_sigstk = {ss_sp
= 0x0, ss_size = 0, ss_flags = 4}, p_priority = 52 '4',
p_usrpri = 52 '4', p_nice = 0 '\000', p_comm =
"sh\000pd\000r\000\000\000\000\000\000\000\000\000", p_pgrp = 0xd21ce460,
p_sysent = 0xc02f13e0, p_rtprio = {type = 1, prio = 0}, p_prison = 0x0,
p_args = 0xd3c97240, p_addr = 0x0, p_md = {md_regs = 0x0},
p_xstat = 0, p_acflag = 0, p_ru = 0x0, p_nthreads = 0, p_aioinfo = 0x0,
p_wakeup = 0, p_peers = 0x0, p_leader = 0xffa7f680, p_asleep = {
as_priority = 0, as_timo = 0}, p_emuldata = 0x0, p_fdtol = 0x0}
(kgdb) p kernel_map
$4 = 0xc031a60c
-----------------------------------------------------------------------
I have took a sight on the pmap and vm code, and did not figured out why the
panic was caused. So I post it here, and I will have to say that -CURRENT
may have the same problem, and it may affect the upcoming 5.2-RELEASE. For
additional information, please fell free to contact me, and I can act as
liaison.
Any suggestions? Thank you in advance!
Xin LI,
FreeBSD (Simplified) Chinese Project
More information about the freebsd-stable
mailing list