kernel deadlock
Don Bowman
don at sandvine.com
Tue Jul 29 18:05:01 PDT 2003
From: Don Bowman [mailto:don at sandvine.com]
>
> From: Robert Watson [mailto:rwatson at freebsd.org]
> > On Tue, 29 Jul 2003, Dave Dolson wrote:
> >
> > > To follow up, I've discovered that the system has exhausted its
> > > "FFS node" malloc type.
> ...
> >
> > Some problems with this have turned up in -CURRENT on large-memory
> > machines where some of the scaling factors have been off. In
>
> We currently have kern.maxvnodes=70354 set (automatically scaled).
> This is a 1GB box.
>
> I will try re-running the test with less.
>
> when it hits kern.maxvnodes, what will it do?
After applying the fixes from RELENG_4 for kern/52425,
I can still easily reproduce this hang even when memory is not low.
Further debugging shows that the vnlru process is waiting on
vlrup, at the tsleep() shown below; i.e. vnlru_nowhere is being
incremented every 3 seconds.
static void
vnlru_proc(void)
{
	...
	s = splbio();
	for (;;) {
		...
		if (done == 0) {
			vnlru_nowhere++;
			tsleep(vnlruproc, PPAUSE, "vlrup", hz * 3);
		}
	}
	splx(s);
}
The syncer is in a vlruwk wait from getnewvnode(), and lots of other
processes are waiting on ffsvgt. This implies that vlrureclaim() was
unable to free anything.
I have kern.maxvnodes = 35k. As soon as I hit this value, my system locked
up [bash on serial shell non-responsive, serial driver echoes chars,
can drop into ddb]. Processes which don't use the filesystem seem to
continue to run OK.
A couple of procs are waiting on inode: env, cron. They never come
out of that wait.
Suggestions?
db> ps
pid proc addr uid ppid pgrp flag stat wmesg wchan cmd
649 dc35a8a0 e0a32000 0 641 641 004104 3 ffsvgt c03698a8 atrun
648 dc35a3c0 e0e36000 0 647 648 000014 3 vlruwk c0364c90 cron
647 dc35b740 e03d4000 0 135 135 000004 3 ppwait dc35b740 cron
646 dc35b0c0 e03ee000 0 635 101 004004 3 inode c368ee00 env
645 dc35ad80 e03f1000 0 212 644 004006 3 ffsvgt c03698a8 grep
644 dc35aa40 e0400000 0 212 644 004006 3 ffsvgt c03698a8 sysctl
641 dc35a080 e0e4c000 0 640 641 004084 3 wait dc35a080 sh
640 dc35a220 e0e39000 0 135 135 000084 3 piperd e037c5c0 cron
635 dc35a560 e0e32000 0 101 101 004084 3 piperd e037cd40 sh
456 dc35abe0 e03fc000 0 133 456 4004004 3 ffsvgt c03698a8 tclsh83
212 dc35bdc0 e0392000 0 199 212 004086 3 wait dc35bdc0 bash
199 dc35c440 e036e000 0 1 199 004186 3 wait dc35c440 login
187 dc35c2a0 e0376000 0 1 7 000086 3 select c037c460 snmpd
169 dc35af20 e03e7000 0 1 169 000084 3 nanslp c0364970 siocontrol
163 dc35b260 e03e2000 0 1 163 000084 3 nanslp c0364970 wddt
143 dc35b400 e03dd000 25 1 143 2000184 3 pause e03dd260 sendmail
140 dc35b5a0 e03d9000 0 1 140 000184 3 select c037c460 sendmail
137 dc35b8e0 e03d0000 0 1 137 000184 3 select c037c460 sshd
135 dc35ba80 e03c2000 0 1 135 000004 3 inode c35f4400 cron
133 dc35bc20 e0397000 0 1 133 000084 3 select c037c460 inetd
124 dc35bf60 e0382000 0 1 124 000084 3 select c037c460 syslogd
101 dc35c100 e037e000 0 1 101 000084 3 wait dc35c100 dhclient
6 dc35c5e0 defd1000 0 0 0 000204 3 vlrup dc35c5e0 vnlru
5 dc35c780 defce000 0 0 0 000204 3 syncer c037c388 syncer
4 dc35c920 defcb000 0 0 0 000204 3 psleep c0364b3c bufdaemon
3 dc35cac0 defc8000 0 0 0 000204 3 psleep c0373280 vmdaemon
2 dc35cc60 defc5000 0 0 0 000204 3 psleep c0352118 pagedaemon
1 dc35ce00 dc361000 0 0 1 004284 3 wait dc35ce00 init
0 c037b760 c040e000 0 0 0 000204 3 sched c037b760 swapper